GeoIP on Amazon Elastic Map Reduce (EMR) using Hadoop Streaming (Python)
April 23, 2012
I wanted to run geo-data calculations on Amazon Elastic Map Reduce (EMR) using Hadoop streaming jobs, particularly in Python. Python dependencies cannot easily be installed on the cluster nodes, but Hadoop's cacheArchive feature solves this problem by distributing them together with the job.
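To make the idea concrete, here is a minimal sketch of what such a streaming mapper might look like. It rests on several assumptions that are not from the post itself: the archive is passed to the job with a `#deps` suffix so that Hadoop unpacks it behind a `deps` symlink in the task's working directory, the archive contains the pure-Python `pygeoip` package plus a `GeoIP.dat` database, and each input line starts with an IP address followed by a tab. The helper names `add_archive_to_path` and `map_lines` are made up for illustration.

```python
import sys

# Assumed symlink name: the part after '#' in the cacheArchive URI.
ARCHIVE_DIR = "./deps"


def add_archive_to_path(archive_dir=ARCHIVE_DIR, path=sys.path):
    """Put the unpacked archive first on the import path (idempotent)."""
    if archive_dir not in path:
        path.insert(0, archive_dir)
    return path


def map_lines(lines, lookup):
    """Yield 'country<TAB>1' for every line whose first field is an IP.

    `lookup` is any callable mapping an IP string to a country code
    (or a false value when the address is unknown).
    """
    for line in lines:
        ip = line.strip().split("\t")[0]
        if not ip:
            continue  # skip blank lines
        country = lookup(ip) or "UNKNOWN"
        yield "%s\t1" % country


def main():
    add_archive_to_path()
    # Assumption: pygeoip and GeoIP.dat were packed into the archive.
    import pygeoip
    geo = pygeoip.GeoIP(ARCHIVE_DIR + "/GeoIP.dat")
    for record in map_lines(sys.stdin, geo.country_code_by_addr):
        print(record)

# In the actual mapper script, finish with a call to main().
```

Keeping the lookup injectable means the mapper logic can be checked locally with a plain dictionary before anything is uploaded. A reducer that sums the counts per country would complete the job.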
Categories: AWS, python
Elastic Map Reduce
