Before this lesson, we recommend that you:
- Read the original MapReduce paper from Google
- Understand the need for the MapReduce paradigm
- Understand the functioning of MapReduce and walk through a sample program
- Understand the underlying architecture of MapReduce
- Learn about special features and functionality that MapReduce provides
- Practice thinking and working in MapReduce
- Why MapReduce
- MapReduce Architecture
- Thinking in MapReduce
- MapReduce code walkthrough
- mrjob is a Python library from Yelp that wraps MapReduce and can run jobs on Amazon EMR.
- Luigi is a Python library from Spotify that makes it easier to write MapReduce workflows.
- Cascading is a layer on top of Hadoop, with further layers built on it such as Scalding (Scala) from Twitter. It is yet another way to simplify working with MapReduce.
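
To make concrete what these libraries wrap, here is a minimal sketch of the map, shuffle, and reduce phases for a word count, written in plain Python and run in a single process. It is an illustration only, not how Hadoop or any of the libraries above implement the paradigm; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen for this example.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key.

    In a real framework this grouping happens across machines;
    here it is a single in-memory dictionary (an assumption of this sketch).
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over an iterable of text lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    groups = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in groups.items())


if __name__ == "__main__":
    print(word_count(["the quick brown fox", "the lazy dog"]))
```

The mapper and reducer are the only parts a user typically writes; the grouping step in between is what a framework would normally handle on the user's behalf.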