[mapreduce] Good MapReduce examples

I couldn't think of any good examples other than the "how to count words in a long text with MapReduce" task. I found this wasn't the best example to give others an impression of how powerful this tool can be.

I'm not looking for code-snippets, really just "textual" examples.

This question is related to mapreduce

The answer is


One of the best examples of Hadoop-like MapReduce implementation.

Keep in mind though that they are limited to key-value based implementations of the MapReduce idea (so they are limiting in applicability).


From time to time I present MR concepts to people. I find processing tasks familiar to people and then map them to the MR paradigm.

Usually I take two things:

  1. Group By / Aggregations. Here the advantage of the shuffling stage is clear. An explanation that shuffling is also distributed sort + an explanation of distributed sort algorithm also helps.

  2. Join of two tables. People working with DB are familiar with the concept and its scalability problem. Show how it can be done in MR.


One set of familiar operations that you can do in MapReduce is the set of normal SQL operations: SELECT, SELECT WHERE, GROUP BY, ect.

Another good example is matrix multiply, where you pass one row of M and the entire vector x and compute one element of M * x.