[c#] Map and Reduce in .NET

What scenarios would warrant the use of the "Map and Reduce" algorithm?


Is there a .NET implementation of this algorithm?

This question is related to c# mapreduce

The answer is


The classes of problem that are well suited for a mapreduce style solution are problems of aggregation. Of extracting data from a dataset. In C#, one could take advantage of LINQ to program in this style.

From the following article: http://codecube.net/2009/02/mapreduce-in-c-using-linq/

the GroupBy method is acting as the map, while the Select method does the job of reducing the intermediate results into the final list of results.

var wordOccurrences = words
                .GroupBy(w => w)
                .Select(intermediate => new
                {
                    Word = intermediate.Key,
                    Frequency = intermediate.Sum(w => 1)
                })
                .Where(w => w.Frequency > 10)
                .OrderBy(w => w.Frequency);

For the distributed portion, you could check out DryadLINQ: http://research.microsoft.com/en-us/projects/dryadlinq/default.aspx


Since I never can remember that LINQ calls it Where, Select and Aggregate instead of Filter, Map and Reduce so I created a few extension methods you can use:

IEnumerable<string> myStrings = new List<string>() { "1", "2", "3", "4", "5" };
IEnumerable<int> convertedToInts = myStrings.Map(s => int.Parse(s));
IEnumerable<int> filteredInts = convertedToInts.Filter(i => i <= 3); // Keep 1,2,3
int sumOfAllInts = filteredInts.Reduce((sum, i) => sum + i); // Sum up all ints
Assert.Equal(6, sumOfAllInts); // 1+2+3 is 6

Here are the 3 methods (from https://github.com/cs-util-com/cscore/blob/master/CsCore/PlainNetClassLib/src/Plugins/CsCore/com/csutil/collections/IEnumerableExtensions.cs ):

public static IEnumerable<R> Map<T, R>(this IEnumerable<T> self, Func<T, R> selector) {
    return self.Select(selector);
}

public static T Reduce<T>(this IEnumerable<T> self, Func<T, T, T> func) {
    return self.Aggregate(func);
}

public static IEnumerable<T> Filter<T>(this IEnumerable<T> self, Func<T, bool> predicate) {
    return self.Where(predicate);
}

Some more details from https://github.com/cs-util-com/cscore#ienumerable-extensions :

enter image description here


The classes of problem that are well suited for a mapreduce style solution are problems of aggregation. Of extracting data from a dataset. In C#, one could take advantage of LINQ to program in this style.

From the following article: http://codecube.net/2009/02/mapreduce-in-c-using-linq/

the GroupBy method is acting as the map, while the Select method does the job of reducing the intermediate results into the final list of results.

var wordOccurrences = words
                .GroupBy(w => w)
                .Select(intermediate => new
                {
                    Word = intermediate.Key,
                    Frequency = intermediate.Sum(w => 1)
                })
                .Where(w => w.Frequency > 10)
                .OrderBy(w => w.Frequency);

For the distributed portion, you could check out DryadLINQ: http://research.microsoft.com/en-us/projects/dryadlinq/default.aspx