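If you don't know how MapReduce works, see here; if you don't know what MapReduce is, google it!
Having some "free" time today, it suddenly occurred to me that C# has no MapReduce method, so I designed one and coded it up: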
    using System;
    using System.Collections.Generic;

    // Extension methods must be declared in a static class; the class name here is arbitrary.
    public static class EnumerableExtensions
    {
        #region IEnumerable<T>.MapReduce
        public static Dictionary<TKey, TResult> MapReduce<TInput, TKey, TValue, TResult>(
            this IEnumerable<TInput> list,
            Func<TInput, IEnumerable<KeyValuePair<TKey, TValue>>> map,
            Func<TKey, IEnumerable<TValue>, TResult> reduce)
        {
            // Map phase: collect every emitted value under its key.
            Dictionary<TKey, List<TValue>> mapResult = new Dictionary<TKey, List<TValue>>();
            foreach (var item in list)
            {
                foreach (var one in map(item))
                {
                    List<TValue> mapValues;
                    if (!mapResult.TryGetValue(one.Key, out mapValues))
                    {
                        mapValues = new List<TValue>();
                        mapResult.Add(one.Key, mapValues);
                    }
                    mapValues.Add(one.Value);
                }
            }

            // Reduce phase: fold each key's values into a single result.
            var result = new Dictionary<TKey, TResult>();
            foreach (var m in mapResult)
            {
                result.Add(m.Key, reduce(m.Key, m.Value));
            }
            return result;
        }
        #endregion
    }
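Note: because the map method may emit more than once per input item, it returns an IEnumerable here; the Map method in the example below implements this with yield return.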
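For readers who think in LINQ, the same three phases can be sketched with standard operators. This is only an illustrative equivalent, not the implementation used in this post: the class name LinqMapReduce and the method name MapReduceLinq are made up for the sketch, and the demo data in Main is invented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinqMapReduce
{
    // Hypothetical name; same signature as MapReduce, built from LINQ operators.
    public static Dictionary<TKey, TResult> MapReduceLinq<TInput, TKey, TValue, TResult>(
        this IEnumerable<TInput> list,
        Func<TInput, IEnumerable<KeyValuePair<TKey, TValue>>> map,
        Func<TKey, IEnumerable<TValue>, TResult> reduce)
    {
        return list
            .SelectMany(map)                                   // map: flatten all emitted pairs
            .GroupBy(pair => pair.Key, pair => pair.Value)     // group: collect values per key
            .ToDictionary(g => g.Key, g => reduce(g.Key, g));  // reduce: one result per key
    }
}

public static class LinqDemo
{
    public static void Main()
    {
        // Invented sample data: group words by their length.
        var words = new[] { "a", "bb", "cc", "d" };
        var byLength = words.MapReduceLinq<string, int, string, string>(
            w => new[] { new KeyValuePair<int, string>(w.Length, w) },
            (key, values) => string.Join(",", values));

        foreach (var entry in byLength)
        {
            Console.WriteLine(entry.Key + ":" + entry.Value);
        }
    }
}
```

The hand-written loop version avoids the intermediate grouping objects, while the LINQ version makes the map, group, and reduce steps visible at a glance; which one reads better is a matter of taste.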
Example:
    public class Person
    {
        public int ID { get; set; }
        public string Name { get; set; }
        public int Age { get; set; }
    }

    class Program
    {
        static void Main(string[] args)
        {
            List<Person> list = new List<Person>();
            list.Add(new Person { ID = 1, Name = "user1", Age = 23 });
            list.Add(new Person { ID = 2, Name = "user2", Age = 24 });
            list.Add(new Person { ID = 3, Name = "user3", Age = 23 });
            list.Add(new Person { ID = 4, Name = "user4", Age = 25 });
            list.Add(new Person { ID = 5, Name = "user5", Age = 20 });

            // Key: age, value: name; reduce joins the names for each age.
            var result = list.MapReduce<Person, int, string, string>(
                Map, (key, values) => string.Join(",", values));

            foreach (var d in result)
            {
                Console.WriteLine(d.Key + ":" + d.Value);
            }
        }

        // Emits (Age, Name) only for people older than 22.
        public static IEnumerable<KeyValuePair<int, string>> Map(Person p)
        {
            if (p.Age > 22)
                yield return new KeyValuePair<int, string>(p.Age, p.Name);
        }
    }
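The program above takes the people older than 22 and lists, for each age, who is that age. The output looks like:
C:\Windows\system32\cmd.exe
23:user1,user3
(Uploading a screenshot was too much hassle, so I put together an HTML version of the console. Bear with me!)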
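The Person example emits at most one pair per input. To show why map returns an IEnumerable, here is a minimal word-count sketch in which each input line emits several pairs. The names WordCountDemo and MapWords and the sample lines are invented for illustration; MapReduce is repeated from the listing above only so the sketch compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MapReduceExtensions
{
    // Same method as in the post, repeated so this sketch is self-contained.
    public static Dictionary<TKey, TResult> MapReduce<TInput, TKey, TValue, TResult>(
        this IEnumerable<TInput> list,
        Func<TInput, IEnumerable<KeyValuePair<TKey, TValue>>> map,
        Func<TKey, IEnumerable<TValue>, TResult> reduce)
    {
        var mapResult = new Dictionary<TKey, List<TValue>>();
        foreach (var item in list)
        {
            foreach (var one in map(item))
            {
                List<TValue> mapValues;
                if (!mapResult.TryGetValue(one.Key, out mapValues))
                {
                    mapValues = new List<TValue>();
                    mapResult.Add(one.Key, mapValues);
                }
                mapValues.Add(one.Value);
            }
        }
        var result = new Dictionary<TKey, TResult>();
        foreach (var m in mapResult)
        {
            result.Add(m.Key, reduce(m.Key, m.Value));
        }
        return result;
    }
}

public static class WordCountDemo
{
    // Hypothetical map function: emits one (word, 1) pair per word,
    // so a single input line produces several pairs.
    public static IEnumerable<KeyValuePair<string, int>> MapWords(string line)
    {
        foreach (var word in line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
        {
            yield return new KeyValuePair<string, int>(word, 1);
        }
    }

    public static void Main()
    {
        // Invented sample input.
        var lines = new List<string> { "a b a", "b c" };
        var counts = lines.MapReduce<string, string, int, int>(
            MapWords, (word, ones) => ones.Sum());

        foreach (var c in counts)
        {
            Console.WriteLine(c.Key + ":" + c.Value);
        }
    }
}
```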
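Someone will surely ask why map isn't written as a lambda expression the way reduce is. The reason: yield return cannot be used inside anonymous methods or lambda expressions! MS says it knows about this problem, but reworking yield would be very costly; it will surely be solved some day!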
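If you want map inline anyway, two workarounds come to mind; neither is what the example in this post does, so treat this as a sketch. A lambda can return an array instead of using yield return, and in C# 7.0 or later a local function can be an iterator. The names LambdaMapDemo and MapLocal are invented here, and MapReduce and Person are repeated only so the sketch compiles on its own.

```csharp
using System;
using System.Collections.Generic;

public class Person
{
    public int ID { get; set; }
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class MapReduceExtensions
{
    // Same method as in the post, repeated so this sketch is self-contained.
    public static Dictionary<TKey, TResult> MapReduce<TInput, TKey, TValue, TResult>(
        this IEnumerable<TInput> list,
        Func<TInput, IEnumerable<KeyValuePair<TKey, TValue>>> map,
        Func<TKey, IEnumerable<TValue>, TResult> reduce)
    {
        var mapResult = new Dictionary<TKey, List<TValue>>();
        foreach (var item in list)
        {
            foreach (var one in map(item))
            {
                List<TValue> mapValues;
                if (!mapResult.TryGetValue(one.Key, out mapValues))
                {
                    mapValues = new List<TValue>();
                    mapResult.Add(one.Key, mapValues);
                }
                mapValues.Add(one.Value);
            }
        }
        var result = new Dictionary<TKey, TResult>();
        foreach (var m in mapResult)
        {
            result.Add(m.Key, reduce(m.Key, m.Value));
        }
        return result;
    }
}

public static class LambdaMapDemo
{
    public static void Main()
    {
        var list = new List<Person>
        {
            new Person { ID = 1, Name = "user1", Age = 23 },
            new Person { ID = 2, Name = "user2", Age = 24 },
            new Person { ID = 3, Name = "user3", Age = 23 },
            new Person { ID = 5, Name = "user5", Age = 20 },
        };

        // Workaround 1: the lambda returns an array (one element or empty), no yield needed.
        var viaLambda = list.MapReduce<Person, int, string, string>(
            p => p.Age > 22
                ? new[] { new KeyValuePair<int, string>(p.Age, p.Name) }
                : new KeyValuePair<int, string>[0],
            (key, values) => string.Join(",", values));

        // Workaround 2 (C# 7.0+): a local function may use yield return.
        IEnumerable<KeyValuePair<int, string>> MapLocal(Person p)
        {
            if (p.Age > 22)
                yield return new KeyValuePair<int, string>(p.Age, p.Name);
        }
        var viaLocal = list.MapReduce<Person, int, string, string>(
            MapLocal, (key, values) => string.Join(",", values));

        Console.WriteLine(viaLambda[23]);
        Console.WriteLine(viaLocal[23]);
    }
}
```

The array version allocates a small array per item, which is fine for a sketch like this; the local function keeps the lazy behaviour of yield return while staying next to the call site.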