mincemeat.py is a Python implementation of the MapReduce distributed computing framework.
Get the source with git:
git clone http://mincemeatpy.com/git/mincemeatpy.git
Let's look at the canonical MapReduce example, word counting:
example.py:
#!/usr/bin/env python
import mincemeat

data = ["Humpty Dumpty sat on a wall",
        "Humpty Dumpty had a great fall",
        "All the King's horses and all the King's men",
        "Couldn't put Humpty together again",
        ]

# Emit a (word, 1) pair for every word in the line
def mapfn(k, v):
    for w in v.split():
        yield w, 1

# Add up all the counts collected for one word
def reducefn(k, vs):
    result = 0
    for v in vs:
        result += v
    return result

s = mincemeat.Server()

# The data source can be any dictionary-like object
s.datasource = dict(enumerate(data))
s.mapfn = mapfn
s.reducefn = reducefn

results = s.run_server(password="changeme")
print(results)
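Run the script to start the server:
python example.py
Then run mincemeat.py as a client, on the same machine or another one:
python mincemeat.py -p changeme [server address]
Once the work is done, the server prints the results: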
{'a': 2, 'on': 1, 'great': 1, 'Humpty': 3, 'again': 1, 'wall': 1, 'Dumpty': 2, 'men': 1,
'had': 1, 'all': 1, 'together': 1, "King's": 2, 'horses': 1, 'All': 1, "Couldn't": 1,
'fall': 1, 'and': 1, 'the': 2, 'put': 1, 'sat': 1}
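To see what the two functions contribute, it can help to run the same job in a single process. The sketch below is not how mincemeat.py works internally; `run_locally` is a hypothetical helper written for this illustration, and it assumes only the general MapReduce contract: mapfn yields (key, value) pairs, the values are grouped by key, and reducefn turns each group into one result.

```python
from collections import defaultdict

def run_locally(datasource, mapfn, reducefn):
    """Hypothetical single-process stand-in for the map, group and reduce phases."""
    grouped = defaultdict(list)
    # Map phase: every input pair may yield any number of output pairs.
    for key in datasource:
        for out_key, out_value in mapfn(key, datasource[key]):
            grouped[out_key].append(out_value)
    # Reduce phase: the values collected for each key collapse to one result.
    return dict((k, reducefn(k, vs)) for k, vs in grouped.items())

def mapfn(k, v):
    for w in v.split():
        yield w, 1

def reducefn(k, vs):
    return sum(vs)

data = ["Humpty Dumpty sat on a wall",
        "Humpty Dumpty had a great fall",
        "All the King's horses and all the King's men",
        "Couldn't put Humpty together again"]

counts = run_locally(dict(enumerate(data)), mapfn, reducefn)
print(counts["Humpty"], counts["the"], len(counts))
```

The counts match the server output above. With mincemeat.py, the map and reduce calls are handed out to clients instead of running in one loop.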
This example is deliberately simplistic, but changing the datasource to a collection of large files and running the client on multiple machines works just as well. In fact, mincemeat.py has been used to produce word frequency lists for more than 3 GB of text using a slightly modified version of this code.
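One way to point the datasource at files is a small dictionary-like class that reads each file only when its value is requested. This is a rough sketch, not the modified code mentioned above: `FileDataSource` is a hypothetical name, and it assumes the server needs nothing more from a datasource than iteration over keys and lookup by key.

```python
import io
import os
import tempfile

class FileDataSource(object):
    """Hypothetical dict-like source: keys are file paths, values are file contents."""

    def __init__(self, paths):
        self.paths = list(paths)

    def __iter__(self):
        # Iterating yields the keys, as with a regular dict.
        return iter(self.paths)

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, path):
        if path not in self.paths:
            raise KeyError(path)
        # Read lazily so only the requested file is held in memory.
        with io.open(path, encoding="utf-8") as f:
            return f.read()

# Small demonstration using two temporary files.
tmpdir = tempfile.mkdtemp()
paths = []
for name, text in [("a.txt", u"Humpty Dumpty"), ("b.txt", u"sat on a wall")]:
    path = os.path.join(tmpdir, name)
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(text)
    paths.append(path)

source = FileDataSource(paths)
for key in source:
    print(key, len(source[key]))

# Assumed wiring into the example above (the glob pattern is a placeholder):
# s.datasource = FileDataSource(glob.glob("texts/*.txt"))
```

With this layout each map call receives the full text of one file as its value, so very large files may need to be split into smaller pieces first.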
Sorry! I don't have much documentation available at the moment, as mincemeat.py is still in its early stages of development. In the meantime, feel free to contact me with any questions or suggestions.
The following features will be included in mincemeat.py by version 1.0:
Patches are welcome, especially for the roadmapped features. It's best to get in touch with me first to make sure that your proposed work fits the goals of the project and has not already been started.