I'm developing an application where I need to distribute a set of tasks across a potentially quite large cluster of different machines.
Ideally I'd like a very simple, idiomatic way to do this in Clojure, e.g. something like:
; create a clustered set of machines
(def my-cluster (new-cluster list-of-ip-addresses))
; define a task to be executed
(deftask my-task (my-function arg1 arg2))
; run a task 10000 times on the cluster
(def my-job (run-task my-cluster my-task {:repeat 10000}))
; do something with the results:
(some-function (get-results my-job))
Bonus if it can do something like Map-Reduce on the cluster as well.
What's the best way to achieve something like this? Maybe I could wrap an appropriate Java library?
UPDATE:
Thanks for all the suggestions of Apache Hadoop. It looks like it might fit the bill, but it seems like overkill since I don't need a distributed data storage system like the one Hadoop uses (i.e. I don't need to process billions of records). Something more lightweight and focused on compute tasks only would be preferable, if it exists.