Overview and getting started¶

No description has been provided for this image

Introduction¶

Let's start with an overview of IPython's architecture for parallel and distributed computing. This architecture abstracts out parallelism in a very general way, which enables IPython to support many different styles of parallelism including:

Single program, multiple data (SPMD) parallelism
Multiple program, multiple data (MPMD) parallelism
Message passing using MPI or ØMQ
Task farming
Data parallel
Coordination of distributed processes
Combinations of these approaches
Custom user defined approaches

Most importantly, IPython enables all types of parallel applications to be developed, executed, debugged and monitored interactively. Hence, the I in IPython. Some example use cases for IPython.parallel:

Quickly parallelize algorithms that are embarrassingly parallel using a number of simple approaches. Many simple things can be parallelized interactively in one or two lines of code.
Steer traditional MPI applications on a supercomputer from an IPython session on your laptop.
Analyze and visualize large datasets (that could be remote and/or distributed) interactively using IPython and tools like matplotlib/TVTK.
Develop, test and debug new parallel algorithms (that may use MPI or PyZMQ) interactively.
Tie together multiple MPI jobs running on different systems into one giant distributed and parallel system.
Start a parallel job on your cluster and then have a remote collaborator connect to it and pull back data into their local IPython session for plotting and analysis.
Run a set of tasks on a set of CPUs using dynamic load balancing.

Architecture overview¶

No description has been provided for this image

The IPython architecture consists of four components:

The IPython engine
The IPython hub
The IPython schedulers
The cluster client

These components live in the IPython.parallel package and are installed with IPython.

IPython engine¶

The IPython engine is a Python instance that accepts Python commands over a network connection. When multiple engines are started, parallel and distributed computing becomes possible. An important property of an IPython engine is that it blocks while user code is being executed. Read on for how the IPython controller solves this problem to expose a clean asynchronous API to the user.

IPython controller¶

No description has been provided for this image

The IPython controller processes provide an interface for working with a set of engines. At a general level, the controller is a collection of processes to which IPython engines and clients can connect. The controller is composed of a Hub and a collection of Schedulers, which may be in processes or threads.

The controller provides a single point of contact for users who wish to utilize the engines in the cluster. There is a variety of different ways of working with a controller, but all of these models are implemented via the View.apply method, after constructing View objects to represent different collections engines. The two primary models for interacting with engines are:

A Direct interface, where engines are addressed explicitly.
A LoadBalanced interface, where the Scheduler is trusted with assigning work to appropriate engines.

Advanced users can readily extend the View models to enable other styles of parallelism.

The Hub¶

The center of an IPython cluster is the Hub. The Hub can be viewed as an über-logger, which keeps track of engine connections, schedulers, clients, as well as persist all task requests and results in a database for later use.

Schedulers¶

All actions that can be performed on the engine go through a Scheduler. While the engines themselves block when user code is run, the schedulers hide that from the user to provide a fully asynchronous interface to a set of engines. Each Scheduler is a small GIL-less function in C provided by pyzmq (the Python load-balanced scheduler being an exception).

ØMQ and PyZMQ¶

All of this is implemented with the lovely ØMQ messaging library, and pyzmq, the lightweight Python bindings, which allows very fast zero-copy communication of objects like numpy arrays.

IPython client and views¶

There is one primary object, the Client, for connecting to a cluster. For each execution model, there is a corresponding View. These views allow users to interact with a set of engines through the interface. Here are the two default views:

The DirectView class for explicit addressing.
The LoadBalancedView class for destination-agnostic scheduling.

Getting Started¶

Starting the IPython controller and engines¶

To follow along with this tutorial, you will need to start the IPython controller and four IPython engines. The simplest way of doing this is to use the ipcluster command:

$ ipcluster start -n 4

There isn't time to go into it here, but ipcluster can be used to start engines and the controller with various batch systems including:

SGE
PBS
LSF
MPI
SSH
WinHPC

More information on starting and configuring the IPython cluster in the IPython.parallel docs.

Once you have started the IPython controller and one or more engines, you are ready to use the engines to do something useful.

To make sure everything is working correctly, let's do a very simple demo:

In [2]:

from IPython import parallel
rc = parallel.Client()
rc.block = True

In [3]:

rc.ids

Out[3]:

[0, 1, 2, 3]

In [4]:

def mul(a,b):
    return a*b

In [5]:

mul(5,6)

Out[5]:

What does it look like to call this function remotely?

Just turn f(*args, **kwargs) into view.apply(f, *args, **kwargs)!

In [6]:

rc[0].apply(mul, 5, 6)

Out[6]:

And the same thing in parallel?

In [7]:

rc[:].apply(mul, 5, 6)

Out[7]:

[30, 30, 30, 30]

Python has a builtin map for calling a function with a variety of arguments

In [11]:

map(mul, range(1,10), range(2,11))

Out[11]:

[2, 6, 12, 20, 30, 42, 56, 72, 90]

So how do we do this in parallel?

In [10]:

view = rc.load_balanced_view()
view.map(mul, range(1,10), range(2,11))

Out[10]:

[2, 6, 12, 20, 30, 42, 56, 72, 90]

I'll go into further detail about different views, and asynchronous communication later. First, let's peek at the performance of the IPython.parallel infrastructure, so we can see what level of granularity we can consider using.