The parallel_coordinates function from pandas is very useful:

import pandas
import matplotlib.pyplot as plt
# In pandas versions before 0.20 this lived in pandas.tools.plotting
from pandas.plotting import parallel_coordinates
sampdata = pandas.read_csv('/usr/local/lib/python3.3/dist-packages/pandas/tests/data/iris.csv')
parallel_coordinates(sampdata, 'Name')

But when you have continuous data, its behavior is not what you would expect:

import numpy as np

# Two integer coordinates plus one continuous result column
mypos = np.random.randint(10, size=(100, 2))
mydata = pandas.DataFrame(mypos, columns=['x', 'y'])
myres = np.random.rand(100, 1)
mydata['res'] = myres
parallel_coordinates(mydata, 'res')

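The reason for this behavior, as far as I can tell, is that pandas treats every distinct value in the class column as its own class, with its own color and legend entry. A minimal sketch to check how many "classes" the continuous column produces (the seeded generator is my own addition, only there to make the example reproducible):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seed is an assumption, for reproducibility only
mydata = pd.DataFrame(rng.integers(10, size=(100, 2)), columns=['x', 'y'])
mydata['res'] = rng.random(100)

# Each distinct value in the class column becomes a separate class,
# so a continuous column yields roughly one class per row.
n_classes = mydata['res'].nunique()
print(n_classes, "distinct classes for", len(mydata), "rows")
```

With about one class per row, the legend becomes unreadable and the colors carry no information about magnitude.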

I would like to have the color of the lines to reflect the magnitude of the continuous variable, e.g. in a gradient from white to black, preferably also with the possibility of some transparency (alpha value), and with a color bar beside.

  • Hi there, I do not seem to have the iris (or iris.csv) file, which is needed to understand how the data should be structured to create a parallel coordinates chart. Can someone upload it for me, please? Thanks. Commented Oct 10, 2016 at 17:48

1 Answer

I had the exact same problem today. My solution was to copy the parallel_coordinates function from pandas and adapt it to my needs. As I think it can be useful for others, here is my implementation:

def parallel_coordinates(frame, class_column, cols=None, ax=None, color=None,
                     use_columns=False, xticks=None, colormap=None,
                     **kwds):
    import numpy as np
    import matplotlib.pyplot as plt
    import matplotlib as mpl

    n = len(frame)
    class_col = frame[class_column]
    class_min = np.amin(class_col)
    class_max = np.amax(class_col)

    if cols is None:
        df = frame.drop(class_column, axis=1)
    else:
        df = frame[cols]

    used_legends = set([])

    ncols = len(df.columns)

    # determine values to use for xticks
    if use_columns is True:
        if not np.all(np.isreal(list(df.columns))):
            raise ValueError('Columns must be numeric to be used as xticks')
        x = df.columns
    elif xticks is not None:
        if not np.all(np.isreal(xticks)):
            raise ValueError('xticks specified must be numeric')
        elif len(xticks) != ncols:
            raise ValueError('Length of xticks must match number of columns')
        x = xticks
    else:
        x = range(ncols)

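    # Reuse the axes passed in by the caller; create a new figure otherwise
    if ax is None:
        fig = plt.figure()
        ax = fig.gca()
    else:
        fig = ax.figure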

    Colorm = plt.get_cmap(colormap)

    for i in range(n):
        y = df.iloc[i].values
        kls = class_col.iat[i]
        ax.plot(x, y, color=Colorm((kls - class_min)/(class_max-class_min)), **kwds)

    for i in x:
        ax.axvline(i, linewidth=1, color='black')

    ax.set_xticks(x)
    ax.set_xticklabels(df.columns)
    ax.set_xlim(x[0], x[-1])
    # No legend: the lines carry no labels, and the color bar below replaces it
    ax.grid()

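    # Color bar split into 9 bands spanning the range of the class column
    bounds = np.linspace(class_min, class_max, 10)
    cax, _ = mpl.colorbar.make_axes(ax)
    # Explicit norm so the color bar uses the same value-to-color mapping as the lines
    norm = mpl.colors.Normalize(vmin=class_min, vmax=class_max)
    cb = mpl.colorbar.ColorbarBase(cax, cmap=Colorm, norm=norm, spacing='proportional',
                                   ticks=bounds, boundaries=bounds, format='%.2f')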

    return fig

I don't know if it works with every option that the original pandas function provides. But for your example, it gives something like this:

parallel_coordinates(mydata, 'res', colormap="binary")

Example from question
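The color bar in the function above is built with mpl.colorbar.make_axes and ColorbarBase. As an alternative sketch, and assuming matplotlib 3.1 or later, the same kind of color bar can be attached with Figure.colorbar and a ScalarMappable. The helper name add_value_colorbar is hypothetical, not part of pandas or matplotlib:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.cm import ScalarMappable
from matplotlib.colors import Normalize


def add_value_colorbar(ax, values, colormap="binary"):
    """Attach a color bar covering min(values)..max(values) to ax (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    norm = Normalize(vmin=values.min(), vmax=values.max())
    mappable = ScalarMappable(norm=norm, cmap=plt.get_cmap(colormap))
    # Figure.colorbar takes space from ax and returns the Colorbar object
    return ax.figure.colorbar(mappable, ax=ax)


fig, ax = plt.subplots()
cb = add_value_colorbar(ax, [2.0, 5.0, 8.0])
```

This produces a continuous color bar rather than the nine discrete bands of the function above; which one reads better is a matter of taste.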

You can add an alpha value by changing this line in the previous function:

ax.plot(x, y, color=Colorm((kls - class_min)/(class_max-class_min)), alpha=(kls - class_min)/(class_max-class_min), **kwds)
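Note that with the line above, the row with the smallest value gets alpha 0 and is therefore invisible, and a constant column would divide by zero. Here is a minimal sketch of the same value-to-color mapping with both cases guarded. The function name value_to_rgba and the min_alpha floor are my own assumptions, not part of the original function:

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize


def value_to_rgba(values, colormap="binary", min_alpha=0.1):
    """Map values to RGBA colors; alpha grows with the value (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    vmin, vmax = values.min(), values.max()
    if vmax == vmin:
        # Constant column: avoid dividing by zero, use the middle of the colormap
        scaled = np.full_like(values, 0.5)
    else:
        # Same min-max scaling as (kls - class_min) / (class_max - class_min)
        scaled = np.asarray(Normalize(vmin=vmin, vmax=vmax)(values))
    rgba = plt.get_cmap(colormap)(scaled)  # array of shape (n, 4)
    # Keep a floor on alpha so the smallest values stay faintly visible
    rgba[:, 3] = min_alpha + (1.0 - min_alpha) * scaled
    return rgba


rgba = value_to_rgba([0.0, 0.5, 1.0])
```

Each row of the result can be passed as color= to ax.plot, since matplotlib accepts RGBA tuples there.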

And for the original pandas example, after removing the names and using the last column as values:

sampdata = pandas.read_csv('iris_modified.csv')
parallel_coordinates(sampdata, 'Value')

Example from pandas documentation

I hope this will help you!

Christophe

1 Comment

Thanks for your tip, I tried it and it works fine! How would you go about setting a separate scale of values (y range) for each axis? In other words, imagine that SepalLength varied between 1 and 8, SepalWidth between 20 and 100, and PetalLength between 0 and 1. Thanks!
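One possible workaround for columns with very different ranges, sketched here under assumptions rather than offered as a definitive answer, is to rescale each column to [0, 1] before plotting. The helper rescale_columns and the column names below are hypothetical:

```python
import pandas as pd


def rescale_columns(frame, exclude=()):
    """Min-max rescale every column not listed in exclude (hypothetical helper)."""
    scaled = frame.copy()
    for col in frame.columns:
        if col in exclude:
            continue
        lo, hi = frame[col].min(), frame[col].max()
        # A constant column has no range; place it in the middle of the axis
        scaled[col] = 0.5 if hi == lo else (frame[col] - lo) / (hi - lo)
    return scaled


# Made-up data with ranges similar to those described in the comment
df = pd.DataFrame({'a': [1.0, 4.5, 8.0],
                   'b': [20.0, 60.0, 100.0],
                   'Value': [0.1, 0.2, 0.3]})
scaled = rescale_columns(df, exclude=['Value'])
```

The rescaled frame can then be passed to parallel_coordinates(scaled, 'Value'). The drawback is that the y tick labels no longer show the original units; truly independent axes would need one twin axis per column, which the function above does not do.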
