Pandas bar plot -- specify bar color by column

Question

Is there a simple way to specify bar colors by column name using the Pandas DataFrame.plot(kind='bar') method?

I have a script that generates multiple DataFrames from several different data files in a directory. For example, it does something like this:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pds

data_files = ['a', 'b', 'c', 'd']

df1 = pds.DataFrame(np.random.rand(4,3), columns=data_files[:-1])
df2 = pds.DataFrame(np.random.rand(4,3), columns=data_files[1:])

df1.plot(kind='bar', ax=plt.subplot(121))
df2.plot(kind='bar', ax=plt.subplot(122))

plt.show()

With the following output:

Output

Unfortunately, the column colors aren't consistent for each label across the two plots. Is it possible to pass in a dictionary of (filenames:colors), so that any particular column always has the same color? For example, I could imagine creating this by zipping up the filenames with the Matplotlib color_cycle:

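data_files = ['a', 'b', 'c', 'd']
# 'axes.prop_cycle' replaces the older 'axes.color_cycle' rcParam
colors = plt.rcParams['axes.prop_cycle'].by_key()['color']
print(list(zip(data_files, colors)))

[('a', '#1f77b4'), ('b', '#ff7f0e'), ('c', '#2ca02c'), ('d', '#d62728')]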

I could figure out how to do this directly with Matplotlib; I just thought there might be a simpler, built-in solution.

Edit:

Below is a partial solution that works in pure Matplotlib. However, I'm using this in an IPython notebook that will be distributed to non-programmer colleagues, and I'd like to keep the amount of plotting code to a minimum.

import numpy as np
import matplotlib.pyplot as plt
import pandas as pds

data_files = ['a', 'b', 'c', 'd']
mpl_colors = plt.rcParams['axes.prop_cycle'].by_key()['color']  # replaces the older 'axes.color_cycle'
colors = dict(zip(data_files, mpl_colors))

def bar_plotter(df, colors, sub):
    ncols = df.shape[1]
    width = 1./(ncols+2.)
    starts = df.index.values - width*ncols/2.
    plt.subplot(120+sub)
    for n, col in enumerate(df):
        plt.bar(starts + width*n, df[col].values, color=colors[col],
                width=width, label=col)
    plt.xticks(df.index.values)
    plt.grid()
    plt.legend()

df1 = pds.DataFrame(np.random.rand(4,3), columns=data_files[:-1])
df2 = pds.DataFrame(np.random.rand(4,3), columns=data_files[1:])

bar_plotter(df1, colors, 1)
bar_plotter(df2, colors, 2)

plt.show()

Desired Output

stackoverflow.com/questions/11927715/… I think this may be a good starting point. Maybe slice the color list [1:] for the second graph before passing it as the color?
– DataSwede, Commented Sep 5, 2014 at 22:06

DataSwede · Accepted Answer · 2014-09-05 22:29:24Z

16

You can pass a list of colors. Unlike a dictionary, a list takes a little manual work to line up with the columns, but it may be a less cluttered way to accomplish your goal.

import numpy as np
import matplotlib.pyplot as plt
import pandas as pds

data_files = ['a', 'b', 'c', 'd']

df1 = pds.DataFrame(np.random.rand(4,3), columns=data_files[:-1])
df2 = pds.DataFrame(np.random.rand(4,3), columns=data_files[1:])

color_list = ['b', 'g', 'r', 'c']


df1.plot(kind='bar', ax=plt.subplot(121), color=color_list)
df2.plot(kind='bar', ax=plt.subplot(122), color=color_list[1:])

plt.show()


EDIT: Ajean came up with a simple way to build the list of correct colors from a dictionary:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pds

data_files = ['a', 'b', 'c', 'd']
color_list = ['b', 'g', 'r', 'c']
d2c = dict(zip(data_files, color_list))

df1 = pds.DataFrame(np.random.rand(4,3), columns=data_files[:-1])
df2 = pds.DataFrame(np.random.rand(4,3), columns=data_files[1:])

# list(...) is needed on Python 3, where map() returns an iterator
df1.plot(kind='bar', ax=plt.subplot(121), color=list(map(d2c.get, df1.columns)))
df2.plot(kind='bar', ax=plt.subplot(122), color=list(map(d2c.get, df2.columns)))

plt.show()
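
One caveat with the dictionary lookup above: `d2c.get` returns None for any column that has no entry in the dictionary. A minimal sketch of a helper that substitutes a fallback color instead; `column_colors` and its `default` argument are hypothetical names, not part of pandas or Matplotlib:

```python
def column_colors(columns, color_map, default="gray"):
    """Return one color per column, in column order.

    Hypothetical helper: columns missing from `color_map` get
    `default` instead of None.
    """
    return [color_map.get(col, default) for col in columns]


d2c = {"a": "b", "b": "g", "c": "r", "d": "c"}

print(column_colors(["a", "b", "c"], d2c))  # ['b', 'g', 'r']
print(column_colors(["b", "c", "d"], d2c))  # ['g', 'r', 'c']
print(column_colors(["b", "x"], d2c))       # ['g', 'gray']
```

With a DataFrame, the result can be passed straight to the plot call, e.g. `color=column_colors(df1.columns, d2c)`.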

edited Sep 5, 2014 at 22:29

answered Sep 5, 2014 at 22:14


4 Comments

Ajean Over a year ago

Very nice. Something a bit extra to improve the robustness, you could create a data2color dict (d2c=dict(zip(data_files, color_list))) and then in the plot command put color=map(d2c.get,df1.columns) and likewise for df2. Looks like that works :).

DataSwede Over a year ago

I actually like that more. Feels like this should be a simple-to-implement feature request.

Ajean Over a year ago

I suppose the list input is enough customization for the pandas devs, not sure what else they could do. Also I totally found this little trick elsewhere on SO so I can't take all the credit, hehe!

Ryan Over a year ago

This is a great solution. I like the map with dictionary approach. Thanks Data Swede and Ajean!

Kumar · Answer · 2020-08-06 09:20:30Z

2

As of pandas 1.1.0 this is easier: you can pass a dictionary to pandas.DataFrame.plot.bar() to specify a different color for each column. Here is an example:

import pandas as pd

df1 = pd.DataFrame({'a': [1.2, .8, .9], 'b': [.2, .9, .7]})
df2 = pd.DataFrame({'b': [0.2, .5, .4], 'c': [.5, .6, .7], 'd': [1.1, .6, .7]})
# Keys are column names, values are colors
color_dict = {'a': 'green', 'b': 'red', 'c': 'blue', 'd': 'cyan'}
df1.plot.bar(color=color_dict)
df2.plot.bar(color=color_dict)
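
The dictionary does not have to be typed by hand; it can be built from the list of file names, as the question suggests. A minimal sketch, assuming a hypothetical helper `make_color_map` and using Matplotlib's "C0" to "C9" color names, which refer to positions in the current color cycle:

```python
from itertools import cycle


def make_color_map(names, palette):
    """Pair each name with a palette color, reusing colors if names outnumber them."""
    return dict(zip(names, cycle(palette)))


data_files = ["a", "b", "c", "d"]
# "C0".."C9" refer to positions in Matplotlib's current color cycle
palette = ["C{}".format(i) for i in range(10)]
color_dict = make_color_map(data_files, palette)
print(color_dict)
# {'a': 'C0', 'b': 'C1', 'c': 'C2', 'd': 'C3'}
```

With pandas 1.1.0 or later, passing this `color_dict` as `color=` to each `plot.bar()` call should then give a column the same color in every plot.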

answered Aug 6, 2020 at 9:20


1 Comment

Veggiet Over a year ago

this works, but I have to construct it the inverted way {"color": "column"} which is weird? It also seems to desync as I manipulate my data, even though I keep the column names the same it doesn't work and I can't figure out why.
