After updating pandas (to 0.23.4) and matplotlib (to 3.0.1), I get a strange error when running something like the following:

import pandas as pd
import matplotlib.pyplot as plt


clrdict = {1: "#a6cee3", 2: "#1f78b4", 3: "#b2df8a", 4: "#33a02c"}

df_full = pd.DataFrame({'x':[20,30,30,40],
                        'y':[25,20,30,25],
                        's':[100,200,300,400],
                        'l':[1,2,3,4]})

df_full['c'] = df_full['l'].replace(clrdict)

df_part = df_full[(df_full.x == 30)]

fig = plt.figure()
plt.scatter(x=df_full['x'],
            y=df_full['y'],
            s=df_full['s'],
            c=df_full['c'])
plt.show()

fig = plt.figure()
plt.scatter(x=df_part['x'],
            y=df_part['y'],
            s=df_part['s'],
            c=df_part['c'])
plt.show()

The scatter plot of the original DataFrame (df_full) is shown without problems, but plotting the subset DataFrame (df_part) raises the following error:

Traceback (most recent call last):
  File "G:\data\project\test.py", line 27, in <module>
    c=df_part['c'])
  File "C:\Program Files\Python37\lib\site-packages\matplotlib\pyplot.py", line 2864, in scatter
    is not None else {}), **kwargs)
  File "C:\Program Files\Python37\lib\site-packages\matplotlib\__init__.py", line 1805, in inner
    return func(ax, *args, **kwargs)
  File "C:\Program Files\Python37\lib\site-packages\matplotlib\axes\_axes.py", line 4195, in scatter
    isinstance(c[0], str))):
  File "C:\Program Files\Python37\lib\site-packages\pandas\core\series.py", line 767, in __getitem__
    result = self.index.get_value(self, key)
  File "C:\Program Files\Python37\lib\site-packages\pandas\core\indexes\base.py", line 3118, in get_value
    tz=getattr(series.dtype, 'tz', None))
  File "pandas\_libs\index.pyx", line 106, in pandas._libs.index.IndexEngine.get_value
  File "pandas\_libs\index.pyx", line 114, in pandas._libs.index.IndexEngine.get_value
  File "pandas\_libs\index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 958, in pandas._libs.hashtable.Int64HashTable.get_item
  File "pandas\_libs\hashtable_class_helper.pxi", line 964, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: 0

This is due to the color option c=df_part['c']. If you leave it out, the problem doesn't occur. This didn't happen before the updates, so you may not be able to reproduce it with older versions of matplotlib or pandas (I have no idea which one causes it).

In my project, the line df_part = df_full[(df_full.x == i)] is used inside the update function of a matplotlib.animation.FuncAnimation. The result is an animation over the values of x (which are timestamps in my project), so I need a way to subset the DataFrame.

  • What happens if you use c=df_part['c'].values? Commented Nov 6, 2018 at 17:42
  • Thanks, ALollz. This solves the problem. Can you explain why I have to call .values explicitly? Commented Nov 6, 2018 at 17:52
  • The issue is that a pandas.Series has an index, so when plt.scatter tries to grab Series[0], it looks for the row whose index label is 0, not the first row of the Series. In your second case that row doesn't exist, since your subset doesn't contain the first row. Using .values converts the Series to an ndarray, in which case ndarray[0] gives the first value in the array, regardless of the index the Series had. Commented Nov 6, 2018 at 18:00
  • I understand. Thanks for taking the time to write this useful explanation. Commented Nov 6, 2018 at 18:05
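
The difference between label-based and positional lookup described in the comments can be seen without matplotlib at all. The following is a minimal sketch using a cut-down version of the question's DataFrame. The variable names colors, index_labels, label_lookup_failed, first_by_position, and first_from_array are made up for illustration.

```python
import pandas as pd

# Cut-down version of the DataFrame from the question.
df_full = pd.DataFrame({'x': [20, 30, 30, 40],
                        'c': ["#a6cee3", "#1f78b4", "#b2df8a", "#33a02c"]})

# The boolean filter keeps the original index labels (1 and 2), not 0 and 1.
df_part = df_full[df_full.x == 30]
colors = df_part['c']
index_labels = list(colors.index)

# colors[0] looks up the index *label* 0, which the subset does not contain.
try:
    colors[0]
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True

# Positional access works regardless of the index labels.
first_by_position = colors.iloc[0]
first_from_array = colors.values[0]

print(index_labels, label_lookup_failed, first_by_position, first_from_array)
```

Calling df_part.reset_index(drop=True) before plotting would also give the subset a fresh index starting at 0, though that discards the original row labels.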

1 Answer

This is a bug which got fixed by https://github.com/matplotlib/matplotlib/pull/12673.

The fix should hopefully be available in the next bugfix release, 3.0.2, which should be out within the next few days.

In the meantime, you can pass the NumPy array underlying the pandas Series, series.values, instead of the Series itself.
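
Applied to the second plot from the question, the workaround might look like the sketch below. It makes a few assumptions that are not in the original post: the Agg backend is selected so the script runs without a display, the color column is built with map and cast to object dtype so that .values is a plain NumPy array, and the names color_values and points are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

clrdict = {1: "#a6cee3", 2: "#1f78b4", 3: "#b2df8a", 4: "#33a02c"}

df_full = pd.DataFrame({'x': [20, 30, 30, 40],
                        'y': [25, 20, 30, 25],
                        's': [100, 200, 300, 400],
                        'l': [1, 2, 3, 4]})

# map() is used here instead of replace(); object dtype keeps .values a NumPy array.
df_full['c'] = df_full['l'].map(clrdict).astype(object)

df_part = df_full[df_full.x == 30]

# Pass the underlying array so that c[0] is the first element, not index label 0.
color_values = df_part['c'].values

fig = plt.figure()
points = plt.scatter(x=df_part['x'].values,
                     y=df_part['y'].values,
                     s=df_part['s'].values,
                     c=color_values)
fig.savefig("scatter_part.png")  # hypothetical output file name
```

On pandas 0.24 or later, series.to_numpy() is an equivalent way to get the array.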
