I have a dataframe with a lot of missing values which looks like this:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

date = pd.date_range(start='2003/01/01', end='2005/12/31')

df = pd.DataFrame({'date':date, })

Assign missing values to columns:

df = pd.DataFrame(np.nan, index=date, columns=['A', 'B'])

Add some actual values throughout to illustrate what my data looks like:

df.loc['2003-01-10', 'B'] = 50
df.loc['2003-01-15', 'A'] = 70

df.loc['2003-06-10', 'B'] = 45
df.loc['2003-07-15', 'A'] = 55

df.loc['2004-01-01', 'B'] = 20
df.loc['2004-01-05', 'A'] = 30

df.loc['2004-05-01', 'B'] = 25
df.loc['2004-06-05', 'A'] = 35

df.loc['2005-01-01', 'B'] = 40
df.loc['2005-01-05', 'A'] = 35

Plot the data

df.plot(style = '-o')

The resulting plot looks like this:

[Figure: output of df.plot(style='-o')]

As you can see, I specified a line plot with markers using the style='-o' argument. The legend shows this style correctly, but the dots are not joined by lines on the graph. When I plot without any style specification, I get a blank graph.

Any help would be greatly appreciated. Thank you.

2 Answers

I assume this is due to the NaNs in your data set: matplotlib does not draw a line segment between two points that are separated by NaN values. Your data is simply not tidy. I assumed pandas could figure this out using stack, but that doesn't work either. It is also a bit inconvenient that, for any given date, the two columns are never both defined (maybe one could use interpolate here). However, what works is simply:

df['A'].dropna().plot()
df['B'].dropna().plot()

in a single Jupyter notebook cell. Both plots will be drawn on the same axes there.
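Outside a notebook, the same idea can be sketched with an explicit Axes object, which also makes it easy to keep the marker style and a legend. This is a minimal sketch under a few assumptions: only part of the question's data is rebuilt, the Agg backend is chosen so that no display is needed, and the loop over columns is an illustration rather than the only way to do it.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Rebuild part of the sparse frame from the question.
date = pd.date_range(start="2003-01-01", end="2005-12-31")
df = pd.DataFrame(np.nan, index=date, columns=["A", "B"])
df.loc["2003-01-15", "A"] = 70
df.loc["2003-07-15", "A"] = 55
df.loc["2004-01-05", "A"] = 30
df.loc["2003-01-10", "B"] = 50
df.loc["2003-06-10", "B"] = 45
df.loc["2004-01-01", "B"] = 20

fig, ax = plt.subplots()
for column in df.columns:
    # Drop the NaNs per column so the remaining points are neighbours,
    # then draw every column onto the same Axes.
    df[column].dropna().plot(ax=ax, style="-o", label=column)
ax.legend()
```

Passing ax explicitly removes the dependence on both calls sharing one notebook cell.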

Interpolate works, but looks a bit different due to the scaling:

pd.concat([df['A'].interpolate(),
           df['B'].interpolate()], axis=1).plot()

Note that here the legend is created directly. I was too lazy to overwrite the old df.

After tweaking interpolate a bit, and realizing that it is already a DataFrame method, one could also do:

df.interpolate(limit_area='inside').plot()

to get qualitatively the same result as dropna, or

df.interpolate().plot()

to reproduce the concat result.
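The difference between the two interpolate variants can be checked without plotting. The sketch below rebuilds a reduced version of the question's frame (only column A, an assumption made for brevity) and compares the default call with limit_area='inside': the default also carries the last valid value forward to the end of the index, while limit_area='inside' fills only gaps that lie between two valid values.

```python
import numpy as np
import pandas as pd

date = pd.date_range(start="2003-01-01", end="2005-12-31")
df = pd.DataFrame(np.nan, index=date, columns=["A"])
df.loc["2003-01-15", "A"] = 70
df.loc["2004-01-05", "A"] = 30

inside = df.interpolate(limit_area="inside")
default = df.interpolate()

# Both variants fill the gap between the two known points linearly.
print(inside.loc["2003-01-16":"2004-01-04", "A"].notna().all())

# After the last known point, only the default call keeps filling,
# repeating the last valid value until the end of the index.
print(inside["A"].iloc[-1])   # still NaN
print(default["A"].iloc[-1])  # last valid value carried forward

# Neither variant fills the rows before the first known point.
print(default["A"].iloc[0])
```

This is why the default result shows flat tails in the plot, while the limit_area='inside' result stops at the last real data point, just as dropna does.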

2 Comments

This works and your interpolate tip is great as it keeps the legend. Thank you :)
just added some more content for interpolate.

You have a lot of NaN values in your dataframe, so no line can be drawn: the actual data points are never adjacent to each other.
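One way to confirm this is to count how many pairs of adjacent rows both hold a value, since a line segment can only be drawn between such a pair. The helper name drawable_segments is hypothetical, introduced only for this sketch, and only part of the question's data is rebuilt.

```python
import numpy as np
import pandas as pd

date = pd.date_range(start="2003-01-01", end="2005-12-31")
df = pd.DataFrame(np.nan, index=date, columns=["A", "B"])
df.loc["2003-01-15", "A"] = 70
df.loc["2003-07-15", "A"] = 55
df.loc["2003-01-10", "B"] = 50
df.loc["2003-06-10", "B"] = 45


def drawable_segments(series):
    """Count pairs of adjacent rows that both hold a value (hypothetical helper)."""
    present = series.notna().to_numpy()
    return int((present[1:] & present[:-1]).sum())


# In the raw frame every value is surrounded by NaNs, so no segment exists.
print(drawable_segments(df["A"]), drawable_segments(df["B"]))

# Once the NaNs are dropped, the two remaining points are neighbours.
print(drawable_segments(df["A"].dropna()))
```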

What you can do is drop the NaN values like this:

df.B.dropna().plot()
df.A.dropna().plot()
