Trouble in plotting dates in PyPlot

Question

I am trying to plot a simple time-series. Here's my code:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime
%matplotlib inline

df = pd.read_csv("sample.csv", parse_dates=['t'])
df[['sq', 'iq', 'rq']] = df[['sq', 'iq', 'rq']].apply(pd.to_numeric, errors='coerce')
df = df.fillna(0)
df.set_index('t')

This is part of the output:

df[['t','sq']].plot()
plt.show()

As you can see, the x-axis in the plot above is not the dates I intended it to show. When I change the plotting call as below, I get the following gibberish plot, although the x-axis is now correct.

df[['t','sq']].plot(x = 't')
plt.show()

Any tips on what I am doing wrong? Please comment and let me know if you need more information about the problem. Thanks in advance.

KRKirov · Accepted Answer · 2018-03-06 20:38:11Z

1

I think your problem is that, although you passed the t column to parse_dates, it did not actually end up with a datetime dtype. Try the following:

# Set t to date-time and then to index
df['t'] = pd.to_datetime(df['t'])
df.set_index('t', inplace=True)
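A quick way to confirm whether the conversion took effect is to inspect the dtype and the index type. The snippet below is only a minimal sketch: it assumes a tiny in-memory CSV that reuses the question's column names t and sq, and the values are made up for illustration. It also shows that set_index returns a new frame unless inplace=True is passed or the result is reassigned.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for sample.csv
csv_text = "t,sq\n2018-01-05,3\n2018-01-06,5\n2018-01-07,4\n"

df = pd.read_csv(io.StringIO(csv_text))
dtype_before = df['t'].dtype  # still text, not a datetime dtype

df['t'] = pd.to_datetime(df['t'])
dtype_after = df['t'].dtype  # now a datetime64 dtype

# set_index returns a new frame; without inplace=True (or reassignment)
# the original frame keeps its default integer index
unchanged = df.set_index('t')
index_is_dates_before = isinstance(df.index, pd.DatetimeIndex)

df.set_index('t', inplace=True)
index_is_dates_after = isinstance(df.index, pd.DatetimeIndex)

print(dtype_before, dtype_after)
print(index_is_dates_before, index_is_dates_after)
```

If the dtype after conversion is still not a datetime type, the values in the column probably do not match the format pandas expects.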

Reading your comment and the answer you have added, someone may conclude that this kind of problem can only be solved by specifying a parser in pd.read_csv(). So here is proof that my solution works in principle. Looking at what you posted in the question, the other problem with your code is the way you call plot. Once t has become the index, you only need to select the columns other than t for the plot command.

import pandas as pd
import matplotlib.pyplot as plt

# Read data from file
df = pd.read_csv('C:\\datetime.csv', parse_dates=['Date'])

# Convert Date to date-time and set as index
df['Date'] = pd.to_datetime(df['Date'])
df.set_index('Date', inplace=True)

df.plot(marker='D')
plt.xlabel('Date')
plt.ylabel('Number of Visitors')
plt.show()


df
Out[37]: 
        Date  Adults  Children  Seniors
0 2018-01-05     309       240      296
1 2018-01-06     261       296      308
2 2018-01-07     273       249      338
3 2018-01-08     311       250      244
4 2018-01-08     272       234      307

df
Out[39]: 
            Adults  Children  Seniors
Date                                 
2018-01-05     309       240      296
2018-01-06     261       296      308
2018-01-07     273       249      338
2018-01-08     311       250      244
2018-01-08     272       234      307
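Applied to the column names from the question, the same idea might look like the sketch below. The data is invented for illustration, and the Agg backend line is only there so the snippet runs without a display; in a notebook with %matplotlib inline you would drop it.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data standing in for sample.csv
csv_text = (
    "t,sq,iq\n"
    "2018-01-05,3,1\n"
    "2018-01-06,5,2\n"
    "2018-01-07,4,6\n"
)

df = pd.read_csv(io.StringIO(csv_text), parse_dates=['t'])
df = df.set_index('t')  # reassign, since set_index returns a new frame

# 't' is now the index, so select only the value column to plot
ax = df[['sq']].plot(marker='D')
ax.set_xlabel('t')
ax.set_ylabel('sq')
# In an interactive session, call plt.show() here to display the figure
```

Because the index is a DatetimeIndex, pandas uses it for the x-axis automatically, so neither x='t' nor selecting the t column is needed.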

edited Mar 6, 2018 at 20:38

answered Mar 4, 2018 at 14:12


1 Comment

suj1th Over a year ago

The problem, indeed, was incorrect parsing of the 't' column. However, pd.to_datetime did not solve it for me. The solution turned out to be adding a date_parser to the read_csv() call. Thank you for pointing me in the right direction. Cheers!

suj1th · Accepted Answer · 2018-03-06 17:04:47Z

0

The issue turned out to be incorrect parsing of dates, as pointed out in an answer above. However, the solution for it was to pass a date_parser to the read_csv method call:

from datetime import datetime as dt

# Parse each value in 't' with an explicit year-month-day format
dtm = lambda x: dt.strptime(str(x), "%Y-%m-%d")
df = pd.read_csv("sample.csv", parse_dates=['t'], infer_datetime_format=True, date_parser=dtm)
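The thread never shows why pd.to_datetime alone did not work on the original data. One possible cause, and this is purely an assumption, is that some values in t do not match the expected format. The sketch below uses made-up data with one deliberately malformed row and shows a way to locate such rows by parsing with an explicit format and errors='coerce'. It also avoids date_parser, which is deprecated in pandas 2.0 and later.

```python
import io

import pandas as pd

# Hypothetical data: the last row has a malformed date
csv_text = "t,sq\n2018-01-05,3\n2018-01-06,5\nnot-a-date,4\n"

# Read 't' as plain text first, without parse_dates
df = pd.read_csv(io.StringIO(csv_text))

# Explicit format; values that do not match become NaT instead of raising
parsed = pd.to_datetime(df['t'], format="%Y-%m-%d", errors='coerce')

# Rows whose 't' value could not be parsed
bad_rows = df[parsed.isna()]
print(bad_rows)

# Keep only the rows that parsed and use the dates as the index
df['t'] = parsed
df = df.dropna(subset=['t']).set_index('t')
```

Inspecting bad_rows before dropping anything makes it clear whether the dates are the problem or whether something else is going on.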

answered Mar 6, 2018 at 17:04


1 Comment

KRKirov Over a year ago

I have modified my answer to show that the method works in principle. I would like to verify that my solution does not work on your data and understand why. Could you show what you were getting with it?
