1

I am trying to plot a simple time-series. Here's my code:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime
%matplotlib inline

df = pd.read_csv("sample.csv", parse_dates=['t'])
df[['sq', 'iq', 'rq']] = df[['sq', 'iq', 'rq']].apply(pd.to_numeric, errors='coerce')
df = df.fillna(0)
df.set_index('t')

This is part of the output:

Output2

df[['t','sq']].plot()
plt.show()

First Plot

As you can see, the x-axis in the plot above is not the dates I intended it to show. When I change the plotting call as below, I get the following gibberish plot, although the x-axis is now correct.

df[['t','sq']].plot(x = 't')
plt.show()

Second Plot

Any tips on what I am doing wrong? Please comment and let me know if you need more information about the problem. Thanks in advance.

2 Answers

1

I think your problem is that, although you asked read_csv to parse the t column, it is not actually of date-time type. Try the following:

# Set t to date-time and then to index
df['t'] = pd.to_datetime(df['t'])
df.set_index('t', inplace=True)
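
A related detail in the question's code is that df.set_index('t') on its own returns a new DataFrame and leaves df untouched, so the index only changes with inplace=True or an assignment. A minimal sketch of both checks, using made-up data in place of sample.csv (the values are assumptions, not the asker's data):

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

# Hypothetical stand-in for the question's sample.csv
df = pd.DataFrame({
    "t": ["2018-01-05", "2018-01-06", "2018-01-07"],
    "sq": [1.0, 2.0, 3.0],
})

# Date strings are not date-times until they are converted
was_datetime_before = is_datetime64_any_dtype(df["t"])
df["t"] = pd.to_datetime(df["t"])
is_datetime_after = is_datetime64_any_dtype(df["t"])

# set_index returns a new DataFrame; without assignment, df keeps 't' as a column
df.set_index("t")
still_a_column = "t" in df.columns

# Assigning the result (or passing inplace=True) actually changes df
df = df.set_index("t")
has_datetime_index = isinstance(df.index, pd.DatetimeIndex)
```

Printing df.dtypes right after read_csv is a quick way to see whether t was really parsed.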

Reading your comment and the answer you have added, someone may conclude that this kind of problem can only be solved by specifying a parser in pd.read_csv(). So here is proof that my solution works in principle. Looking at what you have posted as a question, the other problem with your code is the way you have specified the plot command. Once t has become the index, you only need to select columns other than t for the plot command.

import pandas as pd
import matplotlib.pyplot as plt

# Read data from file
df = pd.read_csv('C:\\datetime.csv', parse_dates=['Date'])

# Convert Date to date-time and set as index
df['Date'] = pd.to_datetime(df['Date'])
df.set_index('Date', inplace=True)

df.plot(marker='D')
plt.xlabel('Date')
plt.ylabel('Number of Visitors')
plt.show()


df
Out[37]: 
        Date  Adults  Children  Seniors
0 2018-01-05     309       240      296
1 2018-01-06     261       296      308
2 2018-01-07     273       249      338
3 2018-01-08     311       250      244
4 2018-01-08     272       234      307

df
Out[39]: 
            Adults  Children  Seniors
Date                                 
2018-01-05     309       240      296
2018-01-06     261       296      308
2018-01-07     273       249      338
2018-01-08     311       250      244
2018-01-08     272       234      307
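
Applied to the question's column names, the same idea might look like the sketch below. The data is invented for illustration, and the Agg backend line is only there so the snippet runs without a display; in a notebook you would drop it and call plt.show() instead.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption: no display available)
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data shaped like the question's columns
df = pd.DataFrame({
    "t": pd.to_datetime(["2018-01-07", "2018-01-05", "2018-01-06", "2018-01-08"]),
    "sq": [4.0, 3.0, 1.0, 2.0],
    "iq": [11.0, 10.0, 12.0, 13.0],
})

# Make 't' the index and sort it, in case rows are not in time order
df = df.set_index("t").sort_index()

# 't' is the index now, so only the value column is selected
ax = df[["sq"]].plot(marker="D")
ax.set_xlabel("t")
ax.set_ylabel("sq")
```

Sorting the index is optional, but it keeps the line from doubling back on itself if the file is not in chronological order.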



1 Comment

The problem, indeed, was incorrect parsing of 't' column. However, pd.to_datetime did not solve it for me. The solution for it turned out to be adding a date_parser to the read_csv() call. Thank you for pointing me in the right direction. Cheers!
0

The issue turned out to be incorrect parsing of dates, as pointed out in an answer above. However, the solution for it was to pass a date_parser to the read_csv method call:

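import pandas as pd
from datetime import datetime as dt

# Parse every value in 't' with an explicit format instead of relying on inference
dtm = lambda x: dt.strptime(str(x), "%Y-%m-%d")
df = pd.read_csv("sample.csv", parse_dates=['t'], infer_datetime_format=True, date_parser=dtm)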
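
Note that date_parser and infer_datetime_format were deprecated in pandas 2.0, where read_csv takes a date_format string instead. A version-independent alternative is to read the column as text and convert it afterwards with an explicit format. The sketch below assumes the dates are in %Y-%m-%d form, as in the snippet above, and uses an invented in-memory CSV with one deliberately bad row:

```python
import io
import pandas as pd

# Hypothetical in-memory stand-in for sample.csv, with one malformed date
csv_text = "t,sq\n2018-01-05,1\n2018-01-06,2\nnot-a-date,3\n"

# Read 't' as plain text first
df = pd.read_csv(io.StringIO(csv_text))

# Convert with an explicit format; values that do not match become NaT
df["t"] = pd.to_datetime(df["t"], format="%Y-%m-%d", errors="coerce")
bad_rows = int(df["t"].isna().sum())

# Drop the rows that failed to parse, then index by time
df = df.dropna(subset=["t"]).set_index("t")
```

Counting the NaT values shows how many rows failed to parse, which helps locate the entries that caused the original problem.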

1 Comment

I have modified my answer to show that the method works in principle. I would like to verify that my solution does not work on your data and understand why. Could you show what you were getting with it?
