
I have a DataFrame with the row index as a DatetimeIndex.

This index is rendered differently on the x-axis depending on whether I make a line plot or a bar plot. My code is as follows:

import datetime
import pandas as pd

start_date = datetime.datetime.strptime('2017-02-20', '%Y-%m-%d').date()
end_date = datetime.datetime.strptime('2017-02-23', '%Y-%m-%d').date()

daterange = pd.date_range(start_date, end_date)
df = pd.DataFrame(index=daterange, data={'Male': [12, 23, 13, 11], 'Female': [10, 25, 15, 9]})

df.plot(kind='line')
df.plot(kind='bar', stacked=False, grid=True)

The plots I am obtaining are as follows.

[Figure: line plot with nice formatting of dates on x-axis]

[Figure: bar plot without formatting of dates on x-axis]

In the line plot, the x-axis labels are well formatted, with the month and year in the left corner and the days used as x-ticks. In the bar plot, however, the entire date along with the time (00:00:00) is shown.

How can I get the proper formatting of dates on x-axis in the bar plot and without the time being shown?

2 Answers

2

The problem is in the source code of pandas. You cannot get the bar plot to use the date formatter that pandas applies to line plots without deriving custom subclasses or using matplotlib directly.

In line 1766 (1784 in the dev version) of pandas.tools.plotting the datetime formatting for LinePlot is done. This is not present in BarPlot, for reasons that I can only hypothesize:

Line charts are intended to print timeseries data, whereas the same does not necessarily make sense for bar charts.

I would still like to see bar plots being able to format dates properly without using matplotlib, so you might want to open an issue with the pandas project.

With matplotlib directly:

import datetime

import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import pandas as pd

start_date = datetime.datetime.strptime('2017-02-20', '%Y-%m-%d').date()
end_date = datetime.datetime.strptime('2017-02-23', '%Y-%m-%d').date()
daterange = pd.date_range(start_date, end_date)
df = pd.DataFrame(index=daterange, data={'Male': [12, 23, 13, 11], 'Female': [10, 25, 15, 9]})

# The bars sit at integer positions 0..n-1, so keep the default ticks
# and only replace their labels with formatted dates.
ax = df.plot.bar(stacked=False, grid=True)
ticklabels = [item.strftime('%b %d') for item in df.index]
ax.xaxis.set_major_formatter(ticker.FixedFormatter(ticklabels))
plt.gcf().autofmt_xdate()

plt.show()

[Figure: bar plot with correctly formatted date labels]
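If a genuinely date-aware x-axis is wanted rather than a fixed list of labels, one option is to draw the bars with matplotlib itself. The following is only a rough sketch: the bar width and the left/right offsets are arbitrary values chosen for illustration, and the Agg backend line is an assumption for headless runs.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame(
    index=pd.date_range("2017-02-20", "2017-02-23"),
    data={"Male": [12, 23, 13, 11], "Female": [10, 25, 15, 9]},
)

# Convert the index to matplotlib date numbers (floats measured in days).
x = mdates.date2num(df.index.to_pydatetime())
width = 0.4  # illustrative choice: two bars per day, each 0.4 days wide

fig, ax = plt.subplots()
ax.bar(x - width / 2, df["Male"], width=width, label="Male")
ax.bar(x + width / 2, df["Female"], width=width, label="Female")

# One tick per day, labelled like "Feb 20" with no time component.
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b %d"))
ax.legend()
fig.autofmt_xdate()
```

Because the x-values are real dates here, gaps in the index show up as gaps on the axis, which the pandas bar plot would not do.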


1 Comment

With this solution I can use other strftime() directives to add detail to my graph, for example %a for the abbreviated weekday name. I used the same code to put weekday names on the x-ticks of the line plot, but it doesn't work there and gives me only an empty plot. Is it clashing with the formatting already applied for the LinePlot?
1

Bar plots are generally meant for categorical data. That means that, unlike in the line plot, the x-values are simply ascending integers (0, 1, 2, ...), not dates. The tick labels are then just the string representations of the DataFrame's index values, which is why the full timestamp appears.
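To see this concretely, the tick positions and label texts of the bar plot can be inspected. This is a small sketch using the DataFrame from the question; the Agg backend line is an assumption for running without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import pandas as pd

df = pd.DataFrame(
    index=pd.date_range("2017-02-20", "2017-02-23"),
    data={"Male": [12, 23, 13, 11], "Female": [10, 25, 15, 9]},
)

ax = df.plot(kind="bar")

# The ticks are plain integer positions, one per row of the DataFrame.
positions = [float(p) for p in ax.get_xticks()]
# The labels are text, generated from the index values.
labels = [t.get_text() for t in ax.get_xticklabels()]

print(positions)
print(labels)
```

The positions carry no date information at all, so a date formatter has nothing to work with; only the label strings can be changed.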

An easy option to get rid of the hours and minutes is to reset the labels as follows:

ax = df.plot(kind='bar', stacked = False)
ax.set_xticklabels([t.get_text().split()[0] for t in ax.get_xticklabels()])


Additionally adding ax.figure.autofmt_xdate() rotates and recenters the labels to take less space.
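A variant that does not depend on how the default label text happens to be formatted is to build the labels from the index itself. This is a minimal sketch, assuming the bars are drawn in index order (as df.plot does for this data); the format string is only an example, and the Agg backend line is an assumption for headless runs.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import pandas as pd

df = pd.DataFrame(
    index=pd.date_range("2017-02-20", "2017-02-23"),
    data={"Male": [12, 23, 13, 11], "Female": [10, 25, 15, 9]},
)

ax = df.plot(kind="bar", stacked=False)

# Format each index entry directly; %a adds the abbreviated weekday name.
labels = list(df.index.strftime("%a %b %d"))
ax.set_xticklabels(labels)

# Rotate and right-align the labels so they take less space.
ax.figure.autofmt_xdate()
```

Any other strftime() directive can be used in the format string in the same way.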

