2

This may be a very stupid question, but plotting a Pandas DataFrame with .plot() is very quick and produces a graph with an appropriately formatted date axis. As soon as I switch to a bar chart, the formatting is lost and the x-axis labels go wild. Why is this the case? And is there an easy way to plot a bar chart with the same axis format as the line chart?

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame()
df['Date'] = pd.date_range(start='2012-01-01', end='2018-12-31')
df['Value'] = np.random.randint(low=5, high=100, size=len(df))
df.set_index('Date', inplace=True)

df.plot()
plt.show()

df.plot(kind='bar')
plt.show()

Update: For comparison, if I put the data into Excel and create a line plot and a bar ('column') plot, Excel converts the plot instantly and keeps the axis labels as they were for the line plot. If I try to produce many (thousands of) bar charts in Python with years of daily data, this takes a long time. Is there an equivalent way of doing this Excel transformation in Python?

Excel plots

2 Answers

6

Pandas bar plots are categorical in nature; i.e. each bar is a separate category and gets its own label. Plotting numeric bar plots (in the same manner as line plots) is not currently possible with pandas.

In contrast, matplotlib bar plots are numerical if the input data is numbers or dates. So

plt.bar(df.index, df["Value"])

produces a bar plot with a numeric date axis.

Note, however, that because the 2557 data points in your dataframe are distributed over only a few hundred pixels, not all bars are actually drawn. Conversely, if you want every bar to be shown, each one needs to be at least one pixel wide in the final image. With 5% margins on each side, this means your figure needs to be more than 2800 pixels wide, or saved in a vector format.
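The width estimate can be checked with a little arithmetic. The sketch below assumes, as the answer does, one pixel per bar and 5% margins on each side; the `dpi` value of 100 is an arbitrary choice for illustration.

```python
import math

n_bars = 2557   # daily points from 2012-01-01 to 2018-12-31
margin = 0.05   # assumed margin on each side, as a fraction of the width
dpi = 100       # assumed resolution, chosen only for illustration

# The bars share the fraction of the width that is left after both margins.
min_width_px = math.ceil(n_bars / (1 - 2 * margin))

# Width in inches that would be passed as the first figsize value.
min_width_in = min_width_px / dpi

print(min_width_px, min_width_in)
```

A figure of that size could then be requested with something like plt.subplots(figsize=(min_width_in, 4), dpi=dpi). Matplotlib's default subplot padding takes up additional space, so the real requirement is likely somewhat larger.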

So rather than showing daily data, maybe it makes sense to aggregate to monthly or quarterly data first.
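As a minimal sketch of that aggregation, the daily values can be resampled to monthly values and drawn with matplotlib's numeric bar plot. The choice of the mean as the aggregate and the bar width of 25 days are assumptions made for illustration; a sum or a quarterly frequency may suit the data better.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Same kind of data as in the question: seven years of daily random values.
dates = pd.date_range(start="2012-01-01", end="2018-12-31", freq="D")
df = pd.DataFrame({"Value": np.random.randint(5, 100, size=len(dates))}, index=dates)

# Aggregate to one value per month, labelled by the first day of the month.
# Using the mean is an assumption; .sum() works the same way.
monthly = df["Value"].resample("MS").mean()

fig, ax = plt.subplots()
# On a date axis, a plain number for the bar width is interpreted in days.
ax.bar(monthly.index, monthly.values, width=25)
ax.set_ylabel("Monthly mean of Value")
```

This draws 84 bars instead of 2557, and the x-axis keeps the date formatting of the line plot. Call plt.show() or fig.savefig(...) to view the result.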

3

The default .plot() connects all your data points with straight lines and produces a line plot.

On the other hand, .plot(kind='bar') plots each data point as a discrete bar. To get proper formatting on the x-axis, you will have to modify the tick labels after plotting.
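One way to adjust the tick labels after plotting is sketched below. It assumes that one label per year is what is wanted; the names `positions` and `labels` are introduced here for illustration only.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Same kind of data as in the question: seven years of daily random values.
dates = pd.date_range(start="2012-01-01", end="2018-12-31", freq="D")
df = pd.DataFrame({"Value": np.random.randint(5, 100, size=len(dates))}, index=dates)

ax = df.plot(kind="bar", width=1.0, legend=False)

# pandas draws bar i at x = i, so tick positions are integer row offsets.
# Keep only the rows that fall on the first day of a year.
positions = [i for i, d in enumerate(df.index) if d.month == 1 and d.day == 1]
labels = [str(df.index[i].year) for i in positions]

# Replace the per-bar categorical labels with one label per year.
ax.set_xticks(positions)
ax.set_xticklabels(labels, rotation=0)
```

This only tidies the axis; pandas still creates one bar per row, so it does not make the plot any faster to produce.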

2 Comments

If I am producing, say, 1,000 charts of multiple years' worth of daily data, the .plot(kind='bar') method will be extremely slow. Is there a better way of achieving this? I know it's not a great comparison, but if I extract the data into Excel and plot a line chart vs. a bar ('column') chart, Excel plots exactly what I want very quickly. Is there an equivalent to this in Python?
@13sen1 : I do not know what Excel does. Maybe you can include the Excel figure in your question so that readers know what you want.
