Sorry, I'm new to plotly.
I have a clean pd.DataFrame with dates in order.

The initial date is in this format: YYYYMMDD.
When I tried to convert it, plotly displayed the date in the hover text as what I can only describe as random numbers. After a lot of searching, I found my workaround (see code) as the only solution.

Now to the real problem: as you can see, each month has a different number of entries. When I drop the day, plotly puts all entries of a month into the same spot.
When I use x=df.index, I get the best result, but I have no visualization of the date. What I would like is even gaps between entries and a clear visual cue showing which entry belongs to which month.
I'll append an image to the end of the post to better explain my problem (English isn't my first language).

The code:

import pandas as pd
import plotly.express as px

columns = ["date", "farts"]
df = pd.read_csv('test.csv', sep=',', engine='python', names=columns)

# Using a smaller made up csv file for testing. It looks like this:
# 20200119, 50
# 20200115, 40
# 20200105, 30
# 20191215, 40
# 20191120, 35
# 20191115, 12

print(df)

df["date"] = pd.to_datetime(df["date"], format="%Y%m%d")

df["date"] = df["date"].dt.strftime('%Y-%m')

print(df)

#works very well so far:

# before:

#        date  farts
# 0  20200119     50
# 1  20200115     40
# 2  20200105     30
# 3  20191215     40
# 4  20191120     35
# 5  20191115     12

# after:

#       date  farts
# 0  2020-01     50
# 1  2020-01     40
# 2  2020-01     30
# 3  2019-12     40
# 4  2019-11     35
# 5  2019-11     12

fig = px.bar(df, x="date", y='farts', width=1000, height=350)
fig.show()

Do you have any ideas what I could do to get a better-looking graph?

Picture to help understand: https://i.sstatic.net/R3T0p.png

Edit: I tried a few more things and I'm getting more and more frustrated: either nothing shows, the dates get reversed, etc.

If I go with df["date"], I can't stop plotly from lumping entries from the same month into one place.
If I go with df.index, I can't seem to label the x-axis entries according to the date column.

  • What happens if you do df.set_index('date') before plotting? Commented Mar 4, 2020 at 11:52

2 Answers

If I understand correctly, you can plot against the index and then update the layout to set the x-tick labels.

import pandas as pd
import plotly.graph_objs as go
import plotly.express as px

from io import StringIO

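csv_text = """date,farts
20200119, 50
20200115, 40
20200105, 30
20191215, 40
20191120, 35
20191115, 12"""

# skipinitialspace drops the blank that follows each comma
df = pd.read_csv(StringIO(csv_text), skipinitialspace=True)

# Parse the YYYYMMDD values, then sort on the full date so entries
# within a month stay in chronological order
df["date"] = pd.to_datetime(df["date"], format="%Y%m%d")
df = df.sort_values("date").reset_index(drop=True)
# Keep only year and month as the tick label
df["date"] = df["date"].dt.strftime('%Y-%m')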

plotly.graph_objs

fig =  go.Figure()
fig.add_trace(go.Bar(x=df.index,y=df["farts"]))
fig.update_layout(
    xaxis = dict(
        tickmode = 'array',
        tickvals = df.index,
        ticktext = df["date"]
    )
)
fig.show()

plotly.express

# Assign the figure so the layout update below applies to it
fig = px.bar(df, x=df.index, y="farts")
fig.update_layout(
    xaxis = dict(
        tickmode = 'array',
        tickvals = df.index,
        ticktext = df["date"]
    )
)
fig.show()

The output is the same for both versions.
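For the "clear visual cue" asked for in the question, one possible refinement is to label only the first bar of each month and leave the other tick labels empty. This is a sketch, not a tested plotly recipe. The helper name month_ticks is made up for this illustration, and the block uses only the standard library so the label logic can be checked without plotly installed.

```python
from datetime import datetime

# Sample rows from the question: (YYYYMMDD, value)
rows = [
    (20200119, 50),
    (20200115, 40),
    (20200105, 30),
    (20191215, 40),
    (20191120, 35),
    (20191115, 12),
]


def month_ticks(rows):
    """Hypothetical helper: sort rows oldest first and build tick data.

    Returns (tickvals, ticktext, values). ticktext carries the month only
    on the first entry of each month; the remaining labels are empty.
    """
    parsed = sorted(
        ((datetime.strptime(str(raw), "%Y%m%d"), value) for raw, value in rows),
        key=lambda pair: pair[0],
    )
    tickvals = list(range(len(parsed)))  # evenly spaced positions
    ticktext = []
    previous = None
    for day, _ in parsed:
        month = day.strftime("%Y-%m")
        ticktext.append(month if month != previous else "")
        previous = month
    values = [value for _, value in parsed]
    return tickvals, ticktext, values


tickvals, ticktext, values = month_ticks(rows)
print(ticktext)
```

The three lists would then slot into the calls already shown in this answer: go.Bar(x=tickvals, y=values), with tickvals=tickvals and ticktext=ticktext inside update_layout.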


You have two options, depending on what you want.

First let's create the data for the example:

import pandas as pd
import plotly.express as px

data = [
    ["20200119", 50],
    ["20200115", 40],
    ["20200105", 30],
    ["20191215", 40],
    ["20191120", 35],
    ["20191115", 12],
]

1. Plot as categories

By default, plotly interprets date-like values as dates. You can override that with:

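df = pd.DataFrame(data, columns=["date", "farts"])
# Prefix a letter so plotly treats the values as categories, not dates.
# The column must still hold strings here (no pd.to_datetime before this line).
df["date"] = "D" + df["date"]

fig = px.bar(df, x="date", y='farts')
fig.show()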
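An alternative to the letter prefix, sketched under the assumption that setting the x-axis `type` attribute to "category" is acceptable here: plotly then gives every distinct x string its own evenly spaced slot, even when the strings look like dates. The figure is written as a plain dict so the block runs without plotly installed. The name spec is illustrative, and passing it to plotly.graph_objects.Figure is the assumed way to render it.

```python
from datetime import datetime

# Same sample data as above: (YYYYMMDD string, value)
rows = [
    ("20200119", 50),
    ("20200115", 40),
    ("20200105", 30),
    ("20191215", 40),
    ("20191120", 35),
    ("20191115", 12),
]

# YYYYMMDD strings sort chronologically, so a plain string sort is enough
ordered = sorted(rows, key=lambda row: row[0])

# Readable labels; each full date is unique, so no bars get merged
labels = [datetime.strptime(raw, "%Y%m%d").strftime("%Y-%m-%d") for raw, _ in ordered]
values = [value for _, value in ordered]

# Figure description as a plain dict (data + layout)
spec = {
    "data": [{"type": "bar", "x": labels, "y": values}],
    # "category" stops the axis from placing bars on a continuous time scale
    "layout": {"xaxis": {"type": "category"}},
}
print(spec["data"][0]["x"])
```

With plotly available, something like go.Figure(spec).show() should draw it. On an existing px.bar figure, fig.update_xaxes(type="category") is the equivalent setting.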

2. Monthly resample

If you want to plot monthly data, you should avoid duplicate dates. To do that, resample and take the mean or sum of all entries in each month:

df = pd.DataFrame(data, columns=["date", "farts"])
df["date"] = pd.to_datetime(df["date"], format="%Y%m%d")
# One row per month (labelled by month start), averaging that month's entries
df = df.resample('MS', on='date').mean()
df = df.reset_index() # plotly express needs date as a column

fig = px.bar(df, x="date", y='farts')
fig.show()
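To make the effect of the resample concrete, here is a small check of what the monthly mean and sum come out to for the sample data. It is only an illustration: the names monthly_mean and monthly_sum are introduced here, and the block leaves the plotting out.

```python
import pandas as pd

data = [
    ["20200119", 50],
    ["20200115", 40],
    ["20200105", 30],
    ["20191215", 40],
    ["20191120", 35],
    ["20191115", 12],
]

df = pd.DataFrame(data, columns=["date", "farts"])
df["date"] = pd.to_datetime(df["date"], format="%Y%m%d")

# Group the rows into calendar months, labelled by the first day of the month
monthly = df.resample("MS", on="date")["farts"]
monthly_mean = monthly.mean()  # average of the entries in each month
monthly_sum = monthly.sum()    # total of the entries in each month

print(monthly_mean)
print(monthly_sum)
```

Each month collapses to a single value (three bars instead of six), so this option only fits when individual entries do not need to stay separate.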

2 Comments

Thank you for your answer. 1. Doesn't work; the "D" line gives errors. 2. Is not a good solution because I want to keep every single entry separate.
That is because you are using pd.to_datetime, and you should not. I have edited the answer to make it clearer.
