Hi, I am working with categorical data and want to see device behavior over a given day. My dataframe is shown below.

The toronto_time column is datetime64. I previously used dt.time to strip the date, but that turns the column into dtype object instead of datetime64, and converting it back with pd.to_datetime adds a date again.
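To make the dtype issue concrete, here is a minimal sketch with made-up sample values. It assumes the column was parsed with pd.to_datetime; the variable names (stamps, offset, hours) are only for illustration. It also shows one possible workaround, not the only one: subtracting midnight keeps the time of day in a timedelta dtype instead of object.

```python
import pandas as pd

# Two sample timestamps in the same shape as the toronto_time column
stamps = pd.Series(pd.to_datetime(['2018-09-08 00:00:50',
                                   '2018-09-08 00:06:33']))

# .dt.time returns plain datetime.time objects, so the dtype becomes object
times = stamps.dt.time
print(times.dtype)

# Subtracting midnight of the same day keeps a numeric (timedelta) dtype
offset = stamps - stamps.dt.normalize()
print(offset.dtype)

# Hours since midnight as floats, which any plotting call can use directly
hours = offset.dt.total_seconds() / 3600
print(hours.tolist())
```

The hours column loses the datetime tick formatting, so this only helps if plain numeric hour labels are acceptable.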

So I kept the original column:

       toronto_time             description
0      2018-09-08 00:00:50      STATS
1      2018-09-08 00:01:55      STATS
2      2018-09-08 00:02:18      DEV_OL
3      2018-09-08 00:05:24      STATS
4      2018-09-08 00:05:34      STATS
5      2018-09-08 00:06:33      CMD_ERROR

I tried plotting it with seaborn using this code:

import matplotlib.pyplot as plt
import matplotlib.dates as md
import seaborn as sns

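# On matplotlib older than 3.6 this style is named 'seaborn-colorblind'
plt.style.use('seaborn-v0_8-colorblind')
plt.figure(figsize=(8,6))
# Recent seaborn versions require x and y as keyword arguments
sns.swarmplot(x='toronto_time', y='description', data=df)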
plt.show()

However, the plot is compressed into that single day. I want to remove the date from the x-axis labels and stretch the axis across the hours of the day (0:00 to 24:00).

This is what I got: [image]

  • If you make a transformation of your dataframe like this df['toronto_time'] = df['toronto_time'].dt.hour, doesn't it provide the desired output? Commented Sep 12, 2018 at 9:30
  • That keeps only the hour, so every point in a category would sit on a whole hour. Commented Sep 12, 2018 at 9:39
  • I'm afraid I don't quite understand you... See my answer and comment on it if it doesn't produce your desired output. Commented Sep 12, 2018 at 9:46

1 Answer

I'm not sure why you want the minutes and seconds on the graph if your ticks fall only on the hour, but you can get them by setting a formatter on the x-axis. I would also suggest changing your axis limits if you want ticks by the hour.

import pandas as pd

import matplotlib.pyplot as plt
import matplotlib.dates as md
import seaborn as sns

df = pd.DataFrame({'toronto_time': ['2018-09-08 00:00:50',
                                    '2018-09-08 01:01:55',
                                    '2018-09-08 02:02:18',
                                    '2018-09-08 03:05:24',
                                    '2018-09-08 04:05:34',
                                    '2018-09-08 05:06:33'], 
                    'description': ['STATS', 'STATS', 'DEV_OL', 'STATS', 'STATS', 'CMD_ERROR']})
df['toronto_time'] = pd.to_datetime(df['toronto_time'], format='%Y-%m-%d %H:%M:%S')

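# On matplotlib older than 3.6 this style is named 'seaborn-colorblind'
plt.style.use('seaborn-v0_8-colorblind')
fig, ax = plt.subplots(figsize=(8,6))
# Recent seaborn versions require x and y as keyword arguments
sns.swarmplot(x='toronto_time', y='description', data=df, ax=ax)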
ax.set_xlim(df['toronto_time'].min()-pd.Timedelta(1,'h'),
            df['toronto_time'].max()+pd.Timedelta(1,'h'))
ax.xaxis.set_major_formatter(md.DateFormatter('%H:%M:%S'))

plt.show()

Here's a nice example showing how to use a locator to define how the ticks are spaced as well: http://leancrew.com/all-this/2015/01/labeling-time-series/
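As a rough sketch of the locator idea, the snippet below puts one tick on every hour and stretches the x-axis over the whole day. It uses a plain matplotlib scatter instead of the seaborn swarmplot to stay self-contained, and plot_full_day is a hypothetical helper name, not part of any library. It assumes all rows fall on the same calendar day.

```python
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as md

df = pd.DataFrame({'toronto_time': ['2018-09-08 00:00:50',
                                    '2018-09-08 01:01:55',
                                    '2018-09-08 02:02:18',
                                    '2018-09-08 03:05:24',
                                    '2018-09-08 04:05:34',
                                    '2018-09-08 05:06:33'],
                   'description': ['STATS', 'STATS', 'DEV_OL',
                                   'STATS', 'STATS', 'CMD_ERROR']})
df['toronto_time'] = pd.to_datetime(df['toronto_time'])


def plot_full_day(frame, time_col='toronto_time', cat_col='description'):
    """Scatter events over one day with a tick on every hour (hypothetical helper)."""
    fig, ax = plt.subplots(figsize=(8, 6))
    # String y-values are treated as categories by matplotlib
    ax.scatter(frame[time_col].to_numpy(), frame[cat_col].tolist())
    # Limits run from midnight of the first timestamp to midnight of the next day
    day_start = frame[time_col].min().normalize()
    ax.set_xlim(day_start, day_start + pd.Timedelta(1, 'D'))
    # One major tick per hour, labelled without the date
    ax.xaxis.set_major_locator(md.HourLocator())
    ax.xaxis.set_major_formatter(md.DateFormatter('%H:%M'))
    # Rotate the labels so 25 hourly ticks do not overlap
    fig.autofmt_xdate()
    return fig, ax


fig, ax = plot_full_day(df)
```

Call plt.show() afterwards to display the figure. The same set_xlim, set_major_locator and set_major_formatter calls should apply unchanged to the ax passed to sns.swarmplot.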

2 Comments

Nice! Thanks! However, my xticks show only every third hour (0, 3, 6, 9, 12 ... 21:00, 0:00). How can I display all hours (0:00 to 23:59, or back to 0:00)?
It's in the link at the bottom of the answer: try ax.xaxis.set_major_locator(md.HourLocator())