1

Once in a while I have time data for which I just want to visualize how often events occur. I have a list of datetimes and want to show a plot where

  • x-axis is the hour of day (0 to 23, hence 24 bins)
  • y-axis is the number of events

So it is essentially a histogram, binned by hour of day.

I already have one solution, but how do I make sure that all 24 bins exist, even for hours with no events? (It could look nicer, too.)

Minimal Example

#!/usr/bin/env python


"""Create and visualize data with timestamps."""

# core modules
from datetime import datetime
import random

# 3rd party module
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt


def create_data(num_samples, year, month_p=None, day_p=None):
    """
    Create timestamp data.

    Parameters
    ----------
    num_samples : int
    year : int
    month_p : int, optional (default: None)
    day_p : int, optional (default: None)

    Returns
    -------
    data : pandas.DataFrame
    """
    data = []
    for _ in range(num_samples):
        if month_p is None:
            month = random.randint(1, 12)
        else:
            month = month_p
        if day_p is None:
            day = random.randint(1, 28)
        else:
            day = day_p
        hour = int(np.random.normal(loc=7) * 3) % 24
        minute = random.randint(0, 59)
        data.append({'date': datetime(year, month, day, hour, minute)})
    data = sorted(data, key=lambda n: n['date'])
    return pd.DataFrame(data)


def visualize_data(df):
    """
    Plot data binned by hour.

    x-axis is the hour, y-axis is the number of datapoints.

    Parameters
    ----------
    df : pandas.DataFrame
    """
    df.groupby(df["date"].dt.hour).count().plot(kind="bar")
    plt.show()


df = create_data(2000, 2017)
visualize_data(df)

As you can see, the bins for hours 7, 9 and 10 are missing from the x-axis.

3 Answers

4

Reindex the grouped result over all 24 hours, filling the missing hours with 0, and then call the plot method:

res = df.groupby(df["date"].dt.hour).count().reindex(np.arange(24), fill_value=0)
res.plot(kind="bar")
plt.show()
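
To see what reindex does in isolation, here is a minimal sketch on a tiny hand-made dataset. The helper name hourly_counts and the sample timestamps are assumptions made up for illustration; they are not part of the question's code.

```python
import numpy as np
import pandas as pd


def hourly_counts(df, column="date"):
    """Count rows per hour of day, keeping all 24 hours (hypothetical helper)."""
    counts = df.groupby(df[column].dt.hour)[column].count()
    # reindex adds the hours that never occurred and gives them a count of 0.
    return counts.reindex(np.arange(24), fill_value=0)


# Made-up sample: two events in hour 8, one event in hour 21.
sample = pd.DataFrame(
    {"date": pd.to_datetime(["2017-01-01 08:15", "2017-03-05 08:40", "2017-06-09 21:05"])}
)
counts = hourly_counts(sample)
```

The result has one entry per hour from 0 to 23, so a bar plot of it always shows 24 bars, including zero-height bars for the empty hours.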

1

Try this function, which adds every missing hour with a count of 0 before plotting:

def visualize_data(df):
    """
    Plot data binned by hour.

    x-axis is the hour, y-axis is the number of datapoints.

    Parameters
    ----------
    df : pandas.DataFrame
    """
    y = df.groupby(df["date"].dt.hour).count()
    for i in range(24):
        if i not in y.index:
            y.loc[i] = 0  # Add missing hours with a count of 0.
    y.sort_index(inplace=True)  # Restore hour order after appending rows.
    y.plot(kind="bar")
    plt.show()
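
If the loop feels heavy, one possible alternative (a different technique from the function above, shown only as a sketch) is numpy.bincount with minlength=24, which counts every hour in one call and includes hours without events. The sample timestamps and the series name "events" are made up for illustration.

```python
import numpy as np
import pandas as pd

# Made-up sample: two events in hour 8, one event in hour 21.
sample = pd.DataFrame(
    {"date": pd.to_datetime(["2017-01-01 08:15", "2017-03-05 08:40", "2017-06-09 21:05"])}
)

# Hours as a plain integer array; this assumes the column has no missing timestamps.
hours = sample["date"].dt.hour.to_numpy()

# bincount counts occurrences of each integer; minlength=24 pads up to hour 23.
counts = pd.Series(np.bincount(hours, minlength=24), index=np.arange(24), name="events")
```

The resulting series can be plotted with counts.plot(kind="bar") in the same way as the grouped result.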

-1
import matplotlib
matplotlib.style.use('ggplot')

See https://pandas.pydata.org/pandas-docs/stable/visualization.html for details.

"As you can see, the 7, 9 and 10 are missing."

Could those hours simply have 0 events?
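
For the styling suggestion, a minimal sketch of how the ggplot style could be applied to an hourly bar chart. The counts are made up, and the axis labels and bar width are illustrative choices rather than anything from the question.

```python
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

# Made-up hourly counts: every hour present, most of them with 0 events.
counts = pd.Series(np.zeros(24, dtype=int), index=np.arange(24))
counts.loc[[8, 21]] = [2, 1]

# style.context applies the style only inside the block,
# whereas style.use changes it for the whole session.
with plt.style.context("ggplot"):
    ax = counts.plot(kind="bar", width=0.9)
    ax.set_xlabel("Hour of day")
    ax.set_ylabel("Number of events")

# plt.show()  # uncomment to display the figure interactively
```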
