import pandas as pd
d = [{'col1' : ' B', 'col2' : '2015-3-06 01:37:57'},
       {'col1' : ' A', 'col2' : '2015-3-06 01:39:57'},
       {'col1' : ' A', 'col2' : '2015-3-06 01:45:28'},
       {'col1' : ' B', 'col2' : '2015-3-06 02:31:44'},
       {'col1' : ' B', 'col2' : '2015-3-06 03:55:45'},
       {'col1' : ' B', 'col2' : '2015-3-06 04:01:40'}]
df = pd.DataFrame(d)
df['col2'] = pd.to_datetime(df['col2'])

For each row, I want to count the number of rows that have the same value of 'col1' and a time within the 10 minutes before that row's time (inclusive). I'm interested in an implementation that runs fast.

This code runs very slowly on a big dataset:

dt = pd.Timedelta(10, unit='m')
def count1(row):
    id1 = row['col1']
    start_time = row['col2'] - dt
    end_time = row['col2']
    mask = (df['col1'] == id1) & ((df['col2'] >= start_time) & (df['col2'] <= end_time))
    return df.loc[mask].shape[0]

df['count1'] = df.apply(count1, axis=1)

df.head(6)

    col1    col2    count1
0   B   2015-03-06 01:37:57     1
1   A   2015-03-06 01:39:57     1
2   A   2015-03-06 01:45:28     2
3   B   2015-03-06 02:31:44     1
4   B   2015-03-06 03:55:45     1
5   B   2015-03-06 04:01:40     2

Note: column 'col2' is date-sensitive, not only time-sensitive.

2 Answers

3

The problem is that apply is very expensive here: count1 builds a mask over the whole DataFrame for every single row. One option is to optimize the code with Cython or Numba.

This might be helpful.
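Before reaching for a compiler, it may be worth checking whether the full scan per row can be avoided in plain NumPy. The sketch below is not a Cython or Numba solution; it swaps in a different technique (sorting each col1 group once and locating window edges with `np.searchsorted`). The function name `count_trailing` and its `window` parameter are made up for this illustration.

```python
import numpy as np
import pandas as pd

d = [{'col1': ' B', 'col2': '2015-3-06 01:37:57'},
     {'col1': ' A', 'col2': '2015-3-06 01:39:57'},
     {'col1': ' A', 'col2': '2015-3-06 01:45:28'},
     {'col1': ' B', 'col2': '2015-3-06 02:31:44'},
     {'col1': ' B', 'col2': '2015-3-06 03:55:45'},
     {'col1': ' B', 'col2': '2015-3-06 04:01:40'}]
df = pd.DataFrame(d)
df['col2'] = pd.to_datetime(df['col2'])


def count_trailing(frame, window=pd.Timedelta(minutes=10)):
    """Hypothetical helper: per-row count of same-col1 rows in [t - window, t]."""
    out = np.empty(len(frame), dtype=np.int64)
    all_times = frame['col2'].to_numpy()
    step = window.to_timedelta64()
    # .indices maps each col1 value to the row positions in that group.
    for positions in frame.groupby('col1').indices.values():
        times = all_times[positions]
        order = np.argsort(times, kind='stable')
        t = times[order]
        # First position with time >= t - window (inclusive lower edge).
        lo = np.searchsorted(t, t - step, side='left')
        # One past the last position with time <= t (inclusive upper edge).
        hi = np.searchsorted(t, t, side='right')
        out[positions[order]] = hi - lo
    return out


df['count_fast'] = count_trailing(df)
```

On the sample data this should reproduce the count1 column from the question. The Python loop runs once per distinct col1 value rather than once per row, so how much it helps depends on how many groups the real dataset has.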

Another option is the following:

  1. Create a column of timestamps from col2.
  2. Create a column of ids that group the timestamps by your 10-minute criterion.
  3. Create a combined column from the previously created ids and col1, as in df['combined_ids'] = df['time_ids'].map(str) + df['col1'].
  4. Use groupby to count the rows that share a combined id, something like: df.groupby(df['combined_ids']).size()
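The steps above could be sketched roughly as follows. It assumes the 10-minute ids are obtained by flooring each timestamp with `dt.floor('10min')`; the column names `time_ids`, `combined_ids` and `bin_count` are illustrative only.

```python
import pandas as pd

d = [{'col1': ' B', 'col2': '2015-3-06 01:37:57'},
     {'col1': ' A', 'col2': '2015-3-06 01:39:57'},
     {'col1': ' A', 'col2': '2015-3-06 01:45:28'},
     {'col1': ' B', 'col2': '2015-3-06 02:31:44'},
     {'col1': ' B', 'col2': '2015-3-06 03:55:45'},
     {'col1': ' B', 'col2': '2015-3-06 04:01:40'}]
df = pd.DataFrame(d)

# Step 1: parse col2 into real timestamps.
df['col2'] = pd.to_datetime(df['col2'])
# Step 2: id of the fixed 10-minute bin each timestamp falls into.
df['time_ids'] = df['col2'].dt.floor('10min')
# Step 3: combine the bin id with col1.
df['combined_ids'] = df['time_ids'].map(str) + df['col1']
# Step 4: number of rows sharing each combined id, broadcast back to the rows.
df['bin_count'] = df.groupby('combined_ids')['col1'].transform('size')
```

Note that fixed bins are not the same thing as a trailing window: on the sample data every row lands in its own (bin, col1) pair, so `bin_count` is 1 everywhere, while the count1 column in the question is 2 for rows 2 and 5. This is fast, but only an approximation of what was asked.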

0

Try using pd.Grouper:

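df['col2'] = pd.to_datetime(df['col2'])
# Count rows per (10-minute bin, col1) pair; the bins are fixed intervals, not a trailing window
df.groupby([pd.Grouper(key='col2', freq='10min'), df['col1']]).size().reset_index(name='count')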
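`pd.Grouper` assigns rows to fixed calendar intervals, whereas the question asks for a window that ends at each row's own timestamp. A minimal sketch of that variant, assuming a time-based `rolling` per col1 group is acceptable, is shown below. The helper column `ones` and the output column `count_rolling` are names invented for this example.

```python
import pandas as pd

d = [{'col1': ' B', 'col2': '2015-3-06 01:37:57'},
     {'col1': ' A', 'col2': '2015-3-06 01:39:57'},
     {'col1': ' A', 'col2': '2015-3-06 01:45:28'},
     {'col1': ' B', 'col2': '2015-3-06 02:31:44'},
     {'col1': ' B', 'col2': '2015-3-06 03:55:45'},
     {'col1': ' B', 'col2': '2015-3-06 04:01:40'}]
df = pd.DataFrame(d)
df['col2'] = pd.to_datetime(df['col2'])

# Sort so groups are contiguous and time increases within each group;
# the rolling result then lines up row-for-row with this sorted frame.
ordered = df.sort_values(['col1', 'col2'])

rolled = (
    ordered.assign(ones=1)            # a numeric column to sum over
           .set_index('col2')         # time-based windows need a datetime index
           .groupby('col1')['ones']
           .rolling('10min', closed='both')  # include both window edges
           .sum()
)

ordered['count_rolling'] = rolled.to_numpy().astype(int)
# Assignment aligns on the original index, restoring the original row order.
df['count_rolling'] = ordered['count_rolling']
```

On the sample data this should match the count1 column from the question. Rows in the same group with exactly identical timestamps may be counted differently from the apply version, so that case is worth checking on real data.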
