I'm trying to calculate the datetime difference between consecutive rows for each unique machine_id. I have already grouped the DataFrame and have tried:

newdf = newdf.copy()
newdf['diffs'] = float('nan')
newdf = newdf.copy()
for index in newdf.index.levels[0]:
    newdf.diffs[index] = newdf.event_datetime[index].diff

The dataset looks like this:

https://i.sstatic.net/eg93C.png

  • What package are you using for your dataframe? Commented Dec 10, 2019 at 0:32
  • I'm using pandas Commented Dec 10, 2019 at 0:43

2 Answers

Have you tried diff after groupby operation? Something like:

newdf.groupby('machine_id').event_date.diff()
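
A minimal sketch of this suggestion, assuming a flat DataFrame with `machine_id` and `event_date` columns. The frame `df` and its sample values are made up for illustration and are not the asker's data:

```python
import pandas as pd

# Hypothetical sample data; the real column names and values may differ.
df = pd.DataFrame({
    "machine_id": [598, 598, 598, 615, 615],
    "event_date": ["2017-03-20 12:00:00", "2017-03-29 01:00:00",
                   "2017-04-29 01:00:00", "2017-03-30 02:00:00",
                   "2017-04-29 02:00:00"],
})

# Make sure the column holds datetime64 values rather than strings.
df["event_date"] = pd.to_datetime(df["event_date"])

# Difference to the previous row within each machine_id. The first row
# of every group has no predecessor, so its result is NaT.
df["diffs"] = df.groupby("machine_id")["event_date"].diff()
print(df)
```

The result is a timedelta column, so no manual loop over the groups is needed.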

6 Comments

Hi, yes, I have tried it. But the problem is that event_date has a datetime type. I'm trying to find a way to calculate the difference between two datetime values.
I think diff works with datetime also. What is the dtype of your event_date? And what is your desired output? Differences in days, hours, seconds (integers) or just timedelta?
Yes, the diff function should work fine even if the type is datetime. Maybe you should convert your column to datetime using pd.to_datetime.
Hi, my desired output is differences in days, hours and seconds.
@RunyaoYin then you can try converting your target column to datetime first: df['col'] = pd.to_datetime(df['col'])

I created a MultiIndex DataFrame to test this, and the diff() function works fine on it.

Using newdf.groupby('machine_id').event_date.diff(), as suggested by ATL, should work.

import pandas as pd

# hierarchical (two-level) row index: 3 machines x 3 category entries = 9 rows
index = pd.MultiIndex.from_product([[598, 615, 721], [43, 43, 45]],
                                   names=['machine_id', 'prod_category_id'])

# mock some data (one timestamp string per row)
data = ['2017-03-20 12:00:00','2017-03-29 01:00:00','2017-04-29 01:00:00',
        '2017-03-30 02:00:00', '2017-04-29 02:00:00','2017-05-29 12:00:00',
        '2017-10-30 02:00:00', '2017-11-29 02:00:00', '2017-11-29 04:00:00']

# create the DataFrame
newdf = pd.DataFrame(data, index=index)
newdf.columns = ['event_date']

# convert the strings to datetime, then diff within each machine_id (index level 0)
newdf['event_date'] = pd.to_datetime(newdf['event_date'])
newdf.groupby(level=0)['event_date'].diff()
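
Since the desired output mentioned in the comments is the difference in days, hours and seconds, here is a hedged sketch of how a timedelta result could be broken down. The names `events`, `diffs` and `parts` and the sample timestamps are illustrative assumptions, not part of the original answer:

```python
import pandas as pd

# Hypothetical timestamps for a single machine.
events = pd.Series(pd.to_datetime([
    "2017-03-20 12:00:00",
    "2017-03-29 01:00:00",
    "2017-04-29 01:00:00",
]))

# Row-to-row differences; the first entry is NaT.
diffs = events.diff()

# Total elapsed time as one float per row (NaN where the diff is NaT).
total_seconds = diffs.dt.total_seconds()
total_hours = total_seconds / 3600
total_days = total_seconds / 86400

# Split each timedelta into separate day / hour / minute / second columns.
parts = diffs.dt.components[["days", "hours", "minutes", "seconds"]]
print(parts)
```

`dt.total_seconds()` gives the whole duration as a single number, while `dt.components` splits it into fields, so which one fits depends on whether a single number or a breakdown is wanted.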
