Need to add new rows to dataframe based on condition.

Current dataframe:

[Image: screenshot of the current dataframe]

This dataframe has 4 columns. What I want to do is check the 'Time' column, find the value nearest to 12 AM (midnight) in every night shift, and add two new rows at 23:59:59 and 00:00:01 with the same values as that nearest data point.

For example: the closest value (to midnight) for 03-01 Night is 21:46:54, so I need to add two rows:

W25     03-01 Night    RUNNING    23:59:59
W25     03-01 Night    RUNNING    00:00:01

So the final expected dataframe should look like this:

[Image: screenshot of the expected dataframe]

Sample data:

import pandas as pd
from pandas import Timestamp

data = {'Machine': {0: 'W5', 343: 'W5', 344: 'W5', 586: 'W5', 587: 'W5'}, 'State': {0: 'start', 343: 'STOPPED', 344: 'RUNNING', 586: 'STOPPED', 587: 'MAINT'}, 'Day-Shift': {0: '03-01 Night', 343: '03-01 Night', 344: '03-01 Night', 586: '03-01 Night', 587: '03-01 Night'}, 'Time': {0: Timestamp('2021-03-01 21:00:00'), 343: Timestamp('2021-03-01 22:16:54'), 344: Timestamp('2021-03-01 23:16:54'), 586: Timestamp('2021-03-01 23:48:45'), 587: Timestamp('2021-03-02 02:28:54')}}
df = pd.DataFrame(data)

I really appreciate your support!

  • Do you have a full datetime object you can work with? I see the year is missing from your columns. Also, night/day is redundant if you have properly formatted datetime objects. Commented Mar 24, 2021 at 9:38
  • @Manakin Do you have any idea how to fix this? Commented Mar 24, 2021 at 15:32

1 Answer

You can use idxmax() to find the latest record (the maximum 'Time') per day, then build the new datetime values from that record's date.

df1 = df.loc[df.groupby([df['Time'].dt.normalize()])['Time'].idxmax()]
df1 = pd.concat([df1] * 2)

df1['Time'] = pd.to_datetime((df1['Time'].dt.normalize().astype(str) + [' 23:59:59', ' 00:00:01']))

print(df1)

    Machine  State  Day-Shift                Time
587     W25  MAINT  03-01 Day 2021-03-01 23:59:59
587     W25  MAINT  03-01 Day 2021-03-01 00:00:01

df = pd.concat([df,df1]).sort_index().reset_index(drop=True)


  Machine    State  Day-Shift                Time
0     W25    start  03-01 Day 2021-03-01 07:00:00
1     W25  STOPPED  03-01 Day 2021-03-01 07:16:54
2     W25  RUNNING  03-01 Day 2021-03-01 07:16:54
3     W25  STOPPED  03-01 Day 2021-03-01 07:28:45
4     W25    MAINT  03-01 Day 2021-03-01 07:28:54
5     W25    MAINT  03-01 Day 2021-03-01 23:59:59
6     W25    MAINT  03-01 Day 2021-03-01 00:00:01
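The output above puts both new rows on the same calendar day, while the question asks for rows on either side of midnight within a night shift. A minimal sketch of that variant follows. It is an alternative, not the answerer's code, and it assumes that each night shift starts before midnight and that "nearest" means the last record before midnight. The helper name add_midnight_rows is hypothetical.

```python
import pandas as pd

# Sample rows from the question: one night shift that crosses midnight.
df = pd.DataFrame({
    "Machine": ["W5"] * 5,
    "State": ["start", "STOPPED", "RUNNING", "STOPPED", "MAINT"],
    "Day-Shift": ["03-01 Night"] * 5,
    "Time": pd.to_datetime([
        "2021-03-01 21:00:00", "2021-03-01 22:16:54", "2021-03-01 23:16:54",
        "2021-03-01 23:48:45", "2021-03-02 02:28:54",
    ]),
})


def add_midnight_rows(shift: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: pad one night shift with rows around midnight."""
    # Assumption: the shift starts before midnight, so the midnight it
    # crosses is the start of the next calendar day.
    midnight = shift["Time"].min().normalize() + pd.Timedelta(days=1)
    before = shift[shift["Time"] < midnight]
    if before.empty:
        return shift
    # Last record before midnight, kept as a one-row DataFrame.
    nearest = before.loc[[before["Time"].idxmax()]]
    extra = pd.concat([
        nearest.assign(Time=midnight - pd.Timedelta(seconds=1)),  # 23:59:59
        nearest.assign(Time=midnight + pd.Timedelta(seconds=1)),  # 00:00:01
    ])
    return pd.concat([shift, extra])


parts = []
for name, shift in df.groupby("Day-Shift", sort=False):
    # Only night shifts cross midnight; day shifts pass through unchanged.
    parts.append(add_midnight_rows(shift) if name.endswith("Night") else shift)

result = pd.concat(parts).sort_values("Time", kind="stable").reset_index(drop=True)
print(result)
```

Working with Timestamp and Timedelta arithmetic avoids the string round trip, and the 00:00:01 row lands on the following date, so sorting by 'Time' keeps the rows in chronological order.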

15 Comments

Thanks a lot. But this gives me this error. ValueError: operands could not be broadcast together with shapes (4,) (2,)
I don't see how the above could do that. Did you change any lines of code?
No, I didn't change anything. In the third line, df1['Time'] contains four values but the list contains only two elements. I think that is the reason for this error.
df1['Time'] = pd.to_datetime((df1['Time'].dt.normalize().astype(str) + [' 23:59:59', ' 00:00:01'])) This line gives the above error
@domahc not the most elegant, but try df1 = pd.concat([ y.assign(Time=pd.to_datetime((y['Time'].dt.normalize().astype(str) + [' 23:59:59', ' 00:00:01'])) ) for x,y in df1.groupby(df1['Time'].dt.day) ])
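The error in this thread appears once the data covers more than one day: after pd.concat([df1] * 2), df1 has two rows per day, so with two days the 'Time' column has four values and the two-element list of suffixes cannot be broadcast against it. A minimal sketch of one way around this, assuming a made-up two-day sample, is to build each copy separately so that every operation is a Series plus a scalar. It reproduces the answer's output (both new rows on the same calendar date) and is not the answerer's own code.

```python
import pandas as pd

# Hypothetical two-day sample, shaped like the question's data.
df = pd.DataFrame({
    "Machine": ["W5"] * 4,
    "State": ["start", "RUNNING", "start", "MAINT"],
    "Time": pd.to_datetime([
        "2021-03-01 07:00:00", "2021-03-01 07:28:54",
        "2021-03-02 07:00:00", "2021-03-02 08:15:00",
    ]),
})

# Last record of each calendar day, as in the answer.
last = df.loc[df.groupby(df["Time"].dt.normalize())["Time"].idxmax()]
day = last["Time"].dt.normalize()

# Each copy gets its offset added as a scalar, so the number of days
# no longer has to match the length of a list of suffixes.
extra = pd.concat([
    last.assign(Time=day + pd.Timedelta(hours=23, minutes=59, seconds=59)),
    last.assign(Time=day + pd.Timedelta(seconds=1)),
])

# A stable sort keeps each original row ahead of the copies that share its index.
out = pd.concat([df, extra]).sort_index(kind="stable").reset_index(drop=True)
print(out)
```

Adding a Timedelta to the normalized date also skips the conversion to strings and back through pd.to_datetime.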