Python - NumPy.Where with a Dictionary

Question

I might be doing this wrong, or there might be a much better way than this, as I am still new to Python. Apologies upfront for any obvious mistakes.

I have a Pandas DataFrame with a STR column that holds a date and time. It is STR because the times are "Broadcast" formatted, which means the hours of the day run up to 29, so we will see dates like 01/Jan/2018 29:59:59. Add 1 second to that and it's 02/Jan/2018 06:00:00.

My goal here is to convert this data to a real time, which means any hour between 24 and 29 requires a date shift too. I have already split the STR into two new columns, ['Dt'] and ['Ti'], then pulled the hour out of ['Ti'] into a new column, ['Hr'], and made it an INT.

I then applied a pd.to_datetime to the ['Dt'] and added a rule.

df['Dt'] = np.where(df['Hr'] > 23, df['Dt']+pd.DateOffset(1),df['Dt']+pd.DateOffset(0) )

This works perfectly.

I now need to change the hour to be real time, e.g. 24 = 00, 25 = 01, etc.

I thought the best way was to use a DICT and map it, so I made a DICT:

HourMap = {'24':'00','25':'01','26':'02','27':'03','28':'04','29':'05','30':'06'}

Then I wrote this:

df['Hr1'] = np.where(df['Hr'] > 23, df.replace({'Hr':HourMap}),df['Hr'])

But I get a "ValueError"

ValueError: operands could not be broadcast together with shapes (273,) (273,29) (273,)

I have looked at those rows in the dataframe and they are just normal INTs. On testing, I can apply maths to them (e.g. df['Test'] = df['Hr'] + 1).

I did convert them to STR and try the same rules, but got the same error.

Am I just crazy?

Thanks,

Don't use a dictionary, use the modulo operator, i.e. %. So it's just the hour % 24.
– Dan, Commented Oct 29, 2018 at 11:37

jezrael · Accepted Answer · 2018-10-29 11:33:42Z

4

I believe you need to change:

df.replace({'Hr':HourMap})

to map, and if some values are not matched and come back as NaN, replace them with the original values using fillna:

df['Hr'].map(HourMap).fillna(df['Hr'])
#alternative solution if performance is not important in large df
#df['Hr'].replace(HourMap)

because df.replace returns all columns of the DataFrame, with only the Hr column replaced, rather than a single column.
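
A minimal sketch of why the original call fails, using a made-up three-row frame (the sample values, the extra `Other` column and the integer-keyed `hour_map` are assumptions for illustration). `df.replace(...)` hands back a whole DataFrame, which `np.where` cannot broadcast against the one-dimensional condition, whereas `Series.map` returns a single column:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the frame in the question has 273 rows and 29 columns.
df = pd.DataFrame({"Hr": [6, 24, 29], "Other": ["a", "b", "c"]})
hour_map = {24: 0, 25: 1, 26: 2, 27: 3, 28: 4, 29: 5}  # integer keys, to match the int column

replaced = df.replace({"Hr": hour_map})  # a DataFrame: every column comes back
mapped = df["Hr"].map(hour_map)          # a Series: only the mapped column

print(replaced.shape)  # (3, 2)
print(mapped.shape)    # (3,)

# Passing the 2-D frame to np.where reproduces the broadcasting error.
try:
    np.where(df["Hr"] > 23, replaced, df["Hr"])
except ValueError as exc:
    print("ValueError:", exc)

# Hours without a dictionary entry become NaN, so fall back to the original value.
df["Hr1"] = mapped.fillna(df["Hr"]).astype(int)
print(df["Hr1"].tolist())  # [6, 0, 5]
```

The `astype(int)` at the end is optional; it is only there because the NaN placeholders turn the mapped column into floats.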

answered Oct 29, 2018 at 11:33

4 Comments

Runawaygeek Over a year ago

Thanks for pointing me at .map. In this case, I get NaN for every matched INT within the dict. As there will only ever be 24 - 29, due to the restrictions at source, all potential inputs are mapped in the dict. When I use df['Hr1'] = np.where(df['Hr'] > 23, df['Hr'].map(HourMap), df['Hr']), deliberately leaving the NaNs in place so I can check, any of the items within the dict come back as NaN rather than the mapped value, e.g. 26 = 02?

jezrael Over a year ago

@Runawaygeek - I see the problem: the dictionary uses strings. You need to change HourMap = {'24':'00','25':'01','26':'02','27':'03','28':'04','29':'05','30':'06'} to HourMap = {24:0,25:1,26:2,27:3,28:4,29:5,30:6}
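
To illustrate the point about key types, a small sketch with made-up values: `Series.map` looks each value up in the dictionary, and the integer 26 never matches the string '26', so every lookup misses.

```python
import pandas as pd

hours = pd.Series([24, 26, 29])  # integer hours, like the question's 'Hr' column

str_keys = {"24": "00", "26": "02", "29": "05"}  # string keys never match integer values
int_keys = {24: 0, 26: 2, 29: 5}                 # integer keys match

print(hours.map(str_keys).isna().all())  # True: every lookup misses
print(hours.map(int_keys).tolist())      # [0, 2, 5]
```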

Dan Over a year ago

But why use a dictionary at all? There is a constant difference here, you can literally just subtract 24 to get the value.

Runawaygeek Over a year ago

Thanks, I also quickly read more about .map and see I don't need to parse it inside the np.where function either. Made the adjustments and now it works.

Dan · Accepted Answer · 2018-10-29 12:21:29Z

2

You really shouldn't be using a dictionary here, and you don't even need the np.where. Use the modulo operator:

In [1]: import numpy as np
In [2]: np.arange(31)%24
Out[2]:
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23,  0,  1,  2,  3,  4,  5, 6], dtype=int32)

You have numbers that 'wrap around' at 24, which is the textbook use case for modulo. So the full code just becomes:

df['Hr1'] = df['Hr'] % 24

Also, by the same token, you can add to your dates without np.where by making use of integer division:

df['Dt'] = df['Dt'] + pd.to_timedelta(df['Hr'] // 24, unit='D')
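
Putting the two ideas together, here is a sketch of the whole conversion on made-up rows. It assumes `Dt` is already a datetime column and `Hr` an integer column, as in the question, and it uses `pd.to_timedelta` for the day shift because that accepts a whole column of day counts:

```python
import pandas as pd

# Hypothetical sample rows in the question's layout.
df = pd.DataFrame({
    "Dt": pd.to_datetime(["2018-01-01", "2018-01-01", "2018-01-01"]),
    "Hr": [6, 24, 29],
})

# Integer division counts the days to roll forward: 0 for hours 0-23, 1 for hours 24-29.
df["Dt"] = df["Dt"] + pd.to_timedelta(df["Hr"] // 24, unit="D")

# Modulo wraps the broadcast hour back into the 0-23 range: 24 -> 0, 29 -> 5.
df["Hr1"] = df["Hr"] % 24

print(df["Hr1"].tolist())                         # [6, 0, 5]
print(df["Dt"].dt.strftime("%Y-%m-%d").tolist())  # ['2018-01-01', '2018-01-02', '2018-01-02']
```

Neither line needs a condition: hours below 24 shift by zero days and are left unchanged by the modulo.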

edited Oct 29, 2018 at 12:21

answered Oct 29, 2018 at 12:16

1 Comment

Runawaygeek Over a year ago

Thanks for the clean up knowledge. Not come across that in readings, but have made a note and will read up on it tonight. :-)
