How to map key to multiple values to dataframe column w/ multiple values?

Question

I ultimately want to group by items that have multiple shipping requirements vs ones that have just 1.

I have a pandas df column that looks like this:

ID #(column name = ID)
1111
1111,2222
1111,2222
2222,4444,3333
2222,4444

How can I create a dictionary object or mapping layer (open to all suggestions) so that any value matching a key is replaced by that key's label?

For example, if the value is 1111, 4444, change it to Express Shipping, Standard Shipping and keep the result in the same dataframe.

1. shipping_num = (1111, 2222, 3333, 4444)

2. shipping_map = ('Express shipping', 'Standard Shipping', '2-day shipping', '1-day shipping')

*NEW_SHIPPING MAP COLUMN*
Express shipping
Express shipping, Standard Shipping
Express shipping, Standard Shipping
Standard Shipping, 1-day shipping, 2-day shipping
Standard Shipping, 1-day shipping

Thanks for looking!

user7864386 · Accepted Answer · 2022-04-01 20:37:04Z

2

You could create a mapping dictionary from shipping_num and shipping_map, then use str.split + explode to get individual ID numbers from the ID column. Then use map to get shipping maps; finally use groupby + agg to get back to original shape:

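import pandas as pd

# Sample data from the question; the IDs are stored as comma-separated strings
df = pd.DataFrame({'ID': ['1111', '1111,2222', '1111,2222', '2222,4444,3333', '2222,4444']})

shipping_num = (1111, 2222, 3333, 4444)
shipping_map = ('Express shipping', 'Standard Shipping', '2-day shipping', '1-day shipping')

mapping = dict(zip(shipping_num, shipping_map))
# One ID per row -> label per ID -> labels joined back per original row
df['shipping_map'] = df['ID'].str.split(',').explode().astype(int).map(mapping).groupby(level=0).agg(', '.join)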

Output:

               ID                                       shipping_map
0            1111                                   Express shipping
1       1111,2222                Express shipping, Standard Shipping
2       1111,2222                Express shipping, Standard Shipping
3  2222,4444,3333  Standard Shipping, 1-day shipping, 2-day shipping
4       2222,4444                  Standard Shipping, 1-day shipping

answered Apr 1, 2022 at 20:37

2 Comments

aero8991 Over a year ago

Thanks, this worked! Some of the IDs also contain letters, so I had to add .astype(str) to the line after I mapped everything.

user7864386 Over a year ago

@aero8991 if IDs contain letters, then astype(int) is probably not needed then?
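Following up on these comments: when some IDs contain letters, one option is to keep the dictionary keys as strings and drop astype(int) altogether. The following is a minimal sketch, not the answerer's code. The alphanumeric ID "A100", its "Freight shipping" label, the ID "9999", and the "Unknown" fallback are all made up for illustration and do not come from the question.

```python
import pandas as pd

# Hypothetical data: "A100" and "9999" are invented IDs for this sketch
df = pd.DataFrame({"ID": ["1111", "1111,2222", "A100, 4444", "9999"]})

# String keys, so alphanumeric IDs work and no astype(int) is needed
mapping = {
    "1111": "Express shipping",
    "2222": "Standard Shipping",
    "3333": "2-day shipping",
    "4444": "1-day shipping",
    "A100": "Freight shipping",  # hypothetical label
}

# One ID per row, with stray spaces around the commas removed
ids = df["ID"].str.split(",").explode().str.strip()

# Fall back to "Unknown" so that an unmapped ID does not become NaN,
# which would make the string join below fail
labels = ids.map(lambda key: mapping.get(key, "Unknown"))

# Join the labels back together per original row
df["shipping_map"] = labels.groupby(level=0).agg(", ".join)
print(df)
```

The fallback is a design choice: a plain .map(mapping) would leave NaN for unmapped IDs, and ', '.join cannot join a float with strings.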

Evan W. · Accepted Answer · 2022-04-01 20:55:42Z

1

I'm not 100% clear on what you want the code to do. One thing that might help is to set up a dictionary like MyDict = {1111: "Express Shipping", 4444: "Standard Shipping"}. From there, you can do df["NewColumn"] = df["ID"].apply(lambda x: MyDict[x]) to translate the numbers into labels. Note that this lookup assumes each cell holds a single ID that is a key of MyDict; any other value raises a KeyError.
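As a small illustration of the single-ID case, here is a sketch that swaps apply with a lambda for Series.map, which accepts a dictionary directly and yields NaN for values that are not keys instead of raising. The three-row dataframe is an assumption made up for the example.

```python
import pandas as pd

# Assumed sample: one integer ID per cell; 2222 is deliberately not in MyDict
df = pd.DataFrame({"ID": [1111, 4444, 2222]})
MyDict = {1111: "Express Shipping", 4444: "Standard Shipping"}

# Series.map looks each value up in the dict; missing keys become NaN
df["NewColumn"] = df["ID"].map(MyDict)
print(df)
```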

But it sort of seems like you are interested in using multiple keys?

Another thing that might help is to temporarily convert your "ID" column to strings (e.g. df['StrID'] = df['ID'].astype(str)) and then split the column on a comma (e.g. df['StrID'].str.split(',')), which would set you up to count the number of IDs in each row. You can then run a group-by over that count.
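The split-and-count idea could be sketched roughly as follows, using the sample IDs from the question. The column names n_ids and multiple are hypothetical names chosen for this example.

```python
import pandas as pd

df = pd.DataFrame({"ID": ["1111", "1111,2222", "1111,2222", "2222,4444,3333", "2222,4444"]})

# Number of comma-separated IDs in each row
df["n_ids"] = df["ID"].astype(str).str.split(",").str.len()

# True when a row carries more than one shipping requirement
df["multiple"] = df["n_ids"] > 1

# Group single-requirement rows against multi-requirement rows
counts = df.groupby("multiple").size()
print(counts)
```

Any other aggregation can replace size() once the multiple flag exists.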

edited Apr 1, 2022 at 20:55

answered Apr 1, 2022 at 20:39

