I ultimately want to group by items that have multiple shipping requirements vs ones that have just 1.

I have a pandas df column that looks like this:

ID #(column name = ID)
1111
1111,2222
1111,2222
2222,4444,3333
2222,4444

How can I create a dictionary or mapping layer (open to all suggestions) that takes any value matching my criteria and replaces it with the corresponding label?

For example, if the value is 1111, 4444, then change it to Express Shipping, Standard Shipping and keep the result in the same dataframe.

1. shipping_num = (1111, 2222, 3333, 4444)

2. shipping_map = ('Express shipping', 'Standard Shipping', '2-day shipping', '1-day shipping')



*NEW_SHIPPING MAP COLUMN*
Express shipping
Express shipping, Standard Shipping
Express shipping, Standard Shipping
Standard Shipping, 1-day shipping, 2-day shipping
Standard Shipping, 1-day shipping

Thanks for looking!

2 Answers

You could create a mapping dictionary from shipping_num and shipping_map, then use str.split + explode to get individual ID numbers from the ID column. Then use map to get shipping maps; finally use groupby + agg to get back to original shape:

shipping_num = (1111, 2222, 3333, 4444)
shipping_map = ('Express shipping', 'Standard Shipping', '2-day shipping', '1-day shipping')

# Pair each ID number with its label
mapping = dict(zip(shipping_num, shipping_map))

# Split each row into individual IDs, map each one to its label, then rejoin per original row
df['shipping_map'] = (
    df['ID'].str.split(',').explode().astype(int).map(mapping).groupby(level=0).agg(', '.join)
)

Output:

               ID                                       shipping_map
0            1111                                   Express shipping
1       1111,2222                Express shipping, Standard Shipping
2       1111,2222                Express shipping, Standard Shipping
3  2222,4444,3333  Standard Shipping, 1-day shipping, 2-day shipping
4       2222,4444                  Standard Shipping, 1-day shipping

2 Comments

Thanks, this worked! Some of the IDs also contain letters, so I had to add .astype(str) to the line after I mapped everything.
@aero8991 if the IDs contain letters, then astype(int) is probably not needed?
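
For IDs that contain letters, one option is to keep the mapping keys as strings so that no integer conversion is involved. The following is only a sketch under assumptions: the alphanumeric ID "A100", its label, and the "Unknown" placeholder are invented for illustration and do not come from the original post.

```python
import pandas as pd

# Hypothetical sample data; "A100" and "9999" are invented IDs for illustration.
df = pd.DataFrame({"ID": ["1111", "1111,2222", "2222,A100", "9999"]})

# Keys are strings, so astype(int) is not needed.
mapping = {
    "1111": "Express shipping",
    "2222": "Standard Shipping",
    "A100": "2-day shipping",  # assumed label, not from the original post
}

labels = (
    df["ID"]
    .str.split(",")
    .explode()
    .str.strip()        # tolerate spaces after commas
    .map(mapping)
    .fillna("Unknown")  # assumed placeholder for IDs missing from the mapping
)

# Rejoin the labels that belong to the same original row.
df["shipping_map"] = labels.groupby(level=0).agg(", ".join)
```

Filling unmapped IDs with a placeholder keeps the final join from failing on missing values; dropping those IDs instead would be an equally reasonable choice.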

I'm not 100% clear on what you want the code to do. One thing that might help is to set up a dictionary like MyDict = {1111: "Express Shipping", 4444: "Standard Shipping"}. From there, you can use df["NewColumn"] = df["ID"].apply(lambda x: MyDict[x]) to translate the numbers into labels.
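
A plain dictionary lookup fails on a row such as "1111,2222", because the whole string is not a key. As a rough sketch, the lookup can be wrapped in a small helper that splits each row first. The helper name translate and the sample frame are assumptions for illustration; the labels follow the shipping_num / shipping_map pairing from the question.

```python
import pandas as pd

# Labels follow the shipping_num / shipping_map pairing given in the question.
my_dict = {
    1111: "Express shipping",
    2222: "Standard Shipping",
    3333: "2-day shipping",
    4444: "1-day shipping",
}

def translate(ids: str) -> str:
    """Map a comma-separated string of numeric IDs to a comma-separated string of labels."""
    # int() ignores surrounding whitespace; an ID missing from my_dict raises KeyError.
    return ", ".join(my_dict[int(part)] for part in ids.split(","))

df = pd.DataFrame({"ID": ["1111", "1111,2222", "2222,4444,3333"]})
df["NewColumn"] = df["ID"].apply(translate)
```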

But it sort of seems like you are interested in using multiple keys?

Another thing that might help is to temporarily convert your "ID" column to strings (e.g. df['StrID'] = df['ID'].astype(str)) and then split the column on commas (e.g. df['StrID'].str.split(',')), which sets you up to count the number of IDs in each row. You can then run a group-by over that count.
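
The split-and-count idea could look roughly like this, using the sample column from the question. The column names n_ids and multiple_shipping are made up for the example.

```python
import pandas as pd

df = pd.DataFrame(
    {"ID": ["1111", "1111,2222", "1111,2222", "2222,4444,3333", "2222,4444"]}
)

# Number of IDs in each row: split on commas, then take the length of each list.
df["n_ids"] = df["ID"].astype(str).str.split(",").str.len()

# Hypothetical flag column: True when a row has more than one shipping requirement.
df["multiple_shipping"] = df["n_ids"] > 1

# Count how many rows fall into each group.
counts = df.groupby("multiple_shipping").size()
```

Grouping on the boolean flag separates single-requirement items from multi-requirement ones; grouping on n_ids instead would give a finer breakdown by the exact number of requirements.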
