
I have a pandas DataFrame, df:

Column A    Column B    Column C
Apple       red         Texas
Apple       red         California
Banana      yellow      Indiana
Banana      yellow      Florida

I would like to get it in a dictionary in the form:

{ "Apple red" : ['Texas', 'California'], "Banana yellow" : ['Indiana', 'Florida'] }

where the key is the concatenation of the strings in Column A and Column B, and the value is a list of all corresponding strings from Column C (grouped by that key).

I am not sure how to achieve this.

Important: it should also work when the DataFrame has more than 3 columns and several of them are grouped to form the dictionary's key.

2 Answers


Try:

x = dict(
    df.groupby(df["Column A"] + " " + df["Column B"])["Column C"].agg(list)
)

print(x)

Prints:

{'Apple red': ['Texas', 'California'], 'Banana yellow': ['Indiana', 'Florida']}
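If the key has to be built from an arbitrary number of columns, the same groupby idea can be sketched as follows. The names key_cols and value_col are hypothetical and only used for illustration, and the astype(str) call is an assumption added so that non-string columns can be joined as well:

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Column A": ["Apple", "Apple", "Banana", "Banana"],
    "Column B": ["red", "red", "yellow", "yellow"],
    "Column C": ["Texas", "California", "Indiana", "Florida"],
})

# Hypothetical names: list every column that should be part of the key
key_cols = ["Column A", "Column B"]
value_col = "Column C"

# Join the key columns row by row into one space-separated string
key = df[key_cols].astype(str).apply(" ".join, axis=1)

# Group the value column by the combined key and collect each group into a list
result = df.groupby(key)[value_col].agg(list).to_dict()

print(result)
```

Adding another column name to key_cols extends the key without changing the rest of the code.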

2 Comments

Sorry, I just made a last-minute edit to the post. It should also work if there are multiple columns in the DataFrame to be grouped for the dictionary's key.
@Krishna If they are strings then just concatenate the columns together: df.groupby(df["Column A"] + " " + df["Column B"] + " " + df['Column X'] + ...etc)

One option, which should also perform well, is a defaultdict:

from collections import defaultdict

out = defaultdict(list)

for a, b, c in zip(df['Column A'], df['Column B'], df['Column C']):
    key = a + " " + b
    out[key].append(c)

out
defaultdict(list,
            {'Apple red': ['Texas', 'California'],
             'Banana yellow': ['Indiana', 'Florida']})
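The loop above can be generalized to any number of key columns. This is a minimal sketch, assuming the hypothetical names key_cols and value_col, and assuming that converting each key part with str is acceptable for non-string columns:

```python
from collections import defaultdict

import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Column A": ["Apple", "Apple", "Banana", "Banana"],
    "Column B": ["red", "red", "yellow", "yellow"],
    "Column C": ["Texas", "California", "Indiana", "Florida"],
})

# Hypothetical names: columns that form the key, and the column to collect
key_cols = ["Column A", "Column B"]
value_col = "Column C"

out = defaultdict(list)

# Iterate over the key columns and the value column in parallel;
# everything except the last item of each tuple is part of the key
for *key_parts, value in zip(*(df[col] for col in key_cols), df[value_col]):
    out[" ".join(map(str, key_parts))].append(value)

# Convert to a plain dict if the defaultdict behavior is not wanted afterwards
result = dict(out)

print(result)
```

Unlike groupby, this keeps the keys in order of first appearance rather than sorting them.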
