I want to build a Python dictionary from a pandas DataFrame that maps column 2 (source) to column 3 (description), grouped by column 1 (title). I also want values only for a given list of titles: titles = ['Test1','Test2']

   title  source description
1  Test1    ABC  description1
2  Test2    ABC  description2
3  Test2    DEF  description3
4  Test3    XYZ  description4

output = {'Test1': {'ABC': 'description1'}, 'Test2': {'ABC': 'description2', 'DEF': 'description3'}}
3 Answers

Filter first using boolean indexing with Series.isin, then use GroupBy.apply with a lambda function to build a Series of dicts, and finally convert it with Series.to_dict:

titles = ['Test1','Test2']

d = (df[df['title'].isin(titles)]
       .groupby('title')[['source','description']]
       .apply(lambda x: dict(x.to_numpy()))
       .to_dict())
print(d)
{'Test1': {'ABC': 'description1'}, 'Test2': {'ABC': 'description2', 'DEF': 'description3'}}
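
For anyone who wants to check this end to end, here is a minimal self-contained sketch. The DataFrame construction is an assumption, rebuilt by hand from the table in the question, and the dict comprehension is a swapped-in variant of the GroupBy.apply step above rather than the exact same call.

```python
import pandas as pd

# Assumed reconstruction of the sample data shown in the question.
df = pd.DataFrame({
    "title": ["Test1", "Test2", "Test2", "Test3"],
    "source": ["ABC", "ABC", "DEF", "XYZ"],
    "description": ["description1", "description2",
                    "description3", "description4"],
})
titles = ["Test1", "Test2"]

# Keep only the requested titles, then build one inner dict per group.
filtered = df[df["title"].isin(titles)]
d = {
    title: dict(zip(group["source"], group["description"]))
    for title, group in filtered.groupby("title")
}
print(d)
```

Note that if a title has the same source more than once, the later description overwrites the earlier one, since dict keys are unique.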

You can group the DataFrame by title and then use Python's zip function to build the inner dictionary from source and description for each requested title. The code is below.

final_dict=dict()
all_groups = df.groupby('title')
for title in titles: 
    title_group = all_groups.get_group(title)
    source_desc=dict(zip(title_group.source, title_group.description))
    final_dict[title] = source_desc
print(final_dict)
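
One caveat with this loop: get_group raises a KeyError when a requested title is not present in the DataFrame. A minimal sketch of a guard is shown here, assuming the sample data from the question; the "Missing" title is hypothetical and only there to exercise the check.

```python
import pandas as pd

# Assumed reconstruction of the sample data shown in the question.
df = pd.DataFrame({
    "title": ["Test1", "Test2", "Test2", "Test3"],
    "source": ["ABC", "ABC", "DEF", "XYZ"],
    "description": ["description1", "description2",
                    "description3", "description4"],
})
titles = ["Test1", "Test2", "Missing"]  # "Missing" is a hypothetical absent title

all_groups = df.groupby("title")
final_dict = {}
for title in titles:
    # GroupBy.groups maps each group key to its row labels,
    # so membership tells us whether get_group would succeed.
    if title not in all_groups.groups:
        continue
    title_group = all_groups.get_group(title)
    final_dict[title] = dict(zip(title_group["source"],
                                 title_group["description"]))
print(final_dict)
```

Skipping silently is just one choice; depending on the use case you may prefer to store an empty dict for absent titles instead.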

Try this: convert the filtered rows to records and accumulate them into a nested dict with setdefault.

result = {}

filter_ = ['Test1','Test2']

for x in df[df['title'].isin(filter_)].to_dict(orient='records'):
    result.setdefault(x['title'], {}).update({x['source']: x['description']})

print(result)
{'Test1': {'ABC': 'description1'}, 'Test2': {'ABC': 'description2', 'DEF': 'description3'}}
