Create a JSON string from column values

Question

How do I mutate a column of dictionaries in a Pandas DataFrame?

Given the following DataFrame:

import json
import pandas as pd

data = [['tom', 10], ['nick', 15], ['juli', 14]]
df = pd.DataFrame(data, columns=['Name', 'Age'])

# add dict series: start from the string "{}" and parse it into an empty dict per row
df = df.assign(my_dict="{}")
df.my_dict = df.my_dict.apply(json.loads)

Name	Age	my_dict
tom	10	{}
nick	15	{}
juli	14	{}

How would I operate on column my_dict and mutate it as follows:

Age > 10

Name	Age	my_dict
tom	10	{"age>10": false}
nick	15	{"age>10": true}
juli	14	{"age>10": true}

And then mutate again:

Name = "tom":

Name	Age	my_dict
tom	10	{"age>10": false, "name=tom": true}
nick	15	{"age>10": true, "name=tom": false}
juli	14	{"age>10": true, "name=tom": false}

I'm interested in the process of mutating the dictionary; the rules themselves are arbitrary examples.

There are many issues with your example, including: 1- "{}" is a string, not a dictionary, 2- you do not explain where the new data should come from. Ideally you should create this column only once, as mutating it will be expensive.
– mozway, Commented Feb 8, 2022 at 11:31
If you read the question, in the example I apply json.loads to transform the string. The new data comes from the conditions, which are clearly described in the question.
– shbfy, Commented Feb 8, 2022 at 11:36
FYI, pandas isn't of much help here compared to a plain list of dicts.
– user18122470, Commented Feb 8, 2022 at 11:58
@Either what would you recommend? Open to better suggestions!
– shbfy, Commented Feb 8, 2022 at 12:25

Corralien · Accepted Answer · 2022-02-08 11:46:11Z

1

You can use apply with the dict merge operator | (Python 3.9+), which builds a new dict for each row:

df['my_dict'] = df.apply(lambda x: x['my_dict'] | {'Age': x['Age'] > 10}, axis=1)
print(df)

# Output
   Name  Age         my_dict
0   tom   10  {'Age': False}
1  nick   15   {'Age': True}
2  juli   14   {'Age': True}

Add a new condition:

df['my_dict'] = df.apply(lambda x: x['my_dict'] | {'Name': x['Name'] == 'tom'}, axis=1)
print(df)

# Output
   Name  Age                       my_dict
0   tom   10  {'Age': False, 'Name': True}
1  nick   15  {'Age': True, 'Name': False}
2  juli   14  {'Age': True, 'Name': False}

If you want to convert the dicts to JSON strings, use:

>>> df['my_dict'].apply(json.dumps)
0    {"Age": false, "Name": true}
1    {"Age": true, "Name": false}
2    {"Age": true, "Name": false}
Name: my_dict, dtype: object


1 Comment

mozway Over a year ago

@shbfy This is however not "dynamic" like you originally requested, there is no mutation of an object, just a simple one-time creation
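For reference, here is a minimal sketch of mutating the existing dict objects in place, as opposed to building a new dict per row. It is not taken from either answer; add_flag is a hypothetical helper name, and the sketch assumes every row already holds its own distinct dict.

```python
import json
import pandas as pd

df = pd.DataFrame([["tom", 10], ["nick", 15], ["juli", 14]], columns=["Name", "Age"])

# One distinct dict per row; [{}] * len(df) would make every row share one dict.
df["my_dict"] = [{} for _ in range(len(df))]


def add_flag(frame, key, mask):
    """Hypothetical helper: write each mask value into that row's existing dict."""
    for d, flag in zip(frame["my_dict"], mask):
        d[key] = bool(flag)  # cast to a plain bool so json.dumps accepts it


# Each call updates the same dict objects already stored in the column.
add_flag(df, "age>10", df["Age"] > 10)
add_flag(df, "name=tom", df["Name"] == "tom")

# Serialize each dict to a JSON string.
json_col = df["my_dict"].apply(json.dumps)
print(json_col)
```

Because the dicts are updated in place, the column never has to be reassigned. The loop still runs in Python, so this is unlikely to be faster than the other approaches on large frames.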

butterflyknife · Accepted Answer · 2022-02-08 11:53:29Z

apply is generally considered slow. Here are two alternatives, both using list comprehensions, which, according to this highly voted answer, are slightly faster than apply.

import pandas as pd
data = [['tom', 10], ['nick', 15], ['juli', 14]]
df = pd.DataFrame(data, columns = ['Name', 'Age'])

# Define your weird func: takes a row of df and returns your dict
def weird_func2(row):
    return {"name=tom":row["Name"]=="tom", "age>10":row["Age"]>10}

# add dict series
df["mydict"] = [weird_func2(row) for _, row in df.iterrows()]
df

Or you can try:

import pandas as pd
data = [['tom', 10], ['nick', 15], ['juli', 14]]
df = pd.DataFrame(data, columns = ['Name', 'Age'])

# Define your weird func: takes a row of df and returns your dict
def weird_func(name, age):
    return {"name=tom":name=="tom", "age>10":age>10}

# add dict series
df["mydict"] = [weird_func(name, age) for name, age in zip(df["Name"], df["Age"])]
df
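A third possibility, sketched here under the assumption that all rules can be written as column-wise comparisons: compute each condition as a boolean column first, then turn the rows into dicts in one step with to_dict(orient="records"). The intermediate name flags is only illustrative, and no timing comparison against the versions above has been made.

```python
import pandas as pd

data = [["tom", 10], ["nick", 15], ["juli", 14]]
df = pd.DataFrame(data, columns=["Name", "Age"])

# Evaluate every rule as a vectorized comparison; column names become dict keys.
flags = pd.DataFrame({
    "age>10": df["Age"] > 10,
    "name=tom": df["Name"] == "tom",
})

# One dict per row, in the same order as df.
df["my_dict"] = flags.to_dict(orient="records")
print(df)
```

Adding another rule then only means adding another column to flags before the conversion.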
