1

I am trying to extract all of the 'uid' values into a simple python list for each row of the 'data' column and have been unsuccessful.

Here is the code so far:

import json

df = pd.DataFrame({
    'bank_account': [101, 102, 201, 301],
    'data': [
        '[{"uid": 100, "account_type": 1, "account_data": {"currency": {"current": 1000, "minimum": -500}, "fees": {"monthly": 13.5}}, "user_name": "Alice"},{"uid": 150, "account_type": 1, "account_data": {"currency": {"current": 1000, "minimum": -500}, "fees": {"monthly": 13.5}}, "user_name": "jer"}]',
        '[{"uid": 100, "account_type": 2, "account_data": {"currency": {"current": 2000, "minimum": 0},  "fees": {"monthly": 0}}, "user_name": "Alice"}]',
        '[{"uid": 200, "account_type": 1, "account_data": {"currency": {"current": 3000, "minimum": 0},  "fees": {"monthly": 13.5}}, "user_name": "Bob"}]',        
        '[{"uid": 300, "account_type": 1, "account_data": {"currency": {"current": 4000, "minimum": 0},  "fees": {"monthly": 13.5}}, "user_name": "Carol"}]'        
    ]},
    index = ['Alice', 'Alice', 'Bob', 'Carol']
)

# df["data"] = df["data"].apply(lambda x: pd.read_json(x, lines=True)["uid"][0])

df["data"] = [[json.loads(d)['uid'] for d in li] for li in df['data']]
print(df)

0

2 Answers 2

2

You could iterate over the list created by json.loads and get "uid" values:

import json
df['data'] = [[item['uid'] for item in json.loads(d)] for d in df['data']]

Output:

       bank_account        data
Alice           101  [100, 150]
Alice           102       [100]
Bob             201       [200]
Carol           301       [300]

You could also explode "data":

df = df.explode('data')

Output:

       bank_account data
Alice           101  100
Alice           101  150
Alice           102  100
Bob             201  200
Carol           301  300
Sign up to request clarification or add additional context in comments.

Comments

1

Modified apply a bit as you're using apply

import pandas as pd

df = pd.DataFrame(
    {
        "bank_account": [101, 102, 201, 301],
        "data": [
            '[{"uid": 100, "account_type": 1, "account_data": {"currency": {"current": 1000, "minimum": -500}, "fees": {"monthly": 13.5}}, "user_name": "Alice"},{"uid": 150, "account_type": 1, "account_data": {"currency": {"current": 1000, "minimum": -500}, "fees": {"monthly": 13.5}}, "user_name": "jer"}]',
            '[{"uid": 100, "account_type": 2, "account_data": {"currency": {"current": 2000, "minimum": 0},  "fees": {"monthly": 0}}, "user_name": "Alice"}]',
            '[{"uid": 200, "account_type": 1, "account_data": {"currency": {"current": 3000, "minimum": 0},  "fees": {"monthly": 13.5}}, "user_name": "Bob"}]',
            '[{"uid": 300, "account_type": 1, "account_data": {"currency": {"current": 4000, "minimum": 0},  "fees": {"monthly": 13.5}}, "user_name": "Carol"}]',
        ],
    },
    index=["Alice", "Alice", "Bob", "Carol"],
)
df["uid"] = df["data"].apply(lambda x: eval(x)[0]["uid"])
print(df)

Then for list you can do print(list(df["uid"]))

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.