
In Python, how do I expand this dataframe...

sect_id  sector  func_list   func_id
0        AAA     ['A', 'B']  [1,2]
1        BBB     ['C', 'D']  [3,4]

To this format?

sect_id  sector  func_list  func_id
0        AAA     A          1
0        AAA     B          2
1        BBB     C          3
1        BBB     D          4

Thanks for your advice.

3 Answers


Try:

from ast import literal_eval

# apply ast.literal_eval only if the columns hold strings such as "['A', 'B']" rather than real lists:
df["func_list"] = df["func_list"].apply(literal_eval)
df["func_id"] = df["func_id"].apply(literal_eval)

print(df.explode(["func_list", "func_id"]))

Prints:

   sect_id sector func_list func_id
0        0    AAA         A       1
0        0    AAA         B       2
1        1    BBB         C       3
1        1    BBB         D       4
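
To illustrate when the literal_eval step matters, here is a minimal sketch. It assumes the list columns were loaded as plain strings (as typically happens after read_csv or read_clipboard), in which case explode would treat them as scalars and not split them. The frame below is made up to mirror the question, and ignore_index=True is an optional extra that renumbers the rows instead of repeating the original index.

```python
from ast import literal_eval

import pandas as pd

# Hypothetical frame where the list columns are strings, as after read_csv
df = pd.DataFrame({
    "sect_id": [0, 1],
    "sector": ["AAA", "BBB"],
    "func_list": ["['A', 'B']", "['C', 'D']"],
    "func_id": ["[1,2]", "[3,4]"],
})

# Parse only the values that are still strings; real lists pass through untouched
for col in ["func_list", "func_id"]:
    df[col] = df[col].apply(lambda v: literal_eval(v) if isinstance(v, str) else v)

# Multi-column explode needs pandas 1.3.0+; ignore_index renumbers the rows
result = df.explode(["func_list", "func_id"], ignore_index=True)
print(result)
```

The isinstance check makes the parsing step safe to run whether or not the columns already contain lists.
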

1 Comment

This solution is perfect and easy to understand. Thanks Andrej! :)
1
import ast

import pandas as pd

# Read the table from the clipboard (copy the question's data first)
table = pd.read_clipboard()

def unlist_stringlist(stringlst):
    # Turn a string such as "['A', 'B']" into a real Python list
    return ast.literal_eval(stringlst)

df_tmp = []
for i in range(len(table)):
    # Use positional access so this also works with a non-default index
    sctid = table['sect_id'].iloc[i]
    sct = table['sector'].iloc[i]
    funclist_tmp = unlist_stringlist(table['func_list'].iloc[i])
    funcid_tmp = unlist_stringlist(table['func_id'].iloc[i])
    # One output row per (func_list, func_id) pair
    for j in range(len(funclist_tmp)):
        df_tmp.append([sctid, sct, funclist_tmp[j], funcid_tmp[j]])

df_result = pd.DataFrame(df_tmp, columns=table.columns)
df_result

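This walks the table row by row, parses the stringified lists with ast.literal_eval, and emits one output row for each pair of func_list and func_id values.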
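
The same row-by-row idea can probably be written more compactly. The following is only a sketch: it assumes the list columns already hold real Python lists (parse them first if they are strings), and the sample frame is made up to mirror the question.

```python
import pandas as pd

# Hypothetical input mirroring the question, with the lists already parsed
table = pd.DataFrame({
    "sect_id": [0, 1],
    "sector": ["AAA", "BBB"],
    "func_list": [["A", "B"], ["C", "D"]],
    "func_id": [[1, 2], [3, 4]],
})

rows = []
for row in table.itertuples(index=False):
    # zip pairs the i-th function name with the i-th function id
    for func, fid in zip(row.func_list, row.func_id):
        rows.append((row.sect_id, row.sector, func, fid))

df_result = pd.DataFrame(rows, columns=table.columns)
print(df_result)
```

Note that zip stops at the shorter list, so rows whose two lists differ in length would be silently truncated.
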

2 Comments

Thanks for helping Suhan. Andrej's solution above works just as well.
As it’s currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center.

With pandas 1.3.0 or later, explode accepts a list of columns, so both list columns can be exploded together:

df.explode(['func_list', 'func_id'])

Output:

   sect_id sector func_list func_id
0        0    AAA         A       1
0        0    AAA         B       2
1        1    BBB         C       3
1        1    BBB         D       4

Given this df:

import pandas as pd

df = pd.DataFrame({'sect_id': [0, 1],
                   'sector': ['AAA', 'BBB'],
                   'func_list': [['A', 'B'], ['C', 'D']],
                   'func_id': [[1, 2], [3, 4]]})
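
If you are stuck on a pandas version older than 1.3.0, where explode takes only a single column, one commonly used workaround is to explode each list column as a Series and let the shared index line the results up. This is a sketch rather than a drop-in replacement: it assumes the two lists in every row have the same length, and it relies on Series.explode, which needs pandas 0.25 or later.

```python
import pandas as pd

df = pd.DataFrame({
    "sect_id": [0, 1],
    "sector": ["AAA", "BBB"],
    "func_list": [["A", "B"], ["C", "D"]],
    "func_id": [[1, 2], [3, 4]],
})

# Move the scalar columns into the index, explode each remaining column
# on its own, then bring the scalar columns back as regular columns.
out = (
    df.set_index(["sect_id", "sector"])
      .apply(pd.Series.explode)
      .reset_index()
)
print(out)
```

The exploded columns come back with object dtype, so you may want to cast func_id to an integer type afterwards.
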
