In Python, how do I expand this dataframe...
| sect_id | sector | func_list | func_id |
|---|---|---|---|
| 0 | AAA | ['A', 'B'] | [1,2] |
| 1 | BBB | ['C', 'D'] | [3,4] |
To this format?
| sect_id | sector | func_list | func_id |
|---|---|---|---|
| 0 | AAA | A | 1 |
| 0 | AAA | B | 2 |
| 1 | BBB | C | 3 |
| 1 | BBB | D | 4 |
Thanks for your advice.
Try:
from ast import literal_eval
# Only needed if the list columns hold strings such as "['A', 'B']" rather than real lists:
df["func_list"] = df["func_list"].apply(literal_eval)
df["func_id"] = df["func_id"].apply(literal_eval)
print(df.explode(["func_list", "func_id"]))
Prints:
   sect_id sector func_list func_id
0        0    AAA         A       1
0        0    AAA         B       2
1        1    BBB         C       3
1        1    BBB         D       4
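For reference, here is a self-contained sketch of the same approach. The sample frame is an assumption (list columns stored as strings, as can happen after a CSV round trip), and the `isinstance` guard and `reset_index` call are optional additions, not part of the answer above. Note that passing a list of columns to `explode` needs pandas 1.3.0+.

```python
from ast import literal_eval

import pandas as pd

# Assumed input: the list columns arrive as strings, not as real lists.
df = pd.DataFrame({
    "sect_id": [0, 1],
    "sector": ["AAA", "BBB"],
    "func_list": ["['A', 'B']", "['C', 'D']"],
    "func_id": ["[1, 2]", "[3, 4]"],
})

list_cols = ["func_list", "func_id"]
for col in list_cols:
    # Parse only string cells; cells that are already lists are left untouched.
    df[col] = df[col].apply(lambda v: literal_eval(v) if isinstance(v, str) else v)

# Explode both columns in lockstep, then replace the repeated index with 0..n-1.
result = df.explode(list_cols).reset_index(drop=True)
print(result)
```

Both list columns must have the same number of elements in each row, otherwise `explode` raises a `ValueError`.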
import ast

import pandas as pd

table = pd.read_clipboard()

def unlist_stringlist(stringlst):
    # Parse a string such as "['A', 'B']" into a real Python list
    return ast.literal_eval(stringlst)

df_tmp = []
for i in range(len(table)):
    sctid = table.sect_id.values[i]
    sct = table.sector.values[i]
    funclist_tmp = unlist_stringlist(table['func_list'][i])
    funcid_tmp = unlist_stringlist(table['func_id'][i])
    # Emit one output row per (function, id) pair
    for j in range(len(funclist_tmp)):
        df_tmp.append([sctid, sct, funclist_tmp[j], funcid_tmp[j]])

df_result = pd.DataFrame(df_tmp)
df_result.columns = table.columns
df_result
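This works on any pandas version, as long as the list columns are stored as strings that `ast.literal_eval` can parse.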
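A more compact variant of the same row-by-row idea, as a rough sketch: it assumes the two list columns already hold real lists (not strings) and that the columns appear in the order shown in the question. The sample frame is made up for illustration.

```python
import pandas as pd

# Assumed input: list columns already contain Python lists.
table = pd.DataFrame({
    "sect_id": [0, 1],
    "sector": ["AAA", "BBB"],
    "func_list": [["A", "B"], ["C", "D"]],
    "func_id": [[1, 2], [3, 4]],
})

# Pair each function name with its id and repeat the scalar columns per pair.
rows = [
    (sect_id, sector, func, func_id)
    for sect_id, sector, funcs, ids in table.itertuples(index=False)
    for func, func_id in zip(funcs, ids)
]
df_result = pd.DataFrame(rows, columns=table.columns)
print(df_result)
```

Be aware that `zip` silently stops at the shorter list, so rows whose lists differ in length lose their extra elements instead of raising an error.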
With pandas 1.3.0+, `explode` accepts a list of columns and expands them together:
df.explode(['func_list', 'func_id'])
Output:
   sect_id sector func_list func_id
0        0    AAA         A       1
0        0    AAA         B       2
1        1    BBB         C       3
1        1    BBB         D       4
Given df:
import pandas as pd
df = pd.DataFrame({'sect_id': [0, 1],
                   'sector': ['AAA', 'BBB'],
                   'func_list': [['A', 'B'], ['C', 'D']],
                   'func_id': [[1, 2], [3, 4]]})
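On pandas versions older than 1.3.0, where `explode` takes only a single column, one commonly used workaround is to move the scalar columns into the index and explode each remaining column separately. This is a sketch rather than part of the answer above. The name `id_cols` is introduced here for illustration, and the approach assumes both lists in a row have the same length.

```python
import pandas as pd

df = pd.DataFrame({
    "sect_id": [0, 1],
    "sector": ["AAA", "BBB"],
    "func_list": [["A", "B"], ["C", "D"]],
    "func_id": [[1, 2], [3, 4]],
})

# Columns that should be repeated for every exploded element.
id_cols = ["sect_id", "sector"]

# Series.explode runs once per list column; the shared index keeps rows aligned.
result = df.set_index(id_cols).apply(pd.Series.explode).reset_index()
print(result)
```

The exploded columns come back with `object` dtype, so a follow-up such as `result["func_id"].astype(int)` may be needed for numeric work.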