Dataframe add multiple columns from list with each column name created

Question

There is a list of dicts d, in which x is an embedded list, e.g.,

d = [{"name":"Python", "x":[0,1,2,3,4,5]},  # x has 300 elements
     {"name":"C++", "x":[0,1,0,3,4,4]},
     {"name":"Java","x":[0,4,5,6,1]}]

I want to turn d into a DataFrame and automatically add one column per element of x, where each added column name has the prefix "abc", e.g.,

df.columns = ["name", "abc0", "abc1", ..., "abc299"]

I'm looking for an efficient way to do this, as d contains many dicts. When I added the columns manually, pandas warned:

PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead.  To get a de-fragmented frame, use `newframe = frame.copy()`

Timus · Accepted Answer · 2021-12-09 13:27:42Z

2

Are you looking for something like this?

import pandas as pd

d = [{"name":"Python", "x":[0,1,2,3,4,5]},  # x has 300 elements
     {"name":"C++", "x":[0,1,0,3,4,4]},
     {"name":"Java","x":[0,4,5,6,1]}]

# Build one flat dict per record: the name plus one "abc<i>" key per element of x.
df = pd.DataFrame(
    {
        "name": record["name"],
        **{f"abc{i}": n for i, n in enumerate(record["x"])}
    }
    for record in d
)

Result for your sample:

     name  abc0  abc1  abc2  abc3  abc4  abc5
0  Python     0     1     2     3     4   5.0
1     C++     0     1     0     3     4   4.0
2    Java     0     4     5     6     1   NaN
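
The warning quoted in the question suggests joining all columns at once with pd.concat(axis=1). A minimal sketch of that route, assuming the same sample d as above (the intermediate names `names` and `values` are only illustrative), could look like this:

```python
import pandas as pd

d = [
    {"name": "Python", "x": [0, 1, 2, 3, 4, 5]},
    {"name": "C++", "x": [0, 1, 0, 3, 4, 4]},
    {"name": "Java", "x": [0, 4, 5, 6, 1]},
]

# Build the name column and the value columns separately.
names = pd.Series([record["name"] for record in d], name="name")

# One row per record; shorter lists are padded with NaN.
# add_prefix turns the integer column labels 0, 1, ... into "abc0", "abc1", ...
values = pd.DataFrame([record["x"] for record in d]).add_prefix("abc")

# Join everything in a single step instead of inserting columns one by one.
df = pd.concat([names, values], axis=1)
```

As in the result above, a column that had to be padded (abc5 here) ends up with a float dtype because of the NaN.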


1 Comment

Blade Over a year ago

It's a very elegant solution, thanks!

eandklahn · Accepted Answer · 2021-12-09 13:44:24Z

2

You can take the contents of the list of dictionaries and turn them into a list of strings with the following list comprehension:

column_names = [p['name'] + str(v) for p in d for v in p['x']]

For your example, this gives:

['Python0', 'Python1', 'Python2', 'Python3', 'Python4', 'Python5', 'C++0', 'C++1', 'C++0', 'C++3', 'C++4', 'C++4', 'Java0', 'Java4', 'Java5', 'Java6', 'Java1']

You can then construct an empty DataFrame with:

import pandas

df = pandas.DataFrame(columns=column_names)


Raymond Toh · Accepted Answer · 2021-12-09 08:35:48Z

0

I hope this is what you need. If it helps, please upvote and accept the answer.

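import pandas as pd

d = {
  "name": "abc",
  "x": list(range(300))  # 300 elements
}

df = pd.DataFrame(d)  # the scalar "name" is repeated for every element of x
df = df.T             # rows are now "name" and "x"
# Use the "name" row (all "abc") plus the position as column names.
df.columns = [i+str(idx) for idx, i in enumerate(df.iloc[0])]
df = df.drop(index=df.index[0])  # drop the "name" row, keep "x"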
df
Out[91]: 
  abc0 abc1 abc2 abc3 abc4 abc5 abc6 abc7 abc8 abc9  ... abc290 abc291 abc292  \
x    0    1    2    3    4    5    6    7    8    9  ...    290    291    292   

  abc293 abc294 abc295 abc296 abc297 abc298 abc299  
x    293    294    295    296    297    298    299  

[1 rows x 300 columns]
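
If d is a list of several dicts rather than a single dict, one possible sketch is to key the x lists by name and let pandas build one row per record. This assumes the names are unique (duplicate names would overwrite each other in the intermediate dict) and reuses the three-record sample from the question:

```python
import pandas as pd

d = [
    {"name": "Python", "x": [0, 1, 2, 3, 4, 5]},
    {"name": "C++", "x": [0, 1, 0, 3, 4, 4]},
    {"name": "Java", "x": [0, 4, 5, 6, 1]},
]

# orient="index" makes each dict key a row label and each list a row.
df = pd.DataFrame.from_dict(
    {record["name"]: record["x"] for record in d},
    orient="index",
)

# Prefix the integer column labels, then move the row labels into a "name" column.
df = df.add_prefix("abc").rename_axis("name").reset_index()
```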


2 Comments

Blade Over a year ago

Thanks. If d has multiple dicts, how can that be done in an elegant way?

Raymond Toh Over a year ago

@Blade, are you able to provide a sample? Without one, it's hard to understand what you are trying to do.
