I have a list of dicts, d, in which each dict holds a nested list x, e.g.:

d = [{"name":"Python", "x":[0,1,2,3,4,5]},  # x has 300 elements
     {"name":"C++", "x":[0,1,0,3,4,4]},
     {"name":"Java","x":[0,4,5,6,1]}]

I want to convert d to a DataFrame, automatically adding one column per element of x, with each new column name prefixed by "abc", e.g.:

df.columns = ["name", "abc0", "abc1", ..., "abc299"]

I'm looking for an efficient way to do this, as d contains many dicts. When I added the columns manually, pandas warned:

PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead.  To get a de-fragmented frame, use `newframe = frame.copy()`

3 Answers

Are you looking for something like this?

import pandas as pd

d = [{"name":"Python", "x":[0,1,2,3,4,5]},  # x has 300 elements
     {"name":"C++", "x":[0,1,0,3,4,4]},
     {"name":"Java","x":[0,4,5,6,1]}]

df = pd.DataFrame(
    {
        "name": record["name"],
        **{f"abc{i}": n for i, n in enumerate(record["x"])}
    }
    for record in d
)

Result for your sample:

     name  abc0  abc1  abc2  abc3  abc4  abc5
0  Python     0     1     2     3     4   5.0
1     C++     0     1     0     3     4   4.0
2    Java     0     4     5     6     1   NaN
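
Since the warning in the question suggests joining all columns at once with pd.concat(axis=1), here is a minimal alternative sketch along those lines. It is not the code from this answer, just one possible variant: it builds the x columns in a single DataFrame call, prefixes them with add_prefix, and joins them to the names in one concat. The variable names x_cols and names are made up for illustration, and I have not benchmarked it against the dict-comprehension version.

```python
import pandas as pd

d = [
    {"name": "Python", "x": [0, 1, 2, 3, 4, 5]},
    {"name": "C++", "x": [0, 1, 0, 3, 4, 4]},
    {"name": "Java", "x": [0, 4, 5, 6, 1]},
]

# Build every "x" column in one call; shorter lists are padded with NaN.
# The default integer column labels 0, 1, 2, ... become "abc0", "abc1", ...
x_cols = pd.DataFrame([record["x"] for record in d]).add_prefix("abc")

# Build the "name" column separately, then join everything with a single concat,
# which avoids inserting columns one at a time.
names = pd.DataFrame({"name": [record["name"] for record in d]})
df = pd.concat([names, x_cols], axis=1)

print(df)
```

As in the result above, a column that contains a missing value (abc5 here) ends up as float because of the NaN padding.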

1 Comment

It's a very elegant solution, thanks!

You can turn the contents of the list of dictionaries into a list of column-name strings, joining each name with each value in its x list, using the following list comprehension:

column_names = [p['name']+str(p['x'][idx]) for p in d for idx in range(len(p['x']))]

For your example, this gives:

['Python0', 'Python1', 'Python2', 'Python3', 'Python4', 'Python5', 'C++0', 'C++1', 'C++0', 'C++3', 'C++4', 'C++4', 'Java0', 'Java4', 'Java5', 'Java6', 'Java1']

You can then construct an empty DataFrame with:

import pandas

df = pandas.DataFrame(columns=column_names)

I hope this is what you need. If it helps, please upvote and accept the answer.

import pandas as pd

d = {
  "name": "abc",
  "x": list(range(300))  # 300 elements
}

df = pd.DataFrame(d)
df = df.T
df.columns = [i+str(idx) for idx, i in enumerate(df.iloc[0])]
df.drop(index=df.index[0], axis=0, inplace=True)
df
Out[91]: 
  abc0 abc1 abc2 abc3 abc4 abc5 abc6 abc7 abc8 abc9  ... abc290 abc291 abc292  \
x    0    1    2    3    4    5    6    7    8    9  ...    290    291    292   

  abc293 abc294 abc295 abc296 abc297 abc298 abc299  
x    293    294    295    296    297    298    299  

[1 rows x 300 columns]

2 Comments

Thanks. If d has multiple dicts, how can I do that in an elegant way?
@Blade, could you provide a sample example? Otherwise it's hard to understand what you are trying to do.
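
For the multi-dict case asked about in the comment above, one possible sketch uses the sample d from the question and pd.DataFrame.from_dict with orient="index", so that each dict becomes one row. This assumes every "name" is unique, since the names are used as dict keys and duplicates would overwrite each other; the variable name wide is made up for illustration.

```python
import pandas as pd

d = [
    {"name": "Python", "x": [0, 1, 2, 3, 4, 5]},
    {"name": "C++", "x": [0, 1, 0, 3, 4, 4]},
    {"name": "Java", "x": [0, 4, 5, 6, 1]},
]

# Assumption: names are unique, because they become dict keys (and row labels) here.
# orient="index" makes each key a row; shorter lists are padded with NaN.
wide = pd.DataFrame.from_dict(
    {record["name"]: record["x"] for record in d}, orient="index"
)

# Prefix the integer column labels: 0 -> "abc0", 1 -> "abc1", ...
wide = wide.add_prefix("abc")

# Move the row labels into a regular "name" column.
df = wide.rename_axis("name").reset_index()

print(df)
```

If names can repeat, building the rows from a list instead of a dict would be the safer choice.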
