
I have a DataFrame:

index col1
0      a,c
1      d,f
2      o,k

I need a DataFrame like this:

index col1
0     {"col1":"a,c"}
1     {"col1":"d,f"}
2     {"col1":"o,k"}

This needs to be applied to all columns in the DataFrame.

I tried using orient, but the output was not what I expected.

  • Can you be more explicit about the input/output? Are those strings? Dictionaries? Are you missing the quotes? Ideally, provide both the input and the output as objects. Commented Apr 21, 2022 at 8:30
  • Those are strings, and yes, sorry, I'm missing the quotes as well. I made a mistake while trying to reproduce the data. Commented Apr 21, 2022 at 8:32
  • I provided several alternatives; let me know which one you want (and please update the question accordingly). Commented Apr 21, 2022 at 8:36

2 Answers


To handle all columns, use a nested apply: the outer apply passes each column as a Series, whose name is available as x.name, and the inner apply wraps each value in a dictionary:

df = df.apply(lambda x: x.apply(lambda y: {x.name: y}))

For json use:

import json

df = df.apply(lambda x: x.apply(lambda y: json.dumps({x.name: y})))
print (df)
              col1
0  {"col1": "a,c"}
1  {"col1": "d,f"}
2  {"col1": "o,k"}
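As a self-contained check of the nested apply, here is a minimal sketch that rebuilds the sample data from the question and confirms that each cell parses back with json.loads. The second column, col2, is not in the question; it is made up here only to show that every column is handled.

```python
import json

import pandas as pd

# Sample data from the question; "col2" is a hypothetical extra column
# added only to show that all columns are processed.
df = pd.DataFrame({"col1": ["a,c", "d,f", "o,k"], "col2": ["x", "y", "z"]})

# Outer apply walks the columns (x is a Series, x.name is its label);
# inner apply wraps each cell in a one-key dict and serializes it.
out = df.apply(lambda x: x.apply(lambda y: json.dumps({x.name: y})))
print(out)

# Each cell is a JSON string that parses back to a one-key dict.
first = json.loads(out.loc[0, "col1"])
```

Note that json.dumps puts a space after the colon by default, so the cells read {"col1": "a,c"} rather than {"col1":"a,c"}.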

Alternative solution for dictionaries:

df = pd.DataFrame({c: [{c: x} for x in df[c]] for c in df.columns}, index=df.index)
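The dictionary variants store real dict objects in the cells, not strings, which matters if you later want to look values up by key. A small sketch on the question's sample data (the names out and cell are only for illustration):

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"col1": ["a,c", "d,f", "o,k"]})

# Build each column as a list of one-key dicts, keeping the original index.
out = pd.DataFrame(
    {c: [{c: x} for x in df[c]] for c in df.columns},
    index=df.index,
)

# The cell is a Python dict, so it can be indexed by the column name.
cell = out.loc[0, "col1"]
print(type(cell), cell["col1"])
```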

Second alternative for JSON (works well if all columns contain strings):

df = '{"' + df.columns + '": "' + df.astype(str) + '"}'
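One caveat with plain string concatenation: it does not escape anything, so a value that itself contains a double quote yields a string that is not valid JSON, whereas json.dumps escapes it. The sketch below uses a made-up value (say "hi") that is not in the question, and a hypothetical helper is_valid_json, purely to illustrate the difference.

```python
import json

import pandas as pd

# Hypothetical data: the second value contains double quotes.
df = pd.DataFrame({"col1": ["a,c", 'say "hi"']})


def is_valid_json(text):
    """Illustrative helper: True if text parses as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Concatenation leaves the inner quotes unescaped.
concatenated = ('{"col1": "' + df["col1"] + '"}').iloc[1]

# json.dumps escapes the inner quotes.
dumped = json.dumps({"col1": df["col1"].iloc[1]})

print(concatenated, is_valid_json(concatenated))
print(dumped, is_valid_json(dumped))
```

If the data is known to be free of quotes and backslashes, as in the question, the concatenation is fine.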

5 Comments

What is the difference between the normal and the alternative solution?
@Kowsi - I think the alternative should be faster for a large DataFrame, but I have not tested it.
The fastest is to use vectorized string concatenation as I suggested (6x faster on a small df, 15x faster on 30k), but json will be more practical if quoting is expected (which doesn't seem to be the case here, though).
@mozway - As I understand it, the OP needs dictionary or JSON output in all columns.
Yes, but you can apply the vectorized string concatenation to all columns (I updated my answer) ;)

If you want the strings exactly as shown (without quotes), use:
df['col1'] = '{col1:'+df['col1']+'}'

# or 
c = 'col1'
df[c] = f'{{{c}:'+df[c]+'}'

output:

0    {col1:a,c}
1    {col1:d,f}
2    {col1:o,k}
Name: col1, dtype: object

or, with quotes:

df['col1'] = '{"col1":"'+df['col1']+'"}'

# or 
c = 'col1'
df[c] = f'{{"{c}":"'+df[c]+'"}'

output:

   index            col1
0      0  {"col1":"a,c"}
1      1  {"col1":"d,f"}
2      2  {"col1":"o,k"}
for all columns:
df = df.apply(lambda c: f'{{"{c.name}":"'+c.astype(str)+'"}')

NB. Make sure "index" is the actual index rather than a regular column; otherwise it gets wrapped like every other column.
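A minimal sketch of that precaution, assuming "index" arrived as a regular column (as the printed outputs above suggest): move it into the index with set_index before applying the concatenation to every column.

```python
import pandas as pd

# Assumption: "index" is an ordinary column here, not the DataFrame index.
df = pd.DataFrame({"index": [0, 1, 2], "col1": ["a,c", "d,f", "o,k"]})

# Move "index" into the actual index so it is not wrapped below.
df = df.set_index("index")

# Wrap every remaining column; c.name is the column label.
out = df.apply(lambda c: f'{{"{c.name}":"' + c.astype(str) + '"}')
print(out)
```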

for dictionaries:
df['col1'] = [{'col1': x} for x in df['col1']]

output:

   index             col1
0      0  {'col1': 'a,c'}
1      1  {'col1': 'd,f'}
2      2  {'col1': 'o,k'}
