turn lists of lists into strings pandas dataframe

Question

Background

I have the following toy df, which contains lists in the columns Before and After:

import pandas as pd

before = [['in', 'the', 'bright', 'blue', 'box'],
          ['because', 'they', 'go', 'really', 'fast'],
          ['to', 'ride', 'and', 'have', 'fun']]
after = [['there', 'are', 'many', 'different'],
         ['i', 'like', 'a', 'lot', 'of', 'sports'],
         ['the', 'middle', 'east', 'has', 'many']]

df = pd.DataFrame({'Before': before,
                   'After': after,
                   'P_ID': [1, 2, 3],
                   'Word': ['crayons', 'cars', 'camels'],
                   'N_ID': ['A1', 'A2', 'A3']})

Output

   Before                             After                           N_ID  P_ID  Word
0  [in, the, bright, blue, box]       [there, are, many, different]   A1    1     crayons
1  [because, they, go, really, fast]  [i, like, a, lot, of, sports]   A2    2     cars
2  [to, ride, and, have, fun]         [the, middle, east, has, many]  A3    3     camels

Problem

The following line of code, taken from "Removing commas and unlisting a dataframe":

df.loc[:, ['After', 'Before']] = df[['After', 'Before']].apply(lambda x: x.str[0].str.replace(',', ''))

produces the following output:

Close-to-what-I-want-but-not-quite Output

   Before   After  N_ID  P_ID  Word
0  in       there  A1    1     crayons
1  because  i      A2    2     cars
2  to       the    A3    3     camels

This output is close, but the After and Before columns contain only the first word of each list (e.g. there), whereas my desired output looks like this:

Desired Output

   Before                       After                     N_ID  P_ID  Word
0  in the bright blue box       there are many different  A1    1     crayons
1  because they go really fast  i like a lot of sports    A2    2     cars
2  to ride and have fun         the middle east has many  A3    3     camels

Question

How do I get my Desired Output?

user3483203 · Accepted Answer · 2019-07-07 02:22:41Z

4

agg + join. The commas aren't actually present in your lists; they are just part of the list's __repr__.

str_cols = ['Before', 'After']

d = {k: ' '.join for k in str_cols}

df.agg(d).join(df.drop(columns=str_cols))

                        Before                     After  P_ID     Word N_ID
0       in the bright blue box  there are many different     1  crayons   A1
1  because they go really fast    i like a lot of sports     2     cars   A2
2         to ride and have fun  the middle east has many     3   camels   A3

If you'd prefer in place (faster):

df[str_cols] = df.agg(d)
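To illustrate the point about `__repr__`, here is a minimal sketch on a single list and on a small Series. The variable names (`words`, `s`, `joined`, `joined_series`) are made up for this example, and `Series.str.join` is used here as an alternative to `agg`, not as the answer's method.

```python
import pandas as pd

words = ['in', 'the', 'bright', 'blue', 'box']

# The brackets, quotes and commas belong to the printed representation only.
print(repr(words))

# Joining the elements gives a plain string with no commas to strip.
joined = ' '.join(words)
print(joined)

# The same join applied element-wise to a column (Series) of lists.
s = pd.Series([words, ['to', 'ride', 'and', 'have', 'fun']])
joined_series = s.str.join(' ')
print(joined_series.tolist())
```

Because the list elements never contained commas, `str.replace(',', '')` has nothing to remove; the join is the only step needed.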


1 Comment

piRSquared Over a year ago

Surely this deserves an upvote as well as acceptance

piRSquared · Answer · 2019-07-07 03:51:28Z

`applymap`

In line

New copy of a dataframe with desired results

df.assign(**df[['After', 'Before']].applymap(' '.join))

                        Before                     After  P_ID     Word N_ID
0       in the bright blue box  there are many different     1  crayons   A1
1  because they go really fast    i like a lot of sports     2     cars   A2
2         to ride and have fun  the middle east has many     3   camels   A3

In place

Mutate existing df

df.update(df[['After', 'Before']].applymap(' '.join))
df

                        Before                     After  P_ID     Word N_ID
0       in the bright blue box  there are many different     1  crayons   A1
1  because they go really fast    i like a lot of sports     2     cars   A2
2         to ride and have fun  the middle east has many     3   camels   A3
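In pandas 2.1 and later, `DataFrame.applymap` is deprecated in favour of `DataFrame.map`. The sketch below picks whichever of the two is available; the trimmed two-row frame and the names `list_cols`, `subset`, `elementwise` and `result` are assumptions made for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'Before': [['in', 'the', 'bright', 'blue', 'box'],
               ['to', 'ride', 'and', 'have', 'fun']],
    'After': [['there', 'are', 'many', 'different'],
              ['the', 'middle', 'east', 'has', 'many']],
    'P_ID': [1, 3],
})

list_cols = ['After', 'Before']
subset = df[list_cols]

# Use DataFrame.map where it exists (pandas 2.1+), otherwise fall back to applymap.
elementwise = subset.map if hasattr(subset, 'map') else subset.applymap

# Same "in line" pattern as above: build a new frame, leave df untouched.
result = df.assign(**elementwise(' '.join))
print(result)
```

Both methods apply the function to every cell, so the result should match the `applymap` output shown above.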

`stack` and `str.join`

We can use this result in a similar "In line" and "In place" way as shown above.

df[['After', 'Before']].stack().str.join(' ').unstack()

                      After                       Before
0  there are many different       in the bright blue box
1    i like a lot of sports  because they go really fast
2  the middle east has many         to ride and have fun
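As a rough sketch of how the stacked result could be plugged back in, the snippet below reuses the `assign` and column-assignment patterns from earlier. The trimmed frame and the names `list_cols`, `joined` and `new_df` are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Before': [['in', 'the', 'bright', 'blue', 'box'],
               ['to', 'ride', 'and', 'have', 'fun']],
    'After': [['there', 'are', 'many', 'different'],
              ['the', 'middle', 'east', 'has', 'many']],
    'P_ID': [1, 3],
})
list_cols = ['After', 'Before']

# Stack to one long Series of lists, join each list, then restore the columns.
joined = df[list_cols].stack().str.join(' ').unstack()

# "In line": a new frame with the joined columns; df still holds lists here.
new_df = df.assign(**joined)

# "In place": overwrite the list columns on df itself.
df[list_cols] = joined[list_cols]
print(df)
```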

Erfan · Answer · 2019-07-07 02:20:09Z

2

We can specify the columns of lists we want to convert to strings and then use .apply in a for loop:

lst_cols = ['Before', 'After']

for col in lst_cols:
    df[col] = df[col].apply(' '.join)

                        Before                     After  P_ID     Word N_ID
0       in the bright blue box  there are many different     1  crayons   A1
1  because they go really fast    i like a lot of sports     2     cars   A2
2         to ride and have fun  the middle east has many     3   camels   A3
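If the loop is needed in several places, it could be wrapped in a small helper. This is only a sketch: `join_list_columns` and its `sep` parameter are hypothetical names, not part of pandas, and the helper returns a copy instead of mutating its input.

```python
import pandas as pd

def join_list_columns(frame, columns, sep=' '):
    """Return a copy of `frame` with the lists in `columns` joined into strings."""
    out = frame.copy()
    for col in columns:
        # sep.join is called once per cell, so every cell must be a list of strings.
        out[col] = out[col].apply(sep.join)
    return out

df = pd.DataFrame({
    'Before': [['in', 'the', 'bright', 'blue', 'box'],
               ['to', 'ride', 'and', 'have', 'fun']],
    'After': [['there', 'are', 'many', 'different'],
              ['the', 'middle', 'east', 'has', 'many']],
    'Word': ['crayons', 'camels'],
})

converted = join_list_columns(df, ['Before', 'After'])
print(converted)
```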


1 Comment

Erfan Over a year ago

Whoever downvoted my answer, could you explain why?
