I have a dataframe that I'm importing from a text file. In the file, certain columns hold multiple pieces of data separated by commas, so for some rows the column value is effectively a list. However, pandas isn't reading the data that way; it reads each value as a single string that happens to contain commas. (Example in the MRE below.)

What I ultimately want to do is use df.explode to expand these columns into separate rows but first I need to get pandas to recognize the data as a list. Obviously I could loop through the whole df but there's got to be a vectorized solution here.

Sample code:

import pandas as pd

df_data = {'Day': ['Mon', 'Tues', 'Weds', 'Thurs', 'Fri', 'Sat', 'Sun'],
        'Visit':['MK', ['E', 'DAK'], 'MK', 
                ['DHS', 'E'], 'E', ['DAK', 'DHS', 'E'], 'MK'],
        'Visit2':['MK', 'E, DAK', 'MK', 
                'DHS, E', 'E', 'DAK, DHS, E', 'MK']}

df = pd.DataFrame(data = df_data)

print(df.explode('Visit'))
print(df.explode('Visit2'))

The data I'm dealing with looks like column Visit2, but what I want is something like Visit, or some other data transformation that ends up with the desired result:

     Day Visit
0    Mon    MK
1   Tues     E
1   Tues   DAK
2   Weds    MK
3  Thurs   DHS
3  Thurs     E
4    Fri     E
5    Sat   DAK
5    Sat   DHS
5    Sat     E
6    Sun    MK

1 Answer

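IIUC, you can convert your Visit2 to the same list form as Visit with str.split(). Split on ', ' (comma plus space) rather than ',' alone, otherwise the later items keep a leading space (' DAK' instead of 'DAK'):

df['Visit2'] = df['Visit2'].str.split(', ')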

And then when you do:

>>> print(df.explode('Visit2'))

     Day          Visit Visit2
0    Mon             MK     MK
1   Tues       [E, DAK]      E
1   Tues       [E, DAK]    DAK
2   Weds             MK     MK
3  Thurs       [DHS, E]    DHS
3  Thurs       [DHS, E]      E
4    Fri              E      E
5    Sat  [DAK, DHS, E]    DAK
5    Sat  [DAK, DHS, E]    DHS
5    Sat  [DAK, DHS, E]      E
6    Sun             MK     MK

# Or drop the column first

>>> print(df.drop('Visit',axis=1).explode('Visit2'))

     Day Visit2
0    Mon     MK
1   Tues      E
1   Tues    DAK
2   Weds     MK
3  Thurs    DHS
3  Thurs      E
4    Fri      E
5    Sat    DAK
5    Sat    DHS
5    Sat      E
6    Sun     MK
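
If the real file is not consistent about spacing after the commas, one possible variant (a sketch only, assuming the column contains nothing but strings and that stray whitespace around each item is never meaningful) is to split on the bare comma, explode, and then strip each resulting value. The frame below just reuses the question's sample data, and the name `exploded` is only for illustration:

```python
import pandas as pd

# Sample data from the question, keeping only the string-valued column
df = pd.DataFrame({
    'Day': ['Mon', 'Tues', 'Weds', 'Thurs', 'Fri', 'Sat', 'Sun'],
    'Visit2': ['MK', 'E, DAK', 'MK', 'DHS, E', 'E', 'DAK, DHS, E', 'MK'],
})

# Split on the bare comma so each cell becomes a list, then give every
# list item its own row (the original index labels are repeated)
exploded = df.assign(Visit2=df['Visit2'].str.split(',')).explode('Visit2')

# Trim whitespace left over from the split, e.g. ' DAK' -> 'DAK'
exploded['Visit2'] = exploded['Visit2'].str.strip()

print(exploded)
```

Stripping after the explode keeps everything in vectorized string methods, and it does not depend on the separator being exactly ', '. If repeated index labels are not wanted, calling reset_index(drop=True) on the result gives a fresh 0..n-1 index.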
