
To explode a list-like column into rows, we can use the pandas explode() function. My pandas version is '0.25.3'.

The given example worked for me, and another answer on Stack Overflow also works as expected, but neither works for my dataset.

    city        nested_city
0   soto        ['Soto']
1   tera-kora   ['Daniel']
2   jan-thiel   ['Jan Thiel']
3   westpunt    ['Westpunt']
4   nieuwpoort  ['Nieuwpoort', 'Santa Barbara Plantation']

What I have tried:

test_data['nested_city'].explode()

and

test_data.set_index(['nested_city']).apply(pd.Series.explode).reset_index()

Output

0    ['Soto']                                  
1    ['Daniel']                                
2    ['Jan Thiel']                             
3    ['Westpunt']                              
4    ['Nieuwpoort', 'Santa Barbara Plantation']
Name: neighbors, dtype: object
  • Please check whether nested_city holds lists or strings. Commented Aug 18, 2020 at 16:19
  • 1: Do you get any error? If so, you might want to check the pandas version. 2: Check whether the values are actual lists (test_data['nested_city'].apply(type)) or just string representations of lists, in which case use test_data['nested_city'].apply(ast.literal_eval).explode(). Commented Aug 18, 2020 at 16:19
  • type(test_data['nested_city']) returns pandas.core.series.Series. Commented Aug 18, 2020 at 16:19
  • @AlwaysSunny That is the reason: the values are <class 'str'>, not lists, and explode only works on list-like values. Commented Aug 18, 2020 at 16:23
  • I would recommend investing some time in understanding the Python dtypes supported by the various methods. Also, for your last query you need a loop with df.join; check the documentation. I will leave it to you as homework :) Commented Aug 18, 2020 at 16:34
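
The check suggested in the comments can be sketched as follows. This is a minimal illustration, assuming a small made-up frame whose nested_city cells are strings that merely look like lists; the variable names all_strings and unchanged are hypothetical.

```python
import pandas as pd

# Assumed sample data: each cell is a str, not a list
test_data = pd.DataFrame({"nested_city": ["['Soto']", "['Daniel']"]})

# type() on the column only tells you it is a Series;
# inspect the cells instead
cell_types = test_data["nested_city"].apply(type)
all_strings = test_data["nested_city"].map(lambda v: isinstance(v, str)).all()

# explode() leaves non-list scalars untouched, which is why nothing changes
unchanged = test_data["nested_city"].explode().tolist()
```

If all_strings is true, the column needs to be parsed into real lists before explode() can do anything useful.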

1 Answer

You need to ensure that every value in your column is an actual list (not a string that looks like one) to be able to use pandas' explode(). Here is a working solution:

from ast import literal_eval

test_data['nested_city'] = test_data['nested_city'].apply(literal_eval) #convert to list type
test_data['nested_city'].explode()
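
As a self-contained illustration of the conversion above, here is a minimal sketch using two rows modelled on the question's data. The frame contents are assumed for the example, and DataFrame.explode (available since pandas 0.25.0) is used so that the city column is repeated alongside each exploded value.

```python
from ast import literal_eval

import pandas as pd

# Assumed sample data: nested_city holds string representations of lists
test_data = pd.DataFrame({
    "city": ["soto", "nieuwpoort"],
    "nested_city": ["['Soto']", "['Nieuwpoort', 'Santa Barbara Plantation']"],
})

# Parse each string into a real Python list
test_data["nested_city"] = test_data["nested_city"].apply(literal_eval)

# One row per list element; the original index label is repeated
exploded = test_data.explode("nested_city")
```

Call reset_index(drop=True) on the result if you want a fresh 0..n-1 index instead of the repeated labels.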

To explode multiple columns at a time, you can do the following:

import pandas as pd
# Columns that are NOT being exploded (assumes col1 and col2 are the ones being exploded)
not_list_cols = [col for col in test_data.columns if col not in ['col1', 'col2']]
test_data = test_data.set_index(not_list_cols).apply(pd.Series.explode).reset_index()
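
A small worked example of the multi-column approach, as a sketch only: the frame df and its columns id, col1 and col2 are made up for illustration, and the approach assumes the lists in col1 and col2 have the same length within each row.

```python
import pandas as pd

# Assumed sample data: col1 and col2 hold lists of equal length per row
df = pd.DataFrame({
    "id": [1, 2],
    "col1": [["a", "b"], ["c"]],
    "col2": [[10, 20], [30]],
})

# Everything except the list columns becomes the index
not_list_cols = [col for col in df.columns if col not in ["col1", "col2"]]

# Explode each remaining column separately, then restore the index columns
result = df.set_index(not_list_cols).apply(pd.Series.explode).reset_index()
```

If the lists in a row differ in length, the exploded columns no longer line up and this approach is likely to fail, so it is worth validating the lengths first.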

1 Comment

If you get an exception from literal_eval, make sure every value in the column is a string representation of a list, e.g. test_data['nested_city'] = test_data['nested_city'].fillna({i: [] for i in test_data.index}) and test_data['nested_city'] = '[' + test_data['nested_city'].astype(str) + ']'
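
One alternative way to guard against such exceptions, shown here as a hedged sketch rather than the commenter's method: parse only the cells that are strings and treat anything else that is not already a list (such as missing values) as an empty list. The helper name to_list and the sample Series are hypothetical.

```python
from ast import literal_eval

import pandas as pd


def to_list(value):
    """Return a list for one cell (hypothetical helper for this example)."""
    if isinstance(value, list):
        return value                 # already a real list
    if isinstance(value, str):
        return literal_eval(value)   # parse "['a', 'b']" into ['a', 'b']
    return []                        # None/NaN and other scalars


# Assumed sample data mixing a string, a missing value and a real list
raw = pd.Series(["['a', 'b']", None, ["c"]])
parsed = raw.apply(to_list)

# An empty list explodes to a single NaN row
exploded = parsed.explode()
```

Note that literal_eval will still raise on strings that are not valid Python literals, so malformed cells need separate handling.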
