Python: create nested lists from a long-format pandas DataFrame

Question

I have a DataFrame in which only "peak_time" is a column; "stimulus" and "position" form the index:

stimulus position peak_time 
1        1        1.0
         2        1.5
2        1        2.0
         2        2.0
3        1        2.5

Now I'm trying to squeeze the position column and get lists instead, so that it looks like this:

stimulus peak_time  
1        [1.0, 1.5]
2        [2.0, 2.0]
3        [2.5]

It's probably super easy, but I can't find any solution using Google. If someone has already opened this topic, I'd appreciate the respective link, too. Thanks for your help!

Code to create dataframe:

import random
import numpy as np   # nanmedian lives in NumPy; the old scipy.nanmedian alias is gone from current SciPy
import pandas as pd
trial     = [1,1,2,1,1,2,2,1,2]
stimulus  = [1,1,1,2,2,2,2,3,3] 
position  = [1,2,1,1,2,1,2,1,1]
peak_time = random.sample(range(1000), 9)
df        = pd.DataFrame({"trial": trial, "stimulus": stimulus, "position": position, "peak_time": peak_time})
median_   = df.groupby(['stimulus', 'position']).apply(np.nanmedian)
median_   = pd.DataFrame(median_)
median_.columns = ['peak_time']
median_

EDIT

As I can only post one question every 90 minutes, I'd like to ask a follow-up question underneath this post. I now have two pandas Series that look like this:

median_:
stimulus
1    [1.0, 1.5]
2    [2.0, 2.0]
3    [2.0]

quartile_:
stimulus
1    [[1.0, 70.0],  [1.0, 183.25]]
2    [[1.0, 65.75], [2.0, 98.75]]
3    [[1.0, 51.25]]

I want to subtract median_ from quartile_ so that I get

distance_: 
stimulus
1   [1-1, 70-1], [1.5-1, 183.25-1.5]
2   [2-1, 65.75-1], [2-2, 98.75-2]
3   [2-1, 51.25-2]

Is there a simple way to do that? abs(median_ - quartile_) does not work.

Code to create series:

import random
import numpy as np   # nanmedian/nanpercentile live in NumPy; the old scipy aliases are gone from current SciPy
import pandas as pd
trial     = [1,1,2,1,1,2,2,1,2]
stimulus  = [1,1,1,2,2,2,2,3,3] 
position  = [1,2,1,1,2,1,2,1,1]
peak_time = random.sample(range(1000), 9)
df        = pd.DataFrame({"trial": trial, "stimulus": stimulus, "position": position, "peak_time": peak_time})
median_   = df.groupby(['stimulus', 'position']).apply(np.nanmedian).groupby(level=0).apply(list)
quartile_ = df.groupby(['stimulus', 'position']).apply(lambda x: np.nanpercentile(x, [25, 75])).groupby(level=0).apply(list)

Solution

Apply groupby(level=0).apply(list) later, so

import numpy as np
median_   = df.groupby(['stimulus', 'position']).apply(np.nanmedian)
quartile_ = df.groupby(['stimulus', 'position']).apply(lambda x: np.nanpercentile(x, [25, 75]))

Then I can subtract them easily

distance_ = abs(median_ - quartile_)
distance_ = distance_.groupby(level=0).apply(list)
distance_

stimulus
1   [1-1, 70-1], [1.5-1, 183.25-1.5]
2   [2-1, 65.75-1], [2-2, 98.75-2]
3   [2-1, 51.25-2]
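
The order of operations is what matters here: do the arithmetic while the data is still flat, and only then collapse it into lists. Below is a minimal sketch of that idea with two stated deviations from the code above. The peak_time values are fixed rather than random (an assumption made so the result is reproducible), and the quartiles come from groupby quantile() reshaped into two columns, a swapped-in alternative to nanpercentile, so that the subtraction runs on plain numeric columns instead of on arrays stored inside a Series.

```python
import pandas as pd

# Fixed, illustrative values (the question draws them at random).
df = pd.DataFrame({
    "stimulus":  [1, 1, 1, 2, 2, 2, 2, 3, 3],
    "position":  [1, 2, 1, 1, 2, 1, 2, 1, 1],
    "peak_time": [0.5, 1.5, 1.5, 1.0, 1.5, 3.0, 2.5, 2.0, 3.0],
})

grouped = df.groupby(["stimulus", "position"])["peak_time"]

# One median per (stimulus, position) pair.
median_ = grouped.median()

# 25th and 75th percentiles as two numeric columns, indexed like median_.
quartile_ = grouped.quantile([0.25, 0.75]).unstack()

# Subtract row-wise while everything is still numeric, then take absolute values.
distance_ = quartile_.sub(median_, axis=0).abs()

# Only now collapse: one [d25, d75] pair per position,
# then one list of pairs per stimulus.
pairs = pd.Series(distance_.to_numpy().tolist(), index=distance_.index)
nested = pairs.groupby(level=0).apply(list)
print(nested)
```

Keeping distance_ as a regular two-column DataFrame until the last step also means it can still be filtered, plotted, or aggregated before the nested lists are built.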

jezrael · Accepted Answer · 2018-02-22 13:51:17Z


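It is a Series with a MultiIndex, so you need Series.groupby followed by apply(list):

import numpy as np
#added column peak_time
median_   = df.groupby(['stimulus', 'position'])['peak_time'].apply(np.nanmedian)
#group by the first index level (stimulus) and collect each group's values in a list
df        = median_.groupby(level=0).apply(list).reset_index()
print(df)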
   stimulus   peak_time
0         1  [1.0, 1.5]
1         2  [2.0, 2.0]
2         3       [2.5]
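
For a reproducible check of this approach, here is a minimal self-contained sketch. The peak_time values are fixed rather than random (an assumption made for illustration), picked so that the per-group medians match the table in the question. It also uses the built-in groupby median(), which skips NaN by default, in place of nanmedian.

```python
import pandas as pd

# Fixed, illustrative values (the question draws them at random).
df = pd.DataFrame({
    "trial":     [1, 1, 2, 1, 1, 2, 2, 1, 2],
    "stimulus":  [1, 1, 1, 2, 2, 2, 2, 3, 3],
    "position":  [1, 2, 1, 1, 2, 1, 2, 1, 1],
    "peak_time": [0.5, 1.5, 1.5, 1.0, 1.5, 3.0, 2.5, 2.0, 3.0],
})

# Median per (stimulus, position): a Series with a two-level MultiIndex.
median_ = df.groupby(["stimulus", "position"])["peak_time"].median()

# Group by the first index level (stimulus) and gather each group's
# medians into a Python list, then turn the index back into a column.
nested = median_.groupby(level=0).apply(list).reset_index()
print(nested)
```

Grouping with level=0 refers to the outermost index level, so the same line works whether or not that level has a name; level="stimulus" would be the named equivalent here.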

edited Feb 22, 2018 at 13:51

answered Feb 22, 2018 at 13:44


4 Comments

Svenno Nito Over a year ago

That was probably the quickest response ever. Solves my problem entirely, thanks!! Gonna mark it as the answer in 6 min (Stack Overflow makes me wait).

Svenno Nito Over a year ago

I have a follow-up question; maybe you know a quick solution for that, too?

jezrael Over a year ago

@SvennoNito - Not so easy, because pandas does not natively support nested lists very well.

Svenno Nito Over a year ago

@jezrael, I found a workaround by applying your solution only to the distance measure: first distance_ = median_ - quartile_, and then distance_.groupby(level=0).apply(list)
