Add empty rows after every N rows in a MultiIndex data frame

Question

I have a data frame like this:

probe_names  PLAGL1  GRB10  MEST   H19  KCNQ1OT1  MEG3  MEG8  SNRPN  PEG3  \
Jemima   0     0.55   0.53  0.53  0.47      0.62  0.11  0.83   0.50  0.49   
        1     0.51   0.46  0.53  0.52      0.47  0.00  0.91   0.47  0.54   
        2      NaN    NaN   NaN  0.55       NaN   NaN   NaN    NaN  0.50   
        0     0.54   0.59  0.53  0.47      0.66  0.13  0.90   0.51  0.53   
        1     0.48   0.45  0.54  0.50      0.47  0.00  0.90   0.50  0.53   
        2      NaN    NaN   NaN  0.54       NaN   NaN   NaN    NaN  0.53   
Elena    0     0.54   0.55  0.55  0.57      0.53  0.58  0.55   0.52  0.45   
        1     0.53   0.49  0.53  0.65      0.38  0.62  0.48   0.49  0.55   
        2      NaN    NaN   NaN  0.66       NaN   NaN   NaN    NaN  0.42   
        0     0.51   0.53  0.55  0.62      0.52  0.57  0.53   0.50  0.48   
        1     0.48   0.45  0.52  0.63      0.38  0.59  0.46   0.53  0.55   
        2      NaN    NaN   NaN  0.63       NaN   NaN   NaN    NaN  0.45

I want to add 2 empty/NaN rows after every 3 rows:

probe_names  PLAGL1  GRB10  MEST   H19  KCNQ1OT1  MEG3  MEG8  SNRPN  PEG3  \
Jemima   0     0.55   0.53  0.53  0.47      0.62  0.11  0.83   0.50  0.49   
        1     0.51   0.46  0.53  0.52      0.47  0.00  0.91   0.47  0.54   
        2      NaN    NaN   NaN  0.55       NaN   NaN   NaN    NaN  0.50  
        3
        4
        0     0.54   0.59  0.53  0.47      0.66  0.13  0.90   0.51  0.53   
        1     0.48   0.45  0.54  0.50      0.47  0.00  0.90   0.50  0.53   
        2      NaN    NaN   NaN  0.54       NaN   NaN   NaN    NaN  0.53   
        3
        4
Elena    0     0.54   0.55  0.55  0.57      0.53  0.58  0.55   0.52  0.45   
        1     0.53   0.49  0.53  0.65      0.38  0.62  0.48   0.49  0.55   
        2      NaN    NaN   NaN  0.66       NaN   NaN   NaN    NaN  0.42   
        0     0.51   0.53  0.55  0.62      0.52  0.57  0.53   0.50  0.48   
        1     0.48   0.45  0.52  0.63      0.38  0.59  0.46   0.53  0.55   
        2      NaN    NaN   NaN  0.63       NaN   NaN   NaN    NaN  0.45
        3
        4

I don't know how to do this with a MultiIndex. I have searched online but couldn't find anything really similar.

Can you provide the constructor for your dataframe with df.to_dict()?
– not_speshal, Commented May 5, 2022 at 15:56
Please include that dictionary in your post. Don't link to external sites.
– not_speshal, Commented May 5, 2022 at 16:07

not_speshal · Accepted Answer · 2022-05-05 18:52:31Z

Add a helper index level so that the index becomes unique, then reindex to a MultiIndex that also contains the extra rows:

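import pandas as pd

# add an extra index level (a running count per (name, row) pair) to tell the duplicated index entries apart
df = df.set_index(df.groupby(level=[0, 1], sort=False).cumcount(), append=True)

# build the full 3-level MultiIndex; max()+3 extends rows 0..2 to 0..4, i.e. two extra rows per block
midx = pd.MultiIndex.from_product([df.index.levels[0], range(df.index.levels[1].max() + 3), df.index.levels[2]])

# reindex (the new rows are filled with NaN), sort by name (descending) and block (ascending), and drop the extra level
output = df.reindex(midx).sort_index(level=[0, 2], ascending=[False, True]).droplevel(2)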

>>> output
          PLAGL1  GRB10  MEST   H19  KCNQ1OT1  MEG3  MEG8  SNRPN  PEG3
Jemima 0    0.55   0.53  0.53  0.47      0.62  0.11  0.83   0.50  0.49
       1    0.51   0.46  0.53  0.52      0.47  0.00  0.91   0.47  0.54
       2     NaN    NaN   NaN  0.55       NaN   NaN   NaN    NaN  0.50
       3     NaN    NaN   NaN   NaN       NaN   NaN   NaN    NaN   NaN
       4     NaN    NaN   NaN   NaN       NaN   NaN   NaN    NaN   NaN
       0    0.54   0.59  0.53  0.47      0.66  0.13  0.90   0.51  0.53
       1    0.48   0.45  0.54  0.50      0.47  0.00  0.90   0.50  0.53
       2     NaN    NaN   NaN  0.54       NaN   NaN   NaN    NaN  0.53
       3     NaN    NaN   NaN   NaN       NaN   NaN   NaN    NaN   NaN
       4     NaN    NaN   NaN   NaN       NaN   NaN   NaN    NaN   NaN
Elena  0    0.54   0.55  0.55  0.57      0.53  0.58  0.55   0.52  0.45
       1    0.53   0.49  0.53  0.65      0.38  0.62  0.48   0.49  0.55
       2     NaN    NaN   NaN  0.66       NaN   NaN   NaN    NaN  0.42
       3     NaN    NaN   NaN   NaN       NaN   NaN   NaN    NaN   NaN
       4     NaN    NaN   NaN   NaN       NaN   NaN   NaN    NaN   NaN
       0    0.51   0.53  0.55  0.62      0.52  0.57  0.53   0.50  0.48
       1    0.48   0.45  0.52  0.63      0.38  0.59  0.46   0.53  0.55
       2     NaN    NaN   NaN  0.63       NaN   NaN   NaN    NaN  0.45
       3     NaN    NaN   NaN   NaN       NaN   NaN   NaN    NaN   NaN
       4     NaN    NaN   NaN   NaN       NaN   NaN   NaN    NaN   NaN
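
For anyone who wants to try the steps without the original data, here is a minimal self-contained sketch. The toy frame (one column, values 0 to 11), the `n_empty` parameter and the intermediate names `block`, `df3` and `n_rows` are made up for illustration; only the three steps themselves come from the answer. The last statement is a variation that is not part of the answer: it lists the target index in the wanted order directly, so it does not depend on how the names sort.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the question's data: two names, two blocks of three
# rows each, so every (name, row) pair appears twice in the index.
names = ["Jemima"] * 6 + ["Elena"] * 6
rows = [0, 1, 2] * 4
df = pd.DataFrame(
    {"H19": np.arange(12, dtype=float)},
    index=pd.MultiIndex.from_arrays([names, rows]),
)

n_empty = 2  # hypothetical parameter: empty rows to add after each block

# Step 1: number the repeats of each (name, row) pair and append that
# counter as a third index level, which makes the index unique.
block = df.groupby(level=[0, 1], sort=False).cumcount()
df3 = df.set_index(block, append=True)

# Step 2: full grid of name x (existing rows + padding rows) x block.
n_rows = df3.index.levels[1].max() + 1 + n_empty
midx = pd.MultiIndex.from_product(
    [df3.index.levels[0], range(n_rows), df3.index.levels[2]]
)

# Step 3: reindex (missing combinations become NaN rows), order by name
# (descending) and block (ascending), then drop the helper level.
output = (
    df3.reindex(midx)
    .sort_index(level=[0, 2], ascending=[False, True])
    .droplevel(2)
)
print(output)

# Variation (not from the answer): build the target index in the wanted
# order, keeping names in order of first appearance instead of sorting.
ordered_idx = pd.MultiIndex.from_tuples(
    [
        (name, row, blk)
        for name in df3.index.get_level_values(0).unique()
        for blk in df3.index.levels[2]
        for row in range(n_rows)
    ]
)
output_ordered = df3.reindex(ordered_idx).droplevel(2)
```

Note that the descending sort on level 0 puts Jemima before Elena only because "Jemima" sorts after "Elena" alphabetically. With other names the original order is not guaranteed to survive the sort, which is what the variation at the end is meant to avoid.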

edited May 5, 2022 at 18:52

answered May 5, 2022 at 16:19

10 Comments

Manolo Dominguez Becerra Over a year ago

I get "ValueError: cannot handle a non-unique multi-index!" on the second line of your code.

Manolo Dominguez Becerra Over a year ago

Could it be a pandas version issue?

not_speshal Over a year ago

Are you using different data than in your sample? If yes, provide a sample which generates the error you mention. For reference, I'm on pandas 1.4.2 but I doubt that is the issue.

Manolo Dominguez Becerra Over a year ago

No, I am using the same data that I generated the dict from. How do you convert the dict into a pandas data frame?

not_speshal Over a year ago

pd.DataFrame(dictionary)?
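
The thread does not establish what caused the error. One situation in which pandas raises a ValueError on reindex is when the MultiIndex still contains duplicates, for example if the first step was skipped or its result was not assigned back to df. The sketch below uses made-up data and hypothetical names (`target`, `df3`, `padded`) to show that case, so treat it as a possible explanation rather than a diagnosis.

```python
import numpy as np
import pandas as pd

# Made-up frame whose two-level index contains each (name, row) pair twice.
idx = pd.MultiIndex.from_arrays([["Jemima"] * 4, [0, 1, 0, 1]])
df = pd.DataFrame({"H19": np.arange(4, dtype=float)}, index=idx)

target = pd.MultiIndex.from_product([["Jemima"], range(3)])

# Reindexing while the index still has duplicates is rejected by pandas.
try:
    df.reindex(target)
    raised = False
except ValueError as err:
    raised = True
    print(f"ValueError: {err}")

# With the cumcount level appended, the index is unique and reindex works.
df3 = df.set_index(df.groupby(level=[0, 1], sort=False).cumcount(), append=True)
target3 = pd.MultiIndex.from_product([["Jemima"], range(3), [0, 1]])
padded = df3.reindex(target3)
print(padded)
```

Checking `df.index.is_unique` right before the reindex call is a quick way to see whether this is what is happening.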
