
I have the following dataframe:

[['M', 'A', '0', '0.2', '0.2', '0.2'],
 [nan, nan, nan, '0.3', '0.3', '1'],
 [nan, nan, nan, '1.4', '3.2', '32'],
 [nan, nan, nan, nan, nan, nan],
 [nan, nan, nan, nan, nan, nan],
 ['sex', 'test', 'conc', 'sugar', 'flour', 'yeast'],
 ['M', 'A', '3', '1.2', '1.2', '1.2'],
 [nan, nan, nan, '1.3', '1.3', '2'],
 [nan, nan, nan, '2.4', '4.2', '33'],
 [nan, nan, nan, nan, nan, nan],
 ['sex', 'test', 'conc', 'sugar', 'flour', 'yeast'],
 ['M', 'A', '6', '2.2', '2.2', '2.2'],
 [nan, nan, nan, '2.3', '2.3', '3'],
 [nan, nan, nan, '3.4', '5.2', '34']]

I'd like to split it into multiple dataframes wherever a row is all NaN. I've tried the following code from the link below, and it seems to do what I want, but it returns a list of the splits. How do I get each split into its own individual dataframe, so that I end up with multiple dataframes?

SOF

import numpy as np

df_list = np.split(df, df[df.isnull().all(1)].index)
for part in df_list:
    print(part, '\n')

2 Answers


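If I understand correctly, you can build a mask of the all-NaN rows and group the remaining rows by the mask's cumulative sum: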

m = df.isna().all(axis=1)  # True for rows that are entirely NaN

dfs = [g for _, g in df[~m].groupby(m.cumsum())]

Output:

[     0    1    2    3    4    5
 0    M    A    0  0.2  0.2  0.2
 1  NaN  NaN  NaN  0.3  0.3    1
 2  NaN  NaN  NaN  1.4  3.2   32,
      0     1     2      3      4      5
 5  sex  test  conc  sugar  flour  yeast
 6    M     A     3    1.2    1.2    1.2
 7  NaN   NaN   NaN    1.3    1.3      2
 8  NaN   NaN   NaN    2.4    4.2     33,
       0     1     2      3      4      5
 10  sex  test  conc  sugar  flour  yeast
 11    M     A     6    2.2    2.2    2.2
 12  NaN   NaN   NaN    2.3    2.3      3
 13  NaN   NaN   NaN    3.4    5.2     34]

Getting individual dataframes:

dfs[0]

     0    1    2    3    4    5
0    M    A    0  0.2  0.2  0.2
1  NaN  NaN  NaN  0.3  0.3    1
2  NaN  NaN  NaN  1.4  3.2   32
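
For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the same mask-and-cumsum idea. It assumes the frame is built from the list shown in the question with default integer column names; the `named` dict and the `first`, `second`, `third` names are only illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

nan = np.nan
rows = [
    ['M', 'A', '0', '0.2', '0.2', '0.2'],
    [nan, nan, nan, '0.3', '0.3', '1'],
    [nan, nan, nan, '1.4', '3.2', '32'],
    [nan, nan, nan, nan, nan, nan],
    [nan, nan, nan, nan, nan, nan],
    ['sex', 'test', 'conc', 'sugar', 'flour', 'yeast'],
    ['M', 'A', '3', '1.2', '1.2', '1.2'],
    [nan, nan, nan, '1.3', '1.3', '2'],
    [nan, nan, nan, '2.4', '4.2', '33'],
    [nan, nan, nan, nan, nan, nan],
    ['sex', 'test', 'conc', 'sugar', 'flour', 'yeast'],
    ['M', 'A', '6', '2.2', '2.2', '2.2'],
    [nan, nan, nan, '2.3', '2.3', '3'],
    [nan, nan, nan, '3.4', '5.2', '34'],
]
df = pd.DataFrame(rows)

# True for rows that are entirely NaN (the separators)
m = df.isna().all(axis=1)

# m.cumsum() goes up by one at every separator row, so all rows
# between two separators share the same group label
dfs = [g for _, g in df[~m].groupby(m.cumsum())]

# Option 1: unpack into individual variables (works when the count is known)
first, second, third = dfs

# Option 2: keep them addressable by name, with a fresh 0-based index each
named = {f"df_{i}": g.reset_index(drop=True) for i, g in enumerate(dfs)}
```

Note that the second and third pieces still carry the repeated `sex, test, conc, ...` row as ordinary data; promoting it to column names would be a separate step.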

Here is one way to go about it:

dfs = []  # list to hold the DataFrames

# Code that you already have: split the DataFrame on the all-NaN rows
df_list = np.split(df, df[df.isnull().all(1)].index)

# Iterate over df_list and append each piece to dfs
for data in df_list:
    dfs.append(data)

dfs[0]
     0    1    2    3    4    5
0    M    A    0  0.2  0.2  0.2
1  NaN  NaN  NaN  0.3  0.3    1
2  NaN  NaN  NaN  1.4  3.2   32
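
One thing to watch with this approach: `np.split` cuts at the separator positions, so each all-NaN row stays in the piece that follows it (and two separators in a row produce a piece that is nothing but NaN). Calling `np.split` on a DataFrame may also raise a deprecation warning on recent pandas versions. The sketch below is an alternative under those assumptions, not the code from this answer: it slices with `iloc` between separator positions and drops the separators. The small two-column frame and the names `sep`, `bounds` and `chunks` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Small hypothetical frame: two blocks separated by one all-NaN row
df = pd.DataFrame([
    ['a', '1'],
    ['b', '2'],
    [np.nan, np.nan],
    ['c', '3'],
])

# Integer positions of the all-NaN separator rows
sep = np.flatnonzero(df.isna().all(axis=1).to_numpy())

# Slice between consecutive separators, skipping the separator rows themselves
bounds = [-1, *sep, len(df)]
chunks = [df.iloc[start + 1:stop] for start, stop in zip(bounds[:-1], bounds[1:])]

# Consecutive separators yield empty slices; discard them
chunks = [c for c in chunks if not c.empty]
```

After this, `chunks[0]`, `chunks[1]`, and so on are the individual dataframes, with no separator rows left in them.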
