
I am working with the following DataFrame:

import numpy as np
import pandas as pd

# np.nan is used instead of np.NaN, which was removed in NumPy 2.0
df1 = pd.DataFrame([
                    [1,np.nan,np.nan,np.nan,np.nan,np.nan],
                    [0.5,2,np.nan,np.nan,np.nan,np.nan],
                    [np.nan,1.5,3,np.nan,np.nan,np.nan],
                    [np.nan,np.nan,2.5,4,np.nan,np.nan],
                    [np.nan,np.nan,np.nan,3.5,5,5.5],
                    [np.nan,np.nan,np.nan,np.nan,6.2,6],
                    ], columns=['AA','BB','CC','DD', 'EE', 'FF'])

And as output I get:

DataFrame1_______
    AA   BB   CC   DD   EE   FF
0  1.0  NaN  NaN  NaN  NaN  NaN
1  0.5  2.0  NaN  NaN  NaN  NaN
2  NaN  1.5  3.0  NaN  NaN  NaN
3  NaN  NaN  2.5  4.0  NaN  NaN
4  NaN  NaN  NaN  3.5  5.0  5.5
5  NaN  NaN  NaN  NaN  6.2  6.0

I would like to know if there is a way to convert this DataFrame into another one without NaN values, such as:

new_DataFrame1______
    AA   BB   CC   DD   EE   FF
0  1.0  2.0  3.0  4.0  5.0  5.5
1  0.5  1.5  2.5  3.5  6.2  6.0

Basically, I would like to shift the non-NaN values of each column up so that they start at index 0, keeping their original order within the column.

Thanks in advance

  • pd.DataFrame(np.diag(df1)[None], columns=df1.columns) Commented Sep 10, 2020 at 6:50
  • I have edited the question, since it was only an example and I don't really have a diagonal DataFrame. Commented Sep 10, 2020 at 6:56
  • @FélixdelPradoHurtado - Then use another solution with bfill Commented Sep 10, 2020 at 7:01

2 Answers


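Use the justify function (defined in the answer linked in the code below) to push the non-missing values to the top of each column, then remove the rows that contain only missing values with DataFrame.dropna: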

#https://stackoverflow.com/a/44559180/2901002
df = pd.DataFrame(justify(df1.to_numpy(), invalid_val=np.nan, axis=0), 
                  columns=df1.columns).dropna(how='all')
print (df)
    AA   BB   CC   DD   EE   FF
0  1.0  2.0  3.0  4.0  5.0  5.5
1  0.5  1.5  2.5  3.5  6.2  6.0
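
Note that justify is not part of pandas or NumPy; it has to be copied from the linked answer before the snippet above will run. For readers who cannot follow the link, here is a minimal sketch of what such a helper might look like. It is written from the description of the technique (sort a boolean mask, then refill), and it assumes a 2D float array; the parameter names simply mirror the call used above and the body is not guaranteed to match the linked answer line for line.

```python
import numpy as np
import pandas as pd


def justify(a, invalid_val=0, axis=1, side="left"):
    """Push the valid entries of a 2D array to one side along `axis`.

    Sketch only: assumes `a` is a 2D array whose dtype can hold `invalid_val`.
    """
    # Mark the entries that should be kept.
    if isinstance(invalid_val, float) and np.isnan(invalid_val):
        mask = ~np.isnan(a)
    else:
        mask = a != invalid_val
    # Sorting a boolean mask puts False first and True last along `axis`.
    justified_mask = np.sort(mask, axis=axis)
    # For "up"/"left" the True slots must come first, so reverse the axis.
    if side in ("up", "left"):
        justified_mask = np.flip(justified_mask, axis=axis)
    out = np.full(a.shape, invalid_val, dtype=a.dtype)
    if axis == 1:
        out[justified_mask] = a[mask]
    else:
        # Work on the transposes so values are filled column by column.
        out.T[justified_mask.T] = a.T[mask.T]
    return out


df1 = pd.DataFrame([
    [1, np.nan, np.nan, np.nan, np.nan, np.nan],
    [0.5, 2, np.nan, np.nan, np.nan, np.nan],
    [np.nan, 1.5, 3, np.nan, np.nan, np.nan],
    [np.nan, np.nan, 2.5, 4, np.nan, np.nan],
    [np.nan, np.nan, np.nan, 3.5, 5, 5.5],
    [np.nan, np.nan, np.nan, np.nan, 6.2, 6],
], columns=['AA', 'BB', 'CC', 'DD', 'EE', 'FF'])

df = pd.DataFrame(justify(df1.to_numpy(), invalid_val=np.nan, axis=0),
                  columns=df1.columns).dropna(how='all')
print(df)
```

With axis=0 and the default side, the values move to the top of each column and keep their original order, so dropna(how='all') only has to discard the trailing all-NaN rows.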

Another solution:

df = pd.concat([df1[c].dropna().reset_index(drop=True) for c in df1.columns], axis=1)
print (df)
    AA   BB   CC   DD   EE   FF
0  1.0  2.0  3.0  4.0  5.0  5.5
1  0.5  1.5  2.5  3.5  6.2  6.0
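
One thing worth knowing about the concat approach: it gives a NaN-free result here only because every column has the same number of valid values. The sketch below uses a hypothetical frame called uneven (not from the question) to show what happens otherwise.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 'AA' has three valid values, 'BB' has only one.
uneven = pd.DataFrame({
    "AA": [1.0, np.nan, 2.0, 3.0],
    "BB": [np.nan, 4.0, np.nan, np.nan],
})

# Same pattern as above: drop NaN per column, renumber from 0, then align on the new index.
compact = pd.concat(
    [uneven[c].dropna().reset_index(drop=True) for c in uneven.columns],
    axis=1,
)
print(compact)
```

Because concat aligns the columns on their index, the shorter column is padded with NaN at the bottom, so the result has as many rows as the column with the most valid values.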

You can also use stack and groupby with dict comprehension:

print (pd.DataFrame({col:i.tolist() for col, i in df1.stack().groupby(level=1)}))

    AA   BB   CC   DD   EE   FF
0  1.0  2.0  3.0  4.0  5.0  5.5
1  0.5  1.5  2.5  3.5  6.2  6.0
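
The stack approach relies on stack() discarding the NaN entries, which is what the classic implementation does by default. As far as I know, the newer implementation (opt-in via future_stack=True since pandas 2.1) keeps them, so a slightly more defensive sketch calls dropna() explicitly. The variable names stacked and result are only for illustration.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame([
    [1, np.nan, np.nan, np.nan, np.nan, np.nan],
    [0.5, 2, np.nan, np.nan, np.nan, np.nan],
    [np.nan, 1.5, 3, np.nan, np.nan, np.nan],
    [np.nan, np.nan, 2.5, 4, np.nan, np.nan],
    [np.nan, np.nan, np.nan, 3.5, 5, 5.5],
    [np.nan, np.nan, np.nan, np.nan, 6.2, 6],
], columns=['AA', 'BB', 'CC', 'DD', 'EE', 'FF'])

# Long format: a Series indexed by (row label, column name).
# The explicit dropna() removes NaN regardless of which stack implementation is active.
stacked = df1.stack().dropna()

# Level 1 of the index holds the column names, so each group is one original column.
result = pd.DataFrame({col: vals.tolist() for col, vals in stacked.groupby(level=1)})
print(result)
```

Building a DataFrame from a dict of lists requires all lists to have the same length, so this variant raises a ValueError when the columns contain different numbers of valid values.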
