I have the following pandas DataFrame:
import pandas as pd
import numpy as np
data = [['Apple', 1, 1, 1, 1], ['Orange', np.nan, 1, 1, np.nan], ['Banana', 1, np.nan, 1, np.nan]]
df = pd.DataFrame(data, columns=['Type of fruit', 'Paris', 'Boston', 'Austin', 'New York'])
output:
  Type of fruit  Paris  Boston  Austin  New York
0         Apple    1.0     1.0       1       1.0
1        Orange    NaN     1.0       1       NaN
2        Banana    1.0     NaN       1       NaN
I would like to create a new column named "Location" from the four columns Paris, Boston, Austin and New York, with one row (and a fresh index) for each non-null fruit/city pair, like this:
Ideal output:
   Location  Type of fruit
0     Paris          Apple
1    Boston          Apple
2    Austin          Apple
3  New York          Apple
4    Boston         Orange
5    Austin         Orange
6     Paris         Banana
7    Austin         Banana
I could filter each location column to keep the non-null rows (example for Paris):
df_paris = df.loc[df["Paris"].notna(),["Type of fruit"]]
df_paris["Location"] = "Paris"
and then concatenate the dataframes for each location:
pd.concat([df_paris, df_boston, df_austin, df_new_york])
but I'm sure there is a better way to do this using built-in pandas functions.
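One candidate for this kind of wide-to-long reshaping might be `DataFrame.melt` followed by `dropna`. The sketch below is a minimal attempt under that assumption, not a confirmed best practice. The helper column name `present` is hypothetical, and `ignore_index=False` assumes pandas 1.1 or later. The stable sort on the original row index is only there to reproduce the row order of the ideal output.

```python
import numpy as np
import pandas as pd

data = [
    ["Apple", 1, 1, 1, 1],
    ["Orange", np.nan, 1, 1, np.nan],
    ["Banana", 1, np.nan, 1, np.nan],
]
df = pd.DataFrame(
    data, columns=["Type of fruit", "Paris", "Boston", "Austin", "New York"]
)

result = (
    # Turn the four city columns into (Location, present) pairs,
    # keeping the original row index so fruits can be regrouped later.
    df.melt(
        id_vars="Type of fruit",
        var_name="Location",
        value_name="present",  # hypothetical name for the 1/NaN flag
        ignore_index=False,
    )
    # Drop the fruit/city pairs that were NaN in the wide table.
    .dropna(subset=["present"])
    # Stable sort on the original row index: groups rows by fruit while
    # preserving the city column order within each fruit.
    .sort_index(kind="mergesort")
    .reset_index(drop=True)[["Location", "Type of fruit"]]
)

print(result)
```

If the row order does not matter, the `ignore_index=False` argument and the `sort_index` step can be dropped, which leaves the rows grouped by city instead of by fruit.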