Is there a convenient way of filling NaN values with (the first) values of an array or column?
Imagine the following DataFrame:
dfcolors = pd.DataFrame({'Colors': ['Blue', 'Red', np.nan, 'Green', np.nan, np.nan, 'Brown']})
Colors
0 Blue
1 Red
2 NaN
3 Green
4 NaN
5 NaN
6 Brown
I want to fill the NaN values with values from another DataFrame, or array, so:
dfalt = pd.DataFrame({'Alt': ['Cyan', 'Pink']})
Alt
0 Cyan
1 Pink
When there are more NaNs than fill values, some NaNs should remain. And when there are more fill values than NaNs, not all of them will be used. So we'll have to do some counting:
n_missing = len(dfcolors) - dfcolors.count().values[0]
n_fill = min(n_missing, len(dfalt))
n_fill is the number of values that can be filled.
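For reference, the counting step can also be written by counting the NaN entries directly. This is only a sketch of an equivalent way to get the same two numbers, assuming a pandas version that provides Series.isna():

```python
import numpy as np
import pandas as pd

dfcolors = pd.DataFrame({'Colors': ['Blue', 'Red', np.nan, 'Green', np.nan, np.nan, 'Brown']})
dfalt = pd.DataFrame({'Alt': ['Cyan', 'Pink']})

# Count the NaN entries directly instead of subtracting the non-null count.
n_missing = int(dfcolors['Colors'].isna().sum())

# Only as many values can be filled as both sides allow.
n_fill = min(n_missing, len(dfalt))

print(n_missing, n_fill)
```

With the example data there are three NaN values and two fill values, so n_fill is 2.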
Selecting the NaN values which can/should be filled can be done with:
dfcolors.Colors[pd.isnull(dfcolors.Colors)][:n_fill]
2 NaN
4 NaN
Name: Colors, dtype: object
Selecting the fill values:
dfalt.Alt[:n_fill]
0 Cyan
1 Pink
Name: Alt, dtype: object
And then I'm stuck at something like:
dfcolors.Colors[pd.isnull(dfcolors.Colors)][:n_fill] = dfalt.Alt[:n_fill]
Which doesn't work... Any tips would be great.
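A likely reason the assignment above has no effect: the boolean selection returns a new object, so the slice assignment writes into a temporary copy rather than into dfcolors. In addition, pandas aligns an assigned Series on index labels, and the labels of dfalt.Alt (0, 1) do not match the labels of the NaN rows (2, 4). The following is a minimal sketch of one possible workaround, not necessarily the most idiomatic one. It assumes the index of dfcolors has unique labels, and the name nan_labels is introduced here purely for illustration:

```python
import numpy as np
import pandas as pd

dfcolors = pd.DataFrame({'Colors': ['Blue', 'Red', np.nan, 'Green', np.nan, np.nan, 'Brown']})
dfalt = pd.DataFrame({'Alt': ['Cyan', 'Pink']})

# Index labels of the NaN rows, in top-to-bottom order.
nan_labels = dfcolors.index[dfcolors['Colors'].isna()]
n_fill = min(len(nan_labels), len(dfalt))

# A single .loc assignment writes into dfcolors itself (no intermediate copy).
# .to_numpy() drops dfalt's index, so values are placed by position, not by label.
dfcolors.loc[nan_labels[:n_fill], 'Colors'] = dfalt['Alt'].to_numpy()[:n_fill]

print(dfcolors)
```

Converting the fill values to a plain array is the key design choice here: without it, the mismatched labels would make pandas write NaN instead of the fill values.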
This is the output that I want:
Colors
0 Blue
1 Red
2 Cyan
3 Green
4 Pink
5 NaN
6 Brown
NaN values are filled from top to bottom, and the fill values are also selected from top to bottom if there are more fill values than NaNs.
I've also tried .values or even wrapping it in a new DataFrame. No luck so far.
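To make the intended top-to-bottom rule concrete for both cases (more NaNs than fill values, and more fill values than NaNs), here is a small sketch that works by position rather than by label. The helper name fill_na_in_order and the extra colors 'Teal' and 'Gold' are made up for this illustration and are not part of pandas or of the data above:

```python
import numpy as np
import pandas as pd

def fill_na_in_order(target, fill_values):
    """Return a copy of the Series `target` whose NaNs are replaced, top to
    bottom, by the leading entries of `fill_values` (hypothetical helper)."""
    result = target.copy()
    # Integer positions of the NaN entries, in top-to-bottom order.
    positions = np.flatnonzero(result.isna().to_numpy())
    n_fill = min(len(positions), len(fill_values))
    # Positional assignment: no index alignment is involved.
    result.iloc[positions[:n_fill]] = np.asarray(fill_values, dtype=object)[:n_fill]
    return result

colors = pd.Series(['Blue', 'Red', np.nan, 'Green', np.nan, np.nan, 'Brown'], name='Colors')

# More NaNs than fill values: the last NaN remains.
few = fill_na_in_order(colors, ['Cyan', 'Pink'])

# More fill values than NaNs: the surplus value 'Gold' is never used.
many = fill_na_in_order(colors, ['Cyan', 'Pink', 'Teal', 'Gold'])

print(few)
print(many)
```

Because the helper works on a copy and uses integer positions, the original Series stays untouched and the index labels do not need to be unique.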