Pandas fillna with list/array

Question

Is there a convenient way of filling na values with (the first) values of an array or column?

Imagine the following DataFrame:

dfcolors = pd.DataFrame({'Colors': ['Blue', 'Red', np.nan, 'Green', np.nan, np.nan, 'Brown']})

  Colors
0   Blue
1    Red
2    NaN
3  Green
4    NaN
5    NaN
6  Brown

I want to fill the NaN values with values from another DataFrame, or array, so:

dfalt = pd.DataFrame({'Alt': ['Cyan', 'Pink']})

    Alt
0  Cyan
1  Pink

When there are more NaNs than fill values, some NaNs should remain. And when there are more fill values than NaNs, not all of them will be used. So we'll have to do some counting:

n_missing = len(dfcolors) - dfcolors.count().values[0]
n_fill = min(n_missing, len(dfalt))

n_fill is the number of values that can be filled.

Selecting the NaN values which can/should be filled can be done with:

dfcolors.Colors[pd.isnull(dfcolors.Colors)][:n_fill]

2    NaN
4    NaN
Name: Colors, dtype: object

Selecting the fill values

dfalt.Alt[:n_fill]

0    Cyan
1    Pink
Name: Alt, dtype: object

And then I'm stuck at something like:

dfcolors.Colors[pd.isnull(dfcolors.Colors)][:n_fill] = dfalt.Alt[:n_fill]

Which doesn't work... Any tips would be great.

This is the output that I want:

  Colors
0   Blue
1    Red
2   Cyan
3  Green
4   Pink
5    NaN
6  Brown

NaN values are filled from top to bottom, and the fill values are also taken from top to bottom if there are more fill values than NaNs.

It's returning view vs copy (fancy indexing always returns a copy)... hmm
– Andy Hayden, Commented Jul 9, 2013 at 10:00
Yes, I think that's the main issue. I have tried all kinds of things, like adding .values or even wrapping it in a new DataFrame. No luck so far.
– Rutger Kassies, Commented Jul 9, 2013 at 10:03
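
The view-versus-copy point in the comments can be checked with a small sketch. The names `selected` and `n_still_missing` are illustrative only and do not appear in the thread. It assumes the same `dfcolors` frame as the question and shows that writing into the result of a boolean-mask selection leaves the original frame untouched:

```python
import numpy as np
import pandas as pd

dfcolors = pd.DataFrame({'Colors': ['Blue', 'Red', np.nan, 'Green', np.nan, np.nan, 'Brown']})

# Boolean-mask selection builds a new Series holding copies of the selected rows.
selected = dfcolors['Colors'][dfcolors['Colors'].isnull()]

# Writing into that new object changes only the copy...
selected.iloc[:2] = ['Cyan', 'Pink']

# ...so the original frame still has all of its missing values.
n_still_missing = int(dfcolors['Colors'].isnull().sum())
print(n_still_missing)
```

This is why the chained form in the question appears to do nothing: the assignment lands on a temporary object rather than on `dfcolors`.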

Andy Hayden · Accepted Answer · 2013-07-09 10:16:22Z

3

This is rather awful, but iterating over the index of the nulls works:

In [11]: nulls = dfcolors[pd.isnull(dfcolors['Colors'])]

In [12]: for i, ni in enumerate(nulls.index[:len(dfalt)]):
   ....:     dfcolors.loc[ni, 'Colors'] = dfalt['Alt'].iloc[i]

In [13]: dfcolors
Out[13]:
  Colors
0   Blue
1    Red
2   Cyan
3  Green
4   Pink
5    NaN
6  Brown
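
As a possible alternative to the loop, the same result can probably be reached with a single `.loc` assignment. This is only a sketch, not part of the original answer. The name `null_idx` is introduced here for illustration, and the code assumes the frame has a unique index:

```python
import numpy as np
import pandas as pd

dfcolors = pd.DataFrame({'Colors': ['Blue', 'Red', np.nan, 'Green', np.nan, np.nan, 'Brown']})
dfalt = pd.DataFrame({'Alt': ['Cyan', 'Pink']})

# Index labels of the missing entries, in top-to-bottom order.
null_idx = dfcolors.index[dfcolors['Colors'].isnull()]

# Only as many cells can be filled as there are NaNs and fill values.
n_fill = min(len(null_idx), len(dfalt))

# .to_numpy() drops dfalt's index, so values are assigned by position
# instead of being aligned on index labels.
dfcolors.loc[null_idx[:n_fill], 'Colors'] = dfalt['Alt'].to_numpy()[:n_fill]
print(dfcolors)
```

Passing a plain array on the right-hand side matters here: assigning a Series would make pandas align on labels 0 and 1 rather than on the positions of the NaNs.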


joente · Accepted Answer · 2013-07-09 10:23:53Z

3

You could use a generator. That way you could write something like this:

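import numpy as np
import pandas as pd

dfcolors = pd.DataFrame({'Colors': ['Blue', 'Red', np.nan, 'Green', np.nan, np.nan, 'Brown']})
dfalt = pd.DataFrame({'Alt': ['Cyan', 'Pink']})

# Generator that hands out the fill values one at a time, top to bottom
gen_alt = (alt for alt in dfalt.Alt)

for label, color in dfcolors.Colors.items():
    if not pd.isnull(color):
        continue
    try:
        # Assign through .loc on the frame itself to avoid chained indexing
        dfcolors.loc[label, 'Colors'] = next(gen_alt)
    except StopIteration:
        # No fill values left; the remaining NaNs stay as they are
        break
print(dfcolors)
#   Colors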
# 0   Blue
# 1    Red
# 2   Cyan
# 3  Green
# 4   Pink
# 5    NaN
# 6  Brown
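
A variation on the same idea, offered as a sketch only: `zip` stops at the shorter of its inputs, so it may remove the need for the `try`/`except StopIteration` block. The helper name `fill_na_in_order` is hypothetical, and the code assumes a unique index. It returns a filled copy and covers both cases from the question (more NaNs than fill values, and more fill values than NaNs):

```python
import numpy as np
import pandas as pd


def fill_na_in_order(df, column, fill_values):
    """Hypothetical helper: fill NaNs in `column` top to bottom from `fill_values`."""
    out = df.copy()
    # Labels of the missing entries, in top-to-bottom order.
    null_labels = out.index[out[column].isnull()]
    # zip ends as soon as either the NaN labels or the fill values run out.
    for label, value in zip(null_labels, fill_values):
        out.loc[label, column] = value
    return out


dfcolors = pd.DataFrame({'Colors': ['Blue', 'Red', np.nan, 'Green', np.nan, np.nan, 'Brown']})

# More fill values than NaNs: the surplus value 'Teal' is never used.
more = fill_na_in_order(dfcolors, 'Colors', ['Cyan', 'Pink', 'Gold', 'Teal'])

# Fewer fill values than NaNs: the last two NaNs remain.
fewer = fill_na_in_order(dfcolors, 'Colors', ['Cyan'])

print(more)
print(fewer)
```

Working on a copy keeps the input frame unchanged, which makes it easier to compare the result against the original.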
