Replacing zero values in dataframe using another dataframe

Question

I would like to replace some of the values in the following dataframe:

dataframe_a

Y2000   Y2001   Y2002    Y2003    Y2004    Item    Item Code
34        43      0      0          25     Test      Val

I would like to replace the zero values in the year columns with the corresponding values from this dataframe, multiplied by a scalar (say 0.5):

dataframe_b

Y2000   Y2001   Y2002    Y2003    Y2004    Item    Item Code
34        43      10      20        25     Test      Val

So, in dataframe_a, the value for column Y2002 should be 10 * 0.5 and the value for column Y2003 should be 20 * 0.5.

Currently, I am doing this:

df = dataframe_a[dataframe_a == 0]
df = df * dataframe_b * 0.5

However, I am not sure how to update dataframe_a with the new values.

What about dataframe.replace()? pandas.pydata.org/pandas-docs/stable/generated/…
– Richard, Commented Sep 22, 2015 at 18:28

EdChum · Accepted Answer · 2015-09-22 19:00:19Z

2

You can use the boolean mask and then call fillna:

In [58]:
fill = df2.select_dtypes(include = [np.number]) * 0.5
df1 = df1[df1!=0].fillna(fill)
df1

Out[58]:
   Y2000  Y2001  Y2002  Y2003  Y2004  Item Item Code
0     34     43      5     10     25  Test       Val

Here df1[df1 != 0] produces a DataFrame of the same shape, with NaN wherever the condition is not met. You can then call fillna on it and pass the other DataFrame, which replaces the NaN values wherever the index and columns align.

The result of the boolean mask:

In [63]:
df1[df1!=0]

Out[63]:
   Y2000  Y2001  Y2002  Y2003  Y2004  Item Item Code
0     34     43    NaN    NaN     25  Test       Val
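
For reference, the mask-then-fillna approach can be reproduced end to end on the data from the question. This is a minimal sketch, not the answerer's exact session: the frames are rebuilt by hand, and the names `year_cols`, `masked`, and `result` are illustrative. It restricts the mask to the year columns and casts them to float first, on the assumption that this avoids dtype surprises when NaN is introduced.

```python
import pandas as pd

# Frames rebuilt by hand from the question (df1 = dataframe_a, df2 = dataframe_b).
df1 = pd.DataFrame({"Y2000": [34], "Y2001": [43], "Y2002": [0], "Y2003": [0],
                    "Y2004": [25], "Item": ["Test"], "Item Code": ["Val"]})
df2 = pd.DataFrame({"Y2000": [34], "Y2001": [43], "Y2002": [10], "Y2003": [20],
                    "Y2004": [25], "Item": ["Test"], "Item Code": ["Val"]})

# Illustrative name: the numeric year columns only.
year_cols = ["Y2000", "Y2001", "Y2002", "Y2003", "Y2004"]

# Values to substitute: df2 scaled by 0.5.
fill = df2[year_cols] * 0.5

# Boolean mask: zeros become NaN, every other value is kept.
masked = df1[year_cols].astype(float)
masked = masked[masked != 0]

# fillna aligns on index and columns, so each NaN takes the matching cell of fill.
result = df1.copy()
result[year_cols] = masked.fillna(fill)
print(result)
```

Working on a copy leaves df1 untouched, which makes it easy to compare the frame before and after the substitution.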

edited Sep 22, 2015 at 19:00

answered Sep 22, 2015 at 18:35


3 Comments

Leb Over a year ago

That replaces the NaN values by the ones from df2, not by a scalar multiplier as he asked.

Leb Over a year ago

No idea, hope that'll help

user308827 Over a year ago

thanks @EdChum, excellent explanation as usual. I didn't downvote :)

Leb · Answer · 2015-09-22 18:57:45Z

2

A generic one, in case you don't know the location of the 0 value:

new_df = 0.5*df2[df==0]
new_df.fillna(df, inplace=True)
print(new_df)

    0   1  2  3   4     5    6
0  34  43  5  5  25  Test  Val

Here, dataframe_a is df and dataframe_b is df2.

edited Sep 22, 2015 at 18:57

answered Sep 22, 2015 at 18:52


Community · Answer · 2017-05-23 10:27:07Z

1

import pandas as pd
import numpy as np

# Series is not imported on its own, so refer to it as pd.Series
s = pd.Series(np.random.randn(5), index=['a', 'b', 'c', 'd', 'e'])
d = {'one': pd.Series([1., 2., 3.], index=['a', 'b', 'c']),
     'two': pd.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
df
# replace() returns a new DataFrame, so assign the result back
df = df.replace(1, 12*4)  # replace all values equal to 1 with 12*4
df

Reference on replace(): "Replace all occurrences of a string in a pandas dataframe (Python)"

edited May 23, 2017 at 10:27


answered Sep 22, 2015 at 18:37

Richard


hilberts_drinking_problem · Answer · 2015-09-22 18:48:39Z

1

dataframe_a[dataframe_a == 0] = 0.5 * dataframe_b[dataframe_a == 0]
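
A slightly more defensive variant of the same masked assignment, sketched on the question's data. The frames are rebuilt by hand and the names `year_cols`, `a_num`, and `zero_mask` are illustrative. The one-liner above operates on the whole frame; this sketch assumes it is safer to limit the assignment to the numeric year columns and to cast them to float first, so that a value such as 0.5 * 5 = 2.5 could be stored without an integer dtype getting in the way.

```python
import pandas as pd

# Frames rebuilt by hand from the question.
dataframe_a = pd.DataFrame({"Y2000": [34], "Y2001": [43], "Y2002": [0], "Y2003": [0],
                            "Y2004": [25], "Item": ["Test"], "Item Code": ["Val"]})
dataframe_b = pd.DataFrame({"Y2000": [34], "Y2001": [43], "Y2002": [10], "Y2003": [20],
                            "Y2004": [25], "Item": ["Test"], "Item Code": ["Val"]})

# Illustrative name: the numeric year columns only.
year_cols = ["Y2000", "Y2001", "Y2002", "Y2003", "Y2004"]

# Work on a float copy of the year columns.
a_num = dataframe_a[year_cols].astype(float)
zero_mask = a_num == 0

# Assign only where the mask is True; the right-hand side aligns on index and columns.
a_num[zero_mask] = 0.5 * dataframe_b[year_cols]

# Write the updated year columns back into the original frame.
dataframe_a[year_cols] = a_num
print(dataframe_a)
```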

edited Sep 22, 2015 at 18:48

answered Sep 22, 2015 at 18:42


wwii · Answer · 2015-09-24 02:08:36Z

pandas.DataFrame.where might be what you need. You would have to construct another dataframe with the specific column values that you want to substitute.

I don't have Pandas installed here so I can't show a dataframe example - but it works similarly with numpy arrays.

>>> a
array([1, 2, 0, 3, 4, 0, 5])
>>> subst
array([10, 20, 30, 40, 50, 60, 70])
>>> k = -.5
>>> np.where(a == 0, subst * k, a)
array([  1.,   2., -15.,   3.,   4., -30.,   5.])
>>>

One difference with the dataframe is that it can do an in-place substitution and you only have to specify the other dataframe (the one with the substitute values).

Finally a Pandas example:

>>> 
>>> df
   d  e  f
a  0  1  1
b  1  1  0
c  1  0  1
>>> s
    d   e   f
a  10  20  30
b  10  20  30
c  10  20  30
>>> k = -.5
>>> df.where(df != 0, other = s * k)
   d   e   f
a -5   1   1
b  1   1 -15
c  1 -10   1
>>> 
>>> df.where(df != 0, other = s * k, inplace = True)
>>> df
   d   e   f
a -5   1   1
b  1   1 -15
c  1 -10   1
>>>

Some examples from the pydata site.
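
Applied to the frames from the question, `DataFrame.where` might look like the sketch below. The frames are rebuilt by hand, and `num_cols`, `replaced`, and `updated` are illustrative names. Selecting the numeric columns with `select_dtypes` is an assumption made here so that the `Item` and `Item Code` string columns are left out of the comparison.

```python
import pandas as pd

# Frames rebuilt by hand from the question.
dataframe_a = pd.DataFrame({"Y2000": [34], "Y2001": [43], "Y2002": [0], "Y2003": [0],
                            "Y2004": [25], "Item": ["Test"], "Item Code": ["Val"]})
dataframe_b = pd.DataFrame({"Y2000": [34], "Y2001": [43], "Y2002": [10], "Y2003": [20],
                            "Y2004": [25], "Item": ["Test"], "Item Code": ["Val"]})

# Pick out the numeric columns so the string columns are not compared with 0.
num_cols = dataframe_a.select_dtypes(include="number").columns
a_num = dataframe_a[num_cols].astype(float)

# where keeps values for which the condition holds and takes `other` elsewhere.
replaced = a_num.where(a_num != 0, other=dataframe_b[num_cols] * 0.5)

# Build the updated frame without modifying dataframe_a.
updated = dataframe_a.copy()
updated[num_cols] = replaced
print(updated)
```

Because `where` returns a new object by default, the original frame stays available for comparison.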
