4

Ok, my real problem is bigger than this, but I have a simple working example.

>>> import pandas as pd
>>> import numpy as np
>>> a = pd.DataFrame(np.array([[2, 1990], [4,1991], [5,1992]]), \
...                  index=[1,2,3], columns=['var', 'yr'])
>>> a
   var    yr
1    2  1990
2    4  1991
3    5  1992
>>> b = pd.DataFrame(index=a.index, columns=['new_var'])
>>> b
  new_var
1     NaN
2     NaN
3     NaN
>>> b[a.yr<1992].loc[:, 'new_var'] = a[a.yr<1992].loc[:, 'var']
>>> b
  new_var
1     NaN
2     NaN
3     NaN

I desire the following output:

>>> b
  new_var
1       2
2       4
3     NaN

3 Answers

3

Boolean filtering such as b[a.yr<1992] returns a copy of the selected rows, so the chained .loc assignment writes to that temporary copy and b itself is never modified.

Do this instead:

b.loc[a.yr<1992, 'new_var'] = a['var']
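For anyone who wants to see the difference side by side, here is a minimal sketch that rebuilds the frames from the question and tries both forms. The names mask and b_chained are just illustrative, and the warning filter is only there because the chained form may emit a warning on some pandas versions.

```python
import warnings

import numpy as np
import pandas as pd

# Rebuild the frames from the question.
a = pd.DataFrame(np.array([[2, 1990], [4, 1991], [5, 1992]]),
                 index=[1, 2, 3], columns=['var', 'yr'])
b = pd.DataFrame(index=a.index, columns=['new_var'])
b_chained = b.copy()  # hypothetical second frame, used only for the comparison

mask = a['yr'] < 1992  # boolean Series on the same index as b

# Chained form from the question: b_chained[mask] is a temporary copy,
# so the assignment never reaches b_chained itself.
with warnings.catch_warnings():
    warnings.simplefilter('ignore')
    b_chained[mask].loc[:, 'new_var'] = a[mask].loc[:, 'var']

# Single .loc call: rows and column are selected in one step on b,
# and the right-hand Series is aligned to b on the index.
b.loc[mask, 'new_var'] = a['var']

print(b_chained)  # still all NaN
print(b)          # rows 1 and 2 filled, row 3 left as NaN
```

The row for 1992 stays NaN because the mask excludes it from the assignment, even though a['var'] has a value for that index label.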

2 Comments

This is a good answer! However, the slice on a is unnecessary. This will suffice: b.loc[a.yr<1992, 'new_var'] = a['var']. pandas will handle the alignment for you. +1 from me.
Cool. Yeah, Pandas seems to be pretty good at being reasonably concise.
1

You can also use assign + query, which may read more intuitively:

b.assign(new_var=a.query('yr < 1992')['var'])

   new_var
1      2.0
2      4.0
3      NaN

This returns the dataframe you'd want. You'll have to assign it back to b if you want it to persist.
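A small sketch of the "assign it back" step, assuming the same a and b as in the question. Note that the result column comes out as float (2.0, 4.0, NaN), matching the output shown above, because the filtered Series is reindexed to b's index.

```python
import numpy as np
import pandas as pd

# Same frames as in the question.
a = pd.DataFrame(np.array([[2, 1990], [4, 1991], [5, 1992]]),
                 index=[1, 2, 3], columns=['var', 'yr'])
b = pd.DataFrame(index=a.index, columns=['new_var'])

# assign returns a new DataFrame and leaves the original untouched,
# so rebind the name b to keep the result.
b = b.assign(new_var=a.query('yr < 1992')['var'])

print(b)
```

Rows that the query filters out have no matching index label in the filtered Series, so they end up as NaN after alignment.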

2 Comments

This is a rather unusual use case for assign + query ;-)
@MaxU I'm always trying to push on the edges.
0

Yet another "creative" solution:

In [181]: b['new_var'] = np.where(a.yr < 1992, a['var'], b['new_var'])

In [182]: b
Out[182]:
  new_var
1       2
2       4
3     NaN
