How to shift values in a DataFrame using pandas?

Question

I have data like this, without z1. I need to add a column z1 to the DataFrame that holds the values shown in the example: for each row, z1 should be the z value from one day earlier for the same Start value.

I was thinking it could be done with apply and a lambda in pandas, but I'm not sure how to define the lambda function.

data = pd.read_csv("....")

data["Z"] = data[[
                "Start", "Z"]].apply(lambda x:

Why 564545 in the last row? Isn't it supposed to be 56? If you want the z value from one day before for the same Start, it would correspond to 32400000 2012-10-02 (row 7) instead of 32400000 2012-10-01 (row 2).
– dot.Py, Commented Jul 18, 2016 at 18:19

jezrael · Accepted Answer · 2016-07-18 20:44:05Z

You can use DataFrameGroupBy.shift with merge:

# convert the column first if it is not datetime yet
df['date'] = pd.to_datetime(df.date)
df.set_index('date', inplace=True)
# shift z one day forward within each start group
df1 = df.groupby('start')['z'].shift(freq='1D', periods=1).reset_index()
print(pd.merge(df.reset_index(), df1, on=['start', 'date'], how='left', suffixes=('', '1')))

        date  start       z        z1
0 2012-12-01    324  564545       NaN
1 2012-12-01    384    5555       NaN
2 2012-12-01    349     554       NaN
3 2012-12-02    855     635       NaN
4 2012-12-02    324      56  564545.0
5 2012-12-01    341      98       NaN
6 2012-12-03    324     888      56.0
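
For readers who want to check the result without relying on `shift(freq=...)` on a groupby, here is a minimal sketch of the same idea using a plain self-merge. The sample frame is retyped from the output above, and the names `prev` and `result` are only illustrative. It is an alternative formulation, not the code from the answer.

```python
import pandas as pd

# Sample data retyped from the printed output above.
df = pd.DataFrame({
    "date": ["2012-12-01", "2012-12-01", "2012-12-01", "2012-12-02",
             "2012-12-02", "2012-12-01", "2012-12-03"],
    "start": [324, 384, 349, 855, 324, 341, 324],
    "z": [564545, 5555, 554, 635, 56, 98, 888],
})
df["date"] = pd.to_datetime(df["date"])

# Copy the rows and move each date one day forward, so that a row's z
# lines up with the following day for the same start value.
prev = df[["date", "start", "z"]].copy()
prev["date"] = prev["date"] + pd.Timedelta(days=1)
prev = prev.rename(columns={"z": "z1"})

# A left merge keeps every original row; rows without a previous day get NaN.
result = df.merge(prev, on=["start", "date"], how="left")
print(result)
```

With this sample the z1 column should match the table above: 564545.0 for start 324 on 2012-12-02, 56.0 for start 324 on 2012-12-03, and NaN elsewhere.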

EDIT:

Try finding the duplicated start values and filling their NaN with 0:

df['date'] = pd.to_datetime(df.date)
df.set_index('date', inplace=True)
df1 = df.groupby('start')['z'].shift(freq='1D', periods=1).reset_index()
df2 = pd.merge(df.reset_index(), df1, on=['start', 'date'], how='left', suffixes=('', '1'))
# rows whose start value occurs more than once
mask = df2.start.duplicated(keep=False)
# .ix was removed from pandas; .loc does the same boolean selection here
df2.loc[mask, 'z1'] = df2.loc[mask, 'z1'].fillna(0)
print(df2)
        date  start       z        z1
0 2012-12-01    324  564545       0.0
1 2012-12-01    384    5555       NaN
2 2012-12-01    349     554       NaN
3 2012-12-02    855     635       NaN
4 2012-12-02    324      56  564545.0
5 2012-12-01    341      98       NaN
6 2012-12-03    324     888      56.0

edited Jul 18, 2016 at 20:44

answered Jul 18, 2016 at 18:44


15 Comments

Khrystyna Kosenko Over a year ago

That's great, thanks! But how come with a different data set I get NotImplementedError: Not supported for type Index?

jezrael Over a year ago

It looks like you forgot to create the DatetimeIndex with df.set_index('date', inplace=True).

jezrael Over a year ago

What does print(df.index) show before df1 = df.groupby('start')['z'].shift(freq='1D',periods=1).reset_index()?

Khrystyna Kosenko Over a year ago

I tried df['date']= pd.to_datetime(pd.Series(['date']), format="%Y-%m-%d")

Khrystyna Kosenko Over a year ago

It tells me that the date doesn't match the format. I checked the data itself and there is nothing wrong with it.
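
A likely cause of that format error, sketched under the assumption that the data really has a column named date: `pd.Series(['date'])` builds a one-element Series containing the literal text "date" and not the column's values, so there is nothing parseable in it. The two-row frame below is hypothetical sample data.

```python
import pandas as pd

# Hypothetical sample data with dates stored as strings.
df = pd.DataFrame({"date": ["2012-12-01", "2012-12-02"], "z": [1, 2]})

# pd.Series(["date"]) holds the literal string "date", so parsing fails.
try:
    pd.to_datetime(pd.Series(["date"]), format="%Y-%m-%d")
    failed = False
except ValueError:
    failed = True

# Passing the column itself parses the actual values.
df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d")
df = df.set_index("date")

print(failed)
print(type(df.index).__name__)
```

After the conversion the index is a DatetimeIndex, which is what the `shift(freq='1D')` call in the answer expects.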
