
I have a DataFrame:

import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.random.rand(10, 3),
    columns='sensor_id|unix_timestamp|value'.split('|'))

I want to add 5 more columns, each one a copy of the value column shifted by 0 to 4 rows.

   sensor_id  unix_timestamp     value  value_shift_0  value_shift_1  value_shift_2  value_shift_3  value_shift_4
0   0.901001        0.036683  0.945908            NaN            NaN            NaN            NaN            NaN
1   0.751759        0.038600  0.117308            NaN            NaN            NaN            NaN            NaN
2   0.737604        0.484417  0.602733            NaN            NaN            NaN            NaN            NaN
3   0.259865        0.522115  0.074188            NaN            NaN            NaN            NaN            NaN
4   0.932359        0.662560  0.648445            NaN            NaN            NaN            NaN            NaN
5   0.114668        0.066766  0.285553            NaN            NaN            NaN            NaN            NaN
6   0.795851        0.565259  0.888404            NaN            NaN            NaN            NaN            NaN
7   0.082534        0.355506  0.671816            NaN            NaN            NaN            NaN            NaN
8   0.336648        0.651789  0.859373            NaN            NaN            NaN            NaN            NaN
9   0.917073        0.842281  0.458542            NaN            NaN            NaN            NaN            NaN

But I don't know how to fill the new columns with the appropriately shifted values.

  • Where do the columns come from: from another dataframe? From lists? From Series? Commented Jan 7, 2017 at 1:00

1 Answer


Use pd.concat with a dictionary comprehension to build the shifted columns, then join them back onto df

df.join(
    pd.concat(
        {'value_shift_{}'.format(i): df.value.shift(i) for i in range(5)},
        axis=1))
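As a rough illustration of what this produces, here is the same pattern on a small hand-written frame (the four sample values and the choice of three shifts are assumptions made for readability, not part of the question's data). Each `shift(i)` moves the values down by `i` rows and leaves NaN at the top, so `value_shift_0` is just a copy of `value`.

```python
import numpy as np
import pandas as pd

# Small deterministic frame so the shifted columns are easy to read.
df = pd.DataFrame({'value': [10.0, 20.0, 30.0, 40.0]})

# One shifted Series per dictionary key; the keys become column names.
shifted = pd.concat(
    {'value_shift_{}'.format(i): df['value'].shift(i) for i in range(3)},
    axis=1)

# join aligns on the index, so the new columns line up row by row.
result = df.join(shifted)
print(result)
# value_shift_1 holds NaN, 10, 20, 30; value_shift_2 holds NaN, NaN, 10, 20.
```

Note that `join` returns a new DataFrame; assign it back (`df = df.join(...)`) if the extra columns should persist.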



alternative with numpy

def multi_shift(s, n):
    a = np.arange(len(s))
    # i[r, k] = r - k: the source row for output row r under shift k
    i = (a[:, None] - np.arange(n)).ravel()
    e = np.empty(i.shape)
    e.fill(np.nan)
    # negative indices have no source row, so those cells stay NaN
    w = np.where(i >= 0)
    e[w] = s.values[i[w]]

    return pd.DataFrame(e.reshape(len(s), n),
                        s.index, ['value_shift_%i' % k for k in range(n)])

df.join(multi_shift(df.value, 5))
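The index arithmetic behind a function like `multi_shift` may be easier to follow on a tiny input. The sketch below is only an illustration: the sample Series, `n = 3`, and the variable names are assumptions. It builds the grid of source rows, fills the valid cells, and checks the result against plain `Series.shift`.

```python
import numpy as np
import pandas as pd

s = pd.Series([10.0, 20.0, 30.0, 40.0], name='value')
n = 3  # number of shifts (0, 1, 2); chosen for illustration

# idx[r, k] = r - k is the row that shift k reads from for output row r.
idx = np.arange(len(s))[:, None] - np.arange(n)
# idx is:
# [[ 0 -1 -2]
#  [ 1  0 -1]
#  [ 2  1  0]
#  [ 3  2  1]]

# Negative entries have no source row, so those cells stay NaN.
out = np.full(idx.shape, np.nan)
valid = idx >= 0
out[valid] = s.to_numpy()[idx[valid]]

cols = ['value_shift_%d' % k for k in range(n)]
fast = pd.DataFrame(out, index=s.index, columns=cols)

# Reference result built with Series.shift, for comparison.
reference = pd.concat({c: s.shift(k) for k, c in enumerate(cols)}, axis=1)
same = np.allclose(fast.to_numpy(), reference.to_numpy(), equal_nan=True)
print(same)
```

Broadcasting the column vector of row numbers against the row vector of shift amounts produces all source indices at once, which is what lets the numpy version avoid a Python-level loop over shifts.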

