0

I have a Pandas DataFrame that contains half hourly data. I want to shift a small range of datetime indices by 1 hour:

import numpy as np
import pandas as pd
from datetime import datetime, date, timedelta

df = pd.DataFrame()
df['datetime'] = pd.date_range(start='2016-01-01', end='2019-01-01', freq='30T')
df = df.set_index('datetime')

print(df.index)

# shift 1 hour forward for custom date range
date1 = '2016-01-01 00:00:00'
date2 = '2016-03-02 23:49:50'
df[(df.index>=date1) & (df.index<=date2)].index += pd.Timedelta('1H')
#df.index = df[(df.index>=date1) & (df.index<=date2)].index.shift(periods=1,freq='H')

print(df.index)

I tried some different strategies (which don't work) but I want to know if there is a more compact way to do this. Note that this MWE has freq=30T but my actual data has freq=None.

2 Answers 2

4

The expression df[mask] generally produces a weakly linked copy of the original DataFrame containing only the filtered values. Meaning the expression df[mask].index creates a copy, += pd.Timedelta('1H') updates the index of that copy and then throws the new DataFrame away since there is no variable reference.

One approach here would be to use values to update the underlying values (since the "Index does not support mutable operations"):

# shift 1 hour forward for custom date range
date1 = '2016-01-01 00:00:00'
date2 = '2016-03-02 23:49:50'
df.index.values[(date1 <= df.index) & (df.index <= date2)] += pd.Timedelta('1H')

df:

                         0
datetime                  
2016-01-01 01:00:00      0  # Now starts at 01:00:00
2016-01-01 01:30:00      1
2016-01-01 02:00:00      2
2016-01-01 02:30:00      3
2016-01-01 03:00:00      4

In terms of a more concise method there aren't really any options when working with the index.

np.where works but is not necessarily more concise:

date1 = '2016-01-01 00:00:00'
date2 = '2016-03-02 23:49:50'
df.index = np.where((date1 <= df.index) & (df.index <= date2),
                    df.index + pd.Timedelta('1H'),
                    df.index)
df.index.name = 'datetime'  # add index name back if needed

Converting to_series then using between is also an option, but requires making a complete copy of the index which is likely unnecessary:

date1 = '2016-01-01 00:00:00'
date2 = '2016-03-02 23:49:50'
df.index.values[
    df.index.to_series().between(date1, date2)
] += pd.Timedelta('1H')

Index.shift would also work if using the mask on both sides of the equation, but this requires masking the index twice:

date1 = '2016-01-01 00:00:00'
date2 = '2016-03-02 23:49:50'
m = (date1 <= df.index) & (df.index <= date2)
df.index.values[m] = df.index[m].shift(freq='1H')
Sign up to request clarification or add additional context in comments.

Comments

0

You can probably use pd.index.shift

From the Pandas documentation:

Notes

This method is only implemented for datetime-like index classes, i.e., DatetimeIndex, PeriodIndex and TimedeltaIndex.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.