
I have a DataFrame object representing a time series indexed in minute resolution. There are segments in which there are no entries, which have been identified as follows:

missing = pd.date_range(df.index.min(), df.index.max(), freq='1min').difference(df.index)

Doing a quick print of this results in the following output:

DatetimeIndex(['2021-02-11 03:41:00', '2021-02-11 03:42:00',
           '2021-02-11 03:43:00', '2021-02-11 03:44:00',
           '2021-02-11 03:45:00', '2021-02-11 03:46:00',
           '2021-02-11 03:47:00', '2021-02-11 03:48:00',
           '2021-02-11 03:49:00', '2021-02-11 03:50:00',
           ...
           '2021-09-29 08:50:00', '2021-09-29 08:51:00',
           '2021-09-29 08:52:00', '2021-09-29 08:53:00',
           '2021-09-29 08:54:00', '2021-09-29 08:55:00',
           '2021-09-29 08:56:00', '2021-09-29 08:57:00',
           '2021-09-29 08:58:00', '2021-09-29 08:59:00'],
          dtype='datetime64[ns]', length=5057, freq=None)

My question is how to insert each of these timestamps as a new row in the initial DataFrame, with NaN in every existing column, like so:

date a b c d
2021-12-31 23:59:00 3675.10 3678.63 3675.09 3676.23
2021-12-31 23:58:00 NaN NaN NaN NaN
2021-12-31 23:57:00 NaN NaN NaN NaN
2021-12-31 23:56:00 3676.79 3678.54 3676.32 3676.94
2021-12-31 23:55:00 3679.36 3679.36 3676.28 3676.65

Where, in this contrived example, the dates 2021-12-31 23:58:00 and 2021-12-31 23:57:00 would have been identified via the date_range function before and then inserted with NaN values into the initial DataFrame. In this case, I am not interested in interpolating between surrounding values.

  • can you provide a sample of your input dataframe? Commented Jun 24, 2022 at 13:58
  • have you looked at pandas.pydata.org/docs/reference/api/… Commented Jun 24, 2022 at 13:59
  • @Umar.H The input DataFrame is essentially the same as that in the last code block, just sans the rows with NaN values. Commented Jun 24, 2022 at 14:00
  • @programmar I am only vaguely familiar with the resample function -- I use it to turn n-periods into n/m periods often but am not fluent enough to know how to apply it in this case. Suggestions? Commented Jun 24, 2022 at 14:02
  • Please take a moment to read "minimal reproducible example" and "How to Ask". You need to provide a representative sample of your input and output dataframes. Commented Jun 24, 2022 at 14:04

1 Answer

Use resample!

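df.resample('1min').asfreq()

Note: '1min' stands for 1 minute ('1T' is an older alias for the same frequency). asfreq() inserts the missing minutes without filling them, so the new rows stay NaN.

And here is some code that works, recreating the data frame from your sample data. But please, next time, provide this yourself. You will get better and faster answers!

import pandas as pd
from io import StringIO

# Sample data from the question, without the rows that should become NaN
s = """date  a  b  c  d
2021-12-31 23:59:00  3675.10  3678.63  3675.09  3676.23
2021-12-31 23:56:00  3676.79  3678.54  3676.32  3676.94
2021-12-31 23:55:00  3679.36  3679.36  3676.28  3676.65"""

# Columns are separated by two spaces; a multi-character separator uses the Python engine
df = pd.read_csv(StringIO(s), sep='  ', index_col='date', engine='python')
df.index = pd.to_datetime(df.index)

# The sample is newest-first; sort it chronologically
df = df.sort_index()

# Insert the missing minutes; the new rows are left as NaN
df.resample('1min').asfreq()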

Output:

                           a        b        c        d
date                                                   
2021-12-31 23:55:00  3679.36  3679.36  3676.28  3676.65
2021-12-31 23:56:00  3676.79  3678.54  3676.32  3676.94
2021-12-31 23:57:00      NaN      NaN      NaN      NaN
2021-12-31 23:58:00      NaN      NaN      NaN      NaN
2021-12-31 23:59:00  3675.10  3678.63  3675.09  3676.23

Edit: Your index is unsorted, so I had to add the line with sort_index(). Keep that in mind.
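
If you already have the full minute range from your date_range call, reindex should give the same result as resampling. The following is a minimal sketch under that assumption; the sample frame and the names full_index and filled are made up for illustration and are not from the original post.

```python
import pandas as pd

# Hypothetical sample mirroring the question's data, with two minutes missing.
df = pd.DataFrame(
    {
        "a": [3679.36, 3676.79, 3675.10],
        "b": [3679.36, 3678.54, 3678.63],
        "c": [3676.28, 3676.32, 3675.09],
        "d": [3676.65, 3676.94, 3676.23],
    },
    index=pd.to_datetime(
        ["2021-12-31 23:55:00", "2021-12-31 23:56:00", "2021-12-31 23:59:00"]
    ),
)
df.index.name = "date"

# Every minute between the first and last timestamp, gaps included.
full_index = pd.date_range(df.index.min(), df.index.max(), freq="1min", name="date")

# The gaps, computed the same way as in the question.
missing = full_index.difference(df.index)

# reindex keeps the existing rows and adds all-NaN rows for labels not yet present.
filled = df.reindex(full_index)

print(filled)
```

Note that reindex requires the existing index to contain no duplicate timestamps, so this only applies if each minute appears at most once in df.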


2 Comments

This works great! The only issue is that the resulting data is in reversed order -- any idea how to avoid that? It's no issue to simply use .iloc[::-1] on the resulting DataFrame, I'm mostly just curious.
I just fixed that. How is it now?
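
On the ordering question above: if the newest-first order of the original frame should be preserved, one option is to sort ascending for the gap filling and flip the result back afterwards. This is only a sketch on a hypothetical one-column sample; filled_desc is an illustrative name.

```python
import pandas as pd

# Hypothetical sample in newest-first order, like the frame in the question.
df = pd.DataFrame(
    {"a": [3675.10, 3676.79, 3679.36]},
    index=pd.to_datetime(
        ["2021-12-31 23:59:00", "2021-12-31 23:56:00", "2021-12-31 23:55:00"]
    ),
)

# Sort ascending, insert the missing minutes as NaN rows, then restore descending order.
filled_desc = df.sort_index().asfreq("1min").sort_index(ascending=False)

print(filled_desc)
```

This has the same effect as applying .iloc[::-1] to the ascending result, but states the intended order explicitly.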
