I have a pandas DataFrame with three columns: A is of datetime type, B is an integer, and C is a float (C is not important for this question). For each row, I want to expand it into as many rows as the value in B, incrementing the datetime in A by one hour for each added row.

For example, given this data frame:

A                   B   C
4/18/2021 1:00:00   3   1
4/20/2021 5:00:00   2   0
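For reproducibility, the example frame above could be built roughly as follows. This is a minimal sketch: the variable name `df` and the ISO-formatted date strings are assumptions, and C is stored as float to match the description.

```python
import pandas as pd

# Example frame from the question; C is stored as float per the description.
df = pd.DataFrame({
    "A": pd.to_datetime(["2021-04-18 01:00:00", "2021-04-20 05:00:00"]),
    "B": [3, 2],
    "C": [1.0, 0.0],
})

# Confirm the column types: datetime, integer, float.
print(df.dtypes)
```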

The desired output is:

A                   B   C
4/18/2021 1:00:00   3   1
4/18/2021 2:00:00   3   1
4/18/2021 3:00:00   3   1
4/20/2021 5:00:00   2   0
4/20/2021 6:00:00   2   0

A naive approach would be to loop over each row of the DataFrame and append the new rows one at a time, but I would prefer a more efficient, vectorized way to do this.

  • Would an O(n) solution be sufficient? That is, you would still loop over each row of the original DataFrame, but for 4/18/2021 you would add both new rows in a single iteration. Commented Apr 21, 2022 at 22:23

1 Answer

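One option is a list comprehension that builds an hourly date range for each row, followed by explode:

import pandas as pd

# For each row, build B hourly timestamps starting at A,
# then explode so that each timestamp gets its own row.
# freq='1H' means one-hour steps (recent pandas releases prefer the lowercase 'h').
(df
.assign(
    A = [pd.date_range(start = a, periods = b, freq='1H')
         for a, b in zip(df.A, df.B)])
.explode('A')
)

Output:

                    A  B  C
0 2021-04-18 01:00:00  3  1
0 2021-04-18 02:00:00  3  1
0 2021-04-18 03:00:00  3  1
1 2021-04-20 05:00:00  2  0
1 2021-04-20 06:00:00  2  0

Note that explode repeats the original index labels (0, 0, 0, 1, 1); chain .reset_index(drop=True) if a fresh index is needed.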
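If the Python-level loop in the list comprehension becomes a bottleneck, a fully vectorized variant is possible. The sketch below is an alternative technique, not the approach above: it repeats each row B times with `Index.repeat`, numbers the copies with `groupby(...).cumcount()`, and adds that many hours to A. It assumes the frame has a unique index (for example the default RangeIndex); the names `out` and `offset` are illustrative.

```python
import pandas as pd

# Example frame from the question (rebuilt here so the snippet is self-contained).
df = pd.DataFrame({
    "A": pd.to_datetime(["2021-04-18 01:00:00", "2021-04-20 05:00:00"]),
    "B": [3, 2],
    "C": [1.0, 0.0],
})

# Repeat each row B times; the original index labels are repeated as well.
out = df.loc[df.index.repeat(df["B"])].copy()

# Position of each copy within its original row: 0, 1, 2, 0, 1.
offset = out.groupby(level=0).cumcount()

# Shift A by that many hours, then replace the repeated labels with a fresh index.
out["A"] = out["A"] + pd.to_timedelta(offset.to_numpy(), unit="h")
out = out.reset_index(drop=True)

print(out)
```

Unlike the explode version, this keeps A as a datetime column throughout and ends with a clean 0..n-1 index.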