Create a new DataFrame using pandas date_range

Question

I have the following DataFrame:

   date_start          date_end
0  2023-01-01 16:00:00 2023-01-01 17:00:00
1  2023-01-02 16:00:00 2023-01-02 17:00:00
2  2023-01-03 16:00:00 2023-01-03 17:00:00
3  2023-01-04 17:00:00 2023-01-04 19:00:00
4  NaN                 NaN

and I want to create a new DataFrame which will contain values starting from the date_start and ending at the date_end of each row. So for the first row by using the code below:

new_df = pd.Series(pd.date_range(start=df['date_start'][0], end=df['date_end'][0], freq= '15min'))

I get the following:

0   2023-01-01 16:00:00
1   2023-01-01 16:15:00
2   2023-01-01 16:30:00
3   2023-01-01 16:45:00
4   2023-01-01 17:00:00

How can I get the same result for all the rows of the df combined in a new df?

mozway · Accepted Answer · 2023-01-10 13:12:13Z

1

You can use a list comprehension and concat:

out = pd.concat([pd.DataFrame({'date': pd.date_range(start=start, end=end,
                                                     freq='15min')})
                  for start, end in zip(df['date_start'], df['date_end'])],
                ignore_index=True))

Output:

                  date
0  2023-01-01 16:00:00
1  2023-01-01 16:15:00
2  2023-01-01 16:30:00
3  2023-01-01 16:45:00
4  2023-01-01 17:00:00
5  2023-01-02 16:00:00
6  2023-01-02 16:15:00
7  2023-01-02 16:30:00
8  2023-01-02 16:45:00
9  2023-01-02 17:00:00
10 2023-01-03 16:00:00
11 2023-01-03 16:15:00
12 2023-01-03 16:30:00
13 2023-01-03 16:45:00
14 2023-01-03 17:00:00
15 2023-01-04 17:00:00
16 2023-01-04 17:15:00
17 2023-01-04 17:30:00
18 2023-01-04 17:45:00
19 2023-01-04 18:00:00
20 2023-01-04 18:15:00
21 2023-01-04 18:30:00
22 2023-01-04 18:45:00
23 2023-01-04 19:00:00

handling NAs:

out = pd.concat([pd.DataFrame({'date': pd.date_range(start=start, end=end,
                                                     freq='15min')})
                  for start, end in zip(df['date_start'], df['date_end'])
                  if pd.notna(start) and pd.notna(end)
                ],
                ignore_index=True)

edited Jan 10, 2023 at 13:12

answered Jan 10, 2023 at 12:59

mozway

267k13 gold badges56 silver badges106 bronze badges

Sign up to request clarification or add additional context in comments.

2 Comments

Pythoneer Over a year ago

@thanks mozway! In case there are NaNs, I am getting the following error: ValueError: Neither start` nor end can be NaT`. How do I ignore the NaNs?

mozway Over a year ago

You can first run df2 = df.dropna(['date_start', 'date_end']) and use df2 in the list comprehension. Or add if pd.notna(start) and pd.notna(end) as test at the end of the list comprehension

Tranbi · Accepted Answer · 2023-01-10 13:04:33Z

1

Adding to the previous answer that date_range has a to_series() method and that you could proceed like this as well:

pd.concat(
  [
    pd.date_range(start=row['date_start'], end=row['date_end'], freq= '15min').to_series()
    for _, row in df.iterrows()
  ], ignore_index=True
)

answered Jan 10, 2023 at 13:04

Tranbi

12.8k6 gold badges19 silver badges39 bronze badges

Collectives™ on Stack Overflow

Create a new DataFrame using pandas date_range

2 Answers 2

handling NAs:

2 Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

handling NAs:

2 Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related