I have a df of power outages with several columns, including a start date column and an end date column.
What I would like to be able to do:
- scan the "start date" column for the earliest date
- scan the "end date" column for the latest date
- build a date index with all dates in between those two dates
- for each row, create one row per date from the start date to the end date (inclusive), thus removing the need for both date columns
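The first three steps might be sketched roughly as follows. This is only a minimal sketch: the small frame is made-up sample data, the column names ("start date", "end date", "mw outage", "location") are assumed, and the MM/DD/YYYY date format is a guess.

```python
import pandas as pd

# Hypothetical sample data; column names and values are assumptions.
df = pd.DataFrame({
    "start date": ["01/01/2000", "01/01/2000"],
    "mw outage": [1000, 2000],
    "end date": ["01/04/2000", "01/03/2000"],
    "location": ["merica", "canadia"],
})

# Parse the strings as real dates (MM/DD/YYYY is assumed here).
df["start date"] = pd.to_datetime(df["start date"], format="%m/%d/%Y")
df["end date"] = pd.to_datetime(df["end date"], format="%m/%d/%Y")

# Earliest start and latest end across the whole frame.
earliest = df["start date"].min()
latest = df["end date"].max()

# Daily index covering every date between the two, inclusive.
all_dates = pd.date_range(earliest, latest, freq="D")
```

With real dates in the columns, `min()` and `max()` are enough to find the two endpoints, and `pd.date_range` includes both of them.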
So if my df looked as follows:
start date   mw outage   end date     location
01/01/2000   1000        01/04/2000   merica
01/01/2000   2000        01/03/2000   canadia
I'd want it to look like this instead:
date         mw outage   location
01/01/2000   1000        merica
01/01/2000   2000        canadia
01/02/2000   1000        merica
01/02/2000   2000        canadia
01/03/2000   1000        merica
01/03/2000   2000        canadia
01/04/2000   1000        merica
I think I can use reindex to add the missing dates, but I'm not sure how to identify the earliest and latest dates, and I don't know how to create the rows in this manner.
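One possible way to create the rows, sketched under assumptions: it uses `DataFrame.explode` (available in pandas 0.25 and later) instead of reindex, the sample frame is made up, and the column names and MM/DD/YYYY format are guesses.

```python
import pandas as pd

# Hypothetical sample data; column names and values are assumptions.
df = pd.DataFrame({
    "start date": ["01/01/2000", "01/01/2000"],
    "mw outage": [1000, 2000],
    "end date": ["01/04/2000", "01/03/2000"],
    "location": ["merica", "canadia"],
})
df["start date"] = pd.to_datetime(df["start date"], format="%m/%d/%Y")
df["end date"] = pd.to_datetime(df["end date"], format="%m/%d/%Y")

# For each row, build the list of daily dates from start to end (inclusive).
df["date"] = [
    list(pd.date_range(start, end, freq="D"))
    for start, end in zip(df["start date"], df["end date"])
]

# explode() turns each list element into its own row, repeating the other columns.
long_df = df.explode("date").reset_index(drop=True)
long_df["date"] = pd.to_datetime(long_df["date"])

# Drop the two original date columns and order the rows by date.
long_df = (
    long_df.drop(columns=["start date", "end date"])
    .sort_values(["date", "mw outage"])
    .reset_index(drop=True)
)
long_df = long_df[["date", "mw outage", "location"]]
print(long_df)
```

Each outage carries its own date range here, so no shared date index is needed. Reindexing against the full range would only matter if dates with no outage at all should also appear.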