Fill in missing dates in pandas df

Question

I have data for a list of DBs, with one row for each date on which a DB was in use.

 DB             Dates        USAGE
 ABC            03-06-2018   IN USE
 ABC            07-06-2018   IN USE
 XYZ            04-06-2018   IN USE
 XYZ            08-06-2018   IN USE

What I want is the full calendar month for every DB, not just the dates on which it was in use:

 DB             Dates        USAGE
 ABC            01-06-2018    NOT IN USE
 ABC            02-06-2018    NOT IN USE
 ABC            03-06-2018    IN USE
 .
 .
 ABC            07-06-2018    IN USE
 .
 .
 ABC            30-06-2018    NOT IN USE 
 XYZ            01-06-2018    NOT IN USE
 .
 .
 XYZ            30-06-2018    NOT IN USE

If I understood you correctly, you can query the dataframe based on the "USAGE" column; see this question: stackoverflow.com/questions/17071871/… Is this what you need?
– Minions, Commented Jul 30, 2018 at 5:45
Not sure about a downvote, but pretty sure it is a dupe. Both the OP and the linked question are about adding missing dates from a range.
– DYZ, Commented Jul 30, 2018 at 5:58
@jezrael It's up to you, of course. I say it is a possible dupe.
– DYZ, Commented Jul 30, 2018 at 6:00
@jezrael I agree that, while the linked answer may be used to answer the OP, a complete answer requires more bits and pieces than the linked one.
– DYZ, Commented Jul 30, 2018 at 6:06

jezrael · Accepted Answer · 2018-07-30 06:03:24Z

Use:

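import pandas as pd

# Parse the day-first date strings into datetimes
df['Dates'] = pd.to_datetime(df['Dates'], format='%d-%m-%Y')

# Every calendar day from the start of the earliest month to the end of the latest month
a = df['Dates'].dt.to_period('M')
dates = pd.date_range(a.min().to_timestamp(how='start'), a.max().to_timestamp(how='end').normalize())

# All (DB, date) combinations
mux = pd.MultiIndex.from_product([df['DB'].unique(), dates], names=['DB','Dates'])

# Reindex onto the full grid; combinations absent from the data get 'NOT IN USE'
df = df.set_index(['DB','Dates'])['USAGE'].reindex(mux, fill_value='NOT IN USE').reset_index()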
print (df.head())
    DB      Dates       USAGE
0  ABC 2018-06-01  NOT IN USE
1  ABC 2018-06-02  NOT IN USE
2  ABC 2018-06-03      IN USE
3  ABC 2018-06-04  NOT IN USE
4  ABC 2018-06-05  NOT IN USE

print (df.tail())
     DB      Dates       USAGE
55  XYZ 2018-06-26  NOT IN USE
56  XYZ 2018-06-27  NOT IN USE
57  XYZ 2018-06-28  NOT IN USE
58  XYZ 2018-06-29  NOT IN USE
59  XYZ 2018-06-30  NOT IN USE

Detail:

print (dates)
DatetimeIndex(['2018-06-01', '2018-06-02', '2018-06-03', '2018-06-04',
               '2018-06-05', '2018-06-06', '2018-06-07', '2018-06-08',
               '2018-06-09', '2018-06-10', '2018-06-11', '2018-06-12',
               '2018-06-13', '2018-06-14', '2018-06-15', '2018-06-16',
               '2018-06-17', '2018-06-18', '2018-06-19', '2018-06-20',
               '2018-06-21', '2018-06-22', '2018-06-23', '2018-06-24',
               '2018-06-25', '2018-06-26', '2018-06-27', '2018-06-28',
               '2018-06-29', '2018-06-30'],
              dtype='datetime64[ns]', freq='D')

Explanation:

First convert the column with to_datetime.
Create all possible dates: convert the column to monthly periods with to_period, then build a date_range between the start of the first month and the end of the last month, both obtained with to_timestamp.
Then create a MultiIndex with from_product.
Finally, reindex and replace the missing values with 'NOT IN USE'.
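
For anyone who wants to try the steps above end to end, here is a minimal self-contained sketch. The helper name fill_missing_dates and the sample frame are illustrative assumptions, not part of the original answer, and the month bounds are taken with how='start' and how='end' rather than frequency strings, which should behave the same across pandas versions.

```python
import pandas as pd


def fill_missing_dates(df, fill_value='NOT IN USE'):
    """Return one row per DB per calendar day of the month(s) in df.

    Hypothetical helper: assumes columns 'DB', 'Dates' (dd-mm-YYYY strings)
    and 'USAGE', with at most one row per (DB, date) pair.
    """
    out = df.copy()
    out['Dates'] = pd.to_datetime(out['Dates'], format='%d-%m-%Y')

    # Month bounds: first day of the earliest month, last day of the latest
    months = out['Dates'].dt.to_period('M')
    first_day = months.min().to_timestamp(how='start')
    last_day = months.max().to_timestamp(how='end').normalize()
    dates = pd.date_range(first_day, last_day, freq='D')

    # Cartesian product of DBs and days, then reindex onto it
    mux = pd.MultiIndex.from_product([out['DB'].unique(), dates],
                                     names=['DB', 'Dates'])
    return (out.set_index(['DB', 'Dates'])['USAGE']
               .reindex(mux, fill_value=fill_value)
               .reset_index())


# Sample data copied from the question
sample = pd.DataFrame({
    'DB': ['ABC', 'ABC', 'XYZ', 'XYZ'],
    'Dates': ['03-06-2018', '07-06-2018', '04-06-2018', '08-06-2018'],
    'USAGE': ['IN USE'] * 4,
})

full = fill_missing_dates(sample)
print(full.head())
```

Note that reindex requires the (DB, Dates) pairs to be unique, so duplicated rows would need to be dropped or aggregated first.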

edited Jul 30, 2018 at 6:03

answered Jul 30, 2018 at 5:51

jezrael


4 Comments

techdoodle Over a year ago

@jezrael Is there any way to ignore the dates that fall on weekends?

jezrael Over a year ago

@techdoodle - Do you mean removing the weekend datetimes from dates, like dates = dates[~dates.weekday.isin([5,6])] ?
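
A small sketch of that filtering suggestion, run on the June 2018 range from the answer. One thing to watch, assuming the filtered dates are then used to build the MultiIndex: any IN USE row that falls on a weekend (03-06-2018 for ABC is a Sunday) would be dropped by the reindex.

```python
import pandas as pd

dates = pd.date_range('2018-06-01', '2018-06-30')

# weekday is 0 for Monday through 6 for Sunday, so 5 and 6 are the weekend
weekdays_only = dates[~dates.weekday.isin([5, 6])]

print(len(dates), len(weekdays_only))

# The Sunday on which ABC was in use is no longer part of the range
print(pd.Timestamp('2018-06-03') in weekdays_only)
```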

jezrael Over a year ago

@techdoodle - Or set a different value for these dates as the last step, like df.loc[df['Dates'].dt.weekday.isin([5,6]), 'USAGE'] = 'no using' ?
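
A sketch of this second option on a one-week frame built by hand (the frame itself is an illustrative assumption). It keeps every row and only relabels the weekend ones, so it would also overwrite an IN USE value recorded on a weekend.

```python
import pandas as pd

# One week of already-filled rows for a single DB (illustrative data)
full = pd.DataFrame({
    'DB': 'ABC',
    'Dates': pd.date_range('2018-06-01', '2018-06-07'),
    'USAGE': 'NOT IN USE',
})

# Relabel Saturday (5) and Sunday (6) rows in place
is_weekend = full['Dates'].dt.weekday.isin([5, 6])
full.loc[is_weekend, 'USAGE'] = 'no using'

print(full)
```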

techdoodle Over a year ago

How can I get the same range of dates but at an hourly interval, with 24 entries for each date?
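
This follow-up has no reply in the thread. One possible approach, offered only as a sketch and not as the answerer's method: build the daily frame as in the answer, repeat each row 24 times, and add an hour offset of 0 to 23 to each copy. It assumes every hour of a day inherits that day's USAGE value.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'DB': ['ABC', 'ABC', 'XYZ', 'XYZ'],
    'Dates': ['03-06-2018', '07-06-2018', '04-06-2018', '08-06-2018'],
    'USAGE': ['IN USE'] * 4,
})
df['Dates'] = pd.to_datetime(df['Dates'], format='%d-%m-%Y')

# Daily grid, as in the answer
months = df['Dates'].dt.to_period('M')
dates = pd.date_range(months.min().to_timestamp(how='start'),
                      months.max().to_timestamp(how='end').normalize())
mux = pd.MultiIndex.from_product([df['DB'].unique(), dates],
                                 names=['DB', 'Dates'])
daily = (df.set_index(['DB', 'Dates'])['USAGE']
           .reindex(mux, fill_value='NOT IN USE')
           .reset_index())

# Repeat every daily row 24 times, then shift each copy by 0..23 hours
hourly = daily.loc[daily.index.repeat(24)].reset_index(drop=True)
hour_offsets = pd.Series(
    pd.to_timedelta(np.tile(np.arange(24), len(daily)), unit='h'))
hourly['Dates'] = hourly['Dates'] + hour_offsets

print(hourly.head())
```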
