I have a DataFrame with a (Date, Station_Number) MultiIndex in which some dates within the time period are missing for each station. How do I fill in the missing dates per station and set the value column (LST) to NaN for those dates?

The df looks like this:


                           LST
Date        Station_Number                
2003-01-01    SWE00137272 -238
2003-01-09    SWE00137272 -172
2003-01-17    SWE00137272 -191
2003-01-25    SWE00137272 -202
2003-02-02    SWE00137272 -297
...                   ...  ...
2020-11-24    GLM00004301 -321
2020-12-02    GLM00004301 -323
2020-12-10    GLM00004301 -347
2020-12-18    GLM00004301 -340
2020-12-26    GLM00004301 -312

[636672 rows x 2 columns]

The time span goes from 01.01.2003 until 31.12.2020. I have tried using:

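from itertools import product
import pandas as pd

# Every calendar day between the first and last date in the data
dates_index = polar_temp.set_index('Date').resample('D').mean().reset_index()['Date'].to_list()
# Every (Date, Station_Number) combination
all_possible_dates = pd.DataFrame(product(dates_index, stations), columns=['Date', 'Station_Number'])
date_merge = pd.merge(stations_polar, all_possible_dates, how='outer', on=['Station_Number', 'Date'])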

However, the missing dates are just appended at the end of the DataFrame, and even dates that exist in both DataFrames are appended again.

Ideally the added dates would be set to NaN in the LST column. The output should look like this:

                            LST
Date        Station_Number                 
2003-01-01    SWE00137272 -238
2003-01-02    SWE00137272 NaN
2003-01-03    SWE00137272 NaN
2003-01-04    SWE00137272 NaN
2003-01-05    SWE00137272 NaN
2003-01-06    SWE00137272 NaN
2003-01-07    SWE00137272 NaN
2003-01-08    SWE00137272 NaN
2003-01-09    SWE00137272 -172
2003-01-10    SWE00137272 NaN

The dates should continue like this for every station as a continuous daily series from 2003 to 2020, with the added dates set to NaN.

1 Answer

  1. Take Station_Number out of the index.
  2. Convert the date index to datetime (if required).
  3. Resample to daily frequency, then forward-fill Station_Number.
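# Move Station_Number from the index into a regular column
df1 = df.reset_index(-1)
# Make sure the remaining Date index is a DatetimeIndex
df1.index = pd.to_datetime(df1.index)
# Insert the missing days (LST becomes NaN), then forward-fill the station label
df1 = df1.resample('D').first().assign(Station_Number=lambda x: x['Station_Number'].ffill())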

1 Comment

Nice, it worked, but only for one station. Is there a way to iterate that function over the other stations as well?
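
One possible way to cover all stations at once, without an explicit loop, is to build the full Station_Number x Date grid and reindex against it. The following is only a sketch under some assumptions: the toy data is made up for illustration, the helper name `fill_missing_dates` and its `start`/`end` parameters are hypothetical, and the Date level is assumed to already be datetime64 (convert it with `pd.to_datetime` first if it holds strings).

```python
import pandas as pd

# Toy data standing in for the real frame (values are for illustration only).
df = pd.DataFrame(
    {
        "Date": pd.to_datetime(
            ["2003-01-01", "2003-01-09", "2003-01-01", "2003-01-05"]
        ),
        "Station_Number": ["SWE00137272", "SWE00137272", "GLM00004301", "GLM00004301"],
        "LST": [-238, -172, -321, -323],
    }
).set_index(["Date", "Station_Number"])


def fill_missing_dates(frame, start, end):
    """Return frame with one row per station per day between start and end.

    Hypothetical helper: rows that did not exist before get NaN in every column.
    """
    all_days = pd.date_range(start, end, freq="D", name="Date")
    stations = frame.index.get_level_values("Station_Number").unique()
    # Station-major order keeps each station's dates contiguous in the result.
    full_index = pd.MultiIndex.from_product(
        [stations, all_days], names=["Station_Number", "Date"]
    )
    # Swap to (Station_Number, Date), reindex against the full grid,
    # then swap back to the original (Date, Station_Number) level order.
    return frame.swaplevel().reindex(full_index).swaplevel()


filled = fill_missing_dates(df, "2003-01-01", "2003-01-10")
print(filled)
```

For the real data the call would presumably be `fill_missing_dates(df, "2003-01-01", "2020-12-31")`. Unlike resampling, which only fills between the first and last date actually present, this pads every station out to the full time span. Note that LST becomes a float column, because NaN cannot be stored in an integer column.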