I have an xarray dataset with irregularly spaced daily data. Sometimes there are two values for one day, and sometimes there is a gap of several days.

[Timestamp('2015-04-01 00:00:00'),
 Timestamp('2015-04-01 00:00:00'),
 Timestamp('2015-04-03 00:00:00'),
 Timestamp('2015-04-03 00:00:00'),
 Timestamp('2015-04-05 00:00:00'),
 Timestamp('2015-04-06 00:00:00'),
 Timestamp('2015-04-06 00:00:00')]

If I apply resample()

model.resample(time='1D').mean()

I end up with

[Timestamp('2015-04-01 00:00:00'),
 Timestamp('2015-04-02 00:00:00'),
 Timestamp('2015-04-03 00:00:00'),
 Timestamp('2015-04-04 00:00:00'),
 Timestamp('2015-04-05 00:00:00'),
 Timestamp('2015-04-06 00:00:00'),
 Timestamp('2015-04-07 00:00:00')]

But I am looking to resample the data like this:

[Timestamp('2015-04-01 00:00:00'),
 Timestamp('2015-04-03 00:00:00'),
 Timestamp('2015-04-05 00:00:00'),
 Timestamp('2015-04-06 00:00:00')]

What options do I have to get the .mean() of values that fall on the same day without adding new times to the model? I tried to reproduce the problem in a small sample:

import numpy as np
import xarray as xr

value_1 = np.arange(0,7,1)
times = np.array(['2015-04-01', '2015-04-01', '2018-01-03', '2018-01-03', '2018-01-05', '2018-01-05', '2018-01-06'], dtype='datetime64')

time_ = xr.Dataset(
        data_vars={'value':    (('time'), value_1)},
        coords={'time': times})

time_resample = time_.resample(time='1D').mean().sel(time=slice('2015-04-01', '2015-04-06'))

print(time_.time, time_resample.time)


<xarray.DataArray 'time' (time: 7)>
array(['2015-04-01T00:00:00.000000000', '2015-04-01T00:00:00.000000000',
       '2018-01-03T00:00:00.000000000', '2018-01-03T00:00:00.000000000',
       '2018-01-05T00:00:00.000000000', '2018-01-05T00:00:00.000000000',
       '2018-01-06T00:00:00.000000000'], dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] 2015-04-01 2015-04-01 ... 2018-01-06 <xarray.DataArray 'time' (time: 6)>
array(['2015-04-01T00:00:00.000000000', '2015-04-02T00:00:00.000000000',
       '2015-04-03T00:00:00.000000000', '2015-04-04T00:00:00.000000000',
       '2015-04-05T00:00:00.000000000', '2015-04-06T00:00:00.000000000'],
      dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] 2015-04-01 2015-04-02 ... 2015-04-06
  • 2
    groupby('Date') or something similar, not resample. Commented Dec 20, 2019 at 14:59
  • 1
    If you solved your issue, you can answer your own question and accept it (or any other answer). That is better than editing the solution into the question. Commented Jan 5, 2020 at 22:40

1 Answer

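resample(time='1D') builds a regular daily grid, so it adds an entry for every day in a gap. To average the values that share a timestamp without adding new times, group by the time coordinate and apply mean():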

time_groupby = time_.value.groupby('time').mean()

xarray behaves much like pandas in this respect.
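
As a minimal sketch of the difference, the example below uses the timestamps listed at the top of the question rather than the mixed 2015/2018 sample. The names ds, grouped and resampled are only illustrative, and the values are made-up floats. It also shows a possible alternative, assuming that empty daily bins come out of resample().mean() as NaN: resample first, then drop the all-NaN days.

```python
import numpy as np
import xarray as xr

# Illustrative data: duplicate days and gaps, as in the question's first listing.
times = np.array(
    ["2015-04-01", "2015-04-01", "2015-04-03", "2015-04-03",
     "2015-04-05", "2015-04-06", "2015-04-06"],
    dtype="datetime64[ns]",
)
ds = xr.Dataset(
    data_vars={"value": (("time",), np.arange(7.0))},
    coords={"time": times},
)

# One group per distinct timestamp; no new days are created.
grouped = ds.groupby("time").mean()

# Alternative: resample onto a regular daily grid, then drop the days
# that had no observations (they are NaN after the mean).
resampled = ds.resample(time="1D").mean().dropna(dim="time", how="all")

print(grouped.time.values)
print(grouped["value"].values)
```

groupby('time') only merges entries whose timestamps are exactly equal. If the real data carries times of day, the resample-then-dropna variant (or grouping on a floored date) may be the better fit.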
