I have a function apply_dask_function() that applies dask_function() to an xarray dataset (ds_example).
dask_function() takes three inputs: two 1D arrays of the same length (along "time") plus a "new_time" array, and returns two outputs of length "new_time".
My dataset is chunked as ds_example.chunk({'time': -1, 'x': 10, 'y': 10}), because dask_function() needs the full time series of each pixel in the (x, y) plane and must be applied to every one of them.
I cannot manage to apply the function to two 3D arrays at the same time.
How can I use xr.apply_ufunc() to apply the function in parallel over two chunked 3D arrays?
Code example:
# Dask wrapper function
def apply_dask_function(DataArray, SigmaArray, dim, kwargs=None):
    results = xr.apply_ufunc(
        dask_function,
        DataArray,
        SigmaArray,  # Pass sigma as a second positional argument
        kwargs=kwargs,
        input_core_dims=[[dim], [dim]],  # Both inputs share the core dimension
        output_core_dims=[["new_time"], ["new_time"]],
        dask_gufunc_kwargs={"output_sizes": {"new_time": len(kwargs["new_time"])}},
        output_dtypes=[float, float],
        vectorize=True,
        dask="parallelized",
    )
    return results  # tuple of two DataArrays
# Function applied per pixel, on 1D arrays along "time"
def dask_function(
    DataArray,
    sigma,
    new_time=None,
):
    # Interpolates DataArray onto "new_time" while taking "sigma" into account,
    # and returns two outputs ("mean" and "error") of length len(new_time)
    return mean, error
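For reference, here is a minimal, self-contained sketch of the pattern on a toy dataset. The per-pixel body (a plain linear interpolation via np.interp) and the variable names are placeholder assumptions standing in for the real dask_function(); the essential parts are the two 3D chunked inputs, the shared input core dimension, the two output core dimensions, and output_sizes passed through dask_gufunc_kwargs:

```python
import numpy as np
import xarray as xr

# Placeholder per-pixel function: takes two 1D arrays over "time" plus
# "new_time", returns two 1D arrays over "new_time".
def pixel_function(values, sigma, new_time=None):
    time = np.arange(values.size)
    mean = np.interp(new_time, time, values)   # stand-in for the real interpolation
    error = np.interp(new_time, time, sigma)   # stand-in for the real error estimate
    return mean, error

def apply_pixel_function(data, sigma, dim, kwargs=None):
    # vectorize=True loops pixel_function over all (x, y) pixels;
    # dask="parallelized" runs that loop chunk by chunk in parallel.
    return xr.apply_ufunc(
        pixel_function,
        data,
        sigma,
        kwargs=kwargs,
        input_core_dims=[[dim], [dim]],
        output_core_dims=[["new_time"], ["new_time"]],
        dask_gufunc_kwargs={"output_sizes": {"new_time": len(kwargs["new_time"])}},
        output_dtypes=[float, float],
        vectorize=True,
        dask="parallelized",
    )

# Toy dataset: 20 time steps on a 4x4 grid, chunked with the full
# time axis in each chunk (time: -1), as in the question.
ds = xr.Dataset(
    {
        "dem": (("time", "y", "x"), np.random.rand(20, 4, 4)),
        "sigma": (("time", "y", "x"), np.random.rand(20, 4, 4)),
    }
).chunk({"time": -1, "y": 2, "x": 2})

new_time = np.linspace(0, 19, 50)
mean, error = apply_pixel_function(
    ds["dem"], ds["sigma"], "time", kwargs={"new_time": new_time}
)
print(mean.compute().shape)  # (4, 4, 50)
```

Both outputs come back lazily as chunked DataArrays with dimensions (y, x, new_time); nothing is computed until you call .compute() or .load().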