2,451 questions
2
votes
0
answers
43
views
How to optimize NetCDF files and dask for processing long-term climatological indices with xclim (e.g., SPI using a 30-day rolling window)?
I am trying to analyze the 30-day standardized precipitation index for a multi-state range of the southeastern US for the year 2016. I'm using xclim to process a direct pull of gridded daily ...
0
votes
0
answers
20
views
Introducing new dimension in xarray apply_ufunc
There has been at least one other question regarding the introduction of new dimensions in the output of xarray.apply_ufunc; I have two problems with this answer: First, I feel like the answer avoids ...
0
votes
0
answers
35
views
Coarsening the resolution of an xarray dataset
Very new to Python! I am trying to model bottom water temperatures over time and need to reduce the resolution of my model from 1/20° to 1°. My ultimate goal is to map this and select specific grid ...
1
vote
2
answers
69
views
Convert existing dataset to rioxarray object
I have a dataset that I need to convert into a rioxarray dataset so I can use regridding features, but I can't work out how to convert an already existing xarray object to rioxarray.
Unfortunately I ...
0
votes
1
answer
51
views
How to quickly pull values from an xarray using indices
I'm working in a Jupyter Notebook using Python 3.12. I have a 2D xarray (in reality, it's 3D, but we can treat it as 2D). I want to pull out the values based on indices I acquire elsewhere, and then ...
3
votes
0
answers
69
views
How to drop rows with a boolean mask in xarray/dask without .compute() blowing up memory?
I’m trying to subset a large xarray.Dataset backed by Dask and save it back to Zarr, but I’m running into a major memory problem when attempting to drop rows with a boolean mask.
Here’s a minimal ...
0
votes
0
answers
34
views
Adding global attributes to existing netCDF file in Xarray
I have netCDF files of oceanographic data, processed in Python, whose global attributes I'd like to update (i.e., add the same attributes to a bunch of files). I tried doing it in Xarray per their ...
0
votes
1
answer
109
views
How can I concatenate 10 netCDF files along the time axis while also retaining the attributes of each individual file?
I am trying to concatenate 10 netCDF files that are output files from a software package named Ichthyop. Each of the files is the result of a Lagrangian simulation of particles drifting in the Eastern ...
0
votes
0
answers
40
views
How to properly use joblib files in Dask?
from joblib import load
ntrees_16_model = load(r"ntrees_quantile_16_model_watermask.joblib")
ntrees_50_model = load(r"ntrees_quantile_50_model_watermask.joblib")
ntrees_84_model = ...
0
votes
1
answer
57
views
Extreme trends in GSL (Growing Season Length) due to missing years in ERA5-Land based calculation
I calculated the Growing Season Length (GSL) index for the 1950–2023 period in Turkey using ERA5-Land daily mean temperature data. According to the definition:
First, find the first occurrence of at ...
1
vote
1
answer
63
views
How to reduce xarray.coarsen with majority vote?
I'm currently trying to resample a large GeoTIFF file to a coarser resolution. This file contains classes of tree species (indicated by integer values) at each pixel, so I want to resample each block (...
0
votes
1
answer
61
views
High RAM usage when using Datashader with dasked xarray
I have a dasked xarray which is about 150k x 90k with a chunk size of 8192 x 8192. I am working on a Windows virtual machine which has 100 GB of RAM and 16 cores.
I want to plot it using the Datashader ...
0
votes
0
answers
46
views
Convert wrfout to netcdf
I use Python 3.9.18 to read wrfout files (named like wrfout_d02_2020-01-01_00:00:00), get the T2, Q2, PSFC, U10, V10, and ACSWDNB variables, and combine all days in the month into an output netCDF ...
0
votes
1
answer
81
views
What is the meaning of Data variables: *empty* in working with .nc file?
I had the task of analyzing a .nc file. I have not worked with this format before, so I did some research online and found some code.
I was exploring the dataset. I tried:
I am using Debian 12, Python3 and ...
0
votes
0
answers
45
views
Combining two .nc files with different dimensions using Icechunk, Virtualizarr, and Xarray
My overall goal is to set up a virtual dataset of ERA5 data using Icechunk. As a smaller test example, I'm trying to pull all the data located in the 194001 ERA5 folder. I've been mostly able to ...
2
votes
1
answer
51
views
Storing Xarray.Datasets in single Pandas.DataFrame cells
I have an existing DataFrame with metadata. I am now trying to add a column with the data. For each row in the DataFrame I want to add a subset of my xarray.Dataset. However, pandas seems to try and ...
0
votes
0
answers
37
views
Saving DataArray through to_netcdf loses coordinates?
I am using xarray and rioxarray to compute values from an existing dataset. The existing dataset has bands like red, blue and green, so I have something like this:
import xarray as xr
dataset: xr....
0
votes
0
answers
29
views
How do I upload a large Dask array to S3 in chunks using rasterio?
I have a very large Dask array containing geospatial information. I need to upload this array as a TIF file to an S3 bucket, but I cannot afford to load this raster in memory or save it to disk: I'd ...
0
votes
1
answer
47
views
Why does intake_xarray NetCDFSource seem not to use user input parameters?
I have prepared a catalog.yml with this content:
metadata:
  description: Sample catalog information.
  version: 1
plugins:
  source:
    - module: intake_xarray
sources:
  this_source:
    args:
...
0
votes
0
answers
70
views
Xarray apply function to every element of dataset
I currently have to do some calculations on a netcdf dataset. For this, I have to apply a function to each non-NaN element.
Here is my current approach:
import xarray as xr
def calc_things(wind_speed)...
1
vote
0
answers
29
views
Xarray combine_by_coords runs OOM when loading several dimensions
I have a script to load individual files into xarray for a custom data format. It works fine when loading individual files; however, the moment I try to either load using open_mfdataset() or ...
0
votes
1
answer
35
views
Assign new dimension to xarray.Dataset using existing coordinates
I have an xarray.Dataset that looks like this, which is available here: https://psl.noaa.gov/thredds/dodsC/Datasets/NARR/Dailies/monolevel/acpcp.1979.nc:
<xarray.Dataset>
Dimensions: ...
1
vote
1
answer
27
views
How to disable automatic groupby widget in hvplot?
hvplot has a groupby parameter that lets you pick what variables to group the output by. This results in a widget you use to select the data subset you want to plot.
If groupby is not specified, ...
0
votes
0
answers
48
views
Reduce memory usage in CDO collgrid command
I have 78 netCDF files, each around 17 MB, with shape (time=1, x=2048, y=2048), to be merged spatially. The single timestep is shared by all 78 files. The collgrid merge command below was able to ...
0
votes
0
answers
64
views
Error in saving a very large xarray dataset to zarr in python
I have global daily radiation data for 19 years. It is divided into one netCDF file for every day (so around 7000 files). I am loading all the files together as a single xarray dataset. This takes ...
2
votes
1
answer
58
views
Multidimensional coordinate transform with xarray
How do I convert a multidimensional coordinate to a standard coordinate, in order to unify data, when using xarray with nc data:
import xarray as xr
da = xr.DataArray(
    [[0, 1], [2, 3]],
    coords={
...
1
vote
0
answers
82
views
Why can't I use DataArray.to_zarr() to save a NetCDF format file as Zarr format?
I'm not sure if it's a version compatibility issue either. I've spent a lot of time trying but still can't get it to work. Please help me! The code and errors are as follows:
[in]:
print(xr....
0
votes
0
answers
41
views
MetPy "interpolate_to_isosurface" results in "IndexError: Unlabeled multi-dimensional array cannot be used for indexing: pressure_level"
I need help on the usage of MetPy's interpolate_to_isosurface function (link).
My goal was to interpolate a gridded meteorological dataset, such that its vertical coordinate is transformed from ...
0
votes
0
answers
25
views
Does hvplot lazily load data from xarray objects?
Suppose I open an xarray dataset ds using xarray.open_dataset that contains a 3d array called cube with dimension coordinates x, y, and z. Usually just opening the dataset doesn't load all the data ...
1
vote
0
answers
404
views
How can I resolve xarray "unrecognized engine cfgrib"?
An example HRRR file I'm trying to open with xarray: https://storage.googleapis.com/high-resolution-rapid-refresh/hrrr.20250427/conus/hrrr.t18z.wrfprsf06.grib2
When trying to open the dataset with ...
0
votes
2
answers
83
views
Splitting the time dimension of nc data using xarray
I have 3D data with dimensions time*lon*lat, where time is recorded as year, month and day. I need to split time in the form year*month+day, so that the data becomes 4-dimensional. How should I do this?
I have ...
1
vote
1
answer
81
views
One variable in xr.DataArray has an index depending on another variable, how to correctly implement this?
I am trying to get started using Xarray, but having an issue with a specific task: the labels I want to use in one dimension are dependent on another dimension.
Specifically, in the example below, the ...
1
vote
1
answer
167
views
How to select from xarray.Dataset without hardcoding the name of the dimension?
When selecting data from an xarray.Dataset type, the examples they provide all include hardcoding the name of the dimension like so:
ds = ds.sel(state_name='California')
TL;DR: How can you select from ...
0
votes
0
answers
64
views
What is the data_vars argument of xarray.open_mfdataset doing?
I have two datasets with identical dimension names and shapes and I am trying to use xarray.open_mfdataset() to merge them into one dataset before opening them. You can drop this code in your own IDE ...
0
votes
0
answers
135
views
rioxarray and clip for a netcdf
I'm looking to use a shapefile to rio.clip a raster so I can find where they overlap.
I'm using the polygon for the overlap (in operation it will be a state or county shapefile, this is simplified for ...
0
votes
1
answer
412
views
What is the difference between an xarray Dataset and a DataArray?
According to the xarray documentation:
xarray.DataArray is xarray’s implementation of a labeled, multi-dimensional array.
To me, this means a DataArray is just numpy with labels. However, xarray ...
0
votes
1
answer
121
views
Getting [Errno 22] Invalid argument when trying to access data variables
I have a grib file containing reanalysis (an), ensemble mean (em), and ensemble spread (es) data. The dataset I access is the analysis data,
import xarray as xr
import cfgrib
file = 'C:/Users/...
1
vote
0
answers
46
views
Calculating a step in a direction of a vector
I have gridded atmospheric field data stored in an xarray.DataArray. I computed its gradient and rotated it to obtain vector components that are tangent to the isolines at each grid point:
ddata_dlon =...
0
votes
1
answer
74
views
How can I initialize a Zarr file that is larger than available memory?
My workflow generates an xr.Dataset with dims (6, 36, 2, 13, 699, 1920) in float32.
I can process and write the output array chunk by chunk, but only if the Zarr file already exists, with: ...
0
votes
0
answers
67
views
Applying a custom function to Xarray resample drops the dimension coordinates
When using the map method to apply a custom function in Xarray resample, the dimension coordinates are lost, and the resulting Dataset has a sequence instead of the actual coordinate values.
The ...
0
votes
1
answer
200
views
How to use numpy masked arrays to create a masked xarray DataArray?
I'm using metpy.calc.windchill in order to calculate wind chill values, and it automatically spits out an array with a numpy mask on it.
t2m, uwind, and vwind all come from ERA5 (on the Google Cloud: ...
0
votes
2
answers
154
views
A NetCDF file was generated from Google Earth Engine with the Xee package (Xarray + GEE), and I can't open it correctly in QGIS
I am trying to generate NetCDF files from an imageCollection. The code is working fine. I can save the result as an nc file and reopen it in Colab using the Xarray package. I can also open it in Panoply; however, it ...
1
vote
1
answer
108
views
How to remove xarray plot bad value edge colour
I know set_bad can colour bad pixels a specific colour, but in my example I only want an edge colour on the blue and grey pixels that have values, not on the bad pixels (red).
import matplotlib.pyplot ...
0
votes
1
answer
101
views
Open an xarray-created netCDF in QGIS
I have created a netCDF file using xarray in Python with the code below:
latitude_save = sorted(list(set(copy2_sorted['lat'])))
longitud_save = sorted(list(set(copy2_sorted['lon'])))
time_save = ...
1
vote
0
answers
282
views
Sending large Dask graphs causing major slowdown before computation
I am encountering a warning from Dask, and it takes a long time to start a computation (roughly 5 minutes) because of my large task graphs. Here is the full warning:
UserWarning: Sending large ...
0
votes
0
answers
30
views
Xarray apply_ufunc with input of several 3D arrays
I have a function apply_dask_function() that I use to apply the function dask_function() on an xarray dataset (ds_example).
dask_function() takes 3 inputs, which are two 1D arrays (length "time" ...
1
vote
1
answer
74
views
How to make xarray.DataArray.to_zarr readable by napari?
I have big TIFF-arrays that I want to save with xarray and view in napari. However, napari seems unable to read the zarr-format produced by xarray. Is there a way that I can specify the arguments of ...
0
votes
0
answers
145
views
Python Xarray Interpolate rotated grid to regular lat/lon grid
I have an xarray dataset that contains data on a rotated grid:
I want to regrid it to a regular grid. I am using:
dat.interp(latitude=np.arange(8, 24, 0.001), longitude=np.arange(102, 110, 0.001), ...
1
vote
1
answer
208
views
Using xarray in JupyterLab to read NC file from url
I am trying to prevent unnecessary downloading of large datasets by reading the publicly available files directly from their online location. Surprisingly I cannot find an answer to my question on ...
-1
votes
1
answer
244
views
How to Extract timeseries from gridded lat lon data (ERA5) in Python?
I have downloaded ERA5 global data which contains the following params:
total precipitation
snow fall
cloud cover
temperature at 2m
The data range is from Jan 2008 to Dec 2024. I am using xarray ...