5

I am new to Python and to NetCDF, so apologies if I'm unclear. I have an nc file that has several variables, and I need to extract data from it in a new order.

My nc file has 8 variables (longitude, latitude, time, u10, v10, swh, mwd, mwp) and the logic I'm trying is "If I input longitude and latitude, my program outputs other variables (u10, v10, swh, mwd, mwp) ordered by time." Then I would put the extracted data in another database.

I tested my nc file as below:

from netCDF4 import Dataset

# Open the file and list the variable names
jan = Dataset('2016_01.nc')
print(jan.variables.keys())

lon = jan.variables['longitude']
lat = jan.variables['latitude']
time = jan.variables['time']

# Show each dimension with its size
for d in jan.dimensions.items():
    print(d)

# Read the coordinate values into arrays
lon_array = lon[:]
lat_array = lat[:]
time_array = time[:]

print(lon_array)
print(lat_array)
print(time_array)

Part of the output is shown below:

[u'longitude', u'latitude', u'time', u'u10', u'v10', u'swh', u'mwd', u'mwp']

(u'longitude', <type 'netCDF4._netCDF4.Dimension'>: name = 'longitude', size = 1440)

(u'latitude', <type 'netCDF4._netCDF4.Dimension'>: name = 'latitude', size = 721)

(u'time', <type 'netCDF4._netCDF4.Dimension'> (unlimited): name = 'time', size = 186)

Any advice would be appreciated. Thank you.

3 Answers

8

2022 edit: this is now much easier with xarray, as shown in Adrian's answer: https://stackoverflow.com/a/74599597/3581217


You first need to know the order of the dimensions of the time- and space-varying variables such as u10, which you can obtain with:

u10 = jan.variables['u10']
print(u10.dimensions)

Next, it is a matter of slicing/indexing the array correctly. If you want data for, let's say, latitude = 30 and longitude = 10, the indexes of the closest grid point can be found as follows (after importing NumPy with import numpy as np):

i = np.abs(lon_array - 10).argmin()
j = np.abs(lat_array - 30).argmin()
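
The lookup above can be wrapped in a small helper. This is only a minimal sketch and not part of the original answer: the name nearest_index and the sample grids are made up for illustration, and the 0 to 360 longitude convention is an assumption (check lon_array.min() and lon_array.max() on your own file). If the file does store longitudes as 0 to 360, a western longitude such as -74.2 has to be wrapped first, otherwise argmin simply returns the edge of the grid.

```python
import numpy as np

def nearest_index(coords, value):
    """Return the index of the entry in a 1-D coordinate array closest to value."""
    coords = np.asarray(coords, dtype=float)
    return int(np.abs(coords - value).argmin())

# Hypothetical 0.25-degree grids with the same sizes as in the question.
lon_array = np.arange(0.0, 360.0, 0.25)      # 1440 points, assumed 0..359.75
lat_array = np.arange(90.0, -90.25, -0.25)   # 721 points, assumed north to south

i = nearest_index(lon_array, 10.0)
j = nearest_index(lat_array, 30.0)

# Wrap a negative longitude into 0..360 before searching (only needed if the
# file really uses that convention).
i_west = nearest_index(lon_array, -74.2 % 360.0)
```

Printing lon_array[i] and lat_array[j] afterwards is a cheap way to confirm that the selected grid point is the one you intended.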

Assuming that the dimensions of u10 are ordered as {time, lat, lon}, you can read the data as:

u10_time = u10[:,j,i]

This gives you all (time-varying) u10 values for your requested location.
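
To get the other variables as well, the same slice can be repeated for each one and the results collected as rows ordered by time. The following is a rough sketch under some assumptions: all variables share the same dimension order, and the helper names point_index and extract_point are hypothetical. Small synthetic arrays stand in for the file contents so the example runs on its own.

```python
import numpy as np

def point_index(dimensions, i, j, lon_name="longitude", lat_name="latitude"):
    """Build an index tuple that keeps every axis except latitude and longitude."""
    fixed = {lon_name: i, lat_name: j}
    return tuple(fixed.get(name, slice(None)) for name in dimensions)

def extract_point(variables, dimensions, times, i, j):
    """Return one row per time step: (time, value of each variable at the point)."""
    index = point_index(dimensions, i, j)
    columns = [np.asarray(var[index]) for var in variables.values()]
    order = np.argsort(times)  # sort by time in case the file is not ordered
    return [(times[t],) + tuple(col[t] for col in columns) for t in order]

# Synthetic stand-ins for the file contents: 3 time steps, 4 latitudes, 5 longitudes.
dims = ("time", "latitude", "longitude")
times = np.array([2, 0, 1])
u10 = np.arange(60, dtype=float).reshape(3, 4, 5)
data = {"u10": u10, "v10": -u10}

rows = extract_point(data, dims, times, i=2, j=1)
for row in rows:
    print(row)
```

With the real file you would presumably pass u10.dimensions as dimensions, time_array as times, and a dictionary such as {name: jan.variables[name] for name in ("u10", "v10", "swh", "mwd", "mwp")} as variables. Note that netCDF4 may return masked arrays, so missing values might need separate handling, and netCDF4.num2date(time_array, time.units) can convert the raw time values to dates before the rows are written to a database.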


1 Comment

Thank you so much. It fits.
1

This kind of task is straightforward with xarray, for example:

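import xarray as xr
lon = 30
lat = 10

# Open the file, select the grid point nearest to the location, and write it to a new NetCDF file.
# The coordinate names must match those in the file (longitude and latitude in the question's file).
ds = xr.open_dataset('2016_01.nc')
ts = ds.sel(longitude=lon, latitude=lat, method="nearest")
ts.to_netcdf('timeseries.nc')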

1 Comment

2022-me fully agrees; I added a reference to your answer in my accepted answer.
-1

I used this on netCDF files that are generated with the WRF model.

import numpy as np
from netCDF4 import Dataset  # http://code.google.com/p/netcdf4-python/
import pandas as pd
import os

os.chdir('.../netcdf')  # Select your directory
f = Dataset('wrfout_d01_2007-01-01_10_00_00', 'r')  # Load your file

latbounds = [4.691417]  # Latitude
lonbounds = [-74.209]  # Longitude

# Latitudes along the first column of the grid and their distance to the target latitude
cor_lat = pd.DataFrame(f.variables['XLAT'][0][:])
cor_lat2 = pd.DataFrame({'a': cor_lat.iloc[:, 0], 'b': abs(cor_lat.iloc[:, 0] - latbounds[0])})

# Row index of the closest latitude (index.get_values() was removed in pandas 1.0)
a = cor_lat2.b.idxmin()

# Longitudes along the first row of the grid and their distance to the target longitude
cor_lon = pd.DataFrame(f.variables['XLONG'][0][:])
cor_lon2 = pd.DataFrame({'a': cor_lon.iloc[0, :], 'b': abs(cor_lon.iloc[0, :] - lonbounds[0])})

# Column index of the closest longitude
b = cor_lon2.b.idxmin()

vlr = (f.variables['T2'][:, a, b] - 273.15)[0]  # Convert from kelvin to degrees Celsius (first time step)
vlr
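
Because XLAT and XLONG are two-dimensional, searching only the first column and the first row is an approximation when the grid is not aligned with latitude and longitude. As a possible alternative, the sketch below searches both 2-D arrays at once. The function name nearest_grid_point is hypothetical, the synthetic regular grid merely stands in for XLAT[0] and XLONG[0], and the distance is a simple squared difference in degrees rather than a great-circle distance.

```python
import numpy as np

def nearest_grid_point(lat2d, lon2d, lat, lon):
    """Return (row, col) of the grid cell closest to (lat, lon) on 2-D coordinate arrays."""
    # Squared difference in degrees: a rough distance measure, not a great-circle distance.
    dist2 = (np.asarray(lat2d) - lat) ** 2 + (np.asarray(lon2d) - lon) ** 2
    row, col = np.unravel_index(dist2.argmin(), dist2.shape)
    return int(row), int(col)

# Synthetic 2-D coordinates standing in for XLAT[0] and XLONG[0].
lon2d, lat2d = np.meshgrid(np.linspace(-76.0, -72.0, 9), np.linspace(3.0, 6.0, 7))
a, b = nearest_grid_point(lat2d, lon2d, 4.691417, -74.209)
```

The resulting a and b can then be used in the same way as above, for example f.variables['T2'][:, a, b].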
