I'm doing some analysis of CESM coupler history files with Python. The metadata is a bit awkward. I'm trying to clean it up with xarray and I'm stumped. A subset of the ncdump output from an example file is:
doma_nx = 640 ;
doma_ny = 320 ;
a2x_nx = 640 ;
a2x_ny = 320 ;
double doma_lat(time, doma_ny, doma_nx) ;
    doma_lat:_FillValue = 1.e+30 ;
    doma_lat:units = "degrees north" ;
    doma_lat:long_name = "latitude" ;
    doma_lat:standard_name = "latitude" ;
    doma_lat:internal_dname = "dom_ax" ;
double doma_lon(time, doma_ny, doma_nx) ;
    doma_lon:_FillValue = 1.e+30 ;
    doma_lon:units = "degrees east" ;
    doma_lon:long_name = "longitude" ;
    doma_lon:standard_name = "longitude" ;
    doma_lon:internal_dname = "dom_ax" ;
double a2x_Faxa_bcphidry(time, a2x_ny, a2x_nx) ;
    a2x_Faxa_bcphidry:_FillValue = 1.e+30 ;
    a2x_Faxa_bcphidry:units = "kg m-2 s-1" ;
    a2x_Faxa_bcphidry:long_name = "Hydrophylic black carbon dry deposition flux" ;
    a2x_Faxa_bcphidry:standard_name = "dry_deposition_flux_of_hydrophylic_black_carbon" ;
    a2x_Faxa_bcphidry:internal_dname = "a2x_ax" ;
I would like to assign the coordinate variables doma_lat and doma_lon as coordinates for the variable a2x_Faxa_bcphidry. When I try the command
var_ctrl.assign_coords({"lon": ds_CTRL["doma_lon"], "lat": ds_CTRL["doma_lat"]})
I get the error message
ValueError: cannot add coordinates with new dimensions to a DataArray
I suspect this is because the dimension names for the variable differ from the dimension names for the coordinate variables, but it's also possible that I'm not understanding the API of assign_coords. Assuming the former, I'd like versions of the doma_lon and doma_lat variables on the a2x_nx and a2x_ny dimensions. I can't figure out how to do that in xarray.
Suggestions on how to proceed?
FYI, the file I'm looking at is located at
/glade/scratch/mvertens/SMS_Vmct_Ld1.TL319_g17.G1850ECOIAF_JRA_PHYS_DEV.cheyenne_intel.pop-ecosys.validate00/run/SMS_Vmct_Ld1.TL319_g17.G1850ECOIAF_JRA_PHYS_DEV.cheyenne_intel.pop-ecosys.validate00.cpl.hi.0001-01-01-03600.nc
Try rename_dims to rename the doma_* dimensions to a2x_*. (rename_dims was contributed by @Julia Kent).
When I try rename_dims on the Dataset, I get the error message
ValueError: Cannot rename doma_nx to a2x_nx because a2x_nx already exists. Try using swap_dims instead.
swap_dims on the Dataset seems to do what I want. I had previously read the docstring for swap_dims
Returns a new DataArray with swapped dimensions
and thought that it wasn't what I wanted. I think of swap as exchanging two labels, while rename seems more like a substitution. I guess swap means different things to different people.
I think it's unfortunate that user code needs to use a different method for rename if the desired name already exists.
That said, thanks for pointing me on a path that led to what works.
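For anyone who finds this thread later, here is a minimal sketch of the sequence described above, run on a small synthetic Dataset rather than the real coupler history file. The array sizes and values are made up; only the variable and dimension names come from the ncdump output. It assumes a reasonably recent xarray in which swap_dims accepts a target name that is a dimension but not a variable.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the coupler history file (tiny grid, made-up values).
ny, nx = 3, 4
lat2d = np.broadcast_to(np.linspace(-60.0, 60.0, ny)[:, None], (ny, nx))
lon2d = np.broadcast_to(np.linspace(0.0, 270.0, nx)[None, :], (ny, nx))
ds = xr.Dataset(
    {
        "doma_lat": (("time", "doma_ny", "doma_nx"), lat2d[None].copy()),
        "doma_lon": (("time", "doma_ny", "doma_nx"), lon2d[None].copy()),
        "a2x_Faxa_bcphidry": (("time", "a2x_ny", "a2x_nx"), np.zeros((1, ny, nx))),
    }
)

new_dims = {"doma_nx": "a2x_nx", "doma_ny": "a2x_ny"}

# rename_dims refuses because a2x_nx / a2x_ny already exist as dimensions.
try:
    ds.rename_dims(new_dims)
    rename_failed = False
except ValueError:
    rename_failed = True

# swap_dims puts the doma_* variables onto the existing a2x_* dimensions.
ds_swapped = ds.swap_dims(new_dims)

# Now the lat/lon arrays share dimensions with the flux variable,
# so they can be attached as (2D, time-dependent) coordinates.
var = ds_swapped["a2x_Faxa_bcphidry"].assign_coords(
    lon=ds_swapped["doma_lon"], lat=ds_swapped["doma_lat"]
)
print(var.coords)
```

Note that this only works because doma_nx and a2x_nx (and likewise the ny dimensions) have the same length, as they do in the file above.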
I think the core issue here is you're trying to rename to a "dimension name" for which coordinate values already exist. Instead of potentially overwriting values, it raises an error. So ds.drop_vars(["doma_nx", "doma_ny"]).rename_dims should work. swap_dims is also a good solution.
There aren't any variables named doma_nx or doma_ny in the Dataset, just dimensions. So ds.drop_vars(["doma_nx", "doma_ny"]) generates the error
ValueError: One or more of the specified variables cannot be found in this dataset
The metadata in this file is awkward/lacking enough that xarray doesn't detect/deduce that any variables are coordinates, except for time.
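To illustrate the distinction between dimensions and variables, here is a small sketch on a synthetic Dataset (made-up sizes and values; the time coordinate is added by hand to mimic the real file). It lists the dimensions that have no variable of the same name, which is why drop_vars cannot find them, and then uses set_coords as one possible way to mark the lat/lon variables as coordinates explicitly.

```python
import numpy as np
import xarray as xr

# Synthetic Dataset: doma_nx / doma_ny are dimensions only, time is a coordinate.
ds = xr.Dataset(
    {
        "doma_lat": (("time", "doma_ny", "doma_nx"), np.zeros((1, 3, 4))),
        "doma_lon": (("time", "doma_ny", "doma_nx"), np.zeros((1, 3, 4))),
    },
    coords={"time": [0.0]},
)

# Dimensions with no same-named variable: nothing for drop_vars to drop.
dim_only = {name for name in ds.sizes if name not in ds.variables}
print(dim_only)

# Only time is treated as a coordinate until we say otherwise.
coords_before = set(ds.coords)

# set_coords promotes existing data variables to (non-index) coordinates.
ds_marked = ds.set_coords(["doma_lat", "doma_lon"])
coords_after = set(ds_marked.coords)
print(coords_before, coords_after)
```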
Ah sorry, I got confused between doma_lon (for which there is a variable) and doma_nx.
No worries, swap_dims works.
Last updated: May 16 2025 at 17:14 UTC