Hi, I'm throwing out this question in case it's helpful to someone else. I have an xarray dataset with a variable that has coordinates (time, lat, lon). The variable also has attributes associated with it, i.e. ds.<var>.attrs returns a non-empty dictionary. When I multiply the variable by some weights and assign the result to a new variable (ds2 = ds.<var> * ds_wts), it seems the attrs are dropped, but I need to keep them. Is this expected behavior? Is there some way to preserve metadata across computations?
Maybe the problem is that there are two xarray datasets, so it's not automatic to decide which dataset's attributes get kept. Perhaps I need to assign to the dataset variable directly, instead of assigning to a new dataset.
So I tried the following, which did not work:
ds2 = ds
ds2['var'] = ds['var'] * ds_wts
I am taking advantage of the named coordinates for the weights, with coordinate 'lat', to be applied to ds2['var'], which has coordinates (time, lat, lon).
I think it will work to copy over the attrs explicitly, I just thought there was a more elegant or automatic way for metadata to be preserved:
ds2['var'].attrs = ds['var'].attrs
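A minimal sketch of that explicit copy, using made-up stand-in data (the array shapes, the "units"/"long_name" attributes, and the cosine-of-latitude weights are assumptions for illustration, not from the original dataset):

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for the real dataset: a (time, lat, lon) variable with attrs.
lat = [-45.0, 0.0, 45.0]
ds = xr.Dataset(
    {
        "var": (
            ("time", "lat", "lon"),
            np.ones((2, 3, 4)),
            {"units": "K", "long_name": "temperature"},
        )
    },
    coords={"lat": lat},
)

# Hypothetical weights along the named "lat" dimension only.
ds_wts = xr.DataArray(np.cos(np.deg2rad(lat)), dims="lat", coords={"lat": lat})

# Copy the dataset so the original is untouched (plain `ds2 = ds` only aliases it).
ds2 = ds.copy()
ds2["var"] = ds["var"] * ds_wts  # broadcasts over time and lon by dimension name

# Copy the metadata over explicitly; dict() avoids sharing one dict between both variables.
ds2["var"].attrs = dict(ds["var"].attrs)

print(ds2["var"].attrs)
```

Whether the product carries attrs before the explicit copy may depend on the xarray version and its keep_attrs setting, so the copy is the part that does not rely on defaults.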
...And I believe this page answers my question: xarray does not preserve metadata for many of its operations: http://xarray.pydata.org/en/stable/faq.html#what-is-your-approach-to-metadata
In some instances, you can do something like
ds.var.values = ds.var * ds_wts
and the metadata will be preserved. You would not want to do this with dask arrays.
Unfortunately this pattern of assigning to .values is the cause of many recent bugs in esmlab. It should absolutely not be done for "dimension coordinates" but that distinction is hard to remember.
Instead consider using:
ds["var"] = ds.var.copy(data=ds.var * ds_wts)
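Here is a rough sketch of the copy(data=...) pattern on made-up data (the shapes, the "units" attribute, and the weights are assumptions). It uses ds["var"] rather than ds.var, since attribute access on a Dataset can collide with method names, and it passes the raw array via .data, which is a cautious choice rather than a requirement I am sure of:

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in data with attrs on the variable.
lat = [-45.0, 0.0, 45.0]
var = xr.DataArray(
    np.ones((2, 3, 4)),
    dims=("time", "lat", "lon"),
    coords={"lat": lat},
    attrs={"units": "K"},
)
ds = xr.Dataset({"var": var})
ds_wts = xr.DataArray(np.cos(np.deg2rad(lat)), dims="lat", coords={"lat": lat})

# Compute the product, then make sure its dimension order matches the original,
# because copy(data=...) needs data of the same shape as the array being copied.
weighted = (ds["var"] * ds_wts).transpose(*ds["var"].dims)

# copy(data=...) keeps dims, coords and attrs of the original and swaps in new values.
ds["var"] = ds["var"].copy(data=weighted.data)

print(ds["var"].attrs)
```

Unlike assigning to .values, this builds a new variable instead of mutating the existing one in place.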
Another possibility is:
with xr.set_options(keep_attrs=True):
    ds["var"] = ds.var * ds_wts
but I don't think that flag has been implemented on binary operations yet.
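As far as I can tell, more recent xarray releases do honor keep_attrs for arithmetic, so a sketch like the following should keep the left operand's attrs there. The data, the "units" attribute, and the weights are made up for illustration, and the weights deliberately carry no attrs of their own so there is nothing to conflict:

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in data with attrs on the variable.
lat = [-45.0, 0.0, 45.0]
var = xr.DataArray(
    np.ones((2, 3, 4)),
    dims=("time", "lat", "lon"),
    coords={"lat": lat},
    attrs={"units": "K"},
)
ds = xr.Dataset({"var": var})

# Weights without any attrs, so only the variable's metadata is in play.
ds_wts = xr.DataArray(np.cos(np.deg2rad(lat)), dims="lat", coords={"lat": lat})

# Ask xarray to keep attrs for operations inside this block
# (assumes an xarray version where binary operations respect the option).
with xr.set_options(keep_attrs=True):
    ds["var"] = ds["var"] * ds_wts

print(ds["var"].attrs)
```

If the installed version ignores the option for binary operations, the explicit attrs copy or the copy(data=...) approach remains the fallback.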
Thanks for your ideas. I will give them a try. I thought for a while about suggesting that xarray should give precedence to the metadata from the "left" operand. For example:
ds["var"] = ds["var"] * ds_wts
...would keep the metadata from ds["var"] unchanged. But I suppose it is not clear whether the metadata from ds_wts should be added in if it does not conflict, or if it should be left out by default. It seems there is no automatic, intuitive way to handle metadata with math operations, and the best approach is probably to implement the flag "keep_attrs" to let the user decide.
Last updated: May 16 2025 at 17:14 UTC