I'm using xarray 0.16.2 and zarr 2.6.1.
I am trying to save Zarr stores created by merging Xarray datasets from separate NetCDF files, using intake-esm's to_dataset_dict() function. The default behavior of the merge operation seems to be to preserve metadata that is identical across all source datasets. However, I'm trying to control the value of a metadata attribute that has conflicting values across the source datasets, like this:
    del ds['time'].attrs['calendar']
    ds['time'].attrs['calendar'] = 'gregorian'
    ds.to_zarr("/path/to/store", consolidated=True)
I get the following error message:
Failed to write /glade/scratch/bonnland/na-cordex/zarr/tas.eval.day.NAM-22i.raw.zarr: failed to prevent overwriting existing key calendar in attrs. This is probably an encoding field used by xarray to describe how a variable is serialized. To proceed, remove this key from the variable's attributes manually.
Is there a way to control such metadata parameters that I might be missing? Thanks for any help.
After some digging into the xarray code, it appears that for xarray datasets with datetime64 time values, the "units" and "calendar" attributes on the time coordinate cannot be set or changed. These apparently are critical to interpreting the time axis values. But the values for "units" and "calendar" do not show up in the list of attributes for ds['time'], so I'm not sure exactly what's going on.
I think both units and calendar are in ds["time"].encoding
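A minimal sketch of where those two keys end up after CF decoding. The toy dataset is made up for illustration (it is not the NA-CORDEX data from the original post), and it assumes a reasonably recent xarray where xr.decode_cf behaves as described here:

```python
import numpy as np
import xarray as xr

# Hypothetical raw dataset, as it might look straight off disk before decoding:
# integer offsets plus CF "units" and "calendar" stored as plain attributes.
raw = xr.Dataset(
    coords={
        "time": (
            "time",
            np.arange(3),
            {"units": "days since 2000-01-01", "calendar": "standard"},
        )
    }
)

# Decoding converts the integers to datetime64 values and moves the two
# CF keys out of .attrs and into .encoding.
decoded = xr.decode_cf(raw)

print(decoded["time"].attrs)     # "units" and "calendar" should no longer be listed here
print(decoded["time"].encoding)  # they should appear here instead
```

This would explain why the keys do not show up in ds['time'].attrs even though xarray still knows about them.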
OK, but the error message seems misleading then. Also, I suspect users won't typically look in ds["time"].encoding to understand what kind of calendar they are using. Or maybe I'm not aware of the conventions...
I think the error message is actually pretty informative: "This is probably an encoding field". Though I didn't realize it was referring to da.encoding until I started playing around on my own :) Also, I'm a little confused about how you mention you want to "control the value of a metadata attribute that has conflicting values across the source datasets". Didn't this come up at an Xdev meeting a while back? What do you expect to happen when you create a single dataset from many different datasets using different time axes?
When properly combined, the merged dataset has time axis values from the union of all datasets. This is the Gregorian calendar in most cases I have.
Thanks for your note about the message. I take it that ds['time'].attrs is not the conventional home for "units" and "calendar", which is news to me.
It must be time-specific, maybe when it recognizes that it is reading a datetime or cftime object? I know units is typically in da.attrs for other fields
Yes, I can set attributes for any coordinate or variable other than the time dimension, which I now know I should just leave alone. Thanks for your insights...
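For anyone who does need to control the calendar that gets written, here is a hedged sketch of the approach the thread points toward: put the value in .encoding rather than .attrs. The make_ds helper and the toy time axis are hypothetical, and encode_cf_variable from xarray.conventions is used only to exercise the encoding step without writing a store; this has not been checked against the xarray 0.16.2 setup from the original post.

```python
import pandas as pd
import xarray as xr
from xarray.conventions import encode_cf_variable


def make_ds():
    # Hypothetical stand-in for a merged dataset with a datetime64 time axis.
    times = pd.date_range("2000-01-01", periods=3, freq="D")
    return xr.Dataset(coords={"time": times})


# Reproduce the collision: "calendar" in .attrs clashes with the value the
# encoder wants to write during serialization.
bad = make_ds()
bad["time"].attrs["calendar"] = "gregorian"
try:
    encode_cf_variable(bad["time"].variable, name="time")
    collided = False
except ValueError as err:
    collided = True
    print(err)

# Workaround: keep "calendar" out of .attrs and request it through .encoding.
good = make_ds()
good["time"].attrs.pop("calendar", None)
good["time"].encoding["calendar"] = "gregorian"
encoded = encode_cf_variable(good["time"].variable, name="time")
print(encoded.attrs)
```

With the calendar set through .encoding like this, the ds.to_zarr(...) call from the original question should be able to proceed, since the encoder then writes "calendar" into the stored attributes itself.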
Last updated: May 16 2025 at 17:14 UTC