Stream: general

Topic: Problem with reading in zarr files


view this post on Zulip Judith Berner (Jun 15 2021 at 21:18):

I am running a script that worked recently. Now I get an error when reading in zarr files (netcdf is OK). Is that related to the recent issues, or is there a fix? Unfortunately, I don't understand the error message.

view this post on Zulip Max Grover (Jun 15 2021 at 21:18):

What error are you getting?

view this post on Zulip Judith Berner (Jun 15 2021 at 21:19):

pasted image

view this post on Zulip Judith Berner (Jun 15 2021 at 21:19):

pasted image

view this post on Zulip Judith Berner (Jun 15 2021 at 21:20):

There must be a better way sharing code than pasting screenshots...

view this post on Zulip Max Grover (Jun 15 2021 at 21:23):

Looks like that Zarr file you are trying to read in might be missing its consolidated metadata... any thoughts @Anderson Banihirwe ? https://stackoverflow.com/questions/66144743/getting-keyerror-zmetadata-when-opening-remote-zarr-store
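
One quick way to test this on a local, directory-based store is to look for the `.zmetadata` file, which is the key named in the linked `KeyError` question. The following is a minimal sketch using only the standard library. The helper name `has_consolidated_metadata` is made up for illustration, and the check assumes a Zarr v2 directory store on a regular filesystem (not a zip or remote store):

```python
import os


def has_consolidated_metadata(store_path):
    """Return True if the directory store has a .zmetadata file at its root.

    Assumption: a Zarr v2 directory store, where consolidated metadata
    is written as a single '.zmetadata' file at the top level.
    """
    return os.path.isfile(os.path.join(store_path, ".zmetadata"))
```

If this returns `False` for a store, opening it with `consolidated=True` would be expected to fail in the way described here, and `consolidated=False` would be the thing to try.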

view this post on Zulip Judith Berner (Jun 15 2021 at 21:31):

I tried a number of different zarr files, restarted kernel etc. must be something more fundamental.

view this post on Zulip Max Grover (Jun 15 2021 at 21:32):

Can you share a path to one of the zarr files?

view this post on Zulip Judith Berner (Jun 15 2021 at 21:35):

#hinda2 = xr.open_zarr("/glade/campaign/cesm/development/cross-wg/S2S/jaye/CESM2.S2S.tas_2m.anoms.zarr/"glade/campaign/cesm/development/cross-wg/S2S/jaye
#hinda1 = xr.open_zarr("/glade/campaign/cesm/development/cross-wg/S2S/jaye/CESM1.S2S.tas_2m.anoms.zarr/", consolidated=True)
#hindaw = xr.open_zarr("/glade/campaign/cesm/development/cross-wg/S2S/jaye/WACCM.S2S.tas_2m.anoms.zarr/", consolidated=True)

view this post on Zulip Deepak Cherian (Jun 15 2021 at 22:04):

Does consolidated=False help?

view this post on Zulip Anderson Banihirwe (Jun 15 2021 at 22:26):

@Judith Berner, Deepak's suggestion should address your issue:

In [1]: import xarray as xr

In [2]: xr.open_zarr("/glade/campaign/cesm/development/cross-wg/S2S/jaye/WACCM.S2S.tas_2m.anoms.zarr/"
   ...: , consolidated=False)
Out[2]:
<xarray.Dataset>
Dimensions:    (init: 655, lat: 181, lead: 46, lon: 360, member: 5)
Coordinates:
    dayofyear  (init) int64 dask.array<chunksize=(211,), meta=np.ndarray>
  * init       (init) object 1999-01-04 00:00:00 ... 2020-10-26 00:00:00
  * lat        (lat) float32 -90.0 -89.0 -88.0 -87.0 ... 87.0 88.0 89.0 90.0
  * lead       (lead) int64 0 1 2 3 4 5 6 7 8 9 ... 37 38 39 40 41 42 43 44 45
  * lon        (lon) float32 0.0 1.0 2.0 3.0 4.0 ... 356.0 357.0 358.0 359.0
  * member     (member) int64 1 2 3 4 5
Data variables:
    TAS        (member, init, lead, lat, lon) float32 dask.array<chunksize=(5, 655, 1, 112, 90), meta=np.ndarray>

view this post on Zulip Deepak Cherian (Jun 15 2021 at 22:27):

we should open an issue about raising a more useful error message, possibly in zarr. WDYT Anderson

view this post on Zulip Anderson Banihirwe (Jun 15 2021 at 22:36):

@Judith Berner, if you want to use the consolidated=True option in the future, ensure consolidated is set to True when saving the data: ds.to_zarr(...., consolidated=True)

Considering how common this error/issue is, it's reasonable for xarray to be less strict here:

try:
    ds = xr.open_zarr(..., consolidated=True)
except KeyError:
    ds = xr.open_zarr(..., consolidated=False)

# Raise an exception if still unable to read the zarr store

xarray already does something along these lines when decoding calendars using cftime and pandas in xr.open_dataset(...., use_cftime=None)
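
Until something like this lands in xarray itself, the same fallback can be wrapped in a small user-side helper. This is only a sketch: `open_with_fallback` is a hypothetical name, not an xarray or zarr function, and it assumes (as in the error discussed here) that missing consolidated metadata surfaces as a `KeyError`. The opener is passed in as an argument so the logic does not depend on any particular library version:

```python
def open_with_fallback(opener, store, **kwargs):
    """Try consolidated metadata first, then retry without it.

    `opener` is any callable that accepts a `consolidated` keyword,
    for example xr.open_zarr. Hypothetical helper for illustration.
    """
    try:
        return opener(store, consolidated=True, **kwargs)
    except KeyError:
        # Assumption: a missing '.zmetadata' key shows up as KeyError.
        # If the second attempt also fails, its exception propagates
        # to the caller unchanged.
        return opener(store, consolidated=False, **kwargs)
```

It would be used as `ds = open_with_fallback(xr.open_zarr, "path/to/store.zarr")`, where the path is a placeholder. Catching only `KeyError` keeps unrelated failures, such as a wrong path or a permissions problem, visible instead of silently retrying.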

view this post on Zulip Anderson Banihirwe (Jun 15 2021 at 22:37):

@Deepak Cherian, it turns out that Stephan already has a proposal for this: https://github.com/pydata/xarray/issues/5251 :slight_smile:

view this post on Zulip Deepak Cherian (Jun 15 2021 at 22:38):

and a PR! but I meant raising a nicer error message from zarr, saying that "consolidated metadata do not exist. Please pass consolidated=False".

view this post on Zulip Anderson Banihirwe (Jun 15 2021 at 22:43):

Oooh I see... That would be nice.. I will open an issue upstream

view this post on Zulip Judith Berner (Jun 16 2021 at 00:27):

@Abby Jaye


Last updated: Jan 30 2022 at 12:01 UTC