Stream: ESP-SMYLE

Topic: Data issues


Anderson Banihirwe (Apr 22 2021 at 22:29):

While building a catalog for the SMYLE output, I found a few problematic/corrupted files. Is there an established process for reporting data issues?

Here's a list of some corrupted files:

For instance:

$ ncdump -h /glade/campaign/cesm/development/espwg/SMYLE/archive/b.e21.BSMYLE.f09_g17.1972-11.007/ocn/proc/tseries/month_1/b.e21.BSMYLE.f09_g17.1972-11.007.pop.h.J_DIC.197211-197410.nc

results in

NetCDF: Unknown file format
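A cheap first pass for finding files like this across a whole archive is to check the leading bytes of each file against the signatures of the netCDF classic formats and the HDF5 container used by netCDF-4. The sketch below is only an illustration: the names `looks_like_netcdf` and `find_suspect_files` are hypothetical, and the check is much weaker than `ncdump -h` (a file with an intact signature but damaged contents would still pass).

```python
from pathlib import Path

# Leading bytes of the on-disk formats that netCDF files normally start with.
NETCDF_SIGNATURES = (
    b"CDF\x01",            # netCDF classic
    b"CDF\x02",            # netCDF 64-bit offset
    b"CDF\x05",            # netCDF 64-bit data (CDF-5)
    b"\x89HDF\r\n\x1a\n",  # HDF5 container used by netCDF-4
)


def looks_like_netcdf(path):
    """Return True if the file starts with a known netCDF/HDF5 signature."""
    try:
        with open(path, "rb") as f:
            head = f.read(8)
    except OSError:
        # Unreadable files are treated as suspect.
        return False
    return head.startswith(NETCDF_SIGNATURES)


def find_suspect_files(root, pattern="*.nc"):
    """List files under `root` matching `pattern` that fail the signature check."""
    return sorted(p for p in Path(root).rglob(pattern) if not looks_like_netcdf(p))
```

Anything this flags would still need to be confirmed with `ncdump -h` or by opening it with a netCDF library.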

Stephen Yeager (Apr 22 2021 at 22:58):

@Anderson Banihirwe Thanks for this info. There is no established process, but this is a good venue for reporting issues. I'll email Nan & Gary.

Anderson Banihirwe (Apr 22 2021 at 23:15):

Sounds good...


Not sure if this was intentional or not, but the time units for most files under the glc component appear to be invalid (as far as cftime and xarray are concerned).

Here's the error from xarray:

ValueError: unable to decode time units 'common_year since 0000-01-01 0:0:0' with "calendar 'noleap'". Try opening your dataset with decode_times=False or installing cftime if it is not installed.

$ ncdump -h /glade/campaign/cesm/development/espwg/SMYLE/archive/b.e21.BSMYLE.f09_g17.2000-11.015/glc/proc/tseries/year_1/b.e21.BSMYLE.f09_g17.2000-11.015.cism.h.artm.2001-2002.nc
netcdf b.e21.BSMYLE.f09_g17.2000-11.015.cism.h.artm.2001-2002 {
dimensions:
        time = UNLIMITED ; // (2 currently)
        level = 11 ;
        lithoz = 20 ;
        staglevel = 10 ;
        stagwbndlevel = 12 ;
        x0 = 415 ;
        x1 = 416 ;
        y0 = 703 ;
        y1 = 704 ;

....

  double time(time) ;
                time:long_name = "Model time" ;
                time:standard_name = "time" ;
                time:units = "common_year since 0000-01-01 0:0:0" ;
                time:calendar = "noleap" ;
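If the file is opened with `decode_times=False`, as the error message suggests, the time values come back as raw floats in units of `common_year`. A minimal sketch of interpreting them by hand follows, assuming a `common_year` is exactly 365 days (its UDUNITS definition) and that the reference date is January 1 at 00:00. The function names are hypothetical and the code does not depend on xarray or cftime.

```python
import math

# Month lengths in the CF 'noleap' (365_day) calendar.
DAYS_PER_MONTH = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
DAYS_PER_COMMON_YEAR = 365


def common_years_to_days(value):
    """Convert an offset in common_years to days (1 common_year = 365 days)."""
    return value * DAYS_PER_COMMON_YEAR


def noleap_date_from_common_years(value, ref_year=0):
    """Return (year, month, day) for `value` common_years since ref_year-01-01.

    Any fractional day is dropped; the rounding guards against float noise.
    """
    total_days = math.floor(round(common_years_to_days(value), 6))
    year, day_of_year = divmod(total_days, DAYS_PER_COMMON_YEAR)
    month = 1
    for length in DAYS_PER_MONTH:
        if day_of_year < length:
            break
        day_of_year -= length
        month += 1
    return ref_year + year, month, day_of_year + 1
```

For example, `noleap_date_from_common_years(2001.0)` gives `(2001, 1, 1)`. The same factor of 365 could in principle be used to rewrite the time variable as "days since 0000-01-01" before decoding, but that route is untested here.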

Nan Rosenbloom (Apr 22 2021 at 23:47):

Steve. The original data from these runs no longer exist. These runs will need to be recreated to replace these files.

Nan Rosenbloom (Apr 22 2021 at 23:51):

@Anderson Banihirwe I don't think this is intentional, but I just checked the CMIP6 timeseries and I see the same time dimensionality:
/glade/campaign/collections/cmip/CMIP6/timeseries-cmip6/b.e21.B1850cmip6.f09_g17.DAMIP-hist-sol.002/glc/proc/tseries/year_1/b.e21.B1850cmip6.f09_g17.DAMIP-hist-sol.002.cism.h.artm.1851-1900.nc

Anderson Banihirwe (Sep 30 2021 at 20:28):

Not sure if this was intentional or not, but the time units for most files under the glc component appear to be invalid (as far as cftime and xarray are concerned).

For anyone who runs into this calendar issue when decoding time with the units 'common_year since 0000-01-01 0:0:0': it has been resolved in cftime v1.5.1.

In [3]: path = "/glade/campaign/cesm/development/espwg/SMYLE/archive/b.e21.BSMYLE.f09_g17.2000-11.015/glc/proc/tseries/year_1/b.e21.BSMYLE.f09_g17.2000-11.015.cism.h.artm.2001-2002.nc"

In [4]: ds = xr.open_dataset(path, use_cftime=True, chunks={})

In [7]: ds.time.encoding
Out[7]:
{'zlib': True,
 'shuffle': True,
 'complevel': 1,
 'fletcher32': False,
 'contiguous': False,
 'chunksizes': (512,),
 'source': '/glade/campaign/cesm/development/espwg/SMYLE/archive/b.e21.BSMYLE.f09_g17.2000-11.015/glc/proc/tseries/year_1/b.e21.BSMYLE.f09_g17.2000-11.015.cism.h.artm.2001-2002.nc',
 'original_shape': (2,),
 'dtype': dtype('float64'),
 'units': 'common_year since 0000-01-01 0:0:0',
 'calendar': 'noleap'}
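Since the fix depends on the cftime release, a script that reads these files could check the installed version up front. The sketch below assumes the 1.5.1 threshold from the post above; `parse_version` and `cftime_handles_common_year` are hypothetical helper names, and the parser deliberately handles only plain dotted numeric versions.

```python
from importlib import metadata

# Release that, per the post above, decodes 'common_year' units.
MIN_CFTIME = (1, 5, 1)


def parse_version(text):
    """Turn '1.5.1' or '1.5.1.post1' into a tuple of its leading integer parts."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def cftime_handles_common_year(version=None):
    """Return True if `version` (default: the installed cftime) is >= 1.5.1."""
    if version is None:
        try:
            version = metadata.version("cftime")
        except metadata.PackageNotFoundError:
            # cftime is not installed at all.
            return False
    return parse_version(version) >= MIN_CFTIME
```

When this returns False, falling back to `decode_times=False` (as the xarray error message suggests) is one option.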

Last updated: May 16 2025 at 17:14 UTC