Hi all! Using dask and xr.open_mfdataset, I want to load CESM2-LE monthly DIC output from 185001-201412. What is an efficient way to do this? Previously I only needed two decades, so I loaded one decade at a time and concatenated them. Thanks in advance for any suggestions and/or scripts you can share.
Hi Holly, I'm assuming you're already using parallel=True?
Hi! Yes. My script is this:
n_members='DIC'
ds_members_DIC=xr.open_mfdataset(
    DIR+n_members+'/b.e21.BHISTsmbb.f09_g17.LE2-1251.*.pop.h.DIC.185001-185912.nc',
    chunks={'nlat': 64, 'nlon': 80},
    concat_dim='member_id',
    combine='nested',
    # join='override',
    coords='minimal', # uses the coordinates from the first file
    # data_vars='minimal',
    autoclose=True,
    decode_times=False,
    parallel=True)
I responded on the discussion in #dask > Calling multiple decadal output datasets: CESM2-LE before I saw this more specific example. I think adding compat='override' to the argument list should speed up the read quite a bit.
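For the full 185001-201412 range, here is a rough sketch of one way to do it for a single member, not a tested recipe for the actual CESM2-LE files. It assumes the historical output is split into decadal files with a shorter final file (201001-201412), and that the file names follow the pattern in the script above. The helper names (decade_ranges, member_paths, open_member) and the member_prefix argument are made up for illustration.

```python
def decade_ranges(first_year=1850, last_year=2014):
    """Build 'YYYYMM-YYYYMM' stamps, assuming one file per decade.

    Assumption: files start every 10 years and the last file is
    truncated at last_year (e.g. '201001-201412').
    """
    ranges = []
    for start in range(first_year, last_year + 1, 10):
        end = min(start + 9, last_year)
        ranges.append(f"{start}01-{end}12")
    return ranges


def member_paths(directory, member_prefix, variable="DIC"):
    """List one member's files in time order.

    member_prefix is hypothetical: the case name including the member
    suffix that the '*' wildcard matched in the original script.
    """
    return [
        f"{directory}/{member_prefix}.pop.h.{variable}.{stamp}.nc"
        for stamp in decade_ranges()
    ]


def open_member(paths):
    """Lazily open one member's decadal files, concatenated along time."""
    import xarray as xr  # imported here so the path helpers work without xarray

    return xr.open_mfdataset(
        paths,
        combine="nested",
        concat_dim="time",
        compat="override",    # skip equality checks on non-concatenated variables
        coords="minimal",
        data_vars="minimal",  # only concatenate variables that have a time dimension
        chunks={"nlat": 64, "nlon": 80},
        decode_times=False,
        parallel=True,
    )
```

Passing an explicit, ordered list of files rather than a glob keeps the time order unambiguous. To stack several members as well, combine='nested' also accepts a list of lists together with concat_dim=['member_id', 'time'], though I have not tried that on these files.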
Last updated: May 16 2025 at 17:14 UTC