Stream: dask

Topic: Trouble saving data xarray to nc/zarr using dask


Dylan Oldenburg (Feb 24 2025 at 16:38):

Hello, I am having trouble saving small amounts of data (~30 kB) to nc/zarr when using dask:

When I run
output_path = '/glade/derecho/scratch/oldend/%s_UOHC_leads.nc'
ens_ts.to_netcdf(output_path, unlimited_dims=['cycle'])

I receive

/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/distributed/client.py:3162: UserWarning: Sending large graph of size 29.59 MiB.
This may cause some slowdown.
Consider scattering data ahead of time and using futures.
warnings.warn(

KilledWorker Traceback (most recent call last)
Cell In[10], line 135
133 print('saving lead zarr')
134 output_path = '/glade/derecho/scratch/oldend/%s_UOHC_leads.nc'
--> 135 ens_ts.to_netcdf(output_path, unlimited_dims=['cycle'])
136 print('ens ts saved')
137 #### Regrid OBS

File ~/anaconda3/envs/derecho/lib/python3.12/site-packages/xarray/core/dataarray.py:4086, in DataArray.to_netcdf(self, path, mode, format, group, engine, encoding, unlimited_dims, compute, invalid_netcdf)
4082 else:
4083 # No problems with the name - so we're fine!
4084 dataset = self.to_dataset()
-> 4086 return to_netcdf( # type: ignore # mypy cannot resolve the overloads:(
4087 dataset,
4088 path,
4089 mode=mode,
4090 format=format,
4091 group=group,
4092 engine=engine,
4093 encoding=encoding,
4094 unlimited_dims=unlimited_dims,
4095 compute=compute,
4096 multifile=False,
4097 invalid_netcdf=invalid_netcdf,
4098 )

File ~/anaconda3/envs/derecho/lib/python3.12/site-packages/xarray/backends/api.py:1324, in to_netcdf(dataset, path_or_file, mode, format, group, engine, encoding, unlimited_dims, compute, multifile, invalid_netcdf)
1321 if multifile:
1322 return writer, store
-> 1324 writes = writer.sync(compute=compute)
1326 if isinstance(target, BytesIO):
1327 store.sync()

File ~/anaconda3/envs/derecho/lib/python3.12/site-packages/xarray/backends/common.py:256, in ArrayWriter.sync(self, compute, chunkmanager_store_kwargs)
253 if chunkmanager_store_kwargs is None:
254 chunkmanager_store_kwargs = {}
--> 256 delayed_store = chunkmanager.store(
257 self.sources,
258 self.targets,
259 lock=self.lock,
260 compute=compute,
261 flush=True,
262 regions=self.regions,
263 **chunkmanager_store_kwargs,
264 )
265 self.sources = []
266 self.targets = []

File ~/anaconda3/envs/derecho/lib/python3.12/site-packages/xarray/core/daskmanager.py:233, in DaskManager.store(self, sources, targets, **kwargs)
225 def store(
226 self,
227 sources: DaskArray | Sequence[DaskArray],
228 targets: Any,
229 **kwargs,
230 ):
231 from dask.array import store
--> 233 return store(
234 sources=sources,
235 targets=targets,
236 **kwargs,
237 )

File ~/anaconda3/envs/derecho/lib/python3.12/site-packages/dask/array/core.py:1236, in store(failed resolving arguments)
1234 elif compute:
1235 store_dsk = HighLevelGraph(layers, dependencies)
-> 1236 compute_as_if_collection(Array, store_dsk, map_keys, **kwargs)
1237 return None
1239 else:

File ~/anaconda3/envs/derecho/lib/python3.12/site-packages/dask/base.py:406, in compute_as_if_collection(cls, dsk, keys, scheduler, get, **kwargs)
404 schedule = get_scheduler(scheduler=scheduler, cls=cls, get=get)
405 dsk2 = optimization_function(cls)(dsk, keys, **kwargs)
--> 406 return schedule(dsk2, keys, **kwargs)

File ~/anaconda3/envs/derecho/lib/python3.12/site-packages/distributed/client.py:3280, in Client.get(self, dsk, keys, workers, allow_other_workers, resources, sync, asynchronous, direct, retries, priority, fifo_timeout, actors, **kwargs)
3278 should_rejoin = False
3279 try:
-> 3280 results = self.gather(packed, asynchronous=asynchronous, direct=direct)
3281 finally:
3282 for f in futures.values():

File ~/anaconda3/envs/derecho/lib/python3.12/site-packages/distributed/client.py:2383, in Client.gather(self, futures, errors, direct, asynchronous)
2380 local_worker = None
2382 with shorten_traceback():
-> 2383 return self.sync(
2384 self._gather,
2385 futures,
2386 errors=errors,
2387 direct=direct,
2388 local_worker=local_worker,
2389 asynchronous=asynchronous,
2390 )

File ~/anaconda3/envs/derecho/lib/python3.12/site-packages/distributed/client.py:2243, in Client._gather(self, futures, errors, direct, local_worker)
2241 exc = CancelledError(key)
2242 else:
-> 2243 raise exception.with_traceback(traceback)
2244 raise exc
2245 if errors == "skip":

KilledWorker: Attempted to run task ('rechunk-merge-5b0135762bd66e79864c3b31aad9a747', 39, 0, 0, 0, 0, 0) on 4 different workers, but all those workers died while running it. The last worker that attempt to run the task was tcp://128.117.208.62:37601. Inspecting worker logs is often a good next step to diagnose what went wrong. For more information see https://distributed.dask.org/en/stable/killed.html.

I've attached some worker logs. I'm really not sure what is going on. If I try to save to zarr I get the same problem (or other errors). I have tried adding .load(), to no avail.

3872229.casper-pbs.ER
3872198.casper-pbs.ER

Thanks,

Dylan
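
The UserWarning above is worth chasing on its own: the result is only ~30 kB, but the graph being shipped to the scheduler is ~30 MiB, which usually means either a very large number of tasks or large in-memory arrays embedded in the graph. A rough way to check (a sketch, assuming ens_ts is the lazy DataArray from the snippet above):

# Compare the size of the expected result with the number of tasks in the graph.
print(f"{ens_ts.nbytes / 1e3:.1f} kB of result data")
print(f"{len(ens_ts.__dask_graph__())} tasks in the graph")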

Michael Levy (Feb 24 2025 at 19:15):

How are you creating ens_ts? I wonder if the problem is in the setup of the dataset, and the to_netcdf call is the first time dask is trying to run through the whole graph. If you have xr.open_mfdataset() or xr.concat() calls, using compat='override' and join='override' will remove the consistency checks that verify all the coordinate arrays are the same across every file / dataset.
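
A minimal sketch of what that could look like (the file pattern below is a placeholder, not a path from this thread):

import glob
import xarray as xr

# Hypothetical input pattern; the real file paths are not shown in the thread.
files = sorted(glob.glob("/glade/derecho/scratch/oldend/some_input_*.nc"))

ds = xr.open_mfdataset(
    files,
    combine="by_coords",
    compat="override",  # take non-concatenated variables from the first file instead of comparing them
    join="override",    # skip index alignment checks across files
)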

Dylan Oldenburg (Feb 24 2025 at 19:41):

Thanks. I was using xr.open_mfdataset (already with compat='override' and join='override') as well as xr.concat (without those options). I've added ens_ts0 = xr.concat([ens_ts0,ens_ts],dim='L',compat="override",join="override") but now I receive this error when doing xr.concat:


ValueError Traceback (most recent call last)
Cell In[8], line 127
125 ens_time_year = dp2_time.isel(L=lvals+i).mean('L')
126 ens_ts = ens_ts.assign_coords(time=("time",ens_time_year.data)).expand_dims('L')
--> 127 ens_ts0 = xr.concat([ens_ts0,ens_ts],dim='L',compat="override",join="override")
128 ens_ts = ens_ts0.copy()
129 del ens_ts0

File ~/anaconda3/envs/derecho/lib/python3.12/site-packages/xarray/core/concat.py:264, in concat(objs, dim, data_vars, coords, compat, positions, fill_value, join, combine_attrs, create_index_for_new_dim)
259 raise ValueError(
260 f"compat={compat!r} invalid: must be 'broadcast_equals', 'equals', 'identical', 'no_conflicts' or 'override'"
261 )
263 if isinstance(first_obj, DataArray):
--> 264 return _dataarray_concat(
265 objs,
266 dim=dim,
267 data_vars=data_vars,
268 coords=coords,
269 compat=compat,
270 positions=positions,
271 fill_value=fill_value,
272 join=join,
273 combine_attrs=combine_attrs,
274 create_index_for_new_dim=create_index_for_new_dim,
275 )
276 elif isinstance(first_obj, Dataset):
277 return _dataset_concat(
278 objs,
279 dim=dim,
(...)
287 create_index_for_new_dim=create_index_for_new_dim,
288 )

File ~/anaconda3/envs/derecho/lib/python3.12/site-packages/xarray/core/concat.py:755, in _dataarray_concat(arrays, dim, data_vars, coords, compat, positions, fill_value, join, combine_attrs, create_index_for_new_dim)
752 arr = arr.rename(name)
753 datasets.append(arr._to_temp_dataset())
--> 755 ds = _dataset_concat(
756 datasets,
757 dim,
758 data_vars,
759 coords,
760 compat,
761 positions,
762 fill_value=fill_value,
763 join=join,
764 combine_attrs=combine_attrs,
765 create_index_for_new_dim=create_index_for_new_dim,
766 )
768 merged_attrs = merge_attrs([da.attrs for da in arrays], combine_attrs)
770 result = arrays[0]._from_temp_dataset(ds, name)

File ~/anaconda3/envs/derecho/lib/python3.12/site-packages/xarray/core/concat.py:545, in _dataset_concat(datasets, dim, data_vars, coords, compat, positions, fill_value, join, combine_attrs, create_index_for_new_dim)
539 datasets = [
540 ds.expand_dims(dim_name, create_index_for_new_dim=create_index_for_new_dim)
541 for ds in datasets
542 ]
544 # determine which variables to concatenate
--> 545 concat_over, equals, concat_dim_lengths = _calc_concat_over(
546 datasets, dim_name, dim_names, data_vars, coords, compat
547 )
549 # determine which variables to merge, and then merge them according to compat
550 variables_to_merge = (coord_names | data_names) - concat_over

File ~/anaconda3/envs/derecho/lib/python3.12/site-packages/xarray/core/concat.py:440, in _calc_concat_over(datasets, dim, dim_names, data_vars, coords, compat)
437 concat_over.update(opt)
439 process_subset_opt(data_vars, "data_vars")
--> 440 process_subset_opt(coords, "coords")
441 return concat_over, equals, concat_dim_lengths

File ~/anaconda3/envs/derecho/lib/python3.12/site-packages/xarray/core/concat.py:350, in _calc_concat_over.<locals>.process_subset_opt(opt, subset)
348 if opt == "different":
349 if compat == "override":
--> 350 raise ValueError(
351 f"Cannot specify both {subset}='different' and compat='override'."
352 )
353 # all nonindexes that are not the same in each dataset
354 for k in getattr(datasets[0], subset):

ValueError: Cannot specify both coords='different' and compat='override'.

Dylan Oldenburg (Feb 24 2025 at 19:42):

This is how ens_ts is computed:
lvals = np.arange(nyear)
lvalsda = xr.DataArray(np.arange(nleads),dims="L",name="L")
for i in range(nleads):
    if i==range(nleads)[0]:
        ens_ts0 = ohc_anom.isel(L=lvals+i).mean('L').rename({'Y':'time'}).chunk(dict(time=-1))
        ens_time_year = dp2_time.isel(L=lvals+i).mean('L')
        ens_ts0 = ens_ts0.assign_coords(time=("time",ens_time_year.data)).expand_dims('L')
    else:
        ens_ts = ohc_anom.isel(L=lvals+i).mean('L').rename({'Y':'time'}).chunk(dict(time=-1))
        ens_time_year = dp2_time.isel(L=lvals+i).mean('L')
        ens_ts = ens_ts.assign_coords(time=("time",ens_time_year.data)).expand_dims('L')
        ens_ts0 = xr.concat([ens_ts0,ens_ts],dim='L',compat="override",join="override")
ens_ts = ens_ts0.copy()
del ens_ts0
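
A possible restructuring (a sketch using the same names, ohc_anom, dp2_time, nyear and nleads, and not tested on the actual data): collect one DataArray per lead and concatenate once at the end, rather than growing the result with a pairwise xr.concat inside the loop. A single concat generally produces a flatter, smaller dask graph.

import numpy as np
import xarray as xr

lvals = np.arange(nyear)
pieces = []
for i in range(nleads):
    ts = ohc_anom.isel(L=lvals + i).mean('L').rename({'Y': 'time'}).chunk(dict(time=-1))
    time_year = dp2_time.isel(L=lvals + i).mean('L')
    # label the averaged window with its mean time and add the lead dimension
    ts = ts.assign_coords(time=("time", time_year.data)).expand_dims('L')
    pieces.append(ts)

ens_ts = xr.concat(pieces, dim='L', coords="minimal", compat="override", join="override")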

Michael Levy (Feb 24 2025 at 21:07):

I've added ens_ts0 = xr.concat([ens_ts0,ens_ts],dim='L',compat="override",join="override") but now I receive this error when doing xr.concat:

Can you add coords="minimal" as well? You can read about these options on the xr.concat() documentation page, but I think this change (combined with compat="override") will make xarray assume that the coordinates are the same across all ensemble members.
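
For concreteness, the suggested call inside the loop would look something like this (same variables as in the snippet above; a sketch, not a verified fix):

ens_ts0 = xr.concat(
    [ens_ts0, ens_ts],
    dim='L',
    coords="minimal",   # only concatenate coords that already have the 'L' dimension; don't compare the rest
    compat="override",  # take the remaining variables from the first object instead of checking equality
    join="override",
)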

Dylan Oldenburg (Feb 24 2025 at 21:24):

That does make xr.concat() work, but when I then try to do ens_ts.to_netcdf(output_path) I get:

CancelledError Traceback (most recent call last)
Cell In[7], line 138
135 print('saving lead')
137 output_path = '/glade/derecho/scratch/oldend/%s_UOHC_leads.nc' % model
--> 138 ens_ts.to_netcdf(output_path)#, unlimited_dims=['cycle'])
140 print('ens ts saved')
141 #### Regrid OBS

File ~/anaconda3/envs/derecho/lib/python3.12/site-packages/xarray/core/dataarray.py:4211, in DataArray.to_netcdf(self, path, mode, format, group, engine, encoding, unlimited_dims, compute, invalid_netcdf, auto_complex)
4207 else:
4208 # No problems with the name - so we're fine!
4209 dataset = self.to_dataset()
-> 4211 return to_netcdf( # type: ignore[return-value] # mypy cannot resolve the overloads:(
4212 dataset,
4213 path,
4214 mode=mode,
4215 format=format,
4216 group=group,
4217 engine=engine,
4218 encoding=encoding,
4219 unlimited_dims=unlimited_dims,
4220 compute=compute,
4221 multifile=False,
4222 invalid_netcdf=invalid_netcdf,
4223 auto_complex=auto_complex,
4224 )

File ~/anaconda3/envs/derecho/lib/python3.12/site-packages/xarray/backends/api.py:1882, in to_netcdf(dataset, path_or_file, mode, format, group, engine, encoding, unlimited_dims, compute, multifile, invalid_netcdf, auto_complex)
1879 if multifile:
1880 return writer, store
-> 1882 writes = writer.sync(compute=compute)
1884 if isinstance(target, BytesIO):
1885 store.sync()

File ~/anaconda3/envs/derecho/lib/python3.12/site-packages/xarray/backends/common.py:351, in ArrayWriter.sync(self, compute, chunkmanager_store_kwargs)
348 if chunkmanager_store_kwargs is None:
349 chunkmanager_store_kwargs = {}
--> 351 delayed_store = chunkmanager.store(
352 self.sources,
353 self.targets,
354 lock=self.lock,
355 compute=compute,
356 flush=True,
357 regions=self.regions,
358 **chunkmanager_store_kwargs,
359 )
360 self.sources = []
361 self.targets = []

File ~/anaconda3/envs/derecho/lib/python3.12/site-packages/xarray/namedarray/daskmanager.py:247, in DaskManager.store(self, sources, targets, **kwargs)
239 def store(
240 self,
241 sources: Any | Sequence[Any],
242 targets: Any,
243 **kwargs: Any,
244 ) -> Any:
245 from dask.array import store
--> 247 return store(
248 sources=sources,
249 targets=targets,
250 **kwargs,
251 )

File ~/anaconda3/envs/derecho/lib/python3.12/site-packages/dask/array/core.py:1236, in store(failed resolving arguments)
1234 elif compute:
1235 store_dsk = HighLevelGraph(layers, dependencies)
-> 1236 compute_as_if_collection(Array, store_dsk, map_keys, **kwargs)
1237 return None
1239 else:

File ~/anaconda3/envs/derecho/lib/python3.12/site-packages/dask/base.py:402, in compute_as_if_collection(cls, dsk, keys, scheduler, get, **kwargs)
400 schedule = get_scheduler(scheduler=scheduler, cls=cls, get=get)
401 dsk2 = optimization_function(cls)(dsk, keys, **kwargs)
--> 402 return schedule(dsk2, keys, **kwargs)

File ~/anaconda3/envs/derecho/lib/python3.12/site-packages/distributed/client.py:3279, in Client.get(self, dsk, keys, workers, allow_other_workers, resources, sync, asynchronous, direct, retries, priority, fifo_timeout, actors, **kwargs)
3277 should_rejoin = False
3278 try:
-> 3279 results = self.gather(packed, asynchronous=asynchronous, direct=direct)
3280 finally:
3281 for f in futures.values():

File ~/anaconda3/envs/derecho/lib/python3.12/site-packages/distributed/client.py:2372, in Client.gather(self, futures, errors, direct, asynchronous)
2369 local_worker = None
2371 with shorten_traceback():
-> 2372 return self.sync(
2373 self._gather,
2374 futures,
2375 errors=errors,
2376 direct=direct,
2377 local_worker=local_worker,
2378 asynchronous=asynchronous,
2379 )

File ~/anaconda3/envs/derecho/lib/python3.12/site-packages/distributed/client.py:2233, in Client._gather(self, futures, errors, direct, local_worker)
2231 else:
2232 raise exception.with_traceback(traceback)
-> 2233 raise exc
2234 if errors == "skip":
2235 bad_keys.add(key)

CancelledError: ('store-map-0b438090245098763a0c364fab4fea00', 0, 5, 0)

Jemma Jeffree (Feb 25 2025 at 22:40):

A couple of thoughts that might help debug:
— When you say adding a .load() doesn't fix the problem, does the .load() on its own throw the error, or does the error come from saving to netCDF after loading? This might narrow down whether it's a netCDF/saving problem or an issue with the dask graph.
— Does halving what you're computing or using smaller chunk sizes help? In the extreme case, is there some trivial, smallest subset of what you're trying to calculate that does work? Or, in the other direction, what happens if you double or triple the memory but keep the number of workers the same?
— Can you open the dask dashboard and watch the memory levels or task stream just before it crashes? Does it do some computing, then more or less stop while one worker's memory skyrockets?

Based on the error you're receiving and what you've described, it seems likely to me that this is not a problem directly related to saving, but instead an error when loading/calculating the thing you want to save. In that case, whatever is causing the error could be code from well before the line that throws it, which embeds instructions into the dask graph that are only evaluated later.
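
A sketch of how the first two checks might look in practice (ens_ts and output_path as in the thread; the subset indices are arbitrary placeholders):

# If even a tiny subset fails to load, the problem is in the upstream graph, not the write.
small = ens_ts.isel(L=slice(0, 1), time=slice(0, 2))
small.load()

# Separate building the write from executing it, to see which step breaks.
delayed = ens_ts.to_netcdf(output_path, compute=False)
delayed.compute()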

Dylan Oldenburg (Feb 26 2025 at 00:39):

Doing the load() by itself throws an error:


CancelledError Traceback (most recent call last)
Cell In[19], line 1
----> 1 ens_ts = ens_ts.load()

File ~/anaconda3/envs/derecho/lib/python3.12/site-packages/xarray/core/dataarray.py:1175, in DataArray.load(self, **kwargs)
1155 def load(self, **kwargs) -> Self:
1156 """Manually trigger loading of this array's data from disk or a
1157 remote source into memory and return this array.
1158
(...)
1173 dask.compute
1174 """
-> 1175 ds = self._to_temp_dataset().load(**kwargs)
1176 new = self._from_temp_dataset(ds)
1177 self._variable = new._variable

File ~/anaconda3/envs/derecho/lib/python3.12/site-packages/xarray/core/dataset.py:899, in Dataset.load(self, **kwargs)
896 chunkmanager = get_chunked_array_type(*lazy_data.values())
898 # evaluate all the chunked arrays simultaneously
--> 899 evaluated_data: tuple[np.ndarray[Any, Any], ...] = chunkmanager.compute(
900 *lazy_data.values(), **kwargs
901 )
903 for k, data in zip(lazy_data, evaluated_data, strict=False):
904 self.variables[k].data = data

File ~/anaconda3/envs/derecho/lib/python3.12/site-packages/xarray/namedarray/daskmanager.py:85, in DaskManager.compute(self, *data, **kwargs)
80 def compute(
81 self, *data: Any, **kwargs: Any
82 ) -> tuple[np.ndarray[Any, _DType_co], ...]:
83 from dask.array import compute
---> 85 return compute(*data, **kwargs)

File ~/anaconda3/envs/derecho/lib/python3.12/site-packages/dask/base.py:661, in compute(traverse, optimize_graph, scheduler, get, *args, **kwargs)
658 postcomputes.append(x.__dask_postcompute__())
660 with shorten_traceback():
--> 661 results = schedule(dsk, keys, **kwargs)
663 return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])

File ~/anaconda3/envs/derecho/lib/python3.12/site-packages/distributed/client.py:2233, in Client._gather(self, futures, errors, direct, local_worker)
2231 else:
2232 raise exception.with_traceback(traceback)
-> 2233 raise exc
2234 if errors == "skip":
2235 bad_keys.add(key)

CancelledError: ('transpose-964b93e84eb977cb7290c56ee6ec09a5', 0, 4, 6)

The dashboard shows a lot of activity during the earlier parts of the code, but when it gets to the to_netcdf part it just stays like this, and it remains that way before and after the error shows up (see attached).
Capture-décran-2025-02-25-à-16.31.42.png
Capture-décran-2025-02-25-à-16.30.39.png
Capture-décran-2025-02-25-à-16.30.18.png
The workers no longer crash, but it throws this strange CancelledError now. I've attached some worker logs.
3890166.casper-pbs.ER
3890165.casper-pbs.ER
3890161.casper-pbs.ER

Dylan Oldenburg (Feb 26 2025 at 00:45):

The "CancelledError" is the new problem. Previously it seemed like there was a memory overload for the workers, but now that does not seem to be the problem.

Dylan Oldenburg (Feb 26 2025 at 00:46):

The errors changed when I updated all my packages and my conda version. I used to be able to plot these arrays, and load() used to work, letting me access the values. Now none of it works at all.

Dylan Oldenburg (Feb 26 2025 at 00:53):

Also, when the CancelledError appears, an error shows up below the cell where I call the dask client. It is too long to include in full here, but it includes:

2025-02-25 17:42:54,603 - distributed.protocol.core - CRITICAL - Failed to Serialize
Traceback (most recent call last):
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/distributed/protocol/core.py", line 109, in dumps
frames[0] = msgpack.dumps(msg, default=_encode_default, use_bin_type=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/__init__.py", line 35, in packb
return Packer(**kwargs).pack(o)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/fallback.py", line 885, in pack
self._pack(obj)
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/fallback.py", line 861, in _pack
self._pack(obj[i], nest_limit - 1)
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/fallback.py", line 864, in _pack
return self._pack_map_pairs(
^^^^^^^^^^^^^^^^^^^^^
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/fallback.py", line 970, in _pack_map_pairs
self._pack(v, nest_limit - 1)
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/fallback.py", line 861, in _pack
self._pack(obj[i], nest_limit - 1)
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/fallback.py", line 819, in _pack
n = len(obj) * obj.itemsize
^^^^^^^^
TypeError: 0-dim memory has no length

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/distributed/protocol/core.py", line 130, in dumps
frames[0] = msgpack.dumps(
^^^^^^^^^^^^^^
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/__init__.py", line 35, in packb
return Packer(**kwargs).pack(o)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/fallback.py", line 885, in pack
self._pack(obj)
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/fallback.py", line 861, in _pack
self._pack(obj[i], nest_limit - 1)
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/fallback.py", line 864, in _pack
return self._pack_map_pairs(
^^^^^^^^^^^^^^^^^^^^^
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/fallback.py", line 970, in _pack_map_pairs
self._pack(v, nest_limit - 1)
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/fallback.py", line 861, in _pack
self._pack(obj[i], nest_limit - 1)
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/fallback.py", line 819, in _pack
n = len(obj) * obj.itemsize
^^^^^^^^
TypeError: 0-dim memory has no length
2025-02-25 17:42:54,608 - distributed.comm.utils - ERROR - 0-dim memory has no length
Traceback (most recent call last):
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/distributed/protocol/core.py", line 109, in dumps
frames[0] = msgpack.dumps(msg, default=_encode_default, use_bin_type=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/__init__.py", line 35, in packb
return Packer(**kwargs).pack(o)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/fallback.py", line 885, in pack
self._pack(obj)
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/fallback.py", line 861, in _pack
self._pack(obj[i], nest_limit - 1)
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/fallback.py", line 864, in _pack
return self._pack_map_pairs(
^^^^^^^^^^^^^^^^^^^^^
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/fallback.py", line 970, in _pack_map_pairs
self._pack(v, nest_limit - 1)
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/fallback.py", line 861, in _pack
self._pack(obj[i], nest_limit - 1)
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/fallback.py", line 819, in _pack
n = len(obj) * obj.itemsize
^^^^^^^^
TypeError: 0-dim memory has no length

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/distributed/comm/utils.py", line 34, in _to_frames
return list(protocol.dumps(msg, **kwargs))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/distributed/protocol/core.py", line 130, in dumps
frames[0] = msgpack.dumps(
^^^^^^^^^^^^^^
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/__init__.py", line 35, in packb
return Packer(**kwargs).pack(o)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/fallback.py", line 885, in pack
self._pack(obj)
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/fallback.py", line 861, in _pack
self._pack(obj[i], nest_limit - 1)
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/fallback.py", line 864, in _pack
return self._pack_map_pairs(
^^^^^^^^^^^^^^^^^^^^^
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/fallback.py", line 970, in _pack_map_pairs
self._pack(v, nest_limit - 1)
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/fallback.py", line 861, in _pack
self._pack(obj[i], nest_limit - 1)
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/fallback.py", line 819, in _pack
n = len(obj) * obj.itemsize
^^^^^^^^
TypeError: 0-dim memory has no length
2025-02-25 17:42:54,611 - distributed.batched - ERROR - Error in batched write
Traceback (most recent call last):
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/distributed/protocol/core.py", line 109, in dumps
frames[0] = msgpack.dumps(msg, default=_encode_default, use_bin_type=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/__init__.py", line 35, in packb
return Packer(**kwargs).pack(o)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/fallback.py", line 885, in pack
self._pack(obj)
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/fallback.py", line 861, in _pack
self._pack(obj[i], nest_limit - 1)
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/fallback.py", line 864, in _pack
return self._pack_map_pairs(
^^^^^^^^^^^^^^^^^^^^^
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/fallback.py", line 970, in _pack_map_pairs
self._pack(v, nest_limit - 1)
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/fallback.py", line 861, in _pack
self._pack(obj[i], nest_limit - 1)
File "/glade/u/home/oldend/anaconda3/envs/derecho/lib/python3.12/site-packages/msgpack/fallback.py", line 819, in _pack
n = len(obj) * obj.itemsize
^^^^^^^^
TypeError: 0-dim memory has no length

It seems like basically every single part of this code is failing.

Dylan Oldenburg (Feb 26 2025 at 04:15):

It turns out the problem was with my Python environment: when I switch to the built-in NPL 2025a conda environment, everything works. I have no idea why.
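
For anyone hitting something similar, one quick (hypothetical) way to pin down what differs is to print the versions of the packages involved in the failures in both the personal environment and NPL 2025a:

# Compare these between the broken env and NPL 2025a; mismatched dask/distributed/msgpack
# versions are a common cause of serialization errors like the one above.
import dask, distributed, msgpack, netCDF4, xarray

for mod in (dask, distributed, msgpack, netCDF4, xarray):
    print(f"{mod.__name__:12s} {mod.__version__}")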

Dylan Oldenburg (Feb 26 2025 at 04:16):

Well, that's not quite true: doing ens_ts = ens_ts.load() gives me a netCDF HDF error, but I can save the file.

