Stream: python-questions

Topic: Excessive time combining variables with same dimensions


Andrew Shao (Sep 16 2021 at 16:07):

This is more of a question about whether the problem is on my end or whether others have run into it as well.

Basically, with some combinations of intake, intake-esm, and xarray versions, I run into problems when combining variables that all share the same dimensions/coordinates. It looks like the combine step for some reason thinks that the coords aren't exactly identical and so starts broadcasting like crazy. Has anyone run into this before, and/or does anyone know whether this is a known bug that has since been resolved?

For example, this problem occurs with

intake        0.5.5
intake-esm    2020.3.16.2
xarray        0.16

but not

intake        0.6.3
intake-esm    2021.8.17
xarray        0.19.0

Andrew Shao (Sep 16 2021 at 16:08):

Here's a quick code snippet:

import intake
import xarray as xr

cat_url = "/space/hall4/sitestore/eccc/crd/CMIP6/final/canesm_final.json"

col = intake.open_esm_datastore(cat_url)

# Three monthly variables from the same model, experiment, and member
query = dict(
    variable_id=['tas', 'pr', 'psl'],
    table_id='Amon',
    source_id='CanESM5',
    experiment_id='historical',
    member_id='r1i1p2f1',
)
cat = col.search(**query)
# Combining the variables into datasets is the step that takes excessive time
dset_dict = cat.to_dataset_dict()

Andrew Shao (Sep 16 2021 at 16:10):

Note: you can point cat_url to whatever your favorite CMIP6 catalogue is.

Matt Long (Sep 16 2021 at 16:42):

Not sure how to explain the version dependencies, but sometimes there are roundoff-level differences in coordinates that can trip up the xarray combine.
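
To illustrate the kind of roundoff issue described here, the following is a minimal sketch using synthetic data rather than the CMIP6 files from the thread. The variable names, the 5-point latitude grid, and the 1e-12 perturbation are all made up for the example; it only shows that an outer-join merge treats nearly identical coordinate values as distinct labels.

```python
import numpy as np
import xarray as xr

# Hypothetical latitude axis and a copy that differs only at roundoff level
lat = np.linspace(-90.0, 90.0, 5)
lat_noisy = lat.copy()
lat_noisy[1:-1] += 1e-12  # made-up perturbation on the three interior points

tas = xr.Dataset({"tas": ("lat", np.ones(5))}, coords={"lat": lat})
pr = xr.Dataset({"pr": ("lat", np.ones(5))}, coords={"lat": lat_noisy})

# The coordinates agree to within floating-point tolerance ...
close = bool(np.allclose(lat, lat_noisy))

# ... but an outer join matches labels exactly, so the merged axis is the
# union of both sets of values and each variable is padded with NaN.
merged = xr.merge([tas, pr], join="outer", compat="no_conflicts")
n_lat = merged.sizes["lat"]
n_missing_tas = int(merged["tas"].isnull().sum())
print(close, n_lat, n_missing_tas)
```

On a full latitude/longitude grid the same effect would inflate every mismatched axis, which is one plausible reading of the "broadcasting like crazy" symptom above.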

Andrew Shao (Sep 16 2021 at 18:02):

@Matt Long Thanks. I do suspect that you're right about that. If I use preprocess to overwrite the coordinates, it works as expected.
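
The thread does not show the preprocess function that was actually used, so the following is only a sketch of one way such a workaround might look. The helper `snap_coords`, the choice of a reference dataset, and the synthetic data are assumptions made for illustration.

```python
import numpy as np
import xarray as xr


def snap_coords(ds, reference, names=("lat", "lon")):
    """Hypothetical helper: overwrite coords in ds with the reference values
    when the two agree to within floating-point tolerance."""
    for name in names:
        if name not in ds.coords or name not in reference.coords:
            continue
        ours, theirs = ds[name].values, reference[name].values
        if ours.shape == theirs.shape and np.allclose(ours, theirs):
            ds = ds.assign_coords({name: (ds[name].dims, theirs)})
    return ds


# Synthetic stand-ins for two variables whose coordinates differ by roundoff
lat = np.linspace(-90.0, 90.0, 5)
tas = xr.Dataset({"tas": ("lat", np.ones(5))}, coords={"lat": lat})
pr = xr.Dataset({"pr": ("lat", np.ones(5))}, coords={"lat": lat + 1e-12})

# After snapping, the coordinates are bit-identical, so even a strict
# exact join succeeds and the axis keeps its original length.
pr_fixed = snap_coords(pr, reference=tas)
merged = xr.merge([tas, pr_fixed], join="exact")
print(merged.sizes["lat"])
```

With intake-esm, a single-argument wrapper such as `functools.partial(snap_coords, reference=ref_ds)` could presumably be passed as the `preprocess` argument of `to_dataset_dict`, where `ref_ds` is whichever dataset you pick as the coordinate reference; check the signature in your installed version first.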


Last updated: Jan 30 2022 at 12:01 UTC