Is there any status update on this development (issue #7)?
@Yassir Eddebbar, @Riley Brady, @Keith Lindsay and @Deepak Cherian have been following this issue. I think we've foundered a bit, but would be good to assess and revive.
a bunch of the pieces are almost there :slight_smile:
cc @Anna-Lena Deppenmeier
Hi all, @Yassir Eddebbar and I have been working on a notebook that closes, for example, the temperature budget with xarray only, with xgcm, and with xgcm metrics. I would say what works now is closing the temperature budget with xgcm; it does not use metrics optimally yet, and the xarray-only version still has some problems.
I could turn the xgcm version into a function and move it to pop-tools.
it does need `DZU`, which as far as I understand cannot be calculated from the released version of pop-tools right now?!
wow, much further along than I thought! nice!
yeah it's POP specific, though. Which is fine for pop-tools, I guess
Hi all; glad this is making some headway.
I was looking to flesh out some convenience variables from the `get_grid()` function so we could really easily wrangle this with xgcm (`DZU`, `DZT`, `HU`, `HT`, `KMU`). I could use some feedback on https://github.com/NCAR/pop-tools/pull/54 and we need to get https://github.com/NCAR/pop-tools/pull/44 merged. We might not need every one of those variables, but if I remember correctly `DZU` and `DZT` are needed.
Also some thoughts on https://github.com/NCAR/pop-tools/issues/56 would be useful -- getting `DZBC` could help us close budgets for runs with partial bottom cells. Although maybe that is a step we take in a future PR. We also haven't done anything to deal with the overflow parameterization that @Matt Long meticulously deals with in his NCL code, but again this should be dealt with after the first iteration of this code.
There's a lot of discussion on my original purely `xarray` implementation here that might help in making the updated version work well: https://github.com/NCAR/pop-tools/pull/12. I closed it in favor of giving code/feedback to @Yassir Eddebbar and @Anna-Lena Deppenmeier on doing a purely `xgcm` implementation.
Here's a demo from @Yassir Eddebbar and @Anna-Lena Deppenmeier closing the O2 budget using purely `xgcm`: https://nbviewer.jupyter.org/gist/Eddebbar/cb30e0e4a3151b2900fb49648e1e50c8. Yassir and I had talked about just hosting these as examples on the `pop-tools` docs so folks could follow along themselves. I still think it'd be valuable to have a function in the package. The best advice I can give from our discussion on my PR is to use YAML files to denote the variables specifically needed for each tracer. For instance, iron needs `IRON_FLUX`, O2 needs `STF_O2`, and DIC needs `FG_CO2`. We can have an exhaustive list of what is needed for each tracer.
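A minimal sketch of what that per-tracer lookup could look like. Everything here other than the variable names mentioned above (`IRON_FLUX`, `STF_O2`, `FG_CO2`) is hypothetical: the schema, the `TRACER_BUDGET_VARS` table, and the helper name are illustration only; in the package the table would presumably live in a YAML file (next to `pop_grid_definitions.yaml`) and be loaded with `yaml.safe_load`.

```python
# Hypothetical sketch of a per-tracer budget-variable table. Only the
# variable names (IRON_FLUX, STF_O2, FG_CO2) come from the discussion;
# the schema and the helper are made up for illustration.
TRACER_BUDGET_VARS = {
    "Fe": {"surface_flux": "IRON_FLUX"},
    "O2": {"surface_flux": "STF_O2"},
    "DIC": {"surface_flux": "FG_CO2"},
}

def required_budget_vars(tracer):
    """Return the tracer-specific variables a budget function would look for."""
    if tracer not in TRACER_BUDGET_VARS:
        raise ValueError(f"no budget definition for tracer {tracer!r}")
    return TRACER_BUDGET_VARS[tracer]

print(required_budget_vars("O2")["surface_flux"])  # STF_O2
```

A budget function could then fail early with a clear message when a required output variable is missing from the run, instead of erroring mid-computation.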
Another minor issue is that budget terms are not all available as outputs in many runs, e.g. for BGC in the high-res runs, which prevents demonstrating full budget closure for now. E.g. the gist posted above by @Riley Brady is missing the `HDIF` terms, which likely explains the lack of budget closure.
@Anna-Lena Deppenmeier, a pop_tools function would be awesome, happy to help with the BGC implementation. A couple of other differences for BGC vs heat budgets are the lack of a `QSW_3D` term for BGC budgets, and handling different units for the various BGC terms (with YAML files too?), both manageable I think.
@Yassir Eddebbar, @Stephen Yeager's FOSI run under the CESM-DPLE directory has a lot of the terms. That's what I based my original closure demos and code on in the previous PR. See /glade/p/cesm/community/CESM-DPLE/CESM-DPLE_POPCICEhindcast
Here's a demo YAML file: https://github.com/NCAR/pop-tools/blob/master/pop_tools/pop_grid_definitions.yaml. I would envision the headers being the given variable name, then the sub-sections could be a list of required/expected budget variables (could make some optional based on run output) and other booleans as needed.
I still think if we can advance discussion on `get_grid()` returning some of the other grid diagnostics/metrics, we should be able to implement this pretty smoothly.
> Yassir Eddebbar, Stephen Yeager's FOSI run under the CESM-DPLE directory has a lot of the terms. That's what I based my original closure demos and code on in the previous PR. See /glade/p/cesm/community/CESM-DPLE/CESM-DPLE_POPCICEhindcast
Excellent. I'll test on those!
@Anna-Lena Deppenmeier @Yassir Eddebbar Would you be willing to share your notebook that closes the POP temperature budget using xarray and xgcm?
sure, I am just running it now to make sure it works properly and will share it then
cc @Max Grover
Not sure if of interest, but I have an [O2] budget version with major input from @Anna-Lena Deppenmeier as well; need to rerun and clean it up too...
Currently my budget is not closing. Not sure why, this is the same code that closed a while ago, before a bunch of updates. Am looking into it and hopefully can close it again soon.
Hi @Stephen Yeager I have something that works, do you currently work on the CGD system or casper/cheyenne?
Great! Prefer something that works on casper. Thank you very much for debugging and sharing!
ok, I'm currently running it on cgd. I will put it on casper and make sure it works, then I'll let you know where to find it.
Anna you should upload it to pop-tools!
The notebook itself? Or change it into a function?
The notebook. almost 0 effort :slight_smile:
ok, so here come the problems. So far it's not running on casper -- I assume something to do with xgcm. This is the error I get when I try to convert my dataset `ds` into an xgcm-compatible dataset:
```python
# here we get the xgcm compatible dataset
gridx, dsx = pop_tools.to_xgcm_grid_dataset(ds)
```

```
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-6-c629003e2721> in <module>
      1 # here we get the xgcm compatible dataset
----> 2 gridx, dsx = pop_tools.to_xgcm_grid_dataset(ds)
      3
      4 # make sure we have the cell volumne for calculations
      5 dsx["cell_volume"] = dsx.DZT * dsx.DXT * dsx.DYT

/glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages/pop_tools/xgcm_util.py in to_xgcm_grid_dataset(ds, **kwargs)
    202
    203     try:
--> 204         import xgcm
    205     except ImportError:
    206         raise ImportError(

/glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages/xgcm/__init__.py in <module>
      4 del get_versions
      5
----> 6 from .grid import Grid, Axis
      7 from .autogenerate import generate_grid_ds

/glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages/xgcm/grid.py in <module>
    757
    758
--> 759 class Grid:
    760     """
    761     An object with multiple :class:`xgcm.Axis` objects representing different

/glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages/xgcm/grid.py in Grid()
   1065         return out
   1066
-> 1067     @docstrings.dedent
   1068     def _apply_vector_function(self, function, vector, **kwargs):
   1069         # the keys, should be axis names

/glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages/docrep/decorators.py in update_docstring(self, *args, **kwargs)
     41             return func(self, *args, **kwargs)
     42         elif len(args) and callable(args[0]):
---> 43             doc = func(self, args[0].__doc__, *args[1:], **kwargs)
     44             _set_object_doc(args[0], doc, py2_class=self.python2_classes)
     45             return args[0]

/glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages/docrep/__init__.py in dedent(self, s, stacklevel)
    532             encountering an invalid key in the string
    533         """
--> 534         s = inspect.cleandoc(s)
    535         return safe_modulo(s, self.params, stacklevel=stacklevel)
    536

/glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/inspect.py in cleandoc(doc)
    617     onwards is removed."""
    618     try:
--> 619         lines = doc.expandtabs().split('\n')
    620     except UnicodeError:
    621         return None

AttributeError: 'NoneType' object has no attribute 'expandtabs'
```
I have tried updating my environment and making sure the relevant packages are the same version, but I fail w.r.t. `pop-tools` and `xgcm`.
```
conda_analysis_casper.txt:pop-tools  2020.2.17.post6  pypi_0        pypi
conda_dcpy_andre.txt:pop-tools       2020.12.15       pyhd8ed1ab_0  conda-forge
```
When I run `conda update -c conda-forge pop-tools` it stays the same; when I try `pip install git+https://github.com/NCAR/pop-tools.git` it gives me this error:
```
(analysis) -bash-4.2$ pip install git+https://github.com/NCAR/pop-tools.git
Collecting git+https://github.com/NCAR/pop-tools.git
  Cloning https://github.com/NCAR/pop-tools.git to /glade/scratch/deppenme/pip-req-build-8etvcn_v
  Running command git clone -q https://github.com/NCAR/pop-tools.git /glade/scratch/deppenme/pip-req-build-8etvcn_v
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing wheel metadata ... done
Requirement already satisfied: dask>=2.14 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from pop-tools==2020.12.15) (2021.2.0)
Requirement already satisfied: pyyaml>=5.3.1 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from pop-tools==2020.12.15) (5.4.1)
Collecting numba>=0.52
  Using cached numba-0.52.0-cp37-cp37m-manylinux2014_x86_64.whl (3.2 MB)
Requirement already satisfied: xarray>=0.16.1 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from pop-tools==2020.12.15) (0.17.1.dev3+g48378c4)
Requirement already satisfied: numpy>=1.17.0 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from pop-tools==2020.12.15) (1.19.5)
Requirement already satisfied: pooch>=1.3.0 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from pop-tools==2020.12.15) (1.3.0)
Requirement already satisfied: setuptools in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from numba>=0.52->pop-tools==2020.12.15) (49.6.0.post20210108)
Collecting llvmlite<0.36,>=0.35.0
  Using cached llvmlite-0.35.0-cp37-cp37m-manylinux2010_x86_64.whl (25.3 MB)
Requirement already satisfied: requests in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from pooch>=1.3.0->pop-tools==2020.12.15) (2.25.1)
Requirement already satisfied: packaging in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from pooch>=1.3.0->pop-tools==2020.12.15) (20.9)
Requirement already satisfied: appdirs in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from pooch>=1.3.0->pop-tools==2020.12.15) (1.4.4)
Requirement already satisfied: pandas>=0.25 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from xarray>=0.16.1->pop-tools==2020.12.15) (1.2.3)
Requirement already satisfied: python-dateutil>=2.7.3 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from pandas>=0.25->xarray>=0.16.1->pop-tools==2020.12.15) (2.8.1)
Requirement already satisfied: pytz>=2017.3 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from pandas>=0.25->xarray>=0.16.1->pop-tools==2020.12.15) (2021.1)
Requirement already satisfied: six>=1.5 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from python-dateutil>=2.7.3->pandas>=0.25->xarray>=0.16.1->pop-tools==2020.12.15) (1.15.0)
Requirement already satisfied: pyparsing>=2.0.2 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from packaging->pooch>=1.3.0->pop-tools==2020.12.15) (2.4.7)
Requirement already satisfied: chardet<5,>=3.0.2 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from requests->pooch>=1.3.0->pop-tools==2020.12.15) (4.0.0)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from requests->pooch>=1.3.0->pop-tools==2020.12.15) (1.26.3)
Requirement already satisfied: certifi>=2017.4.17 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from requests->pooch>=1.3.0->pop-tools==2020.12.15) (2020.12.5)
Requirement already satisfied: idna<3,>=2.5 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from requests->pooch>=1.3.0->pop-tools==2020.12.15) (2.10)
Building wheels for collected packages: pop-tools
  Building wheel for pop-tools (PEP 517) ... done
  Created wheel for pop-tools: filename=pop_tools-2020.12.15-py3-none-any.whl size=30153 sha256=988c711795c5b46037a868957be10d82f56fab4eab935b862cbc8c054f5f7717
  Stored in directory: /glade/scratch/deppenme/pip-ephem-wheel-cache-bdpxkpi5/wheels/65/87/a3/7667dcc7225e5105e95e09186af11e0befb140c2779fc074cf
Successfully built pop-tools
Installing collected packages: llvmlite, numba, pop-tools
  Attempting uninstall: llvmlite
    Found existing installation: llvmlite 0.31.0
ERROR: Cannot uninstall 'llvmlite'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.
```
Anyway, I believe the issue is in principle with xgcm (see the `.grid` problem), so I tried to at least get that to the same version on both systems. Both seem to be installed from `conda-forge`:

```
conda_analysis_casper.txt:xgcm  0.3.0  py_0  conda-forge
conda_dcpy_andre.txt:xgcm       0.5.1  py_0  conda-forge
```

and when I try to update with conda-forge it says it's already installed and doesn't change the version.
Long story short: help please. @Deepak Cherian @Anderson Banihirwe any ideas?
Yes, this is an xgcm version problem. `conda update` should fix it; are you getting some other error in that case?
I keep doing `conda update` and it doesn't change the version.
```
(analysis) -bash-4.2$ conda update xgcm
Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.

(analysis) -bash-4.2$
```
same with `conda update -c conda-forge xgcm`
@Anna-Lena Deppenmeier, I think the issue is coming from an incompatible `docrep` version... Try pinning `docrep` to v0.2.7:

```
conda install -c conda-forge docrep==0.2.7
```
yes, I have come across the docrep problem. It is currently at 0.2.7 (`docrep 0.2.7 pypi_0 pypi`). I am doing the conda install thing now and will then try a `conda update xgcm` again.
it's thinking really hard and currently hangs at
```
(analysis) -bash-4.2$ conda install -c conda-forge docrep==0.2.7
Collecting package metadata (current_repodata.json): done
Solving environment: -
The environment is inconsistent, please check the package plan carefully
The following packages are causing the inconsistency:
  - conda-forge/linux-64::psyplot==1.3.1=py37h89c1867_0
  - conda-forge/noarch::xrft==0.2.3=pyhd3deb0d_0
  - conda-forge/noarch::funcargparse==0.2.3=pyh9f0ad1d_0
failed with initial frozen solve. Retrying with flexible solve.
Solving environment: |
```
what can I do about the inconsistent environment?
ok it took a while but it worked and I was able to update xgcm. Thanks @Anderson Banihirwe ! And while you're here (; I am initializing my cluster like this
```python
# this is for when you do things on casper
import ncar_jobqueue
import dask
import distributed

dask.config.set({'distributed.dashboard.link': '/proxy/{port}/status'})
cluster = ncar_jobqueue.NCARCluster()  # initializes cluster
client = distributed.Client(cluster)
cluster.adapt(minimum=6, maximum=32, wait_count=600)  # gets you 6 workers minimum, max 32
# makes them wait 10 minutes before releasing
client
```
but I can't see my dashboard when I click on the link, what am I doing wrong?
what version of `ncar-jobqueue` are you using?
With the newest version of ncar-jobqueue, you don't need this:

```python
dask.config.set({'distributed.dashboard.link': '/proxy/{port}/status'})
```
'2020.3.4'
so when I remove the line and click on the new link, it wants me to log into JupyterHub, which I am not using in the first place
> it's thinking really hard and currently hangs at
You may find `mamba` to be very useful in this kind of situation: `conda install -c conda-forge mamba  # in your base environment`. Whenever you want to install packages via conda, replace `conda` with `mamba`, e.g. `mamba install -c conda-forge docrep==0.2.7`.
> '2020.3.4'
That's old
Try the latest version, 2021.2.10:

```
conda install -c conda-forge ncar-jobqueue==2021.2.10
```
hm. I did a `conda update --all` at the beginning of the day, because I haven't worked on casper in a while; I wonder why it didn't update it.
Is there a running document somewhere where we keep the current versions? For example, I am mostly working on the CGD system these days, and whenever I switch back to casper I have these issues and have to ask people (partly because `conda update --all` does not seem to take care of these things?)
but anyhow I updated and I still get [pasted image] upon clicking on the link (user_uploads/2/43/qRHGVdyfK9fAaORJ5dlfqOn8/pasted_image.png)
> hm. I did a conda update all in the beginning of the day, because I haven't worked on casper in a while, I wonder why it didn't update it.
the conda solver is very flexible unless you pin down which versions you want. It's hard to control this when you've installed packages via `conda install ...`. One remedy is to curate your environment in an `environment.yml` file and then run `conda env update -f environment.yml` whenever you want to update... (With this approach, it's easy to control which versions get installed.)
> Is there a running documentary somewhere where we keep the current versions? So for example I am mostly working on the cgd system these days, and whenever I switch back to casper I am having these issues and have to ask people (partly because `conda update all` does not seem to take care of these things?)
>
> but anyhow I updated and I still get [pasted image] upon clicking on the link (user_uploads/2/43/qRHGVdyfK9fAaORJ5dlfqOn8/pasted_image.png)
This looks very familiar
This is likely a version mismatch
> hm. I did a conda update all in the beginning of the day, because I haven't worked on casper in a while, I wonder why it didn't update it.
>
> the conda solver is very flexible unless you pin down which versions you want.. It's hard to control this when you've installed packages via `conda install ...` command. one remedy is to curate your environment in an `environment.yml` and then run `conda env update -f environment.yml` command whenever you want to update... (With this approach, it's easy to control which versions get installed)
I don't understand what you mean by curate -- like, do I literally go in and find the version numbers I need? Tbh that doesn't sound very easy to me at all. I thought the idea was to be able to use `conda install` and then get the latest version if I don't specify a version with `==`?
> I don't understand what you mean with curate -- like I literally go in and find the version numbers I need?
For instance, if you were starting with a new environment, you would create a new `environment.yml` file. The original contents of this file might look like this:

```yaml
name: my-new-env
channels:
  - conda-forge
dependencies:
  - python=3.8
  - xarray==0.16.2
  - dask==2.14
```
To create/update this environment, you would run `conda env update -f environment.yml`.
If, say, a week later you decide that you want to upgrade one or more dependencies in this environment, you go back to the `environment.yml` file and update it, e.g.

```yaml
name: my-new-env
channels:
  - conda-forge
dependencies:
  - python=3.8
  - xarray==0.17  # Upgrade to latest version of xarray
  - dask==2.14
```
At this point, you would then run `conda env update -f environment.yml` again.
With this approach, you know which versions of the packages you care about are installed by looking at your `environment.yml` file, and you can easily re-create the environment on the same machine or on another machine with the same type of operating system (Linux, MacOS, etc.). This is what I mean by environment curation.
I hope this clarifies my previous point
Regarding the version mismatch issue, what is the output of

```python
import dask, distributed, bokeh

print(dask.__version__, distributed.__version__, bokeh.__version__)
```

from your notebook?
> I hope this clarifies my previous point
It does, thanks. I am not sure whether I would be able to do that, though. It seems like I would have to be on top of every package version I want to use, which seems like a lot to research / keep in mind. Anyway.
The output is 2021.02.0 2021.02.0 2.3.0
I also just tried pushing my notebook to git and it says I am denied
```
Username for 'https://github.com': ALDepp
Password for 'https://ALDepp@github.com':
remote: Permission to NCAR/pop-tools.git denied to ALDepp.
fatal: unable to access 'https://github.com/NCAR/pop-tools.git/': The requested URL returned error: 403
```
> The output is 2021.02.0 2021.02.0 2.3.0
Try pinning your dask, distributed, bokeh versions to these versions:

```
dask==2.14
distributed==2.14
bokeh==1.4
```
> I also just tried pushing my notebook to git and it says I am denied
I think you want to push to your fork (https://github.com/ALDepp/pop-tools) repo and open a pull request to merge your changes into NCAR/pop-tools
Can you confirm your git repo is pointed to the right remote endpoint?
Try this: `git remote -v`
> I also just tried pushing my notebook to git and it says I am denied
>
> I think you want to push to your fork (https://github.com/ALDepp/pop-tools) repo and open a pull request to merge your changes into NCAR/pop-tools
yes, thank you (:
> git remote -v
```
origin   https://github.com/ALDepp/pop-tools.git (fetch)
origin   https://github.com/ALDepp/pop-tools.git (push)
upstream https://github.com/NCAR/pop-tools.git (fetch)
upstream https://github.com/NCAR/pop-tools.git (push)
```
> The output is 2021.02.0 2021.02.0 2.3.0
>
> Try pinning your dask, distributed, bokeh versions to these versions: `dask==2.14 distributed==2.14 bokeh==1.4`
I have not been able to pin bokeh to 1.4. It took a very long time, gave me a very long list of conflicting packages, and in the end it says: `Your installed version is: 2.17`
Could you point me to the location of this conda environment?
/glade/work/deppenme/miniconda3/envs/analysis
I am wondering if it could be helpful to users not deeply familiar with the POP grid to add an illustration/visualization in @Anna-Lena Deppenmeier's notebook of where the various terms sit in the grid cell and what the various terms mean. This is what I found most challenging in creating the budget terms... I started on something a few months ago in Illustrator that can be edited for this purpose (needs QA/QC review for accuracy):
POP_Grid.png
yes please! very cool image
if you have html/svg skills you could add this to xgcm: https://github.com/xgcm/xgcm/issues/276
Would be good to add to pop-tools as well?
@Yassir Eddebbar I think that would be useful. Are v.t and u.t consistent in this figure? one seems to be on the corner and one on the face
@Yassir Eddebbar is it okay if I add this to the example notebook (since casper is up again I aim to make changes and resubmit today)
> Yassir Eddebbar is it okay if I add this to the example notebook (since casper is up again I aim to make changes and resubmit today)
Yes, as long as it looks ok by you and others accuracy wise, go for it. Happy to edit some later
> Yassir Eddebbar I think that would be useful. Are v.t and u.t consistent in this figure? one seems to be on the corner and one on the face
yeah, I struggled with how to represent v.t; it's supposed to look like it's popping out of the center of the northern cell face (location 3121), not the corner, but now looking at it, it can be confused for the SE upper corner... I can work out something better later?
> if you have html/svg skills you could add this to xgcm: https://github.com/xgcm/xgcm/issues/276
@Deepak Cherian I'll look into it, not familiar with svg yet, but sounds like it could be really useful for xgcm.
@Anna-Lena Deppenmeier Here is an updated version with more POP-consistent naming and a legend: POP_Grid.png
@Max Grover Feel free to use on pop-tools doc or elsewhere, and let me know if you need any edits or additions for other uses, I can also send over the .ai file
Just an update: this awesome notebook is now live, thanks to @Anna-Lena Deppenmeier @Yassir Eddebbar @Anderson Banihirwe.
I am trying to expand this awesome notebook to work for prediction datasets (in particular DPLE). Thanks to Anderson, `to_xgcm_grid_dataset()` can now handle the DPLE dimensions (time, lead, ensemble, z, y, x). But when I try to compute the budget term in the Z direction, i.e.,

```python
budget["WTT"] = (gridxgcm.diff(dsxgcm.WTT.fillna(0) * dsxgcm.VOL.values, axis="Z") / dsxgcm.VOL)
```

I get the following error, while the same operation works fine in the X and Y directions: `None of the DataArray's dims ('Y', 'M', 'L', 'z_w_top', 'nlat_t', 'nlon_t') were found in axis coords.` Can anyone help me figure out how to fix this problem? Thanks!
Hi Who, on which coordinate is WTT? I'm assuming VOL is on z_t and WTT is on z_w_bot or z_w_top. If that is the case, then you need to either reassign / rename the coordinate or interpolate first for xgcm to be willing to multiply WTT with VOL.
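To illustrate the rename-the-coordinate route, here is a toy sketch. The sizes and values are made up; only the coordinate handling mirrors the WTT/VOL situation (one array on `z_w_top`, one on `z_t`):

```python
import numpy as np
import xarray as xr

# Toy 1-D stand-ins: "wtt" lives on z_w_top, "vol" on z_t (values made up).
z_t = np.array([5.0, 15.0, 25.0])
z_w_top = np.array([0.0, 10.0, 20.0])
wtt = xr.DataArray([1.0, 2.0, 3.0], dims="z_w_top", coords={"z_w_top": z_w_top})
vol = xr.DataArray([10.0, 10.0, 10.0], dims="z_t", coords={"z_t": z_t})

# Naively multiplying wtt * vol would broadcast into a 2-D (z_w_top, z_t)
# array. Relabel vol onto z_w_top first so xarray lines the two up
# elementwise along the vertical.
vol_on_top = vol.rename({"z_t": "z_w_top"}).assign_coords(z_w_top=z_w_top)
product = wtt * vol_on_top
print(product.dims)  # ('z_w_top',)
```

Whether relabeling (rather than interpolating to the cell faces) is physically appropriate depends on the term; for the flux-divergence form discussed here, the relabeling approach is what the later messages in this thread use.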
@Who Kim can you paste your code for how you set up your grid, vertical thickness, volume, etc?
@Anna-Lena Deppenmeier I have tried that, but it didn't work (I forget what the error message was). In fact, this line is the same as in your original script; the only difference is that WTT now has more coordinates. Doesn't `dsxgcm.VOL.values` make it free from its assigned coordinates?
@Yassir Eddebbar Below is how I set those up, which is similar to Anna's script with some modifications:
```python
dple["DZT"] = xr.DataArray(
    dple.dz.values[:, None, None] * np.ones((len(dple.dz), len(dple.nlat), len(dple.nlon))),
    dims=['z_t', 'nlat', 'nlon'],
    coords={'z_t': dple.z_t, 'nlat': dple.nlat, 'nlon': dple.nlon},
)
dple["DZU"] = xr.DataArray(
    dple.dz.values[:, None, None] * np.ones((len(dple.dz), len(dple.nlat), len(dple.nlon))),
    dims=['z_t', 'nlat', 'nlon'],
    coords={'z_t': dple.z_t, 'nlat': dple.nlat, 'nlon': dple.nlon},
)
dple.DZT.attrs["long_name"] = "Thickness of T cells"
dple.DZT.attrs["units"] = "centimeter"
dple.DZT.attrs["grid_loc"] = "3111"
dple.DZU.attrs["long_name"] = "Thickness of U cells"
dple.DZU.attrs["units"] = "centimeter"
dple.DZU.attrs["grid_loc"] = "3221"

# make sure we have the cell volume for calculations
VOL = (dple.DZT * dple.DXT * dple.DYT).compute()
KMT = dple.KMT.compute()

for j in tqdm(range(len(KMT.nlat))):
    for i in range(len(KMT.nlon)):
        k = KMT.values[j, i].astype(int)
        VOL.values[k:, j, i] = 0.0

dple["VOL"] = VOL
dple.VOL.attrs["long_name"] = "volume of T cells"
dple.VOL.attrs["units"] = "centimeter^3"
dple.VOL.attrs["grid_loc"] = "3111"
```
I didn't catch the `.values`, thanks for pointing that out. I'm not sure xgcm would multiply with `values` though, given that it looks for matching coordinates to determine how to multiply.
Below is the code from your script. I can compute UET and VNT using the same code without any problems. Is there a reason why you multiply by `(dz*dxt*dyt).values` for WTT instead of `VOL.values`? I think I tried that, but I can try again with your original formula.
```python
budget["UET"] = -(gridxgcm.diff(dsxgcm.UET * dsxgcm.VOL.values, axis="X") / dsxgcm.VOL)
budget["VNT"] = -(gridxgcm.diff(dsxgcm.VNT * dsxgcm.VOL.values, axis="Y") / dsxgcm.VOL)
budget["WTT"] = (
    gridxgcm.diff(dsxgcm.WTT.fillna(0) * (dsxgcm.dz * dsxgcm.DXT * dsxgcm.DYT).values, axis="Z")
    / dsxgcm.VOL
)
```
Hmmm, how about the xgcm grid set up, any modification there? i.e. this part:
```python
metrics = {
    ("X",): ["DXU", "DXT"],  # X distances
    ("Y",): ["DYU", "DYT"],  # Y distances
    ("Z",): ["DZU", "DZT"],  # Z distances
    ("X", "Y"): ["UAREA", "TAREA"],
}

# here we get the xgcm compatible dataset
gridxgcm, dsxgcm = pop_tools.to_xgcm_grid_dataset(
    ds,
    periodic=False,
    metrics=metrics,
    boundary={"X": "extend", "Y": "extend", "Z": "extend"},
)

for coord in ["nlat", "nlon"]:
    if coord in dsxgcm.coords:
        dsxgcm = dsxgcm.drop_vars(coord)
```
It is exactly the same.
Following on @Anna-Lena Deppenmeier 's reassigning coordinates idea, does an xr.roll / shift method work (just for debugging purposes)? something like:
```python
di = xr.Dataset()
di['UET'] = -((ds.UET * ds.VOL) - (ds.UET * ds.VOL).roll(nlon=1, roll_coords=True)) / ds.VOL
di['VNT'] = -((ds.VNT * ds.VOL) - (ds.VNT * ds.VOL).roll(nlat=1, roll_coords=True)) / ds.VOL
di['WTT'] = -(
    (
        ds.WTT * ds.VOL.drop('z_t').rename({"z_t": "z_w_top"}).assign_coords(z_w_top=ds.z_w_top)
        - (
            ds.WTT * ds.VOL.drop('z_t').rename({"z_t": "z_w_top"}).assign_coords(z_w_top=ds.z_w_top)
        ).shift(z_w_top=-1).fillna(0)
    ).drop('z_w_top').rename({"z_w_top": "z_t"}).assign_coords(z_t=ds.z_t)
) / ds.VOL
```
It appears to be working (at least I didn't see any error). I have also generated a new VOL defined at z_w_top and added it to dsxgcm, but I got the same error:
```python
dsxgcm["VOL2"] = xr.DataArray(
    dsxgcm.VOL.values,
    dims=['z_w_top', 'nlat_t', 'nlon_t'],
    coords={'z_w_top': dsxgcm.z_w_top, 'nlat_t': dsxgcm.nlat_t, 'nlon_t': dsxgcm.nlon_t},
)
budget["WTT"] = (gridxgcm.diff(dsxgcm.WTT.fillna(0) * dsxgcm.VOL2.values, axis="Z") / dsxgcm.VOL)
```

```
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-86-192e3d3043c5> in <module>
      2 vol_top = vol_top.rename({'z_t': 'z_w_top'})
      3
----> 4 budget["WTT"] = (gridxgcm.diff(dsxgcm.WTT.fillna(0) * vol_top, axis="Z")/ dsxgcm.VOL)

~/miniconda3/envs/rapcdi-analysis/lib/python3.8/site-packages/xgcm/grid.py in diff(self, da, axis, **kwargs)
   1483         >>> grid.diff(da, ['X', 'Y'], fill_value={'X':0, 'Y':100})
   1484         """
-> 1485         return self._grid_func("diff", da, axis, **kwargs)
   1486
   1487     @docstrings.dedent

~/miniconda3/envs/rapcdi-analysis/lib/python3.8/site-packages/xgcm/grid.py in _grid_func(self, funcname, da, axis, **kwargs)
   1419                 out = out * metric
   1420
-> 1421         out = func(out, **kwargs)
   1422
   1423         if metric_weighted:

~/miniconda3/envs/rapcdi-analysis/lib/python3.8/site-packages/xgcm/grid.py in diff(self, da, to, boundary, fill_value, boundary_discontinuity, vector_partner, keep_coords)
    638         """
    639
--> 640         return self._neighbor_binary_func(
    641             da,
    642             raw_diff_function,

~/miniconda3/envs/rapcdi-analysis/lib/python3.8/site-packages/xgcm/grid.py in _neighbor_binary_func(self, da, f, to, boundary, fill_value, boundary_discontinuity, vector_partner, keep_coords)
    275             The differenced data
    276         """
--> 277         position_from, dim = self._get_axis_coord(da)
    278         if to is None:
    279             to = self._default_shifts[position_from]

~/miniconda3/envs/rapcdi-analysis/lib/python3.8/site-packages/xgcm/grid.py in _get_axis_coord(self, da)
   1013                 return position, coord_name
   1014
-> 1015         raise KeyError(
   1016             "None of the DataArray's dims %s were found in axis "
   1017             "coords." % repr(da.dims)

KeyError: "None of the DataArray's dims ('Y', 'M', 'L', 'z_w_top', 'nlat_t', 'nlon_t') were found in axis coords."
```
Who, I don't know where the 'Y', 'M', 'L' dimensions are coming from. I recommend always checking the dimensions of your DataArrays, as they have to line up for xgcm to be able to multiply them.
Also, you negate having added dimensions to `VOL2` when you then use `.values`: it turns the xarray DataArray into a plain `numpy.ndarray`. Can you try just multiplying by `VOL2` instead of `VOL2.values`?
(I know I've asked this 50x before but I keep forgetting!) Why are we multiplying by `VOL.values`? Is it because there's a `z_w` and `z_w_top` mismatch?
I believe the mismatch was `z_t` and `z_w_top`, since `WTT` is centered at the center of the upper face (3112) vs `VOL` in the center of the cell (3111).
> Who, I don't know where the 'Y', 'M', 'L' dimensions are coming from. I recommend always checking your dimensions for the DataArrays, as they have to line up for xgcm to want to multiply.
maybe xgcm is confused by the DPLE `Y` dimension? in:

```
~/miniconda3/envs/rapcdi-analysis/lib/python3.8/site-packages/xgcm/grid.py in diff(self, da, axis, **kwargs)
   1483         >>> grid.diff(da, ['X', 'Y'], fill_value={'X':0, 'Y':100})
   1484         """
-> 1485         return self._grid_func("diff", da, axis, **kwargs)
   1486
   1487     @docstrings.dedent
```
copying @Julius Busecke, who helps develop xgcm
@Anna-Lena Deppenmeier As I mentioned in my first post, I am working on DPLE data with Y, M, L being the start year, ensemble member, and lead time, respectively. I believe the reason why `.values` is used is that UET and VNT are at (nlat_t, nlon_u) while VOL is on the T-grid for both. If I don't use .values, I get: `broadcasting cannot handle duplicate dimensions: ['time', 'z_t', 'nlat_t', 'nlon_u', 'nlon_u']`. I thought Anna used .values for a similar reason, because WTT is at z_w_top although it is on the T-grid horizontally. If this logic is correct, I don't see any reason why this operation is not working for WTT. Indeed, it works fine when my dataset has the conventional POP (t, z, y, x) dimensions. So my speculation is that it is related to the unconventional dataset dimensions, which xgcm perhaps cannot handle?
I just want to add that the operation for UET and VNT is working with the unconventional data dimensions.
would you mind copying the dimensions of your DataArrays here, Who? I don't understand where the duplicate `nlon_u` is coming from.
specifically `VOL` (without `.values`) and `WTT`
```
VOL (z_t: 60, nlat_t: 384, nlon_t: 320)
WTT (Y: 63, M: 39, L: 122, z_w_top: 60, nlat_t: 384, nlon_t: 320)
```
The error message above is what I got when I ran the operation for UET without .values, not WTT
> I just want to add that the operation for UET and VNT is working with the unconventional data dimensions.
ah so you meant when you run this with .values? I misunderstood
Correct
so what are the dimensions of `UET`? `(Y: 63, M: 39, L: 122, z_t: 60, nlat_t: 384, nlon_u: 320)`?
Yes
:thinking:
Hey everyone. I have to admit I am a bit overwhelmed right here (might also be the dask summit haha). What is the exact code that fails?
The code that fails is

```python
budget["WTT"] = (gridxgcm.diff(dsxgcm.WTT.fillna(0) * dsxgcm.VOL2.values, axis="Z") / dsxgcm.VOL)
```

while the same operation in the X and Y directions works fine. I am working with data (CESM DPLE) with "unusual" dimensions, which include additional "lead time" and "member" dimensions (i.e., (lead time, member, start year, z, y, x)). The same operation works fine when the data dimensions are the usual POP dimensions (time, z, y, x).
It might be useful to have a reproducible example notebook in order to diagnose this issue. @Who Kim, is your work in a notebook that is publicly accessible or somewhere on Glade?
I think I found what went wrong. When reading the data, I only kept the variables and coordinates that I thought necessary, including `z_w_top`, but somehow `to_xgcm_grid_dataset` couldn't convert `z_w_top` to an xgcm dataset coordinate, while it did for `z_w_bot`. When I imported everything, `z_w_top` appears as a coordinate. Weird. Thanks to everyone who responded here!
hmm.. that error message is really misleading then. It should've told you that `z_w_top` was missing
> I think I found what went wrong. When reading the data, I only stored the variables and coordinates that I thought necessary, including `z_w_top`, but somehow `to_xgcm_grid_dataset` couldn't convert `z_w_top` to an xgcm dataset coordinate, while it did for `z_w_bot`. When I imported everything, `z_w_top` appears as a coordinate, weird.
I actually just had the same problem; it would be great if someone could look into why `to_xgcm_grid_dataset` does not bring `z_w_top` along!
open an issue!
I am currently having problems with this line:

```python
budget["DIA_IMPVF_TEMP"][:, 0, :, :] = (
    SRF_TEMP_FLUX * dsxgcm.TAREA - dsxgcm.DIA_IMPVF_TEMP.isel(z_w_bot=0) * dsxgcm.TAREA
) / dsxgcm.VOL.values[0, :, :]
```
This works when I use the example script, which only has one timestep; in that case `ds.DIA_IMPVF_TEMP` is not a dask array. It does not work when I load multiple files: then `ds.DIA_IMPVF_TEMP` is a dask array, and I can't assign with `budget["DIA_IMPVF_TEMP"][:, 0, :, :] = ...`. Yesterday I tried to get around this by loading `ds.DIA_IMPVF_TEMP`, which resulted in a memory error. How can I assign something to a dask array without loading it?
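One possible workaround, sketched here with a tiny numpy-backed toy array (the names and shapes are made up): instead of item assignment, build a new array with `xr.where`, selecting the surface values at the top level and the original values elsewhere. The same construction stays lazy when the underlying data is a dask array, so nothing has to be loaded.

```python
import numpy as np
import xarray as xr

# Toy stand-in for the budget term; the real array would be
# (time, z_w_bot, nlat, nlon) and dask-backed.
da = xr.DataArray(
    np.ones((3, 4)),
    dims=("z_w_bot", "nlon_t"),
    coords={"z_w_bot": [0, 1, 2]},
)
surface = xr.DataArray(np.full(4, 5.0), dims="nlon_t")

# Instead of da[0, :] = ... (unsupported on dask-backed arrays), build a
# new array: take `surface` where z_w_bot == 0, keep `da` everywhere else.
fixed = xr.where(da.z_w_bot == 0, surface, da)
```

The condition broadcasts along the remaining dimensions, so the same one-liner works regardless of how many extra (time, member, ...) dimensions the array carries.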
Last updated: May 16 2025 at 17:14 UTC