Stream: python-dev

Topic: pop-tools : tracer budgets


view this post on Zulip Stephen Yeager (Sep 03 2020 at 21:59):

Is there any status update on this development (issue #7)?

view this post on Zulip Matt Long (Sep 04 2020 at 15:00):

@Yassir Eddebbar, @Riley Brady, @Keith Lindsay and @Deepak Cherian have been following this issue. I think we've foundered a bit, but it would be good to assess and revive it.

view this post on Zulip Deepak Cherian (Sep 04 2020 at 15:13):

a bunch of the pieces are almost there :slight_smile:

  1. we know how to set up xgcm: https://pop-tools.readthedocs.io/en/latest/examples/pop_div_curl_xr_xgcm_metrics_compare.html
  2. we know how to specify metrics: https://github.com/NCAR/pop-tools/issues/40
  3. we have unfinished code to estimate metrics that have not been saved to disk: https://github.com/NCAR/pop-tools/pull/44, https://github.com/NCAR/pop-tools/pull/54
  4. and there is unfinished code for div, grad, curl: https://github.com/xgcm/xgcm/issues/187
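Putting items 1 and 2 together, the xgcm setup boils down to declaring axis positions and metric variables. A minimal sketch (dimension and metric names assumed to follow the pop-tools xgcm example linked above; treat as illustrative):

```python
# Sketch of the coords/metrics mapping an xgcm.Grid for POP output needs.
# Axis positions tell xgcm where each variable lives on the staggered grid;
# metrics let grad/div/curl pick the right distance/area weights.
coords = {
    "X": {"center": "nlon_t", "right": "nlon_u"},
    "Y": {"center": "nlat_t", "right": "nlat_u"},
    "Z": {"center": "z_t", "left": "z_w_top"},
}
metrics = {
    ("X",): ["DXT", "DXU"],          # zonal grid spacings
    ("Y",): ["DYT", "DYU"],          # meridional grid spacings
    ("Z",): ["DZT", "DZU"],          # vertical cell thicknesses
    ("X", "Y"): ["TAREA", "UAREA"],  # horizontal cell areas
}
# With a POP dataset ds carrying those variables, the grid would then be
# built roughly as:
#     import xgcm
#     grid = xgcm.Grid(ds, coords=coords, metrics=metrics, periodic=["X"])
```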

cc @Anna-Lena Deppenmeier

view this post on Zulip Anna-Lena Deppenmeier (Sep 04 2020 at 15:16):

Hi all, @Yassir Eddebbar and I have been working on a notebook that closes, for example, the temperature budget with xarray only, with xgcm, and with xgcm metrics. I would say what works now is closing the temperature budget with xgcm; it's not using metrics optimally yet, and the xarray-only version has some problems.

view this post on Zulip Anna-Lena Deppenmeier (Sep 04 2020 at 15:17):

I could turn the xgcm version into a function and move it to pop-tools.

view this post on Zulip Anna-Lena Deppenmeier (Sep 04 2020 at 15:17):

it does need DZU, which as far as I understand cannot be calculated with the released version of pop-tools right now?!

view this post on Zulip Deepak Cherian (Sep 04 2020 at 15:18):

wow much further along than I thought ! nice!

view this post on Zulip Anna-Lena Deppenmeier (Sep 04 2020 at 15:21):

yeah it's POP specific, though. Which is fine for pop-tools, I guess

view this post on Zulip Riley Brady (Sep 04 2020 at 16:20):

Hi all; glad this is making some headway.

I was looking to flesh out some convenience variables from the get_grid() function so we could really easily wrangle this with xgcm (DZU, DZT, HU, HT, KMU). I could use some feedback on https://github.com/NCAR/pop-tools/pull/54 and we need to get https://github.com/NCAR/pop-tools/pull/44 merged. We might not need every one of those variables, but if I remember correctly DZU and DZT are needed.

Also some thoughts on https://github.com/NCAR/pop-tools/issues/56 would be useful -- getting DZBC could help us close budgets for runs with partial bottom cells. Although maybe that is a step we take in a future PR. We also haven't done anything to deal with the overflow parameterization that @Matt Long meticulously deals with in his NCL code, but again this should be dealt with after the first iteration of this code.

There's a lot of discussion on my original purely xarray implementation here that might help in making the updated version work well: https://github.com/NCAR/pop-tools/pull/12. I closed it in favor of giving code/feedback to @Yassir Eddebbar and @Anna-Lena Deppenmeier on doing a purely xgcm implementation.

view this post on Zulip Riley Brady (Sep 04 2020 at 16:22):

Here's a demo from @Yassir Eddebbar and @Anna-Lena Deppenmeier closing the O2 budget using purely xgcm: https://nbviewer.jupyter.org/gist/Eddebbar/cb30e0e4a3151b2900fb49648e1e50c8. Yassir and I had talked about just hosting these as examples on the pop-tools docs so folks could just follow along themselves. I still think it'd be valuable to have a function in the package. The best advice I can give from our discussion on my PR is to use YAML files to denote the variables specifically needed for each tracer. For instance, iron needs IRON_FLUX, while O2 needs STF_O2, while DIC needs FG_CO2. We can have an exhaustive list for the tracer variables on what is needed.
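For concreteness, a hypothetical sketch of what such a per-tracer YAML file could look like -- only the surface-flux variable names come from the discussion above; the layout and the other keys are illustrative placeholders, not a settled schema:

```yaml
# Hypothetical per-tracer budget definitions (illustrative only).
O2:
  surface_flux: STF_O2
Fe:
  surface_flux: IRON_FLUX
DIC:
  surface_flux: FG_CO2
TEMP:
  qsw_3d: true  # the heat budget has a 3-D shortwave term; BGC budgets do not
```

A loader would then look up the tracer name and know which variables to require from the run output (and which are optional).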

view this post on Zulip Yassir Eddebbar (Sep 04 2020 at 17:02):

Another minor issue is that budget terms are not all available as output in many runs, e.g. for BGC in the high-res runs, which prevents demonstrating full budget closure for now. For example, the gist posted above by @Riley Brady is missing the HDIF terms, which likely explains the lack of budget closure.

@Anna-Lena Deppenmeier, a pop_tools function would be awesome; happy to help with the BGC implementation. A couple of other differences for BGC vs. heat budgets are the lack of a QSW_3D term for BGC budgets, and handling different units for the various BGC terms (with YAML files too?); both manageable, I think.

view this post on Zulip Riley Brady (Sep 04 2020 at 17:11):

@Yassir Eddebbar, @Stephen Yeager's FOSI run under the CESM-DPLE directory has a lot of the terms. That's what I based my original closure demos and code on in the previous PR. See /glade/p/cesm/community/CESM-DPLE/CESM-DPLE_POPCICEhindcast

view this post on Zulip Riley Brady (Sep 04 2020 at 17:12):

Here's a demo YAML file: https://github.com/NCAR/pop-tools/blob/master/pop_tools/pop_grid_definitions.yaml. I would envision the headers being the given variable name, then the sub-sections could be a list of required/expected budget variables (could make some optional based on run output) and other booleans as needed.

I still think if we can advance discussion on get_grid() returning some of the other grid diagnostics/metrics we should be able to implement this pretty smoothly.

view this post on Zulip Yassir Eddebbar (Sep 04 2020 at 17:17):

Yassir Eddebbar, Stephen Yeager's FOSI run under the CESM-DPLE directory has a lot of the terms. That's what I based my original closure demos and code on in the previous PR. See /glade/p/cesm/community/CESM-DPLE/CESM-DPLE_POPCICEhindcast

Excellent. I'll test on those!

view this post on Zulip Stephen Yeager (Mar 01 2021 at 20:11):

@Anna-Lena Deppenmeier @Yassir Eddebbar Would you be willing to share your notebook that closes the POP temperature budget using xarray and xgcm?

view this post on Zulip Anna-Lena Deppenmeier (Mar 01 2021 at 20:24):

sure, I am just running it now to make sure it works properly and will share it then

view this post on Zulip Matt Long (Mar 01 2021 at 20:27):

cc @Max Grover

view this post on Zulip Yassir Eddebbar (Mar 01 2021 at 22:45):

Not sure if it's of interest, but I have an [O2] budget version with major input from @Anna-Lena Deppenmeier as well; it needs a rerun and cleanup too...

view this post on Zulip Anna-Lena Deppenmeier (Mar 02 2021 at 00:11):

Currently my budget is not closing. Not sure why, this is the same code that closed a while ago, before a bunch of updates. Am looking into it and hopefully can close it again soon.

view this post on Zulip Anna-Lena Deppenmeier (Mar 02 2021 at 23:33):

Hi @Stephen Yeager I have something that works, do you currently work on the CGD system or casper/cheyenne?

view this post on Zulip Stephen Yeager (Mar 03 2021 at 00:47):

Great! Prefer something that works on casper. Thank you very much for debugging and sharing!

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 16:01):

ok, I'm currently running it on cgd. I will put it on casper and make sure it works, then I'll let you know where to find it.

view this post on Zulip Deepak Cherian (Mar 03 2021 at 16:07):

Anna you should upload it to pop-tools!

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 16:09):

The notebook itself? Or change it into a function?

view this post on Zulip Deepak Cherian (Mar 03 2021 at 16:11):

The notebook. almost 0 effort :slight_smile:

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 19:37):

ok, so here come the problems. So far it's not running on casper -- I assume it's something to do with xgcm. This is the error I get when I try to convert my dataset ds into an xgcm-compatible dataset:

# here we get the xgcm compatible dataset
gridx, dsx = pop_tools.to_xgcm_grid_dataset(ds)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-6-c629003e2721> in <module>
      1 # here we get the xgcm compatible dataset
----> 2 gridx, dsx = pop_tools.to_xgcm_grid_dataset(ds)
      3
      4 # make sure we have the cell volumne for calculations
      5 dsx["cell_volume"] = dsx.DZT * dsx.DXT * dsx.DYT

/glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages/pop_tools/xgcm_util.py in to_xgcm_grid_dataset(ds, **kwargs)
    202
    203     try:
--> 204         import xgcm
    205     except ImportError:
    206         raise ImportError(

/glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages/xgcm/__init__.py in <module>
      4 del get_versions
      5
----> 6 from .grid import Grid, Axis
      7 from .autogenerate import generate_grid_ds

/glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages/xgcm/grid.py in <module>
    757
    758
--> 759 class Grid:
    760     """
    761     An object with multiple :class:`xgcm.Axis` objects representing different

/glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages/xgcm/grid.py in Grid()
   1065         return out
   1066
-> 1067     @docstrings.dedent
   1068     def _apply_vector_function(self, function, vector, **kwargs):
   1069         # the keys, should be axis names

/glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages/docrep/decorators.py in update_docstring(self, *args, **kwargs)
     41             return func(self, *args, **kwargs)
     42         elif len(args) and callable(args[0]):
---> 43             doc = func(self, args[0].__doc__, *args[1:], **kwargs)
     44             _set_object_doc(args[0], doc, py2_class=self.python2_classes)
     45             return args[0]

/glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages/docrep/__init__.py in dedent(self, s, stacklevel)
    532             encountering an invalid key in the string
    533         """
--> 534         s = inspect.cleandoc(s)
    535         return safe_modulo(s, self.params, stacklevel=stacklevel)
    536

/glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/inspect.py in cleandoc(doc)
    617     onwards is removed."""
    618     try:
--> 619         lines = doc.expandtabs().split('\n')
    620     except UnicodeError:
    621         return None

AttributeError: 'NoneType' object has no attribute 'expandtabs'

I have tried updating my environment and making sure the relevant packages are the same version, but I can't manage it for pop-tools and xgcm.

conda_analysis_casper.txt:pop-tools                 2020.2.17.post6          pypi_0    pypi
conda_dcpy_andre.txt:pop-tools                 2020.12.15         pyhd8ed1ab_0    conda-forge

When I run conda update -c conda-forge pop-tools it stays the same; when I try pip install git+https://github.com/NCAR/pop-tools.git it gives me this error:

(analysis) -bash-4.2$ pip install git+https://github.com/NCAR/pop-tools.git
Collecting git+https://github.com/NCAR/pop-tools.git
  Cloning https://github.com/NCAR/pop-tools.git to /glade/scratch/deppenme/pip-req-build-8etvcn_v
  Running command git clone -q https://github.com/NCAR/pop-tools.git /glade/scratch/deppenme/pip-req-build-8etvcn_v
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
    Preparing wheel metadata ... done
Requirement already satisfied: dask>=2.14 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from pop-tools==2020.12.15) (2021.2.0)
Requirement already satisfied: pyyaml>=5.3.1 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from pop-tools==2020.12.15) (5.4.1)
Collecting numba>=0.52
  Using cached numba-0.52.0-cp37-cp37m-manylinux2014_x86_64.whl (3.2 MB)
Requirement already satisfied: xarray>=0.16.1 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from pop-tools==2020.12.15) (0.17.1.dev3+g48378c4)
Requirement already satisfied: numpy>=1.17.0 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from pop-tools==2020.12.15) (1.19.5)
Requirement already satisfied: pooch>=1.3.0 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from pop-tools==2020.12.15) (1.3.0)
Requirement already satisfied: setuptools in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from numba>=0.52->pop-tools==2020.12.15) (49.6.0.post20210108)
Collecting llvmlite<0.36,>=0.35.0
  Using cached llvmlite-0.35.0-cp37-cp37m-manylinux2010_x86_64.whl (25.3 MB)
Requirement already satisfied: requests in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from pooch>=1.3.0->pop-tools==2020.12.15) (2.25.1)
Requirement already satisfied: packaging in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from pooch>=1.3.0->pop-tools==2020.12.15) (20.9)
Requirement already satisfied: appdirs in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from pooch>=1.3.0->pop-tools==2020.12.15) (1.4.4)
Requirement already satisfied: pandas>=0.25 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from xarray>=0.16.1->pop-tools==2020.12.15) (1.2.3)
Requirement already satisfied: python-dateutil>=2.7.3 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from pandas>=0.25->xarray>=0.16.1->pop-tools==2020.12.15) (2.8.1)
Requirement already satisfied: pytz>=2017.3 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from pandas>=0.25->xarray>=0.16.1->pop-tools==2020.12.15) (2021.1)
Requirement already satisfied: six>=1.5 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from python-dateutil>=2.7.3->pandas>=0.25->xarray>=0.16.1->pop-tools==2020.12.15) (1.15.0)
Requirement already satisfied: pyparsing>=2.0.2 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from packaging->pooch>=1.3.0->pop-tools==2020.12.15) (2.4.7)
Requirement already satisfied: chardet<5,>=3.0.2 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from requests->pooch>=1.3.0->pop-tools==2020.12.15) (4.0.0)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from requests->pooch>=1.3.0->pop-tools==2020.12.15) (1.26.3)
Requirement already satisfied: certifi>=2017.4.17 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from requests->pooch>=1.3.0->pop-tools==2020.12.15) (2020.12.5)
Requirement already satisfied: idna<3,>=2.5 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from requests->pooch>=1.3.0->pop-tools==2020.12.15) (2.10)
Building wheels for collected packages: pop-tools
  Building wheel for pop-tools (PEP 517) ... done
  Created wheel for pop-tools: filename=pop_tools-2020.12.15-py3-none-any.whl size=30153 sha256=988c711795c5b46037a868957be10d82f56fab4eab935b862cbc8c054f5f7717
  Stored in directory: /glade/scratch/deppenme/pip-ephem-wheel-cache-bdpxkpi5/wheels/65/87/a3/7667dcc7225e5105e95e09186af11e0befb140c2779fc074cf
Successfully built pop-tools
Installing collected packages: llvmlite, numba, pop-tools
  Attempting uninstall: llvmlite
    Found existing installation: llvmlite 0.31.0
ERROR: Cannot uninstall 'llvmlite'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.

Anyway, I believe the issue is in principle with xgcm (see the .grid problem), so I tried to at least get that to the same version on both systems. Both seem to be installed by conda-forge

conda_analysis_casper.txt:xgcm                      0.3.0                      py_0    conda-forge
conda_dcpy_andre.txt:xgcm                      0.5.1                      py_0    conda-forge

and when I try to update with conda-forge it says it's already installed and doesn't change the version.

Long story short: help please. @Deepak Cherian @Anderson Banihirwe any ideas?

view this post on Zulip Deepak Cherian (Mar 03 2021 at 19:59):

Yes, this is an xgcm version problem. conda update should fix it; are you getting some other error in that case?

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 20:05):

I keep doing conda update and it doesn't change the version.

(analysis) -bash-4.2$ conda update xgcm
Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.

(analysis) -bash-4.2$

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 20:06):

same with conda update -c conda-forge xgcm

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 21:00):

@Anna-Lena Deppenmeier, I think the issue is coming from an incompatible docrep version... Try pinning docrep to v0.2.7

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 21:00):

conda install -c conda-forge docrep==0.2.7

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 21:01):

yes, I have come across the docrep problem. it is currently at 0.2.7
docrep 0.2.7 pypi_0 pypi

am doing the conda install thing now and will then try a conda update xgcm again.

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 21:06):

it's thinking really hard and currently hangs at

(analysis) -bash-4.2$ conda install -c conda-forge docrep==0.2.7
Collecting package metadata (current_repodata.json): done
Solving environment: -
The environment is inconsistent, please check the package plan carefully
The following packages are causing the inconsistency:

  - conda-forge/linux-64::psyplot==1.3.1=py37h89c1867_0
  - conda-forge/noarch::xrft==0.2.3=pyhd3deb0d_0
  - conda-forge/noarch::funcargparse==0.2.3=pyh9f0ad1d_0
failed with initial frozen solve. Retrying with flexible solve.
Solving environment: |

what can I do about the inconsistent environment?

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 21:09):

ok it took a while but it worked and I was able to update xgcm. Thanks @Anderson Banihirwe ! And while you're here (; I am initializing my cluster like this

# this is for when you do things on casper
import ncar_jobqueue
import dask
import distributed

dask.config.set({'distributed.dashboard.link': '/proxy/{port}/status'})
cluster = ncar_jobqueue.NCARCluster() # initializes cluster
client = distributed.Client(cluster)

cluster.adapt(minimum=6, maximum=32, wait_count=600) # 6 workers minimum, 32 maximum;
                                                     # wait ~10 minutes before releasing them
client

but I can't see my dashboard when I click on the link, what am I doing wrong?

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 21:42):

what version of ncar-jobqueue are you using?

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 21:42):

With the newest version of ncar-jobqueue, you don't need this

dask.config.set({'distributed.dashboard.link': '/proxy/{port}/status'})

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 21:44):

'2020.3.4'

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 21:45):

so when I remove the line and click on the new link, it wants to log in to jupyterhub, which I am not using in the first place

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 21:45):

it's thinking really hard and currently hangs at

You may find mamba to be very useful in these kinds of situations

conda install -c conda-forge mamba  # in your base environment

and whenever you want to install packages via conda, replace conda with mamba. E.g.: mamba install -c conda-forge docrep==0.2.7

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 21:46):

'2020.3.4'

That's old

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 21:46):

Try the latest version 2021.2.10

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 21:46):

conda install -c conda-forge ncar-jobqueue==2021.2.10

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 21:47):

hm. I did a conda update all in the beginning of the day, because I haven't worked on casper in a while, I wonder why it didn't update it.

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 21:48):

Is there a running document somewhere where we keep the current versions? So for example I am mostly working on the cgd system these days, and whenever I switch back to casper I am having these issues and have to ask people (partly because conda update all does not seem to take care of these things?)
but anyhow I updated and I still get [pasted image] upon clicking on the link (user_uploads/2/43/qRHGVdyfK9fAaORJ5dlfqOn8/pasted_image.png)

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 21:52):

hm. I did a conda update all in the beginning of the day, because I haven't worked on casper in a while, I wonder why it didn't update it.

the conda solver is very flexible unless you pin down which versions you want... It's hard to control this when you've installed packages via a conda install ... command. One remedy is to curate your environment in an environment.yml and then run the conda env update -f environment.yml command whenever you want to update... (With this approach, it's easy to control which versions get installed.)

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 21:52):

Is there a running document somewhere where we keep the current versions? So for example I am mostly working on the cgd system these days, and whenever I switch back to casper I am having these issues and have to ask people (partly because conda update all does not seem to take care of these things?)
but anyhow I updated and I still get [pasted image] upon clicking on the link (user_uploads/2/43/qRHGVdyfK9fAaORJ5dlfqOn8/pasted_image.png)

This looks very familiar

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 21:52):

This is likely a version mismatch

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 21:53):

hm. I did a conda update all in the beginning of the day, because I haven't worked on casper in a while, I wonder why it didn't update it.

the conda solver is very flexible unless you pin down which versions you want... It's hard to control this when you've installed packages via a conda install ... command. One remedy is to curate your environment in an environment.yml and then run the conda env update -f environment.yml command whenever you want to update... (With this approach, it's easy to control which versions get installed.)

I don't understand what you mean with curate -- like I literally go in and find the version numbers I need?

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 21:55):

tbh that doesn't sound very easy to me at all. I thought the idea was to be able to use conda install and then get the latest version if I don't specify which version I want with ==?

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 22:00):

I don't understand what you mean with curate -- like I literally go in and find the version numbers I need?

For instance, if you were starting with a new environment, you would create a new environment.yml file. The original contents of this file may look like this:

name: my-new-env
channels:
   - conda-forge
dependencies:
  - python=3.8
  - xarray==0.16.2
  - dask==2.14

To create/update this environment, you would run conda env update -f environment.yml

If, say, a week later you decide that you want to upgrade one or more dependencies in this environment, you go back to the environment.yml file and update it, e.g.

name: my-new-env
channels:
   - conda-forge
dependencies:
  - python=3.8
  - xarray==0.17 # Upgrade to latest version of xarray
  - dask==2.14

At this point, you would then run conda env update -f environment.yml

With this approach, you can have an idea of what versions of packages you care about are installed in your environment by looking at your environment.yml file and you can easily re-create this environment on the same machine or another machine with the same type of operating system (Linux, MacOS, etc...). This is what I mean by environment curation.

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 22:00):

I hope this clarifies my previous point

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 22:01):

Regarding the version mismatch issue, what is the output of

import dask, distributed, bokeh
print(dask.__version__, distributed.__version__, bokeh.__version__)

from your notebook??

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 22:04):

I hope this clarifies my previous point

It does, thanks. It seems like I would have to stay on top of every package version I would want to use, which seems like a lot to research / keep in mind. Anyway.

The output is 2021.02.0 2021.02.0 2.3.0

I also just tried pushing my notebook to git and it says I am denied

Username for 'https://github.com': ALDepp
Password for 'https://ALDepp@github.com':
remote: Permission to NCAR/pop-tools.git denied to ALDepp.
fatal: unable to access 'https://github.com/NCAR/pop-tools.git/': The requested URL returned error: 403

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 22:06):

The output is 2021.02.0 2021.02.0 2.3.0

Try pinning your dask, distributed, bokeh versions to these versions:

dask==2.14
distributed==2.14
bokeh==1.4

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 22:09):

I also just tried pushing my notebook to git and it says I am denied

I think you want to push to your fork (https://github.com/ALDepp/pop-tools) repo and open a pull request to merge your changes into NCAR/pop-tools

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 22:10):

Can you confirm your git repo is pointed to the right remote endpoint?

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 22:10):

Try this

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 22:10):

git remote -v

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 22:13):

I also just tried pushing my notebook to git and it says I am denied

I think you want to push to your fork (https://github.com/ALDepp/pop-tools) repo and open a pull request to merge your changes into NCAR/pop-tools

yes, thank you (:

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 22:23):

git remote -v
origin  https://github.com/ALDepp/pop-tools.git (fetch)
origin  https://github.com/ALDepp/pop-tools.git (push)
upstream    https://github.com/NCAR/pop-tools.git (fetch)
upstream    https://github.com/NCAR/pop-tools.git (push)

view this post on Zulip Anna-Lena Deppenmeier (Mar 04 2021 at 16:19):

The output is 2021.02.0 2021.02.0 2.3.0

Try pinning your dask, distributed, bokeh versions to these versions:

dask==2.14
distributed==2.14
bokeh==1.4
I have not been able to pin bokeh to 1.4. It took a very long time and then gave me a very long list of packages that conflict, and in the end it says

Your installed version is: 2.17

view this post on Zulip Anderson Banihirwe (Mar 04 2021 at 16:29):

Could you point me to the location of this conda environment?

view this post on Zulip Anna-Lena Deppenmeier (Mar 04 2021 at 16:50):

/glade/work/deppenme/miniconda3/envs/analysis

view this post on Zulip Yassir Eddebbar (Mar 12 2021 at 17:46):

I am wondering if it could be helpful to users not deeply familiar with the POP grid to add an illustration/visualization in @Anna-Lena Deppenmeier's notebook of where the various terms sit in the grid cell and what the various terms mean. This is what I found most challenging in creating the budget terms... I started on something a few months ago in Illustrator that can be edited for this purpose (needs QA/QC review for accuracy):
POP_Grid.png

view this post on Zulip Deepak Cherian (Mar 12 2021 at 17:47):

yes please! very cool image

view this post on Zulip Deepak Cherian (Mar 12 2021 at 17:49):

if you have html/svg skills you could add this to xgcm: https://github.com/xgcm/xgcm/issues/276

view this post on Zulip Max Grover (Mar 12 2021 at 17:49):

Would be good to add to pop-tools as well?

view this post on Zulip Anna-Lena Deppenmeier (Mar 12 2021 at 17:50):

@Yassir Eddebbar I think that would be useful. Are v.t and u.t consistent in this figure? one seems to be on the corner and one on the face

view this post on Zulip Anna-Lena Deppenmeier (Mar 12 2021 at 17:50):

@Yassir Eddebbar is it okay if I add this to the example notebook (since casper is up again I aim to make changes and resubmit today)

view this post on Zulip Yassir Eddebbar (Mar 12 2021 at 17:56):

Yassir Eddebbar is it okay if I add this to the example notebook (since casper is up again I aim to make changes and resubmit today)

Yes, as long as it looks ok to you and others accuracy-wise, go for it. Happy to make some edits later

view this post on Zulip Yassir Eddebbar (Mar 12 2021 at 18:01):

Yassir Eddebbar I think that would be useful. Are v.t and u.t consistent in this figure? one seems to be on the corner and one on the face

yeah, I struggled with how to represent v.t; it's supposed to look like it's popping out of the center of the northern cell face (location 3121), not the corner, but now looking at it, it can be confused with the SE upper corner... I can work up something better later?

view this post on Zulip Yassir Eddebbar (Mar 12 2021 at 18:03):

if you have html/svg skills you could add this to xgcm: https://github.com/xgcm/xgcm/issues/276

@Deepak Cherian I'll look into it, not familiar with svg yet, but sounds like it could be really useful for xgcm.

view this post on Zulip Yassir Eddebbar (Mar 12 2021 at 20:02):

@Anna-Lena Deppenmeier Here is an updated version with more POP-consistent naming and a legend: POP_Grid.png
@Max Grover Feel free to use on pop-tools doc or elsewhere, and let me know if you need any edits or additions for other uses, I can also send over the .ai file

view this post on Zulip Deepak Cherian (Mar 17 2021 at 14:03):

Just an update: this awesome notebook is now live thanks to @Anna-Lena Deppenmeier @Yassir Eddebbar @Anderson Banihirwe .

view this post on Zulip Who Kim (May 18 2021 at 21:39):

I am trying to expand this awesome notebook to work for prediction datasets (in particular DPLE). Thanks to Anderson, to_xgcm_grid_dataset() can now handle the DPLE dimensions (time, lead, ensemble, z, y, x). But when I try to compute the budget term in the Z direction, i.e.,

budget["WTT"] = (gridxgcm.diff(dsxgcm.WTT.fillna(0) * dsxgcm.VOL.values, axis="Z") / dsxgcm.VOL)

, I get the following error, while the X and Y directions work fine: "None of the DataArray's dims ('Y', 'M', 'L', 'z_w_top', 'nlat_t', 'nlon_t') were found in axis coords." Can anyone help me figure out how to fix this problem? Thanks!

view this post on Zulip Anna-Lena Deppenmeier (May 19 2021 at 15:41):

Hi Who, on which coordinate is WTT? I'm assuming VOL is on z_t and WTT on z_w_bot or z_w_top

view this post on Zulip Anna-Lena Deppenmeier (May 19 2021 at 15:41):

If that is the case, then you need to either reassign/rename the coordinate or interpolate first for xgcm to be willing to multiply WTT with VOL.

view this post on Zulip Yassir Eddebbar (May 19 2021 at 17:35):

@Who Kim can you paste your code for how you set up your grid, vertical thickness, volume, etc?

view this post on Zulip Who Kim (May 19 2021 at 20:31):

@Anna-Lena Deppenmeier I have tried that, but it didn't work (I forget what the error message was). In fact, this line is the same as in your original script; the only difference is that WTT now has more coordinates. Doesn't dsxgcm.VOL.values make it free from its assigned coordinates?

@Yassir Eddebbar Below is how I set those up, which is similar to Anna's script with some modifications:

dims = ["z_t", "nlat", "nlon"]
coords = {"z_t": dple.z_t, "nlat": dple.nlat, "nlon": dple.nlon}
shape = (len(dple.dz), len(dple.nlat), len(dple.nlon))

# broadcast the 1-D dz into 3-D thickness fields
dple["DZT"] = xr.DataArray(dple.dz.values[:, None, None] * np.ones(shape), dims=dims, coords=coords)
dple["DZU"] = xr.DataArray(dple.dz.values[:, None, None] * np.ones(shape), dims=dims, coords=coords)

dple.DZT.attrs["long_name"] = "Thickness of T cells"
dple.DZT.attrs["units"] = "centimeter"
dple.DZT.attrs["grid_loc"] = "3111"
dple.DZU.attrs["long_name"] = "Thickness of U cells"
dple.DZU.attrs["units"] = "centimeter"
dple.DZU.attrs["grid_loc"] = "3221"

# make sure we have the cell volume for calculations
VOL = (dple.DZT * dple.DXT * dple.DYT).compute()
KMT = dple.KMT.compute()

for j in tqdm(range(len(KMT.nlat))):
    for i in range(len(KMT.nlon)):
        k = KMT.values[j, i].astype(int)
        VOL.values[k:, j, i] = 0.0

dple["VOL"] = VOL

dple.VOL.attrs["long_name"] = "volume of T cells"
dple.VOL.attrs["units"] = "centimeter^3"

dple.VOL.attrs["grid_loc"] = "3111"
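As an aside, the nested KMT loop above can be written as a single vectorized `where`; a sketch with toy grid sizes (`k_index` is a made-up helper name, not from the script):

```python
import numpy as np
import xarray as xr

# Toy sizes instead of the full POP grid
nz, ny, nx = 4, 3, 3
vol = xr.DataArray(np.ones((nz, ny, nx)), dims=["z_t", "nlat", "nlon"])
kmt = xr.DataArray(
    np.array([[2, 4, 1], [0, 3, 2], [4, 4, 0]]), dims=["nlat", "nlon"]
)

# Level k is ocean only where k < KMT; zero out cells at or below the floor
k_index = xr.DataArray(np.arange(nz), dims=["z_t"])
vol_masked = vol.where(k_index < kmt, 0.0)
```

This avoids the Python-level double loop and also works lazily on dask-backed arrays.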

view this post on Zulip Anna-Lena Deppenmeier (May 19 2021 at 20:33):

I didn't catch the .values, thanks for pointing that out. I'm not sure xgcm would multiply with .values, though, given that it looks for matching coordinates to determine how to multiply.

view this post on Zulip Who Kim (May 19 2021 at 20:38):

Below is the code from your script. I can compute UET and VNT using the same code without any problems. Is there a reason why you multiply by (dz * DXT * DYT).values for WTT instead of VOL.values? I think I tried that, but I can try again with your original formula.

budget["UET"] = -(gridxgcm.diff(dsxgcm.UET * dsxgcm.VOL.values, axis="X") / dsxgcm.VOL)
budget["VNT"] = -(gridxgcm.diff(dsxgcm.VNT * dsxgcm.VOL.values, axis="Y") / dsxgcm.VOL)
budget["WTT"] = (
    gridxgcm.diff(dsxgcm.WTT.fillna(0) * (dsxgcm.dz * dsxgcm.DXT * dsxgcm.DYT).values, axis="Z")
    / dsxgcm.VOL
)

view this post on Zulip Yassir Eddebbar (May 19 2021 at 21:05):

Hmmm, how about the xgcm grid set up, any modification there? i.e. this part:

metrics = {
    ("X",): ["DXU", "DXT"],  # X distances
    ("Y",): ["DYU", "DYT"],  # Y distances
    ("Z",): ["DZU", "DZT"],  # Z distances
    ("X", "Y"): ["UAREA", "TAREA"],
}

# here we get the xgcm compatible dataset
gridxgcm, dsxgcm = pop_tools.to_xgcm_grid_dataset(
    ds,
    periodic=False,
    metrics=metrics,
    boundary={"X": "extend", "Y": "extend", "Z": "extend"},
)

for coord in ["nlat", "nlon"]:
    if coord in dsxgcm.coords:
        dsxgcm = dsxgcm.drop_vars(coord)

view this post on Zulip Who Kim (May 19 2021 at 23:00):

It is exactly the same.

view this post on Zulip Yassir Eddebbar (May 20 2021 at 15:49):

Following on @Anna-Lena Deppenmeier 's reassigning coordinates idea, does an xr.roll / shift method work (just for debugging purposes)? something like:

di=xr.Dataset()
di['UET'] = -((ds.UET*ds.VOL) - (ds.UET*ds.VOL).roll(nlon=1, roll_coords=True))/ds.VOL
di['VNT'] = -((ds.VNT*ds.VOL) - (ds.VNT*ds.VOL).roll(nlat=1, roll_coords=True))/ds.VOL
di['WTT'] = - ((ds.WTT*(ds.VOL.drop('z_t').rename({"z_t":"z_w_top"}).assign_coords(z_w_top=ds.z_w_top))
                     - (ds.WTT*(ds.VOL.drop('z_t').rename({"z_t":"z_w_top"}).assign_coords(z_w_top=ds.z_w_top))).shift(z_w_top=-1).fillna(0)
                    ).drop('z_w_top').rename({"z_w_top":"z_t"}).assign_coords(z_t=ds.z_t))/ds.VOL

view this post on Zulip Who Kim (May 20 2021 at 16:38):

It appears to be working (at least I didn't see any error). I have also generated a new VOL defined at z_w_top and added it to dsxgcm, but I got the same error:

dsxgcm["VOL2"] = xr.DataArray(dsxgcm.VOL.values, dims=['z_w_top', 'nlat_t', 'nlon_t'],
                              coords={'z_w_top':dsxgcm.z_w_top, 'nlat_t':dsxgcm.nlat_t, 'nlon_t':dsxgcm.nlon_t})

budget["WTT"] = (gridxgcm.diff(dsxgcm.WTT.fillna(0) * dsxgcm.VOL2.values, axis="Z")/dsxgcm.VOL)

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-86-192e3d3043c5> in <module>
      2 vol_top = vol_top.rename({'z_t': 'z_w_top'})
      3
----> 4 budget["WTT"] = (gridxgcm.diff(dsxgcm.WTT.fillna(0) * vol_top, axis="Z")/ dsxgcm.VOL)

~/miniconda3/envs/rapcdi-analysis/lib/python3.8/site-packages/xgcm/grid.py in diff(self, da, axis, **kwargs)
   1483         >>> grid.diff(da, ['X', 'Y'], fill_value={'X':0, 'Y':100})
   1484         """
-> 1485         return self._grid_func("diff", da, axis, **kwargs)
   1486
   1487     @docstrings.dedent

~/miniconda3/envs/rapcdi-analysis/lib/python3.8/site-packages/xgcm/grid.py in _grid_func(self, funcname, da, axis, **kwargs)
   1419                 out = out * metric
   1420
-> 1421             out = func(out, **kwargs)
   1422
   1423             if metric_weighted:

~/miniconda3/envs/rapcdi-analysis/lib/python3.8/site-packages/xgcm/grid.py in diff(self, da, to, boundary, fill_value, boundary_discontinuity, vector_partner, keep_coords)
    638         """
    639
--> 640         return self._neighbor_binary_func(
    641             da,
    642             raw_diff_function,

~/miniconda3/envs/rapcdi-analysis/lib/python3.8/site-packages/xgcm/grid.py in _neighbor_binary_func(self, da, f, to, boundary, fill_value, boundary_discontinuity, vector_partner, keep_coords)
    275             The differenced data
    276         """
--> 277         position_from, dim = self._get_axis_coord(da)
    278         if to is None:
    279             to = self._default_shifts[position_from]

~/miniconda3/envs/rapcdi-analysis/lib/python3.8/site-packages/xgcm/grid.py in _get_axis_coord(self, da)
   1013                 return position, coord_name
   1014
-> 1015         raise KeyError(
   1016             "None of the DataArray's dims %s were found in axis "
   1017             "coords." % repr(da.dims)

KeyError: "None of the DataArray's dims ('Y', 'M', 'L', 'z_w_top', 'nlat_t', 'nlon_t') were found in axis coords."

view this post on Zulip Anna-Lena Deppenmeier (May 20 2021 at 16:49):

Who, I don't know where the 'Y', 'M', 'L' dimensions are coming from. I recommend always checking the dimensions of your DataArrays, as they have to line up for xgcm to be willing to multiply.

view this post on Zulip Anna-Lena Deppenmeier (May 20 2021 at 16:50):

also, using .values negates the dimensions you added to VOL2: it turns the xarray DataArray into a plain numpy.ndarray.

view this post on Zulip Anna-Lena Deppenmeier (May 20 2021 at 16:56):

Can you try just multiplying by VOL2 instead of VOL2.values?
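To illustrate the point about `.values` (a tiny self-contained example, not the DPLE data):

```python
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.arange(3.0), dims=["z_t"], coords={"z_t": [5.0, 15.0, 25.0]}
)

# .values returns the underlying numpy array: dims and coords are gone,
# so any subsequent multiplication falls back to positional broadcasting
arr = da.values
```

Once the labels are stripped, xgcm can no longer use the coordinate names to decide how the operands should be aligned.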

view this post on Zulip Deepak Cherian (May 20 2021 at 16:59):

(I know I've asked this 50x before but I keep forgetting!) Why are we multiplying by VOL.values. Is it because there's a z_w and z_w_top mismatch?

view this post on Zulip Yassir Eddebbar (May 20 2021 at 19:21):

I believe the mismatch was z_t and z_w_top since WTT is centered at the center of the upper face (3112) vs VOL is in the center of the cell (3111).

view this post on Zulip Yassir Eddebbar (May 20 2021 at 19:25):

Who, I don't know where the 'Y', 'M', 'L' dimensions are coming from. I recommend always checking your dimensions for the DataArrays, as they have to line up for xgcm to want to multiply.

maybe xgcm is confused by the DPLE Y dimension? in:

~/miniconda3/envs/rapcdi-analysis/lib/python3.8/site-packages/xgcm/grid.py in diff(self, da, axis, **kwargs)
   1483         >>> grid.diff(da, ['X', 'Y'], fill_value={'X':0, 'Y':100})
   1484         """
-> 1485         return self._grid_func("diff", da, axis, **kwargs)
   1486
   1487     @docstrings.dedent

copying @Julius Busecke who helps develop xgcm

view this post on Zulip Who Kim (May 20 2021 at 20:02):

@Anna-Lena Deppenmeier As I mentioned in my first post, I am working on DPLE data with Y, M, L being the start year, ensemble member, and lead time, respectively. I believe the reason .values is used is that UET and VNT are at (nlat_t, nlon_u) while VOL is on the T-grid for both. If I don't use .values, I get: broadcasting cannot handle duplicate dimensions: ['time', 'z_t', 'nlat_t', 'nlon_u', 'nlon_u']. I thought Anna used .values for a similar reason, because WTT is at z_w_top although it is on the T-grid horizontally. If this logic is correct, I don't see any reason why this operation is not working for WTT. Indeed, it works fine when my dataset has the conventional POP (t, z, y, x) dimensions. So my speculation is that it is related to the unconventional dataset dimensions, which xgcm perhaps cannot handle?

view this post on Zulip Who Kim (May 20 2021 at 20:11):

I just want to add that the operation for UET and VNT is working with the unconventional data dimensions.

view this post on Zulip Anna-Lena Deppenmeier (May 20 2021 at 20:13):

would you mind copying the dimensions of your dataarrays here Who? I don't understand where the duplicate nlon_u is coming from.

view this post on Zulip Anna-Lena Deppenmeier (May 20 2021 at 20:14):

specifically VOL (without .values) and WTT

view this post on Zulip Who Kim (May 20 2021 at 20:18):

'VOL' (z_t: 60, nlat_t: 384, nlon_t: 320)
'WTT' (Y: 63, M: 39, L: 122, z_w_top: 60, nlat_t: 384, nlon_t: 320)
The error message above is what I got when I ran the operation for UET without .values, not WTT

view this post on Zulip Anna-Lena Deppenmeier (May 20 2021 at 20:19):

I just want to add that the operation for UET and VNT is working with the unconventional data dimensions.

ah so you meant when you run this with .values? I misunderstood

view this post on Zulip Who Kim (May 20 2021 at 20:20):

Correct

view this post on Zulip Anna-Lena Deppenmeier (May 20 2021 at 20:20):

so what are the dimensions of UET? (Y: 63, M: 39, L: 122, z_t: 60, nlat_t: 384, nlon_u: 320)?

view this post on Zulip Who Kim (May 20 2021 at 20:21):

Yes

view this post on Zulip Anna-Lena Deppenmeier (May 20 2021 at 20:26):

:thinking:

view this post on Zulip Julius Busecke (May 20 2021 at 22:56):

Hey everyone. I have to admit I am a bit overwhelmed right here (might also be the dask summit haha). What is the exact code that fails?

view this post on Zulip Who Kim (May 21 2021 at 16:51):

The code fails to compute,

budget["WTT"] = (gridxgcm.diff(dsxgcm.WTT.fillna(0) * dsxgcm.VOL2.values, axis="Z")/dsxgcm.VOL)

, while the same operation in the X and Y directions works fine. I am working with data (CESM DPLE) with "unusual" dimensions, which include additional "lead time" and "member" dimensions (i.e., (lead time, member, start year, z, y, x)). The same operation works fine when the data dimensions are the usual POP dimensions (time, z, y, x).

view this post on Zulip Anderson Banihirwe (May 21 2021 at 18:06):

It might be useful to have a reproducible example notebook in order to diagnose this issue. @Who Kim, is your work in a notebook that is publicly accessible or somewhere on Glade?

view this post on Zulip Who Kim (May 21 2021 at 22:23):

I think I found what went wrong. When reading the data, I only stored the variables and coordinates that I thought necessary, including z_w_top, but somehow to_xgcm_grid_dataset couldn't convert z_w_top to a coordinate in the xgcm dataset, while it did for z_w_bot. When I imported everything, z_w_top appears as a coordinate, weird. Thanks, everyone who responded here!
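I haven't reproduced the to_xgcm_grid_dataset behavior, but a common way to lose a staggered-grid coordinate is subsetting a dataset in which it arrived as a plain data variable; a toy sketch (using z_w_bot here purely for illustration):

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {
        "WTT": (("z_w_top",), np.arange(3.0)),
        "z_w_bot": ("z_w_top", [10.0, 20.0, 30.0]),  # arrives as a data variable
    },
    coords={"z_w_top": [0.0, 10.0, 20.0]},
)

# Selecting only the "interesting" variables silently drops z_w_bot:
sub = ds[["WTT"]]

# Keep it explicitly and promote it to a coordinate before handing the
# dataset to to_xgcm_grid_dataset:
sub2 = ds[["WTT", "z_w_bot"]].set_coords("z_w_bot")
```

Checking `ds.coords` versus `ds.data_vars` after reading the files makes this failure mode easy to spot.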

view this post on Zulip Deepak Cherian (May 21 2021 at 23:26):

hmm.. that error message is really misleading then. It should've told you that z_w_top was missing

view this post on Zulip Anna-Lena Deppenmeier (Jun 23 2021 at 17:22):

I think I found what went wrong. When reading the data, I only stored the variables and coordinates that I thought necessary, including z_w_top, but somehow to_xgcm_grid_dataset couldn't convert z_w_top to a coordinate in the xgcm dataset, while it did for z_w_bot. When I imported everything, z_w_top appears as a coordinate, weird.

I actually just had the same problem, would be great if someone could look into why to_xgcm_grid_dataset does not bring z_w_top along!

view this post on Zulip Deepak Cherian (Jun 23 2021 at 17:24):

open an issue!

view this post on Zulip Anna-Lena Deppenmeier (Jun 23 2021 at 17:25):

I am currently having problems with this line

budget["DIA_IMPVF_TEMP"][:, 0, :, :] = (
    SRF_TEMP_FLUX * dsxgcm.TAREA - dsxgcm.DIA_IMPVF_TEMP.isel(z_w_bot=0) * dsxgcm.TAREA
) / dsxgcm.VOL.values[0, :, :]

This works when I use the example script, which only has 1 timestep; in that case ds.DIA_IMPVF_TEMP is not a dask array. It does not work when I load multiple files: then ds.DIA_IMPVF_TEMP is a dask array, and I can't assign with budget["DIA_IMPVF_TEMP"][:, 0, :, :] = ... . Yesterday I tried to get around this by loading ds.DIA_IMPVF_TEMP, which resulted in a memory error. How can I assign something to a dask array without loading?
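One workaround for in-place assignment on a dask-backed array is to build the replacement as a lazy expression with `xr.where` instead of item assignment; a sketch with a toy numpy array (`surface` is a made-up stand-in for the surface-flux term, and the same expression stays lazy on dask-backed inputs):

```python
import numpy as np
import xarray as xr

# Toy field (time, z, y, x); in practice this would be the budget term
da = xr.DataArray(np.ones((2, 3, 4, 4)), dims=["time", "z_t", "nlat", "nlon"])

# Replacement values for the surface level (k = 0)
surface = xr.zeros_like(da.isel(z_t=0))

# Instead of da[:, 0, :, :] = surface (which fails on dask arrays),
# select the surface level with a boolean mask and substitute lazily:
mask = xr.DataArray(np.arange(da.sizes["z_t"]) == 0, dims=["z_t"])
result = xr.where(mask, surface, da).transpose("time", "z_t", "nlat", "nlon")
```

No chunk is ever materialized by the `where` itself; computation happens only when `result` is evaluated.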


Last updated: May 16 2025 at 17:14 UTC