Stream: python-dev

Topic: pop-tools : tracer budgets


view this post on Zulip Stephen Yeager (Sep 03 2020 at 21:59):

Is there any status update on this development (issue #7)?

view this post on Zulip Matt Long (Sep 04 2020 at 15:00):

@Yassir Eddebbar, @Riley Brady, @Keith Lindsay and @Deepak Cherian have been following this issue. I think we've foundered a bit, but it would be good to assess and revive it.

view this post on Zulip Deepak Cherian (Sep 04 2020 at 15:13):

a bunch of the pieces are almost there :slight_smile:

  1. we know how to set up xgcm: https://pop-tools.readthedocs.io/en/latest/examples/pop_div_curl_xr_xgcm_metrics_compare.html
  2. we know how to specify metrics: https://github.com/NCAR/pop-tools/issues/40
  3. we have unfinished code to estimate metrics that have not been saved to disk: https://github.com/NCAR/pop-tools/pull/44, https://github.com/NCAR/pop-tools/pull/54
  4. and there is unfinished code for div, grad, curl: https://github.com/xgcm/xgcm/issues/187
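Putting items 1 and 2 together, the xgcm setup boils down to declaring axis positions and metric variables. A minimal sketch (dimension and metric names assumed to follow the pop-tools xgcm example linked above; treat as illustrative):

```python
# Sketch of the coords/metrics mapping an xgcm.Grid for POP output needs.
# Axis positions tell xgcm where each variable lives on the staggered grid;
# metrics let grad/div/curl pick the right distance/area weights.
coords = {
    "X": {"center": "nlon_t", "right": "nlon_u"},
    "Y": {"center": "nlat_t", "right": "nlat_u"},
    "Z": {"center": "z_t", "left": "z_w_top"},
}
metrics = {
    ("X",): ["DXT", "DXU"],          # zonal grid spacings
    ("Y",): ["DYT", "DYU"],          # meridional grid spacings
    ("Z",): ["DZT", "DZU"],          # vertical cell thicknesses
    ("X", "Y"): ["TAREA", "UAREA"],  # horizontal cell areas
}
# With a POP dataset ds carrying those variables, the grid would then be
# built roughly as:
#     import xgcm
#     grid = xgcm.Grid(ds, coords=coords, metrics=metrics, periodic=["X"])
```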

cc @Anna-Lena Deppenmeier

view this post on Zulip Anna-Lena Deppenmeier (Sep 04 2020 at 15:16):

Hi all, @Yassir Eddebbar and I have been working on a notebook that closes, for example, the temperature budget with xarray only, with xgcm, and with xgcm metrics. I would say what works now is closing the temperature budget with xgcm; it's not using metrics optimally yet, and the xarray-only version has some problems.

view this post on Zulip Anna-Lena Deppenmeier (Sep 04 2020 at 15:17):

I could turn the xgcm version into a function and move it to pop-tools.

view this post on Zulip Anna-Lena Deppenmeier (Sep 04 2020 at 15:17):

it does need DZU, which as far as I understand cannot be calculated with the released version of pop-tools right now?!

view this post on Zulip Deepak Cherian (Sep 04 2020 at 15:18):

wow much further along than I thought ! nice!

view this post on Zulip Anna-Lena Deppenmeier (Sep 04 2020 at 15:21):

yeah it's POP specific, though. Which is fine for pop-tools, I guess

view this post on Zulip Riley Brady (Sep 04 2020 at 16:20):

Hi all; glad this is making some headway.

I was looking to flesh out some convenience variables from the get_grid() function so we could really easily wrangle this with xgcm (DZU, DZT, HU, HT, KMU). I could use some feedback on https://github.com/NCAR/pop-tools/pull/54 and we need to get https://github.com/NCAR/pop-tools/pull/44 merged. We might not need every one of those variables, but if I remember correctly DZU and DZT are needed.

Also some thoughts on https://github.com/NCAR/pop-tools/issues/56 would be useful -- getting DZBC could help us close budgets for runs with partial bottom cells. Although maybe that is a step we take in a future PR. We also haven't done anything to deal with the overflow parameterization that @Matt Long meticulously deals with in his NCL code, but again this should be dealt with after the first iteration of this code.

There's a lot of discussion on my original purely xarray implementation here that might help in making the updated version work well: https://github.com/NCAR/pop-tools/pull/12. I closed it in favor of giving code/feedback to @Yassir Eddebbar and @Anna-Lena Deppenmeier on doing a purely xgcm implementation.

view this post on Zulip Riley Brady (Sep 04 2020 at 16:22):

Here's a demo from @Yassir Eddebbar and @Anna-Lena Deppenmeier closing the O2 budget using purely xgcm: https://nbviewer.jupyter.org/gist/Eddebbar/cb30e0e4a3151b2900fb49648e1e50c8. Yassir and I had talked about just hosting these as examples on the pop-tools docs so folks could just follow along themselves. I still think it'd be valuable to have a function in the package. The best advice I can give from our discussion on my PR is to use YAML files to denote the variables specifically needed for each tracer. For instance, iron needs IRON_FLUX, while O2 needs STF_O2, while DIC needs FG_CO2. We can have an exhaustive list for the tracer variables on what is needed.
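For concreteness, a hypothetical sketch of what such a per-tracer YAML file could look like -- only the surface-flux variable names come from the discussion above; the layout and the other keys are illustrative placeholders, not a settled schema:

```yaml
# Hypothetical per-tracer budget definitions (illustrative only).
O2:
  surface_flux: STF_O2
Fe:
  surface_flux: IRON_FLUX
DIC:
  surface_flux: FG_CO2
TEMP:
  qsw_3d: true  # the heat budget has a 3-D shortwave term; BGC budgets do not
```

A loader would then look up the tracer name and know which variables to require from the run output (and which are optional).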

view this post on Zulip Yassir Eddebbar (Sep 04 2020 at 17:02):

Another minor issue is that budget terms are not all available as output in many runs, e.g. for BGC in the high-res runs, which prevents demonstrating full budget closure for now. For example, the gist posted above by @Riley Brady is missing the HDIF terms, which likely explains the lack of budget closure.

@Anna-Lena Deppenmeier, a pop_tools function would be awesome; happy to help with the BGC implementation. A couple of other differences for BGC vs. heat budgets are the lack of a QSW_3D term for BGC budgets, and handling different units for the various BGC terms (with YAML files too?); both manageable, I think.

view this post on Zulip Riley Brady (Sep 04 2020 at 17:11):

@Yassir Eddebbar, @Stephen Yeager's FOSI run under the CESM-DPLE directory has a lot of the terms. That's what I based my original closure demos and code on in the previous PR. See /glade/p/cesm/community/CESM-DPLE/CESM-DPLE_POPCICEhindcast

view this post on Zulip Riley Brady (Sep 04 2020 at 17:12):

Here's a demo YAML file: https://github.com/NCAR/pop-tools/blob/master/pop_tools/pop_grid_definitions.yaml. I would envision the headers being the given variable name, then the sub-sections could be a list of required/expected budget variables (could make some optional based on run output) and other booleans as needed.

I still think if we can advance discussion on get_grid() returning some of the other grid diagnostics/metrics we should be able to implement this pretty smoothly.

view this post on Zulip Yassir Eddebbar (Sep 04 2020 at 17:17):

Yassir Eddebbar, Stephen Yeager's FOSI run under the CESM-DPLE directory has a lot of the terms. That's what I based my original closure demos and code on in the previous PR. See /glade/p/cesm/community/CESM-DPLE/CESM-DPLE_POPCICEhindcast

Excellent. I'll test on those!

view this post on Zulip Stephen Yeager (Mar 01 2021 at 20:11):

@Anna-Lena Deppenmeier @Yassir Eddebbar Would you be willing to share your notebook that closes the POP temperature budget using xarray and xgcm?

view this post on Zulip Anna-Lena Deppenmeier (Mar 01 2021 at 20:24):

sure, I am just running it now to make sure it works properly and will share it then

view this post on Zulip Matt Long (Mar 01 2021 at 20:27):

cc @Max Grover

view this post on Zulip Yassir Eddebbar (Mar 01 2021 at 22:45):

Not sure if it's of interest, but I have an [O2] budget version with major input from @Anna-Lena Deppenmeier as well; it needs a rerun and cleanup too...

view this post on Zulip Anna-Lena Deppenmeier (Mar 02 2021 at 00:11):

Currently my budget is not closing. Not sure why, this is the same code that closed a while ago, before a bunch of updates. Am looking into it and hopefully can close it again soon.

view this post on Zulip Anna-Lena Deppenmeier (Mar 02 2021 at 23:33):

Hi @Stephen Yeager I have something that works, do you currently work on the CGD system or casper/cheyenne?

view this post on Zulip Stephen Yeager (Mar 03 2021 at 00:47):

Great! Prefer something that works on casper. Thank you very much for debugging and sharing!

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 16:01):

ok, I'm currently running it on cgd. I will put it on casper and make sure it works, then I'll let you know where to find it.

view this post on Zulip Deepak Cherian (Mar 03 2021 at 16:07):

Anna you should upload it to pop-tools!

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 16:09):

The notebook itself? Or change it into a function?

view this post on Zulip Deepak Cherian (Mar 03 2021 at 16:11):

The notebook. almost 0 effort :slight_smile:

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 19:37):

ok, so here come the problems. So far it's not running on casper -- I assume it's something to do with xgcm. This is the error I get when I try to convert my dataset ds into an xgcm-compatible dataset:

# here we get the xgcm compatible dataset
gridx, dsx = pop_tools.to_xgcm_grid_dataset(ds)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-6-c629003e2721> in <module>
      1 # here we get the xgcm compatible dataset
----> 2 gridx, dsx = pop_tools.to_xgcm_grid_dataset(ds)
      3
      4 # make sure we have the cell volumne for calculations
      5 dsx["cell_volume"] = dsx.DZT * dsx.DXT * dsx.DYT

/glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages/pop_tools/xgcm_util.py in to_xgcm_grid_dataset(ds, **kwargs)
    202
    203     try:
--> 204         import xgcm
    205     except ImportError:
    206         raise ImportError(

/glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages/xgcm/__init__.py in <module>
      4 del get_versions
      5
----> 6 from .grid import Grid, Axis
      7 from .autogenerate import generate_grid_ds

/glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages/xgcm/grid.py in <module>
    757
    758
--> 759 class Grid:
    760     """
    761     An object with multiple :class:`xgcm.Axis` objects representing different

/glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages/xgcm/grid.py in Grid()
   1065         return out
   1066
-> 1067     @docstrings.dedent
   1068     def _apply_vector_function(self, function, vector, **kwargs):
   1069         # the keys, should be axis names

/glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages/docrep/decorators.py in update_docstring(self, *args, **kwargs)
     41             return func(self, *args, **kwargs)
     42         elif len(args) and callable(args[0]):
---> 43             doc = func(self, args[0].__doc__, *args[1:], **kwargs)
     44             _set_object_doc(args[0], doc, py2_class=self.python2_classes)
     45             return args[0]

/glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages/docrep/__init__.py in dedent(self, s, stacklevel)
    532             encountering an invalid key in the string
    533         """
--> 534         s = inspect.cleandoc(s)
    535         return safe_modulo(s, self.params, stacklevel=stacklevel)
    536

/glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/inspect.py in cleandoc(doc)
    617     onwards is removed."""
    618     try:
--> 619         lines = doc.expandtabs().split('\n')
    620     except UnicodeError:
    621         return None

AttributeError: 'NoneType' object has no attribute 'expandtabs'

I have tried updating my environment and making sure the relevant packages are the same version, but I can't manage it for pop-tools and xgcm.

conda_analysis_casper.txt:pop-tools                 2020.2.17.post6          pypi_0    pypi
conda_dcpy_andre.txt:pop-tools                 2020.12.15         pyhd8ed1ab_0    conda-forge

When I run conda update -c conda-forge pop-tools it stays the same; when I try pip install git+https://github.com/NCAR/pop-tools.git it gives me this error:

(analysis) -bash-4.2$ pip install git+https://github.com/NCAR/pop-tools.git
Collecting git+https://github.com/NCAR/pop-tools.git
  Cloning https://github.com/NCAR/pop-tools.git to /glade/scratch/deppenme/pip-req-build-8etvcn_v
  Running command git clone -q https://github.com/NCAR/pop-tools.git /glade/scratch/deppenme/pip-req-build-8etvcn_v
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
    Preparing wheel metadata ... done
Requirement already satisfied: dask>=2.14 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from pop-tools==2020.12.15) (2021.2.0)
Requirement already satisfied: pyyaml>=5.3.1 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from pop-tools==2020.12.15) (5.4.1)
Collecting numba>=0.52
  Using cached numba-0.52.0-cp37-cp37m-manylinux2014_x86_64.whl (3.2 MB)
Requirement already satisfied: xarray>=0.16.1 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from pop-tools==2020.12.15) (0.17.1.dev3+g48378c4)
Requirement already satisfied: numpy>=1.17.0 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from pop-tools==2020.12.15) (1.19.5)
Requirement already satisfied: pooch>=1.3.0 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from pop-tools==2020.12.15) (1.3.0)
Requirement already satisfied: setuptools in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from numba>=0.52->pop-tools==2020.12.15) (49.6.0.post20210108)
Collecting llvmlite<0.36,>=0.35.0
  Using cached llvmlite-0.35.0-cp37-cp37m-manylinux2010_x86_64.whl (25.3 MB)
Requirement already satisfied: requests in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from pooch>=1.3.0->pop-tools==2020.12.15) (2.25.1)
Requirement already satisfied: packaging in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from pooch>=1.3.0->pop-tools==2020.12.15) (20.9)
Requirement already satisfied: appdirs in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from pooch>=1.3.0->pop-tools==2020.12.15) (1.4.4)
Requirement already satisfied: pandas>=0.25 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from xarray>=0.16.1->pop-tools==2020.12.15) (1.2.3)
Requirement already satisfied: python-dateutil>=2.7.3 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from pandas>=0.25->xarray>=0.16.1->pop-tools==2020.12.15) (2.8.1)
Requirement already satisfied: pytz>=2017.3 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from pandas>=0.25->xarray>=0.16.1->pop-tools==2020.12.15) (2021.1)
Requirement already satisfied: six>=1.5 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from python-dateutil>=2.7.3->pandas>=0.25->xarray>=0.16.1->pop-tools==2020.12.15) (1.15.0)
Requirement already satisfied: pyparsing>=2.0.2 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from packaging->pooch>=1.3.0->pop-tools==2020.12.15) (2.4.7)
Requirement already satisfied: chardet<5,>=3.0.2 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from requests->pooch>=1.3.0->pop-tools==2020.12.15) (4.0.0)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from requests->pooch>=1.3.0->pop-tools==2020.12.15) (1.26.3)
Requirement already satisfied: certifi>=2017.4.17 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from requests->pooch>=1.3.0->pop-tools==2020.12.15) (2020.12.5)
Requirement already satisfied: idna<3,>=2.5 in /glade/work/deppenme/miniconda3/envs/analysis/lib/python3.7/site-packages (from requests->pooch>=1.3.0->pop-tools==2020.12.15) (2.10)
Building wheels for collected packages: pop-tools
  Building wheel for pop-tools (PEP 517) ... done
  Created wheel for pop-tools: filename=pop_tools-2020.12.15-py3-none-any.whl size=30153 sha256=988c711795c5b46037a868957be10d82f56fab4eab935b862cbc8c054f5f7717
  Stored in directory: /glade/scratch/deppenme/pip-ephem-wheel-cache-bdpxkpi5/wheels/65/87/a3/7667dcc7225e5105e95e09186af11e0befb140c2779fc074cf
Successfully built pop-tools
Installing collected packages: llvmlite, numba, pop-tools
  Attempting uninstall: llvmlite
    Found existing installation: llvmlite 0.31.0
ERROR: Cannot uninstall 'llvmlite'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.

Anyway, I believe the issue is in principle with xgcm (see the .grid problem), so I tried to at least get that to the same version on both systems. Both seem to be installed by conda-forge

conda_analysis_casper.txt:xgcm                      0.3.0                      py_0    conda-forge
conda_dcpy_andre.txt:xgcm                      0.5.1                      py_0    conda-forge

and when I try to update with conda-forge it says it's already installed and doesn't change the version.

Long story short: help please. @Deepak Cherian @Anderson Banihirwe any ideas?

view this post on Zulip Deepak Cherian (Mar 03 2021 at 19:59):

Yes, this is an xgcm version problem. conda update should fix it; are you getting some other error in that case?

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 20:05):

I keep doing conda update and it doesn't change the version.

(analysis) -bash-4.2$ conda update xgcm
Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.

(analysis) -bash-4.2$

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 20:06):

same with conda update -c conda-forge xgcm

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 21:00):

@Anna-Lena Deppenmeier, I think the issue is coming from an incompatible docrep version... Try pinning docrep to v0.2.7

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 21:00):

conda install -c conda-forge docrep==0.2.7

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 21:01):

yes, I have come across the docrep problem. it is currently at 0.2.7
docrep 0.2.7 pypi_0 pypi

am doing the conda install thing now and will then try a conda update xgcm again.

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 21:06):

it's thinking really hard and currently hangs at

(analysis) -bash-4.2$ conda install -c conda-forge docrep==0.2.7
Collecting package metadata (current_repodata.json): done
Solving environment: -
The environment is inconsistent, please check the package plan carefully
The following packages are causing the inconsistency:

  - conda-forge/linux-64::psyplot==1.3.1=py37h89c1867_0
  - conda-forge/noarch::xrft==0.2.3=pyhd3deb0d_0
  - conda-forge/noarch::funcargparse==0.2.3=pyh9f0ad1d_0
failed with initial frozen solve. Retrying with flexible solve.
Solving environment: |

what can I do about the inconsistent environment?

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 21:09):

ok it took a while but it worked and I was able to update xgcm. Thanks @Anderson Banihirwe ! And while you're here (; I am initializing my cluster like this

# this is for when you do things on casper
import ncar_jobqueue
import dask
import distributed

dask.config.set({'distributed.dashboard.link': '/proxy/{port}/status'})
cluster = ncar_jobqueue.NCARCluster() # initializes cluster
client = distributed.Client(cluster)

cluster.adapt(minimum=6, maximum=32, wait_count=600) # 6 workers minimum, 32 maximum;
                                                     # wait ~10 minutes before releasing them
client

but I can't see my dashboard when I click on the link, what am I doing wrong?

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 21:42):

what version of ncar-jobqueue are you using?

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 21:42):

With the newest version of ncar-jobqueue, you don't need this

dask.config.set({'distributed.dashboard.link': '/proxy/{port}/status'})

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 21:44):

'2020.3.4'

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 21:45):

so when I remove the line and click on the new link, it wants to log in to jupyterhub, which I am not using in the first place

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 21:45):

it's thinking really hard and currently hangs at

You may find mamba to be very useful in these kinds of situations

conda install -c conda-forge mamba  # in your base environment

and whenever you want to install packages via conda, replace conda with mamba. E.g.: mamba install -c conda-forge docrep==0.2.7

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 21:46):

'2020.3.4'

That's old

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 21:46):

Try the latest version 2021.2.10

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 21:46):

conda install -c conda-forge ncar-jobqueue==2021.2.10

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 21:47):

hm. I did a conda update all in the beginning of the day, because I haven't worked on casper in a while, I wonder why it didn't update it.

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 21:48):

Is there a running document somewhere where we keep the current versions? So for example I am mostly working on the cgd system these days, and whenever I switch back to casper I am having these issues and have to ask people (partly because conda update all does not seem to take care of these things?)
but anyhow I updated and I still get [pasted image] upon clicking on the link (user_uploads/2/43/qRHGVdyfK9fAaORJ5dlfqOn8/pasted_image.png)

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 21:52):

hm. I did a conda update all in the beginning of the day, because I haven't worked on casper in a while, I wonder why it didn't update it.

the conda solver is very flexible unless you pin down which versions you want... It's hard to control this when you've installed packages via a conda install ... command. One remedy is to curate your environment in an environment.yml and then run the conda env update -f environment.yml command whenever you want to update... (With this approach, it's easy to control which versions get installed.)

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 21:52):

Is there a running document somewhere where we keep the current versions? So for example I am mostly working on the cgd system these days, and whenever I switch back to casper I am having these issues and have to ask people (partly because conda update all does not seem to take care of these things?)
but anyhow I updated and I still get [pasted image] upon clicking on the link (user_uploads/2/43/qRHGVdyfK9fAaORJ5dlfqOn8/pasted_image.png)

This looks very familiar

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 21:52):

This is likely a version mismatch

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 21:53):

hm. I did a conda update all in the beginning of the day, because I haven't worked on casper in a while, I wonder why it didn't update it.

the conda solver is very flexible unless you pin down which versions you want... It's hard to control this when you've installed packages via a conda install ... command. One remedy is to curate your environment in an environment.yml and then run the conda env update -f environment.yml command whenever you want to update... (With this approach, it's easy to control which versions get installed.)

I don't understand what you mean with curate -- like I literally go in and find the version numbers I need?

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 21:55):

tbh that doesn't sound very easy to me at all. I thought the idea was to be able to use conda install and then get the latest version if I don't specify which version I want with ==?

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 22:00):

I don't understand what you mean with curate -- like I literally go in and find the version numbers I need?

For instance, if you were starting with a new environment, you would create a new environment.yml file. The original contents of this file may look like this:

name: my-new-env
channels:
   - conda-forge
dependencies:
  - python=3.8
  - xarray==0.16.2
  - dask==2.14

To create/update this environment, you would run conda env update -f environment.yml

If, say, a week later you decide that you want to upgrade one or more dependencies in this environment, you go back to the environment.yml file and update it, e.g.

name: my-new-env
channels:
   - conda-forge
dependencies:
  - python=3.8
  - xarray==0.17 # Upgrade to latest version of xarray
  - dask==2.14

At this point, you would then run conda env update -f environment.yml

With this approach, you can have an idea of what versions of packages you care about are installed in your environment by looking at your environment.yml file and you can easily re-create this environment on the same machine or another machine with the same type of operating system (Linux, MacOS, etc...). This is what I mean by environment curation.

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 22:00):

I hope this clarifies my previous point

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 22:01):

Regarding the version mismatch issue, what is the output of

import dask, distributed, bokeh
print(dask.__version__, distributed.__version__, bokeh.__version__)

from your notebook??

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 22:04):

I hope this clarifies my previous point

It does, thanks. It seems like I would have to stay on top of every package version I would want to use, which seems like a lot to research / keep in mind. Anyway.

The output is 2021.02.0 2021.02.0 2.3.0

I also just tried pushing my notebook to git and it says I am denied

Username for 'https://github.com': ALDepp
Password for 'https://ALDepp@github.com':
remote: Permission to NCAR/pop-tools.git denied to ALDepp.
fatal: unable to access 'https://github.com/NCAR/pop-tools.git/': The requested URL returned error: 403

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 22:06):

The output is 2021.02.0 2021.02.0 2.3.0

Try pinning your dask, distributed, bokeh versions to these versions:

dask==2.14
distributed==2.14
bokeh==1.4

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 22:09):

I also just tried pushing my notebook to git and it says I am denied

I think you want to push to your fork (https://github.com/ALDepp/pop-tools) repo and open a pull request to merge your changes into NCAR/pop-tools

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 22:10):

Can you confirm your git repo is pointed to the right remote endpoint?

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 22:10):

Try this

view this post on Zulip Anderson Banihirwe (Mar 03 2021 at 22:10):

git remote -v

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 22:13):

I also just tried pushing my notebook to git and it says I am denied

I think you want to push to your fork (https://github.com/ALDepp/pop-tools) repo and open a pull request to merge your changes into NCAR/pop-tools

yes, thank you (:

view this post on Zulip Anna-Lena Deppenmeier (Mar 03 2021 at 22:23):

git remote -v
origin  https://github.com/ALDepp/pop-tools.git (fetch)
origin  https://github.com/ALDepp/pop-tools.git (push)
upstream    https://github.com/NCAR/pop-tools.git (fetch)
upstream    https://github.com/NCAR/pop-tools.git (push)

view this post on Zulip Anna-Lena Deppenmeier (Mar 04 2021 at 16:19):

The output is 2021.02.0 2021.02.0 2.3.0

Try pinning your dask, distributed, bokeh versions to these versions:

dask==2.14
distributed==2.14
bokeh==1.4
I have not been able to pin bokeh to 1.4. It took a very long time and then gave me a very long list of packages that conflict, and in the end it says

Your installed version is: 2.17

view this post on Zulip Anderson Banihirwe (Mar 04 2021 at 16:29):

Could you point me to the location of this conda environment?

view this post on Zulip Anna-Lena Deppenmeier (Mar 04 2021 at 16:50):

/glade/work/deppenme/miniconda3/envs/analysis

view this post on Zulip Yassir Eddebbar (Mar 12 2021 at 17:46):

I am wondering if it could be helpful to users not deeply familiar with the POP grid to add an illustration/visualization in @Anna-Lena Deppenmeier's notebook of where the various terms sit in the grid cell and what the various terms mean. This is what I found most challenging in creating the budget terms... I started on something a few months ago in Illustrator that can be edited for this purpose (needs QA/QC review for accuracy):
POP_Grid.png

view this post on Zulip Deepak Cherian (Mar 12 2021 at 17:47):

yes please! very cool image

view this post on Zulip Deepak Cherian (Mar 12 2021 at 17:49):

if you have html/svg skills you could add this to xgcm: https://github.com/xgcm/xgcm/issues/276

view this post on Zulip Max Grover (Mar 12 2021 at 17:49):

Would be good to add to pop-tools as well?

view this post on Zulip Anna-Lena Deppenmeier (Mar 12 2021 at 17:50):

@Yassir Eddebbar I think that would be useful. Are v.t and u.t consistent in this figure? one seems to be on the corner and one on the face

view this post on Zulip Anna-Lena Deppenmeier (Mar 12 2021 at 17:50):

@Yassir Eddebbar is it okay if I add this to the example notebook (since casper is up again I aim to make changes and resubmit today)

view this post on Zulip Yassir Eddebbar (Mar 12 2021 at 17:56):

Yassir Eddebbar is it okay if I add this to the example notebook (since casper is up again I aim to make changes and resubmit today)

Yes, as long as it looks ok to you and others accuracy-wise, go for it. Happy to make some edits later

view this post on Zulip Yassir Eddebbar (Mar 12 2021 at 18:01):

Yassir Eddebbar I think that would be useful. Are v.t and u.t consistent in this figure? one seems to be on the corner and one on the face

yeah, I struggled with how to represent v.t; it's supposed to look like it's popping out of the center of the northern cell face (location 3121), not the corner, but now looking at it, it can be confused with the SE upper corner... I can work up something better later?

view this post on Zulip Yassir Eddebbar (Mar 12 2021 at 18:03):

if you have html/svg skills you could add this to xgcm: https://github.com/xgcm/xgcm/issues/276

@Deepak Cherian I'll look into it, not familiar with svg yet, but sounds like it could be really useful for xgcm.

view this post on Zulip Yassir Eddebbar (Mar 12 2021 at 20:02):

@Anna-Lena Deppenmeier Here is an updated version with more POP-consistent naming and a legend: POP_Grid.png
@Max Grover Feel free to use on pop-tools doc or elsewhere, and let me know if you need any edits or additions for other uses, I can also send over the .ai file

view this post on Zulip Deepak Cherian (Mar 17 2021 at 14:03):

Just an update: this awesome notebook is now live thanks to @Anna-Lena Deppenmeier @Yassir Eddebbar @Anderson Banihirwe .

view this post on Zulip Who Kim (May 18 2021 at 21:39):

I am trying to expand this awesome notebook to work for prediction datasets (in particular DPLE). Thanks to Anderson, to_xgcm_grid_dataset() can now handle the DPLE dimensions (time, lead, ensemble, z, y, x). But when I try to compute the budget term in the Z direction, i.e.,

budget["WTT"] = (gridxgcm.diff(dsxgcm.WTT.fillna(0) * dsxgcm.VOL.values, axis="Z") / dsxgcm.VOL)

, I get the following error, while the X and Y directions work fine: "None of the DataArray's dims ('Y', 'M', 'L', 'z_w_top', 'nlat_t', 'nlon_t') were found in axis coords." Can anyone help me figure out how to fix this problem? Thanks!

view this post on Zulip Anna-Lena Deppenmeier (May 19 2021 at 15:41):

Hi Who, on which coordinate is WTT? I'm assuming VOL is on z_t and WTT on z_w_bot or z_w_top

view this post on Zulip Anna-Lena Deppenmeier (May 19 2021 at 15:41):

If that is the case, then you need to either reassign/rename the coordinate or interpolate first for xgcm to be willing to multiply WTT with VOL.

view this post on Zulip Yassir Eddebbar (May 19 2021 at 17:35):

@Who Kim can you paste your code for how you set up your grid, vertical thickness, volume, etc?

view this post on Zulip Who Kim (May 19 2021 at 20:31):

@Anna-Lena Deppenmeier I have tried that, but it didn't work (I forget what the error message was). In fact, this line is the same as in your original script; the only difference is that WTT now has more coordinates. Doesn't dsxgcm.VOL.values make it free from its assigned coordinates?

@Yassir Eddebbar Below is how I set those up, which is similar to Anna's script with some modifications:

dims = ["z_t", "nlat", "nlon"]
coords = {"z_t": dple.z_t, "nlat": dple.nlat, "nlon": dple.nlon}
shape = (len(dple.dz), len(dple.nlat), len(dple.nlon))

# broadcast the 1-D dz into 3-D thickness fields
dple["DZT"] = xr.DataArray(dple.dz.values[:, None, None] * np.ones(shape), dims=dims, coords=coords)
dple["DZU"] = xr.DataArray(dple.dz.values[:, None, None] * np.ones(shape), dims=dims, coords=coords)

dple.DZT.attrs["long_name"] = "Thickness of T cells"
dple.DZT.attrs["units"] = "centimeter"
dple.DZT.attrs["grid_loc"] = "3111"
dple.DZU.attrs["long_name"] = "Thickness of U cells"
dple.DZU.attrs["units"] = "centimeter"
dple.DZU.attrs["grid_loc"] = "3221"

# make sure we have the cell volume for calculations
VOL = (dple.DZT * dple.DXT * dple.DYT).compute()
KMT = dple.KMT.compute()

for j in tqdm(range(len(KMT.nlat))):
    for i in range(len(KMT.nlon)):
        k = KMT.values[j, i].astype(int)
        VOL.values[k:, j, i] = 0.0

dple["VOL"] = VOL

dple.VOL.attrs["long_name"] = "volume of T cells"
dple.VOL.attrs["units"] = "centimeter^3"

dple.VOL.attrs["grid_loc"] = "3111"
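As an aside, the nested KMT loop above can be written as a single vectorized `where`; a sketch with toy grid sizes (`k_index` is a made-up helper name, not from the script):

```python
import numpy as np
import xarray as xr

# Toy sizes instead of the full POP grid
nz, ny, nx = 4, 3, 3
vol = xr.DataArray(np.ones((nz, ny, nx)), dims=["z_t", "nlat", "nlon"])
kmt = xr.DataArray(
    np.array([[2, 4, 1], [0, 3, 2], [4, 4, 0]]), dims=["nlat", "nlon"]
)

# Level k is ocean only where k < KMT; zero out cells at or below the floor
k_index = xr.DataArray(np.arange(nz), dims=["z_t"])
vol_masked = vol.where(k_index < kmt, 0.0)
```

This avoids the Python-level double loop and also works lazily on dask-backed arrays.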

view this post on Zulip Anna-Lena Deppenmeier (May 19 2021 at 20:33):

I didn't catch the .values, thanks for pointing that out. I'm not sure xgcm would multiply with .values, though, given that it looks for matching coordinates to determine how to multiply.

view this post on Zulip Who Kim (May 19 2021 at 20:38):

Below is the code from your script. I can compute UET and VNT using the same code without any problems. Is there a reason why you multiply by (dz * DXT * DYT).values for WTT instead of VOL.values? I think I tried that, but I can try again with your original formula.

budget["UET"] = -(gridxgcm.diff(dsxgcm.UET * dsxgcm.VOL.values, axis="X") / dsxgcm.VOL)
budget["VNT"] = -(gridxgcm.diff(dsxgcm.VNT * dsxgcm.VOL.values, axis="Y") / dsxgcm.VOL)
budget["WTT"] = (
    gridxgcm.diff(dsxgcm.WTT.fillna(0) * (dsxgcm.dz * dsxgcm.DXT * dsxgcm.DYT).values, axis="Z")
    / dsxgcm.VOL
)

view this post on Zulip Yassir Eddebbar (May 19 2021 at 21:05):

Hmmm, how about the xgcm grid set up, any modification there? i.e. this part:

metrics = {
    ("X",): ["DXU", "DXT"],  # X distances
    ("Y",): ["DYU", "DYT"],  # Y distances
    ("Z",): ["DZU", "DZT"],  # Z distances
    ("X", "Y"): ["UAREA", "TAREA"],
}

# here we get the xgcm compatible dataset
gridxgcm, dsxgcm = pop_tools.to_xgcm_grid_dataset(
    ds,
    periodic=False,
    metrics=metrics,
    boundary={"X": "extend", "Y": "extend", "Z": "extend"},
)

for coord in ["nlat", "nlon"]:
    if coord in dsxgcm.coords:
        dsxgcm = dsxgcm.drop_vars(coord)

view this post on Zulip Who Kim (May 19 2021 at 23:00):

It is exactly the same.

view this post on Zulip Yassir Eddebbar (May 20 2021 at 15:49):

Following on @Anna-Lena Deppenmeier 's reassigning coordinates idea, does an xr.roll / shift method work (just for debugging purposes)? something like:

di=xr.Dataset()
di['UET'] = -((ds.UET*ds.VOL) - (ds.UET*ds.VOL).roll(nlon=1, roll_coords=True))/ds.VOL
di['VNT'] = -((ds.VNT*ds.VOL) - (ds.VNT*ds.VOL).roll(nlat=1, roll_coords=True))/ds.VOL
di['WTT'] = - ((ds.WTT*(ds.VOL.drop('z_t').rename({"z_t":"z_w_top"}).assign_coords(z_w_top=ds.z_w_top))
                     - (ds.WTT*(ds.VOL.drop('z_t').rename({"z_t":"z_w_top"}).assign_coords(z_w_top=ds.z_w_top))).shift(z_w_top=-1).fillna(0)
                    ).drop('z_w_top').rename({"z_w_top":"z_t"}).assign_coords(z_t=ds.z_t))/ds.VOL

view this post on Zulip Who Kim (May 20 2021 at 16:38):

It appears to be working (at least I didn't see any error). I have also generated a new VOL defined at z_w_top and added it to dsxgcm, but I got the same error:

dsxgcm["VOL2"] = xr.DataArray(dsxgcm.VOL.values, dims=['z_w_top', 'nlat_t', 'nlon_t'],
                              coords={'z_w_top':dsxgcm.z_w_top, 'nlat_t':dsxgcm.nlat_t, 'nlon_t':dsxgcm.nlon_t})

budget["WTT"] = (gridxgcm.diff(dsxgcm.WTT.fillna(0) * dsxgcm.VOL2.values, axis="Z")/dsxgcm.VOL)

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-86-192e3d3043c5> in <module>
      2 vol_top = vol_top.rename({'z_t': 'z_w_top'})
      3
----> 4 budget["WTT"] = (gridxgcm.diff(dsxgcm.WTT.fillna(0) * vol_top, axis="Z")/ dsxgcm.VOL)

~/miniconda3/envs/rapcdi-analysis/lib/python3.8/site-packages/xgcm/grid.py in diff(self, da, axis, **kwargs)
   1483         >>> grid.diff(da, ['X', 'Y'], fill_value={'X':0, 'Y':100})
   1484         """
-> 1485         return self._grid_func("diff", da, axis, **kwargs)
   1486
   1487     @docstrings.dedent

~/miniconda3/envs/rapcdi-analysis/lib/python3.8/site-packages/xgcm/grid.py in _grid_func(self, funcname, da, axis, **kwargs)
   1419                 out = out * metric
   1420
-> 1421             out = func(out, **kwargs)
   1422
   1423             if metric_weighted:

~/miniconda3/envs/rapcdi-analysis/lib/python3.8/site-packages/xgcm/grid.py in diff(self, da, to, boundary, fill_value, boundary_discontinuity, vector_partner, keep_coords)
    638         """
    639
--> 640         return self._neighbor_binary_func(
    641             da,
    642             raw_diff_function,

~/miniconda3/envs/rapcdi-analysis/lib/python3.8/site-packages/xgcm/grid.py in _neighbor_binary_func(self, da, f, to, boundary, fill_value, boundary_discontinuity, vector_partner, keep_coords)
    275             The differenced data
    276         """
--> 277         position_from, dim = self._get_axis_coord(da)
    278         if to is None:
    279             to = self._default_shifts[position_from]

~/miniconda3/envs/rapcdi-analysis/lib/python3.8/site-packages/xgcm/grid.py in _get_axis_coord(self, da)
   1013                 return position, coord_name
   1014
-> 1015         raise KeyError(
   1016             "None of the DataArray's dims %s were found in axis "
   1017             "coords." % repr(da.dims)

KeyError: "None of the DataArray's dims ('Y', 'M', 'L', 'z_w_top', 'nlat_t', 'nlon_t') were found in axis coords."

view this post on Zulip Anna-Lena Deppenmeier (May 20 2021 at 16:49):

Who, I don't know where the 'Y', 'M', 'L' dimensions are coming from. I recommend always checking the dimensions of your DataArrays, as they have to line up for xgcm to be willing to multiply.

view this post on Zulip Anna-Lena Deppenmeier (May 20 2021 at 16:50):

also, using .values negates the dimensions you added to VOL2: it turns the xarray DataArray into a plain numpy.ndarray.

view this post on Zulip Anna-Lena Deppenmeier (May 20 2021 at 16:56):

Can you try just multiplying by VOL2 instead of VOL2.values?
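To illustrate the point about `.values` (a tiny self-contained example, not the DPLE data):

```python
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.arange(3.0), dims=["z_t"], coords={"z_t": [5.0, 15.0, 25.0]}
)

# .values returns the underlying numpy array: dims and coords are gone,
# so any subsequent multiplication falls back to positional broadcasting
arr = da.values
```

Once the labels are stripped, xgcm can no longer use the coordinate names to decide how the operands should be aligned.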

view this post on Zulip Deepak Cherian (May 20 2021 at 16:59):

(I know I've asked this 50x before but I keep forgetting!) Why are we multiplying by VOL.values. Is it because there's a z_w and z_w_top mismatch?

view this post on Zulip Yassir Eddebbar (May 20 2021 at 19:21):

I believe the mismatch was z_t and z_w_top since WTT is centered at the center of the upper face (3112) vs VOL is in the center of the cell (3111).

view this post on Zulip Yassir Eddebbar (May 20 2021 at 19:25):

Who, I don't know where the 'Y', 'M', 'L' dimensions are coming from. I recommend always checking your dimensions for the DataArrays, as they have to line up for xgcm to want to multiply.

maybe xgcm is confused by the DPLE Y dimension? in:

~/miniconda3/envs/rapcdi-analysis/lib/python3.8/site-packages/xgcm/grid.py in diff(self, da, axis, **kwargs)
   1483         >>> grid.diff(da, ['X', 'Y'], fill_value={'X':0, 'Y':100})
   1484         """
-> 1485         return self._grid_func("diff", da, axis, **kwargs)
   1486
   1487     @docstrings.dedent

copying @Julius Busecke who helps develop xgcm

view this post on Zulip Who Kim (May 20 2021 at 20:02):

@Anna-Lena Deppenmeier As I mentioned in my first post, I am working on DPLE data with Y, M, L being the start year, ensemble member, and lead time, respectively. I believe the reason .values is used is that UET and VNT are at (nlat_t, nlon_u) while VOL is on the T-grid for both. If I don't use .values, I get: broadcasting cannot handle duplicate dimensions: ['time', 'z_t', 'nlat_t', 'nlon_u', 'nlon_u']. I thought Anna used .values for a similar reason, because WTT is at z_w_top although it is on the T-grid horizontally. If this logic is correct, I don't see any reason why this operation is not working for WTT. Indeed, it works fine when my dataset has the conventional POP (t, z, y, x) dimensions. So my speculation is that it is related to the unconventional dataset dimensions, which xgcm perhaps cannot handle?

view this post on Zulip Who Kim (May 20 2021 at 20:11):

I just want to add that the operation for UET and VNT is working with the unconventional data dimensions.

view this post on Zulip Anna-Lena Deppenmeier (May 20 2021 at 20:13):

would you mind copying the dimensions of your dataarrays here Who? I don't understand where the duplicate nlon_u is coming from.

view this post on Zulip Anna-Lena Deppenmeier (May 20 2021 at 20:14):

specifically VOL (without .values) and WTT

view this post on Zulip Who Kim (May 20 2021 at 20:18):

'VOL' (z_t: 60, nlat_t: 384, nlon_t: 320)
'WTT' (Y: 63, M: 39, L: 122, z_w_top: 60, nlat_t: 384, nlon_t: 320)
The error message above is what I got when I ran the operation for UET without .values, not WTT

view this post on Zulip Anna-Lena Deppenmeier (May 20 2021 at 20:19):

I just want to add that the operation for UET and VNT is working with the unconventional data dimensions.

ah so you meant when you run this with .values? I misunderstood

view this post on Zulip Who Kim (May 20 2021 at 20:20):

Correct

view this post on Zulip Anna-Lena Deppenmeier (May 20 2021 at 20:20):

so what are the dimensions of UET? (Y: 63, M: 39, L: 122, z_t: 60, nlat_t: 384, nlon_u: 320)?

view this post on Zulip Who Kim (May 20 2021 at 20:21):

Yes

view this post on Zulip Anna-Lena Deppenmeier (May 20 2021 at 20:26):

:thinking:

view this post on Zulip Julius Busecke (May 20 2021 at 22:56):

Hey everyone. I have to admit I am a bit overwhelmed right here (might also be the dask summit haha). What is the exact code that fails?

view this post on Zulip Who Kim (May 21 2021 at 16:51):

The code fails to compute,

budget["WTT"] = (gridxgcm.diff(dsxgcm.WTT.fillna(0) * dsxgcm.VOL2.values, axis="Z")/dsxgcm.VOL)

, while the same operation in the X and Y directions works fine. I am working with data (CESM DPLE) with "unusual" dimensions, which include additional "lead time" and "member" dimensions (i.e., (lead time, member, start year, z, y, x)). The same operation works fine when the data dimensions are the usual POP dimensions (time, z, y, x).

view this post on Zulip Anderson Banihirwe (May 21 2021 at 18:06):

It might be useful to have a reproducible example notebook in order to diagnose this issue. @Who Kim, is your work in a notebook that is publicly accessible or somewhere on Glade?

view this post on Zulip Who Kim (May 21 2021 at 22:23):

I think I found what went wrong. When reading the data, I only stored the variables and coordinates that I thought necessary, including z_w_top, but somehow to_xgcm_grid_dataset couldn't convert z_w_top to a coordinate in the xgcm dataset, while it did for z_w_bot. When I imported everything, z_w_top appears as a coordinate, weird. Thanks, everyone who responded here!
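I haven't reproduced the to_xgcm_grid_dataset behavior, but a common way to lose a staggered-grid coordinate is subsetting a dataset in which it arrived as a plain data variable; a toy sketch (using z_w_bot here purely for illustration):

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {
        "WTT": (("z_w_top",), np.arange(3.0)),
        "z_w_bot": ("z_w_top", [10.0, 20.0, 30.0]),  # arrives as a data variable
    },
    coords={"z_w_top": [0.0, 10.0, 20.0]},
)

# Selecting only the "interesting" variables silently drops z_w_bot:
sub = ds[["WTT"]]

# Keep it explicitly and promote it to a coordinate before handing the
# dataset to to_xgcm_grid_dataset:
sub2 = ds[["WTT", "z_w_bot"]].set_coords("z_w_bot")
```

Checking `ds.coords` versus `ds.data_vars` after reading the files makes this failure mode easy to spot.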

view this post on Zulip Deepak Cherian (May 21 2021 at 23:26):

hmm.. that error message is really misleading then. It should've told you that z_w_top was missing

view this post on Zulip Anna-Lena Deppenmeier (Jun 23 2021 at 17:22):

I think I found what went wrong. When reading the data, I only stored the variables and coordinates that I thought necessary, including z_w_top, but somehow to_xgcm_grid_dataset couldn't convert z_w_top to a coordinate in the xgcm dataset, while it did for z_w_bot. When I imported everything, z_w_top appears as a coordinate, weird.

I actually just had the same problem, would be great if someone could look into why to_xgcm_grid_dataset does not bring z_w_top along!

view this post on Zulip Deepak Cherian (Jun 23 2021 at 17:24):

open an issue!

view this post on Zulip Anna-Lena Deppenmeier (Jun 23 2021 at 17:25):

I am currently having problems with this line

budget["DIA_IMPVF_TEMP"][:, 0, :, :] = (
    SRF_TEMP_FLUX * dsxgcm.TAREA - dsxgcm.DIA_IMPVF_TEMP.isel(z_w_bot=0) * dsxgcm.TAREA
) / dsxgcm.VOL.values[0, :, :]

This works when I use the example script, which only has 1 timestep; in that case ds.DIA_IMPVF_TEMP is not a dask array. It does not work when I load multiple files: then ds.DIA_IMPVF_TEMP is a dask array, and I can't assign with budget["DIA_IMPVF_TEMP"][:, 0, :, :] = ... . Yesterday I tried to get around this by loading ds.DIA_IMPVF_TEMP, which resulted in a memory error. How can I assign something to a dask array without loading?
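One workaround for in-place assignment on a dask-backed array is to build the replacement as a lazy expression with `xr.where` instead of item assignment; a sketch with a toy numpy array (`surface` is a made-up stand-in for the surface-flux term, and the same expression stays lazy on dask-backed inputs):

```python
import numpy as np
import xarray as xr

# Toy field (time, z, y, x); in practice this would be the budget term
da = xr.DataArray(np.ones((2, 3, 4, 4)), dims=["time", "z_t", "nlat", "nlon"])

# Replacement values for the surface level (k = 0)
surface = xr.zeros_like(da.isel(z_t=0))

# Instead of da[:, 0, :, :] = surface (which fails on dask arrays),
# select the surface level with a boolean mask and substitute lazily:
mask = xr.DataArray(np.arange(da.sizes["z_t"]) == 0, dims=["z_t"])
result = xr.where(mask, surface, da).transpose("time", "z_t", "nlat", "nlon")
```

No chunk is ever materialized by the `where` itself; computation happens only when `result` is evaluated.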


Last updated: May 16 2025 at 17:14 UTC