Stream: python-questions

Topic: Missing config file for pop-tools?


Elizabeth Maroon (May 22 2020 at 16:23):

Hi y'all, got a pop-tools problem on Cheyenne (version 2020.4.30). When using get_grid, pooch is trying to write temporary files to places I don't have permission for, rather than to my scratch space or another local directory. Is there a config file or YAML that I'm missing that tells pooch where to write temporary stuff?

import pop_tools
pop_tools.get_grid('POP_gx1v7')['REGION_MASK']

Traceback (most recent call last):
  File "/glade/u/home/emaroon/miniconda3/lib/python3.7/site-packages/pooch/utils.py", line 279, in make_local_storage
    with tempfile.NamedTemporaryFile(dir=path):
  File "/glade/u/home/emaroon/miniconda3/lib/python3.7/tempfile.py", line 547, in NamedTemporaryFile
    (fd, name) = _mkstemp_inner(dir, prefix, suffix, flags, output_type)
  File "/glade/u/home/emaroon/miniconda3/lib/python3.7/tempfile.py", line 258, in _mkstemp_inner
    fd = _os.open(file, flags, 0o600)
PermissionError: [Errno 13] Permission denied: '/glade/p/cesmdata/cseg/tmpodosnjdc'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/glade/u/home/emaroon/miniconda3/lib/python3.7/site-packages/pop_tools/grid.py", line 76, in get_grid
    horiz_grid_fname = INPUTDATA.fetch(grid_attrs['horiz_grid_fname'], downloader=downloader)
  File "/glade/u/home/emaroon/miniconda3/lib/python3.7/site-packages/pooch/core.py", line 564, in fetch
    make_local_storage(str(self.abspath))
  File "/glade/u/home/emaroon/miniconda3/lib/python3.7/site-packages/pooch/utils.py", line 293, in make_local_storage
    raise PermissionError(" ".join(message)) from error
PermissionError: [Errno 13] Permission denied: '/glade/p/cesmdata/cseg/tmpodosnjdc' | Pooch could not write to data cache folder '/glade/p/cesmdata/cseg'. Will not be able to download data files.

Matt Long (May 22 2020 at 16:55):

I would not expect any download to be necessary. It should be trying to access these files

    horiz_grid_fname: '/glade/p/cesmdata/cseg/inputdata/ocn/pop/gx1v7/grid/horiz_grid_20010402.ieeer8'
    topography_fname: '/glade/p/cesmdata/cseg/inputdata/ocn/pop/gx1v7/grid/topography_20161215.ieeei4'
    region_mask_fname: '/glade/p/cesmdata/cseg/inputdata/ocn/pop/gx1v7/grid/region_mask_20151008.ieeei4'

I think this might be a bug.

I think you can set an environment variable CESMDATAROOT to a different path as a temporary solution.
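A minimal sketch of that workaround from inside Python, assuming pop_tools reads CESMDATAROOT when it is imported, so the variable has to be set first. The helper name `point_cesmdataroot_at` and the cache directory are hypothetical and only for illustration.

```python
import os
import tempfile
from pathlib import Path


def point_cesmdataroot_at(directory):
    """Create `directory` if needed and export it as CESMDATAROOT.

    Hypothetical helper, not part of pop_tools.
    """
    path = Path(directory).expanduser()
    path.mkdir(parents=True, exist_ok=True)
    os.environ["CESMDATAROOT"] = str(path)
    return path


# Any directory you can write to will do; scratch space is a natural choice.
cache_dir = point_cesmdataroot_at(Path(tempfile.gettempdir()) / "cesm-data-cache")

# Assumption: import pop_tools only after the variable is set.
# import pop_tools
# pop_tools.get_grid('POP_gx1v7')['REGION_MASK']
```

Setting the variable in the shell before starting Python should have the same effect.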

For what it's worth, this

import pop_tools
pop_tools.get_grid('POP_gx1v7')['REGION_MASK']

works for me.

Elizabeth Maroon (May 22 2020 at 17:04):

Hi Matt, thanks for the quick workaround. Setting CESMDATAROOT to something else solves the problem for now. If you think this really is a bug, is it worth posting to the pop-tools GitHub, since it might be a Cheyenne-specific thing?

Matt Long (May 22 2020 at 17:11):

Would be great if you can file an issue ticket here.

Yes, I think it's a bug. We should be using the files that are available locally on Cheyenne. Somehow your CESMDATAROOT got munged to /glade/p/cesmdata/cseg/tmpodosnjdc, which I don't understand. If you get a fresh terminal on Cheyenne, is CESMDATAROOT set for you?

@Keith Lindsay, does CESM use CESMDATAROOT? I have forgotten whether we invented that for pop_tools or if we expropriated it.

Elizabeth Maroon (May 22 2020 at 17:17):

Cool, will file an issue ticket in a few minutes.

Yes, CESMDATAROOT is set by default for me on Cheyenne (and the student I've been helping through this issue):

echo $CESMDATAROOT
/glade/p/cesmdata/cseg

Keith Lindsay (May 22 2020 at 18:41):

Yes, @Matt Long , CESM/CIME uses CESMDATAROOT, to point to inputdata among other things.

FYI, I've recently stumbled upon git grep and have become rather fond of it.
For example, if I run git grep -l CESMDATAROOT from a cime root directory, I see where CESMDATAROOT is used:

config/cesm/machines/config_compilers.xml
config/cesm/machines/config_machines.xml
config/ufs/machines/config_compilers.xml
config/xml_schemas/config_machines_template.xml
doc/source/misc_tools/ect.rst
doc/source/users_guide/unit_testing.rst
tools/statistical_ensemble_test/README

Anderson Banihirwe (May 22 2020 at 23:48):

Yes, I think it's a bug. We should be using the files that are available locally on Cheyenne. Somehow your CESMDATAROOT got munged to /glade/p/cesmdata/cseg/tmpodosnjdc, which I don't understand.

The /glade/p/cesmdata/cseg/tmpodosnjdc directory was being created by pooch to assert that /glade/p/cesmdata/cseg is a writable directory. Earlier versions of pooch didn't mind if the local storage was pointing to a read-only location. Pooch would warn the user about this. However, the most recent version of pooch throws an error by default. I just addressed this issue in https://github.com/NCAR/pop-tools/pull/52
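The check described above can be sketched roughly as follows, based on the `tempfile.NamedTemporaryFile(dir=path)` call visible in the traceback. This is an illustration of the idea, not pooch's actual code; `is_writable` is a hypothetical name.

```python
import os
import tempfile


def is_writable(directory):
    """Return True if a temporary file can be created inside `directory`.

    The probe file is removed as soon as the `with` block exits.
    """
    try:
        with tempfile.NamedTemporaryFile(dir=str(directory)):
            return True
    except OSError:
        # Covers PermissionError (read-only location) and
        # FileNotFoundError (directory does not exist).
        return False


with tempfile.TemporaryDirectory() as existing:
    writable = is_writable(existing)
    not_there = is_writable(os.path.join(existing, "missing"))
```

On a read-only location such as /glade/p/cesmdata/cseg the probe fails, which is why the error message names a `tmp...` file that never actually existed.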

Matt Long (May 23 2020 at 13:59):

@Anderson Banihirwe, see my comment on your PR. I am wondering if we should overload fetch and remove pooch from the equation altogether if inputdata is unwritable. Alternatively, we could support multiple rank-ordered locations from which data can be read and, if the primary is unwritable, cached.
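One way the rank-ordered idea might look, as a rough sketch only. The function names `locate` and `pick_cache` are hypothetical, and this is not how pop_tools or pooch implements anything.

```python
import tempfile
from pathlib import Path


def is_writable(directory):
    """Return True if a temporary file can be created inside `directory`."""
    try:
        with tempfile.NamedTemporaryFile(dir=str(directory)):
            return True
    except OSError:
        return False


def locate(relative_name, read_roots):
    """Return the first existing copy of `relative_name` under the ordered roots, else None."""
    for root in read_roots:
        candidate = Path(root) / relative_name
        if candidate.is_file():
            return candidate
    return None


def pick_cache(cache_roots):
    """Return the first writable directory among the ordered candidates, else None."""
    for root in cache_roots:
        if is_writable(root):
            return Path(root)
    return None


# Demonstration with throwaway directories standing in for inputdata and scratch.
with tempfile.TemporaryDirectory() as shared, tempfile.TemporaryDirectory() as scratch:
    grid_file = Path(shared) / "grid.bin"
    grid_file.write_bytes(b"\x00")
    found = locate("grid.bin", [shared, scratch])
    missing = locate("other.bin", [shared, scratch])
    cache = pick_cache([Path(shared) / "absent", scratch])
    expected_cache = Path(scratch)
```

Under this scheme a file already present in a read-only location is used in place, and a download would only be needed when `locate` returns None, with `pick_cache` deciding where it goes.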

Matt Long (Dec 15 2020 at 19:53):

Fixed?
https://github.com/NCAR/pop-tools/releases/tag/v2020.12.15

Anderson Banihirwe (Dec 15 2020 at 22:14):

Fixed?
https://github.com/NCAR/pop-tools/releases/tag/v2020.12.15

Yes...

In [1]: import pop_tools

In [2]: mask = pop_tools.get_grid('POP_gx1v7')['REGION_MASK']
/glade/work/abanihi/softwares/miniconda3/envs/pop-tools-dev/lib/python3.8/site-packages/numba/np/ufunc/parallel.py:363: NumbaWarning: The TBB threading layer requires TBB version 2019.5 or later i.e., TBB_INTERFACE_VERSION >= 11005. Found TBB_INTERFACE_VERSION = 6103. The TBB threading layer is disabled.
  warnings.warn(problem)

In [3]: mask
Out[3]:
<xarray.DataArray 'REGION_MASK' (nlat: 384, nlon: 320)>
array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [1, 1, 1, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]], dtype=int32)
Dimensions without coordinates: nlat, nlon
Attributes:
    long_name:    basin index number (signed integers)
    coordinates:  TLONG TLAT

In [4]: from pop_tools.grid import INPUTDATA

In [5]: INPUTDATA.path
Out[5]: PosixPath('/glade/p/cesmdata/cseg')

In [6]: import os

In [7]: os.environ['CESMDATAROOT']
Out[7]: '/glade/p/cesmdata/cseg'

Last updated: Jan 30 2022 at 12:01 UTC