Prestage model data

This notebook downloads data from the NCAR DASH repository, where the modeling data for this study have been archived [Long et al., 2021]. It also ensures that a dataset curated via Intake is accessible; local caching of this dataset happens automatically behind the scenes.

First, we demonstrate the various local storage locations used to support the calculation.

%load_ext autoreload
%autoreload 2
import os
from subprocess import Popen, PIPE
import tarfile

import xarray as xr
xr.set_options(display_style='text')

import config

Get data from DASH repo

Use a wget script provided by NCAR DASH to download the modeling data from Long et al. [2021]. This will not work on machines without wget installed (e.g., macOS, which does not ship with wget by default).
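Before launching the download, it can help to confirm that the required command-line tools are on the PATH. The following is a minimal sketch using `shutil.which` from the standard library; the helper name `find_missing_tools` is hypothetical and not part of this repository.

```python
import shutil


def find_missing_tools(tools):
    """Return the names in `tools` that are not found on the PATH."""
    return [name for name in tools if shutil.which(name) is None]


# Hypothetical pre-flight check: the staging cell below needs bash and wget.
missing = find_missing_tools(["bash", "wget"])
if missing:
    print(f"missing required tools: {', '.join(missing)}")
```

If the list is non-empty, installing the missing tool before running the staging cell avoids a failed transfer partway through.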

if not os.path.isdir(config.get("model_data_dir")):
    # run the wget script to stage the data
    # TODO: support curl too
    cwd = os.getcwd()
    script = f'{cwd}/wget-dash-archive.sh'

    # download and unpack inside the data root; always return to cwd afterwards
    os.chdir(config.get("model_data_dir_root"))
    try:
        p = Popen(['bash', script], stdout=PIPE, stderr=PIPE)
        stdout, stderr = p.communicate()
        if p.returncode:
            print(stderr.decode('UTF-8'))
            print(stdout.decode('UTF-8'))
            raise OSError('data transfer failed')

        # untar archive
        asset = config.get("dash_asset_fname")
        assert os.path.isfile(asset), f'missing {asset}'
        with tarfile.open(asset, "r:gz") as tar:
            tar.extractall()
    finally:
        os.chdir(cwd)
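
The untar step above calls `tar.extractall()` directly, which trusts every path stored in the archive. As a more defensive variant, the sketch below checks member names before extracting. The helper `safe_extract` is hypothetical, not part of this repository, and it only inspects member names (it does not examine symlink targets).

```python
import os
import tarfile


def safe_extract(archive_path, dest_dir):
    """Extract a gzipped tar archive, refusing members that escape dest_dir."""
    dest_root = os.path.realpath(dest_dir)
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar.getmembers():
            target = os.path.realpath(os.path.join(dest_root, member.name))
            # commonpath equals dest_root only when target lies inside it
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError(f"unsafe path in archive: {member.name}")
        tar.extractall(dest_root)
    return sorted(os.listdir(dest_root))
```

Passing the destination explicitly also removes the need to change the working directory before unpacking.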

os.listdir(config.get("model_data_dir"))
['TM5-Flux-mrf',
 'TM5-Flux-m0f',
 'CT2019B',
 's99oc_SOCCOM_v2020',
 's99oc_v2020',
 's99oc_ADJocI40S_v2020',
 'CAMSv20r1',
 'CT2017',
 'MIROC',
 'README.md',
 'CTE2018',
 'TM5-Flux-mmf',
 'TM5-Flux-mwf',
 'CTE2020']
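
To confirm that staging completed, the listing above can be compared against the set of expected entries. This is a minimal sketch: the names in `EXPECTED_ENTRIES` are copied from the listing above, and the helper `missing_entries` is hypothetical rather than part of this repository.

```python
import os

# Entry names copied from the directory listing above.
EXPECTED_ENTRIES = {
    "TM5-Flux-mrf", "TM5-Flux-m0f", "TM5-Flux-mmf", "TM5-Flux-mwf",
    "CT2017", "CT2019B", "CTE2018", "CTE2020",
    "s99oc_v2020", "s99oc_SOCCOM_v2020", "s99oc_ADJocI40S_v2020",
    "CAMSv20r1", "MIROC", "README.md",
}


def missing_entries(data_dir, expected=EXPECTED_ENTRIES):
    """Return the expected names that are absent from data_dir, sorted."""
    return sorted(set(expected) - set(os.listdir(data_dir)))
```

In this notebook, the check would be called as `missing_entries(config.get("model_data_dir"))`; an empty list indicates that every expected entry is present.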

Check on intake datasets

The models sub-package includes an Intake catalog file providing access to the CO2 air-sea flux product of Landschützer et al. [2016]. Here, we simply request that dataset; Intake is configured to cache it locally.

import models

ds = models.dataset_som_ffn.open_dataset()
ds
<xarray.Dataset>
Dimensions:             (time: 432, lat: 180, lon: 360, d2: 2, bnds: 2)
Coordinates:
  * time                (time) datetime64[ns] 1982-01-15T12:00:00 ... 2017-12...
  * lat                 (lat) float32 -89.5 -88.5 -87.5 -86.5 ... 87.5 88.5 89.5
  * lon                 (lon) float32 -179.5 -178.5 -177.5 ... 177.5 178.5 179.5
Dimensions without coordinates: d2, bnds
Data variables: (12/15)
    spco2_raw           (time, lat, lon) float32 nan nan nan nan ... nan nan nan
    SFCO2_OCN           (time, lat, lon) float32 nan nan nan nan ... nan nan nan
    spco2_smoothed      (time, lat, lon) float32 nan nan nan nan ... nan nan nan
    SFCO2_OCN_smoothed  (time, lat, lon) float32 nan nan nan nan ... nan nan nan
    sol                 (time, lat, lon) float32 nan nan nan nan ... nan nan nan
    kw                  (time, lat, lon) float32 nan nan nan nan ... nan nan nan
    ...                  ...
    dco2_smoothed       (time, lat, lon) float32 nan nan nan nan ... nan nan nan
    seamask             (lat, lon) int32 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0
    time_bnds           (time, d2) datetime64[ns] 1981-12-31 ... 2017-12-31
    lat_bnds            (lat, bnds) float32 -90.0 -89.0 -89.0 ... 89.0 89.0 90.0
    lon_bnds            (lon, bnds) float32 -180.0 -179.0 -179.0 ... 179.0 180.0
    area                (lat, lon) float64 1.079e+08 1.079e+08 ... 1.079e+08
Attributes:
    institution:    MPI-MET, Hamburg, Germany (former: ETH Zurich, Switzerland)
    institude_id:   MPI
    model_id:       SOM-FFN
    run_id:         v2018
    contact:        Peter Landschutzer (peter.landschuetzer@mpimet.mpg.de)
    creation_date:  2019-03-21