Prestage model data
This notebook downloads the modeling data for this study from the NCAR DASH repository, where it has been archived [Long et al., 2021]. It also checks that a dataset curated via Intake is accessible; local caching of this dataset happens automatically behind the scenes.
First, we print the local storage locations used to support the calculation.
%load_ext autoreload
%autoreload 2
The autoreload extension is already loaded. To reload it, use:
%reload_ext autoreload
import os
from subprocess import Popen, PIPE
import tarfile
import xarray as xr
xr.set_options(display_style='text')
import config
Print storage locations
config.get("project_tmpdir")
'/glade/work/mclong/so-co2-airborne-obs'
config.get("project_tmpdir_obs")
'/glade/work/mclong/so-co2-airborne-obs/obs-data'
config.get("model_data_dir_root")
'/glade/work/mclong/so-co2-airborne-obs/model-data'
config.get("model_data_dir")
'/glade/work/mclong/so-co2-airborne-obs/model-data/Long-etal-2021-SO-CO2-Science'
config.get("dash_asset_fname")
'Long-etal-2021-SO-CO2-Science.tar.gz'
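The locations printed above all nest under a single project directory. As a minimal sketch of that layout, assuming the paths are derived from one root: `build_storage_paths` is a hypothetical helper written for illustration, not the implementation of the `config` module, and the directory names are simply copied from the output above.

```python
import posixpath


def build_storage_paths(project_tmpdir, asset_stem="Long-etal-2021-SO-CO2-Science"):
    """Hypothetical helper: derive the storage layout shown above from one root."""
    model_root = posixpath.join(project_tmpdir, "model-data")
    return {
        "project_tmpdir": project_tmpdir,
        "project_tmpdir_obs": posixpath.join(project_tmpdir, "obs-data"),
        "model_data_dir_root": model_root,
        "model_data_dir": posixpath.join(model_root, asset_stem),
        "dash_asset_fname": f"{asset_stem}.tar.gz",
    }


paths = build_storage_paths("/glade/work/mclong/so-co2-airborne-obs")
for key, value in paths.items():
    print(f"{key}: {value}")
```

Only `project_tmpdir` would need to change to relocate the whole tree under this assumption.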
Get data from DASH repo
Use a wget script provided by NCAR DASH to download the modeling data in Long et al. [2021]. This won't work on machines where wget is not installed (e.g., macOS by default).
if not os.path.isdir(config.get("model_data_dir")):
    # run wget to stage data
    # TODO: support curl too
    cwd = os.getcwd()
    script = f'{cwd}/wget-dash-archive.sh'
    os.chdir(config.get("model_data_dir_root"))
    try:
        p = Popen(['bash', script], stdout=PIPE, stderr=PIPE)
        stdout, stderr = p.communicate()
        if p.returncode:
            print(stderr.decode('UTF-8'))
            print(stdout.decode('UTF-8'))
            raise OSError('data transfer failed')

        # untar archive
        assert os.path.isfile(config.get("dash_asset_fname")), f'missing {config.get("dash_asset_fname")}'
        with tarfile.open(config.get("dash_asset_fname"), "r:gz") as tar:
            tar.extractall()
    finally:
        # restore the working directory even if the transfer or extraction fails
        os.chdir(cwd)
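The cell above relies on wget and leaves curl support as a TODO. One possible way to close that gap is sketched below, assuming the archive can be fetched from a single URL: `build_download_command` is a hypothetical helper, and the URL is a placeholder, since the real DASH address lives in `wget-dash-archive.sh` and is not reproduced here.

```python
import shutil


def build_download_command(url, dest, which=shutil.which):
    """Return an argv list for wget or curl, whichever is found first.

    `which` is injectable so the selection logic can be exercised
    without either tool being installed.
    """
    if which("wget"):
        return ["wget", "-O", dest, url]
    if which("curl"):
        # -L follows redirects; --fail makes HTTP errors return a non-zero exit code
        return ["curl", "-L", "--fail", "-o", dest, url]
    raise OSError("neither wget nor curl was found on PATH")


# Placeholder URL for illustration only (an assumption, not the DASH address).
cmd = build_download_command(
    "https://example.org/archive.tar.gz",
    "archive.tar.gz",
    which=lambda name: "/usr/bin/curl" if name == "curl" else None,
)
print(cmd)
```

The resulting list could be passed to `Popen` in place of `['bash', script]`; the return-code check would stay the same.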
os.listdir(config.get("model_data_dir"))
['TM5-Flux-mrf',
'TM5-Flux-m0f',
'CT2019B',
's99oc_SOCCOM_v2020',
's99oc_v2020',
's99oc_ADJocI40S_v2020',
'CAMSv20r1',
'CT2017',
'MIROC',
'README.md',
'CTE2018',
'TM5-Flux-mmf',
'TM5-Flux-mwf',
'CTE2020']
Check on intake datasets
The models sub-package includes an Intake catalog file providing access to the CO2 air-sea flux product of Landschützer et al. [2016]. Here, we simply request that dataset; intake is configured to cache the dataset locally.
import models
ds = models.dataset_som_ffn.open_dataset()
ds
<xarray.Dataset>
Dimensions:             (time: 432, lat: 180, lon: 360, d2: 2, bnds: 2)
Coordinates:
  * time                (time) datetime64[ns] 1982-01-15T12:00:00 ... 2017-12...
  * lat                 (lat) float32 -89.5 -88.5 -87.5 -86.5 ... 87.5 88.5 89.5
  * lon                 (lon) float32 -179.5 -178.5 -177.5 ... 177.5 178.5 179.5
Dimensions without coordinates: d2, bnds
Data variables: (12/15)
    spco2_raw           (time, lat, lon) float32 nan nan nan nan ... nan nan nan
    SFCO2_OCN           (time, lat, lon) float32 nan nan nan nan ... nan nan nan
    spco2_smoothed      (time, lat, lon) float32 nan nan nan nan ... nan nan nan
    SFCO2_OCN_smoothed  (time, lat, lon) float32 nan nan nan nan ... nan nan nan
    sol                 (time, lat, lon) float32 nan nan nan nan ... nan nan nan
    kw                  (time, lat, lon) float32 nan nan nan nan ... nan nan nan
    ...                  ...
    dco2_smoothed       (time, lat, lon) float32 nan nan nan nan ... nan nan nan
    seamask             (lat, lon) int32 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0
    time_bnds           (time, d2) datetime64[ns] 1981-12-31 ... 2017-12-31
    lat_bnds            (lat, bnds) float32 -90.0 -89.0 -89.0 ... 89.0 89.0 90.0
    lon_bnds            (lon, bnds) float32 -180.0 -179.0 -179.0 ... 179.0 180.0
    area                (lat, lon) float64 1.079e+08 1.079e+08 ... 1.079e+08
Attributes:
    institution:    MPI-MET, Hamburg, Germany (former: ETH Zurich, Switzerland)
    institude_id:   MPI
    model_id:       SOM-FFN
    run_id:         v2018
    contact:        Peter Landschutzer (peter.landschuetzer@mpimet.mpg.de)
    creation_date:  2019-03-21
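The local caching mentioned above means the dataset is fetched once and then read from disk on later requests. The sketch below illustrates that general pattern using only the standard library; it is not Intake's caching mechanism, and `fetch_with_cache` and `fake_fetch` are hypothetical names introduced for this example.

```python
import os
import tempfile


def fetch_with_cache(name, cache_dir, fetch):
    """Return a local path for `name`, calling `fetch(path)` only on a cache miss."""
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, name)
    if not os.path.exists(path):
        fetch(path)
    return path


calls = []


def fake_fetch(path):
    # Stand-in for a real download: records the call and writes a small file.
    calls.append(path)
    with open(path, "w") as f:
        f.write("placeholder data\n")


with tempfile.TemporaryDirectory() as cache_dir:
    first = fetch_with_cache("dataset.nc", cache_dir, fake_fetch)
    second = fetch_with_cache("dataset.nc", cache_dir, fake_fetch)
    n_fetches = len(calls)
    same_path = first == second

print(n_fetches, same_path)
```

The second request finds the file already in the cache directory, so the fetch function runs only once.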