Running Standalone CUPiD (for existing CESM datasets)#

CUPiD can be run either independently or via the CESM workflow. If you want to run CUPiD as part of a CESM case submission, we recommend looking at the Running CUPiD via CESM Workflow page. If you already have CESM data that you want to analyze, and don’t plan on running CESM to generate a case, you have come to the right place!

Setup#

Install CUPiD’s analysis and infrastructure environments per the usual setup instructions.

Activate the cupid-infrastructure environment:

conda activate cupid-infrastructure

Adjust CUPiD configuration#

Change to the example directory and update its CUPiD configuration file (config.yml) with values relevant to your case:

cd examples/key_metrics

Request resources#

Request compute resources. For example, at NCAR, the following may be useful:

qinteractive -l select=1:ncpus=12:mem=120GB

For more details on resource requests, see the tips and tricks for running at NCAR page.

Postprocessing of Files#

Run Timeseries, if desired#

cupid-timeseries

Run Remapping, if desired#

Coming soon

cupid-remap

Run Diagnostics#

Running External Diagnostics#

CUPiD can also run external diagnostic packages. Its helper scripts use the CUPiD configuration file to automatically generate the configuration files these packages need, and CUPiD integrates their output into the same location where all other CUPiD output is viewed. The following packages are integrated into CUPiD:

ADF
CVDP
LDF
ILAMB

Running CUPiD Diagnostics#

CUPiD’s main example for generating diagnostics is examples/key_metrics. Other examples with various sets of diagnostic notebooks from nblibrary are also available. You can also create your own example describing a different set of notebooks, or even add your own notebooks!

This will require multiple compute cores:

$ cupid-diagnostics
$ cupid-webpage  # Will build HTML from Jupyter Book
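Because cupid-webpage builds HTML from the notebooks that cupid-diagnostics computes, it can be useful to stop at the first failure when scripting the two steps. The following is a minimal Python sketch, not part of CUPiD: the helper name run_in_order is hypothetical, and it assumes both commands are on the PATH of the active cupid-infrastructure environment and are launched from the example directory.

```python
import subprocess


def run_in_order(commands):
    """Run each command in sequence and stop at the first non-zero exit.

    `commands` is a list of argument lists, e.g. [["cupid-diagnostics"]].
    Returns the commands that completed successfully.
    """
    completed = []
    for cmd in commands:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            # Later steps depend on earlier output, so do not continue.
            break
        completed.append(cmd)
    return completed


# Example call (assumes the CUPiD commands are installed):
# run_in_order([["cupid-diagnostics"], ["cupid-webpage"]])
```

If the returned list is shorter than the input list, the first missing command is the one that failed.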

CUPiD Options#

Most of CUPiD’s configuration is done via the config.yml file, but there are a few command line options as well:

(cupid-infrastructure) $ cupid-diagnostics -h
Usage: cupid-diagnostics [OPTIONS] CONFIG_PATH

  Main engine to set up running all the notebooks.

Options:
  -s, --serial        Do not use LocalCluster objects
  -atm, --atmosphere  Run atmosphere component diagnostics
  -ocn, --ocean       Run ocean component diagnostics
  -lnd, --land        Run land component diagnostics
  -ice, --seaice      Run sea ice component diagnostics
  -glc, --landice     Run land ice component diagnostics
  -rof, --river-runoff Run river runoff component diagnostics
  --config_path       Path to the YAML configuration file containing specifications for notebooks (default config.yml)
  -h, --help          Show this message and exit.

Running in serial#

By default, several of the provided example notebooks use a dask LocalCluster object to run in parallel. The --serial option passes a boolean flag (serial) to each notebook, which the notebook can use to skip starting the cluster:

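from dask.distributed import Client, LocalCluster

# `serial` is the flag passed in by cupid-diagnostics;
# `lc_kwargs` is assumed to be defined earlier in the notebook.
# Spin up cluster (if running in parallel)
client = None
if not serial:
    cluster = LocalCluster(**lc_kwargs)
    client = Client(cluster)

client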

Specifying components#

If no component flags are provided, all component diagnostics listed in config.yml will be executed by default. Multiple flags can be used together to select a group of components, for example: cupid-diagnostics -ocn -ice.
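When launching CUPiD from a script, the flag combination can be assembled programmatically. The sketch below is illustrative only: the helper diagnostics_command and the COMPONENT_FLAGS mapping are hypothetical names, while the flags themselves are copied from the cupid-diagnostics -h output shown above.

```python
# Short component flags as listed in the `cupid-diagnostics -h` output.
COMPONENT_FLAGS = {
    "atmosphere": "-atm",
    "ocean": "-ocn",
    "land": "-lnd",
    "seaice": "-ice",
    "landice": "-glc",
    "river-runoff": "-rof",
}


def diagnostics_command(components=(), serial=False):
    """Build an argument list for cupid-diagnostics (hypothetical helper).

    With no components, no component flags are added, so CUPiD runs
    every component listed in config.yml.
    """
    cmd = ["cupid-diagnostics"]
    if serial:
        cmd.append("--serial")
    cmd.extend(COMPONENT_FLAGS[name] for name in components)
    return cmd
```

For example, diagnostics_command(["ocean", "seaice"]) returns ['cupid-diagnostics', '-ocn', '-ice'], which matches the command line shown above and can be passed to subprocess.run.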

View output#

After the last step finishes, you can use Jupyter to view the generated notebooks in ${CUPID_ROOT}/examples/key_metrics/computed_notebooks, or open ${CUPID_ROOT}/examples/key_metrics/computed_notebooks/_build/html/index.html in a web browser. If you’re at NCAR, you may also want to check out the FastX visualization tool.
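The two output locations can also be resolved from a script. This is a minimal sketch under stated assumptions: the helper names output_locations and open_report are hypothetical, and the layout assumes computed_notebooks sits inside the example directory, which holds only when run_dir in config.yml points there.

```python
import webbrowser
from pathlib import Path


def output_locations(cupid_root, example):
    """Return (notebook directory, HTML index) for one CUPiD example."""
    notebooks = Path(cupid_root) / "examples" / example / "computed_notebooks"
    return notebooks, notebooks / "_build" / "html" / "index.html"


def open_report(index_html):
    """Open the built HTML index in a browser; return False if it is missing."""
    index_html = Path(index_html)
    if not index_html.is_file():
        return False
    return webbrowser.open(index_html.resolve().as_uri())


# Example call, using the directory name from the configuration step:
# import os
# notebooks, index = output_locations(os.environ["CUPID_ROOT"], "key_metrics")
# open_report(index)
```

A False result from open_report usually means cupid-webpage has not been run yet, or that run_dir points somewhere else.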

Clean computed notebook directory#

To clean the computed_notebooks folder generated by the cupid-diagnostics and cupid-webpage commands, run:

$ cupid-clean

This cleans the computed_notebooks folder at the location given by the run_dir variable in config.yml.