CUPiD#

The CESM Unified Postprocessing and Diagnostics (CUPiD) package is a new Python-based system for running post-processing routines and diagnostics across all CESM components with a common user and developer interface. Official documentation for CUPiD is available online.

This notebook is a chance to try out CUPiD on a CESM3 simulation, the version of the model that CUPiD targets. Please note for future reference that while we are using the JupyterHub interface in this tutorial, CUPiD can be run in a terminal just like a standard Python script, and it can be included as an automatic post-processing step for CESM3 simulations.

BEFORE BEGINNING THIS EXERCISE - Check that your kernel (upper right corner, above) is Bash. This should be the default kernel, but if it is not, click on that button and select Bash.

CUPiD is currently a command-line tool. This means that instead of running Python code directly, this notebook will run the Unix commands that CUPiD provides in order to generate the relevant diagnostics. To start, we need to clone CUPiD from GitHub:

# Delete old CUPiD directory if one exists:
if [ -d "CUPiD" ]; then
  rm -rf CUPiD
fi

#Clone CUPiD source code from GitHub repo:
git clone --recurse-submodules https://github.com/NCAR/CUPiD.git
cd CUPiD  # Need to enter CUPiD directory for remaining commands

This downloads the core CUPiD software as well as two additional diagnostics packages that CUPiD will use: the AMWG Diagnostics Framework (ADF), a command-line tool that can be used to generate CAM (and soon CLM) diagnostics, and mom6-tools, a Python package that can be used to analyze MOM6, the ocean model that will be used in CESM3 (which we’ll ignore for this tutorial).
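Because the two packages come in as git submodules, you can optionally confirm that they were fetched by listing the submodule paths the repository records. This is just a quick sketch; the exact paths are whatever CUPiD's .gitmodules file defines:

```shell
#Optionally list the submodule paths recorded by the clone; ADF and
#mom6-tools should appear among them (run from inside the CUPiD directory):
if [ -f .gitmodules ]; then
  git config --file .gitmodules --get-regexp path
fi
```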

Next we need to set up the proper Python environment using conda/mamba and activate the cupid-infrastructure environment. NOTE: You may see a red ": 1" appear in your notebook, but this can be safely ignored. Also note that if this is the first time you are running this cell, it could take a few minutes to install the conda environments.

#Load conda to your environment:
module load conda

#Install 'cupid-infrastructure' environment if it doesn't already exist:
if ! { conda env list | grep 'cupid-infrastructure'; } >/dev/null 2>&1; then
  mamba env create -f environments/cupid-infrastructure.yml
fi

#Install 'cupid-analysis' environment if it doesn't already exist:
if ! { conda env list | grep 'cupid-analysis'; } >/dev/null 2>&1; then
  mamba env create -f environments/cupid-analysis.yml
fi

#Activate CUPiD conda environment:
conda activate cupid-infrastructure
#NOTE: You may see a red ": 1" message below, but it can be ignored.

#Check that cupid-diagnostics can be accessed appropriately:
if ! which cupid-diagnostics >/dev/null 2>&1; then
  #If not then use pip to install:
  pip install -e .
fi

CUPiD is controlled via a config YAML file. Here we create a new directory and write the relevant config file for our tutorial simulation. Please note that if your tutorial simulations didn’t finish, you can use the provided simulations instead:

cd examples         #Go to the examples directory
if ! [ -d "cesm_tutorial" ]; then #Check if CESM tutorial directory already exists.
  mkdir cesm_tutorial #If not, then make a new directory to hold our config file
fi 
cd cesm_tutorial    #Go to newly made CESM tutorial example directory
cat << EOF > config.yml
################## SETUP ##################

################
# Data Sources #
################
data_sources:
    # run_dir is the path to the folder you want
    ### all the files associated with this configuration
    ### to be created in
    run_dir: .

    # nb_path_root is the path to the folder that cupid will
    ### look for your template notebooks in. It doesn't have to
    ### be inside run_dir, or be specific to this project, as
    ### long as the notebooks are there
    nb_path_root: ../../nblibrary

######################
# Computation Config #
######################

computation_config:

    # default_kernel_name is the name of the environment that
    ### the notebooks in this configuration will be run in by default.
    ### It must already be installed on your machine. You can also
    ### specify a different environment than the default for any
    ### notebook in NOTEBOOK CONFIG
    default_kernel_name: cupid-analysis

    # log level sets the level of how verbose logging will be.
    # options include: debug, info, warning, error
    log_level: 'info'

############# NOTEBOOK CONFIG #############

############################
# Notebooks and Parameters #
############################

# All parameters under global_params get passed to all the notebooks

global_params:
  case_name: 'b.e30_beta02.BLT1850.ne30_t232.104'
  base_case_name: 'b.e23_alpha17f.BLT1850.ne30_t232.092'
  CESM_output_dir: /glade/campaign/cesm/development/cross-wg/diagnostic_framework/CESM_output_for_testing
  start_date: '0001-01-01'
  end_date: '0101-01-01'
  base_start_date: '0001-01-01'
  base_end_date: '0101-01-01'
  obs_data_dir: '/glade/campaign/cesm/development/cross-wg/diagnostic_framework/CUPiD_obs_data'
  lc_kwargs:
    threads_per_worker: 1

timeseries:
  num_procs: 8
  ts_done: [False, False]
  overwrite_ts: [False, False]
  case_name: ['b.e30_beta02.BLT1850.ne30_t232.104', 'b.e23_alpha17f.BLT1850.ne30_t232.092']

  atm:
    vars: ['PSL']
    derive_vars: []
    hist_str: 'cam.h0a'
    start_years: [91,91]
    end_years: [100,100]
    level: 'lev'

  lnd:
    vars: ['SOILWATER_10CM','FSH_TO_COUPLER']
    derive_vars: []
    hist_str: 'h0'
    start_years: [91,91]
    end_years: [100,100]
    level: 'lev'

  ocn:
    vars: []
    derive_vars: []
    hist_str: 'h.z'
    start_years: [91,91]
    end_years: [100,100]
    level: 'lev'

  ice:
    vars: ['aice','hi','hs']
    derive_vars: []
    hist_str: 'cice.h'
    start_years: [91,91]
    end_years: [100,100]
    level: 'lev'

  glc:
    vars: []
    derive_vars: []
    hist_str: 'initial_hist'
    start_years: [91,91]
    end_years: [100,100]
    level: 'lev'

  rof:
    vars: []
    derive_vars: []
    hist_str: 'h0'
    start_years: [91,91]
    end_years: [100,100]
    level: 'lev'

compute_notebooks:

  # This is where all the notebooks you want run and their
  # parameters are specified. Several examples of different
  # types of notebooks are provided.

  # The first key (here infrastructure) is the name of the
  # notebook from nb_path_root, minus the .ipynb

    infrastructure:
      index:
        parameter_groups:
          none: {}

    atm:
      Global_PSL_NMSE_compare_obs_lens:
        parameter_groups:
          none:
            regridded_output: False # it looks like output is already on f09 grid, didn't need to regrid time-series file
            base_regridded_output: True
            validation_path: 'atm/analysis_datasets/fv0.9x1.25/seasonal_climatology/nmse_validation/PSL/'
      link_to_ADF:
        kernel_name: cupid-infrastructure
        parameter_groups:
          none:
            adf_root: ../../examples/key_metrics/ADF_output/
            key_plots: ["Surface_Wind_Stress_ANN_LatLon_Vector_Mean.png", "PRECT_ANN_LatLon_Mean.png", "PS_DJF_SHPolar_Mean.png", "TaylorDiag_ANN_Special_Mean.png"]
        external_tool:
          tool_name: 'ADF'
          vars: ['SST', 'TS', 'SWCF', 'LWCF', 'PRECT', 'TAUX', 'TAUY',  'TGCLDLWP']
          plotting_scripts: ["global_latlon_map", "global_latlon_vect_map"]
          analysis_scripts: ["amwg_table"]
          base_regridded_output: True

    glc:
      Greenland_SMB_visual_compare_obs:
        parameter_groups:
          none:
            obs_path: 'glc/analysis_datasets/multi_grid/annual_avg/SMB_data'
            obs_name: 'GrIS_MARv3.12_climo_1960_1999.nc'
            climo_nyears: 40

    rof:
      global_discharge_gauge_compare_obs:
        parameter_groups:
          none:
            analysis_name: ""
            grid_name: 'f09_f09_mosart' # ROF grid name
            climo_nyears: 10
            figureSave: False
      global_discharge_ocean_compare_obs:
        parameter_groups:
          none:
            analysis_name: ""

            grid_name: 'f09_f09_mosart' # ROF grid name
            climo_nyears: 10
            figureSave: False

    ice:
      Hemis_seaice_visual_compare_obs_lens:
        parameter_groups:
          none:
            climo_nyears: 35
            grid_file: '/glade/campaign/cesm/community/omwg/grids/tx2_3v2_grid.nc'
            path_model: '/glade/campaign/cesm/development/cross-wg/diagnostic_framework/CUPiD_model_data/ice/'


    lnd:
      Global_TerrestrialCouplingIndex_VisualCompareObs:
        parameter_groups:
          none:
            clmFile_h: '.h0.'
            fluxnet_comparison: True
            obsDir: 'lnd/analysis_datasets/ungridded/timeseries/FLUXNET2015/'

########### JUPYTER BOOK CONFIG ###########

##################################
# Jupyter Book Table of Contents #
##################################
book_toc:

  # See https://jupyterbook.org/en/stable/structure/configure.html for
  # complete documentation of Jupyter book construction options

  format: jb-book

  # All filenames are notebook filename without the .ipynb, similar to above

  root: infrastructure/index # root is the notebook that will be the homepage for the book
  parts:

    # Parts group notebooks into different sections in the Jupyter book
    # table of contents, so you can organize different parts of your project.
    # Each chapter is the name of one of the notebooks that you executed
    # in compute_notebooks above, also without .ipynb

    - caption: Atmosphere
      chapters:
        - file: atm/Global_PSL_NMSE_compare_obs_lens
        - file: atm/link_to_ADF

    # - caption: Ocean
    #   chapters:
    #       - file: ocn/ocean_surface

    - caption: Land
      chapters:
        - file: lnd/Global_TerrestrialCouplingIndex_VisualCompareObs

    - caption: Sea Ice
      chapters:
        - file: ice/Hemis_seaice_visual_compare_obs_lens

    - caption: Land Ice
      chapters:
        - file: glc/Greenland_SMB_visual_compare_obs

    - caption: River Runoff
      chapters:
        - file: rof/global_discharge_gauge_compare_obs
        - file: rof/global_discharge_ocean_compare_obs

#####################################
# Keys for Jupyter Book _config.yml #
#####################################
book_config_keys:

  title: CESM Key Metrics   # Title of your jupyter book

  # Other keys can be added here, see https://jupyterbook.org/en/stable/customize/config.html
  ### for many more options

EOF
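As an optional sanity check, you can grep the file you just wrote to confirm the heredoc expanded correctly:

```shell
#Confirm the config file was written and contains the case names:
if [ -f config.yml ]; then
  grep 'case_name' config.yml
fi
```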

Now we are ready to run CUPiD!

Running CUPiD diagnostics#

CUPiD is designed to run a collection of notebooks on a single set of model simulation data. The notebooks themselves can be found in CUPiD/nblibrary/<comp>, where <comp> is atm, lnd, ice, etc. You can run the notebooks with the following command: NOTE: This can take several minutes to run. Don’t worry if you see some parameter warnings under "DAG render with warnings".

cupid-diagnostics
# Sometimes users report that the conda environment was not found. If this happens, run the following lines and then continue:
conda activate cupid-analysis
python -m ipykernel install --user --name=cupid-analysis
conda activate cupid-infrastructure

Looking at Diagnostics#

The now-processed notebooks can be found in CUPiD/examples/cesm_tutorial/computed_notebooks/<comp>. Try opening them here in JupyterHub and see if you can determine which diagnostics they are calculating!
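For example, from the examples/cesm_tutorial directory you can list what was generated (a sketch; the component subdirectories you see depend on which notebooks ran):

```shell
#List the processed notebooks by component subdirectory:
if [ -d computed_notebooks ]; then
  ls computed_notebooks
fi
```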

Alternatively, CUPiD can combine the notebooks into a single website that can then be viewed in a browser. To generate the website, you’ll need to run the following command:

cupid-webpage

When this command finishes running, you will have a new directory: examples/cesm_tutorial/computed_notebooks/_build/html/.

The best way to view these files is to download them to your own machine and then view them in your browser. To download the directory, open a local terminal and run the following command:

scp -r <username>@casper.hpc.ucar.edu:~/CESM-Tutorial/notebooks/diagnostics/CUPiD/examples/cesm_tutorial/computed_notebooks/_build/html cupid_website

where <username> is your casper/derecho username. You will first be asked for your password and a Duo push; if successful, the command will download the entire html directory under the name cupid_website. Then just open the cupid_website/index.html file in your browser to see the generated website!
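If you have Python 3 on your local machine, an alternative to opening the file directly (a hypothetical convenience, not part of CUPiD) is to serve the downloaded directory with Python’s built-in web server and then browse to http://localhost:8000:

```shell
#Serve the downloaded directory locally (press Ctrl-C to stop the server):
if [ -d cupid_website ]; then
  python3 -m http.server 8000 --directory cupid_website
fi
```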