Overview: x4c as a CESM Diagnostic System

x4c provides a high-level diagnostic system for analyzing and visualizing CESM output. This tutorial walks through a few simple examples that give a quick look at the major features, including:

  • The Timeseries case system

  • The spell magics

  • Adding new diagnostic variables

More detailed explanations of each feature can be found in later sections.

Example Datasets: Zhu, F. & Zhu, J. Long simulations of the Miocene Climatic Optimum, DOI: 10.5065/3QFN-GN70 (2025). https://rda.ucar.edu/datasets/d010026/.

[1]:
%load_ext autoreload
%autoreload 2

import os
os.chdir('/glade/u/home/fengzhu/Github/x4c/docsrc/notebooks')
import x4c
print(x4c.__version__)
2025.10.13
[7]:
dirpath = '/glade/campaign/cesm/development/cross-wg/diagnostic_framework/x4c/timeseries/b.e30_beta06.B1850C_LTso.ne30_t232_wgx3.192.wrkflw.1_32'
case = x4c.Timeseries(
    dirpath,
    casename='b.e30_beta06.B1850C_LTso.ne30_t232_wgx3.192.wrkflw.1',
    grid_dict={'atm': 'ne30gp3'},
)
>>> case.root_dir: /glade/campaign/cesm/development/cross-wg/diagnostic_framework/x4c/timeseries/b.e30_beta06.B1850C_LTso.ne30_t232_wgx3.192.wrkflw.1_32
>>> case.path_pattern: comp/proc/tseries/*/casename.hstr.vn.timespan.nc
>>> case.grid_dict: {'atm': 'ne30gp3', 'ocn': 'g16', 'lnd': 'ne30gp3', 'rof': 'ne30gp3', 'ice': 'g16'}
>>> case.casename: b.e30_beta06.B1850C_LTso.ne30_t232_wgx3.192.wrkflw.1
>>> case.paths["atm"]["cam.h0a"] created
>>> case.paths["atm"]["cam.h2a"] created
>>> case.paths["atm"]["cam.h1a"] created
>>> case.paths["ocn"]["mom6.h.sfc"] created
>>> case.paths["ocn"]["mom6.h.z"] created
>>> case.paths["ocn"]["mom6"] created
>>> case.paths["ocn"]["mom6.h.rho2"] created
>>> case.paths["ocn"]["mom6.h.native"] created
>>> case.paths["lnd"]["clm2.h0"] created
>>> case.paths["rof"]["mosart.h0"] created
>>> case.paths["ice"]["cice.h"] created
>>> case.paths["ice"]["cice.h1"] created
>>> case.vns["atm"]["cam.h0a"] created
>>> case.vns["atm"]["cam.h2a"] created
>>> case.vns["atm"]["cam.h1a"] created
>>> case.vns["ocn"]["mom6.h.sfc"] created
>>> case.vns["ocn"]["mom6.h.z"] created
>>> case.vns["ocn"]["mom6"] created
>>> case.vns["ocn"]["mom6.h.rho2"] created
>>> case.vns["ocn"]["mom6.h.native"] created
>>> case.vns["lnd"]["clm2.h0"] created
>>> case.vns["rof"]["mosart.h0"] created
>>> case.vns["ice"]["cice.h"] created
>>> case.vns["ice"]["cice.h1"] created
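
The `case.path_pattern` printed above suggests file names of the form `casename.hstr.vn.timespan.nc`. As a rough illustration of that naming scheme, the sketch below splits such a name back into its parts. The helper `split_tseries_name`, the example file names, the timespan format, and the variable name `tos` are all assumptions made for illustration; none of them are part of x4c.

```python
def split_tseries_name(filename, casename):
    """Split 'casename.hstr.vn.timespan.nc' into (hstr, vn, timespan).

    Hypothetical helper: the casename must be known in advance because both
    the casename and the h-string may themselves contain dots.
    """
    prefix = casename + '.'
    if not filename.startswith(prefix) or not filename.endswith('.nc'):
        raise ValueError(f'Unexpected file name: {filename}')
    # Drop the leading 'casename.' and the trailing '.nc'
    core = filename[len(prefix):-len('.nc')]
    # The last two dot-separated fields are vn and timespan; the rest is hstr
    hstr, vn, timespan = core.rsplit('.', 2)
    return hstr, vn, timespan


casename = 'b.e30_beta06.B1850C_LTso.ne30_t232_wgx3.192.wrkflw.1'
# Assumed file names, built only to match the printed pattern
atm_parts = split_tseries_name(casename + '.cam.h0a.TS.000101-001012.nc', casename)
ocn_parts = split_tseries_name(casename + '.mom6.h.sfc.tos.000101-001012.nc', casename)
print(atm_parts)
print(ocn_parts)
```

Splitting from the right keeps multi-part h-strings such as `mom6.h.sfc` intact.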

Note that the same variable name can exist under multiple history strings (h-strings), in which case loading it by name alone is ambiguous:

[8]:
case.load('TS')
case.ds['TS']
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[8], line 1
----> 1 case.load('TS')
      2 case.ds['TS']

File ~/Github/x4c/x4c/case.py:742, in Timeseries.load(self, vn, vtype, comp, hstr, timespan, load_idx, verbose, reload, **kws)
    740 else:
    741     if comp is None or hstr is None:
--> 742         raise ValueError(f'The input variable name belongs to multiple (comp, hstr) pairs: {found_comp_hstr}. Please specify via the argument `comp` and `hstr`.')
    744 if timespan is not None and not isinstance(timespan[0], str) and not isinstance(timespan[-1], str):
    745     timespan = utils.timespan_int2str(timespan)

ValueError: The input variable name belongs to multiple (comp, hstr) pairs: [('atm', 'cam.h0a'), ('atm', 'cam.h1a')]. Please specify via the argument `comp` and `hstr`.

To resolve the ambiguity, specify the component and h-string explicitly:

[9]:
case.load('TS', comp='atm', hstr='cam.h0a')
case.ds['TS']
>>> case.ds["TS"] created
[9]:
<xarray.Dataset> Size: 25MB
Dimensions:       (time: 120, ncol: 48600, ilev: 59, lev: 58, nbnd: 2,
                   trop_cld_lev: 58, trop_pref: 58, trop_prefi: 59)
Coordinates:
  * ilev          (ilev) float64 472B 2.055 3.98 6.909 ... 987.4 995.1 1e+03
  * lev           (lev) float64 464B 3.018 5.445 9.087 ... 983.2 991.2 997.5
  * time          (time) object 960B 0001-01-16 12:00:00 ... 0010-12-16 12:00:00
  * trop_cld_lev  (trop_cld_lev) float64 464B 3.018 5.445 9.087 ... 991.2 997.5
  * trop_pref     (trop_pref) float64 464B 3.018 5.445 9.087 ... 991.2 997.5
  * trop_prefi    (trop_prefi) float64 472B 2.055 3.98 6.909 ... 995.1 1e+03
Dimensions without coordinates: ncol, nbnd
Data variables: (12/19)
    TS            (time, ncol) float32 23MB ...
    area          (ncol) float64 389kB ...
    areawt        (ncol) float64 389kB ...
    date          (time) int32 480B ...
    date_written  (time) |S8 960B ...
    datesec       (time) int32 480B ...
    ...            ...
    nbdate        int32 4B ...
    nbsec         int32 4B ...
    ndbase        int32 4B ...
    nsbase        int32 4B ...
    time_bounds   (time, nbnd) object 2kB ...
    time_written  (time) |S8 960B ...
Attributes: (12/19)
    ne:                30
    fv_nphys:          3
    Conventions:       CF-1.0
    source:            CAM
    case:              b.e30_beta06.B1850C_LTso.ne30_t232_wgx3.192.wrkflw.1
    logname:           cmip7
    ...                ...
    comp:              atm
    hstr:              cam.h0a
    grid:              ne30gp3
    gw:                <xarray.DataArray 'area' (ncol: 48600)> Size: 389kB\n[...
    lat:               <xarray.DataArray 'lat' (ncol: 48600)> Size: 389kB\n[4...
    lon:               <xarray.DataArray 'lon' (ncol: 48600)> Size: 389kB\n[4...
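
Before calling `load`, it can help to check which (comp, hstr) pairs contain a given variable. The sketch below assumes a nested mapping shaped like `case.vns` (component, then h-string, then a list of variable names). The dictionary contents and the helper `find_comp_hstr` are hypothetical and only illustrate the lookup; they are not x4c's implementation.

```python
def find_comp_hstr(vns, vn):
    """Return every (comp, hstr) pair whose variable list contains vn."""
    return [
        (comp, hstr)
        for comp, by_hstr in vns.items()
        for hstr, names in by_hstr.items()
        if vn in names
    ]


# Hypothetical stand-in for case.vns; the variable lists are made up
vns = {
    'atm': {
        'cam.h0a': ['TS', 'LANDFRAC', 'PRECT'],
        'cam.h1a': ['TS', 'PSL'],
    },
    'lnd': {
        'clm2.h0': ['TSOI'],
    },
}

pairs = find_comp_hstr(vns, 'TS')
print(pairs)  # [('atm', 'cam.h0a'), ('atm', 'cam.h1a')]
```

A result with more than one pair corresponds to the situation in which `comp` and `hstr` have to be passed to `load`.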

Derived variables are loaded the same way, by specifying the component and h-string:

[11]:
case.load('LST', comp='atm', hstr='cam.h0a')
case.ds['LST']
>>> LST is a supported derived variable.
>>> case.ds["TS"] already loaded; to reload, run case.load("TS", ..., reload=True).
>>> case.ds["LANDFRAC"] created
>>> case.ds["LST"] created
[11]:
<xarray.DataArray 'LST' (time: 120, ncol: 48600)> Size: 23MB
array([[      nan,       nan,       nan, ..., 281.56094,       nan,
              nan],
       [      nan,       nan,       nan, ..., 281.35333,       nan,
              nan],
       [      nan,       nan,       nan, ..., 282.2262 ,       nan,
              nan],
       ...,
       [      nan,       nan,       nan, ..., 295.93307,       nan,
              nan],
       [      nan,       nan,       nan, ..., 289.19916,       nan,
              nan],
       [      nan,       nan,       nan, ..., 283.95468,       nan,
              nan]], shape=(120, 48600), dtype=float32)
Coordinates:
  * time     (time) object 960B 0001-01-16 12:00:00 ... 0010-12-16 12:00:00
    lat      (ncol) float64 389kB -35.03 -35.48 -35.92 ... 36.66 36.2 35.74
    lon      (ncol) float64 389kB 315.5 316.5 317.5 315.5 ... 137.0 136.0 135.0
Dimensions without coordinates: ncol
Attributes:
    units:         K
    long_name:     Land Surface Temperature
    cell_methods:  time: mean
    path:          /glade/campaign/cesm/development/cross-wg/diagnostic_frame...
    gw:            <xarray.DataArray 'area' (ncol: 48600)> Size: 389kB\narray...
    lat:           <xarray.DataArray 'lat' (ncol: 48600)> Size: 389kB\n[48600...
    lon:           <xarray.DataArray 'lon' (ncol: 48600)> Size: 389kB\n[48600...
    comp:          atm
    grid:          ne30gp3
    vn:            LST
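
The log above shows that loading `LST` pulls in `TS` and `LANDFRAC`, and the result is NaN at many columns. One plausible way to build such a field is to mask surface temperature by land fraction. The NumPy sketch below is a guess at the idea, not x4c's actual definition: the helper `mask_to_land`, the 0.5 threshold, and the sample values are all assumptions.

```python
import numpy as np


def mask_to_land(ts, landfrac, threshold=0.5):
    """Keep ts where landfrac exceeds threshold; set other columns to NaN.

    The threshold is an assumed value chosen for illustration.
    """
    ts = np.asarray(ts, dtype=float)
    landfrac = np.asarray(landfrac, dtype=float)
    # landfrac (ncol,) broadcasts against ts (time, ncol)
    return np.where(landfrac > threshold, ts, np.nan)


# Made-up sample: 2 time steps, 3 columns (land, ocean, mostly ocean)
ts = np.array([[281.5, 290.0, 275.2],
               [282.0, 291.1, 276.0]])
landfrac = np.array([1.0, 0.0, 0.3])

lst = mask_to_land(ts, landfrac)
print(lst)
```

Only the first column survives the mask here; the other two become NaN at every time step.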