API Reference

Core Features

.. py:module:: x4c.core

.. py:function:: load_dataset(path, shift_time=False, comp=None, hstr=None, grid=None, vn=None, **kws)
  :module: x4c.core

  Load a netCDF file and form an `xarray.Dataset`.

  :param path: path to the netCDF file
  :type path: str
  :param shift_time: shift the time of the `xarray.Dataset` (the CESM1 output has a time shift)
  :type shift_time: bool
  :param comp: the tag for the CESM component, one of "atm", "ocn", "lnd", "ice", and "rof"
  :type comp: str
  :param grid: the grid tag for the CESM output (e.g., ne16, g16)
  :type grid: str
  :param vn: variable name
  :type vn: str

.. py:function:: open_mfdataset(paths, shift_time=False, comp=None, hstr=None, grid=None, vn=None, **kws)
  :module: x4c.core

  Open multiple netCDF files lazily and form an `xarray.Dataset`.

  :param paths: paths to the netCDF files
  :type paths: str or list of str
  :param shift_time: shift the time of the `xarray.Dataset` (the default CESM output has a time shift)
  :type shift_time: bool
  :param comp: the tag for the CESM component, one of "atm", "ocn", "lnd", "ice", and "rof"
  :type comp: str
  :param grid: the grid tag for the CESM output (e.g., ne16, g16)
  :type grid: str
  :param vn: variable name
  :type vn: str

.. py:class:: XDataset(ds=None)
  :module: x4c.core

.. py:method:: XDataset.annualize(months=None, days_weighted=False, time2year=False)
  :module: x4c.core

  Annualize/seasonalize an `xarray.Dataset`

  :param months: a list of integers to represent month combinations,
                 e.g., `None` means calendar year annualization, [6,7,8] means JJA annualization, and [-12,1,2] means DJF annualization
  :type months: list of int
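
The month-list convention can be illustrated with a plain-Python sketch. It assumes that a negative month refers to that month in the previous calendar year, so `[-12, 1, 2]` pairs December of year Y-1 with January and February of year Y. That reading, the helper name `seasonal_means`, and the sample values are assumptions made for illustration; this is not the library's implementation.

```python
from statistics import mean

def seasonal_means(monthly, months=None):
    """Average monthly values per year over the selected months.

    monthly: dict mapping (year, month) -> value.
    months: list of ints; a negative month is taken from the previous year
            (an assumption of this sketch). None means all 12 months.
    """
    if months is None:
        months = list(range(1, 13))
    years = sorted({year for year, _ in monthly})
    out = {}
    for year in years:
        keys = [(year - 1, -m) if m < 0 else (year, m) for m in months]
        # Skip years whose season is incomplete (e.g., the first DJF).
        if all(key in monthly for key in keys):
            out[year] = mean(monthly[key] for key in keys)
    return out

# Two years of synthetic monthly data: value = month + 100 * (year - 2000).
monthly = {(y, m): float(m + 100 * (y - 2000))
           for y in (2000, 2001) for m in range(1, 13)}

jja = seasonal_means(monthly, [6, 7, 8])
djf = seasonal_means(monthly, [-12, 1, 2])
```

Under this assumption the first year has no DJF value, because its December would come from a year that is not in the data.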

.. py:property:: XDataset.anom
  :module: x4c.core

  Compute monthly anomalies relative to the climatology.

  This property subtracts the monthly climatology (from
  `XDataset.climo`) from the dataset to produce anomalies for each
  time step. The climatology is aligned by month before subtraction so
  that, e.g., all Januaries are compared against the January climatology.

  :returns: dataset of anomalies with the same coordinates as
            the original dataset.
  :rtype: xarray.Dataset

.. py:property:: XDataset.climo
  :module: x4c.core

  Compute the climatology (monthly mean) of the dataset.

  This property groups the dataset by calendar month and computes the
  mean over the `time` dimension for each month. It also records the
  `climo_period` as a tuple (start_year, end_year) in the returned
  dataset's attributes and preserves `comp`/`grid` attributes when
  present. If the grouping result uses a `month` coordinate it is
  renamed to `time` to keep downstream interfaces consistent.

  :returns: monthly climatology where the `time` coordinate
            indexes months (1-12). `ds.attrs['climo_period']` documents the
            original temporal coverage used to compute the climatology.
  :rtype: xarray.Dataset
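
The relationship between the climatology and the anomalies can be sketched in plain Python, without xarray. The helper names `monthly_climatology` and `monthly_anomalies` and the sample series are hypothetical; the sketch only mirrors the steps described for `climo` and `anom` (group by month, average, record the covered years, subtract month by month).

```python
from collections import defaultdict
from statistics import mean

def monthly_climatology(series):
    """series: list of (year, month, value) tuples.

    Returns ({month: mean value}, (start_year, end_year)).
    """
    by_month = defaultdict(list)
    for _, month, value in series:
        by_month[month].append(value)
    climo = {month: mean(values) for month, values in sorted(by_month.items())}
    years = [year for year, _, _ in series]
    return climo, (min(years), max(years))

def monthly_anomalies(series, climo):
    """Subtract the climatology of the matching month from every time step."""
    return [(year, month, value - climo[month]) for year, month, value in series]

series = [(2000, 1, 1.0), (2000, 2, 4.0), (2001, 1, 3.0), (2001, 2, 8.0)]
climo, climo_period = monthly_climatology(series)
anom = monthly_anomalies(series, climo)
```

Each January is compared against the January mean only, which is the alignment by month that the `anom` description refers to.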

.. py:property:: XDataset.da
  :module: x4c.core

  Return its `xarray.DataArray` version.

.. py:method:: XDataset.get_plev(ps, vn=None, lev_mode='hybrid', **kws)
  :module: x4c.core

  Interpolate a hybrid-level field to pressure levels and return a Dataset.

  This method converts a 3D atmospheric variable that is on hybrid model
  levels (a/k/a k-levels) into pressure levels using the provided surface
  pressure `ps` (either an `xarray.DataArray` or an `xarray.Dataset` that
  contains a variable named "PS"). It wraps
  `geocat.comp.interpolation.interp_hybrid_to_pressure` and returns a
  copy of the original `Dataset` with the requested variable replaced by
  its pressure-level version.

  :param ps: surface pressure. If a
             `Dataset` is passed the method will look for the variable
             named "PS". Dimensions must align with the variable being
             interpolated.
  :type ps: xarray.DataArray or xarray.Dataset
  :param vn: variable name in `self.ds` to interpolate. If
             not provided the method will use the dataset attribute
             `ds.attrs['vn']` and `self.da`.
  :type vn: str, optional
  :param lev_mode: currently only supports "hybrid".
                   (Reserved for future expansion.)
  :type lev_mode: str, optional
  :param \*\*kws: additional keyword arguments forwarded to
                  `geocat.comp.interpolation.interp_hybrid_to_pressure`.
                  By default `lev_dim` is set to `'lev'`. If the dataset
                  contains `hyam`/`hybm` arrays they will be passed automatically.

  :returns: a copy of `self.ds` with `vn` replaced by the
            pressure-level `DataArray` produced by the interpolation.
  :rtype: xarray.Dataset

  .. rubric:: Notes

  - Requires `geocat.comp` to be available and the dataset to include
    the hybrid coefficients (`hyam`, `hybm`) when using hybrid
    vertical coordinates.
  - The returned dataset preserves the original dataset attributes
    and coordinate structure except that the specified variable is
    now on pressure levels.
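
The idea behind hybrid-to-pressure interpolation can be sketched for a single column in plain Python. The sketch assumes the common hybrid-sigma relation `p = hyam * P0 + hybm * PS` with a reference pressure `P0` of 100000 Pa and linear interpolation in pressure; the function name and sample numbers are hypothetical, and the actual work in this method is done by `geocat.comp`, not by code like this.

```python
P0 = 100000.0  # reference pressure in Pa (assumed value for this sketch)

def hybrid_to_pressure(values, hyam, hybm, ps, new_levels, p0=P0):
    """Interpolate one column from hybrid levels to pressure levels (Pa).

    Levels are ordered top to bottom, so mid-level pressure increases with
    the index. Targets outside the column's pressure range become NaN.
    """
    # Pressure at each hybrid mid-level for this column.
    pressure = [a * p0 + b * ps for a, b in zip(hyam, hybm)]
    out = []
    for target in new_levels:
        result = float("nan")
        for k in range(len(pressure) - 1):
            p_lo, p_hi = pressure[k], pressure[k + 1]
            if p_lo <= target <= p_hi:
                weight = (target - p_lo) / (p_hi - p_lo)
                result = values[k] + weight * (values[k + 1] - values[k])
                break
        out.append(result)
    return out

hyam = [0.1, 0.05, 0.0]
hybm = [0.0, 0.45, 1.0]
temperature = [220.0, 250.0, 290.0]  # synthetic values on the hybrid levels
plev = hybrid_to_pressure(temperature, hyam, hybm, ps=100000.0,
                          new_levels=[30000.0, 75000.0, 5000.0])
```

The last target lies above the top model level, so the sketch returns NaN for it instead of extrapolating.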

.. py:method:: XDataset.regrid(dlon=1, dlat=1, weight_file=None, gs='T', method='bilinear', periodic=True)
  :module: x4c.core

  Regrid the CESM output to a regular lat/lon grid.

  Supported atmosphere regridding: ne16np4, ne16pg3, ne30np4, ne30pg3, ne120np4, ne120pg4 TO 1x1d / 2x2d.
  Supported ocean regridding: any grid similar to g16 TO 1x1d / 2x2d.
  For any other regridding, `weight_file` must be provided by the user.

  For atmosphere grids, the default method is area-weighted;
  for ocean grids, the default is bilinear.

  :param dlon: longitude spacing
  :type dlon: float
  :param dlat: latitude spacing
  :type dlat: float
  :param weight_file: the path to an ESMF-generated weighting file for regridding
  :type weight_file: str
  :param gs: grid style in 'T' or 'U' for the ocean grid
  :type gs: str
  :param method: regridding method for the ocean grid
  :type method: str
  :param periodic: whether to assume the data are periodic when regridding
  :type periodic: bool

.. py:method:: XDataset.zavg(depth_top, depth_bot, vn=None)
  :module: x4c.core

  Vertically average an ocean/column field between two depths and return a Dataset.

  The method selects the vertical range along the `z_t` coordinate from
  `depth_top` to `depth_bot`, applies area/volume weights provided by the
  dataset variable `dz`, computes the weighted mean over the vertical
  dimension, and returns a copy of the original `Dataset` with the
  specified variable replaced by its vertically averaged version.

  :param depth_top: upper bound of the vertical slice (same units as `z_t`).
  :type depth_top: float
  :param depth_bot: lower bound of the vertical slice (same units as `z_t`).
  :type depth_bot: float
  :param vn: variable name in `self.ds` to average. If not
             provided the method will use the dataset attribute
             `ds.attrs['vn']` and `self.da`.
  :type vn: str, optional

  :returns: a copy of `self.ds` with `vn` replaced by the
            vertically averaged `DataArray`.
  :rtype: xarray.Dataset

  .. rubric:: Notes

  - This method expects a vertical coordinate named `z_t` and a
    thickness/weight variable named `dz` in the dataset. The
    weighting is `dz` (e.g., layer thickness) and the mean is taken
    over the `z_t` dimension.
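
The thickness-weighted average can be written out for a single column as follows. This is a plain-Python sketch: the function name, the inclusive depth bounds, and the sample profile are assumptions, and partial layers at the boundaries are not treated specially.

```python
def zavg_column(values, z_t, dz, depth_top, depth_bot):
    """dz-weighted mean of `values` over levels with depth_top <= z_t <= depth_bot."""
    selected = [(value, weight)
                for value, depth, weight in zip(values, z_t, dz)
                if depth_top <= depth <= depth_bot]
    if not selected:
        raise ValueError("no levels fall inside the requested depth range")
    total_weight = sum(weight for _, weight in selected)
    return sum(value * weight for value, weight in selected) / total_weight

z_t = [5.0, 15.0, 30.0, 60.0]     # layer mid-depths
dz = [10.0, 10.0, 20.0, 40.0]     # layer thicknesses, same units as z_t
temp = [20.0, 18.0, 15.0, 10.0]   # synthetic profile

upper = zavg_column(temp, z_t, dz, 0.0, 30.0)
```

Thicker layers contribute more to the result, so the weighted mean here is lower than the plain arithmetic mean of the three selected values.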

.. py:class:: XDataArray(da=None)
  :module: x4c.core

.. py:method:: XDataArray.annualize(months=None, days_weighted=False)
  :module: x4c.core

  Annualize/seasonalize an `xarray.DataArray`

  :param months: a list of integers to represent month combinations,
                 e.g., [6,7,8] means JJA annualization, and [-12,1,2] means DJF annualization
  :type months: list of int

.. py:property:: XDataArray.ds
  :module: x4c.core

  Return its `xarray.Dataset` version.

.. py:method:: XDataArray.eof(n=4, weight=True)
  :module: x4c.core

  Perform EOF analysis

.. py:method:: XDataArray.geo_mean(ind=None, latlon_range=(-90, 90, 0, 360), **kws)
  :module: x4c.core

  Calculate the geospatial-weighted (latitude or area) mean over a specified region or climate index.

  :param ind: Climate index name. Supported indices include:

              - 'nino3.4': Niño 3.4 region
              - 'nino1+2': Niño 1+2 region
              - 'nino3': Niño 3 region
              - 'nino4': Niño 4 region
              - 'wpi': Western Pacific Index
              - 'tpi': Tri-Pole Index
              - 'dmi': Dipole Mode Index (Indian Ocean)
              - 'iobw': Indian Ocean Basin-Wide Index

              If None, uses latlon_range instead. Default is None.
  :type ind: str, optional
  :param latlon_range: Latitude and longitude range for computing the mean in the format
                       (lat_min, lat_max, lon_min, lon_max). Default is (-90, 90, 0, 360).
  :type latlon_range: tuple or list, optional
  :param \*\*kws: Additional keyword arguments passed to utils.geo_mean().
  :type \*\*kws: dict

  :returns: Latitude-weighted mean values over the specified region or index.
            Attributes from the original data are preserved. Time coordinate
            long_name is updated to 'Model Year' if applicable.
  :rtype: xarray.DataArray

  :raises ValueError: If ind is not one of the supported climate index names.
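
A latitude-weighted regional mean can be sketched in plain Python. The sketch assumes cos(latitude) weights on a regular grid and uses the commonly cited Niño 3.4 box (5°S to 5°N, 190°E to 240°E); the function name and sample field are hypothetical, and the bounds the library uses for each index are not stated in this reference.

```python
import math

def box_mean(field, lats, lons, latlon_range=(-90, 90, 0, 360)):
    """cos(latitude)-weighted mean of field[lat][lon] inside a lat/lon box."""
    lat_min, lat_max, lon_min, lon_max = latlon_range
    num = den = 0.0
    for i, lat in enumerate(lats):
        if not lat_min <= lat <= lat_max:
            continue
        weight = math.cos(math.radians(lat))
        for j, lon in enumerate(lons):
            # NaN cells (e.g., land points in an ocean field) are skipped.
            if lon_min <= lon <= lon_max and not math.isnan(field[i][j]):
                num += weight * field[i][j]
                den += weight
    return num / den

lats = [-60.0, 0.0, 60.0]
lons = [0.0, 100.0, 200.0, 300.0]
field = [[1.0] * 4, [2.0] * 4, [6.0] * 4]  # synthetic values, one per latitude row

global_mean = box_mean(field, lats, lons)
nino34_mean = box_mean(field, lats, lons, (-5, 5, 190, 240))
```

The global result differs from the unweighted mean of the rows because the high-latitude rows carry half the weight of the equatorial row.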

.. py:method:: XDataArray.get_plev(**kws)
  :module: x4c.core

  See: https://geocat-comp.readthedocs.io/en/v2024.04.0/user_api/generated/geocat.comp.interpolation.interp_hybrid_to_pressure.html

.. py:property:: XDataArray.gm
  :module: x4c.core

  the global area-weighted mean

.. py:property:: XDataArray.gs
  :module: x4c.core

  the global area-weighted sum

.. py:method:: XDataArray.nearest2d(lat=None, lon=None, lat_coord='lat', lon_coord='lon', lat_dim='lat', lon_dim='lon')
  :module: x4c.core

  Select the nearest non-NaN grid point(s) for the given lat/lon targets.

  Given one or more target `lat`/`lon` pairs, this method finds the
  nearest valid (non-NaN across non-spatial dims) grid cell in the
  DataArray and returns a concatenated `DataArray` with a new dimension
  `site` indexing the selected points.

  :param lat: target latitude(s).
  :type lat: float or array-like
  :param lon: target longitude(s).
  :type lon: float or array-like
  :param lat_coord: name of latitude coordinate in the DataArray.
  :type lat_coord: str
  :param lon_coord: name of longitude coordinate in the DataArray.
  :type lon_coord: str
  :param lat_dim: latitude dimension name.
  :type lat_dim: str
  :param lon_dim: longitude dimension name.
  :type lon_dim: str

  :returns: concatenated selections at nearest grid points
            with a new `site` coordinate.
  :rtype: xarray.DataArray
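
The nearest-valid-point search can be sketched for a small regular grid in plain Python. Distance is measured in degree space with longitude wrap-around, which is a simplification chosen for this sketch; the function name and sample grid are hypothetical, and the distance metric used by the library is not stated in this reference.

```python
import math

def nearest_valid(field, lats, lons, lat, lon):
    """Return the (i, j) index of the closest grid cell whose value is not NaN."""
    best, best_dist = None, float("inf")
    for i, grid_lat in enumerate(lats):
        for j, grid_lon in enumerate(lons):
            if math.isnan(field[i][j]):
                continue
            dlon = abs(grid_lon - lon) % 360
            dlon = min(dlon, 360 - dlon)  # shortest way around the globe
            dist = (grid_lat - lat) ** 2 + dlon ** 2
            if dist < best_dist:
                best, best_dist = (i, j), dist
    return best

nan = float("nan")
lats = [0.0, 10.0]
lons = [0.0, 10.0, 350.0]
field = [[nan, 1.0, 2.0],
         [3.0, 4.0, 5.0]]

# One result per target, playing the role of the `site` dimension.
targets = [(2.0, 1.0), (0.0, 355.0)]
sites = [nearest_valid(field, lats, lons, lat, lon) for lat, lon in targets]
```

For both targets the geometrically closest cell is the NaN cell at index (0, 0), so the search falls back to the next closest valid cell.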

.. py:property:: XDataArray.nhm
  :module: x4c.core

  the NH area-weighted mean

.. py:property:: XDataArray.nhs
  :module: x4c.core

  the NH area-weighted sum

.. py:method:: XDataArray.plot(title=None, figsize=None, ax=None, latlon_range=None, add_clabels=False, clevels=None, clabel_kwargs=None, projection='Robinson', transform='PlateCarree', central_longitude=180, proj_args=None, bad_color='dimgray', add_gridlines=False, gridline_labels=True, gridline_style='--', ssv=None, log=False, vmin=None, vmax=None, coastline_zorder=99, coastline_width=1, site_markersizes=100, df_sites=None, colname_dict=None, gs='T', ux=False, site_marker_dict=None, site_color_dict=None, count_site_num=False, lgd_kws=None, legend=True, return_im=False, **kws)
  :module: x4c.core

  The plotting functionality

  :param title: figure title
  :type title: str
  :param figsize: figure size in format of (w, h)
  :type figsize: tuple or list
  :param ax: a `matplotlib.axes`
  :type ax: `matplotlib.axes`
  :param latlon_range: lat/lon range in format of (lat_min, lat_max, lon_min, lon_max)
  :type latlon_range: tuple or list
  :param projection: a projection name supported by `Cartopy`
  :type projection: str
  :param transform: a projection name supported by `Cartopy`
  :type transform: str
  :param central_longitude: the central longitude of the map to plot
  :type central_longitude: float
  :param proj_args: other keyword arguments for projection
  :type proj_args: dict
  :param add_gridlines: if True, add gridlines to the map
  :type add_gridlines: bool
  :param gridline_labels: if True, the lat/lon ticklabels will appear
  :type gridline_labels: bool
  :param gridline_style: the gridline style, e.g., '-', '--'
  :type gridline_style: str
  :param ssv: a sea surface variable used for plotting the coastlines
  :type ssv: `xarray.DataArray`
  :param gs: grid style in 'T' or 'U' for the ocean grid
  :type gs: str
  :param coastline_zorder: the layer order for the coastlines
  :type coastline_zorder: int
  :param coastline_width: the width of the coastlines
  :type coastline_width: float
  :param df_sites: a `pandas.DataFrame` that stores the information of a collection of sites
  :type df_sites: `pandas.DataFrame`
  :param colname_dict: a dictionary of column names for `df_sites` in the "key:value" format "assumed name:real name"
  :type colname_dict: dict

.. py:method:: XDataArray.regrid(**kws)
  :module: x4c.core

  Regrid this DataArray by delegating to the parent Dataset regrid.

  This wraps `XDataset.regrid` by converting the `DataArray` to a
  temporary `Dataset`, calling the dataset-level regrid helper, then
  extracting and returning the regridded `DataArray`. Any dataset-level
  `lat`/`lon` attributes added during the transformation are removed from
  the returned `DataArray` attributes for cleanliness.

  **Forwarded kwargs** are the same as `XDataset.regrid` (e.g., `dlon`,
  `dlat`, `weight_file`, `gs`, `method`, `periodic`).

.. py:property:: XDataArray.shm
  :module: x4c.core

  the SH area-weighted mean

.. py:property:: XDataArray.shs
  :module: x4c.core

  the SH area-weighted sum

.. py:property:: XDataArray.somin
  :module: x4c.core

  the Southern Ocean min

.. py:property:: XDataArray.zm
  :module: x4c.core

  the zonal mean

CESM Postprocessing

.. py:class:: History(root_dir, comps=['atm', 'ocn', 'lnd', 'ice', 'rof'], comps_info=None, casename=None, path_pattern='comp/hist/casename.hstr.date.nc', avoid_list=None)
  :module: x4c.case

  Handle CESM history files for a single case.

  Provides utilities to discover history file paths, list time-series variables, split (isolate) variables into separate files, and re-merge them across time ranges. Designed to work with NCO tools and MPI for parallel operations.

.. py:method:: History.bigbang(comp, hstr, output_dirpath, timespan=None, overwrite=True, nproc=1, vns=None)
  :module: x4c.case

  Split history files into per-variable files in parallel using MPI.

  Each MPI rank handles a subset of (file,variable) tasks.
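
One way to divide (file, variable) tasks among ranks is a round-robin split, sketched below without MPI so that it runs anywhere. The round-robin scheme, the function name, and the file names are assumptions for illustration; how `bigbang` actually partitions its work is not described in this reference.

```python
from itertools import product

def tasks_for_rank(files, variables, rank, nproc):
    """Round-robin assignment of (file, variable) tasks to one rank."""
    tasks = list(product(files, variables))
    # Rank r takes tasks r, r + nproc, r + 2 * nproc, ...
    return tasks[rank::nproc]

files = ["h0.0001-01.nc", "h0.0001-02.nc"]  # hypothetical history files
variables = ["TS", "PRECT", "PSL"]

nproc = 4
assignments = [tasks_for_rank(files, variables, rank, nproc) for rank in range(nproc)]
```

In an MPI program, `rank` and `nproc` would come from the communicator (for example `comm.Get_rank()` and `comm.Get_size()` in mpi4py) rather than from a loop.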

.. py:method:: History.bigcrunch(comp, hstr, input_dirpath, output_dirpath, timespan=None, overwrite=True, nproc=1, compression=1, vns=None)
  :module: x4c.case

  Merge per-variable files back into timeseries files in parallel.

  Coordinates work across MPI ranks similar to `bigbang`.

.. py:method:: History.gen_ts(output_dirpath, staging_dirpath=None, comps=['atm', 'ocn', 'lnd', 'ice', 'rof'], timespan=None, timestep=None, timestep_unit='year', dir_structure='comp/proc/tseries/hstr', overwrite=True, nproc=1, compression=1)
  :module: x4c.case

  Generate timeseries files for selected components and timespans.

  This orchestrates splitting (`bigbang`) and merging
  (`bigcrunch`) stages and moves results from staging to final
  output directories.

.. py:method:: History.get_hstr_based_on_vn(vn)
  :module: x4c.case

  Return the first hstr that contains variable `vn`.

  This searches across all components and hstrs and returns the
  matching hstr string or None if not found.

.. py:method:: History.get_paths(comp, hstr, timespan=None)
  :module: x4c.case

  Return history file paths for a component/hstr optionally
  filtered by a timespan.

  timespan may be provided in a variety of formats accepted by
  utils.parse_timespan.
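
Filtering history files by a timespan can be sketched as follows. The sketch assumes file names that end in a `YYYY-MM` date stamp (optionally followed by day and seconds), a timespan given as an inclusive `(start_year, end_year)` pair, and a hypothetical case name; it does not reproduce the formats accepted by `utils.parse_timespan`.

```python
import re

# Date stamp at the end of a history file name, e.g. ".0001-12.nc".
DATE_RE = re.compile(r"\.(\d{4})-(\d{2})(?:-\d{2})?(?:-\d{5})?\.nc$")

def filter_by_timespan(paths, timespan=None):
    """Keep paths whose year lies in [start_year, end_year]; None keeps all."""
    if timespan is None:
        return sorted(paths)
    start_year, end_year = timespan
    kept = []
    for path in paths:
        match = DATE_RE.search(path)
        if match and start_year <= int(match.group(1)) <= end_year:
            kept.append(path)
    return sorted(kept)

paths = [
    "atm/hist/mycase.cam.h0.0001-12.nc",
    "atm/hist/mycase.cam.h0.0002-01.nc",
    "atm/hist/mycase.cam.h0.0003-01.nc",
]
selected = filter_by_timespan(paths, (2, 3))
```

Files whose names do not carry a recognizable date stamp are dropped when a timespan is given, which is a choice made for this sketch.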

.. py:method:: History.get_ts_vns(comp, hstr, exclude_vars=['time', 'time_bnds', 'time_bounds', 'time_bound', 'time_written', 'date', 'datesec', 'date_written'])
  :module: x4c.case

  Return list of time-varying variable names for a given component
  and hstr by inspecting the first history file.

.. py:method:: History.isolate_vn(vn, comp, hstr, in_path, output_dirpath, overwrite=True)
  :module: x4c.case

  Create a new netCDF file containing only variable `vn` from
  the input history file `in_path`.

  Uses `ncks` to drop other variables and writes result to
  `output_dirpath` with a standardized filename.

.. py:method:: History.merge_vn(hstr, vn, input_dirpath, output_dirpath, timespan=None, overwrite=True, compression=1)
  :module: x4c.case

  Concatenate per-variable files across time into a single file.

  Uses `ncrcat` with optional compression level to produce an
  aggregated timeseries file for `vn` and `hstr`.
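
The two NCO steps described for `isolate_vn` and `merge_vn` can be sketched as command lines built in Python. The sketch uses standard NCO options (`-v` to select a variable, `-O` to overwrite, `-4` for netCDF-4 output, `-L` for the deflate level); the helper names and file names are hypothetical, and the exact options the library passes are an assumption. The commands are only constructed here, not executed.

```python
def ncks_isolate_cmd(vn, in_path, out_path, overwrite=True):
    """Command that extracts variable `vn` (with its coordinates) from in_path."""
    cmd = ["ncks"]
    if overwrite:
        cmd.append("-O")
    return cmd + ["-v", vn, in_path, out_path]

def ncrcat_merge_cmd(in_paths, out_path, compression=1, overwrite=True):
    """Command that concatenates per-variable files along the record (time) dimension."""
    cmd = ["ncrcat", "-4", "-L", str(compression)]
    if overwrite:
        cmd.append("-O")
    # Sorting keeps the inputs in chronological order for date-stamped names.
    return cmd + sorted(in_paths) + [out_path]

isolate = ncks_isolate_cmd("TS", "mycase.cam.h0.0001-01.nc", "TS.0001-01.nc")
merge = ncrcat_merge_cmd(["TS.0001-02.nc", "TS.0001-01.nc"], "TS.0001.nc")
```

Either list could be passed to `subprocess.run(cmd, check=True)` on a system where NCO is installed.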

.. py:method:: History.rm_timespan(timespan, comps=['atm', 'ice', 'ocn', 'rof', 'lnd'], nworkers=None, rehearsal=True)
  :module: x4c.case

  Rename the archive files within a timespan

  :param timespan: [start_year, end_year] with elements being integers
  :type timespan: tuple or list

CESM Diagnostics

.. py:class:: Timeseries(root_dir, grid_dict=None, casename=None, cesm_ver=3)
  :module: x4c.case

  CESM Timeseries case helper.

  Manages discovery and loading of preprocessed CESM timeseries files produced by CESM postprocessing. Provides convenience methods to locate paths, load raw or derived diagnostics, compute spells, and create plots and seasonal means.

.. py:method:: Timeseries.calc(spell: str, comp=None, timespan=None, load_idx=-1, recalculate=False, verbose=True, **kws)
  :module: x4c.case

  Compute a diagnostic spell and cache the result.

  The `spell` string controls regridding, slicing, spatial/vertical
  averaging and other modifiers parsed by `Spell`. The final
  xarray DataArray is stored in `self.diags[spell]`.

.. py:method:: Timeseries.clear_ds(vn=None)
  :module: x4c.case

  Clear the existing `.ds` property

.. py:method:: Timeseries.copy()
  :module: x4c.case

  Return a deep copy of this Timeseries instance.

.. py:method:: Timeseries.get_comp_hstr(vn)
  :module: x4c.case

  Find all (component, hstr) pairs where `vn` is present.

.. py:method:: Timeseries.get_paths(comp, hstr, vn, timespan=None)
  :module: x4c.case

  Return list of timeseries file paths for `vn` under `comp/hstr`.

  If `timespan` is provided it filters the returned paths to those
  fully covering the requested interval.

.. py:method:: Timeseries.get_ts(vn, comp, timespan=None, slicing=False, regrid=False, dlat=1, dlon=1)
  :module: x4c.case

  Open and return a Dataset for `vn` on `comp`.

  Applies optional slicing and regridding before returning the
  Dataset; does not cache the result.

.. py:method:: Timeseries.load(vn, vtype=None, comp=None, hstr=None, timespan=None, load_idx=-1, verbose=True, reload=False, **kws)
  :module: x4c.case

  Load a variable or derived diagnostic into `self.ds`.

  Automatically detects whether `vn` is a raw timeseries or a
  derived diagnostic and loads or computes it. Results are stored
  in `self.ds[vn]`.

.. py:method:: Timeseries.plot(spell, t_idx=None, regrid=False, gs='T', ssv='SSH', recalculate_ssv=False, timespan=None, **kws)
  :module: x4c.case

  Plot a computed diagnostic `spell`.

  Detects plot type (map, ts, zm, yz) from the DataArray and
  dispatches to the plotting helpers in `diags`/`visual`.

.. py:method:: Timeseries.quickview(timespan=None, nrow=None, ncol=None, wspace=0.3, hspace=0.5, ax_loc=None, figsize=None, stat_period=-50, roll_int=50, ylim_dict=None, spells=None, recalculate=False)
  :module: x4c.case

  Create a multi-panel overview figure for a selection of spells.

  Returns `(fig, ax)` where `ax` is a dict of axes keyed by spell
  keys.

.. py:method:: Timeseries.save_means(vn, comp, output_dirpath, timespan, slicing=False, regrid=False, dlat=1, dlon=1, overwrite=False)
  :module: x4c.case

  Save seasonal and annual mean files for `vn` into `output_dirpath`.

  Writes files for ANN, DJF, MAM, JJA and SON for the given
  `timespan` and optionally regrids results.