Posts by Deepak Cherian
Thinking through CESM data access
- 16 August 2023
We want to read a large number of netCDF files, combine them to form a single dataset, and then analyze that. How do we think about it?
In pseudocode we want
Analyzing and Visualizing CAM-SE Output in Python
- 15 August 2023
We demonstrate a variety of options for analyzing and visualizing output from the Community Atmosphere Model (CAM) with the spectral element (SE) grid in Python. This notebook was developed for the ESDS Collaborative Work Time on Unstructured Grids, which took place on April 17, 2023. A recap of the related CAM-SE discussion can be found here.
Regrid CAM-SE output using map file
Recap: Unstructured Grid Collaborative Work Time
- 05 May 2023
ESDS hosted our first Collaborative Work Time event on April 17, 2023. The topic of the session was “Working With Unstructured Grids”. Our goal is to encourage cross-lab collaboration and build lasting science-software partnerships.
The event was hybrid, with in-person attendees in the Damon Room at the Mesa Lab. A lucky overlap with the Improving Scientific Software conference meant that collaborators from the Department of Energy were also able to attend in person.
Using Kerchunk with CESM Timeseries Data on the Cloud
- 15 March 2023
We benchmark reading a subset of the CESM2-Large Ensemble stored as a collection of netCDF files on the cloud (Amazon / AWS) from Casper. We use a single ensemble member historical experiment with daily data from 1850 to 2009, with a total dataset size of 600+ GB, from 13 netCDF4 files.
We read in two ways:
Virtual aggregate CESM MOM6 datasets with kerchunk
- 07 March 2023
This notebook is adapted from the work by Lucas Sterzinger (an NCAR SIParCS intern in 2021).
This notebook was updated to
Regridding using xESMF and an existing weights file
- 06 December 2022
A fairly common request is to use an existing ESMF weights file to regrid an Xarray Dataset (1, 2). Applying weights in general should be easy: read the weights, then apply them using dot or tensordot on the input dataset.
In the Xarray/Dask/Pangeo ecosystem, xESMF provides an interface to ESMF for convenient regridding, including parallelization with Dask. Here we demonstrate how to use an existing ESMF weights file with xESMF specifically for CAM-SE.
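The "read the weights, then apply them" step amounts to a sparse matrix-vector product. The sketch below is a dependency-free illustration of that idea, not the xESMF workflow from the post. It assumes the usual ESMF weights-file layout, where `row` holds 1-based destination indices, `col` holds 1-based source indices, and `S` holds the weight values. The function name `apply_weights` and the toy numbers are hypothetical.

```python
def apply_weights(src, row, col, S, n_dst):
    """Apply sparse regridding weights: dst[i] = sum_j W[i, j] * src[j].

    row, col, and S are parallel sequences describing the nonzero
    entries of W. Indices are assumed to be 1-based, as in ESMF
    weights files.
    """
    dst = [0.0] * n_dst
    for r, c, w in zip(row, col, S):
        # Shift the 1-based indices to Python's 0-based indexing.
        dst[r - 1] += w * src[c - 1]
    return dst


# Hypothetical toy weights: 3 source cells mapped onto 2 destination cells.
src = [10.0, 20.0, 30.0]
row = [1, 1, 2, 2]            # destination index of each weight
col = [1, 2, 2, 3]            # source index of each weight
S = [0.5, 0.5, 0.25, 0.75]    # weight values

regridded = apply_weights(src, row, col, S, n_dst=2)
```

In practice the same product would be computed with a sparse matrix and dot or tensordot on the flattened horizontal dimension; the explicit loop only makes the indexing convention visible.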
Debugging dask workflows: Detrending
- 31 March 2022
Detrending - subtracting a trend, commonly a linear fit, from the data - along the time dimension is a common workflow in the climate sciences.
Here’s an example
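The example from the original post is not reproduced in this listing. As a stand-in, here is a minimal sketch of linear detrending along a single time axis using an ordinary least-squares fit. It is plain Python rather than the Xarray/Dask workflow the post discusses, and the function name `detrend_linear` and the sample series are hypothetical.

```python
def detrend_linear(t, y):
    """Subtract the least-squares straight-line fit of y against t."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    # Closed-form ordinary least squares for slope and intercept.
    cov = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    var = sum((ti - t_mean) ** 2 for ti in t)
    slope = cov / var
    intercept = y_mean - slope * t_mean
    return [yi - (slope * ti + intercept) for ti, yi in zip(t, y)]


# Hypothetical series: a linear trend plus a small oscillation.
time = [0.0, 1.0, 2.0, 3.0, 4.0]
oscillation = [1.0, -1.0, 1.0, -1.0, 1.0]
series = [2.0 * ti + 3.0 + oi for ti, oi in zip(time, oscillation)]

residual = detrend_linear(time, series)
```

The residual keeps the oscillation and drops the fitted line, so it sums to zero up to floating-point error.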
Sparse arrays and the CESM land model component
- 24 February 2022
An underappreciated feature of Xarray + Dask is the ability to plug in different array types. Usually we work with Xarray wrapping a Dask array which in turn uses NumPy arrays for each block; or just Xarray wrapping NumPy arrays directly. NumPy arrays are dense in-memory arrays. Other array types exist:
sparse for sparse arrays