Welcome to the GDEX Intake ESM Catalog documentation! This project provides tools and scripts for generating intake-ESM catalogs that enable unified access to diverse Earth science datasets within NCAR’s GDEX infrastructure.
What is GDEX Intake ESM?
While intake-ESM was originally designed for Earth System Model output, we extend its capabilities to support a broader range of Earth science data including:
Observations (satellite, in-situ measurements)
Reanalysis datasets (ERA5, JRA-3Q, etc.)
Model output (CESM, CMIP, etc.)
Other Earth science datasets
Key Features
🛠️ Custom Catalog Generation
Our primary tool generator/create_catalog.py creates intake-ESM catalogs for any dataset directory with flexible configuration options for different data formats and structures.
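To make the idea of an intake-ESM catalog concrete, the sketch below builds the two pieces such a catalog usually consists of: a JSON descriptor and a CSV table with one row per data file. It is an illustration under stated assumptions, not the output of generator/create_catalog.py: the column names (`variable`, `format`, `path`), the example file names, and the helper functions `build_descriptor` and `rows_to_csv` are all hypothetical.

```python
import csv
import io
import json

# Hypothetical catalog rows: one entry per data file.
# The column names and file names are illustrative assumptions only.
rows = [
    {"variable": "t2m", "format": "netcdf", "path": "/path/to/data/t2m_2020.nc"},
    {"variable": "tp", "format": "netcdf", "path": "/path/to/data/tp_2020.nc"},
]


def build_descriptor(catalog_name, description, columns):
    """Return a minimal ESM-collection-style JSON descriptor as a dict."""
    return {
        "esmcat_version": "0.1.0",
        "id": catalog_name,
        "description": description,
        # The CSV table is expected to sit next to the JSON descriptor.
        "catalog_file": f"{catalog_name}.csv",
        # Every column except the file location is a searchable attribute.
        "attributes": [{"column_name": c} for c in columns if c != "path"],
        # Tells the reader which column holds the file location and format.
        "assets": {"column_name": "path", "format_column_name": "format"},
    }


def rows_to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


descriptor = build_descriptor("my_catalog", "My dataset catalog", list(rows[0]))
print(json.dumps(descriptor, indent=2))
print(rows_to_csv(rows))
```

A descriptor of this general shape is what intake-ESM's `intake.open_esm_datastore` is pointed at when a catalog is opened; the columns actually written by the generator may differ from the ones assumed here.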
🌐 Multiple Access Methods
Generated catalogs support three access patterns:
POSIX: Direct filesystem access for NCAR HPC users
HTTPS: Web-based access for remote users
OSDF: Distributed access through Open Science Data Federation
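One way to picture the three access patterns is as three different prefixes in front of the same relative file path. The sketch below illustrates that idea only; every prefix in `ACCESS_PREFIXES` is a hypothetical placeholder rather than a real GDEX endpoint, and `asset_location` is an illustrative helper, not part of this project.

```python
from pathlib import PurePosixPath

# All three prefixes are hypothetical placeholders, not real GDEX endpoints.
ACCESS_PREFIXES = {
    "posix": "/path/to/data",
    "https": "https://data.example.org/files",
    "osdf": "osdf:///example-namespace/files",
}


def asset_location(relative_path, method):
    """Join a dataset-relative path onto the prefix for one access method."""
    if method not in ACCESS_PREFIXES:
        raise ValueError(f"unknown access method: {method!r}")
    rel = PurePosixPath(relative_path)
    # Reject absolute paths and parent-directory segments so the result
    # always stays underneath the chosen prefix.
    if rel.is_absolute() or ".." in rel.parts:
        raise ValueError(f"expected a plain relative path, got {relative_path!r}")
    return f"{ACCESS_PREFIXES[method].rstrip('/')}/{rel}"


for method in ACCESS_PREFIXES:
    print(method, asset_location("2020/t2m.nc", method))
```

Under this assumption, a catalog for each access method would differ only in the values of its path column, while the descriptive columns stay the same.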
📊 Broad Dataset Support
Compatible with diverse data formats including NetCDF, Zarr, and Kerchunk reference files, following vocabulary conventions used by major data providers (DKRZ, Copernicus, NASA, NOAA).
Quick Start
Generate a basic catalog:
python generator/create_catalog.py /path/to/data \
    --out /output/directory \
    --catalog_name my_catalog \
    --description "My dataset catalog"

For comprehensive usage examples:
NCAR HPC users: gdex-examples
OSDF users: osdf_examples
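When catalogs are generated for several datasets, the Quick Start command can be assembled from Python instead of typed by hand. The following is a minimal sketch that uses only the flags shown in the Quick Start (`--out`, `--catalog_name`, `--description`); the helper name `build_command` is hypothetical, and the sketch assumes it is run from the repository root so that the relative script path resolves.

```python
import shlex
import sys


def build_command(data_dir, out_dir, catalog_name, description,
                  script="generator/create_catalog.py"):
    """Return the Quick Start command as an argument list."""
    return [
        sys.executable, script, data_dir,
        "--out", out_dir,
        "--catalog_name", catalog_name,
        "--description", description,
    ]


cmd = build_command(
    "/path/to/data",
    "/output/directory",
    "my_catalog",
    "My dataset catalog",
)

# Print a shell-safe rendering of the command for inspection.
print(shlex.join(cmd))

# To actually run it (assumption: current directory is the repository root):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when a description contains spaces or quotes.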
Repository Structure
generator/ - Core catalog generation tools
notebooks/ - Example Jupyter notebooks demonstrating usage
examples/ - Python script examples for generating dataset catalogs
test/ - Test scripts and validation tools
Content
This documentation covers:
The catalog generation process
Accessing generated catalogs through the different access methods