OSDF-Examples

Jupyter Book DOI Python PelicanFS

A collection of Jupyter notebooks that stream Earth System Science data from Open Science Data Federation (OSDF) origins using PelicanFS, and run analysis on a variety of HPC and cloud platforms.

Browse the rendered book: https://ncar.github.io/osdf-examples/

New to OSDF or PelicanFS? Project Pythia’s OSDF Cookbook is the recommended introduction — its first chapters cover the OSDF concept and PelicanFS in depth. For background on how NCAR integrated OSDF with its data infrastructure, see Integration of OSDF with NCAR’s data infrastructure: Interim Project Report (Oct 2025).
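For orientation, reading one object through PelicanFS can be sketched roughly as below. This is a minimal sketch, not code taken from the notebooks. It assumes the third-party pelicanfs package is installed and that pelican://osg-htc.org is the federation discovery URL to use. The helper names normalize_path and read_object are hypothetical, and no real object path is implied.

```python
def normalize_path(namespace_path: str) -> str:
    """Return the federation path with exactly one leading slash."""
    return "/" + namespace_path.strip().lstrip("/")


def read_object(namespace_path: str,
                federation: str = "pelican://osg-htc.org") -> bytes:
    """Fetch one object's bytes through PelicanFS (needs network access).

    The default federation URL is an assumption; pass another one if the
    data lives in a different Pelican federation.
    """
    # Imported lazily so this file can be loaded without pelicanfs installed.
    from pelicanfs.core import PelicanFileSystem

    fs = PelicanFileSystem(federation)
    # PelicanFileSystem follows the fsspec interface; cat() returns bytes.
    return fs.cat(normalize_path(namespace_path))
```

Calling read_object with a path taken from one of the notebooks returns the raw bytes of that object. The notebooks themselves typically pass such paths to higher-level readers rather than handling bytes directly.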

Quick Start

git clone https://github.com/NCAR/osdf-examples.git
cd osdf-examples
python -m venv .venv && source .venv/bin/activate    # or use conda
pip install -r requirements.txt
jupyter lab

New here? Start with notebooks/simple_aws_example.ipynb (runs on a laptop, no credentials required).

What’s inside

The repository is organized by data origin — the OSDF origin a notebook streams data from. Each notebook also indicates the compute platform it was tested on. Browse the Notebook Gallery for the full, tagged list.

Data origins

Compute platforms covered

NCAR Casper · TACC Stampede3 · Indiana Jetstream2 · OSPool · laptop

Most notebooks are designed to run on a user’s own machine via a Dask LocalCluster. The compute-platform mentions and platform: tags indicate where each notebook was verified (e.g. via PBS on Casper), not the only place it can run. To run a notebook elsewhere, set its cluster switch so that it starts a LocalCluster instead.
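A cluster switch of this kind might look like the sketch below. It illustrates the pattern only and is not copied from any notebook. The function names cluster_spec and make_cluster are hypothetical, and the resource values (workers, cores, memory, walltime) are placeholders. It assumes the distributed package for LocalCluster and the dask_jobqueue package for PBSCluster.

```python
import importlib


def cluster_spec(use_pbs: bool):
    """Return (module name, class name, keyword arguments) for the chosen cluster."""
    if use_pbs:
        # Placeholder resources; real values depend on the site and allocation.
        return ("dask_jobqueue", "PBSCluster",
                {"cores": 1, "memory": "4GiB", "walltime": "01:00:00"})
    # Default: a cluster of local worker processes on the current machine.
    return ("distributed", "LocalCluster",
            {"n_workers": 2, "threads_per_worker": 1})


def make_cluster(use_pbs: bool = False):
    """Create the cluster, importing only the package that is actually needed."""
    module_name, class_name, kwargs = cluster_spec(use_pbs)
    cluster_cls = getattr(importlib.import_module(module_name), class_name)
    return cluster_cls(**kwargs)
```

In a notebook, the result would typically be handed to a Dask client, for example Client(make_cluster(use_pbs=False)). A PBS cluster usually also needs site-specific settings such as a queue and an account, and its workers are requested with cluster.scale(...).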

Workflow types

Bias correction · climatology · ML (logistic-regression Niño 3.4 prediction) · benchmarking · diagnostic visualization · equilibrium climate sensitivity.

Finding a notebook

Each notebook is tagged in its frontmatter with a faceted scheme so you can filter by axis instead of guessing keywords:

Facet      Examples
origin:    aws, ncar-posix, ncar-object-store
platform:  casper, stampede3, jetstream2, ospool, laptop
dataset:   cesm, cmip6, era5, conus404, na-cordex, hrrr, dart, jra3q, hadisst
task:      bias-correction, climatology, ml, benchmark, visualization, ecs
level:     beginner, intermediate, advanced

The rendered Jupyter Book exposes these tags as filters. See the Notebook Gallery for a tagged index, or Contributing to OSDF-Examples for the tag conventions when adding new notebooks.
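To illustrate how faceted tags support filtering by axis, here is a small sketch that selects notebooks from a tag index. The index contents are illustrative assumptions: the tags shown for simple_aws_example.ipynb are guessed from its description, and the second entry is a made-up file name. The function names are hypothetical and are not part of the repository.

```python
def parse_tags(tags):
    """Group 'facet:value' strings into a {facet: {values}} mapping."""
    facets = {}
    for tag in tags:
        facet, sep, value = tag.partition(":")
        if sep:  # ignore tags that carry no facet prefix
            facets.setdefault(facet.strip(), set()).add(value.strip())
    return facets


def matches(tags, **wanted):
    """True if every requested facet=value pair is present in the tags."""
    facets = parse_tags(tags)
    return all(value in facets.get(facet, set())
               for facet, value in wanted.items())


def find(index, **wanted):
    """Return the sorted notebook names whose tags satisfy all criteria."""
    return sorted(name for name, tags in index.items()
                  if matches(tags, **wanted))


# Illustrative index only; real tags live in each notebook's frontmatter.
NOTEBOOKS = {
    "notebooks/simple_aws_example.ipynb":
        ["origin:aws", "platform:laptop", "level:beginner"],
    "notebooks/hypothetical_cesm_climatology.ipynb":
        ["origin:ncar-posix", "platform:casper", "dataset:cesm",
         "task:climatology", "level:intermediate"],
}

print(find(NOTEBOOKS, platform="laptop"))
```

Because each facet is matched independently, criteria combine with AND semantics: find(NOTEBOOKS, origin="ncar-posix", task="climatology") returns only notebooks carrying both tags.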

Repository structure

docs/         Markdown overviews and the notebook gallery
notebooks/    All workflow notebooks (subfolders for ML and NDC workflows)
scripts/      Non-notebook code (e.g. OSPool batch examples)
myst.yml      Jupyter Book configuration / table of contents

How to contribute

Contributions are welcome from anyone — you do not need an NCAR HPC account. Notebooks that run on a laptop, on the cloud, or on any HPC system are all in scope, as long as they demonstrate accessing data via OSDF/PelicanFS.

  1. Fork the repository.

  2. Create a feature branch, for example: git checkout -b example/my-amazing-example

  3. Add your notebook with the standard frontmatter and tags (see Contributing to OSDF-Examples).

  4. Open a pull request describing the dataset, origin, and compute platform.

If you’re contributing a workflow that requires NCAR HPC access, please note that in the notebook so external readers know what to expect.

Citing

If you use any workflow in this repository, please cite via the DOI badge above.

Support

Bug reports and feature requests: please open a GitHub Issue.

References
  1. Harsha Hampapura, Riley Conroy, Emma Turetsky, & Joanmarie Del Vecchio. (2025). NCAR/osdf_examples: osdf-example-workflows-1.0.1. Zenodo. https://doi.org/10.5281/zenodo.16863133