File I/O#

Overview#

This section covers Python equivalents of NCL’s file I/O functions.

Functions#

addfile#

NCL’s addfile opens a data file in one of the supported file formats.

Grab and Go#

  • netCDF, HDF5, and HDF-EOS5

import geocat.datafiles as gcd
import xarray as xr

data_filepath = gcd.get("netcdf_files/1994_256_FSD.nc")
netCDF_datafile = xr.open_dataset(data_filepath)
netCDF_datafile
Downloading file 'netcdf_files/1994_256_FSD.nc' from 'https://github.com/NCAR/geocat-datafiles/raw/main/netcdf_files/1994_256_FSD.nc' to '/home/runner/.cache/geocat'.
<xarray.Dataset> Size: 248kB
Dimensions:  (time: 1, lat: 310, lon: 198)
Coordinates:
  * time     (time) float64 8B 913.0
  * lat      (lat) float32 1kB 34.11 34.17 34.24 34.31 ... 51.87 51.92 51.97
  * lon      (lon) float32 792B 127.4 127.5 127.6 127.7 ... 143.0 143.1 143.2
Data variables:
    FSD      (time, lat, lon) float32 246kB ...
Attributes:
    nclprojection:  use native mercator grid plotting techniques
    mapflag:        mercator
    script:         converted to netCDF via archive2netCDF.ncl
    source:         archv.1994_256_00_fsd.A
    creation_date:  Mon Dec 17 14:49:06 CST 2001
    experiment:     01.2

Grab and Go#

  • HDF4 and HDF-EOS

Xarray can read additional file formats when an appropriate engine is installed and passed through the engine argument.

import geocat.datafiles as gcd
import xarray as xr

data_filepath = gcd.get("hdf_files/avhrr.hdf")
hdf_datafile = xr.open_dataset(data_filepath, engine="netcdf4")
hdf_datafile
Downloading file 'hdf_files/avhrr.hdf' from 'https://github.com/NCAR/geocat-datafiles/raw/main/hdf_files/avhrr.hdf' to '/home/runner/.cache/geocat'.
<xarray.Dataset> Size: 519kB
Dimensions:     (fakeDim0: 180, fakeDim1: 360)
Coordinates:
  * fakeDim0    (fakeDim0) uint8 180B 129 129 129 129 129 ... 129 129 129 129
  * fakeDim1    (fakeDim1) uint8 360B 129 129 129 129 129 ... 129 129 129 129
Data variables:
    Data-Set-2  (fakeDim0, fakeDim1) float64 518kB ...
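
Because the engine passed to open_dataset depends on an optional backend package being installed, it can help to check which ones are importable first. The following is a minimal sketch using only the standard library; the mapping of engine names to import names (netCDF4, h5netcdf, cfgrib) and the helper name available_backends are assumptions made for illustration, not part of xarray.

```python
from importlib.util import find_spec

# Engine name -> import name of the package that provides it.
# This mapping is an illustrative assumption; extend it as needed.
BACKEND_PACKAGES = {
    "netcdf4": "netCDF4",
    "h5netcdf": "h5netcdf",
    "cfgrib": "cfgrib",
}


def available_backends(packages=BACKEND_PACKAGES):
    """Return {engine_name: bool}, True when the backing package is importable."""
    return {
        engine: find_spec(module) is not None
        for engine, module in packages.items()
    }


# Report which optional backends are present in the current environment.
for engine, found in available_backends().items():
    print(f"{engine}: {'installed' if found else 'missing'}")
```

Xarray also exposes xr.backends.list_engines(), which reports the engines it has registered; the sketch above is only a dependency-free way to see which packages are missing before installing them.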

Additional file formats:#

GRIB#

Important Note

GRIB files need the additional cfgrib engine, which can be installed alongside Xarray:

conda install -c conda-forge cfgrib eccodes

import geocat.datafiles as gcd
import xarray as xr

grib_datafile = xr.open_dataset(
    gcd.get("grib_files/ST4.2002030112.06h.grb"), engine="cfgrib"
)
grib_datafile

Shapefiles#

Important Note

Shapefiles can be read with the geopandas read_file() function. geopandas can be installed with:

conda install -c conda-forge geopandas

Reading a Shapefile requires both the main .shp file and its .shx index file in the same directory.

import geocat.datafiles as gcd
import geopandas as gpd

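# gcd.get() downloads a file to the local cache and returns its path.
# Fetch the companion files so they sit next to the .shp file in the cache.
for companion in ("states.dbf", "states.prj", "states.shx"):
    gcd.get(f"shape_files/{companion}")

# Only the .shp file is passed to geopandas; the companions are found alongside it.
shp_datafile = gpd.read_file(gcd.get("shape_files/states.shp"))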
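
Since a read depends on the companion files being present next to the .shp file, a quick check before calling read_file() can make a failure easier to diagnose. This is a minimal sketch using only pathlib; the helper name missing_companions and the example path are hypothetical, and the extension list simply mirrors the four files fetched above.

```python
from pathlib import Path

# Extensions of the four Shapefile parts fetched in the example above.
SHAPEFILE_EXTENSIONS = (".shp", ".shx", ".dbf", ".prj")


def missing_companions(shp_path, extensions=SHAPEFILE_EXTENSIONS):
    """Return the extensions that have no matching file next to shp_path."""
    base = Path(shp_path)
    return [ext for ext in extensions if not base.with_suffix(ext).exists()]


# Hypothetical local path; replace it with the path returned by gcd.get().
print(missing_companions("shape_files/states.shp"))
```

An empty list means every part was found. The check compares file names exactly, so it does not account for upper-case extensions such as .SHP.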

Sometimes a Shapefile’s associated files are not available. If the index file (.shx) is missing, GDAL can rebuild it when SHAPE_RESTORE_SHX is set to YES before the file is read.

conda install -c conda-forge gdal

Then set the option before reading the Shapefile:

import geocat.datafiles as gcd
import geopandas as gpd

from osgeo import gdal

gdal.SetConfigOption('SHAPE_RESTORE_SHX', 'YES')

shp_datafile = gpd.read_file(gcd.get("shape_files/states.shp"))
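
GDAL configuration options can generally also be supplied as environment variables, so a variant of the approach above is to set SHAPE_RESTORE_SHX only when the index file is actually missing. The sketch below assumes that behavior and uses only the standard library; the helper name enable_shx_restore_if_needed is hypothetical.

```python
import os
from pathlib import Path


def enable_shx_restore_if_needed(shp_path):
    """Set SHAPE_RESTORE_SHX=YES when no .shx file sits next to shp_path.

    Returns True if the variable was set, False if the index file exists.
    """
    if Path(shp_path).with_suffix(".shx").exists():
        return False
    # Assumption: GDAL picks this up from the environment, so it must be
    # set before the Shapefile is opened.
    os.environ["SHAPE_RESTORE_SHX"] = "YES"
    return True
```

A call such as enable_shx_restore_if_needed(path) would go immediately before gpd.read_file(path), with path being the value returned by gcd.get(). Using gdal.SetConfigOption as shown above remains the more explicit route.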

Python Resources#