File I/O#
Overview#
This section covers file I/O functions from NCL:
Functions#
addfile#
NCL’s addfile opens a data file in any of its supported file formats. In Python, xarray’s open_dataset plays the same role.
Grab and Go#
netCDF, HDF5, and HDF-EOS5
import geocat.datafiles as gcd
import xarray as xr
data_filepath = gcd.get("netcdf_files/1994_256_FSD.nc")
netCDF_datafile = xr.open_dataset(data_filepath)
netCDF_datafile
Downloading file 'netcdf_files/1994_256_FSD.nc' from 'https://github.com/NCAR/geocat-datafiles/raw/main/netcdf_files/1994_256_FSD.nc' to '/home/runner/.cache/geocat'.
<xarray.Dataset> Size: 248kB
Dimensions: (time: 1, lat: 310, lon: 198)
Coordinates:
* time (time) float64 8B 913.0
* lat (lat) float32 1kB 34.11 34.17 34.24 34.31 ... 51.87 51.92 51.97
* lon (lon) float32 792B 127.4 127.5 127.6 127.7 ... 143.0 143.1 143.2
Data variables:
FSD (time, lat, lon) float32 246kB ...
Attributes:
nclprojection: use native mercator grid plotting techniques
mapflag: mercator
script: converted to netCDF via archive2netCDF.ncl
source: archv.1994_256_00_fsd.A
creation_date: Mon Dec 17 14:49:06 CST 2001
experiment: 01.2
Grab and Go#
HDF4 and HDF-EOS
xarray can read additional file formats when a suitable backend is selected with the engine argument
import geocat.datafiles as gcd
import xarray as xr
data_filepath = gcd.get("hdf_files/avhrr.hdf")
hdf_datafile = xr.open_dataset(data_filepath, engine="netcdf4")
hdf_datafile
Downloading file 'hdf_files/avhrr.hdf' from 'https://github.com/NCAR/geocat-datafiles/raw/main/hdf_files/avhrr.hdf' to '/home/runner/.cache/geocat'.
<xarray.Dataset> Size: 519kB
Dimensions: (fakeDim0: 180, fakeDim1: 360)
Coordinates:
* fakeDim0 (fakeDim0) uint8 180B 129 129 129 129 129 ... 129 129 129 129
* fakeDim1 (fakeDim1) uint8 360B 129 129 129 129 129 ... 129 129 129 129
Data variables:
Data-Set-2 (fakeDim0, fakeDim1) float64 518kB ...
Additional file formats:#
GRIB#
Important Note
GRIB files need the additional cfgrib engine, which can be installed alongside Xarray:
conda install -c conda-forge cfgrib eccodes
import geocat.datafiles as gcd
import xarray as xr
grib_datafile = xr.open_dataset(
gcd.get("grib_files/ST4.2002030112.06h.grb"), engine="cfgrib"
)
grib_datafile
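The examples so far pick the engine by hand. As a minimal sketch, the choice can be collected in one small lookup keyed on the file suffix. The mapping and the helper name `pick_engine` are hypothetical and cover only the three files used in this section; `None` leaves the choice to xarray's default backend.

```python
from pathlib import Path

# Hypothetical mapping, taken only from the examples in this section.
# None means "let xarray choose its default backend".
ENGINE_BY_SUFFIX = {
    ".nc": None,
    ".hdf": "netcdf4",
    ".grb": "cfgrib",
}


def pick_engine(path):
    """Return the engine used in this section for the file's suffix."""
    suffix = Path(path).suffix.lower()
    if suffix not in ENGINE_BY_SUFFIX:
        raise ValueError(f"no engine listed for suffix {suffix!r}")
    return ENGINE_BY_SUFFIX[suffix]


# Usage (requires xarray and the relevant backend to be installed):
#   xr.open_dataset(path, engine=pick_engine(path))
```

Raising on an unknown suffix, rather than silently returning `None`, keeps an unlisted format from being passed to the default backend by accident.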
Shapefiles#
Important Note
Shapefiles can be read via the geopandas read_file() function
conda install -c conda-forge geopandas
Reading a shapefile requires both the main .shp file and its .shx index file, stored in the same directory
import geocat.datafiles as gcd
import geopandas as gpd
# Download the sidecar files so they sit next to the .shp in the local cache
gcd.get("shape_files/states.dbf")
gcd.get("shape_files/states.prj")
gcd.get("shape_files/states.shx")
# read_file() is given the .shp path and picks up the sidecar files itself
shp_datafile = gpd.read_file(gcd.get("shape_files/states.shp"))
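Since a shapefile is spread over several files, it can help to check which of them are present before reading. The sketch below uses only the standard library. The helper name `missing_sidecars` is hypothetical, and it assumes the sidecar files share the .shp file's base name and use lowercase suffixes.

```python
from pathlib import Path


def missing_sidecars(shp_path, required=(".shx",)):
    """Return the required suffixes that have no file next to shp_path."""
    shp = Path(shp_path)
    # with_suffix swaps ".shp" for each sidecar suffix in the same directory
    return [ext for ext in required if not shp.with_suffix(ext).exists()]


# Usage (the path would typically come from gcd.get(...)):
#   missing = missing_sidecars(path, required=(".shx", ".dbf", ".prj"))
#   if missing:
#       print("missing sidecar files:", missing)
```

Only .shx is required by default, matching the note above. Pass a longer tuple to also check for the attribute (.dbf) and projection (.prj) files.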
Sometimes a shapefile’s associated files are not available. If the index file (.shx) is missing, GDAL can rebuild it when SHAPE_RESTORE_SHX is set to YES before the file is read. GDAL can be installed from conda-forge:
conda install -c conda-forge gdal
Set the option before calling read_file():
import geocat.datafiles as gcd
import geopandas as gpd
from osgeo import gdal
gdal.SetConfigOption('SHAPE_RESTORE_SHX', 'YES')
shp_datafile = gpd.read_file(gcd.get("shape_files/states.shp"))
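The call above changes the option for the rest of the session. As a sketch of a narrower alternative, the option can be set through an environment variable for the duration of a single read. This assumes GDAL picks up SHAPE_RESTORE_SHX from the process environment when it has not been set explicitly, which has not been checked against the setup used here. The helper name `temporary_env` is hypothetical.

```python
import os
from contextlib import contextmanager


@contextmanager
def temporary_env(name, value):
    """Set an environment variable inside a with-block, then restore it."""
    previous = os.environ.get(name)
    os.environ[name] = value
    try:
        yield
    finally:
        # Put back the earlier value, or remove the variable if it was unset
        if previous is None:
            os.environ.pop(name, None)
        else:
            os.environ[name] = previous


# Usage (requires geopandas; assumes GDAL reads the option from the environment):
#   with temporary_env("SHAPE_RESTORE_SHX", "YES"):
#       shp_datafile = gpd.read_file(gcd.get("shape_files/states.shp"))
```

Any .shx file that GDAL rebuilds during the read stays on disk afterwards, so later reads should not need the option.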