Spectral Analysis#

This example demonstrates how to perform spectral and cross-spectral analysis.

This document shows a concise workflow using the scipy.signal module. To learn more about the functions and keyword arguments available in this package, read SciPy’s signal processing documentation.

This notebook will cover:

  1. Periodograms

  2. Smoothing

  3. Cross Spectral Analysis

  4. Coherence

import xarray as xr
import geocat.datafiles as gcd
import pandas as pd
import scipy
import matplotlib.pyplot as plt
import numpy as np

Periodogram#

A periodogram estimate is the square of the coefficients from the Fourier transform of one dimension of spectral data.

Read in Data#

We’ll use the geocat-datafiles module to access the relevant datafile for this example: SOI_Darwin.nc

Then, we use xarray.open_dataset() function to read the data as an xarray dataset.

This data represents the Darwin Southern Oscillation Index between 1882 and 1998 using normalized sea level pressure.

soi_darwin = xr.open_dataset(gcd.get('netcdf_files/SOI_Darwin.nc'))

soi_darwin
Downloading file 'netcdf_files/SOI_Darwin.nc' from 'https://github.com/NCAR/GeoCAT-datafiles/raw/main/netcdf_files/SOI_Darwin.nc' to '/home/runner/.cache/geocat'.
<xarray.Dataset> Size: 22kB
Dimensions:  (time: 1404)
Coordinates:
  * time     (time) int32 6kB 0 1 2 3 4 5 6 ... 1398 1399 1400 1401 1402 1403
Data variables:
    date     (time) float64 11kB ...
    DSOI     (time) float32 6kB ...
Attributes:
    title:          Darwin Southern Oscillation Index
    source:         Climate Analysis Section, NCAR
    history:        \nDSOI = - Normalized Darwin\nNormalized sea level pressu...
    creation_date:  Tue Mar 30 09:29:20 MST 1999
    references:     Trenberth, Mon. Wea. Rev: 2/1984
    time_span:      1882 - 1998
    conventions:    None

Time Series Correction#

Currently, our time coordinate is in units of months since January 1882, we’ll use pandas to change this to the conventional DatetimeIndex.

start_date = pd.Timestamp('1882-01-01')
dates = [
    start_date + pd.DateOffset(months=int(month))
    for month in soi_darwin.indexes['time']
]
datetime_index = pd.DatetimeIndex(dates)

datetime_index
DatetimeIndex(['1882-01-01', '1882-02-01', '1882-03-01', '1882-04-01',
               '1882-05-01', '1882-06-01', '1882-07-01', '1882-08-01',
               '1882-09-01', '1882-10-01',
               ...
               '1998-03-01', '1998-04-01', '1998-05-01', '1998-06-01',
               '1998-07-01', '1998-08-01', '1998-09-01', '1998-10-01',
               '1998-11-01', '1998-12-01'],
              dtype='datetime64[ns]', length=1404, freq=None)
soi = soi_darwin.DSOI
soi['time'] = datetime_index

Periodogram#

Now we can call scipy.signal.periodogram().

You can specify a window argument for smoothing your spectrum before spectral analysis. Here is a list of supported Scipy windows. The default is boxcar which is a unit window equivalent to no tapering.

Set the detrending kwarg to constant to remove the mean from our data.

There is also a scaling keyword that defaults to density for computing the power spectral density with units of \(V^{2}/Hz\), with V being the units of the input array (here dimensionless), and Hz representing your frequency units (we’ll use per month instead of per second). spectral is also supported and produces the squared magnitude spectrum with units of \(V^{2}\).

freq_soi, psd_soi = scipy.signal.periodogram(
    soi,
    fs=1,  # sample monthly
    window='tukey',
    detrend='constant',
)

Smoothing#

To provide statistical reliability, the periodogram estimates must be smoothed.

You can use the same scipy.signal.windows as for the tapering. We’ll chose a standard Hann window.

k = 3  # smoothing constant

window = scipy.signal.windows.hann(2 * k + 1)  # Hann Window
window = window / window.sum()  # Normalize window

# Apply the convolution
smoothed_psd = scipy.signal.convolve(psd_soi, window, mode='same')

Plot#

plt.plot(freq_soi, smoothed_psd)
plt.xlabel('Frequency (Cycles / Month)')
plt.ylabel('Variance (Pressure$^2$ / Frequency)')
plt.title('Darwin Southern Oscillation Index (1882 - 1998) - Power Spectral Density')
plt.xlim([0, 0.5]);
../../_images/ba7d08515aabbe2e4504fd4982f3cc9ef833a323c94b8ef55b7dd5bb39ab6370.png

Cross-Spectral Analysis#

For cross-spectral analysis, we’ll create plots of the following quantities: cospectrum, quadtrature, phase, coherence squared, and the periodogram of each dimensional array.

Read in Data#

For our cross-spectral data we’ll look at SLP_DarwinTahitiAnom which contains Darwin and Tahiti sea level pressure anomalies between 1935-1998.

This data will need the same time series adjustment as in the periodogram example above.

slp_darwin = xr.open_dataset(gcd.get('netcdf_files/SLP_DarwinTahitiAnom.nc'))

# Time Series Correction
start_date = pd.Timestamp('1935-01-01')  # Months Since January 1935
dates = [
    start_date + pd.DateOffset(months=int(month))
    for month in slp_darwin.indexes['time']
]
datetime_index = pd.DatetimeIndex(dates)

slp_darwin['time'] = datetime_index
slp_darwin
Downloading file 'netcdf_files/SLP_DarwinTahitiAnom.nc' from 'https://github.com/NCAR/GeoCAT-datafiles/raw/main/netcdf_files/SLP_DarwinTahitiAnom.nc' to '/home/runner/.cache/geocat'.
<xarray.Dataset> Size: 18kB
Dimensions:  (time: 768)
Coordinates:
  * time     (time) datetime64[ns] 6kB 1935-01-01 1935-02-01 ... 1998-12-01
Data variables:
    date     (time) float64 6kB ...
    DSLP     (time) float32 3kB ...
    TSLP     (time) float32 3kB ...
Attributes:
    title:          Darwin and Tahiti Sea Level Pressure Anomalies: 1935-1998
    source:         Climate Analysis Section, NCAR
    history:        \nCreated to test specxy routines
    creation_date:  Tue Mar 30 12:34:39 MST 1999
    conventions:    None
dslp = slp_darwin.DSLP  # Darwin
dslp
<xarray.DataArray 'DSLP' (time: 768)> Size: 3kB
[768 values with dtype=float32]
Coordinates:
  * time     (time) datetime64[ns] 6kB 1935-01-01 1935-02-01 ... 1998-12-01
Attributes:
    short_name:  DSLP
    long_name:   Darwin Sea Level Pressure Anomalies
    units:       hPa
tslp = slp_darwin.TSLP  # Tahiti
tslp
<xarray.DataArray 'TSLP' (time: 768)> Size: 3kB
[768 values with dtype=float32]
Coordinates:
  * time     (time) datetime64[ns] 6kB 1935-01-01 1935-02-01 ... 1998-12-01
Attributes:
    short_name:  TSLP
    long_name:   Tahiti Sea Level Pressure Anomalies
    units:       hPa
dslp.plot(label='Darwin')
tslp.plot(label='Tahiti')
plt.legend()
plt.ylabel('Sea Level Pressure Anomalies [hPa]');
../../_images/a4c7f8fee2d393b07636da23313cc7c2fa19c0032a7dc8988afc8ba77eff6408.png

Cross-Power Spectral Density#

We will calculate cospectrum, quadtrature, and phase by leveraging scipy.signal.csd() and computing its real (cospectrum) and imaginary (quadtrature) components, and calculating the angle between them (phase).

f, pxy = scipy.signal.csd(
    dslp,
    tslp,
    fs=1,  # monthly
    detrend='constant',
)  # remove mean

cospectrum = np.real(pxy)
quadrature = np.imag(pxy)

phase = np.angle(pxy, deg=True)
fig, axs = plt.subplots(1, 3, figsize=(10, 5))

axs[0].plot(f, cospectrum)
axs[0].set_xlim([0, 0.5])
axs[0].set_xlabel('Frequency (Cycles / Month)')
axs[0].set_ylabel('Cospectrum')

axs[1].plot(f, quadrature)
axs[1].set_xlim([0, 0.5])
axs[1].set_ylim([-3, 3])
axs[1].set_xlabel('Frequency (Cycles / Month)')
axs[1].set_ylabel('Quadrature')

axs[2].plot(f, phase)
axs[2].set_xlim([0, 0.5])
axs[2].set_ylim([-200, 200])
axs[2].set_xlabel('Frequency (Cycles / Month)')
axs[2].set_ylabel('Phase')

fig.suptitle(
    'Darwin and Tahiti Sea Level Pressure Anomalies (1935-1998) - Cross-Spectral Analysis'
)
plt.tight_layout();
../../_images/b4042dc6bb144439989fb514611ae7c5cb4e95c8541e7684296754bba8a36714.png

Coherence Squared#

Coherence is a measure of how similar two specta are, with regards to their frequency and phase offset relative to one another.

For coherence squared, simply square the output from scipy.signal.coherence().

f, cxy = scipy.signal.coherence(
    dslp,
    tslp,
    fs=1,  # monthly
    detrend='constant',
)  # remove mean

co_sq = cxy**2
plt.plot(f, co_sq)
plt.xlim([0, 0.5])
plt.ylim([0, 1])
plt.xlabel('Frequency (Cycles / Month)')
plt.ylabel('Coherence SQ')

plt.title('Darwin and Tahiti Sea Level Pressure Anomalies (1935-1998)')
plt.tight_layout();
../../_images/ad0935c73023b882811b3aba2ee3727c80925bcae9cd1128ac4102dbec90e7ed.png

Peridogram and Smoothing#

# Calculate Periodogram
freq_dslp, dslp_psd = scipy.signal.periodogram(
    dslp,
    fs=1,  # monthly
    detrend='constant',
)  # remove mean
freq_tslp, tslp_psd = scipy.signal.periodogram(tslp, fs=1, detrend='constant')

# Perform modified Daniel Smoothing
dslp_smoothed = scipy.signal.convolve(dslp_psd, window, mode='same')
tslp_smoothed = scipy.signal.convolve(tslp_psd, window, mode='same')
fig, axs = plt.subplots(1, 2, figsize=(10, 5))

axs[0].plot(freq_dslp, dslp_smoothed)
axs[0].set_xlim([0, 0.5])
axs[0].set_ylim([0, 30])
axs[0].set_xlabel('Frequency (cycles/month)')
axs[0].set_ylabel('Variance of Darwin SLP')

axs[1].plot(freq_tslp, tslp_smoothed)
axs[1].set_xlim([0, 0.5])
axs[1].set_ylim([0, 30])
axs[1].set_xlabel('Frequency (cycles/month)')
axs[1].set_ylabel('Variance of Tahiti SLP')

fig.suptitle(
    'Darwin and Tahiti Sea Level Pressure (SLP) Anomalies (1935-1998) - Power Spectral Density'
)
plt.tight_layout();
../../_images/5858d335c8cf8768835738801a70ca57fa54d51faba4f6c45229f9c6471e29f8.png