Ensemble spread and ocean initialization · ESP-SMYLE

Stream: ESP-SMYLE

Topic: Ensemble spread and ocean initialization

Dan Amrhein (Oct 05 2020 at 16:10):

What are some options for ocean initialization? What are the uncertainties / PDFs that we would want to sample over to generate spread in ocean ICs? How important is consistency with JRA55 to avoid initial shocks?

Stephen Yeager (Oct 05 2020 at 16:49):

JRA55-do FOSI is the only viable option for ocean initialization for SMYLE. In the future, perhaps we will have a fully-coupled EnKF DA reconstruction (including ocean BGC) that will naturally have spread in each component. I don't know what the uncertainties should be for the ocean state, but up to now we've assumed it to be zero which is wrong. A lagged ensemble seems to me to be the easiest way to introduce some spread into ocean/ice/BGC, but open to other ideas.

Dan Amrhein (Oct 05 2020 at 18:39):

OK, thanks Steve. Re: lagged ensemble, I know you described this in the meeting, but could you clarify again? Do I have it right that the idea would be to take states from the FOSI that are offset in time from the initialization state? Would those states be paired with atmospheric states?

What is the origin of ocean uncertainties here?

Stephen Yeager (Oct 05 2020 at 20:30):

Yes, FOSI states from, say, the first 5 days of the month rather than just start of month. Those ocean/ice states would be paired with perturbed atm states from 0Z at the start of the month (implying some ocean/atm mismatch). Perhaps the ocean spread could be interpreted as representing uncertainty in the phase of ocean waves, or perhaps as a reflection of missing mesoscale noise, but mostly it's just an ad hoc way to introduce spread. We could alternatively think about a 'pertlim' method for ocean/ice?
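
For illustration, the pairing described above could be sketched as follows. This is a minimal sketch, not the SMYLE workflow: the function name, the dictionary keys, and the choice to cycle members through the lag days are all assumptions made for the example.

```python
from datetime import date, timedelta

def lagged_ensemble(start, n_lags=5, n_members=10):
    """Assign each member an ocean/ice restart date and an atmosphere perturbation.

    Hypothetical helper: ocean/ice restart dates cycle through the first
    `n_lags` days of the month, while every member uses the atmosphere state
    from the start date (0Z) with its own perturbation index.
    """
    members = []
    for m in range(n_members):
        members.append({
            "member": m,
            # Lagged ocean/ice state drawn from the FOSI, offset by 0..n_lags-1 days
            "ocean_ice_restart": start + timedelta(days=m % n_lags),
            # Atmosphere always starts at the nominal start date
            "atm_start": start,
            # Index of the atmospheric perturbation applied to this member
            "atm_pert": m,
        })
    return members

if __name__ == "__main__":
    for member in lagged_ensemble(date(2020, 11, 1)):
        print(member)
```

With 10 members and 5 lags, each lagged ocean/ice state is shared by two members that differ only in their atmospheric perturbation, which makes the ocean/atm mismatch mentioned above explicit.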

Dan Amrhein (Oct 06 2020 at 17:53):

It's interesting to think about how much and where the ocean initialization will matter. I would guess that the relatively rapidly growing perturbation errors in the atmosphere would take over for generating spread soon in many parts of the ocean after initializing. (Maybe different ocean ICs would contribute to the growth of atmospheric spread.) So a goal might be to make sure that regions that may be more insulated from atmospheric influence, e.g. at greater ocean depths, also have a chance to develop spread. If so, is it a question of what kinds of perturbations might project onto whatever growing modes might exist? Maybe that's an argument for the lagged ensemble, as those length scales would presumably be present, whereas high-wavenumber, pertlim-like perturbations could just get damped out?

Stephen Yeager (Oct 07 2020 at 14:46):

I'd love some help scouring the literature for a good rationale for different techniques of ensemble generation. I had a chat with Magdalena Balmaseda at the S2D prediction meeting in Boulder. She suggested that our ENSO initialization shock in DPLE may be related to a lack of spread in ocean ICs. I haven't found any paper that clarifies that at all, but I'm not that familiar with ENSO prediction literature. Perhaps we should ask Joe. The ECMWF seasonal forecasting system uses an ocean analysis (OCEAN5) with 5 members (https://doi.org/10.5194/gmd-12-1087-2019), which adds to the atmosphere IC spread.

Dan Amrhein (Oct 07 2020 at 15:34):

Good idea. Pinging @Joe Tribbia to get his thoughts. I'll take a look at the OCEAN5 paper.

Fred Castruccio (Oct 07 2020 at 19:15):

Without an actual ensemble to get the ocean IC spread, a cheap and easy way could be to use the first few EOFs to introduce noise along the dominant modes of variability of the system.

Dan Amrhein (Oct 07 2020 at 21:05):

Thanks Fred. How large would you think those perturbations should be?

Re: ensemble spread as we could get from an EnKF solution, that would have a flavor of "practical predictability," as spread would reflect in part how well-constrained the system was by observations. My impression is that SMYLE and similar experiments are designed more to target the intrinsic predictability of the system, independent of observations, is that right?

Fred Castruccio (Oct 07 2020 at 21:15):

Dan, that remains to be determined. I think the EOFs can be useful to point to where to introduce spread, not sure how to get the magnitude.
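
A minimal sketch of the EOF idea, assuming the control-run states are available as a (time, space) NumPy array. The function name and the `amplitude` scaling are hypothetical; `amplitude` is left as a free parameter because the thread does not settle what the magnitude should be.

```python
import numpy as np

def eof_perturbations(states, ic, n_modes=3, n_members=5, amplitude=0.1, seed=0):
    """Perturb an initial state along the leading EOFs of a control run.

    states : (n_time, n_space) array of model states (assumed input)
    ic     : (n_space,) initial condition to perturb
    amplitude : assumed scaling, as a fraction of each PC's standard deviation
    """
    rng = np.random.default_rng(seed)
    anomalies = states - states.mean(axis=0)
    # Rows of vt are the EOFs (orthonormal spatial patterns)
    _, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    eofs = vt[:n_modes]
    # Standard deviation of each principal component time series
    pc_std = s[:n_modes] / np.sqrt(states.shape[0] - 1)
    # Random coefficient per member and mode, scaled by the PC spread
    coeffs = rng.standard_normal((n_members, n_modes)) * amplitude * pc_std
    return ic + coeffs @ eofs  # (n_members, n_space)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    control = rng.standard_normal((50, 20))  # synthetic stand-in for model output
    ensemble = eof_perturbations(control, control[-1])
    print(ensemble.shape)
```

By construction the perturbations lie entirely in the span of the leading `n_modes` EOFs, so they carry the large-scale structure of the dominant variability rather than grid-scale noise.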

Fred Castruccio (Oct 07 2020 at 21:26):

Dan, another option could be to rely on an EnOI approach and only run the analysis at the IC times, in a post-processing fashion. This would actually be a much better approach to obtain a meaningful ensemble of ocean ICs (using the posterior ensemble).

Dan Amrhein (Oct 09 2020 at 15:36):

Fred, agreed that's another option. For your EnOI perturbation generation, you drew states at different times from a control JRA-forced run. That seems similar to the lagged ensemble approach, with larger perturbations if they're drawn from a climatological background of variability, rather than a few states that are close in time.

I looked at a couple of related papers. It looks like the OCEAN5 reanalysis that Steve pointed to generates spread by perturbing the location of ocean observations in a DA procedure, which is designed to sample over and account for observational representativeness error (https://www.ecmwf.int/en/elibrary/17831-generic-ensemble-generation-scheme-data-assimilation-and-ocean-analysis) in addition to the uncertainty in the DA posterior. Lang et al. (2012) outline strategies used at ECMWF, including work on singular vectors by @Joe Tribbia and SKEB by @Judith Berner et al. Johnson and Wang (https://journals.ametsoc.org/mwr/article/144/7/2579/72331/A-Study-of-Multiscale-Initial-Condition) suggest that perturbations at varying length scales can improve ensemble performance because they better approximate the true uncertainty in initial conditions in an NWP context.

It seems to me that the distribution that we want to sample to make initial perturbations is the FOSI - TRUTH distribution. One way to do this that would try to keep the perturbations on the model attractor could be to compute Kalman increments based on misfits between FOSI and observations (or a data product, like Roemmich and Gilson). We'd have to have a time period to compute those statistics that didn't overlap with the prediction interval.

Dan Amrhein (Oct 09 2020 at 17:47):

The Kalman increments could be computed offline using a stationary covariance and differences between FOSI and obs over some time period, i.e. no need to run actual DA. We might want to think about what to do with the time-mean of the increments, which is an estimate of FOSI bias.
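
A rough sketch of the offline calculation described above, assuming a linear observation operator `H`, a stationary background covariance `B`, and an observation error covariance `R` (all names here are illustrative, not taken from any existing tool). Each increment is `K (y - H x_fosi)` with gain `K = B H^T (H B H^T + R)^{-1}`; the time mean of the increments is returned separately as the FOSI bias estimate.

```python
import numpy as np

def offline_increments(fosi, obs, H, B, R):
    """Kalman-style increments from FOSI-minus-obs misfits, with no DA run.

    fosi : (n_time, n_state) FOSI states
    obs  : (n_time, n_obs) observations (or a gridded data product)
    H    : (n_obs, n_state) linear observation operator
    B    : (n_state, n_state) stationary background covariance (symmetric)
    R    : (n_obs, n_obs) observation error covariance (symmetric)
    """
    innovations = obs - fosi @ H.T           # misfits y - H x
    S = H @ B @ H.T + R                      # innovation covariance
    # K = B H^T S^{-1}; for symmetric B and S this equals solve(S, H B)^T
    K = np.linalg.solve(S, H @ B).T
    increments = innovations @ K.T           # one increment per time
    bias = increments.mean(axis=0)           # time mean: estimate of FOSI bias
    return increments - bias, bias

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fosi = rng.standard_normal((30, 4))      # synthetic stand-in for FOSI
    H = np.eye(2, 4)                         # observe the first two state elements
    obs = fosi @ H.T + rng.standard_normal((30, 2))
    perts, bias = offline_increments(fosi, obs, H, np.eye(4), np.eye(2))
    print(perts.shape, bias)
```

Removing the time mean is only one possible choice for handling the bias; the de-biased increments could then be sampled as candidate perturbations, provided the statistics come from a period that does not overlap the prediction interval.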

Stephen Yeager (Oct 09 2020 at 20:50):

Shall we schedule a zoom conversation next week to discuss?

Dan Amrhein (Oct 10 2020 at 02:27):

Sounds good.

Stephen Yeager (Oct 13 2020 at 21:10):

Interesting comment (and paper reference) on this topic from an email exchange with Yuko Okumura:

Regarding the methods to generate initial condition spread, it would be interesting and useful in terms of ENSO prediction to add noise dominant in the coupled system for each initialization month (for example, Pacific meridional mode in spring and Indian Ocean dipole in fall). The noise structure could be obtained from the ensemble spread of existing hindcasts (NMME and Xian’s CESM1 hindcasts). Kug et al. (2010, https://link.springer.com/article/10.1007/s00382-009-0664-y) seem to talk about a similar idea. I am not too familiar with the technical aspect of forecasts, but this seems like an interesting approach to account for the uncertainty of initial condition.

Stephen Yeager (Oct 13 2020 at 22:26):

Another interesting and relevant read: Palmer & Zanna 2013 (https://iopscience-iop-org.cuucar.idm.oclc.org/article/10.1088/1751-8113/46/25/254018)

Dan Amrhein (Oct 15 2020 at 20:49):

We had a good conversation yesterday, and moving forward the goal is to design some tests and experiments to evaluate how large ocean perturbations should be, and how much they are likely to matter.

Dan Amrhein (Oct 15 2020 at 20:51):

One takeaway I had from Joe is that the growth of ensemble spread in the ocean is likely to be driven by rapidly growing spread in the atmosphere. @Fred Castruccio, I took a stab at writing up the procedure I had in mind for using offline EnOI-like DA tools for generating perturbations. Can you take a look and see if this is similar to what you had in mind? And you'll also have a much better notion of what is feasible! I described this to Jeff A today and he also seemed interested, and suggested we could chat early next week. One_approach_for_using_offline_DA_to_generate_ocean_IC_perturbations.pdf

Stephen Yeager (Nov 02 2020 at 22:51):

@Dan Amrhein Any updates on these discussions? Would you have an interest in presenting your thoughts at next Monday's meeting (Nov. 9)?

Dan Amrhein (Nov 03 2020 at 21:51):

Hey Steve: @Fred Castruccio and I chatted last week, and also looped in Jeff Anderson. We discussed a few different options, including the list from our meeting a couple of weeks ago. Our impression was that the lagged ensembles are probably the way to go now: while we can think of other approaches that could be better in some respects, we weren't sure a) how much of a difference they would make, and b) how feasible they would be given the time frame. I'd be happy to give a brief discussion at the meeting.

Last updated: May 16 2025 at 17:14 UTC