2. Operational control

This section describes the input options that control the operation of the model (rather than its physical parameters), including processor configuration, time management, and some input/output control.

2.1. Processor configuration

The first namelist read by the POP model, domain_nml, determines how many processors the model should use and how blocks of the domain are distributed across those processors.

The number of processors used by the model is governed by the nprocs_clinic parameter. The parameter nprocs_tropic determines whether the barotropic solver runs on the same number of processors or fewer; specifying a number larger than the clinic value results in an error.

The distribution of blocks across processors is determined by the parameters clinic_distribution_type and tropic_distribution_type. Typically, the 'balanced' choice is best for the baroclinic part of the code and 'cartesian' is best for the barotropic solver. The 'spacecurve' choice is sometimes the default for cases using the tripole grid at high resolution on high processor counts.

In order to update "ghost cells" and implement proper boundary conditions, some boundary information for the global domain is required. The parameter ew_boundary_type determines the type of boundary for the logical east-west direction (i direction); acceptable values are currently 'cyclic' and 'closed' for periodic and closed boundaries, respectively. The corresponding parameter for the logical north-south direction (j direction) is ns_boundary_type, which accepts 'cyclic', 'closed' and 'tripole', where 'cyclic' and 'closed' have the same meaning as in the east-west direction and 'tripole' refers to the use of a tripole grid.

Note

Occasionally, when running POP2 with OpenMP and certain processor configurations, the POP_Distribution rake algorithm fails when clinic_distribution_type is set to 'balanced'. Because of this problem, which is related to other limitations imposed by POP2, the default setting for this option is clinic_distribution_type = 'cartesian'.
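As a concrete sketch, a domain_nml block in the pop_in file using the parameters described above might look like the following; the processor counts, distribution types, and boundary types are illustrative placeholders, and suitable values depend on the grid and machine:

   &domain_nml
     nprocs_clinic            = 64
     nprocs_tropic            = 64
     clinic_distribution_type = 'cartesian'
     tropic_distribution_type = 'cartesian'
     ew_boundary_type         = 'cyclic'
     ns_boundary_type         = 'closed'
   /

Note that nprocs_tropic must not exceed nprocs_clinic, and a grid that is periodic in longitude would use ew_boundary_type = 'cyclic' as shown.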

2.2. Input/Output

The namelist io_nml determines the IO settings for POP.

POP supports both binary and netCDF file formats. The format for each type of file (e.g. restart, history, movie) is set in the individual namelist for that operation. For binary output, POP can perform parallel input/output in order to speed up IO when writing large files. Because most files read or written by POP use direct-access IO with one horizontal slice per binary record, the parallel IO routines allow several processors to write individual records to the same file. The user can specify how many processors participate in the parallel IO, subject to some restrictions. The number of processors cannot exceed the total number of processors assigned to the job. In addition, it is not productive to assign more processors than the number of vertical levels, as the extra processors will generally remain idle (or even perform unnecessary work). There may also be restrictions imposed by the particular architecture: some architectures limit the number of effective IO units that can be open simultaneously, and some (e.g. loose clusters of workstations) may not have a file system accessible to all of the participating processors, in which case the user must set the number of IO processors appropriately. Finally, note that netCDF does not support parallel I/O, so any netCDF-formatted files will be read or written from a single processor regardless of the num_iotasks setting.

The POP model writes a variety of information, including the model configuration and many diagnostics, to standard output. The namelist flag lredirect_stdout can be turned on to redirect standard output to a log file. When this flag is on, output is written to the file specified by log_filename, which is pre-defined by the CIME scripts.

During production runs, it is inconvenient to change the pop_in file for every run. Typically, the only changes necessary are the names of any restart input files. To avoid having to change these filenames in the pop_in file for every run, the option luse_pointer_files exists. If this flag is .true., the names of restart output files are written to pointer files named pointer_filename.suffix, where suffix is currently either restart or tavg, to handle restart files and tavg restart files respectively. When a simulation is started from restart, it reads these pointer files to determine the location and name of the actual restart files.

As model resolution increases and/or more fields are written to netCDF output files, it becomes increasingly likely that the files will exceed the 2 GB file-size limit of the classic netCDF format. netCDF version 3.6 and higher supports files larger than 2 GB, which is activated by using the NF_64BIT_OFFSET flag when creating a new netCDF file. The io_nml variable luse_nf_64bit_offset allows a user to select large-file support for the netCDF output files. For further information, see the FAQ created by Unidata, the developer of netCDF.
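Pulling these options together, an io_nml block might look like the following sketch; the values are illustrative only, and in CIME-driven runs items such as log_filename and pointer_filename are filled in by the scripts:

   &io_nml
     num_iotasks          = 1
     lredirect_stdout     = .true.
     log_filename         = 'ocn.log'
     luse_pointer_files   = .true.
     pointer_filename     = 'rpointer.ocn'
     luse_nf_64bit_offset = .true.
   /

Here num_iotasks applies only to the binary (direct-access) files; as noted above, netCDF files are read and written from a single processor regardless of this setting.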

2.3. Parallel I/O

The namelist group io_pio_nml contains the namelist variables that control the PIO settings for POP.

Todo

Add link to pio namelist group io_pio_nml

Parallel I/O is increasingly needed for two reasons: to significantly reduce the memory footprint required to perform I/O and to address performance limitations in high-resolution, high processor count simulations. Serial I/O is normally implemented by gathering the data onto one task before writing it out. As a result, it is one of the largest sources of global memory use and will always result in a memory bottleneck as the model resolution is increased. Consequently, the absence of parallel I/O in a model component will always give rise to a resolution "cut-off" on a given computational platform. Serial I/O also carries serious performance penalties at higher processor counts.

To address these issues, a parallel I/O library, PIO, has been developed as a collaborative effort by NCAR/CISL, DOE/SciDAC and NCAR/CSEG. PIO was initially designed to allow 0.1-degree POP to execute and write history and restart files on Blue Gene/L in less than 256 MB per MPI task.

Since that initial prototype, PIO has developed into a general-purpose parallel I/O library that currently supports netCDF (serial), pnetcdf and MPI-IO, and it has been implemented throughout the CESM system. PIO is a software interface layer designed to encapsulate the complexities of parallel I/O and to make it easier to replace the lower-level software backend. PIO calls are collective: an MPI communicator is set in a call to PIO_init, and all tasks associated with that communicator must participate in all subsequent calls to PIO.

One of the key features of PIO is that it takes the model's decomposition and redistributes it to an I/O "friendly" decomposition on the requested number of I/O tasks. When using the PIO library for netCDF or pnetcdf I/O, the user must specify the number of iotasks to be used, the stride (the number of tasks between iotasks), and whether the I/O will use the serial netCDF or the pnetcdf library. This information is set in the io_pio_nml namelist. By increasing the number of iotasks, the user can easily reduce the serial I/O memory bottleneck, even with the use of serial netCDF.

In POP, PIO has been implemented for all netCDF I/O. It is important to note that it is now the only mechanism for producing netCDF history files, as well as for reading and writing netCDF restart files.
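The variable names within this namelist group are not spelled out above, so the sketch below uses hypothetical names purely to illustrate the three settings involved (number of iotasks, stride between iotasks, and choice of serial netCDF versus pnetcdf); consult the generated pop_in file or the namelist definition for the actual names and defaults:

   &io_pio_nml
     io_pio_num_iotasks = 4           ! hypothetical name: number of PIO I/O tasks
     io_pio_stride      = 16          ! hypothetical name: MPI tasks between successive iotasks
     io_pio_type_name   = 'pnetcdf'   ! hypothetical name: 'netcdf' (serial) or 'pnetcdf'
   /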

2.4. Time management

The time_manager_nml namelist contains variables that control the timestep, the length of the current run, the method used to suppress the leapfrog computational mode, and the date on which this run-sequence began. A run-sequence consists of one or more job submissions, each of which produces a restart file that is used to begin the next job in the sequence. A run-sequence is identified by a runid that is declared in the first job of the sequence and held fixed throughout the sequence; runid is used in generating default names for the model's output files. Similarly, the start date and time for the run sequence (iyear0, ..., isecond0) are set in the first job and held fixed throughout the sequence. An additional variable called date_separator governs the form of the date that is appended to various output file names. The date_separator is a single character used to separate yyyy, mm, and dd in a date format. A blank character is the default and is translated to no separator (yyyymmdd); a value of '-' would result in the format yyyy-mm-dd.

The timestep is defined using a combination of dt_option and dt_count. If 'steps_per_day' or 'steps_per_year' is chosen, the timestep is computed such that dt_count steps are taken each day or year. If 'hours' or 'seconds' is chosen, the timestep is dt_count hours or seconds (note that dt_count is an integer). If 'auto_dt' is chosen, the timestep is computed automatically based on the grid size. The time step may be adjusted from these values to accommodate averaging time steps.

In order to control a computational mode resulting from the use of a leapfrog time stepping scheme, either a time-averaging method ('avg', 'avgbb', 'avgfit') or a Matsuno ('matsuno') time step must be specified through the time_mix_opt parameter. The frequency (in time steps) for applying one of these methods is set by the time_mix_freq parameter. If 'avg' is selected for time_mix_opt, the averaging results in only a half timestep being taken every time_mix_freq steps. This may result in a non-integral number of steps per day and will result in irregular day boundaries. If an integral number of steps per day is required, two alternative options are provided. Choosing 'avgbb' always takes the two half steps back-to-back, giving a full time step but with increased diffusion. The 'avgfit' option computes a number of full and half steps that fit exactly into a particular time interval; the interval to be fit is governed by the fit_freq parameter, which sets the number of time intervals per day (1 = once per day) into which the time steps must fit exactly. The Matsuno scheme does not use half steps, but it is generally more diffusive than time averaging and has been shown to be unstable in many situations.

The timestep above can be lengthened for tracers in the deep ocean. If such acceleration is requested (laccel = .true.), a profile of the acceleration with depth must be read from the file accel_file, an ascii file with a single column of numbers giving the acceleration factor for each vertical level. Another form of acceleration is to take a longer tracer timestep than the momentum timestep; this is specified by setting dtuxcel to a factor smaller than 1.0.
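As a concrete sketch, a time_manager_nml block using the options described in this section might look like the following; the runid, counts, and option choices are placeholders, and the remaining start-date variables (through isecond0) are omitted here:

   &time_manager_nml
     runid          = 'test_run'
     iyear0         = 1
     date_separator = ' '
     dt_option      = 'steps_per_day'
     dt_count       = 45
     time_mix_opt   = 'avgfit'
     time_mix_freq  = 17
     fit_freq       = 1
     laccel         = .false.
     dtuxcel        = 1.0
   /

With these placeholder settings the model would take roughly 45 steps per day (adjusted to fit the averaging steps exactly into each day, since fit_freq = 1), leave deep-ocean tracer acceleration off, and use the same timestep for momentum and tracers (dtuxcel = 1.0).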