Operational control
===================

This section describes the input options that control the operation (rather
than the physical parameters) of the model, including processor
configuration, time management, and some input/output control.

Processor configuration
-----------------------

The first namelist read by the POP model, ``domain_nml``, determines how many
processors the model should use and how blocks of the domain are distributed
across those processors. The number of processors used by the model is
governed by the ``nprocs_clinic`` parameter. The parameter ``nprocs_tropic``
determines whether the barotropic solver is run on the same or a smaller
number of processors; specifying a number larger than the clinic value will
result in an error.

The distribution of blocks across processors is determined by the parameters
``clinic_distribution_type`` and ``tropic_distribution_type``. Typically, the
``'balanced'`` choice is best for the baroclinic part of the code and
``'cartesian'`` is best for the barotropic solver. The ``'spacecurve'``
choice is sometimes the default for cases using the tripole grid at high
resolution on high processor counts.

In order to update "ghost cells" and implement proper boundary conditions,
some boundary information for the global domain is required. The parameter
``ew_boundary_type`` determines the type of boundary in the logical east-west
direction (the i direction). Acceptable values are currently ``'cyclic'`` and
``'closed'`` for periodic and closed boundaries, respectively. The parameter
for the logical north-south direction (the j direction) is
``ns_boundary_type`` and accepts ``'cyclic'``, ``'closed'``, and
``'tripole'``, where cyclic and closed have the same meaning as in the
east-west direction and tripole refers to the use of a tripole grid.

.. note::
   Occasionally, when running POP2 with OpenMP and using certain processor
   configurations, the POP_Distribution rake algorithm fails when
   ``clinic_distribution_type`` is set to ``balanced``. Because of this
   problem, which is related to other limitations in POP2, the default
   setting for this option is ``clinic_distribution_type = cartesian``.
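To illustrate how these options fit together, a ``domain_nml`` group in
``pop_in`` might look like the following sketch. The processor counts and
distribution choices shown here are placeholder values only and must be
chosen to match the block sizes and processor layout of a particular case.

::

   &domain_nml
     nprocs_clinic            = 64           ! placeholder: processors for the baroclinic part
     nprocs_tropic            = 64           ! must not exceed nprocs_clinic
     clinic_distribution_type = 'cartesian'  ! 'cartesian', 'balanced', or 'spacecurve'
     tropic_distribution_type = 'cartesian'
     ew_boundary_type         = 'cyclic'     ! periodic in the logical east-west (i) direction
     ns_boundary_type         = 'closed'     ! 'cyclic', 'closed', or 'tripole'
   /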
.. _input-output:

Input/Output
------------

The namelist ``io_nml`` determines the I/O settings for POP. POP supports
both binary and netCDF file formats; the format for each type of file (e.g.
restart, history, movie) is set in the individual namelist for that
operation.

For the binary output format, POP can perform parallel input/output in order
to speed up I/O when writing large files. Because most files read or written
by POP use direct-access I/O with a horizontal slice written to each binary
record, the parallel I/O routines allow several processors to write
individual records to the same file. The user can specify how many processors
participate in the parallel I/O, with some restrictions. The number of
processors cannot exceed the total number of processors assigned to the job.
In addition, it is not productive to assign more processors than the number
of vertical levels, as the extra processors will generally remain idle (or
even perform unnecessary work). There may also be restrictions imposed by the
particular architecture: some architectures limit the number of effective I/O
units that can be open simultaneously, and some (e.g. loose clusters of
workstations) may not have a file system accessible to all of the
participating processors, in which case the user must set the number of I/O
processors appropriately. Finally, note that netCDF does not support parallel
I/O, so any netCDF-formatted files will be read or written from a single
processor regardless of the ``num_iotasks`` setting.

The POP model writes a variety of information, including the model
configuration and many diagnostics, to standard output. The namelist flag
``lredirect_stdout`` can be turned on to redirect standard output to a log
file. The output is then written to the file named by ``log_filename``, which
has been pre-defined by the CIME scripts.

During production runs, it is not convenient to have to change the ``pop_in``
file for every run. Typically, the only changes necessary are the names of
any restart input files. To avoid having to change these filenames in the
``pop_in`` file for every run, the option ``luse_pointer_files`` exists. If
this flag is ``.true.``, the names of restart output files are written to
pointer files named ``pointer_filename``.\ *suffix*, where *suffix* is
currently either ``restart`` or ``tavg`` to handle restart files and tavg
restart files. When a simulation is started from restart, it reads these
pointer files to determine the location and name of the actual restart files.

As model resolution increases and/or more fields are written to netCDF output
files, it becomes increasingly likely that the files will exceed the default
netCDF file-size limit of 2 GB. netCDF version 3.6 and higher supports files
larger than 2 GB, which is activated by using the ``NF_64BIT_OFFSET`` flag
when opening a new netCDF file. The ``io_nml`` variable
``luse_nf_64bit_offset`` allows a user to select large-file support for the
netCDF output files. For further information, see the FAQ provided by
Unidata, the developer of netCDF.
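A minimal sketch of an ``io_nml`` group using the options discussed above is
shown below. The values are placeholders only; in a CIME-driven case, entries
such as ``log_filename`` and ``pointer_filename`` are normally filled in by
the scripts.

::

   &io_nml
     num_iotasks          = 1               ! processors participating in binary parallel I/O
     lredirect_stdout     = .true.          ! send standard output to log_filename
     log_filename         = 'ocn.log'       ! placeholder; normally set by the CIME scripts
     luse_pointer_files   = .true.          ! record restart file names in pointer files
     pointer_filename     = 'rpointer.ocn'  ! placeholder prefix; .restart or .tavg is appended
     luse_nf_64bit_offset = .true.          ! enable 64-bit offset (large file) netCDF output
   /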
Parallel I/O
------------

The namelist group ``io_pio_nml`` contains the namelist variables that
control the PIO settings for POP.

.. todo::
   Add link to the PIO namelist group ``io_pio_nml``.

Parallel I/O is increasingly needed for two reasons: to significantly reduce
the memory footprint required to perform I/O, and to address performance
limitations in high-resolution, high-processor-count simulations. Serial I/O
is normally implemented by gathering the data onto one task before writing it
out. As a result, it is one of the largest sources of global memory use and
will always become a memory bottleneck as the model resolution is increased.
Consequently, the absence of parallel I/O in a model component will always
give rise to a resolution "cut-off" on a given computational platform. In
addition, serial I/O carries serious performance penalties at higher
processor counts.

To address these issues, a parallel I/O library, PIO, has been developed as a
collaborative effort by NCAR/CISL, DOE/SciDAC and NCAR/CSEG. PIO was
initially designed to allow 0.1-degree POP to execute and write history and
restart files on Blue Gene/L in less than 256 MB per MPI task. Since that
initial prototype, PIO has developed into a general-purpose parallel I/O
library that currently supports netCDF (serial), pnetcdf and MPI-IO, and it
has been implemented throughout the CESM system. PIO is a software interface
layer designed to encapsulate the complexities of parallel I/O and to make it
easier to replace the lower-level software backend. PIO calls are collective:
an MPI communicator is set in a call to ``PIO_init``, and all tasks
associated with that communicator must participate in all subsequent calls to
PIO.

One of the key features of PIO is that it takes the model's decomposition and
redistributes it to an I/O "friendly" decomposition on the requested number
of I/O tasks. When using the PIO library for netCDF or pnetcdf I/O, the user
must specify the number of I/O tasks to be used, the stride (the number of
tasks between I/O tasks), and whether the I/O will use the serial netCDF or
the pnetcdf library. This information is set in the ``io_pio_nml`` namelist.
Increasing the number of I/O tasks reduces the serial I/O memory bottleneck,
even when the serial netCDF library is used. PIO has been implemented for all
netCDF I/O in POP and is now the only mechanism for producing netCDF history
files, as well as for reading and writing netCDF restart files.
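The three quantities above translate into a handful of namelist entries. The
sketch below is illustrative only: the variable names shown are placeholders,
and the actual names and defaults should be checked against the
``io_pio_nml`` group generated in ``pop_in`` for a particular case.

::

   &io_pio_nml
     io_pio_num_iotasks = 4          ! placeholder name: number of PIO I/O tasks
     io_pio_stride      = 16         ! placeholder name: MPI tasks between consecutive I/O tasks
     io_pio_type_name   = 'pnetcdf'  ! placeholder name: 'netcdf' (serial) or 'pnetcdf'
   /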
.. _time-management:

Time management
---------------

The ``time_manager_nml`` namelist contains variables that control the
timestep, the length of the current run, the method used to suppress the
leapfrog computational mode, and the date on which this *run-sequence* began.
A *run-sequence* consists of one or more job submissions, each of which
produces a *restart file* that is used to begin the next job in the sequence.
A run-sequence is identified by a ``runid`` that is declared in the first job
of the sequence and held fixed throughout the sequence; ``runid`` is used in
generating default names for the model's output files. Similarly, the start
date and time for the run-sequence (``iyear0``, ..., ``isecond0``) are set in
the first job and held fixed throughout the sequence. An additional variable,
``date_separator``, governs the form of the date that is appended to various
output files. The ``date_separator`` is a single character used to separate
yyyy, mm, and dd in a date format. A blank character is the default and is
translated to no separator (yyyymmdd); a value of '-' would result in the
format yyyy-mm-dd.

The timestep is defined using a combination of ``dt_option`` and
``dt_count``. If ``steps_per_day`` or ``steps_per_year`` is chosen, the
timestep is computed such that ``dt_count`` steps are taken each day or year.
If ``hours`` or ``seconds`` is chosen, the timestep is ``dt_count`` hours or
seconds (note that ``dt_count`` is an integer). If ``auto_dt`` is chosen, the
timestep is computed automatically based on the grid size. The timestep may
be adjusted from these values to accommodate averaging timesteps.

In order to control the computational mode resulting from the use of a
leapfrog time-stepping scheme, either a time-averaging method (``'avg'``,
``'avgbb'``, ``'avgfit'``) or a Matsuno (``'matsuno'``) timestep must be
specified through the ``time_mix_opt`` parameter. The frequency (in
timesteps) for applying one of these methods is defined by the
``time_mix_freq`` parameter. If ``'avg'`` is selected for ``time_mix_opt``,
the averaging results in only a half timestep being taken every
``time_mix_freq`` steps. This may result in a non-integral number of steps
per day and will result in irregular day boundaries. If an integral number of
steps per day is required, two alternative options are provided. Choosing
``'avgbb'`` always takes the two half steps back-to-back, giving a full
timestep but with increased diffusion. The ``'avgfit'`` option computes a
number of full and half steps that fit exactly into a particular time
interval. The time interval to be fit is governed by the ``fit_freq``
parameter, which sets the number of time intervals per day (1 = once per day)
into which the timesteps must fit exactly. The Matsuno scheme does not use
half steps, but it is generally more diffusive than time averaging and has
been shown to be unstable in many situations.

The timestep above can be increased for tracers in the deep ocean. If such
acceleration is requested (``laccel = .true.``), a profile of the
acceleration with depth must be read from the file ``accel_file``, an ASCII
file with a single column of numbers giving the acceleration factor for each
vertical level. Another form of acceleration is to take a longer tracer
timestep than momentum timestep; this can be specified by setting
``dtuxcel`` to a factor smaller than 1.0.
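As a final illustration, a ``time_manager_nml`` group might look like the
sketch below. All values are placeholders chosen only to show the form of the
options discussed above (the start date, for example, is normally set by the
CIME scripts), and only a subset of the group's variables is shown.

::

   &time_manager_nml
     runid          = 'test_run'       ! placeholder run-sequence identifier
     dt_option      = 'steps_per_day'  ! or 'steps_per_year', 'hours', 'seconds', 'auto_dt'
     dt_count       = 45               ! with steps_per_day: 45 timesteps per day (placeholder)
     time_mix_opt   = 'avgfit'         ! 'avg', 'avgbb', 'avgfit', or 'matsuno'
     time_mix_freq  = 17               ! apply time mixing every 17 steps (placeholder)
     fit_freq       = 1                ! fit full/half steps exactly into one interval per day
     iyear0         = 1                ! placeholder start year of the run-sequence
     date_separator = ' '              ! blank gives yyyymmdd; '-' gives yyyy-mm-dd
     laccel         = .false.          ! set .true. and provide accel_file to accelerate deep tracers
     dtuxcel        = 1.0              ! 1.0 means equal tracer and momentum timesteps
   /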