The log files

During a CESM run, each model component writes one or more log files. These files are named:

$model.log.*

where $model is the component name (for example, cesm, atm, lnd, ice, ocn, or rof).

Where are the log files?

  • While the model is running, the log files are written to the run directory (RUNDIR).

  • If the run completes successfully, the log files are copied to the archive directory (DOUT_S_ROOT) during the short-term archiving step.

  • If the run fails, the log files remain in the run directory (RUNDIR), where they can be inspected to diagnose the problem.
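As a minimal sketch of locating the logs from the command line, the helper below lists the log files in a directory, newest first. The function name `list_logs` is hypothetical and not part of CESM. The usage comments assume that you are in the case directory, that `./xmlquery` with the `--value` option is available there to print RUNDIR and DOUT_S_ROOT, and that archived logs sit in a `logs` subdirectory of the archive; check these against your own setup.

```shell
#!/bin/sh
# list_logs DIR: print the log files in DIR, most recently modified first.
# (Hypothetical helper, not part of CESM.)
list_logs() {
    # Match component logs such as cesm.log.* or atm.log.*; ls -t sorts by
    # modification time. Errors are silenced so that a directory without
    # logs simply prints nothing.
    ls -t "$1"/*.log.* 2>/dev/null
}

# Usage (assumed to be run from the case directory):
#   list_logs "$(./xmlquery RUNDIR --value)"             # during or after a failed run
#   list_logs "$(./xmlquery DOUT_S_ROOT --value)/logs"   # after short-term archiving
```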

Figure: Overview of the CESM directories and the log files.

What to do when a run fails?

First, check the latest cesm.log.* file, which will often tell you when the model failed. If a run completed successfully, the last several lines of the drv.log.* file will contain a string like SUCCESSFUL TERMINATION OF CESM. Note that the log files of a successful run are compressed (extension .gz) before they go into the archive logs directory. If the log files are not compressed, this normally indicates that the run was not successful. To examine a log file, use the first command below for a compressed log or the second for an uncompressed one:

gunzip -c cesm.log.nnnnnnn.desched1.YYMMDD-pid.gz | more

or

more cesm.log.nnnnnnn.desched1.YYMMDD-pid

If the string SUCCESSFUL TERMINATION OF CESM does not appear near the end of drv.log.*, the simulation did not complete successfully.
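The success check described above could be scripted roughly as follows. This is a sketch, not a CESM tool: the function name `run_succeeded` is hypothetical, and it searches the whole file rather than only the last lines so that it does not depend on exactly where the string appears.

```shell
#!/bin/sh
# run_succeeded LOGFILE: return 0 if LOGFILE (plain or .gz) contains the
# success string, non-zero otherwise. (Hypothetical helper, not part of CESM.)
run_succeeded() {
    case "$1" in
        *.gz) gunzip -c "$1" ;;   # archived logs are compressed
        *)    cat "$1" ;;         # logs left in the run directory are not
    esac | grep "SUCCESSFUL TERMINATION OF CESM" >/dev/null
}

# Usage, with drv_log set to the path of a drv.log.* file:
#   if run_succeeded "$drv_log"; then
#       echo "run completed"
#   else
#       echo "run did not complete"
#   fi
```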

Before digging into the component logs, check for common system-related issues:

  • Did the model time out?

  • Was a disk quota limit hit?

  • Did a machine go down?

  • Did a file system become full?

If any of those things happened, take appropriate corrective action and resubmit the job.
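Of the checks listed above, the full-file-system case is the easiest to sketch in a portable way; timeouts and quota limits depend on the scheduler and site tools, so they are not covered here. The helper names `fs_usage_percent` and `warn_if_full` are hypothetical, and the example relies only on the POSIX output format of `df -P`.

```shell
#!/bin/sh
# fs_usage_percent DIR: print how full the file system holding DIR is,
# as an integer percentage. (Hypothetical helper, not part of CESM.)
fs_usage_percent() {
    # df -P prints one POSIX-format line per file system; column 5 is the
    # "Capacity" field (for example 87%). File system names that contain
    # spaces would shift the columns and are not handled here.
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# warn_if_full DIR LIMIT: print a warning when usage is at or above LIMIT percent.
warn_if_full() {
    used=$(fs_usage_percent "$1")
    if [ "$used" -ge "$2" ]; then
        echo "WARNING: file system for $1 is ${used}% full"
    fi
}

# Usage, with rundir set to the run directory:
#   warn_if_full "$rundir" 95
```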

If none of these appear to be the cause, examine the component log files (*.log.*) for error messages. The first place to look is the most recent cesm log file:

cesm.log.*

The cesm log records the overall progress of the simulation and often indicates when and why the model stopped. The other log files can also contain useful information. The error is often reported close to the end of one of the logs, although the first visible error is not always the root cause. Learning to interpret CESM log files takes some practice.
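One way to start scanning the component logs is sketched below: for each uncompressed log in a directory, it prints the first line that matches a few common error keywords. The function name `first_errors` is hypothetical, the keyword list (`error`, `abort`, `fatal`) is an assumption rather than a list of messages that CESM is guaranteed to emit, and the `-H` and `-m` options assume a GNU or BSD `grep`. Since the first visible error is not always the root cause, treat the output as a starting point for reading the surrounding lines.

```shell
#!/bin/sh
# first_errors DIR: for each uncompressed log in DIR, print the first line
# matching a set of error keywords, in the form file:line:text.
# (Hypothetical helper; the keyword list is an assumption, adjust as needed.)
first_errors() {
    for f in "$1"/*.log.*; do
        [ -f "$f" ] || continue               # the glob matched nothing
        case "$f" in *.gz) continue ;; esac   # skip compressed logs
        # -H: always print the file name, -n: line number,
        # -i: ignore case, -m 1: stop after the first match.
        grep -H -n -i -m 1 -E 'error|abort|fatal' "$f" || true
    done
    return 0
}

# Usage, with rundir set to the run directory:
#   first_errors "$rundir"
```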

In the exercises later in this chapter, we will work through an example of a runtime failure and learn how to identify the underlying cause from the log files.