The log files
The log files are named $model.log.*, where $model is the name of the component that wrote the log (for example, cpl).
While the model is running, it writes the log files to the run directory (RUNDIR).
When the run completes successfully, the model moves the log files into the archive directory (DOUT_S_ROOT).
When the model fails, the log files remain in the run directory (RUNDIR).
Figure: Overview of the CESM directories and the log files.
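The location rule above can be sketched as a small Python helper. This is only an illustration: the function name `find_logs` and its arguments are hypothetical, and because the layout inside the archive directory is not described here, the sketch searches it recursively rather than assuming a particular subdirectory.

```python
from pathlib import Path


def find_logs(rundir, dout_s_root, model="cpl"):
    """Return (location, paths) for log files named <model>.log.*.

    Hypothetical helper: looks in the run directory first, then anywhere
    under the archive directory (layout not assumed, so search recursively).
    """
    pattern = f"{model}.log.*"

    # Logs still in RUNDIR: the run is in progress or has failed.
    in_run = sorted(Path(rundir).glob(pattern))
    if in_run:
        return "RUNDIR", in_run

    # Logs under DOUT_S_ROOT: the run completed and was archived.
    in_archive = sorted(Path(dout_s_root).rglob(pattern))
    if in_archive:
        return "DOUT_S_ROOT", in_archive

    return None, []
```

Calling `find_logs(rundir, dout_s_root)` with the values of RUNDIR and DOUT_S_ROOT for your case returns where the coupler logs currently live, which already hints at whether the run finished.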
What to do when a run fails?
First, check the latest cpl.log.* file, which will often tell you when the model failed. If a run completed successfully, the last several lines of the cpl.log.* file contain a string like SUCCESSFUL TERMINATION OF CESM.
If you don’t see this message, the run has failed.
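The check described above can be automated with a short Python sketch. The function names and the `tail_lines` default are hypothetical. Because the text only promises "a string like" the success message, the sketch matches the prefix SUCCESSFUL TERMINATION rather than the full wording; that choice is an assumption. It also assumes the log is plain text.

```python
from pathlib import Path

# Assumption: match only the common prefix, since the exact wording may vary.
SUCCESS_MARKER = "SUCCESSFUL TERMINATION"


def latest_cpl_log(directory):
    """Return the most recently modified cpl.log.* file, or None if absent."""
    logs = list(Path(directory).glob("cpl.log.*"))
    return max(logs, key=lambda p: p.stat().st_mtime) if logs else None


def run_succeeded(log_path, tail_lines=50):
    """Return True if the success marker appears in the last tail_lines lines."""
    lines = Path(log_path).read_text(errors="replace").splitlines()
    return any(SUCCESS_MARKER in line for line in lines[-tail_lines:])
```

A typical use is to call `latest_cpl_log` on the run directory and pass the result to `run_succeeded`; a `False` result means the run should be treated as failed.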
Check these things first when a job fails:
Did the model time out?
Was a disk quota limit hit?
Did a machine go down?
Did a file system become full?
If any of those things happened, take appropriate corrective action and resubmit the job.
If none of the above clearly caused the failure, check the rest of the component log files ($model.log.*) for error messages. Interpreting error messages takes a bit of practice. We will look at an example in this chapter's exercises.
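As a starting point for searching the component logs, the following Python sketch lists lines that look like error messages. The function name `scan_logs` and the keyword list (error, abort, fatal) are assumptions made for illustration, not an official list of CESM messages, and the sketch assumes plain-text logs.

```python
import re
from pathlib import Path

# Assumption: these keywords are a guess at what a failure line may contain.
ERROR_PATTERN = re.compile(r"error|abort|fatal", re.IGNORECASE)


def scan_logs(directory, pattern=ERROR_PATTERN):
    """Return (file name, line number, text) for each matching log line."""
    hits = []
    # Every component log follows the $model.log.* naming scheme.
    for path in sorted(Path(directory).glob("*.log.*")):
        text = path.read_text(errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((path.name, number, line.strip()))
    return hits
```

The matches are only candidates: a keyword hit can be harmless, and a real failure may be reported in words the pattern does not cover, so the surrounding lines in the log still need to be read.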