Using Timing Files
Model timing files contain a summary of various timing information for the run. It is helpful to check the timings after the run to verify that the model is running efficiently.
1. What are timing files and where are they located?
A summary timing output file is produced after every CESM run. This file is placed in $CASEROOT/timing/ccsm_timing.$CASE.$date, where $date is a datestamp set by CESM at runtime.
The first section in the timing output, “CCSM TIMING PROFILE”, summarizes general timing information for the run. The total run time and cost are given in several metrics, including pe-hrs per simulated year (cost), simulated years per wall day (throughput), seconds, and seconds per model day. These give a quick summary in several units for analysis and comparison with other runs. The total run time for each component is also provided, as is the time for initialization of the model. These times are aggregated over the total run and do not take into account any temporal or processor load imbalances.
2. Use timing files to determine runtime variables
The timing file gives the following cost information:
Overall Metrics:
Model Cost: 327.14 pe-hrs/simulated_year (scale= 0.50)
Model Throughput: 4.70 simulated_years/day
The model throughput is the estimated number of model years that you can run in a wallclock day. Based on this, you can make the most of the $CASE.run queue limit by changing STOP_OPTION and STOP_N in env_run.xml.
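As a rough illustration, the two overall metrics can be pulled out of the timing file text with a short script. This is only a sketch: the function name `read_overall_metrics` and the `SAMPLE` string are hypothetical, and the regular expressions assume the lines are laid out exactly as in the excerpt above, which may differ between CESM versions.

```python
import re

# Sample text copied from the "Overall Metrics" excerpt above.
SAMPLE = """\
Overall Metrics:
Model Cost: 327.14 pe-hrs/simulated_year (scale= 0.50)
Model Throughput: 4.70 simulated_years/day
"""


def read_overall_metrics(text):
    """Return (cost, throughput) as floats; a value is None if its line is missing.

    Assumption: the line layout matches the excerpt shown in this section.
    """
    cost = re.search(r"Model Cost:\s*([0-9.]+)\s*pe-hrs/simulated_year", text)
    throughput = re.search(
        r"Model Throughput:\s*([0-9.]+)\s*simulated_years/day", text
    )
    return (
        float(cost.group(1)) if cost else None,
        float(throughput.group(1)) if throughput else None,
    )


cost, throughput = read_overall_metrics(SAMPLE)
print(f"cost = {cost} pe-hrs/simulated_year")
print(f"throughput = {throughput} simulated_years/day")
```

To use this on a real run, read the contents of the timing file in $CASEROOT/timing and pass the text to the function in place of `SAMPLE`.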
For example, say a model’s throughput is 4.7 simulated_years/day. On Cheyenne, the maximum runtime limit is 12 hours. 4.7 model years/24 hours * 12 hours = 2.35 years. On massively parallel computers, there is always some variability in how long a job takes to run. On some machines, you may need to leave as much as 20% buffer time in your run to guarantee that jobs finish reliably before the time limit. For that reason we will set our model to run only 2 model years per job, which should take about 10.2 hours at this throughput and leaves roughly 15% of the limit as buffer.
Continuing to assume that the run is on Cheyenne, we can set:
./xmlchange STOP_OPTION='nyears'
./xmlchange STOP_N=2
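The arithmetic behind this choice can be checked with a small script. This is a minimal sketch, not part of CESM: the function name `job_estimate` and its parameters are hypothetical, and it assumes throughput stays constant over the run.

```python
def job_estimate(throughput_years_per_day, limit_hours, stop_n_years):
    """Estimate how a chosen STOP_N (in years) fits within a wallclock limit.

    Returns a tuple of:
      max_years       -- model years that fit in the limit with no buffer
      expected_hours  -- estimated wallclock hours for stop_n_years
      buffer_fraction -- share of the limit left unused (negative if over)
    """
    max_years = throughput_years_per_day * limit_hours / 24.0
    expected_hours = stop_n_years / throughput_years_per_day * 24.0
    buffer_fraction = 1.0 - expected_hours / limit_hours
    return max_years, expected_hours, buffer_fraction


# Values from the example: 4.7 simulated_years/day, 12-hour limit, STOP_N=2.
max_years, expected_hours, buffer_fraction = job_estimate(4.7, 12.0, 2)
print(f"max years in limit:  {max_years:.2f}")
print(f"expected wallclock:  {expected_hours:.1f} hours")
print(f"buffer remaining:    {buffer_fraction:.0%}")
```

A negative buffer fraction means the chosen STOP_N is expected to exceed the wallclock limit, so a smaller value (or a shorter STOP_OPTION unit such as months) would be needed.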