Traceability#
Configuration#
An Autosubmit experiment starts when it is created with the command
autosubmit expid, issued with a given version of Autosubmit. The generated
experiment contains a minimal YAML configuration to bootstrap the
experiment.
For an Autosubmit experiment of type Git, the rest of the experiment
configuration is stored in a location like <EXPID>/proj/git_project/ (the
proj part is constant, but git_project is configurable) and imported
by Autosubmit. The <EXPID>/proj/git_project/ subdirectory contains a clone
of a Git repository (i.e. there is a proj/git_project/.git directory).
Note
Autosubmit combines multiple YAML files and generates a merged YAML
file at <EXPID>/conf/metadata/experiment_data.yml. This file can be used to analyse the final configuration used for the run and to compare it with the information from trace files.
The cloned repository may contain YAML configuration files in a location
such as <EXPID>/proj/git_project/conf, for example, with settings
for models and applications, Autosubmit jobs, as well as the template
scripts (e.g. under <EXPID>/proj/git_project/templates, or anywhere
the user may choose).
These configuration files, template scripts, and the Git information
from <EXPID>/proj/git_project/ (and any Git submodules), are one part of
the traces used for provenance and reproducibility of the Autosubmit
experiments. The rest of the traces and the data produced by running
the experiment workflow jobs are explained in the following sections.
Logs#
Most of the Autosubmit commands that take an expid argument (autosubmit create,
autosubmit setstatus, autosubmit run, etc.) write to log
files persisted in the computer where the command is issued, along with
the rest of the workflow configuration and other traces. The only exception
is autosubmit delete <EXPID>, which writes to the global log
path, because it deletes the experiment folder along with its ASLOGS folder.
By default, these command logs are saved under <EXPID>/tmp/ASLOGS. Their names
contain the timestamp of the command, and they always come in pairs of “.log” and
“_err.log” files (one for the command standard output, and one for
the error output).
If the user issuing the command is not the owner of the experiment, then
Autosubmit will try to write the log file in the ASLOGS folder first,
and should that fail, it will try to write to the tmp folder or to the
global log path, depending on the file system permissions for the user.
Note
Autosubmit keeps 10 logs of each command, i.e. up to 10 logs of
autosubmit create, 10 logs of autosubmit run, etc., and then
removes older log files when new ones are created.
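The retention rule in this note can be sketched as follows. This is only an illustration, not Autosubmit's actual implementation: the function name logs_to_prune is hypothetical, the file name layout <YYYYMMDD>_<HHMMSS>_<command>[_err].log is taken from the example file names later on this page, and treating a “.log”/“_err.log” pair as a single log is an assumption.

```python
from collections import defaultdict


def logs_to_prune(file_names, keep=10):
    """Return the log files beyond the `keep` newest runs of each command.

    Hypothetical helper: a ".log"/"_err.log" pair sharing a timestamp is
    assumed to count as one log for the purpose of the limit.
    """
    by_command = defaultdict(list)
    for name in file_names:
        stem = name[:-len(".log")] if name.endswith(".log") else name
        if stem.endswith("_err"):
            stem = stem[:-len("_err")]
        # Assumed layout: <YYYYMMDD>_<HHMMSS>_<command>
        parts = stem.split("_", 2)
        if len(parts) != 3:
            continue  # not a command log, ignore it
        date, time, command = parts
        by_command[command].append((date + time, name))

    stale = []
    for entries in by_command.values():
        # Newest timestamps first; everything after the first `keep` is old.
        timestamps = sorted({stamp for stamp, _ in entries}, reverse=True)
        old = set(timestamps[keep:])
        stale.extend(name for stamp, name in entries if stamp in old)
    return sorted(stale)
```

Grouping by command before applying the limit mirrors the wording of the note: the limit applies to each command separately, so many autosubmit run logs do not push out the logs of autosubmit create.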
Autosubmit commands that do not take an expid argument
(e.g. autosubmit expid, autosubmit testcase, autosubmit readme, etc.)
write to the global log path, which can be configured in the .autosubmitrc
configuration file.
Note
There are commands that do not produce any log, e.g. autosubmit delete.
The logs of the workflow tasks are retrieved from remote platforms by Autosubmit
and written to <EXPID>/tmp/LOG_<EXPID>/. They contain the output and errors,
as well as the trace output of the template script after parameter expansion
(done via the set -x mode in Bash Shell).
The parent directory, <EXPID>/tmp, contains other trace files:
- .cmd files that are the scripts created by Autosubmit from the templates and used to run each task (locally or to a remote platform with Slurm, for example);
- *_COMPLETED files that confirm a task was marked as completed by the platform;
- *_STAT files that contain the latest start and end date of the job; and
- *_TOTAL_STATS that aggregates the information of all *_STAT info for the current and previous jobs.
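As a rough illustration of how these trace files can be grouped per task, the sketch below classifies file names by suffix. The helper names classify and traces_by_job are hypothetical, and the assumption that each file is named <job name><suffix> is inferred from the ${job_name_ptrn}_STAT pattern visible in the task log shown later on this page.

```python
# Suffix table; _TOTAL_STATS is checked before _STAT for readability,
# although neither suffix is a suffix of the other.
SUFFIXES = (
    (".cmd", "script"),
    ("_COMPLETED", "completed"),
    ("_TOTAL_STATS", "total_stats"),
    ("_STAT", "stat"),
)


def classify(name):
    """Return (job_name, kind) for a file name found under <EXPID>/tmp."""
    for suffix, kind in SUFFIXES:
        if name.endswith(suffix):
            return name[:-len(suffix)], kind
    return name, "other"


def traces_by_job(names):
    """Map each job name to the set of trace kinds found for it."""
    jobs = {}
    for name in names:
        job, kind = classify(name)
        if kind != "other":
            jobs.setdefault(job, set()).add(kind)
    return jobs
```

A job whose set lacks "completed" has, under these assumptions, not yet been marked as completed by the platform.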
Data#
The Autosubmit experiment ID acts as a persistent identifier (PID), which can be used to link data produced, traces, and configuration.
For example, it is possible to use the experiment ID in directories or as metadata to data written to remote file systems and databases. This way, one can verify if the experiment produced the expected data, or what experiment produced certain data.
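One possible convention, sketched below, is to place the experiment ID as the first path component under a data root, so that the producing experiment can be recovered from any file path. The function names, the data root and the file layout are all hypothetical; Autosubmit does not prescribe them.

```python
from pathlib import PurePosixPath


def output_path(root, expid, relative):
    """Build a data path that embeds the experiment ID (hypothetical layout)."""
    return str(PurePosixPath(root) / expid / relative)


def expid_of(path, root):
    """Recover the experiment ID from a path built by output_path."""
    return PurePosixPath(path).relative_to(root).parts[0]


path = output_path("/data/archive", "a001", "ocean/sst.nc")
print(path)                             # prints: /data/archive/a001/ocean/sst.nc
print(expid_of(path, "/data/archive"))  # prints: a001
```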
Users must decide on the policy to maintain experiments. Depending on the number of experiments (thousands, millions) and storage limitations (user quota) it may be necessary to remove experiments and any data in the experiment directory.
It is possible to archive Autosubmit experiments, or delete old experiments. Another possibility is to compress logs and traces generated by experiments, keeping the experiments in the Autosubmit experiments directory.
Version control#
If your Autosubmit project uses Git (i.e. you have PROJECT.PROJECT_TYPE=git),
then Autosubmit will check out your project code and keep track of the Git information
for traceability and provenance.
Your experiment configuration will contain the Git details like the URL
(GIT.PROJECT_ORIGIN), the branch (GIT.PROJECT_BRANCH), and the commit
used (GIT.PROJECT_COMMIT).
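The dotted names used here (GIT.PROJECT_ORIGIN and so on) can be read as paths into a nested configuration. A minimal sketch, assuming the configuration has already been loaded into nested Python dictionaries, and using a hypothetical flatten helper and hypothetical example values:

```python
def flatten(config, prefix=""):
    """Flatten nested dictionaries into a dict keyed by dotted names."""
    flat = {}
    for key, value in config.items():
        dotted = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, dotted))
        else:
            flat[dotted] = value
    return flat


# Hypothetical values, for illustration only.
config = {
    "PROJECT": {"PROJECT_TYPE": "git", "PROJECT_DESTINATION": "git_project"},
    "GIT": {
        "PROJECT_ORIGIN": "https://example.org/model.git",
        "PROJECT_BRANCH": "main",
        "PROJECT_COMMIT": "0123abc",
    },
}
flat = flatten(config)
print(flat["GIT.PROJECT_BRANCH"])  # prints: main
```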
If you have access to the project destination folder (PROJECT.PROJECT_DESTINATION)
in the environment where you are running Autosubmit, then you can also obtain the
same information using the command-line utility git in that directory, or inspect
the contents of the .git sub-directory.
Autosubmit will inspect the Git repository of your project and extract the current commit.
That value is then written to the configuration key AUTOSUBMIT.WORKFLOW_COMMIT.
The table below contains the complete list of Git parameters and their description:
| Parameters | Description |
|---|---|
| GIT.PROJECT_ORIGIN | URL pointing towards the project |
| GIT.PROJECT_BRANCH | Branch to check out once the repository is downloaded |
| GIT.PROJECT_COMMIT | Git commit used |
| GIT.PROJECT_TAG | Tag of the project |
| GIT.PROJECT_SUBMODULES | Creates a folder within the main project that allows for a |
| GIT.PROJECT_SUBMODULES_DEPTH | Allow for a shallow clone to be done within the submodule with the specified depth. This will allow for smaller local clones, but limits its commit reachability |
| GIT.FETCH_SINGLE_BRANCH | Limits the data of the submodule to a single branch. |
| GIT.REMOTE_CLONE_ROOT | Clones a git project on the main HPC platform |
Note
Depending on how you configure and run your experiment, the GIT.PROJECT_COMMIT
may differ from what was actually used by Autosubmit (AUTOSUBMIT.WORKFLOW_COMMIT).
For example, you may use a commit for GIT.PROJECT_COMMIT and run
autosubmit refresh <EXPID>. Later, you may add more commits to your local
Git working copy, or you may check out another branch. After either change,
subsequent autosubmit commands will use the latest version
of your local project folder, unless you run autosubmit refresh again.
For traceability and provenance, we recommend the use of AUTOSUBMIT.WORKFLOW_COMMIT.
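A simple check for this kind of drift could look like the sketch below. It assumes the configuration is available as a dictionary keyed by dotted names and that both values are commit hashes, possibly abbreviated; the function name commit_mismatch is hypothetical, and the check is not meaningful if GIT.PROJECT_COMMIT holds something other than a hash.

```python
def commit_mismatch(flat_config):
    """Compare the requested commit with the commit Autosubmit recorded.

    Returns True if they differ, False if one is a prefix of the other
    (to tolerate abbreviated hashes), and None if either value is missing.
    """
    requested = str(flat_config.get("GIT.PROJECT_COMMIT") or "").strip().lower()
    used = str(flat_config.get("AUTOSUBMIT.WORKFLOW_COMMIT") or "").strip().lower()
    if not requested or not used:
        return None  # nothing to compare
    return not (used.startswith(requested) or requested.startswith(used))
```

Returning None rather than False when a value is missing keeps "cannot tell" distinct from "the commits agree".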
A practical example#
Given an experiment ID, such as a001, the experiment directory on a machine
could be something similar to $HOME/a001/ (configurable). For brevity, the
rest of this section will use relative directories like tmp/ instead of
$HOME/a001/tmp/.
The YAML configuration files of the experiments are stored in the conf/
subdirectory and may import other YAML files from proj/git_project/ (where
proj is a directory common to all Autosubmit experiments, but git_project
is configurable).
The complete YAML configuration used by Autosubmit, after all files have been
included by Autosubmit, is stored at conf/metadata/experiment_data.yml.
The autosubmit commands issued for the experiment a001 will have access
to this YAML configuration, and will be logged to files on the configured
platforms (local or remote). The log files are later retrieved by Autosubmit automatically,
and saved to the machine where the autosubmit command was issued. The
command logs are stored in the directory tmp/ASLOGS.
Running autosubmit setstatus, for example, would produce files such as
tmp/ASLOGS/20240319_141712_setstatus.log and
tmp/ASLOGS/20240319_141712_setstatus_err.log. These two files contain the
standard output and error output of the autosubmit setstatus command, issued on
2024-03-19 at 14:17:12 (computer time). The “.log” file contains the output
produced by Autosubmit, whereas the “_err.log” file contains the errors, or
is empty if no error occurred.
2024-03-19 14:17:17,772 Autosubmit is running with 4.1.0
2024-03-19 14:17:17,782 Preparing .lock file to avoid multiple instances with same expid.
2024-03-19 14:17:17,782 Exp ID: a001
2024-03-19 14:17:17,782 Save: False
2024-03-19 14:17:17,782 Final status: WAITING
2024-03-19 14:17:17,782 List of jobs to change: a001_20200101_fc0_285_SIM a001_20200101_fc0_284_SIM
2024-03-19 14:17:17,782 Chunks to change: None
2024-03-19 14:17:17,782 Status of jobs to change: None
2024-03-19 14:17:17,782 Sections to change: None
...
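The command log file names described above carry enough information to reconstruct when a command was issued and which output stream a file holds. A minimal sketch, assuming the layout <YYYYMMDD>_<HHMMSS>_<command>[_err].log seen in the example names; parse_log_name is a hypothetical helper, not part of Autosubmit.

```python
import re
from datetime import datetime

# Assumed layout: <YYYYMMDD>_<HHMMSS>_<command>.log or ..._<command>_err.log
LOG_NAME = re.compile(
    r"^(?P<stamp>\d{8}_\d{6})_(?P<command>.+?)(?P<err>_err)?\.log$"
)


def parse_log_name(name):
    """Split a command log file name into timestamp, command and stream."""
    match = LOG_NAME.match(name)
    if match is None:
        return None
    return {
        "issued_at": datetime.strptime(match["stamp"], "%Y%m%d_%H%M%S"),
        "command": match["command"],
        "stream": "stderr" if match["err"] else "stdout",
    }


info = parse_log_name("20240319_141712_setstatus_err.log")
print(info["command"], info["stream"])  # prints: setstatus stderr
```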
The workflow task logs are stored in the directory tmp/LOG_<EXPID>,
tmp/LOG_a001/ in this example. The task logs are written on the remote
platforms used in the experiment configuration (e.g. a cloud server, or HPC).
These files are copied automatically by Autosubmit to the computer where the
autosubmit command was issued.
These log files, like the autosubmit command logs described before, also
come in pairs, “.out” and “.err”. However, in this case the “.err”
file contains the source of the workflow task script (the Bash Shell script
generated by Autosubmit) with its parameters expanded, as produced by the Bash
Shell option -x. The file name also contains a timestamp from when the
job was started.
[INFO] JOBID=6709774
job_name_ptrn='/scratch/a001/LOG_a001/a001_20200101_fc0_337_SIM'
+ job_name_ptrn=/scratch/a001/LOG_a001/a001_20200101_fc0_337_SIM
echo $(date +%s) > ${job_name_ptrn}_STAT
++ date +%s
+ echo 1711509353
...
The .err and .out files both contain the JOBID value, which for
remote platforms like HPC batch systems (e.g. Slurm) represents the Job ID,
as well as any other output from the workflow task.
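The job ID can be pulled out of a retrieved task log with a short script. The sketch below assumes the log contains a line of the form [INFO] JOBID=<number>, as in the excerpt above, and that the ID is purely numeric; find_job_id is a hypothetical helper.

```python
import re

# Assumes a numeric job ID directly after "JOBID=".
JOBID_LINE = re.compile(r"JOBID=(\d+)")


def find_job_id(log_text):
    """Return the first job ID found in the log text as a string, or None."""
    match = JOBID_LINE.search(log_text)
    return match.group(1) if match else None


sample = "[INFO] JOBID=6709774\n+ echo 1711509353\n"
print(find_job_id(sample))  # prints: 6709774
```

The ID is kept as a string because batch systems may use identifiers that are not plain integers; the numeric pattern here would need to be widened for those.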
Users can also access the jobs data stored by Autosubmit in
<AUTOSUBMIT>/metadata/data/job_data_a001.db, to query for information
from previous jobs:
$ sqlite3 ~/job_data_a001.db "select job_id from job_data where job_name = 'a001_20200101_fc0_337_SIM';"
6709774
$ # Use sacct, scontrol, etc. in the remote platform to query the Job information
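The same query can be issued from Python with the standard sqlite3 module. This sketch relies only on the table and column names that appear in the command above (job_data, job_name, job_id); the rest of the database schema is not assumed, and job_ids is a hypothetical helper.

```python
import sqlite3
from contextlib import closing


def job_ids(db_path, job_name):
    """Return all job IDs recorded for a job name, in database row order."""
    query = "SELECT job_id FROM job_data WHERE job_name = ?"
    # closing() guarantees the connection is closed; a bare "with conn"
    # would only manage the transaction.
    with closing(sqlite3.connect(db_path)) as conn:
        return [row[0] for row in conn.execute(query, (job_name,))]
```

Passing the job name as a bound parameter (the ? placeholder) avoids quoting problems that can arise when the name is pasted into the SQL string.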