Changelog#

This page shows the main changes from AS3 to AS4.

Major changes:

  • Python 3.7.3 is now used instead of Python 2.7.

  • Configuration language has changed to YAML.

  • All parameters are now unified into a single dictionary.

  • All sections are now uppercase.

  • All parameters, except for job-related ones, now have a hierarchy.

  • A special key, FOR, has been added. This key lets you create multiple jobs with almost the same configuration.

  • The configuration of autosubmit is now more flexible.

  • A new command, upgrade, has been added. This command updates all the scripts and the Autosubmit configuration.

  • Wrapper definition has changed.

  • The task dependency system has changed.

  • Added the parameter DELETE_WHEN_EDGELESS (boolean, default TRUE) to the JOBS section. When it is enabled, a job that has no edges is deleted.
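To make the DELETE_WHEN_EDGELESS item concrete, here is a minimal Python sketch of what pruning edgeless jobs could look like. It is not Autosubmit's implementation: the function name `prune_edgeless` and the (parent, child) edge representation are hypothetical, and it assumes that "no edges" means a job is neither a parent nor a child of any other job.

```python
def prune_edgeless(jobs, edges, delete_when_edgeless=True):
    """Return the jobs that remain after dropping those with no edges.

    jobs:  iterable of job names.
    edges: iterable of (parent, child) pairs.
    Hypothetical helper, for illustration only.
    """
    if not delete_when_edgeless:
        return list(jobs)
    # A job is "connected" if it appears at either end of any edge.
    connected = {name for edge in edges for name in edge}
    return [job for job in jobs if job in connected]


jobs = ["LOCAL_SETUP", "SIM", "ORPHAN"]
edges = [("LOCAL_SETUP", "SIM")]
remaining = prune_edgeless(jobs, edges)
print(remaining)
```

With the flag disabled, the sketch returns every job unchanged, which mirrors setting DELETE_WHEN_EDGELESS to FALSE.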

Warning

The configuration language has changed. Please, check the new configuration file format.

Warning

The wrapper definition has changed. Please, check the new wrapper definition.

Warning

The task dependency system has changed. Please, check the new task dependency system.

Warning

Edgeless jobs are now deleted by default. Please, check the new parameter DELETE_WHEN_EDGELESS.

Warning

The upgrade command may not translate all the scripts. We recommend reviewing your scripts before running Autosubmit.

Configuration changes#

Autosubmit now uses two kinds of YAML configuration: the default ones, which are the same as always, and the custom ones.

Custom configurations override the default ones. To override a value, you only have to put its key in a custom configuration file. Custom files can be anywhere and have any name. By default they are inside <expid>/conf, but you can change this path with the DEFAULT.CUSTOM_CONFIG key in the expdef.yml file.
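The override behaviour can be pictured as a recursive dictionary merge. The sketch below is only an illustration under an assumption the text does not spell out, namely that nested sections are merged key by key rather than replaced wholesale. The function name `merge_config` is hypothetical, and the key names are borrowed from the examples on this page.

```python
import copy


def merge_config(default, custom):
    """Return a new dict in which values from `custom` override `default`.

    Nested dictionaries are merged key by key (an assumption of this sketch).
    """
    merged = copy.deepcopy(default)
    for key, value in custom.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


default = {"CONFIG": {"MAXWAITINGJOBS": "3000", "RETRIALS": "10"}}
custom = {"CONFIG": {"RETRIALS": "2"}}
merged = merge_config(default, custom)
print(merged)
```

Only RETRIALS is overridden here. MAXWAITINGJOBS keeps its default value because the custom file does not mention it.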

Additionally, you must be aware of the following changes:

  • All section keys are normalized to UPPERCASE, while values remain as the user wrote them. Beware of scripts that rely on %CURRENT_HPCARCH% and other variables that refer to a platform, because these will always be in UPPERCASE. Normalize your scripts accordingly.

  • To define a job, put it under the key jobs in any custom configuration file.

  • To define a platform, put it under the key platforms in any custom configuration file.

  • To define a loop, you must put the key “FOR” as the first key of the section.

  • You can put any %placeholder% in the proj.yml and custom files, and also you can put %ROOTDIR% in the expdef.yml.

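  • All configuration is now based on a hierarchical structure, so to export a variable you must use the following syntax: %KEY.SUBKEY.SUBSUBKEY%. The same syntax applies when overriding variables.

  • YAML takes value types into account: a quoted value such as '3000' is a string, while an unquoted 0 is an integer.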
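Two of the points above, key normalization and the %KEY.SUBKEY.SUBSUBKEY% syntax, can be sketched in a few lines of Python. This is an illustrative model, not Autosubmit's code: `normalize_keys`, `lookup`, and `substitute` are hypothetical names, and the placeholder pattern is an assumption that only accepts letters, digits, underscores, and dots.

```python
import re


def normalize_keys(config):
    """Upper-case every key of a nested dict, leaving values untouched."""
    if isinstance(config, dict):
        return {str(k).upper(): normalize_keys(v) for k, v in config.items()}
    return config


def lookup(config, dotted_key):
    """Follow KEY.SUBKEY.SUBSUBKEY through the nested dict."""
    node = config
    for part in dotted_key.upper().split("."):
        node = node[part]  # raises KeyError if the key does not exist
    return node


# Assumed placeholder shape: %NAME% with dots separating hierarchy levels.
PLACEHOLDER = re.compile(r"%([A-Za-z0-9_.]+)%")


def substitute(text, config):
    """Replace each %KEY.SUBKEY% in `text` with the value from `config`."""
    return PLACEHOLDER.sub(lambda m: str(lookup(config, m.group(1))), text)


config = normalize_keys({"platforms": {"MaReNoStRuM4": {"queue": "debug"}}})
line = substitute("queue=%PLATFORMS.MARENOSTRUM4.QUEUE%", config)
print(line)
```

Note that the platform name is itself a key, so it ends up in upper case after normalization, while the value `debug` is left as written.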

Examples#

The following examples use the new configuration format. The directory structure is as follows:

$/autosubmit/a00q/conf$ ls
autosubmit_a00q.yml  custom_conf  expdef_a00q.yml  jobs_a00q.yml  platforms_a00q.yml
$/autosubmit/a00q/conf/custom_conf$ ls
more_jobs.yml

Configuration#

autosubmit_expid.yml

config:
  AUTOSUBMIT_VERSION: 4.0.0b
  MAXWAITINGJOBS: '3000'
  TOTALJOBS: '3000'
  SAFETYSLEEPTIME: 0
  RETRIALS: '10'
mail:
  NOTIFICATIONS: 'False'
  TO: daniel.beltran@bsc.es

expdef_expid.yml

DEFAULT:
  EXPID: a02u
  HPCARCH: local
  CUSTOM_CONFIG: "%ROOTDIR%/conf/custom_conf"
experiment:
  DATELIST: '20210811'
  MEMBERS: CompilationEfficiency HardwareBenchmarks WeakScaling StrongScaling
  CHUNKSIZEUNIT: hour
  CHUNKSIZE: '6'
  NUMCHUNKS: '2'
  CALENDAR: standard
rerun:
  RERUN: 'FALSE'
  CHUNKLIST: ''
project:
  PROJECT_TYPE: local
  PROJECT_DESTINATION: r_test
git:
  PROJECT_ORIGIN: https://earth.bsc.es/gitlab/ces/automatic_performance_profiling.git
  PROJECT_BRANCH: autosubmit-makefile1
  PROJECT_COMMIT: ''
svn:
  PROJECT_URL: ''
  PROJECT_REVISION: ''
local:
  PROJECT_PATH: /home/dbeltran/r_test
project_files:
  FILE_PROJECT_CONF: ''
  FILE_JOBS_CONF: ''

jobs_expid.yml

JOBS:
  LOCAL_SETUP:
    FILE: LOCAL_SETUP.sh
    PLATFORM: LOCAL
    RUNNING: "once"
  REMOTE_SETUP:
    FILE: REMOTE_SETUP.sh
    DEPENDENCIES: LOCAL_SETUP
    WALLCLOCK: '00:05'
    RUNNING: once
    NOTIFY_ON: READY SUBMITTED QUEUING COMPLETED
  INI:
    FILE: INI.sh
    DEPENDENCIES: REMOTE_SETUP
    RUNNING: member
    WALLCLOCK: '00:05'
    NOTIFY_ON: READY SUBMITTED QUEUING COMPLETED

  SIM:
    FOR:
      NAME: [20,40,80]
      PROCESSORS: [2,4,8]
      THREADS: [1,1,1]
      DEPENDENCIES: [INI SIM_20-1 CLEAN-2, INI SIM_40-1 CLEAN-2, INI SIM_80-1 CLEAN-2]
      NOTIFY_ON: READY SUBMITTED QUEUING COMPLETED

    FILE: SIM.sh
    DEPENDENCIES: INI SIM_20-1 CLEAN-2
    RUNNING: chunk
    WALLCLOCK: '00:05'
    TASKS: '1'
    NOTIFY_ON: READY SUBMITTED QUEUING COMPLETED

  POST:
    FOR:
      NAME: [ 20,40,80 ]
      PROCESSORS: [ 20,40,80 ]
      THREADS: [ 1,1,1 ]
      DEPENDENCIES: [ SIM_20 POST_20-1,SIM_40 POST_40-1,SIM_80 POST_80-1 ]
    FILE: POST.sh
    RUNNING: chunk
    WALLCLOCK: '00:05'
  CLEAN:
    FILE: CLEAN.sh
    DEPENDENCIES: POST_20 POST_40 POST_80
    RUNNING: chunk
    WALLCLOCK: '00:05'
  TRANSFER:
    FILE: TRANSFER.sh
    PLATFORM: LOCAL
    DEPENDENCIES: CLEAN
    RUNNING: member

platforms_expid.yml

Platforms:
  MaReNoStRuM4:
    TYPE: slurm
    HOST: bsc
    PROJECT: bsc32
    USER: bsc32070
    QUEUE: debug
    SCRATCH_DIR: /gpfs/scratch
    ADD_PROJECT_TO_HOST: False
    MAX_WALLCLOCK: '48:00'
    USER_TO: pr1enx13
    TEMP_DIR: ''
    SAME_USER: False
    PROJECT_TO: pr1enx00
    HOST_TO: bscprace
  marenostrum_archive:
    TYPE: ps
    HOST: dt02.bsc.es
    PROJECT: bsc32
    USER: bsc32070
    SCRATCH_DIR: /gpfs/scratch
    ADD_PROJECT_TO_HOST: 'False'
    TEST_SUITE: 'False'
    USER_TO: pr1enx13
    TEMP_DIR: /gpfs/scratch/bsc32/bsc32070/test_migrate
    SAME_USER: false
    PROJECT_TO: pr1enx00
    HOST_TO: transferprace
  transfer_node:
    TYPE: ps
    HOST: dt01.bsc.es
    PROJECT: bsc32
    USER: bsc32070
    ADD_PROJECT_TO_HOST: false
    SCRATCH_DIR: /gpfs/scratch
    USER_TO: pr1enx13
    TEMP_DIR: /gpfs/scratch/bsc32/bsc32070/test_migrate
    SAME_USER: false
    PROJECT_TO: pr1enx00
    HOST_TO: transferprace
  transfer_node_bscearth000:
    TYPE: ps
    HOST: bscearth000
    USER: dbeltran
    PROJECT: Earth
    ADD_PROJECT_TO_HOST: false
    QUEUE: serial
    SCRATCH_DIR: /esarchive/scratch
    USER_TO: dbeltran
    TEMP_DIR: ''
    SAME_USER: true
    PROJECT_TO: Earth
    HOST_TO: bscpraceearth000
  bscearth000:
    TYPE: ps
    HOST: bscearth000
    USER: dbeltran
    PROJECT: Earth
    ADD_PROJECT_TO_HOST: false
    QUEUE: serial
    SCRATCH_DIR: /esarchive/scratch
  nord3:
    TYPE: SLURM
    HOST: nord1.bsc.es
    PROJECT: bsc32
    USER: bsc32070
    QUEUE: debug
    SCRATCH_DIR: /gpfs/scratch
    MAX_WALLCLOCK: '48:00'
    USER_TO: pr1enx13
    TEMP_DIR: ''
    SAME_USER: true
    PROJECT_TO: pr1enx00
  ecmwf-xc40:
    TYPE: ecaccess
    VERSION: pbs
    HOST: cca
    USER: c3d
    PROJECT: spesiccf
    ADD_PROJECT_TO_HOST: false
    SCRATCH_DIR: /scratch/ms
    QUEUE: np
    SERIAL_QUEUE: ns
    MAX_WALLCLOCK: '48:00'

custom_conf/more_jobs.yml

jobs:
  Additional_job_1:
    FILE: extrajob.sh
    DEPENDENCIES: POST_20
    RUNNING: once
  additional_job_2:
    FILE: extrajob.sh
    RUNNING: once

Wrappers definition#

To define the wrappers:

wrappers:
  wrapper_sim20:
    TYPE: "vertical"
    JOBS_IN_WRAPPER: "SIM_20"
  wrapper_sim40:
    TYPE: "vertical"
    JOBS_IN_WRAPPER: "SIM_40"

Loops definition#

To define a loop, you need to use the FOR key and also the NAME key.

In order to generate the following jobs:

experiment:
  DATELIST: 19600101
  MEMBERS: "00"
  CHUNKSIZEUNIT: day
  CHUNKSIZE: '1'
  NUMCHUNKS: '2'
  CALENDAR: standard
JOBS:
  POST_20:

    DEPENDENCIES:
      POST_20:
      SIM_20:
    FILE: POST.sh
    PROCESSORS: '20'
    RUNNING: chunk
    THREADS: '1'
    WALLCLOCK: 00:05
  POST_40:

    DEPENDENCIES:
      POST_40:
      SIM_40:
    FILE: POST.sh
    PROCESSORS: '40'
    RUNNING: chunk
    THREADS: '1'
    WALLCLOCK: 00:05
  POST_80:

    DEPENDENCIES:
      POST_80:
      SIM_80:
    FILE: POST.sh
    PROCESSORS: '80'
    RUNNING: chunk
    THREADS: '1'
    WALLCLOCK: 00:05
  SIM_20:

    DEPENDENCIES:
      SIM_20-1:
    FILE: POST.sh
    PROCESSORS: '20'
    RUNNING: chunk
    THREADS: '1'
    WALLCLOCK: 00:05
  SIM_40:

    DEPENDENCIES:
      SIM_40-1:
    FILE: POST.sh
    PROCESSORS: '40'
    RUNNING: chunk
    THREADS: '1'
    WALLCLOCK: 00:05
  SIM_80:

    DEPENDENCIES:
      SIM_80-1:
    FILE: POST.sh
    PROCESSORS: '80'
    RUNNING: chunk
    THREADS: '1'
    WALLCLOCK: 00:05

You can now use the following configuration:

experiment:
  DATELIST: 19600101
  MEMBERS: "00"
  CHUNKSIZEUNIT: day
  CHUNKSIZE: '1'
  NUMCHUNKS: '2'
  CALENDAR: standard
JOBS:
  SIM:
    FOR:
      NAME: [ 20,40,80 ]
      PROCESSORS: [ 20,40,80 ]
      THREADS: [ 1,1,1 ]
      DEPENDENCIES: [ SIM_20-1,SIM_40-1,SIM_80-1 ]
    FILE: POST.sh
    RUNNING: chunk
    WALLCLOCK: '00:05'
  POST:
    FOR:
      NAME: [ 20,40,80 ]
      PROCESSORS: [ 20,40,80 ]
      THREADS: [ 1,1,1 ]
      DEPENDENCIES: [ SIM_20 POST_20,SIM_40 POST_40,SIM_80 POST_80 ]
    FILE: POST.sh
    RUNNING: chunk
    WALLCLOCK: '00:05'

Warning

Only the parameters that change must be included inside the FOR key.
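The expansion described in this section can be modelled with a short Python sketch. It is an approximation for illustration, not the parser Autosubmit uses. The name `expand_for` is hypothetical, and the sketch assumes that every list inside FOR has the same length as NAME and that generated jobs are named SECTION_NAME, as in the listings above. Values are copied as they are, without any type conversion or dependency parsing.

```python
def expand_for(section, body):
    """Expand a job section containing a FOR key into one job per NAME entry."""
    loop = body.get("FOR")
    if loop is None:
        return {section: dict(body)}
    # Keys outside FOR are shared by every generated job.
    shared = {k: v for k, v in body.items() if k != "FOR"}
    jobs = {}
    for index, name in enumerate(loop["NAME"]):
        job = dict(shared)
        for key, values in loop.items():
            if key == "NAME":
                continue
            # Lists provide one value per job; scalars are applied to all jobs.
            job[key] = values[index] if isinstance(values, list) else values
        jobs[f"{section}_{name}"] = job
    return jobs


sim = {
    "FOR": {
        "NAME": [20, 40, 80],
        "PROCESSORS": [20, 40, 80],
        "THREADS": [1, 1, 1],
        "DEPENDENCIES": ["SIM_20-1", "SIM_40-1", "SIM_80-1"],
    },
    "FILE": "POST.sh",
    "RUNNING": "chunk",
    "WALLCLOCK": "00:05",
}
expanded = expand_for("SIM", sim)
for job_name, job in expanded.items():
    print(job_name, job)
```

Each generated job receives the shared keys plus the entry at its own position in every FOR list, which is why only the parameters that change need to appear inside FOR.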

Dependencies rework#

The DEPENDENCIES key is used to define the dependencies of a job. It can be used in the following ways:

  • Basic: The dependencies are a list of jobs, separated by spaces, that run before the current task is submitted.

  • New: The dependencies are a list of YAML sections, separated by newlines (“\n”), that run before the current job is submitted.

    • For each dependency section, you can use the following keywords to select which tasks of the current job are affected:

      • DATES_FROM: Selects the job dates that you want to alter.

      • MEMBERS_FROM: Selects the job members that you want to alter.

      • CHUNKS_FROM: Selects the job chunks that you want to alter.

    • For each dependency section and *_FROM keyword, you can designate the following keywords to control the destination of the dependency:

      • DATES_TO: Links current selected tasks to the dependency tasks of the dates specified.

      • MEMBERS_TO: Links current selected tasks to the dependency tasks of the members specified.

      • CHUNKS_TO: Links current selected tasks to the dependency tasks of the chunks specified.

    • Important keywords for [DATES|MEMBERS|CHUNKS]_TO:

      • “natural”: Keeps the default linkage, that is, links the tasks that would normally be linked. Example, SIM_FC00_CHUNK_1 -> DA_FC00_CHUNK_1.

      • “all”: Will link all selected tasks of the dependency with current selected tasks. Example, SIM_FC00_CHUNK_1 -> DA_FC00_CHUNK_1, DA_FC00_CHUNK_2, DA_FC00_CHUNK_3…

      • “none”: Will unlink selected tasks of the dependency with current selected tasks.
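As a rough model of these keywords, the following Python sketch resolves a CHUNKS_TO value into the list of dependency chunks that the current task is linked to. It is a simplification under stated assumptions: `resolve_chunks_to` is a hypothetical name, “natural” is approximated as "the same chunk number", and explicit values are assumed to be chunk numbers separated by spaces or commas.

```python
def resolve_chunks_to(keyword, current_chunk, parent_chunks):
    """Return the dependency chunks that the current task is linked to."""
    keyword = str(keyword).strip().lower()
    if keyword == "natural":
        # Default linkage, approximated here as the same chunk number.
        return [c for c in parent_chunks if c == current_chunk]
    if keyword == "all":
        return list(parent_chunks)
    if keyword == "none":
        return []
    # Otherwise treat the value as explicit chunk numbers, e.g. "4" or "1,2".
    wanted = {int(item) for item in keyword.replace(",", " ").split()}
    return [c for c in parent_chunks if c in wanted]


chunks = [1, 2, 3, 4]
for value in ("natural", "all", "none", "4"):
    print(value, resolve_chunks_to(value, 2, chunks))
```

DATES_TO and MEMBERS_TO could be modelled in the same way, with dates or member names in place of chunk numbers.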

For the new format, note that the priority is hierarchical and goes like this: DATES_FROM -(includes)-> MEMBERS_FROM -(includes)-> CHUNKS_FROM.

  • You can define a DATES_FROM inside the DEPENDENCY.

  • You can define a MEMBERS_FROM inside the DEPENDENCY and DEPENDENCY.DATES_FROM.

  • You can define a CHUNKS_FROM inside the DEPENDENCY, DEPENDENCY.DATES_FROM, DEPENDENCY.MEMBERS_FROM, and DEPENDENCY.DATES_FROM.MEMBERS_FROM.
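The nesting rules above can be illustrated with a small Python sketch that walks a dependency section from DATES_FROM through MEMBERS_FROM to CHUNKS_FROM and collects the *_TO settings that apply to one task. It is a simplified model, not Autosubmit's resolver: `select_relationship` is a hypothetical name, keys are matched exactly (no lists or ranges), and keywords are compared case-insensitively because the examples on this page use both cases.

```python
def _upper_keys(mapping):
    """Return a copy of `mapping` with upper-cased string keys."""
    return {str(k).upper(): v for k, v in (mapping or {}).items()}


def select_relationship(dep, date, member, chunk):
    """Collect the *_TO settings that apply to the task (date, member, chunk)."""
    levels = (("DATES_FROM", date), ("MEMBERS_FROM", member), ("CHUNKS_FROM", chunk))
    node = _upper_keys(dep)
    result = {}
    while True:
        # Settings found deeper in the hierarchy override outer ones.
        result.update({k: v for k, v in node.items() if k.endswith("_TO")})
        for from_key, value in levels:
            selected = {str(k): v for k, v in (node.get(from_key) or {}).items()}
            if str(value) in selected:
                node = _upper_keys(selected[str(value)])
                break
        else:
            # No deeper *_FROM entry matches this task.
            return result


sim_dependency = {
    "MEMBERS_FROM": {
        "FC2": {
            "CHUNKS_FROM": {
                1: {"dates_to": "all", "members_to": "FC1", "chunks_to": "4"},
            },
        },
    },
}
print(select_relationship(sim_dependency, "20220101", "FC2", 1))
print(select_relationship(sim_dependency, "20220101", "FC1", 1))
```

In this model, a task that matches no *_FROM entry gets an empty result, which stands in for the default linkage.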

For the examples, we will consider that our experiment has the following configuration:

EXPERIMENT:
    DATELIST: 20220101
    MEMBERS: FC1 FC2
    NUMCHUNKS: 4

Basic#

JOBS:
  JOB_1:
      FILE: job1.sh
      RUNNING: chunk
  JOB_2:
      FILE: job2.sh
      DEPENDENCIES: JOB_1
      RUNNING: chunk
  JOB_3:
      FILE: job3.sh
      DEPENDENCIES: JOB_2
      RUNNING: chunk
  SIM:
      FILE: sim.sh
      DEPENDENCIES: JOB_3 SIM-1
      RUNNING: chunk
  POST:
      FILE: post.sh
      DEPENDENCIES: SIM
      RUNNING: chunk
  TEST:
      FILE: test.sh
      DEPENDENCIES: POST
      RUNNING: chunk

New format#

JOBS:
  JOB_1:
      FILE: job1.sh
      RUNNING: chunk
  JOB_2:
      FILE: job2.sh
      DEPENDENCIES:
          JOB_1:
              dates_to: "natural"
              members_to: "natural"
              chunks_to: "natural"
      RUNNING: chunk
  JOB_3:
      FILE: job3.sh
      DEPENDENCIES:
          JOB_2:
              dates_to: "natural"
              members_to: "natural"
              chunks_to: "natural"
      RUNNING: chunk
  SIM:
      FILE: sim.sh
      DEPENDENCIES:
          JOB_3:
              dates_to: "natural"
              members_to: "natural"
              chunks_to: "natural"
          SIM-1:
              dates_to: "natural"
              members_to: "natural"
              chunks_to: "natural"
      RUNNING: chunk
  POST:
      FILE: post.sh
      DEPENDENCIES:
          SIM:
              dates_to: "natural"
              members_to: "natural"
              chunks_to: "natural"
      RUNNING: chunk
  TEST:
      FILE: test.sh
      DEPENDENCIES:
          POST:
              dates_to: "natural"
              members_to: "natural"
              chunks_to: "natural"
      RUNNING: chunk

Example 1: New format with specific dependencies#

In the following example, we want to launch the SIM of the next member after the last SIM chunk of the previous member has finished.

JOBS:
    JOB_1:
        FILE: job1.sh
        RUNNING: chunk
    JOB_2:
        FILE: job2.sh
        DEPENDENCIES:
            JOB_1:
        RUNNING: chunk
    JOB_3:
        FILE: job3.sh
        DEPENDENCIES:
            JOB_2:
        RUNNING: chunk
    SIM:
        FILE: sim.sh
        DEPENDENCIES:
            JOB_3:
            SIM-1:
            SIM:
                MEMBERS_FROM:
                  FC2:
                    CHUNKS_FROM:
                     1:
                      dates_to: "all"
                      members_to: "FC1"
                      chunks_to: "4"
        RUNNING: chunk
    POST:
        FILE: post.sh
        DEPENDENCIES:
            SIM:
        RUNNING: chunk
    TEST:
        FILE: test.sh
        DEPENDENCIES:
            POST:
              members_to: "FC2"
              chunks_to: 4
        RUNNING: once

Example 2: Crossdate wrappers using the new dependencies#

experiment:
  DATELIST: 20120101 20120201
  MEMBERS: "000 001"
  CHUNKSIZEUNIT: day
  CHUNKSIZE: '1'
  NUMCHUNKS: '3'
wrappers:
    wrapper_simda:
        TYPE: "horizontal-vertical"
        JOBS_IN_WRAPPER: "SIM DA"

JOBS:
  LOCAL_SETUP:
    FILE: templates/local_setup.sh
    PLATFORM: marenostrum_archive
    RUNNING: once
    NOTIFY_ON: COMPLETED
  LOCAL_SEND_SOURCE:
    FILE: templates/01_local_send_source.sh
    PLATFORM: marenostrum_archive
    DEPENDENCIES: LOCAL_SETUP
    RUNNING: once
    NOTIFY_ON: FAILED
  LOCAL_SEND_STATIC:
    FILE: templates/01b_local_send_static.sh
    PLATFORM: marenostrum_archive
    DEPENDENCIES: LOCAL_SETUP
    RUNNING: once
    NOTIFY_ON: FAILED
  REMOTE_COMPILE:
    FILE: templates/02_compile.sh
    DEPENDENCIES: LOCAL_SEND_SOURCE
    RUNNING: once
    PROCESSORS: '4'
    WALLCLOCK: 00:50
    NOTIFY_ON: COMPLETED
  SIM:
    FILE: templates/05b_sim.sh
    DEPENDENCIES:
      LOCAL_SEND_STATIC:
      REMOTE_COMPILE:
      SIM-1:
      DA-1:
    RUNNING: chunk
    PROCESSORS: '68'
    WALLCLOCK: 00:12
    NOTIFY_ON: FAILED
  LOCAL_SEND_INITIAL_DA:
    FILE: templates/00b_local_send_initial_DA.sh
    PLATFORM: marenostrum_archive
    DEPENDENCIES: LOCAL_SETUP LOCAL_SEND_INITIAL_DA-1
    RUNNING: chunk
    SYNCHRONIZE: member
    DELAY: '0'
  COMPILE_DA:
    FILE: templates/02b_compile_da.sh
    DEPENDENCIES: LOCAL_SEND_SOURCE
    RUNNING: once
    WALLCLOCK: 00:20
    NOTIFY_ON: FAILED
  DA:
    FILE: templates/05c_da.sh
    DEPENDENCIES:
      SIM:
      LOCAL_SEND_INITIAL_DA:
        CHUNKS_TO: "all"
        DATES_TO: "all"
        MEMBERS_TO: "all"
      COMPILE_DA:
      DA:
        DATES_FROM:
          "20120201":
            CHUNKS_FROM:
              1:
                DATES_TO: "20120101"
                CHUNKS_TO: "1"
    RUNNING: chunk
    SYNCHRONIZE: member
    DELAY: '0'
    WALLCLOCK: 00:12
    PROCESSORS: '256'
    NOTIFY_ON: FAILED