######### Changelog ######### This page shows the main changes from AS3 to AS4. Mayor mentions: - Python version has changed to 3.7.3 instead of 2.7. - Configuration language has changed to YAML. - All parameters are now unified into a single dictionary. - All sections are now uppercase. - All parameters, except for job related ones, have now an hierarchy. - An special key, FOR:, has been added. This key allows to create multiple jobs with almost the same configuration. - The configuration of autosubmit is now more flexible. - New command added, updateproj. This command will update all the scripts and autosubmit configuration. - Wrapper definition has changed. - Tasks dependencies system has changed. .. warning:: updateproj may not translate all the scripts, we recommend to revise your scripts before run AS. Configuration changes --------------------- Now autosubmit is composed by two kind of YAML configurations, the default ones, which are the same as always, and the custom ones. The custom ones, allows to define custom configurations that will override the default ones, in order to do this, you only have to put the key in the custom configuration file. These custom ones, can be anywhere and have any name, by default they're inside `/conf` but you can change this path in the expdef.yml file. `DEFAULT.CUSTOM_CONFIG_DIR` Additionally, you must be aware of the following changes: - All sections **keys** are normalized to **UPPERCASE**, while values remain as the user put. Beware of the scripts that relies on %CURRENT_HPCARCH% and variables that refer to a platform because they will be always in UPPERCASE. Normalize the script. - To define a job, you must put them under the key `jobs` in any custom configuration file. - To define a platform, you must put them under the key `platforms` in any custom configuration file. - To define a loop, you must put the key "FOR" as the first key of the section. - You can put any %placeholder% in the proj.conf and custom files, and also you can put %ROOTDIR% in the expdef.yml. - All configuration is now based in an hierarchical structure, so to export a var, you must use the following syntax: `%KEY.SUBKEY.SUBSUBKEY%`. The same goes for override them. - YAML has into account the type. Examples -------- List of example with the new configuration and the structure as follows .. code-block:: bash $/autosubmit/a00q/conf$ ls autosubmit_a00q.yml custom_conf expdef_a00q.yml jobs_a00q.yml platforms_a00q.yml $/autosubmit/a00q/conf/custom_conf ls more_jobs.yml Configuration ~~~~~~~~~~~~~ `autosubmit_expid.yml` .. code-block:: yaml config: AUTOSUBMIT_VERSION: 4.0.0b MAXWAITINGJOBS: '3000' TOTALJOBS: '3000' SAFETYSLEEPTIME: 0 RETRIALS: '10' mail: NOTIFICATIONS: 'False' TO: daniel.beltran@bsc.es `expdef_expid.yml` .. code-block:: yaml DEFAULT: EXPID: a02u HPCARCH: local CUSTOM_CONFIG_DIR: %ROOTDIR%/conf/custom_conf experiment: DATELIST: '20210811' MEMBERS: CompilationEfficiency HardwareBenchmarks WeakScaling StrongScaling CHUNKSIZEUNIT: hour CHUNKSIZE: '6' NUMCHUNKS: '2' CALENDAR: standard rerun: RERUN: 'FALSE' CHUNKLIST: '' project: PROJECT_TYPE: local PROJECT_DESTINATION: r_test git: PROJECT_ORIGIN: https://earth.bsc.es/gitlab/ces/automatic_performance_profiling.git PROJECT_BRANCH: autosubmit-makefile1 PROJECT_COMMIT: '' svn: PROJECT_URL: '' PROJECT_REVISION: '' local: PROJECT_PATH: /home/dbeltran/r_test project_files: FILE_PROJECT_CONF: '' FILE_JOBS_CONF: '' `jobs_expid.yml` .. code-block:: yaml JOBS: LOCAL_SETUP: FILE: LOCAL_SETUP.sh PLATFORM: LOCAL RUNNING: "once" REMOTE_SETUP: FILE: REMOTE_SETUP.sh DEPENDENCIES: LOCAL_SETUP WALLCLOCK: '00:05' RUNNING: once NOTIFY_ON: READY SUBMITTED QUEUING COMPLETED INI: FILE: INI.sh DEPENDENCIES: REMOTE_SETUP RUNNING: member WALLCLOCK: '00:05' NOTIFY_ON: READY SUBMITTED QUEUING COMPLETED SIM: FOR: NAME: [20,40,80] PROCESSORS: [2,4,8] THREADS: [1,1,1] DEPENDENCIES: [INI SIM_20-1 CLEAN-2, INI SIM_40-1 CLEAN-2, INI SIM_80-1 CLEAN-2] NOTIFY_ON: READY SUBMITTED QUEUING COMPLETED FILE: SIM.sh DEPENDENCIES: INI SIM_20-1 CLEAN-2 RUNNING: chunk WALLCLOCK: '00:05' TASKS: '1' NOTIFY_ON: READY SUBMITTED QUEUING COMPLETED POST: FOR: NAME: [ 20,40,80 ] PROCESSORS: [ 20,40,80 ] THREADS: [ 1,1,1 ] DEPENDENCIES: [ SIM_20 POST_20-1,SIM_40 POST_40-1,SIM_80 POST_80-1 ] FILE: POST.sh RUNNING: chunk WALLCLOCK: '00:05' CLEAN: FILE: CLEAN.sh DEPENDENCIES: POST_20 POST_40 POST_80 RUNNING: chunk WALLCLOCK: '00:05' TRANSFER: FILE: TRANSFER.sh PLATFORM: LOCAL DEPENDENCIES: CLEAN RUNNING: member `platforms_expid.yml` .. code-block:: yaml Platforms: MaReNoStRuM4: TYPE: slurm HOST: bsc PROJECT: bsc32 USER: bsc32070 QUEUE: debug SCRATCH_DIR: /gpfs/scratch ADD_PROJECT_TO_HOST: False MAX_WALLCLOCK: '48:00' USER_TO: pr1enx13 TEMP_DIR: '' SAME_USER: False PROJECT_TO: pr1enx00 HOST_TO: bscprace marenostrum_archive: TYPE: ps HOST: dt02.bsc.es PROJECT: bsc32 USER: bsc32070 SCRATCH_DIR: /gpfs/scratch ADD_PROJECT_TO_HOST: 'False' TEST_SUITE: 'False' USER_TO: pr1enx13 TEMP_DIR: /gpfs/scratch/bsc32/bsc32070/test_migrate SAME_USER: false PROJECT_TO: pr1enx00 HOST_TO: transferprace transfer_node: TYPE: ps HOST: dt01.bsc.es PROJECT: bsc32 USER: bsc32070 ADD_PROJECT_TO_HOST: false SCRATCH_DIR: /gpfs/scratch USER_TO: pr1enx13 TEMP_DIR: /gpfs/scratch/bsc32/bsc32070/test_migrate SAME_USER: false PROJECT_TO: pr1enx00 HOST_TO: transferprace transfer_node_bscearth000: TYPE: ps HOST: bscearth000 USER: dbeltran PROJECT: Earth ADD_PROJECT_TO_HOST: false QUEUE: serial SCRATCH_DIR: /esarchive/scratch USER_TO: dbeltran TEMP_DIR: '' SAME_USER: true PROJECT_TO: Earth HOST_TO: bscpraceearth000 bscearth000: TYPE: ps HOST: bscearth000 USER: dbeltran PROJECT: Earth ADD_PROJECT_TO_HOST: false QUEUE: serial SCRATCH_DIR: /esarchive/scratch nord3: TYPE: SLURM HOST: nord1.bsc.es PROJECT: bsc32 USER: bsc32070 QUEUE: debug SCRATCH_DIR: /gpfs/scratch MAX_WALLCLOCK: '48:00' USER_TO: pr1enx13 TEMP_DIR: '' SAME_USER: true PROJECT_TO: pr1enx00 ecmwf-xc40: TYPE: ecaccess VERSION: pbs HOST: cca USER: c3d PROJECT: spesiccf ADD_PROJECT_TO_HOST: false SCRATCH_DIR: /scratch/ms QUEUE: np SERIAL_QUEUE: ns MAX_WALLCLOCK: '48:00' `custom_conf/more_jobs.yml` .. code-block:: yaml jobs: Additional_job_1: FILE: extrajob.sh DEPENDENCIES: POST_20 RUNNING: once additional_job_2: FILE: extrajob.sh RUNNING : once Wrappers definition ~~~~~~~~~~~~~~~~~~~ To define a the wrappers: .. code-block:: yaml wrappers: wrapper_sim20: TYPE: "vertical" JOBS_IN_WRAPPER: "SIM_20" wrapper_sim40: TYPE: "vertical" JOBS_IN_WRAPPER: "SIM_40" Loops definition ~~~~~~~~~~~~~~~~~~~ To define a loop, you need to use the FOR key and also the NAME key. In order to generate the following jobs: .. code-block:: yaml POST_20: FILE: POST.sh RUNNING: chunk WALLCLOCK: '00:05' PROCESSORS: 20 THREADS: 1 DEPENDENCIES: SIM_20 POST_20-1 POST_40: FILE: POST.sh RUNNING: chunk WALLCLOCK: '00:05' PROCESSORS: 40 THREADS: 1 DEPENDENCIES: SIM_40 POST_40-1 POST_80: FILE: POST.sh RUNNING: chunk WALLCLOCK: '00:05' PROCESSORS: 80 THREADS: 1 DEPENDENCIES: SIM_80 POST_80-1 One can use now the following configuration: .. code-block:: yaml POST: FOR: NAME: [ 20,40,80 ] PROCESSORS: [ 20,40,80 ] THREADS: [ 1,1,1 ] DEPENDENCIES: [ SIM_20 POST_20-1,SIM_40 POST_40-1,SIM_80 POST_80-1 ] FILE: POST.sh RUNNING: chunk WALLCLOCK: '00:05' .. warning:: Only the parameters that changes must be included inside the `FOR` key. DEPENDENCIES ------------ The DEPENDENCIES key is used to define the dependencies of a job. It can be used in the following ways: - Basic: The dependencies are a list of jobs, separated by " ", that runs before the current task is submitted. - New: The dependencies is a list of YAML sections, separated by "\n", that runs before the current job is submitted. - For each dependency section, you can designate the following keywords to control the current job-affected tasks: - DATES_FROM: Selects the job dates that you want to alter. - MEMBERS_FROM: Selects the job members that you want to alter. - CHUNKS_FROM: Selects the job chunks that you want to alter. - For each dependency section and *_FROM keyword, you can designate the following keywords to control the destination of the dependency: - DATES_TO: Links current selected tasks to the dependency tasks of the dates specified. - MEMBERS_TO: Links current selected tasks to the dependency tasks of the members specified. - CHUNKS_TO: Links current selected tasks to the dependency tasks of the chunks specified. - Important keywords for [DATES|MEMBERS|CHUNKS]_TO: - "natural": Will keep the default linkage. - "all": Will link selected tasks of the dependency with current selected tasks. - "none": Will unlink selected tasks of the dependency with current selected tasks. For the new format, consider that the priority is hierarchy and goes like this DATES_FROM -(includes)-> MEMBERS_FROM -(includes)-> CHUNKS_FROM. - You can define a DATES_FROM inside the DEPENDENCY. - You can define a MEMBERS_FROM inside the DEPENDENCY and DEPENDENCY.DATES_FROM. - You can define a CHUNKS_FROM inside the DEPENDENCY, DEPENDENCY.DATES_FROM, DEPENDENCY.MEMBERS_FROM, DEPENDENCY.DATES_FROM.MEMBERS_FROM For the examples, we will consider that our experiment has the following configuration: .. code-block:: yaml EXPERIMENT: DATELIST: 202201[01-02] MEMBERS: FC1 FC2 NUMCHUNKS: 4 Basic ~~~~~ .. code-block:: yaml JOBS: JOB_1: FILE: job1.sh RUNNING: chunk JOB_2: FILE: job2.sh DEPENDENCIES: JOB_1 RUNNING: chunk JOB_3: FILE: job3.sh DEPENDENCIES: JOB_2 RUNNING: chunk SIM: FILE: sim.sh DEPENDENCIES: JOB_3 SIM-1 RUNNING: chunk POST: FILE: post.sh DEPENDENCIES: SIM RUNNING: chunk TEST: FILE: test.sh DEPENDENCIES: POST RUNNING: chunk New format ~~~~~~~~~~ .. code-block:: yaml JOBS: JOB_1: FILE: job1.sh RUNNING: chunk JOB_2: FILE: job2.sh DEPENDENCIES: JOB_1: dates_to: "natural" members_to: "natural" chunks_to: "natural" RUNNING: chunk JOB_3: FILE: job3.sh DEPENDENCIES: JOB_2: dates_to: "natural" members_to: "natural" chunks_to: "natural" RUNNING: chunk SIM: FILE: sim.sh DEPENDENCIES: JOB_3: dates_to: "natural" members_to: "natural" chunks_to: "natural" SIM-1: dates_to: "natural" members_to: "natural" chunks_to: "natural" RUNNING: chunk POST: FILE: post.sh DEPENDENCIES: SIM: dates_to: "natural" members_to: "natural" chunks_to: "natural" RUNNING: chunk TEST: FILE: test.sh DEPENDENCIES: POST: dates_to: "natural" members_to: "natural" chunks_to: "natural" RUNNING: chunk Example 1: New format with specific dependencies ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. code-block:: yaml JOBS: JOB_1: FILE: job1.sh RUNNING: chunk JOB_2: FILE: job2.sh DEPENDENCIES: JOB_1: dates_to: "natural" members_to: "natural" chunks_to: "natural" RUNNING: chunk JOB_3: FILE: job3.sh DEPENDENCIES: JOB_2: dates_to: "natural" members_to: "natural" chunks_to: "natural" RUNNING: chunk SIM: FILE: sim.sh DEPENDENCIES: JOB_3: SIM-1: SIM: MEMBERS_FROM: FC2: CHUNKS_FROM: 1: dates_to: "all" members_to: "FC1" chunks_to: "4" RUNNING: chunk POST: FILE: post.sh DEPENDENCIES: SIM: RUNNING: chunk TEST: FILE: test.sh DEPENDENCIES: POST: members_to: "FC2" chunks_to: 4 RUNNING: once # too add img