.. _develproject:

====================
Developing a project
====================

This section contains some examples on how to develop a new project.

All files, with the exception of the user-defined scripts, are located in the experiment ``conf`` directory. Configuration files are written in ``ini`` format. The user-defined scripts, on the other hand, are written in ``bash``, ``python`` or ``R``.

To configure the experiment, edit ``autosubmit_cxxx.conf``, ``expdef_cxxx.conf``, ``jobs_cxxx.conf``, ``platforms_cxxx.conf`` and ``proj_cxxx.conf`` in the ``conf`` folder of the experiment.

Expdef configuration
====================

vi /cxxx/conf/expdef_cxxx.conf

.. code-block:: ini

    [DEFAULT]
    # Experiment identifier
    # No need to change
    EXPID = cxxx
    # HPC name.
    # No need to change
    HPCARCH = ithaca

    [experiment]
    # Supply the list of start dates. Available formats: YYYYMMDD YYYYMMDDhh YYYYMMDDhhmm
    # You can also use an abbreviated syntax for multiple dates with common parts:
    # 200001[01 15] <=> 20000101 20000115
    # DATELIST = 19600101 19650101 19700101
    # DATELIST = 1960[0101 0201 0301]
    DATELIST = 19900101
    # Supply the list of members. LIST = fc0 fc1 fc2 fc3 fc4
    MEMBERS = fc0
    # Chunk size unit. STRING = hour, day, month, year
    CHUNKSIZEUNIT = month
    # Chunk size. NUMERIC = 4, 6, 12
    CHUNKSIZE = 1
    # Total number of chunks in experiment. NUMERIC = 30, 15, 10
    NUMCHUNKS = 2
    # Calendar used. LIST: standard, noleap
    CALENDAR = standard
    # List of members that can be included in this run. Optional.
    # RUN_ONLY_MEMBERS = fc0 fc1 fc2 fc3 fc4
    # RUN_ONLY_MEMBERS = fc[0-4]
    RUN_ONLY_MEMBERS =

    [rerun]
    # Is a rerun or not? [Default: Do set FALSE]. BOOLEAN = TRUE, FALSE
    RERUN = FALSE
    # If RERUN = TRUE then supply the list of jobs to rerun
    RERUN_JOBLIST =

    [project]
    # Select project type. STRING = git, svn, local, none
    # If PROJECT_TYPE is set to none, Autosubmit self-contained dummy templates will be used
    PROJECT_TYPE = git
    # Destination folder name for project. type = STRING, default = leave empty
    PROJECT_DESTINATION = model

    # If PROJECT_TYPE is not git, no need to change
    [git]
    # Repository URL. STRING = 'https://github.com/torvalds/linux.git'
    PROJECT_ORIGIN = https://gitlab.cfu.local/cfu/auto-ecearth3.git
    # Select branch or tag, STRING, default = 'master',
    # help = {'master' (default), 'develop', 'v3.1b', ...}
    PROJECT_BRANCH = develop
    # type = STRING, default = leave empty, help = if model branch is a TAG leave empty
    PROJECT_COMMIT =

    # If PROJECT_TYPE is not svn, no need to change
    [svn]
    # type = STRING, help = 'https://svn.ec-earth.org/ecearth3'
    PROJECT_URL =
    # Select revision number. NUMERIC = 1778
    PROJECT_REVISION =

    # If PROJECT_TYPE is not local, no need to change
    [local]
    # type = STRING, help = /foo/bar/ecearth
    PROJECT_PATH =

    # If PROJECT_TYPE is none, no need to change
    [project_files]
    # Where is PROJECT CONFIGURATION file location relative to project root path
    FILE_PROJECT_CONF = templates/ecearth3/ecearth3.conf
    # Where is JOBS CONFIGURATION file location relative to project root path
    FILE_JOBS_CONF = templates/common/jobs.conf
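
For reference, the abbreviated start-date syntax shown in the comments above expands as follows (a small sketch; the dates are only illustrative):

.. code-block:: ini

    # Abbreviated form: common part 1990, three month/day suffixes
    DATELIST = 1990[0101 0201 0301]
    # Equivalent fully expanded form
    # DATELIST = 19900101 19900201 19900301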

Autosubmit configuration
========================

vi /cxxx/conf/autosubmit_cxxx.conf

.. code-block:: ini

    [config]
    # Experiment identifier
    # No need to change
    EXPID =
    # No need to change.
    # Autosubmit version identifier
    AUTOSUBMIT_VERSION =
    # Default maximum number of jobs to be waiting in any platform
    # Default = 3
    MAXWAITINGJOBS = 3
    # Default maximum number of jobs to be running at the same time at any platform
    # Can be set at platform level on the platform_cxxx.conf file
    # Default = 6
    TOTALJOBS = 6
    # Time (seconds) between connections to the HPC queue scheduler to poll already submitted jobs status
    # Default = 10
    SAFETYSLEEPTIME = 10
    # Number of retrials if a job fails. Can be overridden at job level
    # Default = 0
    RETRIALS = 0
    # Allows to put a delay between retrials if a job fails. If not specified, the delay is static
    # DELAY_RETRY_TIME = 11
    # DELAY_RETRY_TIME = +11  # will wait 11,22,33,44...
    # DELAY_RETRY_TIME = *11  # will wait 11,110,1110,11110...
    # Default output type for CREATE, MONITOR, SET STATUS, RECOVERY. Available options: pdf, svg, png, ps, txt
    # Default = pdf
    OUTPUT = pdf
    # [wrappers]
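
For example, a minimal retry policy in ``autosubmit_cxxx.conf`` combining both options could look like this (the values are only illustrative):

.. code-block:: ini

    [config]
    # Retry each failed job up to 3 times
    RETRIALS = 3
    # Increasing delay between retrials: wait 11, 22, 33 seconds
    DELAY_RETRY_TIME = +11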

Jobs configuration
==================

vi /cxxx/conf/jobs_cxxx.conf

.. code-block:: ini

    # Example job with all options specified

    ## Job name
    # [JOBNAME]
    ## Script to execute. If not specified, job will be omitted from workflow.
    ## Path relative to the project directory
    # FILE =
    ## Platform to execute the job. If not specified, defaults to HPCARCH in the expdef file.
    ## LOCAL is always defined and refers to the current machine
    # PLATFORM =
    ## Queue to add the job to. If not specified, uses the PLATFORM default.
    # QUEUE =
    ## Defines dependencies from job as a list of parent jobs separated by spaces.
    ## For dependencies on jobs in a previous chunk, member or startdate, use -(DISTANCE)
    # DEPENDENCIES = INI SIM-1 CLEAN-2
    ## Define if the job runs once, once per startdate, once per member or once per chunk. Options: once, date, member, chunk.
    ## If not specified, defaults to once
    # RUNNING = once
    ## Specifies that the job has only to be run after X dates, members or chunks. A job will always be created for the last one.
    ## If not specified, defaults to 1
    # FREQUENCY = 3
    ## On a job with FREQUENCY > 1, if True, the dependencies are evaluated against all
    ## jobs in the frequency interval, otherwise only evaluate dependencies against the current
    ## iteration.
    ## If not specified, defaults to True
    # WAIT = False
    ## Defines if the job is only to be executed in reruns. If not specified, defaults to false.
    # RERUN_ONLY = False
    ## Wallclock to be submitted to the HPC queue in format HH:MM
    # WALLCLOCK = 00:05
    ## Wallclock chunk increase (WALLCLOCK will be increased according to the formula WALLCLOCK + WCHUNKINC * (chunk - 1)).
    ## Ideal for sequences of jobs that change their expected running time according to the current chunk.
    # WCHUNKINC = 00:01
    ## Processors number to be submitted to the HPC. If not specified, defaults to 1.
    # PROCESSORS = 1
    ## Threads number to be submitted to the HPC. If not specified, defaults to 1.
    # THREADS = 1
    ## Enables hyper-threading. If not specified, defaults to false.
    # HYPERTHREADING = false
    ## Tasks number to be submitted to the HPC. If not specified, defaults to 1.
    # TASKS = 1
    ## Memory requirements for the job in MB
    # MEMORY = 4096
    ## Number of retrials if a job fails. If not specified, defaults to the value given on the experiment's autosubmit.conf
    # RETRIALS = 4
    ## Allows to put a delay between retrials if a job fails. If not specified, the delay is static
    # DELAY_RETRY_TIME = 11
    # DELAY_RETRY_TIME = +11  # will wait 11,22,33,44...
    # DELAY_RETRY_TIME = *11  # will wait 11,110,1110,11110...
    ## Some jobs can not be checked before running previous jobs. Set this option to false if that is the case
    # CHECK = False
    ## Select the interpreter that will run the job. Options: bash, python, r. Default: bash
    # TYPE = bash
    ## Specify the path to the interpreter. If empty, use the system default based on the job type. Default: empty
    # EXECUTABLE = /my_python_env/python3

    [LOCAL_SETUP]
    FILE = LOCAL_SETUP.sh
    PLATFORM = LOCAL

    [REMOTE_SETUP]
    FILE = REMOTE_SETUP.sh
    DEPENDENCIES = LOCAL_SETUP
    WALLCLOCK = 00:05

    [INI]
    FILE = INI.sh
    DEPENDENCIES = REMOTE_SETUP
    RUNNING = member
    WALLCLOCK = 00:05

    [SIM]
    FILE = SIM.sh
    DEPENDENCIES = INI SIM-1 CLEAN-2
    RUNNING = chunk
    WALLCLOCK = 00:05
    PROCESSORS = 2
    THREADS = 1

    [POST]
    FILE = POST.sh
    DEPENDENCIES = SIM
    RUNNING = chunk
    WALLCLOCK = 00:05

    [CLEAN]
    FILE = CLEAN.sh
    DEPENDENCIES = POST
    RUNNING = chunk
    WALLCLOCK = 00:05

    [TRANSFER]
    FILE = TRANSFER.sh
    PLATFORM = LOCAL
    DEPENDENCIES = CLEAN
    RUNNING = member

Platform configuration
======================

vi /cxxx/conf/platforms_cxxx.conf

.. code-block:: ini

    # Example platform with all options specified

    ## Platform name
    # [PLATFORM]
    ## Queue type. Options: PBS, SGE, PS, LSF, ecaccess, SLURM
    # TYPE =
    ## Version of queue manager to use. Needed only in PBS (options: 10, 11, 12) and ecaccess (options: pbs, loadleveler)
    # VERSION =
    ## Hostname of the HPC
    # HOST =
    ## Project for the machine scheduler
    # PROJECT =
    ## Budget account for the machine scheduler. If omitted, takes the value defined in PROJECT
    # BUDGET =
    ## Option to add project name to host. This is required for some HPCs
    # ADD_PROJECT_TO_HOST = False
    ## User for the machine scheduler
    # USER =
    ## Path to the scratch directory for the machine
    # SCRATCH_DIR = /scratch
    ## If true, autosubmit test command can use this queue as a main queue. Defaults to false
    # TEST_SUITE = False
    ## If given, autosubmit will add jobs to the given queue
    # QUEUE =
    ## If specified, autosubmit will run jobs with only one processor in the specified platform.
    # SERIAL_PLATFORM = SERIAL_PLATFORM_NAME
    ## If specified, autosubmit will run jobs with only one processor in the specified queue.
    ## Autosubmit will ignore this configuration if SERIAL_PLATFORM is provided
    # SERIAL_QUEUE = SERIAL_QUEUE_NAME
    ## Default number of processors per node to be used in jobs
    # PROCESSORS_PER_NODE =
    ## Default maximum number of jobs to be waiting in any platform queue
    ## Default = 3
    # MAX_WAITING_JOBS = 3
    ## Default maximum number of jobs to be running at the same time at the platform.
    ## Applies at platform level. Considers QUEUEING + RUNNING jobs.
    ## Ideal for configurations where some remote platform has a low upper limit of allowed jobs per user at the same time.
    ## Default = 6
    # TOTAL_JOBS = 6

    [ithaca]
    # Queue type. Options: ps, SGE, LSF, SLURM, PBS, ecaccess
    TYPE = SGE
    HOST = ithaca
    PROJECT = cfu
    ADD_PROJECT_TO_HOST = true
    USER = dbeltran
    SCRATCH_DIR = /scratch/cfu
    TEST_SUITE = True
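
To tie the two files together: the ``PLATFORM`` key of a job refers to a section name in ``platforms_cxxx.conf``; if it is omitted, the job runs on the ``HPCARCH`` platform from ``expdef_cxxx.conf``. A small sketch, reusing the ``ithaca`` platform defined above and the ``SIM`` job from ``jobs_cxxx.conf``:

.. code-block:: ini

    # jobs_cxxx.conf: run the SIM job explicitly on the ithaca platform
    [SIM]
    FILE = SIM.sh
    DEPENDENCIES = INI SIM-1 CLEAN-2
    RUNNING = chunk
    PLATFORM = ithaca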

Proj configuration
==================

After filling in the experiment configuration and running ``autosubmit create cxxx -np``, the user can go into ``proj``, which contains a copy of the model. The experiment project contains the scripts specified in ``jobs_cxxx.conf`` and a copy of the model source code and data specified in ``expdef_cxxx.conf``.

To configure the experiment project parameters, edit ``proj_cxxx.conf``.

*proj_cxxx.conf* contains:

- The project-dependent experiment variables that Autosubmit will substitute in the scripts to be run.

.. warning:: ``proj_cxxx.conf`` has to be written in INI style, so it must have at least one section header.

Example:

vi /cxxx/conf/proj_cxxx.conf

.. code-block:: ini

    [common]
    # No need to change.
    MODEL = ecearth
    # No need to change.
    VERSION = v3.1
    # No need to change.
    TEMPLATE_NAME = ecearth3
    # Select the model output control class. STRING = Option
    # listed under the section : https://earth.bsc.es/wiki/doku.php?id=overview_outclasses
    OUTCLASS = specs
    # After transferring output at /cfunas/exp remove the copy available at the permanent storage of the HPC
    # [Default: Do set "TRUE"]. BOOLEAN = TRUE, FALSE
    MODEL_output_remove = TRUE
    # Activate cmorization [Default: leave empty]. BOOLEAN = TRUE, FALSE
    CMORIZATION = TRUE
    # Essential if cmorization is activated.
    # STRING = (http://www.specs-fp7.eu/wiki/images/1/1c/SPECS_standard_output.pdf)
    CMORFAMILY =
    # Supply the name of the associated experiment (if there is any), otherwise leave it empty.
    # STRING (with space) = seasonal r1p1, seaiceinit r?p?
    ASSOCIATED_EXPERIMENT =
    # Essential if cmorization is activated (Forcing). STRING = Nat,Ant (Nat and Ant is a single option)
    FORCING =
    # Essential if cmorization is activated (Initialization description). STRING = N/A
    INIT_DESCR =
    # Essential if cmorization is activated (Physics description). STRING = N/A
    PHYS_DESCR =
    # Essential if cmorization is activated (Associated model). STRING = N/A
    ASSOC_MODEL =

    [grid]
    # AGCM grid resolution, horizontal (truncation T) and vertical (levels L).
    # STRING = T159L62, T255L62, T255L91, T511L91, T799L62 (IFS)
    IFS_resolution = T511L91
    # OGCM grid resolution. STRING = ORCA1L46, ORCA1L75, ORCA025L46, ORCA025L75 (NEMO)
    NEMO_resolution = ORCA025L75

    [oasis]
    # Coupler (OASIS) options.
    OASIS3 = yes
    # Number of pseudo-parallel cores for the coupler [Default: Do set "7"]. NUMERIC = 1, 7, 10
    OASIS_nproc = 7
    # Handling the creation of coupling fields dynamically [Default: Do set "TRUE"].
    # BOOLEAN = TRUE, FALSE
    OASIS_flds = TRUE

    [ifs]
    # Atmospheric initial conditions ready to be used.
    # STRING = ID found here : https://earth.bsc.es/wiki/doku.php?id=initial_conditions:atmospheric
    ATM_ini =
    # A different IC member per EXPID member ["PERT"] or which common IC member
    # for all EXPID members ["fc0" / "fc1"]. STRING = PERT/fc0/fc1...
    ATM_ini_member =
    # Set timestep (in sec) w.r.t resolution.
    # NUMERIC = 3600 (T159), 2700 (T255), 900 (T511), 720 (T799)
    IFS_timestep = 900
    # Number of parallel cores for the AGCM component. NUMERIC = 28, 100
    IFS_nproc = 640
    # Coupling frequency (in hours) [Default: Do set "3"]. NUMERIC = 3, 6
    RUN_coupFreq = 3
    # Post-processing frequency (in hours) [Default: Do set "6"]. NUMERIC = 3, 6
    NFRP = 6
    # [Default: Do set "TRUE"]. BOOLEAN = TRUE, FALSE
    LCMIP5 = TRUE
    # Choose RCP value [Default: Do set "2"]. NUMERIC = 0, 1=3-PD, 2=4.5, 3=6, 4=8.5
    NRCP = 0
    # [Default: Do set "TRUE"]. BOOLEAN = TRUE, FALSE
    LHVOLCA = TRUE
    # [Default: Do set "0"]. NUMERIC = 1850, 2005
    NFIXYR = 0
    # Save daily output or not [Default: Do set "FALSE"]. BOOLEAN = TRUE, FALSE
    SAVEDDA = FALSE
    # Save reduced daily output or not [Default: Do set "FALSE"]. BOOLEAN = TRUE, FALSE
    ATM_REDUCED_OUTPUT = FALSE
    # Store grib codes from SH files [User needs to refer to the ppt* files defined for the experiment]
    ATM_SH_CODES =
    # Store levels against "ATM_SH_CODES" e.g: level1,level2,level3, ...
    ATM_SH_LEVELS =
    # Store grib codes from GG files [User needs to refer to the ppt* files defined for the experiment]
    ATM_GG_CODES =
    # Store levels against "ATM_GG_CODES" (133.128, 246.128, 247.128, 248.128)
    # e.g: level1,level2,level3, ...
    ATM_GG_LEVELS =
    # SPPT stochastic physics active or not [Default: set "FALSE"]. BOOLEAN = TRUE, FALSE
    LSPPT = FALSE
    # Write the perturbation patterns for SPPT or not [Default: set "FALSE"].
    # BOOLEAN = TRUE, FALSE
    LWRITE_ARP =
    # Number of scales for SPPT [Default: set 3]. NUMERIC = 1, 2, 3
    NS_SPPT =
    # Standard deviations of each scale [Default: set 0.50,0.25,0.125]
    # NUMERIC values separated by ,
    SDEV_SPPT =
    # Decorrelation times (in seconds) for each scale [Default: set 2.16E4,2.592E5,2.592E6]
    # NUMERIC values separated by ,
    TAU_SPPT =
    # Decorrelation lengths (in meters) for each scale [Default: set 500.E3,1000.E3,2000.E3]
    # NUMERIC values separated by ,
    XLCOR_SPPT =
    # Clipping ratio (number of standard deviations) for SPPT [Default: set 2]. NUMERIC
    XCLIP_SPPT =
    # Stratospheric tapering in SPPT [Default: set "TRUE"]. BOOLEAN = TRUE, FALSE
    LTAPER_SPPT =
    # Top of stratospheric tapering layer in Pa [Default: set to 50.E2]. NUMERIC
    PTAPER_TOP =
    # Bottom of stratospheric tapering layer in Pa [Default: set to 100.E2]. NUMERIC
    PTAPER_BOT =

    ## ATMOSPHERIC NUDGING PARAMETERS ##
    # Atmospheric nudging towards re-interpolated ERA-Interim data. BOOLEAN = TRUE, FALSE
    ATM_NUDGING = FALSE
    # Atmospheric nudging reference data experiment name. [T255L91: b0ir]
    ATM_refnud =
    # Nudge vorticity. BOOLEAN = TRUE, FALSE
    NUD_VO =
    # Nudge divergence. BOOLEAN = TRUE, FALSE
    NUD_DI =
    # Nudge temperature. BOOLEAN = TRUE, FALSE
    NUD_TE =
    # Nudge specific humidity. BOOLEAN = TRUE, FALSE
    NUD_Q =
    # Nudge liquid water content. BOOLEAN = TRUE, FALSE
    NUD_QL =
    # Nudge ice water content. BOOLEAN = TRUE, FALSE
    NUD_QI =
    # Nudge cloud fraction. BOOLEAN = TRUE, FALSE
    NUD_QC =
    # Nudge log of surface pressure. BOOLEAN = TRUE, FALSE
    NUD_LP =
    # Relaxation coefficient for vorticity. NUMERIC in ]0,inf[;
    # 1 means half way between model value and ref value
    ALPH_VO =
    # Relaxation coefficient for divergence. NUMERIC in ]0,inf[;
    # 1 means half way between model value and ref value
    ALPH_DI =
    # Relaxation coefficient for temperature. NUMERIC in ]0,inf[;
    # 1 means half way between model value and ref value
    ALPH_TE =
    # Relaxation coefficient for specific humidity. NUMERIC in ]0,inf[;
    # 1 means half way between model value and ref value
    ALPH_Q =
    # Relaxation coefficient for log surface pressure. NUMERIC in ]0,inf[;
    # 1 means half way between model value and ref value
    ALPH_LP =
    # Nudging area Northern limit [Default: Do set "90"]
    NUD_NLAT =
    # Nudging area Southern limit [Default: Do set "-90"]
    NUD_SLAT =
    # Nudging area Western limit. NUMERIC in [0,360] [Default: Do set "0"]
    NUD_WLON =
    # Nudging area Eastern limit. NUMERIC in [0,360] [Default: Do set "360"]
    NUD_ELON =

The parameters defined in ``proj_cxxx.conf`` can be referenced from the user scripts as ``%PARAMETER%`` placeholders, which Autosubmit substitutes when it generates the final scripts. For example, define the parameters in ``proj_cxxx.conf``:

vi /cxxx/conf/proj_cxxx.conf

.. code-block:: ini

    PROJECT_ROOT = /gpfs/scratch/bsc32/bsc32070/a000/automatic_performance_profile
    REFRESH_GIT_REPO = false

Write your original script in the user project directory:

vi /proj/template/autosubmit/remote_setup.sh

.. code-block:: bash

    cd %CURRENT_ROOTDIR%  # This comes from Autosubmit.
    # Clone the repository to the remote if it does not exist
    # or if a refresh is forced
    if [ ! -d %PROJECT_ROOT% ] || [ %REFRESH_GIT_REPO% == true ]; then
        chmod +w -R %PROJECT_ROOT% || :
        rm -rf %PROJECT_ROOT% || :
        git clone (...)
    fi
    (...)

The final script, which is generated by ``autosubmit run`` or ``autosubmit inspect``:

cat /cxxx/tmp/remote_setup.cmd

.. code-block:: bash

    cd /gpfs/scratch/bsc32/bsc32070/a000
    # Clone the repository to the remote if it does not exist
    # or if a refresh is forced
    if [ ! -d /gpfs/scratch/bsc32/bsc32070/a000/automatic_performance_profile ] || [ false == true ]; then
        chmod +w -R /gpfs/scratch/bsc32/bsc32070/a000/automatic_performance_profile || :
        rm -rf /gpfs/scratch/bsc32/bsc32070/a000/automatic_performance_profile || :
        git clone (...)
    fi
    (...)
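
If you want to review the substituted scripts without launching the experiment, ``autosubmit inspect`` (mentioned above) writes the ``.cmd`` files under the experiment ``tmp`` directory. A minimal sketch, assuming the expid ``cxxx`` used throughout this section:

.. code-block:: bash

    # Generate the final .cmd scripts without submitting any job
    autosubmit inspect cxxx
    # The substituted scripts appear under the experiment tmp directory
    ls /cxxx/tmp/*.cmd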

Detailed platform configuration
-------------------------------

In this section, we describe the platform configuration using ``-QOS`` and also ``PARTITION``.

vi /cxxx/conf/platforms_cxxx.conf

.. code-block:: ini

    [marenostrum0]
    TYPE = ps
    HOST = mn0.bsc.es
    PROJECT = bsc32
    USER = bsc32070
    ADD_PROJECT_TO_HOST = false
    SCRATCH_DIR = /gpfs/scratch

    [marenostrum4]
    # Queue type. Options: ps, SGE, LSF, SLURM, PBS, ecaccess
    TYPE = slurm
    HOST = mn1.bsc.es,mn2.bsc.es,mn3.bsc.es
    PROJECT = bsc32
    USER = bsc32070
    SCRATCH_DIR = /gpfs/scratch
    ADD_PROJECT_TO_HOST = False
    # use 72:00 if you are using a PRACE account, 48:00 for the bsc account
    MAX_WALLCLOCK = 02:00
    # use 19200 if you are using a PRACE account, 2400 for the bsc account
    MAX_PROCESSORS = 2400
    PROCESSORS_PER_NODE = 48
    #SERIAL_QUEUE = debug
    #QUEUE = debug
    CUSTOM_DIRECTIVES = ["#SBATCH -p small", "#SBATCH --no-requeue", "#SBATCH --usage"]

    [marenostrum_archive]
    TYPE = ps
    HOST = dt02.bsc.es
    PROJECT = bsc32
    USER = bsc32070
    SCRATCH_DIR = /gpfs/scratch
    ADD_PROJECT_TO_HOST = False
    TEST_SUITE = False

    [power9]
    TYPE = slurm
    HOST = plogin1.bsc.es
    PROJECT = bsc32
    USER = bsc32070
    SCRATCH_DIR = /gpfs/scratch
    ADD_PROJECT_TO_HOST = False
    TEST_SUITE = False
    SERIAL_QUEUE = debug
    QUEUE = debug

    [nord3]
    TYPE = lsf
    HOST = nord1.bsc.es
    PROJECT = bsc32
    USER = bsc32070
    ADD_PROJECT_TO_HOST = False
    SCRATCH_DIR = /gpfs/scratch
    TEST_SUITE = False
    MAX_WALLCLOCK = 48:00
    MAX_PROCESSORS = 1024
    PROCESSORS_PER_NODE = 16

    [transfer_node]
    TYPE = ps
    HOST = dt01.bsc.es
    PROJECT = bsc32
    USER = bsc32070
    ADD_PROJECT_TO_HOST = false
    SCRATCH_DIR = /gpfs/scratch

    [transfer_node_bscearth000]
    TYPE = ps
    HOST = bscearth000
    USER = dbeltran
    PROJECT = Earth
    ADD_PROJECT_TO_HOST = false
    QUEUE = serial
    SCRATCH_DIR = /esarchive/scratch

    [bscearth000]
    TYPE = ps
    HOST = bscearth000
    PROJECT = Earth
    USER = dbeltran
    SCRATCH_DIR = /esarchive/scratch

.. warning:: The ``TYPE``, ``HOST``, ``PROJECT``, ``USER``, ``SCRATCH_DIR`` and ``ADD_PROJECT_TO_HOST`` fields are mandatory.

.. warning:: The ``TEST_SUITE``, ``MAX_WALLCLOCK``, ``MAX_PROCESSORS`` and ``PROCESSORS_PER_NODE`` fields are optional.

.. warning:: The ``SERIAL_QUEUE`` and ``QUEUE`` fields are used to specify a ``-QOS``. To specify a partition, you must use ``CUSTOM_DIRECTIVES``. To specify the memory usage, you must use ``MEMORY``, but only in ``jobs_cxxx.conf``.
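
For instance, memory is requested at job level rather than platform level. A sketch reusing the ``SIM`` job from ``jobs_cxxx.conf`` (the value is only illustrative):

.. code-block:: ini

    # jobs_cxxx.conf -- memory is requested per job, in MB
    [SIM]
    FILE = SIM.sh
    RUNNING = chunk
    MEMORY = 4096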

The custom directives can be used to pass multiple scheduler parameters at the same time, using the following syntax:

vi /cxxx/conf/platforms_cxxx.conf

.. code-block:: ini

    [puhti]
    # Check your partition (test/small/large)
    CUSTOM_DIRECTIVES = ["#SBATCH -p test", "#SBATCH --no-requeue", "#SBATCH --usage"]
    # Batch job system / queue at HPC
    TYPE = slurm
    # Hostname of the HPC
    HOST = puhti
    # Project name-ID at HPC (WEATHER)
    PROJECT = project_test
    # User name at HPC
    USER = dbeltran
    # Path to the scratch directory for the project at HPC
    SCRATCH_DIR = /scratch
    # Should be false already, just in case it is not
    ADD_PROJECT_TO_HOST = False
    # Check your partition max wallclock (test [00:15] / small [72:00] / large [72:00])
    MAX_WALLCLOCK = 00:15
    # Check your partition max processors (test [80] / small [40] / large [1040])
    MAX_PROCESSORS = 80
    # Processors per node (test [40] / small [40] / large [40])
    PROCESSORS_PER_NODE = 40
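
Once all configuration files are in place, the experiment can be (re)created and launched with the commands already referenced in this section (a minimal sketch, assuming the expid ``cxxx``):

.. code-block:: bash

    # Re-create the job list from the updated configuration, without plotting
    autosubmit create cxxx -np
    # Launch the experiment; the final .cmd scripts are generated at this point
    autosubmit run cxxx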