Platforms
=========

.. |br| raw:: html

    <br />

.. note::
    This documentation is based on the v4.1.13 branch, and reproducibility can only be guaranteed in that context.

Extending an Existing Platform
------------------------------

Platforms are defined as Python classes. The source files for these classes are stored in the ``autosubmit/platforms/`` directory. To extend an existing platform we create a child class of an existing platform class, so we first need to identify which existing platform is the most suitable for our project.

.. note::
    Currently the platforms available are: |br|
    :ref:`Local Platform ` :mod:`Platform ` |br|
    :mod:`EC Platform ` :mod:`PJM Platform ` |br|
    :mod:`Slurm Platform `

Composing the Extended Platform Class
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

On this page we will extend the SLURM platform - source file ``autosubmit/platforms/slurmplatform.py``, see in GitHub `slurmplatform.py `_ -, but any platform can be extended by following the same steps.

The platform translates the files and configuration you set manually into operations on, and connections to, SLURM and its commands, preparing your experiments for execution by turning configuration into executable functions.

We will create a new file in ``/autosubmit/platforms/`` and call it ``slurm_example.py``.

.. code-block:: python
    :linenos:

    from autosubmit.platforms.slurmplatform import SlurmPlatform

    class Slurm_ExamplePlatform(SlurmPlatform):
        """ Class to manage slurm jobs """

This creates a class with ``SlurmPlatform`` as its parent class, allowing ``Slurm_ExamplePlatform`` to inherit all its characteristics.

We then create an initialization method with the required parameters.

.. code-block:: python
    :linenos:

    def __init__(self, expid: str, name: str, config: dict, auth_password: str=None):
        """
        Initialization of the Class ExamplePlatform

        :param expid: Id of the experiment.
        :type expid: str
        :param name: Name of the platform.
        :type name: str
        :param config: A dictionary containing all the Experiment parameters.
        :type config: dict
        """
        SlurmPlatform.__init__(self, expid, name, config, auth_password=auth_password)
        self.example_platform_parameter = ...  # add any platform-specific parameters

As can be seen, the child's ``__init__`` calls the parent's initialization method so that the parent's attributes are set up in the child (``Slurm_ExamplePlatform``); the parent's methods are inherited automatically.

To override methods from the parent class, we simply redefine them as shown below. This lets us add new parameters and/or behaviours, giving us the flexibility to restructure a platform for new needs.

.. code-block:: python
    :linenos:

    def submit_job(self, job, script_name: str, hold: bool=False, export: str="none") -> Union[int, None]:
        """Submit a job from a given job object."""
        Log.result(f"Job: {job.name}")
        return None

The method ``submit_job`` is an existing method of ``SlurmPlatform`` that is overridden here to give it new behaviour.

After all the needed modifications and extensions, the ``Slurm_ExamplePlatform`` class could look similar to the following example code.

.. code-block:: python
    :linenos:

    from typing import Union

    from autosubmit.platforms.slurmplatform import SlurmPlatform
    from log.log import Log  # same import that slurmplatform.py uses for Log

    class Slurm_ExamplePlatform(SlurmPlatform):
        """Class to manage slurm jobs"""

        def __init__(self, expid: str, name: str, config: dict, auth_password: str=None):
            """
            Initialization of the Class ExamplePlatform

            :param expid: Id of the experiment.
            :type expid: str
            :param name: Name of the platform.
            :type name: str
            :param config: A dictionary containing all the Experiment parameters.
            :type config: dict
            """
            SlurmPlatform.__init__(self, expid, name, config, auth_password=auth_password)

        def submit_job(self, job, script_name: str, hold: bool=False, export: str="none") -> Union[int, None]:
            """Submit a job from a given job object."""
            Log.result(f"Job: {job.name}")
            return None

Integrating the Extended Platform into the Module
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To ensure that the platform is created as expected, we need to make some changes in three different files: |br|
``autosubmit/job/job.py`` - see in GitHub `job.py `_. |br|
``autosubmit/autosubmit.py`` - see in GitHub `autosubmit.py `_. |br|
``autosubmit/platforms/paramiko_submitter.py`` - see in GitHub `paramiko_submitter.py `_. |br|

The ``type`` in ``platform.type`` is defined in the YAML file that configures a platform, as shown :ref:`here `, and determines the scheduler.

.. warning::
    The line numbers given for each file may become obsolete as the files are updated, so they should be treated as a reference only.

In ``autosubmit/autosubmit.py``, at `line 2538 `_, add a new ``string`` to the list so that the new platform type is treated the same as the SLURM platform, since we expect similar behaviour.

.. code-block:: python
    :emphasize-lines: 1

    if platform.type.lower() in [ "slurm" , "pjm", "example" ] and not inspect and not only_wrappers:
        # Process the script generated in submit_ready_jobs
        save_2, valid_packages_to_submit = platform.process_batch_ready_jobs(valid_packages_to_submit, failed_packages, error_message="", hold=hold)

In ``autosubmit/job/job.py``, at `line 2575 `_, add the new platform type so that each job writes its timestamp to the TOTAL_STATS file and ``jobs_data.db`` properly.

.. code-block:: python
    :emphasize-lines: 1

    if job_data_dc and type(self.platform) is not str and (self.platform.type in ["slurm", "example"]):
        thread_write_finish = Thread(target=ExperimentHistory(self.expid, jobdata_dir_path=BasicConfig.JOBDATA_DIR, historiclog_dir_path=BasicConfig.HISTORICAL_LOG_DIR).write_platform_data_after_finish, args=(job_data_dc, self.platform))
        thread_write_finish.name = "JOB_data_{}".format(self.name)
        thread_write_finish.start()

In ``autosubmit/job/job.py``, at `line 2817 `_, add the new platform type to the check that guards the queue status query.

.. code-block:: python
    :emphasize-lines: 1

    if self._platform.type in ["slurm", "example"]:
        self._platform.send_command(
            self._platform.get_queue_status_cmd(self.id))
        reason = self._platform.parse_queue_reason(
            self._platform._ssh_output, self.id)

In ``autosubmit/platforms/paramiko_submitter.py``, at `line 143 `_, add the new platform type to the check that decides which platform object is created.

.. code-block:: python
    :emphasize-lines: 1

    elif platform_type in ["slurm", "example"]:
        remote_platform = SlurmPlatform(
            asconf.expid, section, exp_data, auth_password = auth_password)

How to Configure a Platform
---------------------------

To set up your platform, you first have to create a new experiment by running the following command: |br|
*Change the platform from MARENOSTRUM5 to whichever you will use*

.. parsed-literal::

    autosubmit :ref:`expid ` -H MARENOSTRUM5 -d "platform test" --minimal

This will generate a minimal version of an experiment.

To change the configuration of your experiment and ensure it works properly, you can create a project and customize its parameters. The following instructions execute a small job through Autosubmit and explain how to configure a new platform along the way.

Open the file ``~/autosubmit//config/minimal.yml`` and you'll find a file as shown below.

.. code-block:: yaml

    CONFIG:
      AUTOSUBMIT_VERSION: "4.1.12"
      TOTALJOBS: 20
      MAXWAITINGJOBS: 20
    DEFAULT:
      EXPID: # ID of the experiment
      HPCARCH: "MARENOSTRUM5" # This will be the default platform if a job doesn't contain a defined platform
      #hint: use %PROJDIR% to point to the project folder (where the project is cloned)
      CUSTOM_CONFIG: "%PROJDIR%/"
    PROJECT:
      PROJECT_TYPE: local
      PROJECT_DESTINATION: local_project
    GIT:
      PROJECT_ORIGIN: ""
      PROJECT_BRANCH: ""
      PROJECT_COMMIT: ''
      PROJECT_SUBMODULES: ''
      FETCH_SINGLE_BRANCH: true

Now we start configuring the experiment by adding the additional ``PARAMETERS`` needed to create a simple executable experiment.

.. code-block:: yaml

    EXPERIMENT:
      DATELIST: 19900101
      MEMBERS: fc0
      CHUNKSIZEUNIT: month
      SPLITSIZEUNIT: day
      CHUNKSIZE: 1
      NUMCHUNKS: 2
      CALENDAR: standard

Add the following parameter, which points to the folder containing all the scripts and instructions used to execute the experiment on the platform.

.. code-block:: yaml

    LOCAL:
      PROJECT_PATH: /home/user/experiment_example # path to your project sources

Autosubmit will copy your sources to ``$autosubmit_installation/$expid/proj/%PROJECT.PROJECT_DESTINATION%``.

The following settings are used to create a connection with a platform to execute the jobs. You must enter the information suitable for your project (e.g.: user, host, platform).

.. _TargetPlatform:

---------

.. code-block:: yaml

    PLATFORMS:
      MARENOSTRUM5:
        TYPE: # one of [slurm, ps, example]
        HOST:
        PROJECT:
        USER:
        SCRATCH_DIR:
        QUEUE: gp_debug # e.g. one of [dummy, gp_debug, nf, hpc]
        MAX_WALLCLOCK:
        MAX_PROCESSORS:
        # This is to enable horizontal_wrappers
        PROCESSORS_PER_NODE: 112 # Each HPC has its own number; check the documentation of your platform

.. warning::
    If you cannot connect, it may be because your user doesn't have access to the host, or the parameter ``SCRATCH_DIR`` might be pointing to a non-existing folder on the host.
    Make sure to create the folder with your USERNAME inside the proper path you pointed to (e.g.: //)

How to generate a new experiment
--------------------------------

Now you can add jobs at the end of the file to see the execution.

Each job points to one of the ``Bash`` files that will be created in the next step. Autosubmit looks for these files in ``~/autosubmit//proj/local_project/``; if none are found there, it looks in the ``LOCAL.PROJECT_PATH`` set earlier and copies them to the project folder if they exist.

.. note::
    The files can also be R, python2, python3. By default the type is bash, and it can be changed by setting the file type.

    .. code-block:: yaml

        JOBS:
          LOCAL_SETUP:
            TYPE: Python # adding this

.. code-block:: yaml

    JOBS:
      LOCAL_SETUP:
        FILE: LOCAL_SETUP.sh # ~/autosubmit//proj/local_project/LOCAL_SETUP.sh
        PLATFORM: Local
        RUNNING: once
      SYNCHRONIZE:
        FILE: SYNCHRONIZE.sh
        PLATFORM: MARENOSTRUM5
        DEPENDENCIES: LOCAL_SETUP
        RUNNING: once
        WALLCLOCK: 00:05
      REMOTE_SETUP:
        FILE: REMOTE_SETUP.sh
        PLATFORM: MARENOSTRUM5
        DEPENDENCIES: SYNCHRONIZE
        WALLCLOCK: 00:05
        RUNNING: once
      INI:
        FILE: INI.sh
        PLATFORM: MARENOSTRUM5
        DEPENDENCIES: REMOTE_SETUP
        RUNNING: once
        WALLCLOCK: 00:05
      DATA_NOTIFIER:
        FILE: DATA_NOTIFIER.sh
        PLATFORM: MARENOSTRUM5
        DEPENDENCIES: INI
        RUNNING: chunk
      SIM:
        FILE: SIM.sh
        PLATFORM: MARENOSTRUM5
        DEPENDENCIES: DATA_NOTIFIER
        RUNNING: chunk
      STATISTICS:
        FILE: STATISTICS.sh
        PLATFORM: MARENOSTRUM5
        DEPENDENCIES: SIM
        RUNNING: chunk
      APP:
        FILE: APP.sh
        PLATFORM: MARENOSTRUM5
        DEPENDENCIES: STATISTICS
        RUNNING: chunk
      CLEAN:
        FILE: CLEAN.sh
        # PLATFORM: MARENOSTRUM5
        DEPENDENCIES: APP SIM STATISTICS
        RUNNING: once
        WALLCLOCK: 00:05

Once you finish setting up the new configuration, the experiment is ready to be generated. Before that, we need to create a new folder to keep all the scripts that the experiment will execute on the platform.

``mkdir -p /home/user/experiment_example``

.. hint::
    The folder can have any name as long as it matches the ``LOCAL.PROJECT_PATH`` parameter specified in the configuration file; if you rename the folder, update the parameter accordingly.

For this test, a few files need to be created within the new folder; these files contain the project code that will be executed on the platform specified for each job.

.. code-block:: text

    LOCAL_SETUP.sh
    SYNCHRONIZE.sh
    REMOTE_SETUP.sh
    INI.sh
    DATA_NOTIFIER.sh
    SIM.sh
    STATISTICS.sh
    APP.sh
    CLEAN.sh

To keep the example of how Autosubmit works concise and clear, a simple instruction can be executed as a test. Add the instruction below to one or more of the ``Bash`` files created in the previous step.

.. code-block:: bash

    sleep 5

How to run the experiment
-------------------------

First, generate the experiment just configured:

``autosubmit create -f -v ``

Once the experiment is generated, we can execute and check it by running the command below, which will:

#. Submit the jobs to the specified platform
#. Monitor their status
#. Transfer the logs to ``$expid/tmp/Log_$expid``

``autosubmit run ``

.. note::
    For more examples on how to create and share configurations of experiments and platforms, you can visit the :ref:`page `.
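
As a convenience, the project folder and the placeholder scripts described in the section on generating the experiment could also be created with a short Python helper. This is only a sketch under stated assumptions: the function name ``create_placeholder_scripts`` is hypothetical and not part of Autosubmit, the file names are the ones used in the ``JOBS`` section, and the target folder is assumed to be the one set in ``LOCAL.PROJECT_PATH``.

```python
from pathlib import Path

# File names taken from the FILE entries of the JOBS section in this page.
SCRIPT_NAMES = [
    "LOCAL_SETUP.sh", "SYNCHRONIZE.sh", "REMOTE_SETUP.sh", "INI.sh",
    "DATA_NOTIFIER.sh", "SIM.sh", "STATISTICS.sh", "APP.sh", "CLEAN.sh",
]


def create_placeholder_scripts(project_path, body="sleep 5\n"):
    """Create the project folder and one test script per job.

    Hypothetical helper (not part of Autosubmit). It mirrors the manual
    steps: ``mkdir -p`` on the folder, then one file per job whose content
    is the test instruction.
    """
    folder = Path(project_path).expanduser()
    folder.mkdir(parents=True, exist_ok=True)  # same effect as ``mkdir -p``
    created = []
    for name in SCRIPT_NAMES:
        script = folder / name
        script.write_text(body)  # overwrites an existing file of that name
        created.append(script)
    return created
```

Calling ``create_placeholder_scripts("/home/user/experiment_example")`` would prepare the folder used in this page; pass a different path if your ``LOCAL.PROJECT_PATH`` differs. Note that the helper overwrites existing files with the same names.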