autosubmit.platform#
- class autosubmit.platforms.platform.CopyQueue(maxsize: int = -1, block: bool = True, timeout: float | None = None, ctx: Any | None = None)#
Bases:
Queue
A queue that copies the object gathered.
- put(job: Any, block: bool = True, timeout: float | None = None) None#
Puts a job into the queue if it is not a duplicate.
- Parameters:
job (Any) – The job to be added to the queue.
block (bool) – Whether to block when the queue is full. Defaults to True.
timeout (float) – Timeout for blocking operations. Defaults to None.
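The duplicate check described for put can be illustrated with a small stand-in. This is only a sketch, not the CopyQueue implementation: DedupQueue is a hypothetical name, it subclasses the thread-based queue.Queue instead of a multiprocessing queue, and it assumes a job is identified by its name attribute (falling back to the object itself).

```python
import queue
from typing import Any, Optional


class DedupQueue(queue.Queue):
    """Hypothetical stand-in for CopyQueue: ignores items already seen."""

    def __init__(self, maxsize: int = 0) -> None:
        super().__init__(maxsize)
        self._seen: set = set()  # keys of every item accepted so far

    def put(self, job: Any, block: bool = True, timeout: Optional[float] = None) -> None:
        # Assumption: a job is identified by its ``name`` attribute if it has
        # one, otherwise by the object itself (which must then be hashable).
        key = getattr(job, "name", job)
        if key in self._seen:
            return  # duplicate: skip it without raising
        self._seen.add(key)
        super().put(job, block, timeout)


q = DedupQueue()
q.put("job_a")
q.put("job_a")  # ignored, already queued once
q.put("job_b")
```

The set of seen keys is not protected by a lock here, so this sketch is only safe with a single producer.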
- class autosubmit.platforms.platform.Platform(expid: str, name: str, config: dict, auth_password: str | list[str] | None = None)#
Bases:
object
Class to manage the connections to the different platforms.
- add_parameters(as_conf: AutosubmitConfig)#
Add parameters for the current platform to the given parameters list
- Parameters:
as_conf (AutosubmitConfig object) – autosubmit config object
- property budget#
Platform budget.
- check_all_jobs(job_list: list[Job], as_conf: AutosubmitConfig, retries: int = 5)#
Checks jobs running status
- Parameters:
job_list (list) – list of jobs
as_conf (AutosubmitConfig) – Autosubmit configuration object
retries (int) – retries
- clean_log_recovery_process() None#
Cleans the log recovery process variables.
This method sets the cleanup event to signal the log recovery process to finish, waits for the process to join with a timeout, and then resets all related variables.
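The three steps above (signal, join with a timeout, reset) follow a common shutdown pattern. The sketch below shows that pattern with a thread as a stand-in for the log recovery process; RecoveryHandle, its attributes, and the 5-second default are hypothetical and not taken from Autosubmit.

```python
import threading
from typing import Optional


def _idle_until(event: threading.Event) -> None:
    """Worker body: do nothing until the cleanup event is set."""
    event.wait()


class RecoveryHandle:
    """Hypothetical holder for a background worker and its cleanup event."""

    def __init__(self) -> None:
        self.cleanup_event = threading.Event()
        self.worker: Optional[threading.Thread] = threading.Thread(
            target=_idle_until, args=(self.cleanup_event,), daemon=True
        )
        self.worker.start()

    def clean(self, join_timeout: float = 5.0) -> None:
        # 1. Signal the worker to finish.
        self.cleanup_event.set()
        # 2. Wait for it, but never block forever.
        if self.worker is not None:
            self.worker.join(timeout=join_timeout)
        # 3. Reset the related state so a new worker can be started later.
        self.worker = None
        self.cleanup_event = threading.Event()


handle = RecoveryHandle()
handle.clean()
```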
- compress_file(file_path: str) str | None#
Compress a file.
- Parameters:
file_path – file path
- Returns:
The path to the compressed file. None if compression failed.
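The "path on success, None on failure" contract can be sketched locally as follows. The compression format is not stated in this reference, so gzip is an assumption here, and compress_file_sketch is a hypothetical helper that works on a local path only and leaves the original file in place.

```python
import gzip
import shutil
from pathlib import Path
from typing import Optional


def compress_file_sketch(file_path: str) -> Optional[str]:
    """Gzip ``file_path`` next to the original; return the new path or None."""
    src = Path(file_path)
    dst = src.with_name(src.name + ".gz")
    try:
        with src.open("rb") as fin, gzip.open(dst, "wb") as fout:
            shutil.copyfileobj(fin, fout)
    except OSError:
        # Missing file, permission problem, full disk, and similar errors.
        dst.unlink(missing_ok=True)  # do not leave a partial archive behind
        return None
    return str(dst)
```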
- confirm_done_jobs_via_stat(job_list: list) dict[str, Status]#
Confirm that jobs marked as done are actually completed by checking their STAT files.
- Parameters:
job_list – List of jobs to confirm.
has_internal_retries – Indicates if the jobs have internal retries, which affects the STAT file naming convention.
- Returns:
Mapping of job names to their confirmed statuses.
- connect(as_conf: AutosubmitConfig, reconnect: bool = False, log_recovery_process: bool = False) None#
Establishes an SSH connection to the host.
- Parameters:
as_conf – The Autosubmit configuration object.
reconnect – Indicates whether to attempt reconnection if the initial connection fails.
log_recovery_process – Specifies if the call is made from the log retrieval process.
- Returns:
None
- delete_file(filename: str)#
Deletes a file from this platform.
- Parameters:
filename (str) – file name
- Returns:
True if successful or file does not exist
- Return type:
bool
- delete_previous_run_files_by_job_names(job_names: list[str]) None#
Delete the COMPLETED and FAILED files for the given job names from the remote log directory.
- Parameters:
job_names (list[str]) – List of job names whose COMPLETED and FAILED files should be deleted.
- delete_previous_stat_files_by_job_names(job_names: list[str]) None#
Delete all previous STAT files for the given job names from the remote log directory.
- Parameters:
job_names (list[str]) – List of job names whose STAT files should be deleted.
- property exclusivity#
True if exclusive nodes should be requested.
- get_checkpoint_files(job)#
Get all the checkpoint files of a job.
- Parameters:
job (Job) – Job whose checkpoint files are retrieved
- get_completed_job_names(job_names: list[str] | None = None) list[str]#
Get the names of the completed jobs on this platform.
- Parameters:
job_names – List of job names to check. If None, all jobs will be checked.
- Returns:
List of completed job names.
- get_file(filename, must_exist=True, relative_path='', ignore_log=False, wrapper_failed=False)#
Copies a file from the current platform to experiment’s tmp folder
- Parameters:
wrapper_failed
ignore_log
filename (str) – file name
must_exist (bool) – If True, raises an exception if file can not be copied
relative_path (str) – relative path inside tmp folder
- Returns:
True if file is copied successfully, false otherwise
- Return type:
bool
- get_file_size(src: str) int | None#
Get file size in bytes.
- Parameters:
src – file path
- get_files(files, must_exist=True, relative_path='')#
Copies some files from the current platform to experiment’s tmp folder.
- Parameters:
files ([str]) – file names
must_exist (bool) – If True, raises an exception if file can not be copied
relative_path (str) – relative path inside tmp folder
- Returns:
True if the files are copied successfully, False otherwise
- Return type:
bool
- get_files_path() str#
The platform’s LOG directory.
- Returns:
platform’s LOG directory
- Return type:
str
- get_logs_files(exp_id: str, remote_logs: tuple[str, str]) None#
Get the given LOGS files.
- Parameters:
exp_id (str) – experiment id
remote_logs ((str, str)) – names of the log files
- get_remote_log_dir() str#
Get the variable remote_log_dir that stores the directory of the experiment’s log.
- Returns:
The remote_log_dir variable.
- property host#
Platform URL.
- property hyperthreading#
TODO
- move_file(src, dest)#
Moves a file on the platform.
- Parameters:
src (str) – source name
dest (str) – destination name
- property name#
Platform name.
- property partition#
Partition to use for jobs.
- Returns:
partition’s name
- Return type:
str
- static prepare_dry_run_if_applicable(package: JobPackageBase, only_wrappers: bool, inspect: bool) None#
Dry-run preparation of a package to emulate that the package was submitted, without following the normal submission flow.
- Parameters:
package (JobPackageBase) – Package being prepared for inspect or wrapper-only mode.
only_wrappers (bool) – If True, prepare wrapper metadata without following the normal submission flow.
inspect (bool) – If True, prepare package metadata for inspect mode.
- Raises:
Exception – Propagate any exception raised while creating wrapper job metadata or saving the package.
- prepare_submission(as_conf: AutosubmitConfig, job_list: JobList, packages_to_submit: list[JobPackageBase], inspect=False, only_wrappers=False) tuple[dict[str, dict[str, JobPackageBase]], dict[str, dict[str, JobPackageBase]]]#
Prepare job packages for submission on the current platform.
Log the number of ready jobs, optionally initialize the platform submit script, and process each package selected for submission. Depending on the inspect and only_wrappers flags, this method updates wrapper metadata, generates job scripts, transfers files to the platform, and collects the jobs prepared for later submission handling.
- Parameters:
as_conf (AutosubmitConfig) – Autosubmit configuration for the current experiment.
job_list (JobList) – Job container used to inspect ready jobs and register wrapper information.
packages_to_submit (list[JobPackageBase]) – Packages built for this platform and ready to be prepared.
inspect (bool) – If True, prepare packages for inspect mode without generating the platform submit script or sending files.
only_wrappers (bool) – If True, prepare only wrapper-related metadata and skip the regular package submission flow.
- Raises:
Exception – Propagate any exception raised while preparing packages, generating scripts, or transferring files.
- Returns:
A list containing the jobs gathered while preparing the given packages for submission.
- Return type:
list
- property project#
Platform project.
- property project_dir#
Platform’s project folder path.
- property queue#
Queue to use for jobs.
- Returns:
queue’s name
- Return type:
str
- read_file(src: str, max_size: int | None = None) bytes | None#
Read file content as bytes. If max_size is set, only the first max_size bytes are read.
- Parameters:
src – file path
max_size – maximum size to read
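The max_size behaviour maps directly onto a bounded binary read. The sketch below shows it for a local file; read_head is a hypothetical name, and returning None when the file cannot be opened is an assumption suggested by the bytes | None return annotation.

```python
from pathlib import Path
from typing import Optional, Union


def read_head(src: Union[str, Path], max_size: Optional[int] = None) -> Optional[bytes]:
    """Return the file content as bytes, or None if it cannot be read."""
    try:
        with open(src, "rb") as handle:
            # read(-1) reads to EOF; read(n) returns at most n bytes.
            return handle.read(-1 if max_size is None else max_size)
    except OSError:
        return None
```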
- recover_job_log() set[Any]#
Recovers log files for jobs from the recovery queue and retries failed jobs.
- Returns:
Updated set of jobs pending to process.
- recover_platform_job_logs(as_conf: AutosubmitConfig) None#
Recovers the logs of the jobs that have been submitted. When this is executed as a process, the exit is controlled by the work_event and cleanup_events of the main process.
- remove_checkpoint_file(filename)#
Removes CHECKPOINT files from remote.
- Parameters:
filename – file name to delete.
- Returns:
True if successful, False otherwise
- remove_completed_file(job_name)#
Removes COMPLETED files from remote.
- Parameters:
job_name (str) – name of job to check
- Returns:
True if successful, False otherwise
- Return type:
bool
- remove_stat_file(job: Any) bool#
Removes STAT files from remote.
- Parameters:
job (Job) – Job to check.
- Returns:
True if the file was removed, False otherwise.
- Return type:
bool
- classmethod remove_workers(event_worker: Event) None#
Remove the given event worker from the list of workers in this class.
- property reservation#
You can configure your reservation id for the given platform.
- restore_connection(as_conf: AutosubmitConfig, log_recovery_process: bool = False) None#
Restores the SSH connection to the platform.
- Parameters:
as_conf (AutosubmitConfig) – The Autosubmit configuration object used to establish the connection.
log_recovery_process (bool) – Indicates that the call is made from the log retrieval process.
- property root_dir#
Platform’s experiment folder path.
- property scratch#
Platform’s scratch folder path.
- send_cleanup_signal() None#
Sends a cleanup signal to the log recovery process if it is alive. This function is registered with the atexit module, so it runs when the interpreter exits.
- send_file(filename: str, check=True) bool#
Sends a local file to the platform.
- Parameters:
filename – The name of the file to send.
check – Whether the platform must perform tests (e.g. for permission).
- property serial_partition#
Partition to use for serial jobs.
- Returns:
partition’s name
- Return type:
str
- property serial_platform#
Platform to use for serial jobs.
- Returns:
platform’s object
- Return type:
platform
- property serial_queue#
Queue to use for serial jobs.
- Returns:
queue’s name
- Return type:
str
- spawn_log_retrieval_process(as_conf: AutosubmitConfig | None) None#
Spawns a process to recover the logs of the jobs that have been completed on this platform.
- Parameters:
as_conf (AutosubmitConfig) – Configuration object for the platform.
- property type#
Platform scheduler type.
- property user#
Platform user.
- wait_for_work() bool#
Waits until there is work, or the keep alive timeout is reached.
- Returns:
True if there is work to process, False otherwise.
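Waiting for work with a keep-alive timeout is typically built on Event.wait, which returns True when the flag is set and False on timeout. A minimal sketch, assuming a threading.Event stands in for the platform's work event; the function name, the parameter names, and the clearing of the flag are illustrative choices, not Autosubmit's code.

```python
import threading


def wait_for_work_sketch(work_event: threading.Event, keep_alive_timeout: float) -> bool:
    """Block until ``work_event`` is set or the timeout expires."""
    # Event.wait returns True if the flag was set, False on timeout.
    has_work = work_event.wait(timeout=keep_alive_timeout)
    if has_work:
        work_event.clear()  # assumption: the flag is reset once work is picked up
    return has_work


event = threading.Event()
idle = wait_for_work_sketch(event, keep_alive_timeout=0.05)  # nothing signalled
event.set()
busy = wait_for_work_sketch(event, keep_alive_timeout=0.05)  # work was signalled
```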
- write_jobid(jobid: str, complete_path: str) None#
Writes Job id in an out/err file.
- Parameters:
jobid (str) – job id
complete_path (str) – complete path to the file, includes filename
- autosubmit.platforms.platform.recover_platform_job_logs_wrapper(platform: Platform, recovery_queue: Queue, worker_event: Event, cleanup_event: Event, as_conf: AutosubmitConfig) None#
Wrapper function to recover platform job logs.
- Parameters:
platform – The platform object responsible for managing the connection and job recovery.
recovery_queue – A multiprocessing queue used to store jobs for recovery.
worker_event – An event to signal work availability.
cleanup_event – An event to signal cleanup operations.
as_conf (AutosubmitConfig) – The Autosubmit configuration object containing experiment data.
- Returns:
None
- Return type:
None
- class autosubmit.platforms.ecplatform.EcPlatform(expid, name, config, scheduler)#
Bases:
ParamikoPlatform
Class to manage queues with ecaccess
- Parameters:
expid (str) – experiment’s identifier
scheduler (str (pbs, loadleveler)) – scheduler to use
- cancel_jobs(job_ids: list[str]) None#
Cancel ecaccess jobs by their IDs.
- Parameters:
job_ids (list[str]) – List of ecaccess job IDs to cancel.
- check_remote_log_dir() None#
Create the remote log directory and all required parent directories.
ecaccess-file-mkdir has no -p option, so each intermediate path level must be created in sequence. Failures for levels that already exist are silenced; only a failure on the final LOG directory is reported.
- Raises:
AutosubmitError – If the log directory cannot be created.
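Creating a directory tree without a -p option means issuing one mkdir per path level, shallowest first. The sketch below only computes that sequence of levels; path_levels and the example path are hypothetical, and running ecaccess-file-mkdir on each level is left out.

```python
from pathlib import PurePosixPath
from typing import List


def path_levels(remote_dir: str) -> List[str]:
    """Return every level of ``remote_dir`` from shallowest to deepest."""
    path = PurePosixPath(remote_dir)
    # ``parents`` is ordered deepest-first and ends with the root ("/" or "."),
    # so reverse it and drop the root, which never needs to be created.
    ancestors = [str(p) for p in list(path.parents)[::-1]][1:]
    return ancestors + [str(path)]


levels = path_levels("/scratch/user/a000/LOG_a000")
```

Only a failure on the last element of levels would need to be reported; failures on earlier levels usually mean the directory already exists.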
- check_remote_permissions() bool#
Checks if the necessary permissions are in place on the remote host. Because ecaccess-file-mkdir has no mkdir -p equivalent, permissions are checked for each level of the path separately.
- confirm_done_jobs_via_stat(job_list: list) dict[str, Status]#
Confirm job statuses via STAT files using ecaccess commands.
Overrides the base awk-based implementation because EcPlatform runs commands locally via subprocess and cannot read remote files directly.
- Parameters:
job_list – Jobs to confirm.
- Returns:
Mapping of job names to resolved statuses.
- connect(as_conf: AutosubmitConfig, reconnect: bool = False, log_recovery_process: bool = False) None#
Establishes an SSH connection to the host.
- Parameters:
as_conf – The Autosubmit configuration object.
reconnect – Indicates whether to attempt reconnection if the initial connection fails.
log_recovery_process – Specifies if the call is made from the log retrieval process.
- Returns:
None
- delete_file(filename: str) bool#
Deletes a file from this platform
- Parameters:
filename (str) – file name
- Returns:
True if successful or file does not exist
- Return type:
bool
- delete_previous_run_files_by_job_names(job_names: list[str]) None#
Delete COMPLETED and FAILED marker files for the given job names using ecaccess.
Overrides the base SSH find-based implementation because EcPlatform runs commands locally via subprocess, not over SSH.
- Parameters:
job_names – Job names whose COMPLETED and FAILED files should be deleted.
- delete_previous_stat_files_by_job_names(job_names: list[str]) None#
Delete all STAT marker files for the given job names using ecaccess.
Lists the remote log directory via ecaccess-file-dir and removes any matching {name}_STAT_* files one at a time. Overrides the base SSH find-based implementation because EcPlatform runs commands locally.
- Parameters:
job_names – Job names whose STAT files should be deleted.
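Selecting the {name}_STAT_* entries from a directory listing can be done with shell-style pattern matching. This is a sketch of the filtering step only, assuming the listing has already been split into one file name per entry; stat_files_for and the sample names are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def stat_files_for(job_names: Iterable[str], listing: Iterable[str]) -> List[str]:
    """Select the entries of ``listing`` that match ``{name}_STAT_*``."""
    patterns = [f"{name}_STAT_*" for name in job_names]
    return [
        entry for entry in listing
        if any(fnmatchcase(entry, pattern) for pattern in patterns)
    ]


listing = ["a000_SIM_STAT_0", "a000_SIM_STAT_1", "a000_SIM_COMPLETED", "a000_POST_STAT_0"]
to_delete = stat_files_for(["a000_SIM"], listing)
```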
- get_check_all_jobs_cmd(jobs_id)#
Return the batch job-list command for ecaccess.
ecaccess-job-list without arguments lists all active jobs for the user, so a single call covers every job in jobs_id.
- Parameters:
jobs_id – Comma-separated job IDs (ignored by ecaccess).
- Returns:
ecaccess-job-list command.
- get_check_job_cmd(job_id)#
Returns command to check job status on remote platforms.
- Parameters:
job_id – id of job to check
- Returns:
command to check job status
- get_completed_job_names(job_names: list[str] | None = None) list[str]#
Retrieve the names of all files ending with ‘_COMPLETED’ from the remote log directory.
Uses ecaccess-file-dir to inspect the remote directory and filters results locally. If job_names is provided, only those names are checked.
- Parameters:
job_names – Optional job names to restrict the lookup.
- Returns:
Job names whose _COMPLETED marker exists remotely.
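The local filtering step can be sketched as a suffix check followed by an optional restriction to the requested names. completed_job_names and its listing argument are hypothetical; the sketch assumes the remote directory listing is already available as a list of file names.

```python
from typing import Iterable, List, Optional

_SUFFIX = "_COMPLETED"


def completed_job_names(
    listing: Iterable[str], job_names: Optional[List[str]] = None
) -> List[str]:
    """Return the job names whose ``_COMPLETED`` marker appears in ``listing``."""
    names = [entry[: -len(_SUFFIX)] for entry in listing if entry.endswith(_SUFFIX)]
    if job_names is not None:
        wanted = set(job_names)
        names = [name for name in names if name in wanted]
    return names


files = ["a000_INI_COMPLETED", "a000_SIM_COMPLETED", "a000_SIM_STAT_0"]
```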
- get_file(filename, must_exist=True, relative_path='', ignore_log=False, wrapper_failed=False)#
Copies a file from the current platform to experiment’s tmp folder
- Parameters:
wrapper_failed
ignore_log
filename (str) – file name
must_exist (bool) – If True, raises an exception if file can not be copied
relative_path (str) – path inside the tmp folder
- Returns:
True if file is copied successfully, false otherwise
- Return type:
bool
- get_mkdir_cmd()#
Gets command to create directories on HPC
- Returns:
command to create directories on HPC
- Return type:
str
- get_remote_log_dir()#
Get the variable remote_log_dir that stores the directory of the experiment’s log.
- Returns:
The remote_log_dir variable.
- get_ssh_output()#
Gets output from last command executed.
- Returns:
output from last command
- Return type:
str
- get_submitted_job_id(output: str, x11: bool = False) list[str]#
Parses the output of the submit command to get the job ID.
- Parameters:
output – output of the submit command.
x11 – whether the job is an x11 job, which has a different output format.
- Returns:
job ID of the submitted job.
- get_submitted_jobs_by_name(script_names: list[str]) list[int]#
Return submitted ecaccess job IDs by script name.
This fallback is used when the batched submit command does not return one recoverable job identifier per submitted script.
- Parameters:
script_names (list[str]) – Submitted script filenames.
- Returns:
Matching ecaccess job IDs in submission order, one per script. Returns an empty list if any script name has no newly submitted job.
- Return type:
list[int]
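The all-or-nothing return rule described above can be sketched as follows. The found mapping (script name to newly submitted job ID) is a hypothetical intermediate result; how it is obtained from ecaccess is not covered here.

```python
from typing import Dict, List


def ids_in_submission_order(script_names: List[str], found: Dict[str, int]) -> List[int]:
    """Map each script name to its job ID, or return [] if any is missing."""
    job_ids: List[int] = []
    for name in script_names:
        if name not in found:
            # A partial answer could pair IDs with the wrong scripts, so the
            # lookup is treated as all-or-nothing.
            return []
        job_ids.append(found[name])
    return job_ids
```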
- move_file(src, dest, must_exist=False)#
Moves a file on the platform (includes .err and .out).
- Parameters:
src (str) – source name
dest (str) – destination name
must_exist (bool) – ignore whether the file exists or not
- parse_all_jobs_output(output, job_id)#
Parse ecaccess-job-list tabular output for a single job ID.
- Parameters:
output – Raw output from get_check_all_jobs_cmd().
job_id – ecaccess job ID to look up.
- Returns:
Job status word (e.g. EXEC, DONE, STOP) or an empty string if the job is not found.
- parse_job_output(output)#
Parses check job command output, so it can be interpreted by autosubmit
- Parameters:
output (str) – output to parse
- Returns:
job status
- Return type:
str
- restore_connection(as_conf: AutosubmitConfig, log_recovery_process: bool = False) None#
Restores the SSH connection to the platform.
- Parameters:
as_conf (AutosubmitConfig) – The Autosubmit configuration object used to establish the connection.
log_recovery_process (bool) – Indicates that the call is made from the log retrieval process.
- send_command(command, ignore_log=False, x11=False) bool#
Sends a given command to an HPC platform.
- Parameters:
command – The command to send to the HPC.
ignore_log – Whether logging is enabled or not for this function.
x11 – Whether X11 is enabled for the SSH session.
- Returns:
True if executed, False if failed
- send_file(filename, check=True) bool#
Sends a local file to the platform.
- Parameters:
filename – The name of the file to send.
check – Whether the platform must perform tests (e.g. for permission).
- set_start_time_from_remote_stat_file(job_list: list) None#
Set the start_time_timestamp for each job from the first line of its STAT file.
Overrides the base SSH head-based implementation because EcPlatform runs commands locally via subprocess and cannot read remote files directly. The first line of each STAT file contains the job start time as a Unix epoch float.
- Parameters:
job_list – Jobs whose start times should be filled from remote STAT files.
- test_connection(as_conf: AutosubmitConfig) None#
Tests the connection using the provided configuration.
- Parameters:
as_conf (AutosubmitConfig) – The configuration to use for testing the connection.
- update_cmds()#
Updates commands for platforms
- class autosubmit.platforms.pjmplatform.PJMPlatform(expid, name, config)#
Bases:
ParamikoPlatform
Class to manage jobs to host using PJM scheduler
- Parameters:
expid (str) – experiment’s identifier
- cancel_jobs(job_ids: list[str]) None#
Cancel PJM jobs by their IDs.
- Parameters:
job_ids (list[str]) – List of PJM job IDs to cancel.
- Return type:
None
- check_remote_log_dir()#
Creates log dir on remote host
- get_check_all_jobs_cmd(jobs_id)#
Returns command to check jobs status on remote platforms.
- Parameters:
jobs_id (str) – id of jobs to check
- Returns:
command to check job status
- Return type:
str
- get_check_job_cmd(job_id)#
Returns command to check job status on remote platforms.
- Parameters:
job_id – id of job to check
- Returns:
command to check job status
- get_job_id_by_job_name_cmd(job_name)#
Returns command to get job id by job name on remote platforms
- Parameters:
job_name – name of the job
- Returns:
command to get the job id by job name
- Return type:
str
- get_mkdir_cmd()#
Gets command to create directories on HPC
- Returns:
command to create directories on HPC
- Return type:
str
- get_remote_log_dir()#
Get the variable remote_log_dir that stores the directory of the experiment’s log.
- Returns:
The remote_log_dir variable.
- get_submitted_job_id(output: str, x11: bool = False) list[int]#
Parse the output of the submit command and return PJM job IDs.
- Parameters:
output (str) – Output of the submit command.
x11 (bool) – Unused for PJM, kept for API compatibility.
- Returns:
Parsed PJM job IDs.
- Return type:
list[int]
- get_submitted_jobs_by_name(script_names: list[str]) list[int]#
Get submitted PJM job IDs by script name.
This is a fallback used when the submit command output does not contain enough information to recover all submitted job IDs directly.
- Parameters:
script_names (list[str]) – Submitted script filenames.
- Returns:
Matching PJM job IDs in the same order as script_names.
- Return type:
list[int]
- parse_all_jobs_output(output, job_id)#
Parses check jobs command output, so it can be interpreted by autosubmit
- Parameters:
output (str) – output to parse
job_id – select the job to parse
- Returns:
job status
- Return type:
str
- parse_job_list(job_list: list[list[Job]]) str#
Convert a list of job_list to job_list_cmd.
- Parameters:
job_list (list) – list of jobs
- Returns:
the job_list_cmd string built from the given jobs
- Return type:
str
- parse_job_output(output)#
Parses check job command output, so it can be interpreted by autosubmit
- Parameters:
output (str) – output to parse
- Returns:
job status
- Return type:
str
- submit_error(output)#
Check if the output of the submit command contains an error message.
- Parameters:
output – output of the submit cmd
- Returns:
boolean
- update_cmds()#
Update commands for platforms.
Slurm Platform.
This file contains code that interfaces between Autosubmit and a Slurm Platform.
- class autosubmit.platforms.slurmplatform.SlurmPlatform(expid: str, name: str, config: dict, auth_password: str | list[str] | None = None)#
Bases:
ParamikoPlatform
Class to manage jobs to host using SLURM scheduler.
- static allocated_nodes() str#
Sets the allocated nodes of the wrapper.
- Returns:
A command that changes the number of nodes per job.
- Return type:
str
- cancel_jobs(job_ids: list[str]) None#
Cancel jobs by their IDs.
- Parameters:
job_ids (list[str]) – List of job IDs to cancel.
- check_file_exists(src: str, wrapper_failed: bool = False, sleeptime: int = 5, max_retries: int = 3, show_logs: bool = True) bool#
Checks if a file exists on the FTP server.
- Parameters:
src (str) – The name of the file to check.
wrapper_failed (bool) – Whether the wrapper has failed. Defaults to False.
sleeptime (int) – Time to sleep between retries in seconds. Defaults to 5.
max_retries (int) – Maximum number of retries. Defaults to 3.
show_logs (bool) – Whether to show logs if the file does not exist. Defaults to True.
- Returns:
True if the file exists, False otherwise
- Return type:
bool
- check_remote_log_dir() None#
Creates log dir on remote host.
- create_a_new_copy()#
Return a copy of a SlurmPlatform object with the same expid, name and config as the original.
- Returns:
A new platform of type Slurm.
- Return type:
SlurmPlatform
- get_check_all_jobs_cmd(jobs_id: str)#
Generates the sacct command to check all the given jobs.
- Parameters:
jobs_id – ID of one or more jobs.
- Returns:
sacct command for all the jobs.
- Return type:
str
- get_check_job_cmd(job_id: str) str#
Generates the sacct command to check the selected job.
- Parameters:
job_id – ID of a job.
- Returns:
The sacct command to be executed.
- get_estimated_queue_time_cmd(job_id: str)#
Gets the command that estimates the queue time of the selected job.
- Parameters:
job_id (str) – ID of a job.
- Returns:
Command to get the estimated queue time.
- Return type:
str
- get_job_energy_cmd(job_id: str) str#
Generates a command to get the following data for a job: JobId, State, NCPUS, NNodes, Submit, Start, End, ConsumedEnergy, MaxRSS, AveRSS%25.
- Parameters:
job_id (str) – ID of a job.
- Returns:
Command to get job energy.
- Return type:
str
- get_job_id_by_job_name_cmd(job_name: str) str#
Looks for a job based on its name.
- Parameters:
job_name (str) – Name given to a job.
- Returns:
Command to look for a job in the queue.
- Return type:
str
- get_mkdir_cmd() str#
Get the variable mkdir_cmd that stores the mkdir command.
- Returns:
Mkdir command
- Return type:
str
- get_remote_log_dir() str#
Get the variable remote_log_dir that stores the directory of the Log of the experiment.
- Returns:
The remote_log_dir variable.
- Return type:
str
- get_submitted_job_id(output: str, x11: bool = False) list[str]#
Parses the output of the submit command to get the job ID.
- Parameters:
output – output of the submit command.
x11 – whether the job is an x11 job, which has a different output format.
- Returns:
job ID of the submitted job.
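Parsing the submit output can be illustrated with sbatch's standard confirmation line, "Submitted batch job <id>". The sketch below covers only that regular format; the different x11 output format mentioned above is not handled, and parse_sbatch_ids is a hypothetical helper rather than the method's actual implementation.

```python
import re
from typing import List

# sbatch prints "Submitted batch job <id>" for each accepted script.
_SBATCH_LINE = re.compile(r"Submitted batch job (\d+)")


def parse_sbatch_ids(output: str) -> List[str]:
    """Return every job ID announced in ``output``, in order of appearance."""
    return _SBATCH_LINE.findall(output)


ids = parse_sbatch_ids("Submitted batch job 1001\nSubmitted batch job 1002\n")
```

Returning the IDs as strings matches the list[str] annotation in the signature above.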
- get_submitted_jobs_by_name(script_names: list[str]) list[int]#
Return submitted Slurm job IDs by script name.
This fallback is used when the batched submit command does not return one recoverable job identifier per submitted script.
- Parameters:
script_names (list[str]) – Submitted script filenames.
- Returns:
Matching Slurm job IDs in submission order.
- Return type:
list[int]
- parse_all_jobs_output(output: str, job_id: int) list[str] | str#
Parses check jobs command output, so it can be interpreted by autosubmit
- Parameters:
output (str) – output to parse
job_id – select the job to parse
- Returns:
job status
- Return type:
str
- parse_job_output(output: str) str#
Parses check job command output, so it can be interpreted by autosubmit.
- Parameters:
output (str) – output to parse.
- Returns:
job status.
- Return type:
str
- parse_queue_reason(output: str, job_id: str) str#
Parses the queue reason from the output of the command.
- Parameters:
output – output of the command.
job_id – job id
- Returns:
queue reason.
- update_cmds() None#
Updates commands for platforms.
- wrapper_header(**kwargs: Any) str#
Generates the header of the wrapper, configuring it to execute the experiment.
- Parameters:
kwargs (Any) – Keyword arguments associated with the job/experiment, used to configure the wrapper.
- Returns:
A sequence of Slurm commands.
- Return type:
str
- class autosubmit.platforms.locplatform.LocalPlatform(expid: str, name: str, config: dict, auth_password: str | list[str] | None = None)#
Bases:
ParamikoPlatform
Class to manage jobs to localhost.
- cancel_jobs(job_ids: list[str]) None#
Cancel local processes by their PIDs.
- Parameters:
job_ids (list[str]) – List of local process IDs to cancel.
- check_completed_files(sections: str | None = None) str | None#
Checks for completed files in the remote log directory. This function is used to check inner_jobs of a wrapper.
- Parameters:
sections (str) – Space-separated string of sections to check for completed files. Defaults to None.
- Returns:
The output if the command is successful, None otherwise.
- Return type:
str
- check_file_exists(src: str, wrapper_failed: bool = False, sleeptime: int = 1, max_retries: int = 1) bool#
Checks if a file exists in the platform.
- Parameters:
src (str) – source name.
wrapper_failed (bool) – Checks inner jobs files. Defaults to False.
sleeptime (int) – Time to sleep between retries. Defaults to 1.
max_retries (int) – Maximum number of retries. Defaults to 1.
- Returns:
True if the file exists, False otherwise.
- Return type:
bool
- check_remote_permissions() bool#
Check remote permissions on a platform.
This is needed for Paramiko and PS and other platforms.
It uses the platform scratch project directory to create a subdirectory, and then removes it. It does it that way to verify that the user running Autosubmit has the minimum permissions required to run Autosubmit.
It does not check Slurm, queues, modules, software, etc., only the file system permissions required.
- Returns:
True on success, False otherwise.
- compress_file(file_path: str) None#
Compress a file.
- Parameters:
file_path – file path
- Returns:
The path to the compressed file. None if compression failed.
- connect(as_conf: AutosubmitConfig, reconnect: bool = False, log_recovery_process: bool = False) None#
Establishes an SSH connection to the host.
- Parameters:
as_conf – The Autosubmit configuration object.
reconnect – Indicates whether to attempt reconnection if the initial connection fails.
log_recovery_process – Specifies if the call is made from the log retrieval process.
- Returns:
None
- delete_file(filename, del_cmd=False)#
Deletes a file from this platform
- Parameters:
filename (str) – file name
- Returns:
True if successful or file does not exist
- Return type:
bool
- get_check_all_jobs_cmd(jobs_id: str) str#
Return a shell command that checks the status of all given process IDs.
For each PID, the command outputs a line <pid> <status> where status is 0 if the process is running (non-zombie) and 1 otherwise.
- Parameters:
jobs_id – Comma-separated list of process IDs to check.
- Returns:
Shell command outputting <pid> <status> per line.
- get_check_job_cmd(job_id)#
Returns command to check job status on remote platforms.
- Parameters:
job_id – id of job to check
- Returns:
command to check job status
- get_file(filename, must_exist=True, relative_path='', ignore_log=False, wrapper_failed=False)#
Copies a file from the current platform to experiment’s tmp folder
- Parameters:
wrapper_failed
ignore_log
filename (str) – file name
must_exist (bool) – If True, raises an exception if file can not be copied
relative_path (str) – path inside the tmp folder
- Returns:
True if file is copied successfully, false otherwise
- Return type:
bool
- get_file_size(src: str | Path) int | None#
Get file size in bytes
- Parameters:
src – file path
- get_logs_files(exp_id: str, remote_logs: tuple[str, str]) None#
Do nothing because the log files are already in the local platform (redundancy).
- get_mkdir_cmd()#
Gets command to create directories on HPC
- Returns:
command to create directories on HPC
- Return type:
str
- get_remote_log_dir()#
Get the variable remote_log_dir that stores the directory of the experiment’s log.
- Returns:
The remote_log_dir variable.
- get_ssh_output()#
Gets output from last command executed.
- Returns:
output from last command
- Return type:
str
- get_submitted_job_id(raw_output: str, x11: bool = False) list[str]#
Parses the output of the submit command to get the job ID.
- Parameters:
raw_output – output of the submit command.
x11 – whether the job is an x11 job, which has a different output format.
- Returns:
job ID of the submitted job.
- move_file(src, dest, must_exist=False)#
Moves a file on the platform (includes .err and .out)
- Parameters:
src (str) – source name.
dest (str) – destination name.
must_exist (bool) – ignore whether the file exists or not.
- parse_all_jobs_output(output: str, job_id: int) str#
Parse process-status output for the given job ID.
- Parameters:
output – Output of get_check_all_jobs_cmd().
job_id – Process ID to look up.
- Returns:
'0' if running, '1' if not running or not found.
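Given output made of <pid> <status> lines, the lookup for one process ID can be sketched as below. parse_pid_status is a hypothetical name; the sketch assumes whitespace-separated fields and treats malformed lines as if the PID were absent.

```python
def parse_pid_status(output: str, job_id: int) -> str:
    """Return '0' if ``job_id`` is reported as running, '1' otherwise."""
    for line in output.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == str(job_id):
            return "0" if fields[1] == "0" else "1"
    return "1"  # PID not listed: treated the same as not running


sample = "4242 0\n4243 1\n"
```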
- parse_job_output(output)#
Parses check job command output, so it can be interpreted by autosubmit
- Parameters:
output (str) – output to parse
- Returns:
job status
- Return type:
str
- read_file(src: str | Path, max_size: int | None = None) bytes | None#
Read file content as bytes. If max_size is set, only the first max_size bytes are read.
- Parameters:
src – file path
max_size – maximum size to read
- remove_multiple_files(filenames: str) str#
Creates a shell script to remove multiple files in the remote and sets the appropriate permissions.
- Parameters:
filenames (str) – A string containing the filenames to be removed.
- Returns:
An empty string.
- Return type:
str
- restore_connection(as_conf: AutosubmitConfig, log_recovery_process: bool = False) None#
Restores the SSH connection to the platform.
- Parameters:
as_conf (AutosubmitConfig) – The Autosubmit configuration object used to establish the connection.
log_recovery_process (bool) – Indicates that the call is made from the log retrieval process.
- send_command(command: str, ignore_log=False, x11=False) bool#
Sends a given command to an HPC platform.
- Parameters:
command – The command to send to the HPC.
ignore_log – Whether logging is enabled or not for this function.
x11 – Whether X11 is enabled for the SSH session.
- Returns:
True if executed, False if failed
- send_file(filename: str, check: bool = True) bool#
Sends a file to a specified location using a command.
- Parameters:
filename (str) – The name of the file to send.
check (bool) – Unused in this platform.
- Returns:
True if the file was sent successfully.
- Return type:
bool
- test_connection(as_conf: AutosubmitConfig) None#
Test if the connection is still alive, reconnect if not.
- update_cmds()#
Updates commands for platforms.
- write_jobid(jobid: str, complete_path: str) None#
Writes Job id in an out/err file.
- Parameters:
jobid (str) – job id
complete_path (str) – complete path to the file, includes filename