autosubmit.job

Main module for Autosubmit. Only contains an interface class to all functionality implemented on Autosubmit

class autosubmit.job.job.Job(name, job_id, status, priority)

Class to handle all the tasks with Jobs at HPC. A job is created with a name, a job ID, a status and a priority. It can have children and parents, and these links reflect the dependencies between jobs: if Job2 must wait until Job1 is completed, then Job2 is a child of Job1 and, conversely, Job1 is a parent of Job2.

Parameters:
  • name (str) – job’s name
  • job_id (int) – job’s identifier
  • status (Status) – job initial status
  • priority (int) – job’s priority
add_parent(*parents)

Add parents for the job. It also adds current job as a child for all the new parents

Parameters:parents (*Job) – job’s parents to add
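
The two-way bookkeeping described for add_parent can be sketched as follows. MiniJob is a hypothetical stand-in, not the real Job class; it only assumes that parents and children are kept as sets on each job.

```python
class MiniJob:
    """Hypothetical stand-in for Job; only models the dependency links."""

    def __init__(self, name):
        self.name = name
        self.parents = set()
        self.children = set()

    def add_parent(self, *parents):
        # Register each parent and mirror the link on the parent's side.
        for parent in parents:
            self.parents.add(parent)
            parent.children.add(self)


job1 = MiniJob("Job1")
job2 = MiniJob("Job2")
job2.add_parent(job1)  # Job2 must wait for Job1
```

After the call, job1 appears in job2.parents and job2 appears in job1.children, so the dependency can be followed in either direction.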
check_completion(default_status=-1)

Checks for the presence of the COMPLETED file. Changes status to COMPLETED if the file exists and to default_status otherwise.

Parameters:default_status (Status) – status to set if the job is not completed. Defaults to FAILED
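
A minimal sketch of the marker-file check, under stated assumptions: the file name pattern "<job_name>_COMPLETED", the log_dir argument and the string status labels are all hypothetical and chosen only for illustration. The real method works on the job's own attributes and uses numeric Status values.

```python
import os
import tempfile

COMPLETED, FAILED = "COMPLETED", "FAILED"  # assumed status labels


def check_completion_sketch(log_dir, job_name, default_status=FAILED):
    # Assumed marker name: "<job_name>_COMPLETED" inside log_dir.
    marker = os.path.join(log_dir, job_name + "_COMPLETED")
    return COMPLETED if os.path.exists(marker) else default_status


with tempfile.TemporaryDirectory() as tmp:
    before = check_completion_sketch(tmp, "a000_SIM")
    # Simulate the job leaving its marker file behind.
    open(os.path.join(tmp, "a000_SIM_COMPLETED"), "w").close()
    after = check_completion_sketch(tmp, "a000_SIM")
```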

check_end_time()

Returns end time from stat file

Returns:date and time
Return type:str
check_retrials_end_time()

Returns list of end datetime for retrials from total stats file

Returns:date and time
Return type:list[int]
check_retrials_start_time()

Returns list of start datetime for retrials from total stats file

Returns:date and time
Return type:list[int]
check_retrials_submit_time()

Returns list of submit datetime for retrials from total stats file

Returns:date and time
Return type:list[int]
check_running_after(date_limit)

Checks if the job was running after the given date

Parameters:date_limit (datetime.datetime) – reference date
Returns:True if job was running after the given date, False otherwise
Return type:bool

check_script(as_conf, parameters, show_logs=False)

Checks if script is well formed

Parameters:
  • parameters (dict) – script parameters
  • as_conf (AutosubmitConfig) – configuration file
  • show_logs (bool) – display output
Returns:

True if no problem has been detected, False otherwise

Return type:

bool

check_start_time()

Returns job’s start time

Returns:start time
Return type:str
check_started_after(date_limit)

Checks if the job started after the given date

Parameters:date_limit (datetime.datetime) – reference date
Returns:True if job started after the given date, False otherwise
Return type:bool

children

Returns all children of the job

Returns:child jobs
Return type:set
compare_by_id(other)

Compare jobs by ID

Parameters:other (Job) – job to compare
Returns:comparison result
Return type:bool
compare_by_name(other)

Compare jobs by name

Parameters:other (Job) – job to compare
Returns:comparison result
Return type:bool
compare_by_status(other)

Compare jobs by status value

Parameters:other (Job) – job to compare
Returns:comparison result
Return type:bool
create_script(as_conf)

Creates script file to be run for the job

Parameters:as_conf (AutosubmitConfig) – configuration object
Returns:script’s filename
Return type:str
delete_child(child)

Removes a child from the job

Parameters:child (Job) – child to remove
delete_parent(parent)

Remove a parent from the job

Parameters:parent (Job) – parent to remove
get_last_retrials()

Returns the retrials of a job, including the last COMPLETED run. The selection stops when the previous COMPLETED run is found (that run is not included) or when the list of records is exhausted.

Returns:list of list of dates of retrial [submit, start, finish] in datetime format
Return type:list of list
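
The selection rule for get_last_retrials can be illustrated with a backwards walk over the records. This is a sketch under assumptions: records are taken to be an oldest-first list of (submit, start, finish, status) tuples, integers stand in for datetimes, and the result is returned in chronological order. None of these details are confirmed by the reference above.

```python
def last_retrials_sketch(records):
    """records: oldest-first list of (submit, start, finish, status) tuples."""
    selected = []
    for index, record in enumerate(reversed(records)):
        # A COMPLETED record other than the most recent one ends the selection.
        if index > 0 and record[3] == "COMPLETED":
            break
        selected.append(record)
    selected.reverse()  # restore chronological order
    return selected


history = [
    (1, 2, 3, "COMPLETED"),     # previous successful run: excluded
    (4, 5, 6, "FAILED"),
    (7, 8, 9, "FAILED"),
    (10, 11, 12, "COMPLETED"),  # last run: included
]
result = last_retrials_sketch(history)
```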
has_children()

Returns true if job has any children, else return false

Returns:true if job has any children, otherwise return false
Return type:bool
has_parents()

Returns true if job has any parents, else return false

Returns:true if job has any parent, otherwise return false
Return type:bool
inc_fail_count()

Increments fail count

static is_a_completed_retrial(fields)

Returns true only if there are 4 fields (submit, start, finish, status) and status equals COMPLETED.
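
As a rough illustration of that rule, the check reduces to a length test plus a comparison on the last field. The field order follows the description above; comparing against the literal string "COMPLETED" is an assumption about how the stats file stores the status.

```python
def is_a_completed_retrial_sketch(fields):
    # Expect exactly [submit, start, finish, status] with a COMPLETED status.
    return len(fields) == 4 and fields[3] == "COMPLETED"


complete = is_a_completed_retrial_sketch(["s", "b", "e", "COMPLETED"])
failed = is_a_completed_retrial_sketch(["s", "b", "e", "FAILED"])
truncated = is_a_completed_retrial_sketch(["s", "b", "e"])
```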

is_ancestor(job)

Checks if the given job is an ancestor

Parameters:job – job to check
Returns:True if job is an ancestor, False otherwise
Return type:bool

is_parent(job)

Checks if the given job is a parent

Parameters:job – job to check
Returns:True if job is a parent, False otherwise
Return type:bool

log_job()

Prints job information in log

long_name

Job’s long name. If not set, returns name

Returns:long name
Return type:str
parents

Returns the parent jobs

Returns:parent jobs
Return type:set
platform

Returns the platform to be used by the job. Chooses between serial and parallel platforms

Returns:HPCPlatform object for the job to use
Return type:HPCPlatform

print_job()

Prints debug information about the job

print_parameters()

Prints job parameters in log

queue

Returns the queue to be used by the job. Chooses between serial and parallel platforms

Returns:HPCPlatform object for the job to use
Return type:HPCPlatform

remove_redundant_parents()

Checks if a parent is also an ancestor and, if so, removes the link in both directions. Useful to remove redundant dependencies.
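
The pruning of redundant parents can be sketched on a small dependency chain. Node is a hypothetical stand-in for Job, and "ancestor" is assumed here to mean a job reachable above one of the direct parents (grandparent or higher); the real is_ancestor may define it differently.

```python
class Node:
    """Hypothetical stand-in for Job holding only dependency links."""

    def __init__(self, name):
        self.name = name
        self.parents = set()
        self.children = set()

    def add_parent(self, parent):
        self.parents.add(parent)
        parent.children.add(self)

    def is_ancestor(self, other):
        # True if `other` is reachable above one of our parents
        # (grandparent or higher), not merely a direct parent.
        stack = [gp for p in self.parents for gp in p.parents]
        seen = set()
        while stack:
            node = stack.pop()
            if node is other:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(node.parents)
        return False

    def remove_redundant_parents(self):
        for parent in list(self.parents):
            if self.is_ancestor(parent):
                # Drop the direct link on both sides.
                self.parents.discard(parent)
                parent.children.discard(self)


a, b, c = Node("A"), Node("B"), Node("C")
b.add_parent(a)
c.add_parent(b)
c.add_parent(a)  # redundant: A is already reached through B
c.remove_redundant_parents()
```

In this example C keeps only B as a parent, while the A to B link is untouched, so the execution order A, B, C is still enforced.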

total_processors

Number of processors requested by job. Reduces ‘:’ separated format if necessary.
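
One plausible reading of "reduces" is that the ‘:’ separated components are summed. That interpretation is an assumption, not something the description confirms; the helper name below is hypothetical.

```python
def total_processors_sketch(processors):
    # Split on ':' and add the components; a plain number passes through.
    return sum(int(part) for part in str(processors).split(":"))


combined = total_processors_sketch("4:8:16")
single = total_processors_sketch("12")
```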

update_content(as_conf)

Create the script content to be run for the job

Parameters:as_conf (config) – config
Returns:script code
Return type:str
update_parameters(as_conf, parameters, default_parameters={'M': '%M%', 'M_': '%M_%', 'Y': '%Y%', 'Y_': '%Y_%', 'd': '%d%', 'd_': '%d_%', 'm': '%m%', 'm_': '%m_%'})

Refresh parameters value

Parameters:
  • default_parameters (dict) –
  • as_conf (AutosubmitConfig) –
  • parameters (dict) –
update_status(copy_remote_logs=False, failed_file=False)

Updates job status, checking COMPLETED file if needed

Parameters:
  • new_status (Status) – job status retrieved from the platform
  • copy_remote_logs – should copy remote logs when finished?

write_end_time(completed)

Writes end date and time to TOTAL_STATS file

Parameters:completed (bool) – True if job was completed successfully, False otherwise

write_start_time()

Writes start date and time to TOTAL_STATS file

Returns:True if successful, False otherwise
Return type:bool

write_submit_time()

Writes submit date and time to TOTAL_STATS file

class autosubmit.job.job.WrapperJob(name, job_id, status, priority, job_list, total_wallclock, num_processors, platform, as_config, hold)

Defines a wrapper from a package.

Calls Job constructor.

Parameters:
  • name (String) – Name of the Package
  • job_id (Integer) – Id of the first Job of the package
  • status (String) – ‘READY’ when coming from submit_ready_jobs()
  • priority (Integer) – 0 when coming from submit_ready_jobs()
  • job_list (List() of Job() objects) – List of jobs in the package
  • total_wallclock (String Formatted) – Wallclock of the package
  • num_processors (Integer) – Number of processors for the package
  • platform (Platform Object. e.g. EcPlatform()) – Platform object defined for the package
  • as_config (AutosubmitConfig object) – Autosubmit basic configuration object
class autosubmit.job.job_common.StatisticsSnippetBash

Class to handle the statistics snippet of a job. It contains header and tailer for local and remote jobs

class autosubmit.job.job_common.StatisticsSnippetEmpty

Class to handle the statistics snippet of a job. It contains header and footer for local and remote jobs

class autosubmit.job.job_common.StatisticsSnippetPython

Class to handle the statistics snippet of a job. It contains header and tailer for local and remote jobs

class autosubmit.job.job_common.StatisticsSnippetR

Class to handle the statistics snippet of a job. It contains header and tailer for local and remote jobs

class autosubmit.job.job_common.Status

Class to handle the status of a job

class autosubmit.job.job_common.Type

Class to handle the type of a job

autosubmit.job.job_common.increase_wallclock_by_chunk(current, increase, chunk)

Receives the wallclock time and increases it by a quantity multiplied by the number of the current chunk. The result cannot be larger than 48:00. If chunk is 0, no increment is applied.

Parameters:
  • current (str) – WALLCLOCK HH:MM
  • increase (str) – WCHUNKINC HH:MM
  • chunk (int) – chunk number
Returns:

HH:MM wallclock

Return type:

str
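
The arithmetic can be sketched as below. The multiplier is taken literally from the description (increase times chunk); whether the real function uses exactly this factor is an assumption, as is the zero-padded output format.

```python
def increase_wallclock_sketch(current, increase, chunk):
    if not chunk or chunk <= 0:
        return current  # chunk 0: no increment
    cur_h, cur_m = (int(x) for x in current.split(":"))
    inc_h, inc_m = (int(x) for x in increase.split(":"))
    # Work in minutes: base wallclock plus one increment per chunk number.
    total = cur_h * 60 + cur_m + (inc_h * 60 + inc_m) * chunk
    total = min(total, 48 * 60)  # cap at 48:00
    return "{:02d}:{:02d}".format(total // 60, total % 60)


grown = increase_wallclock_sketch("01:00", "00:30", 2)
capped = increase_wallclock_sketch("40:00", "10:00", 3)
unchanged = increase_wallclock_sketch("01:30", "00:30", 0)
```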

autosubmit.job.job_common.parse_output_number(string_number)

Parses number in format 1.0K 1.0M 1.0G

Parameters:string_number (str) – String representation of number
Returns:number in float format
Return type:float
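
A sketch of suffix parsing for values such as 1.0K, 1.0M and 1.0G. The decimal multipliers (10^3, 10^6, 10^9) are an assumption; the reference does not say which base unit the real function returns.

```python
def parse_output_number_sketch(string_number):
    multipliers = {"K": 1e3, "M": 1e6, "G": 1e9}  # assumed decimal units
    text = string_number.strip()
    if text and text[-1].upper() in multipliers:
        # Strip the suffix and scale the numeric part.
        return float(text[:-1]) * multipliers[text[-1].upper()]
    return float(text)


kilo = parse_output_number_sketch("1.0K")
mega = parse_output_number_sketch("2.5M")
plain = parse_output_number_sketch("3")
```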
class autosubmit.job.job_list.JobList(expid, config, parser_factory, job_list_persistence)

Class to manage the list of jobs to be run by autosubmit

add_logs(logs)

Adds logs to the current job_list

Parameters:platform (HPCPlatform) – job platform
Returns:logs
Return type:dict(tuple)
backup_load()

Recreates a stored job list from the persistence

Returns:loaded job list object
Return type:JobList
backup_save()

Persists the job list

check_scripts(as_conf)

Once the scripts have been created, all parameters should have been substituted. No %PARAMETER% placeholders may remain

Parameters:as_conf (AutosubmitConfig) – experiment configuration
expid

Returns the experiment identifier

Returns:experiment’s identifier
Return type:str
generate(date_list, member_list, num_chunks, chunk_ini, parameters, date_format, default_retrials, default_job_type, wrapper_type=None, wrapper_jobs=None, new=True, notransitive=False, update_structure=False, run_only_members=[])

Creates all jobs needed for the current workflow

Parameters:
  • default_job_type (str) – default type for jobs
  • date_list (list) – start dates
  • member_list (list) – members
  • num_chunks (int) – number of chunks to run
  • chunk_ini (int) – the experiment will start by the given chunk
  • parameters (dict) – parameters for the jobs
  • date_format (str) – option to format dates
  • default_retrials (int) – default retrials for each job
  • new (bool) – is it a new generation?
  • wrapper_type – Type of wrapper defined by the user in autosubmit_.conf [wrapper] section.
  • wrapper_jobs (String) – Job types defined in autosubmit_.conf [wrapper sections] to be wrapped.
get_active(platform=None, wrapper=False)

Returns a list of active jobs (In platforms queue + Ready)

Parameters:platform (HPCPlatform) – job platform
Returns:active jobs
Return type:list
get_all(platform=None, wrapper=False)

Returns a list of all jobs

Parameters:platform (HPCPlatform) – job platform
Returns:all jobs
Return type:list
get_chunk_list()

Get inner chunk list

Returns:chunk list
Return type:list
get_completed(platform=None, wrapper=False)

Returns a list of completed jobs

Parameters:platform (HPCPlatform) – job platform
Returns:completed jobs
Return type:list
get_date_list()

Get inner date list

Returns:date list
Return type:list
get_failed(platform=None, wrapper=False)

Returns a list of failed jobs

Parameters:platform (HPCPlatform) – job platform
Returns:failed jobs
Return type:list
get_finished(platform=None, wrapper=False)

Returns a list of jobs finished (Completed, Failed)

Parameters:platform (HPCPlatform) – job platform
Returns:finished jobs
Return type:list
get_held_jobs(platform=None)

Returns a list of jobs in the platforms (Held)

Parameters:platform (HPCPlatform) – job platform
Returns:jobs in platforms
Return type:list
get_in_queue(platform=None, wrapper=False)

Returns a list of jobs in the platforms (Submitted, Running, Queuing, Unknown, Held)

Parameters:platform (HPCPlatform) – job platform
Returns:jobs in platforms
Return type:list
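
The get_* selectors all follow the same pattern: keep the jobs whose status belongs to a group, optionally restricted to one platform. The sketch below illustrates that pattern with the groups named for get_in_queue and get_finished. JobStub, the string status labels and the select helper are hypothetical; the real methods compare against Status values and platform objects.

```python
from collections import namedtuple

# Hypothetical stand-in for Job with just the fields used for filtering.
JobStub = namedtuple("JobStub", ["name", "status", "platform"])

IN_QUEUE = {"SUBMITTED", "RUNNING", "QUEUING", "UNKNOWN", "HELD"}
FINISHED = {"COMPLETED", "FAILED"}


def select(jobs, statuses, platform=None):
    # platform=None means "any platform".
    return [
        job for job in jobs
        if job.status in statuses
        and (platform is None or job.platform == platform)
    ]


jobs = [
    JobStub("a000_INI", "COMPLETED", "local"),
    JobStub("a000_SIM", "RUNNING", "hpc"),
    JobStub("a000_POST", "WAITING", "hpc"),
]
in_queue = select(jobs, IN_QUEUE)
finished_on_hpc = select(jobs, FINISHED, platform="hpc")
```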
get_job_by_name(name)

Returns the job whose name matches the name parameter

Parameters:name (str) – name to look for
Returns:found job
Return type:Job
get_job_list()

Get inner job list

Returns:job list
Return type:list
get_job_names(lower_case=False)

Returns a list of all job names

Parameters:platform (HPCPlatform) – job platform
Returns:all jobs
Return type:list
Parameters:
  • select_jobs_by_name – job name
  • select_all_jobs_by_section – section name
  • filter_jobs_by_section – section, date , member? , chunk?
Returns:

jobs_list names

Return type:

list

get_logs()

Returns a dict of logs keyed by job name

Parameters:platform (HPCPlatform) – job platform
Returns:logs
Return type:dict(tuple)
get_member_list()

Get inner member list

Returns:member list
Return type:list
get_not_in_queue(platform=None, wrapper=False)

Returns a list of jobs NOT in the platforms (Ready, Waiting)

Parameters:platform (HPCPlatform) – job platform
Returns:jobs not in platforms
Return type:list
get_ordered_jobs_by_date_member()

Get the dictionary of jobs ordered according to wrapper’s expression divided by date and member

Returns:jobs ordered divided by date and member
Return type:dict
get_prepared(platform=None)

Returns a list of prepared jobs

Parameters:platform (HPCPlatform) – job platform
Returns:prepared jobs
Return type:list
get_queuing(platform=None, wrapper=False)

Returns a list of jobs queuing

Parameters:platform (HPCPlatform) – job platform
Returns:queued jobs
Return type:list
get_ready(platform=None, hold=False, wrapper=False)

Returns a list of ready jobs

Parameters:platform (HPCPlatform) – job platform
Returns:ready jobs
Return type:list
get_running(platform=None, wrapper=False)

Returns a list of jobs running

Parameters:platform (HPCPlatform) – job platform
Returns:running jobs
Return type:list
get_skipped(platform=None)

Returns a list of skipped jobs

Parameters:platform (HPCPlatform) – job platform
Returns:skipped jobs
Return type:list
get_submitted(platform=None, hold=False, wrapper=False)

Returns a list of submitted jobs

Parameters:platform (HPCPlatform) – job platform
Returns:submitted jobs
Return type:list
get_suspended(platform=None, wrapper=False)

Returns a list of suspended jobs

Parameters:platform (HPCPlatform) – job platform
Returns:suspended jobs
Return type:list
get_uncompleted(platform=None, wrapper=False)

Returns a list of uncompleted jobs

Parameters:platform (HPCPlatform) – job platform
Returns:uncompleted jobs
Return type:list
get_unknown(platform=None, wrapper=False)

Returns a list of jobs in an unknown state

Parameters:platform (HPCPlatform) – job platform
Returns:unknown state jobs
Return type:list
get_unsubmitted(platform=None, wrapper=False)

Returns a list of unsubmitted jobs

Parameters:platform (HPCPlatform) – job platform
Returns:unsubmitted jobs
Return type:list
get_waiting(platform=None, wrapper=False)

Returns a list of jobs waiting

Parameters:platform (HPCPlatform) – job platform
Returns:waiting jobs
Return type:list
get_waiting_remote_dependencies(platform_type='slurm')

Returns a list of jobs waiting on slurm scheduler

Parameters:platform (HPCPlatform) – job platform
Returns:waiting jobs
Return type:list
graph

Returns the graph

Returns:graph
Return type:networkx graph
load()

Recreates a stored job list from the persistence

Returns:loaded job list object
Return type:JobList
static load_file(filename)

Recreates a stored job list from the pickle file

Parameters:filename (str) – pickle file to load
Returns:loaded joblist object
Return type:JobList
parameters

List of parameters common to all jobs

Returns:parameters
Return type:dict

print_with_status(statusChange=None, nocolor=False, existingList=None)

Returns the string representation of the dependency tree of the Job List

Parameters:
  • statusChange (List of strings) – List of changes in the list, supplied in set status
  • nocolor (Boolean) – True if the result should not include color codes
  • existingList (List of Job Objects) – External List of Jobs that will be printed, this excludes the inner list of jobs.
Returns:

String representation

Return type:

String

remove_rerun_only_jobs(notransitive=False)

Removes all jobs to be run only in reruns

rerun(chunk_list, notransitive=False, monitor=False)

Updates job list to rerun the jobs specified by chunk_list

Parameters:chunk_list (str) – list of chunks to rerun
save()

Persists the job list

sort_by_id()

Returns a list of jobs sorted by id

Returns:jobs sorted by ID
Return type:list
sort_by_name()

Returns a list of jobs sorted by name

Returns:jobs sorted by name
Return type:list
sort_by_status()

Returns a list of jobs sorted by status

Returns:job sorted by status
Return type:list
sort_by_type()

Returns a list of jobs sorted by type

Returns:job sorted by type
Return type:list
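
The four sort_by_* methods can be pictured as sorting the same list on different attributes. The sketch assumes ascending order and uses a hypothetical JobStub tuple with made-up values; the real methods operate on the JobList's inner list of Job objects.

```python
from collections import namedtuple

# Hypothetical stand-in for Job with only the sortable attributes.
JobStub = namedtuple("JobStub", ["name", "id", "status", "type"])

jobs = [
    JobStub("b_SIM", 12, 5, 0),
    JobStub("a_INI", 30, 4, 1),
]
by_id = sorted(jobs, key=lambda job: job.id)
by_name = sorted(jobs, key=lambda job: job.name)
by_status = sorted(jobs, key=lambda job: job.status)
```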
update_from_file(store_change=True)

Updates the job list on the fly from an update file

Parameters:store_change – if True, renames the update file to avoid reloading it at the next iteration

update_genealogy(new=True, notransitive=False, update_structure=False)

When the job list is created, every type of job is created. Update genealogy removes jobs that have no templates

Parameters:new (bool) – whether it is a new job list

update_list(as_conf, store_change=True, fromSetStatus=False, submitter=None, first_time=False)

Updates job list, resetting failed jobs and changing to READY all WAITING jobs with all parents COMPLETED

Parameters:as_conf (AutosubmitConfig) – autosubmit config object
Returns:True if job statuses were modified, False otherwise
Return type:bool
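
The promotion step described for update_list (WAITING jobs become READY once all their parents are COMPLETED) can be sketched as follows. JobStub and promote_waiting are hypothetical names, statuses are plain strings for readability, and the resetting of failed jobs is deliberately left out.

```python
class JobStub:
    """Hypothetical stand-in for Job with a status and parent links."""

    def __init__(self, name, status, parents=()):
        self.name = name
        self.status = status
        self.parents = set(parents)


def promote_waiting(jobs):
    """Set WAITING jobs to READY when every parent is COMPLETED."""
    changed = False
    for job in jobs:
        if job.status == "WAITING" and all(
            parent.status == "COMPLETED" for parent in job.parents
        ):
            job.status = "READY"
            changed = True
    return changed


ini = JobStub("INI", "COMPLETED")
sim = JobStub("SIM", "WAITING", parents=[ini])
post = JobStub("POST", "WAITING", parents=[sim])
modified = promote_waiting([ini, sim, post])
```

Here SIM becomes READY because INI is COMPLETED, while POST stays WAITING because SIM has not completed yet. The boolean result mirrors the documented return value.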