Besides the filters, another option for large workflows is to group jobs. This option is available through the
group_by keyword, which accepts the values date, member, chunk, split and automatic.
For the first four values, the grouping criterion is explicitly defined by the value itself.
In addition, some of the dates/members/chunks that would otherwise be grouped can be expanded, either by status (expand_status), by listing the specific dates/members/chunks not to group (expand), or both.
The syntax of the expand option is almost the same as for the filters:
[ date1 [ member1 [ chunk1 chunk2 ] member2 [ chunk3 ... ] ... ] date2 [ member3 [ chunk1 ] ] ... ]
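To make the nesting of this syntax concrete, here is a minimal Python sketch that parses an expand string into nested dictionaries. It is only an illustration under stated assumptions, not Autosubmit's own parser: the function name parse_expand and the returned structure are hypothetical, and chunks are assumed to be integers.

```python
import re


def parse_expand(text):
    """Parse '[ date [ member [ chunk ... ] ... ] ... ]' (hypothetical helper).

    Returns {date: {member: [chunks] or None} or None}; None means the
    whole date or member was listed without a nested selection.
    """
    # Brackets are tokens on their own, so "[1 2]" and "[ 1 2 ]" both work.
    tokens = re.findall(r"\[|\]|[^\s\[\]]+", text)
    pos = 0

    def parse_level(depth):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != "[":
            raise ValueError("expected '['")
        pos += 1
        if depth == 2:
            # Innermost level: a flat list of chunk numbers.
            items = []
            while pos < len(tokens) and tokens[pos] not in ("[", "]"):
                items.append(int(tokens[pos]))
                pos += 1
        else:
            # Date level (depth 0) or member level (depth 1).
            items = {}
            while pos < len(tokens) and tokens[pos] not in ("[", "]"):
                name = tokens[pos]
                pos += 1
                if pos < len(tokens) and tokens[pos] == "[":
                    items[name] = parse_level(depth + 1)
                else:
                    items[name] = None
        if pos >= len(tokens) or tokens[pos] != "]":
            raise ValueError("expected ']'")
        pos += 1
        return items

    result = parse_level(0)
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return result


print(parse_expand("[ 20000101 [ fc0 [1 2] ] 20000202 [ fc1 ] ]"))
```

In this sketch, a date or member without a nested selection maps to None, which a caller could read as "expand everything below this level".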
The grouping option is also available in the autosubmit monitor, create, setstatus and recovery commands.
Consider the following workflow:
Group by date
-group_by=date -expand="[ 20000101 ]"
-group_by=date -expand_status="FAILED RUNNING"
-group_by=date -expand="[ 20000101 ]" -expand_status="FAILED RUNNING"
Group by member
-group_by=member -expand="[ 20000101 [ fc0 fc1 ] 20000202 [ fc0 ] ]"
-group_by=member -expand_status="FAILED QUEUING"
-group_by=member -expand="[ 20000101 [ fc0 fc1 ] 20000202 [ fc0 ] ]" -expand_status="FAILED QUEUING"
Group by chunk
If there are jobs synchronized between members or dates, then a connection between groups is shown:
-group_by=chunk -expand="[ 20000101 [ fc0 [ 1 2 ] ] 20000202 [ fc1 ] ]"
-group_by=chunk -expand_status="FAILED RUNNING"
-group_by=chunk -expand="[ 20000101 [ fc0 ] 20000202 [ fc1 [ 1 2 ] ] ]" -expand_status="FAILED RUNNING"
Group by split
If there are chunk jobs that are split, the splits can also be grouped.
Understanding the group status
If jobs with different statuses are grouped together, the status of the group is determined as follows. If at least one job has failed, the status of the group is FAILED. If there are no failures but at least one job is running, the status is RUNNING. The same rule applies down the rest of the hierarchy: SUBMITTED, QUEUING, READY, WAITING, SUSPENDED, UNKNOWN. A group status of COMPLETED means that all jobs in the group were completed.
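The precedence rule described here can be sketched in a few lines of Python. This is an illustration of the rule as stated in the text, not Autosubmit's actual code; the names STATUS_PRIORITY and group_status are hypothetical.

```python
# Highest-priority status first, following the hierarchy in the text.
STATUS_PRIORITY = [
    "FAILED", "RUNNING", "SUBMITTED", "QUEUING",
    "READY", "WAITING", "SUSPENDED", "UNKNOWN",
]


def group_status(statuses):
    """Return the status of a group given the statuses of its jobs."""
    present = set(statuses)
    if not present:
        raise ValueError("a group needs at least one job")
    # The first status in the hierarchy that appears in the group wins.
    for status in STATUS_PRIORITY:
        if status in present:
            return status
    # Nothing from the hierarchy is present: every job must be COMPLETED.
    if present == {"COMPLETED"}:
        return "COMPLETED"
    raise ValueError(f"unrecognized statuses: {sorted(present)}")


print(group_status(["COMPLETED", "RUNNING", "WAITING"]))  # RUNNING
print(group_status(["COMPLETED", "COMPLETED"]))           # COMPLETED
```

A single FAILED job therefore marks the whole group as FAILED, regardless of how many other jobs completed.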
For automatic grouping, the groups are created by collapsing the split->chunk->member->date levels whose jobs share the same status (following this hierarchy). In the following workflow, automatic grouping created the group 20000101_fc0, since all the jobs for this date and member were completed, and the groups 20000101_fc1_3, 20000202_fc0_2, 20000202_fc0_3 and 20000202_fc1, since all the jobs up to the respective group granularity share the same status, WAITING.
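One way to picture this collapsing is a top-down pass that groups at the coarsest level where all jobs share a status. The Python sketch below is an assumption about how such a pass could look, not Autosubmit's implementation: the function auto_groups and the job data are hypothetical, every job is assumed to carry a full (date, member, chunk) key, and the split level is left out for brevity.

```python
from collections import defaultdict


def auto_groups(jobs, depth=1, max_depth=3):
    """Collapse jobs into groups at the coarsest level with a single status.

    jobs: list of ((date, member, chunk), status) pairs.
    Returns {group_name: status}; jobs in mixed-status chunks stay ungrouped.
    """
    by_prefix = defaultdict(list)
    for key, status in jobs:
        by_prefix[key[:depth]].append((key, status))

    groups = {}
    for prefix, members in by_prefix.items():
        statuses = {status for _, status in members}
        if len(statuses) == 1:
            # Every job under this prefix agrees: collapse into one group.
            groups["_".join(str(part) for part in prefix)] = statuses.pop()
        elif depth < max_depth:
            # Mixed statuses: retry one level finer (date -> member -> chunk).
            groups.update(auto_groups(members, depth + 1, max_depth))
    return groups


# Hypothetical job statuses chosen to reproduce the groups named in the text.
jobs = [
    (("20000101", "fc0", 1), "COMPLETED"), (("20000101", "fc0", 2), "COMPLETED"),
    (("20000101", "fc1", 1), "COMPLETED"), (("20000101", "fc1", 1), "RUNNING"),
    (("20000101", "fc1", 2), "RUNNING"), (("20000101", "fc1", 2), "WAITING"),
    (("20000101", "fc1", 3), "WAITING"),
    (("20000202", "fc0", 1), "RUNNING"), (("20000202", "fc0", 1), "WAITING"),
    (("20000202", "fc0", 2), "WAITING"), (("20000202", "fc0", 3), "WAITING"),
    (("20000202", "fc1", 1), "WAITING"), (("20000202", "fc1", 2), "WAITING"),
]
for name, status in sorted(auto_groups(jobs).items()):
    print(name, status)
```

With this input, the sketch yields 20000101_fc0 (COMPLETED) and the WAITING groups 20000101_fc1_3, 20000202_fc0_2, 20000202_fc0_3 and 20000202_fc1, while the chunks with mixed statuses remain ungrouped.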
Especially when monitoring an experiment with a very large number of chunks, it can be useful to hide the groups created automatically. This makes it easier to see the chunks that contain jobs with different statuses, which is a good indication that something is currently happening within those chunks (jobs ready, submitted, running, queuing or failed).