9. Tasks

9.1. Task dependencies

ArmoniK supports data dependencies between tasks: a task is executed only when all of its input data are available. Input data can be produced by other tasks.

A task cannot directly wait for another task, because a task that does nothing but wait would occupy a pod for no reason. Similarly, a task cannot wait for the completion of its child tasks. Child tasks are submitted only when the parent task completes successfully, which simplifies the management of the children when something goes wrong during the execution of the parent task.
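
The data-dependency rule can be illustrated with a minimal sketch. It is not ArmoniK code: the `Task` record, `ready_tasks`, and `run_all` are hypothetical names introduced here, and the scheduling loop is a simplification that only shows the rule "a task runs once all its input data exist, and its outputs may unblock other tasks".

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical task record: names of the data it reads and writes."""
    task_id: str
    inputs: set[str] = field(default_factory=set)
    outputs: set[str] = field(default_factory=set)


def ready_tasks(pending: list[Task], available: set[str]) -> list[Task]:
    """Return the pending tasks whose input data are all available."""
    return [t for t in pending if t.inputs <= available]


def run_all(tasks: list[Task], initial_data: set[str]) -> list[str]:
    """Simulate an execution order that respects data dependencies."""
    available = set(initial_data)
    pending = list(tasks)
    order: list[str] = []
    while pending:
        runnable = ready_tasks(pending, available)
        if not runnable:
            # The remaining tasks need data that no task will ever produce.
            raise RuntimeError(
                "unsatisfiable dependencies: "
                + ", ".join(t.task_id for t in pending)
            )
        for task in runnable:
            order.append(task.task_id)
            available |= task.outputs  # outputs become inputs for other tasks
            pending.remove(task)
    return order
```

In this sketch no task ever blocks while waiting: a task whose inputs are missing is simply not selected, which mirrors the idea that a waiting task should not occupy a pod.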

9.2. Error management

  graph TD
  Task[Task Ends] --> Error{Return Output}
  Error -->|OK| Completed[Task Completed]
  Error -->|KO| Failed
  Error -->|Throw gRPC Error| grpcex{retry < MaxRetries}
  grpcex -->|Yes| SubmitDuplicate[Task Error + Submit a duplicated task]
  grpcex -->|No| Failed[Task Error + No retries]
    
  • Task Completed

    • Status Ok

    • Subtask creation OK: child tasks are created as they are received (with the Creating status, i.e. created “on the fly”) and submitted at the end of the parent task

    • Outputs OK

  • Task Error

    • Cancellation of outputs and child tasks

    • Error not managed by the application (a gRPC error is raised and transferred to the polling agent): outputs from child tasks are cancelled

    • Creation of a duplicate task (with a link for monitoring and the retry count) and transfer of the output creation responsibility to the duplicate

  • Task resubmission

    • Copy of the task metadata with a new id
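
The decision flow described above can be sketched as follows. This is an illustration under stated assumptions, not ArmoniK's implementation: `Outcome`, `Decision`, `TaskMeta`, and `on_task_end` are hypothetical names, and the use of a random UUID as the new id is an assumption made only for this example.

```python
import uuid
from dataclasses import dataclass, replace
from enum import Enum
from typing import Optional


class Outcome(Enum):
    OK = "ok"                  # the task returned a successful output
    KO = "ko"                  # the task returned an application-level failure
    GRPC_ERROR = "grpc_error"  # an unmanaged error was raised as a gRPC error


class Decision(Enum):
    COMPLETED = "Task Completed"
    FAILED = "Task Error + No retries"
    RETRY = "Task Error + Submit a duplicated task"


@dataclass(frozen=True)
class TaskMeta:
    """Hypothetical task metadata used only for this sketch."""
    task_id: str
    payload: str
    max_retries: int
    retry: int = 0                  # number of retries already performed
    retry_of: Optional[str] = None  # link to the duplicated task, for monitoring


def on_task_end(task: TaskMeta, outcome: Outcome):
    """Return (decision, duplicate); duplicate is None unless a retry is submitted."""
    if outcome is Outcome.OK:
        return Decision.COMPLETED, None
    if outcome is Outcome.KO:
        return Decision.FAILED, None
    # gRPC error: retry only while the retry budget is not exhausted.
    if task.retry < task.max_retries:
        duplicate = replace(
            task,
            task_id=str(uuid.uuid4()),  # copy of the metadata with a new id
            retry=task.retry + 1,
            retry_of=task.task_id,
        )
        return Decision.RETRY, duplicate
    return Decision.FAILED, None
```

The duplicate keeps the payload and retry limit of the original task, so the same check applies to it if it fails in turn.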

9.3. Status state diagram

    stateDiagram-v2
    [*] --> Creating: Task is created in the db
    Creating --> Pending: Dependencies are not available<br/>Creation rpc succeeds
    Pending --> Submitted: Dependencies are available<br/>Task is inserted into the queue<br/>Session is running
    Submitted --> Paused: Session is paused
    Paused --> Submitted: Session is resumed
    Pending --> Paused: Dependencies are available<br/>Session is paused
    Submitted --> Dispatched: Task is acquired
    Dispatched --> Processing: Task is sent to the worker
    Dispatched --> Submitted: Task acquisition timeout
    Processing --> Completed: Task succeeds
    Processing --> Error: Task fails, no retry
    Processing --> Retried: Task fails, recovery possible<br/>A new task is created
    Processing --> Timeout: Task exceeds MaxDuration
    Creating --> Cancelled: Task is marked as cancelled
    Submitted --> Cancelled: Task is marked as cancelled
    Dispatched --> Cancelled: Task is marked as cancelled
    Processing --> Cancelled: Task is forcibly cancelled
    Paused --> Cancelled: Task is marked as cancelled
    Dispatched --> Paused: Session is paused
    Cancelled --> [*]
    Timeout --> [*]
    Completed --> [*]
    Retried --> [*]
    Error --> [*]
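
The transitions in the diagram above can be transcribed into a lookup table, which makes it easy to check whether a recorded status history is consistent with the diagram. The names `TRANSITIONS`, `is_final`, `can_transition`, and `is_valid_history` are hypothetical helpers for this sketch and are not part of any ArmoniK API; the table contains only the edges drawn in the diagram.

```python
# Allowed transitions, transcribed edge by edge from the state diagram.
TRANSITIONS: dict[str, set[str]] = {
    "Creating": {"Pending", "Cancelled"},
    "Pending": {"Submitted", "Paused"},
    "Submitted": {"Paused", "Dispatched", "Cancelled"},
    "Paused": {"Submitted", "Cancelled"},
    "Dispatched": {"Processing", "Submitted", "Paused", "Cancelled"},
    "Processing": {"Completed", "Error", "Retried", "Timeout", "Cancelled"},
    # Final statuses: no outgoing transition.
    "Completed": set(),
    "Error": set(),
    "Retried": set(),
    "Timeout": set(),
    "Cancelled": set(),
}


def is_final(status: str) -> bool:
    """A status is final when the diagram draws no transition out of it."""
    return not TRANSITIONS[status]


def can_transition(src: str, dst: str) -> bool:
    """True if the diagram has an edge from src to dst."""
    return dst in TRANSITIONS.get(src, set())


def is_valid_history(history: list[str]) -> bool:
    """Check that a status history starts at Creating and follows the edges."""
    if not history or history[0] != "Creating":
        return False
    return all(can_transition(a, b) for a, b in zip(history, history[1:]))
```

For example, a task that times out during acquisition and is later paused follows `Creating`, `Pending`, `Submitted`, `Dispatched`, `Submitted`, `Paused`, which the checker accepts, whereas a jump from `Pending` straight to `Processing` is rejected.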