Job scheduling under the job forwarding model

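With this model, scheduling of MultiCluster jobs proceeds in three phases: local scheduling in the submission cluster, job forwarding to a remote queue, and remote scheduling in the execution cluster.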

Phase I, local scheduling phase (all jobs)

  1. The send-jobs queue receives the job submission request from a user.
  2. The send-jobs queue parameters determine whether the job is accepted. For example, a job that requires 100 MB of memory is rejected if queue-level parameters specify a memory limit of only 50 MB.
  3. If the job is accepted, it becomes pending in the send-jobs queue with a job ID assigned by the submission cluster.
  4. During the next scheduling cycle, the send-jobs queue attempts to place the job on a host in the submission cluster. If a suitable host is found, the job is dispatched locally.
  5. If the job cannot be placed locally (local hosts may not satisfy its resource requirements, or all the local hosts could be busy), the send-jobs queue attempts to forward the job to another cluster.
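
The local scheduling steps above can be sketched as a small decision function. This is a minimal illustration, not LSF code: the `Host` and `Job` types, the `local_phase` function, and the single memory check are hypothetical stand-ins for the queue-level parameters and resource requirements that a real cluster evaluates.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Host:
    name: str
    free_mem_mb: int
    busy: bool = False


@dataclass
class Job:
    job_id: int   # assigned by the submission cluster
    mem_mb: int   # memory the job requires


def local_phase(job: Job, queue_mem_limit_mb: int, hosts: List[Host]) -> str:
    """Return 'rejected', 'dispatched:<host>', or 'forward' (hypothetical labels)."""
    # Step 2: queue-level limits decide whether the job is accepted at all.
    if job.mem_mb > queue_mem_limit_mb:
        return "rejected"
    # Step 4: try to place the accepted job on a host in the submission cluster.
    for host in hosts:
        if not host.busy and host.free_mem_mb >= job.mem_mb:
            return f"dispatched:{host.name}"
    # Step 5: no local host is suitable, so the job becomes a forwarding candidate.
    return "forward"
```

For example, a job that needs 100 MB is rejected by a queue limited to 50 MB, is dispatched locally when an idle host has enough memory, and is handed to the forwarding phase when every local host is busy or too small.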

Phase II, job forwarding phase (MultiCluster submission queues only)

  1. The send-jobs queue has a list of remote receive-jobs queues that it can forward jobs to. If a job cannot be placed locally, the send-jobs queue evaluates each receive-jobs queue. All queues that will accept more MultiCluster jobs are candidates. To find out how many additional MultiCluster jobs a queue can accept, subtract the number of MultiCluster jobs already pending in the queue from the queue’s pending MultiCluster job threshold (IMPT_JOBBKLG).
  2. If information available to the submission cluster indicates that the first queue is suitable, LSF forwards the job to that queue.
    1. By default, only queue capacity is considered and the first queue evaluated is the one that has room to accept the most new MultiCluster jobs.
    2. When MC_PLUGIN_REMOTE_RESOURCE=Y is set, boolean resource requirements and available remote resources are considered.
      Tip:

      When MC_PLUGIN_REMOTE_RESOURCE is defined, only the following resource requirements (boolean only) are supported: -R "type==type_name", -R "same[type]", and -R "defined(resource_name)"

    3. When MC_PLUGIN_SCHEDULE_ENHANCE is defined, remote resources are considered in the same way as for MC_PLUGIN_REMOTE_RESOURCE=Y. In addition, depending on the settings selected, the scheduler considers preemptable jobs in the remote queue, queue priority, and queue workload.
  3. If the first queue is not suitable, LSF considers the next queue.
  4. If LSF cannot forward the job to any of the receive-jobs queues, the job remains pending in the send-jobs queue in the submission cluster and is evaluated again during the next scheduling cycle.
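
The default forwarding behavior described above (capacity only, queue with the most room first) can be sketched as follows. This is an illustrative model under stated assumptions, not the scheduler's implementation: `ReceiveQueue`, `candidate_order`, `choose_queue`, and the `is_suitable` callback are hypothetical names, and the callback stands in for whatever additional checks apply when the plugin parameters are set.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ReceiveQueue:
    name: str
    impt_jobbklg: int      # pending MultiCluster job threshold (IMPT_JOBBKLG)
    pending_mc_jobs: int   # MultiCluster jobs already pending in the queue

    def room(self) -> int:
        # Additional MultiCluster jobs the queue can accept.
        return self.impt_jobbklg - self.pending_mc_jobs


def candidate_order(queues: List[ReceiveQueue]) -> List[ReceiveQueue]:
    # Only queues that can accept more jobs are candidates;
    # the queue with the most room is evaluated first.
    candidates = [q for q in queues if q.room() > 0]
    return sorted(candidates, key=lambda q: q.room(), reverse=True)


def choose_queue(
    queues: List[ReceiveQueue],
    is_suitable: Callable[[ReceiveQueue], bool] = lambda q: True,
) -> Optional[ReceiveQueue]:
    # Walk the candidates in order; fall through to the next queue
    # when one is judged unsuitable.
    for queue in candidate_order(queues):
        if is_suitable(queue):
            return queue
    # No queue can take the job: it stays pending until the next cycle.
    return None
```

With thresholds and pending counts of 50/45, 50/10, and 20/20 for three hypothetical queues A, B, and C, the room is 5, 40, and 0 respectively, so B is evaluated first, A second, and C is not a candidate.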

Phase III, remote scheduling phase (MultiCluster jobs only)

  1. The receive-jobs queue receives the MultiCluster job submission.
  2. The receive-jobs queue parameters determine whether the job is accepted. For example, a job that requires 100 MB of memory is rejected if queue-level parameters specify a memory limit of only 50 MB.
  3. If the job is rejected, it returns to the submission cluster.
  4. If the job is accepted, it becomes pending in the receive-jobs queue with a new job ID assigned by the execution cluster.
  5. During the next scheduling cycle, the receive-jobs queue attempts to place the job on a host in the execution cluster. If a suitable host is found, the job is dispatched. If a suitable host is not found, the job remains pending in the receive-jobs queue in the execution cluster and is evaluated again during the next scheduling cycle.
  6. If the job is dispatched to the execution host but cannot start within the time interval MAX_RSCHED_TIME (lsb.queues), it returns to the submission cluster to be rescheduled. However, if the job repeatedly returns to the submission cluster because it could not be started in a remote cluster, LSF suspends the job (PSUSP) in the submission cluster after LSB_MC_INITFAIL_RETRY (lsf.conf) attempts to start it.
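
The return-and-retry rule in step 6 can be modeled as a small state function. This is a sketch under assumptions, not LSF behavior verbatim: `handle_remote_attempt` and the `WAIT_REMOTE` label are hypothetical, the wait budget is expressed here in plain seconds for simplicity, and the exact way LSF counts attempts against LSB_MC_INITFAIL_RETRY may differ.

```python
from typing import Tuple


def handle_remote_attempt(
    started: bool,
    waited_s: float,
    max_wait_s: float,     # stand-in for the step 6 time interval
    failed_starts: int,    # earlier attempts that failed to start remotely
    retry_limit: int,      # stand-in for LSB_MC_INITFAIL_RETRY
) -> Tuple[str, int]:
    """Return (job state, updated failed-start count)."""
    if started:
        return "RUN", failed_starts
    if waited_s < max_wait_s:
        # Still inside the allowed interval: keep waiting in the execution cluster.
        return "WAIT_REMOTE", failed_starts
    # Interval expired: the job returns to the submission cluster.
    failed_starts += 1
    if failed_starts >= retry_limit:
        # Too many failed remote starts: suspend in the submission cluster.
        return "PSUSP", failed_starts
    # Otherwise the job is pending again and can be rescheduled.
    return "PEND", failed_starts
```

With a hypothetical retry limit of 3, the first two expired attempts return the job to the pending state, and the third leaves it suspended until it is resumed.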