About Platform Session Scheduler

While traditional Platform LSF job submission, scheduling, and dispatch methods such as job arrays or job chunking are well suited to a mix of long and short running jobs, or jobs with dependencies on each other, Session Scheduler is ideal for large volumes of independent jobs with short run times.

As clusters grow and the volume of workload increases, the need to delegate scheduling decisions increases. Session Scheduler improves throughput and performance of the LSF scheduler by enabling multiple tasks to be submitted as a single LSF job.

Platform Session Scheduler implements a hierarchical, personal scheduling paradigm that provides very low-latency execution. With very low latency per job, Session Scheduler is ideal for executing very short jobs, whether they are a list of tasks or job arrays with parametric execution.

Session Scheduler lets users run large collections of short-duration tasks within the allocation of a single LSF job. It uses a job-level task scheduler that allocates resources for the job once and reuses the allocated resources for each task.

Each Session Scheduler is dynamically scheduled in a similar manner to a parallel job. Each instance of the ssched command then manages its own workload within its assigned allocation. Work is submitted as a task array or a task definition file.

Session Scheduler satisfies the following goals for running a large volume of short jobs:
  • Minimize the latency when scheduling short jobs

  • Improve overall cluster utilization and system performance

  • Allocate resources according to LSF policies

  • Support existing LSF features such as pre-execution and post-execution programs, job starters, and resource limits

  • Handle thousands of users and more than 50,000 short jobs per user

Session Scheduler system requirements

Supported operating systems
Session Scheduler is delivered in the following distributions:
  • lsf8.0_ssched_linux2.4-glibc2.3-x86.tar.Z

  • lsf8.0_ssched_linux2.6-glibc2.3-x86.tar.Z

  • lsf8.0_ssched_linux2.6-glibc2.3-x86_64.tar.Z

Required libraries

Note: These libraries may not be installed by default by all Linux distributions.

On Linux 2.4 (x86 and x86_64), the following external libraries are required:
  • libstdc++.so.5

  • libpthread-0.60.so or later

  • libgcc_s.so.1

On Linux 2.6 (x86), the following external libraries are required:
  • libstdc++.so.5

  • libpthread-2.3.4.so or later

On Linux 2.6 (x86_64), the following external libraries are required:
  • libstdc++.so.5

  • libpthread-2.3.4.so or later

  • libxml2.so.2

Compatible Linux distributions
Certified compatible distributions include:
  • Red Hat Enterprise Linux AS 3 or later

  • SUSE Linux Enterprise Server 10

Platform LSF

Session Scheduler is included in Platform LSF 7 Update 3 or later.

Session Scheduler terminology

Job

A traditional LSF job that is individually scheduled and dispatched to sbatchd by mbatchd and mbschd.

Task

Similar to a job, a unit of workload that describes an executable and its environment that runs on an execution node. Tasks are managed and dispatched by the Session Scheduler.

Job Session

An LSF job that is individually scheduled by mbatchd, but is not dispatched as an LSF job. Instead, a running Session Scheduler job session represents an allocation of nodes for running large collections of tasks.

Scheduler

The component that accepts and dispatches tasks within the nodes allocated for a job session.

Session Scheduler architecture

Session Scheduler jobs are submitted, scheduled, and dispatched like normal LSF jobs.

When the Session Scheduler begins running, it starts a Session Scheduler execution agent on each host in its allocation.

The Session Scheduler then reads in the task definition file, which contains a list of tasks to run. Tasks are sent to an execution agent and run. When a task finishes, the next task in the list is dispatched to the available host. This continues until all tasks have been run.
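
The dispatch loop described above can be illustrated with a small simulation. This is only a sketch of the general "next task goes to the first free agent" idea, not the actual ssched implementation; the function name simulate_dispatch and the sample runtimes are hypothetical.

```python
import heapq


def simulate_dispatch(task_runtimes, num_agents):
    """Simulate sending each task, in list order, to the first agent that is free.

    Returns (makespan, assignment), where assignment[i] is the index of the
    agent that ran task i. Illustrative model only, not ssched itself.
    """
    # Each heap entry is (time_the_agent_becomes_free, agent_index).
    free_at = [(0.0, agent) for agent in range(num_agents)]
    heapq.heapify(free_at)

    assignment = []
    makespan = 0.0
    for runtime in task_runtimes:
        # The next task in the list goes to whichever agent frees up first.
        start, agent = heapq.heappop(free_at)
        finish = start + runtime
        assignment.append(agent)
        makespan = max(makespan, finish)
        heapq.heappush(free_at, (finish, agent))
    return makespan, assignment


# Hypothetical task runtimes (seconds) on a two-agent allocation.
makespan, assignment = simulate_dispatch([5, 3, 2, 4, 1], num_agents=2)
print(makespan, assignment)
```

Because the allocation is fixed for the life of the job session, the only per-task cost in this model is picking the next free agent, which is why the approach suits very short tasks.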

Tasks submitted through Session Scheduler bypass the LSF mbatchd and mbschd. The LSF mbatchd is unaware of individual tasks.

Session Scheduler components

Session Scheduler comprises the following components.

Session Scheduler command (ssched)

The ssched command accepts and dispatches tasks within the nodes allocated for a job session. It reads the task definition file and sends tasks to the execution agents. ssched also logs errors, performs task accounting, and requeues tasks as necessary.
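
As an illustration, a task definition file for a parameter sweep could be generated with a short script. This sketch assumes the file holds one task command per line with shell-style quoting; check the ssched reference for the exact format and any per-task options. The ./simulate program, its --seed argument, and the helper name write_task_file are hypothetical.

```python
import os
import shlex
import tempfile
from pathlib import Path


def write_task_file(path, commands):
    """Write one shell-quoted command per line and return the number of tasks.

    Assumption: the task definition file lists one task command per line.
    """
    lines = [shlex.join(argv) for argv in commands]
    Path(path).write_text("\n".join(lines) + "\n")
    return len(lines)


# Hypothetical sweep: run ./simulate once per seed value.
task_dir = tempfile.mkdtemp()
task_file = os.path.join(task_dir, "my.tasks")
count = write_task_file(
    task_file,
    [["./simulate", "--seed", str(seed)] for seed in range(3)],
)
print(count, "tasks written to", task_file)
```

The resulting file could then be passed to ssched with the -tasks option inside a job submitted to the ssched application profile.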

sservice and sschild

These components are the execution agents. They run on each remote host in the allocation. They set up the task execution environment, run the tasks, and enable task monitoring and resource usage collection.

Session Scheduler performance

Session Scheduler has been tested to support up to 50,000 tasks. Based on performance tests, the best maximum allocation size (specified by bsub -n) depends on the average runtime of the tasks. Here are some typical results:

Average runtime (seconds)    Recommended maximum allocation size (slots)
0                            12
5                            64
15                           256
30                           512
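
For scripting, the typical results listed above can be turned into a simple lookup. The sketch below conservatively picks the closest listed runtime at or below the given average; that rounding rule and the function name recommended_max_slots are assumptions for illustration, not part of the product.

```python
import bisect

# (average task runtime in seconds, recommended maximum slots), as listed above.
RECOMMENDED = [(0, 12), (5, 64), (15, 256), (30, 512)]


def recommended_max_slots(avg_runtime_seconds):
    """Return the recommended maximum allocation size for an average runtime.

    Assumption: runtimes between two listed values use the lower row, and
    runtimes above the last listed value use the last row.
    """
    if avg_runtime_seconds < 0:
        raise ValueError("average runtime must be non-negative")
    thresholds = [runtime for runtime, _ in RECOMMENDED]
    # Index of the last listed runtime that is <= the given runtime.
    index = bisect.bisect_right(thresholds, avg_runtime_seconds) - 1
    return RECOMMENDED[index][1]


print(recommended_max_slots(20))
```

The returned value would be used as an upper bound for the bsub -n request.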


Session Scheduler licensing

You must license each cluster or host with a Session Scheduler license on a per-core basis. Full and partial licensing modes are supported.

Full mode

In this mode, licenses are required for each core in the whole cluster. If there are not enough available licenses for the whole cluster, only the hosts enabled with Session Scheduler can run Session Scheduler tasks.

Append the keyword "LSF_Session_Scheduler" to PRODUCTS in the Parameters section of the lsf.cluster file as follows:

Begin Parameters
PRODUCTS=LSF_Make LSF_Base LSF_Manager LSF_Session_Scheduler
# LSF_HOST_ADDR_RANGE=*.*.*.*
# FLOAT_CLIENTS_ADDR_RANGE=*.*.*.*
# FLOAT_CLIENTS=10
End Parameters

Partial mode

In this mode, LSF requires Session Scheduler licenses for each core of selected hosts. Only hosts that are configured with "LSF_Session_Scheduler" require a license. If there are not enough "lsf_session_scheduler" licenses for all the selected hosts, only the hosts enabled with Session Scheduler can run Session Scheduler tasks.

Define "LSF_Session_Scheduler" in the RESOURCES column of the Host section in the lsf.cluster file as follows:

Begin   Host
HOSTNAME     model type server r1m mem swp RESOURCES #Keywords
w864         !     !    1      3.5 ()  ()  (mg)
iquadcore-02 !     !    1      3.5 ()  ()  (mg LSF_Session_Scheduler)
End Host
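
To check which hosts are enabled in partial mode, the Host section can be scanned for the resource name. This is a rough sketch that assumes each host entry sits on a single line and that RESOURCES is the last parenthesized group on that line; it is not a full lsf.cluster parser, and the helper name hosts_with_resource is hypothetical.

```python
def hosts_with_resource(cluster_text, resource="LSF_Session_Scheduler"):
    """Return host names whose RESOURCES column lists the given resource."""
    hosts = []
    in_host_section = False
    for raw in cluster_text.splitlines():
        # Drop trailing comments such as "#Keywords".
        line = raw.split("#", 1)[0].strip()
        tokens = line.split()
        if not tokens:
            continue
        marker = [token.lower() for token in tokens[:2]]
        if marker == ["begin", "host"]:
            in_host_section = True
            continue
        if marker == ["end", "host"]:
            in_host_section = False
            continue
        if not in_host_section or tokens[0].upper() == "HOSTNAME":
            continue
        # Assumption: RESOURCES is the last "(...)" group on the line.
        start, end = line.rfind("("), line.rfind(")")
        resources = line[start + 1:end].split() if 0 <= start < end else []
        if resource in resources:  # exact, case-sensitive match
            hosts.append(tokens[0])
    return hosts


SAMPLE = """\
Begin   Host
HOSTNAME     model type server r1m mem swp RESOURCES #Keywords
w864         !     !    1      3.5 ()  ()  (mg)
iquadcore-02 !     !    1      3.5 ()  ()  (mg LSF_Session_Scheduler)
End Host
"""

licensed_hosts = hosts_with_resource(SAMPLE)
print(licensed_hosts)
```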

Dispatching session jobs to licensed hosts

To have the scheduler dispatch session jobs to licensed hosts, specify a resource requirement string either at the job level (-R) or in the application profile (RESREQ in the "ssched" profile). Note that the -R option is suitable for partial licensing mode only.

bsub -app ssched -R "LSF_Session_Scheduler" ssched -tasks ./my.tasks

Administrators can also specify a dedicated hostgroup for job submission.