
About Platform LSF

Contents

Learn about Platform LSF

Before using Platform LSF for the first time, download and read the LSF Version 7 Release Notes for the latest information about what is new in the current release and other important details.

Cluster Concepts

Clusters, jobs, and queues

Cluster

A group of computers (hosts) running LSF that work together as a single unit, combining computing power and sharing workload and resources. A cluster provides a single-system image for disparate computing resources.

Hosts can be grouped into clusters in a number of ways. A cluster could contain:

Commands:

Configuration:

tip:  
The name of your cluster should be unique. It should not be the same as any host or queue.
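
For example, the lsid command displays the LSF version, the name of the cluster, and the name of the current master host:

    # Show the LSF version, the cluster name, and the current master host
    lsid
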
Job

A unit of work run in the LSF system. A job is a command submitted to LSF for execution. LSF schedules, controls, and tracks the job according to configured policies.

Jobs can be complex problems, simulation scenarios, extensive calculations, anything that needs compute power.
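
For example, any command line can be submitted as a job with bsub and then tracked with bjobs:

    # Submit a command as a batch job; LSF replies with a unique job ID
    bsub sleep 100

    # Display the state of your jobs (PEND, RUN, DONE, and so on)
    bjobs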

Commands:

Job slot

A job slot is a bucket into which a single unit of work is assigned in the LSF system.

If hosts are configured with a number of job slots, you can dispatch jobs from queues until all the job slots are filled.
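
For example, bhosts shows the job slot limit (MAX) and the number of job slots in use (NJOBS) on each server host:

    # Display job slot limits and current slot usage per server host
    bhosts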

Commands:

Configuration:

Job states

LSF jobs have the following states:

PEND: waiting in a queue to be scheduled and dispatched
RUN: dispatched to a host and running
DONE: finished normally, with a zero exit value
EXIT: ended because errors prevented it from completing, or because it was terminated
PSUSP, USUSP, SSUSP: suspended before dispatch, suspended by the user or administrator, or suspended by the system

Queue

A clusterwide container for jobs. All jobs wait in queues until they are scheduled and dispatched to hosts.

Queues do not correspond to individual hosts; each queue can use all server hosts in the cluster, or a configured subset of the server hosts.

When you submit a job to a queue, you do not need to specify an execution host. LSF dispatches the job to the best available execution host in the cluster to run that job.

Queues implement different job scheduling and control policies.
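
For example, bqueues lists the queues configured in the cluster, and the -q option of bsub submits a job to a named queue (the queue name priority below is only an assumption; substitute a queue that exists in your cluster):

    # List the queues in the cluster and their scheduling status
    bqueues

    # Submit a job to a specific queue instead of the default queue
    bsub -q priority my_job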

Commands:

Configuration:

tip:  
The names of your queues should be unique. They should not be the same as the cluster name or any host in the cluster.
First-come, first-served (FCFS) scheduling

The default type of scheduling in LSF. Jobs are considered for dispatch based on their order in the queue.

Hosts

Host

An individual computer in the cluster.

Each host may have more than one processor. Multiprocessor hosts are used to run parallel jobs. A multiprocessor host with a single process queue is considered a single machine, while a box full of processors that each have their own process queue is treated as a group of separate machines.
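
For example, lshosts displays static information about each host, including its type, model, and number of processors (ncpus):

    # Display static host information for every host in the cluster
    lshosts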

Commands:

tip:  
The names of your hosts should be unique. They should not be the same as the cluster name or any queue defined for the cluster.
Submission host

The host where jobs are submitted to the cluster.

Jobs are submitted using the bsub command or from an application that uses the LSF API.

Client hosts and server hosts can act as submission hosts.

Commands:

Execution host

The host where a job runs. Can be the same as the submission host. All execution hosts are server hosts.

Commands:

Server host

Hosts that are capable of submitting and executing jobs. A server host runs sbatchd to execute server requests and apply local policies.

An LSF cluster may consist of static and dynamic hosts. Dynamic host configuration allows you to add and remove hosts without manual reconfiguration.

By default, all configuration changes made to LSF are static. To add or remove hosts within the cluster, you must manually change the configuration and restart all master candidates.

Commands:

Configuration:

Client host

Hosts that are only capable of submitting jobs to the cluster. Client hosts run LSF commands and act only as submission hosts. Client hosts do not execute jobs or run LSF daemons.

Commands:

Configuration:

Floating client host

In LSF, you can have both client hosts and floating client hosts. The difference is in the type of license purchased.

If you purchased a regular (fixed) client license, LSF client hosts are static. The client hosts must be listed in lsf.cluster.cluster_name, and the license is fixed to the hosts specified there; whenever client hosts change, you must update lsf.cluster.cluster_name with the new host list.

If you purchased a floating client license, LSF floating client hosts are dynamic and are not listed in lsf.cluster.cluster_name. Because LSF counts floating licenses rather than tracking host names, clients can change dynamically, and licenses are distributed to whichever clients request to use LSF.

When you submit a job from an unlicensed host and a floating license is free, the host checks out a license and submits your job to LSF. Once a host checks out a floating client license, it keeps that license until midnight, behaving like a fixed client for the rest of the day. At midnight the host releases the license and reverts to a normal, unlicensed host, and the floating client license becomes available to any other host that needs it.

Master host

Where the master LIM and mbatchd run. An LSF server host that acts as the overall coordinator for that cluster. Each cluster has one master host to do all job scheduling and dispatch. If the master host goes down, another master candidate LSF server in the cluster becomes the master host.

All LSF daemons run on the master host. The LIM on the master host is the master LIM.

Commands:

Configuration:

LSF daemons

LSF daemon     Role
mbatchd        Job requests and dispatch
mbschd         Job scheduling
sbatchd, res   Job execution
lim            Host information
pim            Job process information
elim           Collect and track custom dynamic load indices
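
As a sketch of routine daemon control (hostA is a placeholder host name), the lsadmin and badmin commands restart these daemons:

    # Restart LIM and RES on a host (run as the LSF administrator)
    lsadmin limrestart hostA
    lsadmin resrestart hostA

    # Restart sbatchd on a host, and mbatchd on the master host
    badmin hrestart hostA
    badmin mbdrestart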

mbatchd

Master Batch Daemon running on the master host. Started by sbatchd. Responsible for the overall state of jobs in the system.

Receives job submission and information query requests. Manages jobs held in queues. Dispatches jobs to hosts as determined by mbschd.

Commands:

Configuration:

mbschd

Master Batch Scheduler Daemon running on the master host. Works with mbatchd. Started by mbatchd.

Makes scheduling decisions based on job requirements and policies.

sbatchd

Slave Batch Daemon running on each server host. Receives the request to run the job from mbatchd and manages local execution of the job. Responsible for enforcing local policies and maintaining the state of jobs on the host.

sbatchd forks a child sbatchd for every job. The child sbatchd runs an instance of res to create the execution environment in which the job runs. The child sbatchd exits when the job is complete.

Commands:

Configuration:

res

Remote Execution Server (RES) running on each server host. Accepts remote execution requests to provide transparent and secure remote execution of jobs and tasks.

Commands:

Configuration:

lim

Load Information Manager (LIM) running on each server host. Collects host load and configuration information and forwards it to the master LIM running on the master host. Reports the information displayed by lsload and lshosts.
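
For example, lsload displays the dynamic load indices that LIM collects:

    # Display dynamic load indices for all hosts
    lsload

    # Display all load indices, including external (site-defined) ones
    lsload -l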

Static indices are reported when the LIM starts up or when the number of CPUs (ncpus) changes. Static indices are:

Dynamic indices for host load collected at regular intervals are:

Commands:

Configuration:

Master LIM

The LIM running on the master host. Receives load information from the LIMs running on hosts in the cluster.

Forwards load information to mbatchd, which forwards this information to mbschd to support scheduling decisions. If the master LIM becomes unavailable, a LIM on another host automatically takes over.

Commands:

Configuration:

ELIM

External LIM (ELIM) is a site-definable executable that collects and tracks custom dynamic load indices. An ELIM can be a shell script or a compiled binary program, which returns the values of the dynamic resources you define. The ELIM executable must be named elim.anything and located in LSF_SERVERDIR.
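
The following is a minimal sketch of an ELIM written as a shell script. It assumes a custom dynamic resource named tmpspace has already been defined in lsf.shared; an ELIM periodically writes a load update string to its standard output in the form number_indices index_name index_value:

    #!/bin/sh
    # Hypothetical elim.tmpspace: reports one custom index, tmpspace,
    # as the percentage of /tmp space in use on this host.
    while true
    do
        used=`df -k /tmp | awk 'NR==2 {gsub("%",""); print $5}'`
        # Load update string: number of indices, then name-value pairs
        echo "1 tmpspace $used"
        sleep 30
    done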

pim

Process Information Manager (PIM) running on each server host. Started by LIM, which periodically checks on PIM and restarts it if it dies.

Collects information about job processes running on the host such as CPU and memory used by the job, and reports the information to sbatchd.

Commands:

Batch jobs and tasks

You can either run jobs through the batch system, where jobs are held in queues, or you can run tasks interactively without going through the batch system, such as tests.

Job

A unit of work run in the LSF system. A job is a command submitted to LSF for execution, using the bsub command. LSF schedules, controls, and tracks the job according to configured policies.

Jobs can be complex problems, simulation scenarios, extensive calculations, anything that needs compute power.

Commands:

Interactive batch job

A batch job that allows you to interact with the application and still take advantage of LSF scheduling policies and fault tolerance. All input and output are through the terminal that you used to type the job submission command.

When you submit an interactive job, a message is displayed while the job is awaiting scheduling. A new job cannot be submitted until the interactive job is completed or terminated.

The bsub command stops display of output from the shell until the job completes, and no mail is sent to you by default. Use Ctrl-C at any time to terminate the job.
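
For example, the -I family of bsub options submits interactive batch jobs:

    # Submit an interactive batch job; output returns to this terminal
    bsub -I my_program

    # Submit with a pseudo-terminal, for full-screen applications
    bsub -Ip vi myfile

    # Start an interactive shell as a batch job
    bsub -Is /bin/sh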

Commands:

Interactive task

A command that is not submitted to a batch queue and scheduled by LSF, but is dispatched immediately. LSF locates the resources needed by the task and chooses the best host among the candidate hosts that has the required resources and is lightly loaded. Each command can be a single process, or it can be a group of cooperating processes.

Tasks are run without using the batch processing features of LSF but still with the advantage of resource requirements and selection of the best host to run the task based on load.
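
For example, lsrun dispatches a task immediately to the best available host, and lsgrun runs a task on a group of hosts (hostA and hostB are placeholder host names):

    # Run a task immediately on the best available host
    lsrun my_task

    # Run a task on a host that satisfies a resource requirement
    lsrun -R "mem>64" my_task

    # Run the same task on several hosts
    lsgrun -m "hostA hostB" my_task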

Commands:

Local task

An application or command that does not make sense to run remotely. For example, the ls command on UNIX.

Commands:

Configuration:

Remote task

An application or command that can be run on another machine in the cluster.

Commands:

Configuration:

Host types and host models

Hosts in LSF are characterized by host type and host model.

For example, the host type X86_64 includes host models such as Opteron240, Opteron840, Intel_EM64T, and Intel_IA64.

Host type

The combination of operating system and host CPU architecture.

All computers that run the same operating system on the same computer architecture are of the same type - in other words, binary-compatible with each other.

Each host type usually requires a different set of LSF binary files.

Commands:

Configuration:

Host model

The specific model of the computer, which determines the CPU speed scaling factor (CPU factor) applied in load and placement calculations.

The CPU factor is taken into consideration when jobs are being dispatched.
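
For example, lsinfo lists the host types and host models known to the cluster, together with the CPU factor of each model:

    # List resource names, host types, and host models with CPU factors
    lsinfo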

Commands:

Configuration:

Users and administrators

LSF user

A user account that has permission to submit jobs to the LSF cluster.

LSF administrator

In general, you must be an LSF administrator to perform operations that will affect other LSF users. Each cluster has one primary LSF administrator, specified during LSF installation. You can also configure additional administrators at the cluster level and at the queue level.

Primary LSF administrator

The first cluster administrator, specified during installation and listed first in lsf.cluster.cluster_name. The primary LSF administrator account owns the configuration and log files. The primary LSF administrator has permission to perform clusterwide operations, change configuration files, reconfigure the cluster, and control jobs submitted by all users.

Cluster administrator

May be specified during LSF installation or configured after installation. Cluster administrators can perform administrative operations on all jobs and queues in the cluster. Cluster administrators have the same cluster-wide operational privileges as the primary LSF administrator except that they do not necessarily have permission to change LSF configuration files.

For example, a cluster administrator can create an LSF host group, submit a job to any queue, or terminate another user's job.

Queue administrator

An LSF administrator user account that has administrative permissions limited to a specified queue. For example, an LSF queue administrator can perform administrative operations on the specified queue, or on jobs running in the specified queue, but cannot change LSF configuration or operate on LSF daemons.

Resources

Resource usage

The LSF system uses built-in and configured resources to track resource availability and usage. Jobs are scheduled according to the resources available on individual hosts.

Jobs submitted through the LSF system will have the resources they use monitored while they are running. This information is used to enforce resource limits and load thresholds as well as fairshare scheduling.

LSF collects information such as:

On UNIX, job-level resource usage is collected through PIM.
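
For example, the long formats of bjobs and bhist display the resources a job is using or has used (the job ID 1234 is a placeholder):

    # Display detailed information about a job, including resource usage
    bjobs -l 1234

    # Display historical information for a finished job
    bhist -l 1234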

Commands:

Configuration:

Load indices

Load indices measure the availability of dynamic, non-shared resources on hosts in the cluster. Load indices built into the LIM are updated at fixed time intervals.

Commands:

External load indices

Defined and configured by the LSF administrator and collected by an External Load Information Manager (ELIM) program. The ELIM also updates LIM when new values are received.

Commands:

Static resources

Built-in resources that represent host information that does not change over time, such as the maximum RAM available to user processes or the number of processors in a machine. Most static resources are determined by the LIM at start-up time.

Static resources can be used to select appropriate hosts for particular jobs based on binary architecture, relative CPU speed, and system configuration.

Load thresholds

Two types of load thresholds can be configured by your LSF administrator to schedule jobs in queues. Each load threshold specifies a load index value:

To schedule a job on a host, the load levels on that host must satisfy both the thresholds configured for that host and the thresholds for the queue from which the job is being dispatched.

The value of a load index may either increase or decrease with load, depending on the meaning of the specific load index. Therefore, when comparing the host load conditions with the threshold values, you need to use either greater than (>) or less than (<), depending on the load index.
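
For example, the long formats of bhosts and bqueues display the configured thresholds (hostA and normal are placeholders):

    # Display the load thresholds configured for a host
    bhosts -l hostA

    # Display the scheduling thresholds configured for a queue
    bqueues -l normal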

Commands:

Configuration:

Runtime resource usage limits

Limit the use of resources while a job is running. Jobs that consume more than the specified amount of a resource are signaled.
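
For example, limits can also be specified at job submission. The units below assume the default configuration, where the -W run limit is in minutes and the -M memory limit is in KB:

    # Submit a job with a 10-minute run limit and a 512 MB memory limit
    bsub -W 10 -M 512000 my_job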

Configuration:

Hard and soft limits

Resource limits specified at the queue level are hard limits, while those specified with job submission are soft limits. See the setrlimit(2) man page for the concepts of hard and soft limits.

Resource allocation limits

Restrict the amount of a given resource that must be available during job scheduling for different classes of jobs to start, and which resource consumers the limits apply to. If all of the resource has been consumed, no more jobs can be started until some of the resource is released.

Configuration:

Resource requirements (bsub -R)

Restrict which hosts the job can run on. Hosts that match the resource requirements are the candidate hosts. When LSF schedules a job, it collects the load index values of all the candidate hosts and compares them to the scheduling conditions. Jobs are only dispatched to a host if all load values are within the scheduling thresholds.
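
For example, a resource requirement string can combine a select section, which restricts the candidate hosts, with an rusage section, which reserves resources on the execution host:

    # Run only on X86_64 hosts with more than 500 MB of available memory,
    # and reserve 500 MB of memory for the job while it runs
    bsub -R "select[type==X86_64 && mem>500] rusage[mem=500]" my_job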

Commands:

Configuration:

Job Life Cycle

1 Submit a job

You submit a job from an LSF client or server with the bsub command.

If you do not specify a queue when submitting the job, the job is submitted to the default queue.

Jobs are held in a queue waiting to be scheduled and have the PEND state. The job is held in a job file in the LSF_SHAREDIR/cluster_name/logdir/info/ directory, or in one of its subdirectories if MAX_INFO_DIRS is defined in lsb.params.

Job ID

LSF assigns each job a unique job ID when you submit the job.

Job name

You can also assign a name to the job with the -J option of bsub. Unlike the job ID, the job name is not necessarily unique.
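
For example (sim_run_1 is a placeholder job name):

    # Assign a job name at submission; LSF still assigns a unique job ID
    bsub -J sim_run_1 my_job

    # Query jobs by name instead of by job ID
    bjobs -J sim_run_1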

2 Schedule job

  1. mbatchd looks at jobs in the queue and sends the jobs for scheduling to mbschd at a preset time interval (defined by the parameter JOB_SCHEDULING_INTERVAL in lsb.params).
  2. mbschd evaluates jobs and makes scheduling decisions based on job priority, scheduling policies, and available resources.
  3. mbschd selects the best hosts where the job can run and sends its decisions back to mbatchd.
  4. Resource information is collected at preset time intervals by the master LIM from LIMs on server hosts. The master LIM communicates this information to mbatchd, which in turn communicates it to mbschd to support scheduling decisions.

3 Dispatch job

As soon as mbatchd receives scheduling decisions, it immediately dispatches the jobs to hosts.

4 Run job

sbatchd handles job execution. It:

  1. Receives the request from mbatchd
  2. Creates a child sbatchd for the job
  3. Creates the execution environment
  4. Starts the job using res

The execution environment is copied from the submission host to the execution host and includes the following:

5 Return output

When a job is completed, it is assigned the DONE status if the job was completed without any problems. The job is assigned the EXIT status if errors prevented the job from completing.

sbatchd communicates job information including errors and output to mbatchd.

6 Send email to client

mbatchd returns the job output, job error, and job information to the submission host through email. Use the -o and -e options of bsub to send job output and errors to a file.
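
For example (%J in a file name is replaced by the job ID):

    # Write job output and errors to files instead of sending email
    bsub -o /tmp/out.%J -e /tmp/err.%J my_job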

Job report

A job report is sent by email to the LSF job owner and includes:

