
Resource Allocation Limits

About Resource Allocation Limits

What resource allocation limits do

By default, resource consumers like users, hosts, queues, or projects are not limited in the resources available to them for running jobs. Resource allocation limits configured in lsb.resources restrict the amount of the specified resources that jobs can consume: the total number of running and suspended jobs, job slots, memory, swap space, tmp space, and shared resources such as software licenses.

If all of the resource has been consumed, no more jobs can be started until some of the resource is released.

For example, by limiting the maximum amount of memory for each of your hosts, you can make sure that your system operates at optimal performance. By defining a memory limit for some users submitting jobs to a particular queue and a specified set of hosts, you can prevent these users from using up all the memory in the system at one time.

Jobs must specify resource requirements

For limits to apply, the job must specify resource requirements (bsub -R rusage string or RES_REQ in lsb.queues). For example, a memory allocation limit of 4 MB is configured in lsb.resources:

Begin Limit 
NAME = mem_limit1 
MEM = 4 
End Limit 

A job is submitted with an rusage resource requirement that exceeds this limit:

bsub -R "rusage[mem=5]" uname 

and remains pending:

bjobs -p 600 
JOBID  USER   STAT  QUEUE   FROM_HOST  EXEC_HOST  JOB_NAME        SUBMIT_TIME 
 600    user1  PEND  normal   suplin02                uname       Aug 12 14:05 
Resource (mem) limit defined cluster-wide has been reached; 

A job is submitted with a resource requirement within the configured limit:

bsub -R"rusage[mem=3]" sleep 100 

is allowed to run:

bjobs 
JOBID   USER    STAT  QUEUE    FROM_HOST   EXEC_HOST   JOB_NAME    SUBMIT_TIME 
600     user1   PEND  normal   hostA                   uname       Aug 12 14:05 
604     user1   RUN   normal   hostA                   sleep 100   Aug 12 14:09 
Resource usage limits and resource allocation limits

Resource allocation limits are not the same as resource usage limits, which are enforced during job run time. For example, you set CPU limits, memory limits, and other limits that take effect after a job starts running. See Chapter 36, "Runtime Resource Usage Limits" for more information.
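For example, a minimal sketch of runtime limits set at the queue level in lsb.queues (the queue name and values are arbitrary; CPULIMIT is in minutes and MEMLIMIT is in KB by default):

Begin Queue 
QUEUE_NAME = normal 
CPULIMIT   = 60        # runtime CPU limit, in minutes 
MEMLIMIT   = 500000    # runtime memory limit, in KB by default 
End Queue 

These limits take effect only after a job is dispatched; the allocation limits in lsb.resources are checked before the job is started.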

Resource reservation limits and resource allocation limits

Resource allocation limits are not the same as queue-based resource reservation limits, which are enforced during job submission. The parameter RESRSV_LIMIT (in lsb.queues) specifies allowed ranges of resource values, and jobs submitted with resource requests outside of this range are rejected. See Chapter 25, "Reserving Resources" for more information.

How LSF enforces limits

Resource allocation limits are enforced during job scheduling, and they apply to the resource consumers associated with each job: its queue, its user and user group, the hosts it runs on, and its project.

How LSF counts resources

Resources on a host are not available if they are taken by jobs that have been started, but have not yet finished. This means running and suspended jobs count against the limits for queues, users, hosts, projects, and processors that they are associated with.

Job slot limits

Job slot limits can correspond to the maximum number of jobs that can run at any point in time. For example, a queue cannot start jobs if it has no job slots available, and jobs cannot run on hosts that have no available job slots.

Limits such as QJOB_LIMIT (lsb.queues), HJOB_LIMIT (lsb.queues), UJOB_LIMIT (lsb.queues), MXJ (lsb.hosts), JL/U (lsb.hosts), MAX_JOBS (lsb.users), and MAX_PEND_JOBS (lsb.users) limit the number of job slots. When the workload is sequential, job slots are usually equivalent to jobs. For parallel or distributed applications, these are true job slot limits and not job limits.
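For reference, a sketch showing where some of these job slot limits are set (the values are arbitrary):

# lsb.queues 
Begin Queue 
QUEUE_NAME = normal 
QJOB_LIMIT = 100       # total job slots available to this queue 
UJOB_LIMIT = 10        # job slots per user in this queue 
HJOB_LIMIT = 4         # job slots per host for this queue 
End Queue 

# lsb.hosts 
Begin Host 
HOST_NAME   MXJ 
hostA       8 
End Host 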

Job limits

Job limits, specified by JOBS in a Limit section in lsb.resources, correspond to the maximum number of running and suspended jobs that can run at any point in time. If both job limits and job slot limits are configured, the most restrictive limit is applied.
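For example, a sketch of a Limit section (hypothetical name and values) that configures both kinds of limits for one queue; for sequential jobs the JOBS value of 4 is the effective limit because it is more restrictive than the 8 slots:

Begin Limit 
NAME   = jobs_and_slots 
QUEUES = normal 
SLOTS  = 8 
JOBS   = 4 
End Limit 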

Resource reservation and backfill

When processor or memory reservation occurs, the reserved resources count against the limits for users, queues, hosts, projects, and processors. When backfilling of parallel jobs occurs, the backfill jobs do not count against any limits.

MultiCluster

Limits apply only to the cluster where lsb.resources is configured. If the cluster leases hosts from another cluster, limits are enforced on those hosts as if they were local hosts.

Switched jobs can exceed resource allocation limits

If a switched job (bswitch) has not been dispatched, then the job behaves as if it were submitted to the new queue in the first place, and the JOBS limit is enforced in the target queue.

If a switched job has been dispatched, then resource allocation limits like SWP, TMP, and JOBS can be exceeded in the target queue. For example, given the following JOBS limit configuration:

Begin Limit 
USERS     QUEUES      SLOTS   TMP    JOBS  
-         normal        -      20     2 
-         short         -      20     2 
End Limit 

Submit 3 jobs to the normal queue and 3 jobs to the short queue (run each of the following commands three times):

bsub -q normal -R"rusage[tmp=20]" sleep 1000 
bsub -q short -R"rusage[tmp=20]" sleep 1000 

bjobs shows 1 job in RUN state in each queue:

bjobs 
JOBID   USER    STAT  QUEUE    FROM_HOST   EXEC_HOST   JOB_NAME     SUBMIT_TIME 
16      user1   RUN   normal   hosta       hosta       sleep 1000   Aug 30 16:26 
17      user1   PEND  normal   hosta                   sleep 1000   Aug 30 16:26 
18      user1   PEND  normal   hosta                   sleep 1000   Aug 30 16:26 
19      user1   RUN   short    hosta       hosta       sleep 1000   Aug 30 16:26 
20      user1   PEND  short    hosta                   sleep 1000   Aug 30 16:26 
21      user1   PEND  short    hosta                   sleep 1000   Aug 30 16:26 

blimits shows the TMP limit reached:

blimits 
INTERNAL RESOURCE LIMITS: 
NAME      USERS    QUEUES     SLOTS      TMP     JOBS 
NONAME000   -      normal       -      20/20      1/2 
NONAME001   -      short        -      20/20      1/2 

Switch the running job in the normal queue to the short queue:

bswitch short 16 

bjobs shows 2 jobs running in the short queue, and 1 job running in the normal queue:

bjobs 
JOBID   USER    STAT  QUEUE    FROM_HOST   EXEC_HOST   JOB_NAME     SUBMIT_TIME 
17      user1   RUN   normal   hosta       hosta       sleep 1000   Aug 30 16:26 
18      user1   PEND  normal   hosta                   sleep 1000   Aug 30 16:26 
19      user1   RUN   short    hosta       hosta       sleep 1000   Aug 30 16:26 
16      user1   RUN   short    hosta       hosta       sleep 1000   Aug 30 16:26 
20      user1   PEND  short    hosta                   sleep 1000   Aug 30 16:26 
21      user1   PEND  short    hosta                   sleep 1000   Aug 30 16:26 

blimits now shows the TMP limit exceeded and the JOBS limit reached in the short queue:

blimits 
INTERNAL RESOURCE LIMITS: 
NAME    USERS    QUEUES     SLOTS      TMP     JOBS 
NONAME000   -    normal       -      20/20      1/2 
NONAME001   -    short        -      40/20      2/2 

Switch the running job in the normal queue to the short queue:

bswitch short 17 

bjobs now shows 3 jobs running in the short queue, and the remaining job (18) running in the normal queue:

bjobs 
JOBID   USER    STAT  QUEUE    FROM_HOST   EXEC_HOST   JOB_NAME     SUBMIT_TIME 
18      user1   RUN   normal   hosta       hosta       sleep 1000   Aug 30 16:26 
19      user1   RUN   short    hosta       hosta       sleep 1000   Aug 30 16:26 
16      user1   RUN   short    hosta       hosta       sleep 1000   Aug 30 16:26 
17      user1   RUN   short    hosta       hosta       sleep 1000   Aug 30 16:26 
20      user1   PEND  short    hosta                   sleep 1000   Aug 30 16:26 
21      user1   PEND  short    hosta                   sleep 1000   Aug 30 16:26 

blimits shows both TMP and JOBS limits exceeded in the short queue:

blimits 
INTERNAL RESOURCE LIMITS: 
NAME    USERS    QUEUES     SLOTS      TMP     JOBS 
NONAME000   -    normal       -      20/20      1/2 
NONAME001   -    short        -      60/20      3/2 

Limits for resource consumers

Host groups and compute units

If a limit is specified for a host group or compute unit, the total amount of a resource used by all hosts in that group or unit is counted. If a host is a member of more than one group, each job running on that host is counted against the limit for all groups to which the host belongs.
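For example, a sketch (hypothetical group name and values) of a host group defined in lsb.hosts and a limit that counts usage across all of its members:

# lsb.hosts 
Begin HostGroup 
GROUP_NAME    GROUP_MEMBER 
hgroupA       (hostA hostB hostC) 
End HostGroup 

# lsb.resources 
Begin Limit 
NAME  = hgroupA_mem 
HOSTS = hgroupA 
MEM   = 1000 
End Limit 

Jobs running on hostA, hostB, and hostC all draw from the single 1000 MB MEM limit.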

Limits for users and user groups

Jobs are normally queued on a first-come, first-served (FCFS) basis. It is possible for some users to abuse the system by submitting a large number of jobs; jobs from other users must wait until these jobs complete. Limiting resources by user prevents users from monopolizing all the resources.

Users can submit an unlimited number of jobs, but if they have reached their limit for any resource, the rest of their jobs stay pending, until some of their running jobs finish or resources become available.

If a limit is specified for a user group, the total amount of a resource used by all users in that group is counted. If a user is a member of more than one group, each of that user's jobs is counted against the limit for all groups to which that user belongs.

Use the keyword all to configure limits that apply to each user or user group in a cluster. This is useful if you have a large cluster but only want to exclude a few users from the limit definition.
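For example, a sketch (hypothetical name and value) that applies one shared job slot limit to every user in the cluster except user1:

Begin Limit 
NAME  = all_but_user1 
USERS = all ~user1 
SLOTS = 100 
End Limit 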

You can use ENFORCE_ONE_UG_LIMITS=Y combined with bsub -G to have better control over limits when user groups have overlapping members. When set to Y, only the specified user group's limits (or those of any parent user group) are enforced. If set to N, the most restrictive job limits of any overlapping user/user group are enforced.
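A sketch of this configuration, assuming a user group named ugroup1 and a placeholder job command my_job (the parameter is set in lsb.params, and the group is named at submission time with bsub -G):

# lsb.params 
Begin Parameters 
ENFORCE_ONE_UG_LIMITS = Y 
End Parameters 

bsub -G ugroup1 my_job 

With this setting, only the limits configured for ugroup1 and its parent groups are charged for the job, even if the submitting user also belongs to other groups.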

Per-user limits on users and groups

Per-user limits are enforced on each user individually. If a user group is listed, the limit applies individually to each member of that group. If a user group contains a subgroup, the limit also applies to each member in the subgroup recursively.

Per-user limits that use the keyword all apply to each user in a cluster. If user groups are configured, the limit applies to each member of the user group, not the group as a whole.
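For example, a sketch (hypothetical name and value) of a per-user limit; each member of ugroup1, including members of its subgroups, is individually limited to 2 job slots:

Begin Limit 
NAME     = per_user_slots 
PER_USER = ugroup1 
SLOTS    = 2 
End Limit 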

Resizable jobs

When a resize allocation request is scheduled for a resizable job, all resource allocation limits (job and slot) are enforced. Once the new allocation is satisfied, it consumes limits such as SLOTS, MEM, SWP, and TMP for queues, users, projects, hosts, or cluster-wide. However, the new allocation does not consume job limits such as job group limits, job array limits, and the non-host-level JOBS limit.

Releasing part of an allocation from a resizable job frees general limits that belong to the allocation, but not the actual job limits.

Configuring Resource Allocation Limits

lsb.resources file

Configure all resource allocation limits in one or more Limit sections in the lsb.resources file. Limit sections set limits for how much of the specified resources must be available for different classes of jobs to start, and which resource consumers the limits apply to.

Resource parameters

To limit ...                                          Set in a Limit section of lsb.resources ... 
Total number of running and suspended                 JOBS 
  (RUN, SSUSP, USUSP) jobs 
Total number of job slots that can be used            SLOTS 
  by specific jobs 
Job slots based on the number of processors on        SLOTS_PER_PROCESSOR and PER_HOST 
  each host affected by the limit 
Memory (if PER_HOST is set for the limit, the         MEM (MB or percentage) 
  amount can be a percentage of memory on each 
  host in the limit) 
Swap space (if PER_HOST is set for the limit, the     SWP (MB or percentage) 
  amount can be a percentage of swap space on 
  each host in the limit) 
Tmp space (if PER_HOST is set for the limit, the      TMP (MB or percentage) 
  amount can be a percentage of tmp space on 
  each host in the limit) 
Software licenses                                     LICENSE or RESOURCE 
Any shared resource                                   RESOURCE 

Consumer parameters

For jobs submitted ...                                    Set in a Limit section of lsb.resources ... 
By all specified users or user groups                     USERS 
To all specified queues                                   QUEUES 
To all specified hosts, host groups, or compute units     HOSTS 
For all specified projects                                PROJECTS 
By each specified user or each member of the              PER_USER 
  specified user groups 
To each specified queue                                   PER_QUEUE 
To each specified host or each member of specified        PER_HOST 
  host groups or compute units 
For each specified project                                PER_PROJECT 

Enable resource allocation limits

  1. To enable resource allocation limits in your cluster, configure the resource allocation limits scheduling plugin schmod_limit in lsb.modules: 

     Begin PluginModule 
     SCH_PLUGIN                  RB_PLUGIN                SCH_DISABLE_PHASES 
     schmod_default              ()                       () 
     schmod_limit                ()                       () 
     End PluginModule 

Configure cluster-wide limits

  1. To configure limits that take effect for your entire cluster, configure limits in lsb.resources, but do not specify any consumers.
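For example, a sketch (hypothetical name and value) of a cluster-wide limit; because no consumers are specified, the limit applies to all jobs in the cluster:

Begin Limit 
NAME = clusterwide_jobs 
JOBS = 500 
End Limit 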

Compatibility with pre-version 7 job slot limits

The Limit section of lsb.resources does not support the keywords or format used in lsb.users, lsb.hosts, and lsb.queues. However, any existing job slot limit configuration in these files will continue to apply.

How resource allocation limits map to pre-version 7 job slot limits

Job slot limits are the only type of limit you can configure in lsb.users, lsb.hosts, and lsb.queues. You cannot configure limits for user groups, host groups, and projects in lsb.users, lsb.hosts, and lsb.queues. You should not configure any new resource allocation limits in lsb.users, lsb.hosts, and lsb.queues. Use lsb.resources to configure all new resource allocation limits, including job slot limits.

Job slot resources      Resource consumers (lsb.resources)                                 Equivalent existing 
(lsb.resources)         USERS        PER_USER   QUEUES       HOSTS       PER_HOST          limit (file) 
SLOTS                   -            all        -            host_name   -                 JL/U (lsb.hosts) 
SLOTS_PER_PROCESSOR     user_name    -          -            -           all               JL/P (lsb.users) 
SLOTS                   -            all        queue_name   -           -                 UJOB_LIMIT (lsb.queues) 
SLOTS                   -            all        -            -           -                 MAX_JOBS (lsb.users) 
SLOTS                   -            -          queue_name   -           all               HJOB_LIMIT (lsb.queues) 
SLOTS                   -            -          -            host_name   -                 MXJ (lsb.hosts) 
SLOTS_PER_PROCESSOR     -            -          queue_name   -           all               PJOB_LIMIT (lsb.queues) 
SLOTS                   -            -          queue_name   -           -                 QJOB_LIMIT (lsb.queues) 
Limits for the following resources have no corresponding limit in lsb.users, lsb.hosts, and lsb.queues: JOBS, MEM, SWP, TMP, and shared resources (LICENSE and RESOURCE).

How conflicting limits are resolved

LSF handles two kinds of limit conflicts: similar conflicting limits and equivalent conflicting limits.

Similar conflicting limits

For similar limits configured in lsb.resources, lsb.users, lsb.hosts, or lsb.queues, the most restrictive limit is used. For example, a slot limit of 3 for all users is configured in lsb.resources:

Begin Limit 
NAME  = user_limit1 
USERS = all 
SLOTS = 3 
End Limit 

This is similar to, but not equivalent to, an existing MAX_JOBS limit of 2 configured in lsb.users:

busers 
USER/GROUP    JL/P    MAX  NJOBS   PEND    RUN  SSUSP  USUSP    RSV  
user1           -       2      4      2      2      0      0      0 

user1 submits 4 jobs:

bjobs 
JOBID   USER    STAT  QUEUE     FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME 
816     user1   RUN   normal    hostA       hostA       sleep 1000 Jan 22 16:34 
817     user1   RUN   normal    hostA       hostA       sleep 1000 Jan 22 16:34 
818     user1   PEND  normal    hostA                   sleep 1000 Jan 22 16:34 
819     user1   PEND  normal    hostA                   sleep 1000 Jan 22 16:34 

Two jobs (818 and 819) remain pending because the more restrictive limit of 2 from lsb.users is enforced:

bjobs -p 
JOBID   USER    STAT  QUEUE      FROM_HOST      JOB_NAME           SUBMIT_TIME 
818     user1   PEND  normal     hostA          sleep 1000         Jan 22 16:34 
The user has reached his/her job slot limit; 
819     user1   PEND  normal     hostA          sleep 1000         Jan 22 16:34 
The user has reached his/her job slot limit; 

If the MAX_JOBS limit in lsb.users is 4:

busers 
USER/GROUP  JL/P   MAX  NJOBS   PEND   RUN  SSUSP  USUSP  RSV 
user1         -      4      4      1     3      0      0    0 

and user1 submits 4 jobs:

bjobs 
JOBID  USER    STAT  QUEUE   FROM_HOST   EXEC_HOST    JOB_NAME     SUBMIT_TIME 
824    user1   RUN   normal  hostA       hostA        sleep 1000   Jan 22 16:38 
825    user1   RUN   normal  hostA       hostA        sleep 1000   Jan 22 16:38 
826    user1   RUN   normal  hostA       hostA        sleep 1000   Jan 22 16:38 
827    user1   PEND  normal  hostA                    sleep 1000   Jan 22 16:38 

Only one job (827) remains pending because the more restrictive limit of 3 in lsb.resources is enforced:

bjobs -p 
JOBID    USER    STAT  QUEUE   FROM_HOST       JOB_NAME           SUBMIT_TIME 
827     user1    PEND  normal      hostA     sleep 1000          Jan 22 16:38 
Resource (slot) limit defined cluster-wide has been reached; 
Equivalent conflicting limits

New limits in lsb.resources that are equivalent to existing limits in lsb.users, lsb.hosts, or lsb.queues, but with a different value override the existing limits. The equivalent limits in lsb.users, lsb.hosts, or lsb.queues are ignored, and the value of the new limit in lsb.resources is used.

For example, a per-user job slot limit in lsb.resources is equivalent to a MAX_JOBS limit in lsb.users, so only the lsb.resources limit is enforced and the limit in lsb.users is ignored:

Begin Limit 
NAME  = slot_limit 
PER_USER = all 
SLOTS = 3 
End Limit 

How job limits work

The JOBS parameter limits the maximum number of running or suspended jobs available to resource consumers. Limits are enforced depending on the number of jobs in RUN, SSUSP, and USUSP state.

Stopping and resuming jobs

Jobs stopped with bstop go into USUSP status. LSF includes USUSP jobs in the count of running jobs, so the usage of the JOBS limit does not change when you suspend a job.

Resuming a stopped job (bresume) changes its status to SSUSP. The job can enter RUN state if the JOBS limit has not been exceeded. Lowering the JOBS limit before resuming the job can exceed the JOBS limit and prevent SSUSP jobs from entering RUN state.

For example, JOBS=5, and 5 jobs are running in the cluster (JOBS has reached 5/5). One of the running jobs is stopped with bstop; it goes into USUSP state but still counts toward the limit. Normally, the stopped job can later be resumed and begin running, returning to RUN state. If you reconfigure the JOBS limit to 4 before resuming the job, the JOBS usage becomes 5/4, and the job cannot run because the JOBS limit has been exceeded.
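A sketch of this sequence, using a placeholder job ID:

bstop 1234           # job goes to USUSP; it still counts toward the JOBS limit (5/5) 
# lower the JOBS limit from 5 to 4 in lsb.resources, then reconfigure: 
badmin reconfig 
bresume 1234         # job becomes SSUSP, but cannot enter RUN state while JOBS usage is 5/4 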

Preemption

The JOBS limit does not block preemption based on job slots. For example, if JOBS=2, and a host is already running 2 jobs in a preemptable queue, a new preemptive job can preempt a job on that host as long as the preemptive slots can be satisfied even though the JOBS limit has been reached.

Reservation and backfill

Reservation and backfill are still made at the job slot level, but despite a slot reservation being satisfied, the job may ultimately not run because the JOBS limit has been reached. This is similar to a job not running because a license is not available.

Other jobs

Example limit configurations

Each set of limits is defined in a Limit section enclosed by Begin Limit and End Limit.

Example 1

user1 is limited to 2 job slots on hostA, and user2's jobs on queue normal are limited to 20 MB of memory:

Begin Limit 
NAME    HOSTS     SLOTS  MEM   SWP  TMP   USERS       QUEUES 
Limit1  hostA     2      -      -    -    user1       - 
-       -         -      20     -    -    user2       normal 
End Limit 
Example 2

Set a job slot limit of 2 for user user1 submitting jobs to queue normal on host hosta for all projects, but only one job slot for all queues and hosts for project test:

Begin Limit 
HOSTS  SLOTS  PROJECTS   USERS     QUEUES 
hosta  2         -       user1     normal 
  -    1      test       user1       -    
End Limit 
Example 3

Limit usage of hosts in license1 group:
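A minimal sketch for this limit, assuming license1 is a host group defined in lsb.hosts and using arbitrary values:

Begin Limit 
NAME  = license1_limit 
HOSTS = license1 
SLOTS = 10 
MEM   = 1000 
End Limit 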

Example 4

All users in user group ugroup1 except user1 using queue1 and queue2 and running jobs on hosts in host group hgroup1 are limited to 2 job slots per processor on each host:

Begin Limit 
NAME          = limit1 
# Resources: 
SLOTS_PER_PROCESSOR = 2 
#Consumers: 
QUEUES       = queue1 queue2 
USERS        = ugroup1 ~user1 
PER_HOST     = hgroup1 
End Limit 
Example 5

user1 and user2 can use all queues and all hosts in the cluster with a limit of 20 MB of available memory:

Begin Limit 
NAME  = 20_MB_mem  
# Resources: 
MEM   = 20 
# Consumers: 
USERS = user1 user2 
End Limit 
Example 6

All users in user group ugroup1 can use queue1 and queue2 and run jobs on any host in host group hgroup1 sharing 10 job slots:

Begin Limit 
NAME   = 10_slot  
# Resources: 
SLOTS  = 10 
#Consumers: 
QUEUES = queue1 queue2 
USERS  = ugroup1 
HOSTS  = hgroup1 
End Limit 
Example 7

All users in user group ugroup1 except user1 can use all queues but queue1 and run jobs with a limit of 10% of available memory on each host in host group hgroup1:

Begin Limit 
NAME     = 10_percent_mem 
# Resources: 
MEM      = 10% 
QUEUES   = all ~queue1 
USERS    = ugroup1 ~user1 
PER_HOST = hgroup1 
End Limit 
Example 8

Limit users in the develop group to 1 job slot on each host, and 50% of the memory on the host:

Begin Limit 
NAME = develop_group_limit 
# Resources: 
SLOTS = 1 
MEM = 50% 
#Consumers: 
USERS = develop 
PER_HOST = all 
End Limit 
Example 9

Limit software license lic1, with quantity 100, where user1 can use 90 licenses and all other users are restricted to 10.

Begin Limit 
USERS          LICENSE 
user1          ([lic1,90]) 
(all ~user1)   ([lic1,10]) 
End Limit 

lic1 is defined as a decreasing numeric shared resource in lsf.shared.
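A sketch of such a definition in the Resource section of lsf.shared (the interval and description are illustrative; the available license count is typically reported by an external load information manager, or elim):

Begin Resource 
RESOURCENAME   TYPE      INTERVAL   INCREASING   DESCRIPTION 
lic1           Numeric   60         N            (floating licenses for an application) 
End Resource 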

To submit a job to use one lic1 license, use the rusage string in the -R option of bsub to specify the license:

bsub -R "rusage[lic1=1]" my-job 
Example 10

Jobs from the crash project can use 10 lic1 licenses, while jobs from all other projects together can use 5.

Begin Limit 
LICENSE        PROJECTS 
([lic1,10])    crash 
([lic1,5])     (all ~crash) 
End Limit 

lic1 is defined as a decreasing numeric shared resource in lsf.shared.

Example 11

Limit all hosts to 1 job slot per processor:

Begin Limit 
NAME                = default_limit 
SLOTS_PER_PROCESSOR = 1 
PER_HOST            = all 
End Limit 
Example 12

The short queue can have at most 200 running and suspended jobs:

Begin Limit 
NAME     = shortq_limit 
QUEUES   = short 
JOBS     = 200 
End Limit 

Viewing Information about Resource Allocation Limits

Your job may be pending because some configured resource allocation limit has been reached. Use the blimits command to show the dynamic counters of resource allocation limits configured in Limit sections in lsb.resources. blimits displays the current resource usage to show what limits may be blocking your job.

blimits command

The blimits command displays the configured resource allocation limits and the current dynamic resource usage counted against each limit.

Resources that have no configured limits or no limit usage are indicated by a dash (-). Limits are displayed in a USED/LIMIT format. For example, if a limit of 10 slots is configured and 3 slots are in use, then blimits displays the limit for SLOTS as 3/10.

If limits MEM, SWP, or TMP are configured as percentages, both the limit and the amount used are displayed in MB. For example, lshosts displays maxmem of 249 MB, and MEM is limited to 10% of available memory (about 25 MB). If 10 MB out of 25 MB are used, blimits displays the limit for MEM as 10/25 (10 MB USED from a 25 MB LIMIT).

Configured limits and resource usage for built-in resources (slots, mem, tmp, and swp load indices, and number of running and suspended jobs) are displayed as INTERNAL RESOURCE LIMITS separately from custom external resources, which are shown as EXTERNAL RESOURCE LIMITS.

Limits are displayed for both the vertical tabular format and the horizontal format for Limit sections. If a vertical format Limit section has no name, blimits displays NONAMEnnn under the NAME column for these limits, where the unnamed limits are numbered in the order the vertical-format Limit sections appear in the lsb.resources file.

If a resource consumer is configured as all, the limit usage for that consumer is indicated by a dash (-).

PER_HOST slot limits are not displayed. The bhosts command displays these as MXJ limits.

In MultiCluster, blimits returns information about all limits in the local cluster.

Examples

For the following limit definitions:

Begin Limit 
NAME = limit1 
USERS = user1 
PER_QUEUE = all 
PER_HOST = hostA hostC 
TMP = 30% 
SWP = 50% 
MEM = 10% 
End Limit 
Begin Limit 
NAME = limit_ext1 
PER_HOST = all 
RESOURCE = ([user1_num,30] [hc_num,20]) 
End Limit 
Begin Limit 
NAME = limit2 
QUEUES = short 
JOBS = 200 
End Limit 

blimits displays the following:

blimits  
INTERNAL RESOURCE LIMITS: 
  
NAME     USERS   QUEUES   HOSTS             PROJECTS   SLOTS   MEM     TMP       SWP      JOBS 
limit1   user1   q2       hostA@cluster1    -          -       10/25   -         10/258   - 
limit1   user1   q3       hostA@cluster1    -          -       -       30/2953   -        - 
limit1   user1   q4       hostC             -          -       -       40/590    -        - 
limit2   -       short    -                 -          -       -       -         -        50/200 
  
EXTERNAL RESOURCE LIMITS: 
  
NAME         USERS   QUEUES   HOSTS             PROJECTS   user1_num   hc_num 
limit_ext1   -       -        hostA@cluster1    -          -           1/20 
limit_ext1   -       -        hostC@cluster1    -          1/30        1/20 
