The JOBS parameter sets the maximum number of running or suspended jobs available to resource consumers. The limit is enforced against the number of jobs in RUN, SSUSP, and USUSP state.
Jobs stopped with bstop go into USUSP status. LSF includes USUSP jobs in the count of running jobs, so JOBS limit usage does not change when you suspend a job.
Resuming a stopped job (bresume) changes the job status to SSUSP. The job can then enter RUN state if the JOBS limit has not been exceeded. If you lower the JOBS limit before resuming the job, usage can exceed the new limit, which prevents SSUSP jobs from entering RUN state.
For example, suppose JOBS=5 and 5 jobs are running in the cluster (JOBS usage is 5/5). If you stop one of them, usage stays at 5/5 because the USUSP job is still counted. Normally, the stopped job can later be resumed and return to RUN state. If you reconfigure the JOBS limit to 4 before resuming the job, JOBS usage becomes 5/4, and the job cannot run because the JOBS limit has been exceeded.
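A minimal lsb.resources sketch of the limit in this example, assuming the vertical Limit format; the NAME value is a hypothetical label, and USERS = all is an assumption made so that the limit covers every user in the cluster:

```
Begin Limit
# Hypothetical name for this limit
NAME  = jobs_cap
# Resource: at most 5 jobs in RUN, SSUSP, or USUSP state
JOBS  = 5
# Consumers: assumed to be all users
USERS = all
End Limit
```

Changing JOBS to 4 in a section like this while 5 jobs are counted produces the 5/4 usage described above.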
The JOBS limit does not block preemption based on job slots. For example, if JOBS=2 and a host is already running 2 jobs in a preemptable queue, a new preemptive job can still preempt a job on that host, provided the preemptive slots can be satisfied, even though the JOBS limit has been reached.
Reservation and backfill are still made at the job slot level, but even when a slot reservation is satisfied, the job may ultimately not run because the JOBS limit has been reached. This is similar to a job not running because a license is not available.
brun forces a pending job to run immediately on specified hosts. A job forced to run with brun is counted as a running job, so the JOBS limits may be exceeded after the forced job starts.
Requeued jobs (brequeue) are assigned PEND or PSUSP status. JOBS limit usage decreases by the number of requeued jobs.
Checkpointed jobs restarted with brestart start a new job based on the checkpoint of an existing job. Whether the new job can run depends on the limit policy (including the JOBS limit) that applies to the job. For example, if you checkpoint a job running on a host that has reached its JOBS limit, then restart it, the restarted job cannot run because the JOBS limit has been reached.
For job arrays, you can define a maximum number of jobs that can run in the array at any given time. The JOBS limit, like other resource allocation limits, works in combination with the array limits. For example, if JOBS=3 and the array limit is 4, at most 3 job elements can run in the array.
For chunk jobs, only the running job among the jobs that are dispatched together in a chunk is counted against the JOBS limit. Jobs in WAIT state do not affect the JOBS limit usage.
user1 is limited to 2 job slots on hostA, and user2’s jobs on queue normal are limited to 20 MB of memory:
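A minimal lsb.resources sketch of these two limits, assuming the vertical Limit format with one section per limit; the NAME values are hypothetical labels, and MEM is assumed to be in the default unit of MB:

```
Begin Limit
# Hypothetical name
NAME  = user1_hostA_slots
# Resource
SLOTS = 2
# Consumers
USERS = user1
HOSTS = hostA
End Limit

Begin Limit
# Hypothetical name
NAME   = user2_normal_mem
# Resource (MB assumed)
MEM    = 20
# Consumers
USERS  = user2
QUEUES = normal
End Limit
```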
Set a job slot limit of 2 for user user1 submitting jobs to queue normal on host hosta for all projects, but only one job slot for all queues and hosts for project test:
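A minimal lsb.resources sketch of these two limits, assuming the vertical Limit format; the NAME values are hypothetical labels. Consumers left out of a section (here PROJECTS in the first limit, and QUEUES and HOSTS in the second) are assumed not to restrict that limit:

```
Begin Limit
# Hypothetical name
NAME   = user1_normal_hosta
# Resource
SLOTS  = 2
# Consumers
USERS  = user1
QUEUES = normal
HOSTS  = hosta
End Limit

Begin Limit
# Hypothetical name
NAME     = project_test_slots
# Resource
SLOTS    = 1
# Consumers
PROJECTS = test
End Limit
```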
All users in user group ugroup1 except user1 using queue1 and queue2 and running jobs on hosts in host group hgroup1 are limited to 2 job slots per processor on each host:
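A minimal lsb.resources sketch of this limit, assuming the vertical Limit format; the NAME value is a hypothetical label. The ~ prefix excludes user1 from the group, and PER_HOST applies the limit to each host in hgroup1 individually:

```
Begin Limit
# Hypothetical name
NAME                = slots_per_proc
# Resource
SLOTS_PER_PROCESSOR = 2
# Consumers
USERS               = ugroup1 ~user1
QUEUES              = queue1 queue2
PER_HOST            = hgroup1
End Limit
```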
user1 and user2 can use all queues and all hosts in the cluster with a limit of 20 MB of available memory:
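A minimal lsb.resources sketch of this limit, assuming the vertical Limit format; the NAME value is a hypothetical label, and MEM is assumed to be in the default unit of MB. No QUEUES or HOSTS line is given, so the limit is assumed to apply across all queues and hosts:

```
Begin Limit
# Hypothetical name
NAME  = mem_20MB
# Resource (MB assumed)
MEM   = 20
# Consumers
USERS = user1 user2
End Limit
```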
All users in user group ugroup1 can use queue1 and queue2 and run jobs on any host in host group hgroup1 sharing 10 job slots:
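A minimal lsb.resources sketch of this limit, assuming the vertical Limit format; the NAME value is a hypothetical label. HOSTS (rather than PER_HOST) is used so that the 10 slots are shared across the whole host group:

```
Begin Limit
# Hypothetical name
NAME   = shared_10_slots
# Resource
SLOTS  = 10
# Consumers
USERS  = ugroup1
QUEUES = queue1 queue2
HOSTS  = hgroup1
End Limit
```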
All users in user group ugroup1 except user1 can use all queues but queue1 and run jobs with a limit of 10% of available memory on each host in host group hgroup1:
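A minimal lsb.resources sketch of this limit, assuming the vertical Limit format; the NAME value is a hypothetical label. A percentage MEM value is paired with PER_HOST so that it is evaluated against each host in hgroup1 separately:

```
Begin Limit
# Hypothetical name
NAME     = mem_10_percent
# Resource
MEM      = 10%
# Consumers
USERS    = ugroup1 ~user1
QUEUES   = all ~queue1
PER_HOST = hgroup1
End Limit
```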
Limit software license lic1, with quantity 100, where user1 can use 90 licenses and all other users are restricted to 10.
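A minimal lsb.resources sketch of this pair of limits, assuming the vertical Limit format and the RESOURCE = [name,quantity] syntax for shared resources; the NAME values are hypothetical labels:

```
Begin Limit
# Hypothetical name
NAME     = lic1_user1
# Consumers
USERS    = user1
# Resource
RESOURCE = [lic1,90]
End Limit

Begin Limit
# Hypothetical name
NAME     = lic1_others
# Consumers
USERS    = all ~user1
# Resource
RESOURCE = [lic1,10]
End Limit
```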
lic1 is defined as a decreasing numeric shared resource in lsf.shared.
To submit a job that uses one lic1 license, use the rusage string in the -R option of bsub to specify the license:
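A command of roughly this form reserves one lic1 license for the job; my-job is a hypothetical placeholder for the actual job command:

```
bsub -R "rusage[lic1=1]" my-job
```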
Jobs from crash project can use 10 lic1 licenses, while jobs from all other projects together can use 5.
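A minimal lsb.resources sketch of this pair of limits, assuming the vertical Limit format and the RESOURCE = [name,quantity] syntax; the NAME values are hypothetical labels:

```
Begin Limit
# Hypothetical name
NAME     = lic1_crash
# Consumers
PROJECTS = crash
# Resource
RESOURCE = [lic1,10]
End Limit

Begin Limit
# Hypothetical name
NAME     = lic1_other_projects
# Consumers
PROJECTS = all ~crash
# Resource
RESOURCE = [lic1,5]
End Limit
```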
lic1 is defined as a decreasing numeric shared resource in lsf.shared.