The lsb.params file defines general parameters used by the LSF system. This file contains only one section, named Parameters. mbatchd uses lsb.params for initialization. The file is optional. If not present, the LSF-defined defaults are assumed.
Some of the parameters that can be defined in lsb.params control timing within the system. The default settings provide good throughput for long-running batch jobs while adding a minimum of processing overhead in the batch daemons.
This file is installed by default in LSB_CONFDIR/cluster_name/configdir.
The following parameter values are set at installation for the purpose of testing a new cluster:
Begin Parameters
DEFAULT_QUEUE = normal #default job queue name
MBD_SLEEP_TIME = 20 #Time used for calculating parameter values (60 secs is default)
SBD_SLEEP_TIME = 15 #sbatchd scheduling interval (30 secs is default)
JOB_ACCEPT_INTERVAL = 1 #interval for any host to accept a job
#(default is 1 (one-fold of MBD_SLEEP_TIME))
End Parameters
With this configuration, jobs submitted to the LSF system will be started on server hosts quickly. If this configuration is not suitable for your production use, you should either remove the parameters to take the default values, or adjust them as needed.
For example, to avoid having jobs start when host load is high, increase JOB_ACCEPT_INTERVAL so that the job scheduling interval is longer to give hosts more time to adjust load indices after accepting jobs.
In production use, set DEFAULT_QUEUE to the normal queue, MBD_SLEEP_TIME to 60 seconds (the default), and SBD_SLEEP_TIME to 30 seconds (the default).
Specifies a CPU limit, run limit, or estimated run time for jobs submitted to a chunk job queue to be chunked.
When CHUNK_JOB_DURATION is set, the CPU limit or run limit set at the queue level (CPULIMIT or RUNLIMIT), application level (CPULIMIT or RUNLIMIT), or job level (-c or -W bsub options), or the run time estimate set at the application level (RUNTIME), must be less than or equal to CHUNK_JOB_DURATION for jobs to be chunked.
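For example (an illustrative value, assuming the parameter is specified in minutes), with
CHUNK_JOB_DURATION=30
only jobs whose CPU limit, run limit, or runtime estimate is 30 minutes or less are chunked.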
For non-repetitive jobs, the amount of time that mbatchd keeps the records of finished or killed jobs in core memory.
Users can still see all jobs after they have finished using the bjobs command.
For jobs that finished more than CLEAN_PERIOD seconds ago, use the bhist command.
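For example (an illustrative value; the units are seconds, as described above):
CLEAN_PERIOD=1800
keeps finished job records in mbatchd memory for 30 minutes, after which you must use bhist instead of bjobs to see them.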
Used only with fairshare scheduling. Committed run time weighting factor.
In the calculation of a user’s dynamic priority, this factor determines the relative importance of the committed run time in the calculation. If the -W option of bsub is not specified at job submission and a RUNLIMIT has not been set for the queue, the committed run time is not considered.
This parameter can also be set for an individual queue in lsb.queues. If defined, the queue value takes precedence.
Used to define valid compute unit types for topological resource requirement allocation.
The order in which compute unit types appear specifies the containment relationship between types. Finer grained compute unit types appear first, followed by the coarser grained type that contains them, and so on.
At most one compute unit type in the list can be followed by an exclamation mark designating it as the default compute unit type. If no exclamation mark appears, the first compute unit type in the list is taken as the default type.
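An illustrative configuration (the type names are hypothetical):
COMPUTE_UNIT_TYPES=cell enclosure! rack
Here cells are contained in enclosures, which are in turn contained in racks, and the exclamation mark makes enclosure the default compute unit type.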
Used only with fairshare scheduling. CPU time weighting factor.
In the calculation of a user’s dynamic share priority, this factor determines the relative importance of the cumulative CPU time used by a user’s jobs.
This parameter can also be set for an individual queue in lsb.queues. If defined, the queue value takes precedence.
The name of the default job group.
When you submit a job to LSF without explicitly specifying a job group, LSF associates the job with the specified job group. The LSB_DEFAULT_JOBGROUP environment variable overrides the setting of DEFAULT_JOBGROUP. The bsub -g job_group_name option overrides both LSB_DEFAULT_JOBGROUP and DEFAULT_JOBGROUP.
Default job group specification supports macro substitution for project name (%p) and user name (%u). When you specify bsub -P project_name, the value of %p is the specified project name. If you do not specify a project name at job submission, %p is the project name defined by setting the environment variable LSB_DEFAULTPROJECT, or the project name specified by DEFAULT_PROJECT in lsb.params. The default project name is default.
For example, a default job group name specified by DEFAULT_JOBGROUP=/canada/%p/%u is expanded to the value for the LSF project name and the user name of the job submission user (for example, /canada/projects/user1).
Job group names must start with a slash character (/). For example, DEFAULT_JOBGROUP=/A/B/C is correct, but DEFAULT_JOBGROUP=A/B/C is not correct.
Job group names cannot end with a slash character (/). For example, DEFAULT_JOBGROUP=/A/ is not correct.
Job group names cannot contain more than one slash character (/) in a row. For example, job group names like DEFAULT_JOBGROUP=/A//B or DEFAULT_JOBGROUP=A////B are not correct.
Job group names cannot contain spaces. For example, DEFAULT_JOBGROUP=/A/B C/D is not correct.
Project names and user names used for macro substitution with %p and %u cannot start or end with slash character (/).
Project names and user names used for macro substitution with %p and %u cannot contain spaces or more than one slash character (/) in a row.
Project names or user names containing slash character (/) will create separate job groups. For example, if the project name is canada/projects, DEFAULT_JOBGROUP=/%p results in a job group hierarchy /canada/projects.
Space-separated list of candidate default queues (candidates must already be defined in lsb.queues).
When you submit a job to LSF without explicitly specifying a queue, and the environment variable LSB_DEFAULTQUEUE is not set, LSF puts the job in the first queue in this list that satisfies the job’s specifications subject to other restrictions, such as requested hosts, queue status, etc.
This parameter is set at installation to DEFAULT_QUEUE=normal.
When a user submits a job to LSF without explicitly specifying a queue, and there are no candidate default queues defined (by this parameter or by the user’s environment variable LSB_DEFAULTQUEUE), LSF automatically creates a new queue named default, using the default configuration, and submits the job to that queue.
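For example (queue names are illustrative):
DEFAULT_QUEUE=normal short
A job submitted without a queue is placed in normal if that queue satisfies the job's specifications; otherwise LSF tries short.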
When DEFAULT_USER_GROUP is defined, all submitted jobs must be associated with a user group. Jobs without a user group specified will be associated with default_user_group, where default_user_group is a group configured in lsb.users and contains all as a direct member. DEFAULT_USER_GROUP can only contain one user group.
If the default user group does not have shares assigned in a fairshare queue, jobs can still run from the default user group and are charged to the highest priority account the user can access in the queue. A job submitted to a user group without shares in a specified fairshare queue is transferred to the default user group where the job can run. A job modified or moved using bmod or bswitch may similarly be transferred to the default user group.
Jobs linked to a user group, either through the default_user_group or a user group specified at submission using bsub -G, allow the user group administrator to issue job control operations. User group administrator rights are configured in the UserGroup section of lsb.users, under GROUP_ADMIN.
When DEFAULT_USER_GROUP is not defined, jobs do not require a user group association.
After adding or changing DEFAULT_USER_GROUP in lsb.params, use badmin reconfig to reconfigure your cluster.
For EGO-enabled SLA scheduling, the number of slots that the SLA should request for parallel jobs running in the SLA.
By default, an EGO-enabled SLA requests slots from EGO based on the number of jobs the SLA needs to run. If the jobs themselves require more than one slot, they will remain pending. To avoid this for parallel jobs, set DEFAULT_SLA_VELOCITY to the total number of slots that are expected to be used by parallel jobs.
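For example (an illustrative value):
DEFAULT_SLA_VELOCITY=8
makes the SLA request 8 slots from EGO even when only one parallel job is waiting, so a pending job that needs up to 8 slots is not starved of slots.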
The name of the default service class or EGO consumer name for EGO-enabled SLA scheduling. If the specified SLA does not exist in lsb.serviceclasses, LSF creates one with the specified consumer name, a velocity of 1, a priority of 1, and a time window that is always open.
If the name of the default SLA is not configured in lsb.serviceclasses, it must be the name of a valid EGO consumer.
ENABLE_DEFAULT_EGO_SLA is required to turn on EGO-enabled SLA scheduling. All LSF resource management is delegated to Platform EGO, and all LSF hosts are under EGO control. When all jobs running in the default SLA finish, all allocated hosts are released to EGO after the default idle timeout of 120 seconds (configurable by MAX_HOST_IDLE_TIME in lsb.serviceclasses).
When you submit a job to LSF without explicitly using the -sla option to specify a service class name, LSF puts the job in the default service class specified by service_class_name.
When enabled, allows job submission to any host that belongs to the intersection created when considering the queue the job was submitted to, any advance reservation hosts, or any hosts specified by bsub -m at the time of submission.
When disabled, job submission with hosts specified can be accepted only if the specified hosts are a subset of the hosts defined in the queue.
Defines job resume permissions.
When this parameter is defined:
If the value is Y, users can resume their own jobs that have been suspended by the administrator.
If the value is N, jobs that are suspended by the administrator can only be resumed by the administrator or root; users do not have permission to resume a job suspended by another user or the administrator. Administrators can resume jobs suspended by users or administrators.
Upon job submission with the -G option, when user groups have overlapping members, defines whether LSF enforces only the limits of the specified user group (and those of any parent group) or the most restrictive limits of any overlapping user or user group.
If the value is Y, only the limits defined for the user group that you specify with -G during job submission apply to the job, even if there are overlapping members of groups.
If you have nested user groups, the limits of a user's group parent also apply.
If the value is N and the user group has members that overlap with other user groups, the strictest possible limits (that you can view by running blimits) defined for any of the member user groups are enforced for the job.
If the user group specified at submission is no longer valid when the job runs and ENFORCE_ONE_UG_LIMIT=Y, only the user limit is applied to the job. This can occur if the user group is deleted or the user is removed from the user group.
When ENFORCE_UG_TREE=Y is defined, user groups must form a tree-like structure, with each user group having at most one parent. User group definitions in the UserGroup section of lsb.users are checked in configuration order, and any user group appearing in GROUP_MEMBER more than once is ignored after the first occurrence.
After adding or changing ENFORCE_UG_TREE in lsb.params, use badmin reconfig to reconfigure your cluster.
Used with duplicate logging of event and accounting log files. LSB_LOCALDIR in lsf.conf must also be specified. Specifies how often to back up the data and synchronize the directories (LSB_SHAREDIR and LSB_LOCALDIR).
If you do not define this parameter, the directories are synchronized when data is logged to the files, or when mbatchd is started on the first LSF master host. If you define this parameter, mbatchd synchronizes the directories only at the specified time intervals.
Use this parameter if NFS traffic is too high and you want to reduce network traffic.
Job exited with exit reasons related to LSF and not related to a host problem (for example, user action or LSF policy). These jobs are not counted in the exit rate calculation for the host.
Job exited during initialization because of an execution environment problem. The job did not actually start running.
HPC job exited during initialization because of an execution environment problem. The job did not actually start running.
Used only with fairshare scheduling. Fairshare adjustment plugin weighting factor.
In the calculation of a user’s dynamic share priority, this factor determines the relative importance of the user-defined adjustment made in the fairshare plugin (libfairshareadjust.*).
A positive float number both enables the fairshare plugin and acts as a weighting factor.
This parameter can also be set for an individual queue in lsb.queues. If defined, the queue value takes precedence.
Specifies a cluster-wide threshold for exited jobs. Specify a number of jobs. If EXIT_RATE is not specified for the host in lsb.hosts, GLOBAL_EXIT_RATE defines a default exit rate for all hosts in the cluster. Host-level EXIT_RATE overrides the GLOBAL_EXIT_RATE value.
If the number of jobs that exit over the period of time specified by JOB_EXIT_RATE_DURATION (5 minutes by default) exceeds the number of jobs that you specify as the threshold in this parameter, LSF invokes LSF_SERVERDIR/eadmin to trigger a host exception.
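For example (an illustrative value):
GLOBAL_EXIT_RATE=10
defines a default exit rate of 10 jobs for every host that has no EXIT_RATE of its own in lsb.hosts; if more than 10 jobs exit on such a host within JOB_EXIT_RATE_DURATION, LSF invokes eadmin to trigger a host exception.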
Used only with fairshare scheduling. Determines a rate of decay for cumulative CPU time, run time, and historical run time.
To calculate dynamic user priority, LSF scales the actual CPU time and the run time using a decay factor, so that 1 hour of recently-used time is equivalent to 0.1 hours after the specified number of hours has elapsed.
To calculate dynamic user priority with decayed run time and historical run time, LSF scales the accumulated run time of finished jobs and run time of running jobs using the same decay factor, so that 1 hour of recently-used time is equivalent to 0.1 hours after the specified number of hours has elapsed.
When HIST_HOURS=0, CPU time and run time accumulated by running jobs are not decayed.
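As a worked illustration (assuming an exponential decay, which is one consistent reading of the 1-hour-to-0.1-hours description above): with HIST_HOURS=5, time used t hours ago is scaled by 0.1^(t/5), so 1 hour of CPU time used 5 hours ago contributes 0.1 hours to the priority calculation, and 1 hour used 10 hours ago contributes 0.01 hours.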
This parameter can also be set for an individual queue in lsb.queues. If defined, the queue value takes precedence.
The number you specify is multiplied by the value of lsb.params MBD_SLEEP_TIME (60 seconds by default). The result of the calculation is the number of seconds to wait after dispatching a job to a host, before dispatching a second job to the same host.
If 0 (zero), a host may accept more than one job. By default, there is no limit to the total number of jobs that can run on a host, so if this parameter is set to 0, a very large number of jobs might be dispatched to a host all at once. This can overload your system to the point that it will be unable to create any more processes. It is not recommended to set this parameter to 0.
JOB_ACCEPT_INTERVAL set at the queue level (lsb.queues) overrides JOB_ACCEPT_INTERVAL set at the cluster level (lsb.params).
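As a worked example using the calculation described above: with MBD_SLEEP_TIME=60 and JOB_ACCEPT_INTERVAL=2, a host that accepts a job is not dispatched another job for 2 * 60 = 120 seconds.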
The shared directory in which mbatchd saves the attached data of messages posted with the bpost command.
Use JOB_ATTA_DIR if you use bpost and bread to transfer large data files between jobs and want to avoid using space in LSB_SHAREDIR. By default, the bread command reads attachment data from the JOB_ATTA_DIR directory.
JOB_ATTA_DIR should be shared by all hosts in the cluster, so that any potential LSF master host can reach it. Like LSB_SHAREDIR, the directory should be owned and writable by the primary LSF administrator. The directory must have at least 1 MB of free space.
JOB_ATTA_DIR/timestamp.jobid.msgs/msg$msgindex
JOB_ATTA_DIR=\\HostA\temp\lsf_work
After adding JOB_ATTA_DIR to lsb.params, use badmin reconfig to reconfigure your cluster.
Defines how long LSF waits before checking the job exit rate for a host. Used in conjunction with EXIT_RATE in lsb.hosts for LSF host exception handling.
If the job exit rate is exceeded for the period specified by JOB_EXIT_RATE_DURATION, LSF invokes LSF_SERVERDIR/eadmin to trigger a host exception.
If JOB_GROUP_CLEAN = Y, implicitly created job groups that are empty and have no limits assigned to them are automatically deleted.
Job groups can only be deleted automatically if they have no limits specified (directly or in descendent job groups), have no explicitly created children job groups, and haven’t been attached to an SLA.
Prevents a new job from starting on a host until post-execution processing is finished on that host
Includes the CPU and run times of post-execution processing with the job CPU and run times
sbatchd sends both job finish status (DONE or EXIT) and post-execution processing status (POST_DONE or POST_ERR) to mbatchd at the same time
In the MultiCluster job forwarding model, the JOB_INCLUDE_POSTPROC value in the receiving cluster applies to the job.
In the MultiCluster job lease model, the JOB_INCLUDE_POSTPROC value applies to jobs running on remote leased hosts as if they were running on local hosts.
The variable LSB_JOB_INCLUDE_POSTPROC in the user environment overrides the value of JOB_INCLUDE_POSTPROC in an application profile in lsb.applications. JOB_INCLUDE_POSTPROC in an application profile in lsb.applications overrides the value of JOB_INCLUDE_POSTPROC in lsb.params.
For SGI cpusets, if JOB_INCLUDE_POSTPROC=Y, LSF does not release the cpuset until post-execution processing has finished, even though post-execution processes are not attached to the cpuset.
Specifies a timeout in minutes for job post-execution processing. The specified timeout must be greater than zero.
If post-execution processing takes longer than the timeout, sbatchd reports that post-execution has failed (POST_ERR status), and kills the entire process group of the job’s post-execution processes on UNIX and Linux. On Windows, only the parent process of the post-execution command is killed when the timeout expires. The child processes of the post-execution command are not killed.
If JOB_INCLUDE_POSTPROC=Y, and sbatchd kills the post-execution processes because the timeout has been reached, the CPU time of the post-execution processing is set to 0, and the job’s CPU time does not include the CPU time of post-execution processing.
JOB_POSTPROC_TIMEOUT defined in an application profile in lsb.applications overrides the value in lsb.params. JOB_POSTPROC_TIMEOUT cannot be defined in user environment.
In the MultiCluster job forwarding model, the JOB_POSTPROC_TIMEOUT value in the receiving cluster applies to the job.
In the MultiCluster job lease model, the JOB_POSTPROC_TIMEOUT value applies to jobs running on remote leased hosts as if they were running on local hosts.
JOB_PRIORITY_OVER_TIME enables automatic job priority escalation when MAX_USER_PRIORITY is also defined.
Specifies the value used to increase job priority every interval minutes. Valid values are positive integers.
Specifies the frequency, in minutes, to increment job priority. Valid values are positive integers.
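For example (illustrative numbers, assuming the value/interval syntax implied by the two fields above):
JOB_PRIORITY_OVER_TIME=3/20
increases a pending job's priority by 3 every 20 minutes.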
Specifies a ratio between a job run limit and the runtime estimate specified by bsub -We or bmod -We, -We+, -Wep. The ratio does not apply to the RUNTIME parameter in lsb.applications.
If this ratio is set to 0, no restrictions are applied to the runtime estimate.
JOB_RUNLIMIT_RATIO prevents abuse of the runtime estimate. The value of this parameter is the ratio of run limit divided by the runtime estimate.
By default, the ratio value is 0. Only administrators can set or change this ratio. If the ratio changes, it only applies to newly submitted jobs. The changed value does not retroactively reapply to already submitted jobs.
If the users specify a runtime estimate only (bsub -We), the job-level run limit is automatically set to runtime_ratio * runtime_estimate. Jobs running longer than this run limit are killed by LSF. If the job-level run limit is greater than the hard run limit in the queue, the job is rejected.
If the users specify a runtime estimate (-We) and job run limit (-W) at job submission, and the run limit is greater than runtime_ratio * runtime_estimate, the job is rejected.
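Worked example (illustrative numbers, in the units accepted by bsub -W and -We): with JOB_RUNLIMIT_RATIO=5, the command bsub -We 10 my_job gets an automatic run limit of 5 * 10 = 50, while bsub -We 10 -W 60 my_job is rejected because 60 exceeds 50.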
If the users modify the run limit to be greater than runtime_ratio * runtime_estimate, they must increase the runtime estimate first (bmod -We). Then they can increase the default run limit.
LSF remembers whether the run limit was set with bsub -W or converted from runtime_ratio * runtime_estimate. When users cancel the run limit with bmod -Wn, the run limit is automatically set to runtime_ratio * runtime_estimate. If the run limit was set from runtime_ratio, LSF rejects the run limit modification.
If users modify the runtime estimate with bmod -We and the run limit is set by the user, the run limit is MIN(new_estimate * new_ratio, run_limit). If the run limit is set by runtime_ratio, the run limit is set to new_estimate * new_ratio.
If users modify the runtime estimate by using bmod -Wen and the run limit is set by the user, it is not changed. If the run limit is set by runtime_ratio, it is set to unlimited.
Runtime estimate (for example, with bsub -We) is 10, JOB_RUNLIMIT_RATIO=5 in the sending cluster, JOB_RUNLIMIT_RATIO=0 in the receiving cluster: run limit=50, and the job runs.
Runtime estimate (for example, with bsub -We) is 10, JOB_RUNLIMIT_RATIO=5 in the sending cluster, JOB_RUNLIMIT_RATIO=3 in the receiving cluster: run limit=50, and the job pends.
Runtime estimate (for example, with bsub -We) is 10, JOB_RUNLIMIT_RATIO=5 in the sending cluster, JOB_RUNLIMIT_RATIO=6 in the receiving cluster: run limit=50, and the job runs.
Runtime estimate (for example, with bsub -We) is 10, JOB_RUNLIMIT_RATIO=0 in the sending cluster, JOB_RUNLIMIT_RATIO=5 in the receiving cluster: run limit=50, and the job runs.
In the MultiCluster job lease model, the JOB_RUNLIMIT_RATIO value applies to jobs running on remote leased hosts as if they were running on local hosts.
Time interval at which mbatchd sends jobs for scheduling to the scheduling daemon mbschd along with any collected load information. Specify in seconds, or include the keyword ms to specify in milliseconds.
If set to 0, there is no interval between job scheduling sessions.
The smaller the value of this parameter, the quicker jobs are scheduled. However, when the master batch daemon spends more time doing job scheduling, it has less time to respond to user commands. To balance the speed of job scheduling against the responsiveness of LSF commands, start with a setting of 0 or 1, and increase it if users see the message "Batch system not responding...".
Specifies the directory for buffering batch standard output and standard error for a job.
When JOB_SPOOL_DIR is defined, the standard output and standard error for the job is buffered in the specified directory.
Files are copied from the submission host to a temporary file in the directory specified by the JOB_SPOOL_DIR on the execution host. LSF removes these files when the job completes.
If JOB_SPOOL_DIR is not accessible or does not exist, files are spooled to the default job output directory $HOME/.lsbatch.
For bsub -is and bsub -Zs, JOB_SPOOL_DIR must be readable and writable by the job submission user, and it must be shared by the master host and the submission host. If JOB_SPOOL_DIR is specified but the directory is not accessible or does not exist, bsub -is cannot write to the default directory LSB_SHAREDIR/cluster_name/lsf_indir, bsub -Zs cannot write to the default directory LSB_SHAREDIR/cluster_name/lsf_cmddir, and the job fails.
As LSF runs jobs, it creates temporary directories and files under JOB_SPOOL_DIR. By default, LSF removes these directories and files after the job is finished. See bsub for information about job submission options that specify the disposition of these files.
JOB_SPOOL_DIR can be any valid path, up to the maximum path length the operating system supports. The path you specify for JOB_SPOOL_DIR should be as short as possible to avoid exceeding this limit.
Specifies the time interval in seconds between sending SIGINT, SIGTERM, and SIGKILL when terminating a job. When a job is terminated, the job is sent SIGINT, SIGTERM, and SIGKILL in sequence with a sleep time of JOB_TERMINATE_INTERVAL between sending the signals. This allows the job to clean up if necessary.
Improves the speed with which mbatchd obtains host status, and therefore the speed with which LSF reschedules rerunnable jobs: the sooner LSF knows that a host has become unavailable, the sooner LSF reschedules any rerunnable jobs executing on that host. Useful for a large cluster.
When you define this parameter, mbatchd periodically obtains the host status from the master LIM, and then verifies the status by polling each sbatchd at an interval defined by the parameters MBD_SLEEP_TIME and LSB_MAX_PROBE_SBD.
Defines how many concurrent job queries mbatchd can handle.
If a job information query is sent after the limit has been reached, an error message ("Batch system concurrent query limit exceeded") is displayed.
If mbatchd is not using multithreading, the value of MAX_CONCURRENT_JOB_QUERY is always the maximum number of job queries in the cluster.
If mbatchd is using multithreading (defined by the parameter LSB_QUERY_PORT in lsf.conf), the number of job queries in the cluster can temporarily become higher than the number specified by MAX_CONCURRENT_JOB_QUERY.
This increase in the total number of job queries is possible because the value of MAX_CONCURRENT_JOB_QUERY actually sets the maximum number of queries that can be handled by each child mbatchd that is forked by mbatchd. When the new child mbatchd starts, it handles new queries, but the old child mbatchd continues to run until all the old queries are finished. It is possible that the total number of job queries can be as high as MAX_CONCURRENT_JOB_QUERY multiplied by the number of child daemons forked by mbatchd.
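For example (an illustrative value): with MAX_CONCURRENT_JOB_QUERY=100 and a multithreaded mbatchd, if a new child mbatchd starts while the old child is still draining its queries, the cluster can transiently serve up to 2 * 100 = 200 job queries.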
Determines the maximum size in MB of the lsb.stream file used by system performance analysis tools.
When the MAX_EVENT_STREAM_SIZE limit is reached, LSF logs a special event EVENT_END_OF_STREAM, closes the stream, moves it to lsb.stream.0, and opens a new stream.
Once the EVENT_END_OF_STREAM event is logged, all applications that read the file should close it and reopen it.
The number of subdirectories under the LSB_SHAREDIR/cluster_name/logdir/info directory.
subdirectory = jobID % MAX_INFO_DIRS
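For example (an illustrative value): with MAX_INFO_DIRS=10, the subdirectories are numbered 0 through 9, and the files for job 1234 go to subdirectory 1234 % 10 = 4.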
The maximum number of finished jobs whose events are to be stored in the lsb.events log file.
Once the limit is reached, mbatchd starts a new event log file. The old event log file is saved as lsb.events.n, with subsequent sequence number suffixes incremented by 1 each time a new log file is started. Event logging continues in the new lsb.events file.
The job ID limit. The job ID limit is the highest job ID that LSF will ever assign, and also the maximum number of jobs in the system.
By default, LSF assigns job IDs up to 6 digits. This means that no more than 999999 jobs can be in the system at once.
Specify any integer from 999999 to 2147483646 (for practical purposes, you can use any 10-digit integer less than this value).
You cannot lower the job ID limit, but you can raise it to 10 digits. This allows longer term job accounting and analysis, and means you can have more jobs in the system, and the job ID numbers will roll over less often.
LSF assigns job IDs in sequence. When the job ID limit is reached, the count rolls over, so the next job submitted gets job ID "1". If the original job 1 remains in the system, LSF skips that number and assigns job ID "2", or the next available job ID. If you have so many jobs in the system that the low job IDs are still in use when the maximum job ID is assigned, jobs with sequential numbers could have totally different submission times.
The maximum number of pending jobs in the system.
This is the hard system-wide pending job threshold. No user or user group can exceed this limit unless the job is forwarded from a remote cluster.
If the user or user group submitting the job has reached the pending job threshold specified by MAX_PEND_JOBS, LSF rejects any further job submission requests sent by that user or user group. The system continues to send the job submission requests at the interval specified by SUB_TRY_INTERVAL in lsb.params, until it has made the number of attempts specified by the LSB_NTRIES environment variable. If LSB_NTRIES is not defined and LSF rejects the job submission request, the system continues to send the job submission requests indefinitely by default.
The maximum number of file descriptors mbatchd can have open and connected concurrently to sbatchd.
Controls the maximum number of connections that LSF can maintain to sbatchds in the system.
Do not exceed the file descriptor limit of the root process (the usual limit is 1024). Setting it equal to or larger than this limit can cause mbatchd to die repeatedly, because mbatchd allocates all of its file descriptors to sbatchd connections. This can cause mbatchd to run out of descriptors, which results in an mbatchd fatal error, such as failure to open lsb.events.
Use together with LSB_MAX_JOB_DISPATCH_PER_SESSION in lsf.conf.
The maximum number of retries for reaching a non-responding slave batch daemon, sbatchd.
The interval between retries is defined by MBD_SLEEP_TIME. If mbatchd fails to reach a host and has retried MAX_SBD_FAIL times, the host is considered unreachable.
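As a worked example (assuming the default MAX_SBD_FAIL of 3): with MBD_SLEEP_TIME=60, a host whose sbatchd stops responding is declared unreachable after roughly 3 * 60 = 180 seconds of failed polls.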
If you define LSB_SYNC_HOST_STAT_LIM=Y, mbatchd obtains the host status from the master LIM before it polls sbatchd. When the master LIM reports that a host is unavailable (LIM is down) or unreachable (sbatchd is down) MAX_SBD_FAIL number of times, mbatchd reports the host status as unavailable or unreachable.
When a host becomes unavailable, mbatchd assumes that all jobs running on that host have exited and that all rerunnable jobs (jobs submitted with the bsub -r option) are scheduled to be rerun on another host.
The accumulated preemption time in minutes after which a job cannot be preempted again, where minutes is wall-clock time, not normalized time.
The parameter of the same name in lsb.queues overrides this parameter. The parameter of the same name in lsb.applications overrides both this parameter and the parameter of the same name in lsb.queues.
Enables user-assigned job priority and specifies the maximum job priority a user can assign to a job.
LSF and queue administrators can assign a job priority higher than the specified value for jobs they own.
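For example (an illustrative value): with MAX_USER_PRIORITY=100, an ordinary user can assign a priority of at most 100 (for example, bsub -sp 50), while LSF and queue administrators can assign higher values to jobs they own.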
cpu_list defines the list of master host CPUs on which the mbatchd child query processes can run. Format the list as a white-space delimited list of CPU numbers.
For example, if you specify MBD_QUERY_CPUS=1 2 3, the mbatchd child query processes run only on CPU numbers 1, 2, and 3 on the master host.
This parameter allows you to specify the master host CPUs on which mbatchd child query processes can run (hard CPU affinity). This improves mbatchd scheduling and dispatch performance by binding query processes to specific CPUs so that higher priority mbatchd processes can run more efficiently.
When you define this parameter, LSF runs mbatchd child query processes only on the specified CPUs. The operating system can still assign other processes to run on the same CPU, for example when utilization of the bound CPU is lower than utilization of the unbound CPUs.
EGO_DAEMONS_CPUS=0 LSF_DAEMONS_CPUS=1:2 MBD_QUERY_CPUS=3
MBD_REFRESH_TIME=seconds [min_refresh_time]
where min_refresh_time defines the minimum time (in seconds) that the child mbatchd will stay to handle queries.
Time interval, in seconds, at which mbatchd forks a new child mbatchd to service query requests and keep the information sent back to clients up to date. A child mbatchd processes query requests by creating threads.
MBD_REFRESH_TIME applies only to UNIX platforms that support thread programming.
If MBD_REFRESH_TIME is less than min_refresh_time, the child mbatchd exits at MBD_REFRESH_TIME even if a job changes status or a new job is submitted before MBD_REFRESH_TIME expires.
If MBD_REFRESH_TIME is greater than min_refresh_time and no job changes status or no new job is submitted, the child mbatchd exits at MBD_REFRESH_TIME.
The value of this parameter must be between 0 and 300. Any values specified out of this range are ignored, and the system default value is applied.
The bjobs command may not display up-to-date information if two consecutive query commands are issued before a child mbatchd expires because child mbatchd job information is not updated. If you use the bjobs command and do not get up-to-date information, you may need to decrease the value of this parameter. Note, however, that the lower the value of this parameter, the more you negatively affect performance.
By default, when EGO-enabled SLA scheduling is configured, EGO allocates an entire host to LSF, which uses its own MXJ definition to determine how many slots are available on the host. LSF gets its host allocation from EGO, and runs as many jobs as the LSF configured MXJ for that host dictates.
MBD_USE_EGO_MXJ forces LSF to use the job slot maximum configured in the EGO consumer. This allows partial sharing of hosts (for example, a large SMP computer) among different consumers or workload managers. When MBD_USE_EGO_MXJ is set, LSF schedules jobs based on the number of slots allocated from EGO. For example, if hostA has 4 processors but EGO allocates 2 slots to an EGO-enabled SLA consumer, LSF can schedule a maximum of 2 jobs from that SLA on hostA.
MC_PLUGIN_SCHEDULE_ENHANCE= RESOURCE_ONLY
MC_PLUGIN_SCHEDULE_ENHANCE= COUNT_PREEMPTABLE [HIGH_QUEUE_PRIORITY] [PREEMPTABLE_QUEUE_PRIORITY] [PENDING_WHEN_NOSLOTS]
MultiCluster job forwarding model only. The parameter MC_PLUGIN_SCHEDULE_ENHANCE enhances the scheduler for the MultiCluster job forwarding model based on the settings selected. Use in conjunction with MC_PLUGIN_UPDATE_INTERVAL to set the data update interval between remote clusters. MC_PLUGIN_UPDATE_INTERVAL must be a non-zero value to enable the MultiCluster enhanced scheduler.
With the parameter MC_PLUGIN_SCHEDULE_ENHANCE set to a valid value, remote resources are considered as if MC_PLUGIN_REMOTE_RESOURCE=Y regardless of the actual setting. In addition, the submission cluster scheduler considers specific execution queue resources when scheduling jobs. See Using Platform MultiCluster for details.
The parameter MC_PLUGIN_SCHEDULE_ENHANCE was introduced in LSF Version 7 Update 6. All clusters within a MultiCluster configuration must be running a version of LSF containing this parameter to enable the enhanced scheduler.
After a MultiCluster connection is established, counters take the time set in MC_PLUGIN_UPDATE_INTERVAL to update. Scheduling decisions made before this first interval has passed do not accurately account for remote queue workload.
MultiCluster job forwarding model only; set for the execution cluster. The number of seconds between data updates between clusters.
A non-zero value enables collection of remote cluster queue data for use by the submission cluster enhanced scheduler.
Suggested value when enabled is MBD_SLEEP_TIME (default is 20 seconds).
A value of 0 disables collection of remote cluster queue data.
The minimum period in seconds between event log switches.
Works together with MAX_JOB_NUM to control how frequently mbatchd switches the file. mbatchd checks if MAX_JOB_NUM has been reached every MIN_SWITCH_PERIOD seconds. If mbatchd finds that MAX_JOB_NUM has been reached, it switches the events file.
To significantly improve the performance of mbatchd for large clusters, set this parameter to a value equal to or greater than 600. This causes mbatchd to fork a child process that handles event switching, thereby reducing the load on mbatchd. mbatchd terminates the child process and appends delta events to the new events file after the MIN_SWITCH_PERIOD has elapsed.
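For example (illustrative values): with MAX_JOB_NUM=1000 and MIN_SWITCH_PERIOD=600, mbatchd checks every 600 seconds whether lsb.events contains 1000 finished jobs, and switches the events file only when both conditions hold.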
Enables a child mbatchd to get up to date information about new jobs from the parent mbatchd. When set to Y, job queries with bjobs display new jobs submitted after the child mbatchd was created.
If you have enabled multithreaded mbatchd support, the bjobs command may not display up-to-date information if two consecutive query commands are issued before a child mbatchd expires, because child mbatchd job information is not updated. Use NEWJOB_REFRESH=Y to enable the parent mbatchd to push new job information to a child mbatchd.
When NEWJOB_REFRESH=Y, you should set MBD_REFRESH_TIME to a value greater than 10 seconds.
The parent mbatchd only pushes the new job event to a child mbatchd. The child mbatchd is not aware of status changes of existing jobs. The child mbatchd will not reflect the results of job control commands (bmod, bmig, bswitch, btop, bbot, brequeue, bstop, bresume, and so on) invoked after the child mbatchd is created.
Prevents preemption of jobs that will finish within the specified number of minutes or the specified percentage of the estimated run time or run limit.
Specifies that jobs due to finish within the specified number of minutes or percentage of job duration should not be preempted, where minutes is wall-clock time, not normalized time. The percentage must be greater than 0% and less than 100% (between 1% and 99%).
For example, if the job run limit is 60 minutes and NO_PREEMPT_FINISH_TIME=10%, the job cannot be preempted after it has been running for 54 minutes or longer.
If you specify a percentage for NO_PREEMPT_FINISH_TIME, the job must have a runtime estimate (bsub -We or RUNTIME in lsb.applications) or a run limit (bsub -W, RUNLIMIT in lsb.queues, or RUNLIMIT in lsb.applications).
Prevents preemption of jobs for the specified number of minutes of uninterrupted run time, where minutes is wall-clock time, not normalized time. NO_PREEMPT_INTERVAL=0 allows immediate preemption of jobs as soon as they start or resume running.
The parameter of the same name in lsb.queues overrides this parameter. The parameter of the same name in lsb.applications overrides both this parameter and the parameter of the same name in lsb.queues.
Prevents preemption of jobs that have been running for the specified number of minutes or the specified percentage of the estimated run time or run limit.
Specifies that jobs that have been running for the specified number of minutes or longer should not be preempted, where minutes is wall-clock time, not normalized time. The percentage must be greater than 0% and less than 100% (between 1% and 99%).
For example, if the job run limit is 60 minutes and NO_PREEMPT_RUN_TIME=50%, the job cannot be preempted after it has been running for 30 minutes or longer.
If you specify a percentage for NO_PREEMPT_RUN_TIME, the job must have a runtime estimate (bsub -We or RUNTIME in lsb.applications) or a run limit (bsub -W, RUNLIMIT in lsb.queues, or RUNLIMIT in lsb.applications).
For Cray NQS compatibility only. Used by LSF to get the NQS queue information.
If the NQS version on a Cray is NQS 1.1, 80.42 or NQS 71.3, this parameter does not need to be defined.
For other versions of NQS on Cray, define both NQS_QUEUES_FLAGS and NQS_REQUESTS_FLAGS.
To determine the value of this parameter, run the NQS qstat command. The value of Npk_int[1] in the output is the value you need for this parameter. Refer to the NQS chapter in Administering Platform LSF for more details.
For Cray NQS compatibility only.
If the NQS version on a Cray is NQS 80.42 or NQS 71.3, this parameter does not need to be defined.
If the version is NQS 1.1 on a Cray, set this parameter to 251918848. This is the qstat flag that LSF uses to retrieve requests on Cray in long format.
For other versions of NQS on a Cray, run the NQS qstat command. The value of Npk_int[1] in the output is the value you need for this parameter. Refer to the NQS chapter in Administering Platform LSF for more details.
If defined, LSF schedules jobs based on the number of slots assigned to the hosts instead of the number of CPUs. These slots can be defined by host in lsb.hosts or by slot limit in lsb.resources.
All slot-related messages still show the word “processors”, but actually refer to “slots” instead. Similarly, all scheduling activities also use slots instead of processors.
The time interval that a host should be interactively idle (it > 0) before jobs suspended because of a threshold on the pg load index can be resumed.
This parameter is used to prevent the case in which a batch job is suspended and resumed too often as it raises the paging rate while running and lowers it while suspended. If you are not concerned with the interference with interactive jobs caused by paging, the value of this parameter may be set to 0.
Specify one or more values (between 1 and 255, but not 99) that correspond to the exit codes your pre-execution scripts exit with in the case of failure. LSF excludes any host where the pre-execution script exits with a value specified in PREEXEC_EXCLUDE_HOST_EXIT_VALUES.
The exclusion list exists for this job until the mbatchd restarts.
Specify more than one value by separating them with a space. 99 is a reserved value. For example, PREEXEC_EXCLUDE_HOST_EXIT_VALUES=1 14 19 20 21.
Exclude values using a "~": PREEXEC_EXCLUDE_HOST_EXIT_VALUES=all ~40
In the case of failures that could be avoided by retrying on the same host, add the retry process to the pre-exec script.
Use in combination with MAX_PREEXEC_RETRY in lsb.params to limit the total number of hosts that are tried. In a MultiCluster environment, use in combination with LOCAL_MAX_PREEXEC_RETRY and REMOTE_MAX_PREEXEC_RETRY.
If preemptive scheduling is enabled, this parameter is used to disregard suspended jobs when determining if a job slot limit is exceeded, to preempt jobs with the shortest running time, and to optimize preemption of parallel jobs.
Specify one or more of the following keywords. Use spaces to separate multiple keywords.
Counts only running jobs when evaluating if a user group is approaching its per-processor job slot limit (SLOTS_PER_PROCESSOR, USERS, and PER_HOST=all in the lsb.resources file). Suspended jobs are ignored when this keyword is used.
Counts only running jobs when evaluating if a user group is approaching its total job slot limit (SLOTS, PER_USER=all, and HOSTS in the lsb.resources file). Suspended jobs are ignored when this keyword is used. When preemptive scheduling is enabled, suspended jobs never count against the total job slot limit for individual users.
Counts only running jobs when evaluating if a user or user group is approaching its per-host job slot limit (SLOTS and USERS in the lsb.resources file). Suspended jobs are ignored when this keyword is used.
Preempts the job that has been running for the shortest time. Run time is wall-clock time, not normalized run time.
Optimizes the preemption of parallel jobs by preempting only enough parallel jobs to start the high-priority parallel job.
Optimizes preemption of parallel jobs by preempting only low-priority parallel jobs based on the least number of jobs that will be suspended to allow the high-priority parallel job to start.
User limits and user group limits can interfere with preemption optimization of OPTIMAL_MINI_JOB. You should not configure OPTIMAL_MINI_JOB if you have user or user group limits configured.
You should configure PARALLEL_SCHED_BY_SLOT=Y when using OPTIMAL_MINI_JOB.
Counts only running jobs when evaluating if a user is approaching their per-processor job slot limit (SLOTS_PER_PROCESSOR, USERS, and PER_HOST=all in the lsb.resources file). Suspended jobs are ignored when this keyword is used. When preemptive scheduling is enabled, suspended jobs never count against the total job slot limit for individual users.
If preemptive scheduling is enabled, this parameter enables preemption of exclusive and backfill jobs.
Specify one or both of the following keywords. Separate keywords with a space.
Enables preemption of and preemption by exclusive jobs. LSB_DISABLE_LIMLOCK_EXCL=Y in lsf.conf must also be defined.
Enables preemption of backfill jobs. Jobs from higher priority queues can preempt jobs from backfill queues that are either backfilling reserved job slots or running as normal jobs.
Enables license preemption when preemptive scheduling is enabled (has no effect if PREEMPTIVE is not also specified) and specifies the licenses that will be preemption resources. Specify shared numeric resources, static or decreasing, that LSF is configured to release (RELEASE=Y in lsf.shared, which is the default).
You must also configure LSF preemption actions to make the preempted application release its licenses. To kill preempted jobs instead of suspending them, set TERMINATE_WHEN=PREEMPT in lsb.queues, or set JOB_CONTROLS in lsb.queues and specify brequeue as the SUSPEND action.
If Y, mbatchd reserves resources based on job slots instead of per-host.
bsub -n 4 -R "rusage[mem=500]" -q reservation my_job
requires the job to reserve 500 MB on each host where the job runs.
Some parallel jobs need to reserve resources based on job slots, rather than by host. In this example, if per-slot reservation is enabled by RESOURCE_RESERVE_PER_SLOT, the job my_job must reserve 500 MB of memory for each job slot (4*500=2 GB) on the host in order to run.
Used only with fairshare scheduling. Job slots weighting factor.
In the calculation of a user’s dynamic share priority, this factor determines the relative importance of the number of job slots reserved and in use by a user.
This parameter can also be set for an individual queue in lsb.queues. If defined, the queue value takes precedence.
Used only with fairshare scheduling. Enables decay for run time at the same rate as the decay set by HIST_HOURS for cumulative CPU time and historical run time.
In the calculation of a user’s dynamic share priority, this factor determines whether run time is decayed.
This parameter can also be set for an individual queue in lsb.queues. If defined, the queue value takes precedence.
Used only with fairshare scheduling. Run time weighting factor.
In the calculation of a user’s dynamic share priority, this factor determines the relative importance of the total run time of a user’s running jobs.
This parameter can also be set for an individual queue in lsb.queues. If defined, the queue value takes precedence.
The interval at which LSF checks the load conditions of each host, to decide whether jobs on the host must be suspended or resumed.
The job-level resource usage information is updated at a maximum frequency of every SBD_SLEEP_TIME seconds.
The update is done only if the value for the CPU time, resident memory usage, or virtual memory usage has changed by more than 10 percent from the previous update or if a new process or process group has been created.
Enable scheduler performance metric collection.
Use badmin perfmon stop and badmin perfmon start to dynamically control performance metric collection.
Used by Platform Session Scheduler (ssched).
A universally accessible and writable directory that will store Session Scheduler task accounting files. Each Session Scheduler session (each ssched instance) creates one accounting file. Each file contains one accounting entry for each task. The accounting file is named job_ID.ssched.acct. If no directory is specified, accounting records are not written.
Used by Platform Session Scheduler (ssched).
Maximum run time for a task. Users can override this value with a lower value. Specify a value greater than or equal to zero (0).
Used by Platform Session Scheduler (ssched).
Update the Session Scheduler task summary via bpost after the specified number of tasks finish. Specify a value greater than or equal to zero (0).
If both SSCHED_UPDATE_SUMMARY_INTERVAL and SSCHED_UPDATE_SUMMARY_BY_TASK are set to zero (0), bpost is not run.
Used by Platform Session Scheduler (ssched).
Update the Session Scheduler task summary via bpost after the specified number of seconds. Specify a value greater than or equal to zero (0).
If both SSCHED_UPDATE_SUMMARY_INTERVAL and SSCHED_UPDATE_SUMMARY_BY_TASK are set to zero (0), bpost is not run.
When STRICT_UG_CONTROL=Y is defined:
Jobs submitted with -G usergroup specified can only be controlled by the user group administrator of the specified user group.
User group administrators can be defined for user groups with all as a member.
After adding or changing STRICT_UG_CONTROL in lsb.params, use badmin reconfig to reconfigure your cluster.
Enables Windows workgroup account mapping, which allows LSF administrators to map all Windows workgroup users to a single Windows system account, eliminating the need to create multiple users and passwords in LSF. Users can submit and run jobs using their local user names and passwords, and LSF runs the jobs using the mapped system account name and password. With Windows workgroup account mapping, all users have the same permissions because all users map to the same system account.
To specify the user account, include the domain name in uppercase letters (DOMAIN_NAME\user_name).
Define this parameter for LSF Windows Workgroup installations only.
If USE_SUSP_SLOTS=Y, allows jobs from a low priority queue to use slots held by suspended jobs in a high priority queue, which has a preemption relation with the low priority queue.
Set USE_SUSP_SLOTS=N to prevent low priority jobs from using slots held by suspended jobs in a high priority queue, which has a preemption relation with the low priority queue.
Variable configuration is used to automatically change LSF configuration based on time windows. You define automatic configuration changes in lsb.params by using if-else constructs and time expressions. After you change the files, reconfigure the cluster with the badmin reconfig command.
The expressions are evaluated by LSF every 10 minutes based on mbatchd start time. When an expression evaluates true, LSF dynamically changes the configuration based on the associated configuration statements. Reconfiguration is done in real time without restarting mbatchd, providing continuous system availability.
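A minimal sketch of an if-else construct (the queue names and time window are illustrative, assuming the #if time(...) directive syntax used in LSF configuration files):
#if time(18:30-19:30)
DEFAULT_QUEUE=short
#else
DEFAULT_QUEUE=normal
#endif
Between 18:30 and 19:30, jobs submitted without a queue go to short; at all other times they go to normal.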