The job submission and execution controls feature enables you to use external, site-specific executables to validate, modify, and reject jobs, transfer data, and modify the job execution environment. By writing external submission (esub) and external execution (eexec) binaries or scripts, you can, for example, prevent the overuse of resources, specify execution hosts, or set required environment variables based on the job submission options.
The job submission and execution controls feature uses the executables esub and eexec to control job options and the job execution environment.
When a user submits a job using bsub or modifies a job using bmod, LSF runs the esub executable(s) on the submission host before accepting the job. If the user submitted the job with options such as -R to specify required resources or -q to specify a queue, an esub can change the values of those options to conform to resource usage policies at your site.
An esub can also change the user environment on the submission host prior to job submission so that when LSF copies the submission host environment to the execution host, the job runs on the execution host with the values specified by the esub. For example, an esub can add user environment variables to those already associated with the job.
An esub executable is typically used to enforce site-specific job submission policies and command-line syntax by validating or pre-parsing the command line. The file indicated by the environment variable LSB_SUB_PARM_FILE stores the values submitted by the user. An esub reads the LSB_SUB_PARM_FILE and then accepts or changes the option values or rejects the job. Because an esub runs before job submission, using an esub to reject incorrect job submissions improves overall system performance by reducing the load on the master batch daemon (mbatchd).
LSF provides a master external submission executable (LSF_SERVERDIR/mesub) that supports the use of application-specific esub executables. Users can specify one or more esub executables using the -a option of bsub or bmod. When a user submits or modifies a job or when a user restarts a job that was submitted or modified with the -a option included, mesub runs the specified esub executables.
An LSF administrator can specify one or more mandatory esub executables by defining the parameter LSB_ESUB_METHOD in lsf.conf. If a mandatory esub is defined, mesub runs the mandatory esub for all jobs submitted to LSF in addition to any esub executables specified with the -a option.
An eexec is an executable that you write to control the job environment on the execution host. For example, you can use an eexec to:
Run a shell script to create and populate environment variables needed by jobs
Monitor the number of tasks running on a host and raise a flag when this number exceeds a pre-determined limit
Pass DCE credentials and AFS tokens using a combination of esub and eexec executables; LSF functions as a pipe for passing data from the stdout of esub to the stdin of eexec
An eexec can change the user environment variable values transferred from the submission host so that the job runs on the execution host with a different environment.
For example, if you have a mixed UNIX and Windows cluster, the submission and execution hosts might use different operating systems. In this case, the submission host environment might not meet the job requirements when the job runs on the execution host. You can use an eexec to set the correct user environment between the two operating systems.
Typically, an eexec executable is a shell script that creates and populates the environment variables required by the job. An eexec can also monitor job execution and enforce site-specific resource usage policies.
If an eexec executable exists in the directory specified by LSF_SERVERDIR, LSF invokes that eexec for all jobs submitted to the cluster. By default, LSF runs eexec on the execution host before the job starts. The job process that invokes eexec waits for eexec to finish before continuing with job execution.
This feature is enabled by the presence of at least one esub or one eexec executable in the directory specified by the parameter LSF_SERVERDIR in lsf.conf. LSF does not include a default esub or eexec; you should write your own executables to meet the job requirements of your site.
The name of your esub should indicate the application with which it runs. For example: esub.fluent.
Once LSF_SERVERDIR contains one or more esub executables, users can use the -a option of bsub or bmod to specify the esub executables to run for each job they submit. If an eexec exists in LSF_SERVERDIR, LSF invokes that eexec for all jobs submitted to the cluster.
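As a rough illustration of the naming convention, the sketch below lists the application names for which an esub.<application> executable exists in a directory. The directory and the two esub files are stand-ins created only for this example (the names fluent and dce are illustrative), and list_esub_apps is a hypothetical helper, not an LSF command.

```shell
#!/bin/sh
# Stand-in directory; on a real cluster this would be LSF_SERVERDIR.
SERVERDIR_DEMO=$(mktemp -d)
for app in fluent dce; do
    printf '#!/bin/sh\nexit 0\n' > "$SERVERDIR_DEMO/esub.$app"
    chmod +x "$SERVERDIR_DEMO/esub.$app"
done

# list_esub_apps DIR: print the application part of each executable
# named esub.<application> in DIR, one per line. (Hypothetical helper.)
list_esub_apps() {
    for f in "$1"/esub.*; do
        [ -x "$f" ] || continue
        basename "$f" | sed 's/^esub\.//'
    done
}

apps=$(list_esub_apps "$SERVERDIR_DEMO" | sort)
echo "$apps"
rm -rf "$SERVERDIR_DEMO"
```

Each name printed this way corresponds to an application that a user could name with the -a option.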
When you write an esub, you can use the following environment variables provided by LSF for the esub execution environment:
LSB_SUB_PARM_FILE points to a temporary file that LSF uses to store the bsub options entered on the command line. An esub reads this file at job submission and either accepts the values, changes the values, or rejects the job. Job submission options are stored as name-value pairs on separate lines, in the format option_name=value.
For example, if a user submits the following job,
bsub -q normal -x -P myproject -R "rlm rusage[mem=100]" -n 90 myjob
the LSB_SUB_PARM_FILE contains the following lines:
LSB_SUB_QUEUE="normal"
LSB_SUB_EXCLUSIVE=Y
LSB_SUB_RES_REQ="rlm rusage[mem=100]"
LSB_SUB_PROJECT_NAME="myproject"
LSB_SUB_COMMAND_LINE="myjob"
LSB_SUB_NUM_PROCESSORS=90
LSB_SUB_MAX_NUM_PROCESSORS=90
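Because the parameter file uses shell-compatible name=value lines, a shell esub can source it directly. The following is a minimal sketch of that step, not a complete esub. The temporary file is a stand-in created for illustration; in a real esub, LSF sets LSB_SUB_PARM_FILE before the script runs.

```shell
#!/bin/sh
# Stand-in for the file LSF would provide; in a real esub,
# LSB_SUB_PARM_FILE is already set by LSF.
LSB_SUB_PARM_FILE=$(mktemp)
cat > "$LSB_SUB_PARM_FILE" <<'EOF'
LSB_SUB_QUEUE="normal"
LSB_SUB_EXCLUSIVE=Y
LSB_SUB_RES_REQ="rlm rusage[mem=100]"
LSB_SUB_PROJECT_NAME="myproject"
LSB_SUB_COMMAND_LINE="myjob"
LSB_SUB_NUM_PROCESSORS=90
LSB_SUB_MAX_NUM_PROCESSORS=90
EOF

# Source the file so that each option becomes a shell variable.
. "$LSB_SUB_PARM_FILE"

echo "queue:      $LSB_SUB_QUEUE"
echo "project:    $LSB_SUB_PROJECT_NAME"
echo "processors: $LSB_SUB_NUM_PROCESSORS"
echo "resources:  $LSB_SUB_RES_REQ"

rm -f "$LSB_SUB_PARM_FILE"
```

Options that the user did not specify are simply absent from the file, so quote every variable when you test it.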
An esub can change any or all of the job options by writing to the file specified by the environment variable LSB_SUB_MODIFY_FILE.
LSB_SUB_MODIFY_FILE points to the file that esub uses to modify the bsub job option values stored in the LSB_SUB_PARM_FILE. You can change the job options by having your esub write the new values to the LSB_SUB_MODIFY_FILE in any order, using the same format shown for the LSB_SUB_PARM_FILE. The value SUB_RESET, integers, and boolean values do not require quotes. String parameters must be enclosed in quotes, whether the value is a single string or a space-separated series of strings.
When your esub runs at job submission, LSF checks the LSB_SUB_MODIFY_FILE and applies changes so that the job runs with the revised option values.
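The sketch below shows one way an esub might write revised option values. It assumes a hypothetical site limit (MAX_PROCS) and a hypothetical queue name (large), and it uses temporary files as stand-ins for the files that LSF provides to a real esub.

```shell
#!/bin/sh
# Stand-ins for the files LSF would provide to a real esub.
LSB_SUB_PARM_FILE=$(mktemp)
LSB_SUB_MODIFY_FILE=$(mktemp)
echo 'LSB_SUB_QUEUE="normal"' > "$LSB_SUB_PARM_FILE"
echo 'LSB_SUB_NUM_PROCESSORS=90' >> "$LSB_SUB_PARM_FILE"

MAX_PROCS=32   # hypothetical site limit

. "$LSB_SUB_PARM_FILE"

# Cap the processor count and move the job to another queue.
# Integers are written without quotes; strings are quoted.
if [ "${LSB_SUB_NUM_PROCESSORS:-1}" -gt "$MAX_PROCS" ]; then
    echo "LSB_SUB_NUM_PROCESSORS=$MAX_PROCS" >> "$LSB_SUB_MODIFY_FILE"
    echo 'LSB_SUB_QUEUE="large"' >> "$LSB_SUB_MODIFY_FILE"
fi

modified=$(cat "$LSB_SUB_MODIFY_FILE")
echo "$modified"
rm -f "$LSB_SUB_PARM_FILE" "$LSB_SUB_MODIFY_FILE"
```

Appending with >> rather than overwriting with > keeps earlier changes intact when one esub makes several modifications.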
LSB_SUB_MODIFY_ENVFILE points to the file that esub uses to modify the user environment variables with which the job is submitted (as opposed to values specified by bsub options). You can change these environment variables by having your esub write the values to the LSB_SUB_MODIFY_ENVFILE in any order, using the format variable_name=value or variable_name="string".
LSF uses the LSB_SUB_MODIFY_ENVFILE to change the environment variables on the submission host. When your esub runs at job submission, LSF checks the LSB_SUB_MODIFY_ENVFILE and applies changes so that the job is submitted with the new environment variable values. LSF associates the new user environment with the job so that the job runs on the execution host with the new user environment.
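As a sketch of the format described above, the following script appends two variables to an environment file. The helper set_job_env and both variable names are hypothetical, and the temporary file stands in for the file that LSF provides.

```shell
#!/bin/sh
# Stand-in for the file LSF would provide to a real esub.
LSB_SUB_MODIFY_ENVFILE=$(mktemp)

# set_job_env NAME VALUE: append NAME="VALUE" to the environment file.
# (Hypothetical helper, defined only for this example.)
set_job_env() {
    printf '%s="%s"\n' "$1" "$2" >> "$LSB_SUB_MODIFY_ENVFILE"
}

# Hypothetical variables, chosen only for illustration.
set_job_env APP_LICENSE_FILE "1700@licserver"
set_job_env SCRATCH_DIR "/scratch/jobs"

env_changes=$(cat "$LSB_SUB_MODIFY_ENVFILE")
echo "$env_changes"
rm -f "$LSB_SUB_MODIFY_ENVFILE"
```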
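LSB_SUB_ABORT_VALUE indicates to LSF that a job should be rejected. To reject a job, your esub should exit with this value:
exit $LSB_SUB_ABORT_VALUE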
If multiple esubs are specified and one of the esubs exits with a value of LSB_SUB_ABORT_VALUE, LSF rejects the job without running the remaining esubs and returns a value of LSB_SUB_ABORT_VALUE.
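The rejection mechanism can be sketched as follows. LSF sets LSB_SUB_ABORT_VALUE for a real esub; the number assigned here is only a placeholder so that the example runs on its own, and check_queue and its list of allowed queues are hypothetical.

```shell
#!/bin/sh
# Stand-in: LSF sets LSB_SUB_ABORT_VALUE for a real esub.
# The number used here is only a placeholder.
LSB_SUB_ABORT_VALUE=97

# check_queue QUEUE: exit with the abort value when the queue is
# not in the hypothetical allow-list; otherwise return normally.
check_queue() {
    case "$1" in
        normal|priority) return 0 ;;
        *)
            echo "Queue '$1' is not allowed" >&2
            exit "$LSB_SUB_ABORT_VALUE"
            ;;
    esac
}

# Run each check in a subshell so that this demo can inspect the
# exit status. A real esub would call check_queue directly.
ok_status=0;  ( check_queue normal ) || ok_status=$?
bad_status=0; ( check_queue restricted ) 2>/dev/null || bad_status=$?

echo "normal     -> exit status $ok_status"
echo "restricted -> exit status $bad_status"
```

Writing the explanation to stderr before exiting gives the user a reason for the rejection.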
LSB_INVOKE_CMD specifies the name of the LSF command that most recently invoked an external executable.
LS_JOBPID stores the process ID of the LSF process that invoked eexec. If eexec is intended to monitor job execution, eexec must spawn a child and then have the parent eexec process exit. The eexec child should periodically test that the job process is still alive using the LS_JOBPID variable.
The following examples illustrate how customized esub and eexec executables can control job submission and execution.
The following esub rejects jobs that specify an invalid project name and allows only user1 and user2 to charge to proj1:
    #!/bin/sh
    . "$LSB_SUB_PARM_FILE"

    # Redirect stdout to stderr so echo can be used for error messages
    exec 1>&2

    # Check valid projects: the name must be either proj1 or proj2
    if [ "$LSB_SUB_PROJECT_NAME" != "proj1" ] && [ "$LSB_SUB_PROJECT_NAME" != "proj2" ]; then
        echo "Incorrect project name specified"
        exit $LSB_SUB_ABORT_VALUE
    fi

    USER=$(whoami)
    if [ "$LSB_SUB_PROJECT_NAME" = "proj1" ]; then
        # Only user1 and user2 can charge to proj1
        if [ "$USER" != "user1" ] && [ "$USER" != "user2" ]; then
            echo "You are not allowed to charge to this project"
            exit $LSB_SUB_ABORT_VALUE
        fi
    fi
The following esub changes the queue for userA, changes the shell for userB, and rejects all jobs from userC:
    #!/bin/sh
    . "$LSB_SUB_PARM_FILE"

    # Redirect stdout to stderr so echo can be used for error messages
    exec 1>&2

    USER=$(whoami)

    # Make sure userA is using the right queue (queueA)
    if [ "$USER" = "userA" ] && [ "$LSB_SUB_QUEUE" != "queueA" ]; then
        echo "userA has submitted a job to an incorrect queue"
        echo "...submitting to queueA"
        echo 'LSB_SUB_QUEUE="queueA"' > "$LSB_SUB_MODIFY_FILE"
    fi

    # Make sure userB is using the right shell (/bin/sh)
    if [ "$USER" = "userB" ] && [ "$SHELL" != "/bin/sh" ]; then
        echo "userB has submitted a job using $SHELL"
        echo "...using /bin/sh instead"
        echo 'SHELL="/bin/sh"' > "$LSB_SUB_MODIFY_ENVFILE"
    fi

    # Deny userC the ability to submit a job
    if [ "$USER" = "userC" ]; then
        echo "You are not permitted to submit a job."
        exit $LSB_SUB_ABORT_VALUE
    fi
The following eexec monitors the number of tasks running on a host:
    #!/bin/sh
    # eexec
    # Example script to monitor the number of jobs executing through RES.
    # This script works in cooperation with an elim that counts the
    # number of files in the TASKDIR directory. Each RES process on a host
    # will have a file in the TASKDIR directory.

    # Do not monitor lsbatch jobs.
    if [ "$LSB_JOBID" != "" ]; then
        exit 0
    fi

    # Directory containing all the task files for the host.
    # You can change this to any directory you wish; just make sure
    # that anyone has read/write permissions on it.
    TASKDIR="/tmp/RES_dir"

    # If TASKDIR does not exist, create it.
    if [ ! -d "$TASKDIR" ]; then
        mkdir "$TASKDIR" > /dev/null 2>&1
    fi

    # LS_JOBPID and USER must both be defined; otherwise exit normally.
    if [ -z "$LS_JOBPID" ]; then
        exit 0
    elif [ -z "$USER" ]; then
        exit 0
    fi

    taskFile="$TASKDIR/$LS_JOBPID.$USER"

    # Fork a grandchild to stay around for the duration of the task.
    touch "$taskFile" > /dev/null 2>&1
    (
        (
            while : ; do
                kill -0 "$LS_JOBPID" > /dev/null 2>&1
                if [ $? -eq 0 ]; then
                    # This is the poll interval. Increase it if you want,
                    # but see the elim for its corresponding update interval.
                    sleep 10
                else
                    rm "$taskFile" > /dev/null 2>&1
                    exit 0
                fi
            done
        ) &
    ) &
    wait
A combination of esub and eexec executables can be used to pass AFS/DCE tokens from the submission host to the execution host. LSF passes data from the standard output of esub to the standard input of eexec. A daemon wrapper script can be used to renew the tokens.
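The data path described above can be sketched with two shell functions joined by a pipe. This only illustrates the stdout-to-stdin handoff: the function names are hypothetical, the payload is a made-up string rather than a real AFS or DCE token, and the local pipe stands in for the transfer that LSF performs between hosts.

```shell
#!/bin/sh
# esub side: whatever the esub writes to stdout is handed to eexec.
# The payload is a made-up string, not a real AFS or DCE token.
esub_emit() {
    echo "token-for-$1"
}

# eexec side: read the data from stdin and act on it.
eexec_receive() {
    read -r payload
    echo "eexec received: $payload"
}

# The pipe below plays the role that LSF plays between the two hosts.
result=$(esub_emit demo_user | eexec_receive)
echo "$result"
```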
Use a text editor to view the lsf.sudoers configuration file.