LSBLIB provides lsb_submit() for job submission and lsb_modify() for job modification.
On success, both calls return the job ID. On failure, they return -1 and set lsberrno to indicate the error. lsb_modify() is similar to lsb_submit(), except that it modifies the parameters of an already submitted job.
The submit structure is defined in lsbatch.h as:
struct submit {
    int options;              /* Indicates which optional fields are present */
    int options2;             /* Indicates which additional fields are present */
    char *jobName;            /* Job name (optional) */
    char *queue;              /* Submit the job to this queue (optional) */
    int numAskedHosts;        /* Size of askedHosts (optional) */
    char **askedHosts;        /* Array of names of candidate hosts (optional) */
    char *resReq;             /* Resource requirements of the job (optional) */
    int rLimits[LSF_RLIM_NLIMITS];
                              /* Limits on system resource use by all of the
                                 job's processes */
    char *hostSpec;           /* Host model used for scaling rLimits (optional) */
    int numProcessors;        /* Initial number of processors needed by the job */
    char *dependCond;         /* Job dependency condition (optional) */
    char *timeEvent;          /* Time event string for scheduled repetitive jobs
                                 (optional) */
    time_t beginTime;         /* Dispatch the job on or after beginTime */
    time_t termTime;          /* Job termination deadline */
    int sigValue;             /* This variable is obsolete */
    char *inFile;             /* Path name of the job's standard input file
                                 (optional) */
    char *outFile;            /* Path name of the job's standard output file
                                 (optional) */
    char *errFile;            /* Path name of the job's standard error output
                                 file (optional) */
    char *command;            /* Command line of the job */
    char *newCommand;         /* New command for bmod (optional) */
    time_t chkpntPeriod;      /* Job is checkpointable with this period
                                 (optional) */
    char *chkpntDir;          /* Directory for this job's chk directory
                                 (optional) */
    int nxf;                  /* Size of xf (optional) */
    struct xFile *xf;         /* Array of file transfer specifications
                                 (optional) */
    char *preExecCmd;         /* Job's pre-execution command (optional) */
    char *mailUser;           /* User e-mail address to which the job's output
                                 is mailed (optional) */
    int delOptions;           /* Bits to be removed from options
                                 (lsb_modify() only) */
    char *projectName;        /* Name of the job's project (optional) */
    int maxNumProcessors;     /* Requested maximum number of job slots for the
                                 job */
    char *loginShell;         /* Login shell to be used to re-initialize the
                                 environment */
    char *userGroup;          /* User group */
    char *exceptList;         /* List of exception handlers */
    int userPriority;         /* User priority */
    char *rsvId;              /* Use hosts reserved in advance */
    char *jobGroup;           /* Job group under which the job runs */
    char *sla;                /* SLA under which the job runs */
    char *extsched;           /* extsched options */
    int warningTimePeriod;    /* Warning time period (seconds), -1 if
                                 unspecified */
    char *warningAction;      /* Warning action, SIGNAL | CHKPNT | command,
                                 NULL if unspecified */
    char *licenseProject;     /* The license scheduler project */
    int options3;             /* Extend options again */
    int delOptions3;          /* Delete options in options3 field */
    char *app;                /* Application profile */
    int jsdlFlag;             /* -1 if no -jsdl and -jsdl_strict options
                                  0 -jsdl_strict option
                                  1 -jsdl option */
    char *jsdlDoc;            /* jsdl filename */
    void *correlator;         /* ARM correlator */
    char *apsString;          /* aps string set by admin to denote system value
                                 or admin factor value */
    char *postExecCmd;        /* Post-execution commands specified by -Ep */
    char *cwd;                /* CWD specified by -cwd */
    int runtimeEstimation;    /* Runtime estimation specified by -We */
    char *requeueEValues;     /* -Q: Job level requeue exit values */
    int initChkpntPeriod;     /* Initial checkpoint period */
    int migThreshold;         /* Migration threshold */
    char *notifyCmd;          /* Script or command invoked when resize request
                                 satisfied */
};
For a complete description of the fields in the submit structure, see the lsb_submit(3) man page.
The submitReply structure is defined in lsbatch.h as:
struct submitReply {
    char *queue;              /* Queue name the job was submitted to */
    LS_LONG_INT badJobId;     /* dependCond contains badJobId but there is
                                 no such job */
    char *badJobName;         /* dependCond contains badJobName but there is
                                 no such job */
    int badReqIndx;           /* Index of a host or resource limit that caused
                                 an error */
};
The last three fields of the submitReply structure are used only when lsb_submit() or lsb_modify() fails.
For a complete description of the fields in the submitReply structure, see the lsb_submit(3) man page.
To submit a new job, fill out this data structure and then call lsb_submit(). The delOptions variable is ignored by LSF batch for lsb_submit().
The example job submission program below takes the job command line as an argument and submits the job to LSF batch. For simplicity, it is assumed that the job command does not have arguments.
/******************************************************
 * LSBLIB -- Examples
 *
 * simple bsub
 * This program submits a batch job to LSF.
 * It is the equivalent of using the "bsub" command without
 * any options.
 ******************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>     /* memset() */
#include <lsf/lsbatch.h>
#include "combine_arg.h"
/* To use the function "combine_arg" to combine arguments on the
   command line, include its header file "combine_arg.h". */

int main(int argc, char **argv)
{
    struct submit req;          /* job specifications */
    struct submitReply reply;   /* results of job submission */
    int jobId;                  /* job ID of submitted job */
    int i;

    memset(&req, 0, sizeof(req));   /* initializes req */

    /* initialize LSBLIB and get the configuration environment */
    if (lsb_init(argv[0]) < 0) {
        lsb_perror("simbsub: lsb_init() failed");
        exit(-1);
    }

    /* check if input is in the right format:
       "./simbsub COMMAND ARGUMENTS" */
    if (argc < 2) {
        fprintf(stderr, "Usage: simbsub command\n");
        exit(-1);
    }

    /* options and options2 are bitwise inclusive OR of some of
       the SUB_* flags */
    req.options = 0;
    req.options2 = 0;

    /* resource limits are initialized to default */
    for (i = 0; i < LSF_RLIM_NLIMITS; i++)
        req.rLimits[i] = DEFAULT_RLIMIT;

    req.beginTime = 0;          /* specific date and time to dispatch the job */
    req.termTime = 0;           /* specifies job termination deadline */
    req.numProcessors = 1;      /* initial number of processors needed by a
                                   (parallel) job */
    req.maxNumProcessors = 1;   /* max num of processors required to run the
                                   (parallel) job */
    req.command = combine_arg(argc, argv);  /* command line of job */

    printf("----------------------------------------------\n");

    jobId = lsb_submit(&req, &reply);   /* submit the job with specifications */

    if (jobId < 0) {            /* if job submission fails, lsb_submit returns -1 */
        switch (lsberrno) {     /* and sets lsberrno to indicate the error */
        case LSBE_QUEUE_USE:
        case LSBE_QUEUE_CLOSED:
            lsb_perror(reply.queue);
            exit(-1);
        default:
            lsb_perror(NULL);
            exit(-1);
        }
    }
    exit(0);
} /* main */
The above program will produce output similar to the following:
The options and options2 fields of the submit structure are the bitwise inclusive OR of some of the SUB_* flags defined in lsbatch.h. These flags serve two purposes.
Some flags indicate which of the optional fields of the submit structure are present. Those that are not present have default values.
Other flags indicate submission options. For a description of these flags, see lsb_submit(3).
Since options indicate which of the optional fields are meaningful, the programmer does not need to initialize the fields that are not chosen by options. All parameters that are not optional must be initialized properly.
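As an illustration of the first kind of flag, the sketch below fills in two optional fields and marks them as present in options. So that it compiles without an LSF installation, it uses a cut-down stand-in structure (submit_sketch, a hypothetical name) and locally defined SUB_* values that are placeholders only. A real program would include lsbatch.h and use struct submit and the flag definitions found there.

```c
#include <stddef.h>

/* Placeholder flag values for this sketch only; a real program
 * takes the SUB_* definitions from <lsf/lsbatch.h>. */
#define SUB_JOB_NAME 0x01
#define SUB_QUEUE    0x02
#define SUB_OUT_FILE 0x10

/* Hypothetical cut-down stand-in for struct submit. */
struct submit_sketch {
    int options;
    const char *jobName;
    const char *queue;
    const char *outFile;
};

/* Fill in the queue and output file and flag both fields as present.
 * Fields whose flag is not set (here jobName) are left untouched. */
void request_queue_and_output(struct submit_sketch *req,
                              const char *queue,
                              const char *outFile)
{
    req->queue = queue;
    req->options |= SUB_QUEUE;

    req->outFile = outFile;
    req->options |= SUB_OUT_FILE;
}
```

The point of the pattern is that each optional field is assigned together with its flag, so the two cannot drift apart.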
req.numProcessors = 1;      /* initial number of processors needed by a (parallel) job */
req.maxNumProcessors = 1;   /* max number of processors required to run the (parallel) job */
numProcessors and maxNumProcessors are initialized to ensure only one processor is requested. They are defined in order to synchronize the job specification in lsb_submit() to the default used by bsub.
If the resReq field of the submit structure is NULL, LSBLIB tries to obtain resource requirements for the command from the remote task list. If the task does not appear in the remote task list, NULL is passed to LSF batch. mbatchd then uses the default resource requirements, with the DFT_FROMTYPE option bit set, when making an LSLIB call for host selection from LIM.
for (i = 0; i < LSF_RLIM_NLIMITS; i++)
    req.rLimits[i] = DEFAULT_RLIMIT;
The default resource limit (DEFAULT_RLIMIT), defined in lsf.h, means that no resource limit is imposed.
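A common variation is to reset every limit to the default and then override a single entry. The sketch below shows that pattern on a plain int array. The three macro values are placeholders chosen so the code compiles without LSF; the real definitions come from lsf.h. The helper name set_cpu_limit_only is hypothetical, and cpuLimit is passed through unchanged, so the unit LSF expects should be checked in lsb_submit(3).

```c
/* Placeholder values for this sketch only; a real program takes
 * these definitions from <lsf/lsf.h>. */
#define LSF_RLIM_NLIMITS 12
#define LSF_RLIMIT_CPU   0
#define DEFAULT_RLIMIT   (-1)

/* Reset all limits to "no limit", then set only the CPU limit. */
void set_cpu_limit_only(int rLimits[LSF_RLIM_NLIMITS], int cpuLimit)
{
    int i;

    for (i = 0; i < LSF_RLIM_NLIMITS; i++)
        rLimits[i] = DEFAULT_RLIMIT;

    rLimits[LSF_RLIMIT_CPU] = cpuLimit;
}
```

With LSF, the array passed in would be the rLimits member of the submit structure.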
The hostSpec field of the submit structure specifies the host model to use for scaling rLimits[LSF_RLIMIT_CPU] and rLimits[LSF_RLIMIT_RUN] (see lsb_queueinfo(3)). If hostSpec is NULL, the local host’s model is assumed.
req.beginTime = 0;   /* specific date and time to dispatch the job */
req.termTime = 0;    /* specifies job termination deadline */
If the beginTime field of the submit structure is 0, the job starts as soon as possible.
A USR2 signal is sent if the job is running at termTime. If the job does not terminate within 10 minutes after being sent this signal, it is killed. If the termTime field of the submit structure is 0, the job is allowed to run until it reaches a resource limit.
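When non-zero values are wanted, both fields are ordinary time_t timestamps. The sketch below computes them relative to a given "now", assuming (as on POSIX systems) that time_t counts seconds. The job_window structure and the make_job_window helper are hypothetical names introduced for illustration and are not part of LSBLIB.

```c
#include <time.h>

/* Hypothetical holder for the two timestamps. */
struct job_window {
    time_t beginTime;   /* 0 means: start as soon as possible */
    time_t termTime;    /* 0 means: no termination deadline   */
};

/* Derive begin and termination times from offsets in seconds.
 * An offset of 0 (or less) leaves the corresponding field at 0. */
struct job_window make_job_window(time_t now,
                                  long delaySeconds,
                                  long deadlineSeconds)
{
    struct job_window w;

    w.beginTime = (delaySeconds > 0) ? now + delaySeconds : 0;
    w.termTime  = (deadlineSeconds > 0) ? now + deadlineSeconds : 0;
    return w;
}
```

In a submission program, the caller would pass time(NULL) as now and copy the two results into req.beginTime and req.termTime before calling lsb_submit().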
The example below checks the value of lsberrno when lsb_submit() fails:
if (jobId < 0) {            /* if job submission fails, lsb_submit returns -1 */
    switch (lsberrno) {     /* and sets lsberrno to indicate the error */
    case LSBE_QUEUE_USE:
    case LSBE_QUEUE_CLOSED:
        lsb_perror(reply.queue);
        exit(-1);
    default:
        lsb_perror(NULL);
        exit(-1);
    }
}
Different actions are taken depending on the type of the error. All possible error numbers are defined in lsbatch.h. For example, error number LSBE_QUEUE_USE indicates that the user is not authorized to use the queue. The error number LSBE_QUEUE_CLOSED indicates that the queue is closed.
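The same dispatch-on-error idea can be sketched without LSF by formatting a diagnostic into a buffer instead of calling lsb_perror(). In the sketch below the numeric error values are placeholders (the real LSBE_* numbers are defined in lsbatch.h), and format_submit_error is a hypothetical helper, not an LSBLIB function.

```c
#include <stdio.h>

/* Placeholder error numbers for this sketch only; a real program
 * uses the LSBE_* definitions from <lsf/lsbatch.h>. */
enum { LSBE_QUEUE_USE = 1, LSBE_QUEUE_CLOSED = 2 };

/* Build a one-line diagnostic. Queue-related errors name the queue
 * (as reply.queue would); all other errors get a generic message.
 * Returns the value returned by snprintf(). */
int format_submit_error(int err, const char *queue, char *buf, size_t len)
{
    switch (err) {
    case LSBE_QUEUE_USE:
    case LSBE_QUEUE_CLOSED:
        return snprintf(buf, len,
                        "queue <%s>: submission rejected (error %d)",
                        queue ? queue : "unknown", err);
    default:
        return snprintf(buf, len, "submission failed (error %d)", err);
    }
}
```

Grouping the queue-related cases keeps the queue name in the message only when it is meaningful for the error.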
Since a queue name was not specified for the job, the job is submitted to the default queue. The queue field of the submitReply structure contains the name of the queue to which the job was submitted.
The output from the job is mailed to the user because the program did not specify a file name for the outFile parameter in the submit structure.
The program assumes that uniform user names and user ID spaces exist among all the hosts in the cluster. That is, a job submitted by a given user will run under the same user's account on the execution host. For situations where non-uniform user names and user ID spaces exist, account mapping must be used to determine the account used to run a job.
* indicates a bitwise OR mask for options2.
** indicates -1 means undefined
Even if not all of the options are used, all optional string fields must be initialized to the empty string. For a complete description of the fields in the submit structure, see the lsb_submit(3) man page.
To modify an already submitted job, fill out a new submit structure to override existing parameters, and use delOptions to remove option bits that were previously specified for the job. Modifying a submitted job is like resubmitting it, so a program similar to the submission example can be used with minor changes. The one additional parameter that must be specified for job modification is the job ID.
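The interplay of options and delOptions can be sketched as follows: one field is overridden by setting its flag in options, and a previously specified option is cleared by setting its flag in delOptions. The structure modify_sketch and the helper are hypothetical stand-ins so the code compiles without LSF, and the SUB_* values are placeholders for the definitions in lsbatch.h.

```c
#include <stddef.h>

/* Placeholder flag values for this sketch only; a real program
 * takes the SUB_* definitions from <lsf/lsbatch.h>. */
#define SUB_JOB_NAME 0x01
#define SUB_QUEUE    0x02

/* Hypothetical cut-down stand-in for struct submit. */
struct modify_sketch {
    int options;        /* fields that carry new values */
    int delOptions;     /* option bits to remove from the job */
    const char *jobName;
    const char *queue;
};

/* Prepare a request that moves the job to newQueue and removes
 * the job name that was given at submission time. */
void move_to_queue_and_drop_name(struct modify_sketch *req,
                                 const char *newQueue)
{
    req->options = 0;
    req->delOptions = 0;
    req->jobName = NULL;

    req->queue = newQueue;
    req->options |= SUB_QUEUE;          /* override: new queue */

    req->delOptions |= SUB_JOB_NAME;    /* clear: previous job name */
}
```

With LSF, the prepared structure would be passed to lsb_modify() together with the ID of the job to change; see the man page for the exact argument list.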
All applications that call lsb_submit() and lsb_modify() are subject to authentication constraints described in .