The LSF batch event log file lsb.events is used to display LSF batch event history and for mbatchd failure recovery.
Whenever a host, job, or queue changes status, a record is appended to the event log file. The file is located in LSB_SHAREDIR/cluster_name/logdir, where LSB_SHAREDIR must be defined in lsf.conf(5) and cluster_name is the name of the LSF cluster, as returned by lsid. See mbatchd(8) for the description of LSB_SHAREDIR.
The bhist command searches the most current lsb.events file for its output.
The event log file is an ASCII file with one record per line. For the lsb.events file, the first line has the format # history_seek_position, which indicates the file position of the first history event after a log switch. For the lsb.events.n file, the first line has the format # timestamp_most_recent_event, which gives the timestamp of the most recent event in the file.
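As an illustration of the header line described above, the following minimal sketch extracts the number that follows the # character. It assumes the value is written as a decimal integer with optional whitespace around it; the function name parse_header is hypothetical and is not part of LSF.

```python
import re


def parse_header(first_line: str) -> int:
    """Return the integer that follows '#' on the first line of an event log.

    For lsb.events this is the history seek position; for a saved
    lsb.events.n file it is the timestamp of the most recent event.
    Assumption: the value is a plain decimal integer.
    """
    match = re.fullmatch(r"#\s*(\d+)\s*", first_line)
    if match is None:
        raise ValueError(f"not an event log header: {first_line!r}")
    return int(match.group(1))
```

The caller decides how to interpret the returned number, depending on which of the two file types was opened.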
Use MAX_JOB_NUM in lsb.params to set the maximum number of finished jobs whose events are to be stored in the lsb.events log file.
Once the limit is reached, mbatchd starts a new event log file. The old event log file is saved as lsb.events.n, with subsequent sequence number suffixes incremented by 1 each time a new log file is started. Event logging continues in the new lsb.events file.
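Because the saved files carry a numeric suffix, a plain alphabetical sort places lsb.events.10 before lsb.events.2. The sketch below lists the saved files in a log directory in numeric order. The helper name rotated_logs is hypothetical, the directory is supplied by the caller, and the sketch makes no claim about whether the lowest or the highest suffix holds the oldest events.

```python
import re
from pathlib import Path


def rotated_logs(logdir):
    """Return the lsb.events.<n> files in logdir, sorted by numeric suffix.

    The active lsb.events file (no suffix) is deliberately excluded.
    """
    pattern = re.compile(r"lsb\.events\.(\d+)")
    found = []
    for path in Path(logdir).iterdir():
        match = pattern.fullmatch(path.name)
        if match:
            found.append((int(match.group(1)), path))
    # Sort on the integer suffix only, so 2 comes before 10.
    found.sort(key=lambda item: item[0])
    return [path for _, path in found]
```

A typical call passes the LSB_SHAREDIR/cluster_name/logdir directory for the cluster in question.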
Start time – the job should be started on or after this time
Termination deadline – the job should be terminated by this time (%d)
Current working directory (up to 4094 characters for UNIX or 255 characters for Windows)
Input file name (up to 4094 characters for UNIX or 255 characters for Windows)
Output file name (up to 4094 characters for UNIX or 255 characters for Windows)
Error output file name (up to 4094 characters for UNIX or 255 characters for Windows)
Job command (up to 4094 characters for UNIX or 255 characters for Windows)
Time Event, for job dependency condition; specifies when time event ended
Spool input file (up to 4094 characters for UNIX or 255 characters for Windows)
Spool command file (up to 4094 characters for UNIX or 255 characters for Windows)
Job spool directory (up to 4094 characters for UNIX or 255 characters for Windows)
Post-execution command to run on the execution host after the job finishes
Resize notification command to run on the first execution host to inform job of a resize event.
A job has been forwarded to a remote cluster (Platform MultiCluster only).
If LSF_HPC_EXTENSIONS="SHORT_EVENTFILE" is specified in lsf.conf, older daemons and commands (pre-LSF Version 6.0) cannot recognize the lsb.events file format.
Number of reserved hosts in the remote cluster
If LSF_HPC_EXTENSIONS="SHORT_EVENTFILE" is specified in lsf.conf, the value of this field is the number of hosts listed in the reserHosts field.
List of names of the reserved hosts in the remote cluster
If LSF_HPC_EXTENSIONS="SHORT_EVENTFILE" is specified in lsf.conf, the value of this field is logged in a shortened format.
If LSF_HPC_EXTENSIONS="SHORT_EVENTFILE" is specified in lsf.conf, older daemons and commands (pre-LSF Version 6.0) cannot recognize the lsb.events file format.
Number of processors used for execution
If LSF_HPC_EXTENSIONS="SHORT_EVENTFILE" is specified in lsf.conf, the value of this field is the number of hosts listed in the execHosts field.
If LSF_HPC_EXTENSIONS="SHORT_EVENTFILE" is specified in lsf.conf, the value of this field is logged in a shortened format.
Not switched: the mbatchd has switched the job to a new queue, but the sbatchd has not been informed of the switch
Checkpoint signal: the job has not been sent this signal to checkpoint itself
If set to true, then parameters for the job cannot be modified.
Number of processors requested for execution. The value 2147483646 means the number of processors is undefined.
Start time – the job should be started on or after this time
Termination deadline – the job should be terminated by this time
List of names of candidate hosts for job dispatching; blank if the last field value is 0. If there is more than one host name, then each additional host name will be returned in its own field
Time Event, for job dependency condition; specifies when time event ended
Input file name (up to 4094 characters for UNIX or 255 characters for Windows)
Output file name (up to 4094 characters for UNIX or 255 characters for Windows)
Error output file name (up to 4094 characters for UNIX or 255 characters for Windows)
Job command (up to 4094 characters for UNIX or 255 characters for Windows)
Current working directory (up to 4094 characters for UNIX or 255 characters for Windows)
Maximum number of processors. The value 2147483646 means the maximum number of processors is undefined.
Spool input file (up to 4094 characters for UNIX or 255 characters for Windows)
Spool command file (up to 4094 characters for UNIX or 255 characters for Windows)
Absolute priority scheduling (APS) value set by administrator
Post-execution command to run on the execution host after the job finishes
Resize notification command to run on the first execution host to inform job of a resize event.
Current working directory job used on execution host (up to 4094 characters for UNIX or 255 characters for Windows)
How long a backfilled job can run; used for preemption backfill jobs
Time Event, for missched exception; specifies when time event ended
Except Info: the pending reason for a missched or cantrun exception, or the exit code of the job for an abend exception; otherwise 0.
This is created when a job is inserted into a chunk.
If LSF_HPC_EXTENSIONS="SHORT_EVENTFILE" is specified in lsf.conf, older daemons and commands (pre-LSF Version 6.0) cannot recognize the lsb.events file format.
If LSF_HPC_EXTENSIONS="SHORT_EVENTFILE" is specified in lsf.conf, the value of this field is the number of hosts listed in the execHosts field.
If LSF_HPC_EXTENSIONS="SHORT_EVENTFILE" is specified in lsf.conf, the value of this field is logged in a shortened format.
Integral of the shared memory size over time (valid only on Ultrix)
Current working directory job used on execution host (up to 4094 characters for UNIX or 255 characters for Windows)
Total resident memory usage in KB of all currently running processes in a given process group
Total virtual memory usage in KB of all currently running processes in given process groups
Number of currently active processes in given process groups. This entry has four sub-fields:
If LSF_HPC_EXTENSIONS="SHORT_EVENTFILE" is specified in lsf.conf, the value of this field is the number of hosts listed in the execHosts field.
If LSF_HPC_EXTENSIONS="SHORT_EVENTFILE" is specified in lsf.conf, the value of this field is logged in a shortened format.
Name of queue if a remote brun job ran; otherwise, this field is empty
Time Event, for job dependency condition; specifies when time event ended
SLA service class name that the job group is to be attached to
Time Event, for job dependency condition; specifies when time event ended
SLA service class name that the job group is to be attached to
Number of processors used for execution. If LSF_HPC_EXTENSIONS="SHORT_EVENTFILE" is specified in lsf.conf, the value of this field is the number of hosts listed in short format.
List of execution host names. If LSF_HPC_EXTENSIONS="SHORT_EVENTFILE" is specified in lsf.conf, the value of this field is logged in a shortened format.
Resize notification executable process ID. If no resize notification executable is defined, this field will be set to 0.
Resize notification executable process group ID. If no resize notification executable is defined, this field will be set to 0.
Status field used to indicate possible errors: 0 indicates success, 1 indicates failure.
Resize notification command to run on the first execution host to inform job of a resize event.
Number of processors used for execution during resize. If LSF_HPC_EXTENSIONS="SHORT_EVENTFILE" is specified in lsf.conf, the value of this field is the number of hosts listed in short format.
List of execution host names during resize. If LSF_HPC_EXTENSIONS="SHORT_EVENTFILE" is specified in lsf.conf, the value of this field is logged in a shortened format.
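The processor-count fields described earlier use the value 2147483646 to mean that the number of processors is undefined. A reader of the log may want to map that value to an explicit "no value" marker instead of treating it as a real count. The sketch below shows one way to do so; the names UNDEFINED_PROCESSORS and processors_or_none are hypothetical, and the sketch assumes the field has already been isolated from its record as a string.

```python
from typing import Optional

# Value the field descriptions give for an undefined processor count.
UNDEFINED_PROCESSORS = 2147483646


def processors_or_none(raw: str) -> Optional[int]:
    """Convert a processor-count field to an int, or None when undefined."""
    value = int(raw)
    return None if value == UNDEFINED_PROCESSORS else value
```

Returning None keeps the undefined case from being mistaken for a very large processor request in later arithmetic.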