bhist

displays historical information about jobs

Synopsis

bhist [-a | -d | -e | -p | -r | -s] [-b | -w] [-l] [-C start_time,end_time] [-D start_time,end_time] [-f logfile_name | -n number_logfiles | -n 0] [-S start_time,end_time] [-J job_name] [-Jd "job_description"] [-Lp ls_project_name] [-m "host_name" ...] [-N host_name | -N host_model | -N cpu_factor] [-P project_name] [-q queue_name] [-u user_name | -u all | -G user_group]
bhist -t [-f logfile_name] [-T start_time,end_time]
bhist [-J job_name] [-Jd "job_description"] [-N host_name | -N host_model | -N cpu_factor] [job_ID ... | "job_ID[index]" ...]
bhist [-h | -V]

Description

By default:

  • Displays information about your own pending, running, and suspended jobs. Groups information by job.

  • CPU time is not normalized

  • Searches the event log file currently used by the LSF system: $LSB_SHAREDIR/cluster_name/logdir/lsb.events (see lsb.events(5))

  • Displays events occurring in the past week, but this can be changed by setting the environment variable LSB_BHIST_HOURS to an alternative number of hours
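
The default time window described above can be sketched in Python. This is only an illustration, not LSF code: the function name default_window_hours is hypothetical, and the sketch assumes that "the past week" means 168 hours and that LSB_BHIST_HOURS holds a whole number of hours.

```python
import os

ONE_WEEK_HOURS = 7 * 24  # assumption: "the past week" is treated as 168 hours


def default_window_hours(env=None):
    """Return how many hours of history would be shown by default.

    env is a mapping of environment variables; it defaults to os.environ.
    """
    env = os.environ if env is None else env
    value = env.get("LSB_BHIST_HOURS")
    if value is None:
        return ONE_WEEK_HOURS
    # Assumption: the variable holds a whole number of hours.
    return int(value)


print(default_window_hours({}))                         # 168
print(default_window_hours({"LSB_BHIST_HOURS": "24"}))  # 24
```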

Options

-a

Displays information about both finished and unfinished jobs.

This option overrides -d, -p, -s, and -r.

-b

Brief format.

-d

Only displays information about finished jobs.

-e

Only displays information about exited jobs.

-l

Long format.

If the job was submitted with bsub -K, the -l option displays Synchronous execution.

If you submitted a job using the OR (||) expression to specify alternative resources, this option displays the successful Execution rusage string with which the job ran.

If you submitted a job with multiple resource requirement strings using the bsub -R option for the order, same, rusage, and select sections, bhist -l displays a single, merged resource requirement string for those sections, as if they were submitted using a single -R.

Long format includes information about:
  • Job exit codes.

  • Exit reasons for terminated jobs

  • Job exceptions (for example, if a job's runtime exceeds the runtime estimate, a job exception of runtime_est_exceeded displays)

  • Resizable job information

  • SSH X11 forwarding information (-XF)

  • Changes to pending jobs as a result of the following bmod options:
    • Absolute priority scheduling (-aps | -apsn)

    • Runtime estimate (-We | -Wen)

    • Post-execution command (-Ep | -Epn)

    • User limits (-ul | -uln)

    • Current working directory (-cwd | -cwdn)

    • Checkpoint options (-k | -kn)

    • Migration threshold (-mig | -mign)

    • Autoresizable job attribute (-ar | -arn)

    • Job resize notification command (-rnc | -rncn)

    • Job description (-Jd | -Jdn)

-p

Only displays information about pending jobs.

-r

Only displays information about running jobs.

-s

Only displays information about suspended jobs.

-t

Displays job events chronologically.

By default, only displays records from the last week. For different time periods, use -t with the -T option.

-w

Wide format.

-C start_time,end_time

Only displays jobs that completed or exited during the specified time interval. Specify the times in the format yyyy/mm/dd/HH:MM. Do not specify spaces in the time interval string.

For more information about the syntax, see "Time interval format" at the end of this bhist command reference.
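
As an illustration of the yyyy/mm/dd/HH:MM format, the following Python sketch builds an interval string with no spaces. The helper name format_interval is hypothetical and is not part of LSF.

```python
from datetime import datetime

# strftime pattern corresponding to yyyy/mm/dd/HH:MM
TIME_FORMAT = "%Y/%m/%d/%H:%M"


def format_interval(start, end):
    """Join two datetimes as start_time,end_time with no spaces."""
    return f"{start.strftime(TIME_FORMAT)},{end.strftime(TIME_FORMAT)}"


interval = format_interval(datetime(2008, 5, 1, 0, 0),
                           datetime(2008, 5, 8, 23, 59))
print(interval)  # 2008/05/01/00:00,2008/05/08/23:59
```

A string built this way could be passed as the argument to -C, and the same format applies to -D, -S, and -T.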

-D start_time,end_time

Only displays jobs dispatched during the specified time interval. Specify the times in the format yyyy/mm/dd/HH:MM. Do not specify spaces in the time interval string.

For more information about the syntax, see "Time interval format" at the end of this bhist command reference.

-G user_group

Only displays jobs that were submitted with bsub -G for the specified user group. The -G option does not display jobs from subgroups within the specified user group.

The -G option cannot be used together with the -u option. You can only specify a user group name. The keyword all is not supported for -G.

-S start_time,end_time

Only displays information about jobs submitted during the specified time interval. Specify the times in the format yyyy/mm/dd/HH:MM. Do not specify spaces in the time interval string.

For more information about the syntax, see "Time interval format" at the end of this bhist command reference.

-T start_time,end_time

Used together with -t.

Only displays information about job events within the specified time interval. Specify the times in the format yyyy/mm/dd/HH:MM. Do not specify spaces in the time interval string.

For more information about the syntax, see "Time interval format" at the end of this bhist command reference.

-f logfile_name

Searches the specified event log. Specify either an absolute or a relative path.

Useful for analysis directly on the file.

The specified file path can contain up to 4094 characters for UNIX, or up to 255 characters for Windows.

-J job_name

Only displays the jobs that have the specified job name.

The job name can be up to 4094 characters long. Job names are not unique.

The wildcard character (*) can be used anywhere within a job name, but cannot appear within array indices. For example, job* returns jobA and jobarray[1], and *AAA*[1] returns the first element in all job arrays with names containing AAA. However, job1[*] does not return anything because the wildcard is within the array index.
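
The wildcard rule can be modeled with a short Python sketch. This is not how LSF implements matching. The names split_name and job_name_matches are hypothetical, and the sketch assumes that a pattern without an array index matches every element of a job array, which is inferred from the job* example.

```python
import re


def split_name(text):
    """Split 'name[index]' into (name, index); index is None if absent."""
    match = re.fullmatch(r"(.*?)(?:\[([^\[\]]*)\])?", text, re.DOTALL)
    return match.group(1), match.group(2)


def job_name_matches(pattern, job_name):
    """Return True if job_name matches pattern under the rule above."""
    pat_name, pat_index = split_name(pattern)
    name, index = split_name(job_name)
    if pat_index is not None:
        # A wildcard inside the array index never matches anything.
        if "*" in pat_index:
            return False
        if pat_index != index:
            return False
    # Treat * as "any run of characters"; everything else is literal.
    name_regex = ".*".join(re.escape(part) for part in pat_name.split("*"))
    return re.fullmatch(name_regex, name, re.DOTALL) is not None


print(job_name_matches("job*", "jobA"))          # True
print(job_name_matches("job*", "jobarray[1]"))   # True
print(job_name_matches("*AAA*[1]", "xAAAy[1]"))  # True
print(job_name_matches("job1[*]", "job1[1]"))    # False
```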

-Jd "job_description"

Only displays the jobs that have the specified job description.

The job description can be up to 4094 characters long. Job descriptions are not unique.

The wildcard character (*) can be used anywhere within a job description.

-Lp ls_project_name

Only displays information about jobs belonging to the specified License Scheduler project.

-m "host_name" ...

Only displays jobs dispatched to the specified host.

-n number_logfiles | -n 0

Searches the specified number of event logs, starting with the current event log and working through the most recent consecutively numbered logs. The maximum number of logs you can search is 100. Specify 0 to specify all the event log files in $LSB_SHAREDIR/cluster_name/logdir (up to a maximum of 100 files).

If you delete a file, you break the consecutive numbering, and older files are inaccessible to bhist.

For example, if you specify 3, LSF searches lsb.events, lsb.events.1, and lsb.events.2. If you specify 4, LSF searches lsb.events, lsb.events.1, lsb.events.2, and lsb.events.3. However, if lsb.events.2 is missing, both searches include only lsb.events and lsb.events.1.
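
The consecutive-numbering behavior in this example can be sketched in Python. The function logs_to_search is a hypothetical model of the documented behavior, not LSF code, and it takes the set of file names present in the log directory as input instead of reading the file system.

```python
MAX_LOGS = 100  # documented maximum number of event logs searched


def logs_to_search(number_logfiles, existing):
    """List the event logs that would be searched for a given -n value.

    number_logfiles: the value passed to -n (0 means all, up to MAX_LOGS).
    existing: set of file names present in the log directory.
    """
    if number_logfiles == 0:
        limit = MAX_LOGS
    else:
        limit = min(number_logfiles, MAX_LOGS)
    names = []
    for i in range(limit):
        name = "lsb.events" if i == 0 else f"lsb.events.{i}"
        if name not in existing:
            break  # a gap in the numbering hides all older files
        names.append(name)
    return names


complete = {"lsb.events", "lsb.events.1", "lsb.events.2", "lsb.events.3"}
with_gap = {"lsb.events", "lsb.events.1", "lsb.events.3"}

print(logs_to_search(3, complete))  # ['lsb.events', 'lsb.events.1', 'lsb.events.2']
print(logs_to_search(4, with_gap))  # ['lsb.events', 'lsb.events.1']
```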

-N host_name | -N host_model | -N cpu_factor

Normalizes CPU time by the specified CPU factor, or by the CPU factor of the specified host or host model.

If you use bhist directly on an event log, you must specify a CPU factor.

Use lsinfo to get host model and CPU factor information.

-P project_name

Only displays information about jobs belonging to the specified project.

-q queue_name

Only displays information about jobs submitted to the specified queue.

-u user_name | -u all

Displays information about jobs submitted by the specified user, or by all users if the keyword all is specified. To specify a Windows user account, include the domain name in uppercase letters and use a single backslash (DOMAIN_NAME\user_name) in a Windows command line or a double backslash (DOMAIN_NAME\\user_name) in a UNIX command line.

job_ID | "job_ID[index]"

Searches all event log files and only displays information about the specified jobs. If you specify a job array, displays all elements chronologically.

This option overrides all other options except -J, -Jd, -N, -h, and -V. When it is used with -J, only those jobs listed here that have the specified job name are displayed. When it is used with -Jd, only those jobs listed here that have the specified job description are displayed.

-h

Prints command usage to stderr and exits.

-V

Prints release version to stderr and exits.

Output: Default format

Statistics of the amount of time that a job has spent in various states:

PEND

The total waiting time excluding user suspended time before the job is dispatched.

PSUSP

The total user suspended time of a pending job.

RUN

The total run time of the job.

USUSP

The total user suspended time after the job is dispatched.

SSUSP

The total system suspended time after the job is dispatched.

UNKWN

The total unknown time of the job (job status becomes unknown if sbatchd on the execution host is temporarily unreachable).

TOTAL

The total time that the job has spent in all states; for a finished job, it is the turnaround time (that is, the time interval from job submission to job completion).
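
The relationship between TOTAL and the per-state columns can be illustrated with a small Python sketch. The helper total_time is hypothetical, and the sample values are made up for illustration; they are not taken from real bhist output.

```python
# Per-state columns of the default format, in the order described above.
STATES = ("PEND", "PSUSP", "RUN", "USUSP", "SSUSP", "UNKWN")


def total_time(times):
    """Sum the time spent in every state; states not given count as 0."""
    return sum(times.get(state, 0) for state in STATES)


# Made-up sample values, in seconds.
sample = {"PEND": 34, "RUN": 120, "USUSP": 10}
print(total_time(sample))  # 164
```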

Output: Long format (-l)

The -l option displays a long format listing with the following additional fields:

Project

The project the job was submitted from.

Application Profile

The application profile the job was submitted to.

Command

The job command.

Detailed history includes job group modification, the date and time the job was forwarded and the name of the cluster to which the job was forwarded.

The displayed job command can contain up to 4094 characters for UNIX, or up to 255 characters for Windows.

Initial checkpoint period

The initial checkpoint period specified at the job level, by bsub -k, or in an application profile with CHKPNT_INITPERIOD.

Checkpoint period

The checkpoint period specified at the job level, by bsub -k, in the queue with CHKPNT, or in an application profile with CHKPNT_PERIOD.

Checkpoint directory

The checkpoint directory specified at the job level, by bsub -k, in the queue with CHKPNT, or in an application profile with CHKPNT_DIR.

Migration threshold

The migration threshold specified at the job level, by bsub -mig.

Resizable job information
  • For JOB_NEW events, bhist displays the auto resizable attribute and resize notification command in the submission line.

  • For JOB_MODIFY2 events (bmod), bhist displays the auto resizable attribute and resize notification command in the submission line.
    • bmod -arn jobID:

      Parameters of Job are changed: Autoresizable attribute is removed;
    • bmod -ar jobID:

      Parameters of Job are changed: Job changes to autoresizable;
    • bmod -rnc resize_notification_cmd jobID:

      Parameters of Job are changed: Resize notification command changes to: <resize_notification_cmd>;
    • bmod -rncn jobID:

      Parameters of Job are changed: Resize notification command is removed;
  • For JOB_RESIZE_NOTIFY_START event, bhist displays:

    Additional allocation on <num_hosts> Hosts/Processors <host_list>
  • For JOB_RESIZE_NOTIFY_ACCEPT event, bhist displays the following:
    • If the notification command is configured and sbatchd successfully initializes the notification command, bhist displays

      Resize notification accepted. Notification command initialized (Command PID: 123456)
    • If a notification command is not defined, bhist displays

      Resize notification accepted
    • If sbatchd reports failure for whatever reason, bhist displays

      Resize notification failed
  • For JOB_RESIZE_NOTIFY_DONE event, bhist displays the following:
    • Resize notification command completed if status is 0

    • Resize notification command failed if status is 1

  • For JOB_RESIZE_RELEASE event, bhist displays

    Release allocation on <num_hosts> Hosts/Processors <host_list> by user or administrator <user_name>, Resize notification command: <command_line>, Cancel pending allocation request;

    For bmod -rncn, bhist displays

    Resize notification command disabled 
  • For JOB_RESIZE_CANCEL event, bhist displays

    Cancel pending allocation request

Synchronous execution

Job was submitted with the -K option. LSF submits the job and waits for the job to complete.

Terminated jobs: exit reasons

For jobs that have terminated, displays exit reasons.

Interactive jobs

For interactive jobs, bhist -l does NOT display information about a job’s execution home, cwd, or running PID.

Files

Reads lsb.events

See also

lsb.events, bgadd, bgdel, bjgroup, bsub, bjobs, lsinfo

Time interval format

You use the time interval to define a start and end time for collecting the data to be retrieved and displayed. While you can specify both a start and an end time, you can also let one of the values default. You can specify either of the times as an absolute time, by specifying the date or time, or you can specify them relative to the current time.

Specify the time interval as follows:

start_time,end_time|start_time,|,end_time|start_time

Specify start_time or end_time in the following format:

[year/][month/][day][/hour:minute|/hour:]|.|.-relative_int

Where:

  • year is a four-digit number representing the calendar year.

  • month is a number from 1 to 12, where 1 is January and 12 is December.

  • day is a number from 1 to 31, representing the day of the month.

  • hour is an integer from 0 to 23, representing the hour of the day on a 24-hour clock.

  • minute is an integer from 0 to 59, representing the minute of the hour.

  • . (period) represents the current month/day/hour:minute.

  • .-relative_int is a number, from 1 to 31, specifying a relative start or end time prior to now.

    start_time,end_time

    Specifies both the start and end times of the interval.

    start_time,

    Specifies a start time, and lets the end time default to now.

    ,end_time

    Specifies to start with the first logged occurrence, and end at the time specified.

    start_time

    Starts at the beginning of the most specific time period specified, and ends at the maximum value of the time period specified. For example, 2/ specifies the month of February: the interval starts on February 1 at 00:00 and ends at the last possible minute in February, February 28 at 23:59.
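
The four interval forms can be told apart with a short Python sketch. The function split_interval and the form labels it returns are hypothetical names chosen for illustration. The sketch only separates the two sides of the interval and does not interpret the individual time values.

```python
def split_interval(spec):
    """Classify an interval string and return (form, start, end).

    A defaulted side is returned as None. For the single start_time
    form, start and end both hold the same period string.
    """
    if " " in spec:
        raise ValueError("the interval string must not contain spaces")
    if spec.count(",") > 1:
        raise ValueError("at most one comma is allowed")
    if "," not in spec:
        if not spec:
            raise ValueError("empty interval")
        return ("period", spec, spec)
    start, end = spec.split(",")
    if start and end:
        return ("range", start, end)
    if start:
        return ("open_end", start, None)    # end defaults to now
    if end:
        return ("open_start", None, end)    # starts at first logged occurrence
    raise ValueError("at least one time must be given")


print(split_interval("1,8"))   # ('range', '1', '8')
print(split_interval("2/1,"))  # ('open_end', '2/1', None)
print(split_interval(",4"))    # ('open_start', None, '4')
print(split_interval("6"))     # ('period', '6', '6')
```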

Absolute time examples

Assume the current time is May 9 17:06 2008:

1,8 = May 1 00:00 2008 to May 8 23:59 2008

,4 = the time of the first occurrence to May 4 23:59 2008

6 = May 6 00:00 2008 to May 6 23:59 2008

2/ = Feb 1 00:00 2008 to Feb 28 23:59 2008

/12: = May 9 12:00 2008 to May 9 12:59 2008

2/1 = Feb 1 00:00 2008 to Feb 1 23:59 2008

2/1, = Feb 1 00:00 2008 to the current time

,. = the time of the first occurrence to the current time

,2/10: = the time of the first occurrence to May 2 10:59 2008

2001/12/31,2008/5/1 = Dec 31 00:00 2001 to May 1 23:59 2008

Relative time examples

.-9, = April 30 17:06 2008 to the current time

,.-2/ = the time of the first occurrence to Mar 7 17:06 2008

.-9,.-2 = nine days ago to two days ago (April 30 17:06 2008 to May 7 17:06 2008)
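
The day-based relative examples can be checked with Python's datetime module. This sketch assumes, as in the .-9 and .-2 examples above, that a relative value without a trailing slash counts days. The helper name days_ago is hypothetical.

```python
from datetime import datetime, timedelta


def days_ago(now, relative_int):
    """Return the time relative_int days before now (1 to 31)."""
    if not 1 <= relative_int <= 31:
        raise ValueError("relative_int must be from 1 to 31")
    return now - timedelta(days=relative_int)


now = datetime(2008, 5, 9, 17, 6)  # the current time assumed in the examples
print(days_ago(now, 9))  # 2008-04-30 17:06:00
print(days_ago(now, 2))  # 2008-05-07 17:06:00
```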