Example output of bacct and bhist


| Example termination cause | Termination reason in bacct -l | Example bhist output |
| --- | --- | --- |
| bkill -s KILL, bkill job_ID | Completed <exit>; TERM_OWNER or TERM_ADMIN | Thu Mar 13 17:32:05: Signal <KILL> requested by user or administrator <user2>; Thu Mar 13 17:32:06: Exited by signal 2. The CPU time used is 0.1 seconds; |
| bkill -r | Completed <exit>; TERM_FORCE_ADMIN or TERM_FORCE_OWNER when sbatchd is not reachable. Otherwise, TERM_OWNER or TERM_ADMIN | Thu Mar 13 17:32:05: Signal <KILL> requested by user or administrator <user2>; Thu Mar 13 17:32:06: Exited by signal 2. The CPU time used is 0.1 seconds; |
| TERMINATE_WHEN | Completed <exit>; TERM_LOAD, TERM_WINDOW, or TERM_PREEMPT | Thu Mar 13 17:33:16: Signal <KILL> requested by user or administrator <user2>; Thu Mar 13 17:33:18: Exited by signal 2. The CPU time used is 0.1 seconds; |
| Memory limit reached | Completed <exit>; TERM_MEMLIMIT | Thu Mar 13 19:31:13: Exited by signal 2. The CPU time used is 0.1 seconds; |
| Run limit reached | Completed <exit>; TERM_RUNLIMIT | Thu Mar 13 20:18:32: Exited by signal 2. The CPU time used is 0.1 seconds. |
| CPU limit reached | Completed <exit>; TERM_CPULIMIT | Thu Mar 13 18:47:13: Exited by signal 24. The CPU time used is 62.0 seconds; |
| Swap limit reached | Completed <exit>; TERM_SWAP | Thu Mar 13 18:47:13: Exited by signal 24. The CPU time used is 62.0 seconds; |
| Regular job exits when host crashes | Rusage 0, Completed <exit>; TERM_ZOMBIE | Thu Jun 12 15:49:02: Unknown; unable to reach the execution host; Thu Jun 12 16:10:32: Running; Thu Jun 12 16:10:38: Exited with exit code 143. The CPU time used is 0.0 seconds; |
| brequeue -r | For each requeue, Completed <exit>; TERM_REQUEUE_ADMIN or TERM_REQUEUE_OWNER | Thu Mar 13 17:46:39: Signal <REQUEUE_PEND> requested by user or administrator <user2>; Thu Mar 13 17:46:56: Exited by signal 2. The CPU time used is 0.1 seconds; |
| bchkpnt -k | On the first run: Completed <exit>; TERM_CHKPNT | Wed Apr 16 16:00:48: Checkpoint succeeded (actpid 931249); Wed Apr 16 16:01:03: Exited with exit code 137. The CPU time used is 0.0 seconds; |
| kill -9 <RES> and job | Completed <exit>; TERM_EXTERNAL_SIGNAL | Thu Mar 13 17:30:43: Exited by signal 15. The CPU time used is 0.1 seconds; |
| Others | Completed <exit>; | Thu Mar 13 17:30:43: Exited with 3; The CPU time used is 0.1 seconds; |
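
When you scan many jobs, it can help to pull the termination reason and exit status out of this output automatically. The following is a minimal Python sketch, not an LSF API. The helper names `term_reasons` and `classify_exit` are hypothetical, and the patterns are assumptions based only on the sample lines above, so they may not cover every message that bacct or bhist can print. The sketch also assumes the common convention that an exit code greater than 128 means the job was ended by signal (exit code minus 128).

```python
import re

# Assumed patterns, derived only from the sample output shown above.
TERM_RE = re.compile(r"\bTERM_[A-Z_]+\b")
SIGNAL_RE = re.compile(r"Exited by signal (\d+)")
CODE_RE = re.compile(r"Exited with (?:exit code )?(\d+)")


def term_reasons(text):
    """Return every TERM_* keyword found in a bacct -l line (hypothetical helper)."""
    return TERM_RE.findall(text)


def classify_exit(line):
    """Classify one bhist event line (hypothetical helper).

    Returns a dict describing the exit, or None if the line is not an exit event.
    """
    match = SIGNAL_RE.search(line)
    if match:
        return {"kind": "signal", "signal": int(match.group(1))}
    match = CODE_RE.search(line)
    if match:
        code = int(match.group(1))
        # Assumed convention: a code above 128 encodes 128 + signal number.
        implied = code - 128 if code > 128 else None
        return {"kind": "exit_code", "code": code, "signal": implied}
    return None
```

Under that convention, the exit code 143 in the host-crash example corresponds to signal 15 (conventionally SIGTERM), and the exit code 137 in the bchkpnt -k example corresponds to signal 9 (conventionally SIGKILL).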