Platform Computing Corp.

Understanding Resources

About LSF Resources

The LSF system uses built-in and configured resources to track job resource requirements and schedule jobs according to the resources available on individual hosts.

View available resources

View cluster resources (lsinfo)
  1. Use lsinfo to list the resources available in your cluster.
  2. The lsinfo command lists all the resource names and their descriptions.

    lsinfo 
    RESOURCE_NAME   TYPE     ORDER  DESCRIPTION 
    r15s            Numeric  Inc    15-second CPU run queue length 
    r1m             Numeric  Inc    1-minute CPU run queue length (alias:cpu) 
    r15m            Numeric  Inc    15-minute CPU run queue length 
    ut              Numeric  Inc    1-minute CPU utilization (0.0 to 1.0) 
    pg              Numeric  Inc    Paging rate (pages/second) 
    io              Numeric  Inc    Disk IO rate (Kbytes/second) 
    ls              Numeric  Inc    Number of login sessions (alias: login) 
    it              Numeric  Dec    Idle time (minutes) (alias: idle) 
    tmp             Numeric  Dec    Disk space in /tmp (Mbytes) 
    swp             Numeric  Dec    Available swap space (Mbytes) (alias:swap) 
    mem             Numeric  Dec    Available memory (Mbytes) 
    ncpus           Numeric  Dec    Number of CPUs 
    nprocs          Numeric  Dec    Number of physical processors 
    ncores          Numeric  Dec    Number of cores per physical processor 
    nthreads        Numeric  Dec    Number of threads per processor core 
    ndisks          Numeric  Dec    Number of local disks 
    maxmem          Numeric  Dec    Maximum memory (Mbytes) 
    maxswp          Numeric  Dec    Maximum swap space (Mbytes) 
    maxtmp          Numeric  Dec    Maximum /tmp space (Mbytes) 
    cpuf            Numeric  Dec    CPU factor 
    ... 
    
View host resources (lshosts)
  1. Run lshosts to get a list of the resources defined on a specific host:

    lshosts hostA
    HOST_NAME      type    model  cpuf ncpus maxmem maxswp server RESOURCES
    hostA        SOL732   Ultra2  20.2     2   256M   679M    Yes ()
    

View host load by resource

  1. Run lshosts -s to view host load by shared resource:

    lshosts -s
    RESOURCE     VALUE    LOCATION
    tot_lic          5    host1 host2
    tot_scratch    500    host1 host2 
     

    The above output indicates that 5 licenses are available, and that the shared scratch directory currently contains 500 MB of space.

    The VALUE field indicates the amount of that resource. The LOCATION column shows the hosts which share this resource. The lshosts -s command displays static shared resources. The lsload -s command displays dynamic shared resources.
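
The column layout described above is easy to process in a script. The following is a minimal sketch, assuming the output has already been captured as text (the sample is the one shown above); the function name parse_shared_resources is hypothetical and not part of LSF. Values are kept as strings because a shared resource can be numeric, string, or Boolean.

```python
# Sample text copied from the lshosts -s output shown above.
SAMPLE = """\
RESOURCE     VALUE    LOCATION
tot_lic          5    host1 host2
tot_scratch    500    host1 host2
"""


def parse_shared_resources(text):
    """Return {resource: (value, [hosts])} from lshosts -s style output."""
    resources = {}
    lines = [line for line in text.splitlines() if line.strip()]
    for line in lines[1:]:  # skip the header row
        # First field is the resource name, second the value,
        # and everything after that is the LOCATION host list.
        name, value, *hosts = line.split()
        resources[name] = (value, hosts)
    return resources


shared = parse_shared_resources(SAMPLE)
for name, (value, hosts) in shared.items():
    print(f"{name}: {value} shared by {', '.join(hosts)}")
```

The same approach should work for any whitespace-separated listing in which the last column is a host list.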

How Resources are Classified

Resource categories

By values

Boolean resources
Resources that denote the availability of specific features
Numerical resources
Resources that take numerical values, such as all the load indices, number of processors on a host, or host CPU factor
String resources
Resources that take string values, such as host type, host model, host status

By the way values change

Dynamic Resources
Resources that change their values dynamically: host status and all the load indices.
Static Resources
Resources that do not change their values: all resources except for load indices or host status.

By definitions

External Resources
Custom resources defined by user sites: external load indices and resources defined in the lsf.shared file (shared resources).
Built-In Resources
Resources that are always defined in LSF, such as load indices, number of CPUs, or total swap space.

By scope

Host-Based Resources
Resources that are not shared among hosts, but are tied to individual hosts, such as swap space, CPU, or memory. An application must run on a particular host to access the resources. Using up memory on one host does not affect the available memory on another host.
Shared Resources
Resources that are not associated with individual hosts in the same way, but are owned by the entire cluster, or a subset of hosts within the cluster, such as floating licenses or shared file systems. An application can access such a resource from any host which is configured to share it, but doing so affects its value as seen by other hosts.

Boolean resources

Boolean resources (for example, server to denote LSF server hosts) have a value of one (1) if they are defined for a host, and zero (0) if they are not defined for the host. Use Boolean resources to configure host attributes to be used in selecting hosts to run jobs. For example:

Specify a Boolean resource in a resource requirement selection string of a job to select only hosts that can run the job.

Some examples of Boolean resources:

Resource Name   Describes            Meaning of Example Name
cs              Role in cluster      Compute server
fs              Role in cluster      File server
solaris         Operating system     Solaris operating system
frame           Available software   FrameMaker license
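
To illustrate the one/zero semantics, here is a small sketch of host selection by Boolean resource. The host names and their resource assignments are hypothetical, and the functions only model the idea; they are not LSF's scheduler or its resource requirement syntax.

```python
# Hypothetical hosts and the Boolean resources defined for each one.
HOST_RESOURCES = {
    "hostA": {"cs", "solaris"},
    "hostB": {"fs", "solaris"},
    "hostC": {"cs", "frame"},
}


def boolean_value(host, resource):
    """Return 1 if the resource is defined for the host, otherwise 0."""
    return 1 if resource in HOST_RESOURCES.get(host, set()) else 0


def select_hosts(required):
    """Return the hosts for which every required Boolean resource is 1."""
    return sorted(
        host
        for host in HOST_RESOURCES
        if all(boolean_value(host, res) for res in required)
    )


print(select_hosts({"cs"}))           # compute servers
print(select_hosts({"cs", "frame"}))  # compute servers with a FrameMaker license
```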

Shared resources

Shared resources are configured resources that are not tied to a specific host, but are associated with the entire cluster, or a specific subset of hosts within the cluster. For example:

LSF does not contain any built-in shared resources. All shared resources must be configured by the LSF administrator. A shared resource may be configured to be dynamic or static. In the above example, the total space on the shared disk may be static while the amount of space currently free is dynamic. A site may also configure the shared resource to report numeric, string or Boolean values.

An application may use a shared resource by running on any host from which that resource is accessible. For example, in a cluster in which each host has a local disk but can also access a disk on a file server, the disk on the file server is a shared resource, and the local disk is a host-based resource. In contrast to host-based resources such as memory or swap space, using a shared resource from one machine affects the availability of that resource as seen by other machines. There will be one value for the entire cluster which measures the utilization of the shared resource, but each host-based resource is measured separately.
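
The difference in scope can be modeled in a few lines. This is an illustrative sketch only: the Cluster class, the host names, and the numbers are hypothetical. It keeps one memory value per host and a single scratch value for all hosts that share the disk.

```python
class Cluster:
    """Toy model: memory is host-based, scratch space is shared."""

    def __init__(self, hosts, host_mem_mb, shared_scratch_mb):
        # One independent value per host for the host-based resource.
        self.host_mem = {host: host_mem_mb for host in hosts}
        # One value for the whole set of hosts for the shared resource.
        self.shared_scratch = shared_scratch_mb

    def use_memory(self, host, mb):
        self.host_mem[host] -= mb

    def use_scratch(self, host, mb):
        if host not in self.host_mem:
            raise KeyError(f"{host} does not share this resource")
        self.shared_scratch -= mb

    def view(self, host):
        """Resource values as seen from the given host."""
        return {"mem": self.host_mem[host], "scratch": self.shared_scratch}


cluster = Cluster(["hostA", "hostB"], host_mem_mb=256, shared_scratch_mb=500)
cluster.use_memory("hostA", 100)   # affects hostA only
cluster.use_scratch("hostA", 400)  # affects every sharing host
print(cluster.view("hostA"))
print(cluster.view("hostB"))
```

In this model hostB still sees its full memory after hostA's allocation, but both hosts see the reduced scratch value.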

The following restrictions apply to the use of shared resources in LSF products.

View shared resources for hosts

  1. Run bhosts -s to view shared resources for hosts. For example:

    bhosts -s
    RESOURCE       TOTAL    RESERVED     LOCATION
    tot_lic            5         0.0     hostA hostB
    tot_scratch      500         0.0     hostA hostB
    avail_lic          2         3.0     hostA hostB
    avail_scratch    100       400.0     hostA hostB 
     

    The TOTAL column displays the value of the resource. For dynamic resources, the RESERVED column displays the amount that has been reserved by running jobs.

How LSF Uses Resources

LSF monitors the resources that jobs use while they are running. This information is used to enforce resource usage limits and load thresholds, and for fairshare scheduling.

LSF collects information such as:

On UNIX, job-level resource usage is collected through a special process called PIM (Process Information Manager). PIM is managed internally by LSF.

Viewing job resource usage

The -l option of the bjobs command displays the current resource usage of the job. The usage information is sampled by PIM every 30 seconds and collected by sbatchd at a maximum frequency of every SBD_SLEEP_TIME (configured in the lsb.params file) and sent to mbatchd. The update is done only if the value for the CPU time, resident memory usage, or virtual memory usage has changed by more than 10 percent from the previous update, or if a new process or process group has been created.
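
The update condition can be written as a small predicate. This is a sketch of the rule as described above, not LSF source code: the field names are hypothetical, and treating any change from a previous value of zero as significant is an assumption made here to avoid dividing by zero.

```python
def changed_by_more_than(previous, current, fraction=0.10):
    """True if current differs from previous by more than the given fraction."""
    if previous == 0:
        # Assumption: any change from zero counts as significant.
        return current != 0
    return abs(current - previous) / previous > fraction


def should_send_update(previous, current, new_process=False):
    """Apply the 10 percent rule to CPU time, resident memory, and virtual memory."""
    fields = ("cpu_time", "resident_mem", "virtual_mem")  # hypothetical field names
    return new_process or any(
        changed_by_more_than(previous[f], current[f]) for f in fields
    )


last_sent = {"cpu_time": 100, "resident_mem": 200, "virtual_mem": 400}
small_change = {"cpu_time": 105, "resident_mem": 200, "virtual_mem": 400}
large_change = {"cpu_time": 111, "resident_mem": 200, "virtual_mem": 400}

print(should_send_update(last_sent, small_change))                    # 5% change
print(should_send_update(last_sent, large_change))                    # 11% change
print(should_send_update(last_sent, small_change, new_process=True))  # new process
```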

tip:  
The parameter LSF_PIM_LINUX_ENHANCE in lsf.conf enables the enhanced PIM, returning the exact memory usage of processes using shared memory on Linux operating systems. By default memory shared between jobs may be counted more than once.

View load on a host

  1. Run bhosts -l to check the load levels on the host, and adjust the suspending conditions of the host or queue if necessary.
  2. The bhosts -l command gives the most recent load values used for the scheduling of jobs. A dash (-) in the output indicates that the particular threshold is not defined.

    bhosts -l hostB
    HOST:  hostB
    STATUS        CPUF   JL/U  MAX NJOBS RUN SSUSP USUSP RSV
    ok            20.00  2     2   0     0   0     0     0    
    
    CURRENT LOAD USED FOR SCHEDULING:
             r15s   r1m  r15m  ut    pg    io   ls    it   tmp   swp   mem
    Total    0.3    0.8  0.9   61%   3.8   72   26    0    6M    253M  297M
    Reserved 0.0    0.0  0.0   0%    0.0   0    0     0    0M    0M    0M
    
    LOAD THRESHOLD USED FOR SCHEDULING:
               r15s   r1m  r15m  ut  pg  io  ls  it  tmp  swp  mem
    loadSched  -      -    -     -   -   -   -   -   -    -    -
    loadStop   -      -    -     -   -   -   -   -   -    -    -
    
               cpuspeed    bandwidth
    loadSched  -           -
    loadStop   -           -
    

Load Indices

Load indices are built-in resources that measure the availability of static or dynamic, non-shared resources on hosts in the LSF cluster.

Load indices built into the LIM are updated at fixed time intervals.

External load indices are defined and configured by the LSF administrator, who writes an external load information manager (elim) executable. The elim collects the values of the external load indices and sends these values to the LIM.

Load indices collected by LIM

Index   Measures                                             Units                            Direction   Averaged over  Update Interval
status  host status                                          string                                                      15 seconds
r15s    run queue length                                     processes                        increasing  15 seconds     15 seconds
r1m     run queue length                                     processes                        increasing  1 minute       15 seconds
r15m    run queue length                                     processes                        increasing  15 minutes     15 seconds
ut      CPU utilization                                      percent                          increasing  1 minute       15 seconds
pg      paging activity                                      pages in + pages out per second  increasing  1 minute       15 seconds
ls      logins                                               users                            increasing  N/A            30 seconds
it      idle time                                            minutes                          decreasing  N/A            30 seconds
swp     available swap space                                 MB                               decreasing  N/A            15 seconds
mem     available memory                                     MB                               decreasing  N/A            15 seconds
tmp     available space in temporary file system             MB                               decreasing  N/A            120 seconds
io      disk I/O (shown by lsload -l)                        KB per second                    increasing  1 minute       15 seconds
name    external load index configured by LSF administrator                                                              site-defined

Status

The status index is a string indicating the current status of the host. This status applies to the LIM and RES.

The possible values for status are:

Status
Description
ok
The host is available to accept remote jobs. The LIM can select the host for remote execution.
-ok
When the status of a host is preceded by a dash (-), it means LIM is available but RES is not running on that host or is not responding.
busy
The host is overloaded (busy) because a load index exceeded a configured threshold. An asterisk (*) marks the offending index. LIM will not select the host for interactive jobs.
lockW
The host is locked by its run window. Use lshosts to display run windows.
lockU
The host is locked by an LSF administrator or root.
unavail
The host is down or the LIM on the host is not running or is not responding.
unlicensed
The host does not have a valid license.

note:  
The term available is frequently used in command output titles and headings. Available means a host is in any state except unavail. This means an available host could be unlicensed, locked, busy, or ok.
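
The definition of available in the note above reduces to a one-line check. The sketch below uses hypothetical helper names and treats the status as the plain string shown in command output.

```python
def is_available(status):
    """Any status other than unavail counts as available."""
    return status != "unavail"


def res_flagged_down(status):
    """A leading dash means LIM is up but RES is not running or not responding."""
    return status.startswith("-")


for status in ("ok", "-ok", "busy", "lockW", "lockU", "unlicensed", "unavail"):
    print(f"{status:<11} available={is_available(status)} "
          f"res_down={res_flagged_down(status)}")
```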

CPU run queue lengths (r15s, r1m, r15m)

The r15s, r1m and r15m load indices are the 15-second, 1-minute and 15-minute average CPU run queue lengths. This is the average number of processes ready to use the CPU during the given interval.

On UNIX, run queue length indices are not necessarily the same as the load averages printed by the uptime(1) command; uptime load averages on some platforms also include processes that are in short-term wait states (such as paging or disk I/O).

Effective run queue length

On multiprocessor systems, more than one process can execute at a time. LSF scales the run queue value on multiprocessor systems to make the CPU load of uniprocessors and multiprocessors comparable. The scaled value is called the effective run queue length.

Use lsload -E to view the effective run queue length.

Normalized run queue length

LSF also adjusts the CPU run queue based on the relative speeds of the processors (the CPU factor). The normalized run queue length is adjusted for both number of processors and CPU speed. The host with the lowest normalized run queue length will run a CPU-intensive job the fastest.

Use lsload -N to view the normalized CPU run queue lengths.
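
Given normalized values, picking the host expected to run a CPU-intensive job fastest is a minimum search. The sketch below assumes the values have already been read (for example from lsload -N output); the host names and numbers are hypothetical, and the code does not reproduce how LSF computes the normalization.

```python
# Hypothetical normalized r1m values, one per host.
normalized_r1m = {"hostA": 0.8, "hostB": 0.2, "hostC": 1.5}


def fastest_host_for_cpu_job(normalized):
    """Return the host with the lowest normalized run queue length."""
    return min(normalized, key=normalized.get)


print(fastest_host_for_cpu_job(normalized_r1m))
```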

CPU utilization (ut)

The ut index measures CPU utilization, which is the percentage of time spent running system and user code. A host with no process running has a ut value of 0 percent; a host on which the CPU is completely loaded has a ut of 100 percent.

Paging rate (pg)

The pg index gives the virtual memory paging rate in pages per second. This index is closely tied to the amount of available RAM and the total size of the processes running on a host; if there is not enough RAM to satisfy all processes, the paging rate will be high. Paging rate is a good measure of how a machine will respond to interactive use; a machine that is paging heavily feels very slow.

Login sessions (ls)

The ls index gives the number of users logged in. Each user is counted once, no matter how many times they have logged into the host.

Interactive idle time (it)

On UNIX, the it index is the interactive idle time of the host, in minutes. Idle time is measured from the last input or output on a directly attached terminal or a network pseudo-terminal supporting a login session. This does not include activity directly through the X server such as CAD applications or emacs windows, except on Solaris and HP-UX systems.

On Windows, the it index is based on the time a screen saver has been active on a particular host.

Temporary directories (tmp)

The tmp index is the space available in MB on the file system that contains the temporary directory:

Swap space (swp)

The swp index gives the currently available virtual memory (swap space) in MB. This represents the largest process that can be started on the host.

Memory (mem)

The mem index is an estimate of the real memory currently available to user processes. This represents the approximate size of the largest process that could be started on a host without causing the host to start paging.

LIM reports the amount of free memory available. LSF calculates free memory as a sum of physical free memory, cached memory, buffered memory and an adjustment value. The command vmstat also reports free memory but displays these values separately. There may be a difference between the free memory reported by LIM and the free memory reported by vmstat because of virtual memory behavior variations among operating systems. You can write an ELIM that overrides the free memory values returned by LIM.
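
As a worked example of the sum described above, the sketch below adds the components that vmstat reports separately. The function name and the sample figures are hypothetical, and the adjustment value is treated as an opaque input because its derivation is not described here.

```python
def lim_free_memory_mb(physical_free, cached, buffered, adjustment=0):
    """Free memory as the sum of free, cached, and buffered memory plus an adjustment."""
    return physical_free + cached + buffered + adjustment


# Hypothetical figures in MB: a tool that reports only physically free
# memory would show 120, while the summed value is much larger.
print(lim_free_memory_mb(physical_free=120, cached=300, buffered=40))
```

This is one reason the value reported by LIM can be considerably higher than the free column of vmstat on the same host.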

I/O rate (io)

The io index measures I/O throughput to disks attached directly to this host, in KB per second. It does not include I/O to disks that are mounted from other hosts.

Viewing information about load indices

lsinfo -l

The lsinfo -l command displays all information available about load indices in the system. You can also specify load indices on the command line to display information about selected indices:

    lsinfo -l swp 
    RESOURCE_NAME:  swp 
    DESCRIPTION: Available swap space (Mbytes) (alias: swap) 
    TYPE      ORDER   INTERVAL  BUILTIN  DYNAMIC  RELEASE 
    Numeric     Dec         60      Yes      Yes       NO 

lsload -l

The lsload -l command displays the values of all load indices. External load indices are configured by your LSF administrator:

    lsload 
    HOST_NAME  status  r15s  r1m  r15m  ut   pg   ls  it  tmp  swp   mem 
    hostN      ok      0.0   0.0  0.1   1%   0.0  1   224 43M  67M   3M 
    hostK      -ok     0.0   0.0  0.0   3%   0.0  3   0   38M  40M   7M 
    hostF      busy    0.1   0.1  0.3   7%   *17  6   0   9M   23M   28M 
    hostG      busy    *6.2  6.9  9.5   85%  1.1  30  0   5M   400M  385M 
    hostV      unavail 
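
Because an asterisk marks the index that made a host busy, the offending indices can be extracted from captured output. The following is a minimal sketch using the sample above as input; the function name overloaded_indices is hypothetical.

```python
# Sample text copied from the lsload output shown above.
SAMPLE = """\
HOST_NAME  status  r15s  r1m  r15m  ut   pg   ls  it  tmp  swp   mem
hostN      ok      0.0   0.0  0.1   1%   0.0  1   224 43M  67M   3M
hostK      -ok     0.0   0.0  0.0   3%   0.0  3   0   38M  40M   7M
hostF      busy    0.1   0.1  0.3   7%   *17  6   0   9M   23M   28M
hostG      busy    *6.2  6.9  9.5   85%  1.1  30  0   5M   400M  385M
hostV      unavail
"""


def overloaded_indices(text):
    """Map each host to the load indices flagged with an asterisk."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    header, data = rows[0], rows[1:]
    index_names = header[2:]  # columns after HOST_NAME and status
    flagged = {}
    for row in data:
        host, values = row[0], row[2:]  # unavail hosts have no values
        marked = [
            name
            for name, value in zip(index_names, values)
            if value.startswith("*")
        ]
        if marked:
            flagged[host] = marked
    return flagged


print(overloaded_indices(SAMPLE))
```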

Static Resources

Static resources are built-in resources that represent host information that does not change over time, such as the maximum RAM available to user processes or the number of processors in a machine. Most static resources are determined by the LIM at start-up time, or when LSF detects hardware configuration changes.

Static resources can be used to select appropriate hosts for particular jobs based on binary architecture, relative CPU speed, and system configuration.

The resources ncpus, nprocs, ncores, nthreads, maxmem, maxswp, and maxtmp are not static on UNIX hosts that support dynamic hardware reconfiguration.

Static resources reported by LIM

Index     Measures                                Units             Determined by
type      host type                               string            configuration
model     host model                              string            configuration
hname     host name                               string            configuration
cpuf      CPU factor                              relative          configuration
server    host can run remote jobs                Boolean           configuration
rexpri    execution priority                      nice(2) argument  configuration
ncpus     number of processors                    processors        LIM
ndisks    number of local disks                   disks             LIM
nprocs    number of physical processors           processors        LIM
ncores    number of cores per physical processor  cores             LIM
nthreads  number of threads per processor core    threads           LIM
maxmem    maximum RAM                             MB                LIM
maxswp    maximum swap space                      MB                LIM
maxtmp    maximum space in /tmp                   MB                LIM

Host type (type)

Host type is a combination of operating system and CPU architecture. All computers that run the same operating system on the same computer architecture are of the same type. You can add custom host types in the HostType section of lsf.shared. This alphanumeric value can be up to 39 characters long.

An example of host type is LINUX86.

Host model (model)

Host model is the combination of host type and CPU speed (CPU factor) of your machine. All hosts of the same relative type and speed are assigned the same host model. You can add custom host models in the HostModel section of lsf.shared. This alphanumeric value can be up to 39 characters long.

An example of host model is Intel_IA64.

Host name (hname)

Host name specifies the name with which the host identifies itself.

CPU factor (cpuf)

The CPU factor (frequently shortened to cpuf) represents the speed of the host CPU relative to other hosts in the cluster. For example, if one processor is twice the speed of another, its CPU factor should be twice as large. For multiprocessor hosts, the CPU factor is the speed of a single processor; LSF automatically scales the host CPU load to account for additional processors. The CPU factors are detected automatically or defined by the administrator.

Server

The server static resource is Boolean. It has the following values:

Number of CPUs (ncpus)

By default, the number of CPUs represents the number of physical processors a machine has. As most CPUs consist of multiple cores, threads, and processors, ncpus can be defined by the cluster administrator (either globally or per-host) to consider one of the following:

Globally, this definition is controlled by the parameter EGO_DEFINE_NCPUS in lsf.conf or ego.conf. The default behavior for ncpus is to consider only the number of physical processors (EGO_DEFINE_NCPUS=procs).
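
The effect of the three settings (procs, cores, threads) can be sketched as follows. The multiplications are an assumption based on the definitions of ncores (cores per physical processor) and nthreads (threads per core); the function is hypothetical and ignores per-host overrides and the AIX behavior described in the note below.

```python
def compute_ncpus(nprocs, ncores, nthreads, define_ncpus="procs"):
    """Model of ncpus for a given EGO_DEFINE_NCPUS value (assumed semantics)."""
    if define_ncpus == "procs":
        return nprocs
    if define_ncpus == "cores":
        return nprocs * ncores
    if define_ncpus == "threads":
        return nprocs * ncores * nthreads
    raise ValueError(f"unknown EGO_DEFINE_NCPUS value: {define_ncpus}")


# Hypothetical host: 2 physical processors, 4 cores each, 2 threads per core.
for setting in ("procs", "cores", "threads"):
    print(setting, compute_ncpus(2, 4, 2, setting))
```

For this hypothetical host the sketch yields 2, 8, and 16 respectively.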

note:  
  1. On a machine running AIX, ncpus detection is different. Under AIX, the number of detected physical processors is always 1, whereas the number of detected cores is the number of cores across all physical processors. Thread detection is the same as other operating systems (the number of threads per core).
  2. When PARALLEL_SCHED_BY_SLOT=Y in lsb.params, the resource requirement string keyword ncpus refers to the number of slots instead of the number of processors; however, lshosts output continues to show ncpus as defined by EGO_DEFINE_NCPUS in lsf.conf.

Number of disks (ndisks)

The number of disks specifies the number of local disks a machine has, determined by the LIM.

Maximum memory (maxmem)

Maximum memory is the total available memory of a machine, measured in megabytes (MB).

Maximum swap (maxswp)

Maximum swap is the total available swap space a machine has, measured in megabytes (MB).

Maximum temporary space (maxtmp)

Maximum temporary space is the total temporary space a machine has, measured in megabytes (MB).

How LIM detects cores, threads and processors

Traditionally, the value of ncpus has been equal to the number of physical CPUs. However, many CPUs consist of multiple cores and threads, so the traditional 1:1 mapping is no longer useful. A more useful approach is to set ncpus to equal one of the following:

A cluster administrator globally defines how ncpus is computed using the EGO_DEFINE_NCPUS parameter in lsf.conf or ego.conf (instead of LSF_ENABLE_DUALCORE in lsf.conf, or EGO_ENABLE_DUALCORE in ego.conf).

See Define ncpus-processors, cores, or threads for details.

LIM detects and stores the number of processors, cores, and threads for all supported architectures. The following diagram illustrates the flow of information between daemons, CPUs, and other components.

Although the ncpus computation is applied globally, it can be overridden on a per-host basis. See Override the global configuration of ncpus computation for details.

To correctly detect processors, cores, and threads, LIM assumes that all physical processors on a single machine are of the same type.

In cases where CPU architectures and operating system combinations may not support accurate processor, core, thread detection, LIM uses the defaults of 1 processor, 1 core per physical processor, and 1 thread per core. If LIM detects that it is running in a virtual environment (for example, VMware®), each detected processor is similarly reported (as a single-core, single-threaded, physical processor).

LIM only detects hardware that is recognized by the operating system. LIM detection uses processor- or OS-specific techniques (for example, the Intel CPUID instruction, or Solaris kstat()/core_id). If the operating system does not recognize a CPU or core (for example, if an older OS does not recognize a quad-core processor and instead detects it as dual-core), then LIM will not recognize it either.

note:  
RQL normalization never considers threads. Consider a hyper-thread enabled Pentium: Threads are not full-fledged CPUs, so considering them as CPUs would artificially lower the system load.

ncpus detection on AIX

On a machine running AIX, detection of ncpus is different. Under AIX, the number of detected physical processors is always 1, whereas the number of detected cores is always the number of cores across all physical processors. Thread detection is the same as other operating systems (the number of threads per core).

Define ncpus-processors, cores, or threads

A cluster administrator must define how ncpus is computed. Usually, the number of available job slots is equal to the value of ncpus; however, slots can be redefined at the EGO resource group level. The ncpus definition is globally applied across the cluster.

  1. Open lsf.conf or ego.conf.
  2. Define the parameter EGO_DEFINE_NCPUS=[procs | cores | threads].
  3. Set it to one of the following:

  4. Save and close lsf.conf or ego.conf.
tip:  
As a best practice, set EGO_DEFINE_NCPUS instead of EGO_ENABLE_DUALCORE. The functionality of EGO_ENABLE_DUALCORE=y is preserved by setting EGO_DEFINE_NCPUS=cores.

Interaction with LSF_LOCAL_RESOURCES in lsf.conf

If EGO is enabled, and EGO_LOCAL_RESOURCES is set in ego.conf and LSF_LOCAL_RESOURCES is set in lsf.conf, EGO_LOCAL_RESOURCES takes precedence.

Override the global configuration of ncpus computation

The cluster administrator globally defines how the ncpus resource is computed. The ncpus global definition can be overridden on specified dynamic and static hosts in the cluster.

Defining computation of ncpus on dynamic hosts
  1. Open lsf.conf or ego.conf.
  2. Define the parameter EGO_LOCAL_RESOURCES="[resource resource_name]".
  3. Set resource_name to one of the following:

  4. Save and close lsf.conf or ego.conf.
note:  
In multi-cluster environments, if ncpus is defined on a per-host basis (thereby overriding the global setting) the definition is applied to all clusters that the host is a part of. In contrast, globally defined ncpus settings only take effect within the cluster for which EGO_DEFINE_NCPUS is defined.

Defining computation of ncpus on static hosts
  1. Open lsf.cluster.cluster_name.
  2. Find the host for which you want to define ncpus computation. In the RESOURCES column, add one of the following definitions:
  3. Save and close lsf.cluster.cluster_name.
  4. Restart the master host.
note:  
In multi-cluster environments, if ncpus is defined on a per-host basis (thereby overriding the global setting) the definition is applied to all clusters that the host is a part of. In contrast, globally defined ncpus settings only take effect within the cluster for which EGO_DEFINE_NCPUS is defined.

Interaction with LSF_LOCAL_RESOURCES in lsf.conf

If EGO is enabled, and EGO_LOCAL_RESOURCES is set in ego.conf and LSF_LOCAL_RESOURCES is set in lsf.conf, EGO_LOCAL_RESOURCES takes precedence.

Automatic Detection of Hardware Reconfiguration

Some UNIX operating systems support dynamic hardware reconfiguration, that is, the attaching or detaching of system boards in a live system without having to reboot the host.

Supported platforms

LSF is able to recognize changes in ncpus, maxmem, maxswp, and maxtmp on the following platforms:

Dynamic changes in ncpus

LSF is able to automatically detect a change in the number of processors in systems that support dynamic hardware reconfiguration.

The local LIM checks if there is a change in the number of processors at an internal interval of 2 minutes. If it detects a change in the number of processors, the local LIM also checks maxmem, maxswp, maxtmp. The local LIM then sends this new information to the master LIM.

Dynamic changes in maxmem, maxswp, maxtmp

If you dynamically change maxmem, maxswp, or maxtmp without changing the number of processors, you need to restart the local LIM with the command lsadmin limrestart so that it can recognize the changes.

If you dynamically change the number of processors and any of maxmem, maxswp, or maxtmp, the change will be automatically recognized by LSF. When it detects a change in the number of processors, the local LIM also checks maxmem, maxswp, maxtmp.
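
The two cases above amount to a simple rule, sketched here with a hypothetical helper: a manual restart of the local LIM is needed only when maxmem, maxswp, or maxtmp changed and the number of processors did not.

```python
def lim_restart_needed(ncpus_changed, mem_swp_tmp_changed):
    """True if lsadmin limrestart is required for the change to be recognized.

    mem_swp_tmp_changed: any of maxmem, maxswp, or maxtmp was changed.
    """
    # A processor change triggers automatic re-detection, which also
    # picks up maxmem, maxswp, and maxtmp.
    return mem_swp_tmp_changed and not ncpus_changed


print(lim_restart_needed(ncpus_changed=False, mem_swp_tmp_changed=True))
print(lim_restart_needed(ncpus_changed=True, mem_swp_tmp_changed=True))
```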

Viewing dynamic hardware changes

lsxxx Commands

There may be a delay of up to 2 minutes before the changes are recognized by lsxxx commands (for example, before lshosts displays the changes).

bxxx Commands

There may be a delay of up to 12 minutes (2 + 10) before the changes are recognized by bxxx commands (for example, before bhosts -l displays the changes).

This is because mbatchd contacts the master LIM at an internal interval of 10 minutes.

Platform MultiCluster

Configuration changes from a local cluster are communicated from the master LIM to the remote cluster at an interval of 2 * CACHE_INTERVAL. The parameter CACHE_INTERVAL is configured in lsf.cluster.cluster_name and is by default 60 seconds.

This means that for changes to be recognized in a remote cluster there is a maximum delay of 2 minutes + 2*CACHE_INTERVAL.
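
The worst-case delays described in this section can be added up as follows. This is a sketch of the arithmetic only: the function and the family labels "ls", "b", and "multicluster" are hypothetical names used for illustration.

```python
LIM_CHECK_MINUTES = 2      # local LIM checks for processor changes
MBATCHD_POLL_MINUTES = 10  # mbatchd contacts the master LIM


def max_delay_minutes(family, cache_interval_seconds=60):
    """Worst-case delay before a hardware change becomes visible."""
    if family == "ls":            # lsxxx commands
        return LIM_CHECK_MINUTES
    if family == "b":             # bxxx commands
        return LIM_CHECK_MINUTES + MBATCHD_POLL_MINUTES
    if family == "multicluster":  # visibility in a remote cluster
        return LIM_CHECK_MINUTES + 2 * cache_interval_seconds / 60
    raise ValueError(f"unknown family: {family}")


for family in ("ls", "b", "multicluster"):
    print(family, max_delay_minutes(family))
```

With the default CACHE_INTERVAL of 60 seconds, the remote-cluster case works out to 4 minutes.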

How dynamic hardware changes affect LSF

LSF uses ncpus, maxmem, maxswp, maxtmp to make scheduling and load decisions.

When processors are added or removed, LSF licensing is affected because LSF licenses are based on the number of processors.

If you put a processor offline:

If you put a new processor online:

Per-processor job slot limit (PJOB_LIMIT in lsb.queues) may be reached later.

Set the external static LIM

Use the external static LIM to automatically detect the operating system type and version of hosts.

  1. In lsf.shared, remove the comment from the indices you want detected.
  2. In $LSF_SERVERDIR, rename tmp.eslim.<extension> to eslim.<extension>.
  3. Set EGO_ESLIM_TIMEOUT in lsf.conf or ego.conf.
  4. Restart the LIM on all hosts.
