Dynamic host-based resource information

Dynamic host-based resources are frequently referred to as load indices. There are 12 built-in load indices and 256 external load indices, which can be collected using an ELIM (see Administering Platform LSF for more information). The built-in load indices report load information about the CPU, memory, disk subsystem, interactive activity, and so on, for each host. The external load indices are optionally defined by your LSF administrator to collect additional host-based dynamic load information for your site.

ls_load()

ls_load() reports information about load indices:

struct hostLoad *ls_load(resreq, numhosts, options, fromhost)

On success, ls_load() returns an array containing a hostLoad structure for each host. On failure, it returns NULL and sets lserrno to indicate the error.

ls_load() has the following parameters:

char *resreq;    /* Resource requirements that each host must satisfy */
int  *numhosts;  /* Initially contains the number of hosts requested */
int  options;    /* Option flags that affect the selection of hosts */
char *fromhost;  /* Used in conjunction with the DFT_FROMTYPE option */

numhosts parameter

*numhosts determines how many hosts should be returned. If *numhosts is 0, information is requested on all hosts satisfying resreq. If numhosts is NULL, load information is requested on one host. If numhosts is not NULL, then on return *numhosts contains the number of hostLoad structures returned.

options parameter

The options parameter is constructed from the bitwise inclusive OR of zero or more of the option flags defined in <lsf/lsf.h>. The most commonly used flags are:

EXACT

Exactly *numhosts hosts are desired. If EXACT is set, either exactly *numhosts hosts are returned, or the call returns an error. If EXACT is not set, then up to *numhosts hosts are returned. If *numhosts is 0, then the EXACT flag is ignored and all eligible hosts in the load sharing system (that is, those that satisfy the resource requirement) are returned.

OK_ONLY

Return only those hosts that are currently in the ok state. If OK_ONLY is set, hosts that are busy, locked, unlicensed, or unavail are not returned. If OK_ONLY is not set, then some or all of the hosts whose status is not ok may also be returned, depending on the value of *numhosts and whether the EXACT flag is set.

NORMALIZE

Normalize CPU load indices. If NORMALIZE is set, then the CPU run queue length load indices r15s, r1m, and r15m of each returned host are normalized. See Administering Platform LSF for different types of run queue lengths. The default is to return the raw run queue length.

EFFECTIVE

If EFFECTIVE is set, then the CPU run queue length load indices of each host returned are the effective load. The default is to return the raw run queue length. The options EFFECTIVE and NORMALIZE are mutually exclusive.

IGNORE_RES

Ignore the status of RES when determining the hosts that are considered to be “ok”. If IGNORE_RES is specified, then hosts with RES not running are also considered to be “ok” during host selection.

DFT_FROMTYPE

This flag determines the default resource requirements.

If DFT_FROMTYPE is set, ls_load() returns hosts of the same type as fromhost that satisfy the resource requirements.
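The rules described above for numhosts, EXACT, and the mutually exclusive EFFECTIVE and NORMALIZE flags can be restated as a small model that runs without a cluster. This is only a sketch and is not LSLIB code: the DEMO_* constants are hypothetical stand-ins for the flags in <lsf/lsf.h> (their values are chosen for illustration), and demo_options_valid() and demo_expected_count() are made-up helper names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the option flags in <lsf/lsf.h>.
 * The numeric values are illustrative, not copied from the header. */
enum {
    DEMO_EXACT     = 1 << 0,
    DEMO_OK_ONLY   = 1 << 1,
    DEMO_NORMALIZE = 1 << 2,
    DEMO_EFFECTIVE = 1 << 3
};

/* Returns 1 if the flag combination is acceptable, 0 otherwise.
 * EFFECTIVE and NORMALIZE are documented as mutually exclusive. */
int demo_options_valid(int options)
{
    return !((options & DEMO_EFFECTIVE) && (options & DEMO_NORMALIZE));
}

/* Models how many hosts a call would return when `eligible` hosts
 * satisfy the resource requirements. Returns -1 where the text says
 * the call fails. How the real call reports zero eligible hosts is
 * not covered by this model. */
int demo_expected_count(const int *numhosts, int options, int eligible)
{
    int requested;

    assert(eligible >= 0);
    requested = (numhosts == NULL) ? 1 : *numhosts;  /* NULL: one host */
    assert(requested >= 0);

    if (requested == 0)
        return eligible;              /* all eligible hosts; EXACT ignored */
    if ((options & DEMO_EXACT) && eligible < requested)
        return -1;                    /* EXACT cannot be satisfied */
    return eligible < requested ? eligible : requested;  /* "up to" */
}
```

With three hosts requested and only two eligible, the model returns 2 without EXACT and reports failure with it, which mirrors the description of the EXACT flag.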

fromhost parameter

The fromhost parameter is used when DFT_FROMTYPE is set in options. If fromhost is NULL, the local host is assumed.

ls_load() returns an array of the following data structure as defined in <lsf/lsf.h>:

struct hostLoad {
    char  hostName[MAXHOSTNAMELEN];  /* Name of the host */
    int   status[2];                 /* The operational and load status of the host */
    float *li;                       /* Values for all load indices of this host */
};

The returned hostLoad array is ordered according to the order requirement in the resource requirements. For details about the ordering of hosts, see Administering Platform LSF.

Example

The following example sets no option flags and displays the host name, host status, and 1-minute CPU run queue length for each Sun Solaris host (host type SUNSOL) in the LSF cluster. Because neither NORMALIZE nor EFFECTIVE is set, the run queue length reported is the raw value.

/******************************************************
* LSLIB -- Examples
*
* simload
* Displays load information about all Solaris hosts in
* the cluster.
******************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <lsf/lsf.h>

int main(void)
{
    int i;
    struct hostLoad *hosts;
    char   *resreq = "type==SUNSOL";
    int    numhosts = 0;       /* 0 requests all hosts that satisfy resreq */
    int    options = 0;
    char   *fromhost = NULL;
    char   field[20] = "*";    /* leading asterisk marks a busy index */

    /* get load information on specified hosts */
    hosts = ls_load(resreq, &numhosts, options, fromhost);
    if (hosts == NULL) {
        ls_perror("ls_load");
        exit(-1);
    }

    /* print out the host name, host status and the 1-minute CPU run queue length */
    printf("%-15.15s %6.6s%6.6s\n", "HOST_NAME", "status", "r1m");
    for (i = 0; i < numhosts; i++) {
        printf("%-15.15s ", hosts[i].hostName);
        if (LS_ISUNAVAIL(hosts[i].status))
            printf("%6s", "unavail");
        else if (LS_ISBUSY(hosts[i].status))
            printf("%6.6s", "busy");
        else if (LS_ISLOCKED(hosts[i].status))
            printf("%6.6s", "locked");
        else
            printf("%6.6s", "ok");

        if (hosts[i].li[R1M] >= INFINIT_LOAD)
            printf("%6.6s\n", "-");    /* value is invalid or unknown */
        else {
            /* write the value after the asterisk stored in field[0] */
            snprintf(field + 1, sizeof(field) - 1, "%5.1f", hosts[i].li[R1M]);
            if (LS_ISBUSYON(hosts[i].status, R1M))
                printf("%6.6s\n", field);       /* with asterisk */
            else
                printf("%6.6s\n", field + 1);   /* without asterisk */
        }
    }
    exit(0);
}

The output of the above program is similar to the following:

% a.out 
HOST_NAME       status    r1m 
hostB             ok      0.0 
hostC             ok      1.2 
hostA            busy     0.6 
hostD            busy     *4.3 
hostF            unavail

If the host status is busy because of r1m, then an asterisk (*) is printed in front of the value of the r1m load index.

In the above example, the returned data structure hostLoad never needs to be freed by the program even if ls_load() is called repeatedly.

Each element of the li array is a floating point number between 0.0 and INFINIT_LOAD (defined in lsf.h). The index value is set to INFINIT_LOAD by LSF to indicate an invalid or unknown value for an index.
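The check against INFINIT_LOAD can be wrapped in a small formatting helper so that unknown values are never printed as huge numbers. The following is a minimal sketch: DEMO_INFINIT_LOAD is a hypothetical stand-in so that the fragment compiles without lsf.h (a real program would use INFINIT_LOAD from the header), and demo_format_load() is an illustrative name, not part of LSLIB.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for INFINIT_LOAD from <lsf/lsf.h>; the real
 * value should always be taken from the header. */
#define DEMO_INFINIT_LOAD 1.0e30f

/* Writes a printable form of one load index value into buf.
 * Values at or above the sentinel are shown as "-", as in the
 * example program. Returns buf for convenient use in printf(). */
char *demo_format_load(float value, char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0);
    if (value >= DEMO_INFINIT_LOAD)
        snprintf(buf, buflen, "%s", "-");      /* invalid or unknown */
    else
        snprintf(buf, buflen, "%.1f", value);  /* one decimal place */
    return buf;
}
```

A caller would pass each li element through such a helper before printing, so the sentinel test lives in one place.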

The li array can be indexed in different ways. The constants defined in lsf.h (see the ls_load(3) man page) can be used to index any of the built-in load indices, as shown in the above example. If external load indices are used, the load indices are returned in the same order as the resources returned by ls_info(). The variables numUsrIndx and numIndx in the lsInfo structure can be used to determine which resources are load indices.

Tip:

There are more flexible ways to map load index names to values.
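One such approach is to look up a load index by name in a table that lists the index names in the same order as the li array. The sketch below shows only the lookup step. It assumes the caller has already built the name table (for example, from the resources reported by ls_info()); demo_find_index() is a hypothetical helper name, not an LSLIB function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the position of `name` in `names`, or -1 if it is absent.
 * The position can be used to index the li array, provided `names`
 * lists the load indices in the order in which they are returned. */
int demo_find_index(const char *const names[], int count, const char *name)
{
    int i;

    assert(names != NULL && name != NULL);
    for (i = 0; i < count; i++) {
        if (strcmp(names[i], name) == 0)
            return i;
    }
    return -1;   /* no load index with that name */
}
```

Given a matching table, li[demo_find_index(names, count, "r1m")] would select the r1m value, once the caller has checked that the result is not -1.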

LSF defines a set of macros in lsf.h to test the status field. The most commonly used macros include:

Macro Name                    Macro Description
LS_ISUNAVAIL(status)          Returns 1 if the LIM on the host is unavailable.
LS_ISBUSYON(status, index)    Returns 1 if the host is busy on the given index.
LS_ISBUSY(status)             Returns 1 if the host is busy.
LS_ISLOCKEDU(status)          Returns 1 if the host is locked by user.
LS_ISLOCKEDW(status)          Returns 1 if the host is locked by a time window.
LS_ISLOCKED(status)           Returns 1 if the host is locked.
LS_ISRESDOWN(status)          Returns 1 if the RES is down.
LS_ISSBDDOWN(status)          Returns 1 if the SBATCHD is down.
LS_ISUNLICENSED(status)       Returns 1 if the host has no software license.
LS_ISOK(status)               Returns 1 if none of the above is true.
LS_ISOKNRES(status)           Returns 1 if the host is ok except that no RES or SBATCHD is running.