Optional LSF HPC configuration

After installing LSF HPC, you can define the following in lsf.conf:

  • LSF_LOGDIR=directory

    In large clusters, you should set LSF_LOGDIR to a local file system (for example, /var/log/lsf).

  • LSB_RLA_WORKDIR=directory

    Specifies the location of the status files for RLA. These files allow RLA to recover its original state when it restarts. When RLA first starts, it creates the directory defined by LSB_RLA_WORKDIR if it does not exist, then creates subdirectories for each host.

    You should avoid using /tmp or any other directory that is automatically cleaned up by the system. Unless your installation has restrictions on the LSB_SHAREDIR directory, you should use the default:

    LSB_SHAREDIR/cluster_name/rla_workdir

    On IRIX or TRIX, you should not use a CXFS file system for LSB_RLA_WORKDIR.

  • On Linux hosts running HP MPI, set LSF_VPLUGIN to the full path of the HP vendor MPI library libmpirm.so.

    For example, if HP MPI is installed in /opt/hpmpi:

    LSF_VPLUGIN="/opt/hpmpi/lib/linux_ia32/libmpirm.so"

  • LSB_RLA_UPDATE=time_seconds

    Specifies how often the LSF scheduler refreshes information from RLA.

    Default: 600 seconds

  • LSB_RLA_HOST_LIST="host_name ..."

    On Linux/QsNet hosts, the LSF scheduler can contact the RLA running on any host for RMS allocation requests. LSB_RLA_HOST_LIST defines a list of hosts to restrict which RLAs the LSF scheduler contacts.

    If LSB_RLA_HOST_LIST is configured, you must list at least one host per RMS partition for the RMS partition to be considered for job scheduling.

    Listed hosts must be defined in lsf.cluster.cluster_name.
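
    For example, assuming a cluster with two RMS partitions and the hypothetical host names hostA (in the first partition) and hostB (in the second), the following setting would restrict the LSF scheduler to the RLAs on those two hosts while keeping both partitions eligible for job scheduling:

    LSB_RLA_HOST_LIST="hostA hostB"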

  • LSB_RLA_UPDATE=seconds

    On Linux/QsNet hosts, specifies how often RLA should refresh its RMS information map.

    Default: 600 seconds

  • LSB_RMSACCT_DELAY=time_seconds

    If set on Linux/QsNet hosts, RES waits the specified number of seconds before exiting to allow LSF and RMS job statistics to synchronize.

    If LSB_RMSACCT_DELAY=0, RES waits indefinitely until the database is up to date.
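
    For example, to have RES wait 30 seconds before exiting (an illustrative value, not a recommendation):

    LSB_RMSACCT_DELAY=30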

  • LSB_RMS_MAXNUMNODES=integer

    Maximum number of nodes in a Linux/QsNet system. Specifies a maximum value for the nodes argument to the topology scheduler options specified in:

    • -extsched option of bsub

    • DEFAULT_EXTSCHED and MANDATORY_EXTSCHED in lsb.queues

    Default: 1024

  • LSB_RMS_MAXNUMRAILS=integer

    Maximum number of rails in a Linux/QsNet system. Specifies a maximum value for the rails argument to the topology scheduler options specified in:

    • -extsched option of bsub

    • DEFAULT_EXTSCHED and MANDATORY_EXTSCHED in lsb.queues

    Default: 32

  • LSB_RMS_MAXPTILE=integer

    Maximum number of CPUs per node in a Linux/QsNet system. Specifies a maximum value for the RMS[ptile] argument to the topology scheduler options specified in:

    • -extsched option of bsub

    • DEFAULT_EXTSCHED and MANDATORY_EXTSCHED in lsb.queues

    Default: 32
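
As a sketch of how the three limits above fit together, the following lsf.conf settings would cap topology requests at 256 nodes, 2 rails, and 4 CPUs per node. The values are illustrative assumptions for a small Linux/QsNet system, not recommendations:

    LSB_RMS_MAXNUMNODES=256
    LSB_RMS_MAXNUMRAILS=2
    LSB_RMS_MAXPTILE=4

Under these assumed limits, a submission such as the following stays within the nodes maximum, while a request for more than 256 nodes would exceed it. The job name myjob is hypothetical, and the exact -extsched syntax should be checked against the bsub reference for your version:

    bsub -n 8 -extsched "RMS[nodes=4]" myjob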