Platform Computing Corp.

Upgrading Platform LSF HPC

Upgrade Platform LSF HPC

You can install Platform LSF HPC on UNIX or Linux hosts. Refer to the installation guide for more information.

When you run lsfinstall for Platform LSF HPC, a number of changes are made for you automatically.

lsfinstall adds a number of shared resources required by LSF HPC to lsf.shared. When you upgrade Platform LSF HPC, add the appropriate resource names under the RESOURCES column of the Host section of lsf.cluster.cluster_name.

What lsfinstall does

lsb.hosts

For the default host, lsfinstall enables "!" in the MXJ column of the Host section of lsb.hosts. For example:

Begin Host 
HOST_NAME MXJ   r1m     pg    ls    tmp  DISPATCH_WINDOW  # Keywords 
#hostA     () 3.5/4.5   15/   12/15  0      ()            # Example 
default    !    ()      ()    ()     ()     ()             
HPPA11     !    ()      ()    ()     ()     ()            #pset host 
End Host 

lsb.modules

Begin PluginModule 
SCH_PLUGIN           RB_PLUGIN       SCH_DISABLE_PHASES  
schmod_default           ()                 () 
schmod_fcfs              ()                 () 
schmod_fairshare         ()                 () 
schmod_limit             ()                 () 
schmod_reserve           ()                 () 
schmod_preemption        ()                 () 
schmod_advrsv            ()                 () 
... 
schmod_cpuset            ()                 () 
schmod_pset              ()                 () 
schmod_rms               ()                 () 
schmod_crayx1            ()                 () 
schmod_crayxt3           ()                 () 
End PluginModule 
note:  
The LSF HPC plugin names must be configured after the standard LSF plugin names in the PluginModule list.

lsb.resources

For IBM POE jobs, lsfinstall configures the ReservationUsage section in lsb.resources to reserve HPS resources on a per-slot basis.

Resource usage defined in the ReservationUsage section overrides the cluster-wide RESOURCE_RESERVE_PER_SLOT parameter defined in lsb.params if it also exists.

Begin ReservationUsage 
RESOURCE           METHOD 
adapter_windows    PER_SLOT 
ntbl_windows       PER_SLOT 
csss               PER_SLOT 
css0               PER_SLOT 
End ReservationUsage 
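
As a rough sketch of the override described above, assume lsb.params leaves RESOURCE_RESERVE_PER_SLOT set to N. That value is an assumption chosen for illustration, not something this page says lsfinstall configures. The resources listed in the ReservationUsage section (adapter_windows, ntbl_windows, csss, css0) would still be reserved per slot, because the ReservationUsage entries take precedence over the cluster-wide parameter.

```
# lsb.params (illustrative fragment; the value N is an assumption)
Begin Parameters
RESOURCE_RESERVE_PER_SLOT = N
End Parameters
```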

lsb.queues

Begin Queue 
QUEUE_NAME   = hpc_ibm 
PRIORITY     = 30 
NICE         = 20 
# ... 
RES_REQ = select[ poe > 0 ] 
EXCLUSIVE = Y 
REQUEUE_EXIT_VALUES = 133 134 135  
DESCRIPTION  = Platform HPC 7 for IBM. This queue is to run POE jobs ONLY. 
End Queue 

Begin Queue 
QUEUE_NAME   = hpc_ibm_tv 
PRIORITY     = 30 
NICE         = 20 
# ... 
RES_REQ = select[ poe > 0 ] 
REQUEUE_EXIT_VALUES = 133 134 135  
TERMINATE_WHEN = LOAD PREEMPT WINDOW 
RERUNNABLE = NO 
INTERACTIVE = NO 
DESCRIPTION  = Platform HPC 7 for IBM TotalView debug queue. This queue is to run POE jobs ONLY. 
End Queue 

Begin Queue 
QUEUE_NAME   = hpc_linux 
PRIORITY     = 30 
NICE         = 20 
# ... 
DESCRIPTION  = Platform HPC 7 for linux. 
End Queue 

Begin Queue 
QUEUE_NAME   = hpc_linux_tv 
PRIORITY     = 30 
NICE         = 20 
# ... 
TERMINATE_WHEN = LOAD PREEMPT WINDOW 
RERUNNABLE = NO 
INTERACTIVE = NO 
DESCRIPTION  = Platform HPC 7 for linux TotalView Debug queue. 
End Queue 

By default, LSF sends a SIGUSR2 signal to terminate a job that has reached its run limit or deadline. Since LAM/MPI does not respond to the SIGUSR2 signal, you should configure the hpc_linux queue with a custom job termination action specified by the JOB_CONTROLS parameter.
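
A minimal sketch of such a termination action is shown here. The choice of SIGTERM is an assumption made for illustration. Substitute whatever signal or command your LAM/MPI jobs actually respond to, and check the JOB_CONTROLS description in the Platform LSF Configuration Reference before using it.

```
Begin Queue
QUEUE_NAME   = hpc_linux
PRIORITY     = 30
NICE         = 20
# ...
# Assumption: terminate jobs with SIGTERM instead of the default SIGUSR2.
# Replace SIGTERM with the signal or command your LAM/MPI jobs handle.
JOB_CONTROLS = TERMINATE[SIGTERM]
DESCRIPTION  = Platform HPC 7 for linux.
End Queue
```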

Begin Queue 
QUEUE_NAME   = rms 
PJOB_LIMIT   = 1 
PRIORITY     = 30 
NICE         = 20 
STACKLIMIT   = 5256 
DEFAULT_EXTSCHED = RMS[RMS_SNODE]  # LSF uses this scheduling policy if 
                                   # -extsched is not defined. 
# MANDATORY_EXTSCHED = RMS[RMS_SNODE] # LSF enforces this scheduling policy 
RES_REQ = select[rms==1] 
DESCRIPTION  = Run RMS jobs only on hosts that have resource 'rms' defined 
End Queue 
tip:  
To make one of the LSF queues the default queue, set DEFAULT_QUEUE in lsb.params.
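
For example, a Parameters section along these lines would make hpc_linux the default queue. The queue name is picked only as an illustration, and an existing lsb.params will normally contain other parameters in the same section.

```
# lsb.params (illustrative fragment; hpc_linux is only an example choice)
Begin Parameters
DEFAULT_QUEUE = hpc_linux
End Parameters
```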

Use the bqueues -l command to view the queue configuration details. Before using LSF HPC, see the Platform LSF Configuration Reference to understand queue configuration parameters in lsb.queues.

lsf.cluster.cluster_name

Begin ResourceMap 
RESOURCENAME        LOCATION 
adapter_windows     [default] 
ntbl_windows        [default] 
poe                 [default] 
dedicated_tasks     (0@[default]) 
ip_tasks            (0@[default]) 
us_tasks            (0@[default]) 
End ResourceMap 

lsf.conf

lsf.shared

lsfinstall defines the following shared resources required by LSF HPC in lsf.shared:

Begin Resource 
RESOURCENAME    TYPE    INTERVAL INCREASING  DESCRIPTION       # Keywords 
rms             Boolean    ()    ()          (RMS) 
pset            Boolean    ()    ()          (PSET) 
slurm           Boolean    ()    ()          (SLURM) 
cpuset          Boolean    ()    ()          (CPUSET) 
mpich_gm        Boolean    ()    ()          (MPICH GM MPI) 
lammpi          Boolean    ()    ()          (LAM MPI) 
mpichp4         Boolean    ()    ()          (MPICH P4 MPI) 
mvapich         Boolean    ()    ()          (Infiniband MPI) 
sca_mpimon      Boolean    ()    ()          (SCALI MPI) 
ibmmpi          Boolean    ()    ()          (IBM POE MPI) 
hpmpi           Boolean    ()    ()          (HP MPI) 
sgimpi          Boolean    ()    ()          (SGI MPI) 
intelmpi        Boolean    ()    ()          (Intel MPI) 
crayxt3         Boolean    ()    ()          (Cray XT3 MPI) 
crayx1          Boolean    ()    ()          (Cray X1 MPI) 
fluent          Boolean    ()    ()          (fluent availability) 
ls_dyna         Boolean    ()    ()          (ls_dyna availability) 
nastran         Boolean    ()    ()          (nastran availability) 
pvm             Boolean    ()    ()          (pvm availability) 
openmp          Boolean    ()    ()          (openmp availability) 
ansys           Boolean    ()    ()          (ansys availability) 
blast           Boolean    ()    ()          (blast availability) 
gaussian        Boolean    ()    ()          (gaussian availability) 
lion            Boolean    ()    ()          (lion availability) 
scitegic        Boolean    ()    ()          (scitegic availability) 
schroedinger    Boolean    ()    ()          (schroedinger availability) 
hmmer           Boolean    ()    ()          (hmmer availability) 
adapter_windows Numeric    30    N           (free adapter windows on css0 on IBM SP) 
ntbl_windows    Numeric    30    N           (free ntbl windows on IBM HPS) 
poe             Numeric    30    N           (poe availability) 
css0            Numeric    30    N           (free adapter windows on css0 on IBM SP) 
csss            Numeric    30    N           (free adapter windows on csss on IBM SP) 
dedicated_tasks Numeric    ()    Y           (running dedicated tasks) 
ip_tasks        Numeric    ()    Y           (running IP tasks) 
us_tasks        Numeric    ()    Y           (running US tasks) 
End Resource 
tip:  
You should add the appropriate resource names under the RESOURCES column of the Host section of lsf.cluster.cluster_name.
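
A Host section with resource names added might look like the following sketch. The host names hostA and hostB, the load threshold values, and the particular Boolean resources chosen are all hypothetical. List only the resources that actually apply to each host in your cluster.

```
# lsf.cluster.cluster_name (illustrative; hosts and resources are hypothetical)
Begin Host
HOSTNAME   model   type   server  r1m  mem  swp  RESOURCES
hostA      !       !      1       3.5  ()   ()   (lammpi mpichp4)
hostB      !       !      1       3.5  ()   ()   (intelmpi)
End Host
```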

Platform Computing Inc.
www.platform.com