Release Notes for Platform™ LSF™ Version 7 Update 1

Release date: June 2007

Last modified: January 31, 2008

Comments to: doc@platform.com

Support: support@platform.com


Contents


What's New in Platform LSF Version 7 Update 1-June 2007

For more information

For detailed information about what's new in Platform LSF Version 7 Update 1, visit the Platform Computing Web site to see Features, Benefits & What's New.

Performance, scalability, reliability, usability enhancements

LSF on Platform EGO

Scheduling enhancements

LSF reports built on EGO

The LSF reporting feature adds the following new data loader plugins for LSF desktop support:

Desktop job

The desktop job data loader (desktopjobdataloader) is a polling loader that loads job completion logs from each desktop server and loads this data into the ACTIVE_DESKTOP_JOBDATA table. This data loader is only available on Linux hosts. By default, this data loader loads data every day.

Desktop client

The desktop client data loader (desktopclientdataloader) is a polling loader that samples client status data from the WSClientStatus file and loads this data into the ACTIVE_DESKTOP_SED_CLIENT table. This data loader is only available on Linux hosts. By default, this data loader samples data every ten minutes.

Desktop active event

The desktop active event data loader (desktopeventloader) is a polling loader that collects data on downloaded and reported jobs from the desktop event.log files. For each event of type 2 (REPORT_JOB) and type 4 (COMPLETE_JOB), desktopeventloader loads this data into the ACTIVE_DESKTOP_ACEVENT table. This data loader is only available on Linux hosts. This data loader collects data when an event is logged into the event.log files.
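The event-type filter described above can be sketched as follows. This is only an illustration: the record layout of event.log is not described in these notes, so the (event_type, job_id) tuples and the function name select_loadable_events are hypothetical.

```python
# Event type numbers come from the text above; everything else is assumed.
REPORT_JOB = 2
COMPLETE_JOB = 4
LOADED_EVENT_TYPES = {REPORT_JOB, COMPLETE_JOB}


def select_loadable_events(events):
    """Keep only the events of type 2 (REPORT_JOB) and type 4 (COMPLETE_JOB).

    `events` is a hypothetical list of (event_type, job_id) tuples standing
    in for parsed event.log records.
    """
    return [event for event in events if event[0] in LOADED_EVENT_TYPES]


sample = [(1, "job-a"), (2, "job-a"), (3, "job-b"), (4, "job-a")]
print(select_loadable_events(sample))
```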

LSF License Scheduler

Platform LSF Desktop Support


Upgrade and Compatibility Notes

Server host compatibility

important:  
To use new features introduced in Platform LSF Version 7, you must upgrade all hosts in your cluster to Version 7.

LSF 6.x and 5.x servers are compatible with Version 7 master hosts. All LSF 6.x and 5.x features are supported by Version 7 master hosts.

Add new IBM AIX host types and models

Platform LSF Version 7 Update 1 supports a new host type and model for IBM AIX 5.3 POWER6 hosts. For LIM to correctly identify the new host type and model, you must manually add them to lsf.shared.

  1. Edit lsf.shared and add the new host type IBM9117 in the HostType section:
    Begin HostType 
    IBM9117 
    End HostType 
    
  2. Edit lsf.shared and add new host model PowerPC_Power6 in the HostModel section:
    Begin HostModel 
    PowerPC_Power6      14.0   (IBM9117) 
    End HostModel 
    
  3. Restart the master lim and slave lims running on all hosts to pick up the new host type and model.
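The two lsf.shared edits above can also be scripted. The following is a minimal sketch, not a supported tool: the helper name add_section_entry is hypothetical, the sample file content is invented for illustration, and a real lsf.shared contains more sections and lines than shown here. Back up the file before changing it.

```python
def add_section_entry(config_text, section, entry):
    """Insert `entry` just before the 'End <section>' line.

    Returns the text unchanged if a line in the section already starts
    with the same first word (for example, the same host type name).
    """
    lines = config_text.splitlines()
    begin = end = None
    for index, line in enumerate(lines):
        words = line.split()
        if words[:2] == ["Begin", section]:
            begin = index
        elif words[:2] == ["End", section] and begin is not None:
            end = index
            break
    if begin is None or end is None:
        raise ValueError(f"section {section!r} not found")

    first_word = entry.split()[0]
    for line in lines[begin + 1:end]:
        if line.split()[:1] == [first_word]:
            return config_text  # entry is already defined
    lines.insert(end, entry)
    return "\n".join(lines) + "\n"


# Invented sample content standing in for an existing lsf.shared file.
shared = "Begin HostType\nLINUX86\nEnd HostType\n\nBegin HostModel\nEnd HostModel\n"
shared = add_section_entry(shared, "HostType", "IBM9117")
shared = add_section_entry(shared, "HostModel", "PowerPC_Power6      14.0   (IBM9117)")
print(shared)
```

The LIMs still need to be restarted afterwards, as described in the last step.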

Upgrade from an earlier version of LSF on UNIX and Linux

Run lsfinstall to upgrade to LSF Version 7 from an earlier version of LSF on UNIX and Linux. Follow the steps in Upgrading Platform LSF on UNIX and Linux.

important:  
Do not use the UNIX and Linux upgrade steps to update an existing LSF Version 7 cluster to LSF Version 7 Update 1. Follow the steps in the "Cluster Version Management and Patching on UNIX and Linux" chapter in Administering Platform LSF to update your existing LSF Version 7 cluster to LSF Version 7 Update 1.

Update your existing LSF Version 7 cluster to Version 7 Update 1 on UNIX and Linux

You must use the latest lsfinstall program and the latest install.config template file to update your cluster. Follow the steps in the "Cluster Version Management and Patching on UNIX and Linux" chapter in Administering Platform LSF to update your existing LSF Version 7 cluster to LSF Version 7 Update 1.

important:  
Before running lsfinstall, you must download and extract the new installation distribution file for LSF Version 7 Update 1 (lsf7.0.1_lsfinstall.tar.Z) to get the latest version of lsfinstall. Prepare the install.config file using the new template and information from your original installation. The new template has additional parameters for the LSF Version 7 patch installation and management facility.

Migrate LSF on Windows

To migrate LSF on Windows to LSF Version 7 from an earlier version of LSF on Windows, follow the steps in "Migrate Your Windows Cluster to Platform LSF Version 7" (lsf_migrate_windows.pdf).

Maintenance pack and update availability

At release, Platform LSF Version 7 Update 1 includes all bug fixes and solutions delivered before June 2007. Fixes after June 2007 will be included in the next LSF update.

Fixes in the November 2006 Maintenance Pack were included in the March 2007 update.

As of February 2007, monthly maintenance packs are no longer distributed for LSF Version 7.

System requirements

See the Platform Computing Web site for information about supported operating systems and system requirements for the Platform LSF family of products:

API compatibility

Full backward compatibility: your applications will run under LSF Version 7 without changing any code.

The Platform LSF Version 7 API is fully compatible with the LSF Version 6.x and 5.x APIs. An application linked with the LSF Version 6.x or 5.x libraries will run under LSF Version 7 without relinking.

To take full advantage of new Platform LSF Version 7 features, including job submission using JSDL and IPv6 address formats, you should recompile your existing LSF applications with LSF Version 7.

New and changed LSF APIs

See the LSF API Reference for more information.

The following new APIs have been added for LSF Version 7 Update 1:

The following APIs have changed for LSF Version 7 Update 1:

Multiple cluster configuration

In Platform LSF Version 7, multiple independent clusters can no longer share the same configuration directory. You must install each LSF cluster in a unique location.

NCPUS detection on AIX

On a machine running AIX, ncpus detection differs from previous releases. Under AIX, the number of detected physical processors is always 1, whereas the number of detected cores is the number of cores across all physical processors. Thread detection is the same as on other operating systems (the number of threads per core).

Enable the full Platform Management Console

By default, only the LSF reporting feature is enabled in the Platform Management Console (PMC) after installation. Complete the following steps to enable the full PMC functionality.

  1. With an XML editor, open pmc_conf_ego.xml.
  2. In the configuration section, locate the parameter: <Name>OnlyShowReport</Name>.
  3. In that parameter, change <Value>true</Value> to <Value>false</Value>.
  4. Save and close pmc_conf_ego.xml.
  5. Restart the WEBGUI service.
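If you prefer to script the edit in steps 1 through 4, the following sketch shows one way to do it. It rests on an assumption: the surrounding element names (Configuration, Parameter) in the sample are hypothetical, and only the <Name> and <Value> elements are taken from the steps above. Check the real layout of pmc_conf_ego.xml before adapting it.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; the actual structure of pmc_conf_ego.xml may differ.
SAMPLE = """<Configuration>
  <Parameter>
    <Name>OnlyShowReport</Name>
    <Value>true</Value>
  </Parameter>
</Configuration>"""


def set_parameter(xml_text, name, value):
    """Set <Value> to `value` in every element whose <Name> child equals `name`."""
    root = ET.fromstring(xml_text)
    changed = False
    for element in root.iter():
        name_el = element.find("Name")
        value_el = element.find("Value")
        if name_el is None or value_el is None:
            continue
        if (name_el.text or "").strip() == name:
            value_el.text = value
            changed = True
    if not changed:
        raise KeyError(name)
    return ET.tostring(root, encoding="unicode")


print(set_parameter(SAMPLE, "OnlyShowReport", "false"))
```

The WEBGUI service must still be restarted for the change to take effect.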

What's Changed in Platform LSF Version 7 Update 1

Changed behavior

Banded licensing

The memory limit for S-Class licenses on X86/AMD64/EM64T processors has increased from 8 GB to 16 GB. The other classes of licenses have not changed.

You can use permanent licenses with restrictions in operating system and hardware configurations. These banded licenses have three classes, with the E-class licenses having no restrictions.

Banded licenses now support the following operating systems and hardware configurations:

License type  Supported operating systems                Processor                       Physical memory (per node)  Physical processors/sockets
B-Class       Linux, Windows, MacOS                      Intel X86/AMD64/EM64T           Up to and including 4 GB    Up to and including 2 processors
S-Class       Linux, Windows, MacOS                      Intel X86/AMD64/EM64T           Up to and including 16 GB   Up to and including 4 processors
E-Class       Linux, Windows, MacOS                      Intel X86/AMD64/EM64T           More than 16 GB             More than 4 processors
E-Class       All other LSF-supported operating systems  Intel X86/AMD64/EM64T           N/A                         N/A
E-Class       N/A                                        All other supported processors  N/A                         N/A
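The class limits listed above can be expressed as a small lookup. This is a sketch of one reading of the table, not the licensing logic LSF itself uses: it assumes that a host exceeding either the memory limit or the processor limit of a class needs the next class up, and the function name license_class is hypothetical.

```python
# Operating systems and processors that qualify for B-Class and S-Class.
BANDED_OSES = {"linux", "windows", "macos"}
BANDED_PROCESSORS = {"x86", "amd64", "em64t"}


def license_class(os_name, processor, memory_gb, physical_processors):
    """Return "B", "S", or "E" for a host, following the table above.

    Assumption: a host must fit both the memory and the processor limit
    of a class to qualify for it.
    """
    if os_name.lower() not in BANDED_OSES:
        return "E"
    if processor.lower() not in BANDED_PROCESSORS:
        return "E"
    if memory_gb <= 4 and physical_processors <= 2:
        return "B"
    if memory_gb <= 16 and physical_processors <= 4:
        return "S"
    return "E"


print(license_class("Linux", "EM64T", 2, 2))
print(license_class("Windows", "AMD64", 16, 4))
print(license_class("AIX", "POWER6", 2, 1))
```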

In the LSF license file:

FEATURE lsf_manager lsf_ld 6.200 8-may-2008 2 ADE2C12C1A81E5E8F29C \        
VENDOR_STRING=Platform NOTICE=Class(S) 
FEATURE lsf_manager lsf_ld 6.200 8-may-2008 10 1DC2C1CCEF193E42B6DC \ 
VENDOR_STRING=Platform NOTICE=Class(E) 
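To check which license classes a license file provides, you can read the NOTICE field of each FEATURE line. The sketch below parses the two example lines above. It assumes the usual FLEXlm field order (feature name, vendor daemon, version, expiry date, license count, key), and the function name feature_classes is hypothetical.

```python
import re

LICENSE_TEXT = """\
FEATURE lsf_manager lsf_ld 6.200 8-may-2008 2 ADE2C12C1A81E5E8F29C \\
VENDOR_STRING=Platform NOTICE=Class(S)
FEATURE lsf_manager lsf_ld 6.200 8-may-2008 10 1DC2C1CCEF193E42B6DC \\
VENDOR_STRING=Platform NOTICE=Class(E)
"""


def feature_classes(text):
    """Return (feature name, license count, class letter) for each FEATURE line."""
    # Join physical lines that end with a backslash continuation.
    joined = re.sub(r"\\[ \t]*\n", " ", text)
    results = []
    for line in joined.splitlines():
        fields = line.split()
        if len(fields) < 6 or fields[0] != "FEATURE":
            continue
        match = re.search(r"NOTICE=Class\((\w)\)", line)
        class_letter = match.group(1) if match else None
        results.append((fields[1], int(fields[5]), class_letter))
    return results


for name, count, class_letter in feature_classes(LICENSE_TEXT):
    print(name, count, class_letter)
```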

Enforcement of dual-core processor licenses on Linux

Dual-core processor hosts running Linux must be licensed by the lsf_dualcore_x86 license feature.

Each dual-core processor requires one standard LSF license and one lsf_dualcore_x86 license.

Use lshosts -l to see the number of dual-core licenses enabled and needed. For example:

lshosts -l hostB
HOST_NAME:  hostB
type     model  cpuf ncpus ndisks maxmem maxswp maxtmp rexpri server nprocs ncores nthreads
LINUX86 PC6000 116.1     2      1  2016M  1983M 72917M      0    Yes      1      1        2

...
LICENSES_ENABLED: (LSF_Base LSF_Manager LSF_MultiCluster LSF_Sched_Fairshare 
LSF_Sched_Resource_Reservation LSF_Sched_Preemption LSF_Sched_Parallel 
LSF_Sched_Advance_Reservation LSF_DualCore_x86)
LICENSE CLASS NEEDED: Class(B), Multi-cores
... 

Enforcement of multicore processor licenses on Linux and Windows

Multicore hosts running Linux or Windows must be licensed by the lsf_dualcore_x86 license feature. Each physical processor requires one standard LSF license and num_cores-1 lsf_dualcore_x86 licenses. For example, a processor with 4 cores requires 3 lsf_dualcore_x86 licenses.
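The rule above is simple arithmetic: one standard license per physical processor, plus num_cores-1 lsf_dualcore_x86 licenses per processor. A minimal sketch, with a hypothetical function name:

```python
def required_licenses(physical_processors, cores_per_processor):
    """Return the license counts a multicore host needs under the rule above."""
    if physical_processors < 1 or cores_per_processor < 1:
        raise ValueError("processor and core counts must be at least 1")
    return {
        "standard": physical_processors,
        "lsf_dualcore_x86": physical_processors * (cores_per_processor - 1),
    }


# One 4-core processor: 1 standard license and 3 lsf_dualcore_x86 licenses.
print(required_licenses(1, 4))
# Two 4-core processors: 2 standard licenses and 6 lsf_dualcore_x86 licenses.
print(required_licenses(2, 4))
```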

Use lshosts -l to see the number of multicore licenses enabled and needed. For example:

lshosts -l hostB
HOST_NAME:  hostB
type     model  cpuf ncpus ndisks maxmem maxswp maxtmp rexpri server nprocs ncores nthreads
LINUX86 PC6000 116.1     2      1  2016M  1983M 72917M      0    Yes      1      1        2

LICENSES_ENABLED: (LSF_Base LSF_Manager LSF_MultiCluster LSF_Sched_Fairshare 
LSF_Sched_Resource_Reservation LSF_Sched_Preemption LSF_Sched_Parallel 
LSF_Sched_Advance_Reservation LSF_DualCore_x86)
LICENSE CLASS NEEDED: Class(B), Multi-cores
... 

Determining what licenses a host needs

Use lim -t to see the license requirements for a host. For example:

lim -t hostA
Host Type             : NTX64
Host Architecture     : EM64T_1596
Physical Processors   : 2
Cores per Processor   : 4
Threads per Core      : 2
License Needed        : Class(B), Multi-cores
Matched Type          : NTX64
Matched Architecture  : EM64T_3000
Matched Model         : Intel_EM64T
CPU Factor            : 60.0 
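Because lim -t prints one "label : value" pair per line, its output is easy to process in a script. The sketch below parses the example output above into a dictionary. The function name parse_lim_t is hypothetical, and the parser assumes every line of interest has the label-colon-value form shown in the example.

```python
LIM_T_OUTPUT = """\
Host Type             : NTX64
Host Architecture     : EM64T_1596
Physical Processors   : 2
Cores per Processor   : 4
Threads per Core      : 2
License Needed        : Class(B), Multi-cores
Matched Type          : NTX64
Matched Architecture  : EM64T_3000
Matched Model         : Intel_EM64T
CPU Factor            : 60.0
"""


def parse_lim_t(output):
    """Split each 'label : value' line into a dictionary entry."""
    info = {}
    for line in output.splitlines():
        label, separator, value = line.partition(":")
        if separator:
            info[label.strip()] = value.strip()
    return info


host = parse_lim_t(LIM_T_OUTPUT)
total_cores = int(host["Physical Processors"]) * int(host["Cores per Processor"])
print(host["License Needed"], total_cores)
```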

Resource requirements in application profiles

Job-level, application-level, and queue-level resource requirements are merged in the following manner:

For internal load indices and duration, jobs are rejected if they specify resource reservation requirements at the job level or application level that exceed the requirements specified in the queue.

If RES_REQ is defined at the queue level and there are no load thresholds defined, the pending reasons for each individual load index will not be displayed by bjobs.

New and changed configuration parameters and environment variables

The following configuration parameters and environment variables are new or changed for LSF Version 7 Update 1:

ego.conf

install.config

lsb.applications

lsb.modules

lsb.params

lsb.queues

lsf.conf

lsf.licensescheduler

lsf.shared

Environment variables

The following environment variables are new in LSF Version 7 Update 1:

New and changed commands, options, and output

The following command options and output are new or changed for LSF Version 7 Update 1:

badmin

perfmon start [sample_period] | stop | view | setperiod sample_period

Dynamically enables and controls scheduler performance metric collection. Collecting and recording performance metric data may affect the performance of LSF. Smaller sampling periods will result in the lsb.streams file growing faster.

The following metrics are collected and recorded in each sample period:

bbot

You cannot run bbot on jobs pending in an absolute priority scheduling (APS) queue.

bhist

bhosts

When LOCAL_TO is configured for a license feature in lsf.licensescheduler, bhosts -s shows different resource information depending on the cluster locality of the features.

bjobs

blaunch (new)

Most MPI implementations and many distributed applications use rsh and ssh as their task launching mechanism. The blaunch command provides a drop-in replacement for rsh and ssh as a transparent method for launching parallel applications within LSF.

blaunch supports the same core command-line options as rsh and ssh:

All other rsh and ssh options are silently ignored.

blaunch transparently connects directly to the RES/SBD on the remote host, creates and tracks the remote tasks, and provides the connection back to LSF. There is no need to insert pam, taskstarter, or any other wrapper.

blaunch only works under LSF. It can only be used to launch tasks on remote hosts that are part of a job allocation. It cannot be used as a standalone command. blaunch is not supported on Windows.

blinfo

When LOCAL_TO is configured for a feature in lsf.licensescheduler, blinfo shows the cluster locality and license token allocation information for the license features.

blstat

When LOCAL_TO is configured for a feature in lsf.licensescheduler, blstat shows the cluster locality information for the license features. For example, with a group distribution configuration blstat shows the locality of a license feature configured for various sites.

blusers

When LOCAL_TO is configured for a feature in lsf.licensescheduler, blusers shows cluster locality information for the license features.

bmod

bqueues

bslots (new)

Displays slots available for backfill jobs, and slots reserved for parallel jobs and advance reservations. The available slots are not currently used for running jobs and can be used for backfill jobs. The available slots displayed by bslots are only a snapshot of the slots currently not in use by parallel jobs or advance reservations. They are not guaranteed to be available at job submission.

By default, displays all available slots, and the available run time for those slots.

If the available backfill window has no run time limit, its length is displayed as UNLIMITED.

bsub

btop

You cannot run btop on jobs pending in an absolute priority scheduling (APS) queue.

lim

lim -t displays host information, such as host type, matched host type, host architecture, physical number of processors, number of cores per physical processor, number of threads per processor core, and license requirements.

note:  
When running Linux kernel version 2.4, you must run lim -t as root to ensure consistent output with other clustered application management commands (for example, output from running lshosts).

LIM reads the configuration file ego.conf to retrieve configuration information. ego.conf is a generic configuration file shared by all daemons and clients. It contains configuration information and other information that dictates the behavior of the software. lim retrieves the following parameters from ego.conf:

lshosts

Host-based default output displays ncpus, the number of processors on the host. If LSF_ENABLE_DUALCORE=Y in lsf.conf for dual-core CPU hosts, lshosts displays the number of cores instead of physical CPUs. If EGO_DEFINE_NCPUS is specified in ego.conf, lshosts displays the appropriate value for ncpus, depending on the value of EGO_DEFINE_NCPUS:

EGO_DEFINE_NCPUS=cores is the same as setting LSF_ENABLE_DUALCORE=Y. LSF_ENABLE_DUALCORE and EGO_ENABLE_DUALCORE are obsolete. Use EGO_DEFINE_NCPUS for improved detection of processors, cores, and threads.
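As an illustration of how ncpus depends on EGO_DEFINE_NCPUS, the following sketch computes the value from the counts reported by lim -t. Only the cores setting is named in these notes. The procs and threads settings used here are taken from general LSF documentation and should be treated as an assumption, and the function name is hypothetical.

```python
def ncpus_for(define_ncpus, physical_processors, cores_per_processor, threads_per_core):
    """Return the ncpus value for a given EGO_DEFINE_NCPUS setting.

    Assumed settings: "procs" counts physical processors, "cores" counts
    all cores, and "threads" counts all hardware threads.
    """
    if define_ncpus == "procs":
        return physical_processors
    if define_ncpus == "cores":
        return physical_processors * cores_per_processor
    if define_ncpus == "threads":
        return physical_processors * cores_per_processor * threads_per_core
    raise ValueError(f"unsupported EGO_DEFINE_NCPUS value: {define_ncpus!r}")


# Counts from the lim -t example: 2 processors, 4 cores each, 2 threads per core.
for setting in ("procs", "cores", "threads"):
    print(setting, ncpus_for(setting, 2, 4, 2))
```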

Host-based -l output displays:

patchinstall (UNIX-new)

Use patchinstall to install and manage patches on an existing licensed Platform cluster. The patch installer includes functionality to query a cluster, check the contents of a package and its compatibility with the cluster, and patch or roll back a cluster.

For clusters version 7 or earlier, you must obtain the patch installer separately from Platform, and run the patchinstall command from your download directory.

For clusters version 7 or later, the patch installer is available in the install directory under the LSF installation directory. This location may not be in your path, so run the patchinstall command from this directory (LSF_TOP/7.0/install/patchinstall).

pversions (UNIX-new)

The version command pversions is provided to query patch history and deliver information about cluster and product version and patch levels. Use pversions to query a cluster or check contents of a package.

By default, pversions displays the version and patch level of Platform products. Optionally, the command can also be used to do the following:

For each binary type, displays basic version information (package build date, build number, package installed date) and lists patches installed (package type, build number, date installed, fixes).

The version command is not located with other LSF commands so it may not be in your path. The command location is LSF_TOP/7.0/install/pversions

The cluster location is normally determined by your environment setting, so ensure your environment is set before you run this command (for example, you sourced profile.platform or profile.lsf).

tspeek

tspeek is now supported on Linux hosts. In mixed cluster environments, you can use tspeek to monitor job output from a Linux host for a Windows Terminal Services job.

New and changed files

No files have been added or changed in Platform LSF Version 7 Update 1.

LSF

New and changed accounting and job event fields

lsb.acct

No fields are new or changed in the lsb.acct file records for Platform LSF Version 7 Update 1.

lsb.events

No fields are new or changed in the lsb.events file records for Platform LSF Version 7 Update 1.

LSF daemon management

Manage LSF daemons two ways:

important:  
LSF res and sbatchd do not restart automatically if you run lsadmin resshutdown and badmin hshutdown to manually shut them down. You must run lsadmin resstartup and badmin hstartup to restart the daemons after host shutdown.

All LSF commands and tools, including lsadmin and badmin, are available under both management models.

Directory structure changes

The installation directory structure has changed for Platform LSF Version 7. See Installing Platform LSF on UNIX and Linux for the details of the new structure. Depending on which products you have installed and platforms you have selected, your directory structure may vary.

Bugs fixed since March 2007

The following bugs have been fixed in the June 2007 update since the March 2007 update:

87551
Date
2007-05-17
 
Description
License Scheduler will not count licenses used if the lmstat -a -c port@host output includes the word "licenses". This can happen if the license server host has the word "licenses" in its domain name, or if a host checks out a license with "licenses" in its name.
 
Component
blcollect
 
Platform
UNIX
 
Impact
Inaccurate license count in License Scheduler

87279
Date
2007-05-14
 
Description
Parameter LSF_LICENSE_FILE is not added to lsf.conf file after a new installation
 
Component
lsfinstall
 
Platform
UNIX
 
Impact
Cluster is unlicensed until the LSF_LICENSE_FILE parameter is manually added to the configuration

85528
Date
2007-05-09
 
Description
Newer Linux kernels are setting the parent and group of the init process to 1 instead of 0
 
Component
All
 
Platform
Linux
 
Impact
This causes problems for pim on Linux since it skips all processes under group 1 including init. bjobs -l does not show correct resource usage. LSF cannot receive resource info from the /tmp/pim.* file since only one process remains.

87100
Date
2007-05-04
 
Description
Parallel job using exec rusage pends forever
 
Component
schmod_reserve.so schmod_default.so schmod_parallel.so
 
Platform
All
 
Impact
Job is never dispatched

86438
Date
2007-04-29
 
Description
Inconsistent fairshare behavior by restarting and reconfiguring mbatchd after user group is removed
 
Component
mbatchd
 
Platform
All
 
Impact
Fairshare does not work as expected after mbatchd restart

84852
Date
2007-04-29
 
Description
Job remains pending forever because of unsatisfied job dependency
 
Component
mbatchd
 
Platform
All
 
Impact
Cannot tell if the job has exited

86348
Date
2007-04-26
 
Description
Cannot pass -h and -V as arguments to an MPI program
 
Component
pam
 
Platform
All
 
Impact
Job fails if an MPI program and its options (including -h -V) are not enclosed with single quotes

84289
Date
2007-04-24
 
Description
When host exclusion for host partition is defined in queue level, LSF cannot exclude the host defined, and jobs could still be submitted to the excluded host
 
Component
mbatchd
 
Platform
All
 
Impact
In clusters with login hosts and server hosts, jobs should not be submitted to run on login hosts. If hosts are licensed to run specialized software, only authorized users should be able to use those hosts. Hosts cannot be excluded from certain queues.

86069
Date
2007-04-23
 
Description
epoll_mod error: epoll_ctl() failed. No such file or directory.
 
Component
MultiCluster
 
Platform
UNIX
 
Impact
MultiCluster does not work with epoll enabled

84445
Date
2007-04-20
 
Description
lsmake hangs or core dumps
 
Component
lsmakerm
 
Platform
All
 
Impact
lsmake fails

82558
Date
2007-04-20
 
Description
Slot reservation is not displayed in bjobs, but the job behaves like the reservation is happening
 
Component
schmod_cpuset.so
 
Platform
All
 
Impact
Cannot determine the job pending reason easily. It looks like jobs are blocked by certain other jobs. This makes job start time unpredictable.

85618
Date
2007-04-19
 
Description
Missing log message
 
Component
sbatchd
 
Platform
All
 
Impact
Hard to tell why a job is not suspended

70025
Date
2007-04-19
 
Description
lsfmon.exe is not in the install package
 
Component
Windows installer
 
Platform
Windows
 
Impact
lsfmon.exe is not in the install package

84904
Date
2007-04-17
 
Description
bpeek command fails because it cannot change to the user's home directory
 
Component
res
 
Platform
All
 
Impact
Cannot use bpeek to see the output from the job

84606
Date
2007-04-16
 
Description
Job cannot run after being changed with lsb_modify()
 
Component
lsb_modify() in libbat.a API
 
Platform
All
 
Impact
Job cannot run after being changed with lsb_modify()

84908
Date
2007-04-15
 
Description
LIM on Linux 2.6 reports wrong pg index
 
Component
lim
 
Platform
Linux 2.6
 
Impact
Wrong load index reported

84745
Date
2007-04-15
 
Description
For large remote execution tasks, nios/res times out from time to time
 
Component
nios
 
Platform
All
 
Impact
Remote execution fails

85923
Date
2007-04-13
 
Description
Master LIM is very busy after upgrading
 
Component
lim
 
Platform
All
 
Impact
After master LIM is restarted, it takes a very long time for all hosts to become ok

72938
Date
2007-04-13
 
Description
lsmake fails
 
Component
lsmakerm on lsmake
 
Platform
UNIX
 
Impact
lsmake fails

85689
Date
2007-04-10
 
Description
mbschd does not clearly mark the beginning and ending of a scheduling session
 
Component
mbschd
 
Platform
All
 
Impact
Hard to analyze mbschd log file

85323
Date
2007-04-10
 
Description
esub cannot change LSB_SUB2_USE_RSV parameter value
 
Component
bmod, bsub
 
Platform
All
 
Impact
Customized esub cannot change user-submitted advance reservation value

82416
Date
2007-04-09
 
Description
bhosts with -l and -s options does not show appropriate column name
 
Component
bhosts
 
Platform
All
 
Impact
Information displayed by bhosts can be misinterpreted

84461
Date
2007-04-04
 
Description
mbatchd requests the wrong number of resources from bld when RESOURCE_RESERVE_PER_SLOT is set
 
Component
bld, mbatchd
 
Platform
All
 
Impact
Resources not allocated correctly

80169
Date
2007-03-27
 
Description
The start time recorded in lsb.acct for each job in a chunked set is not the starting time for the job, but rather the time when the chunk of jobs that the job was part of is sent to the execution host
 
Component
mbatchd
 
Platform
All
 
Impact
lsb.acct file cannot be used to determine the actual wall clock run time of a job

83334
Date
2007-03-23
 
Description
bpeek does not work on a Solaris 9 host
 
Component
bpeek
 
Platform
Solaris 7, 8, 9, 10
 
Impact
Cannot get the job output by bpeek on a Solaris 9 host

83520
Date
2007-03-22
 
Description
hostsetup does not set LSF startup script correctly
 
Component
hostsetup
 
Platform
Linux
 
Impact
LSF cannot start automatically when the host is rebooting

84207
Date
2007-03-21
 
Description
Cannot see the full host name for some hosts
 
Component
pam
 
Platform
All
 
Impact
pam output cannot distinguish between some hosts

83596
Date
2007-03-20
 
Description
Cannot remove temp accounting file under LSF_TMPDIR after a job is done
 
Component
sbatchd, res
 
Platform
All
 
Impact
Waste of disk space and file node resources

83986
Date
2007-03-19
 
Description
For some "ptile=!" job pending with "Not enough processors to meet the job's spanning requirement", mbdrestart will dispatch them anyway
 
Component
mbatchd
 
Platform
All
 
Impact
Parallel jobs get dispatched unexpectedly

84091
Date
2007-03-13
 
Description
The format of JOBID and FACTOR has been changed, causing a display issue in the job exception handling email
 
Component
none
 
Platform
UNIX/Linux
 
Impact
Low

82623
Date
2007-03-13
 
Description
mbatchd does not log accurate error message regarding communicating with bld
 
Component
bld, mbatchd
 
Platform
All
 
Impact
Difficult to diagnose the problem

83573
Date
2007-03-07
 
Description
When one of the tasks in a POE integrated parallel job exits, the TaskStarter for this task exits, but pam hangs. POE can handle some of the exit codes; e.g. 139 (Segfault), outside of LSF.
 
Component
taskstarter
 
Platform
All
 
Impact
Many applications have segmentation faults for one or more tasks in a parallel job. LSF should be able to handle this.

83361
Date
2007-03-07
 
Description
sbatchd directs load information request to master lim, causing master lim performance penalty
 
Component
sbatchd
 
Platform
All
 
Impact
Master lim becomes slow

83759
Date
2007-03-06
 
Description
The event file may be corrupted and job IDs are reused when two mbatchd are running
 
Component
mbatchd
 
Platform
All
 
Impact
More than one job could use the same job ID

83532
Date
2007-03-05
 
Description
When LSB_DEFAULTPROJECT environment variable is set, bmod does not work with running jobs
 
Component
mbatchd
 
Platform
All
 
Impact
bmod does not work with running jobs

83371
Date
2007-03-02
 
Description
lsmake will fail in the case of remake with non-zero make level
 
Component
lsmake
 
Platform
All
 
Impact
lsmake fails in the case of remake

83175
Date
2007-03-02
 
Description
bsub fails because of XDR error
 
Component
bsub
 
Platform
All
 
Impact
bsub fails

77119
Date
2007-02-27
 
Description
mbdrestart changes RUN_TIME in host partition fairshare
 
Component
All
 
Platform
All
 
Impact
User account information or user share priority will be wrong

82742
Date
2007-02-23
 
Description
Some CPUs on a host cannot be used after a job got post-done status update without done status update from sbatchd
 
Component
All
 
Platform
All
 
Impact
Some CPUs on a host cannot be used

83221
Date
2007-02-22
 
Description
mbatchd core dumps at event replay if one extra JOB_NEW is inserted for a job
 
Component
All
 
Platform
All
 
Impact
The cluster is down

80814
Date
2007-02-14
 
Description
LIM fails to convert license from lsf_base to lsf_client
 
Component
All
 
Platform
All
 
Impact
Some LSF client hosts are unlicensed

82257
Date
2007-02-13
 
Description
lsmake fails with more than 10024 tasks
 
Component
lsmake
 
Platform
All
 
Impact
lsmake is unusable

82254
Date
2007-02-13
 
Description
lsmake server (lsmakerm) core dumps, which causes failure of lsmake
 
Component
lsmake
 
Platform
All
 
Impact
lsmake is unusable

80900
Date
2007-02-12
 
Description
pam hangs for 15 minutes during shutdown
 
Component
pam
 
Platform
All
 
Impact
pam does exit after the 15 minutes. Having a process hang for this long hurts performance.

80671
Date
2007-02-12
 
Description
LSF applies advance reservation defined with -t and current day's index to next week instead of applying it to current day. For example, assuming current time is 6:00 p.m. and today is Wednesday (day 3 in LSF advance reservation day index), the following reservation is defined:
brsvadd -m hostA -s -t "3:19:00-3:19:30" 
LSF applies this reservation to next Wednesday instead of today from 7:00 p.m. to 7:30 p.m.
 
Component
All
 
Platform
All
 
Impact
Advance reservation is not created for the current day

80611
Date
2007-02-12
 
Description
After cluster restart, a job will be dispatched by trespassing on an advance reservation on a multi-CPU host
 
Component
All
 
Platform
All
 
Impact
Advance reservation is not honored

77785
Date
2007-02-12
 
Description
Events data can be lost if duplicate event logging is enabled together with failover and the primary master becomes unavailable for some time
 
Component
All
 
Platform
Linux
 
Impact
Events data is lost

82345
Date
2007-02-11
 
Description
In cross-queue fairshare, CPU time and run time decay too fast
 
Component
All
 
Platform
All
 
Impact
Fairshare is not accurate

82770
Date
2007-02-08
 
Description
mbschd crashes periodically
 
Component
mbschd
 
Platform
All
 
Impact
Cluster is not operating properly

82739
Date
2007-02-07
 
Description
If a large fairshare tree is configured and lots of finished jobs are in lsb.events, mbatchd may take a long time to replay
 
Component
All
 
Platform
All
 
Impact
mbatchd does not respond for a long time during mbatchd restart

81748
Date
2007-02-07
 
Description
TotalView integration with LAMMPI does not work
 
Component
All
 
Platform
Linux2.6-glibc2.3-x86_64
 
Impact
TotalView integration with LAMMPI does not work


Known Issues

Platform LSF Version 7 Update 1

Warning message installing to ACL-enabled file systems

On RHEL5, lsfinstall gives a warning message during preinstallation checking if the installation file system has ACL enabled. You should avoid installing LSF on an ACL-enabled file system. To be fixed in a later LSF 7.0 update.

PARALLEL_SCHED_BY_SLOT limitations

Buffer space message on Windows XP

When maximum memory limit is set with the /3GB switch in boot.ini on Windows XP, some LSF operations (for example query commands like bqueues and bhosts) give a warning message like:

Failed in an LSF library call: Failed in sending/receiving a message: 
No buffer space available  

You should not set the /3GB switch in boot.ini on LSF master hosts.

Platform LSF on Windows Vista

Cannot delete uninstall directory

Windows shows "Access denied" when the local Windows administrator or the cluster administrator tries to delete the LSF uninstall directory. The LSF uninstall directory cannot be deleted because the C:\LSF_7.0\conf\passwd.lsfuser file is owned by "System". The passwd.lsfuser file must be owned by the cluster administrator.

Shared directory permissions

When users create a shared directory on Windows Vista, the default owner is the directory creator. For LSF to work properly, the shared directory for LSF must be configured so that cluster administrators have read/write permission and all LSF users have at least read permission. The shared directory must have the following security settings:

cmd.exe permissions

For installations on an NTFS file system, users must have "Read" and "Execute" privileges for cmd.exe. The following files:

Require the following access permissions:

Platform EGO

Platform EGO version 1.2.2 increases the number of host types you can manually define in EGO_CONFDIR/ego.shared from 128 to 1024. In a MultiCluster environment where one cluster contains a mix of EGO 1.2.2 hosts and pre-EGO 1.2.2 hosts, the maximum number of host types you can define in ego.shared remains 128.

Platform LSF Desktop Support

Platform EGO management of LSF desktop support services applies to the MED and to the Web servers (Tomcat and Apache). With EGO management of LSF desktop support services enabled, you should not use the command lsfac_daemons to start or stop Apache or Tomcat services because EGOSC will automatically restart them. Instead, you should use the egosh command to start and stop these services.

If EGO management of LSF desktop support services is enabled, you must use an EGO command to start and stop a managed service. From the command line, enter one of the following commands:

egosh service start LSFDesktopApache LSFDesktopTomcat 
egosh service stop LSFDesktopApache LSFDesktopTomcat 

Platform LSF Desktop reporting

In the Hourly Desktop Job Throughput report, if an SED host pulls a job from an MED host but the job fails to run, and another SED host pulls the same job and runs it successfully, the job is double-counted in both the number of downloaded jobs and the number of completed jobs for the MED host.


Download the Platform LSF Version 7 Distribution Packages

You can download the LSF distribution packages in two ways: through FTP or from my.platform.com.

important:  
The latest Platform LSF Version 7 release is Update 2. Distribution packages are available only for Platform LSF Version 7 Update 2 and Platform LSF Version 7 Update 1.

Download LSF through FTP

Prerequisites: Access to the Platform FTP site is controlled by login name and password. If you cannot access the distribution files for download, send email to support@platform.com.

  1. Log on to the LSF file server.
  2. Change to the directory where you want to download the LSF distribution files. Make sure that you have write access to the directory. For example:

    # cd /usr/share/lsf/tarfiles

  3. FTP to the Platform FTP site:

    # ftp ftp.platform.com

  4. Provide the login user ID and password provided by Platform.
  5. Change to the directory for the LSF Version 7 release:

    ftp> cd /distrib/7.0

  6. Set file transfer mode to binary:

    ftp> binary

  7. For LSF on UNIX and Linux, get the installation distribution file.

    ftp> get archive/update1/platform_lsf/lsf7.0.1_lsfinstall.tar.Z

    tip:
    Before installing LSF on your UNIX and Linux hosts, you must uncompress and extract lsf7.0.1_lsfinstall.tar.Z to the same directory where you download the LSF product distribution tar files.
  8. Get the distribution packages for the products you want to install on the supported platforms you need. For example:
  9. Optional. Download the Platform LSF Version 7 Update 1 documentation.

    ftp> get archive/update1/docs/lsf7.0.1_documentation.zip
    ftp> get archive/update1/docs/lsf7.0.1_documentation.tar.Z

    note:
    Get the latest Platform LSF Version 7 documentation from /distrib/7.0/docs/.
  10. Optional. Download the Platform EGO Version 1.2 documentation.

    ftp> get archive/update1/docs/ego1.2.2_documentation.zip
    ftp> get archive/update1/docs/ego1.2.2_documentation.tar.Z

    note:
    Get the latest Platform EGO documentation from /distrib/7.0/docs/.
  11. Optional. Download the Platform Management Console (PMC) distribution package from /distrib/7.0/archive/update1/.

    ftp> get archive/update1/platform_lsf/lsf7.0.1_pmc_linux-x86.tar.Z

    OR

    ftp> get archive/update1/platform_lsf/lsf7.0.1_pmc_linux-x86_64.tar.Z

    note:
    To take advantage of the Platform LSF reporting feature, you must download and install the Platform Management Console. The reporting feature is only supported on the same platforms as the Platform Management Console: 32-bit and 64-bit x86 Windows and Linux operating systems.
  12. Exit FTP.

    ftp> quit
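
The interactive FTP session above could also be scripted. The following is a minimal sketch, not a Platform-supplied tool: build_ftp_commands is a hypothetical helper that prints the same FTP commands used in the steps, and FTP_USER and FTP_PASS are placeholder variables for the credentials provided by Platform. It assumes a command-line ftp client that accepts -n to disable auto-login.

```shell
#!/bin/sh
# Sketch: print the FTP commands from the download steps so they can be piped to ftp.
# build_ftp_commands is a hypothetical helper; it does not contact any server itself.
build_ftp_commands() {
    user=$1
    pass=$2
    cat <<EOF
user $user $pass
cd /distrib/7.0
binary
get archive/update1/platform_lsf/lsf7.0.1_lsfinstall.tar.Z lsf7.0.1_lsfinstall.tar.Z
quit
EOF
}

# Usage (not run here; FTP_USER and FTP_PASS are placeholders):
#   cd /usr/share/lsf/tarfiles
#   build_ftp_commands "$FTP_USER" "$FTP_PASS" | ftp -n ftp.platform.com
```

The second argument to get names the local file explicitly, so the download lands in the current directory rather than under a local archive/update1/ path. Add further get lines for the product packages and documentation you need.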
    

Download LSF from my.platform.com

Prerequisites: You must provide your Customer Support Number and register a user name and password on my.platform.com to download LSF.

If you have not registered at my.platform.com, click New User? and complete the registration form. If you do not know your Customer Support Number or cannot log in to my.platform.com, send email to support@platform.com.

  1. Navigate to http://my.platform.com/.
  2. Choose Products > Platform LSF Family > LSF 7.
  3. Under Download, choose Product Packages.
  4. Select the Updates, Packages, and Documentation you wish to download.
  5. Log out of my.platform.com.

Archive location of previous update releases

Directories containing release notes and distribution files for previous LSF Version 7 update releases are located on the Platform FTP site under /distrib/7.0/archive. Archive directories are named relative to the current update release:


Install Platform LSF Version 7

Installing Platform LSF involves the following steps:

  1. Get a DEMO license (license.dat file).
  2. Run the installation programs.

Get a Platform LSF demo license

Before installing Platform LSF Version 7, you must get a demo license key.

Contact license@platform.com to get a demo license.

Put the demo license file license.dat in the same directory where you downloaded the Platform LSF product distribution tar files.
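
Before running the installer, you may want to confirm that the license file sits next to the downloaded packages. The following is a minimal sketch under stated assumptions: check_dist_dir is a hypothetical helper, not an LSF command, and it assumes that the distribution packages keep the .tar.Z suffix used elsewhere in these notes.

```shell
#!/bin/sh
# Sketch: check that a download directory holds license.dat and at least one .tar.Z file.
# check_dist_dir is a hypothetical helper, not an LSF command.
check_dist_dir() {
    dir=$1
    status=0
    if [ ! -f "$dir/license.dat" ]; then
        echo "missing: $dir/license.dat"
        status=1
    fi
    # Assumption: distribution packages keep the .tar.Z suffix used in these notes.
    found=0
    for f in "$dir"/*.tar.Z; do
        if [ -f "$f" ]; then
            found=1
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "missing: no .tar.Z distribution files in $dir"
        status=1
    fi
    return "$status"
}
```

For example, check_dist_dir /usr/share/lsf/tarfiles prints nothing and returns 0 when both the license file and at least one package are present.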

Run the UNIX and Linux installation

Use the lsfinstall installation program to install a new LSF Version 7 cluster, upgrade from an earlier LSF version, or update your existing LSF Version 7 cluster to LSF Version 7 Update 1.

See Installing Platform LSF on UNIX and Linux for new cluster installation steps.

See the Platform LSF Command Reference for detailed information about lsfinstall and its options.

See the "Cluster Version Management and Patching on UNIX and Linux" chapter in Administering Platform LSF for detailed steps for updating your existing LSF Version 7 cluster to LSF Version 7 Update 1.

Run the Windows installation

Platform LSF on Windows 2000, Windows 2003, and Windows XP is distributed in the following packages:

See Installing Platform LSF on Windows for installation steps.

Install Platform LSF License Scheduler

See Using Platform LSF License Scheduler for installation and configuration steps.

Install Platform LSF HPC

Use lsfinstall to install a new Platform LSF HPC cluster or to upgrade LSF HPC from a previous release.

important:  
Make sure ENABLE_HPC_INST=Y is specified in install.config to enable Platform LSF HPC installation.
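
One way to confirm the setting before running lsfinstall is a quick check of install.config. This is a sketch only: hpc_install_enabled is a hypothetical helper, not part of lsfinstall, and accepting optional double quotes around the value is an assumption about how the file may be written.

```shell
#!/bin/sh
# Sketch: succeed only if install.config sets ENABLE_HPC_INST=Y on an uncommented line.
# hpc_install_enabled is a hypothetical helper, not part of lsfinstall.
hpc_install_enabled() {
    config_file=$1
    # Allow optional leading whitespace and (as an assumption) optional double quotes around Y.
    grep -Eq '^[[:space:]]*ENABLE_HPC_INST="?Y"?[[:space:]]*$' "$config_file"
}

# Usage (not run here):
#   hpc_install_enabled install.config || echo "ENABLE_HPC_INST=Y is not set"
```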

See Using Platform LSF HPC for installation and configuration steps.

Install Platform LSF Desktop Support

See the Platform LSF Desktop Support Administrator's Guide for installation and configuration steps.

Special installation steps for the Platform Management Console on Linux IA64

To install the Platform Management Console on Linux IA64 hosts, you must download and install the Linux IA64 version of BEA JRockit 5.0 JRE.

  1. Download the Linux IA64 version of BEA JRockit 5.0 JRE.
    1. Open the BEA download page.

      http://commerce.bea.com/products/weblogicjrockit/5.0/jr_50.jsp

    2. Save the download file to your local disk.

      For JRockit 5.0 R27.1 JRE Linux (Intel Itanium - 64-bit), save the file named jrockit-R27.1.0-jre1.5.0_08-linux-ipf.bin.

    3. Make sure that the .bin file is executable.

      chmod +x jrockit-R27.1.0-jre1.5.0_08-linux-ipf.bin

  2. Install the JRE on the Linux IA64 host.
    1. Change to a shared directory where you want to install BEA JRockit.
    2. Run the installer in console mode.

      jrockit-R27.1.0-jre1.5.0_08-linux-ipf.bin -mode=console

      The installation creates a new directory:

      jrockit-R27.1.0-jre1.5.0_08
  3. Follow the steps in Installing Platform LSF on UNIX and Linux to run lsfinstall to install Platform LSF and the Platform Management Console.
  4. Make a symbolic link to the JRE.

    For example, if you installed the JRE under /opt/jre:

    cd $EGO_TOP/jre
    ln -s /opt/jre/jrockit-R27.1.0-jre1.5.0_08-linux-ipf linux-ia64

  5. Check the symbolic link to the JRE.

    If the symbolic link is correct, you should see the contents of the linux-ia64 directory:

    cd $EGO_TOP/jre/linux-ia64
    ls
    bin/ lib/ LICENSE license.bea README.TXT
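
The manual check of the symbolic link can also be scripted. The following is a minimal sketch: check_jre_link is a hypothetical helper, not an EGO or LSF command, and it only tests for the bin and lib directories shown in the listing above.

```shell
#!/bin/sh
# Sketch: verify that <jre_dir>/linux-ia64 is a symbolic link to a JRE directory.
# check_jre_link is a hypothetical helper; pass "$EGO_TOP/jre" as the argument.
check_jre_link() {
    link=$1/linux-ia64
    if [ ! -L "$link" ]; then
        echo "not a symbolic link: $link"
        return 1
    fi
    # The listing above shows bin/ and lib/ among the contents of the link target.
    for sub in bin lib; do
        if [ ! -d "$link/$sub" ]; then
            echo "missing directory: $link/$sub"
            return 1
        fi
    done
    echo "ok: $link"
}

# Usage (not run here):
#   check_jre_link "$EGO_TOP/jre"
```

A dangling link, for example one that points to a directory name that does not exist under /opt/jre, passes the first test but is reported by the directory check.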
    

Learn About Platform LSF Version 7

Information about Platform LSF is available from the following sources:

World Wide Web and FTP

Information about Platform LSF Version 7 is available in the LSF Version 7 area of the Platform FTP site (ftp.platform.com/).

The latest information about all supported releases of Platform LSF is available on the Platform Web site at www.platform.com.

If you have problems accessing the Platform web site or the Platform FTP site, send email to support@platform.com.

my.platform.com

my.platform.com is your one-stop shop for information, forums, e-support, documentation, and release information. It provides a single source of information and access to new products and releases from Platform Computing.

On the Platform LSF Family product page of my.platform.com, you can download software, patches, updates and documentation. See what's new in Platform LSF Version 7, check the system requirements for Platform LSF, and browse the latest documentation updates through the Platform LSF Knowledge Center.

Platform LSF documentation

The Platform LSF Knowledge Center is your entry point for all LSF documentation. After downloading and extracting the LSF documentation distribution file, browse the file docs/lsf/7.0/index.html to access the Platform LSF Knowledge Center.

If you have installed the Platform Management Console, access and search the Platform LSF documentation through the link to the Platform Knowledge Center.

Platform EGO documentation

The Platform EGO Knowledge Center is your entry point for Platform EGO documentation. It is installed when you install LSF. To access and search the EGO documentation, browse the file EGO_TOP/docs/ego/1.2.2/index.html.

If you have installed the Platform Management Console, access the Platform EGO documentation through the link to the Platform Knowledge Center.

Platform training

Platform's Professional Services training courses can help you gain the skills necessary to effectively install, configure and manage your Platform products. Courses are available for both new and experienced users and administrators at our corporate headquarters and Platform locations worldwide.

Customized on-site course delivery is also available.

Find out more about Platform Training at www.platform.com/Services/Training/, or contact Training@platform.com for details.


Get Technical Support

Contact Platform

Contact Platform Computing or your LSF vendor for technical support. Use one of the following to contact Platform technical support:

Email

support@platform.com

World Wide Web

www.platform.com

Mail

Platform Support
Platform Computing Inc.
3760 14th Avenue
Markham, Ontario
Canada L3R 3T7

When contacting Platform, please include the full name of your company.

See the Platform Web site at www.platform.com/Company/Contact.Us.htm for other contact information.

Get patch updates and other notifications

To get periodic patch update information, critical bug notification, and general support notification from Platform Support, contact supportnotice-request@platform.com with the subject line containing the word "subscribe".

To get security related issue notification from Platform Support, contact securenotice-request@platform.com with the subject line containing the word "subscribe".

We'd like to hear from you

If you find an error in any Platform documentation, or you have a suggestion for improving it, please let us know:

Email

doc@platform.com

Mail

Information Development
Platform Computing Inc.
3760 14th Avenue
Markham, Ontario
Canada L3R 3T7

Be sure to tell us:


Copyright

© 1994-2008, Platform Computing Inc.

Although the information in this document has been carefully reviewed, Platform Computing Inc. ("Platform") does not warrant it to be free of errors or omissions. Platform reserves the right to make corrections, updates, revisions or changes to the information in this document.

UNLESS OTHERWISE EXPRESSLY STATED BY PLATFORM, THE PROGRAM DESCRIBED IN THIS DOCUMENT IS PROVIDED "AS IS" AND WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT WILL PLATFORM COMPUTING BE LIABLE TO ANYONE FOR SPECIAL, COLLATERAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING WITHOUT LIMITATION ANY LOST PROFITS, DATA, OR SAVINGS, ARISING OUT OF THE USE OF OR INABILITY TO USE THIS PROGRAM.

Document redistribution policy

This document is protected by copyright and you may not redistribute or translate it into another language, in part or in whole.

Internal redistribution

You may only redistribute this document internally within your organization (for example, on an intranet) provided that you continue to check the Platform Web site for updates and update your version of the documentation. You may not make it available to your organization over the Internet.

Trademarks

LSF is a registered trademark of Platform Computing Corporation in the United States and in other jurisdictions.

POWERING HIGH PERFORMANCE, PLATFORM COMPUTING, PLATFORM SYMPHONY, PLATFORM JOBSCHEDULER, and the PLATFORM and PLATFORM LSF logos are trademarks of Platform Computing Corporation in the United States and in other jurisdictions.

UNIX is a registered trademark of The Open Group in the United States and in other jurisdictions.

Linux is the registered trademark of Linus Torvalds in the U.S. and other countries.

Microsoft is either a registered trademark or a trademark of Microsoft Corporation in the United States and/or other countries.

Windows is a registered trademark of Microsoft Corporation in the United States and other countries.

Macrovision, Globetrotter, and FLEXlm are registered trademarks or trademarks of Macrovision Corporation in the United States of America and/or other countries.

Oracle is a registered trademark of Oracle Corporation and/or its affiliates.

Intel, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

Other products or services mentioned in this document are identified by the trademarks or service marks of their respective owners.

Third Party License Agreements

http://www.platform.com/legal-notices/third-party-license-agreements

