Release Date: March 2009
The following bugs have been fixed in the March 2009 update (LSF 7 Update 5) since the October 2008 update (LSF 7 Update 4):
122405 |
Date |
2009-03-20 |
|
Description |
Customer job remains in RUNNING state after execution host crashes and restarts. |
|
Component |
sbatchd |
|
Platform |
All |
|
Impact |
Incorrect job status and host underutilization. |
117599 |
Date |
2009-03-20 |
|
Description |
Following a brequeue, the job fails due to missing job buffer files. |
|
Component |
sbatchd |
|
Platform |
Cray |
|
Impact |
brequeue command doesn't work. |
114627 |
Date |
2009-03-20 |
|
Description |
According to the documentation, 3 parameters are required to turn on dynamic adding of hosts:
In reality, only the first 2 parameters are required to add hosts dynamically. |
|
Component |
lim |
|
Platform |
All |
|
Impact |
Customers are unintentionally adding hosts to the cluster dynamically. |
121696 |
Date |
2009-03-18 |
|
Description |
Customer job remains in the RUN state without running if the pre-exec fails once. |
|
Component |
sbatchd |
|
Platform |
All |
|
Impact |
Job doesn't run and host resources are wasted. |
118519 |
Date |
2009-03-16 |
|
Description |
When jobs allocate memory and fork child processes, the memory usage shown by bjobs -l is incorrect because shared memory is counted more than once. |
|
Component |
|
|
Platform |
All |
|
Impact |
The exact physical memory usage information is incorrect for jobs using shared memory between parent and child processes. |
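The double counting described in bug 118519 can be illustrated with a small sketch. The data layout below (a private size per process plus a set of shared segment names) is hypothetical and is not how LSF collects memory usage; it only shows why summing per-process totals overstates usage when a parent and its forked children map the same shared segment.

```python
def naive_total_kb(processes, segment_sizes_kb):
    """Add every shared segment once per process that maps it (overcounts)."""
    total = 0
    for proc in processes:
        total += proc["private_kb"]
        total += sum(segment_sizes_kb[seg] for seg in proc["shared_segments"])
    return total


def deduplicated_total_kb(processes, segment_sizes_kb):
    """Add private memory per process, but each shared segment only once."""
    total = sum(proc["private_kb"] for proc in processes)
    seen = set()
    for proc in processes:
        seen.update(proc["shared_segments"])
    return total + sum(segment_sizes_kb[seg] for seg in seen)


# Hypothetical job: a parent and one forked child sharing one segment.
segment_sizes_kb = {"seg_a": 1000}
job_processes = [
    {"private_kb": 100, "shared_segments": {"seg_a"}},
    {"private_kb": 50, "shared_segments": {"seg_a"}},
]

print(naive_total_kb(job_processes, segment_sizes_kb))
print(deduplicated_total_kb(job_processes, segment_sizes_kb))
```

In this made-up case the naive sum counts the 1000 KB segment twice, while the deduplicated sum counts it once.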
120639 |
Date |
2009-03-13 |
|
Description |
After installing the latest IBM service pack for POE over InfiniBand, User Space POE jobs always fail in LSF due to negative node numbers. |
|
Component |
|
|
Platform |
AIX |
|
Impact |
Users cannot run POE jobs. |
122383 |
Date |
2009-03-09 |
|
Description |
lsload reports incorrect utilization values. |
|
Component |
lim |
|
Platform |
Solaris |
|
Impact |
CPU utilization cannot be reported correctly. |
122240 |
Date |
2009-03-09 |
|
Description |
Parameter MAX_JOB_PREEMPT does not control the number of times a License Scheduler job can be preempted. |
|
Component |
mbschd, License Scheduler package mbatchd |
|
Platform |
All |
|
Impact |
Jobs using limited licenses may be preempted many times. |
120018 |
Date |
2009-03-09 |
|
Description |
The bjobs command reports that a job is in the RUNNING state although the job has already exited, leaving the application res stuck on a Windows 2003 server. |
|
Component |
res |
|
Platform |
Windows |
|
Impact |
Job does not finish. |
118801 |
Date |
2009-03-09 |
|
Description |
lspasswd times out on receiving a reply from the lim. |
|
Component |
lim |
|
Platform |
All |
|
Impact |
lspasswd fails and jobs aren't submitted. |
113707 |
Date |
2009-03-09 |
|
Description |
MPI jobs get different results when run both externally and within LSF. |
|
Component |
intelmpi_wrapper mpich2_wrapper |
|
Platform |
Linux |
|
Impact |
The MPI local option is set to global and the program may run with the wrong arguments. |
111018 |
Date |
2009-03-09 |
|
Description |
The command bjobs in SAS LSF Version 7 Update 2 does not work well on Windows. |
|
Component |
bjobs |
|
Platform |
All |
|
Impact |
Cannot use bjobs. |
110814 |
Date |
2009-03-09 |
|
Description |
SGI-MPI mpirun options are not recognized by pam. |
|
Component |
pam |
|
Platform |
Linux IA64/x86-64 |
|
Impact |
Cannot use SGI-MPI mpirun options on pam command line when using pam SGI-MPI integration. |
115193 |
Date |
2009-03-08 |
|
Description |
Cannot transfer files locally on a Windows machine or to a shared directory. |
|
Component |
lsrcp.exe |
|
Platform |
Windows |
|
Impact |
File transfer fails. |
119899 |
Date |
2009-03-06 |
|
Description |
bmod -bn fails to remove the specified begin time on Windows-64 machines. |
|
Component |
mbatchd |
|
Platform |
All 64 bit architectures |
|
Impact |
Job continues to pend. |
121605 |
Date |
2009-03-05 |
|
Description |
LSF_TMPDIR defined in lsf.conf isn't effective on Windows systems. |
|
Component |
res |
|
Platform |
Windows |
|
Impact |
Temporary files are saved to the directory defined in registry instead of the directory set in LSF_TMPDIR. |
117812 |
Date |
2009-03-05 |
|
Description |
Perl and Python user programs fail when linked with LSF APIs; the C code calling the APIs is fine. |
|
Component |
liblsf.so, libbat.so |
|
Platform |
All |
|
Impact |
Customer programs cannot be used. |
117247 |
Date |
2009-03-05 |
|
Description |
When a License Scheduler project name is misspelled or has mismatched capitalization, the job runs under the default project or pends if the default project is not defined. |
|
Component |
|
|
Platform |
All |
|
Impact |
Under-utilization of License Scheduler tokens. Jobs running under the default project accidentally have the wrong user share allotment. |
99327 |
Date |
2009-03-04 |
|
Description |
Jobs pend indefinitely with reason “Unable to determine user account for execution” if the user account cannot be resolved. |
|
Component |
mbatchd |
|
Platform |
All |
|
Impact |
Jobs pend. |
118421 |
Date |
2009-03-04 |
|
Description |
Excessive lim debug messages are logged. |
|
Component |
lim, pim |
|
Platform |
All |
|
Impact |
When debug is on, the lim log is excessive. |
117368 |
Date |
2009-03-04 |
|
Description |
Jobs are not dispatched with the pending reason "System is unable to schedule the job". |
|
Component |
mbschd |
|
Platform |
All |
|
Impact |
Jobs pend. |
116145 |
Date |
2009-03-04 |
|
Description |
Jobs pend with reason "New job is waiting for scheduling" even when the dependent condition is satisfied. |
|
Component |
mbschd, mbatchd |
|
Platform |
All |
|
Impact |
Jobs pend. |
115276 |
Date |
2009-03-04 |
|
Description |
Sourcing profile.lsf returns "Cannot detect the binary type" from profile.perf. |
|
Component |
profile.perf |
|
Platform |
Linux |
|
Impact |
Binary type failed, and PMC does not run properly since some environment variables are not updated. |
120817 |
Date |
2009-03-03 |
|
Description |
Jobs submitted to LSF 7 Update 4 from older versions of an LSF client (either pre-7.0.4 or applications built with a pre-7.0.4 LSF library) can have the following problems:
|
|
Component |
libbatch.so, libbat.a, mbatchd, bmod, bsub |
|
Platform |
All |
|
Impact |
Jobs may fail and the chkpnt dir is wrong. |
120745 |
Date |
2009-03-03 |
|
Description |
The SUBMIT_TIME displayed by bjobs occupies 13 characters (although it only needs 12). The extra space results in one blank line following every record on terminals with a column width. |
|
Component |
bjobs |
|
Platform |
All |
|
Impact |
Extra blank lines in output. |
120163 |
Date |
2009-03-03 |
|
Description |
When XC_LIBLIC is configured and working with permanent licenses, the client host cannot get the license that converts from lsf_base. |
|
Component |
lim |
|
Platform |
All |
|
Impact |
Client host cannot obtain a license. |
119722 |
Date |
2009-03-03 |
|
Description |
Duplicate definitions of license names in the lsf.licensescheduler and lsf.shared files cause jobs to pend. |
|
Component |
mbatchd |
|
Platform |
All |
|
Impact |
License Scheduler jobs cannot run. |
119595 |
Date |
2009-03-03 |
|
Description |
Port number is missing in LSF daemon log files. |
|
Component |
liblsf.so nios libbat.a liblsf.a res lim libbat.so mbd sbd |
|
Platform |
All |
|
Impact |
The unknown communication port number makes debugging harder. |
119550 |
Date |
2009-03-03 |
|
Description |
Unable to determine user account for execution. |
|
Component |
|
|
Platform |
All |
|
Impact |
Customer needs to kill the job and resubmit it for the job to run. |
121747 |
Date |
2009-03-02 |
|
Description |
The data purger will purge all data with TIME_STAMP_GMT of 10 bits if there are both 10 and 13 bit TIME_STAMP_GMT in some tables (CONSUMER_RESOURCELIST, CONSUMER_DEMAND, LSF_BHOSTS, RESOURCE_METRICS, HOST_GROUP, etc.). |
|
Component |
|
|
Platform |
All |
|
Impact |
Affects data when updating from LSF 7.0 to LSF 7 Update 4. |
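The mixed-timestamp condition in bug 121747 can be sketched as follows, assuming the 10 and 13 values refer to the digit length of epoch timestamps stored in seconds and in milliseconds respectively. That reading is an assumption, and the function names are hypothetical; this is not the data purger's actual logic. The point is that both forms must be normalized to one unit before being compared against a purge cutoff.

```python
def to_epoch_millis(timestamp):
    """Normalize a 10-digit (seconds) or 13-digit (milliseconds) epoch value."""
    digits = len(str(abs(timestamp)))
    if digits == 10:
        return timestamp * 1000
    if digits == 13:
        return timestamp
    raise ValueError(f"unexpected timestamp length: {digits} digits")


def is_older_than(timestamp, cutoff_millis):
    """Compare only after normalizing, so seconds values are not misjudged."""
    return to_epoch_millis(timestamp) < cutoff_millis


# The same instant expressed both ways normalizes to the same value.
print(to_epoch_millis(1235952000) == to_epoch_millis(1235952000000))
```

Without normalization, every 10-digit value is numerically smaller than any 13-digit cutoff, which would make all such rows look expired.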
121155 |
Date |
2009-03-02 |
|
Description |
Interactive jobs with bsub -Is fail with a "broken pipe" error in the LDAP environment. |
|
Component |
res |
|
Platform |
All |
|
Impact |
In the LDAP environment, interactive jobs are killed by the SIGPIPE signal. |
122151 |
Date |
2009-03-01 |
|
Description |
Usernames and passwords are exposed in catalina.out when Webgui debug logging is turned on by editing logj.properties and a user tries to log on. |
|
Component |
|
|
Platform |
All |
|
Impact |
Security risk. |
96667 |
Date |
2009-02-27 |
|
Description |
Space bar doesn't work when using bpeek | more. |
|
Component |
bpeek.exe bpeek |
|
Platform |
Windows |
|
Impact |
bpeek usability is compromised. |
121865 |
Date |
2009-02-26 |
|
Description |
Some parameter values such as LSF_EGO_DAEMON_CONTROL are contained in quotes while others are not. |
|
Component |
|
|
Platform |
All |
|
Impact |
Inconsistent parameter definition formats. |
120092 |
Date |
2009-02-26 |
|
Description |
${GUI_TOP}/${GUI_VERSION}/tomcat/bin/profile.ocs does not exist, but it is used by pmc_daemons.sh at lines 23, 28, and 33. |
|
Component |
|
|
Platform |
All |
|
Impact |
Cannot automatically start the PMC when the OS boots, and cannot use the service PMC command. |
119197 |
Date |
2009-02-18 |
|
Description |
LSF Windows installer deletes license.dat during the upgrade process. |
|
Component |
|
|
Platform |
Windows |
|
Impact |
LSF license file is lost. |
117717 |
Date |
2009-02-18 |
|
Description |
LSF Version 7 Update 4 Windows quiet installation fails. |
|
Component |
|
|
Platform |
Windows |
|
Impact |
Unsuccessful installation. |
115968 |
Date |
2009-02-18 |
|
Description |
Jobs containing multiple launches of mpirun.lsf can fail because resources are not cleaned correctly in the previous launch. Job level post exec does not help. |
|
Component |
mpirun.lsf |
|
Platform |
All |
|
Impact |
Job fails due to clean up issue; job post-exec does not help. |
118924 |
Date |
2009-02-17 |
|
Description |
A customer using RPMs to install LSF on Linux machines found many references to /bin/sh5, which break the install (using 'rpm -Uvh --force'). |
|
Component |
|
|
Platform |
All |
|
Impact |
Unsuccessful installation. |
117477 |
Date |
2009-02-17 |
|
Description |
lsf7Update3_win32.msi has errors in the install.bat script created when installing LSF 7.0.4 on 32-bit Windows XP. |
|
Component |
lsf7Update3_win32.msi |
|
Platform |
Windows XP 32bits |
|
Impact |
install.bat cannot be used for batch installation. |
117473 |
Date |
2009-02-17 |
|
Description |
%CLUSTERID% is set in install.bat, but is not handed over by the LSF 7.0.4 install package. |
|
Component |
install package |
|
Platform |
All |
|
Impact |
The CLUSTERID parameter is lost when it is set during the installation. |
114605 |
Date |
2009-02-13 |
|
Description |
When blaunch is used in place of ssh, some mpich_mx job task processes disappear after running bstop and then bresume. |
|
Component |
res |
|
Platform |
All |
|
Impact |
With some job tasks gone, job results are not correct. |
122152 |
Date |
2009-02-11 |
|
Description |
Jobs submitted through PMC result in output files with incorrect permission settings. |
|
Component |
PMC |
|
Platform |
All |
|
Impact |
Output files may have incorrect permission settings. |
119639 |
Date |
2009-02-11 |
|
Description |
lim exits without an error message when users detach the processor from the host while running on Solaris. |
|
Component |
lim |
|
Platform |
Solaris |
|
Impact |
Cannot execute jobs. |
118208 |
Date |
2009-02-11 |
|
Description |
blcollect fails to parse lmstat output. |
|
Component |
blcollect |
|
Platform |
UNIX |
|
Impact |
A license without a handle from the lmstat output is ignored by blcollect. |
117721 |
Date |
2009-02-11 |
|
Description |
The description of USE_SUSP_SLOTS is incorrect. |
|
Component |
bparams |
|
Platform |
All |
|
Impact |
Misleading USE_SUSP_SLOTS parameter. |
117290 |
Date |
2009-02-11 |
|
Description |
If the new job refresh feature is turned on, the job group information shown by bjobs is lost. |
|
Component |
mbatchd |
|
Platform |
All |
|
Impact |
The new job refresh feature cannot be used with job groups. |
117215 |
Date |
2009-02-11 |
|
Description |
blcollect does not parse lmstat output correctly leading to an incorrect license count. |
|
Component |
blcollect |
|
Platform |
UNIX |
|
Impact |
blcollect reports incorrect token usage to bld. |
116604 |
Date |
2009-02-11 |
|
Description |
LSF 6.2 Windows installer fails on an Intel64 machine. |
|
Component |
lsf6.2_win.exe |
|
Platform |
Windows |
|
Impact |
LSF 6.2 Windows installer fails on an Intel64 machine. |
116600 |
Date |
2009-02-11 |
|
Description |
bread gives the wrong output when using bkill for Session Scheduler jobs. |
|
Component |
ssched_real |
|
Platform |
All |
|
Impact |
bread cannot get correct information about session scheduler jobs. |
115124 |
Date |
2009-02-11 |
|
Description |
sbatchd adds LSF_BINDIR into PATH even if it is inside PATH already. |
|
Component |
install |
|
Platform |
UNIX |
|
Impact |
PATH length increases unnecessarily. |
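The behavior that bug 115124 calls for, adding a directory to PATH only when it is not already present, can be sketched as below. This is a minimal illustration and not sbatchd's actual code; the function name and the example directory /opt/lsf/bin are hypothetical.

```python
def prepend_once(path_value, directory, sep=":"):
    """Return path_value with directory in front, unless it is already listed."""
    entries = path_value.split(sep) if path_value else []
    if directory in entries:
        # Already present: leave the value untouched so it does not grow.
        return path_value
    return sep.join([directory] + entries)


# Hypothetical LSF_BINDIR value used only for illustration.
lsf_bindir = "/opt/lsf/bin"
print(prepend_once("/usr/bin:/bin", lsf_bindir))
print(prepend_once("/opt/lsf/bin:/usr/bin:/bin", lsf_bindir))
```

Because the check is idempotent, applying it on every job start leaves PATH at a constant length.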
114379 |
Date |
2009-02-11 |
|
Description |
Users of remote desktops using Windows 2008 to run lsadmin reconfig see an error message. |
|
Component |
bstop, bkill, lsadmin, bsub |
|
Platform |
Windows |
|
Impact |
Cannot use remote desktop connections to run a program under Windows 2008 to control LSF. |
118014 |
Date |
2009-02-09 |
|
Description |
License Scheduler fails to install a second binary type when using standalone mode. |
|
Component |
setup |
|
Platform |
UNIX |
|
Impact |
Customer can manually create the necessary directories and re-run the setup script as a work around. |
118474 |
Date |
2009-02-04 |
|
Description |
Detailed reasons not in log file. |
|
Component |
res.exe |
|
Platform |
Windows |
|
Impact |
Difficult to troubleshoot cmd.exe permission issues. |
101754 |
Date |
2009-01-23 |
|
Description |
Fairshare queues will reject a job submission if the fairshare user group has all members defined in it and there is another user group defined with the submission user as a specific member. |
|
Component |
mbatchd |
|
Platform |
All |
|
Impact |
Some users cannot submit jobs to a queue. |
116122 |
Date |
2009-01-22 |
|
Description |
sbatchd service shuts down when the job array index of a submitted job array is greater than 232830. |
|
Component |
sbatchd.exe |
|
Platform |
Windows |
|
Impact |
sbatchd shuts down. |
115297 |
Date |
2009-01-21 |
|
Description |
With MultiCluster enabled and a RemoteClusters section defined, "bhosts remote_cluster_name" can shut down the master lim. |
|
Component |
lim |
|
Platform |
All |
|
Impact |
Local master lim shuts down. |
113134 |
Date |
2009-01-21 |
|
Description |
With MultiCluster enabled, if a job is suspended and then resumed after it is forwarded to a remote cluster, the job can run twice: once on the submission cluster and once on the remote cluster. |
|
Component |
mbschd bhist bjobs mbatchd |
|
Platform |
All |
|
Impact |
Job slots are misused. |
116305 |
Date |
2009-01-19 |
|
Description |
sbatchd sets a null entry "::" in LD_LIBRARY_PATH if EGO is disabled. |
|
Component |
sbatchd |
|
Platform |
All |
|
Impact |
User applications (scripts) that use LD_LIBRARY_PATH may fail due to the null entry. |
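A job script affected by bug 116305 could defend against the null entry by dropping empty components before using the variable. On Linux, the dynamic linker treats a zero-length entry in LD_LIBRARY_PATH as the current working directory, which is why the stray "::" matters. The sketch below is only an illustration with a hypothetical function name and hypothetical directories; it is not part of LSF.

```python
def strip_empty_entries(value, sep=":"):
    """Remove zero-length components such as the one produced by '::'."""
    return sep.join(entry for entry in value.split(sep) if entry)


# Hypothetical value containing the null entry described above.
print(strip_empty_entries("/opt/app/lib::/usr/lib"))
```

A job starter could apply the same cleanup to os.environ["LD_LIBRARY_PATH"] before launching the application.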
115955 |
Date |
2009-01-19 |
|
Description |
Condensed pending reason behavior changed in LSF 7.0 (individual host based reasons are not used). |
|
Component |
schmod_default.so mbschd bparams mbatchd |
|
Platform |
All |
|
Impact |
Customer pending reason monitoring script broken on upgrade. |
113919 |
Date |
2009-01-19 |
|
Description |
Cannot get correct LSF binaries PATH using source on the environment file on RHEL4U7 x86, because /proc/xen exists. |
|
Component |
install |
|
Platform |
Linux |
|
Impact |
Cannot get correct LSF binaries PATH using source on the environment file. |
118814 |
Date |
2009-01-15 |
|
Description |
In 7.0.3 a backfill job CAN be preempted even when PREEMPT_JOBTYPE=BACKFILL is not set. |
|
Component |
mbatchd |
|
Platform |
All |
|
Impact |
Backward compatibility broken. |
118276 |
Date |
2009-01-14 |
|
Description |
LSF batch job ignores SIGHUP so SIGHUP cannot be delivered to the job. |
|
Component |
sbatchd res |
|
Platform |
UNIX |
|
Impact |
Customer applications cannot catch SIGHUP and make use of it. |
117769 |
Date |
2009-01-13 |
|
Description |
sbatchd tries to transfer the job input file even when the file is shared. |
|
Component |
sbatchd |
|
Platform |
All |
|
Impact |
Unnecessary error messages in the job error file. |
117377 |
Date |
2009-01-09 |
|
Description |
Inconsistent allocation/deallocation for a job with a job group may cause potential problems. |
|
Component |
schmod_limit.so schmod_preemption.so |
|
Platform |
All |
|
Impact |
Error messages in mbschd log indicate the inconsistent allocation/deallocation may cause potential scheduling problems. |
115963 |
Date |
2009-01-09 |
|
Description |
mbatchd spends a lot of time checking for dependencies, affecting the mbatchd performance. |
|
Component |
mbatchd |
|
Platform |
All |
|
Impact |
Performance impact for mbatchd event replay. |
114194 |
Date |
2009-01-09 |
|
Description |
The CPU time displayed by bjobs is incorrect. |
|
Component |
pim |
|
Platform |
All |
|
Impact |
The CPU time of dead PGIDs is not accumulated for LSF jobs. Jobs may be incorrectly considered idle. |
115080 |
Date |
2009-01-08 |
|
Description |
Condensed host list not working with bjobs. |
|
Component |
bjobs |
|
Platform |
All |
|
Impact |
Condensed host group output from bjobs is not available. |
113880 |
Date |
2009-01-08 |
|
Description |
Rerunnable jobs cannot rerun after being dispatched to a host, if a network glitch such as a temporary loss of communication occurs. |
|
Component |
mbschd |
|
Platform |
All |
|
Impact |
Job pends. |
76955 |
Date |
2009-01-06 |
|
Description |
Dynamic slave lim calls the master lim 20 times if the slave lim host name is not resolvable, adding to the master lim load. |
|
Component |
lim |
|
Platform |
UNIX |
|
Impact |
Master lim performance is affected. |
117065 |
Date |
2009-01-04 |
|
Description |
sbatchd automatically turns off rusage debugging, which makes debugging inconvenient. |
|
Component |
sbatchd |
|
Platform |
All |
|
Impact |
Difficult to capture debug data. |
116800 |
Date |
2009-01-04 |
|
Description |
Duration string beyond that defined at queue level returns the message 'Bad resource requirement. Job not submitted'. |
|
Component |
mbatchd |
|
Platform |
All |
|
Impact |
Cannot submit jobs with desired resource duration. |
116333 |
Date |
2009-01-04 |
|
Description |
Job status from bjobs -Al is PEND when job is complete. |
|
Component |
bjobs |
|
Platform |
All |
|
Impact |
Confusing job status. |
115419 |
Date |
2008-12-31 |
|
Description |
When using Kerberos for authentication but not using the Platform Kerberos integration, sbatchd deletes the /tmp file named in the KRB5CCNAME environment variable after the batch job completes. |
|
Component |
sbatchd res |
|
Platform |
Linux |
|
Impact |
Customer has to manually unset the KRB5CCNAME parameter in a job starter to run jobs in LSF. |
114282 |
Date |
2008-12-30 |
|
Description |
The command bjobs -u displays all users' jobs in LSF 7.0.2 and later versions although LSB_SECURE_JOBINFO_USERS=Y is set in lsf.conf. |
|
Component |
bjobs |
|
Platform |
All |
|
Impact |
LSB_SECURE_JOBINFO_USERS does not work as expected in LSF 7.0.2 and later versions. |
116466 |
Date |
2008-10-29 |
|
Description |
Customer gets a Java exception after logging into the PMC console. |
|
Component |
install |
|
Platform |
All |
|
Impact |
Customer is unable to use PMC. |
115542 |
Date |
2008-09-27 |
|
Description |
bparams returns an error code. |
|
Component |
bparams mbatchd |
|
Platform |
All |
|
Impact |
Cannot use bparams. |
109128 |
Date |
2008-09-16 |
|
Description |
No email is sent for job idle exceptions. |
|
Component |
mbatchd |
|
Platform |
Windows |
|
Impact |
LSF admin is not notified for idle job exceptions. |
support@platform.com
www.platform.com
North America: +1 905 948 4297
Europe: +44 1256 370 530
Asia: +86 10 6238 1125
Toll-free: 1-877-444-4573
Platform Support
Platform Computing Corporation
3760 14th Avenue
Markham, Ontario
Canada L3R 3T7
© 1994 - 2009 Platform Computing Corporation
All Rights Reserved.
Although the information in this document has been carefully reviewed, Platform Computing Corporation (“Platform”) does not warrant it to be free of errors or omissions. Platform reserves the right to make corrections, updates, revisions or changes to the information in this document.
UNLESS OTHERWISE EXPRESSLY STATED BY PLATFORM, THE PROGRAM DESCRIBED IN THIS DOCUMENT IS PROVIDED “AS IS” AND WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT WILL PLATFORM COMPUTING BE LIABLE TO ANYONE FOR SPECIAL, COLLATERAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING WITHOUT LIMITATION ANY LOST PROFITS, DATA, OR SAVINGS, ARISING OUT OF THE USE OF OR INABILITY TO USE THIS PROGRAM.
Document redistribution policy: This document is protected by copyright and you may not redistribute or translate it into another language, in part or in whole. You may only redistribute this document internally within your organization (for example, on an intranet).
LSF is a registered trademark of Platform Computing Corporation in the United States and in other jurisdictions.
ACCELERATING INTELLIGENCE, THE BOTTOM LINE IN DISTRIBUTED COMPUTING, PLATFORM COMPUTING, CLUSTERWARE, PLATFORM ACTIVECLUSTER, IT INTELLIGENCE, SITEASSURE, PLATFORM SYMPHONY, PLATFORM JOBSCHEDULER, PLATFORM INTELLIGENCE, PLATFORM INFRASTRUCTURE INSIGHT, PLATFORM WORKLOAD INSIGHT, and the PLATFORM and LSF logos are trademarks of Platform Computing Corporation in the United States and in other jurisdictions.
UNIX is a registered trademark of The Open Group in the United States and in other jurisdictions.
Microsoft is either a registered trademark or a trademark of Microsoft Corporation in the United States and/or other countries.
Windows is a registered trademark of Microsoft Corporation in the United States and other countries.
Other products or services mentioned in this document are identified by the trademarks or service marks of their respective owners.