Power7 System Firmware Fix History - Release level AM74x, AM77x

Firmware Description and History

AM770
For Impact, Severity and other Firmware definitions, please refer to the 'Glossary of firmware terms' URL below:
http://www14.software.ibm.com/webapp/set2/sas/f/power5cm/home.html#termdefs
AM770_126_126 / FW770.94

05/29/18
Systems 8408-E8D; 8248-L4T; 9109-RMD; 9117-MMC and 9179-MHC ONLY
Impact: Security         Severity:  SPE

Response for Recent Security Vulnerabilities

  • DISRUPTIVE:  In response to recently reported security vulnerabilities, this firmware update is being released to address Common Vulnerabilities and Exposures issue number CVE-2018-3639.  In addition, Operating System updates are required in conjunction with this FW level for CVE-2018-3639.
System firmware changes that affect certain systems
  • On a system with an AIX partition,  a problem was fixed for a partition time jump that could occur after doing an AIX Live Update.  This problem could occur if the AIX Live Update happens after a Live Partition Mobility (LPM) migration to the partition.  AIX applications using the timebase facility could observe a large jump forwards or backwards in the time reported by the timebase facility.   A circumvention to this problem is to reboot the partition after the LPM operation prior to doing the AIX Live Update.  An AIX fix is also required to resolve this problem.  The issue will no longer occur when this firmware update is applied on the system that is the target of the LPM operation and the AIX partition performing the AIX Live Update has the appropriate AIX updates installed prior to doing the AIX Live Update.  This fix only pertains to the following models that are able to run AIX partitions:
    1) IBM Power 750 Express (8408-E8D)
    2) IBM Power 760 (9109-RMD)
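The level names in this history, such as AM770_126_126, can be read mechanically. The sketch below assumes the general IBM Power firmware naming convention, which is not spelled out in this document: the second field is the service pack level and the third field is the last disruptive service pack level. It also assumes that an installation is disruptive when the release changes or when those two fields are equal in the new level. The function and class names are illustrative only.

```python
import re
from typing import NamedTuple


class FirmwareLevel(NamedTuple):
    release: str          # e.g. "AM770"
    fix_level: int        # second field, e.g. 126
    last_disruptive: int  # third field, e.g. 126


# Two letters plus a three-digit release, then two three-digit fields.
LEVEL_RE = re.compile(r"^([A-Z]{2}\d{3})_(\d{3})_(\d{3})$")


def parse_level(text: str) -> FirmwareLevel:
    """Split a level name such as 'AM770_126_126' into its three fields."""
    match = LEVEL_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognized firmware level: {text!r}")
    return FirmwareLevel(match.group(1), int(match.group(2)), int(match.group(3)))


def install_is_disruptive(installed: str, new: str) -> bool:
    """Assumed rule: disruptive if the release differs, or if the new
    level's service pack and last-disruptive fields are equal."""
    old_level, new_level = parse_level(installed), parse_level(new)
    if old_level.release != new_level.release:
        return True
    return new_level.fix_level == new_level.last_disruptive


if __name__ == "__main__":
    print(install_is_disruptive("AM770_123_032", "AM770_126_126"))
    print(install_is_disruptive("AM770_122_032", "AM770_123_032"))
```

Under these assumptions, AM770_126_126 is classified as disruptive, which agrees with the DISRUPTIVE marking on the FW770.94 entry above.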
AM770_123_032 / FW770.93

03/02/18
Systems 8408-E8D; 8248-L4T; 9109-RMD; 9117-MMC and 9179-MHC ONLY

Impact: Availability         Severity:  SPE

System firmware changes that affect certain systems

  • On systems running IBM i partitions at IBM i V6R1 or V7R1 at less than TR5, a problem was fixed for IBM i partitions failing to boot with SRC B600690B.  If the IBM i partition is running, a DLPAR add of I/O may fail.  This problem was introduced with FW770.90, is present in FW770.91 and FW770.92, and always happens at these levels.  If the update to the fixed firmware level is not wanted, the problem can be resolved by moving up to IBM i 7.1 TR5 or a later level.  This problem only pertains to the following models that are able to run IBM i partitions:
    1) IBM Power 750 Express (8408-E8D)
    2) IBM Power 760 (9109-RMD)
    3)  IBM Power 770 (9117-MMC)
    4)  IBM Power 780 (9179-MHC)
    For more information, see the following IBM Tech Note:  https://www.ibm.com/support/docview.wss?uid=nas8N1022482
AM770_122_032 / FW770.92

01/31/18
Systems 8408-E8D; 8248-L4T; 9109-RMD; 9117-MMC and 9179-MHC ONLY
Impact: Security         Severity:  SPE

Response for Recent Security Vulnerabilities

  • In response to recently reported security vulnerabilities, this firmware update is being released to address Common Vulnerabilities and Exposures issue number CVE-2017-5715. In addition, Operating System updates are available to mitigate the CVE-2017-5753 and CVE-2017-5754 security issues. This pertains to the following models:
    1)  IBM Power 770 (9117-MMC)
    2)  IBM Power 780 (9179-MHC)
    This firmware update also addresses CVE-2017-5715 for IBM i, along with updates for AIX and Linux, for the following models:
    1) IBM Power 750 Express (8408-E8D)
    2) IBM Power 760 (9109-RMD)
    3) IBM PowerLinux 7R4 (8248-L4T)
AM770_120_032 / FW770.91

01/09/18
Systems 8408-E8D; 8248-L4T; and 9109-RMD ONLY
Impact: Security         Severity:  SPE

New features and functions

  • In response to recently reported security vulnerabilities, this firmware update is being released to address Common Vulnerabilities and Exposures issue numbers CVE-2017-5715, CVE-2017-5753 and CVE-2017-5754.  Note that a subsequent FW release is required and will replace this FW update for CVE-2017-5715 for IBM i when available.  In addition, Operating System updates are required in conjunction with this FW level for CVE-2017-5753 and CVE-2017-5754.
    The models addressed by this service pack update have the P7+ processor: 
    1) IBM Power 750 Express (8408-E8D)
    2) IBM Power 760 (9109-RMD)
    3) IBM PowerLinux 7R4 (8248-L4T)
AM770_119_032 / FW770.90

12/13/17
Systems 8408-E8D; 8248-L4T; 9109-RMD; 9117-MMC and 9179-MHC ONLY
Impact: Availability         Severity:  SPE

System firmware changes that affect all systems

  • A problem was fixed for an invalid date from the service processor causing the customer date and time to go to the Epoch value (01/01/1970) without a warning or chance for a correction.  With the fix,  the first IPL attempted on an invalid date will be rejected with a message alerting the user to set the time correctly in the service processor.  If the warning is ignored and the date/time is not corrected, the next IPL attempt will complete to the OS with the time reverted to the Epoch time and date.  This problem is very rare but it has been known to occur on service processor replacements when the repair step to set the date and time on the new service processor was inadvertently skipped by the service representative.
  • A problem was fixed for an SRC BA090006 serviceable event log occurring whenever an attempt was made to boot from an ALUA  (Asymmetric Logical Unit Access) drive.  These drives are always busy by design and cannot be used for a partition boot, but no service action is required if a user inadvertently tries to do that.  Therefore, the SRC was changed to be an informational log.
  • A problem was fixed for the incorrect reporting of the Universally Unique Identifier (UUID) to the OS, which prevented the tracking of a partition as it moved within a data center.  The UUID value as seen on the HMC did not match the value as displayed in the OS.
  • A problem was fixed for a partition boot fail or hang from a Fibre Channel device having fabric faults.  Some of the fabric errors returned by the VIOS are not interpreted correctly by the Open Firmware VFC driver, causing the hang instead of generating helpful error logs.
  • A problem was fixed for spurious loggings of SRCs A7004715 and A7001730 for system VPD errors that did not reflect actual problems in the system Vital Product Data (VPD) card.  With the fix,  the VPD card SRCs are now reported only after a certain error threshold is achieved to ensure that replacement of the VPD card will help resolve the VPD problems.
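The invalid-date handling described in the first item of this list (reject the first IPL with a warning, then fall back to the Epoch on the next attempt) can be pictured as a small state machine. This is only a sketch of the behavior as stated above, not the service processor implementation. The class and function names are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional, Tuple

# The Epoch value mentioned in the fix description (01/01/1970).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


class ServiceProcessorClock:
    """Hypothetical model of the service processor date state."""

    def __init__(self, date_valid: bool, current: Optional[datetime] = None):
        self.date_valid = date_valid
        self.current = current
        self.warning_issued = False


def attempt_ipl(clock: ServiceProcessorClock) -> Tuple[bool, Optional[datetime]]:
    """Return (ipl_completed, time_seen_by_os) for one IPL attempt."""
    if clock.date_valid:
        return True, clock.current
    if not clock.warning_issued:
        # First IPL on an invalid date: reject and ask the user to set the time.
        clock.warning_issued = True
        return False, None
    # Warning ignored: the IPL completes with the time reverted to the Epoch.
    return True, EPOCH


if __name__ == "__main__":
    clock = ServiceProcessorClock(date_valid=False)
    print(attempt_ipl(clock))  # first attempt is rejected
    print(attempt_ipl(clock))  # second attempt completes at the Epoch
```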
System firmware changes that affect certain systems
  • On systems with mirrored memory running IBM i partitions, a problem was fixed for memory fails in the partition that also caused the system to crash.  The system failure will occur any time that IBM i partition memory towards the beginning of the partition's assigned memory fails.  With the fix, the memory failure is isolated to the impacted partition, leaving the rest of the system unaffected.
AM770_116_032 / FW770.80

05/23/17
Systems 8408-E8D; 8248-L4T; 9109-RMD; 9117-MMC and 9179-MHC ONLY
Impact: Availability         Severity:  SPE

New features and functions

  • Support for the Advanced System Management Interface (ASMI) was changed to allow the special characters of "I", "O", and "Q" to be entered for the serial number of the I/O Enclosure under the Configure I/O Enclosure option.  These characters have only been found in an IBM serial number rarely, so typing in these characters will normally be an incorrect action.  However, the special character entry is not blocked by ASMI any more so it is able to support the exception case.  Without the enhancement, the typing of one of the special characters causes message "Invalid serial number" to be displayed.
  • Support was added  for the Universally Unique IDentifier (UUID) property for each partition.  The UUID provides each partition with an identifier that is persisted by the platform across partition reboots, reconfigurations, OS reinstalls, partition migration,  and hibernation.

System firmware changes that affect all systems

  • A problem was fixed for incorrect error messages from the Advanced System Management Interface (ASMI) functions when the system is powered on but in the  "Incomplete State".  For this condition, ASMI was assuming the system was powered off because it could not communicate to the PowerVM hypervisor.  With the fix, the ASMI error messages will indicate that ASMI functions have failed because of the bad hypervisor connection instead of falsely stating that the system is powered off.
  • A problem was fixed for a Live Partition Mobility migration that resulted in the source-managed system going to the Hardware Management Console (HMC) Incomplete state after the migration to the target system was completed.  This problem is very rare and has only been detected once.  The problem trigger is that the source partition does not halt execution after the migration to the target system.  The HMC went to the Incomplete state for the source-managed system when it failed to delete the source partition because the partition would not stop running.  When this problem occurred, the customer network was running very slowly and this may have contributed to the failure.  The recovery action is to re-IPL the source system but that will need to be done without the assistance of the HMC.  For each partition that has an OS running on the source system, shut down the partition from the OS.  Then from the Advanced System Management Interface (ASMI), power off the managed system.  Alternatively, the system power button may also be used to do the power off.  If the HMC Incomplete state persists after the power off, the managed system should be rebuilt from the HMC.  For more information on HMC recovery steps, refer to this IBM Knowledge Center link: https://www.ibm.com/support/knowledgecenter/en/POWER7/p7eav/aremanagedsystemstate_incomplete.htm
  • A problem was fixed for a latency time of about 2 seconds being added to a target Live Partition Mobility (LPM) migration system when there is a latency time check failure.  With the fix, in the case of a latency time check failure, a much smaller default latency is used instead of two seconds.  This error would not be noticed if the customer system is using an NTP time server to maintain the time.
  • A problem was fixed for a shared processor pool partition showing an incorrect zero "Available Pool Processor" (APP) value after a concurrent firmware update.  The zero APP value means that no idle cycles are present in the shared processor pool but in this case it stays zero even when idle cycles are available.  This value can be displayed using the AIX "lparstat" command.  If this problem is encountered, the partitions in the affected shared processor pool can be dynamically moved to a different shared processor pool.  Before the dynamic move, the  "uncapped" partitions should be changed to "capped" to avoid a system hang. The old affected pool would continue to have the APP error until the system is re-IPLed. 
    This fix pertains only to IBM Power 770 (9117-MMC) and IBM Power 780 (9179-MHC) systems.
  • A rare problem was fixed for a system hang that can occur when dynamically moving "uncapped" partitions to a different shared processor pool.  To prevent a system hang, the "uncapped" partitions should be changed to "capped" before doing the move.  This fix pertains only to IBM Power 770 (9117-MMC) and IBM Power 780 (9179-MHC) systems.
  • A problem was fixed for a Network boot/install failure using bootp in a network with switches using the Spanning Tree Protocol (STP).  A Network boot/install using lpar_netboot on the management console was enhanced to allow the number of retries to be increased.  If the user is not using lpar_netboot, the number of bootp retries can be increased using the SMS menus.  If the SMS menus are not an option, the STP in the switch can be set up to allow packets to pass through while the switch is learning the network configuration.
  • A problem was fixed for Live Partition Mobility (LPM) migrations from FW860.10 or FW860.11 to older levels of firmware.  Subsequent DLPAR of Virtual Adapters will fail with HMC error message HSCL294C, which contains text similar to the following:  "0931-007 You have specified an invalid drc_name."  This issue affects partitions installed with AIX 7.2 TL 1 and later.  Partitions installed with VIOS, IBM i, or earlier levels of AIX are not affected.
  • A problem was fixed for an intermittent IPL failure with SRC B181E6C7 for a deadlock condition when testing the clocks during the IPL.  The problem state can be recovered by doing another IPL.  The problem is triggered by an error in the IPL clock test causing an interrupt handler to switch to the redundant clock and deadlock.  With the fix, the clock fault is handled and the bad clock is guarded, with the IPL completing on the redundant clock.  This fix pertains only to IBM Power 770 (9117-MMC) and IBM Power 780 (9179-MHC) systems.
System firmware changes that affect certain systems
  • On systems with IBM i partitions, a problem was fixed for frequent logging of informational B7005120 errors due to communications path closed conditions during messaging from HMCs to IBM i partitions.  In the majority of cases these errors are due to normal operating conditions and not due to errors that require service or attention.  The logging of informational errors due to this specific communications path closed condition that are the result of normal operating conditions has been removed.
AM770_112_032 / FW770.70

07/27/16
Systems 8408-E8D; 8248-L4T; 9109-RMD; 9117-MMC and 9179-MHC ONLY
Impact: Performance       Severity:  SPE

New features and functions

  • Support was added for the Stevens6+ option of the internal tray loading DVD-ROM drive with F/C #EU13.  This is an 8X/24X(max) Slimline SATA DVD-ROM Drive.  The Stevens6+ option is a FRU hardware replacement for the Stevens3+.  MTM 7226-1U3 (Oliver) FC 5757/5762/5763 attaches to IBM Power Systems and lists Stevens6+ as optional for Stevens3+.  If the Stevens6+ DVD drive is installed on the system without the required firmware support, the boot of an AIX partition will fail when the DVD is used as the load source.  Also, an IBM i partition cannot consistently boot from the DVD drive using D-mode IPL.  An SRC C2004130 may be logged for the load source not found error.

System firmware changes that affect all systems

  • A problem was fixed for PCI adapters locking up when powered on.  The problem is rare but frequency varies with the specific adapter models.  A system power down and power up is required to get the adapter out of the locked state.
  • A problem was fixed for hypervisor task failures in adjunct partitions with an SRC B7000602 reported in the error log.  These failures occur during adjunct partition reboots for concurrent firmware updates but are extremely rare and require a re-IPL of the system to recover from the task failure.  The adjunct partitions may be associated with the VIOS or I/O virtualization for the physical adapters such as done for SR-IOV.
  • A security problem was fixed in OpenSSL for a possible service processor reset on a null pointer de-reference during RSA PSS signature verification.  The Common Vulnerabilities and Exposures issue number is CVE-2015-3194.
  • A problem was fixed for the Advanced System Management Interface "Network Services/Network Configuration" "Reset Network Configuration" button that was not resetting the static routes to the default factory setting.  The manufacturing default is to have no static routes defined so the fix clears any static routes that had been added.  A circumvention to the problem is to use the ASMI "Network Services/Network Configuration/Static Route Configuration" "Delete" button before resetting the network configuration.
  • A problem was fixed for partial loss of Entitlement for On/Off Memory Capacity on Demand (also called Elastic CoD).  Users with large amounts of Entitlement on the system of greater than "65535 GB * Days" could have had a truncation of the Entitlement value on a re-IPL of the system.  To recover lost Entitlement, the customer can request another On/Off Enablement Code from IBM support to "re-fill" their entitlement.
  • A problem was fixed for a sequence of two or more Live Partition Mobility migrations that caused a partition to crash with an SRC BA330000 logged (Memory allocation error in partition firmware).  The sequence of LPM migrations that can trigger the partition crash is as follows:
    The original source partition level can be any FW760.xx, FW763.xx, FW770.xx, FW773.xx, FW780.xx, or FW783.xx P7 level or any FW810.xx, FW820.xx, FW830.xx, or FW840.xx P8 level.  It is migrated first to a system running one of the following levels:
    1) FW730.70 or later 730 firmware or
    2) FW740.60 or later 740 firmware
    And then a second migration is needed to a system running one of the following levels:
    1) FW760.00 - FW760.20 or
    2) FW770.00 - FW770.10
    The twice-migrated system partition is now susceptible to the BA330000 partition crash during normal operations until the partition is rebooted.  If an additional LPM migration is done to any firmware level, the thrice-migrated partition is also susceptible to the partition crash until it is rebooted.
    With the fix applied, the susceptible partitions may still log multiple BA330000 errors but there will be no partition crash.  A reboot of the partition will stop the logging of the BA330000 SRC.
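The migration sequence in the last item above can be checked mechanically against a partition's migration history. The sketch below encodes only the level ranges listed in that item. It assumes the history is given as a list of "FWnnn.nn" strings, starting with the level the partition was last booted on, and it does not model whether the fix is installed. All function names are illustrative.

```python
import re
from typing import List, Tuple

# Releases named as possible original source levels in the fix description.
SOURCE_RELEASES = {760, 763, 770, 773, 780, 783, 810, 820, 830, 840}


def parse_fw(level: str) -> Tuple[int, int]:
    """Split 'FW770.10' into (release, service_pack) = (770, 10)."""
    match = re.fullmatch(r"FW(\d{3})\.(\d{2})", level)
    if match is None:
        raise ValueError(f"not a recognized level: {level!r}")
    return int(match.group(1)), int(match.group(2))


def is_first_target(level: str) -> bool:
    """FW730.70 or later 730 firmware, or FW740.60 or later 740 firmware."""
    release, sp = parse_fw(level)
    return (release == 730 and sp >= 70) or (release == 740 and sp >= 60)


def is_second_target(level: str) -> bool:
    """FW760.00 - FW760.20, or FW770.00 - FW770.10."""
    release, sp = parse_fw(level)
    return (release == 760 and sp <= 20) or (release == 770 and sp <= 10)


def susceptible(history: List[str]) -> bool:
    """True if the levels visited since the last partition reboot match the
    described sequence. Later migrations do not clear the exposure."""
    if len(history) < 3:
        return False
    source, first, second = history[0], history[1], history[2]
    return (parse_fw(source)[0] in SOURCE_RELEASES
            and is_first_target(first)
            and is_second_target(second))


if __name__ == "__main__":
    print(susceptible(["FW780.40", "FW740.60", "FW770.10"]))
    print(susceptible(["FW780.40", "FW740.60", "FW770.20"]))
```

A partition reboot resets the history to a single entry, which matches the statement that the exposure lasts only until the partition is rebooted.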
System firmware changes that affect certain systems
  • On a system with an IBM i partition running 7.2 or later with 4K sector disks, a problem was fixed for a machine check incorrectly issued.
  • For Integrated Virtualization Manager (IVM) managed systems with more than 64 active partitions, a problem was fixed for recovery from Live Partition Mobility (LPM) errors.  Without the fix, the IVM managed system partition can appear to still be running LPM after LPM has aborted, preventing retries of the LPM operation.  In this case, the partition must be stopped and restarted to clear the LPM error state.  The problem is not frequent because it requires a failed LPM on a partition with a partition ID that is greater than 64.  This problem does not pertain to the IBM Power 770 (9117-MMC) nor the IBM Power 780 (9179-MHC).
  • On systems with a PowerVM Active Memory Sharing (AMS) partition with AIX  Level 7.2.0.0 or later with Firmware Assisted Dump enabled, a problem was fixed for a Restart Dump operation failing into KDB mode.  If "q" is entered to exit from KDB mode, the partition fails to start.  The AIX partition must be powered off and back on to recover.  The problem can be circumvented by disabling Firmware Assisted Dump (default is enabled in AIX 7.2).
  • On systems with dedicated processor partitions,  a problem was fixed for the dedicated processor partition becoming intermittently unresponsive. The problem can be circumvented by changing the partition to use shared processors.
  • For a system partition with more than 64 cores, a problem was fixed for Live Partition Mobility (LPM)  migration operations failing with HSCL365C.  The partition migration is stopped because the platform detects a firmware error anytime the partition has more than 64 cores.  This problem only pertains to the Power 780 (9179-MHC).
  • For systems with an invalid P-side or T-side in the firmware, a problem was fixed in the partition firmware Real-Time Abstraction System (RTAS) so that system Vital Product Data (VPD) is returned at least from the valid side instead of returning no VPD data.   This allows AIX host commands such as lsmcode, lsvpd, and lsattr that rely on the VPD data to work to some extent even if there is one bad code side.  Without the fix,  all the VPD data is blocked from the OS until the invalid code side is recovered by either rejecting the firmware update or attempting to update the system firmware again.
  • For non-HMC managed systems in Manufacturing Default Configuration (MDC) mode with a single host partition, a problem was fixed for missing dumps of type SYSDUMP, FSPDUMP, LOGDUMP, and RSCDUMP that were not off-loaded to the host OS.  This is an infrequent error caused by a timing error that causes the dump notification signal to the host OS to be lost.  The missing/pending dumps can be retrieved by rebooting the host OS partition.  The rebooted host OS will receive new notifications of the dumps that have to be off-loaded.  This problem does not pertain to the IBM Power 770 (9117-MMC) nor the IBM Power 780 (9179-MHC).
  • On systems where memory relocation (as done by using Live Partition Mobility (LPM) ) and a partition reboot are occurring simultaneously, a problem for a system termination was fixed.  The potential for the problem existed between the active migration and the partition reboot.
  • On a system with an AIX partition and a Linux partition, a problem was fixed for dynamically moving an adapter that uses DMA from the Linux partition to the AIX partition that caused the AIX partition to fail by going into KDB mode (0c20 crash).  The management console showed the following message for the partition operation:  "Dynamic move of I/O resources failed.  The I/O slot dynamic partitioning operation failed.".  The error was caused by Linux using 64K mappings for the DMA window and AIX using 4K mappings for the DMA window, causing incorrect calculations on the AIX partition when it received the adapter.  Until the fix is applied, the adapters that use DMA should only be moved from Linux to AIX when the partitions are powered off.  This problem does not pertain to the IBM PowerLinux 8246 and 8248 models as these are Linux-only partition systems.
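The 64K versus 4K mapping mismatch in the last item above can be illustrated with simple arithmetic. The sketch below is not the hypervisor or AIX logic. It only shows how the number of mapping entries for one DMA window differs by a factor of 16 depending on which page size is assumed. The window size and function name are hypothetical.

```python
KIB = 1024
GIB = 1024 ** 3


def mapping_entries(window_bytes: int, page_bytes: int) -> int:
    """Number of page-sized mappings needed to cover a DMA window."""
    if window_bytes % page_bytes != 0:
        raise ValueError("window size must be a multiple of the page size")
    return window_bytes // page_bytes


if __name__ == "__main__":
    window = 2 * GIB  # hypothetical window size, for illustration only
    as_64k = mapping_entries(window, 64 * KIB)  # layout left by the Linux side
    as_4k = mapping_entries(window, 4 * KIB)    # layout expected by the AIX side
    print(as_64k, as_4k, as_4k // as_64k)
```

A receiver that sizes or indexes the window with 4K pages while the window was laid out with 64K pages is off by this factor, which is one way to read the "incorrect calculations" described in the fix.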
Concurrent hot add/repair maintenance (CHARM) firmware fixes
  • DEFERRED:  A problem was fixed for an I/O performance slow-down that can occur after a concurrent repair of a GX bus I/O adapter with a Feature Code of #1808 or #1914.  A re-IPL of the system after the concurrent repair operation corrects the I/O performance issue.  This fix requires an IPL of the system to take effect.  This problem only pertains to the IBM Power 770 (9117-MMC) and the IBM Power 780 (9179-MHC).
AM770_110_032 / FW770.61

12/16/15
Systems 8408-E8D; 8248-L4T; 9109-RMD; 9117-MMC and 9179-MHC ONLY
Impact: Availability         Severity:  ATT

System firmware changes that affect all systems
  • A problem was fixed for a hypervisor deadlock that results in the system being in an "Incomplete state" as seen on the management console.  This deadlock is the result of two hypervisor tasks using the same locking mechanism for handling requests between the partitions and the management console.  Except for the loss of the management console control of the system, the system is operating normally when the "Incomplete state" occurs.
  • A security problem was fixed in the lighttpd server on the service processor where a remote attacker, while attempting authentication, could insert strings into the lighttpd server log file.  Under normal operations on the service processor, this does not impact anything because the log is disabled by default.  The Common Vulnerabilities and Exposures issue number is CVE-2015-3200.
System firmware changes that affect certain systems
  • On systems using PowerVM with shared processor partitions that are configured as capped or in a shared processor pool, there was a problem found that delayed the dispatching of the virtual processors which caused performance to be degraded in some situations.  Partitions with dedicated processors are not affected.   The problem is rare and can be mitigated, until the service pack is applied, by creating a new shared processor AIX or Linux partition and booting it to the SMS prompt; there is no need to install an operating system on this partition.  Refer to help document http://www.ibm.com/support/docview.wss?uid=nas8N1020863 for additional details.
AM770_109_032 / FW770.60

08/05/15
Systems 8408-E8D; 8248-L4T; 9109-RMD; 9117-MMC and 9179-MHC ONLY
Impact: Availability         Severity:  SPE

New features and functions

  • Support was added to the Advanced System Management Interface (ASMI) to be able to add an IPv4 static route definition for each ethernet interface on the service processor.  Using a static route definition, a Hardware Management Console (HMC) configured on a private subnet that is different from the service processor subnet is now able to connect to the service processor and manage the CEC.  A static route persists until it is deleted or until the service processor settings are restored to manufacturing defaults.  The static route is managed with the ASMI panel "Network Services/Network Configuration/Static Route Configuration" IPv4 radio button.  The "Add" button is used to add a static route (only one is allowed for each ethernet interface) and the "Delete" button is used to delete the static route.

System firmware changes that affect all systems

  • A problem was fixed for the iptables process consuming all available memory, causing an out of memory dump and reset/reload of the service processor.
  • A problem was fixed in the Advanced System Management Interface (ASMI) to reword a confusing message for systems with no deconfigured resources.  The "System Service Aids/Deconfiguration Records" message text for this situation was changed from "Deconfiguration data is currently not available." to "No deconfigured resources found in the system."
  • A problem was fixed for performance dumps to speed its processing so it is able to handle partitions with a large number of processors configured.  Previously, for large systems, the performance dump took too long in collecting performance data to be useful in the debugging of some performance problems.
  • A problem was fixed for a faulty ambient temperature sensor that triggered emergency power offs with SRC 11007203 or 11007203 even though the temperature was not over the limit.  If the ambient temperatures are high now, the errors will be logged for call home service but they will not trigger an emergency power off.
  • A problem was fixed to prevent a hypervisor task failure with a B7000602 SRC logged, if multiple resource dumps running concurrently run out of dump buffer space. The failed hypervisor task could prevent basic logical partition operations from working, potentially leading to an Incomplete state on the Management Console.
  • For Power7+ systems with shared processor partitions, a problem was fixed that could result in latency or timeout issues with I/O devices.  For the Power7 systems (IBM Power 770 (9117-MMC), IBM Power 780 (9179-MHC)), this issue impacts all partitions, regardless of whether the processors are shared or not.
  • A problem was fixed for a partition deletion error on the management console with error code 0x4000E002 and message "...insufficient memory for PHYP".  The partition delete operation has been adjusted to accommodate the temporary increase in memory usage caused by memory fragmentation, allowing the delete operation to be successful.
  • A security problem was fixed in OpenSSL where the service processor would, under certain conditions, accept Diffie-Hellman client certificates without the use of a private key, allowing a user to falsely authenticate.  The Common Vulnerabilities and Exposures issue number is CVE-2015-0205.
  • A security problem was fixed in OpenSSL for its BigNumber Squaring implementation to prevent a failure of cryptographic protection mechanisms.  The Common Vulnerabilities and Exposures issue number is CVE-2014-3570.
  • A security problem was fixed in OpenSSL to fix multiple flaws in the parsing of X.509 certificates.  These flaws could be used to modify an X.509 certificate to produce a certificate with a different fingerprint without invalidating its signature, and possibly bypass fingerprint-based blacklisting.  The Common Vulnerabilities and Exposures issue number is CVE-2014-8275.
  • A security vulnerability, commonly referred to as GHOST, was fixed in the service processor glibc functions gethostbyname() and gethostbyname2() that allowed remote users of the functions to cause a buffer overflow and execute arbitrary code with the permissions of the server application.  There is no way to exploit this vulnerability on the service processor but it has been fixed to remove the vulnerability from the firmware.  The Common Vulnerabilities and Exposures issue number is CVE-2015-0235.
  • A security problem was fixed in OpenSSL where a remote attacker could crash the service processor with a specially crafted X.509 certificate that causes an invalid pointer or an out-of-bounds write.  The Common Vulnerabilities and Exposures issue numbers are CVE-2015-0286 and CVE-2015-0287.
  • A problem was fixed that prevented a second management console from being added to the CEC.  In some cases, network outages caused defunct management console connection entries to remain in the service processor connection table, making connection slots unavailable for new management consoles.  A reset of the service processor could be used to remove the defunct entries and allow the second management console to connect.
  • A problem was fixed for an IPL termination with a B150B10C SRC and B121C770 error logs.  This problem only occurred on a multiple node IBM Power 770 (9117-MMC) or IBM Power 780 (9179-MHC).  The problem was intermittent so a re-IPL of the CEC normally resolved the problem.
  • A problem was fixed for some service processor error logs not getting reported to the OS partitions as needed.  The service processor was not checking for a successful completion code on the error log message send, so it was not doing retries of the send to the OS when that was needed to ensure that the OS received the message.
  • A problem was fixed for an incorrect call home for SRC B1818A0F.  There was no problem to be resolved so this call home should have been ignored.
  • A security problem was fixed in OpenSSL for a specially crafted X.509 certificate that could cause the service processor to reset in a denial-of-service (DoS) attack.  The Common Vulnerabilities and Exposures issue number is CVE-2015-1789.

System firmware changes that affect certain systems
  • On a system with redundant service processors, a problem was fixed for a bad pointer reference in the mailbox function during data synchronization between the two service processors.  The de-reference of the bad pointer caused a core dump, reset/reload, and fail-over to the backup service processor.
  • On systems using PowerVM and Virtual Trusted Platform Module (VTPM) partitions,  a problem was fixed for a Management Console error that occurred while restoring a backup profile that caused the system to go to the Management Console "Incomplete state".  The failed system had a suspended VTPM partition and a B7000602 SRC logged.
  • For a partition that has been migrated with Live Partition Mobility (LPM) from FW730 to FW740 or later, a problem was fixed for a Main Storage Dump (MSD) IPL failing with SRC B2006008.  The MSD IPL can happen after a system failure and is used to collect failure data.  If the partition is rebooted anytime after the migration, the problem cannot happen.  The potential for the problem existed between the active migration and a partition reboot.
  • On systems with a PCIe 3D graphics adapter (F/C #EC41 or #EC42) in a partition, a problem was fixed for a partition hang or BA21xxxx error conditions during partition initialization.
AM770_101_032 / FW770.51

04/21/15
Systems 8408-E8D; 8248-L4T; 9109-RMD; 9117-MMC and 9179-MHC ONLY
Impact: Security         Severity:  HIPER

System firmware changes that affect all systems

  • On systems using Virtual Shared Processor Pools (VSPP), a problem was fixed for an inaccurate pool idle count over a small sampling period.

  • A problem was corrected for a defect in an earlier service pack (AM770_098) that potentially caused an undetected corruption of firmware when the fix was concurrently activated.  If the earlier service pack (AM770_098) was concurrently installed, a platform IPL will mitigate potential future exposure to the problem.
System firmware changes that affect certain systems
  • On systems with redundant service processors and unlicensed cores, a problem was fixed with firmware update to prevent SRC B170B838 errors on unlicensed cores after an administrative failover (AFO) to the backup service processor.
AM770_098_032 / FW770.50

01/12/15

Impact: Security         Severity:  HIPER

New features and functions

  • Support was added for using the Mellanox ConnectX-3 Pro 10/40/56 GbE (Gigabit Ethernet) adapter as a network install device.

System firmware changes that affect all systems

  • A problem was fixed for the Advanced System Management Interface (ASMI) to change the Dynamic Platform Optimizer (DPO) VET capability setting from "False" to "True".  DPO is available on all systems to use without a license required.  Even though the VET for DPO was set to "False", it did not interfere with the running of DPO.
  • A problem was fixed for memory relocation failing during a partition reboot with SRC B700F103 logged.  The memory relocation could be part of the processing for the Dynamic Platform Optimizer (DPO), Active Memory Sharing (AMS) between partitions, mirrored memory defragmentation, or a concurrent FRU repair.
  • A security problem was fixed for the Network Time Protocol (NTP) client that allowed remote attackers to execute arbitrary code via a crafted packet containing an extension field.  The Common Vulnerabilities and Exposures issue number is CVE-2009-1252.
  • A security problem was fixed in the OpenSSL (Secure Socket Layer) protocol that allowed a man-in-the-middle attacker, via a specially crafted fragmented handshake packet, to force a TLS/SSL server to use TLS 1.0, even if both the client and server supported newer protocol versions.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3511.
  • A security problem was fixed in OpenSSL for formatting fields of security certificates without null-terminating the output strings.  This could be used to disclose portions of the program memory on the service processor.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3508.
  • Multiple security problems were fixed in the way that OpenSSL handled Datagram Transport Layer Security (DTLS) packets.  A specially crafted DTLS handshake packet could cause the service processor to reset.  The Common Vulnerabilities and Exposures issue numbers for these problems are CVE-2014-3505, CVE-2014-3506 and CVE-2014-3507.
  • A security problem was fixed in OpenSSL to prevent a denial of service when handling certain Datagram Transport Layer Security (DTLS) ServerHello requests.  A specially crafted DTLS handshake packet with an included Supported EC Point Format extension could cause the service processor to reset.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3509.
  • A security problem was fixed in OpenSSL to prevent a denial of service by using an exploit of a null pointer de-reference during anonymous Diffie Hellman (DH) key exchange.  A specially crafted handshake packet could cause the service processor to reset.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3510.
  • A security problem in GNU Bash was fixed to prevent arbitrary commands hidden in environment variables from being run during the start of a Bash shell.  Although GNU Bash is not actively used on the service processor, it does exist in a library so it has been fixed.  This is IBM Product Security Incident Response Team (PSIRT) issue #2211.  The Common Vulnerabilities and Exposures issue numbers for this problem are CVE-2014-6271, CVE-2014-7169, CVE-2014-7186, and CVE-2014-7187.
  • A problem was fixed for the Advanced System Management Interface (ASMI) that allowed possible cross-site request forgery (CSRF) exploitation of the ASMI user session to do unwanted tasks on the service processor.
  • A problem was fixed for I/O adapters so that BA400002 errors were changed to informational for memory boundary adjustments made to the size of DMA map-in requests.  These DMA size adjustments were marked as UE previously for a condition that is normal.
  • A security problem was fixed in OpenSSL for memory leaks that allowed remote attackers to cause a denial of service (out of memory on the service processor). The Common Vulnerabilities and Exposures issue numbers are CVE-2014-3513 and CVE-2014-3567.
  • A security problem was fixed in OpenSSL for padding-oracle attacks known as Padding Oracle On Downgraded Legacy Encryption (POODLE).  This attack allows a man-in-the-middle attacker to obtain a plain text version of the encrypted session data.  The Common Vulnerabilities and Exposures issue number is CVE-2014-3566.  The service processor POODLE fix is based on a selective disablement of SSLv3 using the Advanced System Management Interface (ASMI) "System Configuration/Security Configuration" menu options.  The Security Configuration options of "nist_sp800_131a", "nist_compat", and "legacy" affect the disablement of SSLv3 and determine the level of protection from POODLE.  The management console also requires a POODLE fix for APAR MB03867 (FIX FOR CVE-2014-3566 FOR HMC V7 R7.7.0 SP4 with PTF MH01482) to eliminate all vulnerability to POODLE and allow use of option 1 "nist_sp800_131a" as shown below:
    -1) nist_sp800_131a (SSLv3 disabled):  This highest level of security protection does not allow service processor clients to connect using SSLv3, thereby eliminating any possibility of a POODLE attack.  All clients must be capable of using TLS v1.2 to make the secured connections to the service processor to use this option.  This requires the management console be at a minimum level of HMC V7 R7.7.0 SP4 with POODLE PTF MH01482.
    -2) nist_compat (default mode - SSLv3 enabled for HMC):  This medium level of security protection disables SSLv3 (TLS v1.2 must be used instead) for the web browser sessions to ASMI and for the CIM clients and assures them of POODLE-free connections.  But the older management consoles are allowed to use SSLv3 to connect to the service processor.  This is intended to allow non-POODLE compliant HMC levels to be able to connect to the CEC servers until they can be planned and upgraded to the POODLE compliant HMC levels.  Running a non-POODLE compliant HMC to a service processor in this default mode will prevent the ASMI-proxy sessions from the HMC from connecting as these proxy sessions require SSLv3 support in ASMI.
    -3) legacy (SSLv3 enabled):  This basic level of security protection enables SSLv3 for all service processor client connections.  It relies on all clients being at POODLE fix compliant levels to provide full POODLE protection using the TLS Fallback Signaling Cipher Suite Value (TLS_FALLBACK_SCSV) to prevent fallback to vulnerable SSLv3 connections.  This legacy option is intended for customer sites on protected internal networks that have a large investment in older hardware that needs SSLv3 to make browser and HMC connections to the service processor.  The level of POODLE protection actually achieved in legacy mode is determined by the percentage of clients that are at the POODLE fix compliant levels.

    Note:  If the system must be downleveled to remove this fix, set the ASMI Security Configuration to either "nist_compat" or "legacy" before doing the firmware update to assure that all service processor clients are held to the same security level.  An ASMI Security Configuration of "nist_sp800_131a" will be undefined at the earlier firmware level, causing a mixed mode of security levels for the clients.
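
    As an illustration of the three Security Configuration options above, the following minimal Python sketch tabulates which client types may still connect with SSLv3 in each mode.  The mode names come from this fix description; the client category names ("browser", "cim", "hmc") and the function names are hypothetical, and this is a reading aid rather than the service processor implementation.

```python
# Illustrative model only: mode names are taken from the fix description,
# client categories and function names are hypothetical.
SSLV3_ALLOWED = {
    # Option 1: SSLv3 disabled for every client; TLS v1.2 is required.
    "nist_sp800_131a": {"browser": False, "cim": False, "hmc": False},
    # Option 2 (default): SSLv3 disabled for ASMI browser sessions and CIM
    # clients, but older management consoles may still use SSLv3.
    "nist_compat": {"browser": False, "cim": False, "hmc": True},
    # Option 3: SSLv3 enabled for all service processor client connections.
    "legacy": {"browser": True, "cim": True, "hmc": True},
}

# Modes that the note above names as acceptable before a downlevel.
DOWNLEVEL_SAFE_MODES = ("nist_compat", "legacy")


def sslv3_allowed(mode, client):
    """Return True if the client type may connect with SSLv3 in this mode."""
    try:
        return SSLV3_ALLOWED[mode][client]
    except KeyError:
        raise ValueError("unknown mode or client: %r, %r" % (mode, client))


def ok_before_downlevel(mode):
    """Return True if the mode is one the note allows before removing the fix."""
    if mode not in SSLV3_ALLOWED:
        raise ValueError("unknown mode: %r" % (mode,))
    return mode in DOWNLEVEL_SAFE_MODES
```

    For example, sslv3_allowed("nist_compat", "hmc") is True while sslv3_allowed("nist_compat", "browser") is False, which matches the description of the default mode.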
System firmware changes that affect certain systems
  • HIPER/Pervasive:  On systems using PowerVM firmware, a performance problem was fixed that may affect shared processor partitions where there is a mixture of dedicated and shared processor partitions with virtual IO connections, such as virtual ethernet or Virtual IO Server (VIOS) hosting, between them.  In high availability cluster environments this problem may result in a split brain scenario.
  • DEFERRED:  A performance problem was fixed for PCIe slot C4 which was missing a dedicated internal data buffer, making it a bottleneck when using certain high-performance IO adapters.  The PCIe slot C4 is now assigned a data capability of 16 GB.  This fix pertains only to the IBM Power 750 Express (8408-E8D), IBM Power 760 (9109-RMD), and IBM PowerLinux 7R4 (8248-L4T) systems.  This deferred fix addresses a potential performance problem but not an error condition.  As such,  customers may wait for the next planned service window to activate the deferred fix via a system reboot.
  • A problem was fixed to prevent unnecessary EPOW (Emergency and POwer Warning) class 3 event warnings in the OS for ambient temperature approaching specification limit.  This fix pertains only to the IBM Power 750 Express (8408-E8D), IBM Power 760 (9109-RMD), and IBM PowerLinux 7R4 (8248-L4T) systems.
  • On systems that have Active Memory Sharing (AMS) partitions and deduplication enabled, a problem was fixed for not being able to resume a hibernated AMS partition.  Previously,  resuming a hibernated AMS partition could give checksum errors with SRC B7000202 logged and the partition would remain in the hibernated state.
  • On systems that have Active Memory Sharing (AMS) partitions, a problem was fixed for Dynamic Logical Partitioning (DLPAR) for a memory remove that leaves a logical memory block (LMB) in an unusable state until partition reboot.
  • On systems with a F/C 5802 or 5877 I/O drawer installed, a problem was fixed for a hypervisor hang at progress code C7004091 during the IPL or hangs during serviceability tasks to the I/O drawer.
  • A problem was fixed that could result in unpredictable behavior if a memory UE is encountered while relocating the contents of a logical memory block during one of these operations:
    - Using concurrent maintenance to perform a hot repair of a node.
    - Reducing the size of an Active Memory Sharing (AMS) pool.
    - On systems using mirrored memory, using the memory mirroring optimization tool.
    - Performing a Dynamic Platform Optimizer (DPO) operation.
  • On systems using Virtual Shared Processor Pools (VSPP), a problem was fixed for an inaccurate pool idle count over a small sampling period.
  • A problem was fixed for systems in networks using the Juniper 1GbE and 10GbE switches (F/Cs #1108, #1145, and #1151) to prevent network ping errors and boot from network (bootp) failures.  The Address Resolution Protocol (ARP) table information on the Juniper aggregated switches is not being shared between the switches and that causes problems for address resolution in certain network configurations.  Therefore, the CEC network stack code has been enhanced to add three gratuitous ARPs (ARP replies sent without a request received) before each ping and bootp request to ensure that all the network switches have the latest network information for the system.
  • On systems using the Virtual I/O Server (VIOS) to share physical I/O resources among client logical partitions, a problem was fixed for memory relocation errors during page migrations for the virtual control blocks.  These errors caused a CEC termination with SRC B700F103.  The memory relocation could be part of the processing for the Dynamic Platform Optimizer (DPO), Active Memory Sharing (AMS) between partitions, mirrored memory defragmentation, or a concurrent FRU repair.
  • On systems with redundant service processors,  a problem was fixed so that a backup memory clock failure with SRC B120CC62 is handled without terminating the system running on the primary memory clock.
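
    The gratuitous ARP mentioned in the Juniper switch fix above can be illustrated with a minimal Python sketch that builds the raw Ethernet frame for an unsolicited ARP reply.  The function names are hypothetical, and the choice of a broadcast target hardware address is an assumption (conventions vary); the firmware's actual frame contents are not described in this document.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6
ETHERTYPE_ARP = 0x0806
ARP_OP_REPLY = 2


def gratuitous_arp_reply(mac, ip):
    """Build a 42-byte Ethernet frame carrying an unsolicited ARP reply.

    mac is the 6-byte sender hardware address; ip is a dotted-quad string.
    Sender and target protocol addresses are both set to the sender's IP.
    """
    if len(mac) != 6:
        raise ValueError("MAC address must be 6 bytes")
    ip_bytes = socket.inet_aton(ip)
    ethernet = BROADCAST_MAC + mac + struct.pack("!H", ETHERTYPE_ARP)
    # Hardware type 1 (Ethernet), protocol type 0x0800 (IPv4),
    # hardware length 6, protocol length 4, operation 2 (reply).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, ARP_OP_REPLY)
    # Assumption: the target hardware address is set to broadcast.
    arp += mac + ip_bytes + BROADCAST_MAC + ip_bytes
    return ethernet + arp


def frames_before_request(mac, ip, count=3):
    """Return the gratuitous ARP frames to send before one ping or bootp request."""
    return [gratuitous_arp_reply(mac, ip) for _ in range(count)]
```

    The default count of three matches the number of gratuitous ARPs stated in the fix description; sending the frames would require a raw socket and is left out of the sketch.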
Concurrent hot add/repair maintenance (CHARM) firmware fixes
  • A problem was fixed for a power off failure of an expansion drawer (F/C 5802 or F/C 5877) during a concurrent repair.  The power off commands to the drawer are now tried again using the System Power Control Network (SPCN) serial connection to the drawer to allow the repair to continue.
    This fix pertains only to IBM Power 770 (9117-MMC) and IBM Power 780 (9179-MHC) systems.
  • A problem was fixed for concurrent maintenance to prevent a hardware unavailable failure when doing consecutive concurrent remove and add operations to an I/O Hub adapter for a drawer.
    This fix pertains only to IBM Power 770 (9117-MMC) and IBM Power 780 (9179-MHC) systems.
AM770_092_032 / FW770.41

09/26/14
Systems 8408-E8D; 8248-L4T; 9109-RMD; 9117-MMC and 9179-MHC ONLY
Impact: Availability         Severity:  SPE

System firmware changes that affect all systems
  • A problem was fixed that was introduced by AM770_090 / FW770.40, which caused a service processor reset/reload and a SRC B1818601 error log during an IPL when adjusting the speeds of the system fans.  This problem would normally have a successful recovery with a good IPL of the system unless two other reset/reloads of the service processor had occurred within the last 15 minutes.
  • This Service Pack (AM770_092 / FW770.41) contains all of the HIPER fixes provided by the previous Service Packs.
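
    The recovery condition in the fan speed fix above (recovery succeeds unless two other reset/reloads occurred within the last 15 minutes) can be pictured as a sliding-window count.  The following Python sketch is an assumption about how such a check could be modeled; the class name, the limit of three, and the window handling are inferred from that one sentence and are not taken from the service processor code.

```python
from collections import deque

WINDOW_SECONDS = 15 * 60
# Assumption: the third reset/reload inside the window is the one that
# no longer recovers ("two other reset/reloads ... within the last 15 minutes").
MAX_RESETS_IN_WINDOW = 3


class ResetTracker:
    """Hypothetical sliding-window counter of service processor reset/reloads."""

    def __init__(self):
        self._times = deque()

    def record_reset(self, now):
        """Record a reset/reload at time `now` (in seconds).

        Returns True if a normal recovery would still be expected under the
        assumed policy, False if the limit within the window was reached.
        """
        # Drop resets that are older than the 15-minute window.
        while self._times and now - self._times[0] > WINDOW_SECONDS:
            self._times.popleft()
        self._times.append(now)
        return len(self._times) < MAX_RESETS_IN_WINDOW
```

    With this model, resets at 0, 60, and 120 seconds would report a failed recovery on the third call, while a third reset at 1000 seconds would still recover because the earlier two have aged out of the window.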
AM770_090_032 / FW770.40

06/26/14
Systems 8408-E8D; 8248-L4T; 9109-RMD; 9117-MMC and 9179-MHC ONLY
Impact: Security         Severity:  HIPER

New features and functions

  • System recovery for interrupted AC power and Voltage Regulator Module (VRM) failures has been enhanced for systems with multiple CEC enclosures such that an AC power or VRM fault on one CEC drawer will no longer block the other CEC drawers from powering on.  Previously, all CEC enclosures in a system needed valid AC power before the power on of the system could proceed.
    This system recovery feature does not pertain to the IBM Power 750 Express (8408-E8D), IBM Power 760 (9109-RMD), and IBM PowerLinux 7R4 (8248-L4T) systems because they are single CEC enclosure systems.

System firmware changes that affect all systems

  • HIPER/Pervasive:  A security problem was fixed in the OpenSSL (Secure Socket Layer) protocol that allowed clients and servers, via a specially crafted handshake packet, to use weak keying material for communication.  A man-in-the-middle attacker could use this flaw to decrypt and modify traffic between the management console and the service processor.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-0224.
  • HIPER/Pervasive:  A security problem was fixed in OpenSSL for a buffer overflow in the Datagram Transport Layer Security (DTLS) when handling invalid DTLS packet fragments.  This could be used to execute arbitrary code on the service processor.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-0195.
  • HIPER/Pervasive:  Multiple security problems were fixed in the way that OpenSSL handled read and write buffers when the SSL_MODE_RELEASE_BUFFERS mode was enabled to prevent denial of service.  These could cause the service processor to reset or unexpectedly drop connections to the management console when processing certain SSL commands.  The Common Vulnerabilities and Exposures issue numbers for these problems are CVE-2010-5298 and CVE-2014-0198.
  • HIPER/Pervasive:  A security problem was fixed in OpenSSL to prevent a denial of service when handling certain Datagram Transport Layer Security (DTLS) ServerHello requests. A specially crafted DTLS handshake packet could cause the service processor to reset.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-0221.
  • HIPER/Pervasive:  A security problem was fixed in OpenSSL to prevent a denial of service by using an exploit of a null pointer de-reference during anonymous Elliptic Curve Diffie Hellman (ECDH) key exchange.  A specially crafted handshake packet could cause the service processor to reset.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3470.
  • A problem was fixed that caused frequent SRC B1A38B24 error logs with a call home every 15 seconds when service processor network interfaces were incorrectly configured on the same subnet.  The frequency of the notification of the network subnet error has been reduced to once every 24 hours.
  • Help text for the Advanced System Management Interface (ASMI) "System Configuration/Hardware Deconfiguration/Clear All Deconfiguration Errors" menu option was enhanced to clarify that when selecting "Hardware Resources" value of "All hardware resources", the service processor deconfiguration data is not cleared.   The "Service processor" must be explicitly selected for that to be cleared.
  • Help text for the Advanced System Management Interface (ASMI) "System Configuration/Power Management/Power Supply Idle Control" menu option was enhanced to clarify that an idle power supply is in a low power state and not powered off.  The new help text states  "Power supply idle mode helps to reduce overall power usage when the system load is very light by having one power supply deliver all the power while the second is in a low power state".
  • A problem was fixed that caused a memory clock failure to be called out as failure in the processor clock FRU. This memory clock fix pertains only to the IBM Power 770 (9117-MMC) system.
  • A problem was fixed where a 12V DC power-good (pGood) input fault was reported as a SRC 11002620 with the wrong FRU callout of Un-P1 for system backplane.  The FRU callout for SRC 11002620 has been corrected to Un-P2 for I/O card.
  • A problem was fixed that caused the slot index to be missing for virtual slot number 0 for the dynamic reconfiguration connector (DRC) name for virtual devices.  This error was visible from the management console when using commands such as "lshwres -r virtualio --rsubtype slot -m machine" to show the hardware resources for virtual devices.
  • A problem was fixed that prevented a HMC-managed system from being converted to manufacturing default configuration (MDC) mode when the management console command "lpcfgop -m <server> -o clear" failed to create the default partition.  The management console went to the incomplete state for this error.
  • A problem was fixed that caused the Utility COD display of historical usage data to be truncated on the management console.
  • A problem was fixed that caused the Advanced System Management Interface (ASMI) menu for Memory Low Power State to be displayed on the IBM Power 770 (9117-MMC) even though it is not applicable to that system.  The 9117-MMC does not have the DIMM type required for memory low power state.
  • A problem was fixed that caused a "code accept" during a concurrent firmware installation from the HMC to fail with SRC E302F85C.
  • A power supply fan speed problem was fixed that slowed the power supply fans down to a very low level for a minute about once every hour, with possible thermal shutdown of the power supply.  The affected systems are the Power 750 Express (8408-E8D), Power 760 (9109-RMD), and PowerLinux 7R4 (8248-L4T).
  • A power supply problem was fixed to prevent the system from consuming more power than the power supply can support and causing a possible power supply shutdown.  The power supply maximum wattage limit was lowered so that the case of a single power supply unit active for the system (system has redundant supplies) is handled correctly.  The affected systems are the Power 750 Express (8408-E8D), Power 760 (9109-RMD), and PowerLinux 7R4 (8248-L4T).
  • A problem was fixed for a Live Partition Mobility (LPM) suspend and transfer of a partition that caused the time of day to skip ahead to an incorrect value on the target system.  The problem only occurred when a suspended partition was migrated to a target CEC that had a hypervisor time that was later than the source CEC.
  • A problem was corrected that resulted in B7005300 error logs.
  • A security problem was fixed in the service processor TCP/IP stack to discard illegal TCP/IP packets that have the SYN and FIN flags set at the same time.  An explicit packet discard was needed to prevent further processing of the packet that could result in a bypass of the iptables firewall rules.
  • A problem was fixed that prevented guard error logs from being reported for FRUs that were guarded during the system power on.  This could happen if the same FRU had been previously reported as guarded on a different power on of the system.  The requirement is now met that guarded FRUs are logged on every power on of the system.
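
    The SYN/FIN discard described in the TCP/IP stack fix above can be illustrated with a short Python sketch that inspects the flags byte of a TCP header.  The function names are hypothetical; the flag bit values and the flags byte offset follow the standard TCP header layout, and the sketch does not represent the service processor code.

```python
import struct

TCP_FIN = 0x01
TCP_SYN = 0x02
TCP_MIN_HEADER = 20
TCP_FLAGS_OFFSET = 13  # byte offset of the flags field in the TCP header


def has_syn_and_fin(tcp_header):
    """Return True if both SYN and FIN are set in the given TCP header bytes."""
    if len(tcp_header) < TCP_MIN_HEADER:
        raise ValueError("TCP header shorter than 20 bytes")
    flags = tcp_header[TCP_FLAGS_OFFSET]
    return (flags & TCP_SYN) != 0 and (flags & TCP_FIN) != 0


def should_discard(tcp_header):
    """Discard decision for the illegal SYN+FIN combination only."""
    return has_syn_and_fin(tcp_header)


def make_header(flags):
    """Build a 20-byte TCP header with arbitrary ports, for demonstration only."""
    return struct.pack("!HHIIBBHHH", 40000, 443, 0, 0, 5 << 4, flags, 65535, 0, 0)
```

    A header built with make_header(TCP_SYN | TCP_FIN) is flagged for discard, while an ordinary SYN or a FIN combined with ACK is not.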
System firmware changes that affect certain systems
  • On systems with a redundant service processor, a problem was fixed where the service processor allowed a clock failover to occur without a SRC B158CC62 error log and without a hardware deconfiguration record for the failed clock source.  This resulted in the system running with only one clock source and without any alerts to warn that clock redundancy had been lost.
    This fix pertains only to the IBM Power 770 (9117-MMC) system.
  • On systems with a redundant service processor and one memory clock deconfigured, a problem was fixed where the system failed to IPL using the second memory clock with SRCs B158CC62 and B181C041 logged.
    This fix pertains only to the IBM Power 770 (9117-MMC) system.
  • On systems with a F/C 5802 or 5877 I/O drawer installed, a problem was fixed that occurred during Offline Converter Assembly (OCA) replacement operations.  The fix prevents a false Voltage Regulator Module (VRM) fault and the logging of SRCs 10001511 or 10001521 from occurring.  Previously, the false fault resulted in the OCA LED getting stuck in an on or "fault" state and the OCA not powering on.
  • On systems with a redundant service processor with AC power missing to the node containing the anchor card, a problem was fixed that caused an IPL failure with SRC B181C062 when the anchor card could not be found in the vital product data (VPD) for the system.  With the fix, the system is able to find the anchor card and IPL since the anchor card gets its power from the service processor cable, not from the node where it resides.
  • On systems running Dynamic Platform Optimizer (DPO) with one or more unlicensed processors, a problem was fixed where the system performance was significantly degraded during the DPO operation.  The amount of performance degradation was more for systems with larger numbers of unlicensed processors.
  • On a system with partitions with redundant Virtual Asynchronous Services Interface (VASI) streams,  a problem was fixed that caused the system to terminate with SRC B170E540.  The affected partitions include Active Memory Sharing (AMS), encapsulated state partitions, and hibernation-capable partitions.  The problem is triggered when the management console attempts to change the active VASI stream in a redundant configuration.  This may occur due to a stream reconfiguration caused by Live Partition Mobility (LPM); reconfiguring from a redundant Paging Service Partition (PSP) to a single-PSP configuration; or conversion of a partition from AMS to dedicated memory.
  • For a partition with a 256MB Real Memory Offset (RMO) region size that has been migrated from a Power8 system to  Power7 or Power6 using Live Partition Mobility, a problem was fixed that caused a failure on the next boot of the partition with a BA210000 log with a CA000091 checkpoint just prior to the BA210000.  The fix dynamically adjusts the memory footprint of the partition to fit on the earlier Power systems.
  • On a system with a disk device with multiple boot partitions, a problem was fixed that caused System Management Services (SMS) to list only one boot partition.  Even though only one boot partition was listed in SMS, the AIX bootlist command could still be used to boot from any boot partition.
  • On systems that require in-band flash to update system firmware, a problem was fixed so in-band update would not fail if the Permanent (P) or the Temporary (T) side of the service processor was marked invalid.   Attempting to in-band flash from the AIX or Linux command line failed with a BA280000 log reported.  Attempting to in-band flash from the AIX diagnostics menus also failed because the flash menu options did not appear in this case.
  • On systems that have a boot disk located on a SAN, a problem was fixed where the SAN boot disk would not be found on the default boot list and then the boot disk would have to be selected from SMS menus.  This problem would normally be seen for new partitions that had tape drives configured before the SAN boot disk.
Concurrent hot add/repair maintenance (CHARM) firmware fixes
  • A problem was fixed that caused Capacity on Demand (COD) "Out of Compliance" messages during concurrent maintenance operations when the system was actually in compliance for the licensed amount of resources in use. This fix pertains only to the IBM Power 770 (9117-MMC) system.
  • A problem was fixed for concurrent maintenance operations to limit hardware retries on failed hardware so that it can be concurrently repaired.   This fix pertains only to the IBM Power 770 (9117-MMC) system.
AM770_076_032 / FW770.32

04/18/14
Systems 8408-E8D; 8248-L4T; 9109-RMD; 9117-MMC and 9179-MHC ONLY

Impact: Security         Severity:  HIPER

System firmware changes that affect all systems
  • HIPER/Pervasive:  A  security problem was fixed in the OpenSSL Montgomery ladder implementation for the ECDSA (Elliptic Curve Digital Signature Algorithm) to protect sensitive information from being obtained with a flush and reload cache side-channel attack to recover ECDSA nonces from the service processor.  The Common Vulnerabilities and Exposures issue number is CVE-2014-0076.  The stolen ECDSA nonces could be used to decrypt the SSL sessions and compromise the Hardware Management Console (HMC) access password to the service processor.  Therefore, the HMC access password for the managed system should be changed after applying this fix.
  • HIPER/Pervasive:  A security problem was fixed in the OpenSSL Transport Layer Security (TLS) and Datagram Transport Layer Security (DTLS) to not allow Heartbeat Extension packets to trigger a buffer over-read to steal private keys for the encrypted sessions on the service processor.  The Common Vulnerabilities and Exposures issue number is CVE-2014-0160 and it is also known as the Heartbleed vulnerability.  The stolen private keys could be used to decrypt the SSL sessions and compromise the Hardware Management Console (HMC) access password to the service processor.  Therefore, the HMC access password for the managed system should be changed after applying this fix.
  • A  security problem was fixed for the Lighttpd web server that allowed arbitrary SQL commands to be run on the service processor.  The Common Vulnerabilities and Exposures issue number is CVE-2014-2323.
  • A security problem was fixed for the Lighttpd web server where improperly-structured URLs could be used to view arbitrary files on the service processor.  The Common Vulnerabilities and Exposures issue number is CVE-2014-2324.
AM770_063_032 / FW770.31

01/14/14
Systems 8408-E8D; 8248-L4T; 9109-RMD; 9117-MMC and 9179-MHC ONLY
Impact: Serviceability         Severity:  SPE

System firmware changes that affect all systems
  • A firmware code update problem was fixed that caused a system failure with SRC B7000103 when the allowed resource usage was exceeded for the partition universal unique identifier (UUID) processing during a code update.
  • A firmware code update problem was fixed that caused the Hardware Management Console (HMC) to go to "Incomplete State" for the system with SRC E302F880 when assignment of a partition universal unique identifier (UUID) failed for a partition that was already running.  This problem happens for disruptive code updates from pre-770 levels to 770 or later levels.
AM770_062_032 / FW770.30

12/10/13
Systems 8408-E8D; 8248-L4T; 9109-RMD; 9117-MMC and 9179-MHC ONLY

Impact: Availability         Severity:  SPE

New features and functions

  • Support was added to upgrade the service processor to OpenSSL version 1.0.1 and for compliance to National Institute of Standards and Technology (NIST) Special Publication 800-131A.  SP800-131A compliance required the use of stronger cryptographic keys and more robust cryptographic algorithms.
  • Support was added in Advanced System Management Interface (ASMI) to facilitate capture and reporting of debug data for system performance problems.  The  "System Service Aids/Performance Dump" menu was added to ASMI to perform this function.
  • Support was added to the Advanced System Management Interface (ASMI) to provide a menu for "Power Supply Idle Mode".  Using the "Power Supply Idle Mode"  menu, the power supplies can be either set enabled to save power by idling power supplies when possible or set disabled to keep all power supplies fully on and allow a balanced load to be maintained on the power distribution units (PDUs) of the system.  Power supply idle mode enabled helps to reduce overall power usage when the system load is very light by having one power supply deliver all the power while the second power supply is maintained in a low power state.
  • Support was added to the Advanced System Management Interface (ASMI) to provide a menu for "Memory Low Power State Control" to enable or disable the custom memory buffer low power mode.  If set to disabled, it disables low power mode (a power-saving feature) to speed memory and improve performance for some workloads.  The "Memory Low Power State Control" menu is not available on the MTM 9117-MMC system because its memory does not have a low power state option.
  • Support was added for the IBM Flash 90 (#ES09) PCIe 2.0 x8 eMLC adapter with 900GB storage and 350,00 IOPS read performance.  The system recognizes the PCI device as one needing additional cooling and increases the fan speeds accordingly.
  • Support was added in Advanced System Management Interface (ASMI) for saving and restoring network settings using a USB flash drive.
  • Support was dropped for Secure Sockets Layer (SSL) protocol version 2 and SSL weak and medium cipher suites in the service processor web server (Lighttpd).  Unsupported web browser connections to the Advanced System Management Interface (ASMI) secured port 443 (using https://) will now be rejected if those browsers do not support SSL version 3.  Supported web browsers for Power7 ASMI are Netscape (version 9.0.0.4), Microsoft Internet Explorer (version 7.0), Mozilla Firefox (version 2.0.0.11), and Opera (version 9.24).
  • Support was added in Advanced System Management Interface (ASMI) "System Configuration/Firmware Update Policy" menu to detect and display the appropriate Firmware Update Policy (depending on whether system is HMC managed) instead of requiring the user to select the Firmware Update Policy.  The menu also displays the "Minimum Code Level Supported" value.

System firmware changes that affect all systems

  • The firmware was enhanced to display on the management console the correct number of concurrent Live Partition Mobility (LPM) operations that is supported.
  • A problem was fixed that caused a 1000911E platform event log (PEL) to be marked as not call home.  The PEL is now a call home to allow for correction.  This PEL is logged when the hypervisor has changed the Machine Type Model Serial Number (MTMS) of an external enclosure to UTMP.xxx.xxxx because it cannot read the vital product data (VPD), or the VPD has invalid characters, or if the MTMS is a duplicate to another enclosure.
  • When powering on a system partition, a problem was fixed that caused the partition universal unique identifier (UUID) to not get assigned, causing a B2006010 SRC in the error log.
  • For the sequence of a reboot of a system partition followed immediately by a power off of the partition, a problem was fixed where the hypervisor virtual service processor (VSP) incorrectly retained locks for the powered off partition, causing the CEC to go into recovery state during the next power on attempt.
  • A problem was fixed that caused the system attention LED to be lit without a corresponding SRC and error log for the event.  This problem typically occurs when an operating system on a partition terminates abnormally.
  • A problem was fixed that caused a memory leak of 50 bytes of service processor memory for every call home operation.  This could potentially cause an out of memory condition for the service processor when running over an extended period of time without a reset.
  • A problem was fixed that caused a L2 cache error to not guard out the faulty processor, allowing the system to checkstop again on an error to the same faulty processor.
  • A problem was fixed that caused an HMC code update of the FSP to fail on the accept operation with SRC B1811402, or left the FSP unable to boot on the updated side.
  • A problem was fixed that caused a SRC B181B2C0 and incorrect hardware callout for a GX bus failure on a wire test.  SRC B114C80C with GX location codes is now provided to facilitate repairs for wire test errors.
  • A problem was fixed that caused a built-in self test (BIST) for GX slots to create corrupt error log values that core dumped the service processor with a B18187DA.  The corruption was caused by a failure to initialize the BIST array to 0 before starting the tests.
  • A problem was fixed that caused a SRC B7006A72 calling out the adapter and the I/O Planar.
  • A problem was fixed during resource dump processing that caused a read of an invalid system memory address and a SRC B181C141.  The invalid memory reference resulted from the service processor incorrectly referencing memory that had been relocated by the hypervisor.
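To give a feel for the service processor memory leak fixed above (50 bytes per call home operation), the following minimal sketch estimates how many call home operations it would take to leak a given amount of memory.  The memory budget in the example is a hypothetical figure chosen only for illustration; this history does not state how much service processor memory is available, and the function name is likewise an assumption.

```python
LEAK_BYTES_PER_CALL_HOME = 50  # figure taken from the fix description above


def call_homes_until_exhausted(free_bytes: int) -> int:
    """Return how many call home operations would leak `free_bytes` bytes."""
    if free_bytes < 0:
        raise ValueError("free_bytes must be non-negative")
    # Ceiling division: the operation that crosses the limit counts in full.
    return -(-free_bytes // LEAK_BYTES_PER_CALL_HOME)


if __name__ == "__main__":
    # Hypothetical budget of 1 MiB, used only to illustrate the arithmetic.
    hypothetical_free_bytes = 1024 * 1024
    print(call_homes_until_exhausted(hypothetical_free_bytes))
```

Because the leak per operation is small, the condition would only be expected on a service processor that runs for an extended period without a reset, which matches the description of the problem.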
System firmware changes that affect certain systems
  • On systems with a redundant service processor, a problem was fixed that caused fans to run at a high-speed after a failover to the sibling service processor.
  • On systems running AIX or Linux, a problem was fixed that caused the operating system to halt when an InfiniBand Host Channel Adapter (HCA) fails or malfunctions.
  • On systems running AIX or Linux, a hang in a Live Partition Mobility (LPM) migration for remote restart-capable partitions was fixed by adding a time-out for the required paging space to become available.  If after five minutes the required paging space is not available, the start migration command returns an error code of 0x40000042 (PagingSpaceNotReady) to the management console.
  • On systems running Dynamic Platform Optimizer (DPO) with no free memory, a problem was fixed that caused the Hardware Management Console (HMC) lsmemopt command to report the wrong status of completed with no partitions affected.  It should have indicated that DPO failed due to insufficient free memory.  DPO can only run when there is free memory in the system.
  • On systems with partitions using physical shared processor pools, a problem was fixed that caused partition hangs if the shared processor pool was reduced to a single processor.
  • On systems with turbo-core enabled that are a target of Live Partition Mobility (LPM), a problem was fixed where cache properties were not recognized and SRCs BA280000 and BA250010 were reported.
  • On 8408-E8D, 9109-RMD, and 8248-L4T systems, the guidance provided by the Advanced System Management Interface (ASMI) "System Configuration/Hardware Management Console" menu was changed to fix the problem of the serial port not being enabled when converting from a HMC-managed to a non-HMC-managed system.  The enhanced guidance adds a step to reset the service processor when doing the conversion.
  • On systems with a redundant service processor, a problem was fixed that prevented the deconfiguration details of a guarded sibling service processor from being shown in the Advanced System Management Interface (ASMI).
  • On systems with a F/C 5802 or 5877 I/O drawer installed, the firmware was enhanced to guarantee that an SRC will be generated when there is a power supply voltage fault.  If no SRC is generated, a loss of power redundancy may not be detected, which can lead to a drawer crash if the other power supply goes down.  This also fixes a problem that causes an 8 Gb Fibre Channel adapter in the drawer to fail if the 12V level fails in one Offline Converter Assembly (OCA).
  • On systems managed by an HMC with a F/C 5802 or 5877 I/O drawer installed, a problem was fixed that caused the hardware topology on the management console for the managed system to show "null" instead of "operational" for the affected I/O drawers.
  • On systems with a redundant service processor, a problem was fixed that caused a SRC B150D15E to be erroneously logged after a failover to the sibling service processor.
  • On Power7+ systems,  a problem was fixed that caused a system checkstop during hypervisor time keeping services.
  • DEFERRED:  On Power7 systems, a problem was fixed that caused a system checkstop during hypervisor time keeping services.  This deferred fix addresses a problem that has a very low probability of occurrence.  As such, customers may wait for the next planned service window to activate the deferred fix via a system reboot.
  • On systems with a F/C 5802 or 5877 I/O drawer installed, a problem was fixed where an Offline Converter Assembly (OCA) fault would appear to persist after an OCA micro-reset or OCA replacement.  The fault bit reported to the OS may not be cleared, indicating a fault still exists in the I/O drawer after it has been repaired.
  • DEFERRED:  On Power7 systems, a problem was fixed that caused a system checkstop with SRC B113E504 for a recoverable hardware fault.  This deferred fix addresses a problem that has a very low probability of occurrence.  As such, customers may wait for the next planned service window to activate the deferred fix via a system reboot.
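The paging space time-out described in the list above for Live Partition Mobility (wait up to five minutes, then return 0x40000042, PagingSpaceNotReady) follows a common poll-with-deadline pattern.  The sketch below illustrates that pattern only; it is not the firmware's implementation.  The function name, the readiness callback, the polling interval, and the success value of 0 are all assumptions made for illustration.

```python
import time
from typing import Callable

PAGING_SPACE_NOT_READY = 0x40000042  # error code named in the fix description
SUCCESS = 0                          # assumed success value, not from the source
TIMEOUT_SECONDS = 5 * 60             # five-minute limit from the fix description


def wait_for_paging_space(
    is_ready: Callable[[], bool],
    timeout: float = TIMEOUT_SECONDS,
    poll_interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll `is_ready` until it returns True or `timeout` seconds elapse."""
    deadline = clock() + timeout
    while True:
        if is_ready():
            return SUCCESS
        if clock() >= deadline:
            # Give up instead of hanging, and report the condition to the caller.
            return PAGING_SPACE_NOT_READY
        sleep(poll_interval)
```

The clock and sleep functions are passed in as parameters so that the time-out path can be exercised without actually waiting five minutes.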
Concurrent hot add/repair maintenance (CHARM) firmware fixes
  • A problem was fixed that caused a concurrent hot add/repair maintenance operation to fail on an erroneously logged error for the service processor battery with  SRCs B15A3303, B15A3305, and  B181EA35 reported.
  • A problem was fixed that caused a concurrent processor exchange to terminate during node deactivation with SRC B1814616.
  • A problem was fixed that caused SRC B15A3303  to be erroneously logged as a predictive error on the service processor sibling after a successful concurrent repair maintenance operation for the real-time clock (RTC) battery.
AM770_052_032 / FW770.21

08/07/13
Systems 8408-E8D; 8248-L4T; 9109-RMD; 9117-MMC and 9179-MHC ONLY
Impact: Availability         Severity:  SPE

System firmware changes that affect all systems

  • A problem was fixed that caused a migrated partition to reboot during transfer to a target system running VIOS 2.2.2.0 or later.  A manual reboot would be required if transferred to a target system running an earlier VIOS release.  Migration recovery may also be necessary.
  • A problem was fixed that can cause Anchor (VPD) card corruption and A70047xx SRCs to be logged.  Note: If a serviceable event with SRC A7004715 is present or was logged previously, damage to the VPD card may have occurred.  After the fix is applied, replacement of the Anchor VPD card is recommended in order to restore full redundancy.
System firmware changes that affect certain systems
  • On systems running Dynamic Platform Optimizer (DPO), a problem was fixed that caused an incorrect placement of dedicated processors for partitions larger than a single chip.  When this occurs, performance is lower than it would have been with proper placement.
AM770_048_032 / FW770.20

05/17/13
Systems 8408-E8D; 8248-L4T; 9109-RMD; 9117-MMC and 9179-MHC ONLY
Impact: Availability         Severity:  SPE

New Features and Functions

  • Support for the 8248-L4T.
  • Support for 9117-MMC and 9179-MHC with Dynamic Platform Optimization (DPO).

System firmware changes that affect all systems

  • A problem was fixed that caused a service processor reset/reload with SRC B181720D due to a memory leak.
  • The Hypervisor was enhanced to allow the system to continue to boot using the redundant data chip on the anchor (VPD) card, instead of stopping the Hypervisor boot and logging SRC B7004715,  when the primary data chip on the anchor card has been corrupted.
  • The firmware was enhanced to support up to 4200 virtual adapters.
  • A problem was fixed that caused a service processor dump to be generated with SRC B18187DA "NETC_RECV_ER" logged.
  • The firmware was enhanced to make the Capacity on Demand (CoD) menu option available on the Advanced System Management Interface (ASMI) menus when logged in as admin or celogin.
  • The firmware was enhanced to make SRC B15738F8 ("FRUM_ERC_UNEXPECTED_HOTPLUG_ADD") informational instead of predictive.
  • A problem was fixed that caused a platform dump generation to fail after a system checkstop with SRCs B181B8A2 and B114E504 ("Processor cleanup failure").
  • A problem was fixed that caused the date and time to be incorrect in AIX if a partition is remotely restarted on a different system from the one on which it was hibernated.
  • A problem was fixed that caused a performance loss after a configuration change, such as un-licensing a processor, because the Hypervisor is unable to dispatch a partition to a shared processor.
  • A problem was fixed that may cause inaccurate processor utilization reporting.
System firmware changes that affect certain systems
  • On systems running Active Memory Sharing (AMS) partitions, a problem was fixed that caused the system to hang after an AMS partition was deleted or mobilized, combined with either an AMS pool resize or relocation of AMS pool memory.
  • On systems with I/O towers attached, a problem was fixed that caused multiple service processor reset/reloads if the tower was continuously sending invalid System Power Control Network (SPCN) status data. 
  • A problem was fixed that was caused by an attempt to modify a virtual adapter from the management console command line when the command specifies it is an Ethernet adapter, but the virtual ID specified is for an adapter type other than Ethernet.  The managed system has to be rebooted to restore communications with the management console when this problem occurs; SRC B7000602 is also logged.
  • On systems running Dynamic Platform Optimization (DPO), a problem was fixed that caused the current DPO score for a partition to be incorrect.  When this occurs, it looks like DPO would not improve performance when in fact it would improve the performance.  Also, on systems running Dynamic Platform Optimization (DPO), in which there are no processors in the shared processor pool, a problem was fixed that caused the Hypervisor to become unresponsive (the service processor starts logging time-out errors against the Hypervisor, and the HMC can no longer talk to the Hypervisor) during a DPO operation.
AM770_038_032 / FW770.10

03/21/13
Systems 8408-E8D and 9109-RMD ONLY
Impact:  New      Severity:  New

New Features and Functions

  • Support for the 8408-E8D and 9109-RMD systems.
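The release headings in this history pair a level name such as AM770_052_032 with a short name such as FW770.21.  The sketch below splits a level name into its three underscore-separated fields.  The field meanings used in the code (release, service pack level, last disruptive service pack level) are an assumption based on IBM's usual firmware naming convention, since this history does not define them; the class and function names are hypothetical.

```python
import re
from typing import NamedTuple


class FirmwareLevel(NamedTuple):
    release: str          # e.g. "AM770" (assumed meaning: release)
    service_pack: int     # e.g. 52 (assumed meaning: service pack level)
    last_disruptive: int  # e.g. 32 (assumed meaning: last disruptive level)


# Two letters and three digits, then two three-digit fields, e.g. AM770_052_032.
_LEVEL_RE = re.compile(r"^([A-Z]{2}\d{3})_(\d{3})_(\d{3})$")


def parse_level(name: str) -> FirmwareLevel:
    """Split a level name such as 'AM770_052_032' into its three fields."""
    match = _LEVEL_RE.match(name.strip())
    if match is None:
        raise ValueError(f"unrecognized firmware level: {name!r}")
    release, service_pack, last_disruptive = match.groups()
    return FirmwareLevel(release, int(service_pack), int(last_disruptive))


if __name__ == "__main__":
    heading = "AM770_052_032 / FW770.21"
    # Keep only the part before the slash, then parse it.
    print(parse_level(heading.split("/")[0]))
```

Parsing the fields into integers makes it straightforward to sort the entries of a history like this one by service pack level within a release.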



AM740
Systems 9117-MMC and 9179-MHC ONLY
For Impact, Severity and other Firmware definitions, Please refer to the below 'Glossary of firmware terms' url:
http://www14.software.ibm.com/webapp/set2/sas/f/power5cm/home.html#termdefs
AM740_152_042 / FW740.81

06/24/14
Impact: Security         Severity:  HIPER

System firmware changes that affect all systems

  • HIPER/Pervasive:  A security problem was fixed in the OpenSSL (Secure Socket Layer) protocol that allowed clients and servers, via a specially crafted handshake packet, to use weak keying material for communication.  A man-in-the-middle attacker could use this flaw to decrypt and modify traffic between the management console and the service processor.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-0224.
  • HIPER/Pervasive:  A security problem was fixed in OpenSSL for a buffer overflow in the Datagram Transport Layer Security (DTLS) when handling invalid DTLS packet fragments.  This could be used to execute arbitrary code on the service processor.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-0195.
  • HIPER/Pervasive:  Multiple security problems were fixed in the way that OpenSSL handled read and write buffers when the SSL_MODE_RELEASE_BUFFERS mode was enabled to prevent denial of service.  These could cause the service processor to reset or unexpectedly drop connections to the management console when processing certain SSL commands.  The Common Vulnerabilities and Exposures issue numbers for these problems are CVE-2010-5298 and CVE-2014-0198.
  • HIPER/Pervasive:  A security problem was fixed in OpenSSL to prevent a denial of service when handling certain Datagram Transport Layer Security (DTLS) ServerHello requests. A specially crafted DTLS handshake packet could cause the service processor to reset.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-0221.
  • HIPER/Pervasive:  A security problem was fixed in OpenSSL to prevent a denial of service that exploits a null pointer dereference during anonymous Elliptic Curve Diffie-Hellman (ECDH) key exchange.  A specially crafted handshake packet could cause the service processor to reset.  The Common Vulnerabilities and Exposures issue number for this problem is CVE-2014-3470.
  • Multiple security problems were fixed in  OpenSSL to improve signature verification,  ensure private key protection, and to block plain-text recovery.  The Common Vulnerabilities and Exposures issue numbers for these problems are CVE-2013-0169, CVE-2013-0166 and CVE-2011-4354.
AM740_126_042 / FW740.80

04/03/14
Impact: Availability    Severity: SPE

New features and functions

  • Support was added in Advanced System Management Interface (ASMI) to facilitate capture and reporting of debug data for system performance problems.  The  "System Service Aids/Performance Dump" menu was added to ASMI to perform this function.
System firmware changes that affect all systems
  • Help text for the Advanced System Management Interface (ASMI) "System Configuration/Hardware Deconfiguration/Clear All Deconfiguration Errors" menu option was enhanced to clarify that when selecting "Hardware Resources" value of "All hardware resources", the service processor deconfiguration data is not cleared.   The "Service processor" must be explicitly selected for that to be cleared.
  • A problem was fixed that prevented guard error logs from being reported for FRUs that were guarded during the system power on.  This could happen if the same FRU had been previously reported as guarded on a different power on of the system.  The requirement is now met that guarded FRUs are logged on every power on of the system.
  • Help text for the Advanced System Management Interface (ASMI) "System Configuration/Power Management/Power Supply Idle Control" menu option was enhanced to clarify that an idle power supply is in a low power state and not powered off.  The new help text states  "Power supply idle mode helps to reduce overall power usage when the system load is very light by having one power supply deliver all the power while the second is in a low power state".
  • A problem was fixed that caused the slot index to be missing for virtual slot number 0 for the dynamic reconfiguration connector (DRC) name for virtual devices.  This error was visible from the management console when using commands such as "lshwres -r virtualio --rsubtype slot -m machine" to show the hardware resources for virtual devices.
  • A problem was fixed where a 12V DC power-good (pGood) input fault was reported as a SRC 11002620 with the wrong FRU callout of Un-P1 for system backplane.  The FRU callout for SRC 11002620 has been corrected to Un-P2 for I/O card.
  • A problem was fixed that caused a memory clock failure to be called out as failure in the processor clock FRU.
  • A problem was fixed that caused unneeded resets of ethernet adapters during logical partition (LPAR) power off or reboots.  The extra resets of the ethernet adapters could cause the network switch to disable the ethernet links if the threshold for maximum number of ethernet adapter resets per minute is exceeded.
System firmware changes that affect certain systems
  • On systems with a F/C 5802 or 5877 I/O drawer installed, a problem was fixed that occurred during Offline Converter Assembly (OCA) replacement operations.  The fix prevents a false Voltage Regulator Module (VRM) fault and the logging of SRCs 10001511 or 10001521.  The false fault resulted in the OCA LED getting stuck in an on or "fault" state and the OCA not powering on.
  • On a system with partitions with redundant Virtual Asynchronous Services Interface (VASI) streams,  a problem was fixed that caused the system to terminate with SRC B170E540.  The affected partitions include Active Memory Sharing (AMS), encapsulated state partitions, and hibernation-capable partitions.  The problem is triggered when the management console attempts to change the active VASI stream in a redundant configuration.  This may occur due to a stream reconfiguration caused by Live Partition Mobility (LPM); reconfiguring from a redundant Paging Service Partition (PSP) to a single-PSP configuration; or conversion of a partition from AMS to dedicated memory.
  • On systems with a redundant service processor, a problem was fixed where the service processor allowed a clock failover to occur without a SRC B158CC62 error log and without a hardware deconfiguration record for the failed clock source.  This resulted in the system running with only one clock source and without any alerts to warn that clock redundancy had been lost.
  • On systems with one memory clock deconfigured, a problem was fixed where the system failed to IPL using the second memory clock with SRCs B158CC62 and B181C041 logged.
  • On a system with a disk device with multiple boot partitions, a problem was fixed that caused System Management Services (SMS) to list only one boot partition.  Even though only one boot partition was listed in SMS, the AIX bootlist command could still be used to boot from any boot partition.
AM740_121_042 / FW740.70

11/14/13
Impact: Availability    Severity: SPE

New features and functions

  • Support was dropped for Secure Sockets Layer (SSL) Version 2 and SSL weak and medium cipher suites in the service processor web server (Lighttpd).  Unsupported web browser connections to the Advanced System Management Interface (ASMI) secured port 443 (using https://) will now be rejected if those browsers do not support SSL version 3.  Supported web browsers for Power7 ASMI are Netscape (version 9.0.0.4), Microsoft Internet Explorer (version 7.0), Mozilla Firefox (version 2.0.0.11), and Opera (version 9.24).
  • Support was added to the Advanced System Management Interface (ASMI) to provide a menu for "Power Supply Idle Mode".  Using the "Power Supply Idle Mode"  menu, the power supplies can be either set enabled to save power by idling power supplies when possible or set disabled to keep all power supplies fully on and allow a balanced load to be maintained on the power distribution units (PDUs) of the system.  Power supply idle mode enabled helps to reduce overall power usage when the system load is very light by having one power supply deliver all the power while the second power supply is maintained in a low power state.

System firmware changes that affect all systems

  • A problem was fixed that caused a service processor dump to be generated with SRC B18187DA "NETC_RECV_ER" logged.
  • A problem was fixed that caused a L2 cache error to not guard out the faulty processor, allowing the system to checkstop again on an error to the same faulty processor.
  • A problem was fixed that caused an HMC code update of the FSP to fail on the accept operation with SRC B1811402, or left the FSP unable to boot on the updated side.
  • A problem was fixed that caused a SRC B181B2C0 and incorrect hardware callout for a GX bus failure on a wire test.  SRC B114C80C with GX location codes is now provided to facilitate repairs for wire test errors.
  • A problem was fixed that caused a 1000911E platform event log (PEL) to be marked as not call home.  The PEL is now a call home to allow for correction.  This PEL is logged when the hypervisor has changed the Machine Type Model Serial Number (MTMS) of an external enclosure to UTMP.xxx.xxxx because it cannot read the vital product data (VPD), the VPD has invalid characters, or the MTMS is a duplicate of another enclosure's MTMS.
  • A problem was fixed that caused a built-in self test (BIST) for GX slots to create corrupt error log values that core dumped the service processor with a B18187DA.  The corruption was caused by a failure to initialize the BIST array to 0 before starting the tests.
  • A problem was fixed that caused the system attention LED to be lit without a corresponding SRC and error log for the event.  This problem typically occurs when an operating system on a partition terminates abnormally.
  • DEFERRED: A problem was fixed that caused a system checkstop during hypervisor time keeping services.  This deferred fix addresses a problem that has a very low probability of occurrence.  As such, customers may wait for the next planned service window to activate the deferred fix via a system reboot.
  • DEFERRED: A problem was fixed that caused a system checkstop with SRC B113E504 for a recoverable hardware fault.  This deferred fix addresses a problem that has a very low probability of occurrence.  As such, customers may wait for the next planned service window to activate the deferred fix via a system reboot.
System firmware changes that affect certain systems
  • On systems with a redundant service processor, a problem was fixed that caused fans to run at a high-speed after a failover to the sibling service processor.
  • On systems in manufacturing default configuration (MDC), a problem was fixed that caused the system to change from MDC to Hardware Management Console (HMC)-managed mode even though the HMC was unable to authenticate to the service processor.   A system must be successfully discovered by a HMC as a prerequisite to becoming HMC-managed. 
  • On systems with a redundant service processor, a problem was fixed that prevented the deconfiguration details of a guarded sibling service processor from being shown in the Advanced System Management Interface (ASMI).
  • On systems with a F/C 5802 or 5877 I/O drawer installed, the firmware was enhanced to guarantee that an SRC will be generated when there is a power supply voltage fault.  If no SRC is generated, a loss of power redundancy may not be detected, which can lead to a drawer crash if the other power supply goes down.  This also fixes a problem that causes an 8 Gb Fibre Channel adapter in the drawer to fail if the 12V level fails in one Offline Converter Assembly (OCA).
  • On systems managed by an HMC with a F/C 5802 or 5877 I/O drawer installed, a problem was fixed that caused the hardware topology on the management console for the managed system to show "null" instead of "operational" for the affected I/O drawers.
  • On systems with a redundant service processor, a problem was fixed that caused a SRC B150D15E to be erroneously logged after a failover to the sibling service processor.
  • On systems with turbo-core enabled that are a target of a Live Partition Mobility (LPM) operation, a problem was fixed where cache properties were not recognized and SRCs BA280000 and BA250010 were reported.
  • A problem was fixed in the run-time abstraction services (RTAS) extended error handling (EEH) for fundamental reset that caused partitions to crash during adapter updates.  The fundamental reset of adapters now returns a valid return code.  The adapter drivers using fundamental reset affected by this fix are the following:
    o QLogic PCIe Fibre Channel adapters (combo card)
    o IBM PCIe Obsidian
    o Emulex BE3-based ethernet adapters
    o Broadcom-based PCIe2 4-port 1Gb ethernet
    o Broadcom-based FlexSystem EN2024 4-port 1Gb ethernet for compute nodes
  • On systems with a F/C 5802 or 5877 I/O drawer installed, a problem was fixed where an Offline Converter Assembly (OCA) fault would appear to persist after an OCA micro-reset or OCA replacement.  The fault bit reported to the OS may not be cleared, indicating a fault still exists in the I/O drawer after it has been repaired.
  • On systems involved in a series of consecutive Live Partition Mobility (LPM) operations, a memory leak problem was fixed in the run time abstraction service (RTAS) that caused a partition run time AIX crash with SRC 0c20.  Other possible symptoms include error logs with SRC BA330002 (RTAS memory allocation failure).
Concurrent hot add/repair maintenance (CHARM) firmware fixes
  • A problem was fixed that caused a concurrent hot add/repair maintenance operation to fail on an erroneously logged error for the service processor battery with  SRCs B15A3303, B15A3305, and  B181EA35 reported.
  • A problem was fixed that caused SRC  B15A3303  to be erroneously logged as a predictive error on the service processor sibling after a successful concurrent repair maintenance operation for the real-time clock (RTC) battery.
AM740_112_042 / FW740.61

07/25/13
Impact: Availability    Severity: SPE

System firmware changes that affect all systems

  • A problem was fixed that caused a migrated partition to reboot during transfer to a target system running VIOS 2.2.2.0 or later.  A manual reboot would be required if transferred to a target system running an earlier VIOS release.  Migration recovery may also be necessary.
  • A problem was fixed that can cause Anchor (VPD) card corruption and A70047xx SRCs to be logged.  Note: If a serviceable event with SRC A7004715 is present or was logged previously, damage to the VPD card may have occurred.  After the fix is applied, replacement of the Anchor VPD card is recommended in order to restore full redundancy.
AM740_110_042 / FW740.60

04/30/13
Impact: Serviceability    Severity: ATT

New features and functions

  • Support for booting an IBM i partition from a USB flash drive.

System firmware changes that affect all systems

  • A problem was fixed that prevented predictive guard errors from being deleted on the secondary service processor.  This caused hardware to be erroneously guarded out if a service processor failover occurred, then the system was rebooted.
  • A problem was fixed that prevented the system attention indicator from being turned off when a service processor reset occurred.
  • A problem was fixed that caused SRC B1813221, which indicates a failure of the battery on the service processor, to be erroneously logged after a service processor reset or power cycle.
  • A problem was fixed that caused various parts to be erroneously guarded out in some cases, and the clock card to be called out as defective in other cases, when both AC cords providing power to a drawer were unplugged when the system was powered on.
  • A problem was fixed that caused the Advanced System Management Interface (ASMI) to produce a service processor dump when changing the admin user password.
  • A problem was fixed that caused various SRCs to be erroneously logged at boot time including B181E6C7 and B1818A14.
  • A problem was fixed that caused a card (and its children) that was removed after the system was booted to continue to be listed in the guard menus in the Advanced System Management Interface (ASMI).
  • A problem was fixed that caused system fans to be erroneously called out as failing with one or more of the following SRCs: 11007610, 11007620, 11007630, 11007640, or 11007650.
  • A problem was fixed that caused the service processor to crash when it boots from the new level during a concurrent firmware installation.
  • A problem was fixed that caused the management console to display incorrect data for a virtual Ethernet adapter's transactions statistics.
  • A problem was fixed that caused a hibernation resume operation to hang if the connection to the paging space is lost near the end of the resume processing.  This is more likely on a partition that supports remote restart.
  • A problem was fixed that caused the system power to be throttled, resulting in decreased performance.  This problem typically occurs after a PCI adapter is plugged into a node (CEC drawer), and can also happen when a dedicated I/O partition is powered on or off.
  • A problem was fixed that caused the system to terminate with a bad address checkstop during mirroring defragmentation.
  • A problem was fixed that caused the hibernation validation of a remote restart partition operation to fail with an "NvRam size error".  This also affects the capability to migrate the partition.
  • The Power Hypervisor was enhanced to ensure better synchronization of vSCSI and NPIV I/O interrupts to partitions.
  • A problem was fixed that caused SRCs B70069F4 and B130E504 to be erroneously logged when a system was powered down.  This also results in I/O hardware being guarded out, and the hypervisor is not able to "unguard" the I/O hardware at runtime.
  • A problem was fixed that was caused by an attempt to modify a virtual adapter from the management console command line when the command specifies it is an Ethernet adapter, but the virtual ID specified is for an adapter type other than Ethernet.  The managed system has to be rebooted to restore communications with the management console when this problem occurs; SRC B7000602 is also logged.
  • The Hypervisor was enhanced to allow the system to continue to boot using the redundant Anchor (VPD) card, instead of stopping the Hypervisor boot and logging SRC B7004715,  when the primary Anchor card has been corrupted.
  • A problem was fixed that caused an error log generated by the partition firmware to show conflicting firmware levels.  This problem occurs after a firmware update or a Live Partition Mobility (LPM) operation on the system.
System firmware changes that affect certain systems
  • On systems with I/O towers attached, a problem was fixed that caused SRC 10009135, followed by SRC 10009139, to be logged, indicating that SPCN loop mode was being broken, then reestablished.
  • On systems with I/O towers attached, a problem was fixed that caused multiple service processor reset/reloads if the tower was continuously sending invalid System Power Control Network (SPCN) status data.
  • On partitions with the virtual Trusted Platform Module (vTPM) enabled, a problem was fixed that caused a memory leak, and failure, when vTPM was disabled, a vTPM-enabled partition was migrated, or a vTPM-enabled partition was deleted.
  • On systems running multiple IBM i partitions that are configured to communicate with each other via virtual Opticonnect, and Active Memory Sharing (AMS), a problem was fixed that may cause AMS operations to time out.  When this problem occurs, a platform reboot may be required to recover.
  • On a partition with the virtual Trusted Platform Module (vTPM) enabled, a problem was fixed that caused the partition to stop functioning after certain operations.  When this problem occurs, the client partition may not power off.
  • When switching between turbocore and maxcore mode, a problem was fixed that caused the number of supported partitions to be reduced by 50%.
  • On systems running Active Memory Sharing (AMS) partitions, a problem was fixed that may arise due to the incorrect handling of a return code in an error path during the Live Partition Mobility (LPM) of an AMS partition.
  • On systems using IPv6 addresses, the firmware was enhanced to reduce the time it takes to install an operating system using the Network Installation Manager (NIM).
  • On systems with F/C EU07, the RDX SATA internal docking station for removable disk cartridge, a problem was fixed that caused SRCs BA210000 and BA210003 to be logged, and the System Management Services (SMS) menu firmware to drop into the ok> prompt, when the default boot list was built.
  • A problem was fixed that caused SRC BA330000 to be logged after the successful migration of a partition running Ax740_xxx firmware to a system running Ax760, or a later release, of firmware.  This problem can also cause SRCs BA330002, BA330003, and BA330004 to be erroneously logged over time when a partition is migrated from a system running Ax760, or a later release, to a system running Ax740_xxx firmware.
  • On systems running an IBM i partition, a problem was fixed that caused the partition boot to succeed only after a long delay, or to fail, if a mode D boot attempt was made, more than one USB device was attached, and the IBM i operating system (OS) image was on the second USB device.
  • On systems running an IBM i partition, a problem was fixed that caused a number of informational SRC BA09000F entries to be logged when a mode D partition boot is done.  This SRC is logged if a device that supports removable media is installed and the media is not present.
  • On systems with redundant service processors, a problem was fixed that caused the sibling service processor state to show up as "unknown" in the service processor error log if a code synchronization problem was detected after a service processor was replaced.
  • On systems running Active Memory Sharing (AMS) partitions, a timing problem was fixed that may occur if the system is undergoing AMS pool size changes.
Concurrent hot add/repair maintenance (CHARM) firmware fixes
  • A problem was fixed that caused SRC 11001512 or 11001522 to be erroneously logged against a node that was added or removed during a concurrent hot add/repair maintenance operation.
  • A problem was fixed that caused SRC B15738B0 to be erroneously logged against the target node during the node-level deactivation phase of a concurrent hot add/repair maintenance operation.
  • A problem was fixed that caused CHARM operations to fail when a memory channel failure is followed by a service processor reset/reload (which is caused by a firmware installation, for example).
  • On systems with a GX++ 2-port PCIe2 x8 adapter, feature code (F/C) 1914, a problem was fixed that caused the location code of the adapter to be incorrect in the operating system after a hot repair of the adapter.
  • On systems in which there are no processors in the shared processor pool, a problem was fixed that caused the Hypervisor to become unresponsive (the service processor starts logging time-out errors against the Hypervisor, and the HMC can no longer talk to the Hypervisor) during a concurrent hot add/repair maintenance operation.
  • A problem was fixed that caused a hypervisor memory leak during a concurrent hot add/repair maintenance operation.
  • A problem was fixed that caused a concurrent node repair or upgrade to fail during the system deactivation step with a Hypervisor error code of 0x300.
  • A problem was fixed that caused the system to hang if memory relocation is performed during a concurrent hot add/repair maintenance operation.
  • A problem was fixed that caused partition activations to fail during or after a node repair operation.
  • A problem was fixed that caused synchronization problems in an application using the Barrier Synchronization Register (BSR) facility during the memory relocation that occurs in a concurrent hot add/repair maintenance operation.
  • On systems with an IBM i partition and performance data collection enabled, a problem was fixed that caused SRC B170E540 to be logged during a hot repair of a GX++ 2-port PCIe2 x8 adapter, feature code (F/C) 1914.
  • A problem was fixed that prevented the I/O slot information from being presented on the management console after a concurrent node repair.
  • On systems running multiple IBM i partitions that are configured to communicate with each other via virtual Opticonnect, a problem was fixed that could cause concurrent hot add/repair maintenance operations to time out.  When this problem occurs, a platform reboot may be required to recover.
AM740_100_042

12/05/12
Impact: Serviceability    Severity: ATT

System firmware changes that affect all systems

  • A problem was fixed that can cause fans in the server to run at maximum speed and generate a serviceable event during system boot (B130B8AF, a predictive error with hardware callout) as a result of an incorrect calibration of a particular thermal sensor.
AM740_098_042

11/28/12
Impact: Availability    Severity: SPE

System firmware changes that affect all systems  

  • HIPER/Non-Pervasive: DEFERRED:  A problem was fixed that caused a system crash with SRC B170E540.
  • HIPER/Non-Pervasive:  A related problem was also fixed that could cause a live lock on the power bus resulting in a system crash.
  • DEFERRED:  A problem was fixed that caused an uncorrectable error (SRC B123E504) to be erroneously logged when 64GB DIMMs were installed in a system that already had 16GB or 32GB DIMMs.
  • To address poor placement of partitions following a reboot of a server with unlicensed cores, the firmware was enhanced to run the affinity manager when the initialize configuration operation is done from the HMC.  A problem was also fixed that caused the hypervisor to be left in an inconsistent state after a partition create operation failed.
Concurrent hot add/repair maintenance (CHARM) firmware fixes
  •   A problem was fixed that caused a CHARM operation to fail after this sequence of events:

           1.  User-initiated platform system dump is requested (from ASMI or HMC).
           2.  Service processor reset/reload takes place while dump collection is in progress.
           3.  User attempts a CHARM operation.
AM740_095_042

09/19/12
Impact: Availability    Severity: SPE

New features and functions

  • Support for booting the IBM i operating system from a USB tape drive.

System firmware changes that affect all systems

  • The firmware was enhanced to correctly diagnose the failing FRU when SRC B1xxE504 with error signature "MCFIR[14] - Hang timer detector" was logged.
  • A problem was fixed that caused the system to crash after a recoverable error was logged on an I/O hub.
  • A problem was fixed that caused a "code accept" during a concurrent firmware installation from the HMC to fail with SRC E302F85C.
  • The firmware was enhanced to continue booting when SRC B181C803 with description "WIRE_PROC_CST_HW_FAIL" is logged during boot.
  • A problem was fixed that caused the suspension of a partition to fail if a large amount of data has to be stored to resume the partition.
  • A problem was fixed that caused a system crash with unrecoverable SRC B7000103 and "ErFlightRecorder" in the failing stack.
  • A problem was fixed that caused an external interrupt to get stuck for some period of time before being presented to the operating system in certain scenarios in which there is a high rate of interrupts.
System firmware changes that affect certain systems
  • On systems on which Internet Explorer (IE) is used to access the Advanced System Management Interface (ASMI) on the Hardware Management Console (HMC), a problem was fixed that caused IE to hang for about 10 minutes after saving changes to network parameters on the ASMI.
  • A problem was fixed that caused a network installation of IBM i to fail when the client was on the same subnet as the server.
  • On systems with a 5796 or 5797 I/O drawer attached, a problem was fixed that could cause a system hang.
  • On systems with I/O drawers feature code (F/C) 5802 or 5877 attached, and running the Active Energy Manager, a problem was fixed that caused SRC B7000602 to be erroneously logged.
Concurrent hot add/repair maintenance (CHARM) firmware fixes
  •  A problem was fixed that caused a concurrent hot repair operation to fail with the message:  "Failed to deactivate system resources for FRU at Uxxxx.yyy.zzzzzzz. The hypervisor reported the following error: The request failed with PhypRc=807."
AM740_088_042

05/25/12
Impact: Availability    Severity: SPE  

New features and functions

  • Support for IBM i Live Partition Mobility (LPM)
  • Support for the EXP30 Ultra SSD I/O Drawer, feature code (F/C) 5888.

System firmware changes that affect all systems

  • A problem was fixed that prevented the user from changing the boot mode or keylock setting after a remote restart-capable partition is created, even after the partition's paging device is on-line.
  • A problem was fixed that caused a partition with dedicated processors to hang with SRC BA33xxxx when rebooted, after it was migrated using a Live Partition Mobility (LPM) operation from a system running Ax730 to a system running Ax740, or vice-versa.
  • A problem was fixed that caused the service processor's eth0 or eth1 IP addresses to change to "IPv6 NA"  when viewed on the control (operator) panel when a laptop was connected to the service processor.
  • A problem was fixed that caused a system to crash when the system was in low power (or safe) mode, and the system attempted to switch over to nominal mode.
  • A problem was fixed that caused booting from a virtual fibre channel tape device to fail with SRC B2008105.
  • The firmware was enhanced to increase the threshold of soft NVRAM errors on the service processor to 32 before SRC B15xF109 is logged.  (Replacement of the service processor is recommended if more than one B15xF109 is logged per week.)
  • A problem was fixed that caused informational SRC A70047FF, which may indicate that the Anchor (VPD) card should be replaced, to be erroneously logged again after the Anchor card was replaced.
  • A problem was fixed that caused the lsstat command on the HMC to display an erroneously high number of packets transmitted and received on a vlan interface.
System firmware changes that affect certain systems
  • The firmware resolves undetected N-mode stability problems and improves error reporting on the feature code (F/C) 5802 and 5877 I/O drawer power subsystem.
  • On systems on which the service processor is using IPv6 Ethernet addresses, a problem was fixed that caused a service processor dump to be taken with SRC B181EF88.
  • On systems running the virtual Trusted Platform Module (vTPM), a problem was fixed that caused the system to crash when the vTPM adjunct was reset.
  • On 8231-E1C, 8231-E2C, 8202-E4C and 8205-E6C systems running IBM i partitions, a problem was fixed that prevented slots on the same PCI bus from being assigned to different partitions.  This can result in SRC B600690B being logged when a partition is booted.
  • A problem was fixed that caused various operations to hang, such as concurrent hot add/repair maintenance (CHARM) operations, running lsvpd from a partition, or a concurrent firmware installation.
Concurrent hot add/repair maintenance (CHARM) firmware fixes
  •  A problem was fixed that caused a hot node repair operation to fail with PhypRc=0x0300, indicating the deactivate system resource operation failed.
  • On systems running the virtual Trusted Platform Module (vTPM), a problem was fixed that caused SRC B400F104, and possibly SRC BA54504D, to be erroneously generated during a node repair operation.
AM740_077_042

03/06/12
Impact: Availability    Severity: HIPER - High Impact/PERvasive, Should be installed as soon as possible.

System firmware changes that affect all systems

  • The firmware was enhanced to log SRC B7006A72 as informational instead of predictive.  This will prevent unnecessary service actions on PCIe adapters and the associated I/O planars.  This problem was also causing unnecessary service actions on systems with the Integrated Multifunction Cards:
    - F/C 1768, the integrated dual 10 Gb copper + dual 1 Gb Ethernet, and
    - F/C 1769, the integrated dual 10 Gb optical + dual 1 Gb Ethernet
    Dual 10 Gb Optical + Dual 1 Gb Ethernet (#1769) (Sales Manual description)
    Dual 10 Gb Copper + Dual 1 Gb Ethernet (#1768) (Sales Manual description)
  • On systems running system firmware level AM740_075, a problem was fixed that prevented Hardware Management Console (HMC) authentication to a managed system in  the "Pending Authentication" state, and prevented the Advanced System Management Interface (ASMI) admin user's password from being changed.
AM740_075_042

02/20/12
Impact: Availability    Severity: HIPER - High Impact/PERvasive, Should be installed as soon as possible.

New features and functions

  • Support for concurrent hot add/repair maintenance (CHARM) operations on models MHC and MMC.

System firmware changes that affect all systems

  • A problem was fixed that caused multiple service processor dumps to be unnecessarily taken during a concurrent firmware update.  SRC B181EF9A, which indicates that the dump space on the service processor is full, was logged as a result.
  • A problem was fixed that caused SRCs B181843C and B181EF88 to be logged erroneously, and a service processor dump to be generated unnecessarily.
  • The firmware was enhanced to increase the threshold for recoverable SRC B113E504 so that the processor core reporting the SRC is not guarded out.  This prevents unnecessary performance loss and the unnecessary replacement of processor modules.
  • A problem was fixed that caused SRCs B7006790 and B7006A21 to be erroneously logged.
  • A problem was fixed that caused SRCs 11001512 and 11001522 to be erroneously logged during a field replaceable unit (FRU) replacement or service processor reset.
  • A problem was fixed that caused SRC B18138B4 to be erroneously logged when the system is rebooted.
  • The firmware was enhanced to provide a more complete list of field replaceable units (FRUs) for SRCs B1xxC803, B1xxC804, and B1xxC829.
  • A problem was fixed that prevented a node from being deconfigured manually using the  Advanced System Management Interface (ASMI).
  • A problem was fixed that caused the system to fail to boot with SRC B1xxB507.
  • A problem was fixed that caused system fans to be erroneously called out as failing.
  • A problem was fixed that caused SRC B7000602 to be erroneously logged at power on.

System firmware changes that affect certain systems

  • HIPER/Non-Pervasive:  On systems with PCI adapters in a feature code (F/C) 5802 or 5877 I/O drawer assigned to a Virtual I/O Server (VIOS), and on systems with the I/O adapters in a CEC drawer assigned to a VIOS, a problem was fixed that caused the system to crash with SRC B700F103.
  • HIPER/Non-Pervasive:  On systems running the Trusted Boot feature of the PowerSC Standard Edition, a problem was fixed that caused the system to hang.
  • A problem was fixed that impacted performance if profiling was enabled in one or more partitions.  Performance profiling is enabled:
    - In an AIX or VIOS partition using the tprof (-a, -b, -B, -E option) command or pmctl (-a, -E option) command.
    - In an IBM i partition when the PEX *TRACE profile (TPROF) collections or PEX *PROFILE collections are active.
    - In a Linux partition using the perf command, which is available in RHEL6 and SLES11; profiling with oprofile does not cause the problem.
  • On systems running the Trusted Boot feature of the PowerSC Standard Edition, a problem was fixed that prevented an inactive partition from being migrated when the partition did not have enough memory to boot.  The migration of an inactive partition should be allowed in this case.
  • On systems running the Virtual I/O Server (VIOS), a problem was fixed that caused SRCs HSCL294C and HSCLB308 to be logged on the management console, and the operation to fail, if an attempt was made to add Virtual Station Interface (VSI) configuration information to a virtual Ethernet adapter that was already running.
  • A problem was fixed that prevented the operating system from being notified that a F/C 5802 or 5877 I/O drawer had recovered from an input power fault (SRC 10001512 or 10001522).
  • On systems running Active Memory Sharing (AMS), a problem was fixed that caused an "Error in Pager device driver" message to be erroneously logged during a successful partition migration.
  • The firmware was enhanced to reduce the time between the completion of partition migration and the target system's reporting that the migration is complete.
  • On systems on which the Active Energy Manager (AEM) is running, a problem was fixed that caused the AEM to report that the standby power usage of a system was 0 watts.
  • On systems with more than one drawer and service processor failover enabled, a problem was fixed that caused SRCs  B121C770 and B150B10C to be erroneously logged.
  • On systems running Active Memory Sharing (AMS), the allocation of the memory was enhanced to improve performance.
  • On systems with Active Memory Mirroring (AMM) configured, a problem was fixed that caused a Live Partition Mobility (LPM) operation to fail.
  • On systems using affinity groups, a problem was fixed that prevented one of the partitions from being placed correctly.
  • On systems or logical partitions with a large number of virtual processors, a performance problem was fixed that prevented the utilization of the entitled capacity of partitions. 
  • On 9117-MMC and 9179-MHC systems without an optional GX adapter, a problem was fixed that caused the system fans to ramp up to their maximum speed.
  • A problem was fixed that caused the hypervisor to hang during a concurrent operation on a F/C 5802, 5803, 5873 or 5877 I/O drawer.  Recovering from the hypervisor hang required a platform reboot.
AM740_045_042

12/06/11
Impact: Availability           Severity:  HIPER - High Impact/PERvasive, Should be installed as soon as possible.

System firmware changes that affect certain systems

  • HIPER/Pervasive for systems with a F/C 5283 or F/C 5285 InfiniBand adapter:  On systems with a F/C 5283 or F/C 5285 PCIe2 2-port 4X InfiniBand QDR adapter (40Gb), a  problem was fixed that caused the system to crash with SRC B170E540.
  • HIPER/Pervasive for systems running Red Hat 6.1 Linux with 4GB of memory installed, and for systems with a Red Hat 6.1 Linux partition with a max memory size less than 8GB:  On systems running Red Hat 6.1 Linux that are configured with the minimum memory of 4 GB, or that have a Red Hat 6.1 Linux partition with a max memory size attribute of less than 8 GB in its partition profile, a problem was fixed that prevented the I/O adapters from being configured.  This resulted in the adapters being unusable, or it prevented Linux from booting.
AM740_042_042

10/21/11
Impact:  New      Severity:  New

New features and functions

  • Support for the 9117-MMC and 9179-MHC systems.
  • Support for the PCIe2 2-port 4X IB quad-data-rate (QDR) adapter (40Gb), F/C 5285.
  • Support for the PCIe2 8Gb 4-port Fibre Channel adapter, F/C 5729.
  • Support for the connection of an uninterruptible power supply (UPS) to the service processor over Ethernet.
  • Support for Active Memory De-duplication.
  • Support for the Trusted Boot feature of PowerSC Standard Edition.
  • Support for Active Memory Mirroring.
  • Support for VIOS Shared Storage Pools Version 2.0.
  • Support for Linux dedicated mode (including NIC, RDMA, and iSCSI) on these 4-port Ethernet adapters:
    - PCIe2 low profile 4-port 10Gb and 1Gb SFP+copper and RJ45 adapter (F/C 5279)
    - PCIe2 low profile 4-port 10Gb and 1Gb SR and RJ45 adapter (F/C 5280)
    - PCIe2 4-port 10Gb and 1Gb SR&RJ45 adapter (F/C 5744)
    - PCIe2 4-Port 10Gb and 1GbE SFP+copper and RJ45 adapter (F/C 5745)