
The Platform Management Console


Log on to the Platform Management Console

You can use the pmcadmin command to administer the Platform Management Console. For more information, see the Platform LSF Command Reference.

The Platform Management Console (PMC) allows you to monitor, administer, and configure your cluster. To log on, you need the name and password of the LSF administrator.

  1. Browse to the web server URL and log in to the PMC using the LSF administrator name and password.

    The web server URL is:

    http://host_name:8080/platform 
     

    The host name is the PMC host; a complete example URL is shown after this procedure.

    If the PMC is controlled by EGO and you have not specified a dedicated PMC host, the PMC could start on any management host. To find out which host and port it is using, log on to a command console as the LSF administrator and run:

    egosh client view GUIURL_1

    The description part of the command output shows the full URL including the host name and port.

    If PMC is not controlled by EGO, and you do not know the port, log on to the PMC host as LSF administrator and run:

    pmcadmin list

    The command output shows the port you need to use in the URL.
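
    For example, if the PMC host is hostA (a hypothetical host name used here only for illustration) and the port is the default 8080, you would browse to:

    http://hostA:8080/platform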

Set the command-line environment

On Linux hosts, set the environment before you run any LSF or EGO commands. You need to do this once for each session you open. The root, lsfadmin, and egoadmin accounts use LSF and EGO commands to configure and start the cluster.

You need to reset the environment if the environment changes during your session, for example, if you run egoconfig mghost, which changes the location of some configuration files.

If Platform EGO is enabled in the LSF cluster (LSF_ENABLE_EGO=Y and LSF_EGO_ENVDIR are defined in lsf.conf), cshrc.lsf and profile.lsf also set the Platform EGO environment variables.

See the Platform EGO Reference for more information about these variables.

See the Platform LSF Configuration Reference for more information about cshrc.lsf and profile.lsf.
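
For example, you can set the environment by sourcing the file that matches your shell. This is a minimal sketch that assumes the files are in the default location, LSF_TOP/conf; adjust the path for your installation.

    source LSF_TOP/conf/cshrc.lsf    (csh or tcsh)

    . LSF_TOP/conf/profile.lsf       (sh, ksh, or bash)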

Manage services

Determine the host address where the service is running

Prerequisites: You need to know the address of the host where the service director (DNS server) is running (contact your IT department for assistance, if required).

EGO services may not all run on the same management host. You can use nslookup to find the address of the host where a specific service is running.

  1. Using the CLI, type nslookup.
  2. Type server and then enter the IP address of the service director (DNS server).

    nslookup returns the default server and its address. For example,

    > server 172.25.237.37

    Default server: 172.25.237.37

    Address: 172.25.237.37#53

  3. Enter the name of the service for which you want to find the host address (a non-interactive alternative is shown after this procedure). For example,

    > WEBGUI.ego

    Server: 172.25.237.37

    Address: 172.25.237.37#53

    Name: WEBGUI.ego

    Address: 172.25.237.37
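
As an alternative to the interactive session above, you can pass the service name and the service director address to nslookup on a single command line. For example, using the same service name and address:

    nslookup WEBGUI.ego 172.25.237.37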

Troubleshoot service error states

If you receive a service error message, or a message indicating that a service will not transition out of the Allocating state, there are steps you can perform to troubleshoot the issue.

Responding to service message Error

Normally, Platform EGO attempts to start a service multiple times, up to the maximum threshold set in the service profile XML file (containing the service definition). If the service cannot start, you will receive a service error message.

  1. Try stopping and then restarting the service (example commands are shown after this procedure).
  2. Review the appropriate service instance log file to discover the cause of the error.

    Platform EGO service log files include those for the service director (ServiceDirector), the web service gateway (WebServiceGateway), and the Platform Management Console (WEBGUI). If you have defined your own non-EGO services, you may have other log files to review, depending on the service that is triggering the error.
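
For example, to stop and restart the Platform Management Console service from the command line, you could use egosh. This is a sketch that assumes you have already logged on with egosh user logon as the cluster administrator and that the Console service is named WEBGUI, as above:

    egosh service stop WEBGUI

    egosh service start WEBGUI

    egosh service list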

Responding to service message Allocating

Allocating is a transitional service state before the service starts running. If your service remains in this state for some time without transitioning to Started, or cycles between Defining and Allocating, you will want to discover the cause of the delay.

  1. If you are the cluster administrator, review the allocation policy.
    1. Open the service profile XML file (containing the service definition).
    2. Find the consumer under which the service is expected to run.
    3. Ensure that a proper resource plan is set for that consumer.

    During a service's "allocation" period, Platform EGO attempts to find an appropriate resource on which to run the service. If it cannot find the required resource, the service does not start. You can check the service state and the available resources from the command line, as shown after this list.
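
A minimal sketch of that command-line check, assuming egosh is in your path and you are logged on (egosh user logon) as the cluster administrator:

    egosh service list

    egosh resource list

If no host that satisfies the service's resource requirement appears in the output, the service remains in the Allocating state.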

Manage hosts

Important host roles

Hosts in the cluster may be described as the master host, master candidates, management hosts, compute hosts, or the web server host.

Master host

A cluster requires a master host. This is the first host installed. The master host controls the rest of the hosts in the grid.

Master candidates

There is only one master host at a time. However, if the master host ever fails, another host automatically takes over the master host role, allowing work to continue. This process is called failover. When the master host recovers, the role switches back again.

Hosts that can act as the master are called master candidates. This includes the original master host and all hosts that can take over the role in a failover scenario. All master candidates must be management hosts.

Master host failover

During master host failover, the system is unavailable for a few minutes while hosts are waiting to be contacted by the new master.

The master candidate list defines which hosts are master candidates. By default, the list includes just one host, the master host, and there is no failover. If you configure additional candidates to enable failover, the master host is first in the list. If the master host becomes unavailable, the next host in the list becomes the master; if that host is also unavailable, the role passes to the next host, and so on down the list. A short list of two or three hosts is sufficient for practical purposes; an example of defining the list follows below.
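
For example, you can define the candidate list by running egoconfig on the master host. This sketch assumes hostM, hostB, and hostC are hypothetical management hosts that share the required file system; you may need to restart the cluster for the change to take effect.

    egoconfig masterlist "hostM,hostB,hostC"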

For failover to work properly, the master candidates must share a file system and the shared directory must always be available.

important:  
The shared directory should not reside on a master host or any of the master candidates. If the shared directory resides on the master host and the master host fails, the next candidate cannot access the necessary files.
Management host

Management hosts belong to the ManagementHosts resource group. These hosts are not expected to execute workload units for users. Management hosts are expected to run services such as the web server and web services gateway. The master host and all master candidates must be management hosts.

A slot is the basic unit of resource allocation, analogous to a "virtual CPU".

Management hosts share configuration files, so a shared file system is needed among all management hosts.

A management host is configured when you run egoconfig mghost on the host. The tag mg is assigned to the management host, in order to differentiate it from a compute host.
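
For example, to configure a host as a management host you could run the following on that host as the cluster administrator, where /share/ego is a hypothetical directory on a file system shared by all management hosts:

    egoconfig mghost /share/ego

Because this command changes the location of some configuration files, reset your command-line environment afterward, as described earlier.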

Compute host

Compute hosts are distributed to cluster consumers to execute workload units. By default, compute hosts belong to the ComputeHosts resource group.

The ComputeHosts group excludes hosts with the mg tag, which is assigned to management hosts when you run egoconfig mghost. If you create your own resource groups to replace ComputeHosts, make sure they also exclude hosts with the mg tag.

By default, the number of slots on a compute host is equal to the number of CPUs.

Web server host or PMC host

The web server host is the host that runs the Platform Management Console; when you configure it, you may call it the PMC host. Only one host at a time acts as the web server host. If EGO controls the PMC, the web server does not need to be a dedicated host; by default, any management host in the cluster can be the web server (decided when the cluster starts up, with failover if the original host fails). However, if EGO does not control the PMC, you must configure the PMC host manually. If you specify the PMC host, there is no failover of the PMC.
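
If EGO does not control the PMC, you manage the Console on the PMC host with the pmcadmin command as the LSF administrator. A minimal sketch, assuming the stop and start subcommands are available in your version of pmcadmin:

    pmcadmin list

    pmcadmin stop

    pmcadmin start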

Customize job submission interfaces

From the Platform Console you can customize LSF application submission functionality, modify provided default applications (such as Fluent and NASTRAN), and add or remove application submission pages from the Platform Console.

Customizing, adding, and removing job submission interfaces requires you to manually change various XML configuration files and scripts.

  1. Modify the submission script to point the APPLICATION_CMD field to the application executable. For example, in LSF_TOP/gui/lsf/7.0/batchgui/plugin/lsf/exec/fluent.cmd, the field is FLUENT_CMD:

    #FLUENT_CMD="sleep 10;/bin/echo" 
    FLUENT_CMD="/pcc/app/fluent/Fluent.Inc/bin/fluent " 
    

Remove a job submission interface

The Platform Console provides some default job submission interface templates. You can remove any of these, or other, interfaces from the organizational tree.

  1. Using an XML editor, open this file: LSF_ENVDIR/gui/cluster_name/conf/navigation/pmc_navigation_jobsubmission.xml
  2. Remove the XML element for the application submission node definition.
    1. Search for <Name>Applications</Name>.
    2. Under this section, delete or add comment markings around the Category section that contains the application you wish to remove. For example:

      <Name>Applications</Name> 
      <Display>Submission</Display> 
      <Description>Submission</Description> 
      <IconPath></IconPath> 
      <Role>1,3</Role> 
      <Categories> 
      <!--Category> 
      <Name>Fluent</Name> 
      <Display>Fluent</Display> 
      <Description></Description> 
      <IconPath>/batchgui/images/icon_treeNode_jobSubmission.gif</IconPath> 
      <Role>1,3</Role> 
      <Viewport-Ref>Fluent</Viewport-Ref> 
      </Category--> 
      <Category>...</Categories> 
      
  3. Click Submit Job in the Console to verify that the removed application no longer appears.
