Performance tuning for large clusters

A number of system and Symphony parameters can be changed from their default settings to improve performance in large, high-utilization clusters. These parameters are defined in configuration files and in the Windows registry. What constitutes a large cluster is somewhat arbitrary; for the purposes of this discussion, a large cluster has more than 20,000 cores and more than 100 applications. Because cluster sizing is not an exact science, smaller clusters may also benefit from adjusting the parameters described here.

System configuration


Each entry below lists the parameter, the registry path or configuration file where it is set, and a description with an example setting.

EnableDynamicBacklog

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\AFD\Parameters\EnableDynamicBacklog

Windows 2003 only. Enables or disables dynamic backlog. The default is 0 (disabled).

Example:

EnableDynamicBacklog=1 (REG_DWORD)

MinimumDynamicBacklog

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\AFD\Parameters\MinimumDynamicBacklog

Windows 2003 only. Controls the minimum number of free connections allowed on a listening endpoint. If the number of free connections drops below this value, additional free connections are created. The default is 0.

Example:

MinimumDynamicBacklog=20 (REG_DWORD decimal)

MaximumDynamicBacklog

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\AFD\Parameters\MaximumDynamicBacklog

Windows 2003 only. Controls the maximum number of free connections and connections in a half-connected (SYN_RECEIVED) state allowed on a listening endpoint. The default is 0.

Example:

MaximumDynamicBacklog=6144 (REG_DWORD decimal)

DynamicBacklogGrowthDelta

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\AFD\Parameters\DynamicBacklogGrowthDelta

Windows 2003 only. Controls the number of free connections that are created when additional connections are necessary. The default is 0.

Example:

DynamicBacklogGrowthDelta=10 (REG_DWORD decimal)
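
For example, on a Windows 2003 host, the four AFD values above could be applied together from an elevated command prompt using reg add (a sketch using the example values; registry changes to AFD parameters normally take effect only after a restart):

reg add "HKLM\SYSTEM\CurrentControlSet\Services\AFD\Parameters" /v EnableDynamicBacklog /t REG_DWORD /d 1 /f
reg add "HKLM\SYSTEM\CurrentControlSet\Services\AFD\Parameters" /v MinimumDynamicBacklog /t REG_DWORD /d 20 /f
reg add "HKLM\SYSTEM\CurrentControlSet\Services\AFD\Parameters" /v MaximumDynamicBacklog /t REG_DWORD /d 6144 /f
reg add "HKLM\SYSTEM\CurrentControlSet\Services\AFD\Parameters" /v DynamicBacklogGrowthDelta /t REG_DWORD /d 10 /f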

MaxUserPort

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\MaxUserPort

Limits the number of dynamic ports available to TCP/IP applications. The default is 5000, which limits VEMKD support to 16K cores. Configuring MaxUserPort to 65534 enables VEMKD to support 20K cores.

Example:

MaxUserPort=65534 (REG_DWORD decimal)

TcpTimedWaitDelay

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\TcpTimedWaitDelay

Determines the time, in seconds, that must elapse before TCP can release a closed connection and reuse its resources.

Example:

TcpTimedWaitDelay=60 (REG_DWORD decimal)
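
The Tcpip values can be applied the same way (again a sketch based on the example values; a restart is required for the new values to take effect):

reg add "HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters" /v MaxUserPort /t REG_DWORD /d 65534 /f
reg add "HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters" /v TcpTimedWaitDelay /t REG_DWORD /d 60 /f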

tcp_window_scaling

/proc/sys/net/ipv4/tcp_window_scaling

For RHEL5.3 x86_64 OS only. Set the value to 0. The default value is 1. With the default configuration, a client that sends a large number of tasks (for example, several hundred thousand) using message aggregation may experience a temporary hang: no tasks are submitted to Symphony from the client for about 15 minutes. After 15 minutes, the client API detects the broken connection with Symphony, automatically reconnects, and continues to send tasks.

No user interaction is required; the problem resolves automatically as described above.

This issue has only been observed on the RHEL5.3 x86_64 OS.
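
For example, on an affected host, window scaling can be disabled immediately and persisted across reboots (a sketch; the persistent entry assumes the standard /etc/sysctl.conf mechanism):

# apply immediately
sysctl -w net.ipv4.tcp_window_scaling=0

# persist across reboots by adding this line to /etc/sysctl.conf
net.ipv4.tcp_window_scaling = 0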


Symphony configuration


Each entry below lists the parameter, the configuration file where it is set, and a description with an example setting.

EGO_DISTRIBUTION_INTERVAL

ego.conf

Lowering this interval improves container startup speed, client response time, and overall cluster performance. The default is 5.

Example:

EGO_DISTRIBUTION_INTERVAL=1

EGO_ENABLE_COMPRESS_STATUS_FILE

ego.conf

Improves VEMKD file operation performance, thereby improving client response time and overall cluster performance. The default is "N".

Example:

EGO_ENABLE_COMPRESS_STATUS_FILE=Y

EGO_DYNAMIC_HOST_WAIT_TIME

ego.conf

Provides compute hosts with a greater chance to be recognized by the master host. The default is 60.

Example:

EGO_DYNAMIC_HOST_WAIT_TIME=60,120

EGO_DATA_MAXSIZE

ego.conf

Specified in Mbytes. Reduces the frequency at which VEMKD switches the stream file, giving the PERF loader sufficient time to finish loading data from the file into the database. Note: Adjust this parameter in conjunction with the PERF egoeventsloader interval. The default is 10.

Example:

EGO_DATA_MAXSIZE=100

egoeventsloader->Interval

plc_ego.xml

Specified in seconds. Sets the interval shorter than the VEMKD stream file switch time so that the loader is able to load all data in the stream file before VEMKD switches it. Note: Adjust this parameter in conjunction with EGO_DATA_MAXSIZE in ego.conf.

Example:

<DataLoader Name="egoeventsloader" Interval="300" Enable="true" LoadXML="dataloader/egoevents.xml" />

EXINTERVAL

ego.cluster.<clustername>

Specified in seconds. Provides VEMKD with a greater chance to obtain load information from the master lim.

Example:

EXINTERVAL=150

MEM_HIGH_MARK and JAVA_OPTS

wsm.conf

Reduces the chance of WEBGUI service restarts in a large cluster, and also helps with generating PERF reports.

Example:

MEM_HIGH_MARK=2048

JAVA_OPTS="-Xms512m -Xmx2048m"

Java maximum heap size

EGO_TOP\perf\1.2.5\etc\plc.bat (Windows)

EGO_TOP/perf/1.2.5/etc/plc.sh (Linux)

Gives more memory to the PLC service for collecting data in a large cluster.

Example: (Windows)

"%JAVA_HOME%\bin\java.exe" -Xms64m -Xmx2048m ...

<ReclamationTimeout>

ConsumerTrees.xml

Reduces the chance of forcible reclaim happening.

Example:

<ReclamationTimeout>300</ReclamationTimeout>

EGO_MAX_CONN

ego.conf

Consider this only when there are more than 5000 physical hosts in a cluster. This configuration allows VEMKD to maintain more connections.

Example:

EGO_MAX_CONN=20000
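
Taken together, the ego.conf settings described in this section might appear as follows on the master host (an illustrative sketch using the example values above; include EGO_MAX_CONN only when the cluster has more than 5000 physical hosts):

EGO_DISTRIBUTION_INTERVAL=1
EGO_ENABLE_COMPRESS_STATUS_FILE=Y
EGO_DYNAMIC_HOST_WAIT_TIME=60,120
EGO_DATA_MAXSIZE=100
EGO_MAX_CONN=20000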

SSM > startUpTimeout

Application profile

When a cluster has many applications (for example, 300) and many of them start their SSMs at the same time, SD may not be able to register all of the SSMs within the default startUpTimeout, and some SSM processes may exit due to startup timeout. The default is 60 seconds. Increase this setting in a cluster with many applications; 300 seconds is suggested for clusters with 300 applications.

Example:

<SSM resReq="" workDir="${EGO_SHARED_TOP}/soam/work" startUpTimeout="300" shutDownTimeout="300">