The host scavenging feature adds hosts to the cluster that run cluster work only when they are otherwise idle. Local users on these hosts are not interrupted: the cluster uses a host only while no one is using it locally. When a user starts using the host again, the host is closed to the cluster and the work runs on other hosts.
A scavenging agent, elim.sa, is included in the EGO package and deployed to hosts during installation. However, this scavenging agent is disabled by default.
An administrator enables the host scavenging agent on selected hosts.
An administrator creates a special scavenging resource group and adds the scavenge-ready hosts. This separates opportunistic (scavenge-ready) hosts from dedicated hosts used deterministically by the cluster.
Once the scavenging agent is enabled, it monitors the local load information and dynamically opens the local host for scavenging or closes it and reclaims the work.
When the scavenging agent closes the host and reclaims the work, the host no longer qualifies for allocation to any consumer until it is opened again. If the scavenging agent closed the host, it reopens the host automatically once the host is no longer busy, as determined by configurable threshold values.
The host scavenging feature requires a resource group of scavenge-ready hosts. You set this resource group to exclude management hosts and include hosts with the static resource tag "scvg". Once set up, any new host added to the cluster with the resource tag "scvg" automatically joins this resource group.
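The membership rule for the scavenge resource group can be sketched as a simple filter. This is an illustration only, not EGO's resource group syntax: the Host class, its fields, and the function names are hypothetical, and only the rule itself (exclude management hosts, include hosts tagged "scvg") comes from the text above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Host:
    # Hypothetical host record; EGO's real host attributes are not modeled here.
    name: str
    is_management: bool = False
    tags: frozenset = frozenset()

def in_scavenge_group(host):
    """Rule from the text: not a management host, and tagged "scvg"."""
    return not host.is_management and "scvg" in host.tags

def scavenge_group(hosts):
    """Return the names of the hosts that would join the scavenge resource group."""
    return [h.name for h in hosts if in_scavenge_group(h)]

hosts = [
    Host("mgmt01", is_management=True, tags=frozenset({"scvg"})),
    Host("desk01", tags=frozenset({"scvg"})),
    Host("comp01"),
]
members = scavenge_group(hosts)  # only desk01 satisfies both conditions
```

Because the rule depends only on the host's own attributes, a host added later with the "scvg" tag satisfies it without any change to the group definition, which matches the automatic joining described above.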
The scavenge resource group must be the last resource group created.
By default, when the scavenging agent opens the local host, it also sets the OS process priority of any future grid workload to the lowest priority. This setting can be changed only with help from Platform Computing, and changing it is not recommended.
Normal process priority: When set to normal, EGO allocates resources to run workload at normal process priority as controlled by the OS.
Lowest process priority: When set to this priority level, EGO allocates resources to run workload at the lowest process priority as controlled by the OS (on Windows, this corresponds to IDLE_PRIORITY_CLASS).
This feature is enabled by running the following commands.
Scavenge-ready hosts need both the scavenge resource tag and the agent control flag set.
Scavenge resource tag (scvg): Marks a host as scavenge-ready and allows it to be identified with a scavenge resource group.
Agent control (agent_control): Enables or disables the local scavenging agent. The value can be on, fastrelease, or off. Enabling the scavenging agent lets it monitor whether the host is busy or idle.
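The two requirements can be checked together, as in the sketch below. The function name and the messages are hypothetical. The three agent_control values come from the text; treating fastrelease as an enabled state alongside on is an assumption.

```python
# The three documented values of agent_control.
VALID_AGENT_CONTROL = {"on", "fastrelease", "off"}

def scavenge_readiness_problems(tags, agent_control):
    """Return a list of reasons a host is not scavenge-ready (empty if it is ready).

    Assumption: "fastrelease" counts as enabled, the same as "on".
    """
    problems = []
    if "scvg" not in tags:
        problems.append("missing scvg resource tag")
    if agent_control not in VALID_AGENT_CONTROL:
        problems.append(f"invalid agent_control value: {agent_control!r}")
    elif agent_control == "off":
        problems.append("scavenging agent is disabled (agent_control is off)")
    return problems

ready = scavenge_readiness_problems({"scvg"}, "on")       # no problems
untagged = scavenge_readiness_problems(set(), "on")       # tag is missing
disabled = scavenge_readiness_problems({"scvg"}, "off")   # agent is off
```

Returning the list of problems rather than a single boolean makes it easier to see which of the two settings is missing on a given host.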
See the topic Enable host scavenging for all of the steps required to set up this feature.
When the scavenging agent detects that the host is busy, it closes the host. The running workload is terminated after a grace period and the host is prevented from further allocation.
The host status changes to closed and the reason indicates that the scavenging agent closed the host.
Note that the reclaim grace period set for a consumer does not apply when a scavenge-ready host is configured using the fastrelease command option.
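The close behavior can be sketched as follows. This is a reading of the description above, not the agent's actual logic: the function names, the returned fields, and the fastrelease_grace_s parameter are hypothetical. The text says only that the consumer's reclaim grace period does not apply with fastrelease; what applies instead is not stated, so the value of 0 used here is an assumption.

```python
def effective_grace_seconds(agent_control, consumer_reclaim_grace_s, fastrelease_grace_s=0):
    """Grace period before running workload is terminated on a closing host.

    Assumption: with "fastrelease" the consumer's reclaim grace period is
    ignored and fastrelease_grace_s (hypothetical, default 0) is used instead.
    """
    if agent_control == "fastrelease":
        return fastrelease_grace_s
    return consumer_reclaim_grace_s

def close_host(agent_control, consumer_reclaim_grace_s):
    """Describe the result of the agent closing a busy host."""
    return {
        "status": "closed",
        "reason": "closed by scavenging agent",
        "terminate_workload_after_s": effective_grace_seconds(
            agent_control, consumer_reclaim_grace_s
        ),
    }

normal = close_host("on", consumer_reclaim_grace_s=120)
fast = close_host("fastrelease", consumer_reclaim_grace_s=120)
```

In both cases the host ends up closed with the same reason; only the delay before the workload is terminated differs.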
The scavenging agent opens a host when all three of the following configurable thresholds indicate that a host is not busy.
The combination of these three thresholds being reached triggers a host to be opened and ready for opportunistic workload.
When the host starts being used locally, the threshold values are no longer met and the scavenging agent closes the host and reclaims the workload.
Once the thresholds are reached again (indicating that the host is not busy once more), the host is automatically opened again.
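The open and close cycle can be sketched as a check that all three thresholds are satisfied at once. The metric names and default values below are hypothetical placeholders; this sketch does not reproduce the agent's actual thresholds or their defaults. Only the rule that the host opens when all three indicate an idle host, and closes otherwise, comes from the text.

```python
from dataclasses import dataclass

@dataclass
class Thresholds:
    # Hypothetical metrics and values, for illustration only.
    max_cpu_util: float = 0.30       # CPU utilization must be at or below this
    min_user_idle_min: float = 5.0   # minutes without local user input
    min_cpu_idle_min: float = 5.0    # minutes the CPU has stayed idle

@dataclass
class Load:
    cpu_util: float
    user_idle_min: float
    cpu_idle_min: float

def host_is_idle(load, t):
    """True only when all three thresholds indicate the host is not busy."""
    return (
        load.cpu_util <= t.max_cpu_util
        and load.user_idle_min >= t.min_user_idle_min
        and load.cpu_idle_min >= t.min_cpu_idle_min
    )

def next_state(load, t):
    """Open the host when idle; otherwise close it (and reclaim the work)."""
    return "open" if host_is_idle(load, t) else "closed"

t = Thresholds()
samples = [
    Load(0.05, 10.0, 10.0),  # idle on all three measures
    Load(0.80, 0.0, 0.0),    # local user is active
    Load(0.05, 10.0, 2.0),   # only two of three thresholds met
    Load(0.10, 6.0, 6.0),    # idle again
]
trace = [next_state(s, t) for s in samples]
```

The third sample shows why the combination matters: meeting two thresholds is not enough, so the host stays closed until all three are met.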
You can modify the default threshold values that determine when the scavenging agent opens and closes the scavenged host.
Setting the cluster to reclaim before borrowing makes sure that scavenged hosts are borrowed by other consumers only after all their own resources are reclaimed and used up.
It is a best practice to configure the cluster in this way when using the host scavenging feature.
Not applicable. There are no submission commands that affect host scavenging.
Host scavenging works only on hosts that have both agent control set to on and the scavenge resource tag (scvg) applied. If either one is missing, the feature does not work properly.