Owning and borrowing resources

Ownership

When consumers “own” resources, they are guaranteed a minimum allocation of resources, regardless of competition from other consumers. Ownership is expressed as a numeric quantity.

Ownership is optional. A consumer may not own any resources yet still use cluster resources allocated to them through borrowing. Consumers can choose to lend idle resources.

Overallocating at the cluster level

At the cluster level (the top level of your resource plan only), you can allocate more slots than are currently listed as available. Do so only when resources that normally belong to the cluster are unavailable at the time you modify the resource plan, but will be available during the time interval you are overallocating for.

Attention:

Only overallocate ownership at the cluster level if you know that resources will be available during the time interval you are setting it for.

Lending resources

Lending is optional. You can enable lending only for leaf consumers who own resources (there are no lend settings available for non-leaf consumers in the resource plan). During periods of low demand, a consumer's resources can be lent to other consumers who have an unsatisfied demand. This kind of resource lending/borrowing relationship between consumers improves the efficiency of the cluster. Without lending, owned resources cannot be shared with other consumers and idle resources are wasted.

Owned resources that are not being used and that have lending enabled get allocated to consumers who have an unsatisfied demand. Qualifying resources are lent in the order of configured consumer rank. For example, in the case where a consumer has resources available to lend, and there are competing consumers with unsatisfied demand, this is what would happen:

  • First, the borrowing consumer with the highest assigned consumer rank is allocated as many resources as are available until its demand is satisfied or until its configured borrowing limit is reached.

  • Then, any surplus resources are assigned to the competing consumer with the next highest consumer rank.

  • The allocation continues down the line of consumer rank until all qualifying resources are allocated or all consumer demands are satisfied.
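The rank-ordered allocation described above can be sketched as follows. This is a minimal illustration, not EGO's implementation: the `Borrower` type and `lend_by_rank` function are hypothetical names, and the sketch assumes that rank 1 is the highest rank (reverse the sort if your resource plan uses the opposite convention).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Borrower:
    name: str
    rank: int                           # assumption: 1 is the highest rank
    demand: int                         # unsatisfied demand, in slots
    borrow_limit: Optional[int] = None  # None means no borrowing limit


def lend_by_rank(lendable: int, borrowers: list[Borrower]) -> dict[str, int]:
    """Hand out lendable slots to competing borrowers, highest rank first."""
    allocation = {}
    for b in sorted(borrowers, key=lambda b: b.rank):
        # A borrower receives slots until its demand or its limit is reached.
        cap = b.demand if b.borrow_limit is None else min(b.demand, b.borrow_limit)
        granted = min(cap, lendable)
        allocation[b.name] = granted
        lendable -= granted  # any surplus moves to the next rank
    return allocation


competing = [
    Borrower("A", rank=1, demand=4),
    Borrower("B", rank=2, demand=8, borrow_limit=5),
    Borrower("C", rank=3, demand=3),
]
print(lend_by_rank(10, competing))  # {'A': 4, 'B': 5, 'C': 1}
```

In this example, consumer B stops at its borrowing limit of 5 even though it wants 8 slots, which leaves one slot for consumer C.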

Lending can occur between consumer branches in the consumer tree, and is not restricted to leaf consumers from the same consumer branch. However, through advanced refinement of the resource plan, leaf consumers can be configured to only lend to and borrow from their siblings.

EGO reclaims resources from a borrowing consumer and returns them to the lending consumer as soon as the lending consumer has an unsatisfied demand. Although ownership of resources guarantees access to them at any time, preconfigured reclaim grace periods may delay the recovery of lent resources. When a cluster or consumer administrator sets the reclaim grace period for a consumer, they should consider the length of a typical workload unit potentially run by a borrowing consumer, along with the urgency of workload units that need to be done by a lending consumer that must reclaim its resources.

Reserving resources by setting a lending limit

A consumer has the flexibility to enable lending on all of its owned resources, or on only a few; those resources without lending enabled are reserved solely for use by the leaf consumer that owns them. The reserved resources do not qualify for lending and are never lent out, even if unused. The lending limit is expressed as a numeric quantity.
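As a rough illustration of how a lending limit reserves resources, the sketch below computes how many idle slots qualify for lending. The function name is hypothetical, and the sketch assumes that the owner's own workload is counted against its reserved slots first, which this document does not state.

```python
def lendable_resources(owned: int, in_use: int, lending_limit: int) -> int:
    """Idle owned slots that qualify for lending under a lending limit."""
    idle = max(owned - in_use, 0)
    # Never lend more than the limit; slots beyond it stay reserved.
    return min(idle, lending_limit)


# A consumer owns 10 slots and allows lending on 4 of them.
print(lendable_resources(owned=10, in_use=3, lending_limit=4))  # 4
print(lendable_resources(owned=10, in_use=8, lending_limit=4))  # 2
```

In the first call, seven slots are idle but only four can be lent; the remaining three stay reserved for the owner even though they are unused.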

Borrowing and sharing

Borrowing refers to the temporary allocation of owned resources from a lending consumer or the share pool to a consumer with an unsatisfied demand.

Sharing refers to the temporary allocation of unowned resources from a “share pool” to a consumer with an unsatisfied demand.

Any client can make use of unused, owned resources that are enabled for lending. The only unused resources that cannot be borrowed are those that are reserved for use solely by a resource owner (that is, resources belonging to a consumer who has not enabled lending).

Borrowing is optional. If borrowing is disabled, the allocation to a leaf consumer never exceeds the configured ownership. Therefore, if borrowing is disabled for all consumers, any unused resources (owned by other consumers) are wasted.

Resources are borrowed on a first-come, first-served basis. For example, one leaf consumer can borrow all the available resources in the cluster by being the first to request them. Once all available resources are allocated, other leaf consumers that want to borrow must wait for a resource to be released.

Borrowing can be enabled for leaf consumers only.

Sharing resources between leaf consumers from the same consumer branch

In cases where leaf consumers from the same consumer branch are competing to borrow resources from the share pool, the share ratio determines the minimum number of resources to allocate to each of them.

The share ratio is configurable. A valid entry for a share ratio is a whole number of 0 or greater. Share ratios work in this way:

  • By default, all consumers have a share ratio of 1, meaning they share equally.

  • A share ratio of 0 (zero) means that a consumer cannot borrow at all from the share pool.

  • A leaf consumer with a ratio of 2 can borrow twice as many resources as a competing sibling with a ratio of 1, and half as much as a competing sibling with a ratio of 4.

Other examples of share ratios between competing leaf consumers (siblings):

  • Scenario: Two competing leaf consumers (siblings) with equal allocations

A ratio of 1:1 means that both siblings receive 1/2 of the available resources from the parent.

  • Scenario: Two competing leaf consumers (siblings) with unequal allocations

A ratio of 1:2 means that one sibling receives 1/3 of the available resources from the parent while the other sibling receives 2/3 of it.

  • Scenario: Ten competing leaf consumers (siblings) with equal allocations

A ratio of 1 each means each sibling receives equal resources (1/10th of the parent’s available resources).
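The arithmetic behind these scenarios is a proportional split. The sketch below is illustrative only: `split_by_share_ratio` is a hypothetical helper, and it returns exact fractions rather than deciding how whole slots are rounded, which this document does not specify.

```python
from fractions import Fraction


def split_by_share_ratio(available: int, ratios: dict[str, int]) -> dict[str, Fraction]:
    """Divide the parent's available resources among competing siblings."""
    total = sum(ratios.values())
    if total == 0:
        # Every sibling has a ratio of 0, so none can borrow from the share pool.
        return {name: Fraction(0) for name in ratios}
    return {name: Fraction(available * ratio, total) for name, ratio in ratios.items()}


shares = split_by_share_ratio(12, {"sibling1": 1, "sibling2": 2})
print(shares["sibling1"], shares["sibling2"])  # 4 8
```

With 12 available resources and a ratio of 1:2, the first sibling is entitled to 1/3 (4 resources) and the second to 2/3 (8 resources).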

Note:

Resource allocation to competing leaf consumers depends in turn on branch ownership and share ratios. Therefore, for consumers on different branches of the consumer tree, an identical share ratio does not imply an identical allocation of resources.

In addition to setting share ratios, the cluster administrator may set maximum shares for each consumer. A maximum share value is specified as an absolute numerical count of resources.

Share ratio enforcement throughout the consumer tree

By default, planned share ratios are enforced at the leaf level. This means that share policies guarantee that each application (registered at a leaf level) receives its planned or “deserved” number of resources when demand is demonstrated. If an application does not have sufficient demand to warrant receiving all its deserved resources, the unused resources are distributed to all consumer branches and filtered down to leaf consumers as per their relative share ratios.

You can change the default behavior to enforce share ratios at the parent level. Doing this forces EGO to distribute unused resources to sibling leaf consumers (within a single line of business) that exhibit demand first before it distributes them throughout the rest of the consumer tree (to other lines of business). This allows a line of business to share resources between its own registered applications before sharing with other lines of business.

Resource allocation according to consumer rank and borrowing preference

Once a consumer branch’s share pool of resources is exhausted, then EGO allocates resources from other branches in the consumer tree, eventually moving up the tree to allocate any unowned resources from the cluster level.

Leaf consumers borrow resources from other consumer branches according to the following policies:

  1. Consumer ranking: Leaf consumers from the same consumer branch with the highest priority setting have the first opportunity to borrow.

    Note:

    The cluster administrator can set a maximum number of resources that can be borrowed by each consumer.

  2. Borrowing preference order: In cases where resources may be borrowed from multiple sources, lenders are ordered by “borrowing preference”. A borrower’s demands are first satisfied by borrowing from the lender for which it has the highest borrowing preference.
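The borrowing preference order can be pictured as walking a list of lenders from most preferred to least preferred. This is a simplified sketch with hypothetical names; it assumes the lenders are already sorted by the borrower's preference and ignores consumer rank and borrowing limits.

```python
def borrow_by_preference(demand: int, lenders_in_preference_order: list[tuple[str, int]]) -> dict[str, int]:
    """Satisfy a borrower's demand from its most preferred lender first.

    Each lender is a (name, lendable_slots) pair.
    """
    borrowed = {}
    remaining = demand
    for lender, lendable in lenders_in_preference_order:
        take = min(remaining, lendable)
        borrowed[lender] = take
        remaining -= take
    return borrowed


print(borrow_by_preference(7, [("X", 5), ("Y", 4), ("Z", 3)]))
# {'X': 5, 'Y': 2, 'Z': 0}
```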

Limited borrowing

By default, a consumer with unsatisfied demand can potentially borrow all qualifying resources. However, you can choose to limit the number of borrowed resources allocated to a specific consumer. The borrowing limit is expressed as a numeric quantity.

In cases where a consumer owns resources and also borrows additional resources, the specified maximum allocation includes both the borrowed and owned resources.
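One way to read the interaction between owned resources and a maximum allocation is sketched below. The `borrowable` function and its parameter names are hypothetical, and treating the limit as a cap on owned plus borrowed resources is an interpretation of the paragraph above rather than a confirmed rule.

```python
from typing import Optional


def borrowable(owned: int, demand: int, qualifying: int,
               max_allocation: Optional[int] = None) -> int:
    """Slots a consumer may borrow beyond the slots it owns."""
    wanted = max(demand - owned, 0)  # demand not covered by owned slots
    if max_allocation is not None:
        # Assumption: the cap counts owned and borrowed slots together.
        headroom = max(max_allocation - owned, 0)
        wanted = min(wanted, headroom)
    return min(wanted, qualifying)


print(borrowable(owned=4, demand=20, qualifying=30))                     # 16
print(borrowable(owned=4, demand=20, qualifying=30, max_allocation=10))  # 6
```

Without a limit, the consumer borrows everything it still needs. With a maximum allocation of 10 and 4 owned slots, it can borrow at most 6.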

Resource reclaim

A consumer does not retain guaranteed use of borrowed resources. Borrowed resources get returned to their owners in two situations:

  • When the borrowing consumer and its client release them

  • When owners reclaim their resources to meet their own unsatisfied demand

Resource reclaim is influenced by the grace period set by cluster or consumer administrators and by the configured consumer rank.

Note:

EGO may not always return the exact resource that was originally lent. In cases where a high priority workload unit may be running on a lent resource, an analogous resource may be returned instead to the original lending consumer. This behavior is dependent upon the application manager or consumer (for example, Platform Symphony or an LSF cluster) that may be installed on EGO.

Grace period

Lent resources can be reclaimed by owners experiencing unsatisfied demand even if the client is using them. When a resource is reclaimed, any client workload units running on the resource are interrupted. You can set a grace period, however, to impose a delay before a borrowed resource is returned to its owner. For example:

  • If you set the grace period to 10 seconds, any client workload units continue to run on the borrowed resource for 10 seconds after EGO initiates the resource reclaim.

  • If you set the grace period to 1 second, any running client workload units are almost immediately interrupted.

Before setting a grace period, consider the length of a typical workload unit that is run by a borrowing consumer and its clients, and the urgency with which a lending consumer might require that its demands be satisfied.

Note:

If you leave the grace period unconfigured or blank, the default grace period of 0 seconds is used.

Reclaim according to consumer rank

Resources are reclaimed according to the configured consumer rank of the consumers that borrowed them.

  • Example 1: If a lending consumer has unsatisfied demand and requires that its lent resources be reclaimed, EGO looks to reclaim resources starting with leaf consumers with the lowest consumer rank.

  • Example 2: If a lending consumer has a specific resource requirement (for example, the lending consumer needs a Windows slot with a certain amount of available memory), EGO reclaims the first lent resource it finds that matches this requirement. Borrowing leaf consumers with the lowest consumer rank are considered first, followed by leaf consumers with a higher consumer rank.
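Both examples can be approximated with a single filter-and-sort step. The sketch below is illustrative only: `LentSlot`, its fields, and `reclaim_order` are hypothetical names, and it assumes that rank 1 is the highest rank, so the largest rank number is reclaimed first.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LentSlot:
    borrower: str
    borrower_rank: int   # assumption: 1 is the highest rank
    os: str
    free_memory_mb: int


def reclaim_order(lent: list[LentSlot], needed: int,
                  matches: Callable[[LentSlot], bool] = lambda slot: True) -> list[LentSlot]:
    """Choose lent slots to reclaim, starting with the lowest-ranked borrowers."""
    candidates = [slot for slot in lent if matches(slot)]
    candidates.sort(key=lambda slot: slot.borrower_rank, reverse=True)
    return candidates[:needed]


lent = [
    LentSlot("A", 1, "linux", 4096),
    LentSlot("B", 3, "windows", 4096),
    LentSlot("C", 2, "windows", 8192),
]

# Example 1: no specific requirement, reclaim two slots.
print([s.borrower for s in reclaim_order(lent, 2)])  # ['B', 'C']

# Example 2: the lender needs one Windows slot with at least 2048 MB free.
needs_windows = lambda s: s.os == "windows" and s.free_memory_mb >= 2048
print([s.borrower for s in reclaim_order(lent, 1, needs_windows)])  # ['B']
```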

Change reclaim behavior for owned resources

By default, owned resources are only reclaimed after the lending consumer has attempted to satisfy its unmet demand through all other available means, including by borrowing resources from other lending consumers. You can, however, change this behavior so that owned resources get reclaimed before a consumer attempts to borrow resources from other lending consumers.

Changing the reclaim behavior is useful in cases where a consumer’s owned resources are specially selected to run certain workload units, or in charge-back settings where borrowing from outside a resource group might be more costly.

Change share pool reclaim behavior

By default, share pool resources can be reclaimed. This allows the share pool to reclaim resources from an over-allocated consumer to meet the demands of a competing consumer with a higher share ratio. You can change this behavior so that share pool resources are not reclaimed. Instead, resources are returned to the share pool for further allocation once the borrowing consumer and its client release them.

Troubleshooting unexpected resource allocation issues

If you find that leaf consumers are not getting enough resources, or that client workload units are not running as expected, check the following:

  • Ensure that the entire consumer branch owns adequate resources (that parents own enough resources to meet the demands of their children).

  • Check that the priority levels are set appropriately (that they are not all set to “low” or all set to “high”).

  • Confirm that the share ratio is appropriate between sibling leaf consumers (that more important leaf consumers are given a higher share ratio than competing siblings).

  • Make sure that you enable borrowing and lending.