Architecture

Platform License Scheduler manages license tokens instead of controlling the licenses directly. Using Platform License Scheduler, jobs receive a license token before starting the application. The number of tokens available from LSF corresponds to the number of licenses available from FlexNet, so if a token is not available, the job does not start. In this way, the number of licenses requested by running jobs does not exceed the number of available licenses.

When a job starts, the application is not aware of Platform License Scheduler. The application checks out licenses from FlexNet in the usual manner.
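The token gating described above can be pictured with a small sketch. This is an illustration only, not the License Scheduler implementation: the `TokenPool` class and its methods are hypothetical names, and the pool size is assumed to equal the number of licenses FlexNet reports.

```python
from dataclasses import dataclass


@dataclass
class TokenPool:
    """Hypothetical pool of tokens for one license feature."""

    total: int       # assumed equal to the number of FlexNet licenses
    in_use: int = 0  # tokens currently held by running jobs

    def try_acquire(self, requested: int = 1) -> bool:
        """Grant tokens only if enough remain; otherwise the job stays pending."""
        if requested < 1:
            raise ValueError("requested must be positive")
        if self.in_use + requested > self.total:
            return False
        self.in_use += requested
        return True

    def release(self, count: int = 1) -> None:
        """Return tokens when a job finishes."""
        self.in_use = max(0, self.in_use - count)


pool = TokenPool(total=3)
started = [pool.try_acquire() for _ in range(4)]
print(started)  # [True, True, True, False]
```

The fourth job does not start because no token is left, so the licenses requested by running jobs never exceed the pool size. The application itself never sees the pool; it still performs its own FlexNet checkout.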

Figure 1. Daemon interaction

How scheduling policies work

With Platform License Scheduler, LSF gathers information about the licensing requirements of pending jobs to efficiently distribute available licenses. Other LSF scheduling policies are independent of Platform License Scheduler policies.

When a job is started, basic LSF scheduling comes first. Platform License Scheduler has no influence on job scheduling priority. Jobs are considered for dispatch according to the prioritization policies configured in each cluster.

For example, a job must have a candidate LSF host on which to start before the License Scheduler fairshare policy (for the license project this job belongs to) will apply.

Other LSF fairshare policies are based on CPU time, run time, and usage. If LSF fairshare scheduling is configured, LSF determines which user or queue has the highest priority, then considers other resources. In this way, the other LSF fairshare policies have priority over License Scheduler.
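The ordering described above (LSF prioritization first, license tokens second) can be sketched as a two-stage check. This is a simplified illustration under stated assumptions: the `Job` fields, the numeric priority scale, and the per-project free-token counts are all hypothetical, and License Scheduler fairshare between projects is reduced to a simple counter.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Job:
    name: str
    lsf_priority: int              # hypothetical scale: higher is considered first
    candidate_host: Optional[str]  # None if LSF found no host for the job
    project: str                   # license project the job belongs to


def dispatch_order(jobs: List[Job], free_tokens: Dict[str, int]) -> List[str]:
    """Return the names of jobs that start, in the order they are considered."""
    tokens = dict(free_tokens)  # copy so the caller's counts are not modified
    started = []
    # Stage 1: LSF decides the order; license policy has no say here.
    for job in sorted(jobs, key=lambda j: -j.lsf_priority):
        if job.candidate_host is None:
            continue  # no candidate host, so license policy is never applied
        # Stage 2: only now is the project's token availability checked.
        if tokens.get(job.project, 0) > 0:
            tokens[job.project] -= 1
            started.append(job.name)
    return started


jobs = [
    Job("a", 10, None, "p1"),
    Job("b", 5, "host1", "p1"),
    Job("c", 7, "host2", "p2"),
    Job("d", 1, "host3", "p1"),
]
print(dispatch_order(jobs, {"p1": 1, "p2": 2}))  # ['c', 'b']
```

Job a has the highest LSF priority but no candidate host, so the license policy is never consulted for it. Job d has a host but finds no token left for p1.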

When the mbatchd is offline

When a cluster is running, the mbatchd maintains a TCP connection to the bld. When the cluster is disconnected (such as when the cluster goes down or is restarted), the bld removes all information about jobs in the cluster. License Scheduler considers licenses checked out by jobs in a disconnected cluster to be non-LSF use of licenses.

When the mbatchd comes back online, the bld immediately receives updated information about the number of tokens currently distributed to the cluster.
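The effect of a disconnect on license accounting can be illustrated with a minimal sketch. It is not the bld implementation: the `FeatureAccounting` class is hypothetical, and it assumes that non-LSF use is simply the licenses checked out on the license server minus the licenses held by jobs the bld currently knows about.

```python
class FeatureAccounting:
    """Hypothetical bookkeeping for one license feature (not the real bld)."""

    def __init__(self):
        # cluster name -> {job id: licenses held}; kept only while connected
        self.jobs_by_cluster = {}

    def report(self, cluster, jobs):
        """A connected (or reconnected) cluster reports its current jobs."""
        self.jobs_by_cluster[cluster] = dict(jobs)

    def disconnect(self, cluster):
        """Drop everything known about jobs in a disconnected cluster."""
        self.jobs_by_cluster.pop(cluster, None)

    def lsf_use(self):
        """Licenses held by jobs in connected clusters."""
        return sum(n for jobs in self.jobs_by_cluster.values() for n in jobs.values())

    def non_lsf_use(self, checked_out):
        """Checked-out licenses not attributable to known LSF jobs (assumed formula)."""
        return checked_out - self.lsf_use()


acct = FeatureAccounting()
acct.report("cluster1", {"101": 1, "102": 1})
before = acct.non_lsf_use(checked_out=3)
acct.disconnect("cluster1")
during = acct.non_lsf_use(checked_out=3)
acct.report("cluster1", {"101": 1, "102": 1})
after = acct.non_lsf_use(checked_out=3)
print(before, during, after)  # 1 3 1
```

While the cluster is disconnected, the two licenses held by its jobs are counted as non-LSF use. The count returns to its earlier value as soon as the cluster reports again.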

When the bld is offline

If the mbatchd loses the connection with the bld, the mbatchd cannot receive the bld’s token distribution decisions and so cannot update its own view of the distribution.

However, the mbatchd logs token status every minute in the $LSF_TOP/work/data/featureName.ServiceDomainName.dat file. If the connection is lost, the mbatchd uses the last logged information to schedule jobs.

f3.LanServer1.dat 
# f3 LanServer1 3 2
# p1 50 p2 50  
  12/3         14:20:38        2    0    2    0         1    0    1    0    
  12/3         14:21:39        2    0    2    0         1    0    1    0    
  12/3         14:22:40        3    3    0    0         0    0    0    0    
  12/3         14:23:41        3    3    0    0         0    0    0    0    
  12/3         14:24:42        1    0    1    0         2    0    2    0    
  12/3         14:25:43        1    0    1    0         2    0    2    0    
  12/3         14:26:44        1    0    1    0         2    0    2    0    
  12/3         14:27:55        1    0    1    0         2    0    2    0    

In this example, feature f3 in service domain LanServer1 has 3 tokens and 2 projects. Projects p1 and p2 share licenses 50:50.

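Each row records a timestamp followed by four values per project: tokens dispatched, in use, free, and reserve. At 14:27:55, the bld dispatched 1 token to p1, which has 0 in use, 1 free, and 0 reserve. At the same time, the bld dispatched 2 tokens to p2, which has 0 in use, 2 free, and 0 reserve.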
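To make the file layout concrete, the following sketch parses a log in the format shown above. It is an illustration rather than an LSF tool: `parse_token_log` and `ProjectTokens` are hypothetical names, and the column meanings (dispatched, in use, free, reserve for each project in header order) are inferred from the example, not from a published format specification.

```python
from typing import NamedTuple

SAMPLE = """\
# f3 LanServer1 3 2
# p1 50 p2 50
  12/3         14:26:44        1    0    1    0         2    0    2    0
  12/3         14:27:55        1    0    1    0         2    0    2    0
"""


class ProjectTokens(NamedTuple):
    dispatched: int
    in_use: int
    free: int
    reserve: int


def parse_token_log(text):
    """Parse the two header lines and the per-minute rows of a token log."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # First header: feature, service domain, total tokens, number of projects.
    feature, domain, total, nproj = lines[0].lstrip("#").split()
    # Second header: alternating project names and shares.
    fields = lines[1].lstrip("#").split()
    projects = fields[0::2]
    shares = dict(zip(projects, (int(s) for s in fields[1::2])))
    records = []
    for ln in lines[2:]:
        date, time, *counts = ln.split()
        values = [int(c) for c in counts]
        # Four consecutive values belong to each project, in header order.
        per_project = {
            name: ProjectTokens(*values[i * 4:(i + 1) * 4])
            for i, name in enumerate(projects)
        }
        records.append((date, time, per_project))
    return {
        "feature": feature,
        "service_domain": domain,
        "total_tokens": int(total),
        "project_count": int(nproj),
        "shares": shares,
        "records": records,
    }


log = parse_token_log(SAMPLE)
date, time, last = log["records"][-1]  # the most recent logged distribution
print(time, last["p1"].dispatched, last["p2"].dispatched)  # 14:27:55 1 2
```

Taking the final record mirrors the fallback behavior: the most recently logged distribution is the one that remains available when the connection to the bld is lost.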

The mbatchd continues to schedule jobs based on the token distribution logged at 14:27:55 until the connection with the bld is re-established.