Tutorial 3: Request Host Allocation in a Cluster with Asynchronous Callback Notifications
This tutorial describes how to create a registered EGO client that requests host allocation in a cluster and starts a container on the host. The sample uses callbacks for notifications from the cluster about resource change and container/host state change.
Using this tutorial, you will ...
- Open a connection to Platform EGO
- Print out cluster information
- Check if there are any registered clients connected to Platform EGO
- Log on to Platform EGO
- Register the client with Platform EGO
- Print out allocation and container reply info from a previous connection
- Print out host group information
- Request resource allocation from Platform EGO and print the allocation ID
- Start a container on Platform EGO and print container ID
- Check for registered clients connected to Platform EGO and print out information
- Implement client callback methods
Step 1: Preprocessor directives and method declarations
The first step is to include a reference to the system and API header files. The samples.h header file contains the method declarations that are common to all of the samples. In addition, we declare the methods that are specific to this sample.
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include "vem.api.h"
#include "samples.h"

static int addResourceCB(vem_allocreply_t *areply);
static int reclaimForceCB(vem_allocreclaim_t *areclaim);
static int containerStateChgCB(vem_containerstatechg_t *cschange);
static int hostStateChangeCB(vem_hoststatechange_t *hschange);

// holds allocation information
static vem_allocreply_t *allocReply = NULL;
static char *allocated_host_name = NULL;
static int barrier = 0;
static vem_container_id_t jobContainerId = NULL;
static int jobFinished = 0;

Step 2: Implement the principal method
Lines 4-7: define and initialize a data structure that is used to request a connection with the EGO host cluster. The data structure contains a reference to a configuration file where the master host name and port numbers are stored.
Line 8: pass the data structure as an argument to the vem_open() method, which opens a connection to the master host. If the connection attempt is successful, a handle is returned; otherwise the method returns NULL. The handle acts as a communication channel to the master host and all subsequent communication occurs through this handle.
Lines 14-15: a pointer to a vem_name_t structure (clusterName) is initialized to NULL. This structure holds the cluster name, system name, and version. The vem_uname() method is passed the communication handle and, if successful, returns a valid vem_name_t structure; otherwise the method returns NULL.
Line 21: the cluster info is printed out to the screen.
Lines 22-39: define the client info structure. Use vem_locate() to get all registered clients. Since NULL is provided as the client name, all registered clients are located and the method returns the number of registered clients. Note that Platform EGO is equipped with a number of default clients (services) such as the Service Controller, so, at a minimum, the info relevant to these clients is printed out and the associated memory is released.
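The return convention described for vem_locate() (a negative value on error, zero when nothing is registered, otherwise the number of entries made available through the output pointer) can be sketched with a stand-in function. Everything in this block (record_t, locate_records) is hypothetical and only mimics that convention; it does not call the EGO library, and the real function may differ in detail.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a client record; not the real vem_clientinfo_t. */
typedef struct {
    const char *name;
    const char *location;
} record_t;

/* Hypothetical lookup that mimics the convention described above:
 * returns the number of matches (>= 0) and stores a malloc'd array in *out
 * (NULL when there are no matches), or returns -1 on error.
 * A NULL name matches every record. The caller frees *out when count > 0. */
static int locate_records(const record_t *all, int n_all,
                          const char *name, record_t **out)
{
    int i;
    int n = 0;
    record_t *found;

    if (all == NULL || n_all < 0 || out == NULL) {
        return -1;
    }
    *out = NULL;
    if (n_all == 0) {
        return 0;
    }
    found = malloc((size_t)n_all * sizeof(*found));
    if (found == NULL) {
        return -1;
    }
    for (i = 0; i < n_all; i++) {
        if (name == NULL || strcmp(all[i].name, name) == 0) {
            found[n++] = all[i];
        }
    }
    assert(n <= n_all);   /* sanity check: never more matches than inputs */
    if (n == 0) {
        free(found);
        return 0;
    }
    *out = found;
    return n;
}
```

A caller following the same shape as the sample would test for a negative result first, then for zero, and release the array only when the count is positive.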
1   int
2   sample3()
3   {
4       vem_openreq_t orequest;
5       vem_handle_t *vhandle = NULL;
6       orequest.file = "ego.conf"; // default libvem.conf
7       orequest.flags=0;
8       vhandle = vem_open(&orequest);
9       if (vhandle == NULL) {
10          // error opening
11          fprintf(stderr, "Error opening cluster: %s\n", vem_strerror(vemerrno));
12          return -1;
13      }
14      vem_name_t *clusterName = NULL;
15      clusterName = vem_uname(vhandle);
16      if (clusterName == NULL) {
17          // error connecting
18          fprintf(stderr, "Error connecting to cluster: %s\n", vem_strerror(vemerrno));
19          return -2;
20      }
21      fprintf(stdout, " Connected... %s %s %4.2f\n", clusterName->clustername, clusterName->sysname, clusterName->version);
22      vem_clientinfo_t *clients;
23      int rc = vem_locate(vhandle, NULL, &clients);
24      if (rc >=0) {
25          if (rc == 0) {
26              printf("No registered clients exist\n");
27          } else {
28              int i=0;
29              for (i=0; i<rc; i++) {
30                  printf("%s %s %s\n", clients[i].name, clients[i].description,
31                         clients[i].location);
32              }
33              // free
34              vem_clear_clientinfo(clients);
35          }
36      } else {
37          // error getting the client list
38          fprintf(stderr, "Error getting clients: %s\n", vem_strerror(vemerrno));
39      }
Lines 40-42: authenticate the user to Platform EGO.
Lines 43-47: define and initialize a structure for callback methods. These callback methods are invoked by Platform EGO when resources are added or reclaimed, or when a change occurs to host status or a container. When Platform EGO wants to communicate about these events, it invokes these methods thereby calling back to the client.
Lines 48-59: define the vem_allocation_info_reply_t and vem_container_info_reply_t structures. If a client gets disconnected and then re-registers, its existing allocations and containers are returned in these structures. If the client has never registered before, the structures are empty. Define and initialize a structure (rreq) that holds client info for registration purposes. This includes assigning the client callback structure (cbf) to the cb member of the rreq structure; see Step 3: Client callback methods. Register with Platform EGO via the open connection using vem_register().
40      if (login(vhandle, username, password)<0) {
41          fprintf(stderr, "Error logon: %s\n", vem_strerror(vemerrno));
42      }
43      vem_clientcallback_t cbf;
44      cbf.addResource = addResourceCB;
45      cbf.reclaimForce = reclaimForceCB;
46      cbf.containerStateChg = containerStateChgCB;
47      cbf.hostStateChange = hostStateChangeCB;
48      vem_allocation_info_reply_t aireply;
49      vem_container_info_reply_t cireply;
50      vem_registerreq_t rreq;
51      rreq.name = "sample3_client";
52      rreq.description = "Sample3";
53      rreq.flags = VEM_REGISTER_TTL;
54      rreq.ttl = 3;
55      rreq.cb = &cbf; // NULL, would need to read messages explicitly;
56      rc = vem_register(vhandle, &rreq, &aireply, &cireply);
57      if (rc < 0) {
58          fprintf(stderr, "Error registering: %s\n", vem_strerror(vemerrno));
59      }
Lines 60-63: print out information related to the allocation requests and containers. Once the info is printed out, the memory for the allocations is freed.
Lines 65-75: the vem_gethostgroupinfo() method collects the information for the requested hostgroup. In this case, the requested hostgroup in the input argument is set to NULL, which means that information about all hostgroups is requested. If the method call is successful, hostgroup information is printed out to the screen.
Lines 76-96: initialize the data structure (vem_allocreq_t) that specifies the allocation request. vem_alloc() requests resource allocation using the allocation request info (the vem_allocreq_t structure) as one of the input arguments. If the request is successful, the allocation ID is printed out to the screen.
60      print_vem_allocation_info_reply(&aireply);
61      print_vem_container_info_reply(&cireply);
62      // free up any previous allocations
63      release_vem_allocation(vhandle, &aireply);
64
65      vem_hostgroupreq_t hgroupreq;
66      hgroupreq.grouplist = NULL;
67      vem_hostgroup_t *hgroup;
68      rc = vem_gethostgroupinfo(vhandle, &hgroupreq, &hgroup);
69      if (rc < 0) {
70          fprintf(stderr, "Error getting hostgroup: %s\n",
71                  vem_strerror(vemerrno));
72      } else {
73          printf("%s %s %d %d\n", hgroup->groupName, hgroup->members, hgroup->free,
74                 hgroup->allocated);
75      }
76      vem_allocreq_t areq;
77      areq.name = "Sample2Alloc";
78      areq.consumer = "/SampleApplications/EclipseSamples";
79      areq.hgroup = "ComputeHosts";
80  #ifndef WIN32_RESOURCE
81      areq.resreq = "LINUX86";
82  #else
83      areq.resreq = "NTX86";
84  #endif
85      areq.minslots = 1;
86      areq.maxslots = 1;
87      areq.flags = VEM_ALLOC_EXCLUSIVE;
88      vem_allocation_id_t alocid;
89      vem_allocfreereq_t afree;
90      rc = vem_alloc(vhandle, &areq, &alocid);
91      if (rc < 0) {
92          fprintf(stderr, "Error allocating: %s\n", vem_strerror(vemerrno));
93          goto bailout;
94      } else {
95          printf("allocated: %s\n", alocid);
96      }
Lines 97-121: define and initialize a container specification, including setting its resource limits to default values. The container specification essentially defines a job that the user wants to be executed. The conspec.command member specifies the actual binary that should be executed. In the sample, we want the program "sleep" to be executed. The UNIX sleep command takes the number of seconds to sleep as an input argument.
Lines 122-124: define and initialize various structures and assign container and allocation IDs.
Lines 126-135: a while loop suspends program execution until a host has been allocated. The addResourceCB callback sets the barrier variable when the notification from Platform EGO arrives, after which the program can proceed to run a container on the allocated resource. The hostname is printed out.
Lines 136-138: initialize the workload container request structure (conreq) with the hostname, container name, and the container specification (conspec).
97      vem_container_spec_t conspec;
98      memset(&conspec, 0, sizeof(vem_container_spec_t));
99
100 #ifndef WIN32_RESOURCE
101     conspec.command = "sleep 120";
102     conspec.execUser = "lsfadmin"; // "egoadmin";
103     conspec.umask = 0777;
104     conspec.execCwd = "/tmp";
105     conspec.envC = 0;
106 #else
107     // sleep needs to be installed on the cluster NT hosts
108     // or if ping is available, use something like ping -n xxx 127.0.0.1 > nul
109     conspec.command = "sleep 120";
110     conspec.execUser = "lsf\\lsfadmin"; //"egouser"; // "lsfadmin"; //
111     // "egoadmin";
112     conspec.umask = 0777;
113     conspec.execCwd = "c:\\";
114     conspec.envC = 0;
115 #endif
116
117     int i;
118     for (i=0; i<VEM_RLIM_NLIMITS; i++) {
119         conspec.rlimits[i].rlim_cur = VEM_RLIM_DEFAULT;
120         conspec.rlimits[i].rlim_max = VEM_RLIM_DEFAULT;
121     }
122     vem_startcontainerreq_t conreq;
123     vem_container_id_t conid = NULL;
124     conreq.allocId = alocid;
125     // find the hostname for allocation from the CB fn
126     while (barrier == 0) {
127         // wait until we have a host allocated
128         sleep(1);
129     }
130     if (allocReply == NULL || allocReply->nhost ==0) {
131         fprintf(stderr, "Error allocating host: %s\n", vem_strerror(vemerrno));
132         goto cleanup;
133     }
134     char *host = allocated_host_name;
135     printf("Allocated host: %s\n", host);
136     conreq.hostname = host; // allocReply->host[0].name;
137     conreq.name = "Sample2Container";
138     conreq.spec = &conspec;
Lines 139-146: start the workload container on the specified host and, if successful, print out the container ID.
Lines 147-168: use vem_locate() to get all registered clients. Since NULL is provided as the client name, all registered clients will be located and the method returns the number of registered clients. If successful, print out the client info and free the associated memory.
139     rc = vem_startcontainer(vhandle, &conreq, &conid);
140     if (rc < 0) { fprintf(stderr, "Error starting container: %s\n",
141                           vem_strerror(vemerrno));
142         jobContainerId = "INVALID";
143         goto cleanup;
144     }
145     jobContainerId = conid;
146     printf("Started container %s\n", conid);
147     rc = vem_locate(vhandle, NULL, &clients);
148     if (rc >=0) {
149         if (rc == 0) {
150             printf("No registered clients exist\n");
151         } else {
152             int i=0;
153             for (i=0; i<rc; i++) {
154                 printf("%s %s %s\n", clients[i].name, clients[i].description,
155                        clients[i].location);
156             }
157             vem_clear_clientinfo(clients);
158         }
159     } else {
160         // error getting the client list
161         fprintf(stderr, "Error getting clients: %s\n", vem_strerror(vemerrno));
162     }
163     // wait for job to be finished
164     while (!jobFinished) {
165         //wait
166         sleep(10);
167     }
168     vem_free_containerId(conid);
Step 3: Client callback methods
These callback methods are invoked by Platform EGO when resources are added or reclaimed, or when a change occurs to host status or a container. When Platform EGO wants to communicate about these events, it invokes these methods thereby calling back to the client.
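As a rough illustration of the callback pattern described here, the following sketch builds a table of function pointers and dispatches an event through it. The types and names (host_event_t, callback_table_t, init_callbacks, dispatch_host_state_change) are hypothetical stand-ins, not part of the EGO API; the real vem_clientcallback_t may differ in layout and behavior.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event payload; the real EGO structures carry more fields. */
typedef struct {
    const char *host_name;
    int new_state;
} host_event_t;

/* Hypothetical callback table, loosely modeled on the cbf structure. */
typedef struct {
    int (*host_state_change)(host_event_t *ev);
    int (*resource_added)(host_event_t *ev);
} callback_table_t;

static int host_changes_seen = 0;

/* Client-side handler: counts the notifications it receives. */
static int on_host_state_change(host_event_t *ev)
{
    (void)ev;
    host_changes_seen++;
    return 0;
}

/* Clear the table first so that any callback left unset is NULL. */
static void init_callbacks(callback_table_t *cb)
{
    static const callback_table_t empty = {0};

    assert(cb != NULL);
    *cb = empty;
    cb->host_state_change = on_host_state_change;
}

/* Dispatch side: roughly what a notifying library might do.
 * Returns -1 when no handler is installed. */
static int dispatch_host_state_change(const callback_table_t *cb,
                                      host_event_t *ev)
{
    if (cb == NULL || cb->host_state_change == NULL) {
        return -1;
    }
    return cb->host_state_change(ev);
}
```

Clearing the table before assigning individual members means a callback that the client does not implement is left as NULL, which the dispatching side can test for. Whether the EGO library tolerates NULL members in vem_clientcallback_t is not stated in this tutorial.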
Lines 169-179: this method is called by Platform EGO when resources have been added to an allocation in order to tell the client which resources have been provided for its use. This method prints out the allocation and consumer IDs, the number of hosts allocated, host names and number of slots, and host attributes.
Lines 180-186: this method is called by Platform EGO when resources need to be reclaimed. Resources may be reclaimed either for policy reasons, or because a resource has been found to be down or unavailable. The method prints out the host info including host name and slots for each host being reclaimed.
Lines 187-200: this method is called by Platform EGO in order to communicate status changes in containers to the clients that started them. The method prints out the container ID and its associated state; the container state is enumerated in the vem.common.h file.
Lines 201-207: this method is called by Platform EGO when a host changes state. The method prints out the host name and its new host state.
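The addResource callback keeps its own copy of the first host name from the reply. A minimal sketch of such a copy is shown below; copy_host_name is a hypothetical helper, not an EGO function. The point it illustrates is that a C string copy needs strlen(name) + 1 bytes so that there is room for the terminating NUL character.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a heap copy of name, or NULL if name is NULL
 * or memory is exhausted. The caller owns the result and must free() it. */
static char *copy_host_name(const char *name)
{
    size_t len;
    char *copy;

    if (name == NULL) {
        return NULL;
    }
    len = strlen(name);
    copy = malloc(len + 1);          /* +1 for the terminating '\0' */
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, name, len + 1);     /* copies the terminator as well */
    assert(copy[len] == '\0');       /* postcondition: copy is terminated */
    return copy;
}
```

Checking the malloc() result before writing to the buffer also keeps the callback from dereferencing NULL when memory is short.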
169 int
170 addResourceCB(vem_allocreply_t *areply)
171 {
172     printf("addResource Call Back\n");
173     allocReply = areply;
174     allocated_host_name = malloc(strlen(allocReply->host[0].name) + 1); // +1 for the terminating NUL
175     strcpy(allocated_host_name, allocReply->host[0].name);
176     barrier = 1;
177     print_vem_allocreply(areply);
178     return 0;
179 }
180 int
181 reclaimForceCB(vem_allocreclaim_t *areclaim)
182 {
183     printf("reclaimForce Call Back\n");
184     print_vem_allocreclaim(areclaim);
185     return 0;
186 }
187 int
188 containerStateChgCB(vem_containerstatechg_t *cschange)
189 {
190     printf("containerStateChg Call Back\n");
191     printf("%s %d\n", cschange->containerId, cschange->newState);
192     while(jobContainerId == NULL) {sleep(1);} // wait until container has been
193     // created
194     if(jobContainerId && !strcmp(cschange->containerId, jobContainerId)) {
195         if(cschange->newState == CONTAINER_FINISH) {
196             jobFinished = 1;
197         }
198     }
199     return 0;
200 }
201 int
202 hostStateChangeCB(vem_hoststatechange_t *hschange)
203 {
204     printf("hostStateChange Call Back\n");
205     printf("%s %d\n", hschange->name, hschange->newState);
206     return 0;
207 }
Run the client application
- Select Run > Run.
The Run dialog appears.
- In the Configurations list, either select an EGO C Client Application or click New for a new configuration.
For a new configuration, enter the configuration name.
- Enter the project name and C/C++ Application name.
- Click Apply and then Run.
Date Modified: July 12, 2006
Platform Computing: www.platform.com
Platform Support: support@platform.com
Platform Information Development: doc@platform.com
Copyright © 1994-2006 Platform Computing Corporation. All rights reserved.