Appendix: Data-aware Scheduling Plug-in Development

Protocol for interactions between the SSM and the plug-in process

This section describes the protocol that must be implemented by the plug-in code to interact with the SSM.

  1. The protocol is ASCII and line-based. The <CR> character (\n) is used as the line separator for portability between platforms and programming languages. Note that on Windows, CR='\r''\n', and on UNIX (including Linux), CR='\n'.

  2. The protocol implements synchronous communication with the SSM: the SSM initiates each request and the plug-in replies with a response. The response to any plug-in operation is either a success or a failure. To distinguish between the two, the very first number in every response line is either 0 (success) or a non-zero error code (failure). An error code is followed by an accompanying error message, such as:

    "1 ERROR: bad initialization parameters.<CR>"

  3. During the initial sequence, the SSM spawns the plug-in process and waits for the first initialization response. The plug-in process starts up and, after finishing its initialization (parameters for the initialization can be passed as command-line arguments), replies on standard output with a line in the following format:

    "Status version <error>", where Status = 0 for success or non-zero for failure. For example, a successful reply is:

    "0 1.0.0.0"

    where the first number is the success code 0, followed by the protocol version supported by this plug-in (currently 1.0.0.0).

    A timeout (default: 30 seconds) is used to detect a plug-in that hangs during initialization.

  4. The SSM issues request commands, each identified by one character (so the protocol can potentially be extended to up to 256 commands), followed by command arguments separated by a space character (' '). Currently, the SSM provides only one command, specified by the character 'L' ("Location"), followed by a space-separated list of attribute names.

    Example:

    "Command attr1 … attrN<CR>"

    where Command=L (return locations and cost)

    "L attr1 attr2 attr3 attr4<CR>"

  5. The plug-in process responds to the "Location" command with multiple response lines, one per attribute, in the same order as the attributes in the command. Each successful response line starts with the success code 0, followed by one or more space-separated cost and location pairs, up to the <CR> character.

    Example:

    "Status Cost1 Location1 Cost2 Location2 … CostN LocationN<CR>"

    The following example assumes the values for four attributes are available:

    "0 1.0 host1" for attr1

    "0 2.0 host2 0.0 host3" for attr2

    "0 3.123 host4 7 host5 10000 host6 0 host7" for attr3

    "0 1.0 *" for attr4

    The following example assumes the value for attr3 is not available to the plug-in:

    "0 1.0 host1" for attr1

    "0 2.0 host2 0.0 host3" for attr2

    "1 Do not have the value" for attr3; this value will be cached and returned in subsequent calls.

    Note: The plug-in must return the same number of response lines as there are attributes in the preceding "Location" command, and in the same order. The response can be a mix of successes and failures.

    The asterisk (*) character for location means "any available host".

  6. The plug-in must be ready at any time to receive EOF or an error on standard input (e.g., "broken pipe"), either of which means a shutdown request from the SSM. In this case, the plug-in process must do the appropriate cleanup and exit by itself, because the SSM never kills the plug-in process.
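
The response-line format described above can be checked with a small parser, for example in a test harness that plays the role of the SSM. The following is a minimal sketch only, assuming C++11; the names ParsedResponse and parseResponseLine are hypothetical and are not part of the product or the protocol.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container for one parsed response line (illustration only).
struct ParsedResponse {
    int status = -1;                                      // 0 = success, non-zero = error code
    std::string message;                                  // error text when status != 0
    std::vector<std::pair<double, std::string>> entries;  // (cost, location) pairs on success
    bool wellFormed = false;                              // false if the line could not be parsed
};

// Parses a single response line such as "0 2.0 host2 0.0 host3".
ParsedResponse parseResponseLine(const std::string& rawLine)
{
    ParsedResponse result;

    // Strip the trailing line terminator ("\n" or "\r\n"), if any.
    std::string line = rawLine.substr(0, rawLine.find_first_of("\r\n"));

    std::istringstream in(line);
    if (!(in >> result.status)) {
        return result;  // no leading status number
    }

    if (result.status != 0) {
        // Failure: the rest of the line is the error message.
        std::getline(in >> std::ws, result.message);
        result.wellFormed = true;
        return result;
    }

    // Success: read "cost location" pairs until the end of the line.
    double cost = 0.0;
    std::string location;
    while (in >> cost) {
        if (!(in >> location)) {
            return result;  // a cost without a location
        }
        result.entries.emplace_back(cost, location);
    }

    // Well-formed only if the whole line was consumed and at least one pair was read.
    result.wellFormed = in.eof() && !result.entries.empty();
    return result;
}
```

A harness could call parseResponseLine once per line read from the plug-in's standard output and compare the number of parsed lines with the number of attributes sent in the "Location" command. The location "*" is returned as an ordinary string, so the caller decides how to treat "any available host".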

Code sample

This section provides sample code for the data-aware scheduling plug-in process. Use the sample code as a template and add the necessary logic to retrieve the host location and transfer cost data from the metadata repositories.

#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#ifndef WIN32
#include <sys/select.h>   // select(), fd_set
#include <sys/time.h>     // struct timeval
#include <unistd.h>       // getppid()
#endif
#include <errno.h>

// strtok() expects a NUL-terminated string of delimiter characters
static const char DELIMITERS[] = " ";
#ifndef WIN32
#define STDIN 0
#endif

int main(int argc, char* argv[])
{
    // reply initialization OK and plugin's protocol version number (1.0.0.0)
    fprintf(stdout, "0 1.0.0.0\n");
    fflush(stdout);
    // loop forever
#ifndef WIN32
    while(true)
    {
        fd_set fds;
        struct timeval tv;
        FD_ZERO(&fds);
        FD_SET(STDIN, &fds);
        tv.tv_sec = 0;
        tv.tv_usec = 100;
        int retVal = select(STDIN + 1, &fds, NULL, NULL, &tv);
        // select timed out: no request is pending
        if (retVal == 0)
        {
            if (getppid() != 1)
            {
                continue;
            }
            // parent process is dead (this process now has parent PID 1)
            else
            {
                break;
            }
        }
        // select error
        else if (retVal == -1)
        {
            break;
        }

        char line[1024]="";
        // EOF or a read error on stdin is a shutdown request
        if (fgets(line, sizeof(line), stdin) == 0)
        {
            break;
        }
#else
    for(char line[1024]="";fgets(line, sizeof(line), stdin) != 0;line[0]=0 )
    {
#endif
         // trim the line terminator ("\n" or "\r\n") at the end
         line[strcspn(line, "\r\n")] = '\0';
         // tokenize
         char* token = strtok(line, DELIMITERS);
         if( token == 0 )
         {
             fprintf(stdout,
                 "3 ERROR: unknown command received by the plug-in process '%s'\n",
                 argv[0]);
             fflush(stdout);
             continue;
         }
         switch( token[0] )
         {
         case 'L': // GET_LOCATION
             { // block
             int cnt=0;
             // one response line per attribute, in the order received
             while( (token = strtok(0, DELIMITERS)) != 0 )
             {
                 ++cnt;
                 // return status=0, cost/location
                 fprintf(stdout, "0 0 hosta 0 oparmar.noam.corp.platform.com\n");
                 fflush(stdout);
             }
             if(!cnt)
             {
                 // print error
                 fprintf(stdout,
                     "6 ERROR: cannot get any parameters for the command '%s'\n",
                     line);
                 fflush(stdout);
             }
             } // end of block
             break;
         case 'T': // GET_TOPOLOGY (placeholder; the protocol defines only 'L')
             {
             }
             break;
         case 'Q': // QUIT (placeholder; the protocol defines only 'L')
             return(0); // break pipe with parent
         default:  // unknown command!
             fprintf(stdout,
                 "3 ERROR: unknown command received by the plug-in process '%s'\n",
                 argv[0]);
             fflush(stdout);
         }
    }

    return(0); // break pipe with parent
}
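
The sample replies to every attribute with a hard-coded line. As one possible way to add the lookup logic, the sketch below builds a response line from an in-memory table. It is an illustration under stated assumptions, not the product's implementation: MetadataRepository, CostLocationList, and buildLocationReply are hypothetical names, the map is only a stand-in for a real metadata repository, and the error code 1 with the text "Do not have the value" is borrowed from the protocol example. C++11 is assumed.

```cpp
#include <iomanip>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a metadata repository:
// attribute name -> list of (transfer cost, host location).
using CostLocationList = std::vector<std::pair<double, std::string>>;
using MetadataRepository = std::map<std::string, CostLocationList>;

// Builds the response line for one attribute of an 'L' command.
std::string buildLocationReply(const MetadataRepository& repo,
                               const std::string& attribute)
{
    auto it = repo.find(attribute);
    if (it == repo.end() || it->second.empty()) {
        // Unknown attribute: non-zero error code followed by a message.
        return "1 Do not have the value\n";
    }

    std::ostringstream out;
    out << "0";                                  // success code
    out << std::fixed << std::setprecision(3);   // fixed notation avoids exponent output
    for (const auto& entry : it->second) {
        out << ' ' << entry.first << ' ' << entry.second;
    }
    out << '\n';                                 // line separator required by the protocol
    return out.str();
}
```

Inside the 'L' loop of the sample, the hard-coded fprintf call could then be replaced with something like fputs(buildLocationReply(repo, token).c_str(), stdout), followed by the existing fflush(stdout), so that exactly one line is still written per attribute.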