The XML PARSE statement is the COBOL language interface to the high-speed XML parser that is part of the COBOL run time.
The XML PARSE statement parses an XML document into its individual pieces and passes each piece, one at a time, to a user-written processing procedure.
XML PARSE statements must not be specified in declarative procedures.
Format

>>-XML PARSE--identifier-1-------------------------------------->

>--PROCESSING PROCEDURE--+----+--procedure-name-1--+-------------------------------+-->
                         '-IS-'                    '-+-THROUGH-+--procedure-name-2-'
                                                     '-THRU----'

>--+-------------------------------------------+---------------->
   '-+----+--EXCEPTION--imperative-statement-1-'
     '-ON-'

>--+------------------------------------------------+----------->
   '-NOT--+----+--EXCEPTION--imperative-statement-2-'
          '-ON-'

>--+---------+-------------------------------------------------><
   '-END-XML-'
If identifier-1 is a national group item, identifier-1 is processed as an elementary data item of category national.
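As an illustration of the national group case, the following minimal sketch parses a document held in a group item described with GROUP-USAGE NATIONAL. The names XMLNATG, XML-GROUP, DOC-PART-1, DOC-PART-2, and SHOW-EVENT and the sample document are hypothetical, and the sketch assumes that the NSYMBOL(NATIONAL) compiler option is in effect so that N'...' literals are national.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLNATG.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical national group; the parser treats the whole
      * group as one elementary item of category national.
       01  XML-GROUP GROUP-USAGE NATIONAL.
           05  DOC-PART-1  PIC N(10) VALUE N'<msg>hello'.
           05  DOC-PART-2  PIC N(6)  VALUE N'</msg>'.
       PROCEDURE DIVISION.
       MAINLINE.
           XML PARSE XML-GROUP
               PROCESSING PROCEDURE SHOW-EVENT
           END-XML
           GOBACK.
       SHOW-EVENT.
      * Only the event name is displayed here; document text for a
      * national document is delivered in XML-NTEXT, not XML-TEXT.
           DISPLAY XML-EVENT.
```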
CHAR(NATIVE) and alphanumeric identifier-1
If identifier-1 is alphanumeric and the CHAR(EBCDIC) compiler option is not in effect, the content of identifier-1 must be encoded using UTF-8 Unicode or a single-byte ASCII code page that is supported by ICU conversion libraries (see International Components for Unicode: Converter Explorer).
UTF-8 documents must not contain any characters with a Unicode scalar value greater than x'FFFF'. Use a character reference for such characters.
If the XML document in such a data item does not specify an encoding declaration and does not start with a UTF-8 byte order mark, it is parsed with the code page indicated by the current runtime locale.
CHAR(EBCDIC) and alphanumeric identifier-1
If identifier-1 is alphanumeric and the CHAR(EBCDIC) compiler option is in effect, the content of identifier-1 must be encoded using a single-byte EBCDIC code page that is supported by ICU conversion libraries (see International Components for Unicode: Converter Explorer). If identifier-1 is an elementary item, the NATIVE keyword must not be specified in its data description entry.
If the XML document in such a data item does not specify an encoding declaration, the XML document is parsed with the code page specified by the EBCDIC_CODEPAGE environment variable. If the EBCDIC_CODEPAGE environment variable is not set, the document is parsed with the default EBCDIC code page selected for the current runtime locale, as described in Locales and code pages that are supported in the COBOL for AIX Programming Guide.
Setting and using runtime locales and code pages
For more information about setting and using runtime locales and code pages, see Locales and code pages that are supported in the COBOL for AIX Programming Guide. The single-byte ASCII and EBCDIC code pages are those for which the column labeled Language group (the rightmost column) of the table Locales and code pages supported does not specify "Ideographic languages."
The only necessary relationship between procedure-name-1 and procedure-name-2 is that they define a consecutive sequence of operations to execute, beginning at the procedure named by procedure-name-1 and ending with the execution of the procedure named by procedure-name-2.
If there are two or more logical paths to the return point, then procedure-name-2 can name a paragraph that consists of only an EXIT statement; all the paths to the return point must then lead to this paragraph.
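The following minimal sketch shows one way to arrange a processing procedure that has two logical paths to a common return point. The names XMLTHRU, XML-DOC, HANDLE-EVENT, and HANDLE-EVENT-EXIT and the sample document are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLTHRU.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical sample document.
       01  XML-DOC  PIC X(40) VALUE '<msg><to>Ann</to></msg>'.
       PROCEDURE DIVISION.
       MAINLINE.
           XML PARSE XML-DOC
               PROCESSING PROCEDURE IS HANDLE-EVENT
                   THRU HANDLE-EVENT-EXIT
           END-XML
           GOBACK.
       HANDLE-EVENT.
      * First path: events other than character content skip
      * straight to the common return point.
           IF XML-EVENT NOT = 'CONTENT-CHARACTERS'
               GO TO HANDLE-EVENT-EXIT
           END-IF
      * Second path: character content is displayed, then control
      * falls through to the return point.
           DISPLAY 'Content: ' XML-TEXT.
       HANDLE-EVENT-EXIT.
           EXIT.
```

Because both paths end at HANDLE-EVENT-EXIT, control returns to the parser from a single place regardless of which event was handled.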
The processing procedure consists of all the statements at which XML events are handled. The range of the processing procedure includes all statements executed by CALL, EXIT, GO TO, GOBACK, INVOKE, MERGE, PERFORM, and SORT statements that are in the range of the processing procedure, as well as all statements in declarative procedures that are executed as a result of the execution of statements in the range of the processing procedure.
The range of the processing procedure must not cause the execution of any GOBACK or EXIT PROGRAM statement, except to return control from a method or program to which control was passed by an INVOKE or CALL statement, respectively, that is executed in the range of the processing procedure.
The range of the processing procedure must not cause the execution of an XML PARSE statement, unless the XML PARSE statement is executed in a method or outermost program to which control was passed by an INVOKE or CALL statement that is executed in the range of the processing procedure.
A program executing on multiple threads can execute the same XML statement or different XML statements simultaneously.
The processing procedure can terminate the run unit with a STOP RUN statement.
For more details about the processing procedure, see Control flow.
An exception condition exists when the XML parser detects an error in processing the XML document. The parser first signals an XML exception by passing control to the processing procedure with special register XML-EVENT containing 'EXCEPTION'. The parser also provides a numeric error code in special register XML-CODE, as detailed in Handling XML PARSE exceptions in the COBOL for AIX Programming Guide.
An exception condition also exists if the processing procedure sets XML-CODE to -1 before returning to the parser for any normal XML event. In this case, the parser does not signal an EXCEPTION XML event and parsing is terminated.
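As a sketch of deliberate termination, the processing procedure below sets XML-CODE to -1 when it encounters a particular element. The names XMLSTOP, XML-DOC, and CHECK-EVENT, the sample document, and the choice of a stop element as the trigger are all hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLSTOP.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical sample document.
       01  XML-DOC  PIC X(40) VALUE '<a><stop/><b>text</b></a>'.
       PROCEDURE DIVISION.
       MAINLINE.
           XML PARSE XML-DOC
               PROCESSING PROCEDURE CHECK-EVENT
               ON EXCEPTION
                   DISPLAY 'Parsing ended, XML-CODE = ' XML-CODE
           END-XML
           GOBACK.
       CHECK-EVENT.
      * For a normal event, setting XML-CODE to -1 asks the parser
      * to terminate; no EXCEPTION event is signaled for it.
           IF XML-EVENT = 'START-OF-ELEMENT' AND XML-TEXT = 'stop'
               MOVE -1 TO XML-CODE
           END-IF.
```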
If the ON EXCEPTION phrase is specified, the parser transfers control to imperative-statement-1. If the ON EXCEPTION phrase is not specified, the NOT ON EXCEPTION phrase, if any, is ignored and control is transferred to the end of the XML PARSE statement.
When an exception condition exists, special register XML-CODE contains the numeric error code for the XML exception, or -1, after execution of the XML PARSE statement.
If the processing procedure handles the XML exception event and sets XML-CODE to zero before returning control to the parser, the exception condition no longer exists. If no other unhandled exceptions occur before termination of the parser, control is transferred to imperative-statement-2 of the NOT ON EXCEPTION phrase, if specified.
If an exception condition does not exist at termination of XML PARSE processing, control is transferred to imperative-statement-2 of the NOT ON EXCEPTION phrase, if specified. If the NOT ON EXCEPTION phrase is not specified, control is transferred to the end of the XML PARSE statement. The ON EXCEPTION phrase, if specified, is ignored.
When no exception condition exists, special register XML-CODE contains zero after execution of the XML PARSE statement.
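The ON EXCEPTION and NOT ON EXCEPTION phrases can be sketched together in a small program such as the following. The names XMLDEMO, XML-DOC, and HANDLE-EVENT and the sample document are hypothetical; the blanks that pad XML-DOC after the root element are treated as trailing white space.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical sample document.
       01  XML-DOC  PIC X(60) VALUE
           '<?xml version="1.0"?><msg><to>Ann</to></msg>'.
       PROCEDURE DIVISION.
       MAINLINE.
           XML PARSE XML-DOC
               PROCESSING PROCEDURE IS HANDLE-EVENT
               ON EXCEPTION
      * Reached only if an exception condition exists.
                   DISPLAY 'Parse failed, XML-CODE = ' XML-CODE
               NOT ON EXCEPTION
      * Reached only if no exception condition exists.
                   DISPLAY 'Parse completed, XML-CODE = ' XML-CODE
           END-XML
           GOBACK.
       HANDLE-EVENT.
           DISPLAY XML-EVENT.
```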
The scope of a conditional XML GENERATE or XML PARSE statement can be terminated by an END-XML phrase at the same level of nesting or by a separator period.
END-XML can also be used with an XML GENERATE or XML PARSE statement that does not specify either the ON EXCEPTION or NOT ON EXCEPTION phrase.
For more information about explicit scope terminators, see Delimited scope statements.
When a given XML GENERATE or XML PARSE statement appears as imperative-statement-1 or imperative-statement-2, or as part of imperative-statement-1 or imperative-statement-2 of another XML GENERATE or XML PARSE statement, that given XML GENERATE or XML PARSE statement is a nested XML GENERATE or XML PARSE statement.
Nested XML GENERATE or XML PARSE statements are considered to be matched XML GENERATE and END-XML, or XML PARSE and END-XML combinations proceeding from left to right. Thus, any END-XML phrase that is encountered is matched with the nearest preceding XML GENERATE or XML PARSE statement that has not been implicitly or explicitly terminated.
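The matching rule can be illustrated with a minimal sketch in which one XML PARSE statement is nested in the ON EXCEPTION phrase of another. The names XMLNEST, PRIMARY-DOC, FALLBACK-DOC, and SHOW-EVENT and both sample documents are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLNEST.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical documents; the first one is deliberately not
      * well formed (mismatched end tag).
       01  PRIMARY-DOC   PIC X(40) VALUE '<a><b>text</a>'.
       01  FALLBACK-DOC  PIC X(40) VALUE '<a><b>text</b></a>'.
       PROCEDURE DIVISION.
       MAINLINE.
           XML PARSE PRIMARY-DOC
               PROCESSING PROCEDURE SHOW-EVENT
               ON EXCEPTION
      * Nested statement: the first END-XML below closes it.
                   XML PARSE FALLBACK-DOC
                       PROCESSING PROCEDURE SHOW-EVENT
                       ON EXCEPTION
                           DISPLAY 'Both documents failed'
                   END-XML
               NOT ON EXCEPTION
                   DISPLAY 'Primary document parsed'
           END-XML
           GOBACK.
       SHOW-EVENT.
           DISPLAY XML-EVENT.
```

Without the inner END-XML, the NOT ON EXCEPTION phrase would be associated with the nested statement rather than with the outer one.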
Control flow
When the XML parser receives control from an XML PARSE statement, the parser analyzes the XML document and transfers control to the processing procedure for each XML event that it detects, such as the start of the document or an exception.
Control returns to the XML parser when the end of the processing procedure is reached.
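The exchange of control between the parser and the processing procedure continues until the entire XML document has been parsed, until the parser signals an EXCEPTION event and the processing procedure does not reset XML-CODE to zero before returning to the parser, or until the processing procedure sets XML-CODE to -1 before returning to the parser.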
In each case, the processing procedure returns control to the parser. Then, the parser terminates and returns control to the XML PARSE statement with the XML-CODE special register containing the most recent value set by the parser or -1 (which might have been set by the parser or by the processing procedure).
For each XML event passed to the processing procedure, the XML-CODE and XML-EVENT special registers contain information about the particular event. Special register XML-EVENT is set to the event name, such as 'START-OF-DOCUMENT'. For most events, the XML-TEXT or XML-NTEXT special register contains document text. See XML-EVENT for details.
The content of the XML-CODE special register is defined during and after execution of an XML PARSE statement. The contents of all other XML special registers are undefined outside the range of the processing procedure.
For normal XML events, special register XML-CODE contains zero when the processing procedure receives control. For XML exception events, XML-CODE contains an XML exception code as described in XML exceptions that allow continuation and XML exceptions that do not allow continuation in the COBOL for AIX Programming Guide.
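A processing procedure typically dispatches on XML-EVENT. The sketch below handles a few events and ignores the rest; the names XMLEVAL, XML-DOC, and HANDLE-EVENT and the sample document are hypothetical, and only a subset of the possible event names is shown.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLEVAL.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical sample document.
       01  XML-DOC  PIC X(40) VALUE '<msg><to>Ann</to></msg>'.
       PROCEDURE DIVISION.
       MAINLINE.
           XML PARSE XML-DOC
               PROCESSING PROCEDURE HANDLE-EVENT
           END-XML
      * XML-CODE remains defined after the statement completes.
           DISPLAY 'After parse, XML-CODE = ' XML-CODE
           GOBACK.
       HANDLE-EVENT.
           EVALUATE XML-EVENT
               WHEN 'START-OF-ELEMENT'
                   DISPLAY 'Start tag: ' XML-TEXT
               WHEN 'CONTENT-CHARACTERS'
                   DISPLAY 'Content:   ' XML-TEXT
               WHEN 'END-OF-ELEMENT'
                   DISPLAY 'End tag:   ' XML-TEXT
               WHEN 'EXCEPTION'
      * For exception events XML-CODE holds the exception code.
                   DISPLAY 'Exception, XML-CODE = ' XML-CODE
               WHEN OTHER
                   CONTINUE
           END-EVALUATE.
```

XML-TEXT is referenced only inside HANDLE-EVENT because its content is undefined outside the range of the processing procedure.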
For more information about the XML special registers, see XML-CODE, XML-EVENT, XML-NTEXT, and XML-TEXT.
For an introduction to special registers, see Special registers.
For more information about the EXCEPTION event and exception processing, see Handling XML PARSE exceptions in the COBOL for AIX Programming Guide.