A regular
expression is a sequence of characters that can match one or
more target sequences of characters, according to a regular expression
grammar. This implementation supports the following regular expression
grammars:
BRE —
Basic Regular Expressions, defined by the POSIX Standard, Part 1 (ISO/IEC
9945-1:2003)
ERE —
Extended Regular Expressions, also defined by the POSIX Standard,
Part 1
ECMAScript —
ECMAScript regular expressions, as defined by the ECMAScript
Language Specification (Ecma-262)
awk —
regular expressions as used in the awk utility,
defined by the POSIX
Standard, Part 3 (ISO/IEC 9945-3:2003)
grep —
regular expressions as used in the grep utility,
also defined by the POSIX Standard, Part 3
egrep —
regular expressions as used in the grep utility
with the -E option, also defined by the POSIX
Standard, Part 3
This document describes each of these grammars as provided in this
implementation. Most of the differences between the grammars are in
the regular expression features that are supported. When features
are not supported by all of the grammars the text describing those
features lists the grammars that support them. In some cases the differences
between the grammars are in the syntax used to describe a feature
(for example, BRE and grep require
a backslash in front of a left parenthesis that marks the beginning
of a group and the others do not). In these cases the differences
are described as part of the description of the feature.
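The backslash difference between BRE and ERE groups can be seen in a short sketch. It assumes the C++11 <regex> header, whose syntax flags (std::regex::basic, std::regex::extended, and so on) correspond to the grammars listed above; this document does not itself name those flags, and the helper names are hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if `pattern`, compiled with the given grammar
// flag, matches the whole of `target`.
bool matches_with(const std::string& pattern, const std::string& target,
                  std::regex::flag_type grammar) {
    const std::regex re(pattern, grammar);
    return std::regex_match(target, re);
}

// BRE needs backslashes around a group: \(a\)\1 matches "aa".
bool bre_group() {
    return matches_with("\\(a\\)\\1", "aa", std::regex::basic);
}

// In BRE an unescaped parenthesis is an ordinary character.
bool bre_literal_parens() {
    return matches_with("(a)", "(a)", std::regex::basic);
}

// ERE uses bare parentheses to mark the same kind of group.
bool ere_group() {
    return matches_with("(a)", "a", std::regex::extended);
}
```

All three functions are expected to return true under those assumptions.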
Regular Expression Grammar
Element
An element can
be any of the following:
An ordinary character,
which matches the same character in the target sequence
A wildcard character,
'.', which matches any character in the target
sequence except a newline
A bracket expression,
of the form “[expr]”, which matches a character or a collation element in the
target sequence that is also in the set defined by the expression expr, or of the form “[^expr]”, which matches
a character or a collation element in the target sequence that is
not in the set defined by the expression expr.
The expression expr can consist of any combination
of any number of each of the following.
An individual
character, which adds that character to the set defined by expr.
A character range, of
the form “ch1-ch2”, which adds all of the characters represented
by values in the closed range [ch1, ch2] to the set defined by expr.
A character class, of
the form “[:name:]”, which adds all of the characters in the
named class to the set defined by expr.
An equivalence class,
of the form “[=elt=]”, which adds the collating elements that
are equivalent to elt to the set defined by expr.
A collating symbol,
of the form “[.elt.]”, which adds the collation element elt to the set defined by expr.
An anchor, either '^'
or '$', which matches the beginning or the end
of the target sequence, respectively
A capture group, of the
form “( Subexpression )”,
or “\( Subexpression \)”
in BRE and grep, which
matches the sequence of characters in the target sequence that is
matched by the pattern between the delimiters
An identity escape,
of the form “\k”,
which matches the character k in the target
sequence
Examples:
“a” matches the target sequence “a”
but none of the target sequences “B”, “b”, or “c”.
“.” matches all of the target sequences “a”, “B”, “b”,
and “c”.
“[b-z]” matches the target sequences “b”
and “c” but does not match the target sequence “a”
or the target sequence “B”.
“[[:lower:]]” matches the target sequences “a”, “b”,
and “c” but does not match the target sequence “B”.
“(a)” matches the target sequence “a”
and associates capture group 1 with the subsequence “a”, but
does not match any of the target sequences “B”, “b”,
or “c”.
In ECMAScript, BRE,
and grep an element can also be:
a back reference, of
the form “\dd”
where dd represents a decimal value N, which
matches a sequence of characters in the target sequence that is the
same as the sequence of characters matched by the Nth capture group.
For example:
“(a)\1” matches the target sequence
"aa" because the first (and only) capture group matches the initial
sequence “a” and the \1 then matches the final sequence “a”.
In ECMAScript, an element can also be any
of the following:
A non-capture group,
of the form “(?: Subexpression )”, which matches the sequence
of characters in the target sequence that is matched by the pattern
between the delimiters
a limited file format
escape, of the form “\f”, “\n”, “\r”, “\t”, or “\v”;
these match a form feed, newline, carriage return, horizontal tab,
and vertical tab, respectively, in the target sequence.
A positive assert, of
the form “(?= Subexpression )”,
which matches the sequence of characters in the target sequence that
is matched by the pattern between the delimiters, but does not change
the match position in the target sequence.
A negative assert, of
the form “(?! Subexpression )”,
which matches any sequence of characters in the target sequence that
does not match the pattern between the delimiters, and does not change
the match position in the target sequence.
A hexadecimal
escape sequence, of the form “\xhh”, which matches a character in
the target sequence whose representation is the value represented
by the two hexadecimal digits hh.
A unicode escape
sequence, of the form “\uhhhh”,
which matches a character in the target sequence whose representation
is the value represented by the four hexadecimal digits hhhh.
A control escape
sequence, of the form “\ck”,
which matches the control character named by the character k.
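A word boundary assert,
of the form “\b”, which matches if the
current position in the target sequence is immediately after a word boundary.
A negative word boundary assert,
of the form “\B”, which matches if the
current position in the target sequence is not immediately after a word boundary.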
A dsw character escape,
of the form “\d”, “\D”, “\s”, “\S”, “\w”, “\W”, which
provides a short name for a character class.
For example:
“(?:a)” matches the target sequence “a”,
but “(?:a)\1” is invalid, because there is no capture group
1.
“(?=a)a” matches the target sequence “a”.
The positive assert matches the initial sequence “a” in the
target sequence and the final “a” in the regular expression
matches the initial sequence “a” in the target sequence.
“(?!a)a” does not match the target
sequence “a”.
“a\b.” matches the target sequence “a~”
but does not match the target sequence “ab”.
“a\B.” matches the target sequence “ab”
but does not match the target sequence “a~”.
In awk, an element can also be one of the
following:
A file format escape,
of the form “\\”, “\a”, “\b”, “\f”, “\n”, “\r”, “\t”, or “\v”;
these match a backslash, alert, backspace, form feed, newline, carriage
return, horizontal tab, and vertical tab, respectively, in the target
sequence.
An octal escape sequence,
of the form “\ooo”,
which matches a character in the target sequence whose representation
is the value represented by the one, two, or three octal digits ooo.
Repetition
Any element other than a positive
assert, a negative assert,
or an anchor can be followed by
a repetition
count. The most general form of repetition count takes the form "{min,max}",
or "\{min,max\}" in BRE and grep. An element followed by this form of repetition
count matches at least min and no more than max successive occurrences of a sequence that
matches the element.
For example:
“a{2,3}” matches the target sequence “aa”
and the target sequence “aaa”, but not the target sequence “a”
or the target sequence “aaaa”.
A repetition count can also take one of the following forms:
“{min}”,
or “\{min\}”
in BRE and grep, which
is equivalent to “{min,min}”.
“{min,}”,
or “\{min,\}”
in BRE and grep, which
is equivalent to “{min,unbounded}”.
“*”, which is equivalent to “{0,unbounded}”.
Examples:
“a{2}” matches the target sequence “aa”
but not the target sequence “a” or the target sequence “aaa”.
“a{2,}” matches the target sequence “aa”,
the target sequence “aaa”, and so on, but does not match the
target sequence “a”.
“a*” matches the target sequence “”,
the target sequence “a”, the target sequence “aa”,
and so on.
For all grammars except BRE and grep, a repetition count can also take one of
the following forms:
“?”, which is equivalent to “{0,1}”.
“+”, which is equivalent to “{1,unbounded}”.
Examples:
“a?” matches the target sequence “”
and the target sequence “a”, but not the target sequence “aa”.
“a+” matches the target sequence “a”,
the target sequence “aa”, and so on, but not the target sequence “”.
Finally, in ECMAScript, all of the preceding
forms of repetition count can be followed by the character '?', which designates a non-greedy repetition.
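The repetition forms above can be checked with a small sketch. It assumes the C++11 <regex> header with its default ECMAScript grammar; full_match and first_hit are hypothetical helper names.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if `pattern` matches the whole of `target`.
bool full_match(const std::string& pattern, const std::string& target) {
    return std::regex_match(target, std::regex(pattern));
}

// Hypothetical helper: text of the first search hit, or "" if there is none.
std::string first_hit(const std::string& pattern, const std::string& target) {
    const std::regex re(pattern);
    std::smatch m;
    return std::regex_search(target, m, re) ? m.str() : std::string();
}
```

With these helpers, full_match("a{2,3}", "aaa") succeeds and full_match("a{2,3}", "aaaa") fails. A search for “a+” in “aaa” yields “aaa”, while the non-greedy “a+?” yields “a”.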
Concatenation
Regular expression elements, with or without repetition counts, can be
concatenated to form longer regular expressions. Such an expression
matches a target sequence that is a concatenation of sequences matched
by the individual elements.
For example:
“a{2,3}b” matches the target sequence “aab”
and the target sequence “aaab”, but does not match the target
sequence “ab” or the target sequence “aaaab”.
Alternation
For all regular expression grammars except BRE and grep, a concatenated regular expression can be
followed by the character '|' and another concatenated
regular expression, which can be followed by another '|' and another
concatenated regular expression, and so on. Such an expression matches
any target sequence that matches one or more of the concatenated regular
expressions. When more than one of the concatenated regular expressions
matches the target sequence, ECMAScript chooses
the first of the concatenated regular expressions that matches the
sequence as the match (first match);
the other regular expression grammars choose the one that results
in the longest match.
For example:
“ab|cd” matches the target sequence “ab”
and the target sequence “cd”, but does not match the target
sequence “abd” or the target sequence “acd”.
In grep and egrep,
a newline character ('\n') can be used to separate
alternations.
Subexpression
A subexpression is
a concatenation in BRE and grep, or an alternation in the
other regular expression grammars.
Grammar Summary
Element                         BRE    ERE    ECMA   grep   egrep  awk
alternation using '|'           -      +      +      -      +      +
alternation using '\n'          -      -      -      +      +      -
anchor                          +      +      +      +      +      +
back reference                  +      -      +      +      -      -
bracket expression              +      +      +      +      +      +
capture group using “()”        -      +      +      -      +      +
capture group using “\(\)”      +      -      -      +      -      -
control escape sequence         -      -      +      -      -      -
dsw character escape            -      -      +      -      -      -
file format escape              -      -      +      -      -      +
hexadecimal escape sequence     -      -      +      -      -      -
identity escape                 +      +      +      +      +      +
negative assert                 -      -      +      -      -      -
negative word boundary assert   -      -      +      -      -      -
non-capture group               -      -      +      -      -      -
non-greedy repetition           -      -      +      -      -      -
octal escape sequence           -      -      -      -      -      +
ordinary character              +      +      +      +      +      +
positive assert                 -      -      +      -      -      -
repetition using "{}"           -      +      +      -      +      +
repetition using "\{\}"         +      -      -      +      -      -
repetition using '*'            +      +      +      +      +      +
repetition using '?' and '+'    -      +      +      -      +      +
unicode escape sequence         -      -      +      -      -      -
wildcard character              +      +      +      +      +      +
word boundary assert            -      -      +      -      -      -
('+' means the grammar supports the element; '-' means it does not.)
Semantic Details
Anchor
An anchor matches
a position in the target string and not a character. A '^' matches
the beginning of the target string, and a '$' matches
the end of the target string.
Back Reference
A back reference is
a backslash followed by a decimal value N. It matches the contents
of the Nth capture group.
The value of N must not be greater than
the number of capture groups that precede the back reference. In BRE and grep the value
of N is determined by the decimal digit that follows the backslash.
In ECMAScript the value of N is determined
by all of the decimal digits that immediately follow the backslash.
Thus, in BRE and grep the
value of N is never greater than 9, even if the regular expression
has more than nine capture groups. In ECMAScript the
value of N is unbounded.
Examples:
“((a+)(b+))(c+)\3” matches the target
sequence “aabbbcbbb”. The back reference “\3” matches
the text in the third capture group, that is, the “(b+)”.
It does not match the target sequence “aabbbcbb”.
“(a)\2” is not valid.
“(b(((((((((a))))))))))\10” has
a different meaning in BRE and in ECMAScript. In BRE the
back reference is “\1”. It matches the contents of the first
capture group (i.e. the one beginning with “(b” and ending
with the final “)” preceding the back reference), and the
final '0' matches the ordinary character '0'. In ECMAScript the
back reference is “\10”. It matches the tenth capture group
(i.e. the innermost one).
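The first two examples can be reproduced with the sketch below, assuming the C++11 <regex> header and its default ECMAScript grammar. The function names are hypothetical. The sketch also assumes that the library reports an invalid back reference by throwing std::regex_error when the pattern is compiled.

```cpp
#include <regex>
#include <string>

// True if the whole target matches "((a+)(b+))(c+)\3".
bool backref_matches(const std::string& target) {
    // Group 3 is "(b+)"; \3 must repeat exactly what that group matched.
    static const std::regex re("((a+)(b+))(c+)\\3");
    return std::regex_match(target, re);
}

// True if the pattern compiles. A back reference to a group that does not
// precede it, such as "(a)\2", is expected to fail with std::regex_error.
bool compiles(const std::string& pattern) {
    try {
        const std::regex re(pattern);
        return true;
    } catch (const std::regex_error&) {
        return false;
    }
}
```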
Bracket Expression
A bracket
expression defines a set of characters and collating elements. If the
bracket expression begins with the character '^' the
match succeeds if none of the elements in the set matches the current
character in the target sequence. Otherwise, the match succeeds if
any of the elements in the set matches the current character in the
target sequence.
Capture Group
A capture
group marks its contents as a single unit in the regular expression
grammar and labels the target text that matches its contents. The
label associated with each capture group is a number, determined by
counting the left parentheses marking capture groups up to and including
the left parenthesis marking the current capture group. In this implementation,
the maximum number of capture groups is 31.
Examples:
“ab+” matches the target sequence “abb”
but not the target sequence “abab”.
“(ab)+” does not match the target
sequence “abb” but matches the target sequence “abab”.
“((a+)(b+))(c+)” matches the target
sequence “aabbbc” and associates capture group 1 with the
subsequence “aabbb”, capture group 2 with the subsequence “aa”,
capture group 3 with “bbb”, and capture group 4 with the subsequence “c”.
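The numbering rule can be observed directly. The following sketch assumes the C++11 <regex> header with the ECMAScript grammar; groups is a hypothetical helper name.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Returns the text of every capture group (index 0 is the whole match),
// or an empty vector when the pattern does not match the whole target.
std::vector<std::string> groups(const std::string& pattern,
                                const std::string& target) {
    const std::regex re(pattern);
    std::smatch m;
    std::vector<std::string> out;
    if (std::regex_match(target, m, re)) {
        for (std::size_t i = 0; i < m.size(); ++i) {
            out.push_back(m[i].str());
        }
    }
    return out;
}
```

For groups("((a+)(b+))(c+)", "aabbbc") the returned vector holds the whole match followed by “aabbb”, “aa”, “bbb”, and “c”, in the order of the opening parentheses.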
Character Class
A character
class in a bracket expression adds all the characters in the
named class to the character set defined by the bracket expression.
To create a character class, use “[:” followed by the name
of the class followed by “:]”. Internally, names of character
classes are recognized by calling id = traits.lookup_classname.
A character ch belongs to such a class
if traits.isctype(ch, id) returns true.
The default regex_traits template supports
the following class names:
“alnum” — lowercase letters,
uppercase letters, and digits;
“alpha” — lowercase letters
and uppercase letters;
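The lookup described above can be sketched with std::regex_traits<char> from the C++11 <regex> header, assuming that it plays the role of the traits object mentioned in the text. The helper names are hypothetical.

```cpp
#include <regex>
#include <string>

// Mirrors the lookup described above: resolve the class name to an id,
// then ask the traits object whether `ch` belongs to that class.
bool in_named_class(char ch, const std::string& name) {
    std::regex_traits<char> traits;
    const auto id = traits.lookup_classname(name.begin(), name.end());
    return traits.isctype(ch, id);
}

// The same test expressed as a bracket expression, e.g. "[[:alpha:]]".
bool matches_class(char ch, const std::string& name) {
    const std::regex re("[[:" + name + ":]]");
    return std::regex_match(std::string(1, ch), re);
}
```

Both helpers should agree for any class name the traits object recognizes.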
Character Range
A character
range in a bracket expression adds all the characters in the
range to the character set defined by the bracket expression. To create
a character range put the character '-' between the first and last
characters in the range. This puts all the characters whose numeric
value is greater than or equal to the numeric value of the first character
and less than or equal to the numeric value of the last character
into the set. Note that this set of added characters depends on the
platform-specific representation of characters. If the character '-'
occurs at the beginning or end of a bracket expression or as the first
or last character of a character range it represents itself.
Examples:
“[0-7]” represents the set of characters
{ '0', '1', '2', '3', '4', '5', '6', '7' }. It matches the target
sequences “0”, “1”, etc., but not “a”.
“[h-k]” represents the set of characters
{ 'h', 'i', 'j', 'k' } on systems that use the ASCII character encoding;
it matches the target sequences “h”, “i”, etc., but
not “\x8A” or “0”.
“[h-k]” represents the set of characters
{ 'h', 'i', '\x8A', '\x8B', '\x8C', '\x8D', '\x8E', '\x8F', '\x90',
'j', 'k' } on systems that use the EBCDIC character encoding ('h'
is encoded as 0x88 and 'k' is encoded as 0x92). It matches the target
sequences “h”, “i”, “\x8A”, etc., but not “0”.
“[-0-24]” represents the set of
characters { '-', '0', '1', '2', '4' }.
“[0-2-]” represents the set of characters
{ '0', '1', '2', '-' }.
“[+--]” on systems that use
ASCII represents the set of characters { '+', ',', '-' }.
When using locale-sensitive
ranges, however, the characters in a range are determined by
the collation rules for the locale. Characters that collate after
the first character in the definition of the range and before the
last character in the definition of the range are in the set, as are
the two end characters.
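Set membership for a range can be tested one character at a time. The sketch assumes the C++11 <regex> header, the ECMAScript grammar, and an ASCII-based platform; in_set is a hypothetical helper name.

```cpp
#include <regex>
#include <string>

// True if the single character `ch` is in the set defined by `bracket`.
bool in_set(const std::string& bracket, char ch) {
    return std::regex_match(std::string(1, ch), std::regex(bracket));
}
```

For instance, in_set("[-0-24]", '-') and in_set("[-0-24]", '4') hold, while in_set("[-0-24]", '3') does not, because the leading '-' represents itself and the range covers only '0' through '2'.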
Collating Element
A collating
element is a multi-character sequence that is treated as a single
character.
Collating Symbol
A collating
symbol in a bracket expression adds a collating element to the
set defined by the bracket expression. To create a collating symbol,
use “[.” followed by the collating element followed by “.]”.
Control Escape Sequence
A control
escape sequence is a backslash followed by the letter 'c' followed
by one of the letters 'a' through 'z' or 'A' through 'Z'. It matches
the ASCII control character named by that letter.
For example,
“\ci” matches the target sequence “\x09”,
because <ctrl-i> has the value 0x09.
DSW Character Escape
A dsw
character escape is a short name for a character class.
Escape Sequence   Equivalent Named Class   Default Named Class
“\d”              “[[:d:]]”                “[[:digit:]]”
“\D”              “[^[:d:]]”               “[^[:digit:]]”
“\s”              “[[:s:]]”                “[[:space:]]”
“\S”              “[^[:s:]]”               “[^[:space:]]”
“\w”              “[[:w:]]”                “[a-zA-Z0-9_]”*
“\W”              “[^[:w:]]”               “[^a-zA-Z0-9_]”*
*ASCII character set
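A brief sketch of these escapes in use, assuming the C++11 <regex> header with the ECMAScript grammar and the default "C" locale; escape_matches is a hypothetical helper name.

```cpp
#include <regex>
#include <string>

// True if the single character `ch` matches the escape, e.g. "\\d".
bool escape_matches(const std::string& escape, char ch) {
    return std::regex_match(std::string(1, ch), std::regex(escape));
}
```

Note that the backslash must itself be escaped in a C++ string literal, so the regular expression “\d” is written "\\d" in source code.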
Equivalence Class
An equivalence
class in a bracket expression adds all the characters and collating elements that are
equivalent to the collating element in the equivalence class definition
to the set defined by the bracket expression. To create an equivalence
class, use “[=” followed by a collating element followed by “=]”.
Internally, two collating elements elt1 and elt2 are equivalent if traits.transform_primary(elt1.begin(),
elt1.end()) == traits.transform_primary(elt2.begin(), elt2.end()).
File Format Escape
A file
format escape consists of the usual C language character escape
sequences, “\\”, “\a”, “\b”, “\f”, “\n”, “\r”, “\t”, “\v”,
with their usual meanings, namely, backslash, alert, backspace, form
feed, newline, carriage return, horizontal tab, and vertical tab,
respectively. In ECMAScript “\a” and “\b”
are not allowed. (“\\” is allowed, but technically it's an
identity escape, not a file format escape).
Hexadecimal Escape Sequence
A hexadecimal
escape sequence is a backslash followed by the letter 'x' followed
by two hexadecimal digits (0-9a-fA-F). It matches a character in the
target sequence with the value specified by the two digits.
For example,
“\x41” matches the target sequence “A”
when the ASCII character encoding is used.
Identity Escape
An identity
escape is a backslash followed by a single character. It matches
that character. It is needed when the character has a special meaning;
using the identity escape removes the special meaning.
For example,
“a*” matches the target sequence “aaa”
but does not match the target sequence “a*”.
“a\*” does not match the target
sequence “aaa” but does match the target sequence “a*”.
The set of characters allowed in an identity escape depends on
the regular expression grammar.
ECMAScript — all characters except
those that can be part of an identifier. Roughly speaking, this is
letters, digits, '$', '_', and unicode escape sequences. For full
details see the ECMAScript Language
Specification.
Individual Character
An individual
character in a bracket expression adds that character to the
character set defined by the bracket expression. A '^' anywhere other
than at the beginning of a bracket expression represents itself.
Examples:
“[abc]” matches the target sequences “a”, “b”,
and “c” but not the sequence “d”.
“[^abc]” matches the target sequence “d”,
but not “a”, “b”, or “c”.
“[a^bc]” matches the target sequences “a”, “b”, “c”,
and “^” but not the sequence “d”.
In all the regular expression grammars except ECMAScript if
a ']' is the first character following the opening '[' or the first
character following an initial '^' it represents itself.
Examples:
“[]a” is invalid, because there
is no ']' to end the bracket expression.
“[]abc]” matches the target sequences “a”, “b”, “c”,
and “]” but not the sequence “d”.
“[^]abc]” matches the target sequence “d”,
but not “a”, “b”, “c”, or “]”.
In ECMAScript use '\]' to represent the
character ']' in a bracket expression.
Examples:
“[]a” matches the target sequence “a”
because the bracket expression is empty.
“[\]abc]” matches the target sequences “a”, “b”, “c”,
and “]” but not the sequence “d”.
Negative Assert
A negative
assert matches anything but its contents; it does not consume
any characters in the target sequence.
For example,
“(?!aa)(a*)” matches the target
sequence “a” and associates capture group 1 with the subsequence “a”.
It does not match the target sequence “aa” or the target sequence “aaa”.
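The example can be run as follows, assuming the C++11 <regex> header and the ECMAScript grammar. The function name and the "<no match>" marker are hypothetical.

```cpp
#include <regex>
#include <string>

// Returns capture group 1 of "(?!aa)(a*)" when the whole target matches,
// or the marker "<no match>" otherwise.
std::string negative_assert_demo(const std::string& target) {
    static const std::regex re("(?!aa)(a*)");
    std::smatch m;
    return std::regex_match(target, m, re) ? m[1].str()
                                           : std::string("<no match>");
}
```

The assert rejects any position where “aa” follows, so only targets with fewer than two leading 'a' characters can match.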
Negative Word Boundary Assert
A negative
word boundary assert matches if the current position in the target
string is not immediately after a word
boundary.
Non-capture Group
A non-capture
group marks its contents as a single unit in the regular expression
grammar, but does not label the target text.
For example,
“(a)(?:b)*(c)” matches the target text “abbc”
and associates capture group 1 with the subsequence “a” and
capture group 2 with the subsequence “c”.
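A short sketch of the difference in group numbering, assuming the C++11 <regex> header and the ECMAScript grammar; capture_texts is a hypothetical helper name.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Returns the text of capture groups 1..n (the whole match is skipped),
// or an empty vector when the pattern does not match the whole target.
std::vector<std::string> capture_texts(const std::string& pattern,
                                       const std::string& target) {
    const std::regex re(pattern);
    std::smatch m;
    std::vector<std::string> out;
    if (std::regex_match(target, m, re)) {
        for (std::size_t i = 1; i < m.size(); ++i) {
            out.push_back(m[i].str());
        }
    }
    return out;
}
```

With the non-capture group, capture_texts("(a)(?:b)*(c)", "abbc") reports two groups, “a” and “c”. Writing the middle group as “(b)” instead would add a third.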
Non-greedy Repetition
A non-greedy repetition
consumes the shortest subsequence of the target sequence that matches
the pattern. A greedy repetition
consumes the longest.
For example,
“(a+)(a*b)” matches the target sequence “aaab”.
With this greedy repetition it associates capture
group 1 with the subsequence “aaa” and capture group 2 with
the subsequence “b”. With the non-greedy form “(a+?)(a*b)” it associates capture group 1 with
the subsequence “a” at the beginning of the target sequence
and capture group 2 with the subsequence “aab” at the end
of the target sequence.
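The two outcomes can be compared side by side. The sketch assumes the C++11 <regex> header and the ECMAScript grammar, in which a non-greedy count is written by appending '?' (for example “a+?”); split_groups is a hypothetical helper name.

```cpp
#include <regex>
#include <string>
#include <utility>

// Returns the text of capture groups 1 and 2 after matching the whole
// target, or a pair of empty strings when there is no match.
std::pair<std::string, std::string> split_groups(const std::string& pattern,
                                                 const std::string& target) {
    const std::regex re(pattern);
    std::smatch m;
    if (!std::regex_match(target, m, re)) {
        return {};
    }
    return {m[1].str(), m[2].str()};
}
```

For the target “aaab”, the greedy pattern “(a+)(a*b)” gives (“aaa”, “b”) and the non-greedy pattern “(a+?)(a*b)” gives (“a”, “aab”).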
Octal Escape Sequence
An octal
escape sequence is a backslash followed by one, two, or three
octal digits (0-7). It matches a character in the target sequence
with the value specified by those digits. If all the digits are '0'
the sequence is invalid.
For example,
“\101” matches the target sequence “A”
when the ASCII character encoding is used.
Ordinary Character
An ordinary
character is any valid character that doesn't have a special
meaning in the current grammar.
In ECMAScript the characters that have
special meanings are:
^ $ \ . * + ? ( ) [ ] { } |
In BRE and grep the
characters that have special meanings are:
. [ \
In addition, the following characters have special meanings when
used in a particular context:
'*' has a special meaning in all cases except
when it is the first character in a regular expression or the first
character following an initial '^' in a regular expression, or when
it is the first character of a capture group or the first character
following an initial '^' in a capture group.
'^' has a special meaning when it is the
first character of a regular expression.
'$' has a special meaning when it is the
last character of a regular expression.
In ERE, egrep,
and awk the following characters have special
meanings:
. [ \ ( * + ? { |
In addition, the following characters have special meanings when
used in a particular context.
')' has a special meaning when it matches
a preceding '('.
'^' has a special meaning when it is the
first character of a regular expression.
'$' has a special meaning when it is the
last character of a regular expression.
An ordinary character matches the same character in the target
sequence. By default this means that the match succeeds if the two
characters are represented by the same value. In a case-insensitive match two
characters ch0 and ch1 match
if traits.translate_nocase(ch0) == traits.translate_nocase(ch1).
In a locale-sensitive match
two characters ch0 and ch1 match
if traits.translate(ch0) == traits.translate(ch1).
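Case-insensitive comparison can be sketched as follows. The sketch assumes the C++11 <regex> header, where the std::regex::icase flag requests a case-insensitive match and std::regex_traits<char> stands in for the traits object named above. The helper names are hypothetical.

```cpp
#include <regex>
#include <string>

// True if `pattern` matches the whole of `target`, ignoring case.
bool match_icase(const std::string& pattern, const std::string& target) {
    const std::regex re(pattern, std::regex::ECMAScript | std::regex::icase);
    return std::regex_match(target, re);
}

// The comparison the text describes for a case-insensitive match.
bool same_nocase(char ch0, char ch1) {
    std::regex_traits<char> traits;
    return traits.translate_nocase(ch0) == traits.translate_nocase(ch1);
}
```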
Positive Assert
A positive
assert matches its contents, but does not consume any characters
in the target sequence.
Examples:
“(?=aa)(a*)” matches the target
sequence “aaaa” and associates capture group 1 with the subsequence “aaaa”.
In contrast, “(aa)(a*)” matches
the target sequence “aaaa” and associates capture group 1
with the subsequence “aa” at the beginning of the target sequence
and capture group 2 with the subsequence “aa” at the end of
the target sequence.
“(?=aa)(a)|(a)” matches the target
sequence “a” and associates capture group 1 with an empty
sequence (because the positive assert failed) and capture group 2
with the subsequence “a”. A search in the target sequence “aa”
also succeeds: it matches the first “a”, associating capture group 1 with that subsequence “a” and
capture group 2 with an empty sequence.
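The effect of a positive assert on capture groups can be inspected with a search. The sketch assumes the C++11 <regex> header and the ECMAScript grammar; search_groups is a hypothetical helper name. A group that did not take part in the match is reported as an empty string.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Searches `target` and returns the text of capture groups 1..n,
// or an empty vector when the search finds nothing.
std::vector<std::string> search_groups(const std::string& pattern,
                                       const std::string& target) {
    const std::regex re(pattern);
    std::smatch m;
    std::vector<std::string> out;
    if (std::regex_search(target, m, re)) {
        for (std::size_t i = 1; i < m.size(); ++i) {
            out.push_back(m[i].str());
        }
    }
    return out;
}
```

For the target “aaaa”, the pattern “(?=aa)(a*)” reports a single group holding “aaaa”, because the assert consumes nothing. The pattern “(aa)(a*)” splits the same text into “aa” and “aa”.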
Unicode Escape Sequence
A unicode
escape sequence is a backslash followed by the letter 'u' followed
by four hexadecimal digits (0-9a-fA-F). It matches a character in
the target sequence with the value specified by the four digits.
For example,
“\u0041” matches the target sequence “A”
when the ASCII character encoding is used.
Wildcard Character
A wildcard
character matches any character in the target sequence except
a newline.
Word Boundary
A word boundary occurs
in the following situations:
the current character is at the beginning of the target sequence
and the current character is one of the word characters A-Za-z0-9_
the current character position is past the end of the target sequence
and the last character in the target sequence is one of the word characters
the current character is one of the word characters and
the preceding character is not
the current character is not one of the word
characters and the preceding character is.
Word Boundary Assert
A word
boundary assert matches if the current position in the target
string is immediately after a word
boundary.
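The word boundary asserts can be exercised with a minimal sketch, assuming the C++11 <regex> header and the ECMAScript grammar; full_match is a hypothetical helper name.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if `pattern` matches the whole of `target`.
bool full_match(const std::string& pattern, const std::string& target) {
    return std::regex_match(target, std::regex(pattern));
}
```

In “a~” a boundary lies between the word character 'a' and the non-word character '~', so “a\b.” matches it and “a\B.” does not. In “ab” there is no boundary between the two word characters, so the results are reversed.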
Matching and Searching
For a regular expression to match a
target sequence, the entire regular expression must match the entire
target sequence.
For example:
the regular expression “bcd” matches
the target sequence “bcd” but does not match the target sequence “abcd”
nor the target sequence “bcde”.
For a regular expression search to
succeed there must be a subsequence somewhere in the target sequence
that matches the regular expression. The search ordinarily finds the
leftmost matching subsequence.
Examples:
A search for the regular expression “bcd” in
the target sequence “bcd” succeeds and matches the entire
sequence; the same search in the target sequence “abcd” also
succeeds and matches the last three characters; the same search in
the target sequence “bcde” also succeeds, and matches the
first three characters.
A search for the regular expression “bcd” in
the target sequence “bcdbcd” succeeds and matches the first
three characters.
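The distinction between matching and searching can be sketched with std::regex_match and std::regex_search from the C++11 <regex> header, assuming they correspond to the two operations described above. The helper names are hypothetical.

```cpp
#include <regex>
#include <string>

// True only if the pattern matches the entire target sequence.
bool whole_match(const std::string& pattern, const std::string& target) {
    return std::regex_match(target, std::regex(pattern));
}

// Offset of the leftmost matching subsequence, or -1 if the search fails.
long search_offset(const std::string& pattern, const std::string& target) {
    const std::regex re(pattern);
    std::smatch m;
    return std::regex_search(target, m, re) ? static_cast<long>(m.position(0))
                                            : -1L;
}
```

For the pattern “bcd”, search_offset reports 1 in “abcd” and 0 in both “bcde” and “bcdbcd”, while whole_match succeeds only for “bcd” itself.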
If there is more than one subsequence that matches at some position
in the target sequence there are two ways to choose the matching pattern.
First match chooses
the subsequence that was found first when matching the regular expression. Longest match chooses
the longest subsequence from the ones that match at that point. If
there is more than one subsequence with the maximal length, longest
match chooses the subsequence that was found first.
For example:
a search for the regular expression “b|bc” in
the target sequence “abcd” matches the subsequence “b”
with first match, because the left-hand term of the alternation matched
that subsequence and there was no need to try the right-hand term
of the alternation; the same search matches “bc” with longest
match, because “bc” is longer than “b”.
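The two selection rules can be compared by compiling the same pattern under different grammars. The sketch assumes the C++11 <regex> header, with std::regex::ECMAScript and std::regex::extended as the grammar flags; first_hit is a hypothetical helper name.

```cpp
#include <regex>
#include <string>

// Text of the leftmost search hit under the given grammar, or "" if none.
std::string first_hit(const std::string& pattern, const std::string& target,
                      std::regex::flag_type grammar) {
    const std::regex re(pattern, grammar);
    std::smatch m;
    return std::regex_search(target, m, re) ? m.str() : std::string();
}
```

Under std::regex::ECMAScript, first_hit("b|bc", "abcd", ...) returns “b” (first match). Under std::regex::extended the rule above calls for “bc” (longest match); whether a particular standard library honors longest match here is worth verifying rather than assuming.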
A partial
match succeeds if the match reaches the end of the target sequence
without failing, even if it has not reached the end of the regular
expression. Thus, after a partial match succeeds, appending characters
to the target sequence could cause a subsequent partial match to fail.
After a partial match fails, appending characters to the target sequence
cannot cause a subsequent partial match to succeed.
For example, with a partial match:
“ab” matches the target sequence “a”
but not “ac”.
Format Flags
ECMAScript format rules                       sed format rules  Replacement text
“$&”                                          “&”               The character sequence that matched the entire regular expression ([match[0].first, match[0].second))
“$$”                                          -                 “$”
-                                             “\&”              “&”
“$`” (dollar sign followed by back quote)     -                 The character sequence that precedes the subsequence that matched the regular expression ([match.prefix().first, match.prefix().second))
“$'” (dollar sign followed by forward quote)  -                 The character sequence that follows the subsequence that matched the regular expression ([match.suffix().first, match.suffix().second))
“$n”                                          “\n”              The character sequence that matched the nth (0 <= n <= 9) capture group ([match[n].first, match[n].second))
-                                             “\\n”             “\n”
“$nn”                                         -                 The character sequence that matched the nnth (10 <= nn <= 99) capture group ([match[nn].first, match[nn].second))
('-' means the format rules have no corresponding sequence.)
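The replacement sequences can be tried out with std::regex_replace from the C++11 <regex> header. The sketch assumes that its default format mode corresponds to the ECMAScript format rules and that std::regex_constants::format_sed selects the sed format rules. The helper names are hypothetical.

```cpp
#include <regex>
#include <string>

// Replaces every match of `pattern` in `text` using ECMAScript format rules.
std::string replace_ecma(const std::string& text, const std::string& pattern,
                         const std::string& fmt) {
    return std::regex_replace(text, std::regex(pattern), fmt);
}

// Replaces every match of `pattern` in `text` using sed format rules.
std::string replace_sed(const std::string& text, const std::string& pattern,
                        const std::string& fmt) {
    return std::regex_replace(text, std::regex(pattern), fmt,
                              std::regex_constants::format_sed);
}
```

For example, replace_ecma("abc", "b", "[$&]") and replace_sed("abc", "b", "[&]") both produce “a[b]c”.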
Copyright note
Certain materials included or referred to in this document are copyright
P.J. Plauger and/or Dinkumware, Ltd. or are based on materials that are copyright
P.J. Plauger and/or Dinkumware, Ltd.
Notwithstanding the meta-data for this document, copyright information
for this document is as follows: