Definitions:

statement delimiter:
    A new statement always follows a linebreak character:
    a) unless the following character is the continuation character.
    b) if the following character is another linebreak character,
       it is a null statement.
    c) statements consisting entirely of whitespace characters,
       comments or combinations are considered null statements.
    d) the parser ignores null statements.
    e) the parser eats everything up to the linebreak character and each
       subsequent line that begins with the continuation char
       in the event of a parsing error.
    The Construct delimiters are a statement in and of themselves
    and they also serve to terminate the previous statement
    and delimit the start of the following statement.
    You may think of a <construct_delimiter> as equivalent to
    <linebreak><construct_delimiter><linebreak>


Whitespace characters: space, TAB
Linebreak characters: <CR>, <LF>. If both occur consecutively,
    the pair is treated as one linebreak.
White characters: either Whitespace or Linebreak characters.
continuation character: +
    NEW DEF: when used as a continuation char
    the '+' must appear as the first char in a line.
    The parser will interpret the preceding linebreak
    as whitespace.
Logical linebreak: a linebreak character that is NOT followed by a
    continuation character.
Continuation linebreak: a linebreak character followed by a
    continuation character.
Arbitrary whitespace: Whitespace chars or comments or
    Continuation Linebreak
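
The linebreak definitions above can be sketched in C roughly as follows.
This is only an illustration, not the parser's code: the names
LinebreakKind and classify_linebreak are made up for this note, and
the sketch assumes that a CR/LF pair in either order counts as the
"pair" described above.

```c
#include <stddef.h>

/* Hypothetical names; none of these come from the parser source. */
typedef enum {
    LB_NONE,          /* no linebreak at this position              */
    LB_LOGICAL,       /* linebreak that delimits a statement        */
    LB_CONTINUATION   /* linebreak followed by '+': acts as space   */
} LinebreakKind;

static int is_linebreak_char(char c)
{
    return c == '\r' || c == '\n';
}

/*
 * Classify the linebreak that starts at buf[pos].  On return, *next is
 * the index of the first character after the linebreak (or pos itself
 * when there is no linebreak at pos).
 */
LinebreakKind classify_linebreak(const char *buf, size_t len,
                                 size_t pos, size_t *next)
{
    size_t after;

    if (pos >= len || !is_linebreak_char(buf[pos])) {
        *next = pos;
        return LB_NONE;
    }
    after = pos + 1;
    /* Assumption: <CR><LF> or <LF><CR> is one linebreak; <LF><LF> is two. */
    if (after < len && is_linebreak_char(buf[after]) && buf[after] != buf[pos])
        after++;
    *next = after;
    /* The continuation char must be the first char of the next line. */
    if (after < len && buf[after] == '+')
        return LB_CONTINUATION;
    return LB_LOGICAL;
}
```

A tokenizer built on this would treat LB_CONTINUATION as arbitrary
whitespace and LB_LOGICAL as a statement delimiter.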
Construct delimiters: curly braces { }
    A linebreak character is not required to precede or follow
    a construct delimiter.
    You may think of a <construct_delimiter> as logically equivalent
    to <linebreak><construct_delimiter><linebreak>
    Construct delimiters also delimit the scope of macro
    definitions. If a macro was defined within a nesting level
    created by a pair of construct delimiters, it remains defined
    only within that nesting level.
Nesting level: the logical set of statements enclosed by
    a matching pair of construct delimiters.
Statement delimiter: Logical linebreak or Construct delimiter.
Keyword - value separator: colon :
string delimiter: double quotes ""
macro invocation: equals =
comment indicator: *%
    Comments may be inserted immediately preceding any logical
    or continuation linebreak. They may contain
    any characters except linebreak characters.
    The linebreak character terminates the comment.
    The comment indicator must be preceded by a White Character
    (unless the comment indicator is the first byte in the source file).
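
A minimal sketch of the comment rule above, for illustration only.
The function names are hypothetical, and the sketch assumes the buffer
holds level 0 text only: it does not skip over quoted strings, where
*% has no special meaning.

```c
#include <stddef.h>

/* White character: Whitespace (space, TAB) or Linebreak (<CR>, <LF>). */
static int is_white(char c)
{
    return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

/*
 * Return the index of the first comment indicator at or after `from`,
 * or len if there is none.  The indicator only counts when it is
 * preceded by a White character or is the first byte of the buffer.
 */
size_t find_comment(const char *buf, size_t len, size_t from)
{
    size_t i;

    for (i = from; i + 1 < len; i++) {
        if (buf[i] == '*' && buf[i + 1] == '%' &&
            (i == 0 || is_white(buf[i - 1])))
            return i;
    }
    return len;
}

/*
 * Return the index of the linebreak character that terminates the
 * comment starting at `start`, or len if the buffer ends first.
 */
size_t comment_end(const char *buf, size_t len, size_t start)
{
    size_t i = start;

    while (i < len && buf[i] != '\r' && buf[i] != '\n')
        i++;
    return i;
}
```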
Grouping operator: () used in certain value constructs.
Hex substring delimiter: <>
Escape character: % used in string and parameter constructs.


The above characters have reserved meanings
and may not be used in any keyword, symbol name
or any user defined name.

Name spaces:

If attribute keywords are used within other constructs,
say a feature or global keyword inside an option,
that keyword must be declared using the EXTERN_FEATURE:
or EXTERN_GLOBAL: modifier. Otherwise the attribute type
expected (dictionary used) will be defined by the state.

The namespaces of the attribute keywords for various constructs
may overlap each other, since we rely on the above rules to
assign the namespace, but they cannot overlap non-attribute
keywords.


Keywords: predefined keywords must begin with '*'.
    The remainder of the keyword may be comprised of
    'A' to 'Z', 'a' to 'z', '0' to '9', '_' and may
    be terminated by an optional '?'.

    Symbol Keywords do not begin with '*' and may be any
    name defined by the user. They must be comprised of the same
    characters as normal keywords. Symbol Keywords are used
    as the names of Value macros and Font names in certain constructs.
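
The keyword character rules can be checked with something like the
sketch below. The function names are hypothetical. Two points are
assumptions rather than statements from these notes: a keyword body
must contain at least one character, and a Symbol Keyword may not end
in '?'. isalnum() is assumed to run in the "C" locale, where it
matches exactly the letters and digits listed above.

```c
#include <ctype.h>
#include <stddef.h>

/* Length of the leading run of letters, digits and '_' in s. */
static size_t keyword_body_len(const char *s)
{
    size_t n = 0;

    while (isalnum((unsigned char)s[n]) || s[n] == '_')
        n++;
    return n;
}

/* Nonzero if s is '*' + body + optional trailing '?'. */
int is_predefined_keyword(const char *s)
{
    const char *rest;
    size_t n;

    if (s[0] != '*')
        return 0;
    n = keyword_body_len(s + 1);
    if (n == 0)                 /* assumption: "*" alone is not a keyword */
        return 0;
    rest = s + 1 + n;
    if (*rest == '?')
        rest++;
    return *rest == '\0';
}

/* Nonzero if s is a non-empty body with no leading '*'. */
int is_symbol_keyword(const char *s)
{
    size_t n = keyword_body_len(s);

    /* assumption: no '?' terminator on symbol keywords */
    return n > 0 && s[n] == '\0';
}
```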


Parsing rules:

Values: on some Keywords the Value is
    ignored by the parser. In these cases
    the Value (and the : delimiter) may be omitted,
    for example:
        *Macros
        *Macros: PaperNames
    are both valid.


Block Macro definitions:
    If the definition (body of a BlockMacro)
    contains braces, the braces must appear in pairs
    and in the correct order, i.e. { must appear before }.
    Braces may be nested within the body.
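
The brace rule for BlockMacro bodies amounts to a depth counter. A
minimal sketch, assuming the body has already been reduced to level 0
text (braces inside strings or comments are not skipped here); the
function name is hypothetical:

```c
#include <stddef.h>

/*
 * Return 1 if every '}' in body closes an earlier '{' and no '{'
 * is left open at the end, 0 otherwise.  Nesting is allowed.
 */
int braces_balanced(const char *body, size_t len)
{
    size_t depth = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        if (body[i] == '{') {
            depth++;
        } else if (body[i] == '}') {
            if (depth == 0)
                return 0;       /* '}' appeared before its '{' */
            depth--;
        }
    }
    return depth == 0;
}
```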


Parsing Level 0: this is the outermost level of parsing.
    At this level the characters {, } are interpreted
    as construct delimiters, *% begins a comment etc.
    This is contrasted to the parsing rules applied to
    higher level objects like strings where these characters
    have no special meaning.

Linebreak characters may only appear in parsing level 0.
Their appearance at any time terminates parsing of the current statement.

The parser eats everything up to the linebreak character
in the event of a parsing error.

Statements begin with either a *<keyword> or a <symbol keyword>
where the keyword is a parser recognized keyword token and
where <symbol keyword> is a parser unrecognized token
which may represent a ValueMacroName or a TTFontName.
Such tokens must not begin with '*' and will be marked
as SYMBOL in the TokenMap.

In general arbitrary Whitespaces may appear between
any entities recognized at Parsing level 0.
If Whitespaces are permitted within such entities
it will be noted.

For added robustness, macro strings will not be
expanded to their binary equivalents. This prevents the
insertion of random Linebreak characters in the stream.

Rule for GPD authors: for maximum robustness and error
recovery, place level 0 braces (Construct delimiters) on a
separate line. Do not place Construct delimiters on the same
line where questionable keywords and constructs are used.


Heap usage:

The heap will be divided into several sections, each large enough
to hold whatever may come. Growth of the heap sections
is not allowed.

Strings, composite objects like RECTS: holds all strings, offsets referenced
from beginning of string section.

Arrays of various types: each type of array is assigned its own dedicated
memory buffer. A master table contains pointers to each array,
its size and current entry. Once data is entered into the array,
the keeper of the data need only remember the index of the array
entry the data was written into.

After all parsing operations are complete, we will consolidate the
arrays into one memory space and update the master table accordingly.


Some composite values require an indefinite amount of storage
or reside in dedicated structures
(for example strings, lists and UIconstraints).
Such values are stored in 2 parts: a fixed sized link, and
the part the link refers to. This part may be variable
sized and may occupy heap space or one or more dedicated
structures or some combination thereof.
Since the link is always of a known size, it may be stored
in a field in a structure etc.

The following table lists the values supported by the parser
and how they are structured:

value type: Strings
    link: ARRAYREF
        dwOffset field specifies heap offset of start of string
        dwCount field specifies string length excluding
        terminating NULL.
    body: Null terminated array of bytes stored in heap.
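
As an illustration of the link/body split for strings, here is a small
sketch. Only ARRAYREF, dwOffset and dwCount come from the notes above.
StringHeap, HEAP_SIZE, heap_add_string and heap_string are invented
for the example, and the field types are a guess.

```c
#include <stddef.h>
#include <string.h>

#define HEAP_SIZE 64u   /* illustrative; real section sizes are not given */

/* The fixed sized link.  Field names follow the notes above. */
typedef struct {
    unsigned long dwOffset;   /* heap offset of start of string        */
    unsigned long dwCount;    /* string length, excluding the NUL      */
} ARRAYREF;

/* Hypothetical fixed-size string section; it never grows. */
typedef struct {
    unsigned char bytes[HEAP_SIZE];
    unsigned long used;       /* offset of the first free byte         */
} StringHeap;

/*
 * Copy s (with its terminating NUL) into the heap and fill in the link.
 * Returns 1 on success, 0 if the section has no room left.
 */
int heap_add_string(StringHeap *heap, const char *s, ARRAYREF *ref)
{
    size_t n = strlen(s);

    if (n + 1 > HEAP_SIZE - heap->used)
        return 0;             /* growth of heap sections is not allowed */
    memcpy(heap->bytes + heap->used, s, n + 1);
    ref->dwOffset = heap->used;
    ref->dwCount = (unsigned long)n;
    heap->used += (unsigned long)(n + 1);
    return 1;
}

/* Follow a link back to the NUL terminated body. */
const char *heap_string(const StringHeap *heap, const ARRAYREF *ref)
{
    return (const char *)heap->bytes + ref->dwOffset;
}
```

Because the ARRAYREF is always the same size, it can sit in a field of
any structure while the body lives in the heap.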

value type: LIST
value type: QualifiedName




Shortcuts that cause headaches:

*Command: 2 forms exist.


Macros:

A macroDefinition cannot be self-referencing.
macroDefinitions cannot be forward referenced,
i.e. only a previously defined and fully resolved macro can be
referenced.

scope: a Macro is defined (referenceable) only after parsing the
closing brace of its definition and until encountering a closing
brace that signifies the termination of the level the macro
was defined in.

namespaces: since macro definitions are stored in a stack,
defining a second macro with the same name does not necessarily
destroy the first definition. If the first macro was defined
outside of the scope of the 2nd, it will be visible once
the parser leaves the scope of the 2nd Macro.
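
The stack behaviour described under "namespaces" can be sketched as
below. All names (MacroStack, macro_define and so on) and the fixed
capacity are hypothetical; the notes only say that definitions are
kept in a stack.

```c
#include <stddef.h>
#include <string.h>

#define MAX_MACROS 32          /* illustrative capacity */

typedef struct {
    const char *name;
    const char *value;
    int level;                 /* nesting level the macro was defined in */
} MacroEntry;

typedef struct {
    MacroEntry entries[MAX_MACROS];
    int count;                 /* number of live definitions             */
    int level;                 /* current nesting level, 0 = root        */
} MacroStack;

void macro_stack_init(MacroStack *st)
{
    st->count = 0;
    st->level = 0;
}

/* An opening construct delimiter starts a new nesting level. */
void macro_open_brace(MacroStack *st)
{
    st->level++;
}

/* A closing delimiter discards every macro defined in that level. */
void macro_close_brace(MacroStack *st)
{
    while (st->count > 0 && st->entries[st->count - 1].level == st->level)
        st->count--;
    if (st->level > 0)
        st->level--;
}

/* Push a definition; an older macro of the same name stays underneath. */
int macro_define(MacroStack *st, const char *name, const char *value)
{
    if (st->count == MAX_MACROS)
        return 0;
    st->entries[st->count].name = name;
    st->entries[st->count].value = value;
    st->entries[st->count].level = st->level;
    st->count++;
    return 1;
}

/* Search from the top so the most recent definition wins. */
const char *macro_lookup(const MacroStack *st, const char *name)
{
    int i;

    for (i = st->count - 1; i >= 0; i--)
        if (strcmp(st->entries[i].name, name) == 0)
            return st->entries[i].value;
    return NULL;
}
```

Defining "Size" at the root, then again inside a nested level, makes
lookups return the inner value until the closing brace, after which
the root definition is visible again.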

ValueMacros:
    Only string ValueMacros may be nested.
    That is, if any valueMacro definition references
    another valueMacro, the parser will assume
    the definition is a stringMacro, and the
    macro being referenced is also a stringvalue.

BlockMacros:
    A blockMacro may contain other Macrodefinitions
    but those definitions can only be referenced inside
    the block macro. They will not appear
    when the blockmacro is actually referenced.
    A BlockMacroName may NOT be substituted
    by a ValueMacro either in a *BlockMacro or
    *InsertBlock statement.



----- more parsing rules ------

The first non-null line of the root GPD sourcefile
must be:
    *GPDSpecVersion:


Arbitrary whitespace is allowed between tokens
comprising a command parameter.
Arbitrary whitespace is allowed anywhere within
a hex substring.
--------------------------------------------------------------


Currently, these are the known types of keywords:

CONSTRUCTS: introduces a construct (causes a parser context change)
    usually followed by open brace in next statement.
    construct is terminated by matching close brace.

    *UIGroup, *Switch, *Case, *Default, *Command
    *FontCartridge, *TTFontSubs, *Feature, *Option
    *OEM, *BlockMacro, *Macros
    A construct may be thought of as a type of structure
    initialization. Only certain keywords may be used
    inside of a construct. Some of these keywords may only be
    used within their associated construct and nowhere else.

LOCAL ATTRIBUTES: initializes a value in a construct.
GLOBAL ATTRIBUTES: initializes a value in the global structure.

    Local and global attributes may be subdivided into
    freefloating and fixed. A fixed attribute must be used
    in the same nesting level as the construct it is associated
    with. A freefloating attribute may be used within another
    construct as long as that construct is contained within the
    construct associated with the attribute.


SPECIAL ATTRIBUTE: initializes and adds another item to a dedicated
    or global list or a list in a construct, or has side effects
    requiring special processing.
    examples:

    *Installable? - Causes an installable feature
    to be synthesized, but the parser may deal with this after all
    Feature/Options have been parsed. So not really.

    Adds link to special tree structure:
    *Constraints, *InvalidCombination, *InvalidInstallableCombination,
    *InstalledConstraints, *NotInstalledConstraints

    The values introduced by these keywords are additive
    (like using a LIST):
    *Font


    *Command:<commandName>:<invocation> a shorthand

    *MemConfigKB a shorthand way of creating an entire
    memory option.

    LIST(<QualifiedName>,<QualifiedName>,<QualifiedName>)
    may be written as:
    <FeatureName>.LIST(<OptionName>,<OptionName>,<OptionName>)


If there are other types of keywords let me know.

Special Parsing contexts:
    in which the User defines new keywords simply by
    referencing them.

*TTFontSubs:
{
    <TTFontFaceName>: <DeviceFontID>
    .... not actually a symbol, but adds a string, number pair
    to a list. May be implemented during construction as a symbol.
}
*Macros:
{
    <ValueMacroName>:<macrovalue>
}

*FontCart: note the FontCart construct is ROOT_ONLY and
    is not multivalued. Each construct with a unique SymbolName
    corresponds to a dedicated FONTCART structure.

    If we want to make FontCarts multivalued,
    we introduce a new keyword *AvailFontCarts: LIST(symbol1, symbol2, symbol3)
    which is a FreeFloating Global.


MacroProcessing:
    =<ValueMacroName> where a value or component of a string is expected
    =<BlockMacroName> following a symbolname following a construct Keyword.
    *InsertBlock: <BlockMacroName>

recognized value types:
    ORDER :== <section>.<number>


SYMBOLS: Any user defined (not recognized by the parser) token used
to identify a statement or construct or value.
<CommandNames> are not symbols because the parser has a list
of recognized Valid Unidrv commands. Non Macro Symbols may be
forward referenced, i.e. *DefaultOption or *Constraints may reference
a symbol that is defined later.

where defined:

Associated Keyword: <symbol type>
*Macros: <ValueMacroNames> not the Group Name!
*BlockMacro: <BlockMacroNames>
*Feature: <featureSymbol>
*Option: <optionSymbol>
*OEM: <OEM group name> saved in symbol tree for possible future use.
*TTFontSubs: TTFontnames may be stored as symbols, but
    are not symbols in the strictest sense.

constructs not using symbols:
*TTFontSubs: <ON | OFF> predefined.
*UIGroup: <Group name> optional - not used by parser.
*Default: <optional tag> optional - not used by parser.
*Command: <Unidrv Command Name> predefined. CmdSelect
    is a special name which triggers special processing.
*FontCartridge: <optional tag> optional - not used by parser.
    Implementation hint: use macros to keep all definitions in one place.
    or introduce *AvailFontCart: LIST(<FontCartSymbol>, <FontCartSymbol>)
    inside constructs.

where referenced:
*InsertMacro: <BlockMacroNames>
*<ConstructKeyword>: <symboldef> =<BlockMacroName>

*<anykeyword>: =<ValueMacroName>
    except *BlockMacro, *InsertMacro, *Include
*Switch: <FeatureName>
*Case: <OptionName>


Currently the parser saves symbols defined in *Feature and *Option
keywords and remembers symbol references made in *Switch and *Case
keywords.

The include keyword:

must not appear within a macrodefinition
must not reference a macrovalue
must be terminated by a linebreak not { or } construct.

--- state machine ----

The parser treats construct keywords as operators
which change the state of the parser (create state
transitions).

The set of allowed transitions is
defined in the table AllowedTransitions.
This table enforces several rules:

the construct _TTFONTSUBS can only
appear at the root level.

no constructs may appear within
OEM, FONTCART, TTFONTSUBS, COMMAND constructs.


The following code fragment is a comprehensive list
of the allowed state transitions:


pst = astAllowedTransitions[STATE_ROOT] ;

pst[CONSTRUCT_UIGROUP] = STATE_UIGROUP;
pst[CONSTRUCT_FEATURE] = STATE_FEATURE;
pst[CONSTRUCT_SWITCH] = STATE_SWITCH_ROOT;
pst[CONSTRUCT_COMMAND] = STATE_COMMAND;
pst[CONSTRUCT_FONTCART] = STATE_FONTCART;
pst[CONSTRUCT_TTFONTSUBS] = STATE_TTFONTSUBS;
pst[CONSTRUCT_OEM] = STATE_OEM;

pst = astAllowedTransitions[STATE_UIGROUP] ;

pst[CONSTRUCT_UIGROUP] = STATE_UIGROUP;
pst[CONSTRUCT_FEATURE] = STATE_FEATURE;

pst = astAllowedTransitions[STATE_FEATURE] ;

pst[CONSTRUCT_OPTION] = STATE_OPTIONS;
pst[CONSTRUCT_SWITCH] = STATE_SWITCH_FEATURE;

pst = astAllowedTransitions[STATE_OPTIONS] ;

pst[CONSTRUCT_SWITCH] = STATE_SWITCH_OPTION;
pst[CONSTRUCT_COMMAND] = STATE_COMMAND;
pst[CONSTRUCT_OEM] = STATE_OEM;

pst = astAllowedTransitions[STATE_SWITCH_ROOT] ;

pst[CONSTRUCT_CASE] = STATE_CASE_ROOT;
pst[CONSTRUCT_DEFAULT] = STATE_DEFAULT_ROOT;

pst = astAllowedTransitions[STATE_SWITCH_FEATURE] ;

pst[CONSTRUCT_CASE] = STATE_CASE_FEATURE;
pst[CONSTRUCT_DEFAULT] = STATE_DEFAULT_FEATURE;

pst = astAllowedTransitions[STATE_SWITCH_OPTION] ;

pst[CONSTRUCT_CASE] = STATE_CASE_OPTION;
pst[CONSTRUCT_DEFAULT] = STATE_DEFAULT_OPTION;

pst = astAllowedTransitions[STATE_CASE_ROOT] ;

pst[CONSTRUCT_SWITCH] = STATE_SWITCH_ROOT;
pst[CONSTRUCT_COMMAND] = STATE_COMMAND;
pst[CONSTRUCT_OEM] = STATE_OEM;

pst = astAllowedTransitions[STATE_DEFAULT_ROOT] ;

pst[CONSTRUCT_SWITCH] = STATE_SWITCH_ROOT;
pst[CONSTRUCT_COMMAND] = STATE_COMMAND;
pst[CONSTRUCT_OEM] = STATE_OEM;

pst = astAllowedTransitions[STATE_CASE_FEATURE] ;

pst[CONSTRUCT_SWITCH] = STATE_SWITCH_FEATURE;
pst[CONSTRUCT_COMMAND] = STATE_COMMAND;
pst[CONSTRUCT_OEM] = STATE_OEM;

pst = astAllowedTransitions[STATE_DEFAULT_FEATURE] ;

pst[CONSTRUCT_SWITCH] = STATE_SWITCH_FEATURE;
pst[CONSTRUCT_COMMAND] = STATE_COMMAND;
pst[CONSTRUCT_OEM] = STATE_OEM;

pst = astAllowedTransitions[STATE_CASE_OPTION] ;

pst[CONSTRUCT_SWITCH] = STATE_SWITCH_OPTION;
pst[CONSTRUCT_COMMAND] = STATE_COMMAND;
pst[CONSTRUCT_OEM] = STATE_OEM;

pst = astAllowedTransitions[STATE_DEFAULT_OPTION] ;

pst[CONSTRUCT_SWITCH] = STATE_SWITCH_OPTION;
pst[CONSTRUCT_COMMAND] = STATE_COMMAND;
pst[CONSTRUCT_OEM] = STATE_OEM;
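
To show how such a table might be consulted, here is a cut-down,
self-contained sketch. It covers only a few of the states and
constructs listed above. STATE_INVALID, STATE_MAX, CONSTRUCT_MAX and
the two function names are assumptions made for the example and are
not taken from the parser source.

```c
/* Reduced subset of the states and constructs named in the fragment. */
typedef enum {
    STATE_ROOT, STATE_UIGROUP, STATE_FEATURE, STATE_OPTIONS,
    STATE_COMMAND,
    STATE_MAX,                     /* hypothetical: number of states   */
    STATE_INVALID = STATE_MAX      /* hypothetical: "not allowed"      */
} STATE;

typedef enum {
    CONSTRUCT_UIGROUP, CONSTRUCT_FEATURE, CONSTRUCT_OPTION,
    CONSTRUCT_COMMAND,
    CONSTRUCT_MAX                  /* hypothetical: number of constructs */
} CONSTRUCT;

static STATE astAllowedTransitions[STATE_MAX][CONSTRUCT_MAX];

/* Mark everything invalid, then enable the permitted transitions. */
void init_allowed_transitions(void)
{
    STATE *pst;
    int s, c;

    for (s = 0; s < STATE_MAX; s++)
        for (c = 0; c < CONSTRUCT_MAX; c++)
            astAllowedTransitions[s][c] = STATE_INVALID;

    pst = astAllowedTransitions[STATE_ROOT];
    pst[CONSTRUCT_UIGROUP] = STATE_UIGROUP;
    pst[CONSTRUCT_FEATURE] = STATE_FEATURE;
    pst[CONSTRUCT_COMMAND] = STATE_COMMAND;

    pst = astAllowedTransitions[STATE_UIGROUP];
    pst[CONSTRUCT_UIGROUP] = STATE_UIGROUP;
    pst[CONSTRUCT_FEATURE] = STATE_FEATURE;

    pst = astAllowedTransitions[STATE_FEATURE];
    pst[CONSTRUCT_OPTION] = STATE_OPTIONS;

    pst = astAllowedTransitions[STATE_OPTIONS];
    pst[CONSTRUCT_COMMAND] = STATE_COMMAND;

    /* The STATE_COMMAND row stays invalid: no constructs inside COMMAND. */
}

/* State entered when `construct` is seen in `current`, or STATE_INVALID. */
STATE next_state(STATE current, CONSTRUCT construct)
{
    return astAllowedTransitions[current][construct];
}
```

With this layout a rule such as "no constructs may appear within
COMMAND" needs no code of its own; it is simply a row that was never
filled in.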



--- multiple statements and redefinitions: ------

For standard attributes, if two statements containing
that attribute with different values appear in the
same construct, the attribute takes the later occurring value.

If the attribute is defined to be FreeFloating, it may appear
multiple times in different *Option or *Case constructs.
In this case if the effect of the multiple occurrences is to
add new branches which are compatible with the existing tree,
or to reinitialize the value of a node in the existing tree,
that is an accepted use of multiply occurring attributes.
However if the effect is to define a new branch which is
incompatible with the existing tree, that is an error, and
the later initialization of the attribute is ignored.

There is one exception to the rule of adding conflicting branches
to the attribute tree. That exception allows default initializers
to be created. If an attribute is assigned a value which is
subsequently made multivalued, the initial value becomes the
default initializer unless the GPD author explicitly specified
a 'default' case when making the attribute multivalued.

Note the order cannot be reversed.
An attribute which is already defined to be multivalued
cannot subsequently be defined to be fewer valued.

---- use of switch/case constructs -----

The same feature must not be referenced in nested constructs.
This will produce an attribute tree that contains the same
feature at two different levels. Similarly,
an attribute tree should not be constructed
piecemeal. It is an error if the tree is subsequently
redefined/elaborated using a different feature nesting
order.
-----
Severity of errors:

!!!!!: parser is non-compilable/non-functional unless
    this is resolved.
!!!!: unfinished functionality. Some legal GPD files
    will cause corruption.
!!!: integrity check omitted - a corrupt file may be inadvertently
    generated if resource limitations are encountered.
!!: syntax error in GPD may cause widespread corruption
!: emit useful message for user.
BUG_BUG: wish item - user friendlier error message etc.
    parser self-consistency check, self diagnostics.
    more general, elegant, faster, more complex code etc.


Note: PARANOID BUG_BUGs indicate error conditions that are
the result of coding errors (mistaken assumptions, incomplete
code paths etc) and are not the result of improper GPD syntax,
or resource constraints (overflow of fixed length buffers etc).

All originating error messages should report the name of
the function, name of variable or system call that is
out of range or invalid.

Later, if a caller function sees a failure return value,
it may want to tack on an extra message, say the
keyword or line number where the error occurred.

If a function returns with a failure condition, the caller
may at its discretion increase the severity of the error.
For example if the caller passed a string to be parsed
and it failed, the string parsing function may raise a tiny
error condition. But if the caller was going to use the
string to open a GPD or resource file, then this suddenly
becomes a major problem.

A function may
never reduce the severity of an error unless code was just
executed which will mitigate the source of the problem.
Don't select ERRSEV_RESTART unless there is a handler
on the next go-round to solve the initial problem.
An endless loop may result otherwise.
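
The escalation rule can be written down as a small helper. This is a
sketch under assumptions: only ERRSEV_RESTART is named in these notes,
so the other enumerators, their ordering and the function name
update_severity are hypothetical.

```c
/*
 * Hypothetical severity scale, least to most severe.  Only
 * ERRSEV_RESTART appears in the notes; its rank here is a guess.
 */
typedef enum {
    ERRSEV_NONE,
    ERRSEV_WARNING,
    ERRSEV_CONTINUE,
    ERRSEV_RESTART,
    ERRSEV_FATAL
} ERRSEV;

/*
 * Combine the severity recorded so far with the one a function wants
 * to set.  Raising is always allowed.  Lowering is honoured only when
 * the caller states that it has just mitigated the problem.
 */
ERRSEV update_severity(ERRSEV current, ERRSEV requested, int mitigated)
{
    if (requested >= current)
        return requested;
    return mitigated ? requested : current;
}
```

A caller that turns a failed string parse into a failed file open
would pass a higher `requested` value; a caller with no recovery code
passes mitigated = 0 and cannot lower the recorded severity.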