Leaked source code of windows server 2003

  1. Definitions:
  2. statement delimiter:
  3. a new statement will always follow a linebreak
  4. character:
  5. a) unless the following character is the continuation character
  6. b) if the following character is another linebreak character,
  7. it is a null statement.
  8. c) statements consisting entirely of whitespace characters,
  9. comments or combinations are considered null statements.
  10. d) the parser ignores null statements.
  11. e) parser eats everything up to linebreak character and each
  12. subsequent line that begins with the continuation char
  13. in the event of a parsing error.
  14. The Construct delimiters are a statement in and of themselves
  15. and they also serve to terminate the previous statement
  16. and delimit the start of the following statement.
  17. you may think of a <construct_delimiter> as equivalent to
  18. <linebreak><construct_delimiter><linebreak>
  19. Whitespace characters: space, TAB
  20. Linebreak characters: <CR>, <LF>. If both occur consecutively
  21. the pair is treated as one linebreak.
  22. White characters: either Whitespace or Linebreak characters.
  23. continuation character: +
  24. NEW DEF: when used as a continuation char
  25. the '+' must appear as the first char in a line.
  26. The parser will interpret the preceding linebreak
  27. as whitespace.
  28. Logical linebreak: a linebreak character that is NOT followed by a
  29. continuation character.
  30. Continuation linebreak: a linebreak character followed by a
  31. continuation character.
  32. Arbitrary whitespace: Whitespace chars or comments or
  33. Continuation Linebreak
  34. Construct delimiters: curly braces { }
  35. A linebreak character is not required to precede or follow
  36. a construct delimiter.
  37. You may think of a <construct_delimiter> as logically equivalent
  38. to <linebreak><construct_delimiter><linebreak>
  39. Construct delimiters also delimit the scope of macro
  40. definitions. If a macro was defined within a nesting level
  41. created by a pair of construct delimiters, it remains defined
  42. only within that nesting level.
  43. Nesting level: the logical set of statements enclosed by
  44. a matching pair of construct delimiters.
  45. Statement delimiter: Logical linebreak or Construct delimiter.
  46. Keyword - value separator: colon :
  47. string delimiter: double quotes ""
  48. macro invocation: equals =
  49. comment indicator: *%
  50. Comments may be inserted immediately preceding any logical
  51. or continuation linebreak. They may contain
  52. any characters except linebreak characters.
  53. The linebreak character terminates the comment.
  54. The comment indicator must be preceded by a White Character.
  55. (unless the comment indicator is the first byte in the source file.)
  56. Grouping operator: () used in certain value constructs.
  57. Hex substring delimiter: <>
  58. Escape character : % used in string and parameter constructs.
  59. The above characters have reserved meanings
  60. and may not be used in any keyword, symbol name
  61. or any user defined name.
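
The distinction between logical and continuation linebreaks defined above can be sketched in C. This is only an illustration, not the parser's actual code: the names `LinebreakKind` and `classify_linebreak` are hypothetical, and treating a CR/LF pair in either order as one linebreak is an assumption (the notes only say "both occur consecutively").

```c
#include <stddef.h>

/* Hypothetical classification of the text found at a linebreak. */
typedef enum {
    LB_NONE,          /* p does not point at a linebreak character       */
    LB_LOGICAL,       /* linebreak NOT followed by '+': ends a statement */
    LB_CONTINUATION   /* linebreak followed by '+': acts as whitespace   */
} LinebreakKind;

static int is_linebreak_char(char c)
{
    return c == '\r' || c == '\n';
}

/* Classify the linebreak at p (a NUL-terminated buffer).
   *pLen receives the number of bytes in the linebreak itself:
   2 for a consecutive CR/LF pair, 1 for a single character, 0 if none. */
LinebreakKind classify_linebreak(const char *p, size_t *pLen)
{
    size_t n;

    *pLen = 0;
    if (!is_linebreak_char(p[0]))
        return LB_NONE;

    n = 1;
    /* Assumption: CR+LF or LF+CR counts as a single linebreak. */
    if (is_linebreak_char(p[1]) && p[1] != p[0])
        n = 2;

    *pLen = n;
    /* The continuation char must be the first char of the next line. */
    return (p[n] == '+') ? LB_CONTINUATION : LB_LOGICAL;
}
```

A tokenizer built on this would skip `*pLen` bytes plus the `+` for a continuation linebreak and treat the result as whitespace, and would end the current statement on a logical linebreak.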
  62. Name spaces:
  63. If attribute keywords are used within other constructs,
  64. say a feature or global keyword inside an option,
  65. that keyword must be declared using the EXTERN_FEATURE:
  66. or EXTERN_GLOBAL: modifier. Otherwise the attribute type
  67. expected (dictionary used) will be defined by the state.
  68. the namespaces of the attribute keywords for various constructs
  69. may overlap each other, since we rely on the above rules to
  70. assign the namespace, but they cannot overlap non-attribute
  71. keywords.
  72. Keywords: predefined keywords must begin with '*'
  73. the remainder of the keyword may be comprised of
  74. 'A' to 'Z', 'a' to 'z', '0' - '9', '_' and may
  75. be terminated by an optional '?'.
  76. Symbol Keywords do not begin with '*' and may be any
  77. name defined by the user, they must be comprised of the same
  78. characters as normal keywords. Symbol Keywords are used
  79. as the name of Value macros and Font names in certain constructs.
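
The character rules for keywords can be checked with a few small C functions. This is a sketch under stated assumptions, not the parser's own tokenizer: the function names are hypothetical, a keyword body is assumed to be non-empty, and Symbol Keywords are assumed to accept the same optional trailing '?' as predefined keywords.

```c
#include <stddef.h>
#include <string.h>

/* Characters allowed in a keyword body: 'A'-'Z', 'a'-'z', '0'-'9', '_'. */
static int is_name_char(char c)
{
    return (c >= 'A' && c <= 'Z') ||
           (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9') ||
           c == '_';
}

/* Returns 1 if s[0..len) is a non-empty run of name characters,
   optionally terminated by a single '?'. */
static int is_name_body(const char *s, size_t len)
{
    size_t i;

    if (len > 0 && s[len - 1] == '?')
        len--;                       /* strip the optional trailing '?' */
    if (len == 0)
        return 0;                    /* assumption: body may not be empty */
    for (i = 0; i < len; i++)
        if (!is_name_char(s[i]))
            return 0;
    return 1;
}

/* Predefined keywords must begin with '*'. */
int is_predefined_keyword(const char *tok)
{
    return tok[0] == '*' && is_name_body(tok + 1, strlen(tok + 1));
}

/* Symbol Keywords use the same characters but must not begin with '*'. */
int is_symbol_keyword(const char *tok)
{
    return tok[0] != '*' && is_name_body(tok, strlen(tok));
}
```

Because reserved characters such as `{`, `:` and `=` are not name characters, a token containing any of them is rejected by both checks.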
  80. Parsing rules:
  81. Values: on some Keywords the Value is
  82. ignored by the parser. In these cases
  83. the Value (and the : delimiter) may be omitted
  84. for example:
  85. *Macros
  86. *Macros: PaperNames
  87. are both valid.
  88. Block Macro definitions:
  89. if the definition (body of a BlockMacro)
  90. contains braces, the braces must appear in pairs
  91. and in the correct order, i.e. { must appear before }.
  92. braces may be nested within the body.
  93. Parsing Level 0: this is the outermost level of parsing.
  94. At this level the characters {, } are interpreted
  95. as construct delimiters, *% begins a comment etc.
  96. This is contrasted to the parsing rules applied to
  97. higher level objects like strings where these characters
  98. have no special meaning.
  99. Linebreak characters may only appear in parsing level 0.
  100. Their appearance at any time terminates parsing of the current statement.
  101. parser eats everything up to linebreak character
  102. in the event of a parsing error.
  103. Statements begin with either a *<keyword> or a <symbol keyword>
  104. where the keyword is a parser recognized keyword token and
  105. where <symbol keyword> is a parser unrecognized token
  106. which may represent a ValueMacroName or a TTFontName.
  107. such tokens must not begin with '*'. and will be marked
  108. as SYMBOL in the TokenMap.
  109. In general arbitrary Whitespaces may appear between
  110. any entities recognized at Parsing level 0.
  111. If Whitespaces are permitted within such entities
  112. it will be noted.
  113. For added robustness, macro strings will not be
  114. expanded to their binary equivalents. This prevents the
  115. insertion of random Linebreak characters in the stream.
  116. Rule for GPD authors: for maximum robustness and error
  117. recovery, place level 0 braces (Construct delimiters) on a
  118. separate line. Do not place Construct delimiters on the same
  119. line where questionable keywords and constructs are used.
  120. Heap usage:
  121. the heap will be divided into several sections, each large enough
  122. to hold whatever may come. Growth of the heap sections
  123. is not allowed.
  124. Strings, composite objects like RECTS: holds all strings, offsets referenced
  125. from beginning of string section.
  126. Arrays of various types: each type of array is assigned its own dedicated
  127. memory buffer. A master table contains pointers to each array,
  128. its size and current entry. Once data is entered into the array,
  129. the keeper of the data need only remember the index of the array
  130. the data was written into.
  131. After all parsing operations are complete, we will consolidate the
  132. arrays into one memory space and update the master table accordingly.
  133. Some composite values require an indefinite amount of storage
  134. or reside in dedicated structures.
  135. (for example strings, lists and UIconstraints).
  136. Such values are stored in 2 parts, a fixed sized link,
  137. and the part the link refers to. This part may be variable
  138. sized and may occupy heap space or one or more dedicated
  139. structures or some combination thereof.
  140. Since the link is always of a known size, it may be stored
  141. in a field in a structure etc.
  142. The following table lists the values supported by the parser
  143. and how they are structured:
  144. value type: Strings
  145. link: ARRAYREF
  146. dwOffset field specifies heap offset of start of string
  147. dwCount field specifies string length excluding
  148. terminating NULL.
  149. body: Null terminated array of bytes stored in heap.
  150. value type: LIST
  151. value type: QualifiedName
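
The Strings entry in the table above (a fixed-size ARRAYREF link plus a null-terminated body in the heap) can be sketched as follows. `ARRAYREF`, `dwOffset` and `dwCount` come from the notes; `StringHeap`, `STRING_HEAP_SIZE`, `heap_add_string` and `heap_string` are hypothetical names, and `DWORD` is declared locally as a stand-in for the Windows type.

```c
#include <stddef.h>
#include <string.h>

typedef unsigned long DWORD;   /* stand-in for the Windows type */

/* Fixed-size link, using the two fields named in the notes. */
typedef struct {
    DWORD dwOffset;   /* heap offset of the start of the string        */
    DWORD dwCount;    /* string length, excluding the terminating NULL */
} ARRAYREF;

/* Hypothetical string section of fixed size; it is never grown. */
#define STRING_HEAP_SIZE 64

typedef struct {
    char  data[STRING_HEAP_SIZE];
    DWORD used;       /* bytes already occupied */
} StringHeap;

/* Copy s into the heap and fill in *pRef.
   Returns 1 on success, 0 if the section is too small. */
int heap_add_string(StringHeap *h, const char *s, ARRAYREF *pRef)
{
    size_t len = strlen(s);

    if (len + 1 > STRING_HEAP_SIZE - h->used)
        return 0;                            /* growth is not allowed */

    memcpy(h->data + h->used, s, len + 1);   /* body includes the NULL */
    pRef->dwOffset = h->used;
    pRef->dwCount  = (DWORD)len;
    h->used += (DWORD)(len + 1);
    return 1;
}

/* Follow a link back to the stored, null-terminated bytes. */
const char *heap_string(const StringHeap *h, const ARRAYREF *pRef)
{
    return h->data + pRef->dwOffset;
}
```

Since the link is two fixed-width fields, it can sit inside any structure while the variable-length body stays in the string section.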
  152. Shortcuts that cause headaches:
  153. *Command: 2 forms exist.
  154. Macros:
  155. a macroDefinition cannot be self-referencing
  156. macroDefinitions cannot be forward referenced.
  157. ie only a previously defined and fully resolved macro can be
  158. referenced.
  159. scope: a Macro is defined (referenceable) only after parsing the
  160. closing brace of its definition and until encountering a closing
  161. brace that signifies the termination of the level the macro
  162. was defined in.
  163. namespaces: since macro definitions are stored in a stack,
  164. defining a second macro with the same name does not necessarily
  165. destroy the first definition. If The first macro was defined
  166. outside of the scope of the 2nd, it will be visible once
  167. the parser leaves the scope of the 2nd Macro.
  168. ValueMacros:
  169. Only string ValueMacros may be nested.
  170. That is, if any valueMacro definition references
  171. another valueMacro, the parser will assume
  172. the definition is a stringMacro, and the
  173. macro being referenced is also a stringvalue.
  174. BlockMacros:
  175. a blockMacro may contain other Macrodefinitions
  176. but those definitions can only be referenced inside
  177. the block macro. They will not appear
  178. when the blockmacro is actually referenced.
  179. A BlockMacroName may NOT be substituted
  180. by a ValueMacro either in a *BlockMacro or
  181. *InsertBlock statement.
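
The scope and namespace rules for macros above (definitions kept in a stack, with an inner definition hiding an outer one until its nesting level closes) could look roughly like this in C. All names here (`MacroStack`, `macro_define`, and so on) are hypothetical, the capacity is arbitrary, and the sketch ignores the rule that a macro only becomes visible after the closing brace of its own definition.

```c
#include <stddef.h>
#include <string.h>

#define MAX_MACROS 32          /* hypothetical fixed capacity */

typedef struct {
    const char *name;
    const char *value;
    int         level;         /* nesting level the macro was defined in */
} MacroEntry;

typedef struct {
    MacroEntry entries[MAX_MACROS];
    int        count;
    int        level;          /* current nesting level, 0 = root */
} MacroStack;

/* Opening construct delimiter: enter a new nesting level. */
void macro_open_brace(MacroStack *ms)
{
    ms->level++;
}

/* Closing construct delimiter: drop every macro defined in the level
   being left, which makes shadowed outer definitions visible again. */
void macro_close_brace(MacroStack *ms)
{
    if (ms->level == 0)
        return;                /* unmatched brace: nothing to close */
    while (ms->count > 0 && ms->entries[ms->count - 1].level == ms->level)
        ms->count--;
    ms->level--;
}

/* Define a macro at the current level. Returns 0 if the stack is full. */
int macro_define(MacroStack *ms, const char *name, const char *value)
{
    if (ms->count == MAX_MACROS)
        return 0;
    ms->entries[ms->count].name  = name;
    ms->entries[ms->count].value = value;
    ms->entries[ms->count].level = ms->level;
    ms->count++;
    return 1;
}

/* Search from the top so the most recent definition wins. Only
   previously defined macros are found, so forward references fail. */
const char *macro_lookup(const MacroStack *ms, const char *name)
{
    int i;

    for (i = ms->count - 1; i >= 0; i--)
        if (strcmp(ms->entries[i].name, name) == 0)
            return ms->entries[i].value;
    return NULL;
}
```

Searching top-down is what lets a second macro with the same name coexist with the first without destroying it.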
  182. ----- more parsing rules ------
  183. the first non-null line of the root GPD sourcefile
  184. must be:
  185. *GPDSpecVersion:
  186. arbitrary whitespace is allowed between tokens
  187. comprising a command parameter.
  188. arbitrary whitespace is allowed anywhere within
  189. a hex substring.
  190. --------------------------------------------------------------
  191. Currently, these are the known types of keywords:
  192. CONSTRUCTS: introduces a construct (causes a parser context change)
  193. usually followed by open brace in next statement.
  194. construct is terminated by matching close brace.
  195. *UIGroup, *Switch, *Case, *Default, *Command
  196. *FontCartridge, *TTFontSubs, *Feature, *Option
  197. *OEM, *BlockMacro, *Macros
  198. a construct may be thought of as a type of structure
  199. initialization. Only certain keywords may be used
  200. inside of a construct. Some of these keywords may only be
  201. used within their associated construct and nowhere else.
  202. LOCAL ATTRIBUTES: initializes a value in a construct.
  203. GLOBAL ATTRIBUTES: initializes a value in the global structure.
  204. local and global attributes may be subdivided into
  205. freefloating and fixed. A fixed attribute must be used
  206. in the same nesting level as the construct it is associated
  207. with. A freefloating attribute may be used within another
  208. construct as long as that construct is contained within the
  209. construct associated with the attribute.
  210. SPECIAL ATTRIBUTE: initializes and adds another item to a dedicated
  211. or global list or a list in construct. or has side effects
  212. requiring special processing.
  213. examples:
  214. *Installable?, - Causes an installable feature
  215. to be synthesized. but parser may deal with this after all
  216. Feature/Options have been parsed. So not really.
  217. Adds link to special tree structure:
  218. *Constraints, *InvalidCombination, *InvalidInstallableCombination,
  219. *InstalledConstraints, *NotInstalledConstraints
  220. The values introduced by these keywords are additive
  221. (like using a LIST):
  222. *Font
  223. *Command:<commandName>:<invocation> a shorthand
  224. *MemConfigKB a shorthand way of creating an entire
  225. memory option.
  226. LIST(<QualifiedName>,<QualifiedName>,<QualifiedName>)
  227. may be written as:
  228. <FeatureName>.LIST(<OptionName>,<OptionName>,<OptionName>)
  229. if there are other types of keywords let me know.
  230. Special Parsing contexts:
  231. in which User defines new keywords simply by
  232. referencing them.
  233. *TTFontSubs:
  234. {
  235. <TTFontFaceName>: <DeviceFontID>
  236. .... not actually a symbol, but adds a string, number pair
  237. to a list. May be implemented during construction as a symbol.
  238. }
  239. *Macros:
  240. {
  241. <ValueMacroName>:<macrovalue>
  242. }
  243. *FontCart: note the FontCart construct is ROOT_ONLY and
  244. is not multivalued. Each construct with a unique SymbolName
  245. corresponds to a dedicated FONTCART structure.
  246. If we want to make FontCarts multivalued,
  247. we introduce a new keyword *AvailFontCarts: LIST(symbol1, symbol2, symbol3)
  248. which is a FreeFloating Global.
  249. MacroProcessing:
  250. =<ValueMacroName> where a value or component of a string is expected
  251. =<BlockMacroName> following a symbolname following a construct Keyword.
  252. *InsertBlock: <BlockMacroName>
  253. recognized value types:
  254. ORDER :== <section>.<number>
  255. SYMBOLS: Any user defined (not recognized by the parser) token used
  256. to identify a statement or construct or value.
  257. <CommandNames> are not symbols because the parser has a list
  258. of recognized Valid Unidrv commands. Non Macro Symbols may be
  259. forward referenced: ie *DefaultOption or *Constraints may reference
  260. a symbol that is defined later.
  261. where defined:
  262. Associated Keyword: <symbol type>
  263. *Macros: <ValueMacroNames> not the Group Name!
  264. *BlockMacro: <BlockMacroNames>
  265. *Feature: <featureSymbol>
  266. *Option: <optionSymbol>
  267. *OEM: <OEM group name> saved in symbol tree for possible future use.
  268. *TTFontSubs: TTFontnames may be stored as symbols, but
  269. are not symbols in the strictest sense.
  270. constructs not using symbols:
  271. *TTFontSubs: <ON | OFF> predefined.
  272. *UIGroup: <Group name> optional - not used by parser.
  273. *Default: <optional tag> optional - not used by parser.
  274. *Command: <Unidrv Command Name> predefined. CmdSelect
  275. is a special name which triggers special processing.
  276. *FontCartridge: <optional tag> optional - not used by parser.
  277. Implementation hint: use macros to keep all definitions in one place.
  278. or introduce *AvailFontCart: LIST(<FontCartSymbol>, <FontCartSymbol>)
  279. inside constructs.
  280. where referenced:
  281. *InsertMacro: <BlockMacroNames>
  282. *<ConstructKeyword>: <symboldef> =<BlockMacroName>
  283. *<anykeyword>: =<ValueMacroName>
  284. except *BlockMacro, *InsertMacro, *Include
  285. *Switch: <FeatureName>
  286. *Case: <OptionName>
  287. Currently the parser saves symbols defined in *Feature and *Option
  288. keywords and remembers symbol references made in *Switch and *Case
  289. keywords.
  290. The include keyword:
  291. must not appear within a macrodefinition
  292. must not reference a macrovalue
  293. must be terminated by a linebreak not { or } construct.
  294. --- state machine ----
  295. The parser treats construct keywords as operators
  296. which change the state of the parser. (create state
  297. transitions.)
  298. the set of allowed transitions is
  299. defined in the table AllowedTransitions
  300. this table enforces several rules:
  301. the construct _TTFONTSUBS can only
  302. appear at the root level.
  303. no constructs may appear within
  304. OEM, FONTCART, TTFONTSUBS, COMMAND constructs.
  305. The following code fragment is a comprehensive list
  306. of the allowed state transitions:
  307. pst = astAllowedTransitions[STATE_ROOT] ;
  308. pst[CONSTRUCT_UIGROUP] = STATE_UIGROUP;
  309. pst[CONSTRUCT_FEATURE] = STATE_FEATURE;
  310. pst[CONSTRUCT_SWITCH] = STATE_SWITCH_ROOT;
  311. pst[CONSTRUCT_COMMAND] = STATE_COMMAND;
  312. pst[CONSTRUCT_FONTCART] = STATE_FONTCART;
  313. pst[CONSTRUCT_TTFONTSUBS] = STATE_TTFONTSUBS;
  314. pst[CONSTRUCT_OEM] = STATE_OEM;
  315. pst = astAllowedTransitions[STATE_UIGROUP] ;
  316. pst[CONSTRUCT_UIGROUP] = STATE_UIGROUP;
  317. pst[CONSTRUCT_FEATURE] = STATE_FEATURE;
  318. pst = astAllowedTransitions[STATE_FEATURE] ;
  319. pst[CONSTRUCT_OPTION] = STATE_OPTIONS;
  320. pst[CONSTRUCT_SWITCH] = STATE_SWITCH_FEATURE;
  321. pst = astAllowedTransitions[STATE_OPTIONS] ;
  322. pst[CONSTRUCT_SWITCH] = STATE_SWITCH_OPTION;
  323. pst[CONSTRUCT_COMMAND] = STATE_COMMAND;
  324. pst[CONSTRUCT_OEM] = STATE_OEM;
  325. pst = astAllowedTransitions[STATE_SWITCH_ROOT] ;
  326. pst[CONSTRUCT_CASE] = STATE_CASE_ROOT;
  327. pst[CONSTRUCT_DEFAULT] = STATE_DEFAULT_ROOT;
  328. pst = astAllowedTransitions[STATE_SWITCH_FEATURE] ;
  329. pst[CONSTRUCT_CASE] = STATE_CASE_FEATURE;
  330. pst[CONSTRUCT_DEFAULT] = STATE_DEFAULT_FEATURE;
  331. pst = astAllowedTransitions[STATE_SWITCH_OPTION] ;
  332. pst[CONSTRUCT_CASE] = STATE_CASE_OPTION;
  333. pst[CONSTRUCT_DEFAULT] = STATE_DEFAULT_OPTION;
  334. pst = astAllowedTransitions[STATE_CASE_ROOT] ;
  335. pst[CONSTRUCT_SWITCH] = STATE_SWITCH_ROOT;
  336. pst[CONSTRUCT_COMMAND] = STATE_COMMAND;
  337. pst[CONSTRUCT_OEM] = STATE_OEM;
  338. pst = astAllowedTransitions[STATE_DEFAULT_ROOT] ;
  339. pst[CONSTRUCT_SWITCH] = STATE_SWITCH_ROOT;
  340. pst[CONSTRUCT_COMMAND] = STATE_COMMAND;
  341. pst[CONSTRUCT_OEM] = STATE_OEM;
  342. pst = astAllowedTransitions[STATE_CASE_FEATURE] ;
  343. pst[CONSTRUCT_SWITCH] = STATE_SWITCH_FEATURE;
  344. pst[CONSTRUCT_COMMAND] = STATE_COMMAND;
  345. pst[CONSTRUCT_OEM] = STATE_OEM;
  346. pst = astAllowedTransitions[STATE_DEFAULT_FEATURE] ;
  347. pst[CONSTRUCT_SWITCH] = STATE_SWITCH_FEATURE;
  348. pst[CONSTRUCT_COMMAND] = STATE_COMMAND;
  349. pst[CONSTRUCT_OEM] = STATE_OEM;
  350. pst = astAllowedTransitions[STATE_CASE_OPTION] ;
  351. pst[CONSTRUCT_SWITCH] = STATE_SWITCH_OPTION;
  352. pst[CONSTRUCT_COMMAND] = STATE_COMMAND;
  353. pst[CONSTRUCT_OEM] = STATE_OEM;
  354. pst = astAllowedTransitions[STATE_DEFAULT_OPTION] ;
  355. pst[CONSTRUCT_SWITCH] = STATE_SWITCH_OPTION;
  356. pst[CONSTRUCT_COMMAND] = STATE_COMMAND;
  357. pst[CONSTRUCT_OEM] = STATE_OEM;
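
To show how such a table might be declared and consulted, here is a self-contained C sketch covering only the first four rows of the fragment above (the switch/case/default rows are left out for brevity). The `STATE_` and `CONSTRUCT_` names and `astAllowedTransitions` are taken from the fragment; `STATE_INVALID`, the enum layout, `init_allowed_transitions` and `next_state` are assumptions made for illustration.

```c
#include <stddef.h>

typedef enum {
    STATE_INVALID = 0,   /* hypothetical marker: transition not allowed */
    STATE_ROOT, STATE_UIGROUP, STATE_FEATURE, STATE_OPTIONS,
    STATE_SWITCH_ROOT, STATE_SWITCH_FEATURE, STATE_SWITCH_OPTION,
    STATE_COMMAND, STATE_FONTCART, STATE_TTFONTSUBS, STATE_OEM,
    STATE_COUNT
} STATE;

typedef enum {
    CONSTRUCT_UIGROUP, CONSTRUCT_FEATURE, CONSTRUCT_OPTION,
    CONSTRUCT_SWITCH, CONSTRUCT_COMMAND, CONSTRUCT_FONTCART,
    CONSTRUCT_TTFONTSUBS, CONSTRUCT_OEM,
    CONSTRUCT_COUNT
} CONSTRUCT;

static STATE astAllowedTransitions[STATE_COUNT][CONSTRUCT_COUNT];

/* Fill the table: everything is disallowed unless listed below. */
void init_allowed_transitions(void)
{
    STATE *pst;
    size_t i, j;

    for (i = 0; i < STATE_COUNT; i++)
        for (j = 0; j < CONSTRUCT_COUNT; j++)
            astAllowedTransitions[i][j] = STATE_INVALID;

    pst = astAllowedTransitions[STATE_ROOT];
    pst[CONSTRUCT_UIGROUP]    = STATE_UIGROUP;
    pst[CONSTRUCT_FEATURE]    = STATE_FEATURE;
    pst[CONSTRUCT_SWITCH]     = STATE_SWITCH_ROOT;
    pst[CONSTRUCT_COMMAND]    = STATE_COMMAND;
    pst[CONSTRUCT_FONTCART]   = STATE_FONTCART;
    pst[CONSTRUCT_TTFONTSUBS] = STATE_TTFONTSUBS;
    pst[CONSTRUCT_OEM]        = STATE_OEM;

    pst = astAllowedTransitions[STATE_UIGROUP];
    pst[CONSTRUCT_UIGROUP] = STATE_UIGROUP;
    pst[CONSTRUCT_FEATURE] = STATE_FEATURE;

    pst = astAllowedTransitions[STATE_FEATURE];
    pst[CONSTRUCT_OPTION] = STATE_OPTIONS;
    pst[CONSTRUCT_SWITCH] = STATE_SWITCH_FEATURE;

    pst = astAllowedTransitions[STATE_OPTIONS];
    pst[CONSTRUCT_SWITCH]  = STATE_SWITCH_OPTION;
    pst[CONSTRUCT_COMMAND] = STATE_COMMAND;
    pst[CONSTRUCT_OEM]     = STATE_OEM;
}

/* Look up the state entered when 'construct' is seen in 'current'.
   STATE_INVALID means the construct is not allowed there. */
STATE next_state(STATE current, CONSTRUCT construct)
{
    return astAllowedTransitions[current][construct];
}
```

With this layout the two rules stated earlier fall out of the data: only the STATE_ROOT row has an entry for CONSTRUCT_TTFONTSUBS, and the rows for OEM, FONTCART, TTFONTSUBS and COMMAND stay entirely invalid.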
  358. --- multiple statements and redefinitions: ------
  359. for standard attributes, if two statements containing
  360. that attribute with different values appear in the
  361. same construct, the attribute takes the later occurring value.
  362. If the attribute is defined to be FreeFloating, it may appear
  363. multiple times in different *Option or *Case constructs.
  364. In this case if the effect of the multiple occurrences is to
  365. add new branches which are compatible with the existing tree,
  366. or to reinitialize the value of a node in the existing tree
  367. that is an accepted use of multiply occurring attributes.
  368. However if the effect is to define a new branch which is
  369. incompatible with the existing tree, that is an error, and
  370. the latter initialization of the attribute is ignored.
  371. There is one exception to the rule of adding conflicting branches
  372. to the attribute tree. That exception allows default initializers
  373. to be created. If an attribute is assigned a value which is
  374. subsequently made multivalued, the initial value becomes the
  375. default initializer unless the GPD author explicitly specified
  376. a 'default' case when making the attribute multivalued.
  377. Note the order cannot be reversed.
  378. An attribute which is already defined to be multivalued
  379. cannot subsequently be defined to be fewer valued.
  388. ---- use of switch/case constructs -----
  389. The same feature must not be referenced in nested constructs.
  390. This will produce an attribute tree that contains the same
  391. feature at two different levels. similarly...
  392. an attribute tree should not be constructed
  393. piecemeal. It is an error if the tree is subsequently
  394. redefined/elaborated using a different feature nesting
  395. order.
  396. -----
  397. Severity of errors:
  398. !!!!!: parser is non-compilable/non-functional unless
  399. this is resolved.
  400. !!!!: unfinished functionality. Some legal GPD files
  401. will cause corruption.
  402. !!!: integrity check omitted - a corrupt file may be inadvertently
  403. generated if resource limitations are encountered.
  404. !!: syntax error in GPD may cause widespread corruption
  405. !: emit useful message for user.
  406. BUG_BUG: wish item - user friendlier error message etc.
  407. parser self-consistency check, self diagnostics.
  408. more general, elegant, faster, more complex code etc.
  409. Note: PARANOID BUG_BUGs indicate error conditions that are
  410. the result of coding errors (mistaken assumptions, incomplete
  411. code paths etc) and are not the result of improper GPD syntax,
  412. or resource constraints (overflow of fixed length buffers etc).
  413. All originating error messages should report the name of
  414. the function, name of variable or system call that is
  415. out of range or invalid.
  416. Later, if a caller function sees a failure return value,
  417. it may want to tack on an extra message, say the
  418. keyword or line number where the error occurred.
  419. If a function returns with a failure condition, the caller
  420. may at its discretion increase the severity of the error.
  421. For example if the caller passed a string to be parsed
  422. and it failed, the string parsing function may raise a tiny
  423. error condition. But if the caller was going to use the
  424. string to open a GPD or resource file, then this suddenly
  425. becomes a major problem.
  426. A function may
  427. never reduce the severity of an error unless code was just
  428. executed which will mitigate the source of the problem.
  429. Don't select ERRSEV_RESTART unless there is a handler
  430. on the next go round to solve the initial problem.
  431. An endless loop may result otherwise.
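
The severity rules above can be summarized in a short C sketch. Only `ERRSEV_RESTART` is named in the notes; the other enum members, their ordering, and the three function names are assumptions made purely for illustration.

```c
/* Hypothetical severity scale, lowest to highest. */
typedef enum {
    ERRSEV_NONE = 0,
    ERRSEV_CONTINUE,
    ERRSEV_RESTART,
    ERRSEV_FATAL
} ERRSEV;

/* A caller may raise the severity reported by a callee,
   but this merge never lowers it. */
ERRSEV escalate_severity(ERRSEV fromCallee, ERRSEV callerView)
{
    return (callerView > fromCallee) ? callerView : fromCallee;
}

/* Severity may only be reduced if code was just executed that
   mitigates the source of the problem (fMitigated != 0). */
ERRSEV mitigate_severity(ERRSEV current, ERRSEV reduced, int fMitigated)
{
    if (fMitigated && reduced < current)
        return reduced;
    return current;
}

/* Select ERRSEV_RESTART only when a handler exists for the next pass;
   otherwise fall back to a fatal error (an assumption) so that the
   parser cannot loop endlessly. */
ERRSEV select_restart(int fHandlerAvailable)
{
    return fHandlerAvailable ? ERRSEV_RESTART : ERRSEV_FATAL;
}
```

In the string-parsing example from the notes, the parsing function would report a low severity and the caller that needed the string to open a GPD file would pass a higher value to `escalate_severity`.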