Source code of Windows XP (NT5)

  1. This directory and its subdirectories contain source and
  2. binary files for the statistics support packages that can
  3. be run across multiple platforms.
  4. The directory is organized as follows:
  5. 1) .\ ->
  6. a) stat.c (single c source)
  7. b) makefile.rst (common for windows, OS2 16, and OS2386)
  8. c) makefile (for Windows NT)
  9. d) sources (sources file for Windows NT)
  10. e) The header file, teststat.h, required for building
  11. the dlls is under ..\inc\.
  12. 2) .\win (FOR WINDOWS)
  13. a) .\src (contains the remaining .asm file, and the
  14. module def file)
  15. b) .\bin (the binary statwin.dll file)
  16. 3) .\WIN32 (FOR WIN32 APPS)
  17. a) .\src (contains the .def file and an i386 sub-dir.
  18. and a mips subdir, each containing an asm file)
  19. b) .\bin (the binary file)
  20. 4) .\os2286 (FOR 16 bit OS/2 Cruiser and Sloop apps)
  21. a) .\src (the module def file)
  22. b) .\bin (the binary stat286.dll file)
  23. 5) .\os2386 (FOR 32 bit OS/2 Cruiser apps)
  24. a) .\src (the module def file)
  25. b) .\bin (the binary stat386.dll file)
  26. **********************************************************************
  27. To build an application that uses a statistics DLL:
  28. --------------------------------------------------
  29. To use one of the above binaries, please read the USAGE NOTES at the
  30. end of this document. Please copy the teststat.h file from this
  31. directory to the directory where you are building your application.
  32. Copy the relevant .dll to your libpath.
  33. It is essential that you define the type of system you are building
  34. your application for, since the header file uses some special types
  35. that are dependent on the system. While compiling your application,
  36. add the following flag: -DXXX where XXX stands for one of:
  37. WIN - for Windows applications
  38. OS2286 - for 16 bit OS/2 applications
  39. OS2386 - for 32 bit OS/2 applications
  40. WIN32 - for Win32 applications.
  41. **********************************************************************
  42. To build one of the dlls:
  43. ------------------------
  44. If building a Windows, OS2 16 or OS/2 32 bit dll:
  45. --------------------------------------------------------
  46. a) Copy the stat.c file found under this directory
  47. and teststat.h from ..\inc\. to a local directory.
  48. Copy the .asm file from win\src if building for WIN.
  49. b) Also copy the "makefile.rst" from here to the same directory.
  50. c) From ???\src copy the remaining files to the local directory,
  51. where ??? represents win, os2286 or os2386.
  52. d) Edit "makefile.rst" to define the system that you are making the
  53. dll for. E.g., if you are making the dll for Windows, remove
  54. the comment sign (#) from the line "WIN=TRUE" in the makefile
  55. and ensure that the other system defines (OS2286 and OS2386)
  56. are commented out.
  57. e) Type "nmake -f makefile.rst" and the dll will be created for
  58. you. (Ensure that your development environment is set up for
  59. the right system).
  60. If building the Win32 dll:
  61. -------------------------
  62. a) Copy stat.c, makefile and sources files found under this
  63. directory and teststat.h from ..\inc\. to a local directory.
  64. b) tc the win32\src directory to your local directory. This
  65. will create an i386 (or mips) sub-directory containing an asm
  66. file on your local machine.
  67. c) From the directory where you have your sources file, type
  68. "build -xxx statw32" from the command line, where xxx represents
  69. your target system. It is 386 by default.
  70. d) A binary file "statw32.dll" will be created along with the
  71. .obj file under .\xxx\obj where xxx is your target system.
  72. It is i386 by default.
  73. In case you have any questions, or if you run into any problems,
  74. contact vaidy (936-7812).
  75. *****************************************************************
  76. USAGE NOTES
  77. -----------
  78. This is the user's guide to using TestStat.dll, the
  79. statistical package. In case of questions contact vaidy (936-
  80. 7812).
  81. This document describes the use of each of the functions
  82. available through this module and then demonstrates the use
  83. of these routines through an example.
  84. This module provides basic statistical routines which can be
  85. used to compute average, min, max, standard deviation, and
  86. statistical convergence. Statistical convergence of the
  87. average is determined by the number of test iterations required
  88. for the average to converge to a "stable" value. The number of
  89. iterations required is computed on the fly as the data is
  90. collected, so that the caller is informed when enough data is
  91. collected (the average is stable). Stable averages obtained in
  92. this way can be compared to other stable averages obtained under
  93. different experimental conditions with known confidence levels.
  94. Notes in this document describe the meanings of stability and
  95. confidence more formally.
  96. In addition to the functionality described above, this module
  97. also provides routines that generate uniformly and normally
  98. distributed random numbers. These routines return random
  99. numbers within a specified range, uniformly distributed
  100. random numbers within the range 0 to 1 or 0 to 65535, and a
  101. normally distributed set of numbers that satisfies a given
  102. mean and standard deviation.
  103. 1) TestStatOpen:
  104. -------------
  105. Description: Allocates an instance data array for the data set
  106. and other global data structures required by the high level
  107. functions.
  108. USHORT FAR PASCAL
  109. TestStatOpen (
  110. USHORT usMinIterations,
  111. USHORT usMaxIterations
  112. );
  113. usMinIterations - The minimum number of iterations
  114. that the calling application has to run before
  115. the convergence algorithm may be used.
  116. usMaxIterations - The maximum number of iterations
  117. that the test program may run. The maximum
  118. acceptable value is 64K. An internal data array
  119. of usMaxIterations of ULONGs is allocated. The
  120. caller should bear this in mind when setting this
  121. parameter.
  122. Remarks: This routine should be called before the first
  123. call to TestStatInit. If usMinIterations is zero, an error
  124. code is returned. If usMinIterations is greater than
  125. usMaxIterations an error code is returned. If usMinIterations
  126. is equal to usMaxIterations, TestStatConverge will return TRUE
  127. after that many iterations. This function frees the caller
  128. from the responsibility of allocating any data storage or
  129. book-keeping.
  130. Return Value: 0 if the call succeeded.
  131. Otherwise, an error code indicating the failure. The
  132. error code may be one of:
  133. STAT_ERROR_ILLEGAL_MIN_ITER
  134. STAT_ERROR_ILLEGAL_MAX_ITER
  135. STAT_ERROR_ALLOC_FAILED
  136. See also: TestStatInit, TestStatConverge, TestStatValues,
  137. TestStatClose.
  138. 2) TestStatInit:
  139. -------------
  140. Description: Initializes variables required by the convergence
  141. and statistics routines.
  142. VOID FAR PASCAL
  143. TestStatInit (
  144. VOID
  145. );
  146. Remarks: This routine should be called before the first
  147. call to TestStatConverge and after each call to
  148. TestStatValues, if you want to converge on a new set of
  149. data.
  150. Return Value: None.
  151. See also: TestStatOpen, TestStatClose, TestStatConverge,
  152. TestStatValues.
  153. 3) TestStatConverge:
  154. -----------------
  155. Description: Automatically computes number of iterations
  156. required for 95% confidence in data obtained.
  157. BOOL FAR PASCAL
  158. TestStatConverge (
  159. ULONG ulNewDataPoint
  160. );
  161. ulNewDataPoint - The data point obtained for the
  162. current iteration.
  163. Remarks: This routine should be called for each
  164. iteration of the test. The first call to this routine
  165. should be preceded by a call to TestStatInit. The test
  166. program should check for the return value and should stop
  167. the test as soon as a TRUE is returned.
  168. In making tests of significance, the results will sometimes
  169. lead to an error concerning the hypothesis tested.
  170. The hypothesis is that the difference between the actual
  171. mean in one experiment and the actual mean in a second
  172. experiment is less than a specified value. This
  173. difference is expressed as a percentage of the first
  174. experiment's mean. We call this difference, the "precision"
  175. of the comparison.
  176. If the assumption is true and the results of the tests lead one
  177. to believe that it is false, the condition is described as a
  178. TYPE I error. If the assumption is false and the test results
  179. show that the two means are within the prescribed difference, the
  180. condition is described as a TYPE II error.
  181. The probability of TYPE I error is set by the significance level
  182. of the test. Choosing a small probability of one type of error
  183. increases the probability of the other type.
  184. The routines in this module operate on the following set of
  185. assumed parameters:
  186. 95% confidence that if the means differ by less than 5% they
  187. are really the same, and 85% confidence that if the means
  188. differ by more than 5% they are really different.
  189. The algorithm in this module uses these assumptions to determine
  190. the number of iterations needed to achieve these levels of
  191. confidence.
  192. The reason for emphasizing TYPE II error is that a TYPE I error
  193. indicates that the means differ, when in fact, they are the same.
  194. If they differ, we will usually explore why, and in doing so, will
  195. discover that they are not really different after all. If, on the
  196. other hand, we get a TYPE II error, then this means that the
  197. results show no difference, whereas the means really are
  198. different. This is to be avoided since if means don't differ
  199. from one run to the next, we are unlikely to look further into
  200. the problem.
  201. When additional iterations are forced by a high usMinIterations,
  202. then the resulting precision will usually be less than 5%.
  203. Conversely, when usMaxIterations are reached without converging,
  204. then the precision will be greater than 5%. The precision returned
  205. by TestStatValues will indicate how meaningful the comparisons of
  206. two means will be.
  207. Return Value: FALSE if further iterations are required for
  208. the test to converge or usMinIterations has
  209. not been reached.
  210. TRUE if already converged or maximum limit
  211. on iterations has been reached.
  212. See also: TestStatOpen, TestStatInit, TestStatValues,
  213. TestStatClose.
  214. 4) TestStatValues:
  215. ----------------
  216. Description: Automatically computes a number of useful
  217. statistical values for a given set of data.
  218. VOID FAR PASCAL
  219. TestStatValues (
  220. PSZ pszOutputString,
  221. USHORT usOutlierFactor,
  222. PULONG * pulDataArray,
  223. PUSHORT pcusElementsInArray,
  224. PUSHORT pcusDiscardedValues
  225. );
  226. pszOutputString - A pointer to a string buffer to which
  227. output data may be returned. The minimum size of
  228. the buffer should be 81 bytes. The string will
  229. be a null-terminated ASCII string.
  230. usOutlierFactor - Factor that defines the range of
  231. acceptable data values. A value of zero will ignore
  232. this factor and all data will be considered
  233. valid.
  234. pulDataArray - A pointer to the data array. If a nonzero
  235. outlier factor is given, this array has as many elements
  236. as there were good data points in the data set. Otherwise,
  237. all the data points are contained in the data array.
  238. pcusElementsInArray - Pointer to the number of elements in
  239. the array pointed to by pulDataArray.
  240. pcusDiscardedValues - pointer to the number of data
  241. points discarded based upon the outlier factor.
  242. Remarks: This routine should be called only once for
  243. each test, normally after TestStatConverge has returned
  244. TRUE. Any call to this should be followed by a call to
  245. TestStatInit before the next call to TestStatConverge,
  246. if you want to converge on a new set of data. The Outlier
  247. factor decides the range of acceptable values in the data set.
  248. The format of the returned string will be (as in C):
  249. "%4u %10lu %10lu %10lu %6u %5u %10lu %4u %2u ".
  250. These represent the mode number, mean, minimum, maximum,
  251. the number of iterations completed, the precision, the standard
  252. deviation, the number of points discarded, and the outlier factor
  253. from the data set. The mode number will always be zero.
  254. The precision will be 5% in case test results converged before
  255. the limit on the maximum iterations is reached. Otherwise,
  256. it returns the precision of the results gathered.
  257. The precision value in this case assumes confidence levels of
  258. 95% and 85% against Type I and Type II errors respectively.
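
A caller that needs individual values, such as the precision, can read them
back out of the returned string. The sketch below is one way to do that
with sscanf. The stat_line_t type and parse_stat_line function are
hypothetical names for this example, and plain unsigned types stand in for
USHORT and ULONG. When every value fits its field width, the documented
format produces 70 characters, which fits in the 81 byte buffer.

```c
#include <stdio.h>

/* Output format documented for TestStatValues. */
#define STAT_FORMAT "%4u %10lu %10lu %10lu %6u %5u %10lu %4u %2u "

/* Hypothetical record holding the nine fields of one output line. */
typedef struct {
    unsigned mode;
    unsigned long mean, min, max;
    unsigned iterations, precision;
    unsigned long sdev;
    unsigned discarded, outlier_factor;
} stat_line_t;

/* Parses one output line. Returns 0 on success, -1 if any field is missing.
   The scan format omits the field widths so that wide values still parse. */
int parse_stat_line(const char *line, stat_line_t *out)
{
    int fields = sscanf(line, "%u %lu %lu %lu %u %u %lu %u %u",
                        &out->mode, &out->mean, &out->min, &out->max,
                        &out->iterations, &out->precision, &out->sdev,
                        &out->discarded, &out->outlier_factor);
    return (fields == 9) ? 0 : -1;
}
```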
  259. The outlier factor determines along with the standard deviation
  260. any abnormal data points. Any data point that does not satisfy:
  261. [Mean - (SDev * OF)] < Data Point < [Mean + (SDev * OF)],
  262. where SDev is the standard deviation computed with good data
  263. points and OF is the outlier factor, is left out in the
  264. statistics computation. The standard deviation is
  265. recomputed and this process is repeated until there are no
  266. abnormal data entries in the data set. The number of
  267. outliers that were discarded is also returned to the calling
  268. program. To ignore the outlier factor and this process of
  269. elimination, the outlier factor may be set to zero.
  270. Otherwise, the outlier factor should be at least 2 in order
  271. for the results to be meaningful.
  272. Return Value: None
  273. See also: TestStatOpen, TestStatClose, TestStatInit,
  274. TestStatConverge
  275. 5) TestStatClose:
  276. -------------
  277. Description: Deallocates instance data structures and
  278. all memory allocated by TestStatOpen and TestStatInit.
  279. VOID FAR PASCAL
  280. TestStatClose (
  281. VOID
  282. );
  283. Remarks: This routine should be called after the last
  284. call to TestStatValues. A call to this must be followed by a
  285. call to TestStatOpen and TestStatInit, in that order, before
  286. the application calls TestStatConverge and TestStatValues.
  287. Return Value: None
  288. See also: TestStatOpen, TestStatInit, TestStatConverge,
  289. TestStatValues.
  290. ------------------------------------------------------------
  291. Usage of Statistical routines for convergence and values: TestApp
  292. ------------------------------------------------------------------
  293. #define MIN_ITERATION 3
  294. #define MAX_ITERATION 200
  295. #define OUTLIER_FACTOR 4
  296. Body of test application
  297. {
  298. USHORT usMinIteration = MIN_ITERATION;
  299. USHORT usMaxIteration = MAX_ITERATION;
  300. ULONG ulDataForCurrentIter;
  301. ULONG far *pulDataArray; // make sure you have the "far" for 16 bit.
  302. char chOutputBufferForString [81];
  303. USHORT usOutlierFactor = OUTLIER_FACTOR;
  304. USHORT cusDiscardedValues;
  305. USHORT cusElementsInArray;
  306. :
  307. :
  308. if (TestStatOpen (usMinIteration,
  309. usMaxIteration) != 0) {
  310. // Data Array could not be allocated.
  311. // Cannot do convergence/statistics routines;
  312. // Check parameters to call;
  313. }
  314. do { // for each test or if need to run convergence again
  315. TestStatInit ();
  316. // Initialize test variables;
  317. :
  318. do { // convergence loop; do until a
  319. // TRUE is returned
  320. // Start the timer;
  321. // Test operation;
  322. // Stop the timer;
  323. ulDataForCurrentIter = // get the elapsed time for
  324. // operation;
  325. } while (!TestStatConverge (ulDataForCurrentIter));
  326. // the data set has converged. Call the Statistics
  327. // routine for the values and output data
  328. TestStatValues (chOutputBufferForString,
  329. usOutlierFactor,
  330. &pulDataArray,
  331. &cusElementsInArray,
  332. &cusDiscardedValues
  333. );
  334. // the chOutputBufferForString array has all the data.
  335. // cusDiscardedValues has the number of discarded values
  336. //
  337. } while (/* more tests or need to converge on new data set */);
  338. :
  339. :
  340. TestStatClose();
  341. :
  342. }
  343. -------------------------------------------------------------------
  344. Random Number Generation Routines:
  345. 6) TestStatUniRand:
  346. ---------------
  347. Description: Returns a number within the range of 0 to 1 based on
  348. a starting seed.
  349. double FAR PASCAL
  350. TestStatUniRand (
  351. VOID
  352. );
  353. Remarks: This routine returns a set of uniformly distributed numbers
  354. between 0 and 1 when called repeatedly. TestStatUniRand makes
  355. use of the multiplicative congruential algorithm discussed in Knuth,
  356. Vol. II, Chapter 3. A starting seed is chosen along with a
  357. multiplier and a modulus value. The seed for the next iteration is
  358. computed from these values as follows:
  359. Temp Value = X * A, where,
  360. X is the current seed value and A is the multiplier. The remainder
  361. of the division of this value by the modulus identifier is
  362. determined. This will be the seed for the next iteration. This
  363. value is divided by the modulus value to obtain a normalized value
  364. (that lies between 0 and 1). This normalized value is returned to
  365. the caller.
  366. Through experiments, Sullivans, W. L has determined that a good set of
  367. values is returned by selecting one of the 9 following values as
  368. starting seeds:
  369. 32347753, 52142147, 52142123, 53214215, 23521425, 42321479,
  370. 20302541, 32524125, 42152159.
  371. TestStatUniRand uses 32347753 as the starting seed. A good set of
  372. values, mentioned above, implies that for the given seed, it takes
  373. a very large number of iterations, before the set of returned values
  374. is repeated. The following values have been chosen for the
  375. multiplier and the modulus by M.C. Pike and I.D. Hill (reference):
  376. Multiplier - 3125
  377. Modulus id - 67108864
  378. Return Value: A double float between 0 and 1.
  379. See also: TestStatShortRand, TestStatRand, TestStatNormDist.
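
The seed update described in the remarks above can be sketched as follows,
using the documented seed, multiplier, and modulus. The function name
uni_rand and the caller-owned seed variable are assumptions of this
example; the DLL keeps its seed internally and takes no arguments.

```c
#include <stdint.h>

#define STAT_MULTIPLIER 3125u
#define STAT_MODULUS    67108864u   /* 2^26 */
#define STAT_SEED       32347753u   /* documented starting seed */

/* One step of the multiplicative congruential generator.
   Updates *seed and returns the new seed normalized by the modulus.
   Hypothetical helper; not the DLL's entry point. */
double uni_rand(uint32_t *seed)
{
    /* Temp Value = X * A. The product can exceed 32 bits, so use 64 bits. */
    uint64_t temp = (uint64_t)*seed * STAT_MULTIPLIER;

    /* The remainder modulo the modulus is the seed for the next iteration. */
    *seed = (uint32_t)(temp % STAT_MODULUS);

    return (double)*seed / (double)STAT_MODULUS;
}
```

Starting from 32347753, the first update gives a seed of 20778941, so the
first value returned by this sketch is about 0.3096.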
  380. 7) TestStatShortRand:
  381. -----------------
  382. Description: Returns a number within the range of 0 to 65535 based on
  383. a starting seed.
  384. USHORT FAR PASCAL
  385. TestStatShortRand (
  386. VOID
  387. );
  388. Remarks: This routine returns a set of uniformly distributed numbers
  389. between 0 and 65535 when called repeatedly. TestStatShortRand makes
  390. use of the multiplicative congruential algorithm discussed in Knuth,
  391. Vol. II, Chapter 3. A starting seed is chosen along with a
  392. multiplier and a modulus value. The seed for the next iteration is
  393. computed from these values as follows:
  394. Temp Value = X * A, where,
  395. X is the current seed value and A is the multiplier. The remainder
  396. of the division of this value by the modulus identifier is
  397. determined. This will be the seed for the next iteration. This
  398. value is multiplied by 65535 and divided by the modulus value
  399. to obtain a value between 0 and 65535. This value is returned to
  400. the caller.
  401. Through experiments, Sullivans, W. L has determined that a good set of
  402. values is returned by selecting one of the 9 following values as
  403. starting seeds:
  404. 32347753, 52142147, 52142123, 53214215, 23521425, 42321479,
  405. 20302541, 32524125, 42152159.
  406. TestStatShortRand uses 32347753 as the starting seed. A good set of
  407. values, mentioned above, implies that for the given seed, it takes
  408. a very large number of iterations, before the set of returned values
  409. is repeated. The following values have been chosen for the
  410. multiplier and the modulus by M.C. Pike and I.D. Hill (reference):
  411. Multiplier - 3125
  412. Modulus id - 67108864
  413. Return Value: A USHORT between 0 and 65535.
  414. See also: TestStatUniRand, TestStatRand, TestStatNormDist.
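
A minimal sketch of the scaling step described in the remarks above, under
the assumption that the division truncates. The name short_rand and the
caller-owned seed are hypothetical. Note that with truncating integer
division the largest possible result is 65534; whether the DLL rounds
differently is not stated in this document.

```c
#include <stdint.h>

#define STAT_MULTIPLIER 3125u
#define STAT_MODULUS    67108864u   /* 2^26 */
#define STAT_SEED       32347753u   /* documented starting seed */

/* Updates *seed, then scales the new seed to a 16-bit value:
   seed * 65535 / modulus. Hypothetical helper; not the DLL's entry point. */
uint16_t short_rand(uint32_t *seed)
{
    uint64_t temp = (uint64_t)*seed * STAT_MULTIPLIER;

    *seed = (uint32_t)(temp % STAT_MODULUS);

    /* Truncating division is an assumption of this sketch. */
    return (uint16_t)(((uint64_t)*seed * 65535u) / STAT_MODULUS);
}
```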
  415. 8) TestStatRand:
  416. ------------
  417. Description: Returns a uniformly distributed random number within
  418. a specified range.
  419. ULONG FAR PASCAL
  420. TestStatRand (
  421. ULONG ulLower,
  422. ULONG ulUpper
  423. );
  424. ulLower - Specifies the lower boundary of the desired random
  425. number. Should be at least 1 in value.
  426. ulUpper - Specifies the upper boundary of the desired random
  427. number. May not exceed 67108863.
  428. Remarks: TestStatRand calls TestStatNorm for obtaining a normalized
  429. random number. The value obtained from TestStatNorm is then
  430. multiplied by the range (i.e. the difference between ulUpper and
  431. ulLower). The computed value is then added to the lower limit
  432. and the resulting number is returned. It should be noted that both
  433. ulLower and ulUpper are included in the range of returned random
  434. numbers.
  435. Return Value: A random number within the specified range.
  436. See Also: TestStatShortRand, TestStatUniRand, TestStatNormDist.
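
The scaling described in the remarks above can be sketched as follows. The
function scale_to_range is a hypothetical helper that takes the normalized
value as an argument instead of generating it. The remarks do not say how
the product is converted to an integer; this sketch rounds to the nearest
integer so that both ulLower and ulUpper can be returned, which is an
assumption.

```c
#include <stdint.h>

/* Maps a normalized value u in [0, 1) onto the range [lower, upper].
   Hypothetical helper; not the DLL's entry point. */
uint32_t scale_to_range(double u, uint32_t lower, uint32_t upper)
{
    /* The range is the difference between the upper and lower limits. */
    double span = (double)(upper - lower);

    /* Adding 0.5 before truncating rounds to the nearest integer.
       The rounding rule is an assumption of this sketch. */
    return lower + (uint32_t)(u * span + 0.5);
}
```

With this rounding rule the two end points are each about half as likely
as the interior values, because only half a unit of the normalized range
rounds to each of them.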
  437. 9) TestStatNormDist:
  438. ----------------
  439. Description: With every call, returns a number that forms a set
  440. of points whose mean is approximately the input mean and whose
  441. standard deviation is nearly equal to the input standard deviation.
  442. A normally distributed set of points is generated.
  443. LONG FAR PASCAL
  444. TestStatNormDist (
  445. ULONG ulMean,
  446. USHORT usSDev
  447. );
  448. Remarks: This routine uses a formula discussed in 'Random Number
  449. Generation and Testing', IBM Data Processing Techniques, C20-8011
  450. and 'Tuning an Operating System for General Purpose Use', Russell
  451. P. Blake, Online Conferences (info. to be filled in).
  452. TestStatNormDist makes use of TestStatShortRand to get a set of
  453. uniformly distributed numbers. It generates a point around the
  454. input mean using the following formula:
  455.                          14
  456.                          __
  457. lRetVal <- Mean + ( -7 + >_ TestStatShortRand () ) * Std. Dev
  458.                          i=1
  459. The set of points generated with several calls to this routine
  460. will be normally distributed with a mean of about the input mean
  461. and a standard deviation of approximately the input standard
  462. deviation. The returned value may also be negative, depending
  463. upon the values returned by TestStatShortRand and the input standard
  464. deviation.
  465. Return Value: A long integer.
  466. See also: TestStatShortRand, TestStatUniRand, TestStatRand.
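
The formula in the TestStatNormDist remarks can be sketched as follows. For
the constant -7 to center the sum, each of the 14 uniform values has to lie
between 0 and 1, so this sketch divides each 16-bit value by 65535; that
scaling is an assumption, as are the names lcg_short and norm_dist and the
caller-owned seed. The sum of 14 uniform values on [0, 1] has a mean of 7
and a variance of 14/12, so the standard deviation of the results is about
1.08 times the input value, which is why it is only approximate.

```c
#include <stdint.h>

/* Uniform 16-bit generator as described for TestStatShortRand
   (multiplier 3125, modulus 2^26). Hypothetical stand-in. */
uint16_t lcg_short(uint32_t *seed)
{
    uint64_t temp = (uint64_t)*seed * 3125u;

    *seed = (uint32_t)(temp % 67108864u);
    return (uint16_t)(((uint64_t)*seed * 65535u) / 67108864u);
}

/* Returns one point around `mean`:
   mean + (-7 + sum of 14 uniform values in [0, 1]) * sdev.
   The result always lies within mean - 7 * sdev and mean + 7 * sdev,
   so it can be negative when mean is small.
   Hypothetical helper; not the DLL's entry point. */
long norm_dist(uint32_t *seed, unsigned long mean, unsigned sdev)
{
    double sum = 0.0;
    int i;

    for (i = 0; i < 14; i++)
        sum += (double)lcg_short(seed) / 65535.0;   /* scale to [0, 1] */

    return (long)((double)mean + (sum - 7.0) * (double)sdev);
}
```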