Source code of Windows XP (NT5)
/*
 * Stat.c - Source file for a statistical
 *          dll package that exports eleven
 *          entry points:
 *          a) TestStatOpen
 *          b) TestStatInit
 *          c) TestStatConverge
 *          d) TestStatValues
 *          e) TestStatClose
 *          f) TestStatRand
 *          g) TestStatUniRand
 *          h) TestStatNormDist
 *          i) TestStatShortRand
 *          j) TestStatFindFirstMode
 *          k) TestStatFindNextMode
 *
 * Entry point a) is an allocating routine
 * that is called by an application program
 * that desires to automatically compute
 * convergence.
 *
 * Entry point b) initializes all variables that
 * are used by entry points c) and d) in computing
 * convergence and statistical information.
 *
 * Entry point c) automatically computes the
 * number of passes that the application has to
 * go through for 95% confidence in the data.
 * This routine has to be called by the application
 * after each pass.
 *
 * Entry point d) automatically computes the
 * various statistical values, e.g. mean, SD etc.
 * This function has to be called only after the
 * application has called c) several times and has
 * either converged or reached the iteration limit.
 *
 * Entry point e) deallocates all instance
 * data structures that were allocated by entry
 * point a).
 *
 * Entry point f) returns a Random Number in a
 * given range.
 *
 * Entry point g) returns a uniformly distributed
 * number in the range 0 - 1.
 *
 * Entry point h) returns a normally distributed
 * set of numbers, with repeated calls, whose
 * mean and standard deviation are approximately
 * equal to those that are passed in.
 *
 * Entry point i) is the same as g) except that
 * the range is 0 - 65535.
 *
 * The following should be the rules of calling
 * the entry points:
 *
 * Entry a) should be called before any of the others.
 * Entry c) should be preceded by at least one call
 * to entry b) for meaningful results. Entry d)
 * should be preceded by several calls to entry c).
 * A call to b) and c) after a call to e) should
 * be preceded by a call to a) again.
 *
 * Created - Paramesh Vaidyanathan (vaidy)
 * Initial Version - October 29, '90
 */
/*********************************************************************
 *
 * Formula used in computing the 95% confidence level is derived here:
 *
 * Any reference to (A) would imply "Experimental Design
 * in Psychological Research", by Allan Edwards.
 *
 * Any reference to (B) would imply "Statistical Methods"
 * by Allan Edwards.
 *
 * Assumptions - TYPE I Error  -  5% (B)
 *               TYPE II Error - 16% -do-
 *
 * Standard normal deviate (z) for Type I  - 1.96
 * Standard normal deviate (z) for Type II - 1.00
 *
 * For a 5% deviation, number of runs,
 *
 *   n = \frac{2 c^2 (1.96 + 1.00)^2}{d^2}                  .....Eqn (1)
 *
 * where c is the Std. Dev. and d is the absolute
 * difference bet. means [(B) Page 91].
 *
 *   d = 0.05 X'                                            .....Eqn (2)
 *
 * where X' is the mean of samples and
 *
 *   X' = \frac{\sum X}{n_0}                                .....Eqn (3)
 *
 * When the number of iterations -> infinity,
 *
 *   S^2 \to c^2                                            .....Eqn (4)
 *
 * where S^2 is the estimate of the common population
 * variance (Eqn. 4 is a big assumption)
 *
 * From (B) page 59, we have,
 *
 *   S^2 = \frac{\sum X^2 - (\sum X)^2 / n_0}{n_0 - 1}      .....Eqn (5)
 *
 * Substituting Eqn (2), (3), (4) and (5) in (1), we get:
 *
 *   n = \frac{7008 n_0^2 [\sum X^2 - (\sum X)^2 / n_0]}
 *            {(n_0 - 1) (\sum X)^2}
 *
 * (2 (2.96)^2 / (0.05)^2 evaluates to 7009.28; the code uses 7008.)
 *
 * It should be mentioned that n_0 is the iteration pass number.
 *********************************************************************/
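/*
 * The derivation above can be checked with a small standalone sketch.
 * It is an illustration only, not part of this dll: required_passes is
 * a hypothetical name, and the function assumes the caller supplies the
 * pass count n_0 together with the running sums that the final formula
 * uses.
 */

```c
#include <assert.h>

/* Hypothetical helper (not part of Stat.c): total passes needed for a
   5% precision at 95% confidence, given n0 passes so far, the running
   sum of the samples and the running sum of their squares. */
unsigned long required_passes(unsigned n0, double sum, double sum_sq)
{
    double sqr_sum;

    assert(n0 >= 2);      /* the variance estimate divides by n0 - 1 */
    assert(sum != 0.0);   /* the formula divides by (sum of X)^2 */

    sqr_sum = sum * sum;  /* (sum of X)^2 */
    return (unsigned long) (7008.0 * (double) n0 * (double) n0 *
                            (sum_sq - sqr_sum / n0) /
                            ((double) (n0 - 1) * sqr_sum));
}
```

/*
 * For the three samples 100, 110 and 90 the sums are 300 and 30200,
 * S^2 is 100 and the mean is 100, so the formula asks for
 * 7008 * 100 / 10000 = 70.08, truncated to 70 passes.
 */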
#include <nt.h>
#include <ntrtl.h>
#include <nturtl.h>
#include <windows.h>
#include <stdio.h>
#include <math.h>
#include "teststat.h"

#define SQR(A) ( (A) * (A) )    /* macro for squaring */
#define SUCCESS_OK 0            /* weird, but OK */
#define MIN_ITER 3              /* MIN. ITERATIONS */
#define MAX_ITER 65535          /* max. iterations */
#define REPEATS 14              /* repeat count for Norm. Dist. Fn. */
/**********************************************************************/
USHORT usMinIter;               /* global min iter */
USHORT usMaxIter;               /* global max iter */
ULONG *pulDataArray;            /* a pointer to the data array for this
                                   package. Will be as large as the
                                   maximum iterations */
double dSumOfData;              /* sum of data during each pass */
double dSumOfDataSqr;           /* sum of sqr. of each data point */
ULONG ulTotalIterCount;         /* No. of iters returned by the internal
                                   routine */
USHORT cusCurrentPass;          /* count of the current iteration pass */
BOOL bDataConverged = FALSE;    /* TRUE will return a precision of 5% */
BOOL bMemoryAllocated = FALSE;  /* TRUE will allow alloced mem to free */
BOOL bPowerComputed = FALSE;    /* compute 10 exp. 9 for random no. gen */
BOOL *pbIndexOfOutlier;         /* to keep track of values in
                                   pulDataArray, that were thrown out */
HANDLE hMemHandle = NULL;       /* handle to mem. allocated */
HANDLE hMemOutlierFlag;         /* handle to outlier flag memory */
/**********************************************************************/
ULONG TestStatRepeatIterations (double, double);
VOID TestStatStatistics (PSZ, PULONG far *, USHORT,
                         PUSHORT, PUSHORT);
void DbgDummy (double, double);
ULONG ulDataArrayAddress;       /* call to mem alloc routine returns
                                   base address of alloced. mem. */
BOOL bOutlierDataIndex;         /* for allocating memory for outliers'
                                   index in data set */
/*********************************************************************/
/*
 * Function - TestStatOpen (EXPORTED)
 *
 * Arguments -
 *      a) USHORT - usMinIterations
 *      b) USHORT - usMaxIterations
 *
 * Returns -
 *      0 if the call was successful
 *
 *      An error code if the call failed. The error code
 *      may be one of:
 *
 *          STAT_ERROR_ILLEGAL_MIN_ITER
 *          STAT_ERROR_ILLEGAL_MAX_ITER
 *          STAT_ERROR_ALLOC_FAILED
 *
 *
 * Instance data is allocated for the statistical package. This
 * call should precede any other calls in this dll. This function
 * should also be called after a call to TestStatClose, if convergence
 * is required on a new set of data. An error code is returned if
 * argument a) is zero or a) is greater than b). An error code is
 * also returned if one of the allocations failed.
 *
 */
USHORT
TestStatOpen (
    USHORT usMinIterations,
    USHORT usMaxIterations
    )
{
    /* check for invalid args to this function */
    if (!usMinIterations)
        return (STAT_ERROR_ILLEGAL_MIN_ITER);
    if ((usMinIterations > usMaxIterations) || (usMaxIterations > MAX_ITER))
        return (STAT_ERROR_ILLEGAL_MAX_ITER);
    /* any other parameter is allowed */
    usMinIter = usMinIterations;    /* set global vars */
    usMaxIter = usMaxIterations;    /* -do - */
    // change made based on request from JeffSt/Somase/JonLe
    if (hMemHandle != NULL)
        return (STAT_ERROR_ALLOC_FAILED);
    hMemHandle = GlobalAlloc (GMEM_MOVEABLE | GMEM_ZEROINIT, usMaxIter *
                              sizeof(ULONG));
    if (hMemHandle == NULL)
        return (STAT_ERROR_ALLOC_FAILED);
    pulDataArray = (ULONG *) GlobalLock (hMemHandle);
    if (pulDataArray == NULL) {
        /* release the block so that a later TestStatOpen can retry;
           otherwise the handle leaks and every later call fails */
        GlobalFree (hMemHandle);
        hMemHandle = NULL;
        return (STAT_ERROR_ALLOC_FAILED);
    }
    bMemoryAllocated = TRUE;    /* A call to TestStatClose will
                                   now free the mem */
    return (SUCCESS_OK);
}
/*
 * Function - TestStatClose (EXPORTED)
 *
 * Arguments - None
 *
 * Returns - Nothing
 *
 * Instance data allocated for the statistical package by TestStatOpen
 * is freed. Any call to entry points b) and c) following a call to
 * this function, should be preceded by a call to a).
 *
 */
VOID
TestStatClose (VOID)
{
    if (bMemoryAllocated) {     /* free only if memory allocated */
        GlobalUnlock (hMemHandle);
        GlobalFree (hMemHandle);
        hMemHandle = NULL;      /* Indicate released (t-WayneR/JohnOw) */
    }   /* end of if (bMemoryAllocated) */
    bMemoryAllocated = FALSE;   /* further calls to TestStatClose should be
                                   preceded by a memory allocation */
    return;
}
/*
 * Function - TestStatInit (EXPORTED)
 *
 * Arguments - None
 *
 * Returns - Nothing
 *
 * Initializes all the data arrays/variables for use by the convergence
 * and statistics routines. This call should precede the first call
 * to TestStatConverge for each set of data.
 *
 */
VOID
TestStatInit (VOID)
{
    USHORT usTempCtr;

    /* initialize all counters, variables and the data array itself */
    for (usTempCtr = 0; usTempCtr < usMaxIter; usTempCtr++) {
        pulDataArray [usTempCtr] = 0L;
    }
    dSumOfData = 0.0;
    dSumOfDataSqr = 0.0;
    ulTotalIterCount = 0L;
    cusCurrentPass = 0;
    bDataConverged = FALSE;
    return;
}
/*
 * Function - TestStatConverge (EXPORTED)
 *
 * Arguments -
 *      a) ULONG - ulNewData
 * Returns -
 *      TRUE if data set converged or limit on max. iters reached
 *
 *      FALSE if more iterations are required for convergence.
 *
 * Computes the number of iterations required for a 95% confidence
 * in the data received (please see teststat.txt under \ntdocs on
 * \\jupiter\perftool for an explanation of the confidence).
 * If the current iteration count is larger than the maximum specified
 * with the call to TestStatOpen, or if the data set has converged
 * this function returns a TRUE. The calling application should test
 * for the return value.
 */
BOOL
TestStatConverge (
    ULONG ulNewData
    )
{
    dSumOfData += (double) ulNewData;   /* sum of all data points in the set */
    dSumOfDataSqr += SQR ((double) ulNewData);
                        /* sqr of data needed for the computation */
    if (cusCurrentPass < (USHORT) (usMinIter - (USHORT) 1)) {
        /* do nothing if current iter < min specified value */
        ulTotalIterCount = (ULONG) usMaxIter + 1;   /* bogus value */
        pulDataArray [cusCurrentPass++] = ulNewData;
        /* register this data into the array and return FALSE */
        return (FALSE);
    }
    /* compare as ULONG: casting ulTotalIterCount down to a USHORT would
       truncate any count above 65535 and report convergence too early */
    if ((cusCurrentPass == usMaxIter) ||
        ((ULONG) cusCurrentPass >= ulTotalIterCount)) {
        /* either the limit on the max. iters. specified has been reached
           or, the data has converged during the last iter; return TRUE */
        if ((ULONG) cusCurrentPass >= ulTotalIterCount)
            bDataConverged = TRUE;  /* set to determine if precision
                                       should be computed */
        return (TRUE);
    }
    if ((usMinIter < MIN_ITER) &&
        (usMinIter == usMaxIter) &&
        ((USHORT) (cusCurrentPass + (USHORT) 1) >= usMaxIter))
        /* don't call convergence algorithm, just return a TRUE */
        /* It does not make any sense to call the convergence
           algorithm if fewer than 3 iterations are specified for the
           minimum */
        return (TRUE);
    pulDataArray [cusCurrentPass++] = ulNewData;    /* register this data into
                                                       the array */
    if (dSumOfData == 0.0) {    /* possible if data points are all zeros */
        bDataConverged = TRUE;
        return (TRUE);
    }
    ulTotalIterCount = TestStatRepeatIterations (dSumOfData,
                                                 dSumOfDataSqr);
    if (ulTotalIterCount <= cusCurrentPass)
        return (TRUE);
    return (FALSE);
}
/*
 * Function - TestStatValues (EXPORTED)
 *
 * Arguments -
 *      a) PSZ      - pszOutputString
 *      b) USHORT   - usOutlierFactor
 *      c) PULONG * - pulFinalData
 *      d) PUSHORT  - pcusElementsInArray
 *      e) PUSHORT  - pcusDiscardedElements
 *
 * Returns -
 *      Nothing
 *
 * Computes useful statistical values and returns them in the string
 * whose address is passed to this function. The returned string
 * has the following format :
 *      ("%4u %10lu %10lu %10lu %6u %5u %10lu %4u %2u")
 * and the arg. list will be in the order: mode number, mean,
 * minimum, maximum, number of iterations, precision,
 * standard deviation, number of outliers in the data set and the
 * outlier factor. (Please refer to \ntdocs\teststat.txt for
 * a description of precision. This is on \\jupiter\perftool.)
 *
 */
VOID
TestStatValues(
    PSZ pszOutputString,
    USHORT usOutlierFactor,
    PULONG *pulFinalData,
    PUSHORT pcusElementsInArray,
    PUSHORT pcusDiscardedElements
    )
{
    ULONG far * pulArray = NULL;
    USHORT Count = 0;

    /* Call the low-level routine to do the statistics computation */
    /* doing this ,'cos, there is a possibility that the low-level
       routine may be used for some apps, within the perf. group. This
       may not be fair, but that is the way life is */
    TestStatStatistics (pszOutputString, &pulArray,
                        usOutlierFactor, pcusElementsInArray,
                        pcusDiscardedElements);
    *pulFinalData = pulArray;
    return;
}
/***********************************************************************
                    ROUTINES NOT EXPORTED, BEGIN
***********************************************************************/
/*
 * Function - TestStatRepeatIterations (NOT EXPORTED)
 * Arguments -
 *      (a) double - Sum of Individual Data Points thus far
 *      (b) double - Sum of Squares of Indiv. data points
 *
 * Returns - ULONG - value of no. of iterations required for 95%
 *           confidence,
 *
 * Computes the number of iterations required of the calling program
 * before a 95% confidence level can be reached. This will return
 * MAX_ITER if the application calls this routine before 3 passes
 * are complete. The function normally returns the total number of
 * iterations that the application has to pass through before
 * offering a 95% confidence on the data.
 */
ULONG
TestStatRepeatIterations(
    double dSumOfIndiv,
    double dSumOfSqrIndiv
    )
{
    double dSqrSumOfIndiv = 0;
    ULONG ulRepeatsNeeded = 0L;

    /* dSqrSumOfIndiv stands for the square of the Sum of Indiv. data
       points,
       dSumOfSqrIndiv stands for the sum of the square of each data point,
       dSumOfIndiv stands for the sum of each data point in the set, and
       cusCurrentPass is the iteration pass count
    */
    if (cusCurrentPass < MIN_ITER)
        /* not enough passes to compute convergence count */
        return (MAX_ITER);
    dSqrSumOfIndiv = SQR (dSumOfIndiv);
    /* use the formula derived at the beginning of this file to
       compute the no. of iterations required. The pass count is squared
       as a double because the int product overflows above 46340 */
    ulRepeatsNeeded = (ULONG) (7008 *
                      (dSumOfSqrIndiv - dSqrSumOfIndiv / cusCurrentPass)
                      * SQR ((double) cusCurrentPass) /
                      ((cusCurrentPass - 1) * dSqrSumOfIndiv));
    return (ulRepeatsNeeded);
}
/***************************************************************************/
/*
 * Function - TestStatStatistics
 * Arguments -
 *      a) PSZ          - pszOutputString
 *      b) PULONG far * - pulFinalData
 *      c) USHORT       - usOutlierFactor
 *      d) PUSHORT      - pcusElementsInArray
 *      e) PUSHORT      - pcusDiscardedValues
 *
 * Returns - Nothing
 *
 * Computes the max, min, mean, and std. dev. of a given
 * data set. The calling program should convert the values obtained
 * from this routine from a "ULONG" to the desired data type. The
 * outlier factor decides how many data points of the data set are
 * within acceptable limits. Data is returned to the buffer whose
 * address is the first argument to this call.
 *
 */
VOID
TestStatStatistics (
    PSZ pszOutputString,
    PULONG *pulFinalData,
    USHORT usOutlierFactor,
    PUSHORT pcusElementsInArray,
    PUSHORT pcusDiscardedValues
    )
{
    static USHORT uArrayCount = 0;  /* local variable that may be reused */
    USHORT uTempCt = 0;             /* local variable that may be reused */
    double dSqrOfSDev = 0;          /* sqr of the std. deviation */
    double dSumOfSamples = 0;       /* sum of all data points */
    double dSumOfSquares = 0;       /* sum of squares of data points */
    ULONG ulMean = 0L;
    ULONG ulStdDev = 0L;
    ULONG ulDiffMean = 0L;          /* to store the diff. of mean and SD,
                                       outlier factor */
    BOOL bAcceptableSDev = TRUE;    /* flag to determine if SDev. is
                                       acceptable */
    ULONG ulMax = 0L;               /* pilot value */
    ULONG ulMin = 0xffffffff;       /* largest possible ULONG */
    USHORT usPrecision = 0;         /* to obtain precision */
    USHORT uModeNumber = 0;         /* DUMMY VALUE until this is
                                       supported */
    /* compute mean by adding up all values and dividing by the no.
       of elements in data set - might need to recompute the
       mean if outlier factor is selected. However, the min. and max. will
       be selected from the entire set */
    USHORT Count = 0;

    *pcusDiscardedValues = 0;       /* init. this variable */
    if (cusCurrentPass == 0)
        return;     /* get out without doing anything - this is a weird
                       case when the user calls this routine without
                       calling a converge routine */
    *pcusElementsInArray = cusCurrentPass;
                    /* every iteration produces one data point */
    /* one pass over the data finds the min. and max. and accumulates the
       sums; the sums are kept in doubles because a ULONG running total
       can overflow for large data values */
    for (uArrayCount = 0; uArrayCount < *pcusElementsInArray; uArrayCount++) {
        if (pulDataArray [uArrayCount] > ulMax)
            ulMax = pulDataArray [uArrayCount];     /* new Max. value */
        if (pulDataArray [uArrayCount] < ulMin)
            ulMin = pulDataArray [uArrayCount];     /* new min. value */
        dSumOfSamples += (double) pulDataArray [uArrayCount];
        dSumOfSquares += SQR ((double) pulDataArray [uArrayCount]);
    }
    if (*pcusElementsInArray)
        ulMean = (ULONG) (dSumOfSamples / *pcusElementsInArray);
                                                    /* this is the mean */
    else
        ulMean = 0;
    /* the standard deviation needs to be computed; it is defined only
       for two or more data points, and the divisor is formed as a double
       because the int product overflows for large element counts */
    if (*pcusElementsInArray > 1) {
        dSqrOfSDev = ((*pcusElementsInArray * dSumOfSquares) -
                      SQR (dSumOfSamples)) /
                     ((double) *pcusElementsInArray *
                      (*pcusElementsInArray - 1));
        if (dSqrOfSDev < 0.0)   /* guard against rounding error */
            dSqrOfSDev = 0.0;
    }
    ulStdDev = (ULONG) sqrt (dSqrOfSDev);
    /* the standard deviation has been computed for the first pass */
    /* Use the outlier factor and the S.D to find out if any of the
       individual data points are abnormal. If so, throw them out and
       increment the discard value counter */
    if (usOutlierFactor) {  /* if outlier factor is zero, do not go
                               through with the following */
        /*** here is what we do....
         allocate space for an array of BOOLs. Each of these is a flag
         corresponding to a data point. Initially, these flags will be
         all set to FALSE. We then go thru each data point. If a data
         point does not satisfy the condition for throwing out outliers,
         we set the flag corresponding to that data point to TRUE. That
         point is not used to recompute the mean and SDev. We recompute
         the mean and SDev after each round of outlier elimination. When
         we reach a stage where no points were discarded during a round,
         we get out of the while loop and compute the statistics for the
         new data set. (The recomputation is currently commented out
         below, so the mean and SDev keep their first-pass values.) ****/
        hMemOutlierFlag = GlobalAlloc (GMEM_MOVEABLE | GMEM_ZEROINIT,
                                       *pcusElementsInArray * sizeof(BOOL));
        if (!hMemOutlierFlag) {
            return;
        }
        pbIndexOfOutlier = (BOOL FAR *) GlobalLock (hMemOutlierFlag);
        if (!pbIndexOfOutlier) {
            GlobalFree (hMemOutlierFlag);   /* do not leak the block */
            return;
        }
        for (uArrayCount = 0; uArrayCount < *pcusElementsInArray;
             uArrayCount++)
            pbIndexOfOutlier [uArrayCount] = FALSE;
        while (1) {     /* begin the data inspection round */
            bAcceptableSDev = TRUE;     /* set this flag to TRUE. If we
                                           hit an outlier, this flag will
                                           be reset */
            for (uArrayCount = 0; uArrayCount < cusCurrentPass;
                 uArrayCount++) {
                /*** check the individual data points ***/
                if (ulMean < (ulStdDev * usOutlierFactor))
                    /* just make sure that we are not comparing with a
                       negative number */
                    ulDiffMean = 0L;
                else
                    ulDiffMean = (ulMean - (ulStdDev * usOutlierFactor));
                if (!pbIndexOfOutlier [uArrayCount]) {
                    if ((pulDataArray [uArrayCount] < ulDiffMean)
                        || (pulDataArray [uArrayCount] >
                            (ulMean + (ulStdDev * usOutlierFactor)))) {
                        /* set the flag of this data point to TRUE to
                           indicate that this data point should not be
                           considered in the mean and SDev computation */
                        pbIndexOfOutlier [uArrayCount] = TRUE;
                        /*** increment the discarded qty ***/
                        (*pcusDiscardedValues)++;
                        /*** decrement the count of good data points ***/
// uncomment next line if outliers should be excluded from the mean - vaidy
//                      (*pcusElementsInArray)--;
                        bAcceptableSDev = FALSE;
                    }   /*** end of if statement ***/
                }   /*** end of if !pbIndexOfOutlier ***/
            }   /*** end of for loop ***/
            if (!bAcceptableSDev) {     /*** there were some bad data points ;
                                             recompute S.Dev ***/
// Starting at next statement, uncomment all lines until you see
// "STOP UNCOMMENT FOR OUTLIERS IN MEAN", if you want the mean and SDev
// recomputed without the flagged outliers. vaidy Aug. 1991.
//              dSumOfSamples = 0.0;    /* init these two guys */
//              dSumOfSquares = 0.0;
//              for (uArrayCount = 0;
//                   uArrayCount < cusCurrentPass;
//                   /* check all elements in the data array */
//                   uArrayCount++) {
//                  /* consider only those data points that do not have the
//                     pbIndexOfOutlier flag set */
//                  if (!pbIndexOfOutlier [uArrayCount]) {
//                      dSumOfSamples += (double) pulDataArray [uArrayCount];
//                      dSumOfSquares += SQR ((double) pulDataArray
//                                            [uArrayCount]);
//                  }
//              }
//              if (*pcusElementsInArray > 1)
//                  /* compute StdDev. only if there are at least 2 elements */
//                  dSqrOfSDev = ((*pcusElementsInArray * dSumOfSquares) -
//                                SQR (dSumOfSamples)) /
//                               (*pcusElementsInArray *
//                                (*pcusElementsInArray - 1));
//              ulStdDev = (ULONG) sqrt (dSqrOfSDev);
//              /* since some data points were discarded, the mean has to be
//                 recomputed */
//              uArrayCount = 0;
//              ulMean = 0;
//              while (uArrayCount < cusCurrentPass) {
//                  /* consider only those data points that do not have the
//                     pbIndexOfOutlier flag set */
//                  if (!pbIndexOfOutlier [uArrayCount++])
//                      ulMean += pulDataArray [uArrayCount - 1];
//              }
//              if (*pcusElementsInArray > 0)   /* only then compute mean */
//                  ulMean /= *pcusElementsInArray; /* this is the new mean */
//              else
//                  ulMean = 0L;
// "STOP UNCOMMENT FOR OUTLIERS IN MEAN"
            }   /*** end of if (!bAcceptableSDev) ***/
            else    /*** if the for loop completed without
                         a single bad data point ***/
                break;
        }   /* end of while */
        /**** free the memory for the pbIndexOfOutlier flag */
        GlobalUnlock (hMemOutlierFlag);
        GlobalFree (hMemOutlierFlag);
    }   /* end of if (usOutlierFactor) */
    /* so, now an acceptable Standard deviation and mean have been obtained */
    if ((!bDataConverged) &&
        (usMaxIter < MIN_ITER)) {
        /* set precision to 0% if max iters chosen is less than 3 */
        usPrecision = 0;
    } else {    /* need to compute precision */
        /* using eqn. 1. above, it can be shown that the precision, p,
           can be written as:

             p = \sqrt{\frac{2 SD^2 (2.96)^2}{n Mean^2}}

           The code below scales p by 100 to report it as a percentage.
        *************************************************************/
        if (ulMean > 0 && *pcusElementsInArray) {
            usPrecision = (USHORT) (sqrt((double) ((2 *
                                    SQR ((double) ulStdDev) *
                                    SQR (2.96) / (*pcusElementsInArray *
                                    SQR ((double) ulMean))))) * 100.0
                                    + 0.5);
        } else
            usPrecision = (USHORT) ~0;
    }   /* end of else need to compute precision */
    sprintf (pszOutputString,
             "%4u %10lu %10lu %10lu %6u %5u %10lu %4u %2u ",
             uModeNumber, ulMean, ulMin, ulMax, cusCurrentPass,
             usPrecision, ulStdDev, *pcusDiscardedValues,
             usOutlierFactor);
    *pcusElementsInArray = cusCurrentPass;
    *pulFinalData = pulDataArray;
    return;
}
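/*
 * The outlier test used in TestStatStatistics can be tried in isolation
 * with the sketch below. It is an illustration under stated assumptions,
 * not code from this dll: count_outliers is a hypothetical name, and the
 * mean and standard deviation are taken as parameters instead of being
 * computed from the data. The lower limit is clamped at zero the same
 * way the loop above does, since the values are unsigned.
 */

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of Stat.c): count the points that fall
   outside mean +/- factor * sd. */
size_t count_outliers(const unsigned long *data, size_t n,
                      unsigned long mean, unsigned long sd,
                      unsigned factor)
{
    unsigned long spread = sd * factor;
    /* clamp so that the unsigned subtraction cannot wrap around */
    unsigned long low = (mean < spread) ? 0UL : mean - spread;
    unsigned long high = mean + spread;
    size_t i, discarded = 0;

    assert(data != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (data[i] < low || data[i] > high)
            discarded++;
    }
    return discarded;
}
```

/*
 * With a mean of 200 and an SD of 150, a factor of 1 accepts the range
 * 50 - 350, so a data point of 500 is counted as an outlier; a factor
 * of 2 widens the range to 0 - 500 and the same point is accepted.
 */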
/*
 * The following is the source for generating random numbers.
 * Two procs are provided: TestStatRand and TestStatUniRand.
 *
 * a) TestStatRand is called as follows: TestStatRand (Low, High)
 *    The result is a number returned in the range Low - High (both
 *    inclusive).
 *
 *    A given initial value of Seed will yield a set of repeatable
 *    results. The first call to TestStatRand should be with an odd seed
 *    in the range of 1 - 67108863, both inclusive. The following
 *    9 seeds have been tested with good results:
 *
 *    32347753, 52142147, 52142123, 53214215, 23521425, 42321479,
 *    20302541, 32524125, 42152159.
 *
 *    The result should never be equal to the seed since this would
 *    eliminate the theoretical basis for the claim for uniform
 *    randomness.
 *
 * b) TestStatUniRand is called as follows:
 *    NormFrac = TestStatUniRand ();
 *    NormFrac is uniformly distributed between 0 and 1 with
 *    a scale of 9 (values range bet. 0 and 0.999999999).
 *
 * The basis for this algorithm is the multiplicative congruential
 * method found in Knuth (Vol.2 , Chap.3). Constants were selected
 * by Pike, M.C and Hill, I.D; Sullivans, W.L. provides
 * the list of tested seeds.
 *
 * The code here has been adapted from Russ Blake's work.
 *
 * Created : vaidy - Nov. 29, 90
 */
#define MODULUS     67108864    /* modulus for computing random no */
#define SQRTMODULUS 8192        /* sqrt of MODULUS */
#define MULTIPLIER  3125
#define MAX_UPPER   67108863
#define MAX_SEEDS   9           /* 9 good starting seeds */
#define SCALE       65535
ULONG aulSeedTable [] = {       /* lookup table for good seeds */
    32347753, 52142147, 52142123, 53214215, 23521425, 42321479,
    20302541, 32524125, 42152159};
USHORT uSeedIndex;              /* index to lookup table */
ULONG ulSeed = 32347753;        /* the seed chosen from table (hardcoded here)
                                   and recomputed */
/*********************************************************************/
/*
 * Function - TestStatRand (EXPORTED)
 *
 * Arguments -
 *      a) ULONG - ulLower
 *      b) ULONG - ulUpper
 *
 * Returns -
 *      a random number in the range ulLower to ulUpper
 *
 *      An error code if the call failed. The error code
 *      will be:
 *
 *          STAT_ERROR_ILLEGAL_BOUNDS
 *
 *
 * Calls TestStatUniRand and returns a random number in the range passed
 * in (both inclusive). The limits for the lower and upper bounds
 * are 1 and 67108863. The start seed index looks up into the array
 * of seeds to select a good, tested starting seed value. The returned
 * values will be uniformly distributed within the boundary. A start
 * seed has been hardcoded into this dll.
 *
 */
ULONG
TestStatRand (
    ULONG ulLower,
    ULONG ulUpper
    )
{
    double dTemp;
    double dNormRand;
    LONG lTestForLowBounds = (LONG) ulLower;

    /* check args */
    if ((lTestForLowBounds < 1L) ||
        (ulUpper > MAX_UPPER) || (ulUpper < ulLower))
        return (STAT_ERROR_ILLEGAL_BOUNDS);
    dNormRand = TestStatUniRand ();     /* call TestStatUniRand */
    /* scale value: the span is (ulUpper - ulLower + 1) so that ulUpper
       itself can be returned; dNormRand is always less than 1 */
    dTemp = (double) (ulUpper - ulLower + 1) * dNormRand;
    return (ulLower + (ULONG) dTemp);
}
/*
 *
 * Function - TestStatUniRand () EXPORTED
 *
 * Accepts - nothing
 *
 * Returns a uniformly distrib. normalized number in the range 0 - 0.9999999
 * (both inclusive). Modifies the seed to the next value.
 *
 */
double
TestStatUniRand (VOID)
{
    ULONG ulModul = MODULUS;    /* use the modulus for getting remainder
                                   and dividing the current value */
    double dMult = MULTIPLIER;
    double dTemp = 0.0;         /* a temp variable */
    double dTemp2 = 0.0;        /* a temp variable */
    ULONG ulDivForMod;          /* used for obtaining the remainder of
                                   the present seed / MODULUS */

    /* the following long-winded approach has to be adopted to
       obtain the remainder. % operator does not work on floats */
    /* use a temp variable. Makes the code easier to follow */
    dTemp = dMult * (double) ulSeed;    /* store product in temp var. */
    DbgDummy (dTemp, dMult);    // NT screws up bigtime for no reason
                                // if this is not used - possible compiler
                                // bug
    dTemp2 = (double) ulModul;  // more compiler problems reported
                                // on Build 259 by JosephH.
                                // April 13, 1992.
    ulDivForMod = (ULONG) (dTemp / dTemp2);
//  ulDivForMod = (ULONG) (dTemp / ulModul);    /* store quotient of present
//                                                 seed divided by MODULUS */
    dTemp -= ((double) ulDivForMod * (double) ulModul);
    /* dTemp will contain the remainder of present seed / MODULUS */
    ulSeed = (ULONG) dTemp;     /* seed for next iteration obtained */
    /* return value */
    return ((dTemp) / (double) ulModul);
}
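/*
 * The same multiplicative congruential step can be written with 64-bit
 * integer arithmetic, which avoids the floating-point remainder used in
 * TestStatUniRand. The sketch below is an alternative illustration, not
 * the method this dll uses: lcg_next, lcg_fraction and the LCG_* names
 * are hypothetical, and a compiler with <stdint.h> is assumed.
 */

```c
#include <assert.h>
#include <stdint.h>

#define LCG_MODULUS    67108864u    /* 2^26, same value as MODULUS */
#define LCG_MULTIPLIER 3125u        /* same value as MULTIPLIER */

/* Hypothetical helper: advance the seed by one step. The product is
   formed in 64 bits, so it cannot overflow before the remainder is
   taken. */
uint32_t lcg_next(uint32_t seed)
{
    assert(seed < LCG_MODULUS);
    return (uint32_t) (((uint64_t) LCG_MULTIPLIER * seed) % LCG_MODULUS);
}

/* Hypothetical helper: map a seed to a fraction in the range [0, 1). */
double lcg_fraction(uint32_t seed)
{
    return (double) seed / (double) LCG_MODULUS;
}
```

/*
 * Starting from the hardcoded seed 32347753, one step gives
 * (3125 * 32347753) mod 67108864 = 20778941. An odd seed always yields
 * an odd successor, because both the multiplier and the seed are odd
 * and the modulus is a power of two.
 */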
/*
 *
 * Function - TestStatNormDist () EXPORTED
 *
 * Accepts -
 *      a) ULONG  - ulMean
 *      b) USHORT - usSDev
 *
 * Returns - LONG - A LONG that allows the mean of the generated
 *           points to be approximately ulMean and the SD of the
 *           set to be approximately usSDev.
 *
 * Formula used here is:
 *
 *   Return Value = ulMean +
 *       (-7 + \sum_{i=1}^{REPEATS} TestStatUniRand ()) * usSDev
 *
 * With REPEATS = 14 the sum has a mean of 7 and a variance of 14/12,
 * so the SD of the returned values is about 1.08 * usSDev.
 *
 * This formula is based on 'Random Number Generation and Testing',
 * IBM Data Processing Techniques, C20-8011.
 */
LONG
TestStatNormDist (
    ULONG ulMean,
    USHORT usSDev
    )
{
    LONG lSumOfRands = 0L;  /* store the sum of the REPEATS calls here */
    USHORT cuNorm;          /* a counter */
    LONG lMidSum = 0L;
    LONG lRemainder = 0L;
    ULONG ulFraction = 0L;  /* (lSumOfRands % SCALE) * usSDev */

    for (cuNorm = 0; cuNorm < REPEATS; cuNorm++)
        lSumOfRands += (LONG) TestStatShortRand ();
    /* we now do a lot of simple but ugly mathematics to obtain the
       correct result. What we do is as follows:
       Divide the lSumOfRands by the scale factor.
       Since we are dealing with short and long integers, we are
       likely to lose precision. So, we get the remainder of this
       division and multiply each of the values by the standard deviation.
       Eg. if lSumOfRands = 65534 and std.dev is 10,
       lQuotient = 0, lRemainder = 65534.
       lMidSum = (-7 * 10) + (0 * 10) + (65534 * 10/65535) = -61,
       which is pretty accurate. We then add the mean and return.
       Actually, we do not return right away. To be more precise,
       we need to find out if the third element in the above term
       yields a remainder of < 0.5. If so, we do not do anything.
       Else, we add 1 to the result to round off and then return.
       In the above example, the remainder = 0.99. So we add 1 to
       -61. The result is -60 and this is accurate. */
    /* the product is formed from the reduced sum and held in a ULONG:
       it is at most 65534 * 65535, which fits in 32 bits, whereas
       lSumOfRands * usSDev can overflow a LONG for large std. devs.
       Both products leave the same remainder modulo SCALE */
    ulFraction = (ULONG) (lSumOfRands % SCALE) * (ULONG) usSDev;
    lRemainder = (LONG) (ulFraction % SCALE);
    /* the above remainder is the one to determine the rounding off */
    lMidSum = ((-7 + (lSumOfRands / SCALE)) * usSDev) +
              (LONG) (ulFraction / SCALE);
    if (lRemainder >= (SCALE / 2L))     /* need to roundup ? */
        lMidSum += 1L;
    return (lMidSum + (LONG) ulMean);
}
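/*
 * The sum-of-uniforms idea behind TestStatNormDist is easier to follow
 * in floating point. The sketch below is an illustration only, not the
 * integer arithmetic this dll performs: sketch_uniform, sketch_norm and
 * SKETCH_REPEATS are hypothetical names, and the uniform source is a
 * local copy of the multiplicative congruential step with the same
 * multiplier, modulus and starting seed.
 */

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_REPEATS 14   /* same count as REPEATS */

static uint32_t sketch_seed = 32347753u;    /* one of the tested seeds */

/* Hypothetical helper: next uniform fraction in the range [0, 1). */
static double sketch_uniform(void)
{
    sketch_seed = (uint32_t) ((3125ull * sketch_seed) % 67108864ull);
    return (double) sketch_seed / 67108864.0;
}

/* Hypothetical helper: add SKETCH_REPEATS uniforms, subtract their
   expected total (SKETCH_REPEATS / 2) and scale by sd. */
double sketch_norm(double mean, double sd)
{
    double sum = 0.0;
    int i;

    assert(sd >= 0.0);
    for (i = 0; i < SKETCH_REPEATS; i++)
        sum += sketch_uniform();
    return mean + (sum - SKETCH_REPEATS / 2.0) * sd;
}
```

/*
 * Every returned value lies within mean +/- 7 * sd, because the sum of
 * 14 fractions stays between 0 and 14. Averaged over many calls the
 * result settles close to the requested mean.
 */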
/*
 *
 * Function - TestStatShortRand () EXPORTED
 *
 * Accepts - nothing
 *
 * Returns a normalized number in the range 0 - 65535
 * (both inclusive). Modifies the seed to the next value.
 *
 */
USHORT
TestStatShortRand (VOID)
{
    ULONG ulTemp = SCALE / SQRTMODULUS;

    /* the product may wrap modulo 2^32 in a 32-bit ULONG; this is
       harmless because MODULUS (2^26) divides 2^32, so the remainder
       is unchanged */
    ulSeed = (MULTIPLIER * ulSeed) % MODULUS;
    /* seed for next iteration obtained */
    /* note: the return value should be (ulSeed * SCALE / MODULUS).
       However, the product of the elements in the numerator far exceeds
       4 Billion. So, the math is done in two stages. The value of
       MODULUS is a perfect square (of 8192). So, the SCALE is first
       divided by the SQRT of the MODULUS, the product of ulSeed and the
       result of the division is divided by the SQRT of the MODULUS again */
    /* return scale value - add one to ulTemp for correction */
    return ((USHORT) ((ulSeed * (ulTemp + 1)) / SQRTMODULUS));
}
/*
 *
 * Function - TestStatFindFirstMode () EXPORTED
 *
 * Accepts - a) PSZ      - pszOutputString
 *           b) USHORT   - usOutlierFactor
 *           c) PULONG * - pulData
 *           d) PUSHORT  - pcusElementsInArray
 *           e) PUSHORT  - pcusDiscardedElements
 *
 * Returns -
 *      Nothing
 *
 * Computes useful statistical values and returns them in the string
 * whose address is passed to this function. The returned string
 * has the following format :
 *      ("%10lu %10lu %10lu %10lu %5u %10lu %4u %2u")
 * and the arg. list will be in the order: mean,
 * minimum, maximum, number of iterations, precision,
 * standard deviation, number of outliers in the data set and the
 * outlier factor. (Please refer to \ntdocs\teststat.txt for
 * a description of precision. This is on \\jupiter\perftool.)
 *
 * Returns
 *      TO BE COMPLETED.....
 *
 */
/*++
    Had to call this routine in TestStatUniRand - compiler screws up
--*/
void
DbgDummy (
    double dTemp,
    double dLocal
    )
{
    dTemp = 0.0;
    dLocal = 0.0;
}
  900. double dLocal
  901. )
  902. {
  903. dTemp = 0.0;
  904. dLocal = 0.0;
  905. }