Source code of Windows XP (NT5)

README.txt

Author:         Murali R. Krishnan      (MuraliK)
Created:        Jan 6, 1997

Revisions:
    Date            By               Comments
-----------------  --------   -------------------------------------------

Summary :
    This file describes the files in the directory svcs\infocomm\atq
    and details related to ISATQ - Internet Services Async Thread Queue module

File            Owner       Description

README.txt      MuraliK     This file.
abw.hxx         MuraliK     Bandwidth throttler declarations
abw.cxx         MuraliK     Bandwidth throttler for ATQ
acache.cxx      MuraliK     Alloc Cache module
atqbmon.cxx     MCourage    Listen backlog monitor
atqbmon.hxx     MCourage    Listen backlog monitor header
atqcport.cxx    JohnsonA    Fake Completion port for Win95
atqcport.hxx    JohnsonA    Fake Completion port for Win95 header
atqendp.cxx     MuraliK     Atq Endpoint manager
atqmain.cxx     MuraliK     Exposed ATQ entrypoints
atqprocs.hxx    MuraliK     Internal Function Prototypes
atqsupp.cxx     MuraliK     Atq Support Functions - timeout, thread pool, etc.
atqtypes.hxx    MuraliK     Atq Internal Types
atqxmit.cxx     JohnsonA    Internal routines for TransmitFile()
auxctrs.hxx     MuraliK     Auxiliary counters - for internal analysis
dbgutil.h       MuraliK     Debug support definitions
dllmain.cxx     MuraliK     Dll Entry points
isatq.def       MuraliK     .def file
isatq.hxx       MuraliK     pre-compiled header file
isatq.rc        MuraliK     Resource file
sched.cxx       MuraliK     IIS Scheduler - internal thread pool for scheduling
sched.hxx       MuraliK     Scheduler data structures
spud.cxx        JBallard    SPUD.sys user-mode support functions
timeout.cxx     MuraliK     ATQ Contexts Timeout Logic
timer.cxx       MuraliK     Time measurement support code
xmitnt.cxx      JohnsonA    obsolete file - replaced by atqxmit.cxx

----------------------------------------------------------------------

Implementation Details

Contents:

ATQ based Bandwidth Throttle
    Author: MuraliK
    Date:   25-May-1995

Goal:
    Given a specified bandwidth to be used as a threshold, the ATQ module
    shall throttle traffic gracefully. CPU impact should be minimal; minor
    variations above the specified threshold are allowed. Performance in the
    fast case (no throttle) should be high, with as little work as possible
    in the critical path.

Given:
    M -- an administrator-specified bandwidth which should not be exceeded
    in most cases. (Assumed to be specified through a special API interface
    added to the ATQ module.)

Solution:
    Various solutions are possible, depending on the measurements and
    metrics chosen. Whenever two solutions are possible, we pick the simpler
    one to avoid complexity and performance impact. (Remember to K.I.S.S.)

Sub Problems:

1) Determination of Existing Usage:
    Determining the existing usage exactly in real time is computationally
    intensive. We resort to approximate measures whenever possible.
    The idea is: Estimated Bandwidth = (TotalBytesSent / PeriodOfObservation).

  solution a)
    Use a separate thread for sampling and calculating the bandwidth.
    Whenever an IO operation completes (we return from
    GetQueuedCompletionStatus()), increment TotalBytesSent for the period
    under consideration. The sampling thread wakes up at regular intervals
    and calculates the bandwidth in effect at that time. The solution also
    uses histogramming to smooth out sudden variations in the bandwidth.
    This solution:
      + is good, since it limits complexity in calculating bandwidth
      - ignores simultaneous completion of IO => sudden spikes are possible
      - ignores the time the actual IO took to complete (results could be
        misleading)
      - requires a separate sampling thread for bandwidth calculation
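
Solution a) can be sketched roughly as follows. This is a minimal
illustration, not the code in abw.cxx: the class name, the method names and
the history depth of 4 are all assumptions, and the "histogram" is modelled
here as a plain average over the most recent samples. The sampling thread
itself is left out; it would simply call Sample() once per period.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical names throughout; not the declarations in abw.hxx.
class BandwidthSampler {
public:
    static constexpr std::size_t kSlots = 4;  // history depth (assumed)

    // Called on the IO completion path: a single atomic add, no lock.
    void OnIoCompleted(std::uint32_t bytes) {
        bytesThisPeriod_.fetch_add(bytes, std::memory_order_relaxed);
    }

    // Called by the sampling thread once per period. Records the rate for
    // the period just ended and returns the average over recent periods,
    // in bytes per second.
    double Sample(double periodSeconds) {
        assert(periodSeconds > 0.0);
        const std::uint64_t bytes =
            bytesThisPeriod_.exchange(0, std::memory_order_relaxed);
        history_[next_] = static_cast<double>(bytes) / periodSeconds;
        next_ = (next_ + 1) % kSlots;
        if (filled_ < kSlots) ++filled_;

        double sum = 0.0;
        for (std::size_t i = 0; i < filled_; ++i) sum += history_[i];
        return sum / static_cast<double>(filled_);
    }

private:
    std::atomic<std::uint64_t> bytesThisPeriod_{0};
    std::array<double, kSlots> history_{};  // touched only by the sampler
    std::size_t next_ = 0;
    std::size_t filled_ = 0;
};
```

Averaging over several periods is what damps the sudden spikes listed
above, at the cost of reacting more slowly to a real change in load.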

  solution b)
    This solution uses a running approximation of the time taken to
    complete an IO of standard size, viz. a 1 KB transfer. Initially we
    start with an approximation of 0 bytes sent/second (reasonable, since we
    just started). When an IO completes, the time taken for the transfer is
    calculated from the count of bytes sent and the time elapsed from
    inception to end of the IO. We then take a simple average of the
    existing approximation and the newly calculated time. This gives the
    next approximation of bandwidth/time taken. Successive calculations
    refine the measurement of effective usage. (Note, however, that by
    simplifying in this way we free ourselves from worrying about
    concurrency in IO processing.) With concurrent transfers, the time
    measured for a particular transfer is larger than the time that transfer
    alone would take. Hence, the solution makes conservative estimates based
    on this measured value.
      + no separate thread for sampling
      + simple interface & function to calculate bandwidth
      + avoids the unusual spikes seen in the above solution
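
The running approximation in solution b) might look like the sketch below.
It is an illustration under stated assumptions, not the ATQ implementation:
the class and method names are hypothetical, and the estimate is kept
directly in bytes per second rather than as time per 1 KB.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names; a sketch of the "simple average" update only.
class RunningBandwidthEstimate {
public:
    // Fold one completed IO into the estimate. elapsedSeconds is the time
    // from inception to end of that IO.
    void OnIoCompleted(std::uint32_t bytes, double elapsedSeconds) {
        assert(elapsedSeconds > 0.0);
        const double observed = static_cast<double>(bytes) / elapsedSeconds;
        // Simple average of the existing approximation and the new value.
        bytesPerSecond_ = (bytesPerSecond_ + observed) / 2.0;
    }

    double BytesPerSecond() const { return bytesPerSecond_; }

private:
    double bytesPerSecond_ = 0.0;  // "0 bytes sent/second" at startup
};
```

For example, starting from 0, a 2048-byte IO that took 1 second gives an
estimate of 1024 bytes/second; a following 4096-byte IO that took 2 seconds
(2048 bytes/second observed) moves it to 1536. Each update halves the
weight of everything seen before it, so old history fades quickly.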

2) Determination of Action to be performed:
    The allowed operations in the ATQ module include Read, Write and
    TransmitFile. When a new operation is submitted, we need to evaluate
    whether it is safe (allow), marginally safe (block) or unsafe (reject)
    to perform the operation. Evaluating "safety" is tricky and involves
    knowledge about the operations, buffers used, CPU overhead for the
    operation setup, and the estimated and specified bandwidths.

    Let M and B be the specified and estimated bandwidths respectively. Let
    R, W and T stand for the operations Read, Write and TransmitFile. In
    addition, let s and b be suffixes for small and big transfers. The
    definitions of small and big are arbitrary and should be fixed
    empirically. Refer to the following table for the actions to be
    performed.

Action Table:
------------------------------------------------------------------------------
          \  Action |
 Bandwidth \  to be |   Allow           Block                     Reject
 comparison \  Done |
------------------------------------------------------------------------------
  M > B                 R,W,T           -                         -

  M ~= B                W, T            R                         -
  (approx. equal)                       (reduces future traffic)

  M < B                 Ws, Ts          Wb, Tb                    R
                                        (reject on LongQueue)
------------------------------------------------------------------------------

Rationale:

case M > B:
  The services are not yet hitting the specified limit, so it is acceptable
  to allow all operations to proceed without any blockage.

case M ~= B: (i.e. -delta <= (M - B) <= +delta)
  [Note: We use an approximation, since exact equality is costly to
  calculate.]
  At this juncture, network usage is at the brink of the specified
  bandwidth. It is good to take some steps to reduce future traffic.
  Servers operate on a serve-a-request basis -- they receive requests from
  clients and act upon them. It is hence worthwhile to limit the number of
  requests submitted to the active queue banging on the network. By
  delaying the Read, the processing of requests is delayed artificially,
  which delays the load on the network. By the time the delayed reads
  proceed, hopefully the network has eased up and the server will
  stabilise. As for Write and TransmitFile, a certain amount of CPU
  processing has already been done, so it is worthwhile to perform them
  rather than delay and queue them, wasting the CPU work.

  Another possibility is: Do Nothing. In most cases the load may be coming
  down, in which case the bandwidth utilized will naturally drop. On the
  other hand, allowing reads to proceed may generate further Write and
  Transmit loads. Due to this potential danger, we don't adopt this
  solution.

case M < B:
  The bandwidth utilization has exceeded the specified limit. This is an
  important case that deserves regulation. Heavy gains are achieved by
  reducing reads and delaying Wb and Tb. Better yet, reads can be rejected
  with an indication that the server or the network is busy. In most cases,
  when the server issues a read operation it is at the starting point of
  processing a future request from the client (the exception is an FTP
  server doing regular control reads). Hence, there is no harm in rejecting
  the read request entirely. In addition, blocking Wb and Tb delays their
  impact on the bandwidth, and brings down the bandwidth utilization faster
  than is possible by rejecting Reads alone. We don't want to reject Wb or
  Tb, simply because the amount of CPU work already done for them may be
  too high. By blocking them, most of that CPU work does not go to waste.
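
The three cases above map onto a small decision function. The sketch below
encodes the full action table (with the small/big distinction) under
assumed names: Zone, Op, Action, ClassifyZone and Decide are hypothetical,
as are the delta and bigThreshold parameters, which the text says must be
fixed empirically. The "(reject on LongQueue)" note in the table is not
modelled.

```cpp
#include <cassert>
#include <cstddef>

enum class Zone { Low, Medium, High };        // M > B, M ~= B, M < B
enum class Op { Read, Write, TransmitFile };  // R, W, T
enum class Action { Allow, Block, Reject };

// Compare specified bandwidth M against estimated bandwidth B, treating
// anything within +/- delta of M as "approximately equal".
Zone ClassifyZone(double specified, double estimated, double delta) {
    assert(delta >= 0.0);
    if (estimated < specified - delta) return Zone::Low;
    if (estimated > specified + delta) return Zone::High;
    return Zone::Medium;
}

// Look up the action for one operation. bytes is the transfer size and
// bigThreshold separates "small" from "big" transfers.
Action Decide(Zone zone, Op op, std::size_t bytes, std::size_t bigThreshold) {
    switch (zone) {
    case Zone::Low:
        return Action::Allow;  // under the limit: allow R, W, T
    case Zone::Medium:
        // Block reads to reduce future traffic; finish writes and transmits.
        return op == Op::Read ? Action::Block : Action::Allow;
    case Zone::High:
        if (op == Op::Read) return Action::Reject;
        // Ws, Ts proceed; Wb, Tb are blocked rather than rejected.
        return bytes >= bigThreshold ? Action::Block : Action::Allow;
    }
    return Action::Reject;  // not reached for valid Zone values
}
```

With M = 1000, delta = 50 and an estimate of 1200, ClassifyZone() returns
Zone::High, so a Read is rejected while an 8 KB TransmitFile is blocked when
bigThreshold is 4 KB.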

Implementation:
  To be continued later.

  The action table is simplified as shown below to keep the implementation
  simpler.

Action Table:
------------------------------------------------------------------------------
          \  Action |
 Bandwidth \  to be |   Allow           Block                     Reject
 comparison \  Done |
------------------------------------------------------------------------------
  M > B                 R,W,T           -                         -

  M ~= B                W, T            R                         -
  (approx. equal)                       (reduces future traffic)

  M < B  W, T  R
------------------------------------------------------------------------------

Status and Entry point Modifications:
  We keep track of three global variables, one for each of the operations:
  Read, Write and XmitFile. The values of these variables indicate whether
  the operation is allowed, blocked or rejected. The entry points
  AtqReadFile(), AtqWriteFile() and AtqXmitFile() are modified to check the
  status and take the appropriate action. If the operation is allowed, it
  proceeds normally. If the operation is blocked, we store the context in a
  blocked list. The parameters of the entry point that are required for
  restarting the operation are stored along with the context. If the status
  indicates rejection, the operation is rejected. All three global
  variables are read without any synchronization primitives around them.
  This can lead to minor inconsistencies, which is acceptable. However,
  performance is improved since no synchronization primitive needs to be
  accessed. (This assertion, however, depends on SMP implementations and
  needs to be verified. Verification is deferred for the current
  implementation.)
  175. these three global variables are read, without any synchronization
  176. primitives around them. This will potentially lead to minor
  177. inconsistencies, which is acceptable. However, performance is improved
  178. since there is no syncronization primitive that needs to be accessed.( This
  179. assertion however is dependent upon SMP implementations and needs to be
  180. verified. It is deferred for current implementation.)