Team Fortress 2 Source Code as on 22/4/2020

  1. LZMA compression
  2. ----------------
  3. Version: 9.35
  4. This file describes the LZMA encoding and decoding functions written in C.
  5. LZMA is an improved version of the well-known LZ77 compression algorithm.
  6. It was designed to maximize the compression ratio while
  7. keeping high decompression speed and low memory requirements for
  8. decompression.
  9. Note: you can also read the LZMA Specification (lzma-specification.txt from the LZMA SDK).
  10. You can also look at the source code for LZMA encoding and decoding:
  11. C/Util/Lzma/LzmaUtil.c
  12. LZMA compressed file format
  13. ---------------------------
  14. Offset Size Description
  15.   0     1   Special LZMA properties (lc, lp, pb in encoded form)
  16.   1     4   Dictionary size (little endian)
  17.   5     8   Uncompressed size (little endian). -1 means unknown size
  18.  13         Compressed data
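As an illustration of the header layout above, here is a minimal sketch in C that splits the 13-byte header into its fields. The struct and function names are hypothetical and not part of the LZMA SDK. The properties byte is assumed to be encoded as (pb * 5 + lp) * 9 + lc, which is the encoding described in the LZMA Specification.

```c
#include <stddef.h>
#include <stdint.h>

/* Decoded view of the 13-byte header (illustrative type, not an SDK type). */
typedef struct {
    unsigned lc, lp, pb;        /* values unpacked from the properties byte */
    uint32_t dict_size;         /* dictionary size in bytes */
    uint64_t uncompressed_size; /* UINT64_MAX (all bits set, i.e. -1) means unknown */
} LzmaHeaderInfo;

/* Returns 0 on success, -1 if the properties byte is out of range. */
static int parse_lzma_header(const uint8_t hdr[13], LzmaHeaderInfo *out)
{
    unsigned d = hdr[0];
    int i;

    /* Assumed encoding: d = (pb * 5 + lp) * 9 + lc, so d must be < 9 * 5 * 5. */
    if (d >= 9 * 5 * 5)
        return -1;
    out->lc = d % 9;
    d /= 9;
    out->lp = d % 5;
    out->pb = d / 5;

    /* Offset 1, 4 bytes: dictionary size, little endian. */
    out->dict_size = 0;
    for (i = 0; i < 4; i++)
        out->dict_size |= (uint32_t)hdr[1 + i] << (8 * i);

    /* Offset 5, 8 bytes: uncompressed size, little endian. */
    out->uncompressed_size = 0;
    for (i = 0; i < 8; i++)
        out->uncompressed_size |= (uint64_t)hdr[5 + i] << (8 * i);

    return 0;
}
```

With the common properties byte 0x5D this yields lc=3, lp=0, pb=2, which matches the default settings mentioned in this file.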
  19. ANSI-C LZMA Decoder
  20. ~~~~~~~~~~~~~~~~~~~
  21. Please note that the interfaces for the ANSI-C code were changed in LZMA SDK 4.58.
  22. If you want to use the old interfaces, you can download a previous version of the LZMA SDK
  23. from the sourceforge.net site.
  24. To use the ANSI-C LZMA Decoder you need the following files:
  25. 1) LzmaDec.h + LzmaDec.c + Types.h
  26. See the example code:
  27. C/Util/Lzma/LzmaUtil.c
  28. Memory requirements for LZMA decoding
  29. -------------------------------------
  30. Stack usage of the LZMA decoding function for local variables is no
  31. larger than 200-400 bytes.
  32. The LZMA Decoder uses a dictionary buffer and an internal state structure.
  33. The internal state structure consumes
  34. state_size = (4 + 1.5 * 2^(lc + lp)) KB
  35. By default (lc=3, lp=0), state_size = 16 KB.
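The state_size formula can be checked with a few lines of C. This is only a sketch of the arithmetic given in this file, assuming 1 KB = 1024 bytes; the function name is hypothetical, and the exact figure in a given SDK build may differ slightly from this estimate.

```c
#include <stddef.h>

/* Estimated decoder state size in bytes: (4 + 1.5 * 2^(lc + lp)) KB.
   1.5 KB is 1536 bytes, so the whole computation stays in integers. */
static size_t lzma_state_size_bytes(unsigned lc, unsigned lp)
{
    return (size_t)4096 + ((size_t)1536 << (lc + lp));
}
```

For the defaults, lzma_state_size_bytes(3, 0) gives 4096 + 1536 * 8 = 16384 bytes, which is the 16 KB quoted in this section.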
  36. How To decompress data
  37. ----------------------
  38. The LZMA Decoder (ANSI-C version) supports 2 interfaces:
  39. 1) Single-call Decompressing
  40. 2) Multi-call State Decompressing (zlib-like interface)
  41. You must supply an external allocator.
  42. Example:
  43. void *SzAlloc(void *p, size_t size) { p = p; return malloc(size); }
  44. void SzFree(void *p, void *address) { p = p; free(address); }
  45. ISzAlloc alloc = { SzAlloc, SzFree };
  46. The statement p = p; is there only to suppress unused-parameter compiler warnings.
  47. Single-call Decompressing
  48. -------------------------
  49. When to use: RAM->RAM decompressing
  50. Compile files: LzmaDec.h + LzmaDec.c + Types.h
  51. Compile defines: no defines
  52. Memory Requirements:
  53. - Input buffer: compressed size
  54. - Output buffer: uncompressed size
  55. - LZMA Internal Structures: state_size (16 KB for default settings)
  56. Interface:
  57. int LzmaDecode(Byte *dest, SizeT *destLen, const Byte *src, SizeT *srcLen,
  58. const Byte *propData, unsigned propSize, ELzmaFinishMode finishMode,
  59. ELzmaStatus *status, ISzAlloc *alloc);
  60. In:
  61.   dest       - output data
  62.   destLen    - output data size
  63.   src        - input data
  64.   srcLen     - input data size
  65.   propData   - LZMA properties (5 bytes)
  66.   propSize   - size of propData buffer (5 bytes)
  67.   finishMode - It has meaning only if the decoding reaches output limit (*destLen).
  68.     LZMA_FINISH_ANY - Decode just destLen bytes.
  69.     LZMA_FINISH_END - Stream must be finished after (*destLen).
  70.                       You can use LZMA_FINISH_END, when you know that
  71.                       current output buffer covers last bytes of stream.
  72.   alloc      - Memory allocator.
  73. Out:
  74.   destLen    - processed output size
  75.   srcLen     - processed input size
  76. Return value:
  77.   SZ_OK
  78.     status:
  79.       LZMA_STATUS_FINISHED_WITH_MARK
  80.       LZMA_STATUS_NOT_FINISHED
  81.       LZMA_STATUS_MAYBE_FINISHED_WITHOUT_MARK
  82.   SZ_ERROR_DATA - Data error
  83.   SZ_ERROR_MEM - Memory allocation error
  84.   SZ_ERROR_UNSUPPORTED - Unsupported properties
  85.   SZ_ERROR_INPUT_EOF - It needs more bytes in input buffer (src).
  86. If the LZMA decoder sees the end marker before reaching the output limit, it returns SZ_OK,
  87. and the output value of destLen will be less than the output buffer size limit.
  88. You can use multiple checks to test data integrity after full decompression:
  89. 1) Check the result and the "status" variable.
  90. 2) Check that the output value of destLen equals uncompressedSize, if you know the real uncompressedSize.
  91. 3) Check that the output value of srcLen equals compressedSize, if you know the real compressedSize.
  92. You must use the correct finish mode in that case.
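The three integrity checks can be folded into one helper. The sketch below is standalone and uses stand-in constants (the DEMO_ names are hypothetical); in real code the result and status values come from the SDK headers, and the helper name is likewise an assumption.

```c
#include <stddef.h>

/* Stand-ins for the SDK result and status codes, so the sketch compiles alone. */
enum { DEMO_OK = 0, DEMO_ERROR_DATA = 1 };
typedef enum {
    DEMO_STATUS_FINISHED_WITH_MARK,
    DEMO_STATUS_NOT_FINISHED,
    DEMO_STATUS_MAYBE_FINISHED_WITHOUT_MARK
} DemoStatus;

/* Returns 1 if all three checks pass, 0 otherwise. */
static int decode_looks_complete(int res, DemoStatus status,
                                 size_t dest_len, size_t expected_dest,
                                 size_t src_len, size_t expected_src)
{
    if (res != DEMO_OK)
        return 0;                        /* check 1: result code */
    if (status == DEMO_STATUS_NOT_FINISHED)
        return 0;                        /* check 1: status variable */
    if (dest_len != expected_dest)
        return 0;                        /* check 2: produced output size */
    if (src_len != expected_src)
        return 0;                        /* check 3: consumed input size */
    return 1;
}
```

The size checks only apply when the real sizes are known, for example from the file header or from the container that holds the stream.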
  93. Multi-call State Decompressing (zlib-like interface)
  94. ----------------------------------------------------
  95. When to use: file->file decompressing
  96. Compile files: LzmaDec.h + LzmaDec.c + Types.h
  97. Memory Requirements:
  98. - Buffer for input stream: any size (for example, 16 KB)
  99. - Buffer for output stream: any size (for example, 16 KB)
  100. - LZMA Internal Structures: state_size (16 KB for default settings)
  101. - LZMA dictionary (dictionary size is encoded in LZMA properties header)
  102. 1) Read the LZMA properties (5 bytes) and uncompressed size (8 bytes, little-endian) into header:
  103. unsigned char header[LZMA_PROPS_SIZE + 8];
  104. ReadFile(inFile, header, sizeof(header));
  105. 2) Allocate the CLzmaDec structures (state + dictionary) using the LZMA properties:
  106. CLzmaDec state;
  107. LzmaDec_Construct(&state);
  108. res = LzmaDec_Allocate(&state, header, LZMA_PROPS_SIZE, &g_Alloc);
  109. if (res != SZ_OK)
  110. return res;
  111. 3) Initialize the LzmaDec structure before each new LZMA stream, then call LzmaDec_DecodeToBuf in a loop:
  112. LzmaDec_Init(&state);
  113. for (;;)
  114. {
  115. ...
  116. int res = LzmaDec_DecodeToBuf(CLzmaDec *p, Byte *dest, SizeT *destLen,
  117. const Byte *src, SizeT *srcLen, ELzmaFinishMode finishMode, ELzmaStatus *status);
  118. ...
  119. }
  120. 4) Free all allocated structures
  121. LzmaDec_Free(&state, &g_Alloc);
  122. See the example code:
  123. C/Util/Lzma/LzmaUtil.c
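The buffer bookkeeping inside the multi-call loop can be sketched without the SDK. In the standalone example below, demo_decode_to_buf is a hypothetical stand-in that just copies bytes; it only mimics the in/out length convention of LzmaDec_DecodeToBuf (capacities on entry, processed counts on return). The buffer sizes and all names are assumptions made for illustration, and memory buffers take the place of files.

```c
#include <stddef.h>
#include <string.h>

#define IN_BUF_SIZE  16
#define OUT_BUF_SIZE 8

/* Stand-in decoder: copies bytes instead of decoding LZMA data.
   On entry *dest_len and *src_len are capacities; on return they hold
   the number of bytes produced and consumed. */
static int demo_decode_to_buf(unsigned char *dest, size_t *dest_len,
                              const unsigned char *src, size_t *src_len)
{
    size_t n = *dest_len < *src_len ? *dest_len : *src_len;
    memcpy(dest, src, n);
    *dest_len = n;
    *src_len = n;
    return 0;
}

/* Pumps in_size bytes from `in` through the stand-in decoder into `out`.
   Returns the number of bytes written, or (size_t)-1 on error. */
static size_t demo_stream(const unsigned char *in, size_t in_size,
                          unsigned char *out, size_t out_cap)
{
    unsigned char in_buf[IN_BUF_SIZE], out_buf[OUT_BUF_SIZE];
    size_t in_pos = 0, in_avail = 0, read_pos = 0, written = 0;

    for (;;) {
        size_t src_len, dest_len;

        if (in_pos == in_avail) {        /* input buffer drained: refill it */
            in_avail = in_size - read_pos;
            if (in_avail > IN_BUF_SIZE)
                in_avail = IN_BUF_SIZE;
            memcpy(in_buf, in + read_pos, in_avail);
            read_pos += in_avail;
            in_pos = 0;
        }

        src_len = in_avail - in_pos;     /* bytes still unread in in_buf */
        dest_len = OUT_BUF_SIZE;         /* free space in out_buf */
        if (demo_decode_to_buf(out_buf, &dest_len, in_buf + in_pos, &src_len) != 0)
            return (size_t)-1;
        in_pos += src_len;

        if (written + dest_len > out_cap)
            return (size_t)-1;
        memcpy(out + written, out_buf, dest_len);   /* flush the output buffer */
        written += dest_len;

        if (src_len == 0 && dest_len == 0)          /* no progress: stream ended */
            return written;
    }
}
```

A real loop would also pick the finish mode from the remaining uncompressed size and inspect the returned status before deciding that the stream ended cleanly.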
  124. How To compress data
  125. --------------------
  126. Compile files:
  127. Types.h
  128. Threads.h
  129. LzmaEnc.h
  130. LzmaEnc.c
  131. LzFind.h
  132. LzFind.c
  133. LzFindMt.h
  134. LzFindMt.c
  135. LzHash.h
  136. Memory Requirements:
  137. - (dictSize * 11.5 + 6 MB) + state_size
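To get a feel for the encoder memory formula, the estimate can be computed directly. This is a sketch of the arithmetic only, assuming binary units (1 MB = 2^20 bytes, 1 KB = 1024 bytes) and the decoder state_size formula from this file; the function name is hypothetical.

```c
#include <stdint.h>

/* Estimated encoder memory in bytes: dictSize * 11.5 + 6 MB + state_size. */
static uint64_t lzma_encoder_mem_bytes(uint32_t dict_size, unsigned lc, unsigned lp)
{
    /* state_size = (4 + 1.5 * 2^(lc + lp)) KB, expressed in bytes */
    uint64_t state = 4096 + ((uint64_t)1536 << (lc + lp));

    /* dictSize * 11.5 is computed as dictSize * 23 / 2 to stay in integers */
    return (uint64_t)dict_size * 23 / 2 + ((uint64_t)6 << 20) + state;
}
```

For a 16 MB dictionary with default lc and lp this comes to about 190 MB, so the dictionary size dominates the total.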
  138. The LZMA Encoder can use two memory allocators:
  139. 1) alloc - for small arrays.
  140. 2) allocBig - for big arrays.
  141. For example, you can use Large RAM Pages (2 MB) in the allocBig allocator for
  142. better compression speed. Note that Windows has a poor implementation of
  143. Large RAM Pages.
  144. It's OK to use the same allocator for alloc and allocBig.
  145. Single-call Compression with callbacks
  146. --------------------------------------
  147. See the example code:
  148. C/Util/Lzma/LzmaUtil.c
  149. When to use: file->file compressing
  150. 1) You must implement callback structures for the interfaces:
  151. ISeqInStream
  152. ISeqOutStream
  153. ICompressProgress
  154. ISzAlloc
  155. static void *SzAlloc(void *p, size_t size) { p = p; return MyAlloc(size); }
  156. static void SzFree(void *p, void *address) { p = p; MyFree(address); }
  157. static ISzAlloc g_Alloc = { SzAlloc, SzFree };
  158. CFileSeqInStream inStream;
  159. CFileSeqOutStream outStream;
  160. inStream.funcTable.Read = MyRead;
  161. inStream.file = inFile;
  162. outStream.funcTable.Write = MyWrite;
  163. outStream.file = outFile;
  164. 2) Create the CLzmaEncHandle object:
  165. CLzmaEncHandle enc;
  166. enc = LzmaEnc_Create(&g_Alloc);
  167. if (enc == 0)
  168. return SZ_ERROR_MEM;
  169. 3) Initialize the CLzmaEncProps properties:
  170. CLzmaEncProps props;
  170. LzmaEncProps_Init(&props);
  171. Then you can change some properties in that structure.
  172. 4) Send the LZMA properties to the LZMA Encoder:
  173. res = LzmaEnc_SetProps(enc, &props);
  174. 5) Write encoded properties to header
  175. Byte header[LZMA_PROPS_SIZE + 8];
  176. size_t headerSize = LZMA_PROPS_SIZE;
  177. UInt64 fileSize;
  178. int i;
  179. res = LzmaEnc_WriteProperties(enc, header, &headerSize);
  180. fileSize = MyGetFileLength(inFile);
  181. for (i = 0; i < 8; i++)
  182. header[headerSize++] = (Byte)(fileSize >> (8 * i));
  183. MyWriteFileAndCheck(outFile, header, headerSize);
  184. 6) Call encoding function:
  185. res = LzmaEnc_Encode(enc, &outStream.funcTable, &inStream.funcTable,
  186. NULL, &g_Alloc, &g_Alloc);
  187. 7) Destroy the LZMA Encoder object:
  188. LzmaEnc_Destroy(enc, &g_Alloc, &g_Alloc);
  189. If a callback function returns an error code, LzmaEnc_Encode also returns that code,
  190. or it can return a code such as SZ_ERROR_READ, SZ_ERROR_WRITE or SZ_ERROR_PROGRESS.
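Step 5 stores the uncompressed size as 8 little-endian bytes after the encoded properties. The helpers below isolate that byte packing and its inverse as a standalone sketch; the function names are hypothetical and not part of the SDK.

```c
#include <stdint.h>

/* Writes v to dst[0..7], least significant byte first. */
static void put_u64_le(uint8_t *dst, uint64_t v)
{
    int i;
    for (i = 0; i < 8; i++)
        dst[i] = (uint8_t)(v >> (8 * i));
}

/* Reads a 64-bit little-endian value from src[0..7]. */
static uint64_t get_u64_le(const uint8_t *src)
{
    uint64_t v = 0;
    int i;
    for (i = 0; i < 8; i++)
        v |= (uint64_t)src[i] << (8 * i);
    return v;
}
```

Shifting by whole bytes keeps the result independent of the host byte order. Writing UINT64_MAX produces eight 0xFF bytes, which is the "-1 means unknown size" value from the file format table.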
  191. Single-call RAM->RAM Compression
  192. --------------------------------
  193. Single-call RAM->RAM Compression is similar to Compression with callbacks,
  194. but you provide pointers to buffers instead of pointers to stream callbacks:
  195. SRes LzmaEncode(Byte *dest, SizeT *destLen, const Byte *src, SizeT srcLen,
  196. const CLzmaEncProps *props, Byte *propsEncoded, SizeT *propsSize, int writeEndMark,
  197. ICompressProgress *progress, ISzAlloc *alloc, ISzAlloc *allocBig);
  198. Return code:
  199. SZ_OK - OK
  200. SZ_ERROR_MEM - Memory allocation error
  201. SZ_ERROR_PARAM - Incorrect parameter
  202. SZ_ERROR_OUTPUT_EOF - output buffer overflow
  203. SZ_ERROR_THREAD - errors in multithreading functions (only for Mt version)
  204. Defines
  205. -------
  206. _LZMA_SIZE_OPT - Enable some optimizations in LZMA Decoder to get smaller executable code.
  207. _LZMA_PROB32 - It can increase the speed on some 32-bit CPUs, but memory usage for
  208. some structures will be doubled in that case.
  209. _LZMA_UINT32_IS_ULONG - Define it if int is 16-bit on your compiler and long is 32-bit.
  210. _LZMA_NO_SYSTEM_SIZE_T - Define it if you don't want to use size_t type.
  211. _7ZIP_PPMD_SUPPPORT - Define it if you don't want to support PPMD method in ANSI-C .7z decoder.
  212. C++ LZMA Encoder/Decoder
  213. ~~~~~~~~~~~~~~~~~~~~~~~~
  214. The C++ LZMA code uses COM-like interfaces, so if you want to use it,
  215. it helps to study the basics of COM/OLE.
  216. The C++ LZMA code is just a wrapper over the ANSI-C code.
  217. C++ Notes
  218. ~~~~~~~~~
  219. If you use some C++ code folders in 7-Zip (for example, C++ code for .7z handling),
  220. you must check that you work correctly with the "new" operator.
  221. 7-Zip can be compiled with MSVC 6.0, whose "new" operator doesn't throw an exception.
  222. So 7-Zip uses "CPP\Common\NewHandler.cpp", which redefines the "new" operator:
  223. void *operator new(size_t size)
  224. {
  225. void *p = ::malloc(size);
  226. if (p == 0)
  227. throw CNewException();
  228. return p;
  229. }
  230. If you use an MSVC version that throws an exception from the "new" operator, you can compile without
  231. "NewHandler.cpp", so the standard exception will be used. Some 7-Zip code
  232. catches any exception in internal code and converts it to an HRESULT code,
  233. so you don't need to catch CNewException if you call the COM interfaces of 7-Zip.
  234. ---
  235. http://www.7-zip.org
  236. http://www.7-zip.org/sdk.html
  237. http://www.7-zip.org/support.html