windows-xp/Source/XPSP1/NT/base/mvdm/thunk/flatdocs.txt



                  How to write flat 32->16 thunks

1. INTRO
--------------------------------------------------------------------
   

   - What are flat 32->16 thunks?

     THUNK.EXE is really three compilers that share a common parser.
     This file describes the "flat thunk" mode, which generates 32->16
     thunks that are mostly 32-bit (the other two modes are 16->32 thunks
     and 32->16 thunks that are mostly 16-bit).  


   - Why should I use flat thunks?

     Flat thunks do most of their work in 32-bit mode. This means better
     performance since they use a minimum number of selector loads
     and replace far16 calls with near32 calls. Also, the
     flat compiler is an aggressive optimizer: early results show
     a 30-40% code size reduction over the 16-bit code generator.


   - Then why use 16-bit 32->16 thunks at all?

     Approximately 200 thunks in Chicago have hand-coded portions which
     need to be ported by hand to 32-bit mode. For compatibility,
     16-bit thunks will be with us for some time to come.


   - Can I have both 16-bit and 32-bit thunks in my component?

     Yes, but you will need two thunk scripts: one for the 16-bit thunks
     and one for the 32-bit thunks. There is no way to mix types in a single
     script.


   - How does the thunk compiler work?

     The thunk compiler's input is a "thunk script", which is a list
     of C-style function prototypes and typedefs. It outputs a .asm file which
     is really two .asm files in one. Assemble it with the "-DIS_16"
     flag and you get a 16-bit .obj which you link into your 16-bit
     component. Assemble it with the "-DIS_32" flag and you get a 32-bit
     .obj which you link into your 32-bit component.

     The 16-bit component contains a jump table containing the 16:16
     address of each function named in your thunk scripts (these functions
     must exist elsewhere as PASCAL functions in your 16-bit component).
     The 32-bit half contains a STDCALL function for each thunk which
     converts its parameters to 16-bit and then calls (through some
     kernel32 magic) the 16-bit target named in the jump table. When
     a 32-bit app invokes a thunked api, it calls these compiler-generated
     STDCALL functions directly.

     For example, the thunk declaration for the LineTo api looks like this:

        typedef          int INT;
        typedef unsigned int UINT;
        typedef UINT         HANDLE;
        typedef HANDLE       HDC;
        
        BOOL LineTo(HDC, INT, INT) =
        BOOL LineTo(HDC, INT, INT) 
        {
        }

     The first function "prototype" declares the form of the 16-bit target
     and the second declares the form of the 32-bit target. As in most
     thunks, the two prototypes are identical. Like the C compiler, the
     thunk compiler interprets "int" as "short" in the 16-bit prototype
     and "long" in the 32-bit prototype.

     When this is fed through the thunk compiler, this is what pops out.
     On the 16-bit half, there is a jump table:

         externDef LineTo:far16

         FT_GdiFThkTargetTable:
                ...
                dw      offset LineTo 
                dw      seg    LineTo

     and LineTo is (say) entry #79. The 32-bit half contains the code:


        ; LineTo(16) = LineTo(32) {}
        ;
        ; dword ptr [ebp+8]:  param1
        ; dword ptr [ebp+12]:  param2
        ; dword ptr [ebp+16]:  param3
        ;
        public LineTo@12
        LineTo@12:
                FAPILOG16       1377            ;DEBUG only -- log api call
                push    ebp
                mov     ebp,esp
                sub     esp,40                  ;Work-space for kernel32
                push    word ptr [ebp+8]        ;param1: dword->word
                push    word ptr [ebp+12]       ;param2: dword->word
                push    word ptr [ebp+16]       ;param3: dword->word
                mov     cl,79                   ;Thunk index
                call    QT_Call16_ShortToLong
                leave
                ret     12

     When a Win32 app calls "LineTo", it transfers directly to this
     routine, which builds a 16-bit call frame and calls a local routine,
     asking it to invoke api #79 in the 16-bit jump table and
     sign-extend the return value. (Each component gets its own set
     of QT_ routines, which know which jump table to use.)
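
     The conversions in the LineTo thunk above can be modelled in
     portable C. This is only a sketch for reasoning about the data
     flow, not what the thunk compiler emits: FakeTarget16, g_table,
     LowWordSigned, Call16ShortToLong and LineTo32Model are all
     hypothetical names, and the fake target simply echoes its last
     argument so that the truncation and sign-extension are visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a 16-bit target: HDC and INT are 16 bits wide. */
typedef int16_t (*Target16)(uint16_t hdc, int16_t x, int16_t y);

/* Illustrative target: echoes its last argument so the conversions show. */
static int16_t FakeTarget16(uint16_t hdc, int16_t x, int16_t y)
{
    (void)hdc;
    (void)x;
    return y;
}

enum { LINETO_INDEX = 79, TABLE_LEN = 80 };

/* Stands in for the 16-bit jump table; only entry #79 is filled in. */
static const Target16 g_table[TABLE_LEN] = { [LINETO_INDEX] = FakeTarget16 };

/* Keep the low word of a dword and reinterpret it as a signed 16-bit int. */
static int16_t LowWordSigned(uint32_t dword)
{
    int32_t w = (int32_t)(dword & 0xFFFFu);
    return (int16_t)(w >= 0x8000 ? w - 0x10000 : w);
}

/* Models the role of QT_Call16_ShortToLong: call table entry `index`,
   then sign-extend the 16-bit return value to 32 bits. */
static int32_t Call16ShortToLong(unsigned index, uint16_t p1,
                                 int16_t p2, int16_t p3)
{
    assert(index < TABLE_LEN && g_table[index] != NULL);
    return (int32_t)g_table[index](p1, p2, p3);
}

/* Models the generated 32-bit thunk body: each dword parameter is
   truncated to a word before the 16-bit call. */
int32_t LineTo32Model(uint32_t hdc, int32_t x, int32_t y)
{
    return Call16ShortToLong(LINETO_INDEX,
                             (uint16_t)(hdc & 0xFFFFu),
                             LowWordSigned((uint32_t)x),
                             LowWordSigned((uint32_t)y));
}
```

     In this model, LineTo32Model(1, 0, 0x0001FFFF) returns -1: the
     high word of the last parameter is dropped, leaving 0xFFFF, and
     the 16-bit result comes back sign-extended.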


2. PROCEDURE FOR ADDING FLAT THUNKS 
------------------------------------------------------------------------

     1. Write a thunk script containing thunk declarations and typedefs
        as above. core\thunk\gdifthk.thk and core\thunk\usrfthk.thk are
        good examples to start from. Put the lines:

                enablemapdirect3216 = true;
                flatthunks = true;

        at the start of your script. This tells the compiler you intend
        to write 32->16 thunks and to generate 32-bit code.

        The naming convention for flat thunk scripts is FooFThk.thk where
        "Foo" identifies your component. If you keep your script in
        the core\thunk directory, please follow this convention.


     2. Compile your thunk script:

            $(TNT) $(THUNKCOM) -ynTb -t FooFThk FooFThk.thk FooFThk.asm

            $(TNT) = dev\tools\c\bin\tnt.exe
            $(THUNKCOM) = dev\tools\binr\thunk.exe

        The "-t FooFThk" provides a string which the thunk compiler uses
        to individualize identifier names. By convention, use the
        stem of your thunk script filename.

        For debug builds, drop the "b" from the "-ynTb" flags.

        The makefile in core\thunk has all this set up so it's easiest
        to check your thunk script there.


     3. Create an empty header file "FooFThk.inc". The 32-bit half of the *.asm
        file includes this header. This is where you put special-case 
        code for your thunks.


     4. Link the 16-bit half of FooFThk.asm into your 16-bit component. 
        Pass this flag to the assembler:
        
           -DIS_16

        Add the export:

            FT_FOOFTHKTHKCONNECTIONDATA
 
        to your *.def file and mark it internal. The 32-bit half 
        dynalinks to this symbol to access the 16-bit jump table.


     5. Link the 32-bit half of FooFThk.asm into your 32-bit component. 
        Pass this flag to the assembler:
        
           -DIS_32

        Core\thunk\fltthk.inc and FooFThk.inc must be in the
        include path.

        Do not pass the -DFT_DEFINEFTCOMMONROUTINES flag. It activates
        an "ifdef"'d part of the .asm file containing common support
        code that is meant to be linked into kernel32 only. Including
        it in another module wastes code space.


     6. In your DLL initialization procedure, execute the following
        for each PROCESS_ATTACH call:

                FT_FooFThkConnectToFlatThkPeer  PROTO near 
                                                pszDll16:dword, 
                                                pszDll32:dword

                pszDll16        db  'foo16.dll',0  ;name of your 16-bit dll
                pszDll32        db  'foo32.dll',0  ;name of your 32-bit dll

                ...


                invoke  FT_FooFThkConnectToFlatThkPeer, 
                        offset pszDll16, 
                        offset pszDll32
                or      eax,eax
                jz      failed
                ; success

        This initializes the flat thunks. The call executes a LoadLibrary
        and GetProcAddress on the 16-bit module. The init routine
        itself is generated by the thunk compiler in the 32-bit half
        of the .asm file.


     7. Link your 32-bit module with dev\lib\kernel32.lib if you're
        not doing so already. The thunk code needs the import records
        for the support routines in kernel32.


     8. Build the components and test. Under debug, you
        can get a debug-port message for each flat thunk by
        setting the "fapilog16" variable in win32c.dll to 1.
        The "[F]" before the api name tells you that it's a flat thunk.


3. WHAT'S IN AND OUT FOR FLAT THUNKS
-------------------------------------------------------------------------

     The flat code generator supports:

        - Structures passed by value or reference.
        - Structures within structures.
        - Pointers within structures, provided that the object
          pointed to doesn't require repacking. The object can be
          another structure.
        - Arrays of scalars embedded in structures.
        - The "input", "output" and "inout" qualifiers for pointer 
          arguments. Default is "input".
        - "passifhinull" for pointer arguments.
        - The "hinstance" primitive type (for mapping instance handles).
        - "passifnull" for hinstances.
        - "structsize" for integer structure fields.

        - Returning pointers provided that the object pointed to requires
          no repacking. The object can be a structure.
        - The "voidtotrue" and "voidtofalse" qualifiers.


     Not supported:
        - Arrays of pointers or arrays of structures.
        - The "deleted" qualifier.
        - The "byname" qualifier.
        - The "maptoretval" semantic.
        - The "sizeof" and "countof" semantics.
        - The "localheap" semantic.
        - The "reverserc" semantic.
        - The "callback" semantic.
        - "body = special", "raw pack/unpack", "push", "special".
          No hand-coding for flat thunks is allowed. Use wrappers
          instead to thunk complex routines.


4. SPECIAL-CASING A THUNK BODY
---------------------------------------------------------------------------
**** HAND-CODING FOR THUNK BODIES IS BEING PHASED OUT. USE
     WRAPPERS TO THUNK COMPLEX API. SEE ATSUSHIK IF YOU NEED HELP
     ON THIS.
****


13. APPENDIX I: 
------------------------------------------------------------------------------
    NEW for the flat code generator:
    Revision 1: STRUCTSIZE and HINSTANCES


  Structure size fields:
     You can now thunk fields that contain the size of their
     containing structure. Just put the "structsize" keyword after
     the field name, like this:

            typedef struct tagFOO {
                DWORD cbSize   structsize;
                LPSTR this;
                LPSTR that;
            } FOO;

     The compiler will insert the 16-bit structure size when packing in,
     and the 32-bit structure size when packing out. You can mark
     any integral type field as "structsize", including UINT.
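
     As a rough illustration of what "structsize" does, here is a
     C model of packing in and packing out. BAR16, BAR32, PackIn and
     PackOut are hypothetical names for this sketch; the structure is
     assumed to hold two UINT fields, so it is 4 bytes on the 16-bit
     side and 8 bytes on the 32-bit side. The real compiler generates
     assembly, not C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure BAR as each side sees it: UINT is 16 bits wide
   on the 16-bit side and 32 bits wide on the 32-bit side. */
typedef struct { uint16_t cbSize; uint16_t flags; } BAR16;
typedef struct { uint32_t cbSize; uint32_t flags; } BAR32;

/* Packing in (32 -> 16): ordinary fields are narrowed, but the field
   marked "structsize" is overwritten with the 16-bit structure size. */
void PackIn(const BAR32 *src, BAR16 *dst)
{
    assert(src != NULL && dst != NULL);
    dst->cbSize = (uint16_t)sizeof(BAR16);
    dst->flags  = (uint16_t)(src->flags & 0xFFFFu);
}

/* Packing out (16 -> 32): ordinary fields are widened, and the
   "structsize" field is overwritten with the 32-bit structure size. */
void PackOut(const BAR16 *src, BAR32 *dst)
{
    assert(src != NULL && dst != NULL);
    dst->cbSize = (uint32_t)sizeof(BAR32);
    dst->flags  = src->flags;
}
```

     The point of the model is that the size field is never copied
     across: each side always sees the size of its own layout.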


  HInstance:
     There's a new primitive data type "hinstance" (lowercase). I'll
     add a typedef for HINSTANCE (uppercase) as soon as I've cleaned out
     the old usage of HINSTANCE.

     "hinstance" maps to a 32-bit value for the 32-bit side and a 16-bit
     value for the 16-bit side. Hinstances can appear anywhere an integer
     can (except as the return value type). 

     NULL gets mapped to the current hinstance by default. You can make
     NULL map to NULL instead by adding the "passifnull" qualifier.
     If the hinstance is a structure field, add the "passifnull" qualifier
     as you would "structsize". If it's a parameter, put

            paramname = passifnull;

     inside the curly braces.
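
     The NULL handling can be summarized by a small C model.
     ResolveHinstance is a hypothetical helper written for this note,
     and it covers only the NULL rule described above; it does not
     model how a non-NULL 32-bit value is converted to its 16-bit
     counterpart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of the NULL rule for "hinstance" values.
   hinst:      value supplied by the caller (0 stands for NULL)
   passifnull: true when the "passifnull" qualifier is present
   current:    the current hinstance, assumed non-NULL in this sketch */
uint32_t ResolveHinstance(uint32_t hinst, bool passifnull, uint32_t current)
{
    assert(current != 0);
    if (hinst == 0 && !passifnull)
        return current;   /* default: NULL maps to the current hinstance */
    return hinst;         /* non-NULL values, or NULL with passifnull */
}
```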

     Everyone is reminded that the 16-bit "hinstance" for a 32-bit app is
     really the hmodule. This works because most api that ask for hinstances
     really want hmodules.