Jawin Userguide - Calling a DLL Entry Point

Introduction

This part of the user guide covers how to use Jawin to call a DLL entry point. The information here covers the "low-level" code, close to Jawin. This is the form of code the Jawin Type Browser generates, and that generated code should be preferred over writing this "low-level" code by hand. If the Jawin Type Browser is not able to generate stub code for a specific DLL, follow the guidelines in this document.

Before spending time on wrapping common DLL entry points (e.g. functions in the Win32 API), check whether the donated code in the org.jawin.donated.* subpackages already contains a wrapper for the DLL functions you need. We would also be grateful if you have wrapped some commonly used DLL functions and choose to donate them to the Jawin Project. Please send a mail to the Jawin mailing list, and we will help you get the source into the donated package.

Content

Quick 'n Dirty Overview of DLL terms
How to call a DLL Entry Point with Jawin
Sample Code
1. Using a "shortcut" invoke
2. Using the generic invoke
Error Handling
Threading Issues
Additional Resources

1. Quick 'n Dirty Overview of DLL terms

The traditional way to expose a software module to other programs in the Win32 world is by exporting functions from a DLL. This is how the standard Win32 API itself is exposed (e.g. some of the DLLs containing Win32 functions are kernel32.dll, user32.dll and advapi32.dll).

If you are curious about exactly which functions a DLL exposes, a tool like Dependency Walker can be used (this tool is included in Visual Studio, where it is named Depends). Unfortunately, DLL entry points carry no metadata about the parameters a function requires, so documentation is needed to use functions from a DLL successfully. This can take the form of header files for the DLL or more formal documentation, such as the Win32 API documentation in MSDN.

The Win32 API documentation specifies, for each function, the library that contains it. For example, the documentation for the Win32 MessageBox function contains the following requirements section:

Windows NT/2000: Requires Windows NT 3.1 or later.
Windows 95/98: Requires Windows 95 or later.
Header: Declared in Winuser.h; include Windows.h.
Library: Use User32.lib.
Unicode: Implemented as Unicode and ANSI versions on all platforms.

The line to note here is the Library line. It specifies the import library in which the MessageBox function is defined. The DLL that exposes MessageBox is then the .dll file with the same name as the .lib file, so the MessageBox function is in user32.dll.

It is also necessary to note the Unicode line. A comment like this usually means that there is really no MessageBox function in user32.dll, but instead two versions of it (this is the case for most Win32 API functions that accept strings as input):

MessageBoxW (accepting Unicode UCS-2 strings - two bytes per character)
MessageBoxA (accepting ANSI strings - one byte per character)

This should be verified with Dependency Walker. Currently the Jawin marshalling code is only able to marshal to two-byte Unicode strings (or BSTR), so MessageBoxW should be chosen.
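The difference between the W and A variants comes down to how the string bytes are laid out in memory. The following is a minimal sketch using only the JDK, with no Jawin involved. The class name EncodingWidths is made up for illustration, and ISO-8859-1 is only a stand-in for an ANSI code page.

```java
import java.nio.charset.StandardCharsets;

class EncodingWidths {
    // Bytes a W-function expects: UTF-16 little-endian, two bytes per char
    // for the plain characters used here.
    static byte[] wide(String s) {
        return s.getBytes(StandardCharsets.UTF_16LE);
    }

    // Bytes an A-function expects. ISO-8859-1 is an assumption standing in
    // for whatever ANSI code page the system actually uses.
    static byte[] narrow(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String text = "Hello";
        System.out.println("wide:   " + wide(text).length + " bytes");
        System.out.println("narrow: " + narrow(text).length + " bytes");
    }
}
```

Passing the narrow form to a W-function (or the reverse) would make the native side misread the text, which is why the choice of entry point has to match the marshalling.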

If you only have a vague feeling for what this Unicode stuff is about, please take the 15 minutes required to read the article about character encodings referred to in the additional resources.

2. How to call a DLL Entry Point with Jawin

To call an exposed DLL function, the class org.jawin.FuncPtr should be used. Each instance of this class represents a single DLL entry point, and you create a FuncPtr by specifying a library name and entry point name.

After creating a FuncPtr-object, the DLL function can be invoked by calling one of the invoke()-methods. The FuncPtr-class contains both a generic invoke()-method that covers all function signatures and several "shortcut"-methods for standard function signatures. Please see the Javadoc for the FuncPtr-class for details about exactly which "shortcut"-methods exist.

Note that the invoke() methods can be called several times on a FuncPtr-object and from different threads, so if the same native function is used more than once, it is worthwhile to cache the FuncPtr-object.

When finished with a FuncPtr-object, the close() method should be called to let Windows decrease the reference count for the DLL, and ultimately unload the DLL. FuncPtr contains a finalize-method that tries to release the resources if close() was not explicitly called, but as usual in Java there are no guarantees with respect to finalize(), so it is highly recommended to call close() explicitly.

3. Sample Code

As mentioned in the previous section, the FuncPtr-class supports both a generic invoke() and "shortcut" methods for some standard function signatures. The following two sections show how to use both types. Please notice that the generic invoke() is 10-50 times slower than the "shortcut" methods, so a "shortcut" method should be preferred if one exists.

3.1. Using a "shortcut" invoke

Code that uses Jawin to call the MessageBoxW function mentioned above could look like the following, which uses a "shortcut" method (the code is from the HelloDll demo):

    import org.jawin.COMException;
    import org.jawin.FuncPtr;
    import org.jawin.ReturnFlags;
    ..
    ..
    FuncPtr msgBox = null;
    try {
        msgBox = new FuncPtr("USER32.DLL", "MessageBoxW");
        msgBox.invoke_I(0, "Hello From a DLL", "From Jawin", 0, ReturnFlags.CHECK_FALSE);
    } catch (COMException e) {
        // handle exception
    } finally {
        if (msgBox != null) {
            try {
                msgBox.close();
            } catch (COMException e) {
                // handle fatal exception
            }
        }
    }

The called "shortcut" method

invoke_I(int, String, String, int, ReturnFlags);

can be used, since it matches the signature for MessageBoxW (taken from the MSDN documentation)

    int MessageBox(
        HWND hWnd,          // handle to owner window
        LPCTSTR lpText,     // text in message box
        LPCTSTR lpCaption,  // message box title
        UINT uType          // message box style
    );

This is not immediately obvious, but the "shortcut" method is selected by following these rules:

First, the invoke_I family of "shortcut" methods is selected because MessageBoxW returns an integer.
Second, the (int, String, String, int) method is selected because the base types of the native function's parameters can be represented this way (e.g. HWND is typedef'ed as a void pointer, and since any pointer on a 32-bit platform is a 4-byte integer, an int can be used for this value).

In general, because of the typedef mechanism in C/C++, many function signatures are covered by the relatively few invoke_*-methods. But if a matching "shortcut" method really does not exist, the generic invoke()-method can be used, as described in the next section.

The final parameter, the ReturnFlags, tells Jawin how to process the function return value. The different options for this value are described in the section about error handling.

3.2. Using the generic invoke

If no "shortcut" method exists for a specific function signature, the generic invoke() must be used. This is somewhat more tedious from the programmer's point of view, but it is also very flexible. Below is a code snippet that invokes MessageBoxW using the generic invoke(). Don't worry if it looks confusing at first sight; each step is explained below the code (the code is from the HelloDllGeneric demo):

    import org.jawin.COMException;
    import org.jawin.FuncPtr;
    import org.jawin.ReturnFlags;
    import org.jawin.io.LittleEndianOutputStream;
    import org.jawin.io.NakedByteStream;
    ..
    ..
    FuncPtr msgBox = null;
    try {
        msgBox = new FuncPtr("USER32.DLL", "MessageBoxW");

        // create a NakedByteStream for the serialization of Java variables
        NakedByteStream nbs = new NakedByteStream();
        // wrap it in a LittleEndianOutputStream
        LittleEndianOutputStream leos = new LittleEndianOutputStream(nbs);
        // and then write the Java arguments
        leos.writeInt(0);
        leos.writeStringUnicode("Generic Hello From a DLL");
        leos.writeStringUnicode("From Jawin");
        leos.writeInt(0);

        // call the generic invoke, with the NakedByteStream
        // and parameters describing how to deserialize the
        // NakedByteStream byte-array on the native side
        msgBox.invoke("IGGI:I:", 16, nbs, null, ReturnFlags.CHECK_FALSE);
    } catch (COMException e) {
        // handle exception
    } finally {
        if (msgBox != null) {
            try {
                msgBox.close();
            } catch (COMException e) {
                // handle fatal exception
            }
        }
    }

As seen from the code example, all the arguments that should be passed to the native function are serialized into a NakedByteStream (which mostly wraps a byte-array as explained below).

This NakedByteStream is then passed to the generic invoke-method together with information about how to deserialize the content of the byte-array on the native side. This information is in the instructions-string and the stackSize, which are also both further explained below.

3.2.1. The NakedByteStream - serialize from Java-variables before calling `invoke`

Since Windows uses little-endian byte order internally, the serialization of Java variables onto the byte-array that is passed to the native code must use this byte order too. To assist in this, the caller should use the org.jawin.io.LittleEndianOutputStream class. The recommended way to do this is:

Create an org.jawin.io.NakedByteStream-object, which is just a simple subclass of java.io.ByteArrayOutputStream that allows access to the wrapped byte array without copying. This should be used for better performance. Note: be aware of the limitations of working with the internal byte-array. NEVER use the .length value of the internal array; instead use the value of the size() method on the NakedByteStream.
Create an org.jawin.io.LittleEndianOutputStream object, passing the NakedByteStream-object to the constructor.
Serialize all the [in]-variables to the byte array by calling the appropriate method on the LittleEndianOutputStream for each variable.

After this, the NakedByteStream contains the serialized variables in little-endian byte order, and can then be passed to the invoke() method.
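The warning about .length versus size() can be illustrated without Jawin. The sketch below uses only the JDK: ExposedByteStream is a hypothetical stand-in for NakedByteStream (it is not the Jawin class), and writeIntLE is an illustrative helper that mimics what a little-endian integer write produces.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical stand-in for NakedByteStream: exposes the internal buffer
// of ByteArrayOutputStream without copying it.
class ExposedByteStream extends ByteArrayOutputStream {
    byte[] internalBuffer() {
        return buf; // protected field inherited from ByteArrayOutputStream
    }
}

class LittleEndianDemo {
    // Writes a 32-bit value least significant byte first.
    static void writeIntLE(ByteArrayOutputStream out, int value) {
        out.write(value & 0xFF);
        out.write((value >>> 8) & 0xFF);
        out.write((value >>> 16) & 0xFF);
        out.write((value >>> 24) & 0xFF);
    }

    public static void main(String[] args) {
        ExposedByteStream stream = new ExposedByteStream();
        writeIntLE(stream, 0x01020304);
        byte[] raw = stream.internalBuffer();
        // size() counts the bytes actually written; raw.length is the
        // capacity of the buffer, which is usually larger.
        System.out.println("valid bytes:     " + stream.size());
        System.out.println("buffer capacity: " + raw.length);
        System.out.println("first byte:      " + raw[0]);
    }
}
```

Only the first size() bytes of the internal array hold serialized data; the rest is unused capacity, which is why .length is the wrong number to rely on.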

3.2.2. The instruction string for controlling native deserializing and serializing

Many C functions receive pointers and other C-specific data types that are not present in Java. To invoke the functions in a DLL and pass the parameters correctly, Jawin therefore needs a way of converting the parameters passed to it into the data types required by the DLL functions. To accomplish this, Jawin uses an "instruction string". Details of the meaning of each instruction character are specified in The Instruction String Reference. At this point the Instruction String Reference is not complete, so additional information can be gleaned from the source code: more specifically cpp/jawin/instructions.h (where the letters in the instruction string are explained pretty well, but not completely) and cpp/jawin/Transform.cpp (where the actual instructions are implemented).

The instruction string specifies in detail the format of any [in], [out] and [retval] parameters. It has the form XXX:Y:ZZZ, where XXX holds the directions for [in], Y for [retval] and ZZZ for [out]. If there are no directions for a specific section, the section can be empty, but the colons cannot be left out. For example, an instruction string for a function with no [out] parameters should have the form XXX:Y: (note the trailing ':').
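The three-section layout can be sketched in plain Java. This is only an illustration of the XXX:Y:ZZZ format described above, not how Jawin itself parses the string; the class InstructionSections and its split method are hypothetical names.

```java
class InstructionSections {
    // Splits "XXX:Y:ZZZ" into {in, retval, out}. The -1 limit keeps a
    // trailing empty section, so "IGGI:I:" still yields three parts.
    static String[] split(String instructions) {
        String[] parts = instructions.split(":", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                "expected exactly two colons in: " + instructions);
        }
        return parts;
    }

    public static void main(String[] args) {
        String[] parts = split("IGGI:I:");
        System.out.println("in=" + parts[0]
            + " retval=" + parts[1]
            + " out=" + parts[2]);
    }
}
```

With the string from the MessageBoxW sample, the [in] section is IGGI, the [retval] section is I, and the [out] section is empty. A string without the trailing colon, such as IGGI:I, has only two sections and is rejected by this sketch.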

Overall, the instruction string is processed in the following stages. (When reading cpp/jawin/instructions.h, note that it often refers to "src" and "dest", and that these refer to different things depending on which part of the instruction string is being processed.)

1. Native deserializing: The data passed in the byte-array (wrapped in the NakedByteStream) is converted, as specified by the first section of the instruction string, to the data types required by the DLL function. This section must therefore contain instructions that convert the data in the input byte array from Java to the appropriate data types for the C call, so that the converted parameters can be passed to the C function. At this stage, in instructions.h, "src" refers to the byte array (the content of the NakedByteStream) passed from Java into the FuncPtr.invoke() method, and "dest" refers to what is being passed to the DLL function (which is also a byte array, with the size of StackSize).
   In many cases, the instructions here create the appropriate pointers and place them as parameters to the DLL function. The instructions have to follow the same sequence in which the actual data was written to the byte array sent to the FuncPtr.invoke() method.
2. The native invoke: After the byte-array has been deserialized, the native function is invoked with the relevant native arguments.
3. Serializing the return value from the function: The result of the native invocation is serialized onto a return byte-array. Note that FuncPtr.invoke() returns a byte[] in the general case, because FuncPtr itself cannot determine how to deserialize the return values into the appropriate data types. For example, if the DLL function has a return value of int, and the instruction string specifies the return value as an I, the first 4 bytes of the return byte array contain the return value, which can be retrieved by calling the readInt() method of the LittleEndianInputStream class. As another example, if the function returns a pointer to a piece of data that we want to use in Java, the instructions specify that the content the pointer points to should be read and written to the return byte array. At this stage, "src" refers to the value returned from the function, and "dest" refers to the byte array returned from the FuncPtr.invoke() method.
4. Serializing any [out] values: Finally, any [out] parameters that were passed to the DLL function (as specified by the instruction string) are written to the returned byte-array. Once again, for the return data to be useful to the Java application, it has to be deserialized on the Java side into the appropriate Java data types. At this stage, "src" refers to the byte-array that was passed to the native function (the array with the native arguments that step 1 referred to as "dest", but this time with values populated), and "dest" refers to the same byte array that the return value has just been serialized onto.
   In many cases, the instructions here either skip data passed as [in]-only parameters or retrieve the data that was placed in the [out] parameters of the function call. If there are no output parameters to be read, this section is empty. In most cases, the content of this section closely resembles the first section of the instruction string, with instructions to skip the [in]-only parameters and to read the [out] parameters in sequence. However, that is not necessarily true in all cases, since the output section is independent of the input section and can arrange the content of the byte-array passed back to Java in any way it wants.

3.2.3. The stacksize argument

The stack size is important in the context of invoking the native function, because it is the size of the byte array that will be allocated to hold all parameters passed to the function (I find it convenient to think of it as the size of everything between the parentheses of the native function's parameter list). So, in essence, the stack size is the sum of the sizes of all the arguments passed to the native function.

An important characteristic is that the stack size is a multiple of 4 bytes (the size of an int). It is possible to specify a stack size that is not a multiple of 4, but there is a good chance that something will go wrong. This is because the calling conventions for all standard call types (__cdecl, __stdcall and __fastcall) dictate that all values smaller than 4 bytes (32 bits) are widened to 4 bytes. Therefore the size of short, byte, char and boolean should be specified as 4 bytes, contrary to what one might believe.

To calculate the stack size for the method call, add up the sizes of the arguments of the native method signature. While adding the argument sizes, if any of the arguments is shorter than sizeof(int), add sizeof(int) to the stack size instead of the actual argument size. For example, if the method accepts 2 integers and a pointer to a struct, then the stack size will be 2*4 + 4 (the pointer size is sizeof(int)). If the method accepts a byte array with a length of 100, a struct (and not a pointer to a struct) of size 12, and a boolean, then the stack becomes 100+12+4 (the size of the boolean is less than 4; however, 4 is the minimum increment allocated on the stack).

Important sizes in calculating stack size are:

int: 4
double: 8
any kind of pointer, handle, etc: 4
string (BSTR): 4 (again, it is a pointer to the string)
struct: the sum of the sizes of all elements of the struct
byte buffer: 4 (since it is essentially a pointer to the buffer that is on the stack, not the actual buffer)
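The calculation described in this section can be sketched as a small helper. The class StackSize and its methods are hypothetical names, not part of Jawin, and the sizes assume a 32-bit platform as in the rest of this section. Rounding every argument up to a 4-byte boundary is an assumption that agrees with the widening rule above for small values.

```java
class StackSize {
    // Rounds one argument's size up to the next multiple of 4 bytes,
    // so that byte, short, char and boolean each occupy 4 bytes.
    static int slot(int argumentBytes) {
        return (argumentBytes + 3) / 4 * 4;
    }

    // Sums the 4-byte-aligned sizes of all arguments.
    static int of(int... argumentBytes) {
        int total = 0;
        for (int size : argumentBytes) {
            total += slot(size);
        }
        return total;
    }

    public static void main(String[] args) {
        // MessageBoxW: HWND, two string pointers and a UINT
        System.out.println(of(4, 4, 4, 4));
        // two ints and a pointer to a struct
        System.out.println(of(4, 4, 4));
        // 100-byte array by value, 12-byte struct by value, 1-byte boolean
        System.out.println(of(100, 12, 1));
    }
}
```

The first call gives the value 16 used in the MessageBoxW sample, and the other two reproduce the 2*4 + 4 and 100+12+4 examples from the text.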

3.2.4. Deserialize into Java-variables after calling `invoke`

As indicated in the sections above, the invoke-method returns any [retval] and [out] values in a new byte array. This should then be deserialized into Java types (just the opposite of the serializing in section 3.2.1) by using an org.jawin.io.LittleEndianInputStream, as in this fictitious sample (with one integer [retval] value and one [out] integer):

    ..
    byte[] result = funcPtr.invoke(..);

    // wrap result in a LittleEndianInputStream
    LittleEndianInputStream leis = new LittleEndianInputStream(new ByteArrayInputStream(result));
    // any [retval] values are placed first
    int retVal = leis.readInt();
    // and then follows any [out] values
    int outVal = leis.readInt();
    ..
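The byte layout of such a result can be explored without a native call. The sketch below uses the JDK's ByteBuffer in little-endian mode as a stand-in for LittleEndianInputStream, and the input array is hand-built for illustration rather than returned by a real invoke(). The class name ResultDecoding is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class ResultDecoding {
    // Reads {retval, out} as two little-endian 32-bit integers,
    // in the order described above: [retval] first, then [out].
    static int[] decode(byte[] result) {
        ByteBuffer buffer = ByteBuffer.wrap(result).order(ByteOrder.LITTLE_ENDIAN);
        int retVal = buffer.getInt();
        int outVal = buffer.getInt();
        return new int[] { retVal, outVal };
    }

    public static void main(String[] args) {
        // Simulated result: retval = 1, out = 42, least significant byte first
        byte[] simulated = { 1, 0, 0, 0, 42, 0, 0, 0 };
        int[] values = decode(simulated);
        System.out.println("retVal=" + values[0] + " outVal=" + values[1]);
    }
}
```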

4. Error Handling

DLLs use return codes instead of throwing exceptions. Jawin maps these error codes to Java exceptions (instances of org.jawin.COMException, to be precise), freeing the programmer from working with two different types of error checking. Unfortunately there is no general rule for exactly which return values signal an error, so the caller of a function has to tell Jawin how the return value should be interpreted. The class org.jawin.ReturnFlags defines 4 constants which cover the standard types of return codes:

ReturnFlags.CHECK_NONE should be used when the return code should not be interpreted.
ReturnFlags.CHECK_FALSE should be used for functions returning 0 on error. An exception is thrown if (!ret), so this covers an error return value of 0, false, NULL, etc. (remember that C/C++ has less strict rules than Java for what can be evaluated as a boolean expression). The error message in the thrown exception will be based on GetLastError.
ReturnFlags.CHECK_W32 should be used for functions returning 0 on success. This is usually specified in the documentation for a function, as returns ERROR_SUCCESS on success. The error message in the thrown exception will be based on the return value.
ReturnFlags.CHECK_HRESULT should be used for functions returning a HRESULT, as in the COM-world. An exception will be thrown if FAILED(hr), and the error message will be based on the returned HRESULT.

The documentation for the working example, MessageBox, contains the following information about the return value:

If the function fails, the return value is zero. To get extended error information, call GetLastError.

So in this case the ReturnFlags.CHECK_FALSE should be used as the ReturnFlags value.
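The four interpretations can be summarized as predicates on the raw return value. The enum below is a hypothetical model written for this guide, not the org.jawin.ReturnFlags class, and it only decides whether a value counts as an error; it does not build the exception message.

```java
// Hypothetical model of the four return-value checks described above.
enum ReturnCheck {
    NONE, FALSE, W32, HRESULT;

    // Returns true when the raw return value signals an error.
    boolean isError(int returnValue) {
        switch (this) {
            case FALSE:
                return returnValue == 0;   // 0, false or NULL means failure
            case W32:
                return returnValue != 0;   // anything but ERROR_SUCCESS (0)
            case HRESULT:
                return returnValue < 0;    // FAILED(hr): severity bit is set
            default:
                return false;              // NONE: never interpreted
        }
    }
}

class ReturnCheckDemo {
    public static void main(String[] args) {
        System.out.println(ReturnCheck.FALSE.isError(0));
        System.out.println(ReturnCheck.W32.isError(0));
        System.out.println(ReturnCheck.HRESULT.isError(0x80004005)); // E_FAIL
    }
}
```

For MessageBox, which returns zero on failure, the FALSE-style check is the one that matches the documentation quoted above.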

5. Threading Issues

DLLs do not have any threading issues of their own. So the main rule (unless the documentation for a specific DLL says otherwise) is that a FuncPtr object can freely be created, used and closed on different threads.

6. Additional Resources

Additional resources when working with DLLs from Jawin:

For Win32 basic concepts see [Ric99].
Character encodings: Joel on Software has an article explaining The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!). This is a nice introduction (and/or refresher) to character encodings.