Aymeric on Software

Because we needed another of these blogs...

Don’t Use Standard library/CRT Functions in Static initializers/DllMain!

Today, my colleague and I, we found a bug in OpenSceneGraph. Everybody knows that singletons are evil and may cause cancer, but that does not prevent OpenSceneGraph from using them all over the place. One of them in particular is initialized in the static context, and happens to call the standard library function getenv(). This was the beginning of all our troubles: from time to time, this seemingly benign initialization would cause a deadlock in our software.

Static initialization occurs for instance when you construct an object outside of any function body by writing something like static ObjectBlah myGlobalVar = new ObjectBlah(). The constructors are called before the entry point of your program (i.e. the main() function). In the case of a shared library, the static initialization generally occurs when the shared library is loaded. On Windows, the DllMain callback function is used for that purpose.

The rest of this post will detail the specific behavior on Windows, and explain why it is prone to deadlocks. I think it is generally a bad practice to call standard library functions on other platforms too, although I do not have a detailed proof to back my words (yet!).

If you read the MSDN documentation of DllMain, you will realize that this function is quite dangerous to use. I extracted and highlighted some parts below:

Warning There are serious limits on what you can do in a DLL entry point. To provide more complex initialization, create an initialization routine for the DLL. You can require applications to call the initialization routine before calling any other routines in the DLL.

Because Kernel32.dll is guaranteed to be loaded in the process address space when the entry-point function is called, calling functions in Kernel32.dll does not result in the DLL being used before its initialization code has been executed. Therefore, the entry-point function can call functions in Kernel32.dll that do not load other DLLs. For example, DllMain can create synchronization objects such as critical sections and mutexes, and use TLS. Unfortunately, there is not a comprehensive list of safe functions in Kernel32.dll.

Windows 2000: Do not create a named synchronization object in DllMain because the system will then load an additional DLL.

One of the reasons why it is so critical to not use LoadLibrary() or any kernel32.dll functions susceptible to call LoadLibrary() (like user32.dll) is because the process private critical section is locked while DllMain is running. This “loader lock” is taken any time a library is loaded but also when functions like GetModuleHandle() or GetModuleFileName() are used.

At this point you might think it is safe to call a standard library function as long as it does not appear to require anything more than kernel32.dll functions and is not using GetModuleHandle() or GetModuleFileName(). This relies on implementation details of the standard library, so it’s a bit border line, but still might work, right? No.. wrong! Now all you need to cause a deadlock is another lock. Guess what? There are plenty of locks taken in the standard library functions…

Principle: do not call any standard library functions that acquire locks in DllMain().

It’s quite another thing to know whether a function uses advanced kernel functions (unlikely to change) than to know whether it does acquire internal locks (subject to change every time Microsoft releases a new CRT). Therefore the following corollary can be deduced:

Corollary: do not use any standard library functions at all in DllMain() (because you really have no idea whether it will change and acquire a lock in the future).

If you use CRT functions in your DllMain() the following events might occur in the wrong preemption order and cause a deadlock. Imagine that you have one thread that calls LoadLibrary() on your DLL. It acquires the loader lock, then executes DllMain() which finally executes your static initializer. At this stage if you use a CRT function, you will try to acquire an internal CRT lock… Meanwhile in the rest of you program you might want to access another CRT function that happens to do the following: attempt to acquire the same CRT lock, then attempt to call LoadLibrary (because it uses an advanced kernel function that requires a new DLL to be loaded). LoadLibrary will attempt to acquire the loader lock too. Boom!

In practice, the bug we discovered involved the following race condition:

  • Thread A
    1. LoadLibrary() called. Acquire loader lock.
    2. DllMain() executes getenv(). Acquire _ENV_LOCK.
  • Thread B
    1. stat() is called. Acquire _ENV_LOCK
    2. stat() calls GetTimeZoneInformation(), which requires ntdll.dll to be loaded.
    3. Then calls LoadLibrary() to get ntdll. Acquire loader lock.

If your threads are preempted such as 1, 3, 4, 2, 5 are executed in sequence, then you have a deadlock!

I discovered a posteriori that Richard Chen and Larry Osterman documented the problem. Although in their case they do not explicitly mention the C runtime, it is obvious that their remarks are relevant once you know that the CRT uses internal locks.

References:

Correction:

I previously wrote “the rest of this post will detail the behavior on Windows, but the same general principles are true for other platforms as well.”. This was a gross exaggeration. I’m still convinced that using libc in static initializers is prone to troubles on other platforms as well. Having it all work correctly depends on so many implementation details: kernel (how are shared objects loaded, how system calls works), libc (how does it interact with the kernel), and the linker (does it load stuff in the right order w.r.t. static initializers).