Don’t Use Standard Library/CRT Functions in Static Initializers/DllMain!
The Problem
Today, my colleague and I found a bug in OpenSceneGraph. Everybody knows that singletons are evil and may cause cancer, but that does not prevent OpenSceneGraph from using them all over the place. One of them, in particular, is initialized in the static context and happens to call the standard library function getenv(). This was the beginning of all our troubles: from time to time, this seemingly benign initialization would cause a deadlock in our software.
Static initialization occurs, for instance, when you construct an object outside of any function body by writing something like static ObjectBlah* myGlobalVar = new ObjectBlah(). The constructors are called before the entry point of your program (i.e. the main() function). In the case of a shared library, static initialization generally occurs when the shared library is loaded. On Windows, the DllMain callback function is used for that purpose.
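To make the idea concrete, here is a minimal sketch of a static initializer that calls getenv(). EagerSettings, g_eagerSettings and the EXAMPLE_NOTIFY_LEVEL variable are made-up names for illustration; this is not OpenSceneGraph code. It only shows that the constructor has already run by the time any function in the same file is called, which in practice means before main().

```cpp
#include <cstdlib>
#include <string>

// Hypothetical class standing in for the kind of singleton described above.
struct EagerSettings {
    bool constructed = false;
    std::string notifyLevel;

    EagerSettings() {
        // A standard library call made during static initialization.
        // EXAMPLE_NOTIFY_LEVEL is an assumed variable name.
        const char* value = std::getenv("EXAMPLE_NOTIFY_LEVEL");
        notifyLevel = value ? value : "default";
        constructed = true;
    }
};

// Defined outside any function body: the constructor runs during static
// initialization, not when some function decides to call it.
static EagerSettings g_eagerSettings;

// Reports whether the constructor has already run.
bool eagerSettingsConstructed() {
    return g_eagerSettings.constructed;
}
```

Calling eagerSettingsConstructed() as the very first statement of main() returns true. If this file were built into a DLL, the same constructor would run while the library is being loaded.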
The rest of this post will detail the specific behavior on Windows, and explain why it is prone to deadlocks. I think it is generally bad practice to call standard library functions from static initializers on other platforms too, although I do not have a detailed proof to back my words (yet!).
The Dangers of DllMain
If you read the MSDN documentation of DllMain, you should realize that this function is quite dangerous to use. I extracted and highlighted some parts below:
Warning There are serious limits on what you can do in a DLL entry point. To provide more complex initialization, create an initialization routine for the DLL. You can require applications to call the initialization routine before calling any other routines in the DLL.
Because Kernel32.dll is guaranteed to be loaded in the process address space when the entry-point function is called, calling functions in Kernel32.dll does not result in the DLL being used before its initialization code has been executed. Therefore, the entry-point function can call functions in Kernel32.dll that do not load other DLLs. For example, DllMain can create synchronization objects such as critical sections and mutexes, and use TLS. Unfortunately, there is not a comprehensive list of safe functions in Kernel32.dll.
Windows 2000: Do not create a named synchronization object in DllMain because the system will then load an additional DLL.
One of the reasons why it is so critical not to call LoadLibrary(), or any function that may end up calling LoadLibrary() (for instance, functions from user32.dll), is that a critical section private to the process is held while DllMain is running. This “loader lock” is taken any time a library is loaded, but also when functions like GetModuleHandle() or GetModuleFileName() are used.
At this point you might think it is safe to call a standard library function, as long as it does not appear to require anything more than kernel32.dll functions and does not use GetModuleHandle() or GetModuleFileName(). This relies on implementation details of the standard library, so it’s a bit borderline, but it still might work, right? No… wrong! Now all you need to cause a deadlock is another lock. Guess what? There are plenty of locks taken in the standard library functions…
Principle: do not call any standard library functions that acquire locks in DllMain().
It is one thing to know whether a function uses advanced kernel functions (unlikely to change); it is quite another to know whether it acquires internal locks (subject to change every time Microsoft releases a new CRT). Therefore the following corollary can be deduced:
Corollary: do not use any standard library functions at all in DllMain() (because you really have no idea whether they will change and acquire a lock in the future).
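One way to follow this advice, in the spirit of the initialization routine suggested by the MSDN quote above, is to defer the work until first use instead of doing it at load time. The sketch below assumes a hypothetical LazySettings structure and an EXAMPLE_NOTIFY_LEVEL environment variable; it is only an illustration, not how OpenSceneGraph is written.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical settings object; the names are assumptions for this sketch.
struct LazySettings {
    std::string notifyLevel;
};

// Demo-only counter. A plain int with a constant initializer needs no code
// to run at load time.
int g_lazyLoadCount = 0;

// Reads the environment. This is the part we do not want inside DllMain().
LazySettings loadLazySettings() {
    ++g_lazyLoadCount;
    const char* value = std::getenv("EXAMPLE_NOTIFY_LEVEL");
    LazySettings result;
    result.notifyLevel = value ? value : "default";
    return result;
}

// The function-local static is initialized on the first call, not when the
// library is loaded. C++11 makes this initialization thread-safe.
LazySettings& lazySettings() {
    static LazySettings instance = loadLazySettings();
    return instance;
}
```

Nothing touches the environment until the first call to lazySettings(), so getenv() runs from ordinary code rather than during static initialization. This only helps if no other static initializer calls lazySettings() itself.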
The Deadlock
If you use CRT functions in your DllMain(), the following events might occur in the wrong preemption order and, consequently, cause a deadlock.
Imagine that you have one thread that calls LoadLibrary() on your DLL. It acquires the loader lock, then executes DllMain(), which finally executes your static initializer. At this stage, if you use a CRT function, you will try to acquire an internal CRT lock…
Meanwhile, in the rest of your program, you might call another CRT function that happens to do the following: attempt to acquire the same CRT lock, then attempt to call LoadLibrary (because it uses an advanced kernel function that requires a new DLL to be loaded). LoadLibrary will attempt to acquire the loader lock as well.
Boom! You now have two threads trying to acquire two locks in opposite orders.
In practice, the bug we discovered involved the following race condition:
- Thread A
  1. LoadLibrary() is called. Acquire loader lock.
  2. DllMain() executes getenv(). Acquire _ENV_LOCK.
- Thread B
  3. stat() is called. Acquire _ENV_LOCK.
  4. stat() calls GetTimeZoneInformation(), which requires ntdll.dll to be loaded.
  5. Then calls LoadLibrary() to get ntdll. Acquire loader lock.
If your threads are preempted such that steps 1, 3, 4, 2, 5 are executed in that order, then you have a deadlock!
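The interleaving can be imitated with two ordinary C++ mutexes. This is only a portable simulation under stated assumptions: g_loaderLock and g_envLock are stand-ins invented for the sketch, not the real Windows loader lock or the CRT's _ENV_LOCK, and a timeout is used so that the program reports the stall instead of hanging forever as the real bug did.

```cpp
#include <atomic>
#include <chrono>
#include <mutex>
#include <thread>

// Stand-ins for the two locks in the scenario above (hypothetical names).
std::timed_mutex g_loaderLock;
std::timed_mutex g_envLock;

// Forces the 1, 3, 4, 2, 5 order and returns true when both threads ended up
// waiting for a lock that the other thread was holding.
bool reproducesLockInversion() {
    std::atomic<int> holdingFirstLock{0};
    std::atomic<int> finishedAttempt{0};
    bool threadABlocked = false;
    bool threadBBlocked = false;
    const std::chrono::milliseconds timeout(100);

    // Spin until both threads have reached the same point.
    auto waitForBoth = [](std::atomic<int>& counter) {
        counter.fetch_add(1);
        while (counter.load() < 2) {
            std::this_thread::yield();
        }
    };

    std::thread threadA([&] {
        std::lock_guard<std::timed_mutex> loader(g_loaderLock);  // step 1
        waitForBoth(holdingFirstLock);
        // Step 2: getenv() inside DllMain() wants the environment lock.
        if (g_envLock.try_lock_for(timeout)) {
            g_envLock.unlock();
        } else {
            threadABlocked = true;
        }
        waitForBoth(finishedAttempt);  // keep step 1's lock until B has tried
    });

    std::thread threadB([&] {
        std::lock_guard<std::timed_mutex> env(g_envLock);  // step 3
        waitForBoth(holdingFirstLock);
        // Steps 4 and 5: stat() ends up needing the loader lock.
        if (g_loaderLock.try_lock_for(timeout)) {
            g_loaderLock.unlock();
        } else {
            threadBBlocked = true;
        }
        waitForBoth(finishedAttempt);  // keep step 3's lock until A has tried
    });

    threadA.join();
    threadB.join();
    return threadABlocked && threadBBlocked;
}
```

With real, untimed locks neither thread would ever give up, which is exactly the deadlock described above. In the actual bug the order is left to the scheduler, which is why it only showed up from time to time.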
I discovered afterwards that Raymond Chen and Larry Osterman had documented the problem. Although they do not explicitly mention the C runtime, their remarks are clearly relevant once you know that the CRT uses internal locks.
References
- Another reason not to do anything scary in your DllMain: Inadvertent deadlock, Raymond Chen, The Old New Thing blog
- Best practices for DllMain, Larry Osterman’s blog
Correction
I previously wrote “the rest of this post will detail the behavior on Windows, but the same general principles are true for other platforms as well”. This was a gross exaggeration. I’m still convinced that using libc in static initializers is prone to trouble on other platforms as well. Having it all work correctly depends on many implementation details: the kernel (how shared objects are loaded, how system calls work), libc (how it interacts with the kernel), and the linker (whether it loads things in the right order with respect to static initializers).