Aymeric on Software

Because we needed another of these blogs...

The $2,000 One Line Bug Fix

Today is the last of 4 days I spent (mostly) on a bug. The actual bug was fixed by removing one line of code, which should have never been there in the first place!

What was happening? Sometimes and randomly, our application was crashing on exit. The crash was always some dodgy memory access violation. This happened after calling a SendMessage() to HWND_BROADCAST… I did not quite understand what was wrong with it, but calling that SendMessage was stupid enough in the first place. I removed it, replaced it by something much better, and claimed to have fixed the bug. Not surprisingly, the next day the QA had another similar crash in DestroyWindow(), albeit much harder to reproduce.

This time was spent doing some proper investigation. A few days later, after unrolling tons and tons of code, I found miles away from the code that was crashing this little jewel:

  m_hWnd = 0;
  if (IsWindow())
  // etc...

Suddenly the whole picture gets clearer. The first line resets the window handle. The next one checks if the handle represents a window. Of course not! So no need to destroy the window object associated to the handle, right?! Then, by the time you get out of the destructor, your message map/callback functions have been destroyed and point to garbage memory… All it takes to crash it now, is one message routed to this window. And thanks to Murphy’s law, it did happen!

What SendMessage() to HWND_BROADCAST and DestroyWindow() have in common, is that they actually did send a message to this phantom window. In case you did not know, DestroyWindow() does purge the message loop…

Why the handle was set to null in the first place is beyond me, but it sure was a bitch to find!