The $2,000 one line bug fix

Aug 4, 2009 Windows, Programming

Today, after 4 days of work, I have finally fixed a bug. The fix? Removing a single line of code which should have been there in the first place!

The Bug

Sometimes, and at random, our application was crashing on exit. The crash was always some dodgy memory access violation. This happened after calling a SendMessage() to HWND_BROADCAST… I did not quite understand what was wrong with it, but calling SendMessage was stupid enough in the first place. So, I removed the SendMessage, replaced it by something much better, and promptly claimed to have fixed the bug. Voilà! Not surprisingly, on the very next day, QA found a similar crash in DestroyWindow(), albeit much harder to reproduce.

Time for a proper proper investigation!

The Cause

A few days later, after unrolling tons and tons of code, I found miles away from the code that was crashing this little jewel:

FrameWidget::~FrameWidget()
{
  m_hWnd = 0;
  if (IsWindow())
    DestroyWindow();
  // etc...
}

Suddenly, the whole picture got clearer.

The first line resets the window handle. The next one checks if the handle represents a window. Of course not! So no need to destroy the window object associated to the handle, right?! Then, by the time you get out of the destructor, your message map/callback functions have been destroyed and point to garbage memory… So, all it takes to crash it now, is one message routed to this window. And thanks to Murphy’s law, it indeed does happen, sometimes!

What SendMessage() to HWND_BROADCAST and DestroyWindow() have in common, is that they both actually send a message to this phantom window. In case you did not know, DestroyWindow() does purge the message loop before destroying the window…

Why was the handle set to null in the first place is beyond me, but it sure was a bitch to find!

Reference

DestroyWindow (MSDN)