Kernel mode constructs

I’ve been reading Jeffrey Richter’s book, CLR via C#, lately, and near the end of the book I found this interesting bit on threading.

He starts by telling us about two types of thread synchronization constructs: user-mode constructs that use special CPU instructions, and kernel-mode constructs offered by Windows.

User-mode constructs coordinate threads entirely in hardware, using those special CPU instructions, which makes them fast. When a thread blocks on one, Windows can’t tell. This is good because the thread pool won’t create a new thread (using resources) to replace it. It’s bad, however, if the thread spins for a long time, as it means that a CPU core is kept busy doing no useful work until the thread is unblocked.

Kernel-mode constructs are managed by the OS, so a blocked thread stops running and doesn’t waste CPU time, but they are much slower than user-mode constructs because every call has to transition into the kernel and back.

It was really interesting to see Richter’s comparison of just the overhead of the two kinds of construct, with a single thread taking and releasing the lock so that there is no contention at all.

He shows the time to cycle through 10,000,000 iterations for each test, starting with two very trivial examples to help with context:

1. for (Int32 i = 0; i < iterations; i++) { x++; }

This took 8ms.

2. Then we look into the overhead of a method that does nothing. As our thread constructs call two methods per iteration (one to enter and one to leave), we’ll call our empty method twice as well.

Imagine it looks a bit like: void M() {} and the test is:

for (Int32 i = 0; i < iterations; i++)
{   M(); x++;  M();   }

This took 50ms.
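If you want to try these first two measurements yourself, a minimal sketch using Stopwatch might look like the following. The names here (OverheadTiming, Time, and so on) are made up for illustration, and the NoInlining attribute is my own assumption for keeping the JIT from removing the empty call; the book’s harness may well differ, and your timings will not match Richter’s.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

public static class OverheadTiming
{
    // A field rather than a local, so the increments are less likely to be optimized away.
    public static int X;

    // Empty method; NoInlining (an assumption, not from the book) keeps the call in place.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void M() { }

    // Runs the given loop once and returns the elapsed milliseconds.
    public static long Time(Action<int> loop, int iterations)
    {
        Stopwatch sw = Stopwatch.StartNew();
        loop(iterations);
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    // Test 1: just the increment.
    public static void IncrementOnly(int iterations)
    {
        for (int i = 0; i < iterations; i++) { X++; }
    }

    // Test 2: the increment wrapped in two empty method calls.
    public static void IncrementWithEmptyCalls(int iterations)
    {
        for (int i = 0; i < iterations; i++) { M(); X++; M(); }
    }

    public static void Main()
    {
        const int iterations = 10000000;
        Console.WriteLine("x++ only: {0}ms", Time(IncrementOnly, iterations));
        Console.WriteLine("M(); x++; M(): {0}ms", Time(IncrementWithEmptyCalls, iterations));
    }
}
```

Build it in Release mode and run it a few times before trusting any number, since the first run includes JIT compilation.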

3. Now we use SimpleSpinLock, a user-mode construct that he implements earlier in the chapter. Its test looks like:

SimpleSpinLock ssl = new SimpleSpinLock();
for (Int32 i = 0; i < iterations; i++)
{  ssl.Enter();  x++;  ssl.Leave();  }

This took 219ms, which isn’t that bad, but it is still more than four times as expensive as calling empty methods (note that object creation wasn’t counted in the test, just the looping).
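For context, a user-mode spin lock of this kind can be sketched in a few lines. This is my own minimal version, assuming it is built on Interlocked.Exchange; it is not necessarily the code from the book. SpinLockDemo and its Run method are hypothetical names, added only to show the lock protecting a shared counter across several threads.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch of a user-mode spin lock (not necessarily the book's exact code).
public struct SimpleSpinLock
{
    private int m_resourceInUse; // 0 = free, 1 = held

    public void Enter()
    {
        // Atomically set the flag to 1. If it was already 1, another thread
        // holds the lock, so keep spinning on the CPU until it is released.
        while (Interlocked.Exchange(ref m_resourceInUse, 1) != 0) { }
    }

    public void Leave()
    {
        // Mark the lock as free so that one spinning thread can take it.
        Volatile.Write(ref m_resourceInUse, 0);
    }
}

public static class SpinLockDemo
{
    // The lock is a mutable struct, so it lives in a field and is never copied.
    private static SimpleSpinLock s_lock;
    private static int s_x;

    // Increments a shared counter from several threads under the lock.
    public static int Run(int threads, int iterationsPerThread)
    {
        s_x = 0;
        Parallel.For(0, threads, t =>
        {
            for (int i = 0; i < iterationsPerThread; i++)
            {
                s_lock.Enter();
                s_x++;
                s_lock.Leave();
            }
        });
        return s_x;
    }

    public static void Main()
    {
        Console.WriteLine(Run(4, 100000));
    }
}
```

Nothing in Enter ever calls into Windows, which is why the lock is cheap when it is free and why a waiting thread burns a core when it is not.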

4. Finally, he constructs a SimpleWaitLock (a simple kernel-mode lock built on an AutoResetEvent, whose WaitOne and Set methods call into the kernel) and tests it with:

using (SimpleWaitLock swl = new SimpleWaitLock()) {
    for (Int32 i = 0; i < iterations; i++)
    {  swl.Enter();  x++;  swl.Leave();  }
}

The kernel-mode threading construct took 17,615ms, or about 80 times longer than the spin lock!
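A kernel-mode lock like this can be sketched as a thin wrapper around an AutoResetEvent. Again, this is a minimal version under my own assumptions rather than the book’s exact code; WaitLockDemo and Run are hypothetical names that mirror the single-threaded test above.

```csharp
using System;
using System.Threading;

// Hypothetical sketch of a kernel-mode lock (not necessarily the book's exact code).
public sealed class SimpleWaitLock : IDisposable
{
    // Created in the signaled state, meaning the lock starts out free.
    private readonly AutoResetEvent m_available = new AutoResetEvent(true);

    // Transitions into the kernel; blocks the thread if the lock is held.
    public void Enter() { m_available.WaitOne(); }

    // Transitions into the kernel; lets one waiting thread (if any) proceed.
    public void Leave() { m_available.Set(); }

    // Releases the underlying kernel object.
    public void Dispose() { m_available.Dispose(); }
}

public static class WaitLockDemo
{
    // Same shape as the test in the post: enter, increment, leave.
    public static int Run(int iterations)
    {
        int x = 0;
        using (SimpleWaitLock swl = new SimpleWaitLock())
        {
            for (int i = 0; i < iterations; i++)
            {
                swl.Enter();
                x++;
                swl.Leave();
            }
        }
        return x;
    }

    public static void Main()
    {
        Console.WriteLine(Run(100000));
    }
}
```

Even with no other thread competing, every Enter and Leave pays for a round trip into the kernel, and that is where the extra time goes.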

The moral of the story? Don’t choose kernel-mode constructs (semaphores, mutexes, etc.) over user-mode constructs without a really good reason. Thankfully, in the next chapter, he presents hybrid constructs that get the advantages of both: they stay in user mode while there is no contention and only fall back to the kernel when a thread actually has to wait.
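To give a rough idea of what a hybrid construct can look like, here is a minimal sketch that pairs an interlocked counter with an AutoResetEvent. The design and all the names (SimpleHybridLock, HybridLockDemo, Run) are my assumptions for illustration; the versions in the book may differ and do more, such as spinning briefly before waiting.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch of a hybrid lock: user mode when free, kernel mode when contended.
public sealed class SimpleHybridLock : IDisposable
{
    private int m_waiters; // number of threads that hold or want the lock
    private readonly AutoResetEvent m_waiterLock = new AutoResetEvent(false);

    public void Enter()
    {
        // Uncontended: the count goes from 0 to 1 and we never touch the kernel.
        if (Interlocked.Increment(ref m_waiters) == 1) return;

        // Contended: block in the kernel until the owner signals us.
        m_waiterLock.WaitOne();
    }

    public void Leave()
    {
        // Nobody is waiting: return without touching the kernel.
        if (Interlocked.Decrement(ref m_waiters) == 0) return;

        // Somebody is waiting: wake exactly one waiter.
        m_waiterLock.Set();
    }

    public void Dispose() { m_waiterLock.Dispose(); }
}

public static class HybridLockDemo
{
    // Increments a shared counter from several threads under the hybrid lock.
    public static int Run(int threads, int iterationsPerThread)
    {
        int x = 0;
        using (SimpleHybridLock shl = new SimpleHybridLock())
        {
            Parallel.For(0, threads, t =>
            {
                for (int i = 0; i < iterationsPerThread; i++)
                {
                    shl.Enter();
                    x++;
                    shl.Leave();
                }
            });
        }
        return x;
    }

    public static void Main()
    {
        Console.WriteLine(Run(4, 100000));
    }
}
```

In the single-threaded test from this post, a lock like this would only ever take the fast path, which is the whole point.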

If you’re interested in going a bit deeper into C#, I’d give the book a try.