The Science of Threading
Part 2: Keeping out of trouble

In Part 1 of this series (.NETDJ, Vol. 1, issue 12) we explored some basic background in regard to .NET threading. We explored concepts related to threads in general, their appropriate usage, and how to work with the thread pool in .NET as an alternative to managing our own threads. This month we will build on what we know and dive deeper into .NET threading. We have a lot to cover, so let's get started.

Threads: Rolling Your Own
The easiest way for you to learn about threads is to just jump in and get your hands dirty. Before we go any further, let's look at some code. I've created a small program that puts a little spin on the traditional "Hello World" program by using a thread and updating the output to be a little more hip. Take a moment to review the code in Listing 1.

The code in the "Chello World" program is rather straightforward, yet it should provide you with enough basics in rolling your own threads to make you - if you read no further in this article - a rather dangerous .NET programmer. So let's take some time to dissect the program, get a little more knowledge under your belt, and cover a couple more topics before you run off and thread yourself silly.

The first thing to note is that, just like any other object in .NET, a thread must be instantiated. The code for this is rather simple:

Thread myThread = new Thread(new ThreadStart(ThreadDelegate));

You create an instance of a thread and pass its constructor a ThreadStart delegate that points at the method the thread will run. We briefly covered delegates in the first part of this article so I won't cover them again here, but I promise to do more with delegates in future articles. The key to remember here is that although you have created an instance of a thread, you have not done anything except allocate some memory and do some housekeeping under the covers to set up an environment in which the thread can run. The thread is not running and will not run until you start it.

From a design perspective, you could allocate a bunch of threads during your application's startup and keep them around without them getting in the way until you decide you need them. Hmmm, sounds a lot like the .NET thread pool we discussed last month - except with a lot less functionality. Sorry, it's just that I'm kind of down on rolling your own threads, so I'll keep pushing you to use the thread pool. But let's say you're just not into the thread pool; what next? Start the thread? No! Do not start the thread!

Why shouldn't you start the thread? Because I'll bet that at some point you or someone else will need to debug your mighty threading application. Therefore you should get in the habit of making that forthcoming experience as simple as possible by giving your thread a name. It is true that you aren't required to give your thread a name, but if you take out the line in the "Chello World" program that refers to the thread name and run the program, you will get the following message when your code exits:

The thread '<No Name>' (0x878) has exited with code 0 (0x0).

Now I don't know about you, but if I have to debug a multithreaded program, one of the last things I want to deal with is a bunch of "<No Name>" threads and having to refer to the hex identifier of each thread to track which thread is doing who knows what. It is much easier to know which thread is doing what by writing the thread's name to a file, event log, or the screen. To make your life easier, you should get in the habit of providing the thread with a name.

We have created an instance of a thread, provided it with a name and, just as a matter of good practice, explicitly stated a priority for the thread to run at when first started. The thread priority defaults to normal if you don't provide a priority. The reason I say this is a good practice is that it helps make your code more readable by explicitly stating the priority of the thread. It is also important to note that not everything can be changed after the fact: in the .NET Framework a thread's name can be set only once, so settle on it before you start the thread. The priority, by contrast, can still be adjusted while the thread is running.
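To make the sequence concrete, here is a minimal sketch of the create, name, prioritize, and start steps described above. It is not Listing 1; the class name ChelloSketch, the thread name "ChelloWorker", and the body of ThreadDelegate are assumptions made up for illustration.

```csharp
using System;
using System.Threading;

public static class ChelloSketch
{
    // Hypothetical worker method; it only reports which thread it runs on.
    private static void ThreadDelegate()
    {
        Console.WriteLine("Running on thread: " + Thread.CurrentThread.Name);
    }

    // Creates the thread but does not start it.
    public static Thread CreateWorker()
    {
        Thread myThread = new Thread(new ThreadStart(ThreadDelegate));
        myThread.Name = "ChelloWorker";             // assumed name; set it before starting
        myThread.Priority = ThreadPriority.Normal;  // the default, stated explicitly
        return myThread;
    }

    public static void Main()
    {
        Thread worker = CreateWorker();
        worker.Start();  // nothing runs until this call
        worker.Join();   // wait for the worker to finish
    }
}
```

Keeping creation in its own method is just a convenience here; it makes it easy to inspect the thread's name and state before anything is started.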

Threads: Under the Covers
Before we go further you should take a look at the IL that is produced by a basic threading program (see Listing 1). In case you don't want to go to the trouble of running the IL disassembler (ILDASM.exe), I have provided you with the IL code in Listing 2. One thing to notice is that the IL is rather similar to the actual program. This is a good thing, as it shows that the code is efficient and that nothing wacky is happening under the covers. That said, the one thing to note is that by using threads you are incurring some overhead. For instance, there are two instantiations produced under the covers by the line in our program that creates an instance of a thread. So this C# code snippet:

Thread myThread = new Thread(new ThreadStart(ThreadDelegate));

translates to:

IL_0007: newobj instance void [mscorlib]System.Threading.ThreadStart::.ctor(object, native int)
IL_000c: newobj instance void [mscorlib]System.Threading.Thread::.ctor(class [mscorlib]System.Threading.ThreadStart)

Doesn't seem like a lot of overhead, does it? No, but keep in mind that nothing is free - and that spinning up threads has its costs.

Threads: Managing What You've Got
Now that we have our thread running, the next thing to understand is what we can do with it. We want to learn how to manage our threads and to understand the ramifications of that management. Aside from starting a thread, there are four other things you can do with a thread (suspend, resume, interrupt, and abort), all of which only make sense once the thread has been started. Let's take a look at each of the management options.

To start, we can "suspend" a running thread by calling myThread.Suspend(). Calling "suspend" requests that the thread be put into a "wait" state, which stops execution. But there are some quirky things that can occur when you call "suspend" on a thread. First, you have to allow for the fact that the CLR can only manage "managed" code and that unmanaged code may continue to execute once you suspend the thread. If you are interacting with unmanaged resources under your thread's control, it is really important that you design test cases that verify the behavior of your thread when suspended.

Once a thread is suspended, you can resume the thread with myThread.Resume(). The choice of the word "resume" by the .NET development team was an important one. They could have used "restart", but it would have been a misnomer. When you call "resume" you are doing just that, picking up pretty much where you left off. Now consider what that means: you've suspended the thread and all resources owned by that thread, including any locks, are still being held. What does that do to the GC (Garbage Collector), what does that do to system responsiveness, what types of blocking could be experienced? The point is that all of these are issues you need to consider when working with multithreaded programs. Even taking these situations into consideration, it is not a good practice to call "suspend" if you don't intend to very quickly call "resume". (Both methods were marked obsolete starting with version 2.0 of the .NET Framework, largely because of these hazards.)

Two other options you have in managing your thread are "interrupting" and "aborting" the thread. To interrupt the thread, you simply call the myThread.Interrupt() method. If the thread is blocked in a wait, sleep, or join (or the next time it blocks), it throws a ThreadInterruptedException, which you can catch with a try/catch statement and then take action. If you call myThread.Abort(), it causes some interesting behavior. To keep this simple, the thing to know about "abort" is that it will usually cause the thread to terminate by throwing a ThreadAbortException. When the exception is thrown it is processed by the "catch" block and then thrown again at the end of the "catch" block, unless you call Thread.ResetAbort(), which allows you to ignore the abort. One thing to be careful of is calling "abort" on a thread that has been suspended: the abort cannot take effect until the thread is resumed, so it is an easy way to leave your application hanging. One way to determine the state of your thread is by reading the myThread.ThreadState property, which can be helpful in writing well-behaved threading programs, as well as in debugging those well-written programs.
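As a rough illustration of the interrupt path and the ThreadState property, consider the sketch below. The class name InterruptSketch, the SleepyWorker method, and the wasInterrupted flag are all made up for this example. "Abort" is deliberately left out, since Thread.Abort is not supported on the newer .NET runtimes and throws PlatformNotSupportedException there.

```csharp
using System;
using System.Threading;

public static class InterruptSketch
{
    // Set by the worker when it catches the interrupt (hypothetical flag).
    private static volatile bool wasInterrupted;

    // Hypothetical worker: blocks forever unless somebody interrupts it.
    private static void SleepyWorker()
    {
        try
        {
            Thread.Sleep(Timeout.Infinite);
        }
        catch (ThreadInterruptedException)
        {
            // Take whatever action makes sense; here we just record the fact.
            wasInterrupted = true;
        }
    }

    // Starts the worker, interrupts it, and reports whether it noticed.
    public static bool RunAndInterrupt()
    {
        wasInterrupted = false;
        Thread worker = new Thread(new ThreadStart(SleepyWorker));
        worker.Name = "SleepyWorker";
        worker.Start();
        worker.Interrupt();  // delivered when the worker is, or next becomes, blocked
        worker.Join();
        Console.WriteLine(worker.Name + " state: " + worker.ThreadState);
        return wasInterrupted;
    }

    public static void Main()
    {
        Console.WriteLine("Interrupted: " + RunAndInterrupt());
    }
}
```

Note that the interrupt is only delivered while the worker is blocked; a thread that never waits, sleeps, or joins will never see the exception.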

Threading: Safe Collections
If you decide to enter the brave world of threading, you should realize that some of the constructs you have used in the past are not thread-safe. For instance, things like collection classes and some operations on variables will not work as expected if you use the plain, unsynchronized versions. Let's take a look at what I mean so you don't get yourself into trouble.

Suppose you wanted to use some of the services provided by the System.Collections library. The important thing to keep in mind is that some of the classes in the library are not thread-safe, as they don't allow for thread synchronization. But one of the things that makes .NET a great environment for developers is that the Microsoft .NET team took the time to think through how to keep us out of trouble. If you were to use the following code in a multithreaded environment you would eventually get burned as multiple threads try to access the queue:

Queue myQueue = new Queue(...);

But if you were to simply wrap the queue in the synchronized version of the class you would be able to party safely. The synchronized version looks like this:

Queue myQueue = Queue.Synchronized(new Queue(...));

So when you are using any of the services of the System.Collections namespace, the trick is to use the synchronized version of the class. Another neat trick - because it can save you time by avoiding errors - is to check if the collection is synchronized. This is accomplished by testing the "IsSynchronized" property of the collection. Consider a multithreaded program in which you are passed a collection of something by another program. The first thing you should do is check to see if you have been passed a thread-safe collection by checking to see if it is synchronized. If it is, then continue processing, but if it isn't, raise an exception to let the consumer of your application know they have passed you a non-thread-safe collection. Aside from being a good defensive programming practice, it is just simply too cool to raise an exception to the effect of, "I don't play with non-thread-safe collections."
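The check described above might look something like the following sketch. The class name SafeQueueSketch and the RequireSynchronized helper are hypothetical, and the choice of ArgumentException is mine; use whatever exception suits your application.

```csharp
using System;
using System.Collections;

public static class SafeQueueSketch
{
    // Hypothetical guard: refuses any collection that is not synchronized.
    public static void RequireSynchronized(ICollection items)
    {
        if (items == null)
        {
            throw new ArgumentNullException("items");
        }
        if (!items.IsSynchronized)
        {
            throw new ArgumentException(
                "I don't play with non-thread-safe collections.", "items");
        }
    }

    public static void Main()
    {
        Queue plain = new Queue();               // not safe to share between threads
        Queue safe = Queue.Synchronized(plain);  // thread-safe wrapper around plain
        safe.Enqueue("hello");

        RequireSynchronized(safe);               // passes quietly

        try
        {
            RequireSynchronized(plain);          // rejected by the guard
        }
        catch (ArgumentException ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
```

Keep in mind that the wrapper and the original queue share the same storage, so every thread needs to go through the wrapper for the protection to mean anything.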

Threading: Atomic Variables
If you're familiar with database programming, you're familiar with the concept known as passing the ACID test. A more "geeky" way to say it is that you want to perform an atomic operation, whereby either all the steps of the operation are completed or none of them are, without fear of corruption. What does that all mean? It means that if you are going to use threads, you need to have a way to assure that when you interact with a variable, its value doesn't change before a thread is done acting upon it.

In working with integers you can use the Interlocked class to safely increment, decrement, compare, and exchange values. The Interlocked class provides a set of methods that assure you that you can interact with variables in a thread-safe manner. It is important that you become familiar with these methods and use them as part of your multithreaded application.
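Here is a small sketch of the Interlocked class at work. The class name CounterSketch, the shared counter field, and the figure of 1,000 increments per thread are assumptions chosen for illustration.

```csharp
using System;
using System.Threading;

public static class CounterSketch
{
    // Shared by every worker thread (hypothetical field).
    private static int counter;

    // Each worker bumps the shared counter 1,000 times.
    private static void AddThousand()
    {
        for (int i = 0; i < 1000; i++)
        {
            // Atomic read-modify-write; a plain counter++ could lose updates.
            Interlocked.Increment(ref counter);
        }
    }

    // Runs the given number of workers and returns the final count.
    public static int RunCounters(int threadCount)
    {
        counter = 0;
        Thread[] workers = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            workers[i] = new Thread(new ThreadStart(AddThousand));
            workers[i].Name = "Counter" + i;
            workers[i].Start();
        }
        foreach (Thread worker in workers)
        {
            worker.Join();
        }
        return counter;
    }

    public static void Main()
    {
        int total = RunCounters(4);
        Console.WriteLine("Total: " + total);

        // Reset to zero only if the value is still what we just read;
        // CompareExchange returns the value it found.
        int found = Interlocked.CompareExchange(ref counter, 0, total);
        Console.WriteLine("Value before reset: " + found);
    }
}
```

With four workers the total always comes out to 4,000, which is exactly the guarantee you give up when you fall back to an ordinary increment.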

Threading: Housekeeping
I am often called in to help development teams recover runaway projects or optimize systems that are performing badly. It isn't bad work if you can get it, but when it comes to threading it can be rather frustrating. So I have provided some tips in Table 1 that I hope you will find useful. As simple as they may seem, you would be surprised at how often I come across applications in which the developers ignored these simple housekeeping steps and ended up with a huge pile of multithreaded spaghetti.

 

Conclusion
Our exploration of threading is by no means complete. If anything, the combination of this and my preceding article on threading has been a simple and high-level introduction. My goal was not to turn you into a threading guru, but rather to give you enough information to help you understand that threading requires strong attention to detail, a well thought-out plan, and a strong grasp of the basics. If you decide that you want to continue exploring threading, I suggest you learn as much as possible about thread synchronization and thread design patterns. I would also implore you to consider your error-handling strategy, as it will be important to your ability to discover bugs as they arise - and, my friend, they will arise.

About John Gomez
John Gomez, open source editor for .NET Developer's Journal, has over 25 years of software development and architectural experience, and is considered a leader in the design of highly distributed transaction systems. His interests include chaos- and fuzzy-based systems, self-healing and self-reliant systems, and offensive security technologies, as well as artificial intelligence. John started developing software at age 9 and is currently the CTO of Eclipsys Corporation, a worldwide leader in hospital and physician information systems.