Sunday, 5 May 2013

.NET Thread Pool

Contents

1. What’s a thread pool and why do we need it?
2. Is it per AppDomain or process?
3. What are the characteristics of a thread in the thread pool?
4. How is thread pool used by the .NET framework?
5. How can we use the thread pool?
6. QueueUserWorkItem
7. Threads are background or foreground?
8. Task Parallel Library
9. When to avoid using thread pool?
10. How many threads are available in the pool?
11. Exceptions
12. Resources


What’s a thread pool and why do we need it?
A thread pool is a store of worker threads that the runtime manages for the application. The pool is available from application startup; as discussed later, its threads are created on demand and not all up front. The thread pool lets developers focus on tasks or work items instead of actively managing threads: there is no need to instantiate a new thread, start it and look after it. The thread pool is also used a lot internally by the framework. For example, asynchronous delegate calls (BeginInvoke) run on a thread from the pool without our knowing or worrying about it at all.
Creating new threads is also expensive, so why not reuse them? When a thread completes a work item taken from the queue, it is not immediately destroyed. It goes back to the pool and waits to pick up the next work item, so the same thread is reused. That simplifies a lot of things.

The thread pool also attempts to create an abstraction layer. Many things, such as timers and web services, wouldn’t work without threading. Someone who wants to write a basic program that uses a timer to do something every few seconds shouldn’t have to manage a thread, and having to know the ins and outs of threading just to use something as simple as a timer would make life difficult for the developer. Why only an “attempt” at an abstraction? Because heavy use of features that rely on the thread pool behind the scenes causes problems, and those problems force a developer to learn the intricacies of the thread pool. That is the reason for this post.

Is it per AppDomain or process?
It is per process. Every .NET process has one thread pool. Since a process can host more than one AppDomain, a pool thread can move between AppDomains: a thread working in the main AppDomain could be working in a different AppDomain the next millisecond.

What are the characteristics of a thread in the thread pool?
- Background threads: Pool threads are background threads. This is explained in detail a few sections below.

- Priority: They start with “normal” priority.

- Naming: We cannot name these threads in advance, which is a pity because names are useful when debugging. However, a work item can name the thread it is running on once it starts.

- Allocation: Threads may be created on demand or at intervals. This depends on the allocation algorithm, which has changed over time. Until a threshold (set with SetMinThreads) is reached, new threads are created on demand. Beyond that threshold, versions before .NET 4.0 added about 2 threads every second; from .NET 4.0 onwards, threads are added based on factors such as throughput.

- De-allocation: Yes, threads in the pool can be destroyed. I was under the impression that once a thread is created in the pool it is never destroyed; after all, reusability is one of the ideas behind having a pool. But it makes sense to destroy threads that are no longer needed. Say a burst of activity creates 100 threads: what should we do with them after the work is done? Each thread reserves 1 MB of stack by default, so about 100 MB would go to waste.

Let’s check it in code:
ThreadPool.QueueUserWorkItem(delegate
    {
        Console.WriteLine ("ManagedThreadId = " + Thread.CurrentThread.ManagedThreadId);
        Console.WriteLine ("IsThreadPoolThread = " + Thread.CurrentThread.IsThreadPoolThread);
        Console.WriteLine ("IsBackground = " + Thread.CurrentThread.IsBackground);
        Console.WriteLine ("Priority = " + Thread.CurrentThread.Priority);
        Console.WriteLine ("Name = " + Thread.CurrentThread.Name);
    });

ManagedThreadId = 4973
IsThreadPoolThread = True
IsBackground = True
Priority = Normal
Name = {null} 

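Although we cannot set these properties before the work item starts, we can set them from inside the work item once it is running. Keep in mind that pooled threads are reused, so a change made here may still be in effect when the same thread picks up its next work item. For example: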

ThreadPool.QueueUserWorkItem(delegate
    {
        Thread.CurrentThread.IsBackground = false;
        Thread.CurrentThread.Priority = ThreadPriority.Highest;
        Thread.CurrentThread.Name = "iamapooledthread";
        //do some work here
    });
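
Because pooled threads are reused, a work item that changes these properties may want to put them back before it finishes. Here is a minimal sketch of that idea. It is my own illustration and not something the framework provides: the names PooledThreadSettings and RunWithPriority are hypothetical, and it only deals with Priority.

```csharp
using System;
using System.Threading;

public static class PooledThreadSettings
{
    // Hypothetical helper: runs 'work' on a pool thread with a temporary
    // priority, restores the original priority, and returns the priority
    // the thread had when it was handed back to the pool.
    public static ThreadPriority RunWithPriority(ThreadPriority temporary, Action work)
    {
        ThreadPriority restored = ThreadPriority.Normal;
        using (var done = new ManualResetEvent(false))
        {
            ThreadPool.QueueUserWorkItem(delegate
            {
                Thread current = Thread.CurrentThread;
                ThreadPriority original = current.Priority;
                try
                {
                    current.Priority = temporary;
                    work();
                }
                finally
                {
                    current.Priority = original;   // undo the change before the thread is reused
                    restored = current.Priority;
                    done.Set();                    // let the caller continue
                }
            });
            done.WaitOne();
        }
        return restored;
    }

    public static void Main()
    {
        ThreadPriority after = RunWithPriority(
            ThreadPriority.AboveNormal,
            () => Console.WriteLine("Priority inside work item = " + Thread.CurrentThread.Priority));
        Console.WriteLine("Priority after work item = " + after);
    }
}
```

The try/finally is the important part: the priority is restored even if the work throws. An exception that escapes the work item would still be unhandled, as discussed in the Exceptions section.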

How is thread pool used by the .NET framework?
Threads from the pool are used a lot by the framework. Here are a few examples:
- when we make an asynchronous call to a method using a delegate’s BeginInvoke()
- when we use a Timer in our application, a thread from the pool calls the callback method at each interval
- when we use the Task Parallel Library (System.Threading.Tasks)
- when we use Remoting, ASP.NET or WCF: a thread from the pool calls the WCF service method, and a pool thread also runs the ASP.NET page
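
To see one of these for ourselves, the sketch below starts a System.Threading.Timer and checks which kind of thread runs the callback. The class and method names (TimerOnPool, CallbackRunsOnPoolThread) are made up for this illustration, and the 100 ms delay is an arbitrary choice.

```csharp
using System;
using System.Threading;

public static class TimerOnPool
{
    // Starts a one-shot System.Threading.Timer and reports whether
    // its callback ran on a thread-pool thread.
    public static bool CallbackRunsOnPoolThread()
    {
        bool onPool = false;
        using (var fired = new ManualResetEvent(false))
        using (var timer = new Timer(delegate
        {
            onPool = Thread.CurrentThread.IsThreadPoolThread;
            fired.Set();
        }, null, 100, Timeout.Infinite))   // fire once after 100 ms, never repeat
        {
            fired.WaitOne();               // wait until the callback has run
        }
        return onPool;
    }

    public static void Main()
    {
        Console.WriteLine("Timer callback on pool thread? " + CallbackRunsOnPoolThread());
    }
}
```

Note that this applies to System.Threading.Timer. UI timers such as the Windows Forms one raise their events on the UI thread instead.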

How can we use the thread pool?
To run work on a thread from the pool, we use the QueueUserWorkItem method of the ThreadPool class. This method accepts a delegate of type WaitCallback, which can point to any method that returns void and takes a single argument of type object. (I am using LINQPad for these samples, which is why they call Dump(); it behaves much like Console.WriteLine().)


void Main()
{
 "In Main: Start".Dump();
 ThreadPool.QueueUserWorkItem(new WaitCallback(Work));
 "In Main: Done".Dump();
}

void Work(object input)
{
 "In Work: Start".Dump();
 Thread.Sleep(1000);
 "In Work: Done".Dump(); 
}

Output:


In Main: Start
In Main: Done
In Work: Start
In Work: Done


QueueUserWorkItem
A few details about the method.

- What does QueueUserWorkItem return? - A boolean value indicating whether the work item was successfully queued.

- How can I pass data to the thread method? - There is an overload of QueueUserWorkItem which accepts a second parameter of type object. This is why our “Work” method above has a parameter called “input”. In the earlier sample we did not use this overload, so the input value was null. Here is an example using the overload:

void Main()
{
 "In Main: Start".Dump();
 ThreadPool.QueueUserWorkItem(new WaitCallback(Work), "FromPool");
 "In Main: Done".Dump();
}

void Work(object input)
{
 "In Run: Start".Dump();
 Thread.Sleep(500);
 input.Dump();
 "In Run: Done".Dump(); 
}

Output:


In Main: Start
In Main: Done
In Run: Start
FromPool
In Run: Done
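
Since the state parameter is of type object, the worker has to cast it back to the real type. A common alternative is to let a lambda capture the values it needs. The sketch below shows both styles side by side. The names PassingData, SquareViaState and SquareViaClosure are hypothetical, and the ManualResetEvent is only there so that the caller can read the result.

```csharp
using System;
using System.Threading;

public static class PassingData
{
    // Style 1: pass the value through the 'object state' overload and cast it back.
    public static int SquareViaState(int value)
    {
        int result = 0;
        using (var done = new ManualResetEvent(false))
        {
            ThreadPool.QueueUserWorkItem(state =>
            {
                int n = (int)state;        // unbox the value passed as state
                result = n * n;
                done.Set();
            }, value);
            done.WaitOne();
        }
        return result;
    }

    // Style 2: ignore the state argument and capture 'value' in the lambda.
    public static int SquareViaClosure(int value)
    {
        int result = 0;
        using (var done = new ManualResetEvent(false))
        {
            ThreadPool.QueueUserWorkItem(_ =>
            {
                result = value * value;    // 'value' is captured, no cast needed
                done.Set();
            });
            done.WaitOne();
        }
        return result;
    }

    public static void Main()
    {
        Console.WriteLine("Via state   = " + SquareViaState(7));
        Console.WriteLine("Via closure = " + SquareViaClosure(7));
    }
}
```

The closure version keeps the types intact, at the cost of an extra allocation for the captured variables.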

Threads are background or foreground?

In .NET, threads can be foreground or background. The essential difference is that an active foreground thread keeps the application running until the thread returns. If an application is shutting down and only background threads are still running, the application shuts down anyway and those threads are taken down with it.

Threads created manually in code, for example Thread wk = new Thread(Work);, are foreground threads. We can change that, however:

void Main()
{
 "In Main: Start".Dump();
 Thread wk = new Thread(Work);
 string s = "Thread is background?: " + wk.IsBackground;
 s.Dump();
 wk.IsBackground = true;
 wk.Start();
 "In Main: Done".Dump();

}

void Work(object input)
{
 "In Run: Start".Dump();
 Thread.Sleep(500);
 "In Run: Done".Dump(); 
}

Output:


In Main: Start
Thread is background?: False
In Main: Done
In Run: Start
In Run: Done

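In the code above, once the thread has been changed to a background thread, Main does not wait for it: “In Main: Done” is printed before the worker has started. The worker’s lines still appear in this output, presumably because LINQPad keeps its host process alive after Main returns; in a standalone console application the process could exit before the background thread finishes.

Threads in the thread pool are background threads, and we cannot change them to foreground in the method which queues the work. For this reason, threads from the pool are also taken down when the process terminates. If the work has to finish, we need some other mechanism to wait for it.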

A simple way could be to use a WaitHandle.


AutoResetEvent wh = new AutoResetEvent(false);

void Main()
{
 "In Main: Start".Dump();
 ThreadPool.QueueUserWorkItem(new WaitCallback(Work), "FromPool"); 
 wh.WaitOne();
 "In Main: Done".Dump();
}

void Work(object input)
{
 "In Run: Start".Dump();
 Thread.Sleep(500);
 "In Run: Done".Dump();
 wh.Set();
}

Output:


In Main: Start
In Run: Start
In Run: Done
In Main: Done
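
An AutoResetEvent works well for a single work item. When several items are queued, one option (available from .NET 4.0) is a CountdownEvent, which is released only after it has been signalled a given number of times. This is a minimal sketch under that assumption; WaitForMany and RunAll are names I made up, and the Sleep stands in for real work.

```csharp
using System;
using System.Threading;

public static class WaitForMany
{
    // Queues 'count' work items and blocks until every one of them has finished.
    // Returns the number of work items that actually ran.
    public static int RunAll(int count)
    {
        int completed = 0;
        using (var pending = new CountdownEvent(count))
        {
            for (int i = 0; i < count; i++)
            {
                ThreadPool.QueueUserWorkItem(delegate
                {
                    Thread.Sleep(100);                      // stands in for real work
                    Interlocked.Increment(ref completed);   // thread-safe counter update
                    pending.Signal();                       // one fewer item outstanding
                });
            }
            pending.Wait();                                 // returns when the count reaches zero
        }
        return completed;
    }

    public static void Main()
    {
        Console.WriteLine("Completed work items = " + RunAll(5));
    }
}
```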

We cannot change pool threads to foreground in the method where QueueUserWorkItem is called, but they can be changed from inside the method the thread runs. First, consider the code below: the application exits without waiting for the pooled thread to complete, because the thread is a background thread.

        static void Main(string[] args)
        {
            ThreadPool.QueueUserWorkItem(delegate
            {
                Console.WriteLine("In Pooled thread: Start");
                for (int i = 0; i < 1000; i++)
                {
                    Console.Write("A");
                }
                Console.WriteLine("\nIn Pooled thread: Done");
            });

            Thread.Sleep(5);
        }

Output:

(Screenshot of the console output; image not available.)

Let’s add a new line to change it to foreground.

        static void Main(string[] args)
        {
            ThreadPool.QueueUserWorkItem(delegate
            {
                Thread.CurrentThread.IsBackground = false;// NEW LINE
                Console.WriteLine("In Pooled thread: Start");
                for (int i = 0; i < 1000; i++)
                {
                    Console.Write("A");
                }
                Console.WriteLine("\nIn Pooled thread: Done");
            });

            Thread.Sleep(5);
        }

Output:

(Screenshot of the console output; image not available.)

NOTE: The line “Thread.Sleep(5)” in the Main method is needed here. Without it, the application may exit before the pooled thread gets a chance to run and set itself to foreground. Even with it, 5 ms is a timing assumption and not a guarantee.

Task Parallel Library

.NET 4.0 introduced the Task Parallel Library. Using tasks is a better way of queuing work to the thread pool than calling QueueUserWorkItem directly.

As we can see from the samples above, we had to do something extra to wait for the thread to complete its work, and there was nothing we could do to cancel it. The task library helps with both requirements: it has built-in support for waiting and cancellation. The code is also easier to write.

void Main()
{
    Console.WriteLine ("In Main: Starting");
    Task task = new Task(delegate 
    { 
        Console.WriteLine ("Created using task library.");
        Thread.Sleep(1000);
    });
    task.Start();
    task.Wait();
    Console.WriteLine ("In Main: Exiting");
}
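
The sample above covers waiting. For cancellation, the library uses a CancellationTokenSource and the token it hands out; cancellation is cooperative, so the work has to check the token itself. The following is only a sketch: TaskCancellation and RunAndCancel are hypothetical names, and the loop bounds and delays are arbitrary.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TaskCancellation
{
    // Starts a task that polls a cancellation token, cancels it after a
    // short delay, and returns the task's final status.
    public static TaskStatus RunAndCancel()
    {
        using (var cts = new CancellationTokenSource())
        {
            CancellationToken token = cts.Token;

            Task task = Task.Factory.StartNew(() =>
            {
                for (int i = 0; i < 1000; i++)
                {
                    token.ThrowIfCancellationRequested();   // cooperative check
                    Thread.Sleep(50);                       // stands in for a unit of work
                }
            }, token);

            Thread.Sleep(200);
            cts.Cancel();                                   // request cancellation

            try
            {
                task.Wait();
            }
            catch (AggregateException ex)
            {
                // Wait surfaces the cancellation wrapped in an AggregateException.
                // Anything that is not a cancellation is rethrown by Handle.
                ex.Handle(inner => inner is OperationCanceledException);
            }
            return task.Status;
        }
    }

    public static void Main()
    {
        Console.WriteLine("Final status = " + RunAndCancel());
    }
}
```

Passing the same token to StartNew is what lets the library treat the exception as a cancellation and not as a failure.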


When to avoid using thread pool?
- Longer duration of work: The thread pool should be used only when the work is of short duration. The faster a thread completes its work and goes back to the pool, the better the performance. If threads are tied up with very long work, starvation can occur after some time. New threads are not created in the pool once the limit is reached, and if all threads are busy, queued work may wait a long time or may not be performed at all.

- Priority, Naming, IsBackground: Values such as priority and background cannot be set in the QueueUserWorkItem call. If they need to be set before the work starts, the pool is not a good fit.

- Synchronization: If we rely on synchronization to start, wait for and stop the work done by the thread, it is very likely that the task duration will increase, which could lead to starvation very soon. If we have to use WaitHandles, locks or other synchronization techniques, creating our own threads is a better option.

- Active thread management: Threads created manually by writing Thread worker = new Thread(…) have a lot to offer. We can suspend, resume, interrupt, abort and wait on them, pass them around, and add them to a list. Certain scenarios, such as producer-consumer queues, also work best with our own threads.

We do need to remember that the .NET framework is also a customer of the thread pool. The best scenario for using the thread pool is fire and forget.
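
For the long-running case, here is a sketch of two ways to keep such work off the regular pool threads: a thread we create ourselves, and a task started with TaskCreationOptions.LongRunning, which is a hint to the scheduler that the work will take a while. The names LongRunningWork, RunOnDedicatedThread and StartLongRunningTask are hypothetical, and the Sleep stands in for the long work.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LongRunningWork
{
    // Option 1: a dedicated thread that we create and own.
    // Returns whether the work ran on a thread-pool thread.
    public static bool RunOnDedicatedThread()
    {
        bool onPool = true;
        var worker = new Thread(() =>
        {
            Thread.Sleep(200);                                   // stands in for long work
            onPool = Thread.CurrentThread.IsThreadPoolThread;
        });
        worker.IsBackground = true;   // our choice; a manual thread is foreground by default
        worker.Start();
        worker.Join();                // a manual thread can simply be joined
        return onPool;
    }

    // Option 2: a task flagged as long-running. This is only a hint;
    // the scheduler decides how to honour it.
    public static Task StartLongRunningTask(Action work)
    {
        return Task.Factory.StartNew(work, TaskCreationOptions.LongRunning);
    }

    public static void Main()
    {
        Console.WriteLine("Dedicated thread is a pool thread? " + RunOnDedicatedThread());

        Task task = StartLongRunningTask(() => Thread.Sleep(200));
        task.Wait();
        Console.WriteLine("Long-running task finished with status " + task.Status);
    }
}
```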

How many threads are available in the pool?

This is a little bit tricky. There are 3 sets of numbers that we get from the thread pool:
Minimum, Maximum and Available.

- Minimum doesn’t mean that this many threads are created when the application starts. It means that threads are created on demand until this limit is reached. After that, new threads are created according to the allocation algorithm. In .NET 2.0, once the minimum is reached, a new thread is added after an interval of 500 ms.
- Maximum stays true to its word. It is the maximum number of threads that can be created in the pool. If there are more requests, they are kept on hold until a thread finishes its work and comes back to the pool.
- Available = Maximum – number of active threads

The default values for minimum and maximum depend on external factors. Up to .NET 3.5 they were based on the number of CPU cores in the machine; from .NET 4.0 they are based on factors such as the virtual address space. Here are the numbers on my machine.

.NET 4.0:

(Screenshot of the thread pool numbers under .NET 4.0; image not available.)

.NET 3.5:

(Screenshot of the thread pool numbers under .NET 3.5; image not available.)
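
The numbers can be read with the GetMinThreads, GetMaxThreads and GetAvailableThreads methods of the ThreadPool class. Each call reports two values, one for worker threads and one for I/O completion threads. The sketch below prints the worker-thread numbers; PoolLimits and ReadWorkerLimits are names chosen for this illustration, and the values will differ from machine to machine.

```csharp
using System;
using System.Threading;

public static class PoolLimits
{
    // Returns { minimum, maximum, available } for worker threads.
    public static int[] ReadWorkerLimits()
    {
        int minWorker, maxWorker, availableWorker;
        int minIo, maxIo, availableIo;   // I/O completion threads, reported separately

        ThreadPool.GetMinThreads(out minWorker, out minIo);
        ThreadPool.GetMaxThreads(out maxWorker, out maxIo);
        ThreadPool.GetAvailableThreads(out availableWorker, out availableIo);

        return new[] { minWorker, maxWorker, availableWorker };
    }

    public static void Main()
    {
        int[] limits = ReadWorkerLimits();
        Console.WriteLine("Minimum   = " + limits[0]);
        Console.WriteLine("Maximum   = " + limits[1]);
        Console.WriteLine("Available = " + limits[2]);
        Console.WriteLine("Active    = " + (limits[1] - limits[2]));   // Maximum - Available
    }
}
```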

Exceptions

Any exception that occurs in a pooled thread needs to be handled, otherwise it will bring down the application. The exception handling code also needs to be in the worker method, not around the call that queues the work: the queuing call returns before the work item runs, so a try/catch placed there never sees the exception.
        static void Main(string[] args)
        {            
            ThreadPool.QueueUserWorkItem(Work);
            Console.ReadKey();
        }

        static void Work(object input)
        {
            try
            {
                throw new Exception("Error!");
            }
            catch (Exception ex)
            {
                //if not handled here, then it will take down the entire process
                //log message - then continue or rethrow or graceful shutdown
                Console.WriteLine(ex);
            }
        }
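
Tasks behave differently here. An exception thrown inside a task is captured by the task and rethrown, wrapped in an AggregateException, when we wait on it. The sketch below illustrates this with the same “Error!” exception; TaskExceptions and RunFaultingTask are hypothetical names.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskExceptions
{
    // Runs a task that throws, waits for it, and returns the message
    // of the exception that the task captured.
    public static string RunFaultingTask()
    {
        Task task = new Task(delegate
        {
            throw new InvalidOperationException("Error!");
        });
        task.Start();

        try
        {
            task.Wait();                        // the captured exception surfaces here
            return "no exception";
        }
        catch (AggregateException ex)
        {
            // The original exception is available as the inner exception.
            return ex.InnerException.Message;
        }
    }

    public static void Main()
    {
        Console.WriteLine("Caught from task: " + RunFaultingTask());
    }
}
```

So with tasks the handler can live next to the Wait call, which is not possible with QueueUserWorkItem.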

Resources
- Erika Parsons and Eric Eilebrecht : CLR 4 - Inside the Thread Pool
- Thread Pool developer’s blog: http://blogs.msdn.com/b/ericeil/
