Parallel Processsing with Threads Introduction
C# supports parallel execution of code through multithreading. A thread is an independent execution path, able to run simultaneously with other threads.
A C# client program (Console, WPF, or Windows Forms) starts in a single thread created automatically by the CLR and operating system (the “main” thread), and is made multithreaded by creating additional threads. Here’s a simple example and its output:
All examples assume the following namespaces are imported:
using System; using System.Threading;
class ThreadTest { static void Main() { Thread t = new Thread (WriteY); // Kick off a new thread t.Start(); // running WriteY() // Simultaneously, do something on the main thread. for (int i = 0; i < 1000; i++) Console.Write ("x"); } static void WriteY() { for (int i = 0; i < 1000; i++) Console.Write ("y"); } }
Output:
xxxxxxxxxxxxxxxxyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxyyyyyyyyyyyyy yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy yyyyyyyyyyyyyxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx ...
The main thread creates a new thread t on which it runs a method that repeatedly prints the character “y”. Simultaneously, the main thread repeatedly prints the character “x”:
Once started, a thread’s IsAlive property returns true, until the point where the thread ends. A thread ends when the delegate passed to the Thread’s constructor finishes executing. Once ended, a thread cannot restart.
The CLR assigns each thread its own memory stack so that local variables are kept separate. In the next example, we define a method with a local variable, then call the method simultaneously on the main thread and a newly created thread:
static void Main() { new Thread (Go).Start(); // Call Go() on a new thread Go(); // Call Go() on the main thread } static void Go() { // Declare and use a local variable - 'cycles' for (int cycles = 0; cycles < 5; cycles++) Console.Write ('?'); }
Output:
??????????
A separate copy of the cycles variable is created on each thread's memory stack, and so the output is, predictably, ten question marks.
Threads share data if they have a common reference to the same object instance. For example:
class ThreadTest { bool done; static void Main() { ThreadTest tt = new ThreadTest(); // Create a common instance new Thread (tt.Go).Start(); tt.Go(); } // Note that Go is now an instance method void Go() { if (!done) { done = true; Console.WriteLine ("Done"); } } }
Because both threads call Go() on the same ThreadTest instance, they share the done field. This results in "Done" being printed once instead of twice:
Output:
Done
Static fields offer another way to share data between threads. Here’s the same example with done as a static field:
Threading Mechanism working process
Multithreading is managed internally by a thread scheduler, a function the CLR typically delegates to the operating system. A thread scheduler ensures all active threads are allocated appropriate execution time, and that threads that are waiting or blocked (for instance, on an exclusive lock or on user input) do not consume CPU time.
On a single-processor computer, a thread scheduler performs time-slicing — rapidly switching execution between each of the active threads. Under Windows, a time-slice is typically in the tens-of-milliseconds region — much larger than the CPU overhead in actually switching context between one thread and another (which is typically in the few-microseconds region).
On a multi-processor computer, multithreading is implemented with a mixture of time-slicing and genuine concurrency, where different threads run code simultaneously on different CPUs. It’s almost certain there will still be some time-slicing, because of the operating system’s need to service its own threads — as well as those of other applications.
A thread is said to be preempted when its execution is interrupted due to an external factor such as time-slicing. In most situations, a thread has no control over when and where it’s preempted.
A thread is analogous to the operating system process in which your application runs. Just as processes run in parallel on a computer, threads run in parallel within a single process. Processes are fully isolated from each other; threads have just a limited degree of isolation. In particular, threads share (heap) memory with other threads running in the same application. This, in part, is why threading is useful: one thread can fetch data in the background, for instance, while another thread can display the data as it arrives.
Threading Uses and Misuses
Multithreading has many uses; here are the most common:
Maintaining a responsive user interfaceBy running time-consuming tasks on a parallel “worker” thread, the main UI thread is free to continue processing keyboard and mouse events.Making efficient use of an otherwise blocked CPUMultithreading is useful when a thread is awaiting a response from another computer or piece of hardware. While one thread is blocked while performing the task, other threads can take advantage of the otherwise unburdened computer.Parallel programmingCode that performs intensive calculations can execute faster on multicore or multiprocessor computers if the workload is shared among multiple threads in a “divide-and-conquer” strategy
Allowing requests to be processed simultaneouslyOn a server, client requests can arrive concurrently and so need to be handled in parallel (the .NET Framework creates threads for this automatically if you use ASP.NET, WCF, Web Services, or Remoting). This can also be useful on a client (e.g., handling peer-to-peer networking — or even multiple requests from the user).
With technologies such as ASP.NET and WCF, you may be unaware that multithreading is even taking place — unless you access shared data (perhaps via static fields) without appropriate locking, running afoul of thread safety.
Threads also come with strings attached. The biggest is that multithreading can increase complexity. Having lots of threads does not in and of itself create much complexity; it’s the interaction between threads (typically via shared data) that does. This applies whether or not the interaction is intentional, and can cause long development cycles and an ongoing susceptibility to intermittent and nonreproducible bugs. For this reason, it pays to keep interaction to a minimum, and to stick to simple and proven designs wherever possible. This article focuses largely on dealing with just these complexities; remove the interaction and there’s much less to say!
A good strategy is to encapsulate multithreading logic into reusable classes that can be independently examined and tested. The Framework itself offers many higher-level threading constructs, which we cover later.
Threading also incurs a resource and CPU cost in scheduling and switching threads (when there are more active threads than CPU cores) — and there’s also a creation/tear-down cost. Multithreading will not always speed up your application — it can even slow it down if used excessively or inappropriately. For example, when heavy disk I/O is involved, it can be faster to have a couple of worker threads run tasks in sequence than to have 10 threads executing at once. (In Signaling with Wait and Pulse, we describe how to implement a producer/consumer queue, which provides just this functionality.)