What's New in C# 5.0, Part 1
The Microsoft Visual Studio Async Community Technology Preview (CTP) introduces a new language feature in C# and VB, and a new framework pattern to go with it, that will make asynchronous programming similar to synchronous programming.
This post describes the limitations of the current callback-based programming model for asynchrony and provides an overview of the new opportunities offered by the framework and language features proposed in the Async CTP.
As the level of abstraction in “local” programming has been steadily rising, there has been a push for transparency of remote operations – they should look just like local ones, so that a developer doesn’t need to grapple with conceptual overhead, architectural impedance mismatch and leaky abstractions.
The problem is that remote operations are different from local ones. They have orders of magnitude more latency even at the best of times, may fail in new ways or simply never come back, depend on a variety of external factors beyond the developer’s control or even perception, etc. So while they can be represented like “just method calls,” it is not desirable to do so because the developer is left without handles to manage the special conditions arising from their remoteness – managing cancellation and timeouts, preserving threading resources during blocking waits, predicting and handling threats to responsiveness, etc.
On .NET we have not ignored this challenge. In fact we have not just one but several patterns for asynchronous programming; that is, for dealing with I/O and similar high-latency operations without blocking threads. Most often there is both a synchronous (i.e. blocking transparently) and an asynchronous (i.e. latency-explicit) way of doing things. The problem is that these current patterns are very disruptive to program structure, leading to exceedingly complex and error-prone code or (more commonly) to developers giving up and using the blocking approach, taking a responsiveness and performance hit instead.
The goal should be to bring the asynchronous development experience as close to the synchronous paradigm as possible, without letting go of the ability to handle the asynchrony-specific situations. Asynchrony should be explicit and non-transparent, but in a very lightweight and non-disruptive manner. Composability, abstraction and control structures should all work as simply and intuitively as with synchronous code.
The problem is best understood in the common scenario of a UI that has just one thread to run all its user interface code on, but applies equally in, for example, server scenarios where thread resources may be a scaling bottleneck and having thousands of threads spend most of their time doing nothing is a bad strategy.
A client app that doesn’t react to mouse events or update the display for user-recognizable periods of time is likely the result of code holding on to the single UI thread for far too long. Maybe it is waiting for network IO or maybe it is performing an intensive computation. Meanwhile, other events just can’t get processing time, and the user-perceived world grinds to a halt. What’s a more frustrating user experience than losing all contact with an app that is “busy” standing still, staring down a pipe for a response that may be seconds away?
Easy to say, hard to fix. For years the recommended approach to these issues has been asynchrony: don’t wait for that response. Return as soon as you issue the request, letting other events take place in the meantime, but have the eventual response call you back when it arrives so that you can process the result as a separate event. This is a great approach: your UI thread is never blocked waiting, but is instead blazing through small, nimble events that easily interleave and never have to wait long for their turn.
The problem: asynchronous code totally blows up your control flow. The “call you back” part needs a callback – a delegate describing what comes after. But what if you wanted to “wait” inside a while loop? An if statement? A try block or using block? How do you then describe “what comes after”?
Look at this simple example:
    public int SumPageSizes(IList<Uri> uris)
    {
        int total = 0;
        foreach (var uri in uris)
        {
            statusText.Text = string.Format("Found {0} bytes ...", total);
            var data = new WebClient().DownloadData(uri);
            total += data.Length;
        }
        statusText.Text = string.Format("Found {0} bytes total", total);
        return total;
    }

The method downloads a number of URIs, totaling their sizes and updating a status text along the way.
Clearly this method doesn’t belong on the UI thread because it may take a very long time to complete, while holding up the UI completely. Just as clearly it does belong on the UI thread because it repeatedly updates the UI. What to do?
We can put it on a background thread, making it repeatedly “post” back to the UI thread to do the UI updates. That seems wasteful in this case, since a thread will be occupied spending most of its time just waiting for downloads, but sometimes it is really the only thing you can do. In this case, however, WebClient offers an asynchronous version of DownloadData – DownloadDataAsync – which returns promptly, and then fires an event – DownloadDataCompleted – when it is done. This allows us to write an asynchronous version of our method that splits it up into little callbacks and runs the next one on the UI thread whenever the download initiated by the previous one completes. Here’s a first attempt:
    public void SumPageSizesAsync(IList<Uri> uris)
    {
        SumPageSizesAsyncHelper(uris.GetEnumerator(), 0);
    }

    private void SumPageSizesAsyncHelper(IEnumerator<Uri> enumerator, int total)
    {
        if (enumerator.MoveNext())
        {
            statusText.Text = string.Format("Found {0} bytes ...", total);
            var client = new WebClient();
            client.DownloadDataCompleted += (sender, e) =>
            {
                SumPageSizesAsyncHelper(enumerator, total + e.Result.Length);
            };
            client.DownloadDataAsync(enumerator.Current);
        }
        else
        {
            statusText.Text = string.Format("Found {0} bytes total", total);
            enumerator.Dispose();
        }
    }

Already this is bad. We have to break up the neat foreach loop and manually get an enumerator. Each call to the private helper method hooks up an event handler for the completion of its download, which will eventually call the private helper again for the next element, if any. The code looks recursive instead of iterative. Still, you may squint and be able to discern the intent of this code. But we are not nearly done yet.
The original code returned the total as well as displaying it. Our new asynchronous version returns to its caller long before the total has been computed. How do we get a result back to our caller? The answer is: our caller must provide a callback to us, which we can then invoke with the total when it is ready. The caller must in turn have its code restructured so that it consumes the total in a callback instead of as a return value.
And what about exceptions? The original code said nothing about exceptions; they were just silently propagated to the caller. In the async case, though, exceptions will arise after we returned to the caller. We must extend the callback from the caller to also tell it about exceptions, and we have to explicitly propagate those, wherever they may arise.
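To make the timing issue concrete, here is a minimal sketch that is not part of the original example. `BeginFailingOperation` is a hypothetical stand-in for an asynchronous call: it returns immediately and reports its failure later through a callback on a thread-pool thread. The `try`/`catch` around the call never sees the failure, because by the time it occurs the call has already returned.

```csharp
using System;
using System.Threading;

static class LateExceptionSketch
{
    // Hypothetical asynchronous operation (an assumption for illustration):
    // it returns at once and delivers its failure later via the callback.
    static void BeginFailingOperation(Action<Exception> completed)
    {
        ThreadPool.QueueUserWorkItem(_ =>
        {
            try { throw new InvalidOperationException("download failed"); }
            catch (Exception ex) { completed(ex); }
        });
    }

    public static string Run()
    {
        string observed = "nothing";
        using (var done = new ManualResetEvent(false))
        {
            try
            {
                // Returns before the operation has had a chance to fail.
                BeginFailingOperation(ex =>
                {
                    observed = "callback saw: " + ex.Message;
                    done.Set();
                });
            }
            catch (InvalidOperationException)
            {
                // Never reached: the failure happens after the call returned.
                observed = "catch block";
            }

            // Blocking here is only so the demo can report the outcome;
            // real UI code would not wait like this.
            done.WaitOne();
        }
        return observed;
    }

    static void Main()
    {
        Console.WriteLine(Run()); // prints "callback saw: download failed"
    }
}
```

The only place the error can be observed is inside the callback, which is why the callback signature has to grow an exception parameter.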
Together, these requirements will further clutter the code:
    public void SumPageSizesAsync(IList<Uri> uris, Action<int, Exception> callback)
    {
        SumPageSizesAsyncHelper(uris.GetEnumerator(), 0, callback);
    }

    private void SumPageSizesAsyncHelper(IEnumerator<Uri> enumerator, int total, Action<int, Exception> callback)
    {
        try
        {
            if (enumerator.MoveNext())
            {
                statusText.Text = string.Format("Found {0} bytes ...", total);
                var client = new WebClient();
                client.DownloadDataCompleted += (sender, e) =>
                {
                    if (e.Error != null)
                    {
                        enumerator.Dispose();
                        callback(0, e.Error);
                    }
                    else SumPageSizesAsyncHelper(enumerator, total + e.Result.Length, callback);
                };
                client.DownloadDataAsync(enumerator.Current);
            }
            else
            {
                statusText.Text = string.Format("Found {0} bytes total", total);
                enumerator.Dispose();
                callback(total, null);
            }
        }
        catch (Exception ex)
        {
            enumerator.Dispose();
            callback(0, ex);
        }
    }

Is this code now correct? Did we expand the foreach statement correctly, and propagate all exceptions? In fact, when you look at it, can you even tell what it does?
Unlikely. And this corresponds to a synchronous method with just one blocking call to replace with an asynchronous one (DownloadData), and one layer of control structure around it (the foreach loop). Imagine trying to compose more asynchronous calls, or having a more complex control structure! And we haven’t even started on the callers of SumPageSizesAsync!
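As a rough illustration of what that means for callers, consider the sketch below. It is not from the original code: `ReportAverage` is a hypothetical caller, and the `SumPageSizesAsync` shown here is a stand-in with the same signature that reports a fixed 100 bytes per URI (and invokes the callback synchronously) so the example runs without a network or a UI.

```csharp
using System;
using System.Collections.Generic;

static class CallerSketch
{
    // Stand-in for SumPageSizesAsync (an assumption for illustration):
    // same signature, but it pretends every page is 100 bytes long.
    static void SumPageSizesAsync(IList<Uri> uris, Action<int, Exception> callback)
    {
        if (uris == null)
        {
            callback(0, new ArgumentNullException("uris"));
            return;
        }
        callback(uris.Count * 100, null);
    }

    // A synchronous caller could simply write:
    //     int total = SumPageSizes(uris);
    //     return total / uris.Count;
    // The asynchronous caller cannot return the value; it needs a callback of its own.
    public static void ReportAverage(IList<Uri> uris, Action<string> report)
    {
        SumPageSizesAsync(uris, (total, error) =>
        {
            if (error != null)
            {
                report("Failed: " + error.Message);
                return;
            }
            if (uris.Count == 0)
            {
                report("No pages");
                return;
            }
            report("Average page size: " + (total / uris.Count) + " bytes");
        });
    }

    static void Main()
    {
        var uris = new List<Uri>
        {
            new Uri("http://example.com/a"),
            new Uri("http://example.com/b")
        };
        ReportAverage(uris, Console.WriteLine); // prints "Average page size: 100 bytes"
    }
}
```

Note how the restructuring spreads: because `ReportAverage` only learns the total inside a callback, it has to accept a callback (`report`) itself, and so would its own callers.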
The real problem here is that we can no longer describe the logical flow of our method using the control flow constructs of the language. Instead of describing the flow, our program becomes about describing the wiring up of the flow. “Where to go next” becomes a matter not of the execution of a loop or conditional or try-block, but of which callback you installed.
Hence, making your code asynchronous with today’s tools is a thankless task: after a lot of hard work, the result is unappealing, hard to read and likely full of bugs.
To be continued…