Friday, April 5, 2013

Draft TOC for EC++11 Concurrency Chapter

A couple of months ago, I posted a draft Table of Contents (TOC) for Effective C++11. At that point, the entries for the concurrency chapter were so rough, they weren't even in the form of guidelines. Now they are, and I'm pleased to unveil my first draft TOC for the chapter on concurrency support:
  • Create tasks, not threads.
  • Pass std::launch::async if asynchronicity is essential.
  • Make std::threads unjoinable on all paths.
  • Be aware of varying thread handle destructor behavior.
  • Consider void futures for one-shot event communication.
  • Pass parameterless functions to std::thread, std::async, and std::call_once.
  • Use std::lock to acquire multiple locks.
  • Prefer non-recursive mutexes to recursive ones.
  • Declare future and std::thread members last.
  • Code for spurious failures in try_lock, condvar wait, and weak CAS operations.
  • Distinguish steady from unsteady clocks.
  • Use native handles to transcend the C++11 API.
  • Employ sequential consistency if at all possible.
  • Distinguish volatile from std::atomic.
This is a draft TOC. There's nothing final about the presence, order, or wording of these Items. Furthermore, unless either your mind-reading skills are better than I expect or my mind is easier to read than I fear, it will be tough for you to anticipate what I plan to say in these Items based only on the Item titles. Still, if you see advice above that you think is either especially good or especially bad, don't be shy about letting me know about it.

I'm especially pleased with the first Item on the list ("Create tasks, not threads"), because when I came up with that wording, a number of up-to-that-point disparate thoughts fell into place with a very satisfying thud.

When I began that Item, the only thing I knew I wanted to talk about was that thread construction can throw.  In Effective C++, Second Edition, my advice about dealing with the fact that operator new can throw is "Be prepared for out-of-memory conditions," so I started thinking about guidance such as "Be prepared for std::thread exhaustion." But what does it mean to be prepared to run out of threads? With operator new, there's a new handler you can configure. There's nothing like that for thread creation. And if you request n bytes from operator new and you can't get them, you may be able to scale down your request to, say, n/2 bytes, then try again. But if you request a new thread and that fails, what are you supposed to do, request half a thread?

I didn't like where that was going.  So I decided to think about avoiding the problem of running out of threads by not requesting them directly.  The prospective guideline "Prefer std::async to std::thread" had been an elephant in the room from the beginning, so I started playing with that idea.  But one of the other guidelines I was considering was "Pass std::launch::async if asynchronicity is essential" (it's on the draft TOC above), and the spec for std::async says that it throws the same exception as the std::thread constructor if you pass std::launch::async as the launch policy and std::async can't create a new thread. So advising people to use std::async was not sufficient, because using std::async with std::launch::async is no better than using std::thread for purposes of avoiding out-of-thread exceptions.
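To make the failure mode concrete, here is a minimal sketch of what "being prepared" might look like when std::launch::async is requested. The helper name launch_or_defer and the fallback to std::launch::deferred are assumptions made for illustration, not something the draft Item prescribes; the only behavior taken from the text above is that std::async can throw std::system_error when it can't create a thread.

```cpp
#include <cassert>
#include <future>
#include <system_error>

// Hypothetical helper (not part of the standard library): ask for truly
// asynchronous execution, and if no thread can be created, fall back to
// deferred execution, which runs f on whichever thread later calls get()
// or wait() on the returned future.
template <typename F>
auto launch_or_defer(F f) -> std::future<decltype(f())>
{
    std::future<decltype(f())> result;
    try {
        // May throw std::system_error if a new thread can't be started.
        result = std::async(std::launch::async, f);
    } catch (const std::system_error&) {
        // Deferred launch creates no thread, so it can't fail this way.
        result = std::async(std::launch::deferred, f);
    }
    assert(result.valid());  // either way, the caller gets a usable future
    return result;
}
```

A caller would write something like auto fut = launch_or_defer([] { return 6 * 7; }); and then call fut.get(). Whether a silent fallback to synchronous execution is acceptable depends on whether asynchronicity really is essential for the caller.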

Though my primary focus had been on figuring out how to avoid exceptions due to too many threads, another issue I wanted to address was how to deal with oversubscription: creating more threads than can efficiently run on the machine. The way to avoid that problem is to use std::async with the default launch policy, and that got me to thinking about what to call a function (or function object--henceforth simply a "function") that could be run either synchronously or asynchronously.  A raw function doesn't qualify, because if you run a raw function asynchronously on a std::thread, there is no way to get the result of the function.  (And if the function throws an exception, std::terminate gets called.) Fortunately, C++11 offers a way to prepare a function for possible asynchronous execution: wrap it in std::packaged_task. How fortuitous! I had been looking for an excuse to discuss std::packaged_task, and its existence allowed me to assign a C++11 meaning to the otherwise squishy notion of a "task".

Thus the (still tentative) Item title was born.

What I really like about it is that it's both design advice and coding advice.  At a design level, creating tasks means developing independent pieces of functionality that may be run either synchronously or asynchronously, depending on the computational resources dynamically available on the machine.  At a coding level, it means taking functions and making them suitable for asynchronous execution, either by wrapping them with std::packaged_task or, preferably, by submitting them to std::async (which does the wrapping for you).
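The coding-level options can be sketched roughly as follows. The function compute_answer is a hypothetical stand-in for a real piece of work, and the three run_* helpers are illustrative names; this is a sketch of the idea, not the Item's final example code.

```cpp
#include <cassert>
#include <future>
#include <thread>
#include <utility>

// Hypothetical stand-in for an independent piece of functionality.
int compute_answer() { return 6 * 7; }

// Option 1: hand the function to std::async with the default launch policy.
// The implementation may run it on a new thread or defer it until get().
int run_with_async()
{
    std::future<int> result = std::async(compute_answer);
    return result.get();
}

// Option 2: wrap the function in std::packaged_task and run it
// synchronously. The result still arrives through a future.
int run_packaged_task_synchronously()
{
    std::packaged_task<int()> task(compute_answer);
    std::future<int> result = task.get_future();
    assert(result.valid());
    task();  // runs on the calling thread
    return result.get();
}

// Option 3: run the same kind of packaged_task on a std::thread.
// Unlike a raw function on a thread, the return value (or an exception)
// is captured in the shared state and retrieved via the future.
int run_packaged_task_on_thread()
{
    std::packaged_task<int()> task(compute_answer);
    std::future<int> result = task.get_future();
    std::thread worker(std::move(task));
    worker.join();  // make the thread unjoinable before it is destroyed
    return result.get();
}
```

All three helpers return the same value; what differs is who decides where and when compute_answer runs.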

"Create tasks, not threads" thus gives me a context in which to discuss exceptions thrown by thread creation requests, the problem of oversubscription, std::thread, std::async, std::packaged_task, and tasks versus threads. Along the way I also get to discuss thread pools and the conditions under which it can make sense to bypass tasks and go straight to std::threads. (Can you see a cross-reference to "Use native handles to transcend the C++11 API"?  I can.)

Scott

10 comments:

Markus Jais said...

Looks great. I really liked your older C++ books.
Looking forward to the C++11 version.

I agree about tasks instead of threads.
When I write Java or Scala code (I do mostly Java/Scala now) I try to avoid creating specific threads.
It is almost always better to have independent tasks that don't share anything. This avoids a lot of concurrency problems.
An actor based approach (like www.akka.io) is also great. I hope actors can make it into the C++ standard one day.
Intel's Threading Building Blocks for C++ also help to use tasks instead of threads.

Markus

Norbert Wenzel said...

Looking forward to that book. And I'm especially curious about "Make std::threads unjoinable on all paths." I honestly have no idea from the title what will be in there and why.

One good reason (more) to buy that book when it's published.

Ben Craig said...

I too wonder about the "Make std::threads unjoinable on all paths" item. My advice has been to avoid detached threads, and this looks like it will run counter to that advice.

A lot of code that I write needs to live in shared libraries / DLLs. If my library creates a thread, it needs to be able to control the lifetime of that thread. If it doesn't control the lifetime of that thread, then when my DLL gets unloaded, bad things happen.

Scott Meyers said...

@Unknown and @Norbert Wenzel: The motivation for the guideline is that if a joinable thread has its destructor called, std::terminate is invoked. Threads are most commonly made unjoinable via a join, detach, or std::move to another std::thread object. The challenge is to make sure that one of these things happens on every path out of a block.
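One common way to cover every path is an RAII wrapper whose destructor makes the thread unjoinable. The sketch below is only an illustration: the class name JoiningThread and the function process are hypothetical, and choosing join rather than detach in the destructor is an assumption, not necessarily what the Item will recommend.

```cpp
#include <cassert>
#include <thread>
#include <utility>

// Hypothetical RAII wrapper: if the wrapped thread is still joinable when
// the wrapper is destroyed, join it, so std::terminate is never triggered
// by the std::thread destructor.
class JoiningThread {
public:
    explicit JoiningThread(std::thread t) : t_(std::move(t)) {}

    ~JoiningThread()
    {
        if (t_.joinable()) {
            t_.join();
        }
        assert(!t_.joinable());  // unjoinable on every path out of scope
    }

    JoiningThread(const JoiningThread&) = delete;
    JoiningThread& operator=(const JoiningThread&) = delete;

private:
    std::thread t_;
};

// Hypothetical function with more than one path out of the block.
// The worker is joined whether we return early or fall through.
bool process(bool bail_out_early, int& counter)
{
    JoiningThread worker(std::thread([&counter] { ++counter; }));

    if (bail_out_early) {
        return false;  // premature return: destructor still joins
    }
    return true;       // normal return: destructor joins here too
}
```

Because the destructor joins, counter has been incremented by the time process returns, on either path. Joining keeps the thread's lifetime under the caller's control; a detaching variant would avoid blocking in the destructor but gives up that control.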

Anonymous said...

But those unwanted std::terminate calls are bugs that get caught really early in development. Is there anything more complex about that behavior besides the join/detach/move?

Scott Meyers said...

@Anonymous: If there are no paths (including due to exceptions, breaks, continues, premature returns, flowing off the end of the function, etc.) where a joinable thread is destroyed, then the code adheres to the guideline.

Anonymous said...

I'm a bit uncertain about the first topic of tasks vs. threads. I see this come up often, and it seems to somewhat ignore what I feel are two very distinct classes of concurrency. The first one you are hinting at is processing based concurrency where you use another thread to complete some package of work.

The second one is service based concurrency. You have a service which continually runs and is waiting to be instructed what to do (for example an audio device). The "task" model doesn't apply well to these: instead you create messages and tell them things they should do.

Anonymous said...

@Markus For actors in C++11 see libcppa.blogspot.com

Michael Marcin said...

Shouldn't

Pass parameterless functions to...

be

Pass nullary functions to...

Scott Meyers said...

@Michael Marcin: "Nullary functions" would be fine, but I don't think "parameterless functions" is incorrect, and I think it's likely to be understood by more people.