C++ Core Guidelines: Be Aware of the Traps of Condition Variables


Today, I write a scary post about condition variables. You should be aware of these issues with condition variables. The C++ Core Guidelines rule CP.42 just states: "Don't wait without a condition".


Wait! Condition variables support a quite simple concept. One thread prepares something and sends a notification that another thread is waiting for. How can this be so dangerous? Okay, let's start with the only rule for today.

CP.42: Don’t wait without a condition

Here is the rationale for the rule: "A wait without a condition can miss a wakeup or wake up simply to find that there is no work to do." What does that mean? Condition variables can be victims of two very serious issues: lost wakeup and spurious wakeup. The key concern about condition variables is that they have no memory.

Before I present these issues, let me first do it right. Here is the pattern for using condition variables.

 

// conditionVariables.cpp

#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>

std::mutex mutex_;
std::condition_variable condVar; 

bool dataReady{false};

void waitingForWork(){
    std::cout << "Waiting " << std::endl;
    std::unique_lock<std::mutex> lck(mutex_);
    condVar.wait(lck, []{ return dataReady; });   // (4)
    std::cout << "Running " << std::endl;
}

void setDataReady(){
    {
        std::lock_guard<std::mutex> lck(mutex_);
        dataReady = true;
    }
    std::cout << "Data prepared" << std::endl;
    condVar.notify_one();                        // (3)
}

int main(){
    
  std::cout << std::endl;

  std::thread t1(waitingForWork);               // (1)
  std::thread t2(setDataReady);                 // (2)

  t1.join();
  t2.join();
  
  std::cout << std::endl;
  
}

How does the synchronisation work? The program has two child threads: t1 and t2. They get their work packages waitingForWork and setDataReady in lines (1) and (2). setDataReady notifies, using the condition variable condVar, that it is done with the preparation of the work: condVar.notify_one() (line 3). While holding the lock, thread t1 waits for its notification: condVar.wait(lck, []{ return dataReady; }) (line 4). Both the sender and the receiver need a lock. In the case of the sender, a std::lock_guard is sufficient because it calls lock and unlock only once. In the case of the receiver, a std::unique_lock is necessary because wait may unlock and relock the mutex several times.

Here is the output of the program.

[Screenshot: output of conditionVariables.cpp]


Maybe you are wondering: Why do you need a predicate for the wait call when you can also invoke wait without a predicate? This workflow seems far too complicated for such a simple synchronisation of threads.

Now we are back to the missing memory and the two phenomena called lost wakeup and spurious wakeup.

Lost Wakeup and Spurious Wakeup

  • Lost wakeup: The phenomenon of the lost wakeup is that the sender sends its notification before the receiver reaches its wait state. The consequence is that the notification is lost. The reference on cppreference.com describes condition variables as a simultaneous synchronisation mechanism: "The condition_variable class is a synchronisation primitive that can be used to block a thread, or multiple threads at the same time, ...". So the notification gets lost, and the receiver waits and waits and ...
  • Spurious wakeup: It may happen that the receiver wakes up although no notification happened. At a minimum, POSIX Threads and the Windows API can be victims of this phenomenon.

To avoid becoming the victim of these two issues, you have to use an additional predicate as memory or, as the rule states it, an additional condition. If you don't believe it, here is the wait workflow.
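To make the idea of the predicate as memory concrete, here is a minimal sketch. It assumes the extreme case in which the sender has finished completely before the receiver starts to wait. The function name predicateRemembersNotification and the local names mut, cv, and ready are only chosen for this illustration.

```cpp
// predicateAsMemory.cpp (hypothetical file name)

#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Returns true if the receiver gets past its wait although the
// notification was sent before the receiver started to wait.
bool predicateRemembersNotification(){
    std::mutex mut;
    std::condition_variable cv;
    bool ready{false};

    // The sender finishes completely before the receiver waits.
    std::thread sender([&]{
        {
            std::lock_guard<std::mutex> lck(mut);
            ready = true;                // this flag is the "memory"
        }
        cv.notify_one();                 // nobody is waiting yet
    });
    sender.join();

    std::unique_lock<std::mutex> lck(mut);
    cv.wait(lck, [&]{ return ready; });  // predicate is already true: no blocking
    assert(lck.owns_lock());             // wait returns with the mutex locked
    return ready;
}
```

Calling predicateRemembersNotification() from main returns instead of hanging, because the receiver does not depend on the notification itself but on the flag that the sender left behind.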

The wait workflow 

In the initial processing of wait, the thread holds the lock on the mutex (it was acquired by the std::unique_lock) and checks the predicate []{ return dataReady; }.

  • If the call of the predicate evaluates to
    • true: the thread continues its work.
    • false: condVar.wait() unlocks the mutex and puts the thread in a waiting (blocking) state.

If the thread waiting on the condition_variable condVar gets a notification or a spurious wakeup, the following steps happen.

  • The thread is unblocked and reacquires the lock on the mutex.
  • The thread checks the predicate.
  • If the call of the predicate evaluates to
    • true: the thread continues its work.
    • false: condVar.wait() unlocks the mutex and puts the thread in a waiting (blocking) state.
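The steps above can be written by hand as a loop around the wait call without a predicate. The following is only a sketch of that workflow, not the library's implementation; waitWithPredicate and runWorkflow are hypothetical helpers, and the returned counter exists only to make the number of predicate checks visible.

```cpp
// waitWorkflow.cpp (hypothetical file name)

#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hand-written version of the workflow (hypothetical helper).
// Returns how often the predicate was evaluated.
template <typename Predicate>
int waitWithPredicate(std::condition_variable& cv,
                      std::unique_lock<std::mutex>& lck,
                      Predicate pred){
    assert(lck.owns_lock());   // the caller must already hold the mutex
    int checks = 1;
    while (!pred()){           // check the predicate while holding the mutex
        cv.wait(lck);          // unlock, block, relock after a wakeup
        ++checks;              // the loop condition checks the predicate again
    }
    return checks;
}

int runWorkflow(){
    std::mutex mut;
    std::condition_variable cv;
    bool ready{false};

    std::thread sender([&]{
        {
            std::lock_guard<std::mutex> lck(mut);
            ready = true;
        }
        cv.notify_one();
    });

    int checks = 0;
    {
        std::unique_lock<std::mutex> lck(mut);
        checks = waitWithPredicate(cv, lck, [&]{ return ready; });
    }
    sender.join();
    return checks;
}
```

The result of runWorkflow() is 1 if the sender was faster than the receiver, and larger if the receiver had to block first or was woken up spuriously. In both cases the function returns.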

Complicated! Right? Don't you believe me?

Without a predicate

What will happen if I remove the predicate from the last example?  

// conditionVariableWithoutPredicate.cpp

#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>

std::mutex mutex_;
std::condition_variable condVar;

void waitingForWork(){
    std::cout << "Waiting " << std::endl;
    std::unique_lock<std::mutex> lck(mutex_);
    condVar.wait(lck);                       // (1)
    std::cout << "Running " << std::endl;
}

void setDataReady(){
    std::cout << "Data prepared" << std::endl;
    condVar.notify_one();                   // (2)
}

int main(){
    
  std::cout << std::endl;

  std::thread t1(waitingForWork);
  std::thread t2(setDataReady);

  t1.join();
  t2.join();
  
  std::cout << std::endl;
  
}

 

Now, the wait call in line (1) does not use a predicate, and the synchronisation looks quite easy. Sad to say, the program now has a race condition, which you can see in the very first execution. The screenshot shows the deadlock.

[Screenshot: output of conditionVariableWithoutPredicate.cpp, showing the program hanging]

 

The sender sends its notification in line (2) (condVar.notify_one()) before the receiver is able to receive it; therefore, the receiver will sleep forever.
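If you want to observe the lost wakeup without a hanging program, you can replace wait with wait_for. The following sketch assumes the worst case on purpose: the sender has already finished when the receiver starts to wait. The function name waitAfterNotification and the timeout value are only chosen for this illustration.

```cpp
// lostWakeupWithTimeout.cpp (hypothetical file name)

#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Returns the status of a wait without a predicate that starts
// after the notification was already sent.
std::cv_status waitAfterNotification(std::chrono::milliseconds timeout){
    std::mutex mut;
    std::condition_variable cv;

    std::thread sender([&]{ cv.notify_one(); });  // nobody is waiting yet
    sender.join();                                // the notification is gone

    std::unique_lock<std::mutex> lck(mut);
    // No predicate and no memory: the earlier notification cannot be seen.
    // A spurious wakeup could, in rare cases, end this wait before the timeout.
    const std::cv_status status = cv.wait_for(lck, timeout);
    assert(lck.owns_lock());                      // wait_for relocks the mutex
    return status;
}
```

A call such as waitAfterNotification(std::chrono::milliseconds(50)) is expected to return std::cv_status::timeout. With a plain wait instead of wait_for, the same situation means that the receiver blocks forever.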

Okay, lesson learned the hard way. The predicate is necessary, but there must be a way to simplify the program conditionVariables.cpp, right?

An atomic predicate 

Maybe you already saw it. The variable dataReady is just a boolean. We should make it an atomic boolean and, therefore, get rid of the mutex on the sender.

Here we are:

// conditionVariableAtomic.cpp

#include <atomic>
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>

std::mutex mutex_;
std::condition_variable condVar;

std::atomic<bool> dataReady{false};

void waitingForWork(){
    std::cout << "Waiting " << std::endl;
    std::unique_lock<std::mutex> lck(mutex_);
    condVar.wait(lck, []{ return dataReady.load(); });   // (1)
    std::cout << "Running " << std::endl;
}

void setDataReady(){
    dataReady = true;
    std::cout << "Data prepared" << std::endl;
    condVar.notify_one();
}

int main(){
    
  std::cout << std::endl;

  std::thread t1(waitingForWork);
  std::thread t2(setDataReady);

  t1.join();
  t2.join();
  
  std::cout << std::endl;
  
}

 

The program is quite straightforward compared to the first version because dataReady does not have to be protected by a mutex. Once more, the program has a race condition that can cause a deadlock. Why? dataReady is atomic! Right, but the wait expression (condVar.wait(lck, []{ return dataReady.load(); });) in line (1) is way more complicated than it seems.

The wait expression is equivalent to the following lines:

std::unique_lock<std::mutex> lck(mutex_);
while ( ![]{ return dataReady.load(); }() ) {
    // time window (1)
    condVar.wait(lck);
}

Even if you make dataReady an atomic, it must be modified under the mutex; if not, the modification may be published to the waiting thread but not correctly synchronised. This race condition may cause a deadlock. What does that mean: published, but not correctly synchronised? Let's have a closer look at the previous code snippet and assume that dataReady is atomic and is not protected by the mutex mutex_.

Let me assume the notification is sent while the receiver is not yet in the waiting state. This means the execution of the thread is in the line of the source snippet with the comment time window (1): the predicate has already been evaluated to false, but condVar.wait(lck) has not yet been called. The result is that the notification is lost. Afterwards, the thread goes into the waiting state and presumably sleeps forever.

This wouldn't have happened if dataReady had been protected by a mutex. The receiver holds the mutex from the check of the predicate until wait atomically releases it. Therefore, the sender can set dataReady only before the check (the receiver then sees true and never blocks) or after the receiver is in the waiting state (the notification then wakes it up).
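Here is a minimal sketch of that repair: dataReady stays atomic, but the sender modifies it while holding the mutex. The function name runAtomicUnderMutex and the local names mut and cv are only chosen for this illustration; the synchronisation is the same as in conditionVariables.cpp.

```cpp
// conditionVariableAtomicFixed.cpp (hypothetical file name)

#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Runs a receiver and a sender; returns the final value of dataReady.
bool runAtomicUnderMutex(){
    std::mutex mut;
    std::condition_variable cv;
    std::atomic<bool> dataReady{false};

    std::thread receiver([&]{
        std::unique_lock<std::mutex> lck(mut);
        cv.wait(lck, [&]{ return dataReady.load(); });
    });

    std::thread sender([&]{
        {
            // The store cannot fall into the time window between the
            // predicate check and the wait call of the receiver.
            std::lock_guard<std::mutex> lck(mut);
            dataReady = true;
        }
        cv.notify_one();
    });

    receiver.join();
    sender.join();
    assert(dataReady.load());  // the receiver only returns once the flag is set
    return dataReady.load();
}
```

Calling runAtomicUnderMutex() always returns, independent of which thread runs first. The atomic brings no benefit any more in this version; a plain bool under the mutex does the same job.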

What a scary story! Is there no possibility to make the initial program conditionVariables.cpp easier? No, not with a condition variable, but you can use a promise and future pair to get the job done. For the details, read the post Thread Synchronisation with Condition Variables or Tasks.
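To give you an idea of the promise and future variant, here is a minimal sketch. It is my assumption of how such a one-time notification can look and not a copy of the program from the mentioned post; the function name runPromiseFuture and the variable names are only chosen for this illustration.

```cpp
// promiseFutureSynchronisation.cpp (hypothetical file name)

#include <cassert>
#include <future>
#include <thread>
#include <utility>

// Returns true if the receiver got past its wait.
bool runPromiseFuture(){
    std::promise<void> sendReady;
    std::future<void> ready = sendReady.get_future();
    assert(ready.valid());      // the future is connected to the promise
    bool running{false};

    std::thread receiver([&running](std::future<void> fut){
        fut.wait();             // blocks until set_value was called
        running = true;
    }, std::move(ready));

    std::thread sender([](std::promise<void> prom){
        prom.set_value();       // the shared state remembers the notification
    }, std::move(sendReady));

    receiver.join();
    sender.join();
    return running;
}
```

The shared state of the promise and the future stores the notification. Therefore, it does not matter whether set_value is called before or after wait, and you need neither a mutex nor a predicate. The price is that a promise can send its notification only once.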

What's next?

Now, I'm nearly done with the rules for concurrency. The rules for parallelism, message passing, and vectorisation have no content; therefore, I skip them and write in my next post mainly about lock-free programming.

 



Comments   

#1 Philippe BAUCOUR 2018-06-04 12:26
Bonjour,
could you elaborate a little bit more about the spurious wakeup. I still do not understand from where it comes from, what it is really etc.
Many thanks in advance
Best regards, Philippe

#2 RainerGrimm 2018-06-05 05:28
Quoting Philippe BAUCOUR:
Bonjour,
could you elaborate a little bit more about the spurious wakeup. I still do not understand from where it comes from, what it is really etc.
Many thanks in advance
Best regards, Philippe

Here is the best explanation I could find:

According to David R. Butenhof's Programming with POSIX Threads ISBN 0-201-63392-2:

"This means that when you wait on a condition variable, the wait may (occasionally) return when no thread specifically broadcast or signaled that condition variable. Spurious wakeups may sound strange, but on some multiprocessor systems, making condition wakeup completely predictable might substantially slow all condition variable operations. The race conditions that cause spurious wakeups should be considered rare."

#3 cvomake 2018-06-29 08:17
In this text "I skip them and write in my next post mainly about lock-free programming." the link to the lock free programming has a typo, is missing the last "g", and a database error is displayed.

Here is the current link:
http://modernescpp.com/index.php/c-core-guidelines-concurrency-and-lock-free-programmin

The correct is: http://modernescpp.com/index.php/c-core-guidelines-concurrency-and-lock-free-programming