Ongoing Optimization: Unsynchronized Access with CppMem


I described my challenge in the last post. Let's start with our process of ongoing optimization. To be sure, I verify my reasoning with CppMem; I once made a big mistake in my presentation at Meeting C++ 2014.


Just to remind you: this is our starting point.

The program

// ongoingOptimization.cpp

#include <iostream>
#include <thread>

int x= 0;
int y= 0;

void writing(){
  x= 2000;
  y= 11;
}

void reading(){ 
  std::cout << "y: " << y << " ";
  std::cout << "x: " << x << std::endl;
}

int main(){
  std::thread thread1(writing);
  std::thread thread2(reading);

  thread1.join();
  thread2.join();
}

Totally unsynchronized

The program has two data races and therefore undefined behaviour. Neither the access to the variable x nor the access to the variable y is protected. Because the program has undefined behaviour, every result is possible. In C++ jargon that means a cruise missile can launch or your PC can catch fire. It has never happened to me, but ...

So, we can make no statement about the values of x and y.


It's not so bad

The well-known architectures guarantee that access to an int variable is atomic, provided the int variable is naturally aligned. Naturally aligned means that on a 32-bit architecture the int variable must have an address divisible by 4; on a 64-bit architecture, divisible by 8. There is a reason why I mention this so explicitly: with C++11 you can adjust the alignment of your data types.

Once more: I don't say that you should treat int variables as atomics. I only say that the compiler in this case guarantees more than the C++11 standard does. But if you rely on this rule, your program is not compliant with the C++ standard.

This was my reasoning. Now we should have a look at what CppMem says about the undefined behaviour of the program.


int main() {
  int x=0; 
  int y=0;
  {{{ { 
        x= 2000; 
        y= 11;
      }
  ||| {
        y;
        x;
      }
  }}}
}

The program is reduced to the bare minimum. You can easily define the two threads with the curly braces (lines 4 and 12) and the pipe symbol (line 8). The additional curly braces in lines 4 and 7, and in lines 8 and 11, define the work package of each thread. Because I'm not interested in the output of the variables x and y, I only read them in lines 9 and 10.

That was the theory for CppMem. Now to the analysis.

The analysis

If I execute the program, CppMem complains in red letters (1) that all four possible interleavings of the threads are not race-free. Only the first execution is consistent. Now I can use CppMem to switch between the four executions (2) and analyse the annotated graph (3).


We get the most out of CppMem from the graph. So I will dive deeper into the four graphs.

First execution

Which information can we draw from the graph (3)?


The nodes of the graph represent the expressions of the program; the edges are the relations between the expressions. In my explanation I will refer to the names (a) to (f). So, what can I derive from the annotations of this concrete execution?

  • a:Wna x=0: The first expression (a), which is a non-atomic write of x.
  • sb (sequenced-before): The write of the first expression (a) is sequenced before the write of the second expression (b). The same relation holds between the expressions (c) and (d), and between (e) and (f).
  • rf (read-from): The expression (e) reads the value of y from the expression (b). Accordingly, (f) reads from (a).
  • sw (synchronizes-with): The expression (a) synchronizes with (f). This relation holds because the expression (f) takes place in a separate thread, and the creation of a thread is a synchronization point: everything that happens before the thread creation is visible in the thread. For symmetry reasons, the same holds between (b) and (e).
  • dr (data race): Here is the data race between the reads and writes of the variables x and y. So the program has undefined behaviour.

Why is the execution consistent?

The execution is consistent because the values of x and y in (e) and (f) are read from the writes (a) and (b) in the main thread. If the values were instead read from the writes (c) and (d) in the separate thread, it could happen that the values of x and y in (e) and (f) are only partially read. That would not be consistent. To put it differently: in this concrete execution, x and y get the value 0. You can also see that in the expressions (e) and (f).

This guarantee does not hold for the next three executions, to which I turn now.

Second execution


In this non-consistent execution, the expression (e) reads the value of y from the expression (d). The write of (d) happens in parallel with the read of (e).

Third execution


That's symmetrical to the second execution. The expression (f) reads from the expression (c).

Fourth execution


Now everything goes wrong. The expressions (e) and (f) read from the expressions (d) and (c).

A short conclusion

Although I only used the default configuration of CppMem and only looked at the graph, I got a lot of valuable information and insight. In particular, CppMem gets right to the point:

  1. All four combinations of (y, x) are possible: (0, 0), (11, 0), (0, 2000), and (11, 2000).
  2. The program has a data race and therefore undefined behaviour.
  3. Only one of the four executions is consistent.

What's next?

What is the most obvious way to synchronize a multithreaded program? Of course: use a mutex.







