A Lock-Free Stack: A Complete Implementation

My last lock-free stack implementation was incomplete. It only supported push operations. Let’s change this.

The following section about sequential consistency is optional. You can safely skip it.

Sequential Consistency

In my examples, I use the default memory ordering: sequential consistency. The reason is simple. Sequential consistency provides the strongest guarantees of all memory orderings and is, therefore, easier to use than the other memory orderings. Sequential consistency is an ideal starting point when designing lock-free data structures. In further optimization steps, you can weaken the memory ordering and apply acquire-release semantics or relaxed semantics.

Depending on the architecture, weakening the memory ordering may not pay off. For example, the x86 memory model is one of the strongest memory models of all modern architectures. As a result, giving up sequential consistency and applying a weaker memory ordering might not give the performance improvements you hoped for. In contrast, on ARMv8, PowerPC, Itanium, and, in particular, DEC Alpha, giving up sequential consistency may pay off.

The simplified stack version from the last article has two issues. First, it does not have a pop operation, and second, it never releases memory.

A Complete Implementation

Typically, a stack supports the member functions push, pop, and top. Implementing the pop and top member functions in a thread-safe way does not guarantee that an invocation of top followed by pop is thread-safe. It may happen that a thread t1 calls stack.top() and is then interleaved by another thread t2 that calls stack.top() and stack.pop(). When t1 finally calls stack.pop(), it removes a different element than the one it saw with top.

No Memory Reclamation

Consequently, the following implementation combines the two member functions top and pop into one: topAndPop.

// lockFreeStackWithLeaks.cpp

#include <atomic>
#include <future>
#include <iostream>
#include <stdexcept>

template<typename T>
class LockFreeStack {
 private:
    struct Node {
        T data;
        Node* next;
        Node(T d): data(d), next(nullptr){ }
    };
    std::atomic<Node*> head{nullptr};   // before C++20, a defaulted atomic is uninitialized
 public:
    LockFreeStack() = default;
    LockFreeStack(const LockFreeStack&) = delete;
    LockFreeStack& operator= (const LockFreeStack&) = delete;
   
    void push(T val) {
        Node* const newNode = new Node(val);
        newNode->next = head.load();
        while( !head.compare_exchange_strong(newNode->next, newNode) );
    }

    T topAndPop() {
        Node* oldHead = head.load();                                                 // 1
        while( oldHead && !head.compare_exchange_strong(oldHead, oldHead->next) );   // 2
        if ( !oldHead ) throw std::out_of_range("The stack is empty!");              // 3
        return oldHead->data;                                                        // 4
    }
};
   
int main(){

    LockFreeStack<int> lockFreeStack;
    
    auto fut = std::async([&lockFreeStack]{ lockFreeStack.push(2011); });
    auto fut1 = std::async([&lockFreeStack]{ lockFreeStack.push(2014); });
    auto fut2 = std::async([&lockFreeStack]{ lockFreeStack.push(2017); });
    
    auto fut3 = std::async([&lockFreeStack]{ return lockFreeStack.topAndPop(); });
    auto fut4 = std::async([&lockFreeStack]{ return lockFreeStack.topAndPop(); });
    auto fut5 = std::async([&lockFreeStack]{ return lockFreeStack.topAndPop(); });
    
    fut.get(), fut1.get(), fut2.get();                            // 5  
    
    std::cout << fut3.get() << '\n';
    std::cout << fut4.get() << '\n';
    std::cout << fut5.get() << '\n';

}

The member function topAndPop returns the top element of the stack. It reads the head element of the stack (line 1) and makes the next node the new head if oldHead is not a nullptr (line 2). oldHead is a nullptr if the stack is empty. I throw an exception if the stack is empty (line 3). Returning a special non-value or returning a std::optional is also a valid option. Finally, the member function returns a copy of the head element (line 4). Copying the value has a downside: if the copy constructor of the value throws an exception such as std::bad_alloc, the value is lost because its node has already been removed from the stack.

The calls fut.get(), fut1.get(), fut2.get() (line 5) ensure that the associated promise runs. If you don’t specify the launch policy, the promise may run lazily in the caller’s thread. Lazily means that the promise is executed if and only if the future asks for its result with get or wait. You can also launch the promise in a separate thread:

auto fut = std::async(std::launch::async, [&lockFreeStack]{ lockFreeStack.push(2011); });
auto fut1 = std::async(std::launch::async, [&lockFreeStack]{ lockFreeStack.push(2014); });
auto fut2 = std::async(std::launch::async, [&lockFreeStack]{ lockFreeStack.push(2017); });

Finally, the output of the program:

Although the presented lock-free stack supports push and topAndPop, it has a serious issue: it leaks memory. You may ask: Why can’t oldHead simply be deleted after the call head.compare_exchange_strong(oldHead, oldHead->next) (line 2) in the member function topAndPop? The answer is that another thread may still use oldHead. Let’s analyze the member functions push and topAndPop. Concurrent execution of push is no issue because the call head.compare_exchange_strong(newNode->next, newNode) either atomically makes newNode the new head or, if head has changed in the meantime, refreshes newNode->next with the current head before the retry. Deleting oldHead would also be valid if only one topAndPop call ran at a time. The issue arises when several topAndPop calls interleave, with or without a push call. Two threads may both have read the same oldHead (line 1). If one of them wins the compare-and-swap and deletes the node, the other one still evaluates oldHead->next (line 2) and accesses freed memory.

What’s Next?

Thanks to RCU and Hazard Pointers, I can solve the memory leaks in my next post.
