C++23: More Small Pearls

With the static multidimensional subscript and call operator, the C++23 core language has more to offer.

auto(x) and auto{x}

In my last post, “C++23: The Small Pearls in the Core Language”, I gave a concise explanation of auto(x) and auto{x}: The calls auto(x) and auto{x} cast x into a prvalue as if passing x as a function argument by value. auto(x) and auto{x} perform a decay copy. I also explained what a decay copy is. This explanation of auto(x) and auto{x} was short, too short.
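To make the term decay copy concrete: decaying a type removes references and cv-qualifiers and turns arrays into pointers, just as passing by value does. The following is a minimal sketch under assumptions. The helper decay_copy and the function shout are hypothetical names chosen for illustration, and auto(x) is only used when the compiler reports support through the feature-test macro __cpp_auto_cast.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical helper (not a standard library function): returns a decayed
// copy of its argument, similar to the exposition-only decay-copy.
template <typename T>
std::decay_t<T> decay_copy(T&& value) {
    return std::forward<T>(value);
}

// Decay strips references and cv-qualifiers ...
static_assert(std::is_same_v<std::decay_t<const std::string&>, std::string>);
// ... and turns arrays into pointers, as by-value parameter passing does.
static_assert(std::is_same_v<std::decay_t<int[3]>, int*>);

// Copies `original`, modifies only the copy, and returns it.
inline std::string shout(const std::string& original) {
#if defined(__cpp_auto_cast)
    std::string copy = auto(original);        // C++23: prvalue copy of original
#else
    std::string copy = decay_copy(original);  // pre-C++23 fallback
#endif
    copy += "!";
    return copy;
}

inline void demo_decay_copy() {
    const std::string greeting = "hello";
    assert(shout(greeting) == "hello!");
    assert(greeting == "hello");  // the original is untouched
}
```

Both branches produce an independent std::string object, so changes to the copy never reach the original.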

Gašper Ažman tweeted me about it:

Gašper is an active member of the C++ standardization committee: In C++, he participated in the standardization of “deducing this” and “using enum”. Additionally, Gašper is involved in the customization points effort and the contracts effort, and is an assistant chair of the networking study group.

I asked Gašper if he wanted to write a few words about the implications of auto(x) and auto{x} on function interface design. I’m happy to present his answer. The explanation is embedded into the source code.

#include <vector>
#include <string>
#include <memory>
#include <utility>
#include <array>
#include <algorithm>
#include <charconv>
#include <cstdlib>      // exit
#include <cstring>
#include <iostream>
#include <type_traits>  // std::remove_cvref_t


// To achieve a design that improves local reasoning, we should
// design algorithm interfaces to not mutate their arguments if
// at all possible.
// Unfortunately, this is often at odds with the efficiency of
// the implementation of the algorithm.

// This comes up in many areas. A few examples include:
// - matrix algorithms, where implementations often require that arguments do not alias
// - state machines, where testing is far easier if we can produce a completely new state
// - range algorithms such as the below example
// - immutable maps and sets (see the immer library: https://github.com/arximboldi/immer)

// For clarity, let us consider a small example.
// Our data structure will be a vector of integers
// Our family of algorithms will be "sorted" and "uniqued"

// In C++20 (if we ignore ranges - we are trying to illustrate an approach),
// we would probably design the interface by taking by value and returning
// by value

// (helpers)
auto read_input(int argc, char** argv) -> std::vector<int>;
auto write_output(int) -> void;

namespace traditional {
template <typename T>
auto sorted(std::vector<T> x) -> std::vector<T> {
    std::sort(x.begin(), x.end());
    return x;
}

template <typename T>
auto uniqued(std::vector<T> x) -> std::vector<T> {
    x.erase(std::unique(x.begin(), x.end()), x.end());
    return x;
}

// This pattern leads to the following usage pattern
auto usage(int argc, char** argv) {
    for (auto i : uniqued(sorted(read_input(argc, argv)))) {
        write_output(i);
    }
}

// This is good, but sooner or later someone will want to refactor this
// code like this
auto refactor(int argc, char** argv) {
    auto input = read_input(argc, argv);
    for (auto i : uniqued(sorted(input))) {
        write_output(i);
    }
}

// can you spot the bug? Non-professionals often don't!
// If you work with researchers and scientists, this kind of mistake
// is ubiquitous, and leads to serious, serious slow-downs that are often
// difficult to find if not spotted immediately.

// What can we do? We should ask the compiler to issue an error, of course.
// We can do this by explicitly asking for an rvalue reference instead of 
// taking by value.
}

namespace require_moves {
template <typename T>
auto sorted(std::vector<T>&& x) -> std::vector<T> {
    //                    ^^ new
    std::sort(x.begin(), x.end());
    return x;
}

template <typename T>
auto uniqued(std::vector<T>&& x) -> std::vector<T> {
    //                     ^^ new
    x.erase(std::unique(x.begin(), x.end()), x.end());
    return x;
}
auto read_input(int argc, char** argv) -> std::vector<int>;
auto write_output(int) -> void;

auto usage(int argc, char** argv) {
    // compiles unchanged, and has the same performance
    for (auto i : uniqued(sorted(read_input(argc, argv)))) {
        write_output(i);
    }
}

auto refactor(int argc, char** argv) {
    auto input = read_input(argc, argv);
    for (auto i : uniqued(sorted(std::move(input)))) {
        //                       ^^^^^^^^^^     ^ required, does not compile without it!
        write_output(i);
    }
}

auto print_diff(std::vector<int> const&, std::vector<int> const&) -> void;

// of course, now we have a problem. What if we actually needed the copy?
#if defined(TRY_1)
auto check_is_sorted_and_uniqued(int argc, char** argv) {
    auto input = read_input(argc, argv);
    auto sorted_and_uniqued = uniqued(sorted(input));
    //                                ^^^^^^ no matching function for call to sorted
    if (input != sorted_and_uniqued) {
        print_diff(input, sorted_and_uniqued);
        exit(1);
    }
    exit(0);
}
#elif defined(TRY_2)
// we can work around this by making a copy explicitly
auto check_is_sorted_and_uniqued(int argc, char** argv) {
    auto input = read_input(argc, argv);
    auto input_copy = input; // <- sad face; requires its own statement, ugly
    auto sorted_and_uniqued = uniqued(sorted(std::move(input_copy)));
    if (input != sorted_and_uniqued) {
        print_diff(input, sorted_and_uniqued);
        exit(1);
    }
    exit(0);
}
// Don't you think this is bad user experience, though?
// Of course it is. We just wanted to make copies explicit, not near-impossible.
// The standard library specification has had a name for this for a long time:
// they call it DECAY_COPY, which is literally what happens, but is inscrutable
// jargon.
#elif defined(TRY_3)
// Some smart users have tried and defined their own accompanying function
// to std::move for this:
auto decay_copy(auto&& x) { return std::forward<decltype(x)>(x); }

// If we have that, we could write our check_is_sorted_and_uniqued without the named copy:
auto check_is_sorted_and_uniqued(int argc, char** argv) {
    auto input = read_input(argc, argv);
    auto sorted_and_uniqued = uniqued(sorted(decay_copy(input)));
    //                                       ^^^^^^^^^^ "explicit" copy
    if (input != sorted_and_uniqued) {
        print_diff(input, sorted_and_uniqued);
        exit(1);
    }
    exit(0);
}

// This works, and in this case is optimal, but leaves something to be desired in generic cases.
// Let us try and see what happens if we try and refactor sort+unique into a generic
// algorithm

#endif
// Let's take our vector as a forwarding reference so we can reuse its memory if we own it
// We need a concept for that
template <typename T>
inline constexpr bool is_vector_v = false;
template <typename T, typename A>
inline constexpr bool is_vector_v<std::vector<T, A>> = true;
template <typename T>
concept a_vector = is_vector_v<std::remove_cvref_t<T>>;

// we take this by forwarding reference; but now,
// we make an additional move-construction if v is passed by rvalue reference
auto sorted_and_uniqued(a_vector auto&& v) {
    return uniqued(sorted(decay_copy(std::forward<decltype(v)>(v))));
    //                    ^^^^^^^^^^ an extra move construction or the needed copy-construction
    // specifically, we move-construct decay_copy's return value.
}

// so, decay_copy is clearly not optimal. We need something that won't result
// in additional move-constructions and still accomplish our "copies are explicit" goal.

// enter: decay-copy in the language!
}

namespace done_properly {
using require_moves::sorted, require_moves::uniqued, require_moves::a_vector, require_moves::print_diff;

// in regular user code, we can now use auto(x) or auto{x} instead of decay_copy:
auto check_is_sorted_and_uniqued(int argc, char** argv) {
    auto input = read_input(argc, argv);
    auto sorted_and_uniqued = uniqued(sorted(auto(input)));
    //                                       ^^^^^^^^^^^ explicit copy
    if (input != sorted_and_uniqued) {
        print_diff(input, sorted_and_uniqued);
        exit(1);
    }
    exit(0);
}

// in generic contexts
auto sorted_and_uniqued(a_vector auto&& v) {
    return uniqued(sorted(auto(std::forward<decltype(v)>(v))));
    //                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    // correct forwarded copy of an argument we took by forwarding reference
}
}

// Thanks for reading!
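
The hidden copy in traditional::refactor can be made visible with a small experiment. The sketch below is not part of Gašper's code: Tracked is a hypothetical stand-in for an expensive container that counts its copy operations, and by_value and by_rvalue are hypothetical analogues of the two interface styles shown above.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for an expensive container: it counts its copies.
struct Tracked {
    static inline int copies = 0;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept = default;
    Tracked& operator=(const Tracked&) { ++copies; return *this; }
    Tracked& operator=(Tracked&&) noexcept = default;
};

// "traditional" style: accepts lvalues and silently copies them.
inline Tracked by_value(Tracked x) { return x; }

// "require_moves" style: binds only to rvalues, so every copy is spelled out.
inline Tracked by_rvalue(Tracked&& x) { return std::move(x); }

inline int copies_with_by_value() {
    Tracked::copies = 0;
    Tracked input;
    Tracked result = by_value(input);  // compiles, but copies input
    (void)result;
    return Tracked::copies;
}

inline int copies_with_explicit_move() {
    Tracked::copies = 0;
    Tracked input;
    // by_rvalue(input);               // would not compile: input is an lvalue
    Tracked result = by_rvalue(std::move(input));
    (void)result;
    return Tracked::copies;
}

inline int copies_with_explicit_copy() {
    Tracked::copies = 0;
    Tracked input;
    Tracked result = by_rvalue(Tracked(input));  // the copy is visible in the code
    (void)result;
    return Tracked::copies;
}

inline void demo_hidden_copy() {
    assert(copies_with_by_value() == 1);       // the unintended copy
    assert(copies_with_explicit_move() == 0);  // moved, nothing copied
    assert(copies_with_explicit_copy() == 1);  // copied on request
}
```

In C++23, the expression Tracked(input) in the last function can be written as auto(input), which is the explicit copy used in the done_properly namespace.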

Here is more information about how to pass function parameters: C++ Core Guidelines: The Rules for in, out, in-out, consume, and forward Function Parameter.

Multidimensional Subscript Operator

Thanks to std::mdspan, C++23 supports multi-dimensional arrays. The C++23 core language also supports a multidimensional subscript operator to complete the feature.

// multidimensionalSubscript.cpp

#include <array>
#include <iostream>
 
template<typename T, std::size_t X, std::size_t Y>
struct Matrix {
    std::array<T, X * Y> mat{};
 
    T& operator[](std::size_t x, std::size_t y) {               // (1)
        return mat[y * X + x];
    }
};
 
int main() {

    std::cout << '\n';

    Matrix<int, 3, 3> mat;
    for (auto i : {0, 1, 2}) {
        for (auto j : {0, 1, 2}) mat[i, j] = (i * 3) + j;       // (2)
    }
    for (auto i : {0, 1, 2}) {
        for (auto j : {0, 1, 2}) std::cout << mat[i, j] << " "; // (3)
    }

    std::cout << '\n';

}

Line (1) defines the two-dimensional subscript operator for the class Matrix. Line (2) uses it to set the elements, and line (3) uses it to read them.

The program prints the numbers 0 to 8, separated by spaces, on a single line.
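
Before C++23, a common workaround for several indices was the call operator. The following sketch shows both spellings side by side and adds const overloads for read-only access. The type Grid and the function element_after_fill are hypothetical names, and the multidimensional operator[] is only compiled when the feature-test macro __cpp_multidimensional_subscript is defined.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical variant of the Matrix above, here named Grid.
template <typename T, std::size_t X, std::size_t Y>
struct Grid {
    std::array<T, X * Y> cells{};

    // Pre-C++23 workaround: the call operator accepts several indices.
    T& operator()(std::size_t x, std::size_t y) { return cells[y * X + x]; }
    const T& operator()(std::size_t x, std::size_t y) const { return cells[y * X + x]; }

#if defined(__cpp_multidimensional_subscript)
    // C++23: the subscript operator may take several indices as well.
    T& operator[](std::size_t x, std::size_t y) { return cells[y * X + x]; }
    const T& operator[](std::size_t x, std::size_t y) const { return cells[y * X + x]; }
#endif
};

// Fills a 3x3 grid like the program above and returns the element at (x, y).
inline int element_after_fill(std::size_t x, std::size_t y) {
    assert(x < 3 && y < 3);
    Grid<int, 3, 3> grid;
    for (std::size_t i = 0; i < 3; ++i) {
        for (std::size_t j = 0; j < 3; ++j) {
            grid(i, j) = static_cast<int>(i * 3 + j);
        }
    }
    const Grid<int, 3, 3>& view = grid;  // read through the const overloads
#if defined(__cpp_multidimensional_subscript)
    return view[x, y];
#else
    return view(x, y);
#endif
}
```

Calling element_after_fill(1, 2) yields 5, the value i * 3 + j stored for i = 1 and j = 2.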

static operator () and operator []

With C++23, the call operator () and the multidimensional subscript operator [] can be static. You may ask, why? The answer is typical for C++: optimization.


    Optimization

    If a non-static member function cannot be inlined, the implicit this pointer has to be passed in a register, even when the function never uses it. A static member function has no this pointer, so this overhead disappears. Lambdas that capture nothing can also be static in C++23:

    auto sum = [](auto a, auto b) static {return a + b;};
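
The same idea applies to a hand-written function object. The sketch below uses a hypothetical type Add and falls back to a const member function when the compiler does not define the feature-test macro __cpp_static_call_operator. It also shows one observable difference: the address of a static call operator is a plain function pointer, whereas the non-static version needs a pointer to member function and an object.

```cpp
#include <cassert>

// Hypothetical function object used only for illustration.
struct Add {
#if defined(__cpp_static_call_operator)
    // C++23: no implicit this pointer is passed.
    static int operator()(int a, int b) { return a + b; }
#else
    // Pre-C++23 fallback: a const member function with an implicit this.
    int operator()(int a, int b) const { return a + b; }
#endif
};

inline int add_via_object(int a, int b) {
    return Add{}(a, b);  // the call syntax is the same in both cases
}

inline int add_via_pointer(int a, int b) {
#if defined(__cpp_static_call_operator)
    int (*fn)(int, int) = &Add::operator();             // plain function pointer
    return fn(a, b);
#else
    int (Add::*fn)(int, int) const = &Add::operator();  // needs an object
    return (Add{}.*fn)(a, b);
#endif
}

inline void demo_static_call_operator() {
    assert(add_via_object(2, 3) == 5);
    assert(add_via_pointer(2, 3) == 5);
}
```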
    

    For consistency reasons, the multi-dimensional subscript operator [] can also be static.

    // multidimensionalSubscriptStatic.cpp
    
    #include <array>
    #include <iostream>
     
    template<typename T, std::size_t X, std::size_t Y>
    struct Matrix {
    
        static inline std::array<T, X * Y> mat{};               // (2)
        
        static T& operator[](std::size_t x, std::size_t y) {    // (1)
            return mat[y * X + x];
        }
    };
    
     
    int main() {
    
        std::cout << '\n';
    
        Matrix<int, 3, 3> mat;
        for (auto i : {0, 1, 2}) {
            for (auto j : {0, 1, 2}) mat[i, j] = (i * 3) + j;
        }
        for (auto i : {0, 1, 2}) {
            for (auto j : {0, 1, 2}) std::cout << mat[i, j] << " ";  
        }
    
        std::cout << '\n';
    
    }
    

    Now, the two-dimensional subscript operator [] (1) and the std::array mat (2) are static.
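
One consequence of the static data member is worth spelling out: all objects of the same instantiation share a single array. The following sketch illustrates this with a hypothetical type SharedGrid and a named static accessor at, which also works before C++23. The static operator[] is only compiled when the feature-test macro __cpp_multidimensional_subscript has at least the value 202211L.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical variant of the static Matrix above, here named SharedGrid.
template <typename T, std::size_t X, std::size_t Y>
struct SharedGrid {
    static inline std::array<T, X * Y> cells{};  // one array per instantiation

    // Named static accessor; works before C++23 as well.
    static T& at(std::size_t x, std::size_t y) { return cells[y * X + x]; }

#if defined(__cpp_multidimensional_subscript) && __cpp_multidimensional_subscript >= 202211L
    // C++23: static multidimensional subscript operator.
    static T& operator[](std::size_t x, std::size_t y) { return at(x, y); }
#endif
};

inline void demo_shared_storage() {
    SharedGrid<int, 2, 2> first;
    SharedGrid<int, 2, 2> second;
    first.at(1, 1) = 42;
    assert(second.at(1, 1) == 42);                     // same static array
    assert((SharedGrid<int, 2, 2>::at(1, 1) == 42));   // no object required
    assert((SharedGrid<long, 2, 2>::at(1, 1) == 0));   // other instantiation, other array
}
```

If each matrix should own its elements, the non-static version from the previous section is the better fit.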

    What’s next?

    My next post will be a guest post by Victor Duvanenko. He provides exhaustive performance numbers about my favorite C++17 feature: the parallel STL algorithms.



    4 replies
    1. Toto says:

      “// can you spot the bug? Non-professionals often don’t!
      // If you work with researchers and scientists, this kind of mistake
      // is ubiquitous, and leads to serious, serious slow-downs that are often
      // difficult to find if not spotted immediately.”

      Could you explain the bug, please? Is it a real bug or just a slow-down due to the copy?

    2. Rainer Grimm says:

      The crucial point is the copy call in the function refactor:

      auto input = read_input(argc, argv);

      => The issue is that the modification is not applied to the original data but to a copy of it (input).
