C++ Core Guidelines: Rules for Strings

The C++ Core Guidelines use the term string for a sequence of characters. Consequently, the guidelines cover C-strings, C++-strings, the C++17 std::string_view, and std::byte.

 


In this post, I will refer to the guidelines only loosely and ignore the strings that are part of the Guidelines Support Library, such as gsl::string_span, zstring, and czstring. For short, in this post I call a std::string a C++-string and a const char* a C-string.

Let me start with the first rule:

SL.str.1: Use std::string to own character sequences

Maybe you know another string that owns its character sequence: a C-string. Don't use a C-string! Why? Because you have to take care of the memory management, the string termination character, and the length of the string yourself.

 

// stringC.c

#include <stdio.h>
#include <string.h>
 
int main( void ){
 
  char text[10];
 
  strcpy(text, "The Text is too long for text.");   // (1) the literal does not fit into text
  printf("strlen(text): %zu\n", strlen(text));      // (2) reads the overflowed buffer
  printf("%s\n", text);
 
  text[sizeof(text)-1] = '\0';
  printf("strlen(text): %zu\n", strlen(text));
 
  return 0;
}

 

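The simple program stringC.c has undefined behaviour in line (1) and line (2): strcpy writes past the end of text, and strlen then reads the overflowed buffer. Compiling it with a rusty GCC 4.8 seems to work fine, but that is just luck.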

The C++ variant does not have the same issues.

// stringCpp.cpp

#include <iostream>
#include <string>

int main(){
 
  std::string text{"The Text is not too long."};  
 
  std::cout << "text.size(): " << text.size() << std::endl;
  std::cout << text << std::endl;
 
  text +=" And can still grow!";
 
  std::cout << "text.size(): " << text.size() << std::endl;
  std::cout << text << std::endl;
 
}

 

The output of the program should not surprise you.


In the case of a C++-string, I cannot make these errors because the C++ runtime takes care of the memory management and the termination character. Additionally, if you access the elements of the C++-string with the member function at instead of the index operator, an out-of-range access is detected and reported with a std::out_of_range exception instead of causing undefined behaviour. You can read the details about at in my previous post: C++ Core Guidelines: Avoid Bounds Errors.

Do you know what was strange in C++, including C++11? There was no string literal of type std::string; you always had to start from a C-string literal. This is strange because we want to get rid of the C-string. This inconsistency is gone with C++14.

SL.str.12: Use the s suffix for string literals meant to be standard-library strings 

With C++14 we got C++-string literals. A C++-string literal is a C-string literal with the suffix s: "cStringLiteral"s.

Let me show you an example which makes my point: C-string literals and C++-string literals are different.

 

// stringLiteral.cpp

#include <iostream>
#include <string>
#include <utility>

int main(){
    
    using namespace std::string_literals;                         // (1)

    std::string hello = "hello";                                  // (2)
    
    auto firstPair = std::make_pair(hello, 5);
    auto secondPair = std::make_pair("hello", 15);                // (3)
    // auto secondPair = std::make_pair("hello"s, 15);            // (4)
    
    if (firstPair < secondPair) std::cout << "true" << std::endl; // (5)
    
}

It's a pity; I need the using directive for the namespace std::string_literals in line (1) to use the C++-string literals. Line (2) is the critical line in the example. I use the C-string literal "hello" to create a C++-string. This is the reason that the type of firstPair is (std::string, int), but the type of secondPair is (const char*, int). In the end, the comparison in line (5) does not compile, because you cannot compare pairs of different types.

When I use the C++-string literal in line (4) instead of the C-string literal in line (3), the program behaves as expected: it compiles and prints true.

C++-string literals were a C++14 feature. Let's jump three years ahead. With C++17 we got std::string_view and std::byte. I have already written about std::string_view in detail. Therefore, I will only recap the most important facts.

SL.str.2: Use std::string_view or gsl::string_span to refer to character sequences

Okay, a std::string_view only refers to the character sequence. To say it more explicitly: a std::string_view does not own the character sequence. It represents a view of a sequence of characters. This sequence of characters can be a C++-string or a C-string. A std::string_view only needs two pieces of information: the pointer to the character sequence and its length. It supports the reading part of the interface of std::string. In addition to what std::string offers, std::string_view has two modifying operations: remove_prefix and remove_suffix. Both only shrink the view; the underlying characters stay untouched.

Maybe you wonder: why do we need a std::string_view? A std::string_view is quite cheap to copy and allocates no memory. My previous post C++17 - Avoid Copying with std::string_view shows the impressive performance numbers of a std::string_view.

As I already mentioned, with C++17 we also got std::byte.

SL.str.4: Use char* to refer to a single character and SL.str.5: Use std::byte to refer to byte values that do not necessarily represent characters

If you don't follow rule SL.str.4 and treat a const char* as a C-string, you may end up with critical issues such as the following one.

 

#include <iostream>

char arr[] = {'a', 'b', 'c'};   // no terminating '\0'

void print(const char* p)
{
    std::cout << p << '\n';     // reads until it finds a '\0'
}

void use()
{
    print(arr);   // run-time error; potentially very bad
}

arr decays to a pointer when used as an argument of the function print. The behaviour is undefined because arr is not zero-terminated, so the output operator reads past the end of the array. If you now have the impression that you can use std::byte as a character, you are wrong.

std::byte is a distinct type implementing the concept of a byte as specified in the C++ language definition. This means a byte is not an integer or a character and is, therefore, less open to programmer errors. Its job is to access object storage. Consequently, its interface consists only of bitwise operations.

 

namespace std { 

    template <class IntType> 
        constexpr byte operator<<(byte b, IntType shift); 
    template <class IntType> 
        constexpr byte operator>>(byte b, IntType shift); 
    constexpr byte operator|(byte l, byte r); 
    constexpr byte operator&(byte l, byte r); 
    constexpr byte operator~(byte b); 
    constexpr byte operator^(byte l, byte r); 

} 

 

You can use the function std::to_integer<IntType>(b) to convert a std::byte b to an integer type and the call std::byte{integer} to do it the other way around. integer has to be a non-negative value not greater than std::numeric_limits<unsigned char>::max().

What's next?

I'm almost done with the rules for the standard library. Only a few rules for iostreams and the C standard library are left. So you know what I will write about in my next post.

 

 

 
