Specialities of std::shared_ptr


After sketching the big picture of std::shared_ptr in the last post, I want to present two special aspects of this smart pointer in this post. First, I show with std::enable_shared_from_this how to create a std::shared_ptr from an object; second, I'm interested in the answer to the question: Should a function take a std::shared_ptr by copy or by reference? The numbers are quite interesting.

 

std::shared_ptr from this

Thanks to std::enable_shared_from_this, you can create objects that return a std::shared_ptr to themselves. To do so, the class of the objects has to be publicly derived from std::enable_shared_from_this. The class then has the method shared_from_this available, which you can use to create a std::shared_ptr from this. The object itself must already be owned by a std::shared_ptr when shared_from_this is called.

The program shows the theory in practice.

 1  // enableShared.cpp
 2
 3  #include <iostream>
 4  #include <memory>
 5
 6  class ShareMe: public std::enable_shared_from_this<ShareMe>{
 7  public:
 8    std::shared_ptr<ShareMe> getShared(){
 9      return shared_from_this();
10    }
11  };
12
13  int main(){
14
15    std::cout << std::endl;
16
17    std::shared_ptr<ShareMe> shareMe(new ShareMe);
18    std::shared_ptr<ShareMe> shareMe1= shareMe->getShared();
19    {
20      auto shareMe2(shareMe1);
21      std::cout << "shareMe.use_count(): "  << shareMe.use_count() << std::endl;
22    }
23    std::cout << "shareMe.use_count(): "  << shareMe.use_count() << std::endl;
24
25    shareMe1.reset();
26
27    std::cout << "shareMe.use_count(): "  << shareMe.use_count() << std::endl;
28
29    std::cout << std::endl;
30
31  }

 

The smart pointer shareMe (line 17) and its copies shareMe1 (line 18) and shareMe2 (line 20) reference the very same resource and increment and decrement the same reference counter. Accordingly, the program prints a use count of 3, then 2, and finally 1.

(Screenshot: output of enableShared.cpp)

The call shareMe->getShared() in line 18 creates a new smart pointer. Internally, getShared() uses the function shared_from_this (line 9).

There is something very special about the class ShareMe.
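To make the point about shared ownership concrete, here is a minimal sketch. The helper function sharesOwnership is a hypothetical name introduced only for this illustration; the class ShareMe is the one from the program above. The sketch checks that the pointer returned by getShared() refers to the same object and takes part in the same reference count as the original owner.

```cpp
// sharedFromThisSketch.cpp (illustrative file name, not part of the original post)

#include <cassert>
#include <memory>

class ShareMe: public std::enable_shared_from_this<ShareMe>{
public:
  std::shared_ptr<ShareMe> getShared(){
    // Note: writing std::shared_ptr<ShareMe>(this) here instead would create
    // a second, independent control block and delete the object twice.
    return shared_from_this();
  }
};

// Hypothetical helper: returns true if the pointer obtained from getShared()
// co-owns the object together with the original owner.
bool sharesOwnership(){
  auto owner= std::make_shared<ShareMe>();
  auto viaThis= owner->getShared();

  // Same object and same reference counter: both pointers report two owners.
  return owner.get() == viaThis.get()
      && owner.use_count() == 2
      && viaThis.use_count() == 2;
}
```

Calling sharesOwnership() from main should return true, which mirrors what the use_count() output of the program above shows.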

Curiously recurring template pattern 

ShareMe is the derived class and, at the same time, the type argument (line 6) of the base class std::enable_shared_from_this. This pattern is called CRTP, an abbreviation for Curiously Recurring Template Pattern. There is no infinite recursion, because the methods of the base class are instantiated only when they are called. CRTP is an often-used idiom in C++ to implement static polymorphism. In contrast to dynamic polymorphism with virtual methods at run time, static polymorphism takes place at compile time.
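A small sketch may help to see how CRTP yields static polymorphism. All names in it (Greeter, English, German, name, callGreet) are made up for this illustration and do not appear in the standard library. The base class template casts this to the derived type, so the call is resolved at compile time without any virtual function.

```cpp
// crtpSketch.cpp (illustrative file name)

#include <cassert>
#include <string>

// Base class template: Derived is the class that inherits from Greeter<Derived>.
template <typename Derived>
struct Greeter{
  std::string greet() const {
    // Resolved at compile time; no virtual dispatch is involved.
    return static_cast<const Derived*>(this)->name();
  }
};

struct English: Greeter<English>{
  std::string name() const { return "Hello"; }
};

struct German: Greeter<German>{
  std::string name() const { return "Hallo"; }
};

// Works for every class that derives from Greeter<T>.
template <typename T>
std::string callGreet(const Greeter<T>& greeter){
  return greeter.greet();
}
```

With these definitions, callGreet(English()) should give "Hello" and callGreet(German()) should give "Hallo". std::enable_shared_from_this uses the same trick so that shared_from_this can return a std::shared_ptr of the derived type.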

But now, back to the std::shared_ptr.

std::shared_ptr as function argument

This brings us to a quite interesting question: Should a function take its std::shared_ptr by copy or by reference? But first: Why should you care? Does it matter whether a function takes its std::shared_ptr by copy or by reference? Under the hood, everything is a reference. My definite answer is yes and no. Semantically, it makes no difference. From the performance perspective, it makes a difference.

 1  // refVersusCopySharedPtr.cpp
 2
 3  #include <memory>
 4  #include <iostream>
 5
 6  void byReference(std::shared_ptr<int>& refPtr){
 7    std::cout << "refPtr.use_count(): " << refPtr.use_count() << std::endl;
 8  }
 9
10  void byCopy(std::shared_ptr<int> cpyPtr){
11    std::cout << "cpyPtr.use_count(): " << cpyPtr.use_count() << std::endl;
12  }
13
14
15  int main(){
16
17      std::cout <<  std::endl;
18
19      auto shrPtr= std::make_shared<int>(2011);
20
21      std::cout << "shrPtr.use_count(): " << shrPtr.use_count() << std::endl;
22
23      byReference(shrPtr);
24      byCopy(shrPtr);
25
26      std::cout << "shrPtr.use_count(): " << shrPtr.use_count() << std::endl;
27
28      std::cout << std::endl;
29
30  }

 

The functions byReference (lines 6 - 8) and byCopy (lines 10 - 12) take their std::shared_ptr by reference and by copy, respectively. The output of the program emphasizes the key point.

(Screenshot: output of refVersusCopySharedPtr.cpp)

The function byCopy takes its std::shared_ptr by copy. Therefore, the reference count is increased to 2 in the function body and decreased to 1 afterwards. The question now is: How expensive is the incrementing and decrementing of the reference counter? Because the incrementing of the reference counter is an atomic operation, I expect a measurable difference. To be precise: the incrementing of the reference counter is an atomic operation with relaxed semantics; the decrementing is an atomic operation with acquire-release semantics.
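The memory orderings mentioned above can be sketched with a tiny reference counter. This is not the code of any real standard library implementation; the class RefCount and its member names are assumptions made for this illustration. It only shows where the relaxed increment and the acquire-release decrement would sit.

```cpp
// refCountSketch.cpp (illustrative file name)

#include <atomic>
#include <cassert>

// Hypothetical, simplified counter; real control blocks also handle weak counts.
class RefCount{
public:
  // A new owner only needs the increment to be atomic, no ordering is required.
  void increment(){
    count_.fetch_add(1, std::memory_order_relaxed);
  }

  // Returns true if the caller released the last reference.
  // Acquire-release makes all earlier writes to the resource visible
  // to the thread that finally destroys it.
  bool decrement(){
    return count_.fetch_sub(1, std::memory_order_acq_rel) == 1;
  }

  long get() const {
    return count_.load(std::memory_order_relaxed);
  }

private:
  std::atomic<long> count_{1};   // starts with one owner
};
```

Passing a std::shared_ptr by copy costs roughly one such increment and one such decrement per call, which is the overhead the following measurement tries to make visible.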

Let's have a look at the numbers.

Performance comparison

Whoever knows my performance comparisons knows that my Linux PC is more powerful than my Windows PC. Therefore, you have to take the absolute numbers with a grain of salt. I use GCC 4.8 and Microsoft Visual Studio 15. Additionally, I compile the program with maximum optimization and without optimization. First, my small test program.

In the test program, I hand over the std::shared_ptr by reference and by copy and use it to initialize another std::shared_ptr. This was the simplest scenario to outwit the optimizer. As a consequence, byReference copies the std::shared_ptr once per call and byCopy twice. I invoke each function 100 million times.

The program

 

// performanceRefCopyShared.cpp

#include <chrono>
#include <memory>
#include <iostream>

constexpr long long mill= 100000000;

void byReference(std::shared_ptr<int>& refPtr){
  volatile auto tmpPtr(refPtr);
}

void byCopy(std::shared_ptr<int> cpyPtr){
  volatile auto tmpPtr(cpyPtr);
}


int main(){

    std::cout <<  std::endl;
    
    auto shrPtr= std::make_shared<int>(2011);
   
    auto start = std::chrono::steady_clock::now();
  
    for (long long i= 0; i < mill; ++i) byReference(shrPtr);
    
    std::chrono::duration<double> dur= std::chrono::steady_clock::now() - start;
    std::cout << "by reference: " << dur.count() << " seconds" << std::endl;
    
    start = std::chrono::steady_clock::now();
    
    for (long long i= 0; i < mill; ++i){
        byCopy(shrPtr);
    }
    
    dur= std::chrono::steady_clock::now() - start;
    std::cout << "by copy: " << dur.count() << " seconds" << std::endl;
    
    std::cout << std::endl;
    
}

 

First, the program without optimization.

Without optimization

(Screenshots: timings without optimization on Linux and on Windows)

And now the one with maximum optimization.

With maximum optimization

(Screenshots: timings with maximum optimization on Linux and on Windows)

My conclusion

(Figure: table comparing the measured timings)

The raw numbers of the program performanceRefCopyShared.cpp send a clear message.

  • The byReference function is about 2 times faster than its counterpart byCopy. With maximum optimization on Linux, it is about 5 times faster.
  • Maximum optimization gives a performance boost by a factor of 3 on Windows and by a factor of 30 - 80 on Linux.
  • Without optimization, the Windows application is faster than the Linux application. That's interesting because my Windows PC is slower.

What's next?

The classic issue of smart pointers that use reference counting is cyclic references. Here, std::weak_ptr comes to our rescue. In the next post, I will have a closer look at std::weak_ptr and show you how to break cyclic references.
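As a small preview, the following sketch shows the problem and the remedy side by side. The types StrongNode and WeakNode, the counter liveNodes, and both functions are hypothetical names chosen for this illustration. Two nodes that point to each other with std::shared_ptr keep each other alive; with std::weak_ptr they are destroyed as expected.

```cpp
// cycleSketch.cpp (illustrative file name)

#include <cassert>
#include <memory>

static int liveNodes= 0;   // number of nodes that are currently alive

struct StrongNode{
  std::shared_ptr<StrongNode> other;   // owning link: can form a cycle
  StrongNode(){ ++liveNodes; }
  ~StrongNode(){ --liveNodes; }
};

struct WeakNode{
  std::weak_ptr<WeakNode> other;       // non-owning link: no cycle of owners
  WeakNode(){ ++liveNodes; }
  ~WeakNode(){ --liveNodes; }
};

// Returns how many nodes are still alive after the owners left their scope.
int strongCycleLiveCount(){
  liveNodes= 0;
  std::weak_ptr<StrongNode> probe;     // only used to clean up afterwards
  {
    auto a= std::make_shared<StrongNode>();
    auto b= std::make_shared<StrongNode>();
    a->other= b;
    b->other= a;
    probe= a;
  }
  int stillAlive= liveNodes;           // both nodes keep each other alive
  if (auto a= probe.lock()) a->other.reset();   // break the cycle by hand
  return stillAlive;
}

int weakCycleLiveCount(){
  liveNodes= 0;
  {
    auto a= std::make_shared<WeakNode>();
    auto b= std::make_shared<WeakNode>();
    a->other= b;
    b->other= a;
  }
  return liveNodes;                    // both nodes are already destroyed
}
```

Under these assumptions, strongCycleLiveCount() should report 2 nodes that survived their owners, while weakCycleLiveCount() should report 0.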

 

 

 

 

 

 

 

 

 

 

 


 

 

 
