Memory and Performance Overhead of Smart Pointers


C++11 offers four different smart pointers. In this post I take a closer look at two of them with regard to memory and performance overhead. My first candidate, std::unique_ptr, exclusively manages the lifetime of one resource; std::shared_ptr shares the ownership of a resource with other std::shared_ptrs. I will state the result of my tests before I show you the raw numbers: there are only a few reasons left in modern C++ that justify managing memory with new and delete.

Why? Here are the numbers.

Memory overhead

std::unique_ptr

By default, std::unique_ptr needs no additional memory; it is as big as its underlying raw pointer. But what does "by default" mean? You can parametrize a std::unique_ptr with a custom deleter. If this deleter has state, for example a function pointer or a function object with data members, the std::unique_ptr has to store it and you pay for it. As I mentioned, that is the special use case.
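The size claim can be checked with sizeof. The following is a minimal sketch, not a guarantee from the standard: the deleter types FreeDeleter and CountingDeleter and the aliases are hypothetical names introduced for illustration, and the sizes are what the common standard library implementations (libstdc++, libc++, MSVC) typically produce.

```cpp
#include <cstdio>
#include <memory>

// Stateless deleter (hypothetical): an empty function object, no data members.
struct FreeDeleter {
    void operator()(int* p) const noexcept { delete p; }
};

// Stateful deleter (hypothetical): carries a pointer to a counter,
// so every unique_ptr using it has to store that pointer as well.
struct CountingDeleter {
    long* deleted;
    void operator()(int* p) const noexcept { ++*deleted; delete p; }
};

using Plain     = std::unique_ptr<int>;                    // default deleter
using Stateless = std::unique_ptr<int, FreeDeleter>;       // empty deleter
using Stateful  = std::unique_ptr<int, CountingDeleter>;   // deleter with state
using FnPtr     = std::unique_ptr<int, void (*)(int*)>;    // function pointer deleter

// Prints the sizes so they can be compared on the platform at hand.
void print_sizes() {
    std::printf("raw pointer:             %zu\n", sizeof(int*));
    std::printf("unique_ptr (default):    %zu\n", sizeof(Plain));
    std::printf("unique_ptr (stateless):  %zu\n", sizeof(Stateless));
    std::printf("unique_ptr (stateful):   %zu\n", sizeof(Stateful));
    std::printf("unique_ptr (fn pointer): %zu\n", sizeof(FnPtr));
}
```

On the implementations named above, the first three lines usually show the same value, while the last two are larger because the deleter's state is stored next to the pointer.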

In contrast to std::unique_ptr, std::shared_ptr has a small overhead.

std::shared_ptr

std::shared_ptrs share a resource. Internally they use a reference counter. When a std::shared_ptr is copied, the reference counter is incremented; when a std::shared_ptr goes out of scope, it is decremented. Therefore, std::shared_ptr needs additional memory for the reference counter. (To be precise, there is an additional reference counter for std::weak_ptr.) That is the overhead a std::shared_ptr has compared to a raw pointer.
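The counting described above can be observed with use_count(). This is a small sketch under the assumption of single-threaded code; the struct Counts and the function observe_use_count are hypothetical helpers, not part of the standard library.

```cpp
#include <cassert>
#include <memory>

// Use counts recorded at four points in the life of a shared resource.
struct Counts {
    long initial;
    long after_copy;
    long after_weak;
    long after_scope;
};

Counts observe_use_count() {
    Counts c{};
    auto first = std::make_shared<int>(2011);
    c.initial = first.use_count();             // one owner

    {
        std::shared_ptr<int> second = first;   // copy: the counter goes up
        c.after_copy = first.use_count();

        std::weak_ptr<int> observer = first;   // weak_ptr: counted separately
        c.after_weak = first.use_count();      // owner count is unchanged
        assert(!observer.expired());           // the resource is still alive
    }                                          // second leaves scope: counter goes down

    c.after_scope = first.use_count();
    return c;
}
```

In this sketch the recorded values are 1, 2, 2, and 1: the copy adds an owner, the std::weak_ptr does not, and leaving the inner scope removes the second owner again.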

Performance overhead

The performance story is a little more involved. Therefore, I let the numbers speak for themselves. A simple performance test should give an idea of the overall performance.

The performance test

In the test I allocate and deallocate memory 100'000'000 times. Of course, I'm interested in how long it takes.

// all.cpp
#include <chrono>
#include <iostream>
#include <memory>

static const long long numInt= 100000000;

int main(){

  auto start = std::chrono::steady_clock::now();   // monotonic clock, suited to measuring intervals

  for ( long long i=0 ; i < numInt; ++i){
    int* tmp(new int(i));                          // explicit allocation ...
    delete tmp;                                    // ... and explicit release
    // std::shared_ptr<int> tmp(new int(i));
    // std::shared_ptr<int> tmp(std::make_shared<int>(i));
    // std::unique_ptr<int> tmp(new int(i));
    // std::unique_ptr<int> tmp(std::make_unique<int>(i));
  }

  std::chrono::duration<double> dur= std::chrono::steady_clock::now() - start;
  std::cout << "time native: " << dur.count() << " seconds" << std::endl;

}
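As a variation on all.cpp, the five variants can also be timed in one run instead of commenting lines in and out. The following is only a sketch: time_loop and run_all are hypothetical helpers, std::make_unique requires C++14, and this is not the program that produced the numbers in this post.

```cpp
#include <cassert>
#include <chrono>
#include <cstdio>
#include <memory>

// Calls body(i) for i in [0, iterations) and returns the elapsed seconds.
// The index is narrowed to int, so iterations should fit into an int.
template <typename Body>
double time_loop(long long iterations, Body body) {
    assert(iterations >= 0);
    const auto start = std::chrono::steady_clock::now();
    for (long long i = 0; i < iterations; ++i) {
        body(static_cast<int>(i));
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

// Times each allocation strategy with the same number of iterations.
void run_all(long long n) {
    std::printf("new/delete:       %f s\n",
                time_loop(n, [](int i) { int* tmp = new int(i); delete tmp; }));
    std::printf("shared_ptr(new):  %f s\n",
                time_loop(n, [](int i) { std::shared_ptr<int> tmp(new int(i)); }));
    std::printf("make_shared:      %f s\n",
                time_loop(n, [](int i) { auto tmp = std::make_shared<int>(i); }));
    std::printf("unique_ptr(new):  %f s\n",
                time_loop(n, [](int i) { std::unique_ptr<int> tmp(new int(i)); }));
    std::printf("make_unique:      %f s\n",
                time_loop(n, [](int i) { auto tmp = std::make_unique<int>(i); }));
}
```

Calling run_all(100000000) corresponds to the iteration count used in all.cpp. Keep in mind that an optimizing compiler may remove an allocation whose result is never used, so the measured times depend on the compiler and its flags.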

 

In my test I compare the explicit calls of new and delete (lines 13 and 14) with the use of std::shared_ptr (line 15), std::make_shared (line 16), std::unique_ptr (line 17), and std::make_unique (line 18). Even in this small program, handling the smart pointers (lines 15 - 18) is a lot simpler because a smart pointer automatically releases its dynamically created int when it goes out of scope.
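The automatic release at the end of a scope can be made visible with a type that counts its live instances. This is a minimal sketch; Tracked and the two functions are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <memory>

// Increments a counter on construction and decrements it on destruction,
// so the counter shows how many Tracked objects are currently alive.
struct Tracked {
    explicit Tracked(int& live) : live_(live) { ++live_; }
    ~Tracked() { --live_; }
    Tracked(const Tracked&) = delete;
    Tracked& operator=(const Tracked&) = delete;

private:
    int& live_;
};

// unique_ptr: the object is destroyed when its single owner leaves scope.
int live_after_unique_scope() {
    int live = 0;
    {
        std::unique_ptr<Tracked> owner(new Tracked(live));
        assert(live == 1);
    }   // owner's destructor deletes the Tracked object here
    return live;
}

// shared_ptr: the object survives as long as at least one owner remains.
int live_with_second_owner() {
    int live = 0;
    std::shared_ptr<Tracked> keeper;
    {
        auto owner = std::make_shared<Tracked>(live);
        keeper = owner;   // second owner outside the inner scope
    }                     // owner is gone, keeper still holds the object
    return live;
}
```

The first function returns 0 because no delete has to be written by hand; the second returns 1 because the resource is released only after the last owner is gone.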

The two functions std::make_shared (line 16) and std::make_unique (line 18) are quite handy: each creates the corresponding smart pointer. std::make_shared in particular is very interesting. Creating a std::shared_ptr from a raw pointer requires more than one memory allocation: memory is necessary for the managed resource and for the reference counters. std::make_shared combines them into one memory allocation, and the performance benefits. std::make_unique is available since C++14; the other smart pointer functionality since C++11.
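The difference in the number of allocations can be counted by replacing the global operator new. The sketch below makes several assumptions: the counter g_allocations and the two functions are hypothetical names, the code is single-threaded, and the counts hold for the common standard library implementations in a build where the compiler does not optimize the allocations away.

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>

// Number of calls to the replaced global operator new (single-threaded use).
static std::size_t g_allocations = 0;

// Replacement allocation function: counts the call, then forwards to malloc.
void* operator new(std::size_t size) {
    ++g_allocations;
    if (void* p = std::malloc(size != 0 ? size : 1)) {
        return p;
    }
    throw std::bad_alloc{};
}

// Matching replacement deallocation functions.
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Allocations needed when the shared_ptr adopts a pointer from new:
// one for the int, one for the reference counters.
std::size_t allocations_from_new() {
    const std::size_t before = g_allocations;
    std::shared_ptr<int> p(new int(42));
    return g_allocations - before;
}

// Allocations needed with make_shared: the int and the reference counters
// are placed in a single block.
std::size_t allocations_from_make_shared() {
    const std::size_t before = g_allocations;
    std::shared_ptr<int> p = std::make_shared<int>(42);
    return g_allocations - before;
}
```

In an unoptimized build with one of the common implementations, the first function typically reports two allocations and the second one.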

I use GCC 4.9.2 and cl.exe for my performance tests. cl.exe is part of Microsoft Visual Studio 2015. Although cl.exe officially supports only C++11, the helper function std::make_unique is already available. Therefore, I can run all variants of my performance test, both with maximum optimization and without optimization. I have to admit that my Windows PC is not as powerful as my Linux PC. Therefore, I'm only interested in the comparison between the raw memory allocation and the smart pointers on each platform; I'm not comparing Windows and Linux.

Here are the raw numbers.

The raw numbers

For simplicity, I will not show the screenshots of the program runs and present only the table holding the numbers.

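[Table image "comparisonEng": run times of the five variants with maximum optimization and without optimization, for GCC and cl.exe]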

I want to draw a few interesting conclusions from the table.

  1. Optimization matters. In the case of std::make_shared, the program with maximum optimization is almost 10 times faster. This observation also holds for the other smart pointers: the optimized program is about 2 to 3 times faster. Interestingly, it does not hold for new and delete.
  2. std::unique_ptr, std::make_unique, and with small deviations std::make_shared are in the same performance range as new and delete.
  3. You should not use std::shared_ptr and std::make_shared without optimization. Even with optimization, std::shared_ptr is about two times slower than new and delete.

My conclusion

  • std::unique_ptr has no memory or performance overhead compared to the explicit use of new and delete. That is great because std::unique_ptr automatically manages the lifetime of its resource at no additional cost.
  • My conclusion for std::shared_ptr is not so easy. Admittedly, std::shared_ptr is about two times slower than new and delete, and even std::make_shared has a performance overhead of about 10%. But this calculation rests on a wrong assumption, because std::shared_ptr models shared ownership. That means only the first std::shared_ptr has to shoulder the performance and memory overhead of setting up the management infrastructure; additional shared pointers share it. In particular, only one memory allocation for the reference counters is necessary, however many std::shared_ptrs share the resource.

Now I can repeat myself: there are only a few reasons left in modern C++ that justify managing memory with new and delete.

What's next?

After this plea for smart pointers, I will present the details of std::unique_ptr in the next post.

 

 

 

 

 

 

 

 

 


 

Comments   

0 #1 Thiago Adams 2016-12-07 10:48
When a shared_ptr needs to be converted to shared or shared converted to shared then you have more overhead.

I did some tests in 2011.

http://www.thradams.com/codeblog/smartptrperf.htm

The unique_ptr has the same problem but I suspect (didn't check) that the compiler can easily remove the conceptual overhead of creating a new instance of unique_ptr.

There are more interesting performance tests, for instance, compare vector of unique_ptr against vector or normal pointers.
+1 #2 Kevin Albertson 2016-12-07 23:43
Nice post!

If you're gaming for performance, plain malloc/free seems to outperform new/delete (although it is not usually good practice to put malloc/free in C++ code). I ran the test locally using GCC (took the median of five trials):

new/delete: 2.776 seconds
malloc/free: 2.486

The reasons seem to be explained here: http://stackoverflow.com/questions/2570552/why-are-new-delete-slower-than-malloc-free
0 #3 Hering S Cheng 2016-12-08 03:47
Allocation and deallocation is one aspect of performance that users of smart pointers need to consider. Another aspect, some would argue a more important one, is the performance cost of passing shared_ptr by value to functions. In my experience, this is where the cost of smart pointers really bites. I'd be curious to see if you would conduct this experiment.
+1 #4 Mark Abraham 2016-12-08 05:18
Performance results from unoptimized builds are intrinsically not appropriate for an article focussed on performance. Simply observe that they're much slower and don't even show the numbers.
0 #5 Rainer Grimm 2016-12-08 10:56
Quoting Hering S Cheng:
Allocation and deallocation is one aspect of performance that users of smart pointers need to consider. Another aspect, some would argue a more important one, is the performance cost of passing shared_ptr by value to functions. In my experience, this is where the cost of smart pointers really bites. I'd be curious to see if you would conduct this experiment.

I already wrote this post in German. In about a week, I will translate it to English.
