C++11 offers four different smart pointers. In this post, I take a closer look at the memory and performance overhead of two of them. My first candidate, std::unique_ptr, exclusively takes care of the lifetime of one resource; std::shared_ptr shares the ownership of a resource with other std::shared_ptrs. I will state the result of my tests before I show you the raw numbers: There are only a few reasons in modern C++ that justify managing memory with new and delete.
Why? Here are the numbers.
Memory overhead
std::unique_ptr
std::unique_ptr needs, by default, no additional memory. That means a std::unique_ptr is as big as its underlying raw pointer. But what does default mean? You can parametrize a std::unique_ptr with a special deleter. If that deleter carries state (a function pointer, for example), the std::unique_ptr has to store it, and you pay for it; a stateless function object typically adds nothing. In any case, a custom deleter is the special use case.
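The size claims above can be checked with sizeof. The following is a minimal sketch, not a guarantee: the C++ standard does not prescribe these sizes, but they hold on the mainstream standard library implementations I know. The names StatelessDeleter, delete_int, and FunctionPtrOwner are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stateless deleter: an empty function object.
struct StatelessDeleter {
    void operator()(int* p) const noexcept { delete p; }
};

// Hypothetical deleter function; a pointer to it is state that must be stored.
void delete_int(int* p) { delete p; }

using FunctionPtrOwner = std::unique_ptr<int, void (*)(int*)>;

// Default deleter: typically the same size as the raw pointer.
bool default_is_pointer_sized() {
    return sizeof(std::unique_ptr<int>) == sizeof(int*);
}

// Empty function object as deleter: typically still pointer-sized.
bool stateless_is_pointer_sized() {
    return sizeof(std::unique_ptr<int, StatelessDeleter>) == sizeof(int*);
}

// Function pointer as deleter: the unique_ptr has to store it as well.
bool function_pointer_deleter_is_larger() {
    return sizeof(FunctionPtrOwner) > sizeof(int*);
}

// Usage: the deleter is passed next to the resource and runs at scope exit.
int use_function_pointer_deleter() {
    FunctionPtrOwner owner(new int(7), &delete_int);
    assert(owner != nullptr);
    return *owner;
}
```

On a typical 64-bit platform, I would expect the first two types to occupy 8 bytes and the third one 16 bytes, but that is an implementation detail.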
In contrast to std::unique_ptr, std::shared_ptr has a little overhead.
std::shared_ptr
std::shared_ptrs share a resource. Internally, they use a reference counter. If a std::shared_ptr is copied, the reference counter is increased; if a std::shared_ptr goes out of scope, the reference counter is decreased. Therefore, std::shared_ptr needs additional memory for the reference counter. (To be precise, there is an additional reference counter for std::weak_ptr.) That's the overhead a std::shared_ptr has compared to a raw pointer.
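The behavior of the reference counter can be observed with use_count(). Here is a small sketch; the struct Counts and the function names are hypothetical helpers for this illustration. The size check at the end is an assumption about mainstream implementations (one pointer to the resource, one to the control block), not something the standard guarantees.

```cpp
#include <cassert>
#include <memory>

// Reference counts observed at four points in time.
struct Counts {
    long initial;      // after creating the first owner
    long after_copy;   // while a copy is alive
    long after_scope;  // after the copy went out of scope
    long after_weak;   // while a std::weak_ptr observes the resource
};

Counts observe_reference_count() {
    std::shared_ptr<int> first = std::make_shared<int>(2011);
    Counts counts{};
    counts.initial = first.use_count();
    {
        std::shared_ptr<int> second = first;  // copy: counter is increased
        assert(*second == 2011);
        counts.after_copy = first.use_count();
    }                                         // second is destroyed: counter is decreased
    counts.after_scope = first.use_count();

    std::weak_ptr<int> observer = first;      // uses the separate weak counter
    counts.after_weak = first.use_count();    // the owner count is unchanged
    return counts;
}

// Typical layout: pointer to the resource plus pointer to the control block.
bool shared_ptr_is_two_pointers() {
    return sizeof(std::shared_ptr<int>) == 2 * sizeof(int*);
}
```

The counters themselves live in the control block on the heap, so they do not show up in sizeof(std::shared_ptr<int>).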
The story of the performance is a little more involved. Therefore, I let the numbers speak for themselves. A simple performance test should give an idea of the overall performance.
In the test, I allocate and deallocate memory 100'000'000 times. Of course, I'm interested in how long it takes.
```cpp
// all.cpp

#include <chrono>
#include <iostream>
#include <memory>   // needed for the smart pointer variants in the loop
static const long long numInt= 100000000;

int main(){

  auto start = std::chrono::system_clock::now();

  for ( long long i=0 ; i < numInt; ++i){   // enable exactly one variant per run
    int* tmp(new int(i));
    delete tmp;
    // std::shared_ptr<int> tmp(new int(i));
    // std::shared_ptr<int> tmp(std::make_shared<int>(i));
    // std::unique_ptr<int> tmp(new int(i));
    // std::unique_ptr<int> tmp(std::make_unique<int>(i));
  }

  std::chrono::duration<double> dur= std::chrono::system_clock::now() - start;
  std::cout << "time native: " << dur.count() << " seconds" << std::endl;

}
```
In my test, I compare the explicit calls of new and delete with the usage of std::shared_ptr, std::make_shared, std::unique_ptr, and std::make_unique; each smart pointer variant is one of the commented-out lines in the loop. In this small program, handling the smart pointers is much simpler because a smart pointer automatically releases its dynamically created int when it goes out of scope.
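Instead of commenting the variants in and out by hand, they could also be timed in one run. The following is only a sketch of such a harness, not the program I used for the measurements: the helper time_loop and the time_* functions are hypothetical names, std::chrono::steady_clock replaces std::chrono::system_clock because it is monotonic, and std::make_unique requires C++14.

```cpp
#include <cassert>
#include <chrono>
#include <memory>

// Calls body(i) for i in [0, iterations) and returns the elapsed seconds.
template <typename Body>
double time_loop(long long iterations, Body body) {
    assert(iterations >= 0);
    const auto start = std::chrono::steady_clock::now();
    for (long long i = 0; i < iterations; ++i) {
        body(static_cast<int>(i));
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

// One function per variant of the test program.
// Note: an optimizer is allowed to remove allocations it can prove unused,
// so the numbers should always be read together with the compiler flags.
double time_new_delete(long long n) {
    return time_loop(n, [](int i) { int* tmp = new int(i); delete tmp; });
}

double time_shared_ptr(long long n) {
    return time_loop(n, [](int i) { std::shared_ptr<int> tmp(new int(i)); });
}

double time_make_shared(long long n) {
    return time_loop(n, [](int i) { auto tmp = std::make_shared<int>(i); });
}

double time_unique_ptr(long long n) {
    return time_loop(n, [](int i) { std::unique_ptr<int> tmp(new int(i)); });
}

double time_make_unique(long long n) {
    return time_loop(n, [](int i) { auto tmp = std::make_unique<int>(i); });
}
```

Calling, for example, time_make_shared(100000000) from main and printing the result would correspond to one run of the program above with the std::make_shared line enabled.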
The functions std::make_shared and std::make_unique are quite handy: each of them creates the respective smart pointer. In particular, std::make_shared is very interesting. Creating a std::shared_ptr from a raw pointer requires two memory allocations: one for the managed resource and one for the reference counters. std::make_shared makes one memory allocation out of them, and the performance benefits. std::make_unique is available since C++14; the other smart pointer functionality since C++11.
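The difference in the number of allocations can be made visible by replacing the global operator new with a counting version. This is a sketch under assumptions: the counter g_allocations and the two measuring functions are hypothetical names, the counter is not thread-safe, and the expected values (two allocations versus one) describe what mainstream standard library implementations do. The standard only recommends a single allocation for std::make_shared.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>

// Counter for this illustration only; not thread-safe.
static std::size_t g_allocations = 0;

// Replacement for the global allocation function: counts every call.
void* operator new(std::size_t size) {
    ++g_allocations;
    if (void* p = std::malloc(size == 0 ? 1 : size)) {
        return p;
    }
    throw std::bad_alloc{};
}

// Matching replacements for the global deallocation functions.
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Resource and control block are allocated separately.
std::size_t allocations_with_new() {
    const std::size_t before = g_allocations;
    std::shared_ptr<int> p(new int(42));
    assert(*p == 42);
    return g_allocations - before;
}

// Resource and control block share one allocation.
std::size_t allocations_with_make_shared() {
    const std::size_t before = g_allocations;
    std::shared_ptr<int> p = std::make_shared<int>(42);
    assert(*p == 42);
    return g_allocations - before;
}
```

With GCC, Clang, or cl.exe, I would expect allocations_with_new() to report 2 and allocations_with_make_shared() to report 1.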
I use GCC 4.9.2 and cl.exe for my performance tests. The cl.exe is part of Microsoft Visual Studio 2015. Although cl.exe officially supports only C++11, the helper function std::make_unique is already available. Therefore, I can run the complete performance test with both compilers, once with maximum optimization and once without optimization. I must admit that my Windows PC is less powerful than my Linux PC. Therefore, I'm only interested in comparing the raw memory allocation with the smart pointers on each platform; I'm not comparing Windows and Linux.
Here are the raw numbers.
The raw numbers
For simplicity, I will not show the screenshots of the program runs and present only the table holding the numbers.

I want to draw a few interesting conclusions from the table.
- Optimization matters. In the case of std::make_shared, the program with maximum optimization is almost ten times faster. This observation also holds for the other smart pointers: the optimized program is about 2 to 3 times faster. Interestingly, it does not hold for new and delete.
- std::unique_ptr, std::make_unique, and, with minor deviations, std::make_shared are in the same performance range as new and delete.
- You should not use std::shared_ptr and std::make_shared without optimization. Even with optimization, std::shared_ptr is about two times slower than new and delete.
My conclusion
- std::unique_ptr has no memory or performance overhead compared to the explicit usage of new and delete. That is great because std::unique_ptr offers an excellent benefit by automatically managing the lifetime of its resource without any additional cost.
- My conclusion for std::shared_ptr is not so easy. Admittedly, std::shared_ptr is about two times slower than new and delete. Even std::make_shared has a performance overhead of about 10%. But this calculation is based on a wrong assumption because std::shared_ptr models shared ownership. That means only the first std::shared_ptr has to shoulder the performance and memory overhead of setting up the shared infrastructure. Each additional shared pointer shares the infrastructure for managing the underlying object. In particular, the memory allocation happens only once, no matter how many std::shared_ptrs share the resource.
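That additional owners do not trigger further allocations can be checked with a counting allocator passed to std::allocate_shared. This is a minimal sketch, not part of my measurements: CountingAllocator, SharingCost, and measure_sharing are hypothetical names, and the single allocation for the first owner is what mainstream implementations do.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <vector>

// Minimal allocator that counts how often allocate() is called.
template <typename T>
struct CountingAllocator {
    using value_type = T;
    std::size_t* allocations;  // counter owned by the caller

    explicit CountingAllocator(std::size_t* counter) noexcept
        : allocations(counter) {}

    // Needed because the library rebinds the allocator to its control block type.
    template <typename U>
    CountingAllocator(const CountingAllocator<U>& other) noexcept
        : allocations(other.allocations) {}

    T* allocate(std::size_t n) {
        ++*allocations;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};

template <typename T, typename U>
bool operator==(const CountingAllocator<T>& a, const CountingAllocator<U>& b) noexcept {
    return a.allocations == b.allocations;
}
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>& a, const CountingAllocator<U>& b) noexcept {
    return !(a == b);
}

struct SharingCost {
    std::size_t after_first;   // allocations after creating the first owner
    std::size_t after_copies;  // allocations after creating all further owners
    long owners;               // reference count at the end
};

SharingCost measure_sharing(std::size_t copies) {
    std::size_t allocations = 0;  // declared first, so it outlives the control block
    std::shared_ptr<int> first =
        std::allocate_shared<int>(CountingAllocator<int>(&allocations), 42);
    assert(*first == 42);

    SharingCost cost{};
    cost.after_first = allocations;

    // Further owners only copy two pointers and bump the reference counter.
    // (The vector's own storage uses std::allocator and is not counted.)
    std::vector<std::shared_ptr<int>> others(copies, first);
    cost.after_copies = allocations;
    cost.owners = first.use_count();
    return cost;
}
```

For measure_sharing(1000), I would expect one allocation after the first owner, still one after the copies, and a reference count of 1001. Each copy still pays for incrementing the reference counter, but not for memory allocation.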
Now I can repeat myself: There are only a few reasons in modern C++ that justify managing memory with new and delete.
What's next?
After this plea for smart pointers, I will present the details of std::unique_ptr in the next post.

Comments
I did some tests in 2011.
http://www.thradams.com/codeblog/smartptrperf.htm
The unique_ptr has the same problem but I suspect (didn't check) that the compiler can easily remove the conceptual overhead of creating a new instance of unique_ptr.
There are more interesting performance tests, for instance, comparing a vector of unique_ptr against a vector of normal pointers.
If you're gaming for performance, plain malloc/free seems to outperform new/delete (although it is not usually good practice to put malloc/free in C++ code). I ran the test locally using GCC (took the median of five trials):
new/delete: 2.776 seconds
malloc/free: 2.486
The reasons seem to be explained here: http://stackoverflow.com/questions/2570552/why-are-new-delete-slower-than-malloc-free
I already wrote this post in German. In about a week, I will translate it to English.