Associative Containers - A simple Performance Comparison


Before I take a deeper look into the interface of the hash tables - officially called unordered associative containers - I will first compare the performance of the associative containers. The best candidates are std::unordered_map and its ordered counterpart std::map, because both are used most frequently.

 

Admittedly, the eight variations of associative containers are a little bit confusing. In addition, the name component unordered of the new ones is not so easy to read and to write. The reason for the special names is easily explained. On the one hand, the unordered associative containers came too late for the C++98 standard. On the other hand, they were missed so badly that most implementations provided them on their own. Therefore, the simple names were already taken, and the C++ standardization committee had to use more elaborate names. But it is very nice that the names of the unordered associative containers follow a simple pattern: each ordered container gets an unordered counterpart whose name carries the prefix unordered_. I presented the pattern in the post Hash tables.
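
To make the pattern visible, here is a minimal sketch that only declares all eight associative containers side by side; it is just an illustration of the naming scheme, not part of the benchmark below.

// associativeContainers.cpp - only shows the naming pattern

#include <map>
#include <set>
#include <unordered_map>
#include <unordered_set>

int main(){

  // ordered associative containers (since C++98)
  std::map<int,int>      ordMap;
  std::multimap<int,int> ordMultiMap;
  std::set<int>          ordSet;
  std::multiset<int>     ordMultiSet;

  // their unordered counterparts (since C++11): same name with the prefix unordered_
  std::unordered_map<int,int>      unordMap;
  std::unordered_multimap<int,int> unordMultiMap;
  std::unordered_set<int>          unordSet;
  std::unordered_multiset<int>     unordMultiSet;

}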

std::map versus std::unordered_map

For my simple performance comparison, I put a lot of arbitrary values into a std::map and a std::unordered_map and measured and compared the access times. Exactly that is what my program does.

 

// associativeCompare.cpp

#include <chrono>
#include <iostream>
#include <map>
#include <random>
#include <unordered_map>
#include <vector>

static const long long mapSize= 10000000;
static const long long accSize= 1000000;

int main(){

  std::cout << std::endl;

  std::map<int,int> myMap;
  std::unordered_map<int,int> myHash;

  // fill both containers with the keys 0 .. mapSize-1
  for ( long long i=0; i < mapSize; ++i ){
    myMap[i]= i;
    myHash[i]= i;
  }

  std::vector<int> randValues;
  randValues.reserve(accSize);

  // accSize uniformly distributed random keys in the range [0, mapSize-1]
  std::random_device seed;
  std::mt19937 engine(seed());
  std::uniform_int_distribution<> uniformDist(0, mapSize - 1);
  for ( long long i=0 ; i < accSize ; ++i) randValues.push_back(uniformDist(engine));

  // access the std::map with the random keys
  auto start = std::chrono::system_clock::now();
  for ( long long i=0; i < accSize; ++i){
    myMap[randValues[i]];
  }
  std::chrono::duration<double> dur= std::chrono::system_clock::now() - start;
  std::cout << "time for std::map: " << dur.count() << " seconds" << std::endl;

  // access the std::unordered_map with the same random keys
  auto start2 = std::chrono::system_clock::now();
  for ( long long i=0; i < accSize; ++i){
    myHash[randValues[i]];
  }
  std::chrono::duration<double> dur2= std::chrono::system_clock::now() - start2;
  std::cout << "time for std::unordered_map: " << dur2.count() << " seconds" << std::endl;

  std::cout << std::endl;

}

 

First I fill a std::map and a std::unordered_map with mapSize key/value pairs; mapSize is ten million. Then I create a std::vector with accSize (one million) arbitrary keys between 0 and mapSize - 1, so each key is guaranteed to be in both containers. The random number generator engine(seed()), initialized by the seed, creates the values. I use the random number generator engine in combination with the random number distribution uniformDist: uniformDist(engine). Of course, uniformDist guarantees a uniform distribution of the values. In the end, randValues holds one million uniformly distributed, randomly created keys. These one million keys are the ones for which I am interested in the values stored in the std::map and the std::unordered_map.
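
A short remark on the measurement: the access myMap[randValues[i]] uses the index operator, which would insert a default-constructed value if the key were missing; because all random keys lie in the range of the stored keys, each access is a pure lookup. The following small sketch - only an illustration, not part of the measured program - shows the difference between find and the index operator.

// findVersusIndexOperator.cpp - sketch: find() only searches, operator[] may insert

#include <iostream>
#include <map>

int main(){

  std::map<int,int> myMap{ {1,1}, {2,2}, {3,3} };

  // find() never modifies the map
  if ( myMap.find(5) != myMap.end() ) std::cout << "key 5 found" << std::endl;
  else std::cout << "key 5 not found" << std::endl;
  std::cout << "size after find: " << myMap.size() << std::endl;         // 3

  // operator[] inserts the pair (5, 0) because key 5 is missing
  myMap[5];
  std::cout << "size after operator[]: " << myMap.size() << std::endl;   // 4

}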

How do they compare to each other?

Without optimization

At first, I compile the program without optimization. I use the current Microsoft Visual C++ 15 compiler and GCC 4.8. Here are the numbers.

Microsoft Visual C++ Compiler

[Screenshot: access times on Windows]

GCC Compiler

[Screenshot: access times on Linux]

The performance differences are significant. The std::unordered_map is about 3 times faster on Windows and about 5 times faster on Linux. That is not surprising, since a lookup in a std::map is logarithmic in the number of elements, while a lookup in a std::unordered_map takes constant time on average. I will not reason further about the absolute numbers because the executables ran on different PCs.

Now I'm curious about the optimized versions.

Maximum optimization

To compile the program with maximum optimization, I use the flag /Ox for the cl.exe compiler and the flag -O3 for the g++ compiler. The performance differences to the non-optimized versions are very impressive.

Microsoft Visual C++

With full optimization the access time of the std::map becomes about 2 times faster and the access time of the std::unordered_map about 6 times faster. Therefore, the std::unordered_map is about 11 times faster than the std::map.

[Screenshot: optimized access times on Windows]

GCC Compiler

The performance differences are not so dramatic in the case of the GCC compiler. Here the optimized access to the std::map is only about 20% faster, but the access to the std::unordered_map is about 6 times faster.

 

[Screenshot: optimized access times on Linux]

For anyone who - like me - gets lost in so many numbers, here they are once more in a simple table.

The raw numbers

For simplicity, I rounded the values to 3 decimal places. The last two columns show the maximally optimized programs, compiled with cl.exe /Ox on Windows and g++ -O3 on Linux.

[Table: comparison of the measured access times]

What's next?

In the post Hash tables I said quite imprecisely that the unordered associative containers have a similar interface to the ordered associative containers. That is not entirely true. The unordered associative containers have a more powerful interface than their ordered counterparts. E.g., you can adjust the hash function or the distribution of the hash values to their buckets. As always, the details will follow in the next post.
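
To give a first impression - a small sketch ahead of the details in the next post - the following program stores a user-defined type in a std::unordered_map with a hand-written hash function and queries a few bucket properties. The type MyInt and the hash functor MyIntHash are of course only invented for this illustration.

// unorderedMapHash.cpp - sketch: custom hash function and bucket interface

#include <iostream>
#include <unordered_map>

struct MyInt{
  int val;
  bool operator==(const MyInt& other) const { return val == other.val; }
};

// hand-written hash function for MyInt
struct MyIntHash{
  std::size_t operator()(const MyInt& m) const {
    return std::hash<int>()(m.val);
  }
};

int main(){

  // the third template parameter is the hash function
  std::unordered_map<MyInt, int, MyIntHash> myMap{ { {1}, 1 }, { {2}, 2 }, { {3}, 3 } };

  // a few bucket properties of the hash table
  std::cout << "bucket_count: " << myMap.bucket_count() << std::endl;
  std::cout << "load_factor: " << myMap.load_factor() << std::endl;
  std::cout << "bucket of MyInt{2}: " << myMap.bucket(MyInt{2}) << std::endl;

}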
