Testing

The library contains a single comprehensive regression test. For a given container type in this library, the test creates an object of the container type and an object of the corresponding standard type (e.g., std::set). It then performs a random sequence of methods with random arguments (e.g., inserts, erases, and so forth) on both objects. At each operation, the test checks the return value of the method, and optionally both compares this library's object with the standard's object as well as performing other consistency checks on this library's object (e.g., order preservation, when applicable, or node invariants, when applicable).

Additionally, the test integrally checks exception safety and resource leaks. This is done as follows. A special allocator type, written for the purpose of the test, both randomly throws an exceptions when allocations are performed, and tracks allocations and de-allocations. The exceptions thrown at allocations simulate memory-allocation failures; the tracking mechanism checks for memory-related bugs (e.g., resource leaks and multiple de-allocations). Both this library's containers and the containers' value-types are configured to use this allocator.

For granularity, the test is split into the several sources, each checking only some containers.

For more details, consult the files in testsuite/ext/pb_ds/regression.

This test inserts a number of values with keys from an arbitrary text ([biblio.wickland96thirty]) into a container, then performs a series of finds using find . It measures the average time for find as a function of the number of values inserted.

It uses the test file: performance/ext/pb_ds/text_find_timing_test.cc

And uses the data file: filethirty_years_among_the_dead_preproc.txt

The test checks the effect of different range-hashing functions, trigger policies, and cache-hashing policies.

The graphic below show the results for the native and collision-chaining hash types the the function applied being a text find timing test using find.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetailsParameterDetails
n_hash_map_ncah
std::tr1::unordered_map cache_hash_code false  
cc_hash_mod_prime_1div1_nsth_map
cc_hash_table Comb_Hash_Fn direct_mod_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_prime_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/1
cc_hash_mask_exp_1div2_sth_map
cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2
cc_hash_mask_exp_1div1_nsth_map
cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/1
cc_hash_mask_exp_1div2_nsth_map
cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2

In this setting, the range-hashing scheme affects performance more than other policies. As the results show, containers using mod-based range-hashing (including the native hash-based container, which is currently hard-wired to this scheme) have lower performance than those using mask-based range-hashing. A modulo-based range-hashing scheme's main benefit is that it takes into account all hash-value bits. Standard string hash-functions are designed to create hash values that are nearly-uniform as is ([biblio.knuth98sorting]).

Trigger policies, i.e. the load-checks constants, affect performance to a lesser extent.

Perhaps surprisingly, storing the hash value alongside each entry affects performance only marginally, at least in this library's implementation. (Unfortunately, it was not possible to run the tests with std::tr1::unordered_map 's cache_hash_code = true , as it appeared to malfuntion.)

There are two sets of results for this type, one for collision-chaining hashes, and one for general-probe hashes.

The first graphic below shows the results for the native and collision-chaining hash types. The function applied being a random integer timing test using find.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetailsParameterDetails
n_hash_map_ncah
std::tr1::unordered_map cache_hash_code false  
cc_hash_mod_prime_1div1_nsth_map
cc_hash_table Comb_Hash_Fn direct_mod_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_prime_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/1
cc_hash_mod_prime_1div2_nsth_map
cc_hash_table Comb_Hash_Fn direct_mod_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_prime_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2
cc_hash_mask_exp_1div1_nsth_map
cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/1
cc_hash_mask_exp_1div2_nsth_map
cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2

And the second graphic shows the results for the native and general-probe hash types. The function applied being a random integer timing test using find.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetailsParameterDetails
n_hash_map_ncah
std::tr1::unordered_map cache_hash_code false  
gp_hash_mod_quadp_prime_1div2_nsth_map
gp_hash_table Comb_Hash_Fn direct_mod_range_hashing  
Probe_Fn quadratic_probe_fn  
Resize_Policy hash_standard_resize_policy Size_Policy hash_prime_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2
gp_hash_mask_linp_exp_1div2_nsth_map
gp_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Probe_Fn linear_probe_fn  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2

There are two sets of results for this type, one for collision-chaining hashes, and one for general-probe hashes.

The first graphic below shows the results for the native and collision-chaining hash types, using as the function applied an integer subscript timing test with find.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetailsParameterDetails
n_hash_map_ncah
std::tr1::unordered_map cache_hash_code false  
cc_hash_mod_prime_1div1_nsth_map
cc_hash_table Comb_Hash_Fn direct_mod_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_prime_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/1
cc_hash_mod_prime_1div2_nsth_map
cc_hash_table Comb_Hash_Fn direct_mod_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_prime_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2
cc_hash_mask_exp_1div1_nsth_map
cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/1
cc_hash_mask_exp_1div2_nsth_map
cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2

And the second graphic shows the results for the native and general-probe hash types. The function applied being a random integer timing test using find.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetailsParameterDetails
n_hash_map_ncah
std::tr1::unordered_map cache_hash_code false  
gp_hash_mod_quadp_prime_1div2_nsth_map
gp_hash_table Comb_Hash_Fn direct_mod_range_hashing  
Probe_Fn quadratic_probe_fn  
Resize_Policy hash_standard_resize_policy Size_Policy hash_prime_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2
gp_hash_mask_linp_exp_1div2_nsth_map
gp_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Probe_Fn linear_probe_fn  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2

There are two sets of results for this type, one for collision-chaining hashes, and one for general-probe hashes.

The first graphic below shows the results for the native and collision-chaining hash types, using as the function applied an integer subscript timing test with insert.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetailsParameterDetails
n_hash_map_ncah
std::tr1::unordered_map cache_hash_code false  
cc_hash_mod_prime_1div1_nsth_map
cc_hash_table Comb_Hash_Fn direct_mod_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_prime_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/1
cc_hash_mod_prime_1div2_nsth_map
cc_hash_table Comb_Hash_Fn direct_mod_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_prime_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2
cc_hash_mask_exp_1div1_nsth_map
cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/1
cc_hash_mask_exp_1div2_nsth_map
cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2

And the second graphic shows the results for the native and general-probe hash types. The function applied being a random integer timing test using find.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetailsParameterDetails
n_hash_map_ncah
std::tr1::unordered_map cache_hash_code false  
gp_hash_mod_quadp_prime_1div2_nsth_map
gp_hash_table Comb_Hash_Fn direct_mod_range_hashing  
Probe_Fn quadratic_probe_fn  
Resize_Policy hash_standard_resize_policy Size_Policy hash_prime_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2
gp_hash_mask_linp_exp_1div2_nsth_map
gp_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Probe_Fn linear_probe_fn  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2

In this setting, as in Hash-Based Text find Find Timing test and Hash-Based Integer find Find Timing test , the choice of underlying hash-table underlying hash-table affects performance most, then the range-hashing scheme, and finally any other policies.

There are some differences, however:

Collision-chaining containers use indirection for greater flexibility; probing containers store values contiguously, in an array (see Figure Motivation::Different underlying data structures A and B, respectively). It follows that for simple data types, probing containers access their allocator less frequently than collision-chaining containers, (although they still have less efficient probing sequences). This explains why some probing containers fare better than collision-chaining containers in this case.

Within each type of hash-table, the range-hashing scheme affects performance more than other policies. This is similar to the situation in Hash-Based Text find Find Timing Test and Hash-Based Integer find Find Timing Test. Unsurprisingly, however, containers with lower αmax perform worse in this case, since more re-hashes are performed.

The graphic below show the results for the native, collision-chaining, and general-probing hash types.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetailsParameterDetails
n_hash_map_ncah
std::tr1::unordered_map cache_hash_code false  
cc_hash_mod_prime_1div1_nsth_map
cc_hash_table Comb_Hash_Fn direct_mod_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_prime_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/1
cc_hash_mask_exp_1div1_nsth_map
cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/1
gp_hash_mod_quadp_prime_1div2_nsth_map
gp_hash_table Comb_Hash_Fn direct_mod_range_hashing  
Probe_Fn quadratic_probe_fn  
Resize_Policy hash_standard_resize_policy Size_Policy hash_prime_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2

In this setting, the distribution of keys is so skewed that the underlying hash-table type affects performance marginally. (This is in contrast with Hash-Based Text find Find Timing Test, Hash-Based Integer find Find Timing Test, Hash-Based Integer Subscript Find Timing Test and Hash-Based Integer Subscript Insert Timing Test.)

The range-hashing scheme affects performance dramatically. A mask-based range-hashing scheme effectively maps all values into the same bucket. Access degenerates into a search within an unordered linked-list. In the graphic above, it should be noted that std::tr1::unordered_map is hard-wired currently to mod-based and mask-based schemes, respectively.

When observing the settings of this test, it is apparent that the keys' distribution is far from natural. One might ask if the test is not contrived to show that, in some cases, mod-based range hashing does better than mask-based range hashing. This is, in fact just the case. A more natural case in which mod-based range hashing is better was not encountered. Thus the inescapable conclusion: real-life key distributions are handled better with an appropriate hash function and a mask-based range-hashing function. (pb_ds/example/hash_shift_mask.cc shows an example of handling this a-priori known skewed distribution with a mask-based range-hashing function). If hash performance is bad, a χ2 test can be used to check how to transform it into a more uniform distribution.

For this reason, this library's default range-hashing function is mask-based.

The graphic below show the results for the native, collision-chaining, and general-probing hash types.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetailsParameterDetails
n_hash_map_ncah
std::tr1::unordered_map cache_hash_code false  
cc_hash_mod_prime_1div1_nsth_map
cc_hash_table Comb_Hash_Fn direct_mod_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_prime_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/1
cc_hash_mask_exp_1div2_nsth_map
cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2
gp_hash_mask_linp_exp_1div2_nsth_set
gp_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Probe_Fn linear_probe_fn  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2

The graphic immediately below shows the results for the native tree type and several other tree types.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetails
n_map
std::map  
splay_tree_map
tree Tag splay_tree_tag
Node_Update null_node_update
rb_tree_map
tree Tag rb_tree_tag
Node_Update null_node_update
ov_tree_map
tree Tag ov_tree_tag
Node_Update null_node_update
pat_trie_map
tree Tag pat_trie_tag
Node_Update null_node_update

For this setting, a splay tree (tree with Tag = splay_tree_tag) does not do well. This is possibly due to two reasons:

An ordered-vector tree (tree with Tag = ov_tree_tag), a red-black tree (tree with Tag = rb_tree_tag), and the native red-black tree all share approximately the same performance.

An ordered-vector tree is slightly slower than red-black trees, since it requires, in order to find a key, more math operations than they do. Conversely, an ordered-vector tree requires far lower space than the others. ([austern00noset], however, seems to have an implementation that is also faster than a red-black tree).

A PATRICIA trie (trie with Tag = pat_trie_tag) has good look-up performance, due to its large fan-out in this case. In this setting, a PATRICIA trie has look-up performance comparable to a hash table (see Hash-Based Text find Timing Test), but it is order preserving. This is not that surprising, since a large-fan-out PATRICIA trie works like a hash table with collisions resolved by a sub-trie. A large-fan-out PATRICIA trie does not do well on modifications (see Tree-Based and Trie-Based Text Insert Timing Test). Therefore, it is possibly beneficial in semi-static settings.

The graphic immediately below shows the results for the native tree type and several other tree types.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetails
n_map
std::map  
splay_tree_map
tree Tag splay_tree_tag
Node_Update null_node_update
rb_tree_map
tree Tag rb_tree_tag
Node_Update null_node_update
ov_tree_map
tree Tag ov_tree_tag
Node_Update null_node_update
pat_trie_map
tree Tag pat_trie_tag
Node_Update null_node_update

The graphic immediately below shows the results for the native tree type and several other tree types.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetails
n_set
std::set  
splay_tree_set
tree Tag splay_tree_tag
Node_Update null_node_update
rb_tree_set
tree Tag rb_tree_tag
Node_Update null_node_update
ov_tree_set
tree Tag ov_tree_tag
Node_Update null_node_update
pat_trie_map
tree Tag pat_trie_tag
Node_Update null_node_update

The graphic immediately below shows the results for the native tree type and several other tree types.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetails
n_set
std::set  
splay_tree_ost_set
tree Tag splay_tree_tag
Node_Update tree_order_statistics_node_update
rb_tree_ost_set
tree Tag rb_tree_tag
Node_Update tree_order_statistics_node_update

The graphic below show the results for "multimaps" which use a tree-based container for primary keys.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetailsParameterDetailsParameterDetails
n_mmap
std::multimap  
rb_tree_mmap_lu_mtf_set
tree Tag rb_tree_tag  
Node_Update null_node_update  
Mapped list_update Update_Policy lu_move_to_front_policy  
rb_tree_mmap_cc_hash_mask_exp_1div2_nsth_set
tree Tag rb_tree_tag  
Node_Update null_node_update  
Mapped cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2

The graphic below show the results for "multimaps" which use a hash-based container for primary keys.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetailsParameterDetailsParameterDetails
n_hash_mmap
std::tr1::unordered_multimap  
rb_tree_mmap_lu_mtf_set
cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy  
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2  
Mapped list_update Update_Policy lu_move_to_front_policy  
rb_tree_mmap_cc_hash_mask_exp_1div2_nsth_set
cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy  
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2  
Mapped cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2

The graphic below show the results for "multimaps" which use a tree-based container for primary keys.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetailsParameterDetailsParameterDetails
n_mmap
std::multimap  
rb_tree_mmap_lu_mtf_set
tree Tag rb_tree_tag  
Node_Update null_node_update  
Mapped list_update Update_Policy lu_move_to_front_policy  
rb_tree_mmap_cc_hash_mask_exp_1div2_nsth_set
tree Tag rb_tree_tag  
Node_Update null_node_update  
Mapped cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2

The graphic below show the results for "multimaps" which use a hash-based container for primary keys.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetailsParameterDetailsParameterDetails
n_hash_mmap
std::tr1::unordered_multimap  
rb_tree_mmap_lu_mtf_set
cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy  
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2  
Mapped list_update Update_Policy lu_move_to_front_policy  
rb_tree_mmap_cc_hash_mask_exp_1div2_nsth_set
cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy  
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2  
Mapped cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2

The graphic below show the results for "multimaps" which use a tree-based container for primary keys.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetailsParameterDetailsParameterDetails
n_mmap
std::multimap  
rb_tree_mmap_lu_mtf_set
tree Tag rb_tree_tag  
Node_Update null_node_update  
Mapped list_update Update_Policy lu_move_to_front_policy  
rb_tree_mmap_cc_hash_mask_exp_1div2_nsth_set
tree Tag rb_tree_tag  
Node_Update null_node_update  
Mapped cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2

The graphic below show the results for "multimaps" which use a hash-based container for primary keys.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetailsParameterDetailsParameterDetails
n_hash_mmap
std::tr1::unordered_multimap  
rb_tree_mmap_lu_mtf_set
cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy  
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2  
Mapped list_update Update_Policy lu_move_to_front_policy  
rb_tree_mmap_cc_hash_mask_exp_1div2_nsth_set
cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy  
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2  
Mapped cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2

The graphic below show the results for "multimaps" which use a tree-based container for primary keys.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetailsParameterDetailsParameterDetails
n_mmap
std::multimap  
rb_tree_mmap_lu_mtf_set
tree Tag rb_tree_tag  
Node_Update null_node_update  
Mapped list_update Update_Policy lu_move_to_front_policy  
rb_tree_mmap_cc_hash_mask_exp_1div2_nsth_set
tree Tag rb_tree_tag  
Node_Update null_node_update  
Mapped cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2

The graphic below show the results for "multimaps" which use a hash-based container for primary keys.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetailsParameterDetailsParameterDetails
n_hash_mmap
std::tr1::unordered_multimap  
rb_tree_mmap_lu_mtf_set
cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy  
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2  
Mapped list_update Update_Policy lu_move_to_front_policy  
rb_tree_mmap_cc_hash_mask_exp_1div2_nsth_set
cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy  
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2  
Mapped cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2

The graphic below show the results for "multimaps" which use a tree-based container for primary keys.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetailsParameterDetailsParameterDetails
n_mmap
std::multimap  
rb_tree_mmap_lu_mtf_set
tree Tag rb_tree_tag  
Node_Update null_node_update  
Mapped list_update Update_Policy lu_move_to_front_policy  
rb_tree_mmap_cc_hash_mask_exp_1div2_nsth_set
tree Tag rb_tree_tag  
Node_Update null_node_update  
Mapped cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2

The graphic below show the results for "multimaps" which use a hash-based container for primary keys.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetailsParameterDetailsParameterDetails
n_hash_mmap
std::tr1::unordered_multimap  
rb_tree_mmap_lu_mtf_set
cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy  
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2  
Mapped list_update Update_Policy lu_move_to_front_policy  
rb_tree_mmap_cc_hash_mask_exp_1div2_nsth_set
cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy  
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2  
Mapped cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2

The graphic below show the results for "multimaps" which use a tree-based container for primary keys.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetailsParameterDetailsParameterDetails
n_mmap
std::multimap  
rb_tree_mmap_lu_mtf_set
tree Tag rb_tree_tag  
Node_Update null_node_update  
Mapped list_update Update_Policy lu_move_to_front_policy  
rb_tree_mmap_cc_hash_mask_exp_1div2_nsth_set
tree Tag rb_tree_tag  
Node_Update null_node_update  
Mapped cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2

The graphic below show the results for "multimaps" which use a hash-based container for primary keys.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetailsParameterDetailsParameterDetails
n_hash_mmap
std::tr1::unordered_multimap  
rb_tree_mmap_lu_mtf_set
cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy  
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2  
Mapped list_update Update_Policy lu_move_to_front_policy  
rb_tree_mmap_cc_hash_mask_exp_1div2_nsth_set
cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy  
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2  
Mapped cc_hash_table Comb_Hash_Fn direct_mask_range_hashing  
Resize_Policy hash_standard_resize_policy Size_Policy hash_exponential_size_policy
Trigger_Policy hash_load_check_resize_trigger with αmin = 1/8 and αmax = 1/2

The two graphics below show the results for the native priority_queues and this library's priority_queues.

The graphic immediately below shows the results for the native priority_queue type instantiated with different underlying container types versus several different versions of library's priority_queues.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetails
n_pq_vector
std::priority_queue Sequence std::vector
n_pq_deque
std::priority_queue Sequence std::deque
binary_heap
priority_queue Tag binary_heap_tag
binomial_heap
priority_queue Tag binomial_heap_tag
rc_binomial_heap
priority_queue Tag rc_binomial_heap_tag
thin_heap
priority_queue Tag thin_heap_tag
pairing_heap
priority_queue Tag pairing_heap_tag

The graphic below shows the results for the binary-heap based native priority queues and this library's pairing-heap priority_queue data structures.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetails
n_pq_vector
std::priority_queue Sequence std::vector
n_pq_deque
std::priority_queue Sequence std::deque
thin_heap
priority_queue Tag thin_heap_tag
pairing_heap
priority_queue Tag pairing_heap_tag

The two graphics below show the results for the native priority_queues and this library's priority_queues.

The graphic immediately below shows the results for the native priority_queue type instantiated with different underlying container types versus several different versions of library's priority_queues.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetails
n_pq_vector
std::priority_queue Sequence std::vector
n_pq_deque
std::priority_queue Sequence std::deque
binary_heap
priority_queue Tag binary_heap_tag
binomial_heap
priority_queue Tag binomial_heap_tag
rc_binomial_heap
priority_queue Tag rc_binomial_heap_tag
thin_heap
priority_queue Tag thin_heap_tag
pairing_heap
priority_queue Tag pairing_heap_tag

The graphic below shows the results for the native priority queues and this library's pairing-heap priority_queue data structures.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetails
n_pq_vector
std::priority_queue adapting std::vector Sequence std::vector
n_pq_deque
std::priority_queue Sequence std::deque
pairing_heap
priority_queue Tag pairing_heap_tag

The two graphics below show the results for the native priority_queues and this library's priority_queues.

The graphic immediately below shows the results for the native priority_queue type instantiated with different underlying container types versus several different versions of library's priority_queues.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetails
n_pq_vector
std::priority_queue Sequence std::vector
n_pq_deque
std::priority_queue Sequence std::deque
binary_heap
priority_queue Tag binary_heap_tag
binomial_heap
priority_queue Tag binomial_heap_tag
rc_binomial_heap
priority_queue Tag rc_binomial_heap_tag
thin_heap
priority_queue Tag thin_heap_tag
pairing_heap
priority_queue Tag pairing_heap_tag

The graphic below shows the results for the binary-heap based native priority queues and this library's priority_queue data structures.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetails
n_pq_vector
std::priority_queue adapting std::vector Sequence std::vector
n_pq_deque
std::priority_queue Sequence std::deque
binary_heap
priority_queue Tag binary_heap_tag

The graphic immediately below shows the results for the native priority_queue type instantiated with different underlying container types versus several different versions of library's priority_queues.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetails
n_pq_vector
std::priority_queue Sequence std::vector
n_pq_deque
std::priority_queue Sequence std::deque
binary_heap
priority_queue Tag binary_heap_tag
binomial_heap
priority_queue Tag binomial_heap_tag
rc_binomial_heap
priority_queue Tag rc_binomial_heap_tag
thin_heap
priority_queue Tag thin_heap_tag
pairing_heap
priority_queue Tag pairing_heap_tag

The graphic immediately below shows the results for the native priority_queue type instantiated with different underlying container types versus several different versions of library's priority_queues.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetails
n_pq_vector
std::priority_queue Sequence std::vector
n_pq_deque
std::priority_queue Sequence std::deque
binary_heap
priority_queue Tag binary_heap_tag
binomial_heap
priority_queue Tag binomial_heap_tag
rc_binomial_heap
priority_queue Tag rc_binomial_heap_tag
thin_heap
priority_queue Tag thin_heap_tag
pairing_heap
priority_queue Tag pairing_heap_tag

The graphic immediately below shows the results for the native priority_queue type instantiated with different underlying container types versus several different versions of library's priority_queues.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetails
n_pq_vector
std::priority_queue Sequence std::vector
n_pq_deque
std::priority_queue Sequence std::deque
binary_heap
priority_queue Tag binary_heap_tag
binomial_heap
priority_queue Tag binomial_heap_tag
rc_binomial_heap
priority_queue Tag rc_binomial_heap_tag
thin_heap
priority_queue Tag thin_heap_tag
pairing_heap
priority_queue Tag pairing_heap_tag

The two graphics below show the results for the native priority_queues and this library's priority_queues.

The graphic immediately below shows the results for the native priority_queue type instantiated with different underlying container types versus several different versions of library's priority_queues.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetails
n_pq_vector
std::priority_queue Sequence std::vector
n_pq_deque
std::priority_queue Sequence std::deque
binary_heap
priority_queue Tag binary_heap_tag
binomial_heap
priority_queue Tag binomial_heap_tag
rc_binomial_heap
priority_queue Tag rc_binomial_heap_tag
thin_heap
priority_queue Tag thin_heap_tag
pairing_heap
priority_queue Tag pairing_heap_tag

The graphic below shows the results for the native priority queues and this library's pairing and thin heap priority_queue data structures.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetails
thin_heap
priority_queue Tag thin_heap_tag
pairing_heap
priority_queue Tag pairing_heap_tag

The two graphics below show the results for the native priority_queues and this library's priority_queues.

The graphic immediately below shows the results for the native priority_queue type instantiated with different underlying container types versus several different versions of library's priority_queues.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetails
n_pq_vector
std::priority_queue Sequence std::vector
n_pq_deque
std::priority_queue Sequence std::deque
binary_heap
priority_queue Tag binary_heap_tag
binomial_heap
priority_queue Tag binomial_heap_tag
rc_binomial_heap
priority_queue Tag rc_binomial_heap_tag
thin_heap
priority_queue Tag thin_heap_tag
pairing_heap
priority_queue Tag pairing_heap_tag

The graphic below shows the results for the native priority queues and this library's pairing and thin heap priority_queue data structures.

The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.

Name/Instantiating TypeParameterDetails
thin_heap
priority_queue Tag thin_heap_tag
pairing_heap
priority_queue Tag pairing_heap_tag

In general, there are several families of tree-based underlying data structures: balanced node-based trees (e.g., red-black or AVL trees), high-probability balanced node-based trees (e.g., random treaps or skip-lists), competitive node-based trees (e.g., splay trees), vector-based "trees", and tries. (Additionally, there are disk-residing or network-residing trees, such as B-Trees and their numerous variants. An interface for this would have to deal with the execution model and ACID guarantees; this is out of the scope of this library.) Following are some observations on their application to different settings.

Of the balanced node-based trees, this library includes a red-black tree, as does standard (in practice). This type of tree is the "workhorse" of tree-based containers: it offers both reasonable modification and reasonable lookup time. Unfortunately, this data structure stores a huge amount of metadata. Each node must contain, besides a value, three pointers and a boolean. This type might be avoided if space is at a premium [austern00noset].

High-probability balanced node-based trees suffer the drawbacks of deterministic balanced trees. Although they are fascinating data structures, preliminary tests with them showed their performance was worse than red-black trees. The library does not contain any such trees, therefore.

Competitive node-based trees have two drawbacks. They are usually somewhat unbalanced, and they perform a large number of comparisons. Balanced trees perform one comparison per each node they encounter on a search path; a splay tree performs two comparisons. If the keys are complex objects, e.g., std::string, this can increase the running time. Conversely, such trees do well when there is much locality of reference. It is difficult to determine in which case to prefer such trees over balanced trees. This library includes a splay tree.

Ordered-vector trees use very little space [austern00noset]. They do not have any other advantages (at least in this implementation).

Large-fan-out PATRICIA tries have excellent lookup performance, but they do so through maintaining, for each node, a miniature "hash-table". Their space efficiency is low, and their modification performance is bad. These tries might be used for semi-static settings, where order preservation is important. Alternatively, red-black trees cross-referenced with hash tables can be used. [okasaki98mereable] discusses small-fan-out PATRICIA tries for integers, but the cited results seem to indicate that the amortized cost of maintaining such trees is higher than that of balanced trees. Moderate-fan-out trees might be useful for sequences where each element has a limited number of choices, e.g., DNA strings.

Different mapping semantics were discussed in the introduction and design sections.Here the focus will be on the case where a keys can be composed into primary keys and secondary keys. (In the case where some keys are completely identical, it is trivial that one should use an associative container mapping values to size types.) In this case there are (at least) five possibilities:

Stated simply: there is a simple answer for this. (Excluding option 1, which should be avoided in all cases).

If the expected ratio of secondary keys to primary keys is small, then 3 and 4 seem reasonable. Both types of secondary containers are relatively lightweight (in terms of memory use and construction time), and so creating an entire container object for each primary key is not too expensive. Option 4 might be preferable to option 3 if changing the secondary key of some primary key is frequent - one cannot modify an associative container's key, and the only possibility, therefore, is erasing the secondary key and inserting another one instead; a non-associative container, conversely, can support in-place modification. The actual cost of erasing a secondary key and inserting another one depends also on the allocator used for secondary associative-containers (The tests above used the standard allocator, but in practice one might choose to use, e.g., [boost_pool]). Option 2 is definitely an overkill in this case. Option 1 loses out either immediately (when there is one secondary key per primary key) or almost immediately after that. Option 5 has the same drawbacks as option 2, but it has the additional drawback that finding all values whose primary key is equivalent to some key, might be linear in the total number of values stored (for example, if using a hash-based container).

If the expected ratio of secondary keys to primary keys is large, then the answer is more complicated. It depends on the distribution of secondary keys to primary keys, the distribution of accesses according to primary keys, and the types of operations most frequent.

To be more precise, assume there are m primary keys, primary key i is mapped to ni secondary keys, and each primary key is mapped, on average, to n secondary keys (i.e., E(ni) = n).

Suppose one wants to find a specific pair of primary and secondary keys. Using 1 with a tree based container (std::multimap), the expected cost is E(Θ(log(m) + ni)) = Θ(log(m) + n); using 1 with a hash-based container (std::tr1::unordered_multimap), the expected cost is Θ(n). Using 2 with a primary hash-based container and secondary hash-based containers, the expected cost is O(1); using 2 with a primary tree-based container and secondary tree-based containers, the expected cost is (using the Jensen inequality [motwani95random]) E(O(log(m) + log(ni)) = O(log(m)) + E(O(log(ni)) = O(log(m)) + O(log(n)), assuming that primary keys are accessed equiprobably. 3 and 4 are similar to 1, but with lower constants. Using 5 with a hash-based container, the expected cost is O(1); using 5 with a tree based container, the cost is E(Θ(log(mn))) = Θ(log(m) + log(n)).

Suppose one needs the values whose primary key matches some given key. Using 1 with a hash-based container, the expected cost is Θ(n), but the values will not be ordered by secondary keys (which may or may not be required); using 1 with a tree-based container, the expected cost is Θ(log(m) + n), but with high constants; again the values will not be ordered by secondary keys. 2, 3, and 4 are similar to 1, but typically with lower constants (and, additionally, if one uses a tree-based container for secondary keys, they will be ordered). Using 5 with a hash-based container, the cost is Θ(mn).

Suppose one wants to assign to a primary key all secondary keys assigned to a different primary key. Using 1 with a hash-based container, the expected cost is Θ(n), but with very high constants; using 1 with a tree-based container, the cost is Θ(nlog(mn)). Using 2, 3, and 4, the expected cost is Θ(n), but typically with far lower costs than 1. 5 is similar to 1.

The following table shows the complexities of the different underlying data structures in terms of orders of growth. It is interesting to note that this table implies something about the constants of the operations as well (see Amortized push and pop operations).

 pushpopmodifyerasejoin
std::priority_queue Θ(n) worst Θ(log(n)) amortized Θ(log(n)) Worst Θ(n log(n)) Worst [std note 1] Θ(n log(n)) [std note 2] Θ(n log(n)) [std note 1]
priority_queue <Tag = pairing_heap_tag> O(1) Θ(n) worst Θ(log(n)) amortized Θ(n) worst Θ(log(n)) amortized Θ(n) worst Θ(log(n)) amortized O(1)
priority_queue <Tag = binary_heap_tag> Θ(n) worst Θ(log(n)) amortized Θ(n) worst Θ(log(n)) amortized Θ(n) Θ(n) Θ(n)
priority_queue <Tag = binomial_heap_tag> Θ(log(n)) worst O(1) amortized Θ(log(n)) Θ(log(n)) Θ(log(n)) Θ(log(n))
priority_queue <Tag = rc_binomial_heap_tag> O(1) Θ(log(n)) Θ(log(n)) Θ(log(n)) Θ(log(n))
priority_queue<Tag = thin_heap_tag> O(1) Θ(n) worst Θ(log(n)) amortized Θ(log(n)) worst O(1) amortized, or Θ(log(n)) amortized [thin_heap_note] Θ(n) worst Θ(log(n)) amortized Θ(n)

[std note 1] This is not a property of the algorithm, but rather due to the fact that the standard's priority queue implementation does not support iterators (and consequently the ability to access a specific value inside it). If the priority queue is adapting an std::vector, then it is still possible to reduce this to Θ(n) by adapting over the standard's adapter and using the fact that top returns a reference to the first value; if, however, it is adapting an std::deque, then this is impossible.

[std note 2] As with [std note 1], this is not a property of the algorithm, but rather the standard's implementation. Again, if the priority queue is adapting an std::vector then it is possible to reduce this to Θ(n), but with a very high constant (one must call std::make_heap which is an expensive linear operation); if the priority queue is adapting an std::deque, then this is impossible.

[thin_heap_note] A thin heap has Θ(log(n)) worst case modify time always, but the amortized time depends on the nature of the operation: I) if the operation increases the key (in the sense of the priority queue's comparison functor), then the amortized time is O(1), but if II) it decreases it, then the amortized time is the same as the worst case time. Note that for most algorithms, I) is important and II) is not.