Why is std::map implemented as a red-black tree?












158














Why is std::map implemented as a red-black tree?



There are several balanced binary search trees (BSTs) out there. What were design trade-offs in choosing a red-black tree?










share|improve this question




















  • 25




    Although all implementations I've seen use an RB-tree, note that this is still implementation-dependent.
    – Thomas
    Mar 13 '11 at 8:38






  • 3




    @Thomas. It is implementation-dependent, so why it is so that all implementation use RB-trees?
    – Denis Gorodetskiy
    Mar 13 '11 at 13:12






  • 1




    I'd really like to know if any STL implementer has thought about using a Skip List.
    – Matthieu M.
    Mar 13 '11 at 13:41






  • 2




    C++'s map and set are actually ordered map and ordered set. They are not implemented using hash functions. Every query would take O(logn) and not O(1), but the values will be always sorted. Starting from C++11 (i think), there are unordered_map and unordered_set, that are implemented using hash functions and while they are not sorted, most queries and operations are possible in O(1) (averagely)
    – SomethingSomething
    Mar 15 '17 at 8:16










  • @Thomas that is true, but not that interesting in practice. The standard makes complexity guarantees with a specific algorithm, or set of algorithms in mind.
    – Justin Meiners
    Aug 7 at 15:17
















158














Why is std::map implemented as a red-black tree?



There are several balanced binary search trees (BSTs) out there. What were design trade-offs in choosing a red-black tree?










share|improve this question




















  • 25




    Although all implementations I've seen use an RB-tree, note that this is still implementation-dependent.
    – Thomas
    Mar 13 '11 at 8:38






  • 3




    @Thomas. It is implementation-dependent, so why it is so that all implementation use RB-trees?
    – Denis Gorodetskiy
    Mar 13 '11 at 13:12






  • 1




    I'd really like to know if any STL implementer has thought about using a Skip List.
    – Matthieu M.
    Mar 13 '11 at 13:41






  • 2




    C++'s map and set are actually ordered map and ordered set. They are not implemented using hash functions. Every query would take O(logn) and not O(1), but the values will be always sorted. Starting from C++11 (i think), there are unordered_map and unordered_set, that are implemented using hash functions and while they are not sorted, most queries and operations are possible in O(1) (averagely)
    – SomethingSomething
    Mar 15 '17 at 8:16










  • @Thomas that is true, but not that interesting in practice. The standard makes complexity guarantees with a specific algorithm, or set of algorithms in mind.
    – Justin Meiners
    Aug 7 at 15:17














158












158








158


68





Why is std::map implemented as a red-black tree?



There are several balanced binary search trees (BSTs) out there. What were design trade-offs in choosing a red-black tree?










share|improve this question















Why is std::map implemented as a red-black tree?



There are several balanced binary search trees (BSTs) out there. What were design trade-offs in choosing a red-black tree?







c++ dictionary data-structures stl binary-search-tree






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Sep 5 at 7:46









ネロク

10.5k32140




10.5k32140










asked Mar 13 '11 at 8:33









Denis Gorodetskiy

1,12921121




1,12921121








  • 25




    Although all implementations I've seen use an RB-tree, note that this is still implementation-dependent.
    – Thomas
    Mar 13 '11 at 8:38






  • 3




    @Thomas. It is implementation-dependent, so why it is so that all implementation use RB-trees?
    – Denis Gorodetskiy
    Mar 13 '11 at 13:12






  • 1




    I'd really like to know if any STL implementer has thought about using a Skip List.
    – Matthieu M.
    Mar 13 '11 at 13:41






  • 2




    C++'s map and set are actually ordered map and ordered set. They are not implemented using hash functions. Every query would take O(logn) and not O(1), but the values will be always sorted. Starting from C++11 (i think), there are unordered_map and unordered_set, that are implemented using hash functions and while they are not sorted, most queries and operations are possible in O(1) (averagely)
    – SomethingSomething
    Mar 15 '17 at 8:16










  • @Thomas that is true, but not that interesting in practice. The standard makes complexity guarantees with a specific algorithm, or set of algorithms in mind.
    – Justin Meiners
    Aug 7 at 15:17














  • 25




    Although all implementations I've seen use an RB-tree, note that this is still implementation-dependent.
    – Thomas
    Mar 13 '11 at 8:38






  • 3




    @Thomas. It is implementation-dependent, so why it is so that all implementation use RB-trees?
    – Denis Gorodetskiy
    Mar 13 '11 at 13:12






  • 1




    I'd really like to know if any STL implementer has thought about using a Skip List.
    – Matthieu M.
    Mar 13 '11 at 13:41






  • 2




    C++'s map and set are actually ordered map and ordered set. They are not implemented using hash functions. Every query would take O(logn) and not O(1), but the values will be always sorted. Starting from C++11 (i think), there are unordered_map and unordered_set, that are implemented using hash functions and while they are not sorted, most queries and operations are possible in O(1) (averagely)
    – SomethingSomething
    Mar 15 '17 at 8:16










  • @Thomas that is true, but not that interesting in practice. The standard makes complexity guarantees with a specific algorithm, or set of algorithms in mind.
    – Justin Meiners
    Aug 7 at 15:17








25




25




Although all implementations I've seen use an RB-tree, note that this is still implementation-dependent.
– Thomas
Mar 13 '11 at 8:38




Although all implementations I've seen use an RB-tree, note that this is still implementation-dependent.
– Thomas
Mar 13 '11 at 8:38




3




3




@Thomas. It is implementation-dependent, so why it is so that all implementation use RB-trees?
– Denis Gorodetskiy
Mar 13 '11 at 13:12




@Thomas. It is implementation-dependent, so why it is so that all implementation use RB-trees?
– Denis Gorodetskiy
Mar 13 '11 at 13:12




1




1




I'd really like to know if any STL implementer has thought about using a Skip List.
– Matthieu M.
Mar 13 '11 at 13:41




I'd really like to know if any STL implementer has thought about using a Skip List.
– Matthieu M.
Mar 13 '11 at 13:41




2




2




C++'s map and set are actually ordered map and ordered set. They are not implemented using hash functions. Every query would take O(logn) and not O(1), but the values will be always sorted. Starting from C++11 (i think), there are unordered_map and unordered_set, that are implemented using hash functions and while they are not sorted, most queries and operations are possible in O(1) (averagely)
– SomethingSomething
Mar 15 '17 at 8:16




C++'s map and set are actually ordered map and ordered set. They are not implemented using hash functions. Every query would take O(logn) and not O(1), but the values will be always sorted. Starting from C++11 (i think), there are unordered_map and unordered_set, that are implemented using hash functions and while they are not sorted, most queries and operations are possible in O(1) (averagely)
– SomethingSomething
Mar 15 '17 at 8:16












@Thomas that is true, but not that interesting in practice. The standard makes complexity guarantees with a specific algorithm, or set of algorithms in mind.
– Justin Meiners
Aug 7 at 15:17




@Thomas that is true, but not that interesting in practice. The standard makes complexity guarantees with a specific algorithm, or set of algorithms in mind.
– Justin Meiners
Aug 7 at 15:17












6 Answers
6






active

oldest

votes


















99














Probably the two most common self balancing tree algorithms are Red-Black trees and AVL trees. To balance the tree after an insertion/update both algorithms use the notion of rotations where the nodes of the tree are rotated to perform the re-balancing.



While in both algorithms the insert/delete operations are O(log n), in the case of Red-Black tree re-balancing rotation is an O(1) operation while with AVL this is a O(log n) operation, making the Red-Black tree more efficient in this aspect of the re-balancing stage and one of the possible reasons that it is more commonly used.



Red-Black trees are used in most collection libraries, including the offerings from Java and Microsoft .NET Framework.






share|improve this answer



















  • 42




    you make it sound like red-black trees can do tree modifications in O(1) time, which is not true. tree modifications are O(log n) for both red-black and AVL trees. that makes it moot whether the balancing part of the tree modification is O(1) or O(log n) because the main operation is already O(log n). even after all the slightly extra work that AVL trees do results in a more tightly balanced tree which leads to slightly faster lookups. so it is a perfectly valid tradeoff and does not make AVL trees inferior to red-black trees.
    – necromancer
    Mar 13 '11 at 8:57








  • 28




    You have to look beyond the complexity to actual runtime to see a difference - AVL trees generally have a lower total runtime when there are many more lookups than inserts/deletes. RB trees have a lower total runtime when there are many more inserts/deletes. The exact proportion at which the break occurs depends of course on many details of implementation, hardware, and exact usage, but since library authors have to support a wide range of usage patterns, they have to take an educated guess. AVL is also slightly harder to implement, so you might want a proven benefit to use it.
    – Steve Jessop
    Mar 13 '11 at 11:57








  • 6




    RB tree isn't a "default implementation". Each implementer chooses an implementation. As far as we know, they've all chose RB trees, so presumably this is either for performance or for ease of implementation/maintenance. As I said, the breakpoint for performance might not imply that they think there are more inserts/deletes than lookups, just that the ratio between the two is above the level where they think RB probably beats AVL.
    – Steve Jessop
    Mar 13 '11 at 12:16






  • 9




    @Denis: unfortunately the only way to get numbers is to make a list of std::map implementations, track down the developers, and ask them what criteria they used to make the decision, so this remains speculation.
    – Steve Jessop
    Mar 14 '11 at 13:23








  • 2




    Missing from all this is the cost, per-node, to store the auxilary information required to make balance decisions. Red-Black trees require 1-bit to represent the colour. AVL trees require at least 2 bits (to represent -1, 0 or 1).
    – SJHowe
    Sep 12 '17 at 15:56



















40














It really depends on the usage. AVL tree usually has more rotations of rebalancing. So if your application doesn't have too many insertion and deletion operations, but weights heavily on searching, then AVL tree probably is a good choice.



std::map uses Red-Black tree as it gets a reasonable trade-off between the speed of node insertion/deletion and searching.






share|improve this answer



















  • 1




    Are you sure about that??? I personally think that Red-Black tree is either or more complex, never simpler. The only thing, is in Rd-Black tree, re-balancing occurs less often than AVL.
    – Eric Ouellet
    May 30 '17 at 20:23










  • @Eric Theoretically, both R/B tree and AVL tree has complexity O(log n) ) for insertion and deletion. But one big part of the operation cost is rotation, which is different between these two trees. Please refer to discuss.fogcreek.com/joelonsoftware/… Quote: "balancing an AVL tree can require O(log n) rotations, whilst a red black tree will take at most two rotations to bring it into balance (though it may have to examine O(log n) nodes to decide where the rotations are necessary)." Edited my comments accordingly.
    – webbertiger
    Jun 13 '17 at 22:26



















23














AVL trees have a maximum height of 1.44logn, while RB trees have a maximum of 2logn. Inserting an element in a AVL may imply a rebalance at one point in the tree. The rebalancing finishes the insertion. After insertion of a new leaf, updating the ancestors of that leaf has to be done up to the root, or up to a point where the two subtrees are of equal depth. The probability of having to update k nodes is 1/3^k. Rebalancing is O(1). Removing an element may imply more than one rebalancing (up to half the depth of the tree).



RB-trees are B-trees of order 4 represented as binary search trees. A 4-node in the B-tree results in two levels in the equivalent BST. In the worst case, all the nodes of the tree are 2-nodes, with only one chain of 3-nodes down to a leaf. That leaf will be at a distance of 2logn from the root.



Going down from the root to the insertion point, one has to change 4-nodes into 2-nodes, to make sure any insertion will not saturate a leaf. Coming back from the insertion, all these nodes have to be analysed to make sure they correctly represent 4-nodes. This can also be done going down in the tree. The global cost will be the same. There is no free lunch! Removing an element from the tree is of the same order.



All these trees require that nodes carry information on height, weight, color, etc. Only Splay trees are free from such additional info. But most people are afraid of Splay trees, because of the ramdomness of their structure!



Finally, trees can also carry weight information in the nodes, permitting weight balancing. Various schemes can be applied. One should rebalance when a subtree contains more than 3 times the number of elements of the other subtree. Rebalancing is again done either throuh a single or double rotation. This means a worst case of 2.4logn. One can get away with 2 times instead of 3, a much better ratio, but it may mean leaving a little less thant 1% of the subtrees unbalanced here and there. Tricky!



Which type of tree is the best? AVL for sure. They are the simplest to code, and have their worst height nearest to logn. For a tree of 1000000 elements, an AVL will be at most of height 29, a RB 40, and a weight balanced 36 or 50 depending on the ratio.



There are a lot of other variables: randomness, ratio of adds, deletes, searches, etc.






share|improve this answer



















  • 2




    Good answer. But if AVLs are the best, why standard library implements std::map as RB tree?
    – Denis Gorodetskiy
    Oct 7 '11 at 0:24






  • 12




    I disagree that AVL trees are unquestionably the best. Although they have a low height, they require (in total) more work to do relabancing than red/black trees (O(log n) rebalancing work versus O(1) amortized rebalancing work). Splay trees could be much, much better and your assertion that people are afraid of them is unfounded. There is no one universal "best" tree balancing scheme out there.
    – templatetypedef
    Jan 4 '13 at 17:15










  • Almost perfect answer. Why did you say AVL is the best. That is simply wrong and that's why most general implementation uses Red-Black tree. You need to have a pretty higher ratio of read over manipulation to choose AVL. Also, AVL has a little less memory footprint than RB.
    – Eric Ouellet
    May 30 '17 at 20:28



















9














The previous answers only address tree alternatives and red black probably only remains for historical reasons.



Why not a hash table?



In a tree, a type requires only partial ordering (< comparison) to be used as a key in the map. However, hash tables require that each key type has a hash function defined. Keeping these type requirements to a minimum is very important for generic programming.



Designing a good hash table requires intimate knowledge of the context it which it will be used. Should it use open addressing, or linked chaining? What levels of load should it accept before resizing? Should it use an expensive hash that avoids collisions, or one that is rough and fast?



(C++11 did add hash tables with unordered_map. You can see from the documentation it requires setting policies to configure many of these options.)



Since the STL can't anticipate which is the best choice for your application, the default needs to be more flexible. Trees "just work" and scale nicely.



What about other trees?



Red Black tree's offer fast lookup and are self balancing, unlike BSTs. Another user pointed out its advantages over the self-balancing AVL tree.



Alexander Stepanov (The creator of STL) said that he would use a B* Tree instead of a Red-Black tree if he wrote std::map again, because it is more friendly for modern memory caches.




One of the biggest changes since then has been the growth of caches.
Cache misses are very costly, so locality of reference is much more
important now. Node-based data structures, which have low locality of
reference, make much less sense. If I were designing STL today, I
would have a different set of containers. For example, an in-memory
B*-tree is a far better choice than a red-black tree for implementing
an associative container. - Alexander Stepanov




Is red black tree or B* always the best?



On other occasions Alex has stated that std::vector is almost always the best list container for similar reasons. It rarely makes sense to use std::list or std::deque even for those situations we were taught in school (such as removing an element from the middle of the list). std::vector is so fast that it beats those structures for everything but large N.



Applying that reasoning, if you have only a small number of elements (hundreds?) using a std::vector and linear searching may be more efficient than the tree implementation of std::map. Depending on the frequency of insertion, a sorted std::vector combined with std::binary_search can be the fastest choice.






share|improve this answer































    2














    It is just the choice of your implementation - they could be implemented as any balanced tree. The various choices are all comparable with minor differences. Therefore any is as good as any.






    share|improve this answer































      2














      Update 2017-06-14: webbertiger edit its answer after I commented. I should point out that its answer is now a lot better to my eyes. But I kept my answer just as additional information...



      Due to the fact that I think first answer is wrong (correction: not both anymore) and the third has a wrong affirmation. I feel I had to clarify things...



      The 2 most popular tree are AVL and Red Black (RB). The main difference lie in the utilization:




      • AVL : Better if ratio of consultation (read) is bigger than manipulation (modification). Memory foot print is a little less than RB (due to the bit required for coloring).

      • RB : Better in general cases where there is a balance between consultation (read) and manipulation (modification) or more modification over consultation. A slightly bigger memory footprint due to the storing of red-black flag.


      The main difference come from the coloring. You do have less re-balance action in RB tree than AVL because the coloring enable you to sometimes skip or shorten re-balance actions which have a relative hi cost. Because of the coloring, RB tree also have higher level of nodes because it could accept red nodes between black ones (having the possibilities of ~2x more levels) making search (read) a little bit less efficient... but because it is a constant (2x), it stay in O(log n).



      If you consider the performance hit for a modification of a tree (significative) VS the performance hit of consultation of a tree (almost insignificant), it become natural to prefer RB over AVL for a general case.






      share|improve this answer























        Your Answer






        StackExchange.ifUsing("editor", function () {
        StackExchange.using("externalEditor", function () {
        StackExchange.using("snippets", function () {
        StackExchange.snippets.init();
        });
        });
        }, "code-snippets");

        StackExchange.ready(function() {
        var channelOptions = {
        tags: "".split(" "),
        id: "1"
        };
        initTagRenderer("".split(" "), "".split(" "), channelOptions);

        StackExchange.using("externalEditor", function() {
        // Have to fire editor after snippets, if snippets enabled
        if (StackExchange.settings.snippets.snippetsEnabled) {
        StackExchange.using("snippets", function() {
        createEditor();
        });
        }
        else {
        createEditor();
        }
        });

        function createEditor() {
        StackExchange.prepareEditor({
        heartbeatType: 'answer',
        autoActivateHeartbeat: false,
        convertImagesToLinks: true,
        noModals: true,
        showLowRepImageUploadWarning: true,
        reputationToPostImages: 10,
        bindNavPrevention: true,
        postfix: "",
        imageUploader: {
        brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
        contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
        allowUrls: true
        },
        onDemand: true,
        discardSelector: ".discard-answer"
        ,immediatelyShowMarkdownHelp:true
        });


        }
        });














        draft saved

        draft discarded


















        StackExchange.ready(
        function () {
        StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f5288320%2fwhy-is-stdmap-implemented-as-a-red-black-tree%23new-answer', 'question_page');
        }
        );

        Post as a guest















        Required, but never shown

























        6 Answers
        6






        active

        oldest

        votes








        6 Answers
        6






        active

        oldest

        votes









        active

        oldest

        votes






        active

        oldest

        votes









        99














        Probably the two most common self balancing tree algorithms are Red-Black trees and AVL trees. To balance the tree after an insertion/update both algorithms use the notion of rotations where the nodes of the tree are rotated to perform the re-balancing.



        While in both algorithms the insert/delete operations are O(log n), in the case of Red-Black tree re-balancing rotation is an O(1) operation while with AVL this is a O(log n) operation, making the Red-Black tree more efficient in this aspect of the re-balancing stage and one of the possible reasons that it is more commonly used.



        Red-Black trees are used in most collection libraries, including the offerings from Java and Microsoft .NET Framework.






        share|improve this answer



















        • 42




          you make it sound like red-black trees can do tree modifications in O(1) time, which is not true. tree modifications are O(log n) for both red-black and AVL trees. that makes it moot whether the balancing part of the tree modification is O(1) or O(log n) because the main operation is already O(log n). even after all the slightly extra work that AVL trees do results in a more tightly balanced tree which leads to slightly faster lookups. so it is a perfectly valid tradeoff and does not make AVL trees inferior to red-black trees.
          – necromancer
          Mar 13 '11 at 8:57








        • 28




          You have to look beyond the complexity to actual runtime to see a difference - AVL trees generally have a lower total runtime when there are many more lookups than inserts/deletes. RB trees have a lower total runtime when there are many more inserts/deletes. The exact proportion at which the break occurs depends of course on many details of implementation, hardware, and exact usage, but since library authors have to support a wide range of usage patterns, they have to take an educated guess. AVL is also slightly harder to implement, so you might want a proven benefit to use it.
          – Steve Jessop
          Mar 13 '11 at 11:57








        • 6




          RB tree isn't a "default implementation". Each implementer chooses an implementation. As far as we know, they've all chose RB trees, so presumably this is either for performance or for ease of implementation/maintenance. As I said, the breakpoint for performance might not imply that they think there are more inserts/deletes than lookups, just that the ratio between the two is above the level where they think RB probably beats AVL.
          – Steve Jessop
          Mar 13 '11 at 12:16






        • 9




          @Denis: unfortunately the only way to get numbers is to make a list of std::map implementations, track down the developers, and ask them what criteria they used to make the decision, so this remains speculation.
          – Steve Jessop
          Mar 14 '11 at 13:23








        • 2




          Missing from all this is the cost, per-node, to store the auxilary information required to make balance decisions. Red-Black trees require 1-bit to represent the colour. AVL trees require at least 2 bits (to represent -1, 0 or 1).
          – SJHowe
          Sep 12 '17 at 15:56
















        99














        Probably the two most common self balancing tree algorithms are Red-Black trees and AVL trees. To balance the tree after an insertion/update both algorithms use the notion of rotations where the nodes of the tree are rotated to perform the re-balancing.



        While in both algorithms the insert/delete operations are O(log n), in the case of Red-Black tree re-balancing rotation is an O(1) operation while with AVL this is a O(log n) operation, making the Red-Black tree more efficient in this aspect of the re-balancing stage and one of the possible reasons that it is more commonly used.



        Red-Black trees are used in most collection libraries, including the offerings from Java and Microsoft .NET Framework.






        share|improve this answer



















        • 42




          you make it sound like red-black trees can do tree modifications in O(1) time, which is not true. tree modifications are O(log n) for both red-black and AVL trees. that makes it moot whether the balancing part of the tree modification is O(1) or O(log n) because the main operation is already O(log n). even after all the slightly extra work that AVL trees do results in a more tightly balanced tree which leads to slightly faster lookups. so it is a perfectly valid tradeoff and does not make AVL trees inferior to red-black trees.
          – necromancer
          Mar 13 '11 at 8:57








        • 28




          You have to look beyond the complexity to actual runtime to see a difference - AVL trees generally have a lower total runtime when there are many more lookups than inserts/deletes. RB trees have a lower total runtime when there are many more inserts/deletes. The exact proportion at which the break occurs depends of course on many details of implementation, hardware, and exact usage, but since library authors have to support a wide range of usage patterns, they have to take an educated guess. AVL is also slightly harder to implement, so you might want a proven benefit to use it.
          – Steve Jessop
          Mar 13 '11 at 11:57








        • 6




          RB tree isn't a "default implementation". Each implementer chooses an implementation. As far as we know, they've all chose RB trees, so presumably this is either for performance or for ease of implementation/maintenance. As I said, the breakpoint for performance might not imply that they think there are more inserts/deletes than lookups, just that the ratio between the two is above the level where they think RB probably beats AVL.
          – Steve Jessop
          Mar 13 '11 at 12:16






        • 9




          @Denis: unfortunately the only way to get numbers is to make a list of std::map implementations, track down the developers, and ask them what criteria they used to make the decision, so this remains speculation.
          – Steve Jessop
          Mar 14 '11 at 13:23








        • 2




          Missing from all this is the cost, per-node, to store the auxilary information required to make balance decisions. Red-Black trees require 1-bit to represent the colour. AVL trees require at least 2 bits (to represent -1, 0 or 1).
          – SJHowe
          Sep 12 '17 at 15:56














        99












        99








        99






        Probably the two most common self balancing tree algorithms are Red-Black trees and AVL trees. To balance the tree after an insertion/update both algorithms use the notion of rotations where the nodes of the tree are rotated to perform the re-balancing.



        While in both algorithms the insert/delete operations are O(log n), in the case of Red-Black tree re-balancing rotation is an O(1) operation while with AVL this is a O(log n) operation, making the Red-Black tree more efficient in this aspect of the re-balancing stage and one of the possible reasons that it is more commonly used.



        Red-Black trees are used in most collection libraries, including the offerings from Java and Microsoft .NET Framework.






        share|improve this answer














        Probably the two most common self balancing tree algorithms are Red-Black trees and AVL trees. To balance the tree after an insertion/update both algorithms use the notion of rotations where the nodes of the tree are rotated to perform the re-balancing.



        While in both algorithms the insert/delete operations are O(log n), in the case of Red-Black tree re-balancing rotation is an O(1) operation while with AVL this is a O(log n) operation, making the Red-Black tree more efficient in this aspect of the re-balancing stage and one of the possible reasons that it is more commonly used.



        Red-Black trees are used in most collection libraries, including the offerings from Java and Microsoft .NET Framework.







        share|improve this answer














        share|improve this answer



        share|improve this answer








        edited Mar 13 '11 at 9:02

























        answered Mar 13 '11 at 8:47









        Chris Taylor

        41.9k75776




        41.9k75776








        • 42




          you make it sound like red-black trees can do tree modifications in O(1) time, which is not true. tree modifications are O(log n) for both red-black and AVL trees. that makes it moot whether the balancing part of the tree modification is O(1) or O(log n) because the main operation is already O(log n). even after all the slightly extra work that AVL trees do results in a more tightly balanced tree which leads to slightly faster lookups. so it is a perfectly valid tradeoff and does not make AVL trees inferior to red-black trees.
          – necromancer
          Mar 13 '11 at 8:57








        • 28




          You have to look beyond the complexity to actual runtime to see a difference - AVL trees generally have a lower total runtime when there are many more lookups than inserts/deletes. RB trees have a lower total runtime when there are many more inserts/deletes. The exact proportion at which the break occurs depends of course on many details of implementation, hardware, and exact usage, but since library authors have to support a wide range of usage patterns, they have to take an educated guess. AVL is also slightly harder to implement, so you might want a proven benefit to use it.
          – Steve Jessop
          Mar 13 '11 at 11:57








        • 6




          RB tree isn't a "default implementation". Each implementer chooses an implementation. As far as we know, they've all chose RB trees, so presumably this is either for performance or for ease of implementation/maintenance. As I said, the breakpoint for performance might not imply that they think there are more inserts/deletes than lookups, just that the ratio between the two is above the level where they think RB probably beats AVL.
          – Steve Jessop
          Mar 13 '11 at 12:16






        • 9




          @Denis: unfortunately the only way to get numbers is to make a list of std::map implementations, track down the developers, and ask them what criteria they used to make the decision, so this remains speculation.
          – Steve Jessop
          Mar 14 '11 at 13:23








        • 2




          Missing from all this is the cost, per-node, to store the auxilary information required to make balance decisions. Red-Black trees require 1-bit to represent the colour. AVL trees require at least 2 bits (to represent -1, 0 or 1).
          – SJHowe
          Sep 12 '17 at 15:56














        • 42




          you make it sound like red-black trees can do tree modifications in O(1) time, which is not true. tree modifications are O(log n) for both red-black and AVL trees. that makes it moot whether the balancing part of the tree modification is O(1) or O(log n) because the main operation is already O(log n). even after all the slightly extra work that AVL trees do results in a more tightly balanced tree which leads to slightly faster lookups. so it is a perfectly valid tradeoff and does not make AVL trees inferior to red-black trees.
          – necromancer
          Mar 13 '11 at 8:57








        • 28




          You have to look beyond the complexity to actual runtime to see a difference - AVL trees generally have a lower total runtime when there are many more lookups than inserts/deletes. RB trees have a lower total runtime when there are many more inserts/deletes. The exact proportion at which the break occurs depends of course on many details of implementation, hardware, and exact usage, but since library authors have to support a wide range of usage patterns, they have to take an educated guess. AVL is also slightly harder to implement, so you might want a proven benefit to use it.
          – Steve Jessop
          Mar 13 '11 at 11:57








        • 6




          RB tree isn't a "default implementation". Each implementer chooses an implementation. As far as we know, they've all chose RB trees, so presumably this is either for performance or for ease of implementation/maintenance. As I said, the breakpoint for performance might not imply that they think there are more inserts/deletes than lookups, just that the ratio between the two is above the level where they think RB probably beats AVL.
          – Steve Jessop
          Mar 13 '11 at 12:16






        • 9




          @Denis: unfortunately the only way to get numbers is to make a list of std::map implementations, track down the developers, and ask them what criteria they used to make the decision, so this remains speculation.
          – Steve Jessop
          Mar 14 '11 at 13:23








        • 2




          Missing from all this is the cost, per-node, to store the auxilary information required to make balance decisions. Red-Black trees require 1-bit to represent the colour. AVL trees require at least 2 bits (to represent -1, 0 or 1).
          – SJHowe
          Sep 12 '17 at 15:56








        42




        42




        you make it sound like red-black trees can do tree modifications in O(1) time, which is not true. tree modifications are O(log n) for both red-black and AVL trees. that makes it moot whether the balancing part of the tree modification is O(1) or O(log n) because the main operation is already O(log n). even after all the slightly extra work that AVL trees do results in a more tightly balanced tree which leads to slightly faster lookups. so it is a perfectly valid tradeoff and does not make AVL trees inferior to red-black trees.
        – necromancer
        Mar 13 '11 at 8:57






        you make it sound like red-black trees can do tree modifications in O(1) time, which is not true. tree modifications are O(log n) for both red-black and AVL trees. that makes it moot whether the balancing part of the tree modification is O(1) or O(log n) because the main operation is already O(log n). even after all the slightly extra work that AVL trees do results in a more tightly balanced tree which leads to slightly faster lookups. so it is a perfectly valid tradeoff and does not make AVL trees inferior to red-black trees.
        – necromancer
        Mar 13 '11 at 8:57






        28




        28




        You have to look beyond the complexity to actual runtime to see a difference - AVL trees generally have a lower total runtime when there are many more lookups than inserts/deletes. RB trees have a lower total runtime when there are many more inserts/deletes. The exact proportion at which the break occurs depends of course on many details of implementation, hardware, and exact usage, but since library authors have to support a wide range of usage patterns, they have to take an educated guess. AVL is also slightly harder to implement, so you might want a proven benefit to use it.
        – Steve Jessop
        Mar 13 '11 at 11:57






        You have to look beyond the complexity to actual runtime to see a difference - AVL trees generally have a lower total runtime when there are many more lookups than inserts/deletes. RB trees have a lower total runtime when there are many more inserts/deletes. The exact proportion at which the break occurs depends of course on many details of implementation, hardware, and exact usage, but since library authors have to support a wide range of usage patterns, they have to take an educated guess. AVL is also slightly harder to implement, so you might want a proven benefit to use it.
        – Steve Jessop
        Mar 13 '11 at 11:57






        6




        6




        RB tree isn't a "default implementation". Each implementer chooses an implementation. As far as we know, they've all chose RB trees, so presumably this is either for performance or for ease of implementation/maintenance. As I said, the breakpoint for performance might not imply that they think there are more inserts/deletes than lookups, just that the ratio between the two is above the level where they think RB probably beats AVL.
        – Steve Jessop
        Mar 13 '11 at 12:16




        RB tree isn't a "default implementation". Each implementer chooses an implementation. As far as we know, they've all chose RB trees, so presumably this is either for performance or for ease of implementation/maintenance. As I said, the breakpoint for performance might not imply that they think there are more inserts/deletes than lookups, just that the ratio between the two is above the level where they think RB probably beats AVL.
        – Steve Jessop
        Mar 13 '11 at 12:16




        9




        9




        @Denis: unfortunately the only way to get numbers is to make a list of std::map implementations, track down the developers, and ask them what criteria they used to make the decision, so this remains speculation.
        – Steve Jessop
        Mar 14 '11 at 13:23






        @Denis: unfortunately the only way to get numbers is to make a list of std::map implementations, track down the developers, and ask them what criteria they used to make the decision, so this remains speculation.
        – Steve Jessop
        Mar 14 '11 at 13:23






        2




        2




        Missing from all this is the cost, per-node, to store the auxilary information required to make balance decisions. Red-Black trees require 1-bit to represent the colour. AVL trees require at least 2 bits (to represent -1, 0 or 1).
        – SJHowe
        Sep 12 '17 at 15:56




        Missing from all this is the cost, per-node, to store the auxilary information required to make balance decisions. Red-Black trees require 1-bit to represent the colour. AVL trees require at least 2 bits (to represent -1, 0 or 1).
        – SJHowe
        Sep 12 '17 at 15:56













        40














        It really depends on the usage. AVL tree usually has more rotations of rebalancing. So if your application doesn't have too many insertion and deletion operations, but weights heavily on searching, then AVL tree probably is a good choice.



        std::map uses Red-Black tree as it gets a reasonable trade-off between the speed of node insertion/deletion and searching.






        share|improve this answer



















        • 1




          Are you sure about that??? I personally think that Red-Black tree is either or more complex, never simpler. The only thing, is in Rd-Black tree, re-balancing occurs less often than AVL.
          – Eric Ouellet
          May 30 '17 at 20:23










        • @Eric Theoretically, both R/B tree and AVL tree has complexity O(log n) ) for insertion and deletion. But one big part of the operation cost is rotation, which is different between these two trees. Please refer to discuss.fogcreek.com/joelonsoftware/… Quote: "balancing an AVL tree can require O(log n) rotations, whilst a red black tree will take at most two rotations to bring it into balance (though it may have to examine O(log n) nodes to decide where the rotations are necessary)." Edited my comments accordingly.
          – webbertiger
          Jun 13 '17 at 22:26
















        40














        It really depends on the usage. AVL tree usually has more rotations of rebalancing. So if your application doesn't have too many insertion and deletion operations, but weights heavily on searching, then AVL tree probably is a good choice.



        std::map uses Red-Black tree as it gets a reasonable trade-off between the speed of node insertion/deletion and searching.






        share|improve this answer



















        • 1




          Are you sure about that??? I personally think that Red-Black tree is either or more complex, never simpler. The only thing, is in Rd-Black tree, re-balancing occurs less often than AVL.
          – Eric Ouellet
          May 30 '17 at 20:23










        • @Eric Theoretically, both R/B tree and AVL tree has complexity O(log n) ) for insertion and deletion. But one big part of the operation cost is rotation, which is different between these two trees. Please refer to discuss.fogcreek.com/joelonsoftware/… Quote: "balancing an AVL tree can require O(log n) rotations, whilst a red black tree will take at most two rotations to bring it into balance (though it may have to examine O(log n) nodes to decide where the rotations are necessary)." Edited my comments accordingly.
          – webbertiger
          Jun 13 '17 at 22:26














        40












        40








        40






        It really depends on the usage. AVL tree usually has more rotations of rebalancing. So if your application doesn't have too many insertion and deletion operations, but weights heavily on searching, then AVL tree probably is a good choice.



        std::map uses Red-Black tree as it gets a reasonable trade-off between the speed of node insertion/deletion and searching.






        share|improve this answer














        It really depends on the usage. AVL tree usually has more rotations of rebalancing. So if your application doesn't have too many insertion and deletion operations, but weights heavily on searching, then AVL tree probably is a good choice.



        std::map uses Red-Black tree as it gets a reasonable trade-off between the speed of node insertion/deletion and searching.







        share|improve this answer














        share|improve this answer



        share|improve this answer








        edited Jun 13 '17 at 22:27

























        answered May 26 '12 at 18:32









        webbertiger

        65956




        65956








        • 1




          Are you sure about that??? I personally think that Red-Black tree is either or more complex, never simpler. The only thing, is in Rd-Black tree, re-balancing occurs less often than AVL.
          – Eric Ouellet
          May 30 '17 at 20:23










        • @Eric Theoretically, both R/B tree and AVL tree has complexity O(log n) ) for insertion and deletion. But one big part of the operation cost is rotation, which is different between these two trees. Please refer to discuss.fogcreek.com/joelonsoftware/… Quote: "balancing an AVL tree can require O(log n) rotations, whilst a red black tree will take at most two rotations to bring it into balance (though it may have to examine O(log n) nodes to decide where the rotations are necessary)." Edited my comments accordingly.
          – webbertiger
          Jun 13 '17 at 22:26














        • 1




          Are you sure about that??? I personally think that Red-Black tree is either or more complex, never simpler. The only thing, is in Rd-Black tree, re-balancing occurs less often than AVL.
          – Eric Ouellet
          May 30 '17 at 20:23










        • @Eric Theoretically, both R/B tree and AVL tree has complexity O(log n) ) for insertion and deletion. But one big part of the operation cost is rotation, which is different between these two trees. Please refer to discuss.fogcreek.com/joelonsoftware/… Quote: "balancing an AVL tree can require O(log n) rotations, whilst a red black tree will take at most two rotations to bring it into balance (though it may have to examine O(log n) nodes to decide where the rotations are necessary)." Edited my comments accordingly.
          – webbertiger
          Jun 13 '17 at 22:26








        1




        1




        Are you sure about that??? I personally think that Red-Black tree is either or more complex, never simpler. The only thing, is in Rd-Black tree, re-balancing occurs less often than AVL.
        – Eric Ouellet
        May 30 '17 at 20:23




        Are you sure about that??? I personally think that Red-Black tree is either or more complex, never simpler. The only thing, is in Rd-Black tree, re-balancing occurs less often than AVL.
        – Eric Ouellet
        May 30 '17 at 20:23












        @Eric Theoretically, both R/B tree and AVL tree has complexity O(log n) ) for insertion and deletion. But one big part of the operation cost is rotation, which is different between these two trees. Please refer to discuss.fogcreek.com/joelonsoftware/… Quote: "balancing an AVL tree can require O(log n) rotations, whilst a red black tree will take at most two rotations to bring it into balance (though it may have to examine O(log n) nodes to decide where the rotations are necessary)." Edited my comments accordingly.
        – webbertiger
        Jun 13 '17 at 22:26




        @Eric Theoretically, both R/B tree and AVL tree has complexity O(log n) ) for insertion and deletion. But one big part of the operation cost is rotation, which is different between these two trees. Please refer to discuss.fogcreek.com/joelonsoftware/… Quote: "balancing an AVL tree can require O(log n) rotations, whilst a red black tree will take at most two rotations to bring it into balance (though it may have to examine O(log n) nodes to decide where the rotations are necessary)." Edited my comments accordingly.
        – webbertiger
        Jun 13 '17 at 22:26











        23














        AVL trees have a maximum height of 1.44logn, while RB trees have a maximum of 2logn. Inserting an element in a AVL may imply a rebalance at one point in the tree. The rebalancing finishes the insertion. After insertion of a new leaf, updating the ancestors of that leaf has to be done up to the root, or up to a point where the two subtrees are of equal depth. The probability of having to update k nodes is 1/3^k. Rebalancing is O(1). Removing an element may imply more than one rebalancing (up to half the depth of the tree).



        RB-trees are B-trees of order 4 represented as binary search trees. A 4-node in the B-tree results in two levels in the equivalent BST. In the worst case, all the nodes of the tree are 2-nodes, with only one chain of 3-nodes down to a leaf. That leaf will be at a distance of 2logn from the root.



        Going down from the root to the insertion point, one has to change 4-nodes into 2-nodes, to make sure any insertion will not saturate a leaf. Coming back from the insertion, all these nodes have to be analysed to make sure they correctly represent 4-nodes. This can also be done going down in the tree. The global cost will be the same. There is no free lunch! Removing an element from the tree is of the same order.



        All these trees require that nodes carry information on height, weight, color, etc. Only Splay trees are free from such additional info. But most people are afraid of Splay trees, because of the ramdomness of their structure!



        Finally, trees can also carry weight information in the nodes, permitting weight balancing. Various schemes can be applied. One should rebalance when a subtree contains more than 3 times the number of elements of the other subtree. Rebalancing is again done either throuh a single or double rotation. This means a worst case of 2.4logn. One can get away with 2 times instead of 3, a much better ratio, but it may mean leaving a little less thant 1% of the subtrees unbalanced here and there. Tricky!



        Which type of tree is the best? AVL for sure. They are the simplest to code, and have their worst height nearest to logn. For a tree of 1000000 elements, an AVL will be at most of height 29, a RB 40, and a weight balanced 36 or 50 depending on the ratio.



        There are a lot of other variables: randomness, ratio of adds, deletes, searches, etc.






        share|improve this answer



















        • 2




          Good answer. But if AVLs are the best, why standard library implements std::map as RB tree?
          – Denis Gorodetskiy
          Oct 7 '11 at 0:24






        • 12




          I disagree that AVL trees are unquestionably the best. Although they have a low height, they require (in total) more work to do relabancing than red/black trees (O(log n) rebalancing work versus O(1) amortized rebalancing work). Splay trees could be much, much better and your assertion that people are afraid of them is unfounded. There is no one universal "best" tree balancing scheme out there.
          – templatetypedef
          Jan 4 '13 at 17:15










        • Almost perfect answer. Why did you say AVL is the best. That is simply wrong and that's why most general implementation uses Red-Black tree. You need to have a pretty higher ratio of read over manipulation to choose AVL. Also, AVL has a little less memory footprint than RB.
          – Eric Ouellet
          May 30 '17 at 20:28
















        23














        AVL trees have a maximum height of 1.44logn, while RB trees have a maximum of 2logn. Inserting an element in a AVL may imply a rebalance at one point in the tree. The rebalancing finishes the insertion. After insertion of a new leaf, updating the ancestors of that leaf has to be done up to the root, or up to a point where the two subtrees are of equal depth. The probability of having to update k nodes is 1/3^k. Rebalancing is O(1). Removing an element may imply more than one rebalancing (up to half the depth of the tree).



        RB-trees are B-trees of order 4 represented as binary search trees. A 4-node in the B-tree results in two levels in the equivalent BST. In the worst case, all the nodes of the tree are 2-nodes, with only one chain of 3-nodes down to a leaf. That leaf will be at a distance of 2logn from the root.



        Going down from the root to the insertion point, one has to change 4-nodes into 2-nodes, to make sure any insertion will not saturate a leaf. Coming back from the insertion, all these nodes have to be analysed to make sure they correctly represent 4-nodes. This can also be done going down in the tree. The global cost will be the same. There is no free lunch! Removing an element from the tree is of the same order.



        All these trees require that nodes carry information on height, weight, color, etc. Only Splay trees are free from such additional info. But most people are afraid of Splay trees, because of the ramdomness of their structure!



        Finally, trees can also carry weight information in the nodes, permitting weight balancing. Various schemes can be applied. One should rebalance when a subtree contains more than 3 times the number of elements of the other subtree. Rebalancing is again done either throuh a single or double rotation. This means a worst case of 2.4logn. One can get away with 2 times instead of 3, a much better ratio, but it may mean leaving a little less thant 1% of the subtrees unbalanced here and there. Tricky!



        Which type of tree is the best? AVL for sure. They are the simplest to code, and have their worst height nearest to logn. For a tree of 1000000 elements, an AVL will be at most of height 29, a RB 40, and a weight balanced 36 or 50 depending on the ratio.



        There are a lot of other variables: randomness, ratio of adds, deletes, searches, etc.






        share|improve this answer



















        • 2




          Good answer. But if AVLs are the best, why standard library implements std::map as RB tree?
          – Denis Gorodetskiy
          Oct 7 '11 at 0:24






        • 12




          I disagree that AVL trees are unquestionably the best. Although they have a low height, they require (in total) more work to do relabancing than red/black trees (O(log n) rebalancing work versus O(1) amortized rebalancing work). Splay trees could be much, much better and your assertion that people are afraid of them is unfounded. There is no one universal "best" tree balancing scheme out there.
          – templatetypedef
          Jan 4 '13 at 17:15










        • Almost perfect answer. Why did you say AVL is the best. That is simply wrong and that's why most general implementation uses Red-Black tree. You need to have a pretty higher ratio of read over manipulation to choose AVL. Also, AVL has a little less memory footprint than RB.
          – Eric Ouellet
          May 30 '17 at 20:28














        23












        23








        23






        AVL trees have a maximum height of 1.44logn, while RB trees have a maximum of 2logn. Inserting an element in a AVL may imply a rebalance at one point in the tree. The rebalancing finishes the insertion. After insertion of a new leaf, updating the ancestors of that leaf has to be done up to the root, or up to a point where the two subtrees are of equal depth. The probability of having to update k nodes is 1/3^k. Rebalancing is O(1). Removing an element may imply more than one rebalancing (up to half the depth of the tree).



        RB-trees are B-trees of order 4 represented as binary search trees. A 4-node in the B-tree results in two levels in the equivalent BST. In the worst case, all the nodes of the tree are 2-nodes, with only one chain of 3-nodes down to a leaf. That leaf will be at a distance of 2logn from the root.



        Going down from the root to the insertion point, one has to change 4-nodes into 2-nodes, to make sure any insertion will not saturate a leaf. Coming back from the insertion, all these nodes have to be analysed to make sure they correctly represent 4-nodes. This can also be done going down in the tree. The global cost will be the same. There is no free lunch! Removing an element from the tree is of the same order.



        All these trees require that nodes carry information on height, weight, color, etc. Only Splay trees are free from such additional info. But most people are afraid of Splay trees, because of the ramdomness of their structure!



        Finally, trees can also carry weight information in the nodes, permitting weight balancing. Various schemes can be applied. One should rebalance when a subtree contains more than 3 times the number of elements of the other subtree. Rebalancing is again done either throuh a single or double rotation. This means a worst case of 2.4logn. One can get away with 2 times instead of 3, a much better ratio, but it may mean leaving a little less thant 1% of the subtrees unbalanced here and there. Tricky!



        Which type of tree is the best? AVL for sure. They are the simplest to code, and have their worst height nearest to logn. For a tree of 1000000 elements, an AVL will be at most of height 29, a RB 40, and a weight balanced 36 or 50 depending on the ratio.



        There are a lot of other variables: randomness, ratio of adds, deletes, searches, etc.






        share|improve this answer














        AVL trees have a maximum height of 1.44logn, while RB trees have a maximum of 2logn. Inserting an element in a AVL may imply a rebalance at one point in the tree. The rebalancing finishes the insertion. After insertion of a new leaf, updating the ancestors of that leaf has to be done up to the root, or up to a point where the two subtrees are of equal depth. The probability of having to update k nodes is 1/3^k. Rebalancing is O(1). Removing an element may imply more than one rebalancing (up to half the depth of the tree).



        RB-trees are B-trees of order 4 represented as binary search trees. A 4-node in the B-tree results in two levels in the equivalent BST. In the worst case, all the nodes of the tree are 2-nodes, with only one chain of 3-nodes down to a leaf. That leaf will be at a distance of 2logn from the root.



        Going down from the root to the insertion point, one has to change 4-nodes into 2-nodes, to make sure any insertion will not saturate a leaf. Coming back from the insertion, all these nodes have to be analysed to make sure they correctly represent 4-nodes. This can also be done going down in the tree. The global cost will be the same. There is no free lunch! Removing an element from the tree is of the same order.



        All these trees require that nodes carry information on height, weight, color, etc. Only Splay trees are free from such additional info. But most people are afraid of Splay trees, because of the ramdomness of their structure!



        Finally, trees can also carry weight information in the nodes, permitting weight balancing. Various schemes can be applied. One should rebalance when a subtree contains more than 3 times the number of elements of the other subtree. Rebalancing is again done either throuh a single or double rotation. This means a worst case of 2.4logn. One can get away with 2 times instead of 3, a much better ratio, but it may mean leaving a little less thant 1% of the subtrees unbalanced here and there. Tricky!



        Which type of tree is the best? AVL for sure. They are the simplest to code, and have their worst height nearest to logn. For a tree of 1000000 elements, an AVL will be at most of height 29, a RB 40, and a weight balanced 36 or 50 depending on the ratio.



        There are a lot of other variables: randomness, ratio of adds, deletes, searches, etc.







        share|improve this answer














        share|improve this answer



        share|improve this answer








        edited Dec 26 '17 at 8:53









        Michael Geary

        22.9k74359




        22.9k74359










        answered Jul 16 '11 at 1:52









        user847376

        24723




        24723








        • 2




          Good answer. But if AVLs are the best, why standard library implements std::map as RB tree?
          – Denis Gorodetskiy
          Oct 7 '11 at 0:24






        • 12




          I disagree that AVL trees are unquestionably the best. Although they have a low height, they require (in total) more work to do relabancing than red/black trees (O(log n) rebalancing work versus O(1) amortized rebalancing work). Splay trees could be much, much better and your assertion that people are afraid of them is unfounded. There is no one universal "best" tree balancing scheme out there.
          – templatetypedef
          Jan 4 '13 at 17:15










        • Almost perfect answer. Why did you say AVL is the best. That is simply wrong and that's why most general implementation uses Red-Black tree. You need to have a pretty higher ratio of read over manipulation to choose AVL. Also, AVL has a little less memory footprint than RB.
          – Eric Ouellet
          May 30 '17 at 20:28














        • 2




          Good answer. But if AVLs are the best, why standard library implements std::map as RB tree?
          – Denis Gorodetskiy
          Oct 7 '11 at 0:24






        • 12




          I disagree that AVL trees are unquestionably the best. Although they have a low height, they require (in total) more work to do relabancing than red/black trees (O(log n) rebalancing work versus O(1) amortized rebalancing work). Splay trees could be much, much better and your assertion that people are afraid of them is unfounded. There is no one universal "best" tree balancing scheme out there.
          – templatetypedef
          Jan 4 '13 at 17:15










        • Almost perfect answer. Why did you say AVL is the best. That is simply wrong and that's why most general implementation uses Red-Black tree. You need to have a pretty higher ratio of read over manipulation to choose AVL. Also, AVL has a little less memory footprint than RB.
          – Eric Ouellet
          May 30 '17 at 20:28








        2




        2




        Good answer. But if AVLs are the best, why standard library implements std::map as RB tree?
        – Denis Gorodetskiy
        Oct 7 '11 at 0:24




        Good answer. But if AVLs are the best, why standard library implements std::map as RB tree?
        – Denis Gorodetskiy
        Oct 7 '11 at 0:24




        12




        12




        I disagree that AVL trees are unquestionably the best. Although they have a low height, they require (in total) more work to do relabancing than red/black trees (O(log n) rebalancing work versus O(1) amortized rebalancing work). Splay trees could be much, much better and your assertion that people are afraid of them is unfounded. There is no one universal "best" tree balancing scheme out there.
        – templatetypedef
        Jan 4 '13 at 17:15




        I disagree that AVL trees are unquestionably the best. Although they have a low height, they require (in total) more work to do relabancing than red/black trees (O(log n) rebalancing work versus O(1) amortized rebalancing work). Splay trees could be much, much better and your assertion that people are afraid of them is unfounded. There is no one universal "best" tree balancing scheme out there.
        – templatetypedef
        Jan 4 '13 at 17:15












        Almost perfect answer. Why did you say AVL is the best. That is simply wrong and that's why most general implementation uses Red-Black tree. You need to have a pretty higher ratio of read over manipulation to choose AVL. Also, AVL has a little less memory footprint than RB.
        – Eric Ouellet
        May 30 '17 at 20:28




        Almost perfect answer. Why did you say AVL is the best. That is simply wrong and that's why most general implementation uses Red-Black tree. You need to have a pretty higher ratio of read over manipulation to choose AVL. Also, AVL has a little less memory footprint than RB.
        – Eric Ouellet
        May 30 '17 at 20:28











        9














        The previous answers only address tree alternatives and red black probably only remains for historical reasons.



        Why not a hash table?



        In a tree, a type requires only partial ordering (< comparison) to be used as a key in the map. However, hash tables require that each key type has a hash function defined. Keeping these type requirements to a minimum is very important for generic programming.



        Designing a good hash table requires intimate knowledge of the context it which it will be used. Should it use open addressing, or linked chaining? What levels of load should it accept before resizing? Should it use an expensive hash that avoids collisions, or one that is rough and fast?



        (C++11 did add hash tables with unordered_map. You can see from the documentation it requires setting policies to configure many of these options.)



        Since the STL can't anticipate which is the best choice for your application, the default needs to be more flexible. Trees "just work" and scale nicely.



        What about other trees?



        Red Black tree's offer fast lookup and are self balancing, unlike BSTs. Another user pointed out its advantages over the self-balancing AVL tree.



        Alexander Stepanov (The creator of STL) said that he would use a B* Tree instead of a Red-Black tree if he wrote std::map again, because it is more friendly for modern memory caches.




        One of the biggest changes since then has been the growth of caches.
        Cache misses are very costly, so locality of reference is much more
        important now. Node-based data structures, which have low locality of
        reference, make much less sense. If I were designing STL today, I
        would have a different set of containers. For example, an in-memory
        B*-tree is a far better choice than a red-black tree for implementing
        an associative container. - Alexander Stepanov




        Is red black tree or B* always the best?



        On other occasions Alex has stated that std::vector is almost always the best list container for similar reasons. It rarely makes sense to use std::list or std::deque even for those situations we were taught in school (such as removing an element from the middle of the list). std::vector is so fast that it beats those structures for everything but large N.



        Applying that reasoning, if you have only a small number of elements (hundreds?) using a std::vector and linear searching may be more efficient than the tree implementation of std::map. Depending on the frequency of insertion, a sorted std::vector combined with std::binary_search can be the fastest choice.






        share|improve this answer




























          9














          The previous answers only address tree alternatives and red black probably only remains for historical reasons.



          Why not a hash table?



          In a tree, a type requires only partial ordering (< comparison) to be used as a key in the map. However, hash tables require that each key type has a hash function defined. Keeping these type requirements to a minimum is very important for generic programming.



          Designing a good hash table requires intimate knowledge of the context it which it will be used. Should it use open addressing, or linked chaining? What levels of load should it accept before resizing? Should it use an expensive hash that avoids collisions, or one that is rough and fast?



          (C++11 did add hash tables with unordered_map. You can see from the documentation it requires setting policies to configure many of these options.)



          Since the STL can't anticipate which is the best choice for your application, the default needs to be more flexible. Trees "just work" and scale nicely.



          What about other trees?



          Red Black tree's offer fast lookup and are self balancing, unlike BSTs. Another user pointed out its advantages over the self-balancing AVL tree.



          Alexander Stepanov (The creator of STL) said that he would use a B* Tree instead of a Red-Black tree if he wrote std::map again, because it is more friendly for modern memory caches.




          One of the biggest changes since then has been the growth of caches.
          Cache misses are very costly, so locality of reference is much more
          important now. Node-based data structures, which have low locality of
          reference, make much less sense. If I were designing STL today, I
          would have a different set of containers. For example, an in-memory
          B*-tree is a far better choice than a red-black tree for implementing
          an associative container. - Alexander Stepanov




          Is red black tree or B* always the best?



          On other occasions Alex has stated that std::vector is almost always the best list container for similar reasons. It rarely makes sense to use std::list or std::deque even for those situations we were taught in school (such as removing an element from the middle of the list). std::vector is so fast that it beats those structures for everything but large N.



          Applying that reasoning, if you have only a small number of elements (hundreds?) using a std::vector and linear searching may be more efficient than the tree implementation of std::map. Depending on the frequency of insertion, a sorted std::vector combined with std::binary_search can be the fastest choice.






          share|improve this answer


























            9












            9








            9






            The previous answers only address tree alternatives and red black probably only remains for historical reasons.



            Why not a hash table?



            In a tree, a type requires only partial ordering (< comparison) to be used as a key in the map. However, hash tables require that each key type has a hash function defined. Keeping these type requirements to a minimum is very important for generic programming.



            Designing a good hash table requires intimate knowledge of the context it which it will be used. Should it use open addressing, or linked chaining? What levels of load should it accept before resizing? Should it use an expensive hash that avoids collisions, or one that is rough and fast?



            (C++11 did add hash tables with unordered_map. You can see from the documentation it requires setting policies to configure many of these options.)



            Since the STL can't anticipate which is the best choice for your application, the default needs to be more flexible. Trees "just work" and scale nicely.



            What about other trees?



            Red Black tree's offer fast lookup and are self balancing, unlike BSTs. Another user pointed out its advantages over the self-balancing AVL tree.



            Alexander Stepanov (The creator of STL) said that he would use a B* Tree instead of a Red-Black tree if he wrote std::map again, because it is more friendly for modern memory caches.




            One of the biggest changes since then has been the growth of caches.
            Cache misses are very costly, so locality of reference is much more
            important now. Node-based data structures, which have low locality of
            reference, make much less sense. If I were designing STL today, I
            would have a different set of containers. For example, an in-memory
            B*-tree is a far better choice than a red-black tree for implementing
            an associative container. - Alexander Stepanov




            Is red black tree or B* always the best?



            On other occasions Alex has stated that std::vector is almost always the best list container for similar reasons. It rarely makes sense to use std::list or std::deque even for those situations we were taught in school (such as removing an element from the middle of the list). std::vector is so fast that it beats those structures for everything but large N.



            Applying that reasoning, if you have only a small number of elements (hundreds?) using a std::vector and linear searching may be more efficient than the tree implementation of std::map. Depending on the frequency of insertion, a sorted std::vector combined with std::binary_search can be the fastest choice.






            share|improve this answer














            The previous answers only address tree alternatives and red black probably only remains for historical reasons.



            Why not a hash table?



            In a tree, a type requires only partial ordering (< comparison) to be used as a key in the map. However, hash tables require that each key type has a hash function defined. Keeping these type requirements to a minimum is very important for generic programming.



            Designing a good hash table requires intimate knowledge of the context it which it will be used. Should it use open addressing, or linked chaining? What levels of load should it accept before resizing? Should it use an expensive hash that avoids collisions, or one that is rough and fast?



            (C++11 did add hash tables with unordered_map. You can see from the documentation it requires setting policies to configure many of these options.)



            Since the STL can't anticipate which is the best choice for your application, the default needs to be more flexible. Trees "just work" and scale nicely.



            What about other trees?



            Red Black tree's offer fast lookup and are self balancing, unlike BSTs. Another user pointed out its advantages over the self-balancing AVL tree.



            Alexander Stepanov (The creator of STL) said that he would use a B* Tree instead of a Red-Black tree if he wrote std::map again, because it is more friendly for modern memory caches.




            One of the biggest changes since then has been the growth of caches.
            Cache misses are very costly, so locality of reference is much more
            important now. Node-based data structures, which have low locality of
            reference, make much less sense. If I were designing STL today, I
            would have a different set of containers. For example, an in-memory
            B*-tree is a far better choice than a red-black tree for implementing
            an associative container. - Alexander Stepanov




            Is red black tree or B* always the best?



            On other occasions Alex has stated that std::vector is almost always the best list container for similar reasons. It rarely makes sense to use std::list or std::deque even for those situations we were taught in school (such as removing an element from the middle of the list). std::vector is so fast that it beats those structures for everything but large N.



            Applying that reasoning, if you have only a small number of elements (hundreds?) using a std::vector and linear searching may be more efficient than the tree implementation of std::map. Depending on the frequency of insertion, a sorted std::vector combined with std::binary_search can be the fastest choice.







            share|improve this answer














            share|improve this answer



            share|improve this answer








            edited Nov 20 at 21:17

























            answered Dec 22 '17 at 0:46









            Justin Meiners

            7,35253474




            7,35253474























                2














                It is just the choice of your implementation - they could be implemented as any balanced tree. The various choices are all comparable with minor differences. Therefore any is as good as any.






                share|improve this answer




























                  2














                  It is just the choice of your implementation - they could be implemented as any balanced tree. The various choices are all comparable with minor differences. Therefore any is as good as any.






                  share|improve this answer


























                    2












                    2








                    2






                    It is just the choice of your implementation - they could be implemented as any balanced tree. The various choices are all comparable with minor differences. Therefore any is as good as any.






                    share|improve this answer














                    It is just the choice of your implementation - they could be implemented as any balanced tree. The various choices are all comparable with minor differences. Therefore any is as good as any.







                    share|improve this answer














                    share|improve this answer



                    share|improve this answer








                    edited Jan 26 '15 at 9:17









                    Peter Mortensen

                    13.4k1983111




                    13.4k1983111










                    answered Mar 13 '11 at 8:48









                    necromancer

                    11k1758104




                    11k1758104























                        2














                        Update 2017-06-14: webbertiger edit its answer after I commented. I should point out that its answer is now a lot better to my eyes. But I kept my answer just as additional information...



                        Due to the fact that I think first answer is wrong (correction: not both anymore) and the third has a wrong affirmation. I feel I had to clarify things...



                        The 2 most popular tree are AVL and Red Black (RB). The main difference lie in the utilization:




                        • AVL : Better if ratio of consultation (read) is bigger than manipulation (modification). Memory foot print is a little less than RB (due to the bit required for coloring).

                        • RB : Better in general cases where there is a balance between consultation (read) and manipulation (modification) or more modification over consultation. A slightly bigger memory footprint due to the storing of red-black flag.


                        The main difference come from the coloring. You do have less re-balance action in RB tree than AVL because the coloring enable you to sometimes skip or shorten re-balance actions which have a relative hi cost. Because of the coloring, RB tree also have higher level of nodes because it could accept red nodes between black ones (having the possibilities of ~2x more levels) making search (read) a little bit less efficient... but because it is a constant (2x), it stay in O(log n).



                        If you consider the performance hit for a modification of a tree (significative) VS the performance hit of consultation of a tree (almost insignificant), it become natural to prefer RB over AVL for a general case.






                        share|improve this answer




























                          2














                          Update 2017-06-14: webbertiger edit its answer after I commented. I should point out that its answer is now a lot better to my eyes. But I kept my answer just as additional information...



                          Due to the fact that I think first answer is wrong (correction: not both anymore) and the third has a wrong affirmation. I feel I had to clarify things...



                          The 2 most popular tree are AVL and Red Black (RB). The main difference lie in the utilization:




                          • AVL : Better if ratio of consultation (read) is bigger than manipulation (modification). Memory foot print is a little less than RB (due to the bit required for coloring).

                          • RB : Better in general cases where there is a balance between consultation (read) and manipulation (modification) or more modification over consultation. A slightly bigger memory footprint due to the storing of red-black flag.


                          The main difference come from the coloring. You do have less re-balance action in RB tree than AVL because the coloring enable you to sometimes skip or shorten re-balance actions which have a relative hi cost. Because of the coloring, RB tree also have higher level of nodes because it could accept red nodes between black ones (having the possibilities of ~2x more levels) making search (read) a little bit less efficient... but because it is a constant (2x), it stay in O(log n).



                          If you consider the performance hit for a modification of a tree (significative) VS the performance hit of consultation of a tree (almost insignificant), it become natural to prefer RB over AVL for a general case.






                          share|improve this answer


























                            2












                            2








                            2






                            Update 2017-06-14: webbertiger edit its answer after I commented. I should point out that its answer is now a lot better to my eyes. But I kept my answer just as additional information...



                            Due to the fact that I think first answer is wrong (correction: not both anymore) and the third has a wrong affirmation. I feel I had to clarify things...



                            The 2 most popular tree are AVL and Red Black (RB). The main difference lie in the utilization:




                            • AVL : Better if ratio of consultation (read) is bigger than manipulation (modification). Memory foot print is a little less than RB (due to the bit required for coloring).

                            • RB : Better in general cases where there is a balance between consultation (read) and manipulation (modification) or more modification over consultation. A slightly bigger memory footprint due to the storing of red-black flag.


                            The main difference come from the coloring. You do have less re-balance action in RB tree than AVL because the coloring enable you to sometimes skip or shorten re-balance actions which have a relative hi cost. Because of the coloring, RB tree also have higher level of nodes because it could accept red nodes between black ones (having the possibilities of ~2x more levels) making search (read) a little bit less efficient... but because it is a constant (2x), it stay in O(log n).



                            If you consider the performance hit for a modification of a tree (significative) VS the performance hit of consultation of a tree (almost insignificant), it become natural to prefer RB over AVL for a general case.






                            share|improve this answer














                            Update 2017-06-14: webbertiger edit its answer after I commented. I should point out that its answer is now a lot better to my eyes. But I kept my answer just as additional information...



                            Due to the fact that I think first answer is wrong (correction: not both anymore) and the third has a wrong affirmation. I feel I had to clarify things...



                            The 2 most popular tree are AVL and Red Black (RB). The main difference lie in the utilization:




                            • AVL : Better if ratio of consultation (read) is bigger than manipulation (modification). Memory foot print is a little less than RB (due to the bit required for coloring).

                            • RB : Better in general cases where there is a balance between consultation (read) and manipulation (modification) or more modification over consultation. A slightly bigger memory footprint due to the storing of red-black flag.


                            The main difference come from the coloring. You do have less re-balance action in RB tree than AVL because the coloring enable you to sometimes skip or shorten re-balance actions which have a relative hi cost. Because of the coloring, RB tree also have higher level of nodes because it could accept red nodes between black ones (having the possibilities of ~2x more levels) making search (read) a little bit less efficient... but because it is a constant (2x), it stay in O(log n).



                            If you consider the performance hit for a modification of a tree (significative) VS the performance hit of consultation of a tree (almost insignificant), it become natural to prefer RB over AVL for a general case.







                            share|improve this answer














                            share|improve this answer



                            share|improve this answer








                            edited Jan 29 at 16:25

























                            answered May 30 '17 at 20:33









                            Eric Ouellet

                            5,73645480




                            5,73645480






























                                draft saved

                                draft discarded




















































                                Thanks for contributing an answer to Stack Overflow!


                                • Please be sure to answer the question. Provide details and share your research!

                                But avoid



                                • Asking for help, clarification, or responding to other answers.

                                • Making statements based on opinion; back them up with references or personal experience.


                                To learn more, see our tips on writing great answers.





                                Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


                                Please pay close attention to the following guidance:


                                • Please be sure to answer the question. Provide details and share your research!

                                But avoid



                                • Asking for help, clarification, or responding to other answers.

                                • Making statements based on opinion; back them up with references or personal experience.


                                To learn more, see our tips on writing great answers.




                                draft saved


                                draft discarded














                                StackExchange.ready(
                                function () {
                                StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f5288320%2fwhy-is-stdmap-implemented-as-a-red-black-tree%23new-answer', 'question_page');
                                }
                                );

                                Post as a guest















                                Required, but never shown





















































                                Required, but never shown














                                Required, but never shown












                                Required, but never shown







                                Required, but never shown

































                                Required, but never shown














                                Required, but never shown












                                Required, but never shown







                                Required, but never shown







                                Popular posts from this blog

                                Feedback on college project

                                Futebolista

                                Albești (Vaslui)