Disk space - Python dictionary vs list












I was asked to create an inverted index and save it in binary form in several ways (with and without compression).

Long story short, I noticed that pickling the dict representation takes much less disk space than pickling the same data converted into a list.



Sample:



import pickle

dic = {
    'w1': [1, 2, 3, 4, 5, 6],
    'w2': [2, 3, 4, 5, 6],
    'w3': [3, 4, 5, 6],
    'w4': [4, 5, 6],
}

# The same data as a list of (key, value) tuples
dic_list = list(dic.items())

with open('dic.pickle', 'wb') as handle:
    pickle.dump(dic, handle, protocol=pickle.HIGHEST_PROTOCOL)

with open('dic_list.pickle', 'wb') as handle:
    pickle.dump(dic_list, handle, protocol=pickle.HIGHEST_PROTOCOL)


If you check both file sizes, you will notice the difference.
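
For anyone who wants to reproduce the comparison without writing files, here is a minimal sketch. It assumes that the length of the bytes returned by `pickle.dumps` matches what `pickle.dump` would write to disk for the same protocol; the names `dict_size` and `list_size` are only illustrative.

```python
import pickle

dic = {
    'w1': [1, 2, 3, 4, 5, 6],
    'w2': [2, 3, 4, 5, 6],
    'w3': [3, 4, 5, 6],
    'w4': [4, 5, 6],
}
dic_list = list(dic.items())

# Serialise both representations in memory with the same protocol
dict_size = len(pickle.dumps(dic, protocol=pickle.HIGHEST_PROTOCOL))
list_size = len(pickle.dumps(dic_list, protocol=pickle.HIGHEST_PROTOCOL))

print('dict pickle:', dict_size, 'bytes')
print('list pickle:', list_size, 'bytes')
print('difference: ', list_size - dict_size, 'bytes')
```

On this small sample the list pickle should come out only a few bytes larger; the gap is expected to grow with the number of key-value pairs.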



I would like to know how and why they differ. Any additional information would be much appreciated.










• Closely related: Python memory consumption: dict VS list of tuples
  – Martijn Pieters
  Nov 19 at 14:29















python python-3.x list dictionary pickle






asked Nov 19 at 14:14 by leoschet
edited Nov 19 at 15:55 by Martijn Pieters

2 Answers

The dic_list list consists of more objects. You have an outer list of tuples, each tuple a key-value pair. Each value is another list. Those tuples are the reason you need more space.



The dictionary pickle format doesn't have to use tuple objects to store key-value pairs; it is already known up front that a dictionary consists of a series of pairs, so you can serialise key and value per such pair directly without the overhead of a wrapping tuple object.



You can analyse pickle data with the pickletools module; using a simpler dictionary with just one key-value pair, you can see the difference already:



>>> import pickle, pickletools
>>> pickletools.dis(pickle.dumps({'foo': 42}, protocol=pickle.HIGHEST_PROTOCOL))
    0: \x80 PROTO      4
    2: \x95 FRAME      12
   11: }    EMPTY_DICT
   12: \x94 MEMOIZE    (as 0)
   13: \x8c SHORT_BINUNICODE 'foo'
   18: \x94 MEMOIZE    (as 1)
   19: K    BININT1    42
   21: s    SETITEM
   22: .    STOP
highest protocol among opcodes = 4
>>> pickletools.dis(pickle.dumps(list({'foo': 42}.items()), protocol=pickle.HIGHEST_PROTOCOL))
    0: \x80 PROTO      4
    2: \x95 FRAME      14
   11: ]    EMPTY_LIST
   12: \x94 MEMOIZE    (as 0)
   13: \x8c SHORT_BINUNICODE 'foo'
   18: \x94 MEMOIZE    (as 1)
   19: K    BININT1    42
   21: \x86 TUPLE2
   22: \x94 MEMOIZE    (as 2)
   23: a    APPEND
   24: .    STOP


If you consider EMPTY_DICT + SETITEM to be the equivalent of EMPTY_LIST + APPEND, then the only real difference in that stream is the addition of the TUPLE2 / MEMOIZE pair of opcodes. It's those opcodes that take the extra space.
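
To check this on a slightly larger input, the opcodes can also be counted programmatically. The following is a rough sketch, assuming protocol 4 as in the output above; `opcode_counts` is a hypothetical helper written for this illustration, while `pickletools.genops` is the standard-library iterator over a pickle's opcodes.

```python
import pickle
import pickletools
from collections import Counter

def opcode_counts(obj, protocol=4):
    """Count how often each pickle opcode occurs when serialising obj."""
    data = pickle.dumps(obj, protocol=protocol)
    return Counter(op.name for op, arg, pos in pickletools.genops(data))

d = {'foo': 42, 'bar': 7}
dict_ops = opcode_counts(d)
list_ops = opcode_counts(list(d.items()))

# Counter subtraction keeps only opcodes that occur more often in the list pickle
extra = list_ops - dict_ops
print(extra)
```

With two pairs, the surplus should contain two TUPLE2 and two MEMOIZE opcodes, one of each per key-value pair, next to the list-specific container opcodes.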






A dict can natively handle key-value pairs, while a list must use a separate container.

Your dict is a straightforward representation of Dict[K, V]: pairs plus some structure. Since the structure is runtime only, it can be ignored for storage.

{'a': 1, 'b': 2}

Your list uses a helper for pairs, resulting in List[Tuple[K, V]]: pairs plus a wrapper. Since the wrapper is needed to reconstruct the pairs, it cannot be ignored for storage.

[('a', 1), ('b', 2)]

You can also inspect this in the pickle dump. The list dump contains markers for the additional tuples.

pickle.dumps({'a': 1, 'b': 2}, protocol=0)
(dp0    # <new dict>
Va      # string a
p1
I1      # integer 1
sVb     # <setitem key/value>, string b
p2
I2      # integer 2
s.      # <setitem key/value>

pickle.dumps(list({'a': 1, 'b': 2}.items()), protocol=0)
(lp0    # <new list>
(Va     # <marker>, string a
p1
I1      # integer 1
tp2     # <make tuple>
a(Vb    # <append>, <marker>, string b
p3
I2      # integer 2
tp4     # <make tuple>
a.      # <append>

While the surrounding dict and list are both stored as a sequence of pairs, the pairs themselves are stored differently. For the dict, only the key, the value and a closing setitem opcode are stored, flat in the stream. For the list, an additional tuple is needed for each pair.
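
As a sketch of how to obtain comparable opcode annotations automatically, `pickletools.dis` also accepts protocol 0 pickles and can write its listing to any text stream. The helper name `disassemble` below is made up for this illustration.

```python
import io
import pickle
import pickletools

def disassemble(obj, protocol=0):
    """Return the pickletools.dis() listing for obj as a string."""
    buffer = io.StringIO()
    pickletools.dis(pickle.dumps(obj, protocol=protocol), out=buffer)
    return buffer.getvalue()

pairs = {'a': 1, 'b': 2}
dict_listing = disassemble(pairs)
list_listing = disassemble(list(pairs.items()))

print(dict_listing)
print(list_listing)
```

The list listing should show one TUPLE opcode per pair, while the dict listing should contain none.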






• protocol 0 differs in enough ways from the current protocol revision 4 that I'd recommend against using it to illustrate what might be going on. pickletools.dis() is far more readable, at any rate.
  – Martijn Pieters
  Nov 19 at 14:45

• @MartijnPieters Well, the nesting is present either way and ASCII is less intimidating for some people. The point is that there is an additional container, not how these are encoded. That, plus your answer already showed the nicer dis output anyways. ;)
  – MisterMiyagi
  Nov 19 at 14:52

• The nesting is not present, not in the pickle.dumps() string, nor is the explanation of the opcodes. You had to add those yourself, or you used an IDE that detected you were dumping a pickle.
  – Martijn Pieters
  Nov 19 at 15:07

• @MartijnPieters I mean the tuples nested into the list. The relevant part is that the tuples must be dumped as well. I don't think it really matters where the formatting comes from, a pickle dump is low-level gibberish even with formatting/annotations for most people.
  – MisterMiyagi
  Nov 19 at 15:14











First answer (pickletools.dis() walkthrough): answered Nov 19 at 14:21 by Martijn Pieters, edited Nov 19 at 14:44

Second answer (protocol 0 dumps): answered Nov 19 at 14:24 by MisterMiyagi, edited Nov 19 at 15:17