Disk space - Python dictionary vs list
I was asked to create an inverted index and save its binary in multiple ways (with and without compression).
Long story short, I noticed that using a dict representation takes much less disk space than transforming it into a list.
Sample:
dic = {
    'w1': [1, 2, 3, 4, 5, 6],
    'w2': [2, 3, 4, 5, 6],
    'w3': [3, 4, 5, 6],
    'w4': [4, 5, 6]
}
dic_list = list(dic.items())

import pickle

with open('dic.pickle', 'wb') as handle:
    pickle.dump(dic, handle, protocol=pickle.HIGHEST_PROTOCOL)
with open('dic_list.pickle', 'wb') as handle:
    pickle.dump(dic_list, handle, protocol=pickle.HIGHEST_PROTOCOL)
If you check both file sizes, you will notice the difference.
So, I would like to know how and why they are different. Any additional information would be much appreciated.
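To reproduce the size comparison described in the question, a minimal sketch follows. It reuses the sample data, but the temporary directory, the `sizes` mapping and the pinned `protocol=4` are illustrative assumptions rather than part of the original post (the question used `pickle.HIGHEST_PROTOCOL`, whose value depends on the Python version).

```python
import os
import pickle
import tempfile

dic = {
    'w1': [1, 2, 3, 4, 5, 6],
    'w2': [2, 3, 4, 5, 6],
    'w3': [3, 4, 5, 6],
    'w4': [4, 5, 6],
}
dic_list = list(dic.items())

sizes = {}  # hypothetical helper mapping: file name -> size in bytes
with tempfile.TemporaryDirectory() as tmp:
    for name, obj in (('dic.pickle', dic), ('dic_list.pickle', dic_list)):
        path = os.path.join(tmp, name)
        with open(path, 'wb') as handle:
            # Protocol 4 is an assumption made for reproducibility.
            pickle.dump(obj, handle, protocol=4)
        sizes[name] = os.path.getsize(path)

for name, size in sizes.items():
    print(name, size, 'bytes')
print('difference:', sizes['dic_list.pickle'] - sizes['dic.pickle'], 'bytes')
```

On this sample the list file should come out slightly larger. Since the values are identical in both files, the gap should depend on the number of key-value pairs rather than on the length of each posting list.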
python python-3.x list dictionary pickle
2
Closely related: Python memory consumption: dict VS list of tuples
– Martijn Pieters♦
Nov 19 at 14:29
asked Nov 19 at 14:14 by leoschet
edited Nov 19 at 15:55 by Martijn Pieters♦
2 Answers
Answer (score 3)
The dic_list list consists of more objects. You have an outer list of tuples, each tuple a key-value pair. Each value is another list. Those tuples are the reason you need more space.
The dictionary pickle format doesn't have to use tuple objects to store key-value pairs; it is already known up front that a dictionary consists of a series of pairs, so you can serialise key and value per such pair directly without the overhead of a wrapping tuple object.
You can analyse pickle data with the pickletools module; using a simpler dictionary with just one key-value pair, you can see the difference already:
>>> import pickle, pickletools
>>> pickletools.dis(pickle.dumps({'foo': 42}, protocol=pickle.HIGHEST_PROTOCOL))
    0: \x80 PROTO      4
    2: \x95 FRAME      12
   11: }    EMPTY_DICT
   12: \x94 MEMOIZE    (as 0)
   13: \x8c SHORT_BINUNICODE 'foo'
   18: \x94 MEMOIZE    (as 1)
   19: K    BININT1    42
   21: s    SETITEM
   22: .    STOP
highest protocol among opcodes = 4
>>> pickletools.dis(pickle.dumps(list({'foo': 42}.items()), protocol=pickle.HIGHEST_PROTOCOL))
    0: \x80 PROTO      4
    2: \x95 FRAME      14
   11: ]    EMPTY_LIST
   12: \x94 MEMOIZE    (as 0)
   13: \x8c SHORT_BINUNICODE 'foo'
   18: \x94 MEMOIZE    (as 1)
   19: K    BININT1    42
   21: \x86 TUPLE2
   22: \x94 MEMOIZE    (as 2)
   23: a    APPEND
   24: .    STOP
If you consider EMPTY_DICT + SETITEM to be the equivalent of EMPTY_LIST + APPEND, then the only real difference in that stream is the addition of the TUPLE2 / MEMOIZE pair of opcodes. It's those opcodes that take the extra space.
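The same comparison can be made programmatically on the sample from the question. The sketch below is an illustration, not part of the original answer: the helper name `opcode_counts` is hypothetical, and protocol 4 is pinned to match the disassembly shown above. It uses `pickletools.genops()` to tally opcodes in each stream.

```python
import pickle
import pickletools
from collections import Counter


def opcode_counts(obj, protocol=4):
    """Count how often each pickle opcode occurs when serialising obj."""
    data = pickle.dumps(obj, protocol=protocol)
    # genops yields an (opcode, argument, position) triple per opcode
    return Counter(op.name for op, arg, pos in pickletools.genops(data))


dic = {
    'w1': [1, 2, 3, 4, 5, 6],
    'w2': [2, 3, 4, 5, 6],
    'w3': [3, 4, 5, 6],
    'w4': [4, 5, 6],
}

dict_ops = opcode_counts(dic)
list_ops = opcode_counts(list(dic.items()))

# Counter subtraction keeps only the opcodes that occur more often
# in the list-of-tuples pickle than in the dict pickle.
extra = list_ops - dict_ops
print('TUPLE2 in dict pickle:', dict_ops['TUPLE2'])
print('TUPLE2 in list pickle:', list_ops['TUPLE2'])
print('extra opcodes in list pickle:', dict(extra))
```

The list pickle should show one TUPLE2 per key-value pair and a matching number of additional MEMOIZE opcodes, while the dict pickle contains no TUPLE2 at all.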
Answer (score 1)
A dict can natively handle key-value pairs, while a list must use a separate container.
Your dict is a straightforward representation of Dict[K, V]: pairs plus some structure. Since the structure is runtime only, it can be ignored for storage.
{'a': 1, 'b': 2}
Your list uses a helper for pairs, resulting in List[Tuple[K, V]]: pairs plus wrapper. Since the wrapper is needed to reconstruct the pairs, it cannot be ignored for storage.
[('a', 1), ('b', 2)]
You can also inspect this in the pickle dump. The list dump contains markers for the additional tuples.
pickle.dumps({'a': 1, 'b': 2}, protocol=0)
(dp0 # <new dict>
Va # string a
p1
I1 # integer 1
sVb # <setitem key/value>, string b
p2
I2 # integer 2
s. # <setitem key/value>
pickle.dumps(list({'a': 1, 'b': 2}.items()), protocol=0)
(lp0 # <new list>
(Va # <marker>, string a
p1
I1 # integer 1
tp2 # <make tuple>
a(Vb # <append>, <marker>, string b
p3
I2 # integer 2
tp4 # <make tuple>
a. # <append>
While the surrounding dict and list are both stored as a sequence of pairs, the pairs are stored differently. For the dict, only key, value and stop are stored flatly. For the list, an additional tuple is needed for each pair.
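One way to see that the wrapper, and not the list itself, accounts for the extra bytes is to drop the tuples and store keys and values in alternating slots. The sketch below is only an illustration under stated assumptions: the `flat` layout and the even/odd pairing convention are hypothetical and not something the answer proposes, and protocol 4 is pinned for reproducibility.

```python
import pickle

pairs = {'a': 1, 'b': 2}

# Flatten to [key, value, key, value, ...]: no tuple wrapper per pair.
flat = [item for pair in pairs.items() for item in pair]

dict_bytes = pickle.dumps(pairs, protocol=4)
list_bytes = pickle.dumps(list(pairs.items()), protocol=4)
flat_bytes = pickle.dumps(flat, protocol=4)

print('dict:          ', len(dict_bytes), 'bytes')
print('list of tuples:', len(list_bytes), 'bytes')
print('flat list:     ', len(flat_bytes), 'bytes')

# Without tuples, the pairing has to be rebuilt by convention:
# even slots hold keys, odd slots hold values.
restored = pickle.loads(flat_bytes)
rebuilt = dict(zip(restored[0::2], restored[1::2]))
print(rebuilt == pairs)
```

The flat list should come out smaller than the list of tuples, at the cost of the reader having to know the pairing convention. A dict pickle carries that convention implicitly, which is why it does not need the wrapper.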
Protocol 0 differs in enough ways from the current protocol revision 4 that I'd recommend against using it to illustrate what might be going on. pickletools.dis() is far more readable, at any rate.
– Martijn Pieters♦
Nov 19 at 14:45
@MartijnPieters Well, the nesting is present either way and ASCII is less intimidating for some people. The point is that there is an additional container, not how these are encoded. That, plus your answer already showed the nicer dis output anyways. ;)
– MisterMiyagi
Nov 19 at 14:52
The nesting is not present, not in the pickle.dumps() string, nor is the explanation of the opcodes. You had to add those yourself, or you used an IDE that detected you were dumping a pickle.
– Martijn Pieters♦
Nov 19 at 15:07
@MartijnPieters I mean the tuples nested into the list. The relevant part is that the tuples must be dumped as well. I don't think it really matters where the formatting comes from, a pickle dump is low-level gibberish even with formatting/annotations for most people.
– MisterMiyagi
Nov 19 at 15:14
Answer with score 3: answered Nov 19 at 14:21 by Martijn Pieters♦ (edited Nov 19 at 14:44)
Answer with score 1: answered Nov 19 at 14:24 by MisterMiyagi (edited Nov 19 at 15:17)