I am writing Python code to do big-number calculations, and I am seriously concerned about the memory used in those calculations.
Thus, I want to count every bit of each variable.
For example, I have a variable x holding a big number, and I want to count the number of bits needed to represent x.
The following code is obviously useless:
x=2**1000
len(x)
So I turned to the following code instead:
x=2**1000
len(repr(x))
The variable x (in decimal) is:
10715086071862673209484250490600018105614048117055336074437503883703510511249361224931983788156958581275946729175531468251871452856923140435984577574698574803934567774824230985421074605062371141877954182153046474983581941267398767559165543946077062914571196477686542167660429831652624386837205668069376
but the code above returns 303.
The long decimal sequence above has 302 digits, so I believe 303 reflects only the length of the string representation.
So, here comes my original question:
How can I know the memory size of variable x?
One more thing: in C/C++, if I define
int z = 1;
then 4 bytes = 32 bits are allocated for z, arranged as 00...001 (31 zeros followed by a single 1).
My variable x here is huge; I don't know whether it follows the same allocation rule.
Use sys.getsizeof to get the size of an object, in bytes.
>>> from sys import getsizeof
>>> a = 42
>>> getsizeof(a)
12
>>> a = 2**1000
>>> getsizeof(a)
146
>>>
Note that the size and layout of an object are purely implementation-specific. CPython, for example, may use totally different internal data structures than IronPython. So the size of an object may vary from implementation to implementation.
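If what you actually need is the number of bits required to represent x, rather than the number of bytes the Python object occupies, int.bit_length() gives that directly:
>>> x = 2**1000
>>> x.bit_length()
1001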
Regarding the internal structure of a Python long, check sys.int_info (or sys.long_info for Python 2.7).
>>> import sys
>>> sys.int_info
sys.int_info(bits_per_digit=30, sizeof_digit=4)
Python either stores 30 bits into 4 bytes (most 64-bit systems) or 15 bits into 2 bytes (most 32-bit systems). Comparing the actual memory usage with calculated values, I get
>>> import math, sys
>>> a=0
>>> sys.getsizeof(a)
24
>>> a=2**100
>>> sys.getsizeof(a)
40
>>> a=2**1000
>>> sys.getsizeof(a)
160
>>> 24+4*math.ceil(100/30)
40
>>> 24+4*math.ceil(1000/30)
160
There are 24 bytes of overhead for 0, since no bits are stored. The memory requirements for larger values match the calculated values.
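That formula can be wrapped in a small helper (a sketch with a hypothetical name, assuming a 64-bit CPython with 24 bytes of overhead and 4-byte digits holding 30 bits each, as above; the constants differ on 32-bit builds):
import math, sys

def expected_int_size(n):
    # 24-byte object header plus one 4-byte digit per 30 bits of magnitude (assumed 64-bit CPython)
    return 24 + 4 * math.ceil(n.bit_length() / 30)

print(expected_int_size(2**1000), sys.getsizeof(2**1000))  # 160 160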
If your numbers are so large that you are concerned about the 6.25% unused bits, you should probably look at the gmpy2 library. The internal representation uses all available bits and computations are significantly faster for large values (say, greater than 100 digits).
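For illustration, a minimal gmpy2 sketch (assuming gmpy2 is installed; its mpz type mirrors the built-in int API, including bit_length()):
import gmpy2

x = gmpy2.mpz(2) ** 1000   # arbitrary-precision integer backed by GMP
print(x.bit_length())      # 1001, the same answer as the built-in int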
As seen in Find the memory size of a set of strings vs. set of bytestrings, it's difficult to measure precisely the memory used by a set or list containing strings. But here is a good estimate/upper bound:
import os, psutil
process = psutil.Process(os.getpid())
a = process.memory_info().rss
L = [b"a%09i" % i for i in range(10_000_000)]
b = process.memory_info().rss
print(L[:10]) # [b'a000000000', b'a000000001', b'a000000002', b'a000000003', b'a000000004', b'a000000005', b'a000000006', b'a000000007', b'a000000008', b'a000000009']
print(b-a)
# 568762368 bytes
i.e. 569 MB for 100 MB of actual data.
Solutions to improve this (for example with other data structures) have been found in Memory-efficient data structure for a set of short bytes-strings and Set of 10-char strings in Python is 10 times bigger in RAM as expected, so my question here is not "how to improve", but:
How can we precisely explain this size in the case of a standard list of byte-strings?
How many bytes are used for each byte-string and for each (linked?) list item, to finally arrive at 569 MB?
This will help in understanding the internals of lists and byte-strings in CPython (platform: 64-bit Windows).
Summary:
89 MB for the list object
480 MB for the string objects
=> total 569 MB
sys.getsizeof(L) will tell you the list object itself is about 89 MB. That's a few dozen organizational bytes, 8 bytes per bytestring reference, and up to 12.5% overallocation to allow efficient insertions.
sys.getsizeof(one_of_your_bytestrings) will tell you they're 43 bytes each. That's:
8 bytes for the reference counter
8 bytes for the pointer to the type
8 bytes for the length (since bytestrings aren't fixed size)
8 bytes for the cached hash
10 bytes for your actual bytestring content
1 byte for a terminating 0-byte.
Storing the objects every 43 bytes in memory would cross memory word boundaries, which is slower, so they're actually stored every 48 bytes most of the time. You can use id(one_of_your_bytestrings) to get the addresses and check.
(There's some variance here and there, partly due to the exact memory allocations that happen, but 569 MB is about what's expected knowing the above reasons, and it matches what you measured.)
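As a rough cross-check of that breakdown with getsizeof (a sketch; the exact figures vary with the Python version and allocator):
import sys

L = [b"a%09i" % i for i in range(10_000_000)]

list_bytes = sys.getsizeof(L)                    # ~89 MB: references plus overallocation
string_bytes = sum(sys.getsizeof(s) for s in L)  # 43 bytes per bytestring, ~430 MB
padded_bytes = 48 * len(L)                       # ~480 MB once allocator alignment is counted
print(list_bytes, string_bytes, padded_bytes)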
The following two snippets are equivalent, but the first one takes about 700 MB of memory while the second takes only about 100 MB (as reported by the Windows Task Manager). What happens here?
def a():
    lst = []
    for i in range(10**7):
        t = "a"
        t = t * 2
        lst.append(t)
    return lst

_ = a()
def a():
    lst = []
    for i in range(10**7):
        t = "a" * 2
        lst.append(t)
    return lst

_ = a()
@vurmux presented the right reason for the different memory usage - string interning - but some important details seem to be missing.
The CPython implementation interns some strings during compilation, e.g. "a"*2 - for more info about how/why "a"*2 gets interned, see this SO post.
Clarification: as @MartijnPieters correctly pointed out in his comment, the important thing is whether the compiler performs constant folding (i.e. evaluates the multiplication of two constants such as "a"*2 at compile time). If constant folding is done, the resulting constant is reused and all elements in the list are references to the same object; otherwise they are not. Even though all short string constants get interned here (and thus constant folding implies an interned string), it was sloppy to speak of interning: constant folding is the key, as it also explains the behavior for types that have no interning at all, for example floats (if we were to use t = 42 * 2.0).
Whether constant folding has happened can easily be verified with the dis module (I call your second version a2()):
>>> import dis
>>> dis.dis(a2)
...
4 18 LOAD_CONST 2 ('aa')
20 STORE_FAST 2 (t)
...
As we can see, the multiplication isn't performed at run time; instead, the result of the multiplication (computed at compile time) is loaded directly - so the resulting list consists of references to the same object (the constant loaded by 18 LOAD_CONST 2):
>>> len({id(s) for s in a2()})
1
There, only 8 bytes per reference are needed, which means about 80 MB of memory (+ overallocation of the list + memory needed for the interpreter).
In Python 3.7, constant folding isn't performed if the resulting string has more than 4096 characters, so replacing "a"*2 with "a"*4097 leads to the following bytecode:
>>> dis.dis(a1)
...
4 18 LOAD_CONST 2 ('a')
20 LOAD_CONST 3 (4097)
22 BINARY_MULTIPLY
24 STORE_FAST 2 (t)
...
Now the multiplication isn't precalculated, and the references in the resulting list will be to different objects.
The optimizer is not yet smart enough to recognize that t is actually "a" in t = t * 2; otherwise it would be able to perform the constant folding. For now, the resulting bytecode for your first version (I call it a1()) is:
...
5 22 LOAD_FAST 2 (t)
24 LOAD_CONST 3 (2)
26 BINARY_MULTIPLY
28 STORE_FAST 2 (t)
...
and it will return a list containing 10^7 different objects (all of them equal, though):
>>> len({id(s) for s in a1()})
10000000
i.e. you will need about 56 bytes per string (sys.getsizeof returns 51, but because the pymalloc memory allocator is 8-byte aligned, 5 bytes are wasted) + 8 bytes per reference (assuming a 64-bit CPython version), thus about 610 MB (+ overallocation of the list + memory needed for the interpreter).
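The arithmetic behind that figure, in binary megabytes:
>>> (56 + 8) * 10**7 / 2**20   # 56 bytes per string plus 8 bytes per reference
610.3515625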
You can enforce the interning of the string via sys.intern:
import sys

def a1_interned():
    lst = []
    for i in range(10**7):
        t = "a"
        t = t * 2
        # ensure that the string object gets interned;
        # the returned value is the interned version
        t = sys.intern(t)
        lst.append(t)
    return lst
And really, we can now see not only that less memory is needed, but also that the list holds references to the same object (see it online for a slightly smaller size (10^5) here):
>>> len({id(s) for s in a1_interned()})
1
>>> all(s == "aa" for s in a1_interned())
True
String interning can save a lot of memory, but it is sometimes tricky to understand whether and why a string gets interned. Calling sys.intern explicitly eliminates this uncertainty.
The existence of additional temporary objects referenced by t is not the problem: CPython uses reference counting for memory management, so an object is deleted as soon as there are no references to it - without any involvement of the garbage collector, which in CPython is only used to break up reference cycles (unlike, for example, Java's GC, since Java doesn't use reference counting). Thus, the temporary variables really are temporary - those objects cannot accumulate and impact memory usage.
The only problem with the temporary variable t is that it prevents the peephole optimization during compilation, which is performed for "a"*2 but not for t*2.
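To observe the difference without the Task Manager, here is a measurement sketch using the standard library's tracemalloc module (the helper names are mine; 10^6 iterations keep the run short, and the absolute numbers vary by version):
import tracemalloc

def folded():
    # "a" * 2 is constant-folded at compile time: every element references one shared 'aa'
    return ["a" * 2 for i in range(10**6)]

def unfolded():
    lst = []
    for i in range(10**6):
        t = "a"
        t = t * 2  # computed at run time: a fresh string object each iteration
        lst.append(t)
    return lst

for f in (folded, unfolded):
    tracemalloc.start()
    lst = f()
    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    print(f.__name__, current)  # unfolded reports far more allocated bytes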
This difference exists because of string interning in the Python interpreter:
String interning is the method of caching particular strings in memory as they are instantiated. The idea is that, since strings in Python are immutable objects, only one instance of a particular string is needed at a time. By storing an instantiated string in memory, any future references to that same string can be directed to refer to the singleton already in existence, instead of taking up new memory.
Let me show it in a simple example:
>>> t1 = 'a'
>>> t2 = t1 * 2
>>> t2 is 'aa'
False
>>> t1 = 'a'
>>> t2 = 'a'*2
>>> t2 is 'aa'
True
When you use the first variant, Python's string interning is not used, so the interpreter has to create additional internal objects to store temporary data. It can't optimize multi-line code this way.
I am not a Python guru, but I think the interpreter works this way:
t = "a"
t = t * 2
In the first line it creates an object for t. In the second line it creates a temporary object for the expression to the right of the = sign and writes the result to a third place in memory (with the GC invoked later). So the second variant should use at least 3 times less memory than the first.
P.S. You can read more about string interning here.
I am trying to compare the sizes of data types in Python with sys.getsizeof(). However, for integers and floats it returns the same value - 24 (not the customary 4 or 8 bytes). Also, the size of an array declared with array.array() with 4 integer elements is reported as 72 (not 96), and with 4 float elements as 88 (not 96). What is going on?
import array, sys
arr1 = array.array('d', [1,2,3,4])
arr2 = array.array('i', [1,2,3,4])
print sys.getsizeof(arr1[1]), sys.getsizeof(arr2[1]) # 24, 24
print sys.getsizeof(arr1), sys.getsizeof(arr2) # 88, 72
The function sys.getsizeof() returns the amount of space the Python object takes, not the amount of space you would need to represent the data in that object in the memory of the underlying system.
Python objects have overhead to cover reference counting (for garbage collection) and other implementation-related stuff. In addition, an array is not a naive sequence of floats or ints; the data structure has a fair amount of stuff under the hood that keeps track of datatype, number of elements and so on. That's where the 'd' or 'i' lives, for example.
To get the answers I think you are expecting, try
print (arr1.itemsize * len(arr1))
print (arr2.itemsize * len(arr2))
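A sketch that separates the fixed per-array bookkeeping from the raw element storage (the exact overhead depends on the Python version and platform):
import array, sys

arr = array.array('i', [1, 2, 3, 4])
raw = arr.itemsize * len(arr)          # 16 bytes of actual element data
overhead = sys.getsizeof(arr) - raw    # fixed bookkeeping: type code, length, buffer pointer, ...
print(raw, overhead)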
First of all, these are my computer specs:
Memory - https://gist.github.com/vyscond/6425304
CPU - https://gist.github.com/vyscond/6425322
This morning I tested the following two code snippets:
code A
a = 'a' * 1000000000
and code B
a = 'a' * 10000000000
Code A works fine, but code B gives me an error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
MemoryError
So I started researching methods for measuring the size of data in Python.
The first thing I found is the classic built-in function len().
For code A, len() returned the value 1000000000, but for code B the same MemoryError was raised.
After this I decided to get more precision in these tests, so I found the getsizeof() function from the sys module. With it I ran the same test on code A:
sys.getsizeof( 'a' * 1000000000 )
the result returned is 1000000037 (in bytes)
question 1 - does that mean 0.9313226090744 gigabytes?
So I checked the number of bytes in a string with the single character 'a':
sys.getsizeof( 'a' )
the result returned is 38 (in bytes)
question 2 - does that mean that a string composed of 1000000000 'a' characters will take 38 * 1000000000 = 38,000,000,000 bytes?
question 3 - does that mean we would need 35.390257835388 gigabytes to hold such a string?
I would like to know where the error in this reasoning is, because it makes no sense to me.
Python objects have a minimal size, the overhead of keeping several pieces of bookkeeping data attached to the object.
A Python str object is no exception. Take a look at the difference between a string with no, one, two and three characters:
>>> import sys
>>> sys.getsizeof('')
37
>>> sys.getsizeof('a')
38
>>> sys.getsizeof('aa')
39
>>> sys.getsizeof('aaa')
40
The Python str object overhead is 37 bytes on my machine, but each character in the string only takes one byte over the fixed overhead.
Thus, a str value with 1000 million characters requires 1000 million bytes + 37 bytes overhead of memory. That is indeed about 0.931 gigabytes.
Your sample code B created ten times more characters, so you needed nearly 10 gigabytes of memory just to hold that one string, not counting the rest of Python, the OS, and whatever else might be running on that machine.
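A quick check that each character costs exactly one byte over the fixed overhead (a sketch for a narrow Python 2 build like this one; on Python 3, strings are Unicode, but the same relationship holds for pure-ASCII text):
import sys

overhead = sys.getsizeof('')           # the fixed object overhead, 37 bytes here
for n in (1, 100, 10000):
    assert sys.getsizeof('a' * n) == overhead + n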
The program I've written stores a large amount of data in dictionaries. Specifically, I'm creating 1588 instances of a class, each of which contains 15 dictionaries with 1500 float to float mappings. This process has been using up the 2GB of memory on my laptop pretty quickly (I start writing to swap at about the 1000th instance of the class).
My question is, which of the following is using up my memory?
The 34 million or so pairs of floats?
The overhead of 22,500 dictionaries?
The overhead of 1500 class instances?
To me it seems like the memory hog should be the huge number of floating-point numbers I'm holding in memory. However, if what I've read so far is correct, each of my floating-point numbers takes up 16 bytes. Since I have 34 million pairs, this should be about 108 million bytes, which should be just over a gigabyte.
Is there something I'm not taking into consideration here?
The floats do take up 16 bytes apiece, and a dict with 1500 entries about 100k:
>>> sys.getsizeof(1.0)
16
>>> d = dict.fromkeys((float(i) for i in range(1500)), 2.0)
>>> sys.getsizeof(d)
98444
so the 22,500 dicts take over 2 GB all by themselves, and the 68 million floats another GB or so. Not sure how you computed that 68 million times 16 equals only 100M -- you may have dropped a zero somewhere.
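Multiplying out the figures above (rough arithmetic; exact sizes vary by build):
>>> 22500 * 98444      # the dict objects alone
2214990000
>>> 68 * 10**6 * 16    # the floats themselves
1088000000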
The class itself takes up a negligible amount, and the 1500 instances thereof (net of the objects they refer to, of course, just as getsizeof gives us such net amounts for the dicts) take not much more than a smallish dict each, so they're hardly the problem. I.e.:
>>> sys.getsizeof(Sic)
452
>>> sys.getsizeof(Sic())
32
>>> sys.getsizeof(Sic().__dict__)
524
452 bytes for the class, and (524 + 32) * 1550 = 862K for all the instances; as you can see, that's not the worry when you have gigabytes each in dicts and floats.