Python - Garbage collector for list - python

I made a simple loop that appends a number to a list on each iteration.
After the program has completed, will the memory used by the list be automatically freed?
if __name__ == '__main__':
    list = []  # note: this name shadows the built-in list type
    for i in range(100000):
        list.append(i)
Can anyone explain this to me, please?

Yes, all memory is freed when the program is terminated. There is just no way in a modern operating system for you to reserve memory and not have it freed when the process is terminated.
The garbage collector is for freeing memory before the program terminates. This way long-running programs won't reserve all the resources of the computer.
If you have a big data structure, and you want the garbage collector to take care of it (free the memory used by it), you should remove all references to it after you're done using it. In this case a simple del list would be sufficient.
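A minimal sketch of that last point, assuming CPython (where reference counting frees an object as soon as its last reference goes away) and Python 3. The BigList class and the name numbers are hypothetical; a plain list cannot be the target of a weak reference, so a trivial subclass is used here to observe the deallocation.

```python
import weakref


class BigList(list):
    """Trivial list subclass; plain lists do not support weak references."""


def demo_del():
    numbers = BigList(range(100000))
    probe = weakref.ref(numbers)      # a weak reference does not keep the list alive
    alive_before = probe() is not None
    del numbers                       # drop the only strong reference
    alive_after = probe() is not None
    return alive_before, alive_after


print(demo_del())  # on CPython: (True, False)
```

On other implementations the second value may stay True until a collection actually runs.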

Yes, the operating system takes care and frees the memory used by a process after it terminates. Python has nothing to do with it.
However, Python itself has automatic garbage collection, so it frees all of the memory that is no longer necessary while your Python program is running.
Finally, you probably should just use:
if __name__ == '__main__':
    list = range(100000)  # Python 2; in Python 3 use list(range(100000))
to achieve exactly the same thing you have written.

Yes, the memory will be freed when the program terminates, and on most implementations it will also be freed when the reference count of that list reaches zero, i.e., when there are no longer any variables (or other objects) referring to that value.
You can also manually control the GC using the gc module.
If you are just iterating over the list and the list is big enough to get you worried about memory consumption, you probably should check Python generators and generator expressions.
So instead of:
results = []
for i in range(100000):
    results.append(do_something_with(i))
for result in results:
    do_something_else_with(result)
You can write:
partial = (do_something_with(i) for i in range(100000))
for result in partial:
    do_something_else_with(result)
Or:
def do_something(iterable):
    for item in iterable:
        yield some_calculation(item)
def do_something_else(iterable):
    for item in iterable:
        yield some_other_calculation(item)
partial = do_something(range(100000))
for result in do_something_else(partial):
    print(result)
And so on... This way you don't have to allocate the whole list in memory.
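To get a rough feel for the difference, here is a small sketch (Python 3 assumed) comparing the size of a fully built list with the size of an equivalent generator object. The function do_something_with is a hypothetical stand-in for the real per-item work, and sys.getsizeof only reports the size of the container itself, not of the items it refers to.

```python
import sys


def do_something_with(i):
    # Hypothetical stand-in for the real per-item work.
    return i * 2


as_list = [do_something_with(i) for i in range(100000)]
as_generator = (do_something_with(i) for i in range(100000))

list_size = sys.getsizeof(as_list)            # grows with the number of items
generator_size = sys.getsizeof(as_generator)  # small, independent of item count

total = sum(as_generator)  # items are produced one at a time here
```

The generator can only be consumed once, which is the price paid for not holding every result in memory.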

You needn't worry about freeing memory in Python, since this is handled automatically by the interpreter. Python uses reference counting to manage memory: when something is no longer being referenced, Python will automatically deallocate its memory. In other words, removing all references to the list should be enough to free the memory that was allocated to it.
That said, if your program is huge, you may use gc.collect() to make the garbage collector free some memory. However, this is generally unnecessary, as Python's garbage collector is designed to do its job well enough.
Moreover, although this is not recommended, and generally never very useful, you may also disable Python's automatic garbage collector using gc.disable(). Note that this only switches off the collector that handles reference cycles; reference counting still frees objects as usual, so it does not give you C-style manual control over memory.
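A small sketch of what gc.disable() does and does not affect, assuming CPython and Python 3. The Node class is a hypothetical helper used only for this demonstration.

```python
import gc
import weakref


class Node(object):
    """Hypothetical helper class used only for this demonstration."""


def demo_gc_disabled():
    was_enabled = gc.isenabled()
    gc.disable()                      # turns off automatic cycle collection
    try:
        # No cycle: reference counting frees the object right away.
        single = Node()
        single_probe = weakref.ref(single)
        del single
        single_freed = single_probe() is None

        # A reference cycle: the counts never reach zero on their own.
        a, b = Node(), Node()
        a.other, b.other = b, a
        cycle_probe = weakref.ref(a)
        del a, b
        cycle_freed_without_gc = cycle_probe() is None

        gc.collect()                  # an explicit collection still works
        cycle_freed_after_collect = cycle_probe() is None
    finally:
        if was_enabled:
            gc.enable()               # restore the previous state
    return single_freed, cycle_freed_without_gc, cycle_freed_after_collect
```

On CPython this should return (True, False, True): only the objects caught in a cycle have to wait for a collection.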

All the memory will be freed after termination, but if you want to be efficient when creating a list this large during execution, use xrange instead (Python 2), which generates the numbers on the fly. In Python 3, range already behaves this way.
alist = [i for i in xrange(10000)]

Related

Freeing up all used memory in python

Setup:
I am running a python code where:
I open a file.
For every line in file, I create an object
Do some operations with the object
Note that once I am done with the operations part, I no longer need the object. Every new line is independent.
Relevant Code as per request:
I have commented out all the other parts of my code, leaving only the following:
import gc
for l in range(num_lines):
    inp = f.readline()[:-1]
    collector = [int(i) for i in inp]
    M = BooleanFunction(collector)
    deg = M.algebraic_degree()
    del M
    gc.collect()
The problem:
Once created, the object consumes some amount of memory. After performing the operations, I am not able to free it. So while looping over the file, memory keeps filling up with new objects, and by around 793 lines into the file, my 16 GB of RAM is completely depleted.
What I have tried:
Using the garbage collector:
import gc
del Object
gc.collect()
However, either the garbage collector does not free up the RAM, or Python is not giving the memory back to the system. Creating child processes is an idea, but not one I am keen on.
Questions:
Is there any way I can free up all the memory currently occupied by the program to the OS? That means removing all variables (loop vars, global vars, etc). Something similar to what happens when you press CTRL+C to terminate the program, it returns all the memory to the OS.
A way to specifically de-allocate an object (If I am not doing it right).
Previous questions do not cover what to do when gc.collect() fails to free the memory, or how to give up all allocated memory completely.
Objects in Python can be garbage-collected once their reference count drops to zero.
Looking at your code, every variable gets re-assigned in every iteration. So the reference count of the old objects should drop to zero.
If that doesn't happen then I can see three main possibilities:
1. You are unwittingly keeping a reference to that object.
2. Garbage collection is disabled (gc.disable()) or frozen (gc.freeze(), available since Python 3.7).
3. The objects are made by a Python extension written in C that manages its own memory.
Note that (1) or (2) need not happen in your own code. It can also happen in modules that you use.
In your case (2) should not be an issue since you force garbage collection.
For an example of (1), consider what would happen if BooleanFunction was memoized. Then a reference to each object (that you wouldn't see and can't delete) would be kept.
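Here is a hedged sketch of how memoization keeps objects alive, assuming CPython and Python 3. Thing and make_thing are hypothetical stand-ins; whether BooleanFunction actually caches anything is not known from the question.

```python
import functools
import weakref


class Thing(object):
    """Hypothetical stand-in for an object built once per input line."""

    def __init__(self, key):
        self.key = key


@functools.lru_cache(maxsize=None)
def make_thing(key):
    # The cache stores every return value, so it holds a hidden reference.
    return Thing(key)


def demo_memoized():
    obj = make_thing(1)
    probe = weakref.ref(obj)
    del obj
    alive_after_del = probe() is not None    # still referenced by the cache
    make_thing.cache_clear()                 # drop the cache's references
    alive_after_clear = probe() is not None
    return alive_after_del, alive_after_clear
```

On CPython this should return (True, False): del alone does nothing while the cache still refers to the object. A cache buried inside a third-party module may not offer an equivalent of cache_clear().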
The only way to give all memory back to the OS is to terminate the program.
Edit 1:
Try running your program with the garbage collection debug flags enabled (gc.set_debug(gc.DEBUG_LEAK)). Call gc.get_count() at the end of every loop iteration, and maybe inspect the gc.garbage list as well.
For a better understanding of where the memory allocation happens and what exactly happens, you could run your script under the Python debugger. Step through the program line by line while monitoring the resident set size of the Python process with ps in another terminal.
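As a rough illustration of those gc calls (Python 3 assumed): the sketch below uses gc.DEBUG_SAVEALL, which is one of the flags included in gc.DEBUG_LEAK, so that nothing is printed to stderr. The Node class is a hypothetical helper.

```python
import gc


class Node(object):
    """Hypothetical class used to build a reference cycle."""


def demo_saveall():
    gc.collect()                        # start from a clean state
    gc.garbage.clear()
    gc.set_debug(gc.DEBUG_SAVEALL)      # keep unreachable objects in gc.garbage
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a         # reference cycle
        del a, b
        counts_before = gc.get_count()  # per-generation counters
        gc.collect()
        saved = [o for o in gc.garbage if isinstance(o, Node)]
    finally:
        gc.set_debug(0)                 # switch the debug flags off again
        gc.garbage.clear()
    return counts_before, len(saved)
```

If objects of your own classes keep showing up in gc.garbage this way, they were part of reference cycles; if they never show up and memory still grows, something is holding an ordinary reference to them.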

Is it possible for a Python function to still use memory after being called?

If I run a function in Python 3 (func()) is it possible that objects that are created inside func() but cannot be accessed after it has finished would cause it to increase its memory usage?
For instance, will running
def func():
    # Objects being created, that are not able to be used after function call has ended.
    pass
while True:
    func()
ever cause the program to run out of memory, no matter what is in func()?
If the program is continually using memory, what are some possible things that could be going on in func() to cause it to continue using memory after it has been called?
Edit:
I'm only asking about creating objects that can no longer be accessed after the function has ended, so they should be deleted.
Yes, it is possible for a Python function to still use memory after being
called.
Python uses garbage collection (GC) for memory management. Most GCs (I suppose
there could be some exceptions) make no guarantee if or when they will free
the memory of unreferenced objects. Say you have a function
consume_lots_of_memory() and call it as:
while True:
    consume_lots_of_memory()
There is no guarantee that all of the memory allocated in the first call
to consume_lots_of_memory() will be released before it is called a
second time. Ideally the GC would run after the call finished, but it
might run half way through the fifth call. So depending on when the GC
runs, you could end up consuming more memory than you would expect and
possibly even run out of memory.
Your function could be modifying global state, and using large amounts of
memory that never gets released. Say you have a module level cache, and a
function cache_lots_of_objects() called as:
module_cache = {}
while True:
    cache_lots_of_objects()
Every call to cache_lots_of_objects() only ever adds to the cache, and
the cache just keeps consuming more memory. Even if the GC promptly
releases the non-cached objects created in cache_lots_of_objects(), your
cache could eventually consume all of your memory.
You could be encountering an actual memory leak from Python itself (unlikely
but possible), or from a third-party library improperly using the C API, using
a leaky C library, or incorrectly interfacing with a C library.
One final note about memory usage. Just because Python has freed allocated
objects, it does not necessarily mean that the memory will be released from the process
and returned to the operating system. The reason has to do with how memory is
allocated to a process in chunks (pages). See abarnert's answer
to Releasing memory in Python
for a better explanation than I can offer.
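The module-level cache scenario described above can be sketched as follows. This is only an illustration: the body of cache_lots_of_objects, its parameters, and the fixed number of iterations are assumptions standing in for the real function and the endless loop.

```python
module_cache = {}


def cache_lots_of_objects(call_number, items_per_call=1000):
    # Every call adds new entries and never removes old ones.
    for i in range(items_per_call):
        module_cache[(call_number, i)] = [0] * 10


sizes = []
for call_number in range(5):           # stand-in for `while True`
    cache_lots_of_objects(call_number)
    sizes.append(len(module_cache))    # grows by items_per_call every time
```

Nothing created here is unreachable, so no garbage collector can help; the fix is to bound or clear the cache.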

Is there a need to delete a large variable in python immediately after its use?

If I create a list that's 1 GB, print it to screen, then delete it, would it also be deleted from memory? Would this deletion essentially be like deallocating memory with free() in C?
If a variable is very large should I delete it immediately after use or should I let Python's garbage collector handle it ?
# E.g. creating and deleting a large list
largeList = ['data', 'etc', 'etc'] # Continues to 1 GB
print largeList
largeList = [] # Or would it need to be: del largeList [:]
Most of the time, you shouldn't worry about memory management in a garbage-collected language. That includes actions such as deleting variables (in Python) or calling the garbage collector (GC) manually.
You should trust the GC to do its job properly. Most of the time, micromanaging memory leads to worse results, as the GC has a lot more information and statistics about what memory is used and needed than you do. Also, garbage collection is an expensive process (CPU-wise), so there's a good chance you'd be calling it too often or at a bad moment.
What happens in your example is that, as soon as you assign largeList = [], the memory previously referenced becomes eligible for collection and will be reclaimed as soon as it's convenient, or when the memory is needed.
You can check this using an interpreter and a memory monitor:
# 5 MiB used
>>> l1 = [0]*1024*1024*32
# 261 MiB used
>>> l2 = [0]*1024*1024*32
# 525 MiB used
>>> l1 = [0]*1024*1024*32
# 525 MiB used
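If no external memory monitor is at hand, a similar check can be sketched with the standard tracemalloc module (Python 3.4+). This is a substitute for the monitor used above, not the same measurement: tracemalloc reports memory allocated by Python, not the process size seen by the OS.

```python
import tracemalloc

tracemalloc.start()

l1 = [0] * (1024 * 1024)                 # about 8 MiB of pointers on 64-bit CPython
with_list, _ = tracemalloc.get_traced_memory()

l1 = []                                  # rebinding drops the only reference
after_rebind, _ = tracemalloc.get_traced_memory()

tracemalloc.stop()
```

Here after_rebind should come out far below with_list, because the old list is released as soon as the name is rebound.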
There are very rare cases where you do need to manage memory manually, and you turn off garbage collection. Of course, that can lead to memory leak bugs such as this one. It's worth mentioning that the modern python GC can handle circular references properly.
Using del: deleting a name removes the binding of that name from the local or global namespace. It does release memory, but not necessarily all of it.
NOTE: When a process frees some memory from the heap, that memory is often returned to the OS only after the process dies.
So it is better to leave it to the garbage collector.
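On the question of largeList = [] versus del largeList[:], the two are not equivalent when another name refers to the same list. A short sketch (the name alias is hypothetical):

```python
large_list = ['data', 'etc', 'etc']
alias = large_list            # a second reference to the same list object

large_list = []               # rebinds the name; `alias` still holds the data
after_rebind = list(alias)

large_list = alias
del large_list[:]             # empties the list object itself, for every reference
after_slice_delete = list(alias)
```

Rebinding only frees the data if no other reference exists, while the slice deletion empties the list in place for everyone who holds it.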

Virtual memory management in Python (2.6-2.7) - Will it reuse the allocated memory for any type of data?

I have a question about the virtual memory in Python.
When the process is consuming a relatively large amount of memory, it doesn't "release" the unused memory. For example, after creating a massive list of strings, say the list uses 30MB of memory, so the entire process takes roughly 40MB. When the list is deleted, the process is still consuming 40MB, but if another list with the same amount of data is created, the process will not take more memory, because it will reuse the virtual memory that is available but was not released to the OS.
My question is: What kind of data will reuse that non-released virtual memory? I mean, that 30MB was "taken" from the OS when I created a list of strings, and even when I delete it, the next list of strings will not take more memory from the OS as long as it fits in the 30MB. But if, instead of a list of strings, another type of data is created, like a QPixmap (from Qt, using PyQt), will it use that 30MB originally allocated for the list of strings?
Thank you in advance.
Edit: Well, this question sounds lazy. I know I could simply test this specific case, but I want to know the theory. I don't want the answer for this specific "list of strings and QPixmap" case, but in general.
At the C level (CPython's implementation), anything that is allocated on the heap with malloc() consumes memory, and that memory is typically not returned to the OS when it is freed with free(); in the worst case it is only returned when the process dies. But when new blocks are allocated with malloc(), they can reuse the freed-up memory.
(Unless the free memory is really badly fragmented and there is not enough contiguous free space in the freed-up zones to accommodate new allocations. But let's not worry about this pathological case.)
Every Python object is implemented by CPython as one or more blocks of memory allocated with malloc() so the answer to your question is: pretty much any piece of Python data can reuse the space that was freed by the deallocation of some other piece of Python data.
There are two parts to the problem of "freeing" memory: first, getting Python to garbage collect the objects, and second, getting unused memory returned to the OS at the C level.
If you are having problems with process size growing without bounds, you are almost certainly not allowing objects to be garbage collected. 99.9% of the time (to 0 significant digits :) ) if you are trying to second-guess Python's C-level memory management, you are in a bunny hole.
Remember that in Python your objects are not even candidates to be garbage collected until there are no more live objects with references to them. You can very easily squirrel away a reference to an object somewhere without realizing it.
There's a Python tool called Dowser that is very helpful at finding leaks of memory caused by keeping around references to objects. If you see your object count for a certain class growing without bounds over time.... there's your memory problem.
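If installing a tool is not an option, the same idea (watching per-class object counts) can be approximated with the standard gc module. This is not Dowser's API, just a sketch under that assumption; the Session class and the forgotten list are hypothetical.

```python
import gc
from collections import Counter


class Session(object):
    """Hypothetical class whose instances are accidentally kept alive."""


def count_instances():
    # Tally all objects tracked by the collector, by class name.
    return Counter(type(o).__name__ for o in gc.get_objects())


forgotten = []                        # the "squirrelled away" references
before = count_instances()['Session']
for _ in range(100):
    forgotten.append(Session())
after = count_instances()['Session']
```

Calling count_instances() periodically and comparing the results shows which class is growing.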
Good luck!

Do Python dictionaries have all memory freed when reassigned?

Working in Python. I have a function that reads from a queue and creates a dictionary based on some of the XML tags in the record read from the queue, and returns this dictionary. I call this function in a loop forever. The dictionary gets reassigned each time. Does the memory previously used by the dictionary get freed at each reassignment, or does it get orphaned and eventually cause memory problems?
def readq():
    qtags = {}
    # Omitted code to read the queue record, get XML string, DOMify it
    qtags['result'] = "Success"
    qtags['call_offer_time'] = get_node_value_by_name(audio_dom, 'call_offer_time')
    # More omitted code to extract the rest of the tags
    return qtags
while signals.sigterm_caught == False:
    tags = readq()
    if tags['result'] == "Empty":
        time.sleep(SLEEP_TIME)
        continue
    # Do stuff with the tags
So when I reassign tags each time in that loop, will the memory used by the previous assignment get freed before being allocated by the new assignment?
The memory of an object will be freed if it can be proven (from the knowledge the language implementation has at runtime) that it cannot possibly be accessed any more and the garbage collector sees fit to make a collection. That's the absolute minimum, and you shouldn't assume any more. And you usually shouldn't have to worry about anything more.
More practically speaking, it may be freed at some point in time between the last reference (where "reference" isn't limited to names in scope, but can be anything that makes the object reachable) being removed and memory running out. It doesn't have to be freed by the Python implementation running your code, it may as well leave the memory cleaning to the OS and forget about any finalizers and such. Note that there can be a noticeable delay between the last reference dying and memory usage actually dropping. But as mentioned before, most implementations go out of their way to avoid excessive memory usage if there is garbage to collect.
Even more practically, you'll probably be running this on CPython (the reference implementation), which has always used and most probably will always use reference counting (augmented with a real GC to handle cyclic references). So unless there's a cyclic reference (relatively rare, and your code doesn't look like it has any, but they can occur e.g. in graph-like structures), the object will be freed as soon as the last reference to it is deleted or overwritten. Of course, other implementations aren't that predictable: PyPy alone has half a dozen different garbage collectors, all but one falling under the above paragraph.
No, it will be freed AFTER the new object has been created.
In order for the reference count to go down on the old object, tags has to be pointed to the new object. This happens after readq returns, so at the very least both objects will exist from the beginning of qtags = {} to after tags = readq().
As @delnan stated, soon after tags has been pointed to the new object, the old one will be freed, as there is no longer a reference to it.
Usually Python can keep up with anything you throw at it. CPython's memory management uses reference counting, so your memory usage should stay roughly constant and you shouldn't see spikes in memory. As soon as you remove a reference (by assigning the variable to something else), the memory goes back to the "heap", if you will. So don't worry about memory. I have run simulators doing tests for hours, rewriting variables, and the memory usage stayed about the same. The old dictionary will be freed when you assign the variable a new one.
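The ordering described in these answers can be observed with a small sketch, assuming CPython and Python 3. The Tags subclass, the observations list, and the previous variable are hypothetical additions; plain dicts cannot be weakly referenced, hence the subclass.

```python
import weakref


class Tags(dict):
    """Dict subclass, used only because plain dicts cannot be weakly referenced."""


observations = []
previous = None                      # weak reference to the previous result


def readq():
    qtags = Tags(result="Success")
    if previous is not None:
        # The old dictionary still exists while the new one is being built.
        observations.append(previous() is not None)
    return qtags


tags = readq()
previous = weakref.ref(tags)
tags = readq()                       # the old dict is released right after this rebinding
old_freed = previous() is None
```

On CPython, observations ends up as [True] and old_freed as True: both dictionaries briefly coexist, then the old one is freed.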
