What does "domain" mean in Python's tracemalloc?

What does "domain" mean in Python's tracemalloc? - python

Python's tracemalloc documentation defines "domain" as an "address space of a memory block" and identifies a domain using an integer. It also says:
tracemalloc uses the domain 0 to trace memory allocations made by Python. C extensions can use other domains to trace other resources
What are these address spaces to which the tracemalloc documentation is referring? As a C extension developer, how can I "use" a particular domain as the above quote suggests, and in what situations should I? As a Python package developer, in what situations should I use a DomainFilter when using tracemalloc?
Python's memory management documentation uses the phrase "allocator domain," which seems to be unrelated. These domains are categories of allocator functions. They don't seem to have integer identifiers like the domains in the tracemalloc documentation.

The complexity comes from the idea that a C extension module can reuse the built-in memory tracer tracemalloc to trace memory blocks allocated by the custom allocation routine of the extension. The domain is an integer tag that can be used to distinguish address spaces. For example, a C extension can request to track a pointer 0x7f0000000001 in different address space(not one in the main memory), while there exists some object at 0x7f0000000001 in the main memory.(The pointer even doesn't need to be an address. It can be a handle.)
You can call the C API PyTraceMalloc_Track() to request tracking a pointer. See the example in the alloc.c of NumPy.
In most cases, you deal with addresses in the main memory, so you can use the default domain, 0. And of course, you can use your own domain number to filter your memory allocations only.

Related

Release memory in python when removing item from a list [duplicate]

I have a few related questions regarding memory usage in the following example.
If I run in the interpreter,
foo = ['bar' for _ in xrange(10000000)]
the real memory used on my machine goes up to 80.9mb. I then,
del foo
real memory goes down, but only to 30.4mb. The interpreter uses 4.4mb baseline so what is the advantage in not releasing 26mb of memory to the OS? Is it because Python is "planning ahead", thinking that you may use that much memory again?
Why does it release 50.5mb in particular - what is the amount that is released based on?
Is there a way to force Python to release all the memory that was used (if you know you won't be using that much memory again)?
NOTE
This question is different from How can I explicitly free memory in Python?
because this question primarily deals with the increase of memory usage from baseline even after the interpreter has freed objects via garbage collection (with use of gc.collect or not).

I'm guessing the question you really care about here is:
Is there a way to force Python to release all the memory that was used (if you know you won't be using that much memory again)?
No, there is not. But there is an easy workaround: child processes.
If you need 500MB of temporary storage for 5 minutes, but after that you need to run for another 2 hours and won't touch that much memory ever again, spawn a child process to do the memory-intensive work. When the child process goes away, the memory gets released.
This isn't completely trivial and free, but it's pretty easy and cheap, which is usually good enough for the trade to be worthwhile.
First, the easiest way to create a child process is with concurrent.futures (or, for 3.1 and earlier, the futures backport on PyPI):
with concurrent.futures.ProcessPoolExecutor(max_workers=1) as executor:
result = executor.submit(func, *args, **kwargs).result()
If you need a little more control, use the multiprocessing module.
The costs are:
Process startup is kind of slow on some platforms, notably Windows. We're talking milliseconds here, not minutes, and if you're spinning up one child to do 300 seconds' worth of work, you won't even notice it. But it's not free.
If the large amount of temporary memory you use really is large, doing this can cause your main program to get swapped out. Of course you're saving time in the long run, because that if that memory hung around forever it would have to lead to swapping at some point. But this can turn gradual slowness into very noticeable all-at-once (and early) delays in some use cases.
Sending large amounts of data between processes can be slow. Again, if you're talking about sending over 2K of arguments and getting back 64K of results, you won't even notice it, but if you're sending and receiving large amounts of data, you'll want to use some other mechanism (a file, mmapped or otherwise; the shared-memory APIs in multiprocessing; etc.).
Sending large amounts of data between processes means the data have to be pickleable (or, if you stick them in a file or shared memory, struct-able or ideally ctypes-able).

Memory allocated on the heap can be subject to high-water marks. This is complicated by Python's internal optimizations for allocating small objects (PyObject_Malloc) in 4 KiB pools, classed for allocation sizes at multiples of 8 bytes -- up to 256 bytes (512 bytes in 3.3). The pools themselves are in 256 KiB arenas, so if just one block in one pool is used, the entire 256 KiB arena will not be released. In Python 3.3 the small object allocator was switched to using anonymous memory maps instead of the heap, so it should perform better at releasing memory.
Additionally, the built-in types maintain freelists of previously allocated objects that may or may not use the small object allocator. The int type maintains a freelist with its own allocated memory, and clearing it requires calling PyInt_ClearFreeList(). This can be called indirectly by doing a full gc.collect.
Try it like this, and tell me what you get. Here's the link for psutil.Process.memory_info.
import os
import gc
import psutil
proc = psutil.Process(os.getpid())
gc.collect()
mem0 = proc.memory_info().rss
# create approx. 10**7 int objects and pointers
foo = ['abc' for x in range(10**7)]
mem1 = proc.memory_info().rss
# unreference, including x == 9999999
del foo, x
mem2 = proc.memory_info().rss
# collect() calls PyInt_ClearFreeList()
# or use ctypes: pythonapi.PyInt_ClearFreeList()
gc.collect()
mem3 = proc.memory_info().rss
pd = lambda x2, x1: 100.0 * (x2 - x1) / mem0
print "Allocation: %0.2f%%" % pd(mem1, mem0)
print "Unreference: %0.2f%%" % pd(mem2, mem1)
print "Collect: %0.2f%%" % pd(mem3, mem2)
print "Overall: %0.2f%%" % pd(mem3, mem0)
Output:
Allocation: 3034.36%
Unreference: -752.39%
Collect: -2279.74%
Overall: 2.23%
Edit:
I switched to measuring relative to the process VM size to eliminate the effects of other processes in the system.
The C runtime (e.g. glibc, msvcrt) shrinks the heap when contiguous free space at the top reaches a constant, dynamic, or configurable threshold. With glibc you can tune this with mallopt (M_TRIM_THRESHOLD). Given this, it isn't surprising if the heap shrinks by more -- even a lot more -- than the block that you free.
In 3.x range doesn't create a list, so the test above won't create 10 million int objects. Even if it did, the int type in 3.x is basically a 2.x long, which doesn't implement a freelist.

eryksun has answered question #1, and I've answered question #3 (the original #4), but now let's answer question #2:
Why does it release 50.5mb in particular - what is the amount that is released based on?
What it's based on is, ultimately, a whole series of coincidences inside Python and malloc that are very hard to predict.
First, depending on how you're measuring memory, you may only be measuring pages actually mapped into memory. In that case, any time a page gets swapped out by the pager, memory will show up as "freed", even though it hasn't been freed.
Or you may be measuring in-use pages, which may or may not count allocated-but-never-touched pages (on systems that optimistically over-allocate, like linux), pages that are allocated but tagged MADV_FREE, etc.
If you really are measuring allocated pages (which is actually not a very useful thing to do, but it seems to be what you're asking about), and pages have really been deallocated, two circumstances in which this can happen: Either you've used brk or equivalent to shrink the data segment (very rare nowadays), or you've used munmap or similar to release a mapped segment. (There's also theoretically a minor variant to the latter, in that there are ways to release part of a mapped segment—e.g., steal it with MAP_FIXED for a MADV_FREE segment that you immediately unmap.)
But most programs don't directly allocate things out of memory pages; they use a malloc-style allocator. When you call free, the allocator can only release pages to the OS if you just happen to be freeing the last live object in a mapping (or in the last N pages of the data segment). There's no way your application can reasonably predict this, or even detect that it happened in advance.
CPython makes this even more complicated—it has a custom 2-level object allocator on top of a custom memory allocator on top of malloc. (See the source comments for a more detailed explanation.) And on top of that, even at the C API level, much less Python, you don't even directly control when the top-level objects are deallocated.
So, when you release an object, how do you know whether it's going to release memory to the OS? Well, first you have to know that you've released the last reference (including any internal references you didn't know about), allowing the GC to deallocate it. (Unlike other implementations, at least CPython will deallocate an object as soon as it's allowed to.) This usually deallocates at least two things at the next level down (e.g., for a string, you're releasing the PyString object, and the string buffer).
If you do deallocate an object, to know whether this causes the next level down to deallocate a block of object storage, you have to know the internal state of the object allocator, as well as how it's implemented. (It obviously can't happen unless you're deallocating the last thing in the block, and even then, it may not happen.)
If you do deallocate a block of object storage, to know whether this causes a free call, you have to know the internal state of the PyMem allocator, as well as how it's implemented. (Again, you have to be deallocating the last in-use block within a malloced region, and even then, it may not happen.)
If you do free a malloced region, to know whether this causes an munmap or equivalent (or brk), you have to know the internal state of the malloc, as well as how it's implemented. And this one, unlike the others, is highly platform-specific. (And again, you generally have to be deallocating the last in-use malloc within an mmap segment, and even then, it may not happen.)
So, if you want to understand why it happened to release exactly 50.5mb, you're going to have to trace it from the bottom up. Why did malloc unmap 50.5mb worth of pages when you did those one or more free calls (for probably a bit more than 50.5mb)? You'd have to read your platform's malloc, and then walk the various tables and lists to see its current state. (On some platforms, it may even make use of system-level information, which is pretty much impossible to capture without making a snapshot of the system to inspect offline, but luckily this isn't usually a problem.) And then you have to do the same thing at the 3 levels above that.
So, the only useful answer to the question is "Because."
Unless you're doing resource-limited (e.g., embedded) development, you have no reason to care about these details.
And if you are doing resource-limited development, knowing these details is useless; you pretty much have to do an end-run around all those levels and specifically mmap the memory you need at the application level (possibly with one simple, well-understood, application-specific zone allocator in between).

First, you may want to install glances:
sudo apt-get install python-pip build-essential python-dev lm-sensors
sudo pip install psutil logutils bottle batinfo https://bitbucket.org/gleb_zhulik/py3sensors/get/tip.tar.gz zeroconf netifaces pymdstat influxdb elasticsearch potsdb statsd pystache docker-py pysnmp pika py-cpuinfo bernhard
sudo pip install glances
Then run it in the terminal!
glances
In your Python code, add at the begin of the file, the following:
import os
import gc # Garbage Collector
After using the "Big" variable (for example: myBigVar) for which, you would like to release memory, write in your python code the following:
del myBigVar
gc.collect()
In another terminal, run your python code and observe in the "glances" terminal, how the memory is managed in your system!
Good luck!
P.S. I assume you are working on a Debian or Ubuntu system

How to share a user-defined object that occupy large memory and changes frequently in python multiprocess efficiently?

I've tried module multiprocessing Manager and Array , but it can't meet my needs
Is there a method just like shared memory in linux C?

Not as such.
Sharing memory like this in the general case is very tricky. The CPython interpreter does not relocate objects, so they would have to be created in situ within the shared memory region. That means shared memory allocation, which is considerably more complex than just calling PyMem_Malloc(). In increasing order of difficulty, you would need cross-process locking, a per-process reference count, and some kind of inter-process cyclic garbage collection. That last one is really hard to do efficiently and safely. It's also necessary to ensure that shared objects only reference other shared objects, which is very difficult to do if you're not willing to relocate objects into the shared region. So Python doesn't provide a general purpose means of stuffing arbitrary full-blown Python objects into shared memory.
But you can share mmap objects between processes, and mmap supports the buffer protocol, so you can wrap it up in something higher-level like array/numpy.ndarray or anything else with buffer protocol support. Depending on your precise modality, you might have to write a small amount of C or Cython glue code to rapidly move data between the mmap and the array. This should not be necessary if you are working with NumPy. Note that high-level objects may require locking which mmap does not provide.

How does app engine (python) manage memory across requests (Exceeded soft private memory limit)

I'm experiencing occasional Exceeded soft private memory limit error in a wide variety of request handlers in app engine. I understand that this error means that the RAM used by the instance has exceeded the amount allocated, and how that causes the instance to shut down.
I'd like to understand what might be the possible causes of the error, and to start, I'd like to understand how app engine python instances are expected to manage memory. My rudimentary assumptions were:
An F2 instance starts with 256 MB
When it starts up, it loads my application code - lets say 30 MB
When it handles a request it has 226 MB available
so long as that request does not exceed 226 MB (+ margin of error) the request completes w/o error
if it does exceed 226 MB + margin, the instance completes the request, logs the 'Exceeded soft private memory limit' error, then terminates - now go back to step 1
When that request returns, any memory used by it is freed up - ie. the unused RAM goes back to 226 MB
Step 3-4 are repeated for each request passed to the instance, indefinitely
That's how I presumed it would work, but given that I'm occasionally seeing this error across a fairly wide set of request handlers, I'm now not so sure. My questions are:
a) Does step #4 happen?
b) What could cause it not to happen? or not to fully happen? e.g. how could memory leak between requests?
c) Could storage in module level variables causes memory usage to leak? (I'm not knowingly using module level variables in that way)
d) What tools / techniques can I use to get more data? E.g. measure memory usage at entry to request handler?
In answers/comments, where possible, please link to the gae documentation.
[edit] Extra info: my app is congifured as threadsafe: false. If this has a bearing on the answer, please state what it is. I plan to change to threadsafe: true soon.
[edit] Clarification: This question is about the expected behavior of gae for memory management. So while suggestions like 'call gc.collect()' might well be partial solutions to related problems, they don't fully answer this question. Up until the point that I understand how gae is expected to behave, using gc.collect() would feel like voodoo programming to me.
Finally: If I've got this all backwards then I apologize in advance - I really cant find much useful info on this, so I'm mostly guessing..

App Engine's Python interpreter does nothing special, in terms of memory management, compared to any other standard Python interpreter. So, in particular, there is nothing special that happens "per request", such as your hypothetical step 4. Rather, as soon as any object's reference count decreases to zero, the Python interpreter reclaims that memory (module gc is only there to deal with garbage cycles -- when a bunch of objects never get their reference counts down to zero because they refer to each other even though there is no accessible external reference to them).
So, memory could easily "leak" (in practice, though technically it's not a leak) "between requests" if you use any global variable -- said variables will survive the instance of the handler class and its (e.g) get method -- i.e, your point (c), though you say you are not doing that.
Once you declare your module to be threadsafe, an instance may happen to serve multiple requests concurrently (up to what you've set as max_concurrent_requests in the automatic_scaling section of your module's .yaml configuration file; the default value is 8). So, your instance's RAM will need be a multiple of what each request needs.
As for (d), to "get more data" (I imagine you actually mean, get more RAM), the only thing you can do is configure a larger instance_class for your memory-hungry module.
To use less RAM, there are many techniques -- which have nothing to do with App Engine, everything to do with Python, and in particular, everything to do with your very specific code and its very specific needs.
The one GAE-specific issue I can think of is that ndb's caching has been reported to leak -- see https://code.google.com/p/googleappengine/issues/detail?id=9610 ; that thread also suggests workarounds, such as turning off ndb caching or moving to old db (which does no caching and has no leak). If you're using ndb and have not turned off its caching, that might be the root cause of "memory leak" problems you're observing.

Point 4 is an invalid asumption, Python's garbage collector doesn't return the memory that easily, Python's program is taking up that memory but it's not used until garbage collector has a pass. In the meantime if some other request requires more memory - new might be allocated, on top the memory from the first request. If you want to force Python to garbage collect, you can use gc.collect() as mentioned here

Take a look at this Q&A for approaches to check on garbage collection and for potential alternate explanations: Google App Engine DB Query Memory Usage

Releasing memory in Python

I have a few related questions regarding memory usage in the following example.
If I run in the interpreter,
foo = ['bar' for _ in xrange(10000000)]
the real memory used on my machine goes up to 80.9mb. I then,
del foo
real memory goes down, but only to 30.4mb. The interpreter uses 4.4mb baseline so what is the advantage in not releasing 26mb of memory to the OS? Is it because Python is "planning ahead", thinking that you may use that much memory again?
Why does it release 50.5mb in particular - what is the amount that is released based on?
Is there a way to force Python to release all the memory that was used (if you know you won't be using that much memory again)?
NOTE
This question is different from How can I explicitly free memory in Python?
because this question primarily deals with the increase of memory usage from baseline even after the interpreter has freed objects via garbage collection (with use of gc.collect or not).

Memory allocated on the heap can be subject to high-water marks. This is complicated by Python's internal optimizations for allocating small objects (PyObject_Malloc) in 4 KiB pools, classed for allocation sizes at multiples of 8 bytes -- up to 256 bytes (512 bytes in 3.3). The pools themselves are in 256 KiB arenas, so if just one block in one pool is used, the entire 256 KiB arena will not be released. In Python 3.3 the small object allocator was switched to using anonymous memory maps instead of the heap, so it should perform better at releasing memory.
Additionally, the built-in types maintain freelists of previously allocated objects that may or may not use the small object allocator. The int type maintains a freelist with its own allocated memory, and clearing it requires calling PyInt_ClearFreeList(). This can be called indirectly by doing a full gc.collect.
Try it like this, and tell me what you get. Here's the link for psutil.Process.memory_info.
import os
import gc
import psutil
proc = psutil.Process(os.getpid())
gc.collect()
mem0 = proc.memory_info().rss
# create approx. 10**7 int objects and pointers
foo = ['abc' for x in range(10**7)]
mem1 = proc.memory_info().rss
# unreference, including x == 9999999
del foo, x
mem2 = proc.memory_info().rss
# collect() calls PyInt_ClearFreeList()
# or use ctypes: pythonapi.PyInt_ClearFreeList()
gc.collect()
mem3 = proc.memory_info().rss
pd = lambda x2, x1: 100.0 * (x2 - x1) / mem0
print "Allocation: %0.2f%%" % pd(mem1, mem0)
print "Unreference: %0.2f%%" % pd(mem2, mem1)
print "Collect: %0.2f%%" % pd(mem3, mem2)
print "Overall: %0.2f%%" % pd(mem3, mem0)
Output:
Allocation: 3034.36%
Unreference: -752.39%
Collect: -2279.74%
Overall: 2.23%
Edit:
I switched to measuring relative to the process VM size to eliminate the effects of other processes in the system.
The C runtime (e.g. glibc, msvcrt) shrinks the heap when contiguous free space at the top reaches a constant, dynamic, or configurable threshold. With glibc you can tune this with mallopt (M_TRIM_THRESHOLD). Given this, it isn't surprising if the heap shrinks by more -- even a lot more -- than the block that you free.
In 3.x range doesn't create a list, so the test above won't create 10 million int objects. Even if it did, the int type in 3.x is basically a 2.x long, which doesn't implement a freelist.

eryksun has answered question #1, and I've answered question #3 (the original #4), but now let's answer question #2:
Why does it release 50.5mb in particular - what is the amount that is released based on?
What it's based on is, ultimately, a whole series of coincidences inside Python and malloc that are very hard to predict.
First, depending on how you're measuring memory, you may only be measuring pages actually mapped into memory. In that case, any time a page gets swapped out by the pager, memory will show up as "freed", even though it hasn't been freed.
Or you may be measuring in-use pages, which may or may not count allocated-but-never-touched pages (on systems that optimistically over-allocate, like linux), pages that are allocated but tagged MADV_FREE, etc.
If you really are measuring allocated pages (which is actually not a very useful thing to do, but it seems to be what you're asking about), and pages have really been deallocated, two circumstances in which this can happen: Either you've used brk or equivalent to shrink the data segment (very rare nowadays), or you've used munmap or similar to release a mapped segment. (There's also theoretically a minor variant to the latter, in that there are ways to release part of a mapped segment—e.g., steal it with MAP_FIXED for a MADV_FREE segment that you immediately unmap.)
But most programs don't directly allocate things out of memory pages; they use a malloc-style allocator. When you call free, the allocator can only release pages to the OS if you just happen to be freeing the last live object in a mapping (or in the last N pages of the data segment). There's no way your application can reasonably predict this, or even detect that it happened in advance.
CPython makes this even more complicated—it has a custom 2-level object allocator on top of a custom memory allocator on top of malloc. (See the source comments for a more detailed explanation.) And on top of that, even at the C API level, much less Python, you don't even directly control when the top-level objects are deallocated.
So, when you release an object, how do you know whether it's going to release memory to the OS? Well, first you have to know that you've released the last reference (including any internal references you didn't know about), allowing the GC to deallocate it. (Unlike other implementations, at least CPython will deallocate an object as soon as it's allowed to.) This usually deallocates at least two things at the next level down (e.g., for a string, you're releasing the PyString object, and the string buffer).
If you do deallocate an object, to know whether this causes the next level down to deallocate a block of object storage, you have to know the internal state of the object allocator, as well as how it's implemented. (It obviously can't happen unless you're deallocating the last thing in the block, and even then, it may not happen.)
If you do deallocate a block of object storage, to know whether this causes a free call, you have to know the internal state of the PyMem allocator, as well as how it's implemented. (Again, you have to be deallocating the last in-use block within a malloced region, and even then, it may not happen.)
If you do free a malloced region, to know whether this causes an munmap or equivalent (or brk), you have to know the internal state of the malloc, as well as how it's implemented. And this one, unlike the others, is highly platform-specific. (And again, you generally have to be deallocating the last in-use malloc within an mmap segment, and even then, it may not happen.)
So, if you want to understand why it happened to release exactly 50.5mb, you're going to have to trace it from the bottom up. Why did malloc unmap 50.5mb worth of pages when you did those one or more free calls (for probably a bit more than 50.5mb)? You'd have to read your platform's malloc, and then walk the various tables and lists to see its current state. (On some platforms, it may even make use of system-level information, which is pretty much impossible to capture without making a snapshot of the system to inspect offline, but luckily this isn't usually a problem.) And then you have to do the same thing at the 3 levels above that.
So, the only useful answer to the question is "Because."
Unless you're doing resource-limited (e.g., embedded) development, you have no reason to care about these details.
And if you are doing resource-limited development, knowing these details is useless; you pretty much have to do an end-run around all those levels and specifically mmap the memory you need at the application level (possibly with one simple, well-understood, application-specific zone allocator in between).

First, you may want to install glances:
sudo apt-get install python-pip build-essential python-dev lm-sensors
sudo pip install psutil logutils bottle batinfo https://bitbucket.org/gleb_zhulik/py3sensors/get/tip.tar.gz zeroconf netifaces pymdstat influxdb elasticsearch potsdb statsd pystache docker-py pysnmp pika py-cpuinfo bernhard
sudo pip install glances
Then run it in the terminal!
glances
In your Python code, add at the begin of the file, the following:
import os
import gc # Garbage Collector
After using the "Big" variable (for example: myBigVar) for which, you would like to release memory, write in your python code the following:
del myBigVar
gc.collect()
In another terminal, run your python code and observe in the "glances" terminal, how the memory is managed in your system!
Good luck!
P.S. I assume you are working on a Debian or Ubuntu system

How Does Python Memory Management Work?

Okay, I got this concept of a class that would allow other classes to import classes on as basis versus if you use it you must import it. How would I go about implementing it? Or, does the Python interpreter already do this in a way? Does it destroy classes not in use from memory, and how so?
I know C++/C are very memory orientated with pointers and all that, but is Python? And I'm not saying I have problem with it; I, more or less, want to make a modification to it for my program's design. I want to write a large program that use hundreds of classes and modules. But I'm afraid if I do this I'll bog the application down, since I have no understanding of how Python handles memory management.
I know it is a vague question, but if somebody would link or point me in the right direction it would be greatly appreciated.

Python -- like C#, Java, Perl, Ruby, Lua and many other languages -- uses garbage collection rather than manual memory management. You just freely create objects and the language's memory manager periodically (or when you specifically direct it to) looks for any objects that are no longer referenced by your program.
So if you want to hold on to an object, just hold a reference to it. If you want the object to be freed (eventually) remove any references to it.
def foo(names):
for name in names:
print name
foo(["Eric", "Ernie", "Bert"])
foo(["Guthtrie", "Eddie", "Al"])
Each of these calls to foo creates a Python list object initialized with three values. For the duration of the foo call they are referenced by the variable names, but as soon as that function exits no variable is holding a reference to them and they are fair game for the garbage collector to delete.

x =10
print (type(x))
memory manager (MM):
x points to 10
y = x
if(id(x) == id(y)):
print('x and y refer to the same object')
(MM):
y points to same 10 object
x=x+1
if(id(x) != id(y)):
print('x and y refer to different objects')
(MM):
x points to another object is 11, previously pointed object was destroyed
z=10
if(id(y) == id(z)):
print('y and z refer to same object')
else:
print('y and z refer different objects')
Python memory management is been divided into two parts.
Stack memory
Heap memory
Methods and variables are created in Stack memory.
Objects and instance variables values are created in Heap memory.
In stack memory - a stack frame is created whenever methods and
variables are created.
These stacks frames are destroyed automaticaly whenever
functions/methods returns.
Python has mechanism of Garbage collector, as soon as variables and
functions returns, Garbage collector clear the dead objects.

Read through following articles about Python Memory Management :
Python : Memory Management (updated to version 3)
Exerpt: (examples can be found in the article):
Memory management in Python involves a private heap containing all
Python objects and data structures. The management of this private
heap is ensured internally by the Python memory manager. The Python
memory manager has different components which deal with various
dynamic storage management aspects, like sharing, segmentation,
preallocation or caching.
At the lowest level, a raw memory allocator ensures that there is
enough room in the private heap for storing all Python-related data by
interacting with the memory manager of the operating system. On top of
the raw memory allocator, several object-specific allocators operate
on the same heap and implement distinct memory management policies
adapted to the peculiarities of every object type. For example,
integer objects are managed differently within the heap than strings,
tuples or dictionaries because integers imply different storage
requirements and speed/space tradeoffs. The Python memory manager thus
delegates some of the work to the object-specific allocators, but
ensures that the latter operate within the bounds of the private heap.
It is important to understand that the management of the Python heap
is performed by the interpreter itself and that the user has no
control over it, even if she regularly manipulates object pointers to
memory blocks inside that heap. The allocation of heap space for
Python objects and other internal buffers is performed on demand by
the Python memory manager through the Python/C API functions listed in
this document.

My 5 cents:
most importantly, python frees memory for referenced objects only (not for classes because they are just containers or custom data types). Again, in python everything is an object, so int, float, string, [], {} and () all are objects. That mean if your program don't reference them anymore they are victims for garbage collection.
Though python uses'Reference count' and 'GC' to free memory (for the objects that are not in used), this free memory is not returned back to the operating system (in windows its different case though). This mean free memory chunk just return back to python interpreter not to the operating system. So utlimately your python process is going to hold the same memory. However, python will use this memory to allocate to some other objects.
Very good explanation for this given at: http://deeplearning.net/software/theano/tutorial/python-memory-management.html

Yes its the same behaviour in python3 as well

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.