I have a couple of Python/NumPy programs that tend to make the PC freeze or run very slowly when they use too much memory. Once memory usage gets too high (e.g. 3.8 of 4 GB), I can't even stop the scripts or move the cursor anymore.
Therefore, I would like to quit the program automatically when it hits a critical limit of memory usage, e.g. 3GB.
I have not found a solution yet. Is there a Pythonic way to deal with this? I run my scripts on both Windows and Linux machines.
You could limit the process's memory usage, but that is OS-specific.
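For example, on Linux/Unix you can do that from inside the process with the standard resource module (a sketch; the 3 GB cap is arbitrary and this has no effect on Windows):

import resource

# Cap this process's address space at ~3 GB (Unix-only).
soft, hard = resource.getrlimit(resource.RLIMIT_AS)
resource.setrlimit(resource.RLIMIT_AS, (3 * 1024 ** 3, hard))
# Allocations beyond the cap now raise MemoryError instead of freezing the machine.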
Another solution would be to check the value of psutil.virtual_memory() and exit your program once it crosses some threshold.
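A minimal sketch of that check (assuming psutil is installed; the 75% threshold is arbitrary, roughly 3 of 4 GB):

import sys
import psutil

def check_memory(max_percent=75.0):
    # Exit if overall system memory usage crosses the threshold.
    if psutil.virtual_memory().percent > max_percent:
        print("Memory usage critical, exiting.", file=sys.stderr)
        sys.exit(1)

# Call check_memory() periodically inside the main loop of the script.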
Though OS-independent, the second solution is not Pythonic at all. Memory management is one of the things we have operating systems for.
I'd agree that in general you want to do this from the operating system, if only because there's a reliability problem in having possibly runaway code check itself for possibly runaway behavior.
If a hard and fast requirement is to do this WITHIN the script, then I think we'd need to know more about what you're actually doing. If you have a single large data structure that's consuming the majority of the memory, you can use sys.getsizeof to identify how large that structure is, and throw/catch an error if it gets larger than you want.
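Something along these lines, where big_list and produce_items are placeholders for your own structure and code:

import sys

MAX_BYTES = 3 * 1024 ** 3  # hypothetical 3 GB cap

big_list = []
for item in produce_items():  # produce_items(): placeholder for your own code
    big_list.append(item)
    # Note: sys.getsizeof reports the size of the container itself, not of the
    # objects it references, so this is only a rough guard.
    if sys.getsizeof(big_list) > MAX_BYTES:
        raise MemoryError("data structure grew past the configured limit")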
But without knowing at least a little more about the program structure, I think it'll be hard to help...
Related: this is a follow-up to a 2009 Stack Overflow answer to "How can I explicitly free memory in Python?":
Unfortunately (depending on your version and release of Python) some
types of objects use "free lists" which are a neat local optimization
but may cause memory fragmentation, specifically by making more and
more memory "earmarked" for only objects of a certain type and thereby
unavailable to the "general fund".
The only really reliable way to ensure that a large but temporary use
of memory DOES return all resources to the system when it's done, is
to have that use happen in a subprocess, which does the memory-hungry
work then terminates. Under such conditions, the operating system WILL
do its job, and gladly recycle all the resources the subprocess may
have gobbled up. Fortunately, the multiprocessing module makes this
kind of operation (which used to be rather a pain) not too bad in
modern versions of Python.
In your use case, it seems that the best way for the subprocesses to
accumulate some results and yet ensure those results are available to
the main process is to use semi-temporary files (by semi-temporary I
mean, NOT the kind of files that automatically go away when closed,
just ordinary files that you explicitly delete when you're all done
with them).
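A minimal sketch of the pattern that answer describes (the worker function and file name here are only illustrative):

import json
import multiprocessing

def memory_hungry_work(out_path):
    # Build the large temporary structures inside the child process.
    result = sum(i * i for i in range(10_000_000))
    # Write the (small) result to an ordinary file the parent reads and deletes later.
    with open(out_path, "w") as f:
        json.dump({"result": result}, f)

if __name__ == "__main__":
    p = multiprocessing.Process(target=memory_hungry_work, args=("partial_result.json",))
    p.start()
    p.join()  # once the child exits, the OS reclaims all of its memory
    with open("partial_result.json") as f:
        print(json.load(f))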
It's been 10 years since that answer, and I am wondering if there is a better way to create some sort of process/subprocess/function/method that releases all of its memory when completed.
The motivation for this is an issue I am having, where a for loop raises a memory error despite creating no new variables.
Repeated insertions into sqlite database via sqlalchemy causing memory leak?
It is insertion into a database. I know it's not the database itself that is causing the memory error, because when I restart my runtime the database is still preserved, yet the crash doesn't happen until another several hundred iterations of the for loop.
I'm wanting to take photos from 2 different cameras at exactly the same time (or as close as possible).
If I use multithreading or multiprocessing, it still starts the threads/processes consecutively. For instance, if I start the following processes:
Take_photo_1.start()
Take_photo_2.start()
While those processes would run in parallel, the commands to start the processes are still executed sequentially. Is there any way to execute both those processes at exactly the same time?
There's no way to make this exact, even if you're writing directly in machine code. Even if you have all the threads wait on a kernel barrier, that wait can take different times on different cores. There are opcodes to process between the barrier wait and the camera get, which have to be fetched and run on a system whose caches may be in different states. There's nothing stopping the OS from stealing the CPU from one of the threads to run some completely unrelated code. And the I/O to the camera (even if it isn't serialized, which it may be) probably isn't a guaranteed static time, and so on.
When you throw an interpreted language on top of it (especially one with a GIL, like Python, which means the bytecodes between the barrier wait and the camera get can't be run in parallel)… well, you're not really changing anything; "impossible * 7" is still "impossible". But you are making it even more obvious.
Fortunately, very few real-life problems have a true hard real-time requirement like that. Instead, you have a requirement like "99.9% of the time, all camera gets should happen within +/-4ms of the desired exact 30fps". Or, maybe, "90% of the time it's within +/-1ms, 99.9% of the time it's within +/-4ms, 99.999% of the time it's within +/-20ms, as long as you don't do anything stupid like change the wall-power state of the laptop while running the code".
Or… well, only you know why you wanted "exact", and can figure out what the actual requirements are that would satisfy you.
And for that case, often the simplest thing to do is write the code the obvious way, stress test the hell out of it, see if it meets your requirements, and figure out how to optimize things only if it doesn't.
So, your existing code may well be fine.
If not, adding a shared barrier = threading.Barrier() and doing a barrier.wait() right before the camera.get() may be all you need.
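A sketch of that idea, assuming each camera is wrapped in an object with a get() method (camera_1 and camera_2 are placeholders for your own objects):

import threading

NUM_CAMERAS = 2
barrier = threading.Barrier(NUM_CAMERAS)

def capture(camera, results, index):
    # Each thread blocks here until every thread has arrived, so the get()
    # calls start as close together as the OS and hardware allow.
    barrier.wait()
    results[index] = camera.get()

results = [None] * NUM_CAMERAS
threads = [threading.Thread(target=capture, args=(cam, results, i))
           for i, cam in enumerate([camera_1, camera_2])]
for t in threads:
    t.start()
for t in threads:
    t.join()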
You may need to add logic to detect timer lag and re-synchronize (which you might do independently in each thread, or have whichever thread gets there first compute it and just make everyone else wait at the barrier).
You may need to rewrite the core loop in C. Or dump whichever OS you're using for one with better real-time guarantees like QNX. Or throw out the OS entirely so there's no scheduler to get in the way. Or throw out the complex superscalar CPUs and implement the whole thing as a hardware state machine. Or…
But, assuming you have reasonable requirements in the first place, you usually don't have to go very far.
I have simulation code in Python that uses a lot of memory with set/list/dict data structures.
The outline is as follows:
massSimulation:
    for i in simList:
        individualSimulation

individualSimulation.py:
    # do simulation and get the result.
    ...
    return result
The issue is that it claims memory little by little, until it uses more memory (around 12 GB) than the system can provide (8 GB), which makes the system really slow; the CPU used by Python starts at 100% and then drops very rapidly to almost 0%. When this happens, I kill the Python process and start again.
I added the garbage reclaim code in individualSimulation.py, but the results seem to be the same (I didn't measure, just gut feeling).
import gc
gc.collect()
What could be a solution to this problem? How can I force Python to relinquish all the memory it claims once a method is finished?
Hard to say without seeing more code, but these are my guesses/proposals:
If the elements in simList are mutable, and you are adding information to them in individualSimulation, they can be responsible for your problem, since they are still referenced by simList. Avoid that. Even better, use an iterator instead of a list, so that the elements you want to loop through are not all kept alive in memory at once.
The way you store the results could also be using too much space. If there are a lot of simulations and the results are big, that could be the reason; consider storing them on the hard drive and freeing the space in memory.
The point is that you have to delete ALL references to the objects that are taking up so much memory, and then they will be garbage-collectable.
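A sketch combining those suggestions (individualSimulation and simList stand in for your own code; results are written to disk so nothing large survives an iteration):

import gc
import json

def massSimulation(simList):
    # simList can be a generator, so parameters are not all held in memory at once.
    for i, params in enumerate(simList):
        result = individualSimulation(params)
        # Persist the result, then drop every reference to it before the next iteration.
        with open("result_{}.json".format(i), "w") as f:
            json.dump(result, f)
        del result
        gc.collect()  # optional; only helps once nothing references the objects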
I would like to obtain, from within the Python program, the maximum physical memory used by a Python program (ActiveState Python 3.2, under Windows 7).
Ideally, every 0.1 sec or so, I'd like to have memory usage polled, and if it exceeds the maximum seen so far, the maximum value (a global variable stored somewhere), updated.
UPDATE:
I realize my question is a close duplicate of "Which Python memory profiler is recommended?".
Sorry for being unclear. I have two specific problems that are not, to my knowledge, addressed by a regular memory profiler:
I need to see not only the memory allocated by Python, but the total memory used by the Python program (under Windows, this would include DLLs, etc.). In other words, under Windows, this is exactly what you'd see in the Task Manager.
I need to see the maximum memory rather than the memory at any given instant. I can't think of a way to do that other than to place numerous memory checks all around the code whenever I think I'm allocating something large.
I think you could go to Control Panel, Administrative Tools, Performance, click on the '+', select 'Process' under 'Performance Object', pick Pool Paged Bytes or Pool Nonpaged Bytes in the left column, and select your process in the right column.
You can generate a log file with the Performance monitor.
I did not find any way to do precisely what I want with Python.
However, in playing with Windows 7 Task Manager, I found I can add a column "Peak Working Set (Memory)". So if I simply pause the python program before it can exit (and catch exceptions from main() to pause for the same reason), I can see the peak memory in the task manager.
I know it's stupid (e.g., it doesn't allow me to print the value to a log file, do something based on how much memory I'm using, etc.), but it's better than nothing.
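For what it's worth, the polling described in the question can be done from inside the program with a small background thread and psutil (a sketch, assuming psutil is installed; rss roughly corresponds to the working set on Windows):

import threading
import time
import psutil

peak_rss = 0

def poll_memory(interval=0.1):
    global peak_rss
    proc = psutil.Process()
    while True:
        peak_rss = max(peak_rss, proc.memory_info().rss)
        time.sleep(interval)

t = threading.Thread(target=poll_memory)
t.daemon = True  # don't let the polling thread keep the interpreter alive
t.start()
# ... run the program, then report peak_rss before exiting.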
Here's a link to a similar question:
Which Python memory profiler is recommended?
I've used Guppy/Heapy and Meliae. They're a little different, but I've found both are quite useful.
I'm working on a program in python on Windows 7 that matches features between multiple images in real time. It is intended to be the only program running.
When I run it on my laptop, it runs very slowly. However, when I check how much memory it is using with the task manager, it is only using about 46,000 KB. I would like to increase the memory available to the python process so that it can use all available memory.
Any advice would be greatly appreciated.
Python does not have a built-in mechanism for limiting memory consumption; if that's all it's using, then that's all it'll use.
If you're doing image comparisons, chances are good you are CPU-bound, not memory-bound. Unless those are gigantic images, you're probably OK.
So, check your code for performance problems, use heuristics to avoid running unnecessary code, and send what you've got out for code review for others to help you.
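For the "check your code for performance problems" part, the standard-library profiler is a quick way to see where the time actually goes (match_features is a placeholder for your own entry point):

import cProfile
import pstats

cProfile.run("match_features()", "profile.out")
pstats.Stats("profile.out").sort_stats("cumulative").print_stats(20)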
Each process can use the same amount of (virtual) memory that the OS makes available. Python is not special in that regard. Perhaps you'd want to modify your program, but we'd have to see some code to comment on that.