Memory profiling a running pypy-2.3.1 process?

My question is as in the title: we have a memory leak in a pypy process, and the process goes down when it runs out of memory. This happens only on the production site.
Our simplified environment is as follows:
OS: CentOS 6
pypy-2.3.1
< Tried Solutions >
objgraph seems to be the only profiling library we can use in this environment, and only part of it works: it can print all objects currently in memory, but none of the reference information (.getrefcount is not implemented). A sketch of what we can still run is below this list.
It turns out we can only see lots of "int", "str", and "list" objects that seem to be leaking, without knowing who is using them or what they are using. :(
The data produced by "pmap" only shows memory growing in an [anon] block.
Running gc periodically did not help, so we concluded it is a REAL memory leak.
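A minimal sketch of the objgraph calls that do work for us (assuming objgraph is installed; how completely show_growth works on this PyPy is an open question):

import gc
import objgraph

gc.collect()  # drop collectable cycles first so counts reflect live objects
objgraph.show_most_common_types(limit=20)  # per-type instance counts

objgraph.show_growth(limit=10)  # first call records a baseline and prints everything
# ... let the process serve traffic for a while ...
objgraph.show_growth(limit=10)  # later calls print only the types that grew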
< Our constraints >
it's hard to change the production Python runtime, since it might affect our users
we cannot reproduce the issue in any other environment
Please advise if there are other tools or methodologies to attack this problem. Thanks a lot in advance :)

Related

killing kube pod on the fly with python

I have an app running on Kubernetes, and I want a way to kill the pod dynamically (from the code).
Does someone know a way to do it? I tried exiting the process and raising an error, but it didn't work.
Based on the answer to my comment above, I understand your app has a memory leak and you want to restart the pod to clean up that memory leak.
First, I will state the obvious: the best way to fix the problem is to fix the leak in the code. That being said, you have also said that finding the problem will take a lot of time.
The other solution I can think of is to utilise Kubernetes' own resource requests/limits functionality.
You declare how much memory and/or CPU you want the node to reserve for your app, and you can also declare the maximum amount of memory and/or CPU the node should give your app. If the CPU usage exceeds the declared limit, the app is throttled. If the memory usage exceeds the declared limit, it is OOMKilled.
Keeping the memory limit low will cause your app to be restarted whenever it exceeds the limit you declared.
Details of this, and examples (including a memory leak example), can be found here: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
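If you do still want to kill the pod from the code, as originally asked, a minimal sketch using the official kubernetes Python client follows. This is an assumption on my part, not part of the answer above: it requires the kubernetes package, in-cluster config, and a service account with permission to delete pods.

import os
from kubernetes import client, config

def kill_own_pod():
    # Inside a pod, the hostname defaults to the pod name.
    pod_name = os.environ["HOSTNAME"]
    # The namespace is mounted into every pod by the service account.
    with open("/var/run/secrets/kubernetes.io/serviceaccount/namespace") as f:
        namespace = f.read().strip()
    config.load_incluster_config()  # authenticate as the pod's service account
    client.CoreV1Api().delete_namespaced_pod(name=pod_name, namespace=namespace)
    # With a Deployment, the ReplicaSet will schedule a fresh replacement pod.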

Django memory leak: possible causes?

I have a Django application that every so often runs into a memory leak.
I am not using large data that could overload the memory; in fact, the application 'eats' memory incrementally (in a week, memory goes from ~70 MB to 4 GB), which is why I suspect the garbage collector is missing something, though I am not sure. Also, it seems that this increase does not depend on the number of requests.
Obvious things like DEBUG=True, leaving files open, etc. do not apply here.
I'm using uWSGI 2.0.3 (+ nginx) and Django 1.4.5
I could set up uWSGI to restart the server when memory exceeds a certain limit, but I wouldn't like to do that, since it is not really a solution.
Are there any well-known situations where the garbage collector "doesn't do its work properly"? Could you provide some code examples?
Is there any configuration of uWSGI + Django that could cause this?
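(For illustration, one classic case on Python 2, which is what a Django 1.4 stack runs on: objects with a __del__ method that form a reference cycle are never collected, and pile up in gc.garbage instead. Python 3.4+ fixed this, so treat the sketch below as a Python 2 example.)

import gc

class Leaky(object):
    def __del__(self):  # a finalizer makes the cycle uncollectable on Python 2
        pass

a, b = Leaky(), Leaky()
a.other, b.other = b, a  # reference cycle between the two instances
del a, b

gc.collect()
print(gc.garbage)  # on Python 2 the two Leaky instances end up here, leaked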
I haven't found exactly what I was looking for (each project is a world!), but by following some clues and advice I managed to solve the issue. I'm sharing a few links that could help if you are facing a similar problem.
django memory leaks, part I, django memory leaks, part II and Finding and fixing memory leaks in Python
Some useful SO answers/questions:
Which Python memory profiler is recommended?, Is there any working memory profiler for Python3, Python memory leaks and Python: Memory leak debugging
Update
pyuwsgimemhog is a new tool that helps to find out where the leak is.
Django doesn't have known memory leak issues.
I had a similar memory issue. I found that a slow SQL query was causing a high DB CPU percentage. The memory issue was fixed after I fixed the slow SQL.
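If you want to hunt for slow queries the same way, a small sketch is below. It relies on django.db.connection.queries, which is only populated when DEBUG=True, and which itself grows without bound, one of the classic Django "leaks" mentioned above, so use it only temporarily.

from django.db import connection, reset_queries

reset_queries()  # clear the query log before exercising the suspect code
# ... run the suspect view or ORM code here ...
slow = [q for q in connection.queries if float(q["time"]) > 0.5]
for q in slow:
    print(q["time"], q["sql"][:120])  # timing plus a truncated SQL preview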

Determine Python's actual memory usage

I am currently trying to debug the memory usage of my Python program (on Windows with CPython 2.7). But unfortunately, I can't even find any way to reliably measure the amount of memory it's currently using.
I've been using Task Manager / Resource Monitor to measure the process memory, but this appears to be useful only for determining peak memory consumption. Often, Python will not reduce the Commit or Working Set even long after the relevant objects have been garbage collected.
Is there any way to find out how much memory Python is actually using, or failing that, to force it to free up its unused memory? I'd prefer not to use anything that would require recompiling the interpreter.
An example of the behavior showing it isn't freeing unused memory promptly (the comments are the process memory reported by Task Manager):
(after some calculations)  # 290k
gc.collect()               # still 290k
x = range(9999999)         # 444k
del x                      # 405k
gc.collect()               # 40k
Is there any way to find out how much memory Python is actually using,
Not from within Python.
You can get a rough idea of per-object memory usage with sys.getsizeof; however, that doesn't capture total memory usage, overallocation, fragmentation, or memory that is unused but not freed back to the OS.
There is a third-party tool called Pympler that can help with memory analysis. There is also a programming environment called Guppy for object and heap memory sizing, profiling, and analysis, and a similar project called PySizer with a memory-usage profiler for Python code. A short sketch of the first two approaches is below.
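A minimal sketch, assuming Pympler is installed: sys.getsizeof reports only the shallow size of one object, while Pympler can follow references and summarize the whole heap.

import sys
from pympler import asizeof, muppy, summary

data = {"key": list(range(1000))}
print(sys.getsizeof(data))    # shallow: the dict object alone
print(asizeof.asizeof(data))  # deep: the dict plus the list and its integers

objects = muppy.get_objects()               # every object the GC can see
summary.print_(summary.summarize(objects))  # per-type counts and total sizes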
or failing that, to force it to free up its unused memory?
There is no public API for forcing memory to be released.

Increase available memory for Python in Windows

I'm working on a program in python on Windows 7 that matches features between multiple images in real time. It is intended to be the only program running.
When I run it on my laptop, it runs very slowly. However, when I check how much memory it is using with the task manager, it is only using about 46,000 KB. I would like to increase the memory available to the python process so that it can use all available memory.
Any advice would be greatly appreciated.
Python does not have a built-in mechanism for limiting memory consumption; if that's all it's using, then that's all it'll use.
If you're doing image comparisons, chances are good you are CPU-bound, not memory-bound. Unless those are gigantic images, you're probably OK.
So, check your code for performance problems, use heuristics to avoid running unnecessary code, and send what you've got out for code review so that others can help you.
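For the "check your code for performance problems" part, a minimal profiling sketch with the standard library (match_features is a hypothetical stand-in for your real matching routine):

import cProfile
import pstats

def match_features():
    pass  # hypothetical stand-in for the real image-matching code

cProfile.run("match_features()", "profile.out")      # profile one full run
pstats.Stats("profile.out").sort_stats("cumulative").print_stats(10)  # top 10 hot spots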
Each process can use the same amount of (virtual) memory that the OS makes available. Python is not special in that regard. Perhaps you'd want to modify your program, but we'd have to see some code to comment on that.

How does Python handle memory?

I've been looking at an in-memory database, and it got me thinking: how does Python handle I/O that's not tied to a connection (and even data that is), for example hashes, sets, etc.? Is this a config somewhere, or is it dynamically managed based on resources? Are there "easy" ways to view the effect resources are having on a real program, and to simulate what the performance hit would be on differing hardware setups?
NOTE: If it matters, Redis is the in-memory data store I'm looking at; there's an implementation of a wrapper for Redis datatypes so they mimic the datatypes found in Python.
Python allocates all the memory the application asks for; there is not much room for policy. The only question is when to release memory. (C)Python immediately releases all memory that is no longer referenced (this is also not tunable). Memory that is referenced only from itself (i.e. cycles) is released by the garbage collector; this has tunable settings.
It is the operating system's decision to write some of the memory out to the page file.
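The tunable settings live in the standard gc module; a minimal sketch:

import gc

print(gc.get_threshold())      # default (700, 10, 10) in CPython
gc.set_threshold(700, 10, 5)   # collect the oldest generation more often
gc.collect()                   # or force a full collection by hand
print(gc.get_count())          # allocations since the last collection, per generation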
Not exactly what you're asking for, but Dowser is a Python tool for interactively browsing the memory usage of your running program. Very useful in understanding memory usage and allocation patterns.
http://www.aminus.net/wiki/Dowser
