I'm trying to diagnose memory leaks caused by exception tracebacks, and I would like to be able to list all or most of the paths that lead to a variable that should be garbage collected but isn't.
I'm currently using somewhat clumsy code to print out the reference graph, but I hoped there is a library or tool that has this ability built in, ideally with some nice way to dump the graph and then explore it interactively later.
You can see my current approach (functions print_ref_graph and find_tracebacks), here: https://nbviewer.org/gist/PiotrCzapla/1ff0fa083e8a4ca657ad86b1942abf42
While looking for memory profilers I've found some tools that are able to show the reference graph. The best so far is objgraph; for an unhandled exception it can nicely show where the object lives.
objgraph
It works much faster than pympler and shows nice visualisations in Jupyter notebooks, so I really recommend this one.
To find all the places that hold your last unhandled exception,
objgraph.show_backrefs(sys.last_traceback) is what you need.
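For example, a minimal sketch, assuming objgraph and graphviz are installed and an unhandled exception has just occurred (so sys.last_traceback is set):
import sys
import objgraph

# Draw the chain of referrers that keeps the traceback alive.
# With filename the graph is written to disk; in a Jupyter notebook
# omitting it may give an inline rendering instead.
objgraph.show_backrefs(sys.last_traceback, max_depth=5, filename='backrefs.png')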
If you want to find out whether some objects of a particular type are still alive, it has a function for that too: objgraph.by_type('type name') returns a list of them.
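A small sketch of that, here using the built-in traceback type as the illustration:
import objgraph

# Every live traceback object is a candidate leak
tracebacks = objgraph.by_type('traceback')
print(len(tracebacks))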
pympler
Its interactive reference browser does not work on macOS, but the file browser gave me output that is super slow to generate yet acceptable. (The only bit missing is that it does not list the keys under which the object is referenced.)
The following code listed most of the places where sys.last_traceback is referenced in Jupyter. But without the dict keys you won't know that sys holds the traceback in last_traceback, or that AutoFormattedTB has a tb attribute.
import sys
from pympler import refbrowser

# Write out a tree of everything that (transitively) refers to the traceback
ib = refbrowser.FileBrowser(sys.last_traceback)
ib.print_tree('out.txt')
I wrote a project with many Python source files, and I need to run it on a p2p cloud-computing service because it needs the performance. I don't want the 'host' to work out what it does by reading the variable names and function names.
It's thousands of variables and hundreds of functions, so renaming them one by one via Ctrl+R carries a high risk of errors and takes a long time.
a) Is there a procedure during compilation to make the variable names (of a copy) unrecognizable? (e.g. ahjbeunicsnj instead of placeholder_1_for_xy_csv, or kjbej() instead of save_csv())
or alternatively b) Is there a way to encrypt all the files and run them encrypted? (maybe as an .exe)
Yes, it's possible. You can obfuscate the Python script, and programs like PyInstaller can create executables too. You didn't indicate that you researched that, but it's an option.
Here's the official page on this topic which goes into far more detail: https://wiki.python.org/moin/Asking%20for%20Help/How%20do%20you%20protect%20Python%20source%20code%3F
Here's an answer on another StackExchange that's also relevant: https://reverseengineering.stackexchange.com/questions/22648/best-way-to-protect-source-code-of-exe-program-running-on-python
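To give a rough sketch of the executable half of this: PyInstaller can be driven from Python as well as from the command line (my_script.py is a placeholder name; assumes pip install pyinstaller):
import PyInstaller.__main__

# Equivalent to running: pyinstaller --onefile my_script.py
# Produces a single self-contained executable under dist/
PyInstaller.__main__.run([
    '--onefile',
    'my_script.py',
])
Keep in mind this only bundles the bytecode; it can still be extracted and decompiled, so treat obfuscation plus packaging as a deterrent, not encryption.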
I'm creating a project in Python that I intend to be able to run under both Python 2.7 and Python 3. I created a class where it became apparent that a nice piece of functionality was available under Python 3, using some Python 3-specific language features. I don't believe I can replicate the same functionality in Python 2.7, and am not trying to do so. But I intend for the Python 3 app to perhaps have some additional functionality as a consequence.
Anyway, I was hoping that so long as the 2.7 app never called the functions that used the 3.x functionality I'd be okay. But no: the mere presence of the code generates a compile-time error in 2.7, so it spits the dummy despite the function never being called at runtime. And because of Python's lack of any compile-time guards I'm not entirely sure what the best solution is.
I guess I could create a subclass of MyClass, call it MyClass3, put it in another module and add the extra functions there. But that makes a lot of things substantially grubbier: many more split code paths based on sys.version_info, circular import problems unless I do a lot of file-splitting, and... (waves hand). It's a mess that way. But maybe it's the only option available?
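To make the module-splitting idea concrete: syntax errors are raised per module at import time, so the 3-only syntax can live in its own module behind a conditional import. A minimal sketch (the module and function names are hypothetical):
import sys

if sys.version_info[0] >= 3:
    # my_class_py3 contains the Python 3-only syntax; the 2.7 compiler
    # never sees it because the module is never imported under 2.7.
    from my_class_py3 import extra_feature
else:
    def extra_feature(*args, **kwargs):
        raise NotImplementedError('this feature requires Python 3')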
EDIT:
The original question made reference to "yield from", which is why the answer below discusses it. But it was not actually looking for advice on how to get "yield from" working in 2.7; the moderator seemed to THINK that was what the question was about and flagged it as a duplicate accordingly.
As it happened, just as I edited the question to focus it on organizing the project to avoid compile errors (and to remove the references to "yield from"), an answer came in that addressed the yield from issue and turned out to be super useful.
yield from was backported to Python 2.7 in the following module: yieldfrom.
There is also a SO question about implementing yield from functionality in Python 2 that you may find useful, as well as a blog post on the same topic.
AFAIK there is no official backport of the functionality, so there is nothing like from __future__ import yieldfrom that one could expect (please correct me if you know otherwise).
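For the simple delegation case (no send()/throw() forwarding, no return value), yield from can be emulated in Python 2 with a plain loop; note this covers only a fraction of the full PEP 380 semantics:
def subgen():
    yield 1
    yield 2

def delegating():
    # Python 2 stand-in for: yield from subgen()
    for item in subgen():
        yield item

print(list(delegating()))  # [1, 2]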
I have a Python function in which I want to find the places that slow it down the most. Right now I'm using cProfile, but I need finer-grained information.
I don't want to split my function into a dozen sub-functions: that looks bulky and annoying.
Isn't there instead a way to profile a function line by line? Or to add something like timer_start(timer_id) and timer_stop(timer_id) before and after each block of code whose execution time I want to measure?
If you are not using IPython already, you should give it a look. It has magic functions like %lprun which make line-by-line profiling easy. Take a look at Timing and Profiling in IPython
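Under the hood, %lprun comes from the line_profiler package, which also works outside IPython. A minimal sketch, assuming pip install line_profiler (slow_function is a placeholder):
from line_profiler import LineProfiler

def slow_function(n):
    total = 0
    for i in range(n):
        total += i * i  # the report shows hits and time for each line
    return total

profiler = LineProfiler()
profiled = profiler(slow_function)  # wrapping registers the function
profiled(100000)
profiler.print_stats()  # per-line hit counts and timings
In IPython the equivalent is %load_ext line_profiler followed by %lprun -f slow_function slow_function(100000).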
I have a Python 3 file. I want to use an open-source tool on the internet (nltk), but unfortunately it only supports Python 2. There is no way for me to convert it to Python 3, nor can I convert my Python 3 file to Python 2.
If the user does not give a certain argument (via argparse), then I do something in my file. If the user does give that argument, however, I need to use nltk.
Writing a Python 2 script that uses nltk, and then executing that script from my Python 3 script
My current idea is to write a script in Python 2 that does what I want with nltk and then run that from my current Python 3 script. However, I don't actually know how to do this.
I found os.system(command), so I will modify it to os.system("python py2.py") (where py2.py is my newly written Python 2 file).
I'm not sure if that will work.
I also don't know if that is the most efficient way to solve my problem; I cannot find any information about it on the internet.
The data transferred will probably be quite large. Currently my test data is about 6,600 lines of UTF-8 text. Functionality is more important than how long it takes (to a certain extent) in my case.
Also, how would I pass values from my Python 2 script to my Python 3 script?
Thanks
Is there any other way to do this?
Well, if you're sure you can't convert your script to Python 2, then having one script call the other by running the Python interpreter probably is the best way. (And, this being Python, the best way is, or at least should be, the only way.)
But are you sure? Between the six module, the 3to2 tool, and __future__ statements, it may not be as hard as you think.
Anyway, if you do need to have one script call the other, you should almost never use os.system. As the docs for that function say:
The subprocess module provides more powerful facilities for spawning new processes and retrieving their results; using that module is preferable to using this function. See the Replacing Older Functions with the subprocess Module section in the subprocess documentation for some helpful recipes.
The simplest version is this:
import subprocess

# Run the Python 2 script; raises CalledProcessError on a non-zero exit
subprocess.check_call(["python", "py2.py"])
This runs your script, waits for it to finish, and raises an exception if the script returns failure—basically, what you wanted to do with os.system, but better. (For example, it doesn't spawn an unnecessary extra shell, it takes care of error handling, etc.)
That assumes whatever other data you need to share is being shared in some implicit, external way (e.g., by accessing files with the same name). You might be better off passing data to py2.py as command-line arguments and/or stdin, passing data back via stdout, or even opening an explicit pipe or socket to pass things over. Without knowing more about exactly what you need to do, it's hard to suggest anything, but the docs, especially the section Replacing Older Functions with the subprocess Module, have lots of discussion of the options.
To give you an idea, here's a simple example: to pass one of your filename arguments to py2.py, and then get data back from py2.py to py3.py, just have py3.py do this:
# Forward the first filename argument and capture py2.py's stdout
py2output = subprocess.check_output(["python", "py2.py", my_args[0]])
And then in py2.py, just print whatever you want to send back.
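For instance, the Python 2 side might look like this minimal sketch (the processing is a placeholder):
# py2.py (Python 2)
import sys

filename = sys.argv[1]  # the argument forwarded from py3.py

with open(filename) as f:
    data = f.read()

result = data.upper()  # stand-in for the real nltk work

print(result)  # everything printed here lands in py2output in py3.py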
The SO question Anyone hear when NLTK 3.0 will be out? points out that:
There's a Python 3 branch:
https://github.com/nltk/nltk/tree/nltk-py3k
That answer is from July 2011, so things may have improved since then.
I have just looked at https://github.com/nltk/nltk. There is at least a document about Python 3 porting: https://github.com/nltk/nltk/blob/2and3/web/dev/python3porting.rst.
Here is a longer discussion on NLTK and Python 3 that you may be interested in.
And the article Grants to Assist Kivy, NLTK in Porting to Python 3 (published 3 days ago) is directly related to the problem.