I searched around for a while and have found a number of reasonable claims that CPython's compilation allows faster execution of Python code. I was wondering, though, if anyone knows of any benchmarks demonstrating the degree of the speedup.
Alternatively, perhaps there's an easy way for me to benchmark it. Is there a Python flag that can be given at runtime to turn off compilation?
All code run by CPython must be compiled to bytecode before it can be run. That's just how the interpreter works, and you probably can't reasonably change this (without writing your own interpreter).
However, by default the compiled bytecode for modules that are loaded will be cached in .pyc files. This means it won't need to be compiled again the next time you load it. Bytecode caching is probably what you heard about, as it can speed up the importing of previously used modules by a fair amount. It doesn't change performance after the modules are loaded though.
You can disable bytecode caching with the -B command line option or the PYTHONDONTWRITEBYTECODE environment variable. If you want to test the speed difference, you may also need to delete any existing cache. In Python 2, the compiled bytecode was written to a .pyc file right next to the .py source file. In Python 3, this was changed to a __pycache__ folder, which can hold .pyc files from different versions of Python (so you can have several cached versions at once; see PEP 3147 for more details).
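If you want to check where that cache lives for a given source file, importlib.util.cache_from_source (Python 3) computes the path without touching the file; a minimal sketch, with 'spam.py' as a stand-in name:

# Where would CPython cache the bytecode for this source file?
import importlib.util

print(importlib.util.cache_from_source('spam.py'))
# prints something like: __pycache__/spam.cpython-311.pyc (tag varies by version)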
Related
I would like to force Python to use pre-compiled bytecode files that I provide, and avoid any other behavior that 'runs the script' but wastes time trying to regenerate the bytecode. Python however is designed to make this behavior generally transparent to the user, so I can't find any obvious ways to control (or even inspect) what it's doing in this regard.
My question(s):
Is my only option to remove the .py source, leaving only .pyc files? (I don't want to do that because keeping the source around has proven very useful for debugging in this scenario.)
How can I check up on this behavior to see what's really happening?
Background & Motivation
I am creating a package of Python modules to be distributed over a read-only network file system, and I include pre-compiled bytecode as .pyc alongside the Python packages. I can guarantee that the end users will always be using it with the same version of Python, on the same Linux OS, with the same kernel, in the same environment, etc., and therefore should never have to regenerate the bytecode. (This is a special setup for a mid-sized scientific experiment where we enforce these things in the environment we provide for our software developers. The package manager in this environment refuses to set up packages that don't match each other in the ways we specify.)
In this context, I want to ensure that Python does not try to regenerate bytecode or write .pyc files, because we want to avoid slow startup times when importing some large modules (matplotlib, scipy, and some others). Python's default behavior, as I understand it, is to
always check if the .pyc files are stale
always regenerate this bytecode if it is stale
always try to overwrite the .pyc file with the new bytecode, and
never tell the user if overwriting the .pyc failed.
Step 1 should not be necessary, but I'm not too worried about it because it only requires checking a few bytes. I think it doesn't take much time (even on our network file system, which is optimized for serving up small or partial binary files). I suppose I could be wrong about this for packages that have many .pyc files?
Step 2 should never be necessary after the package is built. If this happens then there is a bug in the package management.
Step 3 will always fail here because this is a read-only filesystem. If this happens then we need to know about it.
Step 4 is frustrating because I can't tell what's going on. And I have seen modules (e.g. matplotlib) which load more quickly on subsequent imports in this scenario, which I don't understand.
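One partial way to inspect this is to ask the import machinery what it resolved for a module, before importing it; a sketch using importlib (matplotlib stands in for any installed module):

# Inspect what the import system resolved, without running module code.
import importlib.util

spec = importlib.util.find_spec('matplotlib')  # None if not installed
print(spec.origin)  # the source file it found
print(spec.cached)  # the .pyc path it would read or (try to) write

Running the interpreter with -v also makes the import machinery trace what it reads and writes during imports.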
I'm new to PyCharm/Python and can't figure out where the IDE stores compiled Python *.pyc files.
Coming from the IntelliJ world, it is strange that I don't see any menu options to re-build the project, or build individual files.
I'm also unable to find any .pyc files while searching the project directory, so I basically have no idea whether successful compilation has happened at all, although the project imported from GitHub is error-free.
What can I do here?
Because most Python implementations are interpreted rather than compiled, the compilation step happens when you run the code. This is why the PyCharm UI features a prominent "Run" button (▶️) but no compile button.
It is true that for CPython there is a compilation step which compiles from the Python code to bytecode, but this is an implementation detail. CPython 3 stores its cached compilation results in .pyc files in a directory called __pycache__. These files are automatically generated when a module is imported (using import module will result in a module.pyc file) but not when a normal program is run.
Lastly, as per #shmee's comment, it is possible to compile a source file with the py_compile module, but I should emphasise that this is not usually done or necessary.
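For reference, a minimal sketch of that py_compile usage (my_script.py is a placeholder):

# Explicitly compile one source file; Python 3 puts the .pyc under __pycache__.
import py_compile

py_compile.compile('my_script.py')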
Now, if you are worried about checking that your code is correct, in the interpreted language world we rely more strongly on testing. I would recommend that you investigate tests for your code (using pytest and the excellent test integration in PyCharm).
Let me begin with a bit on terminology:
Python is a programming language. It's "just" the programming language specification.
CPython is the reference implementation of the Python language. It's actually just one of several different Python interpreters. CPython itself works (let's call it an implementation detail) by translating (but you could also say compiling) the code in imported Python files/modules to bytecode and then executing that bytecode. It actually stores the translation as .pyc files (in the folder of that file) to make subsequent imports faster, but that's specific to CPython and can also be disabled.
PyCharm is an integrated development environment. However, it requires you to "Configure a Python Interpreter" to run Python code.
That means that PyCharm isn't responsible for creating .pyc files. If you configured a non-CPython interpreter or used the environment variable to disable .pyc file creation, there won't be any .pyc files.
But if you used an appropriate CPython interpreter in PyCharm it will create .pyc files for the files/modules you successfully imported. That means you actually have to import or otherwise run the Python files in your project to get the .pyc files.
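You can confirm this from a console inside PyCharm: after a successful import, any module loaded from .py source records where its bytecode was cached. A small sketch:

# __cached__ is set on modules loaded from .py source (PEP 3147 location).
import json

print(json.__cached__)  # e.g. .../json/__pycache__/__init__.cpython-311.pyc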
Actually the Python documentation contains a note about the "compiled" Python files:
To speed up loading modules, Python caches the compiled version of each module in the __pycache__ directory under the name module.version.pyc, where the version encodes the format of the compiled file; it generally contains the Python version number. For example, in CPython release 3.3 the compiled version of spam.py would be cached as __pycache__/spam.cpython-33.pyc. This naming convention allows compiled modules from different releases and different versions of Python to coexist.
Python checks the modification date of the source against the compiled version to see if it’s out of date and needs to be recompiled. This is a completely automatic process. Also, the compiled modules are platform-independent, so the same library can be shared among systems with different architectures.
Python does not check the cache in two circumstances. First, it always recompiles and does not store the result for the module that’s loaded directly from the command line. Second, it does not check the cache if there is no source module. To support a non-source (compiled only) distribution, the compiled module must be in the source directory, and there must not be a source module.
Some tips for experts:
You can use the -O or -OO switches on the Python command to reduce the size of a compiled module. The -O switch removes assert statements, the -OO switch removes both assert statements and doc strings. Since some programs may rely on having these available, you should only use this option if you know what you’re doing. “Optimized” modules have an opt- tag and are usually smaller. Future releases may change the effects of optimization.
A program doesn’t run any faster when it is read from a .pyc file than when it is read from a .py file; the only thing that’s faster about .pyc files is the speed with which they are loaded.
The module compileall can create .pyc files for all modules in a directory.
There is more detail on this process, including a flow chart of the decisions, in PEP 3147.
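As a concrete sketch of the compileall module mentioned in the tips above (the directory name is a placeholder):

# Compile every .py under a directory tree in one pass.
import compileall

compileall.compile_dir('myproject', force=True)  # force=True recompiles even fresh .pyc files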
I have been compiling diagrams (pun intended) in hope of understanding the different implementations of common programming languages. I understand whether code is compiled or interpreted depends on the implementation of the code, and is not an aspect of the programming language itself.
I am interested in comparing Python interpretation with direct compilation (e.g. C++)
and the virtual machine model (e.g. Java or C#).
In light of these two diagrams above, could you please help me develop a similar flowchart of how the .py file is converted to .pyc, uses the standard libraries (I gather they are called modules), and then is actually run? Many programmers on SO indicate that Python, as a scripting language, is not executed by the CPU but rather by the interpreter, but that sounds quite impossible, because ultimately hardware must be doing the computation.
First off, this is an implementation detail. I am limiting my answer to CPython and PyPy because I am familiar with them. Answers for Jython, IronPython, and other implementations will differ - probably radically.
Python is closer to the "virtual machine model". Python code is, contrary to the statements of some too-loud-for-their-level-of-knowledge people and despite everyone (including me) conflating it in casual discussion, never interpreted. It is always compiled to bytecode (again, on CPython and PyPy) when it is loaded. If it was loaded from a .py file because a module was imported, a .pyc file may be created to cache the compilation output. This step is not mandatory; you can turn it off via various means, and program execution is not affected the tiniest bit (except that the next process to load the module has to do it again). However, the compilation to bytecode is not avoidable; the bytecode is generated in memory if it is not loaded from disk.
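You can watch that unavoidable compilation happen entirely in memory: the compile() builtin turns source into a code object, and the dis module prints its bytecode. A quick sketch:

# Bytecode exists whether or not a .pyc is ever written to disk.
import dis

code = compile("x = 1 + 2", "<demo>", "exec")  # source -> in-memory code object
dis.dis(code)                                  # human-readable bytecode listing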
This bytecode (the exact details of which are an implementation detail and differ between versions) is then executed, at module level, which entails building function objects, class objects, and the like. These objects simply reuse (hold a pointer to) the bytecode which is already in memory. This is unlike C++ and Java, where code and classes are set in stone during/after compilation. During execution, import statements may be encountered. I lack the space, time and understanding to describe the import machinery, but the short story is:
If it was already imported once, you get that module object (another runtime construct for a thing static languages only have at compile time). A couple of builtin modules (well, all of them in PyPy, for reasons beyond the scope of this question) are already imported before any Python code runs, simply because they are so tightly integrated with the core of the interpreter and so fundamental. sys is such a module. Some Python code may also run beforehand, especially when you start the interactive interpreter (look up site.py).
Otherwise, the module is located. The rules for this are not our concern. In the end, these rules arrive at either a Python file or a dynamically-linked piece of machine code (.DLL on Windows, though Python modules specifically use the extension .pyd, but that's just a name; on Unix the equivalent .so is used).
The module is first loaded into memory (loaded dynamically, or parsed and compiled to bytecode).
Then, the module is initialized. Extension modules have a special function for that which is called. Python modules are simply run, from top to bottom. In well-behaved modules this just sets up global data, defines functions and classes, and imports dependencies. Of course, anything else can also happen. The resulting module object is cached (remember step one) and returned.
All of this applies to standard library modules as well as third-party modules. That's also why you can get a confusing error message if you name a script of yours the same as a standard library module which you import in that script (it imports itself, albeit without crashing due to caching - one of many things I glossed over).
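That step-one caching is easy to observe directly; a small sketch:

# Repeated imports return the same cached module object from sys.modules.
import sys
import json

import json as second_import
print(second_import is json)        # True: the module body did not run again
print(sys.modules['json'] is json)  # True: the cache is just this dict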
How the bytecode is executed (the last part of your question) differs. CPython simply interprets it, but as you correctly note, that doesn't mean it magically doesn't use the CPU. Instead, there is a large ugly loop which detects what bytecode instruction shall be executed next, and then jumps to some native code which carries out the semantics of that instruction. PyPy is more interesting; it starts off interpreting but records some stats along the way. When it decides it's worth doing so, it starts recording what the interpreter does in detail, and generates some highly optimized native code. The interpreter is still used for other parts of the Python code. Note that it's the same with many JVMs and possibly .NET, but the diagram you cite glosses over that.
For the reference implementation of Python:
(.py) -> python (checks for .pyc) -> (.pyc) -> python (execution dynamically loads modules)
There are other implementations. Most notable are:
Jython, which compiles (.py) to (.class) and follows the Java pattern from there
PyPy, which employs a JIT as it compiles (.py). The chain from there could vary (PyPy can be run in CPython, Jython, or .NET environments)
Python is technically a scripting language, but it is also compiled: Python source is read from its source file and fed into the interpreter, which compiles the source to bytecode, either internally (and then throws it away) or externally, saving it as a .pyc.
Yes, Python is a single virtual machine that sits on top of the actual hardware, but Python bytecode is just a series of instructions for the PVM (Python Virtual Machine), much like assembler is for an actual CPU.
I am quoting a part of Python documentation:
"A program doesn’t run any faster when it is read from a .pyc or .pyo file than when it is read from a .py file; the only thing that’s faster about .pyc or .pyo files is the speed with which they are loaded."
I don't understand what it means when it says this doesn't affect the running time, only the loading time. Could someone please explain it in a little more depth so that I can understand it completely?
When you import a module test.py, Python must read the source and convert it into bytecode that Python can execute. This takes time, but Python will store the result in test.pyc. This bytecode is the result of breaking your code down into simpler operations that run directly on the CPython Virtual Machine.
If you load test.pyc, Python doesn't need to compile your source into bytecode before running, so it takes slightly less time to start.
If you import the module test.py twice without modifying it or deleting the generated test.pyc, Python checks for the existence of test.pyc and loads it instead - so the performance benefit is automatic.
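To make that concrete, you can time the compile step yourself; this is the only work a cached bytecode file saves. A Python 3 sketch (big_module.py is a placeholder for any sizable source file):

# Time the source-to-bytecode step that a cached .pyc lets Python skip.
import pathlib
import time

source = pathlib.Path('big_module.py').read_text()  # placeholder file
start = time.perf_counter()
code = compile(source, 'big_module.py', 'exec')
print('compile step took %.4fs' % (time.perf_counter() - start))
# exec(code) would then run at the same speed regardless of where
# the bytecode came from.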
There are two steps in converting the Python code you write to instructions the computer can understand:
A compile step. The raw Python code is converted to Python bytecode. This bytecode will be recognised by a Python interpreter on any operating system, on any hardware. This is what is stored in a .pyo or .pyc file.
An interpretation step. The Python interpreter, or if you prefer the Python virtual machine, interprets the bytecode and sends low-level instructions to the computer. These low level instructions will be different between Linux and Windows, or between an Intel chip and an AMD, etc, so someone has to write a different interpreter for each type of system that Python can be run on.
When you run code from a .pyc file, step 1 has already been completed, so execution goes straight to step 2. But step 2 runs just as fast as it would if you had compiled the code immediately before running it. Whether the compile step slows your program down significantly depends on what it does. You should experiment to see how big a difference waiting for your code to compile makes, but if you are writing short scripts the difference will probably be unnoticeable.
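If you'd rather measure than guess, CPython 3.7+ can report a per-module import-time breakdown via the -X importtime option; a sketch that invokes it from Python (json is just the example import):

# Print the per-import timing report; CPython writes it to stderr.
import subprocess
import sys

result = subprocess.run(
    [sys.executable, '-X', 'importtime', '-c', 'import json'],
    capture_output=True, text=True,
)
print(result.stderr)  # lines like: import time: self [us] | cumulative | imported package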
To squeeze into the limited amount of filesystem storage available in an embedded system I'm currently playing with, I would like to eliminate any files that could reasonably be removed without significantly impacting functionality or performance. The *.py, *.pyo, and *.pyc files in the Python library account for a sizable amount of space, so I'm wondering which of these options would be most reasonable for a Python 2.6 installation in a small embedded system:
Keep *.py, eliminate *.pyc and *.pyo (Maintain ability to debug, performance suffers?)
Keep *.py and *.pyc, eliminate *.pyo (Does optimization really buy anything?)
Keep *.pyc, eliminate *.pyo and *.py (Will this work?)
Keep *.py, *.pyc, and *.pyo (All are needed?)
http://www.network-theory.co.uk/docs/pytut/CompiledPythonfiles.html
When the Python interpreter is invoked with the -O flag, optimized code is generated and stored in ‘.pyo’ files. The optimizer currently doesn't help much; it only removes assert statements.
Passing two -O flags to the Python interpreter (-OO) will cause the bytecode compiler to perform optimizations that could in some rare cases result in malfunctioning programs. Currently only doc strings are removed from the bytecode, resulting in more compact ‘.pyo’ files.
My suggestion to you?
Use -OO to compile only .pyo files if you don't need assert statements and __doc__ strings.
Otherwise, go with .pyc only.
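A sketch of producing the optimized files from a script, if you'd rather not invoke the compiler by hand per module (mymodule.py is a placeholder):

# Run py_compile under -OO; yields .pyo on Python 2, an opt-tagged .pyc on 3.5+.
import subprocess
import sys

subprocess.check_call([sys.executable, '-OO', '-m', 'py_compile', 'mymodule.py'])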
Edit
I noticed that you only mentioned the Python library. Much of the Python library can be removed if you only need part of the functionality.
I also suggest that you take a look at tinypy, which is a large subset of Python in about 64 KB.
Number 3 should and will work. You do not need the .pyo or .py files in order to use the compiled python code.
I would recommend keeping only .py files. The difference in startup time isn't that great, and having the source around is a plus, as it will run under different python versions without any issues.
As of Python 2.6, setting sys.dont_write_bytecode to True will suppress compilation of .pyc and .pyo files altogether, so you may want to use that option if you have 2.6 available.
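A sketch of that sys.dont_write_bytecode approach; note it must be set before the imports you want it to affect:

# Suppress .pyc/.pyo creation for everything imported after this point.
import sys
sys.dont_write_bytecode = True

import mymodule  # hypothetical module; no bytecode file is written for it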
Here's how I minimize disk requirements for mainline Python 2.7 at the day job:
1) Remove packages from the standard library which you won't need. The following is a conservative list:
bsddb/test ctypes/test distutils/tests email/test idlelib lib-tk
lib2to3 pydoc.py tabnanny.py test unittest
Note that some Python code may have surprising dependencies; e.g. setuptools needs unittest to run.
2) Pre-compile all Python code, using -OO to strip asserts and docstrings.
find -name '*.py' | python -OO -m py_compile -
Note that Python by default does not look at .pyo files; you have to explicitly ask for optimization at runtime as well, using an option or an environment variable. Run scripts in one of the following ways:
python -OO -m mylib.myscript
PYTHONOPTIMIZE=2 python -m mylib.myscript
3) Remove .py source code files (unless you need to run them as scripts) and .pyc unoptimized files.
find '(' -name '*.py' -or -name '*.pyc' ')' -and -not -executable -execdir rm '{}' ';'
4) Compress the Python library files. Python can load modules from a zip file. The paths in the zip-file must match the package hierarchy; thus you should merge site-packages and .egg directories into the main library directory before zipping. (Or you can add multiple zip files to the Python path.)
On Linux, Python's default path includes /usr/lib/python27.zip already, so just drop the zip file there and you're ready to go.
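Adding an archive explicitly also works, since zip loading goes through the normal path machinery; a sketch with placeholder paths:

# A .zip on sys.path is handled transparently by zipimport.
import sys
sys.path.insert(0, '/opt/app/pylib.zip')  # placeholder archive path

import mymodule  # hypothetical module stored at the archive root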
Leave os.pyo as an ordinary (non-zipped) file, since Python looks for this as a sanity check. If you move it to the zip file, you'll get a warning on every Python invocation (though everything will still work). Or you can just leave an empty os.py file there, and put the real one in the zip file.
Final notes:
In this manner, Python fits in 7 MB of disk space. There's a lot more that can be done to reduce size, but 7 MB was small enough for my purposes. :)
Python bytecode is not compatible across versions, but who cares when it's you who does the compilation and you who controls the Python version?
.pyo files in a zip file should be a performance win in all cases, unless the disk is extremely fast and the processor/RAM is extremely slow. Either way, Python executes from memory, not the on-disk format, so the zip only affects performance on load. The stripping of docstrings, though, can save quite a bit of memory.
Do note that .pyo files do not contain assert statements.
.pyo files preserve function names and line numbers, so debuggability is not decreased: you still get nice tracebacks, you just have to manually look up the line number in the source, which you'd have to do anyway.
If you want to "hack" a file at runtime, just put it in the current working directory. It takes precedence over the library zip file.
What it ultimately boils down to is that you really only need one of the three options, but your best bet is to go with .pys and either .pyos or .pycs.
Here's how I see each of your options:
If you put the .pys in a zip file, you won't see .pycs or .pyos built. It should also be pointed out that the performance difference is only in startup time, and even then isn't too great in my experience (your mileage may vary though). Also note that there is a way to prevent the interpreter from outputting .pycs, as Algorias points out.
I think that this is an ideal option (either that or .pys and .pyos) because you get the best mix of performance, debuggability and reliability. You don't necessarily need a source file and compiled file though.
If you're really strapped for space and need performance, this will work. I'd advise you to keep the .pys if at all possible though. Compiled binaries (.pycs or .pyos) don't always transfer to different versions of Python.
It's doubtful that you'll need all three unless you plan on running in optimized mode sometimes and non-optimized mode sometimes.
In terms of space it's been my (very anecdotal) experience that .py files compress the best compared to .pycs and .pyos if you put them in a zipfile. If you plan on compressing the files, .pyos don't tend to gain a lot in terms of sheer space because docstrings tend to compress fairly well and asserts just don't take up that much space.