Use .pyc for entire Python distribution and apps - python

I have a restriction on my embedded device to only contain python main distribution and apps to be .pyc, i.e. no source code .py files. Note this has nothing to do with protecting source code or hiding IP.
The requirement comes from a security standard (fips 140-3) that wants to keep all just in time compiled code to be in binary and not in source format. Any ideas ?

Related

Building python 3.6.6 from source on Win10

I've downloaded the python 3.6.6 source from here...
https://www.python.org/downloads/release/python-366/
...and followed the instruction on how to build on Windows (run ../PCbuild/build.bat). Python compiles and seems to be working (funny and scary: while fetching externals, it actually downloads python-3.7.0 as a dependency... :/ ). However, it looks like the build is somehow 'in place', and the binaries end up in some sub-folder of the source (../PCbuild/amd64/python.exe). This means I'm left with source and compiled code mixed up instead of some clean/lean and deployable package.
can I somehow provide '--prefix=/target/build/path' to define a target location to build to, like I would on linux?
is there a way of removing all src files/folders and leave only the required files/folders (../lib, ../include, etc...).
Or in general, is there a way of making the build process more behave like on linux?
Thanks for your help,
Max
The build.bat from PCBuild is intended for developers, that is, for testing purposes. What you want is under \Tools\msi\buildrelease.bat. This creates a subdirectory under \PCBuild\ that places all msi, cab and exe files ready for later installation. According to the readme, there doesn't seem to be an option to pack all those files in a single .exe file, like all installers eventually do, but another option is under \Tools\msi\build.bat which does have an option for packing (namely build.bat --pack). "But", the readme does state that the buildrelease.bat should be used for an official release. The advantage of doing so is that Pyhton would be optimized using PGO to your own hardware. I am also trying to compile from source using this method but I am having an issue with a recurring error (and other ones):
PGO run did not succeed (no python36!*.pgc files) and there is no data to merge [E:\RepoGiT\3.6\PCbuild\pythoncore.vcxproj]
so, if you do go this route, and find this, or other errors, please send the bug report to python's bug tracker webpage. And better yet, if you find errors and their solution, please report back here!

Where does PyCharm store compiled files?

I'm new to PyCharm/Python, and can't figure out where the IDE stores compiled python *.pyc files.
Coming from the IntelliJ world, it is strange that I don't see any menu options to re-build the project, or build individual files.
I'm also unable to find any pyc files while searching the project directory, so basically, I've no idea whether successful compilation has happened at all, although the GitHub imported project is error free.
What can I do here?
Because most Python implementations are interpreted rather than a compiled, the compilation step happens when you run the code. This is why the PyCharm UI features a prominent "Run" button (▶️) but no compile button.
It is true that for CPython there is a compilation step which compiles from the Python code to bytecode, but this is an implementation detail. CPython 3 stores its cached compilation results in .pyc files in a directory called __pycache__. These files are automatically generated when a module is imported (using import module will result in a module.pyc file) but not when a normal program is run.
Lastly, as per #shmee's comment, it is possible to compile a source file with the py_compile module, but I should emphasise that this is not usually done or necessary.
Now, if you are worried about checking that your code is correct, in the interpreted language world we rely more strongly on testing. I would recommend that you investigate tests for your code (using pytest and the excellent test integration in PyCharm).
Let me begin with a bit on terminology:
Python is a programming language. It's "just" the programming language specification.
CPython is the reference implementation of the Python language. It's actually just one of several different Python interpreters. CPython itself works (let's call it an implementation detail) by translating (but you could also say compiling) the code in imported Python files/modules to bytecode and then executing that bytecode. It actually stores the translation as .pyc files in the folder of that file) to make subsequent imports faster, but that's specific to CPython and can also be disabled.
PyCharm is an integrated development environment. However it requires to "Configure a Python Interpreter" to run Python code.
That means that PyCharm isn't responsible for creating .pyc files. If you configured a non-CPython interpreter or used the environmental variable to disable the pyc file creation there won't be any pyc files.
But if you used an appropriate CPython interpreter in PyCharm it will create .pyc files for the files/modules you successfully imported. That means you actually have to import or otherwise run the Python files in your project to get the .pyc files.
Actually the Python documentation contains a note about the "compiled" Python files:
To speed up loading modules, Python caches the compiled version of each module in the __pycache__ directory under the name module.version.pyc, where the version encodes the format of the compiled file; it generally contains the Python version number. For example, in CPython release 3.3 the compiled version of spam.py would be cached as __pycache__/spam.cpython-33.pyc. This naming convention allows compiled modules from different releases and different versions of Python to coexist.
Python checks the modification date of the source against the compiled version to see if it’s out of date and needs to be recompiled. This is a completely automatic process. Also, the compiled modules are platform-independent, so the same library can be shared among systems with different architectures.
Python does not check the cache in two circumstances. First, it always recompiles and does not store the result for the module that’s loaded directly from the command line. Second, it does not check the cache if there is no source module. To support a non-source (compiled only) distribution, the compiled module must be in the source directory, and there must not be a source module.
Some tips for experts:
You can use the -O or -OO switches on the Python command to reduce the size of a compiled module. The -O switch removes assert statements, the -OO switch removes both assert statements and doc strings. Since some programs may rely on having these available, you should only use this option if you know what you’re doing. “Optimized” modules have an opt- tag and are usually smaller. Future releases may change the effects of optimization.
A program doesn’t run any faster when it is read from a .pyc file than when it is read from a .py file; the only thing that’s faster about .pyc files is the speed with which they are loaded.
The module compileall can create .pyc files for all modules in a directory.
There is more detail on this process, including a flow chart of the decisions, in PEP 3147.

How to create Python module installer on Windows without including source files

I would like to create a Windows Installer of my Python module (as explained here https://docs.python.org/2/distutils/builtdist.html) without including the sources, meaning that the installer should only embed the Python bytecode.
I followed the solution explained here https://stackoverflow.com/a/3444346/343201 and it only works for ZIP distributions. In other words, if I create a ZIP built distribution, I manage to exclude all sources and export only bytecode (.pyc). On the other hand, if I create a MSI/WININST built distribution, it will only contain sources (.py).
I looked at the output of bdist_wininst and looks like when creating a Windows installer, only source files are exported, bytecode is ignored. Why? Is there a way to enforce the installer to embed only bytecode and no source files?
Thanks
You can use py2exe, though it uses your script files. It will make a .exe file and the source code stays locked.
You can try every way around. Head over here and try http://www.py2exe.org/old/

Compiling Python statically

I can use
python -m py_compile mytest.py
And it will byte-compile the file. From reading some other documentation, it was my impression that it byte-compiled any modules imported. But if I change any of the files it imports, I see the changed functionality. Is there some way to completely compile a python script and modules it imports, so that any changes to the originals don't reflect? I want to do this for security purposes, essentially creating a "trusted" version which can't be subverted by changing the functionality of any modules that it calls.
If you compile to bytecode, then delete the source files, then the bytecode can't change. But if someone has the ability to change source files on your machine, they can also change bytecode files on your machine. I don't think this will give you any actual security.
If you want a single-file Python program, you can run from a zip file.
Another option is to use cx_freeze or a similar program to compile the program into a native executable.
We solved this problem by developing our own customer loader using signet http://jamercee.github.io/signet/. Signet will scan your python source and it's dependencies and calculate sha1 hashes, which it embeds in the loader. You deliver the loader AND your script to your users with instructions they run the loader. On invocation, it re-calculates the hashes and if they match, control is then transferred to your script. Signet also supports code signing and PE verification.
Freeze/Bundle:
bbfreeze cross-platform
cx_freeze cross-platform
py2exe Windows
py2app OS X
pyinstaller cross-platform
Cryptography:
Assuming your "process" for running "scripts" is secure and cannot be tampered with:
Create secure hashes of the scripts and record them (e.g: SHA1)
When executing "scripts" ensure their cryptographic hashes match (ensuring they haven't been tampered with).
A common approach to this for secure protocols and apis is to use HMAC

Python4Delphi-powered program, how to deploy it?

My Delphi program uses Python4Delphi, and the Python script uses some standard libs. When deploying my program I don't want an entire Python installation, and it must work with python27.dll. What are the minimal set of necessary files? The document in Python4Delphi is dated and it's not clear to me...
Thanks for your help.
When I did this, I made the list myself, of what I needed for my embedded python application to work.
I remember this worked with python15.dll:
PythonXX.dll should work, without any other external files other than the Visual C++ Runtime DLLs, which require a side-by-side manifest (see the p4d wiki page) to work.
If you want to IMPORT something, then you need to ship it and anything it depends on. That means, either you pick part of the python standard libraries you want, or you pick all of it. There is no way you need all of Python's standard libraries. But I wouldn't want to live without OS and a a few other key ones. BUt the decision is yours.

Categories