We've got a (Windows) application, with which we distribute an entire Python installation (including several 3rd-party modules that we use), so we have consistency and so we don't need to install everything separately. This works pretty well, but the application is pretty huge.
Obviously, we don't use everything available in the runtime. I'd like to trim down the runtime to only include what we really need.
I plan on trying out py2exe, but I'd like to try and find another solution that will just help me remove the unneeded parts of the Python runtime.
One trick I've learned while trimming down .py files to ship: Delete all the .pyc files in the standard library, then run your application thoroughly (that is, enough to be sure all the Python modules it needs will be loaded). If you examine the standard library directories, there will be .pyc files for all the modules that were actually used. .py files without a .pyc are ones that you don't need.
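The last step of that trick (finding .py files that have no .pyc) can be scripted. This is only a rough sketch: the function name unused_sources is made up for illustration, and it assumes the compiled file is either a sibling .pyc (the older layout) or sits in __pycache__ (the layout newer Pythons use).

```python
import importlib.util
from pathlib import Path


def unused_sources(root):
    """Return the .py files under root that have no compiled .pyc.

    Hypothetical helper: run it after deleting every .pyc and then
    exercising the application thoroughly.
    """
    unused = []
    for src in sorted(Path(root).rglob("*.py")):
        # Older layout: module.pyc sits next to module.py.
        legacy = src.with_suffix(".pyc")
        # Newer layout: __pycache__/module.<tag>.pyc
        cached = Path(importlib.util.cache_from_source(str(src)))
        if not legacy.exists() and not cached.exists():
            unused.append(src)
    return unused
```

Treat the result as a list of candidates for removal, not a guarantee: a module only gets a .pyc if it was actually imported during your test run.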
Both py2exe and pyinstaller (NOTE: for the latter use the SVN version, the released one is VERY long in the tooth;-) do their "trimming" via modulefinder, the standard library module for finding all modules used by a given Python script; you can of course use the latter yourself to identify all needed modules, if you don't trust pyinstaller or py2exe to do it properly and automatically on your behalf.
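If you want to try modulefinder yourself, something along these lines should work. It is a minimal sketch; modules_used_by is just an illustrative name, and the script path is whatever your entry point is.

```python
from modulefinder import ModuleFinder


def modules_used_by(script_path):
    """Return (found, missing) module names for a script.

    found   -- modules that modulefinder could locate
    missing -- imports it saw but could not resolve
    """
    finder = ModuleFinder()
    finder.run_script(script_path)
    return sorted(finder.modules), sorted(finder.badmodules)
```

Keep in mind that modulefinder works from the import statements it can see, so modules loaded dynamically (for example via importlib.import_module() with a computed name) will not show up.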
This py2exe page on compression suggests using UPX to compress any DLLs or .pyd files (which are actually just DLLs, still). Obviously this doesn't help in trimming out unneeded modules, but it can/will trim down the size of your distribution, if that's a large concern.
Related
I have a .py file that imports from other python modules that import from config files, other modules, etc.
I want to move the code needed to run that .py file, but only whatever the .py file actually reads from (I am not talking about packages installed by pip install; it's more about other Python files in the project directory, mostly classes, functions and ini files).
Is there a way to find out only the external files used by that particular python script? Is it something that can be found using PyCharm for example?
Thanks!
Static analysis tools (such as PyCharm's refactoring tools) can (mostly) figure out the module import tree for a program (unless you do dynamic imports using e.g. importlib.import_module()).
However, it's not quite possible to statically definitively know what other files are required for your program to function. You could use Python's audit events (or strace/ptrace or similar OS-level functions) to look at what files are being opened by your program (e.g. during your tests being run (you do have tests, right?), or during regular program use), but it's likely not going to be exhaustive.
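For the audit-event route, a small sketch (Python 3.8+) could look like this. The names opened_files and _record_open are made up for illustration; the "open" event is the one raised when files are opened through open() or os.open().

```python
import sys

# Paths seen so far; inspect this set after exercising the program.
opened_files = set()


def _record_open(event, args):
    # For the "open" event, args[0] is the path (it may also be a file
    # descriptor or bytes, which this sketch simply skips).
    if event == "open" and isinstance(args[0], str):
        opened_files.add(args[0])


# Note: an audit hook cannot be removed once it has been added.
sys.addaudithook(_record_open)
```

Install the hook as early as possible (before your own imports), run the program or its tests, then filter opened_files down to paths inside your project directory. As said above, this only tells you what was opened during that particular run.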
I have a small program I am trying pass around to friends. It includes nonstandard libraries like the PIL and mtTkinter packages, and thus usually requires them to be included in the pythonpath.
Is there a way I can generate the .pyc file so that all required code for the program to run is included in the file, and not required to be installed on my friends' computers? I want this to be able to work between different OSes like Windows, Mac, and Linux.
Does it have to be a pyc file? I use pyinstaller to package programs and share. The first time I used it, I couldn't believe how easy it is.
Here's a link to the portion on packaging for multiple operating systems: https://pyinstaller.readthedocs.io/en/stable/usage.html#supporting-multiple-operating-systems
Consider that I have a package called "A" consisting of several modules and also nested packages. Now, I want to distribute this package to user and I do not want user to see my code at all. I heard that ".pyc" can be de-compiled. So, I am just wondering what could be the other alternatives for this problem.
It would be great if someone gives some ideas in this regard.
You actually have a few options. First, you can compile your code into pyc files. However, this can be circumvented with the disassembler library dis, though that requires a lot of technical know-how. You can also use py2exe to package it as an exe file; this converts the pyc file into an exe file. This can still be disassembled but adds an extra layer. You also have a few encryption solutions; for example, you can use pyconcrete to encrypt your imports until they are loaded into memory. You can also just encrypt the entire application, then ship the decrypter and launcher with it as a C/C++ application (or any other compiled language). Lastly, if you are comfortable with getting Python to run custom C/C++ code, you can also put your private code into a DLL or SO and call it directly from the script.
Python is an interpreted language. That means that if you want to distribute pyc files, you'll have to have them run on the same Python version as the one that compiled them, or you'll run into subtle problems. That, and the fact that most code can be decompiled to some degree, would urge me to rethink your use case.
Can you rethink your package as a service instead?
I'm working on an Inno Setup installer for a Python application for Windows 7, and I have these requirements:
The app shouldn't write anything to the installation directory
It should be able to use .pyc files
The app shouldn't require a specific Python version, so I can't just add a set of .pyc files to the installer
Is there a recommended way of handling this? Like give the user a way to (re)generate the .pyc files? Or is the shorter startup time benefit from the .pyc files usually not worth worrying about?
PYC files aren't guaranteed to be compatible for different python versions. If you don't know that all your customers are running the same python versions, you really don't want to distribute pyc's directly. So, you have to choose between distributing PYCs and supporting multiple python versions.
You could create a build process that compiles all your files using py_compile and zips them up into a version-specific package. You can do this with setuptools; however, it will be awkward because you'll have to run py_compile under every version you need to support.
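As a rough illustration of such a build step, the standard library's zipfile.PyZipFile can compile a package and zip the resulting .pyc files in one go. The function name build_versioned_zip and the app-pyXY.zip naming scheme are assumptions made for this sketch, and you would still have to run it once under each interpreter version you support.

```python
import os
import sys
import zipfile


def build_versioned_zip(package_dir, out_dir):
    """Compile package_dir with the running interpreter and zip the .pyc files.

    The archive name carries the interpreter version, e.g. app-py311.zip
    (a hypothetical naming scheme).
    """
    tag = "py{}{}".format(*sys.version_info[:2])
    zip_path = os.path.join(out_dir, "app-{}.zip".format(tag))
    with zipfile.PyZipFile(zip_path, "w") as zf:
        # writepy() byte-compiles the package if needed and stores only .pyc files.
        zf.writepy(package_dir)
    return zip_path
```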
If you are basically distributing a closed application and don't want people to have trivial access to your source code, then py2exe is probably a simpler alternative. If your Python code is supposed to be integrated into the user's Python install, then it's probably simpler to just create a zip of your .py files and add a one-line .py stub that imports the zipped package(s) via zipimport (that is, by putting the zip file on sys.path).
If it makes you feel better, .pyc files don't provide much extra security, and they don't really boost performance much either :)
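The stub for the zip approach can be tiny. A sketch, assuming the archive is a plain zip with your modules at its top level; enable_zipped_packages is an illustrative name, not an existing API.

```python
import sys


def enable_zipped_packages(zip_path):
    """Put a zip archive at the front of sys.path.

    Python's built-in zip import support then loads modules from it
    like from any other directory on the path.
    """
    if zip_path not in sys.path:
        sys.path.insert(0, zip_path)
```

After calling it with the path to your archive, a normal import statement for a package inside the zip should just work.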
If you haven't read PEP 3147, that will probably answer your questions.
I don't mean the solution described in that PEP and implemented as of Python 3.2. That's great if your "multiple Python versions" just means "3.2, 3.3, and probably future 3.x". Or even if it means "2.6+ and 3.1+, but I only really care about 3.2 and 3.3, so if I don't get the pyc speedups for other ones that's OK".
But when I asked about your supported versions, you said, "2.7", which means you can't rely on PEP 3147 to solve your problems.
Fortunately, the PEP is full of discussion of earlier attempts to solve the problem, and the pitfalls of each, and there should be more than enough there to figure out what the options are and how to implement them.
The one problem is that the PEP is very linux-centric—mainly because it's primarily linux distros that tried to solve the problem in the past. (Apple also did so, but their solution was (a) pretty much working, and (b) tightly coupled with the whole Mac-specific "framework" thing, so they were mostly ignored…)
So, it largely leaves open the question of "Where should I put the .pyc files on Windows?"
The best choice is probably an app-specific directory under the user's local application data directory. See Known Folders if you can require Vista or later, CSIDL if you can't. Either way, you're looking for the FOLDERID_LocalAppData or CSIDL_LOCAL_APPDATA, which is:
The file system directory that serves as a data repository for local (nonroaming) applications. A typical path is C:\Documents and Settings\username\Local Settings\Application Data.
The point is that it's a place for applications to store data that's separate for each user (and inside that user's profile directory), and also for each machine the user's roaming profile might end up on, which means you can safely put stuff there and know that the user has the permissions to write there without UAC getting involved, and also know (as well as you ever can) that no other user or machine will interfere with what's there.
Within that directory, you create a directory for your program, and put whatever you want there, and as long as you picked a unique name (e.g., My Unique App Name or My Company Name\My App Name or a UUID), you're safe from accidental collision with other programs. (There used to be specific guidelines on this in MSDN, but I can no longer find them.)
So, how do you get to that directory?
The easiest way is to just use the env variable %LOCALAPPDATA%. If you need to deal with older Windows, you can use %USERPROFILE% and tack \Local Settings\Application Data onto the end, which is guaranteed to either be the same, or end up in the same place via junctions.
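In Python, the environment-variable approach might look roughly like this. It is a sketch: pyc_cache_dir, the app_name argument and the "pyc" subdirectory are all made up for illustration, and PureWindowsPath is used only so the path logic behaves the same wherever you try it.

```python
import os
from pathlib import PureWindowsPath


def pyc_cache_dir(app_name, environ=os.environ):
    """Return a per-user, per-machine directory for this app's .pyc files."""
    local = environ.get("LOCALAPPDATA")
    if not local:
        # Older Windows: build the path from %USERPROFILE% instead.
        local = str(
            PureWindowsPath(environ["USERPROFILE"])
            / "Local Settings"
            / "Application Data"
        )
    return PureWindowsPath(local) / app_name / "pyc"
```

For example, with LOCALAPPDATA set to C:\Users\me\AppData\Local, pyc_cache_dir("MyApp") gives C:\Users\me\AppData\Local\MyApp\pyc.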
You can also use pywin32 or ctypes to access the native Windows APIs (since there are at least 3 different APIs for this and at least two ways to access those APIs, I don't want to give all possible ways to write this… but a quick google or SO search for "pywin32 SHGetFolderPath" or "ctypes SHGetKnownFolderPath" or whatever should give you what you need).
Or, there are multiple third-party modules to handle this. The first one both Google and PyPI turned up was winshell.
Re-reading the original question, there's a much simpler answer that probably fits your requirements.
I don't know much about Inno, but most installers give you a way to run an arbitrary command as a post-copy step.
So, you can just use python -m compileall to create the .pyc files for you at install time—while you've still got elevated privileges, so there's no problem with UAC.
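If you'd rather call it from a small post-install script than from the command line, the same thing is available as a library call. A sketch, with compile_installed_app being an illustrative name and app_dir the directory your installer just copied the .py files into:

```python
import compileall


def compile_installed_app(app_dir):
    """Byte-compile every .py file under app_dir.

    Returns True if all files compiled without errors.
    """
    # quiet=1 prints only errors, which suits an installer step.
    return bool(compileall.compile_dir(app_dir, quiet=1))
```

Run it with the same interpreter the app will later use, since the .pyc files it writes are specific to that Python version.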
In fact, if you look at pywin32, and various other Python packages that come as installer packages, they do exactly this. This is an idiomatic thing to do for installing libraries into the user's Python installation, so I don't see why it wouldn't be considered reasonable for installing an executable that uses the user's Python installation.
Of course if the user later decides to uninstall Python 2.6 and install 2.7, your .pyc files will be hosed… but from your description, it sounds like your entire program will be hosed anyway, and the recommended solution for the user would probably be to uninstall and reinstall anyway, right?
I plan to use PyInstaller to create a stand-alone Python executable. PyInstaller comes with built-in support for UPX and uses it to compress the executable, but the results are still really huge (about 2.7 MB).
Is there any way to create even smaller Python executables? For example, using a shrunken python.dll or something similar?
If you recompile pythonxy.dll, you can omit modules that you don't need. Going by size, stripping out the unicode database and the CJK codecs gives the largest reduction. This, of course, assumes that you don't need these. Remove the modules from the pythoncore project, and also remove them from PC/config.c.
Using an earlier Python version will also decrease the size considerably if you really need a small file size. I don't recommend using a very old version; Python 2.3 would be the best option. I got my Python executable size down to 700 KB! Also, I prefer py2exe over PyInstaller.
You can't go too low in size, because you obviously need to bundle the Python interpreter, and that alone takes a considerable amount of space.
I had the same concerns once, and there are two approaches:
Install Python on the computers you want to run on and only distribute the scripts
Install Python in the internal network on some shared drive, and rig the users' PATH to recognize where Python is located. With some installation script / program trickery, users can be completely oblivious to this, and you'll get to distribute minimal applications.