How to make Python API using py2exe? - python

Is it possible to "compile" a Python script with py2exe (or similar) and then allow the user access to modify the top-level Python scripts? Or possibly import the compiled modules into their normal Python scripts? I'm looking for the ability to distribute an easy installer for some customers, but allow other customers to build upon that installed version by creating their own scripts that work with the installed framework modules, like an API.
I have tried to use py2exe to import files that I have placed in the "dist" directory, but it complains that they aren't frozen. Why can't it use a mix of frozen binary modules and interpreted modules?
The reason that I am using py2exe is because I have some troublesome libraries (paramiko/pycrypto, plus some internally developed ones) that I don't want to require my customers to trudge through those installations. I also don't want them to have open access to my framework files. I know that they can reverse-compile the py2exe objects, but they will have to work to modify the framework, which is good enough protection.

I figured out how to get it to work. I placed my "head" framework file in the "includes" list in the setup.py file. Then I have a compiled runner that uses the imp module to dynamically load regular Python scripts, and those scripts call upon that head framework file. This is exactly the kind of hidden framework with a reachable API that I was looking for.
For example, let's say we have a directory called "framework" with a master file "foo" that contains all of the API calls. The line in the py2exe setup.py file would look like this:
includes = ['framework.foo', 'some_other_module', 'etc']
I then make a target for this runner script:
FrameworkTarget = Target(
    # what to build
    script = "run_framework.py",
    dest_base = "run_framework"
)
Then add the target to the setup() command in the setup.py script among the other things:
console = [FrameworkTarget],
The compiled runner script is passed the name of the "test suite" script from the command line:
import imp
import os
import sys

test_suite_name = sys.argv[1]
file_name = test_suite_name + ".py"
path_name = os.path.join(os.getcwd(), file_name)
print "Loading source %s at %s" % (file_name, path_name)
module = imp.load_source(test_suite_name, path_name)
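As an aside, the imp module is deprecated on Python 3; the same dynamic load can be done with importlib. A self-contained sketch (it writes a stand-in "suite" file so it runs anywhere; the names are hypothetical):

```python
import importlib.util
import os
import tempfile

# Stand-in "test suite" script so the sketch is runnable as-is; in the
# real setup this file already exists next to the runner.
suite_path = os.path.join(tempfile.mkdtemp(), 'suite_a.py')
with open(suite_path, 'w') as f:
    f.write('RESULT = "suite loaded"\n')

# importlib replacement for imp.load_source()
spec = importlib.util.spec_from_file_location('suite_a', suite_path)
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
print(module.RESULT)
```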
Then, in the file called by the imp.load_source() command, I have this:
import framework.foo
When I didn't have 'framework.foo' in my includes list, the runner couldn't find the compiled version of framework.foo. Maybe someone will find this useful in the future. I don't know if I could do one useful thing without Stack Overflow!

Is it possible to "compile" a Python script with py2exe (or similar)
and then allow the user access to modify the top-level Python scripts?
I'm not particularly familiar with py2exe, but looking at the tutorial page, it would seem relatively straightforward to replace the hello.py script with something along the lines of...
import sys
import os
# Import your framework here, and anything else you want py2exe to embed
import my_framework
TOP_LEVEL_SCRIPT_DIR = '/path/to/scripts'
MAIN_SCRIPT = os.path.join(TOP_LEVEL_SCRIPT_DIR, 'main.py')
sys.path.append(TOP_LEVEL_SCRIPT_DIR)
execfile(MAIN_SCRIPT)
...and put any scripts you want the user to be able to modify in /path/to/scripts, although it'd probably make more sense to define TOP_LEVEL_SCRIPT_DIR as a path relative to the binary.
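One way to make that directory relative to the binary rather than hard-coded: py2exe sets sys.frozen on the embedded interpreter, so the script can branch on it. A sketch ('scripts' is a hypothetical directory name):

```python
import os
import sys

# Under py2exe, sys.frozen is set and sys.executable points at the
# generated .exe; otherwise fall back to this script's own location.
if getattr(sys, 'frozen', False):
    base_dir = os.path.dirname(sys.executable)
else:
    base_dir = os.path.dirname(os.path.abspath(__file__))

TOP_LEVEL_SCRIPT_DIR = os.path.join(base_dir, 'scripts')
print(TOP_LEVEL_SCRIPT_DIR)
```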
The reason that I am using py2exe is because I have some troublesome
libraries (paramiko/pycrypto, plus some internally developed ones)
that I don't want to require my customers to trudge through those
installations. I also don't want them to have open access to my
framework files.
If the goal is ease of installation, it might also suffice to create a regular InstallShield-esque installer to put all the files in the right places, and just include the .pyc versions of your "framework files" if you don't want them reading the source code.
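Generating those .pyc files is a single compileall invocation. A sketch ('framework' is a stand-in name; note that bytecode is tied to the Python version that compiled it):

```shell
# Create a throwaway framework module so the sketch runs anywhere.
mkdir -p framework
printf 'API_VERSION = 1\n' > framework/foo.py
# -b writes framework/foo.pyc next to the source instead of under
# __pycache__/, so the .py can be dropped before packaging.
python3 -m compileall -b framework
ls framework
```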

Related

How do I get the list of imports and dependant files from python script?

I have a .py file that imports from other python modules that import from config files, other modules, etc.
I need to move the code required to run that .py file, but only the files the .py file actually reads from (I am not talking about packages installed via pip install; it's about other Python files in the project directory, mostly classes, functions, and .ini files).
Is there a way to find out only the external files used by that particular python script? Is it something that can be found using PyCharm for example?
Thanks!
Static analysis tools (such as PyCharm's refactoring tools) can (mostly) figure out the module import tree for a program (unless you do dynamic imports using e.g. importlib.import_module()).
However, it's not quite possible to statically definitively know what other files are required for your program to function. You could use Python's audit events (or strace/ptrace or similar OS-level functions) to look at what files are being opened by your program (e.g. during your tests being run (you do have tests, right?), or during regular program use), but it's likely not going to be exhaustive.
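A quick sketch of the audit-event approach (Python 3.8+): register a hook and record every "open" event, which approximates the set of files a program touches at runtime.

```python
import sys
import tempfile

opened = []

def hook(event, args):
    # The "open" audit event fires for file opens; args[0] is the path.
    if event == 'open':
        opened.append(str(args[0]))

# Note: audit hooks cannot be removed once added.
sys.addaudithook(hook)

with tempfile.NamedTemporaryFile(suffix='.ini') as tmp:
    open(tmp.name).close()  # any file access is now recorded

print(any(p.endswith('.ini') for p in opened))
```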

Fully embedded SymPy+Matplotlib+others within a C/C++ application

I've read the Python documentation chapter explaining how to embed the Python interpreter in a C/C++ application. Also, I've read that you can install Python modules either in a system-wide fashion, or locally to a given user.
But let's suppose my C/C++ application will use some Python modules such as SymPy, Matplotlib, and other related modules. And let's suppose end users of my application won't have any kind of Python installation in their machines.
This means that my application needs to ship with "pseudo-installed" modules, inside its data directories (just like the application has a folder for icons and other resources, it will need to have a directory for Python modules).
Another requirement is that the absolute path of my application installation isn't fixed: the user can "drag" the application bundle to another directory and it will run fine there (it already works this way but that's prior to embedding Python in it, and I wish it continues being this way after embedding Python).
I guess my question could be expressed more concisely as "how can I use Python without installing Python, neither system-wide, nor user-wide?"
There are various ways you could attempt to do this, but none of them are general solutions. From the docs:
5.5. Embedding Python in C++
It is also possible to embed Python in a C++ program; precisely how this is done will depend on the details of the C++ system used; in general you will need to write the main program in C++, and use the C++ compiler to compile and link your program. There is no need to recompile Python itself using C++.
This is the shortest section in the document, and is roughly equivalent to "left as an exercise for the reader". I do not believe you will find any straightforward solutions.
Use pyinstaller to gather the pieces:
This means that my application needs to ship with "pseudo-installed" modules, inside its data directories (just like the application has a folder for icons and other resources, it will need to have a directory for Python modules).
If I needed to tackle this problem, I would use pyinstaller as a base. (Disclosure: I am an occasional contributor.) One of the major functions of pyinstaller is to gather up all of the resources a Python program needs. In onedir mode, everything needed to let the program run is gathered into one directory.
You could include this tool into your make system, and have it place all of the needed pieces into your python data directory in your build tree.

Local collection of Python packages: best way to import them?

I need to ship a collection of Python programs that use multiple packages stored in a local Library directory: the goal is to avoid having users install packages before using my programs (the packages are shipped in the Library directory). What is the best way of importing the packages contained in Library?
I tried three methods, but none of them appears perfect: is there a simpler and robust method? or is one of these methods the best one can do?
In the first method, the Library folder is simply added to the library path:
import sys
import os
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'Library'))
import package_from_Library
The Library folder is put at the beginning so that the packages shipped with my programs have priority over the same modules installed by the user (this way I am sure that they have the correct version to work with my programs). This method also works when the Library folder is not in the current directory, which is good. However, this approach has drawbacks. Each and every one of my programs adds a copy of the same path to sys.path, which is a waste. In addition, all programs must contain the same three path-modifying lines, which goes against the Don't Repeat Yourself principle.
An improvement over the above problems consists in trying to add the Library path only once, by doing it in an imported module:
# In module add_Library_path:
import os
import sys

sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'Library'))
and then to use, in each of my programs:
import add_Library_path
import package_from_Library
This way, thanks to the caching mechanism of CPython, the module add_Library_path is only run once, and the Library path is added only once to sys.path. However, a drawback of this approach is that import add_Library_path has an invisible side effect, and the order of the imports matters: this makes the code less legible and more fragile. It also forces my distribution of programs to include an add_Library_path.py file that users will never use themselves.
Python modules from Library can also be imported by making it a package (empty __init__.py file stored inside), which allows one to do:
from Library import module_from_Library
However, this breaks for packages in Library, as they might do something like from xlutils.filter import …, which fails because xlutils itself is not on sys.path. So this method works for modules placed in Library, but not for packages.
All these methods have some drawback.
Is there a better way of shipping programs with a collection of packages (that they use) stored in a local Library directory? or is one of the methods above (method 1?) the best one can do?
PS: In my case, all the packages from Library are pure Python packages, but a more general solution that works for any operating system is best.
PPS: The goal is that the user be able to use my programs without having to install anything (beyond copying the directory I ship them regularly), like in the examples above.
PPPS: More precisely, the goal is to have the flexibility of easily updating both my collection of programs and their associated third-party packages from Library by having my users do a simple copy of a directory containing my programs and the Library folder of "hidden" third-party packages. (I do frequent updates, so I prefer not forcing the users to update their Python distribution too.)
Messing around with sys.path leads to pain... The modern-package-template and Distribute projects contain a vast array of information and were in part set up to solve your problem.
What I would do is set up setup.py to install all your packages to a specific site-packages location, or, if possible, to the system's site-packages. In the former case, the local site-packages directory would then be added to the system's or user's PYTHONPATH; in the latter case, nothing needs to change.
You could use a batch file to set the Python path as well. Or point the Python executable at a shell script that sets a modified PYTHONPATH and then executes the real interpreter. The latter, of course, means that you need access to the user's machine, which you do not. However, if your users only run scripts and do not import your libraries directly, you could use your own wrapper for scripts:
#!/path/to/my/python
And the /path/to/my/python script would be something like:
#!/bin/sh
PYTHONPATH=/whatever/lib/path:$PYTHONPATH exec /usr/bin/python "$@"
I think you should have a look at path import hooks, which allow you to modify Python's behaviour when searching for modules.
For example, you could do something like KDE's script engine does for Python plugins[1].
It adds a special token to sys.path (like "&lt;plasmaXXXXXX&gt;", with XXXXXX being a random number, just to avoid name collisions), and then, when Python tries to import a module and can't find it on the other paths, it calls your importer, which can deal with it.
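A minimal sketch of that idea with today's importlib machinery: a meta path finder appended after the default finders, with a throwaway Library directory standing in for the real one.

```python
import importlib.abc
import importlib.machinery
import os
import sys
import tempfile

class LibraryFinder(importlib.abc.MetaPathFinder):
    """Resolve imports from a private Library dir; consulted only after
    the normal finders on sys.meta_path have failed."""

    def __init__(self, library_dir):
        self.library_dir = library_dir

    def find_spec(self, fullname, path=None, target=None):
        # Delegate the file lookup to the stock PathFinder, restricted
        # to our Library directory.
        return importlib.machinery.PathFinder.find_spec(
            fullname, [self.library_dir])

# Stand-in Library with one module, so the sketch runs anywhere.
library = tempfile.mkdtemp()
with open(os.path.join(library, 'hidden_mod.py'), 'w') as f:
    f.write('ANSWER = 42\n')

sys.meta_path.append(LibraryFinder(library))
import hidden_mod
print(hidden_mod.ANSWER)
```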
A simpler alternative is to have a main script that acts as a launcher: it adds the path to sys.path and then executes the target file, so that you can avoid putting the sys.path.append(...) line in every file.
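The launcher alternative can be sketched with runpy; the stand-in Library and target files are created here so the example is self-contained (all names hypothetical):

```python
import os
import runpy
import sys
import tempfile

# Stand-ins so the sketch runs anywhere: a fake Library with one module,
# and a target script that imports from it.
root = tempfile.mkdtemp()
lib = os.path.join(root, 'Library')
os.mkdir(lib)
with open(os.path.join(lib, 'veclib.py'), 'w') as f:
    f.write('def double(x):\n    return 2 * x\n')
target = os.path.join(root, 'main.py')
with open(target, 'w') as f:
    f.write('import veclib\nRESULT = veclib.double(21)\nprint(RESULT)\n')

# The launcher itself: one sys.path tweak, then run the real program.
sys.path.insert(0, lib)
ns = runpy.run_path(target, run_name='__main__')
```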
Yet another alternative, which works on Python 2.6+, is to install the library under the per-user site-packages directory.
[1] You can find the source code under /usr/share/kde4/apps/plasma_scriptengine_python in a linux installation with kde.

Python scripts that depend on binaries... how to distribute?

I have a codebase that includes some C++ code and Python scripts that make use of the resulting binaries (via the subprocess module).
root/
experiments/
script_1.py (needs to call binary_1)
clis/
binary_1.cc
binary_1
What's the best way to refer to the binary from the Python scripts?
A relative path from the Python script's directory to the binary, which assumes the user will be running the Python script from a particular directory
Just the binary name, which assumes the user will have added the binary's directory to the $PATH variable, or copied the binary to /usr/local/bin, or something
Something else?
If your binaries are pre-compiled you can use the data_files parameter to setuptools. Have it installed in /usr/local/bin.
data_files=[("/usr/local/bin", glob("bin/*"))], ...
You could use __file__ to find out the location of the Python script, so it wouldn't matter where the user ran the script from.
import os

path = os.path.normpath(os.path.join(
    os.path.dirname(__file__), '..', 'clis', 'binary_1'
))
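Combining that path with the subprocess call from the question looks roughly like this (a sketch: it falls back to /bin/echo as a stand-in so it runs even where binary_1 doesn't exist):

```python
import os
import subprocess

# Resolve the binary relative to this file, so the caller's working
# directory doesn't matter.
binary = os.path.normpath(os.path.join(
    os.path.dirname(os.path.abspath(__file__)), '..', 'clis', 'binary_1'))

# Fall back to /bin/echo as a stand-in so this sketch runs anywhere.
if not os.path.exists(binary):
    binary = '/bin/echo'

result = subprocess.run([binary, 'hello'], capture_output=True, text=True)
print(result.stdout.strip())
```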
In my experience, the best way to integrate your C(pp) code in your Python program is to make a compiled Python module out of the C(pp) code instead of using the subprocess module as you are now doing.
In addition to a more consistent and readable Python codebase, you get the added benefit of modularity (solving among others the $PATH issues) and can use distutils as build tool. Distribution is also easier, then, as setup.py automates it.

Problem accessing config files within a Python egg

I have a Python project that has the following structure:
package1
class.py
class2.py
...
package2
otherClass.py
otherClass2.py
...
config
dev_settings.ini
prod_settings.ini
I wrote a setup.py file that converts this into an egg with the same file structure. (When I examine it using a zip program, the structure seems identical.)
The funny thing is, when I run the Python code from my IDE it works fine and can access the config files; but when I try to run it from a different Python script using the egg, it can't find the config files in the egg. If I put the config files into a directory relative to the calling script (external to the egg), it works, but that defeats the purpose of a self-contained egg that has all the functionality of the program and can be called from anywhere. I can use any classes/modules and run any functions from the egg as long as they don't touch the config files; if they do, the egg can't find them and the functions fail.
Any help would be really appreciated! We're kind of new to the egg thing here and don't really know where to start.
The problem is, the config files are not files anymore - they're packaged within the egg. It's not easy to find the answer in the docs, but it is there. From the setuptools developer's guide:
Typically, existing programs manipulate a package's __file__ attribute in order to find the location of data files. However, this manipulation isn't compatible with PEP 302-based import hooks, including importing from zip files and Python Eggs.
To access them, you need to follow the instructions for the Resource Management API.
In my own code, I had this problem with a logging configuration file. I used the API successfully like this:
import logging
import logging.config

from pkg_resources import resource_stream

_log_config_file = 'logging.conf'
_log_config_location = resource_stream(__name__, _log_config_file)
logging.config.fileConfig(_log_config_location)
_log = logging.getLogger('package.module')
See Setuptools' discussion of accessing packaged data files at runtime. You have to get at your configuration file a different way if you want the script to work inside an egg. Also, for that to work, you may need to make your config directory a Python package by adding an empty __init__.py file.
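On current Pythons (3.9+), importlib.resources plays the role pkg_resources plays here. A self-contained sketch that builds a throwaway package with a bundled .ini file (all names hypothetical):

```python
import sys
import tempfile
from importlib import resources
from pathlib import Path

# Build a stand-in package with a bundled .ini file to demonstrate.
root = Path(tempfile.mkdtemp())
pkg = root / 'mypkg'
pkg.mkdir()
(pkg / '__init__.py').write_text('')
(pkg / 'dev_settings.ini').write_text('[db]\nhost = localhost\n')
sys.path.insert(0, str(root))

# Reads the data file whether 'mypkg' sits on disk or inside a zip/egg.
text = resources.files('mypkg').joinpath('dev_settings.ini').read_text()
print(text.splitlines()[0])
```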