I've just built a Jython project that uses both some Python module imports and some Java jars. On my own computer, since I just wanted the thing done, I've gotten things to work in a very hacky way by hardcoding sys.path and installing every module and jar I wanted separately. This is definitely not something I want to keep for a release version. I've read about being able to package everything up into one standalone Jython jar, and that sounds pretty good to me. Is there any reason I shouldn't do this? If not, is there a guide on the best way to do this someone can point me to? I'm running the whole thing through PIG, so having a callable Jython jar would be ideal.
I know some similar questions to this already exist on SO, but the answers to those seem pretty old, and the documentation given (for MavenJython, for example) is pretty poor. I've already looked at MavenJython, and Jip, but I can't really decide between the two, and I'm not really finding sufficient information for either. An ideal answer to this question what I should use, why I should use it, and give a brief demo of how one would use it. A link to any of those would also be awesome.
Thanks!
Please elaborate on how the MavenJython documentation can be improved. Things have not changed since 2011 (you probably have seen this answer), which is not that long ago. There is the website, but as a programmer you probably just want to read the source code of the jython-compile-maven-plugin-test project, which is only 200 lines of code. It is a good idea to use this package as a starting point for your own project.
Distribution philosophies
The way Java software distribution (and Windows software distribution) usually works is that you package everything you need. So yes, a standalone Jython jar would be appropriate. The drawback is that every software may use a different Jython version, the benefit being that this might be what you want (updates may break things). This is the MavenJython approach too, packaging everything in the correct versions.
Python and Linux software distribution just installs packages, checking compatibility at install time. This is the jip approach, which assumes you already have Jython, and whoever installs software will resolve compability issues by installing the correct versions.
Differences
I can not say much about jip though, I have not used it. From what I see in the demos jip is meant to provide Python packages access to Maven Java libraries. It also seems to be capable of producing maven packages from Jython code. So you can probably achieve your goal using either MavenJython or jip. Just try.
The deliverables built using MavenJython distribute Jython, while jip does not.
If you want to instruct programmers who already use jython and are unfamiliar with Maven, who want to use your library to fetch your jython library package, jip might be the way to go.
If you want to write Jython libraries for programmers and distribute them, you can use either MavenJython or jip.
If you have a software package that is going to end up as a deliverable to customers, which happens to also use Jython code and Jython packages, perhaps also providing in-program scripting to the user, go with MavenJython. It allows you to create a standalone executable.
Pig use case
For running jython through pig it is enough to install jython and put the jython sources in your path -- see the embed python section of the pig manual. jip can be appropriate for installing jython packages locally, but is not necessary if you only want to run your code. If you however want to distribute software which uses pig, and install pig, jython and your code on a clients computer, MavenJython can do that for you.
Related
I'm working on an Inno Setup installer for a Python application for Windows 7, and I have these requirements:
The app shouldn't write anything to the installation directory
It should be able to use .pyc files
The app shouldn't require a specific Python version, so I can't just add a set of .pyc files to the installer
Is there a recommended way of handling this? Like give the user a way to (re)generate the .pyc files? Or is the shorter startup time benefit from the .pyc files usually not worth worrying about?
PYC files aren't guaranteed to be compatible for different python versions. If you don't know that all your customers are running the same python versions, you really don't want to distribute pyc's directly. So, you have to choose between distributing PYCs and supporting multiple python versions.
You could create build process that compiles all your files using py_compile and zips them up into a version-specific package. You can do this with setuptools.; however it will be awkward to do because you'll have to run py_compile in every version you need to support.
If you are basically distributing a closed application and don't want people to have trivial access to your source code, then py2exe is probably a simpler alternative. If your python is supposed to be integrated into the user's python install, then it's probably simpler to just create a zip of your .py files and add a one-line .py stub that imports the zipped package(s) using zipfile
if it makes you feel better, PYC doesn't provide much extra security and it doesn't really boost perf much either :)
If you haven't read PEP 3147, that will probably answer your questions.
I don't mean the solution described in that PEP and implemented as of Python 3.2. That's great if your "multiple Python versions" just means "3.2, 3.3, and probably future 3.x". Or even if it means "2.6+ and 3.1+, but I only really care about 3.2 and 3.3, so if I don't get the pyc speedups for other ones that's OK".
But when I asked your supported versions, you said, "2.7", which means you can't rely on PEP 3147 to solve your problems.
Fortunately, the PEP is full of discussion of earlier attempts to solve the problem, and the pitfalls of each, and there should be more than enough there to figure out what the options are and how to implement them.
The one problem is that the PEP is very linux-centric—mainly because it's primarily linux distros that tried to solve the problem in the past. (Apple also did so, but their solution was (a) pretty much working, and (b) tightly coupled with the whole Mac-specific "framework" thing, so they were mostly ignored…)
So, it largely leaves open the question of "Where should I put the .pyc files on Windows?"
The best choice is probably an app-specific directory under the user's local application data directory. See Known Folders if you can require Vista or later, CSIDL if you can't. Either way, you're looking for the FOLDERID_LocalAppData or CSIDL_LOCAL_APPDATA, which is:
The file system directory that serves as a data repository for local (nonroaming) applications. A typical path is C:\Documents and Settings\username\Local Settings\Application Data.
The point is that it's a place for applications to store data that's separate for each user (and inside that user's profile directory), and also for each machine the user's roaming profile might end up on, which means you can safely put stuff there and know that the user has the permissions to write there without UAC getting involved, and also know (as well as you ever can) that no other user or machine will interfere with what's there.
Within that directory, you create a directory for your program, and put whatever you want there, and as long as you picked a unique name (e.g., My Unique App Name or My Company Name\My App Name or a UUID), you're safe from accidental collision with other programs. (There used to be specific guidelines on this in MSDN, but I can no longer find them.)
So, how do you get to that directory?
The easiest way is to just use the env variable %LOCALAPPDATA%. If you need to deal with older Windows, you can use %USERPROFILE% and tack \Local Settings\Application Data onto the end, which is guaranteed to either be the same, or end up in the same place via junctions.
You can also use pywin32 or ctypes to access the native Windows APIs (since there are at least 3 different APIs for this and at least two ways to access those APIs, I don't want to give all possible ways to write this… but a quick google or SO search for "pywin32 SHGetFolderPath" or "ctypes SHGetKnownFolderPath" or whatever should give you what you need).
Or, there are multiple third-party modules to handle this. The first one both Google and PyPI turned up was winshell.
Re-reading the original question, there's a much simpler answer that probably fits your requirements.
I don't know much about Inno, but most installers give you a way to run an arbitrary command as a post-copy step.
So, you can just use python -m compileall to create the .pyc files for you at install time—while you've still got elevated privileges, so there's no problem with UAC.
In fact, if you look at pywin32, and various other Python packages that come as installer packages, they do exactly this. This is an idiomatic thing to do for installing libraries into the user's Python installation, so I don't see why it wouldn't be considered reasonable for installing an executable that uses the user's Python installation.
Of course if the user later decides to uninstall Python 2.6 and install 2.7, your .pyc files will be hosed… but from your description, it sounds like your entire program will be hosed anyway, and the recommended solution for the user would probably be to uninstall and reinstall anyway, right?
Is it possible to deploy python applications such that you don't release the source code and you don't have to be sure the customer has python installed?
I'm thinking maybe there is some installation process that can run a python app from just the .pyc files and a shared library containing the interpreter or something like that?
Basically I'm keen to get the development benefits of a language like Python - high productivity etc. but can't quite see how you could deploy it professionally to a customer where you don't know how there machine is set up and you definitely can't deliver the source.
How do professional software houses developing in python do it (or maybe the answer is that they don't) ?
You protect your source code legally, not technologically. Distributing py files really isn't a big deal. The only technological solution here is not to ship your program (which is really becoming more popular these days, as software is provided over the internet rather than fully installed locally more often.)
If you don't want the user to have to have Python installed but want to run Python programs, you'll have to bundle Python. Your resistance to doing so seems quite odd to me. Java programs have to either bundle or anticipate the JVM's presence. C programs have to either bundle or anticipate libc's presence (usually the latter), etc. There's nothing hacky about using what you need.
Professional Python desktop software bundles Python, either through something like py2exe/cx_Freeze/some in-house thing that does the same thing or through embedding Python (in which case Python comes along as a library rather than an executable). The former approach is usually a lot more powerful and robust.
Yes, it is possible to make installation packages. Look for py2exe, cx_freeze and others.
No, it is not possible to keep the source code completely safe. There are always ways to decompile.
Original source code can trivially be obtained from .pyc files if someone wants to do it. Code obfuscation would make it more difficult to do something with the code.
I am surprised no one mentioned this before now, but Cython seems like a viable solution to this problem. It will take your Python code and transpile it into CPython compatible C code. You also get a small speed boost (~25% last I checked) since it will be compiled to native machine code instead of just Python byte code. You still need to be sure the user has Python installed (either by making it a pre-requisite pushed off onto the user to deal with, or bundling it as part of the installer process). Also, you do need to have at least one small part of your application in pure Python: the hook into the main function.
So you would need something basic like this:
import cython_compiled_module
if __name__ == '__main__':
cython_compiled_module.main()
But this effectively leaks no implementation details. I think using Cython should meet the criteria in the question, but it also introduces the added complexity of compiling in C, which loses some of Python's easy cross-platform nature. Whether that is worth it or not is up to you.
As others stated, even the resulting compiled C code could be decompiled with a little effort, but it is likely much more close to the type of obfuscation you were initially hoping for.
Well, it depends what you want to do. If by "not releasing the source code" you mean "the customer should not be able to access the source code in any way", well, you're fighting a losing battle. Even programs written in C can be reverse engineered, after all. If you're afraid someone will steal from you, make them sign a contract and sue them if there's trouble.
But if you mean "the customer should not care about python files, and not be able to casually access them", you can use a solution like cx_Freeze to turn your Python application into an executable.
Build a web application in python. Then the world can use it via a browser with zero install.
As a small-time Python package writer (cobs, simplerandom), I'm wondering what Python versions I should support.
I've heard anecdotally that Python 2.5 is still in use on enterprise type servers. So I thought 2.5 was the oldest that needed to be practically supported, here in 2011.
However, I saw this blog in which the author says he's still using 2.4. From memory, I saw an e-mail on the PyCrypto mailing list saying they aimed to keep support going back to 2.2 if possible.
Of course, then there's Python 3.x which is slowly gaining momentum. It would be good to know who is using that.
Then, there is also Jython and Ironpython, and I have very little idea about them.
Is there any concrete and up-to-date Python installation/usage data available to enlighten us? Is there any "best practices" or other advice for what versions/flavours of Python a package writer should aim to support?
I think that this is a problem that's simply inherent when developing any software. Anyone could be running any version and would need support for that version (I wonder how many people are still running Windoze ME out there? ;)). Personally, when developing libraries, I'll support only support the current version+. If for no other reason, because I'm only one person and I don't have a team.
Having said that, I'd stick my packages up on github and take patches from anyone who wants support for previous versions (and is willing to put in the work).
Edit:
I've found that a good rule in software development (especially packages) is develop only for what is needed, not what you think might be needed. In other words, get it working for whatever version of Python you're running (or is dearest to you) and then take support requests if you want to implement them yourself as people need them.
I have servers that run Python 2.3. :-)
But no, you don't need to support it. Most servers like that are just running, and do not get any new modules installed.
When creating a new module today, 2.6, 2.7, 3.1 and 3.2 is the versions to support. For existing modules you can ask your users. :-)
You should support the latest of the 2.x series (2.7 as of now) and 3.x (3.2 as of now). Unless you have a specific need for supporting older versions, I don't think you need to go there.
As for alternate implementations like IronPython and Jython, you should do that only if needed. It can be time consuming (although perhaps educational) to support you app for all these implementations.
As a side node, to test your app on multiple versions/implementations of Python, I recommend tox.
I want to distribute a Python application to windows users who don't have Python or the correct Python version.
I have tried py2exe conversion but my Python program is really complex and involve code import on the fly by xmlrpc process so it is not suitable for py2exe.
The complete Python folder takes around 80MB but this includes docs and a lot of non-essential things.
Do you know if there exists a small package of a minimal Python interpreter I can include with my program ? Include a folder of 80MB is a bit big ;)
PyInstaller is a py2exe "competitor" that has many extras (such as being cross-platform, supporting popular third party packages "out of the box", and explicitly supporting advanced importing options) -- it might meet your needs. Just be sure to install the SVN trunk -- the existing (1.3) release is way, WAY obsolete (PyInstaller is under active development again since quite a while, but I can't convince the current maintainers to stop and do a RELEASE already -- they're kind of perfectionists and keep piling more and more great goodies, optimizations, enhancements, etc, into the SVN trunk instead;-).
Have a look at Portable Python. This will install a Python programming environment in a local folder. I am sure that you could strip many unwanted things off.
I recommend however that you give py2exe another chance.
..involve code import on the fly by xmlrpc process so it is not suitable for py2exe
Py2exe can deal with situations like this. You just have to tell it which modules are being imported at runtime, so that it includes them in the distribution. Your code should then be able to import from these modules dynamically.
püy2exe is bad and incompabilite to Windows 10 now.
I suggest you use BoxedApp Packer until 22 mb small without runtimes....
enter link description here
It is almost better than py2exe because py2exe need many py files and opened data files...
I have started on a personal python application that runs on the desktop. I am using wxPython as a GUI toolkit. Should there be a demand for this type of application, I would possibly like to commercialize it.
I have no knowledge of deploying "real-life" Python applications, though I have used py2exe in the past with varied success. How would I obfuscate the code? Can I somehow deploy only the bytecode?
An ideal solution would not jeopardize my intellectual property (source code), would not require a direct installation of Python (though I'm sure it will need to have some embedded interpreter), and would be cross-platform (Windows, Mac, and Linux). Does anyone know of any tools or resources in this area?
Thanks.
You can distribute the compiled Python bytecode (.pyc files) instead of the source. You can't prevent decompilation in Python (or any other language, really). You could use an obfuscator like pyobfuscate to make it more annoying for competitors to decipher your decompiled source.
As Alex Martelli says in this thread, if you want to keep your code a secret, you shouldn't run it on other people's machines.
IIRC, the last time I used cx_Freeze it created a DLL for Windows that removed the necessity for a native Python installation. This is at least worth checking out.
Wow, there are a lot of questions in there:
It is possible to run the bytecode (.pyc) file directly from the Python interpreter, but I haven't seen any bytecode obfuscation tools available.
I'm not aware of any "all in one" deployment solution, but:
For Windows you could use NSIS(http://nsis.sourceforge.net/Main_Page). The problem here is that while OSX/*nix comes with python, Windows doesn't. If you're not willing to build a binary with py2exe, I'm not sure what the licensing issues would be surrounding distribution of the Python runtime environment (not to mention the technical ones).
You could package up the OS X distribution using the "bundle" format, and *NIX has it's own conventions for installing software-- typically a "make install" script.
Hope that was helpful.
Maybe IronPython can provide something for you? I bet those .exe/.dll-files can be pretty locked down. Not sure how such features work on mono, thus no idea how this works on Linux/OS X...
I have been using py2exe with good success on Windows. The code needs to be modified a bit so that the code analysis picks up all modules needed, but apart from that, it works.
As for Linux, there are several important distribution formats:
DEB (Debian, Ubuntu and other derivatives)
RPM (RedHat, Fedora, openSuSE)
DEBs aren't particularly difficult to make, especially when you're already using distutils/setuptools. Some hints are given in the policy document, examples for packaging Python applications can be found in the repository.
I don't have any experience with RPM, but I'm sure there are enough examples to be found.
Try to use scraZ obfuscator (http://scraZ.me).
This is obfuscator for bytecode, not for source code.
Free version have good, but not perfect obfuscation methods.
PRO version have very very strong protection for bytecode.
(after bytecode obfuscation a decompilation is impossible)