I need PyPy to speed up my Python code, but PyPy doesn't support a lot of the modules I need (e.g. GNU Radio). Could I use PyPy to speed up only some of my Python files? How can I do that?
No, you can't. You can only have one interpreter instance running all of the code in a single program at a time. The exception is if you break out some of your functionality into a totally separate program that communicates with the other part of your code through some form of inter-process communication; then you can run those totally separate programs however you like. But for code that is not separated like that, it's not possible.
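To make the "totally separate program" exception concrete, here is a minimal sketch, assuming the CPU-heavy part can be isolated behind a simple request/response boundary. The worker source, the run_in_worker helper and the JSON-over-pipes protocol are all hypothetical choices made for illustration; sys.executable stands in for the path to a PyPy binary so that the sketch runs anywhere.

```python
import json
import subprocess
import sys

# Hypothetical worker program: in practice this would live in its own file
# and hold the CPU-heavy pure-Python code you want a different interpreter to run.
WORKER_SOURCE = """
import json, sys
numbers = json.load(sys.stdin)
json.dump({"sum_of_squares": sum(n * n for n in numbers)}, sys.stdout)
"""


def run_in_worker(numbers, interpreter=sys.executable):
    """Send `numbers` to a separate interpreter process and return its reply.

    `interpreter` is an assumption: pass the path to a PyPy binary here to
    run the worker under PyPy while the caller stays on CPython.
    """
    completed = subprocess.run(
        [interpreter, "-c", WORKER_SOURCE],
        input=json.dumps(numbers),   # request goes to the worker's stdin
        capture_output=True,         # reply comes back on the worker's stdout
        text=True,
        check=True,                  # raise if the worker exits with an error
    )
    return json.loads(completed.stdout)
```

Process start-up and serialization are not free, so a split like this only tends to pay off when each request carries a sizeable chunk of work.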
It will probably be more straightforward to adapt the entirety of your code to work with PyPy one way or another, instead of trying to break out bits and pieces. If that's absolutely not possible, then PyPy probably can't help you.
No, you can't. Besides, GNU Radio does its signal processing and scheduling in C++, so that part is totally opaque to your Python interpreter. GNU Radio itself is also highly optimized and contains specialized implementations of most of the CPU-intensive tasks for SSE, SSE4, and some NEON.
I need pypy to speed up my python code.
I doubt that. If your program runs too slow, it's probably nothing your Python interpreter can solve -- you might have to look into what could take so much time, and solve this on a higher level.
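One way to "look into what could take so much time" is the standard library profiler. This is only a sketch: slow_function is a hypothetical stand-in for whatever part of your program feels slow, and profile_call is a helper name invented here.

```python
import cProfile
import io
import pstats


def slow_function(n):
    """Hypothetical stand-in for the part of a program that feels slow."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func(*args) under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Show the five entries with the largest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()
```

Printing the returned report shows where the time goes, which usually tells you whether the bottleneck is in your Python code at all or inside a compiled library.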
Related
Is it possible to deploy python applications such that you don't release the source code and you don't have to be sure the customer has python installed?
I'm thinking maybe there is some installation process that can run a python app from just the .pyc files and a shared library containing the interpreter or something like that?
Basically I'm keen to get the development benefits of a language like Python (high productivity etc.) but can't quite see how you could deploy it professionally to a customer when you don't know how their machine is set up and you definitely can't deliver the source.
How do professional software houses developing in python do it (or maybe the answer is that they don't) ?
You protect your source code legally, not technologically. Distributing py files really isn't a big deal. The only technological solution here is not to ship your program (which is really becoming more popular these days, as software is provided over the internet rather than fully installed locally more often.)
If you don't want the user to have to have Python installed but want to run Python programs, you'll have to bundle Python. Your resistance to doing so seems quite odd to me. Java programs have to either bundle or anticipate the JVM's presence. C programs have to either bundle or anticipate libc's presence (usually the latter), etc. There's nothing hacky about using what you need.
Professional Python desktop software bundles Python, either through something like py2exe/cx_Freeze/some in-house thing that does the same thing or through embedding Python (in which case Python comes along as a library rather than an executable). The former approach is usually a lot more powerful and robust.
Yes, it is possible to make installation packages. Look for py2exe, cx_freeze and others.
No, it is not possible to keep the source code completely safe. There are always ways to decompile.
Original source code can trivially be obtained from .pyc files if someone wants to do it. Code obfuscation would make it more difficult to do something with the code.
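To illustrate how much a .pyc file gives away even without a decompiler, here is a small sketch that compiles a snippet and reads the names and constants straight back out of the compiled file. The helper name and the sample source are made up for the example, and the 16-byte header size is an assumption that holds for CPython 3.7 and later.

```python
import marshal
import py_compile
import tempfile
from pathlib import Path


def names_and_constants_in_pyc(source_text):
    """Compile source_text to a .pyc file and return (names, constants) stored in it."""
    with tempfile.TemporaryDirectory() as tmp:
        source_path = Path(tmp) / "secret.py"
        pyc_path = Path(tmp) / "secret.pyc"
        source_path.write_text(source_text)
        py_compile.compile(str(source_path), cfile=str(pyc_path), doraise=True)
        data = pyc_path.read_bytes()
    # Assumption: CPython 3.7+ writes a 16-byte header before the
    # marshalled code object.
    code = marshal.loads(data[16:])
    return code.co_names, code.co_consts
```

Identifiers and string literals survive compilation untouched, which is why shipping only .pyc files is not much of a barrier.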
I am surprised no one mentioned this before now, but Cython seems like a viable solution to this problem. It will take your Python code and transpile it into CPython compatible C code. You also get a small speed boost (~25% last I checked) since it will be compiled to native machine code instead of just Python byte code. You still need to be sure the user has Python installed (either by making it a pre-requisite pushed off onto the user to deal with, or bundling it as part of the installer process). Also, you do need to have at least one small part of your application in pure Python: the hook into the main function.
So you would need something basic like this:
import cython_compiled_module

if __name__ == '__main__':
    cython_compiled_module.main()
But this effectively leaks no implementation details. I think using Cython should meet the criteria in the question, but it also introduces the added complexity of compiling in C, which loses some of Python's easy cross-platform nature. Whether that is worth it or not is up to you.
As others stated, even the resulting compiled C code could be decompiled with a little effort, but it is likely much closer to the type of obfuscation you were initially hoping for.
Well, it depends what you want to do. If by "not releasing the source code" you mean "the customer should not be able to access the source code in any way", well, you're fighting a losing battle. Even programs written in C can be reverse engineered, after all. If you're afraid someone will steal from you, make them sign a contract and sue them if there's trouble.
But if you mean "the customer should not care about python files, and not be able to casually access them", you can use a solution like cx_Freeze to turn your Python application into an executable.
Build a web application in python. Then the world can use it via a browser with zero install.
I am working with an ARM Cortex M3 on which I need to port Python (without operating system). What would be my best approach? I just need the core Python and basic I/O.
Golly, that's kind of a tall order. There are so many services of a kernel that Python depends upon, and that you'd have to provide yourself. I'd think you'd be far better off looking for a lightweight OS -- maybe Minix 3? -- to put on your embedded processor.
Failing that, I'd be horribly tempted to think about hand-translating to C and building the essentials on that.
You should definitely look at eLua:
http://www.eluaproject.net
"Embedded power, driven by Lua
Quickly prototype and develop embedded software applications with the power of Lua and run them on a wide range of microcontroller architectures"
There are a few projects that have attempted to port Python to the situation you mention; take a look at python-on-a-chip, PyMite or tinypy. These are aimed at lower-power microcontrollers without an OS and tend to focus on slightly older versions of the Python language and reduced library support.
One possible approach is to build your own stack machine in software to interpret and execute Python byte code directly. Certainly not a porting job and quite labor-intensive to implement, but a self-contained Python byte code stack processor built for your embedded system gets you around needing an operating system.
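To give a feel for what such a stack machine looks like, here is a toy sketch written in Python for readability. The opcode names are invented for this example and are not CPython's real bytecode, which is much larger and changes between versions; on a microcontroller the same loop would be written in C.

```python
def run(program):
    """Execute a list of (opcode, argument) pairs on a tiny stack machine.

    The opcodes are hypothetical; they only mimic the general shape of
    a bytecode interpreter's dispatch loop.
    """
    stack = []
    variables = {}
    pc = 0  # program counter: index of the next instruction
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op == "PUSH":
            stack.append(arg)
        elif op == "LOAD":
            stack.append(variables[arg])
        elif op == "STORE":
            variables[arg] = stack.pop()
        elif op == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "LESS":
            right, left = stack.pop(), stack.pop()
            stack.append(left < right)
        elif op == "JUMP":
            pc = arg
        elif op == "JUMP_IF_FALSE":
            if not stack.pop():
                pc = arg
        else:
            raise ValueError(f"unknown opcode: {op}")
    return variables


# Equivalent of: i = 0; total = 0; while i < 5: total += i; i += 1
PROGRAM = [
    ("PUSH", 0), ("STORE", "i"),                  # instructions 0-1
    ("PUSH", 0), ("STORE", "total"),              # instructions 2-3
    ("LOAD", "i"), ("PUSH", 5), ("LESS", None),   # instructions 4-6: loop test
    ("JUMP_IF_FALSE", 17),                        # instruction 7: exit the loop
    ("LOAD", "total"), ("LOAD", "i"), ("ADD", None), ("STORE", "total"),  # 8-11
    ("LOAD", "i"), ("PUSH", 1), ("ADD", None), ("STORE", "i"),            # 12-15
    ("JUMP", 4),                                  # instruction 16: back to the test
]
```

Calling run(PROGRAM) returns the final variable table. The hard part of the real job is everything this sketch leaves out: objects, memory management, exceptions and the built-in types.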
Another approach is writing your own low level executive (one step below a general purpose OS) that contains the bare minimum in services that a core Python interpreter port requires. I am not certain if this is more or less labor intensive than building a stack processor.
I am not recommending either of these approaches - personally, I like Charlie Martin's Minix 3 approach best since it is a balanced requirements compromise. On the other hand, what I suggest might be interesting if your project absolutely requires Python without an operating system and if the project has an excellent time and money budget.
Update 5 Mar 2012: Given a strict adherence to your Python/No OS requirements, another possible path to a solution may lie in using an OS-less Java VM (e.g., jnode, currently in beta) and using Jython to create Java byte code from Python. Certainly not an ideal off-the-shelf solution, but it does seem to meet an OS-less Python requirement.
Compile it to c :)
http://shed-skin.blogspot.com/
FYI, I just ported CPython 2.7.x to a non-POSIX OS. That was easy.
You need to write pyconfig.h correctly, remove most of the unused modules, and disable unused features.
Then fix the compile and link errors. After fixing a few simple run-time problems, it just works.
If a POSIX header is missing, write one yourself. Implement all the POSIX functions that are needed, such as file I/O.
It took 2-3 weeks in my case, although I have heavily customized the Python core. Unfortunately I cannot open-source it :(.
After that, I think Python can be ported easily to any platform that has enough RAM.
Wouldn't it be possible to have an OS entirely in Python if the Python VM itself were built into hardware? Something like the good old Lisp Machine?
Suppose I have a CPU that is a hardware implementation of the Python virtual machine. Wouldn't all programs written in Python then run at the speed of assembly (Python is mostly interpreted, but we can compile it)?
If we have such a 'python-microprocessor', what about the memory and other subsystems? Would it be compatible with current memory?
Is there any information on the registers and the Python VM architecture, something similar to what we have for 8086?
Wouldn't it be possible to have an OS entirely in Python if the Python VM itself were built into hardware? Something like the good old Lisp Machine?
Yes, theoretically it would be possible.
Suppose I have a CPU that is a hardware implementation of the Python virtual machine. Wouldn't all programs written in Python then run at the speed of assembly (Python is mostly interpreted, but we can compile it)?
Python doesn't have a speed, it's a language. The speed of the interpreter (in this case the processor) can be tested. But just as it's difficult to compare the performance of a RISC and a CISC processor, comparing Assembly with Python will be difficult too.
If we have such a 'python-microprocessor', what about the memory and other subsystems? Would it be compatible with current memory?
The python microprocessor would have to do the memory management (and thus the garbage collection). Since that's normally done by the interpreter, now the microprocessor has to do it.
Is there any information on the registers and the Python VM architecture, something similar to what we have for 8086?
Normally you don't access the memory directly in Python, so the registers shouldn't be relevant here.
Similar things were tried for Java, but none really took the world by storm.
Yeah, it might be possible, but designing new hardware is expensive. Would the return on investment justify building such a toy? I'd guess not, otherwise someone would have tried it by now. :)
Suppose I have a CPU that is a hardware implementation of the Python virtual machine. Wouldn't all programs written in Python then run at the speed of assembly (Python is mostly interpreted, but we can compile it)?
Yes, it would run at assembly speed. See this link for a comparison with AVR microcontroller assembly code: http://pycpu.wordpress.com/code-examples/speed-pycpu-vs-8bit-avr/
It is a hardware implementation of a CPU that can execute only a very limited subset of Python bytecode, but enough for if conditions and while loops with simple integers.
In the 1970s such ideas were quite popular. The idea was to close the semantic gap between compilers/virtual machines and instruction set architectures, and thereby bring programming languages and hardware closer together. However, when Patterson and Ditzel published "The Case for the Reduced Instruction Set Computer", and after the success of RISC and the microprocessor, the idea of closing the semantic gap was basically dead.
Now, with ever increasing transistor counts the idea may become interesting again. But, as others already noted, designing chips is costly. You need a very good reason to sink so much money. But it is definitely possible. IBM and Azul have shown this with their massively parallel Java Chips.
I guess you should call Google and convince them that they urgently need a Python processor. ;-)
New operating systems are interesting and cool, and basing one on Python would be cool. Then again, Linux is so good and has so much development behind it already. It would have to be the "right time".
From the Google Open Source Blog:
PyPy is a reimplementation of Python in Python, using advanced techniques to try to attain better performance than CPython. Many years of hard work have finally paid off. Our speed results often beat CPython, ranging from being slightly slower, to speedups of up to 2x on real application code, to speedups of up to 10x on small benchmarks.
How is this possible? Which Python implementation was used to implement PyPy? CPython? And what are the chances of a PyPyPy or PyPyPyPy beating their score?
(On a related note... why would anyone try something like this?)
"PyPy is a reimplementation of Python in Python" is a rather misleading way to describe PyPy, IMHO, although it's technically true.
There are two major parts of PyPy: the translation framework and the interpreter.
The translation framework is a compiler. It compiles RPython code down to C (or other targets), automatically adding in aspects such as garbage collection and a JIT compiler. It cannot handle arbitrary Python code, only RPython.
RPython is a subset of normal Python; all RPython code is Python code, but not the other way around. There is no formal definition of RPython, because RPython is basically just "the subset of Python that can be translated by PyPy's translation framework". But in order to be translated, RPython code has to be statically typed (the types are inferred, you don't declare them, but it's still strictly one type per variable), and you can't do things like declaring/modifying functions/classes at runtime either.
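A small sketch of the "one type per variable" restriction. Both functions are ordinary Python and run on any interpreter; the claim that a translator with these rules would reject the second one is an assumption based on the description above, not something this snippet checks. The function names are invented for the example.

```python
def rpython_friendly(values):
    """Every variable keeps one type: `total` is always an int."""
    total = 0
    for value in values:
        total += value
    return total


def python_only(flag):
    """`result` is an int on one path and a str on the other.

    A normal Python interpreter runs this happily. Assumption: a
    translator that infers a single static type per variable would
    refuse it, because no single type fits `result`.
    """
    if flag:
        result = 42
    else:
        result = "forty-two"
    return result
```

The first function is the style an interpreter written in RPython has to stick to throughout; the second is everyday dynamic Python that such a translator cannot type.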
The interpreter then is a normal Python interpreter written in RPython.
Because RPython code is normal Python code, you can run it on any Python interpreter. But none of PyPy's speed claims come from running it that way; this is just for a rapid test cycle, because translating the interpreter takes a long time.
With that understood, it should be immediately obvious that speculations about PyPyPy or PyPyPyPy don't actually make any sense. You have an interpreter written in RPython. You translate it to C code that executes Python quickly. There the process stops; there's no more RPython to speed up by processing it again.
So "How is it possible for PyPy to be faster than CPython" also becomes fairly obvious. PyPy has a better implementation, including a JIT compiler (it's generally not quite as fast without the JIT compiler, I believe, which means PyPy is only faster for programs susceptible to JIT-compilation). CPython was never designed to be a highly optimising implementation of the Python language (though they do try to make it a highly optimised implementation, if you follow the difference).
The really innovative bit of the PyPy project is that they don't write sophisticated GC schemes or JIT compilers by hand. They write the interpreter relatively straightforwardly in RPython, and for all that RPython is lower level than Python, it's still an object-oriented, garbage-collected language, much higher level than C. Then the translation framework automatically adds things like GC and JIT. So the translation framework is a huge effort, but it applies equally well to the PyPy Python interpreter however they change their implementation, allowing for much more freedom in experimentation to improve performance (without worrying about introducing GC bugs or updating the JIT compiler to cope with the changes). It also means that when they get around to implementing a Python 3 interpreter, it will automatically get the same benefits, as will any other interpreters written with the PyPy framework (of which there are a number at varying stages of polish). And all interpreters using the PyPy framework automatically support all platforms supported by the framework.
So the true benefit of the PyPy project is to separate out (as much as possible) all the parts of implementing an efficient platform-independent interpreter for a dynamic language. And then come up with one good implementation of them in one place, that can be re-used across many interpreters. That's not an immediate win like "my Python program runs faster now", but it's a great prospect for the future.
And it can run your Python program faster (maybe).
Q1. How is this possible?
Reference counting (which is how CPython manages memory) can be slower than a tracing garbage collector in some cases.
Limitations in the implementation of the CPython interpreter preclude certain optimisations that PyPy can do (e.g. fine-grained locks).
As Marcelo mentioned, the JIT. Being able to confirm the type of an object on the fly can save you multiple pointer dereferences on the way to the method you want to call.
Q2. Which Python implementation was used to implement PyPy?
The PyPy interpreter is implemented in RPython, which is a statically typed subset of Python (the language, not the CPython interpreter). See https://pypy.readthedocs.org/en/latest/architecture.html for details.
Q3. And what are the chances of a PyPyPy or PyPyPyPy beating their score?
That would depend on the implementation of these hypothetical interpreters. If one of them, for example, took the source, did some kind of analysis on it and converted it directly into tight target-specific assembly code after running for a while, I imagine it would be considerably faster than CPython.
Update: Recently, on a carefully crafted example, PyPy outperformed a similar C program compiled with gcc -O3. It's a contrived case but does exhibit some ideas.
Q4. Why would anyone try something like this?
From the official site. https://pypy.readthedocs.org/en/latest/architecture.html#mission-statement
We aim to provide:
a common translation and support framework for producing implementations of dynamic languages, emphasizing a clean separation between language specification and implementation aspects. We call this the RPython toolchain.
a compliant, flexible and fast implementation of the Python Language which uses the above toolchain to enable new advanced high-level features without having to encode the low-level details.
By separating concerns in this way, our implementation of Python - and other dynamic languages - is able to automatically generate a Just-in-Time compiler for any dynamic language. It also allows a mix-and-match approach to implementation decisions, including many that have historically been outside of a user's control, such as target platform, memory and threading models, garbage collection strategies, and optimizations applied, including whether or not to have a JIT in the first place.
The C compiler gcc is implemented in C; the Haskell compiler GHC is written in Haskell. Do you have any reason for the Python interpreter/compiler not to be written in Python?
PyPy is implemented in Python, but it implements a JIT compiler to generate native code on the fly.
The reason to implement PyPy on top of Python is probably that it is simply a very productive language, especially since the JIT compiler makes the host language's performance somewhat irrelevant.
PyPy is written in Restricted Python. It does not run on top of the CPython interpreter, as far as I know. Restricted Python is a subset of the Python language. AFAIK, the PyPy interpreter is compiled to machine code, so when installed it does not utilize a python interpreter at runtime.
Your question seems to expect the PyPy interpreter is running on top of CPython while executing code.
Edit: Yes, to use PyPy you first translate the PyPy python code, either to C and build with gcc, to jvm byte code, or to .Net CLI code. See Getting Started
I've got some experience with Bash, which I don't mind, but now that I'm doing a lot of Windows development I'm needing to do basic stuff/write basic scripts using the Windows command-line language. For some reason said language really irritates me, so I was considering learning Python and using that instead.
Is Python suitable for such things? Moving files around, creating scripts to do things like unzipping a backup and restoring a SQL database, etc.
Python is well suited for these tasks, and I would guess much easier to develop in and debug than Windows batch files.
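As a rough idea of what the file-handling half of such a script looks like, here is a sketch using only the standard library. The function names and the archive layout are hypothetical; the actual SQL restore step is left out because it depends entirely on your database's own command-line tools.

```python
import shutil
import zipfile
from pathlib import Path


def restore_backup(archive, restore_dir):
    """Unpack a zip backup into restore_dir and return the extracted file paths."""
    restore_dir = Path(restore_dir)
    restore_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as backup:
        backup.extractall(restore_dir)
    return sorted(p for p in restore_dir.rglob("*") if p.is_file())


def move_matching(source_dir, target_dir, pattern):
    """Move every file in source_dir that matches a glob pattern into target_dir."""
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(Path(source_dir).glob(pattern)):
        # shutil.move returns the destination path.
        moved.append(Path(shutil.move(str(path), str(target_dir / path.name))))
    return moved
```

Because pathlib handles the separators, the same script runs unchanged on Windows and Unix.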
The question is, I think, how easy and painless it is to ensure that all the computers that you have to run these scripts on, have Python installed.
Summary
Windows: no need to think, use Python.
Unix: quick or run-it-once scripts are for Bash, serious and/or long life time scripts are for Python.
The big talk
In a Windows environment, Python is definitely the best choice since cmd is crappy and PowerShell has not really settled yet. What's more, Python runs on several platforms, so it's a better investment. Finally, Python has a huge set of libraries, so you will almost never hit the "god-I-can't-do-that" wall. This is not true for cmd and PowerShell.
In a Linux environment, this is a bit different. A lot of one-liners are shorter, faster, more efficient and often more readable in pure Bash. But if you know your quick and dirty script is going to stay around for a while or will need to be improved, go for Python, since it's far easier to maintain and extend, and you will be able to do most of the tasks you can do with GNU tools with the standard library. And if you can't, you can still call the command line from a Python script.
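Calling the command line from Python can be sketched with the standard subprocess module. The run_command helper is a name made up for this example, and sys.executable is used as the command only so that the snippet runs on any platform; in a real script it would be whatever tool you need.

```python
import subprocess
import sys


def run_command(args):
    """Run an external command and return (exit code, captured stdout)."""
    completed = subprocess.run(args, capture_output=True, text=True)
    return completed.returncode, completed.stdout


if __name__ == "__main__":
    # Ask the current interpreter for its version, as a stand-in for any tool.
    code, output = run_command([sys.executable, "--version"])
    print(code, output.strip())
```

Passing the command as a list avoids shell quoting problems, which is one of the things that tends to go wrong in longer shell scripts.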
And of course you can call Python from the shell using the -c option:
python -c "for line in open('/etc/fstab'): print(line, end='')"
Some more literature about Python used for system administration tasks:
The IBM lab point of view.
A nice example to compare bash and python to script report.
The basics.
The must-have book.
Sure, python is a pretty good choice for those tasks (I'm sure many will recommend PowerShell instead).
Here is a fine introduction from that point of view:
http://www.redhatmagazine.com/2008/02/07/python-for-bash-scripters-a-well-kept-secret/
EDIT: About gnud's concern: http://www.portablepython.com/
Are you aware of PowerShell?
Anything is a good replacement for the batch file system in Windows. Perl, Python and PowerShell are all good choices.
@BKB definitely has a valid concern. Here are a couple of links you'll want to check if you run into any issues that can't be solved with the standard library:
Pywin32 is a package for working with low-level win32 APIs (advanced file system modifications, COM interfaces, etc.)
Tim Golden's Python page: he maintains a WMI wrapper package that builds off of Pywin32, but be sure to also check out his "Win32 How Do I" page for details on how to accomplish typical Windows tasks in Python.
Python is certainly well suited to that. If you're going down that road, you might also want to investigate SCons which is a build system itself built with Python. The cool thing is the build scripts are actually full-blown Python scripts themselves, so you can do anything in the build script that you could otherwise do in Python. It makes make look pretty anemic in comparison.
Upon rereading your question, I should note that SCons is more suited to building software projects than to writing system maintenance scripts. But I wouldn't hesitate to recommend Python to you in any case.
As a follow up, after some experimentation the thing I've found Python most useful for is any situation involving text manipulation (yourStringHere.replace(), regexes for more complex stuff) or testing some basic concept really quickly, which it is excellent for.
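A quick sketch of the two kinds of text manipulation mentioned here: plain str.replace() for simple substitutions and the re module for anything pattern-based. The function names and sample inputs are made up for illustration.

```python
import re


def normalise_path(path):
    """Simple substitution: swap Windows separators for forward slashes."""
    return path.replace("\\", "/")


def find_dates(text):
    """Regex for more complex matching: pull ISO-style dates out of free text."""
    return re.findall(r"\b\d{4}-\d{2}-\d{2}\b", text)
```

For example, find_dates can pick the dates out of a log line in one call, which is painful to do in a batch file.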
For stuff like SQL DB restore scripts I find I still usually just resort to batch files, as it's usually either something short enough that it actually takes more Python code to make the appropriate system calls or I can reuse snippets of code from other people reducing the writing time to just enough to tweak existing code to fit my needs.
As an addendum I would highly recommend IPython as a great interactive shell complete with tab completion and easy docstring access.
I've done a decent amount of scripting in both Linux/Unix and Windows environments, in Python, Perl, batch files, Bash, etc. My advice is that if it's possible, install Cygwin and use Bash (it sounds from your description like installing a scripting language or env isn't a problem?). You'll be more comfortable with that since the transition is minimal.
If that's not an option, then here's my take. Batch files are very kludgy and limited, but make a lot of sense for simple tasks like 'copy some files' or 'restart this service'. Python will be cleaner, easier to maintain, and much more powerful. However, the downside is that you either end up calling external applications from Python with subprocess, popen or similar, or you end up writing a bunch more code to do things that are comparatively simple in batch files, like copying a folder full of files. A lot of this depends on what your scripts are doing. Text/string processing is going to be much cleaner in Python, for example.
Lastly, it's probably not an attractive option, but you might also consider VBScript. I don't enjoy working with it as a language personally, but if portability is any kind of concern then it wins out by virtue of being available out of the box in any copy of Windows. Because of this I've found myself writing scripts that were unwieldy as batch files in VBScript instead, since I can't usually depend on Python or Perl or Bash being available on Windows.
Python, along with Pywin32, would be fine for Windows automation. However, VBScript or JScript used with the Windows Scripting Host works just as well, and requires nothing additional to install.
I've been using a lot of Windows Script Files lately. More powerful than batch scripts, and since it uses Windows scripting, there's nothing to install.
As much as I love Python, I don't think it's a good choice to replace basic Windows batch scripts.
I can't see someone having to import modules like sys, os or getopt to do basic things you can do with the shell, like calling a program or checking an environment variable or an argument.
Also, in my experience, goto is much easier for most sysadmins to understand than a function call.