Today I stumbled over a post in stackoverflow (see also here):
We are developing opencl4py, higher level bindings. This project uses CFFI, so it works on Pypy.
The major issue we encountered with pyopencl is that 'import pyopencl' does OpenCL initialization and takes the whole virtual memory in case of NVIDIA driver, preventing from correct forking and effectively disabling multiprocessing (yes, we claim that using pyopencl disables multiprocessing at least with NVIDIA). opencl4py uses lazy OpenCL initialization, resolving this "import hell".
Later, it gained some nice features like super easy binary program caching, etc. Unfortunately, the documentation is somewhat brief. The best way to learn how it works is go through the tests.
As there is also pyOpenCL, I was woundering what the difference between these two packages is. Does anybody know where I can find an overview on the pro's and con's for these both packages?
Edit: To include benshope's comment as I would also be interested: what does "disable[s] multiprocessing" mean? Like, it can't run kernels on several devices at one time?
As far as I know, there is no such overview. I'll try to list some key points:
pyOpenCL is a mature project with a relatively large user base. There are tutorials, FAQ, etc. opencl4py appeared on 03/2014; no tutorials, FAQ and so on - only unit tests and docstrings.
pyOpenCL is a native cPython extension, whereas opencl4py uses cffi, so that it works on PyPy (pyOpenCL does NOT) and it does not require to be recompiled each time cPython changes version.
PyOpenCL has extras, such as random number generator and OpenGL interoperability.
opencl4py is extensively tested in Samsung production real world scenarios and is being actively developed.
what does "disable[s] multiprocessing" mean? Like, it can't run kernels on several devices at one time?
Of course, it can, I was trying to say that after importing pyopencl, os.fork() or multiprocessing.Process() lead to crashes inside NVIDIA OpenCL userspace library. It is always a bad idea of doing work during import.
Related
I am starting to read up over possible ways to parallelise Python code.
DISCLAIMER. This is NOT a question about Multiprocessing vs Multithreading.
At this link https://ipyparallel.readthedocs.io/en/latest/demos.html one finds references to several
concurrency packages for Python to avoid the GIL: https://scipy.github.io/old-wiki/pages/ParallelProgramming
-IPython1
-mpi4py
-parallel python
-Numba
There is also a multiprocessing package:
https://docs.python.org/3/library/multiprocessing.html
And another one called processing:
https://pypi.org/project/processing/
First of all, it is not at all clear to me the difference between the latter two above; what is the difference in using between the multiprocessing module and the processing module?.
In general, I fail to understand the differences between those all -- which must be there, given some developers made the effort to create a mpi4py version for the MPI used in C++. I guess this is not just about the dualism between "threading" and "multiprocessing" approaches, where in one case the memory is shared while the other has each process with its own memory and interpreter, something more must be different between all of those different packages out there.
Thanks to all of those who will dedicate time to answer this!
The difference is that the last version of processing was released in April of 2008 and multiprocessing was added in Python 2.6 in October 2008.
processing was a library that was used before multiprocessing was distributed with Python.
As far as the specific difference between other modules designed for multiprocessing: The scipy page you linked says that "This is a subject for graduate courses in computer science, and I'm not going to address it here....there are some python tools you can use to implement the things you learn in that graduate course." While they admit that may be a bit of an exaggeration, independent study of multiprocessing in general will be required to discern the difference between these libraries, you should probably just stick to the built in multiprocessing module for your initial experiments while you learn how it works. One you're more comfortable with multiprocessing, you might want to check out the pathos framework.
But here are the basics for the packages you mention:
Numba adds decorators that automatically compile functions to make them run faster, it isn't really a multiprocessing tool as much as a JIT compiling tool.
Parallel Python overcomes the GIL to utilize multiple cores or multiple computers, it's designed to be easy to use and to handle all the complex stuff behind the scenes.
MPI for Python is like Paralell Python with less emphasis on simplicity.
IPython is a toolkit with many features, including a shell and Jupyter kernel, it's also not really a multiprocessing tool.
Keep in mind that plenty of libraries/modules do the same thing, there doesn't need to be a reason more than one exists. Use whatever works for you.
I'm building a rendering engine in Python for fun. I need to load 3D scenes. Any standard modern format like DAE, 3DS, or MAX would work: I can convert my files easily between standard formats.
OpenSceneGraph seems to be the most comprehensive and well-maintained solution. It would be ideal to be able to use it in Python without much hassle. Are there working Python bindings for OSG that are easy to install, work on Mac OS X (I'm on 10.8), and are compatible with the latest versions of OSG?
I searched around and came across osgswig (http://code.google.com/p/osgswig/) and PyOSG (http://sourceforge.net/projects/pyosg/), but they don't seem to be actively maintained. I don't see any recent activity related to these packages, and it seems that people had trouble running osgswig on OSX. Ideally, I'd like to find something that "just works", without major compilation hassles. I'd like to just install a package and be able to import a module that will let me load COLLADA or 3DS files.
I also came across pycollada (https://github.com/pycollada/pycollada). It seems active, but fairly early-stage. Ideally, I'd like a reasonably comprehensive package that supports specular maps, normal maps, and other reasonably advanced features. Animation would be nice as well.
In summary, I need to load 3D scenes in Python. Bindings for OSG would probably be ideal, because OSG is so comprehensive. But I need something that works on OSX. I would also prefer something that can be installed reasonably easily. Does something like this exist?
Thanks!
Take a look at Open Asset Import Library (short name: Assimp). It is a portable Open Source library to import various well-known 3D model formats in a uniform manner. http://www.assimp.org/
You should loot at panda3D (http://www.panda3d.org/), it's a game engine with extensive python bindings. It has the features you want : http://www.panda3d.org/manual/index.php/Features
I used it for a few years and it was a solid tool.
I made my own fork of a mirror of a clone of the osgswig project for a similar purpose. I have it working with OpenSceneGraph version 3.2.1 on Windows and Mac; and it's likely I will eventually polish it for linux too. I'm already delivering one product to customers based on my version of osgswig, and I'm considering making others. Find my fork here:
https://github.com/cmbruns/osgswig
If others show enough interest, I might be coaxed into creating binary installers for my version of the osgswig module, to make installation easier.
If you just want the easiest OpenSceneGraph bindings for OSG 3.2.1, you can stop reading this answer here. Read on for more of my thoughts for the future.
Though I am maintaining a fork of osgswig (as stated above), I sort of hate SWIG, and I would prefer to use bindings based on Boost.Python, rather than on SWIG. For large, complex C++ APIs, like OpenSceneGraph, Boost.Python can be much more elegant than SWIG, both for the API consumer, and for the binding maintainer (me, and me). I found one project using Boost.Python to wrap OSG, at https://code.google.com/p/osgboostpython/, but the developer is lovingly wrapping each part of the interface by hand, and has thus only completed a tiny fraction of the large OpenSceneGraph API.
Taking that Boost.Python based project as inspiration, I created yet another OpenSceneGraph Python binding project, at https://github.com/JaneliaSciComp/osgpyplusplus. Eventually, I want to use this osgpyplusplus project for all my python osg needs. And I would appreciate help in making it ready. Right now, osgpyplusplus suffers from the following weaknesses, compared to osgswig:
osgpyplusplus is not yet used in any working product
The build environment is tricky to set up, requiring both Boost.Python and Pyplusplus
I haven't paid much attention to osgpyplusplus recently, so it might rust away if I continue to ignore it.
Though osgpyplusplus probably wraps most of the OpenSceneGraph API, there are probably some important missing pieces that won't be identified until someone tries to develop a significant project with it.
It would be a lot of work for me to create a binary module installer for osgpyplusplus at this point, so please don't ask me to.
I was looking at PyPy and I was just wondering why it hasn't been adopted into the mainline Python distributions. Wouldn't things like JIT compilation and lower memory footprint greatly improve the speeds of all Python code?
In short, what are the main drawbacks of PyPy that cause it to remain a separate project?
PyPy is not a fork of CPython, so it could never be merged directly into CPython.
Theoretically the Python community could universally adopt PyPy, PyPy could be made the reference implementation, and CPython could be discontinued. However, PyPy has its own weaknesses:
CPython is easy to integrate with Python modules written in C, which is traditionally the way Python applications have handled CPU-intensive tasks (see for instance the SciPy project).
The PyPy JIT compilation step itself costs CPU time -- it's only through repeated running of compiled code that it becomes faster overall. This means startup times can be higher, and therefore PyPy isn't necessarily as efficient for running glue code or trivial scripts.
PyPy and CPython behavior is not identical in all respects, especially when it comes to "implementation details" (behavior that is not specified by the language but is still important at a practical level).
CPython runs on more architectures than PyPy and has been successfully adapted to run in embedded architectures in ways that may be impractical for PyPy.
CPython's reference counting scheme for memory management arguably has more predictable performance impacts than PyPy's various GC systems, although this isn't necessarily true of all "pure GC" strategies.
PyPy does not yet fully support Python 3.x, although that is an active work item.
PyPy is a great project, but runtime speed on CPU-intensive tasks isn't everything, and in many applications it's the least of many concerns. For instance, Django can run on PyPy and that makes templating faster, but CPython's database drivers are faster than PyPy's; in the end, which implementation is more efficient depends on where the bottleneck in a given application is.
Another example: you'd think PyPy would be great for games, but most GC strategies like those used in PyPy cause noticeable jitter. For CPython, most of the CPU-intensive game stuff is offloaded to the PyGame library, which PyPy can't take advantage of since PyGame is primarily implemented as a C extension (though see: pygame-cffi). I still think PyPy can be a great platform for games, but I've never seen it actually used.
PyPy and CPython have radically different approaches to fundamental design questions and make different tradeoffs, so neither one is "better" than the other in every case.
For one, it's not 100% compatible with Python 2.x, and has only preliminary support for 3.x.
It's also not something that could be merged - The Python implementation that is provided by PyPy is generated using a framework they have created, which is extremely cool, but also completely disparate with the existing CPython implementation. It would have to be a complete replacement.
There are some very concrete differences between PyPy and CPython, a big one being how extension modules are supported - which, if you want to go beyond the standard library, is a big deal.
It's also worth noting that PyPy isn't universally faster.
See this video by Guido van Rossum. He talks about the same question you asked at 12 min 33 secs.
Highlights:
lack of Python 3 compatibility
lack of extension support
not appropriate as glue code
speed is not everything
After all, he's the one to decide...
One reason might be that according to PyPy site, it currently runs only on 32- and 64-bit Intel x86 architecture, while CPython runs on other platforms as well. This is probably due to platform-specific speed enhancements in PyPy. While speed is a good thing, people often want language implementations to be as "platform-independent" as possible.
I recommend watching this keynote by David Beazley for more insights. It answers your question by giving clarity on nature & intricacies of PyPy.
In addition to everything that's been said here, PyPy is not nearly as rock solid as CPython in terms of bugs. With SymPy, we've found at about a dozen bugs in PyPy over the past couple of years, both in released versions and in the nightlies.
On the other hand, we've only ever found one bug in CPython, and that was in a prerelease.
Plus, don't discount the lack of Python 3 support. No one in the core Python community even cares about Python 2 any more. They are working on the next big things in Python 3.4, which will be the fifth major release of Python 3. The PyPy guys still haven't gotten one of them. So they've got some catching up to do before they can start to be contenders.
Don't get me wrong. PyPy is awesome. But it's still far from being better than CPython in a lot of very important ways.
And by the way, if you use SymPy in PyPy, you won't see a smaller memory footprint (or a speedup either). See https://bitbucket.org/pypy/pypy/issues/1447/.
I am working with an ARM Cortex M3 on which I need to port Python (without operating system). What would be my best approach? I just need the core Python and basic I/O.
Golly, that's kind of a tall order. There are so many services of a kernel that Python depends upon, and that you'd have to provide yourself. I'd think you'd be far better off looking for a lightweight OS -- maybe Minix 3? -- to put on your embedded processor.
Failing that, I'd be horribly tempted to think about hand-translating to C and building the essentials on that.
You should definitely look at eLua:
http://www.eluaproject.net
"Embedded power, driven by Lua
Quickly prototype and develop embedded software applications with the power of Lua and run them on a wide range of microcontroller architectures"
There are a few projects that have attempted to port Python to the situation you mention, take a look at python-on-a-chip, PyMite or tinypy. These are aimed at lower power microcontrollers without an OS and tend to focus on slightly older versions of the Python language and reduced library support.
One possible approach is to build your own stack machine in software to interpret and execute Python byte code directly. Certainly not a porting job and quite labor-intensive to implement, but a self-contained Python byte code stack processor built for your embedded system gets you around needing an operating system.
Another approach is writing your own low level executive (one step below a general purpose OS) that contains the bare minimum in services that a core Python interpreter port requires. I am not certain if this is more or less labor intensive than building a stack processor.
I am not recommending either of these approaches - personally, I like Charlie Martin's Minix 3 approach best since it is a balanced requirements compromise. On the other hand, what I suggest might be interesting if your project absolutely requires Python without an operating system and if the project has an excellent time and money budget.
Update 5 Mar 2012: Given a strict adherence to your Python/No OS requirements, another possibility of a path to a solution may lie in using an OS-less Java VM (e.g., jnode, currently in beta) and use Jython to create Java byte code from Python. Certainly not an ideal off-the-shelf solution, and it does seem to meet an OS-less Python requirement.
Compile it to c :)
http://shed-skin.blogspot.com/
fyi I just ported CPython 2.7x to non-POSIX OS. That was easy.
You need write pyconfig.h in right way, remove most of unused modules. Disable unused features.
Then fix compile, link errors. Then it just works after fixing some simple problems on run.
If You have no some POSIX header, write one by yourself. Implement all POSIX functions, that needed, such as file i/o.
Took 2-3 weeks in my case. Although I have heavily customized Python core. Unfortunately cannot opensource it :(.
After that I think Python can be ported easily to any platform, that has enough RAM.
I have started on a personal python application that runs on the desktop. I am using wxPython as a GUI toolkit. Should there be a demand for this type of application, I would possibly like to commercialize it.
I have no knowledge of deploying "real-life" Python applications, though I have used py2exe in the past with varied success. How would I obfuscate the code? Can I somehow deploy only the bytecode?
An ideal solution would not jeopardize my intellectual property (source code), would not require a direct installation of Python (though I'm sure it will need to have some embedded interpreter), and would be cross-platform (Windows, Mac, and Linux). Does anyone know of any tools or resources in this area?
Thanks.
You can distribute the compiled Python bytecode (.pyc files) instead of the source. You can't prevent decompilation in Python (or any other language, really). You could use an obfuscator like pyobfuscate to make it more annoying for competitors to decipher your decompiled source.
As Alex Martelli says in this thread, if you want to keep your code a secret, you shouldn't run it on other people's machines.
IIRC, the last time I used cx_Freeze it created a DLL for Windows that removed the necessity for a native Python installation. This is at least worth checking out.
Wow, there are a lot of questions in there:
It is possible to run the bytecode (.pyc) file directly from the Python interpreter, but I haven't seen any bytecode obfuscation tools available.
I'm not aware of any "all in one" deployment solution, but:
For Windows you could use NSIS(http://nsis.sourceforge.net/Main_Page). The problem here is that while OSX/*nix comes with python, Windows doesn't. If you're not willing to build a binary with py2exe, I'm not sure what the licensing issues would be surrounding distribution of the Python runtime environment (not to mention the technical ones).
You could package up the OS X distribution using the "bundle" format, and *NIX has it's own conventions for installing software-- typically a "make install" script.
Hope that was helpful.
Maybe IronPython can provide something for you? I bet those .exe/.dll-files can be pretty locked down. Not sure how such features work on mono, thus no idea how this works on Linux/OS X...
I have been using py2exe with good success on Windows. The code needs to be modified a bit so that the code analysis picks up all modules needed, but apart from that, it works.
As for Linux, there are several important distribution formats:
DEB (Debian, Ubuntu and other derivatives)
RPM (RedHat, Fedora, openSuSE)
DEBs aren't particularly difficult to make, especially when you're already using distutils/setuptools. Some hints are given in the policy document, examples for packaging Python applications can be found in the repository.
I don't have any experience with RPM, but I'm sure there are enough examples to be found.
Try to use scraZ obfuscator (http://scraZ.me).
This is obfuscator for bytecode, not for source code.
Free version have good, but not perfect obfuscation methods.
PRO version have very very strong protection for bytecode.
(after bytecode obfuscation a decompilation is impossible)