Change Python import behavior based on intended usage

I have a scientific code written in Python.
The code works as a proper package: it is imported and the underlying functionality is then used. The workflow, however, allows the package to be used in two "modes":
One as a pre/post-processor to set up simulations, build initial conditions, create grids (it's a fluid flow solver), manipulate data, etc. For this kind of usage, one typically writes scripts and runs them in serial on a local machine, and this mode takes full advantage of Python's amazing ecosystem, importing and using other packages like scipy, etc.
The package then has another mode where its functionality is used for HPC applications and runs on tens of thousands of processors with mpi4py. In this mode I need few external packages, and I want to keep imports to a minimum (this question comes from an issue I am having on a cluster where numpy cannot import its random module above a certain number of processes).
My question is: is there a Pythonic way to differentiate between these modes of use, so that the package is heavy, feature-rich, and all things great about Python in one mode, and "trims the fat" when used at run time?
If it spurs an idea: when I want the package to be light and not import unnecessary packages, it is always called from a single "run.py" script that basically bootstraps a simulation and executes it. Otherwise, the package is imported from custom scripts or from the interpreter, and I want the user to have access to all the available functionality.
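To illustrate, here is a minimal sketch of the kind of switch I have in mind (the MYPKG_LIGHT variable and the mypkg/core/gridgen/plotting names are made up):

# run.py: request "light" mode before the package is imported
import os
os.environ["MYPKG_LIGHT"] = "1"  # hypothetical flag name
import mypkg

# mypkg/__init__.py: skip the heavy ecosystem when the flag is set
import os

LIGHT = os.environ.get("MYPKG_LIGHT") == "1"

from . import core           # always needed: the solver itself
if not LIGHT:
    from . import gridgen    # pre/post-processing, pulls in scipy etc.
    from . import plotting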

Related

Add functions to the Python standard library?

Is there a way to add functions I create to the Python standard library on my local machine?
I come from the MATLAB world, where things aren't really efficient and fast but there are loads of functions at my fingertips without having to import their files. My problem is that if I make a function in Python and want to use it, then I will also need to remember the module it's in. My memory is shite. I understand that Python is structured that way for efficiency, but if I'm adding only a handful of functions that I consider very important to the standard library, I'd guess that the impact on performance is practically negligible.
Python has a namespace, exposed as __builtins__, in which you can stick stuff that you want available all the time. You probably shouldn't, but you can. Be careful not to clobber anything: Python won't stop you from reusing the name of a built-in function, and if you do, it'll probably break a lot of things.
# on Python 3, assign through the builtins module (the reliable route;
# __builtins__ itself may be a plain dict in imported modules)
import builtins

# define a function to always be available
def fart():
    print("poot!")

builtins.fart = fart

# make the re module always available without an import
import re
builtins.re = re
Now the question is how to get Python to run that code for you each time you start up the interpreter. The answer is usercustomize.py. Follow these instructions to find out where the correct directory is on your machine, then put a new file called usercustomize.py in that directory that defines all the stuff you want to have in __builtins__.
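You can also ask Python where the user site-packages directory (the place usercustomize.py is looked up) lives; a quick check from any interpreter:

import site

# directory where usercustomize.py will be picked up automatically
print(site.getusersitepackages())
print(site.ENABLE_USER_SITE)  # usercustomize only runs if this is True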
There's also an environment variable, PYTHONSTARTUP, that you can set to have a Python script run whenever you start the interpreter in interactive mode (i.e. to a command prompt). I can see the benefit of e.g. having your favorite modules available when exploring in the REPL. More details here.
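The startup file itself is plain Python. As a sketch, a hypothetical ~/.pythonrc.py (with PYTHONSTARTUP=~/.pythonrc.py set in your shell profile) might just pre-import your favorite modules:

# ~/.pythonrc.py -- runs only for interactive sessions (file name is an example)
import os
import re
import math

print("interactive session: os, re, math pre-imported")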
It sounds like you want to create your own packages and modules with tools you plan on reusing in future projects. If that is the case, you should look at the documentation on packaging your own project:
https://packaging.python.org/tutorials/packaging-projects/
You may also find this useful:
How to install a Python package system-wide on Linux?
How to make my Python module available system wide on Linux?
How can I create a simple system wide python library?

How to split a very large Python file?

I use and maintain a Python script that automates the compilation, execution, and performance analysis of some particular applications. The script was quite simple when I created it (it only provided the compilation option) but is now very large (2100 lines, not optimized, I admit), quite complex, and provides many, many different command-line options (managing the arguments with argparse is a nightmare, and I am not able to do exactly what I need).
To simplify this, I am planning to split it into several scripts:
compile.py
run.py
analyse.py
These three scripts will need access to shared functions, classes, and constants. Given this constraint, my question is: what is the Pythonic way to handle this?
You can put the shared code in a separate file and then import that file as a module in each script that needs it. To see how Python's module system works, see the modules documentation for Python 2.7 or the documentation on modules for Python 3.4, depending on which version of Python you are writing code in.
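As a minimal sketch of that layout (the common.py name and its contents are just an example), the shared pieces live in one module and each script imports it:

# common.py -- shared constants, helpers, and classes
DEFAULT_BUILD_DIR = "build"

def log(message):
    print("[tool] " + message)

class Config:
    def __init__(self, build_dir=DEFAULT_BUILD_DIR):
        self.build_dir = build_dir

# compile.py -- one of the three scripts; run.py and analyse.py do the same
import common

def main():
    cfg = common.Config()
    common.log("compiling into " + cfg.build_dir)

if __name__ == "__main__":
    main()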

Rewrite part of a C/C++ program in Python

I am trying to develop a plugin interface in Python for a popular 3D program called Blender 3D. The purpose of the plugin interface is to extend some functionality of the program through Python.
My problem now is that I am trying to assess the performance impact: I will replace existing functionality written in C with something written in Python.
I am worried that this might slow the application, because the functionality I am replacing is executed in real time and has to be really fast. It consists of a plain C function that just splits some polygons into triangles.
The operations I am executing work on pieces of data that usually have no more than 30 or 40 input points, and at most the operations have a complexity of O(n^2 log n).
But I will be creating plenty of Python objects each second, so I am already prepared to implement pooling to recycle the objects.
I am mostly worried that the Python code will run 100 times slower than the C code and slow down the application. Should I be worried?
At most I will be doing 8500 computations in a single Python function call, and this function will be called each time the application interface is rendered.
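A cheap way to sanity-check a worry like this is to time a comparable pure-Python workload with timeit; a minimal sketch (fan_triangulate is a stand-in for the real splitting code, not Blender's function):

import timeit

def fan_triangulate(n):
    # stand-in workload: fan-triangulate a convex polygon with n vertices
    return [(0, i, i + 1) for i in range(1, n - 1)]

# roughly one interface redraw's worth: 8500 calls on a 40-point polygon
seconds = timeit.timeit(lambda: fan_triangulate(40), number=8500)
print("8500 calls took %.4f s" % seconds)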
The choice between C and Python will depend on how your work is used. Is this a function that the Blender developers will accept into Blender development? Do you expect many Blender users to want it? A Python addon lets you develop your work outside of the main Blender development and give many users access to it, while a patch to the C code that requires users to compile their own version will reduce the number of users.
You could also look at compiling your C code into a binary library that is included with the Python addon and loaded as a Python module. See two addons by Pyroevil created using Cython, molecular and cubesurfer; some pre-built binaries are available on his main website. I'm not sure whether using Cython makes the Python module creation easier or not; you could also use Cython only as glue between Python and your library.
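If you skip Cython, the standard library's ctypes module can load a plain shared library directly; a minimal sketch (the triangulate.so name and its split_polygon signature are hypothetical):

import ctypes

# load a hypothetical shared library sitting next to the addon
lib = ctypes.CDLL("./triangulate.so")

# declare the assumed C signature: int split_polygon(double *pts, int n)
lib.split_polygon.argtypes = [ctypes.POINTER(ctypes.c_double), ctypes.c_int]
lib.split_polygon.restype = ctypes.c_int

points = (ctypes.c_double * 8)(0, 0, 1, 0, 1, 1, 0, 1)  # four xy points
ntris = lib.split_polygon(points, 4)
print("produced", ntris, "triangles")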

Deploying Python modules for 3D software

I have been developing a fairly extensive library of Python modules that automate the more time-consuming parts of "3D character development" for games/film/TV.
Up until a few months ago, all of my code was run within Maya's dedicated Python interpreter; however, my GUIs are built in PySide/PyQt, and so run just fine on Mac/Windows/Linux or in a few other graphics programs such as Nuke, XSI, and Max.
What I would really like to figure out is a "simple" way to distribute my code to various different people, using various different operating systems, potentially using various applications (Nuke, XSI, Max), which, in turn, have their own dedicated Python interpreters.
The obvious options would be pip and easy_install. These tools are clearly the "right" way to go, but it's not really clear how a user would install/run them under the dedicated Python installs that ship with Maya/Nuke/etc., though it does seem possible (as explained here). Still, it's going to be a pretty big barrier for a less technical user.
Any help or points in the right direction would be immensely appreciated..
I would not say that pip/easy_install are the 'right' way for this problem. They are pretty good (not quite 'great') tools for motivated, technically inclined users -- but even in that context they have issues (such as unintended upgrades or deletions). Most importantly, they are opt-in methods: nobody can make you pip unless you want to. This means users can accidentally or deliberately get themselves into very different positions from each other, which makes support and maintenance a nightmare.
I've had very good luck in Maya distributing a zipped file containing a complete environment - all the modules etc. userSetup.py adds that zip to the path, and Python's native zipimport functionality handles the rest. This makes sure that there is only one file to maintain and distribute. It also fixes the common problem of leftover .pyc files creating havoc after .py files get moved or renamed. Since this is all standard Python, I'd assume this will work for any app-specific Python that uses a 2.6+ version of Python, though I've never tried it in Nuke or Max.
The main wrinkle will be modules with .pyd or other binary components; typically these don't work inside zip files. I include a bootstrap routine that unpacks those to a (disposable) location on the user's disk and adds that location to the path.
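A minimal sketch of the userSetup.py side of this (the share path and zip name are made up):

# userSetup.py -- Maya runs this at startup; put the zipped environment on the path
import os
import sys

TOOLS_ZIP = "//studio/share/pipeline/tools.zip"  # hypothetical network location

if os.path.exists(TOOLS_ZIP) and TOOLS_ZIP not in sys.path:
    # zipimport handles zip files on sys.path automatically
    sys.path.insert(0, TOOLS_ZIP)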
There's a detailed discussion of the method here and some background here

Using C in a shared multi-platform POSIX environment

I write tools that are used in a shared workspace. Since there are multiple OSes working in this space, we generally use Python and standardize the version installed across machines. However, if I wanted to write some things in C, I was wondering if I could have the application wrapped in a Python script that detects the operating system and fires off the correct version of the C application. Each platform has GCC available and uses the same shell.
One idea was to compile the C code to the user's local ~/bin, with a timestamp comparison against the C source so it is not compiled on each run but only when the code is updated. Another was to just compile it for each platform and have the wrapper script select the proper executable.
Is there an accepted/stable process for this? Are there any catches? Are there alternatives (assuming the absolute need to use native C code)?
Clarification: multiple OSes are involved that do not share an ABI, e.g. OS X, various Linuxes, BSD, etc. I need to be able to update the code in place in shared folders and have the new code working more or less instantaneously. Distributing binary or source packages is less than ideal.
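For concreteness, a minimal sketch of the wrapper idea from the question (the mytool name and the /shared/bin layout are hypothetical):

# wrapper.py -- pick the prebuilt binary for this platform and exec it
import os
import sys
import platform

# assumed layout: /shared/bin/mytool-Darwin, /shared/bin/mytool-Linux, ...
binary = "/shared/bin/mytool-" + platform.system()

if not os.path.exists(binary):
    sys.exit("no build of mytool for " + platform.system())

# replace this Python process with the tool, passing arguments through
os.execv(binary, [binary] + sys.argv[1:])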
Launching a Python interpreter instance just to select the right binary to run would be much heavier than you need. I'd distribute a shell .rc file which provides aliases.
In /shared/bin, you put the various binaries: /shared/bin/toolname-mac, /shared/bin/toolname-debian-x86, /shared/bin/toolname-netbsd-dreamcast, etc. Then, in the common shared shell .rc file, you put the logic to set the aliases according to platform, so that on OSX, it gets alias toolname=/shared/bin/toolname-mac, and so forth.
This won't work as well if you're adding new tools all the time, because the users will need to reload the aliases.
I wouldn't recommend distributing tools this way, though. Testing and qualifying new builds of the tools should be taking up enough time and effort that the extra time required to distribute the tools to the users is trivial. You seem to be optimizing to reduce the distribution time. Replacing tools that quickly in a live environment is all too likely to result in lengthy and confusing downtime if anything goes wrong in writing and building the tools--especially when subtle cross-platform issues creep in.
Also, you could use autoconf and distribute your application in source form only. :)
You know, you should look at static linking.
These days, we all have HUGE hard drives, and a few extra megabytes (for carrying around libc and what not) is really not that big a deal anymore.
You could also try running your applications in chroot() jails and distributing those.
Depending on your mix of OSes, you might be better off creating packages for each class of system.
Alternatively, if they all share the same ABI and hardware architecture, you could also compile static binaries.
