How to split a very large python file?

I use and maintain a Python script that automates the compilation, execution and performance analysis of some particular applications. The script was quite simple when I created it (it only provided the compilation option), but it is now very large (2100 lines, not optimized, I agree), quite complex, and provides many different command-line options (managing the arguments with argparse is a nightmare, and I am not able to do exactly what I need).
To simplify this, I am planning to split it into several scripts:
compile.py
run.py
analyse.py
These three scripts will need access to shared functions, classes and constants. Given this constraint, what is the Pythonic way to handle this?

You can put the shared code in separate files and then import the file as a module in each script which needs it. To see how the module system in Python works, see the modules documentation for Python 2.7 or the documentation on modules for Python 3.4, depending on which version of Python you are writing code in.
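As a rough sketch of that idea: the shared code lives in one module and each script imports it. The module name buildtools_common, its constant and its function are hypothetical, not taken from the question. The module is written to a scratch directory only so that the example runs anywhere; in a real project it would simply sit next to compile.py, run.py and analyse.py.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Contents of a hypothetical shared module, buildtools_common.py.
COMMON_SOURCE = '''
DEFAULT_BUILD_DIR = "build"

def binary_path(app_name):
    """Return where the compiled application is expected to live."""
    return DEFAULT_BUILD_DIR + "/" + app_name
'''

# Write the module into a scratch directory and make that directory importable.
project_dir = Path(tempfile.mkdtemp())
(project_dir / "buildtools_common.py").write_text(COMMON_SOURCE)
sys.path.insert(0, str(project_dir))
importlib.invalidate_caches()

# Each of the three scripts would contain the equivalent of:
#     import buildtools_common as common
common = importlib.import_module("buildtools_common")

print(common.binary_path("solver"))
```

When the scripts and the shared module are in the same directory, none of the sys.path handling is needed, because the directory of the script being run is already on the import path.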

Related

add functions to Python standard library?

Is there a way to add functions I create to the Python standard library on my local machine?
I come from the MATLAB world, where things aren't really efficient and fast, but there are loads of functions at my fingertips without having to import their files. My problem is that if I write a function in Python and want to use it, I also need to remember which module it's in. My memory is shite. I understand that Python is structured that way for efficiency, but if I'm adding only a handful of functions that I consider very important to the standard library, I'd guess that the impact on performance is practically negligible.

Python has a namespace called __builtins__ in which you can stick stuff that you want available all the time. You probably shouldn't, but you can. Be careful not to clobber anything. Python won't stop you from using the same name as a built-in function, and if you do that, it'll probably break a lot of things.
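# define a function that is always available
# (use the builtins module: in an imported file such as usercustomize.py,
# __builtins__ is a plain dict, so attribute assignment on it would fail)
import builtins
def fart():
    print("poot!")
builtins.fart = fart
# make the re module always available without an import
import re
builtins.re = re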
Now the question is how to get Python to run that code for you each time you start up the interpreter. The answer is usercustomize.py. Follow these instructions to find out where the correct directory is on your machine, then put a new file called usercustomize.py in that directory that defines all the stuff you want to have in __builtins__.
There's also an environment variable, PYTHONSTARTUP, that you can set to have a Python script run whenever you start the interpreter in interactive mode (i.e. to a command prompt). I can see the benefit of e.g. having your favorite modules available when exploring in the REPL. More details here.
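One way to find the directory where usercustomize.py is picked up is to ask the standard site module. This is only a sketch: it prints the per-user site-packages directory for the interpreter that runs it, and whether that directory is enabled at all.

```python
import site

# Per-user site-packages directory; a usercustomize.py placed here is
# imported automatically at interpreter startup.
user_site = site.getusersitepackages()
print("user site directory:", user_site)

# True when the user site directory is enabled. It can be turned off,
# for example by the -s flag or the PYTHONNOUSERSITE environment variable.
print("user site enabled:", site.ENABLE_USER_SITE)
```

The directory may not exist yet on a fresh installation, in which case it has to be created before the file is added.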

It sounds like you want to create your own packages and modules with tools you plan to use in the future on other projects. If that is the case, you should look into the documentation on packaging your own project:
https://packaging.python.org/tutorials/packaging-projects/
You may also find this useful:
How to install a Python package system-wide on Linux?
How to make my Python module available system wide on Linux?
How can I create a simple system wide python library?

Can python import the SPSS and SPSSAux libraries and use them to any value outside of the spss context?

I'm helping my wife navigate IBM SPSS and Python. She knows SPSS, and I kinda know Python, so we might be able to work together. As it stands, I understand that I can call small snippets of Python from within an SPSS syntax. While this is useful for looping and conditional branching based on data, it seems a little fuzzy to me. It almost feels like Inversion of Control, but not really.
I was wondering whether it is possible to have a Python script, external to an SPSS syntax, that can still use the SPSS libraries in any meaningful way, or do I have to keep my scripts confined to the SPSS syntax and runtime?

Yes, you can run Statistics in external mode from a Python or R program. You might have to add the SPSS Python directory to your Python search path, but then just do
import spss
and run your Python code. The only thing you can't do is Viewer and user interface stuff, because there is no SPSS UI in that mode. By default, you will get output as text (which you can turn off when you get the hang of things). If you want better quality output, you can use OMS to capture output in a wide variety of formats.
Note that you need a compatible version of Python if you don't use the one installed with SPSS. That would be 2.7 for most Statistics versions. The Python installed with Statistics is not registered, but you can install a standard version from Python.org and just add the SPSS Python directory to the search path.
HTH
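A minimal sketch of the search-path step described above. The directory shown is a placeholder, not a real SPSS location, and has to be replaced with the Python directory of your own Statistics installation. The import is wrapped in try/except so the snippet also runs on a machine without SPSS.

```python
import sys

# Hypothetical location: replace with the Python directory of your SPSS installation.
SPSS_PYTHON_DIR = "/path/to/spss/Python/Lib/site-packages"

# Make the SPSS Python directory visible to the import system.
if SPSS_PYTHON_DIR not in sys.path:
    sys.path.append(SPSS_PYTHON_DIR)

try:
    import spss
except ImportError:
    spss = None
    print("spss module not found; check SPSS_PYTHON_DIR")
else:
    print("spss module loaded")
```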

How to embed Python in a multi platform C++ framework (JUCE)?

I'm designing musical training games using JUCE -- a multiplatform C++ framework that allows me to code audio/visuals close to the wire.
However, I have coded my gameplay (control flow / data-processing) in Python -- it is complex and I wish to keep changing it so I can experiment with different gameplays. Python is ideal for this kind of rapid prototyping work.
So I would like my (platform independent, so Win/OSX/Lin/iOS/And) C++ to start up a Python runtime, feed it a .py file, and then call various functions in that .py. Also I would like to be able to call back to the C++ code from the .py.
Here is the relevant official Python documentation: https://docs.python.org/2/extending/extending.html
And here is a CodeProject article: http://www.codeproject.com/Articles/11805/Embedding-Python-in-C-C-Part-I
However, neither of them seems to address multiplatform support.
The technique seems to be to link with the library libpython.a and #include <Python.h>, which declares the various functions for starting up the runtime environment, loading scripts, executing Python code, etc.
But surely this libpython.a would need to be compiled separately per platform? If so, this wouldn't be a very clean solution, so could I instead add the Python source code to my project and get it to compile the .a?
How can I go about doing this?
EDIT: https://wiki.python.org/moin/boost.python/EmbeddingPython
EDIT2: I'm pretty sure trying to bring in the full CPython source code is overkill here -- someone must have made some stripped down Python implementation in C/C++ that doesn't support any system-calls/multithreading/fancy-stuff -- just works through Python syntax line by line. Looking thru https://wiki.python.org/moin/PythonImplementations but I can't see an obvious candidate.
EDIT3: https://github.com/micropython/micropython should be added to that last page, but still it doesn't look like it is what I'm after

There's an entire chapter of the Python docs that explains the different approaches you can take to embed a Python interpreter in another app.
Embedding Python is similar to extending it, but not quite. The
difference is that when you extend Python, the main program of the
application is still the Python interpreter, while if you embed
Python, the main program may have nothing to do with Python — instead,
some parts of the application occasionally call the Python interpreter
to run some Python code.
So if you are embedding Python, you are providing your own main
program. One of the things this main program has to do is initialize
the Python interpreter. At the very least, you have to call the
function Py_Initialize(). There are optional calls to pass command
line arguments to Python. Then later you can call the interpreter from
any part of the application.
There are several different ways to call the interpreter: you can pass
a string containing Python statements to PyRun_SimpleString(), or you
can pass a stdio file pointer and a file name (for identification in
error messages only) to PyRun_SimpleFile(). You can also call the
lower-level operations described in the previous chapters to construct
and use Python objects.
A simple demo of embedding Python can be found in the directory
Demo/embed/ of the source distribution.

I recently decided to create a project that mixes C++ with Python, thus getting the best of both worlds. My idea was to do rapid prototyping of classes and functions in Python for obvious reasons, but still being able to call C++ code within Python (for obvious reasons as well). So instead of embedding Python in the C++ framework, I suggest you do the opposite: embed your C++ framework into a Python project. In order to do so, you just have to write very simple interface files and let Swig take care of the interfacing part.
If you want to start from scratch, there's a nice tool called cookiecutter that can be used to generate project templates. You can choose either cookiecutter-pypackage or cookiecutter-pylibrary, the latter improving over the former as described here. Interestingly, you can also use the cookiecutter code to generate the structure of a C++ project. This empty project uses the CMake build system, which IMHO is the best framework for developing platform-independent C++ code. I then had to decide on the directory structure for this mixed project, so one of my previous posts describes this in detail. Good luck!

I'm using SWIG to embed Python into my C++ application, and to extend it as well, i.e. to access my C++ API in Python outside my application. SWIG and Python are multi-platform, so that is not really an issue. One of the main advantages of SWIG is that it can generate bindings for a lot of languages. There are also a lot of C++ code wrappers that could be used, for example boost.python or cython.
Check these links on SO:
Extending python - to swig, not to swig or Cython
Exposing a C++ API to Python
Or you can go the hard way and use plain Python/C API.

How can I profile a Python script consisting of multiple modules and classes?

I want to profile my Python code. My code includes several modules, where each module is a class with several functions. I am using Eclipse PyDev as my IDE. I have read a few Q&As about timeit and cProfile, but using these profilers is a bit hard when you have classes that make use of other classes and there are chained calls through them.
I was just wondering if there is a profiler like the Java profilers which can show me where I need to optimize my code (I was thinking of multithreading some parts of my code, but I want to make sure which parts need it).

pycallgraph is a beautiful tool for profiling python code http://pycallgraph.slowchop.com/en/master/
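For comparison, the standard library's cProfile follows calls across classes and modules on its own, so chained calls do not need special handling. The sketch below uses two made-up classes (Parser and Pipeline are hypothetical stand-ins for the asker's modules) and prints the entries with the highest cumulative time.

```python
import cProfile
import io
import pstats


class Parser:
    """Stand-in for a class living in one module."""

    def parse(self, n):
        return [i * i for i in range(n)]


class Pipeline:
    """Stand-in for a class in another module that calls into Parser."""

    def __init__(self):
        self.parser = Parser()

    def run(self):
        total = 0
        for _ in range(50):
            total += sum(self.parser.parse(1000))
        return total


# Profile only the region of interest.
profiler = cProfile.Profile()
profiler.enable()
result = Pipeline().run()
profiler.disable()

# Sort by cumulative time so callers that spend time in callees rank first.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

A whole script can be profiled the same way without editing it, by running python -m cProfile -s cumulative followed by the script name.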

bash extension modules on python

I know that it is possible to write bash extension modules (loadable builtins) in C or Lua (see luabash), but is it possible in Python/Cython? Are there any projects that take steps in this direction?

The way you would probably do this is start out with a C library which includes the appropriate exported functions, then within the exported function load and run the python interpreter, run your python code, then tear down the python interpreter.
You can see how to load the python interpreter into a C program/library here:
http://docs.python.org/extending/embedding.html
http://docs.python.org/extending/extending.html#calling-python-functions-from-c
http://www.linuxjournal.com/article/8497
If you do this a lot, then it may be simpler to write a single generic handler that you can use with multiple different Python scripts.

I used the Bash examples and the linked resources @tylerl mentioned to make bashpy. It's a proof of concept and currently lacks support for both passing variables and calling functions. So it's not very useful yet, but maybe it can help someone who ends up here.
