How to make my package importable without initializing the GPU - python

I'm writing a Python package that does GPU computing using the PyCUDA library. PyCUDA needs to initialize a GPU device (usually by importing pycuda.autoinit) before any of its submodules can be imported.
In my own modules I import whatever submodules and functions I need from PyCUDA, which means that my own modules are not importable without first initializing PyCUDA. That's mostly fine, because my package does nothing useful without a GPU present anyway. However, now I want to write documentation, and Sphinx autodoc needs to import my package to read the docstrings. It works fine if I put import pycuda.autoinit into docs/conf.py, but I would like the documentation to be buildable on machines that don't have an NVIDIA GPU, such as my own laptop or readthedocs.org.
What's the most elegant way to defer the import of my dependencies so that I can import my own submodules on machines that don't have all the dependencies installed?

The autodoc mechanism requires that all modules to be documented are importable. When this requirement is a problem, mocking (replacing parts of the system with mock objects) can be a solution.
Here is an article that explains how mock objects can be used when working with Sphinx: http://blog.rtwilson.com/how-to-make-your-sphinx-documentation-compile-with-readthedocs-when-youre-using-numpy-and-scipy/.
The gist of the article is that it should work if you add something like this to conf.py:
import sys
import mock  # See http://www.voidspace.org.uk/python/mock/

MOCK_MODULES = ['module1', 'module2', ...]
for mod_name in MOCK_MODULES:
    sys.modules[mod_name] = mock.Mock()
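For the PyCUDA case in the question, a conf.py along these lines should work. This is just a sketch; the exact list of pycuda submodules to mock depends on what your own modules import:

import sys
from unittest import mock  # use `import mock` on Python 2

# Mock out PyCUDA so the package imports without a GPU or driver present.
MOCK_MODULES = [
    'pycuda',
    'pycuda.autoinit',
    'pycuda.driver',
    'pycuda.compiler',
    'pycuda.gpuarray',
]
for mod_name in MOCK_MODULES:
    sys.modules[mod_name] = mock.Mock()

With these entries in sys.modules, statements like import pycuda.autoinit in your own modules succeed and autodoc can read the docstrings without ever touching a GPU.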

The usual method I've seen is to have a module-level function like foo.init() that sets up the GPU/display/whatever that you need at runtime but don't want automatically initialized on import.
You might also consider exposing initialization options here: what if I have 2 CUDA-capable GPUs, but only want to use one of them?
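A minimal sketch of that pattern, assuming a hypothetical package name mypkg and moving all PyCUDA imports inside the function:

# mypkg/__init__.py -- nothing GPU-related runs at import time
_context = None

def init(device_id=0):
    """Explicitly initialize the GPU; importing PyCUDA only here keeps
    mypkg importable on machines without PyCUDA or an NVIDIA GPU."""
    global _context
    if _context is None:
        import pycuda.driver as cuda  # deferred import
        cuda.init()
        _context = cuda.Device(device_id).make_context()
    return _context

Exposing device_id (or similar options) also covers the case above of picking one of several CUDA-capable GPUs.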

Related

How to build sphinx docs for micropython

How do I configure sphinx to document modules intended for a MicroPython interpreter?
The fundamental problem I'm facing is that sphinx gets the information it documents from the imported module. Therefore the module being documented must be importable by the Python interpreter that runs sphinx.
First Problem
I'm using a pyboard, so naturally
import pyb
cannot find module pyb...
So I added to conf.py
import sys
from unittest.mock import MagicMock
sys.modules['pyb'] = MagicMock()  # and many more
Second Problem
One of my MicroPython libraries is called cmd
Exception occurred:
File "/usr/lib/python3.5/pdb.py", line 135, in <module>
class Pdb(bdb.Bdb, cmd.Cmd):
AttributeError: module 'cmd' has no attribute 'Cmd'
So that makes sense... I changed the name of the module to ucmd, and that appears to be working... but it's suuuuuper dodgy.
Question
Is there a proper way to do this?
To sphinx document a module not designed for the platform running the sphinx-build command?
Phrased more practically: if I wanted to document a MicroPython module called collections, subprocess, or io (all of which are used by the sphinx library), is it possible to use sphinx to do so?
Or would I simply have to be content with naming them ucollections, usubprocess, and uio respectively?
Below is not a sphinx solution, but it does provide partial autocompletion in most modern editors.
To generate stubs for a (custom) MicroPython module you could use the MicroPython-Stubber.
For configuration of a custom module, see section 4.4.
Alternatively, in that same repo, various tests import the MicroPython-CPython stubs (sourced from micropython-lib and pycopy-lib) by inserting their path into CPython's sys.path.
This works very well for my testing purposes, allowing me to run and debug (hardware agnostic) MicroPython code with little or no alteration on CPython.
Perhaps it suits your documentation needs as well.
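A sketch of that sys.path trick, e.g. near the top of docs/conf.py or a test script; the stub directory below is a placeholder for wherever your generated or downloaded stubs live:

import os
import sys

# Put the MicroPython stub modules ahead of everything else so that
# `import pyb` and friends resolve to the stubs instead of failing.
STUB_DIR = os.path.abspath('./stubs')  # placeholder: point at your stub folder
sys.path.insert(0, STUB_DIR)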

How to load python modules statically (like scipy)?

Under normal circumstances, external python modules such as scipy and numpy are compiled into shared objects when being installed (the part written in C). When python calls import scipy, it will dynamically load these shared objects.
Now I am working on a platform which does not support any dynamic loading function. As a result, I have to link those modules statically with python.
My current approach is to compile all source code of scipy/numpy with python, and call the module initialization function when python initializes.
Py_InitializeEx() {
    ...
    // init scipy modules statically
    // below are the scipy module init functions
    init_comb();
    init_cython_special();
    ...
}
However, this brings me another problem. I found that many Python module initialization functions, especially those auto-generated by Cython, contain code to import their parent packages. For example, init_cython_special() calls import scipy, but when it is called, the scipy initialization is not yet complete.
My question is: is there an easy way I can link these modules statically? What are your suggestions for solving this problem?
Thanks.
PyImport_AppendInittab - this tells Python, ahead of time, about a module initialization function associated with a specific name. You'd identify all the compiled modules you need, link them statically, and then before Py_Initialize you add them to the inittab.
Nothing happens until the module is imported at runtime, at which point the correct initialization function is run.
If I understood you correctly, what you could do is add the path to the directory where the modules are located:
import sys
sys.path.insert(0,'/path/to/modules')
from module1 import *
from module2 import *
etc.

Project structure leads to redundant dot notation

I have created a Python package which builds on the structure indicated in Kenneth Reitz' "Repository Structure and Python" (1). The main package path is:
/projects-folder    (not site-packages)
    /package
        /package
            __init__.py
            Datasets.py
            Draw.py
            Gmaps.py
            ShapeSVG.py
            project.py
        __init__.py
        setup.py
With the current structure, I must use the following module import syntax:
import package.package.Datasets
I would prefer to type the following:
import package.Datasets
I am capable of typing the same word twice, of course, but it feels wrong in a deeper sense, i.e., I am structuring my package incorrectly or misunderstanding how Python interprets that structure.
The outer __init__.py is required for Python to detect this package at all, per the docs (2). But that sets up /package/ as the top level of the package and /package/package/ as a sub-package, forcing me into the unwieldy import syntax above.
To avoid this, it seems that my options are to:
Create a package in which the outer folder contains the top level of package modules.
Add the inner folder to my PYTHONPATH environment variable.
Yet both of these seem like suboptimal workarounds for something that shouldn't be an issue in the first place. What should I do?
You've misunderstood. You have two nested folders named package for some reason, but the source you cite never said to do that. The outer folder, with setup.py, is not supposed to be a package.
It sounds like you're running Python in projects-folder and trying to import your package from there. That's not what you should be doing. You have several options to get your package into the import system. (I'll refer to the folder with setup.py in it as setupfolder, to distinguish it from the inner folder):
Build your package with setup.py, for example python setup.py bdist_wheel --universal, and install the built package with pip.
Skip the build step and just run pip install path/to/setupfolder. Building the package produces an installer, which is useful if you want to distribute your package, but maybe you don't want to do that.
"Install" the package's source tree in development mode with pip install -e path/to/setupfolder, so the Python import system will locate the package's source tree when performing imports. This is handy because you don't have to rebuild and reinstall if you edit the source repository, although you'll still want to restart any running Python processes that are using the package.
Run Python directly from inside the setupfolder.
Any of these options will cause your package to be importable directly as package instead of package.package, as it should be.
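For reference, a sketch of what the layout looks like once the outer __init__.py is dropped, together with a minimal setup.py (folder name and metadata are placeholders):

# Layout:
#
#   setupfolder/
#       setup.py
#       package/
#           __init__.py
#           Datasets.py
#           Draw.py
#           ...
#
# setupfolder/setup.py:
from setuptools import setup, find_packages

setup(
    name='package',
    version='0.1',
    packages=find_packages(),
)

After pip install -e setupfolder, import package.Datasets works as desired.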
While I do not entirely agree with your package structure, you can make use of __all__ and possibly the one legitimate use for star imports I've seen so far. __init__.py can serve more purposes than just identifying your folder as a package or sub-package.
Using a Star Import
In package/package/__init__.py, add a variable __all__ that declares all the public elements you want to export:
__all__ = ['Datasets', 'Draw', 'Gmaps', 'ShapeSVG', 'project']
In package/__init__.py do from package.package import *. Now all the attributes that were available as package.package.x will also be available as package.x.
If you want to directly copy package.package.__all__ to package.__all__ (which is optional, but will allow you to do from package import * properly), you can do something like
from package.package import *
from package.package import __all__ as _all
__all__ = _all
del _all
Not Using a Star Import
You can accomplish the same thing without using package.package.__all__ at all. Just add __all__ directly to package/__init__.py and use from package.package import x-style imports:
from package.package import (
    Datasets, Draw, Gmaps, ShapeSVG, project
)
# As before, package.__all__ is optional
__all__ = ['Datasets', 'Gmaps', 'ShapeSVG', 'project']
I would still recommend having a package.package.__all__ variable, but it is optional for this particular purpose.
Pros and Cons
Both approaches are pretty legitimate and I have seen both used in major projects. The first approach reduces redundancy. You only define the public exports in one place: package.package.__all__. The star imports and package.__all__ reference that definition directly, leading to one place that you really have to maintain. On the other hand, there are times when you want to separate the "full" package.package.x API from what you expose via package.x to the casual user. In that case, go with the second option. The only downside here is that you have to be more careful to keep package.__all__ and the corresponding imports synchronized properly.
Note
A number of projects I've seen (numpy especially comes to mind) export the attributes of the individual modules to the top level using this technique. For example, if you had a function package.package.Datasets.get_data, it would be listed in package.package.Datasets.__all__, which would be imported into package.package.__init__, appended to package.package.__all__, and then be referenced by the top-level package and package.__all__.
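A rough sketch of that hoisting chain, using the hypothetical get_data function mentioned above:

# package/package/Datasets.py
__all__ = ['get_data']

def get_data():
    return []

# package/package/__init__.py
from .Datasets import *                  # brings get_data into the subpackage
from .Datasets import __all__ as _ds_all
__all__ = ['Datasets'] + list(_ds_all)   # package.package.get_data now works
del _ds_all

# package/__init__.py
from .package import *                   # package.get_data now works too
from .package import __all__ as __all__  # and package.__all__ matches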

Re-opening a package in Python

Maybe it's not possible (I'm more used to Ruby, where this sort of thing is fine). I'm writing a library that provides additional functionality to docker-py, which provides the docker package, so you just import docker and then you get access to docker.Client etc.
Because it seemed a logical naming scheme, I wanted users to pull in my project with import docker.mymodule, so I've created a directory called docker with an __init__.py, and mymodule.py inside it.
When I try to access docker.Client, Python can't see it, as if my docker package has hidden it:
import docker
import docker.mymodule
docker.Client() # AttributeError: 'module' object has no attribute 'Client'
Is this possible, or do all top-level package names have to differ between source trees?
This would only be possible if docker was set up as a namespace package (which it isn't).
See zope.schema, zope.interface, etc. for an example of a namespace package (zope is the namespace package here). Because zope is declared as a namespace package in setup.py, it means that zope doesn't refer to a particular module or directory on the file system, but is a namespace shared by several packages. This also means that the result of import zope is pretty much undefined - it will simply import the top-level module of the first zope.* package found in the import path.
Therefore, when dealing with namespace packages, you need to explicitly import a specific one with import zope.schema or from zope import schema.
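For illustration, a minimal sketch of a pkgutil-style namespace package; docker-py does not do this, so the layout below is hypothetical:

# Hypothetical layout if docker were a namespace package:
#
#   docker-py/docker/__init__.py      (upstream)
#   mylib/docker/__init__.py          (your add-on)
#   mylib/docker/mymodule.py
#
# Each docker/__init__.py would then contain only this declaration:
__path__ = __import__('pkgutil').extend_path(__path__, __name__)

With setuptools the same intent is declared via namespace_packages=['docker'] in each setup.py, and on Python 3.3+ you can get implicit namespace packages by omitting __init__.py entirely.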
Unfortunately, namespace packages aren't that well documented. As noted by @Bakuriu in the comments, these are some resources that contain helpful information:
Stackoverflow: How do I create a namespace package in Python?
Built-in support for namespace packages in Python 3.3
Namespace packages in the setuptools documentation
Post about namespace packages at sourceweaver.com

setup.py adding options (aka setup.py --enable-feature )

I'm looking for a way to include an optional feature in a Python (extension) module at installation time.
In practical terms:
I have a python library that has 2 implementations of the same function, one internal (slow) and one that depends on an external library (fast, in C).
I want this dependency to be optional, activated at compile/install time using a flag like:
python setup.py install # (it doesn't include the fast library)
python setup.py --enable-fast install
I have to use distutils; however, all solutions are welcome!
The docs for distutils include a section on extending the standard functionality. The relevant suggestion seems to be to subclass the relevant classes from the distutils.command.* modules (such as build_py or install) and tell setup to use your new versions (through the cmdclass argument, which is a dictionary mapping commands to classes which are to be used to execute them). See the source of any of the command classes (e.g. the install command) to get a good idea of what one has to do to add a new option.
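A rough, untested sketch of that approach, adding an --enable-fast option to the install command (package, extension and file names are placeholders; note that with a command-level option the flag goes after the command, i.e. python setup.py install --enable-fast):

from distutils.core import setup, Extension
from distutils.command.install import install as _install

# Placeholder C extension holding the fast implementation.
fast_ext = Extension('mypkg._fast', sources=['src/fast.c'])

class install(_install):
    user_options = _install.user_options + [
        ('enable-fast', None, 'also build the fast C implementation'),
    ]
    boolean_options = _install.boolean_options + ['enable-fast']

    def initialize_options(self):
        _install.initialize_options(self)
        self.enable_fast = False

    def run(self):
        if self.enable_fast:
            # Only register the C extension when the flag was given.
            self.distribution.ext_modules = [fast_ext]
        _install.run(self)

setup(
    name='mypkg',
    version='0.1',
    packages=['mypkg'],
    cmdclass={'install': install},
)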
An example of exactly what you want is SQLAlchemy's cextensions, which exist specifically for the same purpose - a faster C implementation. In order to see how SA implemented it you need to look at 2 files:
1) setup.py. As you can see from the extract below, they handle both the setuptools and distutils cases:
try:
    from setuptools import setup, Extension, Feature
except ImportError:
    from distutils.core import setup, Extension
    Feature = None
Later there is an if Feature: check, and the extension is configured appropriately for each case using the variable extra, which is later passed to the setup() call.
2) base.py: here look at how BaseRowProxy is defined:
try:
    from sqlalchemy.cresultproxy import BaseRowProxy
except ImportError:
    class BaseRowProxy(object):
        #....
So basically, once the C extensions are installed (using the --with-cextensions flag during setup), the C implementation will be used. Otherwise, the pure Python implementation of the class/function is used.
