Unable to use pip modules with PyO3

Recently I have been working on a project that involves generating docx files. Since Rust's docx support is still quite immature, I've decided to use Python's python-docx module via PyO3.
Here's my code so far:
extern crate pyo3;
use pyo3::prelude::*;
(...)
// Initialize some Python
let gil = Python::acquire_gil();
let py = gil.python();
let docx = PyModule::import(py, "docx")?;
let document = docx.Document();
Unfortunately, I'm running into two pretty serious errors.
Error #1:
let docx = PyModule::import(py, "docx")?;
^ cannot use the `?` operator in a function that returns `std::string::String`
Error #2:
let document = docx.Document();
^^^^^^^^ method not found in `&pyo3::prelude::PyModule`
How do I solve these errors?
N.B. Yes, I have made sure that python-docx is installed. It's located in /home/<my username>/.local/lib/python3.8/site-packages

Related

Is there a built-in way to use inline C code in Python?

Even if numba, cython (and especially cython.inline) exist, in some cases, it would be interesting to have inline C code in Python.
Is there a built-in way (in Python standard library) to have inline C code?
PS: scipy.weave used to provide this, but it's Python 2 only.
Directly in the Python standard library, probably not. But it's possible to have something very close to inline C in Python with the cffi module (pip install cffi).
Here is an example, inspired by this article and this question, showing how to implement a factorial function in Python + "inline" C:
from cffi import FFI
ffi = FFI()
ffi.set_source("_test", """
long factorial(int n) {
    long r = n;
    while(n > 1) {
        n -= 1;
        r *= n;
    }
    return r;
}
""")
ffi.cdef("""long factorial(int);""")
ffi.compile()
from _test import lib # import the compiled library
print(lib.factorial(10)) # 3628800
Notes:
ffi.set_source(...) defines the actual C source code
ffi.cdef(...) is the equivalent of the .h header file
you can of course add some cleanup code afterwards if you don't need the compiled library at the end (however, cython.inline does the same and the compiled .pyd files are not cleaned by default, see here)
this quick inline use is particularly useful during a prototyping / development phase. Once everything is ready, you can separate the build (which you only do once) from the rest of the code, which imports the pre-compiled library (see the sketch below)
It seems too good to be true, but it seems to work!
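As a minimal sketch of that split (the _fact module name is only a placeholder), the build step can live in its own script, run once:
# build_fact.py -- compile the extension once
from cffi import FFI

ffi = FFI()
ffi.set_source("_fact", """
long factorial(int n) {
    long r = n;
    while(n > 1) {
        n -= 1;
        r *= n;
    }
    return r;
}
""")
ffi.cdef("long factorial(int);")

if __name__ == "__main__":
    ffi.compile()
The rest of the code then only imports the pre-compiled library:
from _fact import lib

print(lib.factorial(10))  # 3628800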

load_resource function not found as a class method of FPDF

I am looking at the answer to the following question: Insert Base64 image to pdf using pyfpdf
The answer suggested here was to override the existing load_resource method.
What I did instead was
import base64
from io import BytesIO

from fpdf import FPDF

class EnhancedPdf(FPDF):
    def load_resource(self, reason, filename):
        if reason == "image":
            if filename.startswith("data"):
                f = filename.split("base64,")[1]
                f = base64.b64decode(f)
                f = BytesIO(f)
                return f
            else:
                return super().load_resource(reason, filename)
However, PyCharm highlights the super call with the message "Unresolved attribute reference 'load_resource' for class 'FPDF'".
In my command line, I ran the commands
from fpdf import FPDF
dir(FPDF)
Inspecting this list, I see that load_resource is indeed not a listed method. Hence my question: why is the load_resource function not visible?
Most probably you are using Python 3.x where x >= 5.
On PyPI it says that the module has only experimental support for Python 3.y where y <= 4.
Try it with Python 2.7 and it might work.
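A quick runtime check (assuming nothing beyond the installed fpdf package) shows whether the method exists at all before you rely on overriding it:
import fpdf

# Releases that predate the method will print False here
print(getattr(fpdf, "__version__", "unknown version"))
print(hasattr(fpdf.FPDF, "load_resource"))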
PS: Better try https://pypi.org/project/fpdf2/, the updated version. For bugs or issues see https://github.com/alexanderankin/pyfpdf .
If you really want to use the old version, you can install whatever version you want from the original repo like this:
pip install git+https://github.com/reingart/pyfpdf@<branch name, tag or commit>

Simple pybind11 module fails with No module named

I've created a python binding for one of my projects a while back and just now wanted to pick it up again.
The binding was no longer working, as Python was no longer able to import it - this had all worked fine back then.
I've then decided to break it down to the simplest possible example:
binding.cpp
#include <pybind11/pybind11.h>

int add(int i, int j) {
    return i + j;
}

PYBIND11_MODULE(TestBinding, m) {
    m.doc() = "pybind11 example plugin"; // optional module docstring
    m.def("add", &add, "A function which adds two numbers");
}
CMakeLists.txt:
cmake_minimum_required( VERSION 3.2 )
project(TestBinding)
add_subdirectory(pybind11) # or find_package(pybind11)
pybind11_add_module(TestBinding binding.cpp)
# Configure project to inject source path as include directory on dependent projects
target_include_directories( TestBinding
    INTERFACE
        $<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}>
        $<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}/pybind11/include/> )
set_target_properties( TestBinding
    PROPERTIES
        CXX_STANDARD 17
        CXX_STANDARD_REQUIRED ON
        PREFIX ""
        SUFFIX ".so"
)
Then I have a very simple test.py file which goes like this:
import sys

sys.path.insert(0, "/path/to/so/lib/")
from TestBinding import *
...which once executed always gives me the following error:
from TestBinding import *
ModuleNotFoundError: No module named 'TestBinding'
I have literally no idea what could have changed between when it worked just fine and now.
Here is some more information about my working environment:
Windows 10
Visual Studio 15 2017 Win64
Python 3.7 (also tried 3.5 and 3.6)
Am I missing anything really obvious?
I've been able to resolve this by removing the SUFFIX ".so" rule from my CMakeLists.txt.
This was needed back when I initially created my bindings, but apparently it no longer is.
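A quick way to check which filename endings your interpreter will actually import (nothing project-specific assumed):
import importlib.machinery

# On Windows this typically lists '.pyd' suffixes rather than '.so',
# so a file named TestBinding.so is never considered a module
print(importlib.machinery.EXTENSION_SUFFIXES)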
I had the same problem. After checking, I found that it was caused by a mismatch between the Python version pybind11 was built against and the Python version of the local environment. My problem was solved when I made both use the same Python version.

sklearn internals access cython classes and functions

I am interested in testing out many of the internal classes and functions defined within sklearn (e.g. maybe adding a print statement to the tree builder so I can see how the tree got built). However, as many of the internals were written in Cython, I want to learn the best practices and workflows for testing out these functions in a Jupyter notebook.
For example, I managed to import the Stack class from the tree._utils module. I was even able to construct it but unable to call any of the methods. Any thoughts on what I should do in order to call and test the cdef classes and its methods in Python?
%%cython
from sklearn.tree import _utils
s = _utils.Stack(10)
print(s.top())
# AttributeError: 'sklearn.tree._utils.Stack' object has no attribute 'top'
There are some problems which must be solved in order to be able to use c-interfaces of the internal classes.
First problem (skip if your sklearn version is >=0.21.x):
Until version 0.21.x, sklearn used implicit relative imports (as in Python 2), so compiling it with Cython's language_level=3 (the default in IPython3) would not work. Setting language_level=2 is thus needed for versions < 0.21.x (i.e. %cython -2), or better yet, scikit-learn should be updated.
Second problem:
We need to include path to numpy-headers. Let's take a look at a simpler version:
%%cython
from sklearn.tree._tree cimport Node
print("loaded")
which fails with the unhelpful error "command 'gcc' failed with exit status 1" - the real reason can be seen in the terminal, where gcc writes its error message (and not to the notebook):
fatal error: numpy/arrayobject.h: No such file or directory
compilation terminated.
_tree.pxd uses numpy-API and thus we need to provide the location of numpy-headers.
That means we need to add include_dirs=[numpy.get_include()] to the Extension definition. There are two ways to do it with the %%cython magic. The first is via the -I option:
%%cython -I <path from numpy.get_include()>
...
The second is a somewhat dirtier trick that exploits the fact that the %%cython magic adds the include automatically when it sees the string "numpy": adding a comment like
%%cython
# requires numpy headers
...
is enough.
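For the first variant, the include path can be obtained directly from numpy:
import numpy

# Paste this path into the -I option of the %%cython magic
print(numpy.get_include())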
Last but not least:
Note: since 0.22 this is no longer an issue, as pxd-files are included in the installation (see this).
The pxd-files must be present in the installation for us to be able to cimport them. This is the case for the pxd-files from the sklearn.tree subpackage, as one can see in the local setup.py-file (given this PR, this seems to be more or less a random decision without a strategy behind it):
...
config.add_data_files("_criterion.pxd")
config.add_data_files("_splitter.pxd")
config.add_data_files("_tree.pxd")
config.add_data_files("_utils.pxd")
...
but not for some other cython-extensions, in particular not for sklearn.neighbors-subpackage. Now, that is a problem for your example:
%%cython
# requires numpy headers
from sklearn.tree._utils cimport Stack
s = Stack(10)
print(s.top())
fails to be cythonized, because _utils.pxd cimports data structures from the neighbors/*.pxd files:
...
from sklearn.neighbors.quad_tree cimport Cell
...
which are not present in the installation.
The situation is described in more detail in this SO-post; your options to build are (as described in the link):
copy the pxd-files into the installation (sketched below)
reinstall from the downloaded source with pip install -e
reinstall from the downloaded source after manipulating the corresponding local setup.py-files
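A minimal sketch of the first option, assuming a local source checkout (both paths are placeholders to adjust):
import pathlib
import shutil

import sklearn

site_pkg = pathlib.Path(sklearn.__path__[0])         # the installed package
src = pathlib.Path("/path/to/scikit-learn/sklearn")  # hypothetical source checkout

# Copy the missing pxd-files (here: those of sklearn.neighbors) into the installation
for pxd in (src / "neighbors").glob("*.pxd"):
    shutil.copy(str(pxd), str(site_pkg / "neighbors"))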
Another option is to ask the developers of sklearn to include pxd-files into the installation, so not only building but also distribution becomes possible.

How to properly write cross-references to external documentation with intersphinx?

I'm trying to add cross-references to external API into my documentation but I'm facing three different behaviors.
I am using Sphinx (1.3.1) with Python (2.7.3), and my intersphinx mapping is configured as:
{
    'python': ('https://docs.python.org/2.7', None),
    'numpy': ('http://docs.scipy.org/doc/numpy/', None),
    'cv2': ('http://docs.opencv.org/2.4/', None),
    'h5py': ('http://docs.h5py.org/en/latest/', None)
}
I have no trouble writing a cross-reference to numpy API with :class:`numpy.ndarray` or :func:`numpy.array` which gives me, as expected, something like numpy.ndarray.
However, with h5py, the only way I can have a link generated is if I omit the module name. For example, :class:`Group` (or :class:`h5py:Group`) gives me Group but :class:`h5py.Group` fails to generate a link.
Finally, I cannot find a way to write a working cross-reference to OpenCV API, none of these seems to work:
:func:`cv2.convertScaleAbs`
:func:`cv2:cv2.convertScaleAbs`
:func:`cv2:convertScaleAbs`
:func:`convertScaleAbs`
How to properly write cross-references to external API, or configure intersphinx, to have a generated link as in the numpy case?
In addition to the detailed answer from #gall, I've discovered that intersphinx can also be run as a module:
python -m sphinx.ext.intersphinx 'http://python-eve.org/objects.inv'
This outputs nicely formatted info. For reference: https://github.com/sphinx-doc/sphinx/blob/master/sphinx/ext/intersphinx.py#L390
I took another shot at understanding the content of an objects.inv file, and this time I inspected numpy's and h5py's instead of only OpenCV's.
How to read an intersphinx inventory file
Despite the fact that I couldn't find anything useful about reading the content of an objects.inv file, it is actually very simple with the intersphinx module.
from sphinx.ext import intersphinx
import warnings

def fetch_inventory(uri):
    """Read a Sphinx inventory file into a dictionary."""
    class MockConfig(object):
        intersphinx_timeout = None  # type: int
        tls_verify = False

    class MockApp(object):
        srcdir = ''
        config = MockConfig()

        def warn(self, msg):
            warnings.warn(msg)

    return intersphinx.fetch_inventory(MockApp(), '', uri)
uri = 'http://docs.python.org/2.7/objects.inv'
# Read inventory into a dictionary
inv = fetch_inventory(uri)
# Or just print it
intersphinx.debug(['', uri])
File structure (numpy)
After inspecting numpy's, you can see that the keys are domains:
[u'np-c:function',
u'std:label',
u'c:member',
u'np:classmethod',
u'np:data',
u'py:class',
u'np-c:member',
u'c:var',
u'np:class',
u'np:function',
u'py:module',
u'np-c:macro',
u'np:exception',
u'py:method',
u'np:method',
u'np-c:var',
u'py:exception',
u'np:staticmethod',
u'py:staticmethod',
u'c:type',
u'np-c:type',
u'c:macro',
u'c:function',
u'np:module',
u'py:data',
u'np:attribute',
u'std:term',
u'py:function',
u'py:classmethod',
u'py:attribute']
You can see how to write your cross-reference by looking at the content of a specific domain. For example, py:class:
{u'numpy.DataSource': (u'NumPy',
                       u'1.9',
                       u'http://docs.scipy.org/doc/numpy/reference/generated/numpy.DataSource.html#numpy.DataSource',
                       u'-'),
 u'numpy.MachAr': (u'NumPy',
                   u'1.9',
                   u'http://docs.scipy.org/doc/numpy/reference/generated/numpy.MachAr.html#numpy.MachAr',
                   u'-'),
 u'numpy.broadcast': (u'NumPy',
                      u'1.9',
                      u'http://docs.scipy.org/doc/numpy/reference/generated/numpy.broadcast.html#numpy.broadcast',
                      u'-'),
 ...}
So here, :class:`numpy.DataSource` will work as expected.
h5py
In the case of h5py, the domains are:
[u'py:attribute', u'std:label', u'py:method', u'py:function', u'py:class']
and if you look at the py:class domain:
{u'AttributeManager': (u'h5py',
                       u'2.5',
                       u'http://docs.h5py.org/en/latest/high/attr.html#AttributeManager',
                       u'-'),
 u'Dataset': (u'h5py',
              u'2.5',
              u'http://docs.h5py.org/en/latest/high/dataset.html#Dataset',
              u'-'),
 u'ExternalLink': (u'h5py',
                   u'2.5',
                   u'http://docs.h5py.org/en/latest/high/group.html#ExternalLink',
                   u'-'),
 ...}
That's why I couldn't make them work like the numpy references. So a good way to format them is :class:`h5py:Dataset`.
OpenCV
OpenCV's inventory object seems malformed. Where I would expect to find domains, there are instead 902 function signatures:
[u':',
u'AdjusterAdapter::create(const',
u'AdjusterAdapter::good()',
u'AdjusterAdapter::tooFew(int',
u'AdjusterAdapter::tooMany(int',
u'Algorithm::create(const',
u'Algorithm::getList(vector<string>&',
u'Algorithm::name()',
u'Algorithm::read(const',
u'Algorithm::set(const'
...]
and if we take the first one's value:
{u'Ptr<AdjusterAdapter>': (u'OpenCV',
                           u'2.4',
                           u'http://docs.opencv.org/2.4/detectorType)',
                           u'ocv:function 1 modules/features2d/doc/common_interfaces_of_feature_detectors.html#$ -')}
I'm pretty sure it is then impossible to write OpenCV cross-references with this file...
Conclusion
I thought intersphinx generated the objects.inv based on the content of the documentation project in a standard way, which seems not to be the case.
As a result, the proper way to write cross-references is API dependent, and one should inspect a specific inventory object to actually see what's available.
An additional way to inspect the objects.inv file is with the sphobjinv module.
You can search local or even remote inventory files (with fuzzy matching). For instance with scipy:
$ sphobjinv suggest -t 90 -u https://docs.scipy.org/doc/scipy/reference/objects.inv "signal.convolve2d"
Remote inventory found.
:py:function:`scipy.signal.convolve2d`
:std:doc:`generated/scipy.signal.convolve2d`
Note that you may need to use :py:func: and not :py:function: (I'd be happy to know why).
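sphobjinv also exposes a Python API, if you prefer inspecting inventories from a script; a small sketch (the attribute names follow sphobjinv's data model):
import sphobjinv

# Load a remote inventory and print a few entries as domain:role -- name
inv = sphobjinv.Inventory(url="https://docs.scipy.org/doc/scipy/reference/objects.inv")
for obj in inv.objects[:5]:
    print("{}:{} -- {}".format(obj.domain, obj.role, obj.name))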
How to use OpenCV 2.4 (cv2) intersphinx
Inspired by #Gall's answer, I wanted to compare the contents of the OpenCV & numpy inventory files. I couldn't get sphinx.ext.intersphinx.fetch_inventory to work from ipython, but the following does work:
curl http://docs.opencv.org/2.4/objects.inv | tail -n +5 | zlib-flate -uncompress > cv2.inv
curl https://docs.scipy.org/doc/numpy/objects.inv | tail -n +5 | zlib-flate -uncompress > numpy.inv
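If zlib-flate is not available, the same unpacking can be done in a few lines of Python (the split skips the same 4 header lines as tail -n +5):
import zlib

import requests

r = requests.get("http://docs.opencv.org/2.4/objects.inv")
header_and_payload = r.content.split(b"\n", 4)  # 4 header lines + compressed payload
print(zlib.decompress(header_and_payload[-1]).decode("utf-8")[:300])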
numpy.inv has lines like this:
numpy.ndarray py:class 1 reference/generated/numpy.ndarray.html#$ -
whereas cv2.inv has lines like this:
cv2.imread ocv:pyfunction 1 modules/highgui/doc/reading_and_writing_images_and_video.html#$ -
So presumably you'd link to the OpenCV docs with :ocv:pyfunction:`cv2.imread` instead of :py:function:`cv2.imread`. Sphinx doesn't like it though:
WARNING: Unknown interpreted text role "ocv:pyfunction".
A bit of Googling revealed that the OpenCV project has its own "ocv" sphinx domain: https://github.com/opencv/opencv/blob/2.4/doc/ocv.py -- presumably because they need to document C, C++ and Python APIs all at the same time.
To use it, save ocv.py next to your Sphinx conf.py, and modify your conf.py:
import os
import sys

sys.path.insert(0, os.path.abspath('.'))
import ocv

extensions = [
    'ocv',
]

intersphinx_mapping = {
    'cv2': ('http://docs.opencv.org/2.4/', None),
}
In your rst files you need to say :ocv:pyfunc:`cv2.imread` (not :ocv:pyfunction:).
Sphinx prints some warnings (unparseable C++ definition: u'cv2.imread') but the generated html documentation actually looks ok with a link to http://docs.opencv.org/2.4/modules/highgui/doc/reading_and_writing_images_and_video.html#cv2.imread. You can edit ocv.py and remove the line that prints that warning.
The accepted answer no longer works in the new version (1.5.x) ...
import requests
import posixpath
from sphinx.ext.intersphinx import read_inventory
uri = 'http://docs.python.org/2.7/'
r = requests.get(uri + 'objects.inv', stream=True)
r.raise_for_status()
inv = read_inventory(r.raw, uri, posixpath.join)
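The returned inventory has the same shape as the dictionaries shown above (a mapping from domain to entries), so it can be inspected the same way:
# A quick look at the available domains
print(sorted(inv))  # e.g. ['py:attribute', 'py:class', 'py:function', ...]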
Stubborn fool that I am, I used 2to3 and the Sphinx deprecated APIs chart to revive #david-röthlisberger's ocv.py-based answer so it'll work with Sphinx 2.3 on Python 3.5.
The fixed-up version is here:
https://gist.github.com/ssokolow/a230b27b7ea4a31f7fb40621e6461f9a
...and the quick version of what I did was:
Run 2to3 -w ocv.py && rm ocv.py.bak
Cycle back and forth between running Sphinx and renaming functions to their replacements in the chart. I believe these were the only changes I had to make in this step:
Directive now has to be imported from docutils.parsers.rst
Replace calls to l_(...) with calls to _(...) and remove the l_ import.
Replace calls to env.warn with calls to log.warn where log = sphinx.util.logging.getLogger(__name__).
Then, you just pair it with this intersphinx definition and you get something still new enough to be relevant for most use cases:
'cv2': ('https://docs.opencv.org/3.0-last-rst/', None)
For convenience, I made a small extension for aliasing intersphinx cross references. This is useful as sometimes the object inventory gets confused when an object from a submodule is imported from a package's __init__.py.
See also https://github.com/sphinx-doc/sphinx/issues/5603
###
# Workaround for
# "Intersphinx references to objects imported at package level can't be mapped."
#
# See https://github.com/sphinx-doc/sphinx/issues/5603
intersphinx_aliases = {
    ("py:class", "click.core.Group"):
        ("py:class", "click.Group"),
    ("py:class", "click.core.Command"):
        ("py:class", "click.Command"),
}

def add_intersphinx_aliases_to_inv(app):
    from sphinx.ext.intersphinx import InventoryAdapter

    inventories = InventoryAdapter(app.builder.env)

    for alias, target in app.config.intersphinx_aliases.items():
        alias_domain, alias_name = alias
        target_domain, target_name = target
        try:
            found = inventories.main_inventory[target_domain][target_name]
            try:
                inventories.main_inventory[alias_domain][alias_name] = found
            except KeyError:
                print("could not add to inv")
                continue
        except KeyError:
            print("missed :(")
            continue

def setup(app):
    app.add_config_value("intersphinx_aliases", {}, "env")
    app.connect("builder-inited", add_intersphinx_aliases_to_inv)
To use this, I paste the above code in my conf.py and add aliases to the intersphinx_aliases dictionary.
