Alternatives to imp.find_module? - python

Background
I've grown tired of the issue with pylint not being able to import files when you use namespace packages and divide your code-base into separate folders. As such I started digging into the astNG source-code which has been identified as the source of the trouble (see bugreport 8796 on astng). At the heart of the issue seems to be the use of pythons own imp.find_module in the process of finding imports.
What happens is that the import's first (sub)package - a in import a.b.c - is fed to find_module with a None path. Whatever path comes back is then fed into find_module on the next pass of the lookup loop, where you try to find b in the previous example.
Pseudo-code from logilab.common.modutils:
path = None
while import_as_list:
    try:
        _, found_path, etc = find_module(import_as_list[0], path)
    except ImportError:
        ...  # exception handling and checking for a better version in the .egg files
    path = [found_path]
    import_as_list.pop(0)
The Problem
This is what's broken: you only get the first hit from find_module, which may or may not have your subpackages in it. If you DON'T find the subpackages, you have no way to back out and try the next candidate.
I tried explicitly passing sys.path instead of None, so that the result could be removed from the path list and a second attempt made, but Python's module finder is clever enough that there doesn't have to be an exact match in the paths, making this approach unusable - to the best of my knowledge, anyway.
Teary-eyed Plea
Is there an alternative to find_module which will return ALL possible matches or take an exclude list? I'm also open to completely different solutions. Preferably not patching Python by hand, but it wouldn't be impossible - at least for a local solution.
(Caveat emptor: I'm running Python 2.6 and for reasons of current company policy can't upgrade; suggestions for Python 3 etc. won't get marked as accepted unless it's the only answer.)

Since Python 2.5, the right way to do this is with pkgutil.iter_modules() (for a flat list) or pkgutil.walk_packages() (for a subpackage tree). Both are fully compatible with namespace packages.
For example, if I wanted to find just the subpackages/submodules of 'jmb', I would do:
import jmb, pkgutil

for module_loader, name, ispkg in pkgutil.iter_modules(jmb.__path__, 'jmb.'):
    # 'name' will be 'jmb.foo', 'jmb.bar', etc.
    # 'ispkg' will be true if 'jmb.foo' is a package, false if it's a module
    print(name, ispkg)
You can also use iter_modules or walk_packages to walk all the modules on sys.path; see the pkgutil docs for details.
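For instance, a minimal sketch of walking the whole subpackage tree under jmb recursively (reusing the hypothetical 'jmb' package from above):

import jmb, pkgutil

# walk_packages descends into subpackages, unlike iter_modules
for module_loader, name, ispkg in pkgutil.walk_packages(jmb.__path__, 'jmb.'):
    print(name, '(package)' if ispkg else '(module)')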

I've grown tired of this limitation in PyLint too.
I don't know a replacement for imp.find_module(), but I think I found another way to deal with namespace packages in PyLint. See my comment on the bug report you linked to (http://www.logilab.org/ticket/8796).
The idea is to use pkg_resources to find namespace packages. Here's my addition to logilab.common.modutils._module_file(), just after while modpath:
while modpath:
    if modpath[0] in pkg_resources._namespace_packages and len(modpath) > 1:
        module = sys.modules[modpath.pop(0)]
        path = module.__path__
This is not very refined and only handles top-level namespace packages, though.
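For reference, a quick way to see what pkg_resources currently registers - note that _namespace_packages is a private attribute, so its exact layout is an implementation detail:

import pkg_resources

# maps namespace package names to their registered sub-packages
print(dict(pkg_resources._namespace_packages))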

warning + disclaimer: not tested yet!
before:

for part in parts:
    modpath.append(part)
    curname = '.'.join(modpath)
    # ...
    if module is None:
        mp_file, mp_filename, mp_desc = imp.find_module(part, path)
        module = imp.load_module(curname, mp_file, mp_filename, mp_desc)
after (thanks pjeby for mentioning pkgutil!):
for part in parts:
    modpath.append(part)
    curname = '.'.join(modpath)
    # ...
    if module is None:
        # + https://stackoverflow.com/a/14820895/611007
        # mp_file, mp_filename, mp_desc = imp.find_module(part, path)
        # module = imp.load_module(curname, mp_file, mp_filename, mp_desc)
        import pkgutil
        mp_file = None
        for loadr, name, ispkg in pkgutil.iter_modules(path=path, prefix='.'.join(modpath[:-1]) + '.'):
            if name.split('.')[-1] == part:
                if not hasattr(loadr, 'path') and hasattr(loadr, 'archive'):
                    # with zips `name` was like '.somemodule';
                    # it gives `RuntimeWarning: Parent module '' not found while handling absolute import`.
                    # I expect the name I need to be 'somemodule'.
                    # TODO: I don't know why python does this or what the correct usage is.
                    # https://stackoverflow.com/questions/2267984/
                    if name and name[0] == '.':
                        name = name[1:]
                    ldr = loadr.find_module(name, loadr.archive)
                    module = ldr.load_module(name)
                    break
                imploader = loadr.find_module(name, loadr.path)
                mp_file, mp_filename, mp_desc = imploader.file, imploader.filename, imploader.etc
                module = imploader.load_module(imploader.fullname)
                break
        if module is None:
            raise ImportError

Related

Last imported file overwrites statements from previous files. Better ways of specifying imported variables?

Hey stackoverflow community,
I'm new to this forum and to Python developing in general, and I have a problem with Alexa/Python overwriting similarly named variables from different files.
In my language-learning skill I want Alexa to link a "start specific practice" intent from the user to a specific practice file, and to import an intro, keywords and answers from that file to give back to the user.
My problem with the importing is that Python takes the last imported file and overrides the statements of the previous files.
I know I could probably change the variable names according to the practices, but then wouldn't I have to create a lot of individual handler functions which link the user intent to a specific file/function and basically all look and act the same?
Is there a better, more efficient way of specifying those variables when importing or inside the functions?
Importing the files and variables:
from übung_1 import intro_1, keywords_1, real_1
from übung_2 import intro_1, keywords_1, real_1
Working with the variables:
def get_practice_response(practice_number):
    print("get_practice_response")
    session_attributes = {}
    card_title = "Übung"
    # randint is inclusive at both ends, so subtract 1 to get a valid index
    number = randint(0, len(keywords_1) - 1)
    print(intro_1 + keywords_1[number])
    speech_output = intro_1 + keywords_1[number]
    session_attributes["answer"] = real_1[number]
    session_attributes["practice_number"] = practice_number
    session_attributes["keyword"] = keywords_1[number]
    reprompt_text = "test"
    should_end_session = False
    return build_response(session_attributes, build_speechlet_response(
        card_title, speech_output, reprompt_text, should_end_session))
I expected to get the content of the specifically requested file, not variable content from the most recently imported file.
Sadly I haven't found a solution for this specific problem and hope someone could point me in the right direction.
Thank you very much in advance.
Might be easiest to import the modules like so:
import übung_1
import übung_2
Then refer to the contents as übung_1.intro_1, übung_2.intro_1, übung_1.keywords_1 and so on.
As you point out, these two lines
from übung_1 import intro_1, keywords_1, real_1
from übung_2 import intro_1, keywords_1, real_1
don't work the way you want because the second import overrides the first. This has to happen because you can't have two different variables in the same namespace called intro_1.
You can get around this by doing
import übung_1
import übung_2
and then in your code you explicitly state the namespace you want:
print(übung_1.intro_1 + übung_1.keywords_1[number])
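If the number of practice files grows, you can also avoid writing one handler per file by resolving the module dynamically. A minimal sketch, assuming the modules are named übung_1, übung_2, ... and each exposes intro_1, keywords_1 and real_1:

import importlib
from random import randint

def get_practice_data(practice_number):
    # resolve 'übung_<n>' at runtime instead of importing every file up front
    übung = importlib.import_module('übung_{}'.format(practice_number))
    number = randint(0, len(übung.keywords_1) - 1)
    return übung.intro_1 + übung.keywords_1[number], übung.real_1[number]

speech_output, answer = get_practice_data(2)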

Get Python's LIB path

I can see that the INCLUDE path is sysconfig.get_path('include').
But I don't see any similar value for LIB.
NumPy outright hardcodes it as os.path.join(sys.prefix, "libs") on Windows and uses get_config_var('LIBDIR') (not documented and missing on Windows) otherwise.
Is there a more supported way?
Since it's not part of any official spec/doc and, as another answer shows, there are cases where none of the appropriate variables from sysconfig/distutils.sysconfig .get_config_var() are set, the only way to reliably get it in all cases, exactly as a build would (e.g. even for a Python in the source tree), is to delegate to the reference implementation.
In distutils, the logic that sets the library path for a compiler lives in distutils.command.build_ext.finalize_options(). So, this code would get it with no side effects on the build:
import distutils.command.build_ext  # imports distutils.core, too

d = distutils.core.Distribution()
b = distutils.command.build_ext.build_ext(d)
# (or d.get_command_class('build_ext')(d); then it's enough to import distutils.core)
b.finalize_options()
print(b.library_dirs)
Note that:
Not all locations in the resulting list necessarily exist.
If your setup.py is setuptools-based, use setuptools.Distribution and setuptools.command.build_ext instead, respectively.
If you pass any values to setup() that affect the result, you must pass them to Distribution here, too.
Since there are no guarantees that the set of additional values you need to pass will stay the same, and the value is only needed when building an extension, it seems like you aren't really supposed to get this value independently at all: if you're using another build facility, you should rather subclass build_ext and get the value from the base method during the build (a minimal sketch follows).
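For illustration, a hedged sketch of that subclassing approach - the project name is a placeholder, and the print is just to show where the value becomes available:

from setuptools import setup
from setuptools.command.build_ext import build_ext

class build_ext_report(build_ext):
    def finalize_options(self):
        # let the base method compute everything first
        build_ext.finalize_options(self)
        # self.library_dirs now holds the library path(s) the build would use
        print('library_dirs:', self.library_dirs)

setup(
    name='example',  # hypothetical project name
    cmdclass={'build_ext': build_ext_report},
)

Running python setup.py build_ext then prints the computed library directories as part of the build.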
Below is the (rather long) subroutine in skbuild.cmaker that locates libpythonxx.so/pythonxx.lib for the running Python. In CMake, the 350-line Modules/FindPythonLibs.cmake is dedicated to this task.
The part of the former that gets just the directory is much simpler, though:
libdir = du_sysconfig.get_config_var('LIBDIR')
if sysconfig.get_config_var('MULTIARCH'):
    masd = sysconfig.get_config_var('multiarchsubdir')
    if masd:
        if masd.startswith(os.sep):
            masd = masd[len(os.sep):]
        libdir = os.path.join(libdir, masd)
if libdir is None:
    libdir = os.path.abspath(os.path.join(
        sysconfig.get_config_var('LIBDEST'), "..", "libs"))
def get_python_library(python_version):
    """Get path to the python library associated with the current python
    interpreter."""
    # determine direct path to libpython
    python_library = sysconfig.get_config_var('LIBRARY')

    # if static (or nonexistent), try to find a suitable dynamic libpython
    if (python_library is None or
            os.path.splitext(python_library)[1][-2:] == '.a'):

        candidate_lib_prefixes = ['', 'lib']

        candidate_extensions = ['.lib', '.so', '.a']
        if sysconfig.get_config_var('WITH_DYLD'):
            candidate_extensions.insert(0, '.dylib')

        candidate_versions = [python_version]
        if python_version:
            candidate_versions.append('')
            candidate_versions.insert(
                0, "".join(python_version.split(".")[:2]))

        abiflags = getattr(sys, 'abiflags', '')
        candidate_abiflags = [abiflags]
        if abiflags:
            candidate_abiflags.append('')

        # Ensure the value injected by virtualenv is
        # returned on windows.
        # Because calling `sysconfig.get_config_var('multiarchsubdir')`
        # returns an empty string on Linux, `du_sysconfig` is only used to
        # get the value of `LIBDIR`.
        libdir = du_sysconfig.get_config_var('LIBDIR')
        if sysconfig.get_config_var('MULTIARCH'):
            masd = sysconfig.get_config_var('multiarchsubdir')
            if masd:
                if masd.startswith(os.sep):
                    masd = masd[len(os.sep):]
                libdir = os.path.join(libdir, masd)

        if libdir is None:
            libdir = os.path.abspath(os.path.join(
                sysconfig.get_config_var('LIBDEST'), "..", "libs"))

        candidates = (
            os.path.join(
                libdir,
                ''.join((pre, 'python', ver, abi, ext))
            )
            for (pre, ext, ver, abi) in itertools.product(
                candidate_lib_prefixes,
                candidate_extensions,
                candidate_versions,
                candidate_abiflags
            )
        )

        for candidate in candidates:
            if os.path.exists(candidate):
                # we found a (likely alternate) libpython
                python_library = candidate
                break

    # TODO(opadron): what happens if we don't find a libpython?

    return python_library

How to properly write cross-references to external documentation with intersphinx?

I'm trying to add cross-references to external API into my documentation but I'm facing three different behaviors.
I am using Sphinx (1.3.1) with Python (2.7.3), and my intersphinx mapping is configured as:
{
    'python': ('https://docs.python.org/2.7', None),
    'numpy': ('http://docs.scipy.org/doc/numpy/', None),
    'cv2': ('http://docs.opencv.org/2.4/', None),
    'h5py': ('http://docs.h5py.org/en/latest/', None),
}
I have no trouble writing a cross-reference to numpy API with :class:`numpy.ndarray` or :func:`numpy.array` which gives me, as expected, something like numpy.ndarray.
However, with h5py, the only way I can have a link generated is if I omit the module name. For example, :class:`Group` (or :class:`h5py:Group`) gives me Group but :class:`h5py.Group` fails to generate a link.
Finally, I cannot find a way to write a working cross-reference to OpenCV API, none of these seems to work:
:func:`cv2.convertScaleAbs`
:func:`cv2:cv2.convertScaleAbs`
:func:`cv2:convertScaleAbs`
:func:`convertScaleAbs`
How to properly write cross-references to external API, or configure intersphinx, to have a generated link as in the numpy case?
In addition to the detailed answer from @Gall, I've discovered that intersphinx can also be run as a module:
python -m sphinx.ext.intersphinx 'http://python-eve.org/objects.inv'
This outputs nicely formatted info. For reference: https://github.com/sphinx-doc/sphinx/blob/master/sphinx/ext/intersphinx.py#L390
I gave another try at understanding the content of an objects.inv file, and this time I inspected numpy and h5py instead of only OpenCV's.
How to read an intersphinx inventory file
Although I couldn't find anything useful about reading the content of an objects.inv file, it is actually very simple with the intersphinx module.
from sphinx.ext import intersphinx
import warnings

def fetch_inventory(uri):
    """Read a Sphinx inventory file into a dictionary."""
    class MockConfig(object):
        intersphinx_timeout = None  # type: int
        tls_verify = False

    class MockApp(object):
        srcdir = ''
        config = MockConfig()

        def warn(self, msg):
            warnings.warn(msg)

    return intersphinx.fetch_inventory(MockApp(), '', uri)
uri = 'http://docs.python.org/2.7/objects.inv'
# Read inventory into a dictionary
inv = fetch_inventory(uri)
# Or just print it
intersphinx.debug(['', uri])
File structure (numpy)
After inspecting numpy's inventory, you can see that the keys are domains:
[u'np-c:function',
u'std:label',
u'c:member',
u'np:classmethod',
u'np:data',
u'py:class',
u'np-c:member',
u'c:var',
u'np:class',
u'np:function',
u'py:module',
u'np-c:macro',
u'np:exception',
u'py:method',
u'np:method',
u'np-c:var',
u'py:exception',
u'np:staticmethod',
u'py:staticmethod',
u'c:type',
u'np-c:type',
u'c:macro',
u'c:function',
u'np:module',
u'py:data',
u'np:attribute',
u'std:term',
u'py:function',
u'py:classmethod',
u'py:attribute']
You can see how you can write your cross-reference when you look at the content of a specific domain. For example, py:class:
{u'numpy.DataSource': (u'NumPy',
u'1.9',
u'http://docs.scipy.org/doc/numpy/reference/generated/numpy.DataSource.html#numpy.DataSource',
u'-'),
u'numpy.MachAr': (u'NumPy',
u'1.9',
u'http://docs.scipy.org/doc/numpy/reference/generated/numpy.MachAr.html#numpy.MachAr',
u'-'),
u'numpy.broadcast': (u'NumPy',
u'1.9',
u'http://docs.scipy.org/doc/numpy/reference/generated/numpy.broadcast.html#numpy.broadcast',
u'-'),
...}
So here, :class:`numpy.DataSource` will work as expected.
h5py
In the case of h5py, the domains are:
[u'py:attribute', u'std:label', u'py:method', u'py:function', u'py:class']
and if you look at the py:class domain:
{u'AttributeManager': (u'h5py',
u'2.5',
u'http://docs.h5py.org/en/latest/high/attr.html#AttributeManager',
u'-'),
u'Dataset': (u'h5py',
u'2.5',
u'http://docs.h5py.org/en/latest/high/dataset.html#Dataset',
u'-'),
u'ExternalLink': (u'h5py',
u'2.5',
u'http://docs.h5py.org/en/latest/high/group.html#ExternalLink',
u'-'),
...}
That's why I couldn't make it work like the numpy references. So a good way to format them would be :class:`h5py:Dataset`.
OpenCV
OpenCV's inventory object seems malformed. Where I would expect to find domains, there are actually 902 function signatures:
[u':',
u'AdjusterAdapter::create(const',
u'AdjusterAdapter::good()',
u'AdjusterAdapter::tooFew(int',
u'AdjusterAdapter::tooMany(int',
u'Algorithm::create(const',
u'Algorithm::getList(vector<string>&',
u'Algorithm::name()',
u'Algorithm::read(const',
u'Algorithm::set(const'
...]
and if we take the first one's value:
{u'Ptr<AdjusterAdapter>': (u'OpenCV',
u'2.4',
u'http://docs.opencv.org/2.4/detectorType)',
u'ocv:function 1 modules/features2d/doc/common_interfaces_of_feature_detectors.html#$ -')}
I'm pretty sure it is then impossible to write OpenCV cross-references with this file...
Conclusion
I thought intersphinx generated the objects.inv based on the content of the documentation project in a standard way, which seems not to be the case.
As a result, it seems that the proper way to write cross-references is API dependent and one should inspect a specific inventory object to actually see what's available.
An additional way to inspect the objects.inv file is with the sphobjinv module.
You can search local or even remote inventory files (with fuzzy matching). For instance with scipy:
$ sphobjinv suggest -t 90 -u https://docs.scipy.org/doc/scipy/reference/objects.inv "signal.convolve2d"
Remote inventory found.
:py:function:`scipy.signal.convolve2d`
:std:doc:`generated/scipy.signal.convolve2d`
Note that you may need to use :py:func: and not :py:function: (I'd be happy to know why).
How to use OpenCV 2.4 (cv2) intersphinx
Inspired by @Gall's answer, I wanted to compare the contents of the OpenCV & numpy inventory files. I couldn't get sphinx.ext.intersphinx.fetch_inventory to work from ipython, but the following does work:
curl http://docs.opencv.org/2.4/objects.inv | tail -n +5 | zlib-flate -uncompress > cv2.inv
curl https://docs.scipy.org/doc/numpy/objects.inv | tail -n +5 | zlib-flate -uncompress > numpy.inv
numpy.inv has lines like this:
numpy.ndarray py:class 1 reference/generated/numpy.ndarray.html#$ -
whereas cv2.inv has lines like this:
cv2.imread ocv:pyfunction 1 modules/highgui/doc/reading_and_writing_images_and_video.html#$ -
So presumably you'd link to the OpenCV docs with :ocv:pyfunction:`cv2.imread` instead of :py:function:`cv2.imread`. Sphinx doesn't like it though:
WARNING: Unknown interpreted text role "ocv:pyfunction".
A bit of Googling revealed that the OpenCV project has its own "ocv" sphinx domain: https://github.com/opencv/opencv/blob/2.4/doc/ocv.py -- presumably because they need to document C, C++ and Python APIs all at the same time.
To use it, save ocv.py next to your Sphinx conf.py, and modify your conf.py:
sys.path.insert(0, os.path.abspath('.'))
import ocv

extensions = [
    'ocv',
]

intersphinx_mapping = {
    'cv2': ('http://docs.opencv.org/2.4/', None),
}
In your rst files you need to say :ocv:pyfunc:`cv2.imread` (not :ocv:pyfunction:).
Sphinx prints some warnings (unparseable C++ definition: u'cv2.imread') but the generated html documentation actually looks ok with a link to http://docs.opencv.org/2.4/modules/highgui/doc/reading_and_writing_images_and_video.html#cv2.imread. You can edit ocv.py and remove the line that prints that warning.
The accepted answer no longer works in the new version (1.5.x) ...
import requests
import posixpath
from sphinx.ext.intersphinx import read_inventory
uri = 'http://docs.python.org/2.7/'
r = requests.get(uri + 'objects.inv', stream=True)
r.raise_for_status()
inv = read_inventory(r.raw, uri, posixpath.join)
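To poke around in the result, something like this should work - using 'py:class' here assumes the fetched inventory exposes that domain, as in the dumps above:

# keys are '<domain>:<role>' strings; values map object names to link data
print(sorted(inv.keys()))
print(sorted(inv['py:class'])[:5])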
Stubborn fool that I am, I used 2to3 and the Sphinx deprecated APIs chart to revive @david-röthlisberger's ocv.py-based answer so it'll work with Sphinx 2.3 on Python 3.5.
The fixed-up version is here:
https://gist.github.com/ssokolow/a230b27b7ea4a31f7fb40621e6461f9a
...and the quick version of what I did was:
Run 2to3 -w ocv.py && rm ocv.py.bak
Cycle back and forth between running Sphinx and renaming functions to their replacements in the chart. I believe these were the only changes I had to make in this step:
Directive now has to be imported from docutils.parsers.rst
Replace calls to l_(...) with calls to _(...) and remove the l_ import.
Replace calls to env.warn with calls to log.warn where log = sphinx.util.logging.getLogger(__name__).
Then, you just pair it with this intersphinx definition and you get something still new enough to be relevant for most use cases:
'cv2': ('https://docs.opencv.org/3.0-last-rst/', None)
For convenience, I made a small extension for aliasing intersphinx cross references. This is useful as sometimes the object inventory gets confused when an object from a submodule is imported from a package's __init__.py.
See also https://github.com/sphinx-doc/sphinx/issues/5603
###
# Workaround for
# "Intersphinx references to objects imported at package level can't be mapped".
#
# See https://github.com/sphinx-doc/sphinx/issues/5603
intersphinx_aliases = {
    ("py:class", "click.core.Group"):
        ("py:class", "click.Group"),
    ("py:class", "click.core.Command"):
        ("py:class", "click.Command"),
}

def add_intersphinx_aliases_to_inv(app):
    from sphinx.ext.intersphinx import InventoryAdapter
    inventories = InventoryAdapter(app.builder.env)

    for alias, target in app.config.intersphinx_aliases.items():
        alias_domain, alias_name = alias
        target_domain, target_name = target
        try:
            found = inventories.main_inventory[target_domain][target_name]
            try:
                inventories.main_inventory[alias_domain][alias_name] = found
            except KeyError:
                print("could not add to inv")
                continue
        except KeyError:
            print("missed :(")
            continue

def setup(app):
    app.add_config_value("intersphinx_aliases", {}, "env")
    app.connect("builder-inited", add_intersphinx_aliases_to_inv)
To use this, I paste the above code in my conf.py and add aliases to the intersphinx_aliases dictionary.

waf multi-step build - target path

In one of our projects, I have a need to build a library using waf.
The library build has multiple steps: it builds a binary, then executes that binary to generate a few more files, and those files are included in further builds.
To run the binary (which got generated in the intermediate step), I need its path as a string, so that I can prefix it to the binary. From the waf book I saw an example and some references to bld.path.find_dir() and bld.path.parent.find_dir(), but these functions do not return the path as a string.
And there is bld.path.abspath(), which returns the source path as a string.
I want to be able to get the path to the binary file which got generated. Here is a snippet of what I am trying:
bld.program(
    source    = my_sources,
    target    = 'my_binary',  # <-- path to this
    includes  = my_includes,
    cflags    = my_cflags,
    linkflags = my_ldflags,
)

bld.add_group()
# use the above generated binary file
P.S. This might seem fairly trivial, but I come from a make background and am new to waf!
Thanks.
--EDIT--
I am able to build my_binary here, but I want to get its absolute path and reference it in the further steps.
The binary will be at build/${build_target}/${your_binary}, unless you overwrite some default value.
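If you want that path as a string from inside the wscript, here's a minimal, untested sketch using waf's Node API (note that on Windows the actual file name would get an .exe suffix):

def build(bld):
    bld.program(
        source='main.c',  # placeholder sources
        target='my_binary',
    )
    bld.add_group()

    # the program is created in the build directory; find_or_declare()
    # returns its Node, and abspath() gives the path as a string
    node = bld.path.find_or_declare('my_binary')
    print(node.abspath())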
Update #1
A cut-down example that should keep you going, especially regarding the derivation of build targets; also be sure to check the waf book, which includes a lot of examples.
def configure(ctx):
    ctx.load(...)
    ctx.env.appname = APPNAME
    ctx.env.version = VERSION
    ctx.define(...)
    ctx.check_cc(...)

    ctx.setenv('debug', env=ctx.env.derive())
    ctx.env.CFLAGS = ['-ggdb', '-Wall']
    ctx.define('DEBUG', 1)

    ctx.setenv('release', env=ctx.env.derive())
    ctx.env.CFLAGS = ['-O2', '-Wall']
    ctx.define('RELEASE', 1)
def build(bld):
    ### subdirs :) under build are usually related to build variant or command
    print(">>>>> " + bld.cmd)
    print(">>>>> " + bld.variant)
    bin = bld.program(...)

from waflib.Build import BuildContext

class release(BuildContext):
    cmd = 'release'
    variant = 'release'

class debug(BuildContext):
    cmd = 'debug'
    variant = 'debug'

How to emulate os.path.samefile behaviour on Windows and Python 2.7?

Given two paths, I have to compare whether they're pointing to the same file or not. In Unix this can be done with os.path.samefile, but, as the documentation states, it's not available on Windows.
What's the best way to emulate this function?
It doesn't need to emulate the common case. In my case there are the following simplifications:
Paths don't contain symbolic links.
Files are in the same local disk.
Now I use the following:
def samefile(path1, path2):
    return os.path.normcase(os.path.normpath(path1)) == \
           os.path.normcase(os.path.normpath(path2))
Is this OK?
According to issue #5985, os.path.samefile and os.path.sameopenfile are now in py3k. I verified this on Python 3.3.0.
For older versions of Python, here's a way that uses the GetFileInformationByHandle function:
see_if_two_files_are_the_same_file
The os.stat system call returns a tuple with a lot of information about each file - including creation and last-modification timestamps, size, and file attributes. The chances of different files having the same parameters are very slim. I think it is very reasonable to do:
def samefile(file1, file2):
    return os.stat(file1) == os.stat(file2)
The real use-case of os.path.samefile is not symbolic links, but hard links. os.path.samefile(a, b) returns True if a and b are both hard links to the same file. They might not have the same path.
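For illustration, a minimal sketch of the hard-link case - the file names are hypothetical, and os.link needs Unix (or Python 3.2+ on Windows):

import os

open('data.txt', 'w').close()     # create an empty file
os.link('data.txt', 'alias.txt')  # create a hard link to it

# both names point at the same inode, so this prints True
print(os.path.samefile('data.txt', 'alias.txt'))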
I know this is a late answer to this thread. But I use Python on Windows, ran into this issue today, found this thread, and found that os.path.samefile doesn't work for me.
So, to answer the OP, this is how I emulate os.path.samefile:
# because some versions of python do not have os.path.samefile
# particularly, Windows. :(
#
def os_path_samefile(pathA, pathB):
    statA = os.stat(pathA) if os.path.isfile(pathA) else None
    if not statA:
        return False
    statB = os.stat(pathB) if os.path.isfile(pathB) else None
    if not statB:
        return False
    return (statA.st_dev == statB.st_dev) and (statA.st_ino == statB.st_ino)
It is not as tight as it could be, because I was more interested in being clear about what I was doing.
I tested this on Windows 10, using Python 2.7.15.
