Get package of Python object

Given an object or type I can get the object's module using the inspect package
Example
Here, given a function I get the module that contains that function:
>>> inspect.getmodule(np.memmap)
<module 'numpy.core.memmap' from ...>
However what I really want is to get the top-level module that corresponds to the package, in this case numpy rather than numpy.core.memmap.
>>> function_that_I_want(np.memmap)
<module 'numpy' from ...>
Given an object or a module, how do I get the top-level module?

If you have imported the submodule, then the top-level module must also be loaded in sys.modules already (because that's how the import system works). So, something dumb and simple like this should be reliable:
import sys, inspect

def function_that_I_want(obj):
    mod = inspect.getmodule(obj)
    base, _sep, _stem = mod.__name__.partition('.')
    return sys.modules[base]
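Here is a self-contained version of that function, exercised on a standard-library submodule so the example does not depend on NumPy being installed:

```python
import sys
import inspect
import xml.etree.ElementTree

def function_that_I_want(obj):
    # Find the defining module, then look up its top-level
    # package in sys.modules (guaranteed to be cached there).
    mod = inspect.getmodule(obj)
    base, _sep, _stem = mod.__name__.partition('.')
    return sys.modules[base]

top = function_that_I_want(xml.etree.ElementTree.parse)
print(top.__name__)  # 'xml'
```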
The module's __package__ attribute may also be of interest to you (or to future readers). For submodules, it is a string set to the parent package's name (which is not necessarily the top-level module name). See PEP 366 for more details.

Related

How to get the active module as object

When you import another Python file, a variable with the same name is assigned a module object:
>>> import somemodule
>>> somemodule
<module 'somemodule' from '/home/code/test/somemodule.py'>
But how do you get the active module as an object inside somemodule.py itself? It should be the same object as the variable somemodule in <stdin>.
You can get the name of the current module from __name__. That name will look something like package.module (assuming your file is in package/module.py).
Use that string to access the module inside sys.modules but beware: you don't want to do this while the module load is in progress! While your module's outermost level code is being run, the module definitions are still in a state of flux -- you can't necessarily count on always being able to access everything.
As a specific example: a module's statements are executed top to bottom, so if you access the module partway through, any variables and functions defined above the access point will be in the module, but things defined below it will not have been executed yet, and so won't be in the module. Whoops!

How can I mock a module that is imported from a function and not present in sys.path? [duplicate]

This question already has answers here:
python: mock a module
(5 answers)
Closed 6 years ago.
I am unittesting code which contains a method that does a local import.
def function_under_test():
    import unknown.dependency
(I'm not using an import at the top of the .py file because it may or may not be present when the function is called.)
Mocking unknown.dependency in the usual way only works if it can be found somewhere in sys.path. Otherwise, the call to patch fails (it refuses to mock something it cannot look at):
with mock.patch('unknown.dependency'):
    function_under_test()
# ImportError: No module named 'unknown'
The docs suggest that I should patch to the namespace where the module is used -- in this case, function_under_test. However, when that namespace is a function, this doesn't cut it. The call to patch succeeds, but the actual import statement is still referencing the original, non-existing unknown module, which cannot be found.
with mock.patch('method_under_test.unknown'):
    print("fails")
# ImportError: No module named 'unknown'
So, how do I replace the dependency module with a mock, if the module itself does not exist and it is imported from a function?
Mock the import only
The easy way out is to move the import statement into a separate function.
def get_dependency():
    import unknown.dependency
    return unknown.dependency

def function_under_test():
    module = get_dependency()
    # use module
That makes it easy to mock out only that 'helper' function. This has the disadvantage that it requires a change to the code under test. It does work though:
with mock.patch('mymodule.get_dependency'):  # 'mymodule' is whatever module defines get_dependency
    function_under_test()
Mock the module
As described here, you can add the mocked module straight to sys.modules so that Python can find it regardless of whether it's present in sys.path or not.
with mock.patch.dict('sys.modules', {'unknown': mock.MagicMock(),
                                     'unknown.dependency': mock.MagicMock()}):
    # Both keys are seeded so the nested import resolves from the cache.
    function_under_test()
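A self-contained sketch of the sys.modules approach (`unknown` and `compute` are made-up names here). Note that both the package key and the submodule key are seeded, so that `import unknown.dependency` resolves entirely from the module cache:

```python
import sys
from unittest import mock

def function_under_test():
    # Local import of a module that does not exist anywhere on disk.
    import unknown.dependency
    return unknown.dependency.compute()

fake_pkg = mock.MagicMock()
with mock.patch.dict(sys.modules, {'unknown': fake_pkg,
                                   'unknown.dependency': fake_pkg.dependency}):
    fake_pkg.dependency.compute.return_value = 42
    result = function_under_test()

print(result)  # 42
```

patch.dict restores sys.modules when the block exits, so the fake modules do not leak into other tests.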

Python module import - why are components only available when explicitly imported?

I have recently installed scikit-image version 0.11.3. I am using python 2.7.10. When I import the entire module I cannot access the io module.
import skimage
img = skimage.io.imread(path_)
Gives error:
AttributeError: 'module' object has no attribute 'io'
However the following does not error.
from skimage import io
img = io.imread(path_)
Question: Why?
Quick answer: io is a submodule. Submodules must be imported from the parent package explicitly.
Long answer: From section 5.4.2 of the python docs:
When a submodule is loaded using any mechanism (e.g. importlib APIs, the import or import-from statements, or built-in __import__()) a binding is placed in the parent module's namespace to the submodule object. For example, if package spam has a submodule foo, after importing spam.foo, spam will have an attribute foo which is bound to the submodule. Let's say you have the following directory structure:
spam/
__init__.py
foo.py
bar.py
and spam/__init__.py has the following lines in it:
from .foo import Foo
from .bar import Bar
then executing the following puts a name binding to foo and bar in the spam module:
>>> import spam
>>> spam.foo
<module 'spam.foo' from '/tmp/imports/spam/foo.py'>
>>> spam.bar
<module 'spam.bar' from '/tmp/imports/spam/bar.py'>
Given Python’s familiar name binding rules this might seem surprising, but it’s actually a fundamental feature of the import system. The invariant holding is that if you have sys.modules['spam'] and sys.modules['spam.foo'] (as you would after the above import), the latter must appear as the foo attribute of the former.
It's simply the way Python handles modules.
One reason is that importing a single module would become very slow if CPython had to scan for submodules, import all of them, and then import all of their submodules. The other reason is "explicit is better than implicit": why should Python import everything possible when you only need a small fraction of a package with a complex module hierarchy?
Instead of from skimage import io you can also write
import skimage.io
then skimage.io.imread will be found.
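The submodule-binding invariant quoted above can be demonstrated with a standard-library package:

```python
import sys
import xml.etree.ElementTree  # also imports and binds 'xml' and 'xml.etree'

# The parent packages now carry the submodules as attributes...
print(xml.etree.ElementTree)

# ...and those attributes are exactly the cached sys.modules entries.
assert sys.modules['xml'].etree is sys.modules['xml.etree']
assert sys.modules['xml.etree'].ElementTree is sys.modules['xml.etree.ElementTree']
```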

Distinguish modules from packages

I need a function that returns the name of the package of the module from which the function was called. Getting the module's name is easy:
import inspect
module_name = inspect.currentframe().f_back.f_globals['__name__']
And stripping the last part to get the module's package is also easy:
package_name = '.'.join(module_name.split('.')[:-1])
But if the function is called from a package's __init__.py, the last part of the name should not be stripped. E.g. if called from foo/bar/__init__.py, module_name in the above example will be set to 'foo.bar', which is already the name of the package.
How can I check, from the module name or the module object, whether it refers to a package or a module?
The best way I found is getting the module object's __file__ attribute, if it exists, and check whether it points to a file whose name is __init__ plus extension. But this seems very brittle to me.
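A less brittle check is to test for `__path__`, which the import system sets on packages (including namespace packages) and never on plain modules:

```python
import xml, xml.etree, xml.etree.ElementTree

def is_package(module):
    # Only packages get a __path__ attribute from the import system.
    return hasattr(module, '__path__')

print(is_package(xml))                    # True
print(is_package(xml.etree))              # True
print(is_package(xml.etree.ElementTree))  # False
```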
From the module object, __package__:
module.__package__
is available for the couple of modules I looked at, and appears to be correct, though it may not give you exactly what you want:
>>> import os.path
>>> import httplib2
>>> import xml.etree.ElementTree as etree
>>> os.path.__package__
''   # blank... this is a builtin
>>> httplib2.__package__
'httplib2'
>>> etree.__package__
'xml.etree'
You can also use
function.__module__
but that gives you the module name, not the actual module object; I'm not sure there is an easier way to get the module object itself than importing it again into a local variable.
>>> etree.parse.__module__
'xml.etree.ElementTree'
>>> os.path.split.__module__
'ntpath'
The good thing is this appears to be more correct, following the actual location of things, e.g.:
>>> httplib2.Response.__module__
'httplib2'
>>> httplib2.copy.copy.__module__
'copy'
>>> httplib2.urlparse.ParseResult.__module__
'urlparse'
etc.
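As a side note, given only the `__module__` string, the module object itself can usually be recovered from sys.modules rather than by importing again, since the defining module is normally already cached:

```python
import sys
import xml.etree.ElementTree as etree

# __module__ is only a string, but the module it names is
# already in the import cache after the import above.
mod_name = etree.parse.__module__   # 'xml.etree.ElementTree'
mod = sys.modules[mod_name]
print(mod is etree)  # True
```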
This is the logic importlib.__import__() uses; it re-implements most of Python's built-in import machinery and needs to find a module's package in order to support relative imports:
# __package__ is not guaranteed to be defined, or could be set to None
# to represent that its proper value is unknown
package = globals.get('__package__')
if package is None:
    package = globals['__name__']
    if '__path__' not in globals:
        package = package.rpartition('.')[0]
module = _gcd_import(name, package, level)
So it seems that there is no reliable "direct" way to get a module's package.
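That importlib logic can be packaged into a helper (a sketch; `get_package_name` is a made-up name). It trusts __package__ when set, otherwise falls back to __name__, stripping the last component unless the module is itself a package:

```python
import xml.etree.ElementTree

def get_package_name(module):
    # Mirror importlib's fallback chain: __package__, then __name__,
    # minus the last component for non-packages (no __path__).
    package = getattr(module, '__package__', None)
    if package is None:
        package = module.__name__
        if not hasattr(module, '__path__'):
            package = package.rpartition('.')[0]
    return package

print(get_package_name(xml.etree.ElementTree))  # 'xml.etree'
print(get_package_name(xml.etree))              # 'xml.etree'
```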

Having to evaluate a string to access a class from a module

I've loaded one of my modules <module my_modules.mymodule from .../my_modules/mymodule.pyc> with __import__.
Now I have the module saved in a variable, but I'd like to create an instance of a class from mymodule. The thing is, the module name was passed into my script as a string.
So I've got a variable containing the module, and I've got the module name but as string.
variable_containing_module.string_variable_with_correct_classname() doesn't work, because it says there is no such thing in the module as "string_variable_with_correct_classname" -- it doesn't evaluate the string to get the class name. How can I do that?
Your problem is that __import__ is designed for use by the import internals, not really for direct usage. One symptom of this is that when importing a module from inside a package, the top-level package is returned rather than the module itself.
There are two ways to deal with this. The preferred way is to use importlib.import_module() instead of using __import__ directly.
If you're on an older version of Python that doesn't provide importlib, then you can create your own helper function:
import sys

def import_module(module_name):
    __import__(module_name)
    return sys.modules[module_name]
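Once you have the module object, the class itself can be looked up from the string name with getattr and then called to create an instance (a standard-library module and class are used here for illustration):

```python
import importlib

module = importlib.import_module('collections')
class_name = 'OrderedDict'          # the class name arrives as a string
cls = getattr(module, class_name)   # resolve the string to the class object
instance = cls()                    # instantiate it as usual
print(type(instance).__name__)  # 'OrderedDict'
```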
Also see http://bugs.python.org/issue9254
