Distinguish modules from packages - python

I need a function that returns the name of the package of the module from which the function was called. Getting the module's name is easy:
import inspect
module_name = inspect.currentframe().f_back.f_globals['__name__']
And stripping the last part to get the module's package is also easy:
package_name = '.'.join(module_name.split('.')[:-1])
But if the function is called from a package's __init__.py, the last part of the name should not be stripped. E.g. if called from foo/bar/__init__.py, module_name in the above example will be set to 'foo.bar', which is already the name of the package.
How can I check, from the module name or the module object, whether it refers to a package or a module?
The best way I found is getting the module object's __file__ attribute, if it exists, and checking whether it points to a file named __init__ plus an extension. But this seems very brittle to me.

From the module object:
module.__package__
is available on the few modules I looked at, and appears to be correct. It may not give you exactly what you want, though:
import os.path
import httplib2
import xml.etree.ElementTree as etree
os.path.__package__
<blank: os.path is an alias for a top-level module such as posixpath or ntpath, so it has no parent package>
httplib2.__package__
'httplib2'
etree.__package__
'xml.etree'
You can use
function.__module__
as well, but that gives you the module name, not the module object. I'm not sure whether there is an easier way to get the module object itself than importing it again into a local variable.
etree.parse.__module__
'xml.etree.ElementTree'
os.path.split.__module__
'ntpath'
The good thing is this appears to be more correct, following the actual location of things, e.g.:
httplib2.Response.__module__
'httplib2'
httplib2.copy.copy.__module__
'copy'
httplib2.urlparse.ParseResult.__module__
'urlparse'
etc.
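One possible way to get from the name in __module__ back to the module object without importing again is a lookup in sys.modules. This is a minimal sketch that assumes the module has already been imported (which it must have been if you are holding one of its functions); xml.etree.ElementTree is used only as a convenient stdlib example.

```python
import sys
import xml.etree.ElementTree as etree

# A function's __module__ is only a string naming the defining module.
module_name = etree.parse.__module__

# Every module that has already been imported is registered in sys.modules,
# so the name can be mapped back to the module object.
module = sys.modules[module_name]

print(module_name)
print(module is etree)
```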

This is what importlib.__import__() does; it has to re-implement most of Python's built-in import logic and needs to find a module's package to support relative imports:
# __package__ is not guaranteed to be defined or could be set to None
# to represent that its proper value is unknown
package = globals.get('__package__')
if package is None:
    package = globals['__name__']
    if '__path__' not in globals:
        package = package.rpartition('.')[0]
module = _gcd_import(name, package, level)
So it seems that there is no reliable "direct" way to get a module's package.
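The same logic can be wrapped into the function the question asks for. The following is only a sketch: the names package_of and calling_package are made up for illustration, and it relies on the observation from the snippet above that packages have a __path__ entry in their globals while plain modules do not.

```python
import inspect


def package_of(module_globals):
    """Return the package name for a module, given its globals dict."""
    # Prefer __package__ when the import system has filled it in.
    package = module_globals.get('__package__')
    if package is None:
        package = module_globals['__name__']
        # Only packages define __path__; for a plain module, strip the
        # last component of the dotted name to get the parent package.
        if '__path__' not in module_globals:
            package = package.rpartition('.')[0]
    return package


def calling_package():
    """Return the package name of whichever module called this function."""
    caller_globals = inspect.currentframe().f_back.f_globals
    return package_of(caller_globals)
```

Called from foo/bar/__init__.py this should give 'foo.bar', and called from foo/bar/baz.py it should give 'foo.bar' as well, without inspecting __file__.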

Related

A few questions on PEP366 - Python relative Imports

I have always been confused by relative imports in Python and have never felt confident using them. Here I am trying to pin down what I don't understand and then seek StackOverflow's help to fix that.
The reason why PEP366 exists:
Python relative imports are based on the __name__ attribute. __name__ is parsed to determine the relative position of the module in the package hierarchy. The __name__ attribute for foo.py is always __main__ if it is run like python foo.py from anywhere at all. This means there can never be a relative import statement in foo.py, since __main__ carries no package information. By package information I mean that __name__ is, for instance, set to a.b.foo, which gets parsed as package a, subpackage b and module foo.
This is the problem that PEP366 fixes. It builds on the -m switch and adds a __package__ attribute that is used to determine the relative position of the module in the package hierarchy. One can use -m to run foo like so: python -m foo
Now, here is PEP366.
Questions in bold
The major proposed change is the introduction of a new module level attribute, __package__. When it is present, relative imports will be based on this attribute rather than the module __name__ attribute. **Will it ever be the case that it is not present? I know it can be None, '', or __name__.rpartition('.')[0], but will referencing __package__ ever raise an AttributeError? Is it safe to say that __package__ is always present?**
As with the current __name__ attribute, setting __package__ will be the responsibility of the PEP 302 loader used to import a module. Loaders which use imp.new_module() to create the module object will have the new attribute set automatically to None. When the import system encounters an explicit relative import in a module without __package__ set **(Just to be clear, by module here they mean import foo or python -m foo only? python foo.py means a script and not a module. Am I correct?)** (or with it set to None), it will calculate and store the correct value (__name__.rpartition('.')[0] for normal modules and __name__ for package initialisation modules). **The language suggests that __package__ gets set only if there is a relative import statement present. Is that the case? Will __package__ never be set if there are no relative import statements?** If __package__ has already been set **(Does this mean the user manually setting __package__ in code?)** then the import system will use it in preference to recalculating the package name from the __name__ and __path__ attributes.
The runpy module will explicitly set the new attribute, basing it off the name used to locate the module to be executed rather than the name used to set the module's __name__ attribute. This will allow relative imports to work correctly from main modules executed with the -m switch. **What is being said here? I thought there was just one __name__ attribute. Here they are talking about two different names.**
When the main module is specified by its filename **(What is meant by this statement? Is that not always the case? Can you give an example where it is not?)**, then the __package__ attribute will be set to None. To allow relative imports when the module is executed directly, boilerplate similar to the following would be needed before the first relative import statement:
if __name__ == "__main__" and __package__ is None:
    __package__ = "expected.package.name"
Note that this boilerplate is sufficient only if the top level package is already accessible via sys.path. Additional code that manipulates sys.path would be needed in order for direct execution to work without the top level package already being importable. **(Does this boilerplate actually get applied under the hood, so that the user does not have to take care of it explicitly?)**
This approach also has the same disadvantage as the use of absolute imports of sibling modules - if the script is moved to a different package or subpackage, the boilerplate will need to be updated manually. It has the advantage that this change need only be made once per file, regardless of the number of relative imports. **Please elucidate this, preferably giving an example of the disadvantage and advantage that they are talking about here.**
Note that setting __package__ to the empty string explicitly is permitted, and has the effect of disabling all relative imports from that module (since the import machinery will consider it to be a top level module in that case). This means that tools like runpy do not need to provide special case handling for top level modules when setting __package__.
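The "two names" in the runpy paragraph can be observed directly. The sketch below uses runpy.run_module to execute a stdlib submodule roughly the way python -m would; xml.etree.ElementPath is picked only because it is a small module with no side effects, and the behaviour shown is what I would expect on current CPython 3 releases rather than something the PEP text itself guarantees.

```python
import runpy

# Execute a stdlib submodule roughly the way
# "python -m xml.etree.ElementPath" would. The module is located by its
# dotted name, but run_name makes it execute under the name "__main__".
namespace = runpy.run_module("xml.etree.ElementPath", run_name="__main__")

# __name__ is the name the code ran under ...
print(namespace["__name__"])
# ... while __package__ is derived from the name used to locate the module.
print(namespace["__package__"])
```

So __name__ says "__main__" while __package__ still records where the module lives, which is what lets relative imports keep working under -m.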

Get package of Python object

Given an object or type, I can get the object's module using the inspect module.
Example
Here, given a function I get the module that contains that function:
>>> inspect.getmodule(np.memmap)
<module 'numpy.core.memmap' from ...>
However what I really want is to get the top-level module that corresponds to the package, in this case numpy rather than numpy.core.memmap.
>>> function_that_I_want(np.memmap)
<module 'numpy' from ...>
Given an object or a module, how do I get the top-level module?
If you have imported the submodule, then the top-level module must also be loaded in sys.modules already (because that's how the import system works). So, something dumb and simple like this should be reliable:
import sys, inspect
def function_that_I_want(obj):
    mod = inspect.getmodule(obj)
    base, _sep, _stem = mod.__name__.partition('.')
    return sys.modules[base]
The module's __package__ attribute may be interesting for you also (or future readers). For submodules, this is a string set to the parent package's name (which is not necessarily the top-level module name). See PEP366 for more details.
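One caveat with the function above: inspect.getmodule() can return None for objects that are not tied to a module (plain integers, for instance). A slightly more defensive variant might look like the sketch below; the name top_level_module is hypothetical, and json stands in for numpy so that the example needs only the standard library.

```python
import inspect
import json
import sys


def top_level_module(obj):
    """Return the top-level module for obj, or None if it has no module."""
    mod = inspect.getmodule(obj)
    if mod is None:
        # Objects such as plain ints are not associated with any module.
        return None
    # Everything before the first dot names the top-level package.
    base = mod.__name__.partition('.')[0]
    return sys.modules.get(base)


# JSONDecoder is defined in json.decoder, so this resolves to the json package.
print(top_level_module(json.JSONDecoder))
```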

__import__ only imports __init__.py

I am having an issue with the __import__ function. It seems to import only the module's base directory, not the file itself.
For instance I have:
test_suite/assert_array_length.py
when I pass this into __import__:
moduleLocation = "test_suite.assert_array_length"
module = __import__(moduleLocation)
print module
I am getting:
[sub_directories]/test_suite/__init__.pyc
The call sequence is going from run_tests.py to test_runner.py. test_runner.py then imports assert_array_length.py. They are laid out like this:
run_tests.py
|-----------test_runner.py
|-----------assert_array_length.py
Because it's importing the __init__.py, I can't get what I need from the assert_array_length.py file.
__import__ imports the module you asked for. However, if you checked the documentation, you would find the following:
When the name variable is of the form package.module, normally, the top-level package (the name up till the first dot) is returned, not the module named by name.
You may prefer importlib.import_module, which will return package.module instead of package if you tell it to import package.module.
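The difference is easy to see with a standard-library package. This sketch uses json.decoder purely as a stand-in for test_suite.assert_array_length:

```python
import importlib

# __import__ with a dotted name returns the top-level package ...
top = __import__('json.decoder')

# ... while importlib.import_module returns the submodule that was named.
sub = importlib.import_module('json.decoder')

# A non-empty fromlist also makes __import__ return the submodule.
via_fromlist = __import__('json.decoder', fromlist=['JSONDecoder'])

print(top.__name__)           # json
print(sub.__name__)           # json.decoder
print(via_fromlist.__name__)  # json.decoder
```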

python import not working

I have three files in total in python 2.7:
A module file in some directory (e.g. module1.py)
A different module in the same directory, which imports this module (e.g. worker.py)
A main script in the top-level directory importing worker.py
When worker.py contains the following,
import module1 as mod
everything works as expected (i.e. I can use worker.mod.XXX in my main code). But when I replace the content of worker.py as follows (which I expected to do the same):
mod = __import__('module1')
I get an error: ImportError: No module named module1. I need the latter way to handle things to automate importing modules from a list.
What am I missing here?
To be more precise: I am just looking for a way to replace the statement
import module1 as mod
by an expression in which module1 is a string. I have e.g. modulname='module1', and want to import the module with the name of the module given in the string modulname. How to do that?
__import__(name, globals={}, locals={}, fromlist=[], level=-1) -> module
Import a module. Because this function is meant for use by the Python
interpreter and not for general use it is better to use
importlib.import_module() to programmatically import a module.
The globals argument is only used to determine the context;
they are not modified. The locals argument is unused. The fromlist
should be a list of names to emulate from name import ..., or an
empty list to emulate import name.
When importing a module from a package, note that __import__('A.B', ...)
returns package A when fromlist is empty, but its submodule B when
fromlist is not empty. Level is used to determine whether to perform
absolute or relative imports. -1 is the original strategy of attempting
both absolute and relative imports, 0 is absolute, a positive number
is the number of parent directories to search relative to the current module.
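Passing the caller's globals gives __import__ the context it needs to resolve the sibling module, so try this: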
mod = __import__('module1', globals=globals())
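As the help text above suggests, importlib.import_module() is usually the more convenient tool for importing modules named by strings. A minimal sketch for importing from a list follows; the names in module_names are placeholders (stdlib modules here so the example runs anywhere), standing in for names such as 'module1'.

```python
import importlib

# Placeholder names; in the question these would be strings like 'module1'.
module_names = ['json', 'math']

# Map each name to its imported module object.
modules = {name: importlib.import_module(name) for name in module_names}

mod = modules['math']
print(mod.sqrt(16.0))  # 4.0
```

If worker.py and module1.py live inside the same package, a relative name such as importlib.import_module('.module1', package=__package__) may be needed instead; that depends on how the directory is laid out, which the question does not fully specify.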

Having to evaluate a string to access a class from a module

I've loaded one of my modules <module my_modules.mymodule from .../my_modules/mymodule.pyc> with __import__.
The module is now stored in a variable, and I'd like to create an instance of the class mymodule. The catch is that I only have the module name, passed into my script as a string.
So I've got a variable containing the module, and I've got the module name, but only as a string.
variable_containing_module.string_variable_with_correct_classname() doesn't work: Python reports that the module has no attribute "string_variable_with_correct_classname", because it looks up the variable's name rather than the string it holds. How can I make it use the string's value?
Your problem is that __import__ is designed for use by the import internals, not really for direct usage. One symptom of this is that when importing a module from inside a package, the top-level package is returned rather than the module itself.
There are two ways to deal with this. The preferred way is to use importlib.import_module() instead of using __import__ directly.
If you're on an older version of Python that doesn't provide importlib, then you can create your own helper function:
import sys
def import_module(module_name):
    __import__(module_name)
    return sys.modules[module_name]
Also see http://bugs.python.org/issue9254
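Once the right module object is in hand, the class can be looked up by its string name with getattr(). A small sketch, using collections and OrderedDict as stand-ins for the module and class names that arrive as strings:

```python
import importlib

# Stand-ins for the strings passed into the script.
module_name = 'collections'
class_name = 'OrderedDict'

module = importlib.import_module(module_name)

# getattr resolves an attribute whose name is only known at runtime.
cls = getattr(module, class_name)
instance = cls()

print(type(instance).__name__)  # OrderedDict
```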
