I want to make use of an existing python module (called "module.py"). I'm only interested in one function from that module ("my_function()"). The module also contains a lot of other functions, which I'm not using. These other functions cause the module to have a lot of imports that are not used in my_function.
"""module.py"""
import useful_import
import useless_import1
import useless_import2
def my_function():
return useful_import.do()
def other_function1():
return useless_import1.do()
def other_function2():
return useless_import2.do()
The code I've written (main.py) imports only my_function, but it still requires me to include/install the other useless modules. I've checked and none of the useless modules run any code on import, so I should be able to safely remove them.
"""main.py"""
from module import my_function
print my_function()
How do I best deal with this?
Should I included the useless imports in my project anyway?
Should I make a copy of module.py and edit it so that it only contains my_function and the right imports?
Should I copy my_function and its imports into main.py?
(some other option I didn't think/know of)?
It kind of depends on the context, e.g. how will this code be used later, who will maintain it, what kind of code is it realy etc etc.
But my suggestion under most circumstances would be:
refactor my_function and its needed imports into a new_module.py
use this module in main.py
Either remove module.py from your code base, or have it import from new_module
Related
I would like a way to detect if my module was executed directly, as in import module or from module import * rather than by import module.submodule (which also executes module), and have this information accessible in module's __init__.py.
Here is a use case:
In Python, a common idiom is to add import statement in a module's __init__.py file, such as to "flatten" the module's namespace and make its submodules accessible directly. Unfortunately, doing so can make loading a specific submodule very slow, as all other siblings imported in __init__.py will also execute.
For instance:
module/
__init__.py
submodule/
__init__.py
...
sibling/
__init__.py
...
By adding to module/__init__.py:
from .submodule import *
from .sibling import *
It is now possible for users of the module to access definitions in submodules without knowing the details of the package structure (i.e. from module import SomeClass, where SomeClass is defined somewhere in submodule and exposed in its own __init__.py file).
However, if I now run submodule directly (as in import module.submodule, by calling python3 -m module.submodule, or even indirectly via pytest) I will also, unavoidably, execute sibling! If sibling is large, this can slow things down for no reason.
I would instead like to write module/__init__.py something like:
if __???__ == 'module':
from .submodule import *
from .sibling import *
Where __???__ gives me the fully qualified name of the import. Any similar mechanism would also work, although I'm mostly interested in the general case (detecting direct executing) rather than this specific example.
What is being desired is will result in undefined behavior (in the sense whether or not the flattened names be importable from module) when we consider how the import system actually works, if it were actually possible.
Hypothetically, if what you want to achieve is possible, where some __dunder__ that will disambiguate which import statement was used to import module/__init__.py (e.g. import module and from module import *, vs import module.submodule. For the first case, module may trigger the subsequent (slow) import to produce a "flattened" version of the desired imports, while the latter case (import module.submodule) will avoid that and thus module will not contain any assignments of the "flattened" imports.
To illustrate the example a bit more, say one may import SiblingClass from module.sibling.SiblingClass by simply doing from module import SiblingClass as the module/__init__.py file executes from .sibling import * statement to create that binding. But then, if executing import module.submodule resulting in the avoidance of that flatten import, we get the following scenario:
import module.submodule
# module.submodule gets imported
from module import SiblingClass
# ImportError will occur
Why is that? This is simply due to how Python imports a file - the source file is executed in its entirety once to assign imports, function and class declarations to the designated names, and be registered to sys.modules under its import name. Importing the module again will not execute the file again, thus if the from .sibling import * statement was not executed during its initial import (i.e. import module.submodule), it will never be executed again during subsequent import of the same module, as the copy produced by the initial import assigned to its module entry in sys.module is returned (unless the module was reloaded manually, the code for the module will be executed again).
You may verify this fact by putting in a print statement into a file, import the corresponding module to see the output produced, and see that no further output will be produced on subsequent import of that module (related: What happens when a module is imported twice?).
Effectively, the desired functionality as described in the question cannot be implemented in Python.
A related thread on this topic: How to only import sub module without exec __init__.py in the package
This is not a complete solution, but standalone py.test (ignore __init__.py files) proposes setting a global flag to detect when in test. This corrects the problem for tests at least, provided the concerned modules don't call each other.
I am a python beginne, and am currently learning import modules in python.
So my question is:
Suppose I currently have three python files, which is module1.py, module2.py, and module3.py;
In module1.py:
def function1():
print('Hello')
In module2.py, in order to use those functions in module1.py:
import module1
#Also, I have some other public functions in this .py file
def function2():
print('Goodbye')
#Use the function in module1.py
if __name__ == '__main__':
module1.function1();
function2();
In module3.py, I would like to use both the functions from module1.py and module2.py.
import module1
import module2
def function3():
print('Nice yo meet you');
if __name__ == '__main__':
module1.function1()
function3()
module2.function2()
Seems like it works. But my questions are mainly on module3.py. The reason is that in module3.py, I imported both module1 and module2. However, module1 is imported by module2 already. I am just wondering if this is a good way to code? Is this effective? Should I do this? or Should I just avoid doing this and why?
Thank you so much. I am just a beginner, so if I ask stupid questions, please forgive me. Thank you!!
There will be no problem if you avoid circular imports, that is you never import a module that itself imports the current importing module.
A module does not see the importer namespace, so imports in the importer code don't become globals to the imported module.
Also module top-level code will run on first import only.
Edit 1:
I am answering Filipe's comments here because its easier.
"There will be no problem if you avoid circular imports" -> This is incorrect, python is fine with circular imports for the most part."
The fact that you sensed some misconception of mine, doesn't make that particular statement incorrect. It is correct and it is good advice.
(Saying it's fine for the most part looks a bit like saying something will run fine most of time...)
I see what you mean. I avoid it so much that I even thought your first example would give an error right away (it doesn't). You mean there is no need to avoid it because most of the time (actually given certain conditions) Python will go fine with it. I am also certain that there are cases where circular imports would be the easiest solution. That doesn't mean we should use them if we have a choice. That would promote the use of a bad architecture, where every module starts depending on every other.
It also means the coder has to be aware of the caveats.
This link I found here in SO states some of the worries about circular imports.
The previous link is somewhat old so info can be outdated by newer Python versions, but import confusion is even older and still apllies to 3.6.2.
The example you give works well because relevant or initialization module code is wrapped in a function and will not run at import time. Protecting code with an if __name__ == "__main__": also removes it from running when imported.
Something simple like this (the same example from effbot.org) won't work (remember OP says he is a beginner):
# file y.py
import x
x.func1()
# file x.py
import y
def func1():
print('printing from x.func1')
On your second comment you say:
"This is also incorrect. An imported module will become part of the namespace"
Yes. But I didn't mention that, nor its contrary. I just said that an imported module code doesn't know the namespace of the code making the import.
To eliminate the ambiguity I just meant this:
# w.py
def funcw():
print(z_var)
# z.py
import w
z_var = 'foo'
w.funcw() # error: z_var undefined in w module namespace
Running z.py gives the stated error. That's all that I meant.
Now going further, to get the access we want, we go circular...
# w.py
import z # go circular
def funcw():
'''Notice that we gain access not to the z module that imported
us but to the z module we import (yes its the same thing but
carries a different namespace). So the reference we obtain
points to a different object, because it really is in a
different namespace.'''
print(z.z_var, id(z.z_var))
...and we protect some code from running with the import:
# z.py
import w
z_var = ['foo']
if __name__ == '__main__':
print(z_var, id(z_var))
w.funcw()
By running z.py we confirm the objects are different (they can be the same with immutables, but that is python kerning - internal optimization, or implementation details - at work):
['foo'] 139791984046856
['foo'] 139791984046536
Finally I agree with your third comment about being explicit with imports.
Anyway I thank your comments. I actually improved my understanding of the problem because of them (we don't learn much about something by just avoiding it).
I'm trying to do the following in python 2.6.
my_module.py:-
from another_module import another_factory
def my_factory(name):
pass
another_module.py:-
from my_module import my_factory
def another_factory(name):
pass
Both modules in the same folder.
It gives me the error:
Error: cannot import name my_factory
As seen from the comments, you are trying to do a circle import which is impossible.
If in your module A you try to import something from the module B, and when loading the module B (to satisfy this dependency) you are trying to import something from the module A, you are where you started and you got a circle import: A needs B and B needs A!!, it is somehow like saying that A needs A, which is quite unlogic.
For instance:
# moduleA
from moduleB import functionB
...
So the interpreter tries to load the moduleB, which looks like the following:
# moduleB
from moduleA import functionA
...
And goes back to the moduleA, which tries again to import B, and, etc. Therefore python just raises the error and stops the insanity for a greater good.
Dependencies don't work like this. Define what module needs the other one, and just do a simple import. In your example, it seems that another_module needs my_module, so change my_module and eliminate the dependency on another_module.
If both modules actually need each other, it is a clear sign that they belong to the same logical concept, and should be merged.
PD: in some cases to avoid huge files, you can split a logical unit in two, and to avoid the circle dependencies, you write your imports inside of the functions (which are not executed at load time), so that there is not a circle. This is however in general something to avoid.
The real question is... do you consider each file as a module or are they part of a package ?
Trying to import modules outside a package is sometimes painful. You should rather build a package by simply creating an empty __init__.py module in the directory. Though, if you have
__init__.py
my_module.py
another_module.py
If you have te following function in my_module.py,
def my_factory(x):
return x * x
You should be able to access the my_factory() function from another_module.py by writing this :
from my_module import my_factory
But, if you don't have the __init__.py file/module, the import function will be (somehow) lost and will only use the sys.path for searching other modules. You may then add the following lines (before the import) in the another_module.py file :
sys.path.append(os.path.dirname(os.path.expanduser('.')))
You may also use the various packages available to help importing modules, like imp or import_file (see the documentation). Or you can decide to use load_source (also see the doc : https://docs.python.org/2/library/imp.html)
How can I find out what file is importing a particular file in python?
Consider the following example:
#a.py
import cmn
....
#b.py
import cmn
...
#cmn.py
#Here, I want to know which file (a.py or b.py)
#is importing this one.
#Is it possible to do this?
...
All the files a.py, b.py and cmn.py are in the same directory.
Why do I want to do this?
In C/C++, they have include feature. What i want to do can illuminate by the C/C++ code.
//a.cpp
....
#define SOME_STUFF ....
#include "cmn.h"
//b.cpp
...
#define SOME_STUFF ....
#include "cmn.h"
//cmn.h
//Here, I'll define some functions/classes that will use the symbol define
//in the a.cpp or b.cpp
...
....code refer to the SOME_STUFF.....
In C/C++, we can use this method to reuse sourecode.
Now return to my python code.
When a.py import cmn.py, i hope to run cmn.py and the cmn.py will refer to the symbol defined in the a.py.
When b.py import cmn.py, i hope to run cmn.py and the cmn.py will refer to the symbol defined in the b.py.
The namedtuple code in the collections module has an example of how (and when) to do this:
#cmn.py
import sys
print 'I am being imported by', sys._getframe(1).f_globals.get('__name__')
One limitation of this approach is that the outermost module is always named __main__. If that is the case, the name of the outermost module can be determined from sys.argv[0].
A second limitation is that if the code using sys._getframe is in the module scope it is only executed on the first import of cmn.py. You'd need to call a function of some sort after imports if you want to monitor all imports of the module.
Well, this is a kind of bizarre thing to do. You haven't explained why you want to know what is importing your module, so I can't actually help you solve your problem. You also haven't explained how or when you want to know the importing module.
def who_imports(studied_module):
for loaded_module in sys.modules.values():
for module_attribute in dir(loaded_module):
if getattr(loaded_module, module_attribute) is studied_module:
yield loaded_module
This will give you an iterator over all the modules which use your module as a top-level object. It won't find modules that do from cmn import *, and the list will change over time.
>>> import os
>>> for m in who_imports(os):
... print m.__name__
...
site
__main__
posixpath
genericpath
posixpath
linecache
You'd need to install an import hook that tracks all imports. See PEP 302 and http://docs.python.org/dev/py3k/library/importlib.html. However, as the comments above point out, there is probably a better way to structure your code.
I've done what I shouldn't have done and written 4 modules (6 hours or so) without running any tests along the way.
I have a method inside of /mydir/__init__.py called get_hash(), and a class inside of /mydir/utils.py called SpamClass.
/mydir/utils.py imports get_hash() from /mydir/__init__.
/mydir/__init__.py imports SpamClass from /mydir/utils.py.
Both the class and the method work fine on their own but for some reason if I try to import /mydir/, I get an import error saying "Cannot import name get_hash" from /mydir/__init__.py.
The only stack trace is the line saying that __init__.py imported SpamClass. The next line is where the error occurs in in SpamClass when trying to import get_hash. Why is this?
This is a pretty easy problem to encounter. What's happening is this that the interpreter evaluates your __init__.py file line-by line. When you have the following code:
import mydir.utils
def get_hash(): return 1
The interpreter will suspend processing __init__.py at the point of import mydir.utils until it has fully executed 'mydir/utils.py' So when utils.py attempts to import get_hash(), it isn't defined because the interpreter hasn't gotten to it's definition yet.
To add to what the others have said, another good approach to avoiding circular import problems is to avoid from module import stuff.
If you just do standard import module at the top of each script, and write module.stuff in your functions, then by the time those functions run, the import will have finished and the module members will all be available.
You then also don't have to worry about situations where some modules can update/change one of their members (or have it monkey-patched by a naughty third party). If you'd imported from the module, you'd still have your old, out-of-date copy of the member.
Personally, I only use from-import for simple, dependency-free members that I'm likely to refer to a lot: in particular, symbolic constants.
In absence of more information, I would say you have a circular import that you aren't working around. The simplest, most obvious fix is to not put anything in mydir/__init__.py that you want to use from any module inside mydir. So, move your get_hash function to another module inside the mydir package, and import that module where you need it.