I have a number of functions that I have written myself over time, over time I have placed each of these functions in their own modules (as I thought this was best practice).
I would like take the next step and organise these modules that I can import in fewer lines of code. However, I'm finding that I need to need to use a lot of lines of 'repeated' code to import the functions I want to have access to.
my_functions
-src
--__init__.py
--func1.py
--func2.py
--func3.py
I have followed this talk tutorial to build this collection of modules into a package thinking that I could use something like
import my_fucntions as mf
but I have to import each module
import func1 as fu1
import func2 as fu2
...
a = fu1.func1()
my question is, what do I have to do to be able to import my functions as a package, the same way that I would import something like pandas
import my_functions as mf
a = mf.func1(arg)
I'm finding it difficult to find either a tutorial or a clear simple example of how to structure this so any guidance at all would be useful. If it's just not doable without building something as complex and pandas or numpy thats ok too, I just said I'd try one last shot at it.
Create a __init__.py file in both my_functions directory and src directory.
Now in the my_functions __init__.py file. No need to edit __init__ file inside src.
from src.func1 import func1
from src.func2 import func2
__all__ = [
'func1',
'func2'
]
Now you can use my_functions like below
import my_functions
my_functions.f1()
__all__ gets called when you try to import my_functions and allow you to import things you have mentioned using dot notation.
To call a python script from the root call it using dot notation as below
python3 -m <some_directory>.<file>
Suppose I have a module file like this:
# my_module.py
print("hello")
Then I have a simple script:
# my_script.py
import my_module
This will print "hello".
Let's say I want to "override" the print() function so it returns "world" instead. How could I do this programmatically (without manually modifying my_module.py)?
What I thought is that I need somehow to modify the source code of my_module before or while importing it. Obvisouly, I cannot do this after importing it so solution using unittest.mock are impossible.
I also thought I could read the file my_module.py, perform modification, then load it. But this is ugly, as it will not work if the module is located somewhere else.
The good solution, I think, is to make use of importlib.
I read the doc and found a very intersecting method: get_source(fullname). I thought I could just override it:
def get_source(fullname):
source = super().get_source(fullname)
source = source.replace("hello", "world")
return source
Unfortunately, I am a bit lost with all these abstract classes and I do not know how to perform this properly.
I tried vainly:
spec = importlib.util.find_spec("my_module")
spec.loader.get_source = mocked_get_source
module = importlib.util.module_from_spec(spec)
Any help would be welcome, please.
Here's a solution based on the content of this great talk. It allows any arbitrary modifications to be made to the source before importing the specified module. It should be reasonably correct as long as the slides did not omit anything important. This will only work on Python 3.5+.
import importlib
import sys
def modify_and_import(module_name, package, modification_func):
spec = importlib.util.find_spec(module_name, package)
source = spec.loader.get_source(module_name)
new_source = modification_func(source)
module = importlib.util.module_from_spec(spec)
codeobj = compile(new_source, module.__spec__.origin, 'exec')
exec(codeobj, module.__dict__)
sys.modules[module_name] = module
return module
So, using this you can do
my_module = modify_and_import("my_module", None, lambda src: src.replace("hello", "world"))
This doesn't answer the general question of dynamically modifying the source code of an imported module, but to "Override" or "monkey-patch" its use of the print() function can be done (since it's a built-in function in Python 3.x). Here's how:
#!/usr/bin/env python3
# my_script.py
import builtins
_print = builtins.print
def my_print(*args, **kwargs):
_print('In my_print: ', end='')
return _print(*args, **kwargs)
builtins.print = my_print
import my_module # -> In my_print: hello
I first needed to better understand the import operation. Fortunately, this is well explained in the importlib documentation and scratching through the source code helped too.
This import process is actually split in two parts. First, a finder is in charge of parsing the module name (including dot-separated packages) and instantiating an appropriate loader. Indeed, built-in are not imported as local modules for example. Then, the loader is called based on what the finder returned. This loader get the source from a file or from a cache, and executed the code if the module was not previously loaded.
This is very simple. This explains why I actually did not need to use abstract classes from importutil.abc: I do not want to provide my own import process. Instead, I could create a subclass inherited from one of the classes from importuil.machinery and override get_source() from SourceFileLoader for example. However, this is not the way to go because the loader is instantiated by the finder so I do not have the hand on which class is used. I cannot specify that my subclass should be used.
So, the best solution is to let the finder do its job, and then replace the get_source() method of whatever Loader has been instantiated.
Unfortunately, by looking trough the code source I saw that the basic Loaders are not using get_source() (which is only used by the the inspect module). So my whole idea could not work.
In the end, I guess get_source() should be called manually, then the returned source should be modified, and finally the code should be executed. This is what Martin Valgur detailed in his answer.
If compatibility with Python 2 is needed, I see no other way than reading the source file:
import imp
import sys
import types
module_name = "my_module"
file, pathname, description = imp.find_module(module_name)
with open(pathname) as f:
source = f.read()
source = source.replace('hello', 'world')
module = types.ModuleType(module_name)
exec(source, module.__dict__)
sys.modules[module_name] = module
If importing the module before the patching it is okay, then a possible solution would be
import inspect
import my_module
source = inspect.getsource(my_module)
new_source = source.replace('"hello"', '"world"')
exec(new_source, my_module.__dict__)
If you're after a more general solution, then you can also take a look at the approach I used in another answer a while ago.
My solution updates the source file, which works for the inner import situation. The inner import means that transformers.models.albert import modeling_albert from the source file. In such case, even if I use the solution from Martin Valgur, it won't work. So I update the source file. Hope it help the people who have the same trouble with me.
import inspect
from transformers.models.albert import modeling_albert
# Get source
source = inspect.getsource(modeling_albert)
source_before = "AlbertModel(config, add_pooling_layer=False)"
source_after = "AlbertModel(config, add_pooling_layer=True)"
new_source = source.replace(source_before, source_after)
# Update file
file_path = modeling_albert.__spec__.origin
with open(file_path, 'w') as f:
f.write(new_source)
Not elegant, but works for me (may have to add a path):
with open ('my_module.py') as aFile:
exec (aFile.read () .replace (<something>, <something else>))
I simply want to take all my .py files from a single folder (I don't care about the sub-folders for now) and put them into a single module.
The use case I'm having here is that I'm writing some pretty standard object-oriented code and I'm using a single file for every class, and I don't want to have to write from myClass import myClass for every class into my __init__.py. I can't use Python3, so I'm still working with impand reloadand such.
At the moment I'm using
# this is __init__.py
import pkgutil
for loader, name, is_pkg in pkgutil.walk_packages(__path__):
if not is_pkg:
__import__(__name__ + "." + name)
and it doesn't seem to work, it includes the packages but it includes them as modules, so that I have to write MyClass.MyClass for a class that is defined in a file with it's own name. That's silly and I don't like it.
I've been searching forever and I'm just getting more confused how complicated this seemingly standard use case seems to be. Do python devs just write everything into a single file? Or do they always have tons of imports?
Is this something that should be approached in an entirely different way?
What you really want to do
To do the job you need to bind your class names to namespace of your __init__.py script.
After this step you will be able to just from YourPackageName import * and just use your classes directly. Like this:
import YourPackageName
c = YourPackageName.MyClass()
or
from YourPackageName import *
c = MyClass()
Ways to achieve this
You have multiple ways to import modules dynamically: __import__(), __all__.
But.
The only way to bind names into namespace of current module is to use from myClass import myClass statement. Static statement.
In other words, content of each of your __init__.py scripts should be looking like that:
#!/usr/bin/env python
# coding=utf-8
from .MySubPackage import *
from .MyAnotherSubPackage import *
from .my_pretty_class import myPrettyClass
from .my_another_class import myAnotherClass
...
And you should know that even for a dynamic __all__:
It is up to the package author to keep this list up-to-date when a new version of the package is released.
(https://docs.python.org/2/tutorial/modules.html#importing-from-a-package)
So a clear answers to your questions:
Do python devs just write everything into a single file?
No, they don't.
Or do they always have tons of imports?
Almost. But definitely not tons. You need to import each of your modules just once (into an appropriate __init__.py scripts). And then just import whole package or sub-package at once.
Example
Let's assume that there is next package structure:
MyPackage
|---MySubPackage
| |---__init__.py
| |---pretty_class_1.py
| |---pretty_class_2.py
|---__init__.py
|---sleepy_class_1.py
|---sleepy_class_2.py
Content of the MyPackage/MySubPackage/__init__.py:
#!/usr/bin/env python
# coding=utf-8
from .pretty_class_1 import PrettyClass1
from .pretty_class_2 import PrettyClass2
Content of the MyPackage/__init__.py:
#!/usr/bin/env python
# coding=utf-8
from .MySubPackage import *
from .sleepy_class_1 import SleepyClass1
from .sleepy_class_2 import SleepyClass2
As result, now we are able to write next code in our application:
import MyPackage
p = MyPackage.PrettyClass1()
s = MyPackage.SleepyClass2()
or
from MyPackage import *
p = PrettyClass1()
s = SleepyClass2()
I am making a program where a user can save a file, load it back, and name the file the way they want.
How can I dynamically import all attributes of that module?
E.g. something like the pseudocode below:
module_name = input()
from module_name import *
Assume module_name ends with ".py".
I am using python 3.5.
This sounds like a terrible idea, but here is an example of loading math module and using some of its functions.
import importlib
my_module = importlib.import_module('math')
globals().update(my_module.__dict__)
print(globals())
print(sin(pi/2))
(Code improvements by jmd_dk)
It sounds like a bad idea because from x import * will import everything and potentially overwrite local attributes.
Additionally, there are most likely more clear ways to achieve your goals by using a dictionary.
I am a little confused by the multitude of ways in which you can import modules in Python.
import X
import X as Y
from A import B
I have been reading up about scoping and namespaces, but I would like some practical advice on what is the best strategy, under which circumstances and why. Should imports happen at a module level or a method/function level? In the __init__.py or in the module code itself?
My question is not really answered by "Python packages - import by class, not file" although it is obviously related.
In production code in our company, we try to follow the following rules.
We place imports at the beginning of the file, right after the main file's docstring, e.g.:
"""
Registry related functionality.
"""
import wx
# ...
Now, if we import a class that is one of few in the imported module, we import the name directly, so that in the code we only have to use the last part, e.g.:
from RegistryController import RegistryController
from ui.windows.lists import ListCtrl, DynamicListCtrl
There are modules, however, that contain dozens of classes, e.g. list of all possible exceptions. Then we import the module itself and reference to it in the code:
from main.core import Exceptions
# ...
raise Exceptions.FileNotFound()
We use the import X as Y as rarely as possible, because it makes searching for usage of a particular module or class difficult. Sometimes, however, you have to use it if you wish to import two classes that have the same name, but exist in different modules, e.g.:
from Queue import Queue
from main.core.MessageQueue import Queue as MessageQueue
As a general rule, we don't do imports inside methods -- they simply make code slower and less readable. Some may find this a good way to easily resolve cyclic imports problem, but a better solution is code reorganization.
Let me just paste a part of conversation on django-dev mailing list started by Guido van Rossum:
[...]
For example, it's part of the Google Python style guides[1] that all
imports must import a module, not a class or function from that
module. There are way more classes and functions than there are
modules, so recalling where a particular thing comes from is much
easier if it is prefixed with a module name. Often multiple modules
happen to define things with the same name -- so a reader of the code
doesn't have to go back to the top of the file to see from which
module a given name is imported.
Source: http://groups.google.com/group/django-developers/browse_thread/thread/78975372cdfb7d1a
1: http://code.google.com/p/soc/wiki/PythonStyleGuide#Module_and_package_imports
I would normally use import X on module level. If you only need a single object from a module, use from X import Y.
Only use import X as Y in case you're otherwise confronted with a name clash.
I only use imports on function level to import stuff I need when the module is used as the main module, like:
def main():
import sys
if len(sys.argv) > 1:
pass
HTH
Someone above said that
from X import A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P
is equivalent to
import X
import X allows direct modifications to A-P, while from X import ... creates copies of A-P. For from X import A..P you do not get updates to variables if they are modified. If you modify them, you only modify your copy, but X does know about your modifications.
If A-P are functions, you won't know the difference.
Others have covered most of the ground here but I just wanted to add one case where I will use import X as Y (temporarily), when I'm trying out a new version of a class or module.
So if we were migrating to a new implementation of a module, but didn't want to cut the code base over all at one time, we might write a xyz_new module and do this in the source files that we had migrated:
import xyz_new as xyz
Then, once we cut over the entire code base, we'd just replace the xyz module with xyz_new and change all of the imports back to
import xyz
DON'T do this:
from X import *
unless you are absolutely sure that you will use each and every thing in that module. And even then, you should probably reconsider using a different approach.
Other than that, it's just a matter of style.
from X import Y
is good and saves you lots of typing. I tend to use that when I'm using something in it fairly frequently But if you're importing a lot from that module, you could end up with an import statement that looks like this:
from X import A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P
You get the idea. That's when imports like
import X
become useful. Either that or if I'm not really using anything in X very frequently.
I generally try to use the regular import modulename, unless the module name is long, or used often..
For example, I would do..
from BeautifulSoup import BeautifulStoneSoup as BSS
..so I can do soup = BSS(html) instead of BeautifulSoup.BeautifulStoneSoup(html)
Or..
from xmpp import XmppClientBase
..instead of importing the entire of xmpp when I only use the XmppClientBase
Using import x as y is handy if you want to import either very long method names , or to prevent clobbering an existing import/variable/class/method (something you should try to avoid completely, but it's not always possible)
Say I want to run a main() function from another script, but I already have a main() function..
from my_other_module import main as other_module_main
..wouldn't replace my main function with my_other_module's main
Oh, one thing - don't do from x import * - it makes your code very hard to understand, as you cannot easily see where a method came from (from x import *; from y import *; my_func() - where is my_func defined?)
In all cases, you could just do import modulename and then do modulename.subthing1.subthing2.method("test")...
The from x import y as z stuff is purely for convenience - use it whenever it'll make your code easier to read or write!
When you have a well-written library, which is sometimes case in python, you ought just import it and use it as it. Well-written library tends to take life and language of its own, resulting in pleasant-to-read -code, where you rarely reference the library. When a library is well-written, you ought not need renaming or anything else too often.
import gat
node = gat.Node()
child = node.children()
Sometimes it's not possible to write it this way, or then you want to lift down things from library you imported.
from gat import Node, SubNode
node = Node()
child = SubNode(node)
Sometimes you do this for lot of things, if your import string overflows 80 columns, It's good idea to do this:
from gat import (
Node, SubNode, TopNode, SuperNode, CoolNode,
PowerNode, UpNode
)
The best strategy is to keep all of these imports on the top of the file. Preferrably ordered alphabetically, import -statements first, then from import -statements.
Now I tell you why this is the best convention.
Python could perfectly have had an automatic import, which'd look from the main imports for the value when it can't be found from global namespace. But this is not a good idea. I explain shortly why. Aside it being more complicated to implement than simple import, programmers wouldn't be so much thinking about the depedencies and finding out from where you imported things ought be done some other way than just looking into imports.
Need to find out depedencies is one reason why people hate "from ... import *". Some bad examples where you need to do this exist though, for example opengl -wrappings.
So the import definitions are actually valuable as defining the depedencies of the program. It is the way how you should exploit them. From them you can quickly just check where some weird function is imported from.
The import X as Y is useful if you have different implementations of the same module/class.
With some nested try..import..except ImportError..imports you can hide the implementation from your code. See lxml etree import example:
try:
from lxml import etree
print("running with lxml.etree")
except ImportError:
try:
# Python 2.5
import xml.etree.cElementTree as etree
print("running with cElementTree on Python 2.5+")
except ImportError:
try:
# Python 2.5
import xml.etree.ElementTree as etree
print("running with ElementTree on Python 2.5+")
except ImportError:
try:
# normal cElementTree install
import cElementTree as etree
print("running with cElementTree")
except ImportError:
try:
# normal ElementTree install
import elementtree.ElementTree as etree
print("running with ElementTree")
except ImportError:
print("Failed to import ElementTree from any known place")
I'm with Jason in the fact of not using
from X import *
But in my case (i'm not an expert programmer, so my code does not meet the coding style too well) I usually do in my programs a file with all the constants like program version, authors, error messages and all that stuff, so the file are just definitions, then I make the import
from const import *
That saves me a lot of time. But it's the only file that has that import, and it's because all inside that file are just variable declarations.
Doing that kind of import in a file with classes and definitions might be useful, but when you have to read that code you spend lots of time locating functions and classes.