So, this is a set of questions about how to use __init__.py in packages/sub-packages. I have searched, and surprisingly not found a decent answer to this.
If I have the following structure (which is just a simplified example obviously):
my_package/
    __init__.py
    module1.py
    my_sub_package/
        __init__.py
        module2.py
The contents of module1.py are:
my_string = 'Hello'
and the contents of module2.py are:
my_int = 42
First question: importing multiple modules from a package by default
What should be in the __init__.py files?
I can leave them empty, in which case import my_package does little: it imports the package, but the package object effectively contains nothing. That is fine, and is what should happen in most cases.
What I'd like in this case though is for import my_package to allow me to use my_package.module1.my_string and my_package.my_sub_package.module2.my_int.
I can add __all__ = ['module1'] to my_package/__init__.py and __all__ = ['module2'] to my_package/my_sub_package/__init__.py, but as I understand it this only affects wildcard imports (so only from my_package import * and from my_package.my_sub_package import *).
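For example:

import my_package
my_package.module1.my_string   # AttributeError: module 'my_package' has no attribute 'module1'

from my_package import *       # __all__ applies here
module1.my_string              # 'Hello'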
I can achieve this by putting
import my_package.module1
import my_package.my_sub_package
in my_package/__init__.py and
import my_package.my_sub_package.module2
in my_package/my_sub_package/__init__.py, but is this a bad idea? It creates a (seemingly) infinite series of my_package.my_package.my_package.... when I do this in the Python interpreter (3.5.5).
Separate, but highly related, question: using modules to keep files reasonably sized
If I wanted instead to be able to do the following
import my_package
print(my_package.my_string)
print(str(my_package.my_sub_package.my_int))
i.e. I want to use module1 and module2 purely to separate code into smaller, more readable files, which would matter if each package actually contained many modules (it obviously doesn't in this trivial example, but easily could)
is doing from my_package.module1 import * in my_package/__init__.py and from my_package.my_sub_package.module2 import * in my_package/my_sub_package/__init__.py a reasonable way to do that? I don't like wildcard imports, but listing every name defined in a (real) module would be impractically verbose.
Third (also highly related) question: avoiding writing the package names in multiple places
Is there a way I can achieve the above without having to write the packages' own names in their source code? I ask because I'd like to avoid changing the name in multiple places if I rename the package (simple in this trivial example, and doable with an IDE or script in reality, but it would be nice to know how to avoid it).
In my_package/__init__.py, use
from . import my_sub_package
etc.
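For the layout above, a minimal sketch would be (this also avoids the my_package.my_package.my_package... chain you saw, which appears because import my_package.module1 inside my_package/__init__.py binds the name my_package inside the package's own namespace):

# my_package/__init__.py
from . import module1
from . import my_sub_package

# my_package/my_sub_package/__init__.py
from . import module2

After which:

import my_package
print(my_package.module1.my_string)              # Hello
print(my_package.my_sub_package.module2.my_int)  # 42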
See for example NumPy's __init__.py, which has from . import random, and allows
import numpy as np
np.random.random
Wildcard imports inside a single package tend to be common, provided you have __all__ defined in the modules and subpackages you import from.
Again an example from NumPy's __init__.py, which has several wildcard imports.
Here's part of that __init__.py:
from . import core
from .core import *
from . import compat
from . import lib
from .lib import *
from . import linalg
from . import fft
from . import polynomial
from . import random
from . import ctypeslib
from . import ma
from . import matrixlib as _mat
from .matrixlib import *
from .compat import long
Notice also the two core import lines. Both the numpy.core submodule and the definitions inside core (functions, classes, etc.) are then available.
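Applied to your example, the same pattern gives you the second question's goal (my_package.my_string), assuming each module defines an __all__:

# my_package/module1.py
__all__ = ['my_string']
my_string = 'Hello'

# my_package/__init__.py
from . import module1
from .module1 import *
from . import my_sub_package

# my_package/my_sub_package/__init__.py (module2 defining __all__ = ['my_int'])
from . import module2
from .module2 import *

Then both my_package.my_string and my_package.my_sub_package.my_int work.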
When in doubt about how to do something, or whether something is good practice, have a look at a few well-known libraries or packages. That can yield some valuable insights.
Related
I am writing a library in Python. It has one package and three modules:
mypackage/
    __init__.py
    utils.py
    fileio.py
    math.py
If I just leave the __init__.py empty, my users must figure out which functions are in which modules, which is annoying. The fact that I have three modules is an implementation detail that is not important.
Maybe I should import the main functions into the __init__.py, like this
from .utils import create_table
from .fileio import save_rows, load_rows
from .math import matrix_inverse
so that my users can just do
import mypackage as myp
rows = myp.load_rows()
Is that best practice?
What about the alternative to put ALL symbols into the __init__.py, such as
from .utils import *
from .fileio import *
from .math import *
And if there are any functions that I don't want to expose, I will prefix them with an underscore. Is that better? It certainly is easier for me.
What if the fileio.py needs to call some functions in the utils.py? I could put
from .utils import *
into the fileio.py, but won't that create a circular or redundant reference? What's the best way to handle this?
Maybe I should import the main functions into the __init__.py, like this [...] Is that best practice?
I wouldn't say there is a "best practice"; it depends on the specific case, but this is surely pretty common: you define a bunch of things in each module and import the relevant ones in __init__.py. This is an easy way to spare users from remembering which submodule has the function they need; however, it can get pretty tedious if you have a lot of functions to import from each module.
What about the alternative to put ALL symbols into the __init__.py, such as
from .utils import *
from .fileio import *
from .math import *
You most likely don't want to do this. Without an __all__, from .utils import * re-exports every public name, including modules your code merely imported (sys, for example) and any helper that isn't underscore-prefixed. You should avoid it.
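For example, if utils.py happens to import sys and has no __all__:

# utils.py (no __all__ defined)
import sys

def create_table():
    ...

# mypackage/__init__.py
from .utils import *

# user code:
import mypackage
mypackage.create_table  # intended
mypackage.sys           # the sys module leaks through as well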
What if the fileio.py needs to call some functions in the utils.py? [...] won't that create a circular or redundant reference?
Yeah, that is something that can happen and you usually want to avoid it at all costs. If you need some functions from utils.py in fileio.py, you should import them explicitly as from .utils import x, y, z. Remember to also always use relative imports when importing things between modules of the same package (i.e. use from .utils import x, not from package.utils import x).
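A minimal sketch, using the function names from your question:

# fileio.py
from .utils import create_table  # explicit and relative, no wildcard

def save_rows(rows):
    table = create_table()  # call signature is illustrative
    ...

def load_rows():
    ...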
A good compromise between the two options you mention, which solves most of the above problems (although not circular imports; you have to avoid those yourself), is to define an __all__ list inside each of your modules specifying which functions should be exported by from x import *, like this:
# utils.py
import sys

__all__ = ['user_api_one', 'user_api_two']

def user_api_one():
    ...

def user_api_two():
    ...

def internal_function():
    ...
If you properly define an __all__ list in all your modules, then in your __init__.py you will be able to safely do:
from .utils import *
from .fileio import *
from .math import *
This will only import the relevant functions (for example user_api_one and user_api_two from utils, but not internal_function or sys).
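Continuing the sketch above, the user-facing behaviour would be:

import mypackage as myp

myp.user_api_one()       # re-exported by __init__.py via __all__
myp.internal_function()  # AttributeError: not exported
myp.sys                  # AttributeError: not exported either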
I have been creating a few modules organized by purpose, and each module contains a number of functions. I would like to bundle these individual modules into a larger "package" that other users can import from a shared location.
Currently, I have all of my modules in one folder called python_modules, and I have appended this path to sys.path so I can easily import my individual modules as needed.
However, I would like to instead import a single package, that contains all of my modules, so I don't have to import each one individually. I know that I could put all my modules into one file, but that doesn't seem like a good way to organize my processes.
Currently, I have the following files in my python_modules folder:
__init__.py
load_data_functions.py
parse_data_functions.py
network_functions.py
counting_function.py
math_functions.py
...
other_functions.py
The __init__.py file is empty. The other modules all contain various functions, and some depend on others. For example, network_functions.py relies on load_data_functions.py and parse_data_functions.py.
As I said, I want to package all of these modules into a larger package that I can share with others, and so we don't have to import each module independently.
A package is just a bunch of modules. You already have an __init__.py file, so python_modules is already a package. Note, though, that with an empty __init__.py, import python_modules does not import the submodules for you: you either import each one explicitly, or add imports to __init__.py as shown below.
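With the __init__.py still empty, the explicit form looks like this (some_function and some_other_function are just placeholder names):

import python_modules.load_data_functions
import python_modules.parse_data_functions

python_modules.load_data_functions.some_function()
python_modules.parse_data_functions.some_other_function()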
assuming "I would like to instead import a single package" means you want to be able to something like:
import python_modules
port = python_modules.network_functions.get_port()
your __init__.py file should look like:
from . import load_data_functions
from . import parse_data_functions
from . import network_functions
...
if you'd like to be able to do:
import python_modules
port = python_modules.get_port()
your __init__.py file should look like:
from .load_data_functions import *
from .parse_data_functions import *
from .network_functions import *
...
As for modules within the package referring to each other, you can use the same idea: e.g. at the top of network_functions.py you'd want to put from . import load_data_functions. You should structure things so as to avoid circular imports, as in the sketch below.
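For example (get_port is just the illustrative name from above):

# network_functions.py
from . import load_data_functions
from . import parse_data_functions

def get_port():
    # use load_data_functions.some_function(), etc.
    ...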
I have made a package with the following structure:
test.py
package1/
    __init__.py
    module1.py
    module2.py
In the test.py file, with the code
from package1 import *
what I want it to do is to
from numpy import *
from module1 import *
from module2 import *
What should I write in __init__.py file to achieve this?
Currently in my __init__.py file I have
from numpy import *
__all__ = ['module1','module2']
and this doesn't give me what I wanted. This way, numpy wasn't imported at all, and the modules are imported as
import module1
rather than
from module1 import *
If you want this, your __init__.py should contain just what you want (note that on Python 3 the submodule imports must be relative):
from numpy import *
from .module1 import *
from .module2 import *
When you do from package import *, it imports all names defined in the package's __init__.py.
Note that this could become awkward if there are name clashes among the modules you import. If you just want convenient access to the functions in those modules, I would suggest using instead something like:
import numpy as np
import module1 as m1
import module2 as m2
That is, import the modules (not their contents), but under shorter names. You can then still access numpy stuff with something like np.add, which adds only three characters of typing but guards against name clashes among different modules.
I second BrenBarn's suggestion, but be warned: importing everything into one single namespace using from x import * is generally a bad idea, unless you know for certain that there won't be any conflicting names.
I think it's still safer to use import package.module, though it does take extra keystrokes.
Let's say I have such a directory structure:
Documents/
    thesis_program/
        __init__.py
        classes.py
        utils.py
    GE_Test.py
    GE_Test_fail.py
classes.py and utils.py contains some classes and functions.
GE_Test.py and GE_Test_fail.py contain exactly the same code, except for the import part.
In GE_Test.py I import classes and utils this way:
from utils import execute
from classes import Grammatical_Evolution
While in GE_Test_fail.py, I import classes and utils this way:
from thesis_program.utils import execute
from thesis_program.classes import Grammatical_Evolution
And unexpectedly I get a different result. Is there anything wrong here?
Do I import the modules correctly?
I can ensure that the results should be the same, because I generate the random numbers with a fixed seed.
Also, classes.py somehow depends on utils.py, since I keep several common functions in utils.py. I suspect that utils is also a name used by the system, so in the second case (GE_Test_fail.py) the system's utils overrides my utils.py. But that doesn't quite make sense to me.
The complete source code of classes.py and utils.py is available here (if it helps to discover what's wrong): https://github.com/goFrendiAsgard/feature-extractor
And also, the screenshots: https://picasaweb.google.com/111207164087437537257/November25201204?authuser=0&authkey=Gv1sRgCOKxot2a2fTtlAE&feat=directlink
Add the two lines below to the test files that live outside your thesis folder, and keep everything else the same; for example, in GE_Test.py:
import sys
sys.path.insert(0,"/path to your thesis folder/thesis_program")
from utils import execute
from classes import Grammatical_Evolution
EDIT:
Or use this to make it more dynamic
(caution: os.path.abspath('./thesis_program') resolves against the current working directory, so it only works when you run your test files from the directory that contains thesis_program; if you hard-code the absolute path as above, you can run them from anywhere on your system)
import os, sys
lib_path = os.path.abspath('./thesis_program')
sys.path.insert(0,lib_path)
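If you do want it dynamic without depending on the working directory, you can anchor the path to the test file itself (a sketch, assuming GE_Test.py sits next to thesis_program/ as in the layout above):

import os, sys

# resolve thesis_program/ relative to this file, not the working directory
lib_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'thesis_program')
sys.path.insert(0, lib_path)

from utils import execute
from classes import Grammatical_Evolution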
I'd think this could be answered easily, but it isn't. For as long as I've been searching for an answer, I keep thinking that I'm overlooking something simple.
I have a python workspace with the following package structure:
MyTestProject/
    src/
        TestProjectNamespace/
            __init__.py
            Module_A.py
            Module_B.py
SecondTestProject/
    src/
        SecondTestProjectNamespace/
            __init__.py
            Module_1.py
            Module_2.py
            ...
            Module_10.py
Note that TestProjectNamespace has a reference to SecondTestProjectNamespace.
In TestProjectNamespace, I need to import everything in SecondTestProjectNamespace. I could import one module at a time with the following statement(s):
from SecondTestProjectNamespace.Module_1 import *
from SecondTestProjectNamespace.Module_2 import *
...but this isn't practical if the SecondTestProject has 50 modules in it.
Does Python support a way to import everything in a namespace / package? Any help would be appreciated.
Thanks in advance.
Yes, you can roll this using pkgutil.
Here's an example that lists all packages under twisted (except tests), and imports them:
# -*- Mode: Python -*-
# vi:si:et:sw=4:sts=4:ts=4
import pkgutil
import twisted

for importer, modname, ispkg in pkgutil.walk_packages(
        path=twisted.__path__,
        prefix=twisted.__name__ + '.',
        onerror=lambda x: None):
    # skip tests
    if modname.find('test') > -1:
        continue
    print(modname)
    # gloss over import errors
    try:
        __import__(modname)
    except Exception:
        print('Failed importing', modname)

# show that we actually imported all these, by showing one subpackage is imported
print(twisted.python)
I have to agree with the other posters that star imports are a bad idea.
No. It is possible to set up SecondTestProject to automatically import everything in its submodules, by putting code in __init__.py to do the from ... import * you mention. It's also possible to automate this to some extent using the __import__ function and/or the importlib module (the old imp module is deprecated). But there is no quick and easy way to take a package that isn't set up this way and make it work this way.
It's probably not a good idea anyway. If you have 50 modules, importing everything from all of them into your global namespace is going to cause a proliferation of names, and very likely conflicts among those names.
As others have put it, it might not be a good idea. But there are ways of keeping your namespaces separate (and therefore avoiding naming conflicts) while still having all the modules/sub-packages in a package available to the package user with a single import.
Let's suppose I have a package named "pack", and within it a module named "a.py" defining a variable b. All I want to be able to do is:
>>> import pack
>>> pack.a.b
1
One way of doing this is to put in pack/__init__.py a line that says
from . import a (on Python 2 a plain import a also worked) - thus in your case you'd need fifty such lines, and would have to keep them up to date.
Not that bad.
However, the documentation at http://docs.python.org/tutorial/modules.html#importing-from-a-package - says that if you have a string list named __all__ in your __init__.py file, all module/sub-package names in that list are imported when one does from pack import *
That alone would half-work, but it would require users of your package to use the not-recommended from x import * form.
But -- you can do the "... import *" inside __init__.py itself, after defining the __all__ variable - so all you have to do is to keep the __all__ up to date:
With the TestProjectNamespace/__init__.py being like this:
__all__ = ["Module_A", "Module_B", ...]
from TestProjectNamespace import *
your users would have
TestProjectNamespace.Module_A (and others) available upon import of TestProjectNamespace.
And, of course - you could automate the creation of __all__ - it is just a variable, after all - but I would not recommend that.
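For completeness, automating __all__ could look like this (a sketch using pkgutil inside the package's own __init__.py; again, not something I'd recommend):

# TestProjectNamespace/__init__.py
import pkgutil

# collect the name of every submodule in this package
__all__ = [name for _, name, _ in pkgutil.iter_modules(__path__)]
from TestProjectNamespace import *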
Does Python support a way to import everything in a namespace / package?
No. A package is not a super-module -- it's a collection of modules grouped together.
At least part of the reason is that it's not trivial to determine what 'everything' means inside a folder: there are problems like network drives, soft links, hard links, ...