Adding code to __init__.py - python

I'm taking a look at how the model system in django works and I noticed something that I don't understand.
I know that you create an empty __init__.py file to specify that the current directory is a package. And that you can set some variable in __init__.py so that import * works properly.
But django adds a bunch of from ... import ... statements and defines a bunch of classes in __init__.py. Why? Doesn't this just make things look messy? Is there a reason that requires this code in __init__.py?

All imports in __init__.py are made available when you import the package (directory) that contains it.
Example:
./dir/__init__.py:
import something
./test.py:
import dir
# can now use dir.something
EDIT: forgot to mention, the code in __init__.py runs the first time you import any module from that directory. So it's normally a good place to put any package-level initialisation code.
EDIT2: dgrant pointed out to a possible confusion in my example. In __init__.py import something can import any module, not necessary from the package. For example, we can replace it with import datetime, then in our top level test.py both of these snippets will work:
import dir
print dir.datetime.datetime.now()
and
import dir.some_module_in_dir
print dir.datetime.datetime.now()
The bottom line is: all names assigned in __init__.py, be it imported modules, functions or classes, are automatically available in the package namespace whenever you import the package or a module in the package.

It's just personal preference really, and has to do with the layout of your python modules.
Let's say you have a module called erikutils. There are two ways that it can be a module, either you have a file called erikutils.py on your sys.path or you have a directory called erikutils on your sys.path with an empty __init__.py file inside it. Then let's say you have a bunch of modules called fileutils, procutils, parseutils and you want those to be sub-modules under erikutils. So you make some .py files called fileutils.py, procutils.py, and parseutils.py:
erikutils
__init__.py
fileutils.py
procutils.py
parseutils.py
Maybe you have a few functions that just don't belong in the fileutils, procutils, or parseutils modules. And let's say you don't feel like creating a new module called miscutils. AND, you'd like to be able to call the function like so:
erikutils.foo()
erikutils.bar()
rather than doing
erikutils.miscutils.foo()
erikutils.miscutils.bar()
So because the erikutils module is a directory, not a file, we have to define it's functions inside the __init__.py file.
In django, the best example I can think of is django.db.models.fields. ALL the django *Field classes are defined in the __init__.py file in the django/db/models/fields directory. I guess they did this because they didn't want to cram everything into a hypothetical django/db/models/fields.py model, so they split it out into a few submodules (related.py, files.py, for example) and they stuck the made *Field definitions in the fields module itself (hence, __init__.py).

Using the __init__.py file allows you to make the internal package structure invisible from the outside. If the internal structure changes (e.g. because you split one fat module into two) you only have to adjust the __init__.py file, but not the code that depends on the package. You can also make parts of your package invisible, e.g. if they are not ready for general usage.
Note that you can use the del command, so a typical __init__.py may look like this:
from somemodule import some_function1, some_function2, SomeObject
del somemodule
Now if you decide to split somemodule the new __init__.py might be:
from somemodule1 import some_function1, some_function2
from somemodule2 import SomeObject
del somemodule1
del somemodule2
From the outside the package still looks exactly as before.

"We recommend not putting much code in an __init__.py file, though. Programmers do not expect actual logic to happen in this file, and much like with from x import *, it can trip them up if they are looking for the declaration of a particular piece of code and can't find it until they check __init__.py. "
-- Python Object-Oriented Programming Fourth Edition Steven F. Lott Dusty Phillips

Related

Correctly setting imports in a file meant to be imported

I'm building a small Python package. Following https://packaging.python.org/tutorials/packaging-projects/, I have a file structure like so:
pkg/
- src/
- pkg/
- __init__.py
- file.py
whith ClassA, ClassB, etc. defined in file.py
I've installed this package on my system. I can now do things like import pkg.file
in an interpreter, which is great. However, it gives me access to whatever not starting with _ in file.py; including all the imports, global variables, etc. that live in this file. I'm happy with pkg.file.ClassA; less so with, for instance, pkg.file.itertools, or pkg.file.PI. It just doesn't feel very clean.
What would be the best practice here? Modifying my import statements in file.py as import itertools as _itertools? Some pythonic trickery in the init file? I thought of adding from file import ClassA, ClassB to it, but it doesn't seem very DRY to me. Additionally, file.py is susceptible to being broken into two or more files in the near future.
Alright, so I came up with a two-stage process:
setting __all__ = ['ClassA', 'ClassB'] at the top level in file.py;
adding from .file import * in init.py.
This way, on ìmport pkg I have directly access to my classes in the namespace pkg. As a side effect, I'm quite happy with this flattening of the hierarchy!
pkg.file.whatever are still accessible, a way to circumvent that would be great (for cleanness if anything else) but I can live with it.

Can a module have the same name as the package itself when submodules are used?

Currently, I have a package name (let's say DummyPackage). DummyPackage contains three modules with functions, classes, etc. So the directory structure looks like this:
project_dir/
__init__.py
DummyPackage/
__init__.py
Module1/
__init__.py
module_x.py
module_y.py
Module2/
__init__.py
module_z.py
So importing methods from modules looks like this
from DummyPackage.Module1.module_x import method_x
We are adding new stuff to the project and I would like to create a module, with the name DummyProject, which should be importable like this
from DummyProject import new_method
I assumed, only adding file DummyPackage.py would be enough, but apparently, it's not. I tried to add it to the project_dir/ dir and to DummyPackage/ dir, but neither works.
Is it because of name conflict? Is it possible to have a code like this?
import DummyPackage
from DummyPackage.Module1.module_x import method_x
DummyPackage.new_method
method_x
to put my three comments in an answer:
First let me explain relative imports using the modules you already have, if you wanted to import module_x from module_y you can do this:
module_y.py
from .module_x import method_x
or similarly in module_z.py
from ..Module1.module_x import method_x
so depending on the location of your DummyProject in the package Intra-package References may be all you need.
As for the second part yes it is possible to have (runnable) code like this:
import DummyPackage
from DummyPackage.Module1.module_x import method_x
DummyPackage.new_method
method_x
in this case it looks like you want new_method to be a package level variable. To quote this great answer:
In addition to labeling a directory as a Python package and defining __all__, __init__.py allows you to define any variable at the package level.
I highly recommend taking a look at the source code for json/__init__.py in the standard library if you want an real world example.
Or as an example with your setup to be able to import method_x right from the package you would just need to add this to the top level __init__.py:
from .Module1.module_x import method_x
then from any file importing the package you could do this:
import DummyPackage
DummyPackage.method_x
(Although obviously you would do it for new_method according to where you place DummyProject)

Import local packages in python

i've run through many posts about this, but still doesn't seem to work. The deal is pretty cut. I've the got the following hierarchy.
main.py
DirA/
__init__.py
hello.py
DirB/
__init__.py
foo.py
bla.py
lol.py
The__init__.py at DirA is empty. The respective one at DirB just contains the foo module.
__all__.py = ["foo"]
The main.py has the following code
import DirA
import DirB
hey() #Def written at hello.py
foolish1() #Def written at foo.py
foolish2() #Def written at foo.py
Long story short, I got NameError: name 'foo' is not defined. Any ideas? Thanks in advance.
You only get what you import. Therefore, in you main, you only get DirA and DirB. You would use them in one of those ways:
import DirA
DirA.something_in_init_py()
# Importing hello:
import DirA.hello
DirA.hello.something_in_hello_py()
# Using a named import:
from DirA.hello import something_in_hello_py
something_in_hello_py()
And in DirB, just make the __init__.py empty as well. The only use of __all__ is for when you want to import *, which you don't want because, as they say, explicit is better than implicit.
But in case you are curious, it would work this way:
from DirB import *
something_in_dirb()
By default the import * will import everything it can find that does not start with an underscore. Specifying a __all__ restricts what it imported to the names defined in __all__. See this question for more details.
Edit: about init.
The __init__.py is not really connected to the importing stuff. It is just a special file with the following properties:
Its existence means the directory is a python package, with several modules in it. If it does not exist, python will refuse to import anything from the directory.
It will always be loaded before loading anything else in the directory.
Its content will be available as the package itself.
Just try it put this in DirA/__init__.py:
foo = 42
Now, in your main:
from DirA import foo
print(foo) # 42
It can be useful, because you can import some of your submodules in the __init__.py to hide the inner structure of your package. Suppose you build an application with classes Author, Book and Review. To make it easier to read, you give each class its own file in a package. Now in your main, you have to import the full path:
from myapp.author import Author
from myapp.book import Book
from myapp.review import Review
Clearly not optimal. Now suppose you put those exact lines above in your __init__.py, you may simplify you main like this:
from myapp import Author, Book, Review
Python will load the __init__.py, which will in turn load all submodules and import the classes, making them available on the package. Now your main does not need to know where the classes are actually implemented.
Have you tried something like this:
One way
from DirA import hello
Another way
from DirA.hello import hey
If those don't work then append a new system path
You need to import the function itself:
How to call a function from another file in Python?
In your case:
from DirA import foolish1, foolish2

How to organize code with __init__.py?

I am just starting off using google app engine, and have been looking around for good practices and code organization. Most of my problems lie from confusion of __init__.py.
My current test structure looks like
/website
main.py
/pages
__init__.py #1
blog.py
hello2.py
hello.py
/sub
__init__.py #2
base.py
I am trying to use main.py as a file that simply points to everything in /pages and /pages/sub. Most modules in /pages share almost all the same imports (ex. import urllib), is there a way to define that everything in /pages imports what I want rather than adding it in every individual module?
Currently in __init__.py #1 I have
from sub.base import *
Yet my module blog.py says BaseHandler (a function in base.py) not defined.
My end goal is to have something like ...
main.py
from pages import *
#be able to call any function in /pages without having to do blog.func1() or hello.func2()
#rather just func1() and func2()
And to be able to share common imports for modules in /pages in __init__.py. So that they share for example urllib and all functions from base.py. Thank you for taking the time to read this post, I look forward to your insight.
Sounds like you think __init__.py is an initializer for the other modules in the package. It is not. It turns pages into a package (allowing its files and subdirectories to be modules), and it is executed, like a normal module would be, when your program calls import pages. Imagine that it's named pages.py instead.
So if you really want to dump everything into the same namespace, init #2 can contain from base import * (which will import everything in base to the namespace of sub), and blog.py can contain from sub import *. Got it?

How to reference to the top-level module in Python inside a package?

In the below hierachy, is there a convenient and universal way to reference to the top_package using a generic term in all .py file below? I would like to have a consistent way to import other modules, so that even when the "top_package" changes name nothing breaks.
I am not in favour of using the relative import like "..level_one_a" as relative path will be different to each python file below. I am looking for a way that:
Each python file can have the same import statement for the same module in the package.
A decoupling reference to "top_package" in any .py file inside the package, so whatever name "top_package" changes to, nothing breaks.
top_package/
__init__.py
level_one_a/
__init__.py
my_lib.py
level_two/
__init__.py
hello_world.py
level_one_b/
__init__.py
my_lib.py
main.py
This should do the job:
top_package = __import__(__name__.split('.')[0])
The trick here is that for every module the __name__ variable contains the full path to the module separated by dots such as, for example, top_package.level_one_a.my_lib. Hence, if you want to get the top package name, you just need to get the first component of the path and import it using __import__.
Despite the variable name used to access the package is still called top_package, you can rename the package and if will still work.
Put your package and the main script into an outer container directory, like this:
container/
main.py
top_package/
__init__.py
level_one_a/
__init__.py
my_lib.py
level_two/
__init__.py
hello_world.py
level_one_b/
__init__.py
my_lib.py
When main.py is run, its parent directory (container) will be automatically added to the start of sys.path. And since top_package is now in the same directory, it can be imported from anywhere within the package tree.
So hello_world.py could import level_one_b/my_lib.py like this:
from top_package.level_one_b import my_lib
No matter what the name of the container directory is, or where it is located, the imports will always work with this arrangement.
But note that, in your original example, top_package it could easily function as the container directory itself. All you would have to do is remove top_package/__init__.py, and you would be left with efectively the same arrangement.
The previous import statement would then change to:
from level_one_b import my_lib
and you would be free to rename top_package however you wished.
You could use a combination of the __import__() function and the __path__ attribute of a package.
For example, suppose you wish to import <whatever>.level_one_a.level_two.hello_world from somewhere else in the package. You could do something like this:
import os
_temp = __import__(__path__[0].split(os.sep)[0] + ".level_one_a.level_two.hello_world")
my_hello_world = _temp.level_one_a.level_two.hello_world
This code is independent of the name of the top level package and can be used anywhere in the package. It's also pretty ugly.
This works from within a library module:
import __main__ as main_package
TOP_PACKAGE = main_package.__package__.split('.')[0]
I believe #2 is impossible without using relative imports or the named package. You have to specify what module to import either by explicitly calling its name or using a relative import. otherwise how would the interpreter know what you want?
If you make your application launcher one level above top_level/ and have it import top_level you can then reference top_level.* from anywhere inside the top_level package.
(I can show you an example from software I'm working on: http://github.com/toddself/beerlog/)

Categories