How did the import statement find that module?

Given the following directory structure:
here/
├── app
│   ├── __init__.py
│   ├── json.py
│   └── example.py
└── my_script.py
__init__.py and json.py are empty files.
Contents of my_script.py:
from app import example
Contents of example.py:
import importlib, imp, sys, os
# ensures '' is not in sys.path
sys.path = [p for p in sys.path if p]
# ensures PYTHONPATH, if any, is not over-reaching
os.environ.pop('PYTHONPATH', None)
# ensures we do not see json.py in the cwd
assert not os.path.isfile('json.py')
print '1: ', imp.find_module('json')
print '2: ', __import__('json')
print '3: ', importlib.import_module('json')
import json
json.loads
Now, from the here directory, execute:
python ./my_script.py
You will see that methods 1, 2, and 3 all find the standard library version of the json module.
However, the actual import statement still manages to grab the empty json.py file somehow (the last line fails with AttributeError: 'module' object has no attribute 'loads').
My understanding was that the package's version of json should only have been accessible by its namespace, i.e. from app import json, but that namespacing doesn't seem to apply here.
On Python 3, I cannot reproduce the issue. I also noticed that if we put from __future__ import absolute_import into example.py, the problem goes away.
How does the import statement find the local file, and why does it shadow the core library version?
edit: on another minor note, by the time we have reached the line import json, there is already a json module loaded into sys.modules from the lines above. So why does Python try to import the module again; shouldn't it simply use the one already in the module cache?

You have more or less arrived at the answer here. Python 2.x will by default attempt a package-relative import first, which includes the potential of "shadowing" a top-level module.
See the section on Intra-package References in the Python 2 documentation.
The ability to specify explicit relative imports, as well as from __future__ import absolute_import, was introduced back in Python 2.5; this is explained further in PEP 328. The new behavior (imports are absolute unless explicitly relative) became the default in Python 3. It was adopted in large part precisely to address the problem you raise (shadowing of standard library modules), although it also allows finer control with the multi-level relative import syntax (.. for the parent package, ... for a level further up, and so on).
As for your edit: the module cache is keyed by the fully qualified module name. The implicit relative import machinery inside the app package looks up app.json in sys.modules, not json, so the cached standard library module is never consulted for that lookup, and the empty app/json.py is imported and cached under the separate key app.json.
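For completeness, here is a minimal Python 2 sketch of the fix described above (assuming example.py is imported as part of the app package, as in the question's layout):
# app/example.py -- a minimal Python 2.5+ sketch
from __future__ import absolute_import

import sys

import json                        # absolute: the standard library module
from . import json as local_json   # explicit relative: the empty app/json.py

print json.loads('{"a": 1}')       # works: this is the stdlib json
print 'app.json' in sys.modules    # True: the local module is cached under its full name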

Related

Python: Calling function from another file gives a ModuleNotFoundError [duplicate]

What is __init__.py for in a Python source directory?
It used to be a required part of a package (old, pre-3.3 "regular package", not newer 3.3+ "namespace package").
Here's the documentation.
Python defines two types of packages, regular packages and namespace packages. Regular packages are traditional packages as they existed in Python 3.2 and earlier. A regular package is typically implemented as a directory containing an __init__.py file. When a regular package is imported, this __init__.py file is implicitly executed, and the objects it defines are bound to names in the package’s namespace. The __init__.py file can contain the same Python code that any other module can contain, and Python will add some additional attributes to the module when it is imported.
But just click the link; it contains an example, more information, and an explanation of namespace packages, the kind of package without __init__.py.
Files named __init__.py are used to mark directories on disk as Python package directories.
If you have the files
mydir/spam/__init__.py
mydir/spam/module.py
and mydir is on your path, you can import the code in module.py as
import spam.module
or
from spam import module
If you remove the __init__.py file, Python will no longer look for submodules inside that directory, so attempts to import the module will fail.
The __init__.py file is usually empty, but can be used to export selected portions of the package under a more convenient name, hold convenience functions, etc.
Given the example above, the contents of the init module can be accessed as
import spam
based on this
In addition to labeling a directory as a Python package and defining __all__, __init__.py allows you to define any variable at the package level. Doing so is often convenient if a package defines something that will be imported frequently, in an API-like fashion. This pattern promotes adherence to the Pythonic "flat is better than nested" philosophy.
An example
Here is an example from one of my projects, in which I frequently import a sessionmaker called Session to interact with my database. I wrote a "database" package with a few modules:
database/
__init__.py
schema.py
insertions.py
queries.py
My __init__.py contains the following code:
import os
from sqlalchemy.orm import sessionmaker
from sqlalchemy import create_engine
engine = create_engine(os.environ['DATABASE_URL'])
Session = sessionmaker(bind=engine)
Since I define Session here, I can start a new session using the syntax below. This code works the same whether executed from inside or outside of the "database" package directory.
from database import Session
session = Session()
Of course, this is a small convenience -- the alternative would be to define Session in a new file like "create_session.py" in my database package, and start new sessions using:
from database.create_session import Session
session = Session()
Further reading
There is a pretty interesting reddit thread covering appropriate uses of __init__.py here:
http://www.reddit.com/r/Python/comments/1bbbwk/whats_your_opinion_on_what_to_include_in_init_py/
The majority opinion seems to be that __init__.py files should be very thin to avoid violating the "explicit is better than implicit" philosophy.
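That said, even a thin __init__.py can usefully re-export a name or two. A minimal sketch, assuming spam/module.py defines the myfun function seen in the transcript further down:
# spam/__init__.py -- hypothetical convenience re-export
from spam.module import myfun   # expose spam.module.myfun as spam.myfun

__all__ = ['myfun']             # what 'from spam import *' provides
With this in place, callers can write from spam import myfun without knowing which submodule defines it.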
There are two main reasons for __init__.py:
For convenience: other users will not need to know your functions' exact location in your package hierarchy (documentation).
your_package/
__init__.py
file1.py
file2.py
...
fileN.py
# in __init__.py
from .file1 import *
from .file2 import *
...
from .fileN import *
# in file1.py
def add():
    pass
then others can call add() with
from your_package import add
without needing to know where it lives inside file1, as opposed to
from your_package.file1 import add
If you want something to be initialized; for example, logging (which should be put in the top level):
import logging.config
logging.config.dictConfig(Your_logging_config)
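A concrete sketch of such a config (the dict values here are illustrative, not prescribed by the original answer):
# your_package/__init__.py -- minimal logging setup; values are illustrative
import logging.config

LOGGING_CONFIG = {
    'version': 1,
    'handlers': {'console': {'class': 'logging.StreamHandler'}},
    'root': {'handlers': ['console'], 'level': 'INFO'},
}
logging.config.dictConfig(LOGGING_CONFIG)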
The __init__.py file makes Python treat the directory containing it as a package.
Furthermore, __init__.py is the first file to be loaded when the package is imported, so you can use it to execute code that you want to run each time the package is loaded, or to specify the submodules to be exported.
Since Python 3.3, __init__.py is no longer required to define directories as importable Python packages.
Check PEP 420: Implicit Namespace Packages:
Native support for package directories that don’t require __init__.py marker files and can automatically span multiple path segments (inspired by various third party approaches to namespace packages, as described in PEP 420)
Here's the test:
$ mkdir -p /tmp/test_init
$ touch /tmp/test_init/module.py /tmp/test_init/__init__.py
$ tree -at /tmp/test_init
/tmp/test_init
├── module.py
└── __init__.py
$ python3
>>> import sys
>>> sys.path.insert(0, '/tmp')
>>> from test_init import module
>>> import test_init.module
$ rm -f /tmp/test_init/__init__.py
$ tree -at /tmp/test_init
/tmp/test_init
└── module.py
$ python3
>>> import sys
>>> sys.path.insert(0, '/tmp')
>>> from test_init import module
>>> import test_init.module
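To see what changed in the second session, note that without __init__.py the import produces a namespace package: its __path__ is a _NamespacePath rather than a plain list (a sketch; the exact repr varies by Python version):
>>> import test_init
>>> test_init.__path__
_NamespacePath(['/tmp/test_init'])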
references:
https://docs.python.org/3/whatsnew/3.3.html#pep-420-implicit-namespace-packages
https://www.python.org/dev/peps/pep-0420/
Is __init__.py not required for packages in Python 3?
Although Python works without an __init__.py file, you should still include one.
It signals that the directory should be treated as a package, so include it (even if it is empty).
There is also a case where you may actually use an __init__.py file:
Imagine you had the following file structure:
main_methods
|- methods.py
And methods.py contained this:
def foo():
    return 'foo'
To use foo() you would need one of the following:
from main_methods.methods import foo # Call with foo()
from main_methods import methods # Call with methods.foo()
import main_methods.methods # Call with main_methods.methods.foo()
Perhaps you need (or want) to keep methods.py inside main_methods (for runtime or dependency reasons, for example), but you only want to import main_methods.
If you changed the name of methods.py to __init__.py then you could use foo() by just importing main_methods:
import main_methods
print(main_methods.foo()) # Prints 'foo'
This works because __init__.py is treated as part of the package.
Some Python packages actually do this. An example is with JSON, where running import json is actually importing __init__.py from the json package (see the package file structure here):
Source code: Lib/json/__init__.py
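You can verify this yourself; the exact path depends on your installation:
>>> import json
>>> json.__file__
'/usr/lib/python3.10/json/__init__.py'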
In Python the definition of a package is very simple. As in Java, the package hierarchy and the directory structure are the same. But (before Python 3.3) you have to have __init__.py in a package. I will explain the __init__.py file with the example below:
package_x/
|-- __init__.py
|-- subPackage_a/
|------ __init__.py
|------ module_m1.py
|-- subPackage_b/
|------ __init__.py
|------ module_n1.py
|------ module_n2.py
|------ module_n3.py
__init__.py can be empty, as long as it exists. It indicates that the directory should be regarded as a package. Of course, __init__.py can also contain appropriate content of its own.
If we add a function in module_n1:
def function_X():
    print "function_X in module_n1"
    return
After running:
>>> from package_x.subPackage_b.module_n1 import function_X
>>> function_X()
function_X in module_n1
Here we followed the package hierarchy down to module_n1 and called its function. We can use __init__.py in subPackage_b like this:
__all__ = ['module_n2', 'module_n3']
After running:
>>> from package_x.subPackage_b import *
>>> module_n1.function_X()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: No module named module_n1
Hence with * imports, which modules a package exposes is governed by the contents of __init__.py (here, its __all__ list).
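Conversely, the modules named in __all__ are bound by the star import (a sketch of the expected session):
>>> from package_x.subPackage_b import *
>>> module_n2
<module 'package_x.subPackage_b.module_n2' from 'package_x/subPackage_b/module_n2.py'>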
An __init__.py file makes the directory it is in loadable as a package.
For people who prefer reading code, I put Two-Bit Alchemist's comment here.
$ find /tmp/mydir/
/tmp/mydir/
/tmp/mydir//spam
/tmp/mydir//spam/__init__.py
/tmp/mydir//spam/module.py
$ cd ~
$ python
>>> import sys
>>> sys.path.insert(0, '/tmp/mydir')
>>> from spam import module
>>> module.myfun(3)
9
>>> exit()
$
$ rm /tmp/mydir/spam/__init__.py*
$
$ python
>>> import sys
>>> sys.path.insert(0, '/tmp/mydir')
>>> from spam import module
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: No module named spam
>>>
It facilitates importing other Python files. When you place this file in a directory (say stuff) containing other .py files, you can do something like import stuff.other.
root\
stuff\
other.py
morestuff\
another.py
Without this __init__.py inside the directory stuff, you couldn't import other.py, because Python wouldn't know where the source code for stuff is and would be unable to recognize it as a package.
An __init__.py file makes imports easy. When an __init__.py is present within a package, function a() can be imported from file b.py like so:
from b import a
Without it, however, you can't import directly. You have to amend the system path (note that sys.path entries are directories, not files):
import sys
sys.path.insert(0, 'path/to/directory/containing/b')
from b import a
One thing __init__.py allows is converting a module to a package without breaking the API or creating extraneous nested namespaces or private modules*. This helps when I want to extend a namespace.
If I have a file util.py containing
def foo():
...
then users will access foo with
from util import foo
If I then want to add utility functions for database interaction, and I want them to have their own namespace under util, I'll need a new directory**, and to keep API compatibility (so that from util import foo still works), I'll call it util/. I could move util.py into util/ like so,
util/
__init__.py
util.py
db.py
and in util/__init__.py do
from .util import *
but this is redundant. Instead of having a util/util.py file, we can just put the util.py contents in __init__.py and the user can now
from util import foo
from util.db import check_schema
I think this nicely highlights how a util package's __init__.py acts in a similar way to a util module.
* this is hinted at in the other answers, but I want to highlight it here
** short of employing import gymnastics. Note it won't work to create a new package with the same name as the file, see this
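Concretely, the end state might look like this (a sketch with stub bodies):
# util/__init__.py -- the old util.py contents now live here
def foo():
    ...

# util/db.py -- database helpers get their own namespace
def check_schema():
    ...
Callers keep using from util import foo, and gain from util.db import check_schema.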
If you're using Python 2 and want to load siblings of your file, you can simply add your file's parent folder to the session's sys.path. It will behave roughly as if your current file were an __init__ file.
import os
import sys
dir_path = os.path.dirname(__file__)
sys.path.insert(0, dir_path)
After that, regular imports relative to the file's directory will work just fine, e.g.
import cheese
from vehicle_parts import *
# etc.
Generally you want to use a proper __init__.py file instead, but when dealing with legacy code you might be stuck with, for example, a library hard-coded to load one particular file and nothing else. For those cases this is an alternative.
__init__.py: a Python file found in a package directory; it is executed when the package, or a module in the package, is imported. You can use it to run package initialization code, i.e. whenever the package is imported these statements are executed before any other module in the folder. It is loosely analogous to the main function of a C or Java program, but it lives at the package (folder) level rather than in a single Python file.
Top-level variables defined in this __init__.py file also become accessible as attributes of the package once it is imported.
For example, I have an __init__.py file in a folder called pymodlib; this file contains the following statements:
print(f'Invoking __init__.py for {__name__}')
pystructures = ['for_loop', 'while__loop', 'ifCondition']
When I import the package pymodlib in my solution module, notebook, or Python console, these two statements get executed during the import.
So in the log or console you would see the following output:
>>> import pymodlib
Invoking __init__.py for pymodlib
In the next console statement I can access the global variable:
>>> pymodlib.pystructures
which gives the following output:
['for_loop', 'while__loop', 'ifCondition']
Now, from Python 3.3 onwards this file is optional for making a folder an importable Python package, so you can omit it from the folder.

How can I make functionality accessible directly from my package (not a module within that package)? [duplicate]


Should there be a .py file for each import or from x import statement? [duplicate]


Python3.5 import from subdirectory doesn't work

I have a bunch of python scripts and simply want to structure them by putting most of them into subdirectories. However, when I try to load scripts from subdirectories, python gives me different error messages, depending on how I try to import the subdirectory scripts.
My subdir looks like this:
io
├── dataset_creator.py
└── read_data.py
In my script from the parent dir, when I do
from io import dataset_creator
this error occurs:
ImportError: cannot import name 'dataset_creator'
When I do
import io.dataset_creator
this error occurs:
ImportError: No module named 'io.dataset_creator'; 'io' is not a package
I also touched an __init__.py in io/, but it didn't help at all; neither did prefixing the import with a dot. The Python docs say I should add the __init__.py and then everything should basically work (as far as I interpreted them).
Can anyone help me here? If I left out some important info, please tell me and I'll add it.
Cheers,
Jakob
EDIT:
As many of you stated, io is already another module in Python, so renaming my io/ to something different fixed the problem (while also keeping the __init__.py). Thank you very much!
I know there have been multiple correct answers; however, I could only mark one as correct, sorry.
The name io is already being used by a standard library module. Since it's one of the very basic modules used by the interpreter, it gets loaded during the startup process, before any of your code runs. This means that by the time Python sees your request to import io.dataset_creator, it's already got an io module in sys.modules. Since that module is not a package, it won't try loading the other submodule you've written in your io package (even if you had a module search path set up so that your package came ahead of the standard library).
You should rename your io package. One option is to put it inside another package (mypackage.io.dataset_creator should work fine). You could also just replace the name io with something more specific (e.g. myproject_io).
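A minimal sketch of the rename fix (my_io is an arbitrary stand-in for whatever name you pick):
$ mv io my_io
$ touch my_io/__init__.py
# then, from the parent directory:
from my_io import dataset_creator   # no longer shadowed by the stdlib io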
It's possible that it's failing because io is already a built-in module.
I have answered a similar question here Using exec on a file in a different directory causes module import errors
Append your parent folder's path to sys.path:
import sys
sys.path.append("/path/to/parentfolder")
You can use os.path.dirname(__file__) to get the file's directory rather than hardcoding the path.
Add __init__.py to your parent folder and to the io folder, making them Python packages rather than plain directories.
import the module:
import io.dataset_creator as dcreator
parent/
-- app.py
-- io/
   -- dataset_creator.py
   -- read_data.py
In your app.py:
import os
import sys
sys.path.append(os.path.abspath(os.path.dirname(__file__)))
import io.dataset_creator as dcreator
This happened to me as well on Python 3.5.1 when I tested it.
Renaming the directory io to something else (I used my_io) fixed the problem. Here was my test case:
main.py
my_io
├── module.py
└── something.py
Both modules imported correctly when I changed the directory's name. I suggest you change your io directory to something similar to avoid this.
I think this must be due to the standard library module called io conflicting somehow.
Recreating the problem:
mkdir io
cd io
touch dataset_creator.py
touch read_data.py
python3 -c 'from io import dataset_creator'
python3 -c 'import io.dataset_creator'
Gives the error messages.
Solution:
Create another sub-directory called "io" and put the files there.
Use a name different than "io", as a module by that name already exists.
Explanation:
You are already in the io dir, so you don't need to specify the "io". You can simply do:
python3 -c 'import dataset_creator'
python3 -c 'import read_data'
And once you add a function or class in your python files:
def hello_world():
    print("hello world")
You can import like this:
python3 -c 'from read_data import hello_world'
To organise your code under an io module umbrella, create another io directory as follows and use it to store your python code:
ia (parent dir where you do the import)
├── ia
│ ├── dataset_creator.py
│ └── read_data.py
├── .gitignore
├── requirements.txt
├── setup.py
└── README.md
python3 -c 'import ia.dataset_creator'
Note I renamed the directory to "ia" as well as there is already an "io" module that exists (ref).

Python 3 package module conflicts with standard module

I recently decided to upgrade to Python 3, and started converting some of my scripts. I encountered a problem in a script that uses a module named io: in Python 2 this is perfectly fine, but in Python 3 io is a standard module for files. I found this old question about the same kind of problem, however that appears to be in reference to Python 2. I have the opposite problem: given two files, main.py and io.py in the top-level package, import io in main.py will import the standard io module, not the local one. from __future__ import absolute_import didn't help, and from . import io and related attempts fail as expected (which I have never understood: does Python really not know where the top-level package is?). Renaming is obviously a solution, but if possible I'd like to avoid it. Is there some standard Python 3 way of resolving module name conflicts?
Here's my answer:-
My directory structure:-
calvin$ tree /Users/calvin/work/learn3/
/Users/calvin/work/learn3/
└── myspecialpackage
├── __init__.py
├── __init__.pyc
├── io.py
├── io.pyc
└── main.py
__init__.py is an empty file.
io.py is your custom module which conflicts with python3's io module.
main.py contains this bunch of example code:-
import os
import sys
# These two lines are not needed if you are installing `myspecialpackage`
# via pip/PyPI: setup.py places "myspecialpackage" and all its contents in
# your Python site-packages directory, which is already on the module search path.
our_package_root = os.path.dirname(os.path.realpath(__file__))
sys.path.append(our_package_root)
from myspecialpackage import io
print(io.__file__)
And the imported io module will be the one in your io.py and not python3's module.
As a bonus, using this methodology will allow you to have your custom io.py as well as Python 3's io module (if you desire having your cake and eating it too ;-)). You can deconflict the use of the namespace io like this:-
from myspecialpackage import io as my_special_io
print(my_special_io.__file__)
import io
print(io.__file__)
Running main.py will then give you:-
In [3]: run myspecialpackage/main.py
/Users/calvin/work/learn3/myspecialpackage
./myspecialpackage/io.py
/Users/calvin/.virtualenvs/learn3/bin/../lib/python3.3/io.py
Take note of the comment I made above regarding
our_package_root = os.path.dirname(os.path.realpath(__file__))
sys.path.append(our_package_root)
