I currently have a folder structure like this:
.
├── main.py
├── parent.py
└── classes
    ├── subclass1.py
    ├── subclass2.py
    └── subclass3.py
Each of the subclasses is a subclass of parent, and parent is an abstract class. Each subclass must implement mix() and meld(), and the program needs to execute these functions.
I would like to write main.py such that the functions in each of the subclasses are executed, without me having to import their files into my program. That is, I'd like something akin to the following to happen:
def main():
    # Note that I don't mind the order in which
    # the subclasses' functions are invoked.
    for each subclass in the classes folder:
        execute `mix()` and `meld()`
Is there any way I could get this to happen?
Essentially, what I want to do is drop a bunch of classes into the classes folder, with only mix() and meld() defined, and let this program run wild.
I've never tried this, but I think it does what you're asking for:
import os
import imp
import runpy

package = 'classes'
_, path, _ = imp.find_module(package)
for m in os.listdir(path):
    if m.endswith('.py'):
        runpy.run_module(
            package + '.' + os.path.splitext(m)[0],
            run_name="Mix_Meld"
        )
Then, inside of your subclasses, you can write:
if __name__ == 'Mix_Meld':
    ClassName.mix()
    ClassName.meld()
This may result in additional code, granted, but if you ever need to stop the execution in one of the files, doing so is just a matter of commenting out that part of the code.
Another advantage is extensibility and polymorphism: if you need to run the code a bit differently for each of these modules in the future, you only have to change the behaviour in the specific modules. The caller (main.py) will remain oblivious to those changes and continue invoking the modules the usual way.
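A note of caution: the imp module is deprecated and was removed in Python 3.12. Here is a minimal sketch of the same idea using importlib and pkgutil instead, assuming classes/ contains an __init__.py so it is importable as a package, that parent.py defines the abstract base class Parent, and that mix() and meld() are instance methods:

import importlib
import pkgutil

import classes
from parent import Parent

# Import every module in the classes package, then find each
# Parent subclass inside it and call mix() and meld().
for _, name, _ in pkgutil.iter_modules(classes.__path__):
    module = importlib.import_module('classes.' + name)
    for obj in vars(module).values():
        if isinstance(obj, type) and issubclass(obj, Parent) and obj is not Parent:
            instance = obj()
            instance.mix()
            instance.meld()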
Try importing from the subfolder in the main.py program file:
import glob
import importlib
import os

for path in glob.glob(os.path.join('classes', '*.py')):
    name = os.path.splitext(os.path.basename(path))[0]
    module = importlib.import_module('classes.' + name)
    cls = getattr(module, name.capitalize())  # e.g. subclass1.py -> Subclass1
    cls.mix()
    cls.meld()
Imagine I have a module with two files, like this:
mymodule
|-- __init__.py
`-- submodule.py
mymodule/__init__.py contains:
SOME_CONSTANT_ONE = 1
SOME_CONSTANT_TWO = 2
SOME_CONSTANT_THREE = 3
...
SOME_CONSTANT_ONE_HUNDRED = 100
def initialize():
    pass  # do some stuff

def support_function():
    pass  # something that lots of other functions might need
I already know that I can use a relative import to bring in specific objects from the __init__.py file, like this:
submodule.py:
from . import initialize, support_function
def do_work():
    initialize()  # initialize the module
    print(support_function())  # do something with the support function
But now what I want to know is if I can import all of the constants from the __init__.py file, but simultaneously have them appear in a namespace.
What won't work (what I've tried/considered):
import mymodule as outer_module works, since the import system already has knowledge of where the module is. However, if I ever need to change the name of the outer module, that code will break.
Doing import . as outer_module doesn't work.
Doing from . import * does work but puts all of the objects in __init__.py in the current namespace rather than in the sub-namespace.
Doing from . import SOME_CONSTANT_ONE as outer_constant_1, SOME_CONSTANT_TWO as outer_constant_2, SOME_CONSTANT_THREE as outer_constant_3, ... is ugly and won't bring in any new constants should they be defined later on in __init__.py.
What I really want is something like this:
submodule.py:
SOME_CONSTANT_ONE = "one!" # We don't want to clobber this.
import . as outer_module # this does not work, but it illustrates what is desired.
def do_work():
    print(SOME_CONSTANT_ONE)  # should print "one!"
    print(outer_module.SOME_CONSTANT_ONE)  # should print "1"
I know that I could move all of the constants to a constants.py file and then import it with from . import constants (as something), but I'm working on existing code, and making that change would require a lot of refactoring. While that's not a bad idea, I'm wondering: given that Python has a way to import individual objects, and also to import the whole module by name to an explicit name, can I perhaps do something with importlib to import everything from __init__.py into a namespace?
The loader sets __package__ which you can use:
import sys
SOME_CONSTANT_ONE = "one!" # We don't want to clobber this.
outer_module = sys.modules[__package__]
def do_work():
    print(SOME_CONSTANT_ONE)  # should print "one!"
    print(outer_module.SOME_CONSTANT_ONE)  # should print "1"
This is precisely the attribute on which relative imports are based.
See PEP 366 for details.
However, I really think the backwards-compatible refactoring which the other answer suggests is probably the better approach here.
"I could move all of the constants to a constants.py file and then import it with from . import constants (as something), but I'm working on existing code, and making that change would require a lot of refactoring"
You can still refactor the constants into a new constants.py module. To support existing code that relies on __init__.py, you can import the constants back into __init__.py:
# constants.py
SOME_CONSTANT_ONE = 1
SOME_CONSTANT_TWO = 2
SOME_CONSTANT_THREE = 3
... # etc
# __init__.py
from .constants import *
# submodule.py
SOME_CONSTANT_ONE = 'dont clobber me!'
from . import constants as something
print(something.SOME_CONSTANT_ONE) # Yay namespaces
# existing_code.py
from . import SOME_CONSTANT_ONE
# still works!
# no refactor required!
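For reference, the resulting package layout would be:

mymodule/
    __init__.py       # from .constants import *
    constants.py
    submodule.py
    existing_code.py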
As an aside, the __init__.py file is often left completely empty, with nothing directly defined in it; when __init__.py does have contents, they are typically imported from elsewhere within the package. https://stackoverflow.com/a/4116384/5747944
I am working in the following directory tree:
src/
    __init__.py
    train.py
    modules/
        __init__.py
        encoders/
            __init__.py
            rnn_encoder.py
My pwd is the top-level directory and my __init__.py files are all empty. I am executing train.py, which contains the following code snippet.
import modules
# RNNEncoder is a class in rnn_encoder.py
encoder = modules.encoders.rnn_encoder.RNNEncoder(**params)
When I execute train.py, I get an error saying that
AttributeError: module 'modules' has no attribute 'encoders'
I am wondering if there is any clean way to make this work. Note that I am not looking for alternative methods of importing, I am well-aware that this can be done in other ways. What I'd like to know is whether it is possible to keep the code in train.py as is while maintaining the given directory structure.
Putting an __init__.py file in a folder allows that folder to act as an import target, even when it's empty. The way you currently have things set up, the following should work:
from modules.encoders import rnn_encoder
encoder = rnn_encoder.RNNEncoder(**params)
Here, python treats modules.encoders as a filepath, essentially, and then tries to actually import the code inside rnn_encoder.
However, this won't work:
import modules
encoder = modules.encoders.rnn_encoder.RNNEncoder(**params)
The reason is that, when you do import modules, what python is doing behind the scenes is importing __init__.py from the modules folder, and nothing else. It doesn't run into an error, since __init__.py exists, but since it's empty it doesn't actually do much of anything.
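You can verify this yourself; the print line below is a hypothetical addition to modules/__init__.py, not part of the question's code:

# modules/__init__.py
print("modules/__init__.py executed")

# train.py
import modules                       # prints the message above, nothing else
print(hasattr(modules, "encoders"))  # False: the subpackage was never imported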
You can put code in __init__.py to fill out your module's namespace and allow people to access that namespace from outside your module. To solve your problem, make the following changes:
modules/encoders/__init__.py
from . import rnn_encoder
modules/__init__.py
from . import encoders
This imports rnn_encoder and assigns it to the namespace of encoders, allowing you to import encoders and then access encoders.rnn_encoder. The same applies to modules and modules.encoders, one level up.
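With those two one-line __init__.py files in place, the train.py from the question works unchanged. For completeness, a hedged alternative that touches only the import line: importing the full dotted path once also binds the intermediate attributes, so the attribute-access line can stay as it is:

# train.py (alternative: only the import line changes)
import modules.encoders.rnn_encoder

encoder = modules.encoders.rnn_encoder.RNNEncoder(**params)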
I often end up in a situation where one package needs to use a sibling package. I want to clarify that I'm not asking about how Python allows you to import sibling packages, which has been asked many times. Instead, my question is about a best practice for writing maintainable code.
Let's say we have a tools package, and the function tools.parse_name() depends on tools.split_name(). Initially, both might live in the same file where everything is easy:
# tools/__init__.py
from .name import parse_name, split_name
# tools/name.py
def parse_name(name):
    splits = split_name(name)  # Can access from same file.
    return do_something_with_splits(splits)

def split_name(name):
    return do_something_with_name(name)
Now, at some point we decide that the functions have grown and split them into two files:
# tools/__init__.py
from .parse_name import parse_name
from .split_name import split_name
# tools/parse_name.py
import tools
def parse_name(name):
    splits = tools.split_name(name)  # Won't work because of import order!
    return do_something_with_splits(splits)

# tools/split_name.py
def split_name(name):
    return do_something_with_name(name)
The problem is that parse_name.py can't simply import the tools package that it is itself part of. At least, doing so won't let it use names listed below its own import line in tools/__init__.py.
The technical solution is to import tools.split_name rather than tools:
# tools/__init__.py
from .parse_name import parse_name
from .split_name import split_name
# tools/parse_name.py
import tools.split_name as tools_split_name

def parse_name(name):
    splits = tools_split_name.split_name(name)  # Works but ugly!
    return do_something_with_splits(splits)

# tools/split_name.py
def split_name(name):
    return do_something_with_name(name)
This solution technically works, but it quickly becomes messy when more than one sibling module is used. Moreover, renaming the package tools to utilities would be a nightmare, since all the module aliases would have to change as well. I would like to avoid importing functions directly and instead import packages, so that it is clear where a function came from when reading the code. How can I handle this situation in a readable and maintainable way?
I could ask you exactly what import syntax you need and simply provide it. I won't, because by the end of this answer you can work it out yourself.
"The problem is that parse_name.py can't just import the tools package which is part of itself."
That looks like a wrong and strange thing to do, indeed.
"At least, this won't allow it to use tools listed below its own line in tools/__init__.py"
Agreed, but again, we don't need that, if things are structured properly.
To simplify the discussion and reduce the degrees of freedom, I assume several things in the example below. You can then adapt it to different but similar scenarios, since you can modify the code to fit your import-syntax requirements. I give some hints for changes at the end.
Scenario:
You want to build an import package named tools.
You have a lot of functions in there that you want to make available to client code in client.py; this file uses the package tools by importing it. To keep things simple, I make all the functions (from everywhere) available under the tools namespace by using the from ... import * form. That is dangerous and should be modified in a real scenario to prevent name clashes with, and between, subpackage names.
You organize the functions together by grouping them in import packages inside your tools package (subpackages).
The subpackages have (by definition) their own folder and at least an __init__.py inside. I chose to put each subpackage's code in a single module beside its __init__.py; you can have more modules and/or nested packages.
.
├── client.py
└── tools
├── __init__.py
├── splitter
│ ├── __init__.py
│ └── splitter.py
└── formatter
├── __init__.py
└── formatter.py
I keep the __init__.pys empty, except for the outermost one, which is responsible for making all the wanted names available in the tools namespace to client importing code.
This can be changed of course.
# tools/__init__.py
# note that relative imports avoid using the outer package name,
# which is good if you later change your mind about its name
from .splitter.splitter import *
from .formatter.formatter import *
# client.py
# this is user code
import tools
text = "foo bar"
splits = tools.split(text) # the two funcs came
# from different subpackages
text = tools.titlefy(text)
print(splits)
print(text)
# tools/formatter/formatter.py
from ..splitter import splitter  # formatter's sibling: subpackage splitter, module splitter
def titlefy(name):
    splits = splitter.split(name)
    return ' '.join([s.title() for s in splits])
# tools/splitter/splitter.py
def split(name):
    return name.split()
You can actually tailor the import syntax to your taste, to answer your comment about what the imports look like. The from form is needed for relative imports; otherwise, use absolute imports by prefixing the path with tools.
__init__.pys can be used to adjust the imported names for the importing code, or to initialize the module. They can also be empty, or even start out as the only file in the subpackage, with all the code in them, and later be split into other modules, though I don't like this "everything in __init__.py" approach as much.
They are just code that runs on import.
You can also avoid repeated names in import paths by using different names, by putting everything in __init__.py and dropping the module with the repeated name, by using aliases in the __init__.py imports, or with name assignments there. You may also limit what gets exported when the importer uses the * form by assigning names to an __all__ list.
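As a small illustration of the __all__ mechanism just mentioned (a sketch; the underscore-prefixed helper is a hypothetical addition):

# tools/splitter/splitter.py
__all__ = ['split']  # only 'split' is exported by "from ... import *"

def split(name):
    return name.split()

def _debug_split(name):  # hypothetical helper; not exported via *
    print(name.split())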
A change you might want for safer readability is to force client.py to specify the subpackage when using names, that is:
name1 = tools.splitter.split('foo bar')
Change the __init__.py to import only the submodules, like this:
from .splitter import splitter
from .formatter import formatter
I'm not proposing this to be actually used in practice, but just for fun, here is a solution using pkgutil and inspect:
import inspect
import os
import pkgutil

def import_siblings(filepath):
    """Import and combine names from all sibling packages of a file."""
    path = os.path.dirname(os.path.abspath(filepath))
    merged = type('MergedModule', (object,), {})
    for importer, module, _ in pkgutil.iter_modules([path]):
        if module + '.py' == os.path.basename(filepath):
            continue
        sibling = importer.find_module(module).load_module(module)
        for name, member in inspect.getmembers(sibling):
            if name.startswith('__'):
                continue
            if hasattr(merged, name):
                message = "Two sibling packages define the same name '{}'."
                raise KeyError(message.format(name))
            setattr(merged, name, member)
    return merged
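A hedged side note: find_module() and load_module() are deprecated in modern Python (and removed in 3.12). On Python 3.5+, the loading line could be replaced with a spec-based equivalent along these lines:

import importlib.util

# replacement for: importer.find_module(module).load_module(module)
spec = importer.find_spec(module)
sibling = importlib.util.module_from_spec(spec)
spec.loader.exec_module(sibling)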
The example from the question becomes:
# tools/__init__.py
from .parse_name import parse_name
from .split_name import split_name
# tools/parse_name.py
tools = import_siblings(__file__)
def parse_name(name):
    splits = tools.split_name(name)  # Same usage as if this were an external module.
    return do_something_with_splits(splits)

# tools/split_name.py
def split_name(name):
    return do_something_with_name(name)
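One practical wrinkle: parse_name.py calls import_siblings(__file__), so the helper has to live somewhere importable that is not itself a sibling, or it would get merged into its own output. A sketch, with helpers as a hypothetical location for it:

# tools/parse_name.py
from helpers import import_siblings  # hypothetical top-level module holding the helper

tools = import_siblings(__file__)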
I am just starting off using Google App Engine and have been looking around for good practices and code organization. Most of my problems stem from confusion about __init__.py.
My current test structure looks like
/website
    main.py
    /pages
        __init__.py #1
        blog.py
        hello2.py
        hello.py
        /sub
            __init__.py #2
            base.py
I am trying to use main.py as a file that simply points to everything in /pages and /pages/sub. Most modules in /pages share almost all the same imports (e.g. import urllib); is there a way to specify that everything in /pages imports what I want, rather than adding the imports to every individual module?
Currently in __init__.py #1 I have
from sub.base import *
Yet my module blog.py says BaseHandler (a function in base.py) is not defined.
My end goal is to have something like ...
main.py
from pages import *
#be able to call any function in /pages without having to do blog.func1() or hello.func2()
#rather just func1() and func2()
And to be able to share common imports for modules in /pages in __init__.py. So that they share for example urllib and all functions from base.py. Thank you for taking the time to read this post, I look forward to your insight.
Sounds like you think __init__.py is an initializer for the other modules in the package. It is not. It turns pages into a package (allowing its files and subdirectories to be modules), and it is executed, like a normal module would be, when your program calls import pages. Imagine that it's named pages.py instead.
So if you really want to dump everything into the same namespace, init #2 can contain from base import * (which will import everything in base to the namespace of sub), and blog.py can contain from sub import *. Got it?
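A minimal sketch of that scheme, using the implicit relative imports of Python 2 as in this answer (on Python 3 they would need to be from .base import * and from .sub import *; BlogHandler is a hypothetical example class):

# pages/sub/__init__.py
from base import *   # Python 3: from .base import *

# pages/blog.py
from sub import *    # Python 3: from .sub import *

class BlogHandler(BaseHandler):  # BaseHandler now resolves without a prefix
    pass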
I'm taking a look at how the model system in django works and I noticed something that I don't understand.
I know that you create an empty __init__.py file to specify that the current directory is a package. And that you can set some variable in __init__.py so that import * works properly.
But django adds a bunch of from ... import ... statements and defines a bunch of classes in __init__.py. Why? Doesn't this just make things look messy? Is there a reason that requires this code in __init__.py?
All imports in __init__.py are made available when you import the package (directory) that contains it.
Example:
./dir/__init__.py:
import something
./test.py:
import dir
# can now use dir.something
EDIT: forgot to mention, the code in __init__.py runs the first time you import any module from that directory. So it's normally a good place to put any package-level initialisation code.
EDIT2: dgrant pointed out a possible confusion in my example. In __init__.py, import something can import any module, not necessarily one from the package. For example, we can replace it with import datetime; then, in our top-level test.py, both of these snippets will work:
import dir
print(dir.datetime.datetime.now())
and
import dir.some_module_in_dir
print(dir.datetime.datetime.now())
The bottom line is: all names assigned in __init__.py, be it imported modules, functions or classes, are automatically available in the package namespace whenever you import the package or a module in the package.
It's just personal preference really, and has to do with the layout of your python modules.
Let's say you have a module called erikutils. There are two ways that it can be a module, either you have a file called erikutils.py on your sys.path or you have a directory called erikutils on your sys.path with an empty __init__.py file inside it. Then let's say you have a bunch of modules called fileutils, procutils, parseutils and you want those to be sub-modules under erikutils. So you make some .py files called fileutils.py, procutils.py, and parseutils.py:
erikutils
__init__.py
fileutils.py
procutils.py
parseutils.py
Maybe you have a few functions that just don't belong in the fileutils, procutils, or parseutils modules, and let's say you don't feel like creating a new module called miscutils. AND, you'd like to be able to call the functions like so:
erikutils.foo()
erikutils.bar()
rather than doing
erikutils.miscutils.foo()
erikutils.miscutils.bar()
So, because the erikutils module is a directory, not a file, we have to define its functions inside the __init__.py file.
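A minimal sketch of that idea, with foo and bar as the placeholder names from above:

# erikutils/__init__.py
def foo():
    return "foo"

def bar():
    return "bar"

# client code
import erikutils
erikutils.foo()  # no miscutils prefix needed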
In django, the best example I can think of is django.db.models.fields. ALL the django *Field classes are defined in the __init__.py file of the django/db/models/fields directory. I guess they did this because they didn't want to cram everything into a hypothetical django/db/models/fields.py module, so they split it out into a few submodules (related.py and files.py, for example) and stuck the *Field definitions in the fields module itself (hence, __init__.py).
Using the __init__.py file allows you to make the internal package structure invisible from the outside. If the internal structure changes (e.g. because you split one fat module into two) you only have to adjust the __init__.py file, but not the code that depends on the package. You can also make parts of your package invisible, e.g. if they are not ready for general usage.
Note that you can use the del command, so a typical __init__.py may look like this:
from somemodule import some_function1, some_function2, SomeObject
del somemodule
Now if you decide to split somemodule the new __init__.py might be:
from somemodule1 import some_function1, some_function2
from somemodule2 import SomeObject
del somemodule1
del somemodule2
From the outside the package still looks exactly as before.
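A small usage sketch, assuming the package is named mypackage (a hypothetical name; the answer doesn't give one): client code is unaffected by the internal split:

# client code: works identically before and after the split
from mypackage import some_function1, SomeObject

some_function1()
obj = SomeObject()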
"We recommend not putting much code in an __init__.py file, though. Programmers do not expect actual logic to happen in this file, and much like with from x import *, it can trip them up if they are looking for the declaration of a particular piece of code and can't find it until they check __init__.py. "
-- Python Object-Oriented Programming, Fourth Edition, by Steven F. Lott and Dusty Phillips