I've got a really complex singleton object. I've decided to modify it, so it'll be a separate module with module--wide global variables that would store data.
Are there some pitfalls of this approach? I just feel, like that's a little bit hacky, and that there may be some problems I cannot see now.
Maybe someone did this or have some opinion :) Thanks in advance for help.
Regards.
// Minimal, Complete, and Verifiable example:
"""
This is __init__.py of the module, that could be used as a singleton:
I need to set and get value of IMPORTANT_VARIABLE from different places in my code.
Folder structure:
--singleton_module
|
-__init__.py
Example of usage:
import singleton_module as my_singleton
my_singleton.set_important_variable(3)
print(my_singleton.get_important_variable())
"""
IMPORTANT_VARIABLE = 0
def set_important_variable(value):
global IMPORTANT_VARIABLE
IMPORTANT_VARIABLE = value
def get_important_variable():
return IMPORTANT_VARIABLE
Technically, Python modules ARE singletons, so from this point of view there's no particular issue (except the usual issues with singletons that is) with your code. I'd just spell the varibale in all_lower (ALL_UPPER denotes a pseudo-constant) and prefix it with either a single ("protected") or double ("really private") leading underscore to make clear it's not part of the public API (standard Python naming convention).
Now whether singletons are a good idea is another debate but that's not the point here...
e.g that in one potential situation I may lost data, or that module could be imported in different places of code two times, so it would not be a singleton if imported inside scope of function or something like that.
A module is only instanciated once per process (the first time it's imported), then subsquent imports will directly get if from sys.modules. The only case where you could have two distinct instances of the same module is when the module is imported by two different path, which can only happens if you have a somewhat broken sys.path ie something like this:
src/
foo/
__init.py
bar/
__init__.py
baaz/
__init__.py
mymodule.py
with both "src" and "foo" in sys.path, then importing mymodule once as from foo.bar.baaz import mymodule and a second time as from bar.baaz import mymodule
Needless to say that it's a degenerate case, but it can happens and lead to hard to diagnose bugs. Note that when you have this case, you do have quite a few other things that breaks, like identity testing anything from mymodule.
Also, I am not sure how would using object instead of module increase security
It doesn't.
And I am just asking, if that's not a bad practice, maybe someone did this and found some problems. This is probably not a popular pattern
Well, quite on the contrary you'll often find advises on using modules as singletons instead of using classes with only staticmethods, classmethods and class attributes (another way of implementing a singleton in Python). This most often concerns stateless classes used as namespaces while your example does have a state, but this doesn't make much practical difference.
Now what you won't get are all the nice OO features like computed attributes, inheritance, magicmethods etc, but I assume you already understood this.
As far as I'm concerned, depending on the context, I might rather use a plain class but only expose one single instance of the class as the module's API ie:
# mymodule.py
__all__ = ["mysingleton"]
class __MySingletonLike(object):
def __init__(self):
self._variable = 42
#property
def variable(self):
return self._variable
#variable.setter
def variable(self, value):
check_value(value) # imaginary validation
self._variable = value
mysingleton = __MySingleton()
but that's only when I have special concerns about the class (implementation reuse, proper testability, other special features requiring a class etc).
Related
I want to define a bunch of config variables that can be imported in all the modules in my project. The values of those variables will be constant during runtime but are not known before runtime; they depend on the input. Usually I'd define a dict in my top module which would be passed to all functions and classes from other modules; however, I was thinking it may be cleaner to simply create a blank config.py module which would be dynamically filled with config variables by the top module:
# top.py
import config
config.x = x
# config.py
x = None
# other.py
import config
print(config.x)
I like this approach because I don't have to save the parameters as attributes of classes in my other modules; which makes sense to me because parameters do not describe classes themselves.
This works but is it considered bad practice?
The question as such may be disputed. But I would generally say yes, it's "bad practice" because scope and impact of change is really getting blurred. Note the use case you're describing really is not about sharing configuration, but about different parts of the program functions, objects, modules exchanging data and as such it's a bit of a variation on (meta)global variable).
Reading common configuration values could be fine, but changing them along the way... you may lose track of what happened where and also in which order as modules get imported / values get modified. For instance assume the config.py and two modules m1.py:
import config
print(config.x)
config.x=1
and m2.py:
import config
print(config.x)
config.x=2
and a main.py that just does:
import m1
import m2
import config
print(config.x)
or:
import m2
import m1
import config
print(config.x)
The state in which you find config in each module and really any other (incl. main.py here) depends on order in which imports have occurred and who assigned what value when. Even for a program entirely under your control, this may get confusing (and source of mistakes) rather quickly.
For runtime data and passing information between objects and modules (and your example is really that and not configuration that is predefined and shared between modules) I would suggest you look into describing the information perhaps in a custom state (config) object and pass it around through appropriate interface. But really just a function / method argument may be all that is needed. The exact form depends on what exactly you're trying to achieve and what your overall design is.
In your example, other.py behaves differently when called or imported before top.py which may still seem obvious and manageable in a minimal example, but really is not a very sound design. Anyone reading the code (incl. future you) should be able to follow its logic and this IMO breaks its flow.
The most trivial (and procedural) example of what for what you've described and now I hopefully have a better grasp of would be other.py recreating your current behavior:
def do_stuff(value):
print(value) # We did something useful here
if __name__ == "__main__":
do_stuff(None) # Could also use config with defaults
And your top.py presumably being the entry point and orchestrating importing and execution doing:
import other
x = get_the_value()
other.do_stuff(x)
You can of course introduce an interface to configure do_stuff perhaps a dict or a custom class even with default implementation in config.py:
class Params:
def __init__(self, x=None):
self.x = x
and your other.py:
def do_stuff(params=config.Params()):
print(params.x) # We did something useful here
And on your top.py you can use:
params = config.Params(get_the_value())
other.do_stuff(params)
But you could also have any use case specific source of value(s):
class TopParams:
def __init__(self, url):
self.x = get_value_from_url(url)
params = TopParams("https://example.com/value-source")
other.do_stuff(params)
x could even be a property which you retrieve every time you access it... or lazily when needed and then cached... Again, it really then is a matter of what you need to do.
"Is it bad practice to modify attributes of one module from another module?"
that it is considered as bad practice - violation of the law of demeter, which means in fact "talk to friends, not to strangers".
Objects should expose behaviour and functions, but should HIDE the data.
DataStructures should EXPOSE data, but should not have any methods (which are exposed). The law of demeter does not apply to such DataStructures. OOP Purists might cover such DataStructures with setters and getters, but it really adds no value in Python.
there is a lot of literature about that like : https://en.wikipedia.org/wiki/Law_of_Demeter
and of course, a must to read: "Clean Code", by Robert C. Martin (Uncle Bob), check it out on Youtube also.
For procedural programming it is perfectly normal to keep data in a DataStructure which does not have any (exposed) methods.
The procedures in the program work with that data. Consider to use the module attrs, see : https://www.attrs.org/en/stable/ for easy creation of such classes.
my prefered method for keeping config is (here without using attrs):
# conf_xy.py
"""
config is code - so why use damned parsers, textfiles, xml, yaml, toml and all that
if You just can use testable code as config that can deliver the correct types, etc.
as well as hinting in Your favorite IDE ?
Here, for demonstration without using attrs package - usually I use attrs (read the docs)
"""
class ConfXY(object):
def __init__(self) -> None:
self.x: int = 1
self.z: float = get_z_from_input()
...
conf_xy=ConfXY()
# other.py
from conf_xy import conf_xy
...
y = conf_xy.x * 2
...
The Python PEP 8 style guide gives the following guidance for a single leading underscore in method names:
_single_leading_underscore: weak "internal use" indicator. E.g. from M import * does not import objects whose names start with an underscore.
What constitutes "internal use"?
Is this for methods only called within a given class?
MyClass:
def _internal_method(self):
# do_something
def public_method(self):
self._internal_method()
What about inherited methods - are they still considered "internal"?
BaseClass:
def _internal_method(self):
# do something
MyClass(BaseClass):
def public_method(self):
self._internal_method() # or super()._internal_method()
What about if the inheritance is from another module within a software package?
file1.py
BaseClass:
def _internal_method(self):
# do something
file2.py
from file1 import BaseClass
MyClass(BaseClass):
def public_method(self):
self._internal_method() # or super()._internal_method()
All these examples are fine technically, but are they all acceptable stylistically? At what point do you say the leading underscore is not necessary/helpful?
A single leading underscore is Python's convention for "private" and "protected" variables, available as hard-implementations in some other languages.
The "internal use" language is just to say that you are reserving that name, as developer, to be used by your code as you want, and other users of your module/code can't rely on the thing tied to that name to behave the same way in further versions, or even to exist. It is just the use case for "protected" attributes, but without a hard-implementation from the language runtime: users are supposed to know that attribute/function/method can be changed without any previous warning.
So, yes, as long as other classes using your _ prefixed methods are on the same code package - even if on other file, or folder (other completly distinct package), it is ok to use them.
If you have different Python packages, even if closely related, it would not be advisable to call directly on the internal stuff on the other package, style-wise.
And as for limits, sometimes there are entire modules and classes that are not supposed to be used by users of your class - and it would be somewhat impairing to prefix everything on those modules with an _ - I'd say that it is enough to document what public interfaces to your package users are supposed to call, and add on the docs that certain parts (modules/classes/functions) are designed for "internal use and may change without note" - no need to meddle with their names.
As an illustration, I am currently developing a set of tools/library for text-art on the terminal - I put everything users should call as public names in its __init__.py - the remaining names are meant to be "internal".
I'm having some real headaches right now trying to figure out how to import stuff properly. I had my application structured like so:
main.py
util_functions.py
widgets/
- __init__.py
- chooser.py
- controller.py
I would always run my applications from the root directory, so most of my imports would be something like this
from util_functions import *
from widgets.chooser import *
from widgets.controller import *
# ...
And my widgets/__init__.py was setup like this:
from widgets.chooser import Chooser
from widgets.controller import MainPanel, Switch, Lever
__all__ = [
'Chooser', 'MainPanel', 'Switch', 'Lever',
]
It was working all fine, except that widgets/controller.py was getting kind of lengthy, and I wanted it to split it up into multiple files:
main.py
util_functions.py
widgets/
- __init__.py
- chooser.py
- controller/
- __init__.py
- mainpanel.py
- switch.py
- lever.py
One of issues is that the Switch and Lever classes have static members where each class needs to access the other one. Using imports with the from ___ import ___ syntax that created circular imports. So when I tried to run my re-factored application, everything broke at the imports.
My question is this: How can I fix my imports so I can have this nice project structure? I cannot remove the static dependencies of Switch and Lever on each other.
This is covered in the official Python FAQ under How can I have modules that mutually import each other.
As the FAQ makes clear, there's no silvery bullet that magically fixes the problem. The options described in the FAQ (with a little more detail than is in the FAQ) are:
Never put anything at the top level except classes, functions, and variables initialized with constants or builtins, never from spam import anything, and then the circular import problems usually don't arise. Clean and simple, but there are cases where you can't follow those rules.
Refactor the modules to move the imports into the middle of the module, where each module defines the things that need to be exported before importing the other module. This can means splitting classes into two parts, an "interface" class that can go above the line, and an "implementation" subclass that goes below the line.
Refactor the modules in a similar way, but move the "export" code (with the "interface" classes) into a separate module, instead of moving them above the imports. Then each implementation module can import all of the interface modules. This has the same effect as the previous one, with the advantage that your code is idiomatic, and more readable by both humans and automated tools that expect imports at the top of a module, but the disadvantage that you have more modules.
As the FAQ notes, "These solutions are not mutually exclusive." In particular, you can try to move as much top-level code as possible into function bodies, replace as many from spam import … statements with import spam as is reasonable… and then, if you still have circular dependencies, resolve them by refactoring into import-free export code above the line or in a separate module.
With the generalities out of the way, let's look at your specific problem.
Your switch.Switch and lever.Lever classes have "static members where each class needs to access the other one". I assume by this you mean they have class attributes that are initialized using class attributes or class or static methods from the other class?
Following the first solution, you could change things so that these values are initialized after import time. Let's assume your code looked like this:
class Lever:
switch_stuff = Switch.do_stuff()
# ...
You could change that to:
class Lever:
#classmethod
def init_class(cls):
cls.switch_stuff = Switch.do_stuff()
Now, in the __init__.py, right after this:
from lever import Lever
from switch import Switch
… you add:
Lever.init_class()
Switch.init_class()
That's the trick: you're resolving the ambiguous initialization order by making the initialization explicit, and picking an explicit order.
Alternatively, following the second or third solution, you could split Lever up into Lever and LeverImpl. Then you do this (whether as separate lever.py and leverimpl.py files, or as one file with the imports in the middle):
class Lever:
#classmethod
def get_switch_stuff(cls):
return cls.switch_stuff
from switch import Swift
class LeverImpl(Lever):
switch_stuff = Switch.do_stuff()
Now you don't need any kind of init_class method. Of course you do need to change the attribute to a method—but if you don't like that, with a bit of work, you can always change it into a "class #property" (either by writing a custom descriptor, or by using #property in a metaclass).
Note that you don't actually need to fix both classes to resolve the circularity, just one. In theory, it's cleaner to fix both, but in practice, if the fixes are ugly, it may be better to just fix the one that's less ugly to fix and leave the dependency in the opposite direction alone.
One way is to use import x, without using "from" keyword. So then you refer to things with their namespace everywhere.
Is there any other way? like doing something like in C++ ifnotdef __b__ def __b__ type of thing?
Merge any pair of modules that depend on each other into a single module. Then introduce extra modules to get the old names back.
E.g.,
# a.py
from b import B
class A: whatever
# b.py
from a import A
class B: whatever
becomes
# common.py
class A: whatever
class B: whatever
# a.py
from common import A
# b.py
from common import B
Circular imports are a "code smell," and often (but not always) indicate that some refactoring would be appropriate. E.g., if A.x uses B.y and B.y uses A.z, then you might consider moving A.z into its own module.
If you do think you need circular imports, then I'd generally recommend importing the module and referring to objects with fully qualified names (i.e, import A and use A.x rather than from A import x).
If you're trying to do from A import *, the answer is very simple: Don't do that. You're usually supposed to do import A and refer to the qualified names.
For quick&dirty scripts, and interactive sessions, that's a perfectly reasonable thing to do—but in such cases, you won't run into circular imports.
There are some cases where it makes sense to do import * in real code. For example, if you want to hide a module structure that's complex, or that you generate dynamically, or that changes frequently between versions, or if you're wrapping up someone else's package that's too deeply nested, import * may make sense from a "wrapper module" or a top-level package module. But in that case, nothing you import will be importing you.
In fact, I'm having a hard time imagining any case where import * is warranted and circular dependencies are even a possibility.
If you're doing from A import foo, there are ways around that (e.g., import A then foo = A.foo). But you probably don't want to do that. Again, consider whether you really need to bring foo into your namespace—qualified names are a feature, not a problem to be worked around.
If you're doing the from A import foo just for convenience in implementing your functions, because A is actually long_package_name.really_long_module_name and your code is unreadable because of all those calls to long_package_name.really_long_module_name.long_class_name.class_method_that_puts_me_over_80_characters, remember that you can always import long_package_name.really_long_module_name as P and then use P for you qualified calls.
(Also, remember that with any from done for implementation convenience, you probably want to make sure to specify a __all__ to make sure the imported names don't appear to be part of your namespace if someone does an import * on you from an interactive session.)
Also, as others have pointed out, most, but not all, cases of circular dependencies, are a symptom of bad design, and refactoring your modules in a sensible way will fix it. And in the rare cases where you really do need to bring the names into your namespace, and a circular set of modules is actually the best design, some artificial refactoring may still be a better choice than foo = A.foo.
I'm working on my first significant Python project and I'm having trouble with scope issues and executing code in included files. Previously my experience is with PHP.
What I would like to do is have one single file that sets up a number of configuration variables, which would then be used throughout the code. Also, I want to make certain functions and classes available globally. For example, the main file would include a single other file, and that file would load a bunch of commonly used functions (each in its own file) and a configuration file. Within those loaded files, I also want to be able to access the functions and configuration variables. What I don't want to do, is to have to put the entire routine at the beginning of each (included) file to include all of the rest. Also, these included files are in various sub-directories, which is making it much harder to import them (especially if I have to re-import in every single file).
Anyway I'm looking for general advice on the best way to structure the code to achieve what I want.
Thanks!
In python, it is a common practice to have a bunch of modules that implement various functions and then have one single module that is the point-of-access to all the functions. This is basically the facade pattern.
An example: say you're writing a package foo, which includes the bar, baz, and moo modules.
~/project/foo
~/project/foo/__init__.py
~/project/foo/bar.py
~/project/foo/baz.py
~/project/foo/moo.py
~/project/foo/config.py
What you would usually do is write __init__.py like this:
from foo.bar import func1, func2
from foo.baz import func3, constant1
from foo.moo import func1 as moofunc1
from foo.config import *
Now, when you want to use the functions you just do
import foo
foo.func1()
print foo.constant1
# assuming config defines a config1 variable
print foo.config1
If you wanted, you could arrange your code so that you only need to write
import foo
At the top of every module, and then access everything through foo (which you should probably name "globals" or something to that effect). If you don't like namespaces, you could even do
from foo import *
and have everything as global, but this is really not recommended. Remember: namespaces are one honking great idea!
This is a two-step process:
In your module globals.py import the items from wherever.
In all of your other modules, do "from globals import *"
This brings all of those names into the current module's namespace.
Now, having told you how to do this, let me suggest that you don't. First of all, you are loading up the local namespace with a bunch of "magically defined" entities. This violates precept 2 of the Zen of Python, "Explicit is better than implicit." Instead of "from foo import *", try using "import foo" and then saying "foo.some_value". If you want to use the shorter names, use "from foo import mumble, snort". Either of these methods directly exposes the actual use of the module foo.py. Using the globals.py method is just a little too magic. The primary exception to this is in an __init__.py where you are hiding some internal aspects of a package.
Globals are also semi-evil in that it can be very difficult to figure out who is modifying (or corrupting) them. If you have well-defined routines for getting/setting globals, then debugging them can be much simpler.
I know that PHP has this "everything is one, big, happy namespace" concept, but it's really just an artifact of poor language design.
As far as I know program-wide global variables/functions/classes/etc. does not exist in Python, everything is "confined" in some module (namespace). So if you want some functions or classes to be used in many parts of your code one solution is creating some modules like: "globFunCl" (defining/importing from elsewhere everything you want to be "global") and "config" (containing configuration variables) and importing those everywhere you need them. If you don't like idea of using nested namespaces you can use:
from globFunCl import *
This way you'll "hide" namespaces (making names look like "globals").
I'm not sure what you mean by not wanting to "put the entire routine at the beginning of each (included) file to include all of the rest", I'm afraid you can't really escape from this. Check out the Python Packages though, they should make it easier for you.
This depends a bit on how you want to package things up. You can either think in terms of files or modules. The latter is "more pythonic", and enables you to decide exactly which items (and they can be anything with a name: classes, functions, variables, etc.) you want to make visible.
The basic rule is that for any file or module you import, anything directly in its namespace can be accessed. So if myfile.py contains definitions def myfun(...): and class myclass(...) as well as myvar = ... then you can access them from another file by
import myfile
y = myfile.myfun(...)
x = myfile.myvar
or
from myfile import myfun, myvar, myclass
Crucially, anything at the top level of myfile is accessible, including imports. So if myfile contains from foo import bar, then myfile.bar is also available.