Can I use __init__.py to define global variables? - python

I want to define a constant that should be available in all of the submodules of a package. I've thought that the best place would be in the __init__.py file of the root package. But I don't know how to do this. Suppose I have a few subpackages, each with several modules. How can I access that variable from these modules?
Of course, if this is totally wrong, and there is a better alternative, I'd like to know it.

You should be able to put them in __init__.py. This is done all the time.
mypackage/__init__.py:
MY_CONSTANT = 42
mypackage/mymodule.py:
from mypackage import MY_CONSTANT
print "my constant is", MY_CONSTANT
Then, import mymodule:
>>> from mypackage import mymodule
my constant is 42
Still, if you do have constants, it would be reasonable (probably best practice) to put them in a separate module (constants.py, config.py, ...) and then, if you want them in the package namespace, import them.
mypackage/__init__.py:
from mypackage.constants import *
Note that this doesn't automatically include the constants in the namespaces of the package's modules. Each module in the package will still have to import the constants explicitly, either from mypackage or from mypackage.constants.
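The same pattern works from a subpackage, as the question asks; a minimal sketch (the subpackage and module names here are made up for illustration):
mypackage/subpackage/deepmodule.py:
# Works at any depth of the package tree, as long as mypackage
# itself is importable (i.e. on sys.path).
from mypackage import MY_CONSTANT

def report():
    print("my constant is", MY_CONSTANT)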

You cannot do that. You will have to explicitly import your constants into each individual module's namespace. The best way to achieve this is to define your constants in a "config" module and import it everywhere you require it:
# mypackage/config.py
MY_CONST = 17
# mypackage/main.py
from mypackage.config import *

You can define global variables from anywhere, but it is a really bad idea. Import the __builtin__ module (builtins in Python 3) and modify or add attributes to it, and suddenly you have new built-in constants or functions. In fact, when my application installs gettext, I get the _() function in all my modules, without importing anything. So this is possible, but of course only for application-type projects, not for reusable packages or modules.
And I guess no one would recommend this practice anyway. What's wrong with a namespace? Said application has the version module, so that I have "global" variables available like version.VERSION, version.PACKAGE_NAME etc.
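For completeness, a minimal sketch of the (discouraged) trick described above; note that Python 3 renamed __builtin__ to builtins, and MY_GLOBAL is just an illustrative name:
import builtins

# After this runs once, anywhere in the application, every module
# can reference MY_GLOBAL without importing anything.
builtins.MY_GLOBAL = 42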

Just wanted to add that constants can also be kept in a config.ini file and parsed in the script using the configparser library. This way you can have constants for multiple circumstances. For instance, if you had parameter constants for two separate URL requests, just label them like so:
mymodule/config.ini
[request0]
conn = 'admin#localhost'
pass = 'admin'
...
[request1]
conn = 'barney#localhost'
pass = 'dinosaur'
...
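A minimal sketch of reading that file back with configparser (section and key names taken from the example above; values come back as plain strings, quotes included as written):
import configparser

config = configparser.ConfigParser()
config.read("config.ini")

conn = config["request0"]["conn"]
password = config["request0"]["pass"]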
I found the documentation on the Python website very helpful. I am not sure if there are any differences between Python 2 and 3, so here are the links to both:
For Python 3: https://docs.python.org/3/library/configparser.html#module-configparser
For Python 2: https://docs.python.org/2/library/configparser.html#module-configparser

Related

Should I use from tkinter import * or import tkinter as tk? [duplicate]

It is recommended not to use import * in Python.
Can anyone please share the reason for that, so that I can avoid it doing next time?
Because it puts a lot of stuff into your namespace (it might shadow some other object from a previous import, and you won't know about it).
Because you don't know exactly what is imported, and you can't easily find from which module a certain thing was imported (readability).
Because you can't use cool tools like pyflakes to statically detect errors in your code.
According to the Zen of Python:
Explicit is better than implicit.
... can't argue with that, surely?
You don't pass **locals() to functions, do you?
Since Python lacks an "include" statement, and the self parameter is explicit, and scoping rules are quite simple, it's usually very easy to point a finger at a variable and tell where that object comes from -- without reading other modules and without any kind of IDE (which are limited in the way of introspection anyway, by the fact the language is very dynamic).
The import * breaks all that.
Also, it has a concrete possibility of hiding bugs.
import os, sys, foo, sqlalchemy, mystuff
from bar import *
Now, if the bar module has any of the "os", "mystuff", etc. attributes, they will override the explicitly imported ones, and possibly point to very different things. Defining __all__ in bar is often wise -- it states what will implicitly be imported -- but it is still hard to trace where objects come from without reading and parsing the bar module and following its imports. A network of import * is the first thing I fix when I take ownership of a project.
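A minimal sketch of what that looks like (bar and the names inside it are hypothetical):
# bar.py
__all__ = ["useful_func"]  # only this name is exported by 'from bar import *'

def useful_func():
    return 42

def _helper():  # excluded from __all__, so not exported by a star-import
    pass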
Don't misunderstand me: if the import * were missing, I would cry to have it. But it has to be used carefully. A good use case is to provide a facade interface over another module.
Likewise, the use of conditional import statements, or imports inside function/class namespaces, requires a bit of discipline.
I think in medium-to-big projects, or small ones with several contributors, a minimum of hygiene is needed in terms of static analysis -- running at least pyflakes, or better a properly configured pylint -- to catch several kinds of bugs before they happen.
Of course, since this is python -- feel free to break rules, and to explore -- but be wary of projects that could grow tenfold: if the source code lacks discipline, it will be a problem.
That is because you are polluting the namespace. You will import all the functions and classes into your own namespace, which may clash with the functions you define yourself.
Furthermore, I think using a qualified name is clearer for maintenance; you see on the code line itself where a function comes from, so you can look up the docs much more easily.
In module foo:
def myFunc():
    print(1)
In your code:
from foo import *
def doThis():
    myFunc()  # Which myFunc is called?

def myFunc():
    print(2)
It is OK to do from ... import * in an interactive session.
Say you have the following code in a module called foo:
import xml.etree.ElementTree as etree
and then in your own module you have:
from lxml import etree
from foo import *
You now have a difficult-to-debug module that looks like it has lxml's etree in it, but really has ElementTree instead.
I understand the valid points people have made here. However, I do have one argument that, sometimes, a "star import" may not be a bad practice:
When I want to structure my code in such a way that all the constants go to a module called const.py:
If I do import const, then for every constant, I have to refer to it as const.SOMETHING, which is probably not the most convenient way.
If I do from const import SOMETHING_A, SOMETHING_B ..., then obviously it's way too verbose and defeats the purpose of the structuring.
Thus I feel in this case, doing a from const import * may be a better choice.
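If you go this route, a small sketch of keeping the star-import contained (const.py and its names are illustrative):
# const.py
__all__ = ["SOMETHING_A", "SOMETHING_B"]  # the explicit public surface

SOMETHING_A = "a"
SOMETHING_B = "b"
_INTERNAL = "not exported by a star-import"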
http://docs.python.org/tutorial/modules.html
Note that in general the practice of importing * from a module or package is frowned upon, since it often causes poorly readable code.
These are all good answers. I'm going to add that when teaching new people to code in Python, dealing with import * is very difficult. Even if you or they didn't write the code, it's still a stumbling block.
I teach children (about 8 years old) to program in Python to manipulate Minecraft. I like to give them a helpful coding environment to work with (Atom Editor) and teach REPL-driven development (via bpython). In Atom I find that the hints/completion work just as effectively as in bpython. Luckily, unlike some other static analysis tools, Atom is not fooled by import *.
However, let's take this example... In this wrapper they from local_module import * a bunch of modules, including this list of blocks. Let's ignore the risk of namespace collisions. By doing from mcpi.block import *, they make this entire list of obscure block types something that you have to go look at to know what is available. If they had instead used from mcpi import block, then you could type walls = block. and an autocomplete list would pop up.
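A small sketch of that qualified alternative (assuming the mcpi layout described above; STONE is one of its block constants):
from mcpi import block

walls = block.STONE  # the origin stays visible, and editors can autocomplete block.<...>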
It is a very BAD practice for two reasons:
Code Readability
Risk of overriding variables/functions, etc.
For point 1:
Let's see an example of this:
from module1 import *
from module2 import *
from module3 import *
a = b + c - d
Here, on seeing the code, no one will have any idea which module b, c, and d actually come from.
On the other hand, if you do it like:
from module1 import b, c  # way 1: b and c clearly come from module1
import module2            # way 2
a = b + c - module2.d     # module2.d is clearly from module2
It is much cleaner for you, and the new person joining your team will also have a better idea of what is going on.
For point 2: Let's say both module1 and module2 have a variable b. When I do:
from module1 import *
from module2 import *
print(b)  # will print the value from module2
Here the value from module1 is lost. It will be hard to debug why the code is not working, even though b is declared in module1 and I wrote the code expecting it to use module1.b
If you have the same variable names in different modules and you do not want to import the entire module, you may even do:
from module1 import b as mod1b
from module2 import b as mod2b
As a test, I created a module test.py with 2 functions A and B, which respectively print "A 1" and "B 1". After importing test.py with:
import test
. . . I can run the 2 functions as test.A() and test.B(), and "test" shows up as a module in the namespace, so if I edit test.py I can reload it with:
import importlib
importlib.reload(test)
But if I do the following:
from test import *
there is no reference to "test" in the namespace, so there is no way to reload it after an edit (as far as I can tell), which is a problem in an interactive session. Whereas either of the following:
import test
import test as tt
will add "test" or "tt" (respectively) as module names in the namespace, which will allow re-loading.
If I do:
from test import *
the names "A" and "B" show up in the namespace as functions. If I edit test.py, and repeat the above command, the modified versions of the functions do not get reloaded.
And the following command elicits an error message.
importlib.reload(test) # Error - name 'test' is not defined
If someone knows how to reload a module loaded with "from module import *", please post. Otherwise, this would be another reason to avoid the form:
from module import *
As suggested in the docs, you should (almost) never use import * in production code.
While importing * from a module is bad, importing * from a package is probably even worse.
By default, from package import * imports whatever names are defined by the package's __init__.py, including any submodules of the package that were loaded by previous import statements.
If a package’s __init__.py code defines a list named __all__, it is taken to be the list of submodule names that should be imported when from package import * is encountered.
Now consider this example (assuming there's no __all__ defined in sound/effects/__init__.py):
# anywhere in the code before import *
import sound.effects.echo
import sound.effects.surround
# in your module
from sound.effects import *
The last statement will import the echo and surround modules into the current namespace (possibly overriding previous definitions) because they are defined in the sound.effects package when the import statement is executed.
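A minimal sketch of constraining this with __all__, using the same sound.effects example from the docs:
# sound/effects/__init__.py
# With __all__ defined, 'from sound.effects import *' imports exactly
# these submodules, regardless of what was imported earlier.
__all__ = ["echo", "surround"]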

How to define variables in `__init__.py` , so that when module in its package is imported, they're available to modules of another package? [duplicate]


Making util file not accessible in python

I am building a python library. The functions I want available for users are in stemmer.py. stemmer.py uses stemmerutil.py.
I was wondering whether there is a way to make stemmerutil.py not accessible to users.
If you want to hide implementation details from your users, there are two routes that you can go. The first uses conventions to signal what is and isn't part of the public API, and the other is a hack.
The convention for declaring an API within a python library is to add all classes/functions/names that should be exposed into an __all__-list of the topmost __init__.py. It doesn't do that many useful things, its main purpose nowadays is a symbolic "please use this and nothing else". Yours would probably look somewhat like this:
urdu/urdu/__init__.py
from urdu.stemmer import Foo, Bar, Baz
__all__ = ["Foo", "Bar", "Baz"]
To emphasize the point, you can also give all definitions within stemmerUtil.py an underscore before their name, e.g. def privateFunc(): ... becomes def _privateFunc(): ...
But you can also just hide the code from the interpreter by making it a resource instead of a module within the package and loading it dynamically. This is a hack, and probably a bad idea, but it is technically possible.
First, you rename stemmerUtil.py to just stemmerUtil - now it is no longer a python module and can't be imported with the import keyword. Next, update this line in stemmer.py
import stemmerUtil
with
import importlib.util
import importlib.resources
# importlib.resources was added in Python 3.7; on 3.6 and lower,
# install and use the importlib_resources backport instead

stemmer_util_spec = importlib.util.spec_from_loader("stemmerUtil", loader=None)
stemmerUtil = importlib.util.module_from_spec(stemmer_util_spec)

with importlib.resources.path("urdu", "stemmerUtil") as stemmer_util_path:
    with open(stemmer_util_path) as stemmer_util_file:
        stemmer_util_code = stemmer_util_file.read()

exec(stemmer_util_code, stemmerUtil.__dict__)
After running this code, you can use the stemmerUtil module as if you had imported it, but it is invisible to anyone who installed your package - unless they run this exact code as well.
But as I said, if you just want to communicate to your users which part of your package is the public API, the first solution is vastly preferable.

Integrate class attributes in client namespace

I want to define a bunch of attributes for use in a module that should also be accessible from other modules, because they're part of my interface contract.
I've put them in a data class in my module like this, but I want to avoid qualifying them every time, similar to how you use from module import *:
@dataclass
class Schema:
    key1 = 'key1'
    key2 = 'key2'
and in the same module:
<mymodule.py>
print(my_dict[Schema.key1])
I would prefer to be able to do this:
print(my_dict[key1])
I was hoping for an equivalent syntax to:
from Schema import *
This would allow me to do this from other modules too:
<another_module.py>
from mymodule.Schema import *
but that doesn't work.
Is there a way to do this?
Short glossary
module - a python file that can be imported
package - a collection of modules in a directory that can also be imported; technically it is also a module
name - shorthand for a named value (often just called a "variable" in other languages); names can be imported from modules
Using import statements allows you to import either packages, modules, or names:
import xml # package
from xml import etree # also a package
from xml.etree import ElementTree # module
from xml.etree.ElementTree import TreeBuilder # name
# --- here is where it ends ---
from xml.etree.ElementTree.TreeBuilder import element_factory # does not work
The dots in such an import chain can only be made after module objects, which packages and modules are, and names are not. So, while it looks like we are just accessing attributes of objects, we are actually relying on a mechanism that normal objects just don't support, so we can't import from within them.
In your particular case, a reasonable solution would be to simply turn the object that you wanted to hold the schema into a top-level module in your project:
schema.py
key1 = 'key1'
key2 = 'key2'
...
This will give you the option to import them in the way that you initially proposed. Doing something like this to make common constants easily accessible in your project is not unusual, and the django framework, for example, uses a settings.py in the same manner.
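A minimal sketch of the resulting usage (module and key names from the example above; my_dict stands in for the dictionary from the question):
# another_module.py
from schema import *  # or, more explicitly: from schema import key1, key2

print(my_dict[key1])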
One thing you should keep in mind is that names in python modules are effectively singletons, so their values can't be changed at runtime[1].
[1] They can, but it's so hacky that it should pretty much always be treated as not possible.

How to import modules that are used in both the main code and a module correctly?

Let's assume I have a main script, main.py, that imports another python file with import coolfunctions and another: import chores
Now, suppose coolfunctions also uses stuff from chores, hence I declare import chores inside coolfunctions.
Since both main.py and coolfunctions import chores -- is this redundant? Is there any other way of doing this? Am I doing it correctly?
I'm confused about how python projects should be structured in general. I have a "conf.py" file that I import for a bunch of variables -- is this a module or not? I load this conf file in multiple places as well.
If two modules want to use chores, then each one must import chores (or some equivalent import). Each import creates a name binding only in the namespace of the module that does the import; that is, import's namespace effect is local to a module's namespace.
This is good, because by looking at a module's code you can (barring pathological cases) know where each name is bound to by the import statements that explicitly bind modules or module attributes to names. Imports made in other modules won't affect this module's namespace.
Each module X should import all (and only) the modules Y, Z, T, ... whose functionality it requires, without any worry about what other modules Fee, Fie, Foo ... (if any) may have already done part or all of those imports, or may be going to do so in the future.
It would make a module extremely fragile (indeed, it would be the very opposite of modularity!) if each module had to worry about such subtle, "covert-channel" effects.
What other modules Y, Z, T, ..., each module X chooses to import (if any) is part of X's implementation details, and shouldn't concern anybody except the developers who are coding, testing, or maintaining X.
In order to ensure that this is the case, and that this clearly-best strategy of decoupling can and will fully be followed by sane code, Python "caches" modules as they get imported: a module is "loaded" only once per run of a program, the first time anybody imports it (or anything from inside it) -- all other imports use the same object obtained by that first loading, which Python keeps in a cache (which is specified as being the dict sys.modules, but you need to know that detail only for somewhat-advanced programming techniques... don't worry about it, 98.7% of the time -- just remember that "import is cheap"!-).
Sure, a conf.py that you use from several other modules via import conf is definitely a module (you may think you're loading it multiple times, but you aren't unless you're using pretty advanced and deliberate techniques indeed for the purpose) -- why shouldn't it be?
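A minimal sketch of the caching behavior described above (conf is the example module from the question; any importable module behaves the same):
import sys

import conf  # first import anywhere: the module is loaded and cached
import conf  # every later import reuses the cached module object

assert sys.modules["conf"] is conf  # same object every time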
No, this isn't redundant - it's fine to import chores in both the main module and coolfunctions.
The exact import mechanics of Python are complex (for example, module imports are only done once, meaning in your case that the actual parsing and loading of the chores module will only happen once, which is a nice optimization) but in general you shouldn't worry about it because it just works.
Each Python file is a module, so your conf.py is also a module.
It is always best practice to import all necessary modules in the file that uses them. Take for example:
A.py contains: import coolfunctions
B.py contains: import A
Main.py contains: import B and uses functions that are defined in A.py (this works because importing B also loads A, so A's functions are reachable through B, as attributes of B.A)
If, in the future, you change B.py so it no longer needs A.py and therefore remove the import A, then your Main.py will break, because it never imported A itself.
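A minimal sketch of that fragile pattern (file and function names are the hypothetical ones from this answer):
# A.py
def helper():  # a hypothetical function that Main.py wants to use
    return "from A"

# B.py
import A

# Main.py
import B
print(B.A.helper())  # works only as long as B keeps its 'import A'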
