Python now has an Enum type (new in 3.4 with PEP 435, and also backported), and while namespaces are a good thing, sometimes Enums are used more like constants, and the enum members should live in the global (er, module) namespace.
So instead of:
class Constant(Enum):
    PI = 3.14
    ...

area = Constant.PI * r * r
I can just say:
area = PI * r * r
Is there an easy way to get from Constant.PI to just PI?
The officially supported method is something like this:
globals().update(Constant.__members__)
This works because __members__ is the dict-like object that holds the names and members of the Enum class.
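For a concrete illustration, here is a minimal sketch; the float mix-in is my addition so the members support arithmetic (a plain Enum member would not define *):

from enum import Enum

class Constant(float, Enum):   # float mix-in so members behave as numbers
    PI = 3.141593

globals().update(Constant.__members__)

r = 2
area = PI * r * r              # PI is Constant.PI, which is also a float
print(area)                    # 12.566372
print(PI is Constant.PI)       # True: the module-level name is the member itself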
I personally find that ugly enough that I usually add the following method to my Enum classes:
@classmethod
def export_to(cls, namespace):
    namespace.update(cls.__members__)
and then in my top level code I can say:
Constant.export_to(globals())
Note: exporting an Enum to the global namespace only works well when the module only has one such exported Enum. If you have several it is better to have a shorter alias for the Enum itself, and use that instead of polluting the global namespace:
class Constant(Enum):
    PI = 3.14
    ...

C = Constant
area = C.PI * r * r
FWIW — this is more of a comment than an answer — below is the beginning of a function from some old code of mine which implements its own named int-like enumerated-value objects that are added to the global namespace by default (there's no named container Enum class involved). It does something similar to what's shown in your own answer, and I think it's a good overall approach because it's worked well for me.
import sys

def Enum(names, values=None, namespace=None):
    """Function to assign values to names given and add them to a
    namespace. Default values are a sequence of integers starting at
    zero. Default namespace is the caller's globals."""
    if namespace is None:
        namespace = sys._getframe(1).f_globals  # caller's globals
    pairs = _paired(names, values)  # _paired helper elided
    namespace.update(pairs)  # bind names to corresponding named numbers
    . . .
The point being: as far as the current Enum module goes, I'd suggest adding something like it, or the export_to() method shown in your own answer, to the Enum base class in the next Python release so it's available automatically.
Related
Should I use an underscore before function names in a simple single-file Python script?
For example, in this script should I add underscore before definition of f and g? This script won't be imported from other files.
def f(x):  # or _f(x)?
    return x * 2

def g(x):  # or _g(x)?
    return x ** 2

def main():
    x = f(100)
    print(g(x))

if __name__ == "__main__":
    main()
I have read many documents about the usage of underscores in Python. Many of them say underscores relate to OOP-style programming and to how the import statement works. However, for a simple one-file script, I can't find a good answer.
Which is the better pattern?
The most popular programming languages (such as Java and C++) have syntax like this to declare private attributes and methods of a class:
class MyClass {
private:
    int _function() {
        // Some code here
    }
public:
    bool foo() {
        // Some code here
    }
};
Python doesn't have (and will probably never have) syntax like this, so the community adopted a naming convention to make developers understand that a method is private and should never be accessed from outside the class.
According to PEP-8:
We don't use the term "private" here, since no attribute is really private in Python (without a generally unnecessary amount of work).
According to Python 2 and Python 3 class documentation:
“Private” instance variables that cannot be accessed except from inside an object don’t exist in Python. However, there is a convention that is followed by most Python code: a name prefixed with an underscore (e.g. _spam) should be treated as a non-public part of the API (whether it is a function, a method or a data member).
Using _ may also be useful for a technique explained below.
According to Python 3 class documentation:
Since there is a valid use-case for class-private members (namely to avoid name clashes of names with names defined by subclasses), there is limited support for such a mechanism, called name mangling.
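A quick sketch of what that mangling looks like in practice (the class names here are made up for illustration):

class Base:
    def __init__(self):
        self.__token = "base"      # stored as _Base__token

class Child(Base):
    def __init__(self):
        super().__init__()
        self.__token = "child"     # stored as _Child__token: no clash

c = Child()
print(c._Base__token)   # base
print(c._Child__token)  # child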
NOTE: The Python documentation mostly talks about "internal use" in the context of classes, but the convention applies to modules too, for example:
# main.py
import myModule

myModule.functionThatCanBeImported()
# myModule.py
def functionThatCanBeImported(): pass
def _functionThatShouldNeverBeImported(): pass
If somebody creates a package, they don't have to put private functions in the documentation, since they are for internal use and explanations about them would only be useful to developers.
_ is used to let developers know that a variable or method is private and should not be modified externally. So use _ if you want to keep that functionality internal to the class.
NOTE: using _ will not actually restrict the use of these methods or variables; it's just a convention.
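One concrete consequence of the convention, as a small sketch: a wildcard import skips underscore-prefixed names by default. (An explicit from myModule import _functionThatShouldNeverBeImported would still work; this is convention, not enforcement.)

from myModule import *

functionThatCanBeImported()            # fine
_functionThatShouldNeverBeImported()   # NameError: '*' skipped the _name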
I have a module called entities.py; there are 2 classes within it and 2 global variables, following the pattern below:
FIRST_VAR = ...
SECOND_VAR = ...
class FirstClass:
[...]
class SecondClass:
[...]
I also have another module (let's call it main.py for now) where I import both classes and the constants, like this:
from entities import FirstClass, SecondClass, FIRST_VAR, SECOND_VAR
In the same "main.py" module I have another constant: THIRD_VAR = ..., and another class, in which all of imported names are being used.
Now, I have a function, which is being called only if a certain condition is met (passing config file path as CLI argument in my case). As my best bet, I've written it as following:
def update_consts_from_config(config: ConfigParser):
    global FIRST_VAR
    global SECOND_VAR
    global THIRD_VAR
    FIRST_VAR = ...
    SECOND_VAR = ...
    THIRD_VAR = ...
This works perfectly fine, although PyCharm flags two issues, which at least I don't consider accurate.
from entities import FirstClass, SecondClass, FIRST_VAR, SECOND_VAR - here it warns me that FIRST_VAR and SECOND_VAR are unused imports, but from my understanding and testing they are used, and they are not re-bound elsewhere unless update_consts_from_config is invoked.
Also, under update_consts_from_config function:
global FIRST_VAR - at this and the next lines, it says
Global variable FIRST_VAR is undefined at the module level
My question is: should I really care about those warnings (as I think the code is correct and clear), or am I missing something important and should I come up with something different here?
I know I could do something like:
import entities
from entities import FirstClass, SecondClass
FIRST_VAR = entities.FIRST_VAR
SECOND_VAR = entities.SECOND_VAR
and work from there, but this looks like overkill to me. The entities module contains only what I have to import in main.py, which strictly depends on it, so I would rather import those names explicitly than reference them through entities. just for that reason.
What do you think would be best practice here? I would like my code to be clear, unambiguous and reasonably optimal.
Import only entities, then refer to variables in its namespace to access/modify them.
Note: this pattern, modifying constants in other modules (which then, to purists, aren't so much constants as globals), can be justified. I have tons of cases where I use constants, rather than magic values, as module-level configuration. However, for testing for example, I might reach in and modify these constants: say, to switch a cache expiry from 2 days to 0.1 seconds to test caching, or, as you propose, to override configuration. Tread carefully, but it can be useful.
main.py:
import entities

def update_consts_from_config(FIRST_VAR):
    entities.FIRST_VAR = FIRST_VAR

firstclass = entities.FirstClass()
print(f"{entities.FIRST_VAR=} before override")
firstclass.debug()
entities.debug()

update_consts_from_config("override")

print(f"{entities.FIRST_VAR=} after override")
firstclass.debug()
entities.debug()
entities.py:
FIRST_VAR = "ori"
class FirstClass:
def debug(self):
print(f"entities.py:{FIRST_VAR=}")
def debug():
print(f"making sure no closure/locality effects after object instantation {FIRST_VAR=}")
$ python main.py
entities.FIRST_VAR='ori' before override
entities.py:FIRST_VAR='ori'
making sure no closure/locality effects after object instantiation FIRST_VAR='ori'
entities.FIRST_VAR='override' after override
entities.py:FIRST_VAR='override'
making sure no closure/locality effects after object instantiation FIRST_VAR='override'
Now, if FIRST_VAR wasn't a string, int or another immutable type, you should (I think) be able to import it separately and mutate it, like SECOND_VAR.append("config override") in main.py. But assigning to a global in main.py will only affect main.py's own binding, so if you want to share actual state between main.py, entities and other modules, everyone (not just main.py) needs to import entities and then access entities.FIRST_VAR. A sketch of that distinction follows.
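A minimal sketch; it assumes entities.py also defines SECOND_VAR = [] (a mutable), which is not in the original code:

# main.py, assuming entities.py defines SECOND_VAR = []
import entities
from entities import SECOND_VAR

SECOND_VAR.append("config override")   # mutates the one shared list object
print(entities.SECOND_VAR)             # ['config override']: visible everywhere

SECOND_VAR = ["rebound"]               # rebinds only main.py's own name
print(entities.SECOND_VAR)             # still ['config override']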
Oh, and if you had:
class SecondClass:
    def __init__(self):
        self.FIRST_VAR = FIRST_VAR
then its instance-level value of that immutable string/int would not be affected by any overrides done after the instance's creation. Mutables like lists or dictionaries would be affected, because all those bindings point to the same underlying object.
Last, with regard to those "tricky" namespaces: global in your original code means "don't treat FIRST_VAR as a variable to assign in update_consts_from_config's local namespace; instead assign it in main.py's global, script-level namespace".
It does not mean "assign it to some global state magically shared between entities.py and main.py". __builtins__ might be that beast, but modifying it is considered extremely bad form in Python.
globalEx1.py:
globals()['a'] = '100'

def setvalue(val):
    globals()['a'] = val
globalEx2.py:
from globalEx1 import *

print(a)
setvalue('200')
print(a)
On executing globalEx2.py:
Output:
100
100
How can I change the value of globals()['a'] using a function, so that the change is reflected across the .py files?
Each module has its own globals. Python is behaving exactly as expected. Updating globalEx1's a to point to something else isn't going to affect where globalEx2's a is pointing.
There are various ways around this, depending on exactly what you want:

1. Re-import a after the setvalue() call.
2. Return a and assign it, like a = setvalue().
3. import globalEx1 and use globalEx1.a instead of a. (Or use import globalEx1 as some shorter name.)
4. Pass globalEx2's globals() as an argument to setvalue and set the value on that instead (a sketch follows this list).
5. Make a a mutable object containing your value, like a list, dict or types.SimpleNamespace, and mutate it in setvalue.
6. Use inspect inside setvalue to get the caller's globals from its stack frame. (Convenient, but brittle.)
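To make option 4 concrete, here is a minimal sketch; the namespace parameter is my addition, not part of the original code:

# globalEx1.py
a = '100'

def setvalue(val, namespace=None):
    if namespace is None:
        namespace = globals()   # default: this module's own globals
    namespace['a'] = val

# globalEx2.py
from globalEx1 import *

print(a)                     # 100
setvalue('200', globals())   # write into globalEx2's namespace explicitly
print(a)                     # 200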
The last option looks suitable to me; it will do the job with minimal code change. But can I update the globals of multiple modules the same way, or does it only give me the caller's globals?
Option 6 is actually the riskiest. The caller itself basically becomes a hidden parameter to the function, so something like a decorator from another module can break it without warning. Option 4 just makes that hidden parameter explicit, so it's not so brittle.
If you need this to work across more than two modules, option 6 isn't good enough, since it only gives you the current call stack. Option 3 is probably the most reliable for what you seem to be trying to do.
How does option 1 work? I mean, is it about running from globalEx1 import * again? Because I have many variables like a.
A module becomes an object when imported the first time and it's saved in the sys.modules cache, so importing it again doesn't execute the module again. A from ... import (even with the *) just gets attributes from that module object and adds them to the local scope (which is the module globals if done at the top level, that is, outside of any definition.)
The module object's __dict__ is basically its globals, so any function that alters the module's globals will affect the resulting module object's attrs, even if it's done after the module was imported.
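Both behaviours are easy to observe; a small sketch using the question's globalEx1:

import sys
import globalEx1

print(sys.modules['globalEx1'] is globalEx1)   # True: the cached module object
print(globalEx1.__dict__ is vars(globalEx1))   # True: its attrs are its globals

import globalEx1 as again                      # second import: no re-execution
print(again is globalEx1)                      # True: same cached object

globalEx1.setvalue('200')                      # rebinds a in globalEx1's globals
print(globalEx1.a)                             # 200: visible as a module attribute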
We cannot do from globalEx1 import * inside a Python function; is there an alternative to this?
The star syntax is only allowed at the top level. But remember that it's just reading attributes from the module object. So you can get a dict of all the module attributes like this:
return vars(globalEx1)
This will give you more than * would: a star import skips names that begin with an _ by default, or imports only the subset specified in __all__ when that's defined. You can filter the resulting dict with a dict comprehension, and even .update() the globals dict of some other module with the result.
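For example, a rough sketch of that filtering, mirroring the rules * uses:

import globalEx1

attrs = vars(globalEx1)
# Honour __all__ if present; otherwise skip underscore-prefixed names.
public = getattr(globalEx1, '__all__',
                 [name for name in attrs if not name.startswith('_')])
filtered = {name: attrs[name] for name in public}
globals().update(filtered)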
But rather than re-implementing this filtering logic, you could just use exec to run the import at a top level. Then the only weird key you'd get is __builtins__:
namespace = {}
exec('from globalEx1 import *', namespace)
del namespace['__builtins__']
return namespace
Then you can globals().update(namespace) or whatever.
Using exec like this is probably considered bad form, but then so is import * to begin with, honestly.
This is an interesting problem, related to the fact that strings are immutable. The line from globalEx1 import * creates two references in the globalEx2 module: a and setvalue. globalEx2.a initially refers to the same string object as globalEx1.a, since that's how imports work.
However, once you call setvalue, which operates on the globals of globalEx1, the value referenced by globalEx1.a is replaced by another string object. Since strings are immutable, there is no way to do this in place. The value of globalEx2.a remains bound to the original string object, as it should.
You have a couple of workarounds available here. The most pythonic is to fix the import in globalEx2:
import globalEx1

print(globalEx1.a)
globalEx1.setvalue('200')
print(globalEx1.a)
Another option would be to use a mutable container for a, and access that:
# globalEx1.py
globals()['a'] = ['100']

def setvalue(val):
    globals()['a'][0] = val

# globalEx2.py
from globalEx1 import *

print(a[0])
setvalue('200')
print(a[0])
A third, and wilder option, is to make globalEx2's setvalue a copy of the original function, but with its __globals__ attribute set to the namespace of globalEx2 instead of globalEx1:
from functools import update_wrapper
from types import FunctionType

from globalEx1 import *

_setvalue = FunctionType(setvalue.__code__, globals(), name=setvalue.__name__,
                         argdefs=setvalue.__defaults__,
                         closure=setvalue.__closure__)
_setvalue = update_wrapper(_setvalue, setvalue)
_setvalue.__kwdefaults__ = setvalue.__kwdefaults__
setvalue = _setvalue
del _setvalue

print(a)
...
The reason you have to make the copy is that __globals__ is a read-only attribute, and also you don't want to mess with the function in globalEx1. See https://stackoverflow.com/a/13503277/2988730.
Globals are bound only once, when the import statement runs. Thus, if the global is an immutable object like a str or int, any rebinding in the source module will not be reflected in the importing module. However, if the global is a mutable object like a list, in-place updates will be reflected. For example,
globalEx1.py:
globals()['a'] = [100]

def setvalue(val):
    globals()['a'][0] = val
The output will be changed as expected:
[100]
[200]
Aside
It's easier to define globals like normal variables:
a = [100]

def setvalue(value):
    a[0] = value
Or, when rebinding an immutable value:
a = 100

def setvalue(value):
    global a
    a = value
I use a specialized Python module which modifies some of the Django class methods at runtime (aka monkey-patching). If I need these 'old' versions, is it possible to 'come back' to them, overriding the monkey-patching?
Something like importing the initial version of these classes, for example?
Here is an example of how patching was done in the package:
from django.template.base import FilterExpression

def patch_filter_expression():
    original_resolve = FilterExpression.resolve

    def resolve(self, context, ignore_failures=False):
        return original_resolve(self, context, ignore_failures=False)

    FilterExpression.resolve = resolve
It depends on what the patch did. Monkeypatching is nothing special; it's just an assignment of a different object to a name. If nothing else references the old value anymore, then it's gone from Python's memory.
But if the code that patched the name has kept a reference to the original object in the form of a different variable, then the original object is still there to be 'restored':
import target.module

_original_function = target.module.target_function

def new_function(*args, **kwargs):
    result = _original_function(*args, **kwargs)
    return result * 5

target.module.target_function = new_function
Here the name target_function in the target.module module namespace was re-bound to point to new_function, but the original object is still available as _original_function in the namespace of the patching code.
If this is done in a function, then the original could be available as a closure too. For your specific example, you can get the original with:
FilterExpression.resolve.__closure__[0].cell_contents
or, if you prefer access by name:
def closure_mapping(func):
    closures, names = func.__closure__, func.__code__.co_freevars
    return {n: c.cell_contents for n, c in zip(names, closures)}

original_resolve = closure_mapping(FilterExpression.resolve)['original_resolve']
Otherwise, you can tell Python to reload the original module with importlib.reload():
import importlib
import target.module

importlib.reload(target.module)
This refreshes the module namespace, 'resetting' all global names to what they'd been set to at import time (any additional names are retained).
Note, however, that any code holding a direct reference to the patched object (such as your class object) would not see the updated objects! That's because from target.module import target_function creates a new reference to the target_function object in the current namespace, and no amount of reloading of the original target.module module will update any of the other direct references. You'd have to update those other references manually, or reload their namespaces too.
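A hedged sketch of that manual refresh, reusing the hypothetical target.module from above:

import importlib
import target.module
from target.module import target_function   # direct (possibly patched) reference

importlib.reload(target.module)             # resets target.module's own globals

# The local name still points at the old object; rebind it by hand:
target_function = target.module.target_function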
Question
Is there a "pythonic" (i.e. canonical, official, PEP8-approved, etc) way to re-use string literals in python internal (and external) APIs?
Background
For example, I'm working with some (inconsistent) JSON-handling code (thousands of lines) where there are various JSON "structs" we assemble, parse, etc. One of the recurring problems that comes up during code reviews is different JSON structs using the same internal parameter names, causing confusion and eventually bugs, e.g.:
pathPacket['src'] = "/tmp"
pathPacket['dst'] = "/home/user/out"
urlPacket['src'] = "localhost"
urlPacket['dst'] = "contoso"
These two (example) packets have dozens of identically named fields, but they represent very different types of data. There was no code-reuse justification for this implementation. People typically use code-completion engines to get the members of the JSON struct, and this eventually leads to hard-to-debug problems down the road: mis-typed string literals cause functional issues without triggering an error earlier on. When we have to change these APIs, it takes a lot of time to hunt down the string literals to find out which JSON structs use which fields.
Question - Redux
Is there a better approach to this that is common amongst members of the python community? If I was doing this in C++, the earlier example would be something like:
const char *JSON_PATH_SRC = "src";
const char *JSON_PATH_DST = "dst";
const char *JSON_URL_SRC = "src";
const char *JSON_URL_DST = "dst";

// Define/allocate JSON structs
pathPacket[JSON_PATH_SRC] = "/tmp";
pathPacket[JSON_PATH_DST] = "/home/user/out";
urlPacket[JSON_URL_SRC] = "localhost";
urlPacket[JSON_URL_DST] = "contoso";
My initial approach would be to:
Use abc to make an abstract base class that can't be initialized as an object, and populate it with read-only constants.
Use that class as a common module throughout my project.
By using these constants, I can reduce the chance of this class of error: the symbol won't exist if mis-spelled, whereas a string-literal typo can slip through code reviews (see the sketch below).
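A small sketch of the two failure modes being compared (both misspellings are deliberate):

JSON_PATH_SRC = "src"   # hypothetical constant from the example

pathPacket = {}
pathPacket["scr"] = "/tmp"           # typo in a literal: silently creates a wrong key
pathPacket[JSON_PATH_SCR] = "/tmp"   # typo in a name: NameError at run time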
My Proposed Solution (open to advice/criticism)
from abc import ABCMeta

class Custom_Structure:
    __metaclass__ = ABCMeta

    @property
    def JSON_PATH_SRC(self):
        return self._JSON_PATH_SRC

    @property
    def JSON_PATH_DST(self):
        return self._JSON_PATH_DST

    @property
    def JSON_URL_SRC(self):
        return self._JSON_URL_SRC

    @property
    def JSON_URL_DST(self):
        return self._JSON_URL_DST
The way this is normally done is:
JSON_PATH_SRC = "src"
JSON_PATH_DST = "dst"
JSON_URL_SRC = "src"
JSON_URL_DST = "dst"
pathPacket[JSON_PATH_SRC] = "/tmp"
pathPacket[JSON_PATH_DST] = "/home/user/out"
urlPacket[JSON_URL_SRC] = "localhost"
urlPacket[JSON_URL_SRC] = "contoso"
Upper-case to denote "constants" is the way it goes. You'll see this in the standard library, and it's even recommended in PEP8:
Constants are usually defined on a module level and written in all
capital letters with underscores separating words. Examples include
MAX_OVERFLOW and TOTAL.
Python doesn't have true constants, and it seems to have survived without them. If it makes you feel more comfortable wrapping this in a class that uses ABCMeta with properties, go ahead. But I'm pretty sure abc.ABCMeta does not prevent object initialization; it only does so when abstract methods are present. Indeed, if it did, your use of property would not work! property objects belong to the class, but are meant to be accessed from an instance. To me, it just looks like a lot of rigamarole for very little gain.
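You can verify that quickly; ABCMeta only blocks instantiation when unimplemented abstract methods exist:

from abc import ABCMeta, abstractmethod

class NoAbstracts(metaclass=ABCMeta):
    pass

NoAbstracts()       # fine: nothing abstract, so instantiation is allowed

class WithAbstract(metaclass=ABCMeta):
    @abstractmethod
    def f(self): ...

WithAbstract()      # TypeError: can't instantiate abstract class WithAbstract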
The easiest way in my opinion to make constants is just to set them as variables in your module (and not modify them).
JSON_PATH_SRC = "src"
JSON_PATH_DST = "dst"
JSON_URL_SRC = "src"
JSON_URL_DST = "dst"
Then if you need to reference them from another module they're already namespaced for you.
>>> that_module.JSON_PATH_SRC
'src'
>>> that_module.JSON_PATH_DST
'dst'
>>> that_module.JSON_URL_SRC
'src'
>>> that_module.JSON_URL_DST
'dst'
The simplest way to create a bunch of constants is to place them into a module, and import them as necessary. For example, you could have a constants.py module with
JSON_PATH_SRC = "src"
JSON_PATH_DST = "dst"
JSON_URL_SRC = "src"
JSON_URL_DST = "dst"
Your code would then do something like
from constants import JSON_URL_SRC
...
urlPacket[JSON_URL_SRC] = "localhost"
If you would like a better defined grouping of the constants, you can either stick them into separate modules in a dedicated package, allowing you to access them like constants.json.url.DST for example, or you could use Enums. The Enum class allows you to group related sets of constants into a single namespace. You could write a module constants.py like this:
from enum import Enum

class JSONPath(Enum):
    SRC = 'src'
    DST = 'dst'

class JSONUrl(Enum):
    SRC = 'src'
    DST = 'dst'
OR
from enum import Enum

class JSON(Enum):
    PATH_SRC = 'src'
    PATH_DST = 'dst'
    # Careful: these repeat the values above, so Enum aliasing makes
    # URL_SRC and URL_DST aliases of PATH_SRC and PATH_DST (same members).
    URL_SRC = 'src'
    URL_DST = 'dst'
How exactly you separate your constants is up to you. You can have a single giant enum, one per category, or something in between. You would access them in your code like this:
from constants import JSONUrl
...
urlPacket[JSONUrl.SRC.value] = "localhost"
OR
from constants import JSON
...
urlPacket[JSON.URL_SRC.value] = "localhost"
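If the .value calls feel noisy, one standard variation (not shown in the answers above) is a str mix-in, which lets the members behave as strings directly:

from enum import Enum

class JSONUrl(str, Enum):   # str mix-in: members compare and hash as strings
    SRC = 'src'
    DST = 'dst'

urlPacket = {}
urlPacket[JSONUrl.SRC] = "localhost"   # no .value needed for dict keys
print(urlPacket['src'])                # localhost: the member hashes as 'src'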