Re-referencing a large number of functions in python - python

I have a file functional.py which defines a number of useful functions. For each function, I want to create an alias that when called will give a reference to a function. Something like this:
foo/functional.py
def fun1(a):
return a
def fun2(a):
return a+1
...
foo/__init__.py
from inspect import getmembers, isfunction
from . import functional
for (name, fun) in getmembers(functional, isfunction):
dun = lambda f=fun: f
globals()[name] = dun
>> bar.fun1()(1)
>> 1
>> bar.fun2()(1)
>> 2
I can get the functions from functional.py using inspect and dynamically define a new set of functions that are fit for my purpose.
But why? you might ask... I am using a configuration manager Hydra where one can instantiate objects by specifying the fully qualified name. I want to make use of the functions in functional.py in the config and have hydra pass a reference to the function when creating an object that uses the function (more details can be found in the Hydra documentation).
There are many functions and I don't want to write them all out ... people have pointed out in similar questions that modifying globals() for this purpose is bad practice. My use case is fairly constrained - documentation wise there is a one-one mapping (but obviously an IDE won't be able to resolve it).
Basically, I am wondering if there is a better way to do it!

Is your question related to this feature request and in particular to this comment?
FYI: In Hydra 1.1, instantiate fully supports positional arguments so I think you should be able to call functools.partial directly without redefining it.

Related

Is it bad practice to modify attributes of one module from another module?

I want to define a bunch of config variables that can be imported in all the modules in my project. The values of those variables will be constant during runtime but are not known before runtime; they depend on the input. Usually I'd define a dict in my top module which would be passed to all functions and classes from other modules; however, I was thinking it may be cleaner to simply create a blank config.py module which would be dynamically filled with config variables by the top module:
# top.py
import config
config.x = x
# config.py
x = None
# other.py
import config
print(config.x)
I like this approach because I don't have to save the parameters as attributes of classes in my other modules; which makes sense to me because parameters do not describe classes themselves.
This works but is it considered bad practice?
The question as such may be disputed. But I would generally say yes, it's "bad practice" because scope and impact of change is really getting blurred. Note the use case you're describing really is not about sharing configuration, but about different parts of the program functions, objects, modules exchanging data and as such it's a bit of a variation on (meta)global variable).
Reading common configuration values could be fine, but changing them along the way... you may lose track of what happened where and also in which order as modules get imported / values get modified. For instance assume the config.py and two modules m1.py:
import config
print(config.x)
config.x=1
and m2.py:
import config
print(config.x)
config.x=2
and a main.py that just does:
import m1
import m2
import config
print(config.x)
or:
import m2
import m1
import config
print(config.x)
The state in which you find config in each module and really any other (incl. main.py here) depends on order in which imports have occurred and who assigned what value when. Even for a program entirely under your control, this may get confusing (and source of mistakes) rather quickly.
For runtime data and passing information between objects and modules (and your example is really that and not configuration that is predefined and shared between modules) I would suggest you look into describing the information perhaps in a custom state (config) object and pass it around through appropriate interface. But really just a function / method argument may be all that is needed. The exact form depends on what exactly you're trying to achieve and what your overall design is.
In your example, other.py behaves differently when called or imported before top.py which may still seem obvious and manageable in a minimal example, but really is not a very sound design. Anyone reading the code (incl. future you) should be able to follow its logic and this IMO breaks its flow.
The most trivial (and procedural) example of what for what you've described and now I hopefully have a better grasp of would be other.py recreating your current behavior:
def do_stuff(value):
print(value) # We did something useful here
if __name__ == "__main__":
do_stuff(None) # Could also use config with defaults
And your top.py presumably being the entry point and orchestrating importing and execution doing:
import other
x = get_the_value()
other.do_stuff(x)
You can of course introduce an interface to configure do_stuff perhaps a dict or a custom class even with default implementation in config.py:
class Params:
def __init__(self, x=None):
self.x = x
and your other.py:
def do_stuff(params=config.Params()):
print(params.x) # We did something useful here
And on your top.py you can use:
params = config.Params(get_the_value())
other.do_stuff(params)
But you could also have any use case specific source of value(s):
class TopParams:
def __init__(self, url):
self.x = get_value_from_url(url)
params = TopParams("https://example.com/value-source")
other.do_stuff(params)
x could even be a property which you retrieve every time you access it... or lazily when needed and then cached... Again, it really then is a matter of what you need to do.
"Is it bad practice to modify attributes of one module from another module?"
that it is considered as bad practice - violation of the law of demeter, which means in fact "talk to friends, not to strangers".
Objects should expose behaviour and functions, but should HIDE the data.
DataStructures should EXPOSE data, but should not have any methods (which are exposed). The law of demeter does not apply to such DataStructures. OOP Purists might cover such DataStructures with setters and getters, but it really adds no value in Python.
there is a lot of literature about that like : https://en.wikipedia.org/wiki/Law_of_Demeter
and of course, a must to read: "Clean Code", by Robert C. Martin (Uncle Bob), check it out on Youtube also.
For procedural programming it is perfectly normal to keep data in a DataStructure which does not have any (exposed) methods.
The procedures in the program work with that data. Consider to use the module attrs, see : https://www.attrs.org/en/stable/ for easy creation of such classes.
my prefered method for keeping config is (here without using attrs):
# conf_xy.py
"""
config is code - so why use damned parsers, textfiles, xml, yaml, toml and all that
if You just can use testable code as config that can deliver the correct types, etc.
as well as hinting in Your favorite IDE ?
Here, for demonstration without using attrs package - usually I use attrs (read the docs)
"""
class ConfXY(object):
def __init__(self) -> None:
self.x: int = 1
self.z: float = get_z_from_input()
...
conf_xy=ConfXY()
# other.py
from conf_xy import conf_xy
...
y = conf_xy.x * 2
...

Is there a way to get ad-hoc polymorphism in Python?

Many languages support ad-hoc polymorphism (a.k.a. function overloading) out of the box. However, it seems that Python opted out of it. Still, I can imagine there might be a trick or a library that is able to pull it off in Python. Does anyone know of such a tool?
For example, in Haskell one might use this to generate test data for different types:
-- In some testing library:
class Randomizable a where
genRandom :: a
-- Overload for different types
instance Randomizable String where genRandom = ...
instance Randomizable Int where genRandom = ...
instance Randomizable Bool where genRandom = ...
-- In some client project, we might have a custom type:
instance Randomizable VeryCustomType where genRandom = ...
The beauty of this is that I can extend genRandom for my own custom types without touching the testing library.
How would you achieve something like this in Python?
Python is not a strongly typed language, so it really doesn't matter if yo have an instance of Randomizable or an instance of some other class which has the same methods.
One way to get the appearance of what you want could be this:
types_ = {}
def registerType ( dtype , cls ) :
types_[dtype] = cls
def RandomizableT ( dtype ) :
return types_[dtype]
Firstly, yes, I did define a function with a capital letter, but it's meant to act more like a class. For example:
registerType ( int , TheLibrary.Randomizable )
registerType ( str , MyLibrary.MyStringRandomizable )
Then, later:
type = ... # get whatever type you want to randomize
randomizer = RandomizableT(type) ()
print randomizer.getRandom()
A Python function cannot be automatically specialised based on static compile-time typing. Therefore its result can only depend on its arguments received at run-time and on the global (or local) environment, unless the function itself is modifiable in-place and can carry some state.
Your generic function genRandom takes no arguments besides the typing information. Thus in Python it should at least receive the type as an argument. Since built-in classes cannot be modified, the generic function (instance) implementation for such classes should be somehow supplied through the global environment or included into the function itself.
I've found out that since Python 3.4, there is #functools.singledispatch decorator. However, it works only for functions which receive a type instance (object) as the first argument, so it is not clear how it could be applied in your example. I am also a bit confused by its rationale:
In addition, it is currently a common anti-pattern for Python code to inspect the types of received arguments, in order to decide what to do with the objects.
I understand that anti-pattern is a jargon term for a pattern which is considered undesirable (and does not at all mean the absence of a pattern). The rationale thus claims that inspecting types of arguments is undesirable, and this claim is used to justify introducing a tool that will simplify ... dispatching on the type of an argument. (Incidentally, note that according to PEP 20, "Explicit is better than implicit.")
The "Alternative approaches" section of PEP 443 "Single-dispatch generic functions" however seems worth reading. There are several references to possible solutions, including one to "Five-minute Multimethods in Python" article by Guido van Rossum from 2005.
Does this count for ad hock polymorphism?
class A:
def __init__(self):
pass
def aFunc(self):
print "In A"
class B:
def __init__(self):
pass
def aFunc(self):
print "In B"
f = A()
f.aFunc()
f = B()
f.aFunc()
output
In A
In B
Another version of polymorphism
from module import aName
If two modules use the same interface, you could import either one and use it in your code.
One example of this is from xml.etree.ElementTree import XMLParser

Python abstract module possible?

I've built a module in Python in one single file without using classes. I do this so that using some api module becomes easier. Basically like this:
the_module.py
from some_api_module import some_api_call, another_api_call
def method_one(a, b):
return some_api_call(a + b)
def method_two(c, d, e):
return another_api_call(c * d * e)
I now need to built many similar modules, for different api modules, but I want all of them to have the same basic set of methods so that I can import any of these modules and call a function knowing that this function will behave the same in all the modules I built. To ensure they are all the same, I want to use some kind of abstract base module to build upon. I would normally grab the Abstract Base Classes module, but since I don't use classes at all, this doesn't work.
Does anybody know how I can implement an abstract base module on which I can build several other modules without using classes? All tips are welcome!
You are not using classes, but you could easily rewrite your code to do so.
A class is basically a namespace which contains functions and variables, as is a module.
Should not make a huge difference whether you call mymodule.method_one() or mymodule.myclass.method_one().
In python there is no such thing as interfaces which you might know from java.
The paradigm in python is Duck typing, that means more or less that for a given module you can tell whether it implements your API if it provides the right methods.
Python does this i.e. to determine what to do if you call myobject[i] on an instance of your class myclass. It looks whether the class has the method __getitem__ and if it does so, it replaces myobject[i] by myobject.__getitem__(i).
Yout don't have to tell python that your class supports this kind of access, python just figures it out from the way you defined your class.
The same way you should determine whether your module implements your API.
Maybe you want to look inside the hidden dictionary mymodule.__dict__ after import mymodulewhich contains all function names and pointers to them of your module. You could then check whether the right functions are present and raise an error otherwise
import my_module_4
#check if my_module_4 implements api
if all(func in my_module_4.__dict__ for func in ("method_one","method_two"):
print "API implemented"
else:
print "Warning: Not all API functions found in my_module_4"

Set defaults at runtime

I manage a fairly large python-based quantum chemistry suite, PyQuante. I'm currently struggling with how to set various defaults so that users can choose among different options at runtime.
For example, I have three different methods for computing electron repulsion integrals. Let's call them a,b,c. I used to simply pick the one I liked best (say, c), and have that hard-wired into the module that computes these integrals.
I have now modified this to use a module, Defaults.py, that contains all such hard-wires. But this is set at compile/install time. I would now like users to be able to override these options at runtime, say, using a .pyquanterc.py file.
In my integral routines, I currently have something like
from Defaults import integral_method
I know about dictionaries, and the .update() method. But I don't know how I would use this in real life. My defaults module looks like
integral_method = c
should I modify the end of Defaults.py to look for a .pythonrc.py file and override these values? E.g.
if os.path.exists('$HOME/.pythonrc.py'): do_something
If so, what should do_something look like?
With your current setup, the user can change the default functions in his scripts quite easily:
import Defaults
Defaults.integral_method = somefunc
If the user adds this to his script, all your modules that use integral_method from Defaults will use somefunc to calculate integrals.
I might do this via a factory class.
class IntegralSolver:
"""
Factory class containing methods for solving integrals.
>>> solver = IntegralSolver("method1")
>>> solver(x)
# solution via method1
Can also be used directly:
>>> IntegralSolver.method2(x)
# solution via method2
"""
def __init__(self, method):
self.__call__ = getattr(self, method)
#staticmethod
def method1(x):
return method1_solution
#staticmethod
def method2(x):
return method2_solution
It really depends on how your user runs the toolset. If they twiddle the python code each time, just setting a block at the top labeled OPTIONS should be good. If they run it off the command line, use the argparse library to allow them to switch options on the command line. Perhaps have it read the options out of a file with configParser to read a default file with your options, and if the user sets it, an additional file with their options.

Python namespace in between builtins and global?

As I understand it python has the following outermost namespaces:
Builtin - This namespace is global across the entire interpreter and all scripts running within an interpreter instance.
Globals - This namespace is global across a module, ie across a single file.
I am looking for a namespace in between these two, where I can share a few variables declared within the main script to modules called by it.
For example, script.py:
import Log from Log
import foo from foo
log = Log()
foo()
foo.py:
def foo():
log.Log('test') # I want this to refer to the callers log object
I want to be able to call script.py multiple times and in each case, expose the module level log object to the foo method.
Any ideas if this is possible?
It won't be too painful to pass down the log object, but I am working with a large chunk of code that has been ported from Javascript. I also understand that this places constraints on the caller of foo to expose its log object.
Thanks,
Paul
There is no namespace "between" builtins and globals -- but you can easily create your own namespaces and insert them with a name in sys.modules, so any other module can "import" them (ideally not using the from ... import syntax, which carries a load of problems, and definitely not using tghe import ... from syntax you've invented, which just gives a syntax error). For example, in script.py:
import sys
import types
sys.modules['yay'] = types.ModuleType('yay')
import Log
import foo
yay.log = Log.Log()
foo.foo()
and in foo.py
import yay
def foo():
yay.log.Log('test')
Do not fear qualified names -- they're goodness! Or as the last line of the Zen of Python (AKA import this) puts it:
Namespaces are one honking great idea -- let's do more of those!
You can make and use "more of those" most simply -- just qualify your names (situating them in the proper namespace they belong in!) rather than insisting on barenames where they're just not a good fit. There's a bazillion things that are quite easy with qualified names and anywhere between seriously problematic and well-nigh unfeasible for those who're stuck on barenames!-)
There is no such scope. You will need to either add to the builtins scope, or pass the relevant object.
Actually, I did figure out what I was looking for.
This hack is actually used PLY and that is where is stumbled across.
The library code can raise a runtime exception, which then gives access to the callers stack.

Categories