I have some function which uses outside variables. A (substantially) simplified example:
a = 2
b = 3
def f(x):
    return x * a + b
While I need a and b in f, I don't need them anywhere else. In particular, one can write a = 5, and that will change the behavior of f. How should I make a and b invisible to the outside?
Other languages allow me to write roughly the following code:
let f =
    a = 2
    b = 3
    lambda x: x * a + b
What I want:
f must work as intended and have the same signature
a and b must be computed only once
a and b must not exist in the scope outside of f
Assignments a = ... and b = ... don't affect f
What is the cleanest way to do this? E.g. the following solution formally works, but it introduces g and then deletes it, which I don't like (there is a risk of overriding an existing g, and I simply find it ugly):
def g():
    a = 2
    b = 3
    return lambda x: x * a + b

f = g()
del g
One method is to simply use a class. This allows you to place a and b in the scope of the class while f can still access them.
custom class
class F:
    def __init__(self):
        self.a = 2
        self.b = 3

    def __call__(self, x):
        return x * self.a + self.b
f = F()
f(1)
# returns:
5
If you don't like having to call the class constructor, you can override __new__ to essentially create a callable with internal stored variables. This is an antipattern though and not very pythonic.
custom callable
class f:
    a = 2
    b = 3

    def __new__(cls, x):
        return x * cls.a + cls.b
f(1)
# returns:
5
This approach is based on the answers provided in this thread, though scoped to the specific problem above. You can use a decorator to update the global variables available to the function while also storing a and b within a closure.
decorator with closure
from functools import wraps

def dec_ab(fn):
    a = 2
    b = 3

    @wraps(fn)
    def wrapper(*args, **kwargs):
        # get the wrapped function's global scope
        global_scope = fn.__globals__
        # copy current values of the variables
        var_list = ['a', 'b']
        current_vars = {}
        for var in var_list:
            if var in global_scope:
                current_vars[var] = global_scope.get(var)
        # update global scope
        global_scope.update({'a': a, 'b': b})
        try:
            out = fn(*args, **kwargs)
        finally:
            # undo the changes to the global scope
            for var in var_list:
                global_scope.pop(var)
            global_scope.update(current_vars)
        return out
    return wrapper

@dec_ab
def f(x):
    """hello world"""
    return x * a + b
This preserves the function's signature and keeps a and b from being altered:
f(1)
# returns:
5
a
# raises:
NameError: name 'a' is not defined
You can use default arguments to accomplish this. Default arguments are only computed once, when the closure is created (that is why if you have mutable objects as default arguments, the state is retained between invocations).
def f(x, a=2, b=3):
    return x * a + b
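A quick sketch of why this satisfies the "computed once" and "assignments don't affect f" requirements: default values are evaluated exactly once, at def time, so later rebinding of the outer names has no effect (note the outer a and b still exist in the module scope here; only the snapshot matters):

```python
a = 2
b = 3

def f(x, a=a, b=b):  # defaults captured once, at definition time
    return x * a + b

a = 5  # rebinding the global no longer affects f
assert f(1) == 5
```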
The encoding is basically a string representation of a dictionary containing the object's fields. However, a dictionary does not respect order, and I could potentially get a different encoding string on different runs. How do I preclude this from happening? Or should I use another library where I can ensure deterministic encoding?
By deterministic encoding, I mean if I create 100000 objects that are practically the same, i.e. same class and same constructor args, when I call encode() on each one of them, I get the exact same string every time.
So, for example, if I have
class MyClass(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b

c1 = MyClass(1, 2)
c2 = MyClass(1, 2)
I want to be sure that the strings encode(c1) and encode(c2) are perfectly identical, character for character, i.e.
assert jsonpickle.encode(c1)==jsonpickle.encode(c2)
I think that jsonpickle will take care of what you call deterministic encoding.
Example
import jsonpickle

class Monopoly(object):
    def __init__(self):
        self.boardwalk_price = 500

    @property
    def boardwalk(self):
        self.boardwalk_price += 50
        return self.boardwalk_price

m = Monopoly()
serialized = jsonpickle.encode(m)
Take a look at
print (serialized)
{"py/object": "__main__.Monopoly", "boardwalk_price": 500}
Now, let's decode
d = jsonpickle.decode(serialized)
print (d)
<__main__.Monopoly object at 0x7f01bc093278>
d.boardwalk_price
500
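One way to guarantee character-for-character identical output without relying on jsonpickle internals is to serialize the instance's __dict__ with the standard json module and sort_keys=True, which pins the key order (a sketch of the idea, not jsonpickle's API; encode here is a hypothetical helper):

```python
import json

class MyClass(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b

def encode(obj):
    # sort_keys=True makes the key order, and thus the string, deterministic
    return json.dumps(vars(obj), sort_keys=True)

c1 = MyClass(1, 2)
c2 = MyClass(1, 2)
assert encode(c1) == encode(c2)  # '{"a": 1, "b": 2}' both times
```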
For comparing objects, Python uses identities by default.
class MyClass(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b

c1 = MyClass(1, 2)
c2 = MyClass(1, 2)
If you take a look at their ids:
id(c1)
140154854189040
id(c2)
140154854190440
c1 == c2
False
You can override the == operator by defining __eq__ on MyClass:
def __eq__(self, x):
    if isinstance(x, MyClass):
        return self.a == x.a and self.b == x.b
    return False
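Putting it together as a runnable sketch (a __hash__ is also defined, since on Python 3 defining __eq__ alone makes instances unhashable):

```python
class MyClass(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __eq__(self, x):
        # compare by value, not by identity
        if isinstance(x, MyClass):
            return self.a == x.a and self.b == x.b
        return False

    def __hash__(self):
        return hash((self.a, self.b))

c1 = MyClass(1, 2)
c2 = MyClass(1, 2)
assert c1 == c2            # now equal by value
assert c1 != MyClass(1, 3)
```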
As a simple example, take a class Ellipse that can return its properties such as area A, circumference C, major/minor axis a/b, eccentricity e, etc. In order to get that, one obviously has to provide precisely two of its parameters to obtain all the other ones, though as a special case providing only one parameter should assume a circle. Three or more parameters that are consistent should yield a warning but work, otherwise obviously raise an exception.
So some examples of valid Ellipses are:
Ellipse(a=5, b=2)
Ellipse(A=3)
Ellipse(a=3, e=.1)
Ellipse(a=3, b=3, A=9*math.pi) # note the consistency
while invalid ones would be
Ellipse()
Ellipse(a=3, b=3, A=7)
The constructor would therefore either contain many =None arguments,
class Ellipse(object):
    def __init__(self, a=None, b=None, A=None, C=None, ...):
or, probably more sensible, a simple **kwargs, maybe adding the option to provide a,b as positional arguments,
class Ellipse(object):
    def __init__(self, a=None, b=None, **kwargs):
        kwargs.update({key: value
                       for key, value in (('a', a), ('b', b))
                       if value is not None})
So far, so good. But now comes the actual implementation, i.e. figuring out which parameters were provided and which were not and determine all the others depending on them, or check for consistency if required.
My first approach would be a simple yet tedious combination of many
if 'a' in kwargs:
    a = kwargs['a']
    if 'b' in kwargs:
        b = kwargs['b']
        A = kwargs['A'] = math.pi * a * b
        f = kwargs['f'] = math.sqrt(a**2 - b**2)
        ...
    elif 'f' in kwargs:
        f = kwargs['f']
        b = kwargs['b'] = math.sqrt(a**2 - f**2)
        A = kwargs['A'] = math.pi * a * b
        ...
elif ...
and so on*. But is there no better way? Or is this class design totally bollocks and I should create constructors such as Ellipse.create_from_a_b(a, b), despite that basically making the "provide three or more consistent parameters" option impossible?
Bonus question: Since the ellipse's circumference involves elliptic integrals (or elliptic functions if the circumference is provided and the other parameters are to be obtained) which are not exactly computationally trivial, should those calculations actually be in the constructor or rather be put into the @property Ellipse.C?
* I guess at least one readability improvement would be always extracting a and b and calculating the rest from them but that means recalculating the values already provided, wasting both time and precision...
My proposal is focused on data encapsulation and code readability.
a) Pick a pair of unambiguous measurements to represent the ellipse internally
class Ellipse(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b
b) Create a family of properties to get desired metrics about the ellipse
class Ellipse(object):
    @property
    def area(self):
        return math.pi * self.a * self.b
c) Create factory classes / factory methods with unambiguous names:
class Ellipse(object):
    @classmethod
    def fromAreaAndCircumference(cls, area, circumference):
        # convert area and circumference to a and b here
        return cls(a, b)
Sample usage:
ellipse = Ellipse.fromLongAxisAndEccentricity(axis, eccentricity)
assert ellipse.a == axis
assert ellipse.eccentricity == eccentricity
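A runnable sketch of such a factory (assuming a is the semi-major axis and e the eccentricity, so b = a * sqrt(1 - e**2); the method name follows the sample usage above):

```python
import math

class Ellipse(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b

    @property
    def eccentricity(self):
        return math.sqrt(1 - (self.b / self.a) ** 2)

    @classmethod
    def fromLongAxisAndEccentricity(cls, a, e):
        # derive b from the unambiguous internal pair (a, b)
        return cls(a, a * math.sqrt(1 - e ** 2))

ellipse = Ellipse.fromLongAxisAndEccentricity(5, 0.5)
assert ellipse.a == 5
assert abs(ellipse.eccentricity - 0.5) < 1e-9
```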
Check that you have enough parameters
Calculate a from every pairing of the other parameters
Confirm every a is the same
Calculate b from every pairing of a and another parameter
Calculate the other parameters from a and b
Here's a shortened version with just a, b, e, and f that easily extends to other parameters:
import math

class Ellipse():
    def __init__(self, a=None, b=None, e=None, f=None):
        if [a, b, e, f].count(None) > 2:
            raise Exception('Not enough parameters to make an ellipse')
        self.a, self.b, self.e, self.f = a, b, e, f
        self.calculate_a()
        for parameter in 'b', 'e', 'f':  # allows any multi-character parameter names
            if self.__dict__[parameter] is None:
                Ellipse.__dict__['calculate_' + parameter](self)

    def calculate_a(self):
        """Calculate and compare a from every pair of other parameters

        :raises Exception: if the ellipse parameters are inconsistent
        """
        a_raw = 0 if self.a is None else self.a
        a_be = 0 if not all((self.b, self.e)) else self.b / math.sqrt(1 - self.e**2)
        a_bf = 0 if not all((self.b, self.f)) else math.sqrt(self.b**2 + self.f**2)
        a_ef = 0 if not all((self.e, self.f)) else self.f / self.e
        candidates = set((a_raw, a_be, a_bf, a_ef)) - set((0,))
        if len(candidates) > 1:
            raise Exception('Inconsistent parameters')
        self.a = candidates.pop()

    def calculate_b(self):
        """Calculate b from a and another parameter"""
        b_ae = 0 if self.e is None else self.a * math.sqrt(1 - self.e**2)
        b_af = 0 if self.f is None else math.sqrt(self.a**2 - self.f**2)
        self.b = b_ae or b_af  # take whichever was derivable

    def calculate_e(self):
        """Calculate e from a and b"""
        self.e = math.sqrt(1 - (self.b / self.a)**2)

    def calculate_f(self):
        """Calculate f from a and b"""
        self.f = math.sqrt(self.a**2 - self.b**2)
It's pretty Pythonic, though the __dict__ usage might not be. The __dict__ way is fewer lines and less repetitive, but you can make it more explicit by breaking it out into separate if self.b is None: self.calculate_b() lines.
I only coded e and f, but it's extensible. Just mimic e and f code with the equations for whatever you want to add (area, circumference, etc.) as a function of a and b.
I didn't include your request for one-parameter Ellipses to become circles, but that's just a check at the beginning of calculate_a: if only one parameter was given, set a (or b, if a is the only one) so that the ellipse is a circle:
def calculate_a(self):
    """..."""
    if [self.a, self.b, self.e, self.f].count(None) == 3:
        if self.a is None:
            # Set self.a to make a circle
        else:
            # Set self.b to make a circle
        return
    a_raw = ...
If the need for such functionality is only for this single class, my advice would be to go with the second solution you have mentioned, using Nsh's answer.
Otherwise, if this problem arises in a number of places in your project, here is a solution I came up with:
class YourClass(MutexInit):
    """First of all inherit the MutexInit class by..."""
    def __init__(self, **kwargs):
        """...calling its __init__ at the end of your own __init__. Then..."""
        super(YourClass, self).__init__(**kwargs)

    @sub_init
    def _init_foo_bar(self, foo, bar):
        """...just decorate each sub-init method with @sub_init"""
        self.baz = foo + bar

    @sub_init
    def _init_bar_baz(self, bar, baz):
        self.foo = bar - baz
This will make your code more readable, and you will hide the ugly details behind these decorators, which are self-explanatory.
Note: We could also eliminate the @sub_init decorator, however I think it is the only legal way to mark the method as a sub-init. Otherwise, an option would be to agree on putting a prefix before the name of the method, say _init, but I think that's a bad idea.
Here are the implementations:
import inspect

class MutexInit(object):
    def __init__(self, **kwargs):
        super(MutexInit, self).__init__()
        for arg in kwargs:
            setattr(self, arg, kwargs.get(arg))
        self._arg_method_dict = {}
        for attr_name in dir(self):
            attr = getattr(self, attr_name)
            if getattr(attr, "_isrequiredargsmethod", False):
                self._arg_method_dict[attr.args] = attr
        provided_args = tuple(sorted(
            [arg for arg in kwargs if kwargs[arg] is not None]))
        sub_init = self._arg_method_dict.get(provided_args, None)
        if sub_init:
            sub_init(**kwargs)
        else:
            raise AttributeError('Insufficient arguments')

def sub_init(func):
    args = sorted(inspect.getargspec(func)[0])
    self_arg = 'self'
    if self_arg in args:
        args.remove(self_arg)

    def wrapper(funcself, **kwargs):
        if len(kwargs) == len(args):
            for arg in args:
                if (arg not in kwargs) or (kwargs[arg] is None):
                    raise AttributeError
        else:
            raise AttributeError
        return func(funcself, **kwargs)
    wrapper._isrequiredargsmethod = True
    wrapper.args = tuple(args)
    return wrapper
Here's my try on it. If you're doing this for some end users, you might want to skip. What I did probably works well for setting up some fast math objects library, but only when the user knows what's going on.
Idea was that all variables describing a math object follow the same pattern, a=something*smntng.
So when calculating a variable irl, in the worst case I would be missing "something", then I'd go and calculate that value, and any values I'd be missing when calculating that one, and bring it back to finish calculating the original variable I was looking for. There's a certain recursion pattern noticeable.
When calculating a variable therefore, at each access of a variable I've got to check if it exists, and if it doesn't calculate it. Since it's at each access I have to use __getattribute__.
I also need a functional relationship between the variables. So I'll pin a class attribute relations which will serve just that purpose. It'll be a dict of variables and an appropriate function.
But I've also got to check in advance if I have all the necessary variables to calculate the current one. So I'll amend my table of centralized math relations between variables to list all dependencies, and before I go to calculate anything, I'll run over the listed dependencies and calculate those if I need to.
So now it looks more like we'll have a ping pong match of semi-recursion where a function _calc will call __getattribute__ which calls function _calc again. Until such a time we run out of variables or we actually calculate something.
The Good:
There are no ifs
Can initialize with different init variables, as long as the sent variables enable calculation of the others.
It's fairly generic and looks like it could work for any other mathematical object describable in a similar manner.
Once calculated all your variables will be remembered.
The Bad:
It's fairly "unpythonic" for whatever that word means to you (explicit is always better).
Not user friendly. Any error message you receive will be as long as the number of times __getattribute__ and _calc called each other. Also there is no nice way of formulating a pretty error print.
You've a consistency issue at hand. This can probably be dealt with by overriding setters.
Depending on initial parameters, there is a possibility that you'll have to wait a long time to calculate a certain variable, especially if the requested variable calculation has to fall through several other calculations.
If you need a complex function, you have to make sure it's declared before relations, which might make the code ugly (also see the last point). I couldn't quite work out how to get them to be instance methods, and not class methods or some other more global functions, because I basically overrode the . operator.
Circular functional dependencies are a concern as well. (a needs b which needs e which needs a again and into an infinite loop).
relations are set in a dict type. That means there's only 1 functional dependency you can have per variable name, which isn't necessarily true in mathematical terms.
It's already ugly: value = self.relations[var]["func"]( *[self.__getattribute__(x) for x in requirements["req"]] )
Also, that's the line in _calc that calls __getattribute__, which either calls _calc again or, if the variable exists, returns the value. Also, at each __init__ you have to set all your attributes to None, because otherwise the attribute lookup raises an AttributeError instead of falling through to _calc.
def cmplx_func_A(e, C):
    return 10*C*e

class Elipse():
    def __init__(self, a=None, b=None, **kwargs):
        self.relations = {
            "e": {"req": ["a", "b"], "func": lambda a, b: a+b},
            "C": {"req": ["e", "a"], "func": lambda e, a: e*1/(a*b)},
            "A": {"req": ["C", "e"], "func": lambda e, C: cmplx_func_A(e, C)},
            "a": {"req": ["e", "b"], "func": lambda e, b: e/b},
            "b": {"req": ["e", "a"], "func": lambda e, a: e/a}
        }
        self.a = a
        self.b = b
        self.e = None
        self.C = None
        self.A = None
        if kwargs:
            for key in kwargs:
                setattr(self, key, kwargs[key])

    def __getattribute__(self, attr):
        val = super(Elipse, self).__getattribute__(attr)
        if val: return val
        return self._calc(attr)

    def _calc(self, var):
        requirements = self.relations[var]
        value = self.relations[var]["func"](
            *[self.__getattribute__(x) for x in requirements["req"]]
        )
        setattr(self, var, value)
        return value
Output:
>>> a = Elipse(1,1)
>>> a.A   # call to calculate this will fall through
20        # and calculate every variable A depends on (C and e)
>>> a.C   # C is not calculated this time.
1
>>> a = Elipse(1,1, e=3)
>>> a.e   # without a __setattr__ checking the validity, there is no
3         # assurance that this makes sense
>>> a.A   # calculates this and a.C, but doesn't recalc a.e
30
>>> a.e
3
>>> a = Elipse(b=1, e=2)  # init can be anything that makes sense
>>> a.a                   # as it's defined by the relations dict
2.0
>>> a = Elipse(a=2, e=2)
>>> a.b
1.0
There is one more issue here, related to the next-to-last point in "the bad". I.e. let's imagine that we can define an ellipse with C and A. Because we can relate each variable to others over only 1 functional dependency, if you defined your variables a and b over e and a|b like I have, you won't be able to calculate them. There will always be at least some miniature subset of variables you will have to send. This can be alleviated by making sure you define as many of your variables over as few other variables as you can, but it can't be avoided.
If you're lazy, this is a good way to short-circuit something you need done fast, but I wouldn't do this somewhere where I expect someone else to use it, ever!
For the bonus question it's probably sensible (depending on your use case) to calculate on request but remember the computed value if it's been computed before. E.g.
@property
def a(self):
    return self._calc_a()

def _calc_a(self):
    # use a backing field; assigning to self.a would hit the property again
    if self._a is None:
        self._a = ...
    return self._a
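On Python 3.8+, functools.cached_property packages exactly this compute-on-request-then-remember pattern (an alternative sketch, not from the original answer; the circumference here uses Ramanujan's approximation rather than a true elliptic integral):

```python
import math
from functools import cached_property

class Ellipse:
    def __init__(self, a, b):
        self.a = a
        self.b = b

    @cached_property
    def C(self):
        # Ramanujan's approximation of the circumference; computed on
        # first access, then cached on the instance
        h = (self.a - self.b) ** 2 / (self.a + self.b) ** 2
        return math.pi * (self.a + self.b) * (1 + 3 * h / (10 + math.sqrt(4 - 3 * h)))

e = Ellipse(3, 2)
assert abs(e.C - 15.8654) < 1e-3  # close to the exact circumference
```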
Included below is an approach which I've used before for partial data dependency and result caching. It actually resembles the answer @ljetibo provided, with the following significant differences:
relationships are defined at the class level
work is done at definition time to permute them into a canonical reference for dependency sets and the target variables that may be calculated if they are available
calculated values are cached but there is no requirement that the instance be immutable since stored values may be invalidated (e.g. total transformation is possible)
Non-lambda based calculations of values giving some more flexibility
I've written it from scratch so there may be some things I've missed but it should cover the following adequately:
Define data dependencies and reject initialising data which is inadequate
Cache the results of calculations to avoid extra work
Returns a meaningful exception with the names of variables which are not derivable from the specified information
Of course this can be split into a base class to do the core work and a subclass which defines the basic relationships and calculations only. Splitting the logic for the extended relationship mapping out of the subclass might be an interesting problem though since the relationships must presumably be specified in the subclass.
Edit: it's important to note that this implementation does not reject inconsistent initialising data (e.g. specifying a, b, c and A such that it does not fulfil the mutual expressions for calculation). The assumption being that only the minimal set of meaningful data should be used by the instantiator. The requirement from the OP can be enforced without too much trouble via instantiation time evaluation of consistency between the provided kwargs.
import itertools

class Foo(object):
    # Define the base set of dependencies
    relationships = {
        ("a", "b", "c"): "A",
        ("c", "d"): "B",
    }

    # Formulate inverse relationships from the base set
    # This is a little wasteful but gives cheap dependency set lookup at
    # runtime
    for deps, target in relationships.items():
        deps = set(deps)
        for dep in deps:
            alt_deps = deps ^ set([dep, target])
            relationships[tuple(alt_deps)] = dep

    def __init__(self, **kwargs):
        available = set(kwargs)
        derivable = set()
        # Run through the permutations of available variables to work out what
        # other variables are derivable given the dependency relationships
        # defined above
        while True:
            for r in range(1, len(available) + 1):
                for permutation in itertools.permutations(available, r):
                    if permutation in self.relationships:
                        derivable.add(self.relationships[permutation])
            if derivable.issubset(available):
                # If the derivable set adds nothing to what is already noted as
                # available, that's all we can get
                break
            else:
                available |= derivable
        # If any of the variables are underivable, raise an exception
        underivable = set(self.relationships.values()) - available
        if len(underivable) > 0:
            raise TypeError(
                "The following properties cannot be derived:\n\t{0}"
                .format(tuple(underivable))
            )
        # Store the kwargs in a mapping where we'll also cache other values as
        # they are calculated
        self._value_dict = kwargs

    def __getattribute__(self, name):
        # Try to collect the value from the stored value mapping or fall back
        # to the method which calculates it below
        try:
            return super(Foo, self).__getattribute__("_value_dict")[name]
        except (AttributeError, KeyError):
            return super(Foo, self).__getattribute__(name)

    # This is left hidden but not treated as a staticmethod since it needs to
    # be run at definition time
    def __storable_property(getter):
        name = getter.__name__

        def storing_getter(inst):
            # Calculate the value using the defined getter and save it
            value = getter(inst)
            inst._value_dict[name] = value
            return value

        def setter(inst, value):
            # Change the stored value and invalidate saved values which depend
            # on it
            inst._value_dict[name] = value
            for deps, target in inst.relationships.items():
                if name in deps and target in inst._value_dict:
                    delattr(inst, target)

        def deleter(inst):
            # Delete the stored value
            del inst._value_dict[name]

        # Pass back a property wrapping the get/set/deleters
        return property(storing_getter, setter, deleter, getter.__doc__)

    ## Each variable must have a single defined calculation to get its value
    ## Decorate these with the __storable_property function
    @__storable_property
    def a(self):
        return self.A - self.b - self.c

    @__storable_property
    def b(self):
        return self.A - self.a - self.c

    @__storable_property
    def c(self):
        return self.A - self.a - self.b

    @__storable_property
    def d(self):
        return self.B / self.c

    @__storable_property
    def A(self):
        return self.a + self.b + self.c

    @__storable_property
    def B(self):
        return self.c * self.d

if __name__ == "__main__":
    f = Foo(a=1, b=2, A=6, d=10)
    print f.a, f.A, f.B
    f.d = 20
    print f.B
I would check for the consistency of the data each time you set a parameter.
import math

tol = 1e-9

class Ellipse(object):
    def __init__(self, a=None, b=None, A=None, a_b=None):
        self.a = self.b = self.A = self.a_b = None
        self.set_short_axis(a)
        self.set_long_axis(b)
        self.set_area(A)
        self.set_maj_min_axis(a_b)

    def set_short_axis(self, a):
        self.a = a
        self.check()

    def set_long_axis(self, b):
        self.b = b
        self.check()

    def set_maj_min_axis(self, a_b):
        self.a_b = a_b
        self.check()

    def set_area(self, A):
        self.A = A
        self.check()

    def check(self):
        if self.a and self.b and self.A:
            if not math.fabs(self.A - self.a * self.b * math.pi) <= tol:
                raise Exception('A=a*b*pi does not check!')
        if self.a and self.b and self.a_b:
            if not math.fabs(self.a / float(self.b) - self.a_b) <= tol:
                raise Exception('a_b=a/b does not check!')
The main:
e1 = Ellipse(a=3, b=3, a_b=1)
e2 = Ellipse(a=3, b=3, A=27)
The first ellipse object is consistent; set_maj_min_axis(1) passes fine.
The second is not; set_area(27) fails, at least within the 1e-9 tolerance specified, and raises an error.
Edit 1
Some additional lines are needed in the check() method for the cases when the user supplies a, a_b and A:
if self.a and self.A and self.a_b:
    if not math.fabs(self.A - self.a**2 / self.a_b * math.pi) <= tol:
        raise Exception('A=a*a/a_b*pi does not check!')
if self.b and self.A and self.a_b:
    if not math.fabs(self.A - self.b**2 * self.a_b * math.pi) <= tol:
        raise Exception('A=b*b*a_b*pi does not check!')
Main:
e3 = Ellipse(b=3.0, a_b=1.0, A=27)
An arguably wiser way would be to calculate self.b = self.a / float(self.a_b) directly in the set method of a_b. Since you decide the order of the set methods in the constructor yourself, that might be more manageable than writing dozens of checks.
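That derive-instead-of-check variant might look like this (a reduced sketch with only a and a_b; order matters, so a must be set before a_b):

```python
class Ellipse(object):
    def __init__(self, a=None, a_b=None):
        self.a = a
        self.set_maj_min_axis(a_b)

    def set_maj_min_axis(self, a_b):
        self.a_b = a_b
        if self.a is not None and a_b is not None:
            # derive b from the already-set parameters rather than check it
            self.b = self.a / float(a_b)

e = Ellipse(a=3.0, a_b=1.5)
assert e.b == 2.0
```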
Given a function:
def A(a, b, c):
    a *= 2
    b *= 4
    c *= 8
    return a+b+c
How can I set the 'c' var to be called by reference, so that if I call d = A(a, b, c), c points to the same int object and is updated from within the function?
You're getting into murky territory: you're talking about declaring global (or nonlocal) variables in functions, which should simply not be done. Functions are meant to do one thing, and to do it without other values or functions affecting their state or output.
It's difficult to suggest a working example: are you alright with having copies of the variables left behind for later reference? You could expand this code to pass back a tuple, and reference the members of the tuple if needed, or the sum:
>>> def A(a, b, c):
...     return (a*2, b*4, c*8)
...
>>> d = A(2, 4, 8)
>>> sum(d)
84
>>> d[-1]  # or whatever index you'd need... this may serve best as a constant
64
You can do this if c is a list:
c = [2]

def A(a, b, c):
    a *= 2
    b *= 4
    c[0] *= 8
    return a+b+c[0]

print c  # gives [16]
You can't. Python cannot do that.
What you can do is pass a wrapper that has __imul__() defined and an embedded int, and the augmented assignment will result in the embedded attribute being mutated instead.
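Such a wrapper might look like this (MutableInt is a made-up name, not a standard class):

```python
class MutableInt:
    """A hypothetical wrapper: c *= n mutates the embedded int in place."""
    def __init__(self, value):
        self.value = value

    def __imul__(self, other):
        self.value *= other
        return self  # keep the same object bound to the caller's name

def A(a, b, c):
    a *= 2
    b *= 4
    c *= 8  # invokes MutableInt.__imul__, so the caller sees the change
    return a + b + c.value

c = MutableInt(8)
d = A(2, 4, c)
assert d == 84
assert c.value == 64  # the caller's object was updated
```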
All calls in Python are "by reference". Integers are immutable in Python. You can't change them.
class C:
    def __init__(self, c):
        self.c = c

    def __call__(self, a, b):
        a *= 2
        b *= 4
        self.c *= 8
        return a + b + self.c
Example
A = C(1)
print A(1, 1), A.c
print A(1, 1), A.c
Output
14 8
70 64