When using functools caching decorators like lru_cache, the inner function doesn't update the values of nonlocal variables. The same method works without the decorator.
Are nonlocal variables not updated when a caching decorator is used? And what should I do if I have to update nonlocal variables but also store results to avoid duplicate work? Or do I necessarily need to return the answer from the cached function?
E.g. the following does not correctly update the value of the nonlocal variable:
def foo(x):
    outer_var = 0

    @lru_cache
    def bar(i):
        nonlocal outer_var
        if condition:
            outer_var += 1
        else:
            bar(i + 1)

    bar(x)
    return outer_var
Background
I was trying the Decode Ways problem, which asks for the number of ways a string of digits can be interpreted as letters. I start from the first character and take one or two steps, checking whether they're valid. On reaching the end of the string, I update a nonlocal variable that stores the number of ways possible. This method gives the correct answer without lru_cache but fails when caching is used. Another method, where I return the value, works, but I wanted to know how to update nonlocal variables while using memoization decorators.
My code with the error:
from functools import lru_cache

def num_decodings(s):  # enclosing function assumed; s is the digit string
    ways = 0

    @lru_cache(None)  # works well without this
    def recurse(i):
        nonlocal ways
        if i == len(s):
            ways += 1
        elif i < len(s):
            if 1 <= int(s[i]) <= 9:
                recurse(i + 1)
            if i + 2 <= len(s) and 10 <= int(s[i:i+2]) <= 26:
                recurse(i + 2)
        return

    recurse(0)
    return ways
The accepted solution:
def num_decodings(s):  # enclosing function assumed, as above
    @lru_cache(None)
    def recurse(i):
        if i == len(s):
            return 1
        elif i < len(s):
            ans = 0
            if 1 <= int(s[i]) <= 9:
                ans += recurse(i + 1)
            if i + 2 <= len(s) and 10 <= int(s[i:i+2]) <= 26:
                ans += recurse(i + 2)
            return ans

    return recurse(0)
There's nothing special about lru_cache, nonlocal variables, or recursion causing any inherent issue here; the problem is purely logical rather than a behavioral anomaly. See this minimal example:
from functools import lru_cache

def foo():
    c = 0

    @lru_cache(None)
    def bar(i=0):
        nonlocal c
        if i < 5:
            c += 1
            bar(i + 1)

    bar()
    return c

print(foo())  # => 5
The problem in the cached version of the decode-ways code is due to the overlapping nature of the recursive calls. The cache prevents the base-case call recurse(i), where i == len(s), from ever executing more than once, even when it's reached from a different recursive path.
A good way to establish this is to slap a print("hello") into the base case (the if i == len(s) branch), then feed the function a sizable problem. You'll see "hello" printed once, and only once, and since ways cannot be updated by any means other than recurse(i) reaching i == len(s), you're left with ways == 1 when all is said and done.
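For example, here's a minimal sketch of that experiment (num_decodings is a hypothetical wrapper standing in for your enclosing method):

from functools import lru_cache

def num_decodings(s):
    ways = 0

    @lru_cache(None)
    def recurse(i):
        nonlocal ways
        if i == len(s):
            print("hello")  # with the cache, this fires exactly once
            ways += 1
        elif i < len(s):
            if 1 <= int(s[i]) <= 9:
                recurse(i + 1)
            if i + 2 <= len(s) and 10 <= int(s[i:i+2]) <= 26:
                recurse(i + 2)

    recurse(0)
    return ways

print(num_decodings("2626"))  # prints "hello" once and returns 1; the true count is 4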
In the foo/bar toy example above, there's only one recursive path: the calls expand for each i between 0 and 5, and the cache is never consulted. In contrast, decode ways offers multiple recursive paths: the path via recurse(i+1) finds the base case linearly, then, as the stack unwinds, recurse(i+2) tries to find other ways of reaching it.
Adding the cache cuts off extra paths, but it has no value to return for each intermediate node. With the cache, it's like you have a memoized or dynamic programming table of subproblems, but you never update any entries, so the whole table is zero (except for the base case).
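To see what a fully populated table looks like, here's a bottom-up sketch of the same idea (my own illustration, not from the original post), where ways[i] holds the number of decodings of the suffix s[i:]:

def num_decodings_dp(s):
    n = len(s)
    ways = [0] * (n + 1)
    ways[n] = 1  # base case: one way to decode the empty suffix
    for i in range(n - 1, -1, -1):
        if 1 <= int(s[i]) <= 9:
            ways[i] += ways[i + 1]
        if i + 2 <= n and 10 <= int(s[i:i + 2]) <= 26:
            ways[i] += ways[i + 2]
    return ways[0]

print(num_decodings_dp("2626"))  # => 4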
Here's an example of the linear behavior the cache causes:
from functools import lru_cache

def cached():
    @lru_cache(None)
    def cached_recurse(i=0):
        print("cached", i)
        if i < 3:
            cached_recurse(i + 1)
            cached_recurse(i + 2)

    cached_recurse()

def uncached():
    def uncached_recurse(i=0):
        print("uncached", i)
        if i < 3:
            uncached_recurse(i + 1)
            uncached_recurse(i + 2)

    uncached_recurse()

cached()
uncached()
Output:
cached 0
cached 1
cached 2
cached 3
cached 4
uncached 0
uncached 1
uncached 2
uncached 3
uncached 4
uncached 3
uncached 2
uncached 3
uncached 4
The solution is exactly as you show: pass the results up the tree and use the cache to store values for each node representing a subproblem. This is the best of both worlds: we have the values for the subproblems, but without re-executing the functions that ultimately lead to your ways += 1 base case.
In other words, if you're going to use the cache, think of it as a lookup table, not just a call-tree pruner. In your attempt, the cache doesn't remember what work was done; it only prevents that work from being done again.
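One way to watch the lookup-table behavior directly is recurse.cache_info(), which reports how many subproblem values were stored (misses) and reused (hits). A sketch using the accepted solution, with the same assumed num_decodings wrapper:

from functools import lru_cache

def num_decodings(s):
    @lru_cache(None)
    def recurse(i):
        if i == len(s):
            return 1
        ans = 0
        if 1 <= int(s[i]) <= 9:
            ans += recurse(i + 1)
        if i + 2 <= len(s) and 10 <= int(s[i:i + 2]) <= 26:
            ans += recurse(i + 2)
        return ans

    result = recurse(0)
    print(recurse.cache_info())  # hits = subproblems answered from the table
    return result

print(num_decodings("2626"))  # => 4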
Related
I'm very new to coding and am working on a project where I write code to perform Newton's method for lots of different functions. The code I wrote to do Newton's method is as follows:
import numpy as np

def fnewton(function, dx, x):
    # defined the functions that need to be evaluated so that this code
    # can be applied to any function I call fnewton with
    def f(x):
        f = eval(function)
        return f

    # eval is used to evaluate whatever I put in the function place when I call fnewton;
    # this won't work without eval to run the functions
    def df(x):
        df = eval(dx)
        return df

    n = 0
    min = .000001
    guess = 2
    xi_1 = guess
    # defining these variables before the loop so I can use them in print
    while np.absolute(xi_1) > min:
        # basically this means you continue the process until funct/der is greater than the tolerance
        n = n + 1  # helps keep track of how many times I iterated
        x = xi_1 - f(xi_1) / df(xi_1)  # this is the Newton eqn
        xi_1 = x
    print('the root is at:')
    print(x)
    print('after this many iterations:')
    print(n)
I am trying to call this function to operate on functions I defined before it, using the command:
fnewton("a(x)", "dadx(x)", 2)
Once I added the two functions it would run (before that it told me variables weren't defined), but now it just runs forever and never computes anything. Please help, did I code something wrong?
P.S. a(x) and dadx(x) are:
def a(x):
    f = np.exp(np.exp(-x)) - x**2 + x
    return f

def dadx(x):
    f = (a(x + .01) - a(x)) / .01
    return f
I executed your code; the loop gets stuck at the value 1.7039784148789716, so your condition, while np.absolute(xi_1) > min:, is not doing what you intend.
Try printing values inside the loop, as below:
while np.absolute(xi_1) > min:
    # basically this means you continue the process until funct/der is greater than the tolerance
    n = n + 1  # helps keep track of how many times I iterated
    x = xi_1 - f(xi_1) / df(xi_1)  # this is the Newton eqn
    xi_1 = x
    print(np.absolute(xi_1))
and find the proper while expression to suit your result.
I think you meant while np.absolute(f(xi_1))>min:
PS: just refrain from using functions like eval() in Python; it makes your code a lot harder to debug.
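Putting that correction together, here's a minimal sketch of the fixed loop (keeping your eval-based setup for the moment, but testing the residual f(xi_1) rather than the iterate itself):

import numpy as np

def fnewton(function, dx, x):
    def f(x):
        return eval(function)

    def df(x):
        return eval(dx)

    n = 0
    tol = .000001
    xi_1 = 2  # initial guess
    while np.absolute(f(xi_1)) > tol:  # stop when the function value is near zero
        n = n + 1
        xi_1 = xi_1 - f(xi_1) / df(xi_1)
    print('the root is at:', xi_1, 'after', n, 'iterations')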
It's best to write your f and df as discrete functions, then pass references to them to fnewton(). That way the implementation of fnewton() remains constant, and you just pass your estimate and the f and df references. You can reasonably hard-code Euler's number for this trivial case, which avoids the need to import numpy:
e = 2.718281828459045

def f(x):
    return e**(e**(-x)) - x**2 + x

def df(x):
    return (f(x + .01) - f(x)) / .01

def fnewton(x0, f, df, tolerance=0.0001):
    if abs(fx0 := f(x0)) < tolerance:
        return x0
    return fnewton(x0 - fx0 / df(x0), f, df, tolerance)

print(fnewton(1.5, f, df))
Output:
1.7039788103083038
I'm trying to write a function that updates its local variable each time it is run, but it is not working for some reason.
def max_equity(max_equity=0):
    if current_equity() > max_equity:
        max_equity = current_equity()
        print(max_equity)
        return max_equity
    else:
        print(max_equity)
        return max_equity
and the function which it is calling
def current_equity():
    for n in range(len(trade_ID_tracker) - 1):
        equity_container = 0
        if (trade_ID_tracker[n, 2]) == 0:
            break
        else:
            if (trade_ID_tracker[n, 1].astype(int) == long):
                equity_container += (df.loc[tbar_count, 'Ask_Price'] - trade_ID_tracker[n, 2]) * trade_lots * pip_value * 1000
            elif (trade_ID_tracker[n, 1].astype(int) == short):
                equity_container += 0 - (df.loc[tbar_count, 'Ask_Price'] - trade_ID_tracker[n, 2]) * trade_lots * pip_value * 10000
    return (current_balance + equity_container)
but for some reason the max_equity() function just prints the value of current_equity(), which I can only imagine means that either:
if current_equity() > max_equity:
is not doing its job and is triggering falsely, or
max_equity = current_equity()
is not doing its job, and max_equity starts at zero every time the function is run.
In other words if I put max_equity() in a loop where current_equity() is
[1,2,3,4,5,4,3,2,1]
then max_equity() should return
[1,2,3,4,5,5,5,5,5]
But instead it returns
[1,2,3,4,5,4,3,2,1]
Here's a quick example test
ar = [1, 2, 3, 4, 5, 4, 3, 2, 1]

def stuff(max_equity=0):
    if ar[n] > max_equity:
        max_equity = ar[n]
        print(max_equity)
    else:
        print(max_equity)

for n in range(len(ar)):
    stuff()
Either way I'm kind of stumped.
Any advice?
Local function variables are reset at each function call. This is essential for the behavior of functions as idempotent, and it is a major factor in the success of the procedural programming approach: a function can be called from multiple contexts, even in parallel in concurrent threads, and it will yield the same result.
A big exception, and one of the bigger beginner traps in Python, is that parameters are reset to the default values given in the function definition on each call; if those defaults are mutable objects, each new call sees the same object, as modified by previous calls.
This means the behavior can be exploited on purpose: instead of setting your default value to 0, you set it to a list whose first element is 0. At each run you can update that value, and the change will be visible in subsequent calls.
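A minimal sketch of that (discouraged) trick, reusing the current_equity() from your question:

def max_equity(state=[0]):  # the mutable default list survives across calls
    current = current_equity()
    if current > state[0]:
        state[0] = current
    return state[0]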
This approach would work, but it is not "nice" to depend on a side effect of the language in this way. The official (and nicer) way to keep state across multiple calls in Python is to use objects rather than functions.
Objects can have attributes tied to them, which are both visible and writable by their methods; the methods otherwise have their own local variables, which are reset at each call:
class MaxEquity:
    def __init__(self):
        self.value = 0

    def update(self):
        current = current_equity()
        if current > self.value:
            self.value = current
        return self.value

# the remainder of the code should simply create a single instance
# of the class, like this:
max_equity = MaxEquity()

# and each time you want the max value, call its "update" method
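A quick demonstration, stubbing out current_equity() with the sequence from your example:

values = iter([1, 2, 3, 4, 5, 4, 3, 2, 1])

def current_equity():
    return next(values)

max_equity = MaxEquity()
print([max_equity.update() for _ in range(9)])
# => [1, 2, 3, 4, 5, 5, 5, 5, 5]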
I use global variables, but I've read that they aren't good practice or Pythonic. I often have functions that produce many yes/no variables as results, which I need to use in the main function. For example, how can I write the following code without using global variables?
def secondary_function():
    global alfa_is_higher_than_12
    global beta_is_higher_than_12

    alfa = 12
    beta = 5

    if alfa > 10:
        alfa_is_higher_than_12 = "yes"
    else:
        alfa_is_higher_than_12 = "no"

    if beta > 10:
        beta_is_higher_than_12 = "yes"
    else:
        beta_is_higher_than_12 = "no"

def main_function():
    global alfa_is_higher_than_12
    global beta_is_higher_than_12

    secondary_function()

    if alfa_is_higher_than_12 == "yes":
        print("alfa is higher than 12")
    else:
        print("alfa isn't higher than 12")

    if beta_is_higher_than_12 == "yes":
        print("beta is higher than 12")
    else:
        print("beta isn't higher than 12")

main_function()
The term "Pythonic" doesn't apply to this topic--using globals like this is poor practice in any programming language and paradigm and isn't something specific to Python.
The global keyword is the tool that Python provides for you to opt out of encapsulation and break the natural scope of a variable. Encapsulation means that each of your components is a logical, self-contained unit that should work as a black box and performs one thing (note: this one thing is conceptual and may consist of many, possibly non-trivial, sub-steps) without mutating global state or producing side effects. The reason is modularity: if something goes wrong in a program (and it will), having strong encapsulation makes it very easy to determine where the failing component is.
Encapsulation makes code easier to refactor, maintain and expand upon. If you need a component to behave differently, it should be easy to remove or adjust it without those modifications causing a domino effect of changes across other components in the system.
Basic tools for enforcing encapsulation include classes, functions, parameters and the return keyword. Languages often provide modules, namespaces and closures to similar effect, but the end goal is always to limit scope and allow the programmer to create loosely-coupled abstractions.
Functions take input through parameters and produce output through return values, and you can assign the return value to variables in the calling scope. You can think of parameters as "knobs" that adjust the function's behavior. Inside the function, variables are just temporary storage the function uses to generate its one return value; then they disappear.
Ideally, functions are written to be pure and idempotent; that is, they don't modify global state and produce the same result when called multiple times. Python is a little less strict about this than other languages and it's natural to use certain in-place functions like sort and random.shuffle. These are exceptions that prove the rule (and if you know a bit about sorting and shuffling, they make sense in these contexts due to the algorithms used and the need for efficiency).
An in-place algorithm is impure and non-idempotent, but if the state that it modifies is limited to its parameter(s) and its documentation and return value (usually None) support this, the behavior is predictable and comprehensible.
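The built-in sorting functions illustrate the difference:

nums = [3, 1, 2]

print(sorted(nums))  # pure: returns a new sorted list
print(nums)          # the argument is untouched: [3, 1, 2]

print(nums.sort())   # in-place: mutates nums and returns None
print(nums)          # [1, 2, 3]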
So what does all this look like in your code? Unfortunately, your example is contrived and unclear as to its purpose, so there's no direct way to transform it that makes the advantages of encapsulation obvious.
Here's a list of some of the problems in these functions beyond modifying global state:
using "yes" and "no" string literals instead of True/False boolean values.
hardcoding values in functions, making them entirely single-purpose (they may as well be inlined).
printing in functions (see side effects remark above--prefer to return values and let the calling scope print if they desire to do so).
generic variable names like secondary_function (I'm assuming this is equivalent to foo/bar for the example, but it still doesn't justify their reason for existence, making it difficult to modify as a pedagogical example).
But here's my shot anyway:
if __name__ == "__main__":
alpha = 42
beta = 6
print("alpha %s higher than 12" % ("is" if alpha > 12 else "isn't"))
print("beta %s higher than 12" % ("is" if beta > 12 else "isn't"))
We can see there's no need for all of the functions: just write alpha > 12 wherever you need to make a comparison, and call print when you need to print. One drawback of functions is that they can hide important logic, so if their names and "contract" (defined by the name, docstring and parameters/return value) aren't clear, they'll only confuse the client of the function (generally yourself).
For the sake of illustration, say you're calling this formatter often. Then there's reason to abstract: the calling code would become cumbersome and repetitive. You can move the formatting code to a helper function and pass in any dynamic data to inject into the template:
def fmt_higher(name, n, cutoff=12):
    verb = "is" if n > cutoff else "isn't"
    return f"{name} {verb} higher than {cutoff}"

if __name__ == "__main__":
    print(fmt_higher("alpha", 42))
    print(fmt_higher("beta", 6))
    print(fmt_higher("epsilon", 0))
    print(fmt_higher(name="delta", n=2, cutoff=-5))
We can go a step further and pretend that n > cutoff is a much more complicated test, with many small steps, that would breach single responsibility if left in fmt_higher. Maybe the complicated test is used elsewhere in the code and could be generalized to support both use cases.
In this situation, you can still use parameters and return values instead of global and perform the same sort of abstraction to the predicate as you did with the formatter:
def complex_predicate(n, cutoff):
    # pretend this function is much more
    # complex and/or used in many places...
    return n > cutoff

def fmt_higher(name, n, cutoff=12):
    verb = "is" if complex_predicate(n, cutoff) else "isn't"
    return f"{name} {verb} higher than {cutoff}"

if __name__ == "__main__":
    print(fmt_higher("alpha", 42))
    print(fmt_higher("beta", 6))
    print(fmt_higher("epsilon", 0))
    print(fmt_higher(name="delta", n=2, cutoff=-5))
Only abstract when there is sufficient reason to abstract; the calling code becoming clogged, or repeating similar blocks of code multiple times, are the classic rules of thumb. And when you do abstract, do it properly.
One could ask what reasons you might have to structure your code like this, but assuming you have your reasons, you could just return the values from your secondary function:
def secondary_function():
    alfa = 12
    beta = 5

    if alfa > 10:
        alfa_is_higher_than_12 = "yes"
    else:
        alfa_is_higher_than_12 = "no"

    if beta > 10:
        beta_is_higher_than_12 = "yes"
    else:
        beta_is_higher_than_12 = "no"

    return alfa_is_higher_than_12, beta_is_higher_than_12

def main_function():
    alfa_is_higher_than_12, beta_is_higher_than_12 = secondary_function()

    if alfa_is_higher_than_12 == "yes":
        print("alfa is higher than 12")
    else:
        print("alfa isn't higher than 12")

    if beta_is_higher_than_12 == "yes":
        print("beta is higher than 12")
    else:
        print("beta isn't higher than 12")
Never write 'global'. Then you are sure you are not introducing any global variables.
You could also pass the values as arguments:
def secondary_function():
    alfa = 12
    beta = 5

    if alfa > 10:
        alfa_is_higher_than_12 = "yes"
    else:
        alfa_is_higher_than_12 = "no"

    if beta > 10:
        beta_is_higher_than_12 = "yes"
    else:
        beta_is_higher_than_12 = "no"

    return alfa_is_higher_than_12, beta_is_higher_than_12

def main_function(alfa_is_higher_than_12, beta_is_higher_than_12):
    if alfa_is_higher_than_12 == "yes":
        print("alfa is higher than 12")
    else:
        print("alfa isn't higher than 12")

    if beta_is_higher_than_12 == "yes":
        print("beta is higher than 12")
    else:
        print("beta isn't higher than 12")

main_function(*secondary_function())
I am querying a database for some parameters which depend on an attribute called count. count can be incremented in case the first query does not return anything. Here is a sample code:
sls = {(213.243, 55.556): {}, (217.193, 55.793): {}, (213.403, 55.369): {}}

for key in sls.keys():
    if not sls[key]:
        ra, dec = key[0], key[1]
        search_from_sourcelist(sl, ra, dec)

count = 1

def search_from_sourcelist(sl, ra, dec):
    dist = count / 3600.0
    sls[(ra, dec)] = sl.sources.area_search(Area=(ra, dec, dist))
    return
In case I run the method search_from_sourcelist and it doesn't return anything, I would like to increment count and do the query again. This is to be done for all keys in the sls dictionary, until all the keys have a value.
Here is the most fundamental recursive function
def countdown(n):
    if n == 0:
        return "Blastoff"
    else:
        print("T minus %s" % n)
        return countdown(n - 1)
You will notice that countdown returns a call of itself with a modified argument, in this case n - 1, so if you actually followed this all the way through you would get (-> indicates a call):
countdown(5) -> countdown(4) -> countdown(3) -> countdown(2) -> countdown(1) -> countdown(0)  # stop
So now that you understand what a recursive function looks like, you can see that you never actually return a call of your own function, and thus your code is not recursive.
We use recursion because we want to boil a task down to its simplest form and then work from there; a good example of this is the McNuggets problem. So you need to tell us what you are trying to achieve and how it can be made into a smaller problem (and, more importantly, why). Are you sure you cannot do this iteratively? Remember that you don't want to blow your stack depth, because Python does not perform tail-call optimization.
Recursion is useful when you find a way to reduce the initial problem to a "smaller version of itself".
The standard example is the factorial function
def fac(n):
    return n * fac(n - 1) if n > 1 else 1
Here you reduce the problem of calculating the factorial of n to calculating the factorial of n-1.
In your code there is no such "reduction". You just increment a value and start the same problem over again. Thus, I recommend you solve it iteratively.
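For example, an iterative factorial is just as short (a sketch):

def fac(n):
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result

print(fac(5))  # => 120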
I'm not sure that you need a recursive algorithm for this.
In case I run the method search_from_sourcelist and it doesn't return anything, I would like to increment count and do the query again. This can be done with a while loop, as follows:
for key, value in sls.items():
    if not value:
        ra, dec = key[0], key[1]
        count = 1
        while not search_from_sourcelist(sls, ra, dec):
            count += 1
But if you really do want to do this recursively, it can be done; leave a comment and I'll write it up.
Further, you should look into your search_from_sourcelist function, as it always returns None
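For illustration, a sketch of a version that returns its result (assuming the sls dict and sl.sources.area_search from your question), so the while test above has something to check:

def search_from_sourcelist(sl, ra, dec):
    dist = count / 3600.0  # count is the global from your code
    result = sl.sources.area_search(Area=(ra, dec, dist))
    sls[(ra, dec)] = result
    return result  # an empty result is falsy, so the loop above will retry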
Is it "good practice" to create a class like the one below that can handle the memoization process for you? The benefits of memoization are so great (in some cases, like this one, where it drops from 501003 to 1507 function calls and from 1.409 to 0.006 seconds of CPU time on my computer) that it seems a class like this would be useful.
However, I've read only negative comments on the usage of eval(). Is this usage of it excusable, given the flexibility this approach offers?
This can save any returned value automatically at the cost of losing side effects. Thanks.
import cProfile

class Memoizer(object):
    """A handler for saving function results."""
    def __init__(self):
        self.memos = dict()

    def memo(self, string):
        if string in self.memos:
            return self.memos[string]
        else:
            self.memos[string] = eval(string)
            return self.memo(string)

def factorial(n):
    assert type(n) == int
    if n == 1:
        return 1
    else:
        return n * factorial(n - 1)

# find the factorial of num
num = 500
# this many times
times = 1000

def factorialTwice():
    factorial(num)
    for x in range(0, times):
        factorial(num)
    return factorial(num)

def memoizedFactorial():
    handler = Memoizer()
    for x in range(0, times):
        handler.memo("factorial(%d)" % num)
    return handler.memo("factorial(%d)" % num)

cProfile.run('factorialTwice()')
cProfile.run('memoizedFactorial()')
You can memoize without having to resort to eval.
A (very basic) memoizer:
def memoized(f):
    cache = {}
    def ret(*args):
        if args in cache:
            return cache[args]
        else:
            answer = f(*args)
            cache[args] = answer
            return answer
    return ret

@memoized
def fibonacci(n):
    if n == 0 or n == 1:
        return 1
    else:
        return fibonacci(n - 1) + fibonacci(n - 2)

print(fibonacci(100))
eval is often misspelt as evil, primarily because the idea of executing strings at runtime is fraught with security considerations. Have you escaped the code sufficiently? Quotation marks? And a host of other annoying headaches. Your memoise handler works, but it's really not the Python way of doing things. MAK's approach is much more Pythonic. Let's try a few experiments.
I edited both versions and made them run just once, with 100 as the input. I also moved out the instantiation of Memoizer.
Here are the results.
>>> timeit.timeit(memoizedFactorial, number=1000)
0.08526921272277832
>>> timeit.timeit(foo0.mfactorial, number=1000)
0.000804901123046875
In addition to this, your version necessitates a wrapper around the function to be memoised, which has to be written as a string. That's ugly and not very Pythonic. MAK's solution is clean, since the "process of memoisation" is encapsulated in a separate function that can be conveniently applied to any expensive function in an unobtrusive fashion. I have some details on writing such decorators in my Python tutorial at http://nibrahim.net.in/self-defence/ in case you're interested.
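For what it's worth, the pattern MAK hand-rolled now ships in the standard library as functools.lru_cache (Python 3.2+), so in modern code neither the eval-based handler nor a custom decorator is necessary:

from functools import lru_cache

@lru_cache(maxsize=None)
def fibonacci(n):
    return 1 if n < 2 else fibonacci(n - 1) + fibonacci(n - 2)

print(fibonacci(100))  # => 573147844013817084101, instantly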