Memoization Handler [duplicate] - python

This question already has answers here:
What is memoization and how can I use it in Python?
(14 answers)
Closed 6 months ago.
Is it "good practice" to create a class like the one below that can handle the memoization process for you? The benefits of memoization are so great (in some cases, like this one, where it drops from 501003 to 1507 function calls and from 1.409 to 0.006 seconds of CPU time on my computer) that it seems a class like this would be useful.
However, I've read only negative comments on the usage of eval(). Is this usage of it excusable, given the flexibility this approach offers?
This can save any returned value automatically at the cost of losing side effects. Thanks.
import cProfile

class Memoizer(object):
    """A handler for saving function results."""
    def __init__(self):
        self.memos = dict()

    def memo(self, string):
        if string in self.memos:
            return self.memos[string]
        else:
            self.memos[string] = eval(string)
            return self.memos[string]
def factorial(n):
    assert type(n) == int
    if n == 1:
        return 1
    else:
        return n * factorial(n-1)

# find the factorial of num
num = 500
# this many times
times = 1000

def factorialTwice():
    factorial(num)
    for x in xrange(0, times):
        factorial(num)
    return factorial(num)

def memoizedFactorial():
    handler = Memoizer()
    for x in xrange(0, times):
        handler.memo("factorial(%d)" % num)
    return handler.memo("factorial(%d)" % num)

cProfile.run('factorialTwice()')
cProfile.run('memoizedFactorial()')

You can memoize without having to resort to eval.
A (very basic) memoizer:
def memoized(f):
    cache = {}
    def ret(*args):
        if args in cache:
            return cache[args]
        else:
            answer = f(*args)
            cache[args] = answer
            return answer
    return ret

@memoized
def fibonacci(n):
    if n == 0 or n == 1:
        return 1
    else:
        return fibonacci(n-1) + fibonacci(n-2)

print fibonacci(100)

eval is often misspelt as evil, primarily because the idea of executing strings at runtime is fraught with security considerations. Have you escaped the code sufficiently? Quoted it properly? And a host of other annoying headaches. Your memoising handler works, but it's really not the Python way of doing things. MAK's approach is much more Pythonic. Let's try a few experiments.
I edited both versions to run just once with 100 as the input, and moved the instantiation of Memoizer out of the timed code.
Here are the results.
>>> timeit.timeit(memoizedFactorial,number=1000)
0.08526921272277832
>>> timeit.timeit(foo0.mfactorial,number=1000)
0.000804901123046875
In addition to this, your version necessitates a wrapper around the function to be memoised, which has to be written as a string. That's ugly. MAK's solution is cleaner, since the "process of memoisation" is encapsulated in a separate function that can be conveniently applied to any expensive function in an unobtrusive fashion. I have some details on writing such decorators in my Python tutorial at http://nibrahim.net.in/self-defence/ in case you're interested.
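For what it's worth, later Python versions ship this exact decorator pattern in the standard library as functools.lru_cache (Python 3.2+). A minimal sketch of the same fibonacci example, in Python 3:

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # unbounded cache, like the hand-rolled memoizer above
def fibonacci(n):
    if n == 0 or n == 1:
        return 1
    return fibonacci(n - 1) + fibonacci(n - 2)

print(fibonacci(100))  # fast: each subproblem is computed exactly once
```

The cache key is the argument tuple, so this works unchanged for any function of hashable arguments.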

How to update nonlocal variables while caching results?

When using the functools caching functions like lru_cache, the inner function doesn't update the values of the non local variables. The same method works without the decorator.
Are the non-local variables not updated when using the caching decorator? Also, what to do if I have to update non-local variables but also store results to avoid duplicate work? Or do I need to return an answer from the cached function necessarily?
Eg. the following does not correctly update the value of the nonlocal variable
from functools import lru_cache

def foo(x):
    outer_var = 0
    @lru_cache
    def bar(i):
        nonlocal outer_var
        if condition:  # some condition on i
            outer_var += 1
        else:
            bar(i + 1)
    bar(x)
    return outer_var
Background
I was trying the Decode Ways problem which is finding the number of ways a string of numbers can be interpreted as letters. I start from the first letter and take one or two steps and check if they're valid. On reaching the end of the string, I update a non local variable which stores the number of ways possible. This method is giving correct answer without using lru_cache but fails when caching is used. Another method where I return the value is working but I wanted to check how to update non-local variables while using memoization decorators.
My code with the error:
ways = 0
@lru_cache(None)  # works well without this
def recurse(i):
    nonlocal ways
    if i == len(s):
        ways += 1
    elif i < len(s):
        if 1 <= int(s[i]) <= 9:
            recurse(i + 1)
        if i + 2 <= len(s) and 10 <= int(s[i:i+2]) <= 26:
            recurse(i + 2)
    return
recurse(0)
return ways
The accepted solution:
@lru_cache(None)
def recurse(i):
    if i == len(s):
        return 1
    elif i < len(s):
        ans = 0
        if 1 <= int(s[i]) <= 9:
            ans += recurse(i + 1)
        if i + 2 <= len(s) and 10 <= int(s[i:i+2]) <= 26:
            ans += recurse(i + 2)
        return ans
return recurse(0)
There's nothing special about lru_cache, a nonlocal variable or recursion causing any inherent issue here, per se. The issue is purely logical rather than a behavioral anomaly. See this minimal example:
from functools import lru_cache

def foo():
    c = 0
    @lru_cache(None)
    def bar(i=0):
        nonlocal c
        if i < 5:
            c += 1
            bar(i + 1)
    bar()
    return c

print(foo())  # => 5
The problem in the cached version of decode ways code is due to the overlapping nature of the recursive calls. The cache prevents the base case call recurse(i) where i == len(s) from ever executing more than once, even if it's reached from a different recursive path.
A good way to establish this is to slap a print("hello") in the base case (the if i == len(s) branch), then feed it a sizable problem. You'll see print("hello") fire once, and only once, and since ways cannot be updated by any other means than through recurse(i) when i == len(s), you're left with ways == 1 when all is said and done.
In the above toy example, there's only one recursive path: the calls expand for each i between 0 and 5, and the cache is never consulted. In contrast, decode ways offers multiple recursive paths, so the path via recurse(i+1) finds the base case linearly, then, as the stack unwinds, recurse(i+2) tries to find other ways of reaching it.
Adding the cache cuts off extra paths, but it has no value to return for each intermediate node. With the cache, it's like you have a memoized or dynamic programming table of subproblems, but you never update any entries, so the whole table is zero (except for the base case).
Here's an example of the linear behavior the cache causes:
from functools import lru_cache

def cached():
    @lru_cache(None)
    def cached_recurse(i=0):
        print("cached", i)
        if i < 3:
            cached_recurse(i + 1)
            cached_recurse(i + 2)
    cached_recurse()

def uncached():
    def uncached_recurse(i=0):
        print("uncached", i)
        if i < 3:
            uncached_recurse(i + 1)
            uncached_recurse(i + 2)
    uncached_recurse()

cached()
uncached()
Output:
cached 0
cached 1
cached 2
cached 3
cached 4
uncached 0
uncached 1
uncached 2
uncached 3
uncached 4
uncached 3
uncached 2
uncached 3
uncached 4
The solution is exactly as you show: pass the results up the tree and use the cache to store values for each node representing a subproblem. This is the best of both worlds: we have the values for the subproblems, but without re-executing the functions that ultimately lead to your ways += 1 base case.
In other words, if you're going to use the cache, think of it like a lookup table, not just a call tree pruner. In your attempt, it doesn't remember what work was done, just prevents it from being done again.
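To make the lookup-table mindset concrete, here is the toy recursion above rewritten so each node returns its path count instead of mutating outer state (count_paths is a made-up name; n is the hypothetical base-case bound):

```python
from functools import lru_cache

def count_paths(n=3):
    @lru_cache(None)
    def rec(i=0):
        if i >= n:          # base case: one complete path
            return 1
        # each cache entry now stores the answer for subproblem i,
        # so pruned paths still contribute their counts
        return rec(i + 1) + rec(i + 2)
    return rec()

print(count_paths())  # 5 distinct paths from 0 to >= 3
```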

Updating local variable within function not working correctly [duplicate]

This question already has answers here:
Static variable in Python?
(6 answers)
Closed 1 year ago.
I'm trying to write a function that updates its local variable each time it is run but it is not working for some reason.
def max_equity(max_equity=0):
    if current_equity() > max_equity:
        max_equity = current_equity()
        print(max_equity)
        return max_equity
    else:
        print(max_equity)
        return max_equity
and the function which it is calling
def current_equity():
    for n in range(len(trade_ID_tracker)-1):
        equity_container = 0
        if (trade_ID_tracker[n,2]) == 0:
            break
        else:
            if (trade_ID_tracker[n, 1].astype(int) == long):
                equity_container += (df.loc[tbar_count,'Ask_Price'] - trade_ID_tracker[n, 2]) * trade_lots * pip_value * 1000
            elif (trade_ID_tracker[n, 1].astype(int) == short):
                equity_container += 0 - (df.loc[tbar_count,'Ask_Price'] - trade_ID_tracker[n, 2]) * trade_lots * pip_value * 10000
    return (current_balance + equity_container)
but for some reason the max_equity() function prints current_equity() which I can only imagine means that either:
if current_equity() > max_equity:
is not doing its job and is triggering falsely
or
max_equity = current_equity()
is not doing its job and max_equity starts at zero every time it is run.
In other words if I put max_equity() in a loop where current_equity() is
[1,2,3,4,5,4,3,2,1]
then max_equity() should return
[1,2,3,4,5,5,5,5,5]
But instead it returns
[1,2,3,4,5,4,3,2,1]
Here's a quick example test
ar = [1,2,3,4,5,4,3,2,1]

def stuff(max_equity=0):
    if ar[n] > max_equity:
        max_equity = ar[n]
        print(max_equity)
    else:
        print(max_equity)

for n in range(len(ar)):
    stuff()
Either way I'm kind of stumped.
Any advice?
Local function variables are reset at each function call. This is essential for functions to behave as self-contained, repeatable units, and is a major factor in the success of the procedural programming approach: a function can be called from multiple contexts, and even in parallel in concurrent threads, and it will yield the same result.
A big exception, and one of the bigger beginner traps in Python, is that default parameter values are evaluated only once, when the function is defined. If a default value is a mutable object, every call sees that same object, along with any modifications made to it by previous calls.
This means it can be exploited on purpose: instead of setting your default value to 0, you set it to a list whose first element is 0. Each run can then update that value, and the change will be visible in subsequent calls.
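A minimal sketch of that trick, using a hypothetical running_max function whose default argument is a one-element list:

```python
def running_max(new_value, state=[0]):
    # state defaults to the *same* list object on every call,
    # so the maximum seen so far survives between calls
    if new_value > state[0]:
        state[0] = new_value
    return state[0]

print([running_max(v) for v in [1, 2, 3, 4, 5, 4, 3, 2, 1]])
# [1, 2, 3, 4, 5, 5, 5, 5, 5]
```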
This approach works, but it is not "nice" to depend on a side effect of the language in this way. The official (and nicer) way to keep state across multiple calls in Python is to use objects rather than functions.
Objects can have attributes tied to them, which are both visible and writable by their methods - which otherwise have their own local variables, re-initialised at each call:
class MaxEquity:
    def __init__(self):
        self.value = 0

    def update(self):
        current = current_equity()
        if current > self.value:
            self.value = current
        return self.value

# the remainder of the code should simply create a single
# instance of that class
max_equity = MaxEquity()
# and each time you want the max value, call its "update" method
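To see the stateful behaviour in isolation, here is a self-contained variant where the result of current_equity() is passed in as an argument (a hypothetical refactor, fed with the question's example sequence):

```python
class MaxEquity:
    def __init__(self):
        self.value = 0

    def update(self, current):
        # current stands in for current_equity(); passing it in
        # makes the class testable without the trading globals
        if current > self.value:
            self.value = current
        return self.value

max_equity = MaxEquity()
print([max_equity.update(v) for v in [1, 2, 3, 4, 5, 4, 3, 2, 1]])
# [1, 2, 3, 4, 5, 5, 5, 5, 5]
```

The instance attribute self.value survives between calls, which is exactly what the local variable in the question could not do.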

How to speed up processing speed while using functions

I wrote simplePrint.py
simplePrint.py
print(1)
print(2)
print(3)
And, I wrote functionPrint.py.
functionPrint.py
def printTestFunction(one,two,three):
    print(one)
    print(two)
    print(three)

printTestFunction(1,2,3)
It may be natural, but functionPrint.py is slower.
Is there a way to speed up processing while using functions?
The speed comparison method is as follows
import timeit

class checkTime():
    def __init__(self):
        self.a = 0
        self.b = 0
        self.c = 0

    def local(self):
        print(1)
        print(2)
        print(3)

    def self(self):
        def printTestFunction(one,two,three):
            print(one)
            print(two)
            print(three)
        printTestFunction(1,2,3)

    def getTime(self):
        def test1():
            self.self()
        self_elapsed_time = timeit.Timer(stmt=test1).repeat(number=10000)
        def test2():
            self.local()
        local_elapsed_time = timeit.Timer(stmt=test2).repeat(number=10000)
        print("local_time:{0}".format(local_elapsed_time) + "[sec]")
        print("self_time:{0}".format(self_elapsed_time) + "[sec]")

checkTime = checkTime()
checkTime.getTime()
result
local_time:[0.04716750000000003, 0.09638709999999995, 0.07357000000000002, 0.04696279999999997, 0.04360750000000002][sec]
self_time:[0.09702539999999998, 0.111617, 0.07951390000000003, 0.08777400000000002, 0.099128][sec]
There are plenty of ways to optimize your Python, but for something this simple, I wouldn't worry about it. Function calls are damn near instantaneous in human time.
A function call in most languages has to create new variables for the arguments, create a local scope, perform all the actions. So:
def printTestFunction(one,two,three):
    print(one)
    print(two)
    print(three)

printTestFunction(1,2,3)
runs something like this:
define function printTestFunction
call function with args (1, 2, 3)
create local scope
one=arg[0]
two=arg[1]
three=arg[2]
print(one)
print(two)
print(three)
return None
destroy local scope
garbage collect
That's my guess anyway. You can see that there's a lot more going on here and that takes time. (in particular, creating a local scope is a lot of instructions).
(You should definitely still use functions as programming anything complex gets out of control VERY quickly without them. The speed bump is negligible.)
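To put a rough number on the call overhead itself (rather than the prints, which dominate in the benchmark above), one could time a trivial expression against the same expression wrapped in a function. A sketch with a made-up helper add3:

```python
import timeit

def add3(a, b, c):
    return a + b + c

# a million evaluations of the bare expression...
inline = timeit.timeit("1 + 2 + 3", number=1_000_000)
# ...versus a million calls to a function doing the same work
called = timeit.timeit("add3(1, 2, 3)", globals=globals(), number=1_000_000)

print("inline:", inline)
print("called:", called)  # the difference is the per-call overhead
```

The gap is real but measured in tens of nanoseconds per call, which is why it almost never justifies avoiding functions.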
Doing a simple Google search yields:
"Here are 5 important things to keep in mind in order to write efficient Python code: know the basic data structures; reduce memory footprint; use builtin functions and libraries; move calculations outside the loop; keep your code base small."
If you want to test your scripts and see which runs faster, you can use this (taken from this post):
import time

start_time = time.time()
# Your main code here
print("--- %s seconds ---" % (time.time() - start_time))

Wrapper Function in Python

# This code uses a wrapper function to print the phone number in a
# standard format with country code (if not supplied in the input)

def ori_func(a):
    mm = []
    def wraap(*args):
        for k in args:
            for i in k:
                #print(i)
                if len(str(i)) == 10:
                    mm.append("+91"+str(i))
                elif str(i)[0] == "0" and len(str(i)) == 11:
                    mm.append("+91"+str(i)[1:])
                #elif len(str(i)) == 12 and i[0] == "+":
                #    mm.append(i)
                elif len(str(i)) == 12:
                    mm.append("+"+str(i))
        #print(mm)
        return a(mm)
    return wraap

@ori_func
def srt_phone(mm):
    #sorted(int(mm))
    for j in sorted(mm):
        cc = str(j)[:3]
        mmn1 = str(j)[3:8]
        mmn2 = str(j)[8:]
        print(cc+" "+mmn1+" "+mmn2)

m = [1234567891, 912345678923, +919876543219, "07418529637"]
srt_phone(m)
This code works fine as far as I know. However, could you look through my code and let me know whether my understanding of wrapper functions is correct?
When I pass a list to the wrapper function, do I really need to use two for loops like I did? Is there any other way?
When we are asked to take the phone number as input in int format, how do we handle input that starts with 0?
Thanks
Yes. There are other ways, but that way is pretty clean.
You can't, and I suggest not treating phone numbers as numbers, because they aren't really numbers: real numbers can't start with a + or a 0.
Your code looks fine to me; I'd have done a few things differently, but that's just personal preference. I do recommend that you look into using @functools.wraps(a) on your inner function.
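A minimal sketch of what functools.wraps buys you, with a made-up decorator log_call:

```python
import functools

def log_call(f):
    @functools.wraps(f)  # copies __name__, __doc__, etc. onto the wrapper
    def wrapper(*args, **kwargs):
        print("calling", f.__name__)
        return f(*args, **kwargs)
    return wrapper

@log_call
def srt_phone(numbers):
    """Return phone numbers in sorted order."""
    return sorted(numbers)

print(srt_phone.__name__)  # srt_phone, not wrapper
```

Without functools.wraps, the decorated function would report its name as "wrapper" and lose its docstring, which confuses debuggers and documentation tools.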

python, how to write an iterative function

I am querying a database for some parameters which depend on an attribute called count. count can be incremented in case the first query does not return anything. Here is a sample code:
sls = {(213.243, 55.556): {}, (217.193, 55.793): {}, (213.403, 55.369): {}}

for key in sls.keys():
    if not sls[key]:
        ra, dec = key[0], key[1]
        search_from_sourcelist(sl, ra, dec)

count = 1

def search_from_sourcelist(sl, ra, dec):
    dist = count/3600.0
    sls[(ra,dec)] = sl.sources.area_search(Area=(ra,dec,dist))
    return
In case I run the method search_from_sourcelist and it doesn't return anything, I would like to increment count and do the query again. This is to be done for all keys in the sls dictionary, until all the keys have a value!
Here is the most fundamental recursive function
def countdown(n):
    if n == 0:
        return "Blastoff"
    else:
        print "T minus %s" % n
        return countdown(n-1)
You will notice that countdown returns a call to itself with a modified argument, in this case n-1, so if you actually followed this all the way through you would get (-> indicates a call):
countdown(5) -> countdown(4) -> countdown(3) -> countdown(2) -> countdown(1) -> countdown(0) #stop
Now that you understand what a recursive function looks like, you can see that you never actually return a call to your own function, so your code is not recursive.
We use recursion because we want to boil a task down to its simplest form and then work from there; a good example of this is the McNuggets problem. So you need to tell us what you are trying to achieve, how it can be made a smaller problem, and, more importantly, why. Are you sure you cannot do this iteratively? Remember that you don't want to blow your stack depth, because Python is NOT tail-recursive by standard.
Recursion is useful when you find a way to reduce the initial problem to a "smaller version of itself".
The standard example is the factorial function
def fac(n):
    return n * fac(n-1) if n > 1 else 1
Here you reduce the problem of calculating the factorial of n to calculating the factorial of n-1.
In your code there is no such "reduction". You just increment a value and start the same problem over again. Thus, I recommend you solve it iteratively.
I'm not sure that you need a recursive algorithm for this.
"In case I run the method search_from_sourcelist, and it doesn't return anything, I would like to increment count, and do the query again" - this can be done with a while loop as follows:
for key, value in sls.iteritems():
    if not value:
        ra, dec = key[0], key[1]
        count = 1
        while not search_from_sourcelist(sls, ra, dec):
            count += 1
But if you really do want to do this recursively, you can do it as follows, leave a comment and I'll write it up.
Further, you should look into your search_from_sourcelist function, as it always returns None.
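A possible sketch of that fix: have the search return its result and retry with a growing radius. Here find_with_growing_radius and fake_search are hypothetical stand-ins for the asker's sl.sources.area_search:

```python
def find_with_growing_radius(search, ra, dec, max_count=100):
    # retry the query, widening the radius by 1 arcsecond each time,
    # until something comes back (or we give up at max_count)
    for count in range(1, max_count + 1):
        result = search(ra, dec, count / 3600.0)
        if result:
            return result
    return None

# hypothetical stand-in for sl.sources.area_search:
# finds a source only once the radius reaches 5 arcseconds
def fake_search(ra, dec, dist):
    return ["source"] if dist >= 5 / 3600.0 else None

print(find_with_growing_radius(fake_search, 213.243, 55.556))
```

Because the helper returns the result instead of None, the caller can store it in sls[(ra, dec)] and the "until all keys have a value" loop terminates naturally.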
