I have a Python program and I use Numba to compile the code to native and make it run faster.
I want to accelerate it further by implementing a cache for function results: when the function is called twice with the same parameters, the first call runs the calculation and returns the result, and the second call returns the same result from the cache.
I tried to implement this with a dict, where the keys are tuples containing the function parameters and the values are the function return values.
However, Numba doesn't support dictionaries and its support for global variables is limited, so my solution didn't work.
I can't use a numpy.ndarray and use the indices as the parameters, since some of my parameters are floats.
The problem is that both the function with cached results and the calling function are compiled with Numba (if the calling function were a regular Python function, I could implement the cache in plain Python rather than in Numba).
How can I implement this result cache with Numba?
============================================
The following code gives an error, saying the Memoize class is not recognized
from __future__ import annotations

from numba import njit

class Memoize:
    def __init__(self, f):
        self.f = f
        self.memo = {}

    def __call__(self, *args):
        if args not in self.memo:
            self.memo[args] = self.f(*args)
        # Warning: you may wish to do a deepcopy here if returning objects
        return self.memo[args]

@Memoize
@njit
def bla(a: int, b: float):
    for i in range(1_000_000_000):
        a *= b
    return a

@njit
def caller(x: int):
    s = 0
    for j in range(x):
        s += bla(j % 5, (j + 1) % 5)
    return s

if __name__ == "__main__":
    print(caller(30))
The error:
Untyped global name 'bla': Cannot determine Numba type of <class '__main__.Memoize'>

File "try_numba2.py", line 30:
def caller(x: int):
    <source elided>
    for j in range(x):
        s += bla(j % 5, (j + 1) % 5)
        ^
Changing the order of the decorators for bla gives the following error:
TypeError: The decorated object is not a function (got type <class '__main__.Memoize'>).
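For reference, the Memoize class itself works fine in plain Python; the failure is specific to Numba's type inference. A quick check without @njit (the calls list is only added here to count invocations):

```python
class Memoize:
    def __init__(self, f):
        self.f = f
        self.memo = {}

    def __call__(self, *args):
        if args not in self.memo:
            self.memo[args] = self.f(*args)
        return self.memo[args]

calls = []

@Memoize
def bla(a, b):
    calls.append((a, b))  # record each time the body actually runs
    return a * b

print(bla(2, 3.0))  # 6.0, computed
print(bla(2, 3.0))  # 6.0, served from the cache
print(len(calls))   # 1: the body ran only once
```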
Related
I am trying to double numbers using a lambda function in Python, but I can't understand the function, because I'm just starting to learn Python. Below is the function:
def myfunc(n):
    return lambda a: a * n

mydoubler = myfunc(2)
print(mydoubler(11))
I just need to understand how this code is working. Any help will be much appreciated.
def myfunc(n):               # takes n and returns a function object
    return lambda a: a * n   # the returned lambda takes a and multiplies it by n

mydoubler = myfunc(2)        # mydoubler is now the returned function object; print(mydoubler) shows it
print(mydoubler(11))         # call the lambda with a=11
If you want to implement it without lambda, you can use functools.partial:

from functools import partial

def double(n, a):
    return a * n

def my_func1(n):
    return partial(double, n)

mydoubler = my_func1(2)
print(mydoubler(11))
Your myfunc function creates and returns a closure. In this case, it returns a function that returns its argument times n. The tricky part is that n refers to the value in the specific call frame for myfunc. In your example, n has the value 2. That instance of n is specific to that particular call to myfunc. So if you called myfunc several times in a row, with several different values for n, each of the returned lambda functions would refer to a unique value of n unrelated to the other values.
Note that the use of lambda is optional. It's just shorthand for an anonymous function. For example, you could have written myfunc as:
def myfunc(n):
    def f(a):
        return a * n
    return f
which would produce the same result.
Back to the original question, here's an example of multiple closures coexisting:
def myfunc(n):
    def f(a):
        return a * n
    return f

mydoubler = myfunc(2)
mytripler = myfunc(3)
print(mydoubler(11))
print(mytripler(11))
This prints 22 followed by 33. This works because mydoubler and mytripler each refer to different instances of n.
The value of n can even be modified by calls to the returned closure. For example:
def myfunc(n):
    def f(a):
        nonlocal n
        n += a
        return n
    return f

func = myfunc(100)
print(func(1))
print(func(5))
This prints 101 followed by 106. In this example, the nonlocal declaration is needed since f assigns to n. Without it, n would be taken to be local to f.
I would like to create a function. The code below doesn't work, but it should give the idea of what I want:

def time_test(func, test_data: int) -> float:
    # should return the runtime of the given function
    return timeit(stmt=func(test_data), number=10000)
Obviously, I have to pass stmt something callable, or a string of executable code. That's why it doesn't work, but I don't know how to do it correctly.
Example of how I want to use the time_test() function:

from timeit import timeit

# To be tested
def f1(argument_ops):
    result = 0
    for i in range(argument_ops):
        result += 4
    return result

def f2(argument_ops):
    result = 0
    for i in range(argument_ops):
        for j in range(argument_ops):
            result += 4
    return result

# test function to be implemented
def time_test(func, test_data: int) -> float:
    runtime = 0
    # implement this; it should return the runtime of the given function,
    # called with argument test_data
    return runtime

# example of usage
print(time_test(f1, 96))
print(time_test(f2, 24))
Actually, I think you can use the globals argument, which takes a namespace in which to execute the statement, to do this relatively easily. Something to the effect of:

def time_test(func, test_data: int) -> float:
    gs = dict(func=func, test_data=test_data)
    runtime = timeit("func(test_data)", globals=gs)
    return runtime

Note that by default this times how long it takes to execute the statement 1,000,000 times; pass the number argument to change that.
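A complete runnable version of this approach, using the number=10000 from the original attempt and the f1 from the question:

```python
from timeit import timeit

def f1(argument_ops):
    result = 0
    for i in range(argument_ops):
        result += 4
    return result

def time_test(func, test_data: int, number: int = 10000) -> float:
    # globals gives timeit a namespace in which "func" and "test_data" resolve
    gs = dict(func=func, test_data=test_data)
    return timeit("func(test_data)", globals=gs, number=number)

print(time_test(f1, 96))  # total seconds for 10000 calls of f1(96)
```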
I would suggest doing it this way which involves passing the function definition to the time_test() function as well as the function name and arguments:
from timeit import timeit
# To be tested
f1_def = """
def f1(argument_ops):
    result = 0
    for i in range(argument_ops):
        result += 4
    return result
"""

f2_def = """
def f2(argument_ops):
    result = 0
    for i in range(argument_ops):
        for j in range(argument_ops):
            result += 4
    return result
"""

# test function
def time_test(func, test_data: int, setup: str) -> float:
    stmt = f'{func}({test_data!r})'
    runtime = timeit(stmt, setup)
    return runtime

# example of usage
print(time_test('f1', 96, setup=f1_def))  # -> 4.069019813000001
print(time_test('f2', 24, setup=f2_def))  # -> 34.072881441999996
I have a function that calls another function, and I would like to use numba's ahead-of-time (AOT) compiler.
Simplified example:
from numba.pycc import CC

cc = CC('test')
cc.verbose = True

@cc.export('calc', 'f8(f8, f8)')
def calc(a, b):
    return a + b

@cc.export('calc2', 'f8(f8, f8)')
def calc2(a, b):
    return a * calc(a, b)

if __name__ == "__main__":
    cc.compile()
When I run this code I get the following error:
Untyped global name 'calc': Cannot determine Numba type of <class 'function'>
I assume that this means Numba doesn't like dependent functions.
Suggestions on how get around this error?
You need to add the @njit decorator in front of the dependent function.
For the case at hand (with import numba as nb), the code becomes:

@nb.njit()
@cc.export('calc', 'f8(f8, f8)')
def calc(a, b):
    return a + b
I'm looking for a nice functional way to do the following:
def add(x, y):
    return x + y

def neg(x):
    return -x

def c(x, y):
    # Apply neg to the inputs of add
    _x = neg(x)
    _y = neg(y)
    return add(_x, _y)

neg_sum = c(2, 2)  # -4
It seems related to currying, but all of the examples I can find use functions that only have one input variable. I would like something that looks like this:
def add(x, y):
    return x + y

def neg(x):
    return -x

c = apply(neg, add)
neg_sum = c(2, 2)  # -4
This is a fairly direct way to do it:
def add(x, y):
    return x + y

def neg(x):
    return -x

def apply(g, f):
    # h is a function that returns f(g(arg1), g(arg2), ...)
    def h(*args):
        return f(*map(g, args))
    return h

# or equivalently:
# def apply(g, f):
#     return lambda *args: f(*map(g, args))

c = apply(neg, add)
neg_sum = c(2, 2)  # -4
Note that when you use *myvar as a parameter in a function definition, myvar becomes a tuple of all positional arguments that are received. And if you call a function with *expression as an argument, then all the items in expression are unpacked and sent as separate arguments to the function. I use these two behaviors to make h accept an unknown number of arguments, apply function g to each one (with map), then pass all of them as arguments to f.
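A minimal demo of those two behaviors (the function names here are just illustrative):

```python
def pack(*args):
    # args collects all positional arguments into a tuple
    return args

print(pack(1, 2, 3))  # (1, 2, 3)

def add3(a, b, c):
    return a + b + c

values = [1, 2, 3]
print(add3(*values))  # 6: the list is unpacked into three separate arguments
```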
A different approach, depending on how extensible you need this to be, is to create an object which implements your operator methods, which each return the same object, allowing you to chain operators together in arbitrary orders.
If you can cope with it always returning a list, you might be able to make it work.
class mathifier:
    def __init__(self, values):
        self.values = values

    def neg(self):
        self.values = [-value for value in self.values]
        return self

    def add(self):
        self.values = [sum(self.values)]
        return self

print(mathifier([2, 3]).neg().add().values)

And you can still get your named function for any set of chained functions:

neg_add = lambda x: mathifier(x).neg().add()
print(neg_add([2, 3]).values)
From Matthias Fripp's answer, I asked myself: I'd like to compose add and neg both ways, as add_neg(*args) and neg_add(*args). This requires hacking Matthias' suggestion a bit. The idea is to get some hint about the arity (number of arguments) of the functions to compose. This information is obtained with a bit of introspection, thanks to the inspect module. With this in mind, we adapt the way args are passed through the chain of functions. The main assumption here is that we deal with real functions in the mathematical sense, i.e. functions returning ONE float and taking at least one argument.
from functools import reduce
from inspect import getfullargspec

def arity_one(func):
    spec = getfullargspec(func)
    return len(spec[0]) == 1 and spec[1] is None

def add(*args):
    return reduce(lambda x, y: x + y, args, 0)

def neg(x):
    return -x

def compose(fun1, fun2):
    def comp(*args):
        if arity_one(fun2):
            return fun1(*map(fun2, args))
        else:
            return fun1(fun2(*args))
    return comp

neg_add = compose(neg, add)
add_neg = compose(add, neg)
print(f"-2+(-3) = {add_neg(2, 3)}")
print(f"-(2+3) = {neg_add(2, 3)}")
The solution is still very ad hoc...
I have just discovered Numba, and learnt that optimal performance requires adding @njit to most functions, such that Numba rarely exits LLVM mode.
I still have a few expensive lookup functions that could benefit from memoization, but so far none of my attempts have found a workable solution that compiles without error:
- Using common decorator functions before @njit results in Numba not being able to do type inference.
- Using decorators after @njit fails to compile the decorator.
- Numba doesn't like the use of global variables, even when using numba.typed.Dict.
- Numba doesn't like using closures to store mutable state.
- Removing @njit also causes type errors when called from other @njit functions.
What is the correct way to add memoization to functions when working inside Numba?
import functools
import time

import fastcache
import numba
import numpy as np
import toolz
from numba import njit
from functools import lru_cache
from fastcache import clru_cache
from toolz import memoize

# @fastcache.clru_cache(None)  # BUG: Untyped global name 'expensive': cannot determine Numba type of <class 'fastcache.clru_cache'>
# @functools.lru_cache(None)   # BUG: Untyped global name 'expensive': cannot determine Numba type of <class 'functools._lru_cache_wrapper'>
# @toolz.memoize               # BUG: Untyped global name 'expensive': cannot determine Numba type of <class 'function'>
@njit
# @fastcache.clru_cache(None)  # BUG: AttributeError: 'fastcache.clru_cache' object has no attribute '__defaults__'
# @functools.lru_cache(None)   # BUG: AttributeError: 'functools._lru_cache_wrapper' object has no attribute '__defaults__'
# @toolz.memoize               # BUG: CALL_FUNCTION_EX with **kwargs not supported
def expensive():
    bitmasks = np.array([1 << n for n in range(0, 64)], dtype=np.uint64)
    return bitmasks

# @fastcache.clru_cache(None)  # BUG: Untyped global name 'expensive_nojit': cannot determine Numba type of <class 'fastcache.clru_cache'>
# @functools.lru_cache(None)   # BUG: Untyped global name 'expensive_nojit': cannot determine Numba type of <class 'fastcache.clru_cache'>
# @toolz.memoize               # BUG: Untyped global name 'expensive_nojit': cannot determine Numba type of <class 'function'>
def expensive_nojit():
    bitmasks = np.array([1 << n for n in range(0, 64)], dtype=np.uint64)
    return bitmasks

# BUG: Failed in nopython mode pipeline (step: analyzing bytecode)
# Use of unsupported opcode (STORE_GLOBAL) found
_expensive_cache = None
@njit
def expensive_global():
    global _expensive_cache
    if _expensive_cache is None:
        bitmasks = np.array([1 << n for n in range(0, 64)], dtype=np.uint64)
        _expensive_cache = bitmasks
    return _expensive_cache

# BUG: The use of a DictType[unicode_type,array(int64, 1d, A)] type, assigned to variable 'cache' in globals,
# is not supported as globals are considered compile-time constants and there is no known way to compile
# a DictType[unicode_type,array(int64, 1d, A)] type as a constant.
cache = numba.typed.Dict.empty(
    key_type   = numba.types.string,
    value_type = numba.uint64[:]
)
@njit
def expensive_cache():
    global cache
    if "expensive" not in cache:
        bitmasks = np.array([1 << n for n in range(0, 64)], dtype=np.uint64)
        cache["expensive"] = bitmasks
    return cache["expensive"]

# BUG: Cannot capture the non-constant value associated with variable 'cache' in a function that will escape.
@njit()
def _expensive_wrapped():
    cache = []
    def wrapper(bitmasks):
        if len(cache) == 0:
            bitmasks = np.array([1 << n for n in range(0, 64)], dtype=np.uint64)
            cache.append(bitmasks)
        return cache[0]
    return wrapper
expensive_wrapped = _expensive_wrapped()

@njit
def loop(count):
    for n in range(count):
        expensive()
        # expensive_nojit()
        # expensive_cache()
        # expensive_global()
        # expensive_wrapped()

def main():
    time_start = time.perf_counter()
    count = 10000
    loop(count)
    time_taken = time.perf_counter() - time_start
    print(f'{count} loops in {time_taken:.4f}s')

loop(1)  # precache numba
main()

# Pure Python: 10000 loops in 0.2895s
# Numba @njit: 10000 loops in 0.0026s
You already mentioned that your real code is more complex, but looking at your minimal example, I would recommend the following pattern:
@njit
def loop(count):
    expensive_result = expensive()
    for i in range(count):
        do_something(count, expensive_result)

Instead of using a cache, you could pre-compute the result outside of the loop and pass it into the loop body. Instead of using globals, I would recommend passing every argument explicitly (always, but especially when using the Numba JIT).
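A runnable sketch of that pattern, shown without the @njit decorators so it runs anywhere (with Numba you would decorate both loop and the helpers); do_something here is a hypothetical stand-in for whatever the loop body does:

```python
def expensive():
    # stand-in for the expensive computation: a lookup table of bit masks
    return [1 << n for n in range(64)]

def do_something(n, bitmasks):
    # hypothetical loop body: look up bit n of the precomputed table
    return bitmasks[n % 64]

def loop(count):
    expensive_result = expensive()  # computed once, outside the loop
    total = 0
    for i in range(count):
        total += do_something(i, expensive_result)
    return total

print(loop(3))  # 1 + 2 + 4 = 7
```

Because expensive_result is an ordinary argument, Numba can type it like any other value, with no globals or closures involved.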