Not sure if this is possible in Python, but I'm trying to profile a large function and identify which parts of its processing / I/O are slow. I was attempting to write a couple of decorator functions: a top-level decorator to wrap the function being profiled, and decorators for some of the nested functions that report their timing if a threshold is exceeded in the top-level decorator. I'm not sure how I could share this context across decorators though.
Top level Decorator
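import time

def time_millis():
    # Assumed helper: current wall-clock time in milliseconds
    return int(time.time() * 1000)

def time_stack(name, threshold=60000):
    def wrapper(f):
        def wrapped(*args, **kwargs):
            start = time_millis()
            result = f(*args, **kwargs)
            end = time_millis()
            if end - start > threshold:
                # Log out frame timings here
                pass
            return result
        return wrapped
    return wrapper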
For nested functions
def time_frame(name):
    def wrapper(f):
        def wrapped(*args, **kwargs):
            start = time_millis()
            result = f(*args, **kwargs)
            end = time_millis()
            t = end - start
            # Somehow remember this value for the outer time_stack to use if needed
            return result
        return wrapped
    return wrapper
Example
@time_frame(name="do_some_io")
def do_some_io(string):
    # do some io
    pass

@time_frame(name="do_a_transform")
def do_a_transform(result):
    # do some transforming
    pass

@time_frame(name="do_some_caching")
def do_some_caching(stuff):
    # do some caching
    pass

@time_stack(name="search", threshold=100000)
def search(string):
    result = do_some_io(string)
    transformed = do_a_transform(result)
    return do_some_caching(transformed)
Here, if the execution time of search exceeds 100000ms, it would print out something like
search took 123456ms
do_some_io: 23000ms
do_a_transform: 13678ms
do_some_caching: 86778ms
I thought about passing an object down through the kwargs to keep track of the times, but then all the functions in the call stack would have to have **kwargs in their signature. If there's a way to achieve this without having to do that, it would be preferable.
You can define a global stack that collects the data from each time_frame. time_stack enables it before calling the function and resets it afterwards, and it can use the collected data if the elapsed time has exceeded the threshold.
However, this only works while a single time_stack is active. To support multiple (nested) time_stack functions, you would need a stack containing stacks.
A sketch of this idea is something like:
PROFILE_STACK = []
STACK_IS_SET = False

def time_stack(name, threshold=60000):
    def wrapper(f):
        def wrapped(*args, **kwargs):
            global STACK_IS_SET  # needed to rebind the module-level flag
            STACK_IS_SET = True
            start = time_millis()
            result = f(*args, **kwargs)
            end = time_millis()
            if end - start > threshold:
                # use PROFILE_STACK
                pass
            PROFILE_STACK.clear()
            STACK_IS_SET = False
            return result
        return wrapped
    return wrapper
And
def time_frame(name):
    def wrapper(f):
        def wrapped(*args, **kwargs):
            start = time_millis()
            result = f(*args, **kwargs)
            end = time_millis()
            t = end - start
            if STACK_IS_SET:
                # Remember this value for the outer time_stack to use if needed
                PROFILE_STACK.append((name, t))
            return result
        return wrapped
    return wrapper
Related
I'm writing a decorator which needs to pass data to other utility functions; something like:
STORE = []

def utility(message):
    STORE.append(message)

def decorator(func):
    def decorator_wrap(*args, **kwargs):
        global STORE
        saved_STORE = STORE
        STORE = list()
        func(*args, **kwargs)
        for line in STORE:
            print(line)
        STORE = saved_STORE
    return decorator_wrap

@decorator
def foo(x):
    # ...
    utility(x)
    # ...
But that's kind of yuck, and not thread safe. Is there a way to override utility()'s view of STORE for the duration of decorator_wrap()? Or some other way to signal to utility() that there's an alternate STORE it should use?
Alternatively, I could present a different utility() to foo() and all its callees, but that seems like exactly the same problem.
From this answer I find that I can implement it this way:
import inspect

STORE = []

def utility(message):
    global STORE
    store = STORE
    frame = inspect.currentframe()
    while frame:
        if 'LOCAL_STORE' in frame.f_locals:
            store = frame.f_locals['LOCAL_STORE']
            break
        frame = frame.f_back
    store.append(message)

def decorator(func):
    def decorator_wrap(*args, **kwargs):
        LOCAL_STORE = []
        func(*args, **kwargs)
        for line in LOCAL_STORE:
            print(line)
    return decorator_wrap
Buuuut while reading the documentation I see f_globals is present in every stack frame. I think the more efficient method would be to inject my local into my callee's f_globals. This would be similar to setting an environment variable before executing another command, but I don't know if it's legal.
How could one write a debounce decorator in Python that debounces not only on the function called but also on the function arguments (or combination of arguments) used?
Debouncing means suppressing calls to a function within a given timeframe. Say you call a function 100 times within 1 second, but you only want the function to run once every 10 seconds: a debounce-decorated function would run once, 10 seconds after the last call, provided no new calls were made in the meantime. Here I'm asking how one could debounce a function call per set of function arguments.
An example could be to debounce an expensive update of a person object like:
@debounce(seconds=10)
def update_person(person_id):
    # time consuming, expensive op
    print('>>Updated person {}'.format(person_id))
Then debouncing on the function - including function arguments:
update_person(person_id=144)
update_person(person_id=144)
update_person(person_id=144)
>>Updated person 144
update_person(person_id=144)
update_person(person_id=355)
>>Updated person 144
>>Updated person 355
So calling the function update_person with the same person_id would be suppressed (debounced) until the 10-second debounce interval has passed without a new call to the function with that same person_id.
There are a few debounce decorators, but none of them takes the function arguments into account, for example: https://gist.github.com/walkermatt/2871026
I've done a similar throttle decorator by function and arguments:
import time

def throttle(s, keep=60):
    def decorate(f):
        caller = {}  # argument key -> time of the most recent call
        def wrapped(*args, **kwargs):
            nonlocal caller
            called_args = '{}'.format(*args)
            t_ = time.time()
            if caller.get(called_args, None) is None or t_ - caller.get(called_args, 0) >= s:
                result = f(*args, **kwargs)
                # Drop entries older than keep seconds
                caller = {key: val for key, val in caller.items() if t_ - val <= keep}
                caller[called_args] = t_
                return result
            # Drop entries older than keep seconds
            caller = {key: val for key, val in caller.items() if t_ - val <= keep}
            caller[called_args] = t_
        return wrapped
    return decorate
The main takeaway is that it keeps the function arguments in caller[called_args].
See also the difference between throttle and debounce: http://demo.nimius.net/debounce_throttle/
Update:
After some tinkering with the above throttle decorator and the threading.Timer example in the gist, I actually think this should work:
from threading import Timer
from inspect import signature
import time

def debounce(wait):
    def decorator(fn):
        sig = signature(fn)
        caller = {}
        def debounced(*args, **kwargs):
            nonlocal caller
            try:
                bound_args = sig.bind(*args, **kwargs)
                bound_args.apply_defaults()
                called_args = fn.__name__ + str(dict(bound_args.arguments))
            except:
                called_args = ''
            t_ = time.time()
            def call_it(key):
                try:
                    # always remove on call
                    caller.pop(key)
                except:
                    pass
                fn(*args, **kwargs)
            try:
                # Always try to cancel timer
                caller[called_args].cancel()
            except:
                pass
            caller[called_args] = Timer(wait, call_it, [called_args])
            caller[called_args].start()
        return debounced
    return decorator
I've had the same need to build a debounce decorator for a personal project. After stumbling upon the same gist / discussion you have, I ended up with the following solution:
import threading

def debounce(wait_time):
    """
    Decorator that will debounce a function so that it is called after wait_time seconds
    If it is called multiple times, will wait for the last call to be debounced and run only this one.
    """
    def decorator(function):
        def debounced(*args, **kwargs):
            def call_function():
                debounced._timer = None
                return function(*args, **kwargs)
            # if we already have a call to the function currently waiting to be executed, reset the timer
            if debounced._timer is not None:
                debounced._timer.cancel()
            # after wait_time, call the function provided to the decorator with its arguments
            debounced._timer = threading.Timer(wait_time, call_function)
            debounced._timer.start()
        debounced._timer = None
        return debounced
    return decorator
I've created an open-source project that provides functions such as debounce, throttle, filter, etc. as decorators. Contributions are more than welcome, whether to improve the solution I have for these decorators or to add other useful ones: decorator-operations repository
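The single-timer decorator above debounces all calls together, regardless of arguments. A possible per-argument variant keeps one timer per bound argument set. This is a sketch, not a tested library: `debounce_by_args` is a name made up here, and it assumes the arguments have a stable repr() that can serve as a dictionary key.

```python
import threading
from inspect import signature


def debounce_by_args(wait_time):
    """Debounce calls separately for each distinct set of bound arguments."""
    def decorator(function):
        sig = signature(function)
        timers = {}              # argument key -> pending threading.Timer
        lock = threading.Lock()  # guards timers across caller and timer threads

        def debounced(*args, **kwargs):
            # Binding makes f(144) and f(person_id=144) share one key
            bound = sig.bind(*args, **kwargs)
            bound.apply_defaults()
            key = repr(sorted(bound.arguments.items()))

            def call_function():
                with lock:
                    # Only clear the entry if it still belongs to this timer
                    if timers.get(key) is timer:
                        del timers[key]
                function(*args, **kwargs)

            timer = threading.Timer(wait_time, call_function)
            with lock:
                previous = timers.get(key)
                if previous is not None:
                    previous.cancel()
                timers[key] = timer
            timer.start()

        return debounced
    return decorator
```

With this, repeated calls for person 144 collapse into one delayed call, while a call for person 355 gets its own independent timer.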
Is there any way to check inside function f1 in my example whether the calling function (here decorated or not_decorated) has a specific decorator (@out in the code)? Is such information passed to a function?
def out(fun):
    def inner(*args, **kwargs):
        fun(*args, **kwargs)
    return inner

@out
def decorated():
    f1()

def not_decorated():
    f1()

def f1():
    if is_decorated_by_out:  # here I want to check it
        print('I am')
    else:
        print('I am not')

decorated()
not_decorated()
Expected output:
I am
I am not
To be clear, this is egregious hackery, so I don't recommend it, but since you've ruled out additional parameters, and f1 will be the same whether wrapped or not, you've left hacks as your only option. The solution is to add a local variable to the wrapper function for the sole purpose of being found by means of stack inspection:
import inspect

def out(fun):
    def inner(*args, **kwargs):
        __wrapped_by__ = out
        fun(*args, **kwargs)
    return inner

def is_wrapped_by(func):
    try:
        return inspect.currentframe().f_back.f_back.f_back.f_locals.get('__wrapped_by__') is func
    except AttributeError:
        return False

@out
def decorated():
    f1()

def not_decorated():
    f1()

def f1():
    if is_wrapped_by(out):
        print('I am')
    else:
        print('I am not')

decorated()
not_decorated()
This assumes a specific degree of nesting: the manual back-tracking via f_back steps past is_wrapped_by itself, f1 and decorated to finally reach inner (from out). If you want to determine whether out was involved anywhere in the call stack, make is_wrapped_by loop until the stack is exhausted:
def is_wrapped_by(func):
    frame = None
    try:
        # Skip is_wrapped_by and caller
        frame = inspect.currentframe().f_back.f_back
        while True:
            if frame.f_locals.get('__wrapped_by__') is func:
                return True
            frame = frame.f_back
    except AttributeError:
        pass
    finally:
        # Leaving frame on the call stack can cause cycle involving locals
        # which delays cleanup until cycle collector runs;
        # explicitly break cycle to save yourself the headache
        del frame
    return False
If you are open to adding an extra parameter to f1 (you could also use a default parameter), you can use functools.wraps and check for the existence of the __wrapped__ attribute. To do so, pass the calling function to f1:
import functools

def out(fun):
    @functools.wraps(fun)
    def inner(*args, **kwargs):
        fun(*args, **kwargs)
    return inner

@out
def decorated():
    f1(decorated)

def not_decorated():
    f1(not_decorated)

def f1(_func):
    if getattr(_func, '__wrapped__', False):
        print('I am')
    else:
        print('I am not')

decorated()
not_decorated()
Output:
I am
I am not
Suppose you have a function decoration like this one
def double_arg(fun):
    def inner(x):
        return fun(x*2)
    return inner
However, you can't access it (it's inside a 3rd-party lib or something). In this case, you can wrap it in another function that adds the name of the decoration to the resulting function:
def keep_decoration(decoration):
    def f(g):
        h = decoration(g)
        h.decorated_by = decoration.__name__
        return h
    return f
and replace the old decoration by the wrapper.
double_arg = keep_decoration(double_arg)
You can even write a helper function that checks whether a function is decorated or not.
def is_decorated_by(f, decoration_name):
    try:
        return f.decorated_by == decoration_name
    except AttributeError:
        return False
Example of use...
@double_arg
def inc_v1(x):
    return x + 1

def inc_v2(x):
    return x + 1

print(inc_v1(5))
print(inc_v2(5))
print(is_decorated_by(inc_v1, 'double_arg'))
print(is_decorated_by(inc_v2, 'double_arg'))
Output
11
6
True
False
Why does this :
def fn(proc, *args, **kwargs):
    cache = proc.cache = {}
    def cached_execution(cache, *args, **kwargs):
        if proc in cache:
            if args in cache[proc]:
                return cache[proc][args]
        res = proc(args)
        cache[proc] = {args: res}
        return res
    return cached_execution(cache, proc, *args, **kwargs)

@fn
def cached_fibo(n):
    if n == 1 or n == 0:
        return n
    else:
        return cached_fibo(n-1) + cached_fibo(n-2)

print cached_fibo(100)
throw an exception like this:
NameError: global name 'cached_fibo' is not defined
What fundamental concept am I missing?
(Conceptually, **kwargs is for decoration only. It is not used in retrieving the cached result, but don't worry about that.)
A decorator should return a function, not the result of calling a function.
But this leads us to the next mistake: when you're passing cache and proc to cached_execution function they land in *args which in turn gets passed to proc. This doesn't make sense. Just let cache and proc be captured within the inner method:
def fn(proc, *args, **kwargs):
    cache = proc.cache = {}
    def cached_execution(*args, **kwargs):
        if proc in cache:
            if args in cache[proc]:
                return cache[proc][args]
        res = proc(*args)
        # setdefault keeps earlier results instead of replacing the whole dict
        cache.setdefault(proc, {})[args] = res
        return res
    return cached_execution
Another problem: you were not unpacking args. You should call proc(*args) instead of proc(args) (already fixed above).
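Putting those points together, a compact memoizing decorator might look like the sketch below. The name `memoize` is chosen here for illustration, the cache is keyed on positional arguments only (so they are assumed to be hashable), and functools.wraps is added to preserve the wrapped function's name and docstring.

```python
import functools


def memoize(proc):
    cache = {}  # maps positional-argument tuples to results

    @functools.wraps(proc)
    def cached_execution(*args):
        if args not in cache:
            cache[args] = proc(*args)
        return cache[args]

    # Expose the cache for inspection, as the original code did with proc.cache
    cached_execution.cache = cache
    return cached_execution


@memoize
def cached_fibo(n):
    if n == 1 or n == 0:
        return n
    return cached_fibo(n - 1) + cached_fibo(n - 2)


print(cached_fibo(100))  # 354224848179261915075
```

By the time the recursive calls run, the name cached_fibo is already bound to the returned wrapper, so every recursive call goes through the cache. The standard library's functools.lru_cache provides the same behaviour ready-made.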
The wrapper seems a little malformed. Here is an updated version:
def fn(proc):
    cache = proc.cache = {}
    def cached_execution(*args, **kwargs):
        if proc in cache:
            if args in cache[proc]:
                return cache[proc][args]
        res = proc(args[0])
        # setdefault keeps earlier results instead of replacing the whole dict
        cache.setdefault(proc, {})[args] = res
        return res
    return cached_execution
You were trying to run the wrapper function inside the wrapper instead of returning it to be run as the function, causing issues.
The next issue is that *args arrives as a tuple, so proc(args) passes the whole tuple when you only want its first element; the call needs to become proc(args[0]).
I want to write a decorator that injects a custom local variable into a function.
The interface may look like this:
def enclose(name, value):
    ...
    def decorator(func):
        def wrapper(*args, **kwargs):
            return func(*args, **kwargs)
        return wrapper
    return decorator
expectation:
@enclose('param1', 1)
def f():
    param1 += 1
    print param1
f() will compile and run without error
output:
2
Is it possible to do this in Python? Why or why not?
I thought I'd try this out just to see how hard it would be. Pretty hard as it turns out.
The first question was how to implement this. Is the extra parameter an injected local variable, an additional argument to the function, or a nonlocal variable? An injected local variable will be a fresh object each time, but then how do you create more complicated objects? An additional argument will record mutations to the object, but assignments to the name will be forgotten between function invocations. Additionally, this will require either parsing the source to find where to place the argument, or directly manipulating code objects. Finally, declaring the variables nonlocal will record both mutations to the object and assignments to the name. Effectively a nonlocal is a global that is only reachable by the decorated function. Again, using a nonlocal will require either parsing the source to find where to place the nonlocal declaration, or direct manipulation of a code object.
In the end I decided on using a nonlocal variable and parsing the function source. Originally I was going to manipulate code objects, but it seemed too complicated.
Here is the code for the decorator:
import re
import types
import inspect
class DummyInject:
def __call__(self, **kwargs):
return lambda func: func
def __getattr__(self, name):
return self
class Inject:
function_end = re.compile(r"\)\s*:\s*\n")
indent = re.compile(r"\s+")
decorator = re.compile(r"@([a-zA-Z0-9_]+)[.a-zA-Z0-9_]*")
exec_source = """
def create_new_func({closure_names}):
{func_source}
{indent}return {func_name}"""
nonlocal_declaration = "{indent}nonlocal {closure_names};"
def __init__(self, **closure_vars):
self.closure_vars = closure_vars
def __call__(self, func):
lines, line_number = inspect.getsourcelines(func)
self.inject_nonlocal_declaration(lines)
new_func = self.create_new_function(lines, func)
return new_func
def inject_nonlocal_declaration(self, lines):
"""hides nonlocal declaration in first line of function."""
function_body_start = self.get_function_body_start(lines)
nonlocals = self.nonlocal_declaration.format(
indent=self.indent.match(lines[function_body_start]).group(),
closure_names=", ".join(self.closure_vars)
)
lines[function_body_start] = nonlocals + lines[function_body_start]
return lines
def get_function_body_start(self, lines):
line_iter = enumerate(lines)
found_function_header = False
for i, line in line_iter:
if self.function_end.search(line):
found_function_header = True
break
assert found_function_header
for i, line in line_iter:
if not line.strip().startswith("#"):
break
return i
def create_new_function(self, lines, func):
# prepares source -- eg. making sure indenting is correct
declaration_indent, body_indent = self.get_indent(lines)
if not declaration_indent:
lines = [body_indent + line for line in lines]
exec_code = self.exec_source.format(
closure_names=", ".join(self.closure_vars),
func_source="".join(lines),
indent=declaration_indent if declaration_indent else body_indent,
func_name=func.__name__
)
# create new func -- mainly only want code object contained by new func
lvars = {"closure_vars": self.closure_vars}
gvars = self.get_decorators(exec_code, func.__globals__)
exec(exec_code, gvars, lvars)
new_func = eval("create_new_func(**closure_vars)", gvars, lvars)
# add back bits that enable function to work well
# includes original global references and
new_func = self.readd_old_references(new_func, func)
return new_func
def readd_old_references(self, new_func, old_func):
"""Adds back globals, function name and source reference."""
func = types.FunctionType(
code=self.add_src_ref(new_func.__code__, old_func.__code__),
globals=old_func.__globals__,
name=old_func.__name__,
argdefs=old_func.__defaults__,
closure=new_func.__closure__
)
func.__doc__ = old_func.__doc__
return func
def add_src_ref(self, new_code, old_code):
return types.CodeType(
new_code.co_argcount,
new_code.co_kwonlyargcount,
new_code.co_nlocals,
new_code.co_stacksize,
new_code.co_flags,
new_code.co_code,
new_code.co_consts,
new_code.co_names,
new_code.co_varnames,
old_code.co_filename, # reuse filename
new_code.co_name,
old_code.co_firstlineno, # reuse line number
new_code.co_lnotab,
new_code.co_freevars,
new_code.co_cellvars
)
def get_decorators(self, source, global_vars):
"""Creates a namespace for exec function creation in. Must remove
any reference to Inject decorator to prevent infinite recursion."""
namespace = {}
for match in self.decorator.finditer(source):
decorator = eval(match.group()[1:], global_vars)
basename = match.group(1)
if decorator is Inject:
namespace[basename] = DummyInject()
else:
namespace[basename] = global_vars[basename]
return namespace
def get_indent(self, lines):
"""Takes a set of lines used to create a function and returns the
outer indentation that the function is declared in and the inner
indentation of the body of the function."""
body_indent = None
function_body_start = self.get_function_body_start(lines)
for line in lines[function_body_start:]:
match = self.indent.match(line)
if match:
body_indent = match.group()
break
assert body_indent
match = self.indent.match(lines[0])
if not match:
declaration_indent = ""
else:
declaration_indent = match.group()
return declaration_indent, body_indent
if __name__ == "__main__":
a = 1
@Inject(b=10)
def f(c, d=1000):
"f uses injected variables"
return a + b + c + d
@Inject(var=None)
def g():
"""Purposefully generate exception to show stacktraces are still
meaningful."""
create_name_error # line number 164
print(f(100)) # prints 1111
assert f(100) == 1111
assert f.__doc__ == "f uses injected variables" # show doc is retained
try:
g()
except NameError:
raise
else:
assert False
# stack trace shows NameError on line 164
Which outputs the following:
1111
Traceback (most recent call last):
File "inject.py", line 171, in <module>
g()
File "inject.py", line 164, in g
create_name_error # line number 164
NameError: name 'create_name_error' is not defined
The whole thing is hideously ugly, but it works. It's also worth noting that if Inject is used on a method, then any injected values are shared between all instances of the class.
You can do it using globals but I don't recommend this approach.
def enclose(name, value):
    globals()[name] = value
    def decorator(func):
        def wrapper(*args, **kwargs):
            return func(*args, **kwargs)
        return wrapper
    return decorator

@enclose('param1', 1)
def f():
    global param1
    param1 += 1
    print(param1)

f()
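The reason the global declaration is required can be seen in a small experiment. The sketch below uses made-up names (`f_no_global`, `call_or_error`) and relies only on standard attributes of function objects: an augmented assignment makes the name local when the function is compiled, so placing a value in the function's globals afterwards does not help.

```python
def f_no_global():
    # The augmented assignment makes param1 a local name at compile time
    param1 += 1
    return param1


def call_or_error(func):
    """Call func and return either its result or the UnboundLocalError raised."""
    try:
        return func()
    except UnboundLocalError as exc:
        return exc


# The compiler has already recorded param1 as a local variable of the function
local_names = f_no_global.__code__.co_varnames

# Injecting a global with the same name does not change that decision
f_no_global.__globals__["param1"] = 1
result = call_or_error(f_no_global)

print(local_names)
print(type(result).__name__)
```

This is why a decorator alone cannot make the original f() work unchanged: the name's scope is fixed before the decorator ever sees the function.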