With the goal of capturing meaningful elapsed-time information in logs, I have replicated the following time-capture and logging code across many functions:
import datetime
import logging
import time

def elapsed_str(seconds):
    """ Return the elapsed number of seconds in the format '(H:MM:SS elapsed)' """
    return "({} elapsed)".format(str(datetime.timedelta(seconds=int(seconds))))

def big_job(job_name):
    """ Do a big job and return the result """
    start = time.time()
    logging.info(f"Starting big '{job_name}' job...")
    logging.info(f"Doing stuff related to '{job_name}'...")
    time.sleep(10)  # Do some stuff...
    my_result = "done"  # placeholder for the real result
    logging.info(f"Big '{job_name}' job completed! "
                 f"{elapsed_str(time.time() - start)}")
    return my_result
With sample usage output:
big_job("sheep-counting")
# Log Output:
# 2019-09-04 01:10:48,027 - INFO - Starting big 'sheep-counting' job...
# 2019-09-04 01:10:48,092 - INFO - Doing stuff related to 'sheep-counting'...
# 2019-09-04 01:10:58,802 - INFO - Big 'sheep-counting' job completed! (0:00:10 elapsed)
I'm looking for an elegant (pythonic) way to avoid rewriting these redundant lines each time:
start = time.time() - should automatically capture the start time at function launch.
time.time() - start - should use the previously captured start time and infer the current time from now(). (Ideally elapsed_str() would be callable with zero arguments.)
My specific use case is to generate large datasets in the data science / data engineering field. Runtimes could be anywhere from seconds to days, and it is critical that (1) logs are easily searchable (for the word "elapsed" in this case) and (2) the developer cost of adding the logs is very low (since we don't know ahead of time which jobs may be slow, and we may not be able to modify source code once we identify a performance problem).
Since Python 3.3, the recommended way to measure elapsed time is time.perf_counter() (and time.perf_counter_ns(), added in 3.7).
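For a one-off measurement the pattern is simply (do_work is an illustrative placeholder):

import time

start = time.perf_counter()
do_work()  # placeholder for the code being timed
elapsed = time.perf_counter() - start  # elapsed wall-clock seconds, as a float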
To measure the runtime of whole functions, it is convenient to use a decorator, for example the following one:
import time

def benchmark(fn):
    def _timing(*a, **kw):
        st = time.perf_counter()
        r = fn(*a, **kw)
        print(f"{fn.__name__} execution: {time.perf_counter() - st} seconds")
        return r
    return _timing

@benchmark
def your_test():
    print("IN")
    time.sleep(1)
    print("OUT")

your_test()
The code of this decorator is slightly modified from the sosw package.
If I understood you correctly, you could write a decorator that will time the function.
A good example here: https://stackoverflow.com/a/5478448/6001492
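In outline, such a decorator might look like this (a minimal sketch, not the linked answer verbatim):

import functools
import time

def timed(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            # runs even if fn raises, so failed calls are timed too
            print(f"{fn.__name__} took {time.perf_counter() - start:.3f}s")
    return wrapper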
This may be overkill for others' use cases, but the solution I found required a few difficult hurdles, and I'll document them here for anyone who wants to accomplish something similar.
1. Helper function to dynamically evaluate f-strings
def fstr(fstring_text, locals, globals=None):
    """
    Dynamically evaluate the provided fstring_text

    Sample usage:
        format_str = "{i}*{i}={i*i}"
        i = 2
        fstr(format_str, locals())   # "2*2=4"
        i = 4
        fstr(format_str, locals())   # "4*4=16"
        fstr(format_str, {"i": 12})  # "12*12=144"
    """
    locals = locals or {}
    globals = globals or {}
    ret_val = eval(f'f"{fstring_text}"', globals, locals)
    return ret_val
2. The @logged decorator class
import datetime
import inspect
import logging
import time

class logged(object):
    """
    Decorator class for logging function start, completion, and elapsed time.
    """

    def __init__(
        self,
        desc_text="'{desc_detail}' call to {fn.__name__}()",
        desc_detail="",
        start_msg="Beginning {desc_text}...",
        success_msg="Completed {desc_text} {elapsed}",
        log_fn=logging.info,
        **addl_kwargs,
    ):
        """ All arguments optional """
        self.context = addl_kwargs.copy()  # start with addl. args
        self.context.update(locals())      # merge all constructor args
        self.context["elapsed"] = None
        self.context["start"] = None       # captured at call time, not decoration time

    def re_eval(self, context_key: str):
        """ Evaluate the f-string in self.context[context_key], store back the result """
        self.context[context_key] = fstr(self.context[context_key], locals=self.context)

    def elapsed_str(self):
        """ Return a formatted string, e.g. '(HH:MM:SS elapsed)' """
        seconds = time.time() - self.context["start"]
        return "({} elapsed)".format(str(datetime.timedelta(seconds=int(seconds))))

    def __call__(self, fn):
        """ Decorate the function """

        def wrapped_fn(*args, **kwargs):
            """
            The decorated function definition. Note that the log needs access to
            all passed arguments to the decorator, as well as all of the function's
            native args in a dictionary, even if args are not provided by keyword.
            If start_msg is None or success_msg is None, those log entries are skipped.
            """
            self.context["fn"] = fn
            self.context["start"] = time.time()
            fn_arg_names = inspect.getfullargspec(fn).args
            for x, arg_value in enumerate(args, 0):
                self.context[fn_arg_names[x]] = arg_value
            self.context.update(kwargs)
            desc_detail_fn = None
            log_fn = self.context["log_fn"]
            # If desc_detail is callable, evaluate dynamically (both before and after)
            if callable(self.context["desc_detail"]):
                desc_detail_fn = self.context["desc_detail"]
                self.context["desc_detail"] = desc_detail_fn()
            # Re-evaluate any decorator args which are f-strings
            self.re_eval("desc_detail")
            self.re_eval("desc_text")
            # Remove 'desc_detail' from the description if blank or unused
            self.context["desc_text"] = self.context["desc_text"].replace("'' ", "")
            self.re_eval("start_msg")
            if self.context["start_msg"]:
                # Log the start of execution
                log_fn(self.context["start_msg"])
            ret_val = fn(*args, **kwargs)
            if desc_detail_fn:
                # If desc_detail is callable, re-evaluate it after the call
                self.context["desc_detail"] = desc_detail_fn()
            self.context["elapsed"] = self.elapsed_str()
            # Log the end of execution
            log_fn(fstr(self.context["success_msg"], locals=self.context))
            return ret_val

        return wrapped_fn
Sample usage:
@logged()
def my_func_a():
    pass

# 2019-08-18 - INFO - Beginning call to my_func_a()...
# 2019-08-18 - INFO - Completed call to my_func_a() (00:00:00 elapsed)

@logged(log_fn=logging.debug)
def my_func_b():
    pass

# 2019-08-18 - DEBUG - Beginning call to my_func_b()...
# 2019-08-18 - DEBUG - Completed call to my_func_b() (00:00:00 elapsed)

@logged("doing a thing")
def my_func_c():
    pass

# 2019-08-18 - INFO - Beginning doing a thing...
# 2019-08-18 - INFO - Completed doing a thing (00:00:00 elapsed)

@logged("doing a thing with {foo_obj.name}")
def my_func_d(foo_obj):
    pass

# 2019-08-18 - INFO - Beginning doing a thing with Foo...
# 2019-08-18 - INFO - Completed doing a thing with Foo (00:00:00 elapsed)

@logged("doing a thing with '{custom_kwarg}'", custom_kwarg="foo")
def my_func_e(foo_obj):
    pass

# 2019-08-18 - INFO - Beginning doing a thing with 'foo'...
# 2019-08-18 - INFO - Completed doing a thing with 'foo' (00:00:00 elapsed)
Conclusion
The main advantages versus simpler decorator solutions are:
By leveraging delayed execution of f-strings, and by injecting context variables from both the decorator constructor as well as the function call itself, the log messaging can easily be customized to be human readable.
(And most importantly,) almost any derivation of the function's arguments can be used to distinguish the logs in subsequent calls - without changing how the function itself is defined.
Advanced callback scenarios can be achieved by sending functions or complex objects to the decorator argument desc_detail, in which case the callable is evaluated both before and after function execution. This could be extended to use a callback function to count rows in a created data table (for instance) and to include the row counts in the completion log.
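For instance, such a row-count callback might look like this (count_rows and query are hypothetical helpers, not part of the decorator):

def count_rows():
    """Hypothetical helper: query the row count of the table being built."""
    return query("SELECT COUNT(*) FROM my_table")  # `query` is illustrative only

@logged(
    desc_detail=lambda: f"{count_rows()} rows",
    success_msg="Completed {fn.__name__}() with {desc_detail} {elapsed}",
)
def build_my_table():
    ...  # create and populate the table

Because desc_detail is callable, it is evaluated once before the call and again after it, so the completion log reports the final row count.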
Related
Python offers tracing through its trace module; there are also custom solutions like this one. But these approaches capture most low-level executions, inside and out of nearly every library you use. Other than for deep-dive debugging, that isn't very useful.
It would be nice to have something that captures only the highest-level functions laid out in your pipeline. For example, if I had:
def funct1():
    res = funct2()
    print(res)

def funct2():
    factor = 3
    res = funct3(factor)
    return res

def funct3(factor):
    res = 1 + 100*factor
    return res
...and called:
funct1()
...it would be nice to capture:
function order:
- funct1
- funct2
- funct3
I have looked at:
trace
tracefunc
sys.settrace
trace.py
I am happy to manually mark the functions inside the scripts, like we do with Docstrings. Is there a way to add "hooks" to functions, then track them as they get called?
You can always use a decorator to track which functions are called. Here is an example that allows you to keep track of what nesting level the function is called at:
class Tracker:
    level = 0

    def __init__(self, indent=2):
        self.indent = indent

    def __call__(self, fn):
        def wrapper(*args, **kwargs):
            print(' ' * (self.indent * self.level) + '-' + fn.__name__)
            self.level += 1
            out = fn(*args, **kwargs)
            self.level -= 1
            return out
        return wrapper

track = Tracker()

@track
def funct1():
    res = funct2()
    print(res)

@track
def funct2():
    factor = 3
    res = funct3(factor)
    return res

@track
def funct3(factor):
    res = 1 + 100*factor
    return res
It uses the class variable level to keep track of how deeply nested the current call is, and simply prints the function name with a corresponding indent. So calling funct1 gives:
funct1()
# prints:
-funct1
  -funct2
    -funct3
# returns:
301
Depending on how you want to save the output, you can use the logging module instead of print.
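For example, a logging variant might look like this (a sketch that reuses the Tracker class above):

import logging

logging.basicConfig(level=logging.INFO)

class LoggingTracker(Tracker):
    """Variant of Tracker that logs instead of printing."""
    def __call__(self, fn):
        def wrapper(*args, **kwargs):
            logging.info(' ' * (self.indent * self.level) + '-' + fn.__name__)
            self.level += 1
            try:
                return fn(*args, **kwargs)
            finally:
                # restore the nesting level even if fn raises
                self.level -= 1
        return wrapper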
How could one write a debounce decorator in Python which debounces not only on the function called but also on the function arguments / combination of function arguments used?
Debouncing means suppressing calls to a function within a given timeframe: say you call a function 100 times within 1 second but you only want to allow it to run once every 10 seconds; a debounce-decorated function will then run once, 10 seconds after the last call, provided no new calls were made in the meantime. Here I'm asking how one could debounce a function call with specific function arguments.
An example could be to debounce an expensive update of a person object like:
@debounce(seconds=10)
def update_person(person_id):
    # time consuming, expensive op
    print('>>Updated person {}'.format(person_id))
Then debouncing on the function - including function arguments:
update_person(person_id=144)
update_person(person_id=144)
update_person(person_id=144)
>>Updated person 144
update_person(person_id=144)
update_person(person_id=355)
>>Updated person 144
>>Updated person 355
So calling the function update_person with the same person_id would be suppressed (debounced) until the 10-second debounce interval has passed without a new call to the function with that same person_id.
There are a few debounce decorators around, but none of them includes the function arguments; for example: https://gist.github.com/walkermatt/2871026
I've done a similar throttle decorator by function and arguments:
import time

def throttle(s, keep=60):
    def decorate(f):
        caller = {}

        def wrapped(*args, **kwargs):
            nonlocal caller
            # Key the throttle state on the full set of arguments
            called_args = '{}-{}'.format(args, kwargs)
            t_ = time.time()
            if caller.get(called_args) is None or t_ - caller.get(called_args, 0) >= s:
                result = f(*args, **kwargs)
                # Drop entries older than `keep` seconds
                caller = {key: val for key, val in caller.items() if t_ - val < keep}
                caller[called_args] = t_
                return result
            # Suppressed call: just prune old entries, don't reset the timestamp
            caller = {key: val for key, val in caller.items() if t_ - val < keep}

        return wrapped
    return decorate
The main takeaway is that it keys the throttle state on the function arguments, in caller[called_args].
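Applied to the question's example, usage would look like this (assuming the throttle decorator above):

@throttle(10)
def update_person(person_id):
    print('>>Updated person {}'.format(person_id))

update_person(person_id=144)  # runs
update_person(person_id=144)  # suppressed for the next 10 seconds
update_person(person_id=355)  # runs: different arguments, separate entry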
See also the difference between throttle and debounce: http://demo.nimius.net/debounce_throttle/
Update:
After some tinkering with the above throttle decorator and the threading.Timer example in the gist, I actually think this should work:
from threading import Timer
from inspect import signature
import time

def debounce(wait):
    def decorator(fn):
        sig = signature(fn)
        caller = {}

        def debounced(*args, **kwargs):
            nonlocal caller
            try:
                bound_args = sig.bind(*args, **kwargs)
                bound_args.apply_defaults()
                called_args = fn.__name__ + str(dict(bound_args.arguments))
            except TypeError:
                called_args = ''

            def call_it(key):
                try:
                    # Always remove the timer entry on call
                    caller.pop(key)
                except KeyError:
                    pass
                fn(*args, **kwargs)

            try:
                # Always try to cancel a pending timer for the same arguments
                caller[called_args].cancel()
            except KeyError:
                pass
            caller[called_args] = Timer(wait, call_it, [called_args])
            caller[called_args].start()

        return debounced
    return decorator
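Usage, following the question's example (the decorator keys each timer on the bound arguments, so different person_ids debounce independently):

@debounce(10)
def update_person(person_id):
    print('>>Updated person {}'.format(person_id))

update_person(person_id=144)
update_person(person_id=144)  # cancels and restarts the 10-second timer for 144
update_person(person_id=355)  # starts an independent timer for 355
# ~10 seconds after the last call, 144 and 355 each fire once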
I've had the same need to build a debounce annotation for a personal project; after stumbling upon the same gist / discussion you have, I ended up with the following solution:
import threading

def debounce(wait_time):
    """
    Decorator that will debounce a function so that it is called after wait_time seconds
    If it is called multiple times, will wait for the last call to be debounced and run only this one.
    """

    def decorator(function):
        def debounced(*args, **kwargs):
            def call_function():
                debounced._timer = None
                return function(*args, **kwargs)

            # if we already have a call to the function currently waiting to be executed, reset the timer
            if debounced._timer is not None:
                debounced._timer.cancel()

            # after wait_time, call the function provided to the decorator with its arguments
            debounced._timer = threading.Timer(wait_time, call_function)
            debounced._timer.start()

        debounced._timer = None
        return debounced

    return decorator
I've created an open-source project to provide functions such as debounce, throttle, filter ... as decorators; contributions are more than welcome to improve on the solutions I have for these decorators or to add other useful ones: the decorator-operations repository
I wrote a generic framework that helps me benchmark critical sections of code.
Here is an explanation of the framework; at the end is the problem I am facing, along with a few ideas I have for solutions.
Basically, I am looking for more elegant solutions.
Suppose I have a function that does this (in pseudo code):
#Pseudo Code - Don't expect it to run
def foo():
    do_begin()
    do_critical()
    some_value = do_end()
    return some_value
I want to run the "do_critical" section many times in a loop and measure the time, but still get the return value.
So I wrote a BenchMarker class whose API is something like this:
#Pseudo Code - Don't expect it to run
bm = BenchMarker(first=do_begin, critical=do_critical, end=do_end)
bm.start_benchmarking()
returned_value = bm.returned_value
benchmark_result = bm.time
Internally, the BenchMarker performs the following:
#Pseudo Code - Don't expect it to run
class BenchMarker:
    def __init__(self):
        .....

    def start_benchmarking(self):
        first()
        t0 = take_time
        for i in range(n_loops):
            critical()
        t1 = take_time
        self.time = (t1 - t0) / n_loops
        value = end()
        self.returned_value = value
It is important to mention that I am also able to pass context between the first, critical, and end functions, but I omitted that for simplicity as it is not the gist of my question.
This framework was working like a charm until the following use case.
I have the following code:
#Pseudo Code - Don't expect it to run
def bar():
    do_begin()
    with some_context_manager() as ctx:
        do_critical()
    some_value = do_end()
    return some_value
Now, after this long introduction (sorry ...), I am getting to the real question.
I don't want to run the with statement inside the time-measuring loop, but the critical code needs the context manager.
So what I basically want is the equivalent of the following decomposition of bar:
first -> do_begin() + "what happens in the with before the with body"
critical -> do_critical()
end -> "what happens after the with body" + do_end()
Two Solutions I thought of (but I don't like):
Solution 1
Mimic what with does under the hood (see the sketch below):
At the end of first(), create the context manager object and run its __enter__() method.
At the start of end(), call the context manager's __exit__() method.
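A sketch of Solution 1 in the same pseudo-code style (names are illustrative):

#Pseudo Code - Don't expect it to run
ctx = None

def first():
    global ctx
    do_begin()
    ctx = some_context_manager()
    ctx.__enter__()                  # what `with` does before its body

def end():
    ctx.__exit__(None, None, None)   # what `with` does after its body
    return do_end()

The standard library's contextlib.ExitStack wraps this pattern more safely: enter the context with stack.enter_context(some_context_manager()) in first() and call stack.close() in end().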
Solution 2
Framework Enhancement to handle CM
Add to the framework a "context work mode" (flag, whatever ...) on which the "start_benchmarking" flow will look like this:
#Pseudo Code - Don't expect it to run
def start_benchmarking(self):
    first()  # including instantiating the context manager
    ctx = get_the_context_manager_created_in_first()
    with ctx ...:
        t0 = take_time
        for i in range(n_loops):
            critical()
        t1 = take_time
        self.time = (t1 - t0) / n_loops
    value = end()
    self.returned_value = value
Any other, more elegant, solutions?
This is way over-complicated, and I cannot quite figure out why you'd actually want to do this, but assuming that you have reasons, just create a function that does your timing for you:
import time

def run_func_n_times(n_times, func, *args, **kwargs):
    start = time.time()
    for _ in range(n_times):
        res = func(*args, **kwargs)
    return res, (time.time() - start) / n_times
No need for a class, just a simple function:
def example():
    do_begin()
    print('look, i am here')
    with ctx() as blah:
        res, timed = run_func_n_times(27, f, foo, bar)
    do_end()
report.py
import argparse
import sys

if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
        description="CHECK-ACCESS REPORTING.")
    parser.add_argument('--input', '-i',
                        help='Filepath containing the Active Directory userlist')
    parser.add_argument('--timestamp', '-t', nargs='?', const="BLANK",
                        help='filepath with environment variable set')
    args, unknownargs = parser.parse_known_args(sys.argv[1:])
    timestampchecker(args.timestamp)
    # checking that the value of cons.DISPLAY_TIME_STAMP is True
    main()
The timestampchecker function:
def timestampchecker(status):
    """ Check from the command line whether the timestamp should be displayed """
    if status is not None:
        cons.DISPLAY_TIME_STAMP = True
This function checks whether the user has set the -t argument. If it is set, the constant cons.DISPLAY_TIME_STAMP is set to True.
The function is working great and turns the constant's value to True.
But the decorator below, which I use in main(), does not pick up the True value; it sees False.
timer.py
import time

def benchmarking(timestatus):
    def wrapper(funct):
        def timercheck(*args, **kwargs):
            if timestatus is True:
                starttime = time.time()
            result = funct(*args, **kwargs)
            if timestatus is True:
                print('Time Taken:', round(time.time() - starttime, 4))
            return result
        return timercheck
    return wrapper
I have decorated some methods used by main() in report.py with the decorator above. For example, this class is used in report.py and is decorated with it:
class NotAccountedReport:
    def __init__(self, pluginoutputpath):
        """ Path where the plugins' results are stored; need these files """
        self.pluginoutputpath = pluginoutputpath

    @benchmarking(cons.DISPLAY_TIME_STAMP)
    def makeNotAccountableReport(self):
        # some functionality
        ...
Here I have passed the constant's value as the decorator argument. Even though it has been turned to True by the time the method is called, the decorator receives False, and so the timing code never runs. Where is the problem? I can't figure it out.
You didn't post a complete minimal verifiable example, so there might be something else too. But if your point is that when calling NotAccountedReport().makeNotAccountableReport() you don't get your "Time taken" printed, it's really not a surprise: the benchmarking decorator is applied when the function is defined (when the module is imported), well before the if __name__ == '__main__' clause is executed, so at that time cons.DISPLAY_TIME_STAMP has not yet been updated by your command-line args.
If you want a runtime flag to activate / deactivate your decorator's behaviour, the obvious solution is to check cons.DISPLAY_TIME_STAMP within the decorator instead of passing it as an argument, i.e.:
import time

def benchmarking(func):
    def timercheck(*args, **kwargs):
        if cons.DISPLAY_TIME_STAMP:
            starttime = time.time()
        result = func(*args, **kwargs)
        if cons.DISPLAY_TIME_STAMP:
            logger.debug('Time Taken: %s', round(time.time() - starttime, 4))
        return result
    return timercheck

class NotAccountedReport(object):
    @benchmarking
    def makeNotAccountableReport(self):
        # some functionality
        ...
In some circumstances, I want to print debug-style output like this:
# module test.py
def f():
    a = 5
    b = 8
    debug(a, b)  # line 18
I want the debug function to print the following:
debug info at test.py: 18
function f
a = 5
b = 8
I am thinking it should be possible by using the inspect module to locate the stack frame, then finding the appropriate line, looking up the source code on that line, and getting the names of the arguments from there. The function name can be obtained by moving one stack frame up. (The values of the arguments are easy to obtain: they are passed directly to the debug function.)
Am I on the right track? Is there any recipe I can refer to?
You could do something along the following lines:
import inspect

def debug(**kwargs):
    st = inspect.stack()[1]
    print('%s:%d %s()' % (st[1], st[2], st[3]))
    for k, v in kwargs.items():
        print('%s = %s' % (k, v))

def f():
    a = 5
    b = 8
    debug(a=a, b=b)  # line 12
f()
This prints out:
test.py:12 f()
a = 5
b = 8
You're generally doing it right, though it would be easier to use AOP for this kind of task. Basically, instead of calling debug every time with every variable, you could decorate the code with aspects which do certain things upon certain events, such as printing the passed variables and the function's name upon entering the function.
Please refer to this site and an old SO post for more info.
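A minimal sketch of the aspect idea (names are illustrative only):

import functools

def log_args(fn):
    """Aspect-style decorator: report the function's name and arguments on entry."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        print('entering %s args=%s kwargs=%s' % (fn.__name__, args, kwargs))
        return fn(*args, **kwargs)
    return wrapper

@log_args
def f(a, b):
    return a + b

f(5, 8)  # prints: entering f args=(5, 8) kwargs={}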
Yeah, you are on the right track. You may want to look at inspect.getargspec, which returns a named tuple of the args, varargs, keywords, and defaults passed to the function.
import inspect

def f():
    a = 5
    b = 8
    debug(a, b)

def debug(a, b):
    print(inspect.getargspec(debug))

f()
This is really tricky. Let me try to give a more complete answer reusing this code, and the hint about getargspec in Senthil's answer, which got me triggered somehow. Btw, getargspec is deprecated since Python 3.0 and getfullargspec should be used instead.
This works for me on a Python 3.1.2 both with explicitly calling the debug function and with using a decorator:
# from: https://stackoverflow.com/a/4493322/923794
def getfunc(func=None, uplevel=0):
    """Return tuple of information about a function

    Goes up in the call stack to uplevel+1 and returns information
    about the function found.

    The tuple contains:
    name of function, function object, its frame object,
    filename and line number"""
    from inspect import currentframe, getouterframes, getframeinfo
    #for (level, frame) in enumerate(getouterframes(currentframe())):
    #    print(str(level) + ' frame: ' + str(frame))
    caller = getouterframes(currentframe())[1 + uplevel]
    # caller is a tuple of:
    #   frame object, filename, line number, function name,
    #   a list of lines of context, and index within the context
    func_name = caller[3]
    frame = caller[0]
    from pprint import pprint
    if func:
        func_name = func.__name__
    else:
        func = frame.f_locals.get(func_name, frame.f_globals.get(func_name))
    return (func_name, func, frame, caller[1], caller[2])

def debug_prt_func_args(f=None):
    """Print function name and arguments with their values"""
    from inspect import getargvalues, getfullargspec
    (func_name, func, frame, file, line) = getfunc(func=f, uplevel=1)
    argspec = getfullargspec(func)
    #print(argspec)
    argvals = getargvalues(frame)
    print("debug info at " + file + ': ' + str(line))
    print(func_name + ':' + str(argvals))  ## reformat to pretty print arg values here
    return func_name

def df_dbg_prt_func_args(f):
    """Decorator: dbg_prt_func_args - Prints function name and arguments
    """
    def wrapped(*args, **kwargs):
        debug_prt_func_args(f)
        return f(*args, **kwargs)
    return wrapped
Usage:
@df_dbg_prt_func_args
def leaf_decor(*args, **kwargs):
    """Leaf level, simple function"""
    print("in leaf")

def leaf_explicit(*args, **kwargs):
    """Leaf level, simple function"""
    debug_prt_func_args()
    print("in leaf")

def complex():
    """A complex function"""
    print("start complex")
    leaf_decor(3, 4)
    print("middle complex")
    leaf_explicit(12, 45)
    print("end complex")

complex()
and prints:
start complex
debug info at debug.py: 54
leaf_decor:ArgInfo(args=[], varargs='args', keywords='kwargs', locals={'args': (3, 4), 'f': <function leaf_decor at 0x2aaaac048d98>, 'kwargs': {}})
in leaf
middle complex
debug info at debug.py: 67
leaf_explicit:ArgInfo(args=[], varargs='args', keywords='kwargs', locals={'args': (12, 45), 'kwargs': {}})
in leaf
end complex
The decorator cheats a bit: since in wrapped we get the same arguments as the function itself, it doesn't matter that we find and report the ArgSpec of wrapped in getfunc and debug_prt_func_args. This code could be beautified a bit, but it works alright now for the simple debug test cases I used.
Another trick you can do: if you uncomment the for loop in getfunc, you can see that inspect can give you the "context", which really is the line of source code where a function got called. This code is obviously not showing the content of any variable given to your function, but sometimes it already helps to know the variable name used one level above your called function.
As you can see, with the decorator you don't have to change the code inside the function.
Probably you'll want to pretty print the args. I've left the raw print (and also a commented out print statement) in the function so it's easier to play around with.