ThreadPoolExecutor logging? (python)

I have some code that looks like
with futures.ThreadPoolExecutor(max_workers=2) as executor:
    for function in functions:
        executor.submit(function)
How would I log which function is currently being handled by the executor? I may or may not have the capability to log from within the functions - would want the executor itself to log something like
print("handling process {i}".format(i=current_process))
Any thoughts on how to approach this?

I guess this is a little old, but I stumbled across this question and thought I would put an answer in. I just used a wrapper that references a logger instance and logs before calling the function:
import concurrent.futures
import functools
import logging
import os
logging.basicConfig(filename=os.path.expanduser('~/Desktop/log.txt'), level=logging.INFO)
logger = logging.getLogger("MyLogger")
def logging_wrapper(func):
    @functools.wraps(func)
    def wrapped(*args, **kwargs):
        # Log the name in the worker thread, just before the function runs
        logger.info("Func name: {0}".format(func.__name__))
        # Return the result so it stays available through the future
        return func(*args, **kwargs)
    return wrapped
def a():
    print('a ran')
def b():
    print('b ran')
with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
    for func in [a, b]:
        executor.submit(logging_wrapper(func))
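If you also want a log line when each function finishes, one possible variation is to attach a done callback to the future returned by submit. This is only a sketch: submit_logged, run_all and the logger name are hypothetical names chosen for illustration, and the functions return strings instead of printing so the results are easy to inspect.

```python
import concurrent.futures
import functools
import logging

logger = logging.getLogger("executor_demo")  # hypothetical logger name


def submit_logged(executor, func, *args, **kwargs):
    """Submit func and log when it starts and when it finishes."""

    @functools.wraps(func)
    def wrapped(*a, **kw):
        # Runs in the worker thread, just before the real function.
        logger.info("handling %s", func.__name__)
        return func(*a, **kw)

    future = executor.submit(wrapped, *args, **kwargs)
    # The callback fires once the future is done
    # (or immediately, if it has already finished).
    future.add_done_callback(
        lambda fut: logger.info("finished %s", func.__name__)
    )
    return future


def a():
    return "a ran"


def b():
    return "b ran"


def run_all(functions):
    """Run every function on a two-worker pool; return results in order."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
        futures = [submit_logged(executor, f) for f in functions]
        return [fut.result() for fut in futures]
```

Calling run_all([a, b]) returns the two results and emits a "handling" and a "finished" record per function. The order of records from different functions is not guaranteed, since the two workers run concurrently.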

Related

sentry sdk custom performance integration for python app

Sentry can track performance for celery tasks and API endpoints
https://docs.sentry.io/product/performance/
I have custom scripts that are launched by cron and perform a set of similar tasks.
I want to incorporate sentry_sdk into my script to get performance tracing of my tasks.
Any advice on how to do it with
https://getsentry.github.io/sentry-python/api.html#sentry_sdk.capture_event

You don't need to use capture_event.
I would suggest using sentry_sdk.start_transaction instead. It also lets you track the performance of your functions.
Look at my example:
from time import sleep
from sentry_sdk import Hub, init, start_transaction
init(
    dsn="dsn",
    traces_sample_rate=1.0,
)
def sentry_trace(func):
    def wrapper(*args, **kwargs):
        transaction = Hub.current.scope.transaction
        if transaction:
            with transaction.start_child(op=func.__name__):
                return func(*args, **kwargs)
        else:
            with start_transaction(op=func.__name__, name=func.__name__):
                return func(*args, **kwargs)
    return wrapper
@sentry_trace
def b():
    for i in range(1000):
        print(i)
@sentry_trace
def c():
    sleep(2)
    print(1)
@sentry_trace
def a():
    sleep(1)
    b()
    c()
if __name__ == '__main__':
    a()
After running this code you can see basic info for transaction a, with children b and c.

Combining py.test and trio/curio

I would like to combine pytest and trio (or curio, if that is any easier), i.e. write my test cases as coroutine functions. This is relatively easy to achieve by declaring a custom test runner in conftest.py:
@pytest.mark.tryfirst
def pytest_pyfunc_call(pyfuncitem):
    '''If item is a coroutine function, run it under trio'''
    if not inspect.iscoroutinefunction(pyfuncitem.obj):
        return
    kernel = trio.Kernel()
    funcargs = pyfuncitem.funcargs
    testargs = {arg: funcargs[arg]
                for arg in pyfuncitem._fixtureinfo.argnames}
    try:
        kernel.run(functools.partial(pyfuncitem.obj, **testargs))
    finally:
        kernel.run(shutdown=True)
    return True
This allows me to write test cases like this:
async def test_something():
    server = MockServer()
    server_task = await trio.run(server.serve)
    try:
        # test the server
    finally:
        server.please_terminate()
        try:
            with trio.fail_after(30):
                server_task.join()
        except TooSlowError:
            server_task.cancel()
But this is a lot of boilerplate. In non-async code, I would factor this out into a fixture:
@pytest.yield_fixture()
def mock_server():
    server = MockServer()
    thread = threading.Thread(target=server.serve)
    thread.start()
    try:
        yield server
    finally:
        server.please_terminate()
        thread.join()
        server.server_close()
def test_something(mock_server):
    # do the test..
Is there a way to do the same in trio, i.e. implement async fixtures? Ideally, I would just write:
async def test_something(mock_server):
    # do the test..

Edit: the answer below is mostly irrelevant now – instead use pytest-trio and follow the instructions in its manual.
Your example pytest_pyfunc_call code doesn't work because it's a mix of trio and curio :-). For trio, there's a decorator trio.testing.trio_test that can be used to mark individual tests (like if you were using classic unittest or something), so the simplest way to write a pytest plugin function is to just apply this to each async test:
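import inspect
import pytest
from trio.testing import trio_test
@pytest.mark.tryfirst
def pytest_pyfunc_call(pyfuncitem):
    # pyfuncitem.obj is the test function itself, so check for a coroutine function
    if inspect.iscoroutinefunction(pyfuncitem.obj):
        # Apply the @trio_test decorator
        pyfuncitem.obj = trio_test(pyfuncitem.obj)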
In case you're curious, this is basically equivalent to:
import inspect
import pytest
import trio
from functools import wraps, partial
@pytest.mark.tryfirst
def pytest_pyfunc_call(pyfuncitem):
    if inspect.iscoroutinefunction(pyfuncitem.obj):
        fn = pyfuncitem.obj
        @wraps(fn)
        def wrapper(**kwargs):
            trio.run(partial(fn, **kwargs))
        pyfuncitem.obj = wrapper
Anyway, that doesn't solve your problem with fixtures – for that you need something much more involved.
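The wrapping step can be tried without trio installed. The sketch below swaps in asyncio.run from the standard library for trio.run, so it shows the shape of the technique rather than trio's behaviour; run_async_case, sample_async_case and sample_sync_case are hypothetical names used only for this illustration.

```python
import asyncio
import functools
import inspect


def run_async_case(fn):
    """Wrap a coroutine function so a synchronous caller can invoke it."""
    if not inspect.iscoroutinefunction(fn):
        return fn  # plain functions are left untouched

    @functools.wraps(fn)
    def wrapper(**kwargs):
        # asyncio.run plays the role that trio.run plays in the answer above.
        return asyncio.run(fn(**kwargs))

    return wrapper


async def sample_async_case(value=3):
    await asyncio.sleep(0)
    return value


def sample_sync_case():
    return "sync"
```

Calling run_async_case(sample_async_case)() drives the coroutine to completion and returns its result, while run_async_case(sample_sync_case) hands back the original function unchanged.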

Automatic debug logs when control goes inside/outside of a function

I am working on a project that has many functions, and I am using the standard logging module. The requirement is to emit DEBUG log entries such as:
<timestamp> DEBUG entered foo()
<timestamp> DEBUG exited foo()
<timestamp> DEBUG entered bar()
<timestamp> DEBUG exited bar()
But I don't want to write the DEBUG log calls inside every function. Is there a way in Python to log function entry and exit automatically?
I don't want to apply a decorator to every function, unless that is the only solution in Python.

Any reason you don't want to use a decorator? It's pretty simple:
from functools import wraps
import logging
logging.basicConfig(filename='some_logfile.log', level=logging.DEBUG)
def tracelog(func):
    @wraps(func)  # to preserve the name and docstring
    def inner(*args, **kwargs):
        logging.debug('entered {0}, called with args={1}, kwargs={2}'.format(func.__name__, args, kwargs))
        result = func(*args, **kwargs)
        logging.debug('exited {0}'.format(func.__name__))
        return result  # pass the wrapped function's result through
    return inner
If you get that, then passing in an independent logger is just another layer deep:
def tracelog(log):
    def real_decorator(func):
        @wraps(func)
        def inner(*args, **kwargs):
            log.debug('entered {0} called with args={1}, kwargs={2}'.format(func.__name__, args, kwargs))
            result = func(*args, **kwargs)
            log.debug('exited {0}'.format(func.__name__))
            return result  # pass the wrapped function's result through
        return inner
    return real_decorator
The nice thing is that this works for both functions and methods.
Usage example:
logger = logging.getLogger(__name__)
@tracelog(logger)
def somefunc():
    print('running somefunc')
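If the objection to decorators is having to write one above every function, a possible middle ground is to apply the decorator in a loop over a namespace. The sketch below is an assumption-laden illustration: trace_all, the demo namespace and the logger name are hypothetical, and a SimpleNamespace stands in for a real module.

```python
import functools
import inspect
import logging
import types

log = logging.getLogger("trace_demo")  # hypothetical logger name


def tracelog(func):
    @functools.wraps(func)
    def inner(*args, **kwargs):
        log.debug("entered %s()", func.__name__)
        try:
            return func(*args, **kwargs)
        finally:
            # Logged even if func raises.
            log.debug("exited %s()", func.__name__)
    return inner


def trace_all(namespace):
    """Replace every public plain function in namespace with a traced one."""
    for name, obj in list(vars(namespace).items()):
        if inspect.isfunction(obj) and not name.startswith("_"):
            setattr(namespace, name, tracelog(obj))


def foo():
    return 1


def bar():
    return 2


# A throwaway namespace stands in for a real project module.
demo = types.SimpleNamespace(foo=foo, bar=bar)
trace_all(demo)
```

For a real module, the same helper could be called once at the bottom of the file with sys.modules[__name__]. Note that it would also wrap functions imported into that module, so a fuller version would probably filter on obj.__module__.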

You want to have a look at sys.settrace.
There is a nice explanation with code examples for call tracing here: https://pymotw.com/2/sys/tracing.html
A very primitive way to do it is shown below; see the link for more worked examples:
import sys
def trace_calls(frame, event, arg):
    if event not in ('call', 'return'):
        # Keep tracing this frame, but ignore line and exception events
        return trace_calls
    co = frame.f_code
    func_name = co.co_name
    if func_name == 'write':
        # Ignore write() calls from print statements
        return
    if event == 'call':
        print("ENTER: %s" % func_name)
        # Return the tracer so the matching 'return' event is reported too
        return trace_calls
    print("EXIT: %s" % func_name)
sys.settrace(trace_calls)
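A small self-contained variation makes the entry/exit pairing easier to check. It records events in a list instead of printing and switches tracing off again afterwards; events, run_traced, foo and bar are hypothetical names for this sketch.

```python
import sys

events = []  # collected (kind, function name) pairs


def trace_calls(frame, event, arg):
    name = frame.f_code.co_name
    if event == "call":
        events.append(("ENTER", name))
    elif event == "return":
        events.append(("EXIT", name))
    # Returning the tracer keeps events coming for this frame.
    return trace_calls


def bar():
    return 1


def foo():
    return bar() + 1


def run_traced(func):
    """Call func with tracing enabled, then restore the previous tracer."""
    previous = sys.gettrace()
    sys.settrace(trace_calls)
    try:
        return func()
    finally:
        sys.settrace(previous)
```

Running run_traced(foo) records an ENTER and EXIT pair for foo with the pair for bar nested inside it. Tracing every call slows the program down noticeably, so this approach is better suited to debugging sessions than to production logging.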

Celery task with multiple decorators not auto registering task name

I have a task that looks like this
from mybasetask_module import MyBaseTask
@task(base=MyBaseTask)
@my_custom_decorator
def my_task(*args, **kwargs):
    pass
and my base task looks like this
from celery import task, Task
class MyBaseTask(Task):
    abstract = True
    default_retry_delay = 10
    max_retries = 3
    acks_late = True
The problem I'm running into is that the celery worker is registering the task with the name
'mybasetask_module.__inner'
The task is registered fine (with the name package+module+function) when I remove @my_custom_decorator from the task, or if I provide an explicit name for the task like this
from mybasetask_module import MyBaseTask
@task(base=MyBaseTask, name='an_explicit_task_name')
@my_custom_decorator
def my_task(*args, **kwargs):
    pass
Is this behavior expected? Do I need to do something so that my tasks are registered with the default auto registered name in the first case when I have multiple decorators but no explicit task name?
Thanks,

Use the functools.wraps() decorator to ensure that the wrapper returned by my_custom_decorator has the correct name:
from functools import wraps
def my_custom_decorator(func):
    @wraps(func)
    def __inner(*args, **kwargs):
        # Forward all arguments, since my_task accepts *args and **kwargs
        return func(*args, **kwargs)
    return __inner
The task name is taken from the function that the task decorator wraps, but by inserting a decorator in between, you gave task your __inner wrapping function instead. The functools.wraps() decorator copies the necessary metadata (including __name__ and __module__) over from func to the wrapper so that task() can pick up the proper name.
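The effect on the function name can be checked without celery. In this small sketch, plain_decorator, wrapping_decorator, first_task and second_task are hypothetical names; only functools.wraps comes from the answer.

```python
from functools import wraps


def plain_decorator(func):
    def __inner(*args, **kwargs):
        return func(*args, **kwargs)
    return __inner


def wrapping_decorator(func):
    @wraps(func)
    def __inner(*args, **kwargs):
        return func(*args, **kwargs)
    return __inner


@plain_decorator
def first_task():
    return "first"


@wrapping_decorator
def second_task():
    return "second"


print(first_task.__name__)   # __inner
print(second_task.__name__)  # second_task
```

Anything that derives a name from __module__ and __name__ sees __inner for the first function and second_task for the second, which matches the registration behaviour described in the question.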

celery task and customize decorator

I'm working on a project using django and celery (django-celery). Our team decided to wrap all data access code within (app-name)/manager.py (NOT wrapped into Managers the django way), and to let the code in (app-name)/task.py deal only with assembling and performing tasks with celery (so we don't have a django ORM dependency in this layer).
In my manager.py, I have something like this:
def get_tag(tag_name):
    ctype = ContentType.objects.get_for_model(Photo)
    try:
        tag = Tag.objects.get(name=tag_name)
    except ObjectDoesNotExist:
        return Tag.objects.none()
    return tag
def get_tagged_photos(tag):
    ctype = ContentType.objects.get_for_model(Photo)
    return TaggedItem.objects.filter(content_type__pk=ctype.pk, tag__pk=tag.pk)
def get_tagged_photos_count(tag):
    return get_tagged_photos(tag).count()
In my task.py, I would like to wrap them into tasks (then maybe use these tasks to build more complicated tasks), so I wrote this decorator:
import manager  # the module within the same app containing data access functions
class mfunc_to_task(object):
    def __init__(self, mfunc_type='get'):
        self.mfunc_type = mfunc_type
    def __call__(self, f):
        def wrapper_f(*args, **kwargs):
            callback = kwargs.pop('callback', None)
            mfunc = getattr(manager, f.__name__)
            result = mfunc(*args, **kwargs)
            if callback:
                if self.mfunc_type == 'get':
                    subtask(callback).delay(result)
                elif self.mfunc_type == 'get_or_create':
                    subtask(callback).delay(result[0])
                else:
                    subtask(callback).delay()
            return result
        return wrapper_f
then (still in task.py):
#@task
@mfunc_to_task()
def get_tag():
    pass
#@task
@mfunc_to_task()
def get_tagged_photos():
    pass
#@task
@mfunc_to_task()
def get_tagged_photos_count():
    pass
Things work fine without @task.
But after applying the @task decorator (at the top, as the celery documentation instructs), things start to fall apart. Apparently, every time mfunc_to_task.__call__ gets called, the same task.get_tag function gets passed as f. So I end up with the same wrapper_f every time, and now the only thing I can do is get a single tag.
I'm new to decorators. Can anyone help me understand what went wrong here, or point out other ways to achieve this? I would really hate to write the same task-wrapping code for every one of my data access functions.

Not quite sure why passing arguments won't work?
If you use this example:
@task()
def add(x, y):
    return x + y
let's add some logging to MyCoolTask:
from celery import task
from celery.registry import tasks
import logging
import celery
logger = logging.getLogger(__name__)
class MyCoolTask(celery.Task):
    def __call__(self, *args, **kwargs):
        """In a celery task this method calls the run method; here you can
        set up some environment variables before the task runs"""
        logger.info("Starting to run")
        return self.run(*args, **kwargs)
    def after_return(self, status, retval, task_id, args, kwargs, einfo):
        # exit point of the task, whatever its state
        logger.info("Ending run")
and create an extended class (extending MyCoolTask, but now with arguments):
class AddTask(MyCoolTask):
    def run(self, x, y):
        if x and y:
            result = add(x, y)
            logger.info('result = %d' % result)
            return result
        else:
            logger.error('No x or y in arguments')
tasks.register(AddTask)
and make sure you pass the kwargs as json data:
{"x":8,"y":9}
I get the result:
[2013-03-05 17:30:25,853: INFO/MainProcess] Starting to run
[2013-03-05 17:30:25,855: INFO/MainProcess] result = 17
[2013-03-05 17:30:26,739: INFO/MainProcess] Ending run
[2013-03-05 17:30:26,741: INFO/MainProcess] Task iamscheduler.tasks.AddTask[6a62641d-16a6-44b6-a1cf-7d4bdc8ea9e0] succeeded in 0.888684988022s: 17

Instead of using a decorator, why not create a base class that extends celery.Task?
That way all your tasks can extend your customized task class, where you can implement your own behavior using the __call__ and after_return methods.
You can also define common methods and objects for all your tasks.
class MyCoolTask(celery.Task):
    def __call__(self, *args, **kwargs):
        """In a celery task this method calls the run method; here you can
        set up some environment variables before the task runs"""
        return self.run(*args, **kwargs)
    def after_return(self, status, retval, task_id, args, kwargs, einfo):
        # exit point of the task, whatever its state
        pass
