I'm doing a few dozen HTTP requests inside a Gevent pool.
The goal is to retry a request if it fails, but only once; otherwise an exception should be raised.
How would I write gevent code with a pool that supports re-running HTTP requests once if they fail?
Could this approach work?
import requests
import gevent
from gevent.pool import Pool
pool = Pool(10)
def do_request(id):
    r = requests.get('http://example.com/%u' % id)
    if not r.status_code == 200:
        raise RuntimeError(id)

def spawn_greenlet(id, is_retry=False):
    if not is_retry:
        g = gevent.spawn(do_request, id)
        g.link_exception(retry_once)
    else:
        g = pool.spawn(do_request, id)
        g.link_exception(raise_exception)
    return g

def retry_once(greenlet):
    return spawn_greenlet(greenlet.exception.args[0], is_retry=True)

def raise_exception(greenlet):
    if greenlet.exception:
        raise greenlet.exception
    raise RuntimeError('Unknown error in greenlet processing.')

greenlets = pool.map(spawn_greenlet, [1, 2, 3, 4, 5])
gevent.joinall(greenlets)
Is there a cleaner way to obtain the argument of the greenlet function than via the exception arguments?
Is there a possibility that the joinall(greenlets) call returns after an exception occurs inside do_request but before the retry_once handler is called?
Is there a cleaner way to restart a greenlet with the same arguments, so I wouldn't need the is_retry kwarg at spawn_greenlet?
As far as I understand this, gevent.joinall(greenlets) only joins the greenlets returned by map. When there's an exception, is the original greenlet replaced with the new one returned by retry_once? If not, does processing continue even though the additional greenlets are still running? How could I wait for all greenlets to finish in that case?
The gevent docs are very scarce and there seem to be no other resources on the web documenting this, even though it is a fairly common use case. Therefore I don't consider this too localized a question.
Don't use spawn/link/link_exception for retrying things. Just use normal Python:
def do_something_with_retry(*args):
    try:
        return do_something(*args)
    except Exception:
        return do_something(*args)
Also, gevent.pool.Pool.map automatically spawns greenlets within the given pool, so you don't have to do it yourself.
pool = Pool(10)
pool.map(do_something_with_retry, [1, 2, 3])
Now, you only need to implement do_something(), which can be normal Python/requests code:
def do_something(*args):
    return requests.get('http://gevent.org')
Have fun!
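For the original use case, the same pattern can be applied to the question's do_request directly. This is only a minimal sketch (reusing the OP's example.com URL scheme), not a drop-in solution: a request that fails twice raises out of the wrapper, and gevent re-raises that exception when pool.map collects the results.

import requests
import gevent
from gevent.pool import Pool

def do_request(id):
    # One plain, synchronous attempt (the OP's function, essentially unchanged).
    r = requests.get('http://example.com/%u' % id)
    if r.status_code != 200:
        raise RuntimeError(id)
    return r

def do_request_with_retry(id):
    try:
        return do_request(id)
    except Exception:
        # Second and last attempt; if this fails too, the exception
        # propagates out of the greenlet and out of pool.map.
        return do_request(id)

pool = Pool(10)
results = pool.map(do_request_with_retry, [1, 2, 3, 4, 5])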
I am playing around with concurrent.futures.
Currently my future calls time.sleep(secs).
It seems that Future.cancel() does less than I thought.
If the future is already executing, then time.sleep() does not get cancelled by it.
The same goes for the timeout parameter of wait(): it does not cancel my time.sleep().
How to cancel time.sleep() which gets executed in a concurrent.futures?
For testing I use the ThreadPoolExecutor.
If you submit a function to a ThreadPoolExecutor, the executor will run the function in a thread and store its return value in the Future object. Since the number of concurrent threads is limited, you have the option to cancel the pending execution of a future, but once control in the worker thread has been passed to the callable, there's no way to stop execution.
Consider this code:
import concurrent.futures as f
import time

T = f.ThreadPoolExecutor(1)  # Run at most one function concurrently

def block5():
    time.sleep(5)
    return 1

q = T.submit(block5)
m = T.submit(block5)

print q.cancel()  # Will fail, because q is already running
print m.cancel()  # Will work, because q is blocking the only thread, so m is still queued
In general, whenever you want to have something cancellable you yourself are responsible for making sure that it is.
There are some off-the-shelf options available, though. E.g., consider using asyncio; it also has an example using sleep. The concept circumvents the issue by, whenever any potentially blocking operation is to be called, instead returning control to a control loop running in the outermost context, together with a note that execution should be continued whenever the result is available - or, in your case, after n seconds have passed.
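As a rough illustration of that model (my addition, not part of the original answer): in asyncio a sleeping task can be cancelled from outside, and the cancellation takes effect at the await point rather than after the full sleep has elapsed.

import asyncio

async def blocker():
    try:
        await asyncio.sleep(5)   # control returns to the event loop here
        return 1
    except asyncio.CancelledError:
        print("sleep was cancelled before it finished")
        raise

async def main():
    task = asyncio.create_task(blocker())
    await asyncio.sleep(1)       # let the task start sleeping
    task.cancel()                # takes effect at the task's next await point
    try:
        await task
    except asyncio.CancelledError:
        print("task ended as cancelled")

asyncio.run(main())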
I do not know much about concurrent.futures, but you can use this logic to break out of the waiting. Use a loop of short sleeps instead of a single time.sleep() or wait():
for i in range(sec):
    sleep(1)
A flag check with break (or an interrupt) can then be used to leave the loop early.
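A related trick (my addition, not from the answer above) is to wait on a threading.Event instead of sleeping: Event.wait(timeout) behaves like an interruptible sleep and returns early as soon as the event is set from another thread.

import threading
from concurrent.futures import ThreadPoolExecutor

stop_event = threading.Event()

def interruptible_sleep(seconds):
    # Returns True if we were woken early, False if the full timeout elapsed.
    return stop_event.wait(timeout=seconds)

executor = ThreadPoolExecutor(max_workers=1)
future = executor.submit(interruptible_sleep, 60)

stop_event.set()          # "cancel" the sleep almost immediately
print(future.result())    # True: the wait returned early
executor.shutdown()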
I figured it out.
Here is an example:
from concurrent.futures import ThreadPoolExecutor
import queue
import time
class Runner:
    def __init__(self):
        self.q = queue.Queue()
        self.exec = ThreadPoolExecutor(max_workers=2)

    def task(self):
        while True:
            try:
                self.q.get(block=True, timeout=1)
                break
            except queue.Empty:
                pass
            print('running')

    def run(self):
        self.exec.submit(self.task)

    def stop(self):
        self.q.put(None)
        self.exec.shutdown(wait=False, cancel_futures=True)
r = Runner()
r.run()
time.sleep(5)
r.stop()
As shown in the linked documentation, you can use a with statement to ensure threads are cleaned up promptly, like the example below:
import concurrent.futures
import urllib.request

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']

# Retrieve a single page and report the URL and contents
def load_url(url, timeout):
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()

# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Start the load operations and mark each future with its URL
    future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))
I've faced this same problem recently. I had 2 tasks to run concurrently and one of them had to sleep from time to time. In the code below, suppose task2 is the one that sleeps.
from concurrent.futures import ThreadPoolExecutor
executor = ThreadPoolExecutor(max_workers=2)
executor.submit(task1)
executor.submit(task2)
executor.shutdown(wait=True)
In order to avoid the endless sleep I've extracted task2 to run synchronously. I don't know whether it's good practice, but it's simple and fits perfectly in my scenario.
from concurrent.futures import ThreadPoolExecutor
executor = ThreadPoolExecutor(max_workers=1)
executor.submit(task1)
task2()
executor.shutdown(wait=True)
Maybe it's useful to someone else.
I have a primitive producer/consumer script running in gevent. It starts a few producer functions that put things into a gevent.queue.Queue, and one consumer function that fetches them out of the queue again:
from __future__ import print_function
import time
import gevent
import gevent.queue
import gevent.monkey
q = gevent.queue.Queue()
# define and spawn a consumer
def consumer():
    while True:
        item = q.get(block=True)
        print('consumer got {}'.format(item))

consumer_greenlet = gevent.spawn(consumer)

# define and spawn a few producers
def producer(ID):
    while True:
        print("producer {} about to put".format(ID))
        q.put('something from {}'.format(ID))
        time.sleep(0.1)
        # consumer_greenlet.switch()

producer_greenlets = [gevent.spawn(producer, i) for i in range(5)]
# wait indefinitely
gevent.monkey.patch_all()
print("about to join")
consumer_greenlet.join()
It works fine if I let gevent handle the scheduling implicitly (e.g. by calling time.sleep or some other gevent.monkey.patch_all()ed function), however when I switch to the consumer explicitly (replace time.sleep with the commented-out switch call), gevent raises an AssertionError:
Traceback (most recent call last):
File "/my/virtualenvs/venv/local/lib/python2.7/site-packages/gevent/greenlet.py", line 327, in run
result = self._run(*self.args, **self.kwargs)
File "switch_test.py", line 14, in consumer
item = q.get(block=True)
File "/my/virtualenvs/venv/lib/python2.7/site-packages/gevent/queue.py", line 201, in get
assert result is waiter, 'Invalid switch into Queue.get: %r' % (result, )
AssertionError: Invalid switch into Queue.get: ()
<Greenlet at 0x7fde6fa6c870: consumer> failed with AssertionError
I would like to employ explicit switching because in production I have a lot of producers, gevent's scheduling does not allocate nearly enough runtime to the consumer, and the queue gets longer and longer (which is bad). Alternatively, any insights into how to configure or modify gevent's scheduler are greatly appreciated.
This is on Python 2.7.2, gevent 1.0.1 and greenlet 0.4.5.
It seems to me that explicit switching doesn't really play well with implicit switching.
You already have implicit switches happening, either because of monkey-patched I/O or because of the gevent.queue.Queue().
The gevent documentation discourages usage of the raw greenlet methods:
Being a greenlet subclass, Greenlet also has switch() and throw()
methods. However, these should not be used at the application level as
they can very easily lead to greenlets that are forever unscheduled.
Prefer higher-level safe classes, like Event and Queue, instead.
Iterating gevent.queue.Queue() or calling the queue's get method does an implicit switch; interestingly, put does not. So you have to trigger a switch yourself, and the easiest way is to call gevent.sleep(0) (you don't have to actually wait a specific time).
In conclusion, you don't even have to monkey-patch anything, provided that your code does not have blocking I/O operations.
I would rewrite your code like this:
import gevent
import gevent.queue
q = gevent.queue.Queue()
# define and spawn a consumer
def consumer():
    for item in q:
        print('consumer got {}'.format(item))

consumer_greenlet = gevent.spawn(consumer)

# define and spawn a few producers
def producer(ID):
    print('producer started', ID)
    while True:
        print("producer {} about to put".format(ID))
        q.put('something from {}'.format(ID))
        gevent.sleep(0)
producer_greenlets = [gevent.spawn(producer, i) for i in range(5)]
# wait indefinitely
print("about to join")
consumer_greenlet.join()
I'm new to Twisted and after finally figuring out how the deferreds work I'm struggling with the tasks. What I want to achieve is to have a script that sends a REST request in a loop, however if at some point it fails I want to stop the loop. Since I'm using callbacks I can't easily catch exceptions and because I don't know how to stop the looping from an errback I'm stuck.
This is the simplified version of my code:
def send_request():
    agent = Agent(reactor)
    req_result = agent.request('GET', some_rest_link)
    req_result.addCallbacks(cp_process_request, cb_process_error)

if __name__ == "__main__":
    list_call = task.LoopingCall(send_request)
    list_call.start(2)
    reactor.run()
To end a task.LoopingCall, all you need to do is call stop() on the LoopingCall object (list_call in your case).
Somehow you need to make that variable available to your errback (cb_process_error), either by pushing it into a class that cb_process_error belongs to, via some other class used as a pseudo-global, or by literally using a global; then you simply call list_call.stop() inside the errback.
BTW you said:
Since I'm using callbacks I can't easily catch exceptions
That's not really true. The point of an errback is to deal with exceptions; that's one of the things that literally causes it to be called! Check out my previous deferred answer and see if it makes errbacks any clearer.
The following is a runnable example (... I'm not saying this is the best way to do it, just that it is a way...)
#!/usr/bin/python
from twisted.internet import task
from twisted.internet import reactor
from twisted.internet.defer import Deferred
from twisted.web.client import Agent
from pprint import pprint
class LoopingStuff (object):

    def cp_process_request(self, return_obj):
        print "In callback"
        pprint(return_obj)

    def cb_process_error(self, return_obj):
        print "In Errorback"
        pprint(return_obj)
        self.loopstopper()

    def send_request(self):
        agent = Agent(reactor)
        req_result = agent.request('GET', 'http://google.com')
        req_result.addCallbacks(self.cp_process_request, self.cb_process_error)

def main():
    looping_stuff_holder = LoopingStuff()
    list_call = task.LoopingCall(looping_stuff_holder.send_request)
    looping_stuff_holder.loopstopper = list_call.stop
    list_call.start(2)
    reactor.callLater(10, reactor.stop)
    reactor.run()

if __name__ == '__main__':
    main()
Assuming you can get to google.com, this will fetch pages for 10 seconds. If you change the second arg of agent.request to something like http://127.0.0.1:12999 (assuming that port 12999 will give a connection refused), you'll see one errback printout (which will also have shut down the LoopingCall) and a wait of up to 10 seconds until the reactor shuts down.
I'd like to do something like this (one queue, and multiple consumers):
import gevent
from gevent import queue
q=queue.Queue()
q.put(1)
q.put(2)
q.put(3)
q.put(StopIteration)
def consumer(qq):
    for i in qq:
        print i
jobs=[gevent.spawn(consumer,i) for i in [q,q]]
gevent.joinall(jobs)
But it's not possible ... the queue is consumed by job1 ... so job2 would block forever.
It gives me the exception gevent.hub.LoopExit: This operation would block forever.
I would like each consumer to be able to consume the full queue from the start (it should display 1,2,3,1,2,3 or 1,1,2,2,3,3 ... it doesn't matter which).
One idea would be to clone the queue before spawning, but that's not possible using the copy module (shallow/deep) ;-(
Is there another way to do that ?
[EDIT]
What do you think of this?
import gevent
from gevent import queue
class MasterQueueClonable(queue.Queue):
    def __init__(self, *a, **k):
        queue.Queue.__init__(self, *a, **k)
        self.__cloned = []
        self.__old = []

    # override
    def get(self, *a, **k):
        e = queue.Queue.get(self, *a, **k)
        for i in self.__cloned: i.put(e)  # serve the current clones
        self.__old.append(e)              # save the old element
        return e

    def clone(self):
        q = queue.Queue()
        for i in self.__old: q.put(i)  # feed the new queue with elements that are already out
        self.__cloned.append(q)        # store the clone, so newer elements reach it too
        return q

q = MasterQueueClonable()
q.put(1)
q.put(2)
q.put(3)
q.put(StopIteration)

def consumer(qq):
    for i in qq:
        print id(qq), i

jobs = [gevent.spawn(consumer, i) for i in [q.clone(), q, q.clone(), q.clone()]]
gevent.joinall(jobs)
It's based on RyanYe's idea, but with a "master queue" and without a dispatcher.
My master queue overrides the get method and can dispatch to on-demand clones.
Moreover, a "clone" can be created after the master queue has started delivering items (thanks to the __old trick).
I suggest you create a greenlet to dispatch the work to the consumers. Example code:
import gevent
from gevent import queue
master_queue=queue.Queue()
master_queue.put(1)
master_queue.put(2)
master_queue.put(3)
master_queue.put(StopIteration)
total_consumers = 10
consumer_queues = [queue.Queue() for i in xrange(total_consumers)]
def dispatcher(master_queue, consumer_queues):
    for i in master_queue:
        [j.put(i) for j in consumer_queues]
    [j.put(StopIteration) for j in consumer_queues]

def consumer(qq):
    for i in qq:
        print i

jobs = [gevent.spawn(dispatcher, master_queue, consumer_queues)] + [gevent.spawn(consumer, i) for i in consumer_queues]
gevent.joinall(jobs)
UPDATE: Fix missing StopIteration for consumer queues. Thanks arilou for pointing it out.
I've added a copy() method to the Queue class:
>>> import gevent.queue
>>> q = gevent.queue.Queue()
>>> q.put(5)
>>> q.copy().get()
5
>>> q
<Queue at 0x1062760d0 queue=deque([5])>
Let me know if it helps.
In Ryan Ye's answer, one line is missing at the end of the dispatcher() function:
[j.put(StopIteration) for j in consumer_queues]
Without it we still get 'gevent.hub.LoopExit: This operation would block forever', since the 'for i in master_queue' loop doesn't copy the StopIteration marker into the consumer_queues.
(Sorry, I can't leave comments yet, so I write this as a separate answer.)
For operations in my Tornado server that are expected to block (and can't be easily modified to use things like Tornado's asynchronous HTTP request client), I have been offloading the work to separate worker processes using the multiprocessing module. Specifically, I was using a multiprocessing Pool because it offers a method called apply_async, which works very well with Tornado since it takes a callback as one of its arguments.
I recently realized that a pool preallocates its processes, so if they all become blocked, operations that require a new process will have to wait. I do realize that the server can still take connections, since apply_async works by adding things to a task queue and returns almost immediately itself, but I'm looking to spawn n processes for the n blocking tasks I need to perform.
I figured that I could use the add_handler method for my Tornado server's IOLoop to add a handler for each new PID that I create to that IOLoop. I've done something similar before, but it was using popen and an arbitrary command. An example of such use of this method is here. I wanted to pass arguments into an arbitrary target Python function within my scope, though, so I wanted to stick with multiprocessing.
However, it seems that something doesn't like the PIDs that my multiprocessing.Process objects have. I get IOError: [Errno 9] Bad file descriptor. Are these processes restricted somehow? I know that the PID isn't available until I actually start the process, but I do start the process. Here's the source code of an example I've made that demonstrates this issue:
#!/usr/bin/env python
"""Creates a small Tornado program to demonstrate asynchronous programming.
Specifically, this demonstrates using the multiprocessing module."""
import tornado.httpserver
import tornado.ioloop
import tornado.web
import multiprocessing as mp
import random
import time
__author__ = 'Brian McFadden'
__email__ = 'brimcfadden@gmail.com'
def sleepy(queue):
    """Pushes a string to the queue after sleeping for 5 seconds.

    This sleeping can be thought of as a blocking operation."""
    time.sleep(5)
    queue.put("Now I'm awake.")
    return

def random_num():
    """Returns a string containing a random number.

    This function can be used by handlers to receive text for writing which
    facilitates noticing change on the webpage when it is refreshed."""
    n = random.random()
    return "<br />Here is a random number to show change: {0}".format(n)

class SyncHandler(tornado.web.RequestHandler):
    """Demonstrates handling a request synchronously.

    It executes sleepy() before writing some more text and a random number to
    the webpage. While the process is sleeping, the Tornado server cannot
    handle any requests at all."""

    def get(self):
        q = mp.Queue()
        sleepy(q)
        val = q.get()
        self.write(val)
        self.write('<br />Brought to you by SyncHandler.')
        self.write('<br />Try refreshing me and then the main page.')
        self.write(random_num())

class AsyncHandler(tornado.web.RequestHandler):
    """Demonstrates handling a request asynchronously.

    It executes sleepy() before writing some more text and a random number to
    the webpage. It passes the sleeping function off to another process using
    the multiprocessing module in order to handle more requests concurrently to
    the sleeping, which is like a blocking operation."""

    @tornado.web.asynchronous
    def get(self):
        """Handles the original GET request (normal function delegation).

        Instead of directly invoking sleepy(), it passes a reference to the
        function to the multiprocessing pool."""
        # Create an interprocess data structure, a queue.
        q = mp.Queue()
        # Create a process for the sleepy function. Provide the queue.
        p = mp.Process(target=sleepy, args=(q,))
        # Start it, but don't use p.join(); that would block us.
        p.start()
        # Add our callback function to the IOLoop. The async_callback wrapper
        # makes sure that Tornado sends an HTTP 500 error to the client if an
        # uncaught exception occurs in the callback.
        iol = tornado.ioloop.IOLoop.instance()
        print "p.pid:", p.pid
        iol.add_handler(p.pid, self.async_callback(self._finish, q), iol.READ)

    def _finish(self, q):
        """This is the callback for post-sleepy() request handling.

        Operation of this function occurs in the original process."""
        val = q.get()
        self.write(val)
        self.write('<br />Brought to you by AsyncHandler.')
        self.write('<br />Try refreshing me and then the main page.')
        self.write(random_num())
        # Asynchronous handling must be manually finished.
        self.finish()

class MainHandler(tornado.web.RequestHandler):
    """Returns a string and a random number.

    Try to access this page in one window immediately after (<5 seconds of)
    accessing /async or /sync in another window to see the difference between
    them. Asynchronously performing the sleepy() function won't make the client
    wait for data from this handler, but synchronously doing so will!"""

    def get(self):
        self.write('This is just responding to a simple request.')
        self.write('<br />Try refreshing me after one of the other pages.')
        self.write(random_num())

if __name__ == '__main__':
    # Create an application using the above handlers.
    application = tornado.web.Application([
        (r"/", MainHandler),
        (r"/sync", SyncHandler),
        (r"/async", AsyncHandler),
    ])
    # Create a single-process Tornado server from the application.
    http_server = tornado.httpserver.HTTPServer(application)
    http_server.listen(8888)
    print 'The HTTP server is listening on port 8888.'
    tornado.ioloop.IOLoop.instance().start()
Here is the traceback:
Traceback (most recent call last):
File "/usr/local/lib/python2.6/dist-packages/tornado/web.py", line 810, in _stack_context
yield
File "/usr/local/lib/python2.6/dist-packages/tornado/stack_context.py", line 77, in StackContext
yield
File "/usr/local/lib/python2.6/dist-packages/tornado/web.py", line 827, in _execute
getattr(self, self.request.method.lower())(*args, **kwargs)
File "/usr/local/lib/python2.6/dist-packages/tornado/web.py", line 909, in wrapper
return method(self, *args, **kwargs)
File "./process_async.py", line 73, in get
iol.add_handler(p.pid, self.async_callback(self._finish, q), iol.READ)
File "/usr/local/lib/python2.6/dist-packages/tornado/ioloop.py", line 151, in add_handler
self._impl.register(fd, events | self.ERROR)
IOError: [Errno 9] Bad file descriptor
The above code is actually modified from an older example that used process pools. I've had it saved for reference for my coworkers and myself (hence the heavy amount of comments) for quite a while. I constructed it in such a way so that I could open two small browser windows side-by-side to demonstrate to my boss that the /sync URI blocks connections while /async allows more connections. For the purposes of this question, all you need to do to reproduce it is try to access the /async handler. It errors immediately.
What should I do about this? How can the PID be "bad"? If you run the program, you can see it be printed to stdout.
For the record, I'm using Python 2.6.5 on Ubuntu 10.04. Tornado is 1.1.
add_handler takes a valid file descriptor, not a PID. As an example of what's expected, Tornado itself normally uses add_handler by passing in a socket object's fileno(), which returns the object's file descriptor. The PID is irrelevant here.
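To illustrate what a file-descriptor-based version might look like, here is a rough standalone sketch (my own, not from the answer; it assumes a Unix-style select/epoll IOLoop and the handle_done name is made up): the worker writes to one end of a multiprocessing.Pipe when it finishes, and the readable parent end's fileno() is what gets registered with add_handler.

import time
import multiprocessing as mp
import tornado.ioloop

def worker(conn):
    time.sleep(5)                 # stands in for the blocking work
    conn.send("Now I'm awake.")   # makes the parent's end of the pipe readable
    conn.close()

parent_conn, child_conn = mp.Pipe()
p = mp.Process(target=worker, args=(child_conn,))
p.start()

iol = tornado.ioloop.IOLoop.instance()

def handle_done(fd, events):
    print(parent_conn.recv())     # fetch whatever the worker sent
    iol.remove_handler(fd)        # we only expect one message
    p.join()                      # the worker has exited, reap it
    iol.stop()                    # end the demo

# add_handler wants a real file descriptor; a pipe end provides one via fileno()
iol.add_handler(parent_conn.fileno(), handle_done, iol.READ)
iol.start()

Inside a request handler you would wrap handle_done with self.async_callback and call self.finish() at the end, just as the original _finish callback does.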
Check out this project:
https://github.com/vukasin/tornado-subprocess
It allows you to start arbitrary processes from Tornado and get a callback when they finish (with access to their status, stdout and stderr).