I am using concurrent.futures to download reports from a remote server using an API. To confirm that a report has downloaded correctly, I just have the function print out its ID.
I have an issue where, on rare occasions, a report download will hang indefinitely. I do not get a TimeoutError or a connection reset error; it just hangs there for hours until I kill the whole process. This is a known issue with the API with no known workaround.
I did some research and switched to a Pebble-based approach to implement a timeout on the function. My aim is then to record the ID of the report that failed to download and start again.
Unfortunately, I ran into a bit of a brick wall, as I do not know how to actually retrieve the ID of the report I failed to download. I am using a similar layout to this answer:
from pebble import ProcessPool
from concurrent.futures import TimeoutError

def sometimes_stalling_download_function(report_id):
    ...
    return report_id

with ProcessPool() as pool:
    future = pool.map(sometimes_stalling_download_function, report_id_list, timeout=10)

    iterator = future.result()

    while True:
        try:
            result = next(iterator)
        except StopIteration:
            break
        except TimeoutError as error:
            print("function took longer than %d seconds" % error.args[1])
            # Retrieve report ID here
            failed_accounts.append(result)
What I want to do is retrieve the report ID in the event of a timeout, but it does not seem to be reachable from that exception. Is it possible to have the function output the ID anyway in the case of a timeout exception, or will I have to re-think how I am downloading the reports entirely?
The map function returns a future object whose results are yielded in the same order the items were submitted.
Therefore, to find out which report_id is causing the timeout, you can simply check its position in report_id_list.
index = 0

while True:
    try:
        result = next(iterator)
    except StopIteration:
        break
    except TimeoutError as error:
        print("function took longer than %d seconds" % error.args[1])
        # Retrieve the report ID via its position in the input list
        failed_accounts.append(report_id_list[index])
    finally:
        index += 1
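For reference, here is a minimal, self-contained sketch of this index-based approach (the download body and the report_id_list values are placeholders, not the real API calls):

from pebble import ProcessPool
from concurrent.futures import TimeoutError

def sometimes_stalling_download_function(report_id):
    # placeholder for the real download call
    return report_id

report_id_list = [101, 102, 103]   # placeholder IDs
failed_accounts = []

if __name__ == "__main__":
    with ProcessPool() as pool:
        future = pool.map(sometimes_stalling_download_function, report_id_list, timeout=10)
        iterator = future.result()
        # results come back in submission order, so pair them with the input IDs
        for report_id in report_id_list:
            try:
                next(iterator)          # successful download, nothing to record
            except StopIteration:
                break
            except TimeoutError as error:
                print("function took longer than %d seconds" % error.args[1])
                failed_accounts.append(report_id)

    print(failed_accounts)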
I have a project I am working on, and I'm looking to use concurrent.futures' ProcessPoolExecutor to send a high number of HTTP requests. While the code below works great for making the requests, I'm struggling with ideas for how to process the information as I get it. I tried inserting it into a sqlite3 database as I got responses, but it became tricky trying to manage locks and avoid the use of global variables.
Ideally, I'd like to start the pool and, while it is executing, be able to read/store the data. Is this possible or should I take a different route with this...
pool = ProcessPoolExecutor(max_workers=60)
results = list(pool.map(http2_get, urls))

def http2_get(url):
    while True:
        try:
            start_time = millis()
            result = s.get(url, verify=False)
            print(url + " Total took " + str(millis() - start_time) + " ms")
            return result
        except Exception as e:
            print(e, e.__traceback__.tb_lineno)
As you noticed, map will not return until all the processes have finished. I assume that you want to process the data in the main process.
Instead of using map, submit all the tasks and process them as they finish:
from concurrent.futures import ProcessPoolExecutor, as_completed

pool = ProcessPoolExecutor(max_workers=60)
futures_list = [pool.submit(http2_get, url) for url in urls]

for future in as_completed(futures_list):
    exception = future.exception()
    if exception is not None:
        # Handle exception in http2_get
        pass
    else:
        result = future.result()
        # process result...
Note that it is cleaner to use the ProcessPoolExecutor as a context manager:
with ProcessPoolExecutor(max_workers=60) as pool:
    futures_list = [pool.submit(http2_get, url) for url in urls]

    for future in as_completed(futures_list):
        exception = future.exception()
        if exception is not None:
            # Handle exception in http2_get
            pass
        else:
            result = future.result()
            # process result...
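If the goal is to store responses as they arrive (as in the question), one option is to keep a future-to-URL mapping and do all the writing from the main process, which avoids the locking problem entirely. A rough sketch, reusing http2_get and urls from the question; fetch_and_store, the responses.db file and the table layout are just illustrative names for this example:

import sqlite3
from concurrent.futures import ProcessPoolExecutor, as_completed

def fetch_and_store(urls):
    conn = sqlite3.connect("responses.db")
    conn.execute("CREATE TABLE IF NOT EXISTS responses (url TEXT, status INTEGER)")
    with ProcessPoolExecutor(max_workers=60) as pool:
        # map each future back to the URL it was submitted with
        future_to_url = {pool.submit(http2_get, url): url for url in urls}
        for future in as_completed(future_to_url):
            url = future_to_url[future]
            if future.exception() is not None:
                print("request failed for %s: %s" % (url, future.exception()))
                continue
            result = future.result()
            # the main process is the only writer, so no locks are needed
            conn.execute("INSERT INTO responses VALUES (?, ?)", (url, result.status_code))
            conn.commit()
    conn.close()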
This is my main function. If I receive a new offer, I need to check the payment; I have a HandleNewOffer() function for that. But the problem with this code happens if there are 2 (or more) offers at the same time: one of the buyers has to wait until the other transaction closes. So is it possible to spawn a new process with the HandleNewOffer() function for each notification and kill it when it's done, so several transactions can run at the same time? Thank you in advance.
def handler():
    try:
        conn = k.call('GET', '/api/').json()  # connect
        response = conn.call('GET', '/api/notifications/').json()
        notifications = response['data']
        for notification in notifications:
            if notification['contact']:
                HandleNewOffer(notification)  # need to dynamically start new process if notification
    except Exception as err:
        error = 'Error'
        Send(error)
I'd recommend using the pool-of-workers pattern here to limit the number of concurrent calls to HandleNewOffer.
The concurrent.futures module offers a ready-made implementation of that pattern.
from concurrent.futures import ProcessPoolExecutor

def handler():
    with ProcessPoolExecutor() as pool:
        try:
            conn = k.call('GET', '/api/').json()  # connect
            response = conn.call('GET', '/api/notifications/').json()
            # collect notifications to process into a list
            notifications = [n for n in response['data'] if n['contact']]
            # send the list of notifications to the concurrent workers
            results = pool.map(HandleNewOffer, notifications)
            # iterate over the list of results from every HandleNewOffer call
            for result in results:
                print(result)
        except Exception as err:
            error = 'Error'
            Send(error)
This logic will handle as many offers in parallel as your computer has CPU cores.
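If the default (one worker per core) is not what you want, ProcessPoolExecutor also accepts a max_workers argument; the value below is only illustrative:

# cap the number of concurrent HandleNewOffer calls at 4 (illustrative value)
with ProcessPoolExecutor(max_workers=4) as pool:
    results = pool.map(HandleNewOffer, notifications)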
I have some Raspberry Pis running Python code. Once in a while my devices will fail to check in. The rest of the Python code continues to run perfectly, but the code here quits, and I am not sure why. If the devices can't check in they should reboot, but they don't. Other threads in the Python file continue to run correctly.
class reportStatus(Thread):
    def run(self):
        checkInCount = 0
        while 1:
            try:
                if checkInCount < 50:
                    payload = {'d': device, 'k': cKey}
                    resp = requests.post(url + 'c', json=payload)
                    if resp.status_code == 200:
                        checkInCount = 0
                        time.sleep(1800)  # 30 min
                    else:
                        checkInCount += 1
                        time.sleep(300)  # 5 min
                else:
                    os.system("sudo reboot")
            except:
                try:
                    checkInCount += 1
                    time.sleep(300)
                except:
                    pass
The devices can run for days or weeks and will check in perfectly every 30 minutes, then out of the blue they will stop. My Linux computers run in read-only mode and otherwise continue to work correctly. My issue is in this thread. I think they might fail to get a response, and this line could be the issue:
resp = requests.post(url+'c', json=payload)
I am not sure how to solve this, any help or suggestions would be greatly appreciated.
Thank you
A bare except:pass is a very bad idea.
A much better approach would be to, at the very minimum, log any exceptions:
import datetime
import time
import traceback

while True:
    try:
        time.sleep(60)
    except:
        with open("exceptions.log", "a") as log:
            log.write("%s: Exception occurred:\n" % datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S'))
            traceback.print_exc(file=log)
Then, when you get an exception, you get a log:
2016-12-20 13:28:55: Exception occurred:
Traceback (most recent call last):
File "./sleepy.py", line 8, in <module>
time.sleep(60)
KeyboardInterrupt
It is also possible that your code is hanging on sudo reboot or requests.post. You could add additional logging to troubleshoot which issue you have, although given you've seen it do reboots, I suspect it's requests.post, in which case you need to add a timeout (from the linked answer):
import requests
import eventlet
eventlet.monkey_patch()

# ...

resp = None
with eventlet.Timeout(10):
    resp = requests.post(url + 'c', json=payload)

if resp:
    # your code
Your code basically ignores all exceptions. This is considered a bad thing in Python.
The only reason I can think of for the behavior that you're seeing is that after checkInCount reaches 50, the sudo reboot raises an exception which is then ignored by your program, keeping this thread stuck in the infinite loop.
If you want to see what really happens, add print or logging.info statements to all the different branches of your code.
Alternatively, remove the blanket try-except clause or replace it with something specific, e.g. except requests.exceptions.RequestException.
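For example, the request in the check-in loop could be wrapped like this. This is only a fragment meant to slot into the thread above; url, payload and checkInCount are the names from the question, and checkin.log is a hypothetical log file:

import logging

logging.basicConfig(filename="checkin.log", level=logging.INFO)

try:
    resp = requests.post(url + 'c', json=payload)
except requests.exceptions.RequestException:
    # only network-level errors are caught, and the traceback is kept in the log
    logging.exception("check-in request failed")
    checkInCount += 1
    time.sleep(300)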
Because of the answers given, I was able to come up with a solution. I realized requests has a built-in timeout, but it will never trigger unless a timeout is specified as a parameter.
Here is my solution:
resp = requests.post(url+'c', json=payload, timeout=45)
You can tell Requests to stop waiting for a response after a given number of seconds with the timeout parameter. Nearly all production code should use this parameter in nearly all requests. Failure to do so can cause your program to hang indefinitely.
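If you also want to react to the timeout instead of letting the exception bubble up to the blanket except, requests raises requests.exceptions.Timeout; a small sketch, again reusing url, payload and checkInCount from the question's thread:

try:
    resp = requests.post(url + 'c', json=payload, timeout=45)
except requests.exceptions.Timeout:
    # treat a slow or unreachable server like any other failed check-in
    checkInCount += 1
    time.sleep(300)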
The answers provided by TemporalWolf and others helped me a lot. Thank you to all who helped.
I made a script to download wallpapers as a learning exercise to better familiarize myself with Python and threading. Everything works well unless there is an exception when trying to request a URL. This is the function where I hit the exception (it is not a method of the same class, if that matters).
import logging
import urllib2

def open_url(url):
    """Opens URL and returns html"""
    try:
        response = urllib2.urlopen(url)
        link = response.geturl()
        html = response.read()
        response.close()
        return html
    except urllib2.URLError as e:
        if hasattr(e, 'reason'):
            logging.debug('failed to reach a server.')
            logging.debug('Reason: %s', e.reason)
            logging.debug(url)
            return None
        elif hasattr(e, 'code'):
            logging.debug('The server couldn\'t fulfill the request.')
            logging.debug('Code: %s', e.code)
            logging.debug(url)
            return None
        else:
            logging.debug('Shit fucked up2')
            return None
At the end of my script:
main_thread = threading.currentThread()
for thread in threading.enumerate():
if thread is main_thread: continue
while thread.isAlive():
thread.join(2)
break
From my current understanding (which may be wrong), if a thread has not completed its task within 2 seconds of reaching this, it should time out. Instead, it gets stuck in the last while loop. If I take that out, it just hangs once the script is done executing.
Also, I decided it was time to man up and leave Notepad++ for a real IDE with debugging tools, so I downloaded Wing. I'm a big fan of Wing, but the script doesn't hang there... What do you all use to write Python?
There is no thread interruption in Python and no way to cancel a thread; it can only finish execution by itself. The join method only waits 2 seconds or until termination; it does not kill anything. You need to implement a timeout mechanism in the thread itself.
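In this particular script, the simplest way to do that is to put the deadline on the blocking call itself: urllib2.urlopen accepts a timeout argument (Python 2.6+). A trimmed-down variant of open_url, as a sketch (the 10-second default is illustrative):

import logging
import socket
import urllib2

def open_url(url, timeout=10):
    """Opens URL and returns html, giving up after `timeout` seconds."""
    try:
        response = urllib2.urlopen(url, timeout=timeout)
        html = response.read()
        response.close()
        return html
    except (urllib2.URLError, socket.timeout) as e:
        logging.debug('failed to fetch %s: %s', url, e)
        return None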
I hit the books and figured out enough to correct the issue I was having. I was able to remove the code near the end of my script completely. I corrected the issue by spawning the thread pool differently.
for i in range(queue.qsize()):
    td = ThreadDownload(queue)
    td.start()

queue.join()
I also was not wrapping queue.get() in a try block during the thread's execution.
try:
    img_url = self.queue.get()
    ...
except Queue.Empty:
    ...
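For reference, a minimal version of that worker pattern might look like this; ThreadDownload here is a stand-in for the original class, and open_url is the function from the question:

import Queue       # 'queue' on Python 3
import threading

class ThreadDownload(threading.Thread):
    def __init__(self, queue):
        threading.Thread.__init__(self)
        self.queue = queue
        self.daemon = True            # do not block interpreter exit

    def run(self):
        while True:
            try:
                img_url = self.queue.get(block=False)
            except Queue.Empty:
                break                 # nothing left to download
            try:
                html = open_url(img_url)   # open_url from the question
            finally:
                self.queue.task_done()     # always mark the item as handled

Because every item is put on the queue before the threads start, queue.join() in the main script returns as soon as each item has been marked done, and the daemon flag means any leftover threads cannot keep the interpreter alive.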
We are using celery to get flight data from different travel agencies; every request takes ~20-30 seconds (most agencies require a request sequence: authorize, send request, poll for results). A normal celery task looks like this:
from eventlet.green import urllib2, time

def get_results(attr, **kwargs):
    search, provider, minprice = attr
    data = XXX  # prepared data
    host = urljoin(MAIN_URL, "RPCService/Flights_SearchStart")
    req = urllib2.Request(host, data, {'Content-Type': 'text/xml'})
    try:
        response_stream = urllib2.urlopen(req)
    except urllib2.URLError as e:
        return [search, None]
    response = response_stream.read()

    rsp_host = urljoin(MAIN_URL, "RPCService/FlightSearchResults_Get")
    rsp_req = urllib2.Request(rsp_host, response, {'Content-Type': 'text/xml'})

    ready = False
    sleeptime = 1
    rsp_response = ''
    while not ready:
        time.sleep(sleeptime)
        try:
            rsp_response_stream = urllib2.urlopen(rsp_req)
        except urllib2.URLError as e:
            log.error('go2see: results fetch failed for %s IOError %s' % (search.id, str(e)))
        else:
            rsp_response = rsp_response_stream.read()
            try:
                rsp = parseString(rsp_response)
            except ExpatError as e:
                return [search, None]
            else:
                ready = rsp.getElementsByTagName('SearchResultEx')[0].getElementsByTagName('IsReady')[0].firstChild.data
                ready = (ready == 'true')
        sleeptime += 1
        if sleeptime > 10:
            return [search, None]

    hash = "%032x" % random.getrandbits(128)
    open(RESULT_TMP_FOLDER + hash, 'w+').write(rsp_response)
    # call to parser
    parse_agent_results.apply_async(queue='parsers', args=[__name__, search, provider, hash])
These tasks are run in an eventlet pool with concurrency 300, prefetch_multiplier = 1, broker_limit = 300.
When ~100-200 tasks are fetched from the queue, CPU usage rises to 100% (a whole CPU core is used) and task fetching from the queue is performed with delays.
Could you please point out possible issues: blocking operations (the eventlet ALARM DETECTOR gives no exceptions), wrong architecture, or whatever.
A problem occurs if you fire 200 requests at a server: responses could be delayed, and therefore urllib2.urlopen will hang.
Another thing I noticed: if a URLError is raised, the program stays in the while loop until sleeptime is greater than 10, so a single URLError can make this task sleep for up to 55 seconds (1+2+3+...+10).
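If that is not the behaviour you want, the corresponding try/except inside the polling loop could both bound the call and give up on the first error; a sketch (the 30-second timeout is illustrative):

        try:
            rsp_response_stream = urllib2.urlopen(rsp_req, timeout=30)
        except urllib2.URLError as e:
            log.error('go2see: results fetch failed for %s IOError %s' % (search.id, str(e)))
            return [search, None]   # bail out instead of sleeping through more retries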
Sorry for the late response.
The first thing I would try in such a situation is to turn off Eventlet completely, in both Celery and your code, and use a process or OS-thread model. 300 threads or even processes is not that much load for the OS scheduler (though you may lack the memory to run that many processes). So I would try it and see if the CPU load drops dramatically. If it does not, then the problem is in your code and Eventlet can't magically fix it. If it does drop, however, we would need to investigate the issue more closely.
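For what it's worth, that switch is just worker configuration; with the Celery 3.x setting names of that era (an assumption, adjust to your version) it would be something like:

# celeryconfig.py -- pre-4.0 setting names, shown as an assumption
CELERYD_POOL = "prefork"        # OS processes instead of the eventlet pool
CELERYD_CONCURRENCY = 300       # same concurrency as before; watch the memory cost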
If the bug still persists, please report it via any of these channels:
https://bitbucket.org/which_linden/eventlet/issues/new
https://github.com/eventlet/eventlet/issues/new
email to eventletdev#lists.secondlife.com