how to increase timeout in tornado - python

I'm trying out the Tornado framework, but I found that Tornado frequently fails after 30 seconds in a stress test (using multi-mechanize). I use 10 threads in multi-mechanize and run for 100 seconds, at around 500 requests/second; the failure ratio is around 15% after the first 30 seconds. The whole test lasts about 100 seconds. From the statistics, I suspect the failures may be due to a timeout after 0.2 seconds. I searched the web for several ways to increase the timeout, but nothing worked.
Below is my Tornado code:
import tornado.ioloop
import tornado.web

class MainHandler(tornado.web.RequestHandler):
    #@tornado.web.asynchronous
    def get(self):
        self.write("Hello, world")
        self.finish()

application = tornado.web.Application([
    (r"/", MainHandler),
])

if __name__ == "__main__":
    application.listen(8000)
    tornado.ioloop.IOLoop.instance().start()
Here is my multi-mechanize test script:
import requests
import random
import time

class Transaction(object):
    def __init__(self):
        pass

    def run(self):
        r = requests.get('http://127.0.0.1:8000')
        output = r.raw.read()
        assert(r.status_code == 200)
        return output

if __name__ == '__main__':
    trans = Transaction()
    trans.run()
    print trans.custom_timers
The following is the error message I got from multimech-run:
Traceback (most recent call last):
  File "././test_scripts/v_user.py", line 12, in run
    r = requests.get('http://127.0.0.1:8000')
  File "/Library/Python/2.7/site-packages/requests/api.py", line 54, in get
    return request('get', url, **kwargs)
  File "/Library/Python/2.7/site-packages/requests/safe_mode.py", line 37, in wrapped
    return function(method, url, **kwargs)
  File "/Library/Python/2.7/site-packages/requests/api.py", line 42, in request
    return s.request(method=method, url=url, **kwargs)
  File "/Library/Python/2.7/site-packages/requests/sessions.py", line 230, in request
    r.send(prefetch=prefetch)
  File "/Library/Python/2.7/site-packages/requests/models.py", line 603, in send
    timeout=self.timeout,
  File "/Library/Python/2.7/site-packages/requests/packages/urllib3/connectionpool.py", line 415, in urlopen
    body=body, headers=headers)
  File "/Library/Python/2.7/site-packages/requests/packages/urllib3/connectionpool.py", line 267, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 941, in request
    self._send_request(method, url, body, headers)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 975, in _send_request
    self.endheaders(body)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 937, in endheaders
    self._send_output(message_body)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 797, in _send_output
    self.send(msg)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 759, in send
    self.connect()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 740, in connect
    self.timeout, self.source_address)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 571, in create_connection
    raise err
error: [Errno 49] Can't assign requested address

To answer your question of how to change the timeout for Tornado: you would need to modify the bind_sockets function in tornado.netutil to set the timeout: http://www.tornadoweb.org/documentation/_modules/tornado/netutil.html
sock.setblocking(0)
sock.bind(sockaddr)
sock.listen(backlog)
sockets.append(sock)
Change the first line to sock.setblocking(1). According to the documentation (http://docs.python.org/library/socket.html#socket.socket.settimeout):
Setting a timeout of None disables timeouts on socket operations
s.settimeout(None) is equivalent to s.setblocking(1).
However, as suggested in the comment, I think you should look at distributing the load.
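If you'd rather not edit Tornado's source in place, a minimal sketch of the same idea is below: bind the listener sockets yourself with tornado.netutil.bind_sockets, adjust their blocking/timeout mode, and hand them to the HTTPServer. This assumes the bind_sockets/HTTPServer.add_sockets API from the Tornado version linked above; note that Tornado's IOLoop normally expects non-blocking sockets, so treat this as an experiment, not a definitive fix.

import tornado.httpserver
import tornado.ioloop
import tornado.netutil
import tornado.web

class MainHandler(tornado.web.RequestHandler):
    def get(self):
        self.write("Hello, world")

application = tornado.web.Application([(r"/", MainHandler)])

if __name__ == "__main__":
    # Bind the sockets ourselves instead of calling application.listen(8000),
    # so we can tweak them before the server starts accepting connections.
    sockets = tornado.netutil.bind_sockets(8000)
    for sock in sockets:
        sock.settimeout(None)  # equivalent to sock.setblocking(1)
    server = tornado.httpserver.HTTPServer(application)
    server.add_sockets(sockets)
    tornado.ioloop.IOLoop.instance().start()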

Related

My python script stops working without any intervention

I am new to Python. My script runs for about 3-4 hours and then stops without any intervention, and no error message is saved. What could be causing this problem?
Here is my code:
import time
import urllib.request
import threading

def load():
    try:
        content = str(urllib.request.urlopen("[URL]").read())
        # do sth with content
        threading.Timer(0.5, load).start()
    except Exception as e:
        file = open("Error.txt", "w")
        file.write(time.strftime("%H:%M:%S\n\n"))
        file.write(e.message)
        file.close()
        threading.Timer(0.5, load).start()

def main(args):
    load()
    return 0

if __name__ == '__main__':
    import sys
    sys.exit(main(sys.argv))
And here is the nohup.out file on Ubuntu 14.04:
Exception in thread Thread-907:
Traceback (most recent call last):
  File "/usr/lib/python3.4/urllib/request.py", line 1182, in do_open
    h.request(req.get_method(), req.selector, req.data, headers)
  File "/usr/lib/python3.4/http/client.py", line 1125, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python3.4/http/client.py", line 1163, in _send_request
    self.endheaders(body)
  File "/usr/lib/python3.4/http/client.py", line 1121, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python3.4/http/client.py", line 951, in _send_output
    self.send(msg)
  File "/usr/lib/python3.4/http/client.py", line 886, in send
    self.connect()
  File "/usr/lib/python3.4/http/client.py", line 863, in connect
    self.timeout, self.source_address)
  File "/usr/lib/python3.4/socket.py", line 512, in create_connection
    raise err
  File "/usr/lib/python3.4/socket.py", line 503, in create_connection
    sock.connect(sa)
OSError: [Errno 101] Network is unreachable
"Network is unreachable" usually means you have a connectivity problem on your machine. Maybe it's an intermittent, short-lived problem, but you never try to detect it and recover from it.
Consider using something like the retry decorator, or handle the failure manually.
(I also find your way of looping the fetch logic rather bizarre. Why wouldn't a simple single-threaded loop work for you?)
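For illustration, here is a minimal sketch of that single-threaded loop with manual retry, keeping your "[URL]" placeholder and half-second interval. One likely reason your errors were never saved: in Python 3, exception objects have no .message attribute, so your except block itself raises and kills the thread; str(e) works instead.

import time
import urllib.request

def load():
    content = str(urllib.request.urlopen("[URL]").read())
    # do sth with content

while True:
    try:
        load()
    except Exception as e:
        # Open in append mode ("a") so earlier errors aren't overwritten,
        # and use str(e): e.message was removed in Python 3.
        with open("Error.txt", "a") as f:
            f.write(time.strftime("%H:%M:%S ") + str(e) + "\n")
    time.sleep(0.5)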

Python Requests get doesn't return unless a timeout is specified

This request never returns (or at least not within my patience):
import requests
r = requests.get('http://en.wikipedia.org/w/api.php?rcprop=ids&format=json&action=query&rclimit=10&rctype=edit&list=recentchanges&rcnamespace=0', headers={'user-agent': 'api test'})
Hitting Ctrl+C always produces this traceback:
^CTraceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/dist-packages/requests/api.py", line 55, in get
    return request('get', url, **kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/api.py", line 44, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 383, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 486, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/adapters.py", line 330, in send
    timeout=timeout
  File "/usr/lib/python2.7/dist-packages/urllib3/connectionpool.py", line 542, in urlopen
    body=body, headers=headers)
  File "/usr/lib/python2.7/dist-packages/urllib3/connectionpool.py", line 367, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib/python2.7/httplib.py", line 973, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python2.7/httplib.py", line 1007, in _send_request
    self.endheaders(body)
  File "/usr/lib/python2.7/httplib.py", line 969, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 829, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 791, in send
    self.connect()
  File "/usr/lib/python2.7/httplib.py", line 772, in connect
    self.timeout, self.source_address)
  File "/usr/lib/python2.7/socket.py", line 562, in create_connection
    sock.connect(sa)
  File "/usr/lib/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
Adding timeout=5 to the request causes the request to succeed after the timeout has expired (i.e. the correct data is returned from the API request). But of course that adds five seconds of latency to my application for every API request.
What's going wrong here?
This was due to IPv6 not working very well on my network. httplib (and therefore Requests) seems to prefer IPv6 if it's available, but if IPv6 isn't actually working you can be left waiting while the IPv6 connection attempt times out. Setting a timeout causes it to fall back to IPv4 once the timeout expires, which then succeeds. Disabling IPv6 on my network has fixed this (as, I assume, would fixing IPv6).
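If disabling IPv6 network-wide isn't an option, one blunt client-side workaround (an illustrative sketch, not an official Requests feature) is to monkey-patch socket.getaddrinfo so the resolver only returns IPv4 addresses, which keeps httplib from attempting the broken IPv6 route at all:

import socket
import requests

# Force getaddrinfo to return only IPv4 results. This is process-wide and
# affects every library that resolves hostnames through the socket module.
_orig_getaddrinfo = socket.getaddrinfo

def _ipv4_only_getaddrinfo(host, port, family=0, *args, **kwargs):
    return _orig_getaddrinfo(host, port, socket.AF_INET, *args, **kwargs)

socket.getaddrinfo = _ipv4_only_getaddrinfo

r = requests.get('http://en.wikipedia.org/w/api.php?rcprop=ids&format=json'
                 '&action=query&rclimit=10&rctype=edit&list=recentchanges'
                 '&rcnamespace=0', headers={'user-agent': 'api test'})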

python ssl eof occurred in violation of protocol, wantwriteerror, zeroreturnerror

I'm running many celery tasks (20,000) using gevent for the pool (also monkey patching all). Each of these tasks hits third-party services like AdWords to pull data.
I keep having tasks fail because of underlying SSL errors. Below are the stack traces from a few of the exceptions (in no particular order; these are failures from separate tasks). I also get WantWriteError and ZeroReturnError occasionally, but the EOF error seems to come up the most.
These errors happen while using different client libraries like googleads (the suds library for SOAP communication) as well as requests and elasticsearch. I'm guessing some of these libraries use urllib3 while others use urllib2, etc.
There has been a lot of info on the EOF issue and on forcing TLSv1, but I can't seem to find a resolution that works.
I'm not sure if I'm running too many requests at once, if something's blocking, or what; any help would be greatly appreciated. I'm pulling my hair out over this one.
Traceback (most recent call last):
  ...
  File "/srv/reporting/src/reporting/stats/adwords/client.py", line 58, in _awql_report
    downloader = self._get_client(client_id).GetReportDownloader(version=self.REPORT_DOWNLOADER_VERSION)
  File "/usr/local/lib/python2.7/dist-packages/googleads/adwords.py", line 283, in GetReportDownloader
    return ReportDownloader(self, version, server)
  File "/usr/local/lib/python2.7/dist-packages/googleads/adwords.py", line 400, in __init__
    proxy=proxy_option, cache=self._adwords_client.cache).wsdl.schema
  File "/usr/local/lib/python2.7/dist-packages/suds/client.py", line 115, in __init__
    self.wsdl = reader.open(url)
  File "/usr/local/lib/python2.7/dist-packages/suds/reader.py", line 150, in open
    d = self.fn(url, self.options)
  File "/usr/local/lib/python2.7/dist-packages/suds/wsdl.py", line 136, in __init__
    d = reader.open(url)
  File "/usr/local/lib/python2.7/dist-packages/suds/reader.py", line 74, in open
    d = self.download(url)
  File "/usr/local/lib/python2.7/dist-packages/suds/reader.py", line 92, in download
    fp = self.options.transport.open(Request(url))
  File "/usr/local/lib/python2.7/dist-packages/suds/transport/https.py", line 62, in open
    return HttpTransport.open(self, request)
  File "/usr/local/lib/python2.7/dist-packages/suds/transport/http.py", line 67, in open
    return self.u2open(u2request)
  File "/usr/local/lib/python2.7/dist-packages/suds/transport/http.py", line 132, in u2open
    return url.open(u2request, timeout=tm)
  File "/usr/lib/python2.7/urllib2.py", line 400, in open
    response = self._open(req, data)
  File "/usr/lib/python2.7/urllib2.py", line 418, in _open
    '_open', req)
  File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 1216, in https_open
    return self.do_open(httplib.HTTPSConnection, req)
  File "/usr/lib/python2.7/urllib2.py", line 1178, in do_open
    raise URLError(err)
URLError: <urlopen error [Errno 8] _ssl.c:504: EOF occurred in violation of protocol>
Traceback (most recent call last):
  ...
  File "/srv/reporting/src/reporting/stats/analytics/client.py", line 57, in get_access_token
    response = requests.post('https://accounts.google.com/o/oauth2/token', data)
  File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 88, in post
    return request('post', url, data=data, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 44, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 456, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 559, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 382, in send
    raise SSLError(e, request=request)
SSLError: [Errno bad handshake] (-1, 'Unexpected EOF')
Traceback (most recent call last):
  ...
    self.es.index(index=self.INDICE, doc_type=self.ROOT_CLASS.__name__, body=self.export(obj), id=obj.id)
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/client/utils.py", line 68, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/client/__init__.py", line 213, in index
    _make_path(index, doc_type, id), params=params, body=body)
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/transport.py", line 284, in perform_request
    status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/connection/http_requests.py", line 44, in perform_request
    response = self.session.request(method, url, data=body, timeout=timeout or self.timeout)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 456, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 559, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 327, in send
    timeout=timeout
  File "/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/connectionpool.py", line 493, in urlopen
    body=body, headers=headers)
  File "/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/connectionpool.py", line 319, in _make_request
    httplib_response = conn.getresponse(buffering=True)
  File "/usr/lib/python2.7/httplib.py", line 1030, in getresponse
    response.begin()
  File "/usr/lib/python2.7/httplib.py", line 407, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.7/httplib.py", line 365, in _read_status
    line = self.fp.readline()
  File "/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/contrib/pyopenssl.py", line 273, in readline
    data = self._sock.recv(self._rbufsize)
  File "/usr/local/lib/python2.7/dist-packages/OpenSSL/SSL.py", line 995, in recv
    self._raise_ssl_error(self._ssl, result)
  File "/usr/local/lib/python2.7/dist-packages/OpenSSL/SSL.py", line 851, in _raise_ssl_error
    raise ZeroReturnError()
ZeroReturnError
So let's break this down by each traceback block. The first ends with:
File "/usr/lib/python2.7/urllib2.py", line 1178, in do_open
raise URLError(err)
URLError: <urlopen error [Errno 8] _ssl.c:504: EOF occurred in violation of protocol>
This is coming from urllib2. The fact that this receives an EOF makes me think that the server closed the connection while you were waiting for that "thread" to read from the socket again. You might want to use more time.sleep(0) to yield to gevent.
The second traceback comes from requests:
File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 382, in send
raise SSLError(e, request=request)
SSLError: [Errno bad handshake] (-1, 'Unexpected EOF')
The [Errno bad handshake] would lead me to think this is a problem establishing the connection, which could be caused by an unexpected EOF. Is that caused by using gevent? I'm uncertain.
The final traceback is definitely from requests as well, but it is also coming out of PyOpenSSL and isn't being caught by urllib3 or requests.
File "/usr/local/lib/python2.7/dist-packages/OpenSSL/SSL.py", line 851, in _raise_ssl_error
raise ZeroReturnError()
ZeroReturnError
I did some searching and found that, according to the pyOpenSSL docs, ZeroReturnError means that the SSL connection has been closed cleanly. This says to me that the server again closed the connection because you took too long to read anything from the socket.
In short, I think you need to explicitly yield more often just to ensure that these socket problems don't arise. That's just a guess though, so take it with a grain of salt.
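As a concrete (if speculative) version of that advice, the sketch below caps concurrency with a gevent.pool.Pool and yields explicitly between operations; the pool size and URL are illustrative assumptions, not values from your setup:

import gevent.monkey
gevent.monkey.patch_all()  # must run before requests/ssl get imported

import gevent
from gevent.pool import Pool
import requests

pool = Pool(50)  # illustrative cap; tune for your third-party rate limits

def fetch(url):
    resp = requests.get(url, timeout=30)
    gevent.sleep(0)  # explicitly hand control back to the gevent hub
    return resp.status_code

urls = ['https://example.com/report'] * 200  # placeholder task list
jobs = [pool.spawn(fetch, u) for u in urls]
gevent.joinall(jobs)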

Socket error when using gdata Youtube API in Python

I'm using gdata to map YouTube URLs to video titles, using the following code:
import gdata.youtube.service as youtube
import re
import queue
import urlparse

ytservice = youtube.YouTubeService()
ytservice.ssl = True
ytservice.developer_key = ''  # snip

class youtube(mediaplugin):
    def __init__(self, parsed_url):
        self.url = parsed_url
        self.video_id = urlparse.parse_qs(parsed_url.query)['v'][0]
        self.ytdata = ytservice.GetYouTubeVideoEntry(self.video_id)
        print self.ytdata
I get the following socket exception when calling service.GetYouTubeVideoEntry():
File "/Users/haldean/Documents/qpi/qpi/media.py", line 21, in __init__
self.ytdata = ytservice.GetYouTubeVideoEntry(self.video_id)
File "/Users/haldean/Documents/qpi/lib/python2.7/site-packages/gdata/youtube/service.py", line 210, in GetYouTubeVideoEntry
return self.Get(uri, converter=gdata.youtube.YouTubeVideoEntryFromString)
File "/Users/haldean/Documents/qpi/lib/python2.7/site-packages/gdata/service.py", line 1069, in Get
headers=extra_headers)
File "/Users/haldean/Documents/qpi/lib/python2.7/site-packages/atom/__init__.py", line 93, in optional_warn_function
return f(*args, **kwargs)
File "/Users/haldean/Documents/qpi/lib/python2.7/site-packages/atom/service.py", line 186, in request
data=data, headers=all_headers)
File "/Users/haldean/Documents/qpi/lib/python2.7/site-packages/atom/http_interface.py", line 148, in perform_request
return http_client.request(operation, url, data=data, headers=headers)
File "/Users/haldean/Documents/qpi/lib/python2.7/site-packages/atom/http.py", line 163, in request
connection.endheaders()
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 937, in endheaders
self._send_output(message_body)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 797, in _send_output
self.send(msg)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 759, in send
self.connect()
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1140, in connect
self.timeout, self.source_address)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 553, in create_connection
for res in getaddrinfo(host, port, 0, SOCK_STREAM):
gaierror: [Errno 8] nodename nor servname provided, or not known
I'm at a loss as to how to even begin debugging this. Any ideas appreciated. Thanks!
Edit:
In response to a question asked in comments, video_id is qh-mwjF-OMo and parsed_url is:
ParseResult(scheme=u'http', netloc=u'www.youtube.com', path=u'/watch', params='', query=u'v=qh-mwjF-OMo&feature=g-user-u', fragment='')
My mistake was that the video_id should be passed as a keyword parameter, like so:
self.ytdata = ytservice.GetYouTubeVideoEntry(video_id=self.video_id)
It seems that the socket exception is the only layer of gdata that will throw an exception: it blindly builds a request URL from whatever arguments it gets (passed positionally, the video ID presumably lands in the URI parameter, yielding a hostname that can't be resolved, hence the gaierror), and it only fails when the URL fetch fails.

Google Appengine URLFetch Timeouts - Any Best Practices?

I'm new to Python and App Engine. I've got a little toy app I've been playing with, and I ran into some script timeouts last night. I know you're capped at 10 seconds. What's the best practice for dealing with this?
Edit:
Sorry, I should have been clearer: the URLFetch timeout is the issue I am having. By default it is set to 5 seconds; the maximum is 10.
Traceback (most recent call last):
  File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/webapp/__init__.py", line 636, in __call__
    handler.post(*groups)
  File "/base/data/home/apps/netlicense/3.349495357411133950/main.py", line 235, in post
    graph.put_wall_post(message=body, attachment=attch, profile_id=self.request.get("fbid"))
  File "/base/data/home/apps/netlicense/3.349495357411133950/facebook.py", line 149, in put_wall_post
    return self.put_object(profile_id, "feed", message=message, **attachment)
  File "/base/data/home/apps/netlicense/3.349495357411133950/facebook.py", line 131, in put_object
    return self.request(parent_object + "/" + connection_name, post_args=data)
  File "/base/data/home/apps/netlicense/3.349495357411133950/facebook.py", line 179, in request
    file = urllib2.urlopen(urlpath, post_data)
  File "/base/python_runtime/python_dist/lib/python2.5/urllib2.py", line 124, in urlopen
    return _opener.open(url, data)
  File "/base/python_runtime/python_dist/lib/python2.5/urllib2.py", line 381, in open
    response = self._open(req, data)
  File "/base/python_runtime/python_dist/lib/python2.5/urllib2.py", line 399, in _open
    '_open', req)
  File "/base/python_runtime/python_dist/lib/python2.5/urllib2.py", line 360, in _call_chain
    result = func(*args)
  File "/base/python_runtime/python_dist/lib/python2.5/urllib2.py", line 1115, in https_open
    return self.do_open(httplib.HTTPSConnection, req)
  File "/base/python_runtime/python_dist/lib/python2.5/urllib2.py", line 1080, in do_open
    r = h.getresponse()
  File "/base/python_runtime/python_dist/lib/python2.5/httplib.py", line 197, in getresponse
    self._allow_truncated, self._follow_redirects)
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/urlfetch.py", line 260, in fetch
    return rpc.get_result()
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/apiproxy_stub_map.py", line 592, in get_result
    return self.__get_result_hook(self)
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/urlfetch.py", line 361, in _get_fetch_result
    raise DeadlineExceededError(str(err))
DeadlineExceededError: ApplicationError: 5
You have not told us what your application does, so here are some generic suggestions:
You can trap the timeout exception with the google.appengine.api.urlfetch.DownloadError exception class and gently alert the users to retry (a sketch follows this list).
Web request run time is 30 seconds max; if what you are trying to download is relatively small, you could probably trap the exception and resubmit the urlfetch (just once) inside the same web request.
If working offline is not a problem for your app, you can move the urlfetch call to a worker task served by a task queue; one of the advantages of using the taskqueue API is that App Engine automatically retries the task until it succeeds.
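Here is a minimal sketch of the first two suggestions combined: raise the per-call deadline to the 10-second maximum and retry once inside the same web request. The helper name and single-retry policy are illustrative choices, not anything mandated by the urlfetch API.

from google.appengine.api import urlfetch

def fetch_with_retry(url, payload=None, method=urlfetch.GET):
    # deadline=10 raises the fetch timeout from the 5-second default
    # to the 10-second maximum.
    for attempt in range(2):  # original attempt + one retry
        try:
            return urlfetch.fetch(url, payload=payload, method=method,
                                  deadline=10)
        except urlfetch.DownloadError:
            if attempt == 1:
                raise  # give up; let the caller gently ask the user to retry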
