I have a Python API script that sometimes gets terminated on this line despite being wrapped in try/except. Here is the code:
try:
    r = requests.post(URL, data=params, headers=headers, timeout=self.request_timeout)
    try:
        response = r.json()
    except Exception, e:
        message = "ERROR_0104! Unexpected error occurred. The error is: "
        message += str(e)
        print message
        aux_func.write_log(message)
        return 'Switch'
except requests.exceptions.RequestException:
    print "Exception occurred on 'API requests post' procedure."
    counter += 1
    continue
...
The error occurs on the second line of the code shown above. This is the traceback:
r = requests.post(URL, data=params, headers=headers, timeout=self.request_timeout)
File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 88, in post
return request('post', url, data=data, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 44, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 383, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 486, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 394, in send
r.content
File "/usr/local/lib/python2.7/dist-packages/requests/models.py", line 679, in content
self._content = bytes().join(self.iter_content(CONTENT_CHUNK_SIZE)) or bytes()
File "/usr/local/lib/python2.7/dist-packages/requests/models.py", line 616, in generate
decode_content=True):
File "/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/response.py", line 236, in stream
data = self.read(amt=amt, decode_content=decode_content)
File "/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/response.py", line 183, in read
data = self._fp.read(amt)
File "/usr/lib/python2.7/httplib.py", line 543, in read
return self._read_chunked(amt)
File "/usr/lib/python2.7/httplib.py", line 585, in _read_chunked
line = self.fp.readline(_MAXLINE + 1)
File "/usr/lib/python2.7/socket.py", line 476, in readline
data = self._sock.recv(self._rbufsize)
File "/usr/lib/python2.7/ssl.py", line 305, in recv
return self.read(buflen)
File "/usr/lib/python2.7/ssl.py", line 224, in read
return self._sslobj.read(len)
ssl.SSLError: The read operation timed out
I presume something within the Requests module is causing this, but I don't know what.
The read operation has timed out, as it says.
It times out, however, with an ssl.SSLError. This is not what your except is catching. If you want to catch and retry, you need to catch the right error.
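One way to act on this, sketched in Python 3 with only the standard library: list every exception class you want to survive in the except clause. The helper name attempt and the tuple RETRYABLE are made up for illustration; in the script from the question, requests.exceptions.RequestException would be added to the tuple next to ssl.SSLError.

```python
import socket
import ssl

# Exception classes treated as "try again later" (an assumption for this sketch).
# In the question's script, requests.exceptions.RequestException would be listed too.
RETRYABLE = (ssl.SSLError, socket.timeout)


def attempt(send):
    """Call send() once; return (True, result) or (False, error)."""
    try:
        return True, send()
    except RETRYABLE as e:
        # Only the listed classes (and their subclasses) are caught here;
        # anything else still propagates to the caller.
        return False, e


def timed_out():
    # Stand-in for a request whose read times out over SSL.
    raise ssl.SSLError("The read operation timed out")


ok, error = attempt(timed_out)
print(ok, type(error).__name__)
```

The caller can then decide what a failed attempt means, for example incrementing a counter and moving on as the original loop does.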
The syntax except Exception, e does not work in Python 3.
You have to write except Exception as e instead, which also works in Python 2.6 and later.
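A short sketch of the newer syntax, which is valid in Python 2.6+ and Python 3 alike. The function name describe_failure is hypothetical; the message text is taken from the question.

```python
def describe_failure(func):
    """Run func() and return a log message if it raises, else None."""
    try:
        func()
    except Exception as e:  # "as" replaces the old "except Exception, e" form
        return "ERROR_0104! Unexpected error occurred. The error is: " + str(e)
    return None


print(describe_failure(lambda: int("not a number")))
```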
There was some confusion here about what the solution is, because the question lacked detail. I first posted the answer as a comment on the top post, but comments format poorly, so here is a properly formatted version.
The problem, as Veedrac mentioned, is that the code in my question did not catch every possible exception. It only catches requests.exceptions.RequestException, and any other exception causes the script to exit abruptly.
Instead, I'm going to rewrite the code like this:
try:
    r = requests.post(URL, data=params, headers=headers, timeout=self.request_timeout)
    try:
        response = r.json()
    except Exception, e:
        message = "ERROR_0104! Unexpected error occurred. The error is: "
        message += str(e)
        print message
        aux_func.write_log(message)
        return 'Switch'
except requests.exceptions.RequestException:
    print "Exception occurred on 'API requests post' procedure."
    counter += 1
    continue
except Exception, e:
    print "Exception {0} occurred".format(e)
    continue
All I did was add a generic except clause at the end, which catches every exception not already accounted for.
I hope this helps.
Thanks.
I'm trying to implement 2captcha using selenium with Python.
I just copied the example from their documentation:
https://github.com/2captcha/2captcha-api-examples/blob/master/ReCaptcha%20v2%20API%20Examples/Python%20Example/2captcha_python_api_example.py
This is my code:
from selenium import webdriver
from time import sleep
from selenium.webdriver.support.select import Select
import requests
driver = webdriver.Chrome('chromedriver.exe')
driver.get('the_url')
current_url = driver.current_url
captcha = driver.find_element_by_id("captcha-box")
captcha2 = captcha.find_element_by_xpath("//div/div/iframe").get_attribute("src")
captcha3 = captcha2.split('=')
#print(captcha3[2])
# Add these values
API_KEY = 'my_api_key' # Your 2captcha API KEY
site_key = captcha3[2] # site-key, read the 2captcha docs on how to get this
url = current_url # example url
proxy = 'Myproxy' # example proxy
proxy = {'http': 'http://' + proxy, 'https': 'https://' + proxy}
s = requests.Session()
# here we post site key to 2captcha to get captcha ID (and we parse it here too)
captcha_id = s.post("http://2captcha.com/in.php?key={}&method=userrecaptcha&googlekey={}&pageurl={}".format(API_KEY, site_key, url), proxies=proxy).text.split('|')[1]
# then we parse gresponse from 2captcha response
recaptcha_answer = s.get("http://2captcha.com/res.php?key={}&action=get&id={}".format(API_KEY, captcha_id), proxies=proxy).text
print("solving ref captcha...")
while 'CAPCHA_NOT_READY' in recaptcha_answer:
    sleep(5)
    recaptcha_answer = s.get("http://2captcha.com/res.php?key={}&action=get&id={}".format(API_KEY, captcha_id), proxies=proxy).text
recaptcha_answer = recaptcha_answer.split('|')[1]
# we make the payload for the post data here, use something like mitmproxy or fiddler to see what is needed
payload = {
    'key': 'value',
    'gresponse': recaptcha_answer  # This is the response from 2captcha, which is needed for the post request to go through.
}
# then send the post request to the url
response = s.post(url, payload, proxies=proxy)
# And that's all there is to it other than scraping data from the website, which is dynamic for every website.
This is my error:
solving ref captcha...
Traceback (most recent call last):
File "main.py", line 38, in <module>
recaptcha_answer = recaptcha_answer.split('|')[1]
IndexError: list index out of range
The captcha is getting solved, because I can see it on the 2captcha dashboard. So what is causing the error, given that this is the official example?
EDIT:
For some reason, without any modification, I'm now getting the captcha solved from 2captcha, but then I get this error:
solving ref captcha...
OK|this_is_the_2captch_answer
Traceback (most recent call last):
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\connectionpool.py", line 594, in urlopen
self._prepare_proxy(conn)
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\connectionpool.py", line 805, in _prepare_proxy
conn.connect()
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\connection.py", line 308, in connect
self._tunnel()
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\http\client.py", line 906, in _tunnel
(version, code, message) = response._read_status()
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\http\client.py", line 278, in _read_status
raise BadStatusLine(line)
http.client.BadStatusLine: <html>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\adapters.py", line 449, in send
timeout=timeout
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\connectionpool.py", line 638, in urlopen
_stacktrace=sys.exc_info()[2])
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\util\retry.py", line 368, in increment
raise six.reraise(type(error), error, _stacktrace)
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\packages\six.py", line 685, in reraise
raise value.with_traceback(tb)
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\connectionpool.py", line 594, in urlopen
self._prepare_proxy(conn)
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\connectionpool.py", line 805, in _prepare_proxy
conn.connect()
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\connection.py", line 308, in connect
self._tunnel()
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\http\client.py", line 906, in _tunnel
(version, code, message) = response._read_status()
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\http\client.py", line 278, in _read_status
raise BadStatusLine(line)
urllib3.exceptions.ProtocolError: ('Connection aborted.', BadStatusLine('<html>\r\n'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "main.py", line 49, in <module>
response = s.post(url, payload, proxies=proxy)
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\sessions.py", line 581, in post
return self.request('POST', url, data=data, json=json, **kwargs)
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\sessions.py", line 533, in request
resp = self.send(prep, **send_kwargs)
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\sessions.py", line 646, in send
r = adapter.send(request, **kwargs)
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\adapters.py", line 498, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', BadStatusLine('<html>\r\n'))
Why am I getting this error?
I'm setting site_key = current_url_where_captcha_is_located.
Is this correct?
Use your debugger, or put a print(recaptcha_answer) before the failing line, to see the value of recaptcha_answer before you call .split('|') on it. There is no | in the string, so the resulting list has only one element and asking for the second one with [1] fails.
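As a rough sketch of that check, the helper below inspects the reply before indexing into it. The name parse_reply is made up here; the OK|... shape and the CAPCHA_NOT_READY marker are the ones that appear in the question, and any other reply is simply reported back as an error rather than interpreted.

```python
def parse_reply(text):
    """Return the part after 'OK|', or raise ValueError with the raw reply."""
    status, sep, value = text.partition('|')
    if status != 'OK' or not sep:
        # No '|' (or a non-OK status): split('|')[1] would raise IndexError
        # here, so surface the raw reply instead.
        raise ValueError("unexpected reply: {!r}".format(text))
    return value


print(parse_reply("OK|this_is_the_2captch_answer"))
try:
    parse_reply("CAPCHA_NOT_READY")
except ValueError as e:
    print(e)
```

Seeing the raw reply in the error message usually makes it obvious whether the service returned an error code or the request never reached it.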
It looks like you are not providing valid proxy connection parameters, yet you pass this proxy to requests when connecting to the API.
Just comment out these two lines:
#proxy = 'Myproxy' # example proxy
#proxy = {'http': 'http://' + proxy, 'https': 'https://' + proxy}
And then remove proxies=proxy from four lines:
captcha_id = s.post("http://2captcha.com/in.php?key={}&method=userrecaptcha&googlekey={}&pageurl={}".format(API_KEY, site_key, url)).text.split('|')[1]
recaptcha_answer = s.get("http://2captcha.com/res.php?key={}&action=get&id={}".format(API_KEY, captcha_id)).text
recaptcha_answer = s.get("http://2captcha.com/res.php?key={}&action=get&id={}".format(API_KEY, captcha_id)).text
response = s.post(url, payload)
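If a proxy is only needed some of the time, one option is to build the proxies argument conditionally. This is a sketch: build_proxies is a hypothetical helper, the URL schemes simply mirror the dictionary from the question, and it relies on the fact that passing proxies=None to requests is the same as leaving the argument out.

```python
def build_proxies(proxy_address):
    """Return a proxies dict for requests, or None when no proxy is configured."""
    if not proxy_address:
        return None
    return {
        'http': 'http://' + proxy_address,
        'https': 'https://' + proxy_address,
    }


print(build_proxies(''))
print(build_proxies('127.0.0.1:8080'))  # placeholder address
```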
I have a small script that repeatedly (hourly) fetches tweets from the API, using sixohsix's
Twitter Wrapper for Python. I am successful with handling most, if not all of the errors coming from the Twitter API, i.e. all the 5xx and 4xx stuff.
Nonetheless I randomly observe the below error traceback (only once in 2-3 days). I mean the program exits and the traceback is displayed in the shell. I have no clue what this could mean, but think it is not directly related to what my script does since it has proved itself to correctly run most of the time.
This is where I call a function of the wrapper in my script:
KW = {
    'count': 200,  # number of tweets to fetch (fetch maximum)
    'user_id': tweeter['user_id'],
    'include_rts': 'false',  # do not include native RT's
    'trim_user': 'true',
}
timeline = tw.twitter_request(tw_endpoint,\
    tw_endpoint.statuses.user_timeline, KW)
The function tw.twitter_request(tw_endpoint, tw_endpoint.statuses.user_timeline, KW) basically does return tw_endpoint.statuses.user_timeline(**args), where args corresponds to KW, and tw_endpoint is an OAuth-authorized endpoint obtained from sixohsix's library with:
return twitter.Twitter(domain='api.twitter.com', api_version='1.1',
                       auth=twitter.oauth.OAuth(access_token, access_token_secret,
                                                consumer_key, consumer_secret))
This is the traceback:
Traceback (most recent call last):
File "search_twitter_entities.py", line 166, in <module>
tw_endpoint.statuses.user_timeline, KW)
File "/home/tg/mild/twitter_utils.py", line 171, in twitter_request
return twitter_function(**args)
File "build/bdist.linux-x86_64/egg/twitter/api.py", line 173, in __call__
File "build/bdist.linux-x86_64/egg/twitter/api.py", line 177, in _handle_response
File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "/usr/lib/python2.7/urllib2.py", line 400, in open
response = self._open(req, data)
File "/usr/lib/python2.7/urllib2.py", line 418, in _open
'_open', req)
File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 1215, in https_open
return self.do_open(httplib.HTTPSConnection, req)
File "/usr/lib/python2.7/urllib2.py", line 1180, in do_open
r = h.getresponse(buffering=True)
File "/usr/lib/python2.7/httplib.py", line 1030, in getresponse
response.begin()
File "/usr/lib/python2.7/httplib.py", line 407, in begin
version, status, reason = self._read_status()
File "/usr/lib/python2.7/httplib.py", line 371, in _read_status
raise BadStatusLine(line)
httplib.BadStatusLine: ''
The only thing I can gather from the traceback is that the error happens deep inside another Python library and has something to do with an invalid HTTP status coming from the Twitter API or the wrapper. Maybe some of you could give me a hint on how to debug or solve this, since it is pretty annoying to have to check my script regularly and restart it to continue fetching tweets.
EDIT: To clarify this a little, the first two functions in the traceback are already in a try-except block. For example, the try-except block in "twitter_utils.py" filters out 40x and 50x exceptions, but also catches general exceptions with a bare except:. So what I don't understand is why the error is not caught at this point, and why the program is instead terminated with a traceback. In short, I am in a situation where I cannot catch an error, much like a parse error in a PHP script. So how would I do this?
Perhaps this will point you in the right direction. This is the code that runs when BadStatusLine is raised:
class BadStatusLine(HTTPException):
    def __init__(self, line):
        if not line:
            line = repr(line)
        self.args = line,
        self.line = line
I'm not too familiar with httplib, but if I had to guess, you're getting an empty response/error line and, well, it can't be parsed. There are comments just before the line your program is stopping at:
# Presumably, the server closed the connection before
# sending a valid response.
raise BadStatusLine(line)
If Twitter is closing the connection before sending a response, you could try again, meaning wrap "search_twitter_entities.py", line 166 in a try/except a couple of times (ugly).
try:
    timeline = tw.twitter_request(tw_endpoint,\
        tw_endpoint.statuses.user_timeline, KW)
except:
    try:
        timeline = tw.twitter_request(tw_endpoint,\
            tw_endpoint.statuses.user_timeline, KW)  # try again
    except:
        pass
Or, assuming you can reset timeline to None each time, use a while loop:
timeline = None
while timeline is None:
    try:
        timeline = tw.twitter_request(tw_endpoint,\
            tw_endpoint.statuses.user_timeline, KW)
    except:
        pass
I didn't test any of that, so check it for mistakes.
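A bounded variant of the loop above, sketched in Python 3 (where httplib is named http.client). The names call_with_retries, attempts and delay are illustrative; fetch stands for whatever callable performs the request, such as a lambda wrapping tw.twitter_request. Unlike the bare except above, it retries only on BadStatusLine and gives up after a fixed number of attempts.

```python
import time
from http.client import BadStatusLine


def call_with_retries(fetch, attempts=3, delay=1.0):
    """Call fetch() until it succeeds or attempts run out.

    Retries only when the server closed the connection without sending
    a valid status line; every other exception propagates immediately.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except BadStatusLine:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the error
            time.sleep(delay)


# Stand-in for the real request: fails twice, then succeeds.
calls = {'n': 0}


def flaky():
    calls['n'] += 1
    if calls['n'] < 3:
        raise BadStatusLine('')
    return ['tweet']


print(call_with_retries(flaky, delay=0))
```

Capping the attempts keeps a persistent outage from turning into an endless busy loop, and the final raise leaves the decision about what to do next with the calling script.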
My code is as follows, but when it runs it throws an error.
search_request = urllib2.Request(url,data=tmp_file_name,headers={'X-Requested-With':'WoMenShi888XMLHttpRequestWin'})
#print search_request.get_method()
search_response = urllib2.urlopen(search_request)
html_data = search_response.read()
The error is:
Traceback (most recent call last):
File "xx_tmp.py", line 83, in <module>
print hello_lfi()
File "xx_tmp.py", line 69, in hello_lfi
search_response = urllib2.urlopen(search_request)
File "D:\Python27\lib\urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "D:\Python27\lib\urllib2.py", line 406, in open
response = meth(req, response)
File "D:\Python27\lib\urllib2.py", line 519, in http_response
'http', request, response, code, msg, hdrs)
File "D:\Python27\lib\urllib2.py", line 444, in error
return self._call_chain(*args)
File "D:\Python27\lib\urllib2.py", line 378, in _call_chain
result = func(*args)
File "D:\Python27\lib\urllib2.py", line 527, in http_error_defau
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 500: Internal Server Error
I don't know how to fix it. When an error like this happens, how can my code continue to run?
When I try:
try:
    search_response = urllib2.urlopen(search_request)
except urllib2.HTTPError:
    pass
I get a new error:
UnboundLocalError: local variable 'search_response' referenced before assignment
If I add
global search_response
I get this error instead:
NameError: global name 'search_response' is not defined
You can catch the exception, which will prevent your program from stopping so abruptly:
try:
    search_response = urllib2.urlopen(search_request)
except urllib2.HTTPError:
    print 'There was an error with the request'
If you want to continue, you can simply do this:
try:
    search_response = urllib2.urlopen(search_request)
except urllib2.HTTPError:
    pass
This will allow your program to continue, but your other statement html_data = search_response.read() won't give you the expected result. To fix this problem permanently, you need to debug your request to see why it's failing; this isn't something specific to Python.
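To avoid the UnboundLocalError mentioned in the question, the read can be kept inside the same function and skipped when the request fails. A sketch in Python 3, where urllib2 is split into urllib.request and urllib.error; read_or_none and its opener parameter are hypothetical names, and opener exists only so the example can run without a network.

```python
import io
from email.message import Message
from urllib.error import HTTPError
from urllib.request import urlopen


def read_or_none(request, opener=urlopen):
    """Return the response body, or None if the server answered with an HTTP error."""
    try:
        response = opener(request)
    except HTTPError as e:
        print('Request failed: HTTP %d %s' % (e.code, e.reason))
        return None
    # Reached only when opener() succeeded, so response is always bound here.
    return response.read()


def failing_opener(request):
    # Stand-in for a server that answers 500.
    raise HTTPError(request, 500, 'Internal Server Error', Message(), io.BytesIO())


print(read_or_none('http://example.invalid/', opener=failing_opener))
```

The caller then checks for None before using the data, instead of touching a variable that was never assigned.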
I had the same error when I was trying to send a large post request to my GAE Python server. It turns out the server threw the error because I was trying to write the received POST string into a db.StringProperty(). I changed that to db.TextProperty() and it didn't throw the error anymore.
Source: Overcome appengine 500 byte string limit in python? consider text
I am fairly inexperienced with user authentication especially through restful apis. I am trying to use python to log in with a user that is set up in parse.com. The following is the code I have:
API_LOGIN_ROOT = 'https://api.parse.com/1/login'
params = {'username':username,'password':password}
encodedParams = urllib.urlencode(params)
url = API_LOGIN_ROOT + "?" + encodedParams
request = urllib2.Request(url)
request.add_header('Content-type', 'application/x-www-form-urlencoded')
# we could use urllib2's authentication system, but it seems like overkill for this
auth_header = "Basic %s" % base64.b64encode('%s:%s' % (APPLICATION_ID, MASTER_KEY))
request.add_header('Authorization', auth_header)
request.add_header('X-Parse-Application-Id', APPLICATION_ID)
request.add_header('X-Parse-REST-API-Key', MASTER_KEY)
request.get_method = lambda: http_verb
# TODO: add error handling for server response
response = urllib2.urlopen(request)
#response_body = response.read()
#response_dict = json.loads(response_body)
This is a modification of an open source library used to access the parse rest interface.
I get the following error:
Traceback (most recent call last):
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/webapp/_webapp25.py", line 703, in __call__
handler.post(*groups)
File "/Users/nazbot/src/PantryPal_AppEngine/fridgepal.py", line 464, in post
url = user.login()
File "/Users/nazbot/src/PantryPal_AppEngine/fridgepal.py", line 313, in login
url = self._executeCall(self.username, self.password, 'GET', data)
File "/Users/nazbot/src/PantryPal_AppEngine/fridgepal.py", line 292, in _executeCall
response = urllib2.urlopen(request)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 400, in open
response = meth(req, response)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 513, in http_response
'http', request, response, code, msg, hdrs)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 438, in error
return self._call_chain(*args)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 372, in _call_chain
result = func(*args)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 521, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 404: Not Found
Can someone point me to where I am screwing up? I'm not quite sure why I'm getting a 404 instead of an access denied or some other issue.
Make sure the "User" class was created on Parse.com as a special user class. When you are adding the class, make sure to change the Class Type to "User" instead of "Custom". A little user head icon will show up next to the class name on the left hand side.
This stumped me for a long time until Matt from the Parse team showed me the problem.
Please change API_LOGIN_ROOT = 'https://api.parse.com/1/login' to API_LOGIN_ROOT = 'https://api.parse.com/1/login/' (note the trailing slash).
I had the same problem using PHP; adding the / at the end fixed the 404 error.
I have a script running that is testing a series of urls for availability.
This is one of the functions.
def checkUrl(url): # Only downloads headers, returns status code.
    p = urlparse(url)
    conn = httplib.HTTPConnection(p.netloc)
    conn.request('HEAD', p.path)
    resp = conn.getresponse()
    return resp.status
Occasionally the VPS loses connectivity, and the entire script crashes when that happens.
File "/usr/lib/python2.6/httplib.py", line 914, in request
self._send_request(method, url, body, headers)
File "/usr/lib/python2.6/httplib.py", line 951, in _send_request
self.endheaders()
File "/usr/lib/python2.6/httplib.py", line 908, in endheaders
self._send_output()
File "/usr/lib/python2.6/httplib.py", line 780, in _send_output
self.send(msg)
File "/usr/lib/python2.6/httplib.py", line 739, in send
self.connect()
File "/usr/lib/python2.6/httplib.py", line 720, in connect
self.timeout)
File "/usr/lib/python2.6/socket.py", line 561, in create_connection
raise error, msg
socket.error: [Errno 101] Network is unreachable
I'm not at all familiar with handling errors like this in python.
What is the appropriate way to keep the script from crashing when network connectivity is temporarily lost?
Edit:
I ended up with this - feedback?
def checkUrl(url): # Only downloads headers, returns status code.
    try:
        p = urlparse(url)
        conn = httplib.HTTPConnection(p.netloc)
        conn.request('HEAD', p.path)
        resp = conn.getresponse()
        return resp.status
    except IOError, e:
        if e.errno == 101:
            print "Network Error"
            time.sleep(1)
            checkUrl(url)
    else:
        raise
I'm not sure I fully understand what raise does though..
If you just want to handle this "Network is unreachable" error (errno 101) and let other exceptions propagate, you can do the following, for example.
from errno import ENETUNREACH
try:
    # tricky code goes here
except IOError as e:
    # an IOError exception occurred (socket.error is a subclass)
    if e.errno == ENETUNREACH:
        # now we had the error code 101, network unreachable
        do_some_recovery
    else:
        # other exceptions we reraise again
        raise
The problem with your solution as it stands is that you're going to run out of stack space if there are too many errors on a single URL (> 1000 by default) due to the recursion. Also, the extra stack frames could make tracebacks hard to read (500 calls to checkUrl). I'd rewrite it to be iterative, like so:
def checkUrl(url): # Only downloads headers, returns status code.
    while True:
        try:
            p = urlparse(url)
            conn = httplib.HTTPConnection(p.netloc)
            conn.request('HEAD', p.path)
            resp = conn.getresponse()
            return resp.status
        except IOError as e:
            if e.errno == 101:
                print "Network Error"
                time.sleep(1)
        except:
            raise
Also, you want the last clause in your try to be a bare except not an else. Your else only gets executed if control falls through the try suite, which can never happen, since the last statement of the try suite is return.
This is very easy to change to allow a limited number of retries. Just change the while True: line to for _ in xrange(5): or however many retries you wish to accept. The function will then return None if it can't connect to the site after 5 attempts. You can have it return something else or raise an exception by adding return or raise SomeException at the very end of the function (indented the same as the for or while line).
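Putting the two answers together, a Python 3 sketch of the limited-retry version might look as follows. It is an adaptation rather than the code from either answer: httplib becomes http.client, IOError is an alias of OSError, and the retries, delay and head parameters are additions for illustration (head lets the network call be swapped out in a test). Errors other than "network unreachable" are re-raised.

```python
import errno
import time
from http.client import HTTPConnection
from urllib.parse import urlparse


def head_status(url):
    """Send a HEAD request and return the status code."""
    p = urlparse(url)
    conn = HTTPConnection(p.netloc)
    try:
        # Fall back to '/' when the URL has no path component.
        conn.request('HEAD', p.path or '/')
        return conn.getresponse().status
    finally:
        conn.close()


def check_url(url, retries=5, delay=1.0, head=head_status):
    """Return the status code, or None if the network stays unreachable."""
    for _ in range(retries):
        try:
            return head(url)
        except OSError as e:
            if e.errno != errno.ENETUNREACH:
                raise  # not a connectivity blip: let it propagate
            print("Network Error")
            time.sleep(delay)
    return None


def unreachable(url):
    # Stand-in for a host that cannot be reached.
    raise OSError(errno.ENETUNREACH, 'Network is unreachable')


print(check_url('http://example.invalid/', retries=2, delay=0, head=unreachable))
```

Using the errno.ENETUNREACH constant instead of the literal 101 keeps the check readable and avoids hard-coding a platform-specific number.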
Put try...except around your code to catch exceptions.
http://docs.python.org/tutorial/errors.html