Strange Intermittent posting issue using python requests - python

I'm using a Raspberry Pi 3 and Python 2.7 with requests to post data to my LAMP server. Everything works well except for intermittent posting errors, which are trapped as requests.exceptions.ConnectTimeout. The timeout is 0.5 s, which is 2.5x the normal posting time (0.2 s). See the code below.
When the request exception occurs, I check for internet access using CheckConnection(). This takes only 0.016 s on the Pi, which is fast compared to other techniques. When it returns False, the code doesn't retry posting and logs the data locally.
However, I can connect remotely to the Pi using TeamViewer while this is happening! I am posting data to our server with other installations so it is not a cloud server down issue.
After several to many minutes, the issue resolves itself and posting resumes like nothing was wrong.
Any suggestions on how I can change my code, either to determine the root cause or to fix the issue, are most welcome. Thank you in advance.
******** CODE ************
import socket
import requests

def PostData(payload,retry_count=3):
    url = 'http://xxx.xxx.xxx.xxx/api/data/push/'
    try:
        response = requests.post(url,params=payload,timeout=0.5)
        if response.status_code == 200:
            return response.text
        response.raise_for_status()
    except (requests.exceptions.RequestException, requests.exceptions.ConnectTimeout) as e:
        print "Post Error..."
        x = CheckConnection()
        if x==False:
            return "Internet for Posting: " + str(x)
        if retry_count >0:
            Reason = "Post Settings Retry: " + str(retry_count)
            print Reason
            #sleeptime = 0.05*2**(3-retry_count)
            #time.sleep(sleeptime)
            return PostData(payload, retry_count-1)
        if retry_count==0:
            Reason = "Error! Post settings retry failed. Retry=0. Internet: " + str(x)
            return Reason
        return None
    except Exception as e:
        x = CheckConnection()
        Reason= "Error! Posting Exception: " + str(e) + "Internet: " + str(x)
        print Reason
        return None

def CheckConnection(host="8.8.8.8",port=53,timeout=0.5):
    try:
        socket.setdefaulttimeout(timeout)
        socket.socket(socket.AF_INET,socket.SOCK_STREAM).connect((host,port))
        return True
    except Exception:
        return False
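One thing that may help narrow down the root cause: a successful connection to 8.8.8.8:53 only shows that some route to the internet exists, not that the route to the posting server is healthy. Below is a minimal sketch of a check that can be pointed at the server itself (for example its IP and port 80). It is written for Python 3, and the name can_connect is my own, not something from the question. It also avoids socket.setdefaulttimeout(), which changes the default timeout for every socket created afterwards, and it closes the socket when done.

```python
import socket


def can_connect(host, port, timeout=0.5):
    """Return True if a TCP connection to (host, port) opens within timeout seconds.

    Hypothetical helper for illustration; pass the host and port of the
    server being posted to, not only a public DNS server.
    """
    try:
        # The timeout applies to this one socket only, and the with-block
        # closes the socket as soon as the check is finished.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks and timeouts.
        return False
```

Logging the result of this check for both the server and 8.8.8.8 each time a post fails would show whether the failures are specific to the server's route.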

Related

Can't catch an exception, system just closes

I've implemented an orchestrator in Python that makes some requests to an API. My problem is that it sometimes gets stuck in a request. To solve that, I used the timeout parameter. Now I want to implement a retry, but first I need to handle the exception. For that I tried this:
uri = MS_MONITOR + "/api/monitor/advise/lawsuit/total/withProgressDaily"
try:
    response = requests.get(uri, headers={"Correlation-Id": CORRELATION_ID}, timeout = 10)
except requests.exceptions.RequestException as e:
    logging.info("[{}] [{}] {}".format("Timeout ERROR", response.status_code, uri))
    print("Timed out")
I set the timeout to 10 just to force it to happen. The problem is that execution doesn't reach the except block below; the cmd window just closes. I've tried using this one too, but it didn't work either:
except requests.exceptions.Timeout:
If my post is missing any information, please let me know and I'll provide it. Thank you!
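One possible explanation, assuming the snippet runs exactly as posted: when requests.get raises, nothing is ever assigned to response, so the response.status_code lookup inside the handler raises a NameError of its own and the script exits before reaching print. The sketch below shows the pattern without touching the unbound name. It is written for Python 3, fetch_with_logging and always_times_out are made-up names, and the get argument stands in for requests.get so the example runs without a network.

```python
import logging


def fetch_with_logging(get, uri, timeout=10):
    """Call get(uri, timeout=...) and log a failure using only names that exist.

    Hypothetical helper: with requests, `get` would be requests.get and the
    except clause would name requests.exceptions.RequestException.
    """
    try:
        response = get(uri, timeout=timeout)
    except Exception as exc:
        # The call raised before `response` was assigned, so only the
        # exception and the URI are available here.
        logging.error("[%s] [%s] %s", "request failed", type(exc).__name__, uri)
        return None
    return response


def always_times_out(uri, timeout):
    """Stand-in for a request that never answers in time."""
    raise TimeoutError("simulated timeout")
```

Note also that logging.info is below the default WARNING level, so it prints nothing unless logging has been configured to show INFO messages.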

Repeat function until true in Python

I've googled a lot but still have no solution.
I have a parser function:
def parse_page(url):
    req = request.get(url, headers=headers(), proxies=dict(http='socks4://' + get_proxy()), timeout=5)
(the code is just an example)
Sometimes the proxy is dead or another error occurs (timeout, error 500), but I need to make this request anyway and keep trying until it succeeds.
So how can I do that?
I tried the retrying library, but with no success.
Thank you!
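How about:
import time
req = 0
while not req:
    try:
        req = request.get(url, headers=headers(), proxies=dict(http='socks4://' + get_proxy()))
    except Exception:  # unlike a bare except, this still lets Ctrl+C stop the loop
        time.sleep(5)
The loop exits as soon as req is truthy. If request here is the requests library, note that a Response is truthy only when its status code is below 400, so an error response such as a 500 also triggers another attempt, in that case without the 5-second sleep.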
while parse_page(url,urls[url]) == False:
    print('Something happened... Trying again...')
else:
    print(url + 'Is saved.. Keep going...')
I just had to switch the while condition to False, and that's it.
I'll leave this here in case somebody googles it.
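Both loops above retry forever and at a fixed pace. A bounded variant with a growing delay might look like the sketch below. It is written for Python 3, and the name retry_until_success, the attempt limit, and the delays are all assumptions for illustration. The sleep argument exists only so the waiting can be replaced in tests.

```python
import time


def retry_until_success(func, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call func() until it returns without raising, at most max_attempts times.

    The wait doubles after each failure (base_delay, 2*base_delay, ...).
    If every attempt fails, the last exception is re-raised.
    """
    last_error = None
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception as exc:
            last_error = exc
            # No point in sleeping after the final failed attempt.
            if attempt < max_attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error


# Possible usage with the parser from the question (not executed here):
# page = retry_until_success(lambda: parse_page(url))
```

This only retries on exceptions, so a parser that signals failure by returning False would need to raise instead.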

How to check if a list of URLs exists

I'm trying to test whether each URL in a simple list exists. The code works when I test just one URL, but when I try to use a list of URLs, it breaks.
Any idea what I'm doing wrong?
Single URL Code
import httplib
c = httplib.HTTPConnection('www.example.com')
c.request("HEAD", '')
if c.getresponse().status == 200:
    print('web site exists')
Broken Array Code
import httplib
Urls = ['www.google.ie', 'www.msn.com', 'www.fakeniallweb.com', 'www.wikipedia.org', 'www.galwaydxc.com', 'www.foxnews.com', 'www.blizzard.com', 'www.youtube.com']
for x in Urls:
    c = httplib.HTTPConnection(x)
    c.request("HEAD", '')
    if c.getresponse().status == 200:
        print('web site exists')
    else:
        print('web site' + x + 'un-reachable')
#To prevent code from closing
input ()
The problem is not that you do it as an array, it is that one of your urls (www.fakeniallweb.com) has a different problem than your other urls.
I think because the DNS cannot be resolved, you cannot request the HEAD as you do. So you need an additional check other than just checking for response code 200.
Maybe you could do something like this:
from socket import gaierror  # raised when a hostname cannot be resolved

try:
    c.request("HEAD", '')
    if c.getresponse().status == 200:
        print('web site exists')
    else:
        print('website does not exist')
except gaierror as e:
    print('Error resolving DNS')
Honestly I suspect you will find other cases where a website returns different status codes. For example a website might return something in the 3xx range for a redirect, or a 403 if you cannot access it. That does not mean the website does not exist.
Hope this helps you on your way!
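To make the point about status codes concrete, here is a rough Python 3 sketch (http.client is the Python 3 name for httplib). The function names and the wording of the labels are my own choices, not part of the answer above, and the grouping of status codes is only one reasonable reading of "exists".

```python
import http.client
import socket


def describe_status(status):
    """Map an HTTP status code to a rough description of the site's state."""
    if 200 <= status < 300:
        return "exists"
    if 300 <= status < 400:
        return "exists (redirects)"
    if status in (401, 403):
        return "exists (access denied)"
    if status == 404:
        return "host answers, page not found"
    return "answered with status {}".format(status)


def check_site(host, timeout=5):
    """Send a HEAD request to host and describe the outcome as a string."""
    try:
        conn = http.client.HTTPConnection(host, timeout=timeout)
        try:
            conn.request("HEAD", "/")
            return describe_status(conn.getresponse().status)
        finally:
            conn.close()
    except socket.gaierror:
        # Must come before OSError, because gaierror is a subclass of it.
        return "cannot resolve hostname"
    except (OSError, http.client.HTTPException) as exc:
        return "unreachable: {}".format(exc)
```

Calling check_site(x) for each entry in the list keeps one bad hostname from stopping the whole loop.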
@Dries De Rydt
Thanks for your help. It was an unresolved DNS error causing it to crash out.
I ended up using the socket module (Lib/socket.py).
Solution:
import socket
Urls = ['www.google.ie', 'www.msn.com', 'www.fakeniallweb.com', 'www.wikipedia.org', 'www.galwaydxc.com', 'www.foxnews.com', 'www.blizzard.com', 'www.youtube.com']
for x in Urls:
    try:
        url = socket.gethostbyname(x)
        print x + ' was reachable '
    except socket.gaierror, err:
        print "cannot resolve hostname: ", x, err
#To prevent code from closing
input ()
Thanks for all the help.

urllib request fails when page takes too long to respond

I have a simple function (in Python 3) that takes a URL and attempts to resolve it: it prints an error code if there is one (e.g. 404) or resolves a shortened URL to its full URL. My URLs are in one column of a CSV file, and the output is saved in the next column. The problem arises when the program encounters a URL where the server takes too long to respond: the program just crashes. Is there a simple way to force urllib to print an error code if the server is taking too long? I looked into "Timeout on a function call", but that looks a little too complicated as I am just starting out. Any suggestions?
i.e. (COL A) shorturl (COL B) http://deals.ebay.com/500276625
def urlparse(urlColumnElem):
    try:
        conn = urllib.request.urlopen(urlColumnElem)
    except urllib.error.HTTPError as e:
        return (e.code)
    except urllib.error.URLError as e:
        return ('URL_Error')
    else:
        redirect=conn.geturl()
        #check redirect
        if(redirect == urlColumnElem):
            #print ("same: ")
            #print(redirect)
            return (redirect)
        else:
            #print("Not the same url ")
            return(redirect)
EDIT: if anyone gets the http.client.disconnected error (like me), see this question/answer http.client.RemoteDisconnected error while reading/parsing a list of URL's
Have a look at the docs:
urllib.request.urlopen(url, data=None[, timeout])
The optional timeout parameter specifies a timeout in seconds for blocking operations like the connection attempt (if not specified, the global default timeout setting will be used).
You can set a realistic timeout (in seconds) for your process:
conn = urllib.request.urlopen(urlColumnElem, timeout=realistic_timeout_in_seconds)
and in order for your code to stop crashing, move everything inside the try except block:
import socket
import urllib.error
import urllib.request

def urlparse(urlColumnElem):
    try:
        conn = urllib.request.urlopen(
            urlColumnElem,
            timeout=realistic_timeout_in_seconds  # placeholder: a number of seconds
        )
        redirect=conn.geturl()
        #check redirect
        if(redirect == urlColumnElem):
            #print ("same: ")
            #print(redirect)
            return (redirect)
        else:
            #print("Not the same url ")
            return(redirect)
    except urllib.error.HTTPError as e:
        return (e.code)
    except urllib.error.URLError as e:
        return ('URL_Error')
    except socket.timeout as e:
        return ('Connection timeout')
Now if a timeout occurs, you will catch the exception and the program will not crash.
Good luck :)
First, there is a timeout parameter that can be used to control the time allowed for urlopen. Next, a timeout in urlopen should just throw an exception, more precisely a socket.timeout. If you do not want it to abort the program, you just have to catch it:
def urlparse(urlColumnElem, timeout=5): # allow 5 seconds by default
    try:
        conn = urllib.request.urlopen(urlColumnElem, timeout = timeout)
    except urllib.error.HTTPError as e:
        return (e.code)
    except urllib.error.URLError as e:
        return ('URL_Error')
    except socket.timeout:
        return ('Timeout')
    else:
        ...
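One detail worth knowing with both answers: a timeout while connecting usually arrives wrapped in a URLError whose reason is a socket.timeout, while a timeout while waiting for the response is raised as a bare socket.timeout. With the handlers above, the first kind is reported as 'URL_Error'. A small sketch that reports both as a timeout is below; classify_error and resolve_url are names I made up, and the return values simply mirror the ones used in the question.

```python
import socket
import urllib.error
import urllib.request


def classify_error(exc):
    """Turn an exception from urlopen into the value stored in the CSV column."""
    # HTTPError is a subclass of URLError, so it has to be tested first.
    if isinstance(exc, urllib.error.HTTPError):
        return exc.code
    if isinstance(exc, socket.timeout):
        return "Timeout"
    if isinstance(exc, urllib.error.URLError):
        # A connect timeout is wrapped: the original error sits in .reason.
        if isinstance(exc.reason, socket.timeout):
            return "Timeout"
        return "URL_Error"
    raise exc


def resolve_url(url, timeout=5):
    """Return the final URL after redirects, or an error value."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as conn:
            return conn.geturl()
    except (urllib.error.URLError, socket.timeout) as exc:
        return classify_error(exc)
```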

urllib2 exception handling with couchdb

I usually have a hard time nailing down how to handle urllib2 exceptions, so I'm still learning. Here is a scenario that I'd like some advice on.
I have a local CouchDB database. I want to know if the database exists, i.e. "127.0.0.1:5984/database". If it does not exist, and I can reach "127.0.0.1:5984", I want to know so I can create the new database.
Here are several cases I'm thinking about:
1) I could get a timeout.
2) My URL is wrong in the sense that I fail to reach the database entirely, i.e. I typed 127.0.4.1:5984/database but CouchDB is on 127.0.0.1:5984.
3) The database path "database" does not exist on the CouchDB server.
So here is some code I wrote to handle it:
What I do is test the response. If everything is fine, I set db_exists to True. The only time I set db_exists to False is if I get a 404. Everything else just exits the program.
request = urllib2.Request(address)
try:
    response = urllib2.urlopen(request)
except urllib2.URLError, e:
    if hasattr(e, 'reason'):
        print 'Failed to reach database'
        print 'Reason: ', e.reason
        sys.exit()
    elif hasattr(e, 'code'):
        if e.code == 404:
            db_exists = False
        else:
            print 'Failed to reach database'
            print 'Reason: ' + str(e)
            sys.exit()
else:
    try:
        #I am expecting a json response. So make sure of it.
        json.loads(response.read())
    except:
        print 'Failed to reach database at "' + address + '"'
        sys.exit()
    else:
        db_exists = True
I am following the exception handling scheme laid out in "urllib2 - The Missing Manual".
So basically my questions are...
1) Is this a clean, robust way to handle this?
2) Is it common practice to sprinkle sys.exit() throughout code?
-Update-
Using couchdb-python:
def main(db_url):
    database = couchdb.Database(url=db_url)
    try:
        database.info()
    except couchdb.http.ResourceNotFound, err:
        print '"' + db_url + '" ' + err.message[0] + ', ' + err.message[1]
        return
    except couchdb.http.Unauthorized, err:
        print err.message[1]
        return
    except couchdb.http.ServerError, err:
        print err.message
        return
    except socket.error, err:
        print str(err)
        return

if __name__ == '__main__':
    # note that I did not show it here, but db_url comes from an arg.
    main(db_url)
I would argue that you're attacking this problem at too low a level. Why not use couchdb-python?
To answer your questions: 1) No, it is not an especially clean way to do this. I would at least factor the code in your except block out into a method that extracts error types suitable for your application from the urllib2.URLError. 2) No, calling sys.exit() is bad practice nearly all the time. Raise an appropriate exception instead. By default it will bubble up and halt the interpreter, just like your sys.exit(), but with a traceback. Or, since your Couch client is a library, the exceptions can be handled at the application's discretion. Library code should never exit the interpreter.
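As an illustration of raising instead of exiting, here is a minimal sketch. It uses Python 3's urllib.request rather than urllib2, and the names DatabaseUnreachable and database_exists, as well as the opener parameter, are invented for this example. The opener parameter only exists so the function can be exercised without a running CouchDB.

```python
import json
import urllib.error
import urllib.request


class DatabaseUnreachable(Exception):
    """Hypothetical application-level error: the server could not be queried."""


def database_exists(address, opener=urllib.request.urlopen, timeout=5):
    """Return True if the database answers with JSON, False on a 404.

    Anything else raises DatabaseUnreachable and leaves the decision to
    exit, retry or report to the caller.
    """
    try:
        response = opener(address, timeout=timeout)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise DatabaseUnreachable("unexpected HTTP status {}".format(err.code)) from err
    except OSError as err:
        # URLError and socket timeouts are both OSError subclasses.
        raise DatabaseUnreachable("cannot reach {}: {}".format(address, err)) from err
    try:
        json.loads(response.read().decode("utf-8"))
    except ValueError as err:
        raise DatabaseUnreachable("non-JSON reply from {}".format(address)) from err
    return True
```

The script's entry point can then catch DatabaseUnreachable once, print the message and choose an exit code, so no sys.exit() call is needed anywhere else.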
