Timeout within session while sending requests - python

I'm trying to learn how I can use a timeout within a session while sending requests. The way I've tried below can fetch the content of a webpage, but I'm not sure it's the right way, as I could not find any usage of timeout within a session in the documentation.
import requests

link = "https://stackoverflow.com/questions/tagged/web-scraping"

with requests.Session() as s:
    r = s.get(link, timeout=5)
    print(r.text)
How can I use timeout within session?

According to the documentation (Quickstart):
You can tell Requests to stop waiting for a response after a given
number of seconds with the timeout parameter. Nearly all production code should use this parameter in nearly all requests.
requests.get('https://github.com/', timeout=0.001)
Or, from the documentation (Advanced Usage), you can set two values (a connect and a read timeout):
The timeout value will be applied to both the connect and the read
timeouts. Specify a tuple if you would like to set the values
separately:
r = requests.get('https://github.com', timeout=(3.05, 27))
Making a Session-Wide Timeout
I searched throughout the documentation and it seems it is not possible to set the timeout parameter session-wide.
But there is an open GitHub issue (Consider making Timeout option required or have a default) which provides a workaround in the form of an HTTPAdapter you can use like this:
import requests
from requests.adapters import HTTPAdapter

class TimeoutHTTPAdapter(HTTPAdapter):
    def __init__(self, *args, **kwargs):
        # Pull the custom timeout out of kwargs before delegating,
        # since HTTPAdapter.__init__ doesn't accept it
        if "timeout" in kwargs:
            self.timeout = kwargs["timeout"]
            del kwargs["timeout"]
        super().__init__(*args, **kwargs)

    def send(self, request, **kwargs):
        # Fall back to the adapter's default unless the caller
        # passed a timeout explicitly
        timeout = kwargs.get("timeout")
        if timeout is None and hasattr(self, 'timeout'):
            kwargs["timeout"] = self.timeout
        return super().send(request, **kwargs)
And mount it on a requests.Session():
s = requests.Session()
s.mount('http://', TimeoutHTTPAdapter(timeout=5)) # 5 seconds
s.mount('https://', TimeoutHTTPAdapter(timeout=5))
...
r = s.get(link)
print(r.text)
Or, similarly, you can use the EnhancedSession proposed by @GordonAitchJay (defined in the next answer below):
with EnhancedSession(5) as s:  # 5 seconds
    r = s.get(link)
    print(r.text)

I'm not sure this is the right way as I could not find the usage of timeout in this documentation.
Scroll to the bottom. It's definitely there. You can search for it in the page by pressing Ctrl+F and entering timeout.
You're using timeout correctly in your code example.
You can actually specify the timeout in a few different ways, as explained in the documentation:
If you specify a single value for the timeout, like this:
r = requests.get('https://github.com', timeout=5)
The timeout value will be applied to both the connect and the read timeouts. Specify a tuple if you would like to set the values separately:
r = requests.get('https://github.com', timeout=(3.05, 27))
If the remote server is very slow, you can tell Requests to wait forever for a response, by passing None as a timeout value and then retrieving a cup of coffee.
r = requests.get('https://github.com', timeout=None)
Try using https://httpstat.us/200?sleep=5000 to test your code.
For example, this raises an exception because 0.2 seconds is not long enough to establish a connection with the server:
import requests

link = "https://httpstat.us/200?sleep=5000"

with requests.Session() as s:
    try:
        r = s.get(link, timeout=(0.2, 10))
        print(r.text)
    except requests.exceptions.Timeout as e:
        print(e)
Output:
HTTPSConnectionPool(host='httpstat.us', port=443): Read timed out. (read timeout=0.2)
This raises an exception because the server waits for 5 seconds before sending the response, which is longer than the 2-second read timeout set:
import requests

link = "https://httpstat.us/200?sleep=5000"

with requests.Session() as s:
    try:
        r = s.get(link, timeout=(3.05, 2))
        print(r.text)
    except requests.exceptions.Timeout as e:
        print(e)
Output:
HTTPSConnectionPool(host='httpstat.us', port=443): Read timed out. (read timeout=2)
You specifically mention using a timeout within a session. So maybe you want a session object which has a default timeout. Something like this:
import requests

link = "https://httpstat.us/200?sleep=5000"

class EnhancedSession(requests.Session):
    def __init__(self, timeout=(3.05, 4)):
        self.timeout = timeout
        return super().__init__()

    def request(self, method, url, **kwargs):
        print("EnhancedSession request")
        # Use the session-wide default unless a timeout is passed explicitly
        if "timeout" not in kwargs:
            kwargs["timeout"] = self.timeout
        return super().request(method, url, **kwargs)
session = EnhancedSession()

try:
    response = session.get(link)
    print(response)
except requests.exceptions.Timeout as e:
    print(e)

try:
    response = session.get(link, timeout=1)
    print(response)
except requests.exceptions.Timeout as e:
    print(e)

try:
    response = session.get(link, timeout=10)
    print(response)
except requests.exceptions.Timeout as e:
    print(e)
Output:
EnhancedSession request
HTTPSConnectionPool(host='httpstat.us', port=443): Read timed out. (read timeout=4)
EnhancedSession request
HTTPSConnectionPool(host='httpstat.us', port=443): Read timed out. (read timeout=1)
EnhancedSession request
<Response [200]>
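Note that overriding request() is enough to cover every verb helper (get, post, put, and so on), since Session.get and friends all delegate to Session.request internally.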

Related

Why am I not getting a connection error when my API request fails on dropped wifi?

I am pulling data down from an API that has a limit of 250 records per call. There are 100,000 records in total that I need to pull down, 250 at a time. I run my application leveraging the get_stats function below. It works fine for a while, but when my wifi drops while I am in the middle of a get request, the request hangs and I don't get an exception back, which causes the rest of the application to hang as well.
I have tested turning off my wifi when the function is NOT in the middle of a get request, and in that case it does return the ConnectionError exception.
How do I go about handling the situation where my app is in the middle of a get request and my wifi drops? I am thinking I need to set a timeout to give my wifi time to reconnect and then retry, but how do I go about doing that? Or is there another way?
import json
import requests

def get_stats(url, version):
    headers = {
        "API_version": version,
        "API_token": "token"
    }
    try:
        r = requests.get(url, headers=headers)
        print(f"Status code: {r.status_code}")
        return json.loads(r.text)
    except requests.exceptions.Timeout:
        # Maybe set up for a retry, or continue in a retry loop
        print("Error here in timeout")
    except requests.exceptions.TooManyRedirects:
        # Tell the user their URL was bad and try a different one
        print("Redirect errors here")
    except requests.exceptions.ConnectionError as r:
        print("Connection error")
        r = "Connection Error"
        return r
    except requests.exceptions.RequestException as e:
        # catastrophic error. bail.
        print("System errors here")
        raise SystemExit(e)
To set a timeout on the request, call requests.get like this:
r = requests.get(url, headers=headers, timeout=10)
The end goal is to get the data, so just make the call again after a failure, possibly with a sleep first.
Edit: I would say that the timeout is the sleep.
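For example, a minimal retry sketch along those lines (my addition; the retry count and delay here are arbitrary, not from the original answer):
import time
import requests

def get_with_retries(url, headers, retries=3, delay=5):
    # Try the request a few times, sleeping between attempts so the
    # wifi has a chance to reconnect; the timeout bounds each attempt.
    for attempt in range(retries):
        try:
            return requests.get(url, headers=headers, timeout=10)
        except (requests.exceptions.Timeout,
                requests.exceptions.ConnectionError):
            if attempt == retries - 1:
                raise  # out of retries, let the caller handle it
            time.sleep(delay)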

Aiohttp session timeout doesn't cancel the request

I have this piece of code where I send a POST request and set a maximum timeout on it using the aiohttp package:
from aiohttp import ClientTimeout, ClientSession

response_code = None
timeout = ClientTimeout(total=2)

async with ClientSession(timeout=timeout) as session:
    try:
        async with session.post(
            url="some url", json=post_payload, headers=headers,
        ) as response:
            response_code = response.status
    except Exception as err:
        logger.error(err)
That part works; however, the request appears not to be canceled when the timeout is reached and the except clause runs. I still receive the request on the other end, even though an exception has been raised on the client. I would like the request to be canceled automatically whenever the timeout is reached. Thanks in advance.
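A note on why this can happen (my addition, not part of the original question): once the POST bytes have left the client, a client-side timeout can only close the connection; it cannot recall a request the server has already received, so the server may still process it. If the goal is also to cancel the coroutine at a deadline, one hedged sketch is to wrap the call in asyncio.wait_for:
import asyncio
from aiohttp import ClientSession

async def post_with_deadline(session, url, payload, headers):
    # asyncio.wait_for cancels this coroutine when the deadline passes,
    # and aiohttp then closes the underlying connection; the server may
    # still have received the request before the cancellation fired.
    async with session.post(url, json=payload, headers=headers) as response:
        return response.status

# usage sketch:
#   status = await asyncio.wait_for(
#       post_with_deadline(session, "some url", post_payload, headers),
#       timeout=2)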

Checking website response within x seconds

Good day. The problem I am facing is that I want to check whether my website is up or not. Here is the sample pseudocode:
Check(website.com)
if checking_time > 10 seconds:
    print "No response received"
else:
    print "Site is up"
I already tried the code below, but it's not working:
try:
    response = urllib.urlopen("http://insurance.contactnumbersph.com").getcode()
    time.sleep(5)
    if response == "" or response == "403":
        print "No response"
    else:
        print "ok"
If the website is not up and running, you will get a connection refused error, and it doesn't actually return any status code. So you can catch the error in Python with simple try: and except: blocks.
import requests

URL = 'http://some-url-where-there-is-no-server'

try:
    resp = requests.get(URL)
except Exception as e:
    # handle here
    print(e)  # for example
You can also check repeatedly, 10 times at one-second intervals, to see whether an exception occurs; if one does, you check again:
import time
import requests

URL = 'http://some-url'
counts = 0
gotConnected = False

while counts < 10:
    try:
        resp = requests.get(URL)
        gotConnected = True
        break
    except Exception as e:
        counts += 1
        time.sleep(1)
The result will be available in the gotConnected flag, which you can use afterwards to take the appropriate action.
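One caveat (my addition, not part of the original answer): without a timeout each attempt can hang indefinitely, so it may be worth bounding each request as well:
import time
import requests

URL = 'http://some-url'  # placeholder URL from the answer above
gotConnected = False

for _ in range(10):
    try:
        # (connect, read) timeout bounds each individual attempt
        resp = requests.get(URL, timeout=(3.05, 10))
        gotConnected = True
        break
    except requests.exceptions.RequestException:
        time.sleep(1)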
Note that the timeout passed around by urllib applies to the "wrong thing": each individual network operation (e.g. hostname resolution, socket connection, sending headers, reading a few bytes of the headers, reading a few more bytes of the response) gets this same timeout applied. Hence passing a "timeout" of 10 seconds could allow a large response to keep arriving for hours.
If you want to stick to built-in Python code, it would be nice to use a thread to do this, but it doesn't seem to be possible to cancel running threads nicely. An async library like trio would allow better timeout and cancellation handling, but we can make do with the multiprocessing module instead:
from urllib.request import Request, urlopen
from multiprocessing import Process
from time import perf_counter

def _http_ping(url):
    req = Request(url, method='HEAD')
    print(f'trying {url!r}')
    start = perf_counter()
    res = urlopen(req)
    secs = perf_counter() - start
    print(f'response {url!r} of {res.status} after {secs*1000:.2f}ms')
    res.close()

def http_ping(url, timeout):
    # run the request in a separate process so it can be killed on timeout
    proc = Process(target=_http_ping, args=(url,))
    try:
        proc.start()
        proc.join(timeout)
        # if the child is still alive after the join, the ping timed out
        success = not proc.is_alive()
    finally:
        proc.terminate()
        proc.join()
        proc.close()
    return success
You can use https://httpbin.org/ to test this, e.g.:
http_ping('https://httpbin.org/delay/2', 1)
should print a "trying" message but not a "response" message. You can adjust the delay time and the timeout to explore how this behaves.
Note that this spins up a new process for each request, but as long as you're doing fewer than about a thousand pings a second it should be OK.
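One practical note (my addition): on platforms where multiprocessing uses the spawn start method (Windows, and macOS by default), the calling code needs the usual main guard so child processes don't re-execute the module on import:
if __name__ == '__main__':
    print(http_ping('https://httpbin.org/delay/2', 1))  # expect False: timed out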

Python - Requests module - Receving streaming updates - Connection reset by peer

I have been building my own Python (version 3.2.1) trading application against a practice account of a Forex provider (OANDA), but I am having some issues receiving the streaming prices on a Debian-based Linux OS.
In particular, I have followed their "Python streaming rates" guide available here: http://developer.oanda.com/rest-live/sample-code/.
I have a thread calling the function 'connect_to_stream' which prints out all the ticks received from the server:
streaming_thread = threading.Thread(target=streaming.connect_to_stream, args=[])
streaming_thread.start()
The streaming.connect_to_stream function is defined as follows:
def connect_to_stream():
    [..]  # provider-related info is passed here
    try:
        s = requests.Session()
        url = "https://" + domain + "/v1/prices"
        headers = {'Authorization': 'Bearer ' + access_token,
                   'Connection': 'keep-alive'
                   }
        params = {'instruments': instruments, 'accountId': account_id}
        req = requests.Request('GET', url, headers=headers, params=params)
        pre = req.prepare()
        resp = s.send(pre, stream=True, verify=False)
    except Exception as e:
        s.close()
        print("Caught exception when connecting to stream\n%s" % str(e))
        return

    if resp.status_code != 200:
        print(resp.text)
        return
    for line in resp.iter_lines(1):
        if line:
            try:
                msg = json.loads(line)
                print(msg)
            except Exception as e:
                print("Caught exception when connecting to stream\n%s" % str(e))
                return
The msg variable contains the tick received for the streaming.
The problem is that I receive ticks for three hours on average, after which the connection gets dropped and the script either hangs without receiving any ticks or throws an exception with the reason "Connection reset by peer".
Could you please share any thoughts on where I am going wrong here? Is it anything related to the requests library (iter_lines maybe)?
I would like to receive ticks indefinitely unless a Keyboard exception is raised.
Thanks
It doesn't seem too weird to me that a service would close connections that have been alive for more than 3 hours. That's probably a safety measure on their side to free their server sockets from ghost clients.
So you should probably just reconnect when you are disconnected.
import errno
from socket import error as SocketError

# inside connect_to_stream():
try:
    s = requests.Session()
    url = "https://" + domain + "/v1/prices"
    headers = {'Authorization': 'Bearer ' + access_token,
               'Connection': 'keep-alive'
               }
    params = {'instruments': instruments, 'accountId': account_id}
    req = requests.Request('GET', url, headers=headers, params=params)
    pre = req.prepare()
    resp = s.send(pre, stream=True, verify=False)
except SocketError as e:
    if e.errno == errno.ECONNRESET:
        pass  # connection has been reset, reconnect.
except Exception as e:
    pass  # other exceptions, but you'll probably need to reconnect too.
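To make "just reconnect" concrete, a minimal sketch (my addition, assuming connect_to_stream returns whenever the stream drops, as in the question's code):
import time

while True:
    try:
        connect_to_stream()  # returns when the stream ends or errors out
    except KeyboardInterrupt:
        break  # stop streaming on Ctrl+C
    time.sleep(1)  # brief pause before reconnecting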

python request error handling

I am writing a small Python app which uses requests to get and post data to an HTML page.
The problem I am having is that if I can't reach the page, the code stops with a "max retries exceeded" error. I want to be able to do some things if I can't reach the server.
Is such a thing possible?
Here is sample code:
import requests

url = "http://127.0.0.1/"
req = requests.get(url)

if req.status_code == 304:
    pass  # do something
elif req.status_code == 404:
    pass  # do something else
# etc etc
# code here if the server can't be reached for whatever reason
You want to handle the exception requests.exceptions.ConnectionError, like so:
try:
    req = requests.get(url)
except requests.exceptions.ConnectionError as e:
    pass  # Do stuff here
You may want to set a suitable timeout when catching ConnectionError:
url = "http://www.stackoverflow.com"

try:
    req = requests.get(url, timeout=2)  # 2-second timeout
except requests.exceptions.ConnectionError as e:
    pass  # Couldn't connect
See this answer if you want to change the number of retries.
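For reference, a minimal sketch of configuring retries on a session (my addition, not from the linked answer; the counts and status codes here are arbitrary):
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

s = requests.Session()
# retry up to 3 times, with exponential backoff, on these status codes
retries = Retry(total=3, backoff_factor=0.5, status_forcelist=[502, 503, 504])
s.mount('http://', HTTPAdapter(max_retries=retries))
s.mount('https://', HTTPAdapter(max_retries=retries))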
