Huge gap between two methods of measuring request time via python-requests

I am trying to measure the response time of a certain request using python-requests.
import requests
import time
start = time.time()
r = requests.get("https://www.dl.soc.i.kyoto-u.ac.jp/index.php/members/")
end = time.time()
print(end - start)
print(r.elapsed.seconds)
It gave me a result of
64.67747116088867
0.631163
Could anyone please explain the reason for this huge gap? Thanks.
By the way, when I tried the same request in Google Chrome, the first result is actually the one I want.

I did a test with an artificially delaying web server:
nc -l 8080
Then in another terminal in a Python session:
import time, requests
a=time.time()
r = requests.get("http://localhost:8080/")
b=time.time()
print r.elapsed, b-a
Pasting this issued the following HTTP request, which showed up in the server terminal:
GET / HTTP/1.1
Host: localhost:8080
Connection: keep-alive
Accept-Encoding: gzip, deflate
Accept: */*
User-Agent: python-requests/2.9.1
I waited for 5 seconds, then I pasted this reply in the server:
HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 12
Connection: keep-alive
Keep-Alive: timeout=15
Date: Thu, 04 Jan 2018 10:12:09 GMT
Server: Apache
Last-Modified: Wed, 09 Dec 2015 13:57:24 GMT
ETag: "28bd-52677784b6090"
Accept-Ranges: bytes
hello
I declared Content-Length: 12 but only sent 6 bytes (hello\n), so the reply was unfinished. I waited another five seconds, then pasted this text:
world
This finished the reply with the remaining six bytes (world\n). In the client terminal I saw the result appear:
0:00:05.185509 10.8904578686
So, evidently, r.elapsed measures the Time-To-First-Byte (TTFB), while the call to requests.get() only returns after the whole message has been received (Time-To-Last-Byte, TTLB).
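If you want both numbers yourself, a minimal sketch (same URL as in the question): with stream=True, requests.get() returns as soon as the response headers have arrived, and accessing r.content afterwards downloads the rest of the body.
import time
import requests

url = "https://www.dl.soc.i.kyoto-u.ac.jp/index.php/members/"
start = time.time()
r = requests.get(url, stream=True)  # returns once the headers are in
ttfb = time.time() - start          # roughly the time to first byte
body = r.content                    # forces the full body download
ttlb = time.time() - start          # time to last byte
print(r.elapsed.total_seconds(), ttfb, ttlb)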

Related

python pexpect failing for curl output

I am capturing curl output after spawning a session to a remote machine. The expect() call keeps timing out; I tried different patterns, still no luck. The curl request is of the form:
hdl2.sendline("curl -v http://{0}/index.html -o /dev/null".format(host1))
The output received is
" > GET /index.html HTTP/1.1
> User-Agent: curl/7.35.0
> Host: 13.126.208.1
> Accept: */*
< HTTP/1.1 200 OK
< Date: Sun, 20 Aug 2017 12:32:54 GMT
* Server Apache/2.4.7 (Ubuntu) is not blacklisted
< Server: Apache/2.4.7 (Ubuntu)
< Last-Modified: Sun, 20 Aug 2017 09:56:44 GMT
< ETag: "2cf6-5572c61363668"
< Accept-Ranges: bytes
< Content-Length: 11510
< Vary: Accept-Encoding
< Content-Type: text/html
<
{ [data not shown]
100 11510 100 11510 0 0 3055k 0 --:--:-- --:--:-- --:--:-- 3746k
* Connection #0 to host 13.126.208.1 left intact
ubuntu@ip-172-31-28-48:~$ "
This is the end of the output. I have given the expect pattern as:
hdl2.expect("\$ ")
But every time I get a pexpect timeout. Any suggestions are appreciated.
This may happen because of line buffering: ubuntu@ip-172-31-28-48:~$ is not terminated with \n, so expect may not consider this line. You can try this:
hdl2.sendline("curl -v http://{0}/index.html -o /dev/null; echo DONE".format(host1))
hdl2.expect("DONE")
(Use a string that will be unique for your data instead of DONE.)
Pexpect's default timeout is 30 seconds. If your curl command needs more time, you need to increase the timeout value. For example:
hdl2.expect("\$ ", timeout=600)
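Putting both suggestions together, a minimal self-contained sketch (it runs a local bash instead of the remote session from the question, and host1 is the host from the question; the CURL_"DONE" quoting trick keeps the shell's echo of the command itself from matching the marker):
import pexpect

host1 = "13.126.208.1"  # host taken from the question
hdl2 = pexpect.spawn("bash", encoding="utf-8")
# The pty echoes the command line back; writing the marker as CURL_"DONE"
# means the echoed command does not contain the literal string CURL_DONE.
hdl2.sendline('curl -v http://{0}/index.html -o /dev/null; echo CURL_"DONE"'.format(host1))
hdl2.expect("CURL_DONE", timeout=600)  # generous timeout for slow transfers
print(hdl2.before)  # everything curl printed before the marker
hdl2.close()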

Python Socket Connect poloniex

Hi, I want to connect to the Poloniex API using a Python socket.
I ran the code below, but I cannot get the result I want.
The code I made:
===================================================================
import requests
import socket
s=socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("poloniex.com",443))
message="GET /public?command=returnTicker HTTP/1.1\r\nHost: poloniex.com\r\nConnection: keep-alive\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nUser-Agent: python-requests/2.18.1\r\n\r\n"
s.send(message)
print s.recv(5000)
===================================================================
Response Text:
HTTP/1.1 400 Bad Request
Server: cloudflare-nginx
Date: Tue, 20 Jun 2017 02:52:22 GMT
Content-Type: text/html
Content-Length: 275
Connection: close
CF-RAY: -

400 The plain HTTP request was sent to HTTPS port
===================================================================
The error message is right - you're sending a plain HTTP request to port 443, which is the HTTPS port. If you want to send a plain HTTP request, use port 80. I have just tried sending a request to port 80, and the response says I should be using HTTPS from now on (see the Location: https://... part):
HTTP/1.1 301 Moved Permanently
Date: Tue, 20 Jun 2017 13:40:52 GMT
Content-Type: text/html
Transfer-Encoding: chunked
Connection: keep-alive
Set-Cookie: __cfduid=d28a8f446379618a093014a5f13bbcb141497966052; expires=Wed, 20-Jun-18 13:40:52 GMT; path=/; domain=.poloniex.com; HttpOnly
Location: https://poloniex.com/public?command=returnTicker
Server: cloudflare-nginx
CF-RAY: 371f2473b09f5a7a-BOS
In this case you should either use the ssl module instead of a bare socket, or just use requests, since it is the simpler option.
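For example, a minimal sketch of the requests version (assuming the endpoint still answers this query); requests handles the TLS handshake, the Host header and decompression for you:
import requests

r = requests.get("https://poloniex.com/public?command=returnTicker")
print(r.status_code)  # 200 if the request succeeded
print(r.json())       # the ticker, parsed from JSON into a dict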

How to make httplib debug information appear at the logger DEBUG level

By default, httplib's debug output (send, headers and reply information) shows up at the logger.info level.
How can I instead display send, headers and reply as part of the DEBUG-level information?
import requests
import logging
import httplib
httplib.HTTPConnection.debuglevel = 1
logging.basicConfig() # you need to initialize logging, otherwise you will not see anything from requests
logging.getLogger().setLevel(logging.DEBUG)
requests_log = logging.getLogger("requests.packages.urllib3")
requests_log.setLevel(logging.DEBUG)
requests_log.propagate = True
requests.get('http://httpbin.org/headers')
It prints
INFO:requests.packages.urllib3.connectionpool:Starting new HTTP Connection (1):
httpbin.org
send: 'GET /headers HTTP/1.1\r\nHost: httpbin.org\r\nConnection: keep-alive\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nUser-Agent: python-requests/2.8.1\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Server: nginx
header: Date: Mon, 14 Dec 2015 12:50:44 GMT
header: Content-Type: application/json
header: Content-Length: 156
header: Connection: keep-alive
header: Access-Control-Allow-Origin: *
header: Access-Control-Allow-Credentials: true
DEBUG:requests.packages.urllib3.connectionpool:"GET /headers HTTP/1.1" 200 156
<Response [200]>
Thanks @Eli.
I was able to achieve this using this post: http://stefaanlippens.net/redirect_python_print
import logging
import sys
import requests
import httplib
# HTTP stream handler
class WritableObject:
    def __init__(self):
        self.content = []
    def write(self, string):
        self.content.append(string)
# A writable object
http_log = WritableObject()
# Redirection
sys.stdout = http_log
# Enable
httplib.HTTPConnection.debuglevel = 2
# get operation
requests.get('http://httpbin.org/headers')
# Remember to reset sys.stdout!
sys.stdout = sys.__stdout__
debug_info = ''.join(http_log.content).replace('\\r', '').decode('string_escape').replace('\'', '')
# Remove empty lines
debug_info = "\n".join([ll.rstrip() for ll in debug_info.splitlines() if ll.strip()])
It prints like
C:\Users\vkosuri\Dropbox\robot\lab>python New-Redirect_Stdout.py
send: GET /headers HTTP/1.1
Host: httpbin.org
Connection: keep-alive
Accept-Encoding: gzip, deflate
Accept: */*
User-Agent: python-requests/2.8.1
reply: HTTP/1.1 200 OK
header: Server: nginx
header: Date: Tue, 15 Dec 2015 09:36:36 GMT
header: Content-Type: application/json
header: Content-Length: 156
header: Connection: keep-alive
header: Access-Control-Allow-Origin: *
header: Access-Control-Allow-Credentials: true
Thanks
Malli
some_logger.setLevel() does not do what you think it does. It doesn't set the level of the logs being emitted by a logger. It sets the minimum level of log, emitted through that logger, that your handlers will care about and acknowledge. To do what you're asking, I can only think of one real, reasonable way:
Capture the logs as they're coming in and re-log them. You can capture them with the idea described here, and use that in a subclass of requests. This would without a doubt be complicated. So, this is probably a good time to start asking yourself, "what am I really trying to achieve and is there another way to go about it?"
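To illustrate the capture-and-re-log idea, a minimal sketch (an assumed approach, not a requests feature): hand httplib a fake stdout whose write() re-emits every debug line at DEBUG level.
import logging
import sys
import requests
try:
    import httplib                 # Python 2
except ImportError:
    import http.client as httplib  # Python 3

logging.basicConfig(level=logging.DEBUG)  # logging writes to stderr, so no recursion
log = logging.getLogger("httplib")

class DebugToLogger(object):
    # File-like object that forwards httplib's print output to logging.debug.
    def write(self, s):
        s = s.strip()
        if s:
            log.debug(s)
    def flush(self):
        pass

httplib.HTTPConnection.debuglevel = 1
sys.stdout = DebugToLogger()  # httplib's debug output goes through print
try:
    requests.get('http://httpbin.org/headers')
finally:
    sys.stdout = sys.__stdout__  # always restore stdout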

Python urllib open issue

I'm trying to fetch data from http://book.libertorrent.com/, but at the moment I'm failing badly because some additional data (headers) is present in the response. My code is very simple:
response = urllib.urlopen('http://book.libertorrent.com/login.php')
f = open('someFile.html', 'w')
f.write(response.read())
read() returns:
Date: Fri, 09 Nov 2012 07:36:54 GMT
Content-Type: text/html; charset=utf-8
Transfer-Encoding: chunked
Connection: close
Cache-Control: no-cache, pre-check=0, post-check=0
Expires: 0
Pragma: no-cache
Set-Cookie: bb_test=973132321; path=/; domain=book.libertorrent.com
Content-Language: ru
1ec0
...Html...
0
And response.info() is empty.
Is there any way to correct the response?
Let's try this:
$ echo -ne "GET /index.php HTTP/1.1\r\nHost: book.libertorrent.com\r\n\r\n" | nc book.libertorrent.com 80 | head -n 10
HTTP/1.1 200 OK
WWW
Date: Sat, 10 Nov 2012 17:41:57 GMT
Content-Type: text/html; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive
Content-Language: ru
1f57
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"><html dir="ltr">
See that "WWW" in the second line? That's not a valid HTTP header; I'm guessing that's what's throwing off the response parser here.
By the way, python2 and python3 behave differently here:
python2 seems to immediately interpret anything after this invalid header as content
python3 ignores all headers and continues reading the content after the double newline. Because the headers are ignored, so is the transfer encoding, and therefore the chunk-size lines (like 1f57) are interpreted as part of the body.
So in the end the problem is that the server is sending an invalid response, which should be fixed at the server's end.
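If the server can't be fixed, one possible client-side workaround (a sketch, assuming the bogus header line is the only problem) is to skip the header parser entirely: speak HTTP/1.0 over a raw socket, so the server shouldn't apply chunked encoding, and split the body off at the blank line yourself.
import socket

s = socket.create_connection(("book.libertorrent.com", 80))
s.sendall(b"GET /login.php HTTP/1.0\r\nHost: book.libertorrent.com\r\n\r\n")
chunks = []
while True:
    data = s.recv(4096)
    if not data:  # with HTTP/1.0 the server closes the connection when done
        break
    chunks.append(data)
s.close()
raw = b"".join(chunks)
headers, _, body = raw.partition(b"\r\n\r\n")  # split at the blank line
with open('someFile.html', 'wb') as f:
    f.write(body)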

HTTP telnet POST/GAE server question (SIMPLE STUFF)

I am playing with HTTP transfers, just trying to make something work. I have a GAE server and I'm pretty sure it's working properly because it renders when I go to it with my browser, but here is the python code anyway:
import sys
print 'Content-Type: text/html'
print ''
print '<pre>'
number = -1
data = sys.stdin.read()
try:
    number = int(data[data.find('=')+1:])
except:
    number = -1
print 'If your number was', number, ', then you are awesome!!!'
print '</pre>'
I am just learning the whole HTTP POST vs GET vs Response process, but this is what I have been doing from the terminal:
$ telnet localhost 8080
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET http://localhost:8080/?number=28 HTTP/1.0
HTTP/1.0 200 Good to go
Server: Development/1.0
Date: Thu, 07 Jul 2011 21:29:28 GMT
Content-Type: text/html
Cache-Control: no-cache
Expires: Fri, 01 Jan 1990 00:00:00 GMT
Content-Length: 61
<pre>
If your number was -1 , then you are awesome!!!
</pre>
Connection closed by foreign host.
I am using a GET here because I stumbled around for about 40 minutes trying to make a telnet POST work - with no success :(
I would appreciate any help on how to get this GET and/or the POST to work. Thanks in advance!!!!
When using GET, no data will be present in the request body, so sys.stdin.read() is bound to fail. Instead, you might want to look at the environment, specifically os.environ['QUERY_STRING'].
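A minimal sketch of reading the number that way (the parameter name number matches the question; urlparse is the Python 2 module name, on Python 3 it is urllib.parse):
import os
import urlparse  # Python 2; on Python 3: from urllib import parse as urlparse

query = os.environ.get('QUERY_STRING', '')  # e.g. 'number=28'
params = urlparse.parse_qs(query)
number = int(params.get('number', ['-1'])[0])
print('If your number was %s, then you are awesome!!!' % number)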
Another thing you're doing a little strangely is the request format. The path in the request line should not include the URL scheme, host or port; it should look like:
GET /?number=28 HTTP/1.0
Specify the host in a separate Host: header; the server will determine the scheme on its own.
When using POST, most servers won't read past the number of bytes given in the Content-Length header, which, if you don't supply one, may be assumed to be zero. The server may try to read any bytes after the point specified by the Content-Length as the next request on a persistent connection, and when they don't begin with a valid request, it closes the connection. So basically:
POST / HTTP/1.0
Host: localhost:8080
Content-Length: 2
Content-Type: text/plain
28
But why are you testing this in telnet? How about curl?
$ curl -vs -d'28' -H'Content-Type: text/plain' http://localhost:8004/
* About to connect() to localhost port 8004 (#0)
* Trying ::1... Connection refused
* Trying 127.0.0.1... connected
* Connected to localhost (127.0.0.1) port 8004 (#0)
> POST / HTTP/1.1
> User-Agent: curl/7.20.1 (x86_64-redhat-linux-gnu) libcurl/7.20.1 NSS/3.12.6.2 zlib/1.2.3 libidn/1.16 libssh2/1.2.4
> Host: localhost:8004
> Accept: */*
> Content-Type: text/plain
> Content-Length: 2
>
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Date: Thu, 07 Jul 2011 22:09:17 GMT
< Server: WSGIServer/0.1 Python/2.6.4
< Content-Type: text/html; charset=UTF-8
< Content-Length: 45
<
* Closing connection #0
{'body': '28', 'method': 'POST', 'query': []}
Or better yet, in Python:
>>> import httplib
>>> headers = {"Content-type": "text/plain",
... "Accept": "text/plain"}
>>>
>>> conn = httplib.HTTPConnection("localhost:8004")
>>> conn.request("POST", "/", "28", headers)
>>> response = conn.getresponse()
>>> print response.read()
{'body': '28', 'method': 'POST', 'query': []}
>>>
