python httplib: getting the outgoing request headers

I do:
con = HTTPConnection(SERVER_NAME)
con.request('GET', PATH, headers=HEADERS)
resp = con.getresponse()
For debugging purposes, I want to see the request I sent (its fields, path, method, ...). I would expect there to be some sort of con.getRequest() or something of the sort, but I didn't find anything. Ideas?

Try
con.set_debuglevel(1)
That will enable debugging output, which among other things, will print out all the data it sends.
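For example, a minimal sketch (example.com stands in for SERVER_NAME from the question):
from httplib import HTTPConnection

con = HTTPConnection('example.com')
con.set_debuglevel(1)  # echoes the request line, headers, and response status to stdout
con.request('GET', '/', headers={'Accept': 'text/html'})
resp = con.getresponse()
print resp.status, resp.reason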
If you only want to get the headers and request line, not the request body (or any other debugging output), you can subclass HTTPConnection and override the _output method, which is called by the class itself to produce output (except for the request body). You'd want to do something like this:
class MyHTTPConnection(HTTPConnection):
    def _output(self, s):
        print repr(s)
        # HTTPConnection is an old-style class in Python 2, so no super() here
        HTTPConnection._output(self, s)
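A usage sketch, assuming the same connection setup as the question:
con = MyHTTPConnection('example.com')
con.request('GET', '/', headers={'Accept': 'text/html'})
resp = con.getresponse()
# prints each raw line sent, e.g. 'GET / HTTP/1.1' and every header line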
For more details on how that works and possible alternatives, have a look at the httplib source code.

Related

Locust: No statistics shown

I'm new to Locust. I am attempting to log statistics for a POST request, using the following code along with a generic invocation of locust.
import json
from locust import HttpUser, task, between
import cfg

class BasicUser(HttpUser):
    wait_time = between(1, 3)
    v1_data = json.load(open("v1_sample_data.json", "r"))

    @task
    def get_v1_prediction(self):
        route = "/" + cfg.lookup("model.v1.route")
        response = self.client.post(
            route,
            json=self.v1_data,
            catch_response=True,
            name="API Call"
        )
        print(response.text)
When I start an experiment, the host is called successfully, and response.text has the expected value and is printed to the console repeatedly. However, the statistics aren't logged.
When I use a GET request in place of the POST without passing data, statistics are logged (though it's only failures because the web app only allows POST requests). Any idea what's going on here?
The catch_response=True argument is the culprit.
From the documentation:
catch_response – (optional) Boolean argument that, if set, can be used to make a request return a context manager to work as argument to a with statement. This will allow the request to be marked as a fail based on the content of the response, even if the response code is ok (2xx). The opposite also works, one can use catch_response to catch a request and then mark it as successful even if the response code was not (i.e 500 or 404).
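In other words, with catch_response=True the request is handed back as a context manager and is only recorded once you mark it. A sketch of the corrected task, reusing the names from the question:
@task
def get_v1_prediction(self):
    route = "/" + cfg.lookup("model.v1.route")
    with self.client.post(
        route,
        json=self.v1_data,
        catch_response=True,
        name="API Call"
    ) as response:
        if response.status_code == 200:
            response.success()
        else:
            response.failure("Unexpected status: %s" % response.status_code)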

Python POST request does not take form data with no files

Before downvoting/marking as duplicate, please note:
I have already tried out this, this, this, this, this, this - basically almost all the methods pointed out in the Requests documentation - but I have not found a solution.
Problem:
I want to make a POST request with a set of headers and form data.
There are no files to be uploaded. In Postman, we set the parameters by selecting 'form-data' under the 'Body' section of the request.
Here is the code I have:
headers = {
    'authorization': token_string,
    # I get an 'unsupported application/x-www-form-url-encoded' error if I remove this line
    'content-type': 'multipart/form-data; boundary=----WebKitFormBoundaryxxxxxXXXXX12345',
}
body = {
    'foo1': 'bar1',
    'foo2': 'bar2',
    # ... and other form data, NO FILE UPLOADED
}
# I have also tried the below approach
payload = dict()
payload['foo1'] = 'bar1'
payload['foo2'] = 'bar2'

page = ''
page = requests.post(url, proxies=proxies, headers=headers,
                     json=body, files=json.dump(body))
# also tried data=body, data=payload, and files={} when passing the data values
Error
{"errorCode":404,"message":"Required String parameter 'foo1' is not
present"}
EDIT:
Adding a trace from the network console. I am defining the fields in my payload the same way they appear in the request payload.
Is there no GUI at all? If there is, you could grab the network data from Chrome's developer tools. Anyway, try this:
headers = {'authorization': token_string}
(Probably there is more to the authorization, or something else specific to your API.)
You shouldn't set Content-Type yourself; requests will handle it for you.
Important: the content type shows a WebKitFormBoundary, so the request is multipart form data, and for the payload you must take each field's key from the "name" attribute in the form data.
Example:
(I know you won't upload any file; it's just an example.)
In this case my payload would look like this: payload = {'photo': 'myphoto'} (yes, there would be an open file and so on, but I'm keeping it simple).
So your payload would be this (always use the name from the WebKit form data):
payload = {'foo1': 'foo1data',
           'foo2': 'foo2data'}
session.post(url, data=payload, proxies=proxies, ...)
Important: since you are using the requests library, you should first create a session:
session = requests.Session()
It will persist cookies, headers, and the underlying connection, rather than starting a fresh session with every requests.get/post call.
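Putting that together, a sketch under the same assumptions (url, proxies, token_string, and the field names come from the question). If the server genuinely requires multipart/form-data, requests can generate the body and the boundary itself when you pass each field through files= as a (filename, value) tuple with no filename:
import requests

session = requests.Session()
session.headers['authorization'] = token_string
session.proxies = proxies

# Plain form fields, sent as application/x-www-form-urlencoded:
page = session.post(url, data={'foo1': 'bar1', 'foo2': 'bar2'})

# If the server insists on multipart/form-data, send each field as a
# file part with no filename; requests builds the boundary for you:
page = session.post(url, files={'foo1': (None, 'bar1'), 'foo2': (None, 'bar2')})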

HTTP Get Request "Moved Permanently" using HttpLib

Scope:
I am currently trying to write a web scraper for this specific page. I have a pretty strong web-crawling background in C#, but httplib is getting the better of me.
Problem:
When trying to make an HTTP GET request for the page specified above, I get a "Moved Permanently" response that points to the very same URL. I can make the request using the requests lib, but I want to make it work using httplib so I can understand what I am doing wrong.
Code Sample:
I am completely new to Python, so any wrong language guideline or syntax is C#'s fault.
import httplib

# Wrapper for an "HTTP GET" request
class HttpClient(object):
    def HttpGet(self, url, host):
        connection = httplib.HTTPConnection(host)
        connection.request('GET', url)
        return connection.getresponse().read()

# Using the "HttpClient" class
httpclient = HttpClient()
# This is the full URL I need to make a GET request for: https://420101.com/strain-database
httpResponseText = httpclient.HttpGet('/strain-database', 'www.420101.com')
print httpResponseText
I really want to make it work using the httplib library, instead of requests or any other fancy one because I feel like I am missing something really small here.
The problem: I've had too little or too much caffeine in my system.
To GET an https URL, I needed the HTTPSConnection class.
Also, there is no 'www' in the address I wanted to GET, so it shouldn't be included in the host.
Both of the wrong addresses redirect to the correct one with a 301 status code. If I were using requests or another more full-featured module, it would have followed the redirect automatically.
My Validation:
c = httplib.HTTPSConnection('420101.com')
c.request("GET", "/strain-database")
r = c.getresponse()
print r.status, r.reason
200 OK
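For completeness, a sketch of following the 301 by hand, since httplib won't do it for you (unlike requests):
import httplib

c = httplib.HTTPSConnection('www.420101.com')  # one of the 'wrong' hosts
c.request('GET', '/strain-database')
r = c.getresponse()
if r.status == 301:
    # httplib does not follow redirects; the new address is in the Location header
    print 'redirected to', r.getheader('location')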

How to debug urllib2 requests via proxy

I'm making HTTP requests with Python's urllib2 which go through a proxy.
proxy_handler = urllib2.ProxyHandler({'http': 'http://myproxy'})
opener = urllib2.build_opener(proxy_handler)
urllib2.install_opener(opener)
r = urllib2.urlopen('http://www.pbr.com')
I'd like to log all headers from this request. I know that using a standard HTTPHandler you can do:
handler = urllib2.HTTPHandler(debuglevel=1)
Is there something like this for ProxyHandler?
I'm pretty sure debuglevel isn't documented.
In practice, it's actually a feature of httplib that urllib2 just forwards along for convenience, so you don't have to pass lambda: httplib.HTTPConnection(debuglevel=1) in place of the default httplib.HTTPConnection as your HTTP object factory. So, you're unlikely to find anything similar in any of the other handlers.
But if you want to rely on an undocumented feature of the implementation, you're really going to need to read the source to see for yourself.
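That said, proxied requests still go through the HTTP handler, so passing debuglevel=1 there works together with a ProxyHandler; a sketch using the setup from the question:
import urllib2

opener = urllib2.build_opener(
    urllib2.ProxyHandler({'http': 'http://myproxy'}),
    urllib2.HTTPHandler(debuglevel=1),  # httplib prints the outgoing request line and headers
)
urllib2.install_opener(opener)
r = urllib2.urlopen('http://www.pbr.com')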
At any rate, the obvious way to add debugging to any of the handlers is to subclass them and do it yourself. For example:
class LoggingProxyHandler(urllib2.ProxyHandler):
    def proxy_open(self, req, proxy, type):
        had_proxy = req.has_proxy()
        # ProxyHandler is an old-style class in Python 2, so no super() here
        response = urllib2.ProxyHandler.proxy_open(self, req, proxy, type)
        if not had_proxy and req.has_proxy():
            pass  # log stuff here
        return response
I'm relying on internal knowledge that ProxyHandler calls set_proxy on the request if it doesn't have one and needs one. It might be cleaner to instead examine the response… but you may not get all the information you want that way.
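Usage is the same as with the stock handler, for example:
opener = urllib2.build_opener(LoggingProxyHandler({'http': 'http://myproxy'}))
urllib2.install_opener(opener)
r = urllib2.urlopen('http://www.pbr.com')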

http PUT method in python mechanize

I am using the Python mechanize library and I am trying to use the HTTP PUT method on a URL, but I can't find any option for this. I see only GET and POST methods...
If the PUT method is not available, maybe someone can tell me a better library for doing this?
One possible solution:
class PutRequest(mechanize.Request):
    "Extend the mechanize Request class to allow an HTTP PUT"
    def get_method(self):
        return "PUT"
You can then use this when making a request like this:
browser.open(PutRequest(url, data=your_encoded_params, headers=your_headers))
NOTE: I arrived at this solution by digging into the mechanize package code to find out where mechanize sets the HTTP method. I noticed that when we call mechanize.Request, we are using the Request class in _request.py, which in turn extends the Request class in _urllib2_fork.py. The HTTP method is actually set in get_method of the Request class in _urllib2_fork.py, and it turns out that get_method there allows only GET and POST. To get past this limitation, I ended up writing my own PUT and DELETE classes that extend mechanize.Request but override only get_method().
Use Requests:
>>> import requests
>>> result = requests.put("http://httpbin.org/put", data='hello')
>>> result.text
Per documentation:
requests.put(url, data=None, **kwargs)
Sends a PUT request. Returns Response object.
Parameters:
url – URL for the new Request object.
data – (optional) Dictionary or bytes to send in the body of the Request.
**kwargs – Optional arguments that request takes.
Via Mechanize:
import mechanize
import json
class PutRequest(mechanize.Request):
def get_method(self):
return 'PUT'
browser = mechanize.Browser()
browser.open(
PutRequest('http://example.com/',
data=json.dumps({'locale': 'en'}),
headers={'Content-Type': 'application/json'}))
See also http://qxf2.com/blog/python-mechanize-the-missing-manual/ (probably outdated).
Requests does it in a nicer way, as Key Zhu said.
