Sending HTTP requests with a specific/non-existent HTTP version protocol in Python

Is there some way to send HTTP requests in Python with a specific HTTP version protocol? I think that, with httplib or urllib, it is not possible.
For example: GET / HTTP/6.9
Thanks in advance.

The simple answer to your question is: You're right, neither httplib nor urllib has public, built-in functionality to do this. (Also, you really shouldn't be using urllib for most things, in particular for urlopen.)
Of course you can always rely on implementation details of those modules, as in Lukas Graf's answer.
Or, alternatively, you could fork one of those modules and modify it, which guarantees that your code will work on other Python 2.x implementations.* Note that httplib is one of those modules that has a link to the source up at the top of its documentation, which means it's meant to serve as example code, not just as a black-box library.
Or you could just reimplement the lowest-level function that needs to be hooked but that's publicly documented. For httplib, I believe that's httplib.HTTPConnection.putrequest, which is a few hundred lines long.
Or you could pick a different library that has more hooks in it, so you have less to hook.
But really, if you're trying to craft a custom request to manually fingerprint the results, why are you using an HTTP library at all? Why not just do this?
import socket
from contextlib import closing
from functools import partial

host = 'www.stackoverflow.com'  # substitute whichever host you want to probe
msg = 'GET / HTTP/6.9\r\n\r\n'
s = socket.create_connection((host, 80))
with closing(s):
    s.send(msg)
    buf = ''.join(iter(partial(s.recv, 4096), ''))
* That's not much of a benefit, given that there will never be a 2.8, all of the existing major 2.7 implementations share the same source for this module, and it's not likely any new 2.x implementation will be any different. And if you go to 3.x, httplib has been reorganized and renamed to http.client, while urllib has been split up and reorganized, so you'll already have bigger changes to worry about.
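For what it's worth, here's a minimal sketch of the same raw-socket probe under Python 3 naming (illustrative only; the host is a placeholder):
import socket
from functools import partial

host = 'example.com'  # placeholder: substitute the server you want to probe
msg = b'GET / HTTP/6.9\r\n\r\n'  # bytes, not str, under Python 3

# socket objects are context managers in Python 3, so closing() isn't needed
with socket.create_connection((host, 80)) as s:
    s.sendall(msg)
    buf = b''.join(iter(partial(s.recv, 4096), b''))

print(buf.decode('latin-1', 'replace'))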

You can do it easily enough by subclassing httplib.HTTPConnection and redefining the class attribute _http_vsn_str:
from httplib import HTTPConnection

class MyHTTPConnection(HTTPConnection):
    _http_vsn_str = '6.9'

conn = MyHTTPConnection("www.stackoverflow.com")
conn.request("GET", "/")
response = conn.getresponse()
print "Status: {} {}".format(response.status, response.reason)
print "Headers: {}".format(response.getheaders())
print "Body: {}".format(response.read())
Of course this will result in a 400 Bad Request for most servers:
Status: 400 Bad Request
Headers: [('date', 'Tue, 11 Nov 2014 21:21:12 GMT'), ('connection', 'close'), ('content-type', 'text/html; charset=us-ascii'), ('content-length', '311')]
Body: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN""http://www.w3.org/TR/html4/strict.dtd">
<HTML><HEAD><TITLE>Bad Request</TITLE>
<META HTTP-EQUIV="Content-Type" Content="text/html; charset=us-ascii"></HEAD>
<BODY><h2>Bad Request</h2>
<hr><p>HTTP Error 400. The request is badly formed.</p>
</BODY></HTML>

This is possible using pycurl with this option:
c.setopt(pycurl.HTTP_VERSION, pycurl.CURL_HTTP_VERSION_1_0)
However, you need to use Linux or Mac, since pycurl is not officially supported on Windows.
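In context, a minimal sketch of a complete request with that option might look like this (assuming a reasonably recent pycurl; the URL is a placeholder):
import pycurl
from io import BytesIO

buf = BytesIO()
c = pycurl.Curl()
c.setopt(pycurl.URL, 'http://example.com/')                  # placeholder URL
c.setopt(pycurl.HTTP_VERSION, pycurl.CURL_HTTP_VERSION_1_0)  # force HTTP/1.0
c.setopt(pycurl.WRITEDATA, buf)                              # collect the response body
c.perform()
c.close()
print buf.getvalue()
Note that HTTP_VERSION only selects among versions libcurl actually supports, so it won't let you send an arbitrary version string like 6.9.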

Related

HTTP Get Request "Moved Permanently" using HttpLib

Scope:
I am currently trying to write a web scraper for this specific page. I have a pretty strong "Web Crawling" background using C#, but this httplib is getting the better of me.
Problem:
When trying to make an HTTP GET request for the page specified above, I get a "Moved Permanently" that points to the very same URL. I can make the request using the requests lib, but I want to make it work using httplib so I can understand what I am doing wrong.
Code Sample:
I am completely new to Python, so any wrong language guideline or syntax is C#'s fault.
import httplib

# Wrapper for a "HTTP GET" Request
class HttpClient(object):
    def HttpGet(self, url, host):
        connection = httplib.HTTPConnection(host)
        connection.request('GET', url)
        return connection.getresponse().read()

# Using "HttpClient" class
httpclient = HttpClient()

# This is the full URL I need to make a get request for : https://420101.com/strain-database
httpResponseText = httpclient.HttpGet('/strain-database', 'www.420101.com')
print httpResponseText
I really want to make it work using the httplib library, instead of requests or any other fancy one because I feel like I am missing something really small here.
The problem: I'd had either too little or too much caffeine in my system.
To GET an https URL, I needed the HTTPSConnection class.
Also, there is no 'www' in the address I wanted to GET, so it shouldn't be included in the host.
Both of the wrong addresses redirect to the correct one with a 301 status code. If I were using requests or a more full-featured module, it would have followed the redirect automatically.
My Validation:
c = httplib.HTTPSConnection('420101.com')
c.request("GET", "/strain-database")
r = c.getresponse()
print r.status, r.reason
200 OK
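If you do want plain httplib to cope with the redirecting variants, here is a minimal sketch of following the Location header by hand (purely illustrative; it assumes the redirect target is also served over https and doesn't handle loops or relative Location values):
import httplib
import urlparse

def http_get_following_redirect(host, path):
    # Issue the request over HTTPS and follow a single 301/302 by hand.
    conn = httplib.HTTPSConnection(host)
    conn.request('GET', path)
    response = conn.getresponse()
    if response.status in (301, 302):
        target = urlparse.urlparse(response.getheader('location'))
        conn = httplib.HTTPSConnection(target.netloc)
        conn.request('GET', target.path or '/')
        response = conn.getresponse()
    return response.read()

print http_get_following_redirect('420101.com', '/strain-database')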

bad request when using python with suds for sharepoint

I am using Suds to access SharePoint lists through SOAP, but I am having some trouble with malformed SOAP.
I am using the following code:
from suds.client import Client
from suds.sax.element import Element
from suds.sax.attribute import Attribute
from suds.transport.https import WindowsHttpAuthenticated
import logging
logging.basicConfig(level=logging.INFO)
logging.getLogger('suds.client').setLevel(logging.DEBUG)
ntlm = WindowsHttpAuthenticated(username='somedomain\\username', password='password')
url = "http://somedomain/sites/somesite/someothersite/somethirdsite/_vti_bin/Lists.asmx?WSDL"
client = Client(url, transport=ntlm)
result = client.service.GetListCollection()
print repr(result)
Every time I run this, I get the result Error 400 Bad Request. As I have debugging enabled, I can see the resulting envelope:
<SOAP-ENV:Envelope xmlns:ns0="http://schemas.xmlsoap.org/soap/envelope/" xmlns:ns1="http://schemas.microsoft.com/sharepoint/soap/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Header/>
  <ns0:Body>
    <ns1:GetListCollection/>
  </ns0:Body>
</SOAP-ENV:Envelope>
...with this error message:
DEBUG:suds.client:http failed:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN""http://www.w3.org/TR/html4/strict.dtd">
<HTML><HEAD><TITLE>Bad Request</TITLE>
<META HTTP-EQUIV="Content-Type" Content="text/html; charset=us-ascii"></HEAD>
<BODY><h2>Bad Request</h2>
<hr><p>HTTP Error 400. The request is badly formed.</p>
</BODY></HTML>
Running the same WSDL (and raw envelope data as well) through SoapUI, the request returns with values as expected. Can anyone see any obvious reason why I get different results with Suds than with SoapUI, and how I can correct this?
UPDATE: After testing the exact same code on a different SharePoint site (i.e. not a sub-sub-subsite with whitespace in its name) and with Java (JAX-WS, which also had issues with the same site, though different ones), it appears to work as expected. As a result I wonder if one of two details may be the reason for these problems:
SOAP implementations have some issues with subsubsubsites in Sharepoint?
SOAP implementations have some issues with whitespace in its name, even if using %20 as a replacement?
I still have the need to use the original URL with those issues, so any input would be highly appreciated. I assume that since SoapUI worked with the original url, it should be possible to correct whatever is wrong.
I think I narrowed down the issue, and it is specific to suds (possibly other SOAP implementations as well). Your bullet point:
SOAP implementations have some issues with whitespace in its name, even if using %20 as a replacement?
That's spot on. Turning up debug logging for suds allowed me to grab the endpoint, envelope, and headers. Mimicking the exact same call using cURL returns a valid response, but with suds it throws the bad request.
The issue is that suds takes your WSDL (the url parameter) and parses it, but it doesn't preserve the URL-encoded path. This leads to debug messages like this:
DEBUG:suds.transport.http:opening (https://sub.site.com/sites/Site Collection with Spaces/_vti_bin/UserGroup.asmx?WSDL)
<snip>
TransportError: HTTP Error 400: Bad Request
Piping this request through a Fiddler proxy showed that it was running the request against the URL https://sub.site.com/sites/Site due to the way it parses the WSDL. The issue is that you aren't passing the location parameter to suds.client.Client. The following code gives me valid responses every time:
from ntlm3 import ntlm
from suds.client import Client
from suds.transport.https import WindowsHttpAuthenticated

# URL without ?WSDL
url = 'https://sub.site.com/sites/Site%20Collection%20with%20Spaces/_vti_bin/Lists.asmx'

# Create NTLM transport handler
transport = WindowsHttpAuthenticated(username='foo',
                                     password='bar')

# We use FBA, so this forces it to challenge us with
# a 401 so WindowsHttpAuthenticated can take over.
msg = ("%s\\%s" % ('DOM', 'foo'))
auth = 'NTLM %s' % ntlm.create_NTLM_NEGOTIATE_MESSAGE(msg).decode('ascii')

# Create the client and append ?WSDL to the URL.
client = Client(url=(url + "?WSDL"),
                location=url,
                transport=transport)

# Add the NTLM header to force negotiation.
header = {'Authorization': auth}
client.set_options(headers=header)
One caveat: using quote from urllib works, but you cannot encode the entire URL or it will fail to recognize it. You are better off just replacing spaces with %20.
url = url.replace(' ','%20')
Hope this keeps someone else from banging their head against the wall.

Sending SOAP request using Python Requests

Is it possible to use Python's requests library to send a SOAP request?
It is indeed possible.
Here is an example calling the Weather SOAP Service using plain requests lib:
import requests
url="http://wsf.cdyne.com/WeatherWS/Weather.asmx?WSDL"
#headers = {'content-type': 'application/soap+xml'}
headers = {'content-type': 'text/xml'}
body = """<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope xmlns:ns0="http://ws.cdyne.com/WeatherWS/" xmlns:ns1="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
<SOAP-ENV:Header/>
<ns1:Body><ns0:GetWeatherInformation/></ns1:Body>
</SOAP-ENV:Envelope>"""
response = requests.post(url,data=body,headers=headers)
print response.content
Some notes:
The headers are important. Most SOAP requests will not work without the correct headers. application/soap+xml is probably the more correct content type to use (but the weather service prefers text/xml).
This will return the response as a string of xml - you would then need to parse that xml.
For simplicity I have included the request as plain text. But best practice would be to store this as a template; then you can load it using jinja2 (for example) and also pass in variables.
For example:
from jinja2 import Environment, PackageLoader
env = Environment(loader=PackageLoader('myapp', 'templates'))
template = env.get_template('soaprequests/WeatherSericeRequest.xml')
body = template.render()
Some people have mentioned the suds library. Suds is probably the more correct way to be interacting with SOAP, but I often find that it panics a little when you have WSDLs that are badly formed (which, TBH, is more likely than not when you're dealing with an institution that still uses SOAP ;) ).
You can do the above with suds like so:
from suds.client import Client
url="http://wsf.cdyne.com/WeatherWS/Weather.asmx?WSDL"
client = Client(url)
print client ## shows the details of this service
result = client.service.GetWeatherInformation()
print result
Note: when using suds, you will almost always end up needing to use the doctor!
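For example, a minimal sketch of using suds' schema doctor to patch up a badly formed WSDL (the injected namespace here is just a common example, not something specific to the weather service):
from suds.client import Client
from suds.xsd.doctor import Import, ImportDoctor

url = "http://wsf.cdyne.com/WeatherWS/Weather.asmx?WSDL"

# Tell suds to inject the SOAP encoding schema, which badly formed
# WSDLs frequently reference without importing.
imp = Import('http://schemas.xmlsoap.org/soap/encoding/')
doctor = ImportDoctor(imp)

client = Client(url, doctor=doctor)
print client  # shows the details of the service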
Finally, a little bonus for debugging SOAP; TCPdump is your friend. On Mac, you can run TCPdump like so:
sudo tcpdump -As 0
This can be helpful for inspecting the requests that actually go over the wire.
The above two code snippets are also available as gists:
SOAP Request with requests
SOAP Request with suds
Adding to the last answer: make sure you add the following attributes to the headers:
headers={"Authorization": f"bearer {token}", "SOAPAction": "http://..."}
The Authorization header is needed when you require some token to access the SOAP API, while SOAPAction is the action you are going to perform with the data you are sending in. If you don't need Authorization, you can pop that out of the headers. That worked fine for me.
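As a sketch, combining this with the earlier requests call might look like the following (the endpoint, token, and action URI are all placeholders):
import requests

url = "https://example.com/service.asmx"  # placeholder endpoint
headers = {
    "content-type": "text/xml",
    "Authorization": "bearer YOUR_TOKEN_HERE",        # only if the API requires it
    "SOAPAction": "http://example.com/GetSomething",  # the operation you are calling
}
body = """<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Header/>
  <SOAP-ENV:Body>
    <!-- operation payload goes here -->
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""

response = requests.post(url, data=body, headers=headers)
print response.status_code, response.content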

Python httplib POST request and proper formatting

I'm currently working on an automated way to interface with a database website that has RESTful web services installed. I am having issues figuring out how to properly format and send the requests listed on the following site using Python.
https://neesws.neeshub.org:9443/nees.html
A particular example is this:
POST https://neesws.neeshub.org:9443/REST/Project/731/Experiment/1706/Organization
<Organization id="167"/>
The biggest problem is that I do not know where to put the XML-formatted part of the above. I want to send the above as a Python HTTPS request, and so far I've been trying something along the following lines.
>>>import httplib
>>>conn = httplib.HTTPSConnection("neesws.neeshub.org:9443")
>>>conn.request("POST", "/REST/Project/731/Experiment/1706/Organization")
>>>conn.send('<Organization id="167"/>')
But this appears to be completely wrong. I've never actually used Python for web service interfaces, so my primary question is: how exactly am I supposed to use httplib to send the POST request, particularly the XML-formatted part of it? Any help is appreciated.
You need to set some request headers before sending data, for example Content-Type to 'text/xml'. Check out a few examples:
Post-XML-Python-1
which has this code as an example:
import sys, httplib

HOST = "www.example.com"
API_URL = "/your/api/url"

def do_request(xml_location):
    """HTTP XML POST request"""
    request = open(xml_location, "r").read()
    webservice = httplib.HTTP(HOST)
    webservice.putrequest("POST", API_URL)
    webservice.putheader("Host", HOST)
    webservice.putheader("User-Agent", "Python post")
    webservice.putheader("Content-type", "text/xml; charset=\"UTF-8\"")
    webservice.putheader("Content-length", "%d" % len(request))
    webservice.endheaders()
    webservice.send(request)
    statuscode, statusmessage, header = webservice.getreply()
    result = webservice.getfile().read()
    print statuscode, statusmessage, header
    print result

do_request("myfile.xml")
Post-XML-Python-2
That should give you some idea.
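For completeness, here is a minimal sketch against the endpoint from the question, using HTTPSConnection and passing the body and headers straight to request() (untested against that service; the path and payload are the ones from the question):
import httplib

headers = {"Content-type": 'text/xml; charset="UTF-8"'}
body = '<Organization id="167"/>'

conn = httplib.HTTPSConnection("neesws.neeshub.org", 9443)
# request() takes the body and headers directly, so there is no
# separate send() step for the XML payload.
conn.request("POST", "/REST/Project/731/Experiment/1706/Organization", body, headers)
response = conn.getresponse()
print response.status, response.reason
print response.read()
conn.close()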

setting and reading cookies in mod_python (apache)

I've seen plenty of things explaining how to read and write cookies; however, I have no clue how to do it in mod_python under Apache. I tried putting this at the start of my HTML code, but it says to put it in the HTTP header. How do I do that?
Also, how do I retrieve them? I was originally looking mainly at this site:
http://webpython.codepoint.net/cgi_set_the_cookie
My code currently starts like this (and it displays as part of the HTML)
Content-Type: text/html
Set-Cookie: test=1
<html>
<head>
mod_python is not CGI, and provides its own way to set and read cookies:
from mod_python import Cookie, apache
import time

def handler(req):
    # read a cookie
    spam_cookie = Cookie.get_cookie(req, 'spam')

    # set a cookie
    egg_cookie = Cookie.Cookie('eggs', 'spam')
    egg_cookie.expires = time.time() + 300
    Cookie.add_cookie(req, egg_cookie)

    req.write("<html><head></head><body>There's a cookie</body></html>")
    return apache.OK
You'll find more documentation here : http://www.modpython.org/live/current/doc-html/pyapi-cookie.html
