Today I needed to retrieve data from an HTTP response header. Since I've never done this before and couldn't find much about it on Google, I decided to ask my question here.
So, the actual question: how does one print the HTTP response header data in Python? I'm working in Python 3.5 with the requests module and have yet to find a way to do this.
Update: Based on the OP's comment, only the response headers are needed. That is even easier, as described in the Requests documentation quoted below:
We can view the server's response headers using a Python dictionary:
>>> r.headers
{
'content-encoding': 'gzip',
'transfer-encoding': 'chunked',
'connection': 'close',
'server': 'nginx/1.0.4',
'x-runtime': '148ms',
'etag': '"e1ca502697e5c9317743dc078f67693f"',
'content-type': 'application/json'
}
And especially the documentation notes:
The dictionary is special, though: it's made just for HTTP headers. According to RFC 7230, HTTP Header names are case-insensitive.
So, we can access the headers using any capitalization we want:
and goes on to explain even more cleverness concerning RFC compliance.
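To make the case-insensitivity concrete without hitting a server, here is a minimal sketch. It assumes that r.headers behaves like requests.structures.CaseInsensitiveDict, and the dictionary below is only a stand-in filled with two of the headers shown above:

```python
from requests.structures import CaseInsensitiveDict

# Stand-in for r.headers, populated with two of the headers shown above.
headers = CaseInsensitiveDict({
    "content-type": "application/json",
    "server": "nginx/1.0.4",
})

print(headers["Content-Type"])   # same value as headers["content-type"]
print(headers.get("SERVER"))     # capitalization does not matter
print(headers.get("x-missing"))  # .get() returns None instead of raising KeyError
```

Using .get() is usually the safer choice when a header is optional, since indexing with [] raises KeyError for a header the server did not send.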
The Requests documentation states:
Using Response.iter_content will handle a lot of what you would otherwise have to handle when using Response.raw directly. When streaming a download, the above is the preferred and recommended way to retrieve the content.
It offers as example:
>>> r = requests.get('https://api.github.com/events', stream=True)
>>> r.raw
<requests.packages.urllib3.response.HTTPResponse object at 0x101194810>
>>> r.raw.read(10)
'\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x03'
It also gives advice on how to do this in practice, for example by streaming the download to a file with Response.iter_content rather than reading Response.raw directly.
How about something like this:
# Python 2 only: urllib2 became urllib.request in Python 3
import urllib2
req = urllib2.Request('http://www.google.com/')
res = urllib2.urlopen(req)
print res.info()  # all response headers
res.close()
If you are looking for something specific in the header:
For Date: print res.info().get('Date')
Here's how you get just the response headers using the requests library, as you mentioned (implemented in Python 3):
import requests
url = "https://www.google.com"
response = requests.head(url)
print(response.headers) # prints the entire header as a dictionary
print(response.headers.get("Content-Length")) # prints a single header value, or None if the server did not send it
It's important to use .head() instead of .get(); otherwise you will retrieve the whole file/page, as the other answers do.
If you wish to retrieve a URL that requires authentication you can replace the above response with this:
response = requests.head(url, auth=requests.auth.HTTPBasicAuth(username, password))
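To see the difference between HEAD and GET without depending on a real site, here is a rough sketch that uses only the standard library. The local server, DemoHandler and BODY are made up for the demonstration; with requests, the equivalent client calls would be requests.head(url) and requests.get(url).

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib import request

BODY = b"<html>a page we do not want to download</html>"


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical local server standing in for a real website."""

    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()

    def do_HEAD(self):
        self._send_headers()  # headers only, no body

    def do_GET(self):
        self._send_headers()
        self.wfile.write(BODY)

    def log_message(self, format, *args):
        pass  # silence per-request logging


server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0 = any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:{}/".format(server.server_port)

# Ignore proxy environment variables for this local demo.
opener = request.build_opener(request.ProxyHandler({}))

with opener.open(request.Request(url, method="HEAD")) as head_resp:
    head_headers = dict(head_resp.headers.items())
    head_body = head_resp.read()

with opener.open(request.Request(url, method="GET")) as get_resp:
    get_body = get_resp.read()

server.shutdown()
server.server_close()

# The HEAD response carries the headers (including Content-Length) but no body.
print(head_headers["Content-Length"], len(head_body), len(get_body))
```

Note that this only works when the server implements HEAD; some servers answer HEAD with an error or with different headers than GET.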
Easy:
import requests
site = "https://www.google.com"
headers = requests.get(site).headers
print(headers)
If you want a specific header:
print(headers["Content-Type"])
I'm using the urllib module, with the following code:
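from urllib import request
url = "https://www.example.com"  # assumed URL, replace with your own
data = None  # bytes to POST, or None for a plain GET
with request.urlopen(url, data) as f:
    print(f.getcode())  # http response code
    print(f.info())  # all header info
    resp_body = f.read().decode('utf-8')  # response body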
It's very easy, you can type
print(response.headers)
or my favourite
print(requests.get('url').headers)
You can also print the response body with
print(requests.get('url').content)
Try to use req.headers and that's all. You will get the response headers ;)
import pprint
import requests
res = requests.request("GET", "https://google.com")
pprint.PrettyPrinter(indent=2).pprint(dict(res.headers))
Is there a way in Python to get the headers from a URL in any format, like Charles Proxy does?
Yes, there is a way to get the headers from a URL in Python. The requests module provides a way to do this.
import requests
url = "https://www.google.com"
response = requests.head(url)
print(response.headers) # prints the entire header as a dictionary
print(response.headers["Content-Length"]) # prints a specific section of the dictionary
https://www.folkstalk.com/tech/python-get-response-headers-with-code-examples/
I am trying to read JSON-formatted data from the following public URL: http://ws-old.parlament.ch/factions?format=json. Unfortunately, I have not been able to convert the response to JSON, as I always get HTML-formatted content back from my request. The request seems to completely ignore the JSON format parameter passed in the URL:
import urllib.request
response = urllib.request.urlopen('http://ws-old.parlament.ch/factions?format=json')
response_text = response.read()
print(response_text) #why is this HTML?
Does anybody know how I can get the JSON-formatted content as displayed in the web browser?
You need to add "Accept": "text/json" to the request headers.
For example, using the requests package:
import requests
r = requests.get('http://ws-old.parlament.ch/factions?format=json',
                 headers={'Accept': 'text/json'})
print(r.json())
Result:
[{'id': 3, 'updated': '2022-02-22T14:59:17Z', 'abbreviation': ...
Unfortunately for you, these web services have a misleading implementation. The format query parameter is useless. As pointed out by @maciek97x, only the Accept: <format> header is considered for the formatting.
So you can call the endpoint directly without the ?format=json, but with the header Accept: text/json.
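The behaviour described above (query parameter ignored, Accept header honoured) can be reproduced with a small self-contained sketch. The NegotiatingHandler below is a hypothetical stand-in for the real service, not its actual implementation, and fetch is just a helper name chosen for this example:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib import request


class NegotiatingHandler(BaseHTTPRequestHandler):
    """Hypothetical service that picks its format from the Accept header only."""

    def do_GET(self):
        # The ?format=json query string is deliberately ignored here.
        if "json" in self.headers.get("Accept", ""):
            content_type, body = "application/json", b'[{"id": 3}]'
        else:
            content_type, body = "text/html", b"<html><body>factions</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


def fetch(url, headers=None):
    """Return (Content-Type, body text) for a GET with optional request headers."""
    opener = request.build_opener(request.ProxyHandler({}))  # skip proxies locally
    with opener.open(request.Request(url, headers=headers or {})) as resp:
        return resp.headers["Content-Type"], resp.read().decode("utf-8")


server = HTTPServer(("127.0.0.1", 0), NegotiatingHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:{}/factions?format=json".format(server.server_port)

html_type, html_text = fetch(url)                                   # no Accept header
json_type, json_text = fetch(url, headers={"Accept": "text/json"})  # with Accept header
data = json.loads(json_text)

server.shutdown()
server.server_close()

print(html_type, json_type, data)
```

Checking the Content-Type of the response before calling a JSON parser is a cheap way to notice this kind of mismatch early.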
I'm new to Python and trying to get some information from IMDb using the requests library. My code is capturing all data (e.g., movie titles) in my native language, but I would like to get them in English.
How can I change the Accept-Language header in requests to do that?
All you need to do is define your own headers:
import requests
url = "http://www.imdb.com/title/tt0089218/"
headers = {"Accept-Language": "en-US,en;q=0.5"}
r = requests.get(url, headers=headers)
You can add whatever other headers you'd like to modify as well.
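If you want to check which headers would actually go out, one option is to prepare the request through a Session without sending it. This is only a sketch: nothing is sent over the network, and the default headers you see depend on your requests version.

```python
import requests

url = "http://www.imdb.com/title/tt0089218/"
headers = {"Accept-Language": "en-US,en;q=0.5"}

# Build the request without sending it, so the outgoing headers can be inspected.
session = requests.Session()
prepared = session.prepare_request(requests.Request("GET", url, headers=headers))

# Custom headers are merged with the session defaults (User-Agent, Accept, ...).
for name, value in prepared.headers.items():
    print(name, ":", value)
```

Headers set on the session itself (session.headers.update(...)) would apply to every request made through it, which can be handy when scraping many pages.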
I am fairly new to this and have been trying to use the requests module to post multipart/form-data. To clarify, the exact test case I am trying to reproduce is the same one as in https://github.com/kennethreitz/requests/issues/1081, i.e. I am trying to do a POST without a file:
--3eeaadbfda0441b8be821bbed2962e4d
Content-Disposition: form-data; name="key1"
value1
--3eeaadbfda0441b8be821bbed2962e4d
As per the discussion on the thread, I tried MultiPart form data scheme to do the following:
import requests
from requests_data_schemes import multipart_formdata as mfd
post_data = [('mouseAction', 'toggle'), ('zone' ,'10')]
post_data = mfd(post_data)
headers = {'Content-Type': 'multipart/form-data'}
req = requests.post(<url>, data=post_data, headers=headers)
However, the test server returns an error saying that it cannot detect the boundary of the multipart form data.
I tried providing the boundary in the header too, but apparently it's not working.
boundary = post_data[2: post_data.find('\r\n')]
headers = {'Content-Type': 'multipart/form-data; boundary={}'.format(boundary)}
Am I missing something simple?
P.S. From a bit of surfing I found a few solutions using base urllib2 but that would be my last resort as requests lets me do a lot things pretty easily.
Yeah this is a bug I have to address. You're better off at this point doing the following:
from requests.packages.urllib3.filepost import encode_multipart_formdata
(content, header) = encode_multipart_formdata([('key', 'value')])
r = requests.post(url, data=content, headers={'Content-Type': header})
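As an alternative to the helper above, more recent versions of requests can build a file-less multipart body through the files argument, using (None, value) tuples. The sketch below only prepares the request, so nothing is sent; the URL is a placeholder and the field names are taken from the question.

```python
import requests

# Hypothetical endpoint; nothing is sent because the request is only prepared.
url = "http://example.com/form"

# A (None, value) tuple means "no filename", so each part is a plain form field.
fields = {"mouseAction": (None, "toggle"), "zone": (None, "10")}
prepared = requests.Request("POST", url, files=fields).prepare()

content_type = prepared.headers["Content-Type"]
print(content_type)            # multipart/form-data with a generated boundary
print(prepared.body.decode())  # the encoded parts, one per field
```

The important detail is that the Content-Type header is not set by hand: requests generates the boundary and writes it into the header itself, which avoids the "cannot detect the boundary" error from the question.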
Is there any standard way of getting JSON data from RESTful service using Python?
I need to use kerberos for authentication.
A snippet would help.
I would give the requests library a try for this. It is essentially a much easier-to-use alternative to the standard library modules (urllib2, httplib, etc.) you would otherwise use for the same thing. For example, fetching JSON data from a URL that requires basic authentication would look like this:
import requests
response = requests.get('http://thedataishere.com',
                        auth=('user', 'password'))
data = response.json()
For Kerberos authentication, the requests project has the requests-kerberos library, which provides a Kerberos authentication class that you can use with requests:
import requests
from requests_kerberos import HTTPKerberosAuth
response = requests.get('http://thedataishere.com',
                        auth=HTTPKerberosAuth())
data = response.json()
Something like this should work unless I'm missing the point:
import json
import urllib2
json.load(urllib2.urlopen("url"))
You basically need to make an HTTP request to the service, and then parse the body of the response. I like to use httplib2 for it:
import httplib2 as http
import json
try:
    from urlparse import urlparse  # Python 2
except ImportError:
    from urllib.parse import urlparse  # Python 3
headers = {
    'Accept': 'application/json',
    'Content-Type': 'application/json; charset=UTF-8'
}
uri = 'http://yourservice.com'
path = '/path/to/resource/'
target = urlparse(uri+path)
method = 'GET'
body = ''
h = http.Http()
# If you need authentication some example:
auth = None  # placeholder: an object with .user and .password attributes
if auth:
    h.add_credentials(auth.user, auth.password)
response, content = h.request(
    target.geturl(),
    method,
    body,
    headers)
# assume that content is a json reply
# parse content with the json module
data = json.loads(content)
If you desire to use Python 3, you can use the following:
import json
import urllib.request
req = urllib.request.Request('url')
with urllib.request.urlopen(req) as response:
    # HTTPResponse has no readall() in Python 3.5+; read() returns the full body
    result = json.loads(response.read().decode('utf-8'))
Well, first of all, I think that to roll your own solution for this, all you need is urllib2 or httplib2. Anyway, in case you do require a generic REST client, check this out:
https://github.com/scastillo/siesta
However, I think the feature set of this library will not work for most web services, because they probably use OAuth etc. Also, I don't like the fact that it is written on top of httplib, which is a pain compared to httplib2. Still, it should work for you if you don't have to handle a lot of redirections etc.