Is there any standard way of getting JSON data from a RESTful service using Python?
I need to use Kerberos for authentication.
A snippet would help.
I would give the requests library a try for this. It is essentially a much easier-to-use wrapper around the lower-level HTTP modules (e.g. urllib2, httplib) you would otherwise use for the same thing. For example, fetching JSON data from a URL that requires basic authentication looks like this:
import requests
response = requests.get('http://thedataishere.com',
                        auth=('user', 'password'))
data = response.json()
For Kerberos authentication, the requests project has the requests-kerberos library, which provides a Kerberos authentication class that you can use with requests:
import requests
from requests_kerberos import HTTPKerberosAuth
response = requests.get('http://thedataishere.com',
                        auth=HTTPKerberosAuth())
data = response.json()
Something like this should work unless I'm missing the point:
import json
import urllib2
json.load(urllib2.urlopen("url"))
You basically need to make an HTTP request to the service and then parse the body of the response. I like to use httplib2 for it:
import httplib2 as http
import json
try:
    from urlparse import urlparse
except ImportError:
    from urllib.parse import urlparse
headers = {
    'Accept': 'application/json',
    'Content-Type': 'application/json; charset=UTF-8'
}
uri = 'http://yourservice.com'
path = '/path/to/resource/'
target = urlparse(uri+path)
method = 'GET'
body = ''
h = http.Http()
# If the service needs authentication, add credentials, for example:
# ("auth" is a placeholder for whatever holds your user name and password)
auth = None
if auth:
    h.add_credentials(auth.user, auth.password)
response, content = h.request(
    target.geturl(),
    method,
    body,
    headers)
# Assume that content is a JSON reply and parse it with the json module
data = json.loads(content)
If you want to use Python 3, you can use the following:
import json
import urllib.request
req = urllib.request.Request('url')
with urllib.request.urlopen(req) as response:
    # read() returns bytes; readall() is not available on Python 3.5+
    result = json.loads(response.read().decode('utf-8'))
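If you do this in more than one place, a small helper might be handy. This is only a sketch using the standard library: fetch_json and its timeout parameter are names I made up for illustration. It takes the charset from the response headers and falls back to UTF-8 when none is declared.

```python
import json
import urllib.request


def fetch_json(url, timeout=10):
    """Fetch a URL and decode its body as JSON (hypothetical helper)."""
    req = urllib.request.Request(url, headers={'Accept': 'application/json'})
    with urllib.request.urlopen(req, timeout=timeout) as response:
        # Use the charset declared by the server, defaulting to UTF-8
        charset = response.headers.get_content_charset() or 'utf-8'
        return json.loads(response.read().decode(charset))
```

Usage would be something like data = fetch_json('http://yourservice.com/path/to/resource/'), where the URL is a placeholder.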
Well, first of all, I think that to roll your own solution for this all you need is urllib2 or httplib2. Anyway, in case you do require a generic REST client, check this out:
https://github.com/scastillo/siesta
However, I think the feature set of the library will not work for most web services, because they will probably be using OAuth etc. Also, I don't like the fact that it is written on top of httplib, which is a pain compared to httplib2. Still, it should work for you if you don't have to handle a lot of redirections etc.
Related
I am able to perform a web request and get back the response, using urllib.
from urllib import request
from urllib.parse import urlencode
response = request.urlopen(req, data=login_data)
content = response.read()
I get back something like b'{"token":"abcabcabc","error":null}'
How can I parse the token information?
You can use the json module to load the binary string data and then access the token property:
token = json.loads(bin_data)['token']
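A slightly fuller sketch of the same idea, using the byte string from the question as sample data. Decoding explicitly keeps it working on Python versions older than 3.6, where json.loads only accepts str. The variable names are just illustrative.

```python
import json

# Sample response body, copied from the question
content = b'{"token":"abcabcabc","error":null}'

# Decode the bytes to text, then parse the JSON document
parsed = json.loads(content.decode('utf-8'))
token = parsed['token']
print(token)
```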
I need to send data with Python requests as application/x-www-form-urlencoded. I couldn't find the answer. It must be in that format, otherwise the website won't let me through :(
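A simple request should work. Pass the form fields as a dict to the data parameter; requests then form-encodes them and sets the Content-Type header to application/x-www-form-urlencoded:
import requests
url = 'http://example.com/login'  # placeholder, use your own URL
r = requests.post(url, data={'username': 'login', 'password': 'password'})
Do not use the json parameter here, because it sends the body as application/json instead:
import requests
r = requests.post(url, json={'username': 'login', 'password': 'password'})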
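To see what an application/x-www-form-urlencoded body actually looks like, here is a minimal sketch using only the standard library. Nothing is sent over the network; the URL and the field values are placeholders.

```python
from urllib.parse import urlencode
from urllib.request import Request

form = {'username': 'login', 'password': 'password'}

# Form-encode the fields: key=value pairs joined by '&'
body = urlencode(form).encode('ascii')

# Build (but do not send) a request carrying the encoded body
req = Request(
    'http://example.com/login',  # placeholder URL
    data=body,
    headers={'Content-Type': 'application/x-www-form-urlencoded'},
)
print(body)
print(req.get_method())  # a request with a body defaults to POST
```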
I need to write a Python script that accesses a URL internal to my organization. I have an auth token.
How should my script look?
At the moment I have this:
import json
import requests
from pprint import pprint
path='/Users/Documents/sample_2.dat'
for url in open(path):
    print url[1:-2]
    headers = {'Content-type': 'application/json'}
    response = requests.get(url[1:-2], headers=headers)
    field_value = response.json()
    print field_value["externals"]
sample_2.dat contains two URLs, one below the other.
Example:
"http://xxx.abc.com/mfc/abc/v1/ext_info?id=1841261718,3421035156,B0185LBO7I,B0082SIL3K,B000PS8P3Q,B00G441OMY,0793522048,B00B12D2WY,3637015080,B00TNOUNVU&fields=ex.title,ex.url&fieldgroups=default"
"http://xxx.abc.com/mfc/abc/v1/ext_info?id=0553153617,B003W0CI6Y,B000R08E7Y,B001O2SAAU,B00B1MP3MG,B00QRHJBPU,B00007B4DC,0852597088,B0000003H4,1937715213&fields=ex.title,ex.url&fieldgroups=default"
Perhaps this, which can be found in the documentation, might be useful:
For GET requests that might require basic authentication, you can include the auth parameter as follows:
response = requests.get('https://api.github.com/user', auth=('user','pass'))
As you can see, it is as simple as adding the auth parameter inside your get request.
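Since the question mentions an auth token rather than a user name and password, the token usually goes into a request header instead. The sketch below uses only the standard library and makes a few assumptions: the "Bearer" scheme is a guess (your service may expect a different scheme or header name), and build_request and fetch_externals are hypothetical helper names. It also strips the quotes and the newline from each line of the file, which is more robust than slicing with url[1:-2].

```python
import json
import urllib.request


def build_request(url, token):
    """Build a GET request carrying the token in an Authorization header.

    The "Bearer" scheme is an assumption; adjust it to your service.
    """
    headers = {
        'Accept': 'application/json',
        'Authorization': 'Bearer ' + token,
    }
    # Remove surrounding whitespace/newline and the enclosing double quotes
    return urllib.request.Request(url.strip().strip('"'), headers=headers)


def fetch_externals(url, token):
    """Send the request and return the "externals" field of the JSON reply."""
    with urllib.request.urlopen(build_request(url, token)) as response:
        return json.loads(response.read().decode('utf-8'))['externals']
```

With requests, the same headers dictionary can be passed as headers=... to requests.get.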
I am trying to login to http://127.0.0.1/dvwa/login.php, with Python requests.post method.
Currently I am doing as follows:
import requests
payload = {'username':'admin','password':'password'}
response = requests.post('http://127.0.0.1/dvwa/login.php', data=payload)
However it does not seem to be working. I should be getting a 301 status code from the response object, but I am only receiving 200 codes. I've also taken the cookies from my browser and set them in the requests object; however, this does not work, and also defeats the purpose of what I am trying to do.
I've also tried the following with no luck:
from requests.auth import HTTPBasicAuth
import requests
response = requests.get("http://127.0.0.1/dvwa/login.php",auth=HTTPBasicAuth('admin','password'))
and
from requests.auth import HTTPBasicAuth
import requests
cookies = {'PHPSESSID':'07761e3f52ae72fa7d0e2c57569c32a7'}
response = requests.get("http://127.0.0.1/dvwa/login.php",auth=HTTPBasicAuth('admin','password'),cookies=cookies)
None of the above methods give the result I require/want, which is simply logging in.
By default, requests will follow redirects. response.status_code will be the status code of the ultimate location. If you want to check if you've been redirected, look at response.history.
import requests
response = requests.get("http://google.com/") #301 redirects to 'www.google.com'
response.status_code
#200
response.history
#[<Response [301]>]
response.url
#'http://www.google.com/'
Additionally, a good way to have requests keep track of your session/cookies is by using requests.Session
import requests
with requests.Session() as sesh:
    sesh.post(the_url, data=payload)
    # do more stuff in session
I appreciate your answer; however, I found the answer to my question myself. Here it is in case anyone else has the same issue.
Instead of:
import requests
response = requests.post('http://127.0.0.1/dvwa/login.php',data={'username':'admin','password':'password'})
You also need to include the Login field in the payload, as follows:
import requests
response = requests.post('http://127.0.0.1/dvwa/login.php',data={'username':'admin','password':'password','Login':'Login'})
It then logs me in correctly.
Today I needed to retrieve data from the HTTP response headers. Since I've never done it before, and there is not much to find on Google about this, I decided to ask my question here.
So, the actual question: how does one print the HTTP response header data in Python? I'm working in Python 3.5 with the requests module and have yet to find a way to do this.
Update: Based on the OP's comment that only the response headers are needed, this is even easier, as described in the Requests documentation:
We can view the server's response headers using a Python dictionary:
>>> r.headers
{
    'content-encoding': 'gzip',
    'transfer-encoding': 'chunked',
    'connection': 'close',
    'server': 'nginx/1.0.4',
    'x-runtime': '148ms',
    'etag': '"e1ca502697e5c9317743dc078f67693f"',
    'content-type': 'application/json'
}
And especially the documentation notes:
The dictionary is special, though: it's made just for HTTP headers. According to RFC 7230, HTTP Header names are case-insensitive.
So, we can access the headers using any capitalization we want:
and goes on to explain even more cleverness concerning RFC compliance.
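As a rough illustration of that behaviour without making a request, the sketch below builds the same kind of dictionary by hand. It assumes requests is installed; requests.structures.CaseInsensitiveDict is the class used for r.headers, and the header values are sample data taken from the output above.

```python
from requests.structures import CaseInsensitiveDict

# Sample headers, as they might appear on a response object
headers = CaseInsensitiveDict({
    'Content-Type': 'application/json',
    'Server': 'nginx/1.0.4',
})

# Lookups ignore the capitalization of the header name
print(headers['content-type'])
print(headers.get('CONTENT-TYPE'))
print('server' in headers)
```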
The Requests documentation states:
Using Response.iter_content will handle a lot of what you would otherwise have to handle when using Response.raw directly. When streaming a download, the above is the preferred and recommended way to retrieve the content.
It offers as example:
>>> r = requests.get('https://api.github.com/events', stream=True)
>>> r.raw
<requests.packages.urllib3.response.HTTPResponse object at 0x101194810>
>>> r.raw.read(10)
'\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x03'
But also offers advice on how to do it in practice by redirecting to a file etc. and using a different method:
Using Response.iter_content will handle a lot of what you would otherwise have to handle when using Response.raw directly
How about something like this:
import urllib2
req = urllib2.Request('http://www.google.com/')
res = urllib2.urlopen(req)
print res.info()
res.close()
If you are looking for something specific in the header:
For Date: print res.info().get('Date')
Here's how you get just the response headers using the requests library like you mentioned (implementation in Python3):
import requests
url = "https://www.google.com"
response = requests.head(url)
print(response.headers) # prints the entire header as a dictionary
print(response.headers["Content-Length"]) # prints a specific section of the dictionary
It's important to use .head() instead of .get(); otherwise you will retrieve the whole file/page, as the other answers do.
If you wish to retrieve a URL that requires authentication you can replace the above response with this:
response = requests.head(url, auth=requests.auth.HTTPBasicAuth(username, password))
Easy:
import requests
site = "https://www.google.com"
headers = requests.get(site).headers
print(headers)
If you want something specific, look it up by header name, for example:
print(headers.get("Content-Type"))
I'm using the urllib module, with the following code:
from urllib import request
with request.urlopen(url, data) as f:
    print(f.getcode())  # http response code
    print(f.info())  # all header info
    resp_body = f.read().decode('utf-8')  # response body
It's very easy. You can type:
print(response.headers)
or, my favorite:
print(requests.get('url').headers)
You can also use this to get the response body:
print(requests.get('url').content)
Try res.headers and that's all. You will get the response headers ;)
import pprint
import requests
res = requests.request("GET", "https://google.com")
pprint.PrettyPrinter(indent=2).pprint(dict(res.headers))