No JSON object could be decoded from the missingkids website - Python

I'm not sure why I am receiving this error. There is a decoder.py file in my python folder.
import requests
import json
import common
session = requests.Session()
uri = "http://www.missingkids.com"
json_srv_uri = uri + "/missingkids/servlet/JSONDataServlet"
search_uri = "?action=publicSearch"
child_detail_uri = "?action=childDetail"
session.get(json_srv_uri + search_uri + "&searchLang=en_US&search=new&subjToSearch=child&missState=CA&missCountry=US") #Change missState=All for all states
response = session.get(json_srv_uri + search_uri + "&searchLang=en_US&goToPage=1")
dct = json.loads(response.text)
pgs = int(dct["totalPages"])
print("found {} pages".format(pgs))
missing_persons = {}

The URL http://www.missingkids.com/missingkids/servlet/ returns a 404 Error. Thus, there is no JSON data for Requests to return. Fixing the URL so that it points to a valid destination will allow Requests to return page content.
To make a search for a missing child registered in that website's database, try this URL: http://www.missingkids.com/gethelpnow/search
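Once the URL points at a valid destination, it is also worth confirming that the response really is JSON before decoding it. A minimal sketch of that guard (the URL below is just the search page above used as a placeholder, not a confirmed JSON endpoint):
import json
import requests

session = requests.Session()
# Placeholder: substitute whichever missingkids URL actually returns JSON for your search
url = "http://www.missingkids.com/gethelpnow/search"

response = session.get(url)
content_type = response.headers.get("Content-Type", "")
if response.status_code == 200 and "json" in content_type:
    dct = json.loads(response.text)
    print("decoded {} top-level entries".format(len(dct)))
else:
    print("no JSON here: status {}, content-type {}".format(response.status_code, content_type))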

After every HTTP call you need to check the status code.
Example
import requests
r = requests.get('my_url')
# status code 'OK' is very popular and its numeric value is 200
# note that there are other status codes as well
if r.status_code == requests.codes.ok:
    # do your thing
    pass
else:
    # we have a problem
    pass

Related

How to post large number of data using requests.post in python?

I have written sample Python code to post log data to the Fluentd endpoint of an EFK stack. When I send 400 logs at a time, the status_code is 200 and I can see all the logs in the Kibana dashboard, but when I send 500 logs at a time, the status_code is 414.
Here is the sample python code:
import sys
import json
from datetime import datetime
import random
import requests
# Per-tenant connection details (hostname, username, password) keyed by tenant id
f = open('/etc/td-agent/data2.json')
data = json.load(f)

input = open(sys.argv[-1])
actions = []
url = ''
u_name = ''
p_word = ''
for line in input:
    temp = json.loads(line)
    tenantid = temp['HTTP_FLUENT_TAG']
    message = temp['message']
    message_json = json.loads(message)
    h_name = data['account_details'][tenantid]['hostname']
    u_name = data['account_details'][tenantid]['username']
    p_word = data['account_details'][tenantid]['password']
    url = 'https://' + h_name
    for element in message_json:
        temp = str(element['date'])
        url = url + '?time=' + temp
        action = {
            "msg": element['log'],
            "id": element['ID']
        }
        actions.append(action)

r = requests.post(url, auth=(u_name, p_word), json=actions)
print(r.status_code)
f.close()
Can anyone please help with how to send a large load at a time to the Fluentd endpoint?
For an Elasticsearch endpoint we can use the elasticsearch API, which also has a bulk feature that helps send a large amount of data at a time. I am looking for whether there is any such way for a Fluentd endpoint.
There are two ways this can be done:
Create a JSON file of request-body JSON objects and zip them so they can all be sent in a single request.
Call the API multiple times, each time with the maximum number of JSON objects possible, as you mentioned above; a batching sketch for this option follows below.
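A minimal sketch of the second option, assuming the actions list built in the question and a hypothetical batch size of 400 (the largest count that worked for you). This simply splits the list and posts each chunk with the same credentials; it is not a Fluentd bulk API:
import requests

BATCH_SIZE = 400  # hypothetical limit; tune to what the endpoint accepts

def post_in_batches(url, actions, auth):
    # Split the accumulated actions into fixed-size chunks and POST each chunk separately
    for start in range(0, len(actions), BATCH_SIZE):
        chunk = actions[start:start + BATCH_SIZE]
        r = requests.post(url, auth=auth, json=chunk)
        print(r.status_code)

# usage with the variables from the question:
# post_in_batches(url, actions, auth=(u_name, p_word))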

(Python) Bittrex API v3 keeps returning invalid content hash

Writing a bot for a personal project, and the Bittrex api refuses to validate my content hash. I've tried everything I can think of and all the suggestions from similar questions, but nothing has worked so far. Tried hashing 'None', tried a blank string, tried the currency symbol, tried the whole uri, tried the command & balance, tried a few other things that also didn't work. Reformatted the request a few times (bytes/string/dict), still nothing.
Documentation says to hash the request body (which seems synonymous with payload in similar questions about making transactions through the api), but it's a simple GET/check-balance request with no payload.
Problem is, I get a 'BITTREX ERROR: INVALID CONTENT HASH' response when I run it.
Any help would be greatly appreciated, this feels like a simple problem but it's been frustrating the hell out of me. I am very new to python, but the rest of the bot went very well, which makes it extra frustrating that I can't hook it up to my account :/
import hashlib
import hmac
import json
import os
import time
import requests
import sys
# Base Variables
Base_Url = 'https://api.bittrex.com/v3'
APIkey = os.environ.get('B_Key')
secret = os.environ.get('S_B_Key')
timestamp = str(int(time.time() * 1000))
command = 'balances'
method = 'GET'
currency = 'USD'
uri = Base_Url + '/' + command + '/' + currency
payload = ''
print(payload) # Payload Check
# Hashes Payload
content = json.dumps(payload, separators=(',', ':'))
content_hash = hashlib.sha512(bytes(json.dumps(content), "utf-8")).hexdigest()
print(content_hash)
# Presign
presign = (timestamp + uri + method + str(content_hash) + '')
print(presign)
# Create Signature
message = f'{timestamp}{uri}{method}{content_hash}'
sign = hmac.new(secret.encode('utf-8'), message.encode('utf-8'),
                hashlib.sha512).hexdigest()
print(sign)
headers = {
    'Api-Key': APIkey,
    'Api-Timestamp': timestamp,
    'Api-Signature': sign,
    'Api-Content-Hash': content_hash
}
print(headers)
req = requests.get(uri, json=payload, headers=headers)
tracker_1 = "Tracker 1: Response =" + str(req)
print(tracker_1)
res = req.json()
if req.ok is False:
    print('bullshit error #1')
    print("Bittrex response: %s" % res['code'], file=sys.stderr)
I can see two main problems:
You are serialising/encoding the payload separately for the hash (with json.dumps and then bytes) and for the request (with the json=payload parameter to requests.get). You don't have any way of knowing how the requests library will format your data, and if even one byte is different you will get a different hash. It is better to convert your data to bytes first, and then use the same bytes for the hash and for the request body.
GET requests do not normally have a body (see this answer for more details), so it might be that the API is ignoring the payload you are sending. You should check the API docs to see if you really need to send a request body with GET requests.
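Building on the first point, here is a minimal sketch that hashes the exact bytes of the request body; for a body-less GET that means hashing an empty byte string (check the Bittrex v3 docs to confirm this is the expected recipe). It reuses the environment-variable names and endpoint from the question:
import hashlib
import hmac
import os
import time
import requests

APIkey = os.environ.get('B_Key')
secret = os.environ.get('S_B_Key')
uri = 'https://api.bittrex.com/v3/balances/USD'
method = 'GET'

# Encode the body exactly once and reuse the same bytes everywhere.
body = b''  # this GET carries no body
content_hash = hashlib.sha512(body).hexdigest()

timestamp = str(int(time.time() * 1000))
presign = timestamp + uri + method + content_hash
signature = hmac.new(secret.encode('utf-8'), presign.encode('utf-8'),
                     hashlib.sha512).hexdigest()

headers = {
    'Api-Key': APIkey,
    'Api-Timestamp': timestamp,
    'Api-Signature': signature,
    'Api-Content-Hash': content_hash,
}

# No json= argument: nothing extra is sent, so the hashed bytes match the body.
req = requests.get(uri, headers=headers)
print(req.status_code, req.text)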

Authenticating with Coinbase's Exchange's API (HMAC) using requests in Python

I am implementing Coinbase's exchange API using custom auth in Python requests. The following code works with all the (authenticated) GET-based calls, but fails for all the authenticated POST-based calls (I haven't tried the DELETE or UPDATE verbs). I don't understand why the signature wouldn't work for both, because the payload is timestamp + method + path for GETs and timestamp + method + path + body for requests with a body, so the custom auth seems correct. Something is going wrong with adding the body and changing GET to POST. Thanks!
You can get your API keys for trying it out here: https://gdax.com/settings
import json, hmac, hashlib, time, requests, base64
from requests.auth import AuthBase
class CoinbaseAuth(AuthBase):
    SIGNATURE_HTTP_HEADER = 'CB-ACCESS-SIGN'
    TIMESTAMP_HTTP_HEADER = 'CB-ACCESS-TIMESTAMP'
    KEY_HTTP_HEADER = 'CB-ACCESS-KEY'
    PASSPHRASE_HTTP_HEADER = 'CB-ACCESS-PASSPHRASE'

    def __init__(self, api_key, secret_key, passphrase):
        self.api_key = api_key
        self.secret_key = secret_key
        self.passphrase = passphrase

    def __call__(self, request):
        # Add key, passphrase, and timestamp headers
        request.headers[CoinbaseAuth.KEY_HTTP_HEADER] = self.api_key
        request.headers[CoinbaseAuth.PASSPHRASE_HTTP_HEADER] = self.passphrase
        timestamp = str(time.time())
        request.headers[CoinbaseAuth.TIMESTAMP_HTTP_HEADER] = timestamp
        # Add signature: timestamp + method + path (+ body, when present)
        method = request.method
        path = request.path_url
        content = request.body
        message = timestamp + method + path
        if content:
            message += content
        hmac_key = base64.b64decode(self.secret_key)
        sig = hmac.new(hmac_key, message, hashlib.sha256)
        sig_b64 = sig.digest().encode("base64").rstrip("\n")
        # Add signature header
        request.headers[CoinbaseAuth.SIGNATURE_HTTP_HEADER] = sig_b64
        return request

# Get your keys here: https://gdax.com/settings
key = 'KEY GOES HERE'
secret = 'SECRET GOES HERE'
passphrase = 'PASSPHRASE GOES HERE'
api_url = 'https://api.gdax.com:443/'
auth = CoinbaseAuth(key, secret, passphrase)

# GETs work, shows account balances
r = requests.get(api_url + 'accounts', auth=auth)
print r.json()

# POSTs fail: {message: 'invalid signature'}
order = {}
order['size'] = 0.01
order['price'] = 100
order['side'] = 'buy'
order['product_id'] = 'BTC-USD'
r = requests.post(api_url + 'orders', data=json.dumps(order), auth=auth)
print r.json()
And the output:
GET call: 200: [{u'available': .......}]
POST call: 400: {u'message': u'invalid signature'}
EDIT: POSTing 'a' instead of valid JSON-encoded data results in the same signature error (rather than a JSON decoding error from the server), so I don't think it is the way I'm forming the data. Notably, if I omit the body -- request.post(..., data='',...) --- the server responds appropriately with {u'message': u'Missing product_id'}.
I don't know why, but if I change the data keyword argument to requests.post() to json it works:
r = requests.post(api_url + 'orders', json=order, auth=auth)
EDIT: The only thing that changes, AFAICT, is that the Content-Type header changes from text to JSON, so it is likely that, or a unicode vs ASCII encoding issue. Here's the issue for the library that added this feature recently: https://github.com/kennethreitz/requests/issues/2025#issuecomment-46337236
I believe the content needs to be a json string with no spaces (this is what the node example does anyway). Maybe try this:
message += json.dumps(content).replace(' ', '')
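Expanding on that idea, here is a minimal sketch that serialises the order once (compact separators, no spaces) and sends exactly that string as the body, so the bytes that CoinbaseAuth signs via request.body are identical to what goes over the wire. It assumes the auth object and api_url defined in the question:
import json
import requests

order = {'size': 0.01, 'price': 100, 'side': 'buy', 'product_id': 'BTC-USD'}

# Compact JSON: the same string is used for the request body and, via
# request.body, for the signature computed in CoinbaseAuth.__call__.
body = json.dumps(order, separators=(',', ':'))

r = requests.post(api_url + 'orders', data=body, auth=auth,
                  headers={'Content-Type': 'application/json'})
print(r.json())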
I had the same exact problem until I looked at the public gdax API for nodeJS and found that they are using some additional headers that were not mentioned in the GDAX API docs. I added them and then it started working. See my answer to the following: GDAX API Always Returns Http 400 "Invalid Signature" Even though I do it exactly like in the API Doc

How do I parse a JSON response from Python Requests?

I am trying to parse a response.text that I get when I make a request using the Python Requests library. For example:
def check_user(self):
    method = 'POST'
    url = 'http://localhost:5000/login'
    ck = cookielib.CookieJar()
    self.response = requests.request(method, url, data='username=test1&passwd=pass1', cookies=ck)
    print self.response.text
When I execute this method, the output is:
{"request":"POST /login","result":"success"}
I would like to check whether "result" equals "success", ignoring whatever comes before.
The manual suggests: if self.response.status_code == requests.codes.ok:
If that doesn't work:
if json.loads(self.response.text)['result'] == 'success':
    whatever()
Since the output, response, appears to be a dictionary, you should be able to do
result = self.response.json().get('result')
print(result)
and have it print
'success'
If the response is in json you could do something like (python3):
import json
import requests as reqs
# Make the HTTP request.
response = reqs.get('http://demo.ckan.org/api/3/action/group_list')
# Use the json module to load CKAN's response into a dictionary.
response_dict = json.loads(response.text)
for i in response_dict:
    print("key: ", i, "val: ", response_dict[i])
To see everything in the response you can use .__dict__:
print(response.__dict__)
import json

def check_user(self):
    method = 'POST'
    url = 'http://localhost:5000/login'
    ck = cookielib.CookieJar()
    response = requests.request(method, url, data='username=test1&passwd=pass1', cookies=ck)
    # this line converts the response to a python dict which can then be parsed easily
    response_native = json.loads(response.text)
    return response_native.get('result') == 'success'
I found another solution. It is not necessary to use the json module. You can create a dict using dict = eval(whatever) and return, for example, dict["result"]. I think it is more elegant. However, the other solutions also work and are correct.
Put this in the return of your method:
return self.response.json()
If you want more details, see the following link:
https://www.w3schools.com/python/ref_requests_response.asp
and search for the json() method.
Here is a code example:
import requests
url = 'https://www.w3schools.com/python/demopage.js'
x = requests.get(url)
print(x.json())
In some cases, the response might not be as expected, so it'd be great if we build a mechanism to catch and log the exception.
import requests
import sys
url = "https://stackoverflow.com/questions/26106702/how-do-i-parse-a-json-response-from-python-requests"
response = requests.get(url)
try:
    json_data = response.json()
except ValueError as exc:
    print(f"Exception: {exc}")
    # to find out why you got this exception, inspect the response content and headers
    print(str(response.content))
    print(str(response.headers))
    print(sys.exc_info())
else:
    if json_data.get('result') == "success":
        # do whatever you want
        pass

Requests - get content-type/size without fetching the whole page/content

I have a simple website crawler. It works fine, but sometimes it gets stuck because of large content such as ISO images, .exe files and other large files. Guessing the content type from the file extension is probably not the best idea.
Is it possible to get the content type and content length/size without fetching the whole content/page?
Here is my code:
requests.adapters.DEFAULT_RETRIES = 2

url = url.decode('utf8', 'ignore')
urlData = urlparse.urlparse(url)
urlDomain = urlData.netloc
session = requests.Session()
customHeaders = {}

if maxRedirects == None:
    session.max_redirects = self.maxRedirects
else:
    session.max_redirects = maxRedirects

self.currentUserAgent = self.userAgents[random.randrange(len(self.userAgents))]
customHeaders['User-agent'] = self.currentUserAgent

try:
    response = session.get(url, timeout=self.pageOpenTimeout, headers=customHeaders)
    currentUrl = response.url
    currentUrlData = urlparse.urlparse(currentUrl)
    currentUrlDomain = currentUrlData.netloc
    domainWWW = 'www.' + str(urlDomain)
    headers = response.headers
    contentType = str(headers['content-type'])
except:
    logging.basicConfig(level=logging.DEBUG, filename=self.exceptionsFile)
    logging.exception("Get page exception:")
    response = None
Yes.
You can use the Session.head method to create HEAD requests:
response = session.head(url, timeout=self.pageOpenTimeout, headers=customHeaders)
contentType = response.headers['content-type']
A HEAD request is similar to a GET request, except that the server returns only the headers, with no response body.
Here is a quote from Wikipedia:
HEAD
Asks for the response identical to the one that would correspond to a GET request, but without the response body. This is useful for retrieving meta-information written in response headers, without having to transport the entire content.
Use requests.head() for this. It will not return the message body. Use the head method when you are interested only in the headers. Check this link for details.
h = requests.head(some_link)
header = h.headers
content_type = header.get('content-type')
Sorry, my mistake, I should have read the documentation better. Here is the answer:
http://docs.python-requests.org/en/latest/user/advanced/#advanced (Body Content Workflow)
tarball_url = 'https://github.com/kennethreitz/requests/tarball/master'
r = requests.get(tarball_url, stream=True)
if int(r.headers['content-length']) > TOO_LONG:
    r.connection.close()
    # log request too long
Because requests.head() does not follow redirects automatically, if a URL is redirected, requests.head() will get 0 for Content-Length. So make sure allow_redirects=True is added:
r = requests.head(url, allow_redirects=True)
length = r.headers['Content-Length']
Refer to Requests Redirection And History
