Import JSON in Python error BadStatusLine - python

I'm trying to import the JSON data generated by an Impinj R420 reader.
The code I use is:
# import urllib library
import urllib.request
from urllib.request import urlopen
# import json
import json
# store the URL in url as
# parameter for urlopen
url = "http://10.234.92.19:14150"
# store the response of URL
response = urllib.request.urlopen(url)
# storing the JSON response
# from url in data
data_json = json.loads(response())
# print the json response
print(data_json)
When I execute the program, it gives the following error:
Traceback (most recent call last):
File "C:\Users\V\PycharmProjects\Stapelcontrole\main.py", line 13, in <module>
response = urllib.request.urlopen(url)
File "C:\Users\V\AppData\Local\Programs\Python\Python310\lib\urllib\request.py", line 216, in urlopen
return opener.open(url, data, timeout)
File "C:\Users\V\AppData\Local\Programs\Python\Python310\lib\urllib\request.py", line 519, in open
response = self._open(req, data)
File "C:\Users\V\AppData\Local\Programs\Python\Python310\lib\urllib\request.py", line 536, in _open
result = self._call_chain(self.handle_open, protocol, protocol +
File "C:\Users\V\AppData\Local\Programs\Python\Python310\lib\urllib\request.py", line 496, in _call_chain
result = func(*args)
File "C:\Users\V\AppData\Local\Programs\Python\Python310\lib\urllib\request.py", line 1377, in http_open
return self.do_open(http.client.HTTPConnection, req)
File "C:\Users\V\AppData\Local\Programs\Python\Python310\lib\urllib\request.py", line 1352, in do_open
r = h.getresponse()
File "C:\Users\V\AppData\Local\Programs\Python\Python310\lib\http\client.py", line 1374, in getresponse
response.begin()
File "C:\Users\V\AppData\Local\Programs\Python\Python310\lib\http\client.py", line 318, in begin
version, status, reason = self._read_status()
File "C:\Users\V\AppData\Local\Programs\Python\Python310\lib\http\client.py", line 300, in _read_status
raise BadStatusLine(line)
http.client.BadStatusLine: {"epc":"3035307B2831B383E019E8EA","firstSeenTimestamp":"2022-04-11T11:24:23.592434Z","isHeartBeat":false}
Process finished with exit code 1
I know this is an error in the response, where it gets a faulty HTTP status line.
Yet I don't know how to fix the error.
Could you advise me how to fix this?
The {"epc":"3035307B2831B383E019E8EA","firstSeenTimestamp":"2022-04-11T11:24:23.592434Z","isHeartBeat":false} is the answer I expect.
Thanks in advance
Edit:
Even with
with urllib.request.urlopen(url) as f:
    data_json = json.load(f)
I get the same BadStatusLine error.
I can't set up the reader any differently; it can only send a JSON response through the IP address of the device. Is there a way to import the data without the HTTP Protocol?

# store the response of URL
response = urllib.request.urlopen(url)
# storing the JSON response
# from url in data
data_json = json.loads(response())
Here you are actually calling response as if it were a function. I do not know what you want to achieve by that, but the examples in the urllib.request docs suggest that the object returned by urllib.request.urlopen should be treated like a local file handle, so please replace the above with
with urllib.request.urlopen(url) as f:
    data_json = json.load(f)
Observe that I used json.load, not json.loads.
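To make the difference concrete, here is a minimal sketch that needs no network access. io.BytesIO stands in for the file-like object that urlopen returns (that stand-in is my assumption for illustration), and the payload is shortened from the error message in the question: json.load reads from a file-like object, while json.loads parses a str or bytes value you already hold.

```python
import io
import json

# Payload shortened from the error message in the question.
payload = b'{"epc":"3035307B2831B383E019E8EA","isHeartBeat":false}'

# json.load takes a file-like object and calls .read() on it itself.
# io.BytesIO is only a stand-in for the response returned by urlopen.
fake_response = io.BytesIO(payload)
from_file = json.load(fake_response)

# json.loads takes the str/bytes content directly.
from_bytes = json.loads(payload)

print(from_file == from_bytes)
print(from_file["epc"])
```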
EDIT: After "Is there a way to import the data without the HTTP Protocol?" I conclude that a more low-level solution is needed. Hopefully socket will allow you to get what you want. Using the echo client example as a starting point, I prepared the following code
import socket
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.connect(("10.234.92.19", 14150))
    s.sendall(b'GET / HTTP/1.1\r\n')
    data = s.recv(1024)
print(data)
If everything works as intended, you should see the first 1024 bytes of the answer printed. If so, change 1024 to a value that will always be bigger than the number of bytes in the response, and use json.loads(data) to get the parsed response.
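If the reader simply streams JSON over a plain TCP connection, a single recv call may return half a message or several messages at once. Below is a minimal sketch that buffers the bytes and parses one JSON object per line. It assumes the reader ends each message with a newline, which I have not verified for the Impinj R420, and iter_json_messages is a name I made up for this example.

```python
import json
import socket


def iter_json_messages(sock, bufsize=4096):
    """Yield one parsed JSON object per newline-terminated message.

    Assumption: the peer sends newline-delimited JSON. Adjust the
    delimiter if the device frames its messages differently.
    """
    buffer = b""
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:  # peer closed the connection
            break
        buffer += chunk
        # Hand out every complete line currently in the buffer.
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            line = line.strip()
            if line:
                yield json.loads(line)
    # Parse a final message that arrived without a trailing newline.
    if buffer.strip():
        yield json.loads(buffer)
```

With the address from the question you would open the connection with socket.create_connection(("10.234.92.19", 14150)) and loop over iter_json_messages(s), reading for example msg["epc"] from each message.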

Related

How to print JSON data to a Google Sheet using GSpread

I have tried every possible fix I can find online, unfortunately, I'm new to this and not sure if I'm getting closer or not.
Ultimately, all I am trying to do is print a JSON feed into a Google Sheet.
GSpread is working (I've appended just number values as a test), but I simply cannot get the JSON feed to print there.
I've gotten it printing to terminal, so I know it's accessible, but writing the loop to append the data becomes the issue.
This is my current script:
# import urllib library
import json
from urllib.request import urlopen
import gspread
gc = gspread.service_account(filename='creds.json')
sh = gc.open_by_key('1-1aiGMn2yUWRlh_jnIebcMNs-6phzUNxkktAFH7uY9o')
worksheet = sh.sheet1
# import json
# store the URL in url as
# parameter for urlopen
url = 'https://api.chucknorris.io/jokes/random'
# store the response of URL
response = urlopen(url)
# storing the JSON response
# from url in data
data_json = json.loads(response.read())
# print the json response
# print(data_json)
result = []
for key in data_json:
    result.append([key, data_json[key]])
worksheet.update('a1', result)
I've hit a complete brick wall - any advice would be greatly appreciated
Update - suggested script with new error:
# import urllib library
import json
from urllib.request import urlopen
import gspread
gc = gspread.service_account(filename='creds.json')
sh = gc.open_by_key('1-1aiGMn2yUWRlh_jnIebcMNs-6phzUNxkktAFH7uY9o')
worksheet = sh.sheet1
url = 'https://api.chucknorris.io/jokes/random'
# store the response of URL
response = urlopen(url)
# storing the JSON response
# from url in data
data_json = json.loads(response.read())
# print the json response
# print(data_json)
result = []
for key in data_json:
    result.append([key, data_json[key] if not isinstance(
        data_json[key], list) else ",".join(map(str, data_json[key]))])
worksheet.update('a1', result)
Error:
Traceback (most recent call last):
File "c:\Users\AMadle\NBA-JSON-Fetch\PrintToSheetTest.py", line 17, in <module>
response = urlopen(url)
File "C:\Python\python3.10.5\lib\urllib\request.py", line 216, in urlopen
return opener.open(url, data, timeout)
File "C:\Python\python3.10.5\lib\urllib\request.py", line 525, in open
response = meth(req, response)
File "C:\Python\python3.10.5\lib\urllib\request.py", line 634, in http_response
response = self.parent.error(
File "C:\Python\python3.10.5\lib\urllib\request.py", line 563, in error
return self._call_chain(*args)
File "C:\Python\python3.10.5\lib\urllib\request.py", line 496, in _call_chain
result = func(*args)
File "C:\Python\python3.10.5\lib\urllib\request.py", line 643, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
I can confirm it is not a permissions issue: the script below prints the same URL to the terminal with no problem. I also have no problem writing other data to the sheet:
import requests as rq
from bs4 import BeautifulSoup
url = 'https://api.chucknorris.io/jokes/random'
req = rq.get(url, verify=False)
soup = BeautifulSoup(req.text, 'html.parser')
print(soup)
To write the values with worksheet.update, the JSON data has to be converted to a 2-dimensional array. Looking at the value of data_json, I noticed that one of its values is itself an array, which also needs to be handled. I think this might be the reason for your issue. How about the following modification to your script?
From:
result.append([key, data_json[key]])
To:
result.append([key, data_json[key] if not isinstance(data_json[key], list) else ",".join(map(str, data_json[key]))])
In this modification, the array is converted to the string using join.
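The same conversion can be tried without touching the sheet. This is only a sketch: json_to_rows is a hypothetical helper name, and the sample object is made up to resemble a response that contains one list value.

```python
import json


def json_to_rows(data):
    """Turn a flat JSON object into [key, value] rows for a sheet."""
    rows = []
    for key, value in data.items():
        # A cell cannot hold a list, so join list items into one string.
        if isinstance(value, list):
            value = ",".join(map(str, value))
        rows.append([key, value])
    return rows


# Made-up sample data, not a real API response.
sample = json.loads('{"categories": ["dev", "food"], "id": "abc123", "value": "A joke."}')
rows = json_to_rows(sample)
print(rows)
```

The resulting list of two-element lists is the shape that was passed to worksheet.update in the script above.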

Google Testing Tools API - MobileFriendlyTest Python 403 Forbidden

So recently I came across the Google Testing Tools API - Mobile Friendly Test (https://developers.google.com/webmaster-tools/search-console-api/reference/rest/v1/urlTestingTools.mobileFriendlyTest), but I couldn't get it to work, even when trying it on the site. I tried to use Python for this and followed the guide (https://developers.google.com/webmaster-tools/search-console-api/v1/samples), and I made a few changes to make it work (since urllib2 was merged into the urllib package in Python 3). At the end of the day, my code looked like this:
from __future__ import print_function
import urllib
import urllib.request as urllib2
api_key = 'API_KEY'
request_url = 'https://www.google.com/'
service_url = 'https://searchconsole.googleapis.com/v1/urlTestingTools/mobileFriendlyTest:run'
params = {
    'url' : request_url,
    'key' : api_key,
}
data = urllib.parse.urlencode(params)
content = urllib2.urlopen(url=service_url, data=str.encode(data)).read()
print(content)
And I got the error:
File ".\script2.py", line 14, in <module>
content = urllib2.urlopen(url=service_url, data=str.encode(data)).read()
File "C:\Python\lib\urllib\request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "C:\Python\lib\urllib\request.py", line 531, in open
response = meth(req, response)
File "C:\Python\lib\urllib\request.py", line 641, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Python\lib\urllib\request.py", line 569, in error
return self._call_chain(*args)
File "C:\Python\lib\urllib\request.py", line 503, in _call_chain
result = func(*args)
File "C:\Python\lib\urllib\request.py", line 649, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
I also tried using curl command and the online tool (not https://search.google.com/test/mobile-friendly but Try This API section) but neither of them worked.
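Well, I actually solved my own problem; I think it was mainly caused by urllib. Here is what I did:
from __future__ import print_function
import urllib.parse as parser
import urllib.request as urllib2
import json
import base64
# api_key, service_url and request_url are the same values as in the question
api_key = 'API_KEY'
service_url = 'https://searchconsole.googleapis.com/v1/urlTestingTools/mobileFriendlyTest:run'
request_url = 'https://www.google.com/'
params = {
    'url': request_url,
    'key': api_key
}
# urlopen needs the POST body as bytes, not str
data = bytes(parser.urlencode(params), encoding='utf-8')
content = urllib2.urlopen(url=service_url, data=data).read()
sContent = str(content, encoding='utf-8')  # Shorthand for stringContent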

My Python cannot work with URL's, and nobody can figure out why?

All I want to do is scrape some data about earthquakes from a website. In fact, I just want Python to be able to extract data from URLs. For some reason, even the simplest code, which only opens a URL and uses '.readlines()', is met with a wall of errors. It doesn't seem to understand the 'urlopen' command, nor almost anything else.
I don't know what to even try, because I can't parse the errors that it's giving me. I was hoping, before I had to do something drastic like re-download python or something, that someone would have an answer for me.
import urllib.request
def urltest():
    url = "http://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_day.csv"
    f = urllib.request.urlopen(url)
    allLines = f.readlines()
    f.close()
    line = allLines[0].decode()
    print(line)
This is the code I've used to simply test it. The URL goes to a website which holds a .csv file, which python should easily acquire and read through.
If anyone wants, I can actually post the entire wall of errors that this code returns. There looks to be at least 6 different ones, but this is the final line that it spits back:
urllib.error.URLError: <urlopen error unknown url type: https>
Looking through the urllib.request module, we see that it loads a collection of handlers. This code snippet is in urllib/request.py:
if hasattr(http.client, "HTTPSConnection"):
    default_classes.append(HTTPSHandler)
skip = set()
for klass in default_classes:
    for check in handlers:
        if isinstance(check, type):
            if issubclass(check, klass):
                skip.add(klass)
        elif isinstance(check, klass):
            skip.add(klass)
for klass in skip:
    default_classes.remove(klass)
for klass in default_classes:
    opener.add_handler(klass())
So the HTTPS handler class is only loaded if http.client has the attribute HTTPSConnection. If we look in http/client.py, we can see the following code that sets this attribute.
try:
    import ssl
except ImportError:
    pass
else:
    class HTTPSConnection(HTTPConnection):
        "This class allows communication via SSL."
        default_port = HTTPS_PORT
So the HTTPSConnection class is only created if the ssl module can be imported successfully. If your system doesn't have the ssl module, then http.client won't define the HTTPSConnection class, so the attribute will be missing and urllib won't load a handler for https.
The code you provided worked on my system, so I added the following code before it to make my system unable to locate the ssl module.
#load then remove the ssl module from the system
import sys
import ssl
del ssl
sys.modules['ssl']=None
import urllib.request
def urltest():
    url = "http://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_day.csv"
    f = urllib.request.urlopen(url)
    allLines = f.readlines()
    f.close()
    line = allLines[0].decode()
    print(line)
urltest()
Doing this, I get the same error you were getting:
C:\Users\cd00119621\AppData\Local\Programs\Python\Python37\python.exe C:/Users/cd00119621/PycharmProjects/ideas/stackoverflow.py
Traceback (most recent call last):
File "C:/Users/cd00119621/PycharmProjects/ideas/stackoverflow.py", line 19, in <module>
urltest()
File "C:/Users/cd00119621/PycharmProjects/ideas/stackoverflow.py", line 13, in urltest
f = urllib.request.urlopen(url)
File "C:\Users\cd00119621\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "C:\Users\cd00119621\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 531, in open
response = meth(req, response)
File "C:\Users\cd00119621\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 641, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Users\cd00119621\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 563, in error
result = self._call_chain(*args)
File "C:\Users\cd00119621\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 503, in _call_chain
result = func(*args)
File "C:\Users\cd00119621\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 755, in http_error_302
return self.parent.open(new, timeout=req.timeout)
File "C:\Users\cd00119621\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 525, in open
response = self._open(req, data)
File "C:\Users\cd00119621\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 548, in _open
'unknown_open', req)
File "C:\Users\cd00119621\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 503, in _call_chain
result = func(*args)
File "C:\Users\cd00119621\AppData\Local\Programs\Python\Python37\lib\urllib\request.py", line 1387, in unknown_open
raise URLError('unknown url type: %s' % type)
urllib.error.URLError: <urlopen error unknown url type: https>
So I suspect you have installed Python without ssl configured. You should be able to verify this easily by trying to import ssl from the Python command line. If you get an error like
>>> import ssl
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'ssl'
Then that will be the cause of your issues. You would have to either reinstall Python with ssl configured or somehow build the ssl module from source.
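If you prefer a script over the interactive prompt, both conditions described in this answer can be checked in one go. This is just a sketch; https_support is a name I made up, and the dictionary keys are arbitrary.

```python
import http.client


def https_support():
    """Report whether this interpreter can open https:// URLs with urllib."""
    try:
        import ssl
        openssl_version = ssl.OPENSSL_VERSION
    except ImportError:
        # No usable ssl module, so http.client defines no HTTPSConnection.
        openssl_version = None
    return {
        "openssl_version": openssl_version,
        "has_https_connection": hasattr(http.client, "HTTPSConnection"),
    }


print(https_support())
```

On a working installation openssl_version is a version string and has_https_connection is True. If the first is None and the second is False, the installation matches the situation described above.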
It looks like the problem is a network (DNS/proxy/firewall) issue.
https://github.com/pbugnion/gmaps/issues/245
You can use Pandas:
import pandas as pd
data = pd.read_csv('http://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_day.csv')
print(data)

Python urllib.request.Request parameter 'data' object type

I am trying to use urllib.request.Request (Python 3.6.7) to make an API call to an internal web service to get some JSON results. I need to send some data and headers to the server, so I use the urllib.request.Request class to do this. For the data input, I tried to find out which formats it accepts. The Python docs say:
The supported object types include bytes, file-like objects, and
iterables.
So I use a dictionary data type for this parameter data. Here is my code:
import urllib.request
my_url = "https://httpbin.org/post"
my_headers = { "Content-Type" : "application/x-www-form-urlencoded" }
my_data = {
    "client_id" : "ppp",
    "client_secret" : "000",
    "grant_type" : "client_credentials" }
req = urllib.request.Request(url=my_url, data=my_data, headers=my_headers)
response = urllib.request.urlopen(req)
html = response.read()
print(html)
I then get error like this:
Traceback (most recent call last):
File "./callapi.py", line 23, in <module>
response = urllib.request.urlopen(req)
File "/usr/lib64/python3.6/urllib/request.py", line 223, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib64/python3.6/urllib/request.py", line 526, in open
response = self._open(req, data)
File "/usr/lib64/python3.6/urllib/request.py", line 544, in _open
'_open', req)
File "/usr/lib64/python3.6/urllib/request.py", line 504, in _call_chain
result = func(*args)
File "/usr/lib64/python3.6/urllib/request.py", line 1361, in https_open
context=self._context, check_hostname=self._check_hostname)
File "/usr/lib64/python3.6/urllib/request.py", line 1318, in do_open
encode_chunked=req.has_header('Transfer-encoding'))
File "/usr/lib64/python3.6/http/client.py", line 1239, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/usr/lib64/python3.6/http/client.py", line 1285, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/usr/lib64/python3.6/http/client.py", line 1234, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/lib64/python3.6/http/client.py", line 1064, in _send_output
+ b'\r\n'
TypeError: can't concat str to bytes
I then follow the example in this docs page, and change my code to:
import urllib.parse
import urllib.request
my_url = "https://httpbin.org/post"
my_headers = { "Content-Type" : "application/x-www-form-urlencoded" }
my_data = {
    "client_id" : "ppp",
    "client_secret" : "000",
    "grant_type" : "client_credentials" }
# urlencode builds the form string; encode turns it into bytes
my_uedata = urllib.parse.urlencode(my_data)
my_edata = my_uedata.encode('ascii')
req = urllib.request.Request(url=my_url, data=my_edata, headers=my_headers)
response = urllib.request.urlopen(req)
html = response.read()
print(html)
It then works.
My question is: don't the docs say that this class accepts iterables as data? Why is passing my dict wrong? My final, working version uses the str.encode() method, which returns a bytes object, so it seems this class must take a bytes object and not an iterable.
I am trying to use the Python Standard Library docs as my main reference for coding in Python, but I am having a hard time using them. I hope somebody can shed some light on how to read the library docs, or point me to a tutorial I should go through first. Thanks.
I agree with you, the doc is not explicit. What is implicit is that if the data parameter is an iterable, it must be an iterable of bytes. When I have tried to pass a string as data I got an explicit error message:
TypeError: POST data should be bytes, an iterable of bytes, or a file object. It cannot be of type str.
For that reason, the iterable cannot be a dictionary: iterating over a dict yields its keys, which here are str objects, not bytes. As an example of a valid iterable that is not a bytes object (just an example, there is no reason to use this in real code):
def dict_iter(d):
    for i in d.items():
        yield str(i).encode()
You can use that generator for the data parameter:
req = urllib.request.Request(url=my_url, data=dict_iter(my_data), headers=my_headers)
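To see the point without sending anything over the network, the sketch below only builds the request object. It reuses the dictionary and URL from the question; the variable names body and keys_are_str are mine.

```python
import urllib.parse
import urllib.request

my_data = {
    "client_id": "ppp",
    "client_secret": "000",
    "grant_type": "client_credentials",
}

# Iterating over a dict yields its keys, and these keys are str, not bytes.
keys_are_str = all(isinstance(item, str) for item in my_data)

# urlencode gives a str; encode turns it into the bytes that Request expects.
body = urllib.parse.urlencode(my_data).encode("ascii")

req = urllib.request.Request(
    url="https://httpbin.org/post",
    data=body,
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

print(keys_are_str)
print(body)
print(req.get_method())  # a request that carries data defaults to POST
```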

Problem with urllib2 loading mobile site

I'm trying to fetch some data from http://m.finnkino.fi/events/now_showing, but at the moment I'm failing badly because I'm not even able to load the page source with Python.
At the moment I'm using the following code:
import urllib2
URL = "http://m.finnkino.fi/events/now_showing"
req = urllib2.urlopen(URL, None, 2.5)
page = req.read()
print page
Here is the traceback for timeout error:
Traceback (most recent call last):
File "user/src/finnkinoParser.py", line 26, in <module>
main()
File "user/src/finnkinoParser.py", line 13, in main
getNowPlayingMovies()
File "user/src/finnkinoParser.py", line 17, in getNowPlayingMovies
req = urllib2.urlopen(baseURL,None,2.5)
File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib2.py", line 124, in urlopen
return _opener.open(url, data, timeout)
File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib2.py", line 383, in open
response = self._open(req, data)
File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib2.py", line 401, in _open
'_open', req)
File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib2.py", line 361, in _call_chain
result = func(*args)
File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib2.py", line 1130, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib2.py", line 1105, in do_open
raise URLError(err)
urllib2.URLError: <urlopen error timed out>
If I browse to the URL with my browser, it works fine. Could someone tell me what makes that site so different that urllib2 is unable to load the page? I suppose it has something to do with the site being aimed at mobile users. With "regular" sites urllib2 works fine. Are there any other kinds of sites for which the basic urlopen(URL) doesn't work?
Thanks for the help.
The following snippet works fine.
import httplib
headers = {"User-Agent": "Mozilla/5.0"}
conn = httplib.HTTPConnection("m.finnkino.fi")
conn.request("GET", "/events/now_showing", "", headers)
response = conn.getresponse()
print response.status, response.reason
data = response.read()
print data
conn.close()
It seems their server checks several request properties. After some testing, here is my conclusion:
The HTTP protocol version must be HTTP/1.1.
If the request headers include Connection, its value should be keep-alive.
The request headers must include User-Agent, whatever its value.
In urllib2, however, the Connection header in HTTPHandler is set to close by default (L1127 in urllib2.py). You can use urlgrabber or another HTTP handler that supports HTTP/1.1 and keep-alive.
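For anyone on Python 3, where httplib became http.client, the working snippet can be sketched as below. http.client speaks HTTP/1.1 and lets you set the Connection and User-Agent headers yourself. The helper name fetch and its port and timeout parameters are my additions, and I have not checked this against the finnkino server.

```python
import http.client


def fetch(host, path, port=80, timeout=10):
    """GET a page with explicit User-Agent and Connection headers."""
    headers = {
        "User-Agent": "Mozilla/5.0",
        "Connection": "keep-alive",
    }
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path, headers=headers)
        response = conn.getresponse()
        # Read the body before the connection is closed.
        return response.status, response.reason, response.read()
    finally:
        conn.close()
```

A call such as fetch("m.finnkino.fi", "/events/now_showing") would then return the status code, the reason phrase and the raw page bytes.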
