read xml files online

read xml files online - python

i'm new to programing and I'm trying to accese the webservice provided in http://indicadoreseconomicos.bccr.fi.cr/indicadoreseconomicos/WebServices/wsindicadoreseconomicos.asmx?op=ObtenerIndicadoresEconomicosXML, i've added the parameters I need to acces it but when I try to read the file in python I get
TypeError: 'HTTPResponse' object cannot be interpreted as an integer
this is my code
import urllib
import http.client
import time
HEADERS={"Content-type":"application/x-www-form-urlencoded","Accept":"text/plain"}
HOST = "indicadoreseconomicos.bccr.fi.cr"
POST = "/indicadoreseconomicos/WebServices/wsIndicadoresEconomicos.asmx/ObtenerIndicadoresEconomicos"
data = urllib.parse.urlencode({'tcIndicador': 317,
'tcFechaInicio':str(time.strftime("%d/%m/%Y")),
'tcFechaFinal':str(time.strftime("%d/%m/%Y")),
'tcNombre' : 'TI1400',
'tnSubNiveles' : 'N'})
conn=http.client.HTTPConnection(HOST)
conn.request("POST",POST,data,headers=HEADERS)
response= conn.getresponse()
responseSTR= response.read(response)
print (response)
Any suggestions are apreciated

response.read() takes an optional argument that is the number of bytes to read from the response; an integer, whole number. Now you passed the response object instead.
As you want to read the entire response, you should omit the argument altogether, thus:
response_str = response.read()
print(response_str)

Related

Upload file to Databricks DBFS with Python API

I'm following the Databricks example for uploading a file to DBFS (in my case .csv):
import json
import requests
import base64
DOMAIN = '<databricks-instance>'
TOKEN = '<your-token>'
BASE_URL = 'https://%s/api/2.0/dbfs/' % (DOMAIN)
def dbfs_rpc(action, body):
""" A helper function to make the DBFS API request, request/response is encoded/decoded as JSON """
response = requests.post(
BASE_URL + action,
headers={'Authorization': 'Bearer %s' % TOKEN },
json=body
)
return response.json()
# Create a handle that will be used to add blocks
handle = dbfs_rpc("create", {"path": "/temp/upload_large_file", "overwrite": "true"})['handle']
with open('/a/local/file') as f:
while True:
# A block can be at most 1MB
block = f.read(1 << 20)
if not block:
break
data = base64.standard_b64encode(block)
dbfs_rpc("add-block", {"handle": handle, "data": data})
# close the handle to finish uploading
dbfs_rpc("close", {"handle": handle})
When using the tutorial as is, I get an error:
Traceback (most recent call last):
File "db_api.py", line 65, in <module>
data = base64.standard_b64encode(block)
File "C:\Miniconda3\envs\dash_p36\lib\base64.py", line 95, in standard_b64encode
return b64encode(s)
File "C:\Miniconda3\envs\dash_p36\lib\base64.py", line 58, in b64encode
encoded = binascii.b2a_base64(s, newline=False)
TypeError: a bytes-like object is required, not 'str'
I tried doing with open('./sample.csv', 'rb') as f: before passing the blocks to base64.standard_b64encode but then getting another error:
TypeError: Object of type 'bytes' is not JSON serializable
This happens when the encoded block data is being sent into the API call.
I tried skipping encoding entirely and just passing the blocks into the post call. In this case the file gets created in the DBFS but has 0 bytes size.
At this point I'm trying to make sense of it all. It doesn't want a string but it doesn't want bytes either. What am I doing wrong? Appreciate any help.

In Python we have strings and bytes, which are two different entities note that there is no implicit conversion between them, so you need to know when to use which and how to convert when necessary. This answer provides nice explanation.
With the code snippet I see two issues:
This you already got - open by default reads the file as text. So your block is a string, while standard_b64encode expects bytes and returns bytes. To read bytes from file it needs to be opened in binary mode:
with open('/a/local/file', 'rb') as f:
Only strings can be encoded as JSON. There's no source code available for dbfs_rpc (or I can't find it), but apparently it expects a string, which it internally encodes. Since your data is bytes, you need to convert it to string explicitly and that's done using decode:
dbfs_rpc("add-block", {"handle": handle, "data": data.decode('utf8')})

How to convert a dict object in requests.models.Response object in Python?

I am trying to test a function which contain an API call. So in the function I have this line of code :
api_request = dict(requests.get(url_of_the_API).json())
So I tried to use patch like this :
#patch('requests.get')
def test_products_list_creator(self, mock_get):
mock_get.return_value = json.dumps({"products":
{
"name": "apple",
"categories": "fruit,red"
}
})
But at the line of my API call, python throw me this error :
AttributeError: 'str' object has no attribute 'json'
I tried to print type(requests.get(url_of_the_API}.json")) to know what it was and I got this : <class 'requests.models.Response'>
There is a lot of questions about convert a Response in dict but didn't found any about converting a dict to a Response.
So how to make my patch callable by the method json() ?

Firstly we need to figure what is required by requests.models.Response's method .json in order to work, this can be done by scrying source code available at github - requests.models source. After reading requests.models.Response.json body we might conclude that if encoding is set, it does simply loads .text. text method has #property decorator, meaning it is computed when accessing .text, this in turn depend on .content which also is computed. After scrying .content it should be clear that if we set ._content value then it will be returned by .content, therefore to make fake Response which do support .json we need to create one and set .encoding and ._content of it, consider following example:
from requests.models import Response
resp = Response()
resp.encoding = 'ascii'
resp._content = b'{"x":100}'
print(resp.json())
output
{'x': 100}
Note that _content value needs to be bytes so if you have dict d then to get value to be put there do json.dumps(d).encode('ascii')

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

I am getting error Expecting value: line 1 column 1 (char 0) when trying to decode JSON.
The URL I use for the API call works fine in the browser, but gives this error when done through a curl request. The following is the code I use for the curl request.
The error happens at return simplejson.loads(response_json)
response_json = self.web_fetch(url)
response_json = response_json.decode('utf-8')
return json.loads(response_json)
def web_fetch(self, url):
buffer = StringIO()
curl = pycurl.Curl()
curl.setopt(curl.URL, url)
curl.setopt(curl.TIMEOUT, self.timeout)
curl.setopt(curl.WRITEFUNCTION, buffer.write)
curl.perform()
curl.close()
response = buffer.getvalue().strip()
return response
Traceback:
File "/Users/nab/Desktop/myenv2/lib/python2.7/site-packages/django/core/handlers/base.py" in get_response
111. response = callback(request, *callback_args, **callback_kwargs)
File "/Users/nab/Desktop/pricestore/pricemodels/views.py" in view_category
620. apicall=api.API().search_parts(category_id= str(categoryofpart.api_id), manufacturer = manufacturer, filter = filters, start=(catpage-1)*20, limit=20, sort_by='[["mpn","asc"]]')
File "/Users/nab/Desktop/pricestore/pricemodels/api.py" in search_parts
176. return simplejson.loads(response_json)
File "/Users/nab/Desktop/myenv2/lib/python2.7/site-packages/simplejson/__init__.py" in loads
455. return _default_decoder.decode(s)
File "/Users/nab/Desktop/myenv2/lib/python2.7/site-packages/simplejson/decoder.py" in decode
374. obj, end = self.raw_decode(s)
File "/Users/nab/Desktop/myenv2/lib/python2.7/site-packages/simplejson/decoder.py" in raw_decode
393. return self.scan_once(s, idx=_w(s, idx).end())
Exception Type: JSONDecodeError at /pricemodels/2/dir/
Exception Value: Expecting value: line 1 column 1 (char 0)

Your code produced an empty response body, you'd want to check for that or catch the exception raised. It is possible the server responded with a 204 No Content response, or a non-200-range status code was returned (404 Not Found, etc.). Check for this.
Note:
There is no need to use simplejson library, the same library is included with Python as the json module.
There is no need to decode a response from UTF8 to unicode, the simplejson / json .loads() method can handle UTF8 encoded data natively.
pycurl has a very archaic API. Unless you have a specific requirement for using it, there are better choices.
Either the requests or httpx offers much friendlier APIs, including JSON support. If you can, replace your call with:
import requests
response = requests.get(url)
response.raise_for_status() # raises exception when not a 2xx response
if response.status_code != 204:
return response.json()
Of course, this won't protect you from a URL that doesn't comply with HTTP standards; when using arbirary URLs where this is a possibility, check if the server intended to give you JSON by checking the Content-Type header, and for good measure catch the exception:
if (
response.status_code != 204 and
response.headers["content-type"].strip().startswith("application/json")
):
try:
return response.json()
except ValueError:
# decide how to handle a server that's misbehaving to this extent

Be sure to remember to invoke json.loads() on the contents of the file, as opposed to the file path of that JSON:
json_file_path = "/path/to/example.json"
with open(json_file_path, 'r') as j:
contents = json.loads(j.read())
I think a lot of people are guilty of doing this every once in a while (myself included):
contents = json.load(json_file_path)

Check the response data-body, whether actual data is present and a data-dump appears to be well-formatted.
In most cases your json.loads- JSONDecodeError: Expecting value: line 1 column 1 (char 0) error is due to :
non-JSON conforming quoting
XML/HTML output (that is, a string starting with <), or
incompatible character encoding
Ultimately the error tells you that at the very first position the string already doesn't conform to JSON.
As such, if parsing fails despite having a data-body that looks JSON like at first glance, try replacing the quotes of the data-body:
import sys, json
struct = {}
try:
try: #try parsing to dict
dataform = str(response_json).strip("'<>() ").replace('\'', '\"')
struct = json.loads(dataform)
except:
print repr(resonse_json)
print sys.exc_info()
Note: Quotes within the data must be properly escaped

With the requests lib JSONDecodeError can happen when you have an http error code like 404 and try to parse the response as JSON !
You must first check for 200 (OK) or let it raise on error to avoid this case.
I wish it failed with a less cryptic error message.
NOTE: as Martijn Pieters stated in the comments servers can respond with JSON in case of errors (it depends on the implementation), so checking the Content-Type header is more reliable.

Check encoding format of your file and use corresponding encoding format while reading file. It will solve your problem.
with open("AB.json", encoding='utf-8', errors='ignore') as json_data:
data = json.load(json_data, strict=False)

I had the same issue trying to read json files with
json.loads("file.json")
I solved the problem with
with open("file.json", "r") as read_file:
data = json.load(read_file)
maybe this can help in your case

A lot of times, this will be because the string you're trying to parse is blank:
>>> import json
>>> x = json.loads("")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/__init__.py", line 348, in loads
return _default_decoder.decode(s)
File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
You can remedy by checking whether json_string is empty beforehand:
import json
if json_string:
x = json.loads(json_string)
else:
# Your code/logic here
x = {}

I encounterred the same problem, while print out the json string opened from a json file, found the json string starts with 'ï»¿', which by doing some reserach is due to the file is by default decoded with UTF-8, and by changing encoding to utf-8-sig, the mark out is stripped out and loads json no problem:
open('test.json', encoding='utf-8-sig')

This is the minimalist solution I found when you want to load json file in python
import json
data = json.load(open('file_name.json'))
If this give error saying character doesn't match on position X and Y, then just add encoding='utf-8' inside the open round bracket
data = json.load(open('file_name.json', encoding='utf-8'))
Explanation
open opens the file and reads the containts which later parse inside json.load.
Do note that using with open() as f is more reliable than above syntax, since it make sure that file get closed after execution, the complete sytax would be
with open('file_name.json') as f:
data = json.load(f)

There may be embedded 0's, even after calling decode(). Use replace():
import json
struct = {}
try:
response_json = response_json.decode('utf-8').replace('\0', '')
struct = json.loads(response_json)
except:
print('bad json: ', response_json)
return struct

I had the same issue, in my case I solved like this:
import json
with open("migrate.json", "rb") as read_file:
data = json.load(read_file)

I was having the same problem with requests (the python library). It happened to be the accept-encoding header.
It was set this way: 'accept-encoding': 'gzip, deflate, br'
I simply removed it from the request and stopped getting the error.

Just check if the request has a status code 200. So for example:
if status != 200:
print("An error has occured. [Status code", status, "]")
else:
data = response.json() #Only convert to Json when status is OK.
if not data["elements"]:
print("Empty JSON")
else:
"You can extract data here"

In my case I was doing file.read() two times in if and else block which was causing this error. so make sure to not do this mistake and hold contain in variable and use variable multiple times.

I had exactly this issue using requests.
Thanks to Christophe Roussy for his explanation.
To debug, I used:
response = requests.get(url)
logger.info(type(response))
I was getting a 404 response back from the API.

In my case it occured because i read the data of the file using file.read() and then tried to parse it using json.load(file).I fixed the problem by replacing json.load(file) with json.loads(data)
Not working code
with open("text.json") as file:
data=file.read()
json_dict=json.load(file)
working code
with open("text.json") as file:
data=file.read()
json_dict=json.loads(data)

For me, it was not using authentication in the request.

For me it was server responding with something other than 200 and the response was not json formatted. I ended up doing this before the json parse:
# this is the https request for data in json format
response_json = requests.get()
# only proceed if I have a 200 response which is saved in status_code
if (response_json.status_code == 200):
response = response_json.json() #converting from json to dictionary using json library

I received such an error in a Python-based web API's response .text, but it led me here, so this may help others with a similar issue (it's very difficult to filter response and request issues in a search when using requests..)
Using json.dumps() on the request data arg to create a correctly-escaped string of JSON before POSTing fixed the issue for me
requests.post(url, data=json.dumps(data))

In my case it is because the server is giving http error occasionally. So basically once in a while my script gets the response like this rahter than the expected response:
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html>
<head><title>502 Bad Gateway</title></head>
<body bgcolor="white">
<h1>502 Bad Gateway</h1>
<p>The proxy server received an invalid response from an upstream server.<hr/>Powered by Tengine</body>
</html>
Clearly this is not in json format and trying to call .json() will yield JSONDecodeError: Expecting value: line 1 column 1 (char 0)
You can print the exact response that causes this error to better debug.
For example if you are using requests and then simply print the .text field (before you call .json()) would do.

I did:
Open test.txt file, write data
Open test.txt file, read data
So I didn't close file after 1.
I added
outfile.close()
and now it works

If you are a Windows user, Tweepy API can generate an empty line between data objects. Because of this situation, you can get "JSONDecodeError: Expecting value: line 1 column 1 (char 0)" error. To avoid this error, you can delete empty lines.
For example:
def on_data(self, data):
try:
with open('sentiment.json', 'a', newline='\n') as f:
f.write(data)
return True
except BaseException as e:
print("Error on_data: %s" % str(e))
return True
Reference:
Twitter stream API gives JSONDecodeError("Expecting value", s, err.value) from None

if you use headers and have "Accept-Encoding": "gzip, deflate, br" install brotli library with pip install. You don't need to import brotli to your py file.

In my case it was a simple solution of replacing single quotes with double.
You can find my answer here

a python script to query MIT START website from local machine

I'm learning Python and the project I've currently set myself includes sending a question from my laptop connected to the net, connect to the MIT START NLP database, enter the question, retrieve the response and display the response. I've read through the "HOWTO Fetch Internet Resources Using urllib2" at docs.python.org but I seem to be missing some poignant bit of this idea. Here's my code:
import urllib
import urllib2
question = raw_input("What is your question? ")
url = 'http://start.csail.mit.edu/'
values = question
data = urllib.urlencode(values)
req = urllib2.Request(url, data)
response = urllib2.urlopen(req)
the_page = response.read()
print the_page
and here's the error I'm getting:
Traceback (most recent call last): File "mitstart.py", line 9, in
data = urllib.urlencode(values) File "/usr/lib/python2.7/urllib.py", line 1298, in urlencode
raise TypeError TypeError: not a valid non-string sequence or mapping object
So I'm thinking that the way I set question in vales was wrong, so I did
values = {question}
and values = (question)
and values = ('question')
with no joy.
(I know, and my response is "I'm learning, it's late, and suddenly my wife decided she needed to talk to me about something trivial while I was trying to figure this out)
Can I get some guidance or at least get pointed in the right direction?

Note that your error says: TypeError: not a valid non-string sequence or mapping object
So, while you've created values as a string, you need a non-string sequence or a mapping object.
urlencoding requires key value pairs (e.g. a mapping object or a dict), so you generally pass it a dictionary.
Looking at the source for the form, you'll see:
<input type="text" name="query" size="60">
This means you should create a dict, something like:
values = { 'query': 'What is your question?' }
Then you should be able to pass that as the argument to urlencode().

urllib.urlencode() doesn't accept a string as an argument.
As #ernie said you should specify query parameter. Also the url is missing the /startfarm.cgi part:
<form method="post" action="startfarm.cgi">
Updated example:
import cgi
from urllib import urlencode
from urllib2 import urlopen
data = urlencode(dict(query=raw_input("What is your question?"))).encode('ascii')
response = urlopen("http://start.csail.mit.edu/startfarm.cgi", data)
# extract encoding from Content-Type and print the response
_, params = cgi.parse_header(response.headers.get('Content-Type', ''))
print response.read().decode(params['charset'])

Why do I get "'str' object has no attribute 'read'" when trying to use `json.load` on a string? [duplicate]

This question already has answers here:
How can I parse (read) and use JSON?
(5 answers)
Closed 29 days ago.
In Python I'm getting an error:
Exception: (<type 'exceptions.AttributeError'>,
AttributeError("'str' object has no attribute 'read'",), <traceback object at 0x1543ab8>)
Given python code:
def getEntries (self, sub):
url = 'http://www.reddit.com/'
if (sub != ''):
url += 'r/' + sub
request = urllib2.Request (url +
'.json', None, {'User-Agent' : 'Reddit desktop client by /user/RobinJ1995/'})
response = urllib2.urlopen (request)
jsonStr = response.read()
return json.load(jsonStr)['data']['children']
What does this error mean and what did I do to cause it?

The problem is that for json.load you should pass a file like object with a read function defined. So either you use json.load(response) or json.loads(response.read()).

Ok, this is an old thread but.
I had a same issue, my problem was I used json.load instead of json.loads
This way, json has no problem with loading any kind of dictionary.
Official documentation
json.load - Deserialize fp (a .read()-supporting text file or binary file containing a JSON document) to a Python object using this conversion table.
json.loads - Deserialize s (a str, bytes or bytearray instance containing a JSON document) to a Python object using this conversion table.

You need to open the file first. This doesn't work:
json_file = json.load('test.json')
But this works:
f = open('test.json')
json_file = json.load(f)

If you get a python error like this:
AttributeError: 'str' object has no attribute 'some_method'
You probably poisoned your object accidentally by overwriting your object with a string.
How to reproduce this error in python with a few lines of code:
#!/usr/bin/env python
import json
def foobar(json):
msg = json.loads(json)
foobar('{"batman": "yes"}')
Run it, which prints:
AttributeError: 'str' object has no attribute 'loads'
But change the name of the variablename, and it works fine:
#!/usr/bin/env python
import json
def foobar(jsonstring):
msg = json.loads(jsonstring)
foobar('{"batman": "yes"}')
This error is caused when you tried to run a method within a string. String has a few methods, but not the one you are invoking. So stop trying to invoke a method which String does not define and start looking for where you poisoned your object.

AttributeError("'str' object has no attribute 'read'",)
This means exactly what it says: something tried to find a .read attribute on the object that you gave it, and you gave it an object of type str (i.e., you gave it a string).
The error occurred here:
json.load(jsonStr)['data']['children']
Well, you aren't looking for read anywhere, so it must happen in the json.load function that you called (as indicated by the full traceback). That is because json.load is trying to .read the thing that you gave it, but you gave it jsonStr, which currently names a string (which you created by calling .read on the response).
Solution: don't call .read yourself; the function will do this, and is expecting you to give it the response directly so that it can do so.
You could also have figured this out by reading the built-in Python documentation for the function (try help(json.load), or for the entire module (try help(json)), or by checking the documentation for those functions on http://docs.python.org .

Instead of json.load() use json.loads() and it would work:
ex:
import json
from json import dumps
strinjJson = '{"event_type": "affected_element_added"}'
data = json.loads(strinjJson)
print(data)

So, don't use json.load(data.read()) use json.loads(data.read()):
def findMailOfDev(fileName):
file=open(fileName,'r')
data=file.read();
data=json.loads(data)
return data['mail']

use json.loads() function , put the s after that ... just a mistake btw i just realized after i searched error
def getEntries (self, sub):
url = 'http://www.reddit.com/'
if (sub != ''):
url += 'r/' + sub
request = urllib2.Request (url +
'.json', None, {'User-Agent' : 'Reddit desktop client by /user/RobinJ1995/'})
response = urllib2.urlopen (request)
jsonStr = response.read()
return json.loads(jsonStr)['data']['children']
try this

Open the file as a text file first
json_data = open("data.json", "r")
Now load it to dict
dict_data = json.load(json_data)

If you need to convert string to json. Then use loads() method instead of load(). load() function uses to load data from a file so used loads() to convert string to json object.
j_obj = json.loads('["label" : "data"]')

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

read xml files online - python

Related

Upload file to Databricks DBFS with Python API

How to convert a dict object in requests.models.Response object in Python?

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

a python script to query MIT START website from local machine

Why do I get "'str' object has no attribute 'read'" when trying to use `json.load` on a string? [duplicate]

Categories

Resources