Parsing XML creates not well-formed error in Python

I am attempting to parse an XML document from the url https://www.predictit.org/api/marketdata/all/, using the following code:
import xml.etree.ElementTree as ET
import urllib.request
url = 'https://www.predictit.org/api/marketdata/all/'
response = urllib.request.urlopen(url).read().decode('utf-8')
tree = ET.fromstring(response)
However, I am getting the error ParseError: not well-formed (invalid token): line 1, column 0
What do I need to do to convert this to a Python object? I am sure this is an XML document, and it appears to parse fine when opened in a browser.

You're most likely getting back JSON rather than XML. To verify, print the value of info() on the HTTPResponse object and look at the "Content-Type" header:
response = urllib.request.urlopen(url)
print(response.info())
To request XML, create a Request object and set its Content-Type header (the tree is printed here only for testing):
import xml.etree.ElementTree as ET
import urllib.request
url = "https://www.predictit.org/api/marketdata/all/"
request = urllib.request.Request(url, headers={"Content-Type": "application/xml"})
response = urllib.request.urlopen(request)
tree = ET.parse(response)
print(ET.tostring(tree.getroot()).decode())
This prints (truncated to fit):
<MarketList xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><Markets><MarketData><ID>2721</ID><Name>Which party will win the 2020 U.S....
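If the server may answer with either format, one option is to choose the parser from the Content-Type value instead of assuming XML. The following is a minimal sketch: `parse_body` is a hypothetical helper, and the sample bodies are made up for illustration rather than real PredictIt output.

```python
import json
import xml.etree.ElementTree as ET


def parse_body(body: bytes, content_type: str):
    """Parse a response body according to its Content-Type header value."""
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type.endswith("json"):
        return json.loads(body)      # returns dicts and lists
    if media_type.endswith("xml"):
        return ET.fromstring(body)   # returns the root Element
    raise ValueError(f"unsupported content type: {content_type!r}")


# Hypothetical sample bodies, used here so the sketch runs without a network.
json_result = parse_body(b'{"markets": []}', "application/json; charset=utf-8")
xml_result = parse_body(b"<MarketList><Markets/></MarketList>", "application/xml")
print(json_result)
print(xml_result.tag)
```

With urllib, the two arguments would come from `response.read()` and `response.headers.get("Content-Type")`.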

Related

Get the specific response parameter with urllib in python

I am able to perform a web request and get back the response, using urllib.
from urllib import request
from urllib.parse import urlencode
response = request.urlopen(req, data=login_data)
content = response.read()
I get back something like b'{"token":"abcabcabc","error":null}'
How can I parse out the token?
You can use the json module to load the bytes and then access the token key:
import json
token = json.loads(content)['token']
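As a self-contained illustration, here is the same idea applied to the exact bytes shown in the question. The variable names are only for this sketch.

```python
import json

# The raw bytes returned by response.read() in the question.
content = b'{"token":"abcabcabc","error":null}'

# json.loads accepts bytes directly (Python 3.6+) and returns a dict.
payload = json.loads(content)

token = payload["token"]
error = payload["error"]   # JSON null becomes Python None
print(token)
```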

get a xml response according to the parameters passed in python

I would like to send a POST request and output the response on my webpage. When I enter the URL directly in the browser I get the right response, but from my code the API returns error 15.
Here is the URL, which should return an XML response:
https://xml2sms.gsm.co.za/send/?username=y&password=y&number1=27825551234&message1=This+is+a+test&number2=27825551234&message2=This+is+a+test+2
Here is my code. It does post, but it returns error 15, which according to the API means the destination is out of range.
# -*- coding: utf-8 -*-
import requests
import ssl
ssl._create_default_https_context = ssl._create_unverified_context
xml = """<?xml version='1.0' encoding='utf-8'?>
<a></a>"""
headers = {'Content-Type': 'application/xml'}
url = ('https://xml2sms.gsm.co.za/send/?username=y&password=y'
       '&number1=27825551234&message1=This+is+a+test'
       '&number2=27825551234&message2=This+is+a+test+2')
print(requests.request("POST", url, data=xml, headers=headers).text)
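A long query string written as a single literal is easy to damage when it wraps across lines. The following is a minimal sketch, assuming the same parameter names and placeholder values as the URL in the question, of building it from a dict with `urllib.parse.urlencode`. Whether this resolves error 15 is not something the question confirms.

```python
from urllib.parse import urlencode

base_url = "https://xml2sms.gsm.co.za/send/"

# Placeholder values copied from the question; dicts keep insertion order.
params = {
    "username": "y",
    "password": "y",
    "number1": "27825551234",
    "message1": "This is a test",
    "number2": "27825551234",
    "message2": "This is a test 2",
}

# urlencode escapes each value (spaces become "+") and joins pairs with "&".
url = base_url + "?" + urlencode(params)
print(url)
```

If you use requests, passing the same dict as `params=params` to `requests.post` builds the query string in an equivalent way.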

Python Parse JSON Response from URL

I want to get information about my Hue lights using a Python program. I am fine with sorting the information once I have it, but I am struggling to load the JSON. The data is sent as a JSON response. My code is as follows:
import requests
import json
response = requests.get('http://192.168.1.102/api/F5La7UpN6XueJZUts1QdyBBbIU8dEvaT1EZs1Ut0/lights')
data = json.load(response)
print(data)
When this is run, all I get is the error:
in load return loads(fp.read(),
AttributeError: 'Response' object has no attribute 'read'
The problem is you are passing in the actual response which consists of more than just the content. You need to pull the content out of the response:
import requests
r = requests.get('https://github.com/timeline.json')
print(r.text)
# The Requests library also comes with a built-in JSON decoder,
# in case you have to deal with JSON data:
print(r.json())
Source: http://www.pythonforbeginners.com/requests/using-requests-in-python
So it looks like Requests will already parse the JSON for you.
Use response.content to access the response body, and use json.loads instead of json.load:
data = json.loads(response.content)
print(data)
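The difference between the two functions is what they accept: `json.loads` parses a string or bytes already in memory, while `json.load` expects a file-like object and calls its `.read()` method. Here is a small sketch of that distinction. The payload and the `NoRead` class are hypothetical stand-ins, not real Hue output or the real `requests.Response` class.

```python
import io
import json

# Hypothetical payload, loosely shaped like a lights listing.
text = '{"1": {"name": "Hue lamp", "state": {"on": true}}}'

# json.loads: parse a string that is already in memory.
from_string = json.loads(text)

# json.load: parse from a file-like object that provides .read().
from_file = json.load(io.StringIO(text))


class NoRead:
    """Stand-in for an object that has no .read() method."""


try:
    json.load(NoRead())
except AttributeError as exc:
    # Same kind of failure as in the question's traceback.
    error_message = str(exc)

print(from_string == from_file)
print(error_message)
```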

Python 3.4.1 - Reading HTTP Request Data

I'm fairly new to Python and I'm trying to execute an HTTP request to a URL that returns JSON. The code I have is:
import urllib.request
url = "http://myurl.com/"
req = urllib.request.Request(url)
response = urllib.request.urlopen(req)
data = response.read()
I'm getting an error reading: "'bytes' object has no attribute 'read'". I searched around, but haven't found a solution. Any suggestions?
You may find the requests library easier to use:
import requests
data = requests.get('http://example.com').text
or, if you need the raw, undecoded bytes,
import requests
data = requests.get('http://example.com').content
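If you would rather stay with urllib, note that `response.read()` already returns `bytes`. Calling `.read()` on that result a second time, or passing it to `json.load`, produces the error quoted in the question. Below is a minimal sketch that uses `io.BytesIO` as a stand-in for the HTTP response (an assumption made so the example runs without a network) and a made-up JSON body.

```python
import io
import json

# Stand-in for urllib.request.urlopen(req); the body is hypothetical.
fake_response = io.BytesIO(b'{"status": "ok"}')

data = fake_response.read()   # bytes, exactly like HTTPResponse.read()

try:
    data.read()               # bytes has no .read(), so this fails
except AttributeError as exc:
    error_message = str(exc)

# Decode the bytes, then parse the JSON text.
parsed = json.loads(data.decode("utf-8"))

print(error_message)
print(parsed)
```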

Parse XML from URL into python object

The goodreads website has this API for accessing a user's 'shelves:' https://www.goodreads.com/review/list/20990068.xml?key=nGvCqaQ6tn9w4HNpW8kquw&v=2&shelf=toread
It returns XML. I'm trying to create a Django project that shows books on a shelf from this API. I'd like to know how to write my view so that I can pass an object to my template, or whether there is a better way. Currently, this is what I'm doing:
import urllib2
from xml.dom.minidom import parseString  # assumed source of parseString; the import was not shown
def homepage(request):
    file = urllib2.urlopen('https://www.goodreads.com/review/list/20990068.xml?key=nGvCqaQ6tn9w4HNpW8kquw&v=2&shelf=toread')
    data = file.read()
    file.close()
    dom = parseString(data)
I'm not sure whether I'm doing this correctly, or how to manipulate the resulting object. I'm following this tutorial.
I'd use xmltodict to make a Python dictionary out of the XML data structure and pass this dictionary to the template inside the context:
import urllib2
import xmltodict
from django.shortcuts import render_to_response
def homepage(request):
    file = urllib2.urlopen('https://www.goodreads.com/review/list/20990068.xml?key=nGvCqaQ6tn9w4HNpW8kquw&v=2&shelf=toread')
    data = file.read()
    file.close()
    # Convert the XML bytes into nested dictionaries.
    data = xmltodict.parse(data)
    return render_to_response('my_template.html', {'data': data})
xmltodict using requests:
import requests
import xmltodict
url = "https://yoursite/your.xml"
response = requests.get(url)
data = xmltodict.parse(response.content)
xmltodict using urllib3:
import traceback
import urllib3
import xmltodict
def getxml():
    url = "https://yoursite/your.xml"
    http = urllib3.PoolManager()
    response = http.request('GET', url)
    try:
        data = xmltodict.parse(response.data)
    except Exception:
        # Report the failure and return None rather than an unbound name.
        print("Failed to parse xml from response (%s)" % traceback.format_exc())
        return None
    return data
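If adding a third-party dependency is not an option, the standard library's `xml.etree.ElementTree` can also turn the XML into plain Python objects for a template. The sketch below assumes a simplified, made-up document layout (`shelf`, `book`, `title`, `author`), which is not the actual Goodreads schema. `books_from_xml` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; real API responses will be shaped differently.
sample_xml = b"""<?xml version="1.0" encoding="UTF-8"?>
<shelf>
  <book><title>Dune</title><author>Frank Herbert</author></book>
  <book><title>Emma</title><author>Jane Austen</author></book>
</shelf>"""


def books_from_xml(xml_bytes):
    """Return one dict per <book> element found anywhere in the document."""
    root = ET.fromstring(xml_bytes)
    return [
        {"title": book.findtext("title"), "author": book.findtext("author")}
        for book in root.iter("book")
    ]


books = books_from_xml(sample_xml)
for book in books:
    print(book["title"], "by", book["author"])
```

A list of dicts like this can be passed to the template context in the same way as the xmltodict result.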
