Python Suds raw XML vs parsed value differnce - python

In Suds, I use something like
client = suds.client.Client(url)
date_val = client.service.getDate()
and date_val is printed as
2013-11-16
If I use client.last_received(), the raw xml gets printed as
2013-12-11-05:00
How do I get the date returned to date_val to be returned as 2013-11-16-05:00 ?

Apparently this is a known issue with suds. It finds the datetime and returns datetime.date. I couldn't figure out how to change this so I used something like the following:
def getElementFromRawXML(raw_xml,element):
string_xml = raw_xml.plain()
begin = string_xml.find("<"+element+">")
end = string_xml.find("</"+element+">")
if begin == -1 or end == -1:
return None
else:
return string_xml[(begin+len(element)+2):end]
raw_xml = client.last_received()
print getELementFromRawXML(raw_xml,'date')

Related

How to get details of a JSON Server response parsed into list/dictionary in Python

I am new to Python. I have been trying to parse the response sent as parameter in a function.
I have been trying to convert a function from Perl to Python.
The Perl block looks something like this:
sub fetchId_byusername
{
my ($self,$resString,$name) =#_;
my $my_id;
my #arr = #{$json->allow_nonref->utf8->decode($resString)};
foreach(#arr)
{
my %hash = %{$_};
foreach my $keys (keys %hash)
{
$my_id = $hash{id} if($hash{name} eq $name);
}
}
print "Fetched Id is : $my_id\n";
return $my_id;
The part where JSON data is being parsed is troubling me. How do i write this in python3.
I tried something like
def fetchID_byUsername(self, resString, name):
arr = []
user_id = 0
arr = resString.content.decode('utf-8', errors="replace")
for item in arr:
temp_hash = {}
temp_hash = item
for index in temp_hash.keys():
if temp_hash[name] == name:
user_id = temp_hash[id]
print("Fetched ID is: {}".format(user_id))
return user_id
Now I am not sure, if this is the right way to do it.
The json inputs are something like:
[{"id":12345,"name":"11","email":"11#test.com","groups":[{"id":6967,"name":"Test1"},{"id":123456,"name":"E1"}],"department":{"id":3863,"name":"Department1"},"comments":"111","adminUser":false},{"id":123457,"name":"1234567","email":"1234567#test.com","groups":[{"id":1657,"name":"mytest"},{"id":58881,"name":"Service Admin"}],"department":{"id":182,"name":"Service Admin"},"comments":"12345000","adminUser":true}]
Thanks in advance.
Your json input should be valid python I changed false to False and true to True. If it is json formatted string you can do
import json
data=json.loads(json_formatted_string_here) #data will be python dictionary herer
And tried like this it just iterates and when match found returns id
data=[{"id":12345,"name":"11","email":"11#test.com","groups":[{"id":6967,"name":"Test1"},{"id":123456,"name":"E1"}],"department":{"id":3863,"name":"Department1"},"comments":"111","adminUser":False},{"id":123457,"name":"1234567","email":"1234567#test.com","groups":[{"id":1657,"name":"mytest"},{"id":58881,"name":"Service Admin"}],"department":{"id":182,"name":"Service Admin"},"comments":"12345000","adminUser":True}]
def fetch_id_by_name(list_records,name):
for record in list_records:
if record["name"] == name:
return record["id"]
print(fetch_id_by_name(data,"11"))
First of all import the the json library and use json.loads() like:
import json
x = json.loads(json_feed) #This converts the json feed to a python dictionary
print(x["key"]) #values to "key"

Getting wrong result from JSON - Python 3

Im working on a small project of retrieving information about books from the Google Books API using Python 3. For this i make a call to the API, read out the variables and store those in a list. For a search like "linkedin" this works perfectly. However when i enter "Google", it reads the second title from the JSON input. How can this happen?
Please find my code below (Google_Results is the class I use to initialize the variables):
import requests
def Book_Search(search_term):
parms = {"q": search_term, "maxResults": 3}
r = requests.get(url="https://www.googleapis.com/books/v1/volumes", params=parms)
print(r.url)
results = r.json()
i = 0
for result in results["items"]:
try:
isbn13 = str(result["volumeInfo"]["industryIdentifiers"][0]["identifier"])
isbn10 = str(result["volumeInfo"]["industryIdentifiers"][1]["identifier"])
title = str(result["volumeInfo"]["title"])
author = str(result["volumeInfo"]["authors"])[2:-2]
publisher = str(result["volumeInfo"]["publisher"])
published_date = str(result["volumeInfo"]["publishedDate"])
description = str(result["volumeInfo"]["description"])
pages = str(result["volumeInfo"]["pageCount"])
genre = str(result["volumeInfo"]["categories"])[2:-2]
language = str(result["volumeInfo"]["language"])
image_link = str(result["volumeInfo"]["imageLinks"]["thumbnail"])
dict = Google_Results(isbn13, isbn10, title, author, publisher, published_date, description, pages, genre,
language, image_link)
gr.append(dict)
print(gr[i].title)
i += 1
except:
pass
return
gr = []
Book_Search("Linkedin")
I am a beginner to Python, so any help would be appreciated!
It does so because there is no publisher entry in volumeInfo of the first entry, thus it raises a KeyError and your except captures it. If you're going to work with fuzzy data you have to account for the fact that it will not always have the expected structure. For simple cases you can rely on dict.get() and its default argument to return a 'valid' default entry if an entry is missing.
Also, there are a few conceptual problems with your function - it relies on a global gr which is bad design, it shadows the built-in dict type and it captures all exceptions guaranteeing that you cannot exit your code even with a SIGINT... I'd suggest you to convert it to something a bit more sane:
def book_search(search_term, max_results=3):
results = [] # a list to store the results
parms = {"q": search_term, "maxResults": max_results}
r = requests.get(url="https://www.googleapis.com/books/v1/volumes", params=parms)
try: # just in case the server doesn't return valid JSON
for result in r.json().get("items", []):
if "volumeInfo" not in result: # invalid entry - missing volumeInfo
continue
result_dict = {} # a dictionary to store our discovered fields
result = result["volumeInfo"] # all the data we're interested is in volumeInfo
isbns = result.get("industryIdentifiers", None) # capture ISBNs
if isinstance(isbns, list) and isbns:
for i, t in enumerate(("isbn10", "isbn13")):
if len(isbns) > i and isinstance(isbns[i], dict):
result_dict[t] = isbns[i].get("identifier", None)
result_dict["title"] = result.get("title", None)
authors = result.get("authors", None) # capture authors
if isinstance(authors, list) and len(authors) > 2: # you're slicing from 2
result_dict["author"] = str(authors[2:-2])
result_dict["publisher"] = result.get("publisher", None)
result_dict["published_date"] = result.get("publishedDate", None)
result_dict["description"] = result.get("description", None)
result_dict["pages"] = result.get("pageCount", None)
genres = result.get("authors", None) # capture genres
if isinstance(genres, list) and len(genres) > 2: # since you're slicing from 2
result_dict["genre"] = str(genres[2:-2])
result_dict["language"] = result.get("language", None)
result_dict["image_link"] = result.get("imageLinks", {}).get("thumbnail", None)
# make sure Google_Results accepts keyword arguments like title, author...
# and make them optional as they might not be in the returned result
gr = Google_Results(**result_dict)
results.append(gr) # add it to the results list
except ValueError:
return None # invalid response returned, you may raise an error instead
return results # return the results
Then you can easily retrieve as much info as possible for a term:
gr = book_search("Google")
And it will be far more tolerant of data omissions, provided that your Google_Results type makes most of the entries optional.
Following #Coldspeed's recommendation it became clear that missing information in the JSON file caused the exception to run. Since I only had a "pass" statement there it skipped the entire result. Therefore I will have to adapt the "Try and Except" statements so errors do get handled properly.
Thanks for the help guys!

Python complete newbie, JSON formatting

I have never used Python before but am trying to use it due to some restrictions in another (proprietary) language, to retrieve some values from a web service and return them in json format to a home automation processor. The relevant section of code below returns :
[u'Name:London', u'Mode:Auto', u'Name:Ling', u'Mode:Away']
["Name:London", "Mode:Auto", "Name:Ling", "Mode:Away"]
…which isn't valid json. I am sure this is a really dumb question but I have searched here and haven't found an answer that helps me. Apologies if I missed something obvious but can anyone tell me what I need to do to ensure the json.dumps command outputs data in the correct format?
CresData = []
for i in range(0, j):
r = requests.get('http://xxxxxx.com/WebAPI/emea/api/v1/location/installationInfo?userId=%s&includeTemperatureControlSystems=True' % UserID, headers=headers)
CresData.append("Name:" + r.json()[i]['locationInfo']['name'])
r = requests.get('http://xxxxxx.com/WebAPI/emea/api/v1/location/%s/status?includeTemperatureControlSystems=True' % r.json()[i]['locationInfo']['locationId'], headers = headers)
CresData.append('Mode:' + r.json()['gateways'][0]['temperatureControlSystems'][0]['systemModeStatus']['mode'])
Cres_json = json.dumps(CresData)
print CresData
print Cres_json
I wasn't able to test the code as the link you mentioned is not a live link but your solution should be something like this
It looks like you are looking for JSON format with key value pair. you need to pass a dict object into json.dumps() which will return you string in required JSON format.
CresData = dict()
key_str = "Location"
idx = 0
for i in range(0, j):
data = dict()
r = requests.get('http://xxxxxx.com/WebAPI/emea/api/v1/location/installationInfo?userId=%s&includeTemperatureControlSystems=True' % UserID, headers=headers)
data["Name"] = r.json()[i]['locationInfo']['name']
r = requests.get('http://xxxxxx.com/WebAPI/emea/api/v1/location/%s/status?includeTemperatureControlSystems=True' % r.json()[i]['locationInfo']['locationId'], headers = headers)
data["mode"] = r.json()['gateways'][0]['temperatureControlSystems'][0]['systemModeStatus']['mode']
CresData[key_str + str(idx)] = data
idx +=1
Cres_json = json.dumps(CresData)
print CresData
print Cres_json

ValueError: can only parse strings python

I am trying to gather a bunch of links using xpath which need to be scraped from the next page however, I keep getting the error that can only parse strings? I tried looking at the type of lk and it was a string after I casted it? What seems to be wrong?
def unicode_to_string(types):
try:
types = unicodedata.normalize("NFKD", types).encode('ascii', 'ignore')
return types
except:
return types
def getData():
req = "http://analytical360.com/access-points"
page = urllib2.urlopen(req)
tree = etree.HTML(page.read())
i = 0
for lk in tree.xpath('//a[#class="sabai-file sabai-file-image sabai-file-type-jpg "]//#href'):
print "Scraping Vendor #" + str(i)
trees = etree.HTML(urllib2.urlopen(unicode_to_string(lk)))
for ll in trees.xpath('//table[#id="archived"]//tr//td//a//#href'):
final = etree.HTML(urllib2.urlopen(unicode_to_string(ll)))
You should pass in strings not urllib2.orlopen.
Perhaps change the code like so:
trees = etree.HTML(urllib2.urlopen(unicode_to_string(lk)).read())
for i, ll in enumerate(trees.xpath('//table[#id="archived"]//tr//td//a//#href')):
final = etree.HTML(urllib2.urlopen(unicode_to_string(ll)).read())
Also, you don't seem to increment i.

not able to undertand cursors in appengine

I'm trying to fetch results in a python2.7 appengine app using cursors, but each time I use with_cursor() it fetches the same result set.
query = Model.all().filter("profile =", p_key).order('-created')
if r.get('cursor'):
query = query.with_cursor(start_cursor = r.get('cursor'))
cursor = query.cursor()
objs = query.fetch(limit=10)
count = len(objs)
for obj in objs:
...
Each time through I'm getting same 10 results. I'm thinkng it has to do with using end_cursor, but how do I get that value if query.cursor() is returning the start_cursor. I've looked through the docs but this is poorly documented.
Your formatting is a bit screwy by the way. Looking at your code (which is incomplete and therefore potentially leaving something out.) I have to assume you have forgotten to store the cursor after fetching results (or return to the user - I am assuming r is a request ?).
So after you have fetched some data you need to call cursor() on the query. e.g This function counts all entities using a cursor.
def count_entities(kind):
c = None
count = 0
q = kind.all(keys_only=True)
while True:
if c:
q.with_cursor(c)
i = q.fetch(1000)
count = count + len(i)
if not i:
break
c = q.cursor()
return count
See how after fetch() has been called the c=q.cursor() call and it's is used as the cursor next time through the loop.
Here's what finally worked:
query = Model.all().filter("profile =", p_key).order('-created')
if request.get('cursor'):
query = query.with_cursor(request.get('cursor'))
objs = query.fetch(limit=10)
cursor = query.cursor()
for obj in objs:
...

Categories