Why am I not getting the correct response? - python

I am a complete Python noob.
Now that the foreshadowing is done: I am trying to parse some information out of a SOAP response.
The body of the response is below:
<soap:Body>
<ProcessMessageResponse xmlns="http://www.starstandards.org/webservices/2005/10/transport">
<payload>
<content id="Content0">
<CustomerLookupResponse xmlns="">
<Customer>
<CompanyNumber>ZQ1</CompanyNumber>
<CustomerNumber>1051012</CustomerNumber>
<TypeCode>I</TypeCode>
<LastName>NAME</LastName>
<FirstName>BASIC</FirstName>
<MiddleName/>
<Salutation/>
<Gender/>
<Language/>
<Address1/>
<Address2/>
<Address3/>
<City/>
<County/>
<StateCode/>
<ZipCode>0</ZipCode>
<PhoneNumber>0</PhoneNumber>
<BusinessPhone>0</BusinessPhone>
<BusinessExt>0</BusinessExt>
<FaxNumber>0</FaxNumber>
<BirthDate>0</BirthDate>
<DriversLicense/>
<Contact/>
<PreferredContact/>
<MailCode/>
<TaxExmptNumber/>
<AssignedSalesperson/>
<CustomerType/>
<PreferredPhone/>
<CellPhone>0</CellPhone>
<PagePhone>0</PagePhone>
<OtherPhone>0</OtherPhone>
<OtherPhoneDesc/>
<Email1/>
<Email2/>
<OptionalField/>
<AllowContactByPostal/>
<AllowContactByPhone/>
<AllowContactByEmail/>
<BusinessPhoneExtension/>
<InternationalBusinessPhone/>
<InternationalCellPhone/>
<ExternalCrossReferenceKey>0</ExternalCrossReferenceKey>
<InternationalFaxNumber/>
<InternationalOtherPhone/>
<InternationalHomePhone/>
<CustomerPreferredName/>
<InternationalPagerPhone/>
<PreferredLanguage/>
<LastChangeDate>20130401</LastChangeDate>
<Vehicles/>
<CCID/>
<CCCD>0</CCCD>
</Customer>
</CustomerLookupResponse>
</content>
</payload>
</ProcessMessageResponse>
</soap:Body>
and I have the following code snippet to show what I have done to parse out the response I want:
customer_number = ''
customer_first_name = ''
customer_last_name = ''
def send_customer_lookup(data):
    soap_action = 'http://www.starstandards.org/webservices/2005/10/transport/operations/ProcessMessage'
    source_port = random.randint(6000, 20000)
    webservice = httplib.HTTPSConnection('otqa.arkona.com', source_address=('', source_port))
    webservice.putrequest('POST', '/OpenTrack/Webservice.asmx?wsdl')
    webservice.putheader('User-Agent', 'OpenTrack-Heartbeat')
    webservice.putheader('Content-Type', 'application/soap+xml')
    webservice.putheader('Content-Length', '%d' % len(data))
    webservice.putheader('SOAPAction', soap_action)
    webservice.endheaders()
    webservice.send(data)
    response = webservice.getresponse()
    response_xml = str(response.read())
    doc = ET.fromstring(response_xml)
    for customer in doc.findall('.//{http://www.starstandards.org/webservices/2005/10/transport}Payload'):
        global customer_number
        global customer_first_name
        global customer_last_name
        customer_number = customer.findtext('{http://www.starstandards.org/webservices/2005/10/transport}CustomerNumber')
        customer_first_name = customer.findtext('{http://www.starstandards.org/webservices/2005/10/transport}FirstName')
        customer_last_name = customer.findtext('{http://www.starstandards.org/webservices/2005/10/transport}LastName')
    webservice.close()
    return customer_number, customer_first_name, customer_last_name, response_xml
I am not certain why I am getting an output of ' ', ' ', ' ', <xml response>...

It looks like you are over-specifying the element names, so they don't match anything and the body of your for customer in ... loop never runs. Try this:
import httplib
import random
import xml.etree.ElementTree as ET

def send_customer_lookup(data):
    soap_action = 'http://www.starstandards.org/webservices/2005/10/transport/operations/ProcessMessage'
    source_port = random.randint(6000, 20000)
    webservice = httplib.HTTPSConnection('otqa.arkona.com', source_address=('', source_port))
    try:
        webservice.putrequest('POST', '/OpenTrack/Webservice.asmx?wsdl')
        webservice.putheader('User-Agent', 'OpenTrack-Heartbeat')
        webservice.putheader('Content-Type', 'application/soap+xml')
        webservice.putheader('Content-Length', '%d' % len(data))
        webservice.putheader('SOAPAction', soap_action)
        webservice.endheaders()
        webservice.send(data)
        response_xml = str(webservice.getresponse().read())
    finally:
        # HTTPSConnection is not a context manager, so close it explicitly
        webservice.close()
    doc = ET.fromstring(response_xml)
    results = []
    # CustomerLookupResponse declares xmlns="", so these names carry no namespace
    for customer in doc.findall('.//CustomerLookupResponse/Customer'):
        customer_number = customer.findtext('CustomerNumber')
        customer_first_name = customer.findtext('FirstName')
        customer_last_name = customer.findtext('LastName')
        results.append((customer_number, customer_first_name, customer_last_name))
    return results
Also, global variables are generally evil; I presume that you added them because you were getting 'variable not defined' errors? That should have been a clue that the body of the for-loop was not actually being run.
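To see why the namespace-qualified lookups in the question come back empty, here is a minimal sketch against a trimmed copy of the response. The soap:Envelope wrapper and its namespace URI are assumptions added only so the fragment parses; the rest is taken from the question. Two things are at play: tag names are case-sensitive (payload, not Payload), and CustomerLookupResponse declares xmlns="", which puts everything inside it in no namespace at all.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the response; the Envelope wrapper is an assumption
# added so that the "soap" prefix is declared and the fragment parses.
SAMPLE = """\
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
<soap:Body>
<ProcessMessageResponse xmlns="http://www.starstandards.org/webservices/2005/10/transport">
<payload><content id="Content0">
<CustomerLookupResponse xmlns="">
<Customer>
<CustomerNumber>1051012</CustomerNumber>
<LastName>NAME</LastName>
<FirstName>BASIC</FirstName>
</Customer>
</CustomerLookupResponse>
</content></payload>
</ProcessMessageResponse>
</soap:Body>
</soap:Envelope>
"""

TRANSPORT = "{http://www.starstandards.org/webservices/2005/10/transport}"
doc = ET.fromstring(SAMPLE)

# Tag names are case-sensitive: "Payload" matches nothing, "payload" does.
wrong_case = doc.findall(".//" + TRANSPORT + "Payload")
right_case = doc.findall(".//" + TRANSPORT + "payload")

# xmlns="" puts Customer and its children in no namespace at all.
qualified = doc.findall(".//" + TRANSPORT + "Customer")
unqualified = doc.findall(".//Customer")

customers = [
    (c.findtext("CustomerNumber"), c.findtext("FirstName"), c.findtext("LastName"))
    for c in unqualified
]
print(len(wrong_case), len(right_case), len(qualified), customers)
```

Only the unqualified search finds the Customer element, which is why dropping the namespace from the field names is the right fix for this particular response.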

You could use xml.dom.minidom:
from xml.dom import minidom

def parse_customer_data(response_xml):
    results = []
    dom = minidom.parseString(response_xml)
    customers = dom.getElementsByTagName('Customer')
    for c in customers:
        results.append({
            "cnum" : c.getElementsByTagName('CustomerNumber')[0].firstChild.data,
            "lname" : c.getElementsByTagName('LastName')[0].firstChild.data,
            "fname" : c.getElementsByTagName('FirstName')[0].firstChild.data
        })
    return results

if __name__ == "__main__":
    response_xml = open("soap.xml").read()
    results = parse_customer_data(response_xml)
    print(results)
Note that for the input file, soap.xml:
1. I added the XML declaration and soap:Envelope elements around the XML you provided; otherwise it would not parse.
2. I added another Customer element to test my code.
Output:
$ python soap.py
[{'lname': u'NAME1', 'cnum': u'1051012', 'fname': u'BASIC1'}, {'lname': u'NAME2', 'cnum': u'1051013', 'fname': u'BASIC2'}]
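One caveat with the minidom approach: an empty element such as <MiddleName/> has no child node, so .firstChild is None and .firstChild.data raises AttributeError. A small helper can guard against that. This is only a sketch; the child_text name and the default argument are made up for illustration, and the sample is a trimmed Customer element from the question.

```python
from xml.dom import minidom

# Trimmed Customer element from the question, including one empty field.
SAMPLE = """\
<Customer>
<CustomerNumber>1051012</CustomerNumber>
<FirstName>BASIC</FirstName>
<MiddleName/>
</Customer>
"""

def child_text(parent, tag, default=""):
    """Return the text of the first <tag> under parent, or default if absent or empty."""
    nodes = parent.getElementsByTagName(tag)
    if not nodes or nodes[0].firstChild is None:
        return default
    return nodes[0].firstChild.data

customer = minidom.parseString(SAMPLE).documentElement
first = child_text(customer, "FirstName")
middle = child_text(customer, "MiddleName")
missing = child_text(customer, "Nickname", default=None)
print(first, repr(middle), missing)
```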

Related

XML parsing in python issue using elementTree

I need to parse a SOAP response and convert it to a text file. I am trying to parse the values as detailed below. I am using ElementTree in Python.
I have the XML response below, which I need to parse:
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:tmf854="tmf854.v1" xmlns:alu="alu.v1">
<soapenv:Header>
<tmf854:header>
<tmf854:activityName>query</tmf854:activityName>
<tmf854:msgName>queryResponse</tmf854:msgName>
<tmf854:msgType>RESPONSE</tmf854:msgType>
<tmf854:senderURI>https:/destinationhost:8443/tmf854/services</tmf854:senderURI>
<tmf854:destinationURI>https://localhost:8443</tmf854:destinationURI>
<tmf854:activityStatus>SUCCESS</tmf854:activityStatus>
<tmf854:correlationId>1</tmf854:correlationId>
<tmf854:communicationPattern>MultipleBatchResponse</tmf854:communicationPattern>
<tmf854:communicationStyle>RPC</tmf854:communicationStyle>
<tmf854:requestedBatchSize>1500</tmf854:requestedBatchSize>
<tmf854:batchSequenceNumber>1</tmf854:batchSequenceNumber>
<tmf854:batchSequenceEndOfReply>true</tmf854:batchSequenceEndOfReply>
<tmf854:iteratorReferenceURI>http://9195985371165397084</tmf854:iteratorReferenceURI>
<tmf854:timestamp>20220915222121.472+0530</tmf854:timestamp>
</tmf854:header>
</soapenv:Header>
<soapenv:Body>
<queryResponse xmlns="alu.v1">
<queryObjectData>
<queryObject>
<name>
<tmf854:mdNm>AMS</tmf854:mdNm>
<tmf854:meNm>CHEERLAVANCHA_281743</tmf854:meNm>
<tmf854:ptpNm>/type=NE/CHEERLAVANCHA_281743</tmf854:ptpNm>
</name>
<vendorExtensions>
<package>
<NameAndStringValue>
<tmf854:name>hubSubtendedStatus</tmf854:name>
<tmf854:value>NONE</tmf854:value>
</NameAndStringValue>
<NameAndStringValue>
<tmf854:name>productAndRelease</tmf854:name>
<tmf854:value>DF.6.1</tmf854:value>
</NameAndStringValue>
<NameAndStringValue>
<tmf854:name>adminUserName</tmf854:name>
<tmf854:value>isadmin</tmf854:value>
</NameAndStringValue>
<NameAndStringValue>
</package>
</vendorExtensions>
</queryObject>
</queryObjectData>
</queryResponse>
</soapenv:Body>
</soapenv:Envelope>
I need to use the below code snippet.
parser = ElementTree.parse("response.txt")
root = parser.getroot()
inventoryObjectData = root.find(".//{alu.v1}queryObjectData")
for inventoryObject in inventoryObjectData:
    for device in inventoryObject:
        if (device.tag.split("}")[1]) == "me":
            vendorExtensionsNames = []
            vendorExtensionsValues = []
            if device.find(".//{tmf854.v1}mdNm") is not None:
                mdnm = device.find(".//{tmf854.v1}mdNm").text
            if device.find(".//{tmf854.v1}meNm") is not None:
                menm = device.find(".//{tmf854.v1}meNm").text
            if device.find(".//{tmf854.v1}userLabel") is not None:
                userlabel = device.find(".//{tmf854.v1}userLabel").text
            if device.find(".//{tmf854.v1}resourceState") is not None:
                resourcestate = device.find(".//{tmf854.v1}resourceState").text
            if device.find(".//{tmf854.v1}location") is not None:
                location = device.find(".//{tmf854.v1}location").text
            if device.find(".//{tmf854.v1}manufacturer") is not None:
                manufacturer = device.find(".//{tmf854.v1}manufacturer").text
            if device.find(".//{tmf854.v1}productName") is not None:
                productname = device.find(".//{tmf854.v1}productName").text
            if device.find(".//{tmf854.v1}version") is not None:
                version = device.find(".//{tmf854.v1}version").text
            vendorExtensions = device.find("vendorExtensions")
            vendorExtensionsNamesElements = vendorExtensions.findall(".//{tmf854.v1}name")
            for i in vendorExtensionsNamesElements:
                vendorExtensionsNames.append(i.text.strip())
            vendorExtensionsValuesElements = vendorExtensions.findall(".//{tmf854.v1}value")
            for i in vendorExtensionsValuesElements:
                vendorExtensionsValues.append(str(i.text or "").strip())
            alu = ""
            for i in vendorExtensions:
                if i.attrib:
                    if alu == "":
                        alu = i.attrib.get("{alu.v1}name")
                    else:
                        alu = alu + "|" + i.attrib.get("{alu.v1}name")
The issue is that the code below is not able to find the 'vendorExtensions' element. Please help.
vendorExtensions = device.find("vendorExtensions")
I have tried the below as well:
vendorExtensions = device.find(".//queryObject/vendorExtensions")
Your document declares a default namespace of alu.v1:
<queryResponse xmlns="alu.v1">
...
</queryResponse>
Any element inside it without an explicit namespace prefix is in the alu.v1 namespace. You need to qualify your element name appropriately:
vendorExtensions = device.find("{alu.v1}vendorExtensions")
While the above is a real problem with your code that needs to be corrected (the Wikipedia entry on XML namespaces may be useful reading if you're unfamiliar with how namespaces work), there are also some logic problems with your code.
Let's drop the big list of conditionals from the code and see if it's actually doing what we think it's doing. If we run this:
from xml.etree import ElementTree

parser = ElementTree.parse("data.xml")
root = parser.getroot()
queryObjectData = root.find(".//{alu.v1}queryObjectData")
for queryObject in queryObjectData:
    for device in queryObject:
        print(device.tag)
Then using your sample data (once it has been corrected to be syntactically valid), we see as output:
{alu.v1}name
{alu.v1}vendorExtensions
Your search for the {alu.v1}vendorExtensions element will never succeed because the thing on which you're trying to search (the device variable) is itself the thing you're trying to find.
Additionally, the conditional in your loop...
if (device.tag.split("}")[1]) == "me":
...will never match (there is no element in the entire document for which tag.split("}")[1] == "me" is True).
I'm not entirely clear what you're trying to do, but here are some thoughts:
Given your example data, you probably don't want that for device in inventoryObject: loop.
We can drastically simplify your code by replacing that long block of conditionals with a list of the element names in which we are interested and then a for loop to extract them.
Rather than assigning a bunch of individual variables, we can build up a dictionary with the data from the queryObject.
That might look like:
from xml.etree import ElementTree
import json

attributeNames = [
    "mdNm",
    "meNm",
    "userLabel",
    "resourceState",
    "location",
    "manufacturer",
    "productName",
    "version",
]

parser = ElementTree.parse("data.xml")
root = parser.getroot()
queryObjectData = root.find(".//{alu.v1}queryObjectData")
for queryObject in queryObjectData:
    device = {}
    for name in attributeNames:
        if (value := queryObject.find(f".//{{tmf854.v1}}{name}")) is not None:
            device[name] = value.text
    vendorExtensions = queryObject.find("{alu.v1}vendorExtensions")
    extensionMap = {}
    for extension in vendorExtensions.findall(".//{alu.v1}NameAndStringValue"):
        extname = extension.find("{tmf854.v1}name").text
        extvalue = extension.find("{tmf854.v1}value").text
        extensionMap[extname] = extvalue
    device["vendorExtensions"] = extensionMap
    print(json.dumps(device, indent=2))
Given your example data, this outputs:
{
  "mdNm": "AMS",
  "meNm": "CHEERLAVANCHA_281743",
  "vendorExtensions": {
    "hubSubtendedStatus": "NONE",
    "productAndRelease": "DF.6.1",
    "adminUserName": "isadmin"
  }
}
An alternate approach, in which we just transform each queryObject into a dictionary, might look like this:
from xml.etree import ElementTree
import json

def localName(ele):
    return ele.tag.split("}")[1]

def etree_to_dict(t):
    if list(t):
        d = {}
        for child in t:
            if localName(child) == "NameAndStringValue":
                # each pair holds a name and a value; use them as key and value
                d.update(dict([[x.text.strip() for x in child]]))
            else:
                d[localName(child)] = etree_to_dict(child)
        return d
    else:
        # an empty element has text None
        return (t.text or "").strip()

parser = ElementTree.parse("data.xml")
root = parser.getroot()
queryObjectData = root.find(".//{alu.v1}queryObjectData")
if queryObjectData is None:
    queryObjectData = []
for queryObject in queryObjectData:
    d = etree_to_dict(queryObject)
    print(json.dumps(d, indent=2))
This will output:
{
  "name": {
    "mdNm": "AMS",
    "meNm": "CHEERLAVANCHA_281743",
    "ptpNm": "/type=NE/CHEERLAVANCHA_281743"
  },
  "vendorExtensions": {
    "package": {
      "hubSubtendedStatus": "NONE",
      "productAndRelease": "DF.6.1",
      "adminUserName": "isadmin"
    }
  }
}
That may or may not be appropriate depending on the structure of your real data and exactly what you're trying to accomplish.
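As a side note on the namespace qualification used throughout this answer, ElementTree's find, findall, findtext and iterfind also accept a prefix-to-URI mapping, which can be easier to read than repeating the {alu.v1} form. Below is a minimal sketch on a cut-down copy of the sample data. The prefixes alu and tmf in the NS dictionary are arbitrary local choices; they do not have to match the prefixes used in the document.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the sample response body from the question.
SAMPLE = """\
<queryResponse xmlns="alu.v1" xmlns:tmf854="tmf854.v1">
<queryObjectData>
<queryObject>
<name><tmf854:meNm>CHEERLAVANCHA_281743</tmf854:meNm></name>
<vendorExtensions>
<package>
<NameAndStringValue>
<tmf854:name>adminUserName</tmf854:name>
<tmf854:value>isadmin</tmf854:value>
</NameAndStringValue>
</package>
</vendorExtensions>
</queryObject>
</queryObjectData>
</queryResponse>
"""

# Local prefix names; only the URIs have to match the document.
NS = {"alu": "alu.v1", "tmf": "tmf854.v1"}

root = ET.fromstring(SAMPLE)
query_object = root.find(".//alu:queryObject", NS)

# Without a namespace the lookup fails; with one it succeeds.
unqualified = query_object.find("vendorExtensions")
qualified = query_object.find("alu:vendorExtensions", NS)

me_name = query_object.findtext(".//tmf:meNm", namespaces=NS)
extensions = {
    pair.findtext("tmf:name", namespaces=NS): pair.findtext("tmf:value", namespaces=NS)
    for pair in qualified.iterfind(".//alu:NameAndStringValue", NS)
}
print(unqualified, me_name, extensions)
```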

Parsing a JSON using specific key words using Python

I'm trying to parse a JSON of a site's stock.
The JSON: https://www.ssense.com/en-us/men/sneakers.json
So I want to take some keywords from the user. Then I want to parse the JSON using these keywords to find the name of the item and (in this specific case) return the ID, SKU and the URL.
So for example:
If I inputted "Black Fennec" I want to parse the JSON and find the ID,SKU, and URL of Black Fennec Sneakers (that have an ID of 3297299, a SKU of 191422M237006, and a url of /men/product/ps-paul-smith/black-fennec-sneakers/3297299 )
I have never attempted doing anything like this. Based on some guides that show how to parse a JSON I started out with this:
r = requests.Session()
stock = r.get("https://www.ssense.com/en-us/men/sneakers.json",headers = headers)
obj_json_data = json.loads(stock.text)
However, I am now confused. How do I find the product based on the keywords, and how do I get its ID, URL and SKU?
There are a number of ways to handle the output; I'm not sure what you want to do with it, but this should get you going.
EDIT 1:
import requests

r = requests.Session()
obj_json_data = r.get("https://www.ssense.com/en-us/men/sneakers.json").json()
products = obj_json_data['products']
keyword = input('Enter a keyword: ')
for product in products:
    if keyword.upper() in product['name'].upper():
        name = product['name']
        id_var = product['id']
        sku = product['sku']
        url = product['url']
        print ('Product: %s\nID: %s\nSKU: %s\nURL: %s' %(name, id_var, sku, url))
        # if you only want to return the first match, uncomment next line
        #break
I also have it set up to store the results in a DataFrame and/or a list, just to give some options of where to go with it.
import requests
import pandas as pd

r = requests.Session()
obj_json_data = r.get("https://www.ssense.com/en-us/men/sneakers.json").json()
products = obj_json_data['products']
keyword = input('Enter a keyword: ')
products_found = []
rows = []
for product in products:
    if keyword.upper() in product['name'].upper():
        name = product['name']
        id_var = product['id']
        sku = product['sku']
        url = product['url']
        rows.append([name, id_var, sku, url])
        # list.append returns None, so its result must not be assigned back
        products_found.append(name)
        print ('Product: %s\nID: %s\nSKU: %s\nURL: %s' %(name, id_var, sku, url))
# build the DataFrame once; DataFrame.append was removed in pandas 2.0
results = pd.DataFrame(rows, columns=['name','id','sku','url'])
if products_found == []:
    print ('Nothing found')
EDIT 2: Here is another way to do it by converting the json to a dataframe, then filtering by those rows that have the keyword in the name (this is actually a better solution in my opinion)
import requests
import pandas as pd

r = requests.Session()
obj_json_data = r.get("https://www.ssense.com/en-us/men/sneakers.json").json()
products = obj_json_data['products']
# pd.json_normalize replaces the deprecated pandas.io.json.json_normalize (pandas 1.0+)
products_df = pd.json_normalize(products)
keyword = input('Enter a keyword: ')
# regex=False treats the keyword as literal text rather than a pattern
results = products_df[products_df['name'].str.contains(keyword, case=False, regex=False)]
#print (results[['name', 'id', 'sku', 'url']])
products_found = list(results['name'])
if products_found == []:
    print ('Nothing found')
else:
    print ('Found: '+ str(products_found))
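The question mentions taking several keywords from the user, while the snippets above match one substring. A possible extension is to require every keyword to appear in the product name, in any order. The sketch below runs on an in-memory list so it needs no network access; find_products is a hypothetical helper, and only the first sample entry comes from the question (the second is invented for contrast).

```python
def find_products(products, keywords):
    """Return products whose name contains every whitespace-separated keyword, ignoring case."""
    wanted = keywords.lower().split()
    return [
        p for p in products
        if all(word in p["name"].lower() for word in wanted)
    ]

# Hypothetical in-memory sample; only the first entry comes from the question.
products = [
    {"name": "Black Fennec Sneakers", "id": 3297299, "sku": "191422M237006",
     "url": "/men/product/ps-paul-smith/black-fennec-sneakers/3297299"},
    {"name": "White Example Sneakers", "id": 1, "sku": "EXAMPLE-SKU",
     "url": "/men/product/example/white-example-sneakers/1"},
]

matches = find_products(products, "fennec black")
for p in matches:
    print(p["id"], p["sku"], p["url"])
```

Note that an empty keyword string matches every product, because all() over an empty sequence is True; a caller may want to reject empty input first.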

How to change the encoding of JSON dictionary string value?

I ran into an issue where a string encoded with "utf-8" doesn't print as expected. The string contains accented letters (á, é, ü, ñ, etc.), and is part of a JSON dict returned from the Wikipedia API.
Below is an example with the letter é:
== The complete code: ==
import urllib
import json
import re

def check(wikitext):
    redirect_title = re.findall('\[\[[\S ]+\]\]', str(wikitext))[0]
    redirect_title = redirect_title.strip('[]')
    redirect_title = redirect_title.decode('ISO-8859-1').encode('utf8')
    return redirect_title

serviceurl = 'https://en.wikipedia.org/w/api.php?'
action = 'parse'
formatjs = 'json'
prop = 'text|wikitext'
title = 'Jose Eduardo Agualusa'
url = serviceurl + urllib.urlencode({'action':action, 'page': title, 'format': formatjs, 'prop': prop})
uh = urllib.urlopen(url)
data = uh.read()
try:
    js = json.loads(data)
except:
    js = None
    print ' Page is not found'
wikitext = js["parse"]["wikitext"]
redirect_title = check(wikitext)
print 'redirect_title:',redirect_title
redirect_title2 = 'Jos\xe9 Eduardo Agualusa'
redirect_title2 = redirect_title2.decode('ISO-8859-1').encode('utf8')
print 'redirect_title2:', redirect_title2
The result is:
redirect_title: Jos\xe9 Eduardo Agualusa
redirect_title2: José Eduardo Agualusa
redirect_title is parsed from the Wikipedia API JSON. Before being encoded, it prints as 'Jos\xe9 Eduardo Agualusa'. After being encoded, it doesn't seem to change.
redirect_title2 is assigned directly with the string 'Jos\xe9 Eduardo Agualusa' and then encoded.
Why do I get different results for redirect_title and redirect_title2? How can I make redirect_title print as "José Eduardo Agualusa"?
Your check() routine does some very odd things, including parsing the string representation of a dictionary.
Try this instead:
def check(wikitext):
    for value in wikitext.values():
        result = re.findall(ur'\[\[.*?\]\]', value)
        if result:
            return result[0].strip(u'[]')
    return u''
Or this:
def check(wikitext):
    redirect_title = u''.join(wikitext.values())
    redirect_title = re.findall(u'\[\[[\S ]+\]\]', redirect_title)[0]
    redirect_title = redirect_title.strip(u'[]')
    return redirect_title
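To illustrate why parsing the string representation of the dictionary goes wrong, here is a small Python 3 sketch (the code in the question is Python 2). In Python 3, ascii() escapes non-ASCII characters much as Python 2's str() did for unicode values inside a dict, so it can stand in for the original behaviour. The shape of the wikitext dictionary and the redirect text are assumptions made for illustration.

```python
import re

# Assumed shape of js["parse"]["wikitext"]; the redirect text is illustrative.
wikitext = {"*": "#REDIRECT [[José Eduardo Agualusa]]"}

# ascii() escapes non-ASCII characters, much as Python 2's str() did for
# unicode values inside a dict, so the text now holds a literal backslash.
as_text = ascii(wikitext)
from_repr = re.findall(r"\[\[[\S ]+\]\]", as_text)[0].strip("[]")

# Searching the value itself keeps the real character.
from_value = re.findall(r"\[\[.*?\]\]", wikitext["*"])[0].strip("[]")

print(from_repr)
print(from_value == "José Eduardo Agualusa")
```

The title extracted from the representation contains the four ASCII characters backslash, x, e, 9 rather than the letter é, so re-encoding it changes nothing. Searching the dictionary's values directly, as both versions of check() above do, avoids the problem.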

How to add 2 variables in the email body in win32com with Python?

I have this email template:
def email_tamplate(*args):
    Format = { 'UNSPECIFIED' : 0, 'PLAIN' : 1, 'HTML' : 2, 'RTF' : 3}
    profile = "Outlook"
    #session = win32com.client.Dispatch("Mapi.Session")
    outlook = win32com.client.Dispatch("Outlook.Application")
    #session.Logon(profile)
    mainMsg = outlook.CreateItem(0)
    mainMsg.To = "myemail#amazon.com"
    mainMsg.Subject = "Automated Crap Daily Update"
    mainMsg.BodyFormat = Format['RTF']
    mainMsg.HTMLBody = body2
    mainMsg.Send() #this line actually sends the email
And would like to send an email that has 2 tables in the body. So I have 2 bodies:
Here's one:
eod = []
body2 = ['<html><body><table border="1" style="width:300px"><tr><td>Title Level</td></tr><tr><td>Source</td><td>Count</td></tr>']
header = [['Title Level']]
for row in cur:
    eod.append(row)
count=0
count2=0
for item in eod:
    body2[0]=body2[0]+"<tr><td>"+str(eod[count2][count])+"</td><td>"+str(eod[count2][count+1])+"</td></tr>"
    count2=count2+1
body2[0]=body2[0]+"</table></body></html>"
body2=body2[0]
globals().update(locals())
And here's the other:
eod = []
body = ['<html><body><table border="1" style="width:300px"><tr><td>Previous Day</td></tr><tr><td>Decision_Status</td><td>Count</td></tr>']
header = [['Prev Day']]
for row in cur:
    eod.append(row)
count=0
count2=0
for item in eod:
    body[0]=body[0]+"<tr><td>"+str(eod[count2][count])+"</td><td>"+str(eod[count2][count+1])+"</td></tr>"
    count2=count2+1
body[0]=body[0]+"</table></body></html>"
body=body[0]
globals().update(locals())
Both are created with data from different queries.
So I would like to be able to send both variables, body and body2, in the body of the email.
Any ideas on how to accomplish this?
Thank you.
I just resolved the issue. It happens that I only needed to concatenate body + body2.
As simple as that.
But thank you!
No need for two bodies; use .format() and put any number of variables into one body.
Example:
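import pandas as pd

text = 'some text'
table = pd.DataFrame([1,2,3])
# to_html() renders the DataFrame as an HTML table instead of its plain-text repr
msg.HTMLBody = '''<br>Hello, see this text:
<br>{text}<br>and this table:<br>{table}'''.format(text=text, table=table.to_html())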
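Putting the two ideas together, one way to build a single body with both tables is to generate each table from its rows with a small helper and place both inside one html/body wrapper. This is only a sketch: rows_to_table is a hypothetical helper, the sample rows are invented stand-ins for the two query results, and nothing here touches Outlook. The resulting string would be assigned to mainMsg.HTMLBody.

```python
from html import escape

def rows_to_table(title, headers, rows):
    """Render one HTML table: a title row, a header row, then one row per record."""
    parts = ['<table border="1" style="width:300px">']
    parts.append('<tr><td colspan="%d">%s</td></tr>' % (len(headers), escape(title)))
    parts.append("<tr>" + "".join("<td>%s</td>" % escape(h) for h in headers) + "</tr>")
    for row in rows:
        parts.append("<tr>" + "".join("<td>%s</td>" % escape(str(v)) for v in row) + "</tr>")
    parts.append("</table>")
    return "".join(parts)

# Hypothetical rows standing in for the two query results.
title_level = [("Source A", 12), ("Source B", 7)]
previous_day = [("Approved", 30), ("Rejected", 4)]

html_body = "<html><body>{first}<br>{second}</body></html>".format(
    first=rows_to_table("Title Level", ["Source", "Count"], title_level),
    second=rows_to_table("Previous Day", ["Decision_Status", "Count"], previous_day),
)
print(html_body)
```

Compared with concatenating two complete documents, this keeps a single html/body pair and escapes cell values, which matters if the query results can contain characters such as < or &.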

Generating XML dynamically in Python

Hey friends, I am generating XML data using Python libraries as follows:
def multiwan_info_save(request):
    data = {}
    init = "init"
    try:
        form = Addmultiwanform(request.POST)
    except:
        pass
    if form.is_valid():
        from_sv = form.save(commit=False)
        obj_get = False
        try:
            obj_get = MultiWAN.objects.get(isp_name=from_sv.isp_name)
        except:
            obj_get = False
        nameservr = request.POST.getlist('nameserver_mw')
        for nm in nameservr:
            nameserver1, is_new = NameServer.objects.get_or_create(name=nm)
            from_sv.nameserver = nameserver1
            from_sv.save()
        # main(init)
        top = Element('ispinfo')
        # comment = Comment('Generated for PyMOTW')
        #top.append(comment)
        all_connection = MultiWAN.objects.all()
        for conn in all_connection:
            child = SubElement(top, 'connection number ='+str(conn.id)+'name='+conn.isp_name+'desc='+conn.description )
            subchild_ip = SubElement(child,'ip_address')
            subchild_subnt = SubElement(child,'subnet')
            subchild_gtwy = SubElement(child,'gateway')
            subchild_nm1 = SubElement(child,'probe_server1')
            subchild_nm2 = SubElement(child,'probe_server2')
            subchild_interface = SubElement(child,'interface')
            subchild_weight = SubElement(child,'weight')
            subchild_ip.text = str(conn.ip_address)
            subchild_subnt.text = str(conn.subnet)
            subchild_gtwy.text = str(conn.gateway)
            subchild_nm1.text = str(conn.nameserver.name)
            # subchild_nm2.text = conn.
            subchild_weight.text = str(conn.weight)
            subchild_interface.text = str(conn.interface)
        print "trying to print _____________________________"
        print tostring(top)
        print "let seeeeeeeeeeeeeeeeee +++++++++++++++++++++++++"
But I am getting output like the following:
<ispinfo><connection number =5name=Airtelllldesc=Largets TRelecome ><ip_address>192.168.1.23</ip_address><subnet>192.168.1.23</subnet><gateway>192.168.1.23</gateway><probe_server1>192.168.99.1</probe_server1><probe_server2 /><interface>eth0</interface><weight>160</weight></connection number =5name=Airtelllldesc=Largets TRelecome ><connection number =6name=Uninordesc=Uninor><ip_address>192.166.55.23</ip_address><subnet>192.166.55.23</subnet><gateway>192.168.1.23</gateway><probe_server1>192.168.99.1</probe_server1><probe_server2 /><interface>eth0</interface><weight>160</weight></connection number =6name=Uninordesc=Uninor><connection number =7name=Airteldesc=Largets TRelecome ><ip_address>192.168.1.23</ip_address><subnet>192.168.1.23</subnet><gateway>192.168.1.23</gateway><probe_server1>192.168.99.1</probe_server1><probe_server2 /><interface>eth0</interface><weight>160</weight></connection number =7name=Airteldesc=Largets TRelecome ></ispinfo>
I just want to know how I can write this XML in a proper XML format.
Thanks in advance.
UPDATED to include simulation of both creating and printing of the XML tree
The Basic Issue
Your code is generating invalid connection tags like this:
<connection number =5name=Airtelllldesc=Largets TRelecome ></connection number =5name=Airteldesc=Largets TRelecome >
when they should look like this (I am omitting the sub-elements in between. Your code is generating these correctly):
<connection number="5" name="Airtellll" desc="Largets TRelecome" ></connection>
If you had valid XML, this code would print it neatly:
from lxml import etree
xml = '''<ispinfo><connection number="5" name="Airtellll" desc="Largets TRelecome" ><ip_address>192.168.1.23</ip_address><subnet>192.168.1.23</subnet><gateway>192.168.1.23</gateway><probe_server1>192.168.99.1</probe_server1><probe_server2 /><interface>eth0</interface><weight>160</weight></connection></ispinfo>'''
xml = etree.XML(xml)
print etree.tostring(xml, pretty_print = True)
Generating Valid XML
A small simulation follows:
from lxml import etree
# Some dummy text
conn_id = 5
conn_name = "Airtelll"
conn_desc = "Largets TRelecome"
ip = "192.168.1.23"
# Building the XML tree
# Note how attributes and text are added, using the Element methods
# and not by concatenating strings as in your question
root = etree.Element("ispinfo")
child = etree.SubElement(root, 'connection',
                         number = str(conn_id),
                         name = conn_name,
                         desc = conn_desc)
subchild_ip = etree.SubElement(child, 'ip_address')
subchild_ip.text = ip
# and pretty-printing it
print etree.tostring(root, pretty_print=True)
This will produce:
<ispinfo>
  <connection desc="Largets TRelecome" number="5" name="Airtelll">
    <ip_address>192.168.1.23</ip_address>
  </connection>
</ispinfo>
A single line is proper, in the sense that an XML parser will understand it.
For pretty-printing to sys.stdout, use the dump method of Element.
For pretty-printing to a stream, use the write method of ElementTree.
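If lxml is not available, the same fix (attributes passed as keyword arguments rather than concatenated into the tag name) can be sketched with the standard library's xml.etree.ElementTree in Python 3. The connections list below is hypothetical data standing in for MultiWAN.objects.all(), and ET.indent requires Python 3.9 or later.

```python
import xml.etree.ElementTree as ET

# Hypothetical records standing in for MultiWAN.objects.all().
connections = [
    {"id": 5, "name": "Airtellll", "desc": "Largets TRelecome", "ip": "192.168.1.23"},
    {"id": 6, "name": "Uninor", "desc": "Uninor", "ip": "192.166.55.23"},
]

top = ET.Element("ispinfo")
for conn in connections:
    # The tag is just "connection"; everything else is passed as attributes.
    child = ET.SubElement(
        top, "connection",
        number=str(conn["id"]), name=conn["name"], desc=conn["desc"],
    )
    ET.SubElement(child, "ip_address").text = conn["ip"]

ET.indent(top)  # adds line breaks and indentation in place (Python 3.9+)
xml_text = ET.tostring(top, encoding="unicode")
print(xml_text)
```

Because the attributes go through the library, values containing quotes or angle brackets are escaped automatically, which string concatenation would not do.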
