Parsing XML using Python on SelectSingleNode - python

Using the following XML data, I want to get the value for the relevant keys that I'm calling through the Python code, and I want to accomplish this without using any third-party libraries.
<Userinfo>
<UserData>
<item key="DateOfBirth" value="19851103" />
<item key="FirstName" value="John" />
<item key="LastName" value="Dicaprio" />
<item key="Gender" value="M" />
<item key="Email" value="john@abc.com" />
<item key="ContactNo" value="235625341" />
</UserData>
</Userinfo>
From the above XML, I want to extract the value for each key that I request in the Python code below.
def ExtractXml(args):
    url = '....'
    wc = System.Net.WebClient()
    xml = wc.DownloadString(url)
    doc = System.Xml.XmlDocument()
    doc.LoadXml(xml)
    root = doc.DocumentElement
    nsmgr = System.Xml.XmlNamespaceManager(doc.NameTable)
    #nsmgr.AddNamespace('ns','http://schemas.microsoft.com/developer/msbuild/2003')
    node = root.SelectNodes('/Userinfo/UserData',nsmgr)
    tcount=root.SelectNodes('/Userinfo/UserData').Count
    if not node:
        ServiceDesk.Log.PrintError('No condition node')
        return
    r=[]
    t={}
    counts=0
    for itemNode in node:
        counts += 1
        fullname = xstr(itemNode.SelectSingleNode("/item[@key='FirstName']/@value",nsmgr))
        empname = xstr(itemNode.SelectSingleNode("/item[@key='LastName']/@value",nsmgr))
        cardcountry = xstr(itemNode.SelectSingleNode("/item[@key='Email']/@value",nsmgr))
        #birthdate = ServiceDesk.Common.ParseDateTime(itemNode.SelectSingleNode("item[@key='DateOfBirth']"))
        t = {'counter':counts,'FirstName':fullname,'LastName':empname,'Email':cardcountry,'__rowid__':counter,'__totalcount__':tcount}
        r.append(t)
    return r
With this code, the SelectSingleNode calls do not retrieve the value for the relevant key. Thanks in advance.

I'm not sure about IronPython, but Python's standard library includes platform-independent XML parsers. Here's an example that uses xml.etree.ElementTree.
import datetime
import xml.etree.ElementTree as ET
xml = '''<Userinfo>
<UserData>
<item key="DateOfBirth" value="19851103" />
<item key="FirstName" value="John" />
<item key="LastName" value="Dicaprio" />
<item key="Gender" value="M" />
<item key="Email" value="john@abc.com" />
<item key="ContactNo" value="235625341" />
</UserData>
</Userinfo>'''
root = ET.fromstring(xml)
fname = root.findall(".//item[@key='FirstName']")[0].get('value')
dob = datetime.datetime.strptime(
    root.findall(".//item[@key='DateOfBirth']")[0].get('value'),
    '%Y%m%d')
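If several keys are needed at once, one option is to collect every key/value pair of a UserData element into a dictionary and look values up by name. The following is only a sketch using the standard library; the helper name user_data_to_dict is made up for illustration, and the XML is a shortened copy of the sample from the question.

```python
import datetime
import xml.etree.ElementTree as ET

XML = '''<Userinfo>
  <UserData>
    <item key="DateOfBirth" value="19851103" />
    <item key="FirstName" value="John" />
    <item key="LastName" value="Dicaprio" />
    <item key="Email" value="john@abc.com" />
  </UserData>
</Userinfo>'''


def user_data_to_dict(user_data):
    """Map each <item key=... value=...> child of one UserData element to key -> value."""
    return {item.get('key'): item.get('value') for item in user_data.findall('item')}


root = ET.fromstring(XML)
# One dictionary per <UserData> element (the sample contains a single one).
rows = [user_data_to_dict(user_data) for user_data in root.findall('UserData')]
first = rows[0]
dob = datetime.datetime.strptime(first['DateOfBirth'], '%Y%m%d').date()
print(first['FirstName'], first['LastName'], first['Email'], dob)
```

Looking values up in a dictionary avoids writing one XPath expression per key, at the cost of reading every item once.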

Currently, the XPath expressions in the for loop are all evaluated relative to the document node, because they start with /. To make them relative to the context element, which in this case is itemNode, either prepend a . or remove the leading / completely:
for itemNode in node:
    counts += 1
    # three ways to make the XPath relative to the context element `itemNode`
    fullname = xstr(itemNode.SelectSingleNode("./item[@key='FirstName']/@value",nsmgr))
    empname = xstr(itemNode.SelectSingleNode("item[@key='LastName']/@value",nsmgr))
    cardcountry = xstr(itemNode.SelectSingleNode("self::*/item[@key='Email']/@value",nsmgr))

Related

xml find is always None

How can I use .find or .get to get the value that I want?
I looked at the examples, but I don't know what I did wrong.
I always get None.
<?xml version="1.0" encoding="UTF-8"?>
<config version="1.0" xmlns="http://www.ipc.com/ver10">
<types>
<bitRateType>
<enum>VBR</enum>
<enum>CBR</enum>
</bitRateType>
<quality>
<enum>lowest</enum>
<enum>lower</enum>
</quality>
</types>
<streams type="list" count="2">
<item id="0">
<name type="string" maxLen="32"><![CDATA[rtsp://192.168.0.175:554/chID=2&streamType=main]]></name>
<resolution>2592x1520</resolution>
</item>
<item id="1">
<name type="string" maxLen="32"><![CDATA[rtsp://192.168.0.175:554/chID=2&streamType=sub1]]></name>
<resolution>704x480</resolution>
</item>
</streams>
</config>
My Code.
tree = ET.fromstring(res.text)
types = tree.find('types')
streams = tree.find('streams')
item = streams.findall('item')[0]
print(item.get('name'))
print(item.get('resolution'))
print(types, streams, item)
You can try this approach, which accesses the elements by index:
from xml.etree import ElementTree as ET
new_tree = ET.parse('test.xml')
new_root = new_tree.getroot()
types = new_root[0]
bitRateType = types[0]
quality = types[1]
streams = new_root[1]
item0_name = streams[0][0]
item0_resolution = streams[0][1]
item1_name = streams[1][0]
item1_resolution = streams[1][1]
print(item0_name.attrib)
print(item1_resolution.text)
element.attrib returns a dict of all attributes of the element, for example:
print(item0_name.attrib)
{'type': 'string', 'maxLen': '32'}
element.text returns the text content of the element, for example:
print(item1_resolution.text)
704x480
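The likely reason find() returns None in the question is the default namespace declared on the root element (xmlns="http://www.ipc.com/ver10"): every element name is qualified by that namespace, so a bare 'types' or 'streams' does not match. Note also that name and resolution are child elements, not attributes, so they are read with find(...).text rather than get(). A minimal sketch, using a shortened copy of the XML from the question; the prefix ipc is an arbitrary choice for this example:

```python
import xml.etree.ElementTree as ET

XML = '''<config version="1.0" xmlns="http://www.ipc.com/ver10">
  <streams type="list" count="2">
    <item id="0">
      <name type="string" maxLen="32"><![CDATA[rtsp://192.168.0.175:554/chID=2&streamType=main]]></name>
      <resolution>2592x1520</resolution>
    </item>
    <item id="1">
      <name type="string" maxLen="32"><![CDATA[rtsp://192.168.0.175:554/chID=2&streamType=sub1]]></name>
      <resolution>704x480</resolution>
    </item>
  </streams>
</config>'''

# Map a prefix of our choosing to the document's default namespace URI.
NS = {'ipc': 'http://www.ipc.com/ver10'}

root = ET.fromstring(XML)
streams = root.find('ipc:streams', NS)
first_item = streams.findall('ipc:item', NS)[0]

# <name> and <resolution> are child elements, so read their text.
name = first_item.find('ipc:name', NS).text
resolution = first_item.find('ipc:resolution', NS).text
item_id = first_item.get('id')  # id is a real attribute, so get() works here

print(item_id, name, resolution)
```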

How do I write a function that takes an xml file and an integer value X as parameters and updates the attributes of the xml based on the given integer

I am trying to write a function that will take as parameters my xml file file.xml and an integer I want to input from the keyboard.
My xml files looks like this:
<root>
<item name="A" days="10"/>
<item name="B" days="20"/>
</root>
I have the integer X:
X = int(input("X value is:"))
I want to add the X value to the days attribute in my XML.
For X=1.1, I want the output:
A, 11.1 days
B, 21.1 days
I don't know how to write the function because when I tried calling it the name of the file I wanted to open was not recognized =>
read_xml(file.xml)
NameError : name 'file' is not defined.
But more importantly, I don't know how to add an integer value to the attribute of an xml file.
What I did so far using the ElementTree library:
import os
import xml.etree.ElementTree as et
tree = et.ElementTree(file = 'file.xml')
root = tree.getroot()
for item in root.findall('item'):
    names = item.get('name')
    ages = item.get('age')
    genders = item.get('sex')
    print(f'''\n{names}, {ages} years old''')
At this moment I get the desired output format but without the integer X added to the days attribute.
Please let me know if you have any idea how to solve this in Python3.
Thanks!!!
import xml.etree.ElementTree as ET
xml = '''<root>
<item name="A" days="10"/>
<item name="B" days="20"/>
</root>'''
def change_days_value(factor):
    root = ET.fromstring(xml)
    items = root.findall('.//item')
    for item in items:
        item.attrib['days'] = str(int(item.attrib['days']) * factor)
    ET.dump(root)

# read this value from the user
factor = 1.1
change_days_value(factor)
Output:
<root>
<item days="11.0" name="A" />
<item days="22.0" name="B" />
</root>
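The question asks for X to be added to days and for a function that takes the file name, so here is a rough sketch of that variant. The function name add_to_days is invented for this example, and the demo writes a temporary file.xml so the snippet can run on its own. The NameError in the question comes from writing file.xml without quotes; the path has to be passed as a string.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def add_to_days(path, x):
    """Add x to the 'days' attribute of every <item> and save the file in place."""
    tree = ET.parse(path)
    results = []
    for item in tree.getroot().findall('item'):
        new_days = float(item.get('days')) + x
        item.set('days', str(new_days))  # attribute values must be strings
        results.append((item.get('name'), new_days))
    tree.write(path)
    return results


# Demo with a temporary copy of the sample file.
# In a real script x would come from: x = float(input("X value is: "))
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'file.xml')
    with open(path, 'w') as fh:
        fh.write('<root><item name="A" days="10"/><item name="B" days="20"/></root>')
    updated = add_to_days(path, 1.1)  # note: the path is a string
    for name, days in updated:
        print(f'{name}, {days:.1f} days')
    saved_days = [item.get('days') for item in ET.parse(path).getroot()]
```

float() is used instead of int() because the example value 1.1 is not an integer.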

python ElementTree remove issue

I have xml file as following:
<plugin-config>
<properties>
<property name="AZSRVC_CONNECTION" value="diamond_plugins@AZSRVC" />
<property name="DIAMOND_HOST" value="10.0.230.1" />
<property name="DIAMOND_PORT" value="3333" />
</properties>
<pack-list>
<vsme-pack id="monthly_50MB">
<campaign-list>
<campaign id="2759" type="SOB" />
<campaign id="2723" type="SUBSCRIBE" />
</campaign-list>
</vsme-pack>
<vsme-pack id="monthly_500MB">
<campaign-list>
<campaign id="3879" type="SOB" />
<campaign id="3885" type="SOB" />
<campaign id="2724" type="SUBSCRIBE" />
<campaign id="1111" type="COB" /></campaign-list>
</vsme-pack>
</pack-list>
</plugin-config>
I am trying to run this Python script to remove a 'campaign' with a specific id.
import xml.etree.ElementTree as ET
tree = ET.parse('pack-assign-config.xml')
root = tree.getroot()
pack_list = root.find('pack-list')
camp_list = pack_list.find(".//vsme-pack[@id='{pack_id}']".format(pack_id=pack_id)).find('campaign-list').findall('campaign')
for camp in camp_list:
    if camp.get('id') == '2759':
        camp_list.remove(camp)
tree.write('out.xml')
I run the script, but the output is the same as the input file, so the element is not removed.
Issue:
camp_list = pack_list.find(".//vsme-pack[@id='{pack_id}']".format(pack_id=pack_id)).find('campaign-list').findall('campaign')
findall() returns a plain Python list, so camp_list.remove(camp) only removes the element from that list; the XML tree itself is not changed. To delete a node, call remove() on its parent element (here, campaign-list).
Fixed Code Example
Here is working code that removes the node from the XML:
import xml.etree.ElementTree as ET
tree = ET.parse('pack-assign-config.xml')
root = tree.getroot()
# Visit every <vsme-pack> under <pack-list>
for pack in root.findall('./pack-list/vsme-pack'):
    # remove() has to be called on the direct parent of the element being removed
    for campaign_list in pack.findall('campaign-list'):
        for campaign in campaign_list.findall('campaign'):
            if campaign.get('id') in ('2759', '3879'):
                campaign_list.remove(campaign)
tree.write('out.xml')
hope this helps
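To stay closer to the original script, which targets one pack by its id, the same idea can be wrapped in a small function. This is only a sketch: remove_campaign is a hypothetical helper, and the XML is a shortened copy of the file from the question parsed from a string so the snippet runs on its own.

```python
import xml.etree.ElementTree as ET

XML = '''<plugin-config>
  <pack-list>
    <vsme-pack id="monthly_50MB">
      <campaign-list>
        <campaign id="2759" type="SOB" />
        <campaign id="2723" type="SUBSCRIBE" />
      </campaign-list>
    </vsme-pack>
    <vsme-pack id="monthly_500MB">
      <campaign-list>
        <campaign id="3879" type="SOB" />
      </campaign-list>
    </vsme-pack>
  </pack-list>
</plugin-config>'''


def remove_campaign(root, pack_id, campaign_id):
    """Remove matching <campaign> elements from one pack; return how many were removed."""
    removed = 0
    xpath = f".//vsme-pack[@id='{pack_id}']/campaign-list"
    for campaign_list in root.findall(xpath):
        # findall() returns a list, so removing from the tree while looping is safe.
        for campaign in campaign_list.findall('campaign'):
            if campaign.get('id') == campaign_id:
                campaign_list.remove(campaign)  # remove via the parent element
                removed += 1
    return removed


root = ET.fromstring(XML)
removed = remove_campaign(root, 'monthly_50MB', '2759')
remaining = [campaign.get('id') for campaign in root.iter('campaign')]
print(removed, remaining)
```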

Using Python LXML to removing XML element values but leaving one placeholder

I have an XML file which I would like to clear the text in the 'value' child elements, but leave one empty value element as a placeholder for adding text at a later date. I am using Python's LXML module.
Here's an example of the XML section:
<spec class="Spec" name="New Test">
<mainreport>
<item name="New Item">First Item</item>
</mainreport>
<case class="CaseItem" name="Some Name">
<extraelement>
<item name="ID">Some Id</item>
</extraelement>
<pool class="String" name="Originator">
<value>A</value>
<value>B</value>
<value>C</value>
</pool>
<pool class="String" name="Target">
<value>D</value>
<value>E</value>
<value>F</value>
</pool>
And here's what I am hoping to output:
<spec class="Spec" name="New Test">
<mainreport>
<item name="New Item">First Item</item>
</mainreport>
<case class="CaseItem" name="Some Name">
<extraelement>
<item name="ID">Some Id</item>
</extraelement>
<pool class="String" name="Originator">
<value></value>
</pool>
<pool class="String" name="Target">
<value></value>
</pool>
I have written the following code, but it only adds the "value" tag to the last element:
import lxml.etree as et
import os
xml_match = os.path.join("input.xml")
doc = et.parse(xml_match)
for elem in doc.xpath('//case/pool/value'):
    elem.getparent().remove(elem)
blankval = et.Element("value")
blankval.text = ""
for elem in doc.xpath('//case/pool'):
    elem.insert(1, blankval)
outFile = "output.xml"
doc.write(outFile)
I would remove all value elements and append an empty one in a single loop:
for elem in doc.xpath('//case/pool'):
    for value in elem.findall("value"):
        elem.remove(value)
    blankval = et.Element("value")
    blankval.text = ""
    elem.append(blankval)
There is also a handy .clear() method, but it would also clear up the attributes.
Your current approach does not work because it reuses the very same blankval element. In lxml an element can have only one parent, so each insert moves it out of the previous pool and it ends up only in the last one. Instead, create a new element inside the loop before each insert:
for elem in doc.xpath('//case/pool'):
    blankval = et.Element("value")
    blankval.text = ""
    elem.insert(1, blankval)

How to copy certain information from a text file to XML using Python?

We get order e-mails whenever a buyer makes a purchase; these e-mails are sent in a text format with some relevant and some irrelevant information. I am trying to write a Python program which will read the text and then build an XML file (using ElementTree) which we can import into other software.
Unfortunately I do not quite know the proper terms for some of this, so please bear with the overlong explanations.
The problem is that I cannot figure out how to make it work with more than one product on the order. The program currently goes through each order and puts the data in a dictionary.
while file_length_dic != 0:
    # Go line by line and add each value (and its name) to a dictionary.
    # Keys are the first half of a sentence followed by a distinguishing number.
    for line in raw_email:
        colon_loc = line.index(':')
        end_loc = len(line)
        data_type = line[0:colon_loc] + "_" + file_length
        data_variable = line[colon_loc+2:end_loc].lstrip(' ')
        xml_dic[data_type] = data_variable
        if line.find("URL"):
            break
    file_length_dic -= 1
How can I get this dictionary values into XML? For example, under the main "JOB" element there will be a sub-element ITEMNUMBER and then SALESMANN and QUANTITY. How can I fill out multiple sets?
<JOB>
<ITEM>
<ITEMNUMBER>36322</ITEMNUMBER>
<SALESMANN>17</SALESMANN>
<QUANTITY>2</QUANTITY>
</ITEM>
<ITEM>
<ITEMNUMBER>22388</ITEMNUMBER>
<SALESMANN>5</SALESMANN>
<QUANTITY>8</QUANTITY>
</ITEM>
</JOB>
As far as I can tell, ElementTree will only let me put the data into the first set of children, but I can't imagine this must be so. I also do not know in advance how many items are in each order; it can be anywhere from 1 to 150, and the program needs to scale easily.
Should I be using a different library? lxml looks powerful but again, I do not know what it is exactly I am looking for.
Here's a simple example. Note that the basic ElementTree doesn't pretty print, so I included a pretty print function from the ElementTree author.
If you provide an actual example of the input file and dictionary, it would be easier to target your specific case. I just put some data in a dictionary to show how to iterate over it and generate some XML.
from xml.etree import ElementTree as et

def indent(elem, level=0):
    # Pretty-print helper: adds newlines and indentation in place.
    i = "\n" + level*"  "
    if len(elem):
        if not elem.text or not elem.text.strip():
            elem.text = i + "  "
        if not elem.tail or not elem.tail.strip():
            elem.tail = i
        for elem in elem:
            indent(elem, level+1)
        if not elem.tail or not elem.tail.strip():
            elem.tail = i
    else:
        if level and (not elem.tail or not elem.tail.strip()):
            elem.tail = i

D = {36322:(17,2),22388:(5,8)}
job = et.Element('JOB')
for itemnumber,(salesman,quantity) in D.items():
    et.SubElement(job,'ITEMNUMBER').text = str(itemnumber)
    et.SubElement(job,'SALESMAN').text = str(salesman)
    et.SubElement(job,'QUANTITY').text = str(quantity)
indent(job)
et.dump(job)
Output:
<JOB>
  <ITEMNUMBER>36322</ITEMNUMBER>
  <SALESMAN>17</SALESMAN>
  <QUANTITY>2</QUANTITY>
  <ITEMNUMBER>22388</ITEMNUMBER>
  <SALESMAN>5</SALESMAN>
  <QUANTITY>8</QUANTITY>
</JOB>
Although, as @alko mentioned, a more structured XML might be:
job = et.Element('JOB')
for itemnumber,(salesman,quantity) in D.items():
    item = et.SubElement(job,'ITEM')
    et.SubElement(item,'NUMBER').text = str(itemnumber)
    et.SubElement(item,'SALESMAN').text = str(salesman)
    et.SubElement(item,'QUANTITY').text = str(quantity)
Output:
<JOB>
  <ITEM>
    <NUMBER>36322</NUMBER>
    <SALESMAN>17</SALESMAN>
    <QUANTITY>2</QUANTITY>
  </ITEM>
  <ITEM>
    <NUMBER>22388</NUMBER>
    <SALESMAN>5</SALESMAN>
    <QUANTITY>8</QUANTITY>
  </ITEM>
</JOB>
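Since the number of items per order is not known in advance, one possible way to scale this is to keep each order line as its own dictionary and loop over a list of them. The sketch below assumes the e-mail has already been parsed into such a list; the orders data and the build_job helper are made up for illustration, with tag names taken from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical parsed order data: one dict per product on the order.
orders = [
    {'ITEMNUMBER': 36322, 'SALESMANN': 17, 'QUANTITY': 2},
    {'ITEMNUMBER': 22388, 'SALESMANN': 5, 'QUANTITY': 8},
]


def build_job(items):
    """Create a <JOB> element with one <ITEM> child per dictionary in items."""
    job = ET.Element('JOB')
    for fields in items:
        item = ET.SubElement(job, 'ITEM')
        for tag, value in fields.items():
            # Each dictionary key becomes a child tag; text must be a string.
            ET.SubElement(item, tag).text = str(value)
    return job


job = build_job(orders)
xml_text = ET.tostring(job, encoding='unicode')
print(xml_text)
```

The same loop handles 1 or 150 items, because every ITEM element is created fresh inside the loop and attached to the JOB root.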
Your XML structure does not seem valid to me. How can one tell which salesman refers to which item number?
You probably need something like this:
<JOB>
<ITEM>
<NUMBER>36322</NUMBER>
<SALESMANN>17</SALESMANN>
<QUANTITY>2</QUANTITY>
</ITEM>
<ITEM>
<NUMBER>22388</NUMBER>
<SALESMANN>5</SALESMANN>
<QUANTITY>8</QUANTITY>
</ITEM>
</JOB>
For a list of serialization techniques, refer to Serialize Python dictionary to XML
Sample with dicttoxml:
import dicttoxml
from xml.dom.minidom import parseString
xml = dicttoxml.dicttoxml({'JOB':[{'NUMBER':36322,
'QUANTITY': 2,
'SALESMANN': 17}
]}, root=False)
dom = parseString(xml)
and output
>>> print(dom.toprettyxml())
<?xml version="1.0" ?>
<JOB type="list">
<item type="dict">
<SALESMANN type="int">
17
</SALESMANN>
<NUMBER type="int">
36322
</NUMBER>
<QUANTITY type="int">
2
</QUANTITY>
</item>
</JOB>
