XML reading the last entry from the file in python


I want to read the last entry of an XML file and get its values. Here is my XML file:
<TestSuite>
  <TestCase>
    <name>tcname1</name>
    <total>1</total>
    <totalpass>0</totalpass>
    <totalfail>0</totalfail>
    <totalerror>1</totalerror>
  </TestCase>
  <TestCase>
    <name>tcname2</name>
    <total>1</total>
    <totalpass>0</totalpass>
    <totalfail>0</totalfail>
    <totalerror>1</totalerror>
  </TestCase>
</TestSuite>
I want to get the <total>, <totalpass>, <totalfail> and <totalerror> values from the last <TestCase> element in the file. I have tried this code to do that:
import xmltodict
with open(filename) as fd:
    doc = xmltodict.parse(fd.read())
length=len(doc['TestSuite']['TestCase'])
tp=doc['TestSuite']['TestCase'][length-1]['totalpass']
tf=doc['TestSuite']['TestCase'][length-1]['totalfail']
te=doc['TestSuite']['TestCase'][length-1]['totalerror']
total=doc['TestSuite']['TestCase'][length-1]['total']
This works for XML files with two or more TestCase tags, but fails with the following error when the file contains only one TestCase tag:
Traceback (most recent call last):
  File "HTMLReportGenerationFromXML.py", line 52, in <module>
    tp=doc['TestSuite']['TestCase'][length-1]['totalpass']
KeyError: 4
It seems that, instead of the number of TestCase tags, the length is the number of subtags inside the single TestCase. Please help me resolve this issue.

Since you only want the last one, you can use a negative index to retrieve it:
import xml.etree.ElementTree as et
tree = et.parse('test.xml')
# collect all the test cases (direct children of the root element)
test_cases = tree.findall('TestCase')
# pull data from the last one
last = test_cases[-1]
total = last.find('total').text
totalpass = last.find('totalpass').text
totalfail = last.find('totalfail').text
totalerror = last.find('totalerror').text
print(total, totalpass, totalfail, totalerror)

Why didn't I do this in the first place! Use XPath.
The examples below process two files, one with a single TestCase element and one with two of them. The key point is to use the XPath last() selector.
>>> from lxml import etree
>>> tree = etree.parse('temp.xml')
>>> last_TestCase = tree.xpath('.//TestCase[last()]')[0]
>>> for child in last_TestCase.iterchildren():
...     child.tag, child.text
...
('name', 'tcname2')
('total', '1')
('totalpass', '0')
('totalfail', '0')
('totalerror', '1')
>>>
>>> tree = etree.parse('temp_2.xml')
>>> last_TestCase = tree.xpath('.//TestCase[last()]')[0]
>>> for child in last_TestCase.iterchildren():
...     child.tag, child.text
...
('name', 'tcname1')
('reason', 'reason')
('total', '2')
('totalpass', '0')
('totalfail', '0')
('totalerror', '2')
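
For readers without lxml, the standard library's ElementTree also accepts the last() predicate in its limited XPath subset. The following is only a minimal sketch under assumptions: the two sample strings are trimmed-down stand-ins for the files above, and the helper name last_test_case is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed-down sample documents (assumed, not the original files)
SINGLE = (
    "<TestSuite>"
    "<TestCase><name>tcname1</name><total>1</total></TestCase>"
    "</TestSuite>"
)
DOUBLE = (
    "<TestSuite>"
    "<TestCase><name>tcname1</name><total>1</total></TestCase>"
    "<TestCase><name>tcname2</name><total>1</total></TestCase>"
    "</TestSuite>"
)


def last_test_case(xml_text):
    """Return the child tags and texts of the last TestCase as a dict."""
    root = ET.fromstring(xml_text)
    # [last()] selects the last TestCase child of the root element
    last = root.find("TestCase[last()]")
    if last is None:
        return {}
    return {child.tag: child.text for child in last}


print(last_test_case(SINGLE))
print(last_test_case(DOUBLE))
```

The same call works whether the file holds one TestCase or several, which is the case the question struggles with.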

I have tried this, and it works for me:
import xml.etree.ElementTree as ET
tree = ET.parse('temp.xml')
root = tree.getroot()
print(root)
total = []
totalpass = []
totalfail = []
totalerror = []
# collect the values of every TestCase, in document order
for test in root.findall('TestCase'):
    total.append(test.find('total').text)
    totalpass.append(test.find('totalpass').text)
    totalfail.append(test.find('totalfail').text)
    totalerror.append(test.find('totalerror').text)
# the last item of each list belongs to the last TestCase
print(total[-1], totalpass[-1], totalfail[-1], totalerror[-1])

The reason for your error is that, with xmltodict, doc['TestSuite']['TestCase'] is a list only when the XML contains more than one TestCase entry:
>>> type(doc2['TestSuite']['TestCase']) # here doc2 comes from an XML file with more than one entry
list
but it is just a kind of dictionary for a file with a single entry:
>>> type(doc['TestSuite']['TestCase']) # doc comes from an XML file with one entry
collections.OrderedDict
Indexing that dictionary with length-1 looks up a key instead of a position, hence the KeyError. You could try to manage the issue in the following way:
import xmltodict
with open(filename) as fd:
    doc = xmltodict.parse(fd.read())
test_cases = doc['TestSuite']['TestCase']
if isinstance(test_cases, list):
    last = test_cases[-1]  # several entries: take the last one
else:
    last = test_cases  # a single entry: you have just a dict here
tp = last['totalpass']
tf = last['totalfail']
te = last['totalerror']
total = last['total']
Otherwise, you can use another library for the XML parsing.
Let me know if it helps!
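
The list-or-dict check can also be pulled into a small helper so that the rest of the code always works with a list. This is only a sketch: as_list is a hypothetical name, and the two dictionaries are hand-written stand-ins that mimic the shapes described above rather than real xmltodict output.

```python
def as_list(value):
    """Return value unchanged if it is a list, otherwise wrap it in one."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


# Hand-written stand-ins for the parsed documents (assumed shapes)
one_entry = {"TestSuite": {"TestCase": {"name": "tcname1", "total": "1"}}}
two_entries = {
    "TestSuite": {
        "TestCase": [
            {"name": "tcname1", "total": "1"},
            {"name": "tcname2", "total": "1"},
        ]
    }
}

for doc in (one_entry, two_entries):
    # After normalizing, [-1] is safe for both shapes
    last = as_list(doc["TestSuite"]["TestCase"])[-1]
    print(last["name"], last["total"])
```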

Related

How to add namespace prefix at root XML using Python LXML?

I would like every element, including the root, to carry the NS prefix qsp:
<qsp:QSPart xmlns:qsp="urn:qvalent:quicksuper:gateway">
  <qsp:MemberRegistrationRequest/>
</qsp:QSPart>
How do I do that with lxml in Python? This is what I have tried:
from lxml import etree
nsmap = {'qsp': 'urn:qvalent:quicksuper:gateway'}
nsprefix = nsmap['qsp']
QSPart = etree.Element('QSPart', nsmap=nsmap)
MemberRegistrationRequest = etree.SubElement(QSPart, etree.QName(nsprefix, 'MemberRegistrationRequest'))
print(etree.tostring(QSPart, pretty_print=True, encoding=str))
Result:
<QSPart xmlns:qsp="urn:qvalent:quicksuper:gateway">
  <qsp:MemberRegistrationRequest/>
</QSPart>

According to the documentation, you need to fully qualify the element name in your call to etree.Element:
from lxml import etree
nsmap = {'qsp': 'urn:qvalent:quicksuper:gateway'}
nsprefix = nsmap['qsp']
QSPart = etree.Element(f'{{{nsmap["qsp"]}}}QSPart')
MemberRegistrationRequest = etree.SubElement(QSPart, etree.QName(nsprefix, 'MemberRegistrationRequest'))
print(etree.tostring(QSPart, pretty_print=True, encoding=str))
This outputs:
<ns0:QSPart xmlns:ns0="urn:qvalent:quicksuper:gateway">
  <ns0:MemberRegistrationRequest/>
</ns0:QSPart>
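
The output above carries the right namespace but an auto-generated ns0 prefix, because no prefix mapping was supplied when the root was created. In lxml, passing nsmap=nsmap together with the fully qualified name should give the qsp prefix. As a runnable illustration of the same idea without lxml, here is a sketch that swaps in the standard library's ElementTree, where the prefix is registered globally with register_namespace; the variable names are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

NS_URI = "urn:qvalent:quicksuper:gateway"

# Tell the serializer which prefix to use for this namespace URI
ET.register_namespace("qsp", NS_URI)

# Both elements are created with fully qualified {uri}tag names
root = ET.Element(f"{{{NS_URI}}}QSPart")
ET.SubElement(root, f"{{{NS_URI}}}MemberRegistrationRequest")

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Note that register_namespace affects every later serialization in the process, which may or may not be what you want in a larger program.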

Since you know your expected output, I wouldn't bother with all that (though I understand many people frown on this approach...) - just parse the text with fromstring and serialize it again with tostring:
from lxml import etree
frag_text = """<qsp:QSPart xmlns:qsp="urn:qvalent:quicksuper:gateway">
  <qsp:MemberRegistrationRequest/>
</qsp:QSPart>"""
fragment = etree.fromstring(frag_text)
print(etree.tostring(fragment).decode())
The output should be your expected output.

parsing .xml file using python :search and copy related data

I want to copy some data from an .xml file based on a search value.
In the XML file below, I want to search for 0xCCB7B836 and copy the data inside that block:
4e564d2d52656648
6173685374617274
1782af065966579e
899885d440d3ad67
d04b41b15e2b13c2
More examples:
search for 0xECFBBA1A and return 0000
or
search for 0xA54E2B5A and return 30d4
<MEM_DATA>
  <MEM_SECTOR>
    <MEM_SECTOR_NUMBER>0</MEM_SECTOR_NUMBER>
    <MEM_SECTOR_STATUS>ACTIVE</MEM_SECTOR_STATUS>
    <MEM_SECTOR_STARTADR>0x800000</MEM_SECTOR_STARTADR>
    <MEM_SECTOR_ENDADR>0x0</MEM_SECTOR_ENDADR>
    <MEM_SECTOR_COUNTER>0x1</MEM_SECTOR_COUNTER>
    <MEM_ERASED_MARKER>SET</MEM_ERASED_MARKER>
    <MEM_USED_MARKER>SET</MEM_USED_MARKER>
    <MEM_FULL_MARKER>NOT_SET</MEM_FULL_MARKER>
    <MEM_ERASE_MARKER>NOT_SET</MEM_ERASE_MARKER>
    <MEM_START_MARKER>SET</MEM_START_MARKER>
    <MEM_START_OFFSET>0x1</MEM_START_OFFSET>
    <MEM_CLONE_MARKER>NOT_SET</MEM_CLONE_MARKER>
    <MEM_BLOCK>
      <MEM_BLOCK_ID>0x101</MEM_BLOCK_ID>
      <MEM_BLOCK_NAME>UNKNOWN</MEM_BLOCK_NAME>
      <MEM_BLOCK_STATUS>VALID</MEM_BLOCK_STATUS>
      <MEM_BLOCK_FLAGS>0x0</MEM_BLOCK_FLAGS>
      <MEM_BLOCK_STORAGE>Emulation</MEM_BLOCK_STORAGE>
      <MEM_BLOCK_LEN>0x28</MEM_BLOCK_LEN>
      <MEM_BLOCK_VERSION>0x0</MEM_BLOCK_VERSION>
      <MEM_BLOCK_HEADER_CRC>0xE527</MEM_BLOCK_HEADER_CRC>
      <MEM_BLOCK_CRC>0xCCB7B836</MEM_BLOCK_CRC>
      <MEM_BLOCK_CRC2>None</MEM_BLOCK_CRC2>
      <MEM_BLOCK_DATA>
        <MEM_PAGE_DATA>4e564d2d52656648</MEM_PAGE_DATA>
        <MEM_PAGE_DATA>6173685374617274</MEM_PAGE_DATA>
        <MEM_PAGE_DATA>1782af065966579e</MEM_PAGE_DATA>
        <MEM_PAGE_DATA>899885d440d3ad67</MEM_PAGE_DATA>
        <MEM_PAGE_DATA>d04b41b15e2b13c2</MEM_PAGE_DATA>
      </MEM_BLOCK_DATA>
    </MEM_BLOCK>
    <MEM_BLOCK>
      <MEM_BLOCK_ID>0x20F</MEM_BLOCK_ID>
      <MEM_BLOCK_NAME>UNKNOWN</MEM_BLOCK_NAME>
      <MEM_BLOCK_STATUS>VALID</MEM_BLOCK_STATUS>
      <MEM_BLOCK_FLAGS>0x0</MEM_BLOCK_FLAGS>
      <MEM_BLOCK_STORAGE>Emulation</MEM_BLOCK_STORAGE>
      <MEM_BLOCK_LEN>0x2</MEM_BLOCK_LEN>
      <MEM_BLOCK_VERSION>0x0</MEM_BLOCK_VERSION>
      <MEM_BLOCK_HEADER_CRC>0xE0D2</MEM_BLOCK_HEADER_CRC>
      <MEM_BLOCK_CRC>0xECFBBA1A</MEM_BLOCK_CRC>
      <MEM_BLOCK_CRC2>None</MEM_BLOCK_CRC2>
      <MEM_BLOCK_DATA>
        <MEM_PAGE_DATA>0000</MEM_PAGE_DATA>
      </MEM_BLOCK_DATA>
    </MEM_BLOCK>
    <MEM_BLOCK>
      <MEM_BLOCK_ID>0x1F8</MEM_BLOCK_ID>
      <MEM_BLOCK_NAME>UNKNOWN</MEM_BLOCK_NAME>
      <MEM_BLOCK_STATUS>VALID</MEM_BLOCK_STATUS>
      <MEM_BLOCK_FLAGS>0x0</MEM_BLOCK_FLAGS>
      <MEM_BLOCK_STORAGE>Emulation</MEM_BLOCK_STORAGE>
      <MEM_BLOCK_LEN>0x2</MEM_BLOCK_LEN>
      <MEM_BLOCK_VERSION>0x0</MEM_BLOCK_VERSION>
      <MEM_BLOCK_HEADER_CRC>0x1DCC</MEM_BLOCK_HEADER_CRC>
      <MEM_BLOCK_CRC>0xA54E2B5A</MEM_BLOCK_CRC>
      <MEM_BLOCK_CRC2>None</MEM_BLOCK_CRC2>
      <MEM_BLOCK_DATA>
        <MEM_PAGE_DATA>30d4</MEM_PAGE_DATA>
      </MEM_BLOCK_DATA>
    </MEM_BLOCK>
  </MEM_SECTOR>
</MEM_DATA>

Assuming that this XML data is in a file named test.xml, you can do something like this:
import xml.etree.ElementTree as ET
tree = ET.parse('test.xml')
root = tree.getroot()
def search_and_copy(query):
    # return the page data of the first MEM_BLOCK whose CRC matches the query
    for child in root.findall("MEM_SECTOR/MEM_BLOCK"):
        if child.find("MEM_BLOCK_CRC").text == query:
            return [item.text for item in child.findall("MEM_BLOCK_DATA/*")]
Let's try this search_and_copy() function out:
>>> search_and_copy("0xCCB7B836")
['4e564d2d52656648', '6173685374617274', '1782af065966579e', '899885d440d3ad67', 'd04b41b15e2b13c2']
>>> search_and_copy("0xA54E2B5A")
['30d4']

We can use XPath, with Python's xml.etree and the elementpath package, to write a function that retrieves the data.
Breakdown of the code below (within the elementpath.Selector):
1. The first line looks for elements whose text is our search string.
2. The second line, .., goes back one step to get the parent element.
3. Proceeding from the parent element, the third line searches for MEM_PAGE_DATA within it. This element holds the data we are actually interested in.
4. The rest of the code simply pulls the text from the matches.
import xml.etree.ElementTree as ET
import elementpath
# the shared data is wrapped into a test.xml file
root = ET.parse('test.xml').getroot()
def find_data(search_string):
    selector = elementpath.Selector(f""".//*[text()='{search_string}']
                                        //..
                                        //MEM_PAGE_DATA""")
    # pull text from the matches
    result = [entry.text for entry in selector.select(root)]
    return result
Test on the strings provided :
find_data("0xCCB7B836")
['4e564d2d52656648',
'6173685374617274',
'1782af065966579e',
'899885d440d3ad67',
'd04b41b15e2b13c2']
find_data("0xECFBBA1A")
['0000']
find_data("0xA54E2B5A")
['30d4']
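
The same lookup can also be written as a single path with a child-text predicate, which the standard library's ElementTree supports, so no extra package is needed. The sketch below makes some assumptions: the sample document is a cut-down version of the XML in the question, and page_data_for_crc is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Cut-down sample keeping only the elements used by the lookup (assumed)
SAMPLE = """
<MEM_DATA>
  <MEM_SECTOR>
    <MEM_BLOCK>
      <MEM_BLOCK_CRC>0xCCB7B836</MEM_BLOCK_CRC>
      <MEM_BLOCK_DATA>
        <MEM_PAGE_DATA>4e564d2d52656648</MEM_PAGE_DATA>
        <MEM_PAGE_DATA>6173685374617274</MEM_PAGE_DATA>
      </MEM_BLOCK_DATA>
    </MEM_BLOCK>
    <MEM_BLOCK>
      <MEM_BLOCK_CRC>0xECFBBA1A</MEM_BLOCK_CRC>
      <MEM_BLOCK_DATA>
        <MEM_PAGE_DATA>0000</MEM_PAGE_DATA>
      </MEM_BLOCK_DATA>
    </MEM_BLOCK>
  </MEM_SECTOR>
</MEM_DATA>
"""


def page_data_for_crc(root, crc):
    """Return the MEM_PAGE_DATA texts of the block whose CRC equals crc."""
    # [MEM_BLOCK_CRC='...'] keeps only blocks with a matching child text
    path = f".//MEM_BLOCK[MEM_BLOCK_CRC='{crc}']/MEM_BLOCK_DATA/MEM_PAGE_DATA"
    return [elem.text for elem in root.findall(path)]


root = ET.fromstring(SAMPLE)
print(page_data_for_crc(root, "0xCCB7B836"))
print(page_data_for_crc(root, "0xECFBBA1A"))
```

Because the search value is pasted into the path, this is only suitable for trusted values that contain no quote characters.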

How to get the child of child using Python's ElementTree

I'm building a Python script that communicates with a PLC. When compiling, the PLC creates an XML file that delivers important information about the program. The XML looks more or less like this:
<visu>
  <time>12:34</time>
  <name>my_visu</name>
  <language>english</language>
  <vars>
    <var name="input1">2</var>
    <var name="input2">45.6</var>
    <var name="input3">"hello"</var>
  </vars>
</visu>
The important part is found under the child "vars". Using Python, I want to write a script that prints "45.6" when it is given the argument "input2".
So far I can read all children of "visu", but I don't know how to tell Python to search among "the children of a child". Here is what I have so far:
tree = ET.parse("file.xml")
root = tree.getroot()
for child in root:
    if child.tag == "vars":
        .......
        if ( "childchild".attrib.get("name") == "input2" ):
            print "childchild".text
Any ideas how I can complete the script? (Or maybe there is a more efficient way of programming it?)

You'd be better off using an XPath search here:
name = 'input2'
value = root.find('.//vars/var[@name="{}"]'.format(name)).text
This searches for a <var> tag directly below a <vars> tag, whose attribute name is equal to the value given by the Python name variable, then retrieves the text value of that tag.
Demo:
>>> from xml.etree import ElementTree as ET
>>> sample = '''\
... <visu>
... <time>12:34</time>
... <name>my_visu</name>
... <language>english</language>
... <vars>
... <var name="input1">2</var>
... <var name="input2">45.6</var>
... <var name="input3">"hello"</var>
... </vars>
... </visu>
... '''
>>> root = ET.fromstring(sample)
>>> name = 'input2'
>>> root.find('.//vars/var[@name="{}"]'.format(name)).text
'45.6'
You can do this the hard way and manually loop over all the elements; each element can be looped over directly:
name = 'input2'
for elem in root:
    if elem.tag == 'vars':
        for var in elem:
            if var.attrib.get('name') == name:
                print(var.text)
but using element.find() or element.findall() is probably going to be easier and more concise.
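
Since find() returns None when nothing matches, calling .text on its result directly raises an AttributeError for an unknown name. A small wrapper can guard against that. This is a sketch only: get_var is a hypothetical helper name, and the sample is the XML from the question.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<visu>
  <time>12:34</time>
  <name>my_visu</name>
  <language>english</language>
  <vars>
    <var name="input1">2</var>
    <var name="input2">45.6</var>
    <var name="input3">"hello"</var>
  </vars>
</visu>
"""


def get_var(root, name):
    """Return the text of <var name=...> under <vars>, or None if absent."""
    elem = root.find('.//vars/var[@name="{}"]'.format(name))
    return elem.text if elem is not None else None


root = ET.fromstring(SAMPLE)
print(get_var(root, "input2"))
print(get_var(root, "does_not_exist"))
```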

Using "info.get" for a child element in Python / lxml

I'm trying to get the attribute of a child element in Python, using lxml.
This is the structure of the XML:
<GroupInformation groupId="crid://thing.com/654321" ordered="true">
  <GroupType value="show" xsi:type="ProgramGroupTypeType"/>
  <BasicDescription>
    <Title type="main" xml:lang="EN">A programme</Title>
    <RelatedMaterial>
      <HowRelated href="urn:eventis:metadata:cs:HowRelatedCS:2010:boxCover">
        <Name>Box cover</Name>
      </HowRelated>
      <MediaLocator>
        <mpeg7:MediaUri>file://ftp.something.com/Images/123456.jpg</mpeg7:MediaUri>
      </MediaLocator>
    </RelatedMaterial>
  </BasicDescription>
The code I've got is below. The bit I want to return is the 'value' attribute ("show" in the example) of the GroupType element, which the code reads into gtype (third line from the bottom):
file_name = input('Enter the file name, including .xml extension: ')
print('Parsing ' + file_name)
from lxml import etree
parser = etree.XMLParser()
tree = etree.parse(file_name, parser)
root = tree.getroot()
nsmap = {'xmlns': 'urn:tva:metadata:2010','mpeg7':'urn:tva:mpeg7:2008'}
with open(file_name+'.log', 'w', encoding='utf-8') as f:
    for info in root.xpath('//xmlns:GroupInformation', namespaces=nsmap):
        crid = (info.get('groupId'))
        grouptype = info.find('.//xmlns:GroupType', namespaces=nsmap)
        gtype = grouptype.get('value')
        titlex = info.find('.//xmlns:BasicDescription/xmlns:Title', namespaces=nsmap)
        title = titlex.text if titlex is not None else 'Missing'
Can anyone explain to me how to implement it? I had a quick look at the xsi namespace, but was unable to get it to work (and didn't know if it was the right thing to do).

Is this what you are looking for?
grouptype.attrib['value']
PS: why the parentheses around the assigned values? Those look unnecessary.
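
To illustrate reading an attribute from a namespaced child element, here is a sketch that uses the standard library's ElementTree rather than lxml. Several things are assumptions: the TVAMain wrapper element and the xsi declaration are made up so that the fragment parses on its own, the tva prefix is an arbitrary label for the urn:tva:metadata:2010 namespace from the question, and group_types is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Self-contained stand-in for the document (wrapper element is assumed)
SAMPLE = """
<TVAMain xmlns="urn:tva:metadata:2010"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <GroupInformation groupId="crid://thing.com/654321" ordered="true">
    <GroupType value="show" xsi:type="ProgramGroupTypeType"/>
    <BasicDescription>
      <Title type="main" xml:lang="EN">A programme</Title>
    </BasicDescription>
  </GroupInformation>
</TVAMain>
"""

# The prefix used in the paths only has to match this mapping
NS = {"tva": "urn:tva:metadata:2010"}


def group_types(root):
    """Return (groupId, GroupType value) pairs for each GroupInformation."""
    results = []
    for info in root.findall(".//tva:GroupInformation", NS):
        grouptype = info.find("tva:GroupType", NS)
        # Unprefixed attributes are read by their plain name
        gtype = grouptype.get("value") if grouptype is not None else "Missing"
        results.append((info.get("groupId"), gtype))
    return results


root = ET.fromstring(SAMPLE)
print(group_types(root))
```

The value attribute has no namespace prefix, so neither get('value') nor attrib['value'] needs the namespace mapping; only the element lookups do.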

Generating Xml using python

Kindly have a look at the code below. I am using it to generate an XML document with Python.
from lxml import etree
# Some dummy text
conn_id = 5
conn_name = "Airtelll"
conn_desc = "Largets TRelecome"
ip = "192.168.1.23"
# Building the XML tree
# Note how attributes and text are added, using the Element methods
# and not by concatenating strings as in your question
root = etree.Element("ispinfo")
child = etree.SubElement(root, 'connection',
                         number = str(conn_id),
                         name = conn_name,
                         desc = conn_desc)
subchild_ip = etree.SubElement(child, 'ip_address')
subchild_ip.text = ip
# and pretty-printing it (tostring() returns bytes, so decode before printing)
print(etree.tostring(root, pretty_print=True).decode())
This will produce:
<ispinfo>
  <connection desc="Largets TRelecome" number="5" name="Airtelll">
    <ip_address>192.168.1.23</ip_address>
  </connection>
</ispinfo>
But I want it to be like this:
<ispinfo>
  <connection desc="Largets TRelecome" number='1' name="Airtelll">
    <ip_address>192.168.1.23</ip_address>
  </connection>
</ispinfo>
That is, the value of the number attribute should be in single quotes. Any idea how I can achieve this?

There is no flag in lxml to do this, so you have to resort to manual manipulation of the serialized string:
import re
# encoding=str makes tostring() return text rather than bytes, so a str pattern can be applied
re.sub(r'number="([0-9]+)"', r"number='\1'", etree.tostring(root, pretty_print=True, encoding=str))
However, why do you want to do this? The difference is purely cosmetic.
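
To see the substitution end to end, here is a sketch that builds the same tree with the standard library's ElementTree instead of lxml, serializes it to text, and rewrites only the quotes of the number attribute. The variable names are chosen for this example.

```python
import re
import xml.etree.ElementTree as ET

root = ET.Element("ispinfo")
child = ET.SubElement(
    root,
    "connection",
    number="5",
    name="Airtelll",
    desc="Largets TRelecome",
)
ET.SubElement(child, "ip_address").text = "192.168.1.23"

# Serialize to text (not bytes) so a str regex can be applied
xml_text = ET.tostring(root, encoding="unicode")

# Swap the double quotes around the numeric value for single quotes
single_quoted = re.sub(r'number="([0-9]+)"', r"number='\1'", xml_text)
print(single_quoted)
```

The result is still well-formed XML and parses to the same tree. The pattern is deliberately naive, though: it would also rewrite the text number="5" if it ever appeared inside element content.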
