Is it possible with the package xml.etree to find the parent of a child? For example:
<ELEMENTS>
<CONSTANT-SPECIFICATION>
</CONSTANT-SPECIFICATION>
</ELEMENTS>
<ELEMENTS>
<DATA-SPECIFICATION>
</DATA-SPECIFICATION>
</ELEMENTS>
I am looking for the "ELEMENTS" element that contains the child "CONSTANT-SPECIFICATION".
You can use the XPath expression .//ELEMENTS[CONSTANT-SPECIFICATION], for example:
import xml.etree.ElementTree as ET
data = """<?xml version="1.0" encoding="ISO-8859-1"?>
<ROOT>
<ELEMENTS>
<CONSTANT-SPECIFICATION>
</CONSTANT-SPECIFICATION>
</ELEMENTS>
<ELEMENTS>
<DATA-SPECIFICATION>
</DATA-SPECIFICATION>
</ELEMENTS>
</ROOT>
"""
root = ET.fromstring(data)
# Tag names are case-sensitive, so they must match the document exactly.
print(root.find('.//ELEMENTS[CONSTANT-SPECIFICATION]'))
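The predicate above works when you start from the root. If you start from the child instead, note that ElementTree elements do not keep a reference to their parent, so a common workaround is to build a child-to-parent map once. A minimal sketch on the same sample document; `parent_map` is a name chosen here for illustration:

```python
import xml.etree.ElementTree as ET

data = """<ROOT>
<ELEMENTS>
<CONSTANT-SPECIFICATION>
</CONSTANT-SPECIFICATION>
</ELEMENTS>
<ELEMENTS>
<DATA-SPECIFICATION>
</DATA-SPECIFICATION>
</ELEMENTS>
</ROOT>"""

root = ET.fromstring(data)

# Map every child element to its parent (ElementTree has no getparent()).
parent_map = {child: parent for parent in root.iter() for child in parent}

child = root.find('.//CONSTANT-SPECIFICATION')
parent = parent_map[child]
print(parent.tag)  # ELEMENTS
```

The map is a snapshot, so it would need rebuilding after the tree is modified.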
Related
I want to remove an element but keep its children. I tried the code below, but it removes the children as well.
code
import xml.etree.ElementTree as ET
tree = ET.parse('test.xml')
root = tree.getroot()
for item in root.findall('item'):
    root.remove(item)
print(ET.tostring(root))
>>> <root>
</root>
test.xml
<?xml version="1.0" ?>
<root>
<item>
<data>
<number>01</number>
<step>one</step>
</data>
</item>
</root>
expected outcome
<?xml version="1.0" ?>
<root>
<data>
<number>01</number>
<step>one</step>
</data>
</root>
You should move all children of item to root before removing it:
for item in root.findall('item'):
    # Re-attach each child to root, then drop the now-redundant wrapper.
    for child in item:
        root.append(child)
    root.remove(item)
print(ET.tostring(root))
This code results in:
<root>
<data>
<number>01</number>
<step>one</step>
</data>
</root>
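Note that append() puts the children at the end of root, which only matches the original position when item is the last child. A minimal sketch of a variant that keeps the children where item used to be; the sibling elements `a` and `b` are made up for the demonstration:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<root><a/><item><data><number>01</number></data></item><b/></root>"
)

for item in root.findall('item'):
    # Remember where item sits among root's children.
    index = list(root).index(item)
    root.remove(item)
    # Insert the children at that position, keeping their order.
    for offset, child in enumerate(item):
        root.insert(index + offset, child)

print([child.tag for child in root])  # ['a', 'data', 'b']
```

Any tail text attached to item is dropped by this sketch.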
Find the element with the data tag, remove it, and extend its parent with its children.
import xml.etree.ElementTree as etree
data = """
<root>
<item>
<data>
<number>01</number>
<step>one</step>
</data>
</item>
</root>
"""
def iterparent(tree):
    # Yield (parent, child) pairs; list() snapshots the children so the
    # caller can safely modify parent while iterating.
    for parent in tree.iter():
        for child in list(parent):
            yield parent, child
tree = etree.fromstring(data)
for parent, child in iterparent(tree):
    if child.tag == "data":
        parent.remove(child)
        parent.extend(child)
print(etree.tostring(tree))
will output
<root>
<item>
<number>01</number>
<step>one</step>
</item>
</root>
Adapted from a similar answer for your particular use case.
Given these XML documents:
Document 1
<root>
<element1>
</element1>
</root>
Document 2
<request>
<dummyValue>5</dummyValue>
</request>
Using Python's ElementTree, I'd like to insert the second document into the first so that the result looks as follows.
Resulting document
<root>
<element1>
<request>
<dummyValue>5</dummyValue>
</request>
</element1>
</root>
ET.SubElement(element1, request) gives me a serialization error.
Is there a Pythonic way of doing this?
SubElement() constructs an Element and then attaches it to the tree. Since you already have request as an Element, you don't need to construct a new one.
Try element1.append(request), like so:
import xml.etree.ElementTree as ET
doc1 = ET.XML('''
<root>
<element1>
</element1>
</root>
''')
request = ET.XML('''
<request>
<dummyValue>5</dummyValue>
</request>
''')
for element1 in doc1.findall('element1'):
    element1.append(request)
ET.dump(doc1)
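One caveat: append() attaches the very same Element object, so if the first document had several element1 nodes they would all share a single request. A minimal sketch, assuming a document with two element1 nodes (not part of the original question), that gives each parent its own copy:

```python
import copy
import xml.etree.ElementTree as ET

doc1 = ET.XML("<root><element1/><element1/></root>")
request = ET.XML("<request><dummyValue>5</dummyValue></request>")

for element1 in doc1.findall('element1'):
    # Each parent gets its own independent copy of request.
    element1.append(copy.deepcopy(request))

first, second = doc1.findall('element1')
first[0].find('dummyValue').text = '6'
print(second[0].find('dummyValue').text)  # 5
```

With a single target element, as in the question, the plain append() is enough.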
What is the easiest way to navigate through XML with python?
<html>
<body>
<soapenv:envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<soapenv:body>
<getservicebyidresponse xmlns="http://www.something.com/soa/2.0/SRIManagement">
<code xmlns="">
0
</code>
<client xmlns="">
<action xsi:nil="true">
</action>
<actionmode xsi:nil="true">
</actionmode>
<clientid>
405965216
</clientid>
<firstname xsi:nil="true">
</firstname>
<id xsi:nil="true">
</id>
<lastname>
Last Name
</lastname>
<role xsi:nil="true">
</role>
<state xsi:nil="true">
</state>
</client>
</getservicebyidresponse>
</soapenv:body>
</soapenv:envelope>
</body>
</html>
I could use a regex to pull the values out of the lines I need, but is there a Pythonic way, something like xml[0][1]?
As @deceze already pointed out, you can use xml.etree.ElementTree here.
import xml.etree.ElementTree as ET
tree = ET.parse("path_to_xml_file")
root = tree.getroot()
You can iterate over all descendant nodes of root (iter() walks the whole tree, not just the direct children):
for child in root.iter():
    if child.tag == 'clientid':
        print(child.tag, child.text.strip())
Children are nested, and we can access specific child nodes by index, so root[0][1] should work (as long as the indices are correct).
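Index-based access breaks as soon as the layout changes, so path lookups are usually more robust. Because the sample uses XML namespaces, the prefixed elements need a prefix-to-URI mapping passed to find(). A minimal sketch on a trimmed copy of the response; the `sri` prefix and the variable names are chosen here for illustration:

```python
import xml.etree.ElementTree as ET

xml_text = """<soapenv:envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
<soapenv:body>
<getservicebyidresponse xmlns="http://www.something.com/soa/2.0/SRIManagement">
<client xmlns="">
<clientid>
405965216
</clientid>
</client>
</getservicebyidresponse>
</soapenv:body>
</soapenv:envelope>"""

root = ET.fromstring(xml_text)

# Prefixes here are local to the query; only the URIs must match the document.
ns = {
    'soapenv': 'http://schemas.xmlsoap.org/soap/envelope/',
    'sri': 'http://www.something.com/soa/2.0/SRIManagement',
}

response = root.find('soapenv:body/sri:getservicebyidresponse', ns)
# <client xmlns=""> resets the default namespace, so no prefix is needed below.
client_id = response.findtext('client/clientid').strip()
print(client_id)  # 405965216
```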
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<person>
<first-name>First_Name</first-name>
<last-name>Last_Name</last-name>
<headline>Headline</headline>
<location>
<name>Some_city, STATE </name>
<country>
<code>us</code>
</country>
</location>
</person>
I'm trying to access First_Name, Last_Name, Headline and Some_city, STATE
So far I have:
import xml.etree.ElementTree as ET
tree = ET.parse(data)
root = tree.getroot()
for child in root:
    print(child)
Which prints out:
<Element 'first-name' at 0x110726b10>
<Element 'last-name' at 0x110726b50>
<Element 'headline' at 0x110726b90>
<Element 'location' at 0x110726bd0>
How can I access the value of 'first-name'?
Use the .text attribute:
for child in root:
    print(child.text)
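Looping over the direct children does not reach the nested city name, because the text of location itself is only whitespace. A minimal sketch that reads each value by its path with findtext(), using the sample document inline; the variable names are chosen here for illustration:

```python
import xml.etree.ElementTree as ET

person = ET.fromstring("""<person>
<first-name>First_Name</first-name>
<last-name>Last_Name</last-name>
<headline>Headline</headline>
<location>
<name>Some_city, STATE </name>
<country>
<code>us</code>
</country>
</location>
</person>""")

# findtext() returns the text of the first element matching the path.
first_name = person.findtext('first-name')
last_name = person.findtext('last-name')
headline = person.findtext('headline')
# The city sits one level down; strip() drops the trailing space.
city = person.findtext('location/name').strip()

print(first_name, last_name, headline, city)
```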
I am trying to remove an element in an xml which contains a namespace.
Here is my code:
templateXml = """<?xml version="1.0" encoding="UTF-8"?>
<Metadata xmlns="http://www.amazon.com/UnboxMetadata/v1">
<Movie>
<CountryOfOrigin>US</CountryOfOrigin>
<TitleInfo>
<Title locale="en-GB">The Title</Title>
<Actor>
<ActorName locale="en-GB">XXX</ActorName>
<Character locale="en-GB">XXX</Character>
</Actor>
</TitleInfo>
</Movie>
</Metadata>"""
from lxml import etree
# On Python 3, lxml rejects a str that carries an encoding declaration, so pass bytes.
tree = etree.fromstring(templateXml.encode('utf-8'))
namespaces = {'ns':'http://www.amazon.com/UnboxMetadata/v1'}
for checkActor in tree.xpath('//ns:Actor', namespaces=namespaces):
    etree.strip_elements(tree, 'ns:Actor')
In my actual XML I have lots of tags, so I am trying to find the Actor elements that contain XXX and remove each one completely, along with its contents. But it's not working.
Use the remove() method of the parent element:
for checkActor in tree.xpath('//ns:Actor', namespaces=namespaces):
    checkActor.getparent().remove(checkActor)
# tostring() returns bytes, so decode before printing.
print(etree.tostring(tree, pretty_print=True, xml_declaration=True).decode())
prints:
<?xml version='1.0' encoding='ASCII'?>
<Metadata xmlns="http://www.amazon.com/UnboxMetadata/v1">
<Movie>
<CountryOfOrigin>US</CountryOfOrigin>
<TitleInfo>
<Title locale="en-GB">The Title</Title>
</TitleInfo>
</Movie>
</Metadata>
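The loop above removes every Actor element. To remove only the ones whose name is XXX, as the question asks, the text can be checked before removing. A minimal sketch using the standard-library ElementTree rather than lxml (a swapped-in approach, since the stdlib has no getparent()); the second actor, "Someone", is invented for the demonstration:

```python
import xml.etree.ElementTree as ET

xml_text = """<Metadata xmlns="http://www.amazon.com/UnboxMetadata/v1">
<Movie>
<TitleInfo>
<Title locale="en-GB">The Title</Title>
<Actor>
<ActorName locale="en-GB">XXX</ActorName>
</Actor>
<Actor>
<ActorName locale="en-GB">Someone</ActorName>
</Actor>
</TitleInfo>
</Movie>
</Metadata>"""

uri = 'http://www.amazon.com/UnboxMetadata/v1'
ns = {'ns': uri}
root = ET.fromstring(xml_text)

# Snapshot the elements first so removals do not disturb the traversal.
for parent in list(root.iter()):
    for actor in parent.findall('ns:Actor', ns):
        if actor.findtext('ns:ActorName', namespaces=ns) == 'XXX':
            parent.remove(actor)

remaining = [actor.findtext('ns:ActorName', namespaces=ns)
             for actor in root.iter('{%s}Actor' % uri)]
print(remaining)  # ['Someone']
```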