I want to remove an element but keep its children. I tried the code below, but it removes the children as well.
code
import xml.etree.ElementTree as ET
tree = ET.parse('test.xml')
root = tree.getroot()
for item in root.findall('item'):
    root.remove(item)
print(ET.tostring(root))
>>> <root>
</root>
test.xml
<?xml version="1.0" ?>
<root>
<item>
<data>
<number>01</number>
<step>one</step>
</data>
</item>
</root>
expected outcome
<?xml version="1.0" ?>
<root>
<data>
<number>01</number>
<step>one</step>
</data>
</root>
You should move all children of item to root before removing it:
for item in root.findall('item'):
    # Re-attach each child to root, then drop the now-redundant wrapper.
    for child in item:
        root.append(child)
    root.remove(item)
print(ET.tostring(root))
This results in:
<root>
<data>
<number>01</number>
<step>one</step>
</data>
</root>
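Note that root.append() puts the moved children at the end of root, so their position relative to any siblings of item changes. A minimal sketch of a variant that keeps them where item used to be is shown below; the helper name unwrap and the sample XML are made up for illustration, and any text or tail attached to item itself is discarded.

```python
import xml.etree.ElementTree as ET

def unwrap(parent, tag):
    """Replace each direct child of `parent` named `tag` with that child's own children."""
    for item in parent.findall(tag):
        index = list(parent).index(item)
        parent.remove(item)
        # Insert the grandchildren where the wrapper used to be, keeping their order.
        for offset, child in enumerate(list(item)):
            parent.insert(index + offset, child)

# Hypothetical sample: <item> sits between two siblings.
root = ET.fromstring(
    "<root><a/><item><data><number>01</number></data></item><b/></root>"
)
unwrap(root, "item")
tags = [child.tag for child in root]
print(tags)
```

Here data ends up between a and b rather than after b.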
Find the element with the data tag, remove it, and extend its parent with its children.
import xml.etree.ElementTree as etree
data = """
<root>
<item>
<data>
<number>01</number>
<step>one</step>
</data>
</item>
</root>
"""
def iterparent(tree):
    # Yield a (parent, child) pair for every element in the tree.
    for parent in tree.iter():
        for child in parent:
            yield parent, child

tree = etree.fromstring(data)
# Collect the pairs first so the tree is not modified while it is being iterated.
for parent, child in list(iterparent(tree)):
    if child.tag == "data":
        parent.remove(child)
        parent.extend(child)
print(etree.tostring(tree))
This outputs:
<root>
<item>
<number>01</number>
<step>one</step>
</item>
</root>
Adapted from a similar answer for your particular use case.
Related
I would like to get the XML content of an element in ElementTree. For example, if I had this XML:
<?xml version="1.0" encoding="UTF-8"?>
<item>
<child>asd</child>
hello world
<ch>jkl</ch>
</item>
I would like to get:
<child>asd</child>
hello world
<ch>jkl</ch>
Here's what I tried so far:
import xml.etree.ElementTree as ET
root = ET.fromstring("""<?xml version="1.0" encoding="UTF-8"?>
<item>
<child>asd</child>
hello world
<ch>jkl</ch>
</item>""")
print(root.text)
Try
print(ET.tostring(root.find('.//child')).decode(),ET.tostring(root.find('.//ch')).decode())
Or, more readable:
elems = ['child', 'ch']
for elem in elems:
    # root is the element parsed in the question.
    print(ET.tostring(root.find(f'.//{elem}')).decode())
The output, based on the xml in your question, should be what you're looking for.
Building on Jack Fleeting's answer, I created a solution I feel is more general, not just relating to the xml I inserted.
import xml.etree.ElementTree as ET
root = ET.fromstring("""<?xml version="1.0" encoding="UTF-8"?>
<item>
<child>asd</child>
hello world
<ch>jkl</ch>
</item>""")
for elem in root:
    print(ET.tostring(root.find(f'.//{elem.tag}')).decode())
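If the goal is everything between the opening and closing tags as a single string, one possible sketch is below. It relies on two ElementTree behaviours: element.text holds the text before the first child, and ET.tostring() on a child also emits that child's tail (the text following it). The helper name inner_xml is made up for illustration.

```python
import xml.etree.ElementTree as ET

def inner_xml(element):
    """Return the markup between an element's start and end tags."""
    # Text before the first child, if any.
    parts = [element.text or ""]
    # Each serialized child includes its tail, i.e. the text that follows it.
    parts.extend(ET.tostring(child, encoding="unicode") for child in element)
    return "".join(parts)

root = ET.fromstring("<item><child>asd</child>hello world<ch>jkl</ch></item>")
inner = inner_xml(root)
print(inner)
```

Iterating over the children directly also avoids looking them up again by tag, which would return the first match every time if two children shared a tag.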
I have an XML file as below:
<?xml version="1.0" encoding="UTF-8"?>
<data>
<text>
I have <num1>two</num1> apples and <num2>four</num2> mangoes
</text>
</data>
I want to parse the file, get the full text content of text including its child elements, and assign it to the variable sentence:
sentence = "I have two apples and four mangoes"
How can I do that using Python ElementTree?
xml = """
<?xml version="1.0" encoding="UTF-8"?>
<data>
<text>
I have <num1>two</num1> apples and <num2>four</num2> mangoes
</text>
</data>
"""
from xml.etree import ElementTree as ET
x_data = ET.fromstring(xml.strip())
all_text = list(x_data.findall(".//text")[0].itertext())
print(" ".join([text.strip() for text in all_text]))
itertext() walks the text node and all of its descendants in document order, so you can iterate over the pieces and process them as needed.
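Joining the stripped pieces with a space also inserts a space where the source has none (for example between a child element and text that directly follows it). A small alternative sketch, assuming you simply want whitespace collapsed, is to concatenate the pieces unchanged and normalize afterwards:

```python
import xml.etree.ElementTree as ET

xml = (
    "<data><text>\n"
    "  I have <num1>two</num1> apples and <num2>four</num2> mangoes\n"
    "</text></data>"
)
text_node = ET.fromstring(xml).find("text")

# Concatenate every text fragment as-is, then collapse runs of whitespace.
raw = "".join(text_node.itertext())
sentence = " ".join(raw.split())
print(sentence)
```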
I am new to XPath. Here is my XML file:
<?xml version='1.0' encoding='utf-8'?>
<items>
<item>
<country>India</country>
<referenceId>IN375TP</referenceId>
<price>400</price>
</item>
<item>
<country>Australia</country>
<referenceId>AU120ED</referenceId>
<price>15</price>
</item>
<item>
<country>United Kingdom</country>
<referenceId>UK862RB</referenceId>
<price>20</price>
</item>
</items>
I want the following <item> tag as an output:
<item>
<country>Australia</country>
<referenceId>AU120ED</referenceId>
<price>15</price>
</item>
Note: Please use condition like /items/item[referenceId/text()="AU120ED"]
If you want to find the item by country, you can use an XPath expression that selects the item whose country child has the given text:
from lxml.etree import parse, HTMLParser
xml = parse("check.xml",HTMLParser())
print(xml.find("//items//item[country='Australia']"))
<Element item at 0x7f40faa28950>
If you actually want to search by referenceId, just change the predicate to item[referenceid='AU120ED'] (lowercase, because HTMLParser lowercases tag names):
print(xml.find("//items//item[referenceid='AU120ED']"))
<Element item at 0x7f02c0c24998>
With the standard library's xml.etree:
from xml.etree import ElementTree as et
xml = et.parse("check.xml")
print(xml.find(".").find("./item[referenceId='AU120ED']"))
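The find() calls above print the Element object rather than its markup. A minimal standard-library sketch that also serializes the match is below; the XML is inlined (and shortened to two items) instead of being read from check.xml so that the example is self-contained. xml.etree supports only a subset of XPath, so the child predicate [referenceId='AU120ED'] is used here in place of the referenceId/text()="AU120ED" form from the question.

```python
import xml.etree.ElementTree as ET

xml = """<items>
  <item><country>India</country><referenceId>IN375TP</referenceId><price>400</price></item>
  <item><country>Australia</country><referenceId>AU120ED</referenceId><price>15</price></item>
</items>"""

root = ET.fromstring(xml)
# Select the <item> whose <referenceId> child has exactly this text.
match = root.find("./item[referenceId='AU120ED']")
country = match.findtext("country")
# Serialize the matched element back to markup.
print(ET.tostring(match, encoding="unicode"))
```

find() returns None when nothing matches, so real code would want to check for that before calling tostring().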
Is it possible with the package xml.etree to find the parent of a child? For example:
<ELEMENTS>
<CONSTANT-SPECIFICATION>
</CONSTANT-SPECIFICATION>
</ELEMENTS>
<ELEMENTS>
<DATA-SPECIFICATION>
</DATA-SPECIFICATION>
</ELEMENTS>
I am looking for the ELEMENTS element that contains the child CONSTANT-SPECIFICATION.
You can use the XPath expression .//ELEMENTS[CONSTANT-SPECIFICATION]. For example:
import xml.etree.ElementTree as ET
data = """<?xml version="1.0" encoding="ISO-8859-1"?>
<ROOT>
<ELEMENTS>
<CONSTANT-SPECIFICATION>
</CONSTANT-SPECIFICATION>
</ELEMENTS>
<ELEMENTS>
<DATA-SPECIFICATION>
</DATA-SPECIFICATION>
</ELEMENTS>
</ROOT>
"""
root = ET.fromstring(data)
# Tag names are case-sensitive, so they must match the document exactly.
print(root.find('.//ELEMENTS[CONSTANT-SPECIFICATION]'))
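xml.etree elements do not store a reference to their parent, so when you already hold the child and need to go upwards, a common workaround is to build a child-to-parent dictionary once. A sketch with a compacted version of the same document (the variable names are arbitrary):

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<ROOT>"
    "<ELEMENTS><CONSTANT-SPECIFICATION/></ELEMENTS>"
    "<ELEMENTS><DATA-SPECIFICATION/></ELEMENTS>"
    "</ROOT>"
)

# Map every element to its parent in a single pass over the tree.
parent_map = {child: parent for parent in root.iter() for child in parent}

target = root.find(".//CONSTANT-SPECIFICATION")
parent = parent_map[target]
print(parent.tag)
```

The map goes stale if the tree is modified afterwards, so it needs to be rebuilt after structural changes.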
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<person>
<first-name>First_Name</first-name>
<last-name>Last_Name</last-name>
<headline>Headline</headline>
<location>
<name>Some_city, STATE </name>
<country>
<code>us</code>
</country>
</location>
</person>
I'm trying to access First_Name, Last_Name, Headline and Some_city, STATE
So far I have:
import xml.etree.ElementTree as ET
tree = ET.parse(data)
root = tree.getroot()
for child in root:
    print(child)
Which prints out:
<Element 'first-name' at 0x110726b10>
<Element 'last-name' at 0x110726b50>
<Element 'headline' at 0x110726b90>
<Element 'location' at 0x110726bd0>
How can I access the value of 'first-name'?
Use the .text attribute:
for child in root:
    print(child.text)
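For location, .text is only the whitespace before its first child, so the city has to be read from the nested name element. A short sketch using findtext() with a path, with the XML from the question inlined (the variable names are just for illustration):

```python
import xml.etree.ElementTree as ET

xml = """<person>
  <first-name>First_Name</first-name>
  <last-name>Last_Name</last-name>
  <headline>Headline</headline>
  <location>
    <name>Some_city, STATE </name>
    <country><code>us</code></country>
  </location>
</person>"""

root = ET.fromstring(xml)
# findtext() returns the text of the first matching element.
first_name = root.findtext("first-name")
last_name = root.findtext("last-name")
headline = root.findtext("headline")
# A path reaches into nested elements; the default avoids None if it is missing.
city = root.findtext("location/name", default="").strip()
print(first_name, last_name, headline, city)
```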