Using ElementTree, how do I place a comment just below the XML declaration and above the root element?
I have tried root.append(comment), but this places the comment as the last child of root. Can I append the comment to whatever is root's parent?
Thanks.
Here is how a comment can be added in the wanted position (after XML declaration, before root element) with lxml, using the addprevious() method.
from lxml import etree
root = etree.fromstring('<root><x>y</x></root>')
comment = etree.Comment('This is a comment')
root.addprevious(comment) # Add the comment as a preceding sibling
etree.ElementTree(root).write("out.xml",
                              pretty_print=True,
                              encoding="UTF-8",
                              xml_declaration=True)
Result (out.xml):
<?xml version='1.0' encoding='UTF-8'?>
<!--This is a comment-->
<root>
  <x>y</x>
</root>
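Here is a version that uses only xml.etree.ElementTree. Note that insert(0, comment) makes the comment the first child of root rather than placing it before root: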
import xml.etree.ElementTree as ET
root = ET.fromstring('<root><e1><e2></e2></e1></root>')
comment = ET.Comment('Here is a Comment')
root.insert(0, comment)
ET.dump(root)
Output:
<root><!--Here is a Comment--><e1><e2 /></e1></root>
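The xml.etree.ElementTree snippet above puts the comment inside root, not above it. As far as I know, the standard library has no counterpart to lxml's addprevious(), so one workaround is to serialize the declaration, the comment and the root separately and join them by hand. A minimal sketch, assuming the whole document fits in memory and UTF-8 output is wanted (the declaration string is written manually here, not generated by ElementTree):

```python
import xml.etree.ElementTree as ET

root = ET.fromstring('<root><x>y</x></root>')
comment = ET.Comment('This is a comment')

# Hand-written declaration (assumption: UTF-8 output is wanted)
declaration = "<?xml version='1.0' encoding='UTF-8'?>"

# Serialize each piece on its own and join them in document order
document = "\n".join([
    declaration,
    ET.tostring(comment, encoding="unicode"),
    ET.tostring(root, encoding="unicode"),
])
print(document)
```

The comment never becomes part of the tree this way; it only exists in the serialized text, so it has to be re-added every time the document is written.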
I'm trying hard to include my XSLT header in my XML with ElementTree, and I can't find any information on how to do it.
Here's my Python code:
import xml.etree.ElementTree as ET

tree = ET.parse('myfile.xml')  # parse the XML document into a tree
root = tree.getroot()  # get the root element
root[0][0].text = "ole"
root[0][1].text = "ole"
tree.write('test_file.xml', encoding='utf-8', method="xml")  # write the XML file
The only problem is to include this header:
<?xml-stylesheet type="text/xsl" href="myfile.xsl" ?>
Unfortunately, xml.etree.ElementTree does not support XSLT (for instance, the documentation of write() states that method is either "xml", "html" or "text").
Luckily, though, you can easily do that if you rely on lxml, which adds support for XSLT.
I just found the answer!!!
You need to use lxml instead and this is the new code:
from lxml import etree as ET
parser = ET.XMLParser(strip_cdata=False)  # strip_cdata=False keeps CDATA sections from being stripped
tree = ET.parse('myfile.xml', parser)
root = tree.getroot()  # get the root element
tag1 = root.find('TAG1')
tag1.find('TAG2').text = 'text change here'
tree.write('test_file.xml', encoding='utf-8', method="xml")
Your XML template (myfile.xml) is like this:
<?xml-stylesheet type="text/xsl" href="your_file.xsl" ?>
<FirstTAG>
<TAG1>
<TAG2>your text</TAG2>
</TAG1>
</FirstTAG>
And the new one will be like this:
<?xml-stylesheet type="text/xsl" href="your_file.xsl" ?>
<FirstTAG>
<TAG1>
<TAG2>text change here</TAG2>
</TAG1>
</FirstTAG>
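If switching to lxml is not an option, the header can also be written by hand with the standard library. This is only a sketch: it builds the processing instruction with ET.ProcessingInstruction, writes it first, and then lets ElementTree write the root after it. The inline XML stands in for myfile.xml and the in-memory buffer stands in for the output file; both are assumptions made for the example.

```python
import io
import xml.etree.ElementTree as ET

# Inline stand-in for myfile.xml (assumption: same structure as the template)
root = ET.fromstring('<FirstTAG><TAG1><TAG2>your text</TAG2></TAG1></FirstTAG>')
root.find('TAG1/TAG2').text = 'text change here'

# Build the stylesheet header as a processing instruction
stylesheet = ET.ProcessingInstruction(
    'xml-stylesheet', 'type="text/xsl" href="your_file.xsl"')

# Write the header first, then the document itself
buffer = io.StringIO()
buffer.write(ET.tostring(stylesheet, encoding='unicode') + '\n')
ET.ElementTree(root).write(buffer, encoding='unicode')

result = buffer.getvalue()
print(result)
```

To write to disk instead, the same two writes can go to a file opened in text mode.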
<?xml version="1.0"?>
<doc>
<doc1>
<branch name="testing" hash="1cdf045c">
text,source
</branch>
<some name="release01" hash:hashing="f200013e">
<sub-branch name="subrelease01">
xml,sgml
</sub-branch>
</branch>
<bsome name="invalid">
</branch>
</doc1>
</doc>
I wrote something like this:
import xml.etree.ElementTree as ET
from lxml import etree
filename='/testxml'
tree = etree.parse(filename)
root = tree.getroot()
print(root)
primary = etree.Element("{http://schemas.dmtf.org/xxx/doc/1}doc")
secondary = etree.SubElement(primary, "{http://schemas.dmtf.org/xxx/doc/1}some name")
secondary.text = ????
print(etree.tostring(primary, pretty_print=True))
I want to change the value of hash:hashing.
If possible, I would like to use an XPath query for this task, for example /doc/doc1/some name/#hash:hashing.
Please help. Thank you!
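One way to approach this with ElementTree, sketched under assumptions: the sample above is not well-formed and does not declare the hash prefix, so the snippet uses a cleaned-up fragment and a made-up namespace URI (http://example.com/hash). A namespaced attribute is addressed as {uri}localname, and the element can be located with ElementTree's limited XPath support.

```python
import xml.etree.ElementTree as ET

HASH_NS = "http://example.com/hash"  # hypothetical URI; use the one your file declares
ET.register_namespace("hash", HASH_NS)  # keep the "hash" prefix when serializing

# Cleaned-up, well-formed stand-in for the document in the question
XML = f"""<doc xmlns:hash="{HASH_NS}">
  <doc1>
    <branch name="testing" hash="1cdf045c">text,source</branch>
    <some name="release01" hash:hashing="f200013e"/>
  </doc1>
</doc>"""

root = ET.fromstring(XML)

# Locate the element by path and by the value of its name attribute
some = root.find("./doc1/some[@name='release01']")

# Namespaced attributes are keyed as "{uri}localname"
attribute = f"{{{HASH_NS}}}hashing"
some.set(attribute, "0badc0de")  # example replacement value

result = ET.tostring(root, encoding="unicode")
print(result)
```

ElementTree paths cannot select an attribute directly, which is why the element is found first and the attribute is changed with set().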
Is it possible to comment out an xml element with python's lxml while preserving the original element rendering inside the comment? I tried the following
elem.getparent().replace(elem, etree.Comment(etree.tostring(elem, pretty_print=True)))
but tostring() adds the namespace declaration.
The namespace of the commented-out element is inherited from the root element. Demo:
from lxml import etree
XML = """
<root xmlns='foo'>
<a>
<b>AAA</b>
</a>
</root>"""
root = etree.fromstring(XML)
b = root.find(".//{foo}b")
b.getparent().replace(b, etree.Comment(etree.tostring(b)))
print(etree.tostring(root).decode())
Result:
<root xmlns="foo">
<a>
<!--<b xmlns="foo">AAA</b>
--></a>
</root>
Manipulating namespaces is often harder than you might suspect. See https://stackoverflow.com/a/31870245/407651.
My suggestion here is to use BeautifulSoup, which in practice does not really care about namespaces (soup.find('b') returns the b element even though it is in the foo namespace).
from bs4 import BeautifulSoup, Comment
soup = BeautifulSoup(XML, "xml")
b = soup.find('b')
b.replace_with(Comment(str(b)))
print(soup.prettify())
Result:
<?xml version="1.0" encoding="utf-8"?>
<root xmlns="foo">
<a>
<!--<b>AAA</b>-->
</a>
</root>
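If BeautifulSoup is not available, a similar result can be approximated with the standard library by serializing the element, removing the inherited namespace declaration from the string, and swapping the element for a comment. This is a rough sketch, not the lxml API: the string replacement only handles the single default namespace 'foo' from the demo, and since xml.etree.ElementTree has no getparent(), the parent is looked up explicitly.

```python
import xml.etree.ElementTree as ET

XML = """
<root xmlns='foo'>
<a>
<b>AAA</b>
</a>
</root>"""

ET.register_namespace("", "foo")  # serialize 'foo' as the default namespace
root = ET.fromstring(XML)
parent = root.find("{foo}a")
b = parent.find("{foo}b")

# Serialize the element, then drop the namespace declaration it inherited
markup = ET.tostring(b, encoding="unicode").strip()
markup = markup.replace(' xmlns="foo"', "")

# Replace the element with a comment at the same position
comment = ET.Comment(markup)
comment.tail = b.tail  # preserve the whitespace that followed the element
index = list(parent).index(b)
parent.remove(b)
parent.insert(index, comment)

result = ET.tostring(root, encoding="unicode")
print(result)
```

Editing the serialized string is fragile; with several namespaces or prefixed names the replacement step would need more care.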
I'm using ElementTree with Python to parse an XML file to find the contents of the tag contentType.
Here's the Python line:
extensionType = ET.parse("src/" + str(filename)).find('contentType')
And here's the XML:
<?xml version="1.0" encoding="UTF-8"?>
<StaticResource xmlns="http://soap.sforce.com/2006/04/metadata">
<cacheControl>Private</cacheControl>
<contentType>image/jpeg</contentType>
</StaticResource>
What am I doing wrong?
Thanks!
So far you are only parsing the XML file. This is how you can get your element using XPath (note that you have to include the XML namespace given by xmlns):
import xml.etree.ElementTree as ET  # cElementTree was removed in Python 3.9
tree = ET.parse('test.xml')
root = tree.getroot()
xmlns = {'soap': '{http://soap.sforce.com/2006/04/metadata}'}
ct_element = root.find('.//{soap}contentType'.format(**xmlns))
print(ct_element.text)
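As an alternative to formatting the namespace into the path by hand, find() also accepts a prefix-to-URI mapping. A short sketch, where the prefix sf is an arbitrary name chosen for this example and the XML is inlined rather than read from a file:

```python
import xml.etree.ElementTree as ET

# Inline copy of the document from the question (without the XML declaration)
XML = """<StaticResource xmlns="http://soap.sforce.com/2006/04/metadata">
    <cacheControl>Private</cacheControl>
    <contentType>image/jpeg</contentType>
</StaticResource>"""

# The prefix is local to this mapping; only the URI has to match the document
NAMESPACES = {"sf": "http://soap.sforce.com/2006/04/metadata"}

root = ET.fromstring(XML)
content_type = root.find("sf:contentType", NAMESPACES)
print(content_type.text)
```

The original call returned None because the unqualified name contentType does not match an element that lives in the document's default namespace.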
Hi, I am parsing and completely modifying an XML file in Python 3 using lxml, and I need to put a new element into the existing elements and change their parent.
Example:
old xml
<a>
<b>something</b>
<c>something different</c>
</a>
new xml
<a>
<new_parent>
<b>something</b>
<c>something different</c>
</new_parent>
</a>
Is it possible?
I'm not sure there is a function that does exactly what you want. I would do it as follows: create a new_parent node, append the children of a to new_parent, and then append new_parent to a.
import lxml.etree
# bytes literal: in Python 3, lxml rejects str input that carries an encoding declaration
xml = b'''<?xml version='1.0' encoding='ASCII'?>
<root>
<a>
<b>something</b>
<c>something different</c>
</a>
</root>'''
root = lxml.etree.fromstring(xml)
a = root.find('.//a')
parent = lxml.etree.Element('new_parent')
for child in list(a):  # iterate over a copy, since the children are moved while looping
    parent.append(child)
a.append(parent)
print(lxml.etree.tostring(root, xml_declaration=True).decode())
prints (output format is modified to make it easy to read)
<?xml version='1.0' encoding='ASCII'?>
<root>
<a>
<new_parent>
<b>something</b>
<c>something different</c>
</new_parent>
</a>
</root>
UPDATE: You can use extend() instead of multiple calls to append().
root = lxml.etree.fromstring(xml)
a = root.find('.//a')
parent = lxml.etree.Element('new_parent')
parent.extend(a)
a.append(parent)
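For comparison, here is a sketch of the same wrapping with xml.etree.ElementTree instead of lxml. There, appending an element to a new parent does not detach it from the old one, so each child is removed explicitly, and the loop runs over a copy of the child list.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<root><a><b>something</b><c>something different</c></a></root>')
a = root.find('a')
new_parent = ET.Element('new_parent')

# Copy the child list first: removing children while iterating over
# the element itself would skip entries
for child in list(a):
    a.remove(child)           # detach from the old parent
    new_parent.append(child)  # attach to the new one

a.append(new_parent)
result = ET.tostring(root, encoding='unicode')
print(result)
```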