I have an XML file like this; let's call it XML_old:
<?xml version="1.0" encoding="UTF-8"?>
<!--
// Description : ahbbus12alda.xml
// modifications; this notice must be included on any copy.
-->
<ipxact:component xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:ipxact="http://www.accellera.org/XMLSchema/IPXACT/1685-2014"
xsi:schemaLocation="http://www.accellera.org/XMLSchema/IPXACT/1685-2014
http://www.accellera.org/XMLSchema/IPXACT/1685-2014/index.xsd">
<ipxact:vendor>spiritconsortium.org</ipxact:vendor>
<ipxact:library>Leon2RTL</ipxact:library>
<ipxact:name>ahbbus12</ipxact:name>
<ipxact:version>1.3</ipxact:version>
<ipxact:busInterfaces>
<ipxact:busInterface>
<ipxact:name>AHBClk</ipxact:name>
</ipxact:busInterface>
</ipxact:busInterfaces>
</ipxact:component>
Also, I have another XML file, XML_1, like:
<?xml version="1.0" encoding="UTF-8"?>
<ipxact:component1 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:ipxact="http://www.accellera.org/XMLSchema/IPXACT/1685-2014"
xsi:schemaLocation="http://www.accellera.org/XMLSchema/IPXACT/1685-2014
http://www.accellera.org/XMLSchema/IPXACT/1685-2014/index.xsd">
<ipxact:vendor1>spiritconsortium.org</ipxact:vendor1>
<ipxact:library1>Leon2RTL</ipxact:library1>
<ipxact:name1>ahbbus34</ipxact:name1>
<ipxact:version1>1.3</ipxact:version1>
<ipxact:busInterfaces1>
<ipxact:busInterface1>
<ipxact:name1>AHBClk</ipxact:name1>
</ipxact:busInterface1>
</ipxact:busInterfaces1>
</ipxact:component1>
I want to add XML_1 to XML_old as the last child of the component element and write the result to a new XML file, like this:
<?xml version="1.0" encoding="UTF-8"?>
<!--
// Description : ahbbus12alda.xml
// modifications; this notice must be included on any copy.
-->
<ipxact:component xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:ipxact="http://www.accellera.org/XMLSchema/IPXACT/1685-2014"
xsi:schemaLocation="http://www.accellera.org/XMLSchema/IPXACT/1685-2014
http://www.accellera.org/XMLSchema/IPXACT/1685-2014/index.xsd">
<ipxact:vendor>spiritconsortium.org</ipxact:vendor>
<ipxact:library>Leon2RTL</ipxact:library>
<ipxact:name>ahbbus12</ipxact:name>
<ipxact:version>1.3</ipxact:version>
<ipxact:busInterfaces>
<ipxact:busInterface>
<ipxact:name>AHBClk</ipxact:name>
</ipxact:busInterface>
</ipxact:busInterfaces>
<ipxact:component1 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:ipxact="http://www.accellera.org/XMLSchema/IPXACT/1685-2014"
xsi:schemaLocation="http://www.accellera.org/XMLSchema/IPXACT/1685-2014
http://www.accellera.org/XMLSchema/IPXACT/1685-2014/index.xsd">
<ipxact:vendor1>spiritconsortium.org</ipxact:vendor1>
<ipxact:library1>Leon2RTL</ipxact:library1>
<ipxact:name1>ahbbus34</ipxact:name1>
<ipxact:version1>1.3</ipxact:version1>
<ipxact:busInterfaces1>
<ipxact:busInterface1>
<ipxact:name1>AHBClk</ipxact:name1>
</ipxact:busInterface1>
</ipxact:busInterfaces1>
</ipxact:component1>
</ipxact:component>
How can I do this? I would appreciate any help.
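One possible approach, sketched with the standard library's xml.dom.minidom: parse both documents, import the root element of the second into the first, and append it to the first document's root. The shortened sample documents and the helper name append_as_last_child are made up for illustration; with real files you would use minidom.parse(path) and write the result out yourself. minidom is used here because it keeps the comment that precedes the root element.

```python
import xml.dom.minidom as minidom

# Shortened stand-ins for XML_old and XML_1 (illustrative only).
XML_OLD = """<?xml version="1.0" encoding="UTF-8"?>
<!-- header comment -->
<ipxact:component xmlns:ipxact="http://www.accellera.org/XMLSchema/IPXACT/1685-2014">
<ipxact:name>ahbbus12</ipxact:name>
</ipxact:component>"""

XML_1 = """<?xml version="1.0" encoding="UTF-8"?>
<ipxact:component1 xmlns:ipxact="http://www.accellera.org/XMLSchema/IPXACT/1685-2014">
<ipxact:name1>ahbbus34</ipxact:name1>
</ipxact:component1>"""


def append_as_last_child(old_xml, new_xml):
    """Return a Document in which new_xml's root is the last child of old_xml's root."""
    old_doc = minidom.parseString(old_xml.encode("utf-8"))
    new_doc = minidom.parseString(new_xml.encode("utf-8"))
    # A node belongs to the document that created it, so copy it into old_doc first.
    imported = old_doc.importNode(new_doc.documentElement, True)
    old_doc.documentElement.appendChild(imported)
    return old_doc


merged = append_as_last_child(XML_OLD, XML_1)
merged_xml = merged.toxml(encoding="UTF-8").decode("utf-8")
print(merged_xml)
```

Only the root element of XML_1 is copied, so its XML declaration is left behind and the merged file has a single declaration at the top.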
I have parsed an XML file and stored it as a Document object using the code below.
import xml.dom.minidom as DOM
import shutil
import xml.etree.ElementTree as ET
metadata_path=r"C:\Users\ar\DD2MI_result.xml"
new_metadata=DOM.parse(metadata_path)
Now I want to use this complete Document object to replace the data of a child node in another XML file. I am able to get the child node like this:
output_draft = r"C:\Users\ar\airquality.xml"
doc = DOM.parse(output_draft)
meta=doc.getElementsByTagName('XmlDoc')
for metadata in meta:
    if metadata.firstChild.data:
        metadata.firstChild.replaceData(0,len(new_metadata),new_metadata)
        print (metadata.firstChild.data)
When I run the above code I get the error "TypeError: object of type 'Document' has no len()", which I understand, since new_metadata is a Document object rather than a string. How can I use the complete object or file to replace the current contents?
airquality.xml
<?xml version="1.0" encoding="UTF-8"?>
<gmd:MD_Metadata xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:gco="http://www.isotc211.org/2005/gco"
xmlns:gmd="http://www.isotc211.org/2005/gmd"
xmlns:srv="http://www.isotc211.org/2005/srv"
xmlns:gmx="http://www.isotc211.org/2005/gmx"
xmlns:gsr="http://www.isotc211.org/2005/gsr"
xmlns:gss="http://www.isotc211.org/2005/gss"
xmlns:gts="http://www.isotc211.org/2005/gts"
xmlns:gml="http://www.opengis.net/gml/3.2"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xsi:schemaLocation="http://www.isotc211.org/2005/gmd http://schemas.opengis.net/csw/2.0.2/profiles/apiso/1.0.0/apiso.xsd">
<gmd:fileIdentifier>
<gco:CharacterString>https://hdl.handle.net/20.500.12085/1f97f2a1-75fc-4110-ae22-f873d7d86565#metadata</gco:CharacterString>
</gmd:fileIdentifier>
<gmd:language>
<gmd:LanguageCode codeList="http://www.loc.gov/standards/iso639-2/" codeListValue="eng">eng</gmd:LanguageCode>
</gmd:language>
</gmd:MD_Metadata>
DD2MI_result.xml before replacement
<SVCManifest xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:type="typens:SVCManifest">
<Databases xsi:type="typens:ArrayOfSVCDatabase" />
<Resources xsi:type="typens:ArrayOfSVCResource">
<SVCResource xsi:type="typens:SVCResource">
<ID>{429221BF-D0A1-40D8-9DC1-B41D269E95C7}</ID>
<Name>test.crf</Name>
<Metadata xsi:type="typens:XmlPropertySet">
<XmlDoc><?xml version="1.0"?>
<metadata xml:lang="en"><Esri><CreaDate>20211219</metadata>
</XmlDoc>
</Metadata>
</SVCManifest>
DD2MI_result.xml after replacement
<SVCManifest xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:type="typens:SVCManifest">
<Databases xsi:type="typens:ArrayOfSVCDatabase" />
<Resources xsi:type="typens:ArrayOfSVCResource">
<SVCResource xsi:type="typens:SVCResource">
<ID>{429221BF-D0A1-40D8-9DC1-B41D269E95C7}</ID>
<Name>test.crf</Name>
<Metadata xsi:type="typens:XmlPropertySet">
<XmlDoc><?xml version="1.0" encoding="UTF-8"?>
<gmd:MD_Metadata xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:gco="http://www.isotc211.org/2005/gco"
xmlns:gmd="http://www.isotc211.org/2005/gmd"
xmlns:srv="http://www.isotc211.org/2005/srv"
xmlns:gmx="http://www.isotc211.org/2005/gmx"
xmlns:gsr="http://www.isotc211.org/2005/gsr"
xmlns:gss="http://www.isotc211.org/2005/gss"
xmlns:gts="http://www.isotc211.org/2005/gts"
xmlns:gml="http://www.opengis.net/gml/3.2"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xsi:schemaLocation="http://www.isotc211.org/2005/gmd http://schemas.opengis.net/csw/2.0.2/profiles/apiso/1.0.0/apiso.xsd">
<gmd:fileIdentifier>
<gco:CharacterString>https://hdl.handle.net/20.500.12085/1f97f2a1-75fc-4110-ae22-f873d7d86565#metadata</gco:CharacterString>
</gmd:fileIdentifier>
<gmd:language>
<gmd:LanguageCode codeList="http://www.loc.gov/standards/iso639-2/" codeListValue="eng">eng</gmd:LanguageCode>
</gmd:language>
</gmd:MD_Metadata>
</XmlDoc>
</Metadata>
</SVCManifest>
Your DD2MI_result.xml is still not well formed - for example, a couple of tags aren't closed. So as a first step, this answer assumes that document looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<SVCManifest xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="typens:SVCManifest">
<Databases xsi:type="typens:ArrayOfSVCDatabase" />
<Resources xsi:type="typens:ArrayOfSVCResource">
<SVCResource xsi:type="typens:SVCResource">
<ID>{429221BF-D0A1-40D8-9DC1-B41D269E95C7}</ID>
<Name>test.crf</Name>
<Metadata xsi:type="typens:XmlPropertySet">
<XmlDoc><?xml version="1.0"?>
<metadata xml:lang="en"><Esri><CreaDate>20211219</metadata></XmlDoc>
</Metadata>
</SVCResource>
</Resources>
</SVCManifest>
Also, as I mentioned, you can't have two xml declarations within one well-formed xml, and if you have a declaration, it must be at the very beginning of the document, not somewhere in the middle.
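If the consumer of the file actually expects the inner document to be stored as text inside XmlDoc (an assumption on my part, not something the question confirms), one way to keep the outer document well-formed is to serialize the inner document to a string and store it in a text node, so that its markup is escaped on output. A minimal sketch with the standard library's minidom, using two tiny placeholder documents:

```python
import xml.dom.minidom as minidom

# Tiny placeholder documents standing in for the two files from the question.
outer = minidom.parseString(b"<Metadata><XmlDoc>old</XmlDoc></Metadata>")
inner = minidom.parseString(b'<?xml version="1.0"?><metadata lang="en"/>')

xml_doc = outer.getElementsByTagName("XmlDoc")[0]

# Drop whatever XmlDoc currently holds.
while xml_doc.firstChild:
    xml_doc.removeChild(xml_doc.firstChild)

# toxml() turns the Document into a string; as a text node its "<" and ">" are
# escaped when written, so the inner declaration is plain character data.
xml_doc.appendChild(outer.createTextNode(inner.toxml()))
serialized = outer.toxml()
print(serialized)

# A reader gets the inner document back by parsing the text content again.
reparsed = minidom.parseString(serialized.encode("utf-8"))
embedded = reparsed.getElementsByTagName("XmlDoc")[0].firstChild.data
recovered = minidom.parseString(embedded.encode("utf-8"))
```

This is also why the original replaceData call failed: it needs a string, which toxml() provides, not a Document object.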
Next, we start the actual process, but using the lxml library. This should get you close enough to what I believe is your expected output. If it's not 100% there, you'll have to tinker with it a bit:
from lxml import etree
source = """[the content of airquality.xml, from the question]"""
target = """[the content of DD2MI_result.xml, as corrected above]"""
s_doc = etree.XML(source.encode())
t_doc = etree.XML(target.encode())
# first, we get rid of the ugly XmlDoc element in the target doc (DD2MI_result.xml)
for t in t_doc.xpath('//XmlDoc'):
    t.getparent().remove(t)
# we then create an empty replacement for it
new_xd = etree.Element("XmlDoc")
# next, the replacement element is inserted in the target document in the correct place;
# append() makes it a child of Metadata (addnext() would make it a following sibling)
for m in t_doc.xpath('//Metadata'):
    m.append(new_xd)
# finally, we insert in the new XmlDoc element the contents of the source document (airquality.xml)
for t in t_doc.xpath('//XmlDoc'):
    t.insert(0,s_doc.xpath('//*')[0])
#confirm that the output is what you are looking for:
print(etree.tostring(t_doc, xml_declaration=True, pretty_print=True).decode())
As I said, this may not be 100% of what you're trying to do, but should get you closer.
Consider XSLT, the special-purpose language designed to transform XML files, which provides the document() function to read in other XML documents. You can even avoid any for-loops and if-logic.
Python's third-party lxml library can run XSLT 1.0 scripts (the built-in etree and minidom cannot). Alternatively, Python can call external XSLT 1.0, 2.0, or even 3.0 processors.
XSLT (save as .xsl, a special .xml file)
The stylesheet below assumes airquality.xml is in the same folder as the stylesheet, since a relative URI passed to document() is resolved against the stylesheet's location.
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" omit-xml-declaration="yes" encoding="utf-8" indent="yes" />
<xsl:strip-space elements="*"/>
<!-- IDENTITY TRANSFORM -->
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<!-- REPLACE XMLDoc -->
<xsl:template match="XmlDoc">
<xsl:copy>
<xsl:apply-templates select="document('airquality.xml')"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Python
import lxml.etree as et
doc = et.parse('DD2MI_result.xml')
xsl = et.parse('MyScript.xsl')
# CONFIGURE TRANSFORMER
transform = et.XSLT(xsl)
# TRANSFORM SOURCE DOC
result = transform(doc)
# OUTPUT TO CONSOLE
print(result)
# SAVE TO FILE
result.write_output('Output.xml')
Output
<SVCManifest xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="typens:SVCManifest">
<Databases xsi:type="typens:ArrayOfSVCDatabase"/>
<Resources xsi:type="typens:ArrayOfSVCResource"/>
<SVCResource xsi:type="typens:SVCResource"/>
<ID>{429221BF-D0A1-40D8-9DC1-B41D269E95C7}</ID>
<Name>test.crf</Name>
<Metadata xsi:type="typens:XmlPropertySet">
<XmlDoc>
<gmd:MD_Metadata xmlns:gco="http://www.isotc211.org/2005/gco" xmlns:gmd="http://www.isotc211.org/2005/gmd" xmlns:srv="http://www.isotc211.org/2005/srv" xmlns:gmx="http://www.isotc211.org/2005/gmx" xmlns:gsr="http://www.isotc211.org/2005/gsr" xmlns:gss="http://www.isotc211.org/2005/gss" xmlns:gts="http://www.isotc211.org/2005/gts" xmlns:gml="http://www.opengis.net/gml/3.2" xmlns:xlink="http://www.w3.org/1999/xlink" xsi:schemaLocation="http://www.isotc211.org/2005/gmd http://schemas.opengis.net/csw/2.0.2/profiles/apiso/1.0.0/apiso.xsd">
<gmd:fileIdentifier>
<gco:CharacterString>https://hdl.handle.net/20.500.12085/1f97f2a1-75fc-4110-ae22-f873d7d86565#metadata</gco:CharacterString>
</gmd:fileIdentifier>
<gmd:language>
<gmd:LanguageCode codeList="http://www.loc.gov/standards/iso639-2/" codeListValue="eng">eng</gmd:LanguageCode>
</gmd:language>
</gmd:MD_Metadata>
</XmlDoc>
</Metadata>
</SVCManifest>
I generated a KML file using Python's SimpleKML library and the following script, the output of which is also shown below:
import simplekml
kml = simplekml.Kml()
ground = kml.newgroundoverlay(name='Aerial Extent')
ground.icon.href = 'C:\\Users\\mdl518\\Desktop\\aerial_image.png'
ground.latlonbox.north = 46.55537
ground.latlonbox.south = 46.53134
ground.latlonbox.east = 48.60005
ground.latlonbox.west = 48.57678
ground.latlonbox.rotation = 0.090320
kml.save(".//aerial_extent.kml")
The output KML:
<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2" xmlns:gx="http://www.google.com/kml/ext/2.2">
<Document id="1">
<GroundOverlay id="2">
<name>Aerial Extent</name>
<Icon id="3">
<href>C:\\Users\\mdl518\\Desktop\\aerial_image.png</href>
</Icon>
<LatLonBox>
<north>46.55537</north>
<south>46.53134</south>
<east>48.60005</east>
<west>48.57678</west>
<rotation>0.090320</rotation>
</LatLonBox>
</GroundOverlay>
</Document>
</kml>
However, I am trying to remove the "Document" tag from this KML since it is a default element generated with SimpleKML, while keeping the child elements (e.g. GroundOverlay). Additionally, is there a way to remove the "id" attributes associated with specific elements (i.e. for the GroundOverlay, Icon elements)? I am exploring the usage of ElementTree/lxml to enable this, but these seem to be more specific to XML files as opposed to KMLs. Here's what I'm trying to use to modify the KML, but it is unable to remove the Document element:
from lxml import etree
tree = etree.fromstring(open("C:\\Users\\mdl518\\Desktop\\aerial_extent.kml").read())
for item in tree.xpath("//Document[@id='1']"):
    item.getparent().remove(item)
print(etree.tostring(tree, pretty_print=True))
Here is the final desired output XML:
<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2" xmlns:gx="http://www.google.com/kml/ext/2.2">
<GroundOverlay>
<name>Aerial Extent</name>
<Icon>
<href>C:\\Users\\mdl518\\Desktop\\aerial_image.png</href>
</Icon>
<LatLonBox>
<north>46.55537</north>
<south>46.53134</south>
<east>48.60005</east>
<west>48.57678</west>
<rotation>0.090320</rotation>
</LatLonBox>
</GroundOverlay>
</kml>
Any insights are most appreciated!
You are getting tripped up on the dreaded namespaces...
Try using something like this:
ns = {'kml': 'http://www.opengis.net/kml/2.2'}
for item in tree.xpath("//kml:Document[@id='1']",namespaces=ns):
    item.getparent().remove(item)
Edit:
To remove just the parent and retain all its descendants, try the following:
# tree and ns are the objects defined above
retain = tree.xpath("//kml:Document[@id='1']/kml:GroundOverlay",namespaces=ns)[0]
for item in tree.xpath("//kml:Document[@id='1']",namespaces=ns):
    anchor = item.getparent()
    anchor.remove(item)
    anchor.insert(1,retain)
print(etree.tostring(tree, pretty_print=True).decode())
This should get you the desired output.
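The question also asks about dropping the id attributes, which the snippets above do not cover. Here is a rough sketch of both steps using the standard library's xml.etree.ElementTree instead of lxml (a swapped-in library, chosen only so the example has no third-party dependency). The inline KML is a shortened stand-in for the generated file, and every child of Document is promoted, not only GroundOverlay.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"
# Keep the KML namespace as the default namespace when serializing.
ET.register_namespace("", KML_NS)
ns = {"kml": KML_NS}

# Shortened stand-in for the file produced by simplekml.
kml_text = """<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
    <Document id="1">
        <GroundOverlay id="2">
            <name>Aerial Extent</name>
            <Icon id="3"><href>aerial_image.png</href></Icon>
        </GroundOverlay>
    </Document>
</kml>"""

root = ET.fromstring(kml_text.encode("utf-8"))

# Move the children of each Document up to its parent, then drop the wrapper.
for document in root.findall("kml:Document", ns):
    position = list(root).index(document)
    root.remove(document)
    for offset, child in enumerate(list(document)):
        root.insert(position + offset, child)

# Remove every id attribute that remains in the tree.
for element in root.iter():
    element.attrib.pop("id", None)

result = ET.tostring(root, encoding="unicode")
print(result)
```

If only certain ids should go, the last loop can be limited to specific tags, for example root.iter("{http://www.opengis.net/kml/2.2}Icon").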
My original XML file looks like this:
<?xml version="1.0" encoding="utf-8"?>
<foo/>
and I want to change it to
<?xml version="1.0" encoding="utf-8"?>
<foo>
<bar>confusing dev</bar>
</foo>
I am using xml.etree.ElementTree as suggested by this tutorial
import xml.etree.ElementTree as etree
with open('file.xml','r+b') as f:
    tree = etree.parse(f)
    f.seek(0,0)
    tree.write(f,xml_declaration=True)# default argument: encoding="us-ascii"
this outputs
<?xml version='1.0' encoding='us-ascii'?>
<foo/>
But how do I get the encoding of file.xml at runtime and pass it as an argument to tree.write or is there a better way to edit xml in python? I just want to change some Element.text but keep the declaration and namespace unchanged.
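As far as I know, xml.etree.ElementTree does not expose the encoding named in the declaration after parsing. One possible workaround, sketched below, is to read the declaration with the standard library's expat parser and pass the result to write(). The helper name declared_encoding and the in-memory bytes are illustrative; with a real file you would read its bytes and write the result back.

```python
import io
import xml.etree.ElementTree as ET
from xml.parsers import expat


def declared_encoding(data, default="utf-8"):
    """Return the encoding named in the XML declaration of data (bytes), or default."""
    found = {}

    def on_xml_decl(version, encoding, standalone):
        found["encoding"] = encoding

    parser = expat.ParserCreate()
    parser.XmlDeclHandler = on_xml_decl
    parser.Parse(data, True)
    return found.get("encoding") or default


original = b'<?xml version="1.0" encoding="utf-8"?>\n<foo/>'
encoding = declared_encoding(original)

root = ET.fromstring(original)
ET.SubElement(root, "bar").text = "confusing dev"

# Write to a buffer here; a real script would write to the file instead.
buffer = io.BytesIO()
ET.ElementTree(root).write(buffer, encoding=encoding, xml_declaration=True)
updated = buffer.getvalue()
print(updated.decode(encoding))
```

ElementTree writes the declaration with single quotes, so the output is equivalent to the original declaration but not byte-for-byte identical. If lxml is available, the parsed tree's docinfo.encoding gives the same information directly.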
I have an xml file as follows:
<?xml version="1.0" encoding="utf-8"?>
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003" DefaultTargets="Build" ToolsVersion="4.0">
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.targets" />
<ImportGroup Label="ExtensionTargets">
</ImportGroup>
</Project>
I want to add a line <Import Project="$(ProjectName).targets" /> between </ImportGroup> and </Project> as follows:
<?xml version="1.0" encoding="utf-8"?>
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003" DefaultTargets="Build" ToolsVersion="4.0">
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.targets" />
<ImportGroup Label="ExtensionTargets">
</ImportGroup>
<Import Project="$(ProjectName).targets" />
</Project>
If the line <Import Project="$(ProjectName).targets" /> already exists in the file, there is no need to add it.
How can I do that?
Your question is based on lines in text files, but the input file is clearly XML, so assuming you actually want to add an Import if it doesn't exist, try this:
import xml.dom.minidom
importstring = "$(ProjectName).targets"
filename = "test.xml"
tree = xml.dom.minidom.parse(filename)
Project = tree.getElementsByTagName("Project")[0]
for Import in Project.getElementsByTagName("Import"):
    if Import.getAttribute("Project") == importstring:
        break
else:  # note: this else belongs to the for, not the if
    # create the element through the document so it gets the right owner
    newImport = tree.createElement("Import")
    newImport.setAttribute("Project", importstring)
    Project.appendChild(newImport)
# keep the utf-8 declaration and close the file when done
with open(filename, 'w', encoding='utf-8') as f:
    tree.writexml(f, encoding='utf-8')
Take the XML parser of your choice, parse the file, manipulate the file using the related API, write it back.
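As an illustration of that parse, manipulate, write-back cycle, here is a sketch using xml.etree.ElementTree. The helper name ensure_import is made up for this example, and the project file is held in a string rather than read from disk. Note that the MSBuild elements live in a default namespace, so tags have to be qualified when searching and creating.

```python
import xml.etree.ElementTree as ET

MSBUILD_NS = "http://schemas.microsoft.com/developer/msbuild/2003"
# Serialize the MSBuild namespace as the default namespace, as in the input.
ET.register_namespace("", MSBUILD_NS)
IMPORT_TAG = "{%s}Import" % MSBUILD_NS


def ensure_import(root, project):
    """Append <Import Project=...> to root unless one already exists; return True if added."""
    for existing in root.findall(IMPORT_TAG):
        if existing.get("Project") == project:
            return False
    ET.SubElement(root, IMPORT_TAG, {"Project": project})
    return True


# Raw string so the backslash in the Windows path is kept literally.
project_xml = r"""<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003" DefaultTargets="Build" ToolsVersion="4.0">
  <Import Project="$(VCTargetsPath)\Microsoft.Cpp.targets" />
  <ImportGroup Label="ExtensionTargets">
  </ImportGroup>
</Project>"""

root = ET.fromstring(project_xml)
added_first = ensure_import(root, "$(ProjectName).targets")
added_again = ensure_import(root, "$(ProjectName).targets")  # second call is a no-op
print(ET.tostring(root, encoding="unicode"))
```

For a file on disk, ET.parse(filename) followed by tree.write(filename, encoding="utf-8", xml_declaration=True) would complete the cycle.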
I'm writing an application configuration module that uses XML in its files. Consider the following example:
<?xml version="1.0" encoding="UTF-8"?>
<Settings>
<PathA>/Some/path/to/directory</PathA>
<PathB>/Another/path</PathB>
</Settings>
Now, I'd like to override certain elements in a different file that gets loaded afterwards. Example of the override file:
<?xml version="1.0" encoding="UTF-8"?>
<Settings>
<PathB>/Change/this/path</PathB>
</Settings>
When querying the document (with overrides) with XPath, I'd like to get this as the element tree:
<?xml version="1.0" encoding="UTF-8"?>
<Settings>
<PathA>/Some/path/to/directory</PathA>
<PathB>/Change/this/path</PathB>
</Settings>
This is similar to what Python's ConfigParser does with its read() method, but done with XML. How can I implement this?
You could convert the XML into an instance of Python class:
import lxml.etree as ET
import io
class Settings(object):
    def __init__(self,text):
        # parse bytes: lxml rejects str input that carries an encoding declaration
        root=ET.parse(io.BytesIO(text.encode('utf-8'))).getroot()
        self.settings=dict((elt.tag,elt.text) for elt in root.xpath('/Settings/*'))
    def update(self,other):
        self.settings.update(other.settings)
text='''\
<?xml version="1.0" encoding="UTF-8"?>
<Settings>
<PathA>/Some/path/to/directory</PathA>
<PathB>/Another/path</PathB>
</Settings>'''
text2='''\
<?xml version="1.0" encoding="UTF-8"?>
<Settings>
<PathB>/Change/this/path</PathB>
</Settings>'''
s=Settings(text)
s2=Settings(text2)
s.update(s2)
print(s.settings)
yields
{'PathA': '/Some/path/to/directory', 'PathB': '/Change/this/path'}
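If you would rather keep an element tree that can still be queried with path expressions, the same idea can be applied to the trees directly. The sketch below uses the standard library's xml.etree.ElementTree and a made-up helper named apply_overrides. It assumes a flat layout like the example, with each setting a direct child of the root and at most one element per tag.

```python
import copy
import xml.etree.ElementTree as ET


def apply_overrides(base_root, override_root):
    """Copy each top-level setting from override_root into base_root."""
    for override in override_root:
        target = base_root.find(override.tag)
        if target is None:
            # Setting exists only in the override file: add a copy of it.
            base_root.append(copy.deepcopy(override))
        else:
            # Setting exists in both: the override value wins.
            target.text = override.text
    return base_root


base = ET.fromstring(
    "<Settings><PathA>/Some/path/to/directory</PathA>"
    "<PathB>/Another/path</PathB></Settings>"
)
override = ET.fromstring("<Settings><PathB>/Change/this/path</PathB></Settings>")

merged = apply_overrides(base, override)
print(merged.findtext("PathA"))
print(merged.findtext("PathB"))
```

Nested settings would need a recursive version of the same loop.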
Must you use XML? The same could be achieved much more simply with JSON:
Suppose this is the text from the first config file:
text='''
{
"PathB": "/Another/path",
"PathA": "/Some/path/to/directory"
}
'''
and this is the text from the second:
text2='''{
"PathB": "/Change/this/path"
}'''
Then, to merge the two, you simply load each into a dict and call update:
import json
config=json.loads(text)
config2=json.loads(text2)
config.update(config2)
print(config)
yields the Python dict:
{'PathB': '/Change/this/path', 'PathA': '/Some/path/to/directory'}