How to change a node value in Python

<?xml version="1.0"?>
<info>
</tags>
</tags>
<area>
<media>
<options>
<name>Jaipur</name>
</options>
</media>
</area>
</info>
I am totally new to Python. Here is my XML file, and I want to edit an element value at run time in Python.
That is, I want to change <name>Jaipur</name> to <name>Mumbai</name>.

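First, the example is not well-formed XML: the two </tags> lines close elements that were never opened. With those (and the <info> wrapper) left out, you can use xml.etree.ElementTree, which ships with Python: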
from xml.etree import ElementTree as et
xmlstr="""\
<?xml version="1.0"?>
<area>
<media>
<options>
<name>Jaipur</name>
</options>
</media>
</area>"""
doc=et.fromstring(xmlstr)
doc.find('.//name').text='Mumbai'
print(et.tostring(doc, encoding="unicode"))
output:
<area>
<media>
<options>
<name>Mumbai</name>
</options>
</media>
</area>
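When the XML lives in a file rather than in a string, the same idea applies. Here is a minimal sketch, assuming a hypothetical helper name `set_node_text` that is not part of the answer above. It also guards against the element being absent, since `find()` returns `None` when nothing matches:

```python
import xml.etree.ElementTree as ET


def set_node_text(path, xpath, new_text):
    """Set the text of the first element matching xpath and save the file.

    Returns True if an element was changed, False if nothing matched.
    """
    tree = ET.parse(path)
    node = tree.getroot().find(xpath)
    if node is None:  # find() returns None when nothing matches
        return False
    node.text = new_text
    # Overwrite the original file with the modified tree
    tree.write(path, encoding="utf-8", xml_declaration=True)
    return True
```

A call such as `set_node_text("area.xml", ".//name", "Mumbai")` would then update the file in place; the file name here is only an illustration.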

Related

How to replace the contents of a XML child element with a complement DOM document object from another xml file?

I have parsed and stored an XML file as a document object using the code below.
import xml.dom.minidom as DOM
import shutil
import xml.etree.ElementTree as ET
metadata_path=r"C:\Users\ar\DD2MI_result.xml"
new_metadata=DOM.parse(metadata_path)
Now I want to use this complete document object to replace the data of the child node in another XML file. I am able to get the child node like this:
output_draft = r"C:\Users\ar\airquality.xml"
doc = DOM.parse(output_draft)
meta=doc.getElementsByTagName('XmlDoc')
for metadata in meta:
    if metadata.firstChild.data:
        metadata.firstChild.replaceData(0,len(new_metadata),new_metadata)
        print (metadata.firstChild.data)
When I run the above code I get the error TypeError: object of type 'Document' has no len(), which I understand, as it is an object. How can I use the complete object or file to replace the current contents?
airquality.xml
<?xml version="1.0" encoding="UTF-8"?>
<gmd:MD_Metadata xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:gco="http://www.isotc211.org/2005/gco"
xmlns:gmd="http://www.isotc211.org/2005/gmd"
xmlns:srv="http://www.isotc211.org/2005/srv"
xmlns:gmx="http://www.isotc211.org/2005/gmx"
xmlns:gsr="http://www.isotc211.org/2005/gsr"
xmlns:gss="http://www.isotc211.org/2005/gss"
xmlns:gts="http://www.isotc211.org/2005/gts"
xmlns:gml="http://www.opengis.net/gml/3.2"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xsi:schemaLocation="http://www.isotc211.org/2005/gmd http://schemas.opengis.net/csw/2.0.2/profiles/apiso/1.0.0/apiso.xsd">
<gmd:fileIdentifier>
<gco:CharacterString>https://hdl.handle.net/20.500.12085/1f97f2a1-75fc-4110-ae22-f873d7d86565#metadata</gco:CharacterString>
</gmd:fileIdentifier>
<gmd:language>
<gmd:LanguageCode codeList="http://www.loc.gov/standards/iso639-2/" codeListValue="eng">eng</gmd:LanguageCode>
</gmd:language>
</gmd:MD_Metadata>
DD2MI_result.xml before replacement
<SVCManifest xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:type="typens:SVCManifest">
<Databases xsi:type="typens:ArrayOfSVCDatabase" />
<Resources xsi:type="typens:ArrayOfSVCResource">
<SVCResource xsi:type="typens:SVCResource">
<ID>{429221BF-D0A1-40D8-9DC1-B41D269E95C7}</ID>
<Name>test.crf</Name>
<Metadata xsi:type="typens:XmlPropertySet">
<XmlDoc><?xml version="1.0"?>
<metadata xml:lang="en"><Esri><CreaDate>20211219</metadata>
</XmlDoc>
</Metadata>
</SVCManifest>
DD2MI_result.xml after replacement
<SVCManifest xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:type="typens:SVCManifest">
<Databases xsi:type="typens:ArrayOfSVCDatabase" />
<Resources xsi:type="typens:ArrayOfSVCResource">
<SVCResource xsi:type="typens:SVCResource">
<ID>{429221BF-D0A1-40D8-9DC1-B41D269E95C7}</ID>
<Name>test.crf</Name>
<Metadata xsi:type="typens:XmlPropertySet">
<XmlDoc><?xml version="1.0" encoding="UTF-8"?>
<gmd:MD_Metadata xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:gco="http://www.isotc211.org/2005/gco"
xmlns:gmd="http://www.isotc211.org/2005/gmd"
xmlns:srv="http://www.isotc211.org/2005/srv"
xmlns:gmx="http://www.isotc211.org/2005/gmx"
xmlns:gsr="http://www.isotc211.org/2005/gsr"
xmlns:gss="http://www.isotc211.org/2005/gss"
xmlns:gts="http://www.isotc211.org/2005/gts"
xmlns:gml="http://www.opengis.net/gml/3.2"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xsi:schemaLocation="http://www.isotc211.org/2005/gmd http://schemas.opengis.net/csw/2.0.2/profiles/apiso/1.0.0/apiso.xsd">
<gmd:fileIdentifier>
<gco:CharacterString>https://hdl.handle.net/20.500.12085/1f97f2a1-75fc-4110-ae22-f873d7d86565#metadata</gco:CharacterString>
</gmd:fileIdentifier>
<gmd:language>
<gmd:LanguageCode codeList="http://www.loc.gov/standards/iso639-2/" codeListValue="eng">eng</gmd:LanguageCode>
</gmd:language>
</gmd:MD_Metadata>
</XmlDoc>
</Metadata>
</SVCManifest>
Your DD2MI_result.xml is still not well formed - for example, a couple of tags aren't closed. So as a first step, this answer assumes that document looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<SVCManifest xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="typens:SVCManifest">
<Databases xsi:type="typens:ArrayOfSVCDatabase" />
<Resources xsi:type="typens:ArrayOfSVCResource">
<SVCResource xsi:type="typens:SVCResource">
<ID>{429221BF-D0A1-40D8-9DC1-B41D269E95C7}</ID>
<Name>test.crf</Name>
<Metadata xsi:type="typens:XmlPropertySet">
<XmlDoc><?xml version="1.0"?>
<metadata xml:lang="en"><Esri><CreaDate>20211219</metadata></XmlDoc>
</Metadata>
</SVCResource>
</Resources>
</SVCManifest>
Also, as I mentioned, you can't have two xml declarations within one well-formed xml, and if you have a declaration, it must be at the very beginning of the document, not somewhere in the middle.
Next, we start the actual process, but using the lxml library. This should get you close enough to what I believe is your expected output. If it's not 100% there, you'll have to tinker with it a bit:
from lxml import etree
source = """[the content of airquality.xml, from the question]"""
target = """[the content of DD2MI_result.xml, as corrected above]"""
s_doc = etree.XML(source.encode())
t_doc = etree.XML(target.encode())
#first, we get rid of the ugly XmlDoc element in target doc (DD2MI_result.xml)
for t in t_doc.xpath('//XmlDoc'):
    t.getparent().remove(t)
#we then create an empty replacement for it
new_xd = etree.Element("XmlDoc")
#next, the replacement element is inserted in the target document in the correct place
for m in t_doc.xpath('//Metadata'):
    m.addnext(new_xd)
#finally, we insert in the new XmlDoc element the contents of the source document (airquality.xml)
for t in t_doc.xpath('//XmlDoc'):
    t.insert(0,s_doc.xpath('//*')[0])
#confirm that the output is what you are looking for:
print(etree.tostring(t_doc, xml_declaration=True, pretty_print=True).decode())
As I said, this may not be 100% of what you're trying to do, but should get you closer.
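For comparison, here is a minimal standard-library sketch of the same idea: empty each XmlDoc element and append a copy of the other document's root as its child. The inline strings are shortened stand-ins for the two files (an assumption for illustration), and the helper name `embed_document` is hypothetical:

```python
import copy
import xml.etree.ElementTree as ET

# Shortened stand-ins for airquality.xml and the corrected DD2MI_result.xml
SOURCE = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
    xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:language><gco:CharacterString>eng</gco:CharacterString></gmd:language>
</gmd:MD_Metadata>"""

TARGET = """<SVCManifest>
  <Metadata>
    <XmlDoc>old content</XmlDoc>
  </Metadata>
</SVCManifest>"""


def embed_document(target_root, source_root, tag="XmlDoc"):
    """Replace the contents of every <tag> element with a copy of source_root.

    Returns the number of elements that were updated.
    """
    holders = list(target_root.iter(tag))  # snapshot before modifying the tree
    for holder in holders:
        for child in list(holder):
            holder.remove(child)
        holder.text = None  # drop the old text content
        holder.append(copy.deepcopy(source_root))
    return len(holders)


target = ET.fromstring(TARGET)
embed_document(target, ET.fromstring(SOURCE))
print(ET.tostring(target, encoding="unicode"))
```

The deep copy keeps the source tree untouched, which matters if more than one XmlDoc element has to receive the same content.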
Consider XSLT, the special-purpose language designed to transform XML files, which provides the document() function to read in other XML documents. You even avoid any for-loops and if-logic.
Python's third-party lxml library can run XSLT 1.0 scripts (the built-in etree and minidom cannot). Alternatively, Python can call third-party XSLT 1.0, 2.0, even 3.0 processors.
XSLT (save as .xsl, a special .xml file)
Below assumes airquality.xml is in same folder relative to DD2MI_result.xml.
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" omit-xml-declaration="yes" encoding="utf-8" indent="yes" />
<xsl:strip-space elements="*"/>
<!-- IDENTITY TRANSFORM -->
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<!-- REPLACE XMLDoc -->
<xsl:template match="XmlDoc">
<xsl:copy>
<xsl:apply-templates select="document('airquality.xml')"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Python
import lxml.etree as et
doc = et.parse('DD2MI_result.xml')
xsl = et.parse('MyScript.xsl')
# CONFIGURE TRANSFORMER
transform = et.XSLT(xsl)
# TRANSFORM SOURCE DOC
result = transform(doc)
# OUTPUT TO CONSOLE
print(result)
# SAVE TO FILE
result.write_output('Output.xml')
Output
<SVCManifest xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="typens:SVCManifest">
<Databases xsi:type="typens:ArrayOfSVCDatabase"/>
<Resources xsi:type="typens:ArrayOfSVCResource"/>
<SVCResource xsi:type="typens:SVCResource"/>
<ID>{429221BF-D0A1-40D8-9DC1-B41D269E95C7}</ID>
<Name>test.crf</Name>
<Metadata xsi:type="typens:XmlPropertySet">
<XmlDoc>
<gmd:MD_Metadata xmlns:gco="http://www.isotc211.org/2005/gco" xmlns:gmd="http://www.isotc211.org/2005/gmd" xmlns:srv="http://www.isotc211.org/2005/srv" xmlns:gmx="http://www.isotc211.org/2005/gmx" xmlns:gsr="http://www.isotc211.org/2005/gsr" xmlns:gss="http://www.isotc211.org/2005/gss" xmlns:gts="http://www.isotc211.org/2005/gts" xmlns:gml="http://www.opengis.net/gml/3.2" xmlns:xlink="http://www.w3.org/1999/xlink" xsi:schemaLocation="http://www.isotc211.org/2005/gmd http://schemas.opengis.net/csw/2.0.2/profiles/apiso/1.0.0/apiso.xsd">
<gmd:fileIdentifier>
<gco:CharacterString>https://hdl.handle.net/20.500.12085/1f97f2a1-75fc-4110-ae22-f873d7d86565#metadata</gco:CharacterString>
</gmd:fileIdentifier>
<gmd:language>
<gmd:LanguageCode codeList="http://www.loc.gov/standards/iso639-2/" codeListValue="eng">eng</gmd:LanguageCode>
</gmd:language>
</gmd:MD_Metadata>
</XmlDoc>
</Metadata>
</SVCManifest>

Parsing XML in Python with ElementTree - findall()

I'm using the documentation here to try to get only the values (address, mask) for certain elements.
This is an example of the structure of my XML:
<?xml version="1.0" ?>
<rpc-reply xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="urn:uuid:52622325-b136-40cf-bc36-85332e25b6f3" xmlns:nc="urn:ietf:params:xml:ns:netconf:base:1.0">
<data>
<native xmlns="http://cisco.com/ns/yang/Cisco-IOS-XE-native">
<interface>
<GigabitEthernet>
<name>1</name>
<ip>
<address>
<primary>
<address>192.168.40.30</address>
<mask>255.255.255.0</mask>
</primary>
</address>
</ip>
<logging>
<event>
<link-status/>
</event>
</logging>
<mop>
<enabled>false</enabled>
<sysid>false</sysid>
</mop>
<negotiation xmlns="http://cisco.com/ns/yang/Cisco-IOS-XE-ethernet">
<auto>true</auto>
</negotiation>
</GigabitEthernet>
<GigabitEthernet>
<name>2</name>
<ip>
<address>
<primary>
<address>10.10.10.1</address>
<mask>255.255.255.0</mask>
</primary>
</address>
</ip>
<logging>
<event>
<link-status/>
</event>
</logging>
<mop>
<enabled>false</enabled>
<sysid>false</sysid>
</mop>
<negotiation xmlns="http://cisco.com/ns/yang/Cisco-IOS-XE-ethernet">
<auto>true</auto>
</negotiation>
</GigabitEthernet>
<GigabitEthernet>
<name>3</name>
<ip>
<address>
<primary>
<address>30.30.30.1</address>
<mask>255.255.255.0</mask>
</primary>
</address>
</ip>
<logging>
<event>
<link-status/>
</event>
</logging>
<mop>
<enabled>false</enabled>
<sysid>false</sysid>
</mop>
<negotiation xmlns="http://cisco.com/ns/yang/Cisco-IOS-XE-ethernet">
<auto>true</auto>
</negotiation>
</GigabitEthernet>
<GigabitEthernet>
<name>4</name>
<logging>
<event>
<link-status/>
</event>
</logging>
<mop>
<enabled>false</enabled>
<sysid>false</sysid>
</mop>
<negotiation xmlns="http://cisco.com/ns/yang/Cisco-IOS-XE-ethernet">
<auto>true</auto>
</negotiation>
</GigabitEthernet>
</interface>
</native>
</data>
Working off this example in the documentation, I've tried something like this:
import xml.etree.ElementTree as ET
tree = ET.parse("C:\\Users\\Redha\\Documents\\test_network\\interface123.xml")
root = tree.getroot()
for i in root.findall('native'):
    print(i.tag)
But it returns nothing. I've tried other things with no success. Any ideas? All advice appreciated. Thank you!
Consider using namespaces when referencing XML elements:
import xml.etree.ElementTree as ET
# declare XML namespaces
namespaces = {'native': 'http://cisco.com/ns/yang/Cisco-IOS-XE-native'}
tree = ET.parse("C:\\Users\\Redha\\Documents\\test_network\\interface123.xml")
root = tree.getroot()
# call findall() using previously created namespaces map
for i in root.findall('.//native:native', namespaces):
    print(i.tag)
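Building on the namespace map, the address and mask values themselves can be collected per interface. The following is a sketch under a few assumptions: the XML is a shortened copy of the reply in the question, the prefix `n` and the helper name `interface_addresses` are made up for the example, and interfaces without an <ip> block are simply skipped:

```python
import xml.etree.ElementTree as ET

NS = {"n": "http://cisco.com/ns/yang/Cisco-IOS-XE-native"}

# Shortened copy of the reply from the question (two interfaces only)
XML = """<rpc-reply xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <data>
    <native xmlns="http://cisco.com/ns/yang/Cisco-IOS-XE-native">
      <interface>
        <GigabitEthernet>
          <name>1</name>
          <ip><address><primary>
            <address>192.168.40.30</address>
            <mask>255.255.255.0</mask>
          </primary></address></ip>
        </GigabitEthernet>
        <GigabitEthernet>
          <name>4</name>
        </GigabitEthernet>
      </interface>
    </native>
  </data>
</rpc-reply>"""


def interface_addresses(root):
    """Return (name, address, mask) for every interface with a primary IP."""
    rows = []
    for iface in root.findall(".//n:GigabitEthernet", NS):
        primary = iface.find("n:ip/n:address/n:primary", NS)
        if primary is None:  # e.g. interface 4 has no <ip> block
            continue
        rows.append((
            iface.findtext("n:name", namespaces=NS),
            primary.findtext("n:address", namespaces=NS),
            primary.findtext("n:mask", namespaces=NS),
        ))
    return rows


print(interface_addresses(ET.fromstring(XML)))
# [('1', '192.168.40.30', '255.255.255.0')]
```

Every element below <native> inherits its default namespace, which is why each step of the path needs the prefix.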

Python write result of XML ElementTree findall to a file

I want to write Python code to extract some data from a source XML file and write it to a new file. My source file looks like this:
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
<soapenv:Header/>
<soapenv:Body>
<SessionID xmlns="http://www.niku.com/xog">12345</SessionID>
<QueryResult xmlns="http://www.niku.com/xog/Query" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Records>
<Record>
<id>1</id>
<date_start>2020-10-04T00:00:00</date_start>
<date_end>2020-10-10T00:00:00</date_end>
<name>Payne, Max</name>
</Record>
<Record>
<id>2</id>
<date_start>2020-10-04T00:00:00</date_start>
<date_end>2020-10-10T00:00:00</date_end>
<name>Reno, Jean</name>
</Record>
</Records>
</QueryResult>
</soapenv:Body>
</soapenv:Envelope>
I want to write the following output to a new xml file.
<Records>
<Record>
<id>1</id>
<date_start>2020-10-04T00:00:00</date_start>
<date_end>2020-10-10T00:00:00</date_end>
<name>Payne, Max</name>
</Record>
<Record>
<id>2</id>
<date_start>2020-10-04T00:00:00</date_start>
<date_end>2020-10-10T00:00:00</date_end>
<name>Reno, Jean</name>
</Record>
</Records>
I was able to get following results from this code.
import xml.etree.ElementTree as ET
tree = ET.parse('my_file.xml')
root = tree.getroot()
for xtag in root.findall('.//{http://www.niku.com/xog/Query}Record'):
    print(xtag)
Result:
<Element '{http://www.niku.com/xog/Query}Record' at 0x00000216BA69B778>
<Element '{http://www.niku.com/xog/Query}Record' at 0x00000216BA6A3228>
Can anyone help me to complete my requirement?
In your case print(xtag) prints the Element object's repr, not its XML. To get the XML you need to serialize the element with the module-level ET.tostring() function. Also, it seems you want the whole <Records> block rather than the individual <Record> elements; for that you don't need a loop.
import xml.etree.ElementTree as ET
tree = ET.parse('test.xml')
root = tree.getroot()
records = root.find('.//{http://www.niku.com/xog/Query}Records')
print(ET.tostring(records).decode("utf-8"))
Output
<ns0:Records xmlns:ns0="http://www.niku.com/xog/Query">
<ns0:Record>
<ns0:id>1</ns0:id>
<ns0:date_start>2020-10-04T00:00:00</ns0:date_start>
<ns0:date_end>2020-10-10T00:00:00</ns0:date_end>
<ns0:name>Payne, Max</ns0:name>
</ns0:Record>
<ns0:Record>
<ns0:id>2</ns0:id>
<ns0:date_start>2020-10-04T00:00:00</ns0:date_start>
<ns0:date_end>2020-10-10T00:00:00</ns0:date_end>
<ns0:name>Reno, Jean</ns0:name>
</ns0:Record>
</ns0:Records>
You could also use the lxml module, which gives a slightly different output.
from lxml import etree
tree = etree.parse('test.xml')
root = tree.getroot()
records = root.find('.//{http://www.niku.com/xog/Query}Records')
print(etree.tostring(records).decode("utf-8"))
Output
<Records xmlns="http://www.niku.com/xog/Query" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
<Record>
<id>1</id>
<date_start>2020-10-04T00:00:00</date_start>
<date_end>2020-10-10T00:00:00</date_end>
<name>Payne, Max</name>
</Record>
<Record>
<id>2</id>
<date_start>2020-10-04T00:00:00</date_start>
<date_end>2020-10-10T00:00:00</date_end>
<name>Reno, Jean</name>
</Record>
</Records>
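Since the question asks for the result in a new file rather than on the console, here is a minimal standard-library sketch of that last step. The function name `save_records` and both paths are assumptions for illustration; the written file carries the same ns0 prefixes as the ElementTree output shown above:

```python
import xml.etree.ElementTree as ET

QUERY_NS = "http://www.niku.com/xog/Query"


def save_records(source_path, output_path):
    """Copy the <Records> block of the SOAP reply into its own XML file.

    Returns the number of <Record> children that were written.
    """
    root = ET.parse(source_path).getroot()
    records = root.find(f".//{{{QUERY_NS}}}Records")
    if records is None:
        raise ValueError("no <Records> element found")
    # Wrap the element in its own tree so it can be written as a document
    ET.ElementTree(records).write(
        output_path, encoding="utf-8", xml_declaration=True
    )
    return len(records)
```

A call such as `save_records("my_file.xml", "records.xml")` would produce the new file; the output name is hypothetical.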

Python xml ElementTree: how to check if an element is present and process it?

<rules>
<entry name="rule name 1">
<to>
<member>untrust</member>
</to>
<from>
<member>trust</member>
</from>
<source>
<member>object1</member>
</source>
<destination>
<member>any</member>
</destination>
<service>any</service>
<description>'NAT Rule 1'</description>
<nat-type>ipv4</nat-type>
<source-translation>
<static-ip>
<bi-directional>yes</bi-directional>
<translated-address>object1-pub</translated-address>
</static-ip>
</source-translation>
</entry>
<entry name="rule name 2">
<to>
<member>untrust</member>
</to>
<from>
<member>trust</member>
</from>
<source>
<member>any</member>
</source>
<destination>
<member>object2-pub</member>
</destination>
<destination-translation>
<translated-address>object2</translated-address>
</destination-translation>
<service>any</service>
<description>'NAT Rule 2'</description>
<tag>
<member>DST NAT</member>
</tag>
</entry>
</rules>
Hi,
I am trying to process the above XML using xml.etree.ElementTree in Python. I am looking for a way to check whether <source-translation> or <destination-translation> is present. In short, if <source-translation> is present, set the nat-type variable to source NAT and go on to get the <translated-address> values. If <destination-address> is present, run the logic to get its values. I am putting all this data in a dict with a format like this...
rules{
    rule_name: <name>
    options:{
        src_zone:<from>
        source:<source>
        dst_zone:<to>
        destination:<destination>
        nat-type:<application>
        service:<service>
        translated-address:<translated-address>
        destination-address:<destination-address>
    }
}
I have tried various combinations however it is not working for me.
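To check whether the element exists, look for it inside each <entry> and test the result against None:
import xml.etree.ElementTree as ET
root = ET.parse('PATH_TO_YOUR_FILE').getroot()
# <source-translation> is a child of <entry>, not of the root element
for entry in root.iter('entry'):
    if entry.find('source-translation') is not None:
        pass  # PUT YOUR CODE HERE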
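A slightly fuller sketch of that check, covering both translation types, might look as follows. The XML is a trimmed copy of the rules in the question, and the helper name `nat_summary` and the dict keys are assumptions chosen for the example rather than the asker's final layout:

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the two rules from the question
XML = """<rules>
  <entry name="rule name 1">
    <source-translation>
      <static-ip>
        <translated-address>object1-pub</translated-address>
      </static-ip>
    </source-translation>
  </entry>
  <entry name="rule name 2">
    <destination-translation>
      <translated-address>object2</translated-address>
    </destination-translation>
  </entry>
</rules>"""


def nat_summary(root):
    """Map each rule name to its NAT type and translated address."""
    rules = {}
    for entry in root.iter("entry"):
        src = entry.find("source-translation")
        dst = entry.find("destination-translation")
        # Compare with None: an Element without children is falsy
        if src is not None:
            nat_type, block = "source", src
        elif dst is not None:
            nat_type, block = "destination", dst
        else:
            nat_type, block = None, None
        translated = None
        if block is not None:
            translated = block.findtext(".//translated-address")
        rules[entry.get("name")] = {
            "nat-type": nat_type,
            "translated-address": translated,
        }
    return rules


print(nat_summary(ET.fromstring(XML)))
```

The remaining fields (zones, source, destination, service) can be read the same way with `entry.findtext(...)`.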

Delete entire node using lxml

I have an XML document like the following:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>company</groupId>
<artifactId>art-id</artifactId>
<version>RELEASE</version>
</parent>
<properties>
<tomcat.username>admin</tomcat.username>
<tomcat.password>admin</tomcat.password>
</properties>
<dependencies>
<dependency>
<groupId>asdf</groupId>
<artifactId>asdf</artifactId>
<version>[3.8,)</version>
</dependency>
<dependency>
<groupId>asdf</groupId>
<artifactId>asdf</artifactId>
<version>[4.1,)</version>
</dependency>
</dependencies>
How can I delete the entire node "dependencies"?
I have looked at other questions and answers on Stack Overflow. What is different here is the namespace aspect of this XML, and the other questions ask to delete a subelement like "dependency" while I want to delete the whole node "dependencies". Is there an easy way using lxml to delete the entire node?
The following gives a 'NoneType' object has no attribute 'remove' error:
from lxml import etree as ET
tree = ET.parse('pom.xml')
namespace = '{http://maven.apache.org/POM/4.0.0}'
root = ET.Element(namespace+'project')
root.find(namespace+'dependencies').remove()
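In your code, ET.Element(...) creates a new, empty element instead of fetching the parsed root, so find() returns None; use tree.getroot() instead. Then create a dict that maps a prefix to your namespace, find the node, and call root.remove() with the node as its argument. remove() is called on the parent, not on the node itself: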
x = """<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>company</groupId>
<artifactId>art-id</artifactId>
<version>RELEASE</version>
</parent>
<properties>
<tomcat.username>admin</tomcat.username>
<tomcat.password>admin</tomcat.password>
</properties>
<dependencies>
<dependency>
<groupId>asdf</groupId>
<artifactId>asdf</artifactId>
<version>[3.8,)</version>
</dependency>
<dependency>
<groupId>asdf</groupId>
<artifactId>asdf</artifactId>
<version>[4.1,)</version>
</dependency>
</dependencies>
</project>"""
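import lxml.etree as et
from io import BytesIO
# Python 3: parse from bytes so the encoding declared in x applies
tree = et.parse(BytesIO(x.encode("utf-8")))
root = tree.getroot()
nsmap = {"mav": "http://maven.apache.org/POM/4.0.0"}
# remove() is called on the parent, with the child node as its argument
root.remove(root.find("mav:dependencies", namespaces=nsmap))
print(et.tostring(tree).decode())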
Which would give you:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>company</groupId>
<artifactId>art-id</artifactId>
<version>RELEASE</version>
</parent>
<properties>
<tomcat.username>admin</tomcat.username>
<tomcat.password>admin</tomcat.password>
</properties>
</project>
First, grab the root node. Since it is <project ... > (vs <project .../>) the "parent" element of dependencies is project. Example from the documentation:
import xml.etree.ElementTree as ET
tree = ET.parse('country_data.xml')
root = tree.getroot()
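Once you have the root, check root.tag (an attribute, not a method). Because the document declares a default namespace, it is "{http://maven.apache.org/POM/4.0.0}project".
Then do root.remove(root.find('{http://maven.apache.org/POM/4.0.0}dependencies')), where root is the project node. The namespace must be part of the name; a bare 'dependencies' matches nothing and find() returns None.
If it were the self-closing <project .../>, everything after it would fall outside the root element and the document would not be well-formed XML. I can see exactly where you are coming from, though.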
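Put together with the standard library only, the steps above can be sketched like this. The POM is cut down to the parts that matter, and the prefix `mav` and the helper name `drop_child` are assumptions for illustration:

```python
import xml.etree.ElementTree as ET

POM_NS = {"mav": "http://maven.apache.org/POM/4.0.0"}

# Cut-down version of the pom.xml from the question
POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <dependencies>
    <dependency><groupId>asdf</groupId></dependency>
  </dependencies>
</project>"""


def drop_child(root, path, namespaces):
    """Remove the first direct child of root matching path, if any."""
    node = root.find(path, namespaces)
    if node is None:
        return False
    root.remove(node)  # remove() is called on the parent, not on the node
    return True


root = ET.fromstring(POM)
print(root.tag)  # the tag includes the namespace in braces
print(drop_child(root, "mav:dependencies", POM_NS))
```

Returning False instead of raising keeps the helper safe to call on POM files that have no dependencies block at all.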
