I have an XML document like the following:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>company</groupId>
<artifactId>art-id</artifactId>
<version>RELEASE</version>
</parent>
<properties>
<tomcat.username>admin</tomcat.username>
<tomcat.password>admin</tomcat.password>
</properties>
<dependencies>
<dependency>
<groupId>asdf</groupId>
<artifactId>asdf</artifactId>
<version>[3.8,)</version>
</dependency>
<dependency>
<groupId>asdf</groupId>
<artifactId>asdf</artifactId>
<version>[4.1,)</version>
</dependency>
</dependencies>
How can I delete the entire "dependencies" node?
I have looked at other questions and answers on Stack Overflow. What is different here is the namespace aspect of this XML, and the other questions ask how to delete a subelement like "dependency", while I want to delete the whole "dependencies" node. Is there an easy way to delete the entire node using lxml?
The following gives a "'NoneType' object has no attribute 'remove'" error:
from lxml import etree as ET
tree = ET.parse('pom.xml')
namespace = '{http://maven.apache.org/POM/4.0.0}'
root = ET.Element(namespace+'project')
root.find(namespace+'dependencies').remove()
You can create a dict mapping for your namespace(s), find the node, and then call root.remove(), passing it the node. remove() is called on the parent, not on the node you want to delete:
x = """<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>company</groupId>
<artifactId>art-id</artifactId>
<version>RELEASE</version>
</parent>
<properties>
<tomcat.username>admin</tomcat.username>
<tomcat.password>admin</tomcat.password>
</properties>
<dependencies>
<dependency>
<groupId>asdf</groupId>
<artifactId>asdf</artifactId>
<version>[3.8,)</version>
</dependency>
<dependency>
<groupId>asdf</groupId>
<artifactId>asdf</artifactId>
<version>[4.1,)</version>
</dependency>
</dependencies>
</project>"""
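import lxml.etree as et
from io import BytesIO
# Parse from bytes: the document carries an encoding declaration
tree = et.parse(BytesIO(x.encode("utf-8")))
root = tree.getroot()
# Map a prefix of our choosing to the POM's default namespace
nsmap = {"mav": "http://maven.apache.org/POM/4.0.0"}
# remove() is called on the parent (root) and is passed the child to drop
root.remove(root.find("mav:dependencies", namespaces=nsmap))
print(et.tostring(tree).decode())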
Which would give you:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>company</groupId>
<artifactId>art-id</artifactId>
<version>RELEASE</version>
</parent>
<properties>
<tomcat.username>admin</tomcat.username>
<tomcat.password>admin</tomcat.password>
</properties>
</project>
First, grab the root node. Since it is <project ... > (vs <project .../>) the "parent" element of dependencies is project. Example from the documentation:
import xml.etree.ElementTree as ET
tree = ET.parse('country_data.xml')
root = tree.getroot()
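Once you have the root, check root.tag (an attribute, not a method). Because the document declares a default namespace, it will be "{http://maven.apache.org/POM/4.0.0}project" rather than plain "project".
Then do root.remove(root.find('{http://maven.apache.org/POM/4.0.0}dependencies')), where root is the project node. The namespace has to be part of the tag name, otherwise find() returns None.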
If it were <project .../> then it would be invalid XML since there must be a root element. I can see exactly where you are coming from, though.
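For reference, here is a minimal sketch of the same removal using only the standard library. The POM string is a trimmed-down stand-in for the file in the question, and the prefix "mav" is an arbitrary name chosen for the lookup, not something defined by the document.

```python
import xml.etree.ElementTree as ET

POM_NS = "http://maven.apache.org/POM/4.0.0"

# Trimmed-down stand-in for the pom.xml in the question (assumption).
POM = f"""<project xmlns="{POM_NS}">
  <modelVersion>4.0.0</modelVersion>
  <dependencies>
    <dependency><groupId>asdf</groupId></dependency>
  </dependencies>
</project>"""

# Serialize the POM namespace as the default one instead of an ns0: prefix.
ET.register_namespace("", POM_NS)

root = ET.fromstring(POM)
ns = {"mav": POM_NS}  # "mav" is an arbitrary prefix used only for lookups

deps = root.find("mav:dependencies", ns)
if deps is not None:   # find() returns None when nothing matches
    root.remove(deps)  # remove() is called on the parent, not on the child

result = ET.tostring(root, encoding="unicode")
print(result)
```

The None check is what guards against the 'NoneType' error from the question: if the lookup does not match (for example because the namespace was left out), nothing is removed instead of raising.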
Related
I'd like to create an element tree (not parsing!) with lxml.objectify that might look like this:
<root>
<child>Hello World</child>
</root>
My first attempt was to write code like this:
import lxml.objectify as o
from lxml.etree import tounicode
r = o.Element("root")
c = o.Element("child", text="Hello World")
r.append(c)
print(tounicode(r, pretty_print=True))
But that produces:
<root xmlns:py="http://codespeak.net/lxml/objectify/pytype"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema" py:pytype="TREE">
<child text="Hello World" data="Test" py:pytype="TREE"/>
</root>
As suggested in other answers, the <child> has no method _setText.
Apparently, lxml.objectify does not allow creating an element with text or changing its text content. So, did I miss something?
From the docs and the answer you linked, you should use SubElement:
import lxml.objectify as o
from lxml.etree import tounicode
r = o.E.root() # same as o.Element("root")
c = o.SubElement(r, "child")
c._setText("Hello World")
print(tounicode(r, pretty_print=True))
c._setText("Changed it!")
print(tounicode(r, pretty_print=True))
Output:
<root xmlns:py="http://codespeak.net/lxml/objectify/pytype" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<child>Hello World</child>
</root>
<root xmlns:py="http://codespeak.net/lxml/objectify/pytype" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<child>Changed it!</child>
</root>
I'm having an issue extracting the version number using a Python script. It returns None when I run the script. Can someone help me with this?
Python Script:
import xml.etree.ElementTree as ET
tree = ET.parse('pom.xml')
root = tree.getroot()
releaseVersion = root.find("version")
print(releaseVersion)
pom.xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://maven.apache.org/POM/4.0.0"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
<artifactId>watcher</artifactId>
<version>0.0.1-SNAPSHOT</version>
<groupId>com.test</groupId>
<name>file</name>
<packaging>jar</packaging>
<parent>
<artifactId>spring-boot-starter-parent</artifactId>
<groupId>org.springframework.boot</groupId>
<relativePath/>
<version>2.6.1</version>
</parent>
</project>
You're not taking into account that all of your elements are in a default namespace defined by xmlns="http://maven.apache.org/POM/4.0.0" in your <project> element.
So you have to create your query with this namespace.
import xml.etree.ElementTree as ET
tree = ET.parse('pom.xml')
root = tree.getroot()
NS = { 'maven' : 'http://maven.apache.org/POM/4.0.0' }
releaseVersion = root.find("maven:version",NS)
print(releaseVersion.text)
Here NS = { ... } maps the prefix maven to the namespace URI, so that the prefix can be used in the path expression passed to find().
Your pom.xml has a namespace xmlns="http://maven.apache.org/POM/4.0.0" in the project tag.
If you want to search with the fully qualified name, use the {namespace}tag form:
>>> root.find("{http://maven.apache.org/POM/4.0.0}version")
<Element '{http://maven.apache.org/POM/4.0.0}version' at 0x0000014635EC0A40>
If you don't care which namespace the element is in, you can search with {*}tag (supported since Python 3.8):
>>> root.find("{*}version")
<Element '{http://maven.apache.org/POM/4.0.0}version' at 0x0000014635EC0A40>
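To make the difference concrete, here is a small self-contained sketch. The POM string is a shortened copy of the one in the question, and the prefix "maven" is just a local alias. It also shows that the path is relative to the root, so the project's own version and the parent's version are looked up separately.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the pom.xml from the question.
POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <artifactId>watcher</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <parent>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>2.6.1</version>
  </parent>
</project>"""

root = ET.fromstring(POM)
ns = {"maven": "http://maven.apache.org/POM/4.0.0"}

# A bare tag name does not match elements that live in a namespace.
unqualified = root.find("version")

# findtext() returns the element's text directly. "maven:version" only
# matches direct children of root, i.e. the project's own version.
project_version = root.findtext("maven:version", namespaces=ns)

# The parent's version needs an explicit path through <parent>.
parent_version = root.findtext("maven:parent/maven:version", namespaces=ns)

print(unqualified, project_version, parent_version)
```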
I'm using the documentation here to try to get only the values (address, mask) for certain elements.
This is an example of the structure of my XML:
<?xml version="1.0" ?>
<rpc-reply xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="urn:uuid:52622325-b136-40cf-bc36-85332e25b6f3" xmlns:nc="urn:ietf:params:xml:ns:netconf:base:1.0">
<data>
<native xmlns="http://cisco.com/ns/yang/Cisco-IOS-XE-native">
<interface>
<GigabitEthernet>
<name>1</name>
<ip>
<address>
<primary>
<address>192.168.40.30</address>
<mask>255.255.255.0</mask>
</primary>
</address>
</ip>
<logging>
<event>
<link-status/>
</event>
</logging>
<mop>
<enabled>false</enabled>
<sysid>false</sysid>
</mop>
<negotiation xmlns="http://cisco.com/ns/yang/Cisco-IOS-XE-ethernet">
<auto>true</auto>
</negotiation>
</GigabitEthernet>
<GigabitEthernet>
<name>2</name>
<ip>
<address>
<primary>
<address>10.10.10.1</address>
<mask>255.255.255.0</mask>
</primary>
</address>
</ip>
<logging>
<event>
<link-status/>
</event>
</logging>
<mop>
<enabled>false</enabled>
<sysid>false</sysid>
</mop>
<negotiation xmlns="http://cisco.com/ns/yang/Cisco-IOS-XE-ethernet">
<auto>true</auto>
</negotiation>
</GigabitEthernet>
<GigabitEthernet>
<name>3</name>
<ip>
<address>
<primary>
<address>30.30.30.1</address>
<mask>255.255.255.0</mask>
</primary>
</address>
</ip>
<logging>
<event>
<link-status/>
</event>
</logging>
<mop>
<enabled>false</enabled>
<sysid>false</sysid>
</mop>
<negotiation xmlns="http://cisco.com/ns/yang/Cisco-IOS-XE-ethernet">
<auto>true</auto>
</negotiation>
</GigabitEthernet>
<GigabitEthernet>
<name>4</name>
<logging>
<event>
<link-status/>
</event>
</logging>
<mop>
<enabled>false</enabled>
<sysid>false</sysid>
</mop>
<negotiation xmlns="http://cisco.com/ns/yang/Cisco-IOS-XE-ethernet">
<auto>true</auto>
</negotiation>
</GigabitEthernet>
</interface>
</native>
</data>
</rpc-reply>
Working off this example in the documentation, I've tried something like this:
import xml.etree.ElementTree as ET
tree = ET.parse("C:\\Users\\Redha\\Documents\\test_network\\interface123.xml")
root = tree.getroot()
for i in root.findall('native'):
print(i.tag)
But it returns nothing. I've tried other things without success. Any ideas? All advice appreciated. Thank you!
Consider using namespaces when referencing XML elements:
import xml.etree.ElementTree as ET
# declare XML namespaces
namespaces = {'native': 'http://cisco.com/ns/yang/Cisco-IOS-XE-native'}
tree = ET.parse("C:\\Users\\Redha\\Documents\\test_network\\interface123.xml")
root = tree.getroot()
# call findall() using previously created namespaces map
for i in root.findall('.//native:native', namespaces):
print(i.tag)
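Going one step further towards what the question asks for, here is a rough sketch that pulls the address and mask for each interface. It uses a cut-down copy of the reply with only two interfaces, and the prefix "n" is an arbitrary alias for the Cisco native namespace. Interfaces without an IP block (like number 4 in the question) are skipped.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the rpc-reply from the question (two interfaces only).
REPLY = """<rpc-reply xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <data>
    <native xmlns="http://cisco.com/ns/yang/Cisco-IOS-XE-native">
      <interface>
        <GigabitEthernet>
          <name>1</name>
          <ip><address><primary>
            <address>192.168.40.30</address>
            <mask>255.255.255.0</mask>
          </primary></address></ip>
        </GigabitEthernet>
        <GigabitEthernet>
          <name>4</name>
        </GigabitEthernet>
      </interface>
    </native>
  </data>
</rpc-reply>"""

ns = {"n": "http://cisco.com/ns/yang/Cisco-IOS-XE-native"}
root = ET.fromstring(REPLY)

addresses = []
# ".//" searches the whole tree, since GigabitEthernet is nested deeply.
for intf in root.findall(".//n:GigabitEthernet", ns):
    name = intf.findtext("n:name", namespaces=ns)
    primary = intf.find("n:ip/n:address/n:primary", ns)
    if primary is None:
        continue  # interface has no IP configuration
    address = primary.findtext("n:address", namespaces=ns)
    mask = primary.findtext("n:mask", namespaces=ns)
    addresses.append((name, address, mask))

for name, address, mask in addresses:
    print(name, address, mask)
```

Every step of the path needs the prefix, because the child elements inherit the default namespace declared on native.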
<?xml version="1.0" encoding="utf-8"?>
<AcResponse
Command="hist"
TaskId="408709">
<element
name="/build.gradle"
id="93527">
<transaction
id="1117194"
type="promote"
time="1529083792"
user="soarfa99">
<comment>Automated promotion to parent stream by module build: jenkins-SC-MODULE-CS-SC-TRUNK-MedRec-DEV-CI-430</comment>
<version
virtual="11007/75"
real="36877/2"
virtualNamedVersion="CS-SC-TRUNK-INTG/75"
realNamedVersion="CS-SC-TRUNK-MedRec-DEV2_ar037601/2"
elem_type="text"
dir="no">
<issueNum>72768</issueNum>
</version>
</transaction>
<transaction
id="1111652"
type="promote"
time="1528100495"
user="dm041068">
<comment>SEDA file add- Debajyoti</comment>
<version
virtual="11007/74"
real="39225/1"
virtualNamedVersion="CS-SC-TRUNK-INTG/74"
realNamedVersion="CS-SC-TRUNK-CM-DEV-Debajyoti_dm041068/1"
elem_type="text"
dir="no">
<issueNum>72629</issueNum>
</version>
</transaction>
</element>
<streams>
<stream
id="11007"
name="CS-SC-TRUNK-INTG"
type="normal"/>
</streams>
</AcResponse>
This is the XML I am trying to parse. I am trying to extract the value of 'issueNum' with the following code:
tree=ET.parse(xml)
root=tree.getroot()
for item in root.findall('version'):
for child in item:
print(child.attrib['issueNum'])
Can you please help me get the value of "issueNum"?
You can use an xpath expression to find the values of issueNum:
from lxml import etree
xml = '''<?xml version="1.0" encoding="utf-8"?>
<AcResponse
Command="hist"
TaskId="408709">....'''
# lxml rejects str input that carries an encoding declaration, so pass bytes
tree = etree.fromstring(xml.encode("utf-8"))
issues = tree.xpath('//version/issueNum')
for issue in issues:
print(issue.text)
This prints:
72768
72629
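If you would rather stay with the standard library, something along these lines should also work. The XML here is a shortened copy of the document from the question, with most attributes dropped. It also shows why the original loop printed nothing: findall('version') only looks at direct children of the root, and issueNum is an element, not an attribute.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the AcResponse document from the question.
XML = """<AcResponse Command="hist" TaskId="408709">
  <element name="/build.gradle" id="93527">
    <transaction id="1117194" type="promote">
      <version virtual="11007/75" real="36877/2">
        <issueNum>72768</issueNum>
      </version>
    </transaction>
    <transaction id="1111652" type="promote">
      <version virtual="11007/74" real="39225/1">
        <issueNum>72629</issueNum>
      </version>
    </transaction>
  </element>
</AcResponse>"""

root = ET.fromstring(XML)

# <version> is not a direct child of the root, so this finds nothing.
direct = root.findall("version")

# iter() walks the whole tree; the value is the element's text.
issue_nums = [node.text for node in root.iter("issueNum")]

# Pair each transaction id with its issue number.
by_transaction = {
    t.get("id"): t.findtext("version/issueNum")
    for t in root.iter("transaction")
}

print(issue_nums)
print(by_transaction)
```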
<?xml version="1.0"?>
<info>
</tags>
</tags>
<area>
<media>
<options>
<name>Jaipur</name>
</options>
</media>
</area>
</info>
I am totally new to Python. Here is my XML file; I want to edit an element's value at run time in Python.
That is, I want to change <name>Jaipur</name> to <name>Mumbai</name>.
First, the example is not valid XML. You can use xml.etree, which is included in the standard library:
from xml.etree import ElementTree as et
xmlstr="""\
<?xml version="1.0"?>
<area>
<media>
<options>
<name>Jaipur</name>
</options>
</media>
</area>"""
doc = et.fromstring(xmlstr)
# find() returns the first matching <name> anywhere below the root
doc.find('.//name').text = 'Mumbai'
print(et.tostring(doc, encoding='unicode'))
output:
<area>
<media>
<options>
<name>Mumbai</name>
</options>
</media>
</area>
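If the XML lives in a file rather than a string, the same change can be written back to disk. This is only a sketch: set_name is a hypothetical helper, and the temporary file and its contents are made up for the demonstration (a cleaned-up version of the XML from the question).

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def set_name(path, new_name):
    """Hypothetical helper: set the first <name> in the file to new_name."""
    tree = ET.parse(path)
    node = tree.getroot().find(".//name")
    if node is None:
        raise ValueError("no <name> element found")
    node.text = new_name
    # Overwrite the file in place, keeping an XML declaration.
    tree.write(path, encoding="utf-8", xml_declaration=True)


# Demonstration on a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "info.xml")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(
            "<info><area><media><options>"
            "<name>Jaipur</name>"
            "</options></media></area></info>"
        )
    set_name(path, "Mumbai")
    updated = ET.parse(path).getroot().findtext(".//name")

print(updated)
```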