How to get parent path of lxml.etree._ElementTree object

Using the lxml library I have objectified some elements (sample code below):
from lxml import objectify

config = objectify.Element("config")
gui = objectify.Element("gui")
color = objectify.Element("color")
gui.append(color)
config.append(gui)
config.gui.color.active = "red"
config.gui.color.readonly = "black"
config.gui.color.match = "grey"
The result is the following structure:
config
config.gui
config.gui.color
config.gui.color.active
config.gui.color.readonly
config.gui.color.match
I can get the full path of each object:
for element in config.iter():
    print(element.getroottree().getpath(element))
The path components are separated by slashes, but that is not a problem. What I do not know is how to get only the parent part of the path, so that I can use setattr to change the value of a given element.
For example, for the element
config.gui.color.active
I would like to run
setattr(config.gui.color, 'active', 'something')
but I have no idea how to get the "parent" part of the full path.

You can get the parent of an element with its getparent() method:
for element in config.iter():
    print("path:", element.getroottree().getpath(element))
    if element.getparent() is not None:
        print("parent-path:", element.getroottree().getpath(element.getparent()))
Alternatively, you can drop the last component of the element's own path:
for element in config.iter():
    path = element.getroottree().getpath(element)
    print("path:", path)
    parts = path.split("/")
    parent_path = "/".join(parts[:-1])
    print("parent-path:", parent_path)
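To get from a path string to the setattr call the question asks for, one option is to split the path into its parent part and its final name, then walk the attribute chain from the root. The following is only a sketch under assumptions: split_parent and resolve are hypothetical helper names, the paths are assumed to look like the /config/gui/color/active strings printed above (no positional suffixes such as [2]), and a SimpleNamespace tree stands in for the objectified config so the snippet runs without lxml.

```python
from functools import reduce
from types import SimpleNamespace


def split_parent(path):
    """Split '/config/gui/color/active' into ('/config/gui/color', 'active')."""
    parent_path, _, name = path.rpartition("/")
    return parent_path, name


def resolve(root, path):
    """Follow a slash-separated path by attribute access, starting at root.

    The first path component names the root itself, so it is skipped.
    """
    names = path.strip("/").split("/")[1:]
    return reduce(getattr, names, root)


# Stand-in for the objectified tree (assumption: used only to avoid needing lxml).
config = SimpleNamespace(
    gui=SimpleNamespace(color=SimpleNamespace(active="red"))
)

parent_path, name = split_parent("/config/gui/color/active")
setattr(resolve(config, parent_path), name, "something")
print(parent_path, name, config.gui.color.active)
```

With lxml itself, element.getparent() already returns the parent object, so something like setattr(element.getparent(), element.tag, 'something') should reach the same result without any string handling.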

Related

parsing .xml file using python :search and copy related data

I want to copy some data from an .xml file based on a search value.
In the XML file below I want to search for 0xCCB7B836 (a MEM_BLOCK_CRC value) and copy the data inside that block:
4e564d2d52656648
6173685374617274
1782af065966579e
899885d440d3ad67
d04b41b15e2b13c2
Two more examples: searching for 0xECFBBA1A should return 0000, and searching for 0xA54E2B5A should return 30d4.
<MEM_DATA>
  <MEM_SECTOR>
    <MEM_SECTOR_NUMBER>0</MEM_SECTOR_NUMBER>
    <MEM_SECTOR_STATUS>ACTIVE</MEM_SECTOR_STATUS>
    <MEM_SECTOR_STARTADR>0x800000</MEM_SECTOR_STARTADR>
    <MEM_SECTOR_ENDADR>0x0</MEM_SECTOR_ENDADR>
    <MEM_SECTOR_COUNTER>0x1</MEM_SECTOR_COUNTER>
    <MEM_ERASED_MARKER>SET</MEM_ERASED_MARKER>
    <MEM_USED_MARKER>SET</MEM_USED_MARKER>
    <MEM_FULL_MARKER>NOT_SET</MEM_FULL_MARKER>
    <MEM_ERASE_MARKER>NOT_SET</MEM_ERASE_MARKER>
    <MEM_START_MARKER>SET</MEM_START_MARKER>
    <MEM_START_OFFSET>0x1</MEM_START_OFFSET>
    <MEM_CLONE_MARKER>NOT_SET</MEM_CLONE_MARKER>
    <MEM_BLOCK>
      <MEM_BLOCK_ID>0x101</MEM_BLOCK_ID>
      <MEM_BLOCK_NAME>UNKNOWN</MEM_BLOCK_NAME>
      <MEM_BLOCK_STATUS>VALID</MEM_BLOCK_STATUS>
      <MEM_BLOCK_FLAGS>0x0</MEM_BLOCK_FLAGS>
      <MEM_BLOCK_STORAGE>Emulation</MEM_BLOCK_STORAGE>
      <MEM_BLOCK_LEN>0x28</MEM_BLOCK_LEN>
      <MEM_BLOCK_VERSION>0x0</MEM_BLOCK_VERSION>
      <MEM_BLOCK_HEADER_CRC>0xE527</MEM_BLOCK_HEADER_CRC>
      <MEM_BLOCK_CRC>0xCCB7B836</MEM_BLOCK_CRC>
      <MEM_BLOCK_CRC2>None</MEM_BLOCK_CRC2>
      <MEM_BLOCK_DATA>
        <MEM_PAGE_DATA>4e564d2d52656648</MEM_PAGE_DATA>
        <MEM_PAGE_DATA>6173685374617274</MEM_PAGE_DATA>
        <MEM_PAGE_DATA>1782af065966579e</MEM_PAGE_DATA>
        <MEM_PAGE_DATA>899885d440d3ad67</MEM_PAGE_DATA>
        <MEM_PAGE_DATA>d04b41b15e2b13c2</MEM_PAGE_DATA>
      </MEM_BLOCK_DATA>
    </MEM_BLOCK>
    <MEM_BLOCK>
      <MEM_BLOCK_ID>0x20F</MEM_BLOCK_ID>
      <MEM_BLOCK_NAME>UNKNOWN</MEM_BLOCK_NAME>
      <MEM_BLOCK_STATUS>VALID</MEM_BLOCK_STATUS>
      <MEM_BLOCK_FLAGS>0x0</MEM_BLOCK_FLAGS>
      <MEM_BLOCK_STORAGE>Emulation</MEM_BLOCK_STORAGE>
      <MEM_BLOCK_LEN>0x2</MEM_BLOCK_LEN>
      <MEM_BLOCK_VERSION>0x0</MEM_BLOCK_VERSION>
      <MEM_BLOCK_HEADER_CRC>0xE0D2</MEM_BLOCK_HEADER_CRC>
      <MEM_BLOCK_CRC>0xECFBBA1A</MEM_BLOCK_CRC>
      <MEM_BLOCK_CRC2>None</MEM_BLOCK_CRC2>
      <MEM_BLOCK_DATA>
        <MEM_PAGE_DATA>0000</MEM_PAGE_DATA>
      </MEM_BLOCK_DATA>
    </MEM_BLOCK>
    <MEM_BLOCK>
      <MEM_BLOCK_ID>0x1F8</MEM_BLOCK_ID>
      <MEM_BLOCK_NAME>UNKNOWN</MEM_BLOCK_NAME>
      <MEM_BLOCK_STATUS>VALID</MEM_BLOCK_STATUS>
      <MEM_BLOCK_FLAGS>0x0</MEM_BLOCK_FLAGS>
      <MEM_BLOCK_STORAGE>Emulation</MEM_BLOCK_STORAGE>
      <MEM_BLOCK_LEN>0x2</MEM_BLOCK_LEN>
      <MEM_BLOCK_VERSION>0x0</MEM_BLOCK_VERSION>
      <MEM_BLOCK_HEADER_CRC>0x1DCC</MEM_BLOCK_HEADER_CRC>
      <MEM_BLOCK_CRC>0xA54E2B5A</MEM_BLOCK_CRC>
      <MEM_BLOCK_CRC2>None</MEM_BLOCK_CRC2>
      <MEM_BLOCK_DATA>
        <MEM_PAGE_DATA>30d4</MEM_PAGE_DATA>
      </MEM_BLOCK_DATA>
    </MEM_BLOCK>
  </MEM_SECTOR>
</MEM_DATA>

Assuming the XML data is stored in a file named test.xml, you can do something like this:
import xml.etree.ElementTree as ET

tree = ET.parse('test.xml')
root = tree.getroot()

def search_and_copy(query):
    for child in root.findall("MEM_SECTOR/MEM_BLOCK"):
        if child.find("MEM_BLOCK_CRC").text == query:
            return [item.text for item in child.findall("MEM_BLOCK_DATA/*")]
Let's try the search_and_copy() function out:
>>> search_and_copy("0xCCB7B836")
['4e564d2d52656648', '6173685374617274', '1782af065966579e', '899885d440d3ad67', 'd04b41b15e2b13c2']
>>> search_and_copy("0xA54E2B5A")
['30d4']
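A self-contained variant of the same idea, offered as a sketch rather than a definitive version: it parses a shortened copy of the XML from a string (the SAMPLE constant is an abbreviation made up for this example), takes the root as a parameter, uses findtext so that a block without a MEM_BLOCK_CRC child does not raise, and returns an empty list when nothing matches.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the question's data (assumption: most fields omitted).
SAMPLE = """
<MEM_DATA>
  <MEM_SECTOR>
    <MEM_BLOCK>
      <MEM_BLOCK_CRC>0xECFBBA1A</MEM_BLOCK_CRC>
      <MEM_BLOCK_DATA>
        <MEM_PAGE_DATA>0000</MEM_PAGE_DATA>
      </MEM_BLOCK_DATA>
    </MEM_BLOCK>
    <MEM_BLOCK>
      <MEM_BLOCK_CRC>0xA54E2B5A</MEM_BLOCK_CRC>
      <MEM_BLOCK_DATA>
        <MEM_PAGE_DATA>30d4</MEM_PAGE_DATA>
      </MEM_BLOCK_DATA>
    </MEM_BLOCK>
  </MEM_SECTOR>
</MEM_DATA>
"""


def search_and_copy(root, query):
    """Return the page data of the first block whose CRC equals query."""
    for block in root.findall("MEM_SECTOR/MEM_BLOCK"):
        # findtext returns None instead of raising when the child is missing
        if block.findtext("MEM_BLOCK_CRC") == query:
            return [page.text for page in block.findall("MEM_BLOCK_DATA/MEM_PAGE_DATA")]
    return []


root = ET.fromstring(SAMPLE)
print(search_and_copy(root, "0xA54E2B5A"))
print(search_and_copy(root, "0xDEADBEEF"))
```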

We can use XPath, with Python's xml.etree and the elementpath package, to write a function that retrieves the data.
Breakdown of the XPath expression passed to elementpath.Selector in the code below:
1. The first line looks for elements whose text equals our search string.
2. The second line (..) goes back one step to the parent element.
3. The third line searches within that parent for MEM_PAGE_DATA elements, which hold the data we are actually interested in.
4. The rest of the code simply pulls the text from the matches.
import xml.etree.ElementTree as ET
import elementpath

# the shared data was saved to a file named test.xml
root = ET.parse('test.xml').getroot()

def find_data(search_string):
    selector = elementpath.Selector(f""".//*[text()='{search_string}']
                                        //..
                                        //MEM_PAGE_DATA""")
    # pull the text from each match
    result = [entry.text for entry in selector.select(root)]
    return result
Test on the strings provided:
find_data("0xCCB7B836")
['4e564d2d52656648',
 '6173685374617274',
 '1782af065966579e',
 '899885d440d3ad67',
 'd04b41b15e2b13c2']
find_data("0xECFBBA1A")
['0000']
find_data("0xA54E2B5A")
['30d4']
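If installing elementpath is not an option, a similar lookup can be sketched with the limited XPath subset that the standard library's ElementTree supports, using a [child='text'] predicate. This is an alternative technique, not the answer's method; the SAMPLE data is an abbreviated stand-in for the question's file, and the function takes the root as a parameter.

```python
import xml.etree.ElementTree as ET

# Abbreviated stand-in for test.xml (assumption: most fields omitted).
SAMPLE = """
<MEM_DATA>
  <MEM_SECTOR>
    <MEM_BLOCK>
      <MEM_BLOCK_CRC>0xECFBBA1A</MEM_BLOCK_CRC>
      <MEM_BLOCK_DATA><MEM_PAGE_DATA>0000</MEM_PAGE_DATA></MEM_BLOCK_DATA>
    </MEM_BLOCK>
    <MEM_BLOCK>
      <MEM_BLOCK_CRC>0xA54E2B5A</MEM_BLOCK_CRC>
      <MEM_BLOCK_DATA><MEM_PAGE_DATA>30d4</MEM_PAGE_DATA></MEM_BLOCK_DATA>
    </MEM_BLOCK>
  </MEM_SECTOR>
</MEM_DATA>
"""


def find_data(root, crc):
    # The predicate keeps only MEM_BLOCK elements that have a
    # MEM_BLOCK_CRC child whose text is exactly crc.
    path = f".//MEM_BLOCK[MEM_BLOCK_CRC='{crc}']/MEM_BLOCK_DATA/MEM_PAGE_DATA"
    return [page.text for page in root.findall(path)]


root = ET.fromstring(SAMPLE)
print(find_data(root, "0xECFBBA1A"))
print(find_data(root, "0xA54E2B5A"))
```

The predicate compares the complete text of the child, so it only matches when the CRC value has no surrounding whitespace in the file.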

Selenium (Python) - Checking parameter next to the LABEL and insert the value OR select an option

I would like to write generic code (using Selenium) that looks for a label, then finds the input (or select) tag next to that label and inserts the value.
Main function:
for l in label:
    try:
        xpathInput = "//label[contains(.,'{}')]/following::input".format(l)
        checkXpathInput, pathInput = check_xpath(browser, xpathInput)
        if checkXpathInput is True:
            pathInput.clear()
            pathInput.send_keys("\b{}".format(value))
            break
        for op in option:
            xpathSelect = "//label[contains(.,'{}')]/following::select/option[text()='{}']".format(l, op)
            checkXpathSelect, pathSelect = check_xpath(browser, xpathSelect)
            if checkXpathSelect is True:
                pathSelect.click()
                break
    except:
        print("Can't match: {}".format(l))
Path checker:
def check_xpath(browser, xpath):
    try:
        path = browser.find_element_by_xpath(xpath)
    except NoSuchElementException:
        return False
    return True, path
What is the current issue?
If the label is, for example, "Title", the code should first check that there is no input tag next to the "Title" label, then check whether there is a select tag next to it, and so on.
Currently it finds the label "Title" and fills the value into the first following input, which is incorrect because "Title" uses a select tag.

I'd exploit the fact that find_elements_by_xpath returns a list of found elements and empty lists are falsy. So you wouldn't need a try/except and a function which returns bool or tuple values (which is not the most optimal behavior).
It would be easier to give a good answer with some html source example but I assume what you'd like to do is this:
def handle_label_inputs(label, value):
    # if there is such a label, this result won't be empty
    found_labels = driver.find_elements_by_xpath('//label[contains(.,"{}")]'.format(label))
    # if the list is not empty
    if found_labels:
        l = found_labels[0]
        # any options with the given value as text
        following_select_option_values = l.find_elements_by_xpath('./following::select//option[text()="{}"]'.format(value))
        # any inputs next to the label
        following_inputs = l.find_elements_by_xpath('./following::input')
        # did we find an option?
        if following_select_option_values:
            following_select_option_values[0].click()
        # or is there an input?
        elif following_inputs:
            in_field = following_inputs[0]
            in_field.clear()
            in_field.send_keys(value)
        else:
            print("Can't match: {} - {}".format(label, value))

driver.get('http://thenewcode.com/166/HTML-Forms-Drop-down-Menus')
handle_label_inputs('State / Province / Territory', 'California')
I don't know how tidy the page you are working with is, but if it is well made, your label should have a for="something" attribute. If that is the case, you can simply find the element the label refers to and check whether its tag is input (or select):
related_element_if_done_properly = driver.find_elements_by_xpath('//*[@id="{}"]'.format(label_element.get_attribute("for")))
if related_element_if_done_properly:
    your_element = related_element_if_done_properly[0]
    is_input = your_element.tag_name.lower() == "input"
else:
    print('Ohnoes')

setSelected in QTreeWidget

I have a project where I need to change the selection of a tree widget in code. This needs to be done after I clear out the tree and populate it again.
I'm trying to mark the appropriate item as "selected" while I'm adding them. This works for root level nodes. But for child nodes, it doesn't. I need to store the QTreeWidgetItem in another variable and mark it as selected after the tree has been completely populated. Why does this happen?
This does not work:
def refreshTree(self):
    treeObj.clear()
    for item in items:
        temp = QTreeWidgetItem(0)
        for key, val in item.subitems().items():
            childTemp = QTreeWidgetItem(0)
            # ...setup text, font, etc...
            if condition1:
                childTemp.setSelected(True)
            temp.addChild(childTemp)
        if not condition1 and condition2:
            temp.setSelected(True)
        treeObj.addTopLevelItem(temp)
This does:
def refreshTree(self):
    treeObj.clear()
    for item in items:
        temp = QTreeWidgetItem(0)
        for key, val in item.subitems().items():
            childTemp = QTreeWidgetItem(0)
            # ...setup text, font, etc...
            if condition1:
                selTemp = childTemp
            temp.addChild(childTemp)
        if not condition1 and condition2:
            temp.setSelected(True)
        elif selTemp:
            selTemp.setSelected(True)
        treeObj.addTopLevelItem(temp)

It is not stated in the documentation, but setSelected does nothing if the item has not been added to a view yet:
inline void QTreeWidgetItem::setSelected(bool aselect)
{ if (view) view->setItemSelected(this, aselect); }
So you should either:
- pass treeObj or temp to the constructor of your QTreeWidgetItem to make it part of the view from the start, or
- call addChild/addTopLevelItem before calling setSelected (or other functions such as setExpanded).
I don't know why your second version worked at all.

Search and remove element with elementTree in Python

I have an XML document in which I want to search for some elements and, if they match some criteria, delete them.
However, I cannot seem to access the parent of the element so that I can delete it.
import json
from xml.etree import ElementTree

file = open('test.xml', "r")
elem = ElementTree.parse(file)
namespace = "{http://somens}"
props = elem.findall('.//{0}prop'.format(namespace))
for prop in props:
    type = prop.attrib.get('type', None)
    if type == 'json':
        value = json.loads(prop.attrib['value'])
        if value['name'] == 'Page1.Button1':
            # here I need to access the parent of prop
            # in order to delete the prop
            pass
Is there a way I can do this?
Thanks

You can remove a child element by calling the remove() method of its parent. Unfortunately, Element does not keep a reference to its parent, so it is up to you to keep track of parent/child relations (which speaks against your use of elem.findall()).
A possible solution could look like this:
root = elem.getroot()
for child in root:
    # elements expose their (namespace-qualified) name as .tag
    if child.tag != namespace + "prop":
        continue
    if True:  # TODO: do your check here!
        root.remove(child)
PS: don't use prop.attrib.get(), use prop.get().
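One common way to "keep track of parent/child relations" with the standard library is to build a child-to-parent dictionary once and look parents up in it. The sketch below assumes a small made-up document in the question's http://somens namespace (the XML constant and its attribute values are illustrative, not the asker's real data).

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample document; only the namespace comes from the question.
XML = """
<root xmlns="http://somens">
  <group>
    <prop type="json" value='{"name": "Page1.Button1"}'/>
    <prop type="json" value='{"name": "Page1.Button2"}'/>
    <prop type="text" value="plain"/>
  </group>
</root>
"""

NS = "{http://somens}"
root = ET.fromstring(XML)

# Map every child element to its parent (ElementTree keeps no back-reference).
parent_of = {child: parent for parent in root.iter() for child in parent}

# findall returns a list, so removing elements while looping over it is safe.
for prop in root.findall(f".//{NS}prop"):
    if prop.get("type") != "json":
        continue
    if json.loads(prop.get("value"))["name"] == "Page1.Button1":
        parent_of[prop].remove(prop)

remaining = [json.loads(p.get("value"))["name"]
             for p in root.findall(f".//{NS}prop[@type='json']")]
print(remaining)
```

The map is a snapshot: if elements are added or moved after it is built, it has to be rebuilt before being used again.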

You could use XPath to select an Element's parent.
file = open('test.xml', "r")
elem = ElementTree.parse(file)
namespace = "{http://somens}"
props = elem.findall('.//{0}prop'.format(namespace))
for prop in props:
    type = prop.get('type', None)
    if type == 'json':
        value = json.loads(prop.attrib['value'])
        if value['name'] == 'Page1.Button1':
            # Get parent and remove this prop
            parent = prop.find("..")
            parent.remove(prop)
http://docs.python.org/2/library/xml.etree.elementtree.html#supported-xpath-syntax
Except that if you try it, it doesn't work: http://elmpowered.skawaii.net/?p=74
So instead you have to do this:
file = open('test.xml', "r")
elem = ElementTree.parse(file)
namespace = "{http://somens}"
search = './/{0}prop'.format(namespace)
# Use xpath to get all parents of props
prop_parents = elem.findall(search + '/..')
for parent in prop_parents:
    # Still have to find and iterate through child props
    for prop in parent.findall(search):
        type = prop.get('type', None)
        if type == 'json':
            value = json.loads(prop.attrib['value'])
            if value['name'] == 'Page1.Button1':
                parent.remove(prop)
It takes two searches and a nested loop. The inner search is only on Elements known to contain props as first children, but that may not mean much depending on your schema.

I know this is an old thread but this kept popping up while I was trying to figure out a similar task. I did not like the accepted answer for two reasons:
1) It doesn't handle multiple nested levels of tags.
2) It will break if multiple xml tags are deleted in the same level one-after-another. Since each element is an index of Element._children you shouldn't delete while forward iterating.
I think a better, more versatile solution is this:
import xml.etree.ElementTree as et

file = 'test.xml'
tree = et.parse(file)
root = tree.getroot()

def iterator(parents, nested=False):
    # iterate in reverse so removals do not shift the remaining indices
    for child in reversed(parents):
        if nested:
            if len(child) >= 1:
                iterator(child, nested=True)
        if True:  # Add your entire condition here
            parents.remove(child)

iterator(root, nested=True)
For the OP, this should work - but I don't have the data you're working with to test if it's perfect.
import json
import xml.etree.ElementTree as et

file = 'test.xml'
tree = et.parse(file)
root = tree.getroot()
namespace = "{http://somens}"

def iterator(parents, nested=False):
    for child in reversed(parents):
        if nested:
            if len(child) >= 1:
                iterator(child, nested=True)
        if child.tag == namespace + 'prop' and child.get('type') == 'json':
            value = json.loads(child.get('value'))
            if value['name'] == 'Page1.Button1':
                parents.remove(child)

iterator(root, nested=True)

A solution using the lxml module:
from lxml import etree

# xml_str holds the XML document as a string
root = etree.fromstring(xml_str)
for e in root.findall('.//{http://some.name.space}node'):
    parent = e.getparent()
    # the root element has no parent and cannot be removed this way
    if parent is not None:
        parent.remove(e)

Using the fact that every child must have a parent, I'm going to simplify @kitsu.eb's example. If you use findall to get both the children and the parents, their indices will be equivalent. Note that this only lines up when each parent holds exactly one matching child, because the '/..' search returns each parent once.
file = open('test.xml', "r")
elem = ElementTree.parse(file)
namespace = "{http://somens}"
search = './/{0}prop'.format(namespace)
# Use xpath to get all parents of props
prop_parents = elem.findall(search + '/..')
props = elem.findall('.//{0}prop'.format(namespace))
for prop in props:
    type = prop.attrib.get('type', None)
    if type == 'json':
        value = json.loads(prop.attrib['value'])
        if value['name'] == 'Page1.Button1':
            # use the index of the current child to find
            # its parent and remove the child
            prop_parents[props.index(prop)].remove(prop)

I also used XPath for this issue, but in a different way:
root = elem.getroot()
elementName = "YourElement"
# this will find all the parents of the elements with elementName
for elementParent in root.findall(".//{}/..".format(elementName)):
    # this will find all the elements under the parent, and remove them
    for element in elementParent.findall("{}".format(elementName)):
        elementParent.remove(element)

I like to use an XPath expression for this kind of filtering. Unless I know otherwise, such an expression must be applied at the root level, which means I can't just get a parent and apply the same expression on that parent. However, it seems to me that there is a nice and flexible solution that should work with any supported XPath, as long as none of the sought nodes is the root. It goes something like this:
root = elem.getroot()
# Find all nodes matching the filter string (flt)
nodes = root.findall(flt)
while len(nodes):
    # As long as there are nodes, there should be parents
    # Get the first of all parents to the found nodes
    parent = root.findall(flt+'/..')[0]
    # Use this parent to remove the first node
    parent.remove(nodes[0])
    # Find all remaining nodes
    nodes = root.findall(flt)

I would have liked to just comment on the accepted answer, but my lack of reputation doesn't allow it. It is important to use .findall("*") in the loop to avoid issues, as stated in the documentation:
Note that concurrent modification while iterating can lead to problems, just like when iterating and modifying Python lists or dicts. Therefore, the example first collects all matching elements with root.findall(), and only then iterates over the list of matches.
Therefore, in the accepted answer the loop should be for child in root.findall("*"): instead of for child in root:. Not doing so made my code skip some elements of the list.
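The difference can be seen on a tiny document. In this sketch (the helper names remove_live and remove_snapshot and the four-element sample are made up for illustration), both functions try to remove every child of the root; only the one that iterates over the list returned by findall("*") is expected to empty it.

```python
import xml.etree.ElementTree as ET


def remove_live(xml_text):
    """Remove children while iterating over the element itself."""
    root = ET.fromstring(xml_text)
    for child in root:  # iterates the live child sequence
        root.remove(child)
    return [c.tag for c in root]


def remove_snapshot(xml_text):
    """Remove children while iterating over a separate list of them."""
    root = ET.fromstring(xml_text)
    for child in root.findall("*"):  # findall returns an independent list
        root.remove(child)
    return [c.tag for c in root]


doc = "<r><a/><b/><c/><d/></r>"
print("live iteration leaves:", remove_live(doc))
print("snapshot iteration leaves:", remove_snapshot(doc))
```

Iterating over list(root) gives the same protection as findall("*") when all direct children are wanted.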

wx.TreeCtrl item

I'm trying to use a TreeCtrl to represent a folder structure. For each folder I need to know its absolute path and name. I'm currently doing something like this:
self.root = self.tree.AddRoot(project.name)
self.tree.SetPyData(self.root, None)
self.root.path = root
---- other code -----
childItem = self.tree.AppendItem(self.root, child.name)
childItem.path = self.root.path + "/" + child.name
But now, in an event handler, I need to get the path string. My approach so far, which fails, is:
self.Bind(wx.EVT_TREE_ITEM_EXPANDED, self.OnItemExpanded, self.tree)
----- other code -------
def OnItemExpanded(self, evt):
    selected = evt.GetItem()
    print selected.path
This fails with: AttributeError: 'TreeItemId' object has no attribute 'path'. From what I understand, the event only gives me an id of an item in the tree and not the actual item returned by "childItem = self.tree.AppendItem(self.root, child.name)". If that is the case, how can I get to that item?
regards,
Bogdan

What is the .path property? Is it something you are creating, or an actual member of the TreeItemId object (the object returned by the AppendItem method)? I do not see any documentation for it.
If you want to store arbitrary data in the child items, use the SetPyData/GetPyData methods:
childItem = self.tree.AppendItem(self.root, child.name)
self.tree.SetPyData(childItem, ["hi", "i", "am", "a", "python", "object"])
Then in your handler:
def OnItemExpanded(self, event):
    item = event.GetItem()
    if item:
        pyObj = self.tree.GetPyData(item)
