How to insert values into an XML file using Python

I am a programming novice and have just started learning Python.
Below is my XML file:
<Build_details>
<Release number="1902">
<Build number="260">
<OMS>
<Build_path>ST_OMS_V1810_B340</Build_path>
<Pc_version>8041.30.01</Pc_version>
</OMS>
<OMNI>
<Build_path>ST_OMNI_V1810_B340</Build_path>
</OMNI>
</Build>
</Release>
<Release number="1810">
<Build number="230">
<OMS>
<Build_path>ST_OMS_909908</Build_path>
<Pc_version>8031.25.65</Pc_version>
</OMS>
<OMNI>
<Build_path>ST_OMNI_798798789789</Build_path>
</OMNI>
</Build>
</Release>
<Release number="1806">
<Build number="300">
<OMS>
<Build_path>ST_OMS_V18102_B300</Build_path>
<Pc_version>8041.30.01</Pc_version>
</OMS>
<OMNI>
<Build_path>ST_OMNI_V18102_B300</Build_path>
</OMNI>
</Build>
</Release>
</Build_details>
How can I ask the user for a release number and insert the chunk of data below under that release?
<Build number="230">
<OMS>
<Build_path>ST_OMS_909908</Build_path>
<Pc_version>8031.25.65</Pc_version>
</OMS>
<OMNI>
<Build_path>ST_OMNI_798798789789</Build_path>
</OMNI>
</Build>
I need to search for a particular release and then add details to it. Please help.
I am unable to traverse the XML to find a particular release.

I can't post a comment because I don't have enough reputation.
Have a look at this question: "Reading XML file and fetching its attributes value in Python".

Here is a solution using Python's built-in xml library.
You will have to find the Release element first, then create a new Build element and append it to the Release element.
import xml.etree.ElementTree as ET
if __name__ == "__main__":
    release_number = input("Enter the release number\n").strip()
    tree = ET.ElementTree(file="Build.xml")  # Original XML File
    root = tree.getroot()
    for elem in root.iterfind('.//Release'):
        # Find the release element
        if elem.attrib['number'] == release_number:
            # Create new Build Element
            build_elem = ET.Element("Build", {"number": "123"})
            # OMS element
            oms_elem = ET.Element("OMS")
            build_path_elem = ET.Element("Build_path")
            build_path_elem.text = "ST_OMS_909908"
            pc_version_elem = ET.Element("Pc_version")
            pc_version_elem.text = "8031.25.65"
            oms_elem.append(build_path_elem)
            oms_elem.append(pc_version_elem)
            omni_elem = ET.Element("OMNI")
            build_path_omni_elem = ET.Element("Build_path")
            build_path_omni_elem.text = "ST_OMNI_798798789789"
            omni_elem.append(build_path_omni_elem)
            build_elem.append(oms_elem)
            build_elem.append(omni_elem)
            elem.append(build_elem)
    # Write to file
    tree.write("Build_new.xml")  # After adding the new element
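As a more compact variant of the same idea, here is a minimal sketch that looks the release up directly with an attribute predicate and builds the children with ET.SubElement. The helper name add_build, its parameters, and the shortened inline sample are assumptions made for illustration; they are not part of the answer above.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for Build.xml (assumption: same element names as the question).
SAMPLE = """<Build_details>
  <Release number="1902">
    <Build number="260"/>
  </Release>
  <Release number="1810"/>
</Build_details>"""


def add_build(root, release_number, build_number, oms_path, pc_version, omni_path):
    """Append a new <Build> to the <Release> whose number attribute matches.

    Returns the new element, or None when no such release exists.
    """
    # ElementTree supports the [@attrib='value'] predicate in find().
    release = root.find(".//Release[@number='%s']" % release_number)
    if release is None:
        return None
    build = ET.SubElement(release, "Build", {"number": build_number})
    oms = ET.SubElement(build, "OMS")
    ET.SubElement(oms, "Build_path").text = oms_path
    ET.SubElement(oms, "Pc_version").text = pc_version
    omni = ET.SubElement(build, "OMNI")
    ET.SubElement(omni, "Build_path").text = omni_path
    return build


root = ET.fromstring(SAMPLE)
new_build = add_build(root, "1810", "230", "ST_OMS_909908", "8031.25.65",
                      "ST_OMNI_798798789789")
missing = add_build(root, "9999", "1", "x", "y", "z")  # no such release
```

Returning None for an unknown release lets the caller tell the user that nothing was inserted, instead of silently writing an unchanged file. The predicate is built by string formatting, so a release number containing a quote character would need extra handling.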

Related

docxtpl - error when opening document in Word after adding more than five images

I'm trying to automate some reports in Word, and I'm getting the following error when I open the created document in Word:
"Word found unreadable content in test. Do you want to recover the contents of this document? If you trust the source of this document, click Yes."
After clicking Yes, it says the file cannot be opened. When I open it in LibreOffice there's no issue (I'm running the script on Ubuntu/Python 3.8.5).
Here's a simplified version of my code:
from docxtpl import DocxTemplate, InlineImage
from docx.shared import Mm
doc = DocxTemplate("template_test.docx")
""" load up images """
mps_chart = InlineImage(doc, image_descriptor='test/mps_line_chart.png')
server_pie = InlineImage(doc, image_descriptor='test/server_availability_pie.png', width=Mm(76), height=Mm(58))
agent_pie = InlineImage(doc, image_descriptor='test/agent_availability_pie.png', width=Mm(76), height=Mm(58))
cases_chart = InlineImage(doc, image_descriptor='test/cases_bar_chart.png')
alarms_chart = InlineImage(doc, image_descriptor='test/alarms_line_chart.png')
intro_alarms_graphic = InlineImage(doc, image_descriptor='test/alarms_intro_graphic.png', width=Mm(38), height=Mm(38))
intro_cases_graphic = InlineImage(doc, image_descriptor='test/open cases_intro_graphic.png', width=Mm(38), height=Mm(38))
intro_mps_graphic = InlineImage(doc, image_descriptor='test/mps_intro_graphic.png', width=Mm(38), height=Mm(38))
intro_doc_graphic = InlineImage(doc, image_descriptor='test/doc_intro_graphic.png', width=Mm(38), height=Mm(38))
months = {"MONTH_1": "June", "MONTH_2": "May", "MONTH_3": "April"}
intro_images = {"intro_alarms": intro_alarms_graphic, "intro_cases": intro_cases_graphic, "intro_mps": intro_mps_graphic, "intro_doc": intro_doc_graphic}
images = {"mps_line_chart": mps_chart, "agent_pie_chart": agent_pie, "server_pie_chart": server_pie , "alarms_line_chart": alarms_chart, "cases_bar_chart": cases_chart}
context = {**images, **months, **intro_images}
doc.render(context)
doc.save("test.docx")
The following will work fine, I only get the error when more than 5 images are added:
intro_images = {"intro_alarms": intro_alarms_graphic}
images = {"mps_line_chart": mps_chart, "agent_pie_chart": agent_pie, "server_pie_chart": server_pie , "alarms_line_chart": alarms_chart}
I also still have the same issue when I include all the images in a single dict, or if I do this:
context = {"mps_line_chart": mps_chart, "agent_pie_chart": agent_pie, "server_pie_chart": server_pie , "alarms_line_chart": alarms_chart, "cases_bar_chart": cases_chart, "intro_alarms": intro_alarms_graphic, "intro_cases": intro_cases_graphic, "intro_mps": intro_mps_graphic, "intro_doc": intro_doc_graphic}
The issue turned out to be with the Word template after it had been opened and saved again in LibreOffice. I opened the template in Word again and saved it there, which seems to have resolved the issue.

How to write multiple XML parameters correctly in python-docx

I am trying to change the width of an existing Word table by editing its XML. I need to set the XML attributes so that the result is: <w:tblW w:w="5000" w:type="pct"/> But it does not work; see the output below. Why does this happen, and how do I do it correctly?
import docx
from docx.oxml.table import CT_Row, CT_Tc
from docx.oxml import OxmlElement
from docx.oxml.ns import qn
from docx import Document
doc = docx.Document('example.docx')
# all tables via XML
for table in doc.tables:
    table.style = 'Normal Table'
    tbl = table._tbl  # get xml element in table
    tblPr = tbl.tblPr  # We get an xml element containing the style and width
    print('============================ before ==============================')
    print(table._tbl.xml)  # Output the entire xml of the table
    # Setting the table width to 100%. To do this, look at the xml example:
    # <w:tblW w:w="5000" w:type="pct"/> - this is size 5000 = 100%, and type pct = %
    #
    tblW = OxmlElement('w:tblW')
    w = OxmlElement('w:w')
    w.set(qn('w:w'), '5000')
    type = OxmlElement('w:type')
    type.set(qn('w:type'), 'pct')
    tblW.append(w)
    tblW.append(type)
    tblPr.append(tblW)  # Adding the recorded results to the elements
    print('============================ after ==============================')
    print(table._tbl.xml)  # Output the entire xml of the table
doc.save('restyled.docx')
We get the following results:
============================ before ==============================
...
<w:tblPr>
<w:tblW w:w="8880" w:type="dxa"/>
<w:tblCellMar>
<w:top w:w="15" w:type="dxa"/>
<w:left w:w="15" w:type="dxa"/>
<w:bottom w:w="15" w:type="dxa"/>
<w:right w:w="15" w:type="dxa"/>
</w:tblCellMar>
<w:tblLook w:val="04A0" w:firstRow="1" w:lastRow="0" w:firstColumn="1" w:lastColumn="0" w:noHBand="0" w:noVBand="1"/>
</w:tblPr>
...
============================ after ==============================
...
<w:tblPr>
...
<w:tblW>
<w:w w:w="5000"/>
<w:type w:type="pct"/>
</w:tblW>
</w:tblPr>
...
The expected result is:
...
<w:tblPr>
...
<w:tblW w:w="5000" w:type="pct"/>
</w:tblPr>
...
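Set both values as attributes on the tblW element itself:
tblW.set(qn('w:w'), '5000')
tblW.set(qn('w:type'), 'pct')
Instead of creating the w:w and w:type elements and appending them as children:
tblW.append(w)
tblW.append(type)
w:w and w:type are attributes, which are added using the set() method. The append() method is used to add a child element. Note also that tblPr in the "before" output already contains a w:tblW element, so tblPr.append(tblW) adds a second one; updating the existing element (for example the one returned by tblPr.find(qn('w:tblW'))) avoids the duplicate.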
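The difference between set() and append() can be seen without python-docx. The sketch below uses the standard library's xml.etree.ElementTree rather than python-docx's OxmlElement, so it illustrates the principle only; the helper qname and the variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)  # serialize with the usual "w:" prefix


def qname(local):
    """Build a Clark-notation name such as {namespace}tblW (stand-in for docx's qn)."""
    return "{%s}%s" % (W, local)


# What the question's code does: w and type become child elements of tblW.
wrong = ET.Element(qname("tblW"))
child_w = ET.SubElement(wrong, qname("w"))
child_w.set(qname("w"), "5000")
child_type = ET.SubElement(wrong, qname("type"))
child_type.set(qname("type"), "pct")

# What is wanted: both values are attributes of the tblW element itself.
right = ET.Element(qname("tblW"))
right.set(qname("w"), "5000")
right.set(qname("type"), "pct")

wrong_xml = ET.tostring(wrong, encoding="unicode")
right_xml = ET.tostring(right, encoding="unicode")
print(wrong_xml)
print(right_xml)
```

The first element serializes with two nested children, matching the unwanted "after" output; the second is a single empty element carrying both attributes.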

How to update table of contents in docx-file with python on linux?

I've got a problem with updating the table of contents in a docx file generated by python-docx on Linux. Generally, it is not difficult to create a TOC (thanks to this answer https://stackoverflow.com/a/48622274/9472173 and this thread https://github.com/python-openxml/python-docx/issues/36):
from docx.oxml.ns import qn
from docx.oxml import OxmlElement
paragraph = self.document.add_paragraph()
run = paragraph.add_run()
fldChar = OxmlElement('w:fldChar') # creates a new element
fldChar.set(qn('w:fldCharType'), 'begin') # sets attribute on element
instrText = OxmlElement('w:instrText')
instrText.set(qn('xml:space'), 'preserve') # sets attribute on element
instrText.text = r'TOC \o "1-3" \h \z \u' # raw string so \u is not read as an escape; change 1-3 depending on heading levels you need
fldChar2 = OxmlElement('w:fldChar')
fldChar2.set(qn('w:fldCharType'), 'separate')
fldChar3 = OxmlElement('w:t')
fldChar3.text = "Right-click to update field."
fldChar2.append(fldChar3)
fldChar4 = OxmlElement('w:fldChar')
fldChar4.set(qn('w:fldCharType'), 'end')
r_element = run._r
r_element.append(fldChar)
r_element.append(instrText)
r_element.append(fldChar2)
r_element.append(fldChar4)
p_element = paragraph._p
But to make the TOC visible, the fields have to be updated afterwards. The solution above requires updating it manually (right-click on the TOC hint and choose 'update fields'). For automatic updating, I've found the following solution, which drives the Word application (thanks to this answer https://stackoverflow.com/a/34818909/9472173):
import win32com.client
import inspect, os
def update_toc(docx_file):
    word = win32com.client.DispatchEx("Word.Application")
    doc = word.Documents.Open(docx_file)
    doc.TablesOfContents(1).Update()
    doc.Close(SaveChanges=True)
    word.Quit()
def main():
    script_dir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
    file_name = 'doc_with_toc.docx'
    file_path = os.path.join(script_dir, file_name)
    update_toc(file_path)
if __name__ == "__main__":
    main()
It works pretty well on Windows, but obviously not on Linux. Does anyone have any ideas about how to provide the same functionality on Linux? The only suggestion I have is to use local URLs (anchors) to every heading, but I am not sure whether that is possible with python-docx, and I'm not very strong with these OpenXML features. I would very much appreciate any help.
I found a solution in this GitHub issue. It works on Ubuntu.
import lxml.etree
from docx import Document
def set_updatefields_true(docx_path):
    namespace = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"
    doc = Document(docx_path)
    # add child to doc.settings element
    element_updatefields = lxml.etree.SubElement(
        doc.settings.element, f"{namespace}updateFields"
    )
    element_updatefields.set(f"{namespace}val", "true")
    doc.save(docx_path)
An alternative that uses python-docx's own helpers:
from docx.oxml import OxmlElement
from docx.oxml.ns import qn
def update_table_of_contents(doc):
    # Find the settings element in the document
    settings_element = doc.settings.element
    # Create an "updateFields" element and set its "val" attribute to "true"
    update_fields_element = OxmlElement('w:updateFields')
    update_fields_element.set(qn('w:val'), 'true')
    # Add the "updateFields" element to the settings element
    settings_element.append(update_fields_element)

how to get file names and paths based on a given attribute in parent tag

I want to change the code below so that it gets file_names and file_paths only when the fastboot="true" attribute is present in the parent tag. I have provided the current output and the expected output. Can anyone provide guidance on how to do it?
import sys
import os
import string
from xml.dom import minidom
if __name__ == '__main__':
    meta_contents = minidom.parse("fast.xml")
    builds_flat = meta_contents.getElementsByTagName("builds_flat")[0]
    build_nodes = builds_flat.getElementsByTagName("build")
    for build in build_nodes:
        bid_name = build.getElementsByTagName("name")[0]
        print("Checking if this is cnss related image... : \n" + bid_name.firstChild.data)
        if bid_name.firstChild.data == 'apps':
            file_names = build.getElementsByTagName("file_name")
            file_paths = build.getElementsByTagName("file_path")
            print("now files paths...\n")
            for fn, fp in zip(file_names, file_paths):
                if not fp.firstChild.nodeValue.endswith('/'):
                    fp.firstChild.nodeValue = fp.firstChild.nodeValue + '/'
                full_path = fp.firstChild.nodeValue + fn.firstChild.nodeValue
                print("file-to-copy: " + full_path)
            break
INPUT XML:-
<builds_flat>
<build>
<name>apps</name>
<file_ref ignore="true" minimized="true">
<file_name>adb.exe</file_name>
<file_path>LINUX/android/vendor/qcom/proprietary/usb/host/windows/prebuilt/</file_path>
</file_ref>
<file_ref ignore="true" minimized="true">
<file_name>system.img</file_name>
<file_path>LINUX/android/out/target/product/msmcobalt/secondary-boot/</file_path>
</file_ref>
<download_file cmm_file_var="APPS_BINARY" fastboot_rumi="boot" fastboot="true" minimized="true">
<file_name>boot.img</file_name>
<file_path>LINUX/android/out/target/product/msmcobalt/</file_path>
</download_file>
<download_file sparse_image_path="true" fastboot_rumi="abl" fastboot="true" minimized="true">
<file_name>abl.elf</file_name>
<file_path>LINUX/android/out/target/product/msmcobalt/</file_path>
</download_file>
</build>
</builds_flat>
OUTPUT:-
...............
now files paths...
file-to-copy: LINUX/android/vendor/qcom/proprietary/usb/host/windows/prebuilt/adb.exe
file-to-copy: LINUX/android/out/target/product/msmcobalt/secondary-boot/system.img
file-to-copy: LINUX/android/out/target/product/msmcobalt/boot.img
file-to-copy: LINUX/android/out/target/product/msmcobalt/abl.elf
EXPECTED OUT:-
now files paths...
........
file-to-copy: LINUX/android/out/target/product/msmcobalt/boot.img
file-to-copy: LINUX/android/out/target/product/msmcobalt/abl.elf
Something rather quick and dirty that comes to mind is using the fact that only the download_file elements have the fastboot attribute, right? If that's the case, you could get all the elements of type download_file and filter out the ones whose fastboot attribute is not "true":
import os
from xml.dom import minidom
if __name__ == '__main__':
    meta_contents = minidom.parse("fast.xml")
    for elem in meta_contents.getElementsByTagName('download_file'):
        if elem.getAttribute('fastboot') == "true":
            path = elem.getElementsByTagName('file_path')[0].firstChild.nodeValue
            file_name = elem.getElementsByTagName('file_name')[0].firstChild.nodeValue
            print(os.path.join(path, file_name))
With the sample you provided that outputs:
$ python ./stack_034.py
LINUX/android/out/target/product/msmcobalt/boot.img
LINUX/android/out/target/product/msmcobalt/abl.elf
Needless to say, since there's no .xsd file (not that it would matter with minidom), you only get strings (no type safety), and this only works for the structure shown in the example, so you will probably want to add some extra checks.
EDIT:
As per the comment in this answer:
To get the elements within the <build> that contains a <name> element with the value apps, you can find that <name> tag (the one whose text is the string apps), then move to its parent node (which puts you in the build element), and then proceed as mentioned above:
import os
from xml.dom import minidom
if __name__ == '__main__':
    meta_contents = minidom.parse("fast.xml")
    for elem in meta_contents.getElementsByTagName('name'):
        if elem.firstChild.nodeValue == "apps":
            apps_build = elem.parentNode
            # Only look at download_file elements inside the matching build
            for download in apps_build.getElementsByTagName('download_file'):
                if download.getAttribute('fastboot') == "true":
                    path = download.getElementsByTagName('file_path')[0].firstChild.nodeValue
                    file_name = download.getElementsByTagName('file_name')[0].firstChild.nodeValue
                    print(os.path.join(path, file_name))
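For comparison, the same filter can be written with xml.etree.ElementTree, which supports attribute predicates in its path expressions. This is only a sketch, assuming the structure matches the sample input; the function name fastboot_files and the shortened inline sample are invented for the example.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the question's input (attributes trimmed for brevity).
SAMPLE = """<builds_flat>
  <build>
    <name>apps</name>
    <file_ref ignore="true" minimized="true">
      <file_name>adb.exe</file_name>
      <file_path>LINUX/android/vendor/qcom/proprietary/usb/host/windows/prebuilt/</file_path>
    </file_ref>
    <download_file fastboot="true" minimized="true">
      <file_name>boot.img</file_name>
      <file_path>LINUX/android/out/target/product/msmcobalt/</file_path>
    </download_file>
    <download_file fastboot="true" minimized="true">
      <file_name>abl.elf</file_name>
      <file_path>LINUX/android/out/target/product/msmcobalt/</file_path>
    </download_file>
  </build>
</builds_flat>"""


def fastboot_files(root, build_name="apps"):
    """Return full paths of files whose parent element has fastboot="true"."""
    paths = []
    for build in root.iter("build"):
        if build.findtext("name") != build_name:
            continue
        # Match any direct child carrying fastboot="true", whatever its tag.
        for parent in build.findall("*[@fastboot='true']"):
            directory = parent.findtext("file_path", default="")
            if not directory.endswith("/"):
                directory += "/"
            paths.append(directory + parent.findtext("file_name", default=""))
    return paths


files = fastboot_files(ET.fromstring(SAMPLE))
for path in files:
    print("file-to-copy: " + path)
```

Matching on the attribute rather than on the download_file tag name means the filter keeps working if another element type later gains fastboot="true".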

Parsing xml tree attributes (file has no elements)

I have been trying to use minidom but have no real preference. For some reason lxml will not install on my machine.
I would like to parse an xml file:
<?xml version="1.
<transfer frmt="1" vtl="0" serial_number="E5XX-0822" date="2016-10-03 16:34:53.000" style="startstop">
<plateInfo>
<plate barcode="E0122326" name="384plate" type="source"/>
<plate barcode="A1234516" name="1536plateD" type="destination"/>
</plateInfo>
<printmap total="1387">
<w reason="" cf="13" aa="1.779" eo="299.798" tof="32.357" sv="1565.311" ct="1.627" ft="1.649" fc="88.226" memt="0.877" fldu="Percent" fld="DMSO" dy="0" dx="0" region="-1" tz="18989.481" gy="72468.649" gx="55070.768" avt="50" vt="50" vl="3.68" cvl="3.63" t="16:30:47.703" dc="0" dr="0" dn="A1" c="0" r="0" n="A1"/>
<w reason="" cf="13" aa="1.779" eo="299.798" tof="32.357" sv="1565.311" ct="1.627" ft="1.649" fc="88.226" memt="0.877" fldu="Percent" fld="DMSO" dy="0" dx="0" region="-1" tz="18989.481" gy="72468.649" gx="55070.768" avt="50" vt="50" vl="3.68" cvl="3.63" t="16:30:47.703" dc="0" dr="0" dn="A1" c="1" r="0" n="A2"/>
</printmap>
</transfer>
The files do not have any element details, as you can see. All the information is contained in the attributes. In trying to adapt another SO post, I have this - but it seems to be geared more toward elements. I am also failing at a good way to "browse" the xml information, i.e. I would like to say "dir(xml_file)" and have a list of all the methods I can carry out on my tree structure, or see all the attributes. I know this was a lot and potentially different directions, but thank you in advance!
from xml.dom import minidom
def parse(files):
    for xml_file in files:
        xmldoc = minidom.parse(xml_file)
        transfer = xmldoc.getElementsByTagName('transfer')[0]
        plateInfo = transfer.getElementsByTagName('plateInfo')[0]
With minidom you can access the attributes of a particular element through its attributes property, which can then be treated like a dictionary. This example iterates over and prints the attributes of the element transfer[0]:
from xml.dom.minidom import parse, parseString
xml_file='''<?xml version="1.0" encoding="UTF-8"?>
<transfer frmt="1" vtl="0" serial_number="E5XX-0822" date="2016-10-03 16:34:53.000" style="startstop">
<plateInfo>
<plate barcode="E0122326" name="384plate" type="source"/>
<plate barcode="A1234516" name="1536plateD" type="destination"/>
</plateInfo>
<printmap total="1387">
<w reason="" cf="13" aa="1.779" eo="299.798" tof="32.357" sv="1565.311" ct="1.627" ft="1.649" fc="88.226" memt="0.877" fldu="Percent" fld="DMSO" dy="0" dx="0" region="-1" tz="18989.481" gy="72468.649" gx="55070.768" avt="50" vt="50" vl="3.68" cvl="3.63" t="16:30:47.703" dc="0" dr="0" dn="A1" c="0" r="0" n="A1"/>
<w reason="" cf="13" aa="1.779" eo="299.798" tof="32.357" sv="1565.311" ct="1.627" ft="1.649" fc="88.226" memt="0.877" fldu="Percent" fld="DMSO" dy="0" dx="0" region="-1" tz="18989.481" gy="72468.649" gx="55070.768" avt="50" vt="50" vl="3.68" cvl="3.63" t="16:30:47.703" dc="0" dr="0" dn="A1" c="1" r="0" n="A2"/>
</printmap>
</transfer>'''
xmldoc = parseString(xml_file)
transfer = xmldoc.getElementsByTagName('transfer')
attlist = transfer[0].attributes.keys()
for a in attlist:
    attr = transfer[0].attributes[a]
    print("%s %s" % (attr.name, attr.value))
you can find more information here:
http://www.diveintopython.net/xml_processing/attributes.html
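To "browse" a file like this one, where everything lives in attributes, it can help to walk the whole tree and list each tag with its attribute dictionary. Below is a minimal sketch using xml.etree.ElementTree from the standard library (so lxml is not needed); the shortened sample and the name dump_attributes are assumptions for the example.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the question's file (most attributes omitted).
SAMPLE = """<transfer frmt="1" serial_number="E5XX-0822">
  <plateInfo>
    <plate barcode="E0122326" name="384plate" type="source"/>
    <plate barcode="A1234516" name="1536plateD" type="destination"/>
  </plateInfo>
  <printmap total="1387">
    <w n="A1" r="0" c="0" vl="3.68"/>
    <w n="A2" r="0" c="1" vl="3.68"/>
  </printmap>
</transfer>"""


def dump_attributes(root):
    """Return a list of (tag, attribute dict) pairs for every element, in document order."""
    return [(elem.tag, dict(elem.attrib)) for elem in root.iter()]


root = ET.fromstring(SAMPLE)
rows = dump_attributes(root)
for tag, attrs in rows:
    print(tag, attrs)

# Attributes of specific elements are available through .get() or .attrib.
wells = [w.get("n") for w in root.iter("w")]
barcodes = {p.get("type"): p.get("barcode") for p in root.iter("plate")}
```

In an interactive session, dir(root) or help(ET.Element) lists the methods available on an element, which covers the "dir(xml_file)" style of exploration asked about in the question.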
