I am using Elementtree to parse an xml file, edit the contents and write to a new xml file. I have this all working apart form one issue. When I generate the file there are a lot of extra lines containing namespace information. Here are some snippets of code:
import xml.etree.ElementTree as ET
ET.register_namespace("", "http://clish.sourceforge.net/XMLSchema")
tree = ET.parse('ethernet.xml')
root = tree.getroot()
commands = root.findall('{http://clish.sourceforge.net/XMLSchema}'
'VIEW/{http://clish.sourceforge.net/XMLSchema}COMMAND')
for command in commands:
all1.append(list(command.iter()))
And a sample of the output file, with the erroneous line xmlns="http://clish.sourceforge.net/XMLSchema:
<COMMAND xmlns="http://clish.sourceforge.net/XMLSchema" help="Interface specific description" name="description">
<PARAM help="Description (must be in double-quotes)" name="description" ptype="LINE" />
<CONFIG />
</COMMAND>
How can I remove this with elementtree, can I? Or will i have to use some regex (I am writing a string to the file)?
Related
I am a begginer using Python. What I am trying to do is to update the InvoiceStatus of a certain list of invoices - I want to update it to "N" instead of "A". Below the XML file extract:
<?xml version="1.0" encoding="WINDOWS-1252"?>
<AuditFile>
<Header>
<AuditFileVersion>1.04_01</AuditFileVersion>
<CompanyID>51630</CompanyID>
</Header>
<MasterFiles>
<Customer>
<CustomerID>20201376</CustomerID>
<AccountID>20000</AccountID>
</Customer>
</MasterFiles>
<SourceDocuments>
<SalesInvoices>
<NumberOfEntries>981</NumberOfEntries>
<Invoice>
<InvoiceNo>F2 UF/3510000211</InvoiceNo>
<ATCUD>0</ATCUD>
<DocumentStatus>
<InvoiceStatus>A</InvoiceStatus>
<SourceBilling>P</SourceBilling>
</DocumentStatus>
<InvoiceNo>F2 UF/3510020247</InvoiceNo>
<ATCUD>0</ATCUD>
<DocumentStatus>
<InvoiceStatus>A</InvoiceStatus>
<SourceBilling>P</SourceBilling>
</DocumentStatus>
<InvoiceNo>F2 UF/3510020247</InvoiceNo>
<ATCUD>0</ATCUD>
<DocumentStatus>
<InvoiceStatus>A</InvoiceStatus>
<SourceBilling>P</SourceBilling>
</DocumentStatus>
</Invoice>
</SalesInvoices>
</SourceDocuments>
</AuditFile>
Here the script:
from xml.dom import minidom
def reemplazaTexto(nodo,textonuevo):
nodo.firstChild.replaceWholeText(textonuevo)
doc = minidom.parse('sample.xml')
print(doc.toxml())
invoices = doc.getElementsByTagName('InvoiceStatus')
for nodo in invoices:
reemplazaTexto(nodo, 'N')
print(doc.toxml())
But this script modifies all the InvoiceStatus. I would appreciate a hand on this.
Cheers,
Axel
Actually, my workflow is, I'll get some data from backend python, and I have to use that data and replicate it into an HTML page or in a pdf format as per the user's wish.
So I have created a python function in which the XML will be there and be saved automatically in our backend.
Here I'll provide my .py file where I wrote my code which generated XML code.
import xml.etree.ElementTree as xml
def GenerateXML(filename):
root = xml.Element("customers")
c1= xml.Element("customer")
root.append(c1)
type1= xml.SubElement(c1,"place")
type1.text = "UK"
Amount1 = xml.SubElement(c1,"Amount")
Amount1.text="4500"
tree=xml.ElementTree(root)
with open(filename,"wb") as f:
tree.write(f)
if __name__ == "__main__":
GenerateXML("fast.xml")
The result for this code will generate a backend file named fast.xml, which contains
#fast.xml
<customers>
<customer>
<place>uk</place>
<Amount>4500</Amount>
</customer>
</customers>
Creating an XML file is done, but attaching an XSL file to the XML is the issue,
can we do it with python, as we created .XML file
For example,
I have another XML file, which has an XSL file for it:
XML file
<?xml-stylesheet type = "text/xsl" href = "demo1.xsl"?>
<class>
<student>
<firstname>Graham</firstname>
<lastname>Bell</lastname>
<nickname>Garry</nickname>
</student>
<student>
<firstname>Albert</firstname>
<lastname>Einstein</lastname>
<nickname>Ally</nickname>
</student>
<student>
<firstname>Thomas</firstname>
<lastname>Edison</lastname>
<nickname>Eddy</nickname>
</student>
</class>
IN HTML it looks with a tabular form and with background colors.
but how to do it with an automated XML file
can anyone provide me a solution for this?
Thanks in Advance
So I have the following .txt file of data, where the data highlighted with yellow needs to be saved to a new txt file:
I managed to print certain sections in Python, but that's about it:
with open('Podatki-zima-MEDVES.txt', mode='r+t') as file:
for line in file:
print(line[18:39])
Resulting in:
EntryDate="20101126"
EntryDate="20101126"
EntryDate="20101126"
EntryDate="20101126"
EntryDate="20101127"
EntryDate="20101128"
EntryDate="20101128"
EntryDate="20101128"
EntryDate="20101128"
I know it's a very basic question, but for someone experienced this wouldn't take a minute.
Thanks
It looks like you're trying to parse xml data.
There is a standard library package that can do this. The documentation is pretty good and it includes a tutorial. Take a look at The ElementTree XML API.
In you case the code would look something like:
data = """
<data>
<ROW EntryData="20101126" SnowDepth="4"/>
<ROW EntryData="20101127" SnowDepth="8"/>
</data>"""
import xml.etree.ElementTree as ET
root = ET.fromstring(data)
for child in root:
entries = child.attrib
print(entries["EntryData"], entries["SnowDepth"])
This gives the output you're looking for:
20101126 4
20101127 8
As an alternative to using Element Tree you could use an Expat parser for your Structured Markup data.
You first need to specify document type and wrap a top level element around your data as follows:
<?xml version="1.0"?>
<podatki>
<ROW RowState="5" EntryDate="20101126" Entry="" SnowDepth="4" />
<ROW RowState="13" EntryDate="20101126" Entry="Prvi sneg to zimo" SnowDepth="10" />
</podatki>
Then you could use an expat parser.
import xml.parsers.expat
def podatki(name, attrs):
if name == "ROW":
print(f'EntryDate={attrs["EntryDate"]},',
f'SnowDepth={attrs["SnowDepth"]}')
parser = xml.parsers.expat.ParserCreate()
parser.StartElementHandler = podatki
with open('podatki.xml', 'rb') as input_file:
parser.ParseFile(input_file)
The result should be
EntryDate=20101126, SnowDepth=4
EntryDate=20101126, SnowDepth=10
Hi I'm new to xml files in general, but I am trying to replace specific lines in a xml file using 'if statements' in python 3.6. I've been looking at suggestions to use ElementTree, but none of the posts online quite fit the problem I have, so here I am.
My file is as followed:
<?xml version="1.0" encoding="UTF-8"?>
-<StructureDefinition xmlns="http://hl7.org/fhir">
<url value="http://example.org/fhir/StructureDefinition/MyObservation"/>
<name value="MyObservation"/>
<status value="draft"/>
<fhirVersion value="3.0.1"/>
<kind value="resource"/>
<abstract value="false"/>
<type value="Observation"/>
<baseDefinition value="http://hl7.org/fhir/StructureDefinition/Observation"/>
<derivation value="constraint"/>
</StructureDefinition>
I want to replace
url value="http://example.org/fhir/StructureDefinition/MyObservation"/
to something like
url value="http://example.org/fhir/StructureDefinition/NewObservation"/
by using conditional statements - because these are repeated multiple times in other files.
I have tried for-looping through the xml find to find the exact string match (which I've succeeded), but I wasn't able to delete, or replace the line (probably having to do with the fact that this isn't a .txt file).
Any help is greatly appreciated!
Your sample file contains a "-"-token in ln 3 that may be overlooked when copy/pasting in order to find a solution.
Input File
<?xml version="1.0" encoding="UTF-8"?>
<StructureDefinition xmlns="http://hl7.org/fhir">
<url value="http://example.org/fhir/StructureDefinition/MyObservation"/>
<name value="MyObservation"/>
<status value="draft"/>
<fhirVersion value="3.0.1"/>
<kind value="resource"/>
<abstract value="false"/>
<type value="Observation"/>
<baseDefinition value="http://hl7.org/fhir/StructureDefinition/Observation"/>
<derivation value="constraint"/>
</StructureDefinition>
Script
from xml.dom.minidom import parse # use minidom for this task
dom = parse('june.xml') #read in your file
search = "http://example.org/fhir/StructureDefinition/MyObservation" #set search value
replace = "http://example.org/fhir/StructureDefinition/NewObservation" #set replace value
res = dom.getElementsByTagName('url') #iterate over url tags
for element in res:
if element.getAttribute('value') == search: #in case of match
element.setAttribute('value', replace) #replace
with open('june_updated.xml', 'w') as f:
f.write(dom.toxml()) #update the dom, save as new xml file
Output file
<?xml version="1.0" ?><StructureDefinition xmlns="http://hl7.org/fhir">
<url value="http://example.org/fhir/StructureDefinition/NewObservation"/>
<name value="MyObservation"/>
<status value="draft"/>
<fhirVersion value="3.0.1"/>
<kind value="resource"/>
<abstract value="false"/>
<type value="Observation"/>
<baseDefinition value="http://hl7.org/fhir/StructureDefinition/Observation"/>
<derivation value="constraint"/>
</StructureDefinition>
I have a xml file which looks like below:
<?xml version="1.0" encoding="ASCII" standalone="yes"?>
<file>
<records>
<record>
<device_serial_number>PAD203137687</device_serial_number>
<device_serial_number_2>203137687</device_serial_number_2>
</record>
<record>
<device_serial_number>PAD203146024</device_serial_number>
<device_serial_number_2>203146024</device_serial_number_2>
</record>
</records>
</file>
Now i want to check device_serial_number in each record and check if the last 4 characters are 6024, if yes then write the complete record data to newxml file named one.xml
I have tried the below
from xml.etree import ElementTree as ET
tree = ET.parse('C:\\Users\\x3.xml')
for node in tree.findall('.//records//record/'):
print("<"+str(node.tag) + "> "+"<"+str(node.text)+"/>")
So from what I understand, you can try something like below:
from xml.etree import ElementTree as ET
from xml.dom.minidom import getDOMImplementation
from xml.dom.minidom import parseString
tree = ET.parse('C:\\Users\\x3.xml')
root = tree.getroot()
impl = getDOMImplementation()
#print(root) #just to check
commands = root.findall(".//records//")
recs=[c for c in commands if c.find('device_serial_number')!=None and
c.find('soc_id').text[-4:]=='6024']
bb=""
for rec in recs:
aa=(parseString(ET.tostring(rec)).toprettyxml(''))
bb=bb+aa
#print(bb) #it will have all data you need, write these into files
newdoc = impl.createDocument(None, bb, None)
newdoc.writexml(open('your_output_file.xml', 'w'),
indent="",
addindent="",
newl='') #check documentation for these
Here is the linkfor documentation regarding writing to xml files.
Node.writexml(writer, indent=”“, addindent=”“, newl=”“)
Write XML to the writer object. The writer should have a write() method which matches that of the file object interface. The indent parameter is the indentation of the current node. The addindent parameter is the incremental indentation to use for subnodes of the current one. The newl parameter specifies the string to use to terminate newlines.
The above is from xml.dom.minidom documentation.Which explains how to write and what they mean.
Finally this will help you to write the required data to the file which you specify in writexml, in xml format.