Minidom write XML removing empty node


I have an XML file like this.
Input XML:
<India>
<state>
<name>Karnataka</name>
<description>One State Many Worlds</description>
<population></population>
</state>
<state>
<name>Kerala</name>
<description> Gods own country</description>
<population></population>
</state>
</India>
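Then I process the data (the processing code and data are omitted for simplicity) and write the document back out.
Code:
import xml.dom.minidom
from xml.dom.minidom import Node
doc = xml.dom.minidom.parse("india.xml")
# Use a context manager so the output file is flushed and closed
with open('NewData.xml', 'w') as out:
    doc.writexml(out)
doc.unlink()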
Here is the output XML:
<?xml version="1.0" ?><India>
<state>
<name>Karnataka</name>
<description>One State Many Worlds</description>
<population/>
</state>
<state>
<name>Kerala</name>
<description> Gods own country</description>
<population/>
</state>
</India>
As you can see, the empty <population></population> node is "half removed" in the output file: it is written as <population/>.
I need help tackling this issue. Thanks!
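For what it's worth, <population/> and <population></population> are equivalent in XML, so no data was actually removed. If the expanded form is still required, one possible workaround (a sketch, not something from the original post) is to serialize with xml.etree.ElementTree instead of minidom, because its short_empty_elements flag (Python 3.4+) controls exactly this. The inline sample string below is a trimmed stand-in for india.xml.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for india.xml (assumption: the real file has the same shape)
source = (
    "<India><state><name>Karnataka</name>"
    "<population></population></state></India>"
)
root = ET.fromstring(source)

# Default serialization collapses empty elements into a self-closing tag
compact = ET.tostring(root, encoding="unicode")

# short_empty_elements=False keeps the explicit open/close pair
expanded = ET.tostring(root, encoding="unicode", short_empty_elements=False)

print(compact)
print(expanded)
```

The same flag is accepted by ElementTree.write(), so a parsed tree can be written straight to a file with it.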

Related

How to rewrite this XML file?

I am trying to rewrite an XML file containing this XML code:
<?xml version="1.0" encoding="UTF-8"?>
<BrowserAutomationStudioProject>
<ModelList>
<Model>
<Name>token</Name>
<Description ru="token" en="token"/>
<Value>5660191076:AAEY8RI3hXcI3dEvjWAj7p2e7DdxOMNjPfk8</Value>
</Model>
<Defaults/>
<Model>
<Name>chat_id</Name>
<Value>5578940124</Value>
</Model>
<Defaults/>
</ModelList>
</BrowserAutomationStudioProject>
My python code:
import xml.etree.ElementTree as ET
tree = ET.parse('Actual.xml')
root = tree.getroot()
for model in root.findall('Model'):
    name = model.find('Name').text
    if name == 'token':
        model.find('Value').text = '123456789:ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    if name == 'chat_id':
        model.find('Value').text = '1234567890'
tree.write('xml_file.xml')
It works but I get the same file:
<?xml version="1.0" encoding="UTF-8"?>
<BrowserAutomationStudioProject>
<ModelList>
<Model>
<Name>token</Name>
<Description ru="token" en="token"/>
<Value>5660191076:AAEY8RI3hXcI3dEvjWAj7p2e7DdxOMNjPfk8</Value>
</Model>
<Defaults/>
<Model>
<Name>chat_id</Name>
<Value>5578940124</Value>
</Model>
<Defaults/>
</ModelList>
</BrowserAutomationStudioProject>
What's wrong with my code?
Even ChatGPT can't help me haha
I even tried to print it, but that doesn't work either.
What should I do?
Please help me.

As described in the documentation, Element.findall() finds only elements with the given tag that are direct children of the current element. To make ET select all subelements, on all levels beneath the current element, use // in the path.
Since <Model> is not a direct child of root (it is a grandchild, nested inside <ModelList>), root.findall('Model') finds nothing. To get ET to find it, change the call to
root.findall('.//Model')
and it should work.
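To make the difference concrete, here is a small self-contained sketch. It parses a shortened copy of the XML from a string (the placeholder values old-token and old-id are made up for illustration) and compares the two paths before applying the update.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the question's XML; the old values are placeholders
xml_text = """<BrowserAutomationStudioProject>
  <ModelList>
    <Model><Name>token</Name><Value>old-token</Value></Model>
    <Defaults/>
    <Model><Name>chat_id</Name><Value>old-id</Value></Model>
    <Defaults/>
  </ModelList>
</BrowserAutomationStudioProject>"""
root = ET.fromstring(xml_text)

direct = root.findall('Model')     # direct children of root only
nested = root.findall('.//Model')  # any depth below root
print(len(direct), len(nested))

# Replacement values keyed by the <Name> text
new_values = {
    'token': '123456789:ABCDEFGHIJKLMNOPQRSTUVWXYZ',
    'chat_id': '1234567890',
}
for model in nested:
    name = model.find('Name').text
    if name in new_values:
        model.find('Value').text = new_values[name]

values = [m.find('Value').text for m in nested]
print(values)
```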

You could also use for model in root.findall('ModelList/Model').

If you know the order of the <Value> tags, you can also iterate through the tree and pop() the new values from a list:
import xml.etree.ElementTree as ET
tree = ET.parse('Actual.xml')
root = tree.getroot()
# pop() takes from the end of the list, so the values are listed in reverse document order
input_value = ['1234567890','123456789:ABCDEFGHIJKLMNOPQRSTUVWXYZ']
for elem in root.iter():
    if elem.tag == "Value":
        elem.text = input_value.pop()
        print(elem.tag, elem.text)
tree.write('xml_file.xml')
Output:
<?xml version="1.0"?>
<BrowserAutomationStudioProject>
<ModelList>
<Model>
<Name>token</Name>
<Description ru="token" en="token" />
<Value>123456789:ABCDEFGHIJKLMNOPQRSTUVWXYZ</Value>
</Model>
<Defaults />
<Model>
<Name>chat_id</Name>
<Value>1234567890</Value>
</Model>
<Defaults />
</ModelList>
</BrowserAutomationStudioProject>

How to copy xml entire elements, attributes and data to new xml file with specific id using python

Below is my sample xml source file
<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns="http://www.sample.com/xml/catalog" catalog-id="sample-catalog">
<product product-id="214146430">
<online-flag>false</online-flag>
<online-flag site-id="sample_ae">false</online-flag>
<available-flag>true</available-flag>
<searchable-flag>true</searchable-flag>
<tax-class-id>standard</tax-class-id>
<page-attributes/>
<custom-attributes>
<custom-attribute attribute-id="adultsize">L</custom-attribute>
</custom-attributes>
</product>
<product product-id="214146123">
<online-flag>false</online-flag>
<online-flag site-id="sample_ae">false</online-flag>
<available-flag>true</available-flag>
<searchable-flag>true</searchable-flag>
<tax-class-id>standard</tax-class-id>
<page-attributes/>
<custom-attributes>
<custom-attribute attribute-id="adultsize">L</custom-attribute>
</custom-attributes>
</product>
</catalog>
I want to copy only the product with ID 214146430 to a new XML file, which should look like this:
<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns="http://www.sample.com/xml/catalog" catalog-id="sample-catalog">
<product product-id="214146430">
<online-flag>false</online-flag>
<online-flag site-id="sample_ae">false</online-flag>
<available-flag>true</available-flag>
<searchable-flag>true</searchable-flag>
<tax-class-id>standard</tax-class-id>
<page-attributes/>
<custom-attributes>
<custom-attribute attribute-id="adultsize">L</custom-attribute>
</custom-attributes>
</product>
</catalog>
I have tried both xml.etree.ElementTree and xml.dom with no luck: my code just copies the entire XML, which is not what I want.
Below is my Python code:
import xml.etree.ElementTree as ET
tree = ET.parse('Development/product_data_parser/emporio-imoprt-test.xml')
root = tree.getroot()
print(ET.tostring(root, encoding='utf8').decode('utf8'))
Thanks so much in advance for your help.

You could do it like this although this solution might be specific to your use-case:-
import xml.etree.ElementTree as ET
import re
# A list of product IDs that you want to keep
keep = ['214146430']
# Figure out the namespace (used in tag matching later)
def getnamespace(root):
    m = re.match(r'\{.*\}', root.tag)
    return m.group(0) if m is not None else ''
tree = ET.parse('Development/product_data_parser/emporio-imoprt-test.xml')
root = tree.getroot()
namespace = getnamespace(root)
for elem in root.findall(f'{namespace}product'):
    if elem.attrib['product-id'] not in keep:
        root.remove(elem)
print(ET.tostring(root, encoding='utf8').decode('utf8'))
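If the namespace URI is known up front, a variation on the same idea is possible. The sketch below assumes the namespace is fixed and parses a trimmed copy of the catalog from a string. It also registers the namespace as the default one, so the output keeps the plain xmlns="..." form instead of gaining an ns0: prefix.

```python
import xml.etree.ElementTree as ET

NS = "http://www.sample.com/xml/catalog"
# Serialize this namespace without a prefix (otherwise ET emits ns0:)
ET.register_namespace("", NS)

# Trimmed stand-in for the source catalog
xml_text = f"""<catalog xmlns="{NS}" catalog-id="sample-catalog">
  <product product-id="214146430"><online-flag>false</online-flag></product>
  <product product-id="214146123"><online-flag>false</online-flag></product>
</catalog>"""
root = ET.fromstring(xml_text)

keep = {"214146430"}
# findall() returns a list, so removing from root while looping is safe
for product in root.findall(f"{{{NS}}}product"):
    if product.get("product-id") not in keep:
        root.remove(product)

result = ET.tostring(root, encoding="unicode")
print(result)
```

To produce a file rather than a string, ET.ElementTree(root).write(path, encoding="UTF-8", xml_declaration=True) should work, with path being whatever output file name you choose.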

parse xml file with game data in python

I have this game data written in xml format.
<?xml version="1.0" encoding="ISO-8859-1"?>
<log xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://mirror.server.eu/descr.xsd">
<version>0.1</version>
<info>
<timestamp>2018-09-23 16:09:23 CEST</timestamp>
<hostname>server.eu</hostname>
</info>
<events>
<event>
<pickup>
<time>1.506636</time>
<item>item_spikes</item>
<player>player1</player>
<value>50</value>
</pickup>
</event>
<event>
<damage>
<time>1.926975</time>
<attacker>player1</attacker>
<target>player2</target>
<type>sg</type>
<quad>0</quad>
<splash>0</splash>
<value>24</value>
<armor>0</armor>
</damage>
</event>
<event>
<death>
<time>4.862534</time>
<attacker>player2</attacker>
<target>player1</target>
<type>lg_beam</type>
<quad>0</quad>
<armorleft>0</armorleft>
<killheight>0</killheight>
<lifetime>4.862534</lifetime>
</death>
</event>
</events>
</log>
I need to parse it and take out all events called 'death'. Then I need to access every element in that 'death' section. Could you please help me with that?

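Assuming the data you need always sits in elements tagged death, you can easily do this (Python 3 syntax; cElementTree was removed in Python 3.9, so plain ElementTree is used):
import xml.etree.ElementTree as ET
tree = ET.ElementTree(file='your_game_events.xml')
# iter() walks the whole tree, so nesting inside <events>/<event> does not matter
for death in tree.iter(tag='death'):
    for child in death:
        print("%s: %s" % (child.tag, child.text))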
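If you would rather work with the values than print them, the same loop can collect each death event into a dictionary. This is only a sketch: it parses a shortened copy of the log from a string, and it assumes the child tags inside one death element are unique.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the game log from the question
xml_text = """<log>
  <events>
    <event><pickup><time>1.506636</time><player>player1</player></pickup></event>
    <event><death>
      <time>4.862534</time>
      <attacker>player2</attacker>
      <target>player1</target>
      <type>lg_beam</type>
    </death></event>
  </events>
</log>"""
root = ET.fromstring(xml_text)

# One dict per <death>, mapping child tag -> text
deaths = [
    {child.tag: child.text for child in death}
    for death in root.iter('death')
]

for d in deaths:
    print(d['attacker'], 'killed', d['target'], 'at', float(d['time']))
```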

Python: Appending children to an already created XML file's root using XML DOM

I'm totally new to XML and I'm stuck on how to append children to the root node of an already existing XML file using Python and XML DOM. Right now I have this script to create an output file:
from xml.dom.minidom import Document
doc = Document()
root_node = doc.createElement("notes") # Root
doc.appendChild(root_node)
object_node = doc.createElement("data") # Child
root_node.appendChild(object_node)
object_node.setAttribute("a_version", "something_v001.0001.ma") # Set attributes
object_node.setAttribute("b_user", "Me")
object_node.setAttribute("c_comment", "comment about file")
xml_file = open("C:/Temp/file_notes.xml", "w") # "w" overwrites the file; it does not append
xml_file.write(doc.toprettyxml())
xml_file.close()
This gives me an output file that looks like this:
<?xml version="1.0" ?>
<notes>
<data a_version="Me" b_user="something_v001.0001.ma" c_comment="comment about file"/>
</notes>
I'd like to append future data to this file so it will look something like this after 2 additional versions:
<?xml version="1.0" ?>
<notes>
<data a_version="something_v001.0001.ma" b_user="Me" c_comment="comment about file"/>
<data a_version="something_v001.0002.ma" b_user="You" c_comment="minor save"/>
<data a_version="something_v002.0003.ma" b_user="Them" c_comment="major save"/>
</notes>
But every attempt I make at appending data comes out like this:
<?xml version="1.0" ?>
<notes>
<data a_version="Me" b_user="something_v001.0001.ma" c_comment="comment about file"/>
</notes>
<?xml version="1.0" ?>
<notes>
<data a_version="Me" b_user="something_v001.0001.ma" c_comment="comment about file"/>
</notes>
<?xml version="1.0" ?>
<notes>
<data a_version="Me" b_user="something_v001.0001.ma" c_comment="comment about file"/>
</notes>
If anyone has any alternative methods to accomplish this task by using ElementTree that would be appreciated as well. There seem to be a lot more resources, but I'm not sure how to implement the solution with Maya. Thanks!

You haven't shown us any code demonstrating "every attempt I make at appending data". But never mind, here is how you can use ElementTree to append new elements to an existing XML file.
from xml.etree import ElementTree as ET
from xml.dom import minidom
# Assume that we have an existing XML document with one "data" child
doc = ET.parse("file_notes.xml")
root = doc.getroot()
# Create 2 new "data" elements
data1 = ET.Element("data", {"a_version": "something_v001.0002.ma",
                            "b_user": "You",
                            "c_comment": "minor save"})
data2 = ET.Element("data", {"a_version": "something_v001.0003.ma",
                            "b_user": "Them",
                            "c_comment": "major save"})
# Append the new "data" elements to the root element of the XML document
root.append(data1)
root.append(data2)
# Now we have a new well-formed XML document. It is not very nicely formatted...
out = ET.tostring(root)
# ...so we'll use minidom to make the output a little prettier
dom = minidom.parseString(out)
print(dom.toprettyxml())
Output:
<?xml version="1.0" ?>
<notes>
<data a_version="Me" b_user="something_v001.0001.ma" c_comment="comment about file"/>
<data a_version="something_v001.0002.ma" b_user="You" c_comment="minor save"/>
<data a_version="something_v001.0003.ma" b_user="Them" c_comment="major save"/>
</notes>
ElementTree had no built-in pretty-printer when this was written (Python 3.9 later added ET.indent()), so we use minidom for that. The output contains some superfluous whitespace, but it is more readable than ElementTree's default single-line output.
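On Python 3.9 or newer, the minidom round trip can probably be skipped by calling ET.indent() before serializing. The following is a minimal sketch under that version assumption; it builds the existing document from a string instead of reading file_notes.xml.

```python
import xml.etree.ElementTree as ET

# Stand-in for the existing file_notes.xml content
existing = (
    '<notes><data a_version="something_v001.0001.ma" '
    'b_user="Me" c_comment="comment about file"/></notes>'
)
root = ET.fromstring(existing)

# Append one more <data> child to the existing root
ET.SubElement(root, "data", {
    "a_version": "something_v001.0002.ma",
    "b_user": "You",
    "c_comment": "minor save",
})

# Requires Python 3.9+: adds newlines and indentation in place
ET.indent(root, space="  ")
pretty = ET.tostring(root, encoding="unicode")
print(pretty)
```

To persist the result, ET.ElementTree(root).write("file_notes.xml") overwrites the file with the full updated document, which avoids the repeated <?xml ...?> headers that come from opening the file in append mode.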

How to remove ns0 tag while dumping

I have tried parsing the file using iterparse, since the actual file will be huge. I have the following code:
import xml.etree.ElementTree as etree
filename = r'D:\test\Books.xml'
context = iter(etree.iterparse(filename, events=('start', 'end')))
_, root = next(context)
for event, elem in context:
    # Only on 'end' is the element guaranteed to be fully parsed
    if event == 'end' and elem.tag == '{http://www.book.org/Book-19200/biblography}Book':
        etree.dump(elem)  # dump() writes to stdout itself and returns None
        root.clear()
And my XML looks like this:
<Books>
<Book xmlns="http://www.book.org/Book-19200/biblography"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
ISBN="519292296"
xsi:schemaLocation="http://www.book.org/Book-19200/biblography ../../book.xsd
http://www.w3.org/2000/12/xmldsig# ../../xmldsig-core-schema.xsd">
<Detail ID="67">
<BookName>Code Complete 2</BookName>
<Author>Steve McConnell</Author>
<Pages>960</Pages>
<ISBN>0735619670</ISBN>
<BookName>Application Architecture Guide 2</BookName>
<Author>Microsoft Team</Author>
<Pages>496</Pages>
<ISBN>073562710X</ISBN>
</Detail>
</Book>
<Book xmlns="http://www.book.org/Book-19200/biblography"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
ISBN="519292296"
xsi:schemaLocation="http://www.book.org/Book-19200/biblography ../../book.xsd
http://www.w3.org/2000/12/xmldsig# ../../xmldsig-core-schema.xsd">
<Detail ID="87">
<BookName>Rocking Python</BookName>
<Author>Guido Rossum</Author>
<Pages>960</Pages>
<ISBN>0735619690</ISBN>
<BookName>Python Rocks</BookName>
<Author>Microsoft Team</Author>
<Pages>496</Pages>
<ISBN>073562710X</ISBN>
</Detail>
</Book>
</Books>
Running the above generates something like this:
<ns0:Book xmlns:ns0="http://www.book.org/Book-19200/biblography" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" ISBN="519292296" xsi:schemaLocation="http://www.book.org/Book-19200/biblography ../../book.xsd http://www.w3.org/2000/12/xmldsig# ../../xmldsig-core-schema.xsd">
<ns0:Detail ID="67">
<ns0:BookName>Code Complete 2</ns0:BookName>
<ns0:Author>Steve McConnell</ns0:Author>
<ns0:Pages>960</ns0:Pages>
<ns0:ISBN>0735619670</ns0:ISBN>
<ns0:BookName>Application Architecture Guide 2</ns0:BookName>
<ns0:Author>Microsoft Team</ns0:Author>
<ns0:Pages>496</ns0:Pages>
<ns0:ISBN>073562710X</ns0:ISBN>
</ns0:Detail>
</ns0:Book>
How do I ensure I print the xml fragment without the ns0 prefix? I am using Python 3.

Add
etree.register_namespace("", "http://www.book.org/Book-19200/biblography")
to your program. This function registers a namespace prefix to be used for serialization (in this case it means no prefix).
Reference: http://docs.python.org/3/library/xml.etree.elementtree.html#xml.etree.ElementTree.register_namespace
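A quick way to see the effect is to serialize the same element before and after the call. This sketch uses a shortened copy of the XML parsed from a string rather than iterparse on the real file.

```python
import xml.etree.ElementTree as etree

NS = "http://www.book.org/Book-19200/biblography"
# Shortened copy of the question's XML
xml_text = (
    f'<Books><Book xmlns="{NS}" ISBN="519292296">'
    '<Detail ID="67"><BookName>Code Complete 2</BookName></Detail>'
    '</Book></Books>'
)
root = etree.fromstring(xml_text)
book = root.find(f"{{{NS}}}Book")

# Without a registered prefix, ET invents ns0 for the namespace
before = etree.tostring(book, encoding="unicode")

# An empty prefix makes this the default namespace on output
etree.register_namespace("", NS)
after = etree.tostring(book, encoding="unicode")

print(before)
print(after)
```

Note that register_namespace() changes a module-wide registry, so it affects every later serialization in the same process.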
