python lxml with py2exe - python

I have generated an XML document with DOM and I want to use lxml to pretty-print it.
This is my code for pretty-printing the XML:
def prettify_xml(xml_str):
    import lxml.etree as etree
    root = etree.fromstring(xml_str)
    xml_str = etree.tostring(root, pretty_print=True)
    return xml_str
My output should be an XML-formatted string.
I got this code from a post on Stack Overflow. It works flawlessly when I run it with the Python interpreter itself. But when I convert my project to a binary with py2exe (the binary is a Windows service with a named pipe), I have two problems:
First, my service was not starting. I solved this by adding lxml.etree to the includes option of the py2exe setup function; after that my service started properly.
Second, when XML generation is called, this is the error I see in my log:
'module' object has no attribute 'fromstring'
How do I fix this error? And is my solution to the first problem correct?
My XML generation code:
from xml.etree import ElementTree
from xml.dom import minidom
from xml.etree.ElementTree import Element, SubElement, tostring, XML
import lxml.etree
def prettify_xml(xml_str):
    root = lxml.etree.fromstring(xml_str)
    xml_str = lxml.etree.tostring(root, pretty_print=True)
    return xml_str
def dll_xml(status):
    try:
        xml_declaration = '<?xml version="1.0" standalone="no" ?>'
        rootTagName = 'response'
        root = Element(rootTagName)
        root.set('id', 'rp001')
        parent = SubElement(root, 'command', opcode='-ac')
        # Create children
        chdtag1Name = 'mode'
        chdtag1Value = 'repreport'
        chdtag2Name = 'status'
        chdtag2Value = status
        fullchildtag1 = '' + chdtag1Name + ' value = "' + chdtag1Value + '"'
        fullchildtag2 = '' + chdtag2Name + ' value="' + chdtag2Value + '"'
        children = XML('''<root><''' + fullchildtag1 + ''' /><''' + fullchildtag2 + '''/></root> ''')
        # Add parent
        parent.extend(children)
        dll_xml_doc = xml_declaration + tostring(root)
        dll_xml_doc = prettify_xml(dll_xml_doc)
        return dll_xml_doc
    except Exception, error:
        log.error("xml_generation_failed : %s" % error)

Try using PyInstaller instead of py2exe. I converted your program to a binary .exe with no problems just by running python pyinstaller.py YourPath\xml_a.py.
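If switching packagers is not an option, one way to sidestep the lxml bundling problem is to pretty-print with the standard library instead. This is only a minimal sketch, not the asker's code: prettify_xml_stdlib is a hypothetical name, it is written for Python 3, and xml.dom.minidom lays out whitespace a little differently from lxml's pretty_print.

```python
from xml.dom import minidom


def prettify_xml_stdlib(xml_str):
    """Pretty-print an XML string using only the standard library."""
    dom = minidom.parseString(xml_str)
    # With no encoding argument, toprettyxml returns a text string.
    return dom.toprettyxml(indent="  ")


# Hypothetical input modelled on the response document in the question.
pretty = prettify_xml_stdlib(
    '<response id="rp001"><command opcode="-ac">'
    '<mode value="repreport"/></command></response>'
)
print(pretty)
```

Because minidom ships with Python, nothing extra has to be listed in the py2exe includes for this variant.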

Related

Removing ":" from namespace in a XML file parsing

I am trying to modify an XML file using xml.etree.ElementTree on Python 2.6.6 (due to restrictions) and am facing the ns0 issue. I looked at this issue and used ET._namespace_map[uri] = prefix as suggested, which removed ns0, but the element tags still have the : prefix. How do we remove it, and does it affect the validity of the XML file when we use it for further processing?
Example:
<?xml version="1.0" encoding="UTF-8" ?>
<Seed xmlns="http://www.example.com">
    <TagA>
        <TagB>B</TagB>
        <TagC>c</TagC>
    </TagA>
</Seed>
Script:
import xml.etree.ElementTree as ET
tree = ET.parse('sample.xml')
root = tree.getroot()
try:
    ET.register_namespace("","http://example.com")
except AttributeError:
    def register_namespace(prefix, uri):
        ET._namespace_map[uri] = prefix
    register_namespace("","http://www.example.com")
tree.write('sample.xml')
Note: I cannot use lxml, or any xml.etree features that are only supported from Python 2.7 onward.
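One common workaround when register_namespace is unavailable is to strip the namespace from every tag and put the default xmlns declaration back on the root by hand. The sketch below assumes the document uses a single namespace. It is written for Python 3; on 2.6, root.getiterator() would stand in for root.iter(), and tostring would be called without encoding="unicode". On the validity question: a tag such as :TagA is not a valid qualified name under the XML Namespaces rules, so namespace-aware parsers are likely to reject it, and the stray colon is worth removing before further processing.

```python
import xml.etree.ElementTree as ET

NS = "http://www.example.com"
SAMPLE = (
    '<Seed xmlns="http://www.example.com">'
    "<TagA><TagB>B</TagB><TagC>c</TagC></TagA>"
    "</Seed>"
)

root = ET.fromstring(SAMPLE)
prefix = "{" + NS + "}"

# ElementTree stores namespaced tags as "{uri}local"; drop the "{uri}" part.
for elem in root.iter():
    if elem.tag.startswith(prefix):
        elem.tag = elem.tag[len(prefix):]

# Re-declare the default namespace so the output still carries it.
root.set("xmlns", NS)

out = ET.tostring(root, encoding="unicode")
print(out)
```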

Python xml.etree issue

I'm trying to build a script in Python 3 using xml.etree that accepts a version as a parameter, parses the XML, and replaces the version in the tag attributes and values, from the root down through its children.
I am at the point where I can change the default value in the root, but I am struggling to change the version in the children and grandchildren: CurrentVersion, Template and BasePath.
Here is my code and XML:
Code:
import sys
from xml.etree import ElementTree as et
version = sys.argv[1]
parse = et.parse("WebApp2.config")
root = parse.getroot()
def changeVersion(version):
    ourVersion = root.find('OurVersion')
    root.set("default", version)
    print(et.tostring(root))
    parse.write("WebApp2.config", xml_declaration=True)
if __name__ == "__main__":
    changeVersion(version)
XML:
<?xml version="1.0"?>
<OurVersion default="1.0.0.3">
    <CurrentVersion bitSupport="true" deviceDetectionSupport="true"
                    version="1.0.0.3">
        <Template>D:\Some\Path\Software\1.0.0.3\webApp\index.webapp</Template>
        <BasePath>resources/1.0.0.3/webApp/</BasePath>
    </CurrentVersion>
</OurVersion>
I've tried to add something like the code below, but I'm getting an error along the lines of "no set attribute on currentVersion":
ourVersion = root.find('OurVersion')
ourVersion.set('default', version)
currentVersion = ourVersion.find('CurrentVersion')
currentVersion.set('version', version)
I appreciate your help on this matter ;)
Your first script works because root.set("default", version) operates on root, which is the OurVersion element that carries the default attribute.
In fact, ourVersion = root.find('OurVersion') returns None, because OurVersion is the root itself and find only searches the children of the element it is called on. The subsequent ourVersion.set(...) and ourVersion.find('CurrentVersion') calls therefore fail.
Try this instead:
currentVersion = root.find('CurrentVersion')
currentVersion.set('version', version)
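To also cover the Template and BasePath values mentioned in the question, the same idea can be extended with a plain string replace on each child's text. This is a sketch under assumptions: change_version is a hypothetical helper, the old version is taken from the root's default attribute, and the XML is parsed from an inline copy of the question's sample rather than from WebApp2.config.

```python
from xml.etree import ElementTree as et

# Inline copy of the sample from the question (raw string keeps backslashes).
SAMPLE = r"""<OurVersion default="1.0.0.3">
  <CurrentVersion bitSupport="true" deviceDetectionSupport="true" version="1.0.0.3">
    <Template>D:\Some\Path\Software\1.0.0.3\webApp\index.webapp</Template>
    <BasePath>resources/1.0.0.3/webApp/</BasePath>
  </CurrentVersion>
</OurVersion>"""


def change_version(root, new_version):
    """Update the version on the root, on CurrentVersion, and in child text."""
    old_version = root.get("default")
    root.set("default", new_version)

    # CurrentVersion is a direct child of the root, so find() locates it.
    current = root.find("CurrentVersion")
    current.set("version", new_version)

    # Swap the old version for the new one inside the path strings.
    for tag in ("Template", "BasePath"):
        child = current.find(tag)
        if child is not None and child.text:
            child.text = child.text.replace(old_version, new_version)


root = et.fromstring(SAMPLE)
change_version(root, "1.0.0.4")
print(et.tostring(root, encoding="unicode"))
```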

error traversing xml in Python

My attempts to traverse an XML file retrieved from a URL have always failed, although it works if I put the XML directly in the code, such as:
smplexml = ''' somexml'''
I have been unable to make code like the following work:
import xml.etree.ElementTree as ET
import urllib
xmlstr = urllib.urlopen('http://www.w3schools.com/xml/simple.xml').read()
tree = ET.fromstring(xmlstr)
print tree.find('name').text
What am I doing wrong? Sometimes I get an error message like:
AttributeError: 'NoneType' object has no attribute 'text'
import xml.etree.ElementTree as ET
import urllib
xmlstr = urllib.urlopen('http://www.w3schools.com/xml/simple.xml').read()
tree = ET.fromstring(xmlstr)
for food in tree:
    print food.find('name').text
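The loop works because find('name') only looks at the direct children of the element it is called on, and the name elements sit one level further down. The sketch below illustrates this in Python 3 with an inline document, so it runs without network access. The layout (a breakfast_menu root containing food elements) is an assumption about what simple.xml looks like.

```python
import xml.etree.ElementTree as ET

# Assumed structure of the downloaded file, reduced to two entries.
xmlstr = (
    "<breakfast_menu>"
    "<food><name>Belgian Waffles</name></food>"
    "<food><name>French Toast</name></food>"
    "</breakfast_menu>"
)
root = ET.fromstring(xmlstr)

# <name> is a grandchild of the root, so a direct lookup finds nothing.
direct = root.find("name")

# Iterating over the root yields each <food>, whose child <name> exists.
names = [food.find("name").text for food in root]

# ".//name" searches all descendants instead of only direct children.
first_nested = root.find(".//name").text

print(direct, names, first_nested)
```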

Extracting nested namespace from a xml using lxml

I'm new to Python and currently learning to parse XML. All seemed to be going well until I hit a wall with nested namespaces.
Below is a snippet of my XML, with the beginning and the child element that I'm trying to parse:
<?xml version="1.0" encoding="UTF-8"?>
-<CompositionPlaylist xmlns="http://www.digicine.com/PROTO-ASDCP-CPL-20040511#">
<!-- Generated by orca_wrapping version 3.8.3-0 -->
<Id>urn:uuid:e0e43007-ca9b-4ed8-97b9-3ac9b272be7a</Id>
-------------
-------------
-------------
-<cc-cpl:MainClosedCaption xmlns:cc-cpl="http://www.digicine.com/PROTO- ASDCP-CC-CPL-20070926#"><Id>urn:uuid:0607e57f-edcc-46ec- 997a-d2fbc0c1ea3a</Id><EditRate>24 1</EditRate><IntrinsicDuration>2698</IntrinsicDuration></cc-cpl:MainClosedCaption>
------------
------------
------------
</CompositionPlaylist>
What I need is a way to extract the namespace URI of the element with the local name 'MainClosedCaption'. In this case, I'm trying to extract the string "http://www.digicine.com/PROTO- ASDCP-CC-CPL-20070926#". I looked through a lot of tutorials but cannot seem to find a solution.
If anyone out there can lend their expertise, it would be much appreciated.
Here is what I have done so far, with help from the two contributors:
#!/usr/bin/env python
from xml.etree import ElementTree as ET #import ElementTree module as an alias ET
from lxml import objectify, etree
def parse():
    import os
    import sys
    cpl_file = sys.argv[1]
    xml_file = os.path.abspath(__file__)
    xml_file = os.path.dirname(xml_file)
    xml_file = os.path.join(xml_file,cpl_file)
    with open(xml_file)as f:
        xml = f.read()
    tree = etree.XML(xml)
    caption_namespace = etree.QName(tree.find('.//{*}MainClosedCaption')).namespace
    print caption_namespace
    print tree.nsmap
    nsmap = {}
    for ns in tree.xpath('//namespace::*'):
        if ns[0]:
            nsmap[ns[0]] = ns[1]
    tree.xpath('//cc-cpl:MainClosedCaption', namespace=nsmap)
    return nsmap
if __name__=="__main__":
    parse()
But it's not working so far. I got the result 'None' when I used QName to locate the tag and its namespace. And when I tried to collect all namespaces in the XML using a for loop, as suggested in another post, I got the error 'Unknown return type: dict'.
Any suggestions, please?
This program prints the namespace of the indicated tag:
from lxml import etree
xml = etree.XML('''<?xml version="1.0" encoding="UTF-8"?>
<CompositionPlaylist xmlns="http://www.digicine.com/PROTO-ASDCP-CPL-20040511#">
<!-- Generated by orca_wrapping version 3.8.3-0 -->
<Id>urn:uuid:e0e43007-ca9b-4ed8-97b9-3ac9b272be7a</Id>
<cc-cpl:MainClosedCaption xmlns:cc-cpl="http://www.digicine.com/PROTO-ASDCP-CC-CPL-20070926#">
<Id>urn:uuid:0607e57f-edcc-46ec- 997a-d2fbc0c1ea3a</Id>
<EditRate>24 1</EditRate>
<IntrinsicDuration>2698</IntrinsicDuration>
</cc-cpl:MainClosedCaption>
</CompositionPlaylist>
''')
print etree.QName(xml.find('.//{*}MainClosedCaption')).namespace
Result:
http://www.digicine.com/PROTO-ASDCP-CC-CPL-20070926#
Reference: http://lxml.de/tutorial.html#namespaces
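The same lookup can be sketched with the standard library alone, since ElementTree stores a namespaced tag as "{uri}local". The helper name namespace_of is hypothetical and the code is Python 3. As a side note, the 'Unknown return type: dict' error in the question is probably caused by the keyword namespace=nsmap: lxml's xpath() expects namespaces= (plural) and treats other keyword arguments as XPath variables.

```python
import xml.etree.ElementTree as ET

SAMPLE = (
    '<CompositionPlaylist xmlns="http://www.digicine.com/PROTO-ASDCP-CPL-20040511#">'
    "<Id>urn:uuid:e0e43007-ca9b-4ed8-97b9-3ac9b272be7a</Id>"
    '<cc-cpl:MainClosedCaption '
    'xmlns:cc-cpl="http://www.digicine.com/PROTO-ASDCP-CC-CPL-20070926#">'
    "<EditRate>24 1</EditRate>"
    "</cc-cpl:MainClosedCaption>"
    "</CompositionPlaylist>"
)


def namespace_of(root, local_name):
    """Return the namespace URI of the first element with this local name."""
    for elem in root.iter():
        tag = elem.tag
        if tag.startswith("{"):
            # Split "{uri}local" into its two parts.
            uri, _, local = tag[1:].partition("}")
        else:
            uri, local = None, tag
        if local == local_name:
            return uri
    return None


root = ET.fromstring(SAMPLE)
caption_ns = namespace_of(root, "MainClosedCaption")
print(caption_ns)
```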

python lxml etree parsing an fvdl file

The file contains the following lines:
<?xml version="1.0" encoding="UTF-8"?>
<FVDL xmlns="xmlns://www.fortifysoftware.com/schema/fvdl" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="1.9" xsi:type="FVDL">
<CreatedTS date="2013-08-06" time="11:8:48" />
I am trying to read the version tag in FVDL. I am using lxml etree, and my code snippet is:
import os
from lxml import etree
with open(os.path.join(analysis, "merged-results.fvdl"), "r") as file_handle:
    context = etree.parse(file_handle)
ver = context.xpath('//FVDL')
print ver
This worked before when parsing a standard XML file. However, it fails for the file above: ver is an empty list at the end of execution.
Alternative to @falsetru's answer
(By "trying to read the version tag", I understand "the version attribute", which may not be what you want.)
Explicitly register the FVDL namespace under the "fvdl" prefix:
ver = context.xpath('//fvdl:FVDL/@version',
                    namespaces={"fvdl": "xmlns://www.fortifysoftware.com/schema/fvdl"})
Or, riskier, if you somehow know you want the version attribute from the root node:
ver = context.xpath('/*/@version')
Both give ['1.9'].
context = etree.parse(file_handle)
ver = context.getroot()
print ver.attrib['version']
Output: '1.9'
Use [local-name()=...]:
ver = context.xpath('//*[local-name()="FVDL"]')
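The reason '//FVDL' matches nothing is that the xmlns declaration on the root puts every unprefixed element into that namespace, so the element's real name includes the URI. A small sketch with the standard library's ElementTree (used here instead of lxml only so that it runs without extra dependencies, and written for Python 3) shows the effect. The closing </FVDL> tag is added so the fragment parses on its own.

```python
import xml.etree.ElementTree as ET

FVDL_NS = "xmlns://www.fortifysoftware.com/schema/fvdl"
SAMPLE = (
    '<FVDL xmlns="xmlns://www.fortifysoftware.com/schema/fvdl" '
    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'version="1.9" xsi:type="FVDL">'
    '<CreatedTS date="2013-08-06" time="11:8:48" />'
    "</FVDL>"
)

root = ET.fromstring(SAMPLE)

# The default namespace becomes part of the tag: "{uri}FVDL".
print(root.tag)

# Unprefixed attributes are not namespaced, so a plain lookup works.
version = root.get("version")

# A child lookup without the namespace finds nothing...
unqualified = root.find("CreatedTS")

# ...while mapping a prefix of our choosing to the URI succeeds.
created = root.find("fvdl:CreatedTS", {"fvdl": FVDL_NS})

print(version, unqualified, created.get("date"))
```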
