I've got the following XML schema:
<xsd:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:complexType name="DataPackage">
<xsd:sequence>
<xsd:element name="timestamp" type="xsd:float" default="0.0"/>
<xsd:element name="type" type="xsd:string" default="None"/>
<xsd:element name="host" type="xsd:string" default="None"/>
<xsd:element name="data" type="Data" />
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="Data">
<xsd:sequence>
<xsd:element name="item" type="Item" minOccurs="0" maxOccurs="unbounded" />
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="Item">
<xsd:sequence>
<xsd:element name="key" type="xsd:string"/>
<xsd:element name="val" type="xsd:string"/>
</xsd:sequence>
</xsd:complexType>
</xsd:schema>
I used pyxbgen -u DataPackage.xsd -m DataPackage to generate the corresponding Python classes and used these to generate the following XML:
<?xml version="1.0" encoding="utf-8"?>
<DataPackage>
<timestamp>1378970933.29</timestamp>
<type>None</type>
<host>Client 1</host>
<data>
<item>
<key>KEY1</key>
<val>value1</val>
</item>
</data>
</DataPackage>
If I try to read this in the Python interpreter with the following:
import DataPackage
xml = file("dataPackage-Test.xml").read()
data = DataPackage.CreateFromDocument(xml)
I get the exception:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "DataPackage.py", line 54, in CreateFromDocument
instance = handler.rootObject()
File "/usr/local/lib/python2.7/dist-packages/pyxb/binding/saxer.py", line 274, in rootObject
raise pyxb.UnrecognizedDOMRootNodeError(self.__rootObject)
pyxb.exceptions_.UnrecognizedDOMRootNodeError: <pyxb.utils.saxdom.Element object at 0x9c7c76c>
Does anyone have an idea what's wrong?
Your schema defines a top-level complex type named DataPackage, but does not define any top-level elements. Thus the DOM element DataPackage has no corresponding element that PyXB can use to process it.
You need to add something like:
<xsd:element name="DataPackage" type="DataPackage"/>
Note that in XML Schema the namespaces for elements and types are distinct, but in Python they are not, so PyXB will rename one of them (the complex type in this case) to avoid the conflict. See http://pyxb.sourceforge.net/arch_binding.html?highlight=conflict#deconflicting-names
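The missing top-level element can be spotted without PyXB. The following is a minimal sketch using only the standard library; the helper names top_level_elements and root_is_declared are made up for illustration, the schema is trimmed to one child element, and the check ignores namespaces and imports, so it is a quick diagnostic rather than a validator.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Trimmed version of the schema from the question: a type, but no top-level element.
SCHEMA = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            elementFormDefault="qualified">
  <xsd:complexType name="DataPackage">
    <xsd:sequence>
      <xsd:element name="host" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>"""

DOCUMENT = "<DataPackage><host>Client 1</host></DataPackage>"


def top_level_elements(schema_text):
    """Return the names of elements declared as direct children of xsd:schema."""
    root = ET.fromstring(schema_text)
    return {el.get("name") for el in root.findall("{%s}element" % XSD_NS)}


def root_is_declared(schema_text, document_text):
    """Check whether the document's root tag matches a top-level element name."""
    root_tag = ET.fromstring(document_text).tag
    local_name = root_tag.rsplit("}", 1)[-1]  # drop a {namespace} prefix if present
    return local_name in top_level_elements(schema_text)


# Same schema with the element declaration suggested above added.
FIXED_SCHEMA = SCHEMA.replace(
    "</xsd:schema>",
    '  <xsd:element name="DataPackage" type="DataPackage"/>\n</xsd:schema>',
)

print(root_is_declared(SCHEMA, DOCUMENT))
print(root_is_declared(FIXED_SCHEMA, DOCUMENT))
```

After changing the real schema, the bindings have to be regenerated with pyxbgen before CreateFromDocument will see the new element.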
The following is an excerpt of an XML file I have.
<?xml version="1.0" encoding="utf-8" ?>
<configuration>
<configSections>
<sectionGroup name="applicationSettings" type="System.Configuration.ApplicationSettingsGroup, System, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089">
<section name="WindowsApplication1.Properties.Settings" type="System.Configuration.ClientSettingsSection, System, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089" />
</sectionGroup>
<sectionGroup name="userSettings" type="System.Configuration.UserSettingsGroup, System, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089">
<section name="WindowsApplication1.Properties.Settings" type="System.Configuration.ClientSettingsSection, System, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089" allowExeDefinition="MachineToLocalUser" />
</sectionGroup>
</configSections>
<appSettings>
<add key="webpages:Enabled" value="false" />
</appSettings>
<applicationSettings>
<WindowsApplication1.Properties.Settings>
<setting name="Cursor" serializeAs="String">
<value>Default</value>
</setting>
<setting name="DoubleBuffering" serializeAs="String">
<value>False</value>
</setting>
</WindowsApplication1.Properties.Settings>
</applicationSettings>
<userSettings>
<WindowsApplication1.Properties.Settings>
<setting name="FormTitle" serializeAs="String">
<value>Form1</value>
</setting>
<setting name="FormSize" serializeAs="String">
<value>595, 536</value>
</setting>
</WindowsApplication1.Properties.Settings>
</userSettings>
</configuration>
I want to check if the value for key="webpages:Enabled" is true or false and find the name of the user who modified the file.
I tried the approaches from '"Last modified by" (user name, not time) attribute for xlsx using Python' and 'How to retrieve the author of an office file in python?', but I can't work out how to retrieve author information for an XML file. Is it possible to retrieve, in Python, the name of the user who modified the file?
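For the first half of the question, reading the value is a plain lookup. Below is a minimal sketch with the standard library, assuming the appSettings layout shown in the excerpt; get_app_setting is a hypothetical helper name and the XML is trimmed to the relevant part. The excerpt itself contains no author or modifier field, so this sketch does not attempt the second half.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the configuration excerpt; only appSettings is kept.
CONFIG_XML = """<configuration>
  <appSettings>
    <add key="webpages:Enabled" value="false" />
  </appSettings>
</configuration>"""


def get_app_setting(xml_text, key):
    """Return the value attribute of the <add> entry with the given key, or None."""
    root = ET.fromstring(xml_text)
    for entry in root.findall("./appSettings/add"):
        if entry.get("key") == key:
            return entry.get("value")
    return None


value = get_app_setting(CONFIG_XML, "webpages:Enabled")
# The attribute is a string, so compare text rather than relying on truthiness.
enabled = value is not None and value.strip().lower() == "true"
print(value, enabled)
```

For a file on disk, ET.parse(path).getroot() can replace ET.fromstring.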
I'm using the Python library zeep to talk to a SOAP service. One of the required arguments in the documentation is of type List<String> and in the WSDL I found this:
<xs:element minOccurs="0" maxOccurs="1" name="IncludedLenders" type="tns:ArrayOfString"/>
And I believe ArrayOfString is defined as:
<xs:complexType name="ArrayOfString">
<xs:sequence>
<xs:element minOccurs="0" maxOccurs="unbounded" name="string" nillable="true" type="xs:string"/>
</xs:sequence>
</xs:complexType>
How do I make zeep generate the values for that? I tried with:
"IncludedLenders": [
"BMS",
"BME"
]
but that generates:
<ns0:IncludedLenders>
<ns0:string>BMS</ns0:string>
</ns0:IncludedLenders>
instead of:
<ns0:IncludedLenders>
<ns0:string>BMS</ns0:string>
<ns0:string>BME</ns0:string>
</ns0:IncludedLenders>
Any ideas how to generate the latter?
I figured it out. First I needed to extract the ArrayOfString type:
array_of_string_type = client.get_type("ns1:ArrayOfString")
and then create it this way:
"IncludedLenders": array_of_string_type(["BMS","BME"])
I encountered a problem creating an XSD schema in Python with the lxml library.
I have prepared the XSD schema file below (the content was cut to the minimum):
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" attributeFormDefault="qualified"
version="2.4">
<xs:annotation>
<xs:documentation xml:lang="de">Bundeseinheitlicher Medikationsplan</xs:documentation>
</xs:annotation>
<xs:element name="MP">
<xs:annotation>
<xs:documentation>Bundeseinheitlicher Medikationsplan</xs:documentation>
</xs:annotation>
<xs:complexType>
<xs:attribute name="p" use="prohibited">
<xs:annotation>
<xs:documentation>Name: Patchnummer</xs:documentation>
</xs:annotation>
<xs:simpleType>
<xs:restriction base="xs:int">
<xs:minInclusive value="0"/>
<xs:maxInclusive value="99"/>
</xs:restriction>
</xs:simpleType>
</xs:attribute>
</xs:complexType>
</xs:element>
</xs:schema>
When I use the lxml library to create the schema object like this:
from lxml import etree
with open('some_file.xsd') as schema_file:  # some_file.xsd is the file above
    etree.XMLSchema(file=schema_file)
I get the following error:
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
File "src/lxml/xmlschema.pxi", line 87, in lxml.etree.XMLSchema.__init__ (src/lxml/lxml.etree.c:197804)
lxml.etree.XMLSchemaParseError: Element '{http://www.w3.org/2001/XMLSchema}attribute': The content is not valid. Expected is (annotation?)., line 16
But parsing the same file with the Python standard library works without errors:
import xml.etree.ElementTree as ET
with open('some_file.xsd') as f:
    tree = ET.parse(f)
I played around a bit with the XSD file and discovered that removing
use="prohibited" from the attribute element resolves the problem with lxml, but I need that property.
What can be the reason for that? Is something wrong with the lxml library, or is the XML structure of the XSD above incorrect?
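One point about the comparison above: xml.etree.ElementTree.parse only checks that the file is well-formed XML and never compiles it as a schema, so its success says nothing about whether the XSD is valid. A minimal sketch (the deliberately broken schema and the helper name is_well_formed are made up for illustration):

```python
import xml.etree.ElementTree as ET

# Well-formed XML, but not a sensible schema: unknown type and an unknown child.
NOT_A_VALID_SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="MP" type="xs:noSuchType">
    <xs:bogus/>
  </xs:element>
</xs:schema>"""


def is_well_formed(text):
    """Return True if the text parses as XML; no schema rules are checked."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed(NOT_A_VALID_SCHEMA))   # parses despite the bogus content
print(is_well_formed("<xs:schema>"))        # unclosed tag, so this is rejected
```

The equivalent of the standard-library test in lxml is etree.parse; the error only appears once the parsed tree is handed to etree.XMLSchema.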
This question is old, but had me stumped for a bit.
Here's how I solved it.
schema_root = etree.parse(xsd_filename)
schema = etree.XMLSchema(schema_root)
xml_parser = etree.XMLParser(schema=schema, no_network=False)
Then, if you attempt to open the XML file with something like
with open(xml_filename, 'rb') as f:
    etree.fromstring(f.read(), xml_parser)
you will only get actual XMLSchemaErrors.
https://lxml.de/api/lxml.etree.XMLParser-class.html
I'm generating data structures from an XML Schema. Here is the part of the XSD file that describes the TCPInterface class:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<xsd:schema targetNamespace="tcpinterface_xsd.xsd"
xmlns:cext="tcpinterface_xsd.xsd"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified"
attributeFormDefault="unqualified">
<xsd:element name="TCPInterface">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="port" type="xsd:integer"/>
....
How can I set a default value for the "port" element in this XSD file?
Just add the attribute default="{yourInteger}" to the element. See the example below:
<xsd:element name="port" type="xsd:integer" default="2"/>
Note that, in this case, if your port element is empty before validation, the XML Infoset changes during validation and becomes the post-schema-validation infoset (PSVI), in which the port element carries the default value.
<xsd:element name="port" type="xsd:integer" default="1" />
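The effect of an element default can be sketched with the standard library. This is only an emulation under simplifying assumptions (defaults are matched by local element name, with no scoping or namespace handling), not what a schema validator does internally; collect_defaults and apply_defaults are hypothetical helper names. It also shows that the default is applied to an element that is present but empty, not to one that is absent.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Trimmed, namespace-free variant of the schema from the question.
SCHEMA = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="TCPInterface">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="port" type="xsd:integer" default="2"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>"""


def collect_defaults(schema_text):
    """Map element name -> default value for every xsd:element with a default."""
    root = ET.fromstring(schema_text)
    return {
        el.get("name"): el.get("default")
        for el in root.iter("{%s}element" % XSD_NS)
        if el.get("default") is not None
    }


def apply_defaults(document_text, defaults):
    """Fill in the default for elements that are present but empty."""
    root = ET.fromstring(document_text)
    for el in root.iter():
        local_name = el.tag.rsplit("}", 1)[-1]
        if local_name in defaults and len(el) == 0 and not el.text:
            el.text = defaults[local_name]
    return root


defaults = collect_defaults(SCHEMA)
filled = apply_defaults("<TCPInterface><port/></TCPInterface>", defaults)
print(filled.find("port").text)
```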
I'm using Python's lxml to validate XML documents against a schema. I have a schema with an element:
<xs:element name="link-url" type="xs:anyURL"/>
and I test, for example, this (part of an) xml:
<a link-url="server/path"/>
I would like this test to FAIL because the link-url doesn't start with http://. I tried switching anyURI to anyURL but this results in an exception - it's not a valid tag.
Is this possible with lxml? Is it possible at all with schema validation?
(I'm pretty sure xs:anyURL is not valid. The XML Schema standard calls it anyURI. And since link-url is an attribute, shouldn't you be using xs:attribute instead of xs:element?)
You could restrict the URIs by creating a new simpleType based on it, and put a restriction on the pattern. For example,
#!/usr/bin/env python2.6
from lxml import etree
from StringIO import StringIO
schema_doc = etree.parse(StringIO('''
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:simpleType name="httpURL">
<xs:restriction base="xs:anyURI">
<xs:pattern value='https?://.+'/>
<!-- accepts only http:// or https:// URIs. -->
</xs:restriction>
</xs:simpleType>
<xs:element name="a">
<xs:complexType>
<xs:attribute name="link-url" type="httpURL"/>
</xs:complexType>
</xs:element>
</xs:schema>
''')) #/
schema = etree.XMLSchema(schema_doc)
schema.assertValid(etree.parse(StringIO('<a link-url="http://sd" />')))
assert not schema(etree.parse(StringIO('<a link-url="server/path" />')))
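To try the pattern on its own, without lxml: an xs:pattern facet has to match the whole value, which in Python corresponds to re.fullmatch rather than re.search. The sketch below assumes the simple pattern above, which uses no XSD-specific regex syntax, so it carries over to Python's re module unchanged; is_http_url is a hypothetical helper name.

```python
import re

# Same expression as the xs:pattern facet in the schema above.
HTTP_URL = re.compile(r"https?://.+")


def is_http_url(value):
    """Mimic the facet: the pattern must match the entire value."""
    return HTTP_URL.fullmatch(value) is not None


for candidate in ("http://sd", "https://example.org/x", "server/path", "see http://sd"):
    print(candidate, is_http_url(candidate))
```

The last candidate is rejected because the match must cover the whole string, which is also how the schema pattern behaves.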