I'm generating data structures from an XML Schema. Here is the part of the XSD file that describes the TCPInterface class:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<xsd:schema targetNamespace="tcpinterface_xsd.xsd"
xmlns:cext="tcpinterface_xsd.xsd"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified"
attributeFormDefault="unqualified">
<xsd:element name="TCPInterface">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="port" type="xsd:integer"/>
....
How can I set a default value for the "port" element in this XSD file?
Just add the attribute default="{yourInteger}" to the element. See the example below:
<xsd:element name="port" type="xsd:integer" default="2"/>
Note that, in this case, if your port element is empty before validation, the XML Infoset changes after validation and becomes the post-schema-validation infoset (PSVI), with the default value assigned to the port element.
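To make that effect concrete, here is a minimal Python sketch that only mimics what a validating parser does with default. It is not a schema validator: the helper names collect_defaults and apply_defaults are made up for illustration, the schema is a reduced stand-in for the fragment in the question, and matching declarations by local element name is a simplifying assumption.

```python
from xml.etree import ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Reduced stand-in for the schema in the question (assumption: only "port" matters here).
SCHEMA = f"""<xsd:schema xmlns:xsd="{XSD_NS}" targetNamespace="tcpinterface_xsd.xsd"
    elementFormDefault="qualified">
  <xsd:element name="TCPInterface">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="port" type="xsd:integer" default="2"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>"""


def collect_defaults(schema_text):
    """Map element name -> default value for each declaration that has one."""
    root = ET.fromstring(schema_text)
    return {
        decl.get("name"): decl.get("default")
        for decl in root.iter(f"{{{XSD_NS}}}element")
        if decl.get("default") is not None
    }


def apply_defaults(instance_text, defaults):
    """Fill in the default for elements that have no content at all."""
    root = ET.fromstring(instance_text)
    for elem in root.iter():
        local = elem.tag.rsplit("}", 1)[-1]  # strip the "{namespace}" part
        is_empty = len(elem) == 0 and not elem.text
        if local in defaults and is_empty:
            elem.text = defaults[local]
    return root


defaults = collect_defaults(SCHEMA)
empty_doc = '<TCPInterface xmlns="tcpinterface_xsd.xsd"><port/></TCPInterface>'
filled = apply_defaults(empty_doc, defaults)
```

An element that already carries a value, such as <port>8080</port>, is left untouched by this sketch, which matches the rule that the default only applies to an empty element.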
<xsd:element name="port" type="xsd:integer" default="1" />
The following is an excerpt of an XML file I have.
<?xml version="1.0" encoding="utf-8" ?>
<configuration>
<configSections>
<sectionGroup name="applicationSettings" type="System.Configuration.ApplicationSettingsGroup, System, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089">
<section name="WindowsApplication1.Properties.Settings" type="System.Configuration.ClientSettingsSection, System, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089" />
</sectionGroup>
<sectionGroup name="userSettings" type="System.Configuration.UserSettingsGroup, System, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089">
<section name="WindowsApplication1.Properties.Settings" type="System.Configuration.ClientSettingsSection, System, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089" allowExeDefinition="MachineToLocalUser" />
</sectionGroup>
</configSections>
<appSettings>
<add key="webpages:Enabled" value="false" />
</appSettings>
<applicationSettings>
<WindowsApplication1.Properties.Settings>
<setting name="Cursor" serializeAs="String">
<value>Default</value>
</setting>
<setting name="DoubleBuffering" serializeAs="String">
<value>False</value>
</setting>
</WindowsApplication1.Properties.Settings>
</applicationSettings>
<userSettings>
<WindowsApplication1.Properties.Settings>
<setting name="FormTitle" serializeAs="String">
<value>Form1</value>
</setting>
<setting name="FormSize" serializeAs="String">
<value>595, 536</value>
</setting>
</WindowsApplication1.Properties.Settings>
</userSettings>
</configuration>
I want to check if the value for key="webpages:Enabled" is true or false and find the name of the user who modified the file.
I tried the approaches from "'Last modified by' (user name, not time) attribute for xlsx using Python" and "How to retrieve the author of an office file in python?", but I can't work out how to retrieve author information for an XML file. Is it possible to retrieve, in Python, the name of the user who modified the file?
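For the first half of the question, a minimal sketch with the standard library might look like the following. The helper names app_setting and as_bool are made up for illustration, and the embedded CONFIG string is a cut-down copy of the excerpt above. The sketch does not address the "modified by" part: a plain XML file has no counterpart to the document properties that are stored inside an .xlsx package.

```python
from xml.etree import ElementTree as ET

# Cut-down copy of the configuration excerpt (assumption: only appSettings matters here).
CONFIG = """<configuration>
  <appSettings>
    <add key="webpages:Enabled" value="false" />
  </appSettings>
</configuration>"""


def app_setting(root, key):
    """Return the value attribute of the <add> entry with the given key, or None."""
    for add in root.findall("./appSettings/add"):
        if add.get("key") == key:
            return add.get("value")
    return None


def as_bool(text):
    """Interpret 'true'/'True' as True; anything else (including None) as False."""
    return text is not None and text.strip().lower() == "true"


root = ET.fromstring(CONFIG)
enabled = as_bool(app_setting(root, "webpages:Enabled"))
```

For a file on disk, ET.parse(path).getroot() can take the place of ET.fromstring(CONFIG).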
I'm trying to read XML with ElementTree and write the result back to disk. My long-term goal is to prettify the XML this way. However, in my naive approach, ElementTree eats all the namespace declarations in the document and I don't understand why. Here is an example
test.xsd
<?xml version='1.0' encoding='UTF-8'?>
<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'
xmlns='sdformat/pose' targetNamespace='sdformat/pose'
xmlns:pose='sdformat/pose'
xmlns:types='http://sdformat.org/schemas/types.xsd'>
<xs:import namespace='sdformat/pose' schemaLocation='./pose.xsd'/>
<xs:element name='pose' type='poseType' />
<xs:simpleType name='string'><xs:restriction base='xs:string' /></xs:simpleType>
<xs:simpleType name='pose'><xs:restriction base='types:pose' /></xs:simpleType>
<xs:complexType name='poseType'>
<xs:simpleContent>
<xs:extension base="pose">
<xs:attribute name='relative_to' type='string' use='optional' default=''>
</xs:attribute>
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:schema>
test.py
from xml.etree import ElementTree
ElementTree.register_namespace("types", "http://sdformat.org/schemas/types.xsd")
ElementTree.register_namespace("pose", "sdformat/pose")
ElementTree.register_namespace("xs", "http://www.w3.org/2001/XMLSchema")
tree = ElementTree.parse("test.xsd")
tree.write("test_out.xsd")
Produces test_out.xsd
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="sdformat/pose">
<xs:import namespace="sdformat/pose" schemaLocation="./pose.xsd" />
<xs:element name="pose" type="poseType" />
<xs:simpleType name="string"><xs:restriction base="xs:string" /></xs:simpleType>
<xs:simpleType name="pose"><xs:restriction base="types:pose" /></xs:simpleType>
<xs:complexType name="poseType">
<xs:simpleContent>
<xs:extension base="pose">
<xs:attribute name="relative_to" type="string" use="optional" default="">
</xs:attribute>
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:schema>
Notice how test_out.xsd is missing the namespace declarations from test.xsd (apart from xmlns:xs). I would expect the two files to be identical. I verified that test.xsd is valid XML by validating it. It validates with the exception of my choice of namespace URI, which I think shouldn't matter.
Update:
Based on mzji's comment I realized that this only happens for values of attributes. With this in mind, I can manually add the namespaces like so:
from xml.etree import ElementTree
namespaces = {
    "types": "http://sdformat.org/schemas/types.xsd",
    "pose": "sdformat/pose",
    "xs": "http://www.w3.org/2001/XMLSchema",
}
for prefix, ns in namespaces.items():
    ElementTree.register_namespace(prefix, ns)
tree = ElementTree.parse("test.xsd")
root = tree.getroot()
queue = [root]
while queue:
    element: ElementTree.Element = queue.pop()
    # iterate over a copy, because root.attrib may be modified below
    for value in list(element.attrib.values()):
        try:
            prefix, _local = value.split(":")
        except ValueError:
            # no namespace prefix, nothing to do
            continue
        if prefix == "xs" or prefix not in namespaces:
            # ignore the XMLSchema namespace and unknown prefixes;
            # continue (not break) so the remaining attributes are still checked
            continue
        root.attrib[f"xmlns:{prefix}"] = namespaces[prefix]
    queue.extend(element)
tree.write("test_out.xsd")
While this solves the problem, it is quite an ugly solution. I also still don't understand why this happens in the first place, so it doesn't answer the question.
There is a valid reason for this behaviour, but it requires a good understanding of XML Schema concepts.
First, some important facts:
Your XML document is not just any old XML document. It is an XSD.
An XSD is itself described by a schema (see the "schema for schemas").
The attribute xs:restriction/@base is not an xs:string. Its type is xs:QName.
Based on the above facts, we can assert the following:
If test.xsd is parsed as an XML document, but without knowledge of the 'schema for schema', then the value of the base attribute will be treated as a string (technically, as PCDATA).
If test.xsd is parsed using a validating XML parser, with the 'schema for schema' as the XSD, then the value of the base attribute will be parsed as xs:QName.
When ElementTree writes the output XML, its behaviour should depend on the data type of base. If base is a QName then ElementTree should detect that it is using the namespace prefix 'types' and it should emit the corresponding namespace declaration.
If you are not supplying the 'schema for schema' when parsing test.xsd then ElementTree is off the hook, because it cannot possibly know that base is supposed to be interpreted as a QName.
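As a rough illustration of that distinction, the sketch below tells ElementTree by hand which attribute values are QNames, instead of relying on a schema. It collects prefix bindings from the "start-ns" events of iterparse and replaces prefixed values with ElementTree.QName objects, which the serializer then writes with a matching namespace declaration. The function name parse_keeping_qname_prefixes and the QNAME_ATTRS set are assumptions made for this example; unprefixed QName values and prefixes that are rebound further down the tree are not handled.

```python
import io
from xml.etree import ElementTree as ET

# Assumption: the attributes whose values should be treated as QNames.
QNAME_ATTRS = {"base", "type", "ref"}


def parse_keeping_qname_prefixes(xml_bytes):
    """Parse XML and turn prefixed values of QNAME_ATTRS into QName objects."""
    prefixes = {}
    root = None
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start-ns", "start"))
    for event, payload in events:
        if event == "start-ns":
            prefix, uri = payload
            prefixes[prefix] = uri
            if prefix:
                # keep the original prefix when the tree is written out again
                ET.register_namespace(prefix, uri)
            continue
        if root is None:
            root = payload
        for name, value in list(payload.attrib.items()):
            prefix, sep, local = value.partition(":")
            if name in QNAME_ATTRS and sep and prefix in prefixes:
                payload.set(name, ET.QName(prefixes[prefix], local))
    return root


SRC = (
    b"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' "
    b"xmlns:types='http://sdformat.org/schemas/types.xsd'>"
    b"<xs:simpleType name='pose'><xs:restriction base='types:pose'/></xs:simpleType>"
    b"</xs:schema>"
)
plain = ET.tostring(ET.fromstring(SRC), encoding="unicode")
fixed = ET.tostring(parse_keeping_qname_prefixes(SRC), encoding="unicode")
```

Here plain loses the xmlns:types declaration because the prefix only appears inside a string value, while fixed keeps it because the value is now a QName that the serializer knows about.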
I am trying to call "Get_Purchase_Orders" operation in python and it throws below error when response is received
TypeError error in Get_Purchase_Orders : {urn:com.workday/bsvc}Bill_To_Address_ReferenceType() got an unexpected keyword argument 'Address_Reference'. Signature: `({Bill_To_Address_Reference: {urn:com.workday/bsvc}Unique_IdentifierObjectType} | {Address_Reference: {urn:com.workday/bsvc}Address_ReferenceType[]}) Unexpected error: <class 'TypeError'>
the WSDL file is accessible here
My Findings:
Bill_To_Address_Data has two elements (Bill_To_Address_Reference and Address_Reference) that are mutually exclusive, meaning only one of the two is expected (the schema defines a choice between Bill_To_Address_Reference and Address_Reference, yet both tags are coming back in the response). Sample XML can be seen here.
The XML chunk can be seen below as well:
<bsvc:Bill_To_Address_Data>
<!-- You have a CHOICE of the next 2 items at this level -->
<!-- Optional: -->
<bsvc:Bill_To_Address_Reference bsvc:Descriptor="string">
<!-- Zero or more repetitions: -->
<bsvc:ID bsvc:type="string">string</bsvc:ID>
</bsvc:Bill_To_Address_Reference>
<!-- Zero or more repetitions: -->
<bsvc:Address_Reference>
<!-- Optional: -->
<bsvc:ID>string</bsvc:ID>
</bsvc:Address_Reference>
</bsvc:Bill_To_Address_Data>
Below is the XSD chunk for the above XML:
<xsd:complexType name="Bill_To_Address_ReferenceType">
<xsd:annotation>
<xsd:documentation>Contains a reference instance or a Address Reference ID for an existing address</xsd:documentation>
<xsd:appinfo>
<wd:Validation>
<wd:Validation_Message>The Provided Bill To Address is Invalid for this Purchase Order</wd:Validation_Message>
</wd:Validation>
<wd:Validation>
<wd:Validation_Message>The Provided Bill To Address is Invalid for this Purchase Order</wd:Validation_Message>
</wd:Validation>
</xsd:appinfo>
</xsd:annotation>
<xsd:sequence>
<xsd:choice>
<xsd:element name="Bill_To_Address_Reference" type="wd:Unique_IdentifierObjectType" minOccurs="0">
<xsd:annotation>
<xsd:documentation>Reference to an existing Ship-To address.</xsd:documentation>
</xsd:annotation>
</xsd:element>
<xsd:element name="Address_Reference" type="wd:Address_ReferenceType" minOccurs="0" maxOccurs="unbounded">
<xsd:annotation>
<xsd:documentation>Address Reference ID</xsd:documentation>
</xsd:annotation>
</xsd:element>
</xsd:choice>
</xsd:sequence>
</xsd:complexType>
I confirmed this in Oxygen by validating the XML against the XSD in the WSDL (it can also be accessed here).
Now what I want is to ignore this error and parse the response in Python using zeep.
Any help will be highly appreciated.
Your choices are:
Modify the WSDL (the XML schema part) so that both tags are allowed in the same response
Find a setting in Zeep that allows you to switch off XSD validation
Stop using Zeep, and find another tool that allows you to parse a response without validating against the WSDL
Option 1 is best because WSDL is supposed to be a contract between the service and its callers. If you don't validate then the value of using WSDL is greatly reduced.
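If you do go down the route of option 3, the parsing side can be quite small. The sketch below uses only the standard library and reads both children of Bill_To_Address_Data without any schema check. The function name read_bill_to_address and the ID values in SAMPLE are placeholders invented for this example, and how you get hold of the raw response XML is left open (zeep's raw_response setting is one candidate worth checking in its documentation).

```python
from xml.etree import ElementTree as ET

BSVC = "urn:com.workday/bsvc"
NS = {"bsvc": BSVC}

# Placeholder data shaped like the chunk in the question; the ID values are invented.
SAMPLE = """<bsvc:Bill_To_Address_Data xmlns:bsvc="urn:com.workday/bsvc">
  <bsvc:Bill_To_Address_Reference bsvc:Descriptor="Main Office">
    <bsvc:ID bsvc:type="WID">abc123</bsvc:ID>
  </bsvc:Bill_To_Address_Reference>
  <bsvc:Address_Reference>
    <bsvc:ID>ADDRESS_REFERENCE-1</bsvc:ID>
  </bsvc:Address_Reference>
</bsvc:Bill_To_Address_Data>"""


def read_bill_to_address(xml_text):
    """Collect the IDs under every Bill_To_Address_Data, ignoring the xsd:choice."""
    root = ET.fromstring(xml_text)
    results = []
    # iter() also yields the root itself if it is a Bill_To_Address_Data element
    for data in root.iter(f"{{{BSVC}}}Bill_To_Address_Data"):
        bill_to = data.findall("bsvc:Bill_To_Address_Reference/bsvc:ID", NS)
        address = data.findall("bsvc:Address_Reference/bsvc:ID", NS)
        results.append({
            "bill_to_ids": [(i.get(f"{{{BSVC}}}type"), i.text) for i in bill_to],
            "address_ids": [i.text for i in address],
        })
    return results


parsed = read_bill_to_address(SAMPLE)
```

The trade-off is the one described above: nothing in this code checks the message against the contract, so any structural surprise in the response has to be handled by your own code.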
I've got the following XML schema:
<xsd:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:complexType name="DataPackage">
<xsd:sequence>
<xsd:element name="timestamp" type="xsd:float" default="0.0"/>
<xsd:element name="type" type="xsd:string" default="None"/>
<xsd:element name="host" type="xsd:string" default="None"/>
<xsd:element name="data" type="Data" />
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="Data">
<xsd:sequence>
<xsd:element name="item" type="Item" minOccurs="0" maxOccurs="unbounded" />
</xsd:sequence>
</xsd:complexType>
<xsd:complexType name="Item">
<xsd:sequence>
<xsd:element name="key" type="xsd:string"/>
<xsd:element name="val" type="xsd:string"/>
</xsd:sequence>
</xsd:complexType>
</xsd:schema>
I used pyxbgen -u DataPackage.xsd -m DataPackage to generate the corresponding python classes and used these to generate the following xml code:
<?xml version="1.0" encoding="utf-8"?>
<DataPackage>
<timestamp>1378970933.29</timestamp>
<type>None</type>
<host>Client 1</host>
<data>
<item>
<key>KEY1</key>
<val>value1</val>
</item>
</data>
</DataPackage>
If I try to read this using the following in the Python interpreter:
import DataPackage
xml = open("dataPackage-Test.xml").read()
data = DataPackage.CreateFromDocument(xml)
I get the exception:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "DataPackage.py", line 54, in CreateFromDocument
instance = handler.rootObject()
File "/usr/local/lib/python2.7/dist-packages/pyxb/binding/saxer.py", line 274, in rootObject
raise pyxb.UnrecognizedDOMRootNodeError(self.__rootObject)
pyxb.exceptions_.UnrecognizedDOMRootNodeError: <pyxb.utils.saxdom.Element object at 0x9c7c76c>
Does anyone have an idea what's wrong?
Your schema defines a top-level complex type named DataPackage, but does not define any top-level elements. Thus the DOM element DataPackage has no corresponding element that PyXB can use to process it.
You need to add something like:
<xsd:element name="DataPackage" type="DataPackage"/>
Note that in XML Schema the namespaces for elements and types are distinct, but in Python they are not, so PyXB will rename one of them (the complex type in this case) to avoid the conflict. See http://pyxb.sourceforge.net/arch_binding.html?highlight=conflict#deconflicting-names
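A quick way to check a schema for this situation before generating bindings is to list its global element declarations. The sketch below uses only the standard library; the function name top_level_elements and the two cut-down schemas are made up for illustration and are not part of PyXB.

```python
from xml.etree import ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Cut-down schemas for illustration: one with only a type, one with a global element.
WITHOUT_ELEMENT = f"""<xsd:schema xmlns:xsd="{XSD_NS}">
  <xsd:complexType name="DataPackage">
    <xsd:sequence>
      <xsd:element name="host" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>"""

WITH_ELEMENT = WITHOUT_ELEMENT.replace(
    "</xsd:schema>",
    '  <xsd:element name="DataPackage" type="DataPackage"/>\n</xsd:schema>',
)


def top_level_elements(schema_text):
    """Return the names of element declarations that are direct children of xsd:schema."""
    root = ET.fromstring(schema_text)
    # findall() with a plain tag only looks at direct children, so nested
    # declarations such as "host" are not reported.
    return [decl.get("name") for decl in root.findall(f"{{{XSD_NS}}}element")]


missing = top_level_elements(WITHOUT_ELEMENT)
present = top_level_elements(WITH_ELEMENT)
```

An empty result means no document root can be matched against the schema, which is the situation behind the UnrecognizedDOMRootNodeError above.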
I'm using python's lxml to validate xmls against a schema. I have a schema with an element:
<xs:element name="link-url" type="xs:anyURL"/>
and I test, for example, this (part of an) xml:
<a link-url="server/path"/>
I would like this test to FAIL because the link-url doesn't start with http://. I tried switching anyURI to anyURL but this results in an exception - it's not a valid tag.
Is this possible with lxml? Is it possible at all with schema validation?
(I'm pretty sure xs:anyURL is not valid. The XML Schema standard calls it anyURI. And since link-url is an attribute, shouldn't you be using xs:attribute instead of xs:element?)
You could restrict the URIs by creating a new simpleType based on it, and put a restriction on the pattern. For example,
#!/usr/bin/env python3
from io import StringIO
from lxml import etree
schema_doc = etree.parse(StringIO('''
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:simpleType name="httpURL">
<xs:restriction base="xs:anyURI">
<xs:pattern value='https?://.+'/>
<!-- accepts only http:// or https:// URIs. -->
</xs:restriction>
</xs:simpleType>
<xs:element name="a">
<xs:complexType>
<xs:attribute name="link-url" type="httpURL"/>
</xs:complexType>
</xs:element>
</xs:schema>
'''))
schema = etree.XMLSchema(schema_doc)
schema.assertValid(etree.parse(StringIO('<a link-url="http://sd" />')))
assert not schema(etree.parse(StringIO('<a link-url="server/path" />')))