How to get attribute value of XML element with saxonc.PySaxonProcessor - python

I am analyzing the use of Saxon XSLT processing in Python 3.8 with Saxon-C-1.2.0 in Windows 10.
I can successfully run the script SaxonHEC.1.2.0.\Saxon.C.API\python-saxon\saxon_example.py.
The last print line shows the result of getting an attribute value. I think there is an error in that code.
My question: how to get the value of an XML attribute?
with saxonc.PySaxonProcessor(license=False) as proc:
    # ... code left out from saxon_example.py
    xml2 = """\
<out>
<person att1='value1' att2='value2'>text1</person>
<person>text2</person>
<person>text3</person>
</out>
"""
    node2 = proc.parse_xml(xml_text=xml2)
    outNode = node2.children
    children = outNode[0].children
    attrs = children[1].attributes
    if len(attrs) == 2:
        print('node.children[1].attributes[1].string_value =', attrs[1].string_value)
        print('node.children[1].attributes[1] =', attrs[1])
        print('node.children[1].attributes[1].__str__ =', attrs[1].__str__())
        print('node.children[1].attributes[1].__repr__ =', attrs[1].__repr__())
        print('node.children[1].attributes[1].text =', attrs[1].text)
On the commandline I get:
node.children[1].attributes[1].string_value = att2="value2"
node.children[1].attributes[1] = att2="value2"
node.children[1].attributes[1].__str__ = att2="value2"
node.children[1].attributes[1].__repr__ = att2="value2"
Traceback (most recent call last):
  File "test-app.py", line 77, in <module>
    print('node.children[1].attributes[1].text =', attrs[1].text)
AttributeError: 'saxonc.PyXdmNode' object has no attribute 'text'
while I expect to see only "value2" without the attribute name.

As mentioned in the comment, this is a bug (see the link to the bug issue, where we describe the fix): the string_value property should be calling the underlying C++ method getStringValue. The fix will be available in the next maintenance release.
In the meantime, another workaround is to use the get_attribute_value function, provided you know the names of the attributes.
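Until the fixed release is installed, the value can also be recovered from the name="value" form that string_value currently returns. The following is only a minimal sketch in plain Python 3 and does not touch the Saxon/C API: the helper name attribute_value is hypothetical, and it assumes the string always looks like the output shown in the question and that the serializer applies standard XML escaping.

```python
from xml.sax.saxutils import unescape


def attribute_value(serialized):
    """Extract the value from a serialized attribute such as att2="value2".

    Hypothetical helper: works on the string only, not on a PyXdmNode.
    """
    name, sep, quoted = serialized.partition("=")
    quoted = quoted.strip()
    # If the input is not in name="value" form, assume it is already a bare value.
    if (not sep or len(quoted) < 2
            or quoted[0] != quoted[-1] or quoted[0] not in "\"'"):
        return serialized
    # Drop the surrounding quotes and undo XML escaping (an assumption
    # about how the attribute was serialized).
    return unescape(quoted[1:-1], {"&quot;": '"', "&apos;": "'"})


print(attribute_value('att2="value2"'))
```

With the objects from the question this would be called as attribute_value(attrs[1].string_value). Once string_value returns the bare value, the fallback branch passes it through unchanged, so the call does not need to be removed immediately.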

Related

Python jsonpickle error: 'OrderedDict' object has no attribute '_OrderedDict__root'

I'm hitting this exception with jsonpickle, when trying to pickle a rather complex object that unfortunately I'm not sure how to describe here. I know that makes it tough to say much, but for what it's worth:
>>> frozen = jsonpickle.encode(my_complex_object_instance)
>>> thawed = jsonpickle.decode(frozen)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Python/2.7/site-packages/jsonpickle/__init__.py", line 152, in decode
    return unpickler.decode(string, backend=backend, keys=keys)
  :
  :
  File "/Library/Python/2.7/site-packages/jsonpickle/unpickler.py", line 336, in _restore_from_dict
    instance[k] = value
  File "/Library/Python/2.7/site-packages/botocore/vendored/requests/packages/urllib3/packages/ordered_dict.py", line 49, in __setitem__
    root = self.__root
AttributeError: 'OrderedDict' object has no attribute '_OrderedDict__root'
I didn't find much help when googling the error. I do see that what looks like the same issue was resolved some time ago for simpler objects:
https://github.com/jsonpickle/jsonpickle/issues/33
The cited example in that report works for me:
>>> jsonpickle.decode(jsonpickle.encode(collections.OrderedDict()))
OrderedDict()
>>> jsonpickle.decode(jsonpickle.encode(collections.OrderedDict(a=1)))
OrderedDict([(u'a', 1)])
Has anyone ever run into this themselves and found a solution? I ask with the understanding that my case may be "differently idiosyncratic" than another known example.
For me, it is the requests module that seems to run into problems when I call .decode(). After looking at the jsonpickle code a bit, I decided to fork it and change the following lines to see what was going on (and I ended up keeping a private copy of jsonpickle with the changes so I can move forward).
In jsonpickle/unpickler.py (in my version it's line 368), search for the if statement section in the method _restore_from_dict():
if (util.is_noncomplex(instance) or
        util.is_dictionary_subclass(instance)):
    instance[k] = value
else:
    setattr(instance, k, value)
and change it to this (it will log an error for each assignment that fails; you can then either keep the code in place or change the OrderedDict implementation that relies on __root):
if (util.is_noncomplex(instance) or
        util.is_dictionary_subclass(instance)):
    # Currently requests.adapters.HTTPAdapter is using a non-standard
    # version of OrderedDict which doesn't have a _OrderedDict__root
    # attribute
    try:
        instance[k] = value
    except AttributeError as e:
        import logging
        import pprint
        warnmsg = 'Unable to unpickle {}[{}]={}'.format(
            pprint.pformat(instance), pprint.pformat(k), pprint.pformat(value))
        logging.error(warnmsg)
else:
    setattr(instance, k, value)
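A likely explanation for the missing attribute, offered as an assumption rather than something confirmed against the jsonpickle or botocore sources: a pure-Python OrderedDict creates its private __root attribute (name-mangled to _OrderedDict__root) in __init__, so an instance that is rebuilt without __init__ being called fails on the first item assignment. The sketch below reproduces that situation in Python 3 with a tiny stand-in class; both the class body and restore_without_init are hypothetical and only mimic the pattern seen in the traceback.

```python
class OrderedDict(dict):
    """Tiny stand-in for a pure-Python OrderedDict backport (hypothetical)."""

    def __init__(self):
        dict.__init__(self)
        self.__root = []  # stored on the instance as _OrderedDict__root

    def __setitem__(self, key, value):
        root = self.__root  # fails if __init__ never ran
        if key not in self:
            root.append(key)
        dict.__setitem__(self, key, value)


def restore_without_init(cls, items):
    """Rebuild an instance the way an unpickler might: __new__, then set items."""
    instance = cls.__new__(cls)  # __init__ is deliberately skipped
    for k, v in items.items():
        instance[k] = v
    return instance


try:
    restore_without_init(OrderedDict, {"a": 1})
except AttributeError as exc:
    message = str(exc)
    print(message)
```

If this is indeed the cause, the try/except patch above only hides the failed assignments; the restored object would still be missing those items.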

python: 'str' object is not callable?

I have gone through many similar posts here and there, but none of them seems to solve my problem. I have a method that searches for a file path:
def getDumpFile(self):
    self.saveDump()
    dumpname = str(self.filename) + '-01.netxml'
    filepath = os.path.join('/some/path/to/file', dumpname)
    try:
        if os.path.exists(filepath):
            logging.debug("Filepath " + str(filepath))
            return filepath
        else:
            logging.debug("File Not Found")
            return None
    except OSError as e:
        logging.debug("File not created: " + str(e))
        return None
and in the main function I call this function like this:
xmlfile = wscanner.getDumpFile()
When I execute the code above, it finds the correct path in the getDumpFile() method, but the server raises an exception:
Unexpected exception in wireless.views.attackAP with type <type 'exceptions.TypeError'> and error 'str' object is not callable
I really don't know why assigning the file path to the xmlfile variable (which I believe is never initialized before) could cause an error. Please help. Thanks.
Edit: It is actually the line xmlfile = wscanner.getDumpFile() that raises the error, but I don't know why. Commenting out this line gets rid of the error, but I need this path later on.
This is why I enjoy StackOverflow -- it causes you to really plunge deeper.
The last poster is 100% correct. I wrote a quick class to demo the problem. If I had to go on what we know from the poster, I'd suggest to take a closer look at references to getDumpFile, to ensure someone is not accidentally assigning a string value to it:
class MyClass:
    def getDumpFile(self):
        pass

myclass = MyClass()
myclass.getDumpFile = 'hello world'  # the method is now shadowed by a string
myclass.getDumpFile()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object is not callable
There's a bit of 'non-pythonic' stuff going on in this module. But let's ignore that for a moment.
You're saying the error message comes from calling method:
xmlfile = wscanner.getDumpFile()
If I had to guess, I'd say 'wscanner' is not defined how you think it is -- and specifically, the python interpreter thinks it's a string.
Try adding this call right before the call to getDumpFile()
print type(wscanner)
See what it shows.
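The same kind of check can be applied to the method itself. The sketch below, written for Python 3, uses a hypothetical Scanner class and an invented return path as stand-ins for the poster's code. It shows one way the attribute can end up holding a string (storing the call's result back onto the instance under the method's name) and how type() and callable() reveal it before the call fails.

```python
class Scanner:
    """Hypothetical stand-in for the poster's scanner class."""

    def getDumpFile(self):
        return "/some/path/to/file/dump-01.netxml"  # invented example path


wscanner = Scanner()
was_callable = callable(wscanner.getDumpFile)  # True: still a bound method

# Accidental shadowing: the result of the call replaces the method
# on this instance.
wscanner.getDumpFile = wscanner.getDumpFile()

print(type(wscanner.getDumpFile))        # now a str, not a method
is_callable = callable(wscanner.getDumpFile)

try:
    wscanner.getDumpFile()
except TypeError as exc:
    error = str(exc)
    print(error)
```

Searching the code base for assignments to getDumpFile (or to wscanner itself) is usually the quickest way to find where the string comes from.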

python list substring

I am trying to read the variables from newreg.py (e.g. state, district, dcode, etc.; a long list that in turn picks up its data from a web form) into insertNew.py.
I have currently read the whole file into a list named 'lines'. Now, how do I filter each variable (state, district, etc.; approx. 50-55 variables) out of the list 'lines'? The list also contains HTML code, as I have read the whole web page into it.
Is there a better, more efficient way to do it?
Once I am able to read each variable, I need to concatenate these values (converted into strings) and insert them into MongoDB.
Lastly, when the data has been inserted into the DB, the 'home.py' page opens.
I am giving these details so that the complete picture is available for a solution. I hope I have been able to keep it simple as well as complete.
I want to loop over the list (sample below) and filter out the variables (the names before the '=' sign). The following is in 'newreg.py':
state = form.getvalue('state','ERROR')
district = form.getvalue('district','ERROR')
dcode = form.getvalue('Dcode','ERROR')
I read a file / page into a list
fp = open('/home/dev/wsgi-scripts/newreg.py','r')
lines = fp.readlines()
so that I can create a dictionary to insert into MongoDB, e.g.
info = {'state' : state , 'district' : district, . . . . }
{key : value}, where each value is the corresponding variable from the list above.
Thanks
But I am getting the following error when I do
print getattr(newreg, 'state')
The error is:
>>> print getattr(newreg, 'state')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'state'
I also tried
>>> print newreg.state
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'state'
This is how I added the module
>>> import os,sys
>>> sys.path.append('/home/dev/wsgi-scripts/')
>>> import newreg
>>> newreg_vars = dir(newreg)
>>> print newreg_vars
['Connection', 'Handler', '__builtins__', '__doc__', '__file__', '__name__',
'__package__', 'application', 'cgi', 'datetime', 'os', 'sys', 'time']
Handler in the list above is a class defined in newreg.py as follows:
#!/usr/bin/env python
import os, sys
import cgi
from pymongo import Connection
import datetime
import time

class Handler:
    def do(self, environ, start_response):
        form = cgi.FieldStorage(fp=environ['wsgi.input'],
                                environ=environ)
        state = form.getvalue('state','<font color="#FF0000">ERROR</font>')
        district = form.getvalue('district','<font color="#FF0000">ERROR</font>')
        dcode = form.getvalue('Dcode','<font color="#FF0000">ERROR</font>')
I am assuming you want to copy the variables from one Python module to another at runtime.
import newreg
newreg_vars = dir(newreg)
print newreg_vars
will print all of the attributes of the module "newreg".
To read the variables from the module:
print getattr(newreg, 'state')
print getattr(newreg, 'district')
print getattr(newreg, 'dcode')
or if you know the names of the attributes:
print newreg.state
print newreg.district
print newreg.dcode
To change the attributes into strings, use a list comprehension (or a generator):
newreg_strings = [str(item) for item in newreg_vars]
This will save you lots of effort, as you will not have to parse "newreg" as a text file with re.
As a side note: Type conversion is not concatenation (although concatenation may involve type conversion in some other programming languages).
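One caveat, which may explain the AttributeError shown in the question: dir() and getattr() only see names bound at module level. In the posted newreg.py, state, district and dcode are assigned inside Handler.do, so they are locals of that method and exist only while a request is being handled. The sketch below (Python 3) illustrates the difference with a throwaway module built from a string; the module name newreg_demo and its contents are hypothetical.

```python
import types

# Source of a small stand-in module: one module-level name, and one
# name that is local to a method, as in the posted Handler.do.
source = '''
district = "module-level value"

class Handler:
    def do(self):
        state = "local value"  # local to do(); never a module attribute
        return state
'''

demo = types.ModuleType("newreg_demo")  # hypothetical stand-in for newreg
exec(source, demo.__dict__)

has_district = hasattr(demo, "district")
has_state = hasattr(demo, "state")
print(has_district, has_state)  # True False
```

Under that reading, the dictionary for MongoDB would have to be built inside the method where the form values exist (or returned from it), rather than read off the module afterwards.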

renderContents in beautifulsoup (python)

The code I'm trying to get working is:
h = str(heading)
# '<h1>Heading</h1>'
heading.renderContents()
I get this error:
Traceback (most recent call last):
  File "<pyshell#6>", line 1, in <module>
    print h.renderContents()
AttributeError: 'str' object has no attribute 'renderContents'
Any ideas?
I have a string with HTML tags and I need to clean it. If there is a different way of doing that, please suggest it.
Your error message and your code sample don't line up. You say you're calling:
heading.renderContents()
But your error message says you're calling:
print h.renderContents()
Which suggests that perhaps you have a bug in your code, trying to call renderContents() on a string object that doesn't define that method.
In any case, it would help if you checked what type of object heading is to make sure it's really a BeautifulSoup instance. This works for me with BeautifulSoup 3.2.0:
from BeautifulSoup import BeautifulSoup
heading = BeautifulSoup('<h1>heading</h1>')
repr(heading)
# '<h1>heading</h1>'
print heading.renderContents()
# <h1>heading</h1>
print str(heading)
# <h1>heading</h1>
h = str(heading)
print h
# <h1>heading</h1>
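For the underlying goal of cleaning the tags out of a string, here is a minimal sketch that uses only the Python 3 standard library (html.parser) instead of BeautifulSoup. The names TextExtractor and strip_tags are hypothetical, and the sketch simply keeps every text node, which may or may not match what "clean" means for your data.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text between tags and ignores the tags themselves."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(markup):
    """Return the text content of an HTML string with all tags removed."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


text = strip_tags("<h1>Heading</h1>")
print(text)  # Heading
```

Note that the input here is a plain string, which is exactly the case where calling renderContents() fails.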

AttributeError: xmlNode instance has no attribute 'isCountNode'

I'm using libxml2 in a Python app I'm writing, and am trying to run some test code to parse an XML file. The program downloads an XML file from the internet and parses it. However, I have run into a problem.
With the following code:
xmldoc = libxml2.parseDoc(gfile_content)
droot = xmldoc.children  # Get document root
dchild = droot.children  # Get child nodes
while dchild is not None:
    if dchild.type == "element":
        print "\tAn element with ", dchild.isCountNode(), "child(ren)"
        print "\tAnd content", repr(dchild.content)
    dchild = dchild.next
xmldoc.freeDoc()
...which is based on the code example found on this article on XML.com, I receive the following error when I attempt to run this code on Python 2.4.3 (CentOS 5.2 package).
Traceback (most recent call last):
  File "./xml.py", line 25, in ?
    print "\tAn element with ", dchild.isCountNode(), "child(ren)"
AttributeError: xmlNode instance has no attribute 'isCountNode'
I'm rather stuck here.
Edit: I should note here I also tried IsCountNode() and it still threw an error.
isCountNode should read "lsCountNode": the first letter is a lower-case "L", not an upper-case "i".
