Scientific Notation and conversions from DB output into JSON in Python

I'm writing Python code that tests a REST endpoint: it fetches scientific-notation numbers from a DB and validates that they come back from the database in correct JSON scientific-number format.
The issue I'm having is that some of these numbers get converted. For instance, the JSON loader converts the e to upper case, and some values are expanded into plain numbers without an exponent. Here is some example code. It isn't exactly what I'm doing, since you won't have the DB back end.
import json
import decimal

class DecimalEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, decimal.Decimal):
            print 'In here: ' + str( o )
            return str(o)
        return super(DecimalEncoder, self).default(o)

class JSONUtils:
    def __init__( self, response ):
        self.response = response
        self.jsonData = None
        self.LoadData( )
        # print 'jsonData: ' + json.dumps( self.jsonData, cls=DecimalEncoder, indent=2 )
    def GetData( self ):
        return self.jsonData
    def GetAsStr( self ):
        return json.dumps(self.GetData(), cls=DecimalEncoder )
    def LoadData ( self ):
        if ( self.jsonData == None ):
            if ( type( self.response ) == str or type( self.response ) == unicode ):
                print '****type1'
                self.jsonData = json.loads(self.response, parse_float=decimal.Decimal )
            elif ( type( self.response ) == dict ):
                print '****type2'
                dump = json.dumps( self.response, cls=DecimalEncoder )
                self.jsonData = json.loads(dump, parse_float=decimal.Decimal)
    def GetJSONChunk( self, path ):
        returnValue = ''
        curPath = ''
        try:
            if ( type( path ) == str ):
                returnValue = self.jsonData[path]
            elif (type( path ) == list):
                temp = ''
                firstTime = True
                for curPath in path:
                    if firstTime == True:
                        temp = self.jsonData[curPath]
                        firstTime = False
                    else:
                        temp = temp[curPath]
                returnValue = temp
            else:
                print 'Unknown type in GetJSONChunk: ' + unicode( type( path ))
        # ti, kNoNode and kInvalidIndex come from the rest of the test harness (not shown here)
        except KeyError as err:
            ti.DBG_OUT( 'JSON chunk doesn\'t have value: ' + unicode( path ))
            returnValue = self.kNoNode
        except IndexError as err:
            ti.DBG_OUT( 'Index does not exist: ' + unicode( curPath ))
            returnValue = self.kInvalidIndex
        return returnValue

info = { "fldName":1.47e-10 } # converts to 1.47E-10 (e to E)
# info = { "fldName":1.47e10 } # converts to 14700000000.0
# info = { "fldName":1.47e+10 } # converts to 14700000000.0
# info = { "fldName":1.47E+10 } # converts to 14700000000.0
# info = { "fldName":12345678901234567890123 } # example shows larger # support
print 'info: ' + str ( info )
myJSON = JSONUtils( info )
print 'type: ' + str( myJSON.jsonData )
print 'myJSON: ' + myJSON.GetAsStr( )
value = myJSON.GetJSONChunk ( 'fldName' )
print 'Type: ' + str( type( value ))
print value
What I need to do is compare the DB result to an expected value. Is there a way to identify ONLY scientific numbers (NOT doubles / decimal values) and return those as strings? As you can see, I'm already trying to protect doubles that are returned, to be sure they meet the criteria / capabilities of the back-end database, which can be 20+ digits to the left or right of the decimal point.
The actual results are documented by each of the lines of code that start with # info.

I'm not entirely clear on your question, so if this is way off base let me know.
I think there may be some confusion between how Python displays a number vs. the actual value of the number.
For example, I can write the number 1000 as:
>>> 1000.0
1000.0
>>> 1E3
1000.0
>>> 10E2
1000.0
>>> 1e3
1000.0
>>> 1e+3
1000.0
Those are all different representations of the number, but they are all numerically equivalent. JSON is similarly flexible; all of the above representations are also valid in JSON.
Similarly, I can write:
10000000000000000000000000000.0
But the print statement will display it as:
1e+28
But it's still the same number. It hasn't been "converted" in any way. Python displays a float in E notation once its value is >= 1E16 (and, at the other end, once it is smaller than 1E-4).
So if I receive JSON that looks like this:
{
    "val1": 1e+3,
    "val2": 1e+20
}
The output of the following:
import json
values = json.loads('{"val1": 1e+3, "val2": 1e+20}')
for k, v in values.items():
    print(k, '=', v)
Would be:
val1 = 1000.0
val2 = 1e+20
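If the goal is to check the exact text the server sent, one option is to stop the number from becoming a float at all. The standard json module hands the literal text of every JSON number that has a fraction or an exponent to the parse_float hook, so the hook can decide what to keep. The sketch below is only an illustration under that assumption; keep_exponent_text and the sample payload are made-up names, not part of the original code.

```python
import json
from decimal import Decimal

def keep_exponent_text(raw):
    # json.loads calls this with the literal text of each float-like token.
    # Tokens written with an exponent are returned untouched as strings;
    # everything else becomes a Decimal so no digits are lost.
    if 'e' in raw or 'E' in raw:
        return raw
    return Decimal(raw)

payload = '{"a": 1.47e-10, "b": 1.47E+10, "c": 12.5, "d": 3}'
parsed = json.loads(payload, parse_float=keep_exponent_text)

for key in sorted(parsed):
    print(key, '=', repr(parsed[key]))
```

With this hook the scientific values stay strings in exactly the spelling the endpoint returned, so they can be compared with an expected string, while plain decimals and integers keep their numeric types.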

Related

Cannot return string from python class

I'm trying to learn how to use classes correctly in Python. I'm fairly new to it, and I can't get this class to return a string containing all of its values. Ideally I'd like to be able to just send str(packet) into a network socket.
class ARP():
    dst_addr = ''
    src_addr = ''
    type = '\x08\x06'
    payload = '\x00\x01\x08\x00\x06\x04\x00'
    arptype = '\x01'
    src_mac_addr = ''
    src_ip_addr = ''
    dst_mac_addr = ''
    dst_ip_addr = ''
    def __repr_(self):
        return 'ARP'
    def __str__(self):
        return dst_addr + src_addr + type + payload + arptype \
            + src_mac_addr + src_ip_addr + dst_mac_addr + dst_ip_addr
p = ARP()
p.dst_addr = router_mac
p.src_addr = random_mac()
p.arptype = '\x02'
p.src_mac_addr = local_mac
p.src_ip_addr = ip_s2n(target_ip)
p.dst_mac_addr = router_mac
p.dst_ip_addr = ip_s2n(router_ip)
print 'PACKET: ', str(p)
return str(p)
This code outputs nothing at all. repr() outputs <__main__.ARP instance at 0x2844ea8> which I guess is what it's meant to do?

You are missing an underscore in your __repr__ method name:
def __repr_(self):
# ---------^
Python looks for __repr__, not __repr_.
Next, your __str__ method should refer to attributes on self, not to globals. Perhaps a str.join() call would be helpful here too:
def __str__(self):
    return ''.join([getattr(self, attr) for attr in (
        'dst_addr', 'src_addr', 'type', 'payload', 'arptype', 'src_mac_addr',
        'src_ip_addr', 'dst_mac_addr', 'dst_ip_addr')])
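Putting both fixes together, a minimal sketch of the class could look like the following. The FIELDS tuple and the keyword-argument constructor are additions made for this example, and the 'AA' / 'BB' addresses are placeholder values rather than real MAC addresses.

```python
class ARP(object):
    # Field order used when the packet is turned into a string.
    FIELDS = ('dst_addr', 'src_addr', 'type', 'payload', 'arptype',
              'src_mac_addr', 'src_ip_addr', 'dst_mac_addr', 'dst_ip_addr')

    def __init__(self, **fields):
        # Defaults copied from the question; address fields start empty.
        self.dst_addr = self.src_addr = ''
        self.type = '\x08\x06'
        self.payload = '\x00\x01\x08\x00\x06\x04\x00'
        self.arptype = '\x01'
        self.src_mac_addr = self.src_ip_addr = ''
        self.dst_mac_addr = self.dst_ip_addr = ''
        for name, value in fields.items():
            if name not in self.FIELDS:
                raise TypeError('unknown field: %s' % name)
            setattr(self, name, value)

    def __repr__(self):  # two trailing underscores
        return 'ARP'

    def __str__(self):
        # Look every field up on self, never as a bare global name.
        return ''.join(getattr(self, name) for name in self.FIELDS)


packet = ARP(dst_addr='AA', src_addr='BB', arptype='\x02')
print(repr(packet))
print(len(str(packet)))
```

Setting the values in __init__ gives each instance its own attributes, while the class-level defaults in the question would work just as well once __str__ reads them through self.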

How do I put these variables in place correctly?

u = 'stringandstring'
b = "network:"
e = "yeser;"
def haystack(b,e,u):
    i = re.search('%s(.*)%s', u)
    r = i.group(1)
    return r
or
.....
def haystack(b,e,u):
    i = re.search('b(.*)e', u)
.....
How do I get those variables inside that function correctly?

I guess you can try concatenation (str1 + str2):
import re
def haystack(b,e,u):
    i = re.search(b+'(.*)'+e, u)
    if i: #check if there is any result
        return i.group(1) #return match
#now try to call it
print haystack("this","str","this is str") #this should output ' is '
print haystack("no","no", "this is str") #no match, so this prints None
This is working perfectly for me so far.
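One caveat with plain concatenation: if the delimiters contain regex metacharacters (a dot, a bracket, a plus sign), they are interpreted as pattern syntax. A hedged variant that wraps both delimiters in re.escape is sketched below; the parameter names and sample strings are chosen for this example only.

```python
import re

def haystack(begin, end, text):
    # re.escape makes the delimiters match literally, even if they
    # contain characters such as '.', '+' or '['.
    pattern = re.escape(begin) + '(.*)' + re.escape(end)
    match = re.search(pattern, text)
    return match.group(1) if match else None

print(haystack("network:", "yeser;", "network:hello yeser;"))
print(haystack("a.b", "c", "aXb-c"))
```

The second call finds nothing because "a.b" is now treated as literal text; without re.escape the dot would match the X and the function would return "-".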

Python missing attributes/attributes not valid

This is the code I am running, but it says "These attributes are not valid". How do I get the attributes/transforms in my scene so that I can list them properly? I tried using cmds.ls(type='transform') but it still doesn't work. Any help is appreciated.
import maya.cmds as cmds
def changeXtransformVal(myList, percentage=2.0):
"""
Changes the value of each transform in the scene by a percentage.
Parameters:
percentage - Percentage to change each transform's value. Default value is 2.0.
Returns:
Nothing.
"""
# The ls command is the list command. It is used to list various nodes
# in the current scene. You can also use it to list selected nodes.
transformInScene = cmds.ls(type='transform')
found = False
for thisTransform in transformInScene:
if thisTransform not in ['front','persp','side','top']:
found = True
break
else:
found = False
if found == False:
sphere1 = cmds.polySphere()[0]
cmds.xform(sphere1, t = (0.5, 0.5, 0.5))
transformInScene = cmds.ls(type='transform')
# If there are no transforms in the scene, there is no point running this script
if not transformInScene:
raise RuntimeError, 'There are no transforms in the scene!'
badAttrs = list()
# Loop through each transform
for thisTransform in transformInScene:
if thisTransform not in ['front','persp','side','top']:
allAttrs = cmds.listAttr(thisTransform, keyable=True, scalar=True)
allAttrs = [ i for i in badAttrs if i != "visibility" ]
print allAttrs
for attr in myList:
if attr in allAttrs:
currentVal = cmds.getAttr( thisTransform + "." + attr )
newVal = currentVal * percentage
cmds.setAttr(thisTransform + "." + attr, newval)
print "Changed %s. %s from %s to %s" % (thisTransform,attr,currentVal,newVal)
else:
badAttrs.append(attr)
if badAttrs:
print "These attributes %s are not valid" % str()
myList = ['sx', 'sy', 'tz', 'ty', 'tx']
changeXtransformVal(myList, percentage=2.0)

You have simple indentation errors in several places. The last one (on line 35):
for attr in myList:
is indented one level too shallow. The code from line 31 onward:
if thisTransform not in ['front','persp','side','top']:
allAttrs = cmds.listAttr(thisTransform, keyable=True, scalar=True)
should all be at the level of the if. Also, this makes no sense, because it filters badAttrs rather than allAttrs:
allAttrs = [ i for i in badAttrs if i != "visibility" ]
It is also indented wrongly; all your code after that should be at the level of your if. Here's the central part written again:
import maya.cmds as cmds
def changeXtransformVal(myList, percentage=2.0):
    transformInScene = [i for i in cmds.ls(type='transform') if i not in ['front','persp','side','top'] ]
    myList = [i for i in myList if i not in ['v','visibility']]
    for thisTransform in transformInScene:
        badAttrs = []
        for attr in myList:
            try:
                currentVal = cmds.getAttr( thisTransform + "." + attr )
                newVal = currentVal * percentage
                cmds.setAttr(thisTransform + "." + attr, newVal)
                print "Changed %s. %s from %s to %s" % (thisTransform,attr,currentVal,newVal)
            except TypeError:
                badAttrs.append(attr)
        if badAttrs:
            print "These attributes %s are not valid" % str(badAttrs)
myList = ['sx', 'sy', 'tz', 'ty', 'tx']
changeXtransformVal(myList, percentage=2.0)
Note that the nesting is a bit too deep; consider moving the loop over the attributes into its own function.
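A rough sketch of that last suggestion follows. To keep it runnable outside Maya, the getter and setter are passed in as arguments; inside Maya you would presumably pass cmds.getAttr and cmds.setAttr. The function name scale_attrs, the fake scene dictionary and the choice to catch ValueError alongside TypeError are all assumptions made for this illustration.

```python
def scale_attrs(node, attrs, factor, get_attr, set_attr):
    """Multiply each attribute of node by factor; return the names that failed."""
    bad = []
    for attr in attrs:
        plug = node + "." + attr
        try:
            set_attr(plug, get_attr(plug) * factor)
        except (TypeError, ValueError):
            bad.append(attr)
    return bad

# Stand-ins for the Maya commands so the sketch can run anywhere.
scene = {"ball.sx": 1.0, "ball.tx": 0.5}

def fake_get(plug):
    if plug not in scene:
        raise ValueError("no such attribute: " + plug)
    return scene[plug]

def fake_set(plug, value):
    scene[plug] = value

bad = scale_attrs("ball", ["sx", "tx", "nope"], 2.0, fake_get, fake_set)
print(bad)
print(scene)
```

The outer function then only has to loop over the transforms and report whatever list each call returns.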

What is the pythonic way to implement a parse tree for custom format?

I have a project that has a non-standard file format something like:
var foo = 5
load 'filename.txt'
var bar = 6
list baz = [1, 2, 3, 4]
And I want to parse this into a data structure much like BeautifulSoup does. But this format isn't supported by BeautifulSoup. What is the pythonic way to build a parse tree so that I can modify the values and re-write it out? In the end I would like to do something like:
data = parse_file('file.txt')
data.foo = data.foo * 2
data.write_file('file_new.txt')

Here is a solution using pyparsing. It works for your case. Beware that I'm not an expert, so depending on your standards the code could be ugly. Cheers.
class ConfigFile (dict):
    """
    Configuration file data
    """
    def __init__ (self, filename):
        """
        Parses config file.
        """
        from pyparsing import Suppress, Word, alphas, alphanums, nums, \
            delimitedList, restOfLine, printables, ZeroOrMore, Group, \
            Combine
        equal = Suppress ("=")
        lbrack = Suppress ("[")
        rbrack = Suppress ("]")
        delim = Suppress ("'")
        string = Word (printables, excludeChars = "'")
        identifier = Word (alphas, alphanums + '_')
        integer = Word (nums).setParseAction (lambda t: int (t[0]))
        real = Combine( Word(nums) + '.' + Word(nums) ).setParseAction (lambda t: float(t[0]))
        value = real | integer
        var_kwd = Suppress ("var")
        load_kwd = Suppress ("load")
        list_kwd = Suppress ("list")
        # each statement gets a trailing type tag: 0 = var, 1 = load, 2 = list
        var_stm = Group (var_kwd + identifier + equal + value +
            restOfLine.suppress ()).setParseAction (
                lambda tok: tok[0].insert(len(tok[0]), 0))
        load_stm = Group (load_kwd + delim + string + delim +
            restOfLine.suppress ()).setParseAction (
                lambda tok: tok[0].insert(len(tok[0]), 1))
        list_stm = Group (list_kwd + identifier + equal + lbrack +
            Group ( delimitedList (value, ",") ) +
            rbrack + restOfLine.suppress ()).setParseAction (
                lambda tok: tok[0].insert(len(tok[0]), 2))
        cnf_file = ZeroOrMore (var_stm | load_stm | list_stm)
        lines = cnf_file.parseFile (filename)
        self._lines = []
        for line in lines:
            self._lines.append ((line[-1], line[0]))
            if line[-1] != 1: dict.__setitem__(self, line[0], line[1])
        self.__initialized = True
        # after initialisation, setting attributes is the same as setting an item
    def __getattr__ (self, key):
        try:
            return dict.__getitem__ (self, key)
        except KeyError:
            return None
    def __setattr__ (self, key, value):
        """Maps attributes to values. Only if we are initialised"""
        # this test allows attributes to be set in the __init__ method
        if not self.__dict__.has_key ('_ConfigFile__initialized'):
            return dict.__setattr__(self, key, value)
        # any normal attributes are handled normally
        elif self.__dict__.has_key (key):
            dict.__setattr__(self, key, value)
        # takes care of including new 'load' statements
        elif key == 'load':
            if not isinstance (value, str):
                raise ValueError, "Invalid data type"
            self._lines.append ((1, value))
        # this is called when setting new attributes after __init__
        else:
            if not isinstance (value, int) and \
                not isinstance (value, float) and \
                not isinstance (value, list):
                raise ValueError, "Invalid data type"
            if dict.has_key (self, key):
                if type(dict.__getitem__(self, key)) != type (value):
                    raise ValueError, "Cannot modify data type."
            elif not isinstance (value, list): self._lines.append ((0, key))
            else: self._lines.append ((2, key))
            dict.__setitem__(self, key, value)
    def Write (self, filename):
        """
        Write config file.
        """
        fid = open (filename, 'w')
        for d in self._lines:
            if d[0] == 0: fid.write ("var %s = %s\n" % (d[1], str(dict.__getitem__(self, d[1]))))
            # written back with the same 'load' keyword that the parser reads
            elif d[0] == 1: fid.write ("load '%s'\n" % (d[1]))
            else: fid.write ("list %s = %s\n" % (d[1], str(dict.__getitem__(self, d[1]))))
        fid.close ()
if __name__ == "__main__":
    input="""var foo = 5
load 'filename.txt'
var bar = 6
list baz = [1, 2, 3, 4]"""
    file ("test.txt", 'w').write (input)
    config = ConfigFile ("test.txt")
    # Modify existent items
    config.foo = config.foo * 2
    # Add new items
    config.foo2 = [4,5,6,7]
    config.foo3 = 12.3456
    config.load = 'filenameX.txt'
    config.load = 'filenameXX.txt'
    config.Write ("test_new.txt")
EDIT
I have modified the class to use the __getattr__ and __setattr__ methods to mimic the 'access to member' syntax for parsed items, as required by our poster. Enjoy!
PS
Overloading the __setattr__ method should be done with care, to avoid interference between setting 'normal' attributes (class members) and the parsed items (which are accessed like attributes). The code is now fixed to avoid these problems. See the following reference http://code.activestate.com/recipes/389916/ for more info. It was fun to discover this!

What you have is a custom language you need to parse.
Use one of the many existing parsing libraries for Python. I personally recommend PLY. Alternatively, Pyparsing is also good and widely used & supported.
If your language is relatively simple, you can also implement a hand-written parser. Here is an example
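As a rough illustration of the hand-written route, here is a minimal sketch for the four-line format in the question. It assumes one statement per line and values that are valid Python literals; the names parse_config and dump_config are invented for this example.

```python
import ast
import re

ASSIGN_RE = re.compile(r"^(var|list)\s+(\w+)\s*=\s*(.+)$")
LOAD_RE = re.compile(r"^load\s+'([^']*)'$")

def parse_config(text):
    """Return a list of (kind, name, value) tuples in file order."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        load = LOAD_RE.match(line)
        if load:
            entries.append(('load', None, load.group(1)))
            continue
        assign = ASSIGN_RE.match(line)
        if assign is None:
            raise ValueError('cannot parse line: %r' % raw)
        kind, name, value = assign.groups()
        # literal_eval safely turns "5" or "[1, 2]" into Python objects.
        entries.append((kind, name, ast.literal_eval(value)))
    return entries

def dump_config(entries):
    """Write the entries back out in the original syntax."""
    out = []
    for kind, name, value in entries:
        if kind == 'load':
            out.append("load '%s'" % value)
        else:
            out.append("%s %s = %r" % (kind, name, value))
    return "\n".join(out) + "\n"

source = "var foo = 5\nload 'filename.txt'\nvar bar = 6\nlist baz = [1, 2, 3, 4]\n"
entries = parse_config(source)
# Double foo and leave every other statement untouched.
entries = [(k, n, v * 2) if n == 'foo' else (k, n, v) for k, n, v in entries]
print(dump_config(entries))
```

Keeping the statements in an ordered list, rather than a plain dictionary, is what lets the file be written back with its original line order.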

refactor this dictionary-to-xml converter in python

It's a small thing, really: I have this function that converts dict objects to xml.
Here's the function:
def dictToXml(d):
    from xml.sax.saxutils import escape
    def unicodify(o):
        if o is None:
            return u''
        return unicode(o)
    lines = []
    def addDict(node, offset):
        for name, value in node.iteritems():
            if isinstance(value, dict):
                lines.append(offset + u"<%s>" % name)
                addDict(value, offset + u" " * 4)
                lines.append(offset + u"</%s>" % name)
            elif isinstance(value, list):
                for item in value:
                    if isinstance(item, dict):
                        lines.append(offset + u"<%s>" % name)
                        addDict(item, offset + u" " * 4)
                        lines.append(offset + u"</%s>" % name)
                    else:
                        lines.append(offset + u"<%s>%s</%s>" % (name, escape(unicodify(item)), name))
            else:
                lines.append(offset + u"<%s>%s</%s>" % (name, escape(unicodify(value)), name))
    addDict(d, u"")
    lines.append(u"")
    return u"\n".join(lines)
For example, it converts this dictionary
{ 'site': { 'name': 'stackoverflow', 'blogger': [ 'Jeff', 'Joel' ] } }
to:
<site>
    <name>stackoverflow</name>
    <blogger>Jeff</blogger>
    <blogger>Joel</blogger>
</site>
It works, but the addDict function looks a little too repetitive. I'm sure there's a way to refactor it into 3 co-recursive functions named addDict, addList and addElse, but my brain is stuck. Any help?
Also, any way to get rid of the offset + thing in every line would be nice.
NOTE: I chose these semantics because I'm trying to match the behavior of the json-to-xml converter in org.json, which I use in a different part of my project. If you got to this page just looking for a dictionary to xml converter, there are some really good options in some of the answers. (Especially pyfo).

>>> from pyfo import pyfo
>>> d = ('site', { 'name': 'stackoverflow', 'blogger': [ 'Jeff', 'Joel' ] } )
>>> result = pyfo(d, pretty=True, prolog=True, encoding='ascii')
>>> print result.encode('ascii', 'xmlcharrefreplace')
<?xml version="1.0" encoding="ascii"?>
<site>
<blogger>
Jeff
Joel
</blogger>
<name>stackoverflow</name>
</site>
To install pyfo:
$ easy_install pyfo

I noticed you have commonality in adding items. Using this commonality I would refactor adding an item to a separate function.
def addItem(item, name, offset):
    if isinstance(item, dict):
        lines.append(offset + u"<%s>" % name)
        addDict(item, offset + u" " * 4)
        lines.append(offset + u"</%s>" % name)
    else:
        lines.append(offset + u"<%s>%s</%s>" % (name, escape(unicodify(item)), name))
def addList(value,name, offset):
    for item in value:
        addItem(item, name, offset)
def addDict(node, offset):
    for name, value in node.iteritems():
        if isinstance(value, list):
            addList(value, name, offset)
        else:
            addItem(value, name, offset)
Advisory warning: this code is not tested or written by anybody who actually uses Python.

To get rid of repeated "offset+":
offset = 0
def addLine(str):
    lines.append(u" " * (offset * 4) + str)
then
...
addLine(u"<%s>" % name)
offset = offset + 1
addDict(value)
offset = offset - 1
addLine(u"</%s>" % name)
Don't have access to an interpreter here, so take this with a grain of salt :(
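For reference, the two suggestions above (one helper per item, and a single place that handles indentation) can be combined into one runnable function. This is only a sketch: it passes a depth number down the recursion instead of mutating a shared offset, and the names dict_to_xml, add_line and add_item are chosen for this example.

```python
from xml.sax.saxutils import escape

def dict_to_xml(d, indent=4):
    lines = []

    def add_line(depth, text):
        # The only place that knows about indentation.
        lines.append(u" " * (indent * depth) + text)

    def add_item(name, item, depth):
        if isinstance(item, dict):
            add_line(depth, u"<%s>" % name)
            add_dict(item, depth + 1)
            add_line(depth, u"</%s>" % name)
        else:
            text = u"" if item is None else escape(u"%s" % (item,))
            add_line(depth, u"<%s>%s</%s>" % (name, text, name))

    def add_dict(node, depth):
        for name, value in node.items():
            # A list repeats the tag once per element; anything else is one item.
            items = value if isinstance(value, list) else [value]
            for item in items:
                add_item(name, item, depth)

    add_dict(d, 0)
    lines.append(u"")
    return u"\n".join(lines)

print(dict_to_xml({'site': {'name': 'stackoverflow', 'blogger': ['Jeff', 'Joel']}}))
```

Passing depth as an argument avoids the scoping problem of reassigning an outer offset variable from inside a nested function.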

Your original code produces malformed XML and can produce the same XML for two different dictionaries (it is not injective, mathematically speaking).
For example, if you have a list as a value of the only key in a dictionary:
d = { 'list': [1,2,3] }
I expect that your code would produce
<list>1</list><list>2</list><list>3</list>
and there is no root element. Any XML should have one and only one root element.
Then given the XML produced by your code, it is impossible to say if this XML
<tag>1</tag>
was produced from { 'tag': 1 } or from { 'tag': [1] }.
So, I suggest
always start from the root element
represent lists with either two special tags (e.g. <list/> and <item/>) or mark them as such in attributes
Once these conceptual shortcomings are settled, we can generate correct and unambiguous XML. I chose to use attributes to mark up lists, and used ElementTree to construct the XML tree automatically. Recursion also helps (add_value_to_xml calls itself):
from xml.etree.ElementTree import Element, SubElement, tostring
def is_scalar(v):
    return isinstance(v,basestring) or isinstance(v,float) \
        or isinstance(v,int) or isinstance(v,bool)
def add_value_to_xml(root,v):
    if type(v) == type({}):
        for k,kv in v.iteritems():
            vx = SubElement(root,unicode(k))
            vx = add_value_to_xml(vx,kv)
    elif type(v) == list:
        root.set('type','list')
        for e in v:
            li = SubElement(root,root.tag)
            li = add_value_to_xml(li,e)
            li.set('type','item')
    elif is_scalar(v):
        root.text = unicode(v)
    else:
        raise Exception("add_value_to_xml: unsupported type (%s)"%type(v))
    return root
def dict_to_xml(d,root='dict'):
    x = Element(root)
    x = add_value_to_xml(x,d)
    return x
d = { 'float': 5194.177, 'str': 'eggs', 'int': 42,
    'list': [1,2], 'dict': { 'recursion': True } }
x = dict_to_xml(d)
print tostring(x)
The result of the conversion of the test dict is:
<dict><int>42</int><dict><recursion>True</recursion></dict><float>5194.177</float><list type="list"><list type="item">1</list><list type="item">2</list></list><str>eggs</str></dict>

Here is my short sketch for a solution:
have a general addSomething() function that dispatches based on the type of the value to addDict(), addList() or addElse(). Those functions recursively call addSomething() again.
Basically you are factoring out the parts in the if clause and add a recursive call.

Here's what I find helpful when working with XML. Actually create the XML node structure first, then render this into text second.
This separates two unrelated concerns.
How do I transform my Python structure into an XML object model?
How do I format that XML object model?
It's hard when you put these two things together into one function. If, on the other hand, you separate them, then you have two things. First, you have a considerably simpler function to "walk" your Python structure and return an XML node. Your XML Nodes can be rendered into text with some preferred encoding and formatting rules applied.
from xml.sax.saxutils import escape
class Node( object ):
    def __init__( self, name, *children ):
        self.name= name
        self.children= children
    def toXml( self, indent ):
        if len(self.children) == 0:
            return u"%s<%s/>" % ( indent*4*u' ', self.name )
        elif len(self.children) == 1:
            child= self.children[0].toXml(0)
            return u"%s<%s>%s</%s>" % ( indent*4*u' ', self.name, child, self.name )
        else:
            items = [ u"%s<%s>" % ( indent*4*u' ', self.name ) ]
            items.extend( [ c.toXml(indent+1) for c in self.children ] )
            items.append( u"%s</%s>" % ( indent*4*u' ', self.name ) )
            return u"\n".join( items )
class Text( Node ):
    def __init__( self, value ):
        self.value= value
    def toXml( self, indent ):
        def unicodify(o):
            if o is None:
                return u''
            return unicode(o)
        return "%s%s" % ( indent*4*u' ', escape( unicodify(self.value) ), )
def dictToXml(d):
    def dictToNodeList(node):
        nodes= []
        for name, value in node.iteritems():
            if isinstance(value, dict):
                n= Node( name, *dictToNodeList( value ) )
                nodes.append( n )
            elif isinstance(value, list):
                for item in value:
                    if isinstance(item, dict):
                        # recurse into the dict item itself, not the enclosing list
                        n= Node( name, *dictToNodeList( item ) )
                        nodes.append( n )
                    else:
                        n= Node( name, Text( item ) )
                        nodes.append( n )
            else:
                n= Node( name, Text( value ) )
                nodes.append( n )
        return nodes
    return u"\n".join( [ n.toXml(0) for n in dictToNodeList(d) ] )
