I'm looking over some complex Python 2.6 code that occasionally produces an infinity (or at least an Infinity gets serialized by the json library, which checks with math.isinf).
What is especially baffling is that, as far as I can tell, Python should never be able to produce a computation result equal to infinity. Am I wrong in this assumption? I thought you could only get infinities from constants:
k = float('inf')
k = 1e900
Somewhere between 1e308 and 1e309 floats run out of range (the largest finite double is about 1.8e308), so if you compute results above that you will see inf:
>>> 1e308
1e+308
>>> 1e309
inf
>>> json.dumps(1e308,allow_nan=False)
'1e+308'
>>> json.dumps(1e309,allow_nan=False)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.6/json/__init__.py", line 237, in dumps
**kw).encode(obj)
File "/usr/lib/python2.6/json/encoder.py", line 367, in encode
chunks = list(self.iterencode(o))
File "/usr/lib/python2.6/json/encoder.py", line 304, in _iterencode
yield floatstr(o, self.allow_nan)
File "/usr/lib/python2.6/json/encoder.py", line 47, in floatstr
raise ValueError(msg)
ValueError: Out of range float values are not JSON compliant: inf
>>>
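To track down where the infinity comes from, it can help to scan the data before it reaches json.dumps. The following is only a sketch: the helper name find_non_finite and the path notation are made up for illustration, and it assumes the payload consists of plain dicts, lists, tuples and floats.

```python
import math


def find_non_finite(obj, path="$"):
    """Yield (path, value) for every float in obj that is inf or nan.

    Hypothetical helper: only dicts, lists, tuples and floats are inspected.
    """
    if isinstance(obj, float):
        if math.isinf(obj) or math.isnan(obj):
            yield path, obj
    elif isinstance(obj, dict):
        for key, value in obj.items():
            for hit in find_non_finite(value, "%s.%s" % (path, key)):
                yield hit
    elif isinstance(obj, (list, tuple)):
        for index, value in enumerate(obj):
            for hit in find_non_finite(value, "%s[%d]" % (path, index)):
                yield hit


# Example payload with one overflowed value hidden in a nested list.
payload = {"a": [1.0, 1e308 * 10], "b": 2.0}
bad_values = list(find_non_finite(payload))
```

Logging bad_values just before serialization shows which field overflowed, which is usually easier than reading the ValueError raised with allow_nan=False.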
Decimal can handle larger numbers, but obviously there is a performance penalty (and it can't be serialised with json)
>>> from decimal import Decimal
>>> Decimal('1e900')/10
Decimal("1E+899")
Here is an example of an addition that overflows to inf without raising an OverflowError:
>>> a=1e308
>>> a+a
inf
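Not every operation behaves like the addition above. As a rough sketch (the helper name raises_overflow is made up for this example), float addition and multiplication overflow silently to inf, while float exponentiation and math.exp raise OverflowError:

```python
import math
import sys

largest = sys.float_info.max  # largest finite double, about 1.8e308

# Addition and multiplication overflow silently to inf.
silent_sum = largest + largest
silent_product = largest * 2.0


def raises_overflow(func):
    """Return True if calling func() raises OverflowError."""
    try:
        func()
    except OverflowError:
        return True
    return False


base = 10.0
# Float exponentiation and math.exp raise instead of returning inf.
pow_raises = raises_overflow(lambda: base ** 400)
exp_raises = raises_overflow(lambda: math.exp(1000.0))
```

So an inf in the output most likely came from a chain of additions or multiplications (or from parsed input), not from pow or the math module.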
I have a sympy poly that looks like:
Poly(0.764635937801645*I**4 + 7.14650839258644*I**3 - 0.667712176660315*I**2 - 2.81663805543677*I - 0.623299856233272, I, domain='RR')
I'm converting to mpc using the following code:
a = val.subs('I',1.0j)
b = sy.re(a)
c = sy.im(a)
d = mpmath.mpc(b,c)
Two questions.
Assuming my mpc and sympy type have equal precision (of eg 100 dps) is there a precision loss using this conversion from a to d?
Is there a better way to convert?
Aside: sympy seems to treat I just like a symbol here. How do I get sympy to simplify this polynomial?
Edit: I've also noticed that the following works in place of a above:
a = val.args[0]
Strings and expressions
The root cause of the issue is visible in val.subs('I', 1.0j): you appear to pass strings as arguments to SymPy functions. There are some valid uses for this (such as creating high-precision floats), but where symbols are concerned, using strings is a recipe for confusion. The string 'I' gets implicitly converted to the SymPy expression Symbol('I'), which is different from the SymPy expression I. So the answer to
How do I get sympy to simplify this polynomial?
is to revisit the process of creation of that polynomial, and fix that. If you really need to create it from a string, then use locals parameter:
>>> S('3.3*I**2 + 2*I', locals={'I': I})
-3.3 + 2*I
Polynomials and expressions
If the Poly structure is not needed, use the method as_expr() of Poly to get an expression from it.
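If the polynomial already exists and was built over a plain symbol that merely prints as I, one possible repair is sketched below. The small integer polynomial and the name fake_I are invented for illustration; the sketch assumes SymPy is installed.

```python
from sympy import I, Poly, Symbol, expand

# A plain symbol that only happens to be named "I" (hypothetical name fake_I).
fake_I = Symbol('I')

# Made-up polynomial with integer coefficients, standing in for the real one.
p = Poly(2*fake_I**2 + 3*fake_I + 1, fake_I)

# Drop the Poly wrapper, then replace the symbol with the imaginary unit.
expr = p.as_expr().subs(fake_I, I)
result = expand(expr)  # I**2 is evaluated to -1 automatically
```

Here result is an ordinary SymPy complex number, because 2*I**2 + 3*I + 1 collapses to -1 + 3*I once I is the real imaginary unit.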
Conversion to mpmath and precision loss
is there a precision loss using this conversion from a to d?
Yes, splitting into real and imaginary parts and then recombining can lead to precision loss. Pass the SymPy object directly to mpc if you know it is a complex number, or to mpmathify if you want mpmath to decide what type it should have. An example:
>>> val = S('1.111111111111111111111111111111111111111111111111')*I**3 - 2
>>> val
-2 - 1.111111111111111111111111111111111111111111111111*I
>>> import mpmath
>>> mpmath.mp.dps = 40
>>> mpmath.mpc(val)
mpc(real='-2.0', imag='-1.111111111111111111111111111111111111111111')
>>> mpmath.mpmathify(val)
mpc(real='-2.0', imag='-1.111111111111111111111111111111111111111111')
>>> mpmath.mpc(re(val), im(val))
mpc(real='-2.0', imag='-1.111111111111111111111111111111111111111114')
Observations:
When I is the actual imaginary unit, I**3 evaluates to -I; you don't have to do anything for that to happen.
A string representation of high-precision decimal is used to create such a float in SymPy. Here S stands for sympify. One can also be more direct and use Float('1.1111111111111111111111111')
Direct conversion of a SymPy complex number to an mpmath complex number is preferable to splitting into real and imaginary parts and recombining.
Conclusion
Most of the above is just talking around an XY problem. Your expression with I was not what you think it was, so you tried to do strange things that were not needed, and my answer is mostly a waste of time.
I'm adding my own answer here, as FTP's answer, although relevant and very helpful, did not (directly) resolve my issue (which wasn't that clear from the question tbh). When I ran the code in his example I got the following:
>>> from sympy import *
>>> import mpmath
>>> val = S('1.111111111111111111111111111111111111111111111111')*I**3 - 2
>>> val
-2 - 1.111111111111111111111111111111111111111111111111*I
>>> mpmath.mp.dps = 40
>>> mpmath.mpc(val)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python27\lib\site-packages\mpmath\ctx_mp_python.py", line 373, in __new__
real = cls.context.mpf(real)
File "C:\Python27\lib\site-packages\mpmath\ctx_mp_python.py", line 77, in __new__
v._mpf_ = mpf_pos(cls.mpf_convert_arg(val, prec, rounding), prec, rounding)
File "C:\Python27\lib\site-packages\mpmath\ctx_mp_python.py", line 96, in mpf_convert_arg
raise TypeError("cannot create mpf from " + repr(x))
TypeError: cannot create mpf from -2 - 1.111111111111111111111111111111111111111111111111*I
>>> mpmath.mpmathify(val)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python27\lib\site-packages\mpmath\ctx_mp_python.py", line 662, in convert
return ctx._convert_fallback(x, strings)
File "C:\Python27\lib\site-packages\mpmath\ctx_mp.py", line 614, in _convert_fallback
raise TypeError("cannot create mpf from " + repr(x))
TypeError: cannot create mpf from -2 - 1.111111111111111111111111111111111111111111111111*I
>>> mpmath.mpc(re(val), im(val))
mpc(real='-2.0', imag='-1.111111111111111111111111111111111111111114')
>>> mpmath.mpmathify(val)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python27\lib\site-packages\mpmath\ctx_mp_python.py", line 662, in convert
return ctx._convert_fallback(x, strings)
File "C:\Python27\lib\site-packages\mpmath\ctx_mp.py", line 614, in _convert_fallback
raise TypeError("cannot create mpf from " + repr(x))
TypeError: cannot create mpf from -2 - 1.111111111111111111111111111111111111111111111111*I
Updating my sympy (1.0->1.1.1) and mpmath (0.19->1.0.0) fixed the exceptions. I did not test which of these upgrades actually resolved the issue.
Using python 3.5.2
>>> from decimal import Decimal
>>> Decimal('12') % Decimal('0.01')
Decimal('0.00')
>>> Decimal('234567') % Decimal('0.01')
Decimal('0.00')
Works as expected. But...
>>> Decimal('7316717653133062491922511967442657474235534919493496983520312774506326239578318016984801869478851843858615607891129494954595017379583319528532088055111254069874715852386305071569329096329522744304355766896648950445244523161731856403098711121722383113622298934233803081353362766142828064444866452387493035890729629049156044077239071381051585930796086670172427121883998797908792274921901699720888093776657273330010533678812202354218097512545405947522435258490771167055601360483958644670632441572215539753697817977846174064955149290862569321978468622482839722413756570560574902614079729686524145351004748216637048440319989000889524345065854122758866688116427171479924442928230863465674813919123162824586178664583591245665294765456828489128831426076900422421902267105562632111110937054421750694165896040807198403850962455444362981230987879927244284909188845801561660979191338754992005240636899125607176060588611646710940507754100225698315520005593572972571636269561882670428252483600823257530420752963450') % Decimal('0.01')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
decimal.InvalidOperation: [<class 'decimal.DivisionImpossible'>]
EDIT: This is the smallest number I found that can cause this error:
>>> Decimal(10**26) % Decimal('0.01')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
decimal.InvalidOperation: [<class 'decimal.DivisionImpossible'>]
Why does Decimal(very_large_int) % Decimal('0.01') give this error? I thought that Decimal is able to handle very large numbers?
As L3viathan answered, the problem is that a result (not the result—this is the "hidden part" I mention in a comment) has overrun the available precision.
The hidden part is more obvious if we use Python2:
Traceback (most recent call last):
File "/tmp/d.py", line 24, in <module>
print(big % Decimal('0.01'))
File "/usr/local/lib/python2.7/decimal.py", line 1460, in __mod__
remainder = self._divide(other, context)[1]
File "/usr/local/lib/python2.7/decimal.py", line 1381, in _divide
'quotient too large in //, % or divmod')
File "/usr/local/lib/python2.7/decimal.py", line 3873, in _raise_error
raise error(explanation)
InvalidOperation: quotient too large in //, % or divmod
Essentially, a % b is implemented by doing both division and modulus together (a la Algorithm D in Knuth vol 2; for a C implementation restricted to two fullwords, see the qdivrem.c code I wrote in the early 2000s). The library code therefore needs two extra digits (the number of digits to the right of the decimal point in Decimal('0.01')—calculating the actual number of digits needed is not as simple as for big below as we have to look at the exponents) to compute the intermediate quotient.
The decimal library was reimplemented directly in C for Python3, which hides the detail, but the cure is the same for both: extend the precision. Here's an example source routine that catches the exception and tries again, though with magic constant 2.
from __future__ import print_function
import decimal
Decimal = decimal.Decimal
import traceback
big = Decimal(
'731671765313306249192251196744265747423553491949349698352031277'
'4506326239578318016984801869478851843858615607891129494954595017379'
'5833195285320880551112540698747158523863050715693290963295227443043'
'5576689664895044524452316173185640309871112172238311362229893423380'
'3081353362766142828064444866452387493035890729629049156044077239071'
'3810515859307960866701724271218839987979087922749219016997208880937'
'7665727333001053367881220235421809751254540594752243525849077116705'
'5601360483958644670632441572215539753697817977846174064955149290862'
'5693219784686224828397224137565705605749026140797296865241453510047'
'4821663704844031998900088952434506585412275886668811642717147992444'
'2928230863465674813919123162824586178664583591245665294765456828489'
'1288314260769004224219022671055626321111109370544217506941658960408'
'0719840385096245544436298123098787992724428490918884580156166097919'
'1338754992005240636899125607176060588611646710940507754100225698315'
'520005593572972571636269561882670428252483600823257530420752963450')
try:
    print(big % Decimal('0.01'))
except decimal.DecimalException:
    traceback.print_exc()
    print('')
    ctx = decimal.getcontext()
    print('failed because precision was', ctx.prec, 'and big is',
        len(big.as_tuple().digits), 'digits long')
    print('trying again with 2 more digits')
    with decimal.localcontext() as ctx:
        ctx.prec = len(big.as_tuple().digits) + 2
        try:
            print(big % Decimal('0.01'))
        except decimal.DecimalException:
            traceback.print_exc()
With Python2:
$ python2 /tmp/d.py
Traceback (most recent call last):
File "/tmp/d.py", line 24, in <module>
print(big % Decimal('0.01'))
File "/usr/local/lib/python2.7/decimal.py", line 1460, in __mod__
remainder = self._divide(other, context)[1]
File "/usr/local/lib/python2.7/decimal.py", line 1381, in _divide
'quotient too large in //, % or divmod')
File "/usr/local/lib/python2.7/decimal.py", line 3873, in _raise_error
raise error(explanation)
InvalidOperation: quotient too large in //, % or divmod
failed because precision was 28 and big is 1000 digits long
trying again with 2 more digits
0.00
With Python3:
$ python3 /tmp/d.py
Traceback (most recent call last):
File "/tmp/d.py", line 24, in <module>
print(big % Decimal('0.01'))
decimal.InvalidOperation: [<class 'decimal.DivisionImpossible'>]
failed because precision was 28 and big is 1000 digits long
trying again with 2 more digits
0.00
Note that dividing by a very large number is actually easier: it's the division by 0.01 that is causing problems here. If the exponent on the divisor were at least 1000 - 28 (1e972 or larger), we would not have the problem.
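One way to avoid the magic constant 2 is to estimate the number of quotient digits from the exponents of both operands. This is only a sketch: the name wide_mod is hypothetical, and it assumes both operands are finite and the divisor is non-zero.

```python
import decimal
from decimal import Decimal


def wide_mod(a, b):
    """Return a % b, widening the precision when the quotient needs it.

    Hypothetical helper; assumes finite operands and a non-zero divisor.
    """
    # The integer quotient a // b has at most this many digits.
    quotient_digits = a.adjusted() - b.adjusted() + 1
    with decimal.localcontext() as ctx:
        # Never shrink the precision, only extend it for this operation.
        ctx.prec = max(ctx.prec, quotient_digits)
        return a % b


small_case = wide_mod(Decimal(10**26), Decimal('0.01'))
fraction_case = wide_mod(
    Decimal('12345678901234567890123456789012345.678'), Decimal('0.01'))
```

Because localcontext restores the previous context on exit, the wider precision applies only to this one operation and the rest of the program keeps its default of 28 digits.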
Decimal is based on the Decimal Arithmetic specification. You can see here that "Division impossible" means that
the integer result of a divide-integer or remainder operation had too many digits (would be longer than precision).
This precision is something you can adjust:
>>> decimal.getcontext().prec=10000
>>> Decimal('7316717653133062491922511967442657474235534919493496983520312774506326239578318016984801869478851843858615607891129494954595017379583319528532088
... 0551112540698747158523863050715693290963295227443043557668966489504452445231617318564030987111217223831136222989342338030813533627661428280644448664523874
... 9303589072962904915604407723907138105158593079608667017242712188399879790879227492190169972088809377665727333001053367881220235421809751254540594752243525
... 8490771167055601360483958644670632441572215539753697817977846174064955149290862569321978468622482839722413756570560574902614079729686524145351004748216637
... 0484403199890008895243450658541227588666881164271714799244429282308634656748139191231628245861786645835912456652947654568284891288314260769004224219022671
... 0556263211111093705442175069416589604080719840385096245544436298123098787992724428490918884580156166097919133875499200524063689912560717606058861164671094
... 0507754100225698315520005593572972571636269561882670428252483600823257530420752963450') % Decimal('0.01')
Decimal('0.00')
I want to normalize floating-point numbers to nn.nn strings, and to do some special handling if the number is out of range.
try:
    norm = '{:5.2f}'.format(f)
except ValueError:
    norm = 'BadData' # actually a bit more complex than this
except it doesn't work: .format silently overflows the 5-character width. Obviously I could length-check norm and raise my own ValueError, but have I missed any way to force format (or the older % formatting) to raise an exception on field-width overflow?
You cannot achieve this with format() alone. You have to write your own formatter that raises the exception. For example:
def format_float(num, max_int=5, decimal=2):
    if len(str(abs(int(num)))) > max_int:
        raise ValueError('Integer part of float can have maximum {} digits'.format(max_int))
    return "{:.{}f}".format(num, decimal)
Sample run:
>>> format_float(123.456)
'123.46'
>>> format_float(123.4)
'123.40'
>>> format_float(123789.456) # Error since integer part is having length more than 5
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 3, in format_float
ValueError: Integer part of float can have maximum 5 digits
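If the limit that matters is the total field width, as in the nn.nn case from the question, another option is to format first and check the length afterwards. A minimal sketch, with the hypothetical name format_fixed_width:

```python
def format_fixed_width(value, width=5, precision=2):
    """Format value as fixed-point text, raising if it exceeds width.

    Hypothetical helper: width counts every character, including the
    sign and the decimal point.
    """
    text = "{:{}.{}f}".format(value, width, precision)
    if len(text) > width:
        raise ValueError(
            "{!r} does not fit in {} characters".format(text, width))
    return text


def overflows(value):
    """Return True if formatting value raises ValueError."""
    try:
        format_fixed_width(value)
    except ValueError:
        return True
    return False


fits = format_fixed_width(3.14159)
```

Checking the formatted text also catches values that only overflow after rounding: 99.999 has a two-digit integer part but is rendered as 100.00.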
I have written this code for vector addition using numba.SmartArray. This is the first time I am using SmartArray, and I am not sure how to use it.
The code is not working and throws errors.
import numpy as np
from numba import SmartArray,cuda, jit, uint32
li1=np.uint32([1,2,3,4])
li=np.uint32([1,2,3,4])
b=SmartArray(li,where="host",copy=True)
a=SmartArray(li1,where="host",copy=True)
c=np.uint32([1,1,1,1])
print type(li)
print type(a)
#cuda.jit('void(uint32[:],uint32[:],uint32[:])',type="gpu")
def additionG(c,a,b):
    idx=cuda.threadIdx.x+cuda.blockDim.x*cuda.blockIdx.x
    if idx< len(a):
        a[idx]=c[idx]+b[idx]
dA=cuda.to_device(a)
dB=cuda.to_device(b)
dC=cuda.to_device(c)
additionG[1, 128](c,a,b)
print a.__array__()
Errors:
<type 'numpy.ndarray'>
<class 'numba.smartarray.SmartArray'>
Traceback (most recent call last):
File "C:\Users\hp-pc\My Documents\LiClipse Workspace\cuda\blowfishgpu_smart_arrays.py", line 20, in <module>
dA=cuda.to_device(a)
File "C:\Anaconda\lib\site-packages\numba\cuda\cudadrv\devices.py", line 257, in _require_cuda_context
return fn(*args, **kws)
File "C:\Anaconda\lib\site-packages\numba\cuda\api.py", line 55, in to_device
to, new = devicearray.auto_device(obj, stream=stream, copy=copy)
File "C:\Anaconda\lib\site-packages\numba\cuda\cudadrv\devicearray.py", line 403, in auto_device
devobj.copy_to_device(obj, stream=stream)
File "C:\Anaconda\lib\site-packages\numba\cuda\cudadrv\devicearray.py", line 148, in copy_to_device
sz = min(_driver.host_memory_size(ary), self.alloc_size)
File "C:\Anaconda\lib\site-packages\numba\cuda\cudadrv\driver.py", line 1348, in host_memory_size
s, e = host_memory_extents(obj)
File "C:\Anaconda\lib\site-packages\numba\cuda\cudadrv\driver.py", line 1333, in host_memory_extents
return mviewbuf.memoryview_get_extents(obj)
TypeError: expected a readable buffer object
It's been a while since I posted this question. I'm still posting the answer so that someone may find it helpful in the future.
import numpy as np
from numba import SmartArray,cuda, jit, uint32,autojit
li1=np.uint32([6,7,8,9])
li=np.uint32([1,2,3,4])
a=SmartArray(li1,where='host',copy=True)
b=SmartArray(li,where="host",copy=True)
c=np.uint32([1,1,1,1])
def additionG(a,c):
    idx=cuda.threadIdx.x+cuda.blockDim.x*cuda.blockIdx.x
    if idx < len(c):
        a[idx]=a[idx]+c[idx]
    cuda.syncthreads()
bpg=1
tpb=128
dC=cuda.to_device(c)
cfunc = cuda.jit()(additionG)
cfunc[bpg, tpb](a,dC)
print a.__array__()
It looks to me like cuda.to_device doesn't handle smart arrays, which would sort of make sense, because smart arrays are supposed to do away with explicit copy management.
If my reading of the documentation is correct (I have never tried SmartArray before), you should just be able to change this
dA=cuda.to_device(a)
dB=cuda.to_device(b)
dC=cuda.to_device(c)
additionG[1, 128](c,a,b)
to just
dC=cuda.to_device(c)
additionG[1, 128](dC,a.gpu(),b.gpu())
The .gpu() method should return a GPU resident object that the kernel can understand and access.
My task is to use 10-fold cross-validation with unigram, bigram and trigram taggers on a corpus and compare their accuracy. However, I am stuck on a float division error. All of this code was provided by the question setter except for the loop, so the error is probably there. For now we only use the first 1000 sentences to test the program; that line will be removed once I know the program runs.
import codecs
mypath = "/Users/myname/Desktop/"
corpusFile = codecs.open(mypath + "estonianSample.txt",mode="r",encoding="latin-1")
sentences = [[tuple(w.split("/")) for w in line[:-1].split()] for line in corpusFile.readlines()]
corpusFile.close()
from math import ceil
N=len(sentences)
chunkSize = int(ceil(N/10.0))
sentences = sentences[:1000]
chunks=[sentences[i:i+chunkSize] for i in range(0, N, chunkSize)]
for i in range(10):
    training = reduce(lambda x,y:x+y,[chunks[j] for j in range(10) if j!=i])
    testing = chunks[i]
    from nltk import UnigramTagger,BigramTagger,TrigramTagger
    t1 = UnigramTagger(training)
    t2 = BigramTagger(training,backoff=t1)
    t3 = TrigramTagger(training,backoff=t2)
    t3.evaluate(testing)
This is what the error says:
runfile('/Users/myname/pythonhw3.py', wdir='/Users/myname')
Traceback (most recent call last):
File "<ipython-input-1-921164840ebd>", line 1, in <module>
runfile('/Users/myname/pythonhw3.py', wdir='/Users/myname')
File "/Users/myname/anaconda/lib/python2.7/site-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 580, in runfile
execfile(filename, namespace)
File "/Users/myname/pythonhw3.py", line 34, in <module>
t3.evaluate(testing)
File "/Users/myname/anaconda/lib/python2.7/site-packages/nltk/tag/api.py", line 67, in evaluate
return accuracy(gold_tokens, test_tokens)
File "/Users/myname/anaconda/lib/python2.7/site-packages/nltk/metrics/scores.py", line 40, in accuracy
return float(sum(x == y for x, y in izip(reference, test))) / len(test)
ZeroDivisionError: float division by zero
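The error occurs because testing is empty for at least one fold: nltk's accuracy function divides by len(test), which is then zero. The most likely cause is that N is computed before sentences is truncated to the first 1000, so chunkSize is sized for the full corpus and the later chunks come out empty.
The line that raises the exception is: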
t3.evaluate(testing)
As a workaround, you can catch the exception:
try:
    t3.evaluate(testing)
except ZeroDivisionError:
    # Do whatever you want it to do
    print(0)
It works on my end. Try it out!
The answer is four years later, but hopefully, a fellow net citizen can find this helpful.
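Since the traceback shows a division by len(test), it may also be worth making sure no fold is empty in the first place. The sketch below computes the chunk size from the list that is actually split, which is an assumption about what the assignment intends; make_folds and train_test_splits are hypothetical names, and plain integers stand in for the tagged sentences.

```python
from math import ceil


def make_folds(items, k=10):
    """Split items into at most k consecutive, non-empty folds."""
    if not items:
        return []
    # Size the chunks from the list being split, not from the full corpus.
    size = int(ceil(len(items) / float(k)))
    return [items[i:i + size] for i in range(0, len(items), size)]


def train_test_splits(folds):
    """Yield (training, testing) pairs, holding out one fold at a time."""
    for i, testing in enumerate(folds):
        training = [item
                    for j, fold in enumerate(folds) if j != i
                    for item in fold]
        yield training, testing


# Integers stand in for the first 1000 sentences of the corpus.
sentences = list(range(1000))
folds = make_folds(sentences, k=10)
splits = list(train_test_splits(folds))
```

With 1000 items this gives ten folds of 100, so each training set has 900 items and no testing set is empty.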