Changing floating point behavior in Python to NumPy style

Is there a way to make Python floating point numbers follow numpy's rules regarding +/- Inf and NaN? For instance, making 1.0/0.0 = Inf.
>>> from numpy import *
>>> ones(1)/0
array([ Inf])
>>> 1.0/0.0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: float division
NumPy's divide function gives divide(1.0, 0.0) = Inf; however, it is not clear whether it can be switched on globally, in the way that from __future__ import division changes the behavior of the / operator.

You should have a look at how Sage does it. IIRC they wrap the Python REPL in their own preprocessor.

I tried to do something similar and never figured out how to do it nicely. But I can tell you a few things I tried that didn't work:
Setting float = numpy.float: Python still uses the old float for literals.
Trying to replace float.__div__ with a user-defined function: "TypeError: can't set attributes of built-in/extension type float". Python also doesn't like you mucking with the __dict__ of built-in types.
I decided to go in and change the actual CPython source code to make it do what I wanted, which is obviously not practical, but it worked.
I think the reason something like this is not possible is that float, int, and list are implemented in C, and their behavior cannot be changed cleanly from inside the language.

You could wrap all your floats in numpy.float64, which is the numpy float type.
a = float64(1.)
a/0 # Inf
In fact, you only need to wrap the floats on the left of arithmetic operations, obviously.
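As a minimal sketch of this approach (assuming NumPy is imported as np rather than star-imported), wrapping an operand in np.float64 gives IEEE-style results. NumPy emits a RuntimeWarning for these operations by default; np.errstate is used here to silence it.

```python
import numpy as np

# np.errstate temporarily silences the divide-by-zero and invalid-value warnings
with np.errstate(divide="ignore", invalid="ignore"):
    one = np.float64(1.0)
    pos_inf = one / 0                    # inf instead of ZeroDivisionError
    neg_inf = -one / 0                   # -inf
    not_a_number = np.float64(0.0) / 0   # nan

print(pos_inf, neg_inf, not_a_number)
```

Plain Python floats that never pass through a NumPy scalar still raise ZeroDivisionError, so this only helps where the wrapping is done consistently.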

Related

"SystemError: <class 'int'> returned a result with an error set" in Python

I wanted to apply a very simple function using ndimage.generic_filter() from scipy. This is the code:
import numpy as np
import scipy.ndimage as ndimage
data = np.random.rand(400,128)
dimx = int(np.sqrt(np.size(data,0)))
dimy = dimx
coord = np.random.randint(np.size(data,0), size=(dimx,dimy))
def test_func(values):
    idx_center = int(values[4])
    weight_center = data[idx_center]
    weights_around = data[values]
    differences = weights_around - weight_center
    distances = np.linalg.norm(differences, axis=1)
    return np.max(distances)

results = ndimage.generic_filter(coord,
                                 test_func,
                                 footprint = np.ones((3,3)))
When I execute it though, the following error shows up:
SystemError: <class 'int'> returned a result with an error set
when trying to coerce values[4] to an int. If I run the function test_func() without using ndimage.generic_filter() for a random array values, the function works alright.
Why is this error occurring? Is there a way to make it work?
For your case:
This must be a bug in either Python or SciPy. Please file a bug at https://bugs.python.org and/or https://www.scipy.org/bug-report.html. Include the version numbers of Python and NumPy/SciPy, the full code that you have here, and the entire traceback.
(Also, if you can find a way to trigger this bug that doesn't require the use of randomness, they will likely appreciate it. But if you can't find such a method, then please do file it as-is.)
In general:
"[R]eturned a result with an error set" is something that can only be done at the C level.
In general, the Python/C API expects most C functions to do one of two things:
1. Set an exception using one of these functions and return NULL (corresponds to throwing an exception).
2. Don't set an exception and return a "real" value, usually a PyObject* (corresponds to returning a value, including returning None).
These two cases are normally incorrect:
3. Set an exception (or fail to clear one that already exists), but then return some value other than NULL.
4. Don't set an exception, but then return NULL.
Python is raising a SystemError because the implementation of int, in the Python standard library, tried to do (3), possibly as a result of SciPy doing it first. This is always wrong, so there must be a bug in either Python or the SciPy code that it called into.
I was having a very similar experience with Python 3.8.1 and SciPy 1.4.1 on Linux. A workaround was to introduce np.floor so that:
centre = int(window.size / 2) becomes centre = int(np.floor(window.size/2))
which seems to have resolved the issue.

How can I future-proof the `round` function in Python2?

When round is imported from the future, it does not behave the same as the Python3 round function. Specifically, it does not support negative digit rounding.
In Python3:
>>> round(4781, -2)
4800
In Python2:
>>> from builtins import round
>>> round(4781, -2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/dist-packages/future/builtins/newround.py", line 33, in newround
    raise NotImplementedError('negative ndigits not supported yet')
NotImplementedError: negative ndigits not supported yet
Possible solutions include error handling for negative rounding, writing my own round function, etc. How should this be handled? (Implicitly, I'm asking for best practices, most Pythonic, accepted by community, etc.)
I was going to suggest your custom function idea so you can ensure it always does what you want, but this appears to be a special (weird) case: if I don't use future.builtins.round(), I get
Python 3.6:
>>> round(4781, -2)
4800
and Python 2.7:
>>> round(4781, -2)
4800.0
It appears that only future.builtins is broken here, so in this case you should avoid the future.builtins.round() function. Note that Python 3's round() returns an integer while Python 2's round() returns a float; for simple rounding operations such as the examples given, that seems to be the only difference between the two stock implementations. (Exact halves are another matter: Python 3 rounds them to the nearest even value, while Python 2 rounds them away from zero.)
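If the int-versus-float return type matters, a small wrapper around the stock built-in is one option. This is a sketch only: round_to_int is a hypothetical helper name, and it does not paper over any difference in how the two Python versions round exact halves.

```python
def round_to_int(number, ndigits=0):
    """Hypothetical helper: call the stock round() and return an int when ndigits <= 0."""
    result = round(number, ndigits)
    # Python 2 returns a float here (4800.0); Python 3 returns an int for int input
    return int(result) if ndigits <= 0 else result

print(round_to_int(4781, -2))
```

For positive ndigits the helper returns whatever the built-in round() produced, unchanged.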

().is_integer() not working

What's wrong with this code?
n = 10
((n/3)).is_integer()
I do not understand why I cannot set n = any number and check if it is an integer or not.
Thanks for your help!
python 2.7.4
error:
Traceback (most recent call last):
  File "/home/userh/Arbeitsfläche/übung.py", line 2, in <module>
    print ((n/3)).is_integer()
AttributeError: 'int' object has no attribute 'is_integer'
You get this error because you divide the integer 10 by 3 using integer division, which gives the integer 3 as an int instance. You then try to call is_integer() on that result, but that method is defined on the float class and not on the int class, just as the error message says.
A quick fix would be to change your code and divide by 3.0 instead of 3 which would result in floating point division and give you a float instance on which you can call the is_integer() method like you are trying to. Do this:
n = 10
((n/3.0)).is_integer()
You are using Python 2.7. Unless you use from __future__ import division, dividing two integers will return an integer. is_integer exists only on float, hence your error.
The other answers say this but aren't very clear (imho).
In Python 2, the / sign means "integer division" when the arguments are integers. That gives you just the integer part of the result:
>>> 10/3
3
Which means that in (10/3).is_integer() you are calling is_integer() on 3, which is an int. That works for a float but not for an int:
>>> (3.0).is_integer()
True
>>> (3).is_integer()
AttributeError: 'int' object has no attribute 'is_integer'
what you probably want is to change one of the numbers to a float:
>>> (10/3.0).is_integer()
False
this is fixed in python 3, by the way (which is the future, and a nicer language in many small ways).
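The point above can be sketched in a way that behaves the same on Python 2 and Python 3. The variable names are illustrative; for whole-number inputs, the remainder check avoids floats altogether.

```python
n = 10

# Convert explicitly so that / is true division on both Python 2 and Python 3
quotient = float(n) / 3
print(quotient.is_integer())      # False

# For integers, a remainder check asks the same question without floats
print(n % 3 == 0)                 # False
print(9 % 3 == 0)                 # True
```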
You can use str.isdigit, a string method provided by Python itself. See the documentation at https://docs.python.org/2/library/stdtypes.html#str.isdigit
if token.isdigit():
    return int(token)
...
When I wrote this answer, the question did not say which language was in use.
In Python 2 you can use the following to check whether a value is an integer:
isinstance( <var>, ( int, long ) )

Why does built-in sum behave wrongly after "from numpy import *"?

I have some code like:
import math, csv, sys, re, time, datetime, pickle, os, gzip
from numpy import *
x = [1, 2, 3, ... ]
y = sum(x)
The sum of the actual values in x is 2165496761, which is larger than the limit of a signed 32-bit integer. The reported y value is -2129470535, implying integer overflow.
Why did this happen? I thought the built-in sum was supposed to use Python's arbitrary-size integers?
See "How to restore a builtin that I overwrote by accident?" if you've accidentally done something like this at the REPL (interpreter prompt).
Doing from numpy import * causes the built-in sum function to be replaced with numpy.sum:
>>> sum(xrange(10**7))
49999995000000L
>>> from numpy import sum
>>> sum(xrange(10**7)) # assuming a 32-bit platform
-2014260032
To verify that numpy.sum is in use, try to check the type of the result:
>>> sum([721832253, 721832254, 721832254])
-2129470535
>>> type(sum([721832253, 721832254, 721832254]))
<type 'numpy.int32'>
To avoid this problem, don't use star import.
If you must use numpy.sum and want an arbitrary-sized integer result, specify a dtype for the result like so:
>>> sum([721832253, 721832254, 721832254],dtype=object)
2165496761L
or refer to the builtin sum explicitly (possibly giving it a more convenient binding):
>>> __builtins__.sum([721832253, 721832254, 721832254])
2165496761L
The reason you get this invalid value is that np.sum is accumulating in int32. Nothing prevents you from using a wider dtype such as np.int64 instead of np.int32 to represent your data. You could for example just use
np.asarray(x, dtype=np.int64).sum()
On a side note, please make sure that you never use from numpy import *. It's a terrible practice and a habit you must get rid of as soon as possible. When you use from ... import *, you may overwrite Python built-ins, which makes problems very difficult to debug. A typical example is what happened here: functions like sum or max get replaced by their NumPy counterparts.
Python handles large numbers with arbitrary precision:
>>> sum([721832253, 721832254, 721832254])
2165496761
Just sum them up!
To make sure you don't use numpy.sum, try __builtins__.sum() instead.
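A short sketch of the behavior discussed in these answers, assuming Python 3 and NumPy imported as np. On 64-bit platforms NumPy's default integer is usually 64 bits wide, so the 32-bit accumulator is requested explicitly here to reproduce the wraparound. Importing the builtins module is a more robust way to reach the built-in sum than __builtins__, which is an implementation detail and is not always a module.

```python
import builtins          # named __builtin__ on Python 2
import numpy as np

values = [721832253, 721832254, 721832254]

# Force a 32-bit accumulator to reproduce the overflow from the question
as_int32 = np.array(values, dtype=np.int32)
wrapped = as_int32.sum(dtype=np.int32)

# A 64-bit accumulator, or the built-in sum, gives the true total
wide = np.sum(values, dtype=np.int64)
exact = builtins.sum(values)

print(int(wrapped), int(wide), exact)
```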

Creating a customized language using Python

I have started playing with Sage recently, and I've come to suspect that the standard Python int is wrapped in a customized class called Integer in Sage. If I type in type(1) in Python, I get <type 'int'>, however, if I type in the same thing in the sage prompt I get <type 'sage.rings.integer.Integer'>.
If I wanted to replace Python int (or list or dict) with my own custom class, how might it be done? How difficult would it be (e.g. could I do it entirely in Python)?
As an addendum to the other answers: when running any code, Sage has a preprocessing step which converts the Sage-Python to true Python (which is then executed). This is done by the preparse function, e.g.
sage: preparse('a = 1')
'a = Integer(1)'
sage: preparse('2^40')
'Integer(2)**Integer(40)'
sage: preparse('F.<x> = PolynomialRing(ZZ)')
"F = PolynomialRing(ZZ, names=('x',)); (x,) = F._first_ngens(1)"
This step is precisely what allows the transparent use of Integers (in place of ints) and the other non-standard syntax (like the polynomial ring example above and [a..b] etc).
As far as I understand, this is the only way to completely transparently use replacements for the built-in types in Python.
You are able to subclass all of Python's built-in types. For example:
class MyInt(int):
    pass

i = MyInt(2)
# i is now an instance of MyInt, but will still behave entirely like an integer.
However, you need to explicitly say each integer is a member of MyInt. So type(1) will still be int, you'll need to do type(MyInt(1)).
Hopefully that's close to what you're looking for.
In the case of Sage, it's easy. Sage has complete control of its own REPL (read-evaluate-print loop), so it can parse the commands you give it and make the parts of your expression into whatever classes it wants. It is not so easy to have standard Python automatically use your integer type for integer literals, however. Simply reassigning the built-in int() to some other type won't do it. You could probably do it with an import filter, that scans each file imported for (say) integer literals and replaces them with MyInt(42) or whatever.
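The literal-rewriting idea can be sketched with the standard ast module instead of a full import hook. This is an illustration under assumptions, not how Sage's preparser works: MyInt, WrapIntLiterals, and run_preparsed are hypothetical names, and the code assumes Python 3.8 or later, where integer literals parse to ast.Constant nodes.

```python
import ast

class MyInt(int):
    """Hypothetical int subclass standing in for a custom integer type."""

class WrapIntLiterals(ast.NodeTransformer):
    """Rewrite every integer literal n into the call MyInt(n)."""
    def visit_Constant(self, node):
        # bool is a subclass of int, so True/False are excluded explicitly
        if isinstance(node.value, int) and not isinstance(node.value, bool):
            call = ast.Call(func=ast.Name(id="MyInt", ctx=ast.Load()),
                            args=[node], keywords=[])
            return ast.copy_location(call, node)
        return node

def run_preparsed(source, namespace):
    """Parse source, wrap its integer literals, then execute it in namespace."""
    tree = WrapIntLiterals().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    exec(compile(tree, "<preparsed>", "exec"), namespace)

ns = {"MyInt": MyInt}
run_preparsed("a = 1\nb = a + 2", ns)
print(type(ns["a"]).__name__)   # MyInt
print(type(ns["b"]).__name__)   # int
```

Note that b comes back as a plain int: the inherited arithmetic operators return int, so a real replacement type would also need to override them.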
