How to set the precision on str(numpy.float64)? - python

I need to write a couple of NumPy floats to a CSV file that also contains string content, so I can't use savetxt and the like. With numpy.set_printoptions() I can only control how arrays are printed, not what str() returns for a scalar. I know I'm missing something and it can't be that hard, but I haven't found a reasonable answer on the web. Maybe someone can point me in the right direction. Here's some example code:
In [1]: import numpy as np
In [2]: foo = np.array([1.22334])
In [3]: foo
Out[3]: array([ 1.22334])
In [4]: foo[0]
Out[4]: 1.2233400000000001
In [5]: str(foo[0])
Out[5]: '1.22334'
In [6]: np.set_printoptions(precision=3)
In [7]: foo
Out[7]: array([ 1.223])
In [8]: foo[0]
Out[8]: 1.2233400000000001
In [9]: str(foo[0])
Out[9]: '1.22334'
How do I convert an np.float64 to a nicely formatted string that I can feed to file.write()?
kind regards,
fookatchu

You can just use standard string formatting:
>>> x = 1.2345678
>>> '%.2f' % x
'1.23'

You could use normal string formatting; see:
http://docs.python.org/library/string.html#formatspec
Example:
print('{:.2f}'.format(0.1234))  # '0.12'
print('{:.2e}'.format(0.1234))  # '1.23e-01'

NumPy 1.14 and later provide the format_float_positional and format_float_scientific functions, which format a floating-point scalar as a decimal string in positional or scientific notation, with control over rounding, trimming and padding. These functions offer much more control over the formatting than conventional Python string formatters.
import numpy as np
x = np.float64('1.2345678')
print(np.format_float_positional(x)) # 1.2345678
print(np.format_float_positional(x, precision=3)) # 1.235
print(np.format_float_positional(np.float16(x))) # 1.234
print(np.format_float_positional(np.float16(x), unique=False, precision=8)) # 1.23437500
y = x / 1e8
print(np.format_float_scientific(y)) # 1.2345678e-08
print(np.format_float_scientific(y, precision=3, exp_digits=1)) # 1.235e-8
etc.
These advanced formatters are based on the Dragon4 algorithm; see Ryan Juckett's Printing Floating-Point Numbers to read more on the subject.

Instead of str(foo[0]), use "%.3f" % foo[0].
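For the original use case (a CSV row that mixes strings and floats), a minimal sketch could look like the following. The labels, the second array value, and the in-memory buffer are illustrative assumptions and not part of the question.

```python
import csv
import io

import numpy as np

foo = np.array([1.22334, 2.5])

# An in-memory buffer stands in for a real file, e.g. open(path, "w", newline="").
buf = io.StringIO()
writer = csv.writer(buf, lineterminator="\n")

for label, value in zip(["first", "second"], foo):
    # Format each float explicitly instead of relying on str().
    writer.writerow([label, "%.3f" % value])

print(buf.getvalue())
```

Formatting at write time leaves the array itself untouched, so the full precision stays available for later calculations.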

Also you can do:
precision = 2
str(np.round(foo[0], precision))
It had some advantages for me over '%.2f' % x when I needed the string form of np.log(0.0): NumPy neatly renders it as "-inf", so you don't have to handle that case yourself.
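One difference between the two approaches that may matter in a CSV file: rounding and then calling str() does not pad with trailing zeros, while a fixed-point format always does. The sketch below compares them; the sample value 1.5 is an arbitrary choice, and in current Python versions both approaches render negative infinity as "-inf".

```python
import numpy as np

x = np.float64(1.5)
rounded = str(np.round(x, 2))  # rounding keeps the value but adds no trailing zeros
fixed = "%.2f" % x             # fixed-point formatting always shows two decimals

# np.log(0.0) is -inf; errstate silences the divide-by-zero warning it would emit.
with np.errstate(divide="ignore"):
    neg_inf = np.log(np.float64(0.0))

print(rounded)            # 1.5
print(fixed)              # 1.50
print(str(neg_inf))       # -inf
print("%.2f" % neg_inf)   # -inf
```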

Related

how do I check decimal.is_nan() for all values in array?

Suppose I have my array like this:
from decimal import Decimal
array = [Decimal(np.nan), Decimal(np.nan), Decimal(0.231411)]
I know that if the types are float, I can check whether all the values are NaN, as:
np.isnan(array).all()
Is there a way for type Decimal?
The solution would be better without iteration.
You could use NumPy's vectorize to avoid iteration.
In [40]: from decimal import Decimal
In [41]: import numpy as np
In [42]: nums = [Decimal(np.nan), Decimal(np.nan), Decimal(0.231411)]
In [43]: nums
Out[43]:
[Decimal('NaN'),
Decimal('NaN'),
Decimal('0.2314110000000000055830895462349872104823589324951171875')]
In [44]: np.all(np.vectorize(lambda x: x.is_nan())(np.asarray(nums)))
Out[44]: False
In [45]: np.all(np.vectorize(lambda x: x.is_nan())(np.asarray(nums[:-1])))
Out[45]: True
In the snippet above nums is a list of instances of class Decimal. Notice that you need to convert that list into a NumPy array.
As noted in my comment above, I realise this is still an iteration. np.isnan does not support Decimal as an input type, so I don't believe this can be done via broadcasting without converting the datatype, and that conversion risks the very precision loss that using Decimal is meant to avoid.
Additionally, as commented by @juanpa.arrivillaga, since the Decimal objects are in a list, iteration is the only way. NumPy is not necessary for this operation.
One method is:
all([i.is_nan() for i in array])
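A self-contained version of that check, as a sketch: it uses a generator expression so that all() can stop at the first non-NaN value. The list name values is an assumption chosen to avoid shadowing the name array.

```python
from decimal import Decimal

values = [Decimal("NaN"), Decimal("NaN"), Decimal("0.231411")]

# all() short-circuits on the first element that is not NaN.
all_nan = all(d.is_nan() for d in values)
all_but_last_nan = all(d.is_nan() for d in values[:-1])

print(all_nan)           # False
print(all_but_last_nan)  # True
```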

How to reliably separate decimal and floating parts from a number?

This is not a duplicate of this, I'll explain here.
Consider x = 1.2. I'd like to separate it out into 1 and 0.2. I've tried all these methods as outlined in the linked question:
In [370]: x = 1.2
In [371]: divmod(x, 1)
Out[371]: (1.0, 0.19999999999999996)
In [372]: math.modf(x)
Out[372]: (0.19999999999999996, 1.0)
In [373]: x - int(x)
Out[373]: 0.19999999999999996
In [374]: x - int(str(x).split('.')[0])
Out[374]: 0.19999999999999996
Nothing I try gives me exactly 1 and 0.2.
Is there any way to reliably convert a floating number to its decimal and floating point equivalents that is not hindered by the limitation of floating point representation?
I understand this might be due to the limitation of how the number is itself stored, so I'm open to any suggestion (like a package or otherwise) that overcomes this.
Edit: Would prefer a way that didn't involve string manipulation, if possible.
Solution
It may seem like a hack, but you could separate the string form (actually repr) and convert it back to ints and floats:
In [1]: x = 1.2
In [2]: s = repr(x)
In [3]: p, q = s.split('.')
In [4]: int(p)
Out[4]: 1
In [5]: float('.' + q)
Out[5]: 0.2
How it works
The reason for approaching it this way is that the internal algorithm for displaying 1.2 is very sophisticated (a fast variant of David Gay's algorithm). It works hard to show the shortest of the possible representations of numbers that cannot be represented exactly. By splitting the repr form, you're taking advantage of that algorithm.
Internally, the value entered as 1.2 is stored as the binary fraction, 5404319552844595 / 4503599627370496 which is actually equal to 1.1999999999999999555910790149937383830547332763671875. The Gay algorithm is used to display this as the string 1.2. The split then reliably extracts the integer portion.
In [6]: from decimal import Decimal
In [7]: Decimal(1.2)
Out[7]: Decimal('1.1999999999999999555910790149937383830547332763671875')
In [8]: (1.2).as_integer_ratio()
Out[8]: (5404319552844595, 4503599627370496)
Rationale and problem analysis
As stated, your problem roughly translates to "I want to split the integral and fractional parts of the number as it appears visually rather than according to how it is actually stored".
Framed that way, it is clear that the solution involves parsing how it is displayed visually. While it may feel like a hack, this is the most direct way to take advantage of the very sophisticated display algorithms and actually match what you see.
This may be the only reliable way to match what you see unless you manually reproduce the internal display algorithms.
Failure of alternatives
If you want to stay in the realm of numeric operations, you could try rounding and subtraction, but that would give you an unexpected value for the fractional portion:
In [9]: round(x)
Out[9]: 1.0
In [10]: x - round(x)
Out[10]: 0.19999999999999996
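A variant of the repr-based idea, sketched here as one possibility: feed the repr string to decimal.Decimal and let divmod do the split, so that the fractional part stays exact as a Decimal. The helper name split_as_displayed is hypothetical, and the approach still relies on the string form, so it does not meet the "no string manipulation" preference from the question.

```python
from decimal import Decimal


def split_as_displayed(x):
    """Split x into (integer part, fractional part) based on its shortest repr."""
    # float(x) makes sure we get a plain float repr such as '1.2'.
    d = Decimal(repr(float(x)))
    whole, frac = divmod(d, 1)
    return int(whole), frac


print(split_as_displayed(1.2))  # (1, Decimal('0.2'))
```

Converting the fractional part back with float() yields 0.2 again, because 0.2 is the shortest string that round-trips to that float.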
Here is a solution without string manipulation (frac_digits is the count of decimal digits that you can guarantee the fractional part of your numbers will fit into):
>>> def integer_and_fraction(x, frac_digits=3):
...     i = int(x)
...     c = 10**frac_digits
...     f = round(x*c-i*c)/c
...     return (i, f)
...
>>> integer_and_fraction(1.2)
(1, 0.2)
>>> integer_and_fraction(1.2, 1)
(1, 0.2)
>>> integer_and_fraction(1.2, 2)
(1, 0.2)
>>> integer_and_fraction(1.2, 5)
(1, 0.2)
>>>
You could try converting 1.2 to string, splitting on the '.' and then converting the two strings ("1" and "2") back to the format you want.
Additionally padding the second portion with a '0.' will give you a nice format.
So I just did the following in a python terminal and it seemed to work properly...
x=1.2
s=str(x).split('.')
i=int(s[0])
d=float('0.'+s[1])  # prefix the digits with '0.' so any number of decimals works

Smallest positive float64 number

I need to find a numpy.float64 value that is as close to zero as possible.
NumPy offers several constants that come close to this:
np.finfo(np.float64).eps = 2.2204460492503131e-16
np.finfo(np.float64).tiny = 2.2250738585072014e-308
These are both reasonably small, but when I do this
>>> x = np.finfo(np.float64).tiny
>>> x / 2
1.1125369292536007e-308
the result is even smaller. When using an impromptu binary search I can get down to about 1e-323, before the value is rounded down to 0.0.
Is there a constant for this in numpy that I am missing? Alternatively, is there a right way to do this?
Use np.nextafter.
>>> import numpy as np
>>> np.nextafter(0, 1)
4.9406564584124654e-324
>>> np.nextafter(np.float32(0), np.float32(1))
1.4012985e-45
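2^-1074 is the smallest positive float64 (a subnormal number).
2^-1074 ≈ 4.94e-324; 2^-1075 lies exactly halfway between that value and zero, so it rounds to 0.0.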
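The relationship between these values can be checked with a short sketch. It assumes a reasonably recent NumPy: as far as I know, the finfo attribute smallest_subnormal was added in NumPy 1.22, so drop that line on older versions.

```python
import math

import numpy as np

# The next representable float64 after 0 in the direction of 1.
smallest = np.nextafter(np.float64(0), np.float64(1))

# math.ldexp(1.0, -1074) computes 1.0 * 2**-1074 exactly.
print(smallest == math.ldexp(1.0, -1074))  # True

# Halving the smallest subnormal gives a tie that rounds to zero.
print(smallest / 2 == 0.0)  # True

# Assumed to exist in NumPy 1.22 and later.
print(smallest == np.finfo(np.float64).smallest_subnormal)  # True
```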

Separating real and imaginary parts using Sympy

I am trying to segregate real and imaginary parts of the output for the following program.
import sympy as sp
a = sp.symbols('a', imaginary=True)
b=sp.symbols('b',real=True)
V=sp.symbols('V',imaginary=True)
a=4*sp.I
b=5
V=a+b
print V
Kindly help. Thanks in advance.
The lines
b=sp.symbols('b',real=True)
V=sp.symbols('V',imaginary=True)
have no effect, because you overwrite the variables b and V in the lines
b=5
V=a+b
It's important to understand the difference between Python variables and SymPy symbols when using SymPy. Whenever you use =, you are assigning a Python variable, which is just a pointer to the number or expression you assign it to. Assigning it again changes the pointer, not the expression. See http://docs.sympy.org/latest/tutorial/intro.html and http://nedbatchelder.com/text/names.html.
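A small sketch of that distinction, along the lines of the SymPy tutorial: rebinding the Python name x does not change an expression that was already built from the symbol x.

```python
from sympy import symbols

x = symbols("x")
expr = x + 1

# Rebinding the Python variable does not touch the symbol inside expr.
x = 2
print(expr)  # x + 1

# To evaluate the expression at a value, substitute for the symbol explicitly.
print(expr.subs(symbols("x"), 2))  # 3
```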
To do what you want, use the as_real_imag() method, like
In [1]: expr = 4*I + 5
In [2]: expr.as_real_imag()
Out[2]: (5, 4)
You can also use the re() and im() functions:
In [3]: re(expr)
Out[3]: 5
In [4]: im(expr)
Out[4]: 4
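For completeness, here is a self-contained sketch with the imports spelled out. The symbolic example with x and y is an added illustration, not part of the original question; declaring the symbols as real is what lets re() and im() split the expression.

```python
from sympy import I, im, re, symbols

expr = 4*I + 5
print(expr.as_real_imag())  # (5, 4)

# With symbols declared real, re() and im() can separate a symbolic expression too.
x, y = symbols("x y", real=True)
z = x + I*y
print(re(z), im(z))  # x y
```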

Show an array in format of scientific notation

I would like to show my results in scientific notation (e.g., 1.2e3). My data is in array format. Is there a function like tolist() that can convert the array to float so I can use %E to format the output?
Here is my code:
import numpy as np
a=np.zeros(shape=(5,5), dtype=float)
b=a.tolist()
print a, type(a), b, type(b)
print '''%s''' % b
# what I want is
print '''%E''' % function_to_float(a or b)
If your version of Numpy is 1.7 or greater, you should be able to use the formatter option to numpy.set_printoptions. 1.6 should definitely work -- 1.5.1 may work as well.
import numpy as np
a = np.zeros(shape=(5, 5), dtype=float)
np.set_printoptions(formatter={'float': lambda x: format(x, '6.3E')})
print(a)
Alternatively, if you don't have formatter, you can create a new array whose values are formatted strings in the format you want. This will create an entirely new array as big as your original array, so it's not the most memory-efficient way of doing this, but it may work if you can't upgrade numpy. (I tested this and it works on numpy 1.3.0.)
To use this strategy to get something similar to above:
import numpy as np
a = np.zeros(shape=(5, 5), dtype=float)
formatting_function = np.vectorize(lambda f: format(f, '6.3E'))
print(formatting_function(a))
'6.3E' is the format you want each value printed as. You can consult the documentation of Python's format specification mini-language for more options.
In this case, 6 is the minimum width of the printed number and 3 is the number of digits displayed after the decimal point.
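To make the width and precision parts concrete, here is a short sketch. Note that a number in this notation already takes nine characters, so a minimum width of 6 has no visible effect, while 12 pads on the left. The np.array2string call is an alternative I am suggesting here: it formats one array without changing the global print options.

```python
import numpy as np

narrow = format(1200.0, "6.3E")  # width 6 is smaller than the 9 characters needed
wide = format(1200.0, "12.3E")   # width 12 pads with three leading spaces
print(repr(narrow))  # '1.200E+03'
print(repr(wide))    # '   1.200E+03'

a = np.zeros(shape=(2, 2), dtype=float)
# Format a single array locally instead of calling np.set_printoptions globally.
text = np.array2string(a, formatter={"float_kind": lambda v: format(v, ".3E")})
print(text)
```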
You can format each of the elements of an array in scientific notation and then display them as you'd like. A list cannot itself be converted to a float, although it may contain floats.
import numpy as np
a = np.zeros(shape=(5, 5), dtype=float)
for e in a.flat:
    print("%E" % e)
or
print(["%E" % e for e in a.flat])
