Decimal digits in Python with e+

I want to print output in Python that always shows two digits. The problem is that there are very large (or very small) numbers, so Python's default output looks like this:
5.89630388655e-09
8.93552349994e+14
but sometimes also normal numbers like:
345.8976
I just want to force it to two digits, which means that the output for the large and small numbers would be
5.89e-09
8.93e+14
and the normal numbers just capped (or rounded) at the second decimal digit:
345.89 (or 345.90)
How can I achieve this in Python?

In Python you can format numbers in a way similar to other languages:
print '%.2f' % number
print '%.3g' % number
See string formatting for more details on available flags and conversions.
Alternatively, you can use str.format() or a Formatter:
'{0:.2f}'.format(number)
'{0:.3g}'.format(number)
See format string syntax for details on the format expression syntax.
The f conversion produces notation which always contains the decimal point and may result in very long string representation for large numbers and 0.00 for very small numbers.
The g conversion produces notation with or without the exponent depending on the size of the number. However, the precision argument is interpreted differently for g than for f conversion. For f it is the number of digits after the decimal point while for g it is the number of all significant digits displayed. See string formatting for details.
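For example, with the sample numbers from the question, the two conversions behave like this:
>>> '%.2f' % 345.8976            # two digits after the decimal point
'345.90'
>>> '%.2f' % 5.89630388655e-09   # tiny numbers collapse to 0.00
'0.00'
>>> '%.3g' % 345.8976            # three significant digits, no exponent needed
'346'
>>> '%.3g' % 8.93552349994e+14
'8.94e+14'
>>> '%.3g' % 5.89630388655e-09   # %g also strips trailing zeros, hence 5.9 not 5.90
'5.9e-09'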
The reason for the different interpretation of the precision argument is that when dealing with numbers of very different magnitudes it makes a lot more sense to stick to a fixed number of significant digits.
If you decide not to follow convention here, you'll need to write code which uses different formatting expressions for numbers of different magnitude, as in the sketch below. Note however that this will result in your code producing numbers with different accuracy depending on their magnitude, e.g. 345.89 has five significant digits while 3.46e+10 and 3.46e-10 have only three.
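A minimal sketch of that approach (the helper name and the magnitude thresholds are arbitrary choices, not a standard recipe):
def two_digit_format(number):
    # Fixed-point for 'normal' magnitudes, exponent notation otherwise.
    # The thresholds below are an assumption; adjust to taste.
    if number == 0 or 0.001 <= abs(number) < 100000:
        return '%.2f' % number
    return '%.2e' % number

>>> two_digit_format(345.8976)
'345.90'
>>> two_digit_format(5.89630388655e-09)
'5.90e-09'
>>> two_digit_format(8.93552349994e+14)
'8.94e+14'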

You could use the format() method:
"{0:.2f}".format(yournumber)
'.2f' means two decimal places
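For example, with the number from the question:
>>> "{0:.2f}".format(345.8976)
'345.90'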

Related

How to convert exponent in Python and get rid of the 'e+'?

I'm starting with Python and I recently came across a dataset with big values.
One of my fields has a list of values that looks like this: 1.3212724310201994e+18 (note the e+18 by the end of the number).
How can I convert it to a floating point number and remove the exponent without affecting the value?
First of all, the number is already a floating point number, and you do not need to change this. The only issue is that you want to have more control over how it is converted to a string for output purposes.
By default, floating point numbers above a certain size are converted to strings using exponential notation (with "e" representing "*10^"). However, if you want to convert it to a string without exponential notation, you can use the f format specifier, for example:
a = 1.3212724310201994e+18
print("{:f}".format(a))
gives:
1321272431020199424.000000
or using "f-strings" in Python 3.6+:
print(f"{a:f}")
Here the first f tells Python to use an f-string, and the :f is the floating-point format specifier.
You can also specify the number of decimal places that should be displayed, for example:
>>> print(f"{a:.2f}") # 2 decimal places
1321272431020199424.00
>>> print(f"{a:.0f}") # no decimal places
1321272431020199424
Note that the internal representation of a floating-point number in Python uses 53 binary digits of accuracy (approximately one part in 10^16), so in this case, the value of your number of magnitude approximately 10^18 is not stored with accuracy down to the nearest integer, let alone any decimal places. However, the above gives the general principle of how you control the formatting used for string conversion.
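You can see the limited precision directly: at this magnitude adjacent floats are 256 apart, so adding 1 does not change the value at all:
>>> a = 1.3212724310201994e+18
>>> a + 1 == a
True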
You can use Decimal from the decimal module for each element of your data:
from decimal import Decimal
s = 1.3212724310201994e+18
print(Decimal(s))
Output:
1321272431020199424
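This works because constructing a Decimal from a float converts the exact binary value that is stored, with no extra rounding on the way. The classic demonstration:
>>> from decimal import Decimal
>>> Decimal(0.1)
Decimal('0.1000000000000000055511151231257827021181583404541015625')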

Numpy too weak to calculate a precise mean value

This question is very similar to this post, but not exactly the same.
I have some data in a .csv file. The data has precision to the 4th digit (#.####).
Calculating the mean in Excel or SAS gives a result with precision to 5th digit (#.#####) but using numpy gives:
import numpy as np
data = np.recfromcsv(path2file, delimiter=';', names=['measurements'], dtype=np.float64)
rawD = data['measurements']
print np.average(rawD)
gives a number like this
#.#####999999999994
Clearly something is wrong..
using
from math import fsum
print fsum(rawD.ravel())/rawD.size
gives
#.#####
Is there anything in np.average that I have set wrong?
BONUS info:
I'm only working with 200 data points in the array
UPDATE
I thought I should make my case more clear.
I have numbers like 4.2730 in my csv (giving a 4 decimal precision - even though the 4th always is zero [not part of the subject so don't mind that])
Calculating an average/mean by numpy gives me this
4.2516499999999994
Which gives a print by
>>> print "%.4f" % np.average(rawD)
4.2516
Doing the same thing in Excel or SAS gives me this:
4.2517
Which I actually believe as being the true average value because it finds it to be 4.25165.
This code also illustrate it:
answer = 0
for number in rawD:
    answer += int(number*1000)
print answer/2
425165
So how do I tell np.average() to calculate this value ___?
I'm a bit surprised that numpy did this to me... I thought that I only needed to worry if I was dealing with 16-digit numbers. I didn't expect a round-off at the 4th decimal place to be influenced by this.
I know I could use
fsum(rawD.ravel())/rawD.size
But I also have other things (like std) I want to calculate with the same precision
UPDATE 2
I thought I could make a temp solution by
>>> print "%.4f" % np.float64("%.5f" % np.mean(rawD))
4.2416
Which did not solve the case. Then I tried
>>> print "%.4f" % float("4.24165")
4.2416
AHA! There is a bug in the formatter: Issue 5118
To be honest I don't care if python stores 4.24165 as 4.241649999... It's still a round off error - NO MATTER WHAT.
If the interpreter can figure out how to display the number
>>> print float("4.24165")
4.24165
then the formatter should be able to as well, and deal with that number when rounding.
It still doesn't change the fact that I have a round off problem (now both with the formatter and numpy)
In case you need some numbers to help me out then I have made this modified .csv file:
Download it from here
(I'm aware that this file does not have the number of digits I explained earlier and that the average gives ..9988 at the end instead of ..9994 - it's modified)
I guess my question boils down to: how do I get a string output like the one Excel gives me with =average(), and have it round off correctly if I choose to show only 4 digits?
I know that this might seem strange for some.. But I have my reasons for wanting to reproduce the behavior of Excel.
Any help would be appreciated, thank you.
To get exact decimal numbers, you need to use decimal arithmetic instead of binary. Python provides the decimal module for this.
If you want to continue to use numpy for the calculations and simply round the result, you can still do this with decimal. You do it in two steps, rounding to a large number of digits to eliminate the accumulated error, then rounding to the desired precision. The quantize method is used for rounding.
from decimal import Decimal, ROUND_HALF_UP
ten_places = Decimal('0.0000000001')
four_places = Decimal('0.0001')
mean = 4.2516499999999994
print Decimal(mean).quantize(ten_places).quantize(four_places, rounding=ROUND_HALF_UP)
4.2517
The result value of average is a double. When you print out a double, by default all significant digits are shown, and what you see here is the result of limited binary precision. This is not a problem of numpy but a general computing problem. When you care about the presentation of your float value, use "%.4f" % avg_val. There is also a package for rational numbers, to avoid representing fractions as real numbers, but I guess that's not what you're looking for.
For your second snippet, summing all the values by hand and then dividing, I suppose you're using Python 2.7, where dividing two integers performs integer division, which truncates everything after the dot and results in another integer.
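A quick illustration of Python 2 integer division, using made-up numbers (in Python 3, / always returns a float and // is the truncating division):
>>> 850330 / 200     # Python 2: integer / integer truncates
4251
>>> 850330 / 200.0   # make one operand a float to get the exact result
4251.65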

Python: read mixed float and string csv file

I have a csv file with mixed floats, a string and an integer, the formatted output from a FORTRAN file.
A typical line looks like:
507.930 , 24.4097 , 1.0253E-04, O III , 4
I want to read it while keeping the float decimal places unmodified, and check whether the first entry in each line is present in another list.
Using loadtxt and genfromtxt results in the decimal places changing from 3 (or 4) to 12.
How should I tackle this?
If you need to keep precision exactly, you need to use the decimal module. Otherwise, issues with floating point arithmetic limitations might trip you up.
Chances are, though, that you don't really need that precision - just make sure you don't compare floats for equality exactly but always allow a fudge factor, and format the output to a limited number of significant digits:
import sys
# instead of if float1 == float2:, use this:
if abs(float1 - float2) <= sys.float_info.epsilon:
    print "equal"
loadtxt appears to take a converters argument so something like:
from decimal import Decimal
numpy.loadtxt(..., converters={0: Decimal,
                               1: Decimal,
                               2: Decimal})
Should work.
Decimal's should work with whatever precision you require although if you're doing significant number crunching with Decimal it will be considerably slower than working with float. However, I assume you're just looking to transform the data without losing any precision so this should be fine.
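One caveat (this is an assumption to verify against your numpy version, and data.csv stands in for your file): loadtxt builds a float array by default, which would convert the Decimals straight back, so you would also need to pass an object dtype to keep them:
import numpy
from decimal import Decimal
data = numpy.loadtxt('data.csv', delimiter=',', usecols=(0, 1, 2),
                     converters={0: Decimal, 1: Decimal, 2: Decimal},
                     dtype=object)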
I ended up writing some string processing code. Not elegant, but it works:
stuff = loadtxt(fname1, skiprows=35, dtype="f10,f10,e10,S10,i1", delimiter=',')
stuff2 = loadtxt('keylines.txt')  # a list of the reference values
...  # open file for writing etc
for i in range(0, len(stuff)):
    bb = round(float(stuff[i][0]), 3)  # gets number back to correct decimal format
    cc = round(float(stuff[i][1]), 5)  # ditto
    dd = float(stuff[i][2])
    ee = stuff[i][3].replace(" ", "")  # gets rid of extra FORTRAN spaces
    ff = int(stuff[i][4])
    for item in stuff2:
        if bb == item:
            fn.write(str(bb) + ',' + "%1.5f" % cc + ',' + "%1.4e" % dd + ',' + ee + ',' + str(ff) + '\n')

Python: Suppress exponential format (i.e. 9e-10) in float to string conversion?

I want to use Python to write code for another language which doesn't understand exponentially formatted floats. Is there an easy way to get Python, when converting floats to strings, to use long-form notation (i.e. 0.000000009 instead of 9e-9)? I tried '%(foo)f', but it cuts the decimals short (0.00000).
Try something like
"%.16f" % f
This will still print all zeros if the number is too small (%f never switches to exponent notation), so you have to treat this case separately, for example
"%.16f" % f if f >= 1e-16 else "0.0"
Use a specific format specifier, e.g.:
>>> f=9*(10**-9)
>>> str(f)
'9e-09'
>>> "%.23f" % f
'0.00000000900000000000000'
UPDATE (thanks to @Sven): The number of digits you need depends on the magnitude of the number. If you have large numbers (like several trillions) you won't need any decimals, obviously. For tiny numbers you need more. 'Tis an ugly representation indeed.

Significant figures in the decimal module

So I've decided to try to solve my physics homework by writing some python scripts to solve problems for me. One problem that I'm running into is that significant figures don't always seem to come out properly. For example this handles significant figures properly:
>>> from decimal import Decimal
>>> Decimal('1.0') + Decimal('2.0')
Decimal("3.0")
But this doesn't:
>>> Decimal('1.00') / Decimal('3.00')
Decimal("0.3333333333333333333333333333")
So two questions:
Am I right that this isn't the expected amount of significant digits, or do I need to brush up on significant digit math?
Is there any way to do this without having to set the decimal precision manually? Granted, I'm sure I can use numpy to do this, but I just want to know if there's a way to do this with the decimal module out of curiosity.
Changing the decimal working precision to 2 digits is not a good idea unless you are absolutely only going to perform a single operation.
You should always perform calculations at higher precision than the level of significance, and only round the final result. If you perform a long sequence of calculations and round to the number of significant digits at each step, errors will accumulate. The decimal module doesn't know whether any particular operation is one in a long sequence, or the final result, so it assumes that it shouldn't round more than necessary. Ideally it would use infinite precision, but that is too expensive so the Python developers settled for 28 digits.
Once you've arrived at the final result, what you probably want is quantize:
>>> (Decimal('1.00') / Decimal('3.00')).quantize(Decimal("0.001"))
Decimal("0.333")
You have to keep track of significance manually. If you want automatic significance tracking, you should use interval arithmetic. There are some libraries available for Python, including pyinterval and mpmath (which supports arbitrary precision). It is also straightforward to implement interval arithmetic with the decimal library, since it supports directed rounding.
You may also want to read the Decimal Arithmetic FAQ: Is the decimal arithmetic ‘significance’ arithmetic?
Decimals won't throw away decimal places like that. If you really want to limit the precision to 2 significant digits, then try
decimal.getcontext().prec = 2
EDIT: You can alternatively call quantize() every time you multiply or divide (addition and subtraction will preserve the 2 dps).
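For example, with the context precision lowered (note that prec counts significant digits, not decimal places):
>>> import decimal
>>> decimal.getcontext().prec = 2
>>> decimal.Decimal('1.00') / decimal.Decimal('3.00')
Decimal('0.33')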
Just out of curiosity...is it necessary to use the decimal module? Why not floating point with a significant-figures rounding of numbers when you are ready to see them? Or are you trying to keep track of the significant figures of the computation (like when you have to do an error analysis of a result, calculating the computed error as a function of the uncertainties that went into the calculation)? If you want a rounding function that rounds from the left of the number instead of the right, try:
def lround(x, leadingDigits=0):
    """Return x either as 'print' would show it (the default)
    or rounded to the specified number of digits as counted from the leftmost
    non-zero digit of the number, e.g. lround(0.00326, 2) --> 0.0033
    """
    assert leadingDigits >= 0
    if leadingDigits == 0:
        return float(str(x))  # just give it back like 'print' would give it
    # %.*e keeps one digit before the point plus `precision` digits after it,
    # so leadingDigits significant digits need a precision of leadingDigits-1
    return float('%.*e' % (int(leadingDigits) - 1, x))  # rounded by the %e format
The numbers will look right when you print them or convert them to strings, but if you are working at the prompt and don't explicitly print them they may look a bit strange:
>>> lround(1./3.,2),str(lround(1./3.,2)),str(lround(1./3.,4))
(0.33000000000000002, '0.33', '0.3333')
Decimal defaults to 28 significant digits of precision.
The only way to limit the number of digits it returns is by altering the precision.
What's wrong with floating point?
>>> "%8.2e"% ( 1.0/3.0 )
'3.33e-01'
It was designed for scientific-style calculations with a limited number of significant digits.
If I understand Decimal correctly, the "precision" is the number of digits after the decimal point in decimal notation.
You seem to want something else: the number of significant digits. That is one more than the number of digits after the decimal point in scientific notation.
I would be interested in learning about a Python module that does significant-digits-aware floating point computations.
