I need to perform many math operations on numbers that look like 104.950178 - i.e. having multiple decimal places. It ranges from 4 decimal places to 8 decimal places with an occasional 10 decimal place number.
After much reading on Stack Overflow, I understand that I should not use float but instead use Decimal, because float is base 2 and Decimal is base 10. Having many math operations, float will give a result with error, which was not what I wanted, so I went with Decimal.
Everything was working fine until I encountered:
from decimal import *
Decimal(0.1601).quantize(Decimal('.0001'), rounding=ROUND_FLOOR) #returned Decimal('0.1600')
I need the numbers to be reflected exactly as they are. Can you please advise if there is another number system to use in Python 3, or if there is a way to get Decimal to take the number given as it is without changing its value?
Note: I added the "quantize..." portion of the code in response to Mark's comment, to make it an MVCE. I took the output for granted and forgot that every Decimal in this part of my code was processed to truncate at 4 decimal places.
Nevertheless, @jonrsharpe is really sharp. He saw my question and immediately knew (before the "quantize..." edit) where the problem was. Thanks!
Instead of:
a = 0.1601
Decimal(a)
I used:
a = '0.1601'
Decimal(a)
Defined it as a string instead, and Decimal reflected the value exactly. I hope this helps someone.
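A minimal sketch of the difference, using the value from the question. The variable names are illustrative only, and the Decimal(str(a)) route is an assumed workaround for a value that has already arrived as a float; it is not part of the original post.

```python
from decimal import Decimal, ROUND_FLOOR

four_places = Decimal('0.0001')

# Built from a float: Decimal captures the exact binary value,
# which is slightly below 0.1601, so flooring drops the last digit.
from_float = Decimal(0.1601).quantize(four_places, rounding=ROUND_FLOOR)

# Built from a string: the decimal digits are taken literally.
from_string = Decimal('0.1601').quantize(four_places, rounding=ROUND_FLOOR)

# If the value is already a float, going through str() recovers the
# shortest decimal text that round-trips to it (assumed workaround).
a = 0.1601
via_str = Decimal(str(a)).quantize(four_places, rounding=ROUND_FLOOR)

print(from_float)   # 0.1600
print(from_string)  # 0.1601
print(via_str)      # 0.1601
```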
Related
Beginner here. I've been following the 100 days of code course on Udemy and I have been trying to figure out the Tip Calculator project.
https://gyazo.com/285baf25f0c803fc893faa32d23d9fd1
I am receiving the wrong tip per person. For example I make the total bill amount 40.53, percent tip 15 and make it a 3 way split, and it gives me 15.33... and if I do this online on another program it would give me 15.54. Any tips for a beginner?
Your issue is that the code converts those numbers to integers. Integers have no decimal places, and int() truncates the fractional part instead of rounding, so you need to use floats to keep the cents.
Without seeing your code, as BeRT2me has requested to see, it sounds like the variable you are assigning your intermediary total to is being set as an integer or is being truncated/rounded down perhaps by the output type of a function call you are making/returning.
Most likely you need to change it to an appropriate non-integer type, such as float.
e.g. Using your values, 40.53 * 1.15 = 46.6095
46.6095 / 3 = 15.5365 (or 15.54 rounded to 2 decimal places)
46 / 3 = 15.33 (rounded to 2 decimal places)
If you are using that code example of total_final_bill, the type would implicitly be an integer, since you are adding two integer values.
You are performing an implicit rounding function on the tip_amount and the total_bill by using the int() function. These should not be integers as they are decimal/float values.
So each time you use int(), other than int(tip_percent) and int(split_question), which really are integer values in your formulas, you are truncating the decimals.
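Since the original code is only visible in the screenshot, here is a rough sketch of the idea with the numbers from the question. The variable names are made up for illustration and are not taken from the course code.

```python
bill = 40.53
tip_percent = 15
people = 3

# Keep the intermediate values as floats; round only for display.
total_with_tip = bill * (1 + tip_percent / 100)
per_person = total_with_tip / people
print(f"{per_person:.2f}")  # 15.54

# Wrapping the amounts in int() discards the cents before dividing.
truncated_total = int(bill) + int(bill * tip_percent / 100)
print(f"{truncated_total / people:.2f}")  # 15.33
```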
My code is quite simple, and only 1 line is causing an issue:
np.tan(np.radians(rotation))
Instead of my expected output for rotation = 45 as 1, I get 0.9999999999999999. I understand that 0 and a ton of 9's is 1. In my use case, however, it seems like the type of thing that will definitely build up over iterations.
What is causing the floating point error: np.tan or np.radians, and how do I get the problem function to come out correctly regardless of floating point inaccuracies?
Edit:
I should clarify that I am familiar with floating point inaccuracies. My concern is that as that number gets multiplied, added, and compared, the 1e-6 error suddenly becomes a tangible issue. I've normally been able to safely ignore floating point issues, but now I am far more concerned about the build up of error. I would like to reduce the possibility of such an error.
Edit 2:
My current solution is to just round to 8 decimal places because that's most likely enough. It's sort of a temporary solution because I'd much prefer a way to get around the IEEE decimal representations.
What is causing the floating point error: np.tan or np.radians, and how do I get the problem function to come out correctly regardless of floating point inaccuracies?
Both functions incur rounding error, since in neither case is the exact result representable in floating point.
My current solution is to just round to 8 decimal places because that's most likely enough. It's sort of a temporary solution because I'd much prefer a way to get around the IEEE decimal representations.
The problem has nothing to do with decimal representation, and this will give worse results outside of the exact case you mention above, e.g.
>>> np.tan(np.radians(60))
1.7320508075688767
>>> round(np.tan(np.radians(60)), 8)
1.73205081
>>> np.sqrt(3) # sqrt is correctly rounded, so this is the closest float to the true result
1.7320508075688772
If you absolutely need higher accuracy than the 15 decimal digits you would get from code above, then you can use an arbitrary precision library like gmpy2.
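If the real worry is comparisons going wrong as the error builds up, one common approach is to compare with a tolerance instead of rounding the values themselves. The sketch below uses the standard library math module in place of numpy so that it runs without extra packages; the default tolerance of math.isclose is an assumption about what is acceptable for your use case.

```python
import math

tan45 = math.tan(math.radians(45))
tan60 = math.tan(math.radians(60))
sqrt3 = math.sqrt(3)

# Compare with a tolerance instead of testing for exact equality.
print(math.isclose(tan45, 1.0))  # True

# Rounding to 8 places moves tan(60 degrees) further away from
# sqrt(3), not closer to it.
err_raw = abs(tan60 - sqrt3)
err_rounded = abs(round(tan60, 8) - sqrt3)
print(err_rounded > err_raw)  # True
```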
Take a look here: https://docs.scipy.org/doc/numpy/user/basics.types.html .
Standard dtypes in numpy do not go beyond 64 bits precision. From the docs:
Be warned that even if np.longdouble offers more precision than python
float, it is easy to lose that extra precision, since python often
forces values to pass through float. For example, the % formatting
operator requires its arguments to be converted to standard python
types, and it is therefore impossible to preserve extended precision
even if many decimal places are requested. It can be useful to test
your code with the value 1 + np.finfo(np.longdouble).eps.
You can increase precision with np.longdouble, but this is platform dependent
In spyder (windows):
np.finfo(np.longdouble).eps #same precision as float
>> 2.220446049250313e-16
np.finfo(np.longdouble).precision
>> 15
In google colab:
np.finfo(np.longdouble).eps #larger precision
>> 1.084202172485504434e-19
np.finfo(np.longdouble).precision
>> 18
print(np.tan(np.radians(45, dtype=np.float64), dtype=np.float64) - 1)
print(np.tan(np.radians(45, dtype=np.longdouble), dtype=np.longdouble) - 1)
>> -1.1102230246251565e-16
0.0
I am trying to return a number with 6 decimal places, regardless of what the number is.
For example:
>>> a = 3/6
>>> a
0.5
How can I take a and make it 0.500000 while preserving its type as a float?
I've tried
'{0:.6f}'.format(a)
but that returns a string. I'd like something that accomplishes this same task, but returns a float.
In the computer's memory, the float is stored as an IEEE 754 value. That means it is just binary data in a fixed format, which is nothing like the string of digits you write.
So when you manipulate it, it is still a float and has no particular number of decimals after the point. It only has one when you display it, and whatever you do, displaying it converts it to a string.
It is when you do that conversion to a string that you can specify the number of decimals to show, and you do it using the string format you wrote.
This question shows a slight misunderstanding on the nature of data types such as float and string.
A float in a computer has a binary representation, not a decimal one. The rendering to decimal that python is giving you in the console was converted to a string when it was printed, even if it's implicit by the print function. There is no difference between how a 0.5 and 0.5000000 is stored as a float in its binary representation.
When you are writing application code, it is best not to worry about the presentation until it gets to the end user where it must, somehow, be converted to a string if only implicitly. At that point you can worry about decimal places, or even whether you want it shown in decimal at all.
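A short sketch of that point: the padded text and the bare value parse to the same float, so the six decimal places exist only in the string.

```python
a = 3 / 6

# Formatting produces text; the float itself is unchanged.
text = f"{a:.6f}"
print(text)        # 0.500000
print(type(text))  # <class 'str'>

# Parsing the padded text gives back the very same float value.
b = float(text)
print(b == a)      # True
print(b)           # 0.5
```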
This question is very similar to this post - but not exactly
I have some data in a .csv file. The data has precision to the 4th digit (#.####).
Calculating the mean in Excel or SAS gives a result with precision to 5th digit (#.#####) but using numpy gives:
import numpy as np
data = np.recfromcsv(path2file, delimiter=';', names=['measurements'], dtype=np.float64)
rawD = data['measurements']
print np.average(rawD)
gives a number like this
#.#####999999999994
Clearly something is wrong..
using
from math import fsum
print fsum(rawD.ravel())/rawD.size
gives
#.#####
Is there anything in np.average that I set wrong?
BONUS info:
I'm only working with 200 data points in the array
UPDATE
I thought I should make my case more clear.
I have numbers like 4.2730 in my csv (giving a 4 decimal precision - even though the 4th always is zero [not part of the subject so don't mind that])
Calculating an average/mean by numpy gives me this
4.2516499999999994
Which gives a print by
>>>print "%.4f" % np.average(rawD)
4.2516
Doing the same thing in Excel or SAS gives me this:
4.2517
Which I actually believe as being the true average value because it finds it to be 4.25165.
This code also illustrates it:
answer = 0
for number in rawD:
    answer += int(number*1000)
print answer/2
425165
So how do I tell np.average() to calculate this value?
I'm a bit surprised that numpy did this to me... I thought that I only needed to worry if I was dealing with 16-digit numbers. I didn't expect rounding at the 4th decimal place to be influenced by this.
I know I could use
fsum(rawD.ravel())/rawD.size
But I also have other things (like std) I want to calculate with the same precision
UPDATE 2
I thought I could make a temp solution by
>>>print "%.4f" % np.float64("%.5f" % np.mean(rawD))
4.2416
Which did not solve the case. Then I tried
>>>print "%.4f" % float("4.24165")
4.2416
AHA! There is a bug in the formatter: Issue 5118
To be honest I don't care if python stores 4.24165 as 4.241649999... It's still a round off error - NO MATTER WHAT.
If the interpreter can figure out how to display the number
>>>print float("4.24165")
4.24165
then so should the formatter, and it should deal with that number when rounding.
It still doesn't change the fact that I have a round off problem (now both with the formatter and numpy)
In case you need some numbers to help me out then I have made this modified .csv file:
Download it from here
(I'm aware that this file does not have the number of digits I explained earlier and that the average gives ..9988 at the end instead of ..9994 - it's modified)
I guess my question boils down to this: how do I get a string output like the one Excel gives me if I use =average()
and have it round correctly if I choose to show only 4 digits?
I know that this might seem strange for some.. But I have my reasons for wanting to reproduce the behavior of Excel.
Any help would be appreciated, thank you.
To get exact decimal numbers, you need to use decimal arithmetic instead of binary. Python provides the decimal module for this.
If you want to continue to use numpy for the calculations and simply round the result, you can still do this with decimal. You do it in two steps, rounding to a large number of digits to eliminate the accumulated error, then rounding to the desired precision. The quantize method is used for rounding.
from decimal import Decimal, ROUND_HALF_UP
ten_places = Decimal('0.0000000001')
four_places = Decimal('0.0001')
mean = 4.2516499999999994
# First round to ten places to absorb the accumulated binary error, then half-up to four.
print(Decimal(mean).quantize(ten_places).quantize(four_places, rounding=ROUND_HALF_UP))
4.2517
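The first suggestion above, doing the arithmetic in decimal from the start, could look roughly like the sketch below. The two readings are hypothetical stand-ins for the CSV column, chosen so that their mean is 4.25165; the important assumption is that the values are read as strings and never pass through float.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical readings, kept as the strings found in the CSV file.
raw = ['4.2730', '4.2303']

values = [Decimal(s) for s in raw]
mean = sum(values) / len(values)  # decimal arithmetic, no binary rounding
print(mean)                       # 4.25165

# Round half up to four places, the result the question expects.
rounded = mean.quantize(Decimal('0.0001'), rounding=ROUND_HALF_UP)
print(rounded)                    # 4.2517
```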
The result of average is a double. When you print a double, by default all of its digits are printed. What you see here is the result of limited floating-point precision, which is not a numpy problem but a general computing one. When you care about the presentation of your float value, use "%.4f" % avg_val. There is also a package for rational numbers, which avoids representing fractions as floating-point numbers, but I guess that's not what you're looking for.
For your second snippet, which sums all the values by hand and then divides, I suppose you're using Python 2.7 and all your input values are integers. In that case you get integer division, which truncates everything after the decimal point and yields another integer value.
So I've decided to try to solve my physics homework by writing some python scripts to solve problems for me. One problem that I'm running into is that significant figures don't always seem to come out properly. For example this handles significant figures properly:
from decimal import Decimal
>>> Decimal('1.0') + Decimal('2.0')
Decimal("3.0")
But this doesn't:
>>> Decimal('1.00') / Decimal('3.00')
Decimal("0.3333333333333333333333333333")
So two questions:
Am I right that this isn't the expected amount of significant digits, or do I need to brush up on significant digit math?
Is there any way to do this without having to set the decimal precision manually? Granted, I'm sure I can use numpy to do this, but I just want to know if there's a way to do this with the decimal module out of curiosity.
Changing the decimal working precision to 2 digits is not a good idea unless you are only ever going to perform a single operation.
You should always perform calculations at higher precision than the level of significance, and only round the final result. If you perform a long sequence of calculations and round to the number of significant digits at each step, errors will accumulate. The decimal module doesn't know whether any particular operation is one in a long sequence, or the final result, so it assumes that it shouldn't round more than necessary. Ideally it would use infinite precision, but that is too expensive so the Python developers settled for 28 digits.
Once you've arrived at the final result, what you probably want is quantize:
>>> (Decimal('1.00') / Decimal('3.00')).quantize(Decimal("0.001"))
Decimal("0.333")
You have to keep track of significance manually. If you want automatic significance tracking, you should use interval arithmetic. There are some libraries available for Python, including pyinterval and mpmath (which supports arbitrary precision). It is also straightforward to implement interval arithmetic with the decimal library, since it supports directed rounding.
You may also want to read the Decimal Arithmetic FAQ: Is the decimal arithmetic ‘significance’ arithmetic?
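As a rough illustration of the interval idea with directed rounding, here is a sketch that handles division of strictly positive intervals only. The interval_div helper, the (low, high) tuple layout, and the 6-digit working precision are all assumptions made for this example; a real interval library covers signs, zero, and the other operations.

```python
from decimal import Decimal, Context, ROUND_FLOOR, ROUND_CEILING

# Two contexts that differ only in rounding direction (prec=6 is arbitrary).
DOWN = Context(prec=6, rounding=ROUND_FLOOR)
UP = Context(prec=6, rounding=ROUND_CEILING)

def interval_div(a, b):
    """Divide interval a by interval b; both are (low, high) pairs
    of strictly positive Decimals."""
    a_lo, a_hi = a
    b_lo, b_hi = b
    # The smallest quotient is rounded down and the largest is rounded
    # up, so the true result always lies inside the returned interval.
    return DOWN.divide(a_lo, b_hi), UP.divide(a_hi, b_lo)

# 1.00 and 3.00 read as "known to within half a unit in the last place".
one = (Decimal('0.995'), Decimal('1.005'))
three = (Decimal('2.995'), Decimal('3.005'))
low, high = interval_div(one, three)
print(low, high)
```

The width of the resulting interval then tells you how many digits of the quotient are actually meaningful.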
Decimals won't throw away decimal places like that. If you really want to limit precision to 2 digits (note that prec counts significant digits, not decimal places), then try
decimal.getcontext().prec=2
EDIT: You can alternatively call quantize() every time you multiply or divide (addition and subtraction will preserve the 2 dps).
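A small sketch contrasting the two options. It uses localcontext so that the reduced precision does not leak into the rest of the program; that scoping is a choice made for this example rather than part of the suggestion above.

```python
from decimal import Decimal, localcontext

with localcontext() as ctx:
    ctx.prec = 2  # two significant digits, not two decimal places
    small = Decimal('1.00') / Decimal('3.00')
    large = Decimal('100.00') / Decimal('3.00')

print(small)  # 0.33
print(large)  # 33

# quantize() is what fixes the number of decimal places.
fixed = (Decimal('100.00') / Decimal('3.00')).quantize(Decimal('0.01'))
print(fixed)  # 33.33
```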
Just out of curiosity...is it necessary to use the decimal module? Why not floating point with a significant-figures rounding of numbers when you are ready to see them? Or are you trying to keep track of the significant figures of the computation (like when you have to do an error analysis of a result, calculating the computed error as a function of the uncertainties that went into the calculation)? If you want a rounding function that rounds from the left of the number instead of the right, try:
def lround(x,leadingDigits=0):
    """Return x either as 'print' would show it (the default)
    or rounded to the specified digit as counted from the leftmost
    non-zero digit of the number, e.g. lround(0.00326,2) --> 0.0033
    """
    assert leadingDigits>=0
    if leadingDigits==0:
        return float(str(x)) #just give it back like 'print' would give it
    # %e always shows one digit before the point, hence the -1
    return float('%.*e' % (int(leadingDigits)-1,x)) #give it back as rounded by the %e format
The numbers will look right when you print them or convert them to strings, but if you are working at the prompt and don't explicitly print them they may look a bit strange:
>>> lround(1./3.,2),str(lround(1./3.,2)),str(lround(1./3.,4))
(0.33000000000000002, '0.33', '0.3333')
Decimal defaults to 28 places of precision.
The only way to limit the number of digits it returns is by altering the precision.
What's wrong with floating point?
>>> "%8.2e"% ( 1.0/3.0 )
'3.33e-01'
It was designed for scientific-style calculations with a limited number of significant digits.
If I understand Decimal correctly, the "precision" is the number of digits after the decimal point in decimal notation.
You seem to want something else: the number of significant digits. That is one more than the number of digits after the decimal point in scientific notation.
I would be interested in learning about a Python module that does significant-digits-aware floating point computations.