So I've decided to try to solve my physics homework by writing some python scripts to solve problems for me. One problem that I'm running into is that significant figures don't always seem to come out properly. For example this handles significant figures properly:
from decimal import Decimal
>>> Decimal('1.0') + Decimal('2.0')
Decimal("3.0")
But this doesn't:
>>> Decimal('1.00') / Decimal('3.00')
Decimal("0.3333333333333333333333333333")
So two questions:
Am I right that this isn't the expected amount of significant digits, or do I need to brush up on significant digit math?
Is there any way to do this without having to set the decimal precision manually? Granted, I'm sure I can use numpy to do this, but I just want to know if there's a way to do this with the decimal module out of curiosity.
Changing the decimal working precision to 2 digits is not a good idea, unless you absolutely only are going to perform a single operation.
You should always perform calculations at higher precision than the level of significance, and only round the final result. If you perform a long sequence of calculations and round to the number of significant digits at each step, errors will accumulate. The decimal module doesn't know whether any particular operation is one in a long sequence, or the final result, so it assumes that it shouldn't round more than necessary. Ideally it would use infinite precision, but that is too expensive so the Python developers settled for 28 digits.
Once you've arrived at the final result, what you probably want is quantize:
>>> (Decimal('1.00') / Decimal('3.00')).quantize(Decimal("0.001"))
Decimal("0.333")
You have to keep track of significance manually. If you want automatic significance tracking, you should use interval arithmetic. There are some libraries available for Python, including pyinterval and mpmath (which supports arbitrary precision). It is also straightforward to implement interval arithmetic with the decimal library, since it supports directed rounding.
You may also want to read the Decimal Arithmetic FAQ: Is the decimal arithmetic ‘significance’ arithmetic?
Decimals won't throw away decimal places like that. If you really want to limit precision to 2 d.p. then try
decimal.getcontext().prec=2
EDIT: You can alternatively call quantize() every time you multiply or divide (addition and subtraction will preserve the 2 dps).
Just out of curiosity...is it necessary to use the decimal module? Why not floating point with a significant-figures rounding of numbers when you are ready to see them? Or are you trying to keep track of the significant figures of the computation (like when you have to do an error analysis of a result, calculating the computed error as a function of the uncertainties that went into the calculation)? If you want a rounding function that rounds from the left of the number instead of the right, try:
def lround(x,leadingDigits=0):
"""Return x either as 'print' would show it (the default)
or rounded to the specified digit as counted from the leftmost
non-zero digit of the number, e.g. lround(0.00326,2) --> 0.0033
"""
assert leadingDigits>=0
if leadingDigits==0:
return float(str(x)) #just give it back like 'print' would give it
return float('%.*e' % (int(leadingDigits),x)) #give it back as rounded by the %e format
The numbers will look right when you print them or convert them to strings, but if you are working at the prompt and don't explicitly print them they may look a bit strange:
>>> lround(1./3.,2),str(lround(1./3.,2)),str(lround(1./3.,4))
(0.33000000000000002, '0.33', '0.3333')
Decimal defaults to 28 places of precision.
The only way to limit the number of digits it returns is by altering the precision.
What's wrong with floating point?
>>> "%8.2e"% ( 1.0/3.0 )
'3.33e-01'
It was designed for scientific-style calculations with a limited number of significant digits.
If I undertand Decimal correctly, the "precision" is the number of digits after the decimal point in decimal notation.
You seem to want something else: the number of significant digits. That is one more than the number of digits after the decimal point in scientific notation.
I would be interested in learning about a Python module that does significant-digits-aware floating point point computations.
Related
My code is quite simple, and only 1 line is causing an issue:
np.tan(np.radians(rotation))
Instead of my expected output for rotation = 45 as 1, I get 0.9999999999999999. I understand that 0 and a ton of 9's is 1. In my use case, however, it seems like the type of thing that will definitely build up over iterations.
What is causing the floating point error: np.tan or np.radians, and how do I get the problem function to come out correctly regardless of floating point inaccuracies?
Edit:
I should clarify that I am familiar with floating point inaccuracies. My concern is that as that number gets multiplied, added, and compared, the 1e-6 error suddenly becomes a tangible issue. I've normally been able to safely ignore floating point issues, but now I am far more concerned about the build up of error. I would like to reduce the possibility of such an error.
Edit 2:
My current solution is to just round to 8 decimal places because that's most likely enough. It's sort of a temporary solution because I'd much prefer a way to get around the IEEE decimal representations.
What is causing the floating point error: np.tan or np.radians, and how do I get the problem function to come out correctly regardless of floating point inaccuracies?
Both functions incur rounding error, since in neither case is the exact result representable in floating point.
My current solution is to just round to 8 decimal places because that's most likely enough. It's sort of a temporary solution because I'd much prefer a way to get around the IEEE decimal representations.
The problem has nothing to do with decimal representation, and this will give worse results outside of the exact case you mention above, e.g.
>>> np.tan(np.radians(60))
1.7320508075688767
>>> round(np.tan(np.radians(60)), 8)
1.73205081
>>> np.sqrt(3) # sqrt is correctly rounded, so this is the closest float to the true result
1.7320508075688772
If you absolutely need higher accuracy than the 15 decimal digits you would get from code above, then you can use an arbitrary precision library like gmpy2.
Take a look here: https://docs.scipy.org/doc/numpy/user/basics.types.html .
Standard dtypes in numpy do not go beyond 64 bits precision. From the docs:
Be warned that even if np.longdouble offers more precision than python
float, it is easy to lose that extra precision, since python often
forces values to pass through float. For example, the % formatting
operator requires its arguments to be converted to standard python
types, and it is therefore impossible to preserve extended precision
even if many decimal places are requested. It can be useful to test
your code with the value 1 + np.finfo(np.longdouble).eps.
You can increase precision with np.longdouble, but this is platform dependent
In spyder (windows):
np.finfo(np.longdouble).eps #same precision as float
>> 2.220446049250313e-16
np.finfo(np.longdouble).precision
>> 15
In google colab:
np.finfo(np.longdouble).eps #larger precision
>> 1.084202172485504434e-19
np.finfo(np.longdouble).precision
>> 18
print(np.tan(np.radians(45, dtype=np.float), dtype=np.float) - 1)
print(np.tan(np.radians(45, dtype=np.longfloat), dtype=np.longfloat) - 1)
>> -1.1102230246251565e-16
0.0
I need to perform many math operations on numbers that look like 104.950178 - i.e. having multiple decimal places. It ranges from 4 decimal places to 8 decimal places with an occasional 10 decimal place number.
After much reading on Stack Overflow, I understand that I should not use float but instead use Decimal, because float is base 2 and Decimal is base 10. Having many math operations, float will give a result with error, which was not what I wanted, so I went with Decimal.
Everything was working fine until I encountered:
from decimal import *
Decimal(0.1601).quantize(Decimal('.0001'), rounding=ROUND_FLOOR) #returned Decimal('0.1600')
I need the numbers to be reflected exactly as they are. Can you please advise if there is another number system to use in Python 3, or if there is a way to get Decimal to take the number given as it is without changing its value?
Note: I added the "quantize..." portion of the code due to the comment by Mark to make it a MVCE. I took for granted the output and forgot that every Decimal in this part of my code was processed to truncate at 4 decimal points.
Nevertheless, #jonrsharpe is really sharp. He saw my question and immediately knew (before the "quantize..." edit) where the problem was. Thanks!
Instead of:
a = 0.1601
Decimal(a)
I used:
a = '0.1601'
Decimal(a)
Defined it as a string instead, and Decimal reflected the value exactly. I hope this helps someone.
I'm getting something that doesn't seem to be making a lot of sense. I was practicing my coding by making a little program that would give me the probability of getting certain cards within a certain timeframe of a card game. In order to calculate the chances, I would need to create a method that would perform division, and report the chances as a fraction and as a decimal. so I designed this:
from fractions import Fraction
def time_odds(card_count,turns,deck_size=60):
chance_of_occurence = float(card_count)/float(deck_size)
opening_hand_odds = 7*chance_of_occurence
turn_odds = (7 + turns)*chance_of_occurence
print ("Chance of it being in the opening hand: %s or %s" %(opening_hand_odds,Fraction(opening_hand_odds)))
print ("Chance of it being acquired by turn %s : %s or %s" %(turns,turn_odds,Fraction(turn_odds) ))
and then I used it like so:
time_odds(3,5)
but for whatever reason I got this as the answer:
"Chance of it being in the opening hand: 0.35000000000000003 or
6305039478318695/18014398509481984"
"Chance of it being acquired by turn 5 : 0.6000000000000001 or
1351079888211149/2251799813685248"
so it's like, almost right, except the decimal is just slightly off, giving like a 0.0000000000003 difference or a 0.000000000000000000001 difference.
Python doesn't do this when I just make it do division like this:
print (7*3/60)
This gives me just 0.35, which is correct. The only difference that I can observe, is that I get the slightly incorrect values when I am dividing with variables rather than just numbers.
I've looked around a little for an answer, and most incorrect division problems have to do with integer division (or I think it can be called floor division) , but I didn't manage to find anything addressing this.
I've had a similar problem with python doing this when I was dividing really big numbers. What's going on?
Why is this so? what can I do to correct it?
The extra digits you're seeing are floating point precision errors. As you do more and more operations with floating point numbers, the errors have a chance of compounding.
The reason you don't see them when you try to replicate the computation by hand is that your replication performs the operations in a different order. If you compute 7 * 3 / 60, the mutiplication happens first (with no error), and the division introduces a small enough error that Python's float type hides it for you in its repr (because 0.35 unambiguously refers to the same float value as the computation). If you do 7 * (3 / 60), the division happens first (introducing error) and then the multiplication increases the size of the error to the point that it can't be hidden (because 0.35000000000000003 is a different float value than 0.35).
To avoid printing out the the extra digits that are probably error, you may want to explicitly specify a precision to use when turning your numbers into strings. For instance, rather than using the %s format code (which calls str on the value), you could use %.3f, which will round off your number after three decimal places.
There's another issue with your Fractions. You're creating the Fraction directly from the floating point value, which already has the error calculated in. That's why you're seeing the fraction print out with a very large numerator and denominator (it's exactly representing the same number as the inaccurate float). If you instead pass integer numerator and denominator values to the Fraction constructor, it will take care of simplifying the fraction for you without any floating point inaccuracy:
print("Chance of it being in the opening hand: %.3f or %s"
% (opening_hand_odds, Fraction(7*card_count, deck_size)))
This should print out the numbers as 0.350 and 7/20. You can of course choose whatever number of decimal places you want.
Completely separate from the floating point errors, the calculation isn't actually getting the probability right. The formula you're using may be a good enough one for doing in your head while playing a game, but it's not completely accurate. If you're using a computer to crunch the numbers for you, you might as well get it right.
The probability of drawing at least one of N specific cards from a deck of size M after D draws is:
1 - (comb(M-N, D) / comb(M, D))
Where comb is the binary coefficient or "combination" function (often spoken as "N choose R" and written "nCr" in mathematics). Python doesn't have an implementation of that function in the standard library, but there are a lot of add on modules you may already have installed that provide one or you can pretty easily write your own. See this earlier question for more specifics.
For your example parameters, the correct odds are '5397/17110' or 0.315.
Can someone explain why the following three examples are not all equal?
ipdb> Decimal(71.60) == Decimal(71.60)
True
ipdb> Decimal('71.60') == Decimal('71.60')
True
ipdb> Decimal(71.60) == Decimal('71.60')
False
Is there a general 'correct' way to create Decimal objects in Python? (ie, as strings or as floats)
Floating point numbers, what are used by default, are in base 2. 71.6 can't be accurately represented in base 2. (Think of numbers like 1/3 in base 10).
Because of this, they will be converted to be as many decimal places as the floating point can represent. Because the number 71.6 in base 2 would go on forever and you almost certainly don't have infinate memory to play with, the computer decides to represent it (well, is told to) in a fewer number of bits.
If you were to use a string instead, the program can use an algorithm to convert it exactly instead of starting from the dodgy rounded floating point number.
>>> decimal.Decimal(71.6)
Decimal('71.599999999999994315658113919198513031005859375')
Compared to
>>> decimal.Decimal("71.6")
Decimal('71.6')
However, if your number is representable exactly as a float, it is just as accurate as a string
>>> decimal.Decimal(71.5)
Decimal('71.5')
Normally Decimal is used to avoid the floating point precision problem. For example, the float literal 71.60 isn't mathematically 71.60, but a number very close to it.
As a result, using float to initialize Decimal won't avoid the problem. In general, you should use strings to initialize Decimal.
In Python, are either
n**0.5 # or
math.sqrt(n)
recognized when a number is a perfect square? Specifically, should I worry that when I use
int(n**0.5) # instead of
int(n**0.5 + 0.000000001)
I might accidentally end up with the number one less than the actual square root due to precision error?
As several answers have suggested integer arithmetic, I'll recommend the gmpy2 library. It provides functions for checking if a number is a perfect power, calculating integer square roots, and integer square root with remainder.
>>> import gmpy2
>>> gmpy2.is_power(9)
True
>>> gmpy2.is_power(10)
False
>>> gmpy2.isqrt(10)
mpz(3)
>>> gmpy2.isqrt_rem(10)
(mpz(3), mpz(1))
Disclaimer: I maintain gmpy2.
Yes, you should worry:
In [11]: int((100000000000000000000000000000000000**2) ** 0.5)
Out[11]: 99999999999999996863366107917975552L
In [12]: int(math.sqrt(100000000000000000000000000000000000**2))
Out[12]: 99999999999999996863366107917975552L
obviously adding the 0.000000001 doesn't help here either...
As #DSM points out, you can use the decimal library:
In [21]: from decimal import Decimal
In [22]: x = Decimal('100000000000000000000000000000000000')
In [23]: (x ** 2).sqrt() == x
Out[23]: True
for numbers over 10**999999999, provided you keep a check on the precision (configurable), it'll throw an error rather than an incorrect answer...
Both **0.5 and math.sqrt() perform the calculation using floating point arithmetic. The input is converted to float before the square root is calculated.
Do these calculations recognize when the input value is a perfect square?
No they do not. Floating arithmetic has no concept of perfect squares.
large integers may not be representable, for values where the number has more significant digits than available in the floating point mantissa. It's easy to see therefore that for non-representable input values, n**0.5 may be innaccurate. And you proposed fix by adding a small value will not in general fix the problem.
If your input is an integer then you should consider performing your calculation using integer arithmetic. That ultimately is the right way to deal with this.
You can use the round(number, significant_figures) before converting to an int, I cannot recall if python truncs or rounds when doing a float-to-integer conversion.
In any case, since python uses floating point arithmetic, all the pitfalls apply. See:
http://docs.python.org/2/tutorial/floatingpoint.html
Perfect-square values will have no fractional components, so your main worry would be very large values, and for such values a difference of 1 or 2 being significant means you're going to want a specific numerical library that supports such high precision (as DSM mentions, the Decimal library, standard since Python 2.4, should be able to do what you want as it supports arbitrary precision.
http://docs.python.org/library/decimal.html
sqrt is one of the easier math library functions to implement, and any math library of reasonable quality will implement it with faithful rounding (sub-ULP accuracy). If the input is a perfect square, its square root is representable (in a reasonable floating-point format). In this case, faithful rounding guarantees the result is exact.
This addresses only the value actually passed to sqrt. Whether a number can be converted without error from another format to the floating-point input for sqrt is a separate issue.