What is a float, have I got the right idea?

What is a float, have I got the right idea? - python

I was programming a calculator and I came across float. I was told to use floats then my calculator would be able to calculate calculations including numbers like 16.4 and 4.5 and not just whole numbers.
Now, I sort of got thinking and wondered if someone could just verify I'm on the right tracks.
I understand an int is a solid number, a double is basically 2 numbers with a dot in the middle, a decimal? And now comes the tricky one, the float data type.
I just need somebody to verify I'm on the right track. I think a float is a data type that can either be a whole number (this is what I'm not sure about) and a double/decimal.
I feel like float is the safe data type kinda thing, where you're not sure and want to accept whole and decimal numbers, am I right? Is it for accepting both whole and decimals?
For anyone that struggles to understand, heres the code to my calculator, it might help.
while True:
calculation = input("Calculation: ")
if (calculation.__contains__("+")):
print(float(calculation[0] ) + float(calculation[2]))
elif (calculation.__contains__("-")):
print(float(calculation[0] ) - float(calculation[2]))
elif (calculation.__contains__("*")):
print(float(calculation[0] ) * float(calculation[2]))
elif (calculation.__contains__("/")):
print(float(calculation[0] ) / float(calculation[2]))

A float is a floating-point number (more or less your "basically 2 numbers with a dot in the middle").
The term double is short for "double precision floating-point number": a similar kind of number but typically using more bits to store it, allowing for more precision.
In Python, the type float is used to refer to all floating-point numbers, regardless of precision.

You mention Python's float and also double. These are exactly the same thing, because what Python calls float (pedantically, in most implementations of Python) is what everyone else calls double. And what C and C++ call float does not exist in Python.

Related

What is a floating number and its purpose in terms of theory

I am just learning python and making a calculator and the tutorial says float instead of int. what is a floating number
Why not just use int

Floats, contrary to int or integer data types, can store numerical data with much more precision. Integers lose information regarding to the mantissa or decimal portion of the number. So in applications where precision is required, we use float data type instead of int.

welcome to Stackoverflow!
This is a question on which you can dive deep, and I would absolutely google this. For example, the Wikipedia page on this will already give you a bunch of information.
Now, in order not to overwhelm you with a too much information, we can answer your question without too many details:
A float is a type with which you can represent decimal numbers (numbers with a comma in there, for example 0.5). An int is a type with which you can only represent integers (1, 2, 50, ...).
Don't hesitate to google around for more info!

A float is a number with numbers after decimal points, for example 12.56 . It can also be called a real.
An integer is a whole number eg 12.
If you use an integer in a calculator, then you wouldn't be able to calculate with anything other than whole numbers.
for example using int is fine if you do 12 + 12. However, if you enter 12.56 + 12.56, then you will get an error, as the int function assumes that the user enters a whole number

Need help understanding binary conversion in Python

Or I guess binary in general. I'm obviously quite new to coding, so I'll appreciate any help here.
I just started learning about converting numbers into binary, specifically two's complement. The course presented the following code for converting:
num = 19
if num < 0:
isNeg = True
num = abs(num)
else:
isNeg = False
result = ''
if num == 0:
result = '0'
while num > 0:
result = str(num % 2) + result
num = num // 2
if isNeg:
result = '-' + result
This raised a couple of questions with me and after doing some research (mostly here on Stack Overflow), I found myself more confused than I was before. Hoping somebody can break things down a bit more for me. Here are some of those questions:
I thought it was outright wrong that the code suggested just appending a - to the front of a binary number to show its negative counterpart. It looks like bin() does the same thing, but don't you have to flip the bits and add a 1 or something? Is there a reason for this other than making it easy to comprehend/read?
Was reading here and one of the answers in particular said that Python doesn't really work in two's complement, but something else that mimics it. The disconnect here for me is that Python shows me one thing but is storing the numbers a different way. Again, is this just for ease of use? Is bin() using two's complement or Python's method?
Follow-up to that one, how does the 'sign-magnitude' format mentioned in the above answer differ from two's complement?
The Professor doesn't talk at all about 8-bit, 16-bit, 64-bit, etc., which I saw a lot of while reading up on this. Where does this distinction come from, and does Python use one? Or are those designations specific to the program that I might be coding?
A lot of these posts I've only reference how Python stores integers. Is that suggesting that it stores floats a different way, or are they just speaking broadly?
As I wrote this all up, I sort of realized that maybe I'm diving into the deep end before learning how to swim, but I'm curious like that and like to have a deeper understanding of stuff before moving on.

I thought it was outright wrong that the code suggested just appending a - to the front of a binary number to show its negative counterpart. It looks like bin() does the same thing, but don't you have to flip the bits and add a 1 or something? Is there a reason for this other than making it easy to comprehend/read?
You have to somehow designate the number being negative. You can add another symbol (-), add a sign bit at the very beginning, use ones'-complement, use two's-complement, or some other completely made-up scheme that works. Both the ones'- and two's-complement representation of a number require a fixed number of bits, which doesn't exist for Python integers:
>>> 2**1000
1071508607186267320948425049060001810561404811705533607443750
3883703510511249361224931983788156958581275946729175531468251
8714528569231404359845775746985748039345677748242309854210746
0506237114187795418215304647498358194126739876755916554394607
7062914571196477686542167660429831652624386837205668069376
The natural solution is to just prepend a minus sign. You can similarly write your own version of bin() that requires you to specify the number of bits and return the two's-complement representation of the number.
Was reading here and one of the answers in particular said that Python doesn't really work in two's complement, but something else that mimics it. The disconnect here for me is that Python shows me one thing but is storing the numbers a different way. Again, is this just for ease of use? Is bin() using two's complement or Python's method?
Python is a high-level language, so you don't really know (or care) how your particular Python runtime interally stores integers. Whether you use CPython, Jython, PyPy, IronPython, or something else, the language specification only defines how they should behave, not how they should be represented in memory. bin() just takes a number and prints it out using binary digits, the same way you'd convert 123 into base-2.
Follow-up to that one, how does the 'sign-magnitude' format mentioned in the above answer differ from two's complement?
Sign-magnitude usually encodes a number n as 0bXYYYYYY..., where X is the sign bit and YY... are the binary digits of the non-negative magnitude. Arithmetic with numbers encoded as two's-complement is more elegant due to the representation, while sign-magnitude encoding requires special handling for operations on numbers of opposite signs.
The Professor doesn't talk at all about 8-bit, 16-bit, 64-bit, etc., which I saw a lot of while reading up on this. Where does this distinction come from, and does Python use one? Or are those designations specific to the program that I might be coding?
No, Python doesn't define a maximum size for its integers because it's not that low-level. 2**1000000 computes fine, as will 2**10000000 if you have enough memory. n-bit numbers arise when your hardware makes it more beneficial to make your numbers a certain size. For example, processors have instructions that quickly work with 32-bit numbers but not with 87-bit numbers.
A lot of these posts I've only reference how Python stores integers. Is that suggesting that it stores floats a different way, or are they just speaking broadly?
It depends on what your Python runtime uses. Usually floating point numbers are like C doubles, but that's not required.

don't you have to flip the bits and add a 1 or something?
Yes, for two complement notation you invert all bits and add one to get the negative counterpart.
Is bin() using two's complement or Python's method?
Two's complement is a practical way to represent negative number in electronics that can have only 0 and 1. Internally the microprocessor uses two's complement for negative numbers and all modern microprocessors do. For more info, see your textbook on computer architecture.
how does the 'sign-magnitude' format mentioned in the above answer
differ from two's complement?
You should look what this code does and why it is there:
while num > 0:
result = str(num % 2) + result
num = num // 2

Arithmetic error: Python is incorrectly dividing variables

I'm getting something that doesn't seem to be making a lot of sense. I was practicing my coding by making a little program that would give me the probability of getting certain cards within a certain timeframe of a card game. In order to calculate the chances, I would need to create a method that would perform division, and report the chances as a fraction and as a decimal. so I designed this:
from fractions import Fraction
def time_odds(card_count,turns,deck_size=60):
chance_of_occurence = float(card_count)/float(deck_size)
opening_hand_odds = 7*chance_of_occurence
turn_odds = (7 + turns)*chance_of_occurence
print ("Chance of it being in the opening hand: %s or %s" %(opening_hand_odds,Fraction(opening_hand_odds)))
print ("Chance of it being acquired by turn %s : %s or %s" %(turns,turn_odds,Fraction(turn_odds) ))
and then I used it like so:
time_odds(3,5)
but for whatever reason I got this as the answer:
"Chance of it being in the opening hand: 0.35000000000000003 or
6305039478318695/18014398509481984"
"Chance of it being acquired by turn 5 : 0.6000000000000001 or
1351079888211149/2251799813685248"
so it's like, almost right, except the decimal is just slightly off, giving like a 0.0000000000003 difference or a 0.000000000000000000001 difference.
Python doesn't do this when I just make it do division like this:
print (7*3/60)
This gives me just 0.35, which is correct. The only difference that I can observe, is that I get the slightly incorrect values when I am dividing with variables rather than just numbers.
I've looked around a little for an answer, and most incorrect division problems have to do with integer division (or I think it can be called floor division) , but I didn't manage to find anything addressing this.
I've had a similar problem with python doing this when I was dividing really big numbers. What's going on?
Why is this so? what can I do to correct it?

The extra digits you're seeing are floating point precision errors. As you do more and more operations with floating point numbers, the errors have a chance of compounding.
The reason you don't see them when you try to replicate the computation by hand is that your replication performs the operations in a different order. If you compute 7 * 3 / 60, the mutiplication happens first (with no error), and the division introduces a small enough error that Python's float type hides it for you in its repr (because 0.35 unambiguously refers to the same float value as the computation). If you do 7 * (3 / 60), the division happens first (introducing error) and then the multiplication increases the size of the error to the point that it can't be hidden (because 0.35000000000000003 is a different float value than 0.35).
To avoid printing out the the extra digits that are probably error, you may want to explicitly specify a precision to use when turning your numbers into strings. For instance, rather than using the %s format code (which calls str on the value), you could use %.3f, which will round off your number after three decimal places.
There's another issue with your Fractions. You're creating the Fraction directly from the floating point value, which already has the error calculated in. That's why you're seeing the fraction print out with a very large numerator and denominator (it's exactly representing the same number as the inaccurate float). If you instead pass integer numerator and denominator values to the Fraction constructor, it will take care of simplifying the fraction for you without any floating point inaccuracy:
print("Chance of it being in the opening hand: %.3f or %s"
% (opening_hand_odds, Fraction(7*card_count, deck_size)))
This should print out the numbers as 0.350 and 7/20. You can of course choose whatever number of decimal places you want.
Completely separate from the floating point errors, the calculation isn't actually getting the probability right. The formula you're using may be a good enough one for doing in your head while playing a game, but it's not completely accurate. If you're using a computer to crunch the numbers for you, you might as well get it right.
The probability of drawing at least one of N specific cards from a deck of size M after D draws is:
1 - (comb(M-N, D) / comb(M, D))
Where comb is the binary coefficient or "combination" function (often spoken as "N choose R" and written "nCr" in mathematics). Python doesn't have an implementation of that function in the standard library, but there are a lot of add on modules you may already have installed that provide one or you can pretty easily write your own. See this earlier question for more specifics.
For your example parameters, the correct odds are '5397/17110' or 0.315.

Why do Python's math.ceil() and math.floor() operations return floats instead of integers?

Can someone explain this (straight from the docs- emphasis mine):
math.ceil(x) Return the ceiling of x as a float, the smallest integer value greater than or equal to x.
math.floor(x) Return the floor of x as a float, the largest integer value less than or equal to x.
Why would .ceil and .floor return floats when they are by definition supposed to calculate integers?
EDIT:
Well this got some very good arguments as to why they should return floats, and I was just getting used to the idea, when #jcollado pointed out that they in fact do return ints in Python 3...

As pointed out by other answers, in python they return floats probably because of historical reasons to prevent overflow problems. However, they return integers in python 3.
>>> import math
>>> type(math.floor(3.1))
<class 'int'>
>>> type(math.ceil(3.1))
<class 'int'>
You can find more information in PEP 3141.

The range of floating point numbers usually exceeds the range of integers. By returning a floating point value, the functions can return a sensible value for input values that lie outside the representable range of integers.
Consider: If floor() returned an integer, what should floor(1.0e30) return?
Now, while Python's integers are now arbitrary precision, it wasn't always this way. The standard library functions are thin wrappers around the equivalent C library functions.

Because python's math library is a thin wrapper around the C math library which returns floats.

The source of your confusion is evident in your comment:
The whole point of ceil/floor operations is to convert floats to integers!
The point of the ceil and floor operations is to round floating-point data to integral values. Not to do a type conversion. Users who need to get integer values can do an explicit conversion following the operation.
Note that it would not be possible to implement a round to integral value as trivially if all you had available were a ceil or float operation that returned an integer. You would need to first check that the input is within the representable integer range, then call the function; you would need to handle NaN and infinities in a separate code path.
Additionally, you must have versions of ceil and floor which return floating-point numbers if you want to conform to IEEE 754.

Before Python 2.4, an integer couldn't hold the full range of truncated real numbers.
http://docs.python.org/whatsnew/2.4.html#pep-237-unifying-long-integers-and-integers

Because the range for floats is greater than that of integers -- returning an integer could overflow

This is a very interesting question! As a float requires some bits to store the exponent (=bits_for_exponent) any floating point number greater than 2**(float_size - bits_for_exponent) will always be an integral value! At the other extreme a float with a negative exponent will give one of 1, 0 or -1. This makes the discussion of integer range versus float range moot because these functions will simply return the original number whenever the number is outside the range of the integer type. The python functions are wrappers of the C function and so this is really a deficiency of the C functions where they should have returned an integer and forced the programer to do the range/NaN/Inf check before calling ceil/floor.
Thus the logical answer is the only time these functions are useful they would return a value within integer range and so the fact they return a float is a mistake and you are very smart for realizing this!

Maybe because other languages do this as well, so it is generally-accepted behavior. (For good reasons, as shown in the other answers)

This totally caught me off guard recently. This is because I've programmed in C since the 1970's and I'm only now learning the fine details of Python. Like this curious behavior of math.floor().
The math library of Python is how you access the C standard math library. And the C standard math library is a collection of floating point numerical functions, like sin(), and cos(), sqrt(). The floor() function in the context of numerical calculations has ALWAYS returned a float. For 50 YEARS now. It's part of the standards for numerical computation. For those of us familiar with the math library of C, we don't understand it to be just "math functions". We understand it to be a collection of floating-point algorithms. It would be better named something like NFPAL - Numerical Floating Point Algorithms Libary. :)
Those of us that understand the history instantly see the python math module as just a wrapper for the long-established C floating-point library. So we expect without a second thought, that math.floor() is the same function as the C standard library floor() which takes a float argument and returns a float value.
The use of floor() as a numerical math concept goes back to 1798 per the Wikipedia page on the subject: https://en.wikipedia.org/wiki/Floor_and_ceiling_functions#Notation
It never has been a computer science covert floating-point to integer storage format function even though logically it's a similar concept.
The floor() function in this context has always been a floating-point numerical calculation as all(most) the functions in the math library. Floating-point goes beyond what integers can do. They include the special values of +inf, -inf, and Nan (not a number) which are all well defined as to how they propagate through floating-point numerical calculations. Floor() has always CORRECTLY preserved values like Nan and +inf and -inf in numerical calculations. If Floor returns an int, it totally breaks the entire concept of what the numerical floor() function was meant to do. math.floor(float("nan")) must return "nan" if it is to be a true floating-point numerical floor() function.
When I recently saw a Python education video telling us to use:
i = math.floor(12.34/3)
to get an integer I laughed to myself at how clueless the instructor was. But before writing a snarkish comment, I did some testing and to my shock, I found the numerical algorithms library in Python was returning an int. And even stranger, what I thought was the obvious answer to getting an int from a divide, was to use:
i = 12.34 // 3
Why not use the built-in integer divide to get the integer you are looking for! From my C background, it was the obvious right answer. But low and behold, integer divide in Python returns a FLOAT in this case! Wow! What a strange upside-down world Python can be.
A better answer in Python is that if you really NEED an int type, you should just be explicit and ask for int in python:
i = int(12.34/3)
Keeping in mind however that floor() rounds towards negative infinity and int() rounds towards zero so they give different answers for negative numbers. So if negative values are possible, you must use the function that gives the results you need for your application.
Python however is a different beast for good reasons. It's trying to address a different problem set than C. The static typing of Python is great for fast prototyping and development, but it can create some very complex and hard to find bugs when code that was tested with one type of objects, like floats, fails in subtle and hard to find ways when passed an int argument. And because of this, a lot of interesting choices were made for Python that put the need to minimize surprise errors above other historic norms.
Changing the divide to always return a float (or some form of non int) was a move in the right direction for this. And in this same light, it's logical to make // be a floor(a/b) function, and not an "int divide".
Making float divide by zero a fatal error instead of returning float("inf") is likewise wise because, in MOST python code, a divide by zero is not a numerical calculation but a programming bug where the math is wrong or there is an off by one error. It's more important for average Python code to catch that bug when it happens, instead of propagating a hidden error in the form of an "inf" which causes a blow-up miles away from the actual bug.
And as long as the rest of the language is doing a good job of casting ints to floats when needed, such as in divide, or math.sqrt(), it's logical to have math.floor() return an int, because if it is needed as a float later, it will be converted correctly back to a float. And if the programmer needed an int, well then the function gave them what they needed. math.floor(a/b) and a//b should act the same way, but the fact that they don't I guess is just a matter of history not yet adjusted for consistency. And maybe too hard to "fix" due to backward compatibility issues. And maybe not that important???
In Python, if you want to write hard-core numerical algorithms, the correct answer is to use NumPy and SciPy, not the built-in Python math module.
import numpy as np
nan = np.float64(0.0) / 0.0 # gives a warning and returns float64 nan
nan = np.floor(nan) # returns float64 nan
Python is different, for good reasons, and it takes a bit of time to understand it. And we can see in this case, the OP, who didn't understand the history of the numerical floor() function, needed and expected it to return an int from their thinking about mathematical integers and reals. Now Python is doing what our mathematical (vs computer science) training implies. Which makes it more likely to do what a beginner expects it to do while still covering all the more complex needs of advanced numerical algorithms with NumPy and SciPy. I'm constantly impressed with how Python has evolved, even if at times I'm totally caught off guard.

floats inside tuples changing values when accessed

So I have a list of tuples of two floats each. Each tuple represents a range. I am going through another list of floats which represent values to be fit into the ranges. All of these floats are < 1 but positive, so precision matter. One of my tests to determine if a value fits into a range is failing when it should pass. If I print the value and the range that is causing problems I can tell this much:
curValue = 0.00145000000671
range = (0.0014500000067055225, 0.0020968749796738849)
The conditional that is failing is:
if curValue > range[0] and ... blah :
# do some stuff
From the values given by curValue and range, the test should clearly pass (don't worry about what is in the conditional). Now, if I print explicitly what the value of range[0] is I get:
range[0] = 0.00145000000671
Which would explain why the test is failing. So my question then, is why is the float changing when it is accessed. It has decimal values available up to a certain precision when part of a tuple, and a different precision when accessed. Why would this be? What can I do to ensure my data maintains a consistent amount of precision across my calculations?

The float doesn't change. The built-in numberic types are all immutable. The cause for what you're observing is that:
print range[0] uses str on the float, which (up until very recent versions of Python) printed less digits of a float.
Printing a tuple (be it with repr or str) uses repr on the individual items, which gives a much more accurate representation (again, this isn't true anymore in recent releases which use a better algorithm for both).
As for why the condition doesn't work out the way you expect, it's propably the usual culprit, the limited precision of floats. Try print repr(curVal), repr(range[0]) to see if what Python decided was the closest representation of your float literal possible.

In modern day PC's floats aren't that precise. So even if you enter pi as a constant to 100 decimals, it's only getting a few of them accurate. The same is happening to you. This is because in 32-bit floats you only get 24 bits of mantissa, which limits your precision (and in unexpected ways because it's in base2).
Please note, 0.00145000000671 isn't the exact value as stored by Python. Python only diplays a few decimals of the complete stored float if you use print. If you want to see exactly how python stores the float use repr.
If you want better precision use the decimal module.

It isn't changing per se. Python is doing its best to store the data as a float, but that number is too precise for float, so Python modifies it before it is even accessed (in the very process of storing it). Funny how something so small is such a big pain.
You need to use a arbitrary fixed point module like Simple Python Fixed Point or the decimal module.

Not sure it would work in this case, because I don't know if Python's limiting in the output or in the storage itself, but you could try doing:
if curValue - range[0] > 0 and...

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.