Counterintuitive behaviour of int() in python

It's clearly stated in the docs that int(number) is a truncating type conversion (it rounds toward zero):
int(1.23)
1
and int(string) returns an int if and only if the string is an integer literal.
int('1.23')
ValueError
int('1')
1
Is there any special reason for that? I find it counterintuitive that the function truncates in one case, but not the other.

There is no special reason. Python is simply applying its general principle of not performing implicit conversions, which are well-known causes of problems, particularly for newcomers, in languages such as Perl and Javascript.
int(some_string) is an explicit request to convert a string to integer format; the rules for this conversion specify that the string must contain a valid integer literal representation. int(float) is an explicit request to convert a float to an integer; the rules for this conversion specify that the float's fractional portion will be truncated.
In order for int("3.1459") to return 3 the interpreter would have to implicitly convert the string to a float. Since Python doesn't support implicit conversions, it chooses to raise an exception instead.

This is almost certainly a case of applying three of the principles from the Zen of Python:
Explicit is better than implicit.
[...] practicality beats purity
Errors should never pass silently
Some percentage of the time, someone doing int('1.23') is calling the wrong conversion for their use case, and wants something like float or decimal.Decimal instead. In these cases, it's clearly better for them to get an immediate error that they can fix, rather than silently giving the wrong value.
In the case that you do want to truncate that to an int, it is trivial to explicitly do so by passing it through float first, and then calling one of int, round, trunc, floor or ceil as appropriate. This also makes your code more self-documenting, guarding against a later modification "correcting" a hypothetical silently-truncating int call to float by making it clear that the rounded value is what you want.
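A minimal sketch of that explicit two-step conversion follows. The helper name parse_truncated is hypothetical and chosen only for illustration; it uses nothing beyond the built-in float and math.trunc.

```python
import math


def parse_truncated(text):
    """Parse a decimal string, then explicitly drop the fractional part."""
    # Step 1: string -> float (raises ValueError for non-numeric text).
    number = float(text)
    # Step 2: float -> int, rounding toward zero.
    return math.trunc(number)


# int() itself still refuses a string that is not an integer literal.
try:
    int("1.23")
except ValueError as exc:
    print("int('1.23') raised:", exc)

print(parse_truncated("1.23"))   # 1
print(parse_truncated("-1.99"))  # -1
```

Swapping math.trunc for math.floor, math.ceil or round changes only the second step, which is what makes the chosen rounding rule visible in the code.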

Sometimes a thought experiment can be useful.
Behavior A: int('1.23') fails with an error. This is the existing behavior.
Behavior B: int('1.23') produces 1 without error. This is what you're proposing.
With behavior A, it's straightforward and trivial to get the effect of behavior B: use int(float('1.23')) instead.
On the other hand, with behavior B, getting the effect of behavior A is significantly more complicated:
def parse_pure_int(s):
    if "." in s:
        raise ValueError("invalid literal for integer with base 10: " + s)
    return int(s)
(and even with the code above, I don't have complete confidence that there isn't some corner case that it mishandles.)
Behavior A therefore is more expressive than behavior B.
Another thing to consider: '1.23' is a string representation of a floating-point value. Converting '1.23' to an integer conceptually involves two conversions (string to float to integer), but int(1.23) and int('1') each involve only one conversion.
Edit:
And indeed, there are corner cases that the above code would not handle: 1e-2 and 1E-2 are both floating point values too.
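To see how the existing behavior (behavior A) treats such corner cases, here is a small sketch. The helper accepts_as_int is a made-up name; it only reports whether the built-in int() accepts a given string.

```python
def accepts_as_int(text):
    """Return True if int() parses the string, False if it raises ValueError."""
    try:
        int(text)
    except ValueError:
        return False
    return True


# Integer literals (surrounding whitespace and a sign are allowed) pass;
# anything that only float() understands is rejected.
for candidate in ["1", " 42 ", "-7", "1.23", "1e-2", "1E-2", "inf"]:
    print(repr(candidate), accepts_as_int(candidate))
```

The first three strings are accepted and the last four are rejected, including the exponent forms that contain no "." at all.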

In simple words - they're not the same function.
int( decimal ) behaves as 'truncate, i.e. knock off the decimal portion and return as int'
int( string ) behaves as 'this text describes an integer, convert it and return as int'.
They are two different functions with the same name; both return an integer, but they are different functions.
'int' is short and easy to remember and its meaning applied to each type is intuitive to most programmers which is why they chose it.
There's no implication they are providing the same or combined functionality, they simply have the same name and return the same type. They could as easily be called 'floorDecimalAsInt' and 'convertStringToInt', but they went for 'int' because it's easy to remember, (99%) intuitive and confusion would rarely occur.
Parsing text as an integer when the text includes a decimal point, such as "4.5", would throw an error in the majority of programming languages, and the majority of programmers would expect it to: the text does not represent an integer, which implies the caller is providing erroneous data.

Related

Cython returns 0 for expression that should evaluate to 0.5?

For some reason, Cython is returning 0 on a math expression that should evaluate to 0.5:
print(2 ** (-1)) # prints 0
Oddly enough, mix variables in and it'll work as expected:
i = 1
print(2 ** (-i)) # prints 0.5
Vanilla CPython returns 0.5 for both cases. I'm compiling for 37m-x86_64-linux-gnu, and language_level is set to 3.
What is this witchcraft?

It's because it's using C ints rather than Python integers so it matches C behaviour rather than Python behaviour. I'm relatively sure this used to be documented as a limitation somewhere but I can't find it now. If you want to report it as a bug then go to https://github.com/cython/cython/issues, but I suspect this is a deliberate trade-off of speed for compatibility.
The code gets translated to
__Pyx_pow_long(2, -1L)
where __Pyx_pow_long is a function of type static CYTHON_INLINE long __Pyx_pow_long(long b, long e).
The easiest way to fix it is to change one/both of the numbers to be a floating point number
print(2. ** (-1))
As a general comment on the design choice: people from the C world generally expect int operator int to return an int, and this option will be fastest. Python had tried to do this in the past with the Python 2 division behaviour (but inconsistently - power always returned a floating point number).
Cython generally tries to follow Python behaviour. However, a lot of people are using it for speed so they also try to fall back to quick, C-like operations especially when people specify types (since those people want speed). I think what's happened here is that it's been able to infer the types automatically, and so defaulted to C behaviour. I suspect ideally it should distinguish between specified types and types that it's inferred. However, it's also probably too late to start changing that.
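For comparison, this is what plain CPython does with the same kind of expression. The sketch only illustrates the Python semantics that the answer contrasts with C; it does not reproduce Cython's generated code, and the variable names are arbitrary.

```python
# Integer power with a non-negative exponent stays an int.
int_power = 2 ** 3
print(int_power, type(int_power).__name__)            # 8 int

# With a negative exponent, CPython returns a float instead.
negative_power = 2 ** -1
print(negative_power, type(negative_power).__name__)  # 0.5 float

# Making one operand a float forces floating-point arithmetic everywhere.
float_power = 2.0 ** -1
print(float_power)                                    # 0.5
```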

It looks like Cython is incorrectly inferring the final data type as int rather than float when only numbers are involved
The following code works as expected:
print(2.0 ** (-1))
See this link for a related discussion: https://groups.google.com/forum/#!topic/cython-users/goVpote2ScY

What's the rationale behind 2.5 // 2.0 returning a float rather than an int in Python 3.x?

What's the rationale behind 2.5 // 2.0 returning a float rather than an int in Python 3.x?
If it's an integral value, why not put it in a type int object?
[edit]
I am looking for a justification of the fact that this is so. What were the arguments in making it this way. Haven't been able to find them yet.
[edit2]
The relation with floor is more problematic than the term "floor division" suggests!
floor(3.5 / 5.5) == 0 (int)
whereas
3.5 // 5.5 == 0.0 (float)
Can not yet discern any logic here :(
[edit3]
From PEP238:
In a unified model, the integer 1 should be indistinguishable from the floating point number 1.0 (except for its inexactness), and both should behave the same in all numeric contexts.
All very nice, but a not unimportant library like NumPy complains when given floats as indices, even if they're integral. So 'indistinguishable' is not reality yet. I spent some time hunting a bug in connection with this. I was very surprised to learn about the true nature of //. And it wasn't that obvious from the docs (for me).
Since I've quite some trust in the design of Python 3.x, I thought I must have missed a very obvious reason to define // in this way. But now I wonder...

The // operator is covered in PEP 238. First of all, note that it's not "integer division" but "floor division", i.e. it is never claimed that the result would be an integer.
From the section on Semantics of Floor Division:
Floor division will be implemented in all the Python numeric types, and will have the semantics of
a // b == floor(a/b)
except that the result type will be the common type into which a and b are coerced before the operation.
And later:
For floating point inputs, the result is a float. For example:
3.5//2.0 == 1.0
The rationale behind this decision is not explicitly stated (or I could not find it). However, the way it is implemented, it is consistent with the other mathematical operations (emphasis mine):
Specifically, if a and b are of the same type, a//b will be of that type too. If the inputs are of different types, they are first coerced to a common type using the same rules used for all other arithmetic operators.
Also, if the results would be automatically converted to int, that could yield weird and surprising results for very large floating point numbers that are beyond integer precision:
>>> 1e30 // 2.
5e+29
>>> int(1e30 // 2.)
500000000000000009942312419328
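The type rule quoted from PEP 238 can be checked directly. This is a small sketch with arbitrary example values; the list items and the name index are only for illustration.

```python
import math

# Same-type operands: the result keeps that type.
print(7 // 2, type(7 // 2).__name__)            # 3 int
print(7.0 // 2.0, type(7.0 // 2.0).__name__)    # 3.0 float

# Mixed operands are coerced to float first, so the result is a float.
print(7 // 2.0)                                 # 3.0

# math.floor() is a different operation: on a float it returns an int.
print(math.floor(3.5 / 5.5), 3.5 // 5.5)        # 0 0.0

# If an int is needed (for example as a sequence index), convert explicitly.
items = ["a", "b", "c"]
index = int(5.0 // 2.0)
print(items[index])                             # c
```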

What is an (int) prefix on floating point arithmetic actually doing?

In some sample code I see this syntax being used:
float1 = 7.0
float2 = 2.0
result = (int)(float1/float2)
The point seems to be to force the result to an integer, but I can't find any place that documents the (int) syntax being used, or why it would be preferable to int(float1/float2). A call to int() itself is supposed to return zero, but (0)(float1/float2) throws a TypeError and complains about zero not being a callable. It's obvious the interpreter is trying to execute the int() reference, but it's not clear to me why it would expect to find a callable there.
Can someone point me to some documentation on this syntax?

Whoever wrote that code had too much exposure to languages that are more closely related to C than Python is. In C, C++, Java, C# and others, (int)something is the syntax to cast something to (int). In Python, it's just a strange way to spell int(something). int is the builtin function which converts something to an int.
In general, the Python expression (<expr>) for some other expression <expr> just evaluates to the same thing as <expr>. The parentheses can only affect precedence, but in this case they don't. Likewise, <expr>(...) evaluates <expr> and then calls the result -- note that <expr> can again be any expression, it's not limited to simple function names.
(x)(y) is the same as x(y). (int)(y) works because int(y) works. (0)(y) doesn't because 0(y) doesn't work.
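A short sketch makes this concrete. The variable zero stands in for the literal 0 so that the failure happens at call time; the other names are arbitrary.

```python
# Parentheses around a bare name do not change what it refers to.
same_object = (int) is int
print(same_object)            # True

result = (int)(7.0 / 2.0)     # identical to int(7.0 / 2.0)
print(result)                 # 3

# The same call syntax on a non-callable object fails when it is called.
zero = 0
try:
    (zero)(7.0 / 2.0)
except TypeError as exc:
    print("TypeError:", exc)  # the message says the 'int' object is not callable
```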

In this code: result = (int)(float1/float2)
(int) is an expression that evaluates to the int function. That function is then called with a single argument, the expression float1/float2.
It doesn't work for (0) because the result of that expression is a number, which is not a callable type, exactly as the TypeError states.

(int) evaluates to int, so
result = (int)(float1/float2)
is the same as
result = int(float1/float2)
Placing unnecessary parentheses around bare variable names is not recommended.

How do I ONLY round a number/float down in Python?

I will have this random number generated e.g 12.75 or 1.999999999 or 2.65
I want to always round this number down to the nearest integer whole number so 2.65 would be rounded to 2.
Sorry for asking but I couldn't find the answer after numerous searches, thanks :)

You can use either int(), math.trunc(), or math.floor(). They all will do what you want for positive numbers:
>>> import math
>>> math.floor(12.6) # returns 12.0 in Python 2
12
>>> int(12.6)
12
>>> math.trunc(12.6)
12
However, note that they behave differently with negative numbers: int and math.trunc will go to 0, whereas math.floor always floors downwards:
>>> import math
>>> math.floor(-12.6) # returns -13.0 in Python 2
-13
>>> int(-12.6)
-12
>>> math.trunc(-12.6)
-12
Note that math.floor and math.ceil used to return floats in Python 2.
Also note that int and math.trunc will both (at first glance) appear to do the same thing, though their exact semantics differ. In short: int is for general/type conversion and math.trunc is specifically for numeric types (and will help make your intent more clear).
Use int if you don't really care about the difference, if you want to convert strings, or if you don't want to import a library. Use trunc if you want to be absolutely unambiguous about what you mean or if you want to ensure your code works correctly for non-builtin types.
More info below:
Math.floor() in Python 2 vs Python 3
Note that math.floor (and math.ceil) were changed slightly from Python 2 to Python 3 -- in Python 2, both functions will return a float instead of an int. This was changed in Python 3 so that both functions return an int (more specifically, they delegate to the __floor__ or __ceil__ method of whatever object they were given). So then, if you're using Python 2, or would like your code to maintain compatibility between the two versions, it would generally be safe to do int(math.floor(...)).
For more information about why this change was made + about the potential pitfalls of doing int(math.floor(...)) in Python 2, see
Why do Python's math.ceil() and math.floor() operations return floats instead of integers?
int vs math.trunc()
At first glance, the int() and math.trunc() methods will appear to be identical. The primary differences are:
int(...)
The int function will accept floats, strings, and ints.
Running int(param) will call the param.__int__() method in order to perform the conversion (and then will try calling __trunc__ if __int__ is undefined)
The __int__ magic method was not always unambiguously defined -- for some period of time, it turned out that the exact semantics and rules of how __int__ should work were largely left up to the implementing class.
The int function is meant to be used when you want to convert a general object into an int. It's a type conversion method. For example, you can convert strings to ints by doing int("42") (or do things like change of base: int("AF", 16) -> 175).
math.trunc(...)
The trunc function will only accept numeric types (ints, floats, etc)
Running math.trunc(param) will call the param.__trunc__() method in order to perform the conversion
The exact behavior and semantics of the __trunc__ magic method was precisely defined in PEP 3141 (and more specifically in the Changes to operations and __magic__ methods section).
The math.trunc function is meant to be used when you want to take an existing real number and specifically truncate and remove its decimals to produce an integral type. This means that unlike int, math.trunc is a purely numeric operation.
All that said, it turns out all of Python's built-in types will behave exactly the same whether you use int or trunc. This means that if all you're doing is using regular ints, floats, fractions, and decimals, you're free to use either int or trunc.
However, if you want to be very precise about what exactly your intent is (ie if you want to make it absolutely clear whether you're flooring or truncating), or if you're working with custom numeric types that have different implementations for __int__ and __trunc__, then it would probably be best to use math.trunc.
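As an illustration of that last case, here is a sketch of a custom type whose __int__ and __trunc__ deliberately disagree. The class Reading is hypothetical and exists only to show which method each call reaches.

```python
import math


class Reading:
    """Hypothetical numeric wrapper whose __int__ and __trunc__ disagree."""

    def __init__(self, value):
        self.value = value

    def __int__(self):
        # This class chose to floor when converted with int().
        return math.floor(self.value)

    def __trunc__(self):
        # math.trunc() is defined to round toward zero.
        return math.trunc(self.value)


reading = Reading(-1.5)
print(int(reading))         # -2, via __int__
print(math.trunc(reading))  # -1, via __trunc__
```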
You can also find more information and debate about this topic on Python's developer mailing list.

You can do this easily with a built-in Python operator: use two forward slashes (floor division) and divide by 1. Note that for floats the result is still a float.
>>> print(12.75 // 1)
12.0
>>> print(1.999999999 // 1)
1.0
>>> print(2.65 // 1)
2.0
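One caveat worth checking before relying on this: // floors rather than truncates, so it differs from int() and math.trunc() for negative inputs. A small sketch, with the helper name floor_via_division made up for the example:

```python
import math


def floor_via_division(value):
    """Round down using floor division by 1; a float input gives a float result."""
    return value // 1


for value in (2.65, -2.65):
    floored = floor_via_division(value)
    # Compare the floored float with the truncated int.
    print(value, floored, type(floored).__name__, math.trunc(value))
```

For 2.65 both approaches agree on 2, while for -2.65 floor division gives -3.0 and math.trunc gives -2.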

There is no need to import any module such as math.
Python drops the fractional part when you simply convert the value to an integer with int():
>>>x=2.65
>>>int(x)
2

I'm not sure whether you want math.floor, math.trunc, or int, but... it's almost certainly one of those functions, and you can probably read the docs and decide more easily than you can explain enough for us to decide for you.

Obviously, Michael0x2a's answer is what you should do. But, you can always get a bit creative.
int(str(12.75).split('.')[0])
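The creative version has pitfalls, though, because str() switches to exponent notation for very large and very small floats. A sketch of where it goes wrong, with split_trick as a hypothetical wrapper around the one-liner above:

```python
import math


def split_trick(value):
    """The string-splitting approach from the answer above."""
    return int(str(value).split('.')[0])


print(split_trick(12.75))     # 12
print(split_trick(1.5e-07))   # 1, although the value is close to 0
print(math.trunc(1.5e-07))    # 0

try:
    split_trick(1e16)         # str(1e16) is '1e+16', which has no '.'
except ValueError as exc:
    print("ValueError:", exc)
```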

If you are only looking for the integer part, I think the best option would be to use the math.trunc() function.
import math
math.trunc(123.456)
You can also use int()
int(123.456)
The difference between these two functions is that int() also handles conversion of numeric strings, whereas trunc() only deals with numeric values.
int('123')
# 123
whereas the trunc() function will throw an exception:
math.trunc('123')
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-62-f9aa08f6d314> in <module>()
----> 1 math.trunc('123')
TypeError: type str doesn't define __trunc__ method
If you know that you are only dealing with numeric data, you should consider using the trunc() function, since it was faster than int() in this measurement:
import timeit
timeit.timeit("math.trunc(123.456)", setup="import math", number=10_000)
# 0.0011689490056596696
timeit.timeit("int(123.456)", number=10_000)
# 0.0014109049952821806

subclassing float to force fixed point printing precision in python

[Python 3.1]
I'm following up on this answer:
class prettyfloat(float):
    def __repr__(self):
        return "%0.2f" % self
I know I need to keep track of my float literals (i.e., replace 3.0 with prettyfloat(3.0), etc.), and that's fine.
But whenever I do any calculations, prettyfloat objects get converted into float.
What's the easiest way to fix it?
EDIT:
I need exactly two decimal digits; and I need it across the whole code, including where I print a dictionary with float values inside. That makes any formatting functions hard to use.
I can't use Decimal global setting, since I want computations to be at full precision (just printing at 2 decimal points).
@Glenn Maynard: I agree I shouldn't override __repr__; if anything, it would be just __str__. But it's a moot point because of the following point.
@Glenn Maynard and @singularity: I won't subclass float, since I agree it will look very ugly in the end.
I will stop trying to be clever, and just call a function everywhere a float is being printed. Though I am really sad that I can't override __str__ in the builtin class float.
Thank you!

I had a look at the answer you followed up on, and I think you're confusing data and its representation.
@Robert Rossney suggested subclassing float so you could map() an iterable of standard, non-adulterated floats into prettyfloats for display purposes:
# Perform all our computations using standard floats.
results = compute_huge_numbers(42)
# Switch to prettyfloats for printing (map() is lazy in Python 3, so build a list).
print(list(map(prettyfloat, results)))
In other words, you were not supposed to (and you shouldn't) use prettyfloat as a replacement for float everywhere in your code.
Of course, inheriting from float to solve that problem is overkill, since it's a representation problem and not a data problem. A simple function would be enough:
def prettyfloat(number):
    return "%0.2f" % number # Works the same.
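Since the question mentions printing a dictionary with float values inside, the same function-based idea can be extended to containers. The following is only a sketch under that assumption; format_floats is a hypothetical helper, not something from the answers.

```python
def format_floats(obj, spec=".2f"):
    """Return a copy of obj in which every float is replaced by a formatted string."""
    if isinstance(obj, float):
        return format(obj, spec)
    if isinstance(obj, dict):
        return {key: format_floats(value, spec) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return type(obj)(format_floats(item, spec) for item in obj)
    # Anything else (ints, strings, ...) is left untouched.
    return obj


data = {"ratio": 1 / 3, "sizes": [2.5, 4], "label": "demo"}
print(format_floats(data))
```

The original data keeps its full precision; only the copy that gets printed contains the two-decimal strings.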
Now, if it's not about representation after all, and what you actually want to achieve is fixed-point computations limited to two decimal places everywhere in your code, that's another story entirely.

That's because prettyfloat (op) prettyfloat doesn't return a prettyfloat.
Example:
>>> prettyfloat(0.6)
0.60 # type prettyfloat
>>> prettyfloat(0.6) + prettyfloat(4.4)
5.0 # type float
One solution, if you don't want to cast every operation result manually to prettyfloat and you still want to use prettyfloat, is to override all the operators.
Example with the __add__ operator (which is ugly):
class prettyfloat(float):
    def __repr__(self):
        return "%0.2f" % self
    def __add__(self, other):
        return prettyfloat(float(self) + other)
>>> prettyfloat(0.6) + prettyfloat(4.4)
5.00
By doing this, I think you will also have to change the name from prettyfloat to uglyfloat :) Hope this helps.

Use decimal. This is what it's for.
>>> import decimal
>>> decimal.getcontext().prec = 2
>>> one = decimal.Decimal("1.0")
>>> three = decimal.Decimal("3.0")
>>> one / three
Decimal('0.33')
...unless you actually want to work with full-precision floats everywhere in your code but print them rounded to two decimal places. In that case, you need to rewrite your printing logic.
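One detail to keep in mind with the decimal approach: the context precision counts significant digits, not decimal places. The sketch below uses an arbitrary example value and a local context so the global setting stays untouched; quantize() and plain string formatting are shown as alternatives for two decimal places.

```python
import decimal

value = decimal.Decimal("123.456")

# A context precision of 2 means two significant digits, not two decimals.
with decimal.localcontext() as ctx:
    ctx.prec = 2
    rounded_by_context = value + 0   # arithmetic applies the context precision
print(rounded_by_context)            # 1.2E+2

# quantize() rounds to a fixed number of decimal places instead.
two_places = value.quantize(decimal.Decimal("0.01"))
print(two_places)                    # 123.46

# Full-precision floats can simply be formatted when printed.
print(format(1 / 3, ".2f"))          # 0.33
```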
