Why does Python allow comparison of a callable and a number?

I used python to write an assignment last week, here is a code snippet
def departTime():
    '''
    Calculate the time to depart a packet.
    '''
    if(random.random < 0.8):
        t = random.expovariate(1.0 / 2.5)
    else:
        t = random.expovariate(1.0 / 10.5)
    return t
Can you see the problem? I compare random.random with 0.8, when it
should be random.random().
Of course this is due to my carelessness, but I don't get it. In my
opinion, this kind of comparison should trigger at least a warning in
any programming language.
So why does Python just ignore it and return False?

This isn't always a mistake
Firstly, just to make things clear, this isn't always a mistake.
In this particular case, it's pretty clear the comparison is an error.
However, because of the dynamic nature of Python, consider the following (perfectly valid, if terrible) code:
import random
random.random = 9 # Very weird but legal assignment.
random.random < 10 # True
random.random > 10 # False
What actually happens when comparing objects?
As for your actual case, comparing a function object to a number, have a look at Python documentation: Python Documentation: Expressions. Check out section 5.9, titled "Comparisons", which states:
The operators <, >, ==, >=, <=, and != compare the values of two objects. The objects need not have the same type. If both are numbers, they are converted to a common type. Otherwise, objects of different types always compare unequal, and are ordered consistently but arbitrarily. You can control comparison behavior of objects of non-built-in types by defining a __cmp__ method or rich comparison methods like __gt__, described in section Special method names.
(This unusual definition of comparison was used to simplify the definition of operations like sorting and the in and not in operators. In the future, the comparison rules for objects of different types are likely to change.)
That should explain both what happens and the reasoning for it.
BTW, I'm not sure what happens in newer versions of Python.
Edit: If you're wondering, Debilski's answer gives info about Python 3.

This is ‘fixed’ in Python 3 http://docs.python.org/3.1/whatsnew/3.0.html#ordering-comparisons.
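To illustrate the change, here is a minimal sketch of what the asker's comparison does under Python 3 (assuming CPython 3.x; the helper name compare_uncalled is made up for this illustration). Instead of silently returning False, the comparison raises TypeError:

```python
import random

def compare_uncalled():
    """Return the exception raised by comparing a function object to a float."""
    try:
        random.random < 0.8   # missing parentheses: compares the function itself
    except TypeError as exc:
        return exc
    return None               # only reached if the comparison is allowed (Python 2)

error = compare_uncalled()
print(type(error).__name__, error)
```

The message names the two operand types, which makes the missing call much easier to spot.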

Because in Python that is a perfectly valid comparison. Python can't know if you really want to make that comparison or if you've just made a mistake. It's your job to supply Python with the right objects to compare.
Because of the dynamic nature of Python you can compare and sort almost everything with almost everything (this is a feature). You've compared a function to a float in this case.
An example (Python 2):
import random
items = ["b", "a", 0, 1, random.random, random.random()]
print sorted(items)
This will give the following output:
[0, 0.89329568818188976, 1, <built-in method random of Random object at 0x8c6d66c>, 'a', 'b']

I think python allows this because the random.random object could be overriding the > operator by including a __gt__ method in the object which might be accepting or even expecting a number. So, python thinks you know what you are doing... and does not report it.
If you check for it, you can see that __gt__ exists for random.random...
>>> random.random.__gt__
<method-wrapper '__gt__' of builtin_function_or_method object at 0xb765c06c>
But, that might not be something you want to do.

Related

'<' not supported between instances of 'method' and 'method' [duplicate]

I often see error messages that look like any of:
TypeError: '<' not supported between instances of 'str' and 'int'
The message can vary quite a bit, and I guess that it has many causes; so rather than ask again every time for every little situation, I want to know: what approaches or techniques can I use to find the problem, when I see this error message? (I have already read I'm getting a TypeError. How do I fix it?, but I'm looking for advice specific to the individual pattern of error messages I have identified.)
So far, I have figured out that:
the error will show some kind of operator (most commonly <; sometimes >, <=, >= or +) is "not supported between instances of", and then two type names (could be any types, but usually they are not the same).
The highlighted code will almost always have that operator in it somewhere, but the version with < can also show up if I am trying to sort something. (Why?)

Overview
As with any other TypeError, the main steps of the debugging task are:
Figure out what operation is raising the exception, what the inputs are, and what their types are
Understand why these types and operation cause a problem together, and determine which is wrong
If the input is wrong, work backwards to figure out where it comes from
The "working backwards" part is the same for all exceptions, but here are some specific hints for the first two steps.
Identifying the operation and inputs
This error occurs with the relational operators (or comparisons) <, >, <=, >=. It won't happen with == or != (unless someone specifically defines those operators for a user-defined class such that they do), because there is a fallback comparison based on object identity.
Bitwise, arithmetic and shifting operators give different error messages. (The boolean logical operators and and or do not normally cause a problem because their logic is supported by every type by default, just like with == and !=. As for xor, that doesn't exist.)
As usual, start by looking at the last line of code mentioned in the error message. Go to the corresponding file and examine that line of code. (If the code is line-wrapped, it might not all be shown in the error message.)
Try to find an operator that matches the one in the error message, and double-check what the operands will be, i.e. the things on the left-hand and right-hand side of the operator. Double-check operator precedence to make sure of what expression will feed into the left-hand and right-hand sides of the operator. If the line is complex, try rewriting it to do the work in multiple steps. (If this accidentally fixes the problem, consider not trying to put it back!)
Sometimes the problem will be obvious at this point (for example, maybe the wrong variable was used due to a typo). Otherwise, use a debugger (ideally) or print traces to verify these values, and their types, at the time that the error occurs. The same line of code could run successfully many other times before the error occurs, so figuring out the problem with print can be difficult. Consider using temporary exception handling, along with breaking up the expression:
# result = complex_expression_a() < complex_expression_b()
try:
    lhs, rhs = complex_expression_a(), complex_expression_b()
    result = lhs < rhs
except TypeError:
    print(f'comparison failed between `{lhs}` of type `{type(lhs)}` and `{rhs}` of type `{type(rhs)}`')
    raise # so the program still stops and shows the error
Special case: sorting
As noted in the question, trying to sort a list using its .sort method, or to sort a sequence of values using the built-in sorted function (this is basically equivalent to creating a new list from the values, .sorting it and returning it), can cause TypeError: '<' not supported between instances of... - naming the types of two of the values that are in the input. This happens because general-purpose sorting involves comparing the values being sorted, and the built-in sort does this using <. (In Python 2.x, it was possible to specify a custom comparison function, but now custom sort orders are done using a "key" function that transforms the values into something that sorts in the desired way.)
Therefore, if the line of code contains one of these calls, the natural explanation is that the values being sorted are of incompatible types (typically, mixed types). Rather than looking for left- and right-hand side of an expression, we look at a single sequence of inputs. One useful technique here is to use set to find out all the types of these values (looking at individual values will probably not be as insightful):
try:
    my_data.sort()
except TypeError:
    print(f'sorting failed. Found these types: {set(type(d) for d in my_data)}')
    raise
See also LabelEncoder: TypeError: '>' not supported between instances of 'float' and 'str' for a Pandas-specific variant of this problem.
If all the input values are the same type, it could still be that the type does not support comparison (for example, a list of all None cannot be sorted, even though it's obvious that the result should just be the same list). A special note here: if the input was created using a list comprehension, then the values will normally be of the same type, but that type could be invalid. Carefully check the logic for the comprehension. If it results in a function, or in None, see the corresponding sections below.
Historical note
This kind of error is specific to Python 3. In 2.x, objects could be compared regardless of mismatched types, following rather complex rules; and certain things of the same type (such as dicts) could be compared that are no longer considered comparable in 3.x.
This meant that data could always be sorted without causing a cryptic error; but the resulting order could be hard to understand, and this permissive behaviour often caused many more problems than it solved.
Understanding the incompatibility
For comparisons, it's very likely that the problem is with either or both of the inputs, rather than the operator; but double-check the intended logic anyway.
For simple cases of sorting an input sequence, similarly, the problem is almost certainly with the input values. However, when sorting using a key function (e.g. mylist.sort(key=lambda x: ...)), that function could also cause the problem. Double-check the logic: given the expected type for the input values, what type of thing will be returned? Does it make sense to compare two things of that type? If an existing function is used, test the function with some sample values. If a lambda is used, convert it to a function first and test that.
If the list is supposed to contain instances of a user-defined class, make sure that the class instances are created properly. Consider for example:
class Example:
    def __init__(self):
        self.attribute = None
mylist = [Example(), Example()]
mylist.sort(key=lambda e: e.attribute)
The key function was supposed to make it possible to sort the instances according to their attribute value, but those values were set wrongly to None - thus we still get an error, because the Nones returned from the key function are not comparable.
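One way to apply the advice about testing the key function is sketched below. It assumes the same Example class as above; the name sort_key is hypothetical and simply replaces the lambda so it can be called on its own. Collecting the types of the returned keys shows the problem before any sorting happens:

```python
class Example:
    def __init__(self):
        self.attribute = None   # same (wrong) initialisation as in the text

def sort_key(e):
    """Named replacement for the lambda, so it can be tested separately."""
    return e.attribute

mylist = [Example(), Example()]

# Inspect what the key function returns instead of sorting right away.
keys = [sort_key(e) for e in mylist]
key_types = {type(k).__name__ for k in keys}
print(key_types)   # {'NoneType'}
```

Seeing NoneType here points at how the instances were created, not at the sort call.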
Comparing NoneType
NoneType is the type of the special None value, so this means that either of the operands (or one or more of the elements of the input) is None.
Check:
If the value is supposed to be provided by a user-defined function, make sure that the value is returned rather than being displayed using print and that the return value is used properly. Make sure that the function explicitly returns a non-None value without reaching the end, in every case. If the function uses recursion, make sure that it doesn't improperly ignore a value returned from the recursive call (i.e., unless there is a good reason).
If the value is supposed to come from a built-in method or a library function, make sure that it actually returns the value, rather than modifying the input as a side effect. This commonly happens for example with many list methods, random.shuffle, and print (especially a print call left over from a previous debugging attempt). Many other things can return None in some circumstances rather than reporting an error. When in doubt, read the documentation.
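A short sketch of the second check, using list.sort as the example of a method that modifies its input and returns None (the variable names are only for illustration):

```python
values = [3, 1, 2]

result = values.sort()        # sorts in place; the return value is None
print(result)                 # None
print(values)                 # [1, 2, 3]

ordered = sorted([3, 1, 2])   # sorted() returns a new list instead
print(ordered)                # [1, 2, 3]
```

Comparing result with anything using < would raise the NoneType variant of the error, while ordered behaves as expected.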
Comparing functions (or methods)
This almost always means that the function was not called when it should have been. Keep in mind that the parentheses are necessary for a call even if there are no arguments.
For example, if we have
import random
if random.random < 0.5:
    print('heads')
else:
    print('tails')
This will fail because the random function was not called - the code should say if random.random() < 0.5: instead.
Comparing strings and numbers
If one side of the comparison is a str and the other side is int or float, this typically suggests that the str should have been converted earlier on. This especially happens when the string comes from user input.
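A minimal sketch of this situation, assuming the string would normally come from input() (a fixed value is used here so the snippet runs without a prompt):

```python
raw = "42"        # stands in for something like input("Enter a number: ")
threshold = 10

try:
    raw < threshold           # str compared with int
except TypeError as exc:
    print(exc)

number = int(raw)             # convert once, as early as possible
print(number > threshold)     # True
```

Converting right where the value enters the program keeps every later comparison between numbers.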
Comparing user-defined types
By default, only == and != comparisons are possible with user-defined types. The others need to be implemented, using the special methods __lt__ (<), __le__ (<=), __gt__ (>) and/or __ge__ (>=). Python 3.x can make some inferences here automatically, but not many:
>>> class Example:
...     def __init__(self, value):
...         self._value = value
...     def __gt__(self, other):
...         if isinstance(other, Example):
...             return self._value > other._value
...         return self._value > other # for non-Examples
...
>>> Example(1) > Example(2) # our Example class supports `>` comparison with other Examples
False
>>> Example(1) > 2 # as well as non-Examples.
False
>>> Example(1) < Example(2) # `<` is inferred by swapping the arguments, for two Examples...
True
>>> Example(1) < 2 # but not for other types
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: '<' not supported between instances of 'Example' and 'int'
>>> Example(1) >= Example(2) # and `>=` does not work, even though `>` and `==` do
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: '>=' not supported between instances of 'Example' and 'Example'
In 3.2 and up, this can be worked around using the total_ordering decorator from the standard library functools module:
>>> from functools import total_ordering
>>> @total_ordering
... class Example:
...     # the rest of the class as before
>>> # Now all the examples work and do the right thing.
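For reference, a complete version of the decorated class might look like the sketch below. Two choices here are mine rather than the original answer's: an __eq__ method is added (the functools documentation says the class should supply one), and comparisons with non-Examples return NotImplemented instead of comparing against the raw value.

```python
from functools import total_ordering

@total_ordering
class Example:
    def __init__(self, value):
        self._value = value

    def __eq__(self, other):
        if isinstance(other, Example):
            return self._value == other._value
        return NotImplemented   # let Python handle unknown types

    def __gt__(self, other):
        if isinstance(other, Example):
            return self._value > other._value
        return NotImplemented

# The remaining operators are derived from __gt__ and __eq__.
print(Example(1) < Example(2))    # True
print(Example(1) >= Example(2))   # False
print(Example(2) <= Example(2))   # True
```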

Limitations of variables in python

I realize this may be a bit broad, and thought this was an interesting question that I haven't really seen an answer to. It may be hidden in the python documentation somewhere, but as I'm new to python haven't gone through all of it yet.
So.. are there any general rules of things that we cannot set to be variables? Everything in python is an object and we can use variables for the typical standard usage of storing strings, integers, aliasing variables, lists, calling references to classes, etc and if we're clever even something along the lines as the below that I can think of off the top of my head, wherever this may be useful
var = lambda: some_function()
storing comparison operators to clean code up such as:
var = some_value < some_value ...
So, that being said I've never come across anything that I couldn't store as a variable if I really wanted to, and was wondering if there really are any limitations?

You can't store syntactical constructs in a variable. For example, you can't do
command = break
while condition:
    if other_condition:
        command
or
operator = +
three = 1 operator 2

You can't really store expressions and statements as objects in Python.
Sure, you can wrap an expression in a lambda, and you can wrap a series of statements in a code object or callable, but you can't easily manipulate them. For instance, changing all instances of addition to multiplication is not readily possible.
To some extent, this can be worked around with the ast module, which provides for parsing Python code into abstract syntax trees. You can then manipulate the trees, instead of the code itself, and pass it to compile() to turn it back into a code object.
However, this is a form of indirection, compensating for a feature Python itself lacks. ast can't really compare to the anything-goes flexibility of (say) Lisp macros.
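As a rough sketch of the ast approach described above, the following turns every addition in a small piece of source into a multiplication. The class name AddToMult and the sample source string are made up for this example; ast.NodeTransformer, ast.fix_missing_locations and compile() are standard library calls.

```python
import ast

class AddToMult(ast.NodeTransformer):
    """Rewrite every binary `+` into `*`."""
    def visit_BinOp(self, node):
        self.generic_visit(node)          # transform nested operands first
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node

source = "result = (2 + 3) + 4"
tree = ast.parse(source)
tree = AddToMult().visit(tree)
ast.fix_missing_locations(tree)           # fill in line/column info for new nodes

namespace = {}
exec(compile(tree, filename="<ast>", mode="exec"), namespace)
print(namespace["result"])                # (2 * 3) * 4 == 24
```

The manipulation happens on the tree, not on a live expression object, which is the indirection the answer refers to.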

According to the Language Reference, the right hand side of an assignment statement can be an 'expression list' or a 'yield expression'. An expression list is a comma-separated list of one or more expressions. You need to follow this through several more tokens to come up with anything concrete, but ultimately you can find that an 'expression' is any number of objects (literals or variable names, or the result of applying a unary operator such as not, ~ or - to a nested expression_list) chained together by any binary operator (such as the arithmetic, comparison or bitwise operators, or logical and and or) or the ternary a if condition else b.
You can also note in other parts of the language reference that an 'expression' is exactly something you can use as an argument to a function, or as the first part (before the for) of a list comprehension or generator expression.
This is a fairly broad definition - in fact, it amounts to "anything Python resolves to an object". But it does leave out a few things - for example, you can't directly store the less-than operator < in a variable, since it isn't a valid expression by itself (it has to be between two other expressions) and you have to put it in a function that uses it instead. Similarly, most of the Python keywords aren't expressions (the exceptions are True, False and None, which are all canonical names for certain objects).
Note especially that functions are also objects, and hence the name of a function (without calling it) is a valid expression. This means that your example:
var = lambda: some_function()
can be written as:
var = some_function

By definition, a variable is something which can vary, or change. In its broadest sense, a variable is no more than a way of referring to a location in memory in your given program. Another way to think of a variable is as a container to place your information in.
Unlike popular statically typed languages, variable declaration in Python is not required. You can place pretty much anything in a variable so long as you can come up with a name for it. Furthermore, in addition to the value of a variable in Python being capable of changing, the type often can as well.
To address your question, I would say the limitations on a variable in Python relate only to a few basic necessary attributes:
A name
A scope
A value
(Usually) a type
As a result, things like operators (+ or * for instance) cannot be stored in a variable as they do not meet these basic requirements, and in general you cannot store expressions themselves as variables (unless you're wrapping them in a lambda expression).
As mentioned by Kevin, it's also worth noting that it is possible to sort of store an operator in a variable using the operator module. However, even then you cannot perform the kinds of manipulations that a variable is otherwise subject to, as really you are just making a value assignment. An example of the operator module:
import operator
operations = {"+": operator.add,
              "-": operator.sub}
operator_variable_string = input('Give me an operator: ')
operator_function = operations[operator_variable_string]
result = operator_function(8, 4)

Integers v/s Floats in python:Cannot understand the behavior

I was playing a bit in my python shell while learning about mutability of objects.
I found something strange:
>>> x=5.0
>>> id(x)
48840312
>>> id(5.0)
48840296
>>> x=x+3.0
>>> id(x) # why did x (now 8.0) keep the same id as 5.0?
48840296
>>> id(5.0)
36582128
>>> id(5.0)
48840344
Why is the id of 5.0 reused after the statement x=x+3.0?

Fundamentally, the answer to your question is "calling id() on numbers will give you unpredictable results". This is because, unlike languages like Java, where primitives literally are their value in memory, "primitives" in Python are still objects, and no guarantee is provided that exactly the same object will be used every time, merely that a functionally equivalent one will be.
CPython caches the values of the integers from -5 to 256 for efficiency (ensuring that calls to id() will always be the same), since these are commonly used and can be effectively cached. However, nothing about the language requires this to be the case, and other implementations may choose not to do so.
Whenever you write a float literal in Python, you're asking the interpreter to convert the string into a valid numerical object. If it can, Python will reuse existing objects, but if it cannot easily determine whether an object exists already, it will simply create a new one.
This is not to say that numbers in Python are mutable - they aren't. Any instance of a number, such as 5.0, in Python cannot be changed by the user after being created. However there's nothing wrong, as far as the interpreter is concerned, with constructing more than one instance of the same number.
Your specific example of the object representing x = 5.0 being reused for the value of x += 3.0 is an implementation detail. Under the covers, CPython may, if it sees fit, reuse numerical objects, both integers and floats, to avoid the costly activity of constructing a whole new object. I stress however, this is an implementation detail; it's entirely possible certain cases will not display this behavior, and CPython could at any time change its number-handling logic to no longer behave this way. You should avoid writing any code that relies on this quirk.
The alternative, as eryksun points out, is simply that you stumbled on an object being garbage collected and replaced in the same location. From the user's perspective, there's no difference between the two cases, and this serves to stress that id() should not be used on "primitives".
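A small sketch of the practical consequence: compare numbers by value with ==, and treat identity (is, or comparing id() results) as implementation-dependent. No output is shown for the identity check because it may differ between interpreters and versions.

```python
x = 5.0
y = float("5.0")   # built at run time, so it may or may not be the same object as x

print(x == y)      # value equality: always True
print(x is y)      # identity: implementation-dependent, do not rely on it
```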

The Devil is in the details
PyObject* PyInt_FromLong(long ival)
Return value: New reference.
Create a new integer object with a value of ival.
The current implementation keeps an array of integer objects for all integers between -5 and 256, when you create an int in that range
you actually just get back a reference to the existing object. So it
should be possible to change the value of 1. I suspect the behaviour
of Python in this case is undefined. :-)
Note: This is true only for CPython and may not apply to other Python implementations.

Understanding Python Attributes and Methods

I am trying to learn Python and am a bit confused about a script I am playing with. I am using Python to launch scapy. There are some conditional statements that test for certain values. My confusion is centered around how the values are checked. I hope I am using the terms attributes and methods appropriately. I am still trying to figure out the builtin features vs. what is included with scapy. I've been using Powershell mainly for the last few years so it's hard to switch gears :)
tcp_connect_scan_resp = sr1(IP(dst=dst_ip)/TCP(sport=src_port,dport=dst_port,flags="S"),timeout=10)
if(str(type(tcp_connect_scan_resp))=="<type 'NoneType'>"):
    print "Closed"
elif(tcp_connect_scan_resp.haslayer(TCP)):
    if(tcp_connect_scan_resp.getlayer(TCP).flags == 0x12):
The first conditional statement appears to check for the attribute 'type'. Why would they use the Python built-in str() and type() functions in this case? If I just use type() it pulls the same value.
The second and third conditional statements appear to be using methods built into scapy. What is the logic for including the brackets () on the outside of the statements? Again, if I run them manually, I get the proper value.

The second point, the parentheses around the expression of an if statement, is simply unnecessary and bad style.
The first statement warrants a more detailed explanation:
if(str(type(tcp_connect_scan_resp))=="<type 'NoneType'>"):
This checks whether the string representation of the type of tcp_connect_scan_resp is equal to "<type 'NoneType'>". This is a bad form of type checking, used in a bad way. There are situations where type checking may be necessary, but generally you should try to avoid it in Python (see duck typing). If you must, use isinstance().
In the case of the Python builtin type None, the idiomatic way is to just write
if foo is None:
Now, the reason you got the "same result" by using type() yourself is that when you enter something in an interactive Python shell, the interpreter shows a representation of the value for you (by calling __repr__()). Except for basic types that have literal notations, like integers, strings, or sequences, the representation of an object isn't necessarily the same as its value (or what you would type in to recreate that same object).
So, when you do
>>> foo = type(42)
>>> foo
<type 'int'>
the interpreter prints <type 'int'>, but the result of the call is actually int, the built-in type for integers:
>>> type(42) == int
True
>>> type(42) == "<type 'int'>"
False
Also, consider this:
Libraries or tools written to help with a specific field of expertise are often written by experts in those fields - not necessarily experts in Python. In my opinion, you often see this in scientific libraries (matplotlib and numpy for example). This doesn't mean they're bad libraries, but they often aren't a good inspiration for Pythonic style.

Never check a type by comparing str(type(obj)) == 'ClassName'.
You should use isinstance(obj, Class), or for None you just write if obj is None.
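A brief sketch of both recommended checks, written for Python 3 (the function name describe is only for illustration):

```python
def describe(obj):
    if obj is None:                      # identity check for the None singleton
        return "nothing"
    if isinstance(obj, (int, float)):    # accepts subclasses as well
        return "number"
    return "something else"

print(describe(None))   # nothing
print(describe(42))     # number
print(describe("42"))   # something else
```

Unlike the string comparison, these checks do not depend on how a type happens to be printed, which changed between Python 2 (<type 'int'>) and Python 3 (<class 'int'>).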

Why isn't the 'len' function inherited by dictionaries and lists in Python

example:
a_list = [1, 2, 3]
a_list.len() # doesn't work
len(a_list) # works
Python being (very) object oriented, I don't understand why the 'len' function isn't inherited by the object.
Plus I keep trying the wrong solution since it appears as the logical one to me

Guido's explanation is here:
First of all, I chose len(x) over x.len() for HCI reasons (def __len__() came much later). There are two intertwined reasons actually, both HCI:
(a) For some operations, prefix notation just reads better than postfix — prefix (and infix!) operations have a long tradition in mathematics which likes notations where the visuals help the mathematician thinking about a problem. Compare the easy with which we rewrite a formula like x*(a+b) into x*a + x*b to the clumsiness of doing the same thing using a raw OO notation.
(b) When I read code that says len(x) I know that it is asking for the length of something. This tells me two things: the result is an integer, and the argument is some kind of container. To the contrary, when I read x.len(), I have to already know that x is some kind of container implementing an interface or inheriting from a class that has a standard len(). Witness the confusion we occasionally have when a class that is not implementing a mapping has a get() or keys() method, or something that isn’t a file has a write() method.
Saying the same thing in another way, I see ‘len‘ as a built-in operation. I’d hate to lose that. /…/

The short answer: 1) backwards compatibility and 2) there's not enough of a difference for it to really matter. For a more detailed explanation, read on.
The idiomatic Python approach to such operations is special methods which aren't intended to be called directly. For example, to make x + y work for your own class, you write an __add__ method. To make sure that int(spam) properly converts your custom class, write an __int__ method. To make sure that len(foo) does something sensible, write a __len__ method.
This is how things have always been with Python, and I think it makes a lot of sense for some things. In particular, this seems like a sensible way to implement operator overloading. As for the rest, different languages disagree; in Ruby you'd convert something to an integer by calling spam.to_i directly instead of saying int(spam).
You're right that Python is an extremely object-oriented language and that having to call an external function on an object to get its length seems odd. On the other hand, len(silly_walks) isn't any more onerous than silly_walks.len(), and Guido has said that he actually prefers it (http://mail.python.org/pipermail/python-3000/2006-November/004643.html).

It just isn't.
You can, however, do:
>>> [1,2,3].__len__()
3
Adding a __len__() method to a class is what makes the len() magic work.

This way fits in better with the rest of the language. The convention in python is that you add __foo__ special methods to objects to make them have certain capabilities (rather than e.g. deriving from a specific base class). For example, an object is
callable if it has a __call__ method
iterable if it has an __iter__ method,
supports access with [] if it has __getitem__ and __setitem__.
...
One of these special methods is __len__ which makes it have a length accessible with len().
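The conventions listed above can be sketched in one small class. Playlist and its contents are hypothetical, chosen only to show which special method enables which piece of syntax:

```python
class Playlist:
    """Hypothetical container used only to illustrate the special methods."""
    def __init__(self, tracks):
        self._tracks = list(tracks)

    def __len__(self):              # enables len(obj)
        return len(self._tracks)

    def __iter__(self):             # enables iteration, e.g. list(obj) or for loops
        return iter(self._tracks)

    def __getitem__(self, index):   # enables obj[index]
        return self._tracks[index]

    def __call__(self):             # enables obj()
        return ", ".join(self._tracks)

p = Playlist(["intro", "verse", "outro"])
print(len(p))     # 3
print(p[1])       # verse
print(list(p))    # ['intro', 'verse', 'outro']
print(p())        # intro, verse, outro
```

None of this requires inheriting from a particular base class; defining the methods is enough.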

Maybe you're looking for __len__. If that method exists, then len(a) calls it:
>>> class Spam:
...     def __len__(self): return 3
...
>>> s = Spam()
>>> len(s)
3

Well, there actually is a length method, it is just hidden:
>>> a_list = [1, 2, 3]
>>> a_list.__len__()
3
The len() built-in function appears to be simply a wrapper for a call to the hidden __len__() method of the object.
Not sure why they made the decision to implement things this way though.

There is some good info at the link below on why certain things are functions and others are methods. It does indeed cause some inconsistencies in the language.
http://mail.python.org/pipermail/python-dev/2008-January/076612.html
