I often see error messages that look like any of:
TypeError: '<' not supported between instances of 'str' and 'int'
The message can vary quite a bit, and I guess that it has many causes; so rather than ask again every time for every little situation, I want to know: what approaches or techniques can I use to find the problem, when I see this error message? (I have already read I'm getting a TypeError. How do I fix it?, but I'm looking for advice specific to the individual pattern of error messages I have identified.)
So far, I have figured out that:
the error names some kind of operator (most commonly <; sometimes >, <=, >= or +) as "not supported between instances of", followed by two type names (they could be any types, but usually they are not the same).
The highlighted code will almost always have that operator in it somewhere, but the version with < can also show up if I am trying to sort something. (Why?)
Overview
As with any other TypeError, the main steps of the debugging task are:
Figure out what operation is raising the exception, what the inputs are, and what their types are
Understand why these types and operation cause a problem together, and determine which is wrong
If the input is wrong, work backwards to figure out where it comes from
The "working backwards" part is the same for all exceptions, but here are some specific hints for the first two steps.
Identifying the operation and inputs
This error occurs with the relational operators (or comparisons) <, >, <=, >=. It won't happen with == or != (unless someone specifically defines those operators for a user-defined class such that they do), because there is a fallback comparison based on object identity.
Bitwise, arithmetic and shift operators give different error messages. (The boolean logical operators and and or do not normally cause a problem, because their logic is supported by every type by default, just like == and !=. As for a boolean xor operator, that doesn't exist.)
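A quick illustration with plain built-in types:
>>> 'a' == 1
False
>>> 'a' != 1
True
>>> 'a' < 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: '<' not supported between instances of 'str' and 'int'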
As usual, start by looking at the last line of code mentioned in the error message. Go to the corresponding file and examine that line of code. (If the code is line-wrapped, it might not all be shown in the error message.)
Try to find an operator that matches the one in the error message, and double-check what the operands will be, i.e. the things on the left-hand and right-hand sides of that operator. Check operator precedence to be sure which sub-expressions feed into each side. If the line is complex, try rewriting it to do the work in multiple steps. (If this accidentally fixes the problem, consider not trying to put it back!)
Sometimes the problem will be obvious at this point (for example, maybe the wrong variable was used due to a typo). Otherwise, use a debugger (ideally) or print traces to verify these values, and their types, at the time that the error occurs. The same line of code could run successfully many other times before the error occurs, so figuring out the problem with print can be difficult. Consider using temporary exception handling, along with breaking up the expression:
# result = complex_expression_a() < complex_expression_b()
try:
    lhs, rhs = complex_expression_a(), complex_expression_b()
    result = lhs < rhs
except TypeError:
    print(f'comparison failed between `{lhs}` of type `{type(lhs)}` and `{rhs}` of type `{type(rhs)}`')
    raise  # so the program still stops and shows the error
Special case: sorting
As noted in the question, trying to sort a list using its .sort method, or to sort a sequence of values using the built-in sorted function (this is basically equivalent to creating a new list from the values, .sorting it and returning it), can cause TypeError: '<' not supported between instances of... - naming the types of two of the values that are in the input. This happens because general-purpose sorting involves comparing the values being sorted, and the built-in sort does this using <. (In Python 2.x, it was possible to specify a custom comparison function, but now custom sort orders are done using a "key" function that transforms the values into something that sorts in the desired way.)
Therefore, if the line of code contains one of these calls, the natural explanation is that the values being sorted are of incompatible types (typically, mixed types). Rather than looking for left- and right-hand side of an expression, we look at a single sequence of inputs. One useful technique here is to use set to find out all the types of these values (looking at individual values will probably not be as insightful):
try:
    my_data.sort()
except TypeError:
    print(f'sorting failed. Found these types: {set(type(d) for d in my_data)}')
    raise
See also LabelEncoder: TypeError: '>' not supported between instances of 'float' and 'str' for a Pandas-specific variant of this problem.
If all the input values are the same type, it could still be that the type does not support comparison (for example, a list containing only None cannot be sorted, even though the result would obviously just be the same list). A special note here: if the input was created using a list comprehension, then the values will normally all be of the same type, but that type could be one that cannot be compared. Carefully check the logic of the comprehension. If it produces functions, or None values, see the corresponding sections below.
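For example, here is a sketch of a comprehension that quietly produces unsortable values (the dictionaries and key names are made up for illustration):
rows = [{'score': 3}, {'name': 'x'}, {'score': 1}]
values = [row.get('score') for row in rows]   # dict.get returns None when the key is missing
values.sort()   # TypeError: '<' not supported between instances of 'NoneType' and 'int'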
Historical note
This kind of error is specific to Python 3. In 2.x, objects could be compared regardless of mismatched types, following rather complex rules; and certain things of the same type (such as dicts) could be compared that are no longer considered comparable in 3.x.
This meant that data could always be sorted without causing a cryptic error; but the resulting order could be hard to understand, and this permissive behaviour often caused many more problems than it solved.
Understanding the incompatibility
For comparisons, it's very likely that the problem is with either or both of the inputs, rather than the operator; but double-check the intended logic anyway.
For simple cases of sorting an input sequence, similarly, the problem is almost certainly with the input values. However, when sorting with a key function (e.g. mylist.sort(key=lambda x: ...)), that function could also cause the problem. Double-check its logic: given the expected type of the input values, what type of thing will it return? Does it make sense to compare two things of that type? If an existing function is used, test it with some sample values. If a lambda is used, convert it to a named function first and test that.
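A sketch of testing a key function in isolation before sorting with it (the data and key here are made up):
items = [{'priority': 2}, {}, {'priority': 1}]   # the second dict has no 'priority' key

def sort_key(item):
    return item.get('priority')

print([sort_key(item) for item in items])   # [2, None, 1] - the None will make the sort fail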
If the list is supposed to contain instances of a user-defined class, make sure that the class instances are created properly. Consider for example:
class Example:
    def __init__(self):
        self.attribute = None

mylist = [Example(), Example()]
mylist.sort(key=lambda e: e.attribute)
The key function was supposed to make it possible to sort the instances according to their attribute values, but those values were wrongly set to None - so we still get an error, because the Nones returned by the key function are not comparable.
Comparing NoneType
NoneType is the type of the special None value, so this means that either of the operands (or one or more of the elements of the input) is None.
Check:
If the value is supposed to be provided by a user-defined function, make sure that the value is returned rather than displayed using print, and that the return value is actually used. Make sure that every code path explicitly returns a non-None value, rather than falling off the end of the function. If the function uses recursion, make sure that it doesn't ignore the value returned from the recursive call (unless there is a good reason to).
If the value is supposed to come from a built-in method or a library function, check whether that function actually returns the value, or instead modifies its input as a side effect and returns None. This commonly happens with many list methods, random.shuffle, and print (especially a print call left over from a previous debugging attempt). Many other things can return None in some circumstances rather than reporting an error. When in doubt, read the documentation. (See the sketch after this list.)
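A minimal sketch of two common ways a None ends up in a comparison (the function and data are made up):
def best_score(scores):
    print(max(scores))    # prints the result instead of returning it, so the function returns None

result = best_score([3, 1, 2])
print(type(result))       # <class 'NoneType'> - a later `result > 2` would raise the error

data = [3, 1, 2]
data = data.sort()        # list.sort() sorts in place and returns None
print(type(data))         # <class 'NoneType'> - any later comparison involving `data` fails too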
Comparing functions (or methods)
This almost always means that the function was not called when it should have been. Keep in mind that the parentheses are necessary for a call even if there are no arguments.
For example, if we have
import random

if random.random < 0.5:
    print('heads')
else:
    print('tails')
This will fail because the random function was not called - the code should say if random.random() < 0.5: instead.
Comparing strings and numbers
If one side of the comparison is a str and the other side is int or float, this typically suggests that the str should have been converted earlier on, as in this example. This especially happens when the string comes from user input.
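A minimal sketch of the usual fix, converting before comparing (the prompt and threshold here are made up):
age = input('How old are you? ')   # input() always returns a str, e.g. '17'
# if age < 18: ...                 # would raise: '<' not supported between instances of 'str' and 'int'
if int(age) < 18:                  # convert first (this assumes the user typed a whole number)
    print('minor')
else:
    print('adult')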
Comparing user-defined types
By default, only == and != comparisons are possible with user-defined types. The others need to be implemented, using the special methods __lt__ (<), __le__ (<=), __gt__ (>) and/or __ge__ (>=). Python 3.x can make some inferences here automatically, but not many:
>>> class Example:
...     def __init__(self, value):
...         self._value = value
...     def __gt__(self, other):
...         if isinstance(other, Example):
...             return self._value > other._value
...         return self._value > other # for non-Examples
...
>>> Example(1) > Example(2) # our Example class supports `>` comparison with other Examples
False
>>> Example(1) > 2 # as well as non-Examples.
False
>>> Example(1) < Example(2) # `<` is inferred by swapping the arguments, for two Examples...
True
>>> Example(1) < 2 # but not for other types
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: '<' not supported between instances of 'Example' and 'int'
>>> Example(1) >= Example(2) # and `>=` does not work, even though `>` and `==` do
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: '>=' not supported between instances of 'Example' and 'Example'
In 3.2 and up, this can be worked around using the total_ordering decorator from the standard library functools module:
>>> from functools import total_ordering
>>> @total_ordering
... class Example:
...     # the rest of the class as before
>>> # Now all the examples work and do the right thing.
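For completeness, here is a self-contained sketch of the decorated class (restating the methods from above; an explicit __eq__ is added here, since total_ordering works best when equality is also defined):
from functools import total_ordering

@total_ordering
class Example:
    def __init__(self, value):
        self._value = value
    def __eq__(self, other):
        if isinstance(other, Example):
            return self._value == other._value
        return self._value == other
    def __gt__(self, other):
        if isinstance(other, Example):
            return self._value > other._value
        return self._value > other

print(Example(1) < 2)              # True - total_ordering fills in __lt__, __le__ and __ge__
print(Example(1) >= Example(2))    # False - no TypeError any more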
Related
How do I check if an object is of a given type, or if it inherits from a given type?
How do I check if the object o is of type str?
Beginners often wrongly expect the string to already be "a number" - either expecting Python 3.x input to convert type, or expecting that a string like '1' is also simultaneously an integer. This is the wrong canonical for those questions. Please carefully read the question and then use How do I check if a string represents a number (float or int)?, How can I read inputs as numbers? and/or Asking the user for input until they give a valid response as appropriate.
Use isinstance to check if o is an instance of str or any subclass of str:
if isinstance(o, str):
To check if the type of o is exactly str, excluding subclasses of str:
if type(o) is str:
See Built-in Functions in the Python Library Reference for relevant information.
Checking for strings in Python 2
For Python 2, this is a better way to check if o is a string:
if isinstance(o, basestring):
because this will also catch Unicode strings. unicode is not a subclass of str; both str and unicode are subclasses of basestring. In Python 3, basestring no longer exists since there's a strict separation of strings (str) and binary data (bytes).
Alternatively, isinstance accepts a tuple of classes. This will return True if o is an instance of any subclass of any of (str, unicode):
if isinstance(o, (str, unicode)):
The most Pythonic way to check the type of an object is... not to check it.
Since Python encourages Duck Typing, you should just try...except to use the object's methods the way you want to use them. So if your function is looking for a writable file object, don't check that it's a subclass of file, just try to use its .write() method!
Of course, sometimes these nice abstractions break down and isinstance(obj, cls) is what you need. But use it sparingly.
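A small sketch of that advice (the function name is made up; it only relies on the .write() method it actually needs):
import io

def save_report(out):
    try:
        out.write('report contents\n')   # just use the interface we need
    except AttributeError:
        raise TypeError('save_report needs a writable file-like object') from None

save_report(io.StringIO())   # works: StringIO is not a file subclass, but it has .write()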
isinstance(o, str) will return True if o is a str or is of a type that inherits from str.
type(o) is str will return True if and only if o is a str. It will return False if o is of a type that inherits from str.
After the question was asked and answered, type hints were added to Python. Type hints in Python allow types to be checked, but in a very different way from statically typed languages. They associate the expected types of arguments with a function as runtime-accessible data, and this allows tools to check for type errors. Example of type hint syntax:
def foo(i: int):
    return i

foo(5)
foo('oops')
In this case we want an error to be triggered for foo('oops') since the annotated type of the argument is int. The added type hint does not cause an error to occur when the script is run normally. However, it adds attributes to the function describing the expected types that other programs can query and use to check for type errors.
One of these other programs that can be used to find the type error is mypy:
mypy script.py
script.py:12: error: Argument 1 to "foo" has incompatible type "str"; expected "int"
(You might need to install mypy from your package manager. I don't think it comes with CPython but seems to have some level of "officialness".)
Type checking this way is different from type checking in statically typed compiled languages. Because types are dynamic in Python, type checking must be done at runtime, which imposes a cost -- even on correct programs -- if we insist that it happen at every chance. Explicit type checks may also be more restrictive than needed and cause unnecessary errors (e.g. does the argument really need to be of exactly list type or is anything iterable sufficient?).
The upside of explicit type checking is that it can catch errors earlier and give clearer error messages than duck typing. The exact requirements of a duck type can only be expressed with external documentation (hopefully it's thorough and accurate) and errors from incompatible types can occur far from where they originate.
Python's type hints are meant to offer a compromise where types can be specified and checked but there is no additional cost during usual code execution.
The typing package offers generic types that can be used in type hints to express needed behaviors without requiring particular concrete types. For example, it includes Iterable and Callable for hints that specify the need for any type with those behaviors.
While type hints are the most Pythonic way to check types, it's often even more Pythonic to not check types at all and rely on duck typing. Type hints are relatively new and the jury is still out on when they're the most Pythonic solution. A relatively uncontroversial but very general comparison: Type hints provide a form of documentation that can be enforced, allow code to generate earlier and easier to understand errors, can catch errors that duck typing can't, and can be checked statically (in an unusual sense but it's still outside of runtime). On the other hand, duck typing has been the Pythonic way for a long time, doesn't impose the cognitive overhead of static typing, is less verbose, and will accept all viable types and then some.
In Python 3.10, you can use | in isinstance:
>>> isinstance('1223', int | str)
True
>>> isinstance('abcd', int | str)
True
isinstance(o, str)
Link to docs
You can check the type of a variable using the __name__ attribute of its type.
Ex:
>>> a = [1,2,3,4]
>>> b = 1
>>> type(a).__name__
'list'
>>> type(a).__name__ == 'list'
True
>>> type(b).__name__ == 'list'
False
>>> type(b).__name__
'int'
For more complex type validations I like typeguard's approach of validating based on python type hint annotations:
from typeguard import check_type
from typing import List

try:
    check_type('mylist', [1, 2], List[int])
except TypeError as e:
    print(e)
You can perform very complex validations in a very clean and readable fashion.
check_type('foo', [1, 3.14], List[Union[int, float]])
# vs
isinstance(foo, list) and all(isinstance(a, (int, float)) for a in foo)
I think the cool thing about using a dynamic language like Python is you really shouldn't have to check something like that.
I would just call the required methods on your object and catch an AttributeError. Later on this will allow you to call your methods with other (seemingly unrelated) objects to accomplish different tasks, such as mocking an object for testing.
I've used this a lot when getting data off the web with urllib2.urlopen(), which returns a file-like object. This can in turn be passed to almost any method that reads from a file, because it implements the same read() method as a real file.
But I'm sure there is a time and place for using isinstance(), otherwise it probably wouldn't be there :)
The accepted answer does answer the questions as asked.
Q: What is the best way to check whether a given object is of a given type? How about checking whether the object inherits from a given type?
A: Use isinstance, issubclass, type to check based on types.
As other answers and comments are quick to point out however, there's a lot more to the idea of "type-checking" than that in python. Since the addition of Python 3 and type hints, much has changed as well. Below, I go over some of the difficulties with type checking, duck typing, and exception handling. For those that think type checking isn't what is needed (it usually isn't, but we're here), I also point out how type hints can be used instead.
Type Checking
Type checking is not always an appropriate thing to do in python. Consider the following example:
def sum(nums):
    """Expect an iterable of integers and return the sum."""
    result = 0
    for n in nums:
        result += n
    return result
To check if the input is an iterable of integers, we run into a major issue. The only way to check that every element is an integer would be to loop through and check each element. But if we loop through the entire iterator, then there will be nothing left for the intended code. We have two options in this kind of situation.
Check as we loop.
Check beforehand but store everything as we check.
Option 1 has the downside of complicating our code, especially if we need to perform similar checks in many places. It forces us to move type checking from the top of the function to everywhere we use the iterable in our code.
Option 2 has the obvious downside that it destroys the entire purpose of iterators. The entire point is to not store the data because we shouldn't need to.
One might also think that, if checking all of the elements is too much, perhaps we can just check whether the input itself is iterable; but there isn't actually an iterable base class to check against. Any type implementing __iter__ is iterable.
Exception Handling and Duck Typing
An alternative approach would be to forgo type checking altogether and focus on exception handling and duck typing instead. That is to say, wrap your code in a try-except block and catch any errors that occur. Alternatively, don't do anything and let exceptions rise naturally from your code.
Here's one way to go about catching an exception.
def sum(nums):
    """Try to catch exceptions?"""
    try:
        result = 0
        for n in nums:
            result += n
        return result
    except TypeError as e:
        print(e)
Compared to the options before, this is certainly better. We're checking as we run the code. If there's a TypeError anywhere, we'll know. We don't have to place a check everywhere that we loop through the input. And we don't have to store the input as we iterate over it.
Furthermore, this approach enables duck typing. Rather than checking for specific types, we have moved to checking for specific behaviors and look for when the input fails to behave as expected (in this case, looping through nums and being able to add n).
However, the exact reasons that make exception handling nice can also be its downfall.
A float isn't an int, but it satisfies the behavioral requirements to work.
It is also bad practice to wrap the entire code with a try-except block.
At first these may not seem like issues, but here's some reasons that may change your mind.
A user can no longer expect our function to return an int as intended. This may break code elsewhere.
Since exceptions can come from a wide variety of sources, using the try-except on the whole code block may end up catching exceptions you didn't intend to. We only wanted to check if nums was iterable and had integer elements.
Ideally we'd like to catch the exceptions our code generates and raise, in their place, more informative exceptions. It's not fun when an exception is raised from someone else's code with no explanation other than a line you didn't write and the fact that some TypeError occurred.
In order to fix the exception handling in response to the above points, our code would then become this... abomination.
def sum(nums):
    """
    Try to catch all of our exceptions only.
    Re-raise them with more specific details.
    """
    result = 0
    try:
        iter(nums)
    except TypeError as e:
        raise TypeError("nums must be iterable")
    for n in nums:
        try:
            result += int(n)
        except TypeError as e:
            raise TypeError("stopped mid iteration since a non-integer was found")
    return result
You can kinda see where this is going. The more we try to "properly" check things, the worse our code looks. Compared to the original code, this isn't readable at all.
We could argue perhaps this is a bit extreme. But on the other hand, this is only a very simple example. In practice, your code is probably much more complicated than this.
Type Hints
We've seen what happens when we try to modify our small example to "enable type checking". Rather than focusing on trying to force specific types, type hinting allows for a way to make types clear to users.
from typing import Iterable

def sum(nums: Iterable[int]) -> int:
    result = 0
    for n in nums:
        result += n
    return result
Here are some advantages to using type-hints.
The code actually looks good now!
Static type analysis may be performed by your editor if you use type hints!
They are stored on the function/class, making them dynamically usable e.g. typeguard and dataclasses.
They show up for functions when using help(...).
No need to sanity check if your input type is right based on a description or worse lack thereof.
You can "type" hint based on structure e.g. "does it have this attribute?" without requiring subclassing by the user.
The downside to type hinting?
Type hints are nothing more than syntax and special text on their own. They aren't the same as type checking.
In other words, it doesn't actually answer the question because it doesn't provide type checking. Regardless, however, if you are here for type checking, then you should be type hinting as well. Of course, if you've come to the conclusion that type checking isn't actually necessary but you want some semblance of typing, then type hints are for you.
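As a concrete illustration of the "hint based on structure" point from the list above, here is a sketch using typing.Protocol (available since Python 3.8; the class and function names are made up):
from typing import Protocol

class HasWrite(Protocol):
    def write(self, s: str, /) -> int: ...

def save(out: HasWrite) -> None:
    out.write('saved\n')   # a static checker accepts any object with a compatible .write()

import io
save(io.StringIO())        # passes the check without subclassing anything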
To Hugo:
You probably mean list rather than array, but that points to the whole problem with type checking - you don't want to know if the object in question is a list, you want to know if it's some kind of sequence or if it's a single object. So try to use it like a sequence.
Say you want to add the object to an existing sequence, or if it's a sequence of objects, add them all
try:
    my_sequence.extend(o)
except TypeError:
    my_sequence.append(o)
One catch with this is if you are working with strings and/or sequences of strings - that's tricky, as a string is often thought of as a single object, but it's also a sequence of characters. Worse than that, it's really a sequence of single-character strings.
I usually choose to design my API so that it only accepts either a single value or a sequence - it makes things easier. It's not hard to put a [ ] around your single value when you pass it in if need be.
(Though this can cause errors with strings, as they do look like (are) sequences.)
If you have to check for the type str or int, please use isinstance. As already mentioned by others, the reason is that it also includes subclasses. One important example of subclasses, from my perspective, are Enums with a data type, like IntEnum or StrEnum, which are a pretty nice way to define related constants. However, it is kind of annoying if libraries do not accept those as the underlying types.
Example:
import enum

class MyEnum(str, enum.Enum):
    A = "a"
    B = "b"

print(f"is string: {isinstance(MyEnum.A, str)}")      # True
print(f"is string: {type(MyEnum.A) == str}")          # False!!!
print(f"is string: {type(MyEnum.A.value) == str}")    # True
In Python, you can use the built-in isinstance() function to check if an object is of a given type, or if it inherits from a given type.
To check if the object o is of type str, you would use the following code:
if isinstance(o, str):
    # o is of type str
You can also use the type() function to check the object type.
if type(o) == str:
    # o is of type str
You can also check whether the object's type is a subclass of a particular class, using the issubclass() function.
if issubclass(type(o), str):
    # type(o) is str or a subclass of str
A simple way to check type is to compare it with something whose type you know.
>>> a = 1
>>> type(a) == type(1)
True
>>> b = 'abc'
>>> type(b) == type('')
True
I think the best way is to type your variables well. You can do this by using the "typing" library.
Example:
from typing import NewType

UserId = NewType('UserId', int)
some_id = UserId(524313)
See https://docs.python.org/3/library/typing.html.
I've come across answers here for type checking in general, type checking for numbers, and type checking for strings. Most people seem to respond by saying that type checking should never be performed in python (< 2.6) due to duck typing. My (limited) understanding of duck typing is that type is determined by use of an object's attributes. What do I do if I'm not using any attributes?
I have a simple function that determines constants based on the argument, which should be a number. I raise an exception defined by
class outOfBoundsError(ValueError):
    """
    Specified value is outside the domain.
    """
with a message telling them the number they gave me is too big. I would like to keep this message specific. But if the argument is a string (like 'charlie'), it still considers the argument to be greater than my specified number (and raises my exception). Should I just add a dummy line to the code like argument + 2 so that a TypeError is raised?
Note: I don't know anything about ABCs but I don't think they're available to me since the latest python version we have access to is 2.5 : (.
A common duck-typish Python solution to this problem is to (try to) convert what you got to what you need. For example, if you need an integer:
def getconstants(arg):
    try:
        arg = int(arg)
    except:
        raise TypeError("expected integer, or something that can be converted to one, but got " + repr(arg))
The int type constructor knows how to deal with many built-in types. Additionally, types that are convertible to an int type can implement the __int__() special method, which would allow them to be used here. If the user passed in a string, that would work fine too as long as it contained only digits.
Your idea of performing some operation that could only be performed on a numeric type (such as adding 2) is similar, and would work great in many cases, but would fail with strings that can be converted to the desired type, so I like the type conversion better.
You could probably get by without the try/except here, since int() will raise either TypeError or ValueError if it can't do the conversion, but I think TypeError is more appropriate since you are mainly interested in the object's type, so I chose to catch the exception and always raise TypeError.
My honest answer to "Most people seem to respond by saying that type checking should never be performed in python (< 2.6) due to duck typing" is: nonsense.
Type-checking - where needed - is common practice.
Don't listen to each and every opinion, and don't accept every statement unfiltered.
You can check initially to make sure you received a type you support, i.e.
def foo(arg):
    if not isinstance(arg, int):
        raise TypeError
    ...
Nothing wrong with that if you're only supporting integers.
Introspection works for this...primitive types have class names too:
>>> i=2
>>> i.__class__.__name__
'int'
>>> s="two"
>>> s.__class__.__name__
'str'
>>>
I used python to write an assignment last week, here is a code snippet
def departTime():
    '''
    Calculate the time to depart a packet.
    '''
    if(random.random < 0.8):
        t = random.expovariate(1.0 / 2.5)
    else:
        t = random.expovariate(1.0 / 10.5)
    return t
Can you see the problem? I compare random.random with 0.8, when it should be random.random().
Of course this is because of my carelessness, but I don't get it. In my opinion, this kind of comparison should at least trigger a warning in any programming language.
So why does python just ignore it and return False?
This isn't always a mistake
Firstly, just to make things clear, this isn't always a mistake.
In this particular case, it's pretty clear the comparison is an error.
However, because of the dynamic nature of Python, consider the following (perfectly valid, if terrible) code:
import random
random.random = 9 # Very weird but legal assignment.
random.random < 10 # True
random.random > 10 # False
What actually happens when comparing objects?
As for your actual case, comparing a function object to a number, have a look at the Python documentation (Python Documentation: Expressions). Check out section 5.9, titled "Comparisons", which states:
The operators <, >, ==, >=, <=, and != compare the values of two objects. The objects need not have the same type. If both are numbers, they are converted to a common type. Otherwise, objects of different types always compare unequal, and are ordered consistently but arbitrarily. You can control comparison behavior of objects of non-built-in types by defining a __cmp__ method or rich comparison methods like __gt__, described in section Special method names.
(This unusual definition of comparison was used to simplify the definition of operations like sorting and the in and not in operators. In the future, the comparison rules for objects of different types are likely to change.)
That should explain both what happens and the reasoning for it.
BTW, I'm not sure what happens in newer versions of Python.
Edit: If you're wondering, Debilski's answer gives info about Python 3.
This is ‘fixed’ in Python 3 http://docs.python.org/3.1/whatsnew/3.0.html#ordering-comparisons.
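For illustration, in Python 3 the same comparison from the question now fails loudly instead of quietly returning a boolean:
>>> import random
>>> random.random < 0.8
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: '<' not supported between instances of 'builtin_function_or_method' and 'float'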
Because in Python that is a perfectly valid comparison. Python can't know if you really want to make that comparison or if you've just made a mistake. It's your job to supply Python with the right objects to compare.
Because of the dynamic nature of Python you can compare and sort almost everything with almost everything (this is a feature). You've compared a function to a float in this case.
An example:
list = ["b","a",0,1, random.random, random.random()]
print sorted(list)
This will give the following output:
[0, 0.89329568818188976, 1, <built-in method random of Random object at 0x8c6d66c>, 'a', 'b']
I think python allows this because the random.random object could be overriding the > operator by including a __gt__ method in the object, which might accept or even expect a number. So, python thinks you know what you are doing... and does not report it.
If you check for it, you can see that __gt__ exists for random.random...
>>> random.random.__gt__
<method-wrapper '__gt__' of builtin_function_or_method object at 0xb765c06c>
But, that might not be something you want to do.
I read somewhere that functions should always return only one type, so the following code is considered bad code:
def x(foo):
    if 'bar' in foo:
        return (foo, 'bar')
    return None
I guess the better solution would be
def x(foo):
    if 'bar' in foo:
        return (foo, 'bar')
    return ()
Wouldn't it be cheaper memory-wise to return None than to create a new empty tuple, or is this difference too small to notice even in larger projects?
Why should functions return values of a consistent type? To meet the following two rules.
Rule 1 -- a function has a "type" -- inputs mapped to outputs. It must return a consistent type of result, or it isn't a function. It's a mess.
Mathematically, we say some function, F, is a mapping from domain, D, to range, R. F: D -> R. The domain and range form the "type" of the function. The input types and the result type are as essential to the definition of the function as is the name or the body.
Rule 2 -- when you have a "problem" or can't return a proper result, raise an exception.
def x(foo):
    if 'bar' in foo:
        return (foo, 'bar')
    raise Exception("oh, dear me.")
You can break the above rules, but the cost in long-term maintainability and comprehensibility is astronomical.
"Wouldn't it be cheaper memory-wise to return None?" Wrong question.
The point is not to optimize memory at the cost of clear, readable, obvious code.
It's not so clear that a function must always return objects of a limited type, or that returning None is wrong. For instance, re.search can return a _sre.SRE_Match object or a NoneType object:
import re
match=re.search('a','a')
type(match)
# <type '_sre.SRE_Match'>
match=re.search('a','b')
type(match)
# <type 'NoneType'>
Designed this way, you can test for a match with the idiom
if match:
    # do xyz
If the developers had required re.search to return a _sre.SRE_Match object, then the idiom would have to change to
if match.group(1) is None:
    # do xyz
There would not be any major gain by requiring re.search to always return a _sre.SRE_Match object.
So I think how you design the function must depend on the situation and in particular, how you plan to use the function.
Also note that both _sre.SRE_Match and NoneType are instances of object, so in a broad sense they are of the same type. So the rule that "functions should always return only one type" is rather meaningless.
Having said that, there is a beautiful simplicity to functions that return objects which all share the same properties. (Duck typing, not static typing, is the python way!) It can allow you to chain together functions: foo(bar(baz)) and know with certainty the type of object you'll receive at the other end.
This can help you check the correctness of your code. By requiring that a function returns only objects of a certain limited type, there are fewer cases to check. "foo always returns an integer, so as long as an integer is expected everywhere I use foo, I'm golden..."
Best practice in what a function should return varies greatly from language to language, and even between different Python projects.
For Python in general, I agree with the premise that returning None is bad if your function generally returns an iterable, because iterating without testing becomes impossible. Just return an empty iterable in this case, it will still test False if you use Python's standard truth testing:
ret_val = x()
if ret_val:
    do_stuff(ret_val)
and still allow you to iterate over it without testing:
for child in x():
    do_other_stuff(child)
For functions that are likely to return a single value, I think returning None is perfectly acceptable, just document that this might happen in your docstring.
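A small sketch of that suggestion, using a docstring plus an Optional annotation for the question's function (the annotation is my addition, not part of the original answer):
from typing import Optional, Tuple

def x(foo: list) -> Optional[Tuple[list, str]]:
    """Return (foo, 'bar') if 'bar' is in foo; otherwise return None."""
    if 'bar' in foo:
        return (foo, 'bar')
    return None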
Here are my thoughts on all that and I'll try to also explain why I think that the accepted answer is mostly incorrect.
First of all programming functions != mathematical functions. The closest you can get to mathematical functions is if you do functional programming but even then there are plenty of examples that say otherwise.
Functions do not have to have input
Functions do not have to have output
Functions do not have to map input to output (because of the previous two bullet points)
A function in terms of programming is to be viewed simply as a block of memory with a start (the function's entry point), a body (empty or otherwise) and an exit point (one or multiple, depending on the implementation), all of which are there for the purpose of reusing code that you've written. Even if you don't see it, a function always "returns" something. This something is actually the address of the next statement right after the function call. This is something you will see in all its glory if you do some really low-level programming with an assembly language (I dare you to go the extra mile and write some machine code by hand, like Linus Torvalds, who ever so often mentions this during his seminars and interviews :D). In addition, a function can also take some input and spit out some output. That is why
def foo():
    pass
is a perfectly correct piece of code.
So why would returning multiple types be bad? Well...It isn't at all unless you abuse it. This is of course a matter of poor programming skills and/or not knowing what the language you're using can do.
Wouldn't it be cheaper memory wise to return a None then to create a new empty tuple or is this time difference too small to notice even in larger projects?
As far as I know - yes, returning a NoneType object would be much cheaper memory-wise. Here is a small experiment (the values shown are in bytes):
>>> import sys
>>> sys.getsizeof(None)
16
>>> sys.getsizeof(())
48
Based on the type of object you are using as your return value (numeric type, list, dictionary, tuple etc.) Python manages the memory in different ways including the initially reserved storage.
However you have to also consider the code that is around the function call and how it handles whatever your function returns. Do you check for NoneType? Or do you simply check if the returned tuple has length of 0? This propagation of the returned value and its type (NoneType vs. empty tuple in your case) might actually be more tedious to handle and blow up in your face. Don't forget - the code itself is loaded into memory so if handling the NoneType requires too much code (even small pieces of code but in a large quantity) better leave the empty tuple, which will also avoid confusion in the minds of people using your function and forgetting that it actually returns 2 types of values.
Speaking of returning multiple types of value, this is the part where I agree with the accepted answer (but only partially) - returning a single type makes the code more maintainable without a doubt. It's much easier to check only for type A than for A, B, C, ... etc.
However, Python is an object-oriented language, and as such inheritance, abstract classes, etc., and all the rest of the OOP shenanigans come into play. It can go as far as generating classes on the fly, which I discovered a few months ago and was stunned by (I had never seen that stuff in C/C++).
Side note: You can read a little bit about metaclasses and dynamic classes in this nice overview article with plenty of examples.
There are in fact multiple design patterns and techniques that wouldn't even exist without so-called polymorphic functions. Below I give you two very popular topics (I can't find a better way to summarize both in a single term):
Duck typing - often part of the dynamic typing languages which Python is a representative of
Factory method design pattern - basically a function that returns various objects based on the input it receives (see the toy sketch after this list).
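A toy sketch of the factory idea (these classes and names are made up purely for illustration):
class Dog:
    def speak(self): return 'woof'

class Cat:
    def speak(self): return 'meow'

def make_pet(kind):
    return {'dog': Dog, 'cat': Cat}[kind]()   # which class gets instantiated depends on the input

print(make_pet('dog').speak())   # woof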
Finally whether your function returns one or multiple types is totally based on the problem you have to solve. Can this polymorphic behaviour be abused? Sure, like everything else.
I personally think it is perfectly fine for a function to return a tuple or None. However, a function should return at most two different types, and the second one should be None. A function should never return, say, a string in some cases and a list in others.
If x is called like this
foo, bar = x(foo)
returning None would result in a
TypeError: 'NoneType' object is not iterable
if 'bar' is not in foo.
Example
def x(foo):
    if 'bar' in foo:
        return (foo, 'bar')
    return None

foo, bar = x(["foo", "bar", "baz"])
print foo, bar

foo, bar = x(["foo", "NOT THERE", "baz"])
print foo, bar
This results in:
['foo', 'bar', 'baz'] bar
Traceback (most recent call last):
File "f.py", line 9, in <module>
foo, bar = x(["foo", "NOT THERE", "baz"])
TypeError: 'NoneType' object is not iterable
Premature optimization is the root of all evil. The minuscule efficiency gains might be important, but not until you've proven that you need them.
Whatever your language: a function is defined once, but tends to be used at any number of places. Having a consistent return type (not to mention documented pre- and postconditions) means you have to spend more effort defining the function, but you simplify the usage of the function enormously. Guess whether the one-time costs tend to outweigh the repeated savings...?