Boolean identity == True vs is True - python

It is standard convention to use if foo is None rather than if foo == None to test if a value is specifically None.
If you want to determine whether a value is exactly True (not just a true-like value), is there any reason to use if foo == True rather than if foo is True? Does this vary between implementations such as CPython (2.x and 3.x), Jython, PyPy, etc.?
Example: say True is used as a singleton value that you want to differentiate from the value 'bar', or any other true-like value:
if foo is True: # vs foo == True
    ...
elif foo == 'bar':
    ...
Is there a case where using if foo is True would yield different results from if foo == True?
NOTE: I am aware of Python booleans - if x:, vs if x == True, vs if x is True. However, it only addresses whether if foo, if foo == True, or if foo is True should generally be used to determine whether foo has a true-like value.
UPDATE: According to PEP 285 § Specification:
The values False and True will be singletons, like None.

If you want to determine whether a value is exactly True (not just a true-like value), is there any reason to use if foo == True rather than if foo is True?
If you want to make sure that foo really is a boolean and of value True, use the is operator.
Otherwise, if the type of foo implements its own __eq__() that returns a true-ish value when comparing to True, you might end up with an unexpected result.
As a rule of thumb, you should always use is with the built-in constants True, False and None.
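A minimal sketch of the __eq__ pitfall described above. The class AlwaysEqual is hypothetical and exists only for illustration; any type whose __eq__ answers with a truthy value for True would behave the same way.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


foo = AlwaysEqual()

equal_to_true = (foo == True)      # __eq__ is consulted, so this is True
identical_to_true = (foo is True)  # identity check, cannot be overridden

print(equal_to_true, identical_to_true)
```

Only the identity check rejects the impostor object here, which is the reason for preferring is when the value must be exactly True.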
Does this vary between implementations such as CPython (2.x and 3.x), Jython, PyPy, etc.?
In theory, is will be faster than == since the latter must honor types' custom __eq__ implementations, while is can directly compare object identities (e.g., memory addresses).
I don't know the source code of the various Python implementations by heart, but I assume that most of them can optimize that by using some internal flags for the existence of magic methods, so I suspect that you won't notice the speed difference in practice.

Never use is True in combination with numpy (and derivatives such as pandas):
In[1]: import numpy as np
In[2]: a = np.array([1, 2]).any()
In[4]: a is True
Out[4]: False
In[5]: a == True
Out[5]: True
This was unexpected to me as:
In[3]: a
Out[3]: True
I guess the explanation is given by:
In[6]: type(a)
Out[6]: numpy.bool_

Is there any reason to use if foo == True rather than if foo is True?
>>> d = True
>>> d is True
True
>>> d = 1
>>> d is True
False
>>> d == True
True
>>> d = 2
>>> d == True
False
Note that bool is a subclass of int, and that True has the integer value 1. To answer your question, if you want to check that some variable "is exactly True", you have to use the identity operator is. But that's really not Pythonic. May I ask what your real use case is? In other words, why do you want to distinguish between True, 1, and any other truthy value?
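The relationship between bool and int can be checked directly. This is a small sketch; the variable names are only illustrative.

```python
# bool is a subclass of int, and True carries the integer value 1.
is_subclass = issubclass(bool, int)
true_as_int = int(True)
total = True + True  # arithmetic works because True behaves like 1

# Equality follows the integer value; identity does not.
one = 1
equal = (one == True)
identical = (one is True)

print(is_subclass, true_as_int, total, equal, identical)
```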

edit: regarding:
Is there a case where using if foo is True would yield different results from if foo == True?
there is a case, and it's this:
In [24]: 1 is True
Out[24]: False
In [25]: 1 == True
Out[25]: True
Additionally, if you're looking to use a singleton as a sentinel value, you can just create an object:
sentinel_time = object()
def f(snth):
    if snth is sentinel_time:
        print('got em!')
f(sentinel_time)
You don't want to use if var == True:, you really want if var:.
Imagine you have a list. You don't care whether the list is "True" or not, you just want to know whether or not it's empty. So:
l = ['snth']
if l:
    print(l)
Check out this post for what evaluates to False: Evaluation of boolean expressions in Python

Using foo is True instead of foo == True (or just foo) is most of the time not what you want.
I have seen foo is True used for checking that the parameter foo really was a boolean.
It contradicts Python's duck-typing philosophy: in general you should not check for types, and a function that acts differently with True than with other truthy values is counter-intuitive for a programmer who assumes duck typing.
Even if you want to check for types, it is better to do it explicitly, like this:
def myFunction(foo):
    if not isinstance(foo, bool):
        raise ValueError("foo should be a boolean")
>>> myFunction(1)
Exception: ValueError "foo should be a boolean"
For several reasons:
Bool is the only type where the is operator will be equivalent to isinstance(a, bool) and a. The reason for that is the fact that True and False are singletons. In other words, this works because of a poorly known feature of python (especially when some tutorials teach you that True and False are just aliases for 1 and 0).
If you use isinstance and the programmer was not aware that your function did not accept truthy-values, or if they are using numpy and forgot to cast their numpy-boolean to a python-boolean, they will know what they did wrong, and will be able to debug.
Compare with
def myFunction(foo):
    if foo is True:
        doSomething()
    else:
        doSomethingElse()
In this case, myFunction(1) not only does not raise an exception, but probably does the opposite of what it was expected to do. This makes for a hard to find bug in case someone was using a numpy boolean for example.
When should you use is True then?
EDIT: this is bad practice. Starting from 3.8, Python emits a SyntaxWarning when you use is to compare with certain literals such as numbers and strings. See JayDadhania's comment below. In conclusion, is should not be used to compare to literals, only to check whether two names refer to the same object in memory.
Just don't use it. If you need to check for type, use isinstance.
Old paragraph:
Basically, use it only as a shorthand for isinstance(foo, bool) and foo
The only case I see is when you explicitly want to check whether a value is True, and you will also check whether the value is another truthy value later on. Examples include:
if foo is True:
    doSomething()
elif foo is False:
    doSomethingElse()
elif foo is 1: #EDIT: raises a warning, use == 1 instead
    doYetSomethingElse()
else:
    doSomethingElseEntirely()

Here's a test that allows you to see the difference between the 3 forms of testing for True:
for test in ([], [1], 0, 1, 2):
    print(repr(test), 'T' if test else 'F', 'T' if test == True else 'F', 'T' if test is True else 'F')
[] F F F
[1] T F F
0 F F F
1 T T F
2 T F F
As you can see there are cases where all of them deliver different results.

Most of the time, you should not care about a detail like this. Either you already know that foo is a boolean (and you can thus use if foo), or you know that foo is something else (in which case there's no need to test). If you don't know the types of your variables, you may want to refactor your code.
But if you really need to be sure it is exactly True and nothing else, use is. Using == will give you 1 == True.

Related

SQLAlchemy Core select where condition contains boolean expression `is False`

How can I use the SQLAlchemy expression language to select columns with a where condition that checks a boolean expression? Example:
select([table]).\
    where(and_(table.c.col1 == 'abc',
               table.c.is_num is False
               ))
This doesn't give a syntax error, but evaluates the condition incorrectly. I cannot use == False, which gives an error. SQLAlchemy Core v.1.0.8
The identity comparison operator is cannot be overloaded in Python, so
table.c.is_num is False
compares the identities of the Column object and False, and since they're clearly not the same object, evaluates to False. By
I cannot use == False which gives error
you probably mean that some Python linter adhering to PEP-8 gives you a warning. Checking equality against True or False is still valid Python, though unpythonic in the general sense – but it does make sense in SQLAlchemy filters and it is used in the docs. For example:
In [5]: t.c.bar == False
Out[5]: <sqlalchemy.sql.elements.BinaryExpression object at 0x7fdc355a1da0>
In [6]: print(_)
foo.bar = false
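To see why == can produce an expression object while is cannot, here is a toy sketch in plain Python. ToyColumn and ToyExpression are hypothetical stand-ins written for this illustration; they are not SQLAlchemy classes and do not reflect how SQLAlchemy is actually implemented.

```python
class ToyExpression:
    """Hypothetical stand-in for a SQL expression object."""

    def __init__(self, text):
        self.text = text


class ToyColumn:
    """Hypothetical column type; not SQLAlchemy's real implementation."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # '==' is routed here, so it can return an expression instead of a bool.
        return ToyExpression(f"{self.name} = {str(other).lower()}")

    def __invert__(self):
        # '~' is routed here as well.
        return ToyExpression(f"NOT {self.name}")


is_num = ToyColumn("is_num")

eq_result = (is_num == False)  # a ToyExpression describing the comparison
not_result = ~is_num           # a ToyExpression describing the negation
is_result = (is_num is False)  # plain identity check: always a bool

print(eq_result.text)
print(not_result.text)
print(is_result)
```

The identity check never reaches any method on the column, so it can only ever yield a plain bool that the query builder has no way to translate.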
But: instead of comparing a boolean to a boolean you could use the value itself:
select([table]).\
    where(and_(table.c.col1 == 'abc',
               ~table.c.is_num
               ))
which would translate to (approximately):
SELECT ... FROM table WHERE col1 = 'abc' AND NOT is_num
since SQLAlchemy ColumnOperators overload the __invert__ to not_(). Some backends may not support a boolean type, but SQLAlchemy handles the conversion:
In [6]: print((~t.c.bar).compile(dialect=sqlite.dialect()))
foo.bar = 0
According to the documentation, the way you should handle this is by using the true() or false() constants that you can import from SQLAlchemy. It would look like this:
from sqlalchemy import false
select([table]).\
    where(and_(table.c.col1 == 'abc',
               table.c.is_num == false()
               ))
Hope this helps!

How does Python operator overloading work

I can understand that some languages allow users to do operator overloading. I first learned about this in C++. But C++ also has some restrictions on operator overloading, and I think that's reasonable.
But when I came to Python's pandas library, I started to get confused.
Take a look at my code at nbviewer.jupyter.org
complaints['Complaint Type'] == "Noise - Street/Sidewalk"
doesn't return True or False.
This is crazy to me. Can anyone help me understand this?
1. In Python, can we overload the == operator so that it doesn't return a boolean?
2. If the answer to question 1 is yes, how can I write simple code to demo this?
Some relevant results copied from the link:
>>> complaints['Complaint Type'] == "Noise - Street/Sidewalk"
0 True
1 False
2 False
3 False
4 False
...
111063 False
111064 False
111065 False
111066 True
111067 False
111068 False
Name: Complaint Type, Length: 111069, dtype: bool
You can overload operators if you create your own classes and add a __eq__ method to them.
class MyClass(object):
    def __eq__(self, other):
        # compare self with other, return whatever you need
        return NotImplemented  # placeholder: defers to the default comparison
This will be invoked whenever you compare your type with self == other. It is considered very normal to return a boolean from this function in python, so you might want to have a think about returning anything else if you want your code to make sense to other developers.
See the docs for python 2 on this here
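As a rough demo for the second question, here is a sketch of a class whose == returns a list of booleans, loosely imitating the element-wise behaviour seen in the pandas output. The Vector class and its sample data are made up for this example and are not how pandas implements it.

```python
class Vector:
    """Hypothetical container whose == compares element by element."""

    def __init__(self, values):
        self.values = list(values)

    def __eq__(self, other):
        # Return a list of booleans rather than a single bool.
        return [value == other for value in self.values]


complaints = Vector(["Noise", "Heating", "Noise"])
mask = (complaints == "Noise")
print(mask)
```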

Why isn't "is not" working with ConfigParser.sections() in a list comprehension? [duplicate]

In a comment on this question, I saw a statement that recommended using
result is not None
vs
result != None
What is the difference? And why might one be recommended over the other?
== is an equality test. It checks whether the right hand side and the left hand side are equal objects (according to their __eq__ or __cmp__ methods.)
is is an identity test. It checks whether the right hand side and the left hand side are the very same object. No method calls are made, so objects can't influence the is operation.
You use is (and is not) for singletons, like None, where you don't care about objects that might want to pretend to be None or where you want to protect against objects breaking when being compared against None.
First, let me go over a few terms. If you just want your question answered, scroll down to "Answering your question".
Definitions
Object identity: When you create an object, you can assign it to a variable. You can then also assign it to another variable. And another.
>>> button = Button()
>>> cancel = button
>>> close = button
>>> dismiss = button
>>> print(cancel is close)
True
In this case, cancel, close, and dismiss all refer to the same object in memory. You only created one Button object, and all three variables refer to this one object. We say that cancel, close, and dismiss all refer to identical objects; that is, they refer to one single object.
Object equality: When you compare two objects, you usually don't care whether they refer to the exact same object in memory. With object equality, you can define your own rules for how two objects compare. When you write if a == b:, you are essentially saying if a.__eq__(b):. This lets you define an __eq__ method on a so that you can use your own comparison logic.
Rationale for equality comparisons
Rationale: Two objects have the exact same data, but are not identical. (They are not the same object in memory.)
Example: Strings
>>> greeting = "It's a beautiful day in the neighbourhood."
>>> a = unicode(greeting)
>>> b = unicode(greeting)
>>> a is b
False
>>> a == b
True
Note: I use unicode strings here because Python is smart enough to reuse regular strings without creating new ones in memory.
Here, I have two unicode strings, a and b. They have the exact same content, but they are not the same object in memory. However, when we compare them, we want them to compare equal. What's happening here is that the unicode object has implemented the __eq__ method.
class unicode(object):
    # ...
    def __eq__(self, other):
        if len(self) != len(other):
            return False
        for i, j in zip(self, other):
            if i != j:
                return False
        return True
Note: __eq__ on unicode is definitely implemented more efficiently than this.
Rationale: Two objects have different data, but are considered the same object if some key data is the same.
Example: Most types of model data
>>> import datetime
>>> a = Monitor()
>>> a.make = "Dell"
>>> a.model = "E770s"
>>> a.owner = "Bob Jones"
>>> a.warranty_expiration = datetime.date(2030, 12, 31)
>>> b = Monitor()
>>> b.make = "Dell"
>>> b.model = "E770s"
>>> b.owner = "Sam Johnson"
>>> b.warranty_expiration = datetime.date(2005, 8, 22)
>>> a is b
False
>>> a == b
True
Here, I have two Dell monitors, a and b. They have the same make and model. However, they neither have the same data nor are the same object in memory. However, when we compare them, we want them to compare equal. What's happening here is that the Monitor object implemented the __eq__ method.
class Monitor(object):
    # ...
    def __eq__(self, other):
        return self.make == other.make and self.model == other.model
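A self-contained version of the Monitor example might look like the sketch below. The constructor and the isinstance guard are assumptions added so the snippet runs on its own; the comparison rule (same make and model) is the one described above.

```python
class Monitor:
    """Sketch of the Monitor model; the constructor is an assumption."""

    def __init__(self, make, model, owner):
        self.make = make
        self.model = model
        self.owner = owner

    def __eq__(self, other):
        if not isinstance(other, Monitor):
            return NotImplemented
        return self.make == other.make and self.model == other.model


a = Monitor("Dell", "E770s", "Bob Jones")
b = Monitor("Dell", "E770s", "Sam Johnson")

same_object = a is b   # two separate instances
same_monitor = a == b  # equal by make and model
print(same_object, same_monitor)
```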
Answering your question
When comparing to None, always use is not. None is a singleton in Python - there is only ever one instance of it in memory.
By comparing identity, this can be performed very quickly. Python checks whether the object you're referring to has the same memory address as the global None object - a very, very fast comparison of two numbers.
By comparing equality, Python has to look up whether your object has an __eq__ method. If it does not, it examines each superclass looking for an __eq__ method. If it finds one, Python calls it. This is especially bad if the __eq__ method is slow and doesn't immediately return when it notices that the other object is None.
Did you not implement __eq__? Then Python will probably find the __eq__ method on object and use that instead - which just checks for object identity anyway.
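The fallback to identity can be observed with a class that defines no __eq__ of its own. Plain is a hypothetical name used only for this sketch.

```python
class Plain:
    """Hypothetical class that does not define __eq__."""


first = Plain()
second = Plain()

same_instance_equal = (first == first)         # same object, so equal
different_instances_equal = (first == second)  # distinct objects, so not equal
print(same_instance_equal, different_instances_equal)
```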
When comparing most other things in Python, you will be using !=.
Consider the following:
class Bad(object):
    def __eq__(self, other):
        return True
c = Bad()
c is None # False, equivalent to id(c) == id(None)
c == None # True, equivalent to c.__eq__(None)
None is a singleton, and therefore identity comparison will always work, whereas an object can fake the equality comparison via .__eq__().
>>> () is ()
True
>>> 1 is 1
True
>>> (1,) == (1,)
True
>>> (1,) is (1,)
False
>>> a = (1,)
>>> b = a
>>> a is b
True
Some objects are singletons, and thus is with them is equivalent to ==. Most are not.

Check for initialized variable in Python

I'm new to Python and I'm playing a bit with some code snippets.
In my code I need to check for variable initialization and I was using this idiom:
if my_variable:
    # execute some code
but reading some posts I found this other idiom is used:
if my_variable is not None:
    # execute some code
Are they equivalent or is there some semantic difference?
Quoting Python documentation on boolean operations,
In the context of Boolean operations, and also when expressions are used by control flow statements, the following values are interpreted as false: False, None, numeric zero of all types, and empty strings and containers (including strings, tuples, lists, dictionaries, sets and frozensets). All other values are interpreted as true.
So, if my_variable will fail if my_variable has any of the falsy values mentioned above, whereas the second one will fail only if my_variable is None. Normally, variables are initialized with None as a placeholder value; if a variable is not None at some point in the program, you know that some other value has been assigned to it.
For example,
def print_name(name=None):
    if name is not None:
        print(name)
    else:
        print("Default name")
Here, the function print_name expects one argument. If the user provides it, then it may not be None, so we are printing the actual name passed by the user and if we don't pass anything, by default None will be assigned. Now, we check if name is not None to make sure that we are printing the actual name instead of the Default name.
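The practical difference between the two idioms shows up when a falsy but meaningful value such as an empty string is passed. The two helper functions below are hypothetical variants of print_name that return their result so the behaviour is easy to compare.

```python
def describe_truthy(name=None):
    # Treats every falsy value (None, "", 0, ...) as "not provided".
    return name if name else "Default name"


def describe_not_none(name=None):
    # Treats only None as "not provided".
    return name if name is not None else "Default name"


for value in (None, "", "Alice"):
    print(repr(value), repr(describe_truthy(value)), repr(describe_not_none(value)))
```

With an empty string, the truthiness version falls back to the default while the is not None version keeps the empty string.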
Note: If you really want to know if your variable is defined or not, you might want to try this
try:
    undefined_variable
except NameError as e:
    # Do whatever you want if the variable is not defined yet in the program.
    print(e)
No. if my_variable: would be False if my_variable was actually 0, whereas if my_variable is not None: would be True. The same holds for any falsy value.
In [10]: bool([])
Out[10]: False
In [11]: bool(0)
Out[11]: False
In [12]: bool({})
Out[12]: False
In [13]: [] is not None
Out[13]: True
In [14]: 0 is not None
Out[14]: True
It's worth noting that python variables cannot be uninitialised. Variables in python are created by assignment.
If you want to check for actual uninitialisation, you should check for (non) existence, by catching the NameError exception.
Taking an example of a null string, i.e. '', which is not None:
>>> a = ""
>>> if a:
...     print(True)
...
>>> if a is not None:
...     print(True)
...
True
>>>
And a boolean value:
>>> a = False
>>> if a:
...     print(True)
...
>>> if a is not None:
...     print(True)
...
True
>>>
Thus they are not equivalent.
Check whether the variable exists in the globals dict; if not, initialize it.
if 'ots' not in globals():
    ots = 0.0

if (foo or bar or baz) is None:

I've been refactoring some rather crufty code and came across the following rather odd construct:
#!/usr/bin/env python2.7
# ...
if (opts.foo or opts.bar or opts.baz) is None:
    # (actual option names changed to protect the guilty)
    sys.stderr.write("Some error messages that these are required arguments")
... and I was wondering if this would ever make any conceivable sense.
I changed it to something like:
#!/usr/bin/env python2.7
if None in (opts.foo, opts.bar, opts.baz):
    # ...
I did fire up an interpreter and actually try the first construct ... it only seems to work if the values are all false and the last of these false values is None. (In other words CPython's implementation seems to return the first true or last false value from a chain of or expressions).
I still suspect that the proper code should use either the any() or all() built-ins, which were added in 2.5 (the code in question already requires 2.7). I'm not yet sure which are the preferred/intended semantics as I'm just starting on this project.
So is there any case where this original code would make sense?
The short-circuiting behavior causes foo or bar or baz to return the first of the three values that is boolean-true, or the last value if all are boolean-false. So it basically means "if all are false and the last one is None".
Your changed version is slightly different. if None in (opts.foo, opts.bar, opts.baz) will, for instance, enter the if block if opts.foo is None and the other two are 1, whereas the original version will not (because None or 1 or 1 will evaluate to 1, which is not None). Your version will enter the if when any of the three is None, regardless of what the other two are, whereas the original version will enter the if only if the last is None and the other two are any boolean-false values.
Which of the two versions you want depends on how the rest of the code is structured and what values the options might take (in particular, whether they might have boolean-false values other than None, such as False or 0 or an empty string). Intuitively your version does seem more reasonable, but if the code has peculiar tricks like this in it, you never know what corner cases might emerge.
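The difference between the two versions can be checked with a few sample inputs. The function names and the test cases below are made up for this sketch; plain arguments stand in for the opts attributes.

```python
def original_check(foo, bar, baz):
    # 'or' yields the first truthy operand, or the last operand if none is truthy.
    return (foo or bar or baz) is None


def rewritten_check(foo, bar, baz):
    # True when at least one of the three values is None.
    return None in (foo, bar, baz)


cases = [
    (None, 1, 1),        # one None, others truthy
    (0, "", None),       # all falsy, last is None
    (None, None, 0),     # all falsy, last is not None
    (None, None, None),  # all None
]
for case in cases:
    print(case, original_check(*case), rewritten_check(*case))
```

The two checks agree only when every value is falsy and the last one is None; in the other cases shown, the rewritten version fires and the original does not.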
It behaves that way because or is a short-circuit operator; details are in the docs. Thus, when opts.foo and opts.bar are both falsy, your first if statement is equal to:
if opts.baz is None:
We can only guess what the author of that code expected. I think that, as you mentioned, he thought of using not all([opts.foo, opts.bar, opts.baz]).
I would prefer
if any(i is None for i in (opts.foo, opts.bar, opts.baz)):
as it exactly expresses the intended goal.
OTOH,
not all([opts.foo, opts.bar, opts.baz])
does check for falseness, not for None.
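A short sketch of how the two expressions diverge when a value is falsy but not None. The sample tuple is hypothetical.

```python
values = (0, "", "set")  # hypothetical option values: none is None, two are falsy

any_is_none = any(value is None for value in values)
not_all_truthy = not all(values)

print(any_is_none, not_all_truthy)
```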
The original code doesn't seem to make sense; it seems to have been written by someone unaware of what they were doing.
Let's try both versions of your code:
In [20]: foo = True
In [22]: bar = None
In [23]: baz = None
In [24]: foo or bar or baz
Out[24]: True
In [25]: (foo or bar or baz) is None
Out[25]: False
In [28]: ((foo or bar or baz) is None) == (None in (foo, bar, baz))
Out[28]: False
You can see that your rewrite is not the same as the original code.
Your first condition only returns True when all of your variables are falsy and the last one is None, for example when all of them are None:
In [19]: (None or None or None) is None
Out[19]: True
So, if the intent is to check that all of them are None, you can rewrite your first condition to:
if foo == bar == baz == None:
