This question already has answers here:
Why does comparing strings using either '==' or 'is' sometimes produce a different result?
(15 answers)
Closed 9 years ago.
I noticed a Python script I was writing was acting squirrelly, and traced it to an infinite loop, where the loop condition was while line is not ''. Running through it in the debugger, it turned out that line was in fact ''. When I changed it to !='' rather than is not '', it worked fine.
Also, is it generally considered better to just use '==' by default, even when comparing int or Boolean values? I've always liked to use 'is' because I find it more aesthetically pleasing and pythonic (which is how I fell into this trap...), but I wonder if it's intended to just be reserved for when you care about finding two objects with the same id.
For all built-in Python objects (like
strings, lists, dicts, functions,
etc.), if x is y, then x==y is also
True.
Not always. NaN is a counterexample. But usually, identity (is) implies equality (==). The converse is not true: Two distinct objects can have the same value.
Also, is it generally considered better to just use '==' by default, even
when comparing int or Boolean values?
You use == when comparing values and is when comparing identities.
When comparing ints (or immutable types in general), you pretty much always want the former. There's an optimization that allows small integers to be compared with is, but don't rely on it.
For boolean values, you shouldn't be doing comparisons at all. Instead of:
if x == True:
# do something
write:
if x:
# do something
For comparing against None, is None is preferred over == None.
I've always liked to use 'is' because
I find it more aesthetically pleasing
and pythonic (which is how I fell into
this trap...), but I wonder if it's
intended to just be reserved for when
you care about finding two objects
with the same id.
Yes, that's exactly what it's for.
I would like to show a little example on how is and == are involved in immutable types. Try that:
a = 19998989890
b = 19998989889 +1
>>> a is b
False
>>> a == b
True
is compares two objects in memory, == compares their values. For example, you can see that small integers are cached by Python:
c = 1
b = 1
>>> b is c
True
You should use == when comparing values and is when comparing identities. (Also, from an English point of view, "equals" is different from "is".)
The logic is not flawed. The statement
if x is y then x==y is also True
should never be read to mean
if x==y then x is y
It is a logical error on the part of the reader to assume that the converse of a logic statement is true. See http://en.wikipedia.org/wiki/Converse_(logic)
See This question
Your logic in reading
For all built-in Python objects (like
strings, lists, dicts, functions,
etc.), if x is y, then x==y is also
True.
is slightly flawed.
If is applies then == will be True, but it does NOT apply in reverse. == may yield True while is yields False.
Related
Two string variables are set to the same value. s1 == s2 always returns True, but s1 is s2 sometimes returns False.
If I open my Python interpreter and do the same is comparison, it succeeds:
>>> s1 = 'text'
>>> s2 = 'text'
>>> s1 is s2
True
Why is this?
is is identity testing, and == is equality testing. What happens in your code would be emulated in the interpreter like this:
>>> a = 'pub'
>>> b = ''.join(['p', 'u', 'b'])
>>> a == b
True
>>> a is b
False
So, no wonder they're not the same, right?
In other words: a is b is the equivalent of id(a) == id(b)
Other answers here are correct: is is used for identity comparison, while == is used for equality comparison. Since what you care about is equality (the two strings should contain the same characters), in this case the is operator is simply wrong and you should be using == instead.
The reason is works interactively is that (most) string literals are interned by default. From Wikipedia:
Interned strings speed up string
comparisons, which are sometimes a
performance bottleneck in applications
(such as compilers and dynamic
programming language runtimes) that
rely heavily on hash tables with
string keys. Without interning,
checking that two different strings
are equal involves examining every
character of both strings. This is
slow for several reasons: it is
inherently O(n) in the length of the
strings; it typically requires reads
from several regions of memory, which
take time; and the reads fills up the
processor cache, meaning there is less
cache available for other needs. With
interned strings, a simple object
identity test suffices after the
original intern operation; this is
typically implemented as a pointer
equality test, normally just a single
machine instruction with no memory
reference at all.
So, when you have two string literals (words that are literally typed into your program source code, surrounded by quotation marks) in your program that have the same value, the Python compiler will automatically intern the strings, making them both stored at the same memory location. (Note that this doesn't always happen, and the rules for when this happens are quite convoluted, so please don't rely on this behavior in production code!)
Since in your interactive session both strings are actually stored in the same memory location, they have the same identity, so the is operator works as expected. But if you construct a string by some other method (even if that string contains exactly the same characters), then the string may be equal, but it is not the same string -- that is, it has a different identity, because it is stored in a different place in memory.
The is keyword is a test for object identity while == is a value comparison.
If you use is, the result will be true if and only if the object is the same object. However, == will be true any time the values of the object are the same.
One last thing to note is you may use the sys.intern function to ensure that you're getting a reference to the same string:
>>> from sys import intern
>>> a = intern('a')
>>> a2 = intern('a')
>>> a is a2
True
As pointed out in previous answers, you should not be using is to determine equality of strings. But this may be helpful to know if you have some kind of weird requirement to use is.
Note that the intern function used to be a built-in on Python 2, but it was moved to the sys module in Python 3.
is is identity testing and == is equality testing. This means is is a way to check whether two things are the same things, or just equivalent.
Say you've got a simple person object. If it is named 'Jack' and is '23' years old, it's equivalent to another 23-year-old Jack, but it's not the same person.
class Person(object):
def __init__(self, name, age):
self.name = name
self.age = age
def __eq__(self, other):
return self.name == other.name and self.age == other.age
jack1 = Person('Jack', 23)
jack2 = Person('Jack', 23)
jack1 == jack2 # True
jack1 is jack2 # False
They're the same age, but they're not the same instance of person. A string might be equivalent to another, but it's not the same object.
This is a side note, but in idiomatic Python, you will often see things like:
if x is None:
# Some clauses
This is safe, because there is guaranteed to be one instance of the Null Object (i.e., None).
If you're not sure what you're doing, use the '=='.
If you have a little more knowledge about it you can use 'is' for known objects like 'None'.
Otherwise, you'll end up wondering why things doesn't work and why this happens:
>>> a = 1
>>> b = 1
>>> b is a
True
>>> a = 6000
>>> b = 6000
>>> b is a
False
I'm not even sure if some things are guaranteed to stay the same between different Python versions/implementations.
From my limited experience with Python, is is used to compare two objects to see if they are the same object as opposed to two different objects with the same value. == is used to determine if the values are identical.
Here is a good example:
>>> s1 = u'public'
>>> s2 = 'public'
>>> s1 is s2
False
>>> s1 == s2
True
s1 is a Unicode string, and s2 is a normal string. They are not the same type, but they are the same value.
I think it has to do with the fact that, when the 'is' comparison evaluates to false, two distinct objects are used. If it evaluates to true, that means internally it's using the same exact object and not creating a new one, possibly because you created them within a fraction of 2 or so seconds and because there isn't a large time gap in between it's optimized and uses the same object.
This is why you should be using the equality operator ==, not is, to compare the value of a string object.
>>> s = 'one'
>>> s2 = 'two'
>>> s is s2
False
>>> s2 = s2.replace('two', 'one')
>>> s2
'one'
>>> s2 is s
False
>>>
In this example, I made s2, which was a different string object previously equal to 'one' but it is not the same object as s, because the interpreter did not use the same object as I did not initially assign it to 'one', if I had it would have made them the same object.
The == operator tests value equivalence. The is operator tests object identity, and Python tests whether the two are really the same object (i.e., live at the same address in memory).
>>> a = 'banana'
>>> b = 'banana'
>>> a is b
True
In this example, Python only created one string object, and both a and b refers to it. The reason is that Python internally caches and reuses some strings as an optimization. There really is just a string 'banana' in memory, shared by a and b. To trigger the normal behavior, you need to use longer strings:
>>> a = 'a longer banana'
>>> b = 'a longer banana'
>>> a == b, a is b
(True, False)
When you create two lists, you get two objects:
>>> a = [1, 2, 3]
>>> b = [1, 2, 3]
>>> a is b
False
In this case we would say that the two lists are equivalent, because they have the same elements, but not identical, because they are not the same object. If two objects are identical, they are also equivalent, but if they are equivalent, they are not necessarily identical.
If a refers to an object and you assign b = a, then both variables refer to the same object:
>>> a = [1, 2, 3]
>>> b = a
>>> b is a
True
Reference: Think Python 2e by Allen B. Downey
I believe that this is known as "interned" strings. Python does this, so does Java, and so do C and C++ when compiling in optimized modes.
If you use two identical strings, instead of wasting memory by creating two string objects, all interned strings with the same contents point to the same memory.
This results in the Python "is" operator returning True because two strings with the same contents are pointing at the same string object. This will also happen in Java and in C.
This is only useful for memory savings though. You cannot rely on it to test for string equality, because the various interpreters and compilers and JIT engines cannot always do it.
Actually, the is operator checks for identity and == operator checks for equality.
From the language reference:
Types affect almost all aspects of object behavior. Even the importance of object identity is affected in some sense: for immutable types, operations that compute new values may actually return a reference to any existing object with the same type and value, while for mutable objects this is not allowed. E.g., after a = 1; b = 1, a and b may or may not refer to the same object with the value one, depending on the implementation, but after c = []; d = [], c and d are guaranteed to refer to two different, unique, newly created empty lists. (Note that c = d = [] assigns the same object to both c and d.)
So from the above statement we can infer that the strings, which are immutable types, may fail when checked with "is" and may succeed when checked with "is".
The same applies for int and tuple which are also immutable types.
is will compare the memory location. It is used for object-level comparison.
== will compare the variables in the program. It is used for checking at a value level.
is checks for address level equivalence
== checks for value level equivalence
is is identity testing and == is equality testing (see the Python documentation).
In most cases, if a is b, then a == b. But there are exceptions, for example:
>>> nan = float('nan')
>>> nan is nan
True
>>> nan == nan
False
So, you can only use is for identity tests, never equality tests.
The basic concept, we have to be clear, while approaching this question, is to understand the difference between is and ==.
"is" is will compare the memory location. if id(a)==id(b), then a is b returns true else it returns false.
So, we can say that is is used for comparing memory locations. Whereas,
== is used for equality testing which means that it just compares only the resultant values. The below shown code may acts as an example to the above given theory.
Code
In the case of string literals (strings without getting assigned to variables), the memory address will be same as shown in the picture. so, id(a)==id(b). The remaining of this is self-explanatory.
This question already has answers here:
How is the 'is' keyword implemented in Python?
(11 answers)
Closed 8 years ago.
s="wall"
d="WALL".lower()
s is d returns False.
s and d are having the same string. But Why it returned False?
== tests for equality. a == b tests whether a and b have the same value.
is tests for identity — that is, a is b tests whether a and b are in fact the same object. Just don't use it except to test for is None.
You seem to be misunderstanding the is operator. This operator returns true if the two variables in question are located at the same memory location. In this instance, although you have two variables both storing the value "wall", they are still both distinct in that each has its own copy of the word.
In order to properly check String equality, you should use the == operator, which is value equality checking.
"is" keyword compares the objects IDs, not just whether the value is equal.
It's not the same as '===' operator in many other languages.
'is' is equivalent to:
id(s) == id(d)
There are some even more interesting cases. For example (in CPython):`
a = 5
b = 5
a is b # equals to True
but:
c = 1200
d = 1200
c is d # equals to False
The conclusion is: don't use 'is' to compare values as it can lead to confusion.
By using is your are comparing their identities, if you try print s == d, which compares their values, you will get true.
Check this post for more details: String comparison in Python: is vs. ==
Use == to compare objects for equality. Use is to check if two variables reference the exact same objects. Here s and d refer to two distinct string objects with identical contents, so == is the correct operator to use.
This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Python “is” operator behaves unexpectedly with integers
I'm learning Python, and am curious as to why:
x = 500
x is 500
returns False, but:
y = 100
y is 100
returns True?
Python reuses small integers. That is, all 1s (for example) are the same 1 object. The range is -5 to 255, if I remember correctly, though this is a CPython implementation detail that should not be relied upon. I am pretty sure Jython and IronPython, for example, handle this differently.
The reason this works out fine is that ints are immutable. That is, you can't change a 4 to a 5 in-place. if a has a value of 4, a = 5 is actually pointing a to a different object, not changing the value a contains. Python doesn't share any mutable types (such as lists) where unexpectedly having multiple references to the same object might cause problems.
You should use == for comparing most things. is is for checking to see whether two references point to the same object; it is roughly equivalent to id(x) == id(y).
is tests for identity - x is y asks if they are the same object, not if they are simply 'equivalent'. So you also have, eg:
>>> x = []
>>> y = []
>>> z = x
>>> x is y
False
>>> x is z
True
For equivalence, you want to test equality:
>>> x = 500
>>> x == 500
True
Python (or, at least, cpython - the major implementation) does some optimisations so that certain immutable objects only exist once throughout the lifetime of the interpreter. So, every 5 throughout your program will be the same integer object. The same thing happens with string literals, for example.
"is" compare objects IDs and "==" will compare object values. So, if you need to compare values, go with "==" and if you whant to compare objects, go with "is".
As in Python everything is an object, is compares objects IDs, it's faster, but some times unpredictable. You need to be very sure of what you are doing to use "is" for simple comparsion.
About the situation above, I found here: http://docs.python.org/c-api/int.html the following remark:
The current implementation keeps an array of integer objects for all
integers between -5 and 256, when you create an int in that range you
actually just get back a reference to the existing object. So it
should be possible to change the value of 1. I suspect the behaviour
of Python in this case is undefined. :-)
So, you can do the following test and see this behaviour:
>>> a = 256
>>> id(a)
19707932
>>> id(256)
19707932
>>> a = 257
>>> id(a)
26286076
>>> id(257)
26286064
So, for integers above 256, "is" will not work. Be careful using "is" for comparsion.
This question already has answers here:
Closed 12 years ago.
Possible Duplicates:
Types for which “is” keyword may be equivalent to equality operator in Python
Python “is” operator behaves unexpectedly with integers
Hi.
I have a question which perhaps might enlighten me on more than what I am asking.
Consider this:
>>> x = 'Hello'
>>> y = 'Hello'
>>> x == y
True
>>> x is y
True
I have always used the comparison operator. Also I read that is compares the memory address and hence in this case, returns True
So my question is, is this another way to compare variables in Python? If yes, then why is this not used?
Also I noticed that in C++, if the variables have the same value, their memory addresses are different.
{ int x = 40; int y = 40; cout << &x, &y; }
0xbfe89638, 0xbfe89634
What is the reason for Python having the same memory addresses?
This is an implementation detail and absolutely not to be relied upon. is compares identities, not values. Short strings are interned, so they map to the same memory address, but this doesn't mean you should compare them with is. Stick to ==.
There are two ways to check for equality in Python: == and is. == will check the value, while is will check the identity. In almost every case, if is is true, then == must be true.
Sometimes, Python (specifically, CPython) will optimize values together so that they have the same identity. This is especially true for short strings. Python realizes that 'Hello' is the same as 'Hello' and since strings are immutable, they become the same through string interning / string pooling.
See a related question: Python: Why does ("hello" is "hello") evaluate as True?
This is because of a Python feature called String interning which is a method of storing only one copy of each distinct string value.
In Python both strings and integers are immutable therefore you can cache them. Integers in the range of ´-5´ to ´256´ and small strings(don't know the exact size atm) get cached, therefore they are the same object. x and y are only names that refer to these objects.
Also == compares for equals values, while is compares for object identity. None True and False are global objects, for example you can rebind False to True.
The following shows that not every thing is being cached:
x = 'Test' * 2000
y = 'Test' * 2000
>>> x == y
True
>>> x is y
False
>>> x = 10000000000000
>>> y = 10000000000000
>>> x == y
True
>>> x is y
False
In Python, variables are just names that point to some object (and they can point to the same object). In C++, variables also define the actual memory that is reserved for them; this is why they have distinct memory addresses.
About Python string interning and differences between the two comparison operators, see carl's response.
Two string variables are set to the same value. s1 == s2 always returns True, but s1 is s2 sometimes returns False.
If I open my Python interpreter and do the same is comparison, it succeeds:
>>> s1 = 'text'
>>> s2 = 'text'
>>> s1 is s2
True
Why is this?
is is identity testing, and == is equality testing. What happens in your code would be emulated in the interpreter like this:
>>> a = 'pub'
>>> b = ''.join(['p', 'u', 'b'])
>>> a == b
True
>>> a is b
False
So, no wonder they're not the same, right?
In other words: a is b is the equivalent of id(a) == id(b)
Other answers here are correct: is is used for identity comparison, while == is used for equality comparison. Since what you care about is equality (the two strings should contain the same characters), in this case the is operator is simply wrong and you should be using == instead.
The reason is works interactively is that (most) string literals are interned by default. From Wikipedia:
Interned strings speed up string
comparisons, which are sometimes a
performance bottleneck in applications
(such as compilers and dynamic
programming language runtimes) that
rely heavily on hash tables with
string keys. Without interning,
checking that two different strings
are equal involves examining every
character of both strings. This is
slow for several reasons: it is
inherently O(n) in the length of the
strings; it typically requires reads
from several regions of memory, which
take time; and the reads fills up the
processor cache, meaning there is less
cache available for other needs. With
interned strings, a simple object
identity test suffices after the
original intern operation; this is
typically implemented as a pointer
equality test, normally just a single
machine instruction with no memory
reference at all.
So, when you have two string literals (words that are literally typed into your program source code, surrounded by quotation marks) in your program that have the same value, the Python compiler will automatically intern the strings, making them both stored at the same memory location. (Note that this doesn't always happen, and the rules for when this happens are quite convoluted, so please don't rely on this behavior in production code!)
Since in your interactive session both strings are actually stored in the same memory location, they have the same identity, so the is operator works as expected. But if you construct a string by some other method (even if that string contains exactly the same characters), then the string may be equal, but it is not the same string -- that is, it has a different identity, because it is stored in a different place in memory.
The is keyword is a test for object identity while == is a value comparison.
If you use is, the result will be true if and only if the object is the same object. However, == will be true any time the values of the object are the same.
One last thing to note is you may use the sys.intern function to ensure that you're getting a reference to the same string:
>>> from sys import intern
>>> a = intern('a')
>>> a2 = intern('a')
>>> a is a2
True
As pointed out in previous answers, you should not be using is to determine equality of strings. But this may be helpful to know if you have some kind of weird requirement to use is.
Note that the intern function used to be a built-in on Python 2, but it was moved to the sys module in Python 3.
is is identity testing and == is equality testing. This means is is a way to check whether two things are the same things, or just equivalent.
Say you've got a simple person object. If it is named 'Jack' and is '23' years old, it's equivalent to another 23-year-old Jack, but it's not the same person.
class Person(object):
def __init__(self, name, age):
self.name = name
self.age = age
def __eq__(self, other):
return self.name == other.name and self.age == other.age
jack1 = Person('Jack', 23)
jack2 = Person('Jack', 23)
jack1 == jack2 # True
jack1 is jack2 # False
They're the same age, but they're not the same instance of person. A string might be equivalent to another, but it's not the same object.
This is a side note, but in idiomatic Python, you will often see things like:
if x is None:
# Some clauses
This is safe, because there is guaranteed to be one instance of the Null Object (i.e., None).
If you're not sure what you're doing, use the '=='.
If you have a little more knowledge about it you can use 'is' for known objects like 'None'.
Otherwise, you'll end up wondering why things doesn't work and why this happens:
>>> a = 1
>>> b = 1
>>> b is a
True
>>> a = 6000
>>> b = 6000
>>> b is a
False
I'm not even sure if some things are guaranteed to stay the same between different Python versions/implementations.
From my limited experience with Python, is is used to compare two objects to see if they are the same object as opposed to two different objects with the same value. == is used to determine if the values are identical.
Here is a good example:
>>> s1 = u'public'
>>> s2 = 'public'
>>> s1 is s2
False
>>> s1 == s2
True
s1 is a Unicode string, and s2 is a normal string. They are not the same type, but they are the same value.
I think it has to do with the fact that, when the 'is' comparison evaluates to false, two distinct objects are used. If it evaluates to true, that means internally it's using the same exact object and not creating a new one, possibly because you created them within a fraction of 2 or so seconds and because there isn't a large time gap in between it's optimized and uses the same object.
This is why you should be using the equality operator ==, not is, to compare the value of a string object.
>>> s = 'one'
>>> s2 = 'two'
>>> s is s2
False
>>> s2 = s2.replace('two', 'one')
>>> s2
'one'
>>> s2 is s
False
>>>
In this example, I made s2, which was a different string object previously equal to 'one' but it is not the same object as s, because the interpreter did not use the same object as I did not initially assign it to 'one', if I had it would have made them the same object.
The == operator tests value equivalence. The is operator tests object identity, and Python tests whether the two are really the same object (i.e., live at the same address in memory).
>>> a = 'banana'
>>> b = 'banana'
>>> a is b
True
In this example, Python only created one string object, and both a and b refers to it. The reason is that Python internally caches and reuses some strings as an optimization. There really is just a string 'banana' in memory, shared by a and b. To trigger the normal behavior, you need to use longer strings:
>>> a = 'a longer banana'
>>> b = 'a longer banana'
>>> a == b, a is b
(True, False)
When you create two lists, you get two objects:
>>> a = [1, 2, 3]
>>> b = [1, 2, 3]
>>> a is b
False
In this case we would say that the two lists are equivalent, because they have the same elements, but not identical, because they are not the same object. If two objects are identical, they are also equivalent, but if they are equivalent, they are not necessarily identical.
If a refers to an object and you assign b = a, then both variables refer to the same object:
>>> a = [1, 2, 3]
>>> b = a
>>> b is a
True
Reference: Think Python 2e by Allen B. Downey
I believe that this is known as "interned" strings. Python does this, so does Java, and so do C and C++ when compiling in optimized modes.
If you use two identical strings, instead of wasting memory by creating two string objects, all interned strings with the same contents point to the same memory.
This results in the Python "is" operator returning True because two strings with the same contents are pointing at the same string object. This will also happen in Java and in C.
This is only useful for memory savings though. You cannot rely on it to test for string equality, because the various interpreters and compilers and JIT engines cannot always do it.
Actually, the is operator checks for identity and == operator checks for equality.
From the language reference:
Types affect almost all aspects of object behavior. Even the importance of object identity is affected in some sense: for immutable types, operations that compute new values may actually return a reference to any existing object with the same type and value, while for mutable objects this is not allowed. E.g., after a = 1; b = 1, a and b may or may not refer to the same object with the value one, depending on the implementation, but after c = []; d = [], c and d are guaranteed to refer to two different, unique, newly created empty lists. (Note that c = d = [] assigns the same object to both c and d.)
So from the above statement we can infer that the strings, which are immutable types, may fail when checked with "is" and may succeed when checked with "is".
The same applies for int and tuple which are also immutable types.
is will compare the memory location. It is used for object-level comparison.
== will compare the variables in the program. It is used for checking at a value level.
is checks for address level equivalence
== checks for value level equivalence
is is identity testing and == is equality testing (see the Python documentation).
In most cases, if a is b, then a == b. But there are exceptions, for example:
>>> nan = float('nan')
>>> nan is nan
True
>>> nan == nan
False
So, you can only use is for identity tests, never equality tests.
The basic concept, we have to be clear, while approaching this question, is to understand the difference between is and ==.
"is" is will compare the memory location. if id(a)==id(b), then a is b returns true else it returns false.
So, we can say that is is used for comparing memory locations. Whereas,
== is used for equality testing which means that it just compares only the resultant values. The below shown code may acts as an example to the above given theory.
Code
In the case of string literals (strings without getting assigned to variables), the memory address will be same as shown in the picture. so, id(a)==id(b). The remaining of this is self-explanatory.