Python Lists and Equality

I'm practicing for a midterm, and I came across this:
the_cake = [1,2,[3],4,5]
a_lie = the_cake[1:4]
the_cake = the_cake[1:4]
great = a_lie
delicious = the_cake
moist = great[:-1]
After running this code in the Python interpreter, why is:
the_cake.append == a_lie.append
False
My thought is that they are equal methods, and though not "IS", should fulfill the equality.
Maybe this evaluates to False because of instantiation?
If this is true, then do class attributes evaluate to True when compared?
Is this a special case with list objects?
Follow-up:
According to this:
Is there a difference between `==` and `is` in Python?
"IS will return True if two variables point to the same object, == if the objects referred to by the variables are equal."
Then do methods of the List class point to separate instances of the "append" method?
So if I define a function x(parameter), every time I call it, it'll be the same because it's the same object assigned to different variables, right?
Then for some equivalent variable "parameter":
x(parameter) == x(parameter)
True
Thanks!

Each method is bound to its own list instance, and a bound method compares equal only to another bound method of the same object. For instance, we have:
a = []
b = []
So we have:
>>> a.append == b.append
False
and the objects they are bound to are shown by:
>>> a.append
<built-in method append of list object at 0x7f7c7c97d560>
>>> b.append
<built-in method append of list object at 0x7f7c7c97d908>
Notice the different addresses.

Both answers are valid, but check this out too:
>>> a = []
>>> b = a
>>> a.append == b.append
True
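As a rough illustration of the answers above (a sketch, not taken from the original posts; the name alias is made up here), a bound built-in method exposes the object it is bound to as __self__, and in CPython two such methods compare equal only when they are bound to the very same object:

```python
a = []
b = []
alias = a  # a second name for the same list, not a copy

# A bound built-in method remembers the object it is bound to.
same_self = a.append.__self__ is a

# Bound to two different (though equal) lists: not equal.
different_lists = a.append == b.append

# Bound to one and the same list through two names: equal.
same_list = a.append == alias.append

print(same_self, different_lists, same_list)
```

Note that a == b is True here (both lists are empty), so the method comparison is about which object the method is bound to, not about the contents of the lists.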

Python 2.x: Function objects implement rich comparisons based on the object's address in memory.
Python 3.x: be careful, functions are no longer orderable. So, e.g., the_cake.append > a_lie.append raises a TypeError.
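A minimal sketch of the Python 3 behaviour described above, reusing the lists from the question (the variable ordering_error is only there for illustration):

```python
the_cake = [1, 2, [3], 4, 5]
a_lie = the_cake[1:4]

# == and != still work on bound methods, but ordering comparisons do not.
try:
    the_cake.append > a_lie.append
    ordering_error = None
except TypeError as exc:
    ordering_error = exc

print(type(ordering_error).__name__)
```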

Slicing a list in Python always returns a new list, so the_cake[1:4] returns a new list every time it is evaluated, even if the slice is identical to one taken before.
Although the same slice the_cake[1:4] is assigned both to a_lie and back to the_cake, each name ends up referring to its own new list.
The two lists therefore live at different memory locations. If you check id(the_cake) == id(a_lie), it returns False.
When you then look up append on both instances, the results differ as well. The same method is being referred to, but through two different instances, so each lookup produces a method object bound to its own list. Therefore the_cake.append is not equal to a_lie.append.
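The points made in this answer can be checked directly against the code from the question. This is only a sketch; the result names (equal_values and so on) are invented for the example:

```python
the_cake = [1, 2, [3], 4, 5]
a_lie = the_cake[1:4]
the_cake = the_cake[1:4]

# Both slices hold the same elements, so they compare equal...
equal_values = the_cake == a_lie

# ...but each slice expression built its own list object.
same_object = the_cake is a_lie

# Slices are shallow copies: the inner list [3] is shared by both.
shared_inner = the_cake[1] is a_lie[1]

# The append methods are bound to different lists, hence not equal.
methods_equal = the_cake.append == a_lie.append

print(equal_values, same_object, shared_inner, methods_equal)
```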

Related

What does the operator id() return in Python3? [duplicate]

I read the Python 2 docs and noticed the id() function:
Return the “identity” of an object. This is an integer (or long integer) which is guaranteed to be unique and constant for this object during its lifetime. Two objects with non-overlapping lifetimes may have the same id() value.
CPython implementation detail: This is the address of the object in memory.
So, I experimented by using id() with a list:
>>> list = [1,2,3]
>>> id(list[0])
31186196
>>> id(list[1])
31907092 # increased by 720896
>>> id(list[2])
31907080 # decreased by 12
What is the integer returned from the function? Is it synonymous to memory addresses in C? If so, why doesn't the integer correspond to the size of the data type?
When is id() used in practice?

Your post asks several questions:
What is the number returned from the function?
It is "an integer (or long integer) which is guaranteed to be unique and constant for this object during its lifetime." (Python Standard Library - Built-in Functions) A unique number. Nothing more, and nothing less. Think of it as a social-security number or employee id number for Python objects.
Is it the same with memory addresses in C?
Conceptually, yes, in that they are both guaranteed to be unique in their universe during their lifetime. And in one particular implementation of Python, it actually is the memory address of the corresponding C object.
If yes, why doesn't the number increase instantly by the size of the data type (I assume that it would be int)?
Because a list is not an array, and a list element is a reference, not an object.
When do we really use id( ) function?
Hardly ever. You can test if two references are the same by comparing their ids, but the is operator has always been the recommended way of doing that. id( ) is only really useful in debugging situations.
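A short sketch of the two ways of testing whether references are the same, as mentioned above. The names first, second, alias and shared are made up for the example:

```python
first = [1, 2, 3]
second = [1, 2, 3]   # equal contents, but a separate object
alias = first        # another reference to the same object

# The recommended test for "same object" is the is operator.
same_by_is = alias is first

# Comparing ids gives the same answer while both objects are alive.
same_by_id = id(alias) == id(first)

# Equal is not the same as identical.
equal_but_distinct = first == second and first is not second

# List elements are references: both slots can refer to one object.
shared = [first, first]
both_slots_same = shared[0] is shared[1]

print(same_by_is, same_by_id, equal_but_distinct, both_slots_same)
```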

That's the identity of the object, which in CPython is its location in memory...
This example might help you understand the concept a little more.
foo = 1
bar = foo
baz = bar
fii = 1
print(id(foo))
print(id(bar))
print(id(baz))
print(id(fii))
> 1532352
> 1532352
> 1532352
> 1532352
These all point to the same location in memory, which is why their ids are the same. In this example, 1 is stored only once (CPython caches small integers), and any other name bound to 1 references that same object.

Rob's answer (most voted above) is correct. I would like to add that in some situations using IDs is useful, as it allows you to compare objects and find which names refer to your objects.
The latter usually helps you, for example, to debug strange bugs where mutable objects are passed as parameters to, say, classes and are assigned to instance variables. Mutating those objects then also changes the variables in the class. This manifests itself as strange behavior where multiple things change at the same time.
Recently I had this problem with a Python/Tkinter app where editing text in one text entry field changed the text in another as I typed :)
Here is an example of how you might use the id() function to trace where those references are. By no means is this a solution covering all possible cases, but you get the idea. Again, IDs are used in the background and the user does not see them:
class democlass:
    classvar = 24
    def __init__(self, var):
        self.instancevar1 = var
        self.instancevar2 = 42
    def whoreferencesmylocalvars(self, fromwhere):
        return {__l__: {__g__
                        for __g__ in fromwhere
                        if not callable(__g__) and id(eval(__g__)) == id(getattr(self,__l__))
                        }
                for __l__ in dir(self)
                if not callable(getattr(self, __l__)) and __l__[-1] != '_'
                }
    def whoreferencesthisclassinstance(self, fromwhere):
        return {__g__
                for __g__ in fromwhere
                if not callable(__g__) and id(eval(__g__)) == id(self)
                }
a = [1,2,3,4]
b = a
c = b
democlassinstance = democlass(a)
d = democlassinstance
e = d
f = democlassinstance.classvar
g = democlassinstance.instancevar2
print( 'My class instance is of', type(democlassinstance), 'type.')
print( 'My instance vars are referenced by:', democlassinstance.whoreferencesmylocalvars(globals()) )
print( 'My class instance is referenced by:', democlassinstance.whoreferencesthisclassinstance(globals()) )
OUTPUT:
My class instance is of <class '__main__.democlass'> type.
My instance vars are referenced by: {'instancevar2': {'g'}, 'classvar': {'f'}, 'instancevar1': {'a', 'c', 'b'}}
My class instance is referenced by: {'e', 'd', 'democlassinstance'}
The double underscores in the variable names are used to prevent name collisions. The functions take a "fromwhere" argument so that you can tell them where to start searching for references. This argument is filled by a function that lists all the names in a given namespace; globals() is one such function.

id() does return the address of the object being referenced (in CPython), but your confusion comes from the fact that python lists are very different from C arrays. In a python list, every element is a reference. So what you are doing is much more similar to this C code:
int *arr[3];
arr[0] = malloc(sizeof(int));
*arr[0] = 1;
arr[1] = malloc(sizeof(int));
*arr[1] = 2;
arr[2] = malloc(sizeof(int));
*arr[2] = 3;
printf("%p %p %p", arr[0], arr[1], arr[2]);
In other words, you are printing the address from the reference and not an address relative to where your list is stored.
In my case, I have found the id() function handy for creating opaque handles to return to C code when calling python from C. Doing that, you can easily use a dictionary to look up the object from its handle and it's guaranteed to be unique.

I am starting out with python and I use id when I use the interactive shell to see whether my variables are assigned to the same thing or if they just look the same.
Every value has an id, which is a unique number related to where it is stored in the computer's memory.

If you're using python 3.4.1 then you get a different answer to your question.
list = [1,2,3]
id(list[0])
id(list[1])
id(list[2])
returns:
1705950792
1705950808 # increased by 16
1705950824 # increased by 16
The integers from -5 to 256 have a constant id: however many times you encounter one of them, its id does not change, unlike other numbers, which may get a different id each time they are created. (This is a CPython implementation detail.)
In this run, the numbers from -5 to 256 have ids in increasing order that differ by 16.
The number returned by the id() function is a unique id given to each object stored in memory; by analogy, it is the same as a memory location in C.

The is operator uses it to check whether two objects are identical (as opposed to equal). The actual value that is returned from id() is pretty much never used for anything because it doesn't really have a meaning, and it's platform-dependent.

The answer is pretty much never. IDs are mainly used internally to Python.
The average Python programmer will probably never need to use id() in their code.

It is the address of the object in memory, exactly as the doc says. However, the object has metadata attached to it (properties of the object), and space in memory is needed to store that metadata. So, when you create your variable called list, you also create metadata for the list and its elements.
So, unless you are an absolute guru in the language, you can't determine the id of the next element of your list based on the previous element, because you don't know what the language allocates along with the elements.

I have an idea to use value of id() in logging.
It's cheap to get and it's quite short.
In my case I use tornado, and I would like to use id() as an anchor to group the messages of each web socket, which are scattered and mixed throughout the log file.
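One way this logging idea might look, using only the standard logging module. Everything here is a hypothetical stand-in: FakeSocket is not a tornado class, and log_event and the "ws_demo" logger name are invented for the sketch:

```python
import io
import logging

# Log into an in-memory buffer so the example is self-contained.
log_buffer = io.StringIO()
logger = logging.getLogger("ws_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(logging.StreamHandler(log_buffer))


class FakeSocket:
    """Stand-in for a web socket handler object."""


def log_event(sock, message):
    # id(sock) is constant while sock is alive, but it may be reused after
    # the object is garbage collected, so it only groups live connections.
    logger.info("[ws %x] %s", id(sock), message)


first, second = FakeSocket(), FakeSocket()
log_event(first, "open")
log_event(second, "open")
log_event(first, "message received")

# Pick out the lines belonging to one connection by its id prefix.
lines = log_buffer.getvalue().splitlines()
first_lines = [line for line in lines if line.startswith("[ws %x]" % id(first))]
print(first_lines)
```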

I'm a little bit late, and I will talk about Python 3. To understand what id() is and how it (and Python) works, consider the next example:
>>> x=1000
>>> y=1000
>>> id(x)==id(y)
False
>>> id(x)
4312240944
>>> id(y)
4312240912
>>> id(1000)
4312241104
>>> x=1000
>>> id(x)
4312241104
>>> y=1000
>>> id(y)
4312241200
You need to think about everything on the right-hand side as objects. Here, each assignment of the literal 1000 at the prompt creates a new object, and that means a new id. In the middle you can see a "wild" object which is created only for the function call id(1000), so its lifetime is only that line of code. If you check the next line, you see that when we create the new variable x, it has the same id as that wild object. It works pretty much like a memory address.

In Python 3, an id belongs to a value, not to a variable. This means that if you create two functions as below, all three ids are the same.
>>> def xyz():
...     q=123
...     print(id(q))
...
>>> def iop():
...     w=123
...     print(id(w))
...
>>> xyz()
1650376736
>>> iop()
1650376736
>>> id(123)
1650376736

Be careful (concerning the answer just above)... That's only true because 123 is between -5 and 256...
In [111]: q = 257
In [112]: id(q)
Out[112]: 140020248465168
In [113]: w = 257
In [114]: id(w)
Out[114]: 140020274622544
In [115]: id(257)
Out[115]: 140020274622768
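The difference between the two ranges can be probed with a small sketch. The identity results are a CPython implementation detail, so they are printed here rather than relied on; only the equality checks hold on every implementation. The variable names are made up:

```python
# int("...") builds the value at run time, which sidesteps the compiler
# merging identical literals that appear in the same piece of code.
small_a, small_b = int("256"), int("256")
large_a, large_b = int("257"), int("257")

# Equality holds on any Python implementation.
print(small_a == small_b, large_a == large_b)

# CPython detail: integers from -5 to 256 come from a shared cache.
print(small_a is small_b)

# CPython detail: larger integers are usually separate objects.
print(large_a is large_b)
```

This is why is should not be used to compare numbers: whether two equal integers are the same object depends on the implementation and on how they were created.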

Python None comparison with is keyword

So the is keyword returns true only if the two arguments point to the same object. My question is related to the snippet below.
This snippet
number = None
if number is None:
    print("PEP 8 Style Guide prefers this pattern")
Outputs
>>PEP 8 Style Guide prefers this pattern
Does this mean that when I assign number = None, the assignment is only by reference, since is checks whether it is the same object? I'm confused about why this happens. Am I wrong? Why was this design choice made?

This is because of two reasons.
1. Assignments in Python are by reference.
In the following code
x = object()
y = x
print(x is y) # True
print(id(x)) # 139957673835552
print(id(y)) # 139957673835552
Calling object() creates a new structure in memory, whose unique identifier can be accessed with the id() function.
You can imagine x and y being arrows pointing to the same object, and this is why their underlying identifier is the same in both cases.
As such, when assigning None to a variable, you're just saying "number is an alias, an arrow pointing to the object returned by writing None". You can check that
number = None
print(id(None), id(number))
will give you the same identifier two times. But wait! What if you do it for a big number like 100000?
number = 100000
print(id(100000), id(number)) # Different values!
This means that the same literal, written two times, can return different objects, bringing up the next reason.
2. The language guarantee for None
Note that no matter how many times you get the None identifier, you get the same one.
print(id(None)) # 139957682420224
print(id(None)) # 139957682420224
print(id(None)) # 139957682420224
This is because writing None doesn't create a new object, as in the first example, because the language specification guarantees that there's only one possible object of NoneType in memory. In other words, there's only one possible object returned by writing None, making comparisons with is work as expected. This is a good design choice: There's only one canonical way for saying that a variable (an arrow) points to nothingness.
In fact, using is is encouraged as the Pythonic way of checking that a variable is None.
You can also get the class of the object by writing
NoneType = type(None)
and notice how
NoneType() is None
is true.
Note: In CPython, this uniqueness also holds for some other values, particularly small integers, for performance reasons, but that is an implementation detail rather than a language guarantee.
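One practical reason for preferring is None over == None can be sketched with a class that overrides equality. AlwaysEqual is a contrived, hypothetical class written only for this illustration:

```python
class AlwaysEqual:
    """Contrived class whose instances claim to equal everything."""

    def __eq__(self, other):
        return True


obj = AlwaysEqual()

# == asks the object, which can answer anything it likes.
equal_to_none = (obj == None)  # noqa: E711 (deliberate, for comparison)

# is compares identity and cannot be overridden.
is_none = obj is None

# None is the only instance of its type.
none_is_singleton = type(None)() is None

print(equal_to_none, is_none, none_is_singleton)
```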

All assignments are by reference (see Facts and myths about Python names and values). However, the language guarantees that None is the only object of its type. That value is created at startup, and the literal None will always produce a reference to that value.
>>> a = None; b = None
>>> a is b
True
>>> a = None
>>> b = None
>>> a is b
True
Compare to a literal like 12345, which may or may not produce a reference to an existing value of type int.
>>> a = 12345; b = 12345
>>> a is b
True
>>> a = 12345
>>> b = 12345
>>> a is b
False
Why this produces different results isn't really important, other than to say that an implementation can create new objects from int literals if it prefers.

A different way to check the type of a list

list = [1,2,3]
print(type(list) == list) # Prints False
Other than changing the name of the list, is there another way to check the type of this list? (Because I have already referenced the list variable a lot of times in my code and it is pretty hard to change all of them.)

Your code correctly returns False, as you replaced the original meaning of list with your own list. You shouldn't use the names of Python builtins as variable names.
So, change the name of your list and it will work as expected.
If it's too late for that, as you suggest in the edit to your question, you can still access the original list with:
list = [1,2,3]
print(type(list) == __builtins__.list)
# True
Or, the more recommended way, using isinstance instead of type(...) == ...:
print(isinstance(list, __builtins__.list))
# True
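As a variation on the snippets above, the standard library's builtins module gives access to the original list type without relying on the __builtins__ name, which is a CPython implementation detail. A sketch, with result names chosen only for illustration:

```python
import builtins

list = [1, 2, 3]  # deliberately shadows the built-in, as in the question

# The original type is still reachable through the builtins module.
is_list_by_type = type(list) == builtins.list
is_list_by_isinstance = isinstance(list, builtins.list)

# Another option that avoids the shadowed name entirely.
is_list_by_literal = type(list) is type([])

print(is_list_by_type, is_list_by_isinstance, is_list_by_literal)

del list  # remove the shadowing name so the built-in is visible again
```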

This is because you are shadowing the built-in list
l = [1,2,3]
print(type(l) == list) # True
type(list) gives <class 'list'>, which is not [1,2,3].
You can use one of the options suggested by @ThierryLathuille, but the best practice is to rename the list variable; you shouldn't use built-in names as variable names.

Why Use a=b in Python

If I define
foo = [[1,2],[2,3],[3,4]]
bar = foo
then foo and bar reference the same object, namely [[1,2],[2,3],[3,4]]. I can now use either of these "tags/namespaces/references" to make changes to the object [[1,2],[2,3],[3,4]], but how is this useful to anyone?

One reason it can be useful to rebind names to existing values is if you intend to reuse the original name for a different object. For instance, this function to calculate the nth number in the Fibonacci sequence reuses the names a, b and temp repeatedly in a loop, binding the value previously referenced by a to b (via temp) each time:
def fib(n):
    a = 1
    b = 0
    for _ in range(n):
        temp = a
        a = a+b
        b = temp
        # A more Pythonic version of the last three lines would be: a, b = a+b, a
    return b
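For comparison, here is the same function written with the tuple assignment suggested in the comment. It is a sketch that should behave identically to the version above:

```python
def fib(n):
    a, b = 1, 0
    for _ in range(n):
        # The right-hand side is evaluated first, so no temp name is needed:
        # b takes the old value of a, and a takes the old a + b.
        a, b = a + b, a
    return b


print([fib(i) for i in range(8)])
```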

Let's say I have an attribute of my class instance, frobnoz, a reference to an instance of a Frobnoz class, which in turn has an attribute marfoos, a list of all the associated marfoos, and I want to perform several operations on the first one.
marfoo = self.frobnoz.marfoos[0]
marfoo.rotate(CLOCKWISE, degrees=90)
if marfoo.is_cloudy():
    self.purge_clouds(marfoo)
If it were not possible to create an additional reference to the marfoo I wanted to perform actions on, I would have had to not only have lengthy references to it, but would also incur the expense of looking up both the frobnoz and marfoos references plus the first element of the list each time I wanted to use it.

It is useful (among other things) for function calls that modify a value:
def changeFoo(bar):
    bar[0][0]=3.14
changeFoo(foo)
Note: This does not technically use assignment, but it is equivalent.
It can also be used when multiple objects need to have a reference to the same object (such as in linked lists).
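A small sketch of the function-call case: the parameter is another name for the caller's object, so mutation is visible outside, while rebinding the parameter is not. The function rebind_foo is added here purely for contrast and is not part of the original answer:

```python
def change_foo(bar):
    # bar refers to the caller's list, so this mutation is visible outside.
    bar[0][0] = 3.14


def rebind_foo(bar):
    # Rebinding the local name does not touch the caller's object.
    bar = [["replaced"]]
    return bar


foo = [[1, 2], [2, 3], [3, 4]]
change_foo(foo)
rebind_foo(foo)
print(foo)
```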

Does assigning another variable to a string make a copy or increase the reference count

On p.35 of "Python Essential Reference" by David Beazley, he first states:
For immutable data such as strings, the interpreter aggressively
shares objects between different parts of the program.
However, later on the same page, he states
For immutable objects such as numbers and strings, this assignment
effectively creates a copy.
But isn't this a contradiction? On one hand he is saying that they are shared, but then he says they are copied.

An assignment in python never ever creates a copy (it is technically possible only if the assignment for a class member is redefined for example by using __setattr__, properties or descriptors).
So after
a = foo()
b = a
whatever was returned from foo has not been copied, and instead you have two variables a and b pointing to the same object. No matter if the object is immutable or not.
With immutable objects however it's hard to tell if this is the case (because you cannot mutate the object using one variable and check if the change is visible using the other) so you are free to think that indeed a and b cannot influence each other.
For some immutable objects, Python is also free to reuse old objects instead of creating new ones, so after
a = x + y
b = x + y
where both x and y are numbers (so the sum is a number and is immutable), it may be that both a and b point to the same object. Note that there is no such guarantee... it may also be that they point to different objects with the same value.
The important thing to remember is that Python never ever makes a copy unless specifically instructed to using e.g. copy or deepcopy. This is very important with mutable objects to avoid surprises.
One common idiom you can see is for example:
class Polygon:
    def __init__(self, pts):
        self.pts = pts[:]
        ...
In this case self.pts = pts[:] is used instead of self.pts = pts to make a copy of the whole array of points to be sure that the point list will not change unexpectedly if after creating the object changes are applied to the list that was passed to the constructor.
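It may help to see what the pts[:] idiom does and does not protect against. The sketch below assumes points are stored as small lists, which the original answer does not specify; the slice copies only the outer list, so copy.deepcopy is shown for the case where the inner objects are mutable too:

```python
import copy


class Polygon:
    def __init__(self, pts):
        self.pts = pts[:]  # shallow copy of the outer list only


points = [[0, 0], [1, 0], [1, 1]]
poly = Polygon(points)

# Changes to the outer list no longer reach the polygon...
points.append([0, 1])
print(len(poly.pts))

# ...but the inner lists are still shared with the caller.
points[0][0] = 99
print(poly.pts[0])

# A deep copy duplicates the inner lists as well.
deep = copy.deepcopy(points)
points[1][0] = -1
print(deep[1])
```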

It effectively creates a copy. It doesn't actually create a copy. The main difference between having two copies and having two names share the same value is that, in the latter case, modifications via one name affect the value of the other name. If the value can't be mutated, this difference disappears, so for immutable objects there is little practical consequence to whether the value is copied or not.
There are some corner cases where you can tell the difference between copies and different objects even for immutable types (e.g., by using the id function or the is operator), but these are not useful for Python builtin immutable types (like strings and numbers).

No, assigning a pre-existing str variable to a new variable name does not create an independent copy of the value in memory.
The existence of unique objects in memory can be checked using the id() function. For example, using the interactive Python prompt, try:
>>> str1 = 'ATG'
>>> str2 = str1
Both str1 and str2 have the same value:
>>> str1
'ATG'
>>> str2
'ATG'
This is because str1 and str2 both point to the same object, evidenced by the fact that they share the same unique object ID:
>>> id(str1)
140439014052080
>>> id(str2)
140439014052080
>>> id(str1) == id(str2)
True
Now suppose you modify str1:
>>> str1 += 'TAG' # same as str1 = str1 + 'TAG'
>>> str1
'ATGTAG'
Because str objects are immutable, the above assignment created a new unique object with its own ID:
>>> id(str1)
140439016777456
>>> id(str1) == id(str2)
False
However, str2 maintains the same ID it had earlier:
>>> id(str2)
140439014052080
Thus, execution of str1 += 'TAG' assigned a brand new str object with its own unique ID to the variable str1, while str2 continues to point to the original str object.
This implies that assigning an existing str variable to another variable name does not create a copy of its value in memory.
