Id of list versus str slices

Given a list a = [1,2,3], why is the following statement true
id(a[:]) == id(a[:]) # true
while the following one is false?
b = a[:]
id(b) == id(a[:]) # false
Also, if I instead use a string (in place of a list), then both statements are true. Why? What am I missing?

The id is defined to be unique for an object across its lifetime. That is, two separate objects existing at the same time cannot have the same id. However, two separate objects existing at different times, as well as objects that are not required to be separate, may have the same id.
id(object)
Return the “identity” of an object. This is an integer which is guaranteed to be unique and constant for this object during its lifetime. Two objects with non-overlapping lifetimes may have the same id() value.
Thus, one has to be mindful of two things when reasoning about id: When must the lifetime of objects overlap, and when must two objects be separate.
When the objects whose id we look at are created only for the id:
>>> # the first a[:] and the second a[:] below are both temporaries
>>> id(a[:]) == id(a[:])
True
then the objects are not required to exist at the same time. Each id(a[:]) expression can create the slice, get its id and then discard the slice before the equality between the ids is ever checked. This means both slices can have the same id as they never exist at the same time.
In contrast, when a slice is assigned to a variable it has to exist at least as long as the variable. Thus, when we check the id of an object via a variable
>>> b = a[:] # < ------------- first a[:]
>>> id(b) == id(a[:]) # < second a[:] |
False
>>> b # < ----------------/
…
its lifetime overlaps with that of the temporary slice. This means both slices must not have the same id as they exist at the same time…
… iff slicing must create separate objects.
When comparing the behaviour of list and str, the key difference is that the latter does not have behaviour depending on its identity – roughly, this corresponds to mutable and immutable types.
When working with lists, identity is important because we can mutate a specific object. Even if two objects have the same initial value, mutation has a different effect:
>>> a, b = [1, 2, 3], [1, 2, 3]
>>> c = a # a, b, c have same value
>>> c += [4] # changing c has different effect on a and b
>>> a == b
False
When working with str, identity is irrelevant because we cannot mutate a specific object. If two objects have the same initial value, immutability guarantees they will always have the same value:
>>> a, b = "123", "123"
>>> c = a # a, b, c have same value
>>> c += "4" # changing c has *no* effect on a and b
>>> a == b
True
As a result, slicing a mutable list to a new list must always create a new object. Otherwise, mutating the slice would have unreliable behaviour.
In contrast, slicing an immutable str to a new str may create a new object. Even if it always provides the same object the behaviour is the same.
As a result of how id is defined, with respect to lifetime and separation, a Python implementation must use separate ids in specific cases but may use separate ids in all other cases.
Specifically, a Python implementation is free to re-use ids if objects don't exist at the same time and is free to share ids if behaviour does not depend on identity.
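The two rules can be checked directly. What follows is only a minimal sketch; the helper name ids_while_alive is made up for illustration. Only the list result is required by the language. Whether a full slice of a str hands back the original object is an implementation choice, so that line is printed without any promise.

```python
def ids_while_alive(seq):
    """Return the ids of two full slices of seq that are kept alive together."""
    first = seq[:]   # bound to a name, so it lives until the function returns
    second = seq[:]  # created while `first` still exists
    return id(first), id(second)


list_ids = ids_while_alive([1, 2, 3])
str_ids = ids_while_alive("123")

# Two lists that exist at the same time must be separate objects.
print(list_ids[0] != list_ids[1])  # True

# For an immutable str the implementation may return the original object,
# so equal ids are allowed here but not promised.
print(str_ids[0] == str_ids[1])
```

Binding both slices to names is what forces their lifetimes to overlap; without the names, the first slice could be discarded before the second one is created.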

CPython implementation detail: This is the address of the object in memory.
Memory has limited room, and you can think of it as a storehouse with shelves. You label every shelf with a number so that you can find its exact position later.
Everything you create must be put somewhere. When you create a value [1, 2, 3], it is an object stored at some address. When you create a variable a, it is a name that refers to such an address.
Then you say
a = 50; you assign 50 to a variable named a.
Under the hood, this makes a refer to the address where 50 lives (like a pointer).
A pile of data is an object. The computer itself does not need a variable called a: it already has the address of [1, 2, 3] and knows where it is in memory. We need a variable called a because we human beings want a name for this pile of data instead of an address.
The example in C:
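#include <stdint.h>
#include <stdio.h>

int main(void)
{
    int a;
    /* %p expects a void *, so the arguments are cast explicitly */
    printf("The address of a is %p\n\n", (void *)&a);
    a = 55;
    printf("The address of a is %p\n", (void *)&a);
    printf("The address of 55's pointer is %p\n\n", (void *)(uintptr_t)55);
    a = 30;
    printf("The address of a is %p\n", (void *)&a);
    printf("The address of 30's pointer is %p\n\n", (void *)(uintptr_t)30);
    return 0;
}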
// The address of a is 0x7fffb44b83dc
// The address of a is 0x7fffb44b83dc
// The address of 55's pointer is 0x37
// The address of a is 0x7fffb44b83dc
// The address of 30's pointer is 0x1e
Back to the question: once a = [1, 2, 3] exists, every slice we take from it,
a[:2], a[:], a[::-1], etc., is a brand-new list object that occupies new memory and has its own address. Indexing such as a[0] or a[1] does not copy anything; it returns a reference to the element the list already holds.
The name a itself stays where it is when you assign another value to it; it simply points to the new value's address.

just print the ids
a = [1,2,3]
print(id(a), id(a[:])) # "a" has id#001 and its copy id#002
b = a[:] # here "b" inherits id#002 and the first copy of "a" dies
print(id(b), id(a[:])) # here "b" is still id#002, "a" is a new copy with id#003

Related

What does the operator id() return in Python3? [duplicate]

I read the Python 2 docs and noticed the id() function:
Return the “identity” of an object. This is an integer (or long integer) which is guaranteed to be unique and constant for this object during its lifetime. Two objects with non-overlapping lifetimes may have the same id() value.
CPython implementation detail: This is the address of the object in memory.
So, I experimented by using id() with a list:
>>> list = [1,2,3]
>>> id(list[0])
31186196
>>> id(list[1])
31907092 # increased by 896
>>> id(list[2])
31907080 # decreased by 12
What is the integer returned from the function? Is it synonymous to memory addresses in C? If so, why doesn't the integer correspond to the size of the data type?
When is id() used in practice?

Your post asks several questions:
What is the number returned from the function?
It is "an integer (or long integer) which is guaranteed to be unique and constant for this object during its lifetime." (Python Standard Library - Built-in Functions) A unique number. Nothing more, and nothing less. Think of it as a social-security number or employee id number for Python objects.
Is it the same with memory addresses in C?
Conceptually, yes, in that they are both guaranteed to be unique in their universe during their lifetime. And in one particular implementation of Python, it actually is the memory address of the corresponding C object.
If yes, why doesn't the number increase instantly by the size of the data type (I assume that it would be int)?
Because a list is not an array, and a list element is a reference, not an object.
When do we really use id( ) function?
Hardly ever. You can test if two references are the same by comparing their ids, but the is operator has always been the recommended way of doing that. id( ) is only really useful in debugging situations.
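To make the last point concrete, here is a small sketch comparing the is operator with a comparison of ids. The variable names are only for illustration.

```python
a = [1, 2, 3]
b = a          # second reference to the same list
c = list(a)    # a new list with equal contents

print(a is b, id(a) == id(b))  # True True
print(a is c, id(a) == id(c))  # False False
print(a == c)                  # True: equal value, different identity
```

Both forms give the same answer as long as the objects being compared are alive at the same time, which is why is reads better and is the usual choice.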

That's the identity of the location of the object in memory...
This example might help you understand the concept a little more.
foo = 1
bar = foo
baz = bar
fii = 1
print(id(foo))
print(id(bar))
print(id(baz))
print(id(fii))
> 1532352
> 1532352
> 1532352
> 1532352
These all point to the same location in memory, which is why their ids are the same. In this example the object 1 is stored only once (CPython caches small integers), and every name bound to 1 refers to that one memory location.

Rob's answer (most voted above) is correct. I would like to add that in some situations using IDs is useful as it allows for comparison of objects and finding which objects refer to your objects.
The latter usually helps you, for example, to debug strange bugs where mutable objects are passed as parameters to, say, classes and are assigned to local variables in a class. Mutating those objects will mutate the variables in the class. This manifests itself in strange behavior where multiple things change at the same time.
Recently I had this problem with a Python/Tkinter app where editing text in one text entry field changed the text in another as I typed :)
Here is an example of how you might use the function id() to trace where those references are. This is by no means a solution covering all possible cases, but you get the idea. Again, IDs are used in the background and the user does not see them:
class democlass:
    classvar = 24
    def __init__(self, var):
        self.instancevar1 = var
        self.instancevar2 = 42
    def whoreferencesmylocalvars(self, fromwhere):
        return {__l__: {__g__
                        for __g__ in fromwhere
                        if not callable(__g__) and id(eval(__g__)) == id(getattr(self,__l__))
                        }
                for __l__ in dir(self)
                if not callable(getattr(self, __l__)) and __l__[-1] != '_'
                }
    def whoreferencesthisclassinstance(self, fromwhere):
        return {__g__
                for __g__ in fromwhere
                if not callable(__g__) and id(eval(__g__)) == id(self)
                }
a = [1,2,3,4]
b = a
c = b
democlassinstance = democlass(a)
d = democlassinstance
e = d
f = democlassinstance.classvar
g = democlassinstance.instancevar2
print( 'My class instance is of', type(democlassinstance), 'type.')
print( 'My instance vars are referenced by:', democlassinstance.whoreferencesmylocalvars(globals()) )
print( 'My class instance is referenced by:', democlassinstance.whoreferencesthisclassinstance(globals()) )
OUTPUT:
My class instance is of <class '__main__.democlass'> type.
My instance vars are referenced by: {'instancevar2': {'g'}, 'classvar': {'f'}, 'instancevar1': {'a', 'c', 'b'}}
My class instance is referenced by: {'e', 'd', 'democlassinstance'}
Underscores in variable names are used to prevent name collisions. The functions take a "fromwhere" argument so that you can tell them where to start searching for references. This argument is filled by a function that lists all names in a given namespace; globals() is one such function.

id() does return the address of the object being referenced (in CPython), but your confusion comes from the fact that python lists are very different from C arrays. In a python list, every element is a reference. So what you are doing is much more similar to this C code:
int *arr[3];
arr[0] = malloc(sizeof(int));
*arr[0] = 1;
arr[1] = malloc(sizeof(int));
*arr[1] = 2;
arr[2] = malloc(sizeof(int));
*arr[2] = 3;
printf("%p %p %p", arr[0], arr[1], arr[2]);
In other words, you are printing the address from the reference and not an address relative to where your list is stored.
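The same point can be seen from the Python side. This is a small sketch with made-up names: copying a list copies the references it holds, not the objects they refer to.

```python
items = [object(), object(), object()]
shallow = items[:]   # a new list object holding the same three references

print(shallow is items)                             # False
print(all(x is y for x, y in zip(items, shallow)))  # True
print([id(x) for x in items] == [id(x) for x in shallow])  # True
```

The ids of the elements say nothing about where the list itself is stored, which is why they need not be evenly spaced.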
In my case, I have found the id() function handy for creating opaque handles to return to C code when calling python from C. Doing that, you can easily use a dictionary to look up the object from its handle and it's guaranteed to be unique.

I am starting out with python and I use id when I use the interactive shell to see whether my variables are assigned to the same thing or if they just look the same.
Every value has an id, which is a unique number related to where it is stored in the memory of the computer.

If you're using python 3.4.1 then you get a different answer to your question.
list = [1,2,3]
id(list[0])
id(list[1])
id(list[2])
returns:
1705950792
1705950808 # increased by 16
1705950824 # increased by 16
In CPython, the integers from -5 to 256 each have a constant id: no matter how many times you obtain one of them, its id does not change, whereas numbers outside that range may get a different id each time they are created.
In this run, the ids of the numbers from -5 to 256 are in increasing order and differ by 16.
The number returned by the id() function is a unique id given to each object stored in memory; by analogy, it plays the same role as a memory location in C.

The is operator uses it to check whether two objects are identical (as opposed to equal). The actual value that is returned from id() is pretty much never used for anything because it doesn't really have a meaning, and it's platform-dependent.

The answer is pretty much never. IDs are mainly used internally to Python.
The average Python programmer will probably never need to use id() in their code.

It is the address of the object in memory, exactly as the doc says. However, each object also carries metadata (properties of the object), and memory is needed to store that metadata too. So, when you create your variable called list, you also create metadata for the list and its elements.
So, unless you are an absolute guru in the language, you can't determine the id of the next element of your list based on the previous element, because you don't know what the language allocates along with the elements.

I have an idea to use value of id() in logging.
It's cheap to get and it's quite short.
In my case I use Tornado, and I would like id() to serve as an anchor for grouping log messages by web socket, since they are scattered and interleaved in the log file.
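A minimal sketch of such tagging, using only the standard logging module. FakeSocket is a stand-in class and tag is a made-up helper; neither is part of Tornado.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("ws-demo")  # logger name chosen for this example


class FakeSocket:
    """Stand-in for a real connection object; not part of Tornado."""


def tag(conn):
    """Short hexadecimal tag derived from the object's id."""
    return format(id(conn), "x")


first, second = FakeSocket(), FakeSocket()
log.info("[%s] opened", tag(first))
log.info("[%s] opened", tag(second))
log.info("[%s] message received", tag(first))
```

One caveat: an id may be reused once the connection object is gone, so a tag like this only separates connections that are alive at the same time.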

I'm a little bit late, and I will talk about Python 3. To understand what id() is and how it (and Python) works, consider the following example:
>>> x=1000
>>> y=1000
>>> id(x)==id(y)
False
>>> id(x)
4312240944
>>> id(y)
4312240912
>>> id(1000)
4312241104
>>> x=1000
>>> id(x)
4312241104
>>> y=1000
>>> id(y)
4312241200
You need to think of everything on the right-hand side as an object. Each time the literal 1000 is evaluated at the prompt here, a new object is created, and that means a new id. In the middle you can see a "wild" object that is created only for the function call id(1000), so its lifetime is just that line of code. On the next line you can see that when we create the new variable x, it has the same id as that wild object. It works pretty much like a memory address.

In Python 3, an id belongs to a value (an object), not to a variable. This means that if you create two functions as below, all three ids are the same.
>>> def xyz():
...     q=123
...     print(id(q))
...
>>> def iop():
...     w=123
...     print(id(w))
...
>>> xyz()
1650376736
>>> iop()
1650376736
>>> id(123)
1650376736

Be careful (concerning the answer just above)... That's only true because 123 is between -5 and 256...
In [111]: q = 257
In [112]: id(q)
Out[112]: 140020248465168
In [113]: w = 257
In [114]: id(w)
Out[114]: 140020274622544
In [115]: id(257)
Out[115]: 140020274622768
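The difference can be probed without relying on how literals are compiled. This is only a sketch; same_object is a made-up helper, and the small-integer cache is a CPython implementation detail, so the comments describe CPython and nothing is promised for other implementations.

```python
import sys


def same_object(n):
    """Build the int n twice at run time and report whether both are one object."""
    first = int(str(n))
    second = int(str(n))
    return first is second


print(sys.implementation.name)
print(same_object(123))  # CPython: True, served from the small-integer cache
print(same_object(257))  # outside the documented -5..256 range; not promised either way
```

Building the numbers with int(str(n)) avoids the case where two equal literals in one piece of code are merged into a single constant.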

Immutable objects with same value and type not referencing same object

I have been reading the Python Data Model. The following text is taken from here:
Types affect almost all aspects of object behavior. Even the
importance of object identity is affected in some sense: for immutable
types, operations that compute new values may actually return a
reference to any existing object with the same type and value, while
for mutable objects this is not allowed. E.g., after a = 1; b = 1, a
and b may or may not refer to the same object with the value one,
depending on the implementation, but after c = []; d = [], c and d are
guaranteed to refer to two different, unique, newly created empty
lists. (Note that c = d = [] assigns the same object to both c and d.)
So, it mentions that, for immutable types, operations that compute new values may actually return a reference to an existing object with same type and value. So, I wanted to test this. Following is my code:
a = (1,2,3)
b = (1,2)
c = (3,)
k = b + c
print(id(a))
>>> 2169349869720
print(id(k))
>>> 2169342802424
Here, I did an operation to compute a new tuple that has the same value and type as a. But I got an object with a different id. This means I got an object which references different memory than a. Why is this?

Answering the question based on comments from @jonrsharpe:
Note "may actually return" - it's not guaranteed, it would likely be
less efficient for Python to look through the existing tuples to find
out if one that's the same as the one your operation creates already
exists and reuse it than it would to just create a new one.
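The question's experiment, restated as a small sketch that separates what is guaranteed from what is not:

```python
a = (1, 2, 3)
b = (1, 2)
c = (3,)
k = b + c

print(k == a)              # True: same value
print(type(k) is type(a))  # True: same type
print(k is a)              # implementation-dependent; the data model allows either answer
```

Equality of value and type is all the language promises here; reuse of an existing tuple is permitted but never required.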

Python multiple assignment and references

Why does multiple assignment make distinct references for ints, but not lists or other objects?
>>> a = b = 1
>>> a += 1
>>> a is b
False
>>> a = b = [1]
>>> a.append(1)
>>> a is b
True

In the int example, you first assign the same object to both a and b, but then reassign a with another object (the result of a+1). a now refers to a different object.
In the list example, you assign the same object to both a and b, but then you don't do anything to change that. append only changes the internal state of the list object, not its identity. Thus they remain the same.
If you replace a.append(1) with a = a + [1], you end up with different object, because, again, you assign a new object (the result of a+[1]) to a.
Note that a+=[1] will behave differently, but that's a whole other question.

Integers are immutable. When a += 1 runs, a no longer refers to the same memory location as b:
https://docs.python.org/2/library/functions.html#id
CPython implementation detail: This is the address of the object in memory.
In [1]: a = b = 100000000000000000000000000000
print id(a), id(b)
print a is b
Out [1]: 4400387016 4400387016
True
In [2]: a += 1
print id(a), id(b)
print a is b
Out [2]: 4395695296 4400387016
False

Python works differently when changing the value of a mutable object and of an immutable object.
Immutable objects:
These are objects whose values cannot change after initialization,
e.g. int, str, tuple.
Mutable objects:
These are objects whose values can change after initialization,
e.g. dict, list and user-defined objects.
Changing the value of a mutable object does not create a new memory space and move the object there; it just changes the memory where the object was created.
For immutable objects it is exactly the opposite: a new object is created in a new memory space.
For example:
s="awe"
s[0]="e"
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-19-9f16ce5bbc72> in <module>()
----> 1 s[0]="e"
TypeError: 'str' object does not support item assignment
This is telling you that you cannot change the string in place.
You could do this instead:
"e"+s[1:]
Out[20]: 'ewe'
This creates a new memory space and puts the new string there.
Likewise, after A=B=1, changing A with A=2 creates a new memory space and makes the variable A refer to that location; that is why B's value is not changed when you change the value of A.
This is not the case for a list: since it is a mutable object, changing its value does not move it to a new memory location; it just modifies (and possibly expands) the memory it already uses.
For example:
a=b=[]
a.append(1)
print a
[1]
print b
[1]
Both print the same value because both names refer to the same memory space, i.e. the same object.

The difference is not in the multiple assignment, but in what you subsequently do with the objects. With the int, you do +=, and with the list you do .append.
However, even if you do += for both, you won't necessarily see the same result, because what += does depends on what type you use it on.
So that is the basic answer: operations like += may work differently on different types. Whether += returns a new object or modifies the existing object is behavior that is defined by that object. To know what the behavior is, you need to know what kind of object it is and what behavior it defines (i.e., the documentation). Likewise, you cannot assume that using an operation like += will have the same result as using a method like .append. What a method like .append does is defined by the object you call it on.
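A short sketch of how the same-looking operations differ by type. The names nums, alias, n and m are only for illustration.

```python
nums = alias = [1]
nums += [2]            # list implements in-place addition: the list is extended
print(nums is alias)   # True
print(alias)           # [1, 2]

nums = alias = [1]
nums = nums + [2]      # + builds a new list, then nums is rebound to it
print(nums is alias)   # False
print(alias)           # [1]

n = m = 1
n += 1                 # int is immutable, so n is rebound to a different object
print(n, m)            # 2 1
```

For the list, += and + give different results for the alias; for the int there is no in-place variant, so += always rebinds the name.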

Shared References and Equality

Using Python 3.4 and working through examples in a book by O'Reilly.
The example shows:
A = ['spam']
B = A
B[0] = 'shrubbery'
Result after running print A:
'shrubbery'
Now my thought process is that A was defined but never changed.
This example yields a different result
A = 'string'
B = A
B = 'dog'
This is the result after running print A:
'string'
Can someone explain?

In the first example, you are modifying the list referenced by B.
Doing:
B[0] = 'shrubbery'
tells Python to set the first item in the list referenced by B to the value of 'shrubbery'. Moreover, this list happens to be the same list that is referenced by A. This is because doing:
B = A
causes B and A to each refer to the same list:
>>> A = ['spam']
>>> B = A
>>> A is B
True
>>>
So, any changes to the list referenced by B will also affect the list referenced by A (and vice-versa) because they are the same object.
The second example however does not modify anything. Instead, it simply reassigns the name B to a new value.
Once this line is executed:
B = 'dog'
B no longer references the string 'string' but rather the new string 'dog'. The value of A meanwhile is left unchanged.
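Both cases side by side, as a runnable sketch of the explanation above:

```python
A = ['spam']
B = A
B[0] = 'shrubbery'   # mutates the one list that both names refer to
print(A)             # ['shrubbery']
print(A is B)        # True

A = 'string'
B = A
B = 'dog'            # rebinds B only; the string 'string' is untouched
print(A)             # string
print(A is B)        # False
```

The first case changes an object; the second changes which object a name refers to.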

As is the case in most modern dynamic languages, variables in python are actually references which are sort of like C pointers. This means that when you do something like A = B (where A and B are both variables), you simply make A point to the same location in memory as B.
In the first example you are mutating (modifying) an existing object in place -- this is what the variable_name[index/key] = value syntax does. Both A and B continue to point at the same thing, but this thing's first entry is now 'shrubbery' instead of 'spam'.
In the second example, you make B point at a different (new at this point) object when you say B = 'dog'.

I hope this way of looking at it helps :-)
As you can see, in the first case both names refer to the same list; in the second case they refer to different objects. So in the second case a change to one does not affect the other.

Lists are mutable while strings are immutable; that is why you can change the list itself in place, but not the string.

We are talking here about shared references and mutable/immutable objects. When you do B = A, both variables point to the same memory address (a shared reference).
First case: a list is a mutable object (its state can be changed), but the object's memory address remains the same. So if you change its state, the other variable will see those changes, as it points to the same memory address. (A and B have the same value as they point to the same object in memory.)
Second case: a string is immutable (you cannot change it). By doing B = 'dog', you basically create another object, and B now points to that other object (another memory address). In this case A still points to the same old memory address. (A and B have different values.)

Here are the differences between the two:
Here's a step by step analysis:
A = ['spam']
"A points to a list whose first element, or A[0], is 'spam'."
B = A
"B points to what A points to, which is the same list."
B[0] = 'shrubbery'
"When we set B[0] to 'shrubbery', the result can be observed in the diagram.
A[0] is set to 'shrubbery' as well."
print (A):
A = 'string'
"A points to 'string'."
B = A
"B points to what A points to, which is 'string'."
B = 'dog'
"Oh look! B points to another string, 'dog', now.
So does what A points to change? No."
The result can be observed in the diagram.
print (A):

Python object deletion

a = [1,2,3,4,5]
b = a[1]
print(id(a[1]), id(b)) # output shows the same id, hence both refer to the same object
del a[1] # deleting a[1]; a[1] and b have the same id, hence both are aliases
print(a) # output: [1, 3, 4, 5]
print(b) # output: 2
Both b and a[1] have the same id, but deleting one isn't affecting the other. The Python reference states that 'del' on a subscription deletes the actual object, not the name-object binding. The output [1, 3, 4, 5] proves this statement. But how is it possible that 'b' remains unaffected when both a[1] and b have the same id?
Edit: The part "'del' on a subscription deletes the actual object, not the name-object binding" is not true. The reverse is true: 'del' actually removes name-object bindings. 'del' on a subscription (e.g. del a[1]) removes the object 2 from the list object, removes the current binding of a[1] to 2, and makes a[1] bind to 3 instead. Subsequent indexes follow the pattern.

del doesn't delete objects, it deletes references.
There is an object which is the integer value 2. That one single object was referred to by two places; a[1] and b.
You deleted a[1], so that reference was gone. But that has no effect on the object 2, only on the reference that was in a[1]. So the reference accessible through the name b still reaches the object 2 just fine.
Even if you del all the references, that has no effect on the object. Python is a garbage collected language, so it is responsible for noticing when an object is no longer referenced anywhere at all, so that it can reclaim the memory occupied by the object. That will happen some time after the object is no longer reachable. [1]
[1] CPython uses reference counting to implement its garbage collection [2], which allows us to say that objects will usually be reclaimed as soon as their last reference dies, but that's an implementation detail, not part of the language specification. You don't have to understand exactly how Python collects its garbage and shouldn't write programs that depend on it; other Python implementations such as Jython, PyPy, and IronPython do not implement garbage collection this way.
[2] Plus an additional garbage collection mechanism to detect cyclic garbage, which reference counting can't handle.
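A sketch that makes the references visible. It uses a small made-up class, Item, because plain ints cannot be weakly referenced; the weak reference acts as a probe that does not keep the object alive.

```python
import gc
import weakref


class Item:
    """Small user-defined class, because plain ints cannot be weakly referenced."""


a = [Item(), Item(), Item()]
b = a[1]
probe = weakref.ref(b)    # observes the object without keeping it alive

del a[1]                  # removes the list's reference only
print(len(a))             # 2
print(probe() is b)       # True: the object survives through b

del b                     # the last strong reference is gone
gc.collect()              # nudge implementations that do not use reference counting
print(probe() is None)    # True on CPython; other implementations may lag
```

The timing of the final reclaim is the implementation detail mentioned in the footnotes; the part that holds everywhere is that the object stays alive for as long as b refers to it.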

del merely decrements the reference count for that object. So after b = a[1], the object at a[1] has 2 (let's say) references. After del a[1], it is gone from the list and now has only 1 reference, as it's still referenced by b. No actual deletion occurs until the reference count reaches 0 (in CPython the object is then reclaimed right away; other implementations may wait for a GC cycle).

There are multiple issues at work here. First, calling del on a list member removes the item from the list, which decrements the reference count of the object, but the object will not be deallocated since the variable b still references it. You can never deallocate something which you have a reference to.
The second issue to note here is that integer numbers close to zero are actually pooled and are never deallocated. You should normally not have to bother knowing about this though.

They have the same id because Python reuses the id for small integers, even if you delete these... This is mentioned in the docs:
The current implementation keeps an array of integer objects for all integers between -5 and 256, when you create an int in that range you actually just get back a reference to the existing object.
We can see this behaviour:
>>> c = 256
>>> id(c)
140700180101352
>>> del c
>>> d = 256
>>> id(d)
140700180101352 # same as id(c) was
>>> e = 257
>>> id(e)
140700180460152
>>> del e
>>> f = 257
>>> id(f)
140700180460128 # different to id(e) !
