I am new to Python. Here is a question I have about lists:
It is said that lists are mutable and tuples are immutable. But when I write the following:
L1 = [1, 2, 3]
L2 = (L1, L1)
L1[1] = 5
print L2
the result is
([1, 5, 3], [1, 5, 3])
instead of
([1, 2, 3], [1, 2, 3])
But L2 is a tuple and tuples are immutable. Why is it that when I change the value of L1, the value of L2 is also changed?
From the Python documentation (http://docs.python.org/reference/datamodel.html), note:
The value of an immutable container object that contains a reference to a mutable
object can change when the latter’s value is changed; however the container is
still considered immutable, because the collection of objects it contains cannot
be changed. So, immutability is not strictly the same as having an unchangeable
value, it is more subtle.
The tuple is immutable, but the list inside the tuple is mutable. You changed L1 (the list), not the tuple. The tuple contains two references to L1, not two copies, so both show the change: they are actually the same list.
If an object is "immutable", that doesn't automatically mean everything it touches is also immutable. You can put mutable objects inside immutable objects, and that won't stop you from continuing to mutate the mutable objects.
The tuple didn't get modified; it still contains the same duplicate references to the list you gave it.
You modified a list (L1), not the tuple (or more precisely, not the reference to the list in the tuple).
For instance you would not have been able to do
L2[1] = 5
because tuples are immutable as you correctly state.
So the tuple wasn't changed, but the list that the tuple contained a reference to was modified (since both entries were references to the same list, both values in the output changed to 5). No value in the tuple was changed.
It may help if you think of reference as a "pointer" in this context.
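One way to check this is to compare object identities before and after the change. The following is a minimal sketch written for Python 3 (print as a function); the names ids_before and ids_after are introduced here only for illustration.

```python
L1 = [1, 2, 3]
L2 = (L1, L1)

# Both slots of the tuple refer to the very same list object as L1.
print(L2[0] is L1 and L2[1] is L1)  # True

ids_before = tuple(id(item) for item in L2)
L1[1] = 5  # mutate the list, not the tuple

# The tuple still holds exactly the same objects as before.
ids_after = tuple(id(item) for item in L2)
print(ids_before == ids_after)  # True
print(L2)  # ([1, 5, 3], [1, 5, 3])

# Rebinding a slot of the tuple itself is what immutability forbids.
try:
    L2[1] = 5
except TypeError as exc:
    print("TypeError:", exc)
```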
EDIT (based on question by OP in comments below):
About references, lists and copies, maybe these examples will be helpful:
L=range(5)
s = (L, L[:]) # a reference to the original list and a copy
s
([0, 1, 2, 3, 4], [0, 1, 2, 3, 4])
then changing L[2]
L[2] = 'a'
gives:
s
([0, 1, 'a', 3, 4], [0, 1, 2, 3, 4]) # copy is not changed
Notice that the "2nd" list didn't change, since it contains a copy.
Now,
L=range(5)
we are creating two copies of the list and giving the references to the tuple
s = (L[:], L[:])
now
L[2] = 'a'
doesn't affect anything but the original list L
s
([0, 1, 2, 3, 4], [0, 1, 2, 3, 4])
Hope this is helpful.
You're right that tuples are immutable: L2 is an immutable tuple of two references to L1 (not, as it might first appear, a tuple of two lists), and L1 is not immutable. When you alter L1, you aren't altering L2, just the objects that L2 references.
Use deepcopy instead of = :
from copy import deepcopy
L2 = deepcopy(L1)
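If the goal is a tuple that does not follow later changes to L1, one option is to store independent copies in it. This is a sketch for Python 3, assuming each slot of the tuple should get its own copy:

```python
from copy import deepcopy

L1 = [1, 2, 3]
L2 = (deepcopy(L1), deepcopy(L1))  # two independent copies, not references to L1

L1[1] = 5
print(L1)  # [1, 5, 3]
print(L2)  # ([1, 2, 3], [1, 2, 3])
```

For a flat list of numbers, L1[:] or list(L1) would do as well; deepcopy matters when the list itself contains mutable objects.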
The tuple contains two references, each to the same list (not copies of the list, as you might have expected). Hence, changes in the list will still show up in the tuple (since the tuple contains only the references), but the tuple itself is not altered. Therefore, its immutability is not violated.
Tuples being immutable means only one thing: once you construct a tuple, it's impossible to modify it. Lists, on the other hand, can have elements added and removed. But both tuples and lists are concerned only with which elements they contain, not with what those elements are.
In Python, and this has nothing to do with tuples or lists, every value is stored as a reference to an object. With an immutable value like an int you cannot observe the difference, but a mutable value such as a list or an instance of most classes can be changed through any reference to it.
If you were to put your tuple into a set(), you'd get an error message that might surprise you, but given the above it should make sense:
>>> L=range(5)
>>> s = (L, L[:]) # a reference to the original list and a copy
>>> set((1, 2, s))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
As values of a set must never change once they are added to the set, any mutable value contained inside the immutable tuple s raises TypeError.
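The same rule can be checked directly with hash(), which is what a set calls on each element. A small sketch for Python 3; is_hashable is a hypothetical helper written for this example, not a built-in:

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds (hypothetical helper for this example)."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

L = list(range(5))
s = (L, L[:])  # a tuple holding two lists

print(is_hashable((1, 2, 3)))    # True: a tuple of ints
print(is_hashable(s))            # False: the tuple contains lists
print(is_hashable((1, (2, 3))))  # True: only tuples and ints inside
```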
So I have this code:
tup = ([1,2,3],[7,8,9])
tup[0] += (4,5,6)
which generates this error:
TypeError: 'tuple' object does not support item assignment
While this code:
tup = ([1,2,3],[7,8,9])
try:
tup[0] += (4,5,6)
except TypeError:
print tup
prints this:
([1, 2, 3, 4, 5, 6], [7, 8, 9])
Is this behavior expected?
Note
I realize this is not a very common use case. However, while the error is expected, I did not expect the list change.
Yes it's expected.
A tuple cannot be changed. A tuple, like a list, is a structure that points to other objects. It doesn't care about what those objects are. They could be strings, numbers, tuples, lists, or other objects.
So doing anything to one of the objects contained in the tuple, including appending to that object if it's a list, isn't relevant to the semantics of the tuple.
(Imagine if you wrote a class that had methods on it that cause its internal state to change. You wouldn't expect it to be impossible to call those methods on an object based on where it's stored).
Or another example:
>>> l1 = [1, 2, 3]
>>> l2 = [4, 5, 6]
>>> t = (l1, l2)
>>> l3 = [l1, l2]
>>> l3[1].append(7)
Two mutable lists are referenced by a list and by a tuple. Should I be able to do the last line? (Answer: yes.) If you think the answer is no, why not? Should t change the semantics of l3? (Answer: no.)
If you want an immutable object of sequential structures, it should be tuples all the way down.
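As an illustration of "tuples all the way down", nested lists can be converted recursively. This is a sketch for Python 3; freeze is a hypothetical helper introduced here, and it only handles lists and tuples, not dicts, sets, or other containers.

```python
def freeze(obj):
    """Recursively convert lists (and tuples containing lists) to tuples.

    Hypothetical helper for illustration; it only handles lists and tuples.
    """
    if isinstance(obj, (list, tuple)):
        return tuple(freeze(item) for item in obj)
    return obj

nested = [[1, 2, 3], [7, 8, [9, 10]]]
frozen = freeze(nested)
print(frozen)  # ((1, 2, 3), (7, 8, (9, 10)))

nested[0].append(4)   # the original lists can still change...
print(frozen[0])      # (1, 2, 3) ...but the frozen copy does not follow
hash(frozen)          # works, because every level is now a tuple
```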
Why does it error?
This example uses the infix operator:
Many operations have an “in-place” version. The following functions
provide a more primitive access to in-place operators than the usual
syntax does; for example, the statement x += y is equivalent to x =
operator.iadd(x, y). Another way to put it is to say that z =
operator.iadd(x, y) is equivalent to the compound statement z = x; z
+= y.
https://docs.python.org/2/library/operator.html
So this:
l = [1, 2, 3]
tup = (l,)
tup[0] += (4,5,6)
is equivalent to this:
l = [1, 2, 3]
tup = (l,)
x = tup[0]
x = x.__iadd__((4, 5, 6)) # like extend, but returns x instead of None
tup[0] = x
The __iadd__ line succeeds, and modifies the first list. So the list has been changed. The __iadd__ call returns the mutated list.
The last line tries to assign the list back to the tuple, and this fails.
So, at the end of the program, the list has been extended but the second part of the += operation failed. For the specifics, see this question.
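The two halves of the operation can be run separately to see which one fails. A minimal sketch for Python 3, using operator.iadd as described in the quoted documentation:

```python
import operator

l = [1, 2, 3]
tup = (l,)

# Step 1: the in-place add mutates the list and returns that same list.
x = operator.iadd(tup[0], (4, 5, 6))
print(x is l)  # True
print(l)       # [1, 2, 3, 4, 5, 6]

# Step 2: storing the result back into the tuple is what raises.
try:
    tup[0] = x
except TypeError as exc:
    print("TypeError:", exc)

print(tup)  # ([1, 2, 3, 4, 5, 6],)
```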
Well I guess tup[0] += (4, 5, 6) is translated to:
tup[0] = tup[0].__iadd__((4,5,6))
tup[0].__iadd__((4,5,6)) is executed normally, changing the list in the first element. But the assignment fails since tuples are immutable.
Tuples cannot be changed directly, correct. Yet, you may change a tuple's element by reference. Like:
>>> tup = ([1,2,3],[7,8,9])
>>> l = tup[0]
>>> l += (4,5,6)
>>> tup
([1, 2, 3, 4, 5, 6], [7, 8, 9])
The Python developers wrote an official explanation about why it happens here: https://docs.python.org/2/faq/programming.html#why-does-a-tuple-i-item-raise-an-exception-when-the-addition-works
The short version is that += actually does two things, one right after the other:
1. Run the in-place addition with the thing on the right.
2. Assign the result to the variable on the left.
In this case, step 1 works because you’re allowed to add stuff to lists (they’re mutable), but step 2 fails because you can’t put stuff into tuples after creating them (tuples are immutable).
In a real program, I would suggest you don't do a try-except clause, because tup[0].extend([4,5,6]) does the exact same thing.
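A short sketch of that suggestion, written for Python 3:

```python
tup = ([1, 2, 3], [7, 8, 9])

# extend mutates the list in place and never assigns to the tuple,
# so no exception is raised.
tup[0].extend((4, 5, 6))
print(tup)  # ([1, 2, 3, 4, 5, 6], [7, 8, 9])
```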
I'm new to Python and programming. Here is a question off the top of my head about the use of functions in Python.
Say I have a list called myList.
(a) If I were to sort it, I would use myList.sort()
(b) If I were to sort it temporarily, I would use sorted(myList)
Note the difference between the two uses: one applies the function to myList, the other passes myList as a parameter to the function.
My question is, each time I use a function:
How do I know whether the function should be used as an "action" applied to an object (as in (a)), or
whether the object should be passed to the function as a parameter (as in (b))?
I have been confused by this for quite a long time. I'd appreciate any explanations.
Thanks.
There are two big differences between list.sort and sorted(list):
1. list.sort() sorts the list in place, which means it modifies the list. The sorted function does not modify the original list but returns a new sorted list.
2. list.sort() only applies to lists (it is a method), but the sorted built-in function can take any iterable object.
Please go through this useful documentation.
Only sorted is a function - list.sort is a method of the list type.
Functions such as sorted are applicable to more than a specific type. For example, you can sort a list, a set, or even a temporary generator. Only the output is concrete (you always get a new list), not the input.
Methods such as sort are applicable only to the type that holds them. For example, there is a list.sort method but not a dict.sort method. Even for types whose methods have the same name, switching them is not sensible - for example, set.copy cannot be used to copy a dict.
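The difference in applicability can be seen by trying both on a few types. A small sketch for Python 3; the variable names are chosen here for illustration:

```python
from_set = sorted({3, 1, 2})                   # a set as input
from_gen = sorted(n * n for n in (3, 1, 2))    # a generator as input
from_str = sorted("bca")                       # a string as input
from_dict = sorted({"b": 1, "a": 2})           # a dict iterates over its keys

print(from_set)   # [1, 2, 3]
print(from_gen)   # [1, 4, 9]
print(from_str)   # ['a', 'b', 'c']
print(from_dict)  # ['a', 'b']

# The sort method exists only on list.
print(hasattr(list, "sort"))   # True
print(hasattr(dict, "sort"))   # False
print(hasattr(tuple, "sort"))  # False
```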
An easy way to distinguish the two is that functions live in regular namespaces, such as modules. On the other hand, methods only live inside classes and their instances.
sorted # function
list.sort # method
import math
math.sqrt # function
math.pi.as_integer_ratio # method
Conventionally, Python usually uses functions for non-mutating actions and methods for mutating actions. For example, sorted provides a new sorted list leaving the old one untouched; my_list.sort() sorts the existing list, providing no new one.
my_list = [4, 2, 3, 1]
print(sorted(my_list)) # prints [1, 2, 3, 4]
print(my_list) # prints [4, 2, 3, 1] - unchanged by sorted
print(my_list.sort()) # prints None - no new list produced
print(my_list) # prints [1, 2, 3, 4] - changed by sort
sort() is an in-place method, whereas sorted() returns a new sorted list without altering your variable. The following demonstrates the difference:
l = [1, 2, 1, 3, 2, 4]
l.sort()
print(l) # prints [1, 1, 2, 2, 3, 4]
l = [1, 2, 1, 3, 2, 4]
new_l = sorted(l)
print(new_l) # prints [1, 1, 2, 2, 3, 4]
print(l) # prints [1, 2, 1, 3, 2, 4]
If you want to maintain the original order of your list use sorted, otherwise you can use sort().
Case A:
list1=[0, 1, 2, 3]
list2=list1
list1=list1+[4]
print(list1)
print(list2)
Output:
[0, 1, 2, 3, 4]
[0, 1, 2, 3]
(This non-mutating behavior also happens when concatenating a list of more than a single entry, and when 'multiplying' the list, e.g. list1=list1*2, in fact any type of "re-assignment" that performs an operation with an infix operator to the list and then assigns the result of that operation to the same list name using "=" )
In this case the original list object that list1 pointed to has not been altered in memory and list2 still points to it, another object has simply been created in memory for the result of the concatenation that list1 now points to (there are now two distinct, different list objects in memory)
Case B:
list1=[0, 1, 2, 3]
list2=list1
list1.append(4)
print(list1)
print(list2)
---
Output :
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
Case C:
list1=[0, 1, 2, 3]
list2=list1
list1[-1]="foo"
print(list1)
print(list2)
Outputs:
[0, 1, 2, 'foo']
[0, 1, 2, 'foo']
In case B and C the original list object that list1 points to has mutated, list2 still points to that same object, and as a result the value of list2 has changed. (there is still one single list object in memory and it has mutated).
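The three cases can be told apart with the is operator, which reports whether two names refer to the same object. A sketch for Python 3; the case_*_same names are introduced here for illustration:

```python
# Case A: + builds a new list, then the name list1 is rebound to it.
list1 = [0, 1, 2, 3]
list2 = list1
list1 = list1 + [4]
case_a_same = list1 is list2
print(case_a_same)  # False: two distinct objects now

# Case B: append changes the one existing object.
list1 = [0, 1, 2, 3]
list2 = list1
list1.append(4)
case_b_same = list1 is list2
print(case_b_same)  # True: still a single object

# Case C: item assignment also changes the existing object.
list1[-1] = "foo"
case_c_same = list1 is list2
print(case_c_same)  # True
print(list2)        # [0, 1, 2, 3, 'foo']
```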
This behavior seems inconsistent to me as a noob. Is there a good reason/utility for this?
EDIT :
I changed the list variables names from "list" and "list_copy" to "list1" and "list2" as this was clearly a very poor and confusing choice of names.
I chose Kabanus' answer as I liked how he pointed out that mutating operations are always(?) explicit in python.
In fact a short and simple answer to my question can be done summarizing Kabanus' answer into two of his statements :
-"In python mutating operations are explicit"
-"The addition[or multiplication] operator [performed on list objects] creates a new object, and doesn't change x[the list object] implicitly."
I could also add:
-"Every time you use square brackets to describe a list, that creates a new list object"
[this is from : http://www-inst.eecs.berkeley.edu/~selfpace/cs9honline/Q2/mutation.html , great explanations there on this topic]
Also I realized after Kabanus' answer how careful one must be tracking mutations in a program :
l=[1,2]
z=l
l+=[3]
z=z+[3]
and
l=[1,2]
z=l
z=z+[3]
l+=[3]
Will yield completely different values for z. This must be a frequent source of errors isn't it?
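Tracing the two orderings step by step shows where they diverge. A sketch for Python 3; first and second are names added here to hold the two results:

```python
# First ordering: l += [3] mutates the shared list before z is rebound.
l = [1, 2]
z = l
l += [3]      # l and z are still the same object: [1, 2, 3]
z = z + [3]   # z becomes a new list built from [1, 2, 3]
first = z
print(first)  # [1, 2, 3, 3]

# Second ordering: z is rebound to a new list before l is mutated.
l = [1, 2]
z = l
z = z + [3]   # z becomes a new list: [1, 2, 3]
l += [3]      # mutates only the original list
second = z
print(second)  # [1, 2, 3]
```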
I'm only in the beginning of my learning and haven't delved deeply into OOP concepts just yet, but I think I'm already starting to understand what the fuss around functional paradigm is about...
l += [4]
has the same effect as:
l.append(4)
(more precisely, it behaves like l.extend([4])) and won't create a copy.
l = l + [4]
is an assignment to l, it first evaluates the right side of the assignment, then assigns the resulting object to name l. There is no way for this operation to be mutating l.
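The difference shows up when a second name refers to the same object. The sketch below, for Python 3, also includes a tuple for contrast: tuple defines no __iadd__, so += on a tuple falls back to building a new object and rebinding the name. The names alias and t_alias are introduced here for illustration.

```python
l = [1, 2, 3]
alias = l
l += [4]                  # list defines __iadd__, so the object is mutated in place
print(alias is l, alias)  # True [1, 2, 3, 4]

l = l + [5]               # builds a new list and rebinds the name l
print(alias is l, alias)  # False [1, 2, 3, 4]

t = (1, 2, 3)
t_alias = t
t += (4,)                 # no tuple.__iadd__, so this acts like t = t + (4,)
print(t_alias is t)       # False
print(t_alias, t)         # (1, 2, 3) (1, 2, 3, 4)
```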
Update: I guess I haven't made myself clear enough. Of course operations on the RHS of the assignment may involve mutating the object that is the current value of LHS; but finally, the result of computing RHS is assigned to the LHS, thus overwriting any previous mutations. Example:
def increment_first(x):
x[0] += 1
return []
l = [ 1 ]
l = increment_first(l)
While the call to increment_first will increment l[0] as its side effect, the mutated list object will be lost anyway as soon as the value of RHS (in this case - an empty list) is assigned to l.
This is by design. The point is python does not like non explicit side effects. Suppose this valid line in your file:
x=[1,2]
print(x+[3,4])
Note there is no assignment, but it's still a valid line. Do you expect x to have changed after that second line? For me it doesn't make sense.
That's what you're seeing - the addition operator creates a new object, and doesn't change x implicitly. If you feel it should, then what about:
[3,4]+x
Of course, addition does not change behavior in assignment, to avoid confusion.
In python mutating operations are explicit:
x+=[3,4]
Or your example:
x[0]=1
Here you are explicitly asking to change a cell, i.e. explicit mutation. These things are consistent - an operation is always a mutation or it isn't, and it won't be both. Usually it makes sense as well, such as concatenating lists on the fly.
I was reading about sets in Python at http://www.python-course.eu/sets_frozensets.php and got confused about whether the elements of a set must be mutable or immutable. In the definition section they say "A set contains an unordered collection of unique and immutable objects." If that is true, then how can a set contain a list, since a list is mutable?
Can someone clarify my doubt?
>>> x = [x for x in range(0,10,2)]
>>> x
[0, 2, 4, 6, 8] #This is a list x
>>> my_set = set(x) #Here we are passing list x to create a set
>>> my_set
set([0, 8, 2, 4, 6]) #and here my_set is a set which contain the list.
>>>
When you pass the set() constructor built-in any iterable, it builds a set out of the elements provided in the iterable. So when you pass set() a list, it creates a set containing the objects within the list - not a set containing the list itself, which is not permissible as you expect because lists are mutable.
So what matters is that the objects inside your list are immutable, which is true in the case of your linked tutorial as you have a list of (immutable) strings.
>>> set(["Perl", "Python", "Java"])
set(['Java', 'Python', 'Perl'])
Note that this printing formatting doesn't mean your set contains a list, it is just how sets are represented when printed. For instance, we can create a set from a tuple and it will be printed the same way.
>>> set((1,2,3))
set([1, 2, 3])
In Python 2, sets are printed as set([comma-separated-elements]).
You seem to be confusing initialising a set with a list:
a = set([1, 2])
with adding a list to an existing set:
a = set()
a.add([1, 2])
the latter will throw an error, where the former initialises the set with the values from the list you provide as an argument. What is most likely the cause for the confusion is that when you print a from the first example it looks like:
set([1, 2])
again, but here [1, 2] is not a list, just the way a is represented:
a = set()
a.add(1)
a.add(2)
print(a)
gives:
set([1, 2])
without you ever specifying a list.
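The two situations side by side, as a sketch for Python 3 (where the same set prints as {1, 2} rather than set([1, 2])):

```python
a = set([1, 2])       # a set built from the elements of the list
print(a == {1, 2})    # True

b = set()
try:
    b.add([1, 2])     # a list is unhashable, so it cannot be an element
except TypeError as exc:
    print("TypeError:", exc)

b.add((1, 2))             # an immutable tuple of ints is fine
b.add(frozenset([1, 2]))  # so is a frozenset
print(len(b))             # 2
```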
Say I have two lists:
>>> l1=[1,2,3,4]
>>> l2=[11,12,13,14]
I can put those lists in a tuple, or dictionary, and it appears that they are all references back to the original list:
>>> t=(l1,l2)
>>> d={'l1':l1, 'l2':l2}
>>> id(l1)==id(d['l1'])==id(t[0])
True
>>> l1 is d['l1'] is t[0]
True
Since they are references, I can change l1 and the referred data in the tuple and dictionary change accordingly:
>>> l1.append(5)
>>> l1
[1, 2, 3, 4, 5]
>>> t
([1, 2, 3, 4, 5], [11, 12, 13, 14])
>>> d
{'l2': [11, 12, 13, 14], 'l1': [1, 2, 3, 4, 5]}
The same holds if I append through the reference in the dictionary d or through the reference in the tuple t:
>>> d['l1'].append(6)
>>> t[0].append(7)
>>> d
{'l2': [11, 12, 13, 14], 'l1': [1, 2, 3, 4, 5, 6, 7]}
>>> l1
[1, 2, 3, 4, 5, 6, 7]
If I now set l1 to a new list, the reference count for the original list decreases:
>>> import sys
>>> sys.getrefcount(l1)
4
>>> sys.getrefcount(t[0])
4
>>> l1=['new','list']
>>> l1 is d['l1'] is t[0]
False
>>> sys.getrefcount(l1)
2
>>> sys.getrefcount(t[0])
3
And appending to or changing l1 does not change d['l1'] or t[0], since l1 is now a reference to a new list. The notion of indirect references is covered fairly well in the Python documents but not completely.
My questions:
Is a mutable object always a reference? Can you always assume that modifying it modifies the original (Unless you specifically make a copy with l2=l1[:] kind of idiom)?
Can I assemble a list of all the same references in Python? i.e., some function f(l1) that returns ['l1', 'd', 't'] if all of those are referring to the same list?
It is my assumption that no matter what, the data will remain valid so long as there is some reference to it.
ie:
l=[1,2,3] # create an object of three integers and create a ref to it
l2=l # create a reference to the same object
l=[4,5,6] # create a new object of 3 ints; the original now referenced
# by l2 is unchanged and unmoved
1) Modifying a mutable object through a reference will always modify the "original". Honestly, this is betraying a misunderstanding of references. The newer reference is just as much the "original" as is any other reference. So long as both names point to the same object, modifying the object through either name will be reflected when accessed through the other name.
2) Not exactly like what you want. gc.get_referrers returns all objects that refer to the object.
>>> l = [1, 2]
>>> d = {0: l}
>>> t = (l, )
>>> import gc
>>> import pprint
>>> pprint.pprint(gc.get_referrers(l))
[{'__builtins__': <module '__builtin__' (built-in)>,
'__doc__': None,
'__name__': '__main__',
'__package__': None,
'd': {0: [1, 2]},
'gc': <module 'gc' (built-in)>,
'l': [1, 2],
'pprint': <module 'pprint' from '/usr/lib/python2.6/pprint.pyc'>,
't': ([1, 2],)}, # This is globals()
{0: [1, 2]}, # This is d
([1, 2],)] # this is t
Note that the actual object referenced by l is not included in the returned list because it does not contain a reference to itself. globals() is returned because that does contain a reference to the original list.
3) If by valid, you mean "will not be garbage collected" then this is correct barring a highly unlikely bug. It would be a pretty sorry garbage collector that "stole" your data.
Every variable in Python is a reference.
For lists, you are focusing on the results of the append() method and losing sight of the bigger picture of Python data structures. There are other methods on lists, and there are advantages and consequences to how a list is constructed. It is helpful to think of a list as a view onto the other objects referred to in the list. Lists do not "contain" anything other than the rules and ways of accessing the data referred to by objects within them.
The l.append(x) method specifically is equivalent to l[len(l):]=[x]
So:
>>> l1=range(3)
>>> l2=range(20,23)
>>> l3=range(30,33)
>>> l1[len(l1):]=[l2] # equivalent to 'append' for subscriptable sequences
>>> l1[len(l1):]=l3 # same as 'extend'
>>> l1
[0, 1, 2, [20, 21, 22], 30, 31, 32]
>>> len(l1)
7
>>> l1.index(30)
4
>>> l1.index(20)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: list.index(x): x not in list
>>> 20 in l1
False
>>> 30 in l1
True
By putting list brackets around l2 in l1[len(l1):]=[l2], or calling l1.append(l2), you add a single element that is a reference to l2. If you change l2, that element will show the change as well. It counts as one element of l1: the reference to the appended sequence.
Without the brackets, as in l1[len(l1):]=l3, each element of the sequence is added separately.
If you use other common list methods, such as l.index(something), or the in operator, you will not find elements inside the nested list. l.sort() will not sort properly. They are "shallow" operations on the object, and by using l1[len(l1):]=[l2] you are now creating a nested data structure.
If you use l1[len(l1):]=l3, you are making a true (shallow) copy of the elements in l3.
These are fairly fundamental Python idioms, and most of the time they 'do the right thing.' You can, however, get surprising results, such as:
>>> m=[[None]*2]*3
>>> m
[[None, None], [None, None], [None, None]]
>>> m[0][1]=33
>>> m
[[None, 33], [None, 33], [None, 33]] # probably not what was intended...
>>> m[0] is m[1] is m[2] # same object, that's why they all changed
True
Some Python newbies try to create a multi-dimensional list by doing something like m=[[None]*2]*3. The first sequence replication works as expected; it creates a list with two references to None. It is the second that is the issue: it creates three references to that same inner list. So entering m[0][1]=33 modifies the one inner list, and all three entries of m show that change.
Compare to:
>>> m=[[None]*2,[None]*2,[None]*2]
>>> m
[[None, None], [None, None], [None, None]]
>>> m[0][1]=33
>>> m
[[None, 33], [None, None], [None, None]]
You can also use nested list comprehensions to do the same like so:
>>> m=[[ None for i in range(2)] for j in range(3)]
>>> m
[[None, None], [None, None], [None, None]]
>>> m[0][1]=44
>>> m
[[None, 44], [None, None], [None, None]]
>>> m[0] is m[1] is m[2] # three different lists....
False
For lists and references, Fredrik Lundh has this text for a good intro.
As to your specific questions:
1) In Python, everything is a label or a reference to an object. There is no 'original' (a C++ concept) and there is no distinction between 'reference', pointer, or actual data (a C / Perl concept).
2) Fredrik Lundh has a great analogy in reply to a question similar to this:
The same way as you get the name of that cat you found on your porch: the cat (object) itself cannot tell you its name, and it doesn't really care -- so the only way to find out what it's called is to ask all your neighbours (namespaces) if it's their cat (object)...
....and don't be surprised if you'll find that it's known by many names, or no name at all!
You can find this list with some effort, but why? Just call it what you are going to call it -- like a found cat.
3) True.
1- Is a mutable object always a
reference? Can you always assume that
modifying it modifies the original
(Unless you specifically make a copy
with l2=l1[:] kind of idiom)?
Yes. Actually non-mutable objects are always a reference as well.
You just can't change them to perceive this.
2 - Can I assemble a list of all the
same references in Python? ie, Some
function f(l1) that returns ['l1',
'd', 't'] if all of those are
referring to the same list?
That is odd, but it can be done.
You can compare objects for "sameness" with the is operator,
as in l1 is t[0].
And you can get all objects that refer to a given object with the function
gc.get_referrers in the garbage collector module (gc).
You can check which of these referrers point to your object with the is operator. So yes, it can be done.
I just don't think it would be a good idea. It is more likely that the is operator alone
offers a way to do what you need.
3- It is my assumption that no matter
what, the data will remain valid so
long as there is some reference to it.
Yes.
Is a mutable object always a reference? Can you always assume that modifying it modifies the original (Unless you specifically make a copy with l2=l1[:] kind of idiom)?
Python has reference semantics: variables do not store values as in C++, but instead label them. The concept of "the original" is flawed: if two variables label the same value, it is totally irrelevant which one "came first". It doesn't matter if the object is mutable or not (except that immutable objects won't make it so easy to tell what's going on behind the scenes). To make copies in a more general-purpose way, try the copy module.
Can I assemble a list of all the same references in Python? i.e., some function f(l1) that returns ['l1', 'd', 't'] if all of those are referring to the same list?
Not easily. Refer to aaronasterling's answer for details. You could also try something like k for k, v in locals().items() if v is the_object, but you'll also have to search globals(), you'll miss some stuff and it might cause some kind of problems due to recursing with the names 'k' and 'v' (I haven't tested).
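The dictionary-scanning idea can be sketched as follows, for Python 3. names_of is a hypothetical helper written for this example; it only sees direct bindings in the dictionary it is given, so a tuple or dict that merely holds a reference to the object is not reported. An explicit dictionary is used here so the result is predictable; globals() or locals() could be passed instead.

```python
def names_of(obj, namespace):
    """Return the names in `namespace` that are bound to exactly `obj`.

    Hypothetical helper: it only sees direct bindings in the given
    dictionary, not references held inside containers.
    """
    return sorted(name for name, value in namespace.items() if value is obj)

l1 = [1, 2, 3, 4]
same = l1
other = [1, 2, 3, 4]   # equal, but a different object
t = (l1,)              # holds a reference, but t itself is not l1

namespace = {"l1": l1, "same": same, "other": other, "t": t}
print(names_of(l1, namespace))  # ['l1', 'same']
```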
It is my assumption that no matter what, the data will remain valid so long as there is some reference to it.
Absolutely.
1) "... object is a reference ..." is nonsense. References aren't objects. Variables, member fields, slots in lists and sets, etc. hold references, and these references point to objects. There can be any number of references to an object (in non-refcounting implementations, even none, temporarily, i.e. until the GC kicks in). Everyone who has a reference to an object can invoke its methods, access its members, etc.; this is true for all objects. Of course only mutable objects can be changed this way, so you usually don't care for immutable ones.
2) Yes, as others have shown. But this shouldn't be necessary unless you're either debugging the GC or tracking down a serious memory leak in your code. Why do you think you need this?
3) Python has automatic memory management, so yes. As long as there is a reference to an object, it won't be deleted (however, it may stay alive for a while after it became unreachable, due to cyclic references and the fact that GCs only run once in a while).
1a. Is a mutable object always a reference?
In this respect there is no difference between mutable and non-mutable objects. Seeing the variable names as references is helpful for people with a C background (but implies they can be dereferenced, which they cannot).
1b. Can you always assume that modifying it modifies the original
Please, it's not "the original". It's the same object. b = a means b and a now are the same object.
1c. (Unless you specifically make a copy with l2=l1[:] kind of idiom)?
Right, because then it is not the same object anymore. (Although the entries in the list will be the same objects as in the original list.)
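The parenthetical matters as soon as the entries are themselves mutable. A sketch for Python 3, using a nested list and copy.deepcopy for contrast; the variable names are chosen for this example:

```python
import copy

l1 = [[1, 2], [3, 4]]
l2 = l1[:]               # shallow copy: a new outer list
l3 = copy.deepcopy(l1)   # deep copy: new inner lists as well

print(l2 is l1)          # False: different outer list
print(l2[0] is l1[0])    # True: the entries are the same objects
print(l3[0] is l1[0])    # False

l1[0].append(99)         # mutate a shared inner list
l1.append([5, 6])        # change only the outer list l1

print(l2)  # [[1, 2, 99], [3, 4]]
print(l3)  # [[1, 2], [3, 4]]
```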
2. Can I assemble a list of all the same references in Python?
Yes, probably, but you will never ever ever need it, so that would be a waste of energy. :)
3. It is my assumption that no matter what, the data will remain valid so long as there is some reference to it.
Yes, an object will not be garbage collected as long as you have a reference to it.
(Using the word "valid" here seems incorrect, but I assume this is what you mean).