Get a pointer to a list element - python

I was wondering if it was possible to get a "pointer" to an element in a python list. That way, I would be able to access my element directly without needing to know my element's index. What I mean by that is that in a list, you can add elements anywhere; at the start, in the middle or even at the end, yet the individual elements aren't moved from their actual memory location. In theory, it should be possible to do something like:
myList = [1]
[1]
element = &myList[0]
element would act as a pointer here.
myList.insert(0, 0)
myList.append(2)
[0, 1, 2]
At this point, I would still be able to access the element directly even though its index within the list has changed.
The reason I want to do this is because in my program, it would be way too tedious to keep track of every item I add to my list. Each item is generated by an object. Once in a while, the object has to update the value, yet it can't be guaranteed that it will find its item at the same index as when it was added. Having a pointer would solve the problem. I hope that makes sense.
What would be the right way to do something like that in Python?

There's no concept of pointers in Python (at least that I'm aware of).
If you are storing objects in your list, you can simply keep a reference to the object.
If you are storing primitive values in your list, the approach I would take is to wrap the value (or values) in an object and keep a reference to that wrapper, so you can use it later without going through the list. This way the wrapper works as a mutable object and can be modified no matter where you access it from.
An example:
class FooWrapper(object):
    def __init__(self, value):
        self.value = value
# save an object into a list
l = []
obj = FooWrapper(5)
l.append(obj)
# add another object, so the initial object is shifted
l.insert(0, FooWrapper(1))
# change the value of the initial object
obj.value = 3
print(l[1].value)  # prints 3 since it's still the same reference

element = mylist[0] already works if you don't need to change the element or if element is a mutable object.
You cannot change immutable objects such as int objects in Python. Moreover, the same object can be referred to by many names, e.g., sys.getrefcount(1) is ~2000 in a fresh REPL on my system. Naturally, you don't want 1 to mean 2 all of a sudden in all these places.
If you want to change an object later, it has to be mutable, e.g., if mylist[0] == [1], then to change the value you could set element[0] = 2. A custom object instead of the [1] list could be more appropriate for a specific application.
As an alternative, you could use a dictionary (or other namespace objects such as types.SimpleNamespace) instead of the mylist list. Then to change the item, reference it by its name: mydict["a"] = 2.
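A minimal sketch of the two options described above, assuming Python 3.3+ for types.SimpleNamespace. The names mylist, element and mydict are only illustrative.

```python
from types import SimpleNamespace

# Option 1: store mutable objects in the list and keep a reference to one.
mylist = [SimpleNamespace(value=1)]
element = mylist[0]              # a second name for the same object

mylist.insert(0, SimpleNamespace(value=0))
mylist.append(SimpleNamespace(value=2))

element.value = 10               # mutate through the saved reference
values = [item.value for item in mylist]
print(values)                    # the item now at index 1 shows the change

# Option 2: address items by a stable key instead of a position.
mydict = {"a": 1}
mydict["b"] = 0
mydict["a"] = 2                  # "a" is found no matter what else was added
print(mydict["a"])
```

The saved reference keeps working because inserting into a list moves references around, not the objects they refer to.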

Related

Updating a list in Python: Why is the scope of my for-loop within a function apparently global?

I'm an absolute Python newbie and I have some trouble with the following function. I hope you can help me. Thank you very much for your help in advance!
I have created a list of zip-files in a directory via a list-comprehension:
zips_in_folder = [file for file in os.listdir(my_path) if file.endswith('.zip')]
I then wanted to define a function that replaces the character at a certain index in every element of the list with "-":
print(zips_in_folder)
def replacer_zip_names(r_index, replacer, zips_in_folder=zips_in_folder):
    for index, element in enumerate(zips_in_folder):
        x = list(element)
        x[r_index] = replacer
        zips_in_folder[index] = ''.join(x)
replacer_zip_names(5,"-")
print(zips_in_folder)
Output:
['12345#6', '22345#6']
['12345-6', '22345-6']
The function worked, but what I cannot wrap my head around: Why will my function update the actual list "zips_in_folder". I thought the "zips_in_folder"-list within the function would only be a "shadow" of the actual list outside the function. Is the scope of the for-loop global instead of local in this case?
In other functions I wrote the scope of the variables was always local...
I was searching for an answer for hours now, I hope my question isn't too obvious!
Thanks again!
Best
Felix
This is a rather intermediate topic. In one line: Python is pass-by-object-reference.
What this means
zips_in_folder is a name that refers to an object (here, a list). An object has a reference (think of it like an address) that points to its location in memory. To access an object, you need to use its reference.
Now, here's the key part:
For objects Python passes their reference as value
This means that a copy of the reference of the object is created but again, the new reference is pointing to the same location in the memory.
As a consequence, if you use the reference's copy to access the object, then the original object will be modified.
In your function, zips_in_folder is a variable storing a new copy of the reference.
The following line is using the new copy to access the original object:
zips_in_folder[index]=''.join(x)
However, if you decide to reassign the variable that is storing the reference, nothing will be done to the object, or its original reference, because you just reassigned the variable storing the copy of the reference, you did not modify the original object. Meaning that:
def reassign(a):
    a = []
a = [1,0]
reassign(a)
print(a) # output: [1, 0]
A simple way to think about it is that lists are mutable, this means that the following will be true:
a = [1, 2, 3]
b = a # a, b are referring to the same object
a[1] = 20 # b now is [1, 20, 3]
That is because the function receives a reference to the list object itself, so it changes the "original" list, i.e. it doesn't make a local copy of it.
The same is true for any class, user-defined or otherwise: a function manipulating an object will not make a copy of the object, it will change the "original" object passed to it.
If you know C++ or another lower-level language, this resembles passing a pointer by value rather than true pass-by-reference: changes made through the reference are visible to the caller, but reassigning the parameter inside the function is not.
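To make the difference between mutating and reassigning concrete, here is a small sketch. The function names mutate, rebind and mutate_copy are made up for this illustration.

```python
def mutate(items):
    # Changes the object the caller passed in.
    items[0] = "changed"


def rebind(items):
    # Only points the local name at a new list; the caller's list is untouched.
    items = ["new"]
    return items


def mutate_copy(items):
    # Work on a shallow copy when the caller's list must stay intact.
    local = list(items)
    local[0] = "changed"
    return local


original = ["a", "b"]

rebind(original)
after_rebind = list(original)    # snapshot: still the original contents

result = mutate_copy(original)
after_copy = list(original)      # snapshot: still the original contents

mutate(original)                 # this one does change the caller's list
print(after_rebind, result, after_copy, original)
```

If a function should not touch the caller's list, copying it at the top of the function (as in mutate_copy) is usually the simplest option.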

Why does Python return None on list.reverse()?

Was solving an algorithms problem and had to reverse a list.
When done, this is what my code looked like:
def construct_path_using_dict(previous_nodes, end_node):
    constructed_path = []
    current_node = end_node
    while current_node:
        constructed_path.append(current_node)
        current_node = previous_nodes[current_node]
    constructed_path = reverse(constructed_path)
    return constructed_path
But, along the way, I tried return constructed_path.reverse() and I realized it wasn't returning a list...
Why was it made this way?
Shouldn't it make sense that I should be able to return a reversed list directly, without first doing list.reverse() or list = reverse(list) ?
What I'm about to write was already said here, but I'll write it anyway because I think it will perhaps add some clarity.
You're asking why the reverse method doesn't return a (reference to the) result, and instead modifies the list in-place. In the official python tutorial, it says this on the matter:
You might have noticed that methods like insert, remove or sort that only modify the list have no return value printed – they return the default None. This is a design principle for all mutable data structures in Python.
In other words (or at least, this is the way I think about it), Python tries to mutate in place wherever possible (that is, when dealing with a mutable data structure), and when it mutates in place, it doesn't also return a reference to the list, because then it would appear that it is returning a new list when it is really returning the old list.
To be clear, this is only true for object methods, not functions that take a list, for example, because the function has no way of knowing whether or not it can mutate the iterable that was passed in. Are you passing a list or a tuple? The function has no way of knowing, unlike an object method.
list.reverse reverses in place, modifying the list it was called on. Generally, Python methods that operate in place don’t return what they operated on to avoid confusion over whether the returned value is a copy.
You can reverse and return the original list:
constructed_path.reverse()
return constructed_path
Or return a reverse iterator over the original list, which isn’t a list but doesn’t involve creating a second list just as big as the first:
return reversed(constructed_path)
Or return a new list containing the reversed elements of the original list:
return constructed_path[::-1]
# equivalent: return list(reversed(constructed_path))
If you’re not concerned about performance, just pick the option you find most readable.
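The three options can be compared side by side in a short sketch; the variable names are illustrative only.

```python
path = [1, 2, 3]

result = path.reverse()          # reverses in place and returns None
in_place = list(path)            # snapshot of path after the in-place reversal

iterator = reversed(path)        # lazy iterator, no second list is built yet
from_iterator = list(iterator)   # materialise it only if a list is needed

sliced = path[::-1]              # a new list; path itself is not changed

print(result, in_place, from_iterator, sliced)
```

Note that path stays reversed after the first call, so the later two results are reversed relative to that state.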
methods like insert, remove or sort that only modify the list have no return value printed – they return the default None. This is a design principle for all mutable data structures in Python.
PyDocs 5.1
As I understand it, you can see the distinction quickly by comparing what happens when you modify a list (mutable) in place, e.g. using list.reverse(), and when you mutate a list that is an element of a tuple (immutable), while calling
id(list)
id(tuple_with_list)
before and after the mutations: the ids stay the same. Returning None from in-place mutations goes hand in hand with mutable objects being changed, grown, or referred to by multiple references without a new object being created.

Set changed size during iteration

I'm new to python coming from a c++ background. I was just playing around with sets trying to calculate prime numbers and got a "Set changed size during iteration" error.
How internally does python know the set changed size during iteration?
Is it possible to do something similar in user defined objects?
The Pythonic way to filter sets, lists or dicts is with a comprehension (list, set or dict):
your_filtered_set = set([elem for elem in original_set if condition(elem)])
It's trivial to do so with a user-defined object: just set a flag each time you modify the object, and have the iterator check that flag each time it tries to retrieve an item.
Generally, you should not modify a set while iterating over it, as you risk missing an item or getting the same item twice.
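As far as I know, CPython's set iterator remembers how many entries the set had when iteration started and raises RuntimeError if that count is different on a later step. The sketch below applies the same idea to a user-defined container. It compares sizes instead of using a separate flag, and GuardedBag is a hypothetical class written only for this example.

```python
class GuardedBag:
    """Hypothetical container that refuses to be resized while iterated."""

    def __init__(self, items=()):
        self._items = list(items)

    def add(self, item):
        self._items.append(item)

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        expected = len(self._items)   # size recorded when iteration starts
        index = 0
        while True:
            # Checked on every step, including the one after the last item.
            if len(self._items) != expected:
                raise RuntimeError("GuardedBag changed size during iteration")
            if index >= expected:
                return
            yield self._items[index]
            index += 1


bag = GuardedBag([1, 2, 3])
seen = []
try:
    for item in bag:
        seen.append(item)
        bag.add(item * 10)            # resizing mid-iteration triggers the check
except RuntimeError as exc:
    error = str(exc)
else:
    error = None
print(seen, error)
```

A size comparison cannot notice changes that leave the size unchanged (one item added, one removed). A modification counter incremented by every mutating method would catch those as well.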

How to reference an element of a list inside itself?

How would I go about making reference to an element from a list inside that list? For example,
settings = ["Exposure", "0", random_time(settings[0])]
Where the third element makes reference to the first. I could verbosely state "Exposure" but I am trying to set it up so that even if the first element is changed the third changes with it.
Edit:
I think maybe my question wasn't clear enough. There will be more than one setting each using the generic function "random_time", hence the need to pass the keyword of the setting. The reference to the first element is so I only have to make modifications to the code in one place. This value will not change once the script is running.
I will try and use a list of keywords that the settings list makes reference to.
The right-hand expression is evaluated first, so when you evaluate
["Exposure", "0", random_time(settings[0])]
the variable settings is not defined yet.
A little example:
a = 1 + 2
First 1 + 2 is evaluated and the result is 3, after it's evaluated, then the assignment is done:
a = 3
One way you could handle this is storing the "changing" string to a variable:
var1 = "Exposure"
settings = [var1 , "0", random_time(var1)]
This will work in the list definition, but if you change var1 after creating the settings list, the third element won't change. If you want that to happen, you can try implementing a Settings class, which will be a lot more flexible.
AFAIK you can't. This is common to most programming languages, because at the point where your function runs, the list hasn't been created yet.
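One possible shape for such a class is sketched below. The body of random_time is not shown in the question, so the version here is a hypothetical stand-in, and the attribute and method names are assumptions as well.

```python
def random_time(name):
    # Hypothetical stand-in for the asker's function, whose body is not shown.
    return "time for " + name


class Settings:
    def __init__(self, name, value):
        self.name = name
        self.value = value

    @property
    def time(self):
        # Recomputed on every access, so it always follows self.name.
        return random_time(self.name)

    def as_list(self):
        # Same layout as the original three-element list.
        return [self.name, self.value, self.time]


settings = Settings("Exposure", "0")
before = settings.as_list()
settings.name = "Gain"
after = settings.as_list()
print(before, after)
```

Because the third value is derived on access, the name only has to be changed in one place.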
You can't directly.
You could have both refer to something else, though, and use an attribute of that.
class SettingObj:
    name = "Exposure"
settings = [SettingObj, "0", random_time(SettingObj)]
Now, change the way you work with your settings list so that you look for your name attribute for 1st and 3rd items on the list.
As others have told you, the syntax you've chosen will try to reference settings before it is created, and therefore it will not work (unless settings already exists because another object was assigned to it on a previous line).
More importantly, in Python, assigning a string to two places will not make it so that changing it in one place will change it in the other. This applies to all forms of binding, including variable names, lists and object attributes.
Strings are immutable in Python -- they cannot be changed, only rebound. And rebinding only affects a single name (or list position, etc.) at a time. This is different from, say, C, where two names can contain pointers that reference the same spot in memory, and you can edit that spot in memory and affect both places.
If you really need to do this, you can wrap the string in an object (custom class, presumably). You could even make the object's interface look like a string in all respects, except that it's not a string primitive but an object with an attribute (say contents) that's bound to a string. Then when you want to change the string, you rebind the object's attribute (that is, obj.contents or whatever). Since you are not reassigning the names bound to the object itself, but only a name inside the object, it will change in both places.
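A bare-bones sketch of that wrapper idea follows. The class name Text and the attribute name contents are placeholders, and only __str__ is implemented rather than the full string interface.

```python
class Text:
    """Hypothetical wrapper holding a string in a rebindable attribute."""

    def __init__(self, contents):
        self.contents = contents

    def __str__(self):
        return self.contents


label = Text("Exposure")
settings = [label, "0", label]    # both positions refer to the same object

label.contents = "Gain"           # rebind the attribute, not the list slots
rendered = [str(item) for item in settings]
print(rendered)
```

Both list positions follow the change because they hold the same wrapper object, not two separate strings.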
In this particular case you don't just have the same string in both places but you actually have a string in the first position but the result of a function performed on the string in the third position. So even if you use an object wrapper, it won't work the way you seem to want it to, because the function needs to be re-run every time.
There are ways to design your program so that this is not a problem, but without knowing more about your ultimate goal I can't say what they are.

memory management with objects and lists in python

I am trying to understand how exactly assignment operators, constructors and parameters passed in functions work in python specifically with lists and objects. I have a class with a list as a parameter. I want to initialize it to an empty list and then want to populate it using the constructor. I am not quite sure how to do it.
Lets say my class is --
class A:
    List = [] # Point 1
    def __init1__(self, begin=[]): # Point 2
        for item in begin:
            self.List.append(item)
    def __init2__(self, begin): # Point 3
        List = begin
    def __init3__(self, begin=[]): # Point 4
        List = list()
        for item in begin:
            self.List.append(item)
listObj = A()
del(listObj)
b = listObj
I have the following questions. It will be awesome if someone could clarify what happens in each case --
1. Is declaring an empty list like in Point 1 valid? What is created? A variable pointing to NULL?
2. Which of Point 2 and Point 3 are valid constructors? In Point 3 I am guessing that a new copy of the list passed in (begin) is not made and instead the variable List will be pointing to the pointer "begin". Is a new copy of the list made if I use the constructor as in Point 2?
3. What happens when I delete the object using del? Is the list deleted as well or do I have to call del on the List before calling del on the containing object? I know Python uses GC, but if I am concerned about cleaning unused memory even before GC kicks in, is it worth it?
4. Also, assigning an object of type A to another only makes the second one point to the first, right? If so, how do I do a deep copy? Is there a feature to overload operators? I know Python is probably much simpler than this, hence the question.
EDIT:
5. I just realized that using Point 2 and Point 3 does not make a difference. The items from the list begin are only copied by reference and a new copy is not made. To do that I have to create a new list using list(). This makes sense after I see it I guess.
Thanks!
In order:
1. Using this form is simply syntactic sugar for calling the list constructor, i.e. you are creating a new (empty) list. It is bound to the class itself (it is a static field) and is the same for all instances.
2. Apart from the constructor name, which must always be __init__, both are valid forms, but they mean different things.
The first constructor can be called with a list as argument or without. If it is called without arguments, the empty list passed as default is used (this empty list is created once during class definition, and not once per constructor call), so no items are added to the static list.
The second must be called with a list argument, or Python will raise an error. But without the self. prefix, as you wrote it, it just creates a new local variable named List, accessible only within the constructor, and leaves the static A.List variable unchanged.
3. Deleting only removes a reference to the object, without actually deleting anything. Once all references are gone, however, the garbage collector is free to reclaim the memory as needed.
It is usually a bad idea to try to control the garbage collector. Instead, just make sure you don't hold references to objects you no longer need and let it do its work.
4. Assigning an object to a variable only creates a new reference to the same object, yes. To create a deep copy, use copy.deepcopy from the standard copy module or write your own.
Operator overloading (use with care, it can make things more confusing instead of clearer if misused) can be done by overriding special methods such as __add__ in the class definition.
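The last point can be illustrated with a short sketch. The class below is a simplified, hypothetical version of A with an instance attribute named items, and __add__ is just one example of a special method that can be overridden.

```python
import copy


class A:
    def __init__(self, items=None):
        # Copy the argument so each instance owns its own outer list.
        self.items = list(items) if items is not None else []

    def __add__(self, other):
        # Operator overloading: a + b builds a new A from both item lists.
        return A(self.items + other.items)


first = A([[1, 2], [3]])
alias = first                     # same object under a second name
clone = copy.deepcopy(first)      # independent copy, nested lists included

first.items[0].append(99)         # visible through alias, not through clone
combined = first + A([[4]])
print(alias.items, clone.items, combined.items)
```

copy.copy would only duplicate the outer object, so the nested lists would still be shared. copy.deepcopy duplicates them too.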
About your edit: as I pointed out above, when writing List = list() inside the constructor without the self. (or better, since the variable is static, A.) prefix, you are just creating a local variable bound to an empty list, not overriding the one you defined in the class body.
For reference, the usual way to handle a list as default argument is by using a None placeholder:
class A(object):
    def __init__(self, arg=None):
        self.startvalue = list(arg) if arg is not None else list()
        # making a defensive copy of arg to keep the original intact
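To see why the None placeholder matters, the following sketch contrasts a shared class attribute, a mutable default argument, and the None idiom. All names in it (Shared, collect, collect_safe) are invented for the example.

```python
class Shared:
    items = []                         # class attribute: one list for all instances

    def add(self, item):
        self.items.append(item)        # mutates the class-level list


def collect(item, bucket=[]):          # default list is created once, at def time
    bucket.append(item)
    return bucket


def collect_safe(item, bucket=None):   # None placeholder: fresh list per call
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


a, b = Shared(), Shared()
a.add(1)
shared_view = b.items                  # b sees the item added through a

first_call = list(collect("x"))        # snapshot after the first call
second_call = list(collect("y"))       # the default list remembered "x"
safe_call = collect_safe("y")
print(shared_view, first_call, second_call, safe_call)
```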
As an aside, do take a look at the python tutorial. It is very well written and easy to follow and understand.
"It will be awesome if someone could clarify what happens in each case" isn't that the purpose of the dis module ?
http://docs.python.org/2/library/dis.html
