Unclarity on variable cloning behavior - python

Along with a book I was provided with a Python program, into which I am digging deep now.
The program uses a global data structure named globdat. In a specific routine, a numpy array inside globdat is assigned to a local variable:
a = globdat.array
Then in a following while loop the variable a is updated every iteration according to:
a[:] += da[:]
The result of this operation is that globdat.array is updated, which is used in subsequent operations.
Is the usage of [:] required here, or is it merely used to indicate that it also clones into globdat.array? Can anyone clarify this coding style?

The second [:], on the right-hand side, is redundant. It just copies da before using it in the concatenation, which is pointless.
We're left with:
a[:] += da
First, let's understand what a += da does. It maps to:
a = a.__iadd__(da)
The call to __iadd__ extends the original list a, and returns self, i.e. a reference to the list. The assignment which happens after, then, has no effect in this case (same as a=a).
This achieves the original goal, i.e. to extend the global array.
Now, what does a[:] += da do? It maps to:
a[:] = a[:].__iadd__(da)
Or more tediously to:
a.__setitem__(slice(None), a.__getitem__(slice(None)).__iadd__(da))
For readability, let's write it as (not valid Python syntax):
a.__setitem__(:, a.__getitem__(:).__iadd__(da))
So a[:].__iadd__(da):
creates a copy of a (call it a2),
concatenates da to a2 in place,
returns a2.
Then the assignment a[:] = ...:
replaces all values in a with all values in a2, in place.
So that, too, achieves the original goal, but is far less efficient.
There are some interesting details about this stuff in the answers to this question.
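A minimal sketch of the aliasing behind both forms, assuming plain lists as this answer does. Globdat here is a hypothetical stand-in for the program's global structure, not the book's actual class. With a NumPy array, += adds elementwise instead of concatenating, but the name a still refers to the same object as globdat.array:

```python
class Globdat:
    """Hypothetical stand-in for the program's global data structure."""
    def __init__(self):
        self.array = [1, 2]

globdat = Globdat()
a = globdat.array          # a is a second name for the same list, not a copy

a += [3]                   # list.__iadd__ extends the list in place
a[:] += [4]                # same end result, via a temporary copy of a
in_place_result = list(globdat.array)

b = globdat.array
b = b + [5]                # builds a new list and rebinds b; globdat.array is untouched

print(in_place_result)     # [1, 2, 3, 4]
print(a is globdat.array)  # True
print(b is globdat.array)  # False
```

Only the last form, b = b + [5], leaves the global list alone, because it is the only one that creates a new object and rebinds the local name to it.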

This statement is rather nasty:
a[:] += da[:]
It translates into this:
a.__setitem__(slice(None),
    a.__getitem__(slice(None)).__iadd__(da.__getitem__(slice(None))))
This makes unnecessary copies of both lists.
Assuming a and da are lists, you could much more reasonably use the extend() method:
a.extend(da)

If you want to modify a list in place, rather than replace it with a new list, you need to use the slicing syntax.
a[:] = da[:]
In this case though, += will always modify the list in place, so the slicing is redundant.
This might be a perfect example of cargo cult programming.
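To illustrate the case where the slice does matter, here is a small sketch (the names alias and da are chosen for illustration) contrasting slice assignment, which changes the existing list, with plain assignment, which only rebinds the name:

```python
a = [1, 2, 3]
alias = a                # second name for the same list
da = [7, 8]

a[:] = da                # slice assignment: replaces the contents in place
after_slice_assignment = list(alias)

a = [0]                  # plain assignment: rebinds a to a brand-new list

print(after_slice_assignment)  # [7, 8]
print(alias)                   # [7, 8]
print(a is alias)              # False
```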

Related

Python remove method removes element from both lists

When I run the code below:
a=[0,1,2,3]
t=a
t.remove(3)
print(a)
it gives me a result of [0,1,2] even though I didn't use the remove() method for list a. Why does it happen?
Because assigning a list (or a dict) to another name does not copy it: t = a makes t refer to the same list object as a. You have to make a shallow copy of the list if you don't want this to happen:
t=a[:]
or
t=a.copy()
Lists in Python are mutable objects, so when you assigned t = a you did not create a new list;
you made 't' a second name for the list 'a'. Both names refer to the same memory location.
Original list
a=[0,1,2,3]
id(a) # gives unique number which points to memory location
>> 60528136
assigning to t
t = a
id(t)
>> 60528136
The result shows the same number, which signifies that both names refer to the same memory location. So any modification made through 'a' is visible through 't' and vice versa.
To avoid this create a copy.
t = a[::]
or
t = a.copy()
id(t)
>>60571720
Check out the Python docs on Programming FAQ here.
Why did changing list ‘y’ also change list ‘x’?
There are two factors that produce this result:
Variables are simply names that refer to objects. Doing y = x doesn’t create a copy of the list – it creates a new variable y that refers to the same object x refers to. This means that there is only one object (the list), and both x and y refer to it.
Lists are mutable, which means that you can change their content.
How do I copy an object in Python?
In general, try copy.copy() or copy.deepcopy() for the general case. Not all objects can be copied, but most can.
Some objects can be copied more easily. Dictionaries have a copy() method:
newdict = olddict.copy()
Sequences can be copied by slicing:
new_l = l[:]
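Putting the two answers together, a short sketch of the difference between an alias and a shallow copy, using the is operator instead of raw id() values (the name c is introduced here for illustration):

```python
a = [0, 1, 2, 3]
t = a            # alias: t and a are the same list object
c = a.copy()     # shallow copy: a new list with the same elements

t.remove(3)      # mutates the one list shared by a and t

print(a)               # [0, 1, 2]
print(c)               # [0, 1, 2, 3]
print(t is a, c is a)  # True False
```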

How can I preserve the value of my parameters through a function so it can be used multiple times with its initial value?

I have just started using SAGE, which is pretty close to Python as I understand it, and I have come across this problem: a function takes a matrix as a parameter, and I wish to use the matrix several times in the function with its original value, but its values change as the different parts of the function run.
I have seen in a tutorial that declaring a variable in the function as
variable = list(parameter) doesn't affect the parameter or whatever is put in the parentheses. However, I can't make it work.
Below is the part of my program that poses the problem (I can add the rest if necessary): I declare the variable determinant, whose value is the result of the function my_Gauss_determinant with the variable auxmmatrix as parameter. Inside my_Gauss_determinant the value of auxmmatrix changes, but for some reason the value of mmatrix changes as well. How can I avoid this and re-use the parameter mmatrix with its original value?
def my_Cramer_solve(mmatrix,bb):
    auxmmatrix=list(mmatrix)
    determinant=my_Gauss_determinant(auxmmatrix)
    if determinant==0:
        print
    k=len(auxmmatrix)
    solution=[]
    for l in range(k):
        auxmmatrix1=my_replace_column(list(mmatrix),l,bb)
        determinant1=my_Gauss_determinant(auxmmatrix1)
        solution.append(determinant1/determinant0)
    return solution
What you want is a copy of mmatrix. list(other_list) creates a new list by iterating through every item in other_list, but the mutable objects within the list aren't copied:
>>> a = [{1,2}]
>>> b = list(a)
>>> b[0].add(7)
>>> a
[set([1,2,7])]
To make a complete copy, you can use copy.deepcopy to make copies of the elements within the list
>>> import copy
>>> a = [{1,2}]
>>> b = copy.deepcopy(a)
>>> b[0].add(7)
>>> a
[set([1,2])]
So if you only want to copy the list, but don't want to copy the elements within the list, you would do this
auxmmatrix = copy.copy(mmatrix)
determinant = my_Gauss_determinant(auxmmatrix)
If you want to copy the elements within the list as well, use copy.deepcopy
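As a rough illustration of why list(mmatrix) was not enough in the question, here is a sketch that assumes the matrix is stored as a list of row lists (an assumption; a real Sage matrix object behaves differently):

```python
import copy

mmatrix = [[1, 2], [3, 4]]       # hypothetical matrix stored as a list of rows

shallow = list(mmatrix)          # new outer list, but the rows are shared
deep = copy.deepcopy(mmatrix)    # new outer list and new rows

deep[0][0] = -1                  # does not touch mmatrix
print(mmatrix)                   # [[1, 2], [3, 4]]

shallow[1][1] = 99               # modifies a row that mmatrix also holds
print(mmatrix)                   # [[1, 2], [3, 99]]
print(shallow[0] is mmatrix[0], deep[0] is mmatrix[0])  # True False
```

Under that assumption, any row operation performed on the shallow copy writes into the original rows, which matches the behavior described in the question.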
If m is a matrix, you can copy it into mm by doing
sage: mm = m[:, :]
or
sage: mm = matrix(m)
To understand the need to copy container structures such as lists and matrices, you could read the tutorial on objects and classes in Python and Sage.
The other Sage tutorials are recommended too!

What does this syntax in Python mean?

I have the following code in python:
a = "xxx" # a is a string
b = "yyy" # b is another string
for s in a, b:
    t = s[:]
    ...
I don't understand the meaning of the for line. I know a, b returns a tuple. But what about looping through a, b? And why do you need t = s[:]? I know s[:] creates a copy of a list. But if s is a string, why don't you write t = s to make a copy of the string s into t?
Thank you.
The meaning of the for loop is to iterate over the tuple (a, b). So the loop body will run twice, once with s equal to a and again equal to b.
t = s[:]
On the face of it, this creates a copy of the string s, and makes t a reference to that new string.
However strings are immutable, so for most purposes the original is as good as a copy. As an optimization, Python implementations are allowed to just re-use the original string. So the line is likely to be equivalent to:
t = s
That is to say, it will not make a copy. It will just make t refer to the same object s refers to.
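A small sketch of the loop from the question (the list seen is added here only to record the results). Whether t is the very same object as s is an implementation detail, so the identity check is printed rather than relied upon:

```python
a = "xxx"
b = "yyy"

seen = []
for s in a, b:           # same as: for s in (a, b)
    t = s[:]             # full slice of a string
    seen.append(t)
    # t always equals s; whether it is also the same object depends on
    # the implementation (CPython typically reuses the original string).
    print(t == s, t is s)

print(seen)  # ['xxx', 'yyy']
```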

How does Python referencing work?

I am confused with Python referencing. Consider the following example:
My task: to edit each element in the list.
d = { 'm': [1,2,3] }
m = d['m']
m = m[1:] # m changes its reference to the new sliced list, edits m but not d (I wanted to change d)
Similarly:
d = { 'm': [1,2,3] }
m = d['m']
m = m[0] # As per python referencing, m should be pointing to d['m'] and should have edited d
In python everything goes by reference, then when is a new object created?
Do we always need copy and deepcopy from copy module to make object copies?
Please clarify.
In Python a variable is not a box that holds things, it is a name that points to an object. In your code:
d = { 'm': [1,2,3] } --> binds the name d to a dictionary
m = d['m'] --> binds the name m to a list
m = m[1:] --> binds the name m to another list
Your third line is not changing the list that m referred to; it is changing which object the name m points to.
To edit the elements in the list what you can do is:
m = d['m']
for i, item in enumerate(m):
    result = do_something_with(item)
    m[i] = result
Ethan Furman did an excellent job of explaining how Python internals work, I won't repeat it.
Since m really does represent the list inside the dictionary, you can modify it. You just can't reassign it to something new, which is what happens when you use = to equate it to a new slice.
To slice off the first element of the list for example:
>>> m[0:1] = []
>>> d
{'m': [2, 3]}
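The difference between mutating the list through m and rebinding m can be sketched side by side, using the dictionary from the question:

```python
d = {'m': [1, 2, 3]}
m = d['m']               # m and d['m'] name the same list

m[:] = m[1:]             # slice assignment: mutates that list in place
print(d)                 # {'m': [2, 3]}

m = m[1:]                # plain assignment: rebinds m only
print(m)                 # [3]
print(d)                 # {'m': [2, 3]}

d['m'] = d['m'][1:]      # assigning through the dict replaces the stored list
print(d)                 # {'m': [3]}
```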
"In python everything goes by reference"
In Python, everything is a reference, and the references get passed around by value.
If you want to use those terms. But those terms make things harder to understand.
Much simpler: in Python, a variable is a name for an object. = is used to change what object a name refers to. The left-hand side can refer to part of an existing object, in which case the whole object is changed by replacing that part. This is because the object, in turn, doesn't really contain its parts, but instead contains more names, which can be caused to start referring to different things.
"then when is a new object created ?"
Objects are created when they are created (by using the class constructor, or in the case of built-in types that have a literal representation, by typing out a literal). I don't understand how this is relevant to the rest of your question.
"m = m[1:] # m changes its reference to the new sliced list"
Yes, of course. Now m refers to the result of evaluating m[1:].
"edits m but not d (I wanted to change d)"
Yes, of course. Why would it change d? It wasn't some kind of magic, it was simply the result of evaluating d['m']. Exactly the same thing happens on both lines.
Let's look at a simpler example.
m = 1
m = 2
Does this cause 1 to become 2? No, of course not. Integers are immutable. But the same thing is happening: m is caused to name one thing, and then to name another thing.
Or, another way: if "references" were to work the way you expect, then the line m = m[1:] would be recursive. You're expecting it to mean "anywhere that you see m, treat it as if it meant m[1:]". Well, in that case, m[1:] would actually mean m[1:][1:], which would then mean m[1:][1:][1:], etc.

Difference between mutation, rebinding, copying value, and assignment operator [duplicate]

This question already has answers here:
"Least Astonishment" and the Mutable Default Argument
#!/usr/bin/env python3.2
def f1(a, l=[]):
    l.append(a)
    return(l)
print(f1(1))
print(f1(1))
print(f1(1))
def f2(a, b=1):
    b = b + 1
    return(a+b)
print(f2(1))
print(f2(1))
print(f2(1))
In f1 the argument l has a default value that is evaluated only once, so the three print calls output [1], [1, 1], and [1, 1, 1]. Why doesn't f2 behave similarly?
Conclusion:
To make what I learned easier to navigate for future readers of this thread, I summarize as the following:
I found this nice tutorial on the topic.
I made some simple example programs to compare the difference between mutation, rebinding, copying value, and assignment operator.
This is covered in detail in a relatively popular SO question, but I'll try to explain the issue in your particular context.
When you declare your function, the default parameters get evaluated at that moment. They are not re-evaluated every time you call the function.
The reason why your functions behave differently is because you are treating them differently. In f1 you are mutating the object, while in f2 you are creating a new integer object and assigning it into b. You are not modifying b here, you are reassigning it. It is a different object now. In f1, you keep the same object around.
Consider an alternative function:
def f3(a, l=[]):
    l = l + [a]
    return l
This behaves like f2 and doesn't keep appending to the default list. This is because it is creating a new l without ever modifying the object in the default parameter.
Common style in Python is to give the parameter a default of None, then create a new list inside the function. This gets around the whole ambiguity.
def f1(a, l=None):
    if l is None:
        l = []
    l.append(a)
    return l
Because in f2 the name b is rebound, whereas in f1 the object l is mutated.
This is a slightly tricky case. It makes sense when you have a good understanding of how Python treats names and objects. You should strive to develop this understanding as soon as possible if you're learning Python, because it is central to absolutely everything you do in Python.
Names in Python are things like a, f1, b. They exist only within certain scopes (i.e. you can't use b outside the function that uses it). At runtime a name refers to a value, but can at any time be rebound to a new value with assignment statements like:
a = 5
b = a
a = 7
Values are created at some point in your program, and can be referred to by names, but also by slots in lists or other data structures. In the above the name a is bound to the value 5, and later rebound to the value 7. This has no effect on the value 5, which is always the value 5 no matter how many names are currently bound to it.
The assignment to b, on the other hand, binds the name b to the value referred to by a at that point in time. Rebinding the name a afterwards has no effect on the value 5, and so has no effect on the name b, which is also bound to the value 5.
Assignment always works this way in Python. It never has any effect on values. (Except that some objects contain "names"; rebinding those names obviously affects the object containing the name, but it doesn't affect the values the name referred to before or after the change.)
Whenever you see a name on the left side of an assignment statement, you're (re)binding the name. Whenever you see a name in any other context, you're retrieving the (current) value referred to by that name.
With that out of the way, we can see what's going on in your example.
When Python executes a function definition, it evaluates the expressions used for default arguments and remembers them somewhere sneaky off to the side. After this:
def f1(a, l=[]):
    l.append(a)
    return(l)
l is not anything, because l is only a name within the scope of the function f1, and we're not inside that function. However, the value [] is stored away somewhere.
When Python execution transfers into a call to f1, it binds all the argument names (a and l) to appropriate values: either the values passed in by the caller, or the default values created when the function was defined. So when Python begins executing the call f1(5), the name a will be bound to the value 5 and the name l will be bound to our default list.
When Python executes l.append(a), there's no assignment in sight, so we're referring to the current values of l and a. So if this is to have any effect on l at all, it can only do so by modifying the value that l refers to, and indeed it does. The append method of a list modifies the list by adding an item to the end. So after this our list value, which is still the same value stored to be the default argument of f1, has now had 5 (the current value of a) appended to it, and looks like [5].
Then we return l. But we've modified the default list, so it will affect any future calls. But also, we've returned the default list, so any other modifications to the value we returned will affect any future calls!
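The "somewhere sneaky" where the default is kept can be inspected directly: function objects expose their default values through the __defaults__ attribute. A brief sketch using f1 from the question (the variable first and the "oops" value are added for illustration):

```python
def f1(a, l=[]):
    l.append(a)
    return l

print(f1.__defaults__)   # ([],)
first = f1(5)
print(f1.__defaults__)   # ([5],)

first.append("oops")     # mutating the returned list also mutates the default
print(f1(6))             # [5, 'oops', 6]
print(first is f1.__defaults__[0])  # True
```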
Now, consider f2:
def f2(a, b=1):
    b = b + 1
    return(a+b)
Here, as before, the value 1 is squirreled away somewhere to serve as the default value for b, and when we begin executing the call f2(5) the name a will become bound to the argument 5, and the name b will become bound to the default value 1.
But then we execute the assignment statement. b appears on the left side of the assignment statement, so we're rebinding the name b. First Python works out b + 1, which is 2, then binds b to that value. Now b is bound to the value 2. But the default value for the function hasn't been affected: 1 is still 1!
Hopefully that's cleared things up. You really need to be able to think in terms of names which refer to values and can be rebound to point to different values, in order to understand Python.
It's probably also worth pointing out a tricky case. The rule I gave above (about assignment always binding names with no effect on the value, so if anything else affects a name it must do it by altering the value) is true of standard assignment, but not always of the "augmented" assignment operators like +=, -= and *=.
What these do unfortunately depends on what you use them on. In:
x += y
this normally behaves like:
x = x + y
i.e. it calculates a new value and rebinds x to that value, with no effect on the old value. But if x is a list, then it actually modifies the value that x refers to! So be careful of that case.
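A quick sketch of that difference, comparing a list with two immutable types (the alias names are introduced here only to keep a second reference to the original object):

```python
x = [1, 2]
alias = x
x += [3]                  # list: mutated in place, x is still the same object
print(alias, x is alias)  # [1, 2, 3] True

y = (1, 2)
alias_y = y
y += (3,)                 # tuple: immutable, so a new tuple is built and y is rebound
print(alias_y, y)         # (1, 2) (1, 2, 3)

n = 5
old_n = n
n += 1                    # int: also a rebinding, the value 5 is unchanged
print(old_n, n)           # 5 6
```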
In f1 you are storing the value in an array (or rather, in Python terms, a list), whereas in f2 you're operating on the values passed. That's my interpretation of it. I may be wrong.
Other answers explain why this is happening, but I think there should be some discussion of what to do if you want to get new objects. Many classes have the method .copy() that allows you create copies. For instance, if we rewrite f1 as
def f1(a, l=[]):
    new_l = l.copy()
    new_l.append(a)
    return(new_l)
then it will continue to return [1] no matter how many times we call it. There is also the copy module (https://docs.python.org/3/library/copy.html) for managing copies.
Also, if you're looping through the elements of a container and mutating them one by one, using comprehensions not only is more Pythonic, but can avoid the issue of mutating the original object. For instance, suppose we have the following code:
data = [1,2,3]
scaled_data = data
for i, value in enumerate(scaled_data):
    scaled_data[i] = value/sum(data)
This will set scaled_data to [0.16666666666666666, 0.38709677419354843, 0.8441754916792739]; each time you set a value of scaled_data to the scaled version, you also change the value in data. If you instead have
data = [1,2,3]
scaled_data = [x/sum(data) for x in data]
this will set scaled_data to [0.16666666666666666, 0.3333333333333333, 0.5] because you're not mutating the original object but creating a new one.
