Consider the following Python code:
import numpy
a = numpy.random.rand(3,4)
b = numpy.random.rand(3,4)
c = a
c += b
c/2. - (a + b)/2.
The result of the last line is not an array with zeros. However, if I do:
d = a
d = d + b
d/2. - (a + b)/2.
Then the result is 0, as expected. This looks strange to me, can anybody please explain this behaviour? Is it wise to use +=, /=, ... for numpy arrays at all? Thank you!
(This is only a minimal example, I have to add up several arrays.)
The operation += is in place. This means it changes the content of array a in your first example!
The operation c=a makes c point to exactly the same data as a. Doing c += b also adds b to a.
The operation d = a also makes d point to a. But then d = d + b computes d + b into a newly allocated array and rebinds d to that new array, leaving a untouched.
As you can see, the differences are very important! For many algorithms you can exploit either one of these properties to gain efficiency, but caution is always necessary.
See here for a tutorial and here for an in-depth SO question.
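A minimal sketch of the difference described above, using small hand-written arrays instead of the random ones from the question. The names same_object, a_after_inplace, and rebound are introduced here purely for illustration.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([10.0, 20.0, 30.0])

# Case 1: c is just another name for the array a refers to.
c = a
c += b                          # in-place: writes into the shared buffer
same_object = c is a            # True, still one array
a_after_inplace = a.copy()      # a now holds [11., 22., 33.]

# Case 2: d + b builds a new array, and d is rebound to it.
a = np.array([1.0, 2.0, 3.0])
d = a
d = d + b
rebound = d is a                # False, d is a different array now
```

After the second case a still holds its original values, which is why the difference in the question comes out as zero only there.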
Because the line c = a only makes c point to a. It doesn't copy a. Then c += b also adds to a.
To add up several arrays, you have to either do it directly, or use a sum function.
c = a + b
c = sum([a, b])
c = numpy.sum([a, b], axis=0)
Or copy the array first:
c = a.copy()
c += b
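For the "add up several arrays" case, one possible sketch is to copy the first array once and then accumulate in place on that private copy. The list name arrays and its contents are made up for this example.

```python
import numpy as np

# Hypothetical inputs: three 2x2 arrays filled with 1.0, 2.0 and 3.0.
arrays = [np.full((2, 2), float(k)) for k in (1, 2, 3)]

total = arrays[0].copy()    # copy, so the first input is not modified
for arr in arrays[1:]:
    total += arr            # in-place, but only on our own copy

mean = total / len(arrays)  # division creates a new array
```

This keeps the memory benefit of += for the accumulation while leaving every input array unchanged.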
It is because when you do:
c = a
from then on, a and c are the same object. So after
c += b
you still have c is a, which means a has been modified as well.
Why does this
import numpy as np
a = np.array([[1,2,3],[4,5,6]])
a_old = a[0]
a[0] = a[0] + np.array([1,1,1])
print(a[0] - a_old)
give
[0 0 0]
and not
[1 1 1]
? But on the other hand
b = 2
b_old = b
b = b + 1
print(b - b_old)
gives indeed
1
as I would expect. I suppose there is a difference between assigning something to np.array elements, and assigning something to a variable. But I don't quite see through it. Thanks for help!
This can be tricky in numpy! When you do a_old = a[0], you are creating a view of the data in a, not a copy of the data. To quote the numpy indexing documentation:
It must be noted that the returned array is a view, i.e., it is not a copy of the original, but points to the same values in memory as does the original array.
So when you add to the original array: a[0] = a[0] + np.array([1,1,1]), the "value" of a_old changes as well since a_old points to the same location in memory. If you want to create a copy, you can simply edit your code:
a_old = a[0].copy()
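In your integer example, b_old = b does not copy anything either: both names refer to the same int object. The difference is that b = b + 1 creates a new int and rebinds b to it, leaving b_old bound to the old value 2, whereas a[0] = ... writes into the memory of the existing array.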
This happens because in Python, as in many other programming languages, variables hold references to objects. A NumPy array is mutable, and a[0] does not produce an independent row: it produces a view onto the memory where that row of a is stored.
So the instruction a_old = a[0] binds a_old to a view of the first row, and both a_old and a[0] refer to the same underlying data.
With b_old = b, on the other hand, both names refer to the immutable integer 2, and b = b + 1 rebinds b to a new integer without touching b_old.
To see this, try running print(np.shares_memory(a_old, a[0])); you should see True. (Note that print(a_old is a[0]) prints False, because every a[0] expression creates a new view object, even though the views share the same data.)
Now, to solve your problem, you can run a_old = a[0].copy().
After this, print(np.shares_memory(a_old, a[0])) will output False, because the copy has its own memory.
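A short sketch contrasting a view with a copy on the array from the question. The names view, snapshot, and the diff_* variables are illustrative only; np.shares_memory is used to check whether two arrays overlap in memory.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
view = a[0]             # basic indexing returns a view into a
snapshot = a[0].copy()  # an independent copy of the first row

a[0] = a[0] + np.array([1, 1, 1])   # writes into a's existing memory

view_shares = np.shares_memory(view, a)          # True
snapshot_shares = np.shares_memory(snapshot, a)  # False
diff_view = (a[0] - view).tolist()               # [0, 0, 0]
diff_copy = (a[0] - snapshot).tolist()           # [1, 1, 1]
```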
I have found an example for the Fibonacci sequence that goes like this:
def fib(n):
    a, b = 0, 1
    while b < n:
        print(b)
        a, b = b, a+b
fib(20)
So here's what I don't get:
a, b = 0, 1 # is just a shortcut for writing
a = 0
b = 1
right?
Now, following the same logic
a, b = b, a+b #should be the same as writing
a = b
b = a+b
But it isn't because if I write it like that, the output is different.
I'm having some hard time understanding why. Any thoughts?
Yes, it isn't exactly the same, because when you write:
a, b = b, a+b
the values of a and b at the time the statement executes are used. Say that before this statement a=1 and b=2. First the whole right-hand side is evaluated: b gives 2 and a+b gives 3. Only then does the assignment occur: a is assigned 2 and b is assigned 3.
But when you write:
a = b
b = a+b
each assignment happens as soon as its line is evaluated. First b (which is 2) is evaluated and assigned to a, so a becomes 2. Then a+b is evaluated with the changed value of a, giving 4, and that is assigned to b, so b becomes 4. Hence the difference.
a,b = b, a
This is shorthand for swapping the values of a and b. Note that if you want to swap the values without this notation, you need a temporary variable.
Internally, the right-hand side is made into a tuple, and then the values are unpacked. A simple test to see this is:
>>> a = 5
>>> b = 10
>>> t = a, b
>>> t
(5, 10)
>>> b, a = t
a, b = c, d is not shorthand for the following:
a = c
b = d
It's actually shorthand for this:
a, b = (c, d)
I.e., you're creating a tuple (c, d), a tuple with the values of c and d, which is then unpacked into the target list a, b. The tuple with its values is created before the values are unpacked into the target list. It actually is one atomic* operation, not a shorthand for several operations. So it does not matter whether one of the variables in the target list also occurs on the right hand side of the assignment operation.
* Not "atomic" in the sense of database ACID, but still not separate statements.
It is not the same thing.
x, y = y, x
is equivalent to:
t = x
x = y
y = t
Conceptually, it uses a temporary value to swap x and y.
So back to a, b = b, a+b. This statement is equivalent to:
m = a; n = b
a = n
b = m + n
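The two forms can be compared side by side. This is a small sketch with two hypothetical helper functions, step_tuple and step_sequential, each performing one update step starting from a=1, b=2 as in the explanation above.

```python
def step_tuple(a, b):
    # Right-hand side (b, a + b) is fully evaluated before any assignment.
    a, b = b, a + b
    return a, b


def step_sequential(a, b):
    # a is overwritten first, so a + b already sees the new value of a.
    a = b
    b = a + b
    return a, b


tuple_result = step_tuple(1, 2)            # (2, 3)
sequential_result = step_sequential(1, 2)  # (2, 4)
```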
Basically I'm quite new to Python, but I've written some code for the Fibonacci sequence and it doesn't work. I've compared it with examples online and it's pretty much the same, but when I write it slightly differently, it works! I have no idea why. Can anyone shed some light on why it behaves this way?
This code has been built and tested in the Python 3.3.2 Shell.
Working Code:
def fib(n):
    a, b = 0, 1
    while b < n:
        print(b)
        a, b = b, b + a
Non-Working Code:
def fib(n):
    a = 0
    b = 1
    while b < n:
        print(b)
        a = b
        b = b + a
I'm completely confused as to why it only works when the variables are grouped together and not when they are separate.
I believe it's in the line a,b = b,b+a.
The actual executed version does things a bit differently. An expanded form would be:
c = a
a = b
b = b + c
Here b is incremented by the initial value of a, not the updated value.
To expand on Yeraze's answer, the actual assignment is closer to
# Make the tuple
a_b = (b, b+a)
# Unpack the tuple
a = a_b[0]
b = a_b[1]
so it's more obvious why the values are computed first and only then assigned.
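As a sketch of how the separate-statement version could be repaired, the following variant keeps the old value of a in a temporary variable. The function names fib_below and fib_broken and the name old_a are made up for this example, and the values are collected in a list rather than printed so the two results can be compared.

```python
def fib_below(n):
    """Fibonacci numbers below n, using separate assignments and a temporary."""
    values = []
    a, b = 0, 1
    while b < n:
        values.append(b)
        old_a = a        # remember a before it is overwritten
        a = b
        b = b + old_a    # uses the old a, like a, b = b, b + a
    return values


def fib_broken(n):
    """The non-working variant: b is doubled because a already equals b."""
    values = []
    a, b = 0, 1
    while b < n:
        values.append(b)
        a = b
        b = b + a
    return values


good = fib_below(20)   # [1, 1, 2, 3, 5, 8, 13]
bad = fib_broken(20)   # [1, 2, 4, 8, 16]
```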
What is the difference between this:
a, b = b, a+b
And this:
a = b
b = a+b
I'm trying to follow along in the examples in the documentation and the first form (multiple assignment syntax) seems complicated to me. I tried to simplify it with the second example but it's not giving the same results. I'm clearly interpreting the first statement wrong. What am I missing?
Multiple assignment evaluates the values of everything on the right hand side before changing any of the values of the left hand side.
In other words, the difference is this:
a = 1
b = 2
a = b # a = 2
b = a + b # b = 2 + 2
vs. this:
a = 1
b = 2
a, b = b, a + b # a, b = 2, 1 + 2
Another way of thinking about it is that it's the equivalent of constructing a tuple and then deconstructing it again (which is actually exactly what's going on, except without an explicit intermediate tuple):
a = 1
b = 2
_tuple = (b, a+b)
a = _tuple[0]
b = _tuple[1]
I'm new to Python and struggling to solve the following issue the most Pythonic way.
I have a string (example values given below) which needs to be split (.split('/', 2)) and assigned to up to 3 variables (a, b and c). The string is a URL which I need to split into 3 segments.
The string and its segments can be the following examples:
'seg_a/seg_b/the_rest' -> a = seg_a, b = seg_b, c = the_rest
'seg_a/the_rest' -> a = seg_a, b = None, c = the_rest
'seg_a' -> a = seg_a, b = None, c = None
Note: The unassigned variables are not required to be None. They simply may not exist (b in ex. 2, b and c in ex. 3).
If split results in 1 item, it's given to variable a.
If split results in 2 items, they are given to variables a and c.
If split results in 3 items, then its segments are given to variables a, b and c.
I have found 2 methods achieving this, both seem not Pythonic, hence resulting in this question.
Method A:
Split.
Count.
Depending on count, assign segments to variables with an if/elif/else statement.
Method B:
Use list comprehension and nested Try-Except blocks. Ex:
try:
    a, b, c = [i for i in to_split.split("/", 2)]
except ValueError:
    try:
        a, c = [i for i in to_split.split("/", 1)]
        b = None
    except ValueError:
        a = to_split
        b, c = None, None
My question (short):
What is the correct, Pythonic way of splitting this string into its segments and assigning them to variables a, b and c?
I would do:
l = to_split.split("/", 2)
a, b, c = l + [None] * (3 - len(l))
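To see what this padding approach produces for each of the three example strings, here is a small sketch wrapped in a hypothetical helper called split_padded. Note that in the two-segment case the second piece lands in b rather than c, which differs from the mapping asked for in the question.

```python
def split_padded(to_split):
    # Split into at most 3 parts, then pad on the right with None.
    parts = to_split.split("/", 2)
    return tuple(parts + [None] * (3 - len(parts)))


three = split_padded("seg_a/seg_b/the_rest")  # ('seg_a', 'seg_b', 'the_rest')
two = split_padded("seg_a/the_rest")          # ('seg_a', 'the_rest', None)
one = split_padded("seg_a")                   # ('seg_a', None, None)
```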
IMHO, what is most Pythonic isn't what's most clever. If something is simple, concise, and comprehensible at a glance, then use it and get on with your day. If the rules you want to impose are
If split results in 1 item, it's given to variable a.
If split results in 2 items, it's given to variables a and c.
If split results in 3 items, it's given to variables a, b and c.
Then just implement that, Method A-style.
p = to_split.split("/", 2)
if len(p) == 1:
    a, = p
elif len(p) == 2:
    a, c = p
elif len(p) == 3:
    a, b, c = p
else:
    raise ValueError("could not parse {}".format(to_split))
I can read this and know exactly what it's doing. If there's a bug in there -- say I've swapped b and c when len(p) == 2 -- it's easy to fix once I see the problem.
It does seem a little strange that you're willing to let variables be undefined -- you must branch later to avoid getting a NameError, and that could, and probably should, be avoided with some refactoring. In my experience, something is probably a little off elsewhere. Even without changing anything else, I'd include a, b, c = [None]*3, myself.
One rule which helps keep code maintainable is that we should try to minimize the distance between what we would tell someone an algorithm is supposed to do and how we told the computer what to do. Here, since what you want to do is almost transcribable directly into Python, I'd just do that.
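One way to package the explicit branches together with the a, b, c = [None]*3 default mentioned above is a small function that always returns three values. This is only a sketch; the name split_url is an assumption made for this example.

```python
def split_url(to_split):
    """Return (a, b, c); segments that are absent come back as None."""
    a = b = c = None                 # defaults, so nothing is left undefined
    parts = to_split.split("/", 2)   # at most 3 parts
    if len(parts) == 1:
        (a,) = parts
    elif len(parts) == 2:
        a, c = parts                 # two parts go to a and c, as requested
    else:
        a, b, c = parts
    return a, b, c


full = split_url("seg_a/seg_b/the_rest")  # ('seg_a', 'seg_b', 'the_rest')
pair = split_url("seg_a/the_rest")        # ('seg_a', None, 'the_rest')
single = split_url("seg_a")               # ('seg_a', None, None)
```

Returning a fixed-size tuple means the caller can always unpack three names and test for None, instead of guarding against a NameError later.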
You could try:
a, b, c = (to_split.split("/", 2) + [None]*3)[0:3]
However, I agree with @DSM: the most Pythonic way is not always the best approach to solving a problem. It could be OK at first, but more verbose code works better in terms of readability.
That's one of the reasons I love Python: there are several ways to solve a problem, and it's up to the developer to choose the best according to his/her needs.