Let's say I have a numpy array like
x = np.arange(10)
Is it somehow possible to create a reference to a single element, i.e.
y = create_a_reference_to(x[3])
y = 100
print(x)
[ 0 1 2 100 4 5 6 7 8 9]
You can't create a reference to a single element, but you can get a view over that single element:
>>> x = numpy.arange(10)
>>> y = x[3:4]
>>> y[0] = 100
>>> x
array([0, 1, 2, 100, 4, 5, 6, 7, 8, 9])
The reason you can't do the former is that everything in python is a reference. By doing y = 100, you're changing what y points to - not its value.
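The difference between rebinding a name and writing through a view can be sketched as follows. The names unchanged and z are only for illustration:

```python
import numpy as np

x = np.arange(10)

y = x[3:4]   # y is a one-element view into x
y = 100      # rebinding: y now names the int 100; x is untouched
unchanged = x.copy()

z = x[3:4]   # take the view again
z[0] = 100   # item assignment writes through the view into x
```

After the rebinding, x[3] is still 3; only the item assignment on z changes it to 100.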
If you really want to, you can get that behaviour on instance attributes by using properties. Note this is only possible because the python data model specifies additional operations while accessing class attributes - it's not possible to get this behaviour for variables.
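A minimal sketch of the property approach, assuming a small wrapper class. ElementRef and its value attribute are hypothetical names, not part of numpy:

```python
import numpy as np

class ElementRef:
    """Hypothetical helper: exposes one array element as an attribute."""

    def __init__(self, array, index):
        self._array = array
        self._index = index

    @property
    def value(self):
        # Read the current element from the underlying array.
        return self._array[self._index]

    @value.setter
    def value(self, new_value):
        # Write through to the underlying array.
        self._array[self._index] = new_value

x = np.arange(10)
ref = ElementRef(x, 3)
ref.value = 100   # goes through the setter, so x[3] is updated
```

Plain assignment works here only because ref.value = ... is an attribute assignment, which Python routes through the property setter.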
No you cannot do that, and that is by design.
Numpy arrays are of type numpy.ndarray. Individual items in it can be accessed with numpy.ndarray.item which does "copy an element of an array to a standard Python scalar and return it".
I'm guessing numpy returns a copy instead of direct reference to the element to prevent mutability of numpy items outside of numpy's own implementation.
Just as a thought experiment, let's assume this weren't the case and you were allowed to get a reference to individual items. Then what would happen if numpy was in the middle of a calculation and you altered an individual item in another thread?
@goncalopp gives a correct answer, but there are a few variations that will achieve similar effects.
All of the notations shown below are able to reference a single element while still returning a view:
x = np.arange(10)
two_index_method = [None] * 10
scalar_element_method = [None] * 10
expansion_method = [None] * 10
for i in range(10):
    two_index_method[i] = x[i:i+1]
    scalar_element_method[i] = x[..., i] # x[i, ...] works, too
    expansion_method[i] = x[:, np.newaxis][i] # np.newaxis == None
two_index_method[5] # Returns a length 1 numpy.ndarray, shape=(1,)
# >>> array([5])
scalar_element_method[5] # Returns a zero-dimensional numpy.ndarray, shape=()
# >>> array(5)
expansion_method[5] # Returns a length 1 numpy.ndarray, shape=(1,)
# >>> array([5])
x[5] = 42 # Change the value in the original `ndarray`
x
# >>> array([0, 1, 2, 3, 4, 42, 6, 7, 8, 9]) # The element has been updated
# All methods presented here are correspondingly updated:
two_index_method[5], scalar_element_method[5], expansion_method[5]
# >>> (array([42]), array(42), array([42]))
Since the object in scalar_element_method is a zero-dimensional ndarray, attempting to reference its element via element[0] will raise an IndexError. For a zero-dimensional ndarray, element[()] can be used to reference the element it contains. This notation can also be used for assignment to a length-1 ndarray, but has the unfortunate side effect that it does not dereference a length-1 ndarray to a python scalar. Fortunately, there is a single method, element.item(), that can be used (for dereferencing only) to obtain the value regardless of whether the element is a length-1 ndarray or a zero-dimensional ndarray:
scalar_element_method[5][0] # This fails
# >>> IndexError: too many indices for array
scalar_element_method[5][()] # This works for scalar `ndarray`s
# >>> 42
scalar_element_method[5][()] = 6
expansion_method[5][0] # This works for length-1 `ndarray`s
# >>> 6
expansion_method[5][()] # Doesn't return a python scalar (or even a numpy scalar)
# >>> array([6])
expansion_method[5][()] = 8 # But can still be used to change the value by reference
scalar_element_method[5].item() # item() works to dereference all methods
# >>> 8
expansion_method[5].item()
# >>> 8
TLDR; You can create a single-element view v with v = x[i:i+1], v = x[..., i], or v = x[:, None][i]. While different setters and getters work with each method, you can always assign values with v[()]=new_value, and you can always retrieve a python scalar with v.item().
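As a quick sanity check of the summary above, the sketch below uses np.shares_memory to confirm that each of the three forms is a view, then assigns through each with v[()] and reads back with item(). The values 11, 22 and 33 are arbitrary:

```python
import numpy as np

x = np.arange(10)
views = [x[5:6], x[..., 5], x[:, None][5]]

# Every candidate should share memory with x if it is a true view.
all_share = all(np.shares_memory(x, v) for v in views)

# Assign through each view in turn with v[()] and read back with item().
for new_value, v in zip((11, 22, 33), views):
    v[()] = new_value
    assert x[5] == new_value
    assert v.item() == new_value
```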
Trying to follow the Scipy documentation to apply the linear_sum_assignment function to my code, in which I'm trying to assign 1 robot to each pizza so that the total travel time of the robots is as small as possible.
Robots is a list of 6 robot objects of which I am purposely ignoring the first robot.
SwapTargets is a list of 5 Pizza objects
newlist = []
for j in range(1,len(Robots)):
    for i in range(0,len(SwapTargets)):
        ref_x = SwapTargets[i].coordinates[0]
        ref_y = SwapTargets[i].coordinates[1]
        value = ((ref_x - Robots[j].x)**2) + (ref_y - Robots[j].y)**2
        newlist.append(value)
myarray = np.array(newlist).reshape(len(Robots[1:]),len(SwapTargets))
from scipy.optimize import linear_sum_assignment
row_ind, col_ind = linear_sum_assignment(myarray)
PizzaList = np.array([SwapTargets])[row_ind]
RobotList = np.array([Robots[1:]])[col_ind]
result = dict(zip(PizzaList, RobotList))
print(result)
At PizzaList = np.array([SwapTargets])[row_ind]
I'm getting an error IndexError: index 1 is out of bounds for axis 0 with size 1
Just as a test, if I replace [SwapTargets] with ["A","B","C","D","E"] in PizzaList = np.array([SwapTargets])[row_ind], I get no error, but I don't understand why my list of 5 objects does not work.
Thx from a python noobie
You are almost answering your own question. You state that SwapTargets is a list of 5 Pizza objects. This means that it could have been initialized like this:
SwapTargets = [pizza0, pizza1, pizza2, pizza3, pizza4]
Then you say that while executing:
PizzaList = np.array(["A", "B", "C", "D", "E"])[row_ind]
works, executing
PizzaList = np.array([SwapTargets])[row_ind]
does not. To understand what's going on, simply substitute the initialization of SwapTargets for the identifier. The above becomes:
PizzaList = np.array([[pizza0, pizza1, pizza2, pizza3, pizza4]])[row_ind]
See the double nested brackets?
What's happening here is that you are calling np.array() on a list of just one item (that item itself is a list of 5 items). In the successful call, you were calling the same function on a list of 5 items, which is probably what you want.
What you wanted to write was actually:
PizzaList = np.array(SwapTargets)[row_ind]
(no brackets around the argument).
Consider the difference between the following:
>>> numpy.array([1,2,3])
array([1, 2, 3])
and
>>> numpy.array([[1,2,3]])
array([[1, 2, 3]])
The first is a vector of length 3, the second is a matrix with dimensions (1, 3). You can see by checking the shape of both:
>>> numpy.array([1,2,3]).shape
(3,)
>>> numpy.array([[1,2,3]]).shape
(1, 3)
np.array([SwapTargets]) is just the same as the second example above. Its first dimension is of length 1, and so indexing ends at 0 in the first dimension. You probably want np.array(SwapTargets) instead.
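The same effect can be reproduced without the asker's classes. In this sketch, Pizza is a stand-in class and row_ind is a hand-written stand-in for the output of linear_sum_assignment; both are assumptions for illustration:

```python
import numpy as np

class Pizza:
    """Stand-in for the asker's Pizza class (hypothetical)."""

    def __init__(self, name):
        self.name = name

swap_targets = [Pizza(f"pizza{i}") for i in range(5)]
row_ind = np.array([0, 1, 2, 3, 4])  # stand-in for linear_sum_assignment output

nested = np.array([swap_targets])  # shape (1, 5): one row of five objects
flat = np.array(swap_targets)      # shape (5,): five objects

picked = flat[row_ind]             # works: indices 0..4 are valid on axis 0
try:
    nested[row_ind]                # fails: axis 0 only has index 0
    raised = False
except IndexError:
    raised = True
```

Separately, note that the question's cost matrix has one row per robot and one column per pizza, so row_ind refers to robots and col_ind to pizzas. The pairing in the question may need to be swapped accordingly.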
Edit: I fixed y so that x,y have the same length
I don't understand much about programming but I have a giant mass of data to analyze and it has to be done in Python.
Say I have two arrays:
import numpy as np
x=np.array([1,2,3,4,5,6,7,8,9,10])
y=np.array([25,18,16,19,30,5,9,20,80,45])
Say I want to choose the values in y which are greater than 17, and keep only the values in x which have the same index as the remaining values in y. For example, I want to erase the first value of y (25) and accordingly the matching value in x (1).
I tried this:
filter=np.where(y>17, 0, y)
but I don't know how to filter the x values accordingly (the actual data are much longer arrays so doing it "by hand" is basically impossible)
Solution: using @mozway's tip, now that x,y have the same length the needed code is:
import numpy as np
x=np.array([1,2,3,4,5,6,7,8,9,10])
y=np.array([25,18,16,19,30,5,9,20,80,45])
x_filtered=x[y>17]
As your question is not fully clear and you did not provide the expected output, here are two possibilities:
filtering
NumPy arrays can be indexed by an array (iterable) of booleans.
If the two arrays were the same length you could do:
x[y>17]
Here, x is longer than y so we first need to make it the same length:
import numpy as np
x=np.array([1,2,3,4,5,6,7,8,9,10])
y=np.array([25,18,16,19,30,5,9,20])
x[:len(y)][y>17]
Output: array([1, 2, 4, 5, 8])
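If the matching y values are needed as well, the boolean mask can be stored once and applied to both arrays. A small sketch, where mask, x_kept and y_kept are illustrative names:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
y = np.array([25, 18, 16, 19, 30, 5, 9, 20])

mask = y > 17              # boolean array, one entry per element of y
x_kept = x[:len(y)][mask]  # x values at positions where y > 17
y_kept = y[mask]           # the matching y values
```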
replacement
To select between x and y based on a condition, use where:
np.where(y>17, x[:len(y)], y)
Output:
array([ 1, 2, 16, 4, 5, 5, 9, 8])
As someone with little experience in Numpy specifically, I wrote this answer before seeing @mozway's excellent answer for filtering. My answer works on more generic containers than Numpy's arrays, though it uses more concepts as a result. I'll attempt to explain each concept in enough detail for the answer to make sense.
TL;DR:
Please, definitely read the rest of the answer, it'll help you understand what's going on.
import numpy as np
x = np.array([1,2,3,4,5,6,7,8,9,10])
y = np.array([25,18,16,19,30,5,9,20])
filtered_x_list = []
filtered_y_list = []
for i in range(min(len(x), len(y))):
    if y[i] > 17:
        filtered_y_list.append(y[i])
        filtered_x_list.append(x[i])
filtered_x = np.array(filtered_x_list)
filtered_y = np.array(filtered_y_list)
# These lines are just for us to see what happened
print(filtered_x) # prints [1 2 4 5 8]
print(filtered_y) # prints [25 18 19 30 20]
Pre-requisite Knowledge
Python containers (lists, arrays, and a bunch of other stuff I won't get into)
Let's take a look at the line:
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
What's Python doing?
The first thing it's doing is creating a list:
[1, 2, 3] # and so on
Lists in Python have a few features that are useful for us in this solution:
Accessing elements:
x_list = [ 1, 2, 3 ]
print(x_list[0]) # prints 1
print(x_list[1]) # prints 2, and so on
Adding elements to the end:
x_list = [ 1, 2, 3 ]
x_list.append(4)
print(x_list) # prints [1, 2, 3, 4]
Iteration:
x_list = [ 1, 2, 3 ]
for x in x_list:
    print(x)
# prints:
# 1
# 2
# 3
Numpy arrays are slightly different: we can still access, iterate over, and change individual elements in them, but once they're created, their size is fixed. They have no .append method, and a value can't be deleted in place; functions such as np.append and np.delete return new arrays instead.
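A short check of which modifications a numpy array supports in place, as a sketch with illustrative names:

```python
import numpy as np

a = np.array([1, 2, 3])
a[0] = 99                          # changing a value in place is allowed

has_append = hasattr(a, "append")  # ndarray has no .append method

b = np.append(a, 4)                # np.append returns a new, longer array
c = np.delete(a, 1)                # np.delete also returns a new array
```

Because np.append copies the whole array on every call, building up a plain list and converting it once at the end is usually the simpler choice for a loop like the one in this answer.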
So the filtered_x_list and the filtered_y_list are empty lists we're creating, but we're going to modify them by adding the values we care about to the end.
The second thing Python is doing is creating a numpy array, using the list to define its contents. The array constructor can take a list expressed as [...], or a list defined by x_list = [...], which we're going to take advantage of later.
A little more on iteration
In your question, for every x element, there is a corresponding y element. We want to test something for each y element, then act on the corresponding x element, too.
Since we can access the same element in both arrays using an index - x[0], for instance - instead of iterating over one list or the other, we can iterate over all indices needed to access the lists.
First, we need to figure out how many indices we're going to need, which is just the length of the lists. len(x) lets us do that - in this case, it returns 10.
What if x and y are different lengths? In this case, I chose the smallest of the two - first, do len(x) and len(y), then pass those to the min() function, which is what min(len(x), len(y)) in the code above means.
Finally, we want to actually iterate through the indices, starting at 0 and ending at len(x) - 1 or len(y) - 1, whichever is smallest. The range sequence lets us do exactly that:
for i in range(10):
    print(i)
# prints:
# 0
# 1
# 2
# 3
# 4
# 5
# 6
# 7
# 8
# 9
So range(min(len(x), len(y))) gets us the indices to iterate over, and finally, this line makes sense:
for i in range(min(len(x), len(y))):
Inside this for loop, i now gives us an index we can use for both x and y.
Now, we can do the comparison in our for loop:
for i in range(min(len(x), len(y))):
    if y[i] > 17:
        filtered_y_list.append(y[i])
Then, including xs for the corresponding ys is a simple case of just appending the same x value to the x list:
for i in range(min(len(x), len(y))):
    if y[i] > 17:
        filtered_y_list.append(y[i])
        filtered_x_list.append(x[i])
The filtered lists now contain the numbers you're after. The last two lines, outside the for loop, just create numpy arrays from the results:
filtered_x = np.array(filtered_x_list)
filtered_y = np.array(filtered_y_list)
Which you might want to do, if certain numpy functions expect arrays.
While there are, in my opinion, better ways to do this (I would probably write custom iterators that produce the intended results without creating new lists), they require a somewhat more advanced understanding of programming, so I opted for something simpler.
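One possible shape for such an iterator is a generator built on zip, which stops at the shorter input so no explicit length check is needed. This is only a sketch; filtered_pairs is a hypothetical helper name:

```python
import numpy as np

def filtered_pairs(xs, ys, threshold):
    """Yield (x, y) pairs whose y exceeds threshold (hypothetical helper)."""
    for x_value, y_value in zip(xs, ys):  # zip stops at the shorter input
        if y_value > threshold:
            yield x_value, y_value

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
y = np.array([25, 18, 16, 19, 30, 5, 9, 20])

pairs = list(filtered_pairs(x, y, 17))
```

The generator produces pairs lazily; list() is used here only to collect them for inspection.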
Sorry, this question came up before here: Setting two arrays equal
But the solution did not work and I don't know why.
import numpy as np
zero_matrix = np.zeros((3,3)) # 3x3 zero matrix
test_matrix = zero_matrix[:] # test_matrix is a view of zero_matrix. Without [:] it would be same object
print (zero_matrix)
print ()
print (test_matrix)
print ()
print(id(test_matrix))
print ()
print(id(zero_matrix))
print ()
test_matrix[1] = 42
print (test_matrix)
print ()
print (zero_matrix)
The 'zero_matrix' is also changed when I set test_matrix[1] = 42.
And I don't get why, since both have different object ids.
This is what is meant by the comment in your code that says test_matrix is a "view". A view does not have its own copy of the data. Rather, it shares the underlying data of the original array. Views do not have to be of the entire array, but can be of small sub-sections of the array. These sub-sections do not even need to be contiguous if the view is strided, e.g.
a = np.arange(10)
b = a[::2] # create a view of every other element starting with the 0-th
assert list(b) == [0, 2, 4, 6, 8]
assert a[4] == 4
b[2] = -1
assert a[4] == -1
Views are powerful as they allow more complex operations without having to copy large amounts of data. Not needing to copy data all the time can mean some operations are faster than they otherwise would be.
Beware, not all index operations create views. eg.
a = np.arange(10, 20)
b = a[[1,2,5]]
assert list(b) == [11, 12, 15]
b[0] = -1
assert a[1] != -1
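When it is unclear whether an indexing operation produced a view or a copy, np.shares_memory can be used to check. A small sketch:

```python
import numpy as np

a = np.arange(10, 20)

sliced = a[::2]        # basic slicing: a view
fancy = a[[1, 2, 5]]   # integer-array indexing: a copy

sliced_is_view = np.shares_memory(a, sliced)
fancy_is_view = np.shares_memory(a, fancy)
```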
Use copy to copy your numpy arrays:
zero_matrix = np.zeros((3,3))
test_matrix = zero_matrix.copy()
test_matrix[1] = 42
print(zero_matrix)
print(test_matrix)
Numpy arrays and python lists behave differently in this regard.
They do indeed have different object IDs but, as you write yourself, test_matrix is a view of zero_matrix.
An object is usually called a "view object" when it provides a way to access another object (be it by reading or by writing). In this case, accesses to this view object are redirected to the other object, for both reading and writing.
That's a speciality of numpy objects as opposed to "normal" python objects.
Python itself has such objects too, but doesn't use them unless explicitly requested.
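Two examples of view objects in plain python, as a sketch: memoryview over a bytearray, and the keys view of a dict. Neither involves numpy:

```python
data = bytearray(b"hello")
view = memoryview(data)   # no copy: a window onto the same bytes

view[0] = ord("j")        # writing through the view changes data

d = {"a": 1}
keys = d.keys()           # dict views track the dict they came from
d["b"] = 2                # keys now reflects the new entry as well
```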
import numpy as np
means = [[2, 2], [8, 3], [3, 6]]
cov = [[1, 0], [0, 1]]
N = 20
X0 = np.random.multivariate_normal(means[0], cov, N)
X1 = np.random.multivariate_normal(means[1], cov, N)
X2 = np.random.multivariate_normal(means[2], cov, N)
X = np.concatenate((X0, X1, X2), axis = 0)
Y = X[np.random.choice(X.shape[0], 3, replace=False)]
A = [X[np.random.choice(X.shape[0], 3, replace=False)]]
B = A[-1]
print(Y), print(type(Y))
print(A), print(type(A))
print(B), print(type(B))
>>>
[[3.58758421 6.83484817]
[9.10469916 4.23009063]
[7.24996633 4.0524614 ]]
<class 'numpy.ndarray'>
[array([[3.22836848, 7.06719777],
[2.33102712, 0.96966102],
[2.06576315, 4.84061538]])]
<class 'list'>
[[3.22836848 7.06719777]
[2.33102712 0.96966102]
[2.06576315 4.84061538]]
<class 'numpy.ndarray'>
Can you help me explain
What does X[np.random.choice(X.shape[0], 3, replace=False)] mean?
Is np.random.choice() supposed to return a new array?
Why Y and A return different results?
Is B supposed to return the last element in the list?
Thank you!
You can find the docs for scipy and numpy here as referenced in the comments.
Y is a numpy.ndarray object, and A is a list object. This is due to the [brackets] you have when you create A. The first and only element in A (the list) is an array built the same way as Y, but from a separate random draw, which is why the printed numbers differ.
B does return the last element in the list. The last element in the list is the array object.
I would recommend reading this documentation on numpy.random.choice to find out exactly how the function works. In this instance, it essentially chooses 3 random indices from the numpy array X.
Y = X[np.random.choice(X.shape[0], 3, replace=False)]
This line can be thought of like this: Choose 3 random rows from X, create a new numpy array containing those rows, and call it Y.
A = [X[np.random.choice(X.shape[0], 3, replace=False)]]
Then, define a regular python list. This is a list with only one element. That one element is a numpy array of 3 random rows from X. The key concept is that A only has one element. However, that one element happens to be an array, which itself has 3 rows.
B = A[-1]
Finally, you are right that this returns the last element of A, and calls it B. From above, we know that A only has one element, an array of 3 elements. Therefore, that array is the last element of the list A.
The major takeaway is that python allows you to have lists of lists, lists of numpy arrays, etc.
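The steps above can be traced on toy data. In this sketch X is a small integer array rather than the random samples from the question, and idx is an illustrative name for the indices that np.random.choice returns:

```python
import numpy as np

X = np.arange(20).reshape(10, 2)   # 10 rows, 2 columns (toy data)

idx = np.random.choice(X.shape[0], 3, replace=False)  # 3 distinct row indices
Y = X[idx]      # new array holding those 3 rows
A = [X[idx]]    # a Python list with one element, which is an array
B = A[-1]       # the last (and only) element of A
```

Here Y and B hold the same rows only because the same idx is reused. In the question, np.random.choice is called twice, so Y and A are built from different draws.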
I want to increment a numpy array using advanced indexing, e.g.
import numpy
x = numpy.array([0,0])
indices = numpy.array([1,1])
x[indices] += [1,2]
print(x) # prints [0 2]
I would have expected the result to be [0 3], since both 1 and 2 should be added to the second zero of x, but apparently numpy only adds the last element which matches a particular index.
Is this the general behaviour and I can rely on that, or is this undefined behaviour and could change with a different version of numpy?
Additionally, is there an (easy) way to get numpy to add all elements which match the index and not just the last one?
From numpy docs:
For advanced assignments, there is in general no guarantee for the iteration order. This means that if an element is set more than once, it is not possible to predict the final result.
You can use np.add.at to get the desired behaviour:
Help on built-in function at in numpy.add:
numpy.add.at = at(...) method of numpy.ufunc instance
at(a, indices, b=None)
Performs unbuffered in place operation on operand 'a' for elements
specified by 'indices'. For addition ufunc, this method is equivalent to
`a[indices] += b`, except that results are accumulated for elements that
are indexed more than once. For example, `a[[0,0]] += 1` will only
increment the first element once because of buffering, whereas
`add.at(a, [0,0], 1)` will increment the first element twice.
.. versionadded:: 1.8.0
< snip >
Example:
>>> b = np.ones(2, int)
>>> a = np.zeros(2, int)
>>> c = np.arange(2,4)
>>> np.add.at(a, b, c)
>>> a
array([0, 5])
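Applied to the arrays from the question, np.add.at gives the accumulated result. The np.bincount line is an alternative I am adding for comparison; it only applies when the indices are non-negative integers, and it returns floats when weights are given:

```python
import numpy as np

x = np.array([0, 0])
indices = np.array([1, 1])

np.add.at(x, indices, [1, 2])   # unbuffered: both additions land on x[1]

# Alternative for non-negative integer indices: sum weights per index.
totals = np.bincount(indices, weights=[1, 2], minlength=2)
```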