import numpy as np
means = [[2, 2], [8, 3], [3, 6]]
cov = [[1, 0], [0, 1]]
N = 20
X0 = np.random.multivariate_normal(means[0], cov, N)
X1 = np.random.multivariate_normal(means[1], cov, N)
X2 = np.random.multivariate_normal(means[2], cov, N)
X = np.concatenate((X0, X1, X2), axis = 0)
Y = X[np.random.choice(X.shape[0], 3, replace=False)]
A = [X[np.random.choice(X.shape[0], 3, replace=False)]]
B = A[-1]
print(Y), print(type(Y))
print(A), print(type(A))
print(B), print(type(B))
>>>
[[3.58758421 6.83484817]
[9.10469916 4.23009063]
[7.24996633 4.0524614 ]]
<class 'numpy.ndarray'>
[array([[3.22836848, 7.06719777],
[2.33102712, 0.96966102],
[2.06576315, 4.84061538]])]
<class 'list'>
[[3.22836848 7.06719777]
[2.33102712 0.96966102]
[2.06576315 4.84061538]]
<class 'numpy.ndarray'>
Can you help me understand the following?
What does X[np.random.choice(X.shape[0], 3, replace=False)] mean?
Is np.random.choice() supposed to return a new array?
Why do Y and A return different results?
Is B supposed to return the last element in the list?
Thank you!
You can find the documentation for numpy and scipy on their official sites (numpy.org and scipy.org).
Y is a numpy.ndarray object, and A is a list object. This is due to the [brackets] you have when you create A. The first and only element in A (the list) is Y (the array).
B does return the last element in the list. The last element in the list is the array object.
I would recommend reading the documentation for numpy.random.choice to see exactly how the function works. In this instance, it picks 3 distinct random indices into X, i.e. 3 values drawn without replacement from range(X.shape[0]).
Y = X[np.random.choice(X.shape[0], 3, replace=False)]
This line can be thought of like this: choose 3 random rows of X, create a new numpy array containing those rows, and call it Y.
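For example, here is a minimal, self-contained sketch of the same idea with a small stand-in array (the names idx and rows are just for illustration):
import numpy as np
X = np.arange(12).reshape(6, 2)                       # small 6x2 stand-in for your X
idx = np.random.choice(X.shape[0], 3, replace=False)  # 3 distinct row indices, e.g. array([4, 0, 2])
rows = X[idx]                                         # fancy indexing: builds a new array from those rows
print(rows.shape)                                     # (3, 2)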
A = [X[np.random.choice(X.shape[0], 3, replace=False)]]
Then, define a regular python list. This is a list with only one element. That one element is a numpy array of 3 random rows from X. The key concept is that A has only one element; however, that one element happens to be an array, which itself holds 3 rows.
B = A[-1]
Finally, you are right that this returns the last element of A and calls it B. From above, we know that A has only one element, an array of 3 rows. Therefore, that array is the last (and only) element of the list A.
The major takeaway is that python allows you to have lists of lists, lists of numpy arrays, etc.
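To make the difference concrete, here is a quick check you can run on your own variables (the exact numbers will differ because the choices are random):
print(Y.shape)     # (3, 2) -- Y is itself an array holding 3 rows
print(len(A))      # 1      -- A is a list holding a single object
print(A[0].shape)  # (3, 2) -- that single object is an array of 3 rows
print(B is A[-1])  # True   -- B is the very same array object stored in A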
Related
When I print out the following code, Q prints the way it's supposed to (3 5 7 9), i.e. the sum of each number with the next one, but in the variable explorer it's a single integer. I want to get the result Q as an array like
Q = [3, 5, 7, 9]
import numpy as np
A = [1, 2, 3, 4, 5]
for i in range(0, 4):
    Q = np.array(A[i] + A[i+1])
    print(Q)
for i in range(0, 4):
    Q = []
    Q.append(Q[i] + A[i] + A[i+1])
    print(Q)
This also doesn't work.
Currently you're just re-declaring Q each time, and it's never added to any collection of values.
Instead, start with an empty list (or perhaps a numpy array in your case) outside of your loop, and append the values to it on each loop cycle.
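In sketch form with a plain python list (using a fresh name sums so it doesn't collide with your Q, and the same A as in your code):
sums = []                       # create the list once, outside the loop
for i in range(4):
    sums.append(A[i] + A[i+1])  # list.append modifies the list in place
print(sums)                     # [3, 5, 7, 9]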
Q is a numpy array, but it's not what you're expecting!
It has no dimensions and only references a single value
>>> type(Q)
<class 'numpy.ndarray'>
>>> print(repr(Q))
array(9)
>>> import numpy as np
>>> A = [1, 2, 3, 4, 5]
>>> Q = np.array([], dtype=np.uint8)
>>> for i in range(4):
...     Q = np.append(Q, A[i]+A[i+1])  # reassign each time for np
...
>>> print(Q)
[3 5 7 9]
Note that with numpy arrays you must capture the result of np.append (it returns a new array rather than modifying its argument), while a normal python list has an .append() method (which does not return the list, but appends to it in place).
>>> l = ['a', 'b', 'c'] # start with a list of values
>>> l.append('d') # use the append method
>>> l # display resulting list
['a', 'b', 'c', 'd']
If you're not forced to use a numpy array to begin with, this can be done with a list comprehension
The resulting list can also be made into a numpy array afterwards
>>> [(x + x + 1) for x in range(1, 5)]
[3, 5, 7, 9]
All together with simplified math
>>> np.array([x*2+3 for x in range(4)])
array([3, 5, 7, 9])
If you want to use Numpy, then use Numpy. Start with a Numpy array (one-dimensional, containing the values), which looks like this:
A = np.array([1, 2, 3, 4, 5])
(Yes, you initialize it from the list).
Or you can create that kind of patterned data using Numpy's built-in tool:
A = np.arange(1, 6) # it works similarly to the built-in `range` type,
# but it does create an actual array.
Now we can get the values to use on the left-hand and right-hand sides of the addition:
# You can slice one-dimensional Numpy arrays just like you would lists.
# With more dimensions, you can slice in each dimension.
X = A[:-1]
Y = A[1:]
And add the values together element-wise:
Q = X + Y # yes, really that simple!
And that last line is the reason you would use Numpy to solve a problem like this. Otherwise, just use a list comprehension:
A = list(range(1, 6)) # same as [1, 2, 3, 4, 5]
# Same slicing, but now we have to do more work for the addition,
# by explaining the process of pairing up the elements.
Q = [x + y for x, y in zip(A[:-1], A[1:])]
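As a quick sanity check, both versions produce the same values (a self-contained sketch; Q_np and Q_list are illustrative names, not from the answer above):
import numpy as np
A = np.arange(1, 6)
Q_np = A[:-1] + A[1:]                                       # numpy: element-wise addition of the slices
A_list = list(range(1, 6))
Q_list = [x + y for x, y in zip(A_list[:-1], A_list[1:])]   # plain python equivalent
print(Q_np)    # [3 5 7 9]
print(Q_list)  # [3, 5, 7, 9]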
For a 2D numpy array A, the loop for a in A will loop through all the rows in A. This functionality is what I want for my code, but I'm having difficulty with the edge case where A only has one row (i.e., is essentially a 1-dimensional array). In this case, the for loop treats A as a 1D array and iterates through its elements. What I want to instead happen in this case is a natural extension of the 2D case, where the loop retrieves the (single) row in A. Is there a way to format the array A such that the for loop functions like this?
If you declare the array yourself, you can simply make it 2D from the start:
A = np.array([[1, 2, 3]])
Otherwise, you can check the number of dimensions of your array before iterating over it:
B = np.array([1, 2, 3])
if B.ndim == 1:
    B = B[None, :]
Or you can use the function np.atleast_2d:
C = np.array([1, 2, 3])
C = np.atleast_2d(C)
If your array truly is a 2D array, even with only one row, there is no edge case:
import numpy
a = numpy.array([[1, 2, 3]])
for line in a:
    print(line)
>>> [1 2 3]
You seem to be confusing numpy.array([[1, 2, 3]]) which is a 2D array of one line and numpy.array([1, 2, 3]) which would be a 1D array.
I think you can use np.expand_dims to achieve your goal
X = np.expand_dims(X, axis=0)
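A minimal sketch of what that does to the shape (the values here are just for illustration):
import numpy as np
X = np.array([1, 2, 3])        # shape (3,)
X = np.expand_dims(X, axis=0)  # shape (1, 3)
for row in X:
    print(row)                 # one iteration, printing the whole row: [1 2 3]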
I am trying to iterate over the rows of a numpy array. The array may consist of two columns and multiple rows like [[a, b], [c, d], ...], or sometimes a single row like [a, b].
For the one-dimensional array, when I use enumerate to iterate over rows, python yields the individual elements a then b instead of the complete row [a, b] all at once.
How do I iterate over the one-dimensional case in the same way as I would the 2D case?
Numpy iterates over the first dimension no matter what. Check the shape before you iterate.
>>> x = np.array([1, 2])
>>> x.ndim
1
>>> y = np.array([[1, 2], [3, 4], [5, 6]])
>>> y.ndim
2
Probably the simplest method is to always wrap in a call to np.array:
>>> x = np.array(x, ndmin=2, copy=False)
>>> y = np.array(y, ndmin=2, copy=False)
This will prepend a dimension of shape 1 to your array as necessary. It has the advantage that your inputs don't even have to be arrays, just something that can be converted to an array.
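Continuing the session above, the shapes come out as expected (x gains a leading length-1 axis, y is unchanged):
>>> x.shape
(1, 2)
>>> y.shape
(3, 2)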
Another option is to use the atleast_2d function:
>>> x = np.atleast_2d(x)
All that being said, you are likely sacrificing most of the benefits of using numpy in the first place by attempting a vanilla python loop. Try to vectorize your operation instead.
Let's say I have a numpy array like
x = np.arange(10)
is it somehow possible to create a reference to a single element i.e.
y = create_a_reference_to(x[3])
y = 100
print x
[ 0 1 2 100 4 5 6 7 8 9]
You can't create a reference to a single element, but you can get a view over that single element:
>>> import numpy
>>> x = numpy.arange(10)
>>> y = x[3:4]
>>> y[0] = 100
>>> x
array([0, 1, 2, 100, 4, 5, 6, 7, 8, 9])
The reason you can't do the former is that everything in python is a reference. By doing y = 100, you're modifying what y points to, not its value.
If you really want to, you can get that behaviour on instance attributes by using properties. Note this is only possible because the python data model specifies additional operations while accessing class attributes - it's not possible to get this behaviour for variables.
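A minimal sketch of that idea (the class and attribute names here are invented for illustration, not a standard numpy API):
import numpy as np

class ThirdElement:
    """Expose x[3] as an attribute, so assigning to it writes through to the array."""
    def __init__(self, x):
        self.x = x

    @property
    def value(self):
        return self.x[3]

    @value.setter
    def value(self, new_value):
        self.x[3] = new_value

x = np.arange(10)
ref = ThirdElement(x)
ref.value = 100   # goes through the setter, so the underlying array is modified
print(x)          # [  0   1   2 100   4   5   6   7   8   9]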
No you cannot do that, and that is by design.
Numpy arrays are of type numpy.ndarray. Individual items in it can be accessed with numpy.ndarray.item which does "copy an element of an array to a standard Python scalar and return it".
I'm guessing numpy returns a copy instead of a direct reference to the element to prevent numpy items from being mutated outside of numpy's own implementation.
Just as a thought experiment, let's assume this wasn't the case and you were allowed to get a reference to an individual item. Then what would happen if numpy was in the middle of a calculation and you altered that individual item from another thread?
@goncalopp gives a correct answer, but there are a few variations that will achieve similar effects.
All of the notations shown below are able to reference a single element while still returning a view:
import numpy as np
x = np.arange(10)
two_index_method = [None] * 10
scalar_element_method = [None] * 10
expansion_method = [None] * 10
for i in range(10):
    two_index_method[i] = x[i:i+1]
    scalar_element_method[i] = x[..., i]       # x[i, ...] works, too
    expansion_method[i] = x[:, np.newaxis][i]  # np.newaxis == None
two_index_method[5] # Returns a length 1 numpy.ndarray, shape=(1,)
# >>> array([5])
scalar_element_method[5] # Returns a numpy scalar, shape = ()
# >>> array(5)
expansion_method[5] # Returns a length 1 numpy.ndarray, shape=(1,)
# >>> array([5])
x[5] = 42 # Change the value in the original `ndarray`
x
# >>> array([0, 1, 2, 3, 4, 42, 6, 7, 8, 9]) # The element has been updated
# All methods presented here are correspondingly updated:
two_index_method[5], scalar_element_method[5], expansion_method[5]
# >>> (array([42]), array(42), array([42]))
Since the object in scalar_element_method is a zero-dimensional array, attempting to reference the element it contains via element[0] will raise an IndexError. For a scalar ndarray, element[()] can be used instead to reference the contained element. This syntax can also be used for assignment to a length-1 ndarray, but it has the unfortunate side effect that it does not dereference a length-1 ndarray to a python scalar. Fortunately, there is a single method, element.item(), that can be used (for dereferencing only) to obtain the value regardless of whether the element is a length-1 ndarray or a scalar ndarray:
scalar_element_method[5][0] # This fails
# >>> IndexError: too many indices for array
scalar_element_method[5][()] # This works for scalar `ndarray`s
# >>> 42
scalar_element_method[5][()] = 6
expansion_method[5][0] # This works for length-1 `ndarray`s
# >>> 6
expansion_method[5][()] # Doesn't return a python scalar (or even a numpy scalar)
# >>> array([6])
expansion_method[5][()] = 8 # But can still be used to change the value by reference
scalar_element_method[5].item() # item() works to dereference all methods
# >>> 8
expansion_method[5].item()
# >>> 8
TLDR; You can create a single-element view v with v = x[i:i+1], v = x[..., i], or v = x[:, None][i]. While different setters and getters work with each method, you can always assign values with v[()]=new_value, and you can always retrieve a python scalar with v.item().
Say that I have 4 numpy arrays
[1,2,3]
[2,3,1]
[3,2,1]
[1,3,2]
In this case, I've determined [1,2,3] is the "minimum array" for my purposes, as it is one of two arrays with the lowest value at index 0, and of those two it has the lowest value at index 1. If there were more arrays tied on those values, I would need to compare the next index values, and so on.
How can I extract the array [1,2,3] in that same order from the pile?
How can I extend that to x arrays of size n?
Thanks
Using plain python's .sort() or sorted() on a list of lists (not numpy arrays) automatically does this, e.g.
a = [[1,2,3],[2,3,1],[3,2,1],[1,3,2]]
a.sort()
gives
[[1,2,3],[1,3,2],[2,3,1],[3,2,1]]
The numpy sort only sorts the elements within each sub-array; it doesn't compare whole rows lexicographically, so it seems the best way would be to convert it to a python list first. Assuming you have an array of arrays you want to pick the minimum of, you could get the minimum as
sorted(a.tolist())[0]
As someone pointed out, you could also do min(a.tolist()), which uses the same type of comparisons as sort and would be faster for large arrays (linear vs n log n asymptotic run time).
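For the example above, that looks like this (assuming a is the 4x3 numpy array from the question):
import numpy as np
a = np.array([[1, 2, 3], [2, 3, 1], [3, 2, 1], [1, 3, 2]])
print(sorted(a.tolist())[0])   # [1, 2, 3]
print(min(a.tolist()))         # [1, 2, 3] -- same result without sorting everything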
Here's an idea using numpy:
import numpy
a = numpy.array([[1,2,3],[2,3,1],[3,2,1],[1,3,2]])
col = 0
while a.shape[0] > 1 and col < a.shape[1]:
    b = a[:, col]          # values in the current column
    a = a[b == b.min()]    # keep only the rows that tie for the minimum so far
    col += 1
print(a)
This checks column by column until only one candidate row is left (or the columns run out, if there are duplicate rows).
numpy's lexsort is close to what you want. It sorts on the last key first, but that's easy to get around:
>>> a = np.array([[1,2,3],[2,3,1],[3,2,1],[1,3,2]])
>>> order = np.lexsort(a[:, ::-1].T)
>>> order
array([0, 3, 1, 2])
>>> a[order]
array([[1, 2, 3],
[1, 3, 2],
[2, 3, 1],
[3, 2, 1]])
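Since you only want the minimum row, you can also take just the first index from order instead of reordering the whole array:
>>> a[order[0]]
array([1, 2, 3])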