How do I extract rows of a 2D NumPy array by condition? - python

I have a 4*5 NumPy array and I want to retrieve rows in which all elements are less than 5.
arr = np.array([[0,2,3,4,5],[1,2,4,1,3], [2,2,5,4,6], [0,2,3,4,3]])
arr[np.where(arr[:,:] <= 4)]
expected output:
[[1,2,4,1,3],[0,2,3,4,3]]
actual output:
array([0, 2, 3, 4, 1, 2, 4, 1, 3, 2, 2, 4, 0, 2, 3, 4, 3])
Any help is appreciated!

This is actually quite simple. Just convert the entire array to booleans (each value is True if it's less than 5, False otherwise), and use np.all with axis=1 to return True for each row where all items are True:
>>> arr[np.all(arr < 5, axis=1)]
array([[1, 2, 4, 1, 3],
       [0, 2, 3, 4, 3]])
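To see why this works, and why the np.where attempt in the question returns a flat array, it can help to look at the intermediate boolean arrays. This is a minimal sketch using the array from the question; the names elem_mask, row_mask, rows and flat are only for illustration.

```python
import numpy as np

arr = np.array([[0, 2, 3, 4, 5],
                [1, 2, 4, 1, 3],
                [2, 2, 5, 4, 6],
                [0, 2, 3, 4, 3]])

# Element-wise comparison: one boolean per element, same shape as arr
elem_mask = arr < 5

# Collapse each row to a single boolean: True only if every element passes
row_mask = np.all(elem_mask, axis=1)

# Indexing with a 1D boolean mask selects whole rows
rows = arr[row_mask]

# Indexing with the 2D mask selects individual elements instead,
# which is why the attempt in the question produced a flat array
flat = arr[elem_mask]
```

The key difference is the shape of the mask: a mask with one entry per row keeps the 2D structure, while a mask with one entry per element cannot.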

Related

How to iterate list in numpy and avoid TypeError: Only integer scalar arrays can be converted to a scalar index

I am using numpy:
I have a list:[array([2, 5, 0, 6, 6, 0, 2, 0]), array([3, 2, 5, 4, 4, 5, 6, 0]), array([1, 1, 5, 1, 4, 6, 0, 0]), array([1, 3, 5, 4, 2, 2, 5, 3]), array([5, 0, 6, 3, 1, 0, 5, 3]), array([1, 5, 1, 6, 0, 3, 5, 5]), array([4, 6, 1, 1, 3, 5, 2, 6]), array([5, 5, 1, 2, 6, 0, 5, 0])] <class 'list'>
I want to be able to iterate each array in the list and pass it through a function and make a new list of outcomes for that I have this:
fit=[]
for i in collection:
    state = collection[i]
    test = Review(state)
    fit.append(test.function())
print(fit)
But I get the following type error:
TypeError: only integer scalar arrays can be converted to a scalar index
i needs to be an int, but in this case it is an array from the list. What I need to do is pass each array to this function, get a result, and add it to the new fit list.
The for loop iterates over collection, so i will be an element of collection rather than an index. You're getting the error because i is an array, not an int. The line state = collection[i] is therefore unnecessary; you can simply use state = i.
After your comment, if you would like to iterate over the inner arrays you will need a second loop. To take your example of summing the arrays it would look something like this:
for i in collection:
    arr_sum = 0
    for j in i:
        arr_sum += j
    print(f'Array sum is {arr_sum}')
Note that for the application of a simple sum you can use the sum() function.
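As a rough sketch of that last remark, the inner loop can be replaced by each array's own sum() method. The three arrays below are taken from the list in the question; the name sums is only illustrative.

```python
import numpy as np

collection = [np.array([2, 5, 0, 6, 6, 0, 2, 0]),
              np.array([3, 2, 5, 4, 4, 5, 6, 0]),
              np.array([1, 1, 5, 1, 4, 6, 0, 0])]

# Each loop variable is already one of the arrays, so no indexing is needed
sums = [int(state.sum()) for state in collection]
print(sums)
```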
To iterate over list of arrays, try this:
fit=[]
for state in collection: #Iterate over each element in the collection
    test = Review(state)
    fit.append(test.function())
print(fit)
Or
fit=[]
for i in collection:
    state = i
    test = Review(state)
    fit.append(test.function())
print(fit)
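The same loop can also be written as a list comprehension. The sketch below is self-contained only because it defines a hypothetical stand-in for Review, whose real definition is not shown in the question; the stand-in simply returns the largest value of the array. If an integer position is needed as well, enumerate provides one alongside each array.

```python
import numpy as np

class Review:
    """Hypothetical stand-in for the asker's class (not shown in the question)."""

    def __init__(self, state):
        self.state = state

    def function(self):
        # Placeholder result: the largest value in the array
        return int(self.state.max())

collection = [np.array([2, 5, 0, 6]), np.array([3, 2, 5, 4])]

# Iterate over the arrays directly
fit = [Review(state).function() for state in collection]

# enumerate yields an integer index together with each array
indexed_fit = [(idx, Review(state).function())
               for idx, state in enumerate(collection)]
```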

How would you reshuffle this array efficiently?

I have an array arr_val, which stores the values of a certain function at a large number of locations (for illustration, let's take just 4). I also have another array, loc_array, which stores the locations of the function; it also has 4 entries. However, loc_array is multidimensional: each location index holds 4 sub-locations, and each sub-location is a pair of coordinates. To illustrate:
arr_val = np.array([1, 2, 3, 4])
loc_array = np.array([[[1,1],[2,3],[3,1],[3,2]],[[1,2],[2,4],[3,4],[4,1]],
[[2,1],[1,4],[1,3],[3,3]],[[4,2],[4,3],[2,2],[4,4]]])
The meaning of the two arrays above is that the value of some parameter of interest at, for example, locations [1,1], [2,3], [3,1], [3,2] is 1, and so on. However, I would like to re-express the same information in a different form: instead of having the points in arbitrary order, I would like to have the coordinates in the following tractable form
coord = [[[1,1],[1,2],[1,3],[1,4]],[[2,1],[2,2],[2,3],[2,4]],[[3,1],[3,2],
[3,3],[3,4]],[[4,1],[4,2],[4,3],[4,4]]]
and the values at respective coordinates given as
val = [[1, 2, 3, 3],[3, 4, 1, 2],[1, 1, 3, 2], [2, 4, 4, 4]]
What would be a very efficient way to achieve the above for large numpy arrays?
You can use lexsort like so:
>>> order = np.lexsort(loc_array.reshape(-1, 2).T[::-1])
>>> arr_val.repeat(4)[order].reshape(4, 4)
array([[1, 2, 3, 3],
       [3, 4, 1, 2],
       [1, 1, 3, 2],
       [2, 4, 4, 4]])
If you know for sure that loc_array is a permutation of all possible locations then you can avoid the sort:
>>> out = np.empty((4, 4), arr_val.dtype)
>>> out.ravel()[np.ravel_multi_index((loc_array-1).reshape(-1, 2).T, (4, 4))] = arr_val.repeat(4)
>>> out
array([[1, 2, 3, 3],
       [3, 4, 1, 2],
       [1, 1, 3, 2],
       [2, 4, 4, 4]])
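A variant of the second approach, sketched here as an alternative rather than as what the answer itself does, is to skip ravel_multi_index and assign through a pair of integer index arrays. It relies on the same assumption that loc_array covers every location exactly once; coords and values are illustrative names.

```python
import numpy as np

arr_val = np.array([1, 2, 3, 4])
loc_array = np.array([[[1, 1], [2, 3], [3, 1], [3, 2]],
                      [[1, 2], [2, 4], [3, 4], [4, 1]],
                      [[2, 1], [1, 4], [1, 3], [3, 3]],
                      [[4, 2], [4, 3], [2, 2], [4, 4]]])

# Flatten to a list of (row, col) pairs and shift from 1-based to 0-based
coords = loc_array.reshape(-1, 2) - 1

# One value per coordinate pair: each arr_val entry covers its sub-locations
values = np.repeat(arr_val, loc_array.shape[1])

out = np.empty((4, 4), dtype=arr_val.dtype)
out[coords[:, 0], coords[:, 1]] = values
```

Because no sort is involved, the cost grows linearly with the number of coordinate pairs.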
This may not be the answer you want, but it works anyway.
val = [[1, 2, 3, 3],[3, 4, 1, 2],[1, 1, 3, 2], [2, 4, 4, 4]]
temp= ""
int_list = []
for element in val:
    temp_int = temp.join(map(str, element))
    int_list.append(int(temp_int))
int_list.sort()
print(int_list)
## result ##
[1132, 1233, 2444, 3412]
Convert each inner list into an int and collect the results in int_list
Sort int_list
Construct a 2D np.array from int_list
I skipped the last step; you can find how to do it on the web.
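One possible way to do the skipped step is sketched below, under the assumption that every entry is a single digit (otherwise the digits of neighbouring entries cannot be told apart). Note that this only reorders the rows of val; it does not compute val from arr_val and loc_array.

```python
import numpy as np

int_list = [1132, 1233, 2444, 3412]

# Split each integer back into its decimal digits, one row per integer
rows = np.array([[int(digit) for digit in str(number)]
                 for number in int_list])
```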

How to roll two arrays of different dimensions into a one-dimensional array in Python

I have two arrays (a, b) with different m x n dimensions.
I need to know how I can roll these two arrays into a single one-dimensional array.
I used flatten() on both a and b and then rolled them into a single array, but what I get is an array containing two one-dimensional arrays (a, b):
a = np.array([[1,2,3,4],[3,4,5,6],[4,5,6,7]]) #3x4 array
b = np.array([ [1,2],[2,3],[3,4],[4,5],[5,6]]) #5x2 array
result = [a.flatten(),b.flatten()]
print(result)
[array([1, 2, 3, 4, 3, 4, 5, 6, 4, 5, 6, 7]), array([1, 2, 2, 3, ... 5, 6])]
In matlab , I would do it like this :
res = [a(:);b(:)]
Also, how can I retrieve a and b back from the result?
Use ravel + concatenate:
>>> np.concatenate((a.ravel(), b.ravel()))
array([1, 2, 3, 4, 3, 4, 5, 6, 4, 5, 6, 7, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6])
ravel returns a 1D view of the array whenever possible (and a copy otherwise), so it is usually a cheap operation. concatenate joins the flattened arrays together, returning a new array.
As an aside, if you want to be able to retrieve these arrays back, you'll need to store their shapes in some variable.
i = a.shape
j = b.shape
res = np.concatenate((a.ravel(), b.ravel()))
Later, to retrieve a and b from res,
a = res[:np.prod(i)].reshape(i)
b = res[np.prod(i):].reshape(j)
a
array([[1, 2, 3, 4],
       [3, 4, 5, 6],
       [4, 5, 6, 7]])
b
array([[1, 2],
       [2, 3],
       [3, 4],
       [4, 5],
       [5, 6]])
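The same idea extends to any number of arrays. The following is a minimal sketch, not part of the answer above: pack and unpack are hypothetical helper names, and np.split is used to cut the flat array at the cumulative sizes.

```python
import numpy as np

def pack(arrays):
    """Flatten and join arrays; also return their shapes for later unpacking."""
    shapes = [arr.shape for arr in arrays]
    flat = np.concatenate([arr.ravel() for arr in arrays])
    return flat, shapes

def unpack(flat, shapes):
    """Split a packed 1D array back into arrays of the recorded shapes."""
    sizes = [int(np.prod(shape)) for shape in shapes]
    # Split points are the cumulative sizes, excluding the final total
    split_points = np.cumsum(sizes)[:-1]
    pieces = np.split(flat, split_points)
    return [piece.reshape(shape) for piece, shape in zip(pieces, shapes)]

a = np.array([[1, 2, 3, 4], [3, 4, 5, 6], [4, 5, 6, 7]])  # 3x4 array
b = np.array([[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]])    # 5x2 array

res, shapes = pack([a, b])
a_back, b_back = unpack(res, shapes)
```

Keeping the shapes next to the packed array avoids having to track one variable per input.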
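How about changing the middle line to:
result = np.concatenate([a.flatten(), b.flatten()])
Or, equivalently for 1D inputs:
result = np.hstack((a.flatten(), b.flatten()))
Note that calling .flatten() on the Python list itself would fail, because lists have no such method, and a.flatten() + b.flatten() would attempt element-wise addition (an error here, since the lengths differ) rather than joining the arrays.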

How to find the indices of a vectorised matrix numpy

I have an ndarray in NumPy (n x n x n), which I vectorise in order to do some sampling of my data in a particular way, giving me (1 x n^3).
I would like to take the individual vectorised indices and convert them back to n-dimensional indices in the form (n x n x n). I'm not sure how NumPy actually vectorises matrices.
Can anyone advise?
Numpy has a function unravel_index which does pretty much that: given a set of 'flat' indices, it will return a tuple of arrays of indices in each dimension:
>>> indices = np.arange(25, dtype=int)
>>> np.unravel_index(indices, (5, 5))
(array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4,
        4, 4], dtype=int64),
 array([0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 0, 1, 2,
        3, 4], dtype=int64))
You can then zip them to get your original indices.
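For the three-dimensional case in the question, the zip step might look like the sketch below. The array data, the sampled positions in flat_idx and the name triples are all made up for illustration, and the default 'C' order is assumed for the flattening.

```python
import numpy as np

n = 3
data = np.arange(n ** 3).reshape(n, n, n)
vec = data.ravel()                       # 1D vector of length n**3 ('C' order)

flat_idx = np.array([0, 5, 13, 26])      # arbitrary sampled positions in vec
i, j, k = np.unravel_index(flat_idx, data.shape)

# Pair up the per-axis index arrays into (i, j, k) tuples
triples = [tuple(int(v) for v in t) for t in zip(i, j, k)]

# Both ways of indexing pick out the same elements
sampled_match = np.array_equal(data[i, j, k], vec[flat_idx])
```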
Be aware however that matrices can be flattened as 'sequences of rows' (C convention, 'C') or 'sequences of columns' (Fortran convention, 'F'), or the corresponding convention in higher dimensions. NumPy's flattening functions (ravel, flatten) use 'C' order by default and accept an order argument, so [[1, 2], [3, 4]] flattens into [1, 2, 3, 4] with order='C' or [1, 3, 2, 4] with order='F'. unravel_index takes an optional order parameter if you want to change the default (which is 'C'), so you can do:
>>> # Typically, transposition will change the order for
>>> # efficiency reasons: no need to change the data !
>>> n = np.random.random((2, 2, 2)).transpose()
>>> n.flags.f_contiguous
True
>>> n.flags.c_contiguous
False
>>> x, y, z = np.unravel_index([1,2,3,7], (2, 2, 2), order='F')
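What matters in practice is that the order passed to unravel_index matches the order used when the array was flattened. A small sketch of that check, with a made-up array data and illustrative names:

```python
import numpy as np

data = np.arange(8).reshape(2, 2, 2)
vec_f = data.ravel(order='F')            # column-major flattening

flat_idx = [1, 2, 3, 7]
x, y, z = np.unravel_index(flat_idx, data.shape, order='F')

# With matching orders, the unravelled indices address the same elements
same = np.array_equal(data[x, y, z], vec_f[flat_idx])
```

In 'F' order the first index varies fastest, so flat index 1 maps to (1, 0, 0) rather than (0, 0, 1).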

Finding differences between all values in a List

I want to find the differences between all values in a numpy array and append it to a new list.
Example: a = [1,4,2,6]
result : newlist= [3,1,5,3,2,2,1,2,4,5,2,4]
i.e. for each value i of a, determine the difference between i and each of the other values in the list.
At this point I have been unable to find a solution.
You can do this:
a = [1,4,2,6]
newlist = [abs(i-j) for i in a for j in a if i != j]
Output:
print(newlist)
[3, 1, 5, 3, 2, 2, 1, 2, 4, 5, 2, 4]
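One caveat: the condition i != j compares values, not positions, so it also drops pairs of equal values that sit at different positions. If the intent is to skip only each element's difference with itself (an assumption about what the question wants), comparing positions with enumerate is a possible alternative. The input below contains a repeated 4 to show the difference.

```python
a = [1, 4, 2, 4]  # note the repeated 4

# Comparing values drops pairs of equal elements at different positions
by_value = [abs(x - y) for x in a for y in a if x != y]

# Comparing positions keeps them and only skips each element against itself
by_position = [abs(x - y)
               for i, x in enumerate(a)
               for j, y in enumerate(a)
               if i != j]
```

For a list without repeated values, such as the one in the question, both versions give the same result.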
I believe what you are trying to do is to calculate absolute differences between elements of the input list, but excluding the self-differences. So, with that idea, this could be one vectorized approach also known as array programming -
import numpy as np

# Input list
a = [1,4,2,6]
# Convert input list to a numpy array
arr = np.array(a)
# Calculate absolute differences between each element
# against all elements to give us a 2D array
sub_arr = np.abs(arr[:,None] - arr)
# Get diagonal indices for the 2D array
N = arr.size
rem_idx = np.arange(N)*(N+1)
# Remove the diagonal elements for the final output
out = np.delete(sub_arr,rem_idx)
Sample run to show the outputs at each step -
In [60]: a
Out[60]: [1, 4, 2, 6]
In [61]: arr
Out[61]: array([1, 4, 2, 6])
In [62]: sub_arr
Out[62]:
array([[0, 3, 1, 5],
       [3, 0, 2, 2],
       [1, 2, 0, 4],
       [5, 2, 4, 0]])
In [63]: rem_idx
Out[63]: array([ 0, 5, 10, 15])
In [64]: out
Out[64]: array([3, 1, 5, 3, 2, 2, 1, 2, 4, 5, 2, 4])
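An alternative to computing the flat diagonal indices and calling np.delete is to select the off-diagonal entries with a boolean mask. This is a sketch of a different technique from the one shown above; off_diag is an illustrative name.

```python
import numpy as np

arr = np.array([1, 4, 2, 6])

# Absolute differences of every element against every element (2D array)
sub_arr = np.abs(arr[:, None] - arr)

# True everywhere except on the main diagonal
off_diag = ~np.eye(arr.size, dtype=bool)

# Boolean indexing returns the selected entries in row-major order
out = sub_arr[off_diag]
```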
