Count different non-zero values of two 2D arrays in Python numpy

I would like to count how many values in a 2D array array1 differ from the values at the same positions (x, y) in array2, ignoring positions where array2 is 0, using NumPy.
array1 = numpy.array([[1, 2], [3, 0]])
array2 = numpy.array([[1, 2], [0, 3]])
print(numpy.count_nonzero(array1 != array2)) # 2
The example above prints 2, because 3 vs. 0 and 0 vs. 3 both differ. Is there a way to skip a difference when the value in array2 is 0? Something like the following, which does not work (ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()):
print(numpy.count_nonzero(array1 != array2 and array2 != 0)) # Should be 1.

a = np.array([[1, 2], [3, 0]])
b = np.array([[1, 2], [0, 3]])
Filter out b's zero values
np.nonzero returns the indices of the non-zero elements. This version uses those multidimensional index arrays to filter out the zero values.
In [144]: b.nonzero()
Out[144]: (array([0, 0, 1], dtype=int64), array([0, 1, 1], dtype=int64))
In [145]: a[b.nonzero()]
Out[145]: array([1, 2, 0])
In [146]: b[b.nonzero()]
Out[146]: array([1, 2, 3])
In [147]: c = a[b.nonzero()] != b[b.nonzero()]
In [148]: c.sum()
Out[148]: 1
This uses boolean indexing to filter out the zero values.
In [149]: b != 0
Out[149]:
array([[ True,  True],
       [False,  True]], dtype=bool)
In [150]: a[b != 0]
Out[150]: array([1, 2, 0])
In [151]: b[b != 0]
Out[151]: array([1, 2, 3])
In [152]: c = a[b != 0] != b[b != 0]
In [153]: c.sum()
Out[153]: 1
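The boolean-indexing approach above can be wrapped in a small helper. This is only a minimal sketch: the name count_diff_where_nonzero is made up for illustration (it is not a NumPy function), and it assumes both inputs have the same shape.

```python
import numpy as np


def count_diff_where_nonzero(a, b):
    """Count positions where a differs from b, ignoring positions where b is 0.

    Hypothetical helper for illustration; assumes a and b have the same shape.
    """
    a = np.asarray(a)
    b = np.asarray(b)
    mask = b != 0  # True where b holds a non-zero value
    return int(np.count_nonzero(a[mask] != b[mask]))


a = np.array([[1, 2], [3, 0]])
b = np.array([[1, 2], [0, 3]])
print(count_diff_where_nonzero(a, b))  # 1
```

Building the mask once and reusing it avoids evaluating b != 0 twice.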

You can achieve that by replacing and, which needs a single truth value per operand, with elementwise multiplication of the two boolean arrays:
print(numpy.count_nonzero((array1 != array2) * (array2 != 0)))
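Multiplying boolean arrays works, but the elementwise & operator and np.logical_and are arguably the more conventional way to combine two masks. A short sketch using the arrays from the question:

```python
import numpy as np

array1 = np.array([[1, 2], [3, 0]])
array2 = np.array([[1, 2], [0, 3]])

# Elementwise AND of the two boolean masks; the parentheses are required
# because & binds more tightly than != in Python.
both = (array1 != array2) & (array2 != 0)
count_and = np.count_nonzero(both)

# np.logical_and is the function form of the same operation.
count_logical = np.count_nonzero(np.logical_and(array1 != array2, array2 != 0))

print(count_and, count_logical)  # 1 1
```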

Related

How do I access the values of a numpy array with an array of indices?

For example,
k = np.array([[[1,2,3,4],[1,2,3,4]]])
index = np.array([[0,0], [0,1]])
I want to be able to get the values from k responding to [0,0] and [0,1].
How could I do this?
If I use a for loop through the array it works.
for y in range(1):
    for x in range(1):
        k[index[y,x]]
However, I would like to do this without using for loops.

In [50]: k = np.array([[[1,2,3,4],[1,2,3,4]]])
...: index = np.array([[0,0], [0,1]])
In [51]: k
Out[51]:
array([[[1, 2, 3, 4],
        [1, 2, 3, 4]]])
In [52]: k.shape
Out[52]: (1, 2, 4)
Note the shape - 3d, due to the 3 levels of []
In [53]: index
Out[53]:
array([[0, 0],
       [0, 1]])
Because this array is symmetric, it doesn't matter whether we use the rows or the columns. For a more general case you'll need to be clearer.
In any case, we index each dimension of k with an array
Using columns of index, and working with the first 2 dimensions:
In [54]: k[index[:,0],index[:,1]]
Out[54]:
array([[1, 2, 3, 4],
       [1, 2, 3, 4]])
Looks much like k except it is 2d.
Or applying a 0 to the first size 1 dimension:
In [55]: k[0,index[:,0],index[:,1]]
Out[55]: array([1, 2])
Read more at https://numpy.org/doc/stable/user/basics.indexing.html
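If each row of index is meant as a (row, column) pair, which is an assumption since the question does not say, the same lookup can be sketched by dropping the size-1 axis first and then unpacking the transposed index array:

```python
import numpy as np

k = np.array([[[1, 2, 3, 4], [1, 2, 3, 4]]])  # shape (1, 2, 4)
index = np.array([[0, 0], [0, 1]])            # assumed: each row is (row, col)

plane = k[0]           # drop the size-1 leading axis, shape (2, 4)
rows, cols = index.T   # rows = [0, 0], cols = [0, 1]
values = plane[rows, cols]
print(values)          # [1 2]
```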

multiple np.where return multiple values can not be accessed properly

When running:
np.where(vals == min(vals))
There are multiple outputs, which means the smallest value in the list appears more than once. The return value makes sense: it is a tuple containing one array:
result = (array([0, 2]),)
However, I tried to access the array and after doing :
result[0]
This becomes the output:
[0 2]
What the heck is this [0 2]?? When I say result[0][1], it said index out of bound!

In [62]: arr = np.array([2,3,2])
In [63]: idx = np.where(arr == np.min(arr))
In [64]: idx
Out[64]: (array([0, 2]),)
This tuple can be used directly to index arr and return the matching values:
In [65]: arr[idx]
Out[65]: array([2, 2])
argwhere just applies transpose to this tuple, turning it into a 2d array:
In [66]: np.argwhere(arr == np.min(arr))
Out[66]:
array([[0],
       [2]])
You can then iterate over the rows of the array to fetch individual values of arr:
In [67]: for i in _66:
    ...:     print(arr[i])
    ...:
[2]
[2]
But I don't know when this iteration would be useful. The indexing in [65] is faster.
A 2d example is more interesting:
In [74]: x = np.arange(12).reshape(3,4)
In [75]: x%3
Out[75]:
array([[0, 1, 2, 0],
       [1, 2, 0, 1],
       [2, 0, 1, 2]])
In [76]: idx = np.where(x%3==0)
In [77]: idx
Out[77]: (array([0, 0, 1, 2]), array([0, 3, 2, 1]))
In [78]: np.argwhere(x%3==0)
Out[78]:
array([[0, 0],
       [0, 3],
       [1, 2],
       [2, 1]])
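On the question itself: [0 2] is simply how print displays a NumPy array (no commas), so result[0] is the index array for the first axis. A small sketch with made-up data matching the output shown in the question; with this data result[0][1] is a valid access, so the reported error presumably came from a different input:

```python
import numpy as np

vals = np.array([2, 3, 2])
result = np.where(vals == vals.min())  # tuple with one index array per dimension

first_axis = result[0]   # the index array for axis 0
print(first_axis)        # [0 2]
print(first_axis[1])     # 2
print(vals[result])      # [2 2]

# For a 1-D input, np.flatnonzero returns the index array directly.
flat = np.flatnonzero(vals == vals.min())
print(flat)              # [0 2]
```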

what's the difference between these two numpy array shape?

In [136]: s = np.array([[1,0,1],[0,1,1],[0,0,1],[1,1,1]])
In [137]: s
Out[137]:
array([[1, 0, 1],
       [0, 1, 1],
       [0, 0, 1],
       [1, 1, 1]])
In [138]: x = s[0:1]
In [139]: x.shape
Out[139]: (1, 3)
In [140]: y = s[0]
In [141]: y.shape
Out[141]: (3,)
In [142]: x
Out[142]: array([[1, 0, 1]])
In [143]: y
Out[143]: array([1, 0, 1])
In the above code, x's shape is (1,3) and y's shape is (3,).
(1,3): 1 row and 3 columns
(3,): How many rows and columns in this case?
Does (3,) represent a 1-dimensional array?
In practice, if I want to iterate through the matrix row by row, which way should I go?
for i in range(len(x)):
    row = x[i]
    # OR
    row = x[i:i+1]

First, you can get the number of dimensions of a NumPy array through len(array.shape), or directly through array.ndim.
An array with some dimensions of length 1 is not equal to an array with those dimensions removed, for example:
>>> a = np.array([[1], [2], [3]])
>>> b = np.array([1, 2, 3])
>>> a
array([[1],
       [2],
       [3]])
>>> b
array([1, 2, 3])
>>> a.shape
(3, 1)
>>> b.shape
(3,)
>>> a + a
array([[2],
       [4],
       [6]])
>>> a + b
array([[2, 3, 4],
       [3, 4, 5],
       [4, 5, 6]])
Conceptually, the difference between an array of shape (3, 1) and one of shape (3,) is like the difference between the length of [100] and 100.
[100] is a list that happens to have one element. It could have more, but right now it has the minimum possible number of elements.
On the other hand, it doesn't even make sense to talk about the length of 100, because it doesn't have one.
Similarly, the array of shape (3, 1) has 3 rows and 1 column, while the array of shape (3,) has no columns at all. It doesn't even have rows, in a sense; it is a row, just like 100 has no elements, because it is an element.
For more information on how differently shaped arrays behave when interacting with other arrays, you can see the broadcasting rules.
Lastly, for completeness, to iterate through the rows of a numpy array, you could just do for row in array. If you want to iterate through the back axes, you can use np.moveaxis, for example:
>>> array = np.array([[1, 2], [3, 4], [5, 6]])
>>> for row in array:
...     print(row)
...
[1 2]
[3 4]
[5 6]
>>> for col in np.moveaxis(array, [0, 1], [1, 0]):
...     print(col)
...
[1 3 5]
[2 4 6]
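To tie this back to the question's x and y: an integer index removes an axis, while a slice keeps it. The sketch below also shows a few standard ways to convert between the two shapes:

```python
import numpy as np

s = np.array([[1, 0, 1], [0, 1, 1], [0, 0, 1], [1, 1, 1]])

y = s[0]      # integer index removes the axis -> shape (3,)
x = s[0:1]    # slice keeps the axis           -> shape (1, 3)
print(y.ndim, x.ndim)  # 1 2

# Converting between the two forms
as_row = y[np.newaxis, :]   # (3,)   -> (1, 3)
as_col = y[:, np.newaxis]   # (3,)   -> (3, 1)
back = x.squeeze(axis=0)    # (1, 3) -> (3,)
print(as_row.shape, as_col.shape, back.shape)  # (1, 3) (3, 1) (3,)
```

For plain row-by-row iteration, x[i] (or for row in x) yields 1-D rows; x[i:i+1] is only needed when later code expects a 2-D array.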

Use numpy.argwhere to obtain the matching values in an np.array

I'd like to use np.argwhere() to obtain the values in an np.array.
For example:
z = np.arange(9).reshape(3,3)
# [[0 1 2]
#  [3 4 5]
#  [6 7 8]]
zi = np.argwhere(z % 3 == 0)
# [[0 0]
#  [1 0]
#  [2 0]]
I want this array: [0, 3, 6] and did this:
t = [z[tuple(i)] for i in zi] # -> [0, 3, 6]
I assume there is an easier way.

Why not simply use masking here:
z[z % 3 == 0]
For your sample matrix, this will generate:
>>> z[z % 3 == 0]
array([0, 3, 6])
If you index an array with a boolean array of the same shape, you get a 1D array of the elements at the positions where the boolean array is True.
This is also more efficient, since the filtering happens at the NumPy level (whereas a list comprehension runs at the Python interpreter level).

Source for argwhere
def argwhere(a):
"""
Find the indices of array elements that are non-zero, grouped by element.
...
"""
return transpose(nonzero(a))
Called with a single argument, np.where is the same as np.nonzero.
In [902]: z=np.arange(9).reshape(3,3)
In [903]: z%3==0
Out[903]:
array([[ True, False, False],
       [ True, False, False],
       [ True, False, False]], dtype=bool)
In [904]: np.nonzero(z%3==0)
Out[904]: (array([0, 1, 2], dtype=int32), array([0, 0, 0], dtype=int32))
In [905]: np.transpose(np.nonzero(z%3==0))
Out[905]:
array([[0, 0],
       [1, 0],
       [2, 0]], dtype=int32)
In [906]: z[[0,1,2], [0,0,0]]
Out[906]: array([0, 3, 6])
z[np.nonzero(z%3==0)] is equivalent to using I,J as indexing arrays:
In [907]: I,J =np.nonzero(z%3==0)
In [908]: I
Out[908]: array([0, 1, 2], dtype=int32)
In [909]: J
Out[909]: array([0, 0, 0], dtype=int32)
In [910]: z[I,J]
Out[910]: array([0, 3, 6])
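Putting the pieces together, here is a brief sketch showing that the mask, the nonzero tuple, and a transposed argwhere result all select the same values:

```python
import numpy as np

z = np.arange(9).reshape(3, 3)
cond = z % 3 == 0

by_mask = z[cond]                 # boolean mask indexing
by_nonzero = z[np.nonzero(cond)]  # tuple of index arrays, one per axis
zi = np.argwhere(cond)            # shape (3, 2), one row per match
by_argwhere = z[tuple(zi.T)]      # transpose back to one index array per axis

print(by_mask, by_nonzero, by_argwhere)  # [0 3 6] [0 3 6] [0 3 6]
```

So argwhere output can be used for indexing, but only after undoing the transpose; when the values are all you need, the mask is the shortest route.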

How do I index an ndarray using rows of another ndarray?

If I have
x = np.arange(1, 10).reshape((3,3))
# array([[1, 2, 3],
#        [4, 5, 6],
#        [7, 8, 9]])
and
ind = np.array([[1,1], [1,2]])
# array([[1, 1],
#        [1, 2]])
how do I use each row (axis 0) of ind to extract a cell of x? I hope to end up with the array [5, 6]. np.take(x, ind, axis=0) does not seem to work.

You could use "advanced integer indexing" by indexing x with two integer arrays, the first array for indexing the row, the second array for indexing the column:
In [58]: x[ind[:,0], ind[:,1]]
Out[58]: array([5, 6])

x[tuple(ind.T.tolist())]
works, too, and can also be used for multidimensional NumPy arrays.
Why?
NumPy arrays are indexed by tuples. Usually, these tuples are created implicitly by Python:
Note
In Python, x[(exp1, exp2, ..., expN)] is equivalent to x[exp1, exp2, ..., expN]; the latter is just syntactic sugar for the former.
Note that this syntactic sugar isn't NumPy-specific. You could use it on dictionaries when the key is a tuple:
In [1]: d = { 'I like the number': 1, ('pi', "isn't"): 2}
In [2]: d[('pi', "isn't")]
Out[2]: 2
In [3]: d['pi', "isn't"]
Out[3]: 2
Actually, it's not even related to indexing:
In [5]: 1, 2, 3
Out[5]: (1, 2, 3)
Thus, for your NumPy array, x = np.arange(1,10).reshape((3,3))
In [11]: x[1,2]
Out[11]: 6
because
In [12]: x[(1,2)]
Out[12]: 6
So, in unutbu's answer, actually a tuple containing the columns of ind is passed:
In [21]: x[(ind[:,0], ind[:,1])]
Out[21]: array([5, 6])
with x[ind[:,0], ind[:,1]] just being an equivalent (and recommended) short hand notation for the same.
Here's what that tuple looks like:
In [22]: (ind[:,0], ind[:,1])
Out[22]: (array([1, 1]), array([1, 2]))
We can construct the same tuple differently from ind: tolist() returns a NumPy array's rows as a list of lists. Transposing switches rows and columns, so we can get a list of columns by first transposing and then calling tolist on the result:
In [23]: ind.T.tolist()
Out[23]: [[1, 1], [1, 2]]
Because ind is symmetric in your example, it is its own transpose. Thus, for illustration, let's use
In [24]: ind_2 = np.array([[1,1], [1,2], [0, 0]])
# array([[1, 1],
#        [1, 2],
#        [0, 0]])
In [25]: ind_2.T.tolist()
Out[25]: [[1, 1, 0], [1, 2, 0]]
This can easily be converted to the tuples we want:
In [27]: tuple(ind_2.T.tolist())
Out[27]: ([1, 1, 0], [1, 2, 0])
In [28]: tuple(ind.T.tolist())
Out[28]: ([1, 1], [1, 2])
Thus,
In [29]: x[tuple(ind.T.tolist())]
Out[29]: array([5, 6])
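This is equivalent to unutbu's answer when x.ndim == 2 and ind.shape[1] == 2, but it also works more generally whenever x.ndim equals the number of columns of the index array, in case you have to work with multi-dimensional NumPy arrays.
Older NumPy versions also let you drop the tuple(...) and index with the list directly, because a list of sequences was interpreted as a tuple of index arrays. That behaviour was deprecated in NumPy 1.15 and removed in NumPy 1.23, where such a list is treated as a single array index, so keep the explicit tuple(...). On an old version: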
In [43]: x[ind_2.T.tolist()]
Out[43]: array([5, 6, 1])
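A minimal sketch of why the explicit tuple matters: a tuple supplies one index array per axis, while a single integer array indexes only the first axis. np.asarray is used here so the second result does not depend on how a given NumPy version interprets a bare list.

```python
import numpy as np

x = np.arange(1, 10).reshape((3, 3))
ind_2 = np.array([[1, 1], [1, 2], [0, 0]])

# Tuple of per-axis index lists: picks x[1, 1], x[1, 2], x[0, 0].
as_tuple = x[tuple(ind_2.T.tolist())]
print(as_tuple)        # [5 6 1]

# A single integer array indexes only axis 0, so whole rows are selected.
as_array = x[np.asarray(ind_2.T.tolist())]
print(as_array.shape)  # (2, 3, 3)
```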
