I have two numpy matrices (or sparse equivalents) like:
>>> A = numpy.array([[1,0,2],[3,0,0],[4,5,0],[0,2,2]])
>>> A
array([[1, 0, 2],
[3, 0, 0],
[4, 5, 0],
[0, 2, 2]])
>>> B = numpy.array([[2,3],[3,4],[5,0]])
>>> B
array([[2, 3],
[3, 4],
[5, 0]])
>>> C = mean_dot_product(A, B)
>>> C
array([[6 , 3],
[6 , 9],
[11.5, 16],
[8 , 8]])
where C[i, j] = sum(A[i,k] * B[k,j]) / count_nonzero(A[i,k] * B[k,j]), with both the sum and the count taken over k.
Is there a fast way to perform this operation in numpy?
A non-ideal solution is:
>>> maskA = A > 0
>>> maskB = B > 0
>>> maskA.dtype=numpy.uint8
>>> maskB.dtype=numpy.uint8
>>> D = replace_zeros_with_ones(numpy.dot(maskA,maskB))
>>> C = numpy.dot(A,B) / D
Does anyone have a better algorithm?
Further, if A or B are sparse matrices, making them dense (replacing zeros with ones) makes memory usage explode!
Why do you need replace_zeros_with_ones? I deleted this line, ran your code, and got the right result.
You can do this in one line if none of the numbers are negative:
np.dot(A, B)/np.dot(np.sign(A), np.sign(B))
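As a minimal sketch of the idea above (not the asker's actual mean_dot_product, whose implementation is not shown), the count of nonzero products can be taken from a second dot product of the two nonzero masks, with the division skipped wherever that count is zero. This assumes dense arrays and that entries with no nonzero products should come out as 0; both are assumptions made for illustration.

```python
import numpy as np

def mean_dot_product(A, B):
    """Mean of the nonzero products A[i, k] * B[k, j] over k.

    Entries with no nonzero product are returned as 0 (an assumption).
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    total = A @ B  # sum of all products over k
    # A product is nonzero exactly when both factors are nonzero,
    # so the count is the dot product of the two nonzero masks.
    count = (A != 0).astype(int) @ (B != 0).astype(int)
    out = np.zeros_like(total)
    # Divide only where count > 0; other entries keep the initial 0.
    np.divide(total, count, out=out, where=count > 0)
    return out

A = np.array([[1, 0, 2], [3, 0, 0], [4, 5, 0], [0, 2, 2]])
B = np.array([[2, 3], [3, 4], [5, 0]])
C = mean_dot_product(A, B)
print(C)
```

Using `!= 0` instead of `> 0` or `np.sign` keeps the count correct when negative entries are present. The `where=` argument plays the role of replace_zeros_with_ones by avoiding the division by zero altogether.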
Related
I have a problem where I have two arrays with dimensions
n x d and d x h, where n, d, and h are positive integers that may or may not be the same. I am trying to find the dot product of the two matrices but with the condition that for each multiplication, I apply a function g to the term.
For example, in a case where n=2, d=3, and h=3
If I had the following matrices:
a = [[1, 2, 3],
[4, 5, 6]]
b = [[1, 2, 3],
[4, 5, 6],
[1, 1, 1]]
I would want to find c such that
c = [[g(1*1)+g(2*4)+g(3*1), g(1*2)+g(2*5)+g(3*1), g(1*3)+g(2*6)+g(3*1)],
[g(4*1)+g(5*4)+g(6*1), g(4*2)+g(5*5)+g(6*1), g(4*3)+g(5*6)+g(6*1)]]
Any help would be appreciated
I was able to do this by first doing the multiplications with broadcasting, applying g(), and then summing across the correct axis:
import numpy as np
def g(x):
    return 1 / (1 + np.exp(-x))
a = np.array([[1, 2, 3],
[4, 5, 6]])
b = np.array([[1, 2, 3],
[4, 5, 6],
[1, 1, 1]])
First, the multiplication a[:, None] * b.T (probably a nicer way to do this), then evaluate g(x), then sum across axis 2:
>>> g(a[:, None] * b.T).sum(axis=2)
array([[2.68329736, 2.83332581, 2.90514211],
[2.97954116, 2.99719203, 2.99752123]])
Verifying that the first row indeed matches your desired result:
>>> g(1*1) + g(2*4) + g(3*1)
2.683297355321972
>>> g(1*2) + g(2*5) + g(3*1)
2.8333258069316134
>>> g(1*3) + g(2*6) + g(3*1)
2.9051421094702645
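One possibly clearer way to write the same broadcast, sketched here under the assumption that g works elementwise on arrays, is to line the axes up explicitly as (n, d, h) and sum over the shared middle axis. The name `products` is only illustrative.

```python
import numpy as np

def g(x):
    return 1 / (1 + np.exp(-x))

a = np.array([[1, 2, 3],
              [4, 5, 6]])
b = np.array([[1, 2, 3],
              [4, 5, 6],
              [1, 1, 1]])

# products[i, k, j] = a[i, k] * b[k, j], shape (n, d, h)
products = a[:, :, None] * b[None, :, :]

# Apply g to every term, then sum over the shared index k (axis 1).
c = g(products).sum(axis=1)
print(c)
```

Both versions build an intermediate array with n * d * h elements, so memory can become the limiting factor for large inputs.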
I'm new with Python and programming in general.
I want to create a function that multiplies two np.array objects of the same size element-wise and returns the scalar sum, for example:
matrix_1 = np.array([[1, 1], [0, 1], [1, 0]])
matrix_2 = np.array([[1, 2], [1, 1], [0, 0]])
I want to get 4 as output ((1 * 1) + (1 * 2) + (0 * 1) + (1 * 1) + (1 * 0) + (0 * 0))
Thanks!
Multiply two matrices element-wise
Sum all the elements
multiplied_matrix = np.multiply(matrix_1,matrix_2)
sum_of_elements = np.sum(multiplied_matrix)
print(sum_of_elements) # 4
Or in one shot:
print(np.sum(np.multiply(matrix_1, matrix_2))) # 4
You can use np.multiply() to multiply the two arrays elementwise and then call .sum() on the result:
np.multiply(matrix_1, matrix_2).sum()
For your sample matrices, this gives:
>>> matrix_1 = np.array([[1, 1], [0, 1], [1, 0]])
>>> matrix_2 = np.array([[1, 2], [1, 1], [0, 0]])
>>> np.multiply(matrix_1, matrix_2)
array([[1, 2],
[0, 1],
[0, 0]])
>>> np.multiply(matrix_1, matrix_2).sum()
4
There are a couple of ways to do it (Frobenius inner product) using numpy, e.g.
np.sum(A * B)
np.dot(A.flatten(), B.flatten())
np.trace(np.dot(A, B.T))
np.einsum('ij,ij', A, B)
One recommended way is using numpy.einsum, since it can be adapted to not only matrices but also multiway arrays (i.e., tensors).
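As a quick check, the four expressions above can be run on the question's sample matrices to see that they agree. A and B here are simply the question's matrix_1 and matrix_2 under shorter names.

```python
import numpy as np

A = np.array([[1, 1], [0, 1], [1, 0]])
B = np.array([[1, 2], [1, 1], [0, 0]])

results = [
    np.sum(A * B),                     # element-wise product, then sum
    np.dot(A.flatten(), B.flatten()),  # dot product of the flattened arrays
    np.trace(np.dot(A, B.T)),          # trace of A @ B.T
    np.einsum('ij,ij', A, B),          # contract over both indices
]
print(results)
```

Note that the trace version computes the full A @ B.T product first, so it does more work than the others on large matrices.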
Matrices of the same size
Take the matrices you gave as an example,
>>> import numpy as np
>>> matrix_1 = np.array([[1, 1], [0, 1], [1, 0]])
>>> matrix_2 = np.array([[1, 2], [1, 1], [0, 0]])
then, we have
>>> np.einsum('ij, ij ->', matrix_1, matrix_2)
4
Vectors of the same size
An example like this:
>>> vector_1 = np.array([1, 2, 3])
>>> vector_2 = np.array([2, 3, 4])
>>> np.einsum('i, i ->', vector_1, vector_2)
20
Tensors of the same size
Take three-way arrays (i.e., third-order tensors) as an example,
>>> tensor_1 = np.array([[[1, 2], [3, 4]], [[2, 3], [4, 5]], [[3, 4], [5, 6]]])
>>> print(tensor_1)
[[[1 2]
[3 4]]
[[2 3]
[4 5]]
[[3 4]
[5 6]]]
>>> tensor_2 = np.array([[[2, 3], [4, 5]], [[3, 4], [5, 6]], [[6, 7], [8, 9]]])
>>> print(tensor_2)
[[[2 3]
[4 5]]
[[3 4]
[5 6]]
[[6 7]
[8 9]]]
then, we have
>>> np.einsum('ijk, ijk ->', tensor_1, tensor_2)
248
For more usage about numpy.einsum, I recommend:
Understanding NumPy's einsum
I have a 2-dimensional array and, wherever a value is greater than 0, I want to apply an operation (for example, x+1).
In plain python something like this:
a = [[2,5], [4,0], [0,2]]
for x in range(3):
    for y in range(2):
        if a[x][y] > 0:
            a[x][y] = a[x][y] + 1
Result for a is [[3, 6], [5, 0], [0, 3]]. This is what I want.
Now I want to avoid the nested loop and tried something like this with numpy:
a = np.array([[2,5], [4,0], [0,2]])
mask = (a > 0)
a[mask] + 1
The result is now 1-dimensional: [3 6 5 3]. How can I do this operation without losing the dimensions, as in the plain Python example above?
If a is a numpy array, you can simply do -
a[a>0] +=1
Sample run -
In [335]: a = np.array([[2,5], [4,0], [0,2]])
In [336]: a
Out[336]:
array([[2, 5],
[4, 0],
[0, 2]])
In [337]: a[a>0] +=1
In [338]: a
Out[338]:
array([[3, 6],
[5, 0],
[0, 3]])
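If the original array should stay untouched, a non-mutating variant can be sketched with np.where, which picks a + 1 where the condition holds and a elsewhere, and returns a new array of the same shape. The name b is only illustrative.

```python
import numpy as np

a = np.array([[2, 5], [4, 0], [0, 2]])

# Elementwise choice: a + 1 where a > 0, otherwise the original value.
b = np.where(a > 0, a + 1, a)

print(b)  # new array with the same 2-D shape as a
print(a)  # a itself is not modified
```

The in-place form a[a>0] += 1 avoids allocating a second array, so it is preferable when the original values are no longer needed.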
I have four numpy arrays like:
X1 = array([[1, 2], [2, 0]])
X2 = array([[3, 1], [2, 2]])
I1 = array([[1], [1]])
I2 = array([[1], [1]])
And I'm doing:
Y = array([[ I1,  X1],
           [-I2, -X2]])
To get:
Y = array([[ 1, 1, 2],
[ 1, 2, 0],
[-1, -3, -1],
[-1, -2, -2]])
Like this example, I have large matrices, where X1 and X2 are n x d matrices.
Is there an efficient way in Python whereby I can get the matrix Y?
Although I am aware of the iterative manner, I am searching for an efficient manner to accomplish the above mentioned.
Here, Y is a 2n x (d+1) matrix and I1 and I2 are n x 1 column vectors of ones.
How about the following:
In [1]: import numpy as np
In [2]: X1 = np.array([[1,2],[2,0]])
In [3]: X2 = np.array([[3,1],[2,2]])
In [4]: I1 = np.array([[1],[1]])
In [5]: I2 = np.array([[4],[4]])
In [7]: Y = np.vstack((np.hstack((I1,X1)),np.hstack((I2,X2))))
In [8]: Y
Out[8]:
array([[1, 1, 2],
[1, 2, 0],
[4, 3, 1],
[4, 2, 2]])
Alternatively you could create an empty array of the appropriate size and fill it using the appropriate slices. This would avoid making intermediate arrays.
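The preallocate-and-fill alternative could look roughly like this. It is a sketch that reuses the values from the session above (a first column of 1s for the top block and 4s for the bottom block) and assumes X1 and X2 have the same shape and dtype.

```python
import numpy as np

X1 = np.array([[1, 2], [2, 0]])
X2 = np.array([[3, 1], [2, 2]])
n, d = X1.shape

# Allocate the result once, then write each block into its slice.
Y = np.empty((2 * n, d + 1), dtype=X1.dtype)
Y[:n, 0] = 1    # first column of the top block (I1 above)
Y[:n, 1:] = X1
Y[n:, 0] = 4    # first column of the bottom block (I2 above)
Y[n:, 1:] = X2
print(Y)
```

Every element of Y is assigned exactly once, so starting from the uninitialized np.empty buffer is safe here.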
You need numpy.bmat
In [4]: A = np.mat('1 ; 1 ')
In [5]: B = np.mat('2 2; 2 2')
In [6]: C = np.mat('3 ; 5')
In [7]: D = np.mat('7 8; 9 0')
In [8]: np.bmat([[A,B],[C,D]])
Out[8]:
matrix([[1, 2, 2],
[1, 2, 2],
[3, 7, 8],
[5, 9, 0]])
For a numpy array, this page suggests the syntax may be of the form
vstack([hstack([a,b]),
hstack([c,d])])
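For plain arrays, newer NumPy versions also provide np.block, which takes the same nested-list layout as bmat but returns an ordinary ndarray. The sketch below reuses the values from the bmat example, with lowercase names chosen only for illustration.

```python
import numpy as np

a = np.array([[1], [1]])
b = np.array([[2, 2], [2, 2]])
c = np.array([[3], [5]])
d = np.array([[7, 8], [9, 0]])

# The inner lists are joined horizontally, the outer list vertically.
Y = np.block([[a, b],
              [c, d]])
print(Y)
print(type(Y))
```

Because the result is an ndarray rather than a matrix, the * operator on it stays element-wise.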
I have the following array in NumPy:
A = array([1, 2, 3])
How can I obtain the following matrices (without an explicit loop)?
B = [ 1 1 1
2 2 2
3 3 3 ]
C = [ 1 2 3
1 2 3
1 2 3 ]
Thanks!
Edit2: The OP asks in the comments how to compute
n(i, j) = l(i, i) + l(j, j) - 2 * l(i, j)
I can think of two ways. I like this way because it generalizes easily:
import numpy as np
l=np.arange(9).reshape(3,3)
print(l)
# [[0 1 2]
# [3 4 5]
# [6 7 8]]
The idea is to use np.ogrid. This defines a list of two numpy arrays, one of shape (3,1) and one of shape (1,3):
grid=np.ogrid[0:3,0:3]
print(grid)
# [array([[0],
# [1],
# [2]]), array([[0, 1, 2]])]
grid[0] can be used as a proxy for the index i, and
grid[1] can be used as a proxy for the index j.
So everywhere in the expression l(i, i) + l(j, j) - 2 * l(i, j), you simply replace i-->grid[0], and j-->grid[1], and numpy broadcasting takes care of the rest:
n=l[grid[0],grid[0]] + l[grid[1],grid[1]] - 2*l
print(n)
# [[ 0  2  4]
#  [-2  0  2]
#  [-4 -2  0]]
However, in this particular case, since l(i,i) and l(j,j) are just the diagonal elements of l, you could do this instead:
d=np.diag(l)
print(d)
# [0 4 8]
d[np.newaxis,:] pumps up the shape of d to (1,3), and
d[:,np.newaxis] pumps up the shape of d to (3,1).
Numpy broadcasting pumps up d[np.newaxis,:] and d[:,np.newaxis] to shape (3,3), copying values as appropriate.
n=d[np.newaxis,:] + d[:,np.newaxis] - 2*l
print(n)
# [[ 0  2  4]
#  [-2  0  2]
#  [-4 -2  0]]
Edit1: Usually you do not need to form B or C. The purpose of Numpy broadcasting is to allow you to use A in place of B or C. If you show us how you plan to use B or C, we might be able to show you how to do the same with A and numpy broadcasting.
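To illustrate that point with a made-up use case (adding B and C elementwise, which is only an assumed example of how they might be used), A can be reshaped into a column and a row and left to broadcasting:

```python
import numpy as np

A = np.array([1, 2, 3])

col = A[:, np.newaxis]   # shape (3, 1), behaves like B in arithmetic
row = A[np.newaxis, :]   # shape (1, 3), behaves like C in arithmetic

# Broadcasting expands both operands to (3, 3) without building B or C.
outer_sum = col + row
print(outer_sum)

# Same result with the explicit matrices, for comparison.
C = np.tile(A, (3, 1))
B = C.T
print(np.array_equal(outer_sum, B + C))
```

Only the (3, 3) result is allocated; col and row are views of A.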
(Original answer):
In [11]: B=A.repeat(3).reshape(3,3)
In [12]: B
Out[12]:
array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])
In [13]: C=B.T
In [14]: C
Out[14]:
array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])
or
In [25]: C=np.tile(A,(3,1))
In [26]: C
Out[26]:
array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])
In [27]: B=C.T
In [28]: B
Out[28]:
array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])
From the dirty tricks department:
In [57]: np.lib.stride_tricks.as_strided(A,shape=(3,3),strides=(A.itemsize,0))
Out[57]:
array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])
In [58]: np.lib.stride_tricks.as_strided(A,shape=(3,3),strides=(0,A.itemsize))
Out[58]:
array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])
But note that these are views of A, not copies (as were the solutions above). Changing B alters A:
In [59]: B=np.lib.stride_tricks.as_strided(A,shape=(3,3),strides=(A.itemsize,0))
In [60]: B[0,0]=100
In [61]: A
Out[61]: array([100, 2, 3])
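A safer way to get the same kind of zero-copy view, swapped in here for as_strided, is np.broadcast_to. It works out the strides itself and returns a read-only view, so A cannot be changed through it by accident.

```python
import numpy as np

A = np.array([1, 2, 3])

# Rows are copies of A (the C matrix), without copying any data.
C = np.broadcast_to(A, (3, 3))

# Columns are copies of A (the B matrix): broadcast a (3, 1) column.
B = np.broadcast_to(A[:, np.newaxis], (3, 3))

print(B)
print(C)
print(B.flags.writeable)  # the view is read-only

try:
    B[0, 0] = 100
except ValueError as err:
    print("write rejected:", err)
```

If a writable array is needed, calling .copy() on the view produces an independent (3, 3) array.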
Very old thread but just in case someone cares...
C,B = np.meshgrid(A,A)