From numpy's documentation: https://numpy.org/doc/stable/reference/generated/numpy.dot.html#numpy.dot
numpy.dot(a, b, out=None)
Dot product of two arrays. Specifically,
If both a and b are 1-D arrays, it is inner product of vectors (without complex conjugation).
If both a and b are 2-D arrays, it is matrix multiplication, but using matmul or a @ b is preferred.
If either a or b is 0-D (scalar), it is equivalent to multiply and using numpy.multiply(a, b) or a * b is preferred.
If a is an N-D array and b is a 1-D array, it is a sum product over the last axis of a and b.
If a is an N-D array and b is an M-D array (where M>=2), it is a sum product over the last axis of a and the second-to-last axis of b:
According to the 4th point, if we take the dot product of a 2-D array A and a 1-D array B, we should be taking the sum product over the columns of A and row of B since the last axis of A is the column, correct?
Yet, when I tried this in the Python IDLE, here is my output:
>>> a
array([[1, 2],
[3, 4],
[5, 6]])
>>> b
[1, 2]
>>> a.dot(b)
array([ 5, 11, 17])
I expected this to throw an error since the dimension of a's columns is greater than the dimension of b's row.
If a is an N-D array and b is a 1-D array, it is a sum product over the last axis of a and b.
The last axis of a has size 2:
>>> a.shape
(3, 2)
which matches b's size.
Remember: the first axis for multi-dimensional arrays in numpy is 'downward'. 1-D arrays are displayed horizontally in numpy, but I think it's usually better to think of them as vertical vectors instead. This is what's being computed, shown in the sketch below.
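A minimal sketch (not from the original answer), using the same a and b as in the transcript above; each row of a is paired with b:

import numpy as np

a = np.array([[1, 2],
              [3, 4],
              [5, 6]])
b = [1, 2]

# The last axis of a (length 2) lines up with b (length 2),
# so each row of a is dotted with b:
print(np.array([row.dot(b) for row in a]))  # [ 5 11 17]
print(a.dot(b))                             # [ 5 11 17]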
Related
I wish to compute the dot product between two 3D tensors along the first dimension. I tried the following einsum notation:
import numpy as np
a = np.random.randn(30).reshape(3, 5, 2)
b = np.random.randn(30).reshape(3, 2, 5)
# Expecting shape: (3, 5, 5)
np.einsum("ijk,ikj->ijj", a, b)
Sadly it returns this error:
ValueError: einstein sum subscripts string includes output subscript 'j' multiple times
I went with Einstein sum after I failed at it with np.tensordot. Ideas and follow up questions are highly welcome!
Your two dimensions of size 5 and 5 do not correspond to the same axes. As such you need to use two different subscripts to designate them. For example, you can do:
>>> res = np.einsum('ijk,ilm->ijm', a, b)
>>> res.shape
(3, 5, 5)
Notice you are also required to change the subscript for the two axes of size 2. This is because with distinct subscripts you are computing a batched outer product (the two axes are iterated over independently), not a dot product (a single shared axis is iterated over jointly and summed).
Outer product:
>>> np.einsum('ijk,ilm->ijm', a, b)
Dot product over subscript k, which is axis=2 of a and axis=1 of b:
>>> np.einsum('ijk,ikm->ijm', a, b)
which is equivalent to a @ b.
"dot product ... along the first dimension" is a bit unclear. Is the first dimension a 'batch' dimension, with 3 dot products over the rest? Or something else?
In [103]: a = np.random.randn(30).reshape(3, 5, 2)
...: b = np.random.randn(30).reshape(3, 2, 5)
In [104]: (a@b).shape
Out[104]: (3, 5, 5)
In [105]: np.einsum('ijk,ikl->ijl',a,b).shape
Out[105]: (3, 5, 5)
@Ivan's answer is different:
In [106]: np.einsum('ijk,ilm->ijm', a, b).shape
Out[106]: (3, 5, 5)
In [107]: np.allclose(np.einsum('ijk,ilm->ijm', a, b), a@b)
Out[107]: False
In [108]: np.allclose(np.einsum('ijk,ikl->ijl', a, b), a@b)
Out[108]: True
@Ivan's answer sums over the k dimension of one array and the l dimension of the other, and then does a broadcasted elementwise multiplication. That is not matrix multiplication:
In [109]: (a.sum(axis=-1,keepdims=True)* b.sum(axis=1,keepdims=True)).shape
Out[109]: (3, 5, 5)
In [110]: np.allclose((a.sum(axis=-1,keepdims=True)* b.sum(axis=1,keepdims=True)),np.einsum('ijk,ilm->ijm', a,
...: b))
Out[110]: True
Another test of the batch processing:
In [112]: res=np.zeros((3,5,5))
...: for i in range(3):
...:     res[i] = a[i]@b[i]
...: np.allclose(res, a@b)
Out[112]: True
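As a side note on the np.tensordot attempt mentioned in the question (this is not from either answer): tensordot contracts the requested axes, but it crosses the two batch axes instead of pairing them, so a diagonal is needed afterwards to recover the batched result. A sketch:

import numpy as np

a = np.random.randn(30).reshape(3, 5, 2)
b = np.random.randn(30).reshape(3, 2, 5)

# tensordot contracts axis 2 of a with axis 1 of b, but combines every
# batch of a with every batch of b, giving shape (3, 5, 3, 5):
t = np.tensordot(a, b, axes=([2], [1]))
print(t.shape)                # (3, 5, 3, 5)

# The batched matmul is the diagonal over the two batch axes:
r = np.moveaxis(np.diagonal(t, axis1=0, axis2=2), -1, 0)
print(np.allclose(r, a @ b))  # True

This computes three times more than necessary, so a @ b or the einsum 'ijk,ikl->ijl' above remains the better tool here.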
I have tried the following code, but I couldn't figure out the difference between np.dot and np.multiply combined with np.sum.
Here is the np.dot code:
logprobs = np.dot(Y, (np.log(A2)).T) + np.dot((1.0-Y),(np.log(1 - A2)).T)
print(logprobs.shape)
print(logprobs)
cost = (-1/m) * logprobs
print(cost.shape)
print(type(cost))
print(cost)
Its output is
(1, 1)
[[-2.07917628]]
(1, 1)
<class 'numpy.ndarray'>
[[ 0.693058761039 ]]
Here is the code for np.multiply with np.sum
logprobs = np.sum(np.multiply(np.log(A2), Y) + np.multiply((1 - Y), np.log(1 - A2)))
print(logprobs.shape)
print(logprobs)
cost = - logprobs / m
print(cost.shape)
print(type(cost))
print(cost)
Its output is
()
-2.07917628312
()
<class 'numpy.float64'>
0.693058761039
I'm unable to understand the type and shape difference, even though the resulting value is the same in both cases.
Even after squeezing, the cost from the former code becomes the same value as in the latter, but its type remains an ndarray:
cost = np.squeeze(cost)
print(type(cost))
print(cost)
Its output is
<class 'numpy.ndarray'>
0.6930587610394646
np.dot is the dot product of two matrices.
|A B| . |E F| = |A*E+B*G  A*F+B*H|
|C D|   |G H|   |C*E+D*G  C*F+D*H|
Whereas np.multiply does an element-wise multiplication of two matrices.
|A B| ⊙ |E F| = |A*E  B*F|
|C D|   |G H|   |C*G  D*H|
When used with np.sum, the result being equal is merely a coincidence.
>>> np.dot([[1,2], [3,4]], [[1,2], [2,3]])
array([[ 5, 8],
[11, 18]])
>>> np.multiply([[1,2], [3,4]], [[1,2], [2,3]])
array([[ 1, 4],
[ 6, 12]])
>>> np.sum(np.dot([[1,2], [3,4]], [[1,2], [2,3]]))
42
>>> np.sum(np.multiply([[1,2], [3,4]], [[1,2], [2,3]]))
23
What you're doing is calculating the binary cross-entropy loss which measures how bad the predictions (here: A2) of the model are when compared to the true outputs (here: Y).
Here is a reproducible example for your case, which should explain why you get a scalar in the second case using np.sum
In [88]: Y = np.array([[1, 0, 1, 1, 0, 1, 0, 0]])
In [89]: A2 = np.array([[0.8, 0.2, 0.95, 0.92, 0.01, 0.93, 0.1, 0.02]])
In [90]: logprobs = np.dot(Y, (np.log(A2)).T) + np.dot((1.0-Y),(np.log(1 - A2)).T)
# `np.dot` returns 2D array since its arguments are 2D arrays
In [91]: logprobs
Out[91]: array([[-0.78914626]])
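# (m is not defined in this excerpt; the outputs below assume m = Y.shape[1] = 8, the number of examples)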
In [92]: cost = (-1/m) * logprobs
In [93]: cost
Out[93]: array([[ 0.09864328]])
In [94]: logprobs = np.sum(np.multiply(np.log(A2), Y) + np.multiply((1 - Y), np.log(1 - A2)))
# np.sum returns scalar since it sums everything in the 2D array
In [95]: logprobs
Out[95]: -0.78914625761870361
Note that np.dot sums only along the inner dimensions, which match here: (1x8) and (8x1). So the 8s disappear during the dot product (matrix multiplication), yielding a (1x1) result, which is a single value but is returned as a 2D array of shape (1,1).
Also, and most importantly, note that here np.dot is exactly the same as np.matmul, since the inputs are 2D arrays (i.e. matrices):
In [107]: logprobs = np.matmul(Y, (np.log(A2)).T) + np.matmul((1.0-Y),(np.log(1 - A2)).T)
In [108]: logprobs
Out[108]: array([[-0.78914626]])
In [109]: logprobs.shape
Out[109]: (1, 1)
Return result as a scalar value
np.dot or np.matmul returns whatever the resulting array shape would be, based on the input arrays. Even with the out= argument it's not possible to return a scalar if the inputs are 2D arrays. However, we can use np.asscalar() on the result to convert it to a scalar if the result array has shape (1,1) (or, more generally, is a scalar value wrapped in an nD array).
In [123]: np.asscalar(logprobs)
Out[123]: -0.7891462576187036
In [124]: type(np.asscalar(logprobs))
Out[124]: float
ndarray of size 1 to scalar value
In [127]: np.asscalar(np.array([[[23.2]]]))
Out[127]: 23.2
In [128]: np.asscalar(np.array([[[[23.2]]]]))
Out[128]: 23.2
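As a side note (not part of the original answer): np.asscalar has since been deprecated and removed in newer NumPy releases; ndarray.item() does the same job. A minimal sketch, reusing logprobs from above:

logprobs.item()              # -0.7891462576187036, a plain Python float
np.array([[[23.2]]]).item()  # 23.2, works for any array holding a single value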
If Y and A2 are (1,N) arrays, then np.dot(Y, A2.T) will produce a (1,1) result. It is doing a matrix multiplication of a (1,N) with a (N,1). The N's are summed, leaving the (1,1).
With multiply the result is (1,N). Sum all values, and the result is a scalar.
If Y and A2 were (N,) shaped (same number of elements, but 1d), then np.dot(Y,A2) (no .T) would also produce a scalar. From the np.dot documentation:
For 2-D arrays it is equivalent to matrix multiplication, and for 1-D arrays to inner product of vectors
Returns the dot product of a and b. If a and b are both scalars or both 1-D arrays then a scalar is returned; otherwise an array is returned.
squeeze reduces all size 1 dimensions, but still returns an array. In numpy an array can have any number of dimensions, from 0 up to 32 (64 in newer NumPy releases). So a 0d array is possible. Compare the shape of np.array(3), np.array([3]) and np.array([[3]]).
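A small sketch of that comparison, together with the two dot cases described above (made-up values, not the original Y and A2):

import numpy as np

np.array(3).shape      # ()      0-d array
np.array([3]).shape    # (1,)    1-d array
np.array([[3]]).shape  # (1, 1)  2-d array

Y = np.array([[1., 0., 1.]])       # shape (1, 3)
A2 = np.array([[0.9, 0.2, 0.8]])   # shape (1, 3)
np.dot(Y, A2.T).shape              # (1, 1) -- matrix product of (1,3) and (3,1)
np.dot(Y[0], A2[0])                # 1.7    -- 1-d dot 1-d returns a plain scalar
np.squeeze(np.dot(Y, A2.T)).shape  # ()     -- still an array, just 0-d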
In this example it is not just a coincidence. Let's take an example with two 3-element arrays.

import numpy as np

x1 = np.array([1, 2, 3])  # first array
x2 = np.array([3, 4, 3])  # second array

# Element-wise multiplication followed by a sum: (1*3) + (2*4) + (3*3) = 20
X_Res = np.sum(np.multiply(x1, x2))

# Dot product of the two arrays: also (1*3) + (2*4) + (3*3) = 20.
# Thought of as matrices, a (1,3) row times the transposed (3,1) column:
#   |1 2 3| * |3|
#             |4| = |1*3 + 2*4 + 3*3| = |20|
#             |3|
# (For these 1-D arrays the .T is a no-op and np.dot(x1, x2) already
#  returns the scalar.)
Y_Res = np.dot(x1, x2.T)

print(X_Res)  # 20
print(Y_Res)  # 20
What is the difference between these two numpy objects?
import numpy as np
np.array([[0,0,0,0]])
np.array([0,0,0,0])
In [71]: np.array([[0,0,0,0]]).shape
Out[71]: (1, 4)
In [72]: np.array([0,0,0,0]).shape
Out[72]: (4,)
The former is a 1 x 4 two-dimensional array, the latter a 4 element one-dimensional array.
The difference between single and double brackets starts with lists:
In [91]: ll=[0,1,2]
In [92]: ll1=[[0,1,2]]
In [93]: len(ll)
Out[93]: 3
In [94]: len(ll1)
Out[94]: 1
In [95]: len(ll1[0])
Out[95]: 3
ll is a list of 3 items. ll1 is a list of 1 item; that item is another list. Remember, a list can contain a variety of different objects, numbers, strings, other lists, etc.
Your two expressions effectively make arrays from two such lists:
In [96]: np.array(ll)
Out[96]: array([0, 1, 2])
In [97]: _.shape
Out[97]: (3,)
In [98]: np.array(ll1)
Out[98]: array([[0, 1, 2]])
In [99]: _.shape
Out[99]: (1, 3)
Here the list of lists has been turned into a 2d array. In a subtle way numpy blurs the distinction between the list and the nested list, since the difference between the two arrays lies in their shape, not a fundamental structure. array(ll)[None,:] produces the (1,3) version, while array(ll1).ravel() produces a (3,) version.
In the end, the difference between single and double brackets is a difference in the number of array dimensions, but we shouldn't lose sight of the fact that Python first creates different lists.
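A quick sketch of those two conversions, using ll and ll1 from above:

import numpy as np

ll = [0, 1, 2]
ll1 = [[0, 1, 2]]

np.array(ll)[None, :].shape  # (1, 3) -- add a leading axis to the 1-d array
np.array(ll1).ravel().shape  # (3,)   -- flatten the 2-d array back to 1-d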
When you define an array with two brackets, what you are really doing is declaring an array that contains another array with four 0's inside. Therefore, to access the first zero in the former you would use
your_array[0][0], while in the second array you would just use your_array[0]. Perhaps a better way to visualize it is
array: [
[0,0,0,0],
]
vs
array: [0,0,0,0]
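A short sketch of that indexing difference (the variable names here are made up):

import numpy as np

two_d = np.array([[0, 0, 0, 0]])  # shape (1, 4)
one_d = np.array([0, 0, 0, 0])    # shape (4,)

two_d[0][0]  # 0 -- the first index picks the inner row, the second the element
two_d[0, 0]  # 0 -- equivalent, and more idiomatic for numpy arrays
one_d[0]     # 0 -- only one index needed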
How can I convert numpy array a to numpy array b in a (num)pythonic way? The solution should ideally work for arbitrary dimensions and array lengths.
import numpy as np
a=np.arange(12).reshape(2,3,2)
b=np.empty((2,3),dtype=object)
b[0,0]=np.array([0,1])
b[0,1]=np.array([2,3])
b[0,2]=np.array([4,5])
b[1,0]=np.array([6,7])
b[1,1]=np.array([8,9])
b[1,2]=np.array([10,11])
For a start:
In [638]: a=np.arange(12).reshape(2,3,2)
In [639]: b=np.empty((2,3),dtype=object)
In [640]: for index in np.ndindex(b.shape):
   .....:     b[index]=a[index]
   .....:
In [641]: b
Out[641]:
array([[array([0, 1]), array([2, 3]), array([4, 5])],
[array([6, 7]), array([8, 9]), array([10, 11])]], dtype=object)
It's not ideal, since it uses iteration. But I wonder whether it is even possible to access the elements of b in any other way. By using dtype=object you break the basic vectorization that numpy is known for. b is essentially a list with a numpy multiarray shape overlaid on it. dtype=object puts an impenetrable wall around those size-2 arrays.
For example, a[:,:,0] gives me all the even numbers, in a (2,3) array. I can't get those numbers from b with just indexing. I have to use iteration:
[b[index][0] for index in np.ndindex(b.shape)]
# [0, 2, 4, 6, 8, 10]
np.array tries to make the highest dimension array that it can, given the regularity of the data. To fool it into making an array of objects, we have to give an irregular list of lists or objects. For example we could:
mylist = list(a.reshape(-1,2)) # list of arrays
mylist.append([]) # make the list irregular
b = np.array(mylist, dtype=object) # array of objects (newer NumPy requires an explicit dtype=object for ragged input like this)
b = b[:-1].reshape(2,3) # cleanup
The last solution suggests that my first one can be cleaned up a bit:
b = np.empty((6,),dtype=object)
b[:] = list(a.reshape(-1,2))
b = b.reshape(2,3)
I suspect that under the covers, the list() call does an iteration like
[x for x in a.reshape(-1,2)]
So time wise it might not be much different from the ndindex time.
One thing that I wasn't expecting about b is that I can do math on it, with nearly the same generality as on a:
b-10
b += 10
b *= 2
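For instance (a small sketch, assuming b is the (2,3) object array built above), the operation reaches inside each stored array:

(b - 10)[0, 0]  # array([-10,  -9])
b += 10         # adds 10 inside every stored array
b *= 2          # doubles every stored array; b keeps shape (2, 3) and dtype object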
An alternative to an object dtype would be a structured dtype, e.g.
In [785]: b1=np.zeros((2,3),dtype=[('f0',int,(2,))])
In [786]: b1['f0'][:]=a
In [787]: b1
Out[787]:
array([[([0, 1],), ([2, 3],), ([4, 5],)],
[([6, 7],), ([8, 9],), ([10, 11],)]],
dtype=[('f0', '<i4', (2,))])
In [788]: b1['f0']
Out[788]:
array([[[ 0, 1],
[ 2, 3],
[ 4, 5]],
[[ 6, 7],
[ 8, 9],
[10, 11]]])
In [789]: b1[1,1]['f0']
Out[789]: array([8, 9])
And b and b1 can be added: b+b1 (producing an object dtype). Curiouser and curiouser!
Based on hpaulj's answer I provide a slightly more generic solution. a is an array of dimension N which shall be converted to an array b of dimension N1 with dtype object, holding arrays of dimension (N-N1).
In the example, N equals 5 and N1 equals 3.
import numpy as np
N=5
N1=3
#create array a with dimension N
a=np.random.random(np.random.randint(2,20,size=N))
a_shape=a.shape
b_shape=a_shape[:N1] # shape of array b
b_arr_shape=a_shape[N1:] # shape of arrays in b
#Solution 1 with list() method (faster)
b=np.empty(np.prod(b_shape),dtype=object) #init b
b[:]=list(a.reshape((-1,)+b_arr_shape))
b=b.reshape(b_shape)
print "Dimension of b: {}".format(len(b.shape)) # dim of b
print "Dimension of array in b: {}".format(len(b[0,0,0].shape)) # dim of arrays in b
#Solution 2 with ndindex loop (slower)
b=np.empty(b_shape,dtype=object)
for index in np.ndindex(b_shape):
b[index]=a[index]
print "Dimension of b: {}".format(len(b.shape)) # dim of b
print "Dimension of array in b: {}".format(len(b[0,0,0].shape)) # dim of arrays in b
I have a 2 dimensional 3x3 array e.g.:
(4,  5,  6
 8, 10, 12
 12, 15, 18)
I would like to multiply this by a vector (1,2,3) so that I end up with a 3x3x3 array where, along the third dimension, all the elements of the original array are multiplied by 1, 2 or 3 respectively. How do I do this in Python?
This is the shortest code I could come up with (not the most optimized):
a = [[1,2,3],[4,5,6],[7,8,9]]
b = [1,2,3]
mult = [[[z*x for z in y] for y in a] for x in b]
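For completeness, a vectorized sketch using NumPy broadcasting (not from the original answer; it assumes the same a and b). It places the multipliers along the last axis, as the question describes:

import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
b = np.array([1, 2, 3])

# a[:, :, None] has shape (3, 3, 1); broadcasting against b of shape (3,)
# gives shape (3, 3, 3), with result[i, j, k] == a[i, j] * b[k]
result = a[:, :, None] * b
print(result.shape)  # (3, 3, 3)

Note that the list comprehension above instead places the multiplier along the first axis: mult[k][i][j] == a[i][j] * b[k].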