Using numpy, I want to multiple a matrix x by a column array y, elementwise:
x = numpy.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
y = numpy.array([1, 2, 3])
z = numpy.multiply(x, y)
print z
This gives the output as if y is a row array:
[[ 1 4 9]
[ 4 10 18]
[ 7 16 27]]
However, I want the output as if y is a column array:
[[ 1 2 3]
[ 8 10 12]
[21 24 27]]
So how can I manipulate y to achieve this? If I use:
y = numpy.transpose(y)
then y remains the same shape.
Enclose it in another list to make it 2D:
>>> y2 = numpy.transpose([y])
>>> y2
array([[1],
[2],
[3]])
>>> numpy.multiply(x, y2)
array([[ 1, 2, 3],
[ 8, 10, 12],
[21, 24, 27]])
The reason you can't transpose y is because it's initialized as a 1-D array. Transposing an array only makes sense in two (or more) dimensions.
To get around these mixed-dimension issues, numpy actually provides a set of convenience functions to sanitize your inputs:
y = np.array([1, 2, 3])
y1 = np.atleast_1d(y) # Converts array to 1-D if less than that
y2 = np.atleast_2d(y) # Converts array to 2-D if less than that
y3 = np.atleast_3d(y) # Converts array to 3-D if less than that
I also think np.column_stack falls under this convenience category, as it puts together 1-D and 2-D arrays as columns like you would expect, rather than having to figure out the right series of reshapes and stacks.
y1 = np.array([1, 2, 3])
y2 = np.array([2, 4, 6])
y3 = np.array([[2, 6], [2, 4], [7, 7]])
y = np.column_stack((y1, y2, y3))
I think these functions aren't as well known as they should be, and I find them much easier, more flexible, and safer than manually fiddling with reshape or array dimensions. They also avoid making copies when possible, which can be a small performance speedup.
To answer your question, you should use np.atleast_2d to convert your array to a 2-D array, then transpose it.
y = np.atleast_2d(y).T
The other way to quickly do it without worrying about y is to transpose x then transpose the result back.
z = (x.T * y).T
Though this can obfuscate the intent of the code. It is probably faster though if performance is important.
If performance is important, that can inform which method you want to use. Some timings on my computer:
%timeit x * np.atleast_2d(y).T
100000 loops, best of 3: 7.98 us per loop
%timeit (x.T*y).T
100000 loops, best of 3: 3.27 us per loop
%timeit x * np.transpose([y])
10000 loops, best of 3: 20.2 us per loop
%timeit x * y.reshape(-1, 1)
100000 loops, best of 3: 3.66 us per loop
You can use reshape:
y = y.reshape(-1,1)
The y variable has a shape of (3,). If you construct it this way:
y = numpy.array([1, 2, 3], ndmin=2)
...it will have a shape of (1,3), which you can transpose to get the result you want:
y = numpy.array([1, 2, 3], ndmin=2).T
z = numpy.multiply(x, y)
Related
What is the meaning of x[...] below?
a = np.arange(6).reshape(2,3)
for x in np.nditer(a, op_flags=['readwrite']):
x[...] = 2 * x
While the proposed duplicate What does the Python Ellipsis object do? answers the question in a general python context, its use in an nditer loop requires, I think, added information.
https://docs.scipy.org/doc/numpy/reference/arrays.nditer.html#modifying-array-values
Regular assignment in Python simply changes a reference in the local or global variable dictionary instead of modifying an existing variable in place. This means that simply assigning to x will not place the value into the element of the array, but rather switch x from being an array element reference to being a reference to the value you assigned. To actually modify the element of the array, x should be indexed with the ellipsis.
That section includes your code example.
So in my words, the x[...] = ... modifies x in-place; x = ... would have broken the link to the nditer variable, and not changed it. It's like x[:] = ... but works with arrays of any dimension (including 0d). In this context x isn't just a number, it's an array.
Perhaps the closest thing to this nditer iteration, without nditer is:
In [667]: for i, x in np.ndenumerate(a):
...: print(i, x)
...: a[i] = 2 * x
...:
(0, 0) 0
(0, 1) 1
...
(1, 2) 5
In [668]: a
Out[668]:
array([[ 0, 2, 4],
[ 6, 8, 10]])
Notice that I had to index and modify a[i] directly. I could not have used, x = 2*x. In this iteration x is a scalar, and thus not mutable
In [669]: for i,x in np.ndenumerate(a):
...: x[...] = 2 * x
...
TypeError: 'numpy.int32' object does not support item assignment
But in the nditer case x is a 0d array, and mutable.
In [671]: for x in np.nditer(a, op_flags=['readwrite']):
...: print(x, type(x), x.shape)
...: x[...] = 2 * x
...:
0 <class 'numpy.ndarray'> ()
4 <class 'numpy.ndarray'> ()
...
And because it is 0d, x[:] cannot be used instead of x[...]
----> 3 x[:] = 2 * x
IndexError: too many indices for array
A simpler array iteration might also give insight:
In [675]: for x in a:
...: print(x, x.shape)
...: x[:] = 2 * x
...:
[ 0 8 16] (3,)
[24 32 40] (3,)
this iterates on the rows (1st dim) of a. x is then a 1d array, and can be modified with either x[:]=... or x[...]=....
And if I add the external_loop flag from the next section, x is now a 1d array, and x[:] = would work. But x[...] = still works and is more general. x[...] is used all the other nditer examples.
In [677]: for x in np.nditer(a, op_flags=['readwrite'], flags=['external_loop']):
...: print(x, type(x), x.shape)
...: x[...] = 2 * x
[ 0 16 32 48 64 80] <class 'numpy.ndarray'> (6,)
Compare this simple row iteration (on a 2d array):
In [675]: for x in a:
...: print(x, x.shape)
...: x[:] = 2 * x
...:
[ 0 8 16] (3,)
[24 32 40] (3,)
this iterates on the rows (1st dim) of a. x is then a 1d array, and can be modified with either x[:] = ... or x[...] = ....
Read and experiment with this nditer page all the way through to the end. By itself, nditer is not that useful in python. It does not speed up iteration - not until you port your code to cython.np.ndindex is one of the few non-compiled numpy functions that uses nditer.
The ellipsis ... means as many : as needed.
For people who don't have time, here is a simple example:
In [64]: X = np.reshape(np.arange(9), (3,3))
In [67]: Y = np.reshape(np.arange(2*3*4), (2,3,4))
In [70]: X
Out[70]:
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
In [71]: X[:,0]
Out[71]: array([0, 3, 6])
In [72]: X[...,0]
Out[72]: array([0, 3, 6])
In [73]: Y
Out[73]:
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]],
[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]]])
In [74]: Y[:,0]
Out[74]:
array([[ 0, 1, 2, 3],
[12, 13, 14, 15]])
In [75]: Y[...,0]
Out[75]:
array([[ 0, 4, 8],
[12, 16, 20]])
In [76]: X[0,...,0]
Out[76]: array(0)
In [77]: Y[0,...,0]
Out[77]: array([0, 4, 8])
This makes it easy to manipulate only one dimension at a time.
One thing - You can have only one ellipsis in any given indexing expression, or your expression would be ambiguous about how many : should be put in each.
I believe a very good parallel (that most people are maybe used to) is to think that way:
import numpy as np
random_array = np.random.rand(2, 2, 2, 2)
In such case, [:, :, :, 0] and [..., 0] are the same.
You can use to analyse only an specific dimension, say you have a batch of 50 128x128 RGB image (50, 3, 128, 128), if you want to slice a piece of it in every image at every color channel, you could either do image[:,:,50:70, 20:80] or image[...,50:70,20:80]
Just be aware that you can't use it more than once in the statement like [...,0,...] is invalid.
I want to calculate the following:
but I have no idea how to do this in python, I do not want to implement this manually but use a predefined function for this, something from numpy for example.
But numpy seems to ignore that x.T should be transposed.
Code:
import numpy as np
x = np.array([1, 5])
print(np.dot(x, x.T)) # = 26, This is not the matrix it should be!
While your vectors are defined as 1-d arrays, you can use np.outer:
np.outer(x, x.T)
> array([[ 1, 5],
> [ 5, 25]])
Alternatively, you could also define your vectors as matrices and use normal matrix multiplication:
x = np.array([[1], [5]])
x # x.T
> array([[ 1, 5],
> [ 5, 25]])
You can do:
x = np.array([[1], [5]])
print(np.dot(x, x.T))
Your original x is of shape (2,), while you need a shape of (2,1). Another way is reshaping your x:
x = np.array([1, 5]).reshape(-1,1)
print(np.dot(x, x.T))
.reshape(-1,1) reshapes your array to have 1 column and implicitely takes care of number of rows.
output:
[[ 1 5]
[ 5 25]]
np.matmul(x[:, np.newaxis], [x])
I have a 2D numpy array, say array1 with values. array1 is of dimensions 2x4. I want to create a 4D numpy array array2 with dimensions 20x20x2x4 and I wish to replicate the array array1 to get this array.
That is, if array1 was
[[1, 2, 3, 4],
[5, 6, 7, 8]]
I want
array2[0, 0] = array1
array2[0, 1] = array1
array2[0, 2] = array1
array2[0, 3] = array1
# etc.
How can I do this?
One approach with initialization -
array2 = np.empty((20,20) + array1.shape,dtype=array1.dtype)
array2[:] = array1
Runtime test -
In [400]: array1 = np.arange(1,9).reshape(2,4)
In [401]: array1
Out[401]:
array([[1, 2, 3, 4],
[5, 6, 7, 8]])
# #MSeifert's soln
In [402]: %timeit np.tile(array1, (20, 20, 1, 1))
100000 loops, best of 3: 8.01 µs per loop
# Proposed soln in this post
In [403]: %timeit initialization_based(array1)
100000 loops, best of 3: 4.11 µs per loop
# #MSeifert's soln for READONLY-view
In [406]: %timeit np.broadcast_to(array1, (20, 20, 2, 4))
100000 loops, best of 3: 2.78 µs per loop
There are two easy ways:
np.broadcast_to:
array2 = np.broadcast_to(array1, (20, 20, 2, 4)) # array2 is a READONLY-view
and np.tile:
array2 = np.tile(array1, (20, 20, 1, 1)) # array2 is a normal numpy array
If you don't want to modify your array2 then np.broadcast_to should be really fast and simple. Otherwise np.tile or assigning to a new allocated array (see Divakars answer) should be preferred.
i got the answer.
array2[:, :, :, :] = array1.copy()
this should work fine
Let's say we have two matrices A and B and let matrix C be A*B (matrix multiplication not element-wise). We wish to get only the diagonal entries of C, which can be done via np.diagonal(C). However, this causes unnecessary time overhead, because we are multiplying A with B even though we only need the the multiplications of each row in A with the column of B that has the same 'id', that is row 1 of A with column 1 of B, row 2 of A with column 2 of B and so on: the multiplications that form the diagonal of C. Is there a way to efficiently achieve that using Numpy? I want to avoid using loops to control which row is multiplied with which column, instead, I wish for a built-in numpy method that does this kind of operation to optimize performance.
Thanks in advance..
I might use einsum here:
>>> a = np.random.randint(0, 10, (3,3))
>>> b = np.random.randint(0, 10, (3,3))
>>> a
array([[9, 2, 8],
[5, 4, 0],
[8, 0, 6]])
>>> b
array([[5, 5, 0],
[3, 5, 5],
[9, 4, 3]])
>>> a.dot(b)
array([[123, 87, 34],
[ 37, 45, 20],
[ 94, 64, 18]])
>>> np.diagonal(a.dot(b))
array([123, 45, 18])
>>> np.einsum('ij,ji->i', a,b)
array([123, 45, 18])
For larger arrays, it'll be much faster than doing the multiplication directly:
>>> a = np.random.randint(0, 10, (1000,1000))
>>> b = np.random.randint(0, 10, (1000,1000))
>>> %timeit np.diagonal(a.dot(b))
1 loops, best of 3: 7.04 s per loop
>>> %timeit np.einsum('ij,ji->i', a, b)
100 loops, best of 3: 7.49 ms per loop
[Note: originally I'd done the elementwise version, ii,ii->i, instead of matrix multiplication. The same einsum tricks work.]
def diag(A,B):
diags = []
for x in range(len(A)):
diags.append(A[x][x] * B[x][x])
return diags
I believe the above code is that you're looking for.
I have a matrix X of dimensions (30x8100) and another one Y of dimensions (1x8100). I want to generate an array containing the difference between them (X[1]-Y, X[2]-Y,..., X[30]-Y)
Can anyone help?
All you need for that is
X - Y
Since several people have offered answers that seem to try to make the shapes match manually, I should explain:
Numpy will automatically expand Y's shape so that it matches with that of X. This is called broadcasting, and it usually does a very good job of guessing what should be done. In ambiguous cases, an axis keyword can be applied to tell it which direction to do things. Here, since Y has a dimension of length 1, that is the axis that is expanded to be length 30 to match with X's shape.
For example,
In [87]: import numpy as np
In [88]: n, m = 3, 5
In [89]: x = np.arange(n*m).reshape(n,m)
In [90]: y = np.arange(m)[None,...]
In [91]: x.shape
Out[91]: (3, 5)
In [92]: y.shape
Out[92]: (1, 5)
In [93]: (x-y).shape
Out[93]: (3, 5)
In [106]: x
Out[106]:
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
In [107]: y
Out[107]: array([[0, 1, 2, 3, 4]])
In [108]: x-y
Out[108]:
array([[ 0, 0, 0, 0, 0],
[ 5, 5, 5, 5, 5],
[10, 10, 10, 10, 10]])
But this is not really a euclidean distance, as your title seems to suggest you want:
df = np.asarray(x - y) # the difference between the images
dst = np.sqrt(np.sum(df**2, axis=1)) # their euclidean distances
use array and use numpy broadcasting in order to subtract it from Y
init the matrix:
>>> from numpy import *
>>> a = array([[1,2,3],[4,5,6]])
Accessing the second row in a:
>>> a[1]
array([4, 5, 6])
Subtract array from Y
>>> Y = array([3,9,0])
>>> a - Y
array([[-2, -7, 3],
[ 1, -4, 6]])
Just iterate rows from your numpy array and you can actually just subtract them and numpy will make a new array with the differences!
import numpy as np
final_array = []
#X is a numpy array that is 30X8100 and Y is a numpy array that is 1X8100
for row in X:
output = row - Y
final_array.append(output)
output will be your resulting array of X[0] - Y, X[1] - Y etc. Now your final_array will be an array with 30 arrays inside, each that have the values of the X-Y that you need! Simple as that. Just make sure you convert your matrices to a numpy arrays first
Edit: Since numpy broadcasting will do the iteration, all you need is one line once you have your two arrays:
final_array = X - Y
And then that is your array with the differences!
a1 = numpy.array(X) #make sure you have a numpy array like [[1,2,3],[4,5,6],...]
a2 = numpy.array(Y) #make sure you have a 1d numpy array like [1,2,3,...]
a2 = [a2] * len(a1[0]) #make a2 as wide as a1
a2 = numpy.array(zip(*a2)) #transpose it (a2 is now same shape as a1)
print a1-a2 #idiomatic difference between a1 and a2 (or X and Y)