Numpy: Subtract 2 numpy arrays row wise - python

I have 2 numpy arrays a and b as below:
a = np.random.randint(0,10,(3,2))
Out[124]:
array([[0, 2],
[6, 8],
[0, 4]])
b = np.random.randint(0,10,(2,2))
Out[125]:
array([[5, 9],
[2, 4]])
I want to subtract each row in b from each row in a, and the desired output has shape (3, 2, 2):
array([[[-5, -7], [-2, -2]],
[[ 1, -1], [ 4, 4]],
[[-5, -5], [-2, 0]]])
I can do this using:
print(np.c_[(a - b[0]),(a - b[1])].reshape(3,2,2))
But I need a fully vectorized solution or a built in numpy function to do this.

Just use np.newaxis (which is just an alias for None) to add a singleton dimension to a, and let broadcasting do the rest:
In [45]: a[:, np.newaxis] - b
Out[45]:
array([[[-5, -7],
[-2, -2]],
[[ 1, -1],
[ 4, 4]],
[[-5, -5],
[-2, 0]]])
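To see why this works, it can help to look at the shapes involved. The sketch below hard-codes the a and b shown in the question (so it does not depend on the random draw) and compares the broadcast result with the np.c_ version from the question; the names a3, result and expected are only illustrative.

```python
import numpy as np

# Arrays copied from the question so the result is reproducible.
a = np.array([[0, 2], [6, 8], [0, 4]])
b = np.array([[5, 9], [2, 4]])

a3 = a[:, np.newaxis]   # shape (3, 1, 2): a singleton axis is inserted in the middle
result = a3 - b         # b has shape (2, 2) and is broadcast against (3, 1, 2)

print(a3.shape)         # (3, 1, 2)
print(result.shape)     # (3, 2, 2)

# Same values as the np.c_ approach from the question.
expected = np.c_[(a - b[0]), (a - b[1])].reshape(3, 2, 2)
print(np.array_equal(result, expected))
```

Here result[i, j] holds a[i] - b[j], which is the layout the question asks for.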

I'm not sure what a fully vectorized solution means here, but maybe this will help:
np.append(a, a, axis=1).reshape(3, 2, 2) - b

You can shave a little time off using np.subtract(), and a good bit more using np.concatenate()
import numpy as np
import time

# Baseline: np.c_ with the - operator
start = time.time()
for i in range(100000):
    a = np.random.randint(0,10,(3,2))
    b = np.random.randint(0,10,(2,2))
    c = np.c_[(a - b[0]),(a - b[1])].reshape(3,2,2)
print(time.time() - start)

# np.c_ with np.subtract
start = time.time()
for i in range(100000):
    a = np.random.randint(0,10,(3,2))
    b = np.random.randint(0,10,(2,2))
    c = np.c_[np.subtract(a,b[0]),np.subtract(a,b[1])].reshape(3,2,2)
print(time.time() - start)

# np.concatenate with np.subtract
start = time.time()
for i in range(100000):
    a = np.random.randint(0,10,(3,2))
    b = np.random.randint(0,10,(2,2))
    c = np.concatenate([np.subtract(a,b[0]),np.subtract(a,b[1])],axis=1).reshape(3,2,2)
print(time.time() - start)
>>>
3.14023900032
3.00368094444
1.16146492958
reference:
confused about numpy.c_ document and sample code
np.c_ is another way of doing array concatenate
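Note that the loops above also include the cost of the np.random.randint calls in every measurement. A possible way to time only the subtraction is sketched below with timeit on fixed inputs; it also includes the broadcasting version for comparison. The candidates dictionary and the call count are arbitrary choices for illustration, and no timings are quoted because they depend on the machine.

```python
import timeit
import numpy as np

# Fixed inputs, generated once, so only the subtraction itself is timed.
rng = np.random.default_rng(0)
a = rng.integers(0, 10, (3, 2))
b = rng.integers(0, 10, (2, 2))

candidates = {
    "np.c_": lambda: np.c_[(a - b[0]), (a - b[1])].reshape(3, 2, 2),
    "np.concatenate": lambda: np.concatenate([a - b[0], a - b[1]], axis=1).reshape(3, 2, 2),
    "broadcasting": lambda: a[:, np.newaxis] - b,
}

reference = candidates["broadcasting"]()
for name, fn in candidates.items():
    # Every candidate must produce the same (3, 2, 2) array.
    assert np.array_equal(fn(), reference)
    seconds = timeit.timeit(fn, number=10000)
    print(f"{name}: {seconds:.4f} s for 10000 calls")
```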

Reading from the doc on broadcasting, it says:
When operating on two arrays, NumPy compares their shapes
element-wise. It starts with the trailing dimensions, and works its
way forward. Two dimensions are compatible when
they are equal, or
one of them is 1
Back to your case: you want the result to have shape (3, 2, 2), so following these rules you have to arrange the dimensions of a and b so that they line up.
Here is the code to do it:
In [1]: a_ = np.expand_dims(a, axis=0)
In [2]: b_ = np.expand_dims(b, axis=1)
In [3]: c = a_ - b_
In [4]: c
Out[4]:
array([[[-5, -7],
[ 1, -1],
[-5, -5]],
[[-2, -2],
[ 4, 4],
[-2, 0]]])
In [5]: result = c.swapaxes(1, 0)
In [6]: result
Out[6]:
array([[[-5, -7],
[-2, -2]],
[[ 1, -1],
[ 4, 4]],
[[-5, -5],
[-2, 0]]])
In [7]: result.shape
Out[7]: (3, 2, 2)
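As a rough check of the rule quoted above, the sketch below prints the broadcast shape of the two expanded arrays and shows that putting the new axis on a at position 1 gives (3, 2, 2) directly, without swapaxes. The arrays are copied from the question; the name direct is only illustrative.

```python
import numpy as np

a = np.array([[0, 2], [6, 8], [0, 4]])
b = np.array([[5, 9], [2, 4]])

a_ = np.expand_dims(a, axis=0)   # shape (1, 3, 2)
b_ = np.expand_dims(b, axis=1)   # shape (2, 1, 2)

# Compare shapes from the trailing dimension: 2 vs 2, 3 vs 1, 1 vs 2 -> (2, 3, 2)
print(np.broadcast(a_, b_).shape)

# With the new axis on a at position 1: (3, 1, 2) vs (2, 2) -> (3, 2, 2)
direct = np.expand_dims(a, axis=1) - b
print(direct.shape)
print(np.array_equal(direct, (a_ - b_).swapaxes(1, 0)))
```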

Related

Batch dot product with numpy?

I need to get the dot product of many vectors with one vector. Example code:
a = np.array([0, 1, 2])
b = np.array([
[0, 1, 2],
[4, 5, 6],
[-1, 0, 1],
[-3, -2, 1]
])
I would like to get the dot product of each row of b against a. I can iterate:
result = []
for row in b:
result.append(np.dot(row, a))
print(result)
which gives:
[5, 17, 2, 0]
How can I get this without iterating? Thanks!
Use numpy.dot or numpy.matmul instead of a for loop:
import numpy as np
np.matmul(b, a)
# or
np.dot(b, a)
Output:
array([ 5, 17, 2, 0])
I would just use the @ operator:
b @ a
Out[108]: array([ 5, 17, 2, 0])
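For comparison, here is a small sketch that runs the loop from the question next to the vectorized forms, using the same a and b. np.einsum is added as one more option that spells out which axis is summed; it is not taken from the answers above.

```python
import numpy as np

a = np.array([0, 1, 2])
b = np.array([
    [0, 1, 2],
    [4, 5, 6],
    [-1, 0, 1],
    [-3, -2, 1],
])

# Loop from the question, kept as the reference result.
looped = np.array([np.dot(row, a) for row in b])

via_matmul = b @ a                        # same as np.matmul(b, a)
via_einsum = np.einsum("ij,j->i", b, a)   # sum over j for every row i

print(looped)
print(np.array_equal(looped, via_matmul), np.array_equal(looped, via_einsum))
```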

Pseudo inverse matrix calculation

I am trying to reproduce the pseudo-inverse calculation example from a lecture.
I use this code:
import numpy as np
# https://classes.soe.ucsc.edu/cmps290c/Spring04/paps/lls.pdf
x = np.array([[-11, 2],[2, 3],[2, -1]])
print(x)
# computing the pseudo-inverse using pinv
a = np.linalg.pinv(x)
print(a)
My result of the calculation differs from the result in the lecture.
My result:
[[-0.07962213 0.05533063 0.00674764]
[ 0.04048583 0.2854251 -0.06275304]]
The result from the lecture:
[[-0.148 0.180 0.246]
[ 0.164 0.189 -0.107]]
What am I doing wrong? Tell me please!
There is a mistake in the lecture notes. It appears that they found the pseudo-inverse of
    [-1  2]
A = [ 2  3]
    [ 2 -1]
(Note the change of A[0,0] from -11 to -1.) Here's the calculation with that version of A:
In [73]: A = np.array([[-1, 2], [2, 3], [2, -1]])
In [74]: A
Out[74]:
array([[-1, 2],
[ 2, 3],
[ 2, -1]])
In [75]: np.linalg.pinv(A)
Out[75]:
array([[-0.14754098, 0.18032787, 0.24590164],
[ 0.16393443, 0.18852459, -0.10655738]])
In [76]: np.linalg.pinv(A).dot([0, 7, 5])
Out[76]: array([ 2.49180328, 0.78688525])
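One way to double-check the corrected matrix independently of the lecture notes is sketched below. It assumes A has full column rank (it does here), in which case the pseudo-inverse equals (A^T A)^{-1} A^T, and it also compares against np.linalg.lstsq. The right-hand side y is the vector [0, 7, 5] used above; the other names are illustrative.

```python
import numpy as np

A = np.array([[-1, 2], [2, 3], [2, -1]], dtype=float)
y = np.array([0, 7, 5], dtype=float)

pinv_A = np.linalg.pinv(A)

# For a matrix with full column rank, pinv(A) == inv(A.T @ A) @ A.T.
normal_eq = np.linalg.inv(A.T @ A) @ A.T
print(np.allclose(pinv_A, normal_eq))

# The least-squares solution should match pinv(A) @ y.
x_pinv = pinv_A @ y
x_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)
print(x_pinv)
print(np.allclose(x_pinv, x_lstsq))
```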

Trying to add a column to a data file

I have a data file with 2 columns, x ranging from -5 to 4 and f(x). I need to add a third column with |f(x)| the absolute value of f(x). Then I need to export the 3 columns as a new data file.
Currently my code looks like this:
from numpy import *
data = genfromtxt("task1.dat")
c = []
ab = abs(data[:,1])
ablist = ab.tolist()
datalist = data.tolist()
c.append(ablist)
c.append (datalist)
A = asarray (c)
savetxt("task1b.dat", A)
It gives me the following error message for line "A = asarray(c)":
ValueError : setting an array element with a sequence.
Does someone know a quick and efficient way to add this column and export the data file?
You are getting a list within a list in c.
Anyway, I think this is much clearer:
import numpy as np
data = np.genfromtxt("task1.dat")
data_new = np.hstack((data, np.abs(data[:,-1]).reshape((-1,1))))
np.savetxt("task_out.dat", data_new)
c is a list and when you execute
c.append(ablist)
c.append (datalist)
it appends two lists of different shapes to the list c. It will probably end up looking like this:
c == [ [....], [ [....], [....] ] ]
which numpy.asarray cannot parse because of that shape difference.
(I say "probably" because I am assuming that genfromtxt("task1.dat") returns a 2-D array.)
What you can do to concatenate the columns is:
from numpy import *
data = genfromtxt("task1.dat")
ab = abs(data[:,1])
c = concatenate((data, ab.reshape(-1,1)), axis=1)
savetxt("task1b.dat", c)
data is a 2d array like:
In [54]: data=np.arange(-5,5).reshape(5,2)
In [55]: data
Out[55]:
array([[-5, -4],
[-3, -2],
[-1, 0],
[ 1, 2],
[ 3, 4]])
In [56]: ab=abs(data[:,1])
There are various ways to concatenate 2 arrays. In this case, data is 2d, and ab is 1d, so you have to take some steps to ensure they are both 2d. np.column_stack does that for us.
In [58]: np.column_stack((data,ab))
Out[58]:
array([[-5, -4, 4],
[-3, -2, 2],
[-1, 0, 0],
[ 1, 2, 2],
[ 3, 4, 4]])
With a little change in indexing we could make ab a column array from the start, and simply concatenate on the 2nd axis:
ab=abs(data[:,[1]])
np.concatenate((data,ab),axis=1)
==================
The same numbers with your tolist produce a c like
In [72]: [ab.tolist()]+[data.tolist()]
Out[72]: [[4, 2, 0, 2, 4], [[-5, -4], [-3, -2], [-1, 0], [1, 2], [3, 4]]]
That is not good input for array.
To go the list route you need to do an iteration over a zip:
In [86]: list(zip(data,ab))
Out[86]:
[(array([-5, -4]), 4),
(array([-3, -2]), 2),
(array([-1, 0]), 0),
(array([1, 2]), 2),
(array([3, 4]), 4)]
In [87]: c=[]
In [88]: for i,j in zip(data,ab):
   ....:     c.append(i.tolist()+[j])
   ....:
In [89]: c
Out[89]: [[-5, -4, 4], [-3, -2, 2], [-1, 0, 0], [1, 2, 2], [3, 4, 4]]
In [90]: np.array(c)
Out[90]:
array([[-5, -4, 4],
[-3, -2, 2],
[-1, 0, 0],
[ 1, 2, 2],
[ 3, 4, 4]])
Obviously this will be slower than the array concatenate, but studying this might help you understand both arrays and lists.
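Putting the whole task together (read, add the |f(x)| column, write), a minimal sketch could look like the following. It uses io.StringIO objects as hypothetical stand-ins for task1.dat and task1b.dat so that it runs without any files on disk; the sample values and the "%g" format are assumptions, not part of the original data.

```python
import io
import numpy as np

# Hypothetical stand-in for task1.dat: two whitespace-separated columns, x and f(x).
raw = io.StringIO("-5 -4\n-3 -2\n-1 0\n1 2\n3 4\n")
data = np.genfromtxt(raw)

# Append |f(x)| as a third column.
out = np.column_stack((data, np.abs(data[:, 1])))

# Write to an in-memory buffer; with a real file, pass the filename instead.
buffer = io.StringIO()
np.savetxt(buffer, out, fmt="%g")
print(buffer.getvalue())
```

With real files, the two StringIO objects would simply be replaced by the input and output filenames.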

Slicing a 3-D array using a 2-D array

Assume we have two matrices:
x = np.random.randint(10, size=(2, 3, 3))
idx = np.random.randint(3, size=(2, 3))
The goal is to index the elements of x using idx, like this:
dim1 = x[0, range(0,3), idx[0]] # slicing x[0] using idx[0]
dim2 = x[1, range(0,3), idx[1]]
res = np.vstack((dim1, dim2))
Is there a neat way to do this?
You can index it directly with integer arrays, as long as the index arrays broadcast to the same shape. That is what the .reshape calls are for:
x[np.array([0,1]).reshape(idx.shape[0], -1),
np.array([0,1,2]).reshape(-1,idx.shape[1]),
idx]
Out[29]:
array([[ 0.10786251, 0.2527514 , 0.11305823],
[ 0.67264076, 0.80958292, 0.07703623]])
Here's another way to do it with reshaping -
x.reshape(-1,x.shape[2])[np.arange(idx.size),idx.ravel()].reshape(idx.shape)
Sample run -
In [2]: x
Out[2]:
array([[[5, 0, 9],
[3, 0, 7],
[7, 1, 2]],
[[5, 3, 5],
[8, 6, 1],
[7, 0, 9]]])
In [3]: idx
Out[3]:
array([[2, 1, 2],
[1, 2, 0]])
In [4]: x.reshape(-1,x.shape[2])[np.arange(idx.size),idx.ravel()].reshape(idx.shape)
Out[4]:
array([[9, 0, 2],
[3, 1, 7]])
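Another option, not mentioned in the answers above, is np.take_along_axis, which picks one element along the last axis for every (i, j) position. The sketch below reuses the x and idx from the sample run and compares the result with a broadcast integer-array index; the names i, j, res and res_fancy are illustrative.

```python
import numpy as np

x = np.array([[[5, 0, 9], [3, 0, 7], [7, 1, 2]],
              [[5, 3, 5], [8, 6, 1], [7, 0, 9]]])
idx = np.array([[2, 1, 2],
                [1, 2, 0]])

# idx needs the same number of dimensions as x, so add a trailing axis,
# then drop it again after the lookup.
res = np.take_along_axis(x, idx[..., np.newaxis], axis=2)[..., 0]

# Equivalent integer-array indexing: i has shape (2, 1), j has shape (3,),
# and both broadcast against idx with shape (2, 3).
i = np.arange(x.shape[0])[:, np.newaxis]
j = np.arange(x.shape[1])
res_fancy = x[i, j, idx]

print(res)
print(np.array_equal(res, res_fancy))
```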

How to efficiently get matrix of the desired form in Python?

I have four numpy arrays like:
X1 = array([[1, 2], [2, 0]])
X2 = array([[3, 1], [2, 2]])
I1 = array([[1], [1]])
I2 = array([[1], [1]])
And I'm doing:
Y = array([[I1, X1],
           [I2, X2]])
To get:
Y = array([[ 1, 1, 2],
[ 1, 2, 0],
[-1, -3, -1],
[-1, -2, -2]])
Like this example, I have large matrices, where X1 and X2 are n x d matrices.
Is there an efficient way in Python whereby I can get the matrix Y?
I know how to do this with loops, but I am looking for a more efficient way.
Here, Y is an n x (d+1) matrix and I1 and I2 are identity matrices of the dimension n x 1.
How about the following:
In [1]: import numpy as np
In [2]: X1 = np.array([[1,2],[2,0]])
In [3]: X2 = np.array([[3,1],[2,2]])
In [4]: I1 = np.array([[1],[1]])
In [5]: I2 = np.array([[4],[4]])
In [7]: Y = np.vstack((np.hstack((I1,X1)),np.hstack((I2,X2))))
In [8]: Y
Out[8]:
array([[1, 1, 2],
[1, 2, 0],
[4, 3, 1],
[4, 2, 2]])
Alternatively you could create an empty array of the appropriate size and fill it using the appropriate slices. This would avoid making intermediate arrays.
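The preallocate-and-fill idea could look roughly like this, using the same X1, X2, I1 and I2 as in the session above. It assumes both blocks have the same n x d shape and a common dtype.

```python
import numpy as np

X1 = np.array([[1, 2], [2, 0]])
X2 = np.array([[3, 1], [2, 2]])
I1 = np.array([[1], [1]])
I2 = np.array([[4], [4]])

n, d = X1.shape

# Allocate the full result once, then copy each block into its slice.
Y = np.empty((2 * n, d + 1), dtype=X1.dtype)
Y[:n, :1] = I1   # top-left column
Y[:n, 1:] = X1   # top-right block
Y[n:, :1] = I2   # bottom-left column
Y[n:, 1:] = X2   # bottom-right block

print(Y)
```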
You need numpy.bmat
In [4]: A = np.mat('1 ; 1 ')
In [5]: B = np.mat('2 2; 2 2')
In [6]: C = np.mat('3 ; 5')
In [7]: D = np.mat('7 8; 9 0')
In [8]: np.bmat([[A,B],[C,D]])
Out[8]:
matrix([[1, 2, 2],
[1, 2, 2],
[3, 7, 8],
[5, 9, 0]])
For a numpy array, this page suggests the syntax may be of the form
vstack([hstack([a,b]),
hstack([c,d])])
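For plain ndarrays, np.block takes the same nested-list layout as np.bmat and returns an ndarray rather than a matrix. This is a substitute for the bmat call above, not what the answer itself used; the sketch reuses the values of A, B, C and D as ordinary arrays.

```python
import numpy as np

A = np.array([[1], [1]])
B = np.array([[2, 2], [2, 2]])
C = np.array([[3], [5]])
D = np.array([[7, 8], [9, 0]])

# Inner lists are joined horizontally, the outer list vertically.
Y = np.block([[A, B],
              [C, D]])

print(type(Y).__name__)
print(Y)
```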
