Subtracting Row From Column in NumPy - python

I have an m-dimensional NumPy array A and an n-dimensional NumPy array B.
I want to create an m x n matrix C such that C[i, j] = B[j] - A[i].
Is there an efficient/vectorized way to do this in NumPy?
Currently I am using:
C = np.zeros((M, N))
for i in range(0, M):
    C[i, :] = (B - A[i])
Edit:
m, n are big numbers, thus C is an even bigger matrix (of m*n entries).
I tried np.repeat and np.subtract.outer, but both of those exhaust my RAM.

I think you are looking for np.subtract.outer
M = 5
N = 10
A = np.arange(N)
B = np.arange(M)
np.subtract.outer(A,B)
output:
array([[ 0, -1, -2, -3, -4],
[ 1, 0, -1, -2, -3],
[ 2, 1, 0, -1, -2],
[ 3, 2, 1, 0, -1],
[ 4, 3, 2, 1, 0],
[ 5, 4, 3, 2, 1],
[ 6, 5, 4, 3, 2],
[ 7, 6, 5, 4, 3],
[ 8, 7, 6, 5, 4],
[ 9, 8, 7, 6, 5]])
It will apply subtraction to all pairs in A and B. For 1-D inputs it is equivalent to
C = np.empty((len(A), len(B)))
for i in range(len(A)):
    for j in range(len(B)):
        C[i, j] = A[i] - B[j]
EDIT
If you have memory issues you could try allocating the output buffer before doing the operation and setting the out keyword appropriately:
C = np.zeros((M, N))
np.subtract.outer(B, A, out=C)
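If the preallocated buffer is still too large, a smaller dtype is one thing to try. The following is a minimal sketch, assuming float32 precision is acceptable for your data; the sizes and the float32 choice are illustrative assumptions, not taken from the question. It uses the question's naming (A has length m, B has length n) and writes the result directly into the buffer.

```python
import numpy as np

m, n = 4, 6  # small illustrative sizes (assumption)
A = np.arange(m, dtype=np.float32)
B = np.arange(n, dtype=np.float32)

# Preallocate the m x n result once; float32 needs m*n*4 bytes instead of m*n*8.
C = np.empty((m, n), dtype=np.float32)

# Broadcasting A as a column against B as a row computes C[i, j] = B[j] - A[i]
# and writes it straight into C, without a repeated copy of either input.
np.subtract(B, A[:, None], out=C)
```

This only halves the footprint; if m*n*4 bytes still does not fit, the full matrix cannot be held in memory at once.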

You could repeat one of the arrays on a new axis, and then subtract the other array.
Example:
m = 10
n = 20
A = np.array(range(m))
B = np.array(range(n))
C = np.repeat(B[:, np.newaxis], m, axis=1) - A
Then C would contain:
array([[ 0, -1, -2, -3, -4, -5, -6, -7, -8, -9],
[ 1, 0, -1, -2, -3, -4, -5, -6, -7, -8],
[ 2, 1, 0, -1, -2, -3, -4, -5, -6, -7],
[ 3, 2, 1, 0, -1, -2, -3, -4, -5, -6],
[ 4, 3, 2, 1, 0, -1, -2, -3, -4, -5],
[ 5, 4, 3, 2, 1, 0, -1, -2, -3, -4],
[ 6, 5, 4, 3, 2, 1, 0, -1, -2, -3],
[ 7, 6, 5, 4, 3, 2, 1, 0, -1, -2],
[ 8, 7, 6, 5, 4, 3, 2, 1, 0, -1],
[ 9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
[10, 9, 8, 7, 6, 5, 4, 3, 2, 1],
[11, 10, 9, 8, 7, 6, 5, 4, 3, 2],
[12, 11, 10, 9, 8, 7, 6, 5, 4, 3],
[13, 12, 11, 10, 9, 8, 7, 6, 5, 4],
[14, 13, 12, 11, 10, 9, 8, 7, 6, 5],
[15, 14, 13, 12, 11, 10, 9, 8, 7, 6],
[16, 15, 14, 13, 12, 11, 10, 9, 8, 7],
[17, 16, 15, 14, 13, 12, 11, 10, 9, 8],
[18, 17, 16, 15, 14, 13, 12, 11, 10, 9],
[19, 18, 17, 16, 15, 14, 13, 12, 11, 10]])

This is a simple broadcasting task:
In [31]: A =np.arange(3); B=np.arange(4)
In [32]: C = B - A[:,None]
In [33]: C.shape
Out[33]: (3, 4)
In [34]: C
Out[34]:
array([[ 0, 1, 2, 3],
[-1, 0, 1, 2],
[-2, -1, 0, 1]])
This is like the https://stackoverflow.com/a/66842410/901925 answer, but there's no need to use repeat. That should cut down the memory usage a bit, but if M*N*8 is anywhere close to your memory limits, this or subsequent operations using C could hit that limit.
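When the full m x n result cannot fit in memory, one option is to never build it and instead process it a block of rows at a time. The sketch below assumes the downstream computation can work on row blocks; the helper name iter_row_blocks, the block size, and the per-row minimum are hypothetical choices for illustration only.

```python
import numpy as np

def iter_row_blocks(A, B, block_rows=1024):
    """Yield (start, block) pairs where block[i, j] = B[j] - A[start + i]."""
    for start in range(0, len(A), block_rows):
        # Only block_rows x len(B) entries exist at any one time.
        yield start, B - A[start:start + block_rows, None]

A = np.arange(5)
B = np.arange(4)

# Example reduction: the minimum of each row of C, without holding all of C.
row_min = np.empty(len(A), dtype=B.dtype)
for start, block in iter_row_blocks(A, B, block_rows=2):
    row_min[start:start + len(block)] = block.min(axis=1)
```

Peak memory is then governed by block_rows * n rather than m * n.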

Related

All permutations of array where each element in the array must be increasing by range between 0 and n

Let's say I have a list of elements with values
[1, 2, 3, 5, 6, 7, 9, 12]
Basically, consecutive elements of the array may differ by at most n (three in this case), in increasing order.
The array above would work like:
2-1 = 1 | difference of 1
3-2 = 1 | difference of 1
5-3 = 2 | difference of 2
6-5 = 1 | difference of 1
And so on.
How would I go about finding all permutations of an array with length x and a max difference of n?
Assuming you are looking for differences in absolute value, you can do this recursively by progressively adding each eligible element to the result.
Here's an example using a recursive generator function:
def permuteSort(A,maxDiff,previous=None):
    if not A: yield []; return
    for i,a in enumerate(A):
        if previous is not None and abs(a-previous) > maxDiff:
            continue
        yield from ([a]+p for p in permuteSort(A[:i]+A[i+1:],maxDiff,a))
Output
for p in permuteSort([1, 2, 3, 5, 6, 7, 9, 12],3):
    print(p,"differences:",[b-a for a,b in zip(p,p[1:])])
[1, 2, 3, 5, 6, 7, 9, 12] differences: [1, 1, 2, 1, 1, 2, 3]
[1, 2, 3, 5, 7, 6, 9, 12] differences: [1, 1, 2, 2, -1, 3, 3]
[1, 2, 3, 6, 5, 7, 9, 12] differences: [1, 1, 3, -1, 2, 2, 3]
[1, 2, 5, 3, 6, 7, 9, 12] differences: [1, 3, -2, 3, 1, 2, 3]
[1, 3, 2, 5, 6, 7, 9, 12] differences: [2, -1, 3, 1, 1, 2, 3]
[1, 3, 2, 5, 7, 6, 9, 12] differences: [2, -1, 3, 2, -1, 3, 3]
[2, 1, 3, 5, 6, 7, 9, 12] differences: [-1, 2, 2, 1, 1, 2, 3]
[2, 1, 3, 5, 7, 6, 9, 12] differences: [-1, 2, 2, 2, -1, 3, 3]
[2, 1, 3, 6, 5, 7, 9, 12] differences: [-1, 2, 3, -1, 2, 2, 3]
[3, 1, 2, 5, 6, 7, 9, 12] differences: [-2, 1, 3, 1, 1, 2, 3]
[3, 1, 2, 5, 7, 6, 9, 12] differences: [-2, 1, 3, 2, -1, 3, 3]
[5, 2, 1, 3, 6, 7, 9, 12] differences: [-3, -1, 2, 3, 1, 2, 3]
[6, 3, 1, 2, 5, 7, 9, 12] differences: [-3, -2, 1, 3, 2, 2, 3]
[7, 5, 2, 1, 3, 6, 9, 12] differences: [-2, -3, -1, 2, 3, 3, 3]
[12, 9, 6, 3, 1, 2, 5, 7] differences: [-3, -3, -3, -2, 1, 3, 2]
[12, 9, 6, 7, 5, 2, 1, 3] differences: [-3, -3, 1, -2, -3, -1, 2]
[12, 9, 6, 7, 5, 2, 3, 1] differences: [-3, -3, 1, -2, -3, 1, -2]
[12, 9, 6, 7, 5, 3, 1, 2] differences: [-3, -3, 1, -2, -2, -2, 1]
[12, 9, 6, 7, 5, 3, 2, 1] differences: [-3, -3, 1, -2, -2, -1, -1]
[12, 9, 7, 5, 2, 1, 3, 6] differences: [-3, -2, -2, -3, -1, 2, 3]
[12, 9, 7, 5, 6, 3, 1, 2] differences: [-3, -2, -2, 1, -3, -2, 1]
[12, 9, 7, 5, 6, 3, 2, 1] differences: [-3, -2, -2, 1, -3, -1, -1]
[12, 9, 7, 6, 3, 1, 2, 5] differences: [-3, -2, -1, -3, -2, 1, 3]
[12, 9, 7, 6, 3, 5, 2, 1] differences: [-3, -2, -1, -3, 2, -3, -1]
[12, 9, 7, 6, 5, 2, 1, 3] differences: [-3, -2, -1, -1, -3, -1, 2]
[12, 9, 7, 6, 5, 2, 3, 1] differences: [-3, -2, -1, -1, -3, 1, -2]
[12, 9, 7, 6, 5, 3, 1, 2] differences: [-3, -2, -1, -1, -2, -2, 1]
[12, 9, 7, 6, 5, 3, 2, 1] differences: [-3, -2, -1, -1, -2, -1, -1]
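One way to sanity-check a recursive generator like this is to compare it with a brute-force filter over itertools.permutations on a small input. This is only a sketch: brute_force is a hypothetical helper, and the shortened list is an assumption made to keep the check fast.

```python
from itertools import permutations

def permuteSort(A, maxDiff, previous=None):
    # Same recursive generator as above, reformatted.
    if not A:
        yield []
        return
    for i, a in enumerate(A):
        if previous is not None and abs(a - previous) > maxDiff:
            continue
        yield from ([a] + p for p in permuteSort(A[:i] + A[i + 1:], maxDiff, a))

def brute_force(A, maxDiff):
    # Enumerate every ordering and keep those whose neighbours stay within maxDiff.
    return [list(p) for p in permutations(A)
            if all(abs(b - a) <= maxDiff for a, b in zip(p, p[1:]))]

data = [1, 2, 3, 5, 6]
same = sorted(permuteSort(data, 3)) == sorted(brute_force(data, 3))
print(same)
```

The recursive version prunes a branch as soon as one step exceeds maxDiff, whereas the brute-force version always visits all len(A)! orderings.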
Try this as a recursion. It should print out all allowed permutations.
x is passed as required_number and n as diff.
def getPermutations(cur_permutation, arr_to_check, required_number, diff):
    if required_number == 0:
        print(cur_permutation)
        return
    for index, arr_item in enumerate(arr_to_check):
        # The first element has no predecessor to compare against.
        if not cur_permutation or arr_item - cur_permutation[-1] <= diff:
            new_copy = cur_permutation[:]
            new_copy.append(arr_item)
            # Recurse on the remaining items; no early return, so every candidate is tried.
            remaining = arr_to_check[:index] + arr_to_check[index + 1:]
            getPermutations(new_copy, remaining, required_number - 1, diff)
(This could also be done by keeping the required length and checking whether cur_permutation has reached that length.)

1d array from columns of an ndarray

This is the array I have at hand:
[array([[[ 4, 9, 1, -3],
[-2, 0, 8, 6],
[ 1, 3, 7, 9 ],
[ 2, 5, 0, -7],
[-1, -6, -5, -8]]]),
array([[[ 0, 2, -1, 6 ],
[9, 8, 0, 3],
[ -1, 2, 5, -4],
[0, 5, 9, 6],
[ 6, 2, 9, 4]]]),
array([[[ 1, 2, 0, 9],
[3, 4, 8, -1],
[5, 6, 9, 0],
[ 7, 8, -3, -],
[9, 0, 8, -2]]])]
But the goal is to obtain array A from the first columns of the nested arrays, B from the second columns, C from the third columns, etc.
Such that:
A = array([4, -2, 1, 2, -1, 0, 9, -1, 0, 6, 1, 3, 5, 7, 9])
B = array([9, 0, 3, 5, -6, 2, 8, 2, 5, 2, 2, 4, 6, 8, 0])
How should I do this?
You can do this with a single hstack() and use squeeze() to remove the extra dimension. With that you can use regular numpy indexing to pull out columns (or anything else you want):
import numpy as np
l = [np.array([[[ 4, 9, 1, -3],
[-2, 0, 8, 6],
[ 1, 3, 7, 9 ],
[ 2, 5, 0, -7],
[-1, -6, -5, -8]]]),
np.array([[[ 0, 2, -1, 6 ],
[9, 8, 0, 3],
[ -1, 2, 5, -4],
[0, 5, 9, 6],
[ 6, 2, 9, 4]]]),
np.array([[[ 1, 2, 0, 9],
[3, 4, 8, -1],
[5, 6, 9, 0],
[ 7, 8, -3, -1],
[9, 0, 8, -2]]])]
arr = np.hstack(l).squeeze()
A = arr[:,0]
print(A)
# [ 4 -2 1 2 -1 0 9 -1 0 6 1 3 5 7 9]
B = arr[:,1]
print(B)
#[ 9 0 3 5 -6 2 8 2 5 2 2 4 6 8 0]
# etc...
IIUC,
l = [np.array([[[ 4, 9, 1, -3],
[-2, 0, 8, 6],
[ 1, 3, 7, 9 ],
[ 2, 5, 0, -7],
[-1, -6, -5, -8]]]),
np.array([[[ 0, 2, -1, 6 ],
[9, 8, 0, 3],
[ -1, 2, 5, -4],
[0, 5, 9, 6],
[ 6, 2, 9, 4]]]),
np.array([[[ 1, 2, 0, 9],
[3, 4, 8, -1],
[5, 6, 9, 0],
[ 7, 8, -3, -9],
[9, 0, 8, -2]]])]
a = np.hstack([i[0][:, 0] for i in l])
b = np.hstack([i[0][:, 1] for i in l])
Output:
array([ 4, -2, 1, 2, -1, 0, 9, -1, 0, 6, 1, 3, 5, 7, 9])
array([ 9, 0, 3, 5, -6, 2, 8, 2, 5, 2, 2, 4, 6, 8, 0])
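If every column is wanted at once, the joined array can be transposed and unpacked in one step. A minimal sketch, using small stand-in arrays of shape (1, 2, 3) instead of the question's data (the stand-in values are an assumption for illustration):

```python
import numpy as np

# Hypothetical stand-in data: three arrays of shape (1, 2, 3).
l = [np.arange(6).reshape(1, 2, 3) + 10 * k for k in range(3)]

# Join along the row axis (axis=1), then drop the leading length-1 axis.
arr = np.concatenate(l, axis=1)[0]   # shape (6, 3)

# Transposing turns each original column into one row, ready to unpack.
A, B, C = arr.T
print(A)
print(B)
```

Unpacking this way requires the number of names on the left to match the number of columns.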

Zero pad array based on other array's shape

I've got K feature vectors that all share dimension n but have a variable dimension m (n x m). They all live in a list together.
to_be_padded = []
to_be_padded.append(np.reshape(np.arange(9),(3,3)))
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
to_be_padded.append(np.reshape(np.arange(18),(3,6)))
array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17]])
to_be_padded.append(np.reshape(np.arange(15),(3,5)))
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
What I am looking for is a smart way to zero pad the rows of these np.arrays such that they all share the same dimension m. I've tried solving it with np.pad but I have not been able to come up with a pretty solution. Any help or nudges in the right direction would be greatly appreciated!
The result should leave the arrays looking like this:
array([[0, 1, 2, 0, 0, 0],
[3, 4, 5, 0, 0, 0],
[6, 7, 8, 0, 0, 0]])
array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17]])
array([[ 0, 1, 2, 3, 4, 0],
[ 5, 6, 7, 8, 9, 0],
[10, 11, 12, 13, 14, 0]])
You could use np.pad for that, which can also pad 2-D arrays using a tuple of values specifying the padding width, ((top, bottom), (left, right)). For that you could define:
def pad_to_length(x, m):
    return np.pad(x, ((0, 0), (0, m - x.shape[1])), mode='constant')
Usage
You could start by finding the ndarray with the largest number of columns. Say you have two of them, a and b:
a = np.array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
b = np.array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
m = max(i.shape[1] for i in [a,b])
# 5
And then use this parameter to pad the ndarrays:
pad_to_length(a, m)
array([[0, 1, 2, 0, 0],
[3, 4, 5, 0, 0],
[6, 7, 8, 0, 0]])
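Applied to the list from the question, the same helper can be mapped over every array. A short sketch under that assumption (the name padded is introduced here for illustration):

```python
import numpy as np

def pad_to_length(x, m):
    # Pad only on the right of axis 1 so every array ends up with m columns.
    return np.pad(x, ((0, 0), (0, m - x.shape[1])), mode='constant')

to_be_padded = [np.arange(9).reshape(3, 3),
                np.arange(18).reshape(3, 6),
                np.arange(15).reshape(3, 5)]

# Widest array in the list decides the target width.
m = max(x.shape[1] for x in to_be_padded)
padded = [pad_to_length(x, m) for x in to_be_padded]
```

np.pad keeps the input dtype, so integer arrays stay integer after padding.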
I believe there is no very efficient solution for this. I think you will need to loop over the list with a for loop and treat every array individually:
for i in range(len(to_be_padded)):
    padded = np.zeros((n, maxM))
    padded[:,:to_be_padded[i].shape[1]] = to_be_padded[i]
    to_be_padded[i] = padded
where n is the shared number of rows and maxM is the largest m among the matrices in your list.
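A variation on the same loop is to fill one preallocated 3-D array instead of replacing the list entries. This is a sketch under the assumption that a single (K, n, maxM) block is what you want downstream; the name stacked is hypothetical.

```python
import numpy as np

to_be_padded = [np.arange(9).reshape(3, 3),
                np.arange(18).reshape(3, 6),
                np.arange(15).reshape(3, 5)]

K = len(to_be_padded)
n = to_be_padded[0].shape[0]
maxM = max(x.shape[1] for x in to_be_padded)

# One zero-filled block; passing the input dtype avoids the conversion
# to float64 that np.zeros would otherwise apply by default.
stacked = np.zeros((K, n, maxM), dtype=to_be_padded[0].dtype)
for k, x in enumerate(to_be_padded):
    stacked[k, :, :x.shape[1]] = x
```

The loop still runs K times, but each iteration is a single slice assignment.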

Adding the previous n rows as columns to a NumPy array

I want to add the previous n rows as columns to a NumPy array.
For example, if n=2, the array below...
[[ 1, 2]
[ 3, 4]
[ 5, 6]
[ 7, 8]
[ 9, 10]
[11, 12]]
...should be turned into the following one:
[[ 1, 2, 0, 0, 0, 0]
[ 3, 4, 1, 2, 0, 0]
[ 5, 6, 3, 4, 1, 2]
[ 7, 8, 5, 6, 3, 4]
[ 9, 10, 7, 8, 5, 6]
[11, 12, 9, 10, 7, 8]]
Any ideas how I could do that without going over the entire array in a for loop?
Here's a vectorized approach -
def vectorized_app(a,n):
    M,N = a.shape
    idx = np.arange(a.shape[0])[:,None] - np.arange(n+1)
    out = a[idx.ravel(),:].reshape(-1,N*(n+1))
    out[N*(np.arange(1,M+1))[:,None] <= np.arange(N*(n+1))] = 0
    return out
Sample run -
In [255]: a
Out[255]:
array([[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, 12],
[13, 14, 15],
[16, 17, 18]])
In [256]: vectorized_app(a,3)
Out[256]:
array([[ 1, 2, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 4, 5, 6, 1, 2, 3, 0, 0, 0, 0, 0, 0],
[ 7, 8, 9, 4, 5, 6, 1, 2, 3, 0, 0, 0],
[10, 11, 12, 7, 8, 9, 4, 5, 6, 1, 2, 3],
[13, 14, 15, 10, 11, 12, 7, 8, 9, 4, 5, 6],
[16, 17, 18, 13, 14, 15, 10, 11, 12, 7, 8, 9]])
Runtime test -
I am timing @Psidom's loop-comprehension based method and the vectorized method listed in this post on a 100x scaled-up version (in terms of size) of the sample posted in the question:
In [246]: a = np.random.randint(0,9,(600,200))
In [247]: n = 200
In [248]: %timeit np.column_stack(mypad(a, i) for i in range(n + 1))
1 loops, best of 3: 748 ms per loop
In [249]: %timeit vectorized_app(a,n)
1 loops, best of 3: 224 ms per loop
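Here is a way to pad zeros at the beginning of the array and then column-stack the shifted copies:
import numpy as np
n = 2
arr = np.arange(1, 13).reshape(6, 2)  # the sample array from the question
def mypad(myArr, n):
    if n == 0:
        return myArr
    else:
        return np.pad(myArr, ((n,0), (0,0)), mode = "constant")[:-n]
# Pass a list: recent NumPy versions no longer accept a bare generator here.
np.column_stack([mypad(arr, i) for i in range(n + 1)])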
# array([[ 1, 2, 0, 0, 0, 0],
# [ 3, 4, 1, 2, 0, 0],
# [ 5, 6, 3, 4, 1, 2],
# [ 7, 8, 5, 6, 3, 4],
# [ 9, 10, 7, 8, 5, 6],
# [11, 12, 9, 10, 7, 8]])
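Another possibility is to preallocate the output and fill each lag with one slice assignment, so the Python loop runs over the n + 1 lags and not over the rows. A minimal sketch; lagged_columns is a hypothetical name, not something from the answers above.

```python
import numpy as np

def lagged_columns(a, n):
    """Return a with its previous n rows appended as extra column blocks."""
    M, N = a.shape
    out = np.zeros((M, N * (n + 1)), dtype=a.dtype)
    # Lag k shifts the rows down by k; lags of M or more have nothing to copy.
    for k in range(min(n, M - 1) + 1):
        out[k:, k * N:(k + 1) * N] = a[:M - k]
    return out

arr = np.arange(1, 13).reshape(6, 2)
result = lagged_columns(arr, 2)
print(result)
```

Rows that have no predecessor keep the zeros from the initial allocation, which matches the padding shown in the question.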

Accessing elements in array Python

This is my first time handling multidimensional arrays and I'm having problems accessing elements. I'm trying to get the red pixels of a picture but just the first 8 elements within the array. Here's the code
from PIL import Image
import numpy as np
im = Image.open(r"C:\Users\Jones\Pictures\1.jpg")
pix = im.load()
r, g, b = np.array(im).T
print(r[0:8])
Since you're dealing with images, r is a 2-D array. To get the first 8 pixels in the image, try
r.flatten()[:8]
This still returns 8 values when the first row has fewer than 8 pixels, because the flattened array continues into the next row.
Do you want all rows too? Try r[:,:8]
Only want the first row? Try r[0,:8]
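To make the axis order concrete, here is a small sketch with a synthetic array standing in for np.array(im). The 4 x 10 size and the values are assumptions for illustration; a real image array from PIL has shape (height, width, 3) for RGB.

```python
import numpy as np

# Hypothetical 4 x 10 RGB image standing in for np.array(im).
img = np.arange(4 * 10 * 3).reshape(4, 10, 3)

red = img[..., 0]              # red channel, shape (4, 10) = (height, width)
first_row = red[0, :8]         # first 8 red values of the top row
first_eight = red.ravel()[:8]  # first 8 red values in row-major order

# With the transpose used in the question, the array has shape (3, width, height),
# so r[0] there is the first image column, not the first image row.
r = img.T[0]                   # shape (10, 4)
```

Indexing the channel with img[..., 0] keeps the usual (row, column) orientation, which avoids that swap.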
You can do it like this:
r[0][:8]
Note, however, that this returns fewer than 8 values if the first row has fewer than 8 pixels. To fix that, do this:
from itertools import chain
r = list(chain.from_iterable(r))
r[:8]
or (if you don't want to import an entire module):
r = [val for element in r for val in element]
r[:8]
I think it could be more simple. This example uses a random matrix (this will be your r matrix):
In [7]: from pylab import * # convention
In [8]: r = randint(0,10,(10,10)) # this is your image
In [9]: r
array([[7, 9, 5, 5, 6, 8, 1, 4, 3, 4],
[5, 4, 4, 4, 2, 6, 2, 6, 4, 2],
[1, 4, 9, 9, 2, 6, 1, 9, 0, 6],
[5, 9, 0, 7, 9, 9, 5, 2, 0, 7],
[8, 3, 3, 9, 0, 0, 5, 9, 2, 2],
[5, 3, 7, 8, 8, 1, 6, 3, 2, 0],
[0, 2, 5, 7, 0, 1, 0, 2, 1, 2],
[4, 0, 4, 5, 9, 9, 3, 8, 3, 7],
[4, 6, 9, 9, 5, 9, 3, 0, 5, 1],
[6, 9, 9, 0, 3, 4, 9, 7, 9, 6]])
Then, extract first 8 columns and do something
In [17]: r_8 = r[:,:8] # extract columns
In [18]: r_8
Out[18]:
array([[7, 9, 5, 5, 6, 8, 1, 4],
[5, 4, 4, 4, 2, 6, 2, 6],
[1, 4, 9, 9, 2, 6, 1, 9],
[5, 9, 0, 7, 9, 9, 5, 2],
[8, 3, 3, 9, 0, 0, 5, 9],
[5, 3, 7, 8, 8, 1, 6, 3],
[0, 2, 5, 7, 0, 1, 0, 2],
[4, 0, 4, 5, 9, 9, 3, 8],
[4, 6, 9, 9, 5, 9, 3, 0],
[6, 9, 9, 0, 3, 4, 9, 7]])
In [19]: r_8 = r_8 * 2 # do something
In [20]: r_8
Out[20]:
array([[14, 18, 10, 10, 12, 16, 2, 8],
[10, 8, 8, 8, 4, 12, 4, 12],
[ 2, 8, 18, 18, 4, 12, 2, 18],
[10, 18, 0, 14, 18, 18, 10, 4],
[16, 6, 6, 18, 0, 0, 10, 18],
[10, 6, 14, 16, 16, 2, 12, 6],
[ 0, 4, 10, 14, 0, 2, 0, 4],
[ 8, 0, 8, 10, 18, 18, 6, 16],
[ 8, 12, 18, 18, 10, 18, 6, 0],
[12, 18, 18, 0, 6, 8, 18, 14]])
Now, this is the trick. Replace the first 8 columns in r using hstack:
In [21]: r = hstack((r_8, r[:,8:])) # it replaces the FIRST 8 columns, note the indexing notation
In [22]: r
Out[22]:
array([[14, 18, 10, 10, 12, 16, 2, 8, 3, 4], # it does not touch the last 2 columns
[10, 8, 8, 8, 4, 12, 4, 12, 4, 2],
[ 2, 8, 18, 18, 4, 12, 2, 18, 0, 6],
[10, 18, 0, 14, 18, 18, 10, 4, 0, 7],
[16, 6, 6, 18, 0, 0, 10, 18, 2, 2],
[10, 6, 14, 16, 16, 2, 12, 6, 2, 0],
[ 0, 4, 10, 14, 0, 2, 0, 4, 1, 2],
[ 8, 0, 8, 10, 18, 18, 6, 16, 3, 7],
[ 8, 12, 18, 18, 10, 18, 6, 0, 5, 1],
[12, 18, 18, 0, 6, 8, 18, 14, 9, 6]])
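The same result can likely be reached without rebuilding the array, because a basic slice such as r[:, :8] is a view that supports in-place assignment. A minimal sketch with a random stand-in matrix (the seed and sizes are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
r = rng.integers(0, 10, size=(10, 10))   # stand-in for the image channel
original = r.copy()

# Modify the first 8 columns in place; the last 2 columns are untouched.
r[:, :8] *= 2
```

This avoids allocating the intermediate r_8 and the stacked copy.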
EDIT: as DSM pointed out, the OP is in fact using a NumPy array.
I retract my answer, as nneonneo's is correct.
