Every Nth Element In Convolution - python

Is there a faster way to do np.convolve(A, B)[::N] using Numpy? It feels wasteful to compute all the convolutions and then throw N - 1 of N away... I could do a for loop or list comprehension, but I thought it would be faster to use only native Numpy methods.
EDIT
Or does Numpy do lazy evaluation? I just saw this from a JS library, would be awesome for Numpy as well:
// Get first 3 unique values
const arr = [1, 2, 2, 3, 3, 4, 5, 6];
const result = R.pipe(
  arr,
  R.map(x => {
    console.log('iterate', x);
    return x;
  }),
  R.uniq(),
  R.take(3)
); // => [1, 2, 3]
/**
 * Console output:
 * iterate 1
 * iterate 2
 * iterate 2
 * iterate 3
 */

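A convolution slides the (flipped) kernel across your array; each output value is the sum of the element-wise product of the kernel and one window of the array. You can compute only the windows you need by building a rolling window view yourself.
First, a dummy example:
import numpy as np
A = np.arange(30)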
B = np.ones(6)
N = 3
out = np.convolve(A, B)[::N]
print(out)
output: [ 0. 6. 21. 39. 57. 75. 93. 111. 129. 147. 135. 57.]
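Now we do the same with a rolling view, padding, and slicing. The kernel is reversed (B[::-1]) because np.convolve flips it; with a symmetric kernel such as np.ones(6) the flip makes no difference, but for a general kernel it is required:
from numpy.lib.stride_tricks import sliding_window_view as swv
out = (swv(np.pad(A, B.shape[0]-1), B.shape[0])[::N]*B[::-1]).sum(axis=1)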
print(out)
output: [ 0. 6. 21. 39. 57. 75. 93. 111. 129. 147. 135. 57.]
Intermediate sliding view:
swv(np.pad(A, B.shape[0]-1), B.shape[0])
array([[ 0, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 0, 1],
[ 0, 0, 0, 0, 1, 2],
[ 0, 0, 0, 1, 2, 3],
[ 0, 0, 1, 2, 3, 4],
[ 0, 1, 2, 3, 4, 5],
[ 1, 2, 3, 4, 5, 6],
[ 2, 3, 4, 5, 6, 7],
...
[24, 25, 26, 27, 28, 29],
[25, 26, 27, 28, 29, 0],
[26, 27, 28, 29, 0, 0],
[27, 28, 29, 0, 0, 0],
[28, 29, 0, 0, 0, 0],
[29, 0, 0, 0, 0, 0]])
# with slicing
swv(np.pad(A, B.shape[0]-1), B.shape[0])[::N]
array([[ 0, 0, 0, 0, 0, 0],
[ 0, 0, 0, 1, 2, 3],
[ 1, 2, 3, 4, 5, 6],
[ 4, 5, 6, 7, 8, 9],
[ 7, 8, 9, 10, 11, 12],
[10, 11, 12, 13, 14, 15],
[13, 14, 15, 16, 17, 18],
[16, 17, 18, 19, 20, 21],
[19, 20, 21, 22, 23, 24],
[22, 23, 24, 25, 26, 27],
[25, 26, 27, 28, 29, 0],
[28, 29, 0, 0, 0, 0]])
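As a quick check of the approach above, here is a minimal sketch that wraps it in a helper and compares it against np.convolve using a non-symmetric kernel. The name convolve_every_nth is hypothetical, and sliding_window_view assumes NumPy 1.20 or newer. The kernel is reversed because np.convolve flips it.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def convolve_every_nth(a, b, n):
    """Sketch: same values as np.convolve(a, b)[::n], computing only kept outputs."""
    k = len(b)
    # Zero-pad both ends so every output of the 'full' convolution has a window.
    padded = np.pad(a, k - 1)
    # Rolling windows are a view; [::n] keeps every n-th window before any math.
    windows = sliding_window_view(padded, k)[::n]
    # Convolution multiplies each window by the reversed kernel and sums.
    return windows @ b[::-1]


rng = np.random.default_rng(0)
a = rng.normal(size=30)
b = rng.normal(size=6)  # non-symmetric, so the flip matters

result = convolve_every_nth(a, b, 3)
expected = np.convolve(a, b)[::3]
print(np.allclose(result, expected))
```

Whether this beats the plain np.convolve(A, B)[::N] depends on the sizes of A, B, and N, so it is worth timing on your own data.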

Related

How to efficiently multiply every element in a 2-dimensional array by a 1-dimensional array in Numpy?

I would like to efficiently multiply every element in a 2D array with a 1D array using numpy, so that a 3D array is returned.
Basically, the code should do something like:
import numpy as np
#create dummy data
arr1=np.arange(0,9).reshape((3,3))
arr2=np.arange(0,9)
#create output container
out = []
#loop over every increment in arr1
for col in arr1:
row = []
for i in col:
#perform calculation
row.append(i*arr2)
out.append(row)
#convert output to array
out = np.array(out)
With out having the shape (3, 3, 9) and thus amounting to
array([[[ 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 1, 2, 3, 4, 5, 6, 7, 8],
[ 0, 2, 4, 6, 8, 10, 12, 14, 16]],
[[ 0, 3, 6, 9, 12, 15, 18, 21, 24],
[ 0, 4, 8, 12, 16, 20, 24, 28, 32],
[ 0, 5, 10, 15, 20, 25, 30, 35, 40]],
[[ 0, 6, 12, 18, 24, 30, 36, 42, 48],
[ 0, 7, 14, 21, 28, 35, 42, 49, 56],
[ 0, 8, 16, 24, 32, 40, 48, 56, 64]]])
Thank you very much in advance!
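Use numpy.outer:
np.outer(arr1, arr2).reshape(3, 3, 9)
Note that arr1 comes first: np.outer flattens its inputs, so row i of the (9, 9) result is arr1.ravel()[i] * arr2. With this dummy data the reversed order happens to give the same numbers, because arr1.ravel() equals arr2, but in general it would not.
This gives: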
array([[[ 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 1, 2, 3, 4, 5, 6, 7, 8],
[ 0, 2, 4, 6, 8, 10, 12, 14, 16]],
[[ 0, 3, 6, 9, 12, 15, 18, 21, 24],
[ 0, 4, 8, 12, 16, 20, 24, 28, 32],
[ 0, 5, 10, 15, 20, 25, 30, 35, 40]],
[[ 0, 6, 12, 18, 24, 30, 36, 42, 48],
[ 0, 7, 14, 21, 28, 35, 42, 49, 56],
[ 0, 8, 16, 24, 32, 40, 48, 56, 64]]])
As an alternative to np.outer, as in @makis' answer, you can use np.einsum directly:
out_einsum = np.einsum('i,jk->jki', arr2, arr1)
This avoids the reshape and gives the same result:
array([[[ 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 1, 2, 3, 4, 5, 6, 7, 8],
[ 0, 2, 4, 6, 8, 10, 12, 14, 16]],
[[ 0, 3, 6, 9, 12, 15, 18, 21, 24],
[ 0, 4, 8, 12, 16, 20, 24, 28, 32],
[ 0, 5, 10, 15, 20, 25, 30, 35, 40]],
[[ 0, 6, 12, 18, 24, 30, 36, 42, 48],
[ 0, 7, 14, 21, 28, 35, 42, 49, 56],
[ 0, 8, 16, 24, 32, 40, 48, 56, 64]]])
The disadvantage is that the subscript string is less intuitive if you are not used to einsum, but it is worth a try.
Plain broadcasting also works:
(arr1.reshape(-1, 1) * arr2.reshape(1, -1)).reshape(3, 3, 9)
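For comparison, here is a minimal sketch of the same product using a trailing np.newaxis, which needs no final reshape. The input sizes here are deliberately different from the question's (an assumption for illustration), so that a swapped operand order could not go unnoticed.

```python
import numpy as np

# Different sizes on purpose: arr1 is (2, 3), arr2 has 4 elements.
arr1 = np.arange(6).reshape(2, 3)
arr2 = np.array([10, 20, 30, 40])

# Append a length-1 axis to arr1: (2, 3, 1) * (4,) broadcasts to (2, 3, 4).
out = arr1[..., np.newaxis] * arr2

# Reference result from the explicit double loop in the question.
expected = np.array([[i * arr2 for i in row] for row in arr1])

print(out.shape)                      # (2, 3, 4)
print(np.array_equal(out, expected))  # True
```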

How to multiply numpy 1D with N-D array?

I have a numpy array A:
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]],
[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]]])
And the other array B:
array([0, 1])
How can I get the following result by multiplying A and B?
array([[[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0]],
[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]]])
Thank you very much.
You need to reshape the second ndarray so both arrays have the same number of dimensions (here arr1 is A and arr2 is B):
arr1 * arr2[:, None, None]
or
arr1 * arr2.reshape(2, 1, -1)
arr1.shape
# (2, 3, 4)
arr2[:, None, None].shape
# (2, 1, 1)
arr2.reshape(2, 1, -1).shape
# (2, 1, 1)
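The same idea can be sketched for an array with any number of dimensions by building the reshape target from A.ndim. The helper name scale_along_first_axis is hypothetical, and A and B are rebuilt from the question's data.

```python
import numpy as np

A = np.arange(24).reshape(2, 3, 4)
B = np.array([0, 1])


def scale_along_first_axis(a, b):
    """Multiply a[i] by b[i] for every index i along axis 0 (illustrative helper)."""
    # Reshape b to (len(b), 1, 1, ...) so it lines up with a's first axis.
    return a * b.reshape((-1,) + (1,) * (a.ndim - 1))


result = scale_along_first_axis(A, B)
print(result.shape)                     # (2, 3, 4)
print(np.array_equal(result[1], A[1]))  # True: second block is unchanged
```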

How to reshape Numpy array with padded 0's

I have a Numpy array that looks like
array([1, 2, 3, 4, 5, 6, 7, 8])
and I want to reshape it to an array
array([[5, 0, 0, 6],
[0, 1, 2, 0],
[0, 3, 4, 0],
[7, 0, 0, 8]])
More specifically, I'm trying to reshape a 2D numpy array to get a 3D Numpy array to go from
array([[ 1, 2, 3, 4, 5, 6, 7, 8],
[ 9, 10, 11, 12, 13, 14, 15, 16],
[17, 18, 19, 20, 21, 22, 23, 24],
...
[ 9, 10, 11, 12, 13, 14, 15, 16],
[89, 90, 91, 92, 93, 94, 95, 96]])
to a numpy array that looks like
array([[[ 5, 0, 0, 6],
[ 0, 1, 2, 0],
[ 0, 3, 4, 0],
[ 7, 0, 0, 8]],
[[13, 0, 0, 14],
[ 0, 9, 10, 0],
[ 0, 11, 12, 0],
[15, 0, 0, 16]],
...
[[93, 0, 0, 94],
[ 0, 89, 90, 0],
[ 0, 91, 92, 0],
[95, 0, 0, 96]]])
Is there an efficient way to do this using numpy functionality, particularly vectorized?
We can make use of slicing -
import numpy as np

def expand(a):  # a is a 2D array with 8 columns
    out = np.zeros((len(a), 4, 4), dtype=a.dtype)
    # first four values of each row fill the central 2x2 block
    out[:, 1:3, 1:3] = a[:, :4].reshape(-1, 2, 2)
    # last four values of each row fill the four corners
    out[:, ::3, ::3] = a[:, 4:].reshape(-1, 2, 2)
    return out
The benefit is memory efficiency, and hence performance: only the output array occupies new memory, because the slices on the input and the output are views rather than copies.
Sample run -
2D input :
In [223]: a
Out[223]:
array([[ 1, 2, 3, 4, 5, 6, 7, 8],
[ 9, 10, 11, 12, 13, 14, 15, 16]])
In [224]: expand(a)
Out[224]:
array([[[ 5, 0, 0, 6],
[ 0, 1, 2, 0],
[ 0, 3, 4, 0],
[ 7, 0, 0, 8]],
[[13, 0, 0, 14],
[ 0, 9, 10, 0],
[ 0, 11, 12, 0],
[15, 0, 0, 16]]])
1D input (feed in 2D extended input with None) :
In [225]: a = np.array([1, 2, 3, 4, 5, 6, 7, 8])
In [226]: expand(a[None])
Out[226]:
array([[[5, 0, 0, 6],
[0, 1, 2, 0],
[0, 3, 4, 0],
[7, 0, 0, 8]]])
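To see why those two slices hit the right cells, here is a minimal sketch on a single 4x4 block (the variable name block is only for illustration). In a 4x4 array, 1:3 selects the middle two rows or columns, and ::3 selects indices 0 and 3, that is, the corners.

```python
import numpy as np

block = np.zeros((4, 4), dtype=int)

# Rows 1-2 and columns 1-2: the central 2x2 square.
block[1:3, 1:3] = np.array([1, 2, 3, 4]).reshape(2, 2)

# Step 3 picks indices 0 and 3 on each axis: the four corners.
block[::3, ::3] = np.array([5, 6, 7, 8]).reshape(2, 2)

print(block)
```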

Trying to understand what is happening in this Python Function

def closest_centroid(points, centroids):
    """returns an array containing the index to the nearest centroid for each point"""
    distances = np.sqrt(((points - centroids[:, np.newaxis])**2).sum(axis=2))
    return np.argmin(distances, axis=0)
Can someone explain exactly how this function works? I currently have points that look like:
31998888119 0.94 34
23423423422 0.45 43
....
And so on. In this numpy array, points[1] would be the long ID while points[2] is 0.94 and points[3] would be 34 for their first entry.
Centroids is just a random selection from this particular array:
def initialize_centroids(points, k):
    """returns k centroids from the initial points"""
    centroids = points.copy()
    np.random.shuffle(centroids)
    return centroids[:k]
Now I want to get the Euclidean distance between the values of points and centroids, ignoring the first column of IDs in both. I don't exactly understand the syntax of the line distances = np.sqrt(((points - centroids[:, np.newaxis])**2).sum(axis=2)). Why exactly are we summing across the third column, while there is a declaration of a new axis, np.newaxis? Also, along which axis am I supposed to make np.argmin work?
It helps to think about the dimensions. Let's assume that k=4 and there are 10 points, so points.shape = (10,3).
Next, centroids = initialize_centroids(points, 4) returns an object with dimension (4,3).
Let's break up this line from the inside:
distances = np.sqrt(((points - centroids[:, np.newaxis])**2).sum(axis=2))
We want to subtract each centroid from each point. points has shape (10, 3) and centroids has shape (4, 3), so they cannot be subtracted directly. With only one centroid, points - centroid would simply broadcast to shape (10, 3). But we have 4 centroids, so we need points - centroid for each centroid, and that needs another dimension to store the results. Hence the np.newaxis: centroids[:, np.newaxis] has shape (4, 1, 3), and subtracting it from points broadcasts to shape (4, 10, 3).
We square the differences because the Euclidean distance is the square root of the sum of squared coordinate differences; squaring also makes every term non-negative.
sum(axis=2) does not sum the third column of points. It sums over the last axis of the (4, 10, 3) array, that is, over the squared coordinate differences of each (centroid, point) pair, leaving a (4, 10) array.
np.argmin(distances, axis=0) then works along the centroid axis: for each point, it returns the index of the centroid with the smallest distance (hence argmin instead of min). That index is the centroid assigned to that point.
Here is an example:
points = np.array([
[ 1, 2, 4],
[ 1, 1, 3],
[ 1, 6, 2],
[ 6, 2, 3],
[ 7, 2, 3],
[ 1, 9, 6],
[ 6, 9, 1],
[ 3, 8, 6],
[ 10, 9, 6],
[ 0, 2, 0],
])
centroids = initialize_centroids(points, 4)
print(centroids)
array([[10, 9, 6],
[ 3, 8, 6],
[ 6, 2, 3],
[ 1, 1, 3]])
distances = (points - centroids[:, np.newaxis])**2
print(distances)
array([[[ 81, 49, 4],
[ 81, 64, 9],
[ 81, 9, 16],
[ 16, 49, 9],
[ 9, 49, 9],
[ 81, 0, 0],
[ 16, 0, 25],
[ 49, 1, 0],
[ 0, 0, 0],
[100, 49, 36]],
[[ 4, 36, 4],
[ 4, 49, 9],
[ 4, 4, 16],
[ 9, 36, 9],
[ 16, 36, 9],
[ 4, 1, 0],
[ 9, 1, 25],
[ 0, 0, 0],
[ 49, 1, 0],
[ 9, 36, 36]],
[[ 25, 0, 1],
[ 25, 1, 0],
[ 25, 16, 1],
[ 0, 0, 0],
[ 1, 0, 0],
[ 25, 49, 9],
[ 0, 49, 4],
[ 9, 36, 9],
[ 16, 49, 9],
[ 36, 0, 9]],
[[ 0, 1, 1],
[ 0, 0, 0],
[ 0, 25, 1],
[ 25, 1, 0],
[ 36, 1, 0],
[ 0, 64, 9],
[ 25, 64, 4],
[ 4, 49, 9],
[ 81, 64, 9],
[ 1, 1, 9]]])
print(distances.sum(axis=2))
array([[134, 154, 106, 74, 67, 81, 41, 50, 0, 185],
[ 44, 62, 24, 54, 61, 5, 35, 0, 50, 81],
[ 26, 26, 42, 0, 1, 83, 53, 54, 74, 45],
[ 2, 0, 26, 26, 37, 73, 93, 62, 154, 11]])
# Each column holds one point's squared distances to the 4 centroids. In the first column the minimum (2) is at index 3; in the second column the minimum (0) is at index 3 again.
print(np.argmin(distances.sum(axis=2), axis=0))
array([3, 3, 1, 2, 2, 1, 1, 1, 0, 3])
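The shapes at each step can also be checked directly. The following is a minimal sketch with a smaller, fixed set of points and two hand-picked centroids (chosen here only for illustration, instead of a random shuffle) so that the result is reproducible.

```python
import numpy as np

points = np.array([[1, 2, 4],
                   [1, 1, 3],
                   [6, 2, 3],
                   [10, 9, 6]])
centroids = points[[3, 0]]                   # two centroids, picked by hand

expanded = centroids[:, np.newaxis]          # (2, 1, 3): one slot per centroid
diff = points - expanded                     # (2, 4, 3): centroid x point x coordinate
distances = np.sqrt((diff**2).sum(axis=2))   # (2, 4): sum over the coordinate axis
nearest = np.argmin(distances, axis=0)       # (4,): best centroid index per point

print(expanded.shape, diff.shape, distances.shape, nearest.shape)
print(nearest)
```

Here axis=2 is the coordinate axis and axis=0 is the centroid axis, which is why the sum uses the former and argmin the latter.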

Mapping an array into another with zeros at the beginning and the end

I have a numpy array
a = np.arange(30).reshape(5,6)
and I want to map it into
b = np.zeros((a.shape[0],a.shape[1]+2))
but leaving the first and last columns as zeros
i.e.
b =
array [[0, 0, 1, 2, 3, 4, 5, 0],
. . .
[0, 24, 25, 26, 27, 28, 29, 0]])
Thanks
a = np.arange(30).reshape(5, 6)
b = np.zeros((a.shape[0], a.shape[1]+2), dtype=a.dtype)
b[:, 1:-1] = a
>>> b
array([[ 0, 0, 1, 2, 3, 4, 5, 0],
[ 0, 6, 7, 8, 9, 10, 11, 0],
[ 0, 12, 13, 14, 15, 16, 17, 0],
[ 0, 18, 19, 20, 21, 22, 23, 0],
[ 0, 24, 25, 26, 27, 28, 29, 0]])
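As an alternative sketch, np.pad can produce the same array in one call. The pad widths are given per axis as (before, after), so ((0, 0), (1, 1)) adds nothing to the rows and one zero column on each side; the default mode pads with zeros and keeps the dtype of a.

```python
import numpy as np

a = np.arange(30).reshape(5, 6)

# No padding on axis 0; one zero column before and after on axis 1.
b = np.pad(a, ((0, 0), (1, 1)))

print(b.shape)  # (5, 8)
print(b[0])
```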
