"Multiply" 1d numpy array with a smaller one and sum the result

I want to "multiply" (for lack of a better description) a numpy array X of size M with a smaller numpy array Y of size N, reusing Y for every N elements of X. Then I want to sum the resulting array (almost like a dot product).
I hope the example makes it clearer:
Example
X = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Y = [1,2,3]
Z = mymul(X, Y)
= [0*1, 1*2, 2*3, 3*1, 4*2, 5*3, 6*1, 7*2, 8*3, 9*1]
= [ 0, 2, 6, 3, 8, 15, 6, 14, 24, 9]
result = sum(Z) = 87
X and Y can be of varying lengths and Y is always smaller than X, but not necessarily divisible (e.g. M % N != 0)
I have some solutions but they are quite slow. I'm hoping there is a faster way to do this.
import numpy as np
X = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int)
Y = np.array([1,2,3], dtype=int)
# these work but are slow for large X, Y
# simple for-loop
t = 0
for i in range(len(X)):
    t += X[i] * Y[i % len(Y)]
print(t) #87
# extend Y M/N times so np.dot can be applied
Ytiled = np.tile(Y, int(np.ceil(len(X) / len(Y))))[:len(X)]
t = np.dot(X, Ytiled)
print(t) #87

Resize Y to the same length as X and then take the dot product with np.dot -
In [52]: np.dot(X, np.resize(Y,len(X)))
Out[52]: 87
An alternative to np.resize is tiling: with m, n = len(X), len(Y), np.tile(Y, (m+n-1)//n)[:m] can replace np.resize(Y, len(X)) and may be faster.
Another approach avoids resizing Y altogether, which saves memory -
In [79]: m,n = len(X), len(Y)
In [80]: s = n*(m//n)
In [81]: X2D = X[:s].reshape(-1,n)
In [82]: X2D.dot(Y).sum() + np.dot(X[s:],Y[:m-s])
Out[82]: 87
Alternatively, we can use np.einsum('ij,j->',X2D,Y) to replace X2D.dot(Y).sum().
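The reshape approach above can be wrapped in a small function, roughly as follows. This is only a sketch: the name cyclic_dot is made up for illustration, and it assumes 1-d inputs with len(Y) <= len(X), as in the question.

```python
import numpy as np

def cyclic_dot(X, Y):
    """Sum of X[i] * Y[i % len(Y)] without building a tiled copy of Y.

    cyclic_dot is a hypothetical helper name, not a NumPy function.
    """
    X = np.asarray(X)
    Y = np.asarray(Y)
    m, n = len(X), len(Y)
    s = n * (m // n)                         # length of the part of X that splits into whole blocks
    head = X[:s].reshape(-1, n).dot(Y).sum() # each row of the 2-d view is one block of n elements
    tail = np.dot(X[s:], Y[:m - s])          # leftover elements when m % n != 0
    return head + tail

X = np.arange(10)
Y = np.array([1, 2, 3])
print(cyclic_dot(X, Y))  # 87
```

The reshape is a view on X, so the only temporary is the vector of per-block dot products.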

You can use convolve (documentation):
np.convolve(X, Y[::-1], 'same')[::len(Y)].sum()
Remember to reverse the second array.
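One thing to watch with the convolve approach: with mode 'same' the output is centred on the input, so the positions picked by [::len(Y)] do not generally line up with the blocks of X. For the example it still gives 87, but for other inputs it can differ from the loop. Here is a sketch that uses mode 'full' and starts sampling at index len(Y)-1, where the reversed Y sits exactly over the first block. The function name is made up for illustration.

```python
import numpy as np

def cyclic_dot_convolve(X, Y):
    """Sum of X[i] * Y[i % len(Y)] via np.convolve (hypothetical helper name)."""
    X = np.asarray(X)
    Y = np.asarray(Y)
    n = len(Y)
    full = np.convolve(X, Y[::-1], mode="full")
    # full[n-1 + q*n] equals the dot product of block q of X with Y
    # (the last block may be shorter and is matched against the start of Y)
    return full[n - 1::n].sum()

X = np.arange(10)
Y = np.array([1, 2, 3])
print(cyclic_dot_convolve(X, Y))  # 87
```

Note that this computes every window and then throws most of them away, so it does more work than the dot-product versions.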

Related

Numpy Value error x and y must have same first dimension, but have shapes (10,) and (1,)

I'm getting this ValueError.
x is an array of the 10 digits 0-9.
Each element of x is passed into the for loop and put into the equation.
I don't understand why y and x aren't the same size when the equation has run 10 times.
import numpy as np
import matplotlib.pyplot as plt
x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
a = np.array([2])
b = np.array([-3])
print(f'Scalar check for 0 dimensions a {a.ndim}, b {b.ndim} x {x.ndim}')
for i in x:
    print(i)
    y = i*a + b
plt.plot(x, y)
raise ValueError(f"x and y must have same first dimension, but "
ValueError: x and y must have same first dimension, but have shapes (10,) and (1,)
I thought it might run once I changed a and b to 1-d arrays (before, they were scalars), but that was evidently not what caused the error.

You are overwriting y on every iteration, so in the end you have y = [15].
You can rewrite it as follows:
x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
a = np.array(2)   # note the removed brackets: []
b = np.array(-3)  # same here
y = []
for i in x:
    y.append(i * a + b)
An even simpler approach is:
x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
a = np.array(2)
b = np.array(-3)
y = x * a + b
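To see the difference in shapes directly, here is a quick check (plotting left out) that uses the 1-d a and b from the question. The names y_loop and y_vec are just for illustration.

```python
import numpy as np

x = np.arange(10)
a = np.array([2])
b = np.array([-3])

for i in x:
    y_loop = i * a + b   # rebound on every pass, so only the last value survives

y_vec = x * a + b        # broadcasting evaluates all ten values at once

print(y_loop.shape, y_vec.shape)  # (1,) (10,)
```

Only y_vec has the same first dimension as x, which is what plt.plot(x, y) requires.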

Create numpy matrix from output of function on vector vs vector

I have two vectors / one-dimensional numpy arrays and a function I want to apply:
arr1 = np.arange(1, 5)
arr2 = np.arange(2, 6)
func = lambda x, y: x * y
I now want to construct a n * m matrix (with n, m being the lengths of arr1, and arr2 respectively) containing the values of the function outputs. The naive approach using for loops would look like this:
np.array([[func(x, y) for x in arr1] for y in arr2])
I was wondering if there is a smarter vectorized approach using the arr1[:, None] syntax to apply my function - please note my actual function is significantly more complicated and can't be broken down to simple numpy operations (arr1[:, None] * arr2[None, :] won't work).

With NumPy arrays, one approach is numpy.einsum, because what you want to compute is arr1_i * arr2_j, stored at result_ji.
>>> np.einsum('i, j -> ji', arr1, arr2)
array([[ 2, 4, 6, 8],
[ 3, 6, 9, 12],
[ 4, 8, 12, 16],
[ 5, 10, 15, 20]])
Or you can use numpy.matmul or the @ operator.
>>> np.matmul(arr2[:,None], arr1[None,:])
# OR
>>> arr2[:,None] @ arr1[None,:]
# Or, thanks to @hpaulj, by elementwise multiplication with broadcasting
>>> arr2[:,None] * arr1[None,:]
array([[ 2, 4, 6, 8],
[ 3, 6, 9, 12],
[ 4, 8, 12, 16],
[ 5, 10, 15, 20]])
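If the real function only accepts scalars and cannot be written with NumPy operations, the same broadcasting layout can still be used through np.vectorize. This is a sketch: func below is a made-up stand-in for the more complicated function, and np.vectorize is a convenience wrapper around a Python-level loop, so do not expect the speed of a truly vectorized expression.

```python
import numpy as np

def func(x, y):
    # hypothetical scalar-only function standing in for the real one
    return x * y if x < y else x + y

arr1 = np.arange(1, 5)
arr2 = np.arange(2, 6)

vfunc = np.vectorize(func)
# arr1 varies along columns, arr2 along rows, matching the nested-loop layout
result = vfunc(arr1[None, :], arr2[:, None])

print(result.shape)  # (4, 4)
```

Here result[j, i] holds func(arr1[i], arr2[j]), the same layout as the list-comprehension version in the question.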

Here is a comparison between your loop approach and @I'mahdi's approach:
import time
arr1 = np.arange(1, 10000)
arr2 = np.arange(2, 10001)
start = time.time()
np.array([[func(x, y) for x in arr1] for y in arr2])
print('loop: __time__', time.time()-start)
start = time.time()
(arr1[:, None]*arr2[None, :]).T
print('* __time__', time.time()-start)
start = time.time()
np.einsum('i, j -> ji', arr1, arr2)
print('einsum __time__', time.time()-start)
start = time.time()
np.matmul(arr2[:,None], arr1[None,:])
print('matmul __time__', time.time()-start)
Output:
loop: __time__ 70.3061535358429
* __time__ 0.43536829948425293
einsum __time__ 0.508014440536499
matmul __time__ 0.7149899005889893

Two dimensional function not returning array of values?

I'm trying to plot a 2-dimensional function (specifically, a 2-d Laplace solution). I defined my function and it returns the right value when I put in specific numbers, but when I try running through an array of values (x,y below), it still returns only one number. I tried with a random function of x and y (e.g., f(x,y) = x^2 + y^2) and it gives me an array of values.
def V_func(x,y):
    a = 5
    b = 4
    Vo = 4
    n = np.arange(1,100,2)
    sum_list = []
    for indx in range(len(n)):
        sum_term = (1/n[indx])*(np.cosh(n[indx]*np.pi*x/a))/(np.cosh(n[indx]*np.pi*b/a))*np.sin(n[indx]*np.pi*y/a)
        sum_list = np.append(sum_list,sum_term)
    summation = np.sum(sum_list)
    V = 4*Vo/np.pi * summation
    return V
x = np.linspace(-4,4,50)
y = np.linspace(0,5,50)
V_func(x,y)
Out: 53.633709914177224

Try this:
def V_func(x,y):
    a = 5
    b = 4
    Vo = 4
    n = np.arange(1,100,2)
    # sum_list = []
    sum_list = np.zeros(50)
    for indx in range(len(n)):
        sum_term = (1/n[indx])*(np.cosh(n[indx]*np.pi*x/a))/(np.cosh(n[indx]*np.pi*b/a))*np.sin(n[indx]*np.pi*y/a)
        # sum_list = np.append(sum_list,sum_term)
        sum_list += sum_term
    # summation = np.sum(sum_list)
    # V = 4*Vo/np.pi * summation
    V = 4*Vo/np.pi * sum_list
    return V

Define a pair of arrays:
In [6]: x = np.arange(3); y = np.arange(10,13)
In [7]: x,y
Out[7]: (array([0, 1, 2]), array([10, 11, 12]))
Try a simple function of the two:
In [8]: x + y
Out[8]: array([10, 12, 14])
Since they have the same size, they can be summed (or otherwise combined) elementwise. The result has the same shape as the 2 inputs.
Now try 'broadcasting'. x[:,None] has shape (3,1)
In [9]: x[:,None] + y
Out[9]:
array([[10, 11, 12],
[11, 12, 13],
[12, 13, 14]])
The result is (3,3), the first 3 from the reshaped x, the second from y.
I can generate the pair of arrays with meshgrid:
In [10]: I,J = np.meshgrid(x,y,sparse=True, indexing='ij')
In [11]: I
Out[11]:
array([[0],
[1],
[2]])
In [12]: J
Out[12]: array([[10, 11, 12]])
In [13]: I + J
Out[13]:
array([[10, 11, 12],
[11, 12, 13],
[12, 13, 14]])
Note the added parameters in meshgrid. So that's how we go about generating 2d values from a pair of 1d arrays.
Now look at what sum does. As you use it in the function:
In [14]: np.sum(I + J)
Out[14]: 108
the result is a scalar. See the docs. If I specify an axis I get an array.
In [15]: np.sum(I + J, axis=0)
Out[15]: array([33, 36, 39])
If you gave V_func the right x and y, sum_list could be a 3d array. That axis-less sum reduces it to a scalar.
In code like this you need to keep track of array shapes. Include test prints if needed; don't just assume anything; test it. Pay attention to how dimensions grow and shrink as they pass through various operations.
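Putting the broadcasting and axis-sum points together, a grid version of the function from the question might look like the sketch below. The name V_grid, the keyword defaults, and the choice to put n on the first axis are my own assumptions, not something from the question.

```python
import numpy as np

def V_grid(x, y, a=5, b=4, Vo=4, n_stop=100):
    """Evaluate the series on the full grid; returns shape (len(x), len(y))."""
    n = np.arange(1, n_stop, 2)[:, None, None]   # odd n on axis 0
    xx = np.asarray(x)[None, :, None]            # x on axis 1
    yy = np.asarray(y)[None, None, :]            # y on axis 2
    terms = ((1 / n) * np.cosh(n * np.pi * xx / a) / np.cosh(n * np.pi * b / a)
             * np.sin(n * np.pi * yy / a))       # shape (len(n), len(x), len(y))
    return 4 * Vo / np.pi * terms.sum(axis=0)    # sum over n only, keep the x/y grid

x = np.linspace(-4, 4, 50)
y = np.linspace(0, 5, 50)
V = V_grid(x, y)
print(V.shape)  # (50, 50)
```

Each V[i, j] should match what the original V_func returns for the scalar pair x[i], y[j], which is an easy spot check to run.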

Summing and removing repeated elements of Numpy Arrays

I have 4 1D Numpy arrays of equal length.
The first three act as an ID, uniquely identifying the 4th array.
The ID arrays contain repeated combinations, for which I need to sum the 4th array, and remove the repeating element from all 4 arrays.
x = np.array([1, 2, 4, 1])
y = np.array([1, 1, 4, 1])
z = np.array([1, 2, 2, 1])
data = np.array([4, 7, 3, 2])
In this case I need:
x = [1, 2, 4]
y = [1, 1, 4]
z = [1, 2, 2]
data = [6, 7, 3]
The arrays are rather long so loops really won't work. I'm sure there is a fairly simple way to do this, but for the life of me I can't figure it out.

To get started, we can stack the ID vectors into a matrix such that each ID is a row of three values:
XYZ = np.vstack((x,y,z)).T
Now, we just need to find the indices of repeated rows. np.unique did not operate on rows when this was written (newer NumPy versions accept an axis argument), so we need to do some tricks:
order = np.lexsort(XYZ.T)
diff = np.diff(XYZ[order], axis=0)
uniq_mask = np.append(True, (diff != 0).any(axis=1))
This part is borrowed from the np.unique source code, and finds the unique indices as well as the "inverse index" mapping:
uniq_inds = order[uniq_mask]
inv_idx = np.zeros_like(order)
inv_idx[order] = np.cumsum(uniq_mask) - 1
Finally, sum over the unique indices:
data = np.bincount(inv_idx, weights=data)
x,y,z = XYZ[uniq_inds].T
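For reference, on NumPy 1.13 or later the row-wise uniqueness step can be handed to np.unique directly. A sketch, assuming the same four arrays as in the question; the ravel() on the inverse index is there because bincount needs a 1-d index array.

```python
import numpy as np

x = np.array([1, 2, 4, 1])
y = np.array([1, 1, 4, 1])
z = np.array([1, 2, 2, 1])
data = np.array([4, 7, 3, 2])

ids = np.stack((x, y, z), axis=1)                     # one (x, y, z) ID per row
uniq, inv = np.unique(ids, axis=0, return_inverse=True)
summed = np.bincount(inv.ravel(), weights=data)       # sum data within each unique ID
x_u, y_u, z_u = uniq.T

print(x_u, y_u, z_u, summed)
```

The unique rows come back sorted, so the output order may differ from the order of first appearance; bincount with weights returns floats.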

You can use unique and sum as reptilicus suggested to do the following
import numpy as np
x = np.array([1, 2, 4, 1])
y = np.array([1, 1, 4, 1])
z = np.array([1, 2, 2, 1])
data = np.array([4, 7, 3, 2])
# N = len(x)
# ids = x + y*N + z*(N**2)
ids = np.array([hash((a, b, c)) for a, b, c in zip(x, y, z)]) # creates flat ids
_, idx, idx_rep = np.unique(ids, return_index=True, return_inverse=True)
x_out = x[idx]
y_out = y[idx]
z_out = z[idx]
# data_out = np.array([np.sum(data[idx_rep == i]) for i in idx])
data_out = np.bincount(idx_rep, weights=data)
print(x_out)
print(y_out)
print(z_out)
print(data_out)

Euclidean distances between several images and one base image

I have a matrix X of dimensions (30x8100) and another one Y of dimensions (1x8100). I want to generate an array containing the difference between them (X[1]-Y, X[2]-Y,..., X[30]-Y)
Can anyone help?

All you need for that is
X - Y
Since several people have offered answers that seem to try to make the shapes match manually, I should explain:
Numpy will automatically expand Y's shape so that it matches that of X. This is called broadcasting. Shapes are compared dimension by dimension, starting from the last one, and any dimension of length 1 is stretched to match the other array. If the shapes do not line up the way you intend, you can insert a length-1 axis yourself (for example with None or np.newaxis) to say which direction to expand. Here, since Y has a dimension of length 1, that is the axis that is expanded to length 30 to match X's shape.
For example,
In [87]: import numpy as np
In [88]: n, m = 3, 5
In [89]: x = np.arange(n*m).reshape(n,m)
In [90]: y = np.arange(m)[None,...]
In [91]: x.shape
Out[91]: (3, 5)
In [92]: y.shape
Out[92]: (1, 5)
In [93]: (x-y).shape
Out[93]: (3, 5)
In [106]: x
Out[106]:
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
In [107]: y
Out[107]: array([[0, 1, 2, 3, 4]])
In [108]: x-y
Out[108]:
array([[ 0, 0, 0, 0, 0],
[ 5, 5, 5, 5, 5],
[10, 10, 10, 10, 10]])
But this is not really a euclidean distance, as your title seems to suggest you want:
df = np.asarray(x - y) # the difference between the images
dst = np.sqrt(np.sum(df**2, axis=1)) # their euclidean distances
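With the shapes from the question, the whole thing can be checked like this. The random data is only a stand-in for the real images, and np.linalg.norm is included as a cross-check of the hand-written formula.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((30, 8100))   # stand-in for the 30 images
Y = rng.random((1, 8100))    # stand-in for the base image

diff = X - Y                                # Y is broadcast across the 30 rows
dist = np.sqrt(np.sum(diff**2, axis=1))     # one euclidean distance per image
dist_norm = np.linalg.norm(diff, axis=1)    # same quantity via the norm helper

print(diff.shape, dist.shape)  # (30, 8100) (30,)
```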

Use arrays and numpy broadcasting to subtract Y from the matrix.
Initialize the matrix:
>>> from numpy import *
>>> a = array([[1,2,3],[4,5,6]])
Accessing the second row in a:
>>> a[1]
array([4, 5, 6])
Subtract Y from the array:
>>> Y = array([3,9,0])
>>> a - Y
array([[-2, -7, 3],
[ 1, -4, 6]])

Just iterate rows from your numpy array and you can actually just subtract them and numpy will make a new array with the differences!
import numpy as np
final_array = []
#X is a numpy array that is 30X8100 and Y is a numpy array that is 1X8100
for row in X:
    output = row - Y
    final_array.append(output)
On each pass, output is X[i] - Y for the current row, so final_array ends up as a list of 30 arrays, each holding the differences you need. Simple as that. Just make sure you convert your matrices to numpy arrays first.
Edit: Since numpy broadcasting will do the iteration, all you need is one line once you have your two arrays:
final_array = X - Y
And then that is your array with the differences!

import numpy
a1 = numpy.array(X) # make sure you have a 2-d numpy array like [[1,2,3],[4,5,6],...]
a2 = numpy.array(Y).ravel() # make sure you have a 1-d numpy array like [1,2,3,...]
a2 = numpy.tile(a2, (len(a1), 1)) # repeat a2 once per row of a1 (a2 is now the same shape as a1)
print(a1 - a2) # elementwise difference between a1 and a2 (or X and Y)
