How do I reduce the use of for loops using numpy? - python

Basically, I have three arrays that I multiply with values from 0 to 2, expanding the number of rows to the number of scale values (the values to be multiplied are the same for each array). From there, I want to calculate the sum of every combination of rows from all three arrays. So I have three arrays
A = np.array([1, 2, 3])
B = np.array([1, 2, 3])
C = np.array([1, 2, 3])
and I'm trying to reduce the operation given below
search_range = np.linspace(0, 2, 11)
results = np.array([[0, 0, 0]])
for i in search_range:
    for j in search_range:
        for k in search_range:
            sm = i*A + j*B + k*C
            results = np.append(results, [sm], axis=0)
What I tried doing:
A = np.array([[1, 2, 3]])
B = np.array([[1, 2, 3]])
C = np.array([[1, 2, 3]])
n = 11
scale = np.linspace(0, 2, n).reshape(-1, 1)
A = np.repeat(A, n, axis=0) * scale
B = np.repeat(B, n, axis=0) * scale
C = np.repeat(C, n, axis=0) * scale
results = np.array([[0, 0, 0]])
for i in range(n):
    A_i = A[i]
    for j in range(n):
        B_j = B[j]
        C_k = C
        sm = A_i + B_j + C_k
        results = np.append(results, sm, axis=0)
which only removes the last for loop. How do I reduce the other for loops?

You can get the same result like this:
search_range = np.linspace(0, 2, 11)
search_range = np.array(np.meshgrid(search_range, search_range, search_range))
search_range = search_range.T.reshape(-1, 3)
sm = search_range[:, 0, None]*A + search_range[:, 1, None]*B + search_range[:, 2, None]*C
results = np.concatenate(([[0, 0, 0]], sm))
Instead of using three nested loops to get every combination of elements in search_range, I used np.meshgrid to build a 2D array with one row per combination; the three columns of that array then play the roles of i, j and k.
Finally, as suggested by @Mercury, you can index the new search_range array to generate the result. For example, search_range[:, 1, None] is an array of shape (1331, 1) holding the element at index 1 of every row of search_range; the trailing None keeps it as a column so that it broadcasts against B. The concatenate is only there because you wanted the results array to start with the row [[0, 0, 0]], so I prepended that row to sm; otherwise, sm already contains the answer.
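One caveat about ordering: with meshgrid's default indexing='xy', the rows come out in a different order from the nested loops. That goes unnoticed in the question only because A, B and C are identical. A minimal sketch, assuming you want row-for-row agreement with the loops for arbitrary A, B and C, is to pass indexing='ij' (the names coeffs and vectors, and the distinct values for B and C, are mine and not from the question):

```python
import numpy as np

A = np.array([1, 2, 3])
B = np.array([4, 5, 6])   # deliberately different from A, to expose ordering
C = np.array([7, 8, 9])
search_range = np.linspace(0, 2, 11)

# indexing='ij' makes the first grid vary slowest, like the outermost loop
grids = np.meshgrid(search_range, search_range, search_range, indexing='ij')
coeffs = np.stack(grids, axis=-1).reshape(-1, 3)   # (1331, 3): one (i, j, k) per row

vectors = np.stack([A, B, C])                      # (3, 3): rows are A, B, C
sm = coeffs @ vectors                              # row n is i*A + j*B + k*C

# reference: the original triple loop, written as a comprehension
expected = np.array([i*A + j*B + k*C
                     for i in search_range
                     for j in search_range
                     for k in search_range])
print(sm.shape, np.allclose(sm, expected))
```

The matrix product replaces the three separate broadcasts; each output row is the same linear combination the loop body computes.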

Related

In numpy, multiply two structured matrices concisely

I have two matrices. The first has the following structure:
[[1, 0, a],
[0, 1, b],
[1, 0, c],
[0, 1, d]]
where 1, 0, a, b, c, and d are scalars. The matrix is 4 by 3
The second is just a 2 by 3 matrix:
[[r1],
[r2]]
where r1 and r2 are the first and second rows respectively, each having 3 elements.
I would like the output to be:
[[r1, 0, a*r1],
[0, r1, b*r1],
[r2, 0, c*r2],
[0, r2, d*r2]]
which would be a 4 by 9 matrix.
This is similar to the Kronecker product, except separately for each row of the second matrix. Of course this could be done with cumbersome loops which I want to avoid.
How can I do this concisely?
You can do exactly what you said in the last line: do a separate Kronecker product for each row of the second matrix and then concatenate the results.
Let's assume that the two matrices are called x (4 by 3) and y (2 by 3). The first thing to do is to split x into two parts, because only half of the matrix participates in each part of the product.
x = x.reshape(2, 2, 3)
Then you can calculate the two products separately:
z0 = np.kron(x[0], y[0])
z1 = np.kron(x[1], y[1])
Finally, concatenate the two results along the first axis:
z = np.concatenate([z0, z1], axis=0)
Or if, like me, you enjoy big ugly one-liners you can do:
z = np.concatenate([np.kron(xr, yr) for xr, yr in zip(x.reshape(2, 2, 3), y)], axis=0)
In the general case you mentioned in the comments, with n the number of rows of x, it would become:
z = np.concatenate([np.kron(xr, yr) for xr, yr in zip(x.reshape(n // 2, 2, 3), y)], axis=0)
This gives equal results to the explicit loop, which can be numba.jit compiled I believe:
def solve_explicit(x, y):
    # sanity checks
    assert x.shape[0] == 2*y.shape[0]
    assert x.shape[1] == y.shape[1]
    n = x.shape[0]
    z = np.zeros((n, 9))
    for i in range(n):
        for j in range(3):
            for k in range(3):
                z[i, k + 3 * j] = x[i, j] * y[i // 2, k]
    return z
Using broadcasting, with x.shape (n, 3), and y.shape (n//2, 3):
out = (x.reshape(-1, 2, 3, 1) * y.reshape(-1, 1, 1, 3)).reshape(-1, 9)
I personally would use np.einsum in this situation because I think it's easier to understand than broadcasting.
import numpy as np
(a, b, c, d) = np.random.rand(4)
x = np.array([[1, 0, a], [0, 1, b], [1, 0, c], [0, 1, d]])
y = np.random.rand(2, 3)
z = np.einsum("ij,ik->ijk", x.reshape(-1, 6), y).reshape(-1, 9)
# timeit magic commands.
# %timeit -n 50000 np.einsum("ij,ik->ijk", x.reshape(-1, 6), y).reshape(-1, 9)
# %timeit -n 50000 (x.reshape(-1, 2, 3, 1) * y.reshape(-1, 1, 1, 3)).reshape(-1, 9)
Some good references on Einstein summation in NumPy: [2, 3, 4].
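As a sanity check on the approaches above, here is a small sketch that compares the Kronecker, broadcasting and einsum versions against the element-wise definition z[i, 3*j + k] = x[i, j] * y[i // 2, k]. The random inputs, the choice n = 6 and the names ref, via_kron, via_broadcast and via_einsum are assumptions made for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6                                  # number of rows of x; must be even
x = rng.random((n, 3))
y = rng.random((n // 2, 3))

# reference: row i of z is the flattened outer product of x[i] and y[i // 2]
ref = np.empty((n, 9))
for i in range(n):
    ref[i] = np.outer(x[i], y[i // 2]).ravel()

via_kron = np.concatenate(
    [np.kron(xr, yr) for xr, yr in zip(x.reshape(n // 2, 2, 3), y)], axis=0)
via_broadcast = (x.reshape(-1, 2, 3, 1) * y.reshape(-1, 1, 1, 3)).reshape(-1, 9)
via_einsum = np.einsum("ij,ik->ijk", x.reshape(-1, 6), y).reshape(-1, 9)

print(all(np.allclose(ref, z) for z in (via_kron, via_broadcast, via_einsum)))
```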

Sum of elements in matrix of matrices using numpy

I have a matrix of matrices with some arbitrary shape (N1,N2,k,k), meaning N1*N2 matrices with shape k*k.
I wish to calculate the sum of each matrix (of shape (k,k)) and replace the matrix itself with that sum.
The resulting array would be of shape (N1,N2), where each element at index i,j is the sum of the corresponding matrix at that index.
Is there a way of doing so with numpy operations (that is, without looping over range(N1) and range(N2))?
Here's a simple example (I'm using * with the first array and the second array transposed just to create the example):
m = np.array([[0, 0, 0, 0]]).reshape(2, 2) # matrix element of size k*k (k=2)
a = np.array([m, m + 1, m + 2, m + 3])
b = np.array([m, m + 1, m + 2, m + 3])
reshaped1 = a[:, np.newaxis] # (N1,1,k,k) where N1=4
reshaped2 = b[np.newaxis, :] # (1,N2,k,k) where N2=4
mult = reshaped1 * reshaped2 # (N1,N2,k,k)=(4,4,2,2)
I wish to create a new array res that will contain the sum of the elements of each matrix in mult. That can be done with the following pseudocode:
for i in range(N1):
    for j in range(N2):
        res[i,j] = sum(mult[i,j])
appreciate your help!
If I understand you correctly, you can use np.sum with multiple axes:
np.sum(mult, axis=(2, 3))
Output:
array([[ 0, 0, 0, 0],
[ 0, 4, 8, 12],
[ 0, 8, 16, 24],
[ 0, 12, 24, 36]])
Try using np.sum(np.sum(mult,axis=3),axis=2):
import numpy as np
N1=4
N2=4
m = np.array([[0, 0, 0, 0]]).reshape(2, 2) # matrix element of size k*k (k=2)
a = np.array([m, m + 1, m + 2, m + 3])
b = np.array([m, m + 1, m + 2, m + 3])
reshaped1 = a[:, np.newaxis] # (N1,1,k,k) where N1=4
reshaped2 = b[np.newaxis, :] # (1,N2,k,k) where N2=4
mult = reshaped1 * reshaped2 # (N1,N2,k,k)=(4,4,2,2)
np.sum(mult,axis=3)
res=np.zeros((4,4))
for i in range(N1):
    for j in range(N2):
        res[i,j] = np.sum(mult[i,j])
print(np.array_equal(np.sum(np.sum(mult,axis=3),axis=2),res))
>>> True
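In the question's example, mult is itself a broadcast product of a and b. If only the sums are needed, the (N1, N2, k, k) intermediate does not have to be built at all. The sketch below relies on that assumption, so it applies to arrays constructed this way and not to an arbitrary 4-D array; the names res_sum, res_einsum and res_matmul are mine:

```python
import numpy as np

m = np.zeros((2, 2), dtype=int)              # k = 2
a = np.array([m, m + 1, m + 2, m + 3])       # (N1, k, k)
b = np.array([m, m + 1, m + 2, m + 3])       # (N2, k, k)

# via the 4-D intermediate, as in the answers above
mult = a[:, np.newaxis] * b[np.newaxis, :]
res_sum = mult.sum(axis=(2, 3))

# res[i, j] = sum over k, l of a[i, k, l] * b[j, k, l], with no 4-D array
res_einsum = np.einsum("ikl,jkl->ij", a, b)

# the same contraction as a matrix product of the flattened matrices
res_matmul = a.reshape(len(a), -1) @ b.reshape(len(b), -1).T

print(np.array_equal(res_sum, res_einsum), np.array_equal(res_sum, res_matmul))
```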

Vectorised index of arrays

Originally I had something like this:
a = 1 # Some randomly generated positive integer
b = -1 # Some randomly generated negative integer
c = 0 # Constant 0
i = 0 # Randomly picked from (0, 1, 2)
d = [a, b, c][i]
I would like to vectorise this so that many samples can be generated
So I have three arrays of length N, an index array of length N, and would like to use that index array to pick one of the three arrays
a = np.array([1, 2, 3, 4])
b = np.array([-1, -2, -3, -4])
c = np.array([0, 0, 0, 0])
i = np.array([2, 1, 2, 0])
d = np.array([a, b, c])[i] # Doesn't work
# Would like the result:
d = np.array([0, -2, 0, 4])
d = a * (i == 0) + b * (i == 1) + c * (i == 2) works, but surely there is a way that looks more like the unvectorised code
Make a 2-d array from the three arrays, then use integer array indexing:
>>> e = np.vstack([a,b,c])
>>> i = np.array([2, 1, 2, 0])
>>> e[(i,np.arange(i.shape[0]))]
array([ 0, -2, 0, 4])
Notice that your answer is on the diagonal of
np.array([a, b, c])[i]
so you can go:
np.array([a, b, c])[i].diagonal()
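The diagonal trick builds an N by N intermediate, which gets expensive for large N. Two alternatives that stay close to the unvectorised [a, b, c][i] are sketched below, using np.choose and np.take_along_axis; the names d_choose, d_take and stacked are mine:

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([-1, -2, -3, -4])
c = np.array([0, 0, 0, 0])
i = np.array([2, 1, 2, 0])

# element n of the result is taken from the i[n]-th array in the list
d_choose = np.choose(i, [a, b, c])

# same selection on a stacked (3, N) array, picking one row per column
stacked = np.stack([a, b, c])
d_take = np.take_along_axis(stacked, i[np.newaxis, :], axis=0)[0]

print(d_choose, d_take)   # both hold the values 0, -2, 0, 4
```

np.choose reads most like the original one-liner; take_along_axis is the more general tool when the candidates already live in one array.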

Python - Convert the array in a tuple to just a normal array

I have a signal where I want to find the average height of the values. This is done by finding the zero crossings and calculating the max and min between each zero crossing, then averaging these values.
My problem occurs when I want to use np.where() to find where the signal is crossing zero. When I use np.where() I get the result in a tuple, but I want it in an array so that I can count the number of times zero is crossed.
I am new to Python and coming from Matlab it is a bit confusing with all the different classes. As you can see, I get an error because nu = len(zero_u) gives 1 as a result, because the whole array is written in a tuple as one element.
Any ideas how to go around this?
The code looks like this:
import numpy as np
def averageheight(f):
    rms = np.std(f)
    f = f + (rms * 10**-6)
    # Find zero crossing
    fsign = np.sign(f)
    fdiff = np.diff(fsign)
    zero_u = np.asarray(np.where(fdiff > 0)) + 1
    zero_d = np.asarray(np.where(fdiff < 0)) + 1
    nu = len(zero_u)
    nd = len(zero_d)
    value_max = np.zeros((nu, 1))
    value_min = np.zeros((nu, 1))
    imaxvec = np.zeros((nu, 1))
    iminvec = np.zeros((nu, 1))
    if (nu > 2) and (nd > 2):
        if zero_u[0] > zero_d[0]:
            zero_d[0] = []
        nu = len(zero_u)
        nd = len(zero_d)
        ncross = np.fmin(nu, nd)
        # Find Maxima:
        for ic in range(0, ncross - 1):
            up = int(zero_u[ic])
            down = int(zero_d[ic])
            fvec = f[up:down]
            value_max[ic] = np.amax(fvec)
            index_max = value_max.argmax()
            imaxvec[ic] = up + index_max - 1
        # Find Minima:
        for ic in range(0, ncross - 2):
            down = int(zero_d[ic])
            up = int(zero_u[ic+1])
            fvec = f[down:up]
            value_min[ic] = np.amin(fvec)
            index_min = value_min.argmin()
            iminvec[ic] = down + index_min - 1
        # Remove spurious values, bumps and zero_d
        thr = rms/3
        maxfind = np.where(value_max < thr)
        for i in range(0, len(maxfind)):
            imaxfind = np.where(value_max == maxfind[i])
            imaxvec[imaxfind] = 0
            value_max[imaxfind] = 0
        minfind = np.where(value_min > -thr)
        for j in range(0, len(minfind)):
            iminfind = np.where(value_min == minfind[j])
            value_min[iminfind] = 0
            iminvec[iminfind] = 0
        # Find Average Height
        avh = np.mean(value_max) - np.mean(value_min)
    else:
        avh = 0
    return avh
The documentation for np.where, and for np.nonzero even more so, clearly explains that it returns a tuple, with one array for each dimension of the condition array:
In [71]: arr = np.random.randint(-5,5,10)
In [72]: arr
Out[72]: array([ 3, 4, 2, -3, -1, 0, -5, 4, 2, -3])
In [73]: arr.shape
Out[73]: (10,)
In [74]: np.where(arr>=0)
Out[74]: (array([0, 1, 2, 5, 7, 8]),)
In [75]: arr[_]
Out[75]: array([3, 4, 2, 0, 4, 2])
That Out[74] tuple can be used directly as an index.
You can also extract the array from the tuple:
In [76]: np.where(arr>=0)[0]
Out[76]: array([0, 1, 2, 5, 7, 8])
That, I think, is a better choice than np.asarray(np.where(...)), which wraps the tuple in a 2d array with a single row.
This convention for where becomes clearer when we use it on a 2d array
In [77]: arr2 = arr.reshape(2,5)
In [78]: np.where(arr2>=0)
Out[78]: (array([0, 0, 0, 1, 1, 1]), array([0, 1, 2, 0, 2, 3]))
In [79]: arr2[_]
Out[79]: array([3, 4, 2, 0, 4, 2])
Again we are indexing with a tuple. arr2[1,3] is really arr2[(1,3)]. The values in [] indexing brackets are actually passed to the indexing function as a tuple of values.
np.argwhere applies transpose to the result of where, producing an array:
In [80]: np.transpose(np.where(arr2>=0))
Out[80]:
array([[0, 0],
[0, 1],
[0, 2],
[1, 0],
[1, 2],
[1, 3]])
Those are the same index arrays, but arranged as the columns of a 2d array, one row per match.
If you need the count of where without the actual values, a slightly faster function is
In [81]: np.count_nonzero(arr>=0)
Out[81]: 6
In fact np.nonzero uses the count to first determine the size of the arrays that it will return.
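Applied to the zero-crossing code in the question, the [0] extraction looks roughly like this. The sine test signal is a hypothetical stand-in for the real data, and wrapped is only there to show why len() returned 1 in the original code:

```python
import numpy as np

# hypothetical test signal: three full periods of a phase-shifted sine
t = np.linspace(0, 2 * np.pi, 200, endpoint=False)
f = np.sin(3 * t + 0.5)

fdiff = np.diff(np.sign(f))

# np.where returns a 1-tuple for 1-D input; [0] pulls out the index array
zero_u = np.where(fdiff > 0)[0] + 1     # upward crossings
zero_d = np.where(fdiff < 0)[0] + 1     # downward crossings

# wrapping the tuple instead gives a 2-D array with a single row
wrapped = np.asarray(np.where(fdiff > 0)) + 1

print(zero_u.shape, len(zero_u))    # 1-D, so len counts the crossings
print(wrapped.shape, len(wrapped))  # (1, n), so len is always 1
```

With 1-D index arrays, zero_u[ic] is a plain integer and can be used directly in slices such as f[up:down].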

Summing and removing repeated elements of Numpy Arrays

I have 4 1D Numpy arrays of equal length.
The first three act as an ID, uniquely identifying the 4th array.
The ID arrays contain repeated combinations, for which I need to sum the 4th array, and remove the repeating element from all 4 arrays.
x = np.array([1, 2, 4, 1])
y = np.array([1, 1, 4, 1])
z = np.array([1, 2, 2, 1])
data = np.array([4, 7, 3, 2])
In this case I need:
x = [1, 2, 4]
y = [1, 1, 4]
z = [1, 2, 2]
data = [6, 7, 3]
The arrays are rather long so loops really won't work. I'm sure there is a fairly simple way to do this, but for the life of me I can't figure it out.
To get started, we can stack the ID vectors into a matrix such that each ID is a row of three values:
XYZ = np.vstack((x,y,z)).T
Now, we just need to find the indices of repeated rows. np.unique did not operate on rows when this answer was written (newer NumPy versions accept an axis argument), so we need to do some tricks:
order = np.lexsort(XYZ.T)
diff = np.diff(XYZ[order], axis=0)
uniq_mask = np.append(True, (diff != 0).any(axis=1))
This part is borrowed from the np.unique source code, and finds the unique indices as well as the "inverse index" mapping:
uniq_inds = order[uniq_mask]
inv_idx = np.zeros_like(order)
inv_idx[order] = np.cumsum(uniq_mask) - 1
Finally, sum over the unique indices:
data = np.bincount(inv_idx, weights=data)
x,y,z = XYZ[uniq_inds].T
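On NumPy 1.13 or newer, np.unique accepts axis=0 and can find the unique rows directly, which replaces the lexsort and diff steps. A minimal sketch under that version assumption follows; the names uniq_rows, inverse and the *_out variables are mine, and np.ravel is there only as a guard in case the inverse comes back with an extra dimension on some NumPy releases:

```python
import numpy as np

x = np.array([1, 2, 4, 1])
y = np.array([1, 1, 4, 1])
z = np.array([1, 2, 2, 1])
data = np.array([4, 7, 3, 2])

XYZ = np.column_stack((x, y, z))      # one (x, y, z) ID per row

# unique rows plus, for every original row, the index of its unique row
uniq_rows, inverse = np.unique(XYZ, axis=0, return_inverse=True)
inverse = np.ravel(inverse)           # make sure it is 1-D for bincount

# sum data within each group of identical IDs
data_out = np.bincount(inverse, weights=data)
x_out, y_out, z_out = uniq_rows.T

print(x_out, y_out, z_out, data_out)
```

Note that np.bincount with weights returns floats, and that the unique rows come back in sorted order, which here happens to match the order asked for in the question.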
You can use unique and sum as reptilicus suggested to do the following
import numpy as np
x = np.array([1, 2, 4, 1])
y = np.array([1, 1, 4, 1])
z = np.array([1, 2, 2, 1])
data = np.array([4, 7, 3, 2])
# N = len(x)
# ids = x + y*N + z*(N**2)
# Python 3: the built-in zip replaces itertools.izip
ids = np.array([hash((a, b, c)) for a, b, c in zip(x, y, z)]) # creates flat ids
_, idx, idx_rep = np.unique(ids, return_index=True, return_inverse=True)
x_out = x[idx]
y_out = y[idx]
z_out = z[idx]
# data_out = np.array([np.sum(data[idx_rep == i]) for i in range(len(idx))])
data_out = np.bincount(idx_rep, weights=data)
print(x_out)
print(y_out)
print(z_out)
print(data_out)
