Create array/tensor of cyclically shifted arrays - python

I want to create a 2d tensor (or numpy array, it doesn't really matter), where every row is a cyclically shifted version of the first row. Currently I do it using a for loop:
import torch
import numpy as np
a = np.random.rand(33, 11)
miss_size = 64
lp_order = a.shape[1] - 1
inv_a = -np.flip(a, axis=1)
mtx_size = miss_size+lp_order # some constant
mtx_row = torch.cat((torch.from_numpy(inv_a), torch.zeros((a.shape[0], miss_size - 1 + a.shape[1]))), dim=1)
mtx_full = mtx_row.unsqueeze(1)
for i in range(mtx_size):
    mtx_row = torch.roll(mtx_row, 1, 1)
    mtx_full = torch.cat((mtx_full, mtx_row.unsqueeze(1)), dim=1)
The unsqueezing is needed because I stack 2d tensors into a 3d tensor.
Is there a more efficient way to do that? Maybe a linear algebra trick, or a more Pythonic approach?

You can use scipy.linalg.circulant():
scipy.linalg.circulant([1, 2, 3])
# array([[1, 3, 2],
#        [2, 1, 3],
#        [3, 2, 1]])
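Note that circulant() takes a 1d sequence, so for a batched input like the one in the question you would apply it row by row; a minimal sketch with a small stand-in batch:
import numpy as np
from scipy.linalg import circulant

rows = np.random.rand(4, 3)                    # small stand-in for the padded rows
full = np.stack([circulant(r) for r in rows])  # (4, 3, 3): one circulant matrix per row
Since circulant() shifts its input down the columns, you may need to transpose each matrix (e.g. full.transpose(0, 2, 1)) to match the row-wise rolls in the question.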

I believe you can achieve this using torch.gather by constructing the appropriate index tensor. This approach works with batches too.
If we take this approach, the objective is to construct an index tensor where each value refers to an index in mtx_row (along the last dimension, here dim=1). For a 3-element row, it would be shaped (3, 3):
tensor([[0, 1, 2],
        [2, 0, 1],
        [1, 2, 0]])
You can achieve this by broadcasting torch.arange with its own transpose and applying modulo on the resulting matrix:
>>> n = 3
>>> idx = (n - torch.arange(n)[None].T + torch.arange(n)[None]) % n
>>> idx
tensor([[0, 1, 2],
        [2, 0, 1],
        [1, 2, 0]])
Let mtx_row be shaped (2, 3):
>>> mtx_row
tensor([[0.3523, 0.0170, 0.1875],
        [0.2156, 0.7773, 0.4563]])
From there you need to expand idx and mtx_row so they have the same shapes:
>>> idx_ = idx[None].expand(len(mtx_row), -1, -1)
>>> val_ = mtx_row[:, None].expand(-1, n, -1)
Then we can apply torch.gather on the last dimension dim=2:
>>> val_.gather(-1, idx_)
tensor([[[0.3523, 0.0170, 0.1875],
         [0.1875, 0.3523, 0.0170],
         [0.0170, 0.1875, 0.3523]],

        [[0.2156, 0.7773, 0.4563],
         [0.4563, 0.2156, 0.7773],
         [0.7773, 0.4563, 0.2156]]])
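Putting the steps together (a sketch of the approach above, assuming a square (B, n) setting as in the toy example):
import torch

def all_cyclic_shifts(mtx_row):
    # mtx_row: (B, n) -> (B, n, n); slice [b, k] is row b shifted right by k
    B, n = mtx_row.shape
    idx = (n - torch.arange(n)[None].T + torch.arange(n)[None]) % n
    idx_ = idx[None].expand(B, -1, -1)
    val_ = mtx_row[:, None].expand(-1, n, -1)
    return val_.gather(-1, idx_)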


Loop append of numpy array element in python

Can someone explain how to append an element to a numpy array in a loop, based on a condition?
I wrote some code that should append the element 2 if the i-th element of array A is 0, and append the element 1 if it is not 0.
Here is the code itself:
import numpy as np
def finalconcat(somearray):
    for i in somearray:
        arraysome=[]
        if somearray[i]==0:
            arraysome=np.append(arraysome,[2],axis=0)
        else:
            arraysome=np.append(arraysome,[1],axis=0)
    return arraysome
Let me give you an example:
A=np.array([1,0,2,3,4,5])
C=finalconcat(B)
print(C)
It should come out:
[1,2,1,1,1,1]
But it comes out like:
[1.]
Please explain to me what is wrong here, I just don't understand what could be wrong...
You have several issues:
arraysome=[] is inside your loop so for each iteration of somearray you are emptying arraysome. Consequently, you can never end up with more than one element in arraysome when you are all done.
You have for i in somearray. On each iteration i will be the next element of somearray; it will not be iterating indices of the array. Yet later you have if somearray[i]==0:. This should just be if i==0:.
If you want the resulting elements of arraysome to be integers rather than floats, then you should initialize it as a numpy array of integers.
You have C=finalconcat(B), but B is not defined.
You should really spend some time reading the PEP 8 – Style Guide for Python Code.
import numpy as np

def finalconcat(somearray):
    arraysome = np.array([], dtype=int)  # note: np.int is removed in recent NumPy; plain int works
    for i in somearray:
        if i == 0:
            arraysome = np.append(arraysome, [2], axis=0)
        else:
            arraysome = np.append(arraysome, [1], axis=0)
    return arraysome

a = np.array([1, 0, 2, 3, 4, 5])
c = finalconcat(a)
print(c)
Prints:
[1 2 1 1 1 1]
For iteration like this it's better to use lists. np.append is just a poorly named cover for np.concatenate, which returns a whole new array with each call. List append works in-place, and is more efficient. And easier to use:
def finalconcat(somearray):
    rec = [2 if i==0 else 1 for i in somearray]
    # arr = np.array(rec)
    return rec
In [31]: a = np.array([1, 0, 2, 3, 4, 5])
In [32]: np.array([2 if i==0 else 1 for i in a])
Out[32]: array([1, 2, 1, 1, 1, 1])
But it's better to use whole-array methods, such as:
In [33]: b = np.ones_like(a)
In [34]: b
Out[34]: array([1, 1, 1, 1, 1, 1])
In [35]: b[a==0] = 2
In [36]: b
Out[36]: array([1, 2, 1, 1, 1, 1])
or
In [37]: np.where(a==0, 2, 1)
Out[37]: array([1, 2, 1, 1, 1, 1])
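To see the efficiency difference claimed above, here is a rough timing sketch (my own, with a made-up input size):
import timeit
import numpy as np

a = np.random.randint(0, 3, 10_000)

def with_np_append(arr):
    out = np.array([], dtype=int)
    for i in arr:
        out = np.append(out, [2 if i == 0 else 1])  # copies the whole array each call
    return out

def with_list_append(arr):
    out = []
    for i in arr:
        out.append(2 if i == 0 else 1)  # amortized O(1) per element
    return np.array(out)

print(timeit.timeit(lambda: with_np_append(a), number=3))    # seconds; grows quadratically
print(timeit.timeit(lambda: with_list_append(a), number=3))  # much faster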

How do I reduce the use of for loops using numpy?

Basically, I have three arrays that I multiply by values from 0 to 2, expanding the number of rows to the number of scale values (the scale values are the same for each array). From there, I want to calculate the sum of every combination of scaled rows from the three arrays. So I have three arrays
A = np.array([1, 2, 3])
B = np.array([1, 2, 3])
C = np.array([1, 2, 3])
and I'm trying to reduce the operation given below
search_range = np.linspace(0, 2, 11)
results = np.array([[0, 0, 0]])
for i in search_range:
    for j in search_range:
        for k in search_range:
            sm = i*A + j*B + k*C
            results = np.append(results, [sm], axis=0)
What I tried doing:
A = np.array([[1, 2, 3]])
B = np.array([[1, 2, 3]])
C = np.array([[1, 2, 3]])
n = 11
scale = np.linspace(0, 2, n).reshape(-1, 1)
A = np.repeat(A, n, axis=0) * scale
B = np.repeat(B, n, axis=0) * scale
C = np.repeat(C, n, axis=0) * scale
results = np.array([[0, 0, 0]])
for i in range(n):
    A_i = A[i]
    for j in range(n):
        B_j = B[j]
        C_k = C
        sm = A_i + B_j + C_k
        results = np.append(results, sm, axis=0)
which only removes the last for loop. How do I reduce the other for loops?
You can get the same result like this:
search_range = np.linspace(0, 2, 11)
search_range = np.array(np.meshgrid(search_range, search_range, search_range))
search_range = search_range.T.reshape(-1, 3)
sm = search_range[:, 0, None]*A + search_range[:, 1, None]*B + search_range[:, 2, None]*C
results = np.concatenate(([[0, 0, 0]], sm))
Instead of using three nested loops to get every combination of elements in the "search_range" array, I used the meshgrid function to convert "search_range" into a 2D array of every possible combination, and then, instead of i, j and k, you can use the three columns of the new "search_range".
And finally, as suggested by @Mercury, you can use indexing on the new "search_range" array to generate the result. For example, search_range[:, 1, None] is an array of shape (1331, 1), containing a singleton array for each element at index 1 of the rows of "search_range". The concatenate is only there because you wanted the results array to start with the default value [[0, 0, 0]], so I prepended that to sm; otherwise, the sm array alone contains the answer.
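If you would rather skip the explicit column indexing, the same result can be computed with one broadcasted product; a sketch (my own variant, same setup as above):
import numpy as np

A = np.array([1, 2, 3])
B = np.array([1, 2, 3])
C = np.array([1, 2, 3])

search_range = np.linspace(0, 2, 11)
grid = np.array(np.meshgrid(search_range, search_range, search_range)).T.reshape(-1, 3)

# (1331, 3, 1) * (3, 3) broadcasts to (1331, 3, 3); summing over axis 1
# adds the three scaled rows, giving i*A + j*B + k*C for every (i, j, k)
sm = (grid[:, :, None] * np.stack([A, B, C])).sum(axis=1)  # shape (1331, 3)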

N-D indexing with defaults in NumPy

Can I index NumPy N-D array with fallback to default values for out-of-bounds indexes? Example code below for some imaginary np.get_with_default(a, indexes, default):
import numpy as np
print(np.get_with_default(
    np.array([[1, 2, 3], [4, 5, 6]]),                               # N-D array
    (np.array([0, 0, 1, 1, 2, 2]), np.array([1, 2, 2, 3, 3, 5])),   # N-tuple of indexes along each axis
    13,                                                             # Default for out-of-bounds fallback
))
should print
[2 3 6 13 13 13]
I'm looking for some built-in function for this. If no such function exists, then at least some short and efficient implementation would do.
I arrived at this question because I was looking for exactly the same thing. I came up with the following function, which does what you ask for 2 dimensions. It could likely be generalised to N dimensions.
def get_with_defaults(a, xx, yy, nodata):
    # get values from a, clipping the index values to valid ranges
    res = a[np.clip(yy, 0, a.shape[0] - 1), np.clip(xx, 0, a.shape[1] - 1)]
    # compute a mask for both x and y, where all invalid index values are set to true
    myy = np.ma.masked_outside(yy, 0, a.shape[0] - 1).mask
    mxx = np.ma.masked_outside(xx, 0, a.shape[1] - 1).mask
    # replace all values in res with NODATA, where either the x or y index are invalid
    np.choose(myy + mxx, [res, nodata], out=res)
    return res
xx and yy are the index arrays; a is indexed as (y, x).
This gives:
>>> a=np.zeros((3,2),dtype=int)
>>> get_with_defaults(a, (-1, 1000, 0, 1, 2), (0, -1, 0, 1, 2), -1)
array([-1, -1, 0, 0, -1])
As an alternative, the following implementation achieves the same and is more concise:
def get_with_default(a, xx, yy, nodata):
    # get values from a, clipping the index values to valid ranges
    res = a[np.clip(yy, 0, a.shape[0] - 1), np.clip(xx, 0, a.shape[1] - 1)]
    # replace all values in res with NODATA (gets broadcast to the result array), where
    # either the x or y index are invalid
    res[(yy < 0) | (yy >= a.shape[0]) | (xx < 0) | (xx >= a.shape[1])] = nodata
    return res
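A quick check of this variant (my own run, reusing the example above but with array inputs, since the comparisons need ndarrays rather than tuples):
>>> a = np.zeros((3, 2), dtype=int)
>>> xx = np.array([-1, 1000, 0, 1, 2])
>>> yy = np.array([0, -1, 0, 1, 2])
>>> get_with_default(a, xx, yy, -1)
array([-1, -1,  0,  0, -1])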
I don't know if there is anything in NumPy to do that directly, but you can always implement it yourself. This is not particularly smart or efficient, as it requires multiple advanced indexing operations, but does what you need:
import numpy as np

def get_with_default(a, indices, default=0):
    # Ensure inputs are arrays
    a = np.asarray(a)
    indices = tuple(np.broadcast_arrays(*indices))
    if len(indices) <= 0 or len(indices) > a.ndim:
        raise ValueError('invalid number of indices.')
    # Make mask of indices out of bounds
    mask = np.zeros(indices[0].shape, bool)  # plain bool; np.bool is removed in recent NumPy
    for ind, s in zip(indices, a.shape):
        mask |= (ind < 0) | (ind >= s)
    # Only do masking if necessary
    n_mask = np.count_nonzero(mask)
    # Shortcut for the case where all is masked
    if n_mask == mask.size:
        return np.full(mask.shape, default, dtype=a.dtype)  # result has the indices' shape
    if n_mask > 0:
        # Ensure index arrays are contiguous so masking works right
        indices = tuple(map(np.ascontiguousarray, indices))
        for ind in indices:
            # Replace masked indices with zeros
            ind[mask] = 0
    # Get values
    res = a[indices]
    if n_mask > 0:
        # Replace values of masked indices with default value
        res[mask] = default
    return res

# Test
print(get_with_default(
    np.array([[1, 2, 3], [4, 5, 6]]),
    (np.array([0, 0, 1, 1, 2, 2]), np.array([1, 2, 2, 3, 3, 5])),
    13
))
# [ 2  3  6 13 13 13]
I also needed a solution to this, but I wanted a solution that worked in N dimensions. I made Markus' solution work for N-dimensions, including selecting from an array with more dimensions than the coordinates point to.
def get_with_defaults(arr, coords, nodata):
    coords, shp = np.array(coords), np.array(arr.shape)
    # Get values from arr, clipping to valid ranges
    res = arr[tuple(np.clip(c, 0, s - 1) for c, s in zip(coords, shp))]
    # Set any output where one of the coords was out of range to nodata
    res[np.any(~((0 <= coords) & (coords < shp[:len(coords), None])), axis=0)] = nodata
    return res
import numpy as np

if __name__ == '__main__':
    A = np.array([[1, 2, 3], [4, 5, 6]])
    B = np.array([[[1, -9], [2, -8], [3, -7]], [[4, -6], [5, -5], [6, -4]]])
    coords1 = [[0, 0, 1, 1, 2, 2], [1, 2, 2, 3, 3, 5]]
    coords2 = [[0, 0, 1, 1, 2, 2], [1, 2, 2, 3, 3, 5], [1, 1, 1, 1, 1, 1]]
    out1 = get_with_defaults(A, coords1, 13)
    out2 = get_with_defaults(B, coords1, 13)
    out3 = get_with_defaults(B, coords2, 13)
    print(out1)
    # [2, 3, 6, 13, 13, 13]
    print(out2)
    # [[ 2 -8]
    #  [ 3 -7]
    #  [ 6 -4]
    #  [13 13]
    #  [13 13]
    #  [13 13]]
    print(out3)
    # [-8, -7, -4, 13, 13, 13]

Why does calling np.array() on this list comprehension produce a 3d array instead of 2d?

I have a script that produces the first several iterations of a Markov matrix multiplying a given set of input values. With the matrix stored as A and the start values in the column u0, I use this list comprehension to store the output in an array:
out = np.array([ ( (A**n) * u0).T for n in range(10) ])
The output has shape (10,1,6), but I want the output in shape (10,6) instead. Obviously, I can fix this with .reshape(), but is there a way to avoid creating the extra dimension in the first place, perhaps by simplifying the list comprehension or the inputs?
Here's the full script and output:
import numpy as np
# Random 6x6 Markov matrix
n = 6
A = np.matrix([ (lambda x: x/x.sum())(np.random.rand(n)) for _ in range(n)]).T
print(A)
#[[0.27457312 0.20195133 0.14400801 0.00814027 0.06026188 0.23540134]
# [0.21526648 0.17900277 0.35145882 0.30817386 0.15703758 0.21069114]
# [0.02100412 0.05916883 0.18309142 0.02149681 0.22214047 0.15257011]
# [0.17032696 0.11144443 0.01364982 0.31337906 0.25752732 0.1037133 ]
# [0.03081507 0.2343255 0.2902935 0.02720764 0.00895182 0.21920371]
# [0.28801424 0.21410713 0.01749843 0.32160236 0.29408092 0.07842041]]
# Random start values
u0 = np.matrix(np.random.randint(51, size=n)).T
print(u0)
#[[31]
# [49]
# [44]
# [29]
# [10]
# [ 0]]
# Find the first 10 iterations of the Markov process
out = np.array([ ( (A**n) * u0).T for n in range(10) ])
print(out)
#[[[31. 49. 44. 29. 10.
# 0. ]]
#
# [[25.58242101 41.41600236 14.45123543 23.00477134 26.08867045
# 32.45689942]]
#
# [[26.86917065 36.02438292 16.87560159 26.46418685 22.66236879
# 34.10428921]]
#
# [[26.69224394 37.06346073 16.59208202 26.48817955 22.56696872
# 33.59706504]]
#
# [[26.68772374 36.99727159 16.49987315 26.5003184 22.61130862
# 33.7035045 ]]
#
# [[26.68766363 36.98517264 16.50532933 26.51717543 22.592951
# 33.71170797]]
#
# [[26.68695152 36.98895204 16.50314718 26.51729716 22.59379049
# 33.70986161]]
#
# [[26.68682195 36.98848867 16.50286371 26.51763013 22.59362679
# 33.71056876]]
#
# [[26.68681128 36.98850409 16.50286036 26.51768807 22.59359453
# 33.71054167]]
#
# [[26.68680313 36.98851046 16.50285038 26.51769497 22.59359219
# 33.71054886]]]
print(out.shape)
#(10, 1, 6)
out = out.reshape(10,n)
print(out)
#[[31. 49. 44. 29. 10. 0. ]
# [25.58242101 41.41600236 14.45123543 23.00477134 26.08867045 32.45689942]
# [26.86917065 36.02438292 16.87560159 26.46418685 22.66236879 34.10428921]
# [26.69224394 37.06346073 16.59208202 26.48817955 22.56696872 33.59706504]
# [26.68772374 36.99727159 16.49987315 26.5003184 22.61130862 33.7035045 ]
# [26.68766363 36.98517264 16.50532933 26.51717543 22.592951 33.71170797]
# [26.68695152 36.98895204 16.50314718 26.51729716 22.59379049 33.70986161]
# [26.68682195 36.98848867 16.50286371 26.51763013 22.59362679 33.71056876]
# [26.68681128 36.98850409 16.50286036 26.51768807 22.59359453 33.71054167]
# [26.68680313 36.98851046 16.50285038 26.51769497 22.59359219 33.71054886]]
I think your confusion lies with how arrays can be joined.
Start with a simple 1d array (in numpy 1d is a real thing, not just a 'row vector' or 'column vector'):
In [288]: arr = np.arange(6)
In [289]: arr
Out[289]: array([0, 1, 2, 3, 4, 5])
np.array joins element arrays along a new 1st dimension:
In [290]: np.array([arr,arr])
Out[290]:
array([[0, 1, 2, 3, 4, 5],
[0, 1, 2, 3, 4, 5]])
np.stack with the default axis value does the same thing. Read its docs.
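For instance (a small illustration, not from the original session):
np.stack([arr, arr]).shape          # (2, 6), same as np.array([arr, arr])
np.stack([arr, arr], axis=1).shape  # (6, 2); the new axis goes second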
We can make a 2d array, a column vector:
In [291]: arr1 = arr[:,None]
In [292]: arr1
Out[292]:
array([[0],
[1],
[2],
[3],
[4],
[5]])
In [293]: arr1.shape
Out[293]: (6, 1)
Using np.array on its transpose joins the (1,6) arrays:
In [294]: np.array([arr1.T, arr1.T])
Out[294]:
array([[[0, 1, 2, 3, 4, 5]],
[[0, 1, 2, 3, 4, 5]]])
In [295]: _.shape
Out[295]: (2, 1, 6)
Note the middle size-1 dimension, the one that bothered you.
np.vstack joins the arrays along the existing 1st dimension. It does not add one:
In [296]: np.vstack([arr1.T, arr1.T])
Out[296]:
array([[0, 1, 2, 3, 4, 5],
[0, 1, 2, 3, 4, 5]])
Or we could join the arrays horizontally, on the 2nd dimension:
In [297]: np.hstack([arr1, arr1])
Out[297]:
array([[0, 0],
[1, 1],
[2, 2],
[3, 3],
[4, 4],
[5, 5]])
That is (6,2) which can be transposed to (2,6):
In [298]: np.hstack([arr1, arr1]).T
Out[298]:
array([[0, 1, 2, 3, 4, 5],
[0, 1, 2, 3, 4, 5]])
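Applied to the question's list comprehension (a sketch under the same np.matrix setup): each ((A**k) * u0).T is a (1, 6) matrix, so np.vstack joins them into the desired shape directly:
out = np.vstack([((A**k) * u0).T for k in range(10)])  # shape (10, 6), no reshape needed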
If you use np.array() for input and @ for matrix multiplication, it works as expected.
# Random 6x6 Markov matrix
n = 6
A = np.array([ (lambda x: x/x.sum())(np.random.rand(n)) for _ in range(n)]).T
# Random start values
u0 = np.random.randint(51, size=n).T
# Find the first 10 iterations of the Markov process
out = np.array([ (np.linalg.matrix_power(A, n) @ u0).T for n in range(10) ])
print(out)
#[[29. 24. 5. 12. 10. 32. ]
# [15.82875119 13.53436868 20.61648725 19.22478172 20.34082205 22.45478912]
# [21.82434718 10.06037119 14.29281935 20.75271393 18.76134538 26.30840297]
# [20.77484848 10.1379821 15.47488423 19.4965479 20.05618311 26.05955418]
# [21.02944236 10.09401438 15.24263478 19.48662616 19.95767996 26.18960236]
# [20.96887722 10.11647819 15.30729334 19.44261102 20.00089222 26.16384802]
# [20.98086362 10.11522779 15.29529799 19.44899285 19.99137187 26.16824587]
# [20.97795615 10.11606978 15.29817734 19.44798612 19.99293494 26.16687566]
# [20.97858032 10.11591954 15.29752865 19.44839852 19.99245389 26.16711909]
# [20.97844343 10.11594666 15.29766432 19.4483417 19.99254284 26.16706104]]
I made a few changes to the code, although I'm not 100% certain that the result is still the same (I am not familiar with Markov chains).
import numpy as np

n = 6
num_proc_iters = 10
rand_nums_arr = np.random.random_sample((n, n))
# normalize each row to sum to 1, then transpose so the columns sum to 1
rand_nums_arr = np.transpose(rand_nums_arr / rand_nums_arr.sum(axis=1, keepdims=True))
u0 = np.random.randint(51, size=n)
res_arr = np.stack([np.linalg.matrix_power(rand_nums_arr, curr) @ u0 for curr in range(num_proc_iters)])
I would love to hear if anyone can think of any further improvements.
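One possible further improvement (my own suggestion, same setup as above): np.linalg.matrix_power redoes the whole exponentiation on every iteration, so reusing the previous state needs only one matrix-vector product per step:
states = [u0.astype(float)]
for _ in range(num_proc_iters - 1):
    states.append(rand_nums_arr @ states[-1])  # advance the chain one step
res_arr = np.stack(states)                     # shape (num_proc_iters, n)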

Python - Convert the array in a tuple to just a normal array

I have a signal where I want to find the average height of the values. This is done by finding the zero crossings and calculating the max and min between each zero crossing, then averaging these values.
My problem occurs when I want to use np.where() to find where the signal crosses zero. When I use np.where() I get the result as a tuple, but I want it as an array so that I can count the number of times zero is crossed.
I am new to Python and, coming from Matlab, it is a bit confusing with all the different classes. As you can see, I get an error because nu = len(zero_u) gives 1 as a result: the whole array sits inside the tuple as one element.
Any ideas how to get around this?
The code looks like this:
import numpy as np

def averageheight(f):
    rms = np.std(f)
    f = f + (rms * 10**-6)
    # Find zero crossing
    fsign = np.sign(f)
    fdiff = np.diff(fsign)
    zero_u = np.asarray(np.where(fdiff > 0)) + 1
    zero_d = np.asarray(np.where(fdiff < 0)) + 1
    nu = len(zero_u)
    nd = len(zero_d)
    value_max = np.zeros((nu, 1))
    value_min = np.zeros((nu, 1))
    imaxvec = np.zeros((nu, 1))
    iminvec = np.zeros((nu, 1))
    if (nu > 2) and (nd > 2):
        if zero_u[0] > zero_d[0]:
            zero_d[0] = []
        nu = len(zero_u)
        nd = len(zero_d)
        ncross = np.fmin(nu, nd)
        # Find Maxima:
        for ic in range(0, ncross - 1):
            up = int(zero_u[ic])
            down = int(zero_d[ic])
            fvec = f[up:down]
            value_max[ic] = np.amax(fvec)
            index_max = value_max.argmax()
            imaxvec[ic] = up + index_max - 1
        # Find Minima:
        for ic in range(0, ncross - 2):
            down = int(zero_d[ic])
            up = int(zero_u[ic+1])
            fvec = f[down:up]
            value_min[ic] = np.amin(fvec)
            index_min = value_min.argmin()
            iminvec[ic] = down + index_min - 1
        # Remove spurious values, bumps and zero_d
        thr = rms/3
        maxfind = np.where(value_max < thr)
        for i in range(0, len(maxfind)):
            imaxfind = np.where(value_max == maxfind[i])
            imaxvec[imaxfind] = 0
            value_max[imaxfind] = 0
        minfind = np.where(value_min > -thr)
        for j in range(0, len(minfind)):
            iminfind = np.where(value_min == minfind[j])
            value_min[iminfind] = 0
            iminvec[iminfind] = 0
        # Find Average Height
        avh = np.mean(value_max) - np.mean(value_min)
    else:
        avh = 0
    return avh
The documentation for np.where (and even more so for np.nonzero) clearly explains that it returns a tuple, with one array for each dimension of the condition array:
In [71]: arr = np.random.randint(-5,5,10)
In [72]: arr
Out[72]: array([ 3, 4, 2, -3, -1, 0, -5, 4, 2, -3])
In [73]: arr.shape
Out[73]: (10,)
In [74]: np.where(arr>=0)
Out[74]: (array([0, 1, 2, 5, 7, 8]),)
In [75]: arr[_]
Out[75]: array([3, 4, 2, 0, 4, 2])
That Out[74] tuple can be used directly as an index.
You can also extract the array from the tuple:
In [76]: np.where(arr>=0)[0]
Out[76]: array([0, 1, 2, 5, 7, 8])
That, I think, is a better choice than np.asarray(np.where(...)).
This convention for where becomes clearer when we use it on a 2d array:
In [77]: arr2 = arr.reshape(2,5)
In [78]: np.where(arr2>=0)
Out[78]: (array([0, 0, 0, 1, 1, 1]), array([0, 1, 2, 0, 2, 3]))
In [79]: arr2[_]
Out[79]: array([3, 4, 2, 0, 4, 2])
Again we are indexing with a tuple. arr2[1,3] is really arr2[(1,3)]. The values in [] indexing brackets are actually passed to the indexing function as a tuple of values.
np.argwhere applies transpose to the result of where, producing an array:
In [80]: np.transpose(np.where(arr2>=0))
Out[80]:
array([[0, 0],
[0, 1],
[0, 2],
[1, 0],
[1, 2],
[1, 3]])
That's the same indexing arrays, but arranged in a 2d column matrix.
If you need the count of where without the actual values, a slightly faster function is
In [81]: np.count_nonzero(arr>=0)
Out[81]: 6
In fact np.nonzero uses the count to first determine the size of the arrays that it will return.
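Applied to the code in the question (a sketch of just the relevant lines): taking the first element of the tuple gives plain 1d index arrays, so len() counts crossings as intended:
zero_u = np.where(fdiff > 0)[0] + 1  # 1d array of up-crossing indices
zero_d = np.where(fdiff < 0)[0] + 1  # 1d array of down-crossing indices
nu = len(zero_u)                     # now the number of up-crossings, not 1
nd = len(zero_d)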
