How to interpret an array inside the square brackets of another array? - python

I've written this code and I'm trying to understand the output so that I can apply a mask to an array.
matrix = np.random.rand(3,3)
matrix
output:
array([[0.7441097 , 0.02908848, 0.60378581],
[0.53335156, 0.21701412, 0.51545259],
[0.91777356, 0.49123304, 0.15410852]])
mask
output:
matrix([[0, 0, 2],
[1, 1, 0],
[2, 2, 2]])
matrix[mask]
output:
array([[[0.7441097 , 0.02908848, 0.60378581],
[0.7441097 , 0.02908848, 0.60378581],
[0.91777356, 0.49123304, 0.15410852]],
[[0.53335156, 0.21701412, 0.51545259],
[0.53335156, 0.21701412, 0.51545259],
[0.7441097 , 0.02908848, 0.60378581]],
[[0.91777356, 0.49123304, 0.15410852],
[0.91777356, 0.49123304, 0.15410852],
[0.91777356, 0.49123304, 0.15410852]]])
How can this result be interpreted?

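Indexing with an array of integers selects whole rows. Indexing with a single integer returns one row (the values below come from a different random matrix than the one in the question):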
In [1108]: matrix[0]
Out[1108]: array([0.02502891, 0.74397363, 0.74176154])
In [1109]: matrix[1]
Out[1109]: array([0.76480152, 0.84331737, 0.29647379])
In [1110]: matrix[2]
Out[1110]: array([0.68258943, 0.43118925, 0.82981894])
When you do :
matrix[mask]
where mask is :
matrix([[0, 0, 2],
[1, 1, 0],
[2, 2, 2]])
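It returns an array whose first element is:
[matrix[0], matrix[0], matrix[2]]
the second is:
[matrix[1], matrix[1], matrix[0]]
and so on: each integer in mask is replaced by the corresponding row of matrix, so the result has shape (3, 3, 3).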
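The row-by-row description above can be checked with a short sketch. It assumes a plain ndarray for mask (the question shows an np.matrix) and an arbitrary fixed seed; the names picked, manual and bool_mask are made up for illustration. It also shows that a boolean mask, which is probably what "applying a mask" was meant to do, behaves differently from an integer index array.

```python
import numpy as np

rng = np.random.default_rng(0)      # fixed seed, chosen only for this sketch
matrix = rng.random((3, 3))
mask = np.array([[0, 0, 2],
                 [1, 1, 0],
                 [2, 2, 2]])

# Integer array indexing: every index in mask is replaced by a row of matrix
picked = matrix[mask]
print(picked.shape)                 # (3, 3, 3)

# The same result built by hand: picked[i, j] is matrix[mask[i, j]]
manual = np.array([[matrix[k] for k in row] for row in mask])
same = np.array_equal(picked, manual)

# A boolean mask selects individual elements and returns a 1-D array
bool_mask = matrix > 0.5
selected = matrix[bool_mask]
```

If the goal is to keep or discard individual elements, the boolean form is usually the one to reach for; an integer array is a lookup of rows.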

Related

Create array/tensor of cycle shifted arrays

I want to create a 2D tensor (or NumPy array, it doesn't really matter) in which every row is a cyclic shift of the first row. I do it with a for loop:
import torch
import numpy as np
a = np.random.rand(33, 11)
miss_size = 64
lp_order = a.shape[1] - 1
inv_a = -np.flip(a, axis=1)
mtx_size = miss_size+lp_order # some constant
mtx_row = torch.cat((torch.from_numpy(inv_a), torch.zeros((a.shape[0], miss_size - 1 + a.shape[1]))), dim=1)
mtx_full = mtx_row.unsqueeze(1)
for i in range(mtx_size):
    mtx_row = torch.roll(mtx_row, 1, 1)
    mtx_full = torch.cat((mtx_full, mtx_row.unsqueeze(1)), dim=1)
The unsqueeze is needed because I stack 2D tensors into a 3D tensor.
Is there a more efficient way to do that? Maybe a linear algebra trick or a more Pythonic approach?
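You can use scipy.linalg.circulant(). Note that it puts the shifted copies in the columns, so transpose the result if you want them as rows:
import scipy.linalg
scipy.linalg.circulant([1, 2, 3])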
# array([[1, 3, 2],
# [2, 1, 3],
# [3, 2, 1]])
I believe you can achieve this using torch.gather by constructing the appropriate index tensor. This approach works with batches too.
If we take this approach, the objective is to construct an index tensor where each value refers to an index in mtx_row (along the last dimension, here dim=1). For rows of length n = 3, it would be shaped (3, 3):
tensor([[0, 1, 2],
[2, 0, 1],
[1, 2, 0]])
You can achieve this by broadcasting torch.arange with its own transpose and applying modulo on the resulting matrix:
>>> n = 3
>>> idx = (n - torch.arange(n)[None].T + torch.arange(n)[None]) % n
>>> idx
tensor([[0, 1, 2],
[2, 0, 1],
[1, 2, 0]])
Let mtx_row be shaped (2, 3):
>>> mtx_row
tensor([[0.3523, 0.0170, 0.1875],
[0.2156, 0.7773, 0.4563]])
From there you need to expand idx and mtx_row so that they have the same shape:
>>> idx_ = idx[None].expand(len(mtx_row), -1, -1)
>>> val_ = mtx_row[:, None].expand(-1, n, -1)
Then we can apply torch.gather on the last dimension dim=2:
>>> val_.gather(-1, idx_)
tensor([[[0.3523, 0.0170, 0.1875],
[0.1875, 0.3523, 0.0170],
[0.0170, 0.1875, 0.3523]],
[[0.2156, 0.7773, 0.4563],
[0.4563, 0.2156, 0.7773],
[0.7773, 0.4563, 0.2156]]])
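For readers working in NumPy rather than PyTorch, the same index trick can be sketched with fancy indexing on the last axis. This is an illustration, not part of the original answer: the names rows, shifted and expected are made up, and np.roll is used only as a cross-check.

```python
import numpy as np

n = 3
rows = np.array([[0.3523, 0.0170, 0.1875],
                 [0.2156, 0.7773, 0.4563]])      # shape (2, 3), values from the example

# idx[i, j] = (j - i) mod n, the same index matrix as in the torch version
idx = (np.arange(n)[None, :] - np.arange(n)[:, None]) % n

# Fancy indexing on the last axis: shifted[b, i, j] = rows[b, idx[i, j]]
shifted = rows[:, idx]                           # shape (2, 3, 3)

# Cross-check: row i of each block is the original row rolled right by i
expected = np.stack([np.roll(rows, i, axis=1) for i in range(n)], axis=1)
```

No explicit expand step is needed here because integer array indexing already produces one output row per row of idx.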

Why does calling np.array() on this list comprehension produce a 3d array instead of 2d?

I have a script that produces the first several iterations of a Markov matrix multiplying a given set of input values. With the matrix stored as A and the start values in the column u0, I use this list comprehension to store the output in an array:
out = np.array([ ( (A**n) * u0).T for n in range(10) ])
The output has shape (10,1,6), but I want the output in shape (10,6) instead. Obviously, I can fix this with .reshape(), but is there a way to avoid creating the extra dimension in the first place, perhaps by simplifying the list comprehension or the inputs?
Here's the full script and output:
import numpy as np
# Random 6x6 Markov matrix
n = 6
A = np.matrix([ (lambda x: x/x.sum())(np.random.rand(n)) for _ in range(n)]).T
print(A)
#[[0.27457312 0.20195133 0.14400801 0.00814027 0.06026188 0.23540134]
# [0.21526648 0.17900277 0.35145882 0.30817386 0.15703758 0.21069114]
# [0.02100412 0.05916883 0.18309142 0.02149681 0.22214047 0.15257011]
# [0.17032696 0.11144443 0.01364982 0.31337906 0.25752732 0.1037133 ]
# [0.03081507 0.2343255 0.2902935 0.02720764 0.00895182 0.21920371]
# [0.28801424 0.21410713 0.01749843 0.32160236 0.29408092 0.07842041]]
# Random start values
u0 = np.matrix(np.random.randint(51, size=n)).T
print(u0)
#[[31]
# [49]
# [44]
# [29]
# [10]
# [ 0]]
# Find the first 10 iterations of the Markov process
out = np.array([ ( (A**n) * u0).T for n in range(10) ])
print(out)
#[[[31. 49. 44. 29. 10.
# 0. ]]
#
# [[25.58242101 41.41600236 14.45123543 23.00477134 26.08867045
# 32.45689942]]
#
# [[26.86917065 36.02438292 16.87560159 26.46418685 22.66236879
# 34.10428921]]
#
# [[26.69224394 37.06346073 16.59208202 26.48817955 22.56696872
# 33.59706504]]
#
# [[26.68772374 36.99727159 16.49987315 26.5003184 22.61130862
# 33.7035045 ]]
#
# [[26.68766363 36.98517264 16.50532933 26.51717543 22.592951
# 33.71170797]]
#
# [[26.68695152 36.98895204 16.50314718 26.51729716 22.59379049
# 33.70986161]]
#
# [[26.68682195 36.98848867 16.50286371 26.51763013 22.59362679
# 33.71056876]]
#
# [[26.68681128 36.98850409 16.50286036 26.51768807 22.59359453
# 33.71054167]]
#
# [[26.68680313 36.98851046 16.50285038 26.51769497 22.59359219
# 33.71054886]]]
print(out.shape)
#(10, 1, 6)
out = out.reshape(10,n)
print(out)
#[[31. 49. 44. 29. 10. 0. ]
# [25.58242101 41.41600236 14.45123543 23.00477134 26.08867045 32.45689942]
# [26.86917065 36.02438292 16.87560159 26.46418685 22.66236879 34.10428921]
# [26.69224394 37.06346073 16.59208202 26.48817955 22.56696872 33.59706504]
# [26.68772374 36.99727159 16.49987315 26.5003184 22.61130862 33.7035045 ]
# [26.68766363 36.98517264 16.50532933 26.51717543 22.592951 33.71170797]
# [26.68695152 36.98895204 16.50314718 26.51729716 22.59379049 33.70986161]
# [26.68682195 36.98848867 16.50286371 26.51763013 22.59362679 33.71056876]
# [26.68681128 36.98850409 16.50286036 26.51768807 22.59359453 33.71054167]
# [26.68680313 36.98851046 16.50285038 26.51769497 22.59359219 33.71054886]]
I think your confusion lies with how arrays can be joined.
Start with a simple 1d array (in numpy 1d is a real thing, not just a 'row vector' or 'column vector'):
In [288]: arr = np.arange(6)
In [289]: arr
Out[289]: array([0, 1, 2, 3, 4, 5])
np.array joins element arrays along a new 1st dimension:
In [290]: np.array([arr,arr])
Out[290]:
array([[0, 1, 2, 3, 4, 5],
[0, 1, 2, 3, 4, 5]])
np.stack with the default axis value does the same thing. Read its docs.
We can make a 2d array, a column vector:
In [291]: arr1 = arr[:,None]
In [292]: arr1
Out[292]:
array([[0],
[1],
[2],
[3],
[4],
[5]])
In [293]: arr1.shape
Out[293]: (6, 1)
Using np.array on the transposes, which are (1,6) arrays:
In [294]: np.array([arr1.T, arr1.T])
Out[294]:
array([[[0, 1, 2, 3, 4, 5]],
[[0, 1, 2, 3, 4, 5]]])
In [295]: _.shape
Out[295]: (2, 1, 6)
Note the middle size-1 dimension, the one that bothered you.
np.vstack joins the arrays along the existing 1st dimension. It does not add one:
In [296]: np.vstack([arr1.T, arr1.T])
Out[296]:
array([[0, 1, 2, 3, 4, 5],
[0, 1, 2, 3, 4, 5]])
Or we could join the arrays horizontally, on the 2nd dimension:
In [297]: np.hstack([arr1, arr1])
Out[297]:
array([[0, 0],
[1, 1],
[2, 2],
[3, 3],
[4, 4],
[5, 5]])
That is (6,2) which can be transposed to (2,6):
In [298]: np.hstack([arr1, arr1]).T
Out[298]:
array([[0, 1, 2, 3, 4, 5],
[0, 1, 2, 3, 4, 5]])
If you use np.array() for the inputs and @ for matrix multiplication, it works as expected.
# Random 6x6 Markov matrix
n = 6
A = np.array([ (lambda x: x/x.sum())(np.random.rand(n)) for _ in range(n)]).T
# Random start values
u0 = np.random.randint(51, size=n).T
# Find the first 10 iterations of the Markov process
out = np.array([ ( np.linalg.matrix_power(A,n) @ u0).T for n in range(10) ])
print(out)
#[[29. 24. 5. 12. 10. 32. ]
# [15.82875119 13.53436868 20.61648725 19.22478172 20.34082205 22.45478912]
# [21.82434718 10.06037119 14.29281935 20.75271393 18.76134538 26.30840297]
# [20.77484848 10.1379821 15.47488423 19.4965479 20.05618311 26.05955418]
# [21.02944236 10.09401438 15.24263478 19.48662616 19.95767996 26.18960236]
# [20.96887722 10.11647819 15.30729334 19.44261102 20.00089222 26.16384802]
# [20.98086362 10.11522779 15.29529799 19.44899285 19.99137187 26.16824587]
# [20.97795615 10.11606978 15.29817734 19.44798612 19.99293494 26.16687566]
# [20.97858032 10.11591954 15.29752865 19.44839852 19.99245389 26.16711909]
# [20.97844343 10.11594666 15.29766432 19.4483417 19.99254284 26.16706104]]
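Since each iterate is just the previous one multiplied by A once more, the matrix powers do not have to be recomputed for every row. The following is a minimal sketch under that assumption, using a 1-D start vector so that no extra dimension appears; the seed and the names steps, u and out are chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)      # arbitrary seed for this sketch
n = 6
steps = 10

# Column-stochastic matrix: every column sums to 1
A = rng.random((n, n))
A = A / A.sum(axis=0, keepdims=True)

# 1-D start vector (shape (n,)), not an (n, 1) column
u = rng.integers(0, 51, size=n).astype(float)

out = np.empty((steps, n))
for k in range(steps):
    out[k] = u          # row k holds A^k applied to the start vector
    u = A @ u           # advance one step

print(out.shape)        # (10, 6)
```

Because the columns of A sum to 1, every row of out has the same total as the start vector, which is a quick sanity check on the result.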
I made a few changes to the code, although I'm not 100% certain that the result is still the same (I am not familiar with Markov chains).
import numpy as np
n = 6
num_proc_iters = 10
rand_nums_arr = np.random.random_sample((n, n))
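# keepdims=True keeps the row sums as a column, so each row is divided by its own sum
rand_nums_arr = np.transpose(rand_nums_arr / rand_nums_arr.sum(axis=1, keepdims=True))
u0 = np.random.randint(51, size=n)
# np.stack puts each 1-D result in its own row, giving shape (num_proc_iters, n)
res_arr = np.stack([np.linalg.matrix_power(rand_nums_arr, curr) @ u0 for curr in range(num_proc_iters)])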
I would love to hear if anyone can think of any further improvements.

Create stack of arrays from diagonal values using numpy

I'm trying to do some matrix calculations in Python and came across a problem when I tried to speed up my code using stacked arrays instead of simple for loops. I need to create a 2D array with values (given as a 1D array) on the diagonal, but couldn't figure out a smart way to do it with stacked arrays.
In the old (loop) version, I used the np.diag() method, which returns exactly what I need (a 2D-array in that case) if I give the values as 1D-array as input. However, when I switched to stacked arrays my input is not a 1D-array anymore, so that the np.diag() method returns a copy of the diagonal of my 2D-input instead.
Old version with 1D input:
import numpy as np
vals = np.array([1,2,3])
mat = np.diag(vals)
print(mat.shape)
Out: (3, 3)
New version with 2D input:
vals_stack = np.repeat(np.expand_dims(vals, axis=0), 5, axis=0)
# btw: is there a better way to repeat/stack my array?
mat_stack = np.diag(vals_stack)
print(mat_stack.shape)
Out: (3,)
So you can see that np.diag() returns a 1D array (as expected from the documentation), but I actually need a stack of 2D arrays. So the shape of mat_stack must be (5,3,3) and not (3,). Is there any function for that in NumPy? Or do I have to loop over that additional dimension like this:
def mydiag(stack):
    diag = np.zeros([stack.shape[0], stack.shape[1], stack.shape[1]])
    for i in np.arange(stack.shape[0]):
        # pass the 1D row itself; wrapping it in a list makes np.diag extract a diagonal instead
        diag[i,:,:] = np.diag(stack[i,:])
    return diag
In NumPy you can use apply_along_axis. The end of its documentation even has an example for this specific case. So the answer is:
np.apply_along_axis(np.diag, -1, vals_stack)
A more pythonic way would be something like this:
[np.diag(row) for row in vals_stack]
Is this what you had in mind:
In [499]: x = np.arange(12).reshape(4,3)
In [500]: X = np.zeros((4,3,3),int)
In [501]: X[np.arange(4)[:,None],np.arange(3), np.arange(3)] = x
In [502]: X
Out[502]:
array([[[ 0, 0, 0],
[ 0, 1, 0],
[ 0, 0, 2]],
[[ 3, 0, 0],
[ 0, 4, 0],
[ 0, 0, 5]],
[[ 6, 0, 0],
[ 0, 7, 0],
[ 0, 0, 8]],
[[ 9, 0, 0],
[ 0, 10, 0],
[ 0, 0, 11]]])
X[0,np.arange(3), np.arange(3)] indexes the diagonal on the first plane. np.arange(4)[:,None] is a (4,1) array, which broadcasts with a (3,) to index a (4,3) block, matching the size of x.
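Another loop-free option, offered here as a sketch rather than as one of the original answers, is to broadcast the values against an identity matrix. It also uses np.tile as one possible answer to the side question about repeating the array; the variable names follow the question.

```python
import numpy as np

vals = np.array([1, 2, 3])
vals_stack = np.tile(vals, (5, 1))               # shape (5, 3): vals repeated 5 times

# mat_stack[b, i, j] = vals_stack[b, i] * eye[i, j], so only the diagonal survives
n = vals_stack.shape[1]
mat_stack = vals_stack[:, :, None] * np.eye(n, dtype=vals_stack.dtype)

print(mat_stack.shape)                           # (5, 3, 3)
```

This allocates the full (5, 3, 3) result in one step; for very large n the multiplication by mostly zeros is wasteful, and the indexed assignment shown above may be preferable.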

Editing a Python 2-dimensional array without a for loop?

So, I have a given 2 dimensional matrix which is randomly generated:
a = np.random.randn(4,4)
which gives output:
array([[-0.11449491, -2.7777728 , -0.19784241, 1.8277976 ],
[-0.68511473, 0.40855461, 0.06003551, -0.8779363 ],
[-0.55650378, -0.16377137, 0.10348714, -0.53449633],
[ 0.48248298, -1.12199767, 0.3541335 , 0.48729845]])
I want to change all the negative values to 0 and all the positive values to 1.
How can I do this without a for loop?
You can use np.where()
import numpy as np
a = np.random.randn(4,4)
a = np.where(a<0, 0, 1)
print(a)
[[1 1 0 1]
[1 0 1 0]
[1 1 0 0]
[0 1 1 0]]
(a>0).astype(int)
This is one possible solution: convert the array to a boolean array according to your condition (True where the value is positive), then convert it from boolean to integer.
array([[ 0.63694991, -0.02785534, 0.07505496, 1.04719295],
[-0.63054947, -0.26718763, 0.34228736, 0.16134474],
[ 1.02107383, -0.49594998, -0.11044738, 0.64459594],
[ 0.41280766, 0.668819 , -1.0636972 , -0.14684328]])
And the result:
(a>0).astype(int)
>>> array([[1, 0, 1, 1],
[0, 0, 1, 1],
[1, 0, 0, 1],
[1, 1, 0, 0]])
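One detail the question leaves open is what should happen to an exact zero. The sketch below uses made-up values, including a zero, to show that np.where(a < 0, 0, 1) and a comparison cast such as (a > 0).astype(int) agree everywhere except at zero.

```python
import numpy as np

# Made-up values for illustration, including an exact zero
a = np.array([[-0.5, 0.0, 2.0],
              [ 1.5, -3.0, 0.25]])

via_where = np.where(a < 0, 0, 1)    # zero is "not negative", so it maps to 1
via_cast = (a > 0).astype(int)       # zero is "not positive", so it maps to 0

print(via_where)
# [[0 1 1]
#  [1 0 1]]
print(via_cast)
# [[0 0 1]
#  [1 0 1]]
```

With values drawn from np.random.randn an exact zero is extremely unlikely, so either form works there; the choice only matters for data that can contain zeros.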

Numpy Dot product with nested array

I'm trying to come up with a method to perform load combinations and transient load patterning for structural/civil engineering applications.
Without patterning it's fairly simple:
list of load results = [[d],[t1],...,[ti]], where [ti] = transient load result as a numpy array = A
list of combos = [[1,0,....,0],[0,1,....,1], [dfi, tf1,.....,tfi]] , where tfi = code load factor for transient load = B
In Python this works as numpy.dot(A,B).
My issue arises where:
`list of load results = [[d],[t1],.....[ti]]`, where [t1] = [[t11]......[t1i]] for i pattern possibilities and [t1i] = numpy array
So I have a nested array within another array and want to multiply it by a matrix of load combinations. Is there a way to implement this in one matrix operation? I can come up with a method that loops over the pattern possibilities and then takes a dot product with the load combos, but this is computationally expensive. Any thoughts?
Thanks
For an example not considering patterning, see: https://github.com/buddyd16/Structural-Engineering/blob/master/Analysis/load_combo_test.py
Essentially, I need a method that gives similar results, assuming that for loads = np.array([[D],[Ex],[Ey],[F],[H],[L],[Lr],[R],[S],[Wx],[Wy]]) --> [L],[Lr],[R],[S] are actually nested arrays, i.e. if D is a 1x500 array/vector, then L, Lr, R, or S could be a 100x500 array.
My simple solution is:
combined_pattern = []
for pattern in load_patterns:
    loads = np.array([[D],[Ex],[Ey],[F],[H],[L[pattern]],[Lr[pattern]],[R[pattern]],[S[pattern]],[Wx],[Wy]])
    combined_pattern.append(np.dot(basic_factors, loads))
Simpler Example:
import numpy as np
#Simple
A = np.array([1,0,0])
B = np.array([0,1,0])
C = np.array([0,0,1])
Loads = np.array([A,B,C])
Factors = np.array([[1,1,1],[0.5,0.5,0.5],[0.25,0.25,0.25]])
result = np.dot(Factors, Loads)
# Looking for a faster way to accomplish the below operation
# this works but will be slow for large data sets
# bi can be up to 1x5000 in size and i can be up to 500
A = np.array([1,0,0])
b1 = np.array([1,0,0])
b2 = np.array([0,1,0])
b3 = np.array([0,0,1])
B = np.array([b1,b2,b3])
C = np.array([0,0,1])
result_list = []
for load in B:
    Loads = np.array([A,load,C])
    Factors = np.array([[1,1,1],[0.5,0.5,0.5],[0.25,0.25,0.25]])
    result = np.dot(Factors, Loads)
    result_list.append(result)
edit: Had Factors and Loads reversed in the np.dot().
In your simple example, the array shapes are:
In [2]: A.shape
Out[2]: (3,)
In [3]: Loads.shape
Out[3]: (3, 3)
In [4]: Factors.shape
Out[4]: (3, 3)
In [5]: result.shape
Out[5]: (3, 3)
The rule in dot is that the last dimension of the first argument (Loads here) pairs with the second-to-last dimension of the second argument (Factors):
result = np.dot(Loads,Factors)
(3,3) dot (3,3) => (3,3) # 3's in common
(m,n) dot (n,l) => (m,l) # n's in common
In the iteration, A,load and C are all (3,) and Loads is (3,3).
result_list is a list of 3 (3,3) arrays, and np.array(result_list) would be (3,3,3).
Let's make a 3d array of all the Loads:
In [16]: Bloads = np.array([np.array([A,load,C]) for load in B])
In [17]: Bloads.shape
Out[17]: (3, 3, 3)
In [18]: Bloads
Out[18]:
array([[[1, 0, 0],
[1, 0, 0],
[0, 0, 1]],
[[1, 0, 0],
[0, 1, 0],
[0, 0, 1]],
[[1, 0, 0],
[0, 0, 1],
[0, 0, 1]]])
I can easily do a dot of this Bloads and Factors with einsum:
In [19]: np.einsum('lkm,mn->lkn', Bloads, Factors)
Out[19]:
array([[[1. , 1. , 1. ],
[1. , 1. , 1. ],
[0.25, 0.25, 0.25]],
[[1. , 1. , 1. ],
[0.5 , 0.5 , 0.5 ],
[0.25, 0.25, 0.25]],
[[1. , 1. , 1. ],
[0.25, 0.25, 0.25],
[0.25, 0.25, 0.25]]])
einsum isn't the only way, but it's the easiest way (for me) to keep track of dimensions.
It's even easier to keep dimensions straight if they differ. Here they are all 3, so it's hard to keep them separate. But if B was (5,4) and Factors (4,2), then Bloads would be (5,3,4), and the einsum result (5,3,2) (the size 4 dropping out in the dot).
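The shape bookkeeping described above can be checked with a small sketch. The sizes (5, 4) and (4, 2) come from the text; the random values and the seed are placeholders chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed
B = rng.random((5, 4))           # 5 patterns, each of length 4
A = rng.random(4)
C = rng.random(4)
Factors = rng.random((4, 2))

# One (3, 4) block per pattern: rows are A, the pattern, and C
Bloads = np.array([np.array([A, load, C]) for load in B])
print(Bloads.shape)              # (5, 3, 4)

# The shared index m (size 4) is summed over and drops out of the result
result = np.einsum('lkm,mn->lkn', Bloads, Factors)
print(result.shape)              # (5, 3, 2)
```

The same product can be written as Bloads @ Factors, since matmul broadcasts the 2-D Factors over the leading axis of Bloads.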
Constructing Bloads without a loop is a bit trickier, since the rows of B are interleaved with A and C.
In [38]: np.stack((A[None,:].repeat(3,0),B,C[None,:].repeat(3,0)),1)
Out[38]:
array([[[1, 0, 0],
[1, 0, 0],
[0, 0, 1]],
[[1, 0, 0],
[0, 1, 0],
[0, 0, 1]],
[[1, 0, 0],
[0, 0, 1],
[0, 0, 1]]])
To understand this, test the subexpressions, e.g. A[None,:], the repeat, etc.
Equivalently:
np.array((A[None,:].repeat(3,0),B,C[None,:].repeat(3,0))).transpose(1,0,2)
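Putting the pieces together, the loop in the question computes np.dot(Factors, Loads) for each pattern, with Factors as the first argument. A hedged sketch of a loop-free version in that order follows; it assumes A, C and every row of B have the same length, and the names Bloads, vectorized and looped are used only for illustration.

```python
import numpy as np

A = np.array([1, 0, 0])
B = np.array([[1, 0, 0],
              [0, 1, 0],
              [0, 0, 1]])          # one row per pattern
C = np.array([0, 0, 1])
Factors = np.array([[1, 1, 1],
                    [0.5, 0.5, 0.5],
                    [0.25, 0.25, 0.25]])

# Stack A, each row of B, and C into shape (patterns, 3 loads, length).
# np.broadcast_to repeats A and C without copying them.
Bloads = np.stack((np.broadcast_to(A, B.shape), B, np.broadcast_to(C, B.shape)), axis=1)

# matmul broadcasts the 2-D Factors over the leading pattern axis,
# so vectorized[p] equals Factors @ Bloads[p]
vectorized = Factors @ Bloads

# Reference: the loop from the question, collected into one array
looped = np.array([np.dot(Factors, np.array([A, load, C])) for load in B])
```

The equivalent einsum call for this argument order would be np.einsum('ij,pjk->pik', Factors, Bloads).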
