I want to create an N x N array in NumPy such that the diagonal is zero and [x, y] = -[y, x].
For example:
np.array([[  0,  12,  2],
          [-12,   0,  3],
          [ -2,  -3,  0]])
The values inside the array can be any float.
One way would be with scipy.spatial.distance.squareform -
import numpy as np
from scipy.spatial.distance import squareform

def diag_inverted(n):
    l = n*(n-1)//2                        # number of independent off-diagonal entries
    out = squareform(np.random.randn(l))  # symmetric matrix with zero diagonal
    out[np.tri(len(out), k=-1, dtype=bool)] *= -1  # negate the lower triangle
    return out
Another with array-assignment and masking -
def diag_inverted_v2(n):
    l = n*(n-1)//2
    m = np.tri(n, k=-1, dtype=bool)  # strictly lower-triangular mask
    out = np.zeros((n, n), dtype=float)
    out[m] = np.random.randn(l)      # fill the lower triangle
    out[m.T] = -out.T[m.T]           # mirror with opposite sign into the upper triangle
    return out
Sample runs -
In [148]: diag_inverted(2)
Out[148]:
array([[ 0. , -0.97873798],
[ 0.97873798, 0. ]])
In [149]: diag_inverted(3)
Out[149]:
array([[ 0. , -2.2408932 , -1.86755799],
[ 2.2408932 , 0. , 0.97727788],
[ 1.86755799, -0.97727788, 0. ]])
In [150]: diag_inverted(4)
Out[150]:
array([[ 0. , -0.95008842, 0.15135721, -0.4105985 ],
[ 0.95008842, 0. , 0.10321885, -0.14404357],
[-0.15135721, -0.10321885, 0. , -1.45427351],
[ 0.4105985 , 0.14404357, 1.45427351, 0. ]])
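A quick sanity check (a minimal sketch, assuming the functions above) confirms both required properties:
out = diag_inverted(5)
assert np.allclose(np.diag(out), 0)  # zero diagonal
assert np.allclose(out, -out.T)      # out[x, y] == -out[y, x]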
Here you go:
size = 3
a = np.random.normal(0, 1, (size, size))
ret = (a - a.transpose())/2  # antisymmetric part of a: ret == -ret.T, zero diagonal
Output (random):
array([[ 0. , 0.11872306, 0.46792054],
[-0.11872306, 0. , 0.12530741],
[-0.46792054, -0.12530741, 0. ]])
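Note that each off-diagonal entry of (a - a.T)/2 is the average of two independent standard normals, so its variance is 1/2 rather than 1. If the entries should stay standard normal, dividing by sqrt(2) instead restores unit variance (a sketch):
ret = (a - a.transpose()) / np.sqrt(2)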
I want to standardize my input data for a neural network.
The data looks like this:
data = np.array([[0,0,0,0,233,2,0,0,0],[0,0,0,23,50,2,0,0,0],[0,0,0,0,3,20,3,0,0]])
This is the function that I used. It doesn't work because of the all-zero columns: it divides by a standard deviation of zero.
def standardize(data):  # dataframe
    _, c = data.shape
    data_standardized = data.copy(deep=True)
    for j in range(c):
        x = data_standardized.iloc[:, j]
        avg = x.mean()
        std = x.std()
        x_standardized = (x - avg) / std
        data_standardized.iloc[:, j] = x_standardized
    return data_standardized
Use boolean indexing to avoid dividing by zero:
In [90]: data= np.array([[0,0,0,0,233,2,0,0,0],[0,0,0,23,50,2,0,0,0],[0,0,0,0,3,20,3,0,0]])
In [91]: new = np.zeros(data.shape)
In [92]: m = data.mean(0)
In [93]: std = data.std(0)
In [94]: r = data-m
In [95]: new[:,std.nonzero()] = r[:,std.nonzero()]/std[std.nonzero()]
In [96]: new
Out[96]:
array([[ 0. , 0. , 0. , -0.70710678, 1.3875163 ,
-0.70710678, -0.70710678, 0. , 0. ],
[ 0. , 0. , 0. , 1.41421356, -0.45690609,
-0.70710678, -0.70710678, 0. , 0. ],
[ 0. , 0. , 0. , -0.70710678, -0.9306102 ,
1.41421356, 1.41421356, 0. , 0. ]])
Or use sklearn.preprocessing.StandardScaler.
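A minimal sketch of the StandardScaler route (it subtracts the per-column mean and divides by the per-column std, and leaves zero-variance columns at zero rather than dividing by zero):
from sklearn.preprocessing import StandardScaler
new = StandardScaler().fit_transform(data)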
Your function refactored:
import numpy as np
import pandas as pd

def standardize(data):  # dataframe
    data = data.values
    new = np.zeros(data.shape)
    m = data.mean(0)
    std = data.std(0)
    r = data - m                     # center the data
    nz = std.nonzero()[0]            # indices of columns with nonzero std
    new[:, nz] = r[:, nz] / std[nz]  # standardize only those columns
    return pd.DataFrame(new)
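Usage (a sketch, wrapping the data array from above in a DataFrame):
df = pd.DataFrame(data)
print(standardize(df))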
I have the following program
import numpy as np
arr = np.random.randn(3,4)
print(arr)
regArr = (arr > 0.8)
print(regArr)
print(arr[regArr].reshape(arr.shape))
output:
[[ 0.37182134 1.4807685 0.11094223 0.34548185]
[ 0.14857641 -0.9159358 -0.37933393 -0.73946522]
[ 1.01842304 -0.06714827 -1.22557205 0.45600827]]
I am looking for an output based on arr where values greater than 0.8 are kept and all other values are set to zero.
I tried boolean masking as shown above, but I am not able to solve this. Kindly help.
I'm not entirely sure what exactly you want to achieve, but this is what I did to filter.
arr = np.random.randn(3,4)
array([[-0.04790508, -0.71700005, 0.23204224, -0.36354634],
[ 0.48578236, 0.57983561, 0.79647091, -1.04972601],
[ 1.15067885, 0.98622772, -0.7004639 , -1.28243462]])
arr[arr < 0.8] = 0
array([[0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. ],
[1.15067885, 0.98622772, 0. , 0. ]])
Thanks to user3053452, I have added one more solution in which the original data is not changed.
arr = np.random.randn(3,4)
array([[ 0.4297907 , 0.38100702, 0.30358291, -0.71137138],
[ 1.15180635, -1.21251676, 0.04333404, 1.81045931],
[ 0.17521058, -1.55604971, 1.1607159 , 0.23133528]])
new_arr = np.where(arr < 0.8, 0, arr)
array([[0. , 0. , 0. , 0. ],
[1.15180635, 0. , 0. , 1.81045931],
[0. , 0. , 1.1607159 , 0. ]])
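For completeness, multiplying by the boolean mask gives the same result and also leaves arr untouched (a minimal sketch):
new_arr = arr * (arr >= 0.8)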
The question sounds very basic. But when I try to use where or boolean conditions on numpy arrays, it always returns a flattened array.
I have the NumPy array
P = array([[ 0.49530662, 0.07901 , -0.19012371],
[ 0.1421513 , 0.48607405, -0.20315014],
[ 0.76467375, 0.16479826, -0.56598029],
[ 0.53530718, -0.21166188, -0.08773241]])
I want to extract the array of only negative values, but when I try
P[P<0]
array([-0.19012371, -0.20315014, -0.56598029, -0.21166188, -0.08773241])
P[np.where(P<0)]
array([-0.19012371, -0.20315014, -0.56598029, -0.21166188, -0.08773241])
I get a flattened array. How can I extract the array of the form
array([[ 0.        ,  0.        , -0.19012371],
       [ 0.        ,  0.        , -0.20315014],
       [ 0.        ,  0.        , -0.56598029],
       [ 0.        , -0.21166188, -0.08773241]])
I do not wish to create a temp array and then use something like Temp[Temp>=0] = 0
Since your need is:
I want to "extract" the array of only negative values
You can use numpy.where() with your condition (checking for negative values), which preserves the dimensions of the array, as in the example below:
In [61]: np.where(P<0, P, 0)
Out[61]:
array([[ 0. , 0. , -0.19012371],
[ 0. , 0. , -0.20315014],
[ 0. , 0. , -0.56598029],
[ 0. , -0.21166188, -0.08773241]])
where P is your input array.
Another idea could be to use numpy.zeros_like() to initialize an array of the same shape and numpy.where() to gather the indices at which our condition is satisfied.
# initialize our result array with zeros
In [106]: non_positives = np.zeros_like(P)
# gather the indices where our condition is obeyed
In [107]: idxs = np.where(P < 0)
# copy the negative values to correct indices
In [108]: non_positives[idxs] = P[idxs]
In [109]: non_positives
Out[109]:
array([[ 0. , 0. , -0.19012371],
[ 0. , 0. , -0.20315014],
[ 0. , 0. , -0.56598029],
[ 0. , -0.21166188, -0.08773241]])
Yet another idea would be to simply use the barebones numpy.clip() API, which returns a new array if we omit the out= kwarg.
In [22]: np.clip(P, -np.inf, 0) # P.clip(-np.inf, 0)
Out[22]:
array([[ 0. , 0. , -0.19012371],
[ 0. , 0. , -0.20315014],
[ 0. , 0. , -0.56598029],
[ 0. , -0.21166188, -0.08773241]])
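If modifying P in place is acceptable, passing the out= kwarg avoids allocating a new array (a sketch):
np.clip(P, -np.inf, 0, out=P)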
This should work: select all elements that are greater than or equal to 0 and set them to 0; this preserves the dimensions! I got the idea from here: Replace all elements of Python NumPy Array that are greater than some value
Also note that this modifies the original array; I haven't used a temp array here.
import numpy as np
P = np.array([[ 0.49530662, 0.07901 , -0.19012371],
[ 0.1421513 , 0.48607405, -0.20315014],
[ 0.76467375, 0.16479826, -0.56598029],
[ 0.53530718, -0.21166188, -0.08773241]])
P[P >= 0] = 0
print(P)
The output will be
[[ 0. 0. -0.19012371]
[ 0. 0. -0.20315014]
[ 0. 0. -0.56598029]
[ 0. -0.21166188 -0.08773241]]
As noted below, this will modify the array; to preserve the original array, use np.where(P<0, P, 0) as follows (thanks #kmario123):
import numpy as np
P = np.array([[ 0.49530662, 0.07901 , -0.19012371],
[ 0.1421513 , 0.48607405, -0.20315014],
[ 0.76467375, 0.16479826, -0.56598029],
[ 0.53530718, -0.21166188, -0.08773241]])
print( np.where(P<0, P, 0))
print(P)
The output will be
[[ 0. 0. -0.19012371]
[ 0. 0. -0.20315014]
[ 0. 0. -0.56598029]
[ 0. -0.21166188 -0.08773241]]
[[ 0.49530662 0.07901 -0.19012371]
[ 0.1421513 0.48607405 -0.20315014]
[ 0.76467375 0.16479826 -0.56598029]
[ 0.53530718 -0.21166188 -0.08773241]]
I have a numpy array A with shape (M,N). I want to create a new array B with shape (M,N,3) where the result would be the same as the following:
import numpy as np
def myfunc(A, sx=1.5, sy=3.5):
    M, N = A.shape
    B = np.zeros((M, N, 3))
    for i in range(M):
        for j in range(N):
            B[i, j, 0] = i*sx
            B[i, j, 1] = j*sy
            B[i, j, 2] = A[i, j]
    return B
A=np.array([[1,2,3],[9,8,7]])
print(myfunc(A))
Giving the result:
[[[0. 0. 1. ]
[0. 3.5 2. ]
[0. 7. 3. ]]
[[1.5 0. 9. ]
[1.5 3.5 8. ]
[1.5 7. 7. ]]]
Is there a way to do it without the loop? I was wondering whether numpy can apply a function element-wise using the indexes of the array. Something like:
def myfuncEW(indx, value, out, vars):
    out[0] = indx[0]*vars[0]
    out[1] = indx[1]*vars[1]
    out[2] = value

M, N = A.shape
B = np.zeros((M, N, 3))
np.applyfunctionelementwise(myfuncEW, A, B, (sx, sy))
You could use mgrid and moveaxis:
>>> M, N = A.shape
>>> I, J = np.mgrid[:M, :N] * np.array((sx, sy))[:, None, None]
>>> np.moveaxis((I, J, A), 0, -1)
array([[[ 0. , 0. , 1. ],
[ 0. , 3.5, 2. ],
[ 0. , 7. , 3. ]],
[[ 1.5, 0. , 9. ],
[ 1.5, 3.5, 8. ],
[ 1.5, 7. , 7. ]]])
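np.moveaxis() accepts any array_like, so the tuple (I, J, A) is first converted to an array of shape (3, M, N); making that step explicit with np.stack may read more clearly (a sketch):
>>> B = np.moveaxis(np.stack((I, J, A)), 0, -1)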
You could use meshgrid and dstack, like this:
import numpy as np
def myfunc(A, sx=1.5, sy=3.5):
    M, N = A.shape
    J, I = np.meshgrid(range(N), range(M))
    return np.dstack((I*sx, J*sy, A))
A=np.array([[1,2,3],[9,8,7]])
print(myfunc(A))
# array([[[ 0. , 0. , 1. ],
# [ 0. , 3.5, 2. ],
# [ 0. , 7. , 3. ]],
#
# [[ 1.5, 0. , 9. ],
# [ 1.5, 3.5, 8. ],
# [ 1.5, 7. , 7. ]]])
By preallocating the 3d array B, you save about half the time compared to stacking I, J and A.
def myfunc(A, sx=1.5, sy=3.5):
    M, N = A.shape
    B = np.zeros((M, N, 3))
    B[:, :, 0] = np.arange(M)[:, None]*sx  # row indices scaled by sx
    B[:, :, 1] = np.arange(N)[None, :]*sy  # column indices scaled by sy
    B[:, :, 2] = A
    return B
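A rough way to check the timing claim (a sketch; myfunc_dstack and myfunc_prealloc are hypothetical names for the two versions above):
import timeit
A = np.random.rand(500, 500)
print(timeit.timeit(lambda: myfunc_dstack(A), number=100))
print(timeit.timeit(lambda: myfunc_prealloc(A), number=100))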
I have the following numpy array
import numpy as np
np.random.seed(20)
np.random.rand(20).reshape(5, 4)
array([[ 0.5881308 , 0.89771373, 0.89153073, 0.81583748],
[ 0.03588959, 0.69175758, 0.37868094, 0.51851095],
[ 0.65795147, 0.19385022, 0.2723164 , 0.71860593],
[ 0.78300361, 0.85032764, 0.77524489, 0.03666431],
[ 0.11669374, 0.7512807 , 0.23921822, 0.25480601]])
For each column I would like to slice it starting at these positions:
position_for_slicing = [0, 3, 4, 4]
So I will get the following array:
array([[ 0.5881308 ,  0.85032764,  0.23921822,  0.25480601],
       [ 0.03588959,  0.7512807 ,  0.        ,  0.        ],
       [ 0.65795147,  0.        ,  0.        ,  0.        ],
       [ 0.78300361,  0.        ,  0.        ,  0.        ],
       [ 0.11669374,  0.        ,  0.        ,  0.        ]])
Is there a fast way to do this? I know I can use a for loop over the columns, but I was wondering if there is a more elegant way to do this.
If "elegant" means "no loop" the following would qualify, but probably not under many other definitions (arr is your input array):
m, n = arr.shape
arrf = np.asanyarray(arr, order='F')
padded = np.r_[arrf, np.zeros_like(arrf)]  # pad with zeros below so out-of-range reads return 0
assert padded.flags['F_CONTIGUOUS']
# sliding window over rows: expnd[i, p, j] == padded[i + p, j]
expnd = np.lib.stride_tricks.as_strided(padded, (m, m+1, n), padded.strides[:1] + padded.strides)
expnd[:, [0, 3, 4, 4], range(4)]
# array([[ 0.5881308 , 0.85032764, 0.23921822, 0.25480601],
# [ 0.03588959, 0.7512807 , 0. , 0. ],
# [ 0.65795147, 0. , 0. , 0. ],
# [ 0.78300361, 0. , 0. , 0. ],
# [ 0.11669374, 0. , 0. , 0. ]])
Please note that order='C' and then 'C_CONTIGUOUS' in the assertion also works. My hunch is that 'F' could be a bit faster because the indexing then operates on contiguous slices.
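A shorter loop-free alternative (a sketch, not part of the original answer) builds the shifted row indices with broadcasting and zeroes the out-of-range slots with np.where; pos is the position_for_slicing list from the question:
m, n = arr.shape
pos = np.array([0, 3, 4, 4])
rows = np.arange(m)[:, None] + pos  # source row for each output slot
safe = np.minimum(rows, m - 1)      # clamp so the fancy index stays in bounds
out = np.where(rows < m, arr[safe, np.arange(n)], 0.0)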