square root of sum of square of columns in multidimensional array - python

I am using a multidimensional list with numpy.
I have a list:
l = [[0 2 8] [0 2 7] [0 2 5] [2 4 5] [ 8 4 7]]
I need to find the square root of the sum of squares of each column.
0 2 8
0 2 7
0 2 5
2 4 5
8 4 7
The expected output is:
l = [sqrt(square(0) + square(0) + square(0) + square(2) + square(8)), sqrt(square(2) + square(2) + square(2) + square(4) + square(4)), sqrt(square(8) + square(7) + square(5) + square(5) + square(7))]

>>> import numpy as np
>>> a = np.array([[0, 2, 8], [0, 2, 7], [0, 2, 5], [2, 4, 5], [ 8, 4, 7]])
>>> np.sqrt(np.sum(np.square(a), axis=0))
array([ 8.24621125, 6.63324958, 14.56021978])

>>> import numpy as np
>>> l = [[0, 2, 8], [0, 2, 7], [0, 2, 5], [2, 4, 5], [8, 4, 7]]
>>> np.sum(np.array(l)**2,axis=0)**.5
array([ 8.24621125, 6.63324958, 14.56021978])

Use the standard function numpy.linalg.norm for this...
import numpy as np
a = np.array([[0, 2, 8], [0, 2, 7], [0, 2, 5], [2, 4, 5], [ 8, 4, 7]])
np.linalg.norm(a,axis=0)
gives:
array([ 8.24621125, 6.63324958, 14.56021978])
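As a quick sanity check (a sketch, not part of the original answers), the column norm can be compared against the explicit formula from the question. The names explicit and via_norm are chosen here purely for illustration.

```python
import numpy as np

a = np.array([[0, 2, 8], [0, 2, 7], [0, 2, 5], [2, 4, 5], [8, 4, 7]])

# Explicit formula: square every entry, sum down each column, take the root.
explicit = np.sqrt(np.sum(np.square(a), axis=0))

# Same quantity via the Euclidean (2-)norm of each column.
via_norm = np.linalg.norm(a, axis=0)

print(np.allclose(explicit, via_norm))  # True
```

With axis=0 both expressions reduce over the rows, leaving one value per column.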

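What you want to do is use map/reduce.
In theory it can be done using nested for loops, but it can be done in a more functional way...
for each column in matrix:
    sum the squares of all its elements
    return the square root of the sum
from functools import reduce
from math import sqrt
A one-liner (zip(*matrix) iterates over the columns):
list(map(lambda col: sqrt(reduce(lambda r, z: r + z**2, col, 0)), zip(*matrix)))
But to make it clearer, you could rewrite it as:
def SumOfSquare(lst):
    # start from 0 so that the first element is squared too
    return reduce(lambda r, x: r + x**2, lst, 0)
def ListOfRoot(matrix):
    # zip(*matrix) yields the columns of a list of lists
    return [sqrt(SumOfSquare(col)) for col in zip(*matrix)]
s = ListOfRoot(matrix)
I misread the question; this solution does not use numpy.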

Related

Find the row index number of an array in a 2D numpy array

If I have a 2D numpy array A:
[[6 9 6]
[1 1 2]
[8 7 3]]
And I have access to the array [1 1 2]. Clearly, [1 1 2] is at index 1 of array A. But how do I find that index programmatically?
Find the index of the matching row with np.where:
import numpy as np
a = np.array([[6, 9, 6],
[1, 1, 2],
[8, 7, 3]])
row = [1, 1, 2]
i = np.where(np.all(a==row, axis=1))
print(i[0][0])
np.where returns a tuple of index arrays (one per dimension), which is why you need to apply [0][0] consecutively to obtain a single int.
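Note that indexing with [0][0] raises an IndexError when no row matches. A minimal sketch of a wrapper that handles this case is given here; find_row_index is a hypothetical helper name, and returning -1 for "not found" is an assumption, not something from the answer.

```python
import numpy as np

def find_row_index(a, row):
    """Return the index of the first row of `a` equal to `row`, or -1 if absent."""
    # np.where on a 1-D boolean array gives a 1-tuple; [0] is the array of matching indices.
    matches = np.where(np.all(a == row, axis=1))[0]
    return int(matches[0]) if matches.size else -1

a = np.array([[6, 9, 6],
              [1, 1, 2],
              [8, 7, 3]])
print(find_row_index(a, [1, 1, 2]))  # 1
print(find_row_index(a, [0, 0, 0]))  # -1
```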
One option:
a = np.array([[6, 9, 6],
[1, 1, 2],
[8, 7, 3]])
b = np.array([1, 1, 2])
np.nonzero((a == b).all(1))[0]
output: [1]
arr1 = [[6,9,6],[1,1,2],[8,7,3]]
ind = arr1.index([1,1,2])
Output:
ind = 1
EDIT for 2D np.array:
arr1 = np.array([[6,9,6],[1,1,2],[8,7,3]])
ind = [l for l in range(len(arr1)) if (arr1[l,:] == np.array([1,1,2])).all()]
import numpy as np
a = np.array([[6, 9, 6],
[1, 1, 2],
[8, 7, 3]])
b = np.array([1, 1, 2])
[x for x,y in enumerate(a) if (y==b).all()] # here enumerate will keep the track of index
#output
[1]

how to produce a matrix of the x[0]+y[0], x[1]+y[0]....... x[n]+y[n]

I have two arrays x and y, both lists of lists, and I am trying to produce a matrix containing the sum of every row of x with every row of y.
x = numpy.array ([[1,1,1,1],[2,2,2,2]])
y = numpy.array ([[0,1,2,3],[4,5,6,7]])
result: [x[0]+y[0], x[0]+y[1], x[1]+y[0], x[1]+y[1]]
=> numpy.array ([[1,2,3,4],[5,6,7,8],[2,3,4,5],[6,7,8,9]])
Do I have to reshape y before computing this?
Is there a smarter and more efficient way to achieve it?
Thank you
With broadcasting followed by a reshape:
In [138]: x
Out[138]:
array([[1, 1, 1, 1],
[2, 2, 2, 2]])
In [139]: y
Out[139]:
array([[0, 1, 2, 3],
[4, 5, 6, 7]])
In [140]: (np.expand_dims(x, 1) + y).reshape(-1, x.shape[-1])
Out[140]:
array([[1, 2, 3, 4],
[5, 6, 7, 8],
[2, 3, 4, 5],
[6, 7, 8, 9]])
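To see why this works, it can help to look at the intermediate shapes. The sketch below uses x[:, None, :], which is equivalent to np.expand_dims(x, 1); the variable name pairwise is only illustrative.

```python
import numpy as np

x = np.array([[1, 1, 1, 1], [2, 2, 2, 2]])
y = np.array([[0, 1, 2, 3], [4, 5, 6, 7]])

# x[:, None, :] has shape (2, 1, 4); y has shape (2, 4) and is treated as (1, 2, 4).
# Broadcasting pairs every row of x with every row of y, giving shape (2, 2, 4).
pairwise = x[:, None, :] + y
print(pairwise.shape)  # (2, 2, 4)

# Collapsing the first two axes lists the sums in the order
# x[0]+y[0], x[0]+y[1], x[1]+y[0], x[1]+y[1].
result = pairwise.reshape(-1, x.shape[-1])
print(result)
```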
You can express the sum of all row combinations between the two arrays using numpy.repeat and numpy.tile:
import numpy as np
x = np.array ([[1,1,1,1],[2,2,2,2],[3,3,3,3]])
y = np.array ([[0,1,2,3],[4,5,6,7]])
x_height = x.shape[0]
y_height = y.shape[0]
result = np.repeat(x, y_height, axis=0) + np.tile(y, (x_height, 1))
print(result)
results in
[[ 1 2 3 4] # x[0] + y[0]
[ 5 6 7 8] # x[0] + y[1]
[ 2 3 4 5] # x[1] + y[0]
[ 6 7 8 9] # x[1] + y[1]
[ 3 4 5 6] # x[2] + y[0]
[ 7 8 9 10]] # x[2] + y[1]
This generalizes to x and y with arbitrary numbers of rows.

How to get the indexes of the greatest N values greater than a threshold in Numpy?

For a project I need to be able to get, from an array with shape (k, m), the indexes of the N greatest values of each row that are also greater than a fixed threshold.
For example, if k=3, m=5, N=3, the threshold is 5, and the array is:
[[3 2 6 7 0],
[4 1 6 4 0],
[7 10 6 9 8]]
I should get the result (or the flattened version, I don't care) :
[[2, 3],
[2],
[1, 3, 4]]
The indexes don't have to be sorted.
My code is currently :
indexes = []
for row, inds in enumerate(np.argsort(results, axis=1)[:, -N:]):
    for index in inds:
        if results[row, index] > threshold:
            indexes.append(index)
but I feel like I am not using Numpy to its full capacity.
Does anybody know a better and more elegant solution ?
How about this method:
import numpy as np
arr = np.array(
[[3, 2, 6, 7, 0],
[4, 1, 6, 4, 0],
[7, 10, 6, 9, 8]]
)
t = 5
n = 3
sorted_idxs = arr.argsort(1)[:, -n:]
sorted_arr = np.sort(arr, 1)[:, -n:]
item_nums = np.cumsum((sorted_arr > t).sum(1))
masked_idxs = sorted_idxs[sorted_arr > t]
idx_lists = np.split(masked_idxs, item_nums[:-1])  # drop the last cumulative count to avoid a trailing empty array
output:
[array([2, 3]), array([2]), array([4, 3, 1])]
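Since the indexes don't have to be sorted, a variant based on np.argpartition avoids fully sorting each row. This is only a sketch under the assumption that 1 <= n <= m; top_n_above is a hypothetical helper name, not something from the answer above.

```python
import numpy as np

def top_n_above(arr, n, t):
    """For each row, return indices of the (at most) n largest values that exceed t."""
    # argpartition puts the n largest entries of each row in the last n slots (unordered).
    top_idx = np.argpartition(arr, -n, axis=1)[:, -n:]
    top_val = np.take_along_axis(arr, top_idx, axis=1)
    keep = top_val > t
    # Rows can keep different numbers of indices, so the result is a list of arrays.
    return [row_idx[row_keep] for row_idx, row_keep in zip(top_idx, keep)]

arr = np.array([[3, 2, 6, 7, 0],
                [4, 1, 6, 4, 0],
                [7, 10, 6, 9, 8]])
for row in top_n_above(arr, n=3, t=5):
    print(sorted(row.tolist()))
```

The order of indices within each row is not guaranteed, which is why the example sorts them before printing.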

Numpy: calculate edges of a matrix

I have the following to calculate the difference of a matrix, i.e. the i-th element - the (i-1) element.
How can I (easily) calculate the difference for each element horizontally and vertically? With a transpose?
import numpy as np
inputarr = np.arange(12)
inputarr.shape = (3,4)
inputarr+=1
#shift one position
newarr = list()
for x in inputarr:
    newarr.append(np.hstack((np.array([0]),x[:-1])))
z = np.array(newarr)
print(inputarr)
print('first differences')
print(inputarr-z)
Output
[[ 1 2 3 4]
[ 5 6 7 8]
[ 9 10 11 12]]
first differences
[[1 1 1 1]
[5 1 1 1]
[9 1 1 1]]
Check out numpy.diff.
From the documentation:
Calculate the n-th order discrete difference along given axis.
The first order difference is given by out[n] = a[n+1] - a[n] along
the given axis, higher order differences are calculated by using diff
recursively.
An example:
>>> import numpy as np
>>> a = np.arange(12).reshape((3,4))
>>> a
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
>>> np.diff(a,axis = 1) # row-wise
array([[1, 1, 1],
[1, 1, 1],
[1, 1, 1]])
>>> np.diff(a, axis = 0) # column-wise
array([[4, 4, 4, 4],
[4, 4, 4, 4]])
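Note that np.diff returns one fewer element along the chosen axis, whereas the output in the question keeps the original shape by treating the element before the first one as 0. Assuming NumPy 1.16 or later, the prepend argument can reproduce that; the sketch below uses the question's inputarr, and the names horizontal and vertical are illustrative.

```python
import numpy as np

inputarr = np.arange(1, 13).reshape(3, 4)

# prepend=0 pads a zero before the first element along the chosen axis,
# so the output keeps the input's shape.
horizontal = np.diff(inputarr, axis=1, prepend=0)
vertical = np.diff(inputarr, axis=0, prepend=0)

print(horizontal)
print(vertical)
```

Here horizontal matches the "first differences" output in the question, and vertical is the same computation down the columns, so no transpose is needed.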

Elementwise mean of dot product in Python (numpy)

I have two numpy matrixes (or sparse equivalents) like:
>>> A = numpy.array([[1,0,2],[3,0,0],[4,5,0],[0,2,2]])
>>> A
array([[1, 0, 2],
[3, 0, 0],
[4, 5, 0],
[0, 2, 2]])
>>> B = numpy.array([[2,3],[3,4],[5,0]])
>>> B
array([[2, 3],
[3, 4],
[5, 0]])
>>> C = mean_dot_product(A, B)
>>> C
array([[6 , 3],
[6 , 9],
[11.5, 16],
[8 , 8]])
where C[i, j] = sum(A[i,k] * B[k,j]) / count_nonzero(A[i,k] * B[k,j])
Is there a fast way to perform this operation in numpy?
A non ideal solution is:
>>> maskA = A > 0
>>> maskB = B > 0
>>> maskA.dtype=numpy.uint8
>>> maskB.dtype=numpy.uint8
>>> D = replace_zeros_with_ones(numpy.dot(maskA,maskB))
>>> C = numpy.dot(A,B) / D
Anyone have a better algorithm?
Further, if A or B is a sparse matrix, making it dense (replacing zeros with ones) makes memory usage explode!
Why do you need replace_zeros_with_ones? I deleted that line, ran your code, and got the right result.
You can do this in one line if none of the numbers are negative:
np.dot(A, B)/np.dot(np.sign(A), np.sign(B))
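One caveat with the one-liner: if some C[i, j] has no nonzero terms, the denominator is 0 and the division yields nan or inf with a warning. A minimal sketch that guards against this for dense arrays follows. The name mean_dot_product is taken from the question, and returning 0 for such entries is an assumption made here.

```python
import numpy as np

def mean_dot_product(A, B):
    """Dot product where each entry is divided by the number of nonzero terms in its sum."""
    sums = np.dot(A, B).astype(float)
    # Cast the masks to int: the dot product of two boolean arrays would stay boolean.
    counts = np.dot((A != 0).astype(int), (B != 0).astype(int))
    # Divide only where at least one term is nonzero; other entries stay 0.
    return np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)

A = np.array([[1, 0, 2], [3, 0, 0], [4, 5, 0], [0, 2, 2]])
B = np.array([[2, 3], [3, 4], [5, 0]])
print(mean_dot_product(A, B))
```

Using A != 0 rather than np.sign(A) means the counts stay correct even when negative values are present.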
