Modifying numpy array with index array with multiplicated indexes [duplicate] - python

This question already has answers here:
numpy.array.__iadd__ and repeated indices [duplicate]
(2 answers)
Closed 6 years ago.
I'm trying to compute a histogram using NumPy array indexing (without explicit iteration over the array). To check that it works as expected, I ran the following test:
import numpy as np
arr = np.zeros(10)
inds = np.array([1,2,3,1,3,5,3])
arr[inds] += 1.0
print(arr)
the result is
[ 0. 1. 1. 1. 0. 1. 0. 0. 0. 0.] instead of
[ 0. 2. 1. 3. 0. 1. 0. 0. 0. 0.].
(i.e. an index that appears several times in the index array is incremented only once)
I'm not sure whether there is a reason for this behavior (perhaps to make these operations order-independent and therefore easier to parallelize).
Is there another way to do this in NumPy?

The OP's script adds 1 only once to each position of arr listed in inds, i.e. at indexes 1, 2, 3 and 5.
A NumPy function well suited to what you need is numpy.bincount().
Because the result of this function has size inds.max() + 1, you have to slice arr to select the elements being updated; otherwise the shapes will not match.
import numpy as np
arr = np.zeros(10)
inds = np.array([1,2,3,1,3,5,3])
values = np.bincount(inds)
print(values)
arr[:values.size]+= values
print(arr)
values will be:
[0 2 1 3 0 1]
and arr will take the form:
array([ 0., 2., 1., 3., 0., 1., 0., 0., 0., 0.])
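If slicing arr feels awkward, np.bincount also accepts a minlength argument, so the counts can be padded to the length of arr up front. A minimal sketch reusing the arrays from the question; it assumes every index in inds is non-negative and smaller than arr.size:

```python
import numpy as np

arr = np.zeros(10)
inds = np.array([1, 2, 3, 1, 3, 5, 3])

# minlength pads the counts with zeros so they match arr's length
counts = np.bincount(inds, minlength=arr.size)
arr += counts
print(arr)
```

If an index could exceed arr.size - 1, the counts would be longer than arr and the addition would fail, so this variant only suits indices known to be in range.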

When the same index appears several times in an in-place operation on a NumPy array, only the last assignment to that index takes effect, so each repeated index is updated once. This follows from how Python defines augmented assignment, and it is mentioned in the NumPy documentation as well:
>>> a = np.arange(5)
>>> a[[0,0,2]]+=1
>>> a
array([1, 1, 3, 3, 4])
Even though 0 occurs twice in the list of indices, the 0th element is only incremented once. This is because Python requires a+=1 to be equivalent to a=a+1.
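For cases where repeated indices should accumulate, NumPy ufuncs provide an at method that performs the operation unbuffered; np.add.at applies the addition once per occurrence of each index. A short sketch using the arrays from the question:

```python
import numpy as np

arr = np.zeros(10)
inds = np.array([1, 2, 3, 1, 3, 5, 3])

# Unbuffered in-place add: every occurrence of an index contributes
np.add.at(arr, inds, 1.0)
print(arr)  # [0. 2. 1. 3. 0. 1. 0. 0. 0. 0.]
```

For plain counting, np.bincount is usually the faster choice; np.add.at is the more general tool because it also works with arbitrary values and multi-dimensional indices.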

Related

Python NumPy, remove unnecessary brackets

In NumPy, transposing a column vector produces a nested (2-D) array.
For example, transposing
[[1.],[2.],[3.]] gives [[1., 2., 3.]] and the dimension of the outermost array is 1. And this produces many errors in my code. Is there a way to produce [1., 2., 3.] directly?
Try .flatten(), .ravel(), .reshape(-1), .squeeze().
Yes, you can use the .flatten() method of a Numpy array to convert a multi-dimensional array into a one-dimensional array. In your case, you can simply call .flatten() on the transposed array to get the desired result:
import numpy as np
column_vector = np.array([[1.], [2.], [3.]])
flattened_array = column_vector.T.flatten()
print(flattened_array)
Output:
[1. 2. 3.]
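The four suggestions above give the same values here but differ in whether they copy the data. A small sketch comparing them on the same column vector (the variable names are only illustrative):

```python
import numpy as np

column_vector = np.array([[1.], [2.], [3.]])
row = column_vector.T            # shape (1, 3)

flat = row.flatten()             # always returns a copy
rav = row.ravel()                # returns a view when possible, else a copy
resh = row.reshape(-1)           # same idea as ravel
sq = row.squeeze()               # drops every length-1 axis

for result in (flat, rav, resh, sq):
    print(result.shape, result)
```

One caveat: squeeze removes all length-1 axes, so a 1x1 array would collapse to a 0-d array rather than a 1-D one; reshape(-1) or ravel is the safer choice when the result must always be 1-D.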

How to replace an array with an array of different shape

I'm trying to replace the values of an array with another array of a different shape.
a = np.array([[0, 1],
              [0, 0, 1]], dtype=object)  # ragged rows need dtype=object
b = np.array([[0., 0., 0.],
              [0., 0., 0.]])
I tried
b[0, :2] = a[0]
b[1,:] = a[1]
which works, but does not scale well to larger arrays.
I want
c = [[0,1,0],
[0,0,1]]
Edit: Alternatively, is there a way to pad an array with a fill value of our choosing so that it has the same shape as b? For example, my array 'a' could become
[[0,1,nan],[0,0,1]]
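One possible approach, sketched here under the assumption that the ragged data is available as a list of rows and should be padded on the right, is to build a boolean mask of the valid positions and fill them in one assignment. The helper name pad_rows is hypothetical, not a NumPy function:

```python
import numpy as np

def pad_rows(rows, fill_value=0.0):
    """Pad ragged rows on the right so they form a rectangular float array."""
    lengths = np.array([len(r) for r in rows])
    width = lengths.max()
    out = np.full((len(rows), width), fill_value, dtype=float)
    # mask[i, j] is True where row i actually has an element at column j
    mask = np.arange(width) < lengths[:, None]
    # Boolean-mask assignment fills the True positions in row-major order
    out[mask] = np.concatenate([np.asarray(r, dtype=float) for r in rows])
    return out

a = [[0, 1], [0, 0, 1]]
c = pad_rows(a)                  # pad with zeros
c_nan = pad_rows(a, np.nan)      # pad with NaN instead
print(c)
print(c_nan)
```

The result is a float array because NaN cannot be stored in an integer array; if only zero padding is needed, the dtype could be changed to int.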

implementing euclidean distance based formula using numpy

I am trying to implement this formula in Python using NumPy.
In the formula, X is a NumPy matrix in which each xi is an n-dimensional vector, C is also a NumPy matrix in which each Ci is an n-dimensional vector, and dist(Ci,xi) is the Euclidean distance between these two vectors.
I implement a code in python:
import math
import numpy as np

value = 0
for i in range(X.shape[0]):
    min_value = math.inf
    # this loop iterates k times
    for j in range(C.shape[0]):
        distance = np.dot(X[i] - C[j], X[i] - C[j]) ** .5
        min_value = min(min_value, distance)
    value += min_value
fitnessValue = value
But the performance of my code is not good enough, and I am looking for something faster. Is there a faster way to calculate this formula in Python? Any idea would be appreciated.
Generally, loops that run a large number of times should be avoided when possible in Python.
Here, there exists a SciPy function, scipy.spatial.distance.cdist(C, X), which computes the pairwise distance matrix between C and X. That is to say, if you call distance_matrix = scipy.spatial.distance.cdist(C, X), you have distance_matrix[i, j] = dist(C_i, X_j).
Then, for each j, you want to compute the minimum of dist(C_i, X_j) over all i. You do not need a loop for this either: numpy.min does it for you if you pass an axis argument.
Finally, the sum of all these minima is computed with numpy.sum.
This gives code that is both more readable and faster:
import scipy.spatial.distance
import numpy as np
def your_function(C, X):
    distance_matrix = scipy.spatial.distance.cdist(C, X)
    minimum = np.min(distance_matrix, axis=0)
    return np.sum(minimum)
Which returns the same results as your function :)
Hope this helps!
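The claim that the vectorized form matches the loop can be checked on random data. The sketch below swaps cdist for plain NumPy broadcasting so that it runs without SciPy; the function names and the random test arrays are assumptions made for illustration:

```python
import math
import numpy as np

def fitness_loop(C, X):
    """Reference version: the explicit double loop from the question."""
    value = 0.0
    for i in range(X.shape[0]):
        min_value = math.inf
        for j in range(C.shape[0]):
            diff = X[i] - C[j]
            min_value = min(min_value, np.dot(diff, diff) ** 0.5)
        value += min_value
    return value

def fitness_broadcast(C, X):
    """NumPy-only version: pairwise distances via broadcasting."""
    # diff[j, i] = C[j] - X[i], shape (k, m, n)
    diff = C[:, None, :] - X[None, :, :]
    distance_matrix = np.linalg.norm(diff, axis=2)   # shape (k, m)
    # Minimum over the centers for each point, then sum over the points
    return distance_matrix.min(axis=0).sum()

rng = np.random.default_rng(0)
X = rng.random((50, 3))   # 50 points in 3 dimensions
C = rng.random((4, 3))    # 4 centers
print(np.isclose(fitness_loop(C, X), fitness_broadcast(C, X)))
```

The broadcast version builds a (k, m, n) intermediate array, so for very large inputs cdist is likely to be the more memory-friendly option.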
einsum can also be called into play. Here is a simple small example of a pairwise distance calculation for a small set. Useful if you don't have scipy installed and/or wish to use numpy solely.
>>> a
array([[ 0.,  0.],
       [ 1.,  1.],
       [ 2.,  2.],
       [ 3.,  3.],
       [ 4.,  4.]])
>>> b = a.reshape(np.prod(a.shape[:-1]),1,a.shape[-1])
>>> b
array([[[ 0.,  0.]],
       [[ 1.,  1.]],
       [[ 2.,  2.]],
       [[ 3.,  3.]],
       [[ 4.,  4.]]])
>>> diff = a - b; dist_arr = np.sqrt(np.einsum('ijk,ijk->ij', diff, diff)).squeeze()
>>> dist_arr
array([[ 0.     ,  1.41421,  2.82843,  4.24264,  5.65685],
       [ 1.41421,  0.     ,  1.41421,  2.82843,  4.24264],
       [ 2.82843,  1.41421,  0.     ,  1.41421,  2.82843],
       [ 4.24264,  2.82843,  1.41421,  0.     ,  1.41421],
       [ 5.65685,  4.24264,  2.82843,  1.41421,  0.     ]])
Array 'a' is a simple 2D array (shape (5, 2)); 'b' is just 'a' reshaped to (5, 1, 2) to facilitate the difference calculations for the cdist-style array. The terms are written verbosely since they are extracted from other code. The 'diff' variable is the difference array, and the dist_arr shown holds the 'euclidean' distances. Should you need the squared Euclidean distance for 'closest' determinations, simply remove the np.sqrt term. Finally, squeeze just removes any size-1 axes from the shape.
cdist is faster for much larger arrays (in the order of 1000s of origins and destinations) but einsum is a nice alternative and well documented by others on this site.
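The same einsum pattern extends to two different point sets, and squared distances are enough to decide which center is closest. A minimal sketch with made-up sample points and centers (X, C and the variable names are illustrative only):

```python
import numpy as np

X = np.array([[0., 0.], [1., 1.], [4., 4.]])    # points
C = np.array([[0., 0.], [3., 3.]])              # centers

diff = X[:, None, :] - C[None, :, :]            # shape (3, 2, 2)
sq_dist = np.einsum('ijk,ijk->ij', diff, diff)  # squared distances, shape (3, 2)

nearest = sq_dist.argmin(axis=1)                # index of the closest center per point
total = np.sqrt(sq_dist.min(axis=1)).sum()      # sum of the minimum distances
print(nearest, total)
```

Taking the square root only after the minimum avoids computing it for every pair, which is the saving the answer alludes to.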

Is there A 1D interpolation (along one axis) of an image using two images (2D arrays) as inputs? [duplicate]

This question already has an answer here:
Interpolate in one direction
(1 answer)
Closed 7 years ago.
I have two images representing x and y values. The images are full of 'holes' (the 'holes' are the same in both images).
I want to interpolate (linear interpolation is fine, though higher-order interpolation is preferable) along ONE of the axes in order to 'fill' the holes.
Say the axis of choice is 0, that is, I want to interpolate across each column. All I have found is interpolation when x is the same (e.g. scipy.interpolate.interp1d). In this case, however, each x is different (i.e. the holes or empty cells are different in each row).
Is there any numpy/scipy technique I can use? Could a 1D convolution work?(though kernels are fixed)
You still can use interp1d:
import numpy as np
from scipy import interpolate

A = np.array([[1, np.nan, np.nan, 2], [0, np.nan, 1, 2]])
# array([[ 1., nan, nan, 2.],
#        [ 0., nan, 1., 2.]])
for row in A:
    mask = np.isnan(row)
    # positions and values of the known samples in this row
    x, y = np.where(~mask)[0], row[~mask]
    f = interpolate.interp1d(x, y, kind='linear')
    row[mask] = f(np.where(mask)[0])
# A is now:
# array([[ 1. , 1.33333333, 1.66666667, 2. ],
#        [ 0. , 0.5 , 1. , 2. ]])
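If only linear interpolation is needed, the same row-by-row idea can be written with np.interp and no SciPy dependency. This is a sketch under that assumption; the helper name fill_rows is hypothetical:

```python
import numpy as np

def fill_rows(A):
    """Return a copy of A with NaNs in each row filled by linear interpolation."""
    out = np.array(A, dtype=float)      # work on a copy, leave A untouched
    cols = np.arange(out.shape[1])
    for row in out:                     # each row is a view into out
        mask = np.isnan(row)
        # Skip rows with nothing to fill or nothing to interpolate from
        if mask.any() and not mask.all():
            row[mask] = np.interp(cols[mask], cols[~mask], row[~mask])
    return out

A = np.array([[1, np.nan, np.nan, 2], [0, np.nan, 1, 2]])
filled = fill_rows(A)
print(filled)
```

To interpolate down each column instead, the function can be applied to the transpose, as in fill_rows(A.T).T. Note that np.interp holds the edge values constant for holes outside the known range, whereas interp1d raises an error there by default.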

Assigning values to two dimensional array from two one dimensional ones

Most probably somebody else already asked this but I couldn't find it. The question is how can I assign values to a 2D array from two 1D arrays. For example:
import numpy as np
# a is the 2D array. b is the 1D array and should be assigned
# to the second coordinate. In this example the first coordinate is 1.
a=np.zeros((3,2))
b=np.asarray([1,2,3])
c=np.ones(3)
a=np.vstack((c,b)).T
output:
[[ 1. 1.]
[ 1. 2.]
[ 1. 3.]]
I know the way I am doing it is naive, but I am sure there must be a one-line way of doing this.
P.S. In the real case I am dealing with, this is a subarray of a larger array, so I cannot set the first coordinate to one from the beginning. The first coordinates of the whole array differ, but after applying np.where they become constant.
How about 2 lines?
>>> c = np.ones((3, 2))
>>> c[:, 1] = [1, 2, 3]
And the proof it works:
>>> c
array([[ 1.,  1.],
       [ 1.,  2.],
       [ 1.,  3.]])
Or, perhaps you want np.column_stack:
>>> np.column_stack(([1.,1,1],[1,2,3]))
array([[ 1.,  1.],
       [ 1.,  2.],
       [ 1.,  3.]])
First, there's absolutely no reason to create the original zeros array that you stick in a, never reference, and replace with a completely different array with the same name.
Second, if you want to create an array the same shape and dtype as b but with all ones, use ones_like.
So:
b = np.array([1,2,3])
c = np.ones_like(b)
d = np.vstack((c, b)).T
You could of course expand b to a 3x1-array instead of a 3-array, in which case you can use hstack instead of needing to vstack then transpose… but I don't think that's any simpler:
b = np.array([1,2,3])
b = np.expand_dims(b, 1)
c = np.ones_like(b)
d = np.hstack((c, b))
If you insist on 1 line, assign both columns at once with tuple unpacking (a must already exist with shape (3, 2)):
>>> a[:,0],a[:,1]=[1,1,1],[1,2,3]
