Which is faster, numpy transpose or flip indices?

Which is faster, numpy transpose or flip indices? - python

I have a dynamic programming algorithm (modified Needleman-Wunsch) which requires the same basic calculation twice, but the calculation is done in the orthogonal direction the second time. For instance, from a given cell (i,j) in matrix scoreMatrix, I want to both calculate a value from values "up" from (i,j), as well as a value from values to the "left" of (i,j). In order to reuse the code I have used a function in which in the first case I send in parameters i,j,scoreMatrix, and in the next case I send in j,i,scoreMatrix.transpose(). Here is a highly simplified version of that code:
def calculateGapCost(i,j,scoreMatrix,gapcost):
return scoreMatrix[i-1,j] - gapcost
...
gapLeft = calculateGapCost(i,j,scoreMatrix,gapcost)
gapUp = calculateGapCost(j,i,scoreMatrix.transpose(),gapcost)
...
I realized that I could alternatively send in a function that would in the one case pass through arguments (i,j) when retrieving a value from scoreMatrix, and in the other case reverse them to (j,i), rather than transposing the matrix each time.
def passThrough(i,j,matrix):
return matrix[i,j]
def flipIndices(i,j,matrix):
return matrix[j,i]
def calculateGapCost(i,j,scoreMatrix,gapcost,retrieveValue):
return retrieveValue(i-1,j,scoreMatrix) - gapcost
...
gapLeft = calculateGapCost(i,j,scoreMatrix,gapcost,passThrough)
gapUp = calculateGapCost(j,i,scoreMatrix,gapcost,flipIndices)
...
However if numpy transpose uses some features I'm unaware of to do the transpose in just a few operations, it may be that transpose is in fact faster than my pass-through function idea. Can anyone tell me which would be faster (or if there is a better method I haven't thought of)?
The actual method would call retrieveValue 3 times, and involves 2 matrices that would be referenced (and thus transposed if using that approach).

In NumPy, transpose returns a view with a different shape and strides. It does not touch the data.
Therefore, you will likely find that the two approaches have identical performance, since in essence they are exactly the same.
However, the only way to be sure is to benchmark both.

Related

How to efficiently make a function call to each row of a 2D ndarray?

I'm implementing a KNN classifier and need to quickly traverse the test set to calculate and store their predicted labels.
The way I use now is to use the list comprehension to get a list, and then turn it into a ndarray, similar to np.array([predict(point) for point in test_set]), but I think it takes time and space, because the for loop of Python is relatively slow and it needs to create another copy. Is there a more efficient way to get such an array?
I know that numpy has apply_along_axis function, but it is said that it only implicitly uses the for loop, which may not improve the performance.
EDIT: I learned a possible way to save memory: match np.fromiter() function and generator, like np.fromiter((predict(point) for point in test_set), int, test_set.shape[0]), which avoids creating a list halfway. Unfortunately, in my program, it seems to run a little slower than the previous method.

the good old way:
def my_func(test_set):
i = 0
test_set_size = len(test_set)
result = [None] * test_set_size
while i < test_set_size:
result[i] = predict(test_set[i])
i = i + 1
return np.array(result)

Faster way to fill 2d numpy array with these noise parameters? Currently looping over each element

Is there a faster way to populate a 2d numpy array using the same algorithm (pnoise3 with the same input arguments, notably, i/scale j/scale) seen here? self.world is the np array and it is pretty large (2048,1024) to be traversing like this.
for i in range(self.height):
for j in range(self.width):
self.world[i][j] = noise.pnoise3(i/self.noise['scale'],
j/self.noise['scale'],
SEED,
octaves = self.noise['octaves'],
persistence = self.noise['persistence'],
lacunarity = self.noise['lacunarity'],
repeatx= self.width,
repeaty= self.height,
base= 0)
After learning about boolean indexing I was able to get rid of this nested for loop elsewhere in my program and was amazed at how much more efficient it was. Is there any room for improvement above?
I thought about doing something like self.world[self.world is not None] = noise.pnoise3(arg, arg, etc...) but that cannot accommodate for the incrementing i and j values. And would setting it to a function output mean every value is the same anyways? I also thought about make a separate array and then combining them but I still cannot figure out how to reproduce the incrementing i and j values in that scenario.
Also, as an aside, I used self.world[self.world is not None] as an example of a boolean index that would return true for everything but I imagine this is not the best way to do what I want. Is there an obvious alternative I am missing?

If pnoise is perlin noise then there are numpy vectorized implementations.
Here is one.
As it is I do not think you can do it faster. Numpy is fast when it can do the inner loop in C. That is the case for built in numpy functions like np.sin.
Here you have a vector operation where the operation is a python function.
However it could be possible to re-implement the noise function so that it internally uses numpy vectorized functions.

interp1d with streaming data?

I know that you can pass the data lists x and y to scipy's interp1d by reference. Does this mean I can add new data to it by simply modifying the inputs x and y in-place?
Ideally, I'm looking for something that will do the following efficiently:
It interpolates the value for some point we request.
Based on that interpolated value, we decide whether or not to obtain the 'true' value.
If the true value is obtained, we then want to update the algorithm's knowledge for future interpolations.
However, once the values are put into the interpolating algorithm, they are not presumed to change--new points can only be added. I think interp1d does some kind of fancy processing on the input data to make the lookup faster, but I'm not sure if that precludes adding to the data in-place. Please help!
Edit: Some of you will likely notice that this has a lot in common with Metropolis-Hastings, however, steps 1-3 may not occur serially; hence I need a more abstract interpolation method to support asynchronous updates. If you know of any, suggestions, that would be great!

I think the simplest is to write your own interpolating object:
class Interpolator:
def __init__(self,x,y):
if len(x)!=len(y):
raise BaseException("Lists must have the same length")
self.xlist=x
self.ylist=y
self.len=len(x)
def find_x_index(self,x0): # find index i such that xlist[i]<=x0<xlist[i+1]
a,b=0,self.len-1
while b-a>1:
m=int((b+a)/2)
if x0<self.xlist[m]:
b=m
else:
a=m
return a
def add_point(self,x,y): # add a new point
if x<self.xlist[0]:
self.xlist.insert(0,x)
self.ylist.insert(0,y)
elif x>self.xlist[-1]:
self.xlist.append(x)
self.ylist.append(y)
else:
i=self.find_x_index(x)
self.xlist.insert(i+1,x)
self.ylist.insert(i+1,y)
self.len+=1
def interpolate(self,x0): # interpolates y value for x0
if x0<self.xlist[0] or x0>self.xlist[-1]:
raise BaseException("Value out of range")
a=self.find_x_index(x0)
eps=(x0-self.xlist[a])/(self.xlist[a+1]-self.xlist[a]) # interpolation
return (eps*self.ylist[a+1]+(1-eps)*self.ylist[a])
itp=Interpolator([1,2,3],[1,3,4])
print(itp.interpolate(1.6))
itp.add_point(1.5,3)
print(itp.interpolate(1.6))
The key point is to always keep the x list sorted, so that you can use dichotomy which is a logarithmic complexity algorithm.
Remark: in add_point, you should check that there aren't two same x values with different y

numpy shorthand for taking jagged slice

I have an operation that I'm doing commonly which I'm calling a "jagged-slice" because I don't know the real name for it. It's best explained by example:
a = np.random.randn(50, 10)
entries_of_interest = np.random.randint(10, size = 50) # Vector of 50 indices between 0 and 9
# Now I want the values contained in each row of a at the corresponding index in "entries of interest"
jagged_slice_of_a = a[np.arange(a.shape[0]), entries_of_interest]
# jagged_slice_of_a is now a vector with 50 elements. Good.
Only problem is it's a bit cumbersome to do this a[np.arange(a.shape[0]), entries_of_interest] indexing (it seems silly to have to construct the "np.arange(a.shape[0])" just for the sake of this). I'd like something like the : operator for this, but the : does something else. Is there any more succinct way to do this operation?
Best answer:
No, there is no better way with native numpy. You can create a helper function for this if you want.

This is combersome only in the sense that it requires more typing for a task that seems so simple to you.
a[np.arange(a.shape[0]), entries_of_interest]
But as you note, the syntactically simpler a[:, entries_of_interest] has another interpretation in numpy. Choosing a subset of the columns of an array is a more common task that choosing one (random) item from each row.
Your case is just a specialized instance of
a[I, J]
where I and J are 2 arrays of the same shape. In the general case entries_of_interest could be smaller than a.shape[0] (not all the rows), or larger (several items from some rows), or even be 2d. It could even select certain elements repeatedly.
I have found in other SO questions that performing this kind of element selection is faster when applied to a.flat. But that requires some math to construct the I*n+J kind of flat index.
With your special knowledge of J, constructing I seems extra work, but numpy can't make that kind of assumption. If this selection was more common someone could write a function that wraps your expression
def peter_selection(a,I):
# check the a.shape[0]==I.shape[0]
return a[np.arange(a.shape[0]), I]

I think that your current method is probably the best way.
You can also use choose for this kind of selection. This is syntactically clearer, but is trickier to get right and potentially more limited. The equivalent with this method would be:
entries_of_interest.choose(a.T)

The elements in jagged_slice_of_a are the diagonal elements of a[:,entries_of_interest]
A slightly less cumbersome way of doing this would therefore be to use np.diagonal to extract them.
jagged_slice_of_a = a[:, entries_of_interest].diagonal()

Efficiently recalculating the gradient of a numpy array with unknown dimensionality

I have an N-dimensional numpy array S. Every iteration, exactly one value in this array will change.
I have a second array, G that stores the gradient of S, as calculated by numpy's gradient() function. Currently, my code unnecessarily recalculates all of G every time I update S, but this is unnecessary, as only one value in S has changed, and so I only should have to recalculate 1+d*2 values in G, where d is the number of dimensions in S.
This would be an easier problem to solve if I knew the dimensionality of the arrays, but the solutions I have come up with in the absence of this knowledge have been quite inefficient (not substantially better than just recalculating all of G).
Is there an efficient way to recalculate only the necessary values in G?
Edit: adding my attempt, as requested
The function returns a vector indicating the gradient of S at coords in each dimension. It calculates this without calculating the gradient of S at every point, but the problem is that it does not seem to be very efficient.
It looks similar in some ways to the answers already posted, but maybe there is something quite inefficient about it?
The idea is the following: I iterate through each dimension, creating a slice that is a vector only in that dimension. For each of these slices, I calculate the gradient and place the appropriate value from that gradient into the correct place in the returned vector grad.
The use of min() and max() is to deal with the boundary conditions.
def getSGradAt(self,coords) :
"""Returns the gradient of S at position specified by
the vector argument 'coords'.
self.nDim : the number of dimensions of S
self.nBins : the width of S (same in every dim)
self.s : S """
grad = zeros(self.nDim)
for d in xrange(self.nDim) :
# create a slice through S that has size > 1 only in the current
# dimension, d.
slices = list(coords)
slices[d] = slice(max(0,coords[d]-1),min(self.nBins,coords[d]+2))
# take the middle value from the gradient vector
grad[d] = gradient(self.s[sl])[1]
return grad
The problem is that this doesn't run very quickly. In fact, just taking the gradient of the whole array S seems to run faster (for nBins = 25 and nDim = 4).
Edited again, to add my final solution
Here is what i ended up using. This function updates S, changing the value at X by the amount change. It then updates G using a variation on the technique proposed by Jaime.
def changeSField(self,X,change) :
# change s
self.s[X] += change
# update g (gradient field)
slices = tuple(slice(None if j-2 <= 0 else j-2, j+3, 1) for j in X)
newGrads = gradient(self.s[slices])
for i in arange(self.nDim) :
self.g[i][slices] = newGrads[i]

Your question is much to open for you to get a good answer: it is always a good idea to post your inefficient code, so that potential answerers can better help you. Anyway, lets say you know the coordinates of the point that has changed, and that you store those in a tuple named coords. First, lets construct a tuple of slices encompassing your point:
slices = tuple(slice(None if j-1 <= 0 else j-1, j+2, 1) for j in coords)
You may want to extend the limits to j-2 and j+3 so that the gradient is calculated using central differences whenever possible, but it will be slower.
You can now update you array doing something like:
G[slices] = np.gradient(N[slices])

Uhmmm, I could work better if I had an example, but what about just creating a secondary array, S2 (by the way, I'd choose longer and more meaningful names for your variables) and recalculate the gradient for it, G2, and then introduce it back into G?
Another question is: if you don't know the dimensionality of S, how are you changing the particular element that changes? Are you just recalculating the whole of S?
I suggest you clarify this things so that people can help you better.
Cheers!

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Which is faster, numpy transpose or flip indices? - python

In NumPy, transpose returns a view with a different shape and strides. It does not touch the data. Therefore, you will likely find that the two approaches have identical performance, since in essence they are exactly the same. However, the only way to be sure is to benchmark both.

Related

How to efficiently make a function call to each row of a 2D ndarray?

Faster way to fill 2d numpy array with these noise parameters? Currently looping over each element

interp1d with streaming data?

numpy shorthand for taking jagged slice

Efficiently recalculating the gradient of a numpy array with unknown dimensionality

Categories

Resources