Finding local maxima with TensorFlow - python

tl;dr: I am looking for a way to get the indices of the local maxima of a tensor using TensorFlow exclusively.
I am not a data scientist. I don't know much of the theory behind computer vision, but I am trying to build a computer vision app using TensorFlow. I plan on saving my model and calling it as a service using TF Serving, so I can't depend on external libraries such as numpy, scipy, etc. What I want to accomplish is algorithmically the same as scipy's signal.argrelextrema, but in a way that can be saved with my model and rerun. Other algorithms for this have been shown elsewhere, but none of them execute within TensorFlow. Can anyone point me in the right direction?

EDIT
My first solution was functional but inefficient: it required five passes over the tensor (zero-trail, reverse, zero-trail, reverse, where). I now have a solution that requires only two passes, and is also flexible enough to identify local minima:
import numpy as np
import tensorflow as tf

def get_slope(prev, cur):
    # A: Ascending
    # D: Descending
    # P: PEAK (on previous node)
    # V: VALLEY (on previous node)
    return tf.cond(prev[0] < cur,
                   lambda: (cur, ascending_or_valley(prev, cur)),
                   lambda: (cur, descending_or_peak(prev, cur)))

def ascending_or_valley(prev, cur):
    return tf.cond(tf.logical_or(tf.equal(prev[1], 'A'), tf.equal(prev[1], 'V')),
                   lambda: np.array('A'), lambda: np.array('V'))

def descending_or_peak(prev, cur):
    return tf.cond(tf.logical_or(tf.equal(prev[1], 'A'), tf.equal(prev[1], 'V')),
                   lambda: np.array('P'), lambda: np.array('D'))

def label_local_extrema(tens):
    """Return a vector of chars indicating ascending, descending, peak, or valley slopes"""
    # initializer element values don't matter, just the type.
    initializer = (np.array(0, dtype=np.float32), np.array('A'))
    # First, get the slope for each element
    slope = tf.scan(get_slope, tens, initializer)
    # shift by one, since each slope indicator is the slope
    # of the previous node (necessary to identify peaks and valleys)
    return slope[1][1:]

def find_local_maxima(tens):
    """Return the indices of the local maxima of the first dimension of the tensor"""
    return tf.squeeze(tf.where(tf.equal(label_local_extrema(tens), 'P')))
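A minimal usage sketch (assuming TF1-style graph execution, matching the code above; the example signal and output indices are illustrative):

x = tf.constant([0., 1., 3., 2., 4., 1., 5.], dtype=tf.float32)
maxima = find_local_maxima(x)
with tf.Session() as sess:
    print(sess.run(maxima))  # indices of the interior peaks, e.g. [2 4]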
End EDIT
Ok, I've managed to find a solution, but it's not pretty. The following function takes a 1D tensor, and reduces all points that are not local maxima to zero. This will work only for positive numbers, and would require modification for datatypes other than float32, but it meets my needs.
There has to be a better way to do this, though.
def zero_descent(prev, cur):
    """reduces all descent steps to zero"""
    return tf.cond(prev[0] < cur, lambda: (cur, cur), lambda: (cur, 0.0))

def skeletonize_1d(tens):
    """reduces all points other than local maxima to zero"""
    # initializer element values don't matter, just the type.
    initializer = (np.array(0, dtype=np.float32), np.array(0, dtype=np.float32))
    # First, zero out the trailing side
    trail = tf.scan(zero_descent, tens, initializer)
    # Next, let's make the leading side the trailing side
    trail_rev = tf.reverse(trail[1], [0])
    # Now zero out the leading (now trailing) side
    lead = tf.scan(zero_descent, trail_rev, initializer)
    # Finally, undo the reversal for the result
    return tf.reverse(lead[1], [0])

def find_local_maxima(tens):
    return tf.where(skeletonize_1d(tens) > 0)

Pseudocode:
input_matrix == max_pool(input_matrix)
Explanation: where an input value equals the value produced by max pooling over its neighborhood, it is the greatest value in that neighborhood, i.e. a local maximum.
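One way this idea could look in TensorFlow for a 1-D signal (a sketch under my own assumptions: a window of size 3 and tf.nn.pool, neither of which is specified by the answer; note that with 'SAME' padding the endpoints compare against a truncated window and may be flagged as well):

import tensorflow as tf

x = tf.constant([0., 1., 3., 2., 4., 1., 5.], dtype=tf.float32)
signal = tf.reshape(x, [1, -1, 1])  # [batch, width, channels]
pooled = tf.nn.pool(signal, window_shape=[3], pooling_type='MAX', padding='SAME')
is_max = tf.equal(signal, pooled)   # True where a value equals its neighborhood max
maxima_idx = tf.where(tf.reshape(is_max, [-1]))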

I don't think you are giving enough information here. First of all, I'm not sure whether you want the maximum element of a tensor (there is a TensorFlow function for that) or the local maxima of a function (not a tensor). In the latter case, you can negate the function and find its local minima, which gives you what you are looking for.
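For the tensor case, the negation trick is a one-liner (a sketch reusing the find_local_maxima helper defined in the question's edit):

def find_local_minima(tens):
    # local minima of tens are exactly the local maxima of its negation
    return find_local_maxima(-tens)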


Search for the nearest array in a huge array of arrays

I need to find the closest possible sentence.
I have an array of sentences and a user's sentence, and I need to find the element of the array closest to the user's sentence.
I presented each sentence in the form of a vector using word2vec:
import numpy as np

def get_avg_vector(word_list, model_w2v, size=500):
    sum_vec = np.zeros(shape=(1, size))
    count = 0
    for w in word_list:
        if w in model_w2v and w != '':
            sum_vec += model_w2v[w]
            count += 1
    if count == 0:
        return sum_vec
    else:
        return sum_vec / count + 1
As a result, the array element looks like this:
array([[ 0.93162371, 0.95618944, 0.98519795, 0.98580566, 0.96563747,
0.97070891, 0.99079191, 1.01572807, 1.00631016, 1.07349398,
1.02079309, 1.0064849 , 0.99179418, 1.02865136, 1.02610303,
1.02909719, 0.99350413, 0.97481178, 0.97980362, 0.98068508,
1.05657591, 0.97224562, 0.99778703, 0.97888296, 1.01650529,
1.0421448 , 0.98731804, 0.98349052, 0.93752996, 0.98205837,
1.05691232, 0.99914532, 1.02040555, 0.99427229, 1.01193818,
0.94922226, 0.9818139 , 1.03955 , 1.01252615, 1.01402485,
...
0.98990598, 0.99576604, 1.0903802 , 1.02493086, 0.97395976,
0.95563786, 1.00538653, 1.0036294 , 0.97220088, 1.04822631,
1.02806122, 0.95402776, 1.0048053 , 0.97677222, 0.97830801]])
I represent the user's sentence as a vector in the same way, and I compute the element closest to it like this:
%%cython
from scipy.spatial.distance import euclidean

def compute_dist(v, list_sentences):
    dist_dict = {}
    for key, val in list_sentences.items():
        dist_dict[key] = euclidean(v, val)
    return sorted(dist_dict.items(), key=lambda x: x[1])[0][0]
list_sentences in the method above is a dictionary in which the keys are text representations of sentences and the values are vectors.
It takes a very long time, because I have more than 60 million sentences.
How can I speed up and optimize this process?
I'll be grateful for any advice.
The initial calculation of the 60 million sentences' vectors is essentially a fixed cost you'll pay once. I'm assuming you mainly care about the time for each subsequent lookup, for a single user-supplied query sentence.
Using numpy native array operations can speed up the distance calculations over doing your own individual calculations in a Python loop. (It's able to do things in bulk using its optimized code.)
But first you'd want to replace list_sentences with a true numpy array, accessed only by array-index. (If you have other keys/texts you need to associate with each slot, you'd do that elsewhere, with some dict or list.)
Let's assume you've done that, in whatever way is natural for your data, and now have array_sentences, a 60-million by 500-dimension numpy array, with one sentence average vector per row.
Then a one-liner way to get an array of all the distances is to take the vector length ("norm") of the difference between each of the 60 million candidates and the single query, which gives a 60-million-entry array of distances:
dists = np.linalg.norm(array_sentences - v, axis=1)
Another one-liner is to use the scipy utility function cdist() for computing the distance between each pair from two collections of inputs. Here, your first collection is just the one query vector v (but if you had batches to do at once, supplying more than one query at a time could offer an additional slight speedup):
from scipy.spatial.distance import cdist
dists = cdist(v.reshape(1, -1), array_sentences)[0]
(Note that such vector comparisons often use cosine-distance/cosine-similarity rather than euclidean-distance. If you switch to that, you might be doing other norming/dot-products instead of the first option above, or use the metric='cosine' option to cdist().)
Once you have all the distances in a numpy array, using a numpy-native sort option is likely to be faster than using Python's sorted(). Numpy's indirect sort argsort() just returns the sorted indexes, and thus avoids moving all the vector coordinates around, which is enough since you only want to know which items are the best match(es). For example:
sorted_indexes = np.argsort(dists)
best_index = sorted_indexes[0]
If you need to turn that int index back into your other key/text, you'd use your own dict/list that remembered the slot-to-key relationships.
All these still give an exactly right result, by comparing against all candidates, which (even when done optimally well) is still time-consuming.
There are ways to get faster results, based on pre-building indexes to the full set of candidates – but such indexes become very tricky in high-dimensional spaces (like your 500-dimensional space). They often trade off perfectly accurate results for faster results. (That is, what they return for 'closest 1' or 'closest N' will have some errors, but usually not be off by much.) For examples of such libraries, see Spotify's ANNOY or Facebook's FAISS.
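As a rough illustration of the approximate-index approach, a sketch with ANNOY (treat the parameter choices as assumptions; both build time and accuracy depend on the number of trees):

from annoy import AnnoyIndex

index = AnnoyIndex(500, 'euclidean')           # 500 = vector dimensionality
for i, vec in enumerate(array_sentences):      # one-time, up-front indexing cost
    index.add_item(i, vec)
index.build(10)                                # more trees -> better accuracy, bigger index
best_index = index.get_nns_by_vector(v, 1)[0]  # approximate nearest neighbor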
If you are doing this procedure for multiple sentences, you could try using scipy.spatial.cKDTree (I don't know whether it pays for itself on a single query; also, 500 dimensions is quite high, and I seem to remember k-d trees work better with fewer dimensions, so you'll have to experiment).
Assuming you've put all your vectors (dict values) into one large numpy array:
>>> import numpy as np
>>> from scipy.spatial import cKDTree as KDTree
>>> # 100,000 vectors (that's all my RAM can take)
>>> a = np.random.random((100000, 500))
>>> t = KDTree(a)
>>> # create one new vector and find distance and index of closest
>>> t.query(np.random.random(500))
(8.20910072933986, 83407)
I can think of two possible ways of optimizing this process.
First, if your goal is only to get the closest vector (or sentence), you could get rid of the list_sentences variable and only keep in memory the closest sentence you have found so far. This way, you won't need to sort the complete (and presumably very large) list at the end; you just return the closest one.
def compute_dist(v, list_sentences):
    min_dist = float('inf')  # not 0, which no distance could ever beat
    closest_sentence = None
    for key, val in list_sentences.items():
        dist = euclidean(v, val)
        if dist < min_dist:
            closest_sentence = key
            min_dist = dist
    return closest_sentence
The second one is maybe a little less sound. You can try to reimplement the euclidean method with a third argument, the current minimum distance min_dist between the closest vector found so far and the user vector. I don't know how the scipy euclidean method is implemented, but I guess it is close to summing squared differences along all the vectors' dimensions. What you want is for the method to stop as soon as the running sum exceeds min_dist squared (the final distance would then be higher than min_dist anyway, so you wouldn't keep it).
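A minimal sketch of that early-exit idea (the helper name and the pure-Python loop are mine, and v and val are assumed to be flat sequences of coordinates; in practice you would compile this with Cython or numba for it to have a chance against scipy's vectorized call):

def euclidean_early_exit(v, val, min_dist):
    """Return the distance, or None as soon as it provably exceeds min_dist."""
    limit = min_dist ** 2            # compare squared sums; take the sqrt only at the end
    total = 0.0
    for a, b in zip(v, val):
        total += (a - b) ** 2
        if total > limit:
            return None              # cannot beat the current best; bail out early
    return total ** 0.5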

Usage of forEdges iterator in networkit (python)

I carefully read the docs, but it still is unclear to me how to use G.forEdges(), described as an "experimental edge iterator interface".
Let's say that I want to decrease the density of my graph. I have a sorted list of weights, and I want to remove edges based on their weight until the graph splits into two connected components. Then I'll select the minimum number of links that keeps the graph connected. I would do something like this:
cc = components.ConnectedComponents(G).run()
while cc.numberOfComponents() == 1:
    for weight in weightlist:
        for (u, v) in G.edges():
            if G.weight(u, v) == weight:
                G.removeEdge(u, v)
By the way, I know from the docs that there is this edge iterator, which probably does the iteration in a more efficient way. But from the docs I really can't understand how to correctly use this forEdges, and I can't find a single example on the internet. Any ideas?
Or maybe an alternative idea to do what I want: since it's a huge graph (125 million links), the iteration will take forever, even though I am working on a cluster.
NetworKit iterators accept a callback function, so if you want to iterate over edges (or nodes) you have to define a function and then pass it to the iterator as a parameter. You can find more information here. For example, a simple function that just prints all edges is:
# Callback function.
# To iterate over edges it must accept 4 parameters
def myFunction(u, v, weight, edgeId):
    print("Edge from {} to {} has weight {} and id {}".format(u, v, weight, edgeId))

# Using the iterator with the callback function
G.forEdges(myFunction)
Now, if you want to keep removing edges whose weight is in your weightlist until the graph splits into two connected components, you also have to update the connected components of the graph, since ConnectedComponents will not do that for you automatically (this may also be one of the reasons why the iteration takes forever). To do this efficiently, you can use the DynConnectedComponents class (see my example below). In this case, I think the edge iterator will not help you much, so I would suggest you keep using the for loop.
from networkit import *

# Efficiently updates connected components after edge updates
cc = components.DynConnectedComponents(G).run()

# Removes edges with weight equal to w until the components split
def removeEdges(w):
    for (u, v) in G.edges():
        if G.weight(u, v) == w:
            G.removeEdge(u, v)
            # Updating connected components
            event = dynamic.GraphEvent(dynamic.GraphEvent.EDGE_REMOVAL, u, v, w)
            cc.update(event)
            if cc.numberOfComponents() > 1:
                # Components did split
                return True
    # Components did not split
    return False

if cc.numberOfComponents() == 1:
    for weight in weightlist:
        if removeEdges(weight):
            break
This should speed up your original code a bit. However, it is still sequential, so even if you run it on a multi-core machine it will use only one core.

Efficient tensor contraction in python

I have a list L of tensors (ndarray objects), with several indices each. I need to contract these indices according to a graph of connections.
The connections are encoded in a list of tuples in the form ((m,i),(n,j)), signifying "contract the i-th index of the tensor L[m] with the j-th index of the tensor L[n]".
How can I handle non-trivial connectivity graphs? The first problem is that as soon as I contract a pair of indices, the result is a new tensor that does not belong to the list L. But even if I solved this (e.g. by giving a unique identifier to all the indices of all the tensors), there is the issue that one can pick any order to perform the contractions, and some choices yield unnecessarily enormous beasts in mid-computation (even if the final result is small). Suggestions?
Memory considerations aside, I believe you can do the contractions in a single call to einsum, although you'll need some preprocessing. I'm not entirely sure what you mean by "as I contract a pair of indices, the result is a new tensor that does not belong to the list L", but I think doing the contraction in a single step would exactly solve this problem.
I suggest using the alternative, numerically indexed syntax of einsum:
einsum(op0, sublist0, op1, sublist1, ..., [sublistout])
So what you need to do is encode the indices to contract in integer sequences. First you'll need to set up a range of unique indices initially, and keep another copy to be used as sublistout. Then, iterating over your connectivity graph, you need to set contracted indices to the same index where necessary, and at the same time remove the contracted index from sublistout.
import numpy as np

def contract_all(tensors, conns):
    '''
    Contract the tensors inside the list tensors
    according to the connectivities in conns

    Example input:
    tensors = [np.random.rand(2,3), np.random.rand(3,4,5), np.random.rand(3,4)]
    conns = [((0,1),(2,0)), ((1,1),(2,1))]
    returned shape in this case is (2,3,5)
    '''
    ndims = [t.ndim for t in tensors]
    totdims = sum(ndims)
    dims0 = np.arange(totdims)
    # keep track of sublistout throughout
    sublistout = set(dims0.tolist())
    # cut up the index array according to tensors
    # (throw away empty list at the end)
    inds = np.split(dims0, np.cumsum(ndims))[:-1]
    # we also need to convert to a list, otherwise einsum chokes
    inds = [ind.tolist() for ind in inds]

    # if there were no contractions, we'd call
    # np.einsum(*zip(tensors,inds),sublistout)
    # instead we need to loop over the connectivity graph
    # and manipulate the indices
    for (m, i), (n, j) in conns:
        # tensors[m][i] contracted with tensors[n][j]
        # remove the old indices from sublistout which is a set
        sublistout -= {inds[m][i], inds[n][j]}
        # contract the indices
        inds[n][j] = inds[m][i]

    # zip and flatten the tensors and indices
    args = [subarg for arg in zip(tensors, inds) for subarg in arg]
    # assuming there are no multiple contractions, we're done here
    # (einsum needs the output sublist as an ordered sequence, not a set)
    return np.einsum(*args, sorted(sublistout))
A trivial example:
>>> tensors = [np.random.rand(2,3), np.random.rand(4,3)]
>>> conns = [((0,1),(1,1))]
>>> contract_all(tensors, conns)
array([[ 1.51970003,  1.06482209,  1.61478989,  1.86329518],
       [ 1.16334367,  0.60125945,  1.00275992,  1.43578448]])
>>> np.einsum('ij,kj', tensors[0], tensors[1])
array([[ 1.51970003,  1.06482209,  1.61478989,  1.86329518],
       [ 1.16334367,  0.60125945,  1.00275992,  1.43578448]])
In case there are multiple contractions, the logistics in the loop becomes a bit more complex, because we need to handle all the duplicates. The logic, however, is the same. Furthermore, the above is obviously missing checks to ensure that the corresponding indices can be contracted.
In hindsight I realized that the default sublistout doesn't have to be specified, einsum uses that order anyway. I decided to leave that variable in the code, because in case we want a non-trivial output index order, we'll have to handle that variable appropriately, and it might come handy.
As for optimization of the contraction order, you can effect internal optimization in np.einsum as of version 1.12 (as noted by @hpaulj in a now-deleted comment). This version introduced the optimize optional keyword argument to np.einsum, allowing numpy to choose a contraction order that cuts down on computational time at the cost of memory. Passing 'greedy' or 'optimal' as the optimize keyword will make numpy choose a contraction order in roughly decreasing order of the sizes of the dimensions.
The options available for the optimize keyword come from the apparently undocumented (as far as online documentation goes; help() fortunately works) function np.einsum_path:
einsum_path(subscripts, *operands, optimize='greedy')
Evaluates the lowest cost contraction order for an einsum expression by
considering the creation of intermediate arrays.
The output contraction path from np.einsum_path can also be used as an input for the optimize argument of np.einsum. In your question you were worried about too much memory being used, so I suspect that the default of no optimization (with a potentially longer runtime and a smaller memory footprint) may be what you want.
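For illustration, a small sketch of the optimize keyword and of inspecting the chosen path (the shapes are arbitrary; requires numpy >= 1.12):

import numpy as np

a = np.random.rand(2, 3)
b = np.random.rand(3, 4, 5)
c = np.random.rand(4, 6)

# Let numpy pick a cheap contraction order:
res = np.einsum('ij,jkl,km->ilm', a, b, c, optimize='greedy')

# Inspect the path and its estimated cost; the returned path can be fed
# back into np.einsum via optimize=path.
path, info = np.einsum_path('ij,jkl,km->ilm', a, b, c, optimize='greedy')
print(info)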
Maybe helpful: take a look at https://arxiv.org/abs/1402.0939, which describes an efficient framework for contracting so-called tensor networks with a single function, ncon(...). As far as I can see, implementations of it are directly available for Matlab (within the link) and for Python 3 (https://github.com/mhauru/ncon).

Efficiently recalculating the gradient of a numpy array with unknown dimensionality

I have an N-dimensional numpy array S. Every iteration, exactly one value in this array will change.
I have a second array, G, that stores the gradient of S, as calculated by numpy's gradient() function. Currently, my code recalculates all of G every time I update S, but this is unnecessary: only one value in S has changed, so I should only have to recalculate 1 + 2*d values in G, where d is the number of dimensions of S.
This would be an easier problem to solve if I knew the dimensionality of the arrays, but the solutions I have come up with in the absence of this knowledge have been quite inefficient (not substantially better than just recalculating all of G).
Is there an efficient way to recalculate only the necessary values in G?
Edit: adding my attempt, as requested
The function returns a vector indicating the gradient of S at coords in each dimension. It calculates this without calculating the gradient of S at every point, but the problem is that it does not seem to be very efficient.
It looks similar in some ways to the answers already posted, but maybe there is something quite inefficient about it?
The idea is the following: I iterate through each dimension, creating a slice that is a vector only in that dimension. For each of these slices, I calculate the gradient and place the appropriate value from that gradient into the correct place in the returned vector grad.
The use of min() and max() is to deal with the boundary conditions.
def getSGradAt(self, coords):
    """Returns the gradient of S at the position specified by
    the vector argument 'coords'.
    self.nDim  : the number of dimensions of S
    self.nBins : the width of S (same in every dim)
    self.s     : S """
    # zeros() and gradient() come from numpy
    grad = zeros(self.nDim)
    for d in xrange(self.nDim):
        # create a slice through S that has size > 1 only in the current
        # dimension, d.
        slices = list(coords)
        slices[d] = slice(max(0, coords[d] - 1), min(self.nBins, coords[d] + 2))
        # take the middle value from the gradient vector
        grad[d] = gradient(self.s[tuple(slices)])[1]
    return grad
The problem is that this doesn't run very quickly. In fact, just taking the gradient of the whole array S seems to run faster (for nBins = 25 and nDim = 4).
Edited again, to add my final solution
Here is what I ended up using. This function updates S, changing the value at X by the amount change. It then updates G using a variation on the technique proposed by Jaime.
def changeSField(self, X, change):
    # change s
    self.s[X] += change
    # update g (gradient field)
    slices = tuple(slice(None if j-2 <= 0 else j-2, j+3, 1) for j in X)
    newGrads = gradient(self.s[slices])
    for i in arange(self.nDim):
        self.g[i][slices] = newGrads[i]
Your question is much too open for you to get a good answer: it is always a good idea to post your inefficient code, so that potential answerers can better help you. Anyway, let's say you know the coordinates of the point that has changed, and that you store those in a tuple named coords. First, let's construct a tuple of slices encompassing your point:
slices = tuple(slice(None if j-1 <= 0 else j-1, j+2, 1) for j in coords)
You may want to extend the limits to j-2 and j+3 so that the gradient is calculated using central differences whenever possible, but it will be slower.
You can now update your array by doing something like:
G[slices] = np.gradient(S[slices])
Uhmmm, I could help better if I had an example, but what about just creating a secondary array, S2 (by the way, I'd choose longer and more meaningful names for your variables), recalculating the gradient for it, G2, and then introducing it back into G?
Another question: if you don't know the dimensionality of S, how are you changing the particular element that changes? Are you just recalculating the whole of S?
I suggest you clarify these things so that people can help you better.
Cheers!

Which is faster, numpy transpose or flip indices?

I have a dynamic programming algorithm (modified Needleman-Wunsch) which requires the same basic calculation twice, but done in the orthogonal direction the second time. For instance, from a given cell (i,j) in the matrix scoreMatrix, I want to calculate both a value from the values "up" from (i,j) and a value from the values to the "left" of (i,j). In order to reuse the code, I used a function to which in the first case I pass i, j, scoreMatrix, and in the second case j, i, scoreMatrix.transpose(). Here is a highly simplified version of that code:
def calculateGapCost(i, j, scoreMatrix, gapcost):
    return scoreMatrix[i-1, j] - gapcost
...
gapLeft = calculateGapCost(i, j, scoreMatrix, gapcost)
gapUp = calculateGapCost(j, i, scoreMatrix.transpose(), gapcost)
...
...
I realized that I could alternatively send in a function that would in the one case pass through arguments (i,j) when retrieving a value from scoreMatrix, and in the other case reverse them to (j,i), rather than transposing the matrix each time.
def passThrough(i, j, matrix):
    return matrix[i, j]

def flipIndices(i, j, matrix):
    return matrix[j, i]

def calculateGapCost(i, j, scoreMatrix, gapcost, retrieveValue):
    return retrieveValue(i-1, j, scoreMatrix) - gapcost
...
gapLeft = calculateGapCost(i, j, scoreMatrix, gapcost, passThrough)
gapUp = calculateGapCost(j, i, scoreMatrix, gapcost, flipIndices)
...
However, if numpy's transpose uses some feature I'm unaware of to do the transpose in just a few operations, it may be that transpose is in fact faster than my pass-through function idea. Can anyone tell me which would be faster (or if there is a better method I haven't thought of)?
The actual method would call retrieveValue 3 times, and involves 2 matrices that would be referenced (and thus transposed if using that approach).
In NumPy, transpose returns a view with a different shape and strides. It does not touch the data.
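A quick way to see this (the exact stride values depend on the dtype; the ones shown assume 64-bit integers):

import numpy as np

m = np.arange(6).reshape(2, 3)
t = m.T
print(m.strides, t.strides)  # e.g. (24, 8) vs (8, 24): same buffer, swapped strides
t[0, 1] = 99                 # writing through the view...
print(m[1, 0])               # ...changes the original array: 99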
Therefore, you will likely find that the two approaches have identical performance, since in essence they are exactly the same.
However, the only way to be sure is to benchmark both.
