For one of my projects in image processing, I need to use key points. To compute them, I found that OpenCV was quite fast and convenient to use. But when computing the key points of an image with, for example, the FAST algorithm, we receive an array of KeyPoint objects.
When I get those key points, I would like to keep only their coordinates, not the additional information (angle, etc.). Those coordinates will be used to perform several computations with numpy. The problem is the conversion time from an array of KeyPoint objects to a numpy array: it takes approximately 60% of the total execution time. Any suggestions on how to improve the loop below?
import cv2
import numpy as np
image_cv2 = cv2.imread("sample.jpg")
fast = cv2.FastFeatureDetector(threshold=25)
keypoints = fast.detect(image_cv2)
n = len(keypoints)
cd_x = np.zeros(n, dtype=np.int)
cd_y = np.zeros(n, dtype=np.int)
for i in xrange(0, n):
    cd_x[i] = keypoints[i].pt[0]
    cd_y[i] = keypoints[i].pt[1]
PS: I tried to use np.vectorize but I did not notice any improvement. For information, the number of key points per image is often around 5,000.
Update:
As some people pointed out, the simple assignment from the keypoints to a numpy array should be quite fast. After some tests, it is indeed very fast. For example, for a dataset of 275 images with 1 thread, the complete execution time is 22.9 s, with only 0.2 s for the keypoints->numpy conversion and around 20 s spent in cv2.imread().
My mistake was using too many threads at the same time: since each core was not being used at least 80%, I kept increasing the thread count up to that arbitrary limit, which slowed down the loop execution. Thank you everyone for making me see a silly mistake elsewhere in the code!
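For reference, a minimal sketch of the direct conversion (essentially the comprehension approach from the answer below; the int cast is only there to match the original cd_x/cd_y arrays):
pts = np.array([kp.pt for kp in keypoints])  # (n, 2) float array of coordinates
cd_x = pts[:, 0].astype(int)
cd_y = pts[:, 1].astype(int)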
A trial case:
In [765]: class Keypoints(object):
   .....:     def __init__(self):
   .....:         self.pt=[1.,2.]
   .....:
In [766]: keypoints=[Keypoints() for i in xrange(1000)]
In [767]: cd=np.array([k.pt for k in keypoints])
In [768]: cd
Out[768]:
array([[ 1., 2.],
[ 1., 2.],
[ 1., 2.],
...,
[ 1., 2.],
[ 1., 2.],
[ 1., 2.]])
In [769]: cd_x=cd[:,0]
In timeit tests, the keypoints step takes just as long as the cd calculation, 1ms.
But the 2 simpler iterations
cd_x=np.array([k.pt[0] for k in keypoints])
cd_y=np.array([k.pt[1] for k in keypoints])
take half the time. I was expecting the single iteration to save time, but in these simple cases the comprehension itself takes only half the time; the rest goes into creating the array.
In [789]: timeit [k.pt[0] for k in keypoints]
10000 loops, best of 3: 136 us per loop
In [790]: timeit np.array([k.pt[0] for k in keypoints])
1000 loops, best of 3: 282 us per loop
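For completeness, np.fromiter can build each coordinate array straight from a generator, skipping the intermediate list (a variation of my own; no timings claimed):
cd_x = np.fromiter((k.pt[0] for k in keypoints), dtype=float, count=len(keypoints))
cd_y = np.fromiter((k.pt[1] for k in keypoints), dtype=float, count=len(keypoints))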
Sorry if the title is confusing, but it is very hard to put what I would like to do in a single sentence. Imagine you have an image stack stack in the form of N m x n matrices, stored as a numpy array of shape (m, n, N). Now, if I want to apply numpy.median, for example, along the stack axis N, it is very easy: numpy.median(stack, 0). The problem is that for each image of the stack, I also have a mask of pixels that I would not like to include in the operation, in this case numpy.median. Is there any efficient way to do that?
So far, all I could think of is this, but it is incredibly slow and absolutely not feasible:
median = [[] for _ in range(images[0].size)]  # one list of candidate values per pixel
for i in range(len(images)):
    image = images[i].flatten()
    flat_mask = mask[i].flatten()
    for j in range(len(median)):
        if flat_mask[j] == 0:
            median[j].append(image[j])
for j in range(len(median)):
    median[j] = np.median(median[j]) if median[j] else 0
median = np.array(median).reshape(images[0].shape)
There has to be a better way.
What you can do is build an array with NaN in the positions you want to exclude and compute np.nanmedian (which ignores NaNs). You can build such an array "on the fly" using np.where:
x = np.arange(4*3*4).reshape((4,3,4))
m = x%2 == 0
np.nanmedian(np.where(m, x, np.NaN), axis=2)
>>array([[ 1., 5., 9.],
[13., 17., 21.],
[25., 29., 33.],
[37., 41., 45.]])
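Applied to the stack/mask setup from the question, a sketch under the assumption that mask has the same shape as stack, that mask == 0 marks the pixels to keep, and that the stack axis is the last one of the stated (m, n, N) shape:
masked = np.where(mask == 0, stack, np.nan)  # NaN out the excluded pixels
median = np.nanmedian(masked, axis=-1)       # per-pixel median over the stack axis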
I have a hard time understanding what you are trying to say, but hopefully this will help:
You can use np.where to find and replace - or ignore/remove - values that you want to exclude.
Or you can use bin_mask = stack != value_you_want_to_ignore to get a boolean array that you can use to ignore your critical values.
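For instance, a minimal sketch of the boolean-mask idea (names are illustrative, taken from the question and the line above):
bin_mask = stack != value_you_want_to_ignore  # True where the value should be kept
kept = stack[bin_mask]                        # 1-D array containing only the kept values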
What is the equivalent of the math.remainder() function in NumPy? Basically I would like to compute y = x - np.around(x) for a NumPy array x. (It is important that all elements of y have absolute value at most 1/2.) Looking at the NumPy documentation, neither np.fmod nor np.remainder does the job.
Obviously, I thought about writing x - np.around(x), but I am afraid that for large values of x the subtraction produces floating-point inaccuracies. For example:
import numpy as np
a = np.arange(1000) * 1e-9
x = a / 1e-9
y = x - np.around(x)
This should produce an all-zero vector y, but in practice there are some errors (which get larger if I increase the array size from 1000 to 10000).
The reason I am asking this question is to figure out whether there is a NumPy function for this purpose that calls the C math library's remainder directly (as math.remainder must do), so as to minimize the floating-point errors.
I don't think this currently exists in numpy. That said, the automagic numpy.vectorize call does the right thing for me, e.g.:
import math
import numpy as np
ieee_remainder = np.vectorize(math.remainder)
ieee_remainder(np.arange(5), 5)
this broadcasts over the parameters nicely, giving:
array([ 0., 1., 2., -2., -1.])
which might be what you want.
Performance is about 10 times slower than you'd get with a native implementation. Given an array of 10,000 elements, my laptop takes approximately the following times:
1200 µs for ieee_remainder
150 µs for a Cython version I hacked together
120 µs for a C program doing a naive loop
80 µs for numpy.fmod
Given the complexity of glibc's remainder implementation, I'm amazed it's as fast as it is.
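Applied to the example from the question, using the ieee_remainder defined above (my own adaptation; the remainder against 1.0 gives x minus the nearest integer, so |y| <= 0.5):
a = np.arange(1000) * 1e-9
x = a / 1e-9
y = ieee_remainder(x, 1.0)  # elementwise IEEE remainder, ties to even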
BACKSTORY:
Getting data together to feed into a neural network; it starts as a document (a long string), gets split into sentences, and the sentences are reduced to 1 or 0 depending on whether they contain a feature (in this case, a class of word) or not.
The catch is that documents have different numbers of sentences, so there can't be a 1-to-1 mapping between sentences and input neurons; you have to train on a fixed number of neurons (unless I'm missing something).
So I'm working on an algorithm to map arrays to a fixed size, while preserving the frequency and position of the 1's in the array as much as possible (as that's what the NN makes its decisions from).
CODE:
Say we're aiming for a fixed length of 10 sentences or neurons, and we need to be able to handle arrays of both smaller and larger size.
import math

new_length = 10
short = [1,0,1,0,0,0,0,1]
long = [1,1,0,0,1,0,0,0,0,1,0,0,1]

def map_to_fixed_length(arr, new_length):
    arr_length = len(arr)
    partition_size = arr_length / new_length
    res = []
    for i in range(new_length):
        slice_start_index = int(math.floor(i * partition_size))
        slice_end_index = int(math.ceil(i * partition_size))
        partition = arr[slice_start_index:slice_end_index]
        val = sum(partition)
        if val > 0:
            res.append(1)
        else:
            res.append(0)
    return res
Not very pythonic, probably. Anyway, the issue is that this omits certain index slices. For example, the last index of short is omitted, and due to the rounding various other indexes get omitted as well.
This is a simplified version of what I've been working on, which mainly just adds if-statements to address all the gaps this leaves. But is there a better way to do this? Something a bit more statistically sound?
I was looking through numpy, but all the resize functions just pad with zeros or do something fairly arbitrary.
A simple way might be to use scipy.interpolate.interp1d like so:
>>> from scipy.interpolate import interp1d
>>> def resample(data, n):
... m = len(data)
... xin, xout = np.arange(n, 2*m*n, 2*n), np.arange(m, 2*m*n, 2*m)
... return interp1d(xin, data, 'nearest', fill_value='extrapolate')(xout)
...
>>> resample(short, new_length)
array([1., 0., 0., 1., 0., 0., 0., 0., 0., 1.])
>>>
>>> resample(long, new_length)
array([1., 1., 0., 1., 0., 0., 0., 1., 0., 1.])
A follow-up: once the network was up and running, I tested Paul Panzer's answer vs. the best I could come up with (below).
def resample(arr, new_length):
    arr_length = len(arr)
    partition_size = arr_length / new_length
    res = []
    last_round_end_slice = 0
    for i in range(new_length):
        slice_start_index = int(math.floor(i * partition_size))
        slice_end_index = int(math.ceil(i * partition_size))
        if slice_start_index > last_round_end_slice:
            slice_start_index = last_round_end_slice
        if i == 0:
            slice_end_index = int(math.ceil(partition_size))
        if i == new_length:
            slice_end_index = arr_length
        partition = arr[slice_start_index:slice_end_index]
        val = sum(partition)
        if val > 0:
            res.append(1)
        else:
            res.append(0)
        last_round_end_slice = slice_end_index
    return res
Ugly as hell, but it does work.
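As an aside, a more compact vectorized version of the same bin-and-OR idea (a sketch of my own using np.maximum.reduceat, with partition boundaries chosen so every element falls into exactly one bin; this variant was not part of the benchmark below):
import numpy as np

def resample_reduceat(arr, new_length):
    arr = np.asarray(arr)
    # left edge of each of the new_length partitions (non-decreasing, all in range)
    bounds = (np.arange(new_length) * len(arr)) // new_length
    # max (i.e. OR for 0/1 data) within each partition
    return np.maximum.reduceat(arr, bounds)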
Average accuracy results after 1000 training cycles (full optimization cycles, all batches) were:
0.9765427094697953 for mine
0.968362500667572 for scipy
Interpolation is the correct answer, but in this use case a roll-your-own seems to work a bit better. The catch with interpolation is that it sometimes produces arrays that are all 0; that is, it doesn't err towards representing matches over eliminating them.
And I suppose the ultimate requirement is consistency. So long as the input is consistently determined, the network can learn from it.
Thought that was interesting in case anyone ends up stumbling upon this.
I know how to draw points moved by a matrix, like the code below:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
x=np.random.randn(2) #2*1 matrix
A=np.random.randn(2,2) #2*2 matrix
print ('the content of x:\n{}\n the content of A:\n{}'.format(x,A))
def action(pt, n):
    record = [pt]
    for i in range(n):
        pt = A @ pt
        record = np.vstack([record, pt])
    plt.scatter(record[:,1], record[:,1])

action(x, 100)
the function "action" will draw something like a line, but I want to move points by matrix and then draw it like an orbit
SHORT ANSWER:
plt.scatter(record[:,1], record[:,1]) feeds the same values to both the x and y dimensions and hence will always produce a line. Replace it with:
X,Y = np.hsplit(record,2)
plt.scatter(X,Y)
LONG ANSWER:
The main cause of the plot coming out as a line is that you are generating the plot using 2 constants (although randomly generated). I will illustrate using the example below:
>>> c
array([[ 1., 2.],
[ 2., 4.]])
>>> d
array([ 3., 4.])
>>> d @ c
array([ 11., 22.])
>>> d @ c @ c
array([ 55., 110.])
>>> d @ c @ c @ c
array([ 275., 550.])
Notice how the repeated operation just multiplies the initial coordinates by 5 at each stage.
How to get a non-linear plot?
Use the loop variable 'i', for example by raising it to a power of 2 (parabola) or higher.
Populate the 2 matrices with random numbers greater than 1 in magnitude. Otherwise the operations either drive the values off towards the negative side or, if the entries are between (-1, 1), the magnitude keeps decreasing.
Use mathematical functions to introduce non-linearity, e.g.:
pt = pt + np.sin(pt)
Consider whether using 2 random matrices and looping over them is the only way to achieve the curve. If this activity is independent from your bigger programme, then try a different approach using mathematical functions that generate the curve you like.
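To illustrate, a sketch of my own: with A replaced by a slightly damped rotation matrix and the two columns plotted separately, the same loop traces a spiral orbit:
import numpy as np
import matplotlib.pyplot as plt

theta = 0.2
A = 0.98 * np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])  # rotate by theta, shrink 2% per step
pt = np.random.randn(2)

record = [pt]
for _ in range(100):
    pt = A @ pt
    record.append(pt)
record = np.array(record)

plt.scatter(record[:, 0], record[:, 1])  # x and y come from different columns
plt.show()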
I am quite new to Python and numpy.
If I have a list of numpy vectors, what is the best way to ensure the computation is fast?
I am currently doing this, which I find to be too slow:
vec = sum(list of numpy vectors) # 4 vectors of 500 dimensions each
It takes quite a noticeable amount of time using sum.
Is this what you are trying to do (but with much larger arrays)?
In [193]: sum([np.ones((2,3)),np.arange(6).reshape(2,3)])
Out[193]:
array([[ 1., 2., 3.],
[ 4., 5., 6.]])
"500 dimensions each" is an unclear description. Do you mean an array with shape (500,) or with ndim==500? If the latter, just how many elements are there in total?
The fact that it is a list of 4 of these arrays shouldn't be a big deal. What's the time for array1 + array2?
If the arrays just have 500 elements each, the sum time is trivial:
In [195]: timeit sum([np.arange(500),np.arange(500),np.arange(500),np.arange(500)])
10000 loops, best of 3: 20.9 µs per loop
On the other hand, a sum of arrays with many small dimensions is slower, simply because such an array is much larger:
In [204]: x=np.ones((3,)*10)
In [205]: timeit z=sum([x,x,x,x])
1000 loops, best of 3: 1.6 ms per loop
In my opinion the sum call from the question is already the fastest variant. It is pure numpy and as such is computed in C code.
Alternatives would be to compute the sum of each vector individually and then sum the values of the list, or to stack all the vectors and then sum up. But both are slower:
import numpy as np
import time
n = 10000
start = time.time()
for i in range(n):
    lst = np.hstack([np.random.random(500) for i in range(4)])
    x = np.sum(lst)
print("stack then np.sum: ", time.time() - start)
start = time.time()
for i in range(n):
    lst = [np.sum(np.random.random(500)) for i in range(4)]
    x = np.sum(lst)
print("sum up individually: ", time.time() - start)
start = time.time()
for i in range(n):
    lst = [np.random.random(500) for i in range(4)]
    x = np.sum(lst)
print("np.sum on list of vectors:", time.time() - start)
output:
stack then np.sum: 0.35804247856140137
sum up individually: 0.400468111038208
np.sum on list of vectors: 0.3427283763885498
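If the goal is the elementwise sum of the four vectors (as in the first answer), stacking them and reducing over the new axis is an equivalent formulation; a quick sketch of my own (no timing claims):
import numpy as np

vecs = [np.random.random(500) for _ in range(4)]
total = np.sum(np.stack(vecs), axis=0)  # same elementwise result as sum(vecs), shape (500,)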