I'm trying to use Python and NumPy/SciPy to implement an image processing algorithm. The profiler tells me a lot of time is being spent in the following function (called often), which computes the sum of squared differences between two images:
def ssd(A,B):
    s = 0
    for i in range(3):
        s += sum(pow(A[:,:,i] - B[:,:,i],2))
    return s
How can I speed this up? Thanks.
s = numpy.sum((A[:,:,0:3]-B[:,:,0:3])**2)
(which I expect is likely just sum((A-B)**2) if the shape is always (,,3))
You can also use the sum method: ((A-B)**2).sum()
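One caveat worth checking, assuming the images are stored as unsigned 8-bit integers (common for image data, but not stated in the question): A - B then wraps around instead of going negative, and the squares can overflow. A minimal sketch that converts to floating point first (the name ssd_float is made up for this example):

```python
import numpy as np

def ssd_float(A, B):
    """Sum of squared differences, computed in float64 to avoid integer wrap-around."""
    # Convert before subtracting so that e.g. uint8 0 - 255 gives -255, not 1.
    diff = A.astype(np.float64) - B.astype(np.float64)
    return np.sum(diff * diff)
```

If A and B are already floating-point arrays, the conversion only costs a copy and the result matches the one-liners above.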
Just to mention that one can also use np.dot:
def ssd(A,B):
    dif = A.ravel() - B.ravel()
    return np.dot( dif, dif )
This might be a bit faster and possibly more accurate than alternatives using np.sum and **2, but doesn't work if you want to compute ssd along a specified axis. In that case, there might be a magical subscript formula using np.einsum.
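As a rough sketch of the np.einsum idea, assuming the arrays have shape (rows, cols, channels) and that one SSD value per channel is wanted (the function name is hypothetical):

```python
import numpy as np

def ssd_per_channel(A, B):
    """Sum of squared differences over rows and columns, one value per channel."""
    d = np.asarray(A, dtype=float) - np.asarray(B, dtype=float)
    # 'ijk,ijk->k' multiplies d by itself elementwise and sums over i and j only.
    return np.einsum('ijk,ijk->k', d, d)
```

Changing the output subscripts (for example 'ijk,ijk->ij') selects a different set of axes to keep.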
I am confused why you are taking i in range(3). Is that supposed to be the whole array, or just part?
Overall, you can replace most of this with operations defined in numpy:
def ssd(A,B):
    squares = (A[:,:,:3] - B[:,:,:3]) ** 2
    return numpy.sum(squares)
This way you do one operation instead of three, and numpy.sum may be able to optimize the addition better than the builtin sum.
Further to Ritsaert Hornstra's answer that got 2 negative marks (admittedly I didn't see it in its original form...)
This is actually true.
For a large number of iterations it can often take twice as long to use the '**' operator or the pow(x,y) method as to just manually multiply the pairs together. If necessary use the math.fabs() method if it's throwing out NaN's (which it sometimes does especially when using int16s etc.), and it still only takes approximately half the time of the two functions given.
Not that important to the original question I know, but definitely worth knowing.
I do not know if the pow() function with power 2 will be fast. Try:
def ssd(A,B):
    s = 0
    for i in range(3):
        s += sum((A[:,:,i] - B[:,:,i])*(A[:,:,i] - B[:,:,i]))
    return s
You can try this one:
dist_sq = np.sum((A[:, np.newaxis, :] - B[np.newaxis, :, :]) ** 2, axis=-1)
More details can be found here (the 'k-Nearest Neighbors' example):
In Ruby you can achieve this as follows:
def diff_btw_sum_of_squars_and_squar_of_sum(from=1, to=100) # default range is 1..100
  ((from..to).inject(:+)**2) - (from..to).map {|num| num ** 2}.inject(:+)
end
diff_btw_sum_of_squars_and_squar_of_sum # call the method above
Since the following expansion for the logarithm holds:
one can calculate the following functions which have removable singularities at x:
I am trying to use NumPy for these calculations, and specifically the log1p function, which is accurate near x=0. However, convergence for the aforementioned functions is still problematic.
Do you have any ideas for any existing functions implementing these formulas or should I write one myself using the previous expansions, which will not be as efficient, however?
The simplest thing to do is something like
In [17]: def logf(x, eps=1e-6):
    ...:     if abs(x) < eps:
    ...:         return -0.5 - x/3.
    ...:     else:
    ...:         return (1. + log1p(-x)/x)/x
and play a bit with the threshold eps.
If you want a numpy-like, vectorized solution, replace the if with np.where:
>>> np.where(np.abs(x) > eps, (1. + np.log1p(-x)/x) / x, -0.5 - x/3.)
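Note that np.where evaluates both branches for every element, so an exact zero in x still triggers a division-by-zero warning in the branch that is later discarded. A possible workaround, sketched here with a hypothetical name logf_vec and an arbitrary placeholder value of 0.5, is to substitute a harmless value before dividing:

```python
import numpy as np

def logf_vec(x, eps=1e-6):
    """Vectorized (1 + log(1 - x)/x)/x with a series fallback near x = 0."""
    x = np.asarray(x, dtype=float)
    small = np.abs(x) < eps
    # Replace near-zero entries by a placeholder so the division below is safe;
    # those entries are overwritten by the series value in the final np.where.
    safe = np.where(small, 0.5, x)
    exact = (1.0 + np.log1p(-safe) / safe) / safe
    series = -0.5 - x / 3.0
    return np.where(small, series, exact)
```

As in the scalar version, the threshold eps is a tuning parameter, and the function is only meaningful for x < 1.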
Why not successively take the Square of the candidate, after initially extracting the exponent component? When the square results in a number greater than 2, divide by two, and set the bit in the mantissa of your result that corresponds to the iteration. This is a much quicker and simpler way of determining log base 2, which can then in a single multiplication, be transformed to the e or 10 base.
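A minimal sketch of this squaring scheme, assuming a positive Python float as input; math.frexp is used to extract the exponent, and the function name and the number of bits are choices made for this example:

```python
import math

def log2_by_squaring(x, bits=40):
    """Approximate log2(x) for x > 0 by repeatedly squaring the mantissa."""
    m, e = math.frexp(x)   # x = m * 2**e with 0.5 <= m < 1
    m *= 2.0               # rescale so that 1 <= m < 2
    e -= 1
    frac = 0.0
    weight = 0.5           # value of the fractional bit decided in this round
    for _ in range(bits):
        m *= m             # squaring doubles log2(m)
        if m >= 2.0:
            m /= 2.0       # bring m back into [1, 2)
            frac += weight # and record a 1 bit in the result
        weight /= 2.0
    return e + frac
```

Multiplying the result by math.log(2) gives the natural logarithm. This illustrates the idea only; it does not address the behaviour near the singularity discussed in the question.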
Some predefined functions don't work at singular points. One simple-minded solution is to evaluate the series directly, adding its terms one by one.
For your example, the sum would be:
total = 0
for k in range(1, n + 1):
    total += x**k / k
total = -total
for log(1-x).
Then you either add a large fixed number of terms or keep going until the last term falls below a small threshold.
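The threshold-based stopping rule could look like the following sketch; the name log1m_series, the tolerance and the cap on the number of terms are all assumptions of this example, and the series only converges for |x| < 1:

```python
import math

def log1m_series(x, tol=1e-15, max_terms=10000):
    """Approximate log(1 - x) for |x| < 1 via -sum(x**k / k for k >= 1)."""
    total = 0.0
    power = x                      # holds x**k, updated incrementally
    for k in range(1, max_terms + 1):
        term = power / k
        total += term
        if abs(term) < tol:        # stop once the last term is negligible
            break
        power *= x
    return -total
```

Convergence is fast near x = 0 and becomes very slow as |x| approaches 1, where math.log1p(-x) is the better choice.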
I have been browsing through the questions, and could find some help, but I prefer having confirmation by asking it directly. So here is my problem.
I have a (NumPy) array u of dimension N, from which I want to build a square matrix k of dimension N^2. Basically, each matrix element k(i,j) is defined as k(i,j)=exp(-|u_i-u_j|^2).
My first naive way to do it was like this, which is, I believe, Fortran-like:
for i in range(N):
    for j in range(N):
However, this is extremely slow. For N=1000, for example, it is taking around 15 seconds.
My other way to proceed is the following (inspired by other questions/answers):
i, j = np.ogrid[:N,:N]
k = np.exp(np.sum(-(u[i]-u[j])**2,axis=2))
This is way faster, as for N=1000, the result is almost instantaneous.
So I have two questions.
1) Why is the first method so slow, and why is the second one so fast?
2) Is there a faster way to do it? For N=10000, it is starting to take quite some time already, so I really don't know if this was the "right" way to do it.
Thank you in advance!
P.S: the matrix is symmetric, so there must also be a way to make the process faster by calculating only the upper half of the matrix, but my question was more related to the way to manipulate arrays, etc.
First, a small remark: there is no need to use np.sum if u is one-dimensional, for example u = np.arange(N), which seems to be the case since you wrote that it is of dimension N.
1) First question:
Indexing element by element in Python is slow, so it is best to avoid [] when there is a way around it. In addition, you call np.exp and np.sum many times on single elements, whereas they can be applied to whole vectors and matrices at once. So your second proposal is better, since you compute k all at once instead of element by element.
2) Second question:
Yes there is. You should consider using only numpy functions and not using indices (around 3 times faster):
k = np.exp(-np.power(np.subtract.outer(u,u),2))
(NB: You can keep **2 instead of np.power, which is a bit faster but has smaller precision)
edit (Take into account that u is an array of tuples)
With tuple data, it's a bit more complicated:
ma = np.subtract.outer(u[:,0],u[:,0])**2
mb = np.subtract.outer(u[:,1],u[:,1])**2
k = np.exp(-np.add(ma, mb))
You have to call np.subtract.outer twice, because calling it once on the full array returns a 4-dimensional array (and computes lots of useless data), whereas u[i]-u[j] returns a 3-dimensional array.
I used np.add instead of np.sum since it keeps the array dimensions.
NB: I checked with
N = 10000
u = np.random.random_sample((N,2))
It returns the same result as your proposals (but about 1.7 times faster).
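Another option, which is a different technique from the ones above, is to expand |u_i - u_j|^2 = |u_i|^2 + |u_j|^2 - 2 u_i.u_j so that most of the work becomes one matrix product. This is only a sketch: the function name is made up, u is assumed to have shape (N,) or (N, d), and the expansion can lose some accuracy for points that are very close together.

```python
import numpy as np

def gaussian_kernel_matrix(u):
    """k[i, j] = exp(-|u_i - u_j|^2), using the Gram-matrix expansion."""
    u = np.asarray(u, dtype=float)
    if u.ndim == 1:
        u = u[:, np.newaxis]                 # treat scalars as 1-d points
    sq_norms = np.einsum('ij,ij->i', u, u)   # |u_i|^2 for every row
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (u @ u.T)
    np.maximum(sq_dist, 0.0, out=sq_dist)    # clip tiny negatives from rounding
    return np.exp(-sq_dist)
```

Whether this beats np.subtract.outer depends on N and on the number of columns of u, so it is worth timing both on the real data.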
Basically I have an array that may vary between any two numbers, and I want to preserve the distribution while constraining it to the [0,1] space. The function to do this is very very simple. I usually write it as:
def to01(array):
    array -= array.min()
    array /= array.max()
    return array
Of course it can and should be more complex to account for tons of situations, such as all the values being the same (divide by zero) and float vs. integer division (use np.subtract and np.divide instead of operators). But this is the most basic.
The problem is that I do this very frequently across stuff in my project, and it seems like a fairly standard mathematical operation. Is there a built in function that does this in NumPy?
Don't know if there's a builtin for that (probably not, it's not really a difficult thing to do as is). You can use vectorize to apply a function to all the elements of the array:
def to01(array):
    a = array.min()
    # ignore the Runtime Warning
    with numpy.errstate(divide='ignore'):
        b = 1. /(array.max() - array.min())
    if not(numpy.isfinite(b)):
        b = 0
    return numpy.vectorize(lambda x: b * (x - a))(array)
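Since numpy.vectorize is essentially a Python-level loop, the same rescaling can also be written with plain array arithmetic. A sketch under the assumption that a constant array should map to all zeros (the name to01_vectorized is hypothetical):

```python
import numpy as np

def to01_vectorized(array):
    """Rescale values linearly to [0, 1]; a constant array maps to zeros."""
    arr = np.asarray(array, dtype=float)   # float math also avoids integer division
    lo = arr.min()
    span = arr.max() - lo
    if span == 0:
        return np.zeros_like(arr)          # all values equal: avoid dividing by zero
    return (arr - lo) / span               # returns a new array, input is unchanged
```

For example, to01_vectorized([2, 4, 6]) gives an array holding 0.0, 0.5 and 1.0.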
Could anyone suggest which library supports creation of a Gaussian filter of a required length and sigma? I basically need an equivalent of the MATLAB function below:
fltr = fspecial('gaussian',[1 n],sd)
You don't need a library for a simple 1D gaussian.
from math import pi, sqrt, exp
def gauss(n=11,sigma=1):
    r = range(-int(n/2),int(n/2)+1)
    return [1 / (sigma * sqrt(2*pi)) * exp(-float(x)**2/(2*sigma**2)) for x in r]
Note: This will always return an odd-length list centered around 0. I suppose there may be situations where you would want an even-length Gaussian with values for x = [..., -1.5, -0.5, 0.5, 1.5, ...], but in that case, you would need a slightly different formula and I'll leave that to you ;)
Output example with default values n = 11, sigma = 1:
>>> g = gauss()
>>> sum(g)
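MATLAB's fspecial('gaussian', ...) normalizes the kernel so that its entries sum to 1, whereas the list above holds raw density values. A NumPy sketch that normalizes explicitly and also accepts even lengths (gauss_kernel is a name chosen for this example):

```python
import numpy as np

def gauss_kernel(n=11, sigma=1.0):
    """1-D Gaussian kernel of length n, centred and normalized to sum to 1."""
    # Sample positions symmetric around 0; for even n these are ..., -0.5, 0.5, ...
    x = np.arange(n) - (n - 1) / 2.0
    g = np.exp(-x**2 / (2.0 * sigma**2))
    return g / g.sum()
```

Because of the final division, the 1 / (sigma * sqrt(2*pi)) prefactor is not needed here.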
Perhaps scipy.ndimage.filters.gaussian_filter? I've never used it, but the documentation is at: https://docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.ndimage.filters.gaussian_filter.html
Try scipy.ndimage.gaussian_filter, but do you really want the kernel or do you also want to apply it? (In which case you can just use this function.) In the former case, apply the filter on an array which is 0 everywhere but with a 1 in the center. For the easier-to-write 1d case, this would be for example:
>>> ndimage.gaussian_filter1d(np.array([0,0,0,0,1,0,0,0,0], dtype=float), 1)
array([ 1.33830625e-04, 4.43186162e-03, 5.39911274e-02,
2.41971446e-01, 3.98943469e-01, 2.41971446e-01,
5.39911274e-02, 4.43186162e-03, 1.33830625e-04])
If run-time speed matters, I highly recommend creating the filter once and then reusing it on every iteration. Optimizations are constantly being made, but a couple of years ago this significantly sped up some code I wrote. (The above answers show how to create the filter.)
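A sketch of that pattern, assuming 1-D signals and plain np.convolve; the helper names, the kernel size and the random test data are all made up for illustration:

```python
import numpy as np

def make_kernel(n, sigma):
    """Normalized 1-D Gaussian kernel of length n."""
    x = np.arange(n) - (n - 1) / 2.0
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()

KERNEL = make_kernel(11, 1.0)      # built once, outside the loop

def smooth(signal, kernel=KERNEL):
    """Apply the precomputed kernel; the output has the same length as signal."""
    return np.convolve(signal, kernel, mode="same")

signals = [np.random.rand(100) for _ in range(5)]
smoothed = [smooth(s) for s in signals]   # the kernel is reused on every iteration
```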
I have an iterator of numbers, for example a file object:
f = open("datafile.dat")
now I want to compute:
mean = get_mean(f)
sigma = get_sigma(f, mean)
What is the best implementation? Suppose that the file is big and I would like to avoid reading it twice.
If you want to iterate once, you can write your sum function:
def mysum(l):
    s2 = 0
    s = 0
    for e in l:
        s += e
        s2 += e * e
    return (s, s2)
and use the result in your sigma function.
Edit: now you can calculate the variance like this: (s2 - (s*s) / N) / N
Taking @Adam Bowen's comment into account, keep in mind that if we use mathematical tricks and transform the original formulas, we may degrade the results.
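One commonly used way to stay in a single pass while limiting that loss of precision is Welford's online algorithm, which is a different technique from the sum-of-squares approach above. A minimal sketch, assuming the iterator yields numbers and that the population standard deviation is wanted (the function name is hypothetical):

```python
import math

def mean_and_sigma(values):
    """One-pass mean and population standard deviation (Welford's algorithm)."""
    n = 0
    mean = 0.0
    m2 = 0.0                       # running sum of squared deviations from the mean
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)   # uses the updated mean on purpose
    if n == 0:
        raise ValueError("mean_and_sigma() needs at least one value")
    return mean, math.sqrt(m2 / n)
```

For a file with one number per line, it could be called as mean_and_sigma(float(line) for line in f).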
I think Nick D has the correct answer.
Assuming you want to compute both mean and variance in one sweep of the file (and you don't really want two functions that have to be called one after the other), you can collect the sum of the values and of their squares and then use these sums (together with the number of elements read) to compute the mean and variance at the same time.
There are some numerical stability issues, but the idea in
is the basic ingredient you need. Some more details are at
where I suggest you read the "Naïve algorithm" section.
Hope this helps,
You can compute both in one pass. See:
Make a list from the iterable, or use itertools.tee().
I am not sure there is much choice.
You will have to iterate your numbers twice in any case as the standard deviation will require the mean information on each value.
If you have enough memory, you can gain on the I/O access by loading your file in memory during the first iteration but that is about it IMO.
As I feel that there are good elements scattered in multiple answers, I would like to summarize:
If your file is too big to conveniently fit in memory, and if you want a good precision in the variance, you do need to read the file twice (with one pass, the variance is the difference between two large numbers, which is not precise because of floating point limitations). Note that your operating system is likely to provide some automatic speed-up for the second file reading, as it may still be in RAM during the second pass.
If you do not care for the precision of the variance, you can simply iterate once over the file and calculate the quantities suggested by Nick D, with the details provided in the comment by Adam Bowen.
You have two solutions:
1) Make a list out of your iterator and loop over it as many times as you wish. The drawback is that everything will be in memory, so this is not suitable if your file is big. Simply using itertools.tee will not save you either.
2) There is no other solution as long as the output of get_mean has to be passed to get_sigma, because then the two functions can only run in series. If you remove this restriction, you can run both functions in parallel using threads and use itertools.tee to get two iterators from one.
You can use map and reduce for an elegant solution.
sample is the list whose variance you want:
from functools import reduce  # needed on Python 3, where reduce is no longer a builtin
sample = [a,b,c, ...]
mean = float(reduce(lambda x,y : x+y, sample)) / len(sample)
variance = reduce(lambda x,y: x+y, map(lambda xi: (xi-mean)**2, sample))/ len(sample)
In a succinct line of code:
variance = reduce(lambda x,y: x+y, map(lambda xi: (xi-(float(reduce(lambda x,y : x+y, sample)) / len(sample)))**2, sample))/ len(sample)
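For data that already fits in a list, the standard library's statistics module (Python 3.4 and later) provides the same quantities directly. A short sketch with made-up sample values:

```python
import statistics

sample = [2, 4, 4, 4, 5, 5, 7, 9]            # illustrative data only
mean = statistics.mean(sample)
variance = statistics.pvariance(sample)      # population variance, divides by len(sample)
```

statistics.variance would give the sample variance (dividing by len(sample) - 1) instead.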