Speeding up Monte Carlo code in Python

Consider points Y given in increasing order from [0, T). We treat these points as lying on a circle of circumference T. Now consider points X, also from [0, T) and also lying on a circle of circumference T.
We say the distance between X and Y is the sum, over each point in X, of the absolute distance to its closest point in Y, recalling that both sets are considered to lie on a circle. Write this distance as Delta(X, Y).
I am trying to find a quick way of approximating the distribution of this distance over all possible rotations of X. I am currently doing this by Monte Carlo simulation. First, here is my code to make some fake data.
import random
import numpy as np
from bisect import bisect_left

def simul(rate, T):
    # Note: np.random.exponential takes the *scale* (mean) parameter,
    # so the argument here is the mean gap between points, not a rate.
    time = np.random.exponential(rate)
    times = [0]
    newtime = times[-1] + time
    while newtime < T:
        times.append(newtime)
        newtime = newtime + np.random.exponential(rate)
    return times[1:]
Now the code to find the distance between two circles.
def takeClosest(myList, myNumber, T):
    """
    Assumes myList is sorted. Returns the closest value to myNumber on a circle of circumference T.
    If two numbers are equally close, return the smallest number.
    """
    pos = bisect_left(myList, myNumber)
    if (pos == 0 and myList[pos] != myNumber):
        before = myList[pos - 1] - T   # wrap the last point backwards around the circle
        after = myList[0]
    elif (pos == len(myList)):
        before = myList[pos - 1]
        after = myList[0] + T          # wrap the first point forwards around the circle
    else:
        before = myList[pos - 1]
        after = myList[pos]
    if after - myNumber < myNumber - before:
        return after
    else:
        return before
def circle_dist(timesY, timesX):
    # T is taken from the enclosing scope
    dist = 0
    for t in timesX:
        closest_number = takeClosest(timesY, t, T)
        dist += np.abs(closest_number - t)
    return dist
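As a quick sanity check of the wrap-around behavior (example values of my own choosing, not from the original): on a circle of circumference 10, the nearest point in [1, 8] to t = 9.9 is the one at 1, and takeClosest returns it as 11 so that abs(11 - 9.9) = 1.1 is the circular gap.
    print takeClosest([1, 8], 9.9, 10)   # 11, i.e. the point at 1 wrapped around
    print takeClosest([1, 8], 6.0, 10)   # 8, the ordinary non-wrapping case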
Now the main code to make the data and to try 100 different random rotations.
T = 50000
timesX = simul(1, T)
timesY = simul(10, T)
dists = []
iters = 100
for i in xrange(iters):
    offset = np.random.randint(0, T)
    timesX = [(t + offset) % T for t in timesX]
    dists.append(circle_dist(timesY, timesX))
We can now print out any statistics we like of the distances. I am particularly interested in the variance.
print "Variance is ", np.var(dists)
Unfortunately I need to do this a lot, and it currently takes around 16 seconds. I find it a little surprising that it is so slow. Any suggestions for how to speed it up gratefully received.
Edit 1. Reduced the number of iterations to 100 (the previous value didn't correspond to my timings correctly). This now takes around 16 seconds on my computer.
Edit 2. Fixed bug in takeClosest

EDIT: I've just noticed that performance optimization is a little premature, because the expression closest_number - t is not a valid implementation of any definition of a distance on a "circle"; it is only a distance on an open-ended line.
sample test case (pseudocode):
T = 10
X = [1, 2]
Y = [9]
dist(X, Y) = dist(1, 9) + dist(2, 9)
dist_on_line = 8 + 7 = 15
dist_on_circle = 2 + 3 = 5
Note that definition of the circle [0,10) implies that dist(0, 10) is not defined, but in the limit it approaches 0: lim(dist(0, t), t->10) = 0
A correct implementation of a distance on a circle would be:
dist_of_t = min(t - closest_number_before_t,
                closest_number_after_t - t,
                T - t + closest_number_before_t,
                T - closest_number_after_t + t)
Original answer:
You could rotate and iterate over timesY instead of timesX, since that array is an order of magnitude smaller: a bisect_left lookup is negligible (O(log n)) compared to iterating over all the elements (O(n)).
But IMHO, the real slowdown is because of Python's dynamic typing (each of the ~50000 items in timesX has to be checked for type compatibility every time you compare it to some other value) => converting timesX and timesY to numpy arrays should help; if that is not enough, CPU acceleration (cython, numba, ...) is the thing you need.
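For illustration, here is a minimal sketch of what the numpy conversion might look like, replacing the per-element bisect with a single np.searchsorted call (the function name and the padding trick are mine, not from the question):
    import numpy as np

    def circle_dist_np(timesY, timesX, T):
        # timesY must be sorted; timesX need not be. Both lie in [0, T).
        # Pad Y with wrapped copies of its last and first points so the
        # neighbors across the seam are found by a plain searchsorted.
        y = np.concatenate(([timesY[-1] - T], timesY, [timesY[0] + T]))
        pos = np.searchsorted(y, timesX)   # first index with y[pos] >= x
        before = y[pos - 1]                # nearest y below each x
        after = y[pos]                     # nearest y at or above each x
        return np.sum(np.minimum(timesX - before, after - timesX))
One searchsorted call over the whole array amortizes the Python-level overhead that dominates the original loop.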

The function circle_dist can be replaced by a one-liner, which you can plug into your outer for i loop:
sum(abs(takeClosest(timesY, t, T) - t) for t in timesX)
Furthermore, you should always, if possible, allocate arrays like dists in one step and avoid appending elements many thousands of times.
But, unfortunately, both improvements only save a few percent of computing time.
Edit 1: Replacing np.abs(...) with abs(...) decreases computing time by 50 % on my machine (on a reduced data set)!
Edit 2: Updated the one-liner according to Aprillion's comment.
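Combining the suggestions above (the one-liner, the preallocated array, and the plain abs from Edit 1), the outer loop might look like this sketch, assuming the takeClosest from the question:
    dists = np.empty(iters)   # allocate once instead of appending
    for i in xrange(iters):
        offset = np.random.randint(0, T)
        shifted = [(t + offset) % T for t in timesX]
        dists[i] = sum(abs(takeClosest(timesY, t, T) - t) for t in shifted)
    print "Variance is ", np.var(dists)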

Related

How to optimize (3*O(n**2)) + O(n) algorithm?

I am trying to solve the arithmetic progression problem from USACO. Here is the problem statement:
An arithmetic progression is a sequence of the form a, a+b, a+2b, ..., a+nb where n = 0, 1, 2, 3, ... . For this problem, a is a non-negative integer and b is a positive integer.
Write a program that finds all arithmetic progressions of length n in the set S of bisquares. The set of bisquares is defined as the set of all integers of the form p^2 + q^2 (where p and q are non-negative integers).
The two lines of input are n and m: the length of each sequence, and the upper bound to limit the search of the bisquares, respectively.
I have implemented an algorithm which correctly solves the problem, yet it takes too long. With the max constraints of n = 25 and m = 250, my program does not solve the problem within the 5 second time limit.
Here is the code:
n = 25
m = 250
bisq = set()
for i in range(m+1):
    for j in range(i, m+1):
        bisq.add(i**2 + j**2)

seq = []
for b in range(1, max(bisq)):
    for a in bisq:
        x = a
        for i in range(n):
            if x not in bisq:
                break
            x += b
        else:
            seq.append((a, b))
The program outputs the correct answer, but it takes too long. I tried running the program with the max n/m values, and after 30 seconds, it was still going.
Disclaimer: this is not a full answer. This is more of a general direction where to look for.
For each member of a sequence, you're looking for four parameters: two numbers to be squared and summed (q_i and p_i), and two differences to be used in the next step (x and y) such that
q_i**2 + p_i**2 + b = (q_i + x)**2 + (p_i + y)**2
Subject to:
0 <= q_i <= m
0 <= p_i <= m
0 <= q_i + x <= m
0 <= p_i + y <= m
There are too many unknowns so we can't get a closed form solution.
Let's fix b: still too many unknowns.
Let's fix q_i, and also state that this is the first member of the sequence. I.e., let's start searching from q_1 = 0, extend as far as possible, and then extract all sequences of length n. Still too many unknowns.
Let's fix x: now we only have p_i and y to solve for. At this point, note that the range of values satisfying the equation is much smaller than the full range 0..m. After some algebra, b = x*(2*q_i + x) + y*(2*p_i + y), and there are really not many values to check.
This last pruning step is what distinguishes this from the full search. If you write the condition down explicitly, you can get the range of possible p_i values, and from that find the length of the possible sequence with step b as a function of q_i and x. Rejecting sequences shorter than n should further prune the search.
This should get you from O(m**4) complexity to ~O(m**2), which should be enough to get within the time limit.
A couple more things that might help prune the search space:
b <= 2*m*m//n
a <= 2*m*m - b*n
An answer on math.stackexchange says that for a number x to be a bisquare, any prime factor of x of the form 3 + 4k (e.g., 3, 7, 11, 19, ...) must occur to an even power. I think this means that for any n > 3, b has to be even: the first item in the sequence, a, is a bisquare, so it has an even number of factors of 3; if b were odd, then one of a+b or a+2b would have an odd number of factors of 3 and therefore would not be a bisquare.
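As a rough sketch of how these prunes bolt onto the original loop (my own untested illustration, assuming the even-b observation above holds):
    n, m = 25, 250

    bisq = {i*i + j*j for i in range(m + 1) for j in range(i, m + 1)}
    top = max(bisq)

    step = 2 if n > 3 else 1    # for n > 3, try only even b (see above)
    b_max = 2*m*m // n          # prune: b <= 2*m*m/n

    seq = []
    for b in range(step, b_max + 1, step):
        for a in bisq:
            if a + (n - 1)*b > top:   # the last term would exceed every bisquare
                continue
            if all(a + i*b in bisq for i in range(1, n)):
                seq.append((a, b))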

Optimizing Totient function

I'm trying to maximize the Euler totient function in Python, given that it can handle arbitrarily large numbers. The problem is that the program gets killed after some time, so it never reaches the desired ratio. I have thought of starting from a larger number, but I don't think that's prudent. I'm trying to find a number that, when divided by its totient, gives a ratio higher than 10. Essentially I'm trying to find a sparsely totient number that fits this criterion.
Here's my phi function:
import fractions

def phi(n):
    amount = 0
    for k in range(1, n + 1):
        if fractions.gcd(n, k) == 1:
            amount += 1
    return amount
The most likely candidates for high ratios of N/phi(N) are products of prime numbers. If you're just looking for one number with a ratio > 10, then you can generate primes and only check the product of primes up to the point where you get the desired ratio:
def totientRatio(maxN, ratio=10):
    primes = []
    primeProd = 1
    isPrime = [1]*(maxN+1)
    p = 2
    while p*p <= maxN:
        if isPrime[p]:
            isPrime[p*p::p] = [0]*len(range(p*p, maxN+1, p))
            primes.append(p)
            primeProd *= p
            tot = primeProd
            for f in primes:
                tot -= tot//f
            if primeProd/tot >= ratio:
                return primeProd, primeProd/tot, len(primes)
        p += 1 + (p&1)
output:
totientRatio(10**6)
16516447045902521732188973253623425320896207954043566485360902980990824644545340710198976591011245999110,
10.00371973209101,
55
This gives you the smallest number with that ratio. Multiples of that number will have the same ratio.
n = 16516447045902521732188973253623425320896207954043566485360902980990824644545340710198976591011245999110
n*2/totient(n*2) = 10.00371973209101
n*11*13/totient(n*11*13) = 10.00371973209101
No number will have a higher ratio until you reach the next product of primes (i.e. that number multiplied by the next prime).
n*263/totient(n*263) = 10.041901868473037
Removing a prime from the product affects the ratio by a proportion of (1-1/P).
For example if m = n/109, then m/phi(m) = n/phi(n) * (1-1/109)
(n//109) / totient(n//109) = 9.91194248684247
10.00371973209101 * (1-1/109) = 9.91194248684247
This should allow you to navigate the ratios efficiently and find the numbers that meet your needs.
For example, to get a number with a ratio that is >= 10 but closer to 10, you can go to the next prime product(s) and remove one or more of the smaller primes to reduce the ratio. This can be done using combinations (from itertools) and will allow you to find very specific ratios:
m = n*263/241
m/totient(m) = 10.000234225865265
m = n*(263...839) / (7 * 61 * 109 * 137) # 839 is 146th prime
m/totient(m) = 10.000000079805726
I have a partial solution for you, but the results don't look good (this solution may not give you an answer with modern computer hardware; the amount of RAM is the limiting factor currently). I took an answer from this PCG challenge and modified it to spit out ratios of n/phi(n) up to a particular n.
import numba as nb
import numpy as np
import time

n = int(2**31)

@nb.njit("i4[:](i4[:])", locals=dict(
    n=nb.int32, i=nb.int32, j=nb.int32, q=nb.int32, f=nb.int32))
def summarum(phi):
    # calculate phi(i) for i: 1 - n
    # taken from https://codegolf.stackexchange.com/a/26753/42652
    phi[1] = 1
    i = 2
    while i < n:
        if phi[i] == 0:
            phi[i] = i - 1
            j = 2
            while j * i < n:
                if phi[j] != 0:
                    q = j
                    f = i - 1
                    while q % i == 0:
                        f *= i
                        q //= i
                    phi[i * j] = f * phi[q]
                j += 1
        i += 1
    # divide each i by phi(i) to get the (integer) ratio i/phi(i)
    i = 1
    while i < n:  # a jit-compiled while loop is faster than a plain for-range loop
        phi[i] = i // phi[i]
        i += 1
    return phi

if __name__ == "__main__":
    s1 = time.time()
    a = summarum(np.zeros(n, np.int32))
    locations = np.where(a >= 10)
    print(len(locations[0]))  # np.where returns a tuple of index arrays
I only have enough RAM on my work computer to test about 0 < n < 10^8, and the largest ratio was about 6. You may or may not have any luck going up to larger n, although 10^8 already took several seconds (not sure what the overhead was; Spyder's been acting strangely lately).
p_55# is a sparsely totient number satisfying the desired condition.
Furthermore, all subsequent primorial numbers are as well, because p_n# / phi(p_n#) is a strictly increasing sequence:
p_1# / phi(p_1#) is 2, which is positive. For n > 1, p_n# / phi(p_n#) is equal to p_{n-1}# p_n / phi(p_{n-1}# p_n), which, since p_n and p_{n-1}# are coprime, is equal to (p_{n-1}# / phi(p_{n-1}#)) * (p_n / phi(p_n)). We know p_n > phi(p_n) > 0 for all n, so p_n / phi(p_n) > 1. So we have that the sequence p_n# / phi(p_n#) is strictly increasing.
I do not believe these are the only sparsely totient numbers satisfying your request, but I don't have an efficient way of generating the others in mind. Generating primorials, by comparison, amounts to generating the first n primes and multiplying the list together (whether by using functools.reduce(), math.prod() in 3.8+, or ye olde for loop), as sketched below.
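For instance, a small sketch of the primorial route (assuming sympy is available; the helper name is mine):
    from functools import reduce
    from sympy import prime

    def primorial(n):
        # Product of the first n primes; sympy.prime(k) is the k-th prime.
        return reduce(lambda acc, k: acc * prime(k), range(1, n + 1), 1)

    print(primorial(4))   # 2 * 3 * 5 * 7 = 210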
As for the general question of writing a phi(n) function, I would probably first find the prime factors of n, then use Euler's product formula for phi(n). As an aside, make sure NOT to use floating-point division. Even finding the prime factors of n by trial division should outperform computing gcd n times, but when working with large n, replacing this with an efficient prime factorization algorithm will pay dividends. Unless you want a good cross to die on, don't write your own: there's one in sympy that I'm aware of, and given the ubiquity of the problem, probably plenty of others around. Time as needed.
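A hedged sketch of that approach, assuming sympy's factorint for the factorization (integer-only arithmetic keeps it exact for large n):
    from sympy import factorint

    def phi(n):
        # Euler's product formula: phi(n) = n * prod(1 - 1/p) over the
        # distinct prime factors p of n, using exact integer arithmetic.
        result = n
        for p in factorint(n):     # keys are the distinct prime factors
            result -= result // p  # multiply by (1 - 1/p) exactly
        return result

    print(phi(36))   # 12
sympy also ships its own totient function, which can serve as a cross-check.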
Speaking of timing, if this is still relevant enough to you (or a future reader) to want to time... definitely throw the previous answer in the mix as well.

Fast algorithm to compute Adamic-Adar

I'm working on graph analysis. I want to compute an N by N similarity matrix that contains the Adamic-Adar similarity between every two vertices. To give an overview of Adamic-Adar, let me start with this introduction:
Given the adjacency matrix A of an undirected graph G, CN is the set of all common neighbors of two vertices x, y. A common neighbor of two vertices is a node to which both vertices have an edge, i.e. both vertices will have a 1 in the column of A corresponding to that node. k_n is the degree of node n.
Adamic-Adar is defined as the following: AA(x, y) = sum over each common neighbor u in CN(x, y) of 1 / log(k_u).
My attempt to compute it is to fetch the rows of both the x and y nodes from A and sum them, then look for the elements that have 2 as the value, get their degrees, and apply the equation. However, computing this takes a really long time. I tried it with a graph that contains 1032 vertices; after 7 minutes it was still running, and I cancelled the computation. So my question: is there a better algorithm to compute it?
Here's my code in python:
def aa(graph):
    """
    Calculates the Adamic-Adar index.
    """
    N = graph.num_vertices()
    A = gts.adjacency(graph)   # gts = graph_tool.spectral
    S = np.zeros((N, N))
    degrees = get_degrees_dic(graph)
    for i in xrange(N):
        A_i = A[i]
        for j in xrange(N):
            if j != i:
                A_j = A[j]
                intersection = A_i + A_j
                common_ns_degs = list()
                for index in xrange(N):
                    if intersection[index] == 2:
                        cn_deg = degrees[index]
                        common_ns_degs.append(1.0/np.log10(cn_deg))
                S[i, j] = np.sum(common_ns_degs)
    return S
Since you're using numpy, you can really cut down on the need to iterate for every operation in the algorithm. My numpy and vectorization fu aren't the greatest, but the below runs in around 2.5 s on a graph with ~13,000 nodes:
def adar_adamic(adj_mat):
    """Computes Adar-Adamic similarity matrix for an adjacency matrix"""
    Adar_Adamic = np.zeros(adj_mat.shape)
    for i in adj_mat:
        AdjList = i.nonzero()[0]   # column indices with nonzero values
        k_deg = len(AdjList)
        d = 1.0 / np.log(k_deg)    # this node's weight as a common neighbor
                                   # (natural log here; the question's log10
                                   # would only rescale every entry)
        # add the weight to the entry of every pair of this node's neighbors
        for ii in xrange(len(AdjList)):
            for jj in xrange(len(AdjList)):
                if AdjList[ii] != AdjList[jj]:
                    cell = (AdjList[ii], AdjList[jj])
                    Adar_Adamic[cell] = Adar_Adamic[cell] + d
    return Adar_Adamic
unlike MBo's answer, this does build the full, symmetric matrix, but the inefficiency (for me) was tolerable, given the execution time.
I believe you are using a rather slow approach. It would be better to invert it:
- initialize the AA (Adamic-Adar) matrix with zeros
- for every node k, get its degree k_deg
- calc d = 1/log(k_deg) (the question uses log10; the base only rescales the result)
- add d to all AA[i,j], where i,j range over all pairs of 1s in the kth row of the adjacency matrix
Edit:
- for sparse graphs it is useful to extract the positions of all 1s in the kth row into a list, to reach O(V*(V+E)) complexity instead of O(V^3); in Python:
AA = np.zeros((N, N))
for k in range(N):
    AdjList = [j for j in range(N) if A[k, j] == 1]
    k_deg = len(AdjList)
    d = 1.0 / np.log(k_deg)
    for j in range(len(AdjList) - 1):
        for i in range(j + 1, len(AdjList)):
            AA[AdjList[i], AdjList[j]] += d
# half of the matrix is filled; it is symmetric for an undirected graph
I don't see a way of reducing the time complexity, but it can be vectorized:
degrees = A.sum(axis=0)
weights = 1.0 / np.log10(degrees)
adamic_adar = (A*weights).dot(A.T)
with A a regular Numpy array. It seems you're using graph_tool.spectral.adjacency, so A would be a sparse matrix. In that case the code would be:
from scipy.sparse import csr_matrix
degrees = A.sum(axis=0)
weights = csr_matrix(1.0 / np.log10(degrees))
adamic_adar = A.multiply(weights) * A.T
This is much faster than using Python loops. A small warning though: with this approach you really need to make sure that the values on the main diagonal (of A and adamic_adar) are what you expect them to be. Also, A must not contain weights, but only zeros and ones.
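As a small sanity check of that warning (toy graph and values of my own, matching the question's 1/log10(k) weights):
    import numpy as np

    # Toy undirected graph: edges 0-1, 0-2, 0-3, 1-2, 2-3
    A = np.array([[0, 1, 1, 1],
                  [1, 0, 1, 0],
                  [1, 1, 0, 1],
                  [1, 0, 1, 0]])

    degrees = A.sum(axis=0)             # [3, 2, 3, 2], all >= 2 so log10 > 0
    weights = 1.0 / np.log10(degrees)
    adamic_adar = (A * weights).dot(A.T)
    np.fill_diagonal(adamic_adar, 0)    # self-similarity is not meaningful
    print(adamic_adar)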
I believe there must be a function for node similarity (Adamic-Adar as well) in python-igraph, like the one defined in R's igraph.

Monte Carlo Method in Python

I've been attempting to use Python to create a script that lets me generate large numbers of points for use in the Monte Carlo method to calculate an estimate of Pi. The script I have so far is this:
import math
import random

random.seed()
n = 10000
for i in range(n):
    x = random.random()
    y = random.random()
    z = (x, y)
    if x**2 + y**2 <= 1:
        print z
    else:
        del z
So far, I am able to generate all of the points I need, but what I would like to get is the number of points that are produced when running the script for use in a later calculation. I'm not looking for incredibly precise results, just a good enough estimate. Any suggestions would be greatly appreciated.
If you're doing any kind of heavy-duty numerical calculation, consider learning numpy. Your problem is essentially a one-liner with a numpy setup:
import numpy as np

N = 10000
pts = np.random.random((N, 2))
# Select the points according to your condition
idx = (pts**2).sum(axis=1) < 1.0
print pts[idx], idx.sum()
Giving:
[[ 0.61255615 0.44319463]
[ 0.48214768 0.69960483]
[ 0.04735956 0.18509277]
...,
[ 0.37543094 0.2858077 ]
[ 0.43304577 0.45903071]
[ 0.30838206 0.45977162]], 7854
The last number is the count of events that passed the test, i.e. the number of points whose radius is less than one. The estimate of Pi follows as 4 times that fraction: 4 * 7854 / 10000 ≈ 3.1416.
Not sure if this is what you're looking for, but you can run enumerate on range and get the position in your iteration:
In [1]: for index, i in enumerate(xrange(10, 15)):
   ...:     print index + 1, i
   ...:
1 10
2 11
3 12
4 13
5 14
In this case, index + 1 would represent the current point being created (index itself would be the total number of points created at the beginning of a given iteration). Also, if you are using Python 2.x, xrange is generally better for these sorts of iterations as it does not load the entire list into memory but rather accesses it on an as-needed basis.
Just add a hits variable before the loop, initialize it to 0, and inside your if statement increment hits by one.
Finally you can calculate the value of Pi using hits and n.
import math
import random

random.seed()
n = 10000
hits = 0  # initialize hits with 0
for i in range(n):
    x = random.random()
    y = random.random()
    z = (x, y)
    if x**2 + y**2 <= 1:
        hits += 1
    else:
        del z
# use hits and n to compute PI: the hit ratio approximates the
# area ratio (quarter circle)/(unit square) = pi/4
print 4.0 * hits / n

How can the Euclidean distance be calculated with NumPy?

I have two points in 3D space:
a = (ax, ay, az)
b = (bx, by, bz)
I want to calculate the distance between them:
dist = sqrt((ax-bx)^2 + (ay-by)^2 + (az-bz)^2)
How do I do this with NumPy? I have:
import numpy
a = numpy.array((ax, ay, az))
b = numpy.array((bx, by, bz))
Use numpy.linalg.norm:
dist = numpy.linalg.norm(a-b)
This works because the Euclidean distance is the l2 norm, and the default value of the ord parameter in numpy.linalg.norm is 2.
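A quick check with concrete numbers (values of my own choosing):
    import numpy as np

    a = np.array((1.0, 2.0, 3.0))
    b = np.array((4.0, 5.0, 6.0))
    print(np.linalg.norm(a - b))          # 5.196152422706632, i.e. sqrt(27)
    print(np.linalg.norm(a - b, ord=2))   # identical, since ord=2 is the default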
For more theory, see Introduction to Data Mining.
Use scipy.spatial.distance.euclidean:
from scipy.spatial import distance
a = (1, 2, 3)
b = (4, 5, 6)
dst = distance.euclidean(a, b)
For anyone interested in computing multiple distances at once, I've done a little comparison using perfplot (a small project of mine).
The first advice is to organize your data such that the arrays have dimension (3, n) (and are C-contiguous, obviously). If adding happens in the contiguous first dimension, things are faster, and it doesn't matter too much whether you use sqrt-sum with axis=0, linalg.norm with axis=0, or
a_min_b = a - b
numpy.sqrt(numpy.einsum('ij,ij->j', a_min_b, a_min_b))
which is, by a slight margin, the fastest variant. (That actually holds true for just one row as well.)
The variants where you sum up over the second axis, axis=1, are all substantially slower.
Code to reproduce the plot:
import numpy
import perfplot
from scipy.spatial import distance

def linalg_norm(data):
    a, b = data[0]
    return numpy.linalg.norm(a - b, axis=1)

def linalg_norm_T(data):
    a, b = data[1]
    return numpy.linalg.norm(a - b, axis=0)

def sqrt_sum(data):
    a, b = data[0]
    return numpy.sqrt(numpy.sum((a - b) ** 2, axis=1))

def sqrt_sum_T(data):
    a, b = data[1]
    return numpy.sqrt(numpy.sum((a - b) ** 2, axis=0))

def scipy_distance(data):
    a, b = data[0]
    return list(map(distance.euclidean, a, b))

def sqrt_einsum(data):
    a, b = data[0]
    a_min_b = a - b
    return numpy.sqrt(numpy.einsum("ij,ij->i", a_min_b, a_min_b))

def sqrt_einsum_T(data):
    a, b = data[1]
    a_min_b = a - b
    return numpy.sqrt(numpy.einsum("ij,ij->j", a_min_b, a_min_b))

def setup(n):
    a = numpy.random.rand(n, 3)
    b = numpy.random.rand(n, 3)
    out0 = numpy.array([a, b])
    out1 = numpy.array([a.T, b.T])
    return out0, out1

b = perfplot.bench(
    setup=setup,
    n_range=[2 ** k for k in range(22)],
    kernels=[
        linalg_norm,
        linalg_norm_T,
        scipy_distance,
        sqrt_sum,
        sqrt_sum_T,
        sqrt_einsum,
        sqrt_einsum_T,
    ],
    xlabel="len(x), len(y)",
)
b.save("norm.png")
I want to expound on the simple answer with various performance notes. np.linalg.norm will perhaps do more than you need:
dist = numpy.linalg.norm(a-b)
Firstly, this function is designed to work over a list and return all of the values, e.g. to compare the distance from pA to the set of points sP:
sP = np.array(points)  # an (n, 3) array of points rather than a Python set
pA = point
distances = np.linalg.norm(sP - pA, ord=2, axis=1)  # 'distances' is an array of length n
Remember several things:
Python function calls are expensive.
[Regular] Python doesn't cache name lookups.
So
def distance(pointA, pointB):
    dist = np.linalg.norm(pointA - pointB)
    return dist
isn't as innocent as it looks.
>>> dis.dis(distance)
2 0 LOAD_GLOBAL 0 (np)
2 LOAD_ATTR 1 (linalg)
4 LOAD_ATTR 2 (norm)
6 LOAD_FAST 0 (pointA)
8 LOAD_FAST 1 (pointB)
10 BINARY_SUBTRACT
12 CALL_FUNCTION 1
14 STORE_FAST 2 (dist)
3 16 LOAD_FAST 2 (dist)
18 RETURN_VALUE
Every time we call it, we have to do a global lookup for "np", a scoped lookup for "linalg" and a scoped lookup for "norm", and the overhead of merely calling the function can equate to dozens of Python instructions.
Lastly, we wasted two operations storing the result only to reload it for the return...
First pass at improvement: make the lookup faster, skip the store
def distance(pointA, pointB, _norm=np.linalg.norm):
    return _norm(pointA - pointB)
We get the far more streamlined:
>>> dis.dis(distance)
2 0 LOAD_FAST 2 (_norm)
2 LOAD_FAST 0 (pointA)
4 LOAD_FAST 1 (pointB)
6 BINARY_SUBTRACT
8 CALL_FUNCTION 1
10 RETURN_VALUE
The function call overhead still amounts to some work, though. And you'll want to do benchmarks to determine whether you might be better off doing the math yourself:
def distance(pointA, pointB):
    return (
        ((pointA.x - pointB.x) ** 2) +
        ((pointA.y - pointB.y) ** 2) +
        ((pointA.z - pointB.z) ** 2)
    ) ** 0.5  # fast sqrt
On some platforms, **0.5 is faster than math.sqrt. Your mileage may vary.
Advanced performance notes:
Why are you calculating distance? If the sole purpose is to display it,
print("The target is %.2fm away" % (distance(a, b)))
then move along. But if you're comparing distances, doing range checks, etc., I'd like to add some useful performance observations.
Let’s take two cases: sorting by distance or culling a list to items that meet a range constraint.
# Ultra naive implementations. Hold onto your hat.
def sort_things_by_distance(origin, things):
    return things.sort(key=lambda thing: distance(origin, thing))

def in_range(origin, range, things):
    things_in_range = []
    for thing in things:
        if distance(origin, thing) <= range:
            things_in_range.append(thing)
    return things_in_range
The first thing we need to remember is that we are using Pythagoras to calculate the distance (dist = sqrt(x^2 + y^2 + z^2)), so we're making a lot of sqrt calls. Math 101:
dist = sqrt(x^2 + y^2 + z^2)
∴ dist^2 = x^2 + y^2 + z^2
and, for non-negative values:
sq(N) < sq(M) iff N < M
sq(N) > sq(M) iff N > M
sq(N) = sq(M) iff N == M
In short: until we actually require the distance in a unit of X rather than X^2, we can eliminate the hardest part of the calculations.
# Still naive, but much faster.
def distance_sq(left, right):
    """ Returns the square of the distance between left and right. """
    return (
        ((left.x - right.x) ** 2) +
        ((left.y - right.y) ** 2) +
        ((left.z - right.z) ** 2)
    )

def sort_things_by_distance(origin, things):
    return things.sort(key=lambda thing: distance_sq(origin, thing))

def in_range(origin, range, things):
    things_in_range = []
    # Remember that sqrt(N)**2 == N, so if we square
    # range, we don't need to root the distances.
    range_sq = range**2
    for thing in things:
        if distance_sq(origin, thing) <= range_sq:
            things_in_range.append(thing)
    return things_in_range
Great, both functions no longer do any expensive square roots. That'll be much faster, but before you go further, check yourself: why did sort_things_by_distance need a "naive" disclaimer both times above? The answer is at the very bottom (*a1).
We can improve in_range by converting it to a generator:
def in_range(origin, range, things):
    range_sq = range**2
    yield from (thing for thing in things
                if distance_sq(origin, thing) <= range_sq)
This especially has benefits if you are doing something like:
if any(in_range(origin, max_dist, things)):
    ...
But if the very next thing you are going to do requires a distance,
for nearby in in_range(origin, walking_distance, hotdog_stands):
    print("%s %.2fm" % (nearby.name, distance(origin, nearby)))
consider yielding tuples:
def in_range_with_dist_sq(origin, range, things):
    range_sq = range**2
    for thing in things:
        dist_sq = distance_sq(origin, thing)
        if dist_sq <= range_sq:
            yield (thing, dist_sq)
This can be especially useful if you might chain range checks ('find things that are near X and within Nm of Y'), since you don't have to calculate the distance again; a sketch follows.
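A hedged sketch of such a chain, reusing in_range_with_dist_sq and distance_sq from above (the origin and range names are placeholders of mine):
# Things within range_x of origin_x, then filtered to within range_y of
# origin_y, without recomputing the distances to origin_x we already have.
near_x = in_range_with_dist_sq(origin_x, range_x, things)
near_both = [(thing, dx_sq)
             for thing, dx_sq in near_x
             if distance_sq(origin_y, thing) <= range_y ** 2]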
But what about if we're searching a really large list of things and we anticipate a lot of them not being worth consideration?
There is actually a very simple optimization:
def in_range_all_the_things(origin, range, things):
    range_sq = range**2
    for thing in things:
        dist_sq = (origin.x - thing.x) ** 2
        if dist_sq <= range_sq:
            dist_sq += (origin.y - thing.y) ** 2
            if dist_sq <= range_sq:
                dist_sq += (origin.z - thing.z) ** 2
                if dist_sq <= range_sq:
                    yield thing
Whether this is useful will depend on the size of 'things'.
def in_range_all_the_things(origin, range, things):
    range_sq = range**2
    if len(things) >= 4096:
        for thing in things:
            dist_sq = (origin.x - thing.x) ** 2
            if dist_sq <= range_sq:
                dist_sq += (origin.y - thing.y) ** 2
                if dist_sq <= range_sq:
                    dist_sq += (origin.z - thing.z) ** 2
                    if dist_sq <= range_sq:
                        yield thing
    elif len(things) > 32:
        for thing in things:
            dist_sq = (origin.x - thing.x) ** 2
            if dist_sq <= range_sq:
                dist_sq += (origin.y - thing.y) ** 2 + (origin.z - thing.z) ** 2
                if dist_sq <= range_sq:
                    yield thing
    else:
        ...  # just calculate distance and range-check it
And again, consider yielding the dist_sq. Our hotdog example then becomes:
# Chaining generators
info = in_range_with_dist_sq(origin, walking_distance, hotdog_stands)
info = ((stand, dist_sq**0.5) for stand, dist_sq in info)
for stand, dist in info:
    print("%s %.2fm" % (stand, dist))
(*a1: sort_things_by_distance's sort key calls distance_sq for every single item, and that innocent-looking key is a lambda, which is a second function that has to be invoked...)
Another instance of this problem-solving method:
def dist(x, y):
    return numpy.sqrt(numpy.sum((x - y)**2))

a = numpy.array((xa, ya, za))
b = numpy.array((xb, yb, zb))
dist_a_b = dist(a, b)
Starting Python 3.8, the math module directly provides the dist function, which returns the euclidean distance between two points (given as tuples or lists of coordinates):
from math import dist
dist((1, 2, 6), (-2, 3, 2)) # 5.0990195135927845
And if you're working with lists:
dist([1, 2, 6], [-2, 3, 2]) # 5.0990195135927845
It can be done like the following. I don't know how fast it is, but it's not using NumPy.
from math import sqrt

a = (1, 2, 3)  # Data point 1
b = (4, 5, 6)  # Data point 2
print sqrt(sum((x - y)**2 for x, y in zip(a, b)))
A nice one-liner:
dist = numpy.linalg.norm(a-b)
However, if speed is a concern I would recommend experimenting on your machine. I've found that using math library's sqrt with the ** operator for the square is much faster on my machine than the one-liner NumPy solution.
I ran my tests using this simple program:
#!/usr/bin/python
import math
import numpy
from random import uniform

def fastest_calc_dist(p1, p2):
    return math.sqrt((p2[0] - p1[0]) ** 2 +
                     (p2[1] - p1[1]) ** 2 +
                     (p2[2] - p1[2]) ** 2)

def math_calc_dist(p1, p2):
    return math.sqrt(math.pow((p2[0] - p1[0]), 2) +
                     math.pow((p2[1] - p1[1]), 2) +
                     math.pow((p2[2] - p1[2]), 2))

def numpy_calc_dist(p1, p2):
    return numpy.linalg.norm(numpy.array(p1) - numpy.array(p2))

TOTAL_LOCATIONS = 1000

p1 = dict()
p2 = dict()
for i in range(0, TOTAL_LOCATIONS):
    p1[i] = (uniform(0, 1000), uniform(0, 1000), uniform(0, 1000))
    p2[i] = (uniform(0, 1000), uniform(0, 1000), uniform(0, 1000))

total_dist = 0
for i in range(0, TOTAL_LOCATIONS):
    for j in range(0, TOTAL_LOCATIONS):
        dist = fastest_calc_dist(p1[i], p2[j])  # change this line for testing
        total_dist += dist

print total_dist
On my machine, math_calc_dist runs much faster than numpy_calc_dist: 1.5 seconds versus 23.5 seconds.
To get a measurable difference between fastest_calc_dist and math_calc_dist I had to up TOTAL_LOCATIONS to 6000. Then fastest_calc_dist takes ~50 seconds while math_calc_dist takes ~60 seconds.
You can also experiment with numpy.sqrt and numpy.square though both were slower than the math alternatives on my machine.
My tests were run with Python 2.6.6.
I found a 'dist' function in matplotlib.mlab, but I don't think it's handy enough. I'm posting it here just for reference (note that mlab.dist has been removed from newer Matplotlib releases).
import numpy as np
from matplotlib import mlab

a = np.array([1, 2, 3])
b = np.array([2, 3, 4])

# Distance between a and b
dis = mlab.dist(a, b)
You can just subtract the vectors and then take the inner product.
Following your example,
a = numpy.array((xa, ya, za))
b = numpy.array((xb, yb, zb))
tmp = a - b
sum_squared = numpy.dot(tmp.T, tmp)
result = numpy.sqrt(sum_squared)
I like np.dot (dot product):
a = np.array((xa, ya, za))
b = np.array((xb, yb, zb))
distance = (np.dot(a - b, a - b)) ** .5
With Python 3.8, it's very easy.
https://docs.python.org/3/library/math.html#math.dist
math.dist(p, q)
Return the Euclidean distance between two points p and q, each given
as a sequence (or iterable) of coordinates. The two points must have
the same dimension.
Roughly equivalent to:
sqrt(sum((px - qx) ** 2.0 for px, qx in zip(p, q)))
Having a and b as you defined them, you can use also:
distance = np.sqrt(np.sum((a-b)**2))
Since Python 3.8
Since Python 3.8 the math module includes the function math.dist().
See here https://docs.python.org/3.8/library/math.html#math.dist.
math.dist(p1, p2)
Return the Euclidean distance between two points p1 and p2,
each given as a sequence (or iterable) of coordinates.
import math
print( math.dist( (0,0), (1,1) )) # sqrt(2) -> 1.4142
print( math.dist( (0,0,0), (1,1,1) )) # sqrt(3) -> 1.7321
Here's some concise code for Euclidean distance, given two points represented as lists.
def distance(v1, v2):
    return sum([(x - y)**2 for (x, y) in zip(v1, v2)]) ** 0.5
import math
dist = math.hypot(math.hypot(xa-xb, ya-yb), za-zb)
Calculate the Euclidean distance for multidimensional space:
import math

x = [1, 2, 6]
y = [-2, 3, 2]
dist = math.sqrt(sum([(xi - yi)**2 for xi, yi in zip(x, y)]))
# 5.0990195135927845
import numpy as np
from scipy.spatial import distance

input_arr = np.array([[0, 3, 0], [2, 0, 0], [0, 1, 3], [0, 1, 2], [-1, 0, 1], [1, 1, 1]])
test_case = np.array([0, 0, 0])

dst = []
for i in range(0, 6):
    temp = distance.euclidean(test_case, input_arr[i])
    dst.append(temp)
print(dst)
You can easily use the formula
distance = np.sqrt(np.sum(np.square(a-b)))
which actually does nothing more than apply Pythagoras' theorem to calculate the distance, by adding the squares of Δx, Δy and Δz and taking the root of the result.
import numpy as np
# any two points given as Python lists
a = [0, 0]
b = [3, 4]
First change the lists to numpy arrays and do: print(np.linalg.norm(np.array(a) - np.array(b))). The second method works directly on the Python lists: print(np.linalg.norm(np.subtract(a, b)))
The other answers work for floating point numbers, but do not correctly compute the distance for integer dtypes, which are subject to overflow and underflow. Note that even scipy.spatial.distance.euclidean has this issue:
>>> a1 = np.array([1], dtype='uint8')
>>> a2 = np.array([2], dtype='uint8')
>>> a1 - a2
array([255], dtype=uint8)
>>> np.linalg.norm(a1 - a2)
255.0
>>> from scipy.spatial import distance
>>> distance.euclidean(a1, a2)
255.0
This is common, since many image libraries represent an image as an ndarray with dtype="uint8". This means that if you have a greyscale image which consists of very dark grey pixels (say all the pixels have color #000001) and you're diffing it against black image (#000000), you can end up with x-y consisting of 255 in all cells, which registers as the two images being very far apart from each other. For unsigned integer types (e.g. uint8), you can safely compute the distance in numpy as:
np.linalg.norm(np.maximum(x, y) - np.minimum(x, y))
For signed integer types, you can cast to a float first:
np.linalg.norm(x.astype("float") - y.astype("float"))
For image data specifically, you can use opencv's norm method:
import cv2
cv2.norm(x, y, cv2.NORM_L2)
Find the difference of the two matrices first. Then, apply element-wise multiplication with numpy's multiply command. After that, find the summation of the element-wise multiplied matrix. Finally, take the square root of the summation.
import numpy as np

def findEuclideanDistance(a, b):
    euclidean_distance = a - b
    euclidean_distance = np.sum(np.multiply(euclidean_distance, euclidean_distance))
    euclidean_distance = np.sqrt(euclidean_distance)
    return euclidean_distance
What's the best way to do this with NumPy, or with Python in general?
Well, the best way would be the safest and also the fastest. I would suggest using hypot for reliable results, since the chances of underflow and overflow are very small compared with writing your own square-root calculation.
Let's compare math.hypot and np.hypot with a vanilla np.sqrt(np.sum((np.array([i, j, k])) ** 2)). (Note that math.hypot accepts more than two arguments only since Python 3.8.)
i, j, k = 1e+200, 1e+200, 1e+200
math.hypot(i, j, k)
# 1.7320508075688773e+200
np.sqrt(np.sum((np.array([i, j, k])) ** 2))
# RuntimeWarning: overflow encountered in square
Speed-wise, math.hypot looks better:
%%timeit
math.hypot(i, j, k)
# 100 ns ± 1.05 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
%%timeit
np.sqrt(np.sum((np.array([i, j, k])) ** 2))
# 6.41 µs ± 33.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Underflow
i, j = 1e-200, 1e-200
np.sqrt(i**2+j**2)
# 0.0
Overflow
i, j = 1e+200, 1e+200
np.sqrt(i**2+j**2)
# inf
No Underflow
i, j = 1e-200, 1e-200
np.hypot(i, j)
# 1.414213562373095e-200
No Overflow
i, j = 1e+200, 1e+200
np.hypot(i, j)
# 1.414213562373095e+200
The fastest solution I could come up with for a large number of distances is using numexpr. On my machine it is faster than using numpy einsum:
import numexpr as ne
import numpy as np

a_min_b = a - b  # a, b: (n, 3) arrays of points, as in the setups above
np.sqrt(ne.evaluate("sum((a_min_b)**2, axis=1)"))
If you want something more explicit you can easily write the formula like this:
np.sqrt(np.sum((a-b)**2))
Even with arrays of 10_000_000 elements this still runs at 0.1s on my machine.
