I have two points in 3D space:
a = (ax, ay, az)
b = (bx, by, bz)
I want to calculate the distance between them:
dist = sqrt((ax-bx)^2 + (ay-by)^2 + (az-bz)^2)
How do I do this with NumPy? I have:
import numpy
a = numpy.array((ax, ay, az))
b = numpy.array((bx, by, bz))
Use numpy.linalg.norm:
dist = numpy.linalg.norm(a-b)
This works because the Euclidean distance is the l2 norm, and with the default ord=None, numpy.linalg.norm computes the 2-norm of a vector.
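As a quick check, here is a minimal sketch with made-up coordinates (the sample values are assumptions chosen for illustration) showing that the default call agrees with the explicit formula from the question:

```python
import math
import numpy

# Hypothetical sample points, chosen only for illustration.
a = numpy.array((1.0, 2.0, 3.0))
b = numpy.array((4.0, 6.0, 3.0))

# Default ord gives the l2 (Euclidean) norm for a 1-D vector.
dist = numpy.linalg.norm(a - b)

# The same value written out as sqrt of the summed squared differences.
explicit = math.sqrt(((a - b) ** 2).sum())

print(dist)  # 5.0
```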
For more theory, see Introduction to Data Mining:
Use scipy.spatial.distance.euclidean:
from scipy.spatial import distance
a = (1, 2, 3)
b = (4, 5, 6)
dst = distance.euclidean(a, b)
For anyone interested in computing multiple distances at once, I've done a little comparison using perfplot (a small project of mine).
The first advice is to organize your data such that the arrays have dimension (3, n) (and are C-contiguous obviously). If adding happens in the contiguous first dimension, things are faster, and it doesn't matter too much if you use sqrt-sum with axis=0, linalg.norm with axis=0, or
a_min_b = a - b
numpy.sqrt(numpy.einsum('ij,ij->j', a_min_b, a_min_b))
which is, by a slight margin, the fastest variant. (That actually holds true for just one row as well.)
The variants where you sum up over the second axis, axis=1, are all substantially slower.
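To make the recommended layout concrete, here is a small sketch (array sizes and random data are assumptions for illustration) that stores n points column-wise with shape (3, n) and confirms that the three axis=0 variants give the same distances:

```python
import numpy

rng = numpy.random.default_rng(0)

# Hypothetical data: n points stored column-wise, so each array has shape (3, n).
n = 5
a = rng.random((3, n))
b = rng.random((3, n))

a_min_b = a - b

# Three equivalent ways to get the n distances, all reducing over axis 0.
d_einsum = numpy.sqrt(numpy.einsum("ij,ij->j", a_min_b, a_min_b))
d_norm = numpy.linalg.norm(a_min_b, axis=0)
d_sqrt_sum = numpy.sqrt(numpy.sum(a_min_b ** 2, axis=0))

print(d_einsum.shape)  # (5,)
```

Which variant is fastest on a given machine is something only a benchmark can tell; this sketch only shows that they agree.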
Code to reproduce the plot:
import numpy
import perfplot
from scipy.spatial import distance
def linalg_norm(data):
    a, b = data[0]
    return numpy.linalg.norm(a - b, axis=1)
def linalg_norm_T(data):
    a, b = data[1]
    return numpy.linalg.norm(a - b, axis=0)
def sqrt_sum(data):
    a, b = data[0]
    return numpy.sqrt(numpy.sum((a - b) ** 2, axis=1))
def sqrt_sum_T(data):
    a, b = data[1]
    return numpy.sqrt(numpy.sum((a - b) ** 2, axis=0))
def scipy_distance(data):
    a, b = data[0]
    return list(map(distance.euclidean, a, b))
def sqrt_einsum(data):
    a, b = data[0]
    a_min_b = a - b
    return numpy.sqrt(numpy.einsum("ij,ij->i", a_min_b, a_min_b))
def sqrt_einsum_T(data):
    a, b = data[1]
    a_min_b = a - b
    return numpy.sqrt(numpy.einsum("ij,ij->j", a_min_b, a_min_b))
def setup(n):
    a = numpy.random.rand(n, 3)
    b = numpy.random.rand(n, 3)
    out0 = numpy.array([a, b])
    out1 = numpy.array([a.T, b.T])
    return out0, out1
b = perfplot.bench(
    setup=setup,
    n_range=[2 ** k for k in range(22)],
    kernels=[
        linalg_norm,
        linalg_norm_T,
        scipy_distance,
        sqrt_sum,
        sqrt_sum_T,
        sqrt_einsum,
        sqrt_einsum_T,
    ],
    xlabel="len(x), len(y)",
)
b.save("norm.png")
I want to expound on the simple answer with various performance notes. np.linalg.norm will do perhaps more than you need:
dist = numpy.linalg.norm(a-b)
Firstly - this function is designed to work over a whole array of points and return all of the values, e.g. to compare the distance from pA to the set of points sP:
sP = np.array(points)  # shape (n, 3); a Python set cannot be subtracted from a point
pA = point
distances = np.linalg.norm(sP - pA, ord=2, axis=1) # 'distances' is an array of n values
Remember several things:
Python function calls are expensive.
[Regular] Python doesn't cache name lookups.
So
def distance(pointA, pointB):
    dist = np.linalg.norm(pointA - pointB)
    return dist
isn't as innocent as it looks.
>>> dis.dis(distance)
  2           0 LOAD_GLOBAL              0 (np)
              2 LOAD_ATTR                1 (linalg)
              4 LOAD_ATTR                2 (norm)
              6 LOAD_FAST                0 (pointA)
              8 LOAD_FAST                1 (pointB)
             10 BINARY_SUBTRACT
             12 CALL_FUNCTION            1
             14 STORE_FAST               2 (dist)
  3          16 LOAD_FAST                2 (dist)
             18 RETURN_VALUE
Firstly - every time we call it, we have to do a global lookup for "np", a scoped lookup for "linalg" and a scoped lookup for "norm", and the overhead of merely calling the function can equate to dozens of Python instructions.
Lastly, we wasted two operations to store the result and reload it for return...
First pass at improvement: make the lookup faster, skip the store
def distance(pointA, pointB, _norm=np.linalg.norm):
    return _norm(pointA - pointB)
We get the far more streamlined:
>>> dis.dis(distance)
  2           0 LOAD_FAST                2 (_norm)
              2 LOAD_FAST                0 (pointA)
              4 LOAD_FAST                1 (pointB)
              6 BINARY_SUBTRACT
              8 CALL_FUNCTION            1
             10 RETURN_VALUE
The function call overhead still amounts to some work, though. And you'll want to do benchmarks to determine whether you might be better doing the math yourself:
def distance(pointA, pointB):
    return (
        ((pointA.x - pointB.x) ** 2) +
        ((pointA.y - pointB.y) ** 2) +
        ((pointA.z - pointB.z) ** 2)
    ) ** 0.5 # fast sqrt
On some platforms, **0.5 is faster than math.sqrt. Your mileage may vary.
**** Advanced performance notes.
Why are you calculating distance? If the sole purpose is to display it,
print("The target is %.2fm away" % (distance(a, b)))
move along. But if you're comparing distances, doing range checks, etc., I'd like to add some useful performance observations.
Let’s take two cases: sorting by distance or culling a list to items that meet a range constraint.
# Ultra naive implementations. Hold onto your hat.
def sort_things_by_distance(origin, things):
    # list.sort() sorts in place and returns None, so return the list itself.
    things.sort(key=lambda thing: distance(origin, thing))
    return things
def in_range(origin, range, things):
    things_in_range = []
    for thing in things:
        if distance(origin, thing) <= range:
            things_in_range.append(thing)
    return things_in_range
The first thing we need to remember is that we are using Pythagoras to calculate the distance (dist = sqrt(x^2 + y^2 + z^2)) so we're making a lot of sqrt calls. Math 101:
dist = root ( x^2 + y^2 + z^2 )
:.
dist^2 = x^2 + y^2 + z^2
and
sq(N) < sq(M) iff M > N
and
sq(N) > sq(M) iff N > M
and
sq(N) = sq(M) iff N == M
In short: until we actually require the distance in a unit of X rather than X^2, we can eliminate the hardest part of the calculations.
# Still naive, but much faster.
def distance_sq(left, right):
    """ Returns the square of the distance between left and right. """
    return (
        ((left.x - right.x) ** 2) +
        ((left.y - right.y) ** 2) +
        ((left.z - right.z) ** 2)
    )
def sort_things_by_distance(origin, things):
    # list.sort() sorts in place and returns None, so return the list itself.
    things.sort(key=lambda thing: distance_sq(origin, thing))
    return things
def in_range(origin, range, things):
    things_in_range = []
    # Remember that sqrt(N)**2 == N, so if we square
    # range, we don't need to root the distances.
    range_sq = range**2
    for thing in things:
        if distance_sq(origin, thing) <= range_sq:
            things_in_range.append(thing)
    return things_in_range
Great, both functions no longer do any expensive square roots. That'll be much faster, but before you go further, check yourself: why did sort_things_by_distance need a "naive" disclaimer both times above? Answer at the very bottom (*a1).
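The claim that comparing squared distances gives the same answers as comparing real distances can be checked directly. This is a small sketch, assuming points are plain (x, y, z) tuples rather than objects with .x/.y/.z attributes; the helper name dist_sq_tuple and the coordinates are made up for illustration:

```python
import math

def dist_sq_tuple(p, q):
    """Squared Euclidean distance between two (x, y, z) tuples (hypothetical helper)."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

origin = (0, 0, 0)
points = [(3, 4, 0), (1, 1, 1), (0, 0, 10)]  # illustrative data only
max_range = 5

# Sorting by the rooted distance and by the squared distance gives the same order.
by_dist = sorted(points, key=lambda p: math.sqrt(dist_sq_tuple(origin, p)))
by_dist_sq = sorted(points, key=lambda p: dist_sq_tuple(origin, p))

# A range check against max_range equals a check against max_range squared.
near_root = [p for p in points if math.sqrt(dist_sq_tuple(origin, p)) <= max_range]
near_sq = [p for p in points if dist_sq_tuple(origin, p) <= max_range ** 2]

print(by_dist == by_dist_sq, near_root == near_sq)  # True True
```

This relies on distances and ranges being non-negative, which is what makes squaring order-preserving.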
We can improve in_range by converting it to a generator:
def in_range(origin, range, things):
    range_sq = range**2
    yield from (thing for thing in things
                if distance_sq(origin, thing) <= range_sq)
This especially has benefits if you are doing something like:
if any(in_range(origin, max_dist, things)):
    ...
But if the very next thing you are going to do requires a distance,
for nearby in in_range(origin, walking_distance, hotdog_stands):
    print("%s %.2fm" % (nearby.name, distance(origin, nearby)))
consider yielding tuples:
def in_range_with_dist_sq(origin, range, things):
    range_sq = range**2
    for thing in things:
        dist_sq = distance_sq(origin, thing)
        if dist_sq <= range_sq:
            yield (thing, dist_sq)
This can be especially useful if you might chain range checks ('find things that are near X and within Nm of Y', since you don't have to calculate the distance again).
But what about if we're searching a really large list of things and we anticipate a lot of them not being worth consideration?
There is actually a very simple optimization:
def in_range_all_the_things(origin, range, things):
    range_sq = range**2
    for thing in things:
        dist_sq = (origin.x - thing.x) ** 2
        if dist_sq <= range_sq:
            dist_sq += (origin.y - thing.y) ** 2
            if dist_sq <= range_sq:
                dist_sq += (origin.z - thing.z) ** 2
                if dist_sq <= range_sq:
                    yield thing
Whether this is useful will depend on the size of 'things'.
def in_range_all_the_things(origin, range, things):
    range_sq = range**2
    if len(things) >= 4096:
        for thing in things:
            dist_sq = (origin.x - thing.x) ** 2
            if dist_sq <= range_sq:
                dist_sq += (origin.y - thing.y) ** 2
                if dist_sq <= range_sq:
                    dist_sq += (origin.z - thing.z) ** 2
                    if dist_sq <= range_sq:
                        yield thing
    elif len(things) > 32:
        for thing in things:
            dist_sq = (origin.x - thing.x) ** 2
            if dist_sq <= range_sq:
                dist_sq += (origin.y - thing.y) ** 2 + (origin.z - thing.z) ** 2
                if dist_sq <= range_sq:
                    yield thing
    else:
        # Small lists: just calculate the squared distance and range-check it.
        for thing in things:
            if distance_sq(origin, thing) <= range_sq:
                yield thing
And again, consider yielding the dist_sq. Our hotdog example then becomes:
# Chaining generators
info = in_range_with_dist_sq(origin, walking_distance, hotdog_stands)
info = ((stand, dist_sq**0.5) for stand, dist_sq in info)  # the tuple needs its own parentheses
for stand, dist in info:
    print("%s %.2fm" % (stand, dist))
(*a1: sort_things_by_distance's sort key calls distance_sq for every single item, and that innocent looking key is a lambda, which is a second function that has to be invoked...)
Another instance of this problem solving method:
def dist(x,y):
    return numpy.sqrt(numpy.sum((x-y)**2))
a = numpy.array((xa,ya,za))
b = numpy.array((xb,yb,zb))
dist_a_b = dist(a,b)
Starting with Python 3.8, the math module directly provides the dist function, which returns the Euclidean distance between two points (given as tuples or lists of coordinates):
from math import dist
dist((1, 2, 6), (-2, 3, 2)) # 5.0990195135927845
And if you're working with lists:
dist([1, 2, 6], [-2, 3, 2]) # 5.0990195135927845
It can be done like the following. I don't know how fast it is, but it's not using NumPy.
from math import sqrt
a = (1, 2, 3) # Data point 1
b = (4, 5, 6) # Data point 2
print(sqrt(sum((p - q)**2 for p, q in zip(a, b))))
A nice one-liner:
dist = numpy.linalg.norm(a-b)
However, if speed is a concern I would recommend experimenting on your machine. I've found that using math library's sqrt with the ** operator for the square is much faster on my machine than the one-liner NumPy solution.
I ran my tests using this simple program:
#!/usr/bin/python
import math
import numpy
from random import uniform
def fastest_calc_dist(p1,p2):
    return math.sqrt((p2[0] - p1[0]) ** 2 +
                     (p2[1] - p1[1]) ** 2 +
                     (p2[2] - p1[2]) ** 2)
def math_calc_dist(p1,p2):
    return math.sqrt(math.pow((p2[0] - p1[0]), 2) +
                     math.pow((p2[1] - p1[1]), 2) +
                     math.pow((p2[2] - p1[2]), 2))
def numpy_calc_dist(p1,p2):
    return numpy.linalg.norm(numpy.array(p1)-numpy.array(p2))
TOTAL_LOCATIONS = 1000
p1 = dict()
p2 = dict()
for i in range(0, TOTAL_LOCATIONS):
    p1[i] = (uniform(0,1000),uniform(0,1000),uniform(0,1000))
    p2[i] = (uniform(0,1000),uniform(0,1000),uniform(0,1000))
total_dist = 0
for i in range(0, TOTAL_LOCATIONS):
    for j in range(0, TOTAL_LOCATIONS):
        dist = fastest_calc_dist(p1[i], p2[j]) #change this line for testing
        total_dist += dist
print(total_dist)  # parenthesized form works in both Python 2 and 3
On my machine, math_calc_dist runs much faster than numpy_calc_dist: 1.5 seconds versus 23.5 seconds.
To get a measurable difference between fastest_calc_dist and math_calc_dist I had to up TOTAL_LOCATIONS to 6000. Then fastest_calc_dist takes ~50 seconds while math_calc_dist takes ~60 seconds.
You can also experiment with numpy.sqrt and numpy.square though both were slower than the math alternatives on my machine.
My tests were run with Python 2.6.6.
I found a 'dist' function in matplotlib.mlab, but I don't think it's handy enough, and it appears to have been removed from newer matplotlib releases.
I'm posting it here just for reference.
import numpy as np
from matplotlib import mlab
a = np.array([1, 2, 3])
b = np.array([2, 3, 4])
# Distance between a and b
dis = mlab.dist(a, b)
You can just subtract the vectors and then take the inner product of the difference with itself.
Following your example,
a = numpy.array((xa, ya, za))
b = numpy.array((xb, yb, zb))
tmp = a - b
sum_squared = numpy.dot(tmp.T, tmp)
result = numpy.sqrt(sum_squared)
I like np.dot (dot product):
a = numpy.array((xa,ya,za))
b = numpy.array((xb,yb,zb))
distance = (numpy.dot(a-b,a-b))**.5
With Python 3.8, it's very easy.
https://docs.python.org/3/library/math.html#math.dist
math.dist(p, q)
Return the Euclidean distance between two points p and q, each given
as a sequence (or iterable) of coordinates. The two points must have
the same dimension.
Roughly equivalent to:
sqrt(sum((px - qx) ** 2.0 for px, qx in zip(p, q)))
Having a and b as you defined them, you can use also:
distance = np.sqrt(np.sum((a-b)**2))
Since Python 3.8
Since Python 3.8 the math module includes the function math.dist().
See here https://docs.python.org/3.8/library/math.html#math.dist.
math.dist(p1, p2)
Return the Euclidean distance between two points p1 and p2,
each given as a sequence (or iterable) of coordinates.
import math
print( math.dist( (0,0), (1,1) )) # sqrt(2) -> 1.4142
print( math.dist( (0,0,0), (1,1,1) )) # sqrt(3) -> 1.7321
Here's some concise code for Euclidean distance in Python given two points represented as lists in Python.
def distance(v1,v2):
    return sum([(x-y)**2 for (x,y) in zip(v1,v2)])**(0.5)
import math
dist = math.hypot(math.hypot(xa-xb, ya-yb), za-zb)
Calculate the Euclidean distance for multidimensional space:
import math
x = [1, 2, 6]
y = [-2, 3, 2]
dist = math.sqrt(sum([(xi-yi)**2 for xi,yi in zip(x, y)]))
5.0990195135927845
import numpy as np
from scipy.spatial import distance
input_arr = np.array([[0,3,0],[2,0,0],[0,1,3],[0,1,2],[-1,0,1],[1,1,1]])
test_case = np.array([0,0,0])
dst=[]
for i in range(0,6):
    temp = distance.euclidean(test_case,input_arr[i])
    dst.append(temp)
print(dst)
You can easily use the formula
distance = np.sqrt(np.sum(np.square(a-b)))
which does actually nothing more than using Pythagoras' theorem to calculate the distance, by adding the squares of Δx, Δy and Δz and rooting the result.
import numpy as np
# any two python array as two points
a = [0, 0]
b = [3, 4]
You can first convert the lists to NumPy arrays: print(np.linalg.norm(np.array(a) - np.array(b))). A second method works directly on the Python lists: print(np.linalg.norm(np.subtract(a,b)))
The other answers work for floating point numbers, but do not correctly compute the distance for integer dtypes, which are subject to overflow and underflow. Note that even scipy.spatial.distance.euclidean has this issue:
>>> a1 = np.array([1], dtype='uint8')
>>> a2 = np.array([2], dtype='uint8')
>>> a1 - a2
array([255], dtype=uint8)
>>> np.linalg.norm(a1 - a2)
255.0
>>> from scipy.spatial import distance
>>> distance.euclidean(a1, a2)
255.0
This is common, since many image libraries represent an image as an ndarray with dtype="uint8". This means that if you have a greyscale image which consists of very dark grey pixels (say all the pixels have color #000001) and you're diffing it against black image (#000000), you can end up with x-y consisting of 255 in all cells, which registers as the two images being very far apart from each other. For unsigned integer types (e.g. uint8), you can safely compute the distance in numpy as:
np.linalg.norm(np.maximum(x, y) - np.minimum(x, y))
For signed integer types, you can cast to a float first:
np.linalg.norm(x.astype("float") - y.astype("float"))
For image data specifically, you can use opencv's norm method:
import cv2
cv2.norm(x, y, cv2.NORM_L2)
First find the difference of the two matrices. Then apply element-wise multiplication with NumPy's multiply command. After that, sum the elements of the resulting matrix. Finally, take the square root of the sum.
def findEuclideanDistance(a, b):
    euclidean_distance = a - b
    euclidean_distance = np.sum(np.multiply(euclidean_distance, euclidean_distance))
    euclidean_distance = np.sqrt(euclidean_distance)
    return euclidean_distance
What's the best way to do this with NumPy, or with Python in general? I have:
Well, the best way would be the safest and also the fastest.
I would suggest using hypot for reliable results, since the chances of underflow and overflow are very small compared to writing your own square-root calculation.
Let's compare math.hypot and np.hypot against vanilla np.sqrt(np.sum((np.array([i, j, k])) ** 2)). Note that math.hypot accepts more than two arguments only from Python 3.8 on.
i, j, k = 1e+200, 1e+200, 1e+200
math.hypot(i, j, k)
# 1.7320508075688773e+200
np.sqrt(np.sum((np.array([i, j, k])) ** 2))
# RuntimeWarning: overflow encountered in square
Speed-wise, math.hypot looks better:
%%timeit
math.hypot(i, j, k)
# 100 ns ± 1.05 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
%%timeit
np.sqrt(np.sum((np.array([i, j, k])) ** 2))
# 6.41 µs ± 33.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Underflow
i, j = 1e-200, 1e-200
np.sqrt(i**2+j**2)
# 0.0
Overflow
i, j = 1e+200, 1e+200
np.sqrt(i**2+j**2)
# inf
No Underflow
i, j = 1e-200, 1e-200
np.hypot(i, j)
# 1.414213562373095e-200
No Overflow
i, j = 1e+200, 1e+200
np.hypot(i, j)
# 1.414213562373095e+200
The fastest solution I could come up with for large number of distances is using numexpr. On my machine it is faster than using numpy einsum:
import numexpr as ne
import numpy as np
np.sqrt(ne.evaluate("sum((a_min_b)**2,axis=1)"))
If you want something more explicit you can easily write the formula like this:
np.sqrt(np.sum((a-b)**2))
Even with arrays of 10_000_000 elements this still runs at 0.1s on my machine.
Using Python, I would like to implement a function that takes a natural number n as input and outputs a list of natural numbers [y1, y2, y3, ...] such that n + y1*y1 and n + y2*y2 and n + y3*y3 and so forth is again a square.
What I tried so far is to obtain one y-value using the following function:
def find_square(n:int) -> tuple[int, int]:
    if n%2 == 1:
        y = (n-1)//2
        x = n+y*y
        return (y,x)
    return None
It works fine, eg. find_square(13689) gives me a correct solution y=6844. It would be great to have an algorithm that yields all possible y-values such as y=44 or y=156.
Simplest slow approach is of course for given N just to iterate all possible Y and check if N + Y^2 is square.
But there is a much faster approach using integer Factorization technique:
Notice that solving the equation N + Y^2 = X^2, that is, finding all integer pairs (X, Y) for a given fixed integer N, can be rewritten as N = X^2 - Y^2 = (X + Y) * (X - Y), which follows from the well-known difference-of-squares formula.
Now let's name the two factors A and B, i.e. N = (X + Y) * (X - Y) = A * B, which means that X = (A + B) / 2 and Y = (A - B) / 2.
Notice that A and B must have the same parity, either both odd or both even, otherwise the divisions by 2 in the formulas above do not produce integers.
We will factorize N into all possible pairs of two factors (A, B) of the same parity. For fast factorization, the code below uses the simple-to-implement yet quite fast Pollard Rho algorithm. Two extra algorithms are needed as helpers to Pollard Rho: the Fermat Primality Test (which quickly checks whether a number is probably prime) and Trial Division Factorization (which factors out small factors that could otherwise cause Pollard Rho to fail).
Pollard Rho for a composite number has time complexity O(N^(1/4)), which is very fast even for 64-bit numbers. Any faster factorization algorithm can be chosen if a bigger space needs to be searched. The running time of my fast algorithm is dominated by the factorization; the remaining part is blazingly fast, just a few loop iterations with simple formulas.
If your N is itself a square (hence we know its root easily), then Pollard Rho can factor N even faster, within O(N^(1/8)) time. Even for 128-bit numbers that means a very small amount of work, about 2^16 operations, and I hope you're solving your task for numbers smaller than 128 bits.
If you want to process a range of possible N values, then the fastest way to factorize them is to use techniques similar to the Sieve of Eratosthenes: using a set of prime numbers, it computes all factors for all N within some range. Using the Sieve of Eratosthenes for a range of Ns is much faster than factorizing each N with Pollard Rho.
After factoring N into pairs (A, B) we compute (X, Y) based on (A, B) by formulas above. And output resulting Y as a solution of fast algorithm.
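The factor-pair idea can be illustrated without the full Pollard Rho machinery. The sketch below is a simplification, not the fast algorithm from this answer: it enumerates divisor pairs by plain trial division (roughly sqrt(N) steps), and the function name ys_by_divisor_pairs is made up for illustration. It assumes N is a positive integer.

```python
def ys_by_divisor_pairs(n):
    """Return all y >= 0 such that n + y*y is a perfect square (n > 0).

    Uses N = A * B with A >= B and the same parity, then Y = (A - B) / 2.
    Divisors are found by plain trial division, not Pollard Rho.
    """
    ys = []
    b = 1
    while b * b <= n:
        if n % b == 0:
            a = n // b              # a >= b because b <= sqrt(n)
            if (a - b) % 2 == 0:    # same parity, so Y is an integer
                ys.append((a - b) // 2)
        b += 1
    return sorted(ys)

print(ys_by_divisor_pairs(13689))
# [0, 44, 156, 240, 520, 756, 2280, 6844]
```

For the question's example N = 13689 this recovers y = 44, y = 156 and y = 6844 among the solutions; y = 0 appears because 13689 is itself a square.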
The following example code is implemented in pure Python. Of course one can use Numba to speed it up; Numba usually gives a 30-200x speedup and can bring Python code close to the speed of optimized C++. But I thought that the main thing here is to implement a fast algorithm; Numba optimizations can be done easily afterwards.
I added time measurement to the following code. Although it is pure Python, my fast algorithm still achieves an 8500x speedup compared to the regular brute-force approach for a limit of 1 000 000.
You can change limit variable to tweak amount of searched space, or num_tests variable to tweak amount of different tests.
Following code implements both solutions - fast solution find_fast() described above plus very tiny brute force solution find_slow() which is very slow as it scans all possible candidates. This slow solution is only used to compare correctness in tests and compare speedup.
Code below uses nothing except few standard Python library modules, no external modules were used.
def find_slow(N):
    import math
    def is_square(x):
        root = int(math.sqrt(float(x)) + 0.5)
        return root * root == x, root
    l = []
    for y in range(N):
        if is_square(N + y ** 2)[0]:
            l.append(y)
    return l
def find_fast(N):
    import itertools, functools
    Prod = lambda it: functools.reduce(lambda a, b: a * b, it, 1)
    fs = factor(N)
    mfs = {}
    for e in fs:
        mfs[e] = mfs.get(e, 0) + 1
    fs = sorted(mfs.items())
    del mfs
    Ys = set()
    for take_a in itertools.product(*[
        (range(v + 1) if k != 2 else range(1, v)) for k, v in fs]):
        A = Prod([p ** t for (p, _), t in zip(fs, take_a)])
        B = N // A
        assert A * B == N, (N, A, B, take_a)
        if A < B:
            continue
        X = (A + B) // 2
        Y = (A - B) // 2
        assert N + Y ** 2 == X ** 2, (N, A, B, X, Y)
        Ys.add(Y)
    return sorted(Ys)
def trial_div_factor(n, limit = None):
    # https://en.wikipedia.org/wiki/Trial_division
    fs = []
    while n & 1 == 0:
        fs.append(2)
        n >>= 1
    all_checked = False
    for d in range(3, (limit or n) + 1, 2):
        if d * d > n:
            all_checked = True
            break
        while True:
            q, r = divmod(n, d)
            if r != 0:
                break
            fs.append(d)
            n = q
    if n > 1 and all_checked:
        fs.append(n)
        n = 1
    return fs, n
def fermat_prp(n, trials = 32):
    # https://en.wikipedia.org/wiki/Fermat_primality_test
    import random
    if n <= 16:
        return n in (2, 3, 5, 7, 11, 13)
    for i in range(trials):
        if pow(random.randint(2, n - 2), n - 1, n) != 1:
            return False
    return True
def pollard_rho_factor(n):
    # https://en.wikipedia.org/wiki/Pollard%27s_rho_algorithm
    import math, random
    fs, n = trial_div_factor(n, 1 << 7)
    if n <= 1:
        return fs
    if fermat_prp(n):
        return sorted(fs + [n])
    for itry in range(8):
        failed = False
        x = random.randint(2, n - 2)
        for cycle in range(1, 1 << 60):
            y = x
            for i in range(1 << cycle):
                x = (x * x + 1) % n
                d = math.gcd(x - y, n)
                if d == 1:
                    continue
                if d == n:
                    failed = True
                    break
                return sorted(fs + pollard_rho_factor(d) + pollard_rho_factor(n // d))
            if failed:
                break
    assert False, f'Pollard Rho failed! n = {n}'
def factor(N):
    import functools
    Prod = lambda it: functools.reduce(lambda a, b: a * b, it, 1)
    fs = pollard_rho_factor(N)
    assert N == Prod(fs), (N, fs)
    return sorted(fs)
def test():
    import random, time
    limit = 1 << 20
    num_tests = 20
    t0, t1 = 0, 0
    for i in range(num_tests):
        if (round(i / num_tests * 1000)) % 100 == 0 or i + 1 >= num_tests:
            print(f'test {i}, ', end = '', flush = True)
        # Start at 1: N = 0 would make trial_div_factor loop forever.
        N = random.randrange(1, limit)
        tb = time.time()
        r0 = find_slow(N)
        t0 += time.time() - tb
        tb = time.time()
        r1 = find_fast(N)
        t1 += time.time() - tb
        assert r0 == r1, (N, r0, r1, t0, t1)
    print(f'\nTime slow {t0:.05f} sec, fast {t1:.05f} sec, speedup {round(t0 / max(1e-6, t1))} times')
if __name__ == '__main__':
    test()
Output:
test 0, test 2, test 4, test 6, test 8, test 10, test 12, test 14, test 16, test 18, test 19,
Time slow 26.28198 sec, fast 0.00301 sec, speedup 8732 times
For the easiest solution, you can try this:
import math
n=13689 #or we can ask user to input a square number.
for i in range(1,9999):
    if math.sqrt(n+i**2).is_integer():
        print(i)
I am trying to find the number of ways to construct an array such that consecutive positions contain different values.
Specifically, I need to construct an array with n elements such that each element is between 1 and k, inclusive. I also want the first and last elements of the array to be 1 and x.
Complete problem statement:
Here is what I tried:
def countArray(n, k, x):
    # Return the number of ways to fill in the array.
    if x > k:
        return 0
    if x == 1:
        return 0
    def fact(n):
        if n == 0:
            return 1
        fact_range = n+1
        T = [1 for i in range(fact_range)]
        for i in range(1,fact_range):
            T[i] = i * T[i-1]
        return T[fact_range-1]
    ways = fact(k) / (fact(n-2)*fact(k-(n-2)))
    return int(ways)
In short, I did K(C)N-2 to find the ways. How could I solve this?
It passes one of the base case with inputs as countArray(4,3,2) but fails for 16 other cases.
Let X(n) be the number of ways of constructing an array of length n, starting with 1 and ending in x (and not repeating any numbers). Let Y(n) be the number of ways of constructing an array of length n, starting with 1 and NOT ending in x (and not repeating any numbers).
Then there's these recurrence relations (for n>1)
X(n+1) = Y(n)
Y(n+1) = X(n)*(k-1) + Y(n)*(k-2)
In words: If you want an array of length n+1 ending in x, then you need an array of length n not ending in x. And if you want an array of length n+1 not ending in x, then you can either add any of the k-1 symbols to an array of length n ending in x, or you can take an array of length n not ending in x, and add any of the k-2 symbols that aren't x and don't repeat the last value.
For the base case, n=1: if x is 1 then X(1)=1, Y(1)=0; otherwise X(1)=0, Y(1)=1.
This gives you an O(n)-time method of computing the result.
def ways(n, k, x):
    M = 10**9 + 7
    wx = (x == 1)
    wnx = (x != 1)
    for _ in range(n-1):
        wx, wnx = wnx, wx * (k-1) + wnx*(k-2)
        wnx = wnx % M
    return wx
print(ways(100, 5, 2))
In principle you can reduce this to O(log n) by expressing the recurrence relations as a matrix and computing the matrix power (mod M), but it's probably not necessary for the question.
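A sketch of that matrix-power idea, under the assumption that the state is the column vector (X(n), Y(n)) and the transition matrix is [[0, 1], [k-1, k-2]], read off from the recurrences above. The helper names mat_mul, mat_pow and ways_matrix are made up for illustration:

```python
M = 10**9 + 7

def mat_mul(p, q):
    """Multiply two 2x2 matrices modulo M."""
    return [
        [(p[0][0] * q[0][0] + p[0][1] * q[1][0]) % M,
         (p[0][0] * q[0][1] + p[0][1] * q[1][1]) % M],
        [(p[1][0] * q[0][0] + p[1][1] * q[1][0]) % M,
         (p[1][0] * q[0][1] + p[1][1] * q[1][1]) % M],
    ]

def mat_pow(t, e):
    """Raise a 2x2 matrix to the power e >= 0 by repeated squaring."""
    result = [[1, 0], [0, 1]]  # identity matrix
    while e:
        if e & 1:
            result = mat_mul(result, t)
        t = mat_mul(t, t)
        e >>= 1
    return result

def ways_matrix(n, k, x):
    # X(n+1) = Y(n);  Y(n+1) = (k-1)*X(n) + (k-2)*Y(n)
    t = [[0, 1], [(k - 1) % M, (k - 2) % M]]
    p = mat_pow(t, n - 1)
    x1, y1 = (1, 0) if x == 1 else (0, 1)  # base case n = 1
    return (p[0][0] * x1 + p[0][1] * y1) % M

print(ways_matrix(4, 3, 2))  # 3
```

The loop in mat_pow runs once per bit of n-1, which is where the O(log n) bound comes from.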
[Additional working]
We have the recurrence relations:
X(n+1) = Y(n)
Y(n+1) = X(n)*(k-1) + Y(n)*(k-2)
Using the first, we can replace the Y(_) in the second with X(_+1) to reduce it down to a single variable. Then:
X(n+2) = X(n)*(k-1) + X(n+1)*(k-2)
Using standard techniques, we can solve this linear recurrence relation exactly.
In the case x!=1, we have:
X(n) = ((k-1)^(n-1) + (-1)^n) / k
And in the case x=1, we have:
X(n) = ((k-1)^(n-1) + (1-k)(-1)^n) / k
We can compute these mod M using Fermat's little theorem because M is prime. So 1/k = k^(M-2) mod M.
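As a small illustration of that modular inverse (the value of k here is an arbitrary assumption, and k must not be a multiple of M for the inverse to exist):

```python
M = 10**9 + 7   # a prime modulus
k = 5           # hypothetical value, chosen only for illustration

# Fermat's little theorem: k^(M-1) = 1 (mod M), so k^(M-2) is the inverse of k.
inv_k = pow(k, M - 2, M)

# Dividing by k modulo M is then multiplication by inv_k.
print((k * inv_k) % M)  # 1
```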
Thus we have (with a little bit of optimization) this short program that solves the problem and runs in O(log n) time:
def ways2(n, k, x):
    M = 10**9 + 7
    S = -1 if n%2 else 1
    return ((pow(k-1, n-1, M) + S) * pow(k, M-2, M) - S*(x==1)) % M
Could you try this DP version? It passed all tests. It's inspired by Paul Hankin's answer and takes a DP approach; I will compare performance later to see the difference for big inputs.
def countArray(n, k, x):
    # Return the number of ways to fill in the array.
    big_mod = 10 ** 9 + 7
    if x == 1:
        dp = [[1], [0]]
    else:
        dp = [[1], [1]]
    for _ in range(n-2):
        dp[0].append(dp[0][-1] * (k - 1) % big_mod)
        dp[1].append((dp[0][-1] - dp[1][-1]) % big_mod)
    return dp[1][-1]
I want to compute the overlap fraction of two numeric ranges. Let me illustrate my question with an example since I believe that it will be easier to understand.
Lets say that I have two numeric ranges:
A = [1,100]
B = [25,100]
What I want to know (and code) is how much B overlaps A and vice versa (how much A overlaps B).
In this case, A overlaps B (as a fraction of B) by 100% and B overlaps A (as a fraction of A) by 75%.
I have been trying to code this in Python, but I am struggling and can't find a proper solution for computing both fractions.
What I have been able to achieve so far is the following:
Given the start and end of both numeric ranges, I have been able to figure out if the two numerical ranges overlap (from other stackoverflow post)
I have done this with the following code
def is_overlapping(x1,x2,y1,y2):
    return max(x1,y1) <= min(x2,y2)
thanks!
Here's a fast solution without for loops:
def overlapping(x1,x2,y1,y2):
    # A = [x1,x2]
    # B = [y1,y2]
    # Compute the fraction of B that is overlapped by A
    if(x1 <= y1 and x2 >= y2): # Total overlapping
        return 1
    elif(x2 < y1 or y2 < x1):
        return 0
    elif(x2 == y1 or x1 == y2):
        return 1/float(y2 - y1 + 1)
    return (min(x2,y2) - max(x1,y1))/float(y2 - y1)
One (less efficient) way to do this is by using sets.
If you set up ranges
A = range(1,101)
B = range(25, 101)
then you can find your fractions as follows:
len(set(A)&set(B))/float(len(set(B)))
and
len(set(A)&set(B))/float(len(set(A)))
giving 1.0 and 0.76.
There are 76 points in B that are also in A (since your ranges appear to be inclusive).
There are more efficient ways to do this using some mathematics as the other answers show, but this is general purpose.
I believe there are countless ways of solving this problem. The first one that came into my mind is making best use of the sum function which can also sum up over an iterable:
a = range(1,100)
b = range(25,100)
sum_a = sum(1 for i in b if i in a)
sum_b = sum(1 for i in a if i in b)
share_a = sum_a*100 // len(b)  # integer division, matching the output shown below
share_b = sum_b*100 // len(a)
print(share_a, share_b)
>>> 100 75
This might be a bit more robust, e.g. when you are not working with ranges but with unsorted lists.
Here's my solution using numpy & python3
import numpy as np
def my_example(A,B):
    # Convert to numpy arrays
    A = np.array(A)
    B = np.array(B)
    # determine which elements are overlapping
    overlapping_elements=np.intersect1d(A, B)
    # determine how many there are
    coe=overlapping_elements.size
    #return the ratios
    return coe/A.size , coe/B.size
# Generate two test lists
a=[*range(1,101)]
b=[*range(25,101)]
# Call the example & print the results
x,y = my_example(a,b) # returns ratios, multiply by 100 for percentage
print(x,y)
I have assumed both lower and upper bounds are included in the range. Here is my way of calculating overlapping distance with respect to other:
def is_valid(x):
    try:
        valid = (len(x) == 2) and (x[0] <= x[1])
    except:
        valid = False
    finally:
        return valid
def is_overlapping(x,y):
    return max(x[0],y[0]) <= min(x[1],y[1])
def overlapping_percent(x,y):
    if not (is_valid(x) and is_valid(y)):
        raise ValueError("Invalid range")
    if is_overlapping(x,y):
        overlapping_distance = min(x[1],y[1]) - max(x[0],y[0]) + 1
        width_x = x[1] - x[0] + 1
        width_y = y[1] - y[0] + 1
        overlap_x = overlapping_distance * 100.0/width_y
        overlap_y = overlapping_distance *100.0/width_x
        return (overlap_x, overlap_y)
    return (0,0)
if __name__ == '__main__':
    try:
        print(overlapping_percent((1,100),(26,100)))
        print(overlapping_percent((26,100),(1,100)))
        print(overlapping_percent((26,50),(1,100)))
        print(overlapping_percent((1,100),(26,50)))
        print(overlapping_percent((1,100),(200,300)))
        print(overlapping_percent((26,150),(1,100)))
        print(overlapping_percent((126,50),(1,100)))
    except Exception as e:
        print(e)
Output:
(100.0, 75.0)
(75.0, 100.0)
(25.0, 100.0)
(100.0, 25.0)
(0, 0)
(75.0, 60.0)
Invalid range
I hope it helps.
I have these two arrays:
import numpy as np
a = np.array([0, 10, 20])
b = np.array([20, 30, 40, 50])
I'd like to add both in the following way:
for i in range(len(a)):
    for j in range(len(b)):
        c = a[i] + b[j]
        d = delta(c, dr)
As you see for each iteration I get a value c which I pass through a function delta (see note at the end of the post).
The thing is that I want to avoid slow Python "for" loops when the arrays are huge.
One thing I could do would be:
c = np.ravel(a.reshape(-1, 1) + b)
Which is much, much faster. The problem is that now c is an array, and again I would have to go through it using a for loop.
So, do you have any idea on how I could do this without using a for loop at all.
NOTE: delta is a function I define in the following way:
def delta(r,dr):
    if r >= 0.5*dr and r <= 1.5*dr:
        delta = (5-3*abs(r)/dr-np.sqrt(-3*(1-abs(r)/dr)**2+1))/(6*dr)
    elif r <= 0.5*dr:
        delta = (1+np.sqrt(-3*(r/dr)**2+1))/(3*dr)
    else:
        delta = 0
    return delta
Using ravel is a good idea. Note that you could also use simple array broadcasting (a[:, np.newaxis] + b[np.newaxis, :]).
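For the arrays from the question, the broadcasting expression can be sketched as follows; the variable name pair_sums is made up for illustration:

```python
import numpy as np

a = np.array([0, 10, 20])
b = np.array([20, 30, 40, 50])

# a as a column (3, 1) plus b as a row (1, 4) broadcasts to all pairwise sums (3, 4).
pair_sums = a[:, np.newaxis] + b[np.newaxis, :]

# Flattening in row-major order matches the order of the nested for loops.
c = pair_sums.ravel()

print(pair_sums.shape)  # (3, 4)
```

Entry pair_sums[i, j] equals a[i] + b[j], so any function that accepts arrays can then be applied to all sums at once.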
For your function, you can improve this a lot because it is composed of only three particular cases. Probably the best approach is to use masking for each of those three sections.
You're starting with:
def delta(r,dr):
    if r >= 0.5*dr and r <= 1.5*dr:
        delta = (5-3*abs(r)/dr-np.sqrt(-3*(1-abs(r)/dr)**2+1))/(6*dr)
    elif r <= 0.5*dr:
        delta = (1+np.sqrt(-3*(r/dr)**2+1))/(3*dr)
    else:
        delta = 0
    return delta
A common alternative approach would be something like:
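def delta(r, dr):
    # r is an array and dr a scalar, as in the question.
    # A float result avoids truncating the values when r is an integer array.
    res = np.zeros_like(r, dtype=float)
    ma = (r >= 0.5*dr) & (r <= 1.5*dr)  # Create first mask
    res[ma] = (5-3*np.abs(r[ma])/dr-np.sqrt(-3*(1-np.abs(r[ma])/dr)**2+1))/(6*dr)
    ma = (r < 0.5*dr)  # Create second mask (strict, so the first case wins at r == 0.5*dr)
    res[ma] = (1+np.sqrt(-3*(r[ma]/dr)**2+1))/(3*dr)
    return res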
Initializing to zeros handles the final else case. Also, I'm assuming np.abs is faster than abs on arrays, but I'm not actually sure.
Edit: for sparse matrices
The same basic idea should apply, but perhaps instead of using a boolean masking array, using the valid indices themselves would be better... e.g. something like:
res = scipy.sparse.lil_matrix(np.shape(r))  # LIL supports item assignment; COO does not
ma = np.where((r >= 0.5*dr) & (r <= 1.5*dr))  # Indices where the first condition holds
res[ma] = ...
This is the same answer as DilithiumMatrix, but using logical functions that numpy accepts to generate the masks.
import numpy as np
def delta(r, dr):
    res = np.zeros(r.shape)
    mask1 = (r >= 0.5*dr) & (r <= 1.5*dr)
    res[mask1] = (
        5 - 3*np.abs(r[mask1])/dr
        - np.sqrt(-3*(1 - np.abs(r[mask1])/dr)**2 + 1)
    ) / (6*dr)
    mask2 = np.logical_not(mask1) & (r <= 0.5*dr)
    res[mask2] = (1 + np.sqrt(-3*(r[mask2]/dr)**2 + 1))/(3*dr)
    return res
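To tie the pieces together, here is a hedged sketch that applies a masked delta to every pairwise sum at once. The value dr = 20.0 is an assumption chosen only for illustration (the question does not give one), and delta_vec is a hypothetical name; the masks follow the scalar function from the question.

```python
import numpy as np

def delta_vec(r, dr):
    """Vectorized version of the question's scalar delta; dr is a scalar."""
    r = np.asarray(r, dtype=float)
    res = np.zeros(r.shape)          # zeros cover the final `else` case
    mid = (r >= 0.5*dr) & (r <= 1.5*dr)
    low = r < 0.5*dr                 # disjoint from `mid`
    res[mid] = (5 - 3*np.abs(r[mid])/dr
                - np.sqrt(-3*(1 - np.abs(r[mid])/dr)**2 + 1)) / (6*dr)
    res[low] = (1 + np.sqrt(-3*(r[low]/dr)**2 + 1)) / (3*dr)
    return res

a = np.array([0, 10, 20])
b = np.array([20, 30, 40, 50])
dr = 20.0  # assumed value, for illustration only

c = a[:, np.newaxis] + b   # all pairwise sums, shape (3, 4)
d = delta_vec(c, dr)       # delta applied elementwise, no Python loop
print(d)
```

Because each formula is evaluated only on the elements selected by its mask, the square root is never taken for values that belong to a different branch.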
Assuming your two arrays (a and b) are not enormous, you could do something like this:
import itertools
import numpy
a = numpy.array([1,2,3])
b = numpy.array([4,5,6])
# Sum each (x, y) pair from the Cartesian product of a and b
c = numpy.sum(list(itertools.product(a, b)), 1)
def func(x, y):
    return x*y
numpy.vectorize(func)(c, 10)
Note that with large arrays, this simply won't work: you'll have n**2 elements in c, which means that even for smallish-seeming pairs of arrays, you'll use enormous amounts of memory. For two arrays with 100,000 elements each, c alone holds 10**10 values, which at 8 bytes each comes to roughly 74 GiB.
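The memory figure can be checked with a quick back-of-the-envelope calculation. This sketch assumes 8-byte (float64 or int64) elements, and the helper name pairwise_bytes is hypothetical:

```python
import numpy as np

def pairwise_bytes(n_a, n_b, dtype=np.float64):
    """Bytes needed to hold one value per (a, b) pair."""
    return n_a * n_b * np.dtype(dtype).itemsize

n = 100_000
total = pairwise_bytes(n, n)
print(total / 1e9, "GB")      # decimal gigabytes
print(total / 2**30, "GiB")   # binary gibibytes
```

This counts only the final array of sums; the intermediate list of tuples built from itertools.product takes additional memory on top of that.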
Consider points Y given in increasing order from [0,T). We are to consider these points as lying on a circle of circumference T. Now consider points X also from [0,T) and also lying on a circle of circumference T.
We say the distance between X and Y is the sum of the absolute distances between each point in X and its closest point in Y, recalling that both are considered to lie on a circle. Write this distance as Delta(X, Y).
I am trying to find a quick way of approximating the distribution of the distance between the circles over all possible rotations of X. I currently do this by Monte Carlo simulation. First, here is my code to make some fake data.
import random
import numpy as np
from bisect import bisect_left
def simul(rate, T):
    time = np.random.exponential(rate)
    times = [0]
    newtime = times[-1]+time
    while (newtime < T):
        times.append(newtime)
        newtime = newtime+np.random.exponential(rate)
    return times[1:]
Now the code to find the distance between two circles.
def takeClosest(myList, myNumber, T):
    """
    Assumes myList is sorted. Returns closest value to myNumber in a circle of circumference T.
    If two numbers are equally close, return the smallest number.
    """
    pos = bisect_left(myList, myNumber)
    if (pos == 0 and myList[pos] != myNumber):
        before = myList[pos - 1] - T
        after = myList[0]
    elif (pos == len(myList)):
        before = myList[pos-1]
        after = myList[0] + T
    else:
        before = myList[pos - 1]
        after = myList[pos]
    if after - myNumber < myNumber - before:
        return after
    else:
        return before
def circle_dist(timesY, timesX):
    dist = 0
    for t in timesX:
        closest_number = takeClosest(timesY, t, T)
        dist += np.abs(closest_number - t)
    return dist
Now the main code to make the data and to try 100 different random rotations.
T = 50000
timesX = simul(1, T)
timesY = simul(10, T)
dists=[]
iters = 100
for i in range(iters):
    offset = np.random.randint(0,T)
    timesX = [(t+offset) % T for t in timesX]
    dists.append(circle_dist(timesY, timesX))
We can now print out any statistics we like of the distances. I am particularly interested in the variance.
print("Variance is", np.var(dists))
Unfortunately, I need to do this a lot, and it currently takes around 16 seconds. I find it a little surprising that it is so slow. Any suggestions for how to speed it up would be gratefully received.
Edit 1. Reduced the number of iterations to 100 (the previous value didn't correspond to my timings correctly). This now takes around 16 seconds on my computer.
Edit 2. Fixed bug in takeClosest
EDIT: I've just noticed that performance optimization is a little premature, because the expression closest_number - t is not a valid implementation of any definition of distance on a "circle"; it is only a distance on an open-ended line.
sample test case (pseudocode):
T = 10
X = [1, 2]
Y = [9]
dist(X, Y) = dist(1, 9) + dist(2, 9)
dist_on_line = 8 + 7 = 15
dist_on_circle = 2 + 3 = 5
Note that definition of the circle [0,10) implies that dist(0, 10) is not defined, but in the limit it approaches 0: lim(dist(0, t), t->10) = 0
A correct implementation of a distance on a circle would be:
dist_of_t = min(t - closest_number_before_t,
                closest_number_after_t - t,
                T - t + closest_number_before_t,
                T - closest_number_after_t + t)
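A minimal runnable sketch of that idea, checked against the sample test case. The function names are hypothetical, and the brute-force search over Y is only there to keep the example short; it is not meant to be fast.

```python
def circular_distance(s, t, T):
    """Shortest arc length between s and t on a circle of circumference T."""
    d = abs(s - t) % T
    return min(d, T - d)

def dist_on_circle(X, Y, T):
    """Sum over X of the circular distance to the nearest point of Y."""
    return sum(min(circular_distance(x, y, T) for y in Y) for x in X)

T = 10
X = [1, 2]
Y = [9]
print(dist_on_circle(X, Y, T))  # 2 + 3 = 5
```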
Original answer:
You could rotate and iterate over timesY instead of timesX, since that array is an order of magnitude smaller: a bisect_left lookup in timesX is negligible (O(log n)) compared to iterating over all of its elements (O(n)).
But IMHO, the real slowdown is because of Python's dynamic typing (each of the ~50000 items in timesX has to be checked for type compatibility every time you compare it to some other value), so converting timesX and timesY to numpy arrays should help. If that is not enough, CPU acceleration (cython, numba, ...) is the thing you need.
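As a hedged sketch of the NumPy suggestion, the per-point bisect loop can be replaced by a single np.searchsorted call over all of timesX. The function name circle_dist_np is hypothetical, timesY is assumed to be sorted and non-empty, and the modulo arithmetic is one possible way to handle the wrap-around; it is not the asker's takeClosest logic.

```python
import numpy as np

def circle_dist_np(timesY, timesX, T):
    """Sum of circular distances from each x to its nearest y (timesY sorted)."""
    y = np.asarray(timesY, dtype=float)
    x = np.asarray(timesX, dtype=float)
    pos = np.searchsorted(y, x)          # same positions as bisect_left
    before = y[(pos - 1) % len(y)]       # predecessor, wrapping to the last y
    after = y[pos % len(y)]              # successor, wrapping to the first y
    d_before = (x - before) % T          # arc from predecessor up to x
    d_after = (after - x) % T            # arc from x up to successor
    return np.minimum(d_before, d_after).sum()

print(circle_dist_np([9], [1, 2], T=10))  # 2 + 3 = 5
```

On a circle, the nearest point to x is always either its predecessor or its successor in sorted order, so taking the smaller of the two arcs is enough.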
The function circle_dist can be replaced by a one-liner. So you can plug it into your outer for i loop:
sum(abs(takeClosest(timesY, t, T) - t) for t in timesX)
Furthermore, you should always - if possible - allocate arrays like dists in one step and avoid appending elements many thousand times.
But, unfortunately, both improvements only save a few percent of computing time.
Edit 1: Replacing np.abs(...) with abs(...) decreases computing time by 50 % on my machine (on a reduced data set)!
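The effect is easy to measure with timeit. This sketch only compares the two calls on a single NumPy scalar; absolute timings and the exact ratio will vary by machine and NumPy version, so treat it as a way to check the claim on your own setup rather than as a benchmark.

```python
import timeit
import numpy as np

# A NumPy scalar, like the values produced by np.random.exponential
x = np.float64(-3.5)

t_builtin = timeit.timeit(lambda: abs(x), number=100_000)
t_numpy = timeit.timeit(lambda: np.abs(x), number=100_000)

print(f"abs:    {t_builtin:.4f} s")
print(f"np.abs: {t_numpy:.4f} s")
```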
Edit 2: Updated the one-liner according to Aprillion's comment.