I am making a geometry interface in Python (currently using tkinter), but I have stumbled upon a major problem: I need a function that returns a point that lies at a certain angle to a certain line segment and at a certain distance from the vertex of the angle. We know the coordinates of the endpoints of the line segment, and also the angle at which we want the point to be. I have attached an image below for a more graphical view of my question.
The problem: I can calculate it using trigonometry, where
x, y = vertex.getCoords()
endx = x + length * cos(radians(angle))
endy = y + length * sin(radians(angle))
p = Point(endx, endy)
The angle I input is in degrees. That calculation is correct only when the line segment is parallel to the abscissa; otherwise the sizes of the angles I get back are very strange, to say the least. I want the function to work wherever the first two points are on the tkinter canvas and whatever the angle is, and I am very lost as to what I should do to fix it. What I found out: I get as output a point that, when connected to the vertex, makes a line at the desired angle to the abscissa. So when the first arm (leg, shoulder) of the angle is parallel to the abscissa, the function runs flawlessly (because of alternate angles - the Z formation). As soon as I make it not parallel, it becomes weird. This is because we are taking the y of the vertex, not where the foot of the perpendicular lands (C1 on the attached image). I am pretty good at math, so feel free to post more technical solutions; I will understand them.
EDIT: I just wanted to make a quick recap of my question: how should I construct a point that is at a certain angle from a line segment? I have already made functions that create the angle with respect to the X and Y axes, but I have no idea how I can make it with respect to the input line. Some code for the two functions:
def inRespectToXAxis(vertex, angle, length):
x, y = vertex.getCoords()
newx = x + length * cos(radians(angle))
newy = y + length * sin(radians(angle))
p = Point(abs(newx), abs(newy))
return p
def inRespectToYAxis(vertex, length, angle):
    x, y = vertex.getCoords()
    theta_rad = pi / 2 - radians(angle)
    newx = x + length * cos(theta_rad)
    newy = y + length * sin(theta_rad)
    p = Point(newx, newy)
    return p
It seems you need to add the line segment's own angle to get the proper result. You can calculate it from the segment's end coordinates (x1, y1) and (x2, y2):
lineAngle = math.atan2(y2 - y1, x2 - x1)
The result is in radians, so apply it as
endx = x1 + length * cos(radians(angle) + lineAngle)   # and similarly for endy
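Put together, a minimal sketch of such a helper (my own illustration; it assumes plain (x, y) tuples rather than the question's Point class, and note that on a tkinter canvas y grows downward, so angles will appear mirrored on screen):
import math

def point_at_angle(p1, p2, angle_deg, length):
    # p1 is the vertex of the angle, p2 the other end of the line segment.
    # Returns the point `length` away from p1, rotated angle_deg degrees
    # from the direction p1 -> p2.
    x1, y1 = p1
    x2, y2 = p2
    line_angle = math.atan2(y2 - y1, x2 - x1)     # segment's own angle, in radians
    total = line_angle + math.radians(angle_deg)
    return (x1 + length * math.cos(total),
            y1 + length * math.sin(total))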
I have written code that calculates the angle between two vectors. However, the way it does this is to start with two vectors, rotate each according to some Euler angles calculated in a separate program, and then calculate the angle between the vectors.
Up until now I have been working with a use case in which both starting vectors are (0,0,1), which makes life super easy. I could just subtract one set of Euler angles from the other and then calculate the angle between (0,0,1) and the vector that had been rotated by the difference. It meant I could plot nice distribution plots and vector diagrams because everything was normalised to (0,0,1). (I have thousands of these vectors, for the record.)
Now I am trying to write a function that would allow for a use case where the two starting vectors are not (0,0,1). I figured the easiest way to do this would be to calculate the direction of the vector relative to (0,0,1) and, after calculating the position of the vector, just rotate by the precalculated offsets (this might be a stupid way to do it; if it is, please tell me).
My current code works for a case where a vector is (0,1,0), but it breaks down if I start entering random numbers.
import numpy as np
import math
def RotationMatrix(axis, rotang):
"""
This uses Euler-Rodrigues formula.
"""
#Input taken in degrees, here we change it to radians
theta = rotang * 0.0174532925
axis = np.asarray(axis)
#Ensure axis is a unit vector
axis = axis/math.sqrt(np.dot(axis, axis))
    #Calculate a, b, c and d according to the Euler-Rodrigues formula requirements
a = math.cos(theta/2)
b, c, d = axis*math.sin(theta/2)
a2, b2, c2, d2 = a*a, b*b, c*c, d*d
bc, ad, ac, ab, bd, cd = b*c, a*d, a*c, a*b, b*d, c*d
#Return the rotation matrix
return np.array([[a2+b2-c2-d2, 2*(bc-ad), 2*(bd+ac)],
[2*(bc+ad), a2+c2-b2-d2, 2*(cd-ab)],
[2*(bd-ac), 2*(cd+ab), a2+d2-b2-c2]])
def ApplyRotationMatrix(vector, rotationmatrix):
"""
This function take the output from the RotationMatrix function and
uses that to apply the rotation to an input vector
"""
a1 = (vector[0] * rotationmatrix[0, 0]) + (vector[1] * rotationmatrix[0, 1]) + (vector[2] * rotationmatrix[0, 2])
b1 = (vector[0] * rotationmatrix[1, 0]) + (vector[1] * rotationmatrix[1, 1]) + (vector[2] * rotationmatrix[1, 2])
c1 = (vector[0] * rotationmatrix[2, 0]) + (vector[1] * rotationmatrix[2, 1]) + (vector[2] * rotationmatrix[2, 2])
    return np.array((a1, b1, c1))
'''
Functions for Calculating the angles of 3D vectors relative to one another
'''
def CalculateAngleBetweenVector(vector, vector2):
"""
Does what it says on the tin, outputs an angle in degrees between two input vectors.
"""
dp = np.dot(vector, vector2)
maga = math.sqrt((vector[0] ** 2) + (vector[1] ** 2) + (vector[2] ** 2))
magb = math.sqrt((vector2[0] ** 2) + (vector2[1] ** 2) + (vector2[2] ** 2))
magc = maga * magb
dpmag = dp / magc
#These if statements deal with rounding errors of floating point operations
if dpmag > 1:
error = dpmag - 1
print('error = {}, do not worry if this number is very small'.format(error))
dpmag = 1
elif dpmag < -1:
error = 1 + dpmag
print('error = {}, do not worry if this number is very small'.format(error))
dpmag = -1
angleindeg = ((math.acos(dpmag)) * 180) / math.pi
return angleindeg
def CalculateAngleAroundZ(Vector):
X,Y,Z = Vector[0], Vector[1], Vector[2]
AngleAroundZ = math.atan2(Y, X)
AngleAroundZdeg = (AngleAroundZ*180)/math.pi
return AngleAroundZdeg
def CalculateAngleAroundX(Vector):
X,Y,Z = Vector[0], Vector[1], Vector[2]
AngleAroundZ = math.atan2(Y, Z)
AngleAroundZdeg = (AngleAroundZ*180)/math.pi
return AngleAroundZdeg
def CalculateAngleAroundY(Vector):
X,Y,Z = Vector[0], Vector[1], Vector[2]
AngleAroundZ = math.atan2(X, Z)
AngleAroundZdeg = (AngleAroundZ*180)/math.pi
return AngleAroundZdeg
V1 = (0,0,1)
V2 = (3,5,4)
Xoffset = (CalculateAngleAroundX(V2))
Yoffset = (CalculateAngleAroundY(V2))
Zoffset = (CalculateAngleAroundZ(V2))
XRM = RotationMatrix((1,0,0), (Xoffset * 1))
YRM = RotationMatrix((0,1,0), (Yoffset * 1))
ZRM = RotationMatrix((0,0,1), (Zoffset * 1))
V2 = V2 / np.linalg.norm(V2)
V2X = ApplyRotationMatrix(V2, XRM)
V2XY = ApplyRotationMatrix(V2X, YRM)
V2XYZ = ApplyRotationMatrix(V2XY, ZRM)
print(V2XYZ)
print(CalculateAngleBetweenVector(V1, V2XYZ))
Any advice to fix this problem will be much appreciated.
I'm not sure I fully understand what you need, but if it is to compute the angle between two vectors in space, you can use the formula a.b = |a| * |b| * cos(theta),
where a.b is the scalar (dot) product and theta is the angle between the vectors.
Thus your function CalculateAngleBetweenVector becomes:
def CalculateAngleBetweenVector(vector, vector2):
return math.acos(np.dot(vector,vector2)/(np.linalg.norm(vector)* np.linalg.norm(vector2))) * 180 /math.pi
You can also simplify your ApplyRotationMatrix function:
def ApplyRotationMatrix(vector, rotationmatrix):
"""
This function take the output from the RotationMatrix function and
uses that to apply the rotation to an input vector
"""
    return rotationmatrix @ vector
The @ symbol is the matrix product operator.
Hope this helps. Feel free to clarify your request if this is not helpful.
I'm an idiot. I just needed to take the cross product and the dot product, then rotate by the dot-product angle (times -1) around the cross-product axis.
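For reference, a minimal sketch of that approach (my own illustration, reusing the RotationMatrix, ApplyRotationMatrix and CalculateAngleBetweenVector functions defined above; it assumes the two vectors are not parallel, otherwise the cross product is zero):
V1 = np.array((0., 0., 1.))
V2 = np.array((3., 5., 4.))
V2 = V2 / np.linalg.norm(V2)
axis = np.cross(V2, V1)                       # rotation axis from the cross product
angle = CalculateAngleBetweenVector(V2, V1)   # rotation angle in degrees from the dot product
R = RotationMatrix(axis, angle)
print(ApplyRotationMatrix(V2, R))             # prints a vector very close to (0, 0, 1)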
The Goal:
I would like to vectorize (or otherwise speed up) this code. It rotates a 3D numpy model around its center point (let x, y, z denote the dimensions; then we want to rotate around the z-axis). The numpy model is binary voxels that are either "on" or "off".
I bet some basic matrix operation could do it, like take a layer and apply the rotation matrix to each element. The only issue with that is decimals; where should I have the new value land since cos(pi / 6) == sqrt(3) / 2?
The Code:
def rotate_model(m, theta):
'''
theta in degrees
'''
n =np.zeros(m.shape)
for i,layer in enumerate(m):
rotated = rotate(layer,theta)
n[i] = rotated
return n
where rotate() is:
def rotate(arr, theta):
'''
Rotates theta clockwise
rotated.shape == arr.shape, unlike scipy.ndimage.rotate(), which inflates size and also does some strange mixing
'''
if theta == int(theta):
theta *= pi / 180
theta = -theta
# theta=-theta b/c clockwise. Otherwise would default to counterclockwise
rotated =np.zeros(arr.shape)
#print rotated.shape[0], rotated.shape[1]
y_mid = arr.shape[0]//2
x_mid = arr.shape[1]//2
val = 0
for x_new in range(rotated.shape[1]):
for y_new in range(rotated.shape[0]):
x_centered = x_new - x_mid
y_centered = y_new - y_mid
x = x_centered*cos(theta) - y_centered*sin(theta)
y = x_centered*sin(theta) + y_centered*cos(theta)
x += x_mid
y += y_mid
x = int(round(x)); y = int(round(y)) # cast so range() picks it up
# lossy rotation
if x in range(arr.shape[1]) and y in range(arr.shape[0]):
val = arr[y,x]
rotated[y_new,x_new] = val
#print val
#print x,y
return rotated
You have a couple of problems in your code. First, if you want to fit the original image onto a rotated grid then you (usually) need a larger grid. Alternatively, imagine a regular grid on which the shape of your object - a rectangle - is rotated, thus becoming a "rhomb". Obviously, if you want to fit the entire rhomb, you need a larger output grid (array). On the other hand, you say in the code "rotated.shape == arr.shape, unlike scipy.ndimage.rotate(), which inflates size". If that is the case, maybe you do not want to fit the entire object? So maybe it is OK to do this: rotated = np.zeros(arr.shape). But in general, yes, one has to have a larger grid in order to fit the entire input image after it is rotated.
Another issue is angle conversion that you are doing:
if theta == int(theta):
theta *= pi / 180
theta = -theta
Why??? What will happen when I want to rotate the image by 1 radian? Or 2 radians? Am I forbidden to use an integer number of radians? I think you are trying to do too much in this function and therefore it will be very confusing to use it. Just require the caller to convert angles to radians. Or, you can do the conversion inside this function if the input theta is always in degrees. Or, you can add another parameter called, e.g., unit and let the caller set it to radians or degrees. Don't try to guess based on the "integer-ness" of the input!
Now, let's rewrite your code a little bit:
rotated = np.zeros_like(arr) # instead of np.zero(arr.shape)
y_mid = arr.shape[0] // 2
x_mid = arr.shape[1] // 2
# val = 0 <- this is unnecessary
# pre-compute cos(theta) and sin(theta):
cs = cos(theta)
sn = sin(theta)
for x_new in range(rotated.shape[1]):
for y_new in range(rotated.shape[0]):
        x = int(round((x_new - x_mid) * cs - (y_new - y_mid) * sn + x_mid))
        y = int(round((x_new - x_mid) * sn + (y_new - y_mid) * cs + y_mid))
# just use comparisons, don't search through many values!
if 0 <= x < arr.shape[1] and 0 <= y < arr.shape[0]:
rotated[y_new, x_new] = arr[y, x]
So now I can see (more easily) that each pixel of the output array is mapped to a location in the input array. Yes, you can vectorize this.
import numpy as np
def rotate(arr, theta, unit='rad'):
# deal with theta units:
if unit.startswith('deg'):
theta = np.deg2rad(theta)
# for convenience, store array size:
ny, nx = arr.shape
# generate arrays of indices and flatten them:
y_new, x_new = np.indices(arr.shape)
x_new = x_new.ravel()
y_new = y_new.ravel()
# compute center of the array:
x0 = nx // 2
y0 = ny // 2
# compute old coordinates
xc = x_new - x0
yc = y_new - y0
    x = np.round(np.cos(theta) * xc - np.sin(theta) * yc + x0).astype(int)
    y = np.round(np.sin(theta) * xc + np.cos(theta) * yc + y0).astype(int)
# main idea to deal with indices is to create a mask:
mask = (x >= 0) & (x < nx) & (y >= 0) & (y < ny)
# ... and then select only those coordinates (both in
# input and "new" coordinates) that satisfy the above condition:
x = x[mask]
y = y[mask]
x_new = x_new[mask]
y_new = y_new[mask]
# map input values to output pixels *only* for selected "good" pixels:
rotated = np.zeros_like(arr)
rotated[y_new, x_new] = arr[y, x]
return rotated
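As a quick sanity check (my addition, not part of the original answer), the vectorized function can be exercised on a small test layer:
arr = np.zeros((8, 8))
arr[2, 5] = 1                        # a single "on" pixel
print(rotate(arr, 90, unit='deg'))   # the pixel appears at the 90-degree-rotated position
print(rotate(arr, 0.5))              # theta is taken to be in radians by default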
Here is some code for anyone also doing 3d modeling. It solved my specific use-case pretty well. Still figuring out how to rotate in the proper plane. Hope it's helpful to you as well:
def rotate_model(m, theta):
'''
Redefines the prev 'rotate_model()' method
theta has to be in degrees
'''
rotated = scipy.ndimage.rotate(m, theta, axes=(1,2))
# have tried (1,0), (2,0), and now (1,2)
# ^ z is "up" and "2"
# scipy.ndimage.rotate() shrinks the model
# TODO: regrow it back
x_r = rotated.shape[1]
y_r = rotated.shape[0]
x_m = m.shape[1]
y_m = m.shape[0]
x_diff = abs(x_r - x_m)
y_diff = abs(y_r - y_m)
if x_diff%2==0 and y_diff%2==0:
return rotated[
x_diff//2 : x_r-x_diff//2,
y_diff//2 : y_r-y_diff//2,
:
]
elif x_diff%2==0 and y_diff%2==1:
# if this shift ends up turning the model to shit in a few iterations,
# change the following lines to include a flag that alternates cutting off the top and bottom bits of the array
return rotated[
x_diff//2 : x_r-x_diff//2,
y_diff//2+1 : y_r-y_diff//2,
:
]
elif x_diff%2==1 and y_diff%2==0:
return rotated[
x_diff//2+1 : x_r-x_diff//2,
y_diff//2 : y_r-y_diff//2,
:
]
else:
# x_diff%2==1 and y_diff%2==1:
return rotated[
x_diff//2+1 : x_r-x_diff//2,
y_diff//2+1 : y_r-y_diff//2,
:
]
I should probably start off with saying I have no idea how to code and don't consider myself even a beginner when it comes to coding. That being said, I would really appreciate some help with getting started with some code. As the title suggests, I have to code what is known as the Ising model. The premise of the model is:
E = -Σ h·s(i) - Σ J·s(i)·s(j)
This will follow what I believe is the Metropolis Monte Carlo simulation, so each configuration {s(i)} has probability e^(-βE{s(i)}).
We start by flipping a random spin to yield a trial configuration {s(i)}.
If E(1) < E(0), we keep the flip.
If E(1) > E(0), then you draw a random number and compare it to e^(-β∆E);
if the number, say x, is:
x < e^(-β∆E), then flip;
x > e^(-β∆E), do nothing, so {s(i)} = {s(0)}.
I hope that is enough info, but I did pick up some code which I think is relevant:
import numpy as np
import random
def init_spin_array(rows, cols):
return np.ones((rows, cols))
def find_neighbors(spin_array, lattice, x, y):
left = (x, y - 1)
right = (x, (y + 1) % lattice)
top = (x - 1, y)
bottom = ((x + 1) % lattice, y)
return [spin_array[left[0], left[1]],
spin_array[right[0], right[1]],
spin_array[top[0], top[1]],
spin_array[bottom[0], bottom[1]]]
def energy(spin_array, lattice, x ,y):
return 2 * spin_array[x, y] * sum(find_neighbors(spin_array, lattice, x, y))
def main():
RELAX_SWEEPS = 50
lattice = eval(input("Enter lattice size: "))
sweeps = eval(input("Enter the number of Monte Carlo Sweeps: "))
for temperature in np.arange(0.1, 5.0, 0.1):
spin_array = init_spin_array(lattice, lattice)
# the Monte Carlo follows below
mag = np.zeros(sweeps + RELAX_SWEEPS)
for sweep in range(sweeps + RELAX_SWEEPS):
for i in range(lattice):
for j in range(lattice):
e = energy(spin_array, lattice, i, j)
if e <= 0:
spin_array[i, j] *= -1
elif np.exp((-1.0 * e)/temperature) > random.random():
spin_array[i, j] *= -1
mag[sweep] = abs(sum(sum(spin_array))) / (lattice ** 2)
print(temperature, sum(mag[RELAX_SWEEPS:]) / sweeps)
main()
All I need is to plot this info in an M vs. T plot, and somehow change the code to allow the three parameters h, J and T to be varied - as in, if I hold T at a certain value, what does the h vs. J plot look like? Any help would be immensely appreciated.
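(For illustration only, a hypothetical plotting helper - assuming matplotlib is installed - that would take the temperature and magnetisation values main() currently prints and turn them into the M vs. T plot:)
import matplotlib.pyplot as plt

def plot_m_vs_t(temperatures, magnetisations):
    # temperatures: the T values swept over; magnetisations: the averaged |M| for each T
    plt.plot(temperatures, magnetisations, 'o-')
    plt.xlabel('Temperature T')
    plt.ylabel('Average magnetisation |M|')
    plt.show()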
I am attempting to generate map overlay images that would assist in identifying hot-spots, that is, areas on the map that have a high density of data points. None of the approaches that I've tried are fast enough for my needs.
Note: I forgot to mention that the algorithm should work well under both low and high zoom scenarios (or low and high data point density).
I looked through numpy, pyplot and scipy libraries, and the closest I could find was numpy.histogram2d. As you can see in the image below, the histogram2d output is rather crude. (Each image includes points overlaying the heatmap for better understanding)
My second attempt was to iterate over all the data points and calculate the hot-spot value as a function of distance. This produced a better-looking image; however, it is too slow to use in my application. Since it's O(n), it works OK with 100 points, but blows up when I use my actual dataset of 30000 points.
My final attempt was to store the data in a KDTree and use the nearest 5 points to calculate the hot-spot value. This algorithm is O(1), so it is much faster with a large dataset. It's still not fast enough, though: it takes about 20 seconds to generate a 256x256 bitmap, and I would like this to happen in around 1 second.
Edit
The boxsum smoothing solution provided by 6502 works well at all zoom levels and is much faster than my original methods.
The gaussian filter solution suggested by Luke and Neil G is the fastest.
You can see all four approaches below, using 1000 data points in total, at 3x zoom there are around 60 points visible.
Complete code that generates my original 3 attempts, the boxsum smoothing solution provided by 6502 and gaussian filter suggested by Luke (improved to handle edges better and allow zooming in) is here:
import matplotlib
import numpy as np
from matplotlib.mlab import griddata
import matplotlib.cm as cm
import matplotlib.pyplot as plt
import math
from scipy.spatial import KDTree
import time
import scipy.ndimage as ndi
def grid_density_kdtree(xl, yl, xi, yi, dfactor):
zz = np.empty([len(xi),len(yi)], dtype=np.uint8)
zipped = zip(xl, yl)
kdtree = KDTree(zipped)
for xci in range(0, len(xi)):
xc = xi[xci]
for yci in range(0, len(yi)):
yc = yi[yci]
density = 0.
retvalset = kdtree.query((xc,yc), k=5)
for dist in retvalset[0]:
density = density + math.exp(-dfactor * pow(dist, 2)) / 5
zz[yci][xci] = min(density, 1.0) * 255
return zz
def grid_density(xl, yl, xi, yi):
ximin, ximax = min(xi), max(xi)
yimin, yimax = min(yi), max(yi)
xxi,yyi = np.meshgrid(xi,yi)
#zz = np.empty_like(xxi)
zz = np.empty([len(xi),len(yi)])
for xci in range(0, len(xi)):
xc = xi[xci]
for yci in range(0, len(yi)):
yc = yi[yci]
density = 0.
for i in range(0,len(xl)):
xd = math.fabs(xl[i] - xc)
yd = math.fabs(yl[i] - yc)
if xd < 1 and yd < 1:
dist = math.sqrt(math.pow(xd, 2) + math.pow(yd, 2))
density = density + math.exp(-5.0 * pow(dist, 2))
zz[yci][xci] = density
return zz
def boxsum(img, w, h, r):
st = [0] * (w+1) * (h+1)
for x in xrange(w):
st[x+1] = st[x] + img[x]
for y in xrange(h):
st[(y+1)*(w+1)] = st[y*(w+1)] + img[y*w]
for x in xrange(w):
st[(y+1)*(w+1)+(x+1)] = st[(y+1)*(w+1)+x] + st[y*(w+1)+(x+1)] - st[y*(w+1)+x] + img[y*w+x]
for y in xrange(h):
y0 = max(0, y - r)
y1 = min(h, y + r + 1)
for x in xrange(w):
x0 = max(0, x - r)
x1 = min(w, x + r + 1)
img[y*w+x] = st[y0*(w+1)+x0] + st[y1*(w+1)+x1] - st[y1*(w+1)+x0] - st[y0*(w+1)+x1]
def grid_density_boxsum(x0, y0, x1, y1, w, h, data):
kx = (w - 1) / (x1 - x0)
ky = (h - 1) / (y1 - y0)
r = 15
border = r * 2
imgw = (w + 2 * border)
imgh = (h + 2 * border)
img = [0] * (imgw * imgh)
for x, y in data:
ix = int((x - x0) * kx) + border
iy = int((y - y0) * ky) + border
if 0 <= ix < imgw and 0 <= iy < imgh:
img[iy * imgw + ix] += 1
for p in xrange(4):
boxsum(img, imgw, imgh, r)
a = np.array(img).reshape(imgh,imgw)
b = a[border:(border+h),border:(border+w)]
return b
def grid_density_gaussian_filter(x0, y0, x1, y1, w, h, data):
kx = (w - 1) / (x1 - x0)
ky = (h - 1) / (y1 - y0)
r = 20
border = r
imgw = (w + 2 * border)
imgh = (h + 2 * border)
img = np.zeros((imgh,imgw))
for x, y in data:
ix = int((x - x0) * kx) + border
iy = int((y - y0) * ky) + border
if 0 <= ix < imgw and 0 <= iy < imgh:
img[iy][ix] += 1
return ndi.gaussian_filter(img, (r,r)) ## gaussian convolution
def generate_graph():
n = 1000
# data points range
data_ymin = -2.
data_ymax = 2.
data_xmin = -2.
data_xmax = 2.
# view area range
view_ymin = -.5
view_ymax = .5
view_xmin = -.5
view_xmax = .5
# generate data
xl = np.random.uniform(data_xmin, data_xmax, n)
yl = np.random.uniform(data_ymin, data_ymax, n)
zl = np.random.uniform(0, 1, n)
# get visible data points
xlvis = []
ylvis = []
for i in range(0,len(xl)):
if view_xmin < xl[i] < view_xmax and view_ymin < yl[i] < view_ymax:
xlvis.append(xl[i])
ylvis.append(yl[i])
fig = plt.figure()
# plot histogram
plt1 = fig.add_subplot(221)
plt1.set_axis_off()
t0 = time.clock()
zd, xe, ye = np.histogram2d(yl, xl, bins=10, range=[[view_ymin, view_ymax],[view_xmin, view_xmax]], normed=True)
plt.title('numpy.histogram2d - '+str(time.clock()-t0)+"sec")
plt.imshow(zd, origin='lower', extent=[view_xmin, view_xmax, view_ymin, view_ymax])
plt.scatter(xlvis, ylvis)
# plot density calculated with kdtree
plt2 = fig.add_subplot(222)
plt2.set_axis_off()
xi = np.linspace(view_xmin, view_xmax, 256)
yi = np.linspace(view_ymin, view_ymax, 256)
t0 = time.clock()
zd = grid_density_kdtree(xl, yl, xi, yi, 70)
plt.title('function of 5 nearest using kdtree\n'+str(time.clock()-t0)+"sec")
cmap=cm.jet
A = (cmap(zd/256.0)*255).astype(np.uint8)
#A[:,:,3] = zd
plt.imshow(A , origin='lower', extent=[view_xmin, view_xmax, view_ymin, view_ymax])
plt.scatter(xlvis, ylvis)
# gaussian filter
plt3 = fig.add_subplot(223)
plt3.set_axis_off()
t0 = time.clock()
zd = grid_density_gaussian_filter(view_xmin, view_ymin, view_xmax, view_ymax, 256, 256, zip(xl, yl))
plt.title('ndi.gaussian_filter - '+str(time.clock()-t0)+"sec")
plt.imshow(zd , origin='lower', extent=[view_xmin, view_xmax, view_ymin, view_ymax])
plt.scatter(xlvis, ylvis)
# boxsum smoothing
plt3 = fig.add_subplot(224)
plt3.set_axis_off()
t0 = time.clock()
zd = grid_density_boxsum(view_xmin, view_ymin, view_xmax, view_ymax, 256, 256, zip(xl, yl))
plt.title('boxsum smoothing - '+str(time.clock()-t0)+"sec")
plt.imshow(zd, origin='lower', extent=[view_xmin, view_xmax, view_ymin, view_ymax])
plt.scatter(xlvis, ylvis)
if __name__=='__main__':
generate_graph()
plt.show()
This approach is along the lines of some previous answers: increment a pixel for each spot, then smooth the image with a gaussian filter. A 256x256 image runs in about 350ms on my 6-year-old laptop.
import numpy as np
import scipy.ndimage as ndi
data = np.random.rand(30000,2) ## create random dataset
inds = (data * 255).astype('uint') ## convert to indices
img = np.zeros((256,256)) ## blank image
for i in xrange(data.shape[0]): ## draw pixels
img[inds[i,0], inds[i,1]] += 1
img = ndi.gaussian_filter(img, (10,10))
A very simple implementation that could be done (with C) in realtime and that only takes fractions of a second in pure python is to just compute the result in screen space.
The algorithm is
Allocate the final matrix (e.g. 256x256) with all zeros
For each point in the dataset increment the corresponding cell
Replace each cell in the matrix with the sum of the values of the matrix in an NxN box centered on the cell. Repeat this step a few times.
Scale result and output
The computation of the box sum can be made very fast and independent of N by using a sum table. Every computation just requires two scans of the matrix... total complexity is O(S + WHP) where S is the number of points; W and H are the width and height of the output and P is the number of smoothing passes.
Below is the code for a pure python implementation (also very un-optimized); with 30000 points and a 256x256 output grayscale image the computation takes 0.5 sec, including linear scaling to 0..255 and saving of a .pgm file (N = 5, 4 passes).
def boxsum(img, w, h, r):
st = [0] * (w+1) * (h+1)
for x in xrange(w):
st[x+1] = st[x] + img[x]
for y in xrange(h):
st[(y+1)*(w+1)] = st[y*(w+1)] + img[y*w]
for x in xrange(w):
st[(y+1)*(w+1)+(x+1)] = st[(y+1)*(w+1)+x] + st[y*(w+1)+(x+1)] - st[y*(w+1)+x] + img[y*w+x]
for y in xrange(h):
y0 = max(0, y - r)
y1 = min(h, y + r + 1)
for x in xrange(w):
x0 = max(0, x - r)
x1 = min(w, x + r + 1)
img[y*w+x] = st[y0*(w+1)+x0] + st[y1*(w+1)+x1] - st[y1*(w+1)+x0] - st[y0*(w+1)+x1]
def saveGraph(w, h, data):
X = [x for x, y in data]
Y = [y for x, y in data]
x0, y0, x1, y1 = min(X), min(Y), max(X), max(Y)
kx = (w - 1) / (x1 - x0)
ky = (h - 1) / (y1 - y0)
img = [0] * (w * h)
for x, y in data:
ix = int((x - x0) * kx)
iy = int((y - y0) * ky)
img[iy * w + ix] += 1
for p in xrange(4):
boxsum(img, w, h, 2)
mx = max(img)
k = 255.0 / mx
out = open("result.pgm", "wb")
out.write("P5\n%i %i 255\n" % (w, h))
out.write("".join(map(chr, [int(v*k) for v in img])))
out.close()
import random
data = [(random.random(), random.random())
for i in xrange(30000)]
saveGraph(256, 256, data)
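For what it's worth, here is a numpy sketch of the same box-sum pass (my addition, not part of the original answer), building the summed-area table with cumulative sums; it assumes the image is a 2D array rather than a flat list:
import numpy as np

def boxsum_np(img, r):
    # img: 2D array; returns an array where each cell holds the sum of the
    # (2r+1) x (2r+1) box centered on it, clipped at the image borders.
    h, w = img.shape
    st = np.zeros((h + 1, w + 1))
    st[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)   # summed-area table
    y0 = np.clip(np.arange(h) - r, 0, h)
    y1 = np.clip(np.arange(h) + r + 1, 0, h)
    x0 = np.clip(np.arange(w) - r, 0, w)
    x1 = np.clip(np.arange(w) + r + 1, 0, w)
    return (st[y1[:, None], x1[None, :]] - st[y1[:, None], x0[None, :]]
            - st[y0[:, None], x1[None, :]] + st[y0[:, None], x0[None, :]])
Each smoothing pass over the counts image could then be written as img = boxsum_np(img, r) instead of the in-place python loops.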
Edit
Of course the very definition of density in your case depends on a resolution radius, or is the density just +inf when you hit a point and zero when you don't?
The following is an animation built with the above program with just a few cosmetic changes:
used sqrt(average of squared values) instead of sum for the averaging pass
color-coded the results
stretched the result to always use the full color scale
drew antialiased black dots where the data points are
made an animation by incrementing the radius from 2 to 40
The total computing time of the 39 frames of the following animation with this cosmetic version is 5.4 seconds with PyPy and 26 seconds with standard Python.
Histograms
The histogram way is not the fastest, and can't tell the difference between an arbitrarily small separation of points and 2 * sqrt(2) * b (where b is bin width).
Even if you construct the x bins and y bins separately (O(N)), you still have to perform some ab convolution (number of bins each way), which is close to N^2 for any dense system, and even bigger for a sparse one (well, ab >> N^2 in a sparse system.)
Looking at the code above, you have a loop in grid_density() that runs over the number of bins in y inside a loop over the number of bins in x, which is why you're getting O(N^2) performance (although if you are already order N - which you should check by plotting timings for different numbers of elements - then you're just going to have to run less code per cycle).
If you want an actual distance function then you need to start looking at contact detection algorithms.
Contact Detection
Naive contact detection algorithms come in at O(N^2) in either RAM or CPU time, but there is an algorithm, rightly or wrongly attributed to Munjiza at St. Mary's college London, which runs in linear time and RAM.
You can read about it and implement it yourself from his book, if you like.
In fact, I have written this code myself.
I have written a python-wrapped C implementation of this in 2D, which is not really ready for production (it is still single threaded, etc.), but it will run in as close to O(N) as your dataset will allow. You set the "element size", which acts as a bin size (the code will call interactions on everything within b of another point, and sometimes between b and 2 * sqrt(2) * b), give it an array (a native python list) of objects with an x and y property, and my C module will call back to a python function of your choice to run an interaction function for matched pairs of elements. It's designed for running contact-force DEM simulations, but it will work fine on this problem too.
As I haven't released it yet, because the other bits of the library aren't ready yet, I'll have to give you a zip of my current source but the contact detection part is solid. The code is LGPL'd.
You'll need Cython and a C compiler to make it work, and it has only been tested and working under *nix environments; if you're on Windows, you'll need the mingw C compiler for Cython to work at all.
Once Cython's installed, building/installing pynet should be a case of running setup.py.
The function you are interested in is pynet.d2.run_contact_detection(py_elements, py_interaction_function, py_simulation_parameters) (and you should check out the classes Element and SimulationParameters at the same level if you want it to throw less errors - look in the file at archive-root/pynet/d2/__init__.py to see the class implementations, they're trivial data holders with useful constructors.)
(I will update this answer with a public mercurial repo when the code is ready for more general release...)
Your solution is okay, but one clear problem is that you're getting dark regions despite there being a point right in the middle of them.
I would instead center an n-dimensional Gaussian on each point and evaluate the sum over each point you want to display. To reduce it to linear time in the common case, use query_ball_point to consider only points within a couple standard deviations.
If you find that the KDTree is really slow, why not call query_ball_point once every five pixels with a slightly larger threshold? It doesn't hurt too much to evaluate a few too many Gaussians.
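A rough sketch of that idea (my own illustration, not code from the answer), using scipy's cKDTree and query_ball_point to sum a Gaussian over nearby points only:
import numpy as np
from scipy.spatial import cKDTree

def gaussian_density(points, grid_x, grid_y, sigma):
    # points: (N, 2) array of data points; grid_x, grid_y: 1D arrays of pixel centres.
    tree = cKDTree(points)
    out = np.zeros((len(grid_y), len(grid_x)))
    cutoff = 3 * sigma                      # ignore points beyond ~3 standard deviations
    for iy, yc in enumerate(grid_y):
        for ix, xc in enumerate(grid_x):
            idx = tree.query_ball_point((xc, yc), cutoff)
            if idx:
                d2 = ((points[idx] - (xc, yc)) ** 2).sum(axis=1)
                out[iy, ix] = np.exp(-d2 / (2 * sigma ** 2)).sum()
    return out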
You can do this with a 2D separable convolution (scipy.ndimage.convolve1d) of your original image with a Gaussian-shaped kernel. With an image size of MxM and a filter size of P, the complexity is O(PM^2) using separable filtering. The "Big-Oh" complexity is no doubt greater, but you can take advantage of numpy's efficient array operations, which should greatly speed up your calculations.
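For illustration (my sketch, not part of the original answer), a separable Gaussian smoothing of the counts image using scipy.ndimage.convolve1d along each axis in turn:
import numpy as np
import scipy.ndimage as ndi

def separable_gaussian(img, sigma, radius=None):
    # Build a 1D Gaussian kernel and apply it along each axis in turn;
    # this is equivalent to a full 2D Gaussian convolution but cheaper.
    if radius is None:
        radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    smoothed = ndi.convolve1d(img, kernel, axis=0)
    return ndi.convolve1d(smoothed, kernel, axis=1)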
Just a note, the histogram2d function should work fine for this. Did you play around with different bin sizes? Your initial histogram2d plot seems to just use the default bin sizes... but there's no reason to expect the default sizes to give you the representation you want. Having said that, many of the other solutions are impressive too.
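For example (my illustration, reusing the variable names from the question's generate_graph()), the bin count can be passed explicitly and the result smoothed afterwards:
zd, xe, ye = np.histogram2d(yl, xl, bins=256,
                            range=[[view_ymin, view_ymax], [view_xmin, view_xmax]])
zd = ndi.gaussian_filter(zd, 8)   # optional smoothing of the finely-binned counts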