Related
I am plotting a vector field using the numpy function quiver() and it works. But I would like to emphasize the cowlick in the following plot:
I am not sure how to go about it, but increasing the density of arrows in the center could possibly do the trick. To do so, I would like to resort to some option within np.meshgrid() that would allow me to get more tightly packed x,y coordinate points in the center. A linear, quadratic or other specification does not seem to be built in. I am not sure if sparse can be modified to this end.
The code:
lim = 10
int = 0.22 *lim
x,y = np.meshgrid(np.arange(-lim, lim, int), np.arange(-lim, lim, int))
u = 3 * np.cos(np.arctan2(y,x)) - np.sqrt(x**2+y**2) * np.sin(np.arctan2(y,x))
v = 3 * np.sin(np.arctan2(y,x)) + np.sqrt(x**2+y**2) * np.cos(np.arctan2(y,x))
color = x**2 + y**2
plt.rcParams["image.cmap"] = "Greys_r"
mult = 1
plt.figure(figsize=(mult*lim, mult*lim))
plt.quiver(x,y,u,v,color, linewidths=.006, lw=.1)
plt.show()
Closing the loop on this, thanks to the accepted answer I was able to finally strike a balance between the density of the mesh as I learned from to do from #flwr and keeping the "cowlick" structure of the vector field conspicuous (avoiding the radial structure around the origin as much as possible):
You can construct the points whereever you want to calculate your field on and quivers will be happy about it. The code below uses polar coordinates and stretches the radial coordinate non-linearly.
import numpy as np
import matplotlib.pyplot as plt
lim = 10
N = 10
theta = np.linspace(0.1, 2*np.pi, N*2)
stretcher_factor = 2
r = np.linspace(0.3, lim**(1/stretcher_factor), N)**stretcher_factor
R, THETA = np.meshgrid(r, theta)
x = R * np.cos(THETA)
y = R * np.sin(THETA)
# x,y = np.meshgrid(x, y)
r = x**2 + y**2
u = 3 * np.cos(THETA) - np.sqrt(r) * np.sin(THETA)
v = 3 * np.sin(THETA) + np.sqrt(r) * np.cos(THETA)
plt.rcParams["image.cmap"] = "Greys_r"
mult = 1
plt.figure(figsize=(mult*lim, mult*lim))
plt.quiver(x,y,u,v,r, linewidths=.006, lw=.1)
Edit: Bug taking meshgrid twice
np.meshgrid just makes a grid of the vectors you provide.
What you could do is contract this regular grid in the center to have more points in the center (best visible with more points), e.g. like so:
# contract in the center
a = 0.5 # how far to contract
b = 0.8 # how strongly to contract
c = 1 - b*np.exp(-((x/lim)**2 + (y/lim)**2)/a**2)
x, y = c*x, c*y
plt.plot(x,y,'.k')
plt.show()
Alternatively you can x,y cooridnates that are not dependent on a grid at all:
x = np.random.randn(500)
y = np.random.randn(500)
plt.plot(x,y,'.k')
plt.show()
But I think you'd prefer a slightly more regular patterns you could look into poisson disk sampling with adaptive distances or something like that, but the key point here is that for using quiver, you can use ANY set of coordinates, they do not have to be in a regular grid.
I wanna have multiple density plots in one Python Pyx plot.
I can do this to have two density plots:
Question: how to remove unecessary repeated color bar?
Example code is from here:
from pyx import *
f = canvas.canvas()
re_min = -2
re_max = 0.5
im_min = -1.25
im_max = 1.25
gridx = 100
gridy = 100
max_iter = 10
re_step = (re_max - re_min) / gridx
im_step = (im_max - im_min) / gridy
d = []
for re_index in range(gridx):
re = re_min + re_step * (re_index + 0.5)
for im_index in range(gridy):
im = im_min + im_step * (im_index + 0.5)
c = complex(re, im)
n = 0
z = complex(0, 0)
while n < max_iter and abs(z) < 2:
z = (z * z) + c
n += 1
d.append([re, im, n])
g1 = graph.graphxy(height=8, width=8,
x=graph.axis.linear(min=re_min, max=re_max, title=r"$\Re(c)$"),
y=graph.axis.linear(min=im_min, max=im_max, title=r'$\Im(c)$'))
g1.plot(graph.data.points(d, x=1, y=2, color=3, title="iterations"),
[graph.style.density(gradient=color.rgbgradient.Rainbow)])
f.insert(g1)
g2 = graph.graphxy(height=8, width=8, xpos=g1.xpos+14.0,
x=graph.axis.linear(min=re_min, max=re_max, title=r"$\Re(c)$"),
y=graph.axis.linear(min=im_min, max=im_max, title=r'$\Im(c)$'))
g2.plot(graph.data.points(d, x=1, y=2, color=3, title="iterations"),
[graph.style.density(gradient=color.rgbgradient.Rainbow)])
f.insert(g2)
f.writePDFfile()
The color bar is called a keygraph, and it is a property of the density style. You can set it to None, i.e.
graph.style.density(gradient=color.rgbgradient.Rainbow, keygraph=None)
which does not remove the (automatic) key graph internally, but it supresses its output.
You can also set the keygraph yourself, and in addition, you can set the coloraxis of this keygraph. It is a property of the style, too, and a simple linear axis by default, but can be changed (like fixing min and max values).
Now, when you suppress the keygraph, it is not sure, that the same axis will be used in both graphs (even if you share the same axis, as long as you remain using flexible ranges). There are various solutions out. Let me give a more advanced one. :-)
In the first graph, we could keep a copy of the plotitem, finish the plot (which creates the keygraph) and then access the coloraxis like this:
d1 = graph.style.density(gradient=color.rgbgradient.Rainbow)
plotitem = g1.plot(graph.data.points(d, x=1, y=2, color=3, title="iterations"), [d1])
f.insert(g1)
g1.finish()
coloraxis = plotitem.privatedatalist[-1].keygraph.axes['x']
Now you can use this coloraxis in the second graph while still supressing the keygraph:
d2 = graph.style.density(gradient=color.rgbgradient.Rainbow, keygraph=None,
coloraxis=graph.axis.linkedaxis(coloraxis))
This ensures the same scale in the keygraph and thus the same colors. :-)
I have a grid containing some data in polar coordinates, simulating data obtained from a LIDAR for the SLAM problem. Each row in the grid represents the angle, and each column represents a distance. The values contained in the grid store a weighted probability of the occupancy map for a Cartesian world.
After converting to Cartesian Coordinates, I obtain something like this:
This mapping is intended to work in a FastSLAM application, with at least 10 particles. The performance I am obtaining isn't good enough for a reliable application.
I have tried with nested loops, using the scipy.ndimage.geometric_transform library and accessing directly the grid with pre-computed coordinates.
In those examples, I am working with a 800x800 grid.
Nested loops: aprox 300ms
i = 0
for scan in scans:
hit = scan < laser.range_max
if hit:
d = np.linspace(scan + wall_size, 0, num=int((scan+ wall_size)/cell_size))
else:
d = np.linspace(scan, 0, num=int(scan/cell_size))
for distance in distances:
x = int(pos[0] + d * math.cos(angle[i]+pos[2]))
y = int(pos[1] + d * math.sin(angle[i]+pos[2]))
if distance > scan:
grid_cart[y][x] = grid_cart[y][x] + hit_weight
else:
grid_cart[y][x] = grid_cart[y][x] + miss_weight
i = i + 1
Scipy library (Described here): aprox 2500ms (Gives a smoother result since it interpolates the empty cells)
grid_cart = S.ndimage.geometric_transform(weight_mat, polar2cartesian,
order=0,
output_shape = (weight_mat.shape[0] * 2, weight_mat.shape[0] * 2),
extra_keywords = {'inputshape':weight_mat.shape,
'origin':(weight_mat.shape[0], weight_mat.shape[0])})
def polar2cartesian(outcoords, inputshape, origin):
"""Coordinate transform for converting a polar array to Cartesian coordinates.
inputshape is a tuple containing the shape of the polar array. origin is a
tuple containing the x and y indices of where the origin should be in the
output array."""
xindex, yindex = outcoords
x0, y0 = origin
x = xindex - x0
y = yindex - y0
r = np.sqrt(x**2 + y**2)
theta = np.arctan2(y, x)
theta_index = np.round((theta + np.pi) * inputshape[1] / (2 * np.pi))
return (r,theta_index)
Pre-computed indexes: 80ms
for i in range(0, 144000):
gird_cart[ys[i]][xs[i]] = grid_polar_1d[i]
I am not very used to python and Numpy, and I feel I am skipping an easy and fast way to solve this problem. Are there any other alternatives to solve that?
Many thanks to you all!
I came across a piece of code that seems to behave x10 times faster (8ms):
angle_resolution = 1
range_max = 400
a, r = np.mgrid[0:int(360/angle_resolution),0:range_max]
x = (range_max + r * np.cos(a*(2*math.pi)/360.0)).astype(int)
y = (range_max + r * np.sin(a*(2*math.pi)/360.0)).astype(int)
for i in range(0, int(360/angle_resolution)):
cart_grid[y[i,:],x[i,:]] = polar_grid[i,:]
Here's a rough explanation of what I do in vtk:
Create a surface (a minimal surface, not too relevant what it is, the geometry is important though: the gyroid has two labyrinths that are completely shut off from each other).
use vtkClipClosedSurface to shut off one of the labyrinths so that I get an object that has no open surfaces anymore. A regular surface looks like this, with a closed surface it looks like this.
Here's my problem: For more complicated versions of my structure, I get this:
Can you see how on the top left it works fine and near the bottom right it stops creating surfaces? Sometimes I also get really weird triangles in that last part.
To my understanding vtkClipClosedSurface knows from the surface normals where to close a surface and where not. The thing is: The normals of my structure are fine and they all point in the right direction. If you take a closer look on the structure you will notice that the lower part is basically an inversion of the top part that changes gradually, all in one surface.
I tried to modify my structure before cutting with many things like vtkSmoothPolyDataFilter, vtkCleanPolyData or vtkPolyDataNormals. I even tried extracting the boundary surfaces with vtkFeatureEdges, which led to an even worse result. Even vtkFillHolesFilter didn't yield any acceptable results. My surface seems flawless and easy enough to create a boundary.
I have no idea what else to try. This happens for other structures, too. Fixing it with a CAD tool is out of question, because it is supposed to work out of box. Please help me!
Here's another example of a geometry that doesn't close the surface properly. This time I used vtkFillHolesFilter which results in surfaces on the inside of the structure, while they should only occupy the boundary of te object.
In case you need a more detailed rundown of my pipeline, here goes:
create surface using mayavi.mlab.contour3d
get the PolyData by extracting the actor.mapper.input
convert format from tvtk to regular vtk
vtkClipClosedSurface with a plane collection that cuts away part of the structure (errors occur when the plane collection is the same as the structure boundary)
visualize it
Edit: Okay, this did not receive enough attention, so I constructed a minimal, complete and verifiable working example that reproduces the behaviour:
import numpy as np
import vtk # VTK version 7.0
from mayavi import mlab # mayavi version 4.4.4
from mayavi.api import Engine, OffScreenEngine
from tvtk.api import tvtk
def schwarz_D(x, y, z, linear_term=0):
"""This is the function for the Schwarz Diamond level surface."""
return (np.sin(x) * np.sin(y) * np.sin(z) + np.sin(x) * np.cos(y) * np.cos(z) +
np.cos(x) * np.sin(y) * np.cos(z) + np.cos(x) * np.cos(y) * np.sin(z)) - linear_term * z
def plane_collection(xn, x, yn, y, zn, z):
"""Defines the 6 planes for cutting rectangular objects to the right size."""
plane1 = vtk.vtkPlane()
plane1.SetOrigin(x, 0, 0)
plane1.SetNormal(-1, 0, 0)
plane2 = vtk.vtkPlane()
plane2.SetOrigin(0, y, 0)
plane2.SetNormal(0, -1, 0)
plane3 = vtk.vtkPlane()
plane3.SetOrigin(0, 0, z)
plane3.SetNormal(0, 0, -1)
plane4 = vtk.vtkPlane()
plane4.SetOrigin(xn, 0, 0)
plane4.SetNormal(1, 0, 0)
plane5 = vtk.vtkPlane()
plane5.SetOrigin(0, yn, 0)
plane5.SetNormal(0, 1, 0)
plane6 = vtk.vtkPlane()
plane6.SetOrigin(0, 0, zn)
plane6.SetNormal(0, 0, 1)
plane_list = [plane4, plane1, plane5, plane2, plane6, plane3]
planes = vtk.vtkPlaneCollection()
for item in plane_list:
planes.AddItem(item)
return planes
[nx, ny, nz] = [2, 2, 8] # amount of unit cells
cell_size = 1
gradient_value = 0.04 # only values below 0.1 produce the desired geometry; this term is essential
x, y, z = np.mgrid[-cell_size*(nx + 1)/2:cell_size*(nx + 1)/2:100j,
-cell_size*(ny + 1)/2:cell_size*(ny + 1)/2:100j,
-cell_size*(nz + 1)/2:cell_size*(nz + 1)/2:100*2j] * np.pi / (cell_size/2)
# engine = Engine()
engine = OffScreenEngine() # do not start mayavi GUI
engine.start()
fig = mlab.figure(figure=None, engine=engine)
contour3d = mlab.contour3d(x, y, z, schwarz_D(x, y, z, gradient_value), figure=fig)
scene = engine.scenes[0]
actor = contour3d.actor.actors[0]
iso_surface = scene.children[0].children[0].children[0]
iso_surface.contour.minimum_contour = 0
iso_surface.contour.number_of_contours = 1
iso_surface.compute_normals = False
iso_surface.contour.auto_update_range = False
mlab.draw(fig)
# mlab.show() # enable if you want to see the mayavi GUI
polydata = tvtk.to_vtk(actor.mapper.input) # convert tvtkPolyData to vtkPolyData
# Move object to the coordinate center to make clipping easier later on.
center_coords = np.array(polydata.GetCenter())
center = vtk.vtkTransform()
center.Translate(-center_coords[0], -center_coords[1], -center_coords[2])
centerFilter = vtk.vtkTransformPolyDataFilter()
centerFilter.SetTransform(center)
centerFilter.SetInputData(polydata)
centerFilter.Update()
# Reverse normals in order to receive a closed surface after clipping
reverse = vtk.vtkReverseSense()
reverse.SetInputConnection(centerFilter.GetOutputPort())
reverse.ReverseNormalsOn()
reverse.ReverseCellsOn()
reverse.Update()
bounds = np.asarray(reverse.GetOutput().GetBounds())
clip = vtk.vtkClipClosedSurface()
clip.SetInputConnection(reverse.GetOutputPort())
clip.SetTolerance(10e-3)
# clip.TriangulationErrorDisplayOn() # enable to see errors for not watertight surfaces
clip.SetClippingPlanes(plane_collection(bounds[0] + cell_size/2, bounds[1] - cell_size/2,
bounds[2] + cell_size/2, bounds[3] - cell_size/2,
bounds[4] + cell_size/2, bounds[5] - cell_size/2))
clip.Update()
# Render the result
mapper = vtk.vtkPolyDataMapper()
mapper.SetInputConnection(clip.GetOutputPort())
actor = vtk.vtkActor()
actor.SetMapper(mapper)
renderer = vtk.vtkRenderer()
renderWindow = vtk.vtkRenderWindow()
renderWindow.AddRenderer(renderer)
renderWindowInteractor = vtk.vtkRenderWindowInteractor()
renderWindowInteractor.SetRenderWindow(renderWindow)
renderer.AddActor(actor)
renderWindow.Render()
renderWindowInteractor.Start()
This really is a short as it gets, I stripped as much as I could. The problem still persists and I can't figure out a solution.
Try using pymeshfix. I had a very similar problem with some low-res mandelbulbs I was generating.
You may also want ot check out pyvista, it's a nice python wrapper for vtk.
Great problem and thanks for the example.
I was able to get this to work in pyvista with some modifications:
import numpy as np
import pyvista as pv
def schwarz_D(x, y, z, linear_term=0):
"""This is the function for the Schwarz Diamond level surface."""
return (np.sin(x) * np.sin(y) * np.sin(z) + np.sin(x) * np.cos(y) * np.cos(z) +
np.cos(x) * np.sin(y) * np.cos(z) + np.cos(x) * np.cos(y) * np.sin(z)) - linear_term * z
# Create the grid
[nx, ny, nz] = [2, 2, 8] # amount of unit cells
cell_size = 1
gradient_value = 0.04 # only values below 0.1 produce the desired geometry; this term is essential
x, y, z = np.mgrid[-cell_size*(nx + 1)/2:cell_size*(nx + 1)/2:100j,
-cell_size*(ny + 1)/2:cell_size*(ny + 1)/2:100j,
-cell_size*(nz + 1)/2:cell_size*(nz + 1)/2:100*2j] * np.pi / (cell_size/2)
# make a grid and exclude cells below 0.1
grid = pv.StructuredGrid(x, y, z)
grid['scalars'] = schwarz_D(x, y, z, gradient_value).ravel(order='F')
contour = grid.clip_scalar(value=0.1)
# make a bunch of clips
# bounds = contour.bounds
# contour.clip(origin=(bounds[0] + cell_size/2, 0, 0), normal='-x', inplace=True)
# contour.clip(origin=(0, bounds[1] - cell_size/2, 0), normal='-y', inplace=True)
# contour.clip(origin=(0, 0, bounds[2] + cell_size/2), normal='-z', inplace=True)
# contour.clip(origin=(bounds[3] - cell_size/2, 0, 0), normal='x', inplace=True)
# contour.clip(origin=(0, bounds[4] + cell_size/2, 0), normal='y', inplace=True)
# contour.clip(origin=(0, 0, bounds[5] - cell_size/2), normal='z', inplace=True)
contour.plot(smooth_shading=True, color='w')
I'm not sure why you're using clipping planes, and I think that you would be better off simply limiting your x, y, and z ranges put into creating the grids. That way, you wouldn't have to clip the final mesh.
I am attempting to generate map overlay images that would assist in identifying hot-spots, that is areas on the map that have high density of data points. None of the approaches that I've tried are fast enough for my needs.
Note: I forgot to mention that the algorithm should work well under both low and high zoom scenarios (or low and high data point density).
I looked through numpy, pyplot and scipy libraries, and the closest I could find was numpy.histogram2d. As you can see in the image below, the histogram2d output is rather crude. (Each image includes points overlaying the heatmap for better understanding)
My second attempt was to iterate over all the data points, and then calculate the hot-spot value as a function of distance. This produced a better looking image, however it is too slow to use in my application. Since it's O(n), it works ok with 100 points, but blows out when I use my actual dataset of 30000 points.
My final attempt was to store the data in an KDTree, and use the nearest 5 points to calculate the hot-spot value. This algorithm is O(1), so much faster with large dataset. It's still not fast enough, it takes about 20 seconds to generate a 256x256 bitmap, and I would like this to happen in around 1 second time.
Edit
The boxsum smoothing solution provided by 6502 works well at all zoom levels and is much faster than my original methods.
The gaussian filter solution suggested by Luke and Neil G is the fastest.
You can see all four approaches below, using 1000 data points in total, at 3x zoom there are around 60 points visible.
Complete code that generates my original 3 attempts, the boxsum smoothing solution provided by 6502 and gaussian filter suggested by Luke (improved to handle edges better and allow zooming in) is here:
import matplotlib
import numpy as np
from matplotlib.mlab import griddata
import matplotlib.cm as cm
import matplotlib.pyplot as plt
import math
from scipy.spatial import KDTree
import time
import scipy.ndimage as ndi
def grid_density_kdtree(xl, yl, xi, yi, dfactor):
zz = np.empty([len(xi),len(yi)], dtype=np.uint8)
zipped = zip(xl, yl)
kdtree = KDTree(zipped)
for xci in range(0, len(xi)):
xc = xi[xci]
for yci in range(0, len(yi)):
yc = yi[yci]
density = 0.
retvalset = kdtree.query((xc,yc), k=5)
for dist in retvalset[0]:
density = density + math.exp(-dfactor * pow(dist, 2)) / 5
zz[yci][xci] = min(density, 1.0) * 255
return zz
def grid_density(xl, yl, xi, yi):
ximin, ximax = min(xi), max(xi)
yimin, yimax = min(yi), max(yi)
xxi,yyi = np.meshgrid(xi,yi)
#zz = np.empty_like(xxi)
zz = np.empty([len(xi),len(yi)])
for xci in range(0, len(xi)):
xc = xi[xci]
for yci in range(0, len(yi)):
yc = yi[yci]
density = 0.
for i in range(0,len(xl)):
xd = math.fabs(xl[i] - xc)
yd = math.fabs(yl[i] - yc)
if xd < 1 and yd < 1:
dist = math.sqrt(math.pow(xd, 2) + math.pow(yd, 2))
density = density + math.exp(-5.0 * pow(dist, 2))
zz[yci][xci] = density
return zz
def boxsum(img, w, h, r):
st = [0] * (w+1) * (h+1)
for x in xrange(w):
st[x+1] = st[x] + img[x]
for y in xrange(h):
st[(y+1)*(w+1)] = st[y*(w+1)] + img[y*w]
for x in xrange(w):
st[(y+1)*(w+1)+(x+1)] = st[(y+1)*(w+1)+x] + st[y*(w+1)+(x+1)] - st[y*(w+1)+x] + img[y*w+x]
for y in xrange(h):
y0 = max(0, y - r)
y1 = min(h, y + r + 1)
for x in xrange(w):
x0 = max(0, x - r)
x1 = min(w, x + r + 1)
img[y*w+x] = st[y0*(w+1)+x0] + st[y1*(w+1)+x1] - st[y1*(w+1)+x0] - st[y0*(w+1)+x1]
def grid_density_boxsum(x0, y0, x1, y1, w, h, data):
kx = (w - 1) / (x1 - x0)
ky = (h - 1) / (y1 - y0)
r = 15
border = r * 2
imgw = (w + 2 * border)
imgh = (h + 2 * border)
img = [0] * (imgw * imgh)
for x, y in data:
ix = int((x - x0) * kx) + border
iy = int((y - y0) * ky) + border
if 0 <= ix < imgw and 0 <= iy < imgh:
img[iy * imgw + ix] += 1
for p in xrange(4):
boxsum(img, imgw, imgh, r)
a = np.array(img).reshape(imgh,imgw)
b = a[border:(border+h),border:(border+w)]
return b
def grid_density_gaussian_filter(x0, y0, x1, y1, w, h, data):
kx = (w - 1) / (x1 - x0)
ky = (h - 1) / (y1 - y0)
r = 20
border = r
imgw = (w + 2 * border)
imgh = (h + 2 * border)
img = np.zeros((imgh,imgw))
for x, y in data:
ix = int((x - x0) * kx) + border
iy = int((y - y0) * ky) + border
if 0 <= ix < imgw and 0 <= iy < imgh:
img[iy][ix] += 1
return ndi.gaussian_filter(img, (r,r)) ## gaussian convolution
def generate_graph():
n = 1000
# data points range
data_ymin = -2.
data_ymax = 2.
data_xmin = -2.
data_xmax = 2.
# view area range
view_ymin = -.5
view_ymax = .5
view_xmin = -.5
view_xmax = .5
# generate data
xl = np.random.uniform(data_xmin, data_xmax, n)
yl = np.random.uniform(data_ymin, data_ymax, n)
zl = np.random.uniform(0, 1, n)
# get visible data points
xlvis = []
ylvis = []
for i in range(0,len(xl)):
if view_xmin < xl[i] < view_xmax and view_ymin < yl[i] < view_ymax:
xlvis.append(xl[i])
ylvis.append(yl[i])
fig = plt.figure()
# plot histogram
plt1 = fig.add_subplot(221)
plt1.set_axis_off()
t0 = time.clock()
zd, xe, ye = np.histogram2d(yl, xl, bins=10, range=[[view_ymin, view_ymax],[view_xmin, view_xmax]], normed=True)
plt.title('numpy.histogram2d - '+str(time.clock()-t0)+"sec")
plt.imshow(zd, origin='lower', extent=[view_xmin, view_xmax, view_ymin, view_ymax])
plt.scatter(xlvis, ylvis)
# plot density calculated with kdtree
plt2 = fig.add_subplot(222)
plt2.set_axis_off()
xi = np.linspace(view_xmin, view_xmax, 256)
yi = np.linspace(view_ymin, view_ymax, 256)
t0 = time.clock()
zd = grid_density_kdtree(xl, yl, xi, yi, 70)
plt.title('function of 5 nearest using kdtree\n'+str(time.clock()-t0)+"sec")
cmap=cm.jet
A = (cmap(zd/256.0)*255).astype(np.uint8)
#A[:,:,3] = zd
plt.imshow(A , origin='lower', extent=[view_xmin, view_xmax, view_ymin, view_ymax])
plt.scatter(xlvis, ylvis)
# gaussian filter
plt3 = fig.add_subplot(223)
plt3.set_axis_off()
t0 = time.clock()
zd = grid_density_gaussian_filter(view_xmin, view_ymin, view_xmax, view_ymax, 256, 256, zip(xl, yl))
plt.title('ndi.gaussian_filter - '+str(time.clock()-t0)+"sec")
plt.imshow(zd , origin='lower', extent=[view_xmin, view_xmax, view_ymin, view_ymax])
plt.scatter(xlvis, ylvis)
# boxsum smoothing
plt3 = fig.add_subplot(224)
plt3.set_axis_off()
t0 = time.clock()
zd = grid_density_boxsum(view_xmin, view_ymin, view_xmax, view_ymax, 256, 256, zip(xl, yl))
plt.title('boxsum smoothing - '+str(time.clock()-t0)+"sec")
plt.imshow(zd, origin='lower', extent=[view_xmin, view_xmax, view_ymin, view_ymax])
plt.scatter(xlvis, ylvis)
if __name__=='__main__':
generate_graph()
plt.show()
This approach is along the lines of some previous answers: increment a pixel for each spot, then smooth the image with a gaussian filter. A 256x256 image runs in about 350ms on my 6-year-old laptop.
import numpy as np
import scipy.ndimage as ndi
data = np.random.rand(30000,2) ## create random dataset
inds = (data * 255).astype('uint') ## convert to indices
img = np.zeros((256,256)) ## blank image
for i in xrange(data.shape[0]): ## draw pixels
img[inds[i,0], inds[i,1]] += 1
img = ndi.gaussian_filter(img, (10,10))
A very simple implementation that could be done (with C) in realtime and that only takes fractions of a second in pure python is to just compute the result in screen space.
The algorithm is
Allocate the final matrix (e.g. 256x256) with all zeros
For each point in the dataset increment the corresponding cell
Replace each cell in the matrix with the sum of the values of the matrix in an NxN box centered on the cell. Repeat this step a few times.
Scale result and output
The computation of the box sum can be made very fast and independent on N by using a sum table. Every computation just requires two scan of the matrix... total complexity is O(S + WHP) where S is the number of points; W, H are width and height of output and P is the number of smoothing passes.
Below is the code for a pure python implementation (also very un-optimized); with 30000 points and a 256x256 output grayscale image the computation is 0.5sec including linear scaling to 0..255 and saving of a .pgm file (N = 5, 4 passes).
def boxsum(img, w, h, r):
st = [0] * (w+1) * (h+1)
for x in xrange(w):
st[x+1] = st[x] + img[x]
for y in xrange(h):
st[(y+1)*(w+1)] = st[y*(w+1)] + img[y*w]
for x in xrange(w):
st[(y+1)*(w+1)+(x+1)] = st[(y+1)*(w+1)+x] + st[y*(w+1)+(x+1)] - st[y*(w+1)+x] + img[y*w+x]
for y in xrange(h):
y0 = max(0, y - r)
y1 = min(h, y + r + 1)
for x in xrange(w):
x0 = max(0, x - r)
x1 = min(w, x + r + 1)
img[y*w+x] = st[y0*(w+1)+x0] + st[y1*(w+1)+x1] - st[y1*(w+1)+x0] - st[y0*(w+1)+x1]
def saveGraph(w, h, data):
X = [x for x, y in data]
Y = [y for x, y in data]
x0, y0, x1, y1 = min(X), min(Y), max(X), max(Y)
kx = (w - 1) / (x1 - x0)
ky = (h - 1) / (y1 - y0)
img = [0] * (w * h)
for x, y in data:
ix = int((x - x0) * kx)
iy = int((y - y0) * ky)
img[iy * w + ix] += 1
for p in xrange(4):
boxsum(img, w, h, 2)
mx = max(img)
k = 255.0 / mx
out = open("result.pgm", "wb")
out.write("P5\n%i %i 255\n" % (w, h))
out.write("".join(map(chr, [int(v*k) for v in img])))
out.close()
import random
data = [(random.random(), random.random())
for i in xrange(30000)]
saveGraph(256, 256, data)
Edit
Of course the very definition of density in your case depends on a resolution radius, or is the density just +inf when you hit a point and zero when you don't?
The following is an animation built with the above program with just a few cosmetic changes:
used sqrt(average of squared values) instead of sum for the averaging pass
color-coded the results
stretching the result to always use the full color scale
drawn antialiased black dots where the data points are
made an animation by incrementing the radius from 2 to 40
The total computing time of the 39 frames of the following animation with this cosmetic version is 5.4 seconds with PyPy and 26 seconds with standard Python.
Histograms
The histogram way is not the fastest, and can't tell the difference between an arbitrarily small separation of points and 2 * sqrt(2) * b (where b is bin width).
Even if you construct the x bins and y bins separately (O(N)), you still have to perform some ab convolution (number of bins each way), which is close to N^2 for any dense system, and even bigger for a sparse one (well, ab >> N^2 in a sparse system.)
Looking at the code above, you seem to have a loop in grid_density() which runs over the number of bins in y inside a loop of the number of bins in x, which is why you're getting O(N^2) performance (although if you are already order N, which you should plot on different numbers of elements to see, then you're just going to have to run less code per cycle).
If you want an actual distance function then you need to start looking at contact detection algorithms.
Contact Detection
Naive contact detection algorithms come in at O(N^2) in either RAM or CPU time, but there is an algorithm, rightly or wrongly attributed to Munjiza at St. Mary's college London, which runs in linear time and RAM.
you can read about it and implement it yourself from his book, if you like.
I have written this code myself, in fact
I have written a python-wrapped C implementation of this in 2D, which is not really ready for production (it is still single threaded, etc) but it will run in as close to O(N) as your dataset will allow. You set the "element size", which acts as a bin size (the code will call interactions on everything within b of another point, and sometimes between b and 2 * sqrt(2) * b), give it an array (native python list) of objects with an x and y property and my C module will callback to a python function of your choice to run an interaction function for matched pairs of elements. it's designed for running contact force DEM simulations, but it will work fine on this problem too.
As I haven't released it yet, because the other bits of the library aren't ready yet, I'll have to give you a zip of my current source but the contact detection part is solid. The code is LGPL'd.
You'll need Cython and a c compiler to make it work, and it's only been tested and working under *nix environemnts, if you're on windows you'll need the mingw c compiler for Cython to work at all.
Once Cython's installed, building/installing pynet should be a case of running setup.py.
The function you are interested in is pynet.d2.run_contact_detection(py_elements, py_interaction_function, py_simulation_parameters) (and you should check out the classes Element and SimulationParameters at the same level if you want it to throw less errors - look in the file at archive-root/pynet/d2/__init__.py to see the class implementations, they're trivial data holders with useful constructors.)
(I will update this answer with a public mercurial repo when the code is ready for more general release...)
Your solution is okay, but one clear problem is that you're getting dark regions despite there being a point right in the middle of them.
I would instead center an n-dimensional Gaussian on each point and evaluate the sum over each point you want to display. To reduce it to linear time in the common case, use query_ball_point to consider only points within a couple standard deviations.
If you find that he KDTree is really slow, why not call query_ball_point once every five pixels with a slightly larger threshold? It doesn't hurt too much to evaluate a few too many Gaussians.
You can do this with a 2D, separable convolution (scipy.ndimage.convolve1d) of your original image with a gaussian shaped kernel. With an image size of MxM and a filter size of P, the complexity is O(PM^2) using separable filtering. The "Big-Oh" complexity is no doubt greater, but you can take advantage of numpy's efficient array operations which should greatly speed up your calculations.
Just a note, the histogram2d function should work fine for this. Did you play around with different bin sizes? Your initial histogram2d plot seems to just use the default bin sizes... but there's no reason to expect the default sizes to give you the representation you want. Having said that, many of the other solutions are impressive too.