Pathway of lowest values between 2 points in 2D heatmap - python

I was wondering if I could get some concept ideas from you all before spending too much time on this.
I have an (X, Y, Z) heatmap file showing the energy (Z value) at multiple XY coordinates.
X,Y,Z
-8.000000,0.000000,30
-7.920000,0.000000,30
-7.840000,0.000000,30
-7.760000,0.000000,30
-7.680000,0.000000,30
(...)
7.680000,25.000000,30
7.760000,25.000000,30
7.840000,25.000000,30
7.920000,25.000000,30
8.000000,25.000000,30
I would like to determine possible pathways between 2 points in the XY space. These pathways should consist of a series of XY coordinates with the lowest Z values necessary in order to connect the selected regions.
I appreciate any suggestions on how to approach this.
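One way to frame this, as a minimal sketch rather than a full solution: reshape the file into a 2D array of Z values indexed by grid row and column, treat each cell as a graph node connected to its four neighbours, and run Dijkstra's algorithm with the step cost equal to the Z value of the cell being entered; the returned path is then the chain of cells with the lowest total Z. The function name and the 4-connectivity are my choices for illustration.

import heapq
import numpy as np

def lowest_z_path(Z, start, goal):
    """Return a list of (row, col) cells forming a minimum-total-Z path."""
    rows, cols = Z.shape
    dist = np.full(Z.shape, np.inf)
    prev = {}
    dist[start] = Z[start]
    heap = [(Z[start], start)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if (r, c) == goal:
            break
        if d > dist[r, c]:
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and d + Z[nr, nc] < dist[nr, nc]:
                dist[nr, nc] = d + Z[nr, nc]
                prev[(nr, nc)] = (r, c)
                heapq.heappush(heap, (dist[nr, nc], (nr, nc)))
    # walk back from goal to start to recover the path
    path, node = [goal], goal
    while node != start:
        node = prev[node]
        path.append(node)
    return path[::-1]

If you would rather not hand-roll Dijkstra, scikit-image ships route_through_array (in skimage.graph), which finds a minimum-cost path through an array in essentially this way.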

Related

Clustering on evenly spaced grid points

I have a 50 by 50 grid of evenly spaced (x, y) points. Each of these points has a third scalar value. This can be visualized with a contour plot, which I have added. I am interested in the regions indicated by the red circles. These regions of low "Z values" are what I want to extract from this data.
2D contour plot of 50 x 50 evenly spaced grid points:
I want to do this by using clustering (machine learning), which can be lightning quick when applied correctly. The problem, however, is that the points are evenly spaced, so the density of the dataset is the same everywhere.
I have tried using a DBSCAN algorithm with a custom distance metric which takes into account the Z values of each point. I have defined the distance between two points as follows:
import numpy as np

def custom_distance(point1, point2):
    # planar Euclidean distance, scaled by the mean of the two Z values
    average_Z = (point1[2] + point2[2]) / 2
    distance = np.sqrt(np.square(point1[0] - point2[0]) + np.square(point1[1] - point2[1]))
    return distance * average_Z
This essentially determines the Euclidean distance between two points in the x-y plane and multiplies it by the average of the two points' Z values. In the picture below I have tested this distance function in a DBSCAN algorithm. Each point in this 50 by 50 grid has a Z value of 1, except for four clusters that I have placed randomly, whose points each have a Z value of 10. The algorithm is able to find the clusters in the data based on their Z value, as can be seen below.
DBSCAN clustering result using scalar value distance determination:
Encouraged by these results, I tried to apply the approach to my actual data, only to be disappointed. Since the x and y values of my data are very large, I simply scaled them to run from 0 to 49; the Z values I left untouched. The results of the clustering can be seen in the image below:
Clustering result on original data:
This does not come close to what I wanted or expected. For some reason the clusters that are found are rectangular, and the light regions of low Z values that I am interested in are not extracted by this approach.
Is there any way I can make the DBSCAN algorithm work here? I suspect that the reason it is currently failing has to do with the difference in scale between the x, y and Z values. I am also open to tips or recommendations on other approaches to defining and finding the lighter regions in the data.
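One alternative that sidesteps the scaling issue entirely, offered as a hedged sketch (the threshold and eps values below are illustrative, not tuned): select the low-Z points with a simple threshold first, then run a plain spatial DBSCAN on just that subset, so Z never has to enter the distance metric at all.

import numpy as np
from sklearn.cluster import DBSCAN

def low_z_clusters(xy, z, z_threshold, eps=1.5, min_samples=5):
    """Cluster the x-y locations of points whose Z falls below a threshold."""
    mask = z < z_threshold             # keep only the "light" low-Z regions
    labels = np.full(len(z), -1)       # -1 marks points above the threshold (noise)
    if mask.any():
        labels[mask] = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(xy[mask])
    return labels

With the grid scaled to 0-49, an eps around 1.5 connects orthogonal and diagonal neighbours, so each contiguous light region comes out as one cluster.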

Convert 2D Array to a 3D Space

I am trying to develop a 3D cube with values from a flat 2D plane. I am having a lot of difficulty trying to pseudocode it, so I was hoping to get some input from you guys.
I will try my best to express myself through pictures as I am able to visualize what I am trying to achieve.
I have a 2D output based on the black line in this figure:
I have an array whose index is the sample number and whose value is the amplitude, i.e. (0; 1) means sample 0 (the x coordinate) has amplitude 1 (the y coordinate); another example is (~1900; ~0.25).
How do I take this one-dimensional sequence and extrude it into a 3D picture like the one below:
Is there perhaps a library that does this? Or am I going about it the wrong way? The data is from the matched-filter output of a sonar signal, and I wish to visualize the concentration of the intensity versus where it is located in the sample, on a 3D plane. The data has peaks, with inclining and declining gradient slopes before each peak.
I cannot seem to wrap my mind around the task. Is there a library for this, or a term for what I wish to accomplish?
EDIT: I found this https://www.tutorialspoint.com/matplotlib/matplotlib_3d_surface_plot.htm
But it requires all x, y and z points, whereas I only have x and y. Additionally, I need to be able to access every coordinate (x, y, z) so I can do range and angle estimation from sample (0, 1) (the transmitted sound, where power is highest). I would basically only like to see the top of this, though, on another 2D axis...
EDIT 2: Following up on a comment below, I would like to convert Figure 1 above into the image below, using a library if one exists.
Thanks so much in advance!
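A minimal sketch of one interpretation (an assumption on my part): revolve the 1D profile about the vertical axis, so the surface height at (x, y) is the amplitude at radius r = sqrt(x**2 + y**2). The synthetic amplitude profile below, with a peak near sample 1900 of height ~0.25, is purely illustrative and should be replaced by the real matched-filter output.

import numpy as np
import matplotlib.pyplot as plt

samples = np.arange(2000)
amplitude = 0.25 * np.exp(-((samples - 1900) / 50.0) ** 2)   # illustrative peak

x = np.linspace(-2000, 2000, 200)
y = np.linspace(-2000, 2000, 200)
X, Y = np.meshgrid(x, y)
R = np.sqrt(X**2 + Y**2)                # distance from the origin, in samples
Z = np.interp(R, samples, amplitude)    # look up the profile at each radius

fig = plt.figure()
ax = fig.add_subplot(projection="3d")   # requires matplotlib >= 3.2
ax.plot_surface(X, Y, Z, cmap="viridis")
plt.show()

Since X, Y and Z are now full 2D arrays, every (x, y, z) coordinate is directly accessible for the range and angle estimation, and a top-down view is just plt.pcolormesh(X, Y, Z) on a normal 2D axis.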

How to intersect two planes in Python and export the coordinates of the intersection

I have a bunch of points (x, y and z) in a 3d space and want to extract some points out of them. I copied a simplified example with two arrays which are linked together:
import numpy as np

all_points = [[np.array([[6.8, 1., 0.1], [6.8, 3., 0.1], [6.8, 6., 0.1],
                         [4.8, 1., 2.],  [4.8, 3., 2.],  [4.8, 6., 2.],
                         [3.8, 1., 3.],  [3.8, 3., 3.],  [3.8, 6., 3.],
                         [2.8, 1., 4.1], [2.8, 3., 4.1], [2.8, 6., 4.1]]),
               np.array([[5., 1., 2.], [5., 3., 2.], [5., 6., 2.],
                         [4., 1., 3.], [4., 3., 3.], [4., 6., 3.],
                         [6., 1., 3.], [6., 3., 3.], [6., 6., 3.],
                         [7., 1., 4.], [7., 3., 4.], [7., 6., 4.],
                         [3., 1., 4.], [3., 3., 4.], [3., 6., 4.]])]]
Firstly, I want to check whether an array is normal or not. If I sort a normal array based on z values, the x values of the sorted array will be monotonically increasing or decreasing. The first array (blue dots in the uploaded fig) clearly shows a normal set. For normal arrays I just do a simple task and export four points marking their corners (shown by yellow and green arrows in my fig). These points are found from the minimum and maximum of x, y and z. The following code gives me the four corners of the normal sets:
four_corners = []
for points in all_points:
    for sub_points in points:
        # sort rows by z (field f2), then y (field f1), by viewing each row
        # as a structured record of three float64 fields
        sorted_sub = np.sort(sub_points.view('f8,f8,f8'), order=['f2', 'f1'], axis=0).view('float')
        # count how many points share the smallest and the largest z value
        le_st = len(sorted_sub[np.where(sorted_sub[:, 2] == sorted_sub[0, 2])])
        le_en = len(sorted_sub[np.where(sorted_sub[:, 2] == sorted_sub[-1, 2])])
        # corners: first and last point at the lowest z, last and first at the highest z
        cor = np.array([sorted_sub[0, :], sorted_sub[le_st - 1, :],
                        sorted_sub[-1, :], sorted_sub[-le_en, :]])
        four_corners.append(cor)
In abnormal sets (black squares in my fig), some points are usually very close to a normal set (a limit can be defined) and then they drift away. I want to extract four points for these too, but by creating two planes. The first plane is created from three of the four corner points found for the normal set. The second plane is created from any three of the abnormal points that are not close to the normal set (highlighted by a red line in my fig).

Then I want to find the intersection line of the two planes and take the x and z at the minimum and maximum of y (1 and 6) along that line; the y value of every corner point (normal or abnormal) is the minimum or maximum value. The other two points are created by substituting the y and z values of the two corner points of the normal plane that have the higher z values (highlighted by yellow arrows) into the plane equation of the abnormal set.

I only know how to create planes based on this solution. In reality I may have several normal and abnormal sets, all linked to the normal one. In advance, I do appreciate any help and contribution for doing what I want in Python.
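A minimal sketch of the geometric core, under my own naming (the helper functions are hypothetical, and the three points fed to each plane are illustrative picks from the example arrays): build each plane from three points as n . p = d, then get a point on the intersection line at a chosen y by solving a small linear system.

import numpy as np

def plane_from_points(p1, p2, p3):
    """Plane through three points, returned as (n, d) with n . p = d."""
    n = np.cross(np.subtract(p2, p1), np.subtract(p3, p1))
    return n, np.dot(n, p1)

def intersection_at_y(plane1, plane2, y):
    """Point on the intersection line of two planes at the given y."""
    (n1, d1), (n2, d2) = plane1, plane2
    # solve the 3x3 system:  n1 . p = d1,  n2 . p = d2,  p_y = y
    A = np.array([n1, n2, [0.0, 1.0, 0.0]], dtype=float)
    return np.linalg.solve(A, [d1, d2, y])

# Illustrative use with three corners from each set in the example above:
normal_plane = plane_from_points([6.8, 1., 0.1], [6.8, 6., 0.1], [2.8, 1., 4.1])
abnormal_plane = plane_from_points([7., 1., 4.], [7., 6., 4.], [3., 1., 4.])
for y in (1., 6.):   # evaluate the intersection at the minimum and maximum y
    print(intersection_at_y(normal_plane, abnormal_plane, y))

np.linalg.solve will raise if the two planes are parallel (or if the intersection line has no unique point at that y), which is a useful sanity check in itself.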

How to plot coarse-grained average of a set of data points?

I have a set of discrete two-dimensional data points, each with a measured value associated with it. I would like a scatter plot with the points colored by their measured values. But the data points are so dense that points with different colors would overlap, which is not good for visualization. So I am wondering whether I could color each point by a coarse-grained average of the measured values of the points near it. Does anyone know how to implement this in Python?
Thanks!
I have done it using sklearn.neighbors.RadiusNeighborsRegressor(); the idea is to take the average of the values of the neighbors within a specific radius. Suppose the coordinates of the data points are in the list temp_coors and the values associated with these points are in coloring; then coloring can be coarse-grained in the following way:
from sklearn.neighbors import RadiusNeighborsRegressor

r_neigh = RadiusNeighborsRegressor(radius=smoothing_radius, weights='uniform')
r_neigh.fit(temp_coors, coloring)
coloring = r_neigh.predict(temp_coors)
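For completeness, a self-contained toy version of the same idea (the data and the radius of 0.05 are synthetic and purely illustrative):

import numpy as np
import matplotlib.pyplot as plt
from sklearn.neighbors import RadiusNeighborsRegressor

rng = np.random.default_rng(0)
temp_coors = rng.uniform(size=(5000, 2))                      # dense 2D points
coloring = np.sin(6 * temp_coors[:, 0]) + rng.normal(scale=0.5, size=5000)

r_neigh = RadiusNeighborsRegressor(radius=0.05, weights='uniform')
r_neigh.fit(temp_coors, coloring)
smoothed = r_neigh.predict(temp_coors)    # neighborhood average per point

plt.scatter(temp_coors[:, 0], temp_coors[:, 1], c=smoothed, s=4)
plt.colorbar(label='coarse-grained value')
plt.show()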

Finding n nearest data points to grid locations

I'm working on a problem where I have a large set (>4 million) of data points located in a three-dimensional space, each with a scalar function value. This is represented by four arrays: XD, YD, ZD, and FD. The tuple (XD[i], YD[i], ZD[i]) refers to the location of data point i, which has a value of FD[i].
I'd like to superimpose a rectilinear grid of, say, 100x100x100 points in the same space as my data. This grid is set up as follows.
[XGrid, YGrid, ZGrid] = np.mgrid[Xmin:Xmax:Xstep, Ymin:Ymax:Ystep, Zmin:Zmax:Zstep]
XG = XGrid[:,0,0]
YG = YGrid[0,:,0]
ZG = ZGrid[0,0,:]
XGrid is a 3D array of the x-value at each point in the grid. XG is a 1D array of the x-values going from Xmin to Xmax, separated by a distance of Xstep.
I'd like to use an interpolation algorithm I have to find the value of the function at each grid point based on the data surrounding it. In this algorithm I require 20 data points closest (or at least close) to my grid point of interest. That is, for grid point (XG[i], YG[j], ZG[k]) I want to find the 20 closest data points.
The only way I can think of is to loop over every grid point and, in an embedded loop, go through all (so many!) data points, calculating the squared Euclidean distance and picking out the 20 closest ones.
for i in range(XG.shape[0]):
    for j in range(YG.shape[0]):
        for k in range(ZG.shape[0]):
            # squared distance from every data point to grid point (i, j, k)
            Distance = np.zeros(XD.shape)
            for a in range(XD.shape[0]):
                Distance[a] = (XD[a] - XG[i])**2 + (YD[a] - YG[j])**2 + (ZD[a] - ZG[k])**2
            # pull out the indices of the 20 smallest distances
            B = np.zeros(20, int)
            for a in range(20):
                indx = np.argmin(Distance)
                B[a] = indx
                Distance[indx] = np.inf
This would give me an array, B, of the indices of the data points closest to the grid point. I feel like this would take too long to go through each data point at each grid point.
I'm looking for any suggestions, such as how I might be able to organize the data points before calculating distances, which could cut down on computation time.
Have a look at a seemingly similar but 2D problem and see if you cannot improve on it with ideas from there.
Off the top of my head, I'm thinking that you can sort the points according to their coordinates (three separate arrays). When you need the closest points to the [X, Y, Z] grid point, you'll quickly locate candidate points in those three arrays and start from there.
Also, you don't really need the Euclidean distance, since you are only interested in relative distances, which can also be described as:
abs(deltaX) + abs(deltaY) + abs(deltaZ)
And save on the expensive power and square roots...
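A concrete version of this fast-spatial-lookup idea, swapped in by me rather than taken from the answer, is a k-d tree: scipy.spatial.cKDTree can answer all of the 20-nearest-neighbour queries in a single call. This sketch assumes the XD, YD, ZD data arrays and the mgrid output from the question:

import numpy as np
from scipy.spatial import cKDTree

data = np.column_stack((XD, YD, ZD))    # (n, 3) array of data point locations
tree = cKDTree(data)

# grid points as an (m, 3) array; query the 20 nearest data points to each
grid = np.column_stack((XGrid.ravel(), YGrid.ravel(), ZGrid.ravel()))
dists, B = tree.query(grid, k=20)       # B[g] holds the 20 closest data indices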
No need to iterate over your data points for each grid location: Your grid locations are inherently ordered, so just iterate over your data points once, and assign each data point to the eight grid locations that surround it. When you're done, some grid locations may have too few data points. Check the data points of adjacent grid locations. If you have plenty of data points to go around (it depends on how your data is distributed), you can already select the 20 closest neighbors during the initial pass.
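A rough sketch of that single pass, assuming a uniform grid with origin (Xmin, Ymin, Zmin) and spacings (Xstep, Ystep, Zstep) as in the question:

from collections import defaultdict

buckets = defaultdict(list)   # (i, j, k) grid index -> list of data point indices
for a in range(len(XD)):
    i = int((XD[a] - Xmin) / Xstep)   # grid cell containing data point a
    j = int((YD[a] - Ymin) / Ystep)
    k = int((ZD[a] - Zmin) / Zstep)
    # attach the point to the eight grid nodes at the corners of its cell
    for di in (0, 1):
        for dj in (0, 1):
            for dk in (0, 1):
                buckets[(i + di, j + dj, k + dk)].append(a)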
Addendum: You may want to reconsider other parts of your algorithm as well. Your algorithm is a kind of piecewise-linear interpolation, and there are plenty of relatively simple improvements. Instead of dividing your space into evenly spaced cubes, consider allocating a number of center points and dynamically repositioning them until the average distance of data points from the nearest center point is minimized, like this (a sketch follows the steps):
Allocate each data point to its closest center point.
Reposition each center point to the coordinates that would minimize the average distance from "its" points (to the "centroid" of the data subset).
Some data points now have a different closest center point. Repeat steps 1. and 2. until you converge (or near enough).
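Those three steps amount to Lloyd's k-means iteration, so as a sketch (the number of centers below is an illustrative choice), scikit-learn's KMeans runs the whole loop for you:

import numpy as np
from sklearn.cluster import KMeans

data = np.column_stack((XD, YD, ZD))     # data points from the question
km = KMeans(n_clusters=1000).fit(data)   # 1000 centers is illustrative
centers = km.cluster_centers_            # repositioned center points (step 2)
labels = km.labels_                      # closest center per data point (step 1)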
