I am currently displaying two separate 2D images (x,y plane and z,y plane) derived from 96x512 arrays of 0-255 values. I would like to be able to filter the data so that anything under a certain value is discarded (the highest values are indicative of targets). What I would like to do is, from these images, separate out discrete points that may then be mapped three-dimensionally as points, rather than mapping two intersecting planes. I'm not entirely sure how to do this or where to start (I'm very new to Python). I am producing the images using scipy and have done some normalization and noise reduction, but I'm not sure how to then separate out anything over the threshold as its own individual point. Is this possible?
If I understand correctly what you want, filtering points can be done like this:
import numpy
A = numpy.random.rand(5, 5)
B = A > 0.5
Now B is a binary mask, and you can use it in a number of ways:
A[B]
will return an array with all values of A that are true in B.
A[B]=0
will assign 0 to all values in A that are true in B.
numpy.nonzero(B)
will give you the row and column indices (i.e. the x,y coordinates) of each point that is true in B.
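For the original question, a minimal sketch that combines these pieces (assuming `img` stands in for one of the 96x512 arrays and `threshold` is a cut-off you choose; the third coordinate here is just the pixel intensity):
import numpy as np

# Illustrative sketch: keep only pixels above a threshold and turn them into
# discrete (row, column, intensity) points that can be scattered in 3D.
img = np.random.randint(0, 256, size=(96, 512))   # stand-in for one plane
threshold = 200                                    # assumed cut-off value

mask = img > threshold                             # boolean mask of candidate targets
rows, cols = np.nonzero(mask)                      # indices of the surviving pixels
points = np.column_stack((rows, cols, img[rows, cols]))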
I am facing an issue with sorting airfoil coordinates. In particular, given a set of coordinates that are not sorted, I have to sort them starting from the trailing edge of the upper surface. Here I report the code that I have developed, but as you can see the starting point does not match what I expect, and moreover there are several oscillations, as you can see in the reported figure (and in a detail view, with the starting point after the sort in blue).
Can someone suggest what I am missing? How can I fix it?
Thank you in advance.
import numpy as np

def sort_airfoil(points):
    # centroid of the point cloud (x is flipped so it grows towards the trailing edge)
    x0 = np.mean(-points[:, 1])
    y0 = np.mean(points[:, 2])
    r = np.sqrt((-points[:, 1] - x0)**2 + (points[:, 2] - y0)**2)

    tempx = -points[:, 1]
    xmax = np.max(tempx)
    ind_max = np.where(tempx == xmax)
    ymax = np.max(points[ind_max, 2])

    # points near the trailing edge (within 5% of xmax)
    ind_max_t = np.where((tempx > 0.95*xmax) & (tempx < xmax))
    ymax_t = points[ind_max_t, 2]
    ymin = np.min(ymax_t)
    indx_temp = np.where(points[:, 2] == ymin)
    xmin = np.max(tempx[indx_temp])

    xmed = (xmin + xmax) / 2
    ymed = (ymin + ymax) / 2
    print(x0, y0)
    print(xmin, ymin)
    print((xmin + xmax) / 2, (ymin + ymax) / 2)

    # reference angle of the trailing edge relative to the centroid
    angle0 = np.arctan2((ymed - y0), (xmed - x0))
    print("angle", angle0)

    # polar angle of every point around the centroid, shifted by the reference angle
    angles = np.where((points[:, 2] - y0) > 0,
                      np.arccos((-points[:, 1] - x0) / r),
                      2*np.pi - np.arccos((-points[:, 1] - x0) / r))
    angles = angles - angle0
    # wrap the shifted angles back into [0, 2*pi)
    for i in range(len(angles)):
        if angles[i] < 0:
            angles[i] = angles[i] + 2*np.pi
        elif angles[i] > 2*np.pi:
            angles[i] = angles[i] - 2*np.pi

    mask = np.argsort(angles)
    x_sorted = points[mask, 1]
    y_sorted = points[mask, 2]

    points_new = np.zeros([len(points), 3])
    points_new[:, 0] = points[:, 0]
    points_new[:, 1] = x_sorted
    points_new[:, 2] = y_sorted
    return points_new
The issue comes from the algorithm itself: it only works when the points form a convex polygon. However, the shape is concave.
More specifically, the first sorted points (and the last ones) form zigzag-shaped lines because there are two sets of points (green arrows) interleaving with growing angles (red arrow) from the median point (red line).
Note that the points are horizontally flipped compared to the ones gathered from the question. Thus the points are sorted clockwise.
One simple solution is to split the shape horizontally into several sets of points (e.g. 10 sets) so that each set forms a convex shape. Then, the parts can be merged to form the final shape. The merge consists in finding the points at the "edge" of each locally-sorted set of points (parts) and reordering the partially sorted array of points accordingly.
More specifically, the points of each part are split into 2 sub-sets: the upper ones and the lower ones. You can find them easily by selecting the 2 left-most points of a right part together with the right-most points of a left part. The 2 top-most points need to be connected to each other, and the same goes for the 2 bottom-most points. Thus, the sequence of the two upper sets of points needs to be reordered so that they are contiguous, and the same for the lower part.
Here is an example:
Note that if you are unsure about how to split the points into parts so that each one forms a convex-shaped set of points, then you can split the shape into n parts, check whether each set of points forms a convex shape by computing a convex hull (e.g. using a Graham scan), and split evenly the parts that are concave (recursively). This is quite expensive, but more robust.
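As a hedged illustration of the split idea, here is a simplified variant (not the full part-by-part merge described above) that separates the upper and lower surfaces using a rough strip-wise camber line. It assumes an (N, 2) array of x,y points and that every strip contains at least one point; the helper name is hypothetical.
import numpy as np

def sort_airfoil_strips(xy, n_parts=10):
    """Hypothetical helper: order points from the trailing edge along the upper
    surface, then back along the lower surface, by splitting the cloud about a
    rough strip-wise camber line instead of sorting by a single polar angle."""
    x, y = xy[:, 0], xy[:, 1]
    edges = np.linspace(x.min(), x.max(), n_parts + 1)
    strip = np.clip(np.digitize(x, edges) - 1, 0, n_parts - 1)
    camber = np.array([y[strip == k].mean() for k in range(n_parts)])  # assumes no empty strip
    upper = y >= camber[strip]
    order_up = np.argsort(-x[upper])     # upper surface: decreasing x
    order_lo = np.argsort(x[~upper])     # lower surface: increasing x
    return np.vstack([xy[upper][order_up], xy[~upper][order_lo]])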
I have some points in 3D space (x, y and z). These point sets are stored as arrays in lists. I copied a simplified example having two point sets:
import numpy as np

all_points = [[np.array([[6.8, 1., 0.1], [6.8, 3., 0.1], [6.8, 6., 0.1],
                         [5.8, 1., 1.1], [5.8, 3., 1.1], [5.8, 6., 1.1],
                         [4.8, 1., 2.],  [4.8, 3., 2.],  [4.8, 6., 2.],
                         [3.8, 1., 3.],  [3.8, 3., 3.],  [3.8, 6., 3.],
                         [2.8, 1., 4.1], [2.8, 3., 4.1], [2.8, 6., 4.1]]),
               np.array([[5., 1., 2.],  [5., 3., 2.],  [5., 6., 2.],  [6., 1., 1.2],
                         [4., 1., 3.],  [4., 3., 3.],  [4., 6., 3.],  [5.5, 3., 1.5],
                         [6., 1., 3.],  [6., 3., 3.],  [6., 6., 3.],
                         [7., 1., 4.],  [7., 3., 4.],  [7., 6., 4.],
                         [3., 1., 4.],  [3., 3., 4.],  [3., 6., 4.]])]]
My point sets are either normal or abnormal. They are normal if, when I sort them based on their z, the x value is only increasing or only decreasing. Blue dots in my fig clearly show the normal type, while black squares show an abnormal point set. These two sets are linked because some points of the abnormal set are close to the normal one. The minimum and maximum y values in both the normal and abnormal sets are fixed (1 and 6 in my example). From a normal set, I simply want its four corners (shown by green arrows in my fig). This code gives me four corners:
four_corners = []
for points in all_points:
    for sub_points in points:
        # sort by z (field f2), then by y (field f1), via a structured view
        sorted_sub = np.sort(sub_points.view('f8,f8,f8'),
                             order=['f2', 'f1'], axis=0).view('float')
        # how many points share the lowest / highest z value
        le_st = len(sorted_sub[np.where(sorted_sub[:, 2] == sorted_sub[0, 2])])
        le_en = len(sorted_sub[np.where(sorted_sub[:, 2] == sorted_sub[-1, 2])])
        cor = np.array([sorted_sub[0, :], sorted_sub[le_st - 1, :],
                        sorted_sub[-1, :], sorted_sub[-le_en, :]])
        four_corners.append(cor)
Abnormal point sets can be divided into two groups: a group that is close to the normal point sets and another that is far from them. A threshold can separate them. I tried the following code to separate them (I should pass my normal and abnormal arrays automatically here, but I have written them manually):
import numpy as np
from scipy.spatial import distance
import numpy_indexed as npi

threshold = 0.5
# minimum distance from each abnormal point to the normal set (min over axis=1)
min_dist = np.min(distance.cdist(abnormal, normal), axis=1)
close_points = abnormal[np.where(min_dist < threshold)[0], :]
far_points = npi.difference(abnormal, close_points)
After separation, I want two points from far_points and two points from close_points. In far_points, I want the two points that have the highest z values and have the min of y (1) and the max of y (6). These two points are shown by yellow arrows in my fig and are:
[[7.,1.,4.], [7.,6.,4.]]
In close_points, I want the points whose y value is again min or max (1 and 6). I call them the y_min and y_max subgroups, and from each subgroup I want the point with the least z value. In my data they are (shown by red arrows):
[[6.,1.,1.2],[5.,6.,2.]]
Finally, I want to find the two points of the normal point sets that are closest to these two points of close_points of the abnormal group. They are:
[[5.8,1.,1.1], [4.8,6.,2.]]
So, I want a method that firstly distinguishes which array is normal and which is abnormal, then finds the four simple corners of my normal sets and the four points of the abnormal sets explained above. The method should also be able to distinguish which normal set is connected to which abnormal ones. I may have one normal set and two or three linked abnormal sets, or maybe two normal sets and one abnormal set connected to a normal one. I do appreciate any help for doing what I want in python.
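A hedged sketch of two of the building blocks (the helper names are hypothetical; it assumes each set is an (N, 3) array of x,y,z as in the example above, and that `normal` and `close_points` come from the steps already shown):
import numpy as np
from scipy.spatial import distance

def is_normal(sub_points):
    """Hypothetical check: a set is 'normal' if, once sorted by z (column 2),
    its x values (column 0) only increase or only decrease."""
    ordered = sub_points[np.argsort(sub_points[:, 2])]
    dx = np.diff(ordered[:, 0])
    return bool(np.all(dx >= 0) or np.all(dx <= 0))

def nearest_in(normal, close_points):
    """For each close point, return the closest point of the normal set."""
    d = distance.cdist(close_points, normal)   # shape (n_close, n_normal)
    return normal[np.argmin(d, axis=1)]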
I have several numpy matrices collected over some time. I now want to visualize these matrices and explore visual similarities among them. The matrices contain small numbers from 0.0 to 1.0.
To compare them, I want to ensure that the same "areas" get colored with the same color, e.g. that 0.01 to 0.02 is always red, and 0.02 to 0.03 is always green. I have two questions:
I found another question which has this code snippet:
import numpy as np
import matplotlib.pyplot as plt

a = np.random.normal(0.0, 0.5, size=(5000, 10))**2
a = a / np.sum(a, axis=1)[:, None]  # Normalize each row to sum to 1
plt.pcolor(a)
What is the effect of the second line, specifically the [:,None] statement? I tried normalizing a matrix by:
max_a = a / 10  # Normalize
print(max_a.shape)
plt.pcolor(max_a)
but there is not much visual difference compared to the visualization of the unnormalized matrix. When I then add the [:,None] statement I get an error
ValueError: too many values to unpack (expected 2)
which is expected since the shape is now (10, 1, 10). I therefore want to know what the brackets do and how to read the statement.
Secondly, and related, I want to make sure that I can visually compare the matrices. I therefore want to fix the "colorization", e.g. the ranges where a color is green or red, so that I do not end up with 0 to 0.1 as green in plot A and with 0 to 0.1 as red in plot B. How can I fix the "translation" from floats to colors? Do I have to normalize each matrix with the same constant, e.g. 10? Or do I normalize them each with a unique value -- do I even need normalization here?
[:,None] adds a new axis so that you can divide each row by its sum over the columns; it is the same as using np.sum(a,axis=1)[:,np.newaxis]. When you sum over the columns with np.sum(a,axis=1) you get a 1d array with shape (5000,), but to normalize your matrix by those row sums you need a 2d array with shape (5000,1). That is why the new axis is needed.
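A small, self-contained illustration of that broadcasting (array sizes chosen to match the snippet above):
import numpy as np

a = np.random.rand(5000, 10)
row_sums = np.sum(a, axis=1)          # shape (5000,)
print(row_sums[:, None].shape)        # (5000, 1) -- broadcasts against (5000, 10)
normalized = a / row_sums[:, None]    # every row of `normalized` now sums to 1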
You can have fixed colors by fixing the scale of your colormap: plt.pcolor(max_a, vmin=0, vmax=1)
Adding a discrete colorbar might also help:
import matplotlib.pyplot as plt
from pylab import cm

cmap = cm.get_cmap('jet', 10)            # 10 discrete colors
plt.pcolor(a, cmap=cmap, vmin=0, vmax=1)
plt.colorbar()
I am working in Python, and I am trying to compute a weight matrix for a graph of pixels, where the weight of each edge depends on their "feature" similarity (F(i) - F(j)) and their location similarity (X(i)-X(j)). "Features" include intensity, color, and texture.
Right now I have it implemented and it is working, but not for color images. At first I tried to simply take the RGB values and average each pixel to convert the entire image to greyscale. But that didn't work as I had hoped, and I have read through a paper that suggests a different method.
They say to use this: F(i) = [v, v*s*sin(h), v*s*cos(h)](i)
where h, s, and v are the HSV color values.
I am just confused by the notation. What is this supposed to mean? What does it mean to have three different terms separated by commas inside square brackets? I'm also confused about what the (i) at the end is supposed to mean. Shouldn't the result of F(i) for any given pixel be a single number, in order to carry out F(i)-F(j)?
I'm not asking for someone to do this for me; I just need some clarification.
Features can be vectors, and you can calculate the distance between vectors.
import numpy

f1 = numpy.array([1, 2, 3])
f2 = numpy.array([0, 2, 3])
distance = numpy.linalg.norm(f1 - f2)
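As a hedged sketch of how the paper's F(i) could be built per pixel (assuming hue is expressed in radians and s, v are in [0, 1]; the stand-in image and pixel positions here are purely illustrative, so adapt the conversion to your library):
import numpy as np

rng = np.random.default_rng(0)
hsv = rng.random((96, 512, 3))        # stand-in HSV image
hsv[..., 0] *= 2 * np.pi              # treat hue as an angle in radians

h, s, v = hsv[..., 0], hsv[..., 1], hsv[..., 2]
# F(i) = [v, v*s*sin(h), v*s*cos(h)]: a 3-vector per pixel, not a single number
F = np.stack([v, v * s * np.sin(h), v * s * np.cos(h)], axis=-1)

i, j = (10, 20), (30, 40)             # two arbitrary pixel positions
feature_distance = np.linalg.norm(F[i] - F[j])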
Good Morning,
I am implementing a Cressman filter for doing distance-weighted averages in Numpy. I use a Ball Tree implementation (thanks to Jake VanderPlas) to return a list of locations for each point in a query array. The query array (q) has shape [n,3] and at each point holds the x,y,z at which I want to do a weighted average of the points stored in the tree. The code wrapped around the tree returns points within a certain distance, so I get an array of variable-length arrays.
I use a where to find non-empty entries (i.e. positions where there were at least some points within the radius of influence), creating the isgood array...
I then loop over all query points to return the weighted average of the values self.z (note that this can be either dims=1 or dims=2 to allow multiple co-gridding).
So the thing that complicates using map or other quicker methods is the non-uniformity of the lengths of the arrays within self.distances and self.locations... I am still fairly green with numpy/python, but I cannot think of a way to do this array-wise (i.e. without reverting to loops).
self.locations, self.distances = self.tree.query_radius(q, r, return_distance=True)
t2 = time()
if debug:
    print("Removing voids")
# indices of query points with at least one neighbour inside the radius of influence
isgood = np.where(np.array([len(x) for x in self.locations]) != 0)[0]
interpol = np.zeros((len(self.locations),) + np.shape(self.z[0]))
interpol.fill(np.nan)
# Cressman-weighted average of self.z over the neighbours of each non-empty query point
for dist, ix, posn, roi in zip(self.distances[isgood], self.locations[isgood],
                               isgood, r[isgood]):
    interpol[posn] = np.average(self.z[ix],
                                weights=(roi**2 - dist**2) / (roi**2 + dist**2),
                                axis=0)
So... any hints on how to speed up the loop?
For a typical mapping, as applied to mapping weather radar data from a range, azimuth, elevation grid to a Cartesian grid where I have 240x240x34 points and 4 variables, it takes 99 s to query the tree (written by Jake in C and Cython; this is the hard step, as you need to search the data!) and 100 seconds to do the calculation, which in my opinion is slow. Where is my overhead? Is np.mean efficient, or, since it is called millions of times, is there a speedup to be gained here? Would I gain by using float32 rather than the default float64, or even by scaling to ints (which would make it very hard to avoid wrap-around in the weighting)? Any hints gratefully received!
You can find a discussion about the relative merits of the Cressman scheme vs using a Gaussian weight function at:
http://www.flame.org/~cdoswell/publications/radar_oa_00.pdf
The key is to match the smoothing parameter to the data (I recommend using a value close to the average spacing between data points). Once you know the smoothing parameter, you can set an "influence radius" equal to the radius where the weight function falls to 0.01 (or whatever).
How important is speed? If you wish, rather than calling an exponential function to determine the weight, you can make up a discrete table of weights for some fixed number of radius increments, which speeds up the calculation considerably. Ideally, you should have data outside the grid boundaries that can be used in the mapping of the values surrounding the gridpoints (even on the boundary points of the grid). Note this is NOT a true interpolation scheme - it won't return the observed values at the data points exactly. Like the Cressman scheme, it's a low-pass filter.
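A minimal sketch of that table-lookup idea (assuming a Gaussian weight exp(-(d/smoothing)^2) cut off where it falls to 0.01; the variable names and bin count are chosen here only for illustration):
import numpy as np

smoothing = 1.0                                      # ~ average spacing between data points
r_influence = smoothing * np.sqrt(-np.log(0.01))     # radius where the weight falls to 0.01
n_bins = 1000
radii = np.linspace(0.0, r_influence, n_bins)
table = np.exp(-(radii / smoothing)**2)              # precomputed weights, one per radius bin

def lookup_weight(dist):
    """Map distances (array-like, <= r_influence) to precomputed weights
    instead of calling the exponential for every neighbour."""
    idx = np.minimum((np.asarray(dist) / r_influence * (n_bins - 1)).astype(int),
                     n_bins - 1)
    return table[idx]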