Calculate derivative of spatial measurements - python

I have a set of spatially distributed measurements.
For each point p1 = [x1,y1,z1] there is a scalar measurement v1 (e.g. temperature measurements under water).
Let's assume these measurements are on a regular grid.
I would like to find out where this distribution varies the most,
that is, at which positions the temperature changes the most.
I think this corresponds to the spatial derivative of the temperature.
Can somebody advise me on how to proceed?
What methods are there to achieve this?
I tried to implement it with np.gradient() but I fail at interpreting the result...

This is absolutely not optimized code, but here is what I came up with, at least to explain how it works.
grid = [[[1, 2], [2, 3]], [[8, 5], [4, 1000]]]
def get_greatest_diff(g, x, y, z):
    value = g[x][y][z]
    try:
        diff_x = abs(value-g[x+1][y][z])
    except IndexError:
        diff_x = -1
    try:
        diff_y = abs(value-g[x][y+1][z])
    except IndexError:
        diff_y = -1
    try:
        diff_z = abs(value-g[x][y][z+1])
    except IndexError:
        diff_z = -1
    if diff_x>=diff_y and diff_x>=diff_z:
        return diff_x, [x+1, y, z]
    if diff_y>diff_x and diff_y>=diff_z:
        return diff_y, [x, y+1, z]
    return diff_z, [x, y, z+1]
greatest_diff = 0
greatest_diff_pos0 = []
greatest_diff_pos1 = []
for x in range(len(grid)):
    for y in range(len(grid[x])):
        for z in range(len(grid[x][y])):
            diff, coords = get_greatest_diff(grid, x, y, z)
            if diff > greatest_diff:
                greatest_diff = diff
                greatest_diff_pos0 = [x, y, z]
                greatest_diff_pos1 = coords
print(greatest_diff, greatest_diff_pos0, greatest_diff_pos1)
The try:...except:... are here to handle the edge conditions. (That's dirty but that's quick!)
For each cell, you will look at the three neighbours x+1 or y+1 or z+1 and you compute the difference with their values. You keep the largest difference in the neighborhood and you return it. (That is the explanation of get_greatest_diff)
In the main loop, you check if the difference in this neighborhood is the greatest of all, if so, store the difference, and the two cells in question.
Finally, return the greatest difference and the cells in question.
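For larger grids, the same neighbor comparison can probably be done without Python loops. The following is only a minimal sketch, assuming the grid converts cleanly to a NumPy array; the function name largest_neighbor_jump is made up for illustration. It takes np.diff along each axis and keeps the largest absolute jump together with the two cells involved.

```python
import numpy as np

def largest_neighbor_jump(values):
    """Return (jump, cell, neighbor) for the largest difference between adjacent cells."""
    values = np.asarray(values, dtype=float)
    best = (-1.0, None, None)
    for axis in range(values.ndim):
        # d[..., k, ...] = |values[..., k+1, ...] - values[..., k, ...]| along this axis
        d = np.abs(np.diff(values, axis=axis))
        if d.size == 0:
            continue  # this axis has a single cell, so there are no neighbors
        pos = np.unravel_index(np.argmax(d), d.shape)
        if d[pos] > best[0]:
            neighbor = list(pos)
            neighbor[axis] += 1
            best = (float(d[pos]),
                    tuple(int(p) for p in pos),
                    tuple(int(p) for p in neighbor))
    return best

grid = [[[1, 2], [2, 3]], [[8, 5], [4, 1000]]]
print(largest_neighbor_jump(grid))
# (997.0, (0, 1, 1), (1, 1, 1))
```

On the small example grid this reports the jump between the cells holding 3 and 1000.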

Here is a numpy solution that returns the index of the element in an ndarray that has the biggest total difference to its neighbors.
Say the input array is X and it is 2D. I will create D where D[i,j] = |X[i, j]-X[i-1, j]|+|X[i,j]-X[i, j-1]|, and return the index of the largest value in D. (Because of prepend=0, elements in the first row or column are compared against 0 instead of a missing neighbor.)
import numpy as np
def greatest_diff(X):
    ndim = X.ndim
    # absolute backward difference along every axis, padded with 0 at the start
    Ds = [np.abs(np.diff(X, axis = i, prepend=0)) for i in range(ndim)]
    D = sum(Ds)
    return np.unravel_index(D.argmax(), D.shape)
X = np.zeros((5,5))
X[2,2] = 1
greatest_diff(X)
# returns (2, 2)
X = np.zeros((5,10,9))
X[2,2,7] = -1
greatest_diff(X)
# returns (2, 2, 7)
Another solution might be to compare X[i, j] with a weighted average of its neighborhood X[k, l]. You can achieve this by applying a Gaussian filter to X, giving gX, and then taking the squared differences (X-gX)^2.
def greatest_diff_gaussian(X, sigma = 1):
    from scipy.ndimage import gaussian_filter
    gX = gaussian_filter(X, sigma)
    dgX = np.power(X - gX, 2)
    return np.unravel_index(dgX.argmax(), dgX.shape)
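Since the question mentions np.gradient, here is a hedged sketch of one way to read its output. For an N-dimensional array it returns one array per axis holding the partial derivative along that axis, and combining them into a gradient magnitude gives a single "how fast does the value change here" number per cell. The function name steepest_point and the uniform spacing argument are assumptions made for this example.

```python
import numpy as np

def steepest_point(values, spacing=1.0):
    """Return the index of the largest gradient magnitude and the magnitude array."""
    values = np.asarray(values, dtype=float)
    grads = np.gradient(values, spacing)
    if values.ndim == 1:
        grads = [grads]  # np.gradient returns a bare array for 1D input
    # Euclidean norm of the per-axis partial derivatives
    magnitude = np.sqrt(sum(g ** 2 for g in grads))
    idx = np.unravel_index(np.argmax(magnitude), magnitude.shape)
    return tuple(int(i) for i in idx), magnitude

# Synthetic field: 0 everywhere, 10 from index 2 onward along the first axis.
T = np.zeros((4, 4, 4))
T[2:, :, :] = 10.0
idx, mag = steepest_point(T)
print(idx, mag.max())
```

The largest magnitudes sit on the two layers next to the jump, because np.gradient uses central differences in the interior of the grid.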

Related

Efficient way to map 3D function to a meshgrid with NumPy

I have a set of data values for a scalar 3D function, arranged as inputs x,y,z in an array of shape (n,3) and the function values f(x,y,z) in an array of shape (n,).
EDIT: For instance, consider the following simple function
data = np.array([np.arange(n)]*3).T
F = np.linalg.norm(data,axis=1)**2
I would like to convolve this function with a spherical kernel in order to perform a 3D smoothing. The easiest way I found to perform this is to map the function values in a 3D spatial grid and then apply a 3D convolution with the kernel I want.
This works fine, however the part that maps the 3D function to the 3D grid is very slow, as I did not find a way to do it with NumPy only. The code below is my actual implementation, where data is the (n,3) array containing the 3D positions in which the function is evaluated, F is the (n,) array containing the corresponding values of the function and M is the (N,N,N) array that contains the 3D space grid.
step = 0.1
# Create meshgrid
xmin = data[:,0].min()
xmax = data[:,0].max()
ymin = data[:,1].min()
ymax = data[:,1].max()
zmin = data[:,2].min()
zmax = data[:,2].max()
x = np.linspace(xmin,xmax,int((xmax-xmin)/step)+1)
y = np.linspace(ymin,ymax,int((ymax-ymin)/step)+1)
z = np.linspace(zmin,zmax,int((zmax-zmin)/step)+1)
# Build image
M = np.zeros((len(x),len(y),len(z)))
for l in range(len(data)):
    for i in range(len(x)-1):
        if x[i] < data[l,0] < x[i+1]:
            for j in range(len(y)-1):
                if y[j] < data[l,1] < y[j+1]:
                    for k in range(len(z)-1):
                        if z[k] < data[l,2] < z[k+1]:
                            M[i,j,k] = F[l]
Is there a more efficient way to fill a 3D spatial grid with the values of a 3D function ?
For each item of data you are scanning every pixel of the cuboid to check whether the item lies inside it. You can skip this scan by calculating the index of the matching pixel directly, for example:
import numpy as np
data = np.array([[1, 2, 3], #14 (corner1)
                 [4, 5, 6], #77 (corner2)
                 [2.5, 3.5, 4.5], #38.75 (duplicated pixel)
                 [2.9, 3.9, 4.9], #47.63 (duplicated pixel)
                 [1.5, 2, 3]]) #15.25 (one step up from [1, 2, 3])
F = np.linalg.norm(data, axis=1)**2 # example function from the question; gives the values in the comments
step = 0.5
data_idx = ((data - data.min(axis=0))//step).astype(int)
M = np.zeros(np.max(data_idx, axis=0) + 1)
x, y, z = data_idx.T
M[x, y, z] = F
Note that only one value of duplicated pixels is being mapped to M.
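If dropping duplicates is not acceptable, one possible variation is to average all values that land in the same pixel. This is a sketch only, assuming the same floor-based indexing; the helper name grid_mean and the choice to mark empty pixels with NaN are not from the answer.

```python
import numpy as np

def grid_mean(data, F, step):
    """Average the values of F over every point that falls into the same pixel."""
    idx = np.floor((data - data.min(axis=0)) / step).astype(int)
    shape = tuple(int(s) for s in idx.max(axis=0) + 1)
    flat = np.ravel_multi_index(tuple(idx.T), shape)
    n_cells = int(np.prod(shape))
    sums = np.bincount(flat, weights=F, minlength=n_cells)
    counts = np.bincount(flat, minlength=n_cells)
    mean = np.full(n_cells, np.nan)  # NaN marks pixels that received no data
    np.divide(sums, counts, out=mean, where=counts > 0)
    return mean.reshape(shape)

data = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [2.5, 3.5, 4.5],
                 [2.9, 3.9, 4.9],
                 [1.5, 2, 3]])
F = np.sum(data ** 2, axis=1)
M = grid_mean(data, F, step=0.5)
print(M.shape, M[3, 3, 3])  # pixel (3, 3, 3) holds the mean of 38.75 and 47.63
```

Swapping the mean for a sum only requires returning sums instead of dividing by counts.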
All you need is to reshape F[:, 3] (only f(x, y, z)) into a grid. It is hard to be more precise without sample data; this assumes F is an (n, 4) array with columns x, y, z, f(x, y, z) whose points form a complete N x N x N grid.
If the data is not sorted, you need to sort it:
F_sorted = F[np.lexsort((F[:,2], F[:,1], F[:,0]))] # sort by x, then y, then z (np.lexsort uses the last key as the primary one)
Choose only f(x, y, z):
F_values = F_sorted[:, 3]
Finally, reshape the values into a grid:
M = F_values.reshape(N, N, N)
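The sort-then-reshape idea can be checked on a small synthetic grid. This is a sketch under the assumption that the points form a complete N x N x N grid; the variable name table and the test function x^2 + y^2 + z^2 are chosen here for illustration. Note that np.lexsort treats its last key as the primary sort key, so x is passed last to make it the slowest-varying index.

```python
import numpy as np

N = 3
axis = np.arange(N, dtype=float)
xx, yy, zz = np.meshgrid(axis, axis, axis, indexing="ij")
expected = xx ** 2 + yy ** 2 + zz ** 2

# One row per grid point: x, y, z, f(x, y, z), then shuffled to lose the order.
table = np.column_stack([xx.ravel(), yy.ravel(), zz.ravel(), expected.ravel()])
table = np.random.default_rng(0).permutation(table)

# Primary key x, then y, then z (the last key given to lexsort is the primary one).
order = np.lexsort((table[:, 2], table[:, 1], table[:, 0]))
M = table[order, 3].reshape(N, N, N)

print(np.array_equal(M, expected))  # True: M[i, j, k] == f(axis[i], axis[j], axis[k])
```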
This method is faster than the original (approximately 20x speed up):
step = 0.1
mins = np.min(data, axis=0)
maxs = np.max(data, axis=0)
ranges = np.floor((maxs - mins) / step + 1).astype(int)
indx = np.zeros(data.shape,dtype=int)
for i in range(3):
    x = np.linspace(mins[i], maxs[i], ranges[i])
    indx[:,i] = np.argmax(data[:,i,np.newaxis] <= (x[np.newaxis,:]), axis=1) -1
M = np.zeros(ranges)
M[indx[:,0],indx[:,1],indx[:,2]] = F
The first part sets up the required grid variables. The argmax function provides a simple (and fast) way to find the first true value of the broadcasted array. This produces a set of indices for x, y and z directions for each of the function values.
The resulting array M is not the same as that produced by the original code as the original code loses data. The logic of y[j] < data[l,1] < y[j+1] where y is a vector produced using linspace means the minimum and maximum values for each direction will be missed (data[l,1] might be equal to either y[j] or y[j+1]!). Run it with a dataset of two values each with their own coordinates and the M array will be all zeros.
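The two-point claim can be reproduced with a short check. This sketch restates the strict comparison of the original loop in vectorized form; the function name fill_strict and the two sample points are assumptions made for the demonstration.

```python
import numpy as np

def fill_strict(data, F, step):
    """Fill a grid using the strict test edge[i] < value < edge[i+1] on every axis."""
    edges = []
    for d in range(3):
        lo, hi = data[:, d].min(), data[:, d].max()
        edges.append(np.linspace(lo, hi, int((hi - lo) / step) + 1))
    M = np.zeros([len(e) for e in edges])
    for l in range(len(data)):
        # indices i with edges[i] < value < edges[i+1], separately per axis
        hits = [np.flatnonzero((e[:-1] < data[l, d]) & (data[l, d] < e[1:]))
                for d, e in enumerate(edges)]
        if all(len(h) > 0 for h in hits):
            M[hits[0][0], hits[1][0], hits[2][0]] = F[l]
    return M

data = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
F = np.array([5.0, 7.0])
M = fill_strict(data, F, step=0.1)
print(M.any())  # False: both points sit exactly on the first or last edge
```

Both points coincide with the minimum or maximum of the linspace edges, so the strict inequalities never hold and nothing is written.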

Finding minimum distance between points using two iterative lists and 'in range (0,len a_list))

So I have a question that needs solving, and the answer has to fit within a fixed set of parameters. I've searched Stack for hours on this but can't find anything that helps. I'm new to Python and this is part of a quiz, but without enough knowledge of iterating through lists I'm lost!
I know the solution to this having done the brute force method manually and looking at the scatter graph that comes out but I can't get the code to generate what I expect.
Code as follows
import matplotlib.pyplot as plt
xs = [1, 7, 2, 10, 3, 4, 8, 4]
ys = [1, 2, 4, 9, 16, 0, 12, 8]
plt.scatter(xs, ys)
import math
"""Compute the minimum distance between points given as lists for
x, y coordinates. Return the values for the closest pair of points."""
def min_distance(xs, ys):
    """assume inputs xs, ys are lists of same length representing
    x,y point ordinates where points are distinct
    start with a high number as lowest"""
    min_dist = 99999
    min_index1 = 0
    min_index2 = 0
    # iterate x,y ordinates to find minimum distance
    """YOUR CODE GOES HERE - BEGIN"""
    for i in range (0, len(xs)):
        for j in range (0, len(ys)):
            if i > min_index1 and j > min_index2:
                min_dist = math.sqrt((xs[i] - xs[j])**2 + (ys[i] - ys[j])**2)
    """YOUR CODE GOES HERE - END"""
    return(min_index1, min_index2, min_dist)
index1, index2, d = min_distance(xs, ys)
print("Closest ordinates {},{} with distance: {}".format(index1,index2,d))
The only code I'm supposed to change is the part between the YOUR CODE GOES HERE markers, which I have attempted to write myself.
Welcome to StackOverflow. First of all, inside the nested loop you're doing
xs[i] - xs[i]
which always returns zero.
And maybe I misunderstood the question, but if all of your points are generated by p_i = (xs[i],ys[i]), then for every p_i check the distance to the other points p_j (j > i), and store the minimum and the indices of these points in the process.
So, the function you are looking for is zip: https://docs.python.org/3/library/functions.html#zip
With it you can iterate over your coordinates like:
for x, y in zip(xs, ys):
    # calculate distance here
    list_of_distances.append((x, y, distance))
You also want to store everything, so you can use min() or sorted() to find the lowest.
https://docs.python.org/3/library/functions.html#min
https://docs.python.org/3/library/functions.html#sorted
The question could indeed be edited in line with the comments on other answers. There might be a better solution for this, but here is a solution that iterates through every point combination. It creates a matrix of squared distances and then looks for the position where the minimum was found. Note that performance will be a concern for very large arrays.
xs = [1, 7, 2, 10, 3, 4, 8, 4]
ys = [1, 2, 4, 9, 16, 0, 12, 8]
from typing import List, Tuple
import numpy as np
x = np.array(xs)
y = np.array(ys)
def get_idx_min_dist(x: np.array, y: np.array) -> Tuple[int, int]:
    DIST_MAT = np.full((len(x),len(y)), np.inf)
    xy = [(i, j) for i, j in zip(x, y)]
    for i, point1 in enumerate(xy):
        for j, point2 in enumerate(xy):
            if i != j:
                DIST_MAT[i, j] = (point1[0] - point2[0])**2 + (point1[1] - point2[1])**2
    min_pos = DIST_MAT.argmin()
    first_point = min_pos // len(x)
    second_point = min_pos % len(x)
    return first_point, second_point
first, second = get_idx_min_dist(x, y)
print(first, second)
The output is:
0 2
EDIT
You might want to use combinations from itertools and create all different point pair combinations instead of creating a big dist matrix and consequent expensive search for min values.
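A minimal sketch of the itertools.combinations idea, assuming plain Python lists as in the question; the function name closest_pair is hypothetical. Each unordered pair of indices is visited once and min() picks the pair with the smallest Euclidean distance.

```python
import math
from itertools import combinations

def closest_pair(xs, ys):
    """Return (i, j, distance) for the two closest points among (xs[k], ys[k])."""
    points = list(zip(xs, ys))

    def distance(pair):
        i, j = pair
        return math.hypot(points[i][0] - points[j][0], points[i][1] - points[j][1])

    # combinations yields every index pair (i, j) with i < j exactly once
    i, j = min(combinations(range(len(points)), 2), key=distance)
    return i, j, distance((i, j))

xs = [1, 7, 2, 10, 3, 4, 8, 4]
ys = [1, 2, 4, 9, 16, 0, 12, 8]
print(closest_pair(xs, ys))
```

For this data two pairs share the minimum distance of sqrt(10), so which of them is reported depends on iteration order.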
Thanks for your help all, after a lot of trial and error I got the answer
for i in range (len(xs)-1):
    for j in range (i+1,len(ys)):
        # squared differences need **2, not *2
        dist = math.sqrt((xs[i] - xs[j])**2 + (ys[i] - ys[j])**2)
        if dist < min_dist:
            min_dist = dist
            # store the indices of the two points
            min_index1 = i
            min_index2 = j

Arrange and Sub-sample 3d Points in Coordinate Grid With Numpy

I have a list of 3d points such as
np.array([
[220, 114, 2000],
[125.24, 214, 2519],
...
[54.1, 254, 1249]
])
The points are in no meaningful order. I'd like to sort and reshape the array in a way that better represents a coordinate grid (such that I have a known width and height and can retrieve Z values by index). I would also like to down sample the points to, say, whole integers, handling collisions by applying min, max, or mean during the down sampling.
I know I can down sample a 1d array using np.mean and np.shape
The approach I'm currently using finds the min and max in X,Y and then puts the Z values into a 2d array while doing the down sampling manually.
This iterates the giant array numerous times and I'm wondering if there is a way to do this with np.meshgrid or some other numpy functionality that I'm overlooking.
Thanks
You can use the binning method from "Most efficient way to sort an array into bins specified by an index array?"
To get an index array from y,x coordinates you can use np.searchsorted and np.ravel_multi_index
Here is a sample implementation, stb module is the code from the linked post.
import numpy as np
from stb import sort_to_bins_sparse as sort_to_bins
def grid1D(u, N):
    mn, mx = u.min(), u.max()
    return np.linspace(mn, mx, N, endpoint=False)
def gridify(yxz, N):
    try:
        Ny, Nx = N
    except TypeError:
        Ny = Nx = N
    y, x, z = yxz.T
    yg, xg = grid1D(y, Ny), grid1D(x, Nx)
    yidx, xidx = yg.searchsorted(y, 'right')-1, xg.searchsorted(x, 'right')-1
    yx = np.ravel_multi_index((yidx, xidx), (Ny, Nx))
    zs = sort_to_bins(yx, z)
    return np.concatenate([[0], np.bincount(yx).cumsum()]), zs, yg, xg
def bin(yxz, N, binning_method='min'):
    boundaries, binned, yg, xg = gridify(yxz, N)
    result = np.full((yg.size, xg.size), np.nan)
    if binning_method == 'min':
        result.reshape(-1)[:len(boundaries)-1] = np.minimum.reduceat(binned, boundaries[:-1])
    elif binning_method == 'max':
        result.reshape(-1)[:len(boundaries)-1] = np.maximum.reduceat(binned, boundaries[:-1])
    elif binning_method == 'mean':
        result.reshape(-1)[:len(boundaries)-1] = np.add.reduceat(binned, boundaries[:-1]) / np.diff(boundaries)
    else:
        raise ValueError
    result.reshape(-1)[np.where(boundaries[1:] == boundaries[:-1])] = np.nan
    return result
def test():
    yxz = np.random.uniform(0, 100, (100000, 3))
    N = 20
    boundaries, binned, yg, xg = gridify(yxz, N)
    binmin = bin(yxz, N)
    binmean = bin(yxz, N, 'mean')
    y, x, z = yxz.T
    for i in range(N-1):
        for j in range(N-1):
            msk = (y>=yg[i]) & (y<yg[i+1]) & (x>=xg[j]) & (x<xg[j+1])
            assert (z[msk].min() == binmin[i, j]) if msk.any() else np.isnan(binmin[i, j])
            assert np.isclose(z[msk].mean(), binmean[i, j]) if msk.any() else np.isnan(binmean[i, j])
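If the external stb module is not available, a simpler NumPy-only variant may be enough for moderate data sizes. This is a sketch, not the method above: it assumes columns ordered x, y, z, a square cell size given by a hypothetical cell parameter, and it uses the unbuffered ufunc.at methods, which are convenient but not the fastest option.

```python
import numpy as np

def bin_points(points, cell=1.0, method="mean"):
    """Down sample (x, y, z) points onto a 2D grid indexed as grid[row_y, col_x]."""
    x, y, z = np.asarray(points, dtype=float).T
    ix = np.floor((x - x.min()) / cell).astype(int)
    iy = np.floor((y - y.min()) / cell).astype(int)
    shape = (iy.max() + 1, ix.max() + 1)
    if method == "mean":
        sums = np.zeros(shape)
        counts = np.zeros(shape)
        np.add.at(sums, (iy, ix), z)     # accumulates correctly for repeated cells
        np.add.at(counts, (iy, ix), 1)
        grid = np.full(shape, np.nan)    # empty cells stay NaN
        np.divide(sums, counts, out=grid, where=counts > 0)
    elif method == "min":
        grid = np.full(shape, np.inf)
        np.minimum.at(grid, (iy, ix), z)
        grid[np.isinf(grid)] = np.nan
    elif method == "max":
        grid = np.full(shape, -np.inf)
        np.maximum.at(grid, (iy, ix), z)
        grid[np.isinf(grid)] = np.nan
    else:
        raise ValueError(method)
    return grid

pts = [[0.2, 0.1, 5.0], [0.7, 0.4, 7.0], [2.5, 0.3, 1.0]]
print(bin_points(pts, cell=1.0, method="mean"))  # [[ 6. nan  1.]]
```

The first two points collide in the same cell, so the three methods give 6, 5 and 7 there respectively.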

Implementation of a threshold detection function in Python

I want to implement the following trigger function in Python:
Input:
time vector t [n dimensional numpy vector]
data vector y [n dimensional numpy vector] (values correspond to t vector)
threshold tr [float]
Threshold type vector tr_type [m dimensional list of int values]
Output:
Threshold time vector tr_time [m dimensional list of float values]
Function:
I would like to return tr_time, which consists of the exact (preferably interpolated, which is not yet in the code below) time values at which y crosses tr (crossing means going from less than to greater than, or the other way around). The values in tr_time correspond to the entries of tr_type: each element of tr_type gives the number of the crossing and whether it is an upgoing or a downgoing crossing. For example, 1 means the first time y goes from less than tr to greater than tr, and -3 means the third time y goes from greater than tr to less than tr (counting along the time vector t).
For the moment I have the following code:
import numpy as np
import matplotlib.pyplot as plt
def trigger(t, y, tr, tr_type):
    triggermarker = np.diff(1 * (y > tr))
    positiveindices = [i for i, x in enumerate(triggermarker) if x == 1]
    negativeindices = [i for i, x in enumerate(triggermarker) if x == -1]
    triggertime = []
    for i in tr_type:
        if i >= 0:
            triggertime.append(t[positiveindices[i - 1]])
        elif i < 0:
            # -k means the k-th downgoing crossing, i.e. list position k - 1
            triggertime.append(t[negativeindices[-i - 1]])
    return triggertime
t = np.linspace(0, 20, 1000)
y = np.sin(t)
tr = 0.5
tr_type = [1, 2, -2]
print(trigger(t, y, tr, tr_type))
plt.plot(t, y)
plt.grid()
Now I'm pretty new to Python so I was wondering if there is a more Pythonic and more efficient way to implement this. For example without for loops or without the need to write separate code for upgoing or downgoing crossings.
You can use two masks: the first separates the value below and above the threshold, the second uses np.diff on the first mask: if the i and i+1 value are both below or above the threshold, np.diff yields 0:
import numpy as np
import matplotlib.pyplot as plt
t = np.linspace(0, 8 * np.pi, 400)
y = np.sin(t)
th = 0.5
mask = np.diff(1 * (y > th) != 0)
plt.plot(t, y, 'bx', markersize=3)
plt.plot(t[:-1][mask], y[:-1][mask], 'go', markersize=8)
Using the slice [:-1] yields the index "immediately before" the threshold crossing (you can see that in the chart). If you want the index "immediately after", use [1:] instead of [:-1].
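The question also asks for interpolated crossing times and for handling of tr_type. One possible way to add both is sketched here, assuming linear interpolation between the two samples around each crossing; the function names crossing_times and trigger_times are made up for this example.

```python
import numpy as np

def crossing_times(t, y, tr):
    """Return interpolated crossing times and their direction (+1 up, -1 down)."""
    change = np.diff((y > tr).astype(int))
    idx = np.flatnonzero(change)  # sample index just before each crossing
    # linear interpolation between sample idx and idx + 1
    frac = (tr - y[idx]) / (y[idx + 1] - y[idx])
    times = t[idx] + frac * (t[idx + 1] - t[idx])
    return times, change[idx]

def trigger_times(t, y, tr, tr_type):
    """k > 0 selects the k-th upgoing crossing, k < 0 the |k|-th downgoing one."""
    times, direction = crossing_times(t, y, tr)
    up = times[direction == 1]
    down = times[direction == -1]
    return [float(up[k - 1]) if k > 0 else float(down[-k - 1]) for k in tr_type]

t = np.linspace(0, 20, 2001)
y = np.sin(t)
print(trigger_times(t, y, 0.5, [1, 2, -2]))
```

For sin(t) and a threshold of 0.5 the results should lie close to pi/6, pi/6 + 2*pi and 5*pi/6 + 2*pi.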

Minimizing a function of a linear combination of data with Scipy

Suppose I have some matrix X where each row represents a time-series. For example, X could be a matrix of size 3 x 1000, which would mean that there are 3 time-series each consisting of 1000 time-points. In addition to X, I have one scalar for each time-series in X. I would like to find a linear combination
a[0] * X[0, :] + a[1] * X[1, :] + ... + a[n-1] * X[n-1, :]
that has the minimum value for some function F.
So, I attempted the following
import numpy as np
from scipy.optimize import minimize
def f(x):
    return 0 # for testing purposes
def obj(a,x):
    y = a*x
    return f(y)
minimize(obj, np.array([1,1]), args=np.array([[1,1],[2,2]]), method='nelder-mead')
So the second argument is the initial guess x0 (the coefficients a). The data given by args should get mapped to x (if I understand it correctly) and remains constant during the optimization.
However, I get the error
ValueError: setting an array element with a sequence.
I guess my problem is a pretty general one, so I hope someone will be able to help!
Something like this?
import scipy.optimize as opt
def f(val):
    return val**2
def obj(a, series):
    s = 0
    for row in series:
        for t in range(len(row)):
            s += f(a[t] * row[t])
    return s
ll_x = [[2, 3, 2, 6], [3, 5, 2, 7]] # 2 series
l_a = [1 for _ in ll_x[0]] # initial coeffs.
res = opt.minimize(obj, l_a, args=ll_x, method='nelder-mead')
for elem in sorted(res.items()):
    print(*elem)
(works for me with Python 3.4.3)
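For the formulation in the question, with one coefficient per time-series, a vectorized objective might look like the sketch below. It assumes the linear combination is the product a @ X and that the objective returns a single float. The function F and the target series are placeholders invented for this example so that the expected coefficients are known.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 200))             # 3 time-series with 200 time-points each
target = 2.0 * X[0] - 1.0 * X[1] + 0.5 * X[2]

def F(y):
    # Hypothetical function to minimize: squared distance to a known target series.
    return float(np.sum((y - target) ** 2))

def objective(a, X):
    # a has one entry per row of X; a @ X is the linear combination of the rows.
    return F(a @ X)

res = minimize(objective, x0=np.ones(X.shape[0]), args=(X,), method="Nelder-Mead")
print(res.x)  # expected to be close to [2, -1, 0.5]
```

Passing the data as the one-element tuple args=(X,) keeps it fixed during the optimization, and returning a scalar from the objective is what minimize expects.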
