I need to create a function which would be the inverse of the np.gradient function, where the Vx, Vy arrays (velocity component vectors) are the input and the output would be an array of anti-derivatives (arrival times) at the data points x, y.
I have data on a (x,y) grid with scalar values (time) at each point.
I have used the numpy gradient function and linear interpolation to determine the gradient vector Velocity (Vx,Vy) at each point (See below).
I have achieved this by:
#LinearTriInterpolator applied to a delaunay triangular mesh
LTI= LinearTriInterpolator(masked_triang, time_array)
#Gradient requested at the mesh nodes:
(Vx, Vy) = LTI.gradient(triang.x, triang.y)
The first image below shows the velocity vectors at each point, and the point labels represent the time value which formed the derivatives (Vx,Vy)
The next image shows the resultant scalar value of the derivatives (Vx,Vy) plotted as a colored contour graph with associated node labels.
So my challenge is:
I need to reverse the process!
Using the gradient vectors (Vx,Vy) or the resultant scalar value to determine the original Time-Value at that point.
Is this possible?
Knowing that the numpy.gradient function is computed using second order accurate central differences in the interior points and either first or second order accurate one-sided (forward or backward) differences at the boundaries, I am sure there is a function which would reverse this process.
I was thinking that taking a line integral from the original point (t=0 at x1,y1) to any point (xi,yi) over the Vx, Vy plane would give me the sum of the velocity components. I could then divide this value by the distance between the two points to get the time taken.
Would this approach work? And if so, which numpy integrate function would be best applied?
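For intuition, here is a minimal 1D sketch (hypothetical values, not my actual data) suggesting that cumulatively integrating the output of np.gradient recovers the original samples up to the starting constant:
import numpy as np
from scipy.integrate import cumulative_trapezoid

xs = np.linspace(0.0, 10.0, 50)       # hypothetical sample positions
t = np.sin(xs) + 0.1 * xs**2          # hypothetical "time" values
dt_dx = np.gradient(t, xs)            # the forward step: np.gradient

# Inverse step: cumulative trapezoidal integration recovers t up to t[0]
t_rec = cumulative_trapezoid(dt_dx, xs, initial=0.0) + t[0]
print(np.max(np.abs(t_rec - t)))      # small residual from the finite differences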
An example of my data can be found here [http://www.filedropper.com/calculatearrivaltimefromgradientvalues060820]
Your help would be greatly appreciated
EDIT:
Maybe this simplified drawing might help show where I'm trying to get to.
EDIT:
Thanks to @Aguy who has contributed to this code. I have tried to get a more accurate representation using a meshgrid of spacing 0.5 x 0.5 m and calculating the gradient at each mesh point, however I am not able to integrate it properly. I also have some edge effects which are affecting the results and which I don't know how to correct.
import numpy as np
from scipy import interpolate
from matplotlib import pyplot
from mpl_toolkits.mplot3d import Axes3D
# Create mesh grid with a spacing of 0.5 x 0.5
stepx = 0.5
stepy = 0.5
xx = np.arange(min(x), max(x), stepx)
yy = np.arange(min(y), max(y), stepy)
xgrid, ygrid = np.meshgrid(xx, yy)
grid_z1 = interpolate.griddata((x,y), Arrival_Time, (xgrid, ygrid), method='linear') #Interpolating the Time values
# Format data
X = np.ravel(xgrid)
Y= np.ravel(ygrid)
zs = np.ravel(grid_z1)
Z = zs.reshape(X.shape)
#Calculate Gradient
(dx,dy) = np.gradient(grid_z1) #Find gradient for points on meshgrid
Velocity_dx = dx/stepx #velocity ms/m
Velocity_dy = dy/stepy #velocity ms/m
Resultant = (Velocity_dx**2 + Velocity_dy**2)**0.5 #Resultant scalar value ms/m
Resultant = np.ravel(Resultant)
#Plot Original Data F(X,Y) on the meshgrid
fig = pyplot.figure()
ax = fig.add_subplot(projection='3d')
ax.scatter(x,y,Arrival_Time,color='r')
ax.plot_trisurf(X, Y, Z)
ax.set_xlabel('X-Coordinates')
ax.set_ylabel('Y-Coordinates')
ax.set_zlabel('Time (ms)')
pyplot.show()
#Plot the Derivative of f'(X,Y) on the meshgrid
fig = pyplot.figure()
ax = fig.add_subplot(projection='3d')
ax.scatter(X,Y,Resultant,color='r',s=0.2)
ax.plot_trisurf(X, Y, Resultant)
ax.set_xlabel('X-Coordinates')
ax.set_ylabel('Y-Coordinates')
ax.set_zlabel('Velocity (ms/m)')
pyplot.show()
#Integrate to compare the original data input
dxintegral = np.nancumsum(Velocity_dx, axis=1)*stepx
dyintegral = np.nancumsum(Velocity_dy, axis=0)*stepy
valintegral = np.ma.zeros(dxintegral.shape)
for i in range(len(yy)):
    for j in range(len(xx)):
        valintegral[i, j] = np.ma.sum([dxintegral[0, len(xx) // 2],
                                       dyintegral[i, len(yy) // 2], dxintegral[i, j], -dxintegral[i, len(xx) // 2]])
valintegral = valintegral * np.isfinite(dxintegral)
Now np.gradient is applied at every mesh node: (dx,dy) = np.gradient(grid_z1).
Now in my process I would analyse the gradient values above and make some adjustments (there are some unusual edge effects being created which I need to rectify), and I would then integrate the values to get back to a surface which would be very similar to f(x,y) shown above.
I need some help adjusting the integration function shown above.
And now I need to calculate the new 'Time' values at the original (x,y) point locations.
UPDATE (08-09-20): I am getting some promising results using the help from @Aguy. The results can be seen below (with the blue contours representing the original data, and the red contours representing the integrated values).
I am still working on an integration approach which can remove the inaccuracies at the areas of min(y) and max(y).
from matplotlib.tri import (Triangulation, UniformTriRefiner,
                            CubicTriInterpolator, LinearTriInterpolator, TriInterpolator, TriAnalyzer)
import pandas as pd
from scipy.interpolate import griddata
import matplotlib.pyplot as plt
import numpy as np
from scipy import interpolate
#-------------------------------------------------------------------------
# STEP 1: Import data from Excel file, and set variables
#-------------------------------------------------------------------------
df_initial = pd.read_excel(
r'C:\Users\morga\PycharmProjects\venv\Development\Trial'
r'.xlsx')
# Input data can be found here: link
df_initial = df_initial.sort_values(by='Delay', ascending=True)  # Update dataframe and sort by Delay
x = df_initial['X'].to_numpy()
y = df_initial['Y'].to_numpy()
Arrival_Time = df_initial['Delay'].to_numpy()
# Create mesh grid with a spacing of 0.5 x 0.5
stepx = 0.5
stepy = 0.5
xx = np.arange(min(x), max(x), stepx)
yy = np.arange(min(y), max(y), stepy)
xgrid, ygrid = np.meshgrid(xx, yy)
grid_z1 = interpolate.griddata((x, y), Arrival_Time, (xgrid, ygrid), method='linear') # Interpolating the Time values
# Calculate Gradient (velocity ms/m)
(dy, dx) = np.gradient(grid_z1) # Find gradient for points on meshgrid
Velocity_dx = dx / stepx  # x velocity component ms/m
Velocity_dy = dy / stepy  # y velocity component ms/m
# Integrate to compare the original data input
dxintegral = np.nancumsum(Velocity_dx, axis=1) * stepx
dyintegral = np.nancumsum(Velocity_dy, axis=0) * stepy
valintegral = np.ma.zeros(dxintegral.shape)  # Makes an array filled with 0's, the same shape as dxintegral
for i in range(len(yy)):
    for j in range(len(xx)):
        valintegral[i, j] = np.ma.sum(
            [dxintegral[0, len(xx) // 2], dyintegral[i, len(xx) // 2], dxintegral[i, j], -dxintegral[i, len(xx) // 2]])
valintegral[np.isnan(dx)] = np.nan
min_value = np.nanmin(valintegral)
valintegral = valintegral - min_value  # shift so the minimum is zero
##Plot Results
fig = plt.figure()
ax = fig.add_subplot()
ax.scatter(x, y, color='black', s=7, zorder=3)
ax.set_xlabel('X-Coordinates')
ax.set_ylabel('Y-Coordinates')
ax.contour(xgrid, ygrid, valintegral, levels=50, colors='red', zorder=2)
ax.contour(xgrid, ygrid, grid_z1, levels=50, colors='blue', zorder=1)
ax.set_aspect('equal')
plt.show()
TL;DR
You have multiple challenges to address in this issue, mainly:
Potential reconstruction (scalar field) from its gradient (vector field)
But also:
Observations in a concave hull with a non-rectangular grid;
Numerical 2D line integration and numerical inaccuracy.
It seems this can be solved by choosing an ad hoc interpolant and a smart way to integrate (as pointed out by @Aguy).
MCVE
First, let's build an MCVE to highlight the key points mentioned above.
Dataset
We recreate a scalar field and its gradient.
import numpy as np
from scipy import interpolate
import matplotlib.pyplot as plt
def f(x, y):
    return x**2 + x*y + 2*y + 1
Nx, Ny = 21, 17
xl = np.linspace(-3, 3, Nx)
yl = np.linspace(-2, 2, Ny)
X, Y = np.meshgrid(xl, yl)
Z = f(X, Y)
zl = np.arange(np.floor(Z.min()), np.ceil(Z.max())+1, 2)
dZdy, dZdx = np.gradient(Z, yl, xl, edge_order=1)
V = np.hypot(dZdx, dZdy)
The scalar field looks like:
axe = plt.axes(projection='3d')
axe.plot_surface(X, Y, Z, cmap='jet', alpha=0.5)
axe.view_init(elev=25, azim=-45)
And, the vector field looks like:
axe = plt.contour(X, Y, Z, zl, cmap='jet')
axe.axes.quiver(X, Y, dZdx, dZdy, V, units='x', pivot='tip', cmap='jet')
axe.axes.set_aspect('equal')
axe.axes.grid()
Indeed, the gradient is normal to the potential level curves. We also plot the gradient magnitude:
axe = plt.contour(X, Y, V, 10, cmap='jet')
axe.axes.set_aspect('equal')
axe.axes.grid()
Raw field reconstruction
If we naively reconstruct the scalar field from the gradient:
SdZx = np.cumsum(dZdx, axis=1)*np.diff(xl)[0]
SdZy = np.cumsum(dZdy, axis=0)*np.diff(yl)[0]
Zhat = np.zeros(SdZx.shape)
for i in range(Zhat.shape[0]):
    for j in range(Zhat.shape[1]):
        Zhat[i,j] += np.sum([SdZy[i,0], -SdZy[0,0], SdZx[i,j], -SdZx[i,0]])
Zhat += Z[0,0] - Zhat[0,0]
We can see the global result is roughly correct, but levels are less accurate where the gradient magnitude is low:
Interpolated field reconstruction
If we increase the grid resolution and pick a specific interpolant (as is usual when dealing with a mesh grid), we can get a finer field reconstruction:
r = np.stack([X.ravel(), Y.ravel()]).T
Sx = interpolate.CloughTocher2DInterpolator(r, dZdx.ravel())
Sy = interpolate.CloughTocher2DInterpolator(r, dZdy.ravel())
Nx, Ny = 200, 200
xli = np.linspace(xl.min(), xl.max(), Nx)
yli = np.linspace(yl.min(), yl.max(), Ny)
Xi, Yi = np.meshgrid(xli, yli)
ri = np.stack([Xi.ravel(), Yi.ravel()]).T
dZdxi = Sx(ri).reshape(Xi.shape)
dZdyi = Sy(ri).reshape(Xi.shape)
SdZxi = np.cumsum(dZdxi, axis=1)*np.diff(xli)[0]
SdZyi = np.cumsum(dZdyi, axis=0)*np.diff(yli)[0]
Zhati = np.zeros(SdZxi.shape)
for i in range(Zhati.shape[0]):
    for j in range(Zhati.shape[1]):
        Zhati[i,j] += np.sum([SdZyi[i,0], -SdZyi[0,0], SdZxi[i,j], -SdZxi[i,0]])
Zhati += Z[0,0] - Zhati[0,0]
Which definitely performs way better:
So basically, increasing the grid resolution with an ad hoc interpolant may help you get a more accurate result. The interpolant also solves the need to obtain a regular rectangular grid from a triangular mesh in order to perform the integration.
Concave and convex hull
You also have pointed out inaccuracy on the edges. Those are the result of the combination of the interpolant choice and the integration methodology. The integration methodology fails to properly compute the scalar field when it reaches a concave region with few interpolated points. The problem disappears when choosing a mesh-free interpolant that is able to extrapolate.
To illustrate it, let's remove some data from our MCVE:
q = np.full(dZdx.shape, False)
q[0:6,5:11] = True
q[-6:,-6:] = True
dZdx[q] = np.nan
dZdy[q] = np.nan
Then the interpolant can be constructed as follows:
q2 = ~np.isnan(dZdx.ravel())
r = np.stack([X.ravel(), Y.ravel()]).T[q2,:]
Sx = interpolate.CloughTocher2DInterpolator(r, dZdx.ravel()[q2])
Sy = interpolate.CloughTocher2DInterpolator(r, dZdy.ravel()[q2])
Performing the integration, we see that in addition to the classical edge effect we have less accurate values in concave regions (wavy dot-dash lines where the hull is concave), and we have no data outside the convex hull because Clough-Tocher is a mesh-based interpolant:
Vl = np.arange(0, 11, 1)
axe = plt.contour(X, Y, np.hypot(dZdx, dZdy), Vl, cmap='jet')
axe.axes.contour(Xi, Yi, np.hypot(dZdxi, dZdyi), Vl, cmap='jet', linestyles='-.')
axe.axes.set_aspect('equal')
axe.axes.grid()
So basically the errors we are seeing in the corners are most likely due to an integration issue combined with an interpolation limited to the convex hull.
To overcome this we can choose a different interpolant such as an RBF (Radial Basis Function) kernel, which is able to create data outside the convex hull:
Sx = interpolate.Rbf(r[:,0], r[:,1], dZdx.ravel()[q2], function='thin_plate')
Sy = interpolate.Rbf(r[:,0], r[:,1], dZdy.ravel()[q2], function='thin_plate')
dZdxi = Sx(ri[:,0], ri[:,1]).reshape(Xi.shape)
dZdyi = Sy(ri[:,0], ri[:,1]).reshape(Xi.shape)
Notice the slightly different interface of this interpolator (mind how parameters are passed).
The result is the following:
We can see that the region outside the convex hull can be extrapolated (RBFs are mesh-free). So choosing the ad hoc interpolant is definitely a key point in solving your problem. But we still need to be aware that extrapolation may perform well here but is somewhat meaningless and dangerous.
Solving your problem
The answer provided by @Aguy is perfectly fine as it sets up a clever way to integrate that is not disturbed by missing points outside the convex hull. But as you mentioned there is inaccuracy in concave regions inside the convex hull.
If you wish to remove the edge effect you detected, you will have to resort to an interpolant that is able to extrapolate as well, or find another way to integrate.
Interpolant change
Using an RBF interpolant seems to solve your problem. Here is the complete code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import interpolate

df = pd.read_excel('./Trial-Wireup 2.xlsx')
x = df['X'].to_numpy()
y = df['Y'].to_numpy()
z = df['Delay'].to_numpy()
r = np.stack([x, y]).T
#S = interpolate.CloughTocher2DInterpolator(r, z)
#S = interpolate.LinearNDInterpolator(r, z)
S = interpolate.Rbf(x, y, z, epsilon=0.1, function='thin_plate')
N = 200
xl = np.linspace(x.min(), x.max(), N)
yl = np.linspace(y.min(), y.max(), N)
X, Y = np.meshgrid(xl, yl)
#Zp = S(np.stack([X.ravel(), Y.ravel()]).T)
Zp = S(X.ravel(), Y.ravel())
Z = Zp.reshape(X.shape)
dZdy, dZdx = np.gradient(Z, yl, xl, edge_order=1)
SdZx = np.nancumsum(dZdx, axis=1)*np.diff(xl)[0]
SdZy = np.nancumsum(dZdy, axis=0)*np.diff(yl)[0]
Zhat = np.zeros(SdZx.shape)
for i in range(Zhat.shape[0]):
    for j in range(Zhat.shape[1]):
        #Zhat[i,j] += np.nansum([SdZy[i,0], -SdZy[0,0], SdZx[i,j], -SdZx[i,0]])
        Zhat[i,j] += np.nansum([SdZx[0,N//2], SdZy[i,N//2], SdZx[i,j], -SdZx[i,N//2]])
Zhat += Z[100,100] - Zhat[100,100]
lz = np.linspace(0, 5000, 20)
axe = plt.contour(X, Y, Z, lz, cmap='jet')
axe = plt.contour(X, Y, Zhat, lz, cmap='jet', linestyles=':')
axe.axes.plot(x, y, '.', markersize=1)
axe.axes.set_aspect('equal')
axe.axes.grid()
Which graphically renders as follows:
The edge effect is gone because the RBF interpolant can extrapolate over the whole grid. You can confirm it by comparing the results of mesh-based interpolants:
Linear
Clough Tocher
Integration variable order change
We can also try to find a better way to integrate and mitigate the edge effect, e.g. by changing the integration variable order:
Zhat[i,j] += np.nansum([SdZy[N//2,0], SdZx[N//2,j], SdZy[i,j], -SdZy[N//2,j]])
With a classic linear interpolant, the result is quite correct, but we still have an edge effect at the bottom left corner:
As you noticed, the problem occurs at the middle of the axis in the region where the integration starts and lacks a reference point.
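One further mitigation, offered purely as a sketch (it is not part of the approach above), is to average two single-corner integration paths, reusing the SdZx, SdZy and Z arrays from the snippet above; this often reduces the path dependence of the reconstruction:
# Path B: down column 0 first, then across row i (as in the loop above)
Zhat_B = SdZy[:, [0]] - SdZy[0, 0] + SdZx - SdZx[:, [0]]
# Path A: across row 0 first, then down column j
Zhat_A = SdZx[[0], :] - SdZx[0, 0] + SdZy - SdZy[[0], :]
Zhat = 0.5 * (Zhat_A + Zhat_B)        # averaging smooths path-dependent errors
Zhat += Z[100, 100] - Zhat[100, 100]  # re-anchor the arbitrary constant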
Here is one approach:
First, in order to be able to do the integration, it's good to be on a regular grid. Using the variable names x and y here as short for your triang.x and triang.y, we can first create a grid:
import numpy as np
n = 200 # Grid density
stepx = (max(x) - min(x)) / n
stepy = (max(y) - min(y)) / n
xspace = np.arange(min(x), max(x), stepx)
yspace = np.arange(min(y), max(y), stepy)
xgrid, ygrid = np.meshgrid(xspace, yspace)
Then we can interpolate dx and dy on the grid using the same LinearTriInterpolator function:
fdx = LinearTriInterpolator(masked_triang, dx)
fdy = LinearTriInterpolator(masked_triang, dy)
dxgrid = fdx(xgrid, ygrid)
dygrid = fdy(xgrid, ygrid)
Now comes the integration part. In principle, any path we choose should get us to the same value. In practice, since there are missing values and different densities, the choice of path is very important to get a reasonably accurate answer.
Below I choose to integrate over dxgrid in the x direction from 0 to the middle of the grid at n/2, then integrate over dygrid in the y direction from 0 to the point of interest i, and then over dxgrid again from n/2 to the point of interest j. This is a simple way to make sure most of the path of integration is inside the bulk of the available data, by simply picking a path that goes mostly through the "middle" of the data range. Other considerations would lead to different path selections.
So we do:
dxintegral = np.nancumsum(dxgrid, axis=1) * stepx
dyintegral = np.nancumsum(dygrid, axis=0) * stepy
and then (by somewhat brute force for clarity):
valintegral = np.ma.zeros(dxintegral.shape)
for i in range(n):
    for j in range(n):
        valintegral[i, j] = np.ma.sum([dxintegral[0, n // 2], dyintegral[i, n // 2], dxintegral[i, j], -dxintegral[i, n // 2]])
valintegral = valintegral * np.isfinite(dxintegral)
valintegral would be the result, up to an arbitrary constant, which lets you put the "zero" wherever you want.
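For instance, a minimal sketch of pinning that constant at a reference location (x0, y0 is a placeholder here for wherever t = 0 is known to be):
x0, y0 = x.min(), y.min()                        # placeholder, replace with your source location
j0 = np.argmin(np.abs(xspace - x0))              # nearest grid column
i0 = np.argmin(np.abs(yspace - y0))              # nearest grid row
valintegral = valintegral - valintegral[i0, j0]  # shift so the reference point reads zero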
With your data shown here:
ax.tricontourf(masked_triang, time_array)
This is what I'm getting reconstructed when using this method:
ax.contourf(xgrid, ygrid, valintegral)
Hopefully this is somewhat helpful.
If you want to revisit the values at the original triangulation points, you can use interp2d on the valintegral regular grid data.
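A sketch of that resampling step; note that interp2d is deprecated in recent SciPy releases, so RegularGridInterpolator is shown here as an assumed alternative:
from scipy.interpolate import RegularGridInterpolator

# Interpolator over the regular (yspace, xspace) grid of reconstructed values
sampler = RegularGridInterpolator((yspace, xspace), np.ma.filled(valintegral, np.nan),
                                  bounds_error=False, fill_value=np.nan)
# Evaluate back at the original triangulation nodes (row = y, column = x)
time_at_nodes = sampler(np.column_stack([y, x]))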
EDIT:
In reply to your edit, your adaptation above has a few errors:
Change the line (dx,dy) = np.gradient(grid_z1) to (dy,dx) = np.gradient(grid_z1)
In the integration loop change the dyintegral[i, len(yy) // 2] term to dyintegral[i, len(xx) // 2]
Better to replace the line valintegral = valintegral * np.isfinite(dxintegral) with valintegral[np.isnan(dx)] = np.nan
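Putting those corrections together (and using stepy for the y component, as in the updated code above), the integration block from your edit would read:
(dy, dx) = np.gradient(grid_z1)  # np.gradient returns the y-derivative first for a 2D array
Velocity_dx = dx / stepx
Velocity_dy = dy / stepy
dxintegral = np.nancumsum(Velocity_dx, axis=1) * stepx
dyintegral = np.nancumsum(Velocity_dy, axis=0) * stepy
valintegral = np.ma.zeros(dxintegral.shape)
for i in range(len(yy)):
    for j in range(len(xx)):
        valintegral[i, j] = np.ma.sum([dxintegral[0, len(xx) // 2],
                                       dyintegral[i, len(xx) // 2],
                                       dxintegral[i, j],
                                       -dxintegral[i, len(xx) // 2]])
valintegral[np.isnan(dx)] = np.nan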
Related
Say we have a 2D grid that is projected onto a 3D surface, resulting in a 3D numpy array, like the image below. What is the most efficient way to calculate a surface normal for each point of this grid?
I can give you an example with simulated data:
I showed your way, with three points. With three points you can always take the cross product of the two vectors formed from them to get a perpendicular vector; the order only affects the sign.
I took the liberty of also adding the PCA approach using predefined sklearn functions. You could write your own PCA (a good exercise to understand what happens under the hood), but this works fine. The benefit of this approach is that it is easy to increase the number of neighbors and still be able to calculate the normal vector. It is also possible to select the neighbors within a range instead of the N nearest neighbors.
If you need more explanation about the working of the code please let me know.
from functools import partial
import numpy as np
from sklearn.neighbors import KDTree
from mpl_toolkits.mplot3d import axes3d
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
# Grab some test data.
X, Y, Z = axes3d.get_test_data(0.25)
X, Y, Z = map(lambda x: x.flatten(), [X, Y, Z])
plt.plot(X, Y, Z, '.')
plt.show(block=False)
data = np.array([X, Y, Z]).T
tree = KDTree(data, metric='minkowski') # minkowski with p=2 (euclidean)
# Get indices and distances:
dist, ind = tree.query(data, k=3) #k=3 points including itself
def calc_cross(p1, p2, p3):
    v1 = p2 - p1
    v2 = p3 - p1
    v3 = np.cross(v1, v2)
    return v3 / np.linalg.norm(v3)

def PCA_unit_vector(array, pca=PCA(n_components=3)):
    pca.fit(array)
    eigenvalues = pca.explained_variance_
    return pca.components_[np.argmin(eigenvalues)]
combinations = data[ind]
normals = list(map(lambda x: calc_cross(*x), combinations))
# lazy with map
normals2 = list(map(PCA_unit_vector, combinations))
## NEW ##
def calc_angle_with_xy(vectors):
    '''
    Assuming unit vectors!
    '''
    l = np.sum(vectors[:, :2]**2, axis=1) ** 0.5
    return np.arctan2(vectors[:, 2], l)
dist, ind = tree.query(data, k=5) #k=5 points including itself
combinations = data[ind]
# map with functools
pca = PCA(n_components=3)
normals3 = list(map(partial(PCA_unit_vector, pca=pca), combinations))
print( combinations[10] )
print(normals3[10])
n = np.array(normals3)
n[calc_angle_with_xy(n) < 0] *= -1
def set_axes_equal(ax):
    '''Make axes of 3D plot have equal scale so that spheres appear as spheres,
    cubes as cubes, etc. This is one possible solution to Matplotlib's
    ax.set_aspect('equal') and ax.axis('equal') not working for 3D.

    Input
      ax: a matplotlib axis, e.g., as output from plt.gca().
    FROM: https://stackoverflow.com/questions/13685386/matplotlib-equal-unit-length-with-equal-aspect-ratio-z-axis-is-not-equal-to
    '''
    x_limits = ax.get_xlim3d()
    y_limits = ax.get_ylim3d()
    z_limits = ax.get_zlim3d()
    x_range = abs(x_limits[1] - x_limits[0])
    x_middle = np.mean(x_limits)
    y_range = abs(y_limits[1] - y_limits[0])
    y_middle = np.mean(y_limits)
    z_range = abs(z_limits[1] - z_limits[0])
    z_middle = np.mean(z_limits)
    # The plot bounding box is a sphere in the sense of the infinity
    # norm, hence I call half the max range the plot radius.
    plot_radius = 0.5*max([x_range, y_range, z_range])
    ax.set_xlim3d([x_middle - plot_radius, x_middle + plot_radius])
    ax.set_ylim3d([y_middle - plot_radius, y_middle + plot_radius])
    ax.set_zlim3d([z_middle - plot_radius, z_middle + plot_radius])
u, v, w = n.T
fig = plt.figure()
ax = fig.add_subplot(projection='3d')
# ax.set_aspect('equal')
# Make the grid
ax.quiver(X, Y, Z, u, v, w, length=10, normalize=True)
set_axes_equal(ax)
plt.show()
The surface normal for a point cloud is not well defined. One way to define them is from the surface normals of a reconstructed mesh obtained via triangulation (which can introduce artefacts depending on your specific input). A relatively simple and fast solution is to use VTK for that, and more specifically vtkSurfaceReconstructionFilter and vtkPolyDataNormals. Depending on your needs, it might be useful to apply other filters.
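A minimal sketch of that VTK pipeline (the exact filters and parameters to use depend on your point cloud, so treat this as an outline rather than a drop-in solution):
import numpy as np
import vtk

pts = np.random.rand(500, 3)  # hypothetical point cloud, shape (N, 3)

points = vtk.vtkPoints()
for p in pts:
    points.InsertNextPoint(*p)
cloud = vtk.vtkPolyData()
cloud.SetPoints(points)

# Reconstruct an implicit surface, extract its zero isosurface, then compute normals
recon = vtk.vtkSurfaceReconstructionFilter()
recon.SetInputData(cloud)
contour = vtk.vtkContourFilter()
contour.SetInputConnection(recon.GetOutputPort())
contour.SetValue(0, 0.0)
normals = vtk.vtkPolyDataNormals()
normals.SetInputConnection(contour.GetOutputPort())
normals.Update()
mesh_with_normals = normals.GetOutput()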
Goal
I would like to compute the 3D volume integral of a numeric scalar field.
Code
For this post, I will use an example for which the integral can be computed exactly. I have therefore chosen the following function:
In Python, I define the function, and a set of points in 3D, and then generate the discrete values at these points:
import numpy as np
# Make data.
def function(x, y, z):
    return x**y**z

N = 5
grid = np.meshgrid(
    np.linspace(0, 1, N),
    np.linspace(0, 1, N),
    np.linspace(0, 1, N)
)
points = np.vstack(list(map(np.ravel, grid))).T
x = points[:, 0]
y = points[:, 1]
z = points[:, 2]
values = [function(points[i, 0], points[i, 1], points[i, 2])
          for i in range(len(points))]
Question
How can I find the integral, if I don't know the underlying function, i.e. if I only have the coordinates (x, y, z) and the values?
A nice way to go about this would be using scipy's tplquad integration. However, to use that, we need a function and not a point cloud.
An easy way around that is to use an interpolator to get a function approximating our point cloud - we can for example use scipy's RegularGridInterpolator if the data is on a regular grid:
import numpy as np
from scipy import integrate
from scipy.interpolate import RegularGridInterpolator
# Make data.
def function(x,y,z):
    return x*y*z
N = 5
xmin, xmax = 0, 1
ymin, ymax = 0, 1
zmin, zmax = 0, 1
x = np.linspace(xmin, xmax, N)
y = np.linspace(ymin, ymax, N)
z = np.linspace(zmin, zmax, N)
values = function(*np.meshgrid(x,y,z, indexing='ij'))
# Interpolate:
function_interpolated = RegularGridInterpolator((x, y, z), values)
# tplquad integrates func(z, y, x)
f = lambda z, y, x: function_interpolated([x, y, z])  # the interpolator expects (x, y, z) order
result, error = integrate.tplquad(f, xmin, xmax, lambda _: ymin, lambda _: ymax, lambda *_: zmin, lambda *_: zmax)
In the example above, we get result = 0.12499999999999999 - close enough!
The easiest way to achieve what you are looking for is probably scipy's integration function. Here is your example:
from scipy import integrate
# Make data.
def func(x,y,z):
    return x**y**z
ranges = [[0,1], [0,1], [0,1]]
result, error = integrate.nquad(func, ranges)
Are you aware that the function you created is different from the one that you show in the image? The one you created is an exponential (x^y^z) while the one you are showing is just a product of the three variables. If you want to represent the function in the image, use:
def func(x,y,z):
    return x*y*z
Hope this answers your question, otherwise just write a comment!
Edit:
Misread your post. If you only have the results, and they are not regularly spaced, you would have to figure out some form of interpolation (e.g. linear) and a lookup table. If you do not know how to create that, let me know. The rest of the stated answer can still be used if you define func to return interpolated values from your original data.
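For example, a rough sketch of that idea with hypothetical scattered samples; the linear interpolant acts as the lookup table and nquad integrates it (slowly, but simply):
import numpy as np
from scipy import integrate
from scipy.interpolate import LinearNDInterpolator

rng = np.random.default_rng(0)
pts = rng.random((200, 3))                # hypothetical scattered (x, y, z) samples
vals = pts[:, 0] * pts[:, 1] * pts[:, 2]  # hypothetical measured values

interp = LinearNDInterpolator(pts, vals, fill_value=0.0)

def func(x, y, z):
    return float(interp([[x, y, z]])[0])

result, error = integrate.nquad(func, [[0, 1], [0, 1], [0, 1]])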
The first answer explains the principal approach to handle this nicely. I just wanted to illustrate an alternative way by showing the power of the sklearn package and machine-learning regression.
Doing the meshgrid in 3D gives a very large numpy array,
import numpy as np
N = 5
xmin, xmax = 0, 1
ymin, ymax = 0, 1
zmin, zmax = 0, 1
x = np.linspace(xmin, xmax, N)
y = np.linspace(ymin, ymax, N)
z = np.linspace(zmin, zmax, N)
grid = np.array(np.meshgrid(x,y,z, indexing='ij'))
grid.shape  # (3, 5, 5, 5) -> 3*5*5*5 = 375 numbers
which is visually not very intuitive with 375 numbers (and with different possible indexing conventions, 'ij' or 'xy'). Using regression we can get the same result with only a few input points (15-20).
# building random combinations from (x,y,z)
X = np.random.choice(x, 20)[:,None]
Y = np.random.choice(y, 20)[:,None]
Z = np.random.choice(z, 20)[:,None]
xyz = np.concatenate((X,Y,Z), axis = 1)
data = np.multiply.reduce(xyz, axis = 1)
So the input to the regression is just a 2D numpy array:
xyz.shape  # (20, 3)
with the corresponding data:
data.shape  # (20,)
Now the regression function and integration,
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from scipy import integrate
pipe = Pipeline([('polynomial', PolynomialFeatures(degree=3)), ('model', LinearRegression())])
pipe.fit(xyz, data)
def func(x,y,z):
    return pipe.predict([[x, y, z]])
ranges = [[0,1], [0,1], [0,1]]
result, error = integrate.nquad(func, ranges)
print(result)
0.1257
This approach is useful with limited number of points.
Based on your requirements, it sounds like the most appropriate technique would be Monte Carlo integration:
import numpy as np
import scipy.spatial

# Step 0 - start with some empirical data
observed_points = np.random.uniform(0,1,size=(10000,3))
unknown_fn = lambda x: np.prod(x) # just used to generate fake values
observed_values = np.apply_along_axis(unknown_fn, 1, observed_points)
K = 1000000
# Step 1 - assume that f(x,y,z) can be approximated by an interpolation
# of the data we have (you could get really fancy with the
# selection of interpolation method - we'll stick with straight lines here)
from scipy.interpolate import LinearNDInterpolator
f_interpolate = LinearNDInterpolator(observed_points, observed_values)
# Step 2 randomly sample from within convex hull of observed data
# Step 2a - Uniformly sample from bounding 3D-box of data
lower_bounds = observed_points.min(axis=0)
upper_bounds = observed_points.max(axis=0)
sampled_points = np.random.uniform(lower_bounds, upper_bounds,size=(K, 3))
# Step 2b - Reject points outside of convex hull...
# Luckily, we get a np.nan from LinearNDInterpolator in this case
sampled_values = f_interpolate(sampled_points)
rejected_idxs = np.argwhere(np.isnan(sampled_values))
# Step 2c - Remember accepted values of estimated f(x_i, y_i, z_i)
final_sampled_values = np.delete(sampled_values, rejected_idxs, axis=0)
# Step 3 - Calculate estimate of volume of observed data domain
# Since we sampled uniformly from the convex hull of data domain,
# each point was selected with P(x,y,z)= 1 / Volume of convex hull
volume = scipy.spatial.ConvexHull(observed_points).volume
# Step 4 - Multiply estimated volume of domain by average sampled value
I_hat = volume * final_sampled_values.mean()
print(I_hat)
For a derivation of why this works see this: https://cs.dartmouth.edu/wjarosz/publications/dissertation/appendixA.pdf
So I am trying to plot the nullclines of a system of ODEs; however, I can't seem to plot them in the correct way. When I plot them, I manage to plot them against time (t vs x and t vs y) but not as x vs y. I'm not really sure how to explain it, and I think it would be better to just show it. I am trying to replicate this. The equations and parameters are given; however, this was done in a program called XPP (I'll post these at the bottom), and there are some parameters that I don't understand the meaning of.
My entire code is:
import numpy as np
from scipy import integrate
import matplotlib.pyplot as plt
# define system in terms of a Numpy array
def Sys(X, t=0):
    # here X[0] = x and X[1] = y
    # protein concentration is represented by y, and mRNA concentration by x
    return np.array([(k1*S*Kd**p)/(Kd**p + X[1]**p) - kdx*X[0], ksy*X[0] - (k2*ET*X[1])/(Km + X[1])])
#variables
k1=.1
S=1
Kd=1
kdx=.1
p=2
ksy=1
k2=1
ET=1
Km=1
# generate 100 linearly spaced time points
t = np.linspace(0, 50,100)
# initial values
Sys0 = np.array([1, 0])
#Solves the ODE
X, infodict = integrate.odeint(Sys, Sys0, t, full_output = 1, mxstep = 50000)
#assigns appropriate equations to x and y
x,y = X.T
# plot the graph
fig = plt.figure(figsize=(15,5))
fig.subplots_adjust(wspace = 0.5, hspace = 0.3)
ax1 = fig.add_subplot(1,2,1)
ax1.plot(x, color="blue")
ax1.plot(y, color = 'red')
ax1.set_xlabel("Protien concentration")
ax1.set_ylabel("mRNA concentration")
ax1.set_title("Phase space")
ax1.grid()
The given equations and parameters are:
model for a simple negative feedback loop
protein (y) inhibits the synthesis of its mRNA (x)
dx/dt = k1*S*Kd^p/(Kd^p + y^p) - kdx*x
dy/dt = ksy*x - k2*ET*y/(Km + y)
p k1=0.1, S=1, Kd=1, kdx=0.1, p=2
p ksy=1, k2=1, ET=1, Km=1
# XP=y, YP=x, TOTAL=100, METH=stiff, XLO=0, XHI=4, YLO=0, YHI=1.05 (I don't exactly understand what is going on here)
Again, this uses a program called XPP or WINPP.
Any help with this would be appreciated. The original paper I am trying to replicate this from is: Design principles of biochemical oscillators by Bela Novak and John J. Tyson.
I'd like to plot two profiles through the highest intensity point in a 2D numpy array, which is an image of a blob (i.e. a line through the semi-major axis, and another line through the semi-minor axis). The blob is rotated at an angle theta counterclockwise from the standard x-axis and is asymmetric.
It is a 600x600 array with a max intensity of 1 (at only one pixel) that is located right at the center at (300, 300). The angle rotation from the x-axis (which then gives the location of the semi-major axis when rotated by that angle) is theta = 89.54 degrees. I do not want to use scipy.ndimage.rotate because it uses spline interpolation, and I do not want to change any of my pixel values. But I suppose a nearest-neighbor interpolation method would be okay.
I tried generating lines corresponding to the major and minor axes across the image, but the result was not right at all (the peak was far less than 1), so maybe I did something wrong. The code for this is below:
import numpy as np
import matplotlib.pyplot as plt
from scipy import ndimage
def profiles_at_angle(image, axis, theta):
    theta = np.deg2rad(theta)
    if axis == 'major':
        x_0, y_0 = 0, 300 - 300*np.tan(theta)
        x_1, y_1 = 599, 300 + 300*np.tan(theta)
    elif axis == 'minor':
        x_0, y_0 = 300 - 300*np.tan(theta), 599
        x_1, y_1 = 300 + 300*np.tan(theta), -599
    num = 600
    x, y = np.linspace(x_0, x_1, num), np.linspace(y_0, y_1, num)
    z = ndimage.map_coordinates(image, np.vstack((x, y)))

    fig, axes = plt.subplots(nrows=2)
    axes[0].imshow(image, cmap='gray')
    axes[0].axis('image')
    axes[1].plot(z)
    plt.xlim(250, 350)
    plt.show()
profiles_at_angle(image, 'major', theta)
Did I do something obviously wrong in my code above? Or how else can I accomplish this? Thank you.
Edit: Here are some example images. Sorry for the bad quality; my browser crashed every time I tried uploading them anywhere so I had to take photos of the screen.
Figure 1: This is the result of my code above, which is clearly wrong since the peak should be at 1. I'm not sure what I did wrong though.
Figure 2: I made the plot below by just taking the profiles through the standard x and y axes, ignoring any rotation (this only looks good coincidentally because the real angle of rotation is so close to 90 degrees, so I was able to just switch the labels and get this). I want my result to look something like this, but taking the correct rotation angle into account.
Edit: It could be useful to run tests on this method using data very much like my own (it's a 2D Gaussian with nearly the same parameters):
image = np.random.random((600,600))
def generate(data_set):
    xvec = np.arange(0, np.shape(data_set)[1], 1)
    yvec = np.arange(0, np.shape(data_set)[0], 1)
    X, Y = np.meshgrid(xvec, yvec)
    return X, Y

def gaussian_func(xy, x0, y0, sigma_x, sigma_y, amp, theta, offset):
    x, y = xy
    a = (np.cos(theta))**2/(2*sigma_x**2) + (np.sin(theta))**2/(2*sigma_y**2)
    b = -np.sin(2*theta)/(4*sigma_x**2) + np.sin(2*theta)/(4*sigma_y**2)
    c = (np.sin(theta))**2/(2*sigma_x**2) + (np.cos(theta))**2/(2*sigma_y**2)
    inner = a * (x-x0)**2
    inner += 2*b*(x-x0)*(y-y0)
    inner += c * (y-y0)**2
    return (offset + amp * np.exp(-inner)).ravel()
xx, yy = generate(image)
image = gaussian_func((xx.ravel(), yy.ravel()), 300, 300, 5, 4, 1, 1.56, 0)
image = np.reshape(image, (600, 600))
This should do it for you. You just did not properly compute your lines.
import numpy as np
import scipy.ndimage
import matplotlib.pyplot as plt

theta = 65
peak = np.argwhere(image==1)[0]
x = np.linspace(peak[0]-100,peak[0]+100,1000)
y = lambda x: (x-peak[1])*np.tan(np.deg2rad(theta))+peak[0]
y_maj = np.linspace(y(peak[1]-100),y(peak[1]+100),1000)
y = lambda x: -(x-peak[1])/np.tan(np.deg2rad(theta))+peak[0]
y_min = np.linspace(y(peak[1]-100),y(peak[1]+100),1000)
del y
z_min = scipy.ndimage.map_coordinates(image, np.vstack((x,y_min)))
z_maj = scipy.ndimage.map_coordinates(image, np.vstack((x,y_maj)))
fig, axes = plt.subplots(nrows=2)
axes[0].imshow(image)
axes[0].plot(x,y_maj)
axes[0].plot(x,y_min)
axes[0].axis('image')
axes[1].plot(z_min)
axes[1].plot(z_maj)
plt.show()
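Since the question mentions wanting to avoid spline interpolation: map_coordinates takes an order argument, so a lower-order variant of the sampling lines above would be (just a sketch, reusing the same x, y_min and y_maj arrays):
# order=1 gives bilinear interpolation; order=0 gives nearest-neighbour sampling,
# which leaves the original pixel values untouched.
z_min = scipy.ndimage.map_coordinates(image, np.vstack((x, y_min)), order=1)
z_maj = scipy.ndimage.map_coordinates(image, np.vstack((x, y_maj)), order=0)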
Let's say I have a path in a 2D plane given by a parametrization, for example the Archimedean spiral:
x(φ) = a*φ*cos(φ), y(φ) = a*φ*sin(φ)
I'm looking for a way to discretize this with a numpy array;
the problem is that if I use
a = 1
phi = np.arange(0, 10*np.pi, 0.1)
x = a*phi*np.cos(phi)
y = a*phi*np.sin(phi)
plt.plot(x,y, "ro")
I get a nice curve, but the points are not equally spaced: for growing φ the distance between two points gets larger.
I'm looking for a nice and, if possible, fast way to do this.
It might be possible to get the exact analytical formula for your simple spiral, but I am not in the mood to do that and this might not be possible in a more general case. Instead, here is a numerical solution:
import matplotlib.pyplot as plt
import numpy as np
a = 1
phi = np.arange(0, 10*np.pi, 0.1)
x = a*phi*np.cos(phi)
y = a*phi*np.sin(phi)
dr = (np.diff(x)**2 + np.diff(y)**2)**.5 # segment lengths
r = np.zeros_like(x)
r[1:] = np.cumsum(dr) # integrate path
r_int = np.linspace(0, r.max(), 200) # regular spaced path
x_int = np.interp(r_int, r, x) # interpolate
y_int = np.interp(r_int, r, y)
plt.subplot(1,2,1)
plt.plot(x, y, 'o-')
plt.title('Original')
plt.axis([-32,32,-32,32])
plt.subplot(1,2,2)
plt.plot(x_int, y_int, 'o-')
plt.title('Interpolated')
plt.axis([-32,32,-32,32])
plt.show()
It calculates the length of all the individual segments, integrates the total path with cumsum and finally interpolates to get a regular spaced path. You might have to play with your step-size in phi, if it is too large you will see that the spiral is not a smooth curve, but instead built from straight line segments. Result: