Space-varying vertical coordinates of model data: writing a netCDF file (xarray) - Python

Consider using a numerical flow model to simulate a simple 1D advection-diffusion case in hydraulic engineering, e.g. the evolution of salt concentration (Cs). The domain has no y-dimension, only x and z dimensions, meaning that the flow is not depth-averaged. For one time step, I have the salt concentration:
Cs = Cs(x, z)
where x is the horizontal space coordinate (equally spaced vector) and z is the vertical coordinate (non-equally spaced vector),
or, more generally, including time:
Cs = Cs(x, z, t)
Now, x is constant over time and space (the x-grid does not move back and forth), but z (i.e. the vertical coordinate of each layer) adapts and oscillates up and down.
For each time step, the model outputs the numerical value of the salt concentration and the vertical coordinate of each layer (i.e. at each given depth of the fluid). In Python I could easily merge the layers and contour the salinity over x and z by stacking each z level on top of the previous one.
The coordinate matrix in Python therefore has as many rows as there are vertical layers and as many columns as there are x grid points. This way, the coordinate and salinity matrices are consistent in shape and can be plotted (contours).
My question now is:
In Python, I want to use xarray to write a netCDF file that I can read into ParaView. I got started and created a netCDF file containing (for now) only one time step, loaded it into ParaView and plotted it.
However, my problem is that z = z(x), which means that the height of each level varies along the horizontal coordinate. So far, I could only use an equally spaced vertical vector to describe the vertical coordinate, and that vector does not depend on x, so it does not vary along the horizontal coordinate.
How can I achieve my goal?
import numpy as np
import xarray as xr

nx = np.shape(Xp)[1]
nz = Nvert+1
# X vector is linearly spaced, constant dx, and it is OK like that!
xmin = 0
xmax = 10
X = np.linspace(xmin, xmax, nx)
# Z is equally spaced too, but it should NOT be! It might vary over the vertical AND/or over X!
zmin = 0
zmax = 0.5
Z = np.linspace(zmin, zmax, nz)
# merge quantity "qname", at timestep 0, given Nvert vertical layers
q0 = merge_quant(Nvert, t[0], qname)
# create two coordinate matrices, xk and zk, consistent in size with q0
xk, zk = merge_xnadz(Nvert, t[0])
# assign:
vals = q0
# preallocate data using xarray
ds = xr.Dataset(
    {qname: (("z", "x"), vals)},
    coords={
        "x": X,
        "z": Z  # this should vary along x but is constant instead
    }
)
# save to disk
ds.to_netcdf("Testout.nc")
How can I add another dimension? Could I just add the matrix of coordinates in xarray before saving to netCDF? That would really solve my problem.
Any help is appreciated.
Thank you!
Marco
Edit:
Two images: one of the model results and one obtained in ParaView.
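One possible way to carry z = z(x) in the file is sketched below: keep a plain integer "layer" dimension and attach the layer heights as a two-dimensional, non-index coordinate. The grid sizes, the sinusoidal depth and the placeholder concentration field are all made up for illustration; in the question's code the arrays zk and q0 would take the place of zk and cs. Whether ParaView uses the 2D coordinate to warp the mesh depends on the reader it is opened with, so treat that part as something to verify.

```python
import numpy as np
import xarray as xr

nx, nz = 6, 4                                   # illustrative grid sizes
x = np.linspace(0.0, 10.0, nx)

# Hypothetical layer heights: the water depth varies along x and the
# layers stretch with it, so every layer has its own z at every x.
depth = 0.5 + 0.1 * np.sin(2.0 * np.pi * x / 10.0)      # shape (nx,)
sigma = np.linspace(0.0, 1.0, nz)                        # shape (nz,)
zk = sigma[:, None] * depth[None, :]                     # shape (nz, nx)

cs = np.exp(-zk)                                 # placeholder concentration

ds = xr.Dataset(
    data_vars={"Cs": (("layer", "x"), cs)},
    coords={
        "x": ("x", x),                           # 1D index coordinate
        "layer": ("layer", np.arange(nz)),       # plain layer index
        "z": (("layer", "x"), zk),               # 2D coordinate: z = z(layer, x)
    },
)
print(ds)
```

Calling ds.to_netcdf("Testout.nc") on this dataset writes z as its own two-dimensional variable next to Cs. For several time steps the same pattern extends to dimensions ("time", "layer", "x") for both Cs and z.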

Related

Interpolation onto a 3d grid from 3 different pairs of points and values

I have three 3D images with me, each representing one of the orthogonal views. I know the physical x,y,z locations on which each of the images are placed.
Let X1 = {(x1,y1,z1)} represent the set of physical coordinate tuples for one of the images and for which I know the corresponding intensity values I1. There are N tuples in X1 and hence, N intensity values. Similarly, I have access to X2, I2, and X3,I3 which are for the other two images. There are N tuples in X2 and X3 as well.
I want to estimate the volume that comes from interpolating information from all the views. I know the physical coordinates Xq for the final volume as well.
For example:
# Let image_matrix1, image_matrix2, and image_matrix3 represent the three
# volumes (matrices with intensity values)
#for image/view 1
xs1 = np.linspace(-5,5,100)
ys1 = np.linspace(-5,5,100)
zs1 = np.linspace(-2,2,20)
#for image/view 2
xs2 = np.linspace(-5,5,100)
ys2 = np.linspace(-5,5,100)
zs2 = np.linspace(-2,2,20)
#for image/view 3
xs3 = np.linspace(-5,5,100)
ys3 = np.linspace(-5,5,100)
zs3 = np.linspace(-2,2,20)
#the following will not work, but this is close to what i want to achieve.
xs = [xs1,xs2,xs3]
ys = [ys1,ys2,ys3]
zs = [zs1,zs2,zs3]
points = (xs,ys,zs)
values = [image_matrix1,image_matrix2,image_matrix3]
query = (3.4,2.2,5.2) # the physical point at which i want to know the value
value_at_query = interpolating_function(points,values,query)
#the following will work, but this is for one image only
points = (xs1,ys1,zs1) #modified to take coords of one image only
values = [image_matrix1] #modified to take values of one image only
query = (3.4,2.2,5.2) # the physical point at which i want to know the value
value_at_query = interpolating_function(points,values,query)
Please help.
It doesn't make sense to me to interpolate between the three volumes (as a fourth dimension) as I understand the problem. The volumes are not like a fourth dimension in that they don't lie on a continuous axis that you can interpolate at a specified value.
You could interpolate the views separately and then calculate an aggregate value from the results by a suitable metric (average, quadratic average, min/max, etc.).
value_at_query = suitable_aggregate_metric(
    interpolating_function(points1, [image_matrix1], query),
    interpolating_function(points2, [image_matrix2], query),
    interpolating_function(points3, [image_matrix3], query)
)
Considering the extrapolation, you could use a weight-matrix for each image. This weight-matrix would enclose the whole outer cube (128x128x128) with a weight-value of one in the region where it intersects with the image (128x128x10) and decaying to zero towards the outside (probably a strong decay like quadratic/cubic or higher order works better than linear). You then interpolate each image for an intensity-value and a weight-value and then calculate a weighted intensity-average.
The reason for my suggestion is that if you probe, e.g., at location (4, 4, 2.5), you have to extrapolate on every image, but you would want to weight the third image highest, as the probe is much closer to its known values and the result is thus more reliable. A higher-order decay further increases the relative weight of the closer values.
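The weighted-average idea can be sketched as follows. This is only an illustration under several assumptions: the helper names box_weight and combine_views are hypothetical, the weight is computed from the distance to each view's bounding box rather than interpolated from a stored weight-matrix, and the decay 1 / (1 + (d / decay_length)**order) is just one possible choice of a higher-order falloff. Linear extrapolation outside each view is done with scipy's RegularGridInterpolator using bounds_error=False and fill_value=None.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

def box_weight(axes, query, decay_length=1.0, order=3):
    """Weight 1 inside the sampled box, decaying with distance outside it."""
    q = np.asarray(query, dtype=float)
    lo = np.array([a[0] for a in axes])
    hi = np.array([a[-1] for a in axes])
    # Distance from the query point to the box (0 if the point is inside)
    d = np.linalg.norm(np.maximum(0.0, np.maximum(lo - q, q - hi)))
    return 1.0 / (1.0 + (d / decay_length) ** order)

def combine_views(views, query):
    """Weighted average of per-view (linearly extrapolated) interpolations."""
    num, den = 0.0, 0.0
    for axes, volume in views:
        interp = RegularGridInterpolator(axes, volume,
                                         bounds_error=False, fill_value=None)
        value = interp(np.atleast_2d(query))[0]
        w = box_weight(axes, query)
        num += w * value
        den += w
    return num / den

def make_view(xs, ys, zs):
    """Synthetic view sampling the same linear field f = x + 2y + 3z."""
    X, Y, Z = np.meshgrid(xs, ys, zs, indexing="ij")
    return (xs, ys, zs), X + 2.0 * Y + 3.0 * Z

wide, thin = np.linspace(-5, 5, 11), np.linspace(-2, 2, 5)
views = [make_view(wide, wide, thin),    # thin along z
         make_view(wide, thin, wide),    # thin along y
         make_view(thin, wide, wide)]    # thin along x

query = (3.4, 2.2, 4.0)
value_at_query = combine_views(views, query)
```

Because the synthetic field is linear, every view extrapolates it exactly and the weighted average equals the true value; with real images the weights decide which view dominates at a given query point.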

Calculate Divergence of Velocity Field (3D) in Python

I am trying to calculate the divergence of a 3D velocity field in a multi-phase flow setting (with solids immersed in a fluid). If we assume u,v,w to be the three velocity components (each a n x n x n) 3D numpy array, here is the function I have for calculating divergence:
import numpy as np

dim = 80  # grid points per side (80x80x80 field, see the comment in the function)

def calc_divergence_velocity(df, h=0.025):
    """
    :param df: A dataframe with the entire vector field with columns [x,y,z,u,v,w] with
               x,y,z indicating the 3D coordinates of each point in the field and u,v,w
               the velocities in the x,y,z directions respectively.
    :param h: This is the dimension of a single side of the 3D (uniform) grid. Used
              as input to numpy.gradient() function.
    """
    # Reshape dataframe columns to get 3D numpy arrays (dim = 80) so each u,v,w is a
    # 80x80x80 ndarray.
    u = df['u'].values.reshape((dim, dim, dim))
    v = df['v'].values.reshape((dim, dim, dim))
    w = df['w'].values.reshape((dim, dim, dim))

    # Supply x,y,z coordinates appropriately.
    # Note: Only a scalar `h` has been supplied to np.gradient because
    # the type of grid we are dealing with is a uniform grid with each
    # grid cell having the same dimensions in x,y,z directions.
    u_grad = np.gradient(u, h, axis=0)  # central diff. du_dx
    v_grad = np.gradient(v, h, axis=1)  # central diff. dv_dy
    w_grad = np.gradient(w, h, axis=2)  # central diff. dw_dz

    # The `mask` column in the dataframe is a binary column indicating the locations
    # in the field where we are interested in measuring divergence.
    # The problem I am looking at is multi-phase flow with solid particles and a fluid,
    # hence we are only interested in the fluid locations.
    sdf = df['mask'].values.reshape((dim, dim, dim))
    div = (u_grad*sdf) + (v_grad*sdf) + (w_grad*sdf)
    return div
The problem I'm having is that the divergence values I am seeing are far too high.
For example, the image below shows a distribution with values in [-350, 350], whereas most values should be close to zero, somewhere in [-20, 20] in my case. This tells me I'm calculating the divergence incorrectly, and I would like some pointers on how to correct the above function to calculate the divergence appropriately. As far as I can tell (please correct me if I'm wrong), I think I have done something similar to this upvoted SO response. Thanks in advance!
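One check that may help narrow this down is to run the same np.gradient calls on a field whose divergence is known analytically. The sketch below is not a diagnosis of the function above; it only illustrates two assumptions that function relies on: that after the reshape axis 0 really corresponds to x, axis 1 to y and axis 2 to z, and that h is the spacing between neighbouring grid points. The helper name divergence and the grid size are illustrative.

```python
import numpy as np

def divergence(u, v, w, h):
    """Central-difference divergence; assumes axis 0 = x, axis 1 = y, axis 2 = z."""
    return (np.gradient(u, h, axis=0)
            + np.gradient(v, h, axis=1)
            + np.gradient(w, h, axis=2))

n, h = 16, 0.025
c = np.arange(n) * h

# indexing="ij": x varies along axis 0, y along axis 1, z along axis 2
x, y, z = np.meshgrid(c, c, c, indexing="ij")
div_zero = divergence(x, y, -2.0 * z, h)     # analytic divergence: 0
div_three = divergence(x, y, z, h)           # analytic divergence: 3

# Default indexing="xy": x now varies along axis 1 and y along axis 0,
# so the same call differentiates u and v along the wrong axes.
x_xy, y_xy, z_xy = np.meshgrid(c, c, c)
div_swapped = divergence(x_xy, y_xy, z_xy, h)   # gives 1 instead of 3
```

If the dataframe rows are ordered so that the reshape puts x on a different axis than assumed, the real data goes through the same mix-up as the last case, which produces derivatives of the wrong component along each axis.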

k-space vector for N-body simulation box DFTs

I'm trying to write a particle mesh N-body simulation. In such a simulation the potential field is found by solving Poisson's equation using Fourier transforms. I have been following a presentation by Andrey Kravtsov (http://astro.uchicago.edu/~andrey/talks/PM/pm.pdf), but slide 15 has me confused. So far, I have assigned densities to a 3D grid from particle positions, and Fourier transformed the density grid. The next step is to calculate Green's function in Fourier space, multiply it with the Fourier transformed density grid, and then apply an inverse Fourier transform back to real space to obtain the potential grid. Through trial and error I traced the part that wasn't working correctly to the potential calculation, and specifically the k-space vector.
So, to calculate Green's function in Fourier space I need the Fourier axes, usually called k-space vectors k_x, k_y, k_z. According to the slide it should be 2*pi*(k,l,m)/N_g for components k,l,m, where N_g is the number of grid cells. So far I've tried these components running over 0, +1, +2, ..., N_g, over -N_particle/2, ..., +N_particle/2, and several other variations. The only thing that has produced reasonable results (I can see a cluster in a density slice projected on the same potential field slice) has been using numpy.fft.fftfreq in Python for specific values of the resolution/sample spacing. However, any resolution I chose (such as L/N_g, N_p/N_g, 2pi/N_g, etc.) did not scale properly with box size L, number of grid cells or number of particles, and no longer worked for, e.g., a larger number of grid cells.
My question is:
How do I define my k-space vectors (i.e. the Fourier axes in reciprocal space) for a simulation with, along one direction, box size L, number of grid cells N_g and number of particles N_p?
I should add that the particle positions and velocities are all in code units as defined in the first few slides.
Minimum working example:
#!/usr/bin/env python3
import numpy as np
import matplotlib.pyplot as plt
M = 30 #Number of particles in 1 direction
Mn = 90 #Number of grid cells in 1 direction
Lx = 10 #grid physical size
u = np.random.random(M*M*M)
v = np.random.random(M*M*M)
w = np.random.random(M*M*M)
#Have purposefully taken smaller cube, to show potential works
planex = M*u
planey = M*v
planez = M*w
#Create a new grid
grid = np.zeros([Mn,Mn,Mn], dtype='cfloat')
#cell center coordinates
x_c = np.floor(planex).astype(int)%Mn
y_c = np.floor(planey).astype(int)%Mn
z_c = np.floor(planez).astype(int)%Mn
# mass in units of the average density of the universe; doesn't matter for the
# example
mass = 1.
#Update the grid
grid[z_c,y_c,x_c] += mass
fig = plt.figure()
ax = fig.add_subplot(111)
plt.imshow(grid[:,:,2].real)
plt.show()
#FFT the grid
grid = np.fft.fftn(grid)
#The resolution and the k-space vectors are the parts I am unsure about
resolution = np.pi*2/(M/Mn)
resolution = Lx/Mn
#Define the k-space vectors
k_x = np.fft.fftfreq(Mn, resolution)
k_y = np.fft.fftfreq(Mn, resolution)
k_z = np.fft.fftfreq(Mn, resolution)
kz, ky, kx = np.meshgrid(k_z, k_y, k_x)
Omega_0 = 0.27
a = 0.3
#Calculate Greens function
k_squared = np.sin(kz/2)**2 + np.sin(ky/2)**2 + np.sin(kx/2)**2
Greens = -3*Omega_0/8/a*np.divide(1, k_squared, where=k_squared!=0)
#Multiply the grids in Fourier space
grid = Greens*grid
#IFFT to real space
potentials = np.fft.ifftn(grid)
fig1 = plt.figure()
ax1 = fig1.add_subplot(111)
plt.imshow(potentials[:,:,0].real)
plt.show()
A large value for the resolution makes the velocities explosive, and a small value gives very small velocities. So what is the right resolution?
This is my first time asking on Stack Overflow, so please let me know if I'm doing something wrong.
Best, R.
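For reference, here is a sketch of how the Fourier axes are commonly built with numpy.fft.fftfreq; the values of Ng and L are illustrative. fftfreq(n, d) returns frequencies in cycles per unit of d, so multiplying by 2*pi gives angular wavenumbers: with d=1 this reproduces 2*pi*(k,l,m)/N_g in grid units, and with d=L/N_g it gives wavenumbers in physical units. The assumption made here is that a Green's function written with sin(k/2)**2, as in the code above, expects the grid-unit form. The number of particles does not enter the definition of the axes in either case. The 1D check compares a direct second difference with its Fourier-space counterpart -4*sin(k/2)**2.

```python
import numpy as np

Ng = 16        # number of grid cells along one direction (illustrative)
L = 10.0       # physical box size along that direction (illustrative)

# Dimensionless wavenumbers in grid units: 2*pi*m/Ng with
# m = 0, 1, ..., Ng/2 - 1, -Ng/2, ..., -1 (the FFT ordering)
k_grid = 2.0 * np.pi * np.fft.fftfreq(Ng)

# Physical wavenumbers for cell size L/Ng
k_phys = 2.0 * np.pi * np.fft.fftfreq(Ng, d=L / Ng)

# 1D check: the periodic second-difference operator f[i+1] - 2 f[i] + f[i-1]
# acts in Fourier space as multiplication by -4 sin^2(k/2), with k in grid units.
rng = np.random.default_rng(0)
f = rng.random(Ng)
lap_direct = np.roll(f, -1) - 2.0 * f + np.roll(f, 1)
lap_fft = np.fft.ifft(-4.0 * np.sin(k_grid / 2.0) ** 2 * np.fft.fft(f)).real

print(np.allclose(lap_direct, lap_fft))
```

In 3D the same one-dimensional array can be passed three times to np.meshgrid with indexing="ij" so that the axis order of the wavenumber grids matches the axis order of the transformed density grid.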

Python Mayavi - set size of scatter point

I have a very simple script with which I'm trying to plot 2 points with a set size:
from mayavi.mlab import *
x = [0.,3.]
y = [0.,0.]
z = [0.,0.]
scalars = [1.5,1.5]
pts = points3d(x, y, z, scalars, scale_factor = 1)
However, I can't figure out how, with this simple example, to set the size of the two points so that the points just touch each other. I want to set the size in the same units as the coordinates of the points. So I separate the two points by 3 units and set the size of the two points to 1.5.
However, in the image attached, the two points don't touch like expected.
Any idea why?
In mayavi, the scale of spheres determines their diameter and not their radius.
Use
pts = points3d(x, y, z, scalars, scale_factor=2, resolution=100)
The resolution argument makes a smoother sphere (it sets the number of angular points). Beware of high resolution values if you intend to display many spheres.

Fast 3D interpolation of atmospheric data in Numpy/Scipy

I am trying to interpolate 3D atmospheric data from one vertical coordinate to another using Numpy/Scipy. For example, I have cubes of temperature and relative humidity, both of which are on constant, regular pressure surfaces. I want to interpolate the relative humidity to constant temperature surface(s).
The exact problem I am trying to solve has been asked previously here, however, the solution there is very slow. In my case, I have approximately 3M points in my cube (30x321x321), and that method takes around 4 minutes to operate on one set of data.
That post is nearly 5 years old. Do newer versions of Numpy/Scipy perhaps have methods that handle this faster? Maybe new sets of eyes looking at the problem have a better approach? I'm open to suggestions.
EDIT:
Slow = 4 minutes for one set of data cubes. I'm not sure how else I can quantify it.
The code being used...
import numpy as np
from scipy import interpolate

def interpLevel(grid, value, data, interp='linear'):
    """
    Interpolate 3d data to a common z coordinate.

    Can be used to calculate the wind/pv/whatsoever values for a common
    potential temperature / pressure level.

    grid : numpy.ndarray
        The grid. For example the potential temperature values for the whole 3d
        grid.
    value : float
        The common value in the grid, to which the data shall be interpolated.
        For example, 350.0
    data : numpy.ndarray
        The data which shall be interpolated. For example, the PV values for
        the whole 3d grid.
    interp : str
        This indicates which kind of interpolation will be done. It is directly
        passed on to scipy.interpolate.interp1d().
    returns : numpy.ndarray
        A 2d array containing the *data* values at *value*.
    """
    ret = np.zeros_like(data[0,:,:])
    for yIdx in range(grid.shape[1]):
        for xIdx in range(grid.shape[2]):
            # check if we need to flip the column
            if grid[0,yIdx,xIdx] > grid[-1,yIdx,xIdx]:
                ind = -1
            else:
                ind = 1
            f = interpolate.interp1d(grid[::ind,yIdx,xIdx],
                                     data[::ind,yIdx,xIdx],
                                     kind=interp)
            ret[yIdx,xIdx] = f(value)
    return ret
EDIT 2:
I could share npy dumps of sample data, if anyone was interested enough to see what I am working with.
Since this is atmospheric data, I imagine that your grid does not have uniform spacing; however if your grid is rectilinear (such that each vertical column has the same set of z-coordinates) then you have some options.
For instance, if you only need linear interpolation (say for a simple visualization), you can just do something like:
# Find nearest grid point
idx = grid[:,0,0].searchsorted(value)
upper = grid[idx,0,0]
lower = grid[idx - 1, 0, 0]
s = (value - lower) / (upper - lower)
result = (1-s) * data[idx - 1, :, :] + s * data[idx, :, :]
(You'll need to add checks for value being out of range, of course.) For a grid your size, this will be extremely fast (as in tiny fractions of a second).
You can pretty easily modify the above to perform cubic interpolation if need be; the challenge is in picking the correct weights for non-uniform vertical spacing.
The problem with using scipy.ndimage.map_coordinates is that, although it provides higher order interpolation and can handle arbitrary sample points, it assumes that the input data is uniformly spaced. It will still produce smooth results, but it won't be a reliable approximation.
If your coordinate grid is not rectilinear, so that the z-value for a given index changes for different x and y indices, then the approach you are using now is probably the best you can get without a fair bit of analysis of your particular problem.
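If linear interpolation is enough, the per-column loop can in principle be replaced by whole-array operations even when every column has its own z-values. The sketch below is a swapped-in NumPy-only technique, not the method from the question: it assumes that grid increases strictly monotonically along axis 0 in every column (flip both arrays along axis 0 first if it decreases), and the function name interp_to_level is hypothetical. Columns where value lies outside the range of grid are set to NaN.

```python
import numpy as np

def interp_to_level(grid, data, value):
    """Linearly interpolate data(z, y, x) onto the surface where grid == value.

    Assumes grid is strictly increasing along axis 0 in every column.
    """
    nz = grid.shape[0]
    in_range = (value >= grid[0]) & (value <= grid[-1])
    # Index of the first level at or above `value`, per column
    idx = (grid < value).sum(axis=0)
    idx = np.clip(idx, 1, nz - 1)[np.newaxis]            # shape (1, ny, nx)
    g_lo = np.take_along_axis(grid, idx - 1, axis=0)[0]
    g_hi = np.take_along_axis(grid, idx, axis=0)[0]
    d_lo = np.take_along_axis(data, idx - 1, axis=0)[0]
    d_hi = np.take_along_axis(data, idx, axis=0)[0]
    s = (value - g_lo) / (g_hi - g_lo)                   # fractional position
    out = (1.0 - s) * d_lo + s * d_hi
    return np.where(in_range, out, np.nan)

# Synthetic example: every column has its own, differently stretched levels
rng = np.random.default_rng(1)
stretch = 1.0 + rng.random((3, 4))                       # per-column factor
grid = np.linspace(0.0, 1.0, 5)[:, None, None] * stretch # shape (5, 3, 4)
data = 2.0 * grid + 1.0                                  # linear in the grid
result = interp_to_level(grid, data, 0.5)
```

Because the synthetic data is linear in the grid variable, the interpolated plane should equal 2 * 0.5 + 1 everywhere, which makes the sketch easy to check before applying it to real cubes.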
UPDATE:
One neat trick (again, assuming that each column has the same, not necessarily regular, coordinates) is to use interp1d to extract the weights, as follows:
from scipy.interpolate import interp1d
NZ = grid.shape[0]
zs = grid[:,0,0]
ident = np.identity(NZ)
weight_func = interp1d(zs, ident, 'cubic')
You only need to do the above once per grid; you can even reuse weight_func as long as the vertical coordinates don't change.
When it comes time to interpolate then, weight_func(value) will give you the weights, which you can use to compute a single interpolated value at (x_idx, y_idx) with:
weights = weight_func(value)
interp_val = np.dot(data[:, x_idx, y_idx], weights)
If you want to compute a whole plane of interpolated values, you can use np.inner, although since your z-coordinate comes first, you'll need to do:
result = np.inner(data.T, weights).T
Again, the computation should be practically immediate.
This is quite an old question, but the best way to do this nowadays is to use MetPy's interpolate_1d function:
https://unidata.github.io/MetPy/latest/api/generated/metpy.interpolate.interpolate_1d.html
There is a new implementation of Numba accelerated interpolation on regular grids in 1, 2, and 3 dimensions:
https://github.com/dbstein/fast_interp
Usage is as follows:
from fast_interp import interp2d
import numpy as np
nx = 50
ny = 37
xv, xh = np.linspace(0, 1, nx, endpoint=True, retstep=True)
yv, yh = np.linspace(0, 2*np.pi, ny, endpoint=False, retstep=True)
x, y = np.meshgrid(xv, yv, indexing='ij')
test_function = lambda x, y: np.exp(x)*np.exp(np.sin(y))
f = test_function(x, y)
test_x = -xh/2.0
test_y = 271.43
fa = test_function(test_x, test_y)
interpolater = interp2d([0,0], [1,2*np.pi], [xh,yh], f, k=5, p=[False,True], e=[1,0])
fe = interpolater(test_x, test_y)
