Interpolate latitude, longitude, data in Python (structured grid inside a circle) - python

I have a set of (latitude, longitude) points, each with a data variable, e.g. drive time from an address. These points were created by sampling a structured grid and then cutting out a circle.
As such, I don't think I can store the data as a matrix, because some columns will have more zeros/missing values than others (the top and bottom parts of the circle), which may confuse the algorithm?
Ideally, I would like to fill in the circle with more points, e.g. at 5 decimal places, so that instead of having 51.5454 and 51.5455 I have 51.54540, 51.54541, ..., 51.54550.
My data looks like this:
And I would like to fill in the gaps:
I have tried using:
from scipy.interpolate import RectSphereBivariateSpline
In the following fashion - (test-case), however I am not sure if this is the correct approach in general?
import numpy as np
def geointerp(lats, lons, data, grid_size_deg, mesh=False):
    deg2rad = np.pi/180.
    # np.linspace needs an integer number of samples
    n_lats = int(round(180/grid_size_deg))
    n_lons = int(round(360/grid_size_deg))
    new_lats = np.linspace(50, 51, n_lats)
    new_lons = np.linspace(-1, 1, n_lons)
    new_lats, new_lons = np.meshgrid(new_lats*deg2rad, new_lons*deg2rad)
    #We need to set up the interpolator object
    lut = RectSphereBivariateSpline(lons*deg2rad, lats*deg2rad, data)
    new_lats = new_lats.ravel()
    new_lons = new_lons.ravel()
    data_interp = lut.ev(new_lats,new_lons)
    if mesh:
        data_interp = data_interp.reshape((n_lons, n_lats)).T
    return new_lats/deg2rad, new_lons/deg2rad, data_interp
# Read in-data
import csv
lats_in = []
lons_in = []
data_in = []
with open('interpolation_test.csv') as f:
    for x in csv.reader(f):
        lats_in.append(float(x[0]))
        lons_in.append(float(x[1]))
        data_in.append(float(x[2]))
# Interpolate:
lats_in = np.asarray(lats_in)
lons_in = np.asarray(lons_in)
data_in = np.asarray(data_in)
output_list = geointerp(lats_in, lons_in, data_in, 0.01)
# Output
with open('interpolation_test_out.csv', 'w', newline='') as f:
    w = csv.writer(f)
    for out in output_list:
        w.writerow([out])
Not to mention errors such as:
"if not v.size == r.shape[1]:
IndexError: tuple index out of range"
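One possible way around this error, sketched here under assumptions rather than taken from the post: RectSphereBivariateSpline expects 1-D, strictly increasing coordinate axes plus a full 2-D data array, which a list of scattered rows cut out of a circle does not provide. scipy.interpolate.griddata accepts scattered points directly. The helper name densify_circle, the choice of linear interpolation, and the synthetic demo data are all hypothetical, and the sketch treats degrees as planar coordinates, which is only reasonable over a small area.

```python
import numpy as np
from scipy.interpolate import griddata


def densify_circle(lats, lons, values, step_deg):
    """Interpolate scattered (lat, lon, value) samples onto a finer regular grid.

    Hypothetical helper: treats degrees as planar coordinates, which is only
    a reasonable approximation over a small area.
    """
    lats = np.asarray(lats, dtype=float)
    lons = np.asarray(lons, dtype=float)
    values = np.asarray(values, dtype=float)

    # Finer axes spanning the bounding box of the samples
    new_lats = np.arange(lats.min(), lats.max() + step_deg / 2, step_deg)
    new_lons = np.arange(lons.min(), lons.max() + step_deg / 2, step_deg)
    grid_lat, grid_lon = np.meshgrid(new_lats, new_lons, indexing="ij")

    # Linear interpolation on a triangulation of the scattered samples;
    # grid nodes outside the convex hull of the samples come back as NaN
    interp = griddata((lats, lons), values, (grid_lat, grid_lon), method="linear")

    keep = ~np.isnan(interp)
    return grid_lat[keep], grid_lon[keep], interp[keep]


# Synthetic demo data (made up): a coarse grid cut down to a circle
coarse_lat, coarse_lon = np.meshgrid(
    np.linspace(51.0, 51.1, 11), np.linspace(-0.05, 0.05, 11), indexing="ij"
)
inside = (coarse_lat - 51.05) ** 2 + coarse_lon ** 2 <= 0.05 ** 2
lats_in, lons_in = coarse_lat[inside], coarse_lon[inside]
data_in = 2.0 * lats_in + 3.0 * lons_in  # a plane, so linear interpolation is exact

fine_lat, fine_lon, fine_data = densify_circle(lats_in, lons_in, data_in, 0.002)
```

Dropping the NaN nodes keeps the output roughly circular, because the convex hull of the samples follows the circle. Whether linear interpolation is accurate enough for drive times is a separate question that this sketch does not answer.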

Related

Matplotlib is making positives into negatives

I'm just trying to graph some simple data and whether I try to do it with plot or subplot it comes out the same. All values in my lists are positive but the y axis is acting like a number line with only positives.
import matplotlib.pyplot as plt
xVal = []
yVal1 = []
yVal2 = []
yVal3 = []
data = []
# load data
with open(r"path", 'r') as f:
    data = f.readlines()
yVal1 = data[0].split(",")
yVal2 = data[1].split(",")
yVal3 = data[2].split(",")
del yVal1[-1]
del yVal2[-1]
del yVal3[-1]
print(yVal1)
print(yVal2)
print(yVal3)
# graph dem bois
xVal = [*range(0, len(yVal1))]
'''fig, ax = plt.subplots(3)
ax[0].plot(xVal, yVal1)
ax[0].set_title("pm5")
ax[1].plot(xVal, yVal2)
ax[1].set_title("pm7.5")
ax[2].plot(xVal, yVal3)
ax[2].set_title("pm10")
fig.suptitle("Particulate Levels over time")'''
plt.plot(xVal, yVal3)
plt.show()
As per the comment by Jody Klymak I converted the string lists into float lists and it worked.
fyVal1 = [float(x) for x in yVal1]
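A minimal sketch of why the conversion matters: Matplotlib treats a list of strings as categorical labels, so the axis follows the strings rather than their numeric values. The helper name parse_row and the sample line are made up for illustration; the sorted() calls only show how string ordering differs from numeric ordering.

```python
def parse_row(line):
    """Split one comma-separated line into floats, skipping empty fields
    such as the one left by a trailing comma."""
    return [float(tok) for tok in line.split(",") if tok.strip()]


raw = "9.5,10.25,2.0,\n"          # made-up sample line with a trailing comma
as_strings = raw.split(",")[:-1]  # strings, as in the original code
as_floats = parse_row(raw)        # numbers, ready for plotting

print(sorted(as_strings))  # lexicographic order of the strings
print(sorted(as_floats))   # numeric order
```

Parsing this way also removes the need for the del yVal1[-1] step, since empty fields are skipped.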

xarray: polar pcolormesh with low-overhead axis coordinate transformation

I'm trying to plot a two-dimensional xarray DataArray representing a variable parametrised in polar coordinates. Important: the theta coordinate is in degrees, not radians. The following snippet creates an example data set:
import numpy as np
import xarray as xr
res_theta = 20
thetas = np.arange(0, 360, res_theta)
res_r = 0.1
rs = np.arange(0, 1, res_r)
data = np.random.random((len(thetas), len(rs)))
my_da = xr.DataArray(
data,
coords=(thetas, rs),
dims=("theta", "r"),
)
I would like to plot this data as a polar pcolormesh. I would also like to rely on xarray's plotting routines to benefit from as many features as possible (faceting, plot customisation, etc.). Matplotlib's polar projection assumes that the theta angle is given in radians: if I go for the straightforward solution, I first have to convert my theta coordinates to radians, but I don't want to modify the array in-place. I haven't found a better way than copying the array and converting the copy's theta, like this for instance:
def pcolormesh_polar_expensive(da, *args, **kwargs):
    da_tmp = da.copy() # I'd like to avoid that
    # Get x value
    try:
        x = args[0]
    except IndexError:
        x = da_tmp.dims[0]
    da_tmp[x] = np.deg2rad(da_tmp[x])
    # Reuse any caller-supplied subplot keywords, forcing a polar projection
    subplot_kws = dict(kwargs.pop("subplot_kws", {}))
    subplot_kws["projection"] = "polar"
    return da_tmp.plot.pcolormesh(
        *args,
        subplot_kws=subplot_kws,
        **kwargs
    )
This produces the desired plot:
pcolormesh_polar_expensive(my_da, "theta", "r")
The Actual Problem
However, I would like to avoid duplicating the data: my actual data sets are much larger than that. I did some research and found out about Matplotlib's transformation pipeline, and I have the feeling that I could use it to dynamically insert this transformation into the plotting routines, but I couldn't get anything to work properly so far. Does anybody have an idea of how I could proceed?
Thanks to @kmuehlbauer's suggestion and a careful examination of the xarray.DataArray.assign_coords() docs, I managed to produce exactly what I wanted.
First, I modified my test data to also include unit metadata:
import numpy as np
import xarray as xr
import pint
ureg = pint.UnitRegistry()
res_r = 0.1
rs = np.arange(0, 1, res_r)
res_theta = 20
thetas = np.arange(0, 360, res_theta)
data = np.random.random((len(rs), len(thetas)))
my_da = xr.DataArray(
data,
coords=(rs, thetas),
dims=("r", "theta"),
)
my_da.theta.attrs["units"] = "deg"
Then, I improved the kwargs processing to automate unit conversion and created an extra coordinate associated to the theta dimension:
def pcolormesh_polar_cheap(da, r=None, theta=None, add_labels=False, **kwargs):
    if r is None:
        r = da.dims[0]
    if theta is None:
        theta = da.dims[1]
    try:
        theta_units = ureg.Unit(da[theta].attrs["units"])
    except KeyError:
        theta_units = ureg.rad
    if theta_units != ureg.rad:
        theta_rad = f"{theta}_rad"
        theta_rad_values = ureg.Quantity(da[theta].values, theta_units).to(ureg.rad).magnitude
        da_plot = da.assign_coords(**{theta_rad: (theta, theta_rad_values)})
        da_plot[theta_rad].attrs = da[theta].attrs
        da_plot[theta_rad].attrs["units"] = "rad"
    else:
        theta_rad = theta
        da_plot = da
    kwargs["x"] = theta_rad
    kwargs["y"] = r
    kwargs["add_labels"] = add_labels
    try:
        subplot_kws = kwargs["subplot_kws"]
    except KeyError:
        subplot_kws = {}
    subplot_kws["projection"] = "polar"
    return da_plot.plot.pcolormesh(
        **kwargs,
        subplot_kws=subplot_kws,
    )
A very important point here is that assign_coords() returns a copy of the data array it's called on, and this copy's values still reference the original array, so the only added memory cost is the extra coordinate. Modifying the data array in-place, as suggested by @kmuehlbauer, is straightforward (just replace da_plot = da.assign_coords(...) with da = da.assign_coords(...)).
We then get the same plot (without axis labels, since I changed the defaults so as to hide them):
pcolormesh_polar_cheap(my_da, r="r", theta="theta")
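The memory claim can be checked with a small sketch. It assumes a DataArray backed by a plain NumPy array (no dask or other lazy backend); the variable names are illustrative and pint is left out to keep the example short.

```python
import numpy as np
import xarray as xr

thetas_deg = np.arange(0, 360, 20)
rs = np.arange(0, 1, 0.1)
da = xr.DataArray(
    np.random.random((len(rs), len(thetas_deg))),
    coords=(rs, thetas_deg),
    dims=("r", "theta"),
)

# Attach radians as an extra (non-index) coordinate along the theta dimension
da_rad = da.assign_coords(theta_rad=("theta", np.deg2rad(thetas_deg)))

# True if both objects are backed by the same data buffer
shares = np.shares_memory(da.values, da_rad.values)
print(shares)
# The original array is left without the new coordinate
print("theta_rad" in da.coords, "theta_rad" in da_rad.coords)
```

If shares is True, only the new coordinate (one float per theta value) was allocated.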

Project np.array of points to np.array of segments

I have the following working code to project a single point to every segment in an array.
But I want every point in an array of points to be projected to every segment.
import numpy as np
#find closest segment to single point
#line segment
l1 = np.array([[2,3,0],[7,5,0]])
l2 = np.array([[5,1,0],[8,6,0]])
#point that gets projected
p = np.array([[6,5,0]]) #only single point
#set to origin
line = l2-l1
pv = p-l1
#length of line squared
len_sq = np.sum(line**2, axis = 1) #len_sq = numpy.einsum("ij,ij->i", line, line)
#dot product of 3D vectors with einsum
dot = np.einsum('ij,ij->i',line,pv) #np.sum(line*pv,axis=1)
#percentage of line the pv vector travels in
param = np.array([dot/len_sq])
#param<0 projected point=l1, param>1 pp=l2
clamped_param = np.clip(param,0,1)
#add line fraction to l1 to get projected point
pp = l1+(clamped_param.T*line)
For example, given
p = np.array([[6,5,0],[3,2,0]]) #multiple points
return an np.array() of 4 projected points (every point projected onto every segment).
Maybe you can try something like the following. If project is a function that performs the operation for a single point, then np.apply_along_axis lets you apply it to every point in an array of points. Because project is written as a generator function, each point produces its own generator, and these have to be converted back into a single array with a stacking operation.
l1 = np.array([[2,3,0],[7,5,0]])
l2 = np.array([[5,1,0],[8,6,0]])
line = l2-l1
len_sq = np.sum(line**2, axis = 1)
def project(p):
    pv = p-l1
    dot = np.einsum('ij,ij->i',line,pv)
    param = np.array([dot/len_sq])
    clamped_param = np.clip(param,0,1)
    yield l1+(clamped_param.T*line)
pts = np.array([[6,5,0],
[3,2,0]])
gen = np.apply_along_axis(project, 1, pts)
out = np.hstack([list(G) for G in gen])[0]
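As an alternative to the generator approach, here is a broadcasting sketch that projects every point onto every segment in a single pass. It is not the answer's method; the function name project_points_to_segments and the (N, M, 3) output layout are choices made here for illustration, with l1 holding the segment starts and l2 the segment ends as in the question.

```python
import numpy as np


def project_points_to_segments(points, seg_start, seg_end):
    """Project N points onto M segments; returns an array of shape (N, M, 3)."""
    line = seg_end - seg_start                        # (M, 3) segment vectors
    len_sq = np.einsum("ij,ij->i", line, line)        # (M,) squared lengths
    pv = points[:, None, :] - seg_start[None, :, :]   # (N, M, 3) start-to-point vectors
    dot = np.einsum("nmk,mk->nm", pv, line)           # (N, M) dot products
    param = np.clip(dot / len_sq, 0.0, 1.0)           # clamp to stay on the segment
    return seg_start[None, :, :] + param[..., None] * line[None, :, :]


l1 = np.array([[2, 3, 0], [7, 5, 0]], dtype=float)
l2 = np.array([[5, 1, 0], [8, 6, 0]], dtype=float)
pts = np.array([[6, 5, 0], [3, 2, 0]], dtype=float)

projected = project_points_to_segments(pts, l1, l2)  # projected[i, j]: point i on segment j
flat = projected.reshape(-1, 3)                      # the 4 projected points as rows
```

The intermediate pv array has N*M*3 elements, so for very large inputs a loop over chunks of points may be preferable.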

How to rotate a subplot in matplotlib freely

I am currently writing a program that projects a hologram video on my computer screen. I wrote the code below, but I do not know how to rotate a specific subplot. I created a 3*3 grid of subplots, and I need to rotate subplot 4 by 270 degrees clockwise, subplot 6 by 90 degrees clockwise and subplot 8 by 180 degrees.
My second question is how to get rid of all of the axis labels, so that the projected hologram looks nice and neat.
import pandas as pd
import serial
import numpy as np
import matplotlib.pyplot as plt
ser = serial.Serial("COM5", 115200) # define the serial port that we are communicating to and also the baud rate
plt.style.use('dark_background') #define the black background
plt.ion() # tell pyplot we need live data
fig,[[ax1,ax2,ax3],[ax4,ax5,ax6],[ax7,ax8,ax9]] = plt.subplots(3,3) # plotting a figure with 9 subplot
Xplot = []
Yplot = []
Zplot = []
blankx = []
blanky = []
fig = [ax1,ax2,ax3,ax4,ax5,ax6,ax7,ax8,ax9]
while True: #always looping this sequence
    while(ser.inWaiting()==0): #if no input from the serial, wait and do nothing
        pass
    data = ser.readline() #obtain the input from COM 5
    data_processed = data.decode('utf-8') #to get rid of the unnecessary string part
    data_split = data_processed.split(",") # split the incoming string into a list
    x = float(data_split[0]) #to obtain separate float values for x,y,z
    y = float(data_split[1])
    z = float(data_split[2])
    reset = int(data_split[3]) # reset will output 1
    draw = int(data_split[4]) # draw will output 2
    if(draw == 2):
        Xplot.append(x) #if draw is given instruction, add the x,y,z value into the list to be plot on the graph
        Yplot.append(y)
        Zplot.append(z)
    ax1.plot(blankx,blanky) # subplotting
    ax2.plot(Xplot,Yplot,"ro")
    ax3.plot(blankx,blanky)
    ax4.plot(Xplot,Yplot,"ro")
    ax5.plot(blankx,blanky)
    ax6.plot(Xplot,Yplot,"ro")
    ax7.plot(blankx,blanky)
    ax8.plot(Xplot,Yplot,"ro")
    ax9.plot(blankx,blanky)
    if(reset == 1):
        for f in fig: #if reset is given instruction, clear all figure and clear the elements in the plotting list
            f.clear()
        Xplot = []
        Yplot = []
        Zplot = []
    plt.pause(.000001)
I might have found a solution, though not a perfect one: I use math instead of Matplotlib to rotate the plot, multiplying the coordinates by -1 to flip them about the x and y axes. I have also added a denoiser function to reduce the deviation. Here is the code that I use; if anyone has an idea about how to rotate a subplot freely, please enlighten me.
import pandas as pd
import serial
import matplotlib.pyplot as plt
ser = serial.Serial("COM5", 115200) # define the serial port that we are communicating to and also the baud rate
plt.style.use('dark_background') #define the black background
plt.ion() # tell pyplot we need live data
fig,[[ax1,ax2,ax3],[ax4,ax5,ax6],[ax7,ax8,ax9]] = plt.subplots(3,3) # plotting a figure with 9 subplot
rx = [0]
ry = [0]
rz = [0]
Xplot2 = []
Xplot4 = []
Xplot6 = []
Xplot8 = []
Zplot2 = []
Zplot4 = []
Zplot6 = []
Zplot8 = []
blankx = []
blankz = []
fig = [ax1,ax2,ax3,ax4,ax5,ax6,ax7,ax8,ax9]
def switch(x):
    return x*-1
def denoiser(x):
    return (x[-1] +x[-2])/4
while True: #always looping this sequence
    while(ser.inWaiting()==0): #if no input from the serial, wait and do nothing
        pass
    data = ser.readline() #obtain the input from COM 5
    data_processed = data.decode('utf-8') #to get rid of the unnecessary string part
    data_split = data_processed.split(",") # split the incoming string into a list
    rx.append(float(data_split[0])) #to obtain separate float values for x,y,z
    ry.append(float(data_split[1]))
    rz.append(float(data_split[2]))
    reset = int(data_split[3]) # reset will output 1
    draw = int(data_split[4]) # draw will output 2
    x = denoiser(rx)
    y = denoiser(ry)
    z = denoiser(rz)
    if(draw == 2):
        Xplot8.append(x) #if draw is given instruction, add the x,y,z value into the list to be plot on the graph
        Zplot8.append(z)
        Xplot2.append(switch(x))
        Zplot2.append(switch(z))
        Xplot4.append(x)
        Zplot4.append(switch(z))
        Xplot6.append(switch(x))
        Zplot6.append(z)
    ax1.plot(blankx,blankz) # subplotting
    ax1.axis("off")
    ax2.plot(Xplot2,Zplot2,"ro")
    ax2.axis("off")
    ax3.plot(blankx,blankz)
    ax3.axis("off")
    ax4.plot(Xplot4,Zplot4,"ro")
    ax4.axis("off")
    ax5.plot(blankx,blankz)
    ax5.axis("off")
    ax6.plot(Xplot6,Zplot6,"ro")
    ax6.axis("off")
    ax7.plot(blankx,blankz)
    ax7.axis("off")
    ax8.plot(Xplot8,Zplot8,"ro")
    ax8.axis("off")
    ax9.plot(blankx,blankz)
    ax9.axis("off")
    if(reset == 1):
        for f in fig: #if reset is given instruction, clear all figure and clear the elements in the plotting list
            f.clear()
        Xplot2 = []
        Xplot4 = []
        Xplot6 = []
        Xplot8 = []
        Zplot2 = []
        Zplot4 = []
        Zplot6 = []
        Zplot8 = []
    plt.pause(.000001)
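The sign flips above only produce mirrored copies of the trace. One way to get an arbitrary angle, sketched here as an assumption rather than as a Matplotlib feature for rotating a whole subplot, is to rotate the data itself with a 2-D rotation matrix before plotting. The helper name rotate_points and the sample points are made up.

```python
import numpy as np


def rotate_points(xs, ys, angle_deg):
    """Rotate points counter-clockwise about the origin by angle_deg degrees.

    Pass a negative angle for a clockwise rotation.
    """
    a = np.deg2rad(angle_deg)
    c, s = np.cos(a), np.sin(a)
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    return c * xs - s * ys, s * xs + c * ys


xs, zs = [1.0, 2.0], [0.0, 1.0]          # made-up sample trace
x90, z90 = rotate_points(xs, zs, 90)     # quarter turn counter-clockwise
x180, z180 = rotate_points(xs, zs, 180)  # same as multiplying both by -1
```

Each rotated pair could then be passed to the matching axes, e.g. ax4.plot(x90, z90, "ro"). Which angle and direction each subplot needs depends on how the projection is physically set up.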

How to vectorize a code with python numpy.bincount, using apply along axis

I'm trying to vectorize some code with numpy so that I can run it using multiprocessing, but I can't understand how numpy.apply_along_axis works. This is an example of the code, vectorized using map:
import numpy
from scipy import sparse
import multiprocessing
from matplotlib import pyplot
#first i build a matrix of some x positions vs time datas in a sparse format
matrix = numpy.random.randint(2, size = 100).astype(float).reshape(10,10)
x = numpy.nonzero(matrix)[0]
times = numpy.nonzero(matrix)[1]
weights = numpy.random.rand(x.size)
#then i define an array of y positions
nStepsY = 5
y = numpy.arange(1,nStepsY+1)
#now i build an image using x-y-times coordinates and x-times weights
def mapIt(ithStep):
    ncolumns = 80
    image = numpy.zeros(ncolumns)
    yTimed = y[ithStep]*times
    positions = (numpy.round(x-yTimed)+50).astype(int)
    values = numpy.bincount(positions,weights)
    values = values[numpy.nonzero(values)]
    positions = numpy.unique(positions)
    image[positions] = values
    return image
image = list(map(mapIt, range(nStepsY)))
image = numpy.array(image)
a = pyplot.imshow(image, aspect = 10)
Here is the output plot:
I tried to use numpy.apply_along_axis, but this function only lets me iterate along the rows of image, while I also need to iterate over the ithStep index. E.g.:
#now i build an image using x-y-times coordinates and x-times weights
nrows = nStepsY
ncolumns = 80
matrix = numpy.zeros(nrows*ncolumns).reshape(nrows,ncolumns)
def applyIt(image):
    image = numpy.zeros(ncolumns)
    yTimed = y[ithStep]*times
    positions = (numpy.round(x-yTimed)+50).astype(int)
    values = numpy.bincount(positions,weights)
    values = values[numpy.nonzero(values)]
    positions = numpy.unique(positions)
    image[positions] = values
    return image
imageApplied = numpy.apply_along_axis(applyIt,1,matrix)
a = pyplot.imshow(imageApplied, aspect = 10)
It obviously returns only the first row nrows times, since nothing iterates ithStep:
And here is the wrong plot:
Is there a way to iterate over an index, or to use an index while numpy.apply_along_axis iterates?
Here is the code with only matrix operations: it's considerably faster than map or apply_along_axis, but it uses a lot of memory.
(In this function I use a trick with scipy.sparse, which works more intuitively than numpy arrays when you try to sum numbers on the same element.)
def fullmatrix(nRows, nColumns):
    y = numpy.arange(1,nStepsY+1)
    image = numpy.zeros((nRows, nColumns))
    yTimed = numpy.outer(y,times)
    x3d = numpy.outer(numpy.ones(nStepsY),x)
    weights3d = numpy.outer(numpy.ones(nStepsY),weights)
    y3d = numpy.outer(y,numpy.ones(x.size))
    positions = (numpy.round(x3d-yTimed)+50).astype(int)
    matrix = sparse.coo_matrix((numpy.ravel(weights3d), (numpy.ravel(y3d), numpy.ravel(positions)))).todense()
    return matrix
image = fullmatrix(nStepsY, 80)
a = pyplot.imshow(image, aspect = 10)
This way is simpler and very fast! Thank you so much.
nStepsY = 5
nRows = nStepsY
nColumns = 80
y = numpy.arange(1,nStepsY+1)
image = numpy.zeros((nRows, nColumns))
# all entries go in row 0; positions always has the same size as x
fakeRow = numpy.zeros(x.size, dtype=int)
def itermatrix(ithStep):
    yTimed = y[ithStep]*times
    positions = (numpy.round(x-yTimed)+50).astype(int)
    matrix = sparse.coo_matrix((weights, (fakeRow, positions))).todense()
    matrix = numpy.ravel(matrix)
    missColumns = (nColumns-matrix.size)
    zeros = numpy.zeros(missColumns)
    matrix = numpy.concatenate((matrix, zeros))
    return matrix
for i in numpy.arange(nStepsY):
    image[i] = itermatrix(i)
#or, without initialization of image:
imageMapped = list(map(itermatrix, range(nStepsY)))
imageMapped = numpy.array(imageMapped)
It feels like attempting to use map or apply_along_axis is obscuring the essentially iterative nature of the problem.
I rewrote your code as an explicit loop on y:
nStepsY = 5
y = numpy.arange(1,nStepsY+1)
image = numpy.zeros((nStepsY, 80))
for i, yi in enumerate(y):
    yTimed = yi*times
    positions = (numpy.round(x-yTimed)+50).astype(int)
    values = numpy.bincount(positions,weights)
    values = values[numpy.nonzero(values)]
    positions = numpy.unique(positions)
    image[i, positions] = values
a = pyplot.imshow(image, aspect = 10)
pyplot.show()
Looking at the code, I think I could calculate positions for all y values making a (y.shape[0],times.shape[0]) array. But the rest, the bincount and unique still have to work row by row.
apply_along_axis when working with a 2d array, and axis=1 essentially does:
res = np.zeros_like(arr)
for i in range(arr.shape[0]):
    res[i,:] = func1d(arr[i,:])
If the input array has more dimensions it constructs a more elaborate indexing object [i,j,k,:]. And it can handle cases where func1d returns a different size array than the input. But in any case it is just a generalized iteration tool.
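A small sketch of that equivalence, using a made-up func1d (first_and_last) whose output is shorter than its input, so the result has a different shape from the input array:

```python
import numpy as np


def first_and_last(row):
    """Example func1d whose output (length 2) differs in size from its input."""
    return np.array([row[0], row[-1]])


arr = np.arange(12).reshape(3, 4)

# Explicit iteration: with axis=1, func1d receives one row at a time
looped = np.empty((arr.shape[0], 2), dtype=arr.dtype)
for i in range(arr.shape[0]):
    looped[i, :] = first_and_last(arr[i, :])

# The same result through the generic iteration helper
applied = np.apply_along_axis(first_and_last, 1, arr)
```

Both arrays have shape (3, 2) here, while arr has shape (3, 4).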
Moving the initial positions creation outside the loop:
yTimed = y[:,None]*times
positions = (numpy.round(x-yTimed)+50).astype(int)
image = numpy.zeros((positions.shape[0], 80))
for i, pos in enumerate(positions):
    values = numpy.bincount(pos,weights)
    values = values[numpy.nonzero(values)]
    pos = numpy.unique(pos)
    image[i, pos] = values
Now I can cast this as an apply_along_axis problem, with an applyIt that takes a positions vector (with all the yTimed information) rather than blank image vector.
def applyIt(pos, size, weights):
    acolumn = numpy.zeros(size)
    values = numpy.bincount(pos,weights)
    values = values[numpy.nonzero(values)]
    pos = numpy.unique(pos)
    acolumn[pos] = values
    return acolumn
image = numpy.apply_along_axis(applyIt, 1, positions, 80, weights)
Timing wise I expect it's a bit slower than my explicit iteration. It has to do more setup work, including a test call applyIt(positions[0,:],...) to determine the size of its return array (i.e image has different shape than positions.)
def csrmatrix(y, times, x, weights):
    yTimed = numpy.outer(y,times)
    n=y.shape[0]
    x3d = numpy.outer(numpy.ones(n),x)
    weights3d = numpy.outer(numpy.ones(n),weights)
    y3d = numpy.outer(y,numpy.ones(x.size))
    positions = (numpy.round(x3d-yTimed)+50).astype(int)
    #print(y.shape, weights3d.shape, y3d.shape, positions.shape)
    matrix = sparse.csr_matrix((numpy.ravel(weights3d), (numpy.ravel(y3d), numpy.ravel(positions))))
    #print(repr(matrix))
    return matrix
# one call
image = csrmatrix(y, times, x, weights)
# iterative call
alist = []
for yi in numpy.arange(1,nStepsY+1):
    alist.append(csrmatrix(numpy.array([yi]), times, x, weights))
def mystack(alist):
    # concatenate without offset
    row, col, data = [],[],[]
    for A in alist:
        A = A.tocoo()
        row.extend(A.row)
        col.extend(A.col)
        data.extend(A.data)
    print(len(row),len(col),len(data))
    return sparse.csr_matrix((data, (row, col)))
vimage = mystack(alist)
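The sparse constructors sum the weights of duplicate (row, column) entries. The same accumulation can be sketched in plain NumPy by flattening each (row, column) pair into a single index and calling bincount once. This is an alternative not taken from the posts above; the function name accumulate_image, its offset parameter, and the random demo data are assumptions.

```python
import numpy as np


def accumulate_image(x, times, weights, y, ncolumns, offset=50):
    """Sum weights into a (len(y), ncolumns) image with one bincount call."""
    # One row of column positions per y value, via broadcasting
    positions = (np.round(x - y[:, None] * times) + offset).astype(int)
    if positions.min() < 0 or positions.max() >= ncolumns:
        raise ValueError("positions fall outside the image width")
    nrows = positions.shape[0]
    # Encode each (row, column) pair as a single index into the flattened image
    flat = (np.arange(nrows)[:, None] * ncolumns + positions).ravel()
    flat_weights = np.broadcast_to(weights, positions.shape).ravel()
    image = np.bincount(flat, weights=flat_weights, minlength=nrows * ncolumns)
    return image.reshape(nrows, ncolumns)


# Made-up data shaped like the example in the question
rng = np.random.default_rng(0)
x, times = np.nonzero(rng.integers(0, 2, size=(10, 10)))
weights = rng.random(x.size)
y = np.arange(1, 6)

image = accumulate_image(x, times, weights, y, ncolumns=80)
```

The positions array holds len(y) * x.size integers, so this keeps the memory trade-off of the full-matrix version while avoiding the dense conversion of a sparse matrix.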
