How to get equally spaced grid points in an irregularly shaped figure? - python

I have an irregularly shaped image and I want to get equally spaced grid points inside that.
For example, this is the image I have: [image]
I am thinking of using OpenCV to get the corner coordinates, and that is easy. But I do not know how to pass all the corner coordinates, or how to divide my shape into identifiable geometric shapes and then do this.
Right now, I have hard coded the coordinates and created a function to pass the coordinates.
import numpy as np
import matplotlib.pyplot as plt
import functools
def gridFunc(arr):
    center = np.mean(arr, axis=0)
    x = np.arange(min(arr[:, 0]), max(arr[:, 0]) + 0.04, 0.4)
    y = np.arange(min(arr[:, 1]), max(arr[:, 1]) + 0.04, 0.4)
    a, b = np.meshgrid(x, y)
    points = np.stack([a.reshape(-1), b.reshape(-1)]).T
    def normal(a, b):
        v = b - a
        n = np.array([v[1], -v[0]])
        # normal needs to point out
        if (center - a) @ n > 0:
            n *= -1
        return n
    mask = functools.reduce(np.logical_and, [((points - a) @ normal(a, b)) < 0 for a, b in zip(arr[:-1], arr[1:])])
    #plt.plot(arr[:, 0], arr[:, 1])
    #plt.gca().set_aspect('equal')
    #plt.scatter(points[mask][:, 0], points[mask][:, 1])
    #plt.show()
    return points[mask]
arr1 = np.array([[0, 7],[3, 10],[3, 4],[0, 7]])
arr2 = np.array([[3, 0], [3, 14], [12, 14], [12, 0], [3,0]])
arr3 = np.array([[12, 4], [12, 10], [20, 10], [20, 4], [12, 4]])
arr_1 = gridFunc(arr1)
arr_2 = gridFunc(arr2)
arr_3 = gridFunc(arr3)
res = np.append(arr_1, arr_2)
res = np.reshape(res, (-1, 2))
res = np.append(res, arr_3)
res = np.reshape(res, (-1, 2))
plt.scatter(res[:,0], res[:,1])
plt.show()
The image that I get is this: [image]
But I am doing this manually, and I want to extend this to other shapes as well.
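One possible way to avoid splitting the shape by hand is to test every grid point against the whole outline at once. The sketch below is only an illustration under assumptions: the function name grid_in_polygon is hypothetical, the outline vertices are my own reading of the three hard-coded pieces above merged into one polygon, and it relies on matplotlib.path.Path.contains_points for the point-in-polygon test. Points lying exactly on the boundary may or may not be kept.

```python
import numpy as np
from matplotlib.path import Path


def grid_in_polygon(vertices, spacing):
    """Return the regular grid points that fall inside a polygon outline.

    `vertices` is an (N, 2) sequence of outline corners in order
    (hypothetical helper, not part of the original post).
    """
    vertices = np.asarray(vertices, dtype=float)
    xmin, ymin = vertices.min(axis=0)
    xmax, ymax = vertices.max(axis=0)
    # Regular grid covering the bounding box of the outline.
    xs = np.arange(xmin, xmax + spacing / 2, spacing)
    ys = np.arange(ymin, ymax + spacing / 2, spacing)
    gx, gy = np.meshgrid(xs, ys)
    points = np.column_stack([gx.ravel(), gy.ravel()])
    # Close the outline explicitly by repeating the first vertex.
    outline = Path(np.vstack([vertices, vertices[:1]]))
    inside = outline.contains_points(points)
    return points[inside]


# Outline of the triangle plus the two rectangles used above (assumed).
outline = [(0, 7), (3, 10), (3, 14), (12, 14), (12, 10), (20, 10),
           (20, 4), (12, 4), (12, 0), (3, 0), (3, 4)]
pts = grid_in_polygon(outline, 0.4)
```

Because this needs only an ordered list of corners, it should also work for concave outlines, which the half-plane test in gridFunc cannot handle in one pass.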

Related

How to calculate the correlation coefficient on a rolling window of a vector using numpy?

I'm able to calculate a rolling correlation coefficient for a 1D-array (data against [0, 1, 2, 3, 4]) using a loop.
I'm looking for a smarter solution using numpy (not pandas).
Here is my current code:
import numpy as np
data = np.array([10,5,8,9,15,22,26,11,15,16,18,7,4,8,-2,-3,-4,-6,-2,0,10,0,5,8])
x = np.zeros_like(data).astype('float32')
length = 5
for i in range(length, data.shape[0]):
    x[i] = np.corrcoef(data[i - length:i], np.arange(length))[0, 1]
print(x)
x gives :
[ 0. 0. 0. 0. 0. 0.607 0.959 0.98 0.328 -0.287
-0.61 -0.314 -0.18 -0.8 -0.782 -0.847 -0.811 -0.825 -0.869 -0.283
0.566 0.863 0.643 0.454]
Any solution without the loop please?
Use a numpy.lib.stride_tricks.sliding_window_view (available in numpy v1.20.0+)
swindow = np.lib.stride_tricks.sliding_window_view(data, (length,))
which gives a view on the data array that looks like so:
array([[10, 5, 8, 9, 15],
[ 5, 8, 9, 15, 22],
[ 8, 9, 15, 22, 26],
[ 9, 15, 22, 26, 11],
[15, 22, 26, 11, 15],
[22, 26, 11, 15, 16],
[26, 11, 15, 16, 18],
[11, 15, 16, 18, 7],
[15, 16, 18, 7, 4],
[16, 18, 7, 4, 8],
[18, 7, 4, 8, -2],
[ 7, 4, 8, -2, -3],
[ 4, 8, -2, -3, -4],
[ 8, -2, -3, -4, -6],
[-2, -3, -4, -6, -2],
[-3, -4, -6, -2, 0],
[-4, -6, -2, 0, 10],
[-6, -2, 0, 10, 0],
[-2, 0, 10, 0, 5],
[ 0, 10, 0, 5, 8]])
Now, we want to apply the correlation coefficient calculation to each row of this array. Unfortunately, np.corrcoef doesn't take an axis argument: it computes the correlation matrix of the entire input and doesn't provide a way to correlate each row (or column) separately against one vector.
However, the calculation for the correlation coefficient of two vectors is quite simple:
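The formula referred to here seems to have been lost from the text. Presumably it is the standard Pearson correlation coefficient, written below for two vectors x and y of equal length with means \bar{x} and \bar{y}; this is also what the numerator n and denominator d in vec_corrcoef compute.

```latex
r_{xy} = \frac{\sum_{i} (x_i - \bar{x})\,(y_i - \bar{y})}
              {\sqrt{\sum_{i} (x_i - \bar{x})^2 \;\sum_{i} (y_i - \bar{y})^2}}
```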
Applying that here:
def vec_corrcoef(X, y, axis=1):
    Xm = np.mean(X, axis=axis, keepdims=True)
    ym = np.mean(y)
    n = np.sum((X - Xm) * (y - ym), axis=axis)
    d = np.sqrt(np.sum((X - Xm)**2, axis=axis) * np.sum((y - ym)**2))
    return n / d
Now, call this function with our array and arange:
cc = vec_corrcoef(swindow, np.arange(length))
which gives the desired result:
array([ 0.60697698, 0.95894955, 0.98 , 0.3279521 , -0.28709766,
-0.61035663, -0.31390158, -0.17995394, -0.80041656, -0.78192905,
-0.84702587, -0.81091772, -0.82464375, -0.86892667, -0.28347335,
0.56568542, 0.86304424, 0.64326752, 0.45374261, 0.38135638])
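As a quick sanity check, the row-wise function can be compared against np.corrcoef applied to one window at a time. This is a minimal sketch that repeats the definitions from above so that it runs on its own; the names windows and reference are introduced here for illustration only.

```python
import numpy as np


def vec_corrcoef(X, y, axis=1):
    # Row-wise Pearson correlation of each row of X with the vector y.
    Xm = np.mean(X, axis=axis, keepdims=True)
    ym = np.mean(y)
    n = np.sum((X - Xm) * (y - ym), axis=axis)
    d = np.sqrt(np.sum((X - Xm)**2, axis=axis) * np.sum((y - ym)**2))
    return n / d


data = np.array([10, 5, 8, 9, 15, 22, 26, 11, 15, 16, 18, 7, 4, 8,
                 -2, -3, -4, -6, -2, 0, 10, 0, 5, 8])
length = 5

windows = np.lib.stride_tricks.sliding_window_view(data, (length,))
cc = vec_corrcoef(windows, np.arange(length))

# Reference: the slow way, one np.corrcoef call per window.
reference = np.array([np.corrcoef(w, np.arange(length))[0, 1] for w in windows])
print(np.allclose(cc, reference))
```

Note that a window with constant values would make the denominator zero in both versions; the sample data contains no such window.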
To get your x, just set the appropriate indices of a zeros array of the correct size.
Note: I think your x should contain nonzero values starting at the 4 index (because that's where the sliding window is full) instead of starting at index 5.
x = np.zeros(data.shape)
x[-len(cc):] = cc
If you are sure that your values should start at the index 5, then you can do:
x = np.zeros(data.shape)
x[length:] = cc[:-1] # Ignore the last value in cc
Comparing the runtimes of your original approach with those suggested in the answers here:
f_OP_loopy is your approach, which implements a sliding window using a loop
f_PH_numpy is my approach, which uses the sliding_window_view and the vectorized function for row-wise calculation of the vector correlation coefficient
f_RA_numpy is Rontogiannis's approach, which tiles the arange, calculates the correlation coefficient for the entire matrices, and only selects the first len(data) - length rows of the last column
f_RA_recur is Rontogiannis's recursive approach, but I didn't time this because it misses out on the last correlation coefficient.
Unsurprisingly, the numpy-only solution is faster than the loopy approach.
My numpy solution, which computes the row-wise correlation coefficient, is faster than that shown by Rontogiannis below, because the extra work involved in tiling the vector input and calculating the correlation of the entire matrix, only to discard the unwanted elements, is avoided by my approach.
As the input data size increases, this "extra work" in Rontogiannis's approach increases so much that its runtime is worse even than the loopy approach! I am unsure if this extra time is in the np.corrcoef calculation or in the np.tile operation.
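One likely contributor to that extra time can be seen from the shapes alone. When np.corrcoef is given two matrices with k rows each, it treats every row as a variable and returns the full (2k, 2k) correlation matrix, of which only k entries are kept. The sketch below only inspects shapes and is not a timing measurement; the names ramp, full and wanted are introduced here for illustration.

```python
import numpy as np

data = np.array([10, 5, 8, 9, 15, 22, 26, 11, 15, 16, 18, 7, 4, 8,
                 -2, -3, -4, -6, -2, 0, 10, 0, 5, 8])
length = 5

windows = np.lib.stride_tricks.sliding_window_view(data, length)
k = windows.shape[0]                        # number of windows
ramp = np.tile(np.arange(length), (k, 1))   # one copy of arange per window

full = np.corrcoef(windows, ramp)           # all rows against all rows
wanted = full[:k, -1]                       # the only part that is kept

print(windows.shape, full.shape, wanted.shape)
```

The size of the intermediate matrix grows quadratically with the number of windows, whereas the row-wise approach only ever touches k * length values.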
Note: This plot was obtained on my 2.2GHz i7 Macbook Air with 8GB RAM, Python 3.10.7 and numpy 1.23.3. Similar results were obtained on Google Colab
If you're interested in the timing code, here it is:
import timeit
import numpy as np
from matplotlib import pyplot as plt
def time_funcs(funcs, sizes, arg_gen, N=20):
    times = np.zeros((len(sizes), len(funcs)))
    gdict = globals().copy()
    for i, s in enumerate(sizes):
        args = arg_gen(s)
        print(args)
        for j, f in enumerate(funcs):
            gdict.update(locals())
            try:
                times[i, j] = timeit.timeit("f(*args)", globals=gdict, number=N) / N
                print(f"{i}/{len(sizes)}, {j}/{len(funcs)}, {times[i, j]}")
            except ValueError:
                print(f"ERROR in {f}, with args=", *args)
    return times
def plot_times(times, funcs):
    fig, ax = plt.subplots()
    for j, f in enumerate(funcs):
        ax.plot(sizes, times[:, j], label=f.__name__)
    ax.set_xlabel("Array size")
    ax.set_ylabel("Time per function call (s)")
    ax.set_xscale("log")
    ax.set_yscale("log")
    ax.legend()
    ax.grid()
    fig.tight_layout()
    return fig, ax
#%%
def arg_gen(n):
    return [np.random.randint(-100, 100, (n,)), 5]
#%%
def f_OP_loopy(data, length):
    x = np.zeros_like(data).astype('float32')
    for i in range(length-1, data.shape[0]):
        x[i] = np.corrcoef(data[i - length + 1:i+1], np.arange(length))[0, 1]
    return x
def f_PH_numpy(data, length):
    swindow = np.lib.stride_tricks.sliding_window_view(data, (length,))
    cc = vec_corrcoef(swindow, np.arange(length))
    x = np.zeros(data.shape)
    x[-len(cc):] = cc
    return x
def f_RA_recur(data, length):
    return np.concatenate((
        np.zeros([length,]),
        rolling_correlation_recurse(data, 0, length)
    ))
def f_RA_numpy(data, length):
    n = len(data)
    cc = np.corrcoef(np.lib.stride_tricks.sliding_window_view(data, length), np.tile(np.arange(length), (n-length+1, 1)))[:n-length+1, -1]
    x = np.zeros(data.shape)
    x[-len(cc):] = cc
    return x
#%%
def rolling_correlation_recurse(data, i, length) :
    assert i+length < data.size
    left = np.array([np.corrcoef(data[i:i+length], np.arange(length))[0, 1]])
    if i+length+1 == data.size :
        return left
    right = rolling_correlation_recurse(data, i+1, length)
    return np.concatenate((left, right))
def vec_corrcoef(X, y, axis=1):
    Xm = np.mean(X, axis=axis, keepdims=True)
    ym = np.mean(y)
    n = np.sum((X - Xm) * (y - ym), axis=axis)
    d = np.sqrt(np.sum((X - Xm)**2, axis=axis) * np.sum((y - ym)**2))
    return n / d
#%%
if __name__ == "__main__":
    #%% Set up sim
    sizes = [5, 10, 50, 100, 500, 1000, 5000, 10_000] #, 50_000, 100_000]
    funcs = [f_OP_loopy, #f_RA_recur,
             f_PH_numpy, f_RA_numpy]
    #%% Run timing
    time_fcalls = np.zeros((len(sizes), len(funcs))) * np.nan
    time_fcalls = time_funcs(funcs, sizes, arg_gen)
    fig, ax = plot_times(time_fcalls, funcs)
    ax.set_xlabel(f"Input size")
    plt.show()
    input("Enter x to exit")
Ask and you shall receive. Here is a solution that uses recursion:
import numpy as np
data = np.array([10,5,8,9,15,22,26,11,15,16,18,7,4,8,-2,-3,-4,-6,-2,0,10,0,5,8])
length = 5
def rolling_correlation_recurse(data, i, length) :
    assert i+length < data.size
    left = np.array([np.corrcoef(data[i:i+length], np.arange(length))[0, 1]])
    if i+length+1 == data.size :
        return left
    right = rolling_correlation_recurse(data, i+1, length)
    return np.concatenate((left, right))
def rolling_correlation(data, length) :
    return np.concatenate((
        np.zeros([length,]),
        rolling_correlation_recurse(data, 0, length)
    ))
print(rolling_correlation(data, length))
Edit: here is a numpy solution too:
n = len(data)
print(np.corrcoef(np.lib.stride_tricks.sliding_window_view(data, length), np.tile(np.arange(length), (n-length+1, 1)))[:n-length+1, -1])

Assign indexes to rotated indexes

I have a (3, 2, 2) array whose three 2D subarrays I want to rotate without loops by 0°, 90°, and 180°, respectively:
import numpy as np
import matplotlib.pyplot as plt
arr = np.array([[[2,3],
[3,3]],
[[4,5],
[5,5]],
[[6,7],
[7,7]]])
for k in np.arange(3):
    plt.imshow(arr[k,:,:], cmap='gray', vmin=2, vmax=7)
    plt.show()
Therefore, I defined two (3, 2, 2) arrays containing the x and y indices...
x_ = np.array([[[0,1],
[0,1]],
[[0,1],
[0,1]],
[[0,1],
[0,1]]])
y_ = np.array([[[0,0],
[1,1]],
[[0,0],
[1,1]],
[[0,0],
[1,1]]])
... and rotated them:
x_rot = np.array([[[0,1],
[0,1]],
[[1,1],
[0,0]],
[[1,0],
[1,0]]])
y_rot = np.array([[[0,0],
[1,1]],
[[0,1],
[0,1]],
[[1,1],
[0,0]]])
But I don't understand why the following index assignment doesn't work, because instead each 2D subarray is rotated 180°:
arr_rot = np.zeros((3, 2, 2), dtype=int)
arr_rot[:, x_, y_] = arr[:, x_rot, y_rot]
for k in np.arange(3):
    plt.imshow(arr_rot[k,:,:], cmap='gray', vmin=2, vmax=7)
    plt.show()
You were already really close: you can get the desired result using x_rot and y_rot by also indexing the first dimension:
out = arr[np.arange(3)[:, None, None], x_rot, y_rot]
out:
array([[[2, 3],
[3, 3]],
[[5, 5],
[4, 5]],
[[7, 7],
[7, 6]]])
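To see why the slice in the first position behaves differently, it may help to compare the shapes of the two indexing expressions. A slice is kept as a separate axis and crossed with every index array, whereas an explicit index array is paired element by element with x_rot and y_rot. The sketch below repeats the arrays from the question; the names crossed, paired and t_idx are introduced here for illustration.

```python
import numpy as np

arr = np.array([[[2, 3], [3, 3]],
                [[4, 5], [5, 5]],
                [[6, 7], [7, 7]]])
x_rot = np.array([[[0, 1], [0, 1]],
                  [[1, 1], [0, 0]],
                  [[1, 0], [1, 0]]])
y_rot = np.array([[[0, 0], [1, 1]],
                  [[0, 1], [0, 1]],
                  [[1, 1], [0, 0]]])

# Slice in the first axis: every subarray is indexed with ALL three
# rotation patterns, so an extra axis of length 3 appears.
crossed = arr[:, x_rot, y_rot]
print(crossed.shape)   # (3, 3, 2, 2)

# Index array in the first axis: subarray k is paired with pattern k only.
t_idx = np.arange(arr.shape[0])[:, None, None]
paired = arr[t_idx, x_rot, y_rot]
print(paired.shape)    # (3, 2, 2)
```

Here paired[k] is the "diagonal" crossed[k, k]. In the original assignment, each target cell is written three times, once per rotation pattern, and the observed result suggests that the last pattern (180°) is the one that remains.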
Thank you, W.A, for your detailed and understandable answer.
Sorry, I should have mentioned that I should program flexibly: arr can be any (T, M, M) array and the T angles are given arbitrarily. So my code so far looks like this:
import numpy as np
import matplotlib.pyplot as plt
arr = np.array([[[2,3],
[3,3]],
[[4,5],
[5,5]],
[[6,7],
[7,7]]])
M = 2
T = 3
b = 2*np.pi
angles = np.array([np.linspace(0, (T-1)*b/T, T)]).T
# create indices arrays
x_ = np.tile(np.arange(M), (M, 1))
y_ = np.tile(np.arange(M), (M, 1)).T
x_, y_ = np.repeat(x_, T, axis=0), np.tile(y_, (T, 1))
x_, y_ = x_.reshape((T, M, M)), y_.reshape((T, M, M))
# center, rotate and decenter 2D subarrays
center = -M/2 + 0.5
decenter = -center
x_rot = ((x_ + center)*np.cos(angles) - (y_ + center)*np.sin(angles)) + decenter
y_rot = ((x_ + center)*np.sin(angles) + (y_ + center)*np.cos(angles)) + decenter
# set indices that lie outside 0...(M-1) equal to zero.
x_rot[(x_rot < 0) + (x_rot > M-1)] = 0
y_rot[(y_rot < 0) + (y_rot > M-1)] = 0
arr_rot = np.zeros((T, M, M), dtype=int)
arr_rot[:, x_, y_] = arr[:, x_rot.astype(int), y_rot.astype(int)]

How to broadcast or vectorize a linear interpolation of a 2D array that uses scipy.ndimage map_coordinates?

I have recently hit a roadblock when it comes to performance. I know how to manually loop and do the interpolation from the origin cell to all the other cells by brute-forcing/looping each row and column in 2d array.
However, when I process a 2D array of shape, say, (3000, 3000), the linear spacing and the interpolation come to a standstill and severely hurt performance.
I am looking for a way to optimize this loop. I am aware of vectorization and broadcasting; I am just not sure how to apply them in this situation.
I will explain it with code and figures
import numpy as np
from scipy.ndimage import map_coordinates
m = np.array([
[10,10,10,10,10,10],
[9,9,9,10,9,9],
[9,8,9,10,8,9],
[9,7,8,0,8,9],
[8,7,7,8,8,9],
[5,6,7,7,6,7]])
origin_row = 3
origin_col = 3
m_max = np.zeros(m.shape)
m_dist = np.zeros(m.shape)
rows, cols = m.shape
for col in range(cols):
    for row in range(rows):
        # Get spacing linear interpolation
        x_plot = np.linspace(col, origin_col, 5)
        y_plot = np.linspace(row, origin_row, 5)
        # grab the interpolated line
        interpolated_line = map_coordinates(m,
                                            np.vstack((y_plot,
                                                       x_plot)),
                                            order=1, mode='nearest')
        m_max[row][col] = max(interpolated_line)
        m_dist[row][col] = np.argmax(interpolated_line)
print(m)
print(m_max)
print(m_dist)
As you can see this is very brute force, and I have managed to broadcast all the code around this part but stuck on this part.
here is an illustration of what I am trying to achieve, I will go through the first iteration
1.) the input array
2.) the first loop from 0,0 to origin (3,3)
3.) this will return [10 9 9 8 0] and the max will be 10 and the index will be 0
5.) here is the output for the sample array I used
Here is an update of the performance based on the accepted answer.
To speed up the code, you could first create the x_plot and y_plot outside of the loops instead of creating them several times each one:
#this would be outside of the loops
num = 5
lin_col = np.array([np.linspace(i, origin_col, num) for i in range(cols)])
lin_row = np.array([np.linspace(i, origin_row, num) for i in range(rows)])
then you could access them in each loop by x_plot = lin_col[col] and y_plot = lin_row[row]
Second, you can avoid both loops by calling map_coordinates once on the stacked coordinates of every (row, col) pair, instead of once per pair. To do so, you can create all the combinations of x_plot and y_plot by using np.tile and np.ravel, such as:
arr_vs = np.vstack(( np.tile( lin_row, cols).ravel(),
np.tile( lin_col.ravel(), rows)))
Note that ravel is not applied at the same place in the two lines; this is what produces all the combinations. Now you can use map_coordinates with this arr_vs and reshape the result with the number of rows, cols and num to get each interpolated_line in the last axis of a 3D array:
arr_map = map_coordinates(m, arr_vs, order=1, mode='nearest').reshape(rows,cols,num)
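The ordering produced by the tile/ravel combination can be checked on a small case. The sketch below uses sizes and an origin made up for the illustration (rows=2, cols=3, num=4) and compares the two flattened coordinate arrays with an equivalent broadcasting formulation; the names row_coords, col_coords, row_bc and col_bc are not from the answer.

```python
import numpy as np

# Small made-up sizes, only to inspect the ordering.
rows, cols, num = 2, 3, 4
origin_row, origin_col = 1, 2

lin_col = np.array([np.linspace(i, origin_col, num) for i in range(cols)])
lin_row = np.array([np.linspace(i, origin_row, num) for i in range(rows)])

# Same construction as in arr_vs, reshaped to (rows, cols, num).
row_coords = np.tile(lin_row, cols).ravel().reshape(rows, cols, num)
col_coords = np.tile(lin_col.ravel(), rows).reshape(rows, cols, num)

# Equivalent view via broadcasting: element [r, c, k] pairs
# lin_row[r, k] with lin_col[c, k].
row_bc = np.broadcast_to(lin_row[:, None, :], (rows, cols, num))
col_bc = np.broadcast_to(lin_col[None, :, :], (rows, cols, num))

print(np.array_equal(row_coords, row_bc), np.array_equal(col_coords, col_bc))
```

So position [r, c, :] of the reshaped result holds the line from cell (r, c) to the origin, which is why the final reshape to (rows, cols, num) lines up with m_max and m_dist.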
Finally, you can use np.max and np.argmax on the last axis of arr_map to get the results m_max and m_dist. So all the code would be:
import numpy as np
from scipy.ndimage import map_coordinates
m = np.array([
[10,10,10,10,10,10],
[9,9,9,10,9,9],
[9,8,9,10,8,9],
[9,7,8,0,8,9],
[8,7,7,8,8,9],
[5,6,7,7,6,7]])
origin_row = 3
origin_col = 3
rows, cols = m.shape
num = 5
lin_col = np.array([np.linspace(i, origin_col, num) for i in range(cols)])
lin_row = np.array([np.linspace(i, origin_row, num) for i in range(rows)])
arr_vs = np.vstack(( np.tile( lin_row, cols).ravel(),
np.tile( lin_col.ravel(), rows)))
arr_map = map_coordinates(m, arr_vs, order=1, mode='nearest').reshape(rows,cols,num)
m_max = np.max( arr_map, axis=-1)
m_dist = np.argmax( arr_map, axis=-1)
print (m_max)
print (m_dist)
and you get, as expected:
#m_max
array([[10, 10, 10, 10, 10, 10],
[ 9, 9, 10, 10, 9, 9],
[ 9, 9, 9, 10, 8, 9],
[ 9, 8, 8, 0, 8, 9],
[ 8, 8, 7, 8, 8, 9],
[ 7, 7, 8, 8, 8, 8]])
#m_dist
array([[0, 0, 0, 0, 0, 0],
[0, 0, 2, 0, 0, 0],
[0, 2, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0],
[0, 2, 0, 0, 0, 0],
[1, 1, 2, 1, 2, 1]])
EDIT: lin_col and lin_row are related, so you can do faster:
if cols >= rows:
    arr = np.arange(cols)[:,None]
    lin_col = arr + (origin_col-arr)/(num-1.)*np.arange(num)
    lin_row = lin_col[:rows] + np.linspace(0, origin_row - origin_col, num)[None,:]
else:
    arr = np.arange(rows)[:,None]
    lin_row = arr + (origin_row-arr)/(num-1.)*np.arange(num)
    lin_col = lin_row[:cols] + np.linspace(0, origin_col - origin_row, num)[None,:]
Here is a sort-of-vectorized approach. It is not very optimized and there may be one or two index-off-by-one errors, but it may give you ideas.
Two examples: a monochrome 384x512 test pattern and a "real" 3-channel 768x1024 image. Both are uint8.
This takes half a minute on my machine.
For larger images one would require more RAM than I have (8GB). Or one would have to break it down into smaller chunks.
And the code
import numpy as np
def rays(img, ctr):
    M, N, *d = img.shape
    aidx = 2*(slice(None),) + (img.ndim-2)*(None,)
    m, n = ctr
    out = np.empty_like(img)
    offsI = np.empty(img.shape, np.uint16)
    offsJ = np.empty(img.shape, np.uint16)
    img4, out4, I4, J4 = ((x[m:, n:], x[m:, n::-1], x[m::-1, n:], x[m::-1, n::-1]) for x in (img, out, offsI, offsJ))
    for i, o, y, x in zip(img4, out4, I4, J4):
        for _ in range(2):
            M, N, *d = i.shape
            widths = np.arange(1, M+1, dtype=np.uint16).clip(None, N)
            I = np.arange(M, dtype=np.uint16).repeat(widths)
            J = np.ones_like(I)
            J[0] = 0
            J[widths[:-1].cumsum()] -= widths[:-1]
            J = J.cumsum(dtype=np.uint16)
            ii = np.arange(1, 2*M-1, dtype=np.uint16) // 2
            II = ii.clip(None, I[:, None])
            jj = np.arange(2*M-2, dtype=np.uint32) // 2 * 2 + 1
            jj[0] = 0
            JJ = ((1 + jj) * J[:, None] // (2*(I+1))[:, None]).astype(np.uint16).clip(None, J[:, None])
            idx = i[II, JJ].argmax(axis=1)
            II, JJ = (np.take_along_axis(ZZ[aidx] , idx[:, None], 1)[:, 0] for ZZ in (II, JJ))
            y[I, J], x[I, J] = II, JJ
            SH = II, JJ, *np.ogrid[tuple(map(slice, img.shape))][2:]
            o[I, J] = i[SH]
            i, o = i.swapaxes(0, 1), o.swapaxes(0, 1)
            y, x = x.swapaxes(0, 1), y.swapaxes(0, 1)
    return out, offsI, offsJ
from scipy.datasets import face  # was scipy.misc.face in older SciPy versions
f = face()
fr, *fidx = rays(f, (200, 400))
s = np.uint8((np.arange(384)[:, None] % 41 < 2)&(np.arange(512) % 41 < 2))
s = 255*s + 128*s[::-1, ::-1] + 64*s[::-1] + 32*s[:, ::-1]
sr, *sidx = rays(s, (200, 400))
from PIL import Image  # Pillow; the bare "import Image" only worked with the old PIL package
Image.fromarray(f).show()
Image.fromarray(fr).show()
Image.fromarray(s).show()
Image.fromarray(sr).show()

Creating 2d histogram from 2d numpy array

I have a numpy array like this:
[[[0,0,0], [1,0,0], ..., [1919,0,0]],
[[0,1,0], [1,1,0], ..., [1919,1,0]],
...,
[[0,1019,0], [1,1019,0], ..., [1919,1019,0]]]
To create it, I use this function (thanks to @Divakar and @unutbu for helping in another question):
def indices_zero_grid(m,n):
    I,J = np.ogrid[:m,:n]
    out = np.zeros((m,n,3), dtype=int)
    out[...,0] = I
    out[...,1] = J
    return out
I can access this array by command:
>>> out = indices_zero_grid(3,2)
>>> out
array([[[0, 0, 0],
[0, 1, 0]],
[[1, 0, 0],
[1, 1, 0]],
[[2, 0, 0],
[2, 1, 0]]])
>>> out[1,1]
array([1, 1, 0])
Now I want to plot a 2D histogram where (x, y) (as in out[x, y]) are the coordinates and the third value is the number of occurrences. I've tried using a normal matplotlib plot, but I have so many values for each coordinate (I need 1920x1080) that the program needs too much memory.
If I understand correctly, you want an image of size 1920x1080 which colors the pixel at coordinate (x, y) according to the value of out[x, y].
In that case, you could use
import numpy as np
import matplotlib.pyplot as plt
def indices_zero_grid(m,n):
    I,J = np.ogrid[:m,:n]
    out = np.zeros((m,n,3), dtype=int)
    out[...,0] = I
    out[...,1] = J
    return out
h, w = 1920, 1080
out = indices_zero_grid(h, w)
out[..., 2] = np.random.randint(256, size=(h, w))
plt.imshow(out[..., 2])
plt.show()
which yields
Notice that the other two "columns", out[..., 0] and out[..., 1] are not used. This suggests that indices_zero_grid is not really needed here.
plt.imshow can accept an array of shape (1920, 1080). This array has a scalar value at each location in the array. The structure of the array tells imshow where to color each cell. Unlike a scatter plot, you don't need to generate the coordinates yourself.

Matplotlib: Hysteresis loop using Mirrored or Split x axis

In an experiment, a load cell advances in equal increments of distance with time, compresses a sample; stops when a specified distance from the start point is reached; then retracts in equal increments of distance with time back to the starting position.
A plot of pressure (load cell reading) on the y axis against distance on the x axis produces a familiar hysteresis loop. A plot of pressure (load cell reading) on the y axis against time on the x axis produces an asymmetric peak with the maximum pressure in the centre, corresponding to the maximum advancement point of the sensor.
Instead of the above, I'd like to plot pressure on the y axis against distance on the x axis, with the additional constraint that the x axis is labelled starting at 0, with maximum pressure at the middle of the x axis, and 0 again at the right hand end of the x-axis. In other words, the curve will be identical in shape to the plot of pressure v time, but will be of pressure v distance, where the left half of the plot indicates the distance of the probe from its starting position during advancement; and the right half of the plot indicates distance of the probe from its starting position during retraction.
My actual datasets contain thousands of rows of data but by way of illustration, a minimal dummy dataset would look something like the following, where the 3 columns correspond to Time, Distance of probe from origin, and Pressure measured by probe respectively:
[
[0,0,0],
[1,2,10],
[2,4,30],
[3,6,60],
[4,4,35],
[5,2,15],
[6,0,0]
]
I can't work out how to get MatPlotlib to construct the x-axis so that the range goes from 0 to a maximum, then back to 0 again. I'd be grateful for advice on how to achieve this plot in the most simple and elegant way. Many thanks.
As you have time, you can use it for the x axis values and just change the x tick labels:
import numpy as np
import matplotlib.pyplot as plt
# Time, Distance, Pressure
data = [[0, 0, 0],
[1, 2, 10],
[2, 4, 30],
[3, 6, 60],
[4, 4, 35],
[5, 2, 15],
[6, 0, 0]]
# convert to array to allow indexing like [i, j]
data = np.array(data)
fig = plt.figure()
ax = fig.add_subplot(111)
max_ticks = 10
skip = (data.shape[0] // max_ticks) + 1  # integer division: slice steps must be ints in Python 3
ax.plot(data[:, 0], data[:, 2]) # Pressure(time)
ax.set_xticks(data[::skip, 0])
ax.set_xticklabels(data[::skip, 1]) # Pressure(Distance(time)) ?
ax.set_ylabel('Pressure [Pa?]')
ax.set_xlabel('Distance [m?]')
fig.show()
The skip is just so you don't end up with too many ticks on the plot, change as you like.
As said in the comments, the above only holds for uniform changes in distance as a function of time. For non-uniform changes, you'll have to use something like:
data = [[0, 0, 0],
[1, 2, 10],
[2, 4, 30],
[3, 6, 60],
[3.5, 5.4, 40],
[4, 4, 35],
[5, 2, 15],
[6, 0, 0]]
# convert to array to allow indexing like [i, j]
data = np.array(data)
def find_max_pos(data, column=0):
    return np.argmax(data[:, column])
def reverse_unload(data, unload_start):
    # prepare new_data with new column:
    new_shape = np.array(data.shape)
    new_shape[1] += 1
    new_data = np.empty(new_shape)
    # copy all correct data
    new_data[:, 0] = data[:, 0]
    new_data[:, 1] = data[:, 1]
    new_data[:, 2] = data[:, 2]
    new_data[:unload_start+1, 3] = data[:unload_start+1, 1]
    # use gradient to fill the rest
    gradient = -np.gradient(data[:, 1])
    for i in range(unload_start + 1, data.shape[0]):
        new_data[i, 3] = new_data[i-1, 3] + gradient[i]
    return new_data
data = reverse_unload(data, find_max_pos(data, 1))
fig = plt.figure()
ax = fig.add_subplot(111)
max_ticks = 10
skip = (data.shape[0] // max_ticks) + 1  # integer division: slice steps must be ints in Python 3
ax.plot(data[:, 3], data[:, 2]) # Pressure("Distance")
ax.set_xticks(data[::skip, 3])
ax.set_xticklabels(data[::skip, 1])
ax.grid() # added for clarity
ax.set_ylabel('Pressure [Pa?]')
ax.set_xlabel('Distance [m?]')
fig.show()
Regarding the fact that using the measured values as the ticks results in these not being round nice numbers, I found it was just easier to map the automatic ticks from matplotlib to the correct values:
import numpy as np
import matplotlib.pyplot as plt
data = [[0, 0, 0],
[1, 2, 10],
[2, 4, 30],
[3, 6, 60],
[3.5, 5.4, 40],
[4, 4, 35],
[5, 2, 15],
[6, 0, 0]]
# convert to array to allow indexing like [i, j]
data = np.array(data)
def find_max_pos(data, column=0):
    return np.argmax(data[:, column])
def reverse_unload(data):
    unload_start = find_max_pos(data, 1)
    # prepare new_data with new column:
    new_shape = np.array(data.shape)
    new_shape[1] += 1
    new_data = np.empty(new_shape)
    # copy all correct data
    new_data[:, 0] = data[:, 0]
    new_data[:, 1] = data[:, 1]
    new_data[:, 2] = data[:, 2]
    new_data[:unload_start+1, 3] = data[:unload_start+1, 1]
    # use gradient to fill the rest
    gradient = data[unload_start:-1, 1]-data[unload_start+1:, 1]
    for i, j in enumerate(range(unload_start + 1, data.shape[0])):
        new_data[j, 3] = new_data[j-1, 3] + gradient[i]
    return new_data
def create_map_function(data):
    """
    Return function that maps values of distance
    folded over the maximum pressure applied.
    """
    max_index = find_max_pos(data, 1)
    x0, y0 = data[max_index, 1], data[max_index, 1]
    x1, y1 = 2*data[max_index, 1], 0
    m = (y1 - y0) / (x1 - x0)
    b = y0 - m*x0
    def map_function(x):
        if x < x0:
            return x
        else:
            return m*x+b
    return map_function
def process_data(data):
    data = reverse_unload(data)
    map_function = create_map_function(data)
    fig, ax = plt.subplots()
    ax.plot(data[:, 3], data[:, 2])
    ax.set_xticklabels([map_function(x) for x in ax.get_xticks()])
    ax.grid()
    ax.set_ylabel('Pressure [Pa?]')
    ax.set_xlabel('Distance [m?]')
    fig.show()
if __name__ == '__main__':
    process_data(data)
Update: Have found a workaround to the problem of rounding ticks to the nearest integer by using the np.around function which rounds decimals to the nearest even value, to a specified number of decimal places (default = 0): e.g. 1.5 and 2.5 round to 2.0, -0.5 and 0.5 round to 0.0, etc. More info here: https://docs.scipy.org/doc/numpy1.10.4/reference/generated/numpy.around.html
So berna1111's code becomes:
import numpy as np
import matplotlib.pyplot as plt
# Time, Distance, Pressure
data = [[0, 0, 0],
[1, 1.9, 10], # Dummy data including decimals to demonstrate rounding
[2, 4.1, 30],
[3, 6.1, 60],
[4, 3.9, 35],
[5, 1.9, 15],
[6, -0.2, 0]]
# convert to array to allow indexing like [i, j]
data = np.array(data)
fig = plt.figure()
ax = fig.add_subplot(111)
max_ticks = 10
skip = (data.shape[0] // max_ticks) + 1  # integer division: slice steps must be ints in Python 3
ax.plot(data[:, 0], data[:, 2]) # Pressure(time)
ax.set_xticks(data[::skip, 0])
ax.set_xticklabels(np.absolute(np.around((data[::skip, 1])))) # Pressure(Distance(time)); rounded to nearest integer
ax.set_ylabel('Pressure [Pa?]')
ax.set_xlabel('Distance [m?]')
fig.show()
According to the numpy documentation, np.around should round the final value of -0.2 for Distance to '0.0'; however it seems to round to '-0.0' instead. Not sure why this occurs, but since all my xticklabels in this particular case need to be positive integers or zero, I can correct this behaviour by using the np.absolute function as shown above. Everything now seems to work OK for my requirements, but if I'm missing something, or there's a better solution, please let me know.
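A likely explanation for the '-0.0' is the signed zero of IEEE 754 floating point: rounding a small negative number keeps its sign bit, and the result still compares equal to 0.0. The sketch below shows this, together with one alternative to np.absolute that leaves genuinely negative values untouched (adding 0.0 turns a negative zero into a positive one). The variable names are introduced here for illustration.

```python
import numpy as np

rounded = np.around(-0.2)
print(rounded)               # displayed with a minus sign
print(rounded == 0.0)        # still compares equal to zero
print(np.signbit(rounded))   # the sign bit is what causes the "-0.0" label

# Adding 0.0 normalizes negative zero without changing other values.
distances = np.array([0, 1.9, 4.1, 6.1, 3.9, 1.9, -0.2])
labels = np.around(distances) + 0.0
print(labels)
```

Unlike np.absolute, this would keep a label such as -1.0 negative, which may or may not be what is wanted for this particular plot.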
