I am measuring x,y coordinates (in cm) of an object with a special camera in fixed time intervals of 1s. I have the data in a numpy array:
a = np.array([ [ 0. , 0. ],[ 0.3 , 0. ],[ 1.25, -0.1 ],[ 2.1 , -0.9 ],[ 2.85, -2.3 ],[ 3.8 , -3.95],[ 5. , -5.75],[ 6.4 , -7.8 ],[ 8.05, -9.9 ],[ 9.9 , -11.6 ],[ 12.05, -12.85],[ 14.25, -13.7 ],[ 16.5 , -13.8 ],[ 19.25, -13.35],[ 21.3 , -12.2 ],[ 22.8 , -10.5 ],[ 23.55, -8.15],[ 22.95, -6.1 ],[ 21.35, -3.95],[ 19.1 , -1.9 ]])
And the curve looks like this:
plt.scatter(a[:,0], a[:,1])
Question:
How can I calculate the tangential and the radial acceleration vectors at each point? I found some formulas that might be relevant: the acceleration decomposes as a = (d²s/dt²)·T + κ·(ds/dt)²·N, where T is the unit tangent vector, N the unit normal vector, and κ the curvature.
I am able to easily calculate the vx and vy projections with np.diff(a, axis=0), but I am a numpy/python noob and it is way over my head to continue. If I could calculate the curvature at each point, my problem would also be solved. Can somebody help?
EDIT: I put together this answer off and on over a couple of hours, so I missed your latest edits indicating that you only needed curvature. Hopefully, this answer will be helpful regardless.
Other than doing some curve-fitting, our method of approximating derivatives is via finite differences. Thankfully, numpy has a gradient method that does these difference calculations for us, taking care of the details of averaging previous and next slopes for each interior point and using one-sided differences at each endpoint.
import numpy as np
a = np.array([ [ 0. , 0. ],[ 0.3 , 0. ],[ 1.25, -0.1 ],
[ 2.1 , -0.9 ],[ 2.85, -2.3 ],[ 3.8 , -3.95],
[ 5. , -5.75],[ 6.4 , -7.8 ],[ 8.05, -9.9 ],
[ 9.9 , -11.6 ],[ 12.05, -12.85],[ 14.25, -13.7 ],
[ 16.5 , -13.8 ],[ 19.25, -13.35],[ 21.3 , -12.2 ],
[ 22.8 , -10.5 ],[ 23.55, -8.15],[ 22.95, -6.1 ],
[ 21.35, -3.95],[ 19.1 , -1.9 ]])
Now, we compute the derivatives of each variable and put them together. (If we just call np.gradient(a), we get a list of two arrays, one per axis, because for a 2-D array numpy differentiates along both axes; here we only want the time derivative down each column, so we take the gradient of each column separately. In newer numpy versions, np.gradient(a, axis=0) also works.)
dx_dt = np.gradient(a[:, 0])
dy_dt = np.gradient(a[:, 1])
velocity = np.stack((dx_dt, dy_dt), axis=1)  # shape (20, 2): one (vx, vy) pair per second
This gives us the following vector for velocity:
array([[ 0.3 , 0. ],
[ 0.625, -0.05 ],
[ 0.9 , -0.45 ],
[ 0.8 , -1.1 ],
[ 0.85 , -1.525],
[ 1.075, -1.725],
[ 1.3 , -1.925],
[ 1.525, -2.075],
[ 1.75 , -1.9 ],
[ 2. , -1.475],
[ 2.175, -1.05 ],
[ 2.225, -0.475],
[ 2.5 , 0.175],
[ 2.4 , 0.8 ],
[ 1.775, 1.425],
[ 1.125, 2.025],
[ 0.075, 2.2 ],
[-1.1 , 2.1 ],
[-1.925, 2.1 ],
[-2.25 , 2.05 ]])
which makes sense when glancing at the scatterplot of a.
Now, for speed, we take the length of the velocity vector. However, there's one thing we haven't really kept in mind here: everything is a function of t. So ds/dt is really a scalar function of t (as opposed to a vector function of t), just like dx/dt and dy/dt. Thus, we will represent ds_dt as a numpy array with one value per one-second time interval, each value approximating the speed at that second:
ds_dt = np.sqrt(dx_dt * dx_dt + dy_dt * dy_dt)
This yields the following array:
array([ 0.3 , 0.62699681, 1.00623059, 1.36014705, 1.74588803,
2.03254766, 2.32284847, 2.57512136, 2.58311827, 2.48508048,
2.41518633, 2.27513736, 2.50611752, 2.52982213, 2.27623593,
2.31651678, 2.20127804, 2.37065392, 2.8487936 , 3.04384625])
which, again, makes some sense as you look at the gaps between the dots on the scatterplot of a: the object picks up speed, slowing down a bit as it takes the corner, and then speeds back up even more.
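(A small aside: numpy has a helper for exactly this length computation, so the speed line can equivalently be written as ds_dt = np.hypot(dx_dt, dy_dt).)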
Now, in order to find the unit tangent vector, we need to make a small transformation to ds_dt so that its size is the same as that of velocity (this effectively allows us to divide the vector-valued function velocity by the (representation of) the scalar function ds_dt):
tangent = np.array([1/ds_dt] * 2).transpose() * velocity
This yields the following numpy array:
array([[ 1. , 0. ],
[ 0.99681528, -0.07974522],
[ 0.89442719, -0.4472136 ],
[ 0.5881717 , -0.80873608],
[ 0.48685826, -0.87348099],
[ 0.52889289, -0.84868859],
[ 0.55965769, -0.82872388],
[ 0.5922051 , -0.80578727],
[ 0.67747575, -0.73554511],
[ 0.80480291, -0.59354215],
[ 0.90055164, -0.43474907],
[ 0.97796293, -0.2087786 ],
[ 0.99755897, 0.06982913],
[ 0.9486833 , 0.31622777],
[ 0.77979614, 0.62603352],
[ 0.48564293, 0.87415728],
[ 0.03407112, 0.99941941],
[-0.46400699, 0.88583154],
[-0.67572463, 0.73715414],
[-0.73919634, 0.67349 ]])
Note two things: 1. At each value of t, tangent is pointing in the same direction as velocity, and 2. At each value of t, tangent is a unit vector. Indeed:
In [12]: np.sqrt(tangent[:,0] * tangent[:,0] + tangent[:,1] * tangent[:,1])
Out[12]:
array([ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
1., 1., 1., 1., 1., 1., 1.])
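As an aside, the transpose trick above can be replaced by plain broadcasting; a sketch that produces the same tangent array:
tangent = velocity / ds_dt[:, np.newaxis]  # (20, 2) / (20, 1) broadcasts across the columns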
Now, since we get the unit normal vector by taking the derivative of the tangent vector and dividing by its length, we use the same trick (isolating the components of tangent for convenience):
tangent_x = tangent[:, 0]
tangent_y = tangent[:, 1]
deriv_tangent_x = np.gradient(tangent_x)
deriv_tangent_y = np.gradient(tangent_y)
dT_dt = np.stack((deriv_tangent_x, deriv_tangent_y), axis=1)  # derivative of T at each time step
length_dT_dt = np.sqrt(deriv_tangent_x * deriv_tangent_x + deriv_tangent_y * deriv_tangent_y)
normal = np.array([1/length_dT_dt] * 2).transpose() * dT_dt
This gives us the following vector for normal:
array([[-0.03990439, -0.9992035 ],
[-0.22975292, -0.97324899],
[-0.48897562, -0.87229745],
[-0.69107645, -0.72278167],
[-0.8292422 , -0.55888941],
[ 0.85188045, 0.52373629],
[ 0.8278434 , 0.56095927],
[ 0.78434982, 0.62031876],
[ 0.70769355, 0.70651953],
[ 0.59568265, 0.80321988],
[ 0.41039706, 0.91190693],
[ 0.18879684, 0.98201617],
[-0.05568352, 0.99844847],
[-0.36457012, 0.93117594],
[-0.63863584, 0.76950911],
[-0.89417603, 0.44771557],
[-0.99992445, 0.0122923 ],
[-0.93801622, -0.34659137],
[-0.79170904, -0.61089835],
[-0.70603568, -0.70817626]])
Note that the normal vector represents the direction in which the curve is turning. The vector above then makes sense when viewed in conjunction with the scatterplot for a. In particular, we go from turning down to turning up after the fifth point, and we start turning to the left (with respect to the x axis) after the 12th point.
Finally, to get the tangential and normal components of acceleration, we need the second derivatives of s, x, and y with respect to t, and then we can get the curvature and the rest of our components (keeping in mind that they are all scalar functions of t):
d2s_dt2 = np.gradient(ds_dt)
d2x_dt2 = np.gradient(dx_dt)
d2y_dt2 = np.gradient(dy_dt)
curvature = np.abs(d2x_dt2 * dy_dt - dx_dt * d2y_dt2) / (dx_dt * dx_dt + dy_dt * dy_dt)**1.5
t_component = np.array([d2s_dt2] * 2).transpose()
n_component = np.array([curvature * ds_dt * ds_dt] * 2).transpose()
acceleration = t_component * tangent + n_component * normal
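As a rough sanity check (my addition, not part of the original derivation), the decomposition should approximately reproduce the raw second derivatives; the residual reflects the finite-difference error accumulated in T, N, and the curvature:
accel_raw = np.stack((d2x_dt2, d2y_dt2), axis=1)  # d2x/dt2 and d2y/dt2 side by side
print(np.abs(acceleration - accel_raw).max())     # small, but not exactly zero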
Related
I have this big series of length t (t = 200K rows)
prices = [200, 100, 500, 300 ..]
and I want to calculate a (t x t) matrix where each value is calculated as:
matrix[i][j] = prices[j]/prices[i] - 1
I tried this using a double for-loop, but it's too slow. Any ideas how to do it better?
for i, p0 in enumerate(prices):
    for j, p1 in enumerate(prices):
        matrix[i][j] = p1 / p0 - 1
A vectorized solution uses np.meshgrid, with prices and 1/prices as arguments (note that prices must be an array), multiplying the results and subtracting 1 to compute matrix[i][j] = prices[j]/prices[i] - 1:
a, b = np.meshgrid(p, 1/p)
a * b - 1
As an example:
p = np.array([1,4,2])
Would give:
a, b = np.meshgrid(p, 1/p)
a * b - 1
array([[ 0. , 3. , 1. ],
[-0.75, 0. , -0.5 ],
[-0.5 , 1. , 0. ]])
Quick check of some of the cells (using Python's 0-based indices, as in the question's formula):
(i,j)   prices[j]/prices[i] - 1
--------------------------------
(0,0)   1/1 - 1 = 0
(0,1)   4/1 - 1 = 3
(0,2)   2/1 - 1 = 1
(1,0)   1/4 - 1 = -0.75
Another solution:
[p] / np.array([p]).T - 1
array([[ 0. , 3. , 1. ],
[-0.75, 0. , -0.5 ],
[-0.5 , 1. , 0. ]])
There are two idiomatic ways of doing an outer product-type operation. Either use the .outer method of universal functions, here np.divide:
In [2]: p = np.array([10, 20, 30, 40])
In [3]: np.divide.outer(p, p)
Out[3]:
array([[ 1. , 0.5 , 0.33333333, 0.25 ],
[ 2. , 1. , 0.66666667, 0.5 ],
[ 3. , 1.5 , 1. , 0.75 ],
[ 4. , 2. , 1.33333333, 1. ]])
Alternatively, use broadcasting:
In [4]: p[:, None] / p[None, :]
Out[4]:
array([[ 1. , 0.5 , 0.33333333, 0.25 ],
[ 2. , 1. , 0.66666667, 0.5 ],
[ 3. , 1.5 , 1. , 0.75 ],
[ 4. , 2. , 1.33333333, 1. ]])
This p[None, :] can itself be spelled as a reshape, p.reshape((1, len(p))), but the None-indexing version reads better.
Both are equivalent to a double for-loop:
In [6]: o = np.empty((len(p), len(p)))
In [7]: for i in range(len(p)):
...: for j in range(len(p)):
...: o[i, j] = p[i] / p[j]
...:
In [8]: o
Out[8]:
array([[ 1. , 0.5 , 0.33333333, 0.25 ],
[ 2. , 1. , 0.66666667, 0.5 ],
[ 3. , 1.5 , 1. , 0.75 ],
[ 4. , 2. , 1.33333333, 1. ]])
I guess it can be done in this way
import numpy
prices = [200., 300., 100., 500., 600.]
x = numpy.array(prices).reshape(1, len(prices))
matrix = (1/x.T) * x - 1
Let me explain in detail. This matrix is the outer product of a column vector of element-wise reciprocal price values and a row vector of the original price values, with 1 then subtracted from every element.
First of all, we create a row vector from the prices list:
x = numpy.array(prices).reshape(1, len(prices))
Reshaping is required here. Otherwise your vector would have shape (len(prices),) rather than the required (1, len(prices)).
Then we compute a column vector of element-wise reciprocal price values:
(1/x.T)
Finally, we compute the resulting matrix
matrix = (1/x.T) * x - 1
Here the trailing - 1 is broadcast across the matrix (1/x.T) * x, subtracting 1 from every element.
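As a quick check of this answer (a sketch with made-up prices), m[0, 1] should equal prices[1]/prices[0] - 1:
prices = [200., 300., 100.]
x = numpy.array(prices).reshape(1, len(prices))
m = (1/x.T) * x - 1
print(m[0, 1])  # 300/200 - 1 = 0.5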
I have a function that creates a 2-dim array (a Vandermonde matrix), and it is called as:
vandermonde(generator, rank)
Where generator is a n-sized array for example
generator = np.array([-1/2, 1/2, 3/2, 5/2, 7/2, 9/2])
and rank=4
Then I need to create 4 Vandermonde matrices (because rank=4), each skewed by h in my space (h is arbitrary here; let's say h=1).
Therefore I came up with the following hard-coded version:
V = np.array([
vandermonde(generator-0*h, rank),
vandermonde(generator-1*h, rank),
vandermonde(generator-2*h, rank),
vandermonde(generator-3*h, rank)
])
Then, instead of making multiple manual calls to vandermonde, I used a for-loop:
V=[]
for i in range(rank):
V.append(vandermonde(generator - h*i, rank))
V = np.array(V)
This approach works fine, but seems too "patchy". I tried an np.append approach as below:
M = np.array([])
for i in range(rank):
M = np.append(M,[vandermonde(generator - h*i, rank)])
But it didn't work as I expected; np.append seems to flatten and grow the array instead of adding a new element.
My questions are:
How can I avoid standard Python lists and use a direct numpy approach? np.append does not behave as I expect: it just grows a flat array instead of adding a new array element.
Is there a more direct numpy approach to this?
My vandermonde function is:
def vandermonde(generator, rank=None):
"""Returns a vandermonde matrix
If rank is not passed, returns a square vandermonde matrix
"""
if rank is None:
rank = len(generator)
return np.tile(generator,(rank,1)) ** np.array(range(rank)).reshape((rank,1))
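As an aside (my note, not part of the question): numpy ships a builtin for this, and the function above appears to be equivalent to
np.vander(generator, rank, increasing=True).T
since np.vander with increasing=True puts the powers 0..rank-1 in the columns, and transposing moves them to the rows.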
The expected answer is a 3-dimensional array of shape (rank, rank, len(generator)), stacking the rank skewed Vandermonde matrices (each of shape (rank, len(generator))) along the first axis. For the constants above (generator, rank, h) we have:
V= array([[[ 1. , 1. , 1. , 1. , 1. , 1. ],
[ -0.5 , 0.5 , 1.5 , 2.5 , 3.5 , 4.5 ],
[ 0.25, 0.25, 2.25, 6.25, 12.25, 20.25],
[ -0.12, 0.12, 3.38, 15.62, 42.88, 91.12]],
[[ 1. , 1. , 1. , 1. , 1. , 1. ],
[ -1.5 , -0.5 , 0.5 , 1.5 , 2.5 , 3.5 ],
[ 2.25, 0.25, 0.25, 2.25, 6.25, 12.25],
[ -3.38, -0.12, 0.12, 3.38, 15.62, 42.88]],
[[ 1. , 1. , 1. , 1. , 1. , 1. ],
[ -2.5 , -1.5 , -0.5 , 0.5 , 1.5 , 2.5 ],
[ 6.25, 2.25, 0.25, 0.25, 2.25, 6.25],
[-15.62, -3.38, -0.12, 0.12, 3.38, 15.62]],
[[ 1. , 1. , 1. , 1. , 1. , 1. ],
[ -3.5 , -2.5 , -1.5 , -0.5 , 0.5 , 1.5 ],
[ 12.25, 6.25, 2.25, 0.25, 0.25, 2.25],
[-42.88, -15.62, -3.38, -0.12, 0.12, 3.38]]])
Some related ideas can be found in this discussion on: efficient-way-to-compute-the-vandermonde-matrix
Use broadcasting to get the final 3D array in a vectorized manner -
r = np.arange(rank)
V_out = (generator - h*r[:,None,None]) ** r[:,None]
We can also use cumprod to build up the powers for another solution -
gr = np.repeat(generator - h*r[:,None,None], rank, axis=1)
gr[:,0] = 1
out = gr.cumprod(1)
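A quick equivalence check (a sketch using the question's vandermonde function and constants) confirms the broadcast version matches the loop:
h = 1
r = np.arange(rank)
V_loop = np.array([vandermonde(generator - h*i, rank) for i in range(rank)])
V_out = (generator - h*r[:,None,None]) ** r[:,None]
print(np.allclose(V_loop, V_out))  # True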
I generate a matrix that I want to get the covariance of:
test=np.array([4,2,.6,4.2,2.1,.59,3.9,2,.58,4.3,2.1,.62,4.1,2.2,.63]).reshape(5,3)
test
array([[ 4. , 2. , 0.6 ],
[ 4.2 , 2.1 , 0.59],
[ 3.9 , 2. , 0.58],
[ 4.3 , 2.1 , 0.62],
[ 4.1 , 2.2 , 0.63]])
I calculate the covariance with the numpy function:
np.cov(test)
array([[ 2.92 , 3.098 , 2.846 , 3.164 , 2.966 ],
[ 3.098 , 3.28703333, 3.0199 , 3.3566 , 3.1479 ],
[ 2.846 , 3.0199 , 2.7748 , 3.0832 , 2.8933 ],
[ 3.164 , 3.3566 , 3.0832 , 3.4288 , 3.2122 ],
[ 2.966 , 3.1479 , 2.8933 , 3.2122 , 3.0193 ]])
This, however, differs from following the covariance formula directly:
mean=np.mean(test,0)
np.dot(test-mean,(test-mean).T)/(5-1)
array([[ 0.004104, -0.002886, 0.006624, -0.005416, -0.002426],
[-0.002886, 0.002649, -0.005316, 0.005044, 0.000509],
[ 0.006624, -0.005316, 0.011744, -0.010496, -0.002556],
[-0.005416, 0.005044, -0.010496, 0.010164, 0.000704],
[-0.002426, 0.000509, -0.002556, 0.000704, 0.003769]])
This does not match the numpy calculations.
In fact, I took a peek at the source code, and the equation is (x-m) * (x-m).T.conj() / (N - 1), which I believe I am implementing.
The difference comes from the fact that np.cov computes the covariance between row vectors, which is why the result is 5*5 instead of 3*3. But np.mean(test, 0) averages over the column vectors, so test - mean broadcasts the subtraction along the columns, which differs from what np.cov is doing. The fix takes two steps:
Firstly, make sure the mean is calculated for each row, which can be done by simply transposing the test matrix:
mean = np.mean(test.T, 0)
Then, when calculating x - x_bar, reshape the mean vector so that the subtraction also runs along the rows. And since each variable is now a row vector with 3 observations, N is 3 instead of 5, so we divide by 3 - 1. With these fixes, it gives the same results as np.cov:
np.dot(test-mean[:, None],(test-mean[:, None]).T)/(3-1)
# array([[ 2.92 , 3.098 , 2.846 , 3.164 , 2.966 ],
# [ 3.098 , 3.28703333, 3.0199 , 3.3566 , 3.1479 ],
# [ 2.846 , 3.0199 , 2.7748 , 3.0832 , 2.8933 ],
# [ 3.164 , 3.3566 , 3.0832 , 3.4288 , 3.2122 ],
# [ 2.966 , 3.1479 , 2.8933 , 3.2122 , 3.0193 ]])
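Worth noting: if what you actually wanted was the 3*3 covariance of the columns (each column treated as a variable), np.cov supports that directly via its rowvar flag:
np.cov(test, rowvar=False)  # 3x3; equivalent to np.cov(test.T)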
I've become sort of used to broadcasting with 2 dimensional arrays, but I can't get my head around this 3-dimensional thing I want to do.
I have two 2-dimensional arrays:
>>> a = np.array([[0.01,.2,.3,.4],[.2,.03,.4,.5],[.9,.8,.7,.06]])
>>> b= np.array([[1,2,3],[3.,4,5]])
>>> a
array([[ 0.01, 0.2 , 0.3 , 0.4 ],
[ 0.2 , 0.03, 0.4 , 0.5 ],
[ 0.9 , 0.8 , 0.7 , 0.06]])
>>> b
array([[ 1., 2., 3.],
[ 3., 4., 5.]])
Now, what I want is the sum of all rows in a, where each row is weighted by the column values in b.
So, I want 1. * a[0,:] + 2. * a[1,:] + 3. * a[2,:] and the same for the second row of b.
So, I know how to do this step-by-step:
>>> (np.array([b[0]]).T * a).sum(0)
array([ 3.11, 2.66, 3.2 , 1.58])
>>> (np.array([b[1]]).T * a).sum(0)
array([ 5.33, 4.72, 6. , 3.5 ])
But I have the feeling that if I knew how to broadcast the two correctly as 3-dimensional arrays I could get the result I want in one go.
The result being:
array([[ 3.11, 2.66, 3.2 , 1.58],
[ 5.33, 4.72, 6. , 3.5 ]])
I guess this shouldn't be too hard..?!?
You want to do matrix multiplication:
>>> b.dot(a)
array([[ 3.11, 2.66, 3.2 , 1.58],
[ 5.33, 4.72, 6. , 3.5 ]])
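For completeness, since the question asked about broadcasting: the explicit 3-D broadcast version (a sketch) inserts a length-1 axis in each array so the products line up, then sums out the shared axis. It gives the same result as b.dot(a):
(b[:, :, None] * a[None, :, :]).sum(axis=1)  # (2,3,1) * (1,3,4) -> (2,3,4) -> (2,4)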
I am trying to plot a feature map (SOM) using python.
To keep it simple, imagine a 2D plot where each unit is represented as a hexagon.
As shown in this topic: Hexagonal Self-Organizing map in Python, the hexagons sit side by side, arranged in a grid.
I managed to write the following piece of code and it works perfectly for a set number of polygons and only a few shapes (6 x 6 or 10 x 4 hexagons, for example). However, one important feature of a method like this is to support any grid shape from 3 x 3 up.
def plot_map(grid,
d_matrix,
w=10,
title='SOM Hit map'):
"""
Plot hexagon map where each neuron is represented by a hexagon. The hexagon
color is given by the distance between the neurons (D-Matrix). Scaled
hexagons will appear on top of the background image if the hits array
is provided. They are scaled according to the number of hits on each
neuron.
Args:
- grid: Grid dictionary (keys: centers, x, y ),
- d_matrix: array containing the distances between each neuron
- w: width of the map in inches
- title: map title
Returns the Matplotlib SubAxis instance
"""
n_centers = grid['centers']
x, y = grid['x'], grid['y']
fig = plt.figure(figsize=(1.05 * w, 0.85 * y * w / x), dpi=100)
ax = fig.add_subplot(111)
ax.axis('equal')
# Discover difference between centers
collection_bg = RegularPolyCollection(
numsides=6, # a hexagon
rotation=0,
sizes=(y * (1.3 * 2 * math.pi * w) ** 2 / x,),
edgecolors = (0, 0, 0, 1),
array= d_matrix,
cmap = cm.gray,
offsets = n_centers,
transOffset = ax.transData,
)
ax.add_collection(collection_bg, autolim=True)
ax.axis('off')
ax.autoscale_view()
ax.set_title(title)
divider = make_axes_locatable(ax)
cax = divider.append_axes("right", size="5%", pad=0.05)
plt.colorbar(collection_bg, cax=cax)
return ax
I've tried to make something that automatically understands the grid shape. It didn't work (and I'm not sure why): an undesired space always appears between the hexagons.
Summarising: I would like to generate 3x3 or 6x6 or 10x4 (and so on) grids of hexagons with no spaces in between, for given points and a set plot width.
As requested, here is the data for the hexagon locations. As you can see, it is always the same pattern:
3x3
{'centers': array([[ 1.5 , 0.8660254 ],
[ 2.5 , 0.8660254 ],
[ 3.5 , 0.8660254 ],
[ 1. , 1.73205081],
[ 2. , 1.73205081],
[ 3. , 1.73205081],
[ 1.5 , 2.59807621],
[ 2.5 , 2.59807621],
[ 3.5 , 2.59807621]]),
'x': array([ 3.]),
'y': array([ 3.])}
6x6
{'centers': array([[ 1.5 , 0.8660254 ],
[ 2.5 , 0.8660254 ],
[ 3.5 , 0.8660254 ],
[ 4.5 , 0.8660254 ],
[ 5.5 , 0.8660254 ],
[ 6.5 , 0.8660254 ],
[ 1. , 1.73205081],
[ 2. , 1.73205081],
[ 3. , 1.73205081],
[ 4. , 1.73205081],
[ 5. , 1.73205081],
[ 6. , 1.73205081],
[ 1.5 , 2.59807621],
[ 2.5 , 2.59807621],
[ 3.5 , 2.59807621],
[ 4.5 , 2.59807621],
[ 5.5 , 2.59807621],
[ 6.5 , 2.59807621],
[ 1. , 3.46410162],
[ 2. , 3.46410162],
[ 3. , 3.46410162],
[ 4. , 3.46410162],
[ 5. , 3.46410162],
[ 6. , 3.46410162],
[ 1.5 , 4.33012702],
[ 2.5 , 4.33012702],
[ 3.5 , 4.33012702],
[ 4.5 , 4.33012702],
[ 5.5 , 4.33012702],
[ 6.5 , 4.33012702],
[ 1. , 5.19615242],
[ 2. , 5.19615242],
[ 3. , 5.19615242],
[ 4. , 5.19615242],
[ 5. , 5.19615242],
[ 6. , 5.19615242]]),
'x': array([ 6.]),
'y': array([ 6.])}
11x4
{'centers': array([[ 1.5 , 0.8660254 ],
[ 2.5 , 0.8660254 ],
[ 3.5 , 0.8660254 ],
[ 4.5 , 0.8660254 ],
[ 5.5 , 0.8660254 ],
[ 6.5 , 0.8660254 ],
[ 7.5 , 0.8660254 ],
[ 8.5 , 0.8660254 ],
[ 9.5 , 0.8660254 ],
[ 10.5 , 0.8660254 ],
[ 11.5 , 0.8660254 ],
[ 1. , 1.73205081],
[ 2. , 1.73205081],
[ 3. , 1.73205081],
[ 4. , 1.73205081],
[ 5. , 1.73205081],
[ 6. , 1.73205081],
[ 7. , 1.73205081],
[ 8. , 1.73205081],
[ 9. , 1.73205081],
[ 10. , 1.73205081],
[ 11. , 1.73205081],
[ 1.5 , 2.59807621],
[ 2.5 , 2.59807621],
[ 3.5 , 2.59807621],
[ 4.5 , 2.59807621],
[ 5.5 , 2.59807621],
[ 6.5 , 2.59807621],
[ 7.5 , 2.59807621],
[ 8.5 , 2.59807621],
[ 9.5 , 2.59807621],
[ 10.5 , 2.59807621],
[ 11.5 , 2.59807621],
[ 1. , 3.46410162],
[ 2. , 3.46410162],
[ 3. , 3.46410162],
[ 4. , 3.46410162],
[ 5. , 3.46410162],
[ 6. , 3.46410162],
[ 7. , 3.46410162],
[ 8. , 3.46410162],
[ 9. , 3.46410162],
[ 10. , 3.46410162],
[ 11. , 3.46410162]]),
'x': array([ 11.]),
'y': array([ 4.])}
I've managed to find a workaround by calculating the figure size in inches according to the given dpi. After that, I compute the pixel distance between two adjacent points (by plotting them using a hidden scatter plot). This way I can calculate the hexagon apothem and correctly estimate the size of the hexagon's inner circle (as matplotlib expects).
No gaps in the end!
import matplotlib.pyplot as plt
from matplotlib import colors, cm
from matplotlib.collections import RegularPolyCollection
from mpl_toolkits.axes_grid1 import make_axes_locatable
import math
import numpy as np
def plot_map(grid,
d_matrix,
w=1080,
dpi=72.,
title='SOM Hit map'):
"""
Plot hexagon map where each neuron is represented by a hexagon. The hexagon
color is given by the distance between the neurons (D-Matrix)
Args:
- grid: Grid dictionary (keys: centers, x, y ),
- d_matrix: array containing the distances between each neuron
- w: width of the map in pixels (converted to inches via dpi)
- title: map title
Returns the Matplotlib SubAxis instance
"""
n_centers = grid['centers']
x, y = grid['x'], grid['y']
# Size of figure in inches
xinch = (x * w / y) / dpi
yinch = (y * w / x) / dpi
fig = plt.figure(figsize=(xinch, yinch), dpi=dpi)
ax = fig.add_subplot(111, aspect='equal')
# Get the pixel distance between two data points
xpoints = n_centers[:, 0]
ypoints = n_centers[:, 1]
ax.scatter(xpoints, ypoints, s=0.0, marker='s')
ax.axis([min(xpoints)-1., max(xpoints)+1.,
min(ypoints)-1., max(ypoints)+1.])
xy_pixels = ax.transData.transform(np.vstack([xpoints, ypoints]).T)
xpix, ypix = xy_pixels.T
# In matplotlib, 0,0 is the lower left corner, whereas it's usually the
# upper left for most image software, so we'll flip the y-coords
width, height = fig.canvas.get_width_height()
ypix = height - ypix
# discover the hexagon apothem and the matching inner-circle area
apothem = .9 * (xpix[1] - xpix[0]) / math.sqrt(3)
area_inner_circle = math.pi * (apothem ** 2)
collection_bg = RegularPolyCollection(
numsides=6, # a hexagon
rotation=0,
sizes=(area_inner_circle,),
edgecolors = (0, 0, 0, 1),
array= d_matrix,
cmap = cm.gray,
offsets = n_centers,
transOffset = ax.transData,
)
ax.add_collection(collection_bg, autolim=True)
ax.axis('off')
ax.autoscale_view()
ax.set_title(title)
divider = make_axes_locatable(ax)
cax = divider.append_axes("right", size="10%", pad=0.05)
plt.colorbar(collection_bg, cax=cax)
return ax
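A hypothetical usage sketch (my addition), feeding the 3x3 grid from the question with made-up distance values:
centers = np.array([[1.5, 0.8660254], [2.5, 0.8660254], [3.5, 0.8660254],
                    [1. , 1.73205081], [2. , 1.73205081], [3. , 1.73205081],
                    [1.5, 2.59807621], [2.5, 2.59807621], [3.5, 2.59807621]])
grid = {'centers': centers, 'x': np.array([3.]), 'y': np.array([3.])}
d_matrix = np.random.rand(9)  # made-up distance value for each of the 9 neurons
ax = plot_map(grid, d_matrix)
plt.show()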