getting a list of coordinates from a 2D matrix - python

Let's say I have a 10 x 20 matrix of values (so 200 data points)
values = np.random.rand(10,20)
with a known regular spacing between coordinates so that the x and y coordinates are defined by
coord_x = np.arange(0,5,0.5) --> gives [0.0,0.5,1.0,1.5...4.5]
coord_y = np.arange(0,5,0.25) --> gives [0.0,0.25,0.50,0.75...4.75]
I'd like to get an array representing each coordinate point, so that
the shape of the array is (200,2), 200 being the total number of points and the extra dimension simply holding x and y, such as
coord[0][0]=0.0, coord[0][1]=0.0
coord[1][0]=0.0, coord[1][1]=0.25
coord[2][0]=0.0, coord[2][1]=0.50
...
coord[19][0]=0.0, coord[19][1]=4.75
coord[20][0]=0.5, coord[20][1]=0.0
coord[21][0]=0.5, coord[21][1]=0.25
coord[22][0]=0.5, coord[22][1]=0.50
...
coord[199][0]=4.5, coord[199][1]=4.75
That would be a fairly easy thing to do with a double for loop, but I wonder if there is a more elegant solution using built-in numpy (or other) functions?

I think meshgrid may be what you're looking for.
Here's an example, with a smaller number of data points:
>>> from numpy import fliplr, dstack, meshgrid, linspace
>>> x, y, nx, ny = 4.5, 4.5, 3, 10
>>> Xs = linspace(0, x, nx)
>>> Ys = linspace(0, y, ny)
>>> fliplr(dstack(meshgrid(Xs, Ys)).reshape(nx * ny, 2))
array([[ 0. , 0. ],
[ 0. , 2.25],
[ 0. , 4.5 ],
[ 0.5 , 0. ],
[ 0.5 , 2.25],
[ 0.5 , 4.5 ],
[ 1. , 0. ],
[ 1. , 2.25],
[ 1. , 4.5 ],
[ 1.5 , 0. ],
[ 1.5 , 2.25],
[ 1.5 , 4.5 ],
[ 2. , 0. ],
[ 2. , 2.25],
[ 2. , 4.5 ],
[ 2.5 , 0. ],
[ 2.5 , 2.25],
[ 2.5 , 4.5 ],
[ 3. , 0. ],
[ 3. , 2.25],
[ 3. , 4.5 ],
[ 3.5 , 0. ],
[ 3.5 , 2.25],
[ 3.5 , 4.5 ],
[ 4. , 0. ],
[ 4. , 2.25],
[ 4. , 4.5 ],
[ 4.5 , 0. ],
[ 4.5 , 2.25],
[ 4.5 , 4.5 ]])

I think you meant coord_y = np.arange(0,5,0.25) in your question. You can do
from numpy import meshgrid,column_stack
x,y=meshgrid(coord_x,coord_y)
coord = column_stack((x.T.flatten(),y.T.flatten()))
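A variant of the same idea, as a minimal sketch: passing indexing='ij' to meshgrid makes x the slow (outer) axis directly, so no transpose is needed. The variable names mirror the question; using np.stack(..., axis=-1) followed by reshape is my own choice here, not something taken from the answers above.

```python
import numpy as np

coord_x = np.arange(0, 5, 0.5)    # 10 values: 0.0 ... 4.5
coord_y = np.arange(0, 5, 0.25)   # 20 values: 0.0 ... 4.75

# indexing='ij' gives grids of shape (len(coord_x), len(coord_y)),
# so x varies along rows and y varies along columns.
xx, yy = np.meshgrid(coord_x, coord_y, indexing='ij')

# Pair the two grids along a new last axis, then flatten to (200, 2).
coord = np.stack((xx, yy), axis=-1).reshape(-1, 2)

print(coord.shape)   # (200, 2)
print(coord[:3])     # first three points: x fixed at 0.0, y increasing
print(coord[20])     # first point of the second x value
```

The row order is the one asked for in the question: y runs fastest, and x advances every 20 rows.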

Related

axis -1 in numpy array

I am struggling to understand two things from below matrix (numpy arrays):
How can I deduce from the np.stack(cart_indexing, axis=1) function that there are 5 dimensions? I am struggling to conceptually understand the (5, 2, 5) part. I see it as (rows, column numbers, dimensions).
What does axis = -1 really mean? How to understand it?
x = np.linspace(start=-10, stop=0, num=5, endpoint=True)
y = np.linspace(start=1, stop=10, num=5)
cart_indexing = np.meshgrid(x, y, indexing="xy") # cartesian indexing
>> [array([[-10. , -7.5, -5. , -2.5, 0. ],
[-10. , -7.5, -5. , -2.5, 0. ],
[-10. , -7.5, -5. , -2.5, 0. ],
[-10. , -7.5, -5. , -2.5, 0. ],
[-10. , -7.5, -5. , -2.5, 0. ]]),
array([[ 1. , 1. , 1. , 1. , 1. ],
[ 3.25, 3.25, 3.25, 3.25, 3.25],
[ 5.5 , 5.5 , 5.5 , 5.5 , 5.5 ],
[ 7.75, 7.75, 7.75, 7.75, 7.75],
[10. , 10. , 10. , 10. , 10. ]])]
np.stack(cart_indexing, axis=0)
>> array([[[-10. , -7.5 , -5. , -2.5 , 0. ],
[-10. , -7.5 , -5. , -2.5 , 0. ],
[-10. , -7.5 , -5. , -2.5 , 0. ],
[-10. , -7.5 , -5. , -2.5 , 0. ],
[-10. , -7.5 , -5. , -2.5 , 0. ]],
[[ 1. , 1. , 1. , 1. , 1. ],
[ 3.25, 3.25, 3.25, 3.25, 3.25],
[ 5.5 , 5.5 , 5.5 , 5.5 , 5.5 ],
[ 7.75, 7.75, 7.75, 7.75, 7.75],
[ 10. , 10. , 10. , 10. , 10. ]]])
np.stack(cart_indexing, axis=1)
>> array([[[-10. , -7.5 , -5. , -2.5 , 0. ],
[ 1. , 1. , 1. , 1. , 1. ]],
[[-10. , -7.5 , -5. , -2.5 , 0. ],
[ 3.25, 3.25, 3.25, 3.25, 3.25]],
[[-10. , -7.5 , -5. , -2.5 , 0. ],
[ 5.5 , 5.5 , 5.5 , 5.5 , 5.5 ]],
[[-10. , -7.5 , -5. , -2.5 , 0. ],
[ 7.75, 7.75, 7.75, 7.75, 7.75]],
[[-10. , -7.5 , -5. , -2.5 , 0. ],
[ 10. , 10. , 10. , 10. , 10. ]]])
np.stack(cart_indexing, axis=1).shape
>> (5, 2, 5)
np.stack(cart_indexing, axis=-1)
>> array([[[-10. , 1. ],
[ -7.5 , 1. ],
[ -5. , 1. ],
[ -2.5 , 1. ],
[ 0. , 1. ]],
[[-10. , 3.25],
[ -7.5 , 3.25],
[ -5. , 3.25],
[ -2.5 , 3.25],
[ 0. , 3.25]],
[[-10. , 5.5 ],
[ -7.5 , 5.5 ],
[ -5. , 5.5 ],
[ -2.5 , 5.5 ],
[ 0. , 5.5 ]],
[[-10. , 7.75],
[ -7.5 , 7.75],
[ -5. , 7.75],
[ -2.5 , 7.75],
[ 0. , 7.75]],
[[-10. , 10. ],
[ -7.5 , 10. ],
[ -5. , 10. ],
[ -2.5 , 10. ],
[ 0. , 10. ]]])
np.stack(cart_indexing, axis=-1).shape
>> (5, 5, 2)
It's not clear what you mean by
there are 5 dimensions
None of your arrays have 5 dimensions. You start with a list of 2 arrays with 2 dimensions;
for i in cart_indexing:
    print(f"Shape:{i.shape}; Dimensions:{i.ndim}")
Shape:(5, 5); Dimensions:2
Shape:(5, 5); Dimensions:2
Notice the numbers involved here: each array has shape (5, 5), and there are 2 of them.
Then, the axis parameter in your stack comes into play:
for i in range(3):
    print(f"Stacked on axis {i} my array has {np.stack(cart_indexing, axis=i).ndim} dimensions and a shape of {np.stack(cart_indexing, axis=i).shape}")
Stacked on axis 0 my array has 3 dimensions and a shape of (2, 5, 5) #the 2 is in the (axis=)0th position
Stacked on axis 1 my array has 3 dimensions and a shape of (5, 2, 5) #the 2 is in the (axis=)1st position
Stacked on axis 2 my array has 3 dimensions and a shape of (5, 5, 2) #the 2 is in the (axis=)2nd position
Put another way, stacking adds a new dimension along which the arrays are stacked. The axis parameter determines the position of that new dimension in the resulting shape.
What does axis = -1 really mean?
Why does print("Hello world"[-1]) print "d"? Because negative indices count from the end.
The same applies to axes. If we count our dimensions from last to first:
for i in range(-3,0):
    print(f"Stacked on axis {i} my array has {np.stack(cart_indexing, axis=i).ndim} dimensions and a shape of {np.stack(cart_indexing, axis=i).shape}")
Stacked on axis -3 my array has 3 dimensions and a shape of (2, 5, 5) #dimension that is third from last
Stacked on axis -2 my array has 3 dimensions and a shape of (5, 2, 5) #dimension that is second from last
Stacked on axis -1 my array has 3 dimensions and a shape of (5, 5, 2) #last dimension
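To make the negative-axis rule concrete, here is a small check (a sketch; the arrays a and b are placeholders I made up, not the meshgrid output above). For np.stack, the result has one more dimension than the inputs, and axis=-1 names the last axis of that result.

```python
import numpy as np

a = np.zeros((5, 5))
b = np.ones((5, 5))

stacked_last = np.stack([a, b], axis=-1)
stacked_two = np.stack([a, b], axis=2)

# The result is 3-D, so axis=-1 and axis=2 refer to the same position.
print(stacked_last.shape)                          # (5, 5, 2)
print(np.array_equal(stacked_last, stacked_two))   # True

# Indexing along the new axis recovers the original arrays.
print(np.array_equal(stacked_last[..., 0], a))     # True
print(np.array_equal(stacked_last[..., 1], b))     # True
```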

python: DELETE points out of a very big 2D array and elements are float, like discarding unwanted points in KNN

I have a 2D array of floats and I want to delete a point from it. Suppose the array is so big that I can't just look up the index and grab the row by hand.
How can I delete this point, both with a loop and without a loop? The following is the 2D array, and I want to delete [ 32.9, 23.]
[[ 1. , -1.4],
[ -2.9, -1.5],
[ -3.6, -2. ],
[ 1.5, 1. ],
[ 24. , 11. ],
[ -1. , 1.4],
[ 2.9, 1.5],
[ 3.6, 2. ],
[ -1.5, -1. ],
[ -24. , -11. ],
[ 32.9, 23. ],
[-440. , 310. ]]
I tried this but it doesn't work:
this_point = np.asarray([ 32.9, 23.])
[x for x in y if x == point]
del datapoints[this_point]
np.delete(datapoints,len(datapoints), axis=0)
for this_point in datapoints:
    del this_point
When I do this, this_point is still there when I print all the points. What should I do?
Python can remove a list element by content, but numpy deletes only by index. So, use "where" to find the index of the matching row:
import numpy as np
a = np.array([[ 1. , -1.4],
[ -2.9, -1.5],
[ -3.6, -2. ],
[ 1.5, 1. ],
[ 24. , 11. ],
[ -1. , 1.4],
[ 2.9, 1.5],
[ 3.6, 2. ],
[ -1.5, -1. ],
[ -24. , -11. ],
[ 32.9, 23. ],
[-440. , 310. ]])
find = np.array([32.9,23.])
row = np.where( (a == find).all(axis=1))
print( row )
print(np.delete( a, row, axis=0 ) )
Output:
(array([10], dtype=int64),)
[[ 1. -1.4]
[ -2.9 -1.5]
[ -3.6 -2. ]
[ 1.5 1. ]
[ 24. 11. ]
[ -1. 1.4]
[ 2.9 1.5]
[ 3.6 2. ]
[ -1.5 -1. ]
[ -24. -11. ]
[-440. 310. ]]
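Because the values are floats, an exact == comparison can miss a row whose values differ only by rounding error. A hedged alternative, assuming a tolerance-based match is acceptable for your data: build a boolean mask with np.isclose and keep the rows that do not match. The names points, target and mask are illustrative, and the array is a shortened copy of the one in the question.

```python
import numpy as np

points = np.array([[1.0, -1.4],
                   [24.0, 11.0],
                   [32.9, 23.0],
                   [-440.0, 310.0]])
target = np.array([32.9, 23.0])

# True for rows where every coordinate is within np.isclose's default tolerance.
mask = np.isclose(points, target).all(axis=1)

# Boolean indexing keeps the non-matching rows; no explicit loop is needed.
remaining = points[~mask]
print(remaining)

# Loop version, for comparison (slower on large arrays).
remaining_loop = np.array([p for p in points if not np.allclose(p, target)])
print(np.array_equal(remaining, remaining_loop))
```

Neither version modifies points in place; both build a new array, which is also what np.delete does.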

matrix.dot(inv(matrix)) isn't equal to identity matrix

I've been stuck on this for hours: I don't understand why the D matrix below doesn't equal the identity matrix:
import numpy as np
from numpy.linalg import inv
A = np.random.randint(50, size=(100, 2))
V = A.dot(A.T)
D = V.dot(inv(V))
D
The result I get is:
array([[ 3.26611328, 7.87890625, 14.1953125 , ..., 2. ,
-5. , -24. ],
[ -5.91061401, -26.05834961, 5.30126953, ..., -10. ,
8. , -16. ],
[ -2.64431763, 3.55639648, 3.10107422, ..., -0.5 ,
-5. , -4. ],
...,
[ -2.62512207, -7.78222656, 10.26367188, ..., -6. ,
18. , 0. ],
[ -3.0625 , 14. , -4. , ..., -0.0625 ,
0. , 8. ],
[ 2. , -7. , 16. , ..., -7.5 ,
-8. , -4. ]])
Thank you for your help
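I've found my issue:
I was trying to compute inv() of a matrix whose determinant is 0, which is why the result was wrong. A has shape (100, 2), so V = A.dot(A.T) is a 100 x 100 matrix of rank at most 2, and is therefore singular. The 2 x 2 product A.T.dot(A) is invertible instead:
D = A.T.dot(A)
inv(D).dot(D)
and then I get the identity matrix.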
Thank you
Habib
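A quick way to see the problem, as a sketch (the seeded generator and the names V and G are my own choices, not from the question): A has shape (100, 2), so A.dot(A.T) is 100 x 100 but has rank at most 2 and cannot be inverted, whereas A.T.dot(A) is only 2 x 2 and is invertible whenever the two columns of A are linearly independent.

```python
import numpy as np
from numpy.linalg import inv, matrix_rank

rng = np.random.default_rng(0)          # seeded only for repeatability
A = rng.integers(0, 50, size=(100, 2)).astype(float)

V = A.dot(A.T)      # 100 x 100, rank <= 2: singular
G = A.T.dot(A)      # 2 x 2 Gram matrix

# The rank is far below 100, so V has no inverse.
print(V.shape, matrix_rank(V))

# The small matrix inverts cleanly, up to floating-point error.
print(np.allclose(inv(G).dot(G), np.eye(2)))
```

Note that inv() does not always raise an error for a matrix like V, because rounding can hide the exact singularity. Checking the rank (or np.linalg.cond) before inverting is a safer habit.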

Python: Generate a new array with more columns based on another array

I have this array:
I need to create a new array like this:
I guess I need to use a conditional, but I don't know how to create an array with 7 columns based on the values of a 5-column array.
If anyone could help me, I'd be grateful!
I'm going to assume you want to convert your last column into one-hot encodings and then concatenate them to your original array. You can initialise an array of zeros, and then set the appropriate indices to 1. Finally, concatenate the OHE array to your original.
MCVE:
print(arr)
array([[ -9.95, 15.27, 9.08, 1. ],
[ -6.81, 11.87, 8.38, 2. ],
[ -3.02, 11.08, -8.5 , 1. ],
[ -5.73, -2.29, -2.09, 2. ],
[ -7.01, -0.9 , 12.91, 2. ],
[-11.64, -10.3 , 2.09, 2. ],
[ 17.85, 13.7 , 2.14, 0. ],
[ 6.34, -9.49, -8.05, 2. ],
[ 18.62, -9.43, -1.02, 1. ],
[ -2.15, -23.65, -13.03, 1. ]])
c = arr[:, -1].astype(int)
ohe = np.zeros((c.shape[0], c.max() + 1))
ohe[np.arange(c.shape[0]), c] = 1
arr = np.hstack((arr[:, :-1], ohe))
print(arr)
array([[ -9.95, 15.27, 9.08, 0. , 1. , 0. ],
[ -6.81, 11.87, 8.38, 0. , 0. , 1. ],
[ -3.02, 11.08, -8.5 , 0. , 1. , 0. ],
[ -5.73, -2.29, -2.09, 0. , 0. , 1. ],
[ -7.01, -0.9 , 12.91, 0. , 0. , 1. ],
[-11.64, -10.3 , 2.09, 0. , 0. , 1. ],
[ 17.85, 13.7 , 2.14, 1. , 0. , 0. ],
[ 6.34, -9.49, -8.05, 0. , 0. , 1. ],
[ 18.62, -9.43, -1.02, 0. , 1. , 0. ],
[ -2.15, -23.65, -13.03, 0. , 1. , 0. ]])
One-line version of @COLDSPEED's answer using the np.eye trick:
np.hstack([arr[:,:-1], np.eye(arr[:,-1].astype(int).max() + 1)[arr[:,-1].astype(int)]])
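The one-liner is dense, so here it is unpacked step by step as a sketch. The three-row arr is a shortened copy of the example data above, and the intermediate names labels, n_classes and one_hot are mine. The trick is that row k of an identity matrix is exactly the one-hot vector for class k, so indexing np.eye with the label array picks one such row per sample.

```python
import numpy as np

arr = np.array([[-9.95, 15.27, 9.08, 1.0],
                [-6.81, 11.87, 8.38, 2.0],
                [17.85, 13.70, 2.14, 0.0]])

labels = arr[:, -1].astype(int)      # class index of each row: [1, 2, 0]
n_classes = labels.max() + 1         # assumes classes are 0 .. max

# Fancy indexing selects one identity-matrix row per label.
one_hot = np.eye(n_classes)[labels]

# Drop the original label column and append the one-hot columns.
result = np.hstack((arr[:, :-1], one_hot))
print(result)
```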

Matplotlib RegularPolygon collection location on the canvas

I am trying to plot a feature map (SOM) using python.
To keep it simple, imagine a 2D plot where each unit is represented as a hexagon.
As shown in this topic: Hexagonal Self-Organizing map in Python, the hexagons are located side by side, formatted as a grid.
I managed to write the following piece of code, and it works perfectly for a set number of polygons and for only a few shapes (6 x 6 or 10 x 4 hexagons, for example). However, one important feature of a method like this is to support any grid shape from 3 x 3 upwards.
def plot_map(grid,
             d_matrix,
             w=10,
             title='SOM Hit map'):
    """
    Plot hexagon map where each neuron is represented by a hexagon. The hexagon
    color is given by the distance between the neurons (D-Matrix). Scaled
    hexagons will appear on top of the background image if the hits array
    is provided. They are scaled according to the number of hits on each
    neuron.
    Args:
    - grid: Grid dictionary (keys: centers, x, y),
    - d_matrix: array containing the distances between each neuron
    - w: width of the map in inches
    - title: map title
    Returns the Matplotlib SubAxis instance
    """
    n_centers = grid['centers']
    x, y = grid['x'], grid['y']
    fig = plt.figure(figsize=(1.05 * w, 0.85 * y * w / x), dpi=100)
    ax = fig.add_subplot(111)
    ax.axis('equal')
    # Discover difference between centers
    collection_bg = RegularPolyCollection(
        numsides=6,  # a hexagon
        rotation=0,
        sizes=(y * (1.3 * 2 * math.pi * w) ** 2 / x,),
        edgecolors=(0, 0, 0, 1),
        array=d_matrix,
        cmap=cm.gray,
        offsets=n_centers,
        transOffset=ax.transData,
    )
    ax.add_collection(collection_bg, autolim=True)
    ax.axis('off')
    ax.autoscale_view()
    ax.set_title(title)
    divider = make_axes_locatable(ax)
    cax = divider.append_axes("right", size="5%", pad=0.05)
    plt.colorbar(collection_bg, cax=cax)
    return ax
I've tried to make something that automatically understands the grid shape. It didn't work (and I'm not sure why). An undesired gap always appears between the hexagons.
Summarising: I would like to generate a 3x3 or 6x6 or 10x4 (and so on) grid of hexagons with no gaps in between, for given points and a given plot width.
As requested, here is the data for the hexagon locations. As you can see, it is always the same pattern.
3x3
{'centers': array([[ 1.5 , 0.8660254 ],
[ 2.5 , 0.8660254 ],
[ 3.5 , 0.8660254 ],
[ 1. , 1.73205081],
[ 2. , 1.73205081],
[ 3. , 1.73205081],
[ 1.5 , 2.59807621],
[ 2.5 , 2.59807621],
[ 3.5 , 2.59807621]]),
'x': array([ 3.]),
'y': array([ 3.])}
6x6
{'centers': array([[ 1.5 , 0.8660254 ],
[ 2.5 , 0.8660254 ],
[ 3.5 , 0.8660254 ],
[ 4.5 , 0.8660254 ],
[ 5.5 , 0.8660254 ],
[ 6.5 , 0.8660254 ],
[ 1. , 1.73205081],
[ 2. , 1.73205081],
[ 3. , 1.73205081],
[ 4. , 1.73205081],
[ 5. , 1.73205081],
[ 6. , 1.73205081],
[ 1.5 , 2.59807621],
[ 2.5 , 2.59807621],
[ 3.5 , 2.59807621],
[ 4.5 , 2.59807621],
[ 5.5 , 2.59807621],
[ 6.5 , 2.59807621],
[ 1. , 3.46410162],
[ 2. , 3.46410162],
[ 3. , 3.46410162],
[ 4. , 3.46410162],
[ 5. , 3.46410162],
[ 6. , 3.46410162],
[ 1.5 , 4.33012702],
[ 2.5 , 4.33012702],
[ 3.5 , 4.33012702],
[ 4.5 , 4.33012702],
[ 5.5 , 4.33012702],
[ 6.5 , 4.33012702],
[ 1. , 5.19615242],
[ 2. , 5.19615242],
[ 3. , 5.19615242],
[ 4. , 5.19615242],
[ 5. , 5.19615242],
[ 6. , 5.19615242]]),
'x': array([ 6.]),
'y': array([ 6.])}
11x4
{'centers': array([[ 1.5 , 0.8660254 ],
[ 2.5 , 0.8660254 ],
[ 3.5 , 0.8660254 ],
[ 4.5 , 0.8660254 ],
[ 5.5 , 0.8660254 ],
[ 6.5 , 0.8660254 ],
[ 7.5 , 0.8660254 ],
[ 8.5 , 0.8660254 ],
[ 9.5 , 0.8660254 ],
[ 10.5 , 0.8660254 ],
[ 11.5 , 0.8660254 ],
[ 1. , 1.73205081],
[ 2. , 1.73205081],
[ 3. , 1.73205081],
[ 4. , 1.73205081],
[ 5. , 1.73205081],
[ 6. , 1.73205081],
[ 7. , 1.73205081],
[ 8. , 1.73205081],
[ 9. , 1.73205081],
[ 10. , 1.73205081],
[ 11. , 1.73205081],
[ 1.5 , 2.59807621],
[ 2.5 , 2.59807621],
[ 3.5 , 2.59807621],
[ 4.5 , 2.59807621],
[ 5.5 , 2.59807621],
[ 6.5 , 2.59807621],
[ 7.5 , 2.59807621],
[ 8.5 , 2.59807621],
[ 9.5 , 2.59807621],
[ 10.5 , 2.59807621],
[ 11.5 , 2.59807621],
[ 1. , 3.46410162],
[ 2. , 3.46410162],
[ 3. , 3.46410162],
[ 4. , 3.46410162],
[ 5. , 3.46410162],
[ 6. , 3.46410162],
[ 7. , 3.46410162],
[ 8. , 3.46410162],
[ 9. , 3.46410162],
[ 10. , 3.46410162],
[ 11. , 3.46410162]]),
'x': array([ 11.]),
'y': array([ 4.])}
I've managed to find a workaround by calculating the figure size in inches according to the given dpi. Afterwards, I compute the pixel distance between two adjacent points (by plotting them using a hidden scatter plot). This way I can calculate the hexagon apothem and correctly estimate the size of the hexagon's inner circle (as matplotlib expects).
No gaps in the end!
import matplotlib.pyplot as plt
from matplotlib import colors, cm
from matplotlib.collections import RegularPolyCollection
from mpl_toolkits.axes_grid1 import make_axes_locatable
import math
import numpy as np
def plot_map(grid,
             d_matrix,
             w=1080,
             dpi=72.,
             title='SOM Hit map'):
    """
    Plot hexagon map where each neuron is represented by a hexagon. The hexagon
    color is given by the distance between the neurons (D-Matrix)
    Args:
    - grid: Grid dictionary (keys: centers, x, y),
    - d_matrix: array containing the distances between each neuron
    - w: width of the map in pixels (divided by dpi to get inches)
    - dpi: figure resolution in dots per inch
    - title: map title
    Returns the Matplotlib SubAxis instance
    """
    n_centers = grid['centers']
    x, y = grid['x'], grid['y']
    # Size of figure in inches
    xinch = (x * w / y) / dpi
    yinch = (y * w / x) / dpi
    fig = plt.figure(figsize=(xinch, yinch), dpi=dpi)
    ax = fig.add_subplot(111, aspect='equal')
    # Get pixel size between two data points
    xpoints = n_centers[:, 0]
    ypoints = n_centers[:, 1]
    ax.scatter(xpoints, ypoints, s=0.0, marker='s')
    ax.axis([min(xpoints)-1., max(xpoints)+1.,
             min(ypoints)-1., max(ypoints)+1.])
    xy_pixels = ax.transData.transform(np.vstack([xpoints, ypoints]).T)
    xpix, ypix = xy_pixels.T
    # In matplotlib, 0,0 is the lower left corner, whereas it's usually the
    # upper left for most image software, so we'll flip the y-coords
    width, height = fig.canvas.get_width_height()
    ypix = height - ypix
    # Estimate the hexagon radius from the pixel spacing, then the circle area
    apothem = .9 * (xpix[1] - xpix[0]) / math.sqrt(3)
    area_inner_circle = math.pi * (apothem ** 2)
    collection_bg = RegularPolyCollection(
        numsides=6,  # a hexagon
        rotation=0,
        sizes=(area_inner_circle,),
        edgecolors=(0, 0, 0, 1),
        array=d_matrix,
        cmap=cm.gray,
        offsets=n_centers,
        transOffset=ax.transData,
    )
    ax.add_collection(collection_bg, autolim=True)
    ax.axis('off')
    ax.autoscale_view()
    ax.set_title(title)
    divider = make_axes_locatable(ax)
    cax = divider.append_axes("right", size="10%", pad=0.05)
    plt.colorbar(collection_bg, cax=cax)
    return ax
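The size computation in the workaround can be isolated as plain arithmetic. This is only a sketch under my own assumptions: the helper name hexagon_size is hypothetical, spacing_px stands for the pixel distance between two horizontally adjacent centers, and I am relying on the Matplotlib documentation describing sizes as the area, in points squared, of the circle circumscribing each polygon. For touching hexagons whose centers are d apart, that circle has radius d / sqrt(3), which is the quantity the code above stores in apothem before the 0.9 shrink factor.

```python
import math

def hexagon_size(spacing_px, dpi=72.0, shrink=1.0):
    """Return a `sizes` value (points^2) for hexagons whose centers are
    `spacing_px` pixels apart. `shrink` < 1 leaves a small gap."""
    spacing_pt = spacing_px * 72.0 / dpi              # pixels -> points
    circumradius = shrink * spacing_pt / math.sqrt(3)
    return math.pi * circumradius ** 2

# At 72 dpi one pixel equals one point, as in the function above.
print(hexagon_size(60.0))               # touching hexagons
print(hexagon_size(60.0, shrink=0.9))   # same 0.9 factor as the workaround
```

With dpi=72, as in the answer, the pixel-to-point conversion is a no-op. At other dpi values the conversion may be needed for the hexagons to keep touching.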
