Rearranging array of vertices into array of edges - python

I have a 3x2 array where each row represents a vertex of a triangle. I would like to reshape it in order to obtain a new array where each row represents a side.
I'm currently trying the following approach:
points = np.array([[0,0], [0,1], [1,0]])
sides = np.array([
[points[0], points[1]],
[points[1], points[2]],
[points[2], points[0]]
])
Is there any built-in function to do that in a more elegant way?

Elegance is a matter of taste, so whether the following solution is more elegant is up to you. I use np.roll to shift the row indices from [0], [1], [2] to [1], [2], [0], and then pair the unshifted and shifted arrays using np.stack, much as your manual code does (check the index pairs you create; they are the same).
import numpy as np
points = np.array([[0,0], [0,1], [1,0]])
print(points)
#array([[0, 0],
# [0, 1],
# [1, 0]])
# roll along axis 0 so that row i of the shifted array holds vertex i+1,
# then pair each vertex with its successor along a new middle axis
sides = np.stack([
    points,
    np.roll(points, -1, axis=0)
], axis=1)
print(sides)
#array([[[0, 0],
# [0, 1]],
#
# [[0, 1],
# [1, 0]],
#
# [[1, 0],
# [0, 0]]])
Keep in mind that this solution works for any number of vertices, not just three.
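As a rough sketch of how the roll-and-stack idea could be wrapped up for a polygon with any number of vertices (the helper name polygon_edges and the square test data are made up for illustration, not part of the original answer):

```python
import numpy as np

def polygon_edges(vertices):
    """Return an (N, 2, D) array pairing each vertex with the next one.

    Hypothetical helper: the last vertex is paired with the first,
    so the polygon is treated as closed.
    """
    vertices = np.asarray(vertices)
    # Shift rows up by one so row i holds vertex i+1 (wrapping around)
    next_vertices = np.roll(vertices, -1, axis=0)
    return np.stack([vertices, next_vertices], axis=1)

square = np.array([[0, 0], [0, 1], [1, 1], [1, 0]])
edges = polygon_edges(square)
print(edges.shape)
print(edges[-1])  # closing edge, from the last vertex back to the first
```

Rolling along axis 0 moves whole rows (vertices), whereas rolling along the last axis would swap the x and y coordinates within each vertex.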

Related

Getting all row indices in numpy 2d array where elements in each row exists more than 2 times in entire array

I am working with graph data defined as 2d array of edges.
I.e.
[[1, 0],
[2, 5],
[1, 5],
[3, 4],
[1, 4]]
This defines a graph: every element is a node ID, there are no self-loops, the graph is directed, and no value in one column appears in the other column.
Now to the question:
I need to select all edges where both 'nodes' occur more than once in the list.
How can I do that quickly? Currently I am iterating over each edge and looking at the nodes individually, which feels like a really bad way to do this.
Current dumb/slow solution:
edges = []
for edge in graph:
    src, dst = edge[0], edge[1]
    # Count how often src and dst occur in either column
    src_fan = np.count_nonzero(graph == src, axis=1).sum()
    dst_fan = np.count_nonzero(graph == dst, axis=1).sum()
    if src_fan >= 2 and dst_fan >= 2:
        # Add to edges
        edges.append(edge)
I am also not entirely sure this way is even correct...
# Obtain the unique nodes and their counts
from_nodes, from_counts = np.unique(graph[:, 0], return_counts=True)
to_nodes, to_counts = np.unique(graph[:, 1], return_counts=True)
# Obtain the duplicated nodes
dup_from_nodes = from_nodes[from_counts > 1]
dup_to_nodes = to_nodes[to_counts > 1]
# Obtain the edges whose nodes are both duplicated
graph[np.isin(graph[:, 0], dup_from_nodes) & np.isin(graph[:, 1], dup_to_nodes)]
Out[297]: array([[1, 5],
[1, 4]])
A solution using networkx:
import networkx as nx
edges = [[1, 0],
[2, 5],
[1, 5],
[3, 4],
[1, 4]]
G = nx.DiGraph()
G.add_edges_from(edges)
print([node for node in G.nodes if G.degree[node]>1])
Edit: to select the edges rather than the nodes:
print([edge for edge in G.edges if G.degree[edge[0]] > 1 and G.degree[edge[1]] > 1])
import numpy as np
graph = np.array([[1, 0],
[2, 5],
[1, 5],
[3, 4],
[1, 4]])
# get a 1d array of all nodes
array = graph.reshape(-1)
# count the occurrences of each element via a pairwise comparison
occurrences = np.sum(np.equal(array, array[:, np.newaxis]), axis=0)
# reshape back to graph shape
occurrences = occurrences.reshape(graph.shape)
# check if both nodes of each edge occur more than once
mask = np.all(occurrences > 1, axis=1)
# select the masked elements
edges = graph[mask]
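Based on my test, this method is about 3 times faster than the accepted answer on this small example graph. Note that the pairwise comparison builds an n x n array, so it may not scale well to large graphs.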
Test:
import timeit
import numpy as np
graph = np.array([[1, 0],
[2, 5],
[1, 5],
[3, 4],
[1, 4]])
# accepted answer
def method1(a):
    # Obtain the unique nodes and their counts
    from_nodes, from_counts = np.unique(a[:, 0], return_counts=True)
    to_nodes, to_counts = np.unique(a[:, 1], return_counts=True)
    # Obtain the duplicated nodes
    dup_from_nodes = from_nodes[from_counts > 1]
    dup_to_nodes = to_nodes[to_counts > 1]
    # Obtain the edges whose nodes are both duplicated
    return a[np.isin(a[:, 0], dup_from_nodes) & np.isin(a[:, 1], dup_to_nodes)]
# this answer
def method2(graph):
    # get a 1d array of all nodes
    array = graph.reshape(-1)
    # count the occurrences of each element, then reshape back to graph shape
    occurrences = np.sum(np.equal(array, array[:, np.newaxis]), axis=0).reshape(graph.shape)
    # check if both nodes of each edge occur more than once
    mask = np.all(occurrences > 1, axis=1)
    # select the masked elements
    edges = graph[mask]
    return edges
print('method1 (accepted answer): ', timeit.timeit(lambda: method1(graph), number=10000))
print('method2 (this answer): ', timeit.timeit(lambda: method2(graph), number=10000))
Output:
method1 (accepted answer): 0.20238440000000013
method2 (this answer): 0.06534320000000005
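For comparison, here is a minimal sketch of a third variant that counts every node over the whole array with a single np.unique call, avoiding both the per-column bookkeeping and the n x n comparison. It was not part of the answers above and has not been benchmarked here; the variable names are illustrative.

```python
import numpy as np

graph = np.array([[1, 0],
                  [2, 5],
                  [1, 5],
                  [3, 4],
                  [1, 4]])

# Flatten first so `inverse` is 1-D regardless of the NumPy version
flat = graph.ravel()
_, inverse, counts = np.unique(flat, return_inverse=True, return_counts=True)

# counts[inverse] gives, for every entry of flat, how often that node occurs
occurrences = counts[inverse].reshape(graph.shape)

# Keep the edges whose source and destination both occur more than once
edges = graph[np.all(occurrences > 1, axis=1)]
print(edges)
```

Because the counts are taken over both columns at once, this follows the wording of the question ("occur more than once in the list") even when a node ID appears in both columns.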

Replacing array at i-th dimension

Let's say I have a two-dimensional array
import numpy as np
a = np.array([[1, 1, 1], [2,2,2], [3,3,3]])
and I would like to replace the third vector (in the second dimension) with zeros. I would do
a[:, 2] = np.array([0, 0, 0])
But what if I would like to be able to do that programmatically? Let's say that a variable x = 1 contains the dimension along which I want to do the replacing. How would the function replace(arr, dimension, value, arr_to_be_replaced) have to look if I wanted to call it as replace(a, x, 2, np.array([0, 0, 0]))?
numpy has a similar function, insert. However, it doesn't replace at dimension i; it returns a copy with an additional vector.
All solutions are welcome, but I prefer one that doesn't recreate the array, so as to save memory.
arr[:, 1]
is basically shorthand for
arr[(slice(None), 1)]
that is, a tuple with slice elements and integers.
Knowing that, you can construct a tuple of slice objects manually, adjust the values depending on an axis parameter and use that as your index. So for
import numpy as np
arr = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
axis = 1
idx = 2
arr[:, idx] = np.array([0, 0, 0])
# ^- axis position
you can use
slices = [slice(None)] * arr.ndim
slices[axis] = idx
arr[tuple(slices)] = np.array([0, 0, 0])
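Wrapped into the function the question asks for, this could look roughly like the sketch below. The argument order follows the call replace(a, x, 2, np.array([0, 0, 0])) from the question; the parameter names are my own choice.

```python
import numpy as np

def replace(arr, axis, index, values):
    """Overwrite, in place, the slice of `arr` at position `index` along `axis`."""
    # Start with a full slice for every dimension ...
    slices = [slice(None)] * arr.ndim
    # ... and pin the requested dimension to a single index
    slices[axis] = index
    arr[tuple(slices)] = values
    return arr

a = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
replace(a, 1, 2, np.array([0, 0, 0]))
print(a)
```

The assignment writes into the existing array, so no copy of arr is created; returning arr is only a convenience.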

Scipy KDTree get rectangular subset of grid defined by two points

I am using the following example from :
from scipy import spatial
x, y = np.mgrid[0:5, 2:8]
tree = spatial.KDTree(list(zip(x.ravel(), y.ravel())))
pts = np.array([[0, 0], [2.1, 2.9]])
idx = tree.query(pts)[1]
data = tree.data[??????????]
If I input two arbitrary points (see variable pts), I am looking to return all pairs of coordinates that lie within the rectangle defined by the two points (KDTree finds the closest neighbour). So in this case:
array([[0, 0],
[0, 1],
[0, 2],
[1, 0],
[1, 1],
[1, 2],
[2, 0],
[2, 1],
[2, 2]])
How can I achieve that from the tree data?
Seems that I found a solution:
from scipy import spatial
import numpy as np
x, y = np.mgrid[0:5, 0:5]
tree = spatial.KDTree(list(zip(x.ravel(), y.ravel())))
pts = np.array([[0, 0], [2.1, 2.2]])
idx = tree.query(pts)[1]
data = tree.data[[idx[0], idx[1]]]
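# bounding box spanned by the two nearest grid points
x_min, y_min = data.min(axis=0)
x_max, y_max = data.max(axis=0)
inside = ((tree.data[:, 0] >= x_min) & (tree.data[:, 0] <= x_max) &
          (tree.data[:, 1] >= y_min) & (tree.data[:, 1] <= y_max))
rectangle = tree.data[inside]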
However, I would love to see a solution using the query option!
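One possible way to involve the tree itself, sketched under the assumption that a square neighbourhood query is acceptable as a pre-filter: KDTree.query_ball_point with p=np.inf (Chebyshev distance) returns all points inside an axis-aligned square around a centre, and a boolean mask then trims that square to the rectangle. The variable names and the small tolerance added to the radius are my own choices.

```python
import numpy as np
from scipy import spatial

x, y = np.mgrid[0:5, 0:5]
tree = spatial.KDTree(np.column_stack([x.ravel(), y.ravel()]))

pts = np.array([[0, 0], [2.1, 2.2]])
# Snap both query points to their nearest grid nodes
corners = tree.data[tree.query(pts)[1]]
lo, hi = corners.min(axis=0), corners.max(axis=0)

# Square (Chebyshev ball) that encloses the rectangle
center = (lo + hi) / 2
radius = (hi - lo).max() / 2 + 1e-9  # tolerance so boundary points are kept
candidates = np.array(sorted(tree.query_ball_point(center, r=radius, p=np.inf)))

# Trim the square down to the rectangle
cand_pts = tree.data[candidates]
inside = np.all((cand_pts >= lo) & (cand_pts <= hi), axis=1)
rectangle = cand_pts[inside]
print(rectangle)
```

For a square region the final mask removes nothing; for an elongated rectangle the ball query over-selects along the short side, which is why the mask is still needed.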

Creating 2d histogram from 2d numpy array

I have a numpy array like this:
[[[0,0,0], [1,0,0], ..., [1919,0,0]],
[[0,1,0], [1,1,0], ..., [1919,1,0]],
...,
[[0,1019,0], [1,1019,0], ..., [1919,1019,0]]]
To create it I use this function (thanks to @Divakar and @unutbu for helping in another question):
def indices_zero_grid(m,n):
    I,J = np.ogrid[:m,:n]
    out = np.zeros((m,n,3), dtype=int)
    out[...,0] = I
    out[...,1] = J
    return out
I can access this array by command:
>>> out = indices_zero_grid(3,2)
>>> out
array([[[0, 0, 0],
[0, 1, 0]],
[[1, 0, 0],
[1, 1, 0]],
[[2, 0, 0],
[2, 1, 0]]])
>>> out[1,1]
array([1, 1, 0])
Now I want to plot a 2D histogram where (x, y) (i.e. the index in out[x, y]) gives the coordinates and the third value is the number of occurrences. I've tried using a normal matplotlib plot, but I have so many values for each coordinate (I need 1920x1080) that the program needs too much memory.
If I understand correctly, you want an image of size 1920x1080 which colors the pixel at coordinate (x, y) according to the value of out[x, y].
In that case, you could use
import numpy as np
import matplotlib.pyplot as plt
def indices_zero_grid(m,n):
    I,J = np.ogrid[:m,:n]
    out = np.zeros((m,n,3), dtype=int)
    out[...,0] = I
    out[...,1] = J
    return out
h, w = 1920, 1080
out = indices_zero_grid(h, w)
out[..., 2] = np.random.randint(256, size=(h, w))
plt.imshow(out[..., 2])
plt.show()
which yields an image in which each pixel is colored according to its value in out[..., 2].
Notice that the other two "columns", out[..., 0] and out[..., 1] are not used. This suggests that indices_zero_grid is not really needed here.
plt.imshow can accept an array of shape (1920, 1080). This array has a scalar value at each location in the array. The structure of the array tells imshow where to color each cell. Unlike a scatter plot, you don't need to generate the coordinates yourself.
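If the occurrence counts still have to be accumulated from individual (x, y) samples, one way to do it is sketched below. The random sample data, the variable names and the (height, width) layout are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
width, height = 1920, 1080
n_samples = 100_000

# Hypothetical input: integer pixel coordinates of the observed events
xs = rng.integers(0, width, size=n_samples)
ys = rng.integers(0, height, size=n_samples)

# One counter per pixel; rows correspond to y, columns to x
counts = np.zeros((height, width), dtype=np.int64)
# np.add.at accumulates correctly even when a coordinate pair repeats
np.add.at(counts, (ys, xs), 1)

print(counts.shape, counts.sum())
```

The resulting counts array can be passed straight to plt.imshow(counts). A plain counts[ys, xs] += 1 would count each repeated coordinate only once, which is why np.add.at is used here.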

Efficiently select random matrix indices with given probabilities

I have a numpy array of probabilities, such as:
[[0.1, 0, 0.3],
[0.2, 0, 0.05],
[0, 0.15, 0.2]]
I want to select an element (e.g., select some indices (i,j)) from this matrix, with probability weighted according to this matrix. The actual matrices this will be working with are large (up to 1000x1000), so I'm looking for an efficient way to do this. This is my current solution:
def weighted_mat_choice(prob_mat):
    """
    Randomly select indices of the matrix according to the probabilities in prob_mat
    :param prob_mat: Normalized probabilities to select each element
    :return: indices (i, j) selected
    """
    inds_mat = [[(i, j) for j in range(prob_mat.shape[1])] for i in range(prob_mat.shape[0])]
    inds_list = [item for sublist in inds_mat for item in sublist]
    inds_of_inds = range(len(inds_list))
    prob_list = prob_mat.flatten()
    pick_ind_of_ind = np.random.choice(inds_of_inds, p=prob_list)
    pick_ind = inds_list[pick_ind_of_ind]
    return pick_ind
which is definitely not efficient. (Basically, linearizing the matrix, creating a list of index tuples, and then picking accordingly.) Is there a better way to do this selection?
You don't need a list of tuples to choose from. Just draw from an arange(n) array of flat indices and convert the result back to two dimensions with unravel_index().
import numpy as np
p = np.array(
[[0.1, 0, 0.3,],
[0.2, 0, 0.05],
[0, 0.15, 0.2]]
)
p_flat = p.ravel()
ind = np.arange(len(p_flat))
res = np.column_stack(
np.unravel_index(
np.random.choice(ind, p=p_flat, size=10000),
p.shape))
The result:
array([[0, 2],
[2, 2],
[2, 1],
...,
[1, 0],
[0, 2],
[0, 0]], dtype=int64)
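For a single draw with the same interface as the function in the question, a minimal sketch could look like this. The optional rng parameter is my own addition so that results can be made reproducible.

```python
import numpy as np

def weighted_mat_choice(prob_mat, rng=None):
    """Pick one index pair (i, j) with probability prob_mat[i, j].

    Assumes prob_mat is a 2-D array whose entries sum to 1.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Draw a flat index, weighted by the flattened probabilities
    flat_index = rng.choice(prob_mat.size, p=prob_mat.ravel())
    # Convert the flat index back to (row, column)
    return np.unravel_index(flat_index, prob_mat.shape)

p = np.array([[0.1, 0, 0.3],
              [0.2, 0, 0.05],
              [0, 0.15, 0.2]])
i, j = weighted_mat_choice(p, rng=np.random.default_rng(0))
print(i, j)
```

Passing an integer to choice avoids building an explicit index array, and ravel() returns a view for contiguous input, so no list of index tuples is ever created.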
