Searching numpy array for element values and storing location - python

I have a numpy array which represents the adjacency of faces in a 3D model. In general the nth row and column represent the nth face of the model. If a 1 is located in the upper right triangle of the matrix, it represents a convex connection between two faces. If a 1 is located in the lower left triangle, it represents a concave connection.
For example in the matrix below, there are convex connections between faces 1 and 2, 1 and 3, 2 and 3 and so on.
      1   2   3   4   5   6
1 [[ 0.  1.  1.  0.  0.  0.]
2  [ 0.  0.  1.  1.  1.  1.]
3  [ 0.  0.  0.  0.  0.  0.]
4  [ 0.  0.  0.  0.  1.  0.]
5  [ 0.  0.  0.  0.  0.  0.]
6  [ 0.  0.  0.  0.  0.  0.]]
I'd like to be able to record how many concave and convex connections each face has.
i.e. Face 1 has: 0 concave and 2 convex connections
Possibly even record which faces they are connected to.
i.e. Face 1 has: 0 concave and 2 convex (2, 3) connections
So far I have tried using np.nonzero() to return the indices of the 1's. However, this returns the indices in a format which doesn't seem to be very easy to work with (a separate array for the row and column indices):
(array([0, 0, 1, 1, 1, 1, 3]), array([1, 2, 2, 3, 4, 5, 4]))
Can anyone help me with an easier way to carry out this task? Thanks

Try this:
import numpy as np

a = np.array([[0, 1, 1, 0, 0, 0],
              [0, 0, 1, 1, 1, 1],
              [0, 0, 0, 0, 0, 0],
              [0, 0, 0, 0, 1, 0],
              [0, 0, 0, 0, 0, 0],
              [0, 0, 0, 0, 0, 0]]).astype(float)
concave = {}
convex = {}
# np.nonzero returns 0-based row and column indices; add 1 to get face numbers
rows, cols = np.nonzero(a)
for i, j in zip((rows + 1).tolist(), (cols + 1).tolist()):
    if j > i:
        # upper triangle: convex connection, recorded for both faces
        if i not in convex:
            convex[i] = []
        if j not in convex:
            convex[j] = []
        convex[i].append(j)
        convex[j].append(i)
    else:
        # lower triangle: concave connection, recorded for both faces
        if i not in concave:
            concave[i] = []
        if j not in concave:
            concave[j] = []
        concave[i].append(j)
        concave[j].append(i)
print('concave relations : {} and number of relations is {}'.format(concave, sum(len(v) for v in concave.values())))
print('convex relations : {} and number of relations is {}'.format(convex, sum(len(v) for v in convex.values())))
gives the result :
concave relations : {} and number of relations is 0
convex relations : {1: [2, 3], 2: [1, 3, 4, 5, 6], 3: [1, 2], 4: [2, 5], 5: [2, 4], 6: [2]} and number of relations is 14
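where each dictionary key is a face and its value lists that face's connections. Each connection is recorded under both of its faces, so the 7 convex connections in the matrix are counted as 14 relations.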
Logic is :
for every non-zero pair (i,j)
if i>j then j is the concave connection of face i & i is the concave connection of face j
if j>i then j is the convex connection of face i & i is the convex connection of face j

import numpy as np
A = np.array([[0, 1, 1, 0, 0, 0],
              [0, 0, 1, 1, 1, 1],
              [0, 0, 0, 0, 0, 0],
              [0, 0, 0, 0, 1, 0],
              [0, 0, 0, 0, 0, 0],
              [0, 0, 0, 0, 0, 0]])
convex = np.triu(A, 1) # upper triangle
concave = np.tril(A, -1) # lower triangle
convex_indices = list(zip(np.nonzero(convex)[0] + 1, np.nonzero(convex)[1] + 1))
concave_indices = list(zip(np.nonzero(concave)[0] + 1, np.nonzero(concave)[1] + 1))
num_convex = len(convex_indices)
num_concave = len(concave_indices)
print('There are {} convex connections between faces: {}'.format(num_convex, ', '.join(str(e) for e in convex_indices)))
print('There are {} concave connections between faces: {}'.format(num_concave, ', '.join(str(e) for e in concave_indices)))
# will print:
# There are 7 convex connections between faces: (1, 2), (1, 3), (2, 3), (2, 4), (2, 5), (2, 6), (4, 5)
# There are 0 concave connections between faces:
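Neither answer prints the per-face summary the question asks for. As a minimal sketch, assuming (as the first answer does) that a connection counts for both faces it joins, the per-face neighbours can be read off the symmetrised triangles. The names convex_sym, concave_sym and summary are illustrative and do not come from the original answers.

```python
import numpy as np

A = np.array([[0, 1, 1, 0, 0, 0],
              [0, 0, 1, 1, 1, 1],
              [0, 0, 0, 0, 0, 0],
              [0, 0, 0, 0, 1, 0],
              [0, 0, 0, 0, 0, 0],
              [0, 0, 0, 0, 0, 0]])

upper = np.triu(A, 1)    # convex connections (upper triangle)
lower = np.tril(A, -1)   # concave connections (lower triangle)

# Mirror each triangle so row k lists every face connected to face k+1.
convex_sym = upper + upper.T
concave_sym = lower + lower.T

summary = {}
for face in range(A.shape[0]):
    # flatnonzero gives 0-based columns; add 1 to get face numbers
    convex_faces = (np.flatnonzero(convex_sym[face]) + 1).tolist()
    concave_faces = (np.flatnonzero(concave_sym[face]) + 1).tolist()
    summary[face + 1] = {"convex": convex_faces, "concave": concave_faces}
    print("Face {} has: {} concave {} and {} convex {} connections".format(
        face + 1, len(concave_faces), concave_faces,
        len(convex_faces), convex_faces))
```

If only the counts are needed, convex_sym.sum(axis=1) and concave_sym.sum(axis=1) give them without the loop.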

Related

How to "zero" all entries in an array with an equal adjacent entry?

I have a large array of 1's and 0's. Each time a 1 appears in it, I want to make sure that all the 1's adjacent (or diagonal) to that 1 are set to 0 - so that the array you are left with has one 1 for every "group" of ones that appeared in the original array (I don't mind which 1 is kept per group, just as long as there are no adjacent/neighbouring 1's).
For example, with the following array (much smaller than what I will be dealing with, but I want to illustrate the problem).
[[0. 0. 0. 1.]
[0. 1. 0. 0.]
[0. 1. 0. 0.]
[1. 0. 0. 1.]
[0. 0. 1. 0.]]
I would want to return an array that looks like:
[[0. 0. 0. 1.]
[0. 0. 0. 0.]
[0. 0. 0. 0.]
[1. 0. 0. 0.]
[0. 0. 1. 0.]]
I have tried the following code for the above array (which is labelled as a) in order to achieve this:
# Define a function which finds if an entry has a 1 neighbouring it
def has_neighbour(matrix, x, y):
    if matrix[x-1,y]==1 or matrix[x+1,y]==1 or matrix[x, y+1]==1 or matrix[x, y-1]==1 or matrix[x+1, y+1]==1 or matrix[x+1, y-1]==1 or matrix[x-1, y+1]==1 or matrix[x-1, y-1]==1:
        return True
    return False
# Find the entries in the matrix of 1 and if they have an adjacent 1, set them to zero
for i in range(4):
    for j in range(3):
        if a[i,j]==0 or (a[i,j] == 1 and has_neighbour(a, i, j)==True):
            a[i,j] = 0
        else:
            a[i,j] = 1
# Re-search along the last row and column for adjacent 1's, setting them to zero where necessary. This is needed due to for loops not changing the last row and column of an array
for i in range(4):
    if a[i,3]==0 or (a[i,3]==1 and (a[i+1,3]==1 or a[i-1,3]==1)):
        a[i,3] = 0
    else:
        a[i,3]=1
for j in range(3):
    if a[4, j]==0 or (a[4, j]==1 and (a[4, j+1]==1 or a[4, j-1]==1)):
        a[4, j] = 0
    else:
        a[4, j] = 1
However, this code returns the following array:
[[0. 0. 0. 1.]
[0. 0. 0. 0.]
[0. 0. 0. 0.]
[0. 0. 0. 1.]
[0. 0. 1. 0.]]
Does anyone know what is going wrong here? I am very confused, as it's getting rid of some 1's I don't want it to and not getting rid of some that I do want it to.
Edit
@JohnColeman's answer is the closest to working so far. However, it fails on the following matrix (I've attached his code, but changed the test part to the matrix that it is failing with):
def pick_reps(matrix):
    m = len(matrix)
    n = len(matrix[0])
    reps = [[0]*n for _ in range(m)]
    represented = set()
    for i, row in enumerate(matrix):
        for j, x in enumerate(row):
            if x == 1:
                # check if this position is in an already represented block
                covered = False
                if j > 0 and (i,j-1) in represented:
                    covered = True
                elif i > 0 and (i-1,j) in represented:
                    covered = True
                elif i > 0 and j > 0 and (i-1,j-1) in represented:
                    covered = True
                elif i > 0 and j < n and (i-1,j+1) in represented:
                    covered = True
                if not covered:
                    reps[i][j] = 1
                represented.add((i,j))
    return reps
# test:
matrix = [[0, 0, 0, 0],
          [0, 0, 0, 1],
          [0, 1, 1, 0],
          [1, 0, 1, 1],
          [0, 0, 0, 1]]
reps = pick_reps(matrix)
for row in reps: print(row)
This should return a matrix with only one 1 in it. However, it returns:
[[0, 0, 0, 0],
[0, 0, 0, 1],
[0, 1, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0]]
I think your has_neighbo(u)r [1] function still needs some work. There is a lot of logic down the line that should in fact be part of that function; in particular, the borderline cases should be accounted for inside the function. Here's one way to implement your idea:
def has_neighbor(matrix, x, y):
    """Return True if any of the neighboring cells are equal to 1"""
    # just in case you need to change this value at some point
    value = 1
    # Define neighbors to check
    corners = [(-1, -1),
               (0, -1),
               (1, -1),
               (1, 0),
               (1, 1),
               (0, 1),
               (-1, 1),
               (-1, 0)]
    # get the boundaries of the matrix
    x_max, y_max = matrix.shape
    # for each neighbor
    for x_off, y_off in corners:
        # get the new coordinates to check
        x2 = x+x_off
        y2 = y+y_off
        # skip coordinates that are out of bounds
        out_of_grid = [x2 < 0,
                       y2 < 0,
                       x2 >= x_max,
                       y2 >= y_max]
        if any(out_of_grid):
            continue
        # finally!
        if matrix[x2, y2] == value:
            return True
    # no neighboring cell matched
    return False
It can now handle the borderline cases, which makes the final loop much simpler:
x_max, y_max = a.shape
for x in range(x_max):
    for y in range(y_max):
        if has_neighbor(a, x, y):
            a[x, y] = 0
The output is as desired:
array([[0, 0, 0, 1],
[0, 0, 0, 0],
[0, 0, 0, 0],
[1, 0, 0, 0],
[0, 0, 1, 0]])
[1] NB: In code, American spelling is more common.
This probably isn't very idiomatic, but you could iterate through the array - when you find a 1 you call a recursive function to clear all reachable 1s. This of course will clear the initial, representative element, so you set that back to 1.
a = [[0, 0, 0, 1],
     [0, 1, 0, 0],
     [0, 1, 0, 0],
     [1, 0, 0, 1],
     [0, 0, 1, 0]]
def clear(a, r, c):
    if r < 0 or r == len(a) or c < 0 or c == len(a[0]) or a[r][c] == 0:
        return
    a[r][c] = 0
    for i in [-1, 0, 1]:
        for j in [-1, 0, 1]:
            clear(a, r+i, c+j)
for r in range(len(a)):
    for c in range(len(a[0])):
        if a[r][c] == 1:
            clear(a, r, c)
            a[r][c] = 1
for row in a: print(row)
Output:
[0, 0, 0, 1]
[0, 1, 0, 0]
[0, 0, 0, 0]
[0, 0, 0, 1]
[0, 0, 0, 0]
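Since the question mentions a large array, deep recursion could hit Python's recursion limit. The following is a minimal sketch of the same flood-fill idea using an explicit stack instead of recursion. The function name keep_one_per_group is hypothetical, and it returns a new grid rather than modifying the input.

```python
def keep_one_per_group(grid):
    """Return a copy of grid keeping a single 1 per 8-connected group of 1s."""
    rows, cols = len(grid), len(grid[0])
    result = [[0] * cols for _ in range(rows)]
    seen = set()
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or (r, c) in seen:
                continue
            # The first cell reached in row-major order represents its group.
            result[r][c] = 1
            seen.add((r, c))
            stack = [(r, c)]
            while stack:
                cr, cc = stack.pop()
                # Visit all eight neighbours (and the cell itself, already seen).
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and grid[nr][nc] == 1
                                and (nr, nc) not in seen):
                            seen.add((nr, nc))
                            stack.append((nr, nc))
    return result


a = [[0, 0, 0, 1],
     [0, 1, 0, 0],
     [0, 1, 0, 0],
     [1, 0, 0, 1],
     [0, 0, 1, 0]]
result = keep_one_per_group(a)
for row in result:
    print(row)
```

On this input it keeps the same representatives as the recursive version, because both scan the grid in row-major order.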

Find all the rectangles coordinates in a numpy 2d array ( prioritizing the vertical rectangles)

I am trying to figure out a solution for finding all the rectangles in a 2d array.
However, the vertical rectangles need to take priority.
For example:
[[0. 0. 1. 0. 0.]
[0. 0. 0. 1. 1.]
[0. 0. 1. 1. 1.]]
The desired output would be
[[0, 2, 0, 2], [2, 2, 2, 2], [1, 3, 2, 4]]
Or something like
[[1. 0. 0. 0. 0.]
[1. 1. 0. 1. 1.]
[1. 1. 1. 1. 0.]]
Output should be
[[0, 0, 0, 0], [1, 0, 2, 1], [2, 2, 2, 2], [1, 3, 2, 3], [1, 4, 1, 4]]
In other words, a horizontal rectangle with a height of only 1 is treated as multiple 1*1 rectangles.
I am stuck on the logic for deciding which direction to process first: my results prioritize the horizontal rectangles and have trouble dealing with zeros when they encounter a rectangle of 2*2 or larger.
UPDATE
A rectangle here means an area composed of 1s in the 2d array. However, when something like
[[0. 0. 0. 0. 0.]
[0. 0. 0. 1. 1.]
[0. 0. 0. 1. 0.]]
happens, the output should be
[[1, 3, 2, 3], [1, 4, 1, 4]]
instead of
[[1, 3, 1, 4], [2, 3, 2, 3]]
1*1 counts as a rectangle too
The code I have for now looks like this
def solve2(grid):
    output = []
    visited = set()
    for j in range(len(grid[0])):
        for i in range(len(grid)):
            if (i,j) in visited:
                continue
            visited.add((i,j))
            if grid[i][j] == 1:
                s_row, s_col = i, j
                e_row, e_col = i,j
                while e_col < len(grid[0]) and grid[i][e_col]:
                    while e_row < len(grid) and grid[e_row][j]:
                        e_row += 1
                    e_col += 1
                for x in range(s_row, e_row):
                    for y in range(s_col, e_col):
                        visited.add((x,y))
                e_row -= 1
                e_col -= 1
                output.append([s_row, s_col, e_row, e_col])
    return output

Understanding r_ from numpy

a = np.zeros([4, 4])
b = np.ones([4, 4])
#vertical stacking(ROW WISE)
print(np.r_[a,b])
print(np.r_[[1,2,3],0,0,[4,5,6]])
# output is
[[0. 0. 0. 0.]
[0. 0. 0. 0.]
[0. 0. 0. 0.]
[0. 0. 0. 0.]]
[[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]]
[1 2 3 0 0 4 5 6]
But here np.r_ doesn't perform vertical stacking; it does horizontal stacking. How does np.r_ work? I would be grateful for any help.
In [324]: a = np.zeros([4, 4],int)
...: b = np.ones([4, 4],int)
In [325]: np.r_[a,b]
Out[325]:
array([[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0],
[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1]])
This is a row stack; same as vstack. And since the arrays are already 2d, concatenate is enough:
In [326]: np.concatenate((a,b), axis=0)
Out[326]:
array([[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0],
[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1]])
With the mix of 1d and scalars, r_ is the same as hstack:
In [327]: np.r_[[1,2,3],0,0,[4,5,6]]
Out[327]: array([1, 2, 3, 0, 0, 4, 5, 6])
In [328]: np.hstack([[1,2,3],0,0,[4,5,6]])
Out[328]: array([1, 2, 3, 0, 0, 4, 5, 6])
In [329]: np.concatenate([[1,2,3],0,0,[4,5,6]],axis=0)
...
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 1 dimension(s) and the array at index 1 has 0 dimension(s)
concatenate fails because of the scalars. The other methods first convert those to 1d arrays.
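To illustrate that conversion step, here is a small sketch that does it by hand before calling concatenate. Using np.atleast_1d for this is my choice for the illustration; the variable names are hypothetical.

```python
import numpy as np

pieces = [[1, 2, 3], 0, 0, [4, 5, 6]]

# np.atleast_1d turns each scalar into a length-1 array and each list
# into a 1d array, so every piece ends up with exactly one dimension.
as_1d = [np.atleast_1d(p) for p in pieces]

joined = np.concatenate(as_1d, axis=0)
print(joined)  # [1 2 3 0 0 4 5 6]
```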
In both of the cases above (the 2d arrays and the mix of 1d arrays and scalars), r_ does what its documentation describes:
Translates slice objects to concatenation along the first axis.
r_ is actually an instance of a special class, with its own __getitem__ method, that allows us to use [] instead of (). It also means it can take slices as inputs (which are actually rendered as np.arange or np.linspace).
r_ takes an optional initial string argument which, if it consists of comma-separated integers (up to 3), can control the concatenate axis and how inputs are adjusted to matching dimensions. See the docs for details, and the np.lib.index_tricks source file for more.
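A few short examples of the slice and string-argument behaviour described above. This is only a sketch; the variable names are illustrative.

```python
import numpy as np

a = np.zeros((4, 4), int)
b = np.ones((4, 4), int)

# A plain slice is expanded like np.arange(0, 5).
ints = np.r_[0:5]

# An imaginary step means "number of points", like np.linspace(0, 1, 5).
pts = np.r_[0:1:5j]

# Slices, scalars and lists can be mixed in one expression.
mixed = np.r_[0:3, 10, [20, 30]]

# A leading string with one integer selects the concatenation axis.
side_by_side = np.r_['1', a, b]       # shape (4, 8), like np.hstack

# '0,2' means: join on axis 0 after promoting inputs to at least 2 dimensions.
as_rows = np.r_['0,2', [1, 2, 3], [4, 5, 6]]   # shape (2, 3)

print(ints, pts, mixed, side_by_side.shape, as_rows.shape)
```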
In order of importance I think the concatenate functions are:
np.concatenate # base
np.vstack # easy join 1d arrays into 2d
np.stack # generalize np.array
np.hstack # saves specifying axis
np.r_
np.c_
r_ and c_ can do neat things when mixing arrays of different shapes, but it all boils down to using concatenate correctly.

Creating tuples of multiples from pairs of indices

Given a numpy array that can be subset to the indices of the elements meeting given criteria, how do I create tuples of triplets (or quadruplets, quintuplets, ...) from the resulting pairs of indices?
In the example below, pairs_tuples is equal to [(1, 0), (3, 0), (3, 1), (3, 2)]. triplets_tuples should be [(0, 1, 3)] because all of its elements (i.e. (1, 0), (3, 0), (3, 1)) have pairwise values meeting the condition, whereas (3, 2) does not.
a = np.array([[0. , 0. , 0. , 0. , 0. ],
[0.96078379, 0. , 0. , 0. , 0. ],
[0.05498203, 0.0552454 , 0. , 0. , 0. ],
[0.46005028, 0.45468466, 0.11167813, 0. , 0. ],
[0.1030161 , 0.10350956, 0.00109096, 0.00928037, 0. ]])
pairs = np.where((a >= .11) & (a <= .99))
pairs_tuples = list(zip(pairs[0].tolist(), pairs[1].tolist()))
# [(1, 0), (3, 0), (3, 1), (3, 2)]
How to get to the below?
triplets_tuples = [(0, 1, 3)]
quadruplets_tuples = []
quintuplets_tuples = []
This has an easy part and an NP part. Here's the solution to the easy part.
Let's assume you have the full correlation matrix:
>>> c = a + a.T
>>> c
array([[0. , 0.96078379, 0.05498203, 0.46005028, 0.1030161 ],
[0.96078379, 0. , 0.0552454 , 0.45468466, 0.10350956],
[0.05498203, 0.0552454 , 0. , 0.11167813, 0.00109096],
[0.46005028, 0.45468466, 0.11167813, 0. , 0.00928037],
[0.1030161 , 0.10350956, 0.00109096, 0.00928037, 0. ]])
What you're doing is converting this into an adjacency matrix:
>>> adj = (c >= .11) & (c <= .99)  # use the symmetric matrix c, not a
>>> adj.astype(int) # for readability below - False and True take a lot of space
array([[0, 1, 0, 1, 0],
[1, 0, 0, 1, 0],
[0, 0, 0, 1, 0],
[1, 1, 1, 0, 0],
[0, 0, 0, 0, 0]])
This now represents a graph where columns and rows correspond to nodes, and a 1 is an edge between them. We can use networkx to visualize this:
import networkx
# from_numpy_matrix was removed in NetworkX 3.0; from_numpy_array replaces it
g = networkx.from_numpy_array(adj.astype(int))
networkx.draw(g)
You're looking for maximal fully-connected subgraphs, or "cliques", within this graph. This is the Clique problem, and is the NP part. Thankfully, networkx can solve that too:
>>> list(networkx.find_cliques(g))
[[3, 0, 1], [3, 2], [4]]
Here [3, 0, 1] is one of your triplets.
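find_cliques returns only maximal cliques. If every triplet, quadruplet and so on is wanted, as in the question, a brute-force check over combinations is enough for small inputs. This is a minimal sketch using only the standard library; the helper name k_tuples is hypothetical, and its cost grows combinatorially with the number of nodes.

```python
from itertools import combinations

# Index pairs that meet the condition, as produced in the question.
pairs_tuples = [(1, 0), (3, 0), (3, 1), (3, 2)]

# Store each pair as an unordered edge so that (1, 0) and (0, 1) match.
edges = {frozenset(p) for p in pairs_tuples}
nodes = sorted({i for p in pairs_tuples for i in p})


def k_tuples(k):
    """Return all k-tuples of nodes in which every pair is an edge."""
    return [combo for combo in combinations(nodes, k)
            if all(frozenset(pair) in edges
                   for pair in combinations(combo, 2))]


triplets_tuples = k_tuples(3)
quadruplets_tuples = k_tuples(4)
print(triplets_tuples)      # [(0, 1, 3)]
print(quadruplets_tuples)   # []
```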

Transform an array of count data into a matrix of ones and zeroes

I have an array n of count data, and I want to transform it into a matrix x in which each row contains a number of ones equal to the corresponding count number, padded by zeroes, e.g:
n = [0 1 3 0 1]
x = [[ 0. 0. 0.]
[ 1. 0. 0.]
[ 1. 1. 1.]
[ 0. 0. 0.]
[ 1. 0. 0.]]
My solution is the following, and is very slow. Is it possible to do better?
n = np.random.poisson(2, 5)
max_n = max(n)
def f(y):
    return np.concatenate((np.ones(y), np.zeros(max_n - y)))
x = np.vstack([f(y) for y in n])  # vstack needs a sequence, not a map object (Python 3)
Here's one way to vectorize it:
>>> n = np.array([0,2,1,0,3])
>>> width = 4
>>> (np.arange(width) < n[:,None]).astype(int)
array([[0, 0, 0, 0],
[1, 1, 0, 0],
[1, 0, 0, 0],
[0, 0, 0, 0],
[1, 1, 1, 0]])
where if you liked, width could be max(n) or anything else you chose.
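To make the broadcasting step explicit, here is the same idea applied to the example from the question, as a sketch with width taken as n.max(). The intermediate names cols and mask are illustrative.

```python
import numpy as np

n = np.array([0, 1, 3, 0, 1])
width = n.max()

cols = np.arange(width)      # shape (3,): column indices 0, 1, 2
# n[:, None] has shape (5, 1); comparing it with cols broadcasts to (5, 3).
# Entry [i, j] is True exactly when j < n[i], i.e. for the first n[i] columns.
mask = cols < n[:, None]

x = mask.astype(float)
print(x)
```

This reproduces the matrix x from the question without any Python-level loop.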
import numpy as np
n = np.array([0, 1, 3, 0, 1])
max_n = max(n)
np.vstack([n > i for i in range(max_n)]).T.astype(int)  # pass a list: recent NumPy versions reject a bare generator; use xrange(max_n) on python 2.x
Output:
array([[0, 0, 0],
[1, 0, 0],
[1, 1, 1],
[0, 0, 0],
[1, 0, 0]])
