How to segment a matrix by neighbouring values? - python

Suppose I have a matrix like this:
m = [0, 1, 1, 0,
1, 1, 0, 0,
0, 0, 0, 1]
And I need to get the coordinates of the same neighbouring values (but not diagonally):
So the result would be a list of lists of coordinates in the "matrix" list, starting with [0,0], like this:
r = [[[0,0]],
[[0,1], [0,2], [1,0], [1,1]],
[[0,3], [1,2], [1,3], [2,0], [2,1], [2,2]]
[[2,3]]]
There must be a way to do that, but I'm really stuck.

tl;dr: We take an array of zeros and ones and use scipy.ndimage.label to convert it to an array of zeros and [1,2,3,...]. We then use np.where to find the coordinates of each element with value > 0. Elements that have the same value end up in the same list.
scipy.ndimage.label interprets non-zero elements of a matrix as features and labels them. Each unique feature in the input gets assigned a unique label. Features are e.g. groups of adjacent elements (or pixels) with the same value.
import numpy as np
from scipy.ndimage import label
# make dummy data
arr = np.array([[0,1,1,0], [1,1,0,0], [0,0,0,1]])
#initialise list of features
r = []
Since OP wanted all features, that is groups of zero and non-zero pixels, we use label twice: First on the original array, and second on 1 - original array. (For an array of zeros and ones, 1 - array just flips the values).
Now, label returns a tuple, containing the labelled array (which we are interested in) and the number of features that it found in that array (which we could use, but when I coded this, I chose to ignore it. So, we are interested in the first element of the tuple returned by label, which we access with [0]:
a = label(arr)[0]
b = label(1-arr)[0]
Now we check which unique pixel values label has assigned. So we want the set of a and b, repectively. In order for set() to work, we need to linearise both arrays, which we do with .ravel(). We have to subtract {0} in both cases, because for both a and b we are interested in only the non-zero values.
So, having found the unique labels, we loop through these values, and use np.where to find where on the array a given value is located. np.where returns a tuple of arrays. The first element of this tuple are all the row-coordinates for which the condition was met, and the second element are the column-coordinates.
So, we can use zip(* to unpack the two containers of length n to n containers of length 2. This means that we go from list of all row-coords + list of all column-coords to list of all row-column-coordinate pairs for which the condition is met. Finally in python 3, zip is a generator, which we can evaluate by calling list() on it. The resulting list is then appended to our list of coordinates, r.
for x in set(a.ravel())-{0}:
r.append(list(zip(*np.where(a==x))))
for x in set(b.ravel())-{0}:
r.append(list(zip(*np.where(b==x))))
print(r)
[[(0, 1), (0, 2), (1, 0), (1, 1)],
[(2, 3)],
[(0, 0)],
[(0, 3), (1, 2), (1, 3), (2, 0), (2, 1), (2, 2)]]
That said, we can speed up this code slightly by making use of the fact that label returns the number of features it assigned. This allows us to avoid the set command, which can take time on large arrays:
a, num_a = label(arr)
for x in range(1, num_a+1): # range from 1 to the highest label
r.append(list(zip(*np.where(a==x))))

A solution with only standard libraries:
from pprint import pprint
m = [0, 1, 1, 0,
1, 1, 0, 0,
0, 0, 0, 1]
def is_neighbour(x1, y1, x2, y2):
return (x1 in (x2-1, x2+1) and y1 == y2) or \
(x1 == x2 and y1 in (y2+1, y2-1))
def is_value_touching_group(val, groups, x, y):
for d in groups:
if d['color'] == val and any(is_neighbour(x, y, *cell) for cell in d['cells']):
return d
def check(m, w, h):
groups = []
for i in range(h):
for j in range(w):
val = m[i*w + j]
touching_group = is_value_touching_group(val, groups, i, j)
if touching_group:
touching_group['cells'].append( (i, j) )
else:
groups.append({'color':val, 'cells':[(i, j)]})
final_groups = []
while groups:
current_group = groups.pop()
for c in current_group['cells']:
touching_group = is_value_touching_group(current_group['color'], groups, *c)
if touching_group:
touching_group['cells'].extend(current_group['cells'])
break
else:
final_groups.append(current_group['cells'])
return final_groups
pprint( check(m, 4, 3) )
Prints:
[[(2, 3)],
[(0, 3), (1, 3), (1, 2), (2, 2), (2, 0), (2, 1)],
[(0, 1), (0, 2), (1, 1), (1, 0)],
[(0, 0)]]

Returns as a list of groups under value key.
import numpy as np
import math
def get_keys(old_dict):
new_dict = {}
for key, value in old_dict.items():
if value not in new_dict.keys():
new_dict[value] = []
new_dict[value].append(key)
else:
new_dict[value].append(key)
return new_dict
def is_neighbor(a,b):
if a==b:
return True
else:
distance = abs(a[0]-b[0]), abs(a[1]-b[1])
return distance == (0,1) or distance == (1,0)
def collate(arr):
arr2 = arr.copy()
ret = []
for a in arr:
for i, b in enumerate(arr2):
if set(a).intersection(set(b)):
a = list(set(a+b))
ret.append(a)
for clist in ret:
clist.sort()
return [list(y) for y in set([tuple(x) for x in ret])]
def get_groups(d):
for k,v in d.items():
ret = []
for point in v:
matches = [a for a in v if is_neighbor(point, a)]
ret.append(matches)
d[k] = collate(ret)
return d
a = np.array([[0,1,1,0],
[1,1,0,0],
[0,0,1,1]])
d = dict(np.ndenumerate(a))
d = get_keys(d)
d = get_groups(d)
print(d)
Result:
{
0: [[(0, 3), (1, 2), (1, 3)], [(0, 0)], [(2, 0), (2, 1)]],
1: [[(2, 2), (2, 3)], [(0, 1), (0, 2), (1, 0), (1, 1)]]
}

Related

Recursive function to find position of number within a matrix

I have to make this small recursion exercise where, given a matrix and a number I have to return the position of the number in the matrix. For example:
matrix = [[2,0,1],[3,5,3],[5,1,4,9],[0,5]]
numberToFind = 5
The expected result would be this one:
[(1,1),(2,0),(3,1)]
Could anyone pint me on how to start or what I have to do to create the code?
Here is one approach using a recursive generator:
matrix = [[2,0,1],[3,5,3],[5,1,4,9],[0,5]]
numberToFind = 5
def find(m, n, prev=tuple()):
for i,x in enumerate(m):
if isinstance(x, list):
yield from find(x, n, prev=prev+(i,))
elif x==n:
yield prev+(i,)
list(find(matrix, numberToFind))
output: [(1, 1), (2, 0), (3, 1)]
other example:
matrix = [[2,0,1],[3,5,3],[5,1,4,5],[0,5],[[[2,5,[1,5]]]]]
list(find(matrix, numberToFind))
# [(1, 1), (2, 0), (2, 3), (3, 1), (4, 0, 0, 1), (4, 0, 0, 2, 1)]
you can use just a single for loop as below. Way more efficient than a recursion
matrix = [[2,0,1],[3,5,3],[5,1,4,9],[0,5]]
my_list = []
for x in range(0,len(matrix)):
try:
a = matrix[x].index(5)
my_list.append((x,a))
except ValueError:
pass

How can I add a random binary info into current 'coordinate'? (Python)

This is part of the code I'm working on: (Using Python)
import random
pairs = [
(0, 1),
(1, 2),
(2, 3),
(3, 0), # I want to treat 0,1,2,3 as some 'coordinate' (or positional infomation)
]
alphas = [(random.choice([1, -1]) * random.uniform(5, 15), pairs[n]) for n in range(4)]
alphas.sort(reverse=True, key=lambda n: abs(n[0]))
A sample output looks like this:
[(13.747649802587832, (2, 3)),
(13.668274782626717, (1, 2)),
(-9.105374057105703, (0, 1)),
(-8.267840318934667, (3, 0))]
Now I'm wondering is there a way I can give each element in 0,1,2,3 a random binary number, so if [0,1,2,3] = [0,1,1,0], (By that I mean if the 'coordinates' on the left list have the corresponding random binary information on the right list. In this case, coordinate 0 has the random binary number '0' and etc.) then the desired output using the information above looks like:
[(13.747649802587832, (1, 0)),
(13.668274782626717, (1, 1)),
(-9.105374057105703, (0, 1)),
(-8.267840318934667, (0, 0))]
Thanks!!
One way using dict:
d = dict(zip([0,1,2,3], [0,1,1,0]))
[(i, tuple(d[j] for j in c)) for i, c in alphas]
Output:
[(13.747649802587832, (1, 0)),
(13.668274782626717, (1, 1)),
(-9.105374057105703, (0, 1)),
(-8.267840318934667, (0, 0))]
You can create a function to convert your number to the random binary assigned. Using a dictionary within this function would make sense. Something like this should work where output1 is that first sample output you provide and binary_code would be [0, 1, 1, 0] in your example:
def convert2bin(original, binary_code):
binary_dict = {n: x for n, x in enumerate(binary_code)}
return tuple([binary_code[x] for x in original])
binary_code = np.random.randint(2, size=4)
[convert2bin(x[1], binary_code) for x in output1]

How to find the index of a tuple in a 2D array in python?

I have an array with the form as follows (with much more elements):
coords = np.array(
[[(2, 1), 1613, 655],
[(2, 5), 906, 245],
[(5, 2), 0, 0]])
And I would like to find the index of a specific tuple. For example, I might be looking for the position of the tuple (2, 5), which should be in position 1 in this case.
I have tried with np.where and np.argwhere, with no luck:
pos = np.argwhere(coords == (2,5))
print(pos)
>> DeprecationWarning: elementwise comparison failed; this will raise an error in the future.
pos = np.where(coords == (2,5))
print(pos)
>> DeprecationWarning: elementwise comparison failed; this will raise an error in the future.
How can I get the index of a tuple?
If you intend to use a numpy array containing objects, all comparison will be done using python itself. At that point, you have given up almost all the advantages of numpy and may as well use a list:
coords = coords.tolist()
index = next((i for i, n in enumerate(coords) if n[0] == (2, 5)), -1)
If you really want to use numpy, I suggest you transform your data appropriately. Two simple options come to mind. You can either expand your tuple and create an array of shape (N, 4), or you can create a structured array that preserves the arrangement of the data as a unit, and has shape (N,). The former is much simpler, while the later is, in my opinion, more elegant.
If you flatten the coordinates:
coords = np.array([[x[0][0], x[0][1], x[1], x[2]] for x in coords])
index = np.flatnonzero(np.all(coords[:, :2] == [2, 5], axis=1))
The structured solution:
coordt = np.dtype([('x', np.int_), ('y', np.int_)])
dt = np.dtype([('coord', coordt), ('a', np.int_), ('b', np.int_)])
coords = np.array([((2, 1), 1613, 655), ((2, 5), 906, 245), ((5, 2), 0, 0)], dtype=dt)
index = np.flatnonzero(coords['coord'] == np.array((2, 5), dtype=coordt))
You can also just transform the first part of your data to a real numpy array, and operate on that:
coords = np.array(coords[:, 0].tolist())
index = np.flatnonzero((coords == [2, 5]).all(axis=1))
You should not compare (2, 5) and coords, but compare (2, 5) and coords[:, 0].
Try this code.
np.where([np.array_equal(coords[:, 0][i], (2, 5)) for i in range(len(coords))])[0]
Try this one
import numpy as np
coords = np.array([[(2, 1), 1613, 655], [(2, 5), 906, 245], [(5, 2), 0, 0]])
tpl=(2,5)
i=0 # index of the column in which the tuple you are looking for is listed
pos=([t[i] for t in coords].index(tpl))
print(pos)
Assuming your target tuple (e.g. (2,5) ) is always in the first column of the numpy array coords i.e. coords[:,0] you can simply do the following without any loops!
[*coords[:,0]].index((2,5))
If the tuples aren't necessarily in the first column always, then you can use,
[*coords.flatten()].index((2,5))//3
Hope that helps.
First of all, the tuple (2, 5) is in position 0 as it is the first element of the list [(2, 5), 906, 245].
And second of all, you can use basic python functions to check the index of a tuple in that array. Here's how you do it:
>>> coords = np.array([[(2, 1), 1613, 655], [(2, 5), 906, 245], [(5, 2), 0, 0]])
>>>
>>> coords_list = cl = list(coords)
>>> cl
[[(2, 1), 1613, 655], [(2, 5), 906, 245], [(5, 2), 0, 0]]
>>>
>>> tuple_to_be_checked = tuple_ = (2, 5)
>>> tuple_
(2, 5)
>>>
>>> for i in range(0, len(cl), 1): # Dynamically works for any array `cl`
for j in range(0, len(cl[i]), 1): # Dynamic; works for any list `cl[i]`
if cl[i][j] == tuple_: # Found the tuple
# Print tuple index and containing list index
print(f'Tuple at index {j} of list at index {i}')
break # Break to avoid unwanted loops
Tuple at index 0 of list at index 1
>>>

python: finding smallest distance between two points in two arrays

I've got two lists containing a series of tuples (x,y), representing different points on a Cartesian plane:
a = [(0, 0), (1, 2), (1, 3), (2, 4)]
b = [(3, 4), (4, 1), (5, 3)]
I'd like to find the two points (one for each list, not within the same list) at the smaller distance, in this specific case:
[((2, 4), (3, 4))]
whose distance is equal to 1. I was using list comprehension, as:
[(Pa, Pb) for Pa in a for Pb in b \
if math.sqrt(math.pow(Pa[0]-Pb[0],2) + math.pow(Pa[1]-Pb[1],2)) <= 2.0]
but this uses a threshold value. Is there a way to append an argmin() somewhere or something like that and get only the pair [((xa, ya), (xb, yb))] smallest distance? Thanks.
import numpy
e = [(Pa, Pb) for Pa in a for Pb in b]
e[numpy.argmin([math.sqrt(math.pow(Pa[0]-Pb[0],2) + math.pow(Pa[1]-Pb[1],2)) for (Pa, Pb) in e])]
Will use argmin as you suggested and return ((2, 4), (3, 4))
Just use list comprehension and min as follows:
dist = [(Pa, Pb, math.sqrt(math.pow(Pa[0]-Pb[0],2) + math.pow(Pa[1]-Pb[1],2)))
for Pa in a for Pb in b]
print min(dist, key=lambda x:x[2])[0:2]
Solution similar to DevShark's one with a few optimization tricks:
import math
import itertools
import numpy as np
def distance(p1, p2):
return math.hypot(p2[0] - p1[0], p2[1] - p1[1])
a = [(0, 0), (1, 2), (1, 3), (2, 4)]
b = [(3, 4), (4, 1), (5, 3)]
points = [tup for tup in itertools.product(a, b)]
print(points[np.argmin([distance(Pa, Pb) for (Pa, Pb) in points])])
You could also use the scipy.spatial library with the following :
import scipy.spatial as spspat
import numpy as np
distanceMatrix = spspat.distance_matrix(a,b)
args = np.argwhere(distanceMatrix==distanceMatrix.min())
print(args)
This will return you the following : array([[3, 0]]) , being the position of the points in each list.
This should also work in any dimension.

Sorting 3d arrays in python as per grid

I have grid points in 3d and I would like to sort them based (x,y,z) using python (if possible avoiding loops)..
For example if the input is,
(1,2,1), (0,8,1), (1,0,0) ..
then output should be
(0,8,1), (1,0,0), (1,2,1)..
Sorry for this side track but I am actually doing is reading from a file which has data in following way:
x y z f(x) f(y) f(z)..
what I was doing was following:
def fill_array(output_array,source_array,nx,ny,nz,position):
for i in range(nx):
for j in range(ny):
for k in range(nz):
output_array[i][j][k] = source_array[i][j][k][position]
nx = 8
ny = 8
nz = 8
ndim = 6
x = np.zeros((nx,ny,nz))
y = np.zeros((nx,ny,nz))
z = np.zeros((nx,ny,nz))
bx = np.zeros((nx,ny,nz))
by = np.zeros((nx,ny,nz))
bz = np.zeros((nx,ny,nz))
data_file = np.loadtxt('datafile')
f = np.reshape(data_file, (nx,ny,nz,ndim))
fill_array(x,f,nx,ny,nz,0))
fill_array(y,f,nx,ny,nz,1)
fill_array(z,f,nx,ny,nz,2)
fill_array(fx,f,nx,ny,nz,3)
fill_array(fy,f,nx,ny,nz,4)
fill_array(fz,f,nx,ny,nz,5)
This was working fine when data was arranged (as explained previously) but with file written not in order it is creating problems with plot later on. Is there are better way to do this ? Of course I only want to arrange x,y,z and then associate functional value f(x),f(y),f(z) to its right position (x,y,z)
two updates
1) i am getting following error when I use sorted with either x,y,z,fx,fy,fz or f.
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
2) i need it in that specific way because I am using mayavi then for contour3d
The built-in function sorted does what you want:
>>> a = [(1, 2, 1), (0, 8, 1), (1, 0, 0)]
>>> sorted(a)
[(0, 8, 1), (1, 0, 0), (1, 2, 1)]
Use [sorted][1].
In [71]: sorted(a)
Out[71]: [(0, 8, 1), (1, 0, 0), (1, 2, 1)]
more precisely
In [70]: sorted(a, key=lambda x: (x[0], x[1], x[2]))
Out[70]: [(0, 8, 1), (1, 0, 0), (1, 2, 1)]
key=lambda x: (x[0], x[1], x[2])
at this step we are sorting list at 0th 1st and 2nd element of tuple

Categories