Sorting 3d arrays in python as per grid - python

I have grid points in 3D and I would like to sort them by (x, y, z) using Python (if possible, avoiding loops).
For example if the input is,
(1,2,1), (0,8,1), (1,0,0) ..
then output should be
(0,8,1), (1,0,0), (1,2,1)..
Sorry for the side track, but what I am actually doing is reading from a file which has data in the following layout:
x y z f(x) f(y) f(z)..
What I was doing was the following:
import numpy as np

def fill_array(output_array, source_array, nx, ny, nz, position):
    # copy column `position` of every grid point into output_array
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                output_array[i][j][k] = source_array[i][j][k][position]

nx = 8
ny = 8
nz = 8
ndim = 6
x = np.zeros((nx,ny,nz))
y = np.zeros((nx,ny,nz))
z = np.zeros((nx,ny,nz))
fx = np.zeros((nx,ny,nz))
fy = np.zeros((nx,ny,nz))
fz = np.zeros((nx,ny,nz))
data_file = np.loadtxt('datafile')
f = np.reshape(data_file, (nx,ny,nz,ndim))
fill_array(x,f,nx,ny,nz,0)
fill_array(y,f,nx,ny,nz,1)
fill_array(z,f,nx,ny,nz,2)
fill_array(fx,f,nx,ny,nz,3)
fill_array(fy,f,nx,ny,nz,4)
fill_array(fz,f,nx,ny,nz,5)
This worked fine when the data was arranged in order (as explained previously), but when the file is not written in order it creates problems with the plot later on. Is there a better way to do this? Of course, I only want to arrange x, y, z and then associate the functional values f(x), f(y), f(z) with their right position (x, y, z).
Two updates:
1) I get the following error when I use sorted with either x, y, z, fx, fy, fz or f:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
2) I need it in that specific layout because I am then using mayavi for contour3d.

The built-in function sorted does what you want:
>>> a = [(1, 2, 1), (0, 8, 1), (1, 0, 0)]
>>> sorted(a)
[(0, 8, 1), (1, 0, 0), (1, 2, 1)]

Use sorted.
In [71]: sorted(a)
Out[71]: [(0, 8, 1), (1, 0, 0), (1, 2, 1)]
More explicitly, you can pass a key:
In [70]: sorted(a, key=lambda x: (x[0], x[1], x[2]))
Out[70]: [(0, 8, 1), (1, 0, 0), (1, 2, 1)]
Here key=lambda x: (x[0], x[1], x[2]) sorts the list by the 0th, then the 1st, then the 2nd element of each tuple, which is also how tuples compare by default.
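For the NumPy case in the question (rows of x y z f(x) f(y) f(z) in arbitrary order), sorted raises the ValueError mentioned in the update because it compares whole rows. A loop-free alternative is np.lexsort, sketched below. The small generated table is only a stand-in for np.loadtxt('datafile'), and the 2x2x2 grid size and the names rows and shuffled are assumptions for illustration.

```python
import numpy as np

# Stand-in for np.loadtxt('datafile'): one row per grid point, columns x y z fx fy fz.
nx = ny = nz = 2
xs, ys, zs = np.meshgrid(np.arange(nx), np.arange(ny), np.arange(nz), indexing="ij")
rows = np.column_stack([xs.ravel(), ys.ravel(), zs.ravel(),
                        xs.ravel() * 10.0, ys.ravel() * 10.0, zs.ravel() * 10.0])

# Simulate a file written out of order.
rng = np.random.default_rng(0)
shuffled = rng.permutation(rows)

# np.lexsort uses the LAST key as the primary key, so pass (z, y, x)
# to sort by x first, then y, then z.
order = np.lexsort((shuffled[:, 2], shuffled[:, 1], shuffled[:, 0]))
sorted_rows = shuffled[order]

# With rows in grid order, a reshape replaces the triple loop of fill_array.
f = sorted_rows.reshape(nx, ny, nz, 6)
x, y, z, fx, fy, fz = (f[..., i] for i in range(6))
```

Each of x, y, z, fx, fy, fz then has shape (nx, ny, nz), which is the layout the original fill_array loop produced.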

Related

Recursive function to find position of number within a matrix

I have to make this small recursion exercise where, given a matrix and a number I have to return the position of the number in the matrix. For example:
matrix = [[2,0,1],[3,5,3],[5,1,4,9],[0,5]]
numberToFind = 5
The expected result would be this one:
[(1,1),(2,0),(3,1)]
Could anyone point me to how to start or what I have to do to create the code?
Here is one approach using a recursive generator:
matrix = [[2,0,1],[3,5,3],[5,1,4,9],[0,5]]
numberToFind = 5
def find(m, n, prev=tuple()):
    for i,x in enumerate(m):
        if isinstance(x, list):
            yield from find(x, n, prev=prev+(i,))
        elif x==n:
            yield prev+(i,)
list(find(matrix, numberToFind))
output: [(1, 1), (2, 0), (3, 1)]
other example:
matrix = [[2,0,1],[3,5,3],[5,1,4,5],[0,5],[[[2,5,[1,5]]]]]
list(find(matrix, numberToFind))
# [(1, 1), (2, 0), (2, 3), (3, 1), (4, 0, 0, 1), (4, 0, 0, 2, 1)]
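You can also use a single for loop, as below, which avoids recursion when the matrix is a flat list of rows. Note that list.index returns only the first match in each row, so a row that contains the number twice is reported once.
matrix = [[2,0,1],[3,5,3],[5,1,4,9],[0,5]]
my_list = []
for x in range(0,len(matrix)):
    try:
        a = matrix[x].index(5)
        my_list.append((x,a))
    except ValueError:
        pass
print(my_list)  # [(1, 1), (2, 0), (3, 1)]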
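If recursion is not required and every occurrence in a row should be reported, a nested comprehension over enumerate is one option. This is a minimal sketch for a two-level list of rows; the function name find_all is made up for illustration.

```python
def find_all(matrix, target):
    """Return (row, column) for every occurrence of target in a list of rows."""
    return [(i, j)
            for i, row in enumerate(matrix)
            for j, value in enumerate(row)
            if value == target]

matrix = [[2, 0, 1], [3, 5, 3], [5, 1, 4, 9], [0, 5]]
positions = find_all(matrix, 5)
print(positions)
```

Unlike the list.index loop, this also returns both positions when a row contains the number more than once.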

How can I add a random binary info into current 'coordinate'? (Python)

This is part of the code I'm working on: (Using Python)
import random
pairs = [
(0, 1),
(1, 2),
(2, 3),
(3, 0), # I want to treat 0,1,2,3 as some 'coordinate' (or positional information)
]
alphas = [(random.choice([1, -1]) * random.uniform(5, 15), pairs[n]) for n in range(4)]
alphas.sort(reverse=True, key=lambda n: abs(n[0]))
A sample output looks like this:
[(13.747649802587832, (2, 3)),
(13.668274782626717, (1, 2)),
(-9.105374057105703, (0, 1)),
(-8.267840318934667, (3, 0))]
Now I'm wondering whether there is a way to give each element of 0,1,2,3 a random binary number. For example, if [0,1,2,3] maps to [0,1,1,0] (that is, each 'coordinate' in the left list gets the corresponding random binary value in the right list, so coordinate 0 gets '0', coordinate 1 gets '1', and so on), then the desired output using the information above looks like:
[(13.747649802587832, (1, 0)),
(13.668274782626717, (1, 1)),
(-9.105374057105703, (0, 1)),
(-8.267840318934667, (0, 0))]
Thanks!!
One way using dict:
d = dict(zip([0,1,2,3], [0,1,1,0]))
[(i, tuple(d[j] for j in c)) for i, c in alphas]
Output:
[(13.747649802587832, (1, 0)),
(13.668274782626717, (1, 1)),
(-9.105374057105703, (0, 1)),
(-8.267840318934667, (0, 0))]
You can create a function that converts each coordinate to the random binary value assigned to it. Using a dictionary inside this function makes sense. Something like this should work, where output1 is the first sample output you provide and binary_code would be [0, 1, 1, 0] in your example:
import numpy as np
def convert2bin(original, binary_code):
    # map each coordinate (its position in binary_code) to its binary value
    binary_dict = {n: x for n, x in enumerate(binary_code)}
    return tuple(binary_dict[x] for x in original)
binary_code = np.random.randint(2, size=4)
[convert2bin(x[1], binary_code) for x in output1]
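Putting the pieces together with only the standard library might look like the sketch below. The seed and the names bits and relabelled are assumptions added for illustration; the random binary value is drawn once per coordinate and then reused for every pair that mentions it.

```python
import random

random.seed(0)  # assumption: fixed seed so the run is repeatable

pairs = [(0, 1), (1, 2), (2, 3), (3, 0)]
alphas = [(random.choice([1, -1]) * random.uniform(5, 15), p) for p in pairs]
alphas.sort(reverse=True, key=lambda a: abs(a[0]))

# Draw one random bit per distinct coordinate appearing in the pairs.
coordinates = sorted({c for p in pairs for c in p})
bits = {c: random.randint(0, 1) for c in coordinates}

# Replace each coordinate by its bit, keeping the values and their order.
relabelled = [(value, tuple(bits[c] for c in pair)) for value, pair in alphas]
print(bits)
print(relabelled)
```

Because the mapping lives in one dict, coordinate 0 gets the same bit in (0, 1) and in (3, 0).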

How to find the index of a tuple in a 2D array in python?

I have an array of the following form (with many more elements):
coords = np.array(
[[(2, 1), 1613, 655],
[(2, 5), 906, 245],
[(5, 2), 0, 0]])
And I would like to find the index of a specific tuple. For example, I might be looking for the position of the tuple (2, 5), which should be in position 1 in this case.
I have tried with np.where and np.argwhere, with no luck:
pos = np.argwhere(coords == (2,5))
print(pos)
>> DeprecationWarning: elementwise comparison failed; this will raise an error in the future.
pos = np.where(coords == (2,5))
print(pos)
>> DeprecationWarning: elementwise comparison failed; this will raise an error in the future.
How can I get the index of a tuple?
If you intend to use a numpy array containing objects, all comparison will be done using python itself. At that point, you have given up almost all the advantages of numpy and may as well use a list:
coords = coords.tolist()
index = next((i for i, n in enumerate(coords) if n[0] == (2, 5)), -1)
If you really want to use numpy, I suggest you transform your data appropriately. Two simple options come to mind. You can either expand your tuple and create an array of shape (N, 4), or you can create a structured array that preserves the arrangement of the data as a unit, and has shape (N,). The former is much simpler, while the latter is, in my opinion, more elegant.
If you flatten the coordinates:
coords = np.array([[x[0][0], x[0][1], x[1], x[2]] for x in coords])
index = np.flatnonzero(np.all(coords[:, :2] == [2, 5], axis=1))
The structured solution:
coordt = np.dtype([('x', np.int_), ('y', np.int_)])
dt = np.dtype([('coord', coordt), ('a', np.int_), ('b', np.int_)])
coords = np.array([((2, 1), 1613, 655), ((2, 5), 906, 245), ((5, 2), 0, 0)], dtype=dt)
index = np.flatnonzero(coords['coord'] == np.array((2, 5), dtype=coordt))
You can also just transform the first part of your data to a real numpy array, and operate on that:
coords = np.array(coords[:, 0].tolist())
index = np.flatnonzero((coords == [2, 5]).all(axis=1))
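As a self-contained sketch of the flattened (N, 4) option, starting from a plain Python list rather than an object array (the names records and flat are assumptions for illustration):

```python
import numpy as np

records = [[(2, 1), 1613, 655],
           [(2, 5), 906, 245],
           [(5, 2), 0, 0]]

# Expand each (x, y) tuple into two numeric columns: shape (N, 4).
flat = np.array([[c[0], c[1], a, b] for c, a, b in records])

# Rows whose first two columns equal (2, 5); the comparison broadcasts per row.
matches = np.flatnonzero((flat[:, :2] == (2, 5)).all(axis=1))
index = int(matches[0]) if matches.size else -1
print(index)
```

Returning -1 when nothing matches mirrors the next(..., -1) default used in the list-based version.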
You should not compare (2, 5) and coords, but compare (2, 5) and coords[:, 0].
Try this code.
np.where([np.array_equal(coords[:, 0][i], (2, 5)) for i in range(len(coords))])[0]
Try this one
import numpy as np
coords = np.array([[(2, 1), 1613, 655], [(2, 5), 906, 245], [(5, 2), 0, 0]])
tpl=(2,5)
i=0 # index of the column in which the tuple you are looking for is listed
pos=([t[i] for t in coords].index(tpl))
print(pos)
Assuming your target tuple (e.g. (2,5) ) is always in the first column of the numpy array coords i.e. coords[:,0] you can simply do the following without any loops!
[*coords[:,0]].index((2,5))
If the tuples aren't necessarily in the first column always, then you can use,
[*coords.flatten()].index((2,5))//3
Hope that helps.
First of all, the tuple (2, 5) is in position 0 as it is the first element of the list [(2, 5), 906, 245].
And second of all, you can use basic python functions to check the index of a tuple in that array. Here's how you do it:
>>> coords = np.array([[(2, 1), 1613, 655], [(2, 5), 906, 245], [(5, 2), 0, 0]])
>>>
>>> coords_list = cl = list(coords)
>>> cl
[[(2, 1), 1613, 655], [(2, 5), 906, 245], [(5, 2), 0, 0]]
>>>
>>> tuple_to_be_checked = tuple_ = (2, 5)
>>> tuple_
(2, 5)
>>>
>>> for i in range(0, len(cl), 1): # Dynamically works for any array `cl`
...     for j in range(0, len(cl[i]), 1): # Dynamic; works for any list `cl[i]`
...         if cl[i][j] == tuple_: # Found the tuple
...             # Print tuple index and containing list index
...             print(f'Tuple at index {j} of list at index {i}')
...             break # Break to avoid unwanted loops
...
Tuple at index 0 of list at index 1
>>>

How to segment a matrix by neighbouring values?

Suppose I have a matrix like this:
m = [0, 1, 1, 0,
1, 1, 0, 0,
0, 0, 0, 1]
And I need to get the coordinates of the same neighbouring values (but not diagonally):
So the result would be a list of lists of coordinates in the "matrix" list, starting with [0,0], like this:
r = [[[0,0]],
     [[0,1], [0,2], [1,0], [1,1]],
     [[0,3], [1,2], [1,3], [2,0], [2,1], [2,2]],
     [[2,3]]]
There must be a way to do that, but I'm really stuck.
tl;dr: We take an array of zeros and ones and use scipy.ndimage.label to convert it to an array of zeros and [1,2,3,...]. We then use np.where to find the coordinates of each element with value > 0. Elements that have the same value end up in the same list.
scipy.ndimage.label interprets non-zero elements of a matrix as features and labels them. Each unique feature in the input gets assigned a unique label. Features are e.g. groups of adjacent elements (or pixels) with the same value.
import numpy as np
from scipy.ndimage import label
# make dummy data
arr = np.array([[0,1,1,0], [1,1,0,0], [0,0,0,1]])
#initialise list of features
r = []
Since OP wanted all features, that is groups of zero and non-zero pixels, we use label twice: First on the original array, and second on 1 - original array. (For an array of zeros and ones, 1 - array just flips the values).
Now, label returns a tuple containing the labelled array (which we are interested in) and the number of features that it found in that array (which we could use, but when I coded this, I chose to ignore it). So, we are interested in the first element of the tuple returned by label, which we access with [0]:
a = label(arr)[0]
b = label(1-arr)[0]
Now we check which unique pixel values label has assigned. So we want the set of a and b, respectively. In order for set() to work, we need to linearise both arrays, which we do with .ravel(). We have to subtract {0} in both cases, because for both a and b we are interested in only the non-zero values.
So, having found the unique labels, we loop through these values, and use np.where to find where on the array a given value is located. np.where returns a tuple of arrays. The first element of this tuple holds all the row-coordinates for which the condition was met, and the second element holds the column-coordinates.
So, we can use zip(*...) to turn the two containers of length n into n containers of length 2. This means that we go from list of all row-coords + list of all column-coords to list of all row-column-coordinate pairs for which the condition is met. Finally, in Python 3, zip returns a lazy iterator, which we can evaluate by calling list() on it. The resulting list is then appended to our list of coordinates, r.
for x in set(a.ravel())-{0}:
    r.append(list(zip(*np.where(a==x))))
for x in set(b.ravel())-{0}:
    r.append(list(zip(*np.where(b==x))))
print(r)
[[(0, 1), (0, 2), (1, 0), (1, 1)],
[(2, 3)],
[(0, 0)],
[(0, 3), (1, 2), (1, 3), (2, 0), (2, 1), (2, 2)]]
That said, we can speed up this code slightly by making use of the fact that label returns the number of features it assigned. This allows us to avoid the set command, which can take time on large arrays:
a, num_a = label(arr)
for x in range(1, num_a+1): # range from 1 to the highest label
    r.append(list(zip(*np.where(a==x))))
A solution with only standard libraries:
from pprint import pprint
m = [0, 1, 1, 0,
1, 1, 0, 0,
0, 0, 0, 1]
def is_neighbour(x1, y1, x2, y2):
    return (x1 in (x2-1, x2+1) and y1 == y2) or \
           (x1 == x2 and y1 in (y2+1, y2-1))
def is_value_touching_group(val, groups, x, y):
    for d in groups:
        if d['color'] == val and any(is_neighbour(x, y, *cell) for cell in d['cells']):
            return d
def check(m, w, h):
    groups = []
    for i in range(h):
        for j in range(w):
            val = m[i*w + j]
            touching_group = is_value_touching_group(val, groups, i, j)
            if touching_group:
                touching_group['cells'].append( (i, j) )
            else:
                groups.append({'color':val, 'cells':[(i, j)]})
    final_groups = []
    while groups:
        current_group = groups.pop()
        for c in current_group['cells']:
            touching_group = is_value_touching_group(current_group['color'], groups, *c)
            if touching_group:
                touching_group['cells'].extend(current_group['cells'])
                break
        else:
            final_groups.append(current_group['cells'])
    return final_groups
pprint( check(m, 4, 3) )
Prints:
[[(2, 3)],
[(0, 3), (1, 3), (1, 2), (2, 2), (2, 0), (2, 1)],
[(0, 1), (0, 2), (1, 1), (1, 0)],
[(0, 0)]]
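Another standard-library route, different from the group-merging approach above, is a breadth-first flood fill: start from each unvisited cell and collect every 4-connected cell with the same value. The sketch below assumes the same flat, row-major list with width w and height h; the function name segment is made up for illustration.

```python
from collections import deque

def segment(m, w, h):
    """Group cells of a flat row-major matrix into 4-connected regions of equal value."""
    seen = set()
    groups = []
    for start in range(w * h):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        cells = []
        while queue:
            cur = queue.popleft()
            i, j = divmod(cur, w)  # flat index -> (row, column)
            cells.append((i, j))
            # Visit the up, down, left and right neighbours (no diagonals).
            for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                nxt = ni * w + nj
                if 0 <= ni < h and 0 <= nj < w and nxt not in seen and m[nxt] == m[start]:
                    seen.add(nxt)
                    queue.append(nxt)
        groups.append(sorted(cells))
    return groups

m = [0, 1, 1, 0,
     1, 1, 0, 0,
     0, 0, 0, 1]
result = segment(m, 4, 3)
for group in result:
    print(group)
```

Each cell is enqueued once, so the work grows linearly with the number of cells, and the groups come out in the order their first cell appears, starting with (0, 0).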
This returns a dict keyed by value, where each value maps to its list of groups.
import numpy as np
import math
def get_keys(old_dict):
    # invert {coordinate: value} into {value: [coordinates]}
    new_dict = {}
    for key, value in old_dict.items():
        if value not in new_dict.keys():
            new_dict[value] = []
            new_dict[value].append(key)
        else:
            new_dict[value].append(key)
    return new_dict
def is_neighbor(a,b):
    if a==b:
        return True
    else:
        distance = abs(a[0]-b[0]), abs(a[1]-b[1])
        return distance == (0,1) or distance == (1,0)
def collate(arr):
    # merge neighbour lists that share at least one point
    arr2 = arr.copy()
    ret = []
    for a in arr:
        for i, b in enumerate(arr2):
            if set(a).intersection(set(b)):
                a = list(set(a+b))
        ret.append(a)
    for clist in ret:
        clist.sort()
    return [list(y) for y in set([tuple(x) for x in ret])]
def get_groups(d):
    for k,v in d.items():
        ret = []
        for point in v:
            matches = [a for a in v if is_neighbor(point, a)]
            ret.append(matches)
        d[k] = collate(ret)
    return d
a = np.array([[0,1,1,0],
              [1,1,0,0],
              [0,0,1,1]])
d = dict(np.ndenumerate(a))
d = get_keys(d)
d = get_groups(d)
print(d)
Result:
{
0: [[(0, 3), (1, 2), (1, 3)], [(0, 0)], [(2, 0), (2, 1)]],
1: [[(2, 2), (2, 3)], [(0, 1), (0, 2), (1, 0), (1, 1)]]
}

Dimensionality agnostic (generic) cartesian product [duplicate]

I'm looking to generate the cartesian product of a relatively large number of arrays to span a high-dimensional grid. Because of the high dimensionality, it won't be possible to store the result of the cartesian product computation in memory; rather it will be written to hard disk. Because of this constraint, I need access to the intermediate results as they are generated. What I've been doing so far is this:
for x in xrange(0, 10):
    for y in xrange(0, 10):
        for z in xrange(0, 10):
            writeToHdd(x,y,z)
which, apart from being very nasty, is not scalable (i.e. it would require me writing as many loops as dimensions). I have tried to use the solution proposed here, but that is a recursive solution, which therefore makes it quite hard to obtain the results on the fly as they are being generated. Is there any 'neat' way to do this other than having a hardcoded loop per dimension?
In plain Python, you can generate the Cartesian product of a collection of iterables using itertools.product.
>>> import itertools
>>> arrays = range(0, 2), range(4, 6), range(8, 10)
>>> list(itertools.product(*arrays))
[(0, 4, 8), (0, 4, 9), (0, 5, 8), (0, 5, 9), (1, 4, 8), (1, 4, 9), (1, 5, 8), (1, 5, 9)]
In Numpy, you can combine numpy.meshgrid (passing sparse=True to avoid expanding the product in memory) with numpy.broadcast, which iterates over the broadcast grids element by element:
>>> arrays = np.arange(0, 2), np.arange(4, 6), np.arange(8, 10)
>>> grid = np.meshgrid(*arrays, sparse=True)
>>> [tuple(int(v) for v in p) for p in np.broadcast(*grid)]
[(0, 4, 8), (0, 4, 9), (1, 4, 8), (1, 4, 9), (0, 5, 8), (0, 5, 9), (1, 5, 8), (1, 5, 9)]
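Since itertools.product is lazy, each point can be handed to the writer as soon as it is produced, for any number of dimensions, without holding the whole product in memory. A minimal sketch follows; the function stream_product and the write callback are assumptions standing in for the question's writeToHdd.

```python
import itertools

def stream_product(axes, write):
    """Call write(point) for every point of the Cartesian product, one at a time."""
    count = 0
    for point in itertools.product(*axes):
        write(point)  # only the current point is in memory here
        count += 1
    return count

# Three dimensions here, but the list may hold any number of iterables.
axes = [range(0, 2), range(4, 6), range(8, 10)]

# For demonstration the "writer" appends to a list; a real one would write to disk.
written = []
n_points = stream_product(axes, written.append)
print(n_points, written[0], written[-1])
```

The number of dimensions is simply len(axes), so no loop has to be hardcoded per dimension.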
I think I figured out a nice way using a memory mapped file:
def carthesian_product_mmap(vectors, filename, mode='w+'):
    '''
    Vectors should be a tuple of `numpy.ndarray` vectors. You could
    also make it more flexible, and include some error checking
    '''
    # Make a meshgrid with `copy=False` to create views
    grids = np.meshgrid(*vectors, copy=False, indexing='ij')
    # The shape for concatenating the grids from meshgrid
    shape = grids[0].shape + (len(vectors),)
    # Find the "highest" dtype necessary
    dtype = np.result_type(*vectors)
    # Instantiate the memory mapped file
    M = np.memmap(filename, dtype, mode, shape=shape)
    # Fill the memmap with the grids
    for i, grid in enumerate(grids):
        M[...,i] = grid
    # Make sure the data is written to disk (optional?)
    M.flush()
    # Reshape to put it in the right format for the Cartesian product
    return M.reshape((-1, len(vectors)))
But I wonder if you really need to store the whole Cartesian product (there's a lot of data duplication). Is it not an option to generate the rows in the product at the moment they're needed?
It seems you just want to loop over an arbitrary number of dimensions. My generic solution for this is using an index field and increment indices plus handling overflows.
Example:
n = 3 # number of dimensions
N = 1 # highest index value per dimension
idx = [0]*n
while True:
    print(idx)
    # increase first dimension
    idx[0] += 1
    # handle overflows
    for i in range(0, n-1):
        if idx[i] > N:
            # reset this dimension and increase next higher dimension
            idx[i] = 0
            idx[i+1] += 1
    if idx[-1] > N:
        # overflow in the last dimension, we are finished
        break
Gives:
[0, 0, 0]
[1, 0, 0]
[0, 1, 0]
[1, 1, 0]
[0, 0, 1]
[1, 0, 1]
[0, 1, 1]
[1, 1, 1]
Numpy has something similar inbuilt: ndenumerate.
