Crop empty arrays (padding) from a volume - python

What I want to do is crop a volume to remove all irrelevant data. For example, say I have a 100x100x100 volume filled with zeros, except for a 50x50x50 volume within that is filled with ones.
How do I obtain the cropped 50x50x50 volume from the original?
Here's the naive method I came up with.
import numpy as np
import tensorflow as tf
test = np.zeros((100, 100, 100))  # create an empty 100x100x100 volume
rand = np.random.rand(66, 25, 34)  # create a 66x25x34 filled volume
test[10:76, 20:45, 30:64] = rand  # partially fill the empty volume
# initialize the cropping coordinates (the upper bounds are exclusive slice ends)
minx, miny, minz = 0, 0, 0
maxx, maxy, maxz = test.shape
# compute the optimal cropping coordinates
while tf.reduce_max(test[minx, :, :]) == 0:  # check for empty slices along the x axis
    minx += 1
while tf.reduce_max(test[:, miny, :]) == 0:  # check for empty slices along the y axis
    miny += 1
while tf.reduce_max(test[:, :, minz]) == 0:  # check for empty slices along the z axis
    minz += 1
while tf.reduce_max(test[maxx - 1, :, :]) == 0:
    maxx -= 1
while tf.reduce_max(test[:, maxy - 1, :]) == 0:
    maxy -= 1
while tf.reduce_max(test[:, :, maxz - 1]) == 0:
    maxz -= 1
crop = test[minx:maxx, miny:maxy, minz:maxz]
print(minx, miny, minz, maxx, maxy, maxz)
print(rand.shape)
print(crop.shape)
This prints:
10 20 30 76 45 64
(66, 25, 34)
(66, 25, 34)
This is correct, but it takes too long and is probably suboptimal. I'm looking for better ways to achieve the same thing.
The subvolume wouldn't necessarily be a cuboid; it could be any shape.
I want to keep gaps within the subvolume, only remove what's "outside" the shape to be cropped.

Oops, I hadn't seen the comment about keeping the so-called "gaps" between elements! This should be the one, finally.
def get_nonzero_sub(arr):
    arr_slices = tuple(np.s_[curr_arr.min():curr_arr.max() + 1] for curr_arr in arr.nonzero())
    return arr[arr_slices]
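As a quick check, here is a minimal sketch that applies the function above to a volume like the one in the question. The single zeroed voxel is an assumption added purely to show that gaps inside the bounding box are kept.

```python
import numpy as np

def get_nonzero_sub(arr):
    # One slice per axis, spanning the first to the last non-zero index.
    arr_slices = tuple(np.s_[idx.min():idx.max() + 1] for idx in arr.nonzero())
    return arr[arr_slices]

test = np.zeros((100, 100, 100))
test[10:76, 20:45, 30:64] = 1.0   # filled 66x25x34 block
test[40, 30, 40] = 0.0            # illustrative "gap" inside the block

crop = get_nonzero_sub(test)
print(crop.shape)         # (66, 25, 34)
print(crop[30, 10, 10])   # 0.0, the gap survives the crop
```

Only the outer padding is removed; anything inside the bounding box, including zeros, is left untouched.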

While you wait for a sensible response (I would guess this is a builtin function in an image processing library somewhere), here's a way
y, x = np.where(np.any(test, 0))
z, _ = np.where(np.any(test, 1))
test[min(z):max(z)+1, min(y):max(y)+1, min(x):max(x)+1]
I think leaving tf out of this should up your performance.
Explanation (based on 2D array)
test = np.array([
    [0, 0, 0, 0, 0, ],
    [0, 0, 1, 2, 0, ],
    [0, 0, 3, 0, 0, ],
    [0, 0, 0, 0, 0, ],
    [0, 0, 0, 0, 0, ],
])
We want to crop it to get
[[1, 2]
[3, 0]]
np.any(..., 1) reduces along axis 1, i.e. it returns True for each row in which any of the elements are truthy. I show the result of this in the comments here:
[0, 0, 0, 0, 0, ], # False
[0, 0, 1, 2, 0, ], # True
[0, 0, 3, 0, 0, ], # True
[0, 0, 0, 0, 0, ], # False
[0, 0, 0, 0, 0, ], # False
i.e. it returns np.array([False, True, True, False, False])
np.any(..., 0) does the same as the previous step but reduces along axis 0 instead of axis 1, i.e. it gives one result per column:
[0, 0, 0, 0, 0, ],
[0, 0, 1, 2, 0, ],
[0, 0, 3, 0, 0, ],
[0, 0, 0, 0, 0, ],
[0, 0, 0, 0, 0, ],
# False False True True False
Note that in the case of a 3D array, these steps return 2D arrays
(x,) = np.where(...) this returns the indices of the truthy values in an array. So np.where([False, True, True, False, False]) returns (array([1, 2]),). Note that this is a tuple, so in the 2D case we need to write (x,) = ... so that x is just the array array([1, 2]). The syntax is nicer in the 3D case, as we can use plain tuple-unpacking, i.e. x, y = ...
Note that in the 3D case, np.where can give us the values for 2 axes at a time. I chose to do x-y in one go and then z-? in the second go. The ? is either x or y; I can't be bothered to work out which, and since we don't need it I throw it away in a variable named _, which by convention is a reasonable place to store junk output you don't actually want. Note that I need to write z, _ = because I want the tuple-unpacking and not just z =, otherwise z becomes the tuple with both arrays.
Well, this step is pretty much the same as what you did at the end of your question, so I assume you understand it. Simple slicing in each dimension from the first element with a value in that dimension to the last. You need the + 1 because slices in Python do not include the index after the :.
Hopefully that's clear?
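For reference, the 2D walk-through can be run end to end. This is only a sketch of the steps described above; the names test2d, rows_hit, and cols_hit are illustrative and not part of the original answer.

```python
import numpy as np

test2d = np.array([
    [0, 0, 0, 0, 0],
    [0, 0, 1, 2, 0],
    [0, 0, 3, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
])

rows_hit = np.any(test2d, axis=1)  # one flag per row
cols_hit = np.any(test2d, axis=0)  # one flag per column

# np.where on a 1D array returns a 1-tuple, hence the (r,) unpacking.
(r,) = np.where(rows_hit)
(c,) = np.where(cols_hit)

crop = test2d[r.min():r.max() + 1, c.min():c.max() + 1]
print(rows_hit)  # [False  True  True False False]
print(cols_hit)  # [False False  True  True False]
print(crop)      # [[1 2]
                 #  [3 0]]
```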


How to vmap over specific function in jax?

I have this function which works for single vector:
def vec_to_board(vector, player, dim, reverse=False):
    player_board = np.zeros(dim * dim)
    player_pos = np.argwhere(vector == player)
    if not reverse:
        player_board[mapping[player_pos.T]] = 1
    else:
        player_board[reverse_mapping[player_pos.T]] = 1
    return np.reshape(player_board, [dim, dim])
However, I want it to work for a batch of vectors.
What I have tried so far:
states = jnp.array([[1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2], [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2]])
batch_size = 1
b_states = vmap(vec_to_board)((states, 1, 4), batch_size)
This doesn't work. However, if I understand correctly vmap should be able to handle this transformation for batches?
There are a couple issues you'll run into when trying to vmap this function:
This function is defined in terms of numpy arrays, not jax arrays. How do I know? JAX arrays are immutable, so things like arr[idx] = 1 will raise errors. You need to replace these with equivalent JAX operations (see JAX Sharp Bits: in-place updates) and ensure your function works with JAX array operations rather than numpy array operations.
Your function makes use of dynamically-shaped arrays; e.g. player_pos has a shape that depends on the number of nonzero entries in vector == player. You'll have to rewrite your function in terms of statically-shaped arrays. There is some discussion of this in the jnp.argwhere docstring; for example, if you know a priori how many True entries you expect in the array, you can specify the size to make this work.
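One possible way to address both points is sketched below. It is not the only option: it assumes mapping is a permutation of the board cells (the question does not show it, so a hypothetical stand-in is used), replaces argwhere with a fixed-shape boolean mask, and leaves out the reverse branch for brevity.

```python
from functools import partial

import jax.numpy as jnp
from jax import vmap

# Hypothetical stand-in for the question's `mapping` array (not shown there):
# here it simply reverses the order of the 16 board cells.
mapping = jnp.arange(16)[::-1]

def vec_to_board(vector, player, dim):
    # Fixed-shape mask instead of argwhere: 1.0 where the player sits, else 0.0.
    mask = (vector == player).astype(jnp.float32)
    # Functional update instead of in-place assignment: cell mapping[i] gets mask[i].
    # Assumes `mapping` is a permutation of range(dim * dim).
    player_board = jnp.zeros(dim * dim).at[mapping].set(mask)
    return player_board.reshape(dim, dim)

states = jnp.array([[1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2],
                    [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2]])

# Bind the non-batched arguments, then map over the leading (batch) axis.
b_states = vmap(partial(vec_to_board, player=1, dim=4))(states)
print(b_states.shape)  # (2, 4, 4)
```

Binding player and dim with partial keeps them as ordinary Python values, so dim can still be used to build shapes inside the function.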
Good luck!

How to solve partial Knight's Tour with special constraints

Knight's tour problem described in the image here, with diagram.
A knight was initially located in a square labeled 1. It then proceeded to make a
series of moves, never re-visiting a square, and labeled the visited squares in
order. When the knight was finished, the labeled squares in each region of connected
squares had the same sum.
A short while later, many of the labels were erased. The remaining labels can be seen
Complete the grid by re-entering the missing labels. The answer to this puzzle is
the sum of the squares of the largest label in each row of the completed grid, as in
the example.
[1]: E.g. the 14 and 33 are in different regions.
The picture explains it a lot more clearly, but in summary a Knight has gone around a 10 x 10 grid. The picture shows a 10 x 10 board with some of the positions it has been in, and at what point of its journey. You do not know which position the Knight started in, or how many movements it made.
The coloured groups on the board need to all sum to the same amount.
I’ve built a Python solver that uses recursion, but it runs for ages. I’ve noted that the maximum sum of a group is 197, based on there being 100 squares and the smallest group being 2 adjacent squares.
My code at this link:
import sys, numpy as np
fixedLocationsArray = [[ 12, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 0, 0, 5, 0, 23, 0],
[ 0, 0, 0, 0, 0, 0, 8, 0, 0, 0],
[ 0, 0, 0, 14, 0, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 2, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 20, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 33, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 28]]
groupsArray = [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0, 0,10, 0],
[0, 0, 0, 1, 0, 0, 0, 0,10, 0],
[0, 0, 1, 1, 1, 1, 9,10,10,10],
[2, 0, 1, 0, 0,11, 9, 9, 9, 9],
[2, 0, 0, 0,11,11,11,15,15, 9],
[2, 4, 4,14,11,12,12,15,15, 8],
[2, 3, 4,14,14,13,13,13,15, 8],
[2, 3, 5,14,16,16,16, 7, 7, 8],
[3, 3, 5, 6, 6, 6, 6, 6, 7, 8]]
# Note: the maximum sum of a group is 197, since the group of only 2 can have the 100 filled and then 97 on return
class KnightsTour:
    def __init__(self, width, height, fixedLocations, groupsArray):
        self.w = width
        self.h = height
        self.fixedLocationsArray = fixedLocations
        self.groupsArray = groupsArray
        self.npfixedLocationsArray = np.array(fixedLocations)
        self.npgroupsArray = np.array(groupsArray)
        self.board = []  # Contains the solution
        self.generate_board()
    def generate_board(self):
        """
        Creates a nested list to represent the game board
        """
        for i in range(self.h):
            self.board.append([0] * self.w)
    def print_board(self):  # Prints out the final board solution
        print(" ")
        for elem in self.board:
            print(elem)
        print(" ")
    def generate_legal_moves(self, cur_pos, n):
        """
        Generates a list of legal moves for the knight to take next
        """
        possible_pos = []
        move_offsets = [(1, 2), (1, -2), (-1, 2), (-1, -2),
                        (2, 1), (2, -1), (-2, 1), (-2, -1)]
        locationOfNumberInFixed = [(ix,iy) for ix, row in enumerate(self.fixedLocationsArray) for iy, i in enumerate(row) if i == n+1]
        groupsizeIsNotExcessive = self.groupsNotExcessiveSize(self.board, self.groupsArray)
        for move in move_offsets:
            new_x = cur_pos[0] + move[0]
            new_y = cur_pos[1] + move[1]
            new_pos = (new_x, new_y)
            if groupsizeIsNotExcessive:
                if locationOfNumberInFixed:
                    print(f"This number {n+1} exists in the fixed grid at {locationOfNumberInFixed[0]}")
                    if locationOfNumberInFixed[0] == new_pos:
                        print(f"Next position is {new_pos} and matches location in fixed")
                        possible_pos.append((new_x, new_y))
                elif not locationOfNumberInFixed:  # if the next number is not in the fixed table, then evaluate if it is a legal move
                    if (new_x >= self.h):  # if it is beyond the height of the board, skip it; don't append it to the list of possible moves
                        continue
                    elif (new_x < 0):
                        continue
                    elif (new_y >= self.w):
                        continue
                    elif (new_y < 0):
                        continue
                    else:
                        possible_pos.append((new_x, new_y))
        print(f"The legal moves for index {n} are {possible_pos}")
        print(f"The current board looks like:")
        self.print_board()
        return possible_pos
    def sort_lonely_neighbors(self, to_visit, n):
        """
        It is more efficient to visit the lonely neighbors first,
        since these are at the edges of the chessboard and cannot
        be reached easily if done later in the traversal
        """
        neighbor_list = self.generate_legal_moves(to_visit, n)
        empty_neighbours = []
        for neighbor in neighbor_list:
            np_value = self.board[neighbor[0]][neighbor[1]]
            if np_value == 0:
                empty_neighbours.append(neighbor)
        scores = []
        for empty in empty_neighbours:
            score = [empty, 0]
            moves = self.generate_legal_moves(empty, n)
            for m in moves:
                if self.board[m[0]][m[1]] == 0:
                    score[1] += 1
            scores.append(score)
        scores_sort = sorted(scores, key = lambda s: s[1])
        sorted_neighbours = [s[0] for s in scores_sort]
        return sorted_neighbours
    def groupby_perID_and_sum(self, board, groups):
        # Convert into numpy arrays
        npboard = np.array(board)
        npgroups = np.array(groups)
        # Get argsort indices, to be used to sort a and b in the next steps
        board_flattened = npboard.ravel()
        groups_flattened = npgroups.ravel()
        sidx = groups_flattened.argsort(kind='mergesort')
        board_sorted = board_flattened[sidx]
        groups_sorted = groups_flattened[sidx]
        # Get the group limit indices (start, stop of groups)
        cut_idx = np.flatnonzero(np.r_[True,groups_sorted[1:] != groups_sorted[:-1],True])
        # Create cut indices for all unique IDs in b
        n = groups_sorted[-1]+2
        cut_idxe = np.full(n, cut_idx[-1], dtype=int)
        insert_idx = groups_sorted[cut_idx[:-1]]
        cut_idxe[insert_idx] = cut_idx[:-1]
        cut_idxe = np.minimum.accumulate(cut_idxe[::-1])[::-1]
        # Split input array with those start, stop ones
        arrayGroups = [board_sorted[i:j] for i,j in zip(cut_idxe[:-1],cut_idxe[1:])]
        arraySum = [np.sum(a) for a in arrayGroups]
        sumsInListSame = arraySum.count(arraySum[0]) == len(arraySum)
        return sumsInListSame
    def groupsNotExcessiveSize(self, board, groups):
        # Convert into numpy arrays
        npboard = np.array(board)
        npgroups = np.array(groups)
        # Get argsort indices, to be used to sort a and b in the next steps
        board_flattened = npboard.ravel()
        groups_flattened = npgroups.ravel()
        sidx = groups_flattened.argsort(kind='mergesort')
        board_sorted = board_flattened[sidx]
        groups_sorted = groups_flattened[sidx]
        # Get the group limit indices (start, stop of groups)
        cut_idx = np.flatnonzero(np.r_[True,groups_sorted[1:] != groups_sorted[:-1],True])
        # Create cut indices for all unique IDs in b
        n = groups_sorted[-1]+2
        cut_idxe = np.full(n, cut_idx[-1], dtype=int)
        insert_idx = groups_sorted[cut_idx[:-1]]
        cut_idxe[insert_idx] = cut_idx[:-1]
        cut_idxe = np.minimum.accumulate(cut_idxe[::-1])[::-1]
        # Split input array with those start, stop ones
        arrayGroups = [board_sorted[i:j] for i,j in zip(cut_idxe[:-1],cut_idxe[1:])]
        arraySum = [np.sum(a) for a in arrayGroups]
        # Check that no group sum is too large
        groupSizeNotExcessive = all(group_sum <= 197 for group_sum in arraySum)
        return groupSizeNotExcessive
    def tour(self, n, path, to_visit):
        """
        Recursive definition of knights tour. Inputs are as follows:
        n = current depth of search tree
        path = current path taken
        to_visit = node to visit, i.e. the coordinate
        """
        self.board[to_visit[0]][to_visit[1]] = n  # This writes the number on the grid
        path.append(to_visit)  # append the newest vertex to the current path
        print(f"Added {n}")
        print(f"For {n+1} visiting: ", to_visit)
        if self.groupby_perID_and_sum(self.board, self.npgroupsArray):  # if all areas have the same sum
            print("Done! All areas sum equal")
            self.print_board()
            sys.exit(0)
        sorted_neighbours = self.sort_lonely_neighbors(to_visit, n)
        for neighbor in sorted_neighbours:
            self.tour(n+1, path, neighbor)
        # If we exit this loop, all neighbours failed so we reset
        self.board[to_visit[0]][to_visit[1]] = 0
        try:
            path.pop()
            print("Going back to: ", path[-1])
        except IndexError:
            print("No path found")
if __name__ == '__main__':
    # Define the size of the grid. We are currently solving for a 10x10 grid
    kt0 = KnightsTour(10, 10, fixedLocationsArray, groupsArray)
    kt0.tour(1, [], (3, 0))
    # kt0.tour(1, [], (7, 0))
    # kt0.tour(1, [], (7,2))
    # kt0.tour(1, [], (6,3))
    # kt0.tour(1, [], (4,3))
    # kt0.tour(1, [], (3,2))
    # startingPositions = [(3, 0), (7, 0), (7,2), (6,3), (4,3), (3,2)]
Here are some observations you could use to stop earlier in your backtracking.
First of all, remember that after n steps the total sum over all areas is n(n+1)/2. This total has to divide evenly among the groups, i.e. it has to be divisible by 17, which is the number of groups.
Furthermore, if we look at the 12 we can conclude that the 11 and 13 must have been in the same area (the 12 sits in a corner, and both squares a knight can reach from there lie in its area), so we get a lower bound on the sum for each area of 2+5+8+11+12+13=51.
And lastly, there are groups of size two, so the sum of the two largest step numbers, n + (n-1), must be at least the common sum of each group.
Using those conditions we can calculate the remaining possible numbers of steps with
# the total sum is divisible by 17:
# n*(n+1)/2 % 17 == 0
# the sum for each group is at least the predictable sum for
# the biggest given group 2+5+8+11+12+13=51:
# n*(n+1)/(2*17) >= 51
# since there are groups of two elements the sum of the biggest
# two numbers must be at least as big as the sum for each group
# n+n-1 >= n*(n+1)/(2*17)
[n for n in range(101) if n*(n+1)/2 % 17 == 0 and n*(n+1)/(2*17) >= 51 and n+n-1 >= n*(n+1)/(2*17)]
giving us
[50, 51]
So the knight must have taken either 50 or 51 steps and the sum for each area must be either 75 or 78.
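The same filter can be written with integer arithmetic so that each candidate is paired with its per-area sum. This is just a restatement of the three conditions above; the names NUM_GROUPS and LOWER_BOUND are illustrative.

```python
NUM_GROUPS = 17                           # number of areas on the board
LOWER_BOUND = 2 + 5 + 8 + 11 + 12 + 13    # known minimum sum of the area containing the 12

candidates = []
for n in range(1, 101):
    total = n * (n + 1) // 2              # sum of the labels 1..n
    if total % NUM_GROUPS != 0:
        continue                          # total must split evenly over the areas
    group_sum = total // NUM_GROUPS
    # Each area must reach the lower bound, and a two-square area
    # can hold at most n + (n - 1).
    if group_sum >= LOWER_BOUND and n + (n - 1) >= group_sum:
        candidates.append((n, group_sum))

print(candidates)  # [(50, 75), (51, 78)]
```

Inside the solver, these two (n, group_sum) pairs could serve as pruning targets: any partial board in which an area already exceeds 78 can be abandoned.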

Simple Mutation with a Probability

As per the section of code below, I am trying to implement an automatic and random mutation process.
data = [0,1,0,0,0,0,0,1,0,0,1,1,0,0,1]
The code is an adaptation of some other posts I have found, although it is randomly mutating a value every time with either a 0 or 1. I require this to occur with only a certain probability (such as a 0.05 chance of mutation) rather than always being guaranteed.
Additionally, often a 0 is being replaced with a 0 and therefore there is no change to the output, so I would like to limit it in a way that a 0 will only mutate to a 1 and a 1 mutates to a 0.
I would really appreciate the assistance in resolving these two issues.
mutate any value with a chosen probability
randomly choose the position
when the position is chosen, switch between 0 and 1
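import random
def mutate(data, proba=0.05):
    # with probability proba, flip one randomly chosen bit in place
    if random.random() < proba:
        data[random.randrange(len(data))] ^= 1
if __name__ == '__main__':
    data = [0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1]
    for i in range(10):
        mutate(data)
        print(data)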
import random
def changeData(data):
    seed = random.randint(1, 1000)  # 1000 equally likely values
    # probability of 0.05 (50 / 1000)
    if seed <= 50:
        indexToChange = random.randint(0, len(data) - 1)
        # change 0 with 1 and vice versa
        data[indexToChange] = 1 if data[indexToChange] == 0 else 0
if __name__ == '__main__':
    data = [0,1,0,0,0,0,0,1,0,0,1,1,0,0,1]
    for i in range(0, 100):
        changeData(data)
        print(data)
You can do the following:
For each element in data, flip it (1 - val) only if a random value generated by the random() function is less than the defined mutation probability.
For example:
import random
mutation_prob = 0.05
data = [0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1]
mutated_data = [1 - x if random.random() < mutation_prob else x for x in data]
If the mutation should be decided regarding the data as a whole, you can do:
mutation_prob = 0.05
data = [0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1]
do_mutation = random.random() < mutation_prob
mutated_data = [1 - x if do_mutation else x for x in data]
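To make the per-element variant easier to check, here is a small sketch that wraps it in a function and reports which positions flipped. The function name mutate_bits and the rng parameter are assumptions for illustration; passing a seeded random.Random makes a run repeatable.

```python
import random

def mutate_bits(data, prob, rng=random):
    # Flip each bit independently with probability `prob`; returns a new list.
    return [1 - bit if rng.random() < prob else bit for bit in data]

data = [0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1]

rng = random.Random(0)  # seeded generator so the run can be repeated
mutated = mutate_bits(data, 0.05, rng)
flipped = [i for i, (old, new) in enumerate(zip(data, mutated)) if old != new]
print(mutated)
print("flipped positions:", flipped)

# Sanity checks at the extremes: prob=1.0 flips everything, prob=0.0 flips nothing.
all_flipped = mutate_bits(data, 1.0)
none_flipped = mutate_bits(data, 0.0)
```

With a probability of 0.05 and 15 elements, most runs flip zero or one bit, so an unchanged output is expected fairly often.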

How to plot eigenvalues representing symbolic functions in Python?

I need to calculate the eigenvalues of an 8x8 matrix and plot each of the eigenvalues as a function of a symbolic variable occurring in the matrix. For the matrix I'm using, I get 8 different eigenvalues, each of which is a function of "W", my symbolic variable.
Using python I tried calculating the eigenvalues with Scipy and Sympy which worked kind of, but the results are stored in a weird way (at least for me as a newbie not understanding much of programming so far) and I didn't find a way to extract just one eigenvalue in order to plot it.
import numpy as np
import sympy as sp
W = sp.Symbol('W')
# This is my 8x8-matrix
A= sp.Matrix([[w0+3*wl, 2*W, 0, 0, 0, np.sqrt(3)*W, 0, 0],
[2*W, 4*wl, 0, 0, 0, 0, 0, 0],
[0, 0, 2*wl+w0, np.sqrt(3)*W, 0, 0, 0, np.sqrt(2)*W],
[0, 0, np.sqrt(3)*W, 3*wl, 0, 0, 0, 0],
[0, 0, 0, 0, wl+w0, np.sqrt(2)*W, 0, 0],
[np.sqrt(3)*W, 0, 0, 0, np.sqrt(2)*W, 2*wl, 0, 0],
[0, 0, 0, 0, 0, 0, w0, W],
[0, 0, np.sqrt(2)*W, 0, 0, 0, W, wl]])
# Calculating eigenvalues
eva = A.eigenvals()
evaRR = np.array(list(eva.keys()))
eva1p = evaRR[0] # <- this is my try to refer to the first eigenvalue
In the end I hope to get a plot over "W" where the interesting range is [-0.002 0.002]. For the ones interested it's about atomic physics and W refers to the rabi frequency and I'm looking at so called dressed states.
You're not doing anything incorrectly -- I think you're just caught up since your eigenvalues look so jumbled and complicated.
import numpy as np
import sympy as sp
import matplotlib.pyplot as plt
W = sp.Symbol('W')
# This is my 8x8-matrix
A= sp.Matrix([[w0+3*wl, 2*W, 0, 0, 0, np.sqrt(3)*W, 0, 0],
[2*W, 4*wl, 0, 0, 0, 0, 0, 0],
[0, 0, 2*wl+w0, np.sqrt(3)*W, 0, 0, 0, np.sqrt(2)*W],
[0, 0, np.sqrt(3)*W, 3*wl, 0, 0, 0, 0],
[0, 0, 0, 0, wl+w0, np.sqrt(2)*W, 0, 0],
[np.sqrt(3)*W, 0, 0, 0, np.sqrt(2)*W, 2*wl, 0, 0],
[0, 0, 0, 0, 0, 0, w0, W],
[0, 0, np.sqrt(2)*W, 0, 0, 0, W, wl]])
# Calculating eigenvalues
eva = A.eigenvals()
evaRR = np.array(list(eva.keys()))
# The above is copied from your question
# We have to answer what exactly the eigenvalue is in this case
print(type(evaRR[0])) # >>> Piecewise
# Okay, so it's a piecewise function (link to documentation below).
# In the documentation we see that we can use the .subs method to evaluate
# the piecewise function by substituting a symbol for a value. For instance,
print(evaRR[0].subs(W, 0)) # Will substitute 0 for W
# This prints out something really nasty with tons of fractions..
# We can evaluate this mess with sympy's numerical evaluation method, N
print(sp.N(evaRR[0].subs(W, 0)))
# >>> 0.00222190090611143 - 6.49672880062804e-34*I
# That's looking more like it! Notice the e-34 exponent on the imaginary part...
# I think it's safe to assume we can just trim that off.
# This is done by setting the chop keyword to True when using N:
print(sp.N(evaRR[0].subs(W, 0), chop=True)) # >>> 0.00222190090611143
# Now let's try to plot each of the eigenvalues over your specified range
fig, ax = plt.subplots(3, 3) # 3x3 grid of plots (for our 8 e.vals)
ax = ax.flatten() # This is so we can index the axes easier
plot_range = np.linspace(-0.002, 0.002, 10) # Range from -0.002 to 0.002 with 10 steps
for n in range(8):
    current_eigenval = evaRR[n]
    # There may be a way to vectorize this computation, but I'm not familiar enough with sympy.
    evaluated_array = np.zeros(np.size(plot_range))
    # This will be our Y-axis (the eigenvalue at each W-value). It is set to be the same shape as
    # plot_range and is initially filled with all zeros.
    for i in range(np.size(plot_range)):
        evaluated_array[i] = sp.N(current_eigenval.subs(W, plot_range[i]),
                                  chop=True)
        # The above line is evaluating your eigenvalue at a specific point,
        # approximating it numerically, and then chopping off the imaginary.
    ax[n].plot(plot_range, evaluated_array, "c-")
    ax[n].set_title("Eigenvalue #{}".format(n))
plt.show()
And as promised, the Piecewise documentation.
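If only the plot is needed, a purely numerical route may be simpler than evaluating the symbolic expressions. This is a different technique from the answer above: the sketch below builds the matrix for each W and calls np.linalg.eigvalsh, which applies because the matrix is real and symmetric. The values of w0 and wl are hypothetical placeholders, since the question does not show them.

```python
import numpy as np

# Hypothetical values: the question does not show w0 and wl.
w0, wl = 1.0, 0.9

def build_matrix(W):
    # Numeric version of the 8x8 matrix from the question for a given W.
    s2, s3 = np.sqrt(2), np.sqrt(3)
    return np.array([
        [w0 + 3*wl, 2*W,  0,         0,     0,       s3*W, 0,  0],
        [2*W,       4*wl, 0,         0,     0,       0,    0,  0],
        [0,         0,    2*wl + w0, s3*W,  0,       0,    0,  s2*W],
        [0,         0,    s3*W,      3*wl,  0,       0,    0,  0],
        [0,         0,    0,         0,     wl + w0, s2*W, 0,  0],
        [s3*W,      0,    0,         0,     s2*W,    2*wl, 0,  0],
        [0,         0,    0,         0,     0,       0,    w0, W],
        [0,         0,    s2*W,      0,     0,       0,    W,  wl],
    ])

W_values = np.linspace(-0.002, 0.002, 201)
# eigvalsh returns the real eigenvalues of a symmetric matrix in ascending order.
eigs = np.array([np.linalg.eigvalsh(build_matrix(W)) for W in W_values])
print(eigs.shape)  # (201, 8)
```

Column k of eigs can then be plotted against W_values. Because the eigenvalues are sorted at every W, the columns follow the k-th smallest value rather than a fixed analytic branch, which matters if levels cross.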

Is there a way to replicate scikit.measure.label for edge detection in Numpy?

I know this is a frivolous question, but is there a way to replicate that functionality without using the scikit library?
This question is essentially what I'm trying to do. The furthest I've gotten is the idea of tracking neighbors via a meta array, but I'm still struggling with how to split that into a separate array for each set of neighbors.
Edit: Sorry haven't asked many questions here, realized this is difficult without actual examples.
E.g. Given the numpy array:
[ 0, 0, 1, 1, 0, 0 ]
[ 0, 0, 1, 0, 0, 0 ]
[ 0, 0, 1, 1, 0, 0 ]
[ 0, 0, 0, 0, 0, 0 ]
[ 0, 1, 1, 1, 0, 0 ]
[ 0, 1, 1, 0, 0, 0 ]
I'm trying to get to an output that's two different arrays with the index location of the touching values (excluding 0's). So desired result is:
output1 = [[0,2], [0,3], [1,2], [2,2], [2,3]]
output2 = [[4,1], [4,2], [4,3], [5,1], [5,2]]
Edit #2: I've made some progress. I am now able to create a list of lists. Here is the code:
data = np.asarray(image_x)
vals = np.argwhere(data == 0)
list = []
listoflists = []
row = -1
for x in range(0, vals.shape[0]):
    for y in range(0, vals.shape[1]-1):
        if row == -1:
            row = vals[x][0]
        if row == vals[x][0]:
            list.append(vals[x][1])
        else:
            listoflists.append((vals[x-1][0], list))
            row = vals[x][0]
            list = []
image_x is a black and white image of shape outlines
The loop currently doesn't capture a pixel if it is a single pixel on a single row at the very end. I'm troubleshooting this now, just wanted to post the update in my excitement.
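One way to get the grouped index lists without scikit-image is a breadth-first flood fill over the non-zero pixels. The sketch below is not a drop-in replacement for the library function: the name connected_components and the connectivity parameter are illustrative, and it returns coordinate lists instead of a label image.

```python
from collections import deque

import numpy as np

def connected_components(grid, connectivity=8):
    """Group the non-zero cells of a 2D array into touching sets of [row, col] pairs."""
    grid = np.asarray(grid)
    rows, cols = grid.shape
    if connectivity == 4:
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    else:  # 8-connectivity also counts diagonal neighbours
        offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
    seen = np.zeros(grid.shape, dtype=bool)
    components = []
    for r0, c0 in np.argwhere(grid != 0):
        if seen[r0, c0]:
            continue
        # Start a new component and flood-fill outward from this pixel.
        seen[r0, c0] = True
        queue = deque([(int(r0), int(c0))])
        component = []
        while queue:
            r, c = queue.popleft()
            component.append([r, c])
            for dr, dc in offsets:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr, nc] != 0 and not seen[nr, nc]:
                    seen[nr, nc] = True
                    queue.append((nr, nc))
        components.append(sorted(component))
    return components

image = np.array([
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 0, 0, 0],
])

output1, output2 = connected_components(image)
print(output1)  # [[0, 2], [0, 3], [1, 2], [2, 2], [2, 3]]
print(output2)  # [[4, 1], [4, 2], [4, 3], [5, 1], [5, 2]]
```

Because every non-zero pixel is visited exactly once, a lone pixel on the last row simply becomes its own single-element component.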
