How to create binary matrix given indices in tensorflow - python

Suppose I have a tf tensor with indices for two samples:
x = [[2,3,5], [5,7,5]]
I would like to create a tensor with a certain shape (samples, 10), where the indices of each sample in x are set to 1 and the rest to 0 like this:
output = [[0, 0, 1, 1, 0, 1, 0, 0, 0, 0],
          [0, 0, 0, 0, 0, 1, 0, 1, 0, 0]]
What is the best way to do this, without creating a lot of intermediary matrices?
The closest I got was using tf.scatter_nd, but I couldn't figure out how to transform x and the updates correctly, except by manually adding the sample indices like this:
>>> tf.cast(tf.scatter_nd([[0,2], [0,3], [0,5], [1,5], [1,7], [1,5]],
...                       [1, 1, 1, 1, 1, 1], [2, 10]) > 0, dtype="int64")
<tf.Tensor: id=1191, shape=(2, 10), dtype=int64, numpy=
array([[0, 0, 1, 1, 0, 1, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 1, 0, 1, 0, 0]])>
Also, this approach will aggregate duplicate indices at first, which makes an intermediary boolean matrix necessary. (This I could live with though, the main problem is getting from x to a matrix with shape (samples, 10) where non-existent indices are 0 for each sample.)
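(To make the aggregation concrete, a small demo of mine: tf.scatter_nd([[1,5], [1,5]], [1, 1], [2, 10]) yields a matrix with 2 at position (1, 5) rather than 1, because scatter_nd sums the updates for duplicate indices; hence the > 0 cast above.)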
Thanks for any help! :)

I found a solution (tensorflow 2.2.0):
class BinarizeSequence(tf.keras.layers.Layer):
    """
    Transforms an integer sequence into a binary representation
    with shape (samples, vocab_size).

    Example:
        In:  [[2,3,5], [5,7,5]]
        Out: [[0, 0, 1, 1, 0, 1, 0, 0, 0, 0],
              [0, 0, 0, 0, 0, 1, 0, 1, 0, 0]]

    By default the output is returned as SparseTensor.
    Use dense_output=True if you need a dense representation.
    """

    def __init__(self, vocab_size, dense_output=False, **kwargs):
        super(BinarizeSequence, self).__init__(**kwargs)
        self.vocab_size = vocab_size
        self.dense_output = dense_output

    def get_config(self):
        config = super().get_config().copy()
        config.update(
            {"vocab_size": self.vocab_size, "dense_output": self.dense_output}
        )
        return config

    def call(self, x, mask=None):
        # create (sample, index) pairs for the binarized representation
        x = tf.cast(x, dtype=tf.int32)
        x_1d = tf.reshape(x, [-1])
        sample_dim = tf.repeat(
            tf.range(tf.shape(x)[0], dtype=tf.int32), tf.shape(x)[1]
        )
        indices = tf.transpose(tf.stack([sample_dim, x_1d]))
        # only keep unique indices
        # (see https://stackoverflow.com/a/42245425/979377)
        indices64 = tf.bitcast(indices, type=tf.int64)
        unique64, idx = tf.unique(indices64)
        unique_indices = tf.bitcast(unique64, type=tf.int32)
        # build binarized representation
        updates = tf.ones(tf.shape(unique_indices)[0])
        output_shape = [tf.shape(x)[0], self.vocab_size]
        if self.dense_output:
            output = tf.scatter_nd(unique_indices, updates, output_shape)
        else:
            output = tf.sparse.SparseTensor(
                tf.cast(unique_indices, tf.int64), updates, output_shape
            )
        return output
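A quick usage check (my own sketch, assuming TF 2.x eager execution):
x = tf.constant([[2, 3, 5], [5, 7, 5]])
layer = BinarizeSequence(vocab_size=10, dense_output=True)
print(layer(x).numpy().astype(int))
# [[0 0 1 1 0 1 0 0 0 0]
#  [0 0 0 0 0 1 0 1 0 0]]
For what it's worth, if you only ever need the dense version, I believe tf.reduce_max(tf.one_hot(x, vocab_size), axis=1) gives the same result without the unique-index bookkeeping, since taking the max over the sequence axis collapses duplicates for free.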

Related

How to pad a specific dimension of a numpy array?

I'm trying to create a class that right-pads a NumPy array to the shape (257, 87). Currently the array has shape (257, 24), so I only need to pad the 2nd dimension. I've tried a few iterations of the class below, but it always pads both dimensions.
class Padder:
    def __init__(self, mode="constant"):
        self.mode = mode

    def right_pad(self, array):
        num_missing_items = 87 - array.shape[1]
        padded_array = np.pad(array,
                              (num_missing_items, 0),
                              mode=self.mode)
        return padded_array
This results in shape (320, 87).
I also tried indexing the input array
class Padder:
    def __init__(self, mode="constant"):
        self.mode = mode

    def right_pad(self, array):
        num_missing_items = 87 - array.shape[1]
        padded_array = np.pad(array[1],
                              (num_missing_items, 0),
                              mode=self.mode)
        return padded_array
But this only returns the padded 2nd dim and nothing in the first dim: shape = (87,). So I tried to create a new array whose first dim is the original array's 1st dim and whose 2nd dim is the padded 2nd dim.
class Padder:
    def __init__(self, mode="constant"):
        self.mode = mode

    def right_pad(self, array):
        num_missing_items = 87 - array.shape[1]
        padded_array = np.array([array[0], np.pad(array[1],
                                                  (num_missing_items, 0),
                                                  mode=self.mode)])
        return padded_array
But this returns an array of shape (2,)
How can I use padding to get my array to shape (257, 87)?
Have a look at the docs for the pad_width parameter of np.pad:
https://numpy.org/doc/stable/reference/generated/numpy.pad.html
You can pass it a sequence of (before, after) tuples. In your case, the (before, after) for the first dimension needs to be (0, 0), and you can choose the padding for the second dimension yourself. Here is an example:
import numpy as np
arr = np.arange(12).reshape(4, 3)
padded = np.pad(arr, ((0, 0), (2, 3)))
padded
array([[ 0,  0,  0,  1,  2,  0,  0,  0],
       [ 0,  0,  3,  4,  5,  0,  0,  0],
       [ 0,  0,  6,  7,  8,  0,  0,  0],
       [ 0,  0,  9, 10, 11,  0,  0,  0]])
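Applied to the Padder class from the question, that would look like this (a sketch; the target width 87 stays hard-coded as in the original, and note that the original attempts padded on the left, while a right pad needs the padding in the second slot of the tuple):
class Padder:
    def __init__(self, mode="constant"):
        self.mode = mode

    def right_pad(self, array):
        num_missing_items = 87 - array.shape[1]
        # (0, 0): leave dim 0 alone; (0, n): pad dim 1 on the right only
        return np.pad(array,
                      ((0, 0), (0, num_missing_items)),
                      mode=self.mode)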

Torch tensor randomly contains huge, impossible values

I'm new to PyTorch and I'm trying to debug my code using IntelliJ PyCharm. I have a line that logs the content of a torch.IntTensor
logger.debug(f"action_tensor = {action_tensor}")
Most of the time this seems to work just fine, but occasionally the print out shows one or several huge values in the tensor, such as:
2021-08-06 09:21:17,737 DEBUG main.py state_tensor = tensor([2089484293, 0, 0, 1, 0, 1,
1, 0, 0, 0, 0, 0,
0, 1, 1, 1, 1, 1,
1, 1, 1, 2, 2, 0,
0, 0, 0, 0, 0, 6,
0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0],
dtype=torch.int32)
The tensor is created by extracting a few values from the state of an object
rolls = int(self.rolls)
allowed = [int(self.scorecard[c]["allowed"] == True) for c in self.scorecard]
scored = [int(self.scorecard[c]["score"]) if self.scorecard[c]["score"] else int(0) for c in self.scorecard]
return torch.cat([torch.IntTensor(rolls),
                  torch.IntTensor(allowed),
                  torch.IntTensor(scored)])
I've checked multiple times, and there is no way any of these values are as large as the example above (e.g. 2089484293). I've tried just creating a numpy array instead of a tensor, and printing that shows no problems. I'm suspecting there is something I don't know about how torch.IntTensor works.
What is wrong with the way I create my tensor that results in these huge values appearing sometimes?
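For what it's worth, a likely cause (an educated guess, since the full class isn't shown): torch.IntTensor(rolls) with a plain Python int is the size constructor, so it allocates an uninitialized tensor of length rolls whose contents are whatever bytes happen to be in memory, which would explain the occasional huge garbage value in the first position. Wrapping the scalar in a list (or using torch.tensor) stores the value instead. A minimal sketch with made-up data:
import torch

rolls = 2
# torch.IntTensor(rolls) would create an *uninitialized* tensor of length 2.
# Pass the value inside a list so it is stored as data, not used as a size:
rolls_t = torch.tensor([rolls], dtype=torch.int32)
allowed_t = torch.tensor([1, 0, 1], dtype=torch.int32)  # hypothetical example data
scored_t = torch.tensor([6, 0, 4], dtype=torch.int32)   # hypothetical example data
state = torch.cat([rolls_t, allowed_t, scored_t])
print(state)  # tensor([2, 1, 0, 1, 6, 0, 4], dtype=torch.int32)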

Crop empty arrays (padding) from a volume

What I want to do is crop a volume to remove all irrelevant data. For example, say I have a 100x100x100 volume filled with zeros, except for a 50x50x50 volume within that is filled with ones.
How do I obtain the cropped 50x50x50 volume from the original ?
Here's the naive method I came up with.
import numpy as np
import tensorflow as tf

test = np.zeros((100, 100, 100))    # create an empty 100x100x100 volume
rand = np.random.rand(66, 25, 34)   # create a 66x25x34 filled volume
test[10:76, 20:45, 30:64] = rand    # partially fill the empty volume

# initialize the cropping coordinates
minx = miny = minz = 0
maxx, maxy, maxz = np.subtract(test.shape, 1)

# compute the optimal cropping coordinates
while tf.reduce_max(test[minx, :, :]) == 0:  # check for empty slices along the x axis
    minx += 1
while tf.reduce_max(test[:, miny, :]) == 0:  # check for empty slices along the y axis
    miny += 1
while tf.reduce_max(test[:, :, minz]) == 0:  # check for empty slices along the z axis
    minz += 1
while tf.reduce_max(test[maxx, :, :]) == 0:
    maxx -= 1
while tf.reduce_max(test[:, maxy, :]) == 0:
    maxy -= 1
while tf.reduce_max(test[:, :, maxz]) == 0:
    maxz -= 1
maxx, maxy, maxz = np.add((maxx, maxy, maxz), 1)

crop = test[minx:maxx, miny:maxy, minz:maxz]
print(minx, miny, minz, maxx, maxy, maxz)
print(rand.shape)
print(crop.shape)
This prints:
10 20 30 76 45 64
(66, 25, 34)
(66, 25, 34)
, which is correct. However, it takes too long and is probably suboptimal. I'm looking for better ways to achieve the same thing.
NB:
The subvolume wouldn't necessarily be a cuboid, it could be any shape.
I want to keep gaps within the subvolume, only remove what's "outside" the shape to be cropped.
(Edit)
Oops, I hadn't seen the comment about keeping the so-called "gaps" between elements! This should be the one, finally.
def get_nonzero_sub(arr):
    arr_slices = tuple(np.s_[curr_arr.min():curr_arr.max() + 1] for curr_arr in arr.nonzero())
    return arr[arr_slices]
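Applied to the test volume from above (a usage sketch; np.random.rand makes exact zeros inside the filled block practically impossible):
crop = get_nonzero_sub(test)
print(crop.shape)  # (66, 25, 34), matching rand.shape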
While you wait for a sensible response (I would guess this is a builtin function in an image processing library somewhere), here's a way
y, x = np.where(np.any(test, 0))
z, _ = np.where(np.any(test, 1))
test[min(z):max(z)+1, min(y):max(y)+1, min(x):max(x)+1]
I think leaving tf out of this should up your performance.
Explanation (based on a 2D array)
test = np.array([
    [0, 0, 0, 0, 0],
    [0, 0, 1, 2, 0],
    [0, 0, 3, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
])
We want to crop it to get
[[1, 2],
 [3, 0]]
np.any(..., 0) will reduce along axis 0 and return True for each column in which any of the elements are truthy. I show the result of this in the comments here:
np.array([
    [0, 0, 0, 0, 0],
    [0, 0, 1, 2, 0],
    [0, 0, 3, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    # False False True True False
])
i.e. it returns np.array([False, False, True, True, False])
np.any(..., 1) does the same as the previous step but along axis 1 instead, giving one value per row:
np.array([
    [0, 0, 0, 0, 0],  # False
    [0, 0, 1, 2, 0],  # True
    [0, 0, 3, 0, 0],  # True
    [0, 0, 0, 0, 0],  # False
    [0, 0, 0, 0, 0],  # False
])
Note that in the case of a 3D array, these steps return 2D arrays
(x,) = np.where(...) returns the indices of the truthy values in an array. So np.where([False, True, True, False, False]) returns (array([1, 2]),). Note that this is a tuple, so for a 1D input we need to unpack it with (x,) = ... to get just the array array([1, 2]). The syntax is nicer in the 2D case, where we can use plain tuple unpacking, i.e. x, y = ....
Note that in the 3D case, np.where gives us the indices for two axes at a time. I chose to do x-y in one go and then z-? in the second go. The ? is either x or y; I can't be bothered to work out which, and since we don't need it, I throw it away into a variable named _, which by convention is a reasonable place to store junk output you don't actually want. Note that I need z, _ = rather than just z = because I want the tuple unpacking; otherwise z becomes the tuple holding both arrays.
Well, the final step is pretty much the same as what you did at the end of your snippet, so I assume you understand it: simple slicing in each dimension, from the first index with a value in that dimension to the last. You need the + 1 because Python slices exclude the index after the :.
Hopefully that's clear?
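Putting the 2D walkthrough together as a runnable recap (my own snippet, not part of the original answer):
import numpy as np

test = np.array([
    [0, 0, 0, 0, 0],
    [0, 0, 1, 2, 0],
    [0, 0, 3, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
])
(rows,) = np.where(np.any(test, 1))  # rows containing any nonzero value
(cols,) = np.where(np.any(test, 0))  # columns containing any nonzero value
crop = test[rows.min():rows.max() + 1, cols.min():cols.max() + 1]
print(crop)
# [[1 2]
#  [3 0]]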

Tensorflow: How to retrieve information from the prediction Tensor?

I have found a neural network for semantic segmentation purposes. The network works just fine: I feed it my training, validation, and test data and I get the output (segmented parts in different colors). Until here, all is OK. I am using Keras with TensorFlow 1.7.0, GPU enabled. The Python version is 3.5.
What I want to achieve, though, is to get access to the pixel groups (segments) so that I can get their boundaries' image coordinates, i.e. an array of points which forms the boundary of segment X, shown in green in the prediction image.
How to do that? Obviously I cannot put the entire code here, but here is a snippet which I should modify to achieve what I would like:
I have the following in my evaluate function:
def evaluate(model_file):
    net = load_model(model_file, custom_objects={'iou_metric': create_iou_metric(1 + len(PART_NAMES)),
                                                 'acc_metric': create_accuracy_metric(1 + len(PART_NAMES), output_mode='pixelwise_mean')})
    img_size = net.input_shape[1]
    image_filename = lambda fp: fp + '.jpg'
    d_test_x = TensorResize((img_size, img_size))(ImageSource(TEST_DATA, image_filename=image_filename))
    d_test_x = PixelwiseSubstract([103.93, 116.78, 123.68], use_lane_names=['X'])(d_test_x)
    d_test_pred = Predict(net)(d_test_x)
    d_test_pred.metadata['properties'] = ['background'] + PART_NAMES
    d_x, d_y = process_data(VALIDATION_DATA, img_size)
    d_x = PixelwiseSubstract([103.93, 116.78, 123.68], use_lane_names=['X'])(d_x)
    d_y = AddBackgroundMap(use_lane_names=['Y'])(d_y)
    d_train = Join()([d_x, d_y])
    print('losses:', net.evaluate_generator(d_train.batch_array_tuple_generator(batch_size=3), 3))
    # the tensor which needs to be modified
    pred_y = Predict(net)(d_x)
    Visualize(('slices', 'labels'))(Join()([d_test_x, d_test_pred]))
    Visualize(('slices', 'labels', 'labels'))(Join()([d_x, pred_y, d_y]))
As for the Predict function, here is the snippet:
Alternatively, I've found that by using the following, one can get access to the tensor:
# for sample_img, in d_x.batch_array_tuple_generator(batch_size=3, n_samples=5):
#     aa = net.predict(sample_img)
#     indexes = np.argmax(aa, axis=3)
#     print(indexes)
#     import pdb
#     pdb.set_trace()
But I have no idea how this works; I've never used pdb, so I'm at a loss there.
In case anyone also wants to see the training function, here it is:
def train(model_name='refine_res', k=3, recompute=False, img_size=224,
          epochs=10, train_decoder_only=False, augmentation_boost=2, learning_rate=0.001,
          opt='rmsprop'):
    print("Training on: " + str(PART_NAMES))
    print("In Total: " + str(1 + len(PART_NAMES)) + " parts.")
    metrics = [create_iou_metric(1 + len(PART_NAMES)),
               create_accuracy_metric(1 + len(PART_NAMES), output_mode='pixelwise_mean')]
    if model_name == 'dummy':
        net = build_dummy((224, 224, 3), 1 + len(PART_NAMES))  # 1+ because background class
    elif model_name == 'refine_res':
        net = build_resnet50_upconv_refine((img_size, img_size, 3), 1 + len(PART_NAMES), k=k, optimizer=opt,
                                           learning_rate=learning_rate, softmax_top=True,
                                           objective_function=categorical_crossentropy,
                                           metrics=metrics, train_full=not train_decoder_only)
    elif model_name == 'vgg_upconv':
        net = build_vgg_upconv((img_size, img_size, 3), 1 + len(PART_NAMES), k=k, optimizer=opt,
                               learning_rate=learning_rate, softmax_top=True,
                               objective_function=categorical_crossentropy, metrics=metrics,
                               train_full=not train_decoder_only)
    else:
        net = load_model(model_name)
    d_x, d_y = process_data(TRAINING_DATA, img_size, recompute=recompute, ignore_cache=False)
    d = Join()([d_x, d_y])
    # create more samples by rotating top view images and translating
    images_to_be_rotated = {}
    factor = 5
    for root, dirs, files in os.walk(TRAINING_DATA, topdown=False):
        for name in dirs:
            format = str(name + '/' + name)  # construct the format of foldername/foldername
            images_to_be_rotated.update({format: factor})
    d_aug = ImageAugmentation(factor_per_filepath_prefix=images_to_be_rotated, rotation_variance=90, recalc_base_seed=True)(d)
    d_aug = ImageAugmentation(factor=3 * augmentation_boost, color_interval=0.03, shift_interval=0.1, contrast=0.4, recalc_base_seed=True, use_lane_names=['X'])(d_aug)
    d_aug = ImageAugmentation(factor=2, rotation_variance=20, recalc_base_seed=True)(d_aug)
    d_aug = ImageAugmentation(factor=7 * augmentation_boost, rotation_variance=10, translation=35, mirror=True, recalc_base_seed=True)(d_aug)
    # apply augmentation on the images of the training dataset only
    d_aug = AddBackgroundMap(use_lane_names=['Y'])(d_aug)
    d_aug.metadata['properties'] = ['background'] + PART_NAMES
    # subtract mean and shuffle
    d_aug = Shuffle()(d_aug)
    d_aug, d_val = RandomSplit(0.8)(d_aug)
    d_aug = PixelwiseSubstract([103.93, 116.78, 123.68], use_lane_names=['X'])(d_aug)
    d_val = PixelwiseSubstract([103.93, 116.78, 123.68], use_lane_names=['X'])(d_val)
    # Visualize()(d_aug)
    d_aug.configure()
    d_val.configure()
    print('training size:', d_aug.size())
    batch_size = 4
    callbacks = []
    # callbacks += [EarlyStopping(patience=10)]
    callbacks += [ModelCheckpoint(filepath="trained_models/" + model_name + '.hdf5', monitor='val_iou_metric', mode='max',
                                  verbose=1, save_best_only=True)]
    callbacks += [CSVLogger('logs/' + model_name + '.csv')]
    history = History()
    callbacks += [history]
    # sess = K.get_session()
    # sess.run(tf.initialize_local_variables())
    net.fit_generator(d_aug.batch_array_tuple_generator(batch_size=batch_size, shuffle_samples=True), steps_per_epoch=d_aug.size() // batch_size,
                      validation_data=d_val.batch_array_tuple_generator(batch_size=batch_size), validation_steps=d_val.size() // batch_size,
                      callbacks=callbacks, epochs=epochs)
    return {k: (max(history.history[k]), min(history.history[k])) for k in history.history.keys()}
For segmentation tasks, assuming your batch is one image, each pixel in the image is assigned a probability of belonging to each class. Suppose you have 5 classes and the image has 784 pixels (28x28): from net.predict you will get an array of shape (784, 5), i.e. each of the 784 pixels is assigned 5 probability values, one per class. When you do np.argmax(aa, axis=3) you get the index of the highest probability for each pixel, which would be of shape (784, 1); you can then reshape it to 28x28 with indexes.reshape(28, 28) and you get the mask of your predictions.
Reducing the problem to a 7x7 image with 4 classes (0-3), the mask might look like:
array([[2, 1, 0, 1, 2, 3, 1],
       [3, 1, 1, 0, 3, 0, 0],
       [3, 3, 2, 2, 0, 3, 1],
       [1, 1, 0, 3, 1, 3, 1],
       [0, 0, 0, 3, 3, 1, 0],
       [1, 2, 3, 0, 1, 2, 3],
       [0, 2, 1, 1, 0, 1, 3]])
Say you want to extract the indices where the model predicted class 1:
segment_1 = np.where(indexes == 1)
Since indexes is a 2-dimensional array, segment_1 will be a tuple of two arrays, where the first array holds the row indices and the second the column indices:
(array([0, 0, 0, 1, 1, 2, 3, 3, 3, 3, 4, 5, 5, 6, 6, 6]), array([1, 3, 6, 1, 2, 6, 0, 1, 4, 6, 5, 0, 4, 2, 3, 5]))
Looking at the first number in each array: 0 and 1 point to where a 1 is located in indexes (row 0, column 1).
You can extract the values like this:
indexes[segment_1]
array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
Then proceed with the second class you want to get, let's say 2:
segment_2 = np.where(indexes == 2)
segment_2
(array([0, 0, 2, 2, 5, 5, 6]), array([0, 4, 2, 3, 1, 5, 1]))
And if you want to get each class by itself, you can create a copy of indexes for each class (4 copies in total), e.g. class_1 = indexes.copy(), and set to zero any value that is not equal to 1 with class_1[class_1 != 1] = 0. That gives something like this:
array([[0, 1, 0, 1, 0, 0, 1],
       [0, 1, 1, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 1],
       [1, 1, 0, 0, 1, 0, 1],
       [0, 0, 0, 0, 0, 1, 0],
       [1, 0, 0, 0, 1, 0, 0],
       [0, 0, 1, 1, 0, 1, 0]])
To the eye it may look like there are contours, but from this example you can tell that there is no clear contour for each segment. The only way I could think of is to loop over the image by rows and record where the values change, and do the same by columns.
I am not entirely sure this would be the ideal solution.
I hope I covered some part of your question.
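To tie the steps together, a minimal end-to-end sketch (my own illustration with a made-up prediction array; the shapes and names are hypothetical):
import numpy as np

rng = np.random.default_rng(0)
aa = rng.random((1, 28, 28, 5))      # stand-in for net.predict(sample_img): (batch, H, W, classes)
indexes = np.argmax(aa, axis=3)[0]   # per-pixel class mask, shape (28, 28)
segment_1 = np.where(indexes == 1)   # (row indices, col indices) of all class-1 pixels
class_1 = indexes.copy()
class_1[class_1 != 1] = 0            # zero out everything that is not class 1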
pdb is just a debugging package that lets you execute your code step by step.

Roulette Wheel Selection for non-ordered fitness values

I need a fitness proportionate selection approach for a GA; however, my population can't lose its structure (order). In this case, while generating the probabilities, I believe the individuals get the wrong weights. The program is:
population = [[[0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1], [6], [0]],
              [[0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1], [4], [1]],
              [[0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0], [6], [2]],
              [[1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0], [4], [3]]]

popultion_d = {'0,0,1,0,1,1,0,1,1,1,1,0,0,0,0,1': 6,
               '0,0,1,1,1,0,0,1,1,0,1,1,0,0,0,1': 4,
               '0,1,1,0,1,1,0,0,1,1,1,0,0,1,0,0': 6,
               '1,0,0,1,1,1,0,0,1,1,0,1,1,0,0,0': 4}

def ProbabilityList(population_d):
    fitness = population_d.values()
    total_fit = sum(fitness)
    relative_fitness = [f/total_fit for f in fitness]
    probabilities = [sum(relative_fitness[:i+1]) for i in range(len(relative_fitness))]
    return probabilities

def FitnessProportionateSelection(population, probabilities, number):
    chosen = []
    for n in range(number):
        r = random.random()
        for (i, individual) in enumerate(population):
            if r <= probabilities[i]:
                chosen.append(list(individual))
                break
    return chosen

number = 2
Each population element is: [[individual], [fitness], [counter]]
The probabilities function output is: [0.42857142857142855, 0.5714285714285714, 0.8571428571428571, 1.0]
What I notice here is that each weight is summed onto the previous one, and since the fitnesses are not necessarily in ascending order, I think a higher weight ends up being given to a chromosome with a lower fitness.
I don't want to sort the population, because I need to index the lists by position later, so I think I would get wrong matches.
Does anyone know a possible solution, package, or different approach to perform a weighted selection in this case?
P.S.: I know the dictionary may be redundant here, but I had several other problems using the list itself.
Edit: I tried to use random.choices() as you can see below (using relative fitness):
def FitnessChoices(population, probabilities, number):
    return random.choices(population, probabilities, number)
But I get this error: TypeError: choices() takes from 2 to 3 positional arguments but 4 were given
Thank you!
Using random.choices is certainly a good idea. You just need to understand the function call. You have to specify whether your probabilities are marginal or cumulative. So you could use either
import random

def ProbabilityList(population_d):
    fitness = population_d.values()
    total_fit = sum(fitness)
    relative_fitness = [f/total_fit for f in fitness]
    return relative_fitness

def FitnessChoices(population, relative_fitness, number):
    return random.choices(population, weights=relative_fitness, k=number)
or
import random

def ProbabilityList(population_d):
    fitness = population_d.values()
    total_fit = sum(fitness)
    relative_fitness = [f/total_fit for f in fitness]
    cum_probs = [sum(relative_fitness[:i+1]) for i in range(len(relative_fitness))]
    return cum_probs

def FitnessChoices(population, cum_probs, number):
    return random.choices(population, cum_weights=cum_probs, k=number)
I'd recommend having a look at the differences between keyword and positional arguments in Python.
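For instance, with the population and fitness dictionary from the question (a usage sketch of the first variant; it relies on popultion_d preserving insertion order, which dicts do in Python 3.7+):
probabilities = ProbabilityList(popultion_d)  # [0.3, 0.2, 0.3, 0.2] for fitnesses 6, 4, 6, 4
chosen = FitnessChoices(population, probabilities, number=2)
print(chosen)  # two sampled individuals; the population itself keeps its order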
