Get contour indexes of subarray in array - python

array = np.array([\
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 255],
[ 0, 0, 0, 0, 0, 0, 0, 255, 255, 255],
[ 0, 0, 0, 0, 0, 255, 255, 255, 255, 255],
[ 0, 0, 0, 255, 255, 255, 255, 255, 255, 255],
[ 0, 255, 255, 255, 255, 255, 255, 255, 255, 255]])
The zeros define a shape:
My question is: How can I extract the indexes of the zeros which define the contour of the shape?

If you don't mind using scipy, you can use a 2D convolution to check if your zero values are surrounded by other zero values or not:
import numpy as np
import scipy.signal as signal
# Dummy input
A = np.array([[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 255],
[ 0, 0, 0, 0, 0, 0, 0, 255, 255, 255],
[ 0, 0, 0, 0, 0, 255, 255, 255, 255, 255],
[ 0, 0, 0, 255, 255, 255, 255, 255, 255, 255],
[ 0, 255, 255, 255, 255, 255, 255, 255, 255, 255]])
# We convolve the array with a 3x3 kernel filled with one,
# we use mode='same' in order to preserve the shape of the original array
# and we multiply the result by (A==0).
c2d = signal.convolve2d(A==0,np.ones((3,3)),mode='same')*(A==0)
# It is on the border if the values are > 0 and not equal to 9 so:
res = ((c2d>0) & (c2d<9)).astype(int)
# Use np.where(res) if you need the row and column indices instead,
# or np.flatnonzero(res) for linear indices.
and we obtain the following mask (1 marks a contour cell):
array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
[1, 0, 0, 0, 1, 1, 1, 0, 0, 0],
[1, 0, 1, 1, 1, 0, 0, 0, 0, 0],
[1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
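If SciPy is not available, the same neighbour count can be sketched in plain NumPy by padding the zero mask and summing nine shifted views. This is only an illustrative sketch; the names zeros, padded, neighbours, contour and coords are introduced here and are not part of the answer above.

```python
import numpy as np

A = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 255],
              [0, 0, 0, 0, 0, 0, 0, 255, 255, 255],
              [0, 0, 0, 0, 0, 255, 255, 255, 255, 255],
              [0, 0, 0, 255, 255, 255, 255, 255, 255, 255],
              [0, 255, 255, 255, 255, 255, 255, 255, 255, 255]])

zeros = (A == 0)
h, w = zeros.shape
# Pad with False so that zeros on the array edge count as contour cells,
# mirroring the zero padding that convolve2d applies with mode='same'
padded = np.pad(zeros, 1, mode="constant", constant_values=False)
# Count the zeros in each 3x3 neighbourhood by adding nine shifted views
neighbours = sum(padded[dr:dr + h, dc:dc + w].astype(int)
                 for dr in range(3) for dc in range(3))
# A zero is on the contour when at least one cell of its 3x3 block is not zero
contour = zeros & (neighbours < 9)
coords = np.argwhere(contour)  # one (row, col) pair per contour cell
```

coords then holds the contour indexes as an (N, 2) array, which may be more convenient to iterate over than the mask itself.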

Related

How to remove a polyline from an RGB image using numpy

I am not sure what would be the best way to remove a vertical polyline from an image. This line starts from the very top of an image to the bottom. The input is an image with three channels and array of y indices of the line. The output would be the same input image but without the polyline. For example, if the first image shape is (5, 10,3), then the final image shape should be (5, 9, 3) after removing the polyline.
Input example:
Image:
np.array([[[0, 20, 0, 0, 255, 0, 0, 0, 0, 0],
[0, 20, 0, 0, 255, 0, 0, 0, 0, 0],
[0, 56, 0, 0, 0, 255, 0, 0, 0, 0],
[0, 58, 0, 0, 0, 255, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 255, 0, 0, 0]],
[[0, 0, 0, 0, 255, 0, 0, 0, 30, 0],
[0, 0, 0, 0, 255, 0, 0, 0, 31, 0],
[0, 0, 0, 0, 0, 255, 0, 0, 32, 0],
[0, 0, 0, 0, 0, 255, 0, 0, 31, 0],
[0, 0, 0, 0, 0, 0, 255, 0, 0, 0]],
[[1, 0, 0, 0, 255, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 255, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 255, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 255, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 255, 0, 0, 0]]])
Note: the size here is (3, 5, 10) for illustration. The image shape would be (5, 10, 3). We want to remove the white polyline.
Polyline:
np.array([4, 4, 5, 5, 6])
Output Example
np.array([[[0, 20, 0, 0, 0, 0, 0, 0, 0],
[0, 20, 0, 0, 0, 0, 0, 0, 0],
[0, 56, 0, 0, 0, 0, 0, 0, 0],
[0, 58, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0]],
[[0, 0, 0, 0, 0, 0, 0, 30, 0],
[0, 0, 0, 0, 0, 0, 0, 31, 0],
[0, 0, 0, 0, 0, 0, 0, 32, 0],
[0, 0, 0, 0, 0, 0, 0, 31, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0]],
[[1, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 0, 0, 0]]])
Note: the size now is (3, 5, 9)
My code
def remove_polyline(image, polyline):
    h, w = image.shape[:2]
    res_image = np.zeros((h, w - 1))
    for i in range(h):
        left_section = image[i, :polyline[i]]
        right_section = image[i, polyline[i]+1:]
        res_image[i, :] = np.dstack((left_section, right_section))
    return res_image
I know this would work for grayscale images but not sure about 3 channel images
Note that your input has shape (3,5,10) instead of (5,10,3). This answer assumes the latter:
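import matplotlib.pyplot as plt  # needed for plt.imshow below
def remove_polyline(s, polyline):
    # True everywhere except at the polyline column of each row
    mask = np.arange(s.shape[1]) != polyline[:, None]
    return s[mask, :].reshape(s[:, :-1].shape).copy()
r = remove_polyline(image, polyline)
plt.imshow(r)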
Output:
You can try this.
def remove_polyline(image, polyline):
    h, w, c = image.shape
    # Keep the input dtype so that e.g. uint8 images stay uint8
    res_image = np.zeros((h, w - 1, c), dtype=image.dtype)
    for i in range(h):
        left_section = image[i, :polyline[i], :]
        right_section = image[i, polyline[i]+1:, :]
        res_image[i, :, :] = np.vstack((left_section, right_section))
    return res_image
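A quick way to sanity-check either function is to run it on a small synthetic image. The sketch below uses the mask-based version. The test data and the np.moveaxis step (for data stored channels-first, as in the question's (3, 5, 10) listing) are illustrative assumptions.

```python
import numpy as np

def remove_polyline(image, polyline):
    # Keep every pixel whose column differs from the polyline column of its row
    mask = np.arange(image.shape[1]) != polyline[:, None]
    h, w, c = image.shape
    return image[mask].reshape(h, w - 1, c)

# Hypothetical channels-first test data with shape (3, 5, 10), values 0..149
chw = np.arange(3 * 5 * 10).reshape(3, 5, 10)
image = np.moveaxis(chw, 0, -1).copy()   # channels-last, shape (5, 10, 3)
polyline = np.array([4, 4, 5, 5, 6])
image[np.arange(5), polyline] = 255      # draw the line that will be removed

result = remove_polyline(image, polyline)
print(result.shape)  # (5, 9, 3)
```

Because the synthetic image contains no other 255 values, checking that 255 is absent from result confirms that exactly the polyline pixels were dropped.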

Read a grid from image Python

I need suggestions on reading a grid for maze generation purposes. I don't need code to generate a maze; all I need is a way to read an m x n grid from an image, iterate over its cells, and link/unlink those cells. I have already written some code that generates a grid using PIL, and I will be writing code to generate mazes using different algorithms.
Example:
Given a grid that looks like this, I need, for example, to modify cell 0, 0 and cell 0, 1 by linking them together, i.e. removing the wall | between them. Any suggestions on how to modify the walls between cells in a method that might do something like the following:
def link_cells(cell1, cell2, grid):
    """Link 2 cells in a given image.
    cell1: a tuple (row, column)
    cell2: a tuple (row, column)
    grid: an image object (full grid)
    """
    # do ...
Note: I don't need an algorithm for maze generation, just something that enables the image processing part, and I will work from there.
Here's a possible approach. Calculate the means of all the rows across the image into a single column wide vector as shown on the right in the diagram. Likewise calculate the means down all the columns into a single row high vector as shown across the bottom in the diagram:
Now threshold those two vectors, and then look for transition from white to black and from black to white. That will give you the start and end rows and columns of all the black lines in the image. You can now use those to overpaint in white the cell boundaries you want to erase. I overpainted in cyan and magenta so you can see what I am doing.
#!/usr/bin/env python3
import numpy as np
from PIL import Image
# Open image and make greyscale and Numpy versions
pimRGB = Image.open('grid.jpg')
pimgrey = pimRGB.convert('L')
nimRGB = np.array(pimRGB)
nimgrey = np.array(pimgrey)
# Work out where the horizontal lines are
rowmeans = np.mean(nimgrey,axis=1)
rowthresh = np.where(rowmeans>128,255,0)
# Difference each element with its neighbour...
diffs = rowthresh[:-1] - rowthresh[1:]
rowStarts, rowEnds = [], []
for i,v in enumerate(diffs):
    if v>0:
        rowStarts.append(i)
    if v<0:
        rowEnds.append(i)
# Work out where the vertical lines are
colmeans = np.mean(nimgrey,axis=0)
colthresh = np.where(colmeans>128,255,0)
# Difference each element with its neighbour...
diffs = colthresh[:-1] - colthresh[1:]
colStarts, colEnds = [], []
for i,v in enumerate(diffs):
    if v>0:
        colStarts.append(i)
    if v<0:
        colEnds.append(i)
# Now all our initialisation is finished
# Colour in cyan the 2nd black row starting after the 3rd black col
r, c = 2, 3
nimRGB[rowStarts[r]:rowEnds[r],colEnds[c]:colStarts[c+1]] = [0,255,255]
# Colour in magenta the 8th black column starting after the 5th black row
r, c = 5, 8
nimRGB[rowEnds[r]:rowStarts[r+1],colStarts[c]:colEnds[c]] = [255,0,255]
# Convert Numpy array back to PIL Image and save
Image.fromarray(nimRGB).save('result.png')
For reference, colthresh looks like this:
array([255, 255, 255, 255, 255, 255, 255, 255, 255, 0, 0, 0, 0,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 0, 0, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 0, 0, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 0, 0, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 0, 0, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 0, 0, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 0, 0, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 0, 0, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 0, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 0, 0, 0, 0, 255, 255, 255, 255, 255,
255, 255, 255, 255])
And diffs look like this:
array([ 0, 0, 0, 0, 0, 0, 0, 0, 255, 0, 0,
0, -255, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 255, 0, -255, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 255, 0, -255, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 255, 0, -255,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 255, 0,
-255, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
255, 0, -255, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 255, 0, -255, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 255, 0, -255, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 255, -255, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 255, 0,
0, 0, -255, 0, 0, 0, 0, 0, 0, 0, 0])
And colStarts looks like this:
[8, 48, 82, 118, 152, 187, 222, 256, 292, 328]
And colEnds looks like this:
[12, 50, 84, 120, 154, 189, 224, 258, 293, 332]
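The two loops that collect the transitions can also be written without a Python loop. A minimal sketch on a made-up threshold vector (the values below are illustrative and not taken from grid.jpg):

```python
import numpy as np

# Hypothetical thresholded means: 255 = white, 0 = black line
thresh = np.array([255, 255, 0, 0, 255, 255, 255, 0, 255])

# Same differencing as in the answer; the dtype must be a signed integer
# so that black-to-white transitions can become negative
diffs = thresh[:-1] - thresh[1:]

starts = np.flatnonzero(diffs > 0)  # last white index before each black line
ends = np.flatnonzero(diffs < 0)    # last black index of each black line

print(starts.tolist(), ends.tolist())  # [1, 6] [3, 7]
```

With this convention, black line k covers the indices starts[k] + 1 through ends[k] inclusive.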

Replace multiple values in Numpy Array

Given the following example array:
import numpy as np
example = np.array(
[[[ 0, 0, 0, 255],
[ 0, 0, 0, 255]],
[[ 0, 0, 0, 255],
[ 221, 222, 13, 255]],
[[-166, -205, -204, 255],
[-257, -257, -257, 255]]]
)
I want to replace every [0, 0, 0, 255] with [255, 0, 0, 255] and set everything else to [0, 0, 0, 0].
So the desired output is:
[[[ 255, 0, 0, 255],
[ 255, 0, 0, 255]],
[[ 255, 0, 0, 255],
[ 0, 0, 0, 0]],
[[ 0, 0, 0, 0],
[ 0, 0, 0, 0]]]
This solution got close:
np.place(example, example==[0, 0, 0, 255], [255, 0, 0, 255])
np.place(example, example!=[255, 0, 0, 255], [0, 0, 0, 0])
But it outputs this instead:
[[[255 0 0 255],
[255 0 0 255]],
[[255 0 0 255],
[ 0 0 0 255]], # <- extra 255 here
[[ 0 0 0 0],
[ 0 0 0 0]]]
What's a good way to do this?
You can find all occurrences of [0, 0, 0, 255] using
np.all(example == [0, 0, 0, 255], axis=-1)
# array([[ True, True],
# [ True, False],
# [False, False]])
Save these positions to a mask, set everything to zero, then place the desired output into the mask positions:
mask = np.all(example == [0, 0, 0, 255], axis=-1)
example[:] = 0
example[mask, :] = [255, 0, 0, 255]
# array([[[255, 0, 0, 255],
# [255, 0, 0, 255]],
#
# [[255, 0, 0, 255],
# [ 0, 0, 0, 0]],
#
# [[ 0, 0, 0, 0],
# [ 0, 0, 0, 0]]])
Here's one way:
a = np.array([0, 0, 0, 255])
replace = np.array([255, 0, 0, 255])
m = (example - a).any(-1)
np.logical_not(m)[...,None] * replace
array([[[255, 0, 0, 255],
[255, 0, 0, 255]],
[[255, 0, 0, 255],
[ 0, 0, 0, 0]],
[[ 0, 0, 0, 0],
[ 0, 0, 0, 0]]])
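If the original array should stay untouched, np.where can build the result in a single expression. A small sketch that reuses the np.all mask idea; out is a name introduced here for illustration:

```python
import numpy as np

example = np.array(
    [[[0, 0, 0, 255],
      [0, 0, 0, 255]],
     [[0, 0, 0, 255],
      [221, 222, 13, 255]],
     [[-166, -205, -204, 255],
      [-257, -257, -257, 255]]]
)

# True where the whole last-axis vector equals [0, 0, 0, 255]
mask = np.all(example == [0, 0, 0, 255], axis=-1)

# mask[..., None] has shape (3, 2, 1) and broadcasts against the 4-vector,
# so matching positions get the replacement and all others get 0
out = np.where(mask[..., None], [255, 0, 0, 255], 0)
```

Comparing whole vectors with np.all is what avoids the stray 255: the np.place attempt compares element by element, so the trailing 255 of [221, 222, 13, 255] is treated as a match on its own.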

How to write a classification algorithm in tensorflow using keras in python?

I have a training set of 2 images which has 64 features and a label attached to them i.e. matched/not matched.
How can I feed this data in a neural network using keras?
My data is as follows:
[
[
[
239,
1,
255,
255,
255,
255,
2,
0,
130,
3,
1,
101,
22,
154,
0,
240,
30,
0,
2,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
128,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
71,
150,
212
],
[
239,
1,
255,
255,
255,
255,
2,
0,
130,
3,
1,
101,
22,
154,
0,
240,
30,
0,
2,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
128,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
71,
150,
212
],
"true"
],
[
[
239,
1,
255,
255,
255,
255,
2,
0,
130,
3,
1,
81,
28,
138,
0,
241,
254,
128,
6,
0,
2,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
128,
0,
128,
2,
128,
2,
192,
6,
224,
6,
224,
62,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
13,
62
],
[
239,
1,
255,
255,
255,
255,
2,
0,
130,
3,
1,
81,
28,
138,
0,
241,
254,
128,
6,
0,
2,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
128,
0,
128,
2,
128,
2,
192,
6,
224,
6,
224,
62,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
13,
62
],
"true"
],
....
]
I want to train a neural network so that, after training, if I give it two arrays of 64 features, it can tell whether they match.
Since you have already extracted the features, I'd suggest just using a few dense layers, converting "true" and "false" to 1 and 0 respectively, and putting a sigmoid on the final dense layer.
Try something simple first, see how it goes, and continue from there. If you need more help, just ask.
EDIT
import numpy as np
from tensorflow import keras  # assumption: Keras as bundled with TensorFlow
# `data` is the full list from the question: [features_a, features_b, "true"/"false"]
def generator(batch_size=10, nr_features=126):
    feed_data = np.zeros((batch_size, nr_features))
    labels = np.zeros(batch_size)
    i = 0
    for entry in data:
        # Remove the trailing label and convert it to 1/0
        if entry.pop(-1) == "true":
            labels[i] = 1
        else:
            labels[i] = 0
        # Concatenate both feature arrays into one vector
        feed_data[i, :] = np.array(entry).flatten()
        i += 1
        if not (i % batch_size):
            i = 0
            yield feed_data, labels
model = keras.Sequential()
model.add(keras.layers.Dense(126, input_dim=126))
model.add(keras.layers.Dense(20))
model.add(keras.layers.Dense(1))
model.add(keras.layers.Activation('sigmoid'))
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
for d, l in generator():
    model.train_on_batch(d, l)
So what happens:
The data in the generator is your full data. I pop the true/false label, convert it to 1/0 and put it into the label array, then concatenate all features into one feature vector of length 126. So feed_data.shape = (10, 126) and labels.shape = (10,).
I feed that to a simple fully connected network that ends in a sigmoid. A sigmoid is useful for probabilities, so in this case the output is the probability that a feature vector is "true".
This is a simple example, not the full code, but it should get you started. I tested it and it runs for me, though I did not train anything yet; that part is up to you. Good luck!
Oh, and if you have questions, ask away.
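As an alternative to popping labels inside the generator, which modifies data in place, the nested list can be turned into two arrays up front. This is a sketch on a hypothetical, shortened data set. Scaling by 255 assumes the features are byte values, which the question does not state.

```python
import numpy as np

# Hypothetical miniature data set in the question's layout:
# [features_a, features_b, "true"/"false"], shortened to 3 features each
data = [
    [[239, 1, 255], [239, 1, 255], "true"],
    [[239, 1, 255], [13, 62, 0], "false"],
]

# Concatenate each pair into one row and scale the values to [0, 1]
# (assumption: features are bytes in the range 0..255)
X = np.array([a + b for a, b, _ in data], dtype=np.float32) / 255.0

# Map the string labels to 1.0 / 0.0 without modifying `data`
y = np.array([label == "true" for _, _, label in data], dtype=np.float32)

print(X.shape, y.tolist())  # (2, 6) [1.0, 0.0]
```

With the real data, X would have one row of 126 values per pair, and arrays like these could be passed to model.fit(X, y) instead of looping over train_on_batch.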

How to convert RGB image to one-hot encoded 3d array based on color using numpy?

Simply put, what I'm trying to do is similar to this question: Convert RGB image to index image, but instead of a 1-channel index image, I want to get an n-channel image where img[h, w] is a one-hot encoded vector. For example, if the input image is [[[0, 0, 0], [255, 255, 255]]], and index 0 is assigned to black and 1 is assigned to white, then the desired output is [[[1, 0], [0, 1]]].
Like the person who asked that question, I have implemented this naively, but the code runs quite slowly, and I believe a proper solution using numpy would be significantly faster.
Also, as suggested in the previous post, I can preprocess each image into grayscale and one-hot encode the image, but I want a more general solution.
Example
Say I want to assign white to 0, red to 1, blue to 2, and yellow to 3:
(255, 255, 255): 0
(255, 0, 0): 1
(0, 0, 255): 2
(255, 255, 0): 3
, and I have an image which consists of those four colors, where image is a 3D array containing R, G, B values for each pixel:
[
[[255, 255, 255], [255, 255, 255], [255, 0, 0], [255, 0, 0]],
[[ 0, 0, 255], [255, 255, 255], [255, 0, 0], [255, 0, 0]],
[[ 0, 0, 255], [ 0, 0, 255], [255, 255, 255], [255, 255, 255]],
[[255, 255, 255], [255, 255, 255], [255, 255, 0], [255, 255, 0]]
]
, and this is what I want to get where each pixel is changed to one-hot encoded values of index. (Since changing a 2d array of index values to 3d array of one-hot encoded values is easy, getting a 2d array of index values is fine too.)
[
[[1, 0, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0]],
[[0, 0, 1, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0]],
[[0, 0, 1, 0], [0, 0, 1, 0], [1, 0, 0, 0], [1, 0, 0, 0]],
[[1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 1]]
]
In this example I used colors where RGB components are either 255 or 0, but I don't want solutions to rely on that fact.
My solution looks like this and should work for arbitrary colors:
color_dict = {0: (0, 255, 255),
              1: (255, 255, 0),
              ....}
def rgb_to_onehot(rgb_arr, color_dict):
    num_classes = len(color_dict)
    shape = rgb_arr.shape[:2]+(num_classes,)
    arr = np.zeros( shape, dtype=np.int8 )
    for i, cls in enumerate(color_dict):
        arr[:,:,i] = np.all(rgb_arr.reshape( (-1,3) ) == color_dict[i], axis=1).reshape(shape[:2])
    return arr
def onehot_to_rgb(onehot, color_dict):
    single_layer = np.argmax(onehot, axis=-1)
    output = np.zeros( onehot.shape[:2]+(3,) )
    for k in color_dict.keys():
        output[single_layer==k] = color_dict[k]
    return np.uint8(output)
I haven't tested it for speed yet, but at least, it works :)
We could generate the decimal equivalents of each pixel color. With each channel having 0 or 255 as the value, there would be total 8 possibilities, but it seems we are only interested in four of those colors.
Then, we would have two ways to solve it :
One would involve making unique indices from those decimal equivalents starting from 0 till the final color, all in sequence and finally initializing an output array and assigning into it.
Other way would be to use broadcasted comparisons of those decimal equivalents against the colors.
These two methods are listed next -
def indexing_based(a):
    b = (a == 255).dot([4,2,1]) # Decimal equivalents
    colors = np.array([7,4,1,6]) # Define colors decimal equivalents here
    idx = np.empty(colors.max()+1,dtype=int)
    idx[colors] = np.arange(len(colors))
    m,n,r = a.shape
    out = np.zeros((m,n,len(colors)), dtype=int)
    out[np.arange(m)[:,None], np.arange(n), idx[b]] = 1
    return out
def broadcasting_based(a):
    b = (a == 255).dot([4,2,1]) # Decimal equivalents
    colors = np.array([7,4,1,6]) # Define colors decimal equivalents here
    return (b[...,None] == colors).astype(int)
Sample run -
>>> a = np.array([
... [[255, 255, 255], [255, 255, 255], [255, 0, 0], [255, 0, 0]],
... [[ 0, 0, 255], [255, 255, 255], [255, 0, 0], [255, 0, 0]],
... [[ 0, 0, 255], [ 0, 0, 255], [255, 255, 255], [255, 255, 255]],
... [[255, 255, 255], [255, 255, 255], [255, 255, 0], [255, 255, 0]],
... [[255, 255, 255], [255, 0, 0], [255, 255, 0], [255, 0 , 0]]])
>>> indexing_based(a)
array([[[1, 0, 0, 0],
[1, 0, 0, 0],
[0, 1, 0, 0],
[0, 1, 0, 0]],
[[0, 0, 1, 0],
[1, 0, 0, 0],
[0, 1, 0, 0],
[0, 1, 0, 0]],
[[0, 0, 1, 0],
[0, 0, 1, 0],
[1, 0, 0, 0],
[1, 0, 0, 0]],
[[1, 0, 0, 0],
[1, 0, 0, 0],
[0, 0, 0, 1],
[0, 0, 0, 1]],
[[1, 0, 0, 0],
[0, 1, 0, 0],
[0, 0, 0, 1],
[0, 1, 0, 0]]])
>>> np.allclose(broadcasting_based(a), indexing_based(a))
True
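The decimal-equivalent trick relies on every channel being 0 or 255. For arbitrary colours, a broadcasted comparison against a palette of full RGB triples is one possible generalisation. The sketch below is illustrative, and the names palette, rgb_to_onehot and img are assumptions introduced here.

```python
import numpy as np

# Palette row i is the colour assigned to class index i
palette = np.array([[255, 255, 255],   # 0: white
                    [255,   0,   0],   # 1: red
                    [  0,   0, 255],   # 2: blue
                    [255, 255,   0]])  # 3: yellow

def rgb_to_onehot(img, palette):
    # (h, w, 1, 3) == (n, 3) broadcasts to (h, w, n, 3);
    # a pixel matches a colour only if all three channels agree
    return (img[:, :, None, :] == palette).all(axis=-1).astype(np.uint8)

img = np.array([[[255, 255, 255], [255, 0, 0]],
                [[0, 0, 255], [255, 255, 0]]])
onehot = rgb_to_onehot(img, palette)  # shape (2, 2, 4)
```

Pixels whose colour is not in the palette end up as an all-zero vector. The intermediate boolean array has h * w * n * 3 elements, which may matter for large images or many classes.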
A simple implementation involves masking the relevant pixel positions, whether it's for converting from label to color or vice-versa. I show here how to convert between dense (1-channel labels), OHE (one-hot-encoding sparse), and RGB formats. Essentially performing OHE<->RGB<->dense.
Assume your RGB-encoded input is stored in seg.
First define the color label to color mapping (no need for a dict here):
>>> colors = np.array([[ 255, 255, 255],
[ 255, 0, 0],
[ 0, 0, 255],
[ 255, 255, 0]])
RGB (h, w, 3) to dense (h, w)
dense = np.zeros(seg.shape[:2])
for label, color in enumerate(colors):
    dense[np.all(seg == color, axis=-1)] = label
RGB (h, w, 3) to OHE (h, w, #classes)
Similar to the previous conversion, RGB to one-hot-encoding requires two additional lines:
ohe = np.zeros((*seg.shape[:2], len(colors)))
for label, color in enumerate(colors):
    v = np.zeros(len(colors))
    v[label] = 1
    ohe[np.all(seg == color, axis=-1)] = v
dense (h, w) to RGB (h, w, 3)
rgb = np.zeros((*labels.shape, 3))
for label, color in enumerate(colors):
    rgb[labels == label] = color
OHE (h, w, #classes) to RGB (h, w, 3)
Converting from OHE to dense requires one line:
dense = ohe.argmax(-1)
Then you can simply follow dense->RGB.
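When the dense label map has an integer dtype, the two conversions that start from it can also be done with plain indexing instead of a loop. A minimal sketch, assuming a small hypothetical label map named dense:

```python
import numpy as np

colors = np.array([[255, 255, 255],
                   [255, 0, 0],
                   [0, 0, 255],
                   [255, 255, 0]])

# Hypothetical dense label map; it must hold integers to be usable as an index
dense = np.array([[0, 1],
                  [2, 3]])

# dense (h, w) to OHE (h, w, #classes): pick rows of the identity matrix
ohe = np.eye(len(colors), dtype=np.uint8)[dense]

# dense (h, w) to RGB (h, w, 3): pick rows of the colour table
rgb = colors[dense]
```

Note that the loop-based dense array above is created with np.zeros and is therefore float; it would need something like dense.astype(int) before being used as an index this way.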
