Although there are many instances of the question "What is the numpy alternative to nested for loops?", I was unable to find a suitable answer for my case. Here it goes:
I have a 3D numpy array with "0" background and other integers as foreground. I would like to find and store the foreground voxels which fall within a predefined mask (a sphere defining a given distance from a reference node). I have successfully done the task using nested 'for' loops and a chain of 'if' conditions as shown below. I am looking for a more efficient and compact alternative to avoid the loops and long conditions for this neighborhood search algorithm.
Sample input data:
import numpy as np
im = np.array([[[ 60, 54, 47, 52, 57, 53, 46, 48]
, [ 60, 57, 53, 53, 54, 53, 50, 55]
, [ 60, 63, 56, 58, 59, 57, 50, 50]
, [ 70, 70, 64, 69, 74, 72, 64, 47]
, [ 73, 76, 77, 80, 82, 76, 58, 37]
, [ 85, 85, 86, 86, 78, 62, 38, 20]
, [ 94, 94, 92, 78, 54, 33, 16, 255]
, [ 94, 90, 72, 51, 32, 19, 255, 255]
, [ 65, 53, 29, 18, 255, 255, 255, 255]
, [ 29, 22, 255, 255, 255, 255, 255, 0]]
, [[ 66, 67, 70, 69, 75, 73, 72, 63]
, [ 68, 70, 73, 74, 78, 80, 74, 53]
, [ 75, 87, 87, 83, 89, 86, 61, 33]
, [ 81, 89, 88, 98, 99, 77, 41, 18]
, [ 84, 94, 100, 100, 82, 49, 21, 255]
, [ 99, 101, 92, 75, 48, 25, 255, 255]
, [ 93, 77, 52, 32, 255, 255, 255, 255]
, [ 52, 40, 25, 255, 255, 255, 255, 255]
, [ 23, 16, 255, 255, 255, 255, 255, 0]
, [255, 255, 255, 255, 255, 255, 0, 0]]
, [[ 81, 83, 92, 101, 101, 83, 49, 19]
, [ 86, 96, 103, 103, 95, 64, 28, 255]
, [ 94, 103, 107, 98, 79, 41, 255, 255]
, [101, 103, 98, 79, 51, 28, 255, 255]
, [102, 97, 76, 49, 27, 255, 255, 255]
, [ 79, 62, 35, 21, 255, 255, 255, 255]
, [ 33, 23, 15, 255, 255, 255, 255, 255]
, [ 16, 255, 255, 255, 255, 255, 255, 0]
, [255, 255, 255, 255, 255, 255, 0, 0]
, [255, 255, 255, 255, 255, 0, 0, 0]]
, [[106, 107, 109, 94, 58, 26, 15, 255]
, [110, 104, 90, 66, 37, 19, 255, 255]
, [106, 89, 61, 35, 22, 255, 255, 255]
, [ 76, 56, 34, 19, 255, 255, 255, 255]
, [ 40, 27, 18, 255, 255, 255, 255, 255]
, [ 17, 255, 255, 255, 255, 255, 255, 255]
, [255, 255, 255, 255, 255, 255, 255, 0]
, [255, 255, 255, 255, 255, 255, 0, 0]
, [255, 255, 255, 255, 255, 0, 0, 0]
, [255, 255, 255, 0, 0, 0, 0, 0]]
, [[ 68, 51, 33, 19, 255, 255, 255, 255]
, [ 45, 34, 20, 255, 255, 255, 255, 255]
, [ 28, 18, 255, 255, 255, 255, 255, 255]
, [ 17, 255, 255, 255, 255, 255, 255, 255]
, [255, 255, 255, 255, 255, 255, 255, 255]
, [255, 255, 255, 255, 255, 255, 255, 0]
, [255, 255, 255, 255, 255, 255, 0, 0]
, [255, 255, 255, 255, 255, 0, 0, 0]
, [255, 255, 255, 0, 0, 0, 0, 0]
, [255, 0, 0, 0, 0, 0, 0, 0]]
, [[255, 255, 255, 255, 255, 255, 255, 255]
, [255, 255, 255, 255, 255, 255, 255, 255]
, [255, 255, 255, 255, 255, 255, 255, 255]
, [255, 255, 255, 255, 255, 255, 255, 0]
, [255, 255, 255, 255, 255, 255, 0, 0]
, [255, 255, 255, 255, 255, 0, 0, 0]
, [255, 255, 255, 255, 0, 0, 0, 0]
, [255, 255, 255, 0, 0, 0, 0, 0]
, [255, 0, 0, 0, 0, 0, 0, 0]
, [ 0, 0, 0, 0, 0, 0, 0, 0]]
, [[255, 255, 255, 255, 255, 255, 255, 0]
, [255, 255, 255, 255, 255, 255, 255, 0]
, [255, 255, 255, 255, 255, 255, 0, 0]
, [255, 255, 255, 255, 255, 0, 0, 0]
, [255, 255, 255, 255, 0, 0, 0, 0]
, [255, 255, 255, 0, 0, 0, 0, 0]
, [255, 255, 0, 0, 0, 0, 0, 0]
, [ 0, 0, 0, 0, 0, 0, 0, 0]
, [ 0, 0, 0, 0, 0, 0, 0, 0]
, [ 0, 0, 0, 0, 0, 0, 0, 0]]
, [[255, 255, 255, 255, 255, 255, 0, 0]
, [255, 255, 255, 255, 255, 0, 0, 0]
, [255, 255, 255, 255, 0, 0, 0, 0]
, [255, 255, 255, 0, 0, 0, 0, 0]
, [255, 255, 0, 0, 0, 0, 0, 0]
, [255, 0, 0, 0, 0, 0, 0, 0]
, [ 0, 0, 0, 0, 0, 0, 0, 0]
, [ 0, 0, 0, 0, 0, 0, 0, 0]
, [ 0, 0, 0, 0, 0, 0, 0, 0]
, [ 0, 0, 0, 0, 0, 0, 0, 0]]])
The implemented method:
[Z,Y,X]=im.shape
RN = np.array([3,4,4])
################Loading Area search
rad = 3
a,b,c = RN
x,y,z = np.ogrid[-c:Z-c,-b:Y-b,-a:X-a]
neighborMask = x*x + y*y + z*z <= rad*rad
noNodeMask = im > 0
mask = np.logical_and(neighborMask, noNodeMask)
imtemp = im.copy()
imtemp[mask] = -1
for i in range(X):
    for j in range(Y):
        for k in range(Z):
            if imtemp[i,j,k] == -1:
                if i in (0, X-1) or j in (0, Y-1) or k in (0, Z-1):
                    imtemp[i,j,k] = -2
                elif (imtemp[i+1,j,k] == 0 or imtemp[i-1,j,k] == 0
                      or imtemp[i,j+1,k] == 0 or imtemp[i,j-1,k] == 0
                      or imtemp[i,j,k+1] == 0 or imtemp[i,j,k-1] == 0):
                    imtemp[i,j,k] = -2
LA = np.argwhere(imtemp==-2)
The resulting LA from the above sample code is:
In [90]:LA
Out[90]:
array([[4, 4, 0],
[4, 4, 6],
[4, 5, 5],
[4, 6, 4],
[4, 6, 5],
[4, 7, 3],
[5, 3, 5],
[5, 4, 4],
[5, 4, 5],
[5, 5, 3],
[5, 5, 4],
[5, 6, 2],
[5, 6, 3],
[6, 2, 4],
[6, 3, 3],
[6, 3, 4],
[6, 4, 2],
[6, 4, 3],
[6, 5, 1],
[6, 5, 2]])
And a slice in the Z direction (an XY plane instance) shows the different untouched, masked (-1), and target (-2) nodes (figure omitted).
Since your loops use only direct NumPy indexing, you can use Numba's @njit to perform this in a much more efficient way.
from numba import njit

@njit
def compute_imtemp(imtemp, X, Y, Z):
    for i in range(Z):
        for j in range(Y - 1):
            for k in range(X - 1):
                if imtemp[i,j,k] == -1:
                    if i == (Z - 1):
                        imtemp[i,j,k] = -2
                    elif (imtemp[i+1,j,k] == 0 or imtemp[i-1,j,k] == 0
                          or imtemp[i,j+1,k] == 0 or imtemp[i,j-1,k] == 0
                          or imtemp[i,j,k+1] == 0 or imtemp[i,j,k-1] == 0):
                        imtemp[i,j,k] = -2
[...]
imtemp = im.copy()
imtemp[mask] = -1
compute_imtemp(imtemp, X, Y, Z)
LA = np.argwhere(imtemp==-2)
Here are performance results on my machine:
281 µs ± 1.43 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)  # original loops
776 ns ± 16.4 ns per loop (mean ± std. dev. of 7 runs, 100 loops each)  # Numba
The Numba implementation is 362 times faster.
Note that the first call to compute_imtemp will be slow because of the compilation. One way to overcome this is to call compute_imtemp on an empty Numpy array. Another way is to manually compile the function using the Numba API and provide the types to Numba explicitly.
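For example, here is a minimal sketch of the explicit-typing route. The signature is an assumption: int64 matches the default integer dtype on most Linux/macOS builds, so adjust it to the real dtype of imtemp.
from numba import njit

# Eager compilation: an explicit signature makes Numba compile at
# definition time, so the first real call pays no compilation cost.
# "int64[:, :, ::1]" assumes a C-contiguous int64 array.
@njit("void(int64[:, :, ::1], int64, int64, int64)")
def compute_imtemp(imtemp, X, Y, Z):
    for i in range(Z):
        for j in range(Y - 1):
            for k in range(X - 1):
                if imtemp[i,j,k] == -1:
                    if i == (Z - 1):
                        imtemp[i,j,k] = -2
                    elif (imtemp[i+1,j,k] == 0 or imtemp[i-1,j,k] == 0
                          or imtemp[i,j+1,k] == 0 or imtemp[i,j-1,k] == 0
                          or imtemp[i,j,k+1] == 0 or imtemp[i,j,k-1] == 0):
                        imtemp[i,j,k] = -2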
Problem Statement
You have a "solid" shape in a large array. You carve out a ball from that. Your goal is to find the indices of the surface of the solid within the ball. Surfaces are defined as any point neighboring the outside of the solid with 6-point connectivity. Edges of the array are considered to be surfaces too.
Faster Loop Solution
You already computed the mask that represents the intersection of the solid and the ball. You can compute the mask a little more elegantly and convert it to indices instead. I suggest keeping the order of your dimensions constant, instead of switching between different notations. The order of RN is affected, for example, and you run the risk of mismatching your axis limits.
RN = np.array([4, 4, 3])
rad = 3
im = ...
cutout = ((np.indices(im.shape) - RN.reshape(-1, 1, 1, 1))**2).sum(axis=0) <= rad**2
solid = im > 0
mask = solid & cutout
indices = np.argwhere(mask)
You can also get the cutout without reshaping RN by doing
cutout = ((np.rollaxis(np.indices(im.shape, sparse=False), 0, 4) - RN)**2).sum(axis=-1) <= rad**2
The nice thing about computing indices is that your loops don't need to be huge any more. By using argwhere, you basically strip off the outer three loops, leaving only the if statement to loop over. You can also vectorize the connectivity check. This has the nice side effect that you can define arbitrary connectivity for each pixel.
limit = np.array(im.shape) - 1 # Edge of `im`
connectivity = np.array([[ 1, 0, 0], # Add rows to determine connectivity
[-1, 0, 0],
[ 0, 1, 0],
[ 0, -1, 0],
[ 0, 0, 1],
[ 0, 0, -1]], dtype=indices.dtype)
index_mask = np.ones(len(indices), dtype=bool)
for n, ind in enumerate(indices):
    if ind.all() and (ind < limit).all() and im[tuple((ind + connectivity).T)].all():
        index_mask[n] = False
LA = indices[index_mask, :]
Notice that there is really no point to imtemp at all. Even in your original loop, you could just manipulate mask directly. Instead of setting elements to -2 when they pass your criterion, you could set elements to False if they didn't.
I do something like that here. We check each of the indices that were actually selected, and determine if any of them are inside the solid. These indices are eliminated from the mask. The list of indices is then updated based on the mask.
The check ind.all() and (ind < limit).all() and im[tuple((ind + connectivity).T)].all() is a shortcut for what you were doing with the or conditions, but reversed (testing for non-surface rather than surface).
ind.all() checks that none of the indices are zero: i.e., not on a top/front/left surface.
(ind < limit).all() checks that none of the indices are equal to the corresponding image size minus one.
im[tuple((ind + connectivity).T)].all() checks that none of the connected pixels are zero. (ind + connectivity).T is a (3, 6) array of the six points that we are connected to (currently defined as +/-1 in each axis by the (6, 3) connectivity array). When you turn it into a tuple, it becomes a fancy index, as if you had done something like im[x + connectivity[:, 0], y + connectivity[:, 1], z + connectivity[:, 2]]. The commas in the index just make it into a tuple. The way I show is better suited for arbitrary numbers of dimensions.
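A tiny self-contained demonstration of that tuple trick (toy values, not the image from the question):
import numpy as np

im = np.arange(27).reshape(3, 3, 3)   # toy volume, values 0..26
ind = np.array([1, 1, 1])             # the center voxel
connectivity = np.array([[ 1, 0, 0], [-1, 0, 0],
                         [ 0, 1, 0], [ 0, -1, 0],
                         [ 0, 0, 1], [ 0, 0, -1]])

# (ind + connectivity).T has shape (3, 6); as a tuple it acts like
# im[x + dx, y + dy, z + dz] for all six neighbors at once.
print(im[tuple((ind + connectivity).T)])   # -> [22  4 16 10 14 12]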
Pixels that pass all three tests are inside the solid, and get removed. You can of course write the loop to check the other way, but then you would have to alter your mask:
index_mask = np.zeros(len(indices), dtype=bool)
for n, ind in enumerate(indices):
    if (ind == 0).any() or (ind == limit).any() or (im[tuple((ind + connectivity).T)] == 0).any():
        index_mask[n] = True
LA = indices[index_mask, :]
Looping manually is not ideal by any means. However, it shows you how to shorten the loop (probably by a couple of orders of magnitude), and how to define arbitrary connectivity using vectorization and broadcasting, without getting bogged down with hard-coding it.
Fully Vectorized Solution
The loops above can be fully vectorized using the magic of broadcasting. Instead of looping over each row in indices, we can add connectivity to it in bulk and filter the results in bulk. The trick is to add enough dimensions that you add all of connectivity to each element of indices.
You will still want to omit the pixels that are at the edges:
edges = (indices == 0).any(axis=-1) | (indices == limit).any(axis=-1)
conn_index = indices[~edges, None, :] + connectivity[None, ...]
index_mask = np.empty(len(indices), dtype=bool)
index_mask[edges] = True
index_mask[~edges] = (im[tuple(conn_index.T)] == 0).any(axis=0)
LA = indices[index_mask, :]
I expect that a properly written loop compiled with numba will be significantly faster than this solution, because it avoids much of the overhead of pipelining the operations. It will not require large temporary buffers or special handling.
TL;DR
# Parameters
RN = np.array([4, 4, 3])
rad = 3
im = ...
# Find subset of interest
cutout = ((np.indices(im.shape) - RN.reshape(-1, 1, 1, 1))**2).sum(axis=0) <= rad**2
solid = im > 0
# Convert mask to indices
indices = np.argwhere(solid & cutout)
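# `limit` and `connectivity` exactly as defined in the sections above
limit = np.array(im.shape) - 1
connectivity = np.array([[ 1, 0, 0], [-1, 0, 0],
                         [ 0, 1, 0], [ 0, -1, 0],
                         [ 0, 0, 1], [ 0, 0, -1]], dtype=indices.dtype)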
# Find image edges among indices
edges = (indices == 0).any(axis=-1) | (indices == limit).any(axis=-1)
# Connectivity elements for non-edge pixels
conn_index = indices[~edges, None, :] + connectivity[None, ...]
# Mask the valid surface pixels
index_mask = np.empty(len(indices), dtype=bool)
index_mask[edges] = True
index_mask[~edges] = (im[tuple(conn_index.T)] == 0).any(axis=0)
# Final result
LA = indices[index_mask, :]
I would like to convert a numpy array into a numpy array of arrays.
I have an array: a = [[0,0,0],[0,255,0],[0,255,255],[255,255,255]]
and I would like to have: b = [[[0,0,0],[0,0,0],[0,0,0]],[[0,0,0],[255,255,255],[0,0,0]],[[0,0,0],[255,255,255],[255,255,255]],[[255,255,255],[255,255,255],[255,255,255]]]
Is there any easy way to do it?
I have tried with np.where(a == 0, [0,0,0],[255,255,255]) but I got the following error:
ValueError: operands could not be broadcast together with shapes
You can use broadcast_to as
b = np.broadcast_to(a, (3,4,3))
where a has shape (4, 3). Then you need to swap the axes around:
import numpy as np
a = np.array([[0,0,0],[0,255,0],[0,255,255],[255,255,255]])
b = np.broadcast_to(a, (3,4,3))
c = np.moveaxis(b, [0,1,2], [2,0,1])
c
giving
array([[[ 0, 0, 0],
[ 0, 0, 0],
[ 0, 0, 0]],
[[ 0, 0, 0],
[255, 255, 255],
[ 0, 0, 0]],
[[ 0, 0, 0],
[255, 255, 255],
[255, 255, 255]],
[[255, 255, 255],
[255, 255, 255],
[255, 255, 255]]])
A more direct broadcasting method, suggested by @Divakar, is
b = np.broadcast_to(a[:,:,None], (4,3,3))
which produces the same output without axis swapping: a[:,:,None] has shape (4, 3, 1), which broadcasts directly to (4, 3, 3).
What you tried will work with the following small modification:
a = np.array(a)
np.where(a[...,None]==0,[0,0,0],[255,255,255])
To make multidimensional indexing available, we first have to cast a to an array. a[...,None] adds a new dimension at the end of a to accommodate the triplets 0,0,0 and 255,255,255.
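A quick shape check shows why this works:
>>> a[..., None].shape
(4, 3, 1)
>>> np.where(a[..., None] == 0, [0, 0, 0], [255, 255, 255]).shape
(4, 3, 3)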
In [204]: a = np.array([[0,0,0],[0,255,0],[0,255,255],[255,255,255]])
In [205]: a.shape
Out[205]: (4, 3)
Looks like you want to replicate each element 3 times, making a new trailing dimension. We can do that using repeat (after adding the new trailing dimension):
In [207]: a.reshape(4,3,1).repeat(3,2)
Out[207]:
array([[[ 0, 0, 0],
[ 0, 0, 0],
[ 0, 0, 0]],
[[ 0, 0, 0],
[255, 255, 255],
[ 0, 0, 0]],
[[ 0, 0, 0],
[255, 255, 255],
[255, 255, 255]],
[[255, 255, 255],
[255, 255, 255],
[255, 255, 255]]])
In [208]: _.shape
Out[208]: (4, 3, 3)
I have an RGB coloured image mask mask_color, shape (4,4,3). How can I quickly convert all the black pixels [0,0,0] to white [255,255,255], without using any loops, without additional packages, preferably NumPy way?
mask_color = np.array([
[
[0,0,0],
[128,0,255],
[0,0,0],
[0,0,0]
],
[
[0,0,0],
[0,0,0],
[0,0,0],
[0,0,0]
],
[
[0,0,0],
[50,128,0],
[0,0,0],
[0,0,0]
],
[
[0,0,0],
[0,0,0],
[245,108,60],
[0,0,0]
]
])
plt.imshow(mask_color)
plt.show()
white_bg_mask_color = # do something
plt.imshow(white_bg_mask_color)
plt.show()
You can use np.where, keeping pixels that have any nonzero channel and substituting 255 elsewhere:
>>> np.where(mask_color.any(-1,keepdims=True),mask_color,255)
array([[[255, 255, 255],
[128, 0, 255],
[255, 255, 255],
[255, 255, 255]],
[[255, 255, 255],
[255, 255, 255],
[255, 255, 255],
[255, 255, 255]],
[[255, 255, 255],
[ 50, 128, 0],
[255, 255, 255],
[255, 255, 255]],
[[255, 255, 255],
[255, 255, 255],
[245, 108, 60],
[255, 255, 255]]])
You can also do it using boolean indexing, like below:
mask_color[np.all(mask_color==0, axis=2)] = 255
mask_color
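Note that boolean indexing modifies mask_color in place; if you need to keep the original mask, work on a copy (using the white_bg_mask_color name from the question):
white_bg_mask_color = mask_color.copy()
white_bg_mask_color[np.all(white_bg_mask_color == 0, axis=2)] = 255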
I am trying to build an image classifier but I'm running into the error mentioned in the title of this post. Below is the code I'm working on. How do I convert my numpy array of shape (8020,) to the shape required by fit()? I tried to print the input shape, train_img_array.shape[1:], but it gives an empty shape: ().
import numpy as np
img_train.shape
img_valid.shape
img_train.head(5)
img_valid.head(5)
(8020, 4)
(2006, 4)
ID index class data
8030 11596 11596 0 [[[255, 255, 255, 0], [255, 255, 255, 0], [255...
2152 11149 11149 0 [[[255, 255, 255, 0], [255, 255, 255, 0], [255...
550 10015 10015 0 [[[255, 255, 255, 0], [255, 255, 255, 0], [255...
1740 9035 9035 0 [[[255, 255, 255, 0], [255, 255, 255, 0], [255...
9549 8218 8218 1 [[[255, 255, 255, 0], [255, 255, 255, 0], [255...
ID index class data
3312 5481 5481 0 [[[255, 255, 255, 0], [255, 255, 255, 0], [255...
9079 10002 10002 0 [[[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], ...
6129 11358 11358 0 [[[255, 255, 255, 0], [255, 255, 255, 0], [255...
1147 2613 2613 1 [[[255, 255, 255, 0], [255, 255, 255, 0], [255...
7105 5442 5442 1 [[[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], ...
img_train.dtypes
ID int64
index int64
class int64
data object
dtype: object
train_img_array = np.array([])
train_id_array = np.array([])
train_lab_array = np.array([])
train_id_array = img_train['ID'].values
train_lab_array = img_train['class'].values
train_img_array =img_train['data'].values
train_img_array.shape
train_lab_array.shape
train_id_array.shape
(8020,)
(8020,)
(8020,)
# Importing the Keras libraries and other packages
#matplotlib inline
from __future__ import print_function
import keras
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
from keras.layers import Dropout
Using Theano backend.
WARNING (theano.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
classifier = Sequential()
classifier.add(Conv2D(32, (3, 3), padding='same', activation='relu', input_shape = (256, 256, 3)))
classifier.add(Conv2D(32, (3, 3), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))
classifier.add(Dropout(0.25))
classifier.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
classifier.add(Conv2D(64, (3, 3), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))
classifier.add(Dropout(0.25))
classifier.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
classifier.add(Conv2D(64, (3, 3), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))
classifier.add(Dropout(0.25))
classifier.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
classifier.add(Conv2D(64, (3, 3), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))
classifier.add(Dropout(0.25))
classifier.add(Flatten())
classifier.add(Dense(units = 256, activation = 'relu'))
classifier.add(Dropout(0.25))
classifier.add(Dense(units = 1, activation = 'sigmoid'))
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
classifier.summary()
batch_size = 32
epochs = 15
history = classifier.fit(train_img_array, train_lab_array, batch_size=batch_size, epochs=epochs, verbose=1,
                         validation_data=(valid_img_array, valid_lab_array))
classifier.evaluate(valid_img_array, valid_lab_array)
ValueError: Error when checking input: expected conv2d_1_input to have 4 dimensions, but got array with shape (8020, 1)
Edit: -----------------------------------------------------------
As Nassim requested, adding a few more details to this post:
print(train_img_array)
[ array([[[255, 255, 255, 0],
[255, 255, 255, 0],
[255, 255, 255, 0],
...,
[255, 255, 255, 0],
[255, 255, 255, 0],
[255, 255, 255, 0]],
[[255, 255, 255, 0],
[255, 255, 255, 0],
[255, 255, 255, 0],
...,
...,
[255, 255, 255, 0],
[255, 255, 255, 0],
[255, 255, 255, 0]]], dtype=uint8)
array([[[255, 255, 255, 0],
[255, 255, 255, 0],
[255, 255, 255, 0],
...,
...,
[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0]]], dtype=uint8)]
print(list(train_img_array))
[array([[[255, 255, 255, 0],
[255, 255, 255, 0],
[255, 255, 255, 0],
...,
[255, 255, 255, 0],
[255, 255, 255, 0],
[255, 255, 255, 0]],
[[255, 255, 255, 0],
[255, 255, 255, 0],
[255, 255, 255, 0],
...,
...,
[255, 255, 255, 0],
[255, 255, 255, 0],
[255, 255, 255, 0]]], dtype=uint8), array([[[255, 255, 255, 0],
[255, 255, 255, 0],
[255, 255, 255, 0],
...,
print(np.array(list(train_img_array)))
throws the error:
ValueError: could not broadcast input array from shape (700,584,4) into shape (700,584)
So after debugging using the result:
> print(type(train_img_array[0]))
<type 'numpy.ndarray'>
> print(train_img_array[0].shape)
(700, 584, 4)
> print(train_img_array[0])
array([[[255, 255, 255, 0], [255, 255, 255, 0], [255, 255, 255, 0], ..., ..., [255, 255, 255, 0], [255, 255, 255, 0], [255, 255, 255, 0]]], dtype=uint8)
we see that what is returned when you do:
train_img_array = img_train['data'].values
is actually one numpy array of shape (8020,) where all the elements are other numpy arrays containing images; basically two numpy arrays nested.
So what you want is to flatten that nested structure into one single array object. The way I would do it (it might be a bit hacky, but it should work) is the following:
train_img_array = img_train['data'].values
train_img_array = np.array(list(train_img_array))
So basically, this transforms the numpy array of numpy arrays into a list of numpy arrays; then, when you build a numpy array out of that list, you get (magic) a numpy array with one more dimension.
The shape after this operation should be (8020, 700, 584, 4)
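An equivalent way, assuming every image really has the same (700, 584, 4) shape, is np.stack, which adds the leading dimension for you:
train_img_array = np.stack(list(img_train['data'].values))
train_img_array.shape   # expected: (8020, 700, 584, 4)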
One more potential issue you might encounter is the format of your images: the channel dimension is the last one (4 channels here). You should then specify in your convolutional layers:
Conv2D(..., data_format="channels_last")
Also, your input shape for the first layer should be (700, 584, 4), not (256, 256, 3).
hope it works :-)
Simply put, what I'm trying to do is similar to this question: Convert RGB image to index image, but instead of a 1-channel index image, I want to get an n-channel image where img[h, w] is a one-hot encoded vector. For example, if the input image is [[[0, 0, 0], [255, 255, 255]]], and index 0 is assigned to black and 1 is assigned to white, then the desired output is [[[1, 0], [0, 1]]].
Like the previous person who asked that question, I have implemented this naively, but the code runs quite slowly, and I believe a proper solution using numpy would be significantly faster.
Also, as suggested in the previous post, I can preprocess each image into grayscale and one-hot encode the image, but I want a more general solution.
Example
Say I want to assign white to 0, red to 1, blue to 2, and yellow to 3:
(255, 255, 255): 0
(255, 0, 0): 1
(0, 0, 255): 2
(255, 255, 0): 3
and I have an image which consists of those four colors, where image is a 3D array containing the R, G, B values for each pixel:
[
[[255, 255, 255], [255, 255, 255], [255, 0, 0], [255, 0, 0]],
[[ 0, 0, 255], [255, 255, 255], [255, 0, 0], [255, 0, 0]],
[[ 0, 0, 255], [ 0, 0, 255], [255, 255, 255], [255, 255, 255]],
[[255, 255, 255], [255, 255, 255], [255, 255, 0], [255, 255, 0]]
]
and this is what I want to get, where each pixel is changed to the one-hot encoding of its index. (Since changing a 2D array of index values to a 3D array of one-hot encoded values is easy, getting a 2D array of index values is fine too.)
[
[[1, 0, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0]],
[[0, 0, 1, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0]],
[[0, 0, 1, 0], [0, 0, 1, 0], [1, 0, 0, 0], [1, 0, 0, 0]],
[[1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 1]]
]
In this example I used colors whose RGB components are either 255 or 0, but I don't want solutions to rely on that fact.
My solution looks like this and should work for arbitrary colors:
color_dict = {0: (0, 255, 255),
              1: (255, 255, 0),
              ....}

def rgb_to_onehot(rgb_arr, color_dict):
    num_classes = len(color_dict)
    shape = rgb_arr.shape[:2] + (num_classes,)
    arr = np.zeros(shape, dtype=np.int8)
    for i, cls in enumerate(color_dict):
        arr[:, :, i] = np.all(rgb_arr.reshape((-1, 3)) == color_dict[i], axis=1).reshape(shape[:2])
    return arr
def onehot_to_rgb(onehot, color_dict):
    single_layer = np.argmax(onehot, axis=-1)
    output = np.zeros(onehot.shape[:2] + (3,))
    for k in color_dict.keys():
        output[single_layer == k] = color_dict[k]
    return np.uint8(output)
I haven't tested it for speed yet, but at least, it works :)
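A quick round-trip sanity check (assuming rgb_arr is a uint8 image whose every pixel appears in color_dict):
onehot = rgb_to_onehot(rgb_arr, color_dict)
assert np.all(onehot_to_rgb(onehot, color_dict) == rgb_arr)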
We could generate the decimal equivalent of each pixel color by treating the (R, G, B) channels as bits with weights (4, 2, 1). With each channel having 0 or 255 as its value, there would be 8 possibilities in total, but it seems we are only interested in four of those colors: white (255, 255, 255) maps to 7, red (255, 0, 0) to 4, blue (0, 0, 255) to 1, and yellow (255, 255, 0) to 6, which is exactly the colors = [7, 4, 1, 6] array used below.
Then, we would have two ways to solve it:
One would involve making unique indices from those decimal equivalents, starting from 0 up to the final color, all in sequence, and finally initializing an output array and assigning into it.
The other way would be to use broadcasted comparisons of those decimal equivalents against the colors.
These two methods are listed next -
def indexing_based(a):
    b = (a == 255).dot([4,2,1])   # Decimal equivalents
    colors = np.array([7,4,1,6])  # Define colors decimal equivalents here
    idx = np.empty(colors.max()+1, dtype=int)
    idx[colors] = np.arange(len(colors))
    m,n,r = a.shape
    out = np.zeros((m,n,len(colors)), dtype=int)
    out[np.arange(m)[:,None], np.arange(n), idx[b]] = 1
    return out

def broadcasting_based(a):
    b = (a == 255).dot([4,2,1])   # Decimal equivalents
    colors = np.array([7,4,1,6])  # Define colors decimal equivalents here
    return (b[...,None] == colors).astype(int)
Sample run -
>>> a = np.array([
... [[255, 255, 255], [255, 255, 255], [255, 0, 0], [255, 0, 0]],
... [[ 0, 0, 255], [255, 255, 255], [255, 0, 0], [255, 0, 0]],
... [[ 0, 0, 255], [ 0, 0, 255], [255, 255, 255], [255, 255, 255]],
... [[255, 255, 255], [255, 255, 255], [255, 255, 0], [255, 255, 0]],
... [[255, 255, 255], [255, 0, 0], [255, 255, 0], [255, 0 , 0]]])
>>> indexing_based(a)
array([[[1, 0, 0, 0],
[1, 0, 0, 0],
[0, 1, 0, 0],
[0, 1, 0, 0]],
[[0, 0, 1, 0],
[1, 0, 0, 0],
[0, 1, 0, 0],
[0, 1, 0, 0]],
[[0, 0, 1, 0],
[0, 0, 1, 0],
[1, 0, 0, 0],
[1, 0, 0, 0]],
[[1, 0, 0, 0],
[1, 0, 0, 0],
[0, 0, 0, 1],
[0, 0, 0, 1]],
[[1, 0, 0, 0],
[0, 1, 0, 0],
[0, 0, 0, 1],
[0, 1, 0, 0]]])
>>> np.allclose(broadcasting_based(a), indexing_based(a))
True
A simple implementation involves masking the relevant pixel positions, whether it's for converting from label to color or vice-versa. I show here how to convert between dense (1-channel labels), OHE (one-hot-encoding sparse), and RGB formats. Essentially performing OHE<->RGB<->dense.
Having defined your RGB-encoded input as seg, first define the label-to-color mapping (no need for a dict here):
>>> colors = np.array([[ 255, 255, 255],
[ 255, 0, 0],
[ 0, 0, 255],
[ 255, 255, 0]])
RGB (h, w, 3) to dense (h, w)
dense = np.zeros(seg.shape[:2])
for label, color in enumerate(colors):
    dense[np.all(seg == color, axis=-1)] = label
RGB (h, w, 3) to OHE (h, w, #classes)
Similar to the previous conversion, RGB to one-hot-encoding requires two additional lines:
ohe = np.zeros((*seg.shape[:2], len(colors)))
for label, color in enumerate(colors):
    v = np.zeros(len(colors))
    v[label] = 1
    ohe[np.all(seg == color, axis=-1)] = v
dense (h, w) to RGB (h, w, 3)
rgb = np.zeros((*labels.shape, 3))
for label, color in enumerate(colors):
    rgb[labels == label] = color
OHE (h, w, #classes) to RGB (h, w, 3)
Converting from OHE to dense requires one line:
dense = ohe.argmax(-1)
Then you can simply follow dense->RGB.
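Or, fusing both steps, the argmax result can index directly into the colors array defined above:
# dense labels index straight into the (num_classes, 3) color table
rgb = colors[ohe.argmax(-1)]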