Set masked pixels in a 3D RGB numpy array - python

I'd like to set all pixels matching some condition in a 3d numpy array (RGB image) using a mask. I have something like this:
import numpy as np

def make_dot(img, color, radius):
    """Make a dot of given color in the center of img (rgb numpy array)"""
    (ydim, xdim, dummy) = img.shape
    # make an open grid of x, y
    y, x = np.ogrid[0:ydim, 0:xdim]
    y -= ydim // 2  # centered at the origin
    x -= xdim // 2
    # now make a mask
    mask = x**2 + y**2 <= radius**2  # start with 2d
    mask.shape = mask.shape + (1,)   # make it 3d
    print(img[mask].shape)
    img[mask] = color

img = np.zeros((100, 200, 3))
make_dot(img, np.array((.1, .2, .3)), 25)
but that gives "ValueError: array is not broadcastable to correct shape" on this line:
img[mask] = color
because the shape of img[mask] is (1961,); i.e. it is flattened to contain only the "valid" pixels, which makes sense. But how can I make it "write through the mask", as it were, to set only the pixels where the mask is True? Note that I want to write three values at once to each pixel (the last dim).

You almost have it right.
(ydim, xdim, dummy) = img.shape
# make an open grid of x, y
y, x = np.ogrid[0:ydim, 0:xdim]
y -= ydim // 2  # centered at the origin
x -= xdim // 2
# now make a mask
mask = x**2 + y**2 <= radius**2  # a 2d mask is all you need
img[mask, :] = color
The extra ",:" at the end of the assignment lets you assign the color across all 3 channels in one shot.
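Putting it together, a minimal corrected version of the whole function might look like this (a sketch, with the same inputs as in the question):
import numpy as np

def make_dot(img, color, radius):
    """Make a dot of given color in the center of img (rgb numpy array)"""
    ydim, xdim, _ = img.shape
    y, x = np.ogrid[0:ydim, 0:xdim]
    y -= ydim // 2  # centered at the origin
    x -= xdim // 2
    mask = x**2 + y**2 <= radius**2  # 2d mask, one entry per pixel
    img[mask, :] = color             # broadcasts color across the 3 channels

img = np.zeros((100, 200, 3))
make_dot(img, np.array((.1, .2, .3)), 25)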

How to generate a mask using Pillow's Image.load() function

I want to create a mask based on certain pixel values. For example: every pixel where B > 200
The Image.load() method seems to be exactly what I need for identifying the pixels with these values, but I can't seem to figure out how to take all these pixels and create a mask image out of them.
R, G, B = 0, 1, 2
pixels = self.input_image.get_value().load()
width, height = self.input_image.get_value().size
for y in range(0, height):
    for x in range(0, width):
        if pixels[x, y][B] > 200:
            print("%s - %s's blue is more than 200" % (x, y))
I meant for you to avoid for loops and just use Numpy. So, starting with this image:
from PIL import Image
import numpy as np
# Open image
im = Image.open('colorwheel.png')
# Make Numpy array
ni = np.array(im)
# Mask pixels where Blue > 200
blues = ni[:,:,2]>200
# Save logical mask as PNG
Image.fromarray((blues*255).astype(np.uint8)).save('result.png')
If you want to make the masked pixels black, use:
ni[blues] = 0
Image.fromarray(ni).save('result.png')
You can make more complex, compound tests against ranges like this:
#!/usr/bin/env python3
from PIL import Image
import numpy as np
# Open image
im = Image.open('colorwheel.png')
# Make Numpy array
ni = np.array(im)
# Mask pixels where 100 < Blue < 200
blues = (ni[:,:,2] > 100) & (ni[:,:,2] < 200)
# Save logical mask as PNG
Image.fromarray((blues*255).astype(np.uint8)).save('result.png')
You can also make a condition on Reds, Greens and Blues and then use Numpy's np.logical_and() and np.logical_or() to make compound conditions, e.g.:
bluesHi = ni[:,:,2] > 200
redsLo = ni[:,:,0] < 50
mask = np.logical_and(bluesHi,redsLo)
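A quick sketch of applying such a compound mask, e.g. to black out the matching pixels of the same ni array as above:
ni[mask] = 0
Image.fromarray(ni).save('result.png')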
Thanks to the reply from Mark Setchell, I solved it by making a numpy array the same size as my image, filled with zeros. Then, for every pixel where B > 200, I set the corresponding value in the array to 255. Finally, I converted the numpy array to a PIL image in the same mode as my input image.
R, G, B = 0, 1, 2
pixels = self.input_image.get_value().load()
width, height = self.input_image.get_value().size
mode = self.input_image.get_value().mode
mask = np.zeros((height, width), dtype=np.uint8)  # uint8 so Image.fromarray works
for y in range(0, height):
    for x in range(0, width):
        if pixels[x, y][B] > 200:
            mask[y, x] = 255
mask_image = Image.fromarray(mask).convert(mode)
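For reference, the same mask can be built without the Python loops, along the lines of the answer above (a sketch, assuming the same self.input_image object):
ni = np.array(self.input_image.get_value())
mask = ((ni[:, :, B] > 200) * 255).astype(np.uint8)
mask_image = Image.fromarray(mask).convert(mode)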

Extract rectangular boxes with padding

I'm trying to extract values From a 2d Tensor inside multiple rectangular regions. I want to crop rectangular regions while setting all values outside the box to zero.
For example from the 9 x 9 image I want to get two separate images with values inside the two rectangular red boxes, while setting the rest of the values to zero. Is there a convenient way to do this with tensorflow slicing?
One way I thought of approaching this is defining a mask array that is 1 inside the box and 0 outside, and multiplying it with the input array. But this requires looping over the number of boxes, each time changing which values of the mask are set to 0. Is there a faster and more efficient way to do this without using for loops? Is there an equivalent of a crop-and-replace function in tensorflow? Here's the code I have with the for loop. I appreciate any input on this. Thanks
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as patches

tf.reset_default_graph()

size = 9       # size of input image
num_boxes = 2  # number of rectangular boxes

def get_cutout(X, bboxs):
    """Returns copies of X with values only inside bboxs"""
    out = []
    for i in range(num_boxes):
        bbox = bboxs[i]  # get rectangular box coordinates
        Y = tf.Variable(np.zeros((size, size)), dtype=tf.float32)  # define temporary mask
        # set values of mask inside box to 1
        t = [Y[bbox[0]:bbox[2], bbox[1]:bbox[3]].assign(
            tf.ones((bbox[2]-bbox[0], bbox[3]-bbox[1])))]
        with tf.control_dependencies(t):
            mask = tf.identity(Y)
        out.append(X * mask)  # get values inside rectangular box
    return out, X

# define a 9x9 input image X and convert to tensor
in_x = np.eye(size)
in_x[0:3] = np.random.rand(3, 9)
X = tf.constant(in_x, dtype=tf.float32)
bboxs = tf.placeholder(tf.int32, [None, 4])  # placeholder for rectangular boxes
X_outs = get_cutout(X, bboxs)

# coordinates of each box: (y1, x1, y2, x2), i.e. (top, left, bottom, right)
in_bbox = [[1, 3, 3, 6], [4, 3, 7, 8]]
feed_dict = {bboxs: in_bbox}

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    x_out = sess.run(X_outs, feed_dict=feed_dict)

# plot results
vmin = np.min(x_out[1])
vmax = np.max(x_out[1])
fig, ax = plt.subplots(nrows=1, ncols=1+len(in_bbox), figsize=(10, 2))
im = ax[0].imshow(x_out[1], vmin=vmin, vmax=vmax, origin='lower')
plt.colorbar(im, ax=ax[0])
ax[0].set_title("input X")
for i, bbox in enumerate(in_bbox):
    bottom_left = (bbox[1]-0.5, bbox[0]-0.5)
    width = bbox[3]-bbox[1]
    height = bbox[2]-bbox[0]
    rect = patches.Rectangle(bottom_left, width, height,
                             linewidth=1, edgecolor='r', facecolor='none')
    ax[0].add_patch(rect)
    ax[i+1].set_title("extract values in box {}".format(i+1))
    im = ax[i+1].imshow(x_out[0][i], vmin=vmin, vmax=vmax, origin='lower')
    plt.colorbar(im, ax=ax[i+1])
Thanks for that really nice function @edkevekeh. I've had to modify it slightly to get it to do what I want. For one, I couldn't iterate over boxes, which is a Tensor object. Also, the crop size is determined by the box and isn't always 3x3. And tf.boolean_mask returns only the crop, whereas I want to keep the crop but replace everything outside it with 0, so I replaced the tf.boolean_mask with a multiplication.
For my use case num_boxes can be large, so I wanted to know if there was a more efficient way than a for loop; I guess not. Here is my modified version of @edkevekeh's solution, if anyone else needs it:
def extract_with_padding(image, boxes):
    """
    image: tensor containing the initial image
    boxes: tensor of shape [num_boxes, 4]; each box is [y1, x1, y2, x2],
        where [y1, x1] and [y2, x2] are the coordinates of the top-left
        and bottom-right corners of the extracted part
    """
    extracted = []
    shape = tf.shape(image)
    for i in range(boxes.shape[0]):
        b = boxes[i]
        crop = tf.ones([b[2] - b[0], b[3] - b[1]])
        mask = tf.pad(crop, [[b[0], shape[0] - b[2]], [b[1], shape[1] - b[3]]])
        extracted.append(image * mask)
    return extracted
The mask can be created using tf.pad:
crop = tf.ones([3, 3])
# "before_axis_0" is how much padding is added before the cropping zone along axis 0
# "after_axis_0" is how much padding is added after the cropping zone along axis 0
# (and likewise for axis 1)
mask = tf.pad(crop, [[before_axis_0, after_axis_0], [before_axis_1, after_axis_1]])
tf.boolean_mask(image, tf.cast(mask, tf.bool))  # extracts the values inside the box
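For a concrete sketch, a 3x3 crop whose top-left corner sits at row 2, column 4 of a 9x9 image would be padded like this (the numbers are illustrative):
crop = tf.ones([3, 3])
# 2 rows above, 9 - (2 + 3) = 4 rows below; 4 columns left, 9 - (4 + 3) = 2 columns right
mask = tf.pad(crop, [[2, 4], [4, 2]])  # resulting mask has shape (9, 9)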
To have the same behavior as tf.image.crop_and_resize, here is a function that will take an array of boxes and will return an array of extracted images with padding.
def extract_with_padding(image, boxes):
    """
    image: tensor containing the initial image
    boxes: tensor of shape [num_boxes, 4]; each box is [y1, x1, y2, x2],
        where [y1, x1] and [y2, x2] are the coordinates of the top-left
        and bottom-right corners of the extracted part
    """
    extracted = []
    shape = tf.shape(image)
    for b in boxes:
        crop = tf.ones([3, 3])
        mask = tf.pad(crop, [[b[0], shape[0] - b[2]], [b[1], shape[1] - b[3]]])
        extracted.append(tf.boolean_mask(image, tf.cast(mask, tf.bool)))
    return extracted

Most optimized way to filter patch positions in an image

So my problem is this: I have an RGB image as a numpy array of dimensions (4086, 2048, 3). I split this image into 96x96 patches and get back the positions of these patches in a numpy array; I always get 96x96 patches in every case. If the dimensions of the image don't allow "pure" 96x96 patches on the x or y axis, I add a left padding so that the last patches overlap a bit with the patches before them.
Now, with these positions in hand, I want to get rid of all 96x96 patches whose RGB value is 255 in all three channels for every pixel in the patch, in the fastest way possible, and get back the positions of all the patches which don't have this value.
I would like to know:
1. What is the fastest way to extract the 96x96 patch positions from the image dimensions? (for now I have a for loop)
2. How can I get rid of pure-white patches (with value 255 on all 3 channels) in the most efficient way? (for now I have a for loop)
I have a lot of these images to process, with resolutions going up to (39706, 94762, 3), so my for loops quickly become inefficient here. Thanks for your help! (I'll take solutions which make use of the GPU too)
Here is the pseudocode to give you an idea of how it's done for now:
crop_size = 96
patches = []
patch_y = 0
y_limit = False
slide_width = 4086
slide_height = 2048
# Let's imagine this image_slide has 96x96 patches whose value is 255
image_slide = np.random.rand(slide_width, slide_height, 3)
while patch_y < slide_height:
    patch_x = 0
    x_limit = False
    while patch_x < slide_width:
        # Check whether the patch at the given position has all 3 RGB
        # channels equal to 255
        is_white = PatchExtractor.is_white(patch_x, patch_y, image_slide)
        # Add the patch's position to the list if it's not white
        if not is_white:
            patches.append((patch_x, patch_y))
        if not x_limit and patch_x + crop_size > slide_width - crop_size:
            patch_x = slide_width - crop_size
            x_limit = True
        else:
            patch_x += crop_size
    if not y_limit and patch_y + crop_size > slide_height - crop_size:
        patch_y = slide_height - crop_size
        y_limit = True
    else:
        patch_y += crop_size
# patches now holds all non-white patch positions
Ideally, I would like to get my patch positions outside a for loop. Once I have them, I can test whether they are white outside a for loop as well, with as few numpy calls as possible (so the work stays in numpy's C layer and doesn't go back and forth to Python).
As you suspected you can vectorize all of what you're doing. It takes roughly a small integer multiple of the memory need of your original image. The algorithm is quite straightforward: pad your image so that an integer number of patches fit in it, cut it up into patches, check if each patch is all white, keep the rest:
import numpy as np

# generate some dummy data and shapes
imsize = (1024, 2048)
patchsize = 96
image = np.random.randint(0, 256, size=imsize + (3,), dtype=np.uint8)
# seed some white patches: cut a square white hole in the random noise
image[image.shape[0]//2:3*image.shape[0]//2, image.shape[1]//2:3*image.shape[1]//2] = 255

# pad the image to the necessary size; memory imprint similar size as the input image
# white pad for simplicity for now
nx, ny = (np.ceil(dim/patchsize).astype(int) for dim in imsize)  # number of patches
if imsize[0] % patchsize or imsize[1] % patchsize:
    # we need to pad along at least one dimension
    padded = np.pad(image, ((0, nx * patchsize - imsize[0]),
                            (0, ny * patchsize - imsize[1]), (0, 0)),
                    mode='constant', constant_values=255)
else:
    # no padding needed
    padded = image

# reshape padded image according to patches; doesn't copy memory
patched = padded.reshape(nx, patchsize, ny, patchsize, 3).transpose(0, 2, 1, 3, 4)
# patched is shape (nx, ny, patchsize, patchsize, 3)
# appending .copy() as a last step to the above will copy memory but might speed up
# the next step; time it to find out

# check for white patches; memory imprint the same size as the padded image
filt = ~(patched == 255).all((2, 3, 4))
# filt is a bool array, one entry per patch, that tells us if the patch is
# _not_ all white (i.e. we want to keep it)
patch_x, patch_y = filt.nonzero()  # patch indices of non-whites from 0 to nx-1, 0 to ny-1
patch_pixel_x = patch_x * patchsize  # pixel indices of each patch's corner
patch_pixel_y = patch_y * patchsize
patches = np.array([patch_pixel_x, patch_pixel_y]).T
# shape (npatch, 2) which is compatible with a list of tuples

# if you want the actual patches as well:
patch_images = patched[filt, ...]
# shape (npatch, patchsize, patchsize, 3);
# patch_images[i, ...] is an image with patchsize * patchsize pixels
As you can see, in the above I used white padding to get a congruent padded image. I believe this is in line with the philosophy of what you're trying to do. If you want to replicate what you're doing in the loop exactly, you can pad your image manually using the overlapping pixels that you'd take into account near the edge. You'd need to allocate a padded image of the right size, then manually slice the overlapping pixels of the original image in order to set the edge pixels in the padded result.
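A rough sketch of that manual padding, reusing nx, ny, patchsize, imsize and image from above (exactly how you want the overlap to behave may differ):
pad_x = nx * patchsize - imsize[0]
pad_y = ny * patchsize - imsize[1]
padded = np.empty((nx * patchsize, ny * patchsize, 3), dtype=image.dtype)
padded[:imsize[0], :imsize[1]] = image
if pad_x:
    # fill the bottom margin with the last rows of the original image
    padded[imsize[0]:, :imsize[1]] = image[-pad_x:, :]
if pad_y:
    # fill the right margin with the last columns of the original image
    padded[:imsize[0], imsize[1]:] = image[:, -pad_y:]
if pad_x and pad_y:
    padded[imsize[0]:, imsize[1]:] = image[-pad_x:, -pad_y:]  # bottom-right corner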
Since you mentioned that your images are huge and consequently padding leads to far too much memory use, you can avoid padding with some elbow grease. You can use slices of your huge image (which doesn't create a copy), but then you have to manually handle the edges where you don't have full slices. Here's how:
def get_patches(img, patchsize):
    """Compute patches on an input image without padding: assume "congruent" patches.
    Returns an array shaped (npatch, 2) of patch pixel positions."""
    mx, my = (val//patchsize for val in img.shape[:-1])
    patched = img[:mx*patchsize, :my*patchsize, :].reshape(mx, patchsize, my, patchsize, 3)
    filt = ~(patched == 255).all((1, 3, 4))
    patch_x, patch_y = filt.nonzero()  # patch indices of non-whites from 0 to mx-1, 0 to my-1
    patch_pixel_x = patch_x * patchsize  # pixel indices of each patch's corner
    patch_pixel_y = patch_y * patchsize
    patches = np.stack([patch_pixel_x, patch_pixel_y], axis=-1)
    return patches
# fix the patches that fit inside the image
patches = get_patches(image, patchsize)

# fix edge patches if necessary
all_patches = [patches]
if imsize[0] % patchsize:
    # then we have edge patches along the first dim
    tmp_patches = get_patches(image[-patchsize:, ...], patchsize)
    # correct indices
    all_patches.append(tmp_patches + [imsize[0] - patchsize, 0])
if imsize[1] % patchsize:
    # same along second dim
    tmp_patches = get_patches(image[:, -patchsize:, :], patchsize)
    # correct indices
    all_patches.append(tmp_patches + [0, imsize[1] - patchsize])
if imsize[0] % patchsize and imsize[1] % patchsize:
    # then we have a corner patch we still have to fix
    tmp_patches = get_patches(image[-patchsize:, -patchsize:, :], patchsize)
    # correct indices
    all_patches.append(tmp_patches + [imsize[0] - patchsize, imsize[1] - patchsize])

# gather all the patches into an array of shape (npatch, 2)
patches = np.vstack(all_patches)

# if you also want to grab the actual patch values without looping:
xw, yw = np.mgrid[:patchsize, :patchsize]
patch_images = image[patches[:, 0, None, None] + xw, patches[:, 1, None, None] + yw, :]
# shape (npatch, patchsize, patchsize, 3);
# patch_images[i, ...] is an image with patchsize * patchsize pixels
This will also exactly replicate your looping code, since we're explicitly taking the edge patches such that they overlap with the previous patches (there's no spurious white padding). If you want to have the patches in a given order you'll have to sort them now, though.
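For instance, a small sketch of sorting them row-major (by row first, then column):
order = np.lexsort((patches[:, 1], patches[:, 0]))  # the last key in lexsort is the primary key
patches = patches[order]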

How to use numpy.polyfit to fit a graph?

I have an image below. Its shape is 720x1280. I want to draw a line to fit this white pattern.
I used the y range instead of x because the pattern is easier to fit as a 2nd-order polynomial in y.
y_range = np.linspace(0, 719, num=720) # to cover same y-range as image
fit = np.polyfit(y_range, image, 2) # image.shape = (720, 1280)
print(fit.shape) # (3, 1280)
I expect fit.shape = (3,), but it's not.
1. Can np.polyfit() be used in this situation?
2. If 1. is true, how do I do it? I want to use fit to calculate the curve as follows:
f = fit[0]*y_range**2 + fit[1]*y_range + fit[2]
Thank you.
Your image is 2-D, and that is the problem: the 2-D image contains information about the coordinates of each point, so you only have to put it into a suitable format.
Since it seems that you are interested only in the location of the white pixels (and not the particular value of each pixel), convert the image into binary values. I don't know the particular values of your image, but you could do, for example:
import numpy as np

cutoff_value = 0.1  # this is particular to your image
image[image > cutoff_value] = 1   # white pixels
image[image <= cutoff_value] = 0  # black pixels
Get the coordinates of the white pixels:
coordinates = np.where(image == 1)
y_range = coordinates[0]
x_range = coordinates[1]
fit = np.polyfit(y_range, x_range, 2)
print(fit.shape)
Returns (3, ) as you would expect.
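To then evaluate the fitted curve the way you intended, np.polyval applies the coefficients in the same highest-power-first order that np.polyfit returns (a small sketch):
y_range = np.linspace(0, 719, num=720)
x_fit = np.polyval(fit, y_range)  # same as fit[0]*y_range**2 + fit[1]*y_range + fit[2]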

How can I create a circular mask for a numpy array?

I am trying to circular mask an image in Python. I found some example code on the web, but I'm not sure how to change the maths to get my circle in the correct place.
I have an image image_data of type numpy.ndarray with shape (3725, 4797, 3):
total_rows, total_cols, total_layers = image_data.shape
X, Y = np.ogrid[:total_rows, :total_cols]
center_row, center_col = total_rows/2, total_cols/2
dist_from_center = (X - total_rows)**2 + (Y - total_cols)**2
radius = (total_rows/2)**2
circular_mask = (dist_from_center > radius)
I see that this code applies the Euclidean distance to calculate dist_from_center, but I don't understand the X - total_rows and Y - total_cols part. This produces a mask that is a quarter of a circle, centered on the top-left of the image.
What role are X and Y playing on the circle? And how can I modify this code to produce a mask that is centered somewhere else in the image instead?
The algorithm you got online is partly wrong, at least for your purposes. If we have the following image, we want it masked like so:
The easiest way to create a mask like this is how your algorithm goes about it, but it's not presented in the way that you want, nor does it give you the ability to modify it in an easy way. What we need to do is look at the coordinates for each pixel in the image, and get a true/false value for whether or not that pixel is within the radius. For example, here's a zoomed in picture showing the circle radius and the pixels that were strictly within that radius:
Now, to figure out which pixels lie inside the circle, we'll need the indices of each pixel in the image. The function np.ogrid() gives two vectors, each containing the pixel locations (or indices): a column vector holding the row indices and a row vector holding the column indices:
>>> np.ogrid[:4, :5]
[array([[0],
        [1],
        [2],
        [3]]), array([[0, 1, 2, 3, 4]])]
This format is useful for broadcasting, so that if we use them in certain functions, it will actually create a grid of all the indices instead of just those two vectors. We can thus use np.ogrid() to create the indices (or pixel coordinates) of the image, and then check each pixel coordinate to see if it's inside or outside the circle. To tell whether it's inside the circle, we can simply find the Euclidean distance from the center to every pixel location; if that distance is less than the circle radius, we'll mark that pixel as included in the mask, and if it's greater, we'll exclude it.
Now we've got everything we need to make a function that creates this mask. Furthermore we'll add a little bit of nice functionality to it; we can send in the center and the radius, or have it automatically calculate them.
def create_circular_mask(h, w, center=None, radius=None):
    if center is None:  # use the middle of the image
        center = (int(w/2), int(h/2))
    if radius is None:  # use the smallest distance between the center and image walls
        radius = min(center[0], center[1], w-center[0], h-center[1])

    Y, X = np.ogrid[:h, :w]
    dist_from_center = np.sqrt((X - center[0])**2 + (Y - center[1])**2)

    mask = dist_from_center <= radius
    return mask
In this case, dist_from_center is a matrix with the specified height and width. It broadcasts the column and row index vectors into a matrix, where the value at each location is the distance from the center. If we were to visualize this matrix as an image (scaling it into the proper range), it would be a gradient radiating from the center we specify:
So when we compare it to radius, it's identical to thresholding this gradient image.
Note that the final mask is a matrix of booleans; True if that location is within the radius from the specified center, False otherwise. So we can then use this mask as an indicator for a region of pixels we care about, or we can take the opposite of that boolean (~ in numpy) to select the pixels outside that region. So using this function to color pixels outside the circle black, like I did up at the top of this post, is as simple as:
h, w = img.shape[:2]
mask = create_circular_mask(h, w)
masked_img = img.copy()
masked_img[~mask] = 0
But if we wanted to create a circular mask at a different point than the center, we could specify it (note that the function is expecting the center coordinates in x, y order, not the indexing row, col = y, x order):
center = (int(w/4), int(h/4))
mask = create_circular_mask(h, w, center=center)
Which, since we're not giving a radius, would give us the largest radius so that the circle would still fit in the image bounds:
Or we could let it calculate the center but use a specified radius:
radius = h/4
mask = create_circular_mask(h, w, radius=radius)
Giving us a centered circle with a radius that doesn't extend exactly to the smallest dimension:
And finally, we could specify any radius and center we wanted, including a radius that extends outside the image bounds (and the center can even be outside the image bounds!):
center = (int(w/4), int(h/4))
radius = h/2
mask = create_circular_mask(h, w, center=center, radius=radius)
What the algorithm you found online does is equivalent to setting the center to (0, 0) and setting the radius to h:
mask = create_circular_mask(h, w, center=(0, 0), radius=h)
I'd like to offer a way to do this that doesn't involve the np.ogrid() function. I'll crop an image called "robot.jpg", which is 491 x 491 pixels. For readability I'm not going to define as many variables as I would in a real program:
Import libraries:
import matplotlib.pyplot as plt
from matplotlib import image
import numpy as np
Import the image, which I'll call "z". This is a color image so I'm also pulling out just a single color channel. Following that, I'll display it:
z = image.imread('robot.jpg')
z = z[:,:,1]
zimg = plt.imshow(z,cmap="gray")
plt.show()
robot.jpg as displayed by matplotlib.pyplot
To wind up with a numpy array (image matrix) with a circle in it to use as a mask, I'm going to start with this:
x = np.linspace(-10, 10, 491)
y = np.linspace(-10, 10, 491)
x, y = np.meshgrid(x, y)
x_0 = -3
y_0 = -6
mask = np.sqrt((x-x_0)**2+(y-y_0)**2)
Note the equation of a circle on that last line, where x_0 and y_0 define the center point of the circle in a grid which is 491 elements tall and wide. Because I defined the grid to go from -10 to 10 in both x and y, it is within that system of units that x_0 and y_0 set the center point of the circle with respect to the center of the image.
To see what that produces I run:
maskimg = plt.imshow(mask,cmap="gray")
plt.show()
Our "proto" masking circle
To turn that into an actual binary-valued mask, I'm just going to take every pixel below a certain value and set it to 0, and take every pixel above a certain value and set it to 256. The "certain value" will determine the radius of the circle in the same units defined above, so I'll call that 'r'. Here I'll set 'r' to something and then loop through every pixel in the mask to determine if it should be "on" or "off":
r = 7
for x in range(0, 491):
    for y in range(0, 491):
        if mask[x, y] < r:
            mask[x, y] = 0
        elif mask[x, y] >= r:
            mask[x, y] = 256
maskimg = plt.imshow(mask, cmap="gray")
plt.show()
The mask
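As an aside, the same thresholding can be done without the explicit loops; a one-line sketch using the same mask and r as above:
mask = np.where(mask < r, 0, 256)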
Now I'll just multiply the mask by the image element-wise, then display the result:
z_masked = np.multiply(z,mask)
zimg_masked = plt.imshow(z_masked,cmap="gray")
plt.show()
To invert the mask I can just swap the 0 and the 256 in the thresholding loop above, and if I do that I get:
Masked version of robot.jpg
The other answers work, but they are slow, so I will propose an answer using skimage.draw.disk. It's faster and I find it simple to use: simply specify the center of the circle and the radius, then use the output to create a mask.
import numpy as np
from skimage.draw import disk

mask = np.zeros((10, 10), dtype=np.uint8)
row = 4
col = 5
radius = 5
# disk() takes the center as a (row, col) tuple; passing shape= clips the
# disk to the array bounds
rr, cc = disk((row, col), radius, shape=mask.shape)
mask[rr, cc] = 1
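A quick sketch of using the result, e.g. to black out everything outside the circle in a same-sized RGB image (the dummy image here is just for illustration):
img = np.random.randint(0, 256, size=(10, 10, 3), dtype=np.uint8)  # dummy image
img[mask == 0] = 0  # the boolean index broadcasts across the color channels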
