extract edge features with prewitt_h - python

I am trying to extract edge features like this:
img = io.imread('pic.jpg')
H, W, C = img.shape
features = custom_features(img)
assignments = kmeans_fast(features, num_segments)
segments = assignments.reshape((H, W))
# Display segmentation
plt.imshow(segments, cmap='viridis')
plt.axis('off')
plt.show()
custom_features:
from skimage.filters import prewitt_h, prewitt_v
def custom_features(image):
    """
    Args:
        img - array of shape (H, W, C)
    Returns:
        features - array of (H * W, C)
    """
    edges_prewitt_horizontal = prewitt_h(image)
    return edges_prewitt_horizontal
However, currently I get an error because the shape of the image is different than what is expected by the prewitt_h function.
ValueError: The parameter `image` must be a 2-dimensional array
How can I modify this inside the function such that the returned shape is as desired?

It looks like you need to pass a grayscale image to prewitt_h. The Prewitt operator convolves the image with a 2-dimensional kernel, so it needs a 2-dimensional image, and yours is 3-dimensional because it has color channels (RGB, 3 channels).
You could add a conversion to grayscale inside your custom_features function (skimage, which you are already using, has a function for that: skimage.color.rgb2gray).
from skimage.filters import prewitt_h, prewitt_v
from skimage.color import rgb2gray
def custom_features(image):
    """
    Args:
        image - array of shape (H, W, C)
    Returns:
        features - array of (H * W, C)
    """
    grayscale = rgb2gray(image)
    edges_prewitt_horizontal = prewitt_h(grayscale)
    return edges_prewitt_horizontal
This should do the trick (I assume the image that custom_features receives is always an RGB image, given the shape you defined above).
If you may also receive other image types, you can add a check such as if C == 3: so that only RGB images are converted.
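Since kmeans_fast in the question expects one feature row per pixel, the grayscale edge map probably still needs to be flattened before it is returned. A minimal sketch, assuming a single-column (H * W, 1) feature array is acceptable to the caller; the name edge_features is hypothetical, and the ndim/channel test is just one way to write the C == 3 check:

```python
import numpy as np
from skimage.color import rgb2gray
from skimage.filters import prewitt_h


def edge_features(image):
    """Return horizontal Prewitt edge responses as an (H * W, 1) array."""
    # Convert only 3-channel (RGB) input; a 2-D array is assumed to be grayscale already.
    if image.ndim == 3 and image.shape[2] == 3:
        image = rgb2gray(image)
    edges = prewitt_h(image)       # shape (H, W)
    return edges.reshape(-1, 1)    # one row per pixel


# Synthetic RGB image: dark top half, bright bottom half (a horizontal edge).
rgb = np.zeros((8, 6, 3), dtype=np.uint8)
rgb[4:] = 255
features = edge_features(rgb)
print(features.shape)
```

With this shape, assignments.reshape((H, W)) in the question's display code should line up with the original image again.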

By default, skimage.io.imread returns the read JPEG image as a shape-(M, N, 3) array, representing an RGB color image. However, the prewitt functions expect that the input is a single channel image.
To fix this, convert the image to grayscale first with skimage.color.rgb2gray before filtering. Or you could read the image directly as grayscale with skimage.io.imread(f, as_gray=True).

Related

Remove CMYK colors to keep only black from a PNG

I'm trying to remove the colors from a PNG. Is there a way to do it? My goal is to import my image into a PDF using Python. I first tried with an SVG file, but it was impossible to import: nothing appears, and there is no error. So I wanted to try with a PNG, but it is still hard to import.
Now I have an image with these color percentages:
And my final result would be this:
I already tried with OpenCV but got no result. I have been looking for a solution for a few days.
file = "app\\static\\img\\Picto CE_MAROC_H_6mm.png"
src = cv2.imread(file, cv2.IMREAD_UNCHANGED)
src[:,:,2] = np.zeros([src.shape[0], src.shape[1]])
cv2.imwrite(file,src)
Thanks in advance for your help ! :)
What does it mean to only have a K channel?
Most applications use RGB or RGBA, whereas the CMYK color space is typically used for printed material. We should first work out what it means for an image to use only the K channel.
First, let's look at the formulas to convert the CMYK color space to RGB. We will assume that C, M, Y, and K are on a 0-100 integer scale:
R = 255 * (1 - C/100) * (1 - K/100)
G = 255 * (1 - M/100) * (1 - K/100)
B = 255 * (1 - Y/100) * (1 - K/100)
Since we only care for the K channel, we will set C, Y, and M to 0. This simplifies the formulas to:
R = 255 * (1 - K/100)
G = 255 * (1 - K/100)
B = 255 * (1 - K/100)
Notice that R = G = B when only the K channel is set. This produces a gray monochrome throughout the image, effectively making it grayscale. As such, the goal is to produce a grayscale image from an RGBA image input.
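The formulas above can be checked with a few lines of Python. This is only a sketch of the arithmetic as written in this answer, with a hypothetical function name cmyk_to_rgb; it is not a color-managed conversion:

```python
def cmyk_to_rgb(c, m, y, k):
    """Convert CMYK percentages (0-100) to RGB values on a 0-255 scale."""
    r = 255 * (1 - c / 100) * (1 - k / 100)
    g = 255 * (1 - m / 100) * (1 - k / 100)
    b = 255 * (1 - y / 100) * (1 - k / 100)
    return r, g, b


# With C = M = Y = 0, every K value yields three equal components (a gray).
for k in (0, 50, 100):
    print(k, cmyk_to_rgb(0, 0, 0, k))
```

Note that K = 0 gives white and K = 100 gives black, so K grows as the pixel gets darker.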
Converting color to grayscale
Converting a color to its grayscale component is done by preserving the luminance of the original image in a gray monochrome palette. To do so, a formula must be defined which takes in an RGB input and returns a single value Y, creating a YYY color on the gray monochrome scale. This can be done by assigning each color a coefficient that scales how much of an effect it has on luminance. Since the human eye is most sensitive to G, then R, then B, we want to assign a high coefficient to G and a low coefficient to B. The most common grayscale calculation is the luma coding used for color TV and video systems:
Y = round(0.299 * R + 0.587 * G + 0.114 * B)
Converting an image to only use the K channel in Python
Knowing the above, we can convert an image to only use the K channel. For this, we can use imageio, which provides pixel information in RGB format. Since image data is given as an n-dimensional array, we can also use numpy to abstract away any loops needed to apply the grayscale conversion to every pixel.
I will be using the imageio.v3 module as that is the most recent API as of this post. Loading the image can be done by calling imageio.v3.imread and passing in the location of the image.
First, we want to get a luminance value for each pixel in the image. This can be done by taking the dot product of the image and the coefficients of the luminance formula. This will produce a 2D array, as (height, width, RGB) x (RGB) = (height, width). We also need to round the values and cast each to an unsigned 8-bit integer to get our values into the 0-255 integer color range.
import numpy as np
# For some image `im` loaded by imageio.v3.imread
# The coefficients for converting an RGB color to its luminance value
grayscale_coef = [0.299, 0.587, 0.114]
# Create a 2D array where any pixel (height, width) translates to a single luminance value
# (im[..., :3] drops the alpha channel if the image is RGBA)
grayscale = np.dot(im[..., :3], grayscale_coef)
# Round each luminance value and convert to a 0-255 range
grayscale = np.round(grayscale).astype(np.uint8)
Saving as CMYK
Now that we have the value to put into the K channel, we need to reconstruct the 3D array setting the CMY channels to 0 and then outputting to an image format that supports CMYK (JPG, TIFF, etc.). For this, we can use pillow.
from PIL import Image
# Create the CMY channels initialized to 0
cmy = np.zeros(grayscale.shape + (3,))
# Stack the CMY and K channels together
# Cast type to unsigned byte to avoid channel turning completely black
cmyk = np.dstack((cmy, grayscale)).astype(np.uint8)
# Read image from CMYK array buffer
result = Image.fromarray(cmyk, mode="CMYK")
# Save image in a supported format
result.save("<filename_here>.jpg")
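One thing worth checking: by the formulas earlier in this answer, R = G = B = 255 * (1 - K/100), so K increases as a pixel gets darker, while a luminance value increases as a pixel gets brighter. The luminance may therefore need to be inverted before it is used as K. The sketch below is one way to verify this in memory, assuming Pillow's CMYK mode uses 0 for no ink and 255 for full ink; the names luminance_to_k and k_only_image are hypothetical, and Image.frombytes is used here only to avoid relying on the mode argument of Image.fromarray:

```python
import numpy as np
from PIL import Image


def luminance_to_k(grayscale):
    """Map uint8 luminance (bright = high) to a K value (dark = high)."""
    return 255 - grayscale


def k_only_image(grayscale):
    """Build a CMYK image whose C, M and Y channels are all zero."""
    cmy = np.zeros(grayscale.shape + (3,), dtype=np.uint8)
    cmyk = np.dstack((cmy, luminance_to_k(grayscale)))
    height, width = grayscale.shape
    return Image.frombytes("CMYK", (width, height), cmyk.tobytes())


# One black pixel and one white pixel, as luminance values.
gray = np.array([[0, 255]], dtype=np.uint8)
img = k_only_image(gray)
rgb = np.array(img.convert("RGB"))
print(rgb[0, 0], rgb[0, 1])  # black stays black, white stays white
```

If the round trip through convert("RGB") shows a negative of the input instead, the inversion step is the place to look.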

Cropping image from a binary mask

I am trying to use DeepLab v3 to detect an object and mask where the actual object is.
The DeepLab model produces a resized_im (3D) and a mask seg_map (2D) of 0 and non-zero values, where 0 means background.
Currently, I can only plot the image with the mask overlaid on the object. I want to crop the object out of resized_im with a transparent background. Is there any advice on how to do this?
You can play around with the notebook here:
https://colab.research.google.com/drive/138dTpcYfne40hqrb13n_36okSGYhrJnz?usp=sharing&hl=en#scrollTo=p47cYGGOQE1W&forceEdit=true&sandboxMode=true
I also tried the answers in "How to crop image based on binary mask", but none of them seems to work in my case.
You just need to convert your segmentation mask to a boolean numpy array, then multiply the image by it. Don't forget that your image has 3 channels while the mask has only 1. It may look something like this:
# seg_map - segmentation mask from network, resized_im - your input image
mask = np.greater(seg_map, 0) # get only non-zero positive pixels/labels
mask = np.expand_dims(mask, axis=-1) # (H, W) -> (H, W, 1)
mask = np.concatenate((mask, mask, mask), axis=-1) # (H, W, 1) -> (H, W, 3), (don't like it, so if you know how to do it better, please let me know)
crops = resized_im * mask # apply mask on image
You can use a different numpy comparison function if you want to select certain labels, for example:
mask = np.equal(seg_map, 5) # to get only objects with label 5
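The channel replication can also be left to NumPy broadcasting, and since the question asks for a transparent background rather than a black one, the mask can be stored as an alpha channel. A minimal sketch, assuming resized_im is an (H, W, 3) uint8 array and seg_map an (H, W) label array; cut_out is a hypothetical helper name:

```python
import numpy as np


def cut_out(resized_im, seg_map):
    """Return an (H, W, 4) RGBA array that is transparent where seg_map == 0."""
    mask = seg_map > 0                                # (H, W) boolean
    rgb = resized_im * mask[..., np.newaxis]          # broadcasts over the 3 channels
    alpha = np.where(mask, 255, 0).astype(np.uint8)   # opaque object, transparent background
    return np.dstack((rgb, alpha)).astype(np.uint8)


# Tiny example: a 2x2 gray image where only one pixel belongs to an object.
image = np.full((2, 2, 3), 200, dtype=np.uint8)
labels = np.array([[0, 5], [0, 0]])
rgba = cut_out(image, labels)
print(rgba.shape, rgba.dtype)
```

The alpha channel only survives in formats that support it, so the result would need to be saved as PNG rather than JPEG.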

Python -- change the RGB values of the image and save as a image

I can already read the RGB values of every pixel in the image, but I don't know how to halve the RGB values and save the result as an image. Thank you in advance.
from PIL import *
def half_pixel(jpg):
    im=Image.open(jpg)
    img=im.load()
    print(im.size)
    [xs,ys]=im.size #width*height
    # Examine every pixel in im
    for x in range(0,xs):
        for y in range(0,ys):
            #get the RGB color of the pixel
            [r,g,b]=img[x,y]
# get the RGB color of the pixel
[r,g,b]=img.getpixel((x, y))
# update new rgb value
r = r + rtint
g = g + gtint
b = b + btint
value = (r,g,b)
# assign new rgb value back to pixel
img.putpixel((x, y), value)
You can do everything you want within PIL.
If you want to reduce the value of every pixel by half, you can do something like:
from PIL import Image
im = Image.open('input_filename.jpg')
# point() returns a new image, so keep the result
im = im.point(lambda x: x * .5)
im.save('output_filename.jpg')
You can see more info about point operations here: https://pillow.readthedocs.io/en/latest/handbook/tutorial.html#point-operations
Additionally, you can do arbitrary pixel manipulation through the pixel access object returned by im.load():
pixels = im.load()
pixels[x, y] = (r, g, b)
There are many ways to do this with Pillow. You can use Image.point, for example.
from PIL import Image
# Function to map over each channel (r, g, b) on each pixel in the image
def change_to_a_half(val):
    return val // 2
im = Image.open('./imagefile.jpg')
# point() returns a new image rather than modifying im in place
im = im.point(change_to_a_half)
The function is actually only called 256 times (assuming 8-bit color depth), and the resulting map is then applied to the pixels. This is much faster than running a nested loop in Python.
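The lookup-table behaviour can be observed by recording the calls. A small sketch on an in-memory image (the 64x64 size, the pixel color, and the seen list are illustrative assumptions, not part of the original answer):

```python
from PIL import Image

seen = []  # every value the mapping function is called with


def change_to_a_half(val):
    seen.append(val)
    return val // 2


im = Image.new("RGB", (64, 64), (200, 100, 51))   # 4096 pixels, 3 channels each
halved = im.point(change_to_a_half)

print(len(seen))                 # number of calls, independent of image size
print(halved.getpixel((0, 0)))   # each channel halved with integer division
```

The original image is left untouched; only the returned image carries the halved values.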
If you have Numpy and Matplotlib installed, one solution would be to convert your image to a numpy array and then e.g. save the image with matplotlib.
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
img = Image.open(jpg)
arr = np.array(img)
arr = arr/2 # divide each pixel in each channel by two
plt.imsave('output.png', arr.astype(np.uint8))
Be aware that you need to have a version of PIL >= 1.1.6

Convert a raw RGB array into a png image.

I am trying to read an image file using PIL, obtain the raw pixel values as a numpy array, and then put the values back together to form a copy of the original image. The code does not produce any runtime error, but the resulting image ("my.png") is unreadable.
from PIL import Image
import numpy as np
img_filename = "image.png"
img = Image.open(img_filename)
img = img.convert("RGB")
img.show()
aa = np.array(img.getdata())
alpha = Image.fromarray(aa,"RGB")
alpha.save('my.png')
alpha.show()
np.array(img.getdata()) gives a 2D array of shape (X, 3), where X depends on the dimensions of the original image.
Just change the relevant line of code to:
aa = np.array(img)
This will assign a 3D array to aa, and thus solve your problem.
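The difference between the two arrays is easy to see on a small in-memory image. A minimal sketch (the 4x3 size and the pixel color are arbitrary choices for illustration):

```python
import numpy as np
from PIL import Image

img = Image.new("RGB", (4, 3), (10, 20, 30))   # width 4, height 3

flat = np.array(list(img.getdata()))   # one row per pixel, layout lost
full = np.array(img)                   # rows x columns x channels

print(flat.shape, flat.dtype)
print(full.shape, full.dtype)

# If only the flat data is available, restoring the shape and dtype also works.
restored = flat.reshape(img.height, img.width, 3).astype(np.uint8)
copy = Image.fromarray(restored)
```

Image.fromarray needs both the (height, width, 3) layout and the uint8 dtype to interpret the data as RGB pixels.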

Tensorflow resize_image_with_crop_or_pad

I want to call tf.image.resize_image_with_crop_or_pad(images,height,width) to resize my input images. My input images are all 2-d numpy arrays of pixels, but the image input of resize_image_with_crop_or_pad must be a 3-d or 4-d tensor, so the call raises an error. What should I do?
Let's suppose that images is a numpy nd-array of shape [n, W, H], in which n is the number of images and W and H are the width and the height of the images.
Convert images to a tensor, in order to be able to use tensorflow functions:
tf_images = tf.constant(images)
Convert tf_images to the image data format used by tensorflow (thus from n, W, H to n, H, W)
tf_images = tf.transpose(tf_images, perm=[0,2,1])
In tensorflow, every image has a depth channel, so although you're using grayscale images, we have to add a channel dimension of size 1 as the last axis.
tf_images = tf.expand_dims(tf_images, -1)
Now you can use tf.image.resize_image_with_crop_or_pad (named tf.image.resize_with_crop_or_pad in TensorFlow 2.x) to resize the batch, which now has a shape of [n, H, W, 1] (a 4-d tensor):
resized = tf.image.resize_image_with_crop_or_pad(tf_images,height,width)
