I was watching a tutorial on a facial recognition project that uses OpenCV, NumPy, and PIL.
During training, each image is converted into a NumPy array. Why is that conversion needed?
THE CODE:
from PIL import Image
import numpy as np

PIL_IMAGE = Image.open(path).convert("L")    # load as 8-bit grayscale
image_array = np.array(PIL_IMAGE, "uint8")   # PIL image -> uint8 NumPy array
TL;DR: OpenCV stores color images as three-dimensional NumPy arrays (and grayscale images as two-dimensional arrays).
When you read in a digital image with the library, it is represented as a NumPy array whose rectangular shape corresponds to the shape of the image. Consider this image of a chair:
Here's a visualization of how this image is stored as a NumPy array in OpenCV.
If we read in the image of the chair, we can inspect its structure with image.shape, which for a color image returns a tuple of (rows, columns, channels). For a grayscale image, image.shape returns only the number of rows and columns.
import cv2

image = cv2.imread("chair.jpg")
print(image.shape)   # prints (222, 300, 3)
When indexing OpenCV images, we specify the y coordinate (row) first, then the x coordinate (column). Colors are stored as BGR values: blue in layer 0, green in layer 1, and red in layer 2. So this chair image has a height of 222 pixels, a width of 300 pixels, and 3 channels (meaning it is a color image). Essentially, whenever the library reads in an image, it stores it as a NumPy array in this format.
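For example, here is how individual pixels are indexed (a minimal sketch, assuming chair.jpg is in the working directory):

import cv2

image = cv2.imread("chair.jpg")

# Index with the y coordinate (row) first, then the x coordinate (column)
pixel = image[50, 100]      # the pixel at x=100, y=50
blue, green, red = pixel    # channels come back in BGR order

# Or read a single channel directly: layer 0 is blue
blue_only = image[50, 100, 0]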
The answer is rather simple:
With NumPy you can perform blazingly fast operations on numerical arrays, whatever their dimensions and shape.
Image processing libraries (OpenCV, PIL, scikit-image) sometimes wrap images in special formats that already use NumPy behind the scenes. If they don't, the images can be converted to NumPy arrays explicitly. Then you can run speedy numerical calculations on them (convolution, FFT, blurring, filters, ...).
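As a small illustration of that speed, once an image is a NumPy array you can touch every pixel in a single vectorized expression instead of a Python loop (a minimal sketch; the file name is hypothetical):

import numpy as np
from PIL import Image

arr = np.array(Image.open("photo.jpg").convert("L"), "uint8")

# Invert the whole image in one vectorized operation
inverted = 255 - arr

# Threshold it, again without an explicit loop over pixels
binary = np.where(arr > 128, 255, 0).astype(np.uint8)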
Related
I have some images and would like to look at the eigenvalues of each image (since an image is a matrix). My issue is that the image has shape TensorShape([577, 700, 3]).
How can I do some preprocessing to be able to compute its eigendecomposition?
My try:
import tensorflow as tf
import numpy as np
from numpy import linalg as LA
import matplotlib.pyplot as plt
image_path = tf.keras.utils.get_file('YellowLabradorLooking_new.jpg', 'https://storage.googleapis.com/download.tensorflow.org/example_images/YellowLabradorLooking_new.jpg')
image_raw = tf.io.read_file(image_path)
image = tf.image.decode_image(image_raw)
image = tf.cast(image, tf.float32)
image = tf.image.resize(image, (224, 224))
LA.eig(image)   # raises LinAlgError: the last two dimensions, (224, 3), are not square
If you have n images, and if images are of the same size, and if images are somehow centered, then you may consider that images are samples from a distribution, and you can use eigenvalue decomposition to study how different pixels in the image vary across the collection.
In this situation: say you have a collection of n [H,W] images. You can flatten images and form a [H*W, n] matrix. If the images are RGB, it can be a [H*W*3, n] array -- i.e. each pixel location and each color channel is treated as an independent dimension.
Eigenvalue decomposition will give you a collection of H*W*3-dimensional vectors, which can be reshaped back into RGB images. Computing all eigenvectors is infeasible (the covariance matrix is H*W*3 by H*W*3, which is usually huge), but calculating the top 3-5 eigenvalues and eigenvectors shouldn't be a problem even when H*W*3 is large.
You can find a more detailed description searching for "Eigenfaces"; e.g. opencv-eigenfaces-for-face-recognition, wikipedia, classic CVPR91 paper, etc.
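A minimal sketch of that idea (the image stack below is a random stand-in; with real data you would load n same-sized, roughly centered images):

import numpy as np

n, H, W = 50, 64, 64
images = np.random.rand(n, H, W, 3)       # stand-in for your image collection

X = images.reshape(n, -1).T               # [H*W*3, n] data matrix
X = X - X.mean(axis=1, keepdims=True)     # center each pixel dimension

# Trick: eigendecompose the small [n, n] Gram matrix instead of the
# huge [H*W*3, H*W*3] covariance matrix
vals, vecs = np.linalg.eigh(X.T @ X)      # eigenvalues in ascending order
top = X @ vecs[:, -5:]                    # map back to pixel space
top = top / np.linalg.norm(top, axis=0)   # 5 leading "eigenimages"
eigenimages = top.T.reshape(5, H, W, 3)   # reshape back into RGB images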
A grayscale image can be (and usually is) represented as a single matrix. A color image cannot; it is represented by three matrices, one per color channel.
That is the problem with your code snippet: LA.eig() expects a square array, or an array whose final two axes contain square arrays, but it got an array of shape (224, 224, 3).
To fix this, you can shift the two 224 axes to the end using the np.rollaxis() function. The eigenvalues and eigenvectors are then calculated separately for each color channel.
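A minimal sketch of that fix, continuing from the question's code (the TensorFlow tensor is converted to a NumPy array first):

import numpy as np
from numpy import linalg as LA

# image has shape (224, 224, 3); move the channel axis to the front
per_channel = np.rollaxis(np.array(image), 2)   # shape (3, 224, 224)

# LA.eig now sees three square 224x224 matrices, one per color channel
values, vectors = LA.eig(per_channel)
print(values.shape)    # (3, 224): 224 eigenvalues per channel
print(vectors.shape)   # (3, 224, 224)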
Let's say I have a NumPy array of shape (100, 100, 3) that represents an image in RGB encoding. How do I iterate over the individual pixels of this image?
Specifically, I want to map this image with a function.
Note: I got that array from OpenCV.
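In case a sketch helps: you can iterate explicitly, but if the mapping function can work on whole arrays, a vectorized NumPy expression is much faster (the halving function below is just an example):

import numpy as np

image = np.zeros((100, 100, 3), dtype=np.uint8)   # stand-in for the real image

# Explicit iteration: simple, but slow for large images
for y in range(image.shape[0]):
    for x in range(image.shape[1]):
        r, g, b = image[y, x]
        image[y, x] = (r // 2, g // 2, b // 2)

# Vectorized equivalent: one expression over the whole array
image = image // 2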
I am working on hair removal from skin lesion images. Is there any way to convert a binary image back to RGB?
Original Image:
Mask Image:
I just want to restore the black area with the original image.
As far as I know, binary images are stored as grayscale in OpenCV, with values 0 and 255.
To create "dummy" RGB images you can do:
rgb_img = cv2.cvtColor(binary_img, cv2.COLOR_GRAY2RGB)
I call them "dummy" since in these images the red, green, and blue values are all the same.
Something like this, but your mask is the wrong size (200x200 px) so it doesn't match your image (600x450 px):
#!/usr/local/bin/python3
from PIL import Image
import numpy as np

# Open the input image as a numpy array
npImage = np.array(Image.open("image.jpg"))

# Open the mask image as a numpy array
npMask = np.array(Image.open("mask2.jpg").convert("RGB"))

# Make a boolean array identifying where the mask is black
cond = npMask < 128

# Select image or mask according to the condition array
pixels = np.where(cond, npImage, npMask)

# Save the resulting image
result = Image.fromarray(pixels)
result.save('result.png')
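If the mask really is a different size, one option is to resize it to match the image before combining (a sketch using PIL's resize, which takes (width, height)):

mask = Image.open("mask2.jpg").convert("RGB")
npMask = np.array(mask.resize((npImage.shape[1], npImage.shape[0])))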
I updated Daniel Tremer's answer:
import cv2
opencv_rgb_img = cv2.cvtColor(opencv_image, cv2.COLOR_GRAY2RGB)
opencv_image will be a two-dimensional array of shape [height, width], because a binary image has a single channel.
opencv_rgb_img will be a three-dimensional array of shape [height, width, color channels], because of RGB.
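A quick check of those shapes (the file name is hypothetical):

import cv2

opencv_image = cv2.imread("mask.png", cv2.IMREAD_GRAYSCALE)
print(opencv_image.shape)      # e.g. (450, 600): two dimensions

opencv_rgb_img = cv2.cvtColor(opencv_image, cv2.COLOR_GRAY2RGB)
print(opencv_rgb_img.shape)    # (450, 600, 3): channel axis added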
I am learning TensorFlow and Python. I tried reading an image from a file and then displaying that image using matplotlib. Here is my code.
import matplotlib.pyplot as plt
import tensorflow as tf

# read and decode the image
image_contents = tf.read_file('elephant.jpeg')
image = tf.image.decode_jpeg(image_contents, channels=3)

with tf.Session() as sess:
    img = sess.run(image)
    print(img)
    plt.axis('off')
    plt.imshow(img)
    plt.show()
This also prints a huge array, which I understand holds the RGB values for each pixel. Now I am trying to modify pixel values individually. I can modify all the pixel values at once using tf operations, but I am not able to operate on individual pixel values.
For example, I have been trying to make the image grayscale, so I want to replace the R, G, and B values of each pixel with the average of the three. How do I do that?
I also want to know if I should be focusing on Python or TensorFlow.
You can directly convert the image to grayscale with Pillow:
from PIL import Image
img = Image.open('/some path/image.png').convert('L')
I prefer preprocessing images with NumPy before feeding them into TensorFlow.
I am not sure which shape your array has; I would suggest converting the image to a 2-D NumPy array. In the example below I am converting a list of pixels (shape=[784]) to an array of shape 28x28. Afterwards you can directly perform operations on each pixel.
image = np.reshape(img, (28, 28)).astype(np.uint8)
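To address the original question, replacing the R, G, and B values of every pixel with their average is a one-liner once the image is a NumPy array. A minimal sketch (the array below is a random stand-in for a real image):

import numpy as np

img = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)  # stand-in image

# Average the three channels for every pixel at once
gray = img.mean(axis=2).astype(np.uint8)                  # shape (224, 224)

# Broadcast back to three identical channels if a 3-channel image is needed
gray_rgb = np.repeat(gray[:, :, np.newaxis], 3, axis=2)   # shape (224, 224, 3)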
I have a 2-D array that I want to create an image from. I want to transform the image array of dimensions 140x120 into an array of 140x120x3 by stacking the same array three times (to get a grayscale image to use with skimage).
I tried the following:
image = np.uint8([image, image, image])
which results in a 3x140x120 array. How can I reorder the axes to get 140x120x3 instead?
np.dstack([image, image, image]) (docs) will return an array of the desired shape, but whether this has the right semantics for your application depends on your image generation library.
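For instance, a quick sketch:

import numpy as np

image = np.zeros((140, 120), dtype=np.uint8)   # stand-in for your 2-D array

stacked = np.dstack([image, image, image])
print(stacked.shape)    # (140, 120, 3)

# np.stack with axis=-1 is an equivalent alternative
stacked2 = np.stack([image, image, image], axis=-1)
print(stacked2.shape)   # (140, 120, 3)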