I have recorded some data as an npy file, and I tried to display the image (data[0]) to check whether it makes sense, using the following code:
import numpy as np
import cv2
train_data = np.load('c:/data/train_data.npy')
for data in train_data:
    output = data[1]
    # only take the height, width and channels of the 4 dimensional array
    image = data[0][0, :, :, :]
    # image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    cv2.imshow('test', image)
    print('output {}'.format(output))
    if cv2.waitKey(25) & 0xFF == ord('q'):
        cv2.destroyAllWindows()
        break
But if I display the images without the line image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB), the colors look as if the channels were in BGR order. If I uncomment this line, the images are displayed correctly.
My question: Does this observation imply that the image array is already in BGR format? Or does it imply that cv2.imshow() by default interprets the array as a BGR array?
Matplotlib and Numpy read images into RGB and process them as RGB. OpenCV reads images into BGR and processes them as BGR. Both systems recognize a range of input types, have ways to convert between color spaces of almost any type, and support a variety of image processing tasks.
This gives three different ways to load an image (plt.imread(), ndimage.imread() and cv2.imread()), two systems for processing the data (Numpy and CV2), and two ways to display the image (plt.imshow() and cv2.imshow()), and really, there is a third way to display the image using pyplot, if you want to treat the image as numerical data in 2-d plus another dimension for each color.
Here is some simple code to demonstrate some of this.
#!/usr/bin/python
import matplotlib.pyplot as plt
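# Note: scipy.ndimage.imread was removed in SciPy 1.2; on newer versions, plt.imread('index.jpg') is one alternative that also returns RGB.
from scipy.ndimage import imread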
import numpy as np
import cv2
img = imread('index.jpg')
print( "img data type: %s shape %s"%( type(img), str( img.shape) ) )
plt.imshow( img )
plt.title( 'pyplot as read' )
plt.savefig( 'index.plt.raw.jpg' )
cv2.imshow('cv2, read by numpy', img)
cv2.imwrite('index.cv2.raw.jpg',img)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
cv2.imshow('after conversion', img)
cv2.imwrite('index.cv2.bgr2rgb.jpg',img)
This generates the following line of text, and the following three example image files.
img data type: <type 'numpy.ndarray'> shape (225, 225, 3)
The correct image has red as the upper circle. We read the image into a numpy array, using ndimage.imread(), and show it with Pyplot's imshow() and get the correct image. We then show it with cv2.imshow() and we see that the red channel is interpreted as the blue channel and vice versa. Then we convert the colorspace and we see that cv2.imshow() now interprets the result correctly.
plt.imshow(), as read by ndimage():
cv2.imshow(), the image as read by ndimage:
cv2.imshow(), after converting from RGB to BGR:
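The channel mix-up described above can be reproduced without any image file. The following is a minimal NumPy-only sketch, not the answer's original code: the two-pixel array and the helper describe_as_bgr are made up for illustration, and reversing the last axis stands in for what cv2.cvtColor does with COLOR_BGR2RGB or COLOR_RGB2BGR on a 3-channel image.

```python
import numpy as np

# A 1x2 "image" in RGB order: one pure red pixel, one pure green pixel.
rgb = np.array([[[255, 0, 0], [0, 255, 0]]], dtype=np.uint8)

# Reversing the last axis turns RGB into BGR (and BGR back into RGB).
bgr = rgb[..., ::-1]


def describe_as_bgr(pixel):
    """Name the dominant channel when the pixel is read in B, G, R order."""
    names = ("blue", "green", "red")
    return names[int(np.argmax(pixel))]


# How a BGR-assuming viewer would see the raw RGB data vs. the converted data.
seen_without_conversion = describe_as_bgr(rgb[0, 0])
seen_with_conversion = describe_as_bgr(bgr[0, 0])
print(seen_without_conversion, seen_with_conversion)
```

The red pixel is reported as blue until the channel order is reversed, which matches the behaviour seen with cv2.imshow() on an array that was loaded in RGB order.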
I am trying to process images from the Unity3D WebCamTexture graphics format (ARGB32) using OpenCV Python, but I am having trouble interpreting the image on the OpenCV side. The image is all blue (possibly due to ARGB).
try:
    while(True):
        data = sock.recv(480 * 640 * 4)
        if(len(data) == 480 * 640 * 4):
            image = numpy.fromstring(data, numpy.uint8).reshape( 480, 640, 4 )
            #imageNoAlpha = image[:,:,0:2]
            cv2.imshow('Image', image) #further do image processing
            key = cv2.waitKey(1) & 0xFF
            if key == ord("q"):
                break
finally:
    sock.close()
The cause is the order of the channels. I think the sender read the image as an RGB image and you show it as a BGR image, or vice versa.
Changing the order of the R and B channels will solve the problem:
image = image[..., [0,3,2,1]] # keep channel 0, swap channels 1 and 3 (R and B)
You will meet this problem frequently if you work with PIL.Image and OpenCV. PIL.Image reads the image as RGB and cv2 reads it as BGR; that's why all the red points in your image become blue.
OpenCV uses BGR (BGRA when including alpha) ordering when working with color images [1][2]. This applies to images read/written with imread(), imwrite(); images acquired with VideoCapture; drawing functions ellipse(), rectangle(); and so on. This convention is self-consistent within the library: if you read an image with imread() and show it with imshow(), the correct colors will appear.
OpenCV is the only library I know that uses this ordering, e.g. PIL and Matplotlib both use RGB. If you want to convert from one color space to another use cvtColor(), example:
# Convert RGB to BGR.
new_image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
See the ColorConversionCodes enum for all supported conversion pairs. Unfortunately there is no ARGB to BGR, but you can always manually manipulate the NumPy array anyway:
# Reverse channels ARGB to BGRA.
image_bgra = image[..., ::-1]
# Convert ARGB to BGR.
image_bgr = image[..., [3, 2, 1]]
There is also a mixChannels() function and a bunch of other array manipulation utilities, but most of these are redundant in OpenCV Python: images are backed by NumPy arrays, so it's easier to just use the NumPy counterparts instead.
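To see what these index expressions do, it can help to run them on a single pixel with distinct channel values. This is a small illustrative sketch; the pixel values are made up, and the array is assumed to be laid out as A, R, G, B along the last axis.

```python
import numpy as np

# One pixel in ARGB order: alpha=10, red=20, green=30, blue=40 (made-up values).
argb = np.array([[[10, 20, 30, 40]]], dtype=np.uint8)

bgra = argb[..., ::-1]          # reverse all four channels: B, G, R, A
bgr = argb[..., [3, 2, 1]]      # pick B, G, R and drop alpha
abgr = argb[..., [0, 3, 2, 1]]  # keep alpha first, swap R and B

print(bgra.tolist(), bgr.tolist(), abgr.tolist())
```

Note that the `::-1` slice returns a view with a negative stride, while the list-based index returns a copy. If a later OpenCV call rejects the view, wrapping it in np.ascontiguousarray() is one way to obtain an ordinary contiguous array.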
OpenCV uses BGR for seemingly historical reasons: Why OpenCV Using BGR Colour Space Instead of RGB.
References:
[1] OpenCV: Mat - The Basic Image Container (Search for 'BGR' under Storing methods.)
[2] OpenCV: How to scan images, lookup tables and time measurement with OpenCV
Image from [2] showing BGR layout in memory.
IMAGE_WIDTH = 640
IMAGE_HEIGHT = 480
IMAGE_SIZE = IMAGE_HEIGHT * IMAGE_WIDTH * 4
try:
    while(True):
        data = sock.recv(IMAGE_SIZE)
        dataLen = len(data)
        if(dataLen == IMAGE_SIZE):
            # numpy.frombuffer replaces the deprecated binary mode of numpy.fromstring
            image = numpy.frombuffer(data, numpy.uint8).reshape(IMAGE_HEIGHT, IMAGE_WIDTH, 4)
            imageDisp = cv2.cvtColor(image, cv2.COLOR_RGBA2BGR)
            cv2.imshow('Image', imageDisp)
            key = cv2.waitKey(1) & 0xFF
            if key == ord("q"):
                break
finally:
    sock.close()
Edited as per the suggestions from comment
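One thing the loop above relies on is that sock.recv(IMAGE_SIZE) returns a whole frame in one call. On a stream socket recv() may return fewer bytes than requested, so frames can be dropped or become misaligned. The sketch below shows one common way to read an exact number of bytes. The helper name recv_exact is hypothetical (it is not part of the original code or of the socket module), and the socket pair only exists to make the example self-contained.

```python
import socket


def recv_exact(sock, size):
    """Read exactly `size` bytes from `sock`, or raise if the peer closes early."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed before a full frame arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


# Small self-contained demonstration with a connected socket pair.
sender, receiver = socket.socketpair()
payload = bytes(range(256)) * 4      # 1024 bytes standing in for one frame
sender.sendall(payload[:100])        # the frame arrives in two pieces
sender.sendall(payload[100:])
frame = recv_exact(receiver, len(payload))
sender.close()
receiver.close()
print(len(frame))
```

In the loop above, data = recv_exact(sock, IMAGE_SIZE) could then replace the plain recv() call, which would also make the length check unnecessary.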
I've converted some images from RGB to grayscale for ML purposes.
However, the shape of the converted grayscale image still has 3 channels, the same as the color image.
The code for the Conversion:
from PIL import Image
img = Image.open('path/to/color/image')
imgGray = img.convert('L')
imgGray.save('path/to/grayscale/image')
The code to check the shape of the images:
import cv2
im_color = cv2.imread('path/to/color/image')
print(im_color.shape)
im_gray2 = cv2.imread('path/to/grayscale/image')
print(im_gray2.shape)
You did
im_gray2 = cv2.imread('path/to/grayscale/image')
OpenCV does not inspect whether the image is color or grayscale; by default it assumes the image is color and returns it in 8-bit BGR format. You need to tell OpenCV that you want the output to be grayscale (a 2D intensity array), as follows:
im_gray2 = cv2.imread('path/to/grayscale/image', cv2.IMREAD_GRAYSCALE)
If you want to know more about reading images read OpenCV: Getting Started with Images
cv.imread, without any flags, will always convert any image content to BGR, 8 bits per channel.
If you want any image file, grayscale or color, to be read as grayscale, you can pass the cv.IMREAD_GRAYSCALE flag.
If you want to read the file as it really is, then you need to use cv.IMREAD_UNCHANGED.
im_color = cv2.imread('path/to/color/image', cv2.IMREAD_UNCHANGED)
print(im_color.shape)
im_gray2 = cv2.imread('path/to/grayscale/image', cv2.IMREAD_UNCHANGED)
print(im_gray2.shape)
I encountered this puzzling situation when trying to get rid of the third dimension (the RGB dimension) of my images in order to feed them to a Knn classifier for face recognition.
I took one colored face image from the Labeled-face-in-the-wild database as an example. It is saved locally.
I first imported the image, then converted it to grayscale, then checked dimension (time1), then exported with "imwrite", then imported the gray scale image again, then checked its dimension again (time2).
At (time1), the dimension was 2: (250, 250). However, at (time2), the dimension became 3: (250, 250, 3). Why would exporting and importing change the dimension of the gray scale picture? What should I specify when importing the gray scale picture to keep it 2 dimensional?
Here is my python code:
import cv2
import matplotlib.pyplot as plt
imgBGR = cv2.imread("path/filename")
gray = cv2.cvtColor(imgBGR, cv2.COLOR_BGR2GRAY)
gray.shape # this gives me (250, 250)
cv2.imwrite("path/newname", gray)
gray2 = cv2.imread("path/newname")
gray2.shape # this gives me (250, 250, 3)
Try gray2 = cv2.imread("path/newname" , cv2.IMREAD_GRAYSCALE)
According to the OpenCV imread documentation, the default flag is cv2.IMREAD_COLOR, so without setting a flag cv2.imread reads the image in colour and expands a greyscale image into 3 channels.
By specifying cv2.imread("path/newname" , cv2.IMREAD_GRAYSCALE), the function will read the image in grayscale.
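The shape change can be illustrated without touching a file. The following NumPy-only sketch does not call cv2.imread; it only mimics the effect described above, under the assumption that a colour read of a grayscale file yields three identical channels. The 3x4 array is a made-up stand-in for a grayscale image.

```python
import numpy as np

# Stand-in for a 2-D grayscale image such as the result of COLOR_BGR2GRAY.
gray = np.arange(12, dtype=np.uint8).reshape(3, 4)

# What a default colour read effectively hands back: the same values in 3 channels.
as_color = np.stack([gray, gray, gray], axis=-1)

# Because all channels are identical, any single channel recovers the 2-D array.
channels_identical = bool(
    np.array_equal(as_color[..., 0], as_color[..., 1])
    and np.array_equal(as_color[..., 1], as_color[..., 2])
)
recovered = as_color[..., 0]
print(gray.shape, as_color.shape, recovered.shape)
```

So if a grayscale file has already been loaded with the default flag, taking one channel is a possible workaround, although passing cv2.IMREAD_GRAYSCALE at read time is the more direct fix.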
I am trying to resize a .jpg image with the skimage.transform.resize function. The function returns a weird result (see image below). I am not sure if it is a bug or just wrong use of the function.
import numpy as np
from skimage import io, color
from skimage.transform import resize
from PIL import Image  # needed for Image.fromarray below
rgb = io.imread("../../small_dataset/" + file)
# show original image
img = Image.fromarray(rgb, 'RGB')
img.show()
rgb = resize(rgb, (256, 256))
# show resized image
img = Image.fromarray(rgb, 'RGB')
img.show()
Original image:
Resized image:
I already checked skimage resize giving weird output, but I think that my bug has different properties.
Update: Also rgb2lab function has similar bug.
The problem is that skimage changes the pixel data type of your array when resizing the image. The original image has 8 bits per channel, of type numpy.uint8, and the resized pixels are numpy.float64 values.
The resize operation is correct, but the result is not being correctly displayed. For solving this issue, I propose 2 different approaches:
To change the data type of the resulting image. Before casting to uint8, the pixel values have to be rescaled to the 0-255 range, because resize returns them on a normalized 0-1 scale:
# ...
# Do the OP operations ...
resized_image = resize(rgb, (256, 256))
# Convert the image to a 0-255 scale.
rescaled_image = 255 * resized_image
# Convert to integer data type pixels.
final_image = rescaled_image.astype(np.uint8)
# show resized image
img = Image.fromarray(final_image, 'RGB')
img.show()
Update: The method below is deprecated, as per the scipy.misc.imshow documentation.
To use another library for displaying the image. Looking at the Image library documentation, there isn't any mode supporting 3xfloat64 pixel images. However, the scipy.misc library has the appropriate tools for converting the array format in order to display it correctly:
from scipy import misc
# ...
# Do OP operations
misc.imshow(resized_image)
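The first approach above (rescale to 0-255, then cast) can be tried on plain NumPy data. The sketch below is an illustration rather than the answer's exact code: the helper name float_to_uint8 is hypothetical, the small array stands in for the output of resize(), and the rounding and clipping steps are additions that avoid the truncation a bare astype(np.uint8) performs.

```python
import numpy as np


def float_to_uint8(image):
    """Map a float image in [0, 1] to uint8 in [0, 255] with rounding and clipping."""
    scaled = np.rint(np.clip(image, 0.0, 1.0) * 255.0)
    return scaled.astype(np.uint8)


# Stand-in for the float64 array that resize() returns.
resized_like = np.array([[0.0, 0.25, 0.999, 1.0]])

converted = float_to_uint8(resized_like)
truncated = (255 * resized_like).astype(np.uint8)  # plain cast, as in the first approach
print(converted.tolist(), truncated.tolist())
```

The plain cast is usually off by at most one intensity level, so either version should display correctly; the rounded variant just stays closer to the float values.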
I'm trying to open an RGB picture, convert it to grayscale, then represent it as a list of floats scaled from 0 to 1. Finally, I want to convert it back to an Image. However, in the code below, something in my conversion procedure fails: img.show() (the original image) displays correctly, while img2.show() displays an all-black picture. What am I missing?
import numpy as np
from PIL import Image
ocr_img_path = "./ocr-test.jpg"
# Open image, convert to grayscale
img = Image.open(ocr_img_path).convert("L")
# Convert to list
img_data = img.getdata()
img_as_list = np.asarray(img_data, dtype=float) / 255
img_as_list = img_as_list.reshape(img.size)
# Convert back to image
img_mul = img_as_list * 255
img_ints = np.rint(img_mul)
img2 = Image.new("L", img_as_list.shape)
img2.putdata(img_ints.astype(int))
img.show()
img2.show()
The image used
The solution is to flatten the array before putting it into the image. I think PIL interprets multidimensional arrays as different color bands.
img2.putdata(img_ints.astype(int).flatten())
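The round trip can be checked on plain numbers, without opening an image. This sketch makes two assumptions worth stating: the flat list stands in for what img.getdata() yields (pixels in row-major order), and the size is written as (width, height) the way PIL reports it, which is the reverse of NumPy's (rows, columns) shape. The question's reshape(img.size) only lines up with Image.new because the same tuple is used on both sides.

```python
import numpy as np

width, height = 3, 2                        # PIL reports size as (width, height)
flat_pixels = [0, 51, 102, 153, 204, 255]   # row-major, as getdata() would yield

# NumPy shapes are (rows, columns), so the PIL size has to be reversed.
as_float = np.asarray(flat_pixels, dtype=float).reshape(height, width) / 255

# Back to integers and to a flat sequence, the form putdata() expects.
restored = np.rint(as_float * 255).astype(int).flatten().tolist()
print(as_float.shape, restored)
```

Flattening before putdata(), as suggested above, gives back exactly the original pixel sequence.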
For a more efficient way of loading images, check out
https://blog.eduardovalle.com/2015/08/25/input-images-theano/
but use image.tobytes() (Pillow) instead of image.tostring() (PIL).