I have code which recognizes faces in images (the dlib implementation, 68 points).
I want to rotate some images a bit, but afterwards I get the following trouble: my image becomes somehow spoiled.
import numpy as np
from skimage import io
from skimage import transform as tf

img = io.imread(f)  # f is the path to the image file
tform = tf.SimilarityTransform(rotation=np.deg2rad(10), translation=(10, 12))
img = tf.warp(img, tform)
I plot the image in two ways:
plt.imshow(img)            # the right picture (matplotlib)
win = dlib.image_window()  # the left picture (dlib)
win.set_image(img)
As you can see, the dlib image is broken. Also, the algorithm which finds facial keypoints stops working.
Without SimilarityTransform, dlib works correctly.
Help me please! I want to rotate an image and pass it to dlib.
I found the solution.
I simply needed to convert the image with img_as_ubyte:
from skimage import img_as_ubyte
img = img_as_ubyte(tf.warp(img,tform))
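For context: tf.warp returns a float64 image scaled to [0, 1], while dlib expects a uint8 array in [0, 255], which is why the conversion fixes it. A quick way to see the difference (a sketch, reusing f and tform from above):
import numpy as np
from skimage import io, img_as_ubyte
from skimage import transform as tf

img = io.imread(f)              # uint8, values in 0..255
warped = tf.warp(img, tform)    # float64, values in 0.0..1.0
print(img.dtype, warped.dtype)  # uint8 float64
img = img_as_ubyte(warped)      # back to uint8, safe to pass to dlib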
So I have an image, and I want to cut it up into multiple images to feed into OCR to read.
[image example]
I only want the messages with the white bubbles and want to exclude anything with the grey bubbles, but I can't figure out how to make a loop to separate each white bubble.
import numpy as np
from PIL import Image, ImageFilter

img = Image.open('test1.png').convert('RGB')
na = np.array(img)
orig = na.copy()
img = img.filter(ImageFilter.MedianFilter(3))

# Coordinates of all pure-white pixels
whiteY, whiteX = np.where(np.all(na == [255, 255, 255], axis=2))
# First/last white row and column give the overall bounding box
top, bottom = whiteY[0], whiteY[-1]
left, right = whiteX[0], whiteX[-1]
You could try using the OpenCV threshold function, followed by the findContours function. If you threshold the image correctly, this will give you the 'borders' of the bubbles above. Using that, you could then crop out each text bubble.
Here's a simple example of contours being used:
https://www.geeksforgeeks.org/find-and-draw-contours-using-opencv-python/
Otherwise, if you'd like to understand better how the OpenCV functions I mentioned (or those used in the article above) work, have a look at the OpenCV documentation.
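To make that concrete, here is a minimal sketch of the threshold-plus-contours approach (using the asker's 'test1.png'; the threshold of 240 and the size filter are assumptions to tune for your screenshot):
import cv2

img = cv2.imread('test1.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Keep only near-white pixels (the white bubbles), dropping the grey ones.
ret, mask = cv2.threshold(gray, 240, 255, cv2.THRESH_BINARY)

# Each external contour of the mask should correspond to one bubble.
contours, hierarchy = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for i, cnt in enumerate(contours):
    x, y, w, h = cv2.boundingRect(cnt)
    if w > 50 and h > 20:  # skip tiny specks; these sizes are guesses
        cv2.imwrite('bubble_%d.png' % i, img[y:y+h, x:x+w])  # crop for OCR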
This should recognize the cars in the image and draw squares around them, but I don't know why it's not working... could someone give me some hints?
Code and image with no detection:
[image]
import cv2
import matplotlib.pyplot as plt
import cvlib as cv
from cvlib.object_detection import draw_bbox

im = cv2.imread("C:\\Users\\gmobi\\PycharmProjects\\ComputerVisionStudent\\imagens\\carros.jpeg")
bbox, label, conf = cv.detect_common_objects(im)
output_image = draw_bbox(im, bbox, label, conf)
plt.imshow(output_image)
plt.show()
print('Number of cars in the image is ' + str(label.count('car')))
Installed packages:
Keras: 2.2.5
cvlib: 0.2.2
opencv-python: 4.1.1.26
tensorflow: 1.14.0
matplotlib: 3.1.1
Source code from GitHub:
https://github.com/sabiipoks/blog-posts/blob/master/Count_Number_of_Cars_in_Less_Than_10_Lines_of_Code_Using_Python.ipynb
My settings:
[image]
The network that cvlib uses for object detection is YOLOv3. This network expects an image in the RGB color space, not the BGR color space that OpenCV's cv2.imread returns.
So I think you should convert the color space of the image from BGR to RGB:
im = cv2.imread("C:\\Users\\gmobi\\PycharmProjects\\ComputerVisionStudent\\imagens\\carros.jpeg")
im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
Another reason could be the size of the objects in your image. The input size for YOLO is around 416x416 pixels, so your image will be resized before being fed to the network, and after the resize the objects may look too small. Try cropping the image before passing it in:
im = im[:416, :416, :]
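Putting both suggestions together, a corrected version of the posted script might look like this (a sketch; the path is unchanged from the question):
import cv2
import matplotlib.pyplot as plt
import cvlib as cv
from cvlib.object_detection import draw_bbox

im = cv2.imread("C:\\Users\\gmobi\\PycharmProjects\\ComputerVisionStudent\\imagens\\carros.jpeg")
im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)  # YOLO expects RGB, imread returns BGR
bbox, label, conf = cv.detect_common_objects(im)
output_image = draw_bbox(im, bbox, label, conf)
plt.imshow(output_image)
plt.show()
print('Number of cars in the image is ' + str(label.count('car')))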
I used ZBar and OpenCV to read the QR code in the image below, but both failed to detect it. For ZBar, I use the pyzbar library as the Python wrapper. There are images in which the QR code is detected correctly, and images really similar to the successful ones in which detection fails. My phone camera can read the QR code in the uploaded image, which means it is a valid one. Below is the code snippet:
from pyzbar.pyzbar import decode
from pyzbar.pyzbar import ZBarSymbol
import cv2
# zbar
results = decode(cv2.imread(image_path), symbols=[ZBarSymbol.QRCODE])
print(results)
# opencv
qr_decoder = cv2.QRCodeDetector()
data, bbox, rectified_image = qr_decoder.detectAndDecode(cv2.imread(image_path))
print(data, bbox)
What type of pre-processing will help to increase the rate of success for detecting QR codes?
zbar, which does some preprocessing, does not detect the QR code; you can test this by running zbarimg image.jpg.
Good binarization is useful here. I got this to work using the kraken.binarization.nlbin() function of the Kraken library. The library is for OCR, but it works very well for QR codes too, by using non-linear processing. The Kraken binarization code is here.
Here is the code for the sample:
from kraken import binarization
from PIL import Image
from pyzbar.pyzbar import decode
from pyzbar.pyzbar import ZBarSymbol
image_path = "image.jpg"
# binarization using kraken
im = Image.open(image_path)
bw_im = binarization.nlbin(im)
# zbar
print(decode(bw_im, symbols=[ZBarSymbol.QRCODE]))
# [Decoded(data=b'DE-AAA002065', type='QRCODE', rect=Rect(left=1429, top=361, width=300, height=306), polygon=[Point(x=1429, y=361), Point(x=1429, y=667), Point(x=1729, y=667), Point(x=1723, y=365)])]
The following picture shows the clear image of the QR code after binarization: [image]
I had a similar issue, and Seanpue's answer got me on the right track for this problem. Since I was already using the OpenCV library for image processing rather than PIL, I used it to perform Otsu's binarization, following the directions in the OpenCV tutorial on image thresholding. Here's my code:
import cv2
from pyzbar.pyzbar import decode
from pyzbar.pyzbar import ZBarSymbol
image_path = "qr.jpg"
# preprocessing using opencv
im = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
blur = cv2.GaussianBlur(im, (5, 5), 0)
ret, bw_im = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)
# zbar
print(decode(bw_im, symbols=[ZBarSymbol.QRCODE]))
# [Decoded(data=b'DE-AAA002065', type='QRCODE', rect=Rect(left=1429, top=362, width=300, height=305), polygon=[Point(x=1429, y=362), Point(x=1430, y=667), Point(x=1729, y=667), Point(x=1724, y=366)])]
Applying the Gaussian blur is supposed to remove noise from the picture to make the binarization more effective, but for my application it didn't actually make much difference. What was vital was converting the image to grayscale so the threshold function works (done here by opening the file with the cv2.IMREAD_GRAYSCALE flag).
QReader usually works quite well for these cases:
from qreader import QReader
import cv2
if __name__ == '__main__':
    # Initialize QReader
    detector = QReader()
    img = cv2.cvtColor(cv2.imread('92iKG.jpg'), cv2.COLOR_BGR2RGB)

    # Detect and decode the QR
    print(detector.detect_and_decode(image=img))
The output of this code for this QR:
DE-AAA002065
I have also changed the format of the images to PNG, but to no avail. Does cv2/imshow decrease the resolution automatically?
import numpy as np
import cv2
from matplotlib import pyplot as plt
imgL = cv2.imread('image.png',0)
imgR = cv2.imread('2.png',0)
stereo = cv2.StereoBM_create(numDisparities=16, blockSize=15)
disparity = stereo.compute(imgL, imgR)  # compute expects (left, right)
plt.imshow(disparity, 'gray')
plt.show()
My main aim is to generate the final image at the same resolution as the supplied images.
You're using imshow from matplotlib, which might be the cause of the different display behaviour.
Instead try:
cv2.imshow("Res", disparity)
cv2.waitKey(0)
cv2.destroyAllWindows()
If that is still not good, please edit the question and include the resulting image and the input image.
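If the goal is to also save the disparity map at the input resolution, note that StereoBM returns fixed-point int16 values, so a rescale to 8-bit is needed before writing to disk (a sketch, reusing disparity from above):
import numpy as np

# Rescale the int16 disparities to 0..255 for viewing/saving;
# the output keeps the same width and height as the input images.
disp_vis = cv2.normalize(disparity, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
cv2.imwrite('disparity.png', disp_vis)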
When loading a PNG image with PIL and OpenCV, there is a color shift. Black and white remain the same, but brown gets changed to blue.
I can't post the image because this site does not allow newbies to post images.
The code is written as below, rather than using cv.LoadImageM, because in the real case the raw image is received over TCP.
Here is the code:
#! /usr/bin/env python
import sys
import cv
import cv2
import numpy as np
import Image
from cStringIO import StringIO
if __name__ == "__main__":
    # load raw image from file
    f = open('frame_in.png', "rb")
    rawImage = f.read()
    f.close()

    # convert to mat
    pilImage = Image.open(StringIO(rawImage))
    npImage = np.array(pilImage)
    cvImage = cv.fromarray(npImage)

    # show it
    cv.NamedWindow('display')
    cv.MoveWindow('display', 10, 10)
    cv.ShowImage('display', cvImage)
    cv.WaitKey(0)
    cv.SaveImage('frame_out.png', cvImage)
How can the color shift be fixed?
OpenCV's images have color channels arranged in the order BGR whereas PIL's is RGB. You will need to switch the channels like so:
import PIL.Image
import cv2
...
image = np.array(pilImage) # Convert PIL Image to numpy/OpenCV image representation
# Use cv2.COLOR_RGBA2BGRA instead if you are sure you have an alpha channel;
# you will only have one if your image format supports transparency.
image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
...
@Krish: Thanks for pointing out the bug. I didn't have time to test the code the last time.
Hope this helps.
Change
pilImage = Image.open(StringIO(rawImage))
to
pilImage = Image.open(StringIO(rawImage)).convert("RGB")
Light alchemist's answer did not work, but it did explain the issue. Wouldn't the reverse be screwed up by the alpha channel, i.e. it changes BGRA to ARGB? I would think Froyo's answer would solve it, but it did not change the displayed image at all. What did work was reversing the colors in OpenCV. I'm too much of a newbie to know why; they seem equivalent to me. Reversing the colors in numpy would be preferred, as additional processing is planned in numpy. But thanks for the help, the answers steered me in the right direction.
pilImage = Image.open(StringIO(rawImage))
bgrImage = np.array(pilImage)
cvBgrImage = cv.fromarray(bgrImage)

# Reverse BGR
cvRgbImage = cv.CreateImage(cv.GetSize(cvBgrImage), 8, 3)
cv.CvtColor(cvBgrImage, cvRgbImage, cv.CV_BGR2RGB)

# show it
cv.ShowImage('display', cvRgbImage)
cv.WaitKey(30)  # ms to allow display
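Since a numpy-based reversal was the preferred route, the same swap can be done by reversing the channel axis (a sketch; the copy() is there because the old cv.fromarray wants a contiguous array):
pilImage = Image.open(StringIO(rawImage))
rgbImage = np.array(pilImage)           # PIL gives RGB
bgrImage = rgbImage[:, :, ::-1].copy()  # reverse channel axis: RGB -> BGR
cvBgrImage = cv.fromarray(bgrImage)     # old cv displays BGR correctly
cv.ShowImage('display', cvBgrImage)
cv.WaitKey(30)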