I am trying to isolate text from an image with OpenCV before sending it to the Tesseract 4 engine, to get the best possible results.
I found this interesting post and decided to copy the source and try it myself.
However, I am getting an error on the first call to OpenCV.
To reproduce:
Simply copy the code from the gist.
Run the command script.py /path/to/image.jpg
I get this error:
Required argument 'threshold2' (pos 4) not found
Do you have any idea what this means?
I am a JavaScript, Java and Bash script developer, but not a Python one...
Here is a simplified version:
import glob
import os
import random
import sys
import math
import json
from collections import defaultdict
import cv2
from PIL import Image, ImageDraw
import numpy as np
from scipy.ndimage.filters import rank_filter

if __name__ == '__main__':
    if len(sys.argv) == 2 and '*' in sys.argv[1]:
        files = glob.glob(sys.argv[1])
        random.shuffle(files)
    else:
        files = sys.argv[1:]
    for path in files:
        out_path = path.replace('.jpg', '.crop.png')
        if os.path.exists(out_path): continue
        orig_im = Image.open(path)
        edges = cv2.Canny(np.asarray(orig_im), 100, 200)
Thanks in advance for your help
Edit: okay so this answer is apparently wrong, as I tried to send my own 16-bit int image into the function and couldn't reproduce the results.
Edit2: So I can reproduce the error with the following:
from PIL import Image
import numpy as np
import cv2
orig_im = Image.open('opencv-logo2.png')
threshold1 = 50
threshold2 = 150
edges = cv2.Canny(orig_im, 50, 100)
TypeError: Required argument 'threshold2' (pos 4) not found
So if the image was not cast to an array, i.e., the Image class was passed in, I get the error. The PIL Image class is a class with a lot of things other than the image data associated to it, so casting to a np.array is necessary to pass into functions. But if it was properly cast, everything runs swell for me.
After a chat with Dan Mašek, it turns out my idea below is a bit incorrect. It is true that the newer Canny() method needs 16-bit images, but the bindings don't look at the actual numpy dtype to decide which function call to use. Plus, if you actually try to send a uint16 image in, you get a different error:
edges = cv2.Canny(np.array([[0, 1234], [1234, 2345]], dtype=np.uint16), 50, 100)
error: (-215) depth == CV_8U in function Canny
So the answer I originally gave (below) is not the total culprit. Perhaps you accidentally removed the np.array() casting of the orig_im and got that error, or, something else weird is going on.
Original (wrong) answer
In OpenCV 3.2.0, a new method for Canny() was introduced to allow users to specify their own gradient image. In the original implementation, Canny() would use the Sobel() operator for calculating the gradients, but now you could calculate say the Scharr() derivatives and pass those into Canny() instead. So that's pretty cool. But what does this have to do with your problem?
The Canny() method is overloaded. And it decides which function you want to use based on the arguments you send in. The original call for Canny() with the required arguments looks like
cv2.Canny(image, threshold1, threshold2)
but the new overloaded method looks like
cv2.Canny(grad_x, grad_y, threshold1, threshold2)
Now, there was a hint in your error message:
Required argument 'threshold2' (pos 4) not found
Which one of these calls had threshold2 in position 4? The newer method call! So why was that being called if you only passed three args? Note that you were getting the error if you used a PIL image, but not if you used a numpy image. So what else made it assume you were using the new call?
If you check the OpenCV 3.3.0 Canny() docs, you'll see that the original Canny() call requires an 8-bit input image for the first positional argument, whereas the new Canny() call requires a 16-bit x derivative of input image (CV_16SC1 or CV_16SC3) for the first positional argument.
Putting two and two together, PIL was giving you a 16-bit input image, so OpenCV thought you were trying to call the new method.
So the solution here, if you wanted to continue using PIL, is to convert your image to an 8-bit representation. Canny() needs a single-channel (i.e. grayscale) image to run, first off. So you'll need to make sure the image is single-channel first, and then scale it and change the numpy dtype. I believe PIL will read a grayscale image as single channel (OpenCV by default reads all images as three-channel unless you tell it otherwise).
If the image is 16-bit, then the conversion is easy with numpy:
img = (img/256).astype('uint8')
This assumes img is a numpy array, so you would need to cast the PIL image to ndarray first with np.array() or np.asarray().
And then you should be able to run Canny() with the original function call.
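A small sketch of that 16-bit to 8-bit conversion, using a made-up array in place of real image data (the values are chosen only to make the scaling easy to follow):

```python
import numpy as np

# Hypothetical 16-bit grayscale data standing in for a decoded image.
img16 = np.array([[0, 256, 32768], [65535, 1000, 4096]], dtype=np.uint16)

# Integer division by 256 maps the range 0..65535 onto 0..255.
img8 = (img16 // 256).astype(np.uint8)

print(img8.dtype)
print(img8)
```

Integer division is used here instead of img/256 so that no intermediate float array is created; for non-negative values the result after astype is the same.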
The issue came from an incompatibility between the interfaces used and the OpenCV version.
I was using OpenCV 3.3, so the correct way to call it is:
orig_im = cv2.imread(path)
edges = cv2.Canny(orig_im, 100, 200)
Related
I'm using OpenCV version 4.1.1 in Python and cannot get a legitimate reading for a 32-bit image, even when I use cv.IMREAD_ANYDEPTH. Without cv.IMREAD_ANYDEPTH, it returns as None type; with it, I get a matrix of zeros. The issue persists after reinstalling OpenCV. os.path.isfile returns True. The error was replicated on another computer. The images open in ImageJ, so I wouldn't think they're corrupted. I would rather use Skimage since it reads the images just fine, but I have to use OpenCV for what I'm working on. Any advice is appreciated.
img = cv2.imread(file,cv2.IMREAD_ANYDEPTH)
Link for the image: https://drive.google.com/file/d/1IiHbemsmn2gLW12RG3i9fLYZQW2u8sQw/view?usp=sharing
It appears to be some bug in how OpenCV loads such TIFF images. Pillow seems to load the image in a sensible way. Running
from PIL import Image
import numpy as np
img_pil = Image.open('example_image.tiff')
img_pil_cv = np.array(img_pil)
print(img_pil_cv.dtype)
print(img_pil_cv.max())
I get
int32
40950
as an output, which looks reasonable enough.
When I do
import cv2
img_cv = cv2.imread('example_image.tiff', cv2.IMREAD_ANYDEPTH)
print(img_cv.dtype)
print(img_cv.max())
I get
float32
5.73832e-41
which is obviously wrong.
Nevertheless, the byte array holding the pixel data is correct; it's just not being interpreted correctly. You can use numpy.ndarray.view to reinterpret the datatype of a numpy array, so that it's treated as an array of 32-bit integers instead.
img_cv = cv2.imread('example_image.tiff', cv2.IMREAD_ANYDEPTH)
img_cv = img_cv.view(np.int32)
print(img_cv.dtype)
print(img_cv.max())
Which prints out
int32
40950
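To see why view() is the right tool here and astype() is not, here is a self-contained sketch that simulates the misread with a made-up array (no TIFF file involved; the values are only illustrative):

```python
import numpy as np

# Integer pixel values, as they are stored in the file.
true_values = np.array([[0, 40950], [1234, 7]], dtype=np.int32)

# Simulate the misread: the same bytes, labelled as float32.
misread = true_values.view(np.float32)

# view() relabels the existing bytes; astype() would convert the
# (nonsensical) float values and lose the original integers.
recovered = misread.view(np.int32)

print(misread.dtype, recovered.dtype, recovered.max())
```

Because view() does not copy anything, recovered shares its memory with the original array.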
Since the maximum value is small enough for a 16-bit integer, let's convert the array and see what it looks like:
img_cv_16bit = img_cv.astype(np.uint16)
cv2.imwrite('output_cv_16bit.png', img_cv_16bit)
OK, there are some bright spots, and a barely visible pattern. With a little adjustment, we can get something visible:
img_cv_8bit = np.clip(img_cv_16bit // 16, 0, 255).astype(np.uint8)
cv2.imwrite('output_cv_8bit.png', img_cv_8bit)
That looks quite reasonable now.
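One thing worth knowing about that adjustment: anything at or above 4096 saturates to 255 after the division and clip. A quick sketch on made-up values shows the effect:

```python
import numpy as np

# Hypothetical 16-bit samples, including one at the image maximum (40950).
samples = np.array([[0, 160, 4080], [8000, 40950, 16]], dtype=np.uint16)

# Same adjustment as above: divide by 16, then clip into the 8-bit range.
scaled = np.clip(samples // 16, 0, 255).astype(np.uint8)

print(scaled)
```

So the few very bright spots are flattened to white, which is what makes the dimmer pattern visible.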
I'm trying to use sobel and prewitt filters from skimage for edge detection to compare the results, but for both I just get black squares!
That's my code:
import numpy as np
from skimage import filters
from PIL import Image
a=Image.open('F:/CT1.png').convert('L')
a.show()
a=np.asarray(a)
b=filters.sobel(a)
b=Image.fromarray(b)
b.show()
Like most functions in scikit-image, sobel uses np.float64 for its calculations, and thus converts your image to the range 0.0 ... 1.0. Consequently, your result b is also of type np.float64, with values in the same range. When this is converted to a Pillow Image object, its mode is set to F, which is used for 32-bit floating point pixels.
Now, the documentation on Image.show tells us, for example:
On Windows, the image is opened with the standard PNG display utility.
It remains unclear in which file format the image is actually displayed. Seemingly it's PNG, at least according to the temporary file name. But saving an Image object with mode F as PNG or JPG doesn't work! So it seems the image must somehow be converted to make it displayable. My first guess is that a regular 8-bit image is chosen as the default, since you get a nearly all-black image, indicating that values 0 and maybe 1 are treated as "very dark". And, in fact, when using something like
b=Image.fromarray(b * 255)
the Windows image preview displays a proper image when using b.show().
So, that would be a workaround for the displaying.
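If you'd rather not depend on how the viewer treats mode F, an explicit conversion to 8-bit is another option. This is only a sketch: the array below is a synthetic stand-in for the filter output, assumed to lie in 0.0 ... 1.0.

```python
import numpy as np
from PIL import Image

# Synthetic stand-in for a filter result: float64 values in 0.0 ... 1.0.
b = np.linspace(0.0, 1.0, 256).reshape(16, 16)

as_float = Image.fromarray(b)                           # floating point image
as_uint8 = Image.fromarray((b * 255).astype(np.uint8))  # regular 8-bit image

print(as_float.mode, as_uint8.mode)
```

The 8-bit version can then be shown with show() or saved as PNG like any ordinary grayscale image.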
Nevertheless, if you want to save the image instead, you don't necessarily need that conversion; you just need a file format that can store 32-bit floating point data, TIFF for example:
b=Image.fromarray(b)
b.save('b.tiff')
I just stumbled upon a weird situation with skimage.io.imread.
I was trying to open a MultiPage TIFF (dimensions: 96x512x512) like this:
import argparse
from pathlib import Path
import numpy as np
from skimage import io

def numpy_array_from_file(path):
    """Load a numpy array from a TIFF file."""
    im_data = io.imread(path)
    print("image dimensions {}".format(im_data.shape))
    return im_data

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Extract page from a MultiPage TIFF")
    parser.add_argument("tiff_file", type=str, help="3D TIFF file to open")
    args = parser.parse_args()
    tiff_file = Path(args.tiff_file).absolute()
    numpy_array_from_file(tiff_file)
And I was obtaining in the output:
image dimensions (512, 512)
After trying many different things (because I was sure that my input image had 96 pages), I discovered that the problem was passing a Path directly to numpy_array_from_file instead of a string. By changing the last line to:
numpy_array_from_file(str(tiff_file))
I got the expected:
image dimensions (96, 512, 512)
So, my question is: does anyone know why I got that behaviour? I am not very experienced in Python, but I would have expected an error if Path was not appropriate in that situation.
Indeed, pathlib.Path is relatively new, so support in scikit-image is generally patchy. What's happening is that, because the Path is not a string, the extension isn't checked, and imageio is used instead of tifffile. The behavior of imread is different for the two libraries, with imageio preferring to only return a single plane.
This is a known issue in our issue tracker. Until it's fixed, you can use str() or explicitly call plugin='tifffile'.
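To illustrate the mechanism, here is a toy dispatcher. It is a hypothetical sketch and not scikit-image's actual code (pick_plugin and the file name are made up), but it shows how an extension check that only runs for strings silently falls through for a Path:

```python
from pathlib import Path


def pick_plugin(fname):
    """Hypothetical dispatcher, NOT scikit-image's actual implementation."""
    # The extension is only inspected when fname is a plain string.
    if isinstance(fname, str) and fname.lower().endswith((".tif", ".tiff")):
        return "tifffile"
    return "imageio"


tiff_path = Path("stack.tiff")
print(pick_plugin(tiff_path))       # Path object: extension check is skipped
print(pick_plugin(str(tiff_path)))  # str: extension check applies
```

No error is raised in either case, which matches what you observed: the call succeeds, just through a different reader.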
Disclaimer: huge openCV noob
Traceback (most recent call last):
File "lanes2.py", line 22, in <module>
canny = canny(lane_image)
File "lanes2.py", line 5, in canny
gray = cv2.cvtColor(imgUMat, cv2.COLOR_RGB2GRAY)
TypeError: Expected cv::UMat for argument 'src'
What exactly is 'src' referring to?
src is the first argument to cv2.cvtColor.
The error you are getting is because the argument is not in the right form. Try converting it explicitly, either with np.float32() or with cv2.UMat(), so your last line of code would read one of:
gray = cv2.cvtColor(np.float32(imgUMat), cv2.COLOR_RGB2GRAY)
gray = cv2.cvtColor(cv2.UMat(imgUMat), cv2.COLOR_RGB2GRAY)
UMat is part of the Transparent API (TAPI), which helps you write one piece of code for both the CPU and OpenCL implementations.
The following can be used from numpy:
import numpy as np
image = np.array(image)
Your code is not the problem; this line is perfectly fine:
gray = cv2.cvtColor(imgUMat, cv2.COLOR_RGB2GRAY)
The problem is that imgUMat is None so you probably made a mistake when loading your image:
imgUMat = cv2.imread("your_image.jpg")
I suspect you just entered the wrong image path.
Just add this at start:
image = cv2.imread(image)
Convert your image matrix to a contiguous array using np.ascontiguousarray, as below:
gray = cv2.cvtColor(np.ascontiguousarray(imgUMat), cv2.COLOR_RGB2GRAY)
Is canny your own function? Do you use Canny from OpenCV inside it? If so, check that you feed it a suitable argument. The first argument to Canny should meet the following criteria:
type: <type 'numpy.ndarray'>
dtype: dtype('uint8')
single channel, or simply put, grayscale: a 2D array, i.e. its shape should be a 2-tuple of ints (a tuple containing exactly 2 integers)
You can check this by printing, respectively:
type(variable_name)
variable_name.dtype
variable_name.shape
Replace variable_name with name of variable you feed as first argument to Canny.
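The same three checks can be wrapped in a small helper. This is just a sketch following the criteria listed above; canny_input_problems is a made-up name, not an OpenCV function.

```python
import numpy as np


def canny_input_problems(arr):
    """Return the reasons why arr fails the three criteria above."""
    if not isinstance(arr, np.ndarray):
        return [f"type is {type(arr).__name__}, expected numpy.ndarray"]
    problems = []
    if arr.dtype != np.uint8:
        problems.append(f"dtype is {arr.dtype}, expected uint8")
    if arr.ndim != 2:
        problems.append(f"shape is {arr.shape}, expected a 2-tuple")
    return problems


gray = np.zeros((4, 5), dtype=np.uint8)
colour = np.zeros((4, 5, 3), dtype=np.float32)

print(canny_input_problems(gray))
print(canny_input_problems(colour))
print(canny_input_problems(None))
```

An empty list means the input passes all three checks.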
This is a general error that is sometimes thrown when there is a mismatch between the types of the data you use. For example, I tried to resize an image with OpenCV and it gave the same error. Here is a discussion about it.
Some dtypes are not supported by specific OpenCV functions. For example, inputs of dtype np.uint32 cause this error. Try converting the input to a supported dtype (e.g. np.int32 or np.float32).
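A short sketch of such a conversion, on a made-up np.uint32 array. Checking the value range first is my own addition, to make sure nothing wraps around when moving to a signed type:

```python
import numpy as np

# Hypothetical uint32 data, e.g. a label or count image.
counts = np.array([[0, 70000], [123456, 255]], dtype=np.uint32)

# Make sure every value fits into int32 before changing the dtype.
fits = int(counts.max()) <= np.iinfo(np.int32).max

as_int32 = counts.astype(np.int32)
as_float32 = counts.astype(np.float32)

print(fits, as_int32.dtype, as_float32.dtype)
```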
It is referring to the expected dtype of your image.
image.astype('float32') should solve your issue.
I sometimes get this error when the video stream from the imutils package doesn't recognize a frame or returns an empty frame. In that case, the solution is to figure out why you are getting such a bad frame, or to use the standard VideoCapture(0) method from OpenCV.
If using ImageGrab
Verify that your image is not a 0x0 area due to an incorrect bbox.
Verify the application root folder is the same as the file you are attempting to run.
I got round this by writing to and then reading from a file. I guessed cv.imread would put it into the format it needed. This code is for an Anki Vector SDK program, but you get the idea.
tmpImage = robot.camera.latest_image.raw_image.save('temp.png')
pilImage = cv.imread('temp.png')
If you are using a bytes object instead of reading from a file, you can convert your image to a NumPy array like this:
image = numpy.array(Image.open(io.BytesIO(image_bytes)))
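For completeness, a self-contained version with the imports spelled out. The PNG bytes are generated in memory here purely to have something to decode; in practice image_bytes would come from your own source.

```python
import io

import numpy as np
from PIL import Image

# Build PNG bytes in memory to stand in for bytes received from elsewhere.
source = np.arange(12, dtype=np.uint8).reshape(3, 4)
buffer = io.BytesIO()
Image.fromarray(source).save(buffer, format="PNG")
image_bytes = buffer.getvalue()

# Decode the bytes with PIL, then convert to a NumPy array for OpenCV.
image = np.array(Image.open(io.BytesIO(image_bytes)))

print(type(image).__name__, image.dtype, image.shape)
```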
Is it possible to reduce the depth of an image using PIL? Say like going to 4bpp from a regular 8bpp.
You can easily convert image modes (just call im.convert(newmode) on an image object im, and it will give you a new image in the required mode), but there's no mode for "4bpp"; the supported modes are listed here in The Python Imaging Library Handbook.
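A related technique, though not a true "4bpp" mode, is to reduce the image to a 16-colour palette with Image.quantize, so that every pixel index fits in 4 bits. This is a sketch on a synthetic gradient; whether the saved file is actually packed at 4 bits per pixel depends on the file format and writer.

```python
import numpy as np
from PIL import Image

# Synthetic RGB gradient with 256 distinct grey levels.
ramp = np.tile(np.arange(256, dtype=np.uint8), (32, 1))
im = Image.fromarray(np.stack([ramp, ramp, ramp], axis=-1))  # mode "RGB"

# Reduce to a palette image with at most 16 colours.
reduced = im.quantize(colors=16)
used = np.unique(np.asarray(reduced))

print(im.mode, "->", reduced.mode, "palette entries used:", len(used))
```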
This can be done using the changeColorDepth function in the ufp.image module.
Note that this function can only reduce the color depth (bpp).
import ufp.image
import PIL.Image
im = PIL.Image.open('test.png')
ufp.image.changeColorDepth(im, 16) # change to 4bpp (this function modifies the original PIL.Image object)
im.save('changed.png')