I'm using the OpenCV 2.4.0 Python bindings and I found that, when calculating the Laplacian of an image, I get different results from the cv2 API and the cv2.cv API.
If I use the cv2 API:
im_laplacian = cv2.Laplacian(im_gray, cv2.IPL_DEPTH_32F, ksize = 3)
im_laplacian is always uint8 (missing the sign), and ddepth has to be IPL_DEPTH_32F or IPL_DEPTH_64F; if I try IPL_DEPTH_16S or IPL_DEPTH_32S I get an error:
"OverflowError: Python int too large to convert to C long"
If I use the cv2.cv API:
cvgray = cv.fromarray(im_gray)
im_laplacian2 = cv.CreateImage(cv.GetSize(cvgray), cv.IPL_DEPTH_16S, 1)
cv.Laplace(cvgray, im_laplacian2, 3)
As expected, I get a signed Laplacian; this is the same result as with the C++ API.
If I do:
im_laplacian2_scaled = cv.CreateImage(cv.GetSize(cvgray), 8, 1)
cv.ConvertScaleAbs(im_laplacian2, im_laplacian2_scaled, 1, 0)
im_laplacian2_scaled is still different from the im_laplacian calculated with the cv2 API.
In my particular case I think I can get away with the cv2 output,
but I'm puzzled: shouldn't all the APIs produce the same output?
Do they use different algorithms?
Or maybe the cv2 Python bindings don't correspond to individual C++ functions but to some combination of them?
The new cv2 API uses different depth constants:
cv2.CV_64F instead of cv2.IPL_DEPTH_64F
cv2.CV_32F instead of cv2.IPL_DEPTH_32F
cv2.CV_32S instead of cv2.IPL_DEPTH_32S
cv2.CV_16S instead of cv2.IPL_DEPTH_16S
cv2.CV_16U instead of cv2.IPL_DEPTH_16U
cv2.CV_8S instead of cv2.IPL_DEPTH_8S
cv2.CV_8U instead of cv2.IPL_DEPTH_8U
Related
I'm using OpenCV version 4.1.1 in Python and cannot get a legitimate reading for a 32-bit image, even when I use cv2.IMREAD_ANYDEPTH. Without cv2.IMREAD_ANYDEPTH, it returns as None type; with it, I get a matrix of zeros. The issue persists after reinstalling OpenCV. os.path.isfile returns True. The error was replicated on another computer. The images open in ImageJ, so I wouldn't think they're corrupted. I would rather use skimage, since it reads the images just fine, but I have to use OpenCV for what I'm working on. Any advice is appreciated.
img = cv2.imread(file,cv2.IMREAD_ANYDEPTH)
Link for the image: https://drive.google.com/file/d/1IiHbemsmn2gLW12RG3i9fLYZQW2u8sQw/view?usp=sharing
It appears to be some bug in how OpenCV loads such TIFF images. Pillow seems to load the image in a sensible way. Running
from PIL import Image
import numpy as np
img_pil = Image.open('example_image.tiff')
img_pil_cv = np.array(img_pil)
print(img_pil_cv.dtype)
print(img_pil_cv.max())
I get
int32
40950
as an output, which looks reasonable enough.
When I do
import cv2
img_cv = cv2.imread('example_image.tiff', cv2.IMREAD_ANYDEPTH)
print(img_cv.dtype)
print(img_cv.max())
I get
float32
5.73832e-41
which is obviously wrong.
Nevertheless, the byte array holding the pixel data is correct; it's just not being interpreted correctly. You can use numpy.ndarray.view to reinterpret the datatype of a numpy array, so that it's treated as an array of 32-bit integers instead.
img_cv = cv2.imread('example_image.tiff', cv2.IMREAD_ANYDEPTH)
img_cv = img_cv.view(np.int32)
print(img_cv.dtype)
print(img_cv.max())
Which prints out
int32
40950
Since the maximum value is small enough to fit in a 16-bit integer, let's convert the array and see what it looks like:
img_cv_16bit = img_cv.astype(np.uint16)
cv2.imwrite('output_cv_16bit.png', img_cv_16bit)
OK, there are some bright spots, and a barely visible pattern. With a little adjustment, we can get something visible:
img_cv_8bit = np.clip(img_cv_16bit // 16, 0, 255).astype(np.uint8)
cv2.imwrite('output_cv_8bit.png', img_cv_8bit)
That looks quite reasonable now.
I'm trying to learn how to use pydicom for reading and processing DICOM images. I'm using Python 3.
import dicom
import numpy
ds = pydicom.read_file(lstFilesDCM[0])
print(ds.pixel_array)
I get an error NameError: name 'pydicom' is not defined. If I change
ds = pydicom.read_file(lstFilesDCM[0])
to
ds = dicom.read_file(lstFilesDCM[0])
(using dicom.read_file instead), I get the following error:
NotImplementedError: Pixel Data is compressed in a format
pydicom does not yet handle. Cannot return array
I also verified that pydicom is properly installed and updated.
How do I fix this?
You are trying to use a module that you have not imported:
Use:
import pydicom
import numpy
ds = pydicom.read_file(lstFilesDCM[0])
print(ds.pixel_array)
or
import dicom
ds = dicom.read_file("the_name_of_file.dcm")
Documentation: http://pydicom.readthedocs.io/en/stable/pydicom_user_guide.html
If you want to get your hands on the pixel data, I suggest using the convert program from the ImageMagick suite. You can either call this program from Python using the subprocess module (see this example, where I convert the files to JPEG format), or you can use one of the Python bindings.
If you want to manipulate the images, using the bindings might be preferable. But note that not all of the bindings have been updated to ImageMagick version 7.
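As a rough sketch of the subprocess route (assuming ImageMagick is installed and on the PATH; the file names here are placeholders):
import subprocess

# ask ImageMagick's convert tool to transcode the DICOM file to JPEG
subprocess.check_call(['convert', 'input.dcm', 'output.jpg'])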
I am trying to isolate text from an image with OpenCV before sending it to the Tesseract 4 engine to maximize results.
I found this interesting post and I decided to copy the source and try it by myself.
However, I am getting an issue with the first call to OpenCV.
To reproduce:
Simply copy the code from the gist
launch the command script.py /path/to/image.jpg
I am getting this error:
Required argument 'threshold2' (pos 4) not found
Do you maybe have an idea of what it means?
I am a JavaScript, Java and Bash script developer, but not Python...
In a simple version:
import glob
import os
import random
import sys
import math
import json
from collections import defaultdict

import cv2
from PIL import Image, ImageDraw
import numpy as np
from scipy.ndimage.filters import rank_filter

if __name__ == '__main__':
    if len(sys.argv) == 2 and '*' in sys.argv[1]:
        files = glob.glob(sys.argv[1])
        random.shuffle(files)
    else:
        files = sys.argv[1:]

    for path in files:
        out_path = path.replace('.jpg', '.crop.png')
        if os.path.exists(out_path): continue
        orig_im = Image.open(path)
        edges = cv2.Canny(np.asarray(orig_im), 100, 200)
Thanks in advance for your help
Edit: Okay, so this answer is apparently wrong; I tried to send my own 16-bit int image into the function and couldn't reproduce the results.
Edit2: So I can reproduce the error with the following:
from PIL import Image
import numpy as np
import cv2
orig_im = Image.open('opencv-logo2.png')
threshold1 = 50
threshold2 = 150
edges = cv2.Canny(orig_im, threshold1, threshold2)
TypeError: Required argument 'threshold2' (pos 4) not found
So if the image was not cast to an array, i.e. the Image class was passed in, I got the error. The PIL Image class is a class with a lot of things other than the image data associated with it, so casting to an np.array is necessary before passing it into OpenCV functions. But if it was properly cast, everything ran swell for me.
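For completeness, a minimal sketch of the fix, casting before the call just as the gist's code already does:
import numpy as np

# convert the PIL Image to an ndarray before handing it to OpenCV
edges = cv2.Canny(np.asarray(orig_im), 50, 100)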
In a chat with Dan Mašek, it emerged that my idea below is a bit incorrect. It is true that the newer Canny() method needs 16-bit images, but the bindings don't look into the actual numpy dtype to see what bit-depth it is in order to decide which function call to use. Plus, if you try to actually send a uint16 image in, you get a different error:
edges = cv2.Canny(np.array([[0, 1234], [1234, 2345]], dtype=np.uint16), 50, 100)
error: (-215) depth == CV_8U in function Canny
So the answer I originally gave (below) is not the total culprit. Perhaps you accidentally removed the np.array() casting of orig_im and got that error, or something else weird is going on.
Original (wrong) answer
In OpenCV 3.2.0, a new method for Canny() was introduced to allow users to specify their own gradient image. In the original implementation, Canny() would use the Sobel() operator for calculating the gradients, but now you could calculate say the Scharr() derivatives and pass those into Canny() instead. So that's pretty cool. But what does this have to do with your problem?
The Canny() method is overloaded. And it decides which function you want to use based on the arguments you send in. The original call for Canny() with the required arguments looks like
cv2.Canny(image, threshold1, threshold2)
but the new overloaded method looks like
cv2.Canny(grad_x, grad_y, threshold1, threshold2)
Now, there was a hint in your error message:
Required argument 'threshold2' (pos 4) not found
Which one of these calls had threshold2 in position 4? The newer method call! So why was that being called if you only passed three args? Note that you were getting the error if you used a PIL image, but not if you used a numpy image. So what else made it assume you were using the new call?
If you check the OpenCV 3.3.0 Canny() docs, you'll see that the original Canny() call requires an 8-bit input image for the first positional argument, whereas the new Canny() call requires a 16-bit x derivative of input image (CV_16SC1 or CV_16SC3) for the first positional argument.
Putting two and two together, PIL was giving you a 16-bit input image, so OpenCV thought you were trying to call the new method.
So the solution here, if you want to continue using PIL, is to convert your image to an 8-bit representation. First off, Canny() needs a single-channel (i.e. grayscale) image to run. So you'll need to make sure the image is single-channel first, and then scale it and change the numpy dtype. I believe PIL will read a grayscale image as single-channel (OpenCV, by default, reads all images as three-channel unless you tell it otherwise).
If the image is 16-bit, then the conversion is easy with numpy:
img = (img/256).astype('uint8')
This assumes img is a numpy array, so you would need to cast the PIL image to ndarray first with np.array() or np.asarray().
And then you should be able to run Canny() with the original function call.
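Putting it together, a minimal sketch assuming a single-channel 16-bit source image (the file name is a placeholder):
from PIL import Image
import numpy as np
import cv2

img = np.asarray(Image.open('some_16bit_gray.png'))  # uint16 ndarray
img = (img / 256).astype('uint8')                    # scale down to 8-bit
edges = cv2.Canny(img, 50, 100)                      # the original call works again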
The issue was coming from an incompatibility between the interface used and the OpenCV version.
I was using OpenCV 3.3, so the correct way to call it is:
orig_im = cv2.imread(path)
edges = cv2.Canny(orig_im, 100, 200)
I'm working with OpenCV in Python. I want to get input from an Asus Xtion.
I'm able to successfully run samples from PyOpenNI.
I want to use the image (str format) obtained by igen.get_synced_image_map_bgr() in OpenCV
(igen is an ImageGenerator).
I want to convert it to an IplImage.
How can I do this, or how can I otherwise use the input from the depth sensor in OpenCV Python code?
I recently used the string-format depth data from a Kinect in PyOpenNI with OpenCV. Use numpy arrays, which can be created from strings and which are the default data type in cv2 (OpenCV) for Python.
Code example here: http://euanfreeman.co.uk/pyopenni-and-opencv/
Not sure how Kinect differs from your depth sensor but that may be a useful starting point. Good luck!
Edit: added code
from openni import *
import numpy as np
import cv2
# Initialise OpenNI
context = Context()
context.init()
# Create a depth generator to access the depth stream
depth = DepthGenerator()
depth.create(context)
depth.set_resolution_preset(RES_VGA)
depth.fps = 30
# Start Kinect
context.start_generating_all()
context.wait_any_update_all()
# Create array from the raw depth map string
frame = np.fromstring(depth.get_raw_depth_map_8(), "uint8").reshape(480, 640)
# Render in OpenCV (waitKey is needed for the window to actually draw)
cv2.imshow("image", frame)
cv2.waitKey(0)
I checked the documentation, but it's incomplete: there is no mention of what the rtype parameter actually is.
I think it's a reduce type, but I can't find any of the variables like cv2.CV_REDUCE_SUM etc... I found this problem with many functions that use different variable names. What's the best way to find the proper names in the cv2 API?
I found out that the appropriate variables can be found in the following module:
cv2.cv
If you use the CV_REDUCE_SUM operator on a uint8 image, you have to explicitly provide a dtype parameter of bigger range to avoid overflow, e.g.:
slice = cv2.reduce(image, 1, cv2.cv.CV_REDUCE_SUM, dtype=numpy.int32)
If you use the CV_REDUCE_AVG operation, the result can't overflow; that's why setting dtype is optional.
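For instance, a sketch on the same uint8 image, with no dtype argument needed:
row_averages = cv2.reduce(image, 1, cv2.cv.CV_REDUCE_AVG)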
There are some omissions in the current cv2 lib. Typically these are constants that did not get migrated to cv2 yet and are still in cv only. Here is some code to help you find them:
import cv2
import cv2.cv as cv
nms = [(n.lower(), n) for n in dir(cv)] # list of everything in the cv module
nms2 = [(n.lower(), n) for n in dir(cv2)] # list of everything in the cv2 module
search = 'window'
print "in cv2\n ",[m[1] for m in nms2 if m[0].find(search.lower())>-1]
print "in cv\n ",[m[1] for m in nms if m[0].find(search.lower())>-1]
If you're finding this while using OpenCV 3.x or later, these constants have been renamed to cv2.REDUCE_SUM, cv2.REDUCE_AVG, cv2.REDUCE_MAX, and cv2.REDUCE_MIN.
An example of the working reduce function:
reducedArray = cv2.reduce(im, 0, cv2.REDUCE_MAX)
GitHub issue for documentation