I am using OpenCV with Python. I wanted to do a cv2.imwrite:
cv2.imwrite('myimage.png', my_im)
The only problem is that OpenCV does not recognize the parameter constants:
cv2.imwrite('myimage.png', my_im, cv2.CV_IMWRITE_PNG_COMPRESSION, 0)
It cannot find CV_IMWRITE_PNG_COMPRESSION at all. Any ideas?
If you can't find a key CV_XXXXX in the cv2 module:
Try cv2.XXXXX
Failing that, use cv2.cv.CV_XXXXX
In your case, cv2.cv.CV_IMWRITE_PNG_COMPRESSION.
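Note that imwrite expects the parameters as a single list of flag/value pairs, not as separate positional arguments. A minimal sketch using the old-style constant, assuming an OpenCV 2.x build where the cv2.cv submodule exists and with a placeholder source image:
import cv2
my_im = cv2.imread('myimage_src.png')  # placeholder source image
# params go in as one flat list: [flag, value, flag, value, ...]
cv2.imwrite('myimage.png', my_im, [cv2.cv.CV_IMWRITE_PNG_COMPRESSION, 0])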
The docs for OpenCV (cv2 interface) are a bit confusing.
Usually parameters that look like CV_XXXX are actually cv2.XXXX.
I use the following to search for the relevant cv2 constant name. Say I was looking for CV_MORPH_DILATE. I'll search for any constant with MORPH in it:
import cv2
nms = dir(cv2) # list of everything in the cv2 module
[m for m in nms if 'MORPH' in m]
# ['MORPH_BLACKHAT', 'MORPH_CLOSE', 'MORPH_CROSS', 'MORPH_DILATE',
# 'MORPH_ELLIPSE', 'MORPH_ERODE', 'MORPH_GRADIENT', 'MORPH_OPEN',
# 'MORPH_RECT', 'MORPH_TOPHAT']
From this I see that MORPH_DILATE is what I'm looking for.
However, sometimes the constants have not been moved from the cv interface to the cv2 interface yet.
In that case, you can find them under cv2.cv.CV_XXXX.
So I looked for IMWRITE_PNG_COMPRESSION for you and couldn't find it under cv2; I then looked for cv2.cv.CV_IMWRITE_PNG_COMPRESSION, and hey presto! It's there:
>>> cv2.cv.CV_IMWRITE_PNG_COMPRESSION
16
Expanding on mathematical.coffee's answer to ignore case and look in both namespaces:
import cv2
import cv2.cv as cv
nms = [(n.lower(), n) for n in dir(cv)] # list of everything in the cv module
nms2 = [(n.lower(), n) for n in dir(cv2)] # list of everything in the cv2 module
search = 'imwrite'
print "in cv2\n ",[m[1] for m in nms2 if m[0].find(search.lower())>-1]
print "in cv\n ",[m[1] for m in nms if m[0].find(search.lower())>-1]
>>>
in cv2
['imwrite']
in cv
['CV_IMWRITE_JPEG_QUALITY', 'CV_IMWRITE_PNG_COMPRESSION', 'CV_IMWRITE_PXM_BINARY']
>>>
Hopefully this problem will go away in some later release of cv2...
The compression format is chosen automatically from the file extension; see the cv2.imwrite help for details.
However, you might still want to know all the flags used by the various functions in the cv2 and cv modules.
Look for cv2.txt and cv.txt on your computer; they will be where the OpenCV modules are installed. At the bottom of those text files is a list of the flags used by the respective modules.
Just in case you don't find them, you can download the ones I have from here, though they are from August 2011:
cv2.txt
cv.txt
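If those text files aren't present on your install, a rough equivalent is to dump the all-caps names from each module yourself. A quick sketch, with placeholder output file names and assuming an OpenCV 2.x build where cv2.cv is importable:
import cv2
import cv2.cv as cv
# write every all-caps name (the flags/constants) from each module to a text file
open('cv2_flags.txt', 'w').write('\n'.join(sorted(n for n in dir(cv2) if n.isupper())))
open('cv_flags.txt', 'w').write('\n'.join(sorted(n for n in dir(cv) if n.isupper())))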
In fact, with the cv2-style API, this constant is replaced with cv2.IMWRITE_PNG_COMPRESSION.
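For example, with a newer cv2 build the call from the question would look roughly like this (the params still go in as one list; the source image is a placeholder):
import cv2
my_im = cv2.imread('myimage_src.png')  # placeholder source image
cv2.imwrite('myimage.png', my_im, [cv2.IMWRITE_PNG_COMPRESSION, 0])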
Binary image B2
Binary image Y2
I think these images are quite simple and clear. Still pytesseract does not work. I really wonder why.
Here is my code
from pytesseract import pytesseract as tesseract
import cv2 as cv
binary = cv.imread(filepath)
lang = 'eng'
config = 'tessedit_char_whitelist=RGB123'
print(tesseract.image_to_string(binary, lang=lang, config=config))
The output is just a blank string.
To Dennlinger's point, I would definitely rotate it before sending it through PyTess. PyTess should rotate it automatically though. Should.
Alternatively, I see in your configuration that you have whitelisted "RGB123", which, correct me if I'm wrong, may mean that PyTess is only looking for those specific characters and digits.
I'd try omitting that whitelist so that it can pick up the "Y" in there, as in the sketch below.
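A minimal sketch of what I mean; cv.rotate and the ROTATE_* flags exist in recent OpenCV builds, but the chosen rotation and the file name are only guesses about your image:
from pytesseract import pytesseract as tesseract
import cv2 as cv
binary = cv.imread('binary_y2.png')  # placeholder path to the image from the question
# rotate in case the text is sideways; adjust the flag to the actual orientation
rotated = cv.rotate(binary, cv.ROTATE_90_CLOCKWISE)
# no character whitelist, so the 'Y' is not filtered out
print(tesseract.image_to_string(rotated, lang='eng'))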
When using tesserocr how do I limit the set of characters that Tesseract will recognize to just the digits?
I know from this that if I were using C++ I could set a tessedit_char_whitelist in the config file, but I don't know the analogous approach in tesserocr within Python.
In general, the tesserocr documentation gives help that works if the reader already knows the Tesseract API for C++. As I am not fluent in C++, I am hoping to avoid having to read the C++ source code in order to use tesserocr.
If anyone can give me what I actually need to write in python or a general rule for going from config settings to Python code that would be great. Thanks in advance.
Tesserocr works like the C++ API; you can set a whitelist with the SetVariable function.
An example:
from tesserocr import PyTessBaseAPI
from string import digits
with PyTessBaseAPI() as api:
    api.SetVariable('tessedit_char_whitelist', digits)
    api.SetImageFile('image.png')
    print api.GetUTF8Text()  # it will print only digits
If you want another approach that is more straightforward and independent from the C++ API, try with the pytesseract module.
An example with pytesseract:
import pytesseract
from PIL import Image
from string import digits
image = Image.open('image.png')
print pytesseract.image_to_string(
    image, config='-c tessedit_char_whitelist=' + digits)
I am trying to isolate text from an image with OpenCV before sending it to the Tesseract 4 engine to maximize results.
I found this interesting post and decided to copy the source and try it myself.
However, I am getting an issue with the first call to OpenCV.
To reproduce:
Simply copy the code from the gist
Launch the command script.py /path/to/image.jpg
I am getting this issue:
Required argument 'threshold2' (pos 4) not found
Do you maybe have an idea of what it means?
I am a JavaScript, Java, and Bash developer, but not a Python one...
In a simple version:
import glob
import os
import random
import sys
import random
import math
import json
from collections import defaultdict
import cv2
from PIL import Image, ImageDraw
import numpy as np
from scipy.ndimage.filters import rank_filter
if __name__ == '__main__':
    if len(sys.argv) == 2 and '*' in sys.argv[1]:
        files = glob.glob(sys.argv[1])
        random.shuffle(files)
    else:
        files = sys.argv[1:]

    for path in files:
        out_path = path.replace('.jpg', '.crop.png')
        if os.path.exists(out_path): continue
        orig_im = Image.open(path)
        edges = cv2.Canny(np.asarray(orig_im), 100, 200)
Thanks in advance for your help
Edit: okay so this answer is apparently wrong, as I tried to send my own 16-bit int image into the function and couldn't reproduce the results.
Edit2: So I can reproduce the error with the following:
from PIL import Image
import numpy as np
import cv2
orig_im = Image.open('opencv-logo2.png')
threshold1 = 50
threshold2 = 150
edges = cv2.Canny(orig_im, 50, 100)
TypeError: Required argument 'threshold2' (pos 4) not found
So if the image was not cast to an array, i.e., the Image class was passed in, I get the error. The PIL Image class is a class with a lot of things other than the image data associated to it, so casting to a np.array is necessary to pass into functions. But if it was properly cast, everything runs swell for me.
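In other words, the fix is simply to hand Canny() an ndarray instead of the PIL Image object. A minimal sketch using the same file name as above (the .convert('L') call just forces a single-channel image):
from PIL import Image
import numpy as np
import cv2
orig_im = Image.open('opencv-logo2.png').convert('L')  # force single-channel grayscale
edges = cv2.Canny(np.asarray(orig_im), 50, 100)  # runs fine once it's a numpy array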
After a chat with Dan Mašek, it turns out my idea below is a bit incorrect. It is true that the newer Canny() method needs 16-bit images, but the bindings don't look at the actual numpy dtype to decide which overload to call. Plus, if you try to actually send a uint16 image in, you get a different error:
edges = cv2.Canny(np.array([[0, 1234], [1234, 2345]], dtype=np.uint16), 50, 100)
error: (-215) depth == CV_8U in function Canny
So the answer I originally gave (below) is not the total culprit. Perhaps you accidentally removed the np.array() casting of the orig_im and got that error, or, something else weird is going on.
Original (wrong) answer
In OpenCV 3.2.0, a new method for Canny() was introduced to allow users to specify their own gradient image. In the original implementation, Canny() would use the Sobel() operator for calculating the gradients, but now you could calculate say the Scharr() derivatives and pass those into Canny() instead. So that's pretty cool. But what does this have to do with your problem?
The Canny() method is overloaded. And it decides which function you want to use based on the arguments you send in. The original call for Canny() with the required arguments looks like
cv2.Canny(image, threshold1, threshold2)
but the new overloaded method looks like
cv2.Canny(grad_x, grad_y, threshold1, threshold2)
Now, there was a hint in your error message:
Required argument 'threshold2' (pos 4) not found
Which one of these calls had threshold2 in position 4? The newer method call! So why was that being called if you only passed three args? Note that you were getting the error if you used a PIL image, but not if you used a numpy image. So what else made it assume you were using the new call?
If you check the OpenCV 3.3.0 Canny() docs, you'll see that the original Canny() call requires an 8-bit input image for the first positional argument, whereas the new Canny() call requires a 16-bit x derivative of input image (CV_16SC1 or CV_16SC3) for the first positional argument.
Putting two and two together, PIL was giving you a 16-bit input image, so OpenCV thought you were trying to call the new method.
So the solution here, if you wanted to continue using PIL, is to convert your image to an 8-bit representation. Canny() needs a single-channel (i.e. grayscale) image to run, first off. So you'll need to make sure the image is single-channel first, and then scale it and change the numpy dtype. I believe PIL will read a grayscale image as single channel (OpenCV by default reads all images as three-channel unless you tell it otherwise).
If the image is 16-bit, then the conversion is easy with numpy:
img = (img/256).astype('uint8')
This assumes img is a numpy array, so you would need to cast the PIL image to ndarray first with np.array() or np.asarray().
And then you should be able to run Canny() with the original function call.
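Putting those steps together, a rough sketch for a 16-bit grayscale PIL image (the file name is hypothetical):
from PIL import Image
import numpy as np
import cv2
pil_img = Image.open('scan_16bit.png')  # hypothetical 16-bit grayscale image
img = np.asarray(pil_img)  # PIL Image -> ndarray
if img.dtype == np.uint16:
    img = (img / 256).astype('uint8')  # scale 16-bit values down to 8-bit
edges = cv2.Canny(img, 100, 200)  # the original three-argument call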
The issue was coming from an incompatibility between the interfaces used and the OpenCV version.
I was using OpenCV 3.3, so the correct way to call it is:
orig_im = cv2.imread(path)
edges = cv2.Canny(orig_im, 100, 200)
I have started to use Pytesser, which works great with both English and Chinese, but is there a way to have both languages work at the same time? Would I have to make my own traineddata file? My code is:
import Image
from pytesser import *
print image_to_string(Image.open("chinese_and_english.jpg"), lang="eng")
#also want to have chinese be recognized
I'm not sure about Pytesser but using tesserocr you can specify multiple languages. For example:
import tesserocr
with tesserocr.PyTessBaseAPI(lang='eng+chi_tra') as api:
    api.SetImageFile('eSXSz.jpg')
    print api.GetUTF8Text()

# or simply
print tesserocr.file_to_text('eSXSz.jpg', lang='eng+chi_tra')
Example output for your image:
In [8]: print tesserocr.file_to_text('eSXSz.jpg', lang='eng+chi_tra')
Character, Chmese 動m川爬d
胸肌岫馴伽 H枷﹏ P﹏… …
〔Manda‥﹝ 二 Standard C…爬虯
一
口
X慣ng怕ng
Note that it's more efficient to initialize the API once as in the first example and re-use it for multiple images by calling SetImageFile (or SetImage with a PIL.Image object) to avoid re-initializing the API every time.
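A short sketch of that reuse pattern, with placeholder file names:
import tesserocr
from PIL import Image
with tesserocr.PyTessBaseAPI(lang='eng+chi_tra') as api:
    for path in ('page1.jpg', 'page2.jpg'):  # placeholder file names
        api.SetImage(Image.open(path))  # or api.SetImageFile(path)
        print(api.GetUTF8Text())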
I checked the documentation, but it's incomplete: there is no mention of what the rtype parameter actually is.
I think it's a reduce type, but I can't find any of the variables like cv2.CV_REDUCE_SUM, etc. I ran into this problem with many functions that use different variable names. What's the best way to find the proper names in the cv2 API?
I found out that the appropriate constants can be found in the following module:
cv2.cv
If you use the CV_REDUCE_SUM operator on a uint8 image, you have to explicitly provide a dtype parameter with a bigger range to avoid overflow, e.g.:
slice = cv2.reduce(image, 1, cv2.cv.CV_REDUCE_SUM, dtype=numpy.int32)
If you use the CV_REDUCE_AVG operation, the result can't overflow, which is why setting dtype is optional.
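For instance, a sketch following the same call shape as above, reusing the image variable from that example:
# averaging a uint8 image cannot exceed the uint8 range, so dtype can be omitted
avg_slice = cv2.reduce(image, 1, cv2.cv.CV_REDUCE_AVG)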
There are some omissions in the current cv2 lib. Typically these are constants that have not been migrated to cv2 yet and are still in cv only. Here is some code to help you find them:
import cv2
import cv2.cv as cv
nms = [(n.lower(), n) for n in dir(cv)] # list of everything in the cv module
nms2 = [(n.lower(), n) for n in dir(cv2)] # list of everything in the cv2 module
search = 'window'
print "in cv2\n ",[m[1] for m in nms2 if m[0].find(search.lower())>-1]
print "in cv\n ",[m[1] for m in nms if m[0].find(search.lower())>-1]
If you're finding this while using OpenCV 3.x or later, these constants have been renamed to cv2.REDUCE_SUM, cv2.REDUCE_AVG, cv2.REDUCE_MAX, and cv2.REDUCE_MIN.
An example of the working reduce function:
reducedArray = cv2.reduce(im, 0, cv2.REDUCE_MAX)
GitHub issue for documentation
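If you need the sum of a uint8 image with the renamed constants, the dtype caveat from the older answer still applies; a sketch reusing the im variable from the example above:
# summing a uint8 image still needs a wider output depth to avoid overflow
rowSums = cv2.reduce(im, 1, cv2.REDUCE_SUM, dtype=cv2.CV_32S)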