OCR Space response ParsedResults[0] error - python

I am making a program in Python that scans receipts and relies on an OCR response from the OCRSpace API. It has worked perfectly in the past over a couple hundred tries, but when I upload an image to my Flask server from an iPhone instead of a computer, the image's contents produce no OCR result. I have tried using the same image on their website and it gives a normal response, but with my Flask app it returns:
parsed_results = result.get("ParsedResults")[0]
TypeError: 'NoneType' object is not subscriptable
I am using the code:
img = cv2.imread(file_path)
height, width, _ = img.shape
roi = img[0: height, 0: width]
_, compressedimage = cv2.imencode(".jpg", roi, [1, 90])  # 1 == int(cv2.IMWRITE_JPEG_QUALITY)
file_bytes = io.BytesIO(compressedimage)

url_api = "https://api.ocr.space/parse/image"
result = requests.post(url_api,
                       files={os.path.join(r'PATH', file_name): file_bytes},
                       data={"apikey": "KEY",
                             "language": "eng",
                             # "OCREngine": 2,
                             "isTable": True})
result = result.content.decode()
result = json.loads(result)
parsed_results = result.get("ParsedResults")[0]
global OCRText
OCRText = parsed_results.get("ParsedText")
Thanks for any help in advance!
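A defensive check on the decoded response makes the failure visible instead of the TypeError. A minimal sketch, assuming the error fields documented by OCR.space (IsErroredOnProcessing, ErrorMessage):

result = json.loads(result)
# When processing fails, ParsedResults is absent and the error fields explain why.
if result.get("IsErroredOnProcessing") or not result.get("ParsedResults"):
    raise RuntimeError(result.get("ErrorMessage"))
parsed_results = result["ParsedResults"][0]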

iPhones and iPads as of iOS 11 use HEIF as standard. There are no incompatibilities when transferring to a PC or sending by e.g. sharing, as the images are converted to the widely supported JPEG format; incompatibilities arise, however, when using cloud services, e.g. Google Photos.
High Efficiency Image File Format (HEIF)

As @rob247 posted, iPhones use the HEIF format by default (official link here).
So when uploading photos to the script, please try converting them to JPEG first, since OpenCV does not support *.heif, *.avif, or *.heic yet (see issue #14534). Also view the list of supported formats at opencv imread if you prefer other formats.
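For the conversion step, here is a minimal sketch assuming the third-party pillow-heif package, which registers a HEIF/HEIC decoder with Pillow (the file names are illustrative, not from the original answer):

from PIL import Image
from pillow_heif import register_heif_opener

register_heif_opener()  # lets Pillow open .heic/.heif files

# Convert the uploaded HEIC to JPEG before handing it to OpenCV.
with Image.open("upload.heic") as im:
    im.convert("RGB").save("upload.jpg", "JPEG")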

extract thumbnail from online video link

I am building a website, and I am trying to extract one image from a video directly, using the link provided by a user.
The thing is that everything needs to be done in memory: download the video into memory (using requests, for example), extract an image, and upload it to my AWS bucket.
I have searched for a solution and found cv2. Locally, I was able to extract one image using:
vcap = cv2.VideoCapture(path_to_vid)
res, im_ar = vcap.read()  # grab a frame to encode
res, thumb_buf = cv2.imencode('.png', im_ar)
bt = thumb_buf.tostring()
The issue is that, from my research, cv2.VideoCapture does not support reading and decoding from bytes or from the content of a response, so I am back to square one.
Ideally I wanted something like this:
r = requests.get(url)
vcap = cv2.VideoCapture(io.BytesIO(r.content))  # VideoCapture doesn't accept file-like objects, hence the question
res, im_ar = vcap.read()
res, thumb_buf = cv2.imencode('.png', im_ar)
bt = thumb_buf.tostring()
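One possible workaround, sketched here as an assumption rather than a confirmed answer: since cv2.VideoCapture only accepts a filename (or device index), spool the downloaded bytes to a named temporary file first (on Windows, delete=False plus a manual unlink may be needed):

import tempfile

import cv2
import requests

r = requests.get(url)  # url supplied by the user
with tempfile.NamedTemporaryFile(suffix=".mp4") as tmp:
    tmp.write(r.content)
    tmp.flush()
    vcap = cv2.VideoCapture(tmp.name)  # VideoCapture needs a real path
    ok, frame = vcap.read()            # first frame of the video
    vcap.release()

if ok:
    _, thumb_buf = cv2.imencode('.png', frame)
    bt = thumb_buf.tobytes()           # PNG bytes, ready for the S3 upload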

Only one image out of 5 is downloaded and it throws an error

import requests
from PIL import Image

url_shoes_for_choice = [
    "https://content.adidas.co.in/static/Product-CM7531/Unisex_OUTDOOR_SANDALS_CM7531_1.jpg",
    "https://cdn.shopify.com/s/files/1/0080/1374/2161/products/product-image-897958210_640x.jpg?v=1571713841",
    "https://cdn.chamaripashoes.com/media/catalog/product/cache/9/image/9df78eab33525d08d6e5fb8d27136e95/1/_/1_8_3.jpg",
    "https://ae01.alicdn.com/kf/HTB1EyKjaI_vK1Rjy0Foq6xIxVXah.jpg_q50.jpg",
    "https://www.converse.com/dw/image/v2/BCZC_PRD/on/demandware.static/-/Sites-cnv-master-catalog/default/dwb9eb8c43/images/a_107/167708C_A_107X1.jpg"
]

def img():
    for url in url_shoes_for_choice:
        image = requests.get(url, stream=True).raw
        out = Image.open(image)
        out.save('image/image.jpg', 'jpg')

if __name__ == "__main__":
    img()
Error:
OSError: cannot identify image file <_io.BytesIO object at 0x7fa185c52d58>
The problem is that one of the images causes issues with the byte data returned by requests.get(url, stream=True).raw. I'm not sure, but I guess the data of the 3rd image is invalid as a raw stream, so instead of reading the raw data we can fetch the content and wrap it in BytesIO.
I fixed one more thing in your original code: I added numbering to your images so each one is saved under a different name.
from io import BytesIO

def img():
    for count, url in enumerate(url_shoes_for_choice):
        image = requests.get(url, stream=True)
        with BytesIO(image.content) as f:
            with Image.open(f) as out:
                # out.show()  # see the images
                out.save('image/image{}.jpg'.format(count))
(Though this works fine, I'm not sure what the main issue was. If anyone knows exactly what the issue is, please comment and explain.)
I opened the first link in my browser and saved the image. It's actually a webp file.
$ file Unisex_OUTDOOR_SANDALS_CM7531_1.webp
Unisex_OUTDOOR_SANDALS_CM7531_1.webp: RIFF (little-endian) data, Web/P image, VP8 encoding, 500x500, Scaling: [none]x[none], YUV color, decoders should clamp
You explicitly tell the image library that it should expect a jpg. When you remove that parameter and let it figure the format out on its own, using out.save('image/image.jpg'), the first image downloads successfully for me.
The first two images work this way if you make sure to save each under a different name:
def img():
    i = 0
    for url in url_shoes_for_choice:
        i += 1
        image = requests.get(url, stream=True).raw
        out = Image.open(image)
        out.save('image{}.jpg'.format(i))
The third is a valid JPEG file, as is the fourth, but it uses the JFIF standard 1.01, which I'm hearing of for the first time. I'm pretty sure you'll have to figure out support for such different file types.
It is worth noting that if I download the images in Chrome and open them with Python, nothing fails, so Chrome might be adding information to the file.
The documentation of PIL/pillow explains here that you need a new enough version for animated images, but that is not your problem.
Support for animated WebP files will only be enabled if the system WebP library is v0.5.0 or later. You can check WebP animation support at runtime by calling features.check("webp_anim").
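Pulling these answers together, a more robust variant is sketched below: let Pillow sniff the real format from the bytes and re-encode to JPEG explicitly (the save_as_jpeg helper is made up for illustration):

from io import BytesIO

import requests
from PIL import Image

def save_as_jpeg(url, path):
    # Pillow detects the actual format (WebP, JPEG, ...) from the bytes.
    r = requests.get(url)
    r.raise_for_status()
    with Image.open(BytesIO(r.content)) as im:
        # JPEG cannot store alpha, so normalize to RGB before saving.
        im.convert("RGB").save(path, "JPEG")

for count, url in enumerate(url_shoes_for_choice):
    save_as_jpeg(url, 'image/image{}.jpg'.format(count))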

Using wand to reduce image filesize for improved OCR performance?

I'm trying to use wand, the simple MagickWand API binding for Python, to extract pages from a PDF, stitch them together into a single longer ("taller") image, and pass that image to Google Cloud Vision for OCR text detection. I keep running up against Google Cloud Vision's 10MB file size limit.
I thought a good way to get the filesize down might be to eliminate all color channels and just feed Google a B&W image. I figured out how to get grayscale, but how can I make my color image into a B&W ("bilevel") one? I'm also open to other suggestions for getting the filesize down. Thanks in advance!
from wand.image import Image

selected_pages = [0, 1]
imageFromPdf = Image(filename=pdf_filepath + str(selected_pages), resolution=600)
pages = len(imageFromPdf.sequence)
image = Image(
    width=imageFromPdf.width,
    height=imageFromPdf.height * pages
)
for i in range(pages):
    image.composite(
        imageFromPdf.sequence[i],
        top=imageFromPdf.height * i,
        left=0
    )
image.colorspace = 'gray'
image.alpha_channel = False
image.format = 'png'
image  # in a notebook, this displays the composited result
The following are several methods of getting a bilevel output from Python Wand (0.5.7). The last needs ImageMagick 7 to work. One note from my testing: in IM 7, the first two results are swapped in terms of dithering or not dithering; I have reported this to the Python Wand developer.
Input:
from wand.image import Image
from wand.display import display

# Using Wand 0.5.7: all images are not dithered in IM 6 and all images are dithered in IM 7
with Image(filename='lena.jpg') as img:
    with img.clone() as img_copy1:
        img_copy1.quantize(number_colors=2, colorspace_type='gray', treedepth=0, dither=False, measure_error=False)
        img_copy1.auto_level()
        img_copy1.save(filename='lena_monochrome_no_dither.jpg')
        display(img_copy1)
    with img.clone() as img_copy2:
        img_copy2.quantize(number_colors=2, colorspace_type='gray', treedepth=0, dither=True, measure_error=False)
        img_copy2.auto_level()
        img_copy2.save(filename='lena_monochrome_dither.jpg')
        display(img_copy2)
    with img.clone() as img_copy3:
        img_copy3.threshold(threshold=0.5)
        img_copy3.save(filename='lena_threshold.jpg')
        display(img_copy3)
    # only works in IM 7
    with img.clone() as img_copy4:
        img_copy4.auto_threshold(method='otsu')
        img_copy4.save(filename='lena_threshold_otsu.jpg')
        display(img_copy4)
First output using IM 6:
Second output using IM 7:
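If a 1-bit image is the end goal, Wand also exposes ImageMagick's image type directly; a minimal sketch (not from the answer above), using the type attribute, which maps to ImageMagick's -type bilevel:

from wand.image import Image

# Setting the type to 'bilevel' asks ImageMagick for a 1-bit
# black-and-white image, which also shrinks the output file considerably.
with Image(filename='lena.jpg') as img:
    img.type = 'bilevel'
    img.save(filename='lena_bilevel.png')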

How to encode OpenCV Image as bytes using Python

I am having difficulty sending a jpeg opened with cv2 to a server as bytes. The server complains that the file type is not supported. I can send it without problems using Python's "open" function, but not with OpenCV. How can I get this to work?
import cv2

path = r".\test\frame1.jpg"
with open(path, "rb") as image:
    image1 = image.read()
image2 = cv2.imread(path, -1)
image2 = cv2.imencode(".jpg", image2)[1].tobytes()  # also tried tostring()
print(image1 == image2)
# This prints False.
# I want it to be True, or alternatively encoded in a way that the server accepts.
I want to start by getting your test case working. We will do this by using a lossless format with no compression, so we are comparing apples to apples:
import cv2

path_in = r".\test\frame1.jpg"
path_temp = r".\test\frame1.bmp"
img = cv2.imread(path_in, -1)
cv2.imwrite(path_temp, img)  # save in a lossless format for a fair comparison
with open(path_temp, "rb") as image:
    image1 = image.read()
image2 = cv2.imencode(".bmp", img)[1].tobytes()
print(image1 == image2)
# This prints True.
This is not ideal, since compression is desirable for moving bytes around, but it illustrates that there is nothing inherently wrong with your encoding.
Without knowing the details of your server it is hard to say why it isn't accepting the OpenCV-encoded images. Some suggestions:
provide format-specific encoding parameters as described in the docs; available flags are here
try different extensions
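For the first suggestion, a small sketch of passing format-specific parameters to cv2.imencode (JPEG quality here; the value 90 is arbitrary):

import cv2

img = cv2.imread(r".\test\frame1.jpg", -1)

# Encode as JPEG with an explicit quality setting.
ok, buf = cv2.imencode(".jpg", img, [int(cv2.IMWRITE_JPEG_QUALITY), 90])
jpeg_bytes = buf.tobytes()  # bytes ready to send to the server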

Error in Boto AWS Rekognition

I am trying to compare faces using AWS Rekognition through Python boto3, as instructed in the AWS documentation.
My API call is:
client = boto3.client('rekognition', aws_access_key_id=key, aws_secret_access_key=secret, region_name=region)
source_bytes = open('source.jpg', 'rb')
target_bytes = open('target.jpg', 'rb')
response = client.compare_faces(
    SourceImage = {
        'Bytes': bytearray(source_bytes.read())
    },
    TargetImage = {
        'Bytes': bytearray(target_bytes.read())
    },
    SimilarityThreshold = SIMILARITY_THRESHOLD
)
source_bytes.close()
target_bytes.close()
But every time I run this program, I get the following error:
botocore.errorfactory.InvalidParameterException: An error occurred (InvalidParameterException) when calling the CompareFaces operation: Request has Invalid Parameters
I have specified the secret, key, region, and threshold properly. How can I clear off this error and make the request call work?
Your code is perfectly fine; image dimensions matter when it comes to AWS Rekognition.
Limits in Amazon Rekognition
The following is a list of limits in Amazon Rekognition:
Maximum image size stored as an Amazon S3 object is limited to 15 MB.
The minimum pixel resolution for height and width is 80 pixels.
Maximum image size as raw bytes passed in as a parameter to an API is 5 MB.
Amazon Rekognition supports the PNG and JPEG image formats. That is, the images you provide as input to various API operations, such as DetectLabels and IndexFaces, must be in one of the supported formats.
Maximum number of faces you can store in a single face collection is 1 million.
The maximum matching faces the search API returns is 4096.
source: AWS Docs
For those still looking for an answer:
I had the same problem. While @mohanbabu pointed to the official docs for what should go into compare_faces, what I realised is that compare_faces looks for faces in both SourceImage and TargetImage. I confirmed this by first detecting faces using AWS's detect_faces and passing the detected faces to compare_faces.
compare_faces failed almost all the time when the face detected by detect_faces was a little obscure.
So, to summarize: if either your SourceImage or TargetImage is tightly cropped to a face AND that face is not instantly obvious, compare_faces will fail.
There can be other reasons, but this observation worked for me.
For example, in the first image above you can fairly confidently say there is a face in the middle; in the second, it is not so obvious.
This was the reason for me at least; check both your images and you should know.
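To make that check concrete, here is a sketch of the detect_faces pre-check described above (the has_face helper is made up for illustration; detect_faces and FaceDetails are the real boto3 names):

def has_face(client, path):
    # Ask Rekognition whether it can find at least one face in the image.
    with open(path, 'rb') as f:
        resp = client.detect_faces(Image={'Bytes': f.read()})
    return len(resp['FaceDetails']) > 0

if has_face(client, 'source.jpg') and has_face(client, 'target.jpg'):
    pass  # both faces are detectable, so compare_faces should not fail on that account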
The way you are opening the file, you don't need to cast to bytearray.
Try this:
client = boto3.client('rekognition', aws_access_key_id=key, aws_secret_access_key=secret, region_name=region)
source_bytes = open('source.jpg', 'rb')
target_bytes = open('target.jpg', 'rb')
response = client.compare_faces(
    SourceImage = {
        'Bytes': source_bytes.read()
    },
    TargetImage = {
        'Bytes': target_bytes.read()
    },
    SimilarityThreshold = SIMILARITY_THRESHOLD
)
source_bytes.close()
target_bytes.close()
