Improve image quality using Python

I have a problem with my task: I need to get text from an image using Python and Tesseract, but the image quality is not high because it is a screenshot.
I'm using the OpenCV library and have tried two variants:
COLOR_BGR2GRAY
ADAPTIVE_THRESH_GAUSSIAN_C | THRESH_BINARY
and neither variant works correctly.
When I binarize the image with
import numpy as np
from PIL import Image
def binarize_image(img_path, threshold=195):
    """Binarize an image."""
    image_file = Image.open(img_path)
    image = image_file.convert('L') # convert image to monochrome
    image = np.array(image)
    image = binarize_array(image, threshold)
    im = Image.fromarray(image)
    im.save(img_path)
    # imsave(target_path, image)
def binarize_array(numpy_array, threshold=250):
    """Binarize a numpy array."""
    for i in range(len(numpy_array)):
        for j in range(len(numpy_array[0])):
            if numpy_array[i][j] > threshold:
                numpy_array[i][j] = 255
            else:
                numpy_array[i][j] = 0
    return numpy_array
Tesseract usually fails to recognize the text.
How can I resolve this problem?
screenshot example
UPD: I solved my problem: I need to add some pixels between two letters. In the screenshot the letters are white and the background is black. How can I do this using numpy?
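One possible way to add pixels between letters with numpy is sketched below. It is only an illustration under stated assumptions: the image is already binarized with white letters on a black background, and neighbouring letters are separated by at least one completely black column. The function name widen_letter_gaps and the parameter extra are made up for this example.

```python
import numpy as np

def widen_letter_gaps(binary, extra=2):
    """Widen the gaps between letters in a 2-D binarized image.

    Assumes white (non-zero) letters on a black (zero) background.
    Every column that contains no white pixel is repeated `extra`
    additional times; columns that contain letter pixels are kept once.
    """
    # True for columns that hold no letter pixels at all
    empty = ~(binary > 0).any(axis=0)
    # Per-column repeat count: 1 for letter columns, 1 + extra for empty ones
    repeats = np.where(empty, 1 + extra, 1)
    return np.repeat(binary, repeats, axis=1)
```

Note that this widens every empty column, including the left and right margins, and it does nothing for letters that already touch each other. The result can be passed to Image.fromarray() as in binarize_image above.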

Related

Python tesseract cannot read numbers from image

I have a Python script that reads numbers correctly from some images.
The type of image that works is here: Working image
I'm trying to use the script with a new kind of image that contains numbers only, but it is not working. The new image type is here: Non working image
My script is as follows:
try:
    from PIL import Image
    from PIL import ImageEnhance
except ImportError:
    import Image
import pytesseract
black = (0,0,0)
white = (255,255,255)
threshold = (160,160,160)
# Open input image in grayscale mode and get its pixels.
img = Image.open("./in/web_search.jpg").convert("LA")
# multiply each pixel by 1.3
out = img.point(lambda i: i * 1.3)
enh = ImageEnhance.Contrast(out)
enh.enhance(1.3).show("30% more contrast")
pixels = out.getdata()
newPixels = []
# Compare each pixel
for pixel in pixels:
    if pixel < threshold:
        newPixels.append(black)
    else:
        newPixels.append(white)
# Create and save new image.
newImg = Image.new("RGB",out.size)
newImg.putdata(newPixels)
newImg.save("./out/web_search.jpg")
pytesseract.pytesseract.tesseract_cmd = r'/usr/bin/tesseract'
print("-----------------------")
print(pytesseract.image_to_string(Image.open('./out/web_search.jpg'), lang='eng', config='--psm 10 --oem 3 -c tessedit_char_whitelist=1234567890 --tessdata-dir="/usr/share/tesseract-ocr/4.00/tessdata/"'))
print("-----------------------")
The result with my new image is:
-----------------------
Riemer gaat bee 6 eee
-----------------------
Any help please?
Thanks.
You'll probably need to do some work to get it to pick that up. Some things you can do are:
Tesseract allows you to limit the set of characters it may output. Set it to digits only.
Use some form of preprocessing to remove the noise, either with Pillow's noise-removal filters or with morphological opening/closing.
Fine-tune the recognition network on images like yours.
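The first two suggestions could look roughly like the sketch below. It uses only Pillow for the cleanup: a median filter for speckle noise, then a morphological opening built from MinFilter followed by MaxFilter. The function names, the threshold of 160 (taken from the question's script) and the choice of --psm 7 are assumptions for illustration, not tested settings for these particular images.

```python
from PIL import Image, ImageFilter

def clean_for_ocr(img, threshold=160):
    """Denoise and binarize an image before OCR (white text assumed)."""
    gray = img.convert("L")
    # A median filter removes isolated speckles while keeping edges fairly sharp
    gray = gray.filter(ImageFilter.MedianFilter(size=3))
    # Hard threshold to pure black and white
    binary = gray.point(lambda p: 255 if p >= threshold else 0)
    # Morphological opening for white foreground: erosion, then dilation
    return binary.filter(ImageFilter.MinFilter(3)).filter(ImageFilter.MaxFilter(3))

def read_digits(img):
    """Run Tesseract restricted to digits (needs pytesseract and Tesseract)."""
    import pytesseract  # imported here so the cleanup step works without it
    config = "--psm 7 -c tessedit_char_whitelist=0123456789"
    return pytesseract.image_to_string(clean_for_ocr(img), config=config)
```

If the digits are dark on a light background, the opening should be swapped for a closing (MaxFilter first, then MinFilter), or the image inverted before filtering.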

Python add noise to image breaks PNG

I'm trying to create an image system in Python 3 to be used in a web app. The idea is to load an image from disk and add some random noise to it. When I try this, I get what looks like a totally random image that does not resemble the original:
import cv2
import numpy as np
from skimage.util import random_noise
from random import randint
from pathlib import Path
from PIL import Image
import io
image_files = [
    {
        'name': 'test1',
        'file': 'test1.png'
    },
    {
        'name': 'test2',
        'file': 'test2.png'
    }
]
def gen_image():
    rand_image = randint(0, len(image_files)-1)
    image_file = image_files[rand_image]['file']
    image_name = image_files[rand_image]['name']
    image_path = str(Path().absolute())+'/img/'+image_file
    img = cv2.imread(image_path)
    noise_img = random_noise(img, mode='s&p', amount=0.1)
    img = Image.fromarray(noise_img, 'RGB')
    fp = io.BytesIO()
    img.save(fp, format="PNG")
    content = fp.getvalue()
    return content
gen_image()
I have also tried using pypng:
import png
# Added the following to gen_image()
content = png.from_array(noise_img, mode='L;1')
content.save('image.png')
How can I load a PNG (with alpha transparency) from disk, add some noise to it, and return it so that it can be displayed by web server code (flask, aiohttp, etc.)?
As indicated in the answer by makayla, this makes it better: noise_img = (noise_img*255).astype(np.uint8) but the colors are still wrong and there's no transparency.
Here's the updated function for that:
def gen_image():
    rand_image = randint(0, len(image_files)-1)
    image_file = image_files[rand_image]['file']
    image_name = image_files[rand_image]['name']
    image_path = str(Path().absolute())+'/img/'+image_file
    img = cv2.imread(image_path)
    cv2.imshow('dst_rt', img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    # Problem exists somewhere below this line.
    img = random_noise(img, mode='s&p', amount=0.1)
    img = (img*255).astype(np.uint8)
    img = Image.fromarray(img, 'RGB')
    fp = io.BytesIO()
    img.save(fp, format="png")
    content = fp.getvalue()
    return content
This will pop up the pre-noise image and return the noised image. The RGB (and alpha) problem exists in the returned image.
I think the problem is that it needs to be RGBA, but when I change to that, I get ValueError: buffer is not large enough
Given all the new information I am updating my answer with a few more tips for debugging the issue.
I found a site here which creates sample transparent images. I created a 64x64 cyan (R=0, G=255, B=255) image with a transparency layer of 0.5. I used this to test your code.
I read in the image two ways to compare: im1 = cv2.imread(fileName) and im2 = cv2.imread(fileName,cv2.IMREAD_UNCHANGED). np.shape(im1) returned (64,64,3) and np.shape(im2) returned (64,64,4). This is why that flag is required: with its default settings, OpenCV's imread drops the alpha channel and reads a transparent image as a normal 3-channel image.
However, OpenCV reads images in BGR order instead of RGB, and since you are not saving the image with OpenCV, you'll need to convert it to the correct channel order; otherwise the image will have reversed colors. For example, my cyan image, when viewed with the reversed colors, appears like this:
You can change the order with OpenCV's color conversion function, like this: im = cv2.cvtColor(im, cv2.COLOR_BGRA2RGBA) (Here is a list of all the color conversion codes). Again, double-check the shape of your image if you need to; it should still have four channels since you converted it to RGBA.
You can now add noise to your image. Be aware that this also adds noise to the alpha channel, randomly making some pixels more transparent and others less transparent. The random_noise function from skimage converts your image to float and returns it as float. This means the image values, normally integers ranging from 0 to 255, are converted to floating-point values from 0 to 1. Your line img = Image.fromarray(noise_img, 'RGB') does not know what to do with the floating-point noise_img. That's why the image is all messed up when you save it, and it was also messed up when I tried to show it.
So I took my cyan image, added noise, and then converted the floats back to 8 bits.
noise_img = random_noise(im, mode='s&p', amount=0.1)
noise_img = (noise_img*255).astype(np.uint8)
img = Image.fromarray(noise_img, 'RGBA')
It now looks like this (screenshot) using img.show():
I used the PIL library to save out my image instead of openCV so it's as close to your code as possible.
fp = 'saved_im.png'
img.save(fp, format="png")
I loaded the image into powerpoint to double-check that it preserved the transparency when I saved it using this method. Here is a screenshot of the saved image overlaid on a red circle in powerpoint:
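If the noise on the alpha channel is not wanted, one option is to apply the noise to the color channels only. The sketch below is a simplified stand-in, not skimage's random_noise: it uses plain NumPy to set whole pixels to black or white, works directly on uint8 so no float conversion is needed, and expects an RGBA array (for example the result of the COLOR_BGRA2RGBA conversion above). The function names are hypothetical.

```python
import io
import numpy as np
from PIL import Image

def add_salt_pepper_rgba(rgba, amount=0.1, rng=None):
    """Return a noisy copy of an (H, W, 4) uint8 RGBA array.

    Roughly `amount` of the pixels are set to black or white in the
    color channels; the alpha channel is copied unchanged.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = rgba.copy()
    roll = rng.random(out.shape[:2])          # one random draw per pixel
    out[roll < amount / 2, :3] = 0            # pepper
    out[roll > 1 - amount / 2, :3] = 255      # salt
    return out

def to_png_bytes(rgba):
    """Encode an (H, W, 4) uint8 array as PNG bytes, keeping transparency."""
    buf = io.BytesIO()
    Image.fromarray(rgba).save(buf, format="PNG")
    return buf.getvalue()
```

The bytes returned by to_png_bytes can be sent from a web handler with an image/png content type, which is what the original gen_image() was trying to return.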

Resizing JPG using PIL.resize gives a completely black image

I'm using PIL to resize a JPG. I'm expecting the same image, resized as output, but instead I get a correctly sized black box. The new image file is completely devoid of any information, just an empty file. Here is an excerpt for my script:
basewidth = 300
img = Image.open(path_to_image)
wpercent = (basewidth/float(img.size[0]))
hsize = int((float(img.size[1])*float(wpercent)))
img = img.resize((basewidth,hsize))
img.save(dir + "/the_image.jpg")
I've tried resizing with Image.LANCZOS as the second argument (it defaults to Image.NEAREST with one argument), but it didn't make a difference. I'm running Python 3 on Ubuntu 16.04. Any ideas on why the image file is empty?
I also encountered the same issue when trying to resize an image with transparent background. The "resize" works after I add a white background to the image.
Code to add a white background then resize the image:
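from PIL import Image
im = Image.open("path/to/img")
# Paste white wherever the alpha channel marks the image as transparent.
if im.mode == 'RGBA':
    alpha = im.split()[3]
    bgmask = alpha.point(lambda x: 255-x)
    im = im.convert('RGB')
    im.paste((255,255,255), None, bgmask)
# new_width and new_height are the target size. Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter.
im = im.resize((new_width, new_height), Image.LANCZOS)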
ref:
Other's code for making thumbnail
Python: Image resizing: keep proportion - add white background
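The same idea can be wrapped into one function that also keeps the aspect ratio, as in the question. This is only a sketch: the name flatten_and_resize and the base_width default of 300 are chosen for illustration, and Image.alpha_composite is used here instead of the paste-with-mask approach above.

```python
from PIL import Image

def flatten_and_resize(im, base_width=300):
    """Put the image on a white background, then resize it proportionally."""
    if im.mode in ("RGBA", "LA") or "transparency" in im.info:
        rgba = im.convert("RGBA")
        # Opaque white canvas of the same size, composited under the image
        background = Image.new("RGBA", rgba.size, (255, 255, 255, 255))
        im = Image.alpha_composite(background, rgba)
    im = im.convert("RGB")  # JPEG cannot store an alpha channel
    ratio = base_width / im.width
    new_size = (base_width, max(1, round(im.height * ratio)))
    return im.resize(new_size, Image.LANCZOS)
```

The result is always in RGB mode, so it can be saved directly with a .jpg extension.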
The simplest way to get to the bottom of this is to post your image! Failing that, we can check the various aspects of your image.
So, import NumPy and PIL, open your image, and convert it to a NumPy ndarray. You can then inspect its characteristics:
import numpy as np
from PIL import Image
# Open image
img = Image.open('unhappy.jpg')
# Convert to Numpy Array
n = np.array(img)
Now you can print and inspect the following things:
n.shape # we are expecting something like (1580, 1725, 3)
n.dtype # we expect dtype('uint8')
n.max() # if there's white in the image, we expect 255
n.min() # if there's black in the image, we expect 0
n.mean() # we expect some value between 50-200 for most images

Converting a Grayscale image to its original color format using Python

Hi, I am currently trying to convert a grayscale image to its original color format using OpenCV in Python.
import cv2
img = cv2.imread('bw.jpg')
img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)
cv2.imwrite('gray_image.png',gray_image)
Executing this produces an error:
error: (-215) scn == 1 && (dcn == 3 || dcn == 4) in function cv::cvtColor
Code using the Python Imaging Library is also welcome.
Any help will be appreciated.
Thank you
I am assuming that you are trying to convert a single-channel image to a 3-channel grayscale image. You are reading the image as img = cv2.imread('bw.jpg'); if you do not pass a flag to cv2.imread(), it reads a 3-channel image by default, irrespective of the original number of channels in the file. You may simply remove the line cv2.cvtColor(img, cv2.COLOR_GRAY2RGB), as img is already a 3-channel image containing only grayscale information.
However, if you expect OpenCV to fill in RGB colors for your grayscale image, then you are probably using the wrong library. You can check out other open source projects like this, which colorize your image using deep learning.
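To make concrete what a 3-channel grayscale image is, here is a small sketch with NumPy and PIL (the function names are made up for this example). Both versions only copy the single gray channel three times, so every pixel has R = G = B and no original color is recovered.

```python
import numpy as np
from PIL import Image

def gray_to_three_channel(gray):
    """Stack a 2-D uint8 grayscale array into an (H, W, 3) array."""
    return np.stack([gray, gray, gray], axis=-1)

def gray_to_three_channel_pil(gray):
    """Same result via PIL: mode 'L' converted to mode 'RGB'."""
    return np.array(Image.fromarray(gray).convert("RGB"))
```

An image produced this way still looks gray; it merely has the shape that functions expecting a color image require.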
See the inline comments for where the mistake was made.
import cv2
# IMREAD_UNCHANGED keeps the file's own channel count; the default flag always returns 3 channels.
img = cv2.imread('bw.jpg', cv2.IMREAD_UNCHANGED)
# check for color or gray-scale image type.
if img.ndim == 2:
    print('Got black/white, single channel image.')
    # variable "gray_image" linked to result; this only copies the gray channel three times.
    gray_image = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)
    cv2.imwrite('gray_image.png', gray_image) # varname no longer img > gray_image.
else:
    print('Got color image')
    url = 'https://github.com/gustavla/autocolorize'
    print("To colorize a gray image, see ZdaR's posted solution from %s" % (url))

Remove black areas in image with alpha, odd result, Python opencv

import cv2
file_name = "alpha_sample.png"
src = cv2.imread(file_name, 1)
tmp = cv2.cvtColor(src, cv2.COLOR_BGR2GRAY)
_,alpha = cv2.threshold(tmp,0,255,cv2.THRESH_BINARY)
b, g, r = cv2.split(src)
rgba = [b,g,r, alpha]
dst = cv2.merge(rgba,4)
#cv2.imshow('funny',dst)
cv2.imwrite("Result.png", dst)
I am trying to run this short code sample (credit to user Srikanth Bhandary). The input image has two black rectangles at the bottom.
Input image
Result image
I want to make these areas transparent, but the result of the code is the image below. The code seems to divide the image quite randomly into a transparent and a non-transparent part.
Any suggestions?
Regards, Tobias
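For comparison, the alpha channel can also be built directly with NumPy, which makes it easy to inspect which pixels are being treated as black. This is a minimal sketch, not a diagnosis of the result above; the function name and the tol parameter are assumptions for illustration, and the input is expected to be a 3-channel uint8 array such as the one returned by cv2.imread.

```python
import numpy as np

def black_to_transparent(bgr, tol=0):
    """Return a 4-channel copy of an (H, W, 3) uint8 image.

    Pixels whose three channels are all <= tol get alpha 0;
    every other pixel gets alpha 255.
    """
    is_black = (bgr <= tol).all(axis=2)
    alpha = np.where(is_black, 0, 255).astype(np.uint8)
    return np.dstack([bgr, alpha])
```

Like the threshold in the code above, this makes every black pixel transparent, not only the two rectangles, so dark pixels inside the picture are affected as well. The result has to be written to a format that supports alpha, such as PNG.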
