I am new to image processing and want to run OCR on images containing digits. However, some of them are hard to recognize, like this one:
I tried binarization, but it's not effective enough. Is there any other way to remove the circle and the central star? Thanks!
output: Shee ar
expected output: 891972

Try Tesseract, an OCR engine by Google; it is very helpful for OCR tasks.
Try this method for removing the background:
import cv2
import numpy as np
from matplotlib import pyplot as plt
image_bgr = cv2.imread('images/plane_256x256.jpg')
image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)
rectangle = (0, 56, 256, 150)
# Create initial mask
mask = np.zeros(image_rgb.shape[:2], np.uint8)
# Create temporary arrays used by grabCut
bgdModel = np.zeros((1, 65), np.float64)
fgdModel = np.zeros((1, 65), np.float64)
# Run grabCut
cv2.grabCut(image_rgb,               # Our image
            mask,                    # The mask
            rectangle,               # Our rectangle
            bgdModel,                # Temporary array for background
            fgdModel,                # Temporary array for foreground
            5,                       # Number of iterations
            cv2.GC_INIT_WITH_RECT)   # Initialize using our rectangle
# Create mask where sure and likely backgrounds set to 0, otherwise 1
mask_2 = np.where((mask==2) | (mask==0), 0, 1).astype('uint8')
# Multiply image with new mask to subtract background
image_rgb_nobg = image_rgb * mask_2[:, :, np.newaxis]
plt.imshow(image_rgb_nobg), plt.axis("off")
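To see what the last two steps do, here is a small NumPy-only sketch on a made-up 2x3 image. The label values follow OpenCV's grabCut convention (0 = sure background, 1 = sure foreground, 2 = probable background, 3 = probable foreground); the toy arrays are assumptions for illustration only.

```python
import numpy as np

# Toy grabCut-style label mask (hypothetical values, not from a real image):
# 0 = sure background, 1 = sure foreground,
# 2 = probable background, 3 = probable foreground
mask = np.array([[0, 2, 3],
                 [1, 3, 2]], dtype=np.uint8)

# Toy 2x3 RGB image where every pixel is (10, 20, 30)
image_rgb = np.full((2, 3, 3), (10, 20, 30), dtype=np.uint8)

# Background labels (0 and 2) become 0, foreground labels (1 and 3) become 1
mask_2 = np.where((mask == 2) | (mask == 0), 0, 1).astype("uint8")

# Adding a trailing axis lets the 2D mask broadcast over the 3 color channels
image_rgb_nobg = image_rgb * mask_2[:, :, np.newaxis]
```

Pixels labeled as background end up black, while foreground pixels keep their original color.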


How to group an array of coordinates into multiple arrays by proximity?

So I'm trying to make a program that reads an image, masks a certain color (brown in this case) and then gives me the coordinates to those masks. I plan on making the mouse move to those coordinates and click them. Here's what I have done so far:
import pyautogui as pygui
import cv2
import numpy as np
# take screenshot
# read image
image = cv2.imread("bloodweb.png")
# define BGR boundaries
lower_brown = np.array([29, 39, 52], dtype = "uint8")
upper_brown = np.array([33, 44, 60], dtype = "uint8")
# create binary mask
mask = cv2.inRange(image, lower_brown, upper_brown)
# apply mask to the image
output = cv2.bitwise_and(image, image, mask = mask)
# save masked image
cv2.imwrite("bloodweb_mask.png", output)
# read masked image
img = cv2.imread("bloodweb_mask.png")
# convert image to grayscale
img = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
img = img.astype(np.uint8)
# find coordinates of masked image
coord = cv2.findNonZero(img)
As you might expect, I'm getting an enormous list of coordinates, and most of them are very close together. I would like to store only one coordinate per location, so I can move the mouse there and click it.
Raw image:
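One possible way to collapse nearby coordinates into a single point per location is a greedy grouping by distance. The sketch below is NumPy-only; the function name `group_by_proximity` and the `max_dist` threshold are hypothetical, and the sample points are made up rather than taken from the screenshot.

```python
import numpy as np

def group_by_proximity(points, max_dist=10.0):
    """Greedily group 2D points; return one mean (x, y) per group.

    A point joins the first group whose running mean is within max_dist,
    otherwise it starts a new group. Hypothetical helper for illustration.
    """
    sums = []    # per-group sum of coordinates
    counts = []  # per-group number of points
    # reshape(-1, 2) also accepts the (N, 1, 2) layout of cv2.findNonZero
    for p in np.asarray(points, dtype=float).reshape(-1, 2):
        for i in range(len(sums)):
            if np.linalg.norm(sums[i] / counts[i] - p) <= max_dist:
                sums[i] += p
                counts[i] += 1
                break
        else:
            sums.append(p.copy())
            counts.append(1)
    return [tuple(int(v) for v in np.round(s / c)) for s, c in zip(sums, counts)]

# Made-up coordinates forming two tight clusters
coords = [(10, 10), (11, 10), (10, 12), (200, 50), (202, 52)]
centers = group_by_proximity(coords, max_dist=10.0)
```

Under these assumptions, `coord` from `cv2.findNonZero` could be passed in directly and each returned pair used as a click target. Computing connected-component centroids on the binary mask would be an alternative.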

How to remove CT bed/shadows in a CT image with python?

I am working with 3D CT images and trying to remove the lines from the bed.
A slice from the original Image:
Following is my code to generate the mask:
segmentation = morphology.dilation(image_norm, np.ones((1, 1, 1)))
labels, label_nb = ndimage.label(segmentation)
label_count = np.bincount(labels.ravel().astype(int))
label_count[0] = 0
mask = labels == label_count.argmax()
mask = morphology.dilation(mask, np.ones((40, 40, 40)))
mask = ndimage.binary_fill_holes(mask)
mask = morphology.dilation(mask, np.ones((1, 1, 1)))
This results in the following image:
As you can see, in the above image the CT scan is distorted as well.
If I change: mask = morphology.dilation(mask, np.ones((40, 40, 40))) to mask = morphology.dilation(mask, np.ones((100, 100, 100))), the resulting image is as follows:
How can I remove only the two lines under the image without changing the image area? Any help is appreciated.
You've probably found another solution by now. Regardless, I've seen similar CT processing questions on SO, and figured it would be helpful to demonstrate a Scikit-Image solution. Here's the end result.
Here's the code to produce the above images.
from skimage import io, filters, color, morphology
import matplotlib.pyplot as plt
import numpy as np
image = color.rgba2rgb(
gray = color.rgb2gray(image)
tgray = gray > filters.threshold_otsu(gray)
keep_mask = morphology.remove_small_objects(tgray,min_size=463)
keep_mask = morphology.remove_small_holes(keep_mask)
maskedimg = np.einsum('ijk,ij->ijk',image,keep_mask)
fig,axes = plt.subplots(ncols=3)
image_list = [image,keep_mask,maskedimg]
title_list = ["Original","Mask","Image w/mask"]
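for i,ax in enumerate(axes):
    # assumed loop body: show each image under its title
    ax.imshow(image_list[i], cmap="gray")
    ax.set_title(title_list[i])
plt.show()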
Notes on code
image = color.rgba2rgb(
gray = color.rgb2gray(image)
The image was saved as RGBA when I downloaded it from SO. It needs to be in grayscale for use in the threshold function.
Your image might already be in grayscale.
Also, the downloaded image showed axis markings. That's why I've trimmed the image.
maskedimg = np.einsum('ijk,ij->ijk',image,keep_mask)
I wanted to apply keep_mask to every channel of the RGB image. The mask is a 2D array, and the image is a 3D array. I referenced this previous question in order to apply the mask to the image.
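As a small illustration of that `einsum` call, the sketch below applies a 2D mask to a 3D image on made-up data and compares it with the equivalent broadcasting form. The array shapes and values are assumptions chosen for readability.

```python
import numpy as np

# Toy data (assumed shapes): a 2x2 RGB image and a 2x2 boolean mask
image = np.arange(12, dtype=float).reshape(2, 2, 3)
keep_mask = np.array([[True, False],
                      [False, True]])

# einsum multiplies each pixel's channels by the mask value at that pixel
masked_einsum = np.einsum('ijk,ij->ijk', image, keep_mask)

# Equivalent broadcasting form: add a trailing channel axis to the mask
masked_broadcast = image * keep_mask[:, :, np.newaxis]
```

Both forms zero out the pixels where the mask is False and leave the others unchanged.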

How do I crop an image based on custom mask in python?

Below I have attached two images. I want the first image to be cropped in a heart shape according to the mask image (2nd image).
I searched for solutions but could not find a simple way to do this. Kindly help me with a solution.
2 images:
Image to be cropped:
Mask image:
Let's start by loading the temple image from sklearn:
from sklearn.datasets import load_sample_images
dataset = load_sample_images()
temple = dataset.images[0]
Since we need to use the second image as a mask, we must apply a binary thresholding operation. This will create a black and white mask image, which we can then use to mask the first image.
import cv2
import matplotlib.pyplot as plt
# Read the mask with OpenCV so the grayscale flag is honored
heart = cv2.imread(r'path_to_im\heart.jpg', cv2.IMREAD_GRAYSCALE)
_, mask = cv2.threshold(heart, thresh=180, maxval=255, type=cv2.THRESH_BINARY)
We can now trim the image so its dimensions are compatible with the temple image:
temple_x, temple_y, _ = temple.shape
heart_x, heart_y = mask.shape
x_heart = min(temple_x, heart_x)
x_half_heart = mask.shape[0]//2
heart_mask = mask[x_half_heart-x_heart//2 : x_half_heart+x_heart//2+1, :temple_y]
plt.imshow(heart_mask, cmap='Greys_r')
Now we have to slice the image that we want to mask, to fit the dimensions of the actual mask. Another option would have been to resize the mask, which is doable, but we'd then end up with a distorted heart image. To apply the mask, we have cv2.bitwise_and:
temple_width_half = temple.shape[1]//2
temple_to_mask = temple[:,temple_width_half-x_half_heart:temple_width_half+x_half_heart]
masked = cv2.bitwise_and(temple_to_mask,temple_to_mask,mask = heart_mask)
If you want to instead make the masked (black) region transparent:
tmp = cv2.cvtColor(masked, cv2.COLOR_BGR2GRAY)
_,alpha = cv2.threshold(tmp,0,255,cv2.THRESH_BINARY)
b, g, r = cv2.split(masked)
rgba = [b,g,r, alpha]
masked_tr = cv2.merge(rgba)
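The transparency step can also be sketched with NumPy alone. This is a rough equivalent under stated assumptions: the toy `masked` array is made up, and the alpha channel is derived from "any channel is non-zero" rather than from a thresholded grayscale image.

```python
import numpy as np

# Toy masked image (assumption): black everywhere except one colored pixel
masked = np.zeros((2, 2, 3), dtype=np.uint8)
masked[0, 0] = (30, 60, 90)

# Alpha is 255 where any channel is non-zero, 0 where the pixel is pure black
alpha = np.where(masked.any(axis=2), 255, 0).astype(np.uint8)

# Stack the alpha plane behind the three color channels -> shape (2, 2, 4)
rgba = np.dstack([masked, alpha])
```

Save the result in a format that supports an alpha channel, such as PNG, so the transparency is preserved.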
Since I am on a remote server, cv2.imshow doesn't work for me, so I imported matplotlib.pyplot as plt instead.
This code does what you are looking for:
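import os
import cv2
import matplotlib.pyplot as plt
# cv2.imread does not expand '~', so resolve the home directory explicitly
img_org = cv2.imread(os.path.expanduser('~/temple.jpg'))
img_mask = cv2.imread(os.path.expanduser('~/heart.jpg'))
## Resizing images
img_org = cv2.resize(img_org, (400,400), interpolation = cv2.INTER_AREA)
img_mask = cv2.resize(img_mask, (400,400), interpolation = cv2.INTER_AREA)
# Black out every pixel of the original where the mask is black
for h in range(img_mask.shape[0]):
    for w in range(img_mask.shape[1]):
        if img_mask[h][w][0] == 0:
            for i in range(3):
                img_org[h][w][i] = 0
# Display with matplotlib (OpenCV loads BGR, matplotlib expects RGB)
plt.imshow(cv2.cvtColor(img_org, cv2.COLOR_BGR2RGB))
plt.show()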
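The nested loops above can also be written as a single boolean-indexing assignment, which is typically much faster on large images. A NumPy-only sketch with made-up stand-in arrays (the 4x4 size and pixel values are assumptions):

```python
import numpy as np

# Toy stand-ins (assumptions) for the resized images: 4x4 BGR arrays
img_org = np.full((4, 4, 3), 200, dtype=np.uint8)
img_mask = np.zeros((4, 4, 3), dtype=np.uint8)
img_mask[1:3, 1:3] = 255  # white square marks the region to keep

# Boolean index selects every pixel whose first mask channel is 0
# and sets all three channels of those pixels to 0 in one step
img_org[img_mask[:, :, 0] == 0] = 0
```

Only the four pixels under the white square keep their original values.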

How can I insert data from a picture into an 2d array?

I am making an AI that can play the game Connect 4 from a picture of one state of the game, e.g.: click to see
The script below detects red elements in a picture:
import cv2
import numpy as np
img = cv2.imread('connect.png')
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
#get red color
lower_range = np.array([169, 100, 100])
upper_range = np.array([189, 255, 255])
mask = cv2.inRange(hsv, lower_range, upper_range)
cv2.imshow('image', img)
cv2.imshow('mask', mask)
I would like to insert these data into a 2D array to be able to use this array as a game state and determine which move the AI should make.
I have tried to find a solution on Stack Overflow and on the Internet, but I didn't find anything about it.
Here is one way to read a picture and convert it into a 2-dimensional NumPy array with np.array(in_image):
import numpy as np
import skimage
from skimage import color, io, transform
path = "C:/my/path/"
pic = 'myPic.png'
imgName = path+pic
in_image_0 = io.imread(imgName) # read the image
in_image_1 = skimage.color.rgb2gray(in_image_0) # transform it to grey-scale
in_image_2 = skimage.transform.rescale(in_image_1, 0.5) # change the resolution
in_image_3 = np.flipud(np.array(in_image_2)) # make a numpy array and flip it up/down
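To go from a color mask to a game-state array, one option is to split the mask into grid cells and mark each cell that is mostly covered. The sketch below is a minimal illustration, assuming the picture is cropped exactly to a standard 6x7 board; the function name `mask_to_board`, the `fill_ratio` threshold, and the synthetic mask are all hypothetical.

```python
import numpy as np

ROWS, COLS = 6, 7  # standard Connect 4 grid

def mask_to_board(mask, fill_ratio=0.3):
    """Return a ROWS x COLS array with 1 where a cell is covered by the mask."""
    h, w = mask.shape
    board = np.zeros((ROWS, COLS), dtype=int)
    for r in range(ROWS):
        for c in range(COLS):
            # Slice out the pixels belonging to grid cell (r, c)
            cell = mask[r * h // ROWS:(r + 1) * h // ROWS,
                        c * w // COLS:(c + 1) * w // COLS]
            if np.count_nonzero(cell) / cell.size >= fill_ratio:
                board[r, c] = 1
    return board

# Synthetic mask (assumption): 60x70 pixels, one disc in the bottom-left cell
mask = np.zeros((60, 70), dtype=np.uint8)
mask[50:60, 0:10] = 255
board = mask_to_board(mask)
```

Running the same function on a second mask for the other player's color, and writing 2 instead of 1, would give a single array describing the full game state.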

opencv python connectedComponents select component per label

I want to select each component of this image :
In practice, I want each and every triangle, selected by its label. I can't figure out how.
I have this code:
import cv2
import numpy as np
img = cv2.imread('invMehs.png', -1)
imGray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, imBw = cv2.threshold(imGray, 250, 255, cv2.THRESH_BINARY)
invBwMesh = cv2.bitwise_not(imBw)
Mask = np.ones(imBw.shape, dtype="uint8") * 255
connectivity = 4
output = cv2.connectedComponentsWithStats(imBw, connectivity, cv2.CV_32S)
num_labels = output[0]
labels = output[1]
stats = output[2]
centroids = output[3]
labels = labels + 1
b = ( labels == 1)
But the image is completely black :S
Thank you very much.
The image you want to save (labels[b]) contains only the thin lines (grey level 1). When you save the image in JPEG format, the compression algorithm smooths them, and because they differ from the background by only 1 grey level, they are erased. That's why you get a black image.
Saving in PNG format does not alter the image labels.
In order to keep all labels for each connected component (0 for the background), the code to write should be:
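As a hedged sketch of selecting each component by its label (not necessarily the exact code the answer intended), the example below builds one mask per label with NumPy. The toy `labels` array is an assumption that mimics the label image returned by `cv2.connectedComponentsWithStats`.

```python
import numpy as np

# Toy label image (assumption), shaped like the `labels` array returned by
# cv2.connectedComponentsWithStats: 0 is background, 1..N are components
labels = np.array([[0, 1, 1, 0],
                   [0, 0, 0, 2],
                   [3, 3, 0, 2]], dtype=np.int32)
num_labels = int(labels.max()) + 1

# One 0/255 mask per component, skipping the background label 0
component_masks = {
    label: np.where(labels == label, 255, 0).astype(np.uint8)
    for label in range(1, num_labels)
}
```

Because each mask uses 0 and 255 instead of 0 and 1, a component stays clearly visible when its mask is written to a PNG file.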
