I would like to use a deep learning program to recognize captchas, using Keras with Python.
But the big challenge is generating a massive number of captchas for training.
I want to solve a captcha like this.
How can I easily generate a massive number of captchas like the one above for training?
Currently, I use the Python package captcha:
from captcha.image import ImageCaptcha  # pip install captcha
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
import random

number = ['0','1','2','3','4','5','6','7','8','9']
MAX_CAPTCHA = 6
WIDTH = 100
HEIGHT = 30

image = ImageCaptcha(width=WIDTH, height=HEIGHT, font_sizes=[30])

# Build a random 6-digit string.
captcha_text = []
for i in range(MAX_CAPTCHA):
    c = random.choice(number)
    captcha_text.append(c)
captcha_text = ''.join(captcha_text)
print(captcha_text)

# Render the captcha, load it as an array, and save it to disk.
captcha = image.generate(captcha_text)
captcha_image = Image.open(captcha)
captcha_image = np.array(captcha_image)
image.write(captcha_text, str(i) + '_' + captcha_text + '.png')

plt.imshow(captcha_image)
plt.show()
If there is no existing similar captcha dataset online, I would tackle this problem in the following way:
Get the MNIST dataset.
Take one image example and play with it in GIMP or an image transformation library like OpenCV to get a look similar to your captcha examples.
Transform the MNIST examples in whatever way you find fits (some random noise, randomized color on the black pixels, or similar), as sketched below.
Train a model on these augmented examples.
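For the transformation step, a minimal sketch with NumPy and PIL (the noise level and random colorization are arbitrary choices, and augment_digit is a hypothetical helper name):

import numpy as np
from PIL import Image

def augment_digit(mnist_img):
    # mnist_img: 28x28 uint8 grayscale array (white digit on black).
    rgb = np.zeros((28, 28, 3), dtype=np.uint8)
    # Paint the digit pixels a random color instead of white.
    mask = mnist_img > 128
    rgb[mask] = np.random.randint(0, 256, size=3)
    # Add random background noise and clamp back to valid pixel values.
    noise = np.random.randint(0, 60, size=rgb.shape)
    rgb = np.clip(rgb.astype(int) + noise, 0, 255).astype(np.uint8)
    return Image.fromarray(rgb)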
Now for practical use, it depends a little on what kind of model you are implementing. If you have a model that can detect and classify all the digits in the image, then you are finished. But if you have a simple model that classifies only images with a single digit on them, then you can move a sliding window over your captcha image and collect only the outputs of windows for which the model has high enough confidence that a digit is present.
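A minimal sketch of that sliding window, assuming a Keras-style model.predict that returns per-class probabilities for a 28x28 grayscale crop (window size, stride, and threshold are arbitrary values to tune):

import numpy as np

def slide_and_classify(captcha_array, model, win=28, stride=4, thresh=0.9):
    # captcha_array: HxW grayscale numpy array of the full captcha.
    _, w = captcha_array.shape
    hits = []
    for x in range(0, w - win + 1, stride):
        window = captcha_array[:win, x:x + win]
        # Add batch and channel dimensions for the model.
        probs = model.predict(window[np.newaxis, ..., np.newaxis])[0]
        if probs.max() >= thresh:
            hits.append((x, int(probs.argmax())))
    # hits holds (x position, predicted digit) for confident windows;
    # overlapping detections would still need to be merged.
    return hits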
You can generate captchas using PIL (the Python Imaging Library).
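A minimal sketch with PIL, drawing random digits plus a little line noise (the layout and noise parameters are arbitrary, and you would likely swap the default font for a TTF via ImageFont.truetype):

import random
from PIL import Image, ImageDraw, ImageFont

def make_captcha(text, width=100, height=30):
    img = Image.new('RGB', (width, height), 'white')
    draw = ImageDraw.Draw(img)
    font = ImageFont.load_default()
    for i, ch in enumerate(text):
        # Jitter each character's position a little.
        x = 5 + i * (width - 10) // len(text) + random.randint(-2, 2)
        y = random.randint(0, 8)
        draw.text((x, y), ch, fill='black', font=font)
    # A few random lines as noise.
    for _ in range(3):
        draw.line([(random.randint(0, width), random.randint(0, height)),
                   (random.randint(0, width), random.randint(0, height))],
                  fill='gray')
    return img

make_captcha('123456').save('123456.png')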
Another solution may be using Inkscape and its Python scripting API.
There are some programs already available that use these technologies:
https://github.com/kuszaj/claptcha
https://github.com/ramwin/captcha
https://www.quora.com/What-is-the-code-for-generating-an-image-CAPTCHA-using-Python
I started learning Python a while ago and have just now started learning Tesseract to create a tool for my own use. I have a script that takes four screenshots of specific parts of the screen and then uses Tesseract to pull data from those images. It is mostly accurate and gets the words almost 100% of the time, but there are still some garbage letters and symbols in the results that I don't want.
Rather than trying to process the image further (if that really is the easiest way I could do it, but I suspect it would still let through data I don't want), I would like to keep only the words in the result that appear in a dictionary I can provide.
import pytesseract
import pyscreenshot as ImageGrab
from PIL import Image

pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

# Grab four regions of the screen and save them to disk.
im = ImageGrab.grab(bbox=(580, 430, 780, 500))
im.save(r'C:\Users\Charlie\Desktop\tesseract_images\imagegrab.png')
im2 = ImageGrab.grab(bbox=(770, 430, 960, 500))
im2.save(r'C:\Users\Charlie\Desktop\tesseract_images\imagegrab2.png')
im3 = ImageGrab.grab(bbox=(950, 430, 1150, 500))
im3.save(r'C:\Users\Charlie\Desktop\tesseract_images\imagegrab3.png')
im4 = ImageGrab.grab(bbox=(1140, 430, 1320, 500))
im4.save(r'C:\Users\Charlie\Desktop\tesseract_images\imagegrab4.png')

# Run OCR on each saved screenshot.
for suffix in ('', '2', '3', '4'):
    path = r'C:\Users\Charlie\Desktop\tesseract_images\imagegrab' + suffix + '.png'
    print(pytesseract.image_to_string(Image.open(path)))
This is what I've written. It might not be as clean as possible, but it does what I want for now. When I run the program against the test screenshot I've taken, I get the following:
Ballistica Prime Receiver
a
“Ze ij
Titania Prime Blueprint
—
‘|! stradavar Prime
Blueprint
My.
Bronco Prime Barrel
uby-
Here is my screenshot:
It picks up the words perfectly fine, but data like "uby-" and "‘|!" isn't needed, which is why I want to clean it out by keeping only words that are in the dictionary. If there's an easier way to do this I'd love to know; I have only been using Tesseract for a day or so and don't know of another way besides the aforementioned image processing.
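One way to do that filtering is to post-process the OCR output against a word list. A minimal sketch, assuming words.txt is a one-word-per-line dictionary file you supply:

import re

with open('words.txt') as f:
    dictionary = {line.strip().lower() for line in f}

def keep_known_words(ocr_text):
    # Split the OCR output into alphabetic tokens and keep
    # only the ones found in the dictionary.
    tokens = re.findall(r"[A-Za-z']+", ocr_text)
    return ' '.join(t for t in tokens if t.lower() in dictionary)

print(keep_known_words("Bronco Prime Barrel uby-"))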
I am using SimpleITK for image registration, and it is a good tool. But unfortunately, in the new version we can no longer get the mutual information value directly. I have tried other calculation methods but found that they are not applicable to computing mutual information between images (the sklearn package, for example).
Is there a library / tool that already does that? How would you implement it?
Hey, this does the trick with SimpleITK:
import SimpleITK as sitk

# Read both images as floats (the Mattes metric expects a floating point pixel type).
img1 = sitk.ReadImage('', sitk.sitkFloat32)  # fixed image path
img2 = sitk.ReadImage('', sitk.sitkFloat32)  # moving image path
registration_method = sitk.ImageRegistrationMethod()
registration_method.SetMetricAsMattesMutualInformation()
# MetricEvaluate needs a transform between the images; use the identity.
registration_method.SetInitialTransform(sitk.Transform(img1.GetDimension(), sitk.sitkIdentity))
# The value comes back negated, since the framework minimizes its metrics.
print(registration_method.MetricEvaluate(img1, img2))
I started studying machine learning in Python a few days ago, and I was trying some examples online when I decided to try it for myself using a custom dataset.
However, I noticed that most datasets consist of camera photos, hundreds if not thousands of images of the same target.
If I create a custom icon in Photoshop, do I need to take a picture of my monitor a thousand times to achieve this? Is it possible to train an AI using only a single PNG file?
My goal right now is to have the AI do object detection on another big image, where it needs to find the custom icon inside the image, kind of like finding Waldo. All of these are digital images straight from Photoshop, though, so I don't know if it is possible.
Right now, I am using a Python-based computer vision library called ImageAI.
You can use a data preparation strategy called data augmentation.
There are mainly two types of augmentation:
Linear transformations
Affine transformations
Here is a good white paper:
http://cs231n.stanford.edu/reports/2017/pdfs/300.pdf
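Since the goal is many training examples from a single PNG, here is a minimal sketch with Keras's ImageDataGenerator, which applies random affine transformations (shifts, rotations, zooms) on the fly; the file names and parameter values are assumptions to tune:

import numpy as np
from PIL import Image
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Load the single source icon as a (1, H, W, 3) batch.
icon = np.asarray(Image.open('icon.png').convert('RGB'), dtype=np.float32)
batch = icon[np.newaxis, ...]

datagen = ImageDataGenerator(rotation_range=15,
                             width_shift_range=0.1,
                             height_shift_range=0.1,
                             zoom_range=0.2)

# Write 100 randomly transformed variants; the 'augmented' folder must already exist.
gen = datagen.flow(batch, batch_size=1, save_to_dir='augmented',
                   save_prefix='icon', save_format='png')
for _ in range(100):
    next(gen)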
I am not planning to spam, and besides, Google has made this kind of captcha obsolete with reCAPTCHA. I am doing this as a project to learn more about OCR and, eventually, maybe neural networks.
So I have an image from a captcha. I have been able to make modest progress, but Tesseract isn't exactly well documented. Here is the code I have so far, with the results below it.
from selenium import webdriver
from selenium.webdriver.common import keys
import time
import random
import pytesseract
from pytesseract import image_to_string
from PIL import Image, ImageEnhance, ImageFilter

def ParsePic():
    pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe'
    im = Image.open("path\\screenshot.png")
    # Edge-detect, sharpen detail, and boost contrast before OCR.
    im = im.filter(ImageFilter.CONTOUR)
    im = im.filter(ImageFilter.DETAIL)
    enhancer = ImageEnhance.Contrast(im)
    im = enhancer.enhance(4)
    im = im.convert('L')  # convert to grayscale
    im.save('temp10.png')
    text = image_to_string(Image.open('temp10.png'))
    print(text)
Original Image
Output
I understand that captchas were made specifically to defeat OCR, but I read that this is no longer the case, and I'm interested in learning how it was done.
My question is: how do I make the background a single color, so the text becomes easily readable?
Late answer, but anyway...
You are doing edge detection, but there are obviously too many edges in this image, so that will not work.
You will have to do something with the colors.
I don't know if this holds for all of your captchas, but you can get away with just contrast.
You can test this by opening your original in Paint (or any other image editor) and saving the image as "monochrome" (black and white only, NOT grayscale).
Result:
without any other editing! (Even the question mark is gone.)
This would be ready for OCR right away.
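The same conversion in code is nearly a one-liner with PIL; a minimal sketch (the 128 threshold and file names are assumptions to tune per captcha):

from PIL import Image

im = Image.open('captcha.png').convert('L')
# Threshold to pure black and white, like Paint's monochrome save.
bw = im.point(lambda p: 255 if p > 128 else 0, mode='1')
bw.save('captcha_bw.png')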
Maybe your other images are not this easy, but color/contrast is the way to go for you. If you need ideas on how to use color, contrast, and other tricks to solve captchas, you can take a look at harder examples and how I solved them here: https://github.com/cracker0dks/CaptchaSolver
cheers
We are using TensorFlow and Python to create a custom CNN that will classify images into one of several categories. We have created our CNN based on this tutorial: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/tutorials/layers/cnn_mnist.py
Instead of reading in a pre-existing dataset like the MNIST dataset used in the tutorial, we would like to read in all images from multiple folders, where the name of each folder is the label associated with all the images in it. Unfortunately, we're very new to Python and TensorFlow; could someone point us in the right direction, either with a tutorial or some basic code?
Thank you so much!
Consider using the glob package. It allows you to easily import multiple files in subdirectories using patterns: https://docs.python.org/2/library/glob.html
import glob
import matplotlib.pyplot as plt
import numpy as np

# Collect every path matching the pattern, read each image,
# and stack them all into one numpy array.
images = glob.glob(<file pattern>)
img_list = [plt.imread(image) for image in images]
img_array = np.stack(tuple(img_list))
I haven't tested this, so there might be errors, but it should make a 3-D numpy array of images (each image a 2-D array). Is this the format you were looking for?
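Since you also need the folder names as labels, here is a minimal sketch extending the same idea; data_dir and the directory layout (one subfolder per class with the images inside) are assumptions:

import glob
import os
import matplotlib.pyplot as plt
import numpy as np

data_dir = 'data'  # assumed layout: data/<label>/<image>.png
images, labels = [], []
for path in glob.glob(os.path.join(data_dir, '*', '*.png')):
    # The image's parent folder name is its label.
    labels.append(os.path.basename(os.path.dirname(path)))
    images.append(plt.imread(path))

img_array = np.stack(images)  # assumes all images share one size
label_array = np.array(labels)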