I'm writing a script that takes an image and crops it down to include only the number I want it to recognize. I have that part working fine. The numbers will be either one or two digits.
I've tried using Google's Vision API, which works fine and gives the correct result, but I would rather do it locally to avoid the fees associated with that service. I'm currently working on using Tesseract OCR: https://github.com/tesseract-ocr/tesseract
Example of an image I want it to recognize:
Tesseract is a command-line program, but I am calling it from a Python file that also handles the other parts of my script. I'm not sure if Tesseract is what I want, or if there is a better solution to my problem.
sudo tesseract imgName outputFile
No matter what image I put through it, the only result I get is 0, and it also prints "Empty page!!".
EDIT:
I am now using pytesseract and I am trying with this code:
print(pytesseract.image_to_string(img))
Nothing is output from that, so I tried
print(pytesseract.image_to_string(img, config='--psm 6'))
which outputs random letters that it's guessing at. Is there a way with tesseract to look only for numbers, so my results are narrowed down?
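Yes: Tesseract supports a character whitelist, which can be passed through pytesseract's config string. A minimal sketch (the helper name and PSM choice are mine, and the flag spelling below is for Tesseract 4+; older versions use -psm and the digits config file instead):

```python
# Restrict recognition to digits and treat the image as one line of text.
DIGITS_CONFIG = '--psm 7 -c tessedit_char_whitelist=0123456789'

def read_number(path):
    """Return only the digits Tesseract finds in the cropped image."""
    from PIL import Image      # imported lazily: pip install pillow
    import pytesseract         # pip install pytesseract
    text = pytesseract.image_to_string(Image.open(path), config=DIGITS_CONFIG)
    return text.strip()
```

With a crop as clean as the one described, the whitelist usually stops the "random letters" guesses, since every non-digit hypothesis is excluded up front.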
After searching a long time for a solution to my problem without success, I'd like to ask my question here.
I have Python code that creates a GeoTIFF file from Google Earth Engine data. I'm running it in a Jupyter notebook and I want to export the GeoTIFF to my Google Drive.
The code runs without error, and a shapefile (.shp) is used as input.
The problem is that nothing appears on my Drive: the "GEE" folder it creates does get created, but it is empty.
Here is the export part of the code:
task = ee.batch.Export.image.toDrive(
    image=bare1.clip(aoi),
    scale=10,
    region=aoi.getInfo()['coordinates'],
    fileFormat='GeoTIFF',
    description='Active',
    folder='GEE',
    maxPixels=1e9)
task.start()
You should also know that I am a beginner in python :)
Do you have an idea for a solution? Do not hesitate to ask me for more details.
Thanks :)
First: have you checked the Code Editor (https://code.earthengine.google.com/) to see whether there is an error message that accounts for the lack of export, or whether the file is actually being deposited in a different place? One note about the 'folder' parameter: in my understanding it doesn't necessarily create a folder, but instead tells GEE to deposit your image in the most recently created folder of that name, which could be anywhere in your Drive.
Next, have you definitely mounted your Google Drive? I assume so, if the GEE folder is working, but just to be sure you can always run:
from google.colab import drive
drive.mount('/content/drive')
Next, I have found that when I am exporting an image, I need to convert it to double for the export to work. In your case that means changing the first line to the following (adding the .toDouble()):
task = ee.batch.Export.image.toDrive(image=bare1.clip(aoi).toDouble())
If that doesn't work: have you tried exporting other images with this same code? (I.e. replacing bare1 with another image that you know works, like ee.Image(1), which makes a blank image where every pixel has the value 1.)
Happy to take another look if none of this helps!
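You can also watch the task from the notebook itself rather than the Code Editor: ee.batch tasks expose active() and status(), and a failed export carries an 'error_message' field in its status dict. A small sketch (the helper names are mine):

```python
import time

def describe_status(status):
    """Summarise the dict returned by an Earth Engine task's status() call."""
    state = status.get('state', 'UNKNOWN')
    if state == 'FAILED':
        return 'FAILED: ' + status.get('error_message', 'no message given')
    return state

def wait_for(task, poll_seconds=10):
    """Block until the export finishes, then report how it ended."""
    while task.active():           # True while the state is READY or RUNNING
        time.sleep(poll_seconds)
    return describe_status(task.status())
```

After task.start(), something like print(wait_for(task)) will surface the exact export error instead of leaving you guessing at an empty folder.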
The Tasks console in the GEE Code Editor should give a description of the export error. Exports are finicky, with a number of possible causes of failure. A good first thing to check is that you didn't exceed the maximum pixel count. You can deal with max-pixel errors by reducing the number of bands in your image to only those you need, or by increasing the maxPixels parameter in your export task. Sometimes the following dictionary-style formatting works for me, although it's not clear why:
task = ee.batch.Export.image.toDrive(**{
    'image': bare1.clip(aoi),
    'scale': 10,
    'region': aoi.getInfo()['coordinates'],
    'fileFormat': 'GeoTIFF',
    'description': 'Active',
    'folder': 'GEE',
    'maxPixels': 1e9
})
task.start()
Forgive me if I've left anything out or goofed up formatting conventions; this is my first time posting on this sort of forum.
So I've got a Nikon D5600 that I'm using as part of an (extremely basic) image analysis setup. I'd like to be able to use images from it without having to manually transfer the files over each time I run a test, but I've had some trouble getting access to the files.
To be clear, I don't want to capture screenshots of a video; I understand that this is possible, but the resolution is about 1/3 smaller in video, which is a bit of an issue for my application.
So, when I was 6 hours more naive, I plugged the camera into my (Windows 10) desktop via USB and tried loading the image using the exact file path Windows gave me in the Properties screen (well, I did swap the slashes):
img = cv2.imread("This PC/D5600/Removable storage/DCIM/314D5600/CFW_0031.jpg")
That didn't work.
I checked that the command I was using wasn't the issue by copying the picture to another drive:
img = cv2.imread("D:/CFW_0031.jpg")
That worked.
So I think, and think is a bold claim here, that it's something to do with the "This PC" bit of the path. I've read some old (circa 2009) posts about MTP and such things, but I'm honestly not sure if that's even what this camera uses, or how to get started with that if it is in fact the correct protocol.
I've also tried using pygrabber (I believe it's a wrapper of DirectShow, though my terminology may be wrong) to control the camera via Python, but that also didn't work, although I did manage to control my webcam, which was interesting.
Finally, I attempted to assign a drive letter to the camera, but found that the camera wasn't in Disk Management's list of disks. It's entirely possible I just did this wrong, but I don't quite see how.
Edit regarding comment from Cristoph
-I just need to be able to use the image files in python, probably with opencv. I suppose that counts as reading them?
-I've attached a screenshot of what the "This PC" location looks like in the file explorer. The camera shows up under devices and drives, but doesn't have a drive letter.
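One detail that makes this confusing to debug is that cv2.imread doesn't raise on a path it can't read; it silently returns None. A small guard (hypothetical helper) makes the failure explicit and shows why the Explorer path can never work:

```python
import os

def load_image(path):
    """Read an image with OpenCV, failing loudly instead of returning None."""
    # "This PC/..." is an Explorer shell-namespace path (the camera is an
    # MTP device), not a real filesystem path, so os.path.exists rejects it;
    # a copied file like "D:/CFW_0031.jpg" passes.
    if not os.path.exists(path):
        raise FileNotFoundError("Not a filesystem path: %s" % path)
    import cv2  # imported here so the path check alone needs no OpenCV
    img = cv2.imread(path)
    if img is None:
        raise IOError("OpenCV could not decode: %s" % path)
    return img
```

This doesn't get Python onto the MTP device, but it distinguishes "not a filesystem path at all" from "file exists but couldn't be decoded", which is useful while testing workarounds.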
I use the following to get the Python wrapper for Tesseract to extract text from an image for me:
import cv2, pyautogui, numpy as np
import pytesseract
img = np.array(pyautogui.screenshot())
pytesseract.image_to_string(img, lang='eng')
Under the hood this basically goes through the CLI interface, saving the image to a file and then converting it, which is understandably slow (0.2 seconds per image on a PC, 3 seconds on a Raspberry Pi).
How do I call the native tesseract library (preferably in python) to directly process an OpenCV/PIL image without going through the CLI?
I have looked into the code here:
https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/tesseract-ocr/xvTFjYCDRQU/rCEwjZL3BQAJ as suggested by "Pytesseract is too slow. How can I make it process images faster?", and I can't get any output, even with the improvements needed to get the code to run from start to finish:
add locale:
import locale
locale.setlocale(locale.LC_ALL, 'C')
change all tess.set_variable("tessedit_pageseg_mode", str(frame_piece.psm)) input values to bytes:
tess.set_variable(b"tessedit_pageseg_mode", str.encode(str(frame_piece.psm)))
Anyone have any ideas? I'd like something that works on Windows as well as Linux, but I can probably work with anything that works.
P.S. I have tried image>grayscale>threshold>binarization before handing the image to pytesseract and that does give a decent speed boost over using color images, but even then with the IO write involved, it is slow.
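One way to avoid the CLI round trip is the tesserocr package, which binds libtesseract directly and accepts a PIL image in memory; it also lets the engine stay initialised across frames, which is where much of the per-image cost goes. A sketch, assuming tesserocr is available on your platforms (the function names are mine):

```python
def ocr_variables(psm=6, whitelist=None):
    """Variables handed to the Tesseract API (pure helper, easy to test)."""
    variables = {'tessedit_pageseg_mode': str(psm)}
    if whitelist:
        variables['tessedit_char_whitelist'] = whitelist
    return variables

def ocr_frames(pil_images, psm=6, whitelist=None):
    """OCR a stream of PIL images without any temp-file round trip."""
    from tesserocr import PyTessBaseAPI   # pip install tesserocr
    with PyTessBaseAPI() as api:          # engine initialised once, reused
        for name, value in ocr_variables(psm, whitelist).items():
            api.SetVariable(name, value)
        for img in pil_images:
            api.SetImage(img)             # in-memory hand-off, no disk I/O
            yield api.GetUTF8Text()
```

Because the API object is created once and reused, the per-frame cost drops to roughly the recognition time itself; combined with the grayscale/threshold preprocessing already mentioned, this should help on the Raspberry Pi as well.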
I'm having trouble with pytesseract. I know that you can restrict tesseract to a specific set of characters using command-line arguments:
tesseract input.tif output nobatch digits
I found some people saying they can restrict tesseract with the following lines in Python:
import tesseract
ocr = tesseract.TessBaseAPI();
ocr.Init(".","eng",tesseract.OEM_TESSERACT_ONLY)
ocr.SetVariable("tessedit_char_whitelist", "0123456789")
But this is for using the Tesseract API directly, and I'm using pytesseract. Finally, I also tried:
print(image_to_string(someimage, config='outputbase digits'))
But this doesn't work, as I still get letters in my output. This is weird, because I am using the code below and it is working:
print(image_to_string(screen, config='-psm 10'))
PSM stands for Page Segmentation Mode, and it lets me parse my image file as a single character. I don't understand why this works and the snippet before doesn't, when they are both command-line arguments to tesseract...
Can anyone help? I want to use both options together with a custom wordlist (that I created in the config folder of tesseract).
Finally found the solution; posting it in case it ever helps anyone. This is from the tesseract help page:
Simplest invocation of tesseract :
tesseract imagename outputbase
I could deduce the proper syntax from that (in fact, everything I found on Stack Overflow pretty much pointed me in the wrong direction, maybe because of different versions of tesseract). Keep in mind I'm using tesseract 3.05 (Windows installer available on GitHub) and pytesseract (installed from pip).
image_to_string(someimage, config='digits -psm 7')
As we've seen on the help page, the outputbase argument comes right after the filename and before the other options; the config name then goes first in the config string, which allows the use of both a PSM and a restricted charset.
All the command-line args from the tesseract help page can be used this way, in the config variable!
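For reference, the call above can be wrapped in a small helper (the names here are mine, and note that newer Tesseract versions spell the flag --psm rather than -psm):

```python
def digits_config(psm=7):
    """Config string for pytesseract with Tesseract 3.05-style flags."""
    return 'digits -psm %d' % psm

def read_digits(path, psm=7):
    """OCR an image file, restricted to digits, with the given PSM."""
    from PIL import Image    # lazy imports: pip install pillow pytesseract
    import pytesseract
    return pytesseract.image_to_string(Image.open(path),
                                       config=digits_config(psm)).strip()
```

Swapping 'digits' for the name of a custom config file placed in tesseract's config folder should work the same way, since everything in the config string is passed straight through.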
Hi,
I'm working on a project where I have to generate an image (e.g. .png, .bmp, etc.) with a Python script.
The image must have:
1) small boxes (8x8 px) in 3 different colours
2) horizontal (normal) text in 2 different sizes
3) vertical text (normal text, rotated), like this: http://devcity.net/Data/ArticleImages/Dual_Labels.jpg
So not very complex things.
I spent the last few days with PIL (Python Imaging Library). For the small boxes, it works fine and easily. But generating text in the image doesn't work well.
What also works is writing normal text with the standard font (PIL font type). But I can't set the pixel size of this text. When using TrueType fonts, I get the following error:
"The _imagingft C module is not installed"
I already googled this, and it seems to be a popular problem. My problem is that the script also has to run on other Python systems. I can accept having to install PIL on each system/computer, but I can't fix the TrueType problem on each one!
I'm using Python 2.7 with PIL 1.1.7.
So, to my question: for the shapes my script has to generate, which library (or other way of generating an image from a script) would you recommend?
Would it be possible to create the image in "pure Python", e.g. by writing a bitmap file with text and coloured pixels directly, without any extension? (That would be the optimal solution for me.)
Have you thought about using PyCairo instead? See this link for an example: https://stackoverflow.com/a/6506825/514031
This is not quite what matplotlib was designed for, but it is definitely capable of producing what you're after. Have a look at the gallery; it has usage examples for almost everything you mentioned.
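A third option, for what it's worth: the _imagingft problem largely disappears if you install Pillow (the maintained PIL fork), whose binaries ship with FreeType support compiled in. A sketch of all three required elements; the colours, sizes and font path are placeholders:

```python
def box_origins(count, size=8, gap=4, y=10):
    """Top-left corners for a row of small boxes (pure helper)."""
    return [(10 + i * (size + gap), y) for i in range(count)]

def draw_demo(path='demo.png', font_path='arial.ttf'):
    from PIL import Image, ImageDraw, ImageFont   # pip install Pillow
    img = Image.new('RGB', (220, 120), 'white')
    draw = ImageDraw.Draw(img)
    # 1) 8x8 px boxes in three colours
    for (x, y), colour in zip(box_origins(3), ('red', 'green', 'blue')):
        draw.rectangle([x, y, x + 8, y + 8], fill=colour)
    # 2) horizontal text in two sizes (TrueType works out of the box in
    #    Pillow wheels; font_path is a placeholder for a font on disk)
    draw.text((10, 30), 'big text',
              font=ImageFont.truetype(font_path, 24), fill='black')
    draw.text((10, 65), 'small text',
              font=ImageFont.truetype(font_path, 12), fill='black')
    # 3) vertical text: render on its own strip, rotate 90 degrees, paste
    strip = Image.new('RGB', (70, 14), 'white')
    ImageDraw.Draw(strip).text((0, 0), 'vertical', fill='black')
    img.paste(strip.rotate(90, expand=True), (180, 20))
    img.save(path)
```

Pillow installs from pip on each target machine in one step, which sidesteps the per-system TrueType fixing the question worries about.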