For a project I need to parse pixel data from a large number of online images. I realised it could well be faster to load each image into program memory with a GET request, carry out the required operations, then move on to the next image, removing the need to read and write the files to storage. However, in doing this I have run into several problems. Is there a not (overly) complicated way to do this?
Edit: I didn't include code because, as far as I can tell, everything I've tried (scikit-image, Pillow, ImageMagick) is a complete dead end. I'm not looking for somebody to write code for me, just a pointer in the right direction.
It's easy to load an image directly from a URL.
import io
import urllib.request
from PIL import Image

url = "https://cdn.pixabay.com/photo/2013/07/12/12/58/tv-test-pattern-146649_1280.png"
# Fetch the bytes and wrap them in a seekable buffer so PIL can read them
with urllib.request.urlopen(url) as response:
    img = Image.open(io.BytesIO(response.read()))
Image is now loaded.
Getting pixels is also easy: Get pixel's RGB using PIL
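For instance, once the image is in memory, Pillow exposes the pixel data directly. A minimal sketch (the `Image.new` call here just stands in for the downloaded image):

```python
from PIL import Image

# A 2x2 solid-colour image stands in for the downloaded one
img = Image.new('RGB', (2, 2), (10, 20, 30))

print(img.getpixel((0, 0)))   # one pixel as an (R, G, B) tuple
pixels = list(img.getdata())  # every pixel, row by row
print(len(pixels))            # 4 pixels for a 2x2 image
```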
I am trying to create a program on the Raspberry Pi Pico W that downloads a .jpg image and displays a very pixelated version of it on an addressable LED matrix. I have decided to use MicroPython as the language for this project, since I have already completed a similar project in full Python. Because I am downloading the images from an online source, I am stuck with the '.jpg' format at a fixed size.
I have been running into some difficulties processing the image. To drive the addressable LEDs (NeoPixel) I want to collect the RGB data from each individual pixel of the .jpg and add it to a list.
Working in Python, I know that the PIL/Pillow library is a great solution to this problem.
from PIL import Image
from numpy import asarray

image = Image.open('256256.jpg', formats=None)
print(image)
data = asarray(image)  # pixel values as a NumPy array
Unfortunately the RP2040 doesn't seem to have enough storage space to handle the module.
I need to find a way to decode the image using the modules readily available to MicroPython.
I have attempted to reverse-engineer PIL's open function but haven't had any luck so far.
Thank you in advance!
I am having trouble compressing an image in Python without saving it to disk. The image has a save function as described here, which optimizes the image as it saves it. Is it possible to use the same procedure without saving the image? I want to do it like any other Python function.
image = image.quantize()  # this reduces the quality a lot
Thanks in advance :)
In PIL or OpenCV the image is just a large matrix with values for its pixels. If you want to do something with the image (e.g. display it), the function needs to know all the pixel values, and thus needs the decoded image.
However, there is a method to keep the image compressed in memory until you really need to do something with the image. Have a look at this answer: How can i load a image in Python, but keep it compressed?
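As a sketch of the in-memory idea: Pillow's `save` accepts any file-like object, so you can write the optimized JPEG into a `BytesIO` buffer instead of a file (the `Image.new` call and the quality value 60 below are just placeholders for your own image and settings):

```python
import io
from PIL import Image

img = Image.new('RGB', (64, 64), 'red')  # stands in for your loaded image

# Save into an in-memory buffer instead of a file on disk
buf = io.BytesIO()
img.save(buf, format='JPEG', optimize=True, quality=60)
compressed = buf.getvalue()  # the compressed image as bytes

# The buffer can be reopened like a file if needed
reloaded = Image.open(io.BytesIO(compressed))
print(reloaded.size)
```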
I started learning Python a while ago and have just now started learning Tesseract to create a tool for my own use. I have a script that takes four screenshots of specific parts of the screen and then uses Tesseract to pull data from those images. It is mostly accurate and gets the words almost 100% of the time, but there are still some garbage letters and symbols in the results that I don't want.
Rather than trying to process the image (if that really is the easiest way I could do it, but I feel like that would still let through data I don't want), I would like to keep only the words in the result that appear in a dictionary I can provide.
import pytesseract
import pyscreenshot as ImageGrab

pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

# Grab each region of the screen, save a copy, and OCR it directly
regions = [(580, 430, 780, 500), (770, 430, 960, 500),
           (950, 430, 1150, 500), (1140, 430, 1320, 500)]
for i, bbox in enumerate(regions, start=1):
    im = ImageGrab.grab(bbox=bbox)
    im.save(rf'C:\Users\Charlie\Desktop\tesseract_images\imagegrab{i}.png')
    print(pytesseract.image_to_string(im))
This is what I've written above. It might not be as clean as possible but it does what I want it to for now. When I run the program against the test screenshot I've taken, I get the following:
Ballistica Prime Receiver
a
“Ze ij
Titania Prime Blueprint
—
‘|! stradavar Prime
Blueprint
My.
Bronco Prime Barrel
uby-
Here is my screenshot:
It picks up the words perfectly fine, but data like "uby-" and "‘|!" isn't needed, which is why I want to clean it out by keeping only words that are in the dictionary. If there's an easier way to do this I'd love to know, as I've only been using Tesseract for a day or so and don't know of another approach besides the aforementioned image processing.
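One way to do that filtering, sketched with a hypothetical helper and a made-up word set (in practice the vocabulary would come from your own dictionary file):

```python
def keep_known_words(ocr_text, vocabulary):
    """Keep only the words that appear in the given vocabulary."""
    words = ocr_text.split()
    # Strip common OCR punctuation noise before looking words up
    return ' '.join(w for w in words
                    if w.strip('.,!?-|').lower() in vocabulary)

vocab = {'ballistica', 'prime', 'receiver', 'blueprint'}
print(keep_known_words('Ballistica Prime Receiver uby-', vocab))
# → Ballistica Prime Receiver
```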
I am not planning to spam, and besides, Google has made this style of captcha obsolete with reCAPTCHA. I am doing this as a project to learn more about OCR and eventually maybe neural networks.
So I have an image from a captcha, and I have been able to make modest progress, but Tesseract isn't exactly well documented. Here is the code I have so far, and the results are below it.
from selenium import webdriver
from selenium.webdriver.common import keys
import time
import random
import pytesseract
from pytesseract import image_to_string
from PIL import Image, ImageEnhance, ImageFilter

def ParsePic():
    pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe'
    im = Image.open("path\\screenshot.png")
    im = im.filter(ImageFilter.CONTOUR)
    im = im.filter(ImageFilter.DETAIL)
    enhancer = ImageEnhance.Contrast(im)
    im = enhancer.enhance(4)
    im = im.convert('L')  # convert to grayscale
    im.save('temp10.png')
    text = image_to_string(Image.open('temp10.png'))
    print(text)
Original Image
Output
I understand that captchas were made specifically to defeat OCR, but I've read that this is no longer the case, and I'm interested in learning how it was done.
My question is: how do I make the background a single colour, so the text becomes easily readable?
Late answer, but anyway...
You are doing edge detection, but there are obviously too many edges in this image, so that will not work.
You will have to do something with the colors instead.
I don't know if this holds for every one of your captchas, but you can get away with just using contrast.
You can test this by opening your original in Paint (or any other image editor) and saving the image as "monochrome" (black and white only, NOT grayscale).
result:
without any other editing! (Even the question mark is gone.)
This would be ready for OCR right away.
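In code, the same monochrome step can be reproduced with a simple threshold in Pillow. A sketch (the 2x1 test image and the cut-off of 128 are assumptions; a real captcha may need a different threshold):

```python
from PIL import Image

# A tiny stand-in image: one dark "ink" pixel and one light background pixel
im = Image.new('L', (2, 1))
im.putpixel((0, 0), 50)    # dark pixel (text)
im.putpixel((1, 0), 200)   # light pixel (background)

# Threshold to pure black and white, like saving as "monochrome" in Paint
bw = im.point(lambda p: 255 if p > 128 else 0, mode='1')
print(bw.getpixel((0, 0)), bw.getpixel((1, 0)))  # 0 255
```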
Maybe your other images are not this easy, but color/contrast is the way to go for you. If you need ideas on how to use color, contrast and other tricks to solve captchas, you can take a look at harder examples and how I solved them here: https://github.com/cracker0dks/CaptchaSolver
cheers
I want to load a number of images from the hard drive and place them on a larger white background, and I want to do it in Python. I am wondering what the best way of doing that is. I am using a Windows machine and I can use any library I want. Any pointer to a web page or sample code that points me in a good direction would be appreciated.
Something like this:
A very popular image processing library for Python is PIL. The official PIL tutorial might be useful, especially the section about "Cutting, Pasting and Merging Images".
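A minimal sketch of that cut-and-paste approach (the sizes, colours, and positions here are made up; you would use `Image.open` on your real files instead of `Image.new`):

```python
from PIL import Image

# A white canvas to serve as the larger background
canvas = Image.new('RGB', (200, 100), 'white')

# Two solid-colour images stand in for images loaded from the hard drive
a = Image.new('RGB', (50, 50), 'red')
b = Image.new('RGB', (50, 50), 'blue')

# Paste each image at a (left, top) position on the canvas
canvas.paste(a, (10, 25))
canvas.paste(b, (120, 25))
print(canvas.size)
# canvas.save('collage.png') would write the result to disk
```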
PIL alone isn't always enough; try the aggdraw library together with PIL.
But aggdraw isn't enough either: it works poorly with transparency, leaving a 0.5-1 pixel gray fringe around opaque objects drawn over transparent areas.