I have a question regarding logging for somescript.py.
The script tries to find matches for words the user is looking for in pages that have become unreadable due to re-formatting and re-printing.
Because of this, OCR techniques no longer work for us, so I've come up with a script that compares contours of words to find matches.
The script looks something like this:
import sys

import cv2
import numpy as np

method = cv2.TM_SQDIFF_NORMED
template_name = "this.png"
image_name = "3.tif"

needle = cv2.imread(template_name)
haystack = cv2.imread(image_name)

# Convert to grayscale
needle_g = cv2.cvtColor(needle, cv2.COLOR_BGR2GRAY)
haystack_g = cv2.cvtColor(haystack, cv2.COLOR_BGR2GRAY)

# Attempt a match (search for the template inside the larger image)
d = cv2.matchTemplate(haystack_g, needle_g, method)

# We want the location of the minimum squared difference
mn, _, mnLoc, _ = cv2.minMaxLoc(d)
print(mnLoc)

# Draw the rectangle around the best match
MPx, MPy = mnLoc
trows, tcols = needle_g.shape[:2]
# Normed methods give better results, i.e. matchvalue = [1,3,5]; others sometimes show errors
cv2.rectangle(haystack, (MPx, MPy), (MPx + tcols, MPy + trows), (0, 0, 255), 2)

cv2.imshow('output', haystack)
cv2.waitKey(0)
sys.exit(0)
Now I want to log the various tasks that the script performs, like:
converting the image to grayscale
attempting a match
drawing the rectangle
I have seen a few scripts on Stack Overflow explaining how to log an entire script or its entire output, but I haven't found anything that logs just a few actions.
I would also like to add the date and time each activity was performed.
Furthermore, I have written a function that calculates an MD5 and SHA1 hash of the input files, in this case 'this.png' and '3.tif'. I have yet to integrate this piece of code, but would it be easy to log that as well?
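For reference, here is a minimal sketch of how the standard library's logging module could cover exactly these points, assuming the variables from the script above and a hypothetical compute_hashes helper (the helper name and its return value are assumptions, not part of the original script):

import logging

# Log to a file, with a timestamp on every line
logging.basicConfig(
    filename="somescript.log",
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)

logging.info("Converting %s and %s to grayscale", template_name, image_name)
# ... cv2.cvtColor calls here ...

logging.info("Attempting match with TM_SQDIFF_NORMED")
# ... cv2.matchTemplate / cv2.minMaxLoc here ...

logging.info("Drawing rectangle at %s", mnLoc)
# ... cv2.rectangle here ...

# Hypothetical hash helper; log whatever it returns
md5_digest, sha1_digest = compute_hashes(template_name)
logging.info("Hashes for %s: MD5=%s SHA1=%s", template_name, md5_digest, sha1_digest)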
I am a Python noob, so if the answers are obvious to you, you know why I couldn't figure them out myself.
I hope you can help me out on this one!
Related
I am experimenting with Python to read words and numbers from screenshots of a form, something like a scoreboard that can change several times a second. I think the project can be divided into two big parts:
Take a screenshot of the form several times a second
I already have a hint to use the win32 API for faster screenshots here.
Read the words and numbers from the screenshot, using a blank form as reference
For this, I already have a general idea from the YouTube video below:
https://www.youtube.com/watch?v=cUOcY9ZpKxw
What I understood is to apply tesseract to very specific points/areas in the form.
But with this method for the second part, I have a hunch that the execution time is rather slow (based on what I see in the video).
So my question is: is there any fast way to read a scoreboard that changes several times a second?
Edit:
Below is my current best effort with the project. I am only submitting the second part, which is the current bottleneck.
The image can be found here.
The problem is that even for just one screenshot frame, tesseract needs around 3 seconds to finish. I tried to use multiprocessing, but it seems my code is not clean enough, so the result is worse than not using it.
import cv2
import pytesseract
import time
import concurrent.futures

pytesseract.pytesseract.tesseract_cmd = "C:\\Program Files\\Tesseract-OCR\\tesseract.exe"

# The height of each field
h = 19

# The list of each field's area and name
fields = [[(75, 5), (130, h), "line 1"],
          [(75, 5 + h), (130, 2 * h), "line 2"],
          [(75, 5 + 2 * h), (130, 3 * h), "line 3"],
          [(75, 5 + 3 * h), (130, 4 * h), "line 4"],
          [(75, 5 + 4 * h), (130, 5 * h), "line 5"],
          [(75, 5 + 5 * h), (130, 6 * h), "line 6"],
          [(75, 5 + 6 * h), (130, 7 * h), "line 7"],
          [(75, 5 + 7 * h), (130, 8 * h), "line 8"],
          [(75, 5 + 8 * h), (130, 9 * h), "line 9"],
          [(75, 5 + 9 * h), (130, 10 * h), "line 10"]]

a = time.time()

# Load the filled form
img = cv2.imread("filled.jpg")
myData = []

# Function to crop one field and OCR it
def read(field):
    imgCrop = img[field[0][1]:field[1][1], field[0][0]:field[1][0]]
    data = pytesseract.image_to_string(imgCrop)
    return data

# Use this for serial processing
for field in fields:
    myData.append(read(field))

# Use this for multiprocessing
#if __name__ == '__main__':
#    with concurrent.futures.ProcessPoolExecutor() as executor:
#        results = executor.map(read, fields)
#        for result in results:
#            myData.append(result)

print(myData)
b = time.time()
print(b - a)
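One knob that sometimes helps with per-field reads (this is an assumption on my part, not something tested in the post) is tesseract's page segmentation mode: since each crop is a single text line, passing --psm 7 through pytesseract's config argument may reduce the work done per field. A sketch of the same read() idea with that setting:

# Sketch: same crop-and-OCR approach, but hinting tesseract that each crop is one text line.
# --psm 7 ("treat the image as a single text line") is an assumption about what fits
# these crops; measure before relying on it.
def read_single_line(field):
    imgCrop = img[field[0][1]:field[1][1], field[0][0]:field[1][0]]
    return pytesseract.image_to_string(imgCrop, config="--psm 7")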
EDIT 2:
It seems tesseract already uses multiprocessing by default, and adding multiprocessing manually on top of it only hinders the processing speed.
Also, it seems that OCR and image recognition, particularly their speed and accuracy, are still areas of active research, so maybe I need to wait a little longer.
Lastly, I will try Google Cloud Vision in the future.
UPDATE: I tried increasing the size parameter in chess.svg.board, and it somehow cleared all the rendering issues at size = 1800 (up from 900).
I tried using svglib and reportlab to make .png files from .svg files, and here is how the code looks:
import chess
import chess.svg
from svglib.svglib import svg2rlg
from reportlab.graphics import renderPM

board = chess.Board("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR")
drawing = chess.svg.board(board, size=350)

# Write the SVG out (and close the file) before converting it
with open('file.svg', 'w') as f:
    f.write(drawing)

drawing = svg2rlg("file.svg")
renderPM.drawToFile(drawing, "file.png", fmt="png")
If you try to open file.png, a lot of parts of the image are missing, which I guess are rendering issues. How can I fix this?
Side note: I am also getting a lot of 'x_order_2: colinear!' messages when running this on a Discord bot, but I am not sure whether this affects anything yet.
THIS!! I am having the same error with the same libraries... I didn't find a solution, just a workaround, which probably won't help too much in your case, where the shapes generating the bands are not very sparse vertically.
I'll try playing with the file dimensions too, but so far this is what I've got. Note that my svg consists of black shapes on a white background (hence the 255 - x in the code below).
Since the appearance of the bands is extremely random, and processing the same file several times in a row produces different results, I decided to take advantage of that randomness: I export the same svg a few times into different pngs, import them all into a list, and then only keep those pixels that are white in all the exported images, something like:
import os
from functools import reduce
import imageio

images_files = [my_convert_function(svgfile=file, index=i) for i in range(3)]
images = [255 - imageio.imread(x) for x in images_files]
result = reduce(lambda a, b: a & b, images)
imageio.imwrite(<your filename here>, result)
[os.remove(x) for x in images_files]
where my_convert_function contains your same svg2rlg and renderPM.drawToFile, and returns the name of the png file being written. The index 'i' is to save several copies of the same png with different names.
It's some very crude code, but I hope it can help other people with the same issue.
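For completeness, here is a rough sketch of what my_convert_function could look like, based purely on the description above (the helper name and its keyword arguments are just the placeholders already used in the snippet):

from svglib.svglib import svg2rlg
from reportlab.graphics import renderPM

def my_convert_function(svgfile, index):
    # Convert one SVG to a uniquely named PNG and return the PNG's filename
    png_name = "{}_{}.png".format(svgfile.rsplit(".", 1)[0], index)
    drawing = svg2rlg(svgfile)
    renderPM.drawToFile(drawing, png_name, fmt="PNG")
    return png_name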
The format parameter has to be in uppercase:
renderPM.drawToFile(drawing, "file.png", fmt="PNG")
I've searched the documentation for python-docx and other packages, as well as Stack Overflow, but could not find how to remove all images from docx files with Python.
My exact use case: I need to convert hundreds of Word documents to "draft" format to be viewed by clients. Those drafts should be identical to the original documents, but all the images must be deleted / redacted from them.
Sorry for not including an example of things I tried; what I have tried is hours of research that didn't give any info. I found this question on how to extract images from Word files, but that doesn't delete them from the actual document: Extract pictures from Word and Excel with Python
From there and other sources I've found out that docx files can be read as simple zip files. I don't know whether that means it's possible to "re-zip" them without the images without affecting the integrity of the docx file (edit: simply deleting the images works, but it prevents python-docx from continuing to work with the file because of missing references to the images), but I thought this might be a path to a solution.
Any ideas?
If your goal is to redact images, maybe this code I used for a similar use case could be useful:
import io
import zipfile
from PIL import Image, ImageFilter

blur = ImageFilter.GaussianBlur(40)

def redact_images(filename):
    outfile = filename.replace(".docx", "_redacted.docx")
    with zipfile.ZipFile(filename) as inzip:
        with zipfile.ZipFile(outfile, "w") as outzip:
            for info in inzip.infolist():
                name = info.filename
                print(info)
                content = inzip.read(info)
                if name.endswith((".png", ".jpeg", ".gif")):
                    # Blur the image payload and update the zip entry metadata
                    fmt = name.split(".")[-1]
                    img = Image.open(io.BytesIO(content))
                    img = img.convert().filter(blur)
                    outb = io.BytesIO()
                    img.save(outb, fmt)
                    content = outb.getvalue()
                    info.file_size = len(content)
                    info.CRC = zipfile.crc32(content)
                outzip.writestr(info, content)
Here I used PIL to blur the images in some files, but instead of the blur filter any other suitable operation could be used. This worked quite nicely for my use case.
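A possible way to run it over a folder of documents (the glob pattern and the skip condition here are just illustrative assumptions, not part of the original answer):

import glob

# Hypothetical batch run: redact every .docx in the current directory,
# skipping files that were already produced by redact_images()
for path in glob.glob("*.docx"):
    if not path.endswith("_redacted.docx"):
        redact_images(path)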
I don't think it's currently implemented in python-docx.
Pictures in the Word Object Model are defined as either floating shapes or inline shapes. The python-docx documentation states that it only supports inline shapes.
The Word Object Model for inline shapes supports a Delete() method, which should be accessible. However, it is not listed in the examples for InlineShapes, and there is a similar method for paragraphs. For paragraphs, there is an open feature request to add this functionality, which dates back to 2014! If it's not added for paragraphs, it won't be available for InlineShapes, as they are implemented as discrete paragraphs.
You could do this with win32com if you have a machine with Word and Python installed.
This would allow you to call the Word Object Model directly, giving you access to the Delete() method. In fact, you could probably cheat: rather than scrolling through the document to find each image, you can use Find and Replace to clear the images. This SO question talks about win32com find and replace:
import win32com.client
from os import getcwd, listdir

docs = [i for i in listdir('.') if i[-3:] == 'doc' or i[-4:] == 'docx']  # All Word files
FromTo = {"First Name": "John",
          "Last Name": "Smith"}  # You can insert as many as you want

word = win32com.client.DispatchEx("Word.Application")
word.Visible = True  # Keep visible for testing; comment out afterwards
word.DisplayAlerts = False

for doc in docs:
    word.Documents.Open('{}\\{}'.format(getcwd(), doc))
    for From in FromTo.keys():
        word.Selection.Find.Text = From
        word.Selection.Find.Replacement.Text = FromTo[From]
        word.Selection.Find.Execute(Replace=2, Forward=True)  # Replace must be 2 (wdReplaceAll)
    name = doc.rsplit('.', 1)[0]
    ext = doc.rsplit('.', 1)[1]
    word.ActiveDocument.SaveAs('{}\\{}_2.{}'.format(getcwd(), name, ext))

word.Quit()  # releases the Word object from memory
In this case, since we want to clear images, we would use the special code ^g as the Find text and an empty string as the replacement:
find = word.Selection.Find
find.Text = "^g"            # ^g matches inline graphics
find.Replacement.Text = ""  # replace with nothing, i.e. delete
find.Execute(Replace=2, Forward=True)  # 2 = wdReplaceAll, as noted above
I don't know about this library, but looking through the documentation I found this section about images. It mentions that it is currently not possible to insert images other than inline. If that is what you currently have in your documents, I assume you can also retrieve them by looking in the Document object and then remove them?
The Document is explained here.
Although not a duplicate, you might also want to look at this question's answer, where user "scanny" explains how he finds images using the library.
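Building on that, here is a rough sketch of how the inline picture elements might be dropped through python-docx's underlying XML. Note that ._inline is an internal, undocumented attribute and the filenames are hypothetical, so treat this as an assumption that may break between versions rather than a supported API:

from docx import Document

doc = Document("input.docx")  # hypothetical filename

# Walk the inline shapes and detach each one's XML element from its parent run
for shape in list(doc.inline_shapes):
    inline = shape._inline          # internal lxml element (undocumented)
    inline.getparent().remove(inline)

doc.save("input_no_images.docx")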
I have been trying to teach myself more advanced methods in Python, but I can't seem to find anything similar to this problem to base my code on.
First question: Is installing Pillow the only way to display an image from the terminal? I would prefer not to, as I'm trying to teach what I learn to a very beginner student. My image.show() call doesn't do anything.
Second question: What is the best way to lower the brightness of all RGB pixels in an image by 20%? What I have below doesn't do anything to alter the brightness, although it does run to completion. I would prefer the simplest approach that imports minimal libraries.
Third question: How do I make a new picture instead of changing the original? (i.e. lower the brightness by 20% so that "image-decreasedBrightness.jpg" is created from "image.jpg")
Here is my code. Sorry it isn't formatted correctly; every time I tried to indent, it would tab down to the tags bar.
from PIL import Image, ImageEnhance

fileToBeOpened = input("What is the file name? Include file type.")
image = Image.open(fileToBeOpened)

def decreaseBrightness(image):
    image.show()
    image = image.convert('L')
    brightness = ImageEnhance.Brightness(image)
    image = brightness.enhance(20)
    image.show()
    return image

decreaseBrightness(image)
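For what it's worth, here is a minimal sketch of the 20% reduction and the save-to-a-new-file part, still using Pillow. The enhancement factor 0.8 means 80% of the original brightness; the output filename follows the example given in the question:

from PIL import Image, ImageEnhance

img = Image.open("image.jpg")

# enhance(0.8) keeps 80% of the original brightness, i.e. a 20% reduction;
# the original image object is left untouched.
darker = ImageEnhance.Brightness(img).enhance(0.8)

darker.save("image-decreasedBrightness.jpg")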
To save the image as a file, there's an example in the documentation:
from PIL import ImageFile

fp = open("lena.pgm", "rb")
p = ImageFile.Parser()

while 1:
    s = fp.read(1024)
    if not s:
        break
    p.feed(s)

im = p.close()
im.save("copy.jpg")
The key function is im.save.
For a more in-depth solution, get a nice beverage, find a comfortable place to sit and enjoy your read:
Pillow 3.4.x Documentation.
OK, so I'm using pyaudio as well, but from what I've been looking at, the wave module could maybe help me out here.
I'm trying to add a trimming function to my program; that is, I'm trying to allow the user to find parts of a .wav file that he/she doesn't like and trim the file however he/she wants.
So far I've been using pyaudio for simple playback, and pyaudio is really easy when it comes to recording from an input device.
I've been searching for anything in pyaudio that could trim audio, but I really haven't found anything that helps. Though in the built-in wave module I see there are ways to set the position.
Would I have to use a loop or an if statement so that the program knows which positions to record, and then have either pyaudio or the wave module record the song from user-set positions (beginning, end)? Would my program run efficiently if I approached it this way?
Let's assume that you read in the wave file using scipy.
Then you need "edit points". These are in and out values (in seconds, for example) for the parts the user would like to keep. You could get these from a file, or by displaying the waveform and getting mouse clicks. If the user instead gives parts of the audio file that should be removed, then those intervals would first need to be negated (see the sketch after the code below).
This is not the most efficient solution, but it should be OK for many scenarios.
import numpy
import scipy.io.wavfile

fs1, y1 = scipy.io.wavfile.read(filename)

# "Edit points" to keep, in seconds: [start, end] pairs
l1 = numpy.array([[7.2, 19.8], [35.3, 67.23], [103, 110]])
# Convert to integer sample indices into the wav file; be careful of reading
# past the end of the array, hence the checks against y1.shape below
l1 = numpy.ceil(l1 * fs1).astype(int)

newWavFileAsList = []
for elem in l1:
    startRead = elem[0]
    endRead = elem[1]
    if startRead >= y1.shape[0]:
        startRead = y1.shape[0] - 1
    if endRead >= y1.shape[0]:
        endRead = y1.shape[0] - 1
    newWavFileAsList.extend(y1[startRead:endRead])

newWavFile = numpy.array(newWavFileAsList)
scipy.io.wavfile.write(outputName, fs1, newWavFile)
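As mentioned above, if the user supplies intervals to remove rather than to keep, those would first need to be negated. A small sketch of that step, assuming the removal intervals are sorted, non-overlapping, and given in seconds (the helper name is mine):

def negate_intervals(remove_intervals, total_seconds):
    # Turn a sorted list of [start, end] intervals to remove into the
    # complementary list of [start, end] intervals to keep.
    keep = []
    cursor = 0.0
    for start, end in remove_intervals:
        if start > cursor:
            keep.append([cursor, start])
        cursor = max(cursor, end)
    if cursor < total_seconds:
        keep.append([cursor, total_seconds])
    return keep

# Example: remove two segments from a 120-second file
print(negate_intervals([[10.0, 20.0], [50.0, 60.0]], 120.0))
# -> [[0.0, 10.0], [20.0, 50.0], [60.0, 120.0]]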