I am getting confused as to how to display a PDF document in its true scale, i.e. scale = 100%.
NB: I am using python-poppler-qt4.
Poppler-qt4 provides a method to get the true size of the PDF in points:
document = Poppler.Document.load('mypdf.pdf')
page = document.page(0)
size = page.pageSize() # returns a QSize object
Then to render the page into a QImage, one should provide the resolution of the graphics device, in dots per inch (DPI):
image = page.renderToImage(72, 72)
Now since the natural size of the document is provided in points (i.e. 72 per inch), and the image renderer requires dots per inch, can I just assume that the natural size of the document is when its resolution is at 72 DPI? Or are dots and points two different measures? If I am wrong, then what is the solution to this?
The points in a PDF file are physical units: you can measure them with a ruler. The dots (pixels) in the image are virtual units, and the connection between the two is the resolution. When you move content from vector space to raster space, you decide the relation between points and pixels (the resolution used for the conversion), so it is up to your application to decide what 100% means. At 72 DPI one point maps to exactly one pixel, but that matches the physical size only if the display really has 72 pixels per inch.
Most applications use the DPI of the screen as the reference for 100% scale. On Windows this usually means 96 DPI, so one inch of your PDF file is drawn across 96 pixels on the screen. Adobe Reader lets you set the resolution used for 100% scale; by default it is 110 DPI.
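As a rough illustration of the relation described above, this sketch converts a page size in points into the pixel size of the rendered image at a chosen resolution. The helper name `points_to_pixels` and the 612 x 792 pt example page are assumptions made for illustration; they are not part of the Poppler API.

```python
POINTS_PER_INCH = 72  # a PDF point is 1/72 of an inch


def points_to_pixels(width_pt, height_pt, dpi):
    """Return the (width, height) in pixels of a page rendered at `dpi`."""
    scale = dpi / POINTS_PER_INCH
    return round(width_pt * scale), round(height_pt * scale)


# Hypothetical page of 612 x 792 points (8.5 x 11 inches).
print(points_to_pixels(612, 792, 72))  # one pixel per point
print(points_to_pixels(612, 792, 96))  # a 96 DPI screen
```

Passing the screen's DPI to `page.renderToImage(dpi, dpi)` instead of 72 should then give the on-screen size that most viewers call 100%.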
OpenCV doesn't read image metadata, so we can't get the DPI of an image from it. When someone asks a DPI-related OCR question on Stack Overflow, most of the answers say we don't need the DPI, only the pixel size:
Changing image DPI for usage with tesseract
Change dpi of an image in OpenCV
In other places, where no one asks about DPI and the goal is to improve OCR accuracy, someone suggests that setting the DPI to 300 will improve the accuracy:
Tesseract OCR How do I improve result?
Best way to recognize characters in screenshot?
One more thing: Tesseract says this on its official page:
Tesseract works best on images which have a DPI of at least 300 dpi, so it may be beneficial to resize images.
After some Google searching, I found the following:
We can't tell the image resolution from its height and width alone.
We want an image resolution that is high enough to support accurate OCR.
Font size is typically a physical length, not a pixel count: 72 points make one inch, so a 12 pt font size means 1/6 inch.
With a 300 ppi image and 12 pt text, the text height is 300 × 1/6 = 50 pixels.
At 60 ppi the text height is 60 × 1/6 = 10 pixels.
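The arithmetic in the list above can be sketched as a small helper. The function name `text_height_px` is hypothetical, and the result is the nominal font size in pixels, not the measured height of any particular glyph.

```python
POINTS_PER_INCH = 72  # 72 points make one inch


def text_height_px(font_size_pt, ppi):
    """Nominal text height in pixels for a font size in points at `ppi`."""
    return font_size_pt / POINTS_PER_INCH * ppi


# 12 pt text at a few example resolutions.
for ppi in (60, 120, 300):
    print(ppi, "ppi ->", round(text_height_px(12, ppi)), "px")
```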
The following is quoted from the official Tesseract page.
Is there a Minimum / Maximum Text Size? (It won’t read screen text!)
There is a minimum text size for reasonable accuracy. You have to consider resolution as well as point size. Accuracy drops off below 10pt x 300dpi, rapidly below 8pt x 300dpi. A quick check is to count the pixels of the x-height of your characters. (X-height is the height of the lower case x.) At 10pt x 300dpi x-heights are typically about 20 pixels, although this can vary dramatically from font to font. Below an x-height of 10 pixels, you have very little chance of accurate results, and below about 8 pixels, most of the text will be “noise removed”.
Using LSTM there seems also to be a maximum x-height somewhere around 30 px. Above that, Tesseract doesn’t produce accurate results. The legacy engine seems to be less prone to this (see https://groups.google.com/forum/#!msg/tesseract-ocr/Wdh_JJwnw94/24JHDYQbBQAJ).
From all this, I arrived at the following:
We need 10 to 12 pt text for OCR, which means that at 120 ppi (pixels per inch) the text should be about 20 pixels tall, and at 300 ppi it should be about 50 pixels tall.
1. If OpenCV doesn't read the DPI information, what DPI does Tesseract assume for an image loaded with OpenCV's imread method?
2. Does Tesseract resize the image internally based on its DPI?
3. If resizing does happen internally based on the DPI, then after resizing the image with OpenCV I need to set the DPI to 300. What is the easiest way to set the DPI with OpenCV + pytesseract? (I know this can be done with PIL.)
To answer your questions:
1. DPI is only really relevant when scanning documents: it's a measure of how many dots per inch are used to represent the scanned image. Once tesseract is processing images, it only cares about pixels.
2. Not as far as I can tell.
3. The SO answer you linked to relates to writing an image, not reading an image.
I think I understand the core of what you're trying to get at. You're trying to improve the accuracy of your results as it relates to font/text size.
Generally speaking, tesseract seems to work best on text that is about 32 px tall.
Manual resizing
If you're working on a small set of images or a consistent group of images, you can manually resize those images to have capital letters that are approximately 32 pixels tall. That should theoretically give the best results in tesseract.
Automatic resizing
I'm working with an inconsistent data set, so I need an automated approach to resizing images. What I do is to find the bounding boxes for text within the image (using tesseract itself, but you could use EAST or something similar).
Then, I calculate the median height of these bounding boxes. Using that, I can calculate how much I need to resize the image so that the median height of a capital letter in the image is ~32 px tall.
Once I've resized the image, I rerun tesseract and hope for the best. Yay!
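The resizing step can be sketched roughly as follows. This is not the code from the Gist: the names `resize_factor` and `resized_dimensions` are hypothetical, and the box heights are assumed to come from whatever detector you use (tesseract itself, EAST, or similar).

```python
from statistics import median

TARGET_HEIGHT_PX = 32  # text height the answer reports as working best


def resize_factor(box_heights, target=TARGET_HEIGHT_PX):
    """Scale factor that brings the median text-box height to `target`."""
    if not box_heights:
        raise ValueError("no text boxes detected")
    return target / median(box_heights)


def resized_dimensions(width, height, factor):
    """New image size in pixels after applying `factor` to both axes."""
    return max(1, round(width * factor)), max(1, round(height * factor))


# Made-up bounding-box heights (in pixels) for a 1000 x 800 image.
heights = [14, 16, 18, 15, 40]
factor = resize_factor(heights)
print(factor, resized_dimensions(1000, 800, factor))
```

Using the median rather than the mean keeps a single oversized box (a headline, say) from skewing the factor. The new dimensions can then be passed to your image library's resize call before rerunning tesseract.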
Hope that helps somewhat! :)
Bonus: I shared my source code for this function in this Gist
Questions about ghostscript and tkinter:
I have made a tkinter program and I want to convert its canvas into an image. I want the image to have the same ratio as 8.5 x 12 in paper; I have read that this is 2550x3300 pixels.
How does this translate to canvas coordinates? For now I picked some numbers with the same ratio: width=1275, height=1650.
Also, what exactly is the canvas and its size? I thought it was the length of what I could write in, but then I can set the scroll regions even further than the given width and height.
Basic idea for the canvas code:
import tkinter as tk
from tkinter import (Scrollbar, Button, HORIZONTAL, VERTICAL,
                     TOP, RIGHT, LEFT, X, Y, BOTH)

class Application(tk.Frame):
    def __init__(self, master):
        super().__init__(master)  # initialise the underlying Frame
        self.master = master
        self.canvas = tk.Canvas(master, width=1275, height=1650, bg='white',
                                highlightthickness=0,
                                scrollregion=(0, 0, 1275, 1650))
        # Horizontal and vertical scrollbars drive the canvas view.
        self.hbar = Scrollbar(master, orient=HORIZONTAL)
        self.hbar.pack(side=TOP, fill=X)
        self.hbar.config(command=self.canvas.xview)
        self.vbar = Scrollbar(master, orient=VERTICAL)
        self.vbar.pack(side=RIGHT, fill=Y)
        self.vbar.config(command=self.canvas.yview)
        self.canvas.config(width=1275, height=1650)
        self.canvas.config(xscrollcommand=self.hbar.set, yscrollcommand=self.vbar.set)
        self.canvas.pack(side=LEFT, expand=True, fill=BOTH)
        B1 = Button(master, text='add image', command=self.insert_image)
        B1.pack(side=TOP)
and here is my Ghostscript related code:
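import ghostscript  # python-ghostscript bindings

def _save(self):
    # Write the canvas out as PostScript, then rasterise it with Ghostscript.
    self.canvas.postscript(file="tmp.ps", colormode='color')
    args = [
        "ps2jpg",  # program-name slot of the argument list
        "-dSAFER", "-dBATCH", "-dNOPAUSE",
        "-sDEVICE=png16m",
        "-sOutputFile=./ABC.png",
        "./tmp.ps"
    ]
    ghostscript.Ghostscript(*args)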
And it seems to cut off the left and right sides.
Then I added parameters such as "-dFitPage", "-g1275x1650", "-dPSFitPage", or even "-g2550x3300" instead of "-g1275x1650", but that creates a different problem,
where the top of my canvas ends up in the middle of my saved image. What I want is for the top of my canvas to appear at the top of my image.
Thank you.
OK, so firstly, the size you have quoted uses '-g', which takes a number of pixels. Clearly the actual media size will then depend on the resolution. If I declare the size to be 600x600 and the resolution is 600 dpi then that's 1 inch by 1 inch; if the resolution is 300 dpi, then it's 2 inches by 2 inches.
So you can't say 8.5x12 inch (is that supposed to be one of the standard media sizes?) is 2550x3300 pixels without also stating the resolution. In fact that can't even be correct. If I assume that 3300 is correct for a 12 inch length, then that's a resolution of 275 dpi. If I then figure the width, it's 2550/275 = 9.27 inches, not 8.5.
As it happens, the default resolution of the png16m device appears to be 72. So 2550 by 3300 pixels means that your media is about 35x46 inches. Not too surprising that you have scroll bars :-)
Of course, it's possible that your PostScript program alters the resolution, but since you haven't supplied it to look at, I can't tell.
Now, PostScript co-ordinate systems start (by default) at the bottom left corner, which is 0,0, and extend in both directions: positive numbers go up and right, negative numbers go down and left. Yes, it's entirely possible to specify that part or all of a drawing operation takes place off the media.
You can also alter the co-ordinate system too, but that's probably more complex than you want to get into.
Without seeing your PostScript program, I can't really say why it lies partially off the media, it may be that that's what the program is asking for.
Using FitPage will attempt to fit the requested image to the page: if it's too big it will scale it down (linearly, both directions equally) until both dimensions fit into the media. This will result in white space in one direction unless your media happens to be the same shape as the program requested. That smaller dimension is then centered. I don't recall exactly, but I think if the program's marks already fit into the media, then it just centres them.
So basically, you need to get the dimensions correct to start with. Assuming you are happy with a 72 dpi output image, and that your media is genuinely 8.5x12 inches, then you can specify -g612x864. If the rendered image doesn't fit precisely then it's probable that your program makes marks off the media, is using a different media size, or 'something'. Can't say what without seeing the PostScript.
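To make the pixels-versus-inches point concrete, here is a minimal sketch that derives the '-g' value from a media size in inches and a resolution. The helper name `geometry_flag` is hypothetical; only the `-g<width>x<height>` switch itself comes from Ghostscript.

```python
def geometry_flag(width_in, height_in, dpi=72):
    """Build a Ghostscript -g switch (device pixels) for a media size in inches."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return f"-g{width_px}x{height_px}"


print(geometry_flag(8.5, 12))        # at the 72 dpi default discussed above
print(geometry_flag(8.5, 12, 300))   # the same media at 300 dpi
```

If you pick a resolution other than the device default, the matching `-r<dpi>` switch has to be passed as well, otherwise the pixel size and the media size no longer agree.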
If you can share a simple PostScript file I can look at it (I can't use anything that requires me to use tkinter, sorry) and give you some more detailed guidance.
[EDIT]
So the output is actually an EPS, not a PostScript program, we can see this from the initial comments (any line beginning with '%' is a comment):
%!PS-Adobe-3.0 EPSF-3.0
%%Creator: Tk Canvas Widget
%%Title: Window .49823304L
%%CreationDate: Mon Aug 14 23:47:27 2017
%%BoundingBox: -171 85 785 707
%!PS tells us it's a PostScript program, -Adobe-3.0 tells us it conforms to version 3.0 of the Document Structure Convention (a way of creating PostScript programs that makes them more portable for non PostScript interpreters), EPSF tells us it's actually an EPS, and finally the trailing -3.0 declares that it conforms to version 3.0 of the EPS specification.
Now EPS files are not intended to be sent directly to a PostScript interpreter. They are supposed to be included inside other PostScript programs and used as a kind of 'black box'. This technique is often used when the object is something like a company logo which does not change, or when you send work to an outside agency (eg a freelance graphic artist) they may send it as an EPS for you to use.
The EPS conforms to certain rules regarding what it can and can't contain. One of the important things it cannot do is set the media size. Execution of setpagedevice can cause the device to reset the marked content, which would throw away any marks made before the media selection.
Additionally, the EPS doesn't know how big it's going to be when drawn on the final page. Think, for example, of a logo being drawn large on the front page, then drawn small in the footer of each page.
So what the EPS contains is a declaration of where it marks the page, this is given by the BoundingBox:
%%BoundingBox: -171 85 785 707
Now you will note that the BoundingBox of this EPS begins at -171,85 and extends to 785,707. So its width is 785 - (-171) = 956 and its height is 707 - 85 = 622. Those are in PostScript units, which are 1/72 of an inch. So your EPS is actually about 13.28 inches wide by 8.64 inches tall. Not only that, but it begins 2.375 inches off the left edge of the media.
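The BoundingBox arithmetic can be checked with a short sketch like the one below. The function names are hypothetical, and the parser only handles the plain four-integer form of the comment shown above.

```python
POINTS_PER_INCH = 72  # PostScript units are 1/72 of an inch


def parse_bounding_box(comment):
    """Parse a '%%BoundingBox: llx lly urx ury' comment into four ints."""
    prefix = "%%BoundingBox:"
    if not comment.startswith(prefix):
        raise ValueError("not a BoundingBox comment")
    llx, lly, urx, ury = (int(v) for v in comment[len(prefix):].split())
    return llx, lly, urx, ury


def box_size_inches(llx, lly, urx, ury):
    """Width and height of the box in inches."""
    return (urx - llx) / POINTS_PER_INCH, (ury - lly) / POINTS_PER_INCH


box = parse_bounding_box("%%BoundingBox: -171 85 785 707")
width_in, height_in = box_size_inches(*box)
print(box, round(width_in, 2), round(height_in, 2))
```

A negative lower-left x value, as here, means the content starts to the left of the media's origin unless the co-ordinate system is translated first.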
This probably explains why you are having trouble getting the output you want: you are probably not setting the media correctly in Ghostscript, nor translating the origin so that the left edge doesn't lie off the media.
It's the job of the application which places the EPS in its own PostScript output to translate and scale the co-ordinate system so that the EPS is at the required position and size on the final media.
So you have a choice: you can create a PostScript program which sets up the Current Transformation Matrix appropriately to scale and position the EPS, and then includes the EPS in its entirety, finishing with a showpage to actually render it (EPS files may not include a showpage for obvious reasons).
Or you can use the -dEPSCrop or -dEPSFitPage switches in Ghostscript (documented here), which will fit the page to the content, or the content to the page, respectively. Note that the precise behaviour of the FitPage switch depends on the exact version of Ghostscript you are using, which you haven't mentioned. The documentation there is for the current version, 9.21.
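As a sketch of the second option, the argument list from the question could be extended with -dEPSCrop along these lines. The helper `build_gs_args` and its parameters are hypothetical, the file names are the ones from the question, and I have not run this against the actual EPS, so treat it as a starting point only.

```python
def build_gs_args(ps_path, png_path, dpi=72, eps_crop=True):
    """Assemble a Ghostscript argument list for rendering an EPS to PNG."""
    args = [
        "gs",  # program-name slot of the argument list
        "-dSAFER", "-dBATCH", "-dNOPAUSE",
        "-sDEVICE=png16m",
        f"-r{dpi}",  # output resolution in dots per inch
    ]
    if eps_crop:
        # Size the page to the EPS BoundingBox instead of the default media.
        args.append("-dEPSCrop")
    args += [f"-sOutputFile={png_path}", ps_path]
    return args


print(build_gs_args("./tmp.ps", "./ABC.png"))
```

The resulting list can be passed on as in the question, e.g. `ghostscript.Ghostscript(*args)`.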
If you create a PostScript program yourself to do the work then you have complete control over how it's rendered; if you let Ghostscript do it then you have less control but it's simple to do. Your choice really.
NB the pastebin stops abruptly and makes no actual marks, so it's not a valid PostScript program. If you put the whole file on DropBox or something I could perhaps be more specific, but the gist is certainly covered above.
This is the first time I am using this software to create an experiment.
For my experiment I am presenting two images side by side. Ideally I would like to run this experiment in fullscreen, but when I set the value to true, the images become stretched. How do I fix their aspect ratio so I can run the program in fullscreen without stretching the images?
I am using a MacBook Pro and the PsychoPy coder.
Here is my current code for the images:
scale = 0.7
faceRGB = visual.ImageStim(win, image='male.jpg',
                           mask=None,
                           pos=(0.0, 0.0),
                           size=(scale, scale))
faceRGBINV = visual.ImageStim(win, image='maleInv.jpg',
                              mask=None,
                              pos=(0.0, 0.0),
                              size=(scale, scale))
Furthermore, in my experiment one of the images will deliberately be slightly compressed or stretched. The participants will then have to choose the fatter face. This is already set up, and when run in a window the images appear normal; it is only in fullscreen mode that they become stretched to fit the monitor size.
By default, PsychoPy uses 'norm' as units, which is size normalized to the window dimensions. You may have a situation where you (1) change the size of the image and (2) the image just happens to have the correct dimensions when presented in the default 800 x 800 pixels Window but appears stretched when you go fullscreen because your monitor has another aspect ratio.
If you don't change the size of the image, PsychoPy maintains the correct aspect ratio. Scaling the image will preserve this aspect ratio, so that's an easy solution. E.g. add one line after initiating the ImageStim:
scale = 0.7
from psychopy import visual
win = visual.Window(fullscr=True)
faceRGB = visual.ImageStim(win, 'male.jpg')
faceRGB.size *= scale # scale the image relative to initial size
If you want to control size directly and not just proportionally, see this discussion on the users list. I suggested the following solution. Say you want to set the image size so that scale is the maximum length along either the x- or y-axis and scale the other axis proportionally. Replace the last line above with this:
faceRGB.size *= scale / max(faceRGB.size)
Multiplying maintains the aspect ratio as above, and the right-hand side is the multiplication factor that makes the longest side equal to scale. Change max to min if you want this to apply to the minimum length instead of the maximum length.
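The same calculation, written without PsychoPy so it can be tried on plain numbers, might look like this. The function name `scale_to_side` and the example size are made up for illustration.

```python
def scale_to_side(size, scale, pick=max):
    """Rescale (width, height) so the side chosen by `pick` equals `scale`."""
    width, height = size
    factor = scale / pick(width, height)  # one factor for both axes
    return width * factor, height * factor


size = (0.8, 0.4)  # hypothetical initial stimulus size
print(scale_to_side(size, 0.7))            # longest side becomes 0.7
print(scale_to_side(size, 0.7, pick=min))  # shortest side becomes 0.7
```

Because both axes are multiplied by the same factor, the width-to-height ratio of the result is the same as that of the input.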
Note: you do not need to set pos=(0,0) and mask=None as that is the default value of these parameters.
I'm building a label printer. It consists of a logo and some text, not tough. I have already spent 3 days trying to get the original SVG logo to draw to screen but the SVG is too complex, using too many gradients, etc.
So I have a high quality bitmapped logo (as a JPG or PNG) and I'm drawing that on a ReportLab canvas. The image in question is much larger than 85*123px. I did this hoping ReportLab would embed the whole thing and scale it accordingly. Here's how I'm doing it:
canvas.drawImage('logo.jpg', 22+xoffset, 460, 85, 123)
The problem is, my assumption was incorrect. It seems to scale it down to 85*123px at screen resolution and that means when it's printed, it doesn't look great.
Does ReportLab have any DPI commands for canvases or documents so I can keep the quality sane?
Having previously worked at the ReportLab company, I can tell you that raster images do not go through any automatic resampling/downscaling while being included in the PDF. The 85*123 dimensions you are using are not pixels, but points (pt) which are a physical unit like millimetres or inches.
I would suggest printing the PDF with different quality images to confirm this or otherwise zooming in very, very far using your PDF viewer. It will always look a bit fuzzy in a PDF viewer as the image is resampled twice (once in the imaging software and then again to the pixels available to the PDF viewer).
This is how I would calculate what size in pixels to make a raster image for it to print well at a given physical size:
Assume I want the picture to be 2 inches wide. There are 72 points in an inch, so the width in my code would be 144. I know that a good crisp resolution to print at is 300 dpi (dots per inch), so the raster image is saved at 600 px wide.
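That calculation can be sketched as follows. The helper names are hypothetical and nothing here calls ReportLab; it only reproduces the arithmetic so you can plug in your own sizes.

```python
POINTS_PER_INCH = 72  # there are 72 points in an inch


def draw_width_points(width_in):
    """Width to use in the drawing code for a physical width in inches."""
    return width_in * POINTS_PER_INCH


def required_pixels(width_in, print_dpi=300):
    """Pixel width the raster image needs to print at `print_dpi`."""
    return round(width_in * print_dpi)


def effective_dpi(pixel_width, width_pt):
    """Resolution an image ends up with when drawn `width_pt` points wide."""
    return pixel_width / (width_pt / POINTS_PER_INCH)


print(draw_width_points(2), required_pixels(2))  # the 2 inch example above
print(round(effective_dpi(600, 144)))            # resolution of that image
```

`effective_dpi` works in the other direction: given an existing image and the size it is drawn at, it tells you what resolution the printer will actually receive.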
One option that I thought of while writing the question is: increase the size of the PDF and let the printer sort things out.
If I just multiplied all my numbers by 5 and the printer did manage to figure things out, I'd have close to 350DPI... But I'm making quite an assumption.
I don't know if it will work for all but in my case it did.
I only needed to add a logo on the top so I used drawImage()
but shrank the logo to a third of its pixel dimensions:
c.drawImage(company_logo,225,750,width=(483/3),height=(122/3))
I had to previously know the real company logo size so it does not get distorted.
I hope it helps!
Using Python's Imaging Library I want to create a PNG file.
I would like it if when printing this image, without any scaling, it would always print at a known and consistent 'size' on the printed page.
Is the resolution encoded in the image?
If so, how do I specify it?
And even if it is, does this have any relevance when it goes to the printer?
As of PIL 1.1.5, there is a way to get the DPI:
im = ...  # get image into PIL image instance
dpi = im.info["dpi"]  # retrieve the DPI
print(dpi)  # (x-res, y-res)
new_dpi = (300, 300)  # example value only: (x-res, y-res)
im.info["dpi"] = new_dpi
im.save("myfile.png", dpi=new_dpi)  # pass dpi explicitly so the file uses the new DPI
I found a very simple way to get dpi information into the png:
im.save('myfile.png',dpi=[600,600])
Unfortunately I did not find this documented anywhere and had to dig into the PIL source code.
Printers have various resolutions in which they print. If you select a print resolution of 200 DPI for instance (or if it's set as default in the printer driver), then a 200 pixel image should be one inch in size.
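The relation between pixels, print resolution and physical size is simple enough to sketch directly. The function name `printed_size_inches` is hypothetical and the sample values are only examples.

```python
def printed_size_inches(width_px, height_px, dpi):
    """Physical size of an image printed without scaling at `dpi`."""
    return width_px / dpi, height_px / dpi


# The same 200 x 200 pixel image at three example print resolutions.
for dpi in (100, 200, 600):
    print(dpi, "DPI ->", printed_size_inches(200, 200, dpi), "inches")
```

This is why the DPI stored in the file matters only if the printing software honours it: the pixel data is identical in all three cases.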
Both image print size and resolution are relevant to printing an image of a specific scale and quality. Bear in mind that if the image is then included with a desktop publishing workspace (Word, InDesign) or even a web page, the image is then subject to any specified resolution in the parent document -- this won't necessarily alter the relative scale of the image in the case of desktop publishing programs but will alter image quality.
And yes, all images have a resolution property, which answers half your question - I don't know Python...
Much is going to depend on the software you're using to print. If you're placing the image in a Word document, it will scale according to the DPI, up to the width of your page. If you're putting it on a web page, the DPI will not matter at all.