Ghostscript is cutting off my ps to png - python

Questions about ghostscript and tkinter:
I have made a tkinter program and I want to convert it into an image. I want the image to have the same ratio as 8.5 x 12 in paper; I have read that this is 2550x3300 pixels.
How does this translate to canvas coordinates? For now I picked some numbers with a similar ratio, width=1275,height=1650
Also, what exactly is the canvas and its size? I thought it was the length of what I could write in, but then I can set the scroll regions even further than the given width and height.
Basic idea for the canvas code:
    import tkinter as tk  # "import Tkinter as tk" on Python 2

    class Application(tk.Frame):
        def __init__(self, master):
            tk.Frame.__init__(self, master)
            self.master = master
            self.canvas = tk.Canvas(master, width=1275, height=1650, bg='white',
                                    highlightthickness=0,
                                    scrollregion=(0, 0, 1275, 1650))
            # Horizontal and vertical scrollbars drive the canvas view.
            self.hbar = tk.Scrollbar(master, orient=tk.HORIZONTAL)
            self.hbar.pack(side=tk.TOP, fill=tk.X)
            self.hbar.config(command=self.canvas.xview)
            self.vbar = tk.Scrollbar(master, orient=tk.VERTICAL)
            self.vbar.pack(side=tk.RIGHT, fill=tk.Y)
            self.vbar.config(command=self.canvas.yview)
            self.canvas.config(xscrollcommand=self.hbar.set, yscrollcommand=self.vbar.set)
            self.canvas.pack(side=tk.LEFT, expand=True, fill=tk.BOTH)
            # insert_image is defined elsewhere in the program (not shown).
            b1 = tk.Button(master, text='add image', command=self.insert_image)
            b1.pack(side=tk.TOP)
and here is my Ghostscript related code:
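    import ghostscript

    # Method of Application:
    def _save(self):
        # Tk writes the canvas out as a PostScript file.
        self.canvas.postscript(file="tmp.ps", colormode='color')
        args = [
            "ps2jpg",  # argv[0]: a program name, not a Ghostscript option
            "-dSAFER", "-dBATCH", "-dNOPAUSE",
            "-sDEVICE=png16m",
            "-sOutputFile=./ABC.png",
            "./tmp.ps"
        ]
        ghostscript.Ghostscript(*args)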
And it seems to cut off at the left and right sides.
Then I added parameters such as "-dFitPage", "-g1275x1650", "-dPSFitPage", or even "-g2550x3300" instead of "-g1275x1650", but that creates a different problem,
where the top of my canvas ends up in the middle of my saved image. What I want is the top of my canvas to appear at the top of my image.
Thank you.

OK, so firstly, the size you have quoted uses '-g', which is the number of pixels. Clearly the actual media size will then depend on the resolution. If I declare the size to be 600x600 and the resolution is 600 dpi then that's 1 inch by 1 inch; if the resolution is 300 dpi, then it's 2 inches by 2 inches.
So you can't say 8.5x12 inch (is that supposed to be one of the standard media sizes?) is 2550x3300 pixels without also stating the resolution. In fact that can't even be correct. If I assume that 3300 is correct for the 12 inch length, then that's a resolution of 275 dpi. If I then figure the width, it's 2550/275, which is about 9.27 inches, not 8.5.
As it happens, the default resolution of the png16m device appears to be 72 dpi. So 2550 by 3300 pixels means that your media is roughly 35x46 inches. Not too surprising that you have scroll bars :-)
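The pixel/resolution arithmetic can be checked with a few lines of Python. This is only a sketch; `media_size_inches` is a name made up for illustration.

```python
def media_size_inches(width_px, height_px, dpi):
    """Return the physical media size implied by a pixel size at a resolution."""
    return width_px / dpi, height_px / dpi


# 600x600 pixels is 1 inch square at 600 dpi and 2 inches square at 300 dpi.
print(media_size_inches(600, 600, 600))
print(media_size_inches(600, 600, 300))

# At 72 dpi, 2550x3300 pixels describes a very large sheet.
print(media_size_inches(2550, 3300, 72))
```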
Of course, it's possible that your PostScript program alters the resolution, but since you haven't supplied it to look at, I can't tell.
Now, PostScript co-ordinate systems start (by default) at the bottom left corner, which is 0,0, and extend in both directions: positive numbers go up and right, negative numbers go down and left. Yes, it's entirely possible to specify that part or all of a drawing operation takes place off the media.
You can also alter the co-ordinate system too, but that's probably more complex than you want to get into.
Without seeing your PostScript program, I can't really say why it lies partially off the media, it may be that that's what the program is asking for.
Using FitPage will attempt to fit the requested image to the page: if it's too big, it will be scaled down (linearly, both directions equally) until both dimensions fit into the media. This will result in white space in one direction unless your media happens to be the same shape as the program requested. The smaller dimension is then centred. I don't recall exactly, but I think that if the program's marks already fit into the media, it just centres them.
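To make the scale-then-centre behaviour concrete, here is a minimal sketch of the arithmetic as described. It is an illustration of the description only, not taken from Ghostscript's source, and `fit_to_media` is a hypothetical helper name. All sizes are in the same units (for example PostScript points).

```python
def fit_to_media(content_w, content_h, media_w, media_h):
    """Scale content down uniformly until it fits, then centre it.

    Returns (scale, x_offset, y_offset). Content that already fits is
    only centred (scale stays 1.0), which is an assumption based on the
    description above.
    """
    scale = min(1.0, media_w / content_w, media_h / content_h)
    x_offset = (media_w - content_w * scale) / 2
    y_offset = (media_h - content_h * scale) / 2
    return scale, x_offset, y_offset


# A 1275x1650 drawing placed on 612x864 media (8.5x12 inches at 72 per inch).
scale, dx, dy = fit_to_media(1275, 1650, 612, 864)
print(scale, dx, dy)
```

In this example the width is the limiting dimension, so the leftover height is split above and below the drawing. That kind of centring would explain the top of a drawing no longer sitting at the top of the image.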
So basically, you need to get the dimensions correct to start with. Assuming you are happy with a 72 dpi output image, and that your media is genuinely 8.5x12 inches, then you can specify -g612x864. If the rendered image doesn't fit precisely then it's probable that your program makes marks off the media, is using a different media size, or 'something'. I can't say what without seeing the PostScript.
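As a rough sketch of how the matching `-r` and `-g` values could be derived from a physical size, the helper below builds an argument list in the same shape as the one in the question. `gs_png_args` and its default file names are made up for illustration; the first list element is only a program-name placeholder.

```python
def gs_png_args(width_in, height_in, dpi=72, infile="tmp.ps", outfile="ABC.png"):
    """Build Ghostscript arguments for a PNG of the given physical size."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return [
        "ps2png",  # argv[0] placeholder, not a Ghostscript option
        "-dSAFER", "-dBATCH", "-dNOPAUSE",
        "-sDEVICE=png16m",
        f"-r{dpi}",                     # resolution in dots per inch
        f"-g{width_px}x{height_px}",    # raster size in pixels
        f"-sOutputFile={outfile}",
        infile,
    ]


print(gs_png_args(8.5, 12))
print(gs_png_args(8.5, 12, dpi=300))
```

The list can then be passed on as in the question, e.g. `ghostscript.Ghostscript(*args)`. Note that at 300 dpi an 8.5x12 inch page is 2550x3600 pixels; 2550x3300 corresponds to 8.5x11 inches.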
If you can share a simple PostScript file I can look at it (I can't use anything that requires me to use tkinter, sorry) and give you some more detailed guidance.
[EDIT]
So the output is actually an EPS, not a PostScript program, we can see this from the initial comments (any line beginning with '%' is a comment):
%!PS-Adobe-3.0 EPSF-3.0
%%Creator: Tk Canvas Widget
%%Title: Window .49823304L
%%CreationDate: Mon Aug 14 23:47:27 2017
%%BoundingBox: -171 85 785 707
%!PS tells us it's a PostScript program, -Adobe-3.0 tells us it conforms to version 3.0 of the Document Structuring Conventions (a way of creating PostScript programs that makes them more portable for non-PostScript interpreters), EPSF tells us it's actually an EPS, and the trailing -3.0 declares that it conforms to version 3.0 of the EPS specification.
Now EPS files are not intended to be sent directly to a PostScript interpreter. They are supposed to be included inside other PostScript programs and used as a kind of 'black box'. This technique is often used when the object is something like a company logo which does not change, or when you send work to an outside agency (e.g. a freelance graphic artist) which may send it back as an EPS for you to use.
The EPS conforms to certain rules regarding what it can and can't contain. One of the important things it cannot do is set the media size. Execution of setpagedevice can cause the device to reset the marked content, which would throw away any marks made before the media selection.
Additionally, the EPS doesn't know how big it's going to be when drawn on the final page. You could think of a logo being drawn large on the front page, then drawn small in the footer of each page, for example.
So what the EPS contains is a declaration of where it marks the page, this is given by the BoundingBox:
%%BoundingBox: -171 85 785 707
Now you will note that the BoundingBox of this EPS begins at -171,85 and extends to 785,707. So its width is 785 - (-171) = 956 and its height is 707 - 85 = 622. Those are in PostScript units, which are 1/72 of an inch. So your EPS is actually about 13.28 inches wide by 8.64 inches tall. Not only that, but it begins 2.375 inches off the left edge of the media.
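The size can be computed directly from the %%BoundingBox comment quoted above. The following is a small sketch; `parse_bounding_box` is a hypothetical helper, and it assumes the comment carries four integers (it does not handle the `(atend)` form).

```python
import re


def parse_bounding_box(eps_text):
    """Return (llx, lly, urx, ury) from the first %%BoundingBox comment."""
    match = re.search(
        r"^%%BoundingBox:\s*(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)",
        eps_text,
        re.MULTILINE,
    )
    if match is None:
        raise ValueError("no %%BoundingBox comment found")
    return tuple(int(group) for group in match.groups())


header = """%!PS-Adobe-3.0 EPSF-3.0
%%Creator: Tk Canvas Widget
%%BoundingBox: -171 85 785 707
"""

llx, lly, urx, ury = parse_bounding_box(header)
width_pt, height_pt = urx - llx, ury - lly
print(width_pt, height_pt)            # 956 622
print(width_pt / 72, height_pt / 72)  # size in inches
print(-llx / 72)                      # inches left of the media origin
```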
This probably explains why you are having trouble getting the output you want: you probably are not setting the media correctly in Ghostscript, or translating the origin so that the left edge doesn't lie off the media.
It's the job of the application which places the EPS in its own PostScript output to translate and scale the co-ordinate system so that the EPS is at the required position and size on the final media.
So you have a choice: you can create a PostScript program which sets up the Current Transformation Matrix appropriately to scale and position the EPS, and then includes the EPS in its entirety, finishing with a showpage to actually render it (EPS files may not include a showpage for obvious reasons).
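A minimal sketch of such a wrapper, generated from Python, might look like the following. The names `wrap_eps` and `eps_state` are invented for illustration, and the layout choice (scale uniformly to fit, align to the top-left of the page) is an assumption based on what the question asks for. The save, `/showpage {} def`, restore sequence follows the usual idiom for including an EPS, so a showpage inside the included file does no harm.

```python
def wrap_eps(eps_text, bbox, page_w, page_h):
    """Return a PostScript program that draws an EPS on a page.

    bbox is (llx, lly, urx, ury) from %%BoundingBox; page_w and page_h are
    in points (1/72 inch). The EPS is scaled uniformly to fit the page and
    its top-left corner is placed at the top-left of the media.
    """
    llx, lly, urx, ury = bbox
    scale = min(page_w / (urx - llx), page_h / (ury - lly))
    # Vertical shift so that the top of the scaled EPS meets the page top.
    y_shift = page_h - scale * (ury - lly)
    return "\n".join([
        "%!PS",
        f"<< /PageSize [{page_w} {page_h}] >> setpagedevice",
        "/eps_state save def",      # remember the state before the EPS runs
        "/showpage {} def",         # neutralise any showpage in the EPS
        f"0 {y_shift:.3f} translate",
        f"{scale:.6f} {scale:.6f} scale",
        f"{-llx} {-lly} translate",  # move the bbox corner to the origin
        eps_text,
        "eps_state restore",
        "showpage",
        "",
    ])


program = wrap_eps("% EPS content goes here", (-171, 85, 785, 707), 612, 792)
print(program)
```

The resulting text can be written to a file and rendered like any other PostScript program.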
Or you can use the -dEPSCrop or -dEPSFitPage switches in Ghostscript (see the Ghostscript usage documentation), which will fit the page to the content, or the content to the page, respectively. Note that the precise behaviour of the FitPage switch depends on the exact version of Ghostscript you are using, which you haven't mentioned. The online documentation is for the current version, 9.21.
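For the switch-based route, the argument list from the question only needs one extra entry. The sketch below just builds the list; `gs_eps_args` and its file names are hypothetical, and the exact output still depends on your Ghostscript version.

```python
def gs_eps_args(infile, outfile, dpi=72, fit_page=False):
    """Build Ghostscript arguments for rendering an EPS to PNG.

    With fit_page=False the page is cropped to the EPS BoundingBox
    (-dEPSCrop); with fit_page=True the EPS is fitted to the current
    page size (-dEPSFitPage).
    """
    args = [
        "eps2png",  # argv[0] placeholder, not a Ghostscript option
        "-dSAFER", "-dBATCH", "-dNOPAUSE",
        "-sDEVICE=png16m",
        f"-r{dpi}",
    ]
    args.append("-dEPSFitPage" if fit_page else "-dEPSCrop")
    args += [f"-sOutputFile={outfile}", infile]
    return args


print(gs_eps_args("tmp.ps", "ABC.png"))
```

The list would then be passed on exactly as in the question, e.g. `ghostscript.Ghostscript(*args)`.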
If you create a PostScript program yourself to do the work then you have complete control over how it's rendered; if you let Ghostscript do it then you have less control but it's simple to do. Your choice really.
NB the pastebin stops abruptly and makes no actual marks, so it's not a valid PostScript program. If you put the whole file on DropBox or something I could perhaps be more specific, but the gist is certainly covered above.

Related

image rendering issue in psychopy

I am a long-time psychopy user, and I just upgraded to 1.81.03 (from 1.78.x). In one experiment, I present images (.jpgs) to the user and ask for a rating scale response. The code worked fine before the update, but now I am getting weird artifacts on some images. For example, here is one image I want to show:
But here is what shows up [screencapped]:
You can see that one border is missing. This occurs for many of my images, though it is not always the same border, and sometimes two or three borders are missing.
Does anyone have an idea about what might be going on?
I received this information from the psychopy-users group (Michael MacAskill):
As a general point, you should avoid using .jpgs for line art: they aren't designed for this (if you zoom in, in the internal corners of your square, you'll see the typical compression artefacts that their natural image-optimised compression algorithm introduces when applied to line art). .png format is optimal for line art. It is lossless and for this sort of image will still be very small file-size wise.
Graphics cards sometimes do scaling-up and then down-scaling of bitmaps, which can lead to issues like this with single-pixel width lines. Perhaps this is particularly the issue here because (I think) this image was supposed to be 255 × 255 pixels, and cards will sometimes scale up to the nearest power-of-two size (256 × 256) and then down again, so easy to see how the border might be trimmed.
I grabbed your image off SO, it seemed to have a surrounding border around the black line to make it 321 × 321 in total. I made that surround transparent and saved it as .png (another benefit of png vs jpg). It displays without problems (although a version cropped to just the precise dimensions of the black line did show the error you mentioned). (Also, the compression artefacts are still there, as I just made this png directly from the jpg). See attached file.
If this is the sort of simple stimulus you are showing, you might want to use ShapeStim/Polygon stimuli instead of bitmaps. They will always be drawn precisely, without any scaling issues, and there wouldn't be the need for any jiggery pokery.
Why this changed from 1.78 I'm not sure. The issue is also there in 1.82.00

PDF bleed detection

I'm currently writing a little tool (Python + pyPdf) to test PDFs for printer conformity.
Alas I already get confused at the first task: Detecting if the PDF has at least 3mm 'bleed' (border around the pages where nothing is printed). I already got that I can't detect the bleed for the complete document, since there doesn't seem to be a global one. On the pages however I can detect a total of five different boxes:
mediaBox
bleedBox
trimBox
cropBox
artBox
I read the pyPdf documentation concerning those boxes, but the only one I understood is the mediaBox which seems to represent the overall page size (i.e. the paper).
The bleedBox pretty obviously ought to define the bleed, but that doesn't always seem to be the case.
Another thing I noted was that for instance with the PDF, all those boxes have the exact same size (implying no bleed at all) on each page, but when I open it there's a huge amount of bleed; This leads me to think that the individual text elements have their own offset.
So, obviously, just calculating the bleed from mediaBox and bleedBox is not a viable option.
I would be more than delighted if anyone could shed some light on what those boxes actually are and what I can conclude from that (e.g. is one box always smaller than another one).
Bonus question: Can someone tell me what exactly the "default user space unit" mentioned in the documentation is? I'm pretty sure this refers to mm on my machine, but I'd like to enforce mm everywhere.
Quoting from the PDF specification ISO 32000-1:2008 as published by Adobe:
14.11.2 Page Boundaries
14.11.2.1 General
A PDF page may be prepared either for a finished medium, such as a
sheet of paper, or as part of a prepress process in which the content
of the page is placed on an intermediate medium, such as film or an
imposed reproduction plate. In the latter case, it is important to
distinguish between the intermediate page and the finished page. The
intermediate page may often include additional production-related
content, such as bleeds or printer marks, that falls outside the
boundaries of the finished page. To handle such cases, a PDF page
may define as many as five separate boundaries to control various
aspects of the imaging process:
The media box defines the boundaries of the physical medium on which
the page is to be printed. It may include any extended area
surrounding the finished page for bleed, printing marks, or other such
purposes. It may also include areas close to the edges of the medium
that cannot be marked because of physical limitations of the output
device. Content falling outside this boundary may safely be discarded
without affecting the meaning of the PDF file.
The crop box defines the region to which the contents of the page
shall be clipped (cropped) when displayed or printed. Unlike the other
boxes, the crop box has no defined meaning in terms of physical page
geometry or intended use; it merely imposes clipping on the page
contents. However, in the absence of additional information (such as
imposition instructions specified in a JDF or PJTF job ticket), the
crop box determines how the page’s contents shall be positioned on the
output medium. The default value is the page’s media box.
The bleed box (PDF 1.3) defines the region to which the contents of
the page shall be clipped when output in a production environment.
This may include any extra bleed area needed to accommodate the
physical limitations of cutting, folding, and trimming equipment. The
actual printed page may include printing marks that fall outside the
bleed box. The default value is the page’s crop box.
The trim box (PDF 1.3) defines the intended dimensions of the
finished page after trimming. It may be smaller than the media box to
allow for production-related content, such as printing instructions,
cut marks, or colour bars. The default value is the page’s crop box.
The art box (PDF 1.3) defines the extent of the page’s meaningful
content (including potential white space) as intended by the page’s
creator. The default value is the page’s crop box.
The page object dictionary specifies these boundaries in the MediaBox,
CropBox, BleedBox, TrimBox, and ArtBox entries, respectively (see
Table 30). All of them are rectangles expressed in default user space
units. The crop, bleed, trim, and art boxes shall not ordinarily
extend beyond the boundaries of the media box. If they do, they are
effectively reduced to their intersection with the media box. Figure
86 illustrates the relationships among these boundaries. (The crop box
is not shown in the figure because it has no defined relationship with
any of the other boundaries.)
Following that there is a nice graphic showing those boxes in relation to each other:
The reasons why in many cases only the media box is set, are
that in case of PDFs meant for electronic consumption (i.e. reading on a computer) the other boxes hardly matter; and
that even in the prepress context they aren't as necessary anymore as they used to be, cf. the article Pedro refers to in his comment.
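The defaulting and clipping rules quoted from the specification can be sketched in plain Python. This example deliberately uses ordinary dicts and tuples rather than any particular PDF library; `resolve_boxes` and `bleed_margins` are hypothetical names, and boxes are assumed to be `(llx, lly, urx, ury)` in default user space units.

```python
def intersect(box, media):
    """Clip a box to the media box, as the specification describes."""
    return (max(box[0], media[0]), max(box[1], media[1]),
            min(box[2], media[2]), min(box[3], media[3]))


def resolve_boxes(page):
    """Apply the defaults: crop -> media; bleed, trim, art -> crop."""
    media = tuple(page["MediaBox"])
    crop = intersect(tuple(page.get("CropBox", media)), media)
    boxes = {"MediaBox": media, "CropBox": crop}
    for name in ("BleedBox", "TrimBox", "ArtBox"):
        boxes[name] = intersect(tuple(page.get(name, crop)), media)
    return boxes


def bleed_margins(page):
    """Return (left, bottom, right, top) distance from trim box to bleed box."""
    boxes = resolve_boxes(page)
    bleed, trim = boxes["BleedBox"], boxes["TrimBox"]
    return (trim[0] - bleed[0], trim[1] - bleed[1],
            bleed[2] - trim[2], bleed[3] - trim[3])


print(bleed_margins({"MediaBox": (0, 0, 612, 792), "TrimBox": (9, 9, 603, 783)}))
print(bleed_margins({"MediaBox": (0, 0, 612, 792)}))
```

The second call shows why a file that sets only the media box reports no declared bleed at all: every other box falls back to it. This says nothing about where the page content itself is actually drawn.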
Concerning your "bonus question": The user space unit is 1⁄72 inch by default; since PDF 1.6 it can be changed, though, to any (not necessarily integer) multiple of that size using the UserUnit entry in the page dictionary. Changing it in an existing PDF essentially scales the page, as the user space unit is the basic unit in the device-independent coordinate system of a page. Therefore, unless you want to update each and every command in the page descriptions referring to coordinates to keep the page dimensions, you won't want to enforce a millimeter user space unit... ;)
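Rather than changing the unit inside the file, a checking tool can simply convert on its own side. A minimal sketch, with made-up function names, where `user_unit` stands for the page's UserUnit value (1.0 when the entry is absent):

```python
MM_PER_INCH = 25.4


def user_units_to_mm(value, user_unit=1.0):
    """Convert a length in user space units to millimetres."""
    return value * user_unit / 72.0 * MM_PER_INCH


def mm_to_user_units(mm, user_unit=1.0):
    """Convert millimetres to user space units."""
    return mm / MM_PER_INCH * 72.0 / user_unit


print(user_units_to_mm(72))         # one inch
print(user_units_to_mm(72, 2.0))    # two inches when UserUnit is 2
print(mm_to_user_units(3))          # a 3 mm bleed in default units
```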

PDF 'advanced' information extraction

I'm trying to write what more or less accounts for a PDF soft proof.
There are a few infos that I would like to extract, but have no clue how to.
What I need to extract:
Bleed: I got this somewhat working with pyPdf, given
that the document uses 72 dpi, which sadly isn't
always the case. I need to be able to calculate
the bleed in millimeters.
Print resolution (dpi): If I read the PDF spec[1] correctly this ought to
always be 72 dpi, unless a page has UserUnit set,
which was only introduced in PDF-1.6, but shouldn't
print documents always be at least 300 dpi? I'm
afraid that I misunderstood something…
I'd also need the print resolution for images, if
they can differ from the default page resolution,
that is.
Text color: I don't have the slightest clue on how to extract
this, the string 'text colour' only shows up once
in the whole spec, without any explanation how it
is set.
Image colormodel: If I understand it correctly I can read this out
in pyPdf with page['/Group']['/CS'] which can be:
- /DeviceRGB
- /DeviceCMY
- /DeviceCMYK
- /DeviceGray
- /DeviceRGBK
- /DeviceN
Font 'embeddedness': I read in another post on stackoverflow that I
can just iterate over the font resources and if a
resource has a '/FontFile'-key that means that
the font is embedded. Is this correct?
If other libs than pyPdf are better able to extract this info (or a combination
of them) they are more than welcome. So far I fumbled around with pyPdf, pdfrw
and pdfminer. All of which don't exactly have the most extensive documentation.
[1] http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf
If I read the PDF spec[1] correctly this ought to always be 72 dpi,
unless a page has UserUnit set, which was only introduced in PDF-1.6,
but shouldn't print documents always be at least 300 dpi? I'm afraid
that I misunderstood something…
You do misunderstand something. The default user space unit, which defaults to 1/72 inch but can be changed on a per-page basis since PDF-1.6, does not define a print resolution; it merely defines what length one unit in user-supplied coordinates corresponds to by default (i.e. unless any size-changing transformation is active).
For printing, all data are converted into a device-dependent space whose resolution has nothing to do with the user space coordinates. Printing resolutions depend on the printing device and its driver; they may be limited by security settings that allow low-quality printing only.
I'd also need the print resolution for images, if they can differ from
the default page resolution, that is.
Images (well, bitmap images, in PDF there are also vector graphics) come each with their individual resolution and then may be transformed (e.g. enlarged) before being rendered. For an "image printing resolution" you'd, therefore, have to inspect each and every bitmap image and each and every page content in which it is inserted. And if the image is rotated, skewed and asymmetrically stretched, I wonder what number you will use as resolution... ;)
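For the simple cases, an effective resolution can still be estimated. A PDF image is painted into a unit square that the current transformation matrix maps onto the page, so the matrix gives the placed width and height. The sketch below assumes the default user unit of 1/72 inch and that you have already obtained the matrix `(a, b, c, d, e, f)` in effect when the image is drawn; `effective_dpi` is a hypothetical name.

```python
import math


def effective_dpi(width_px, height_px, ctm):
    """Estimate horizontal and vertical dpi of a placed bitmap image.

    ctm is (a, b, c, d, e, f). The image's unit-square edges map to the
    vectors (a, b) and (c, d), whose lengths are the placed width and
    height in user space units.
    """
    a, b, c, d, _e, _f = ctm
    placed_w = math.hypot(a, b)
    placed_h = math.hypot(c, d)
    return width_px * 72.0 / placed_w, height_px * 72.0 / placed_h


# A 600x300 pixel image placed 2 inches wide and 1 inch tall.
print(effective_dpi(600, 300, (144, 0, 0, 72, 0, 0)))
# The same image rotated by 90 degrees gives the same figures.
print(effective_dpi(600, 300, (0, 144, -72, 0, 0, 0)))
```

With skew or asymmetric stretching the two numbers differ (and no single value is really meaningful), which is the point made above.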
Text color: I don't have the slightest clue on how to extract this, the string
'text colour' only shows up once in the whole spec, without any
explanation how it is set.
Have a look at section 9.2.3 in the spec:
The colour used for painting glyphs shall be the current colour in the
graphics state: either the nonstroking colour or the stroking colour
(or both), depending on the text rendering mode (see 9.3.6, "Text
Rendering Mode"). The default colour shall be black (in DeviceGray),
but other colours may be obtained by executing an appropriate
colour-setting operator or operators (see 8.6.8, "Colour Operators")
before painting the glyphs.
There you find a number of pointers to interesting sections. Be aware, though, text is not simply coloured; it may also be rendered as a clip path applied to any background.
I read in another post on stackoverflow that I can just iterate over
the font resources and if a resource has a '/FontFile'-key that means
that the font is embedded. Is this correct?
I would advise a more precise analysis. There are other relevant keys, too, e.g. '/FontFile2' and '/FontFile3', and the correct one must be used.
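As a starting point, the check could look for any of the three keys. In the PDF specification these entries live in the font descriptor dictionary: FontFile for Type 1 programs, FontFile2 for TrueType, and FontFile3 for programs whose type is given by a Subtype entry. The sketch uses a plain dict as a stand-in for a descriptor; how you reach the descriptor with your library of choice is not shown, and `embedded_font_key` is a made-up name.

```python
EMBED_KEYS = ("/FontFile", "/FontFile2", "/FontFile3")


def embedded_font_key(font_descriptor):
    """Return the embedding key present in a font descriptor, or None."""
    for key in EMBED_KEYS:
        if key in font_descriptor:
            return key
    return None


print(embedded_font_key({"/FontName": "/ABCDEF+Example", "/FontFile2": object()}))
print(embedded_font_key({"/FontName": "/Helvetica"}))
```

Whether the key found matches the font's declared type is a further check that this sketch does not perform.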
Don't underestimate your tasks... you should start to define what the properties you search shall mean in a mixed environment of rotated, stretched and skewed glyphs, vector graphics and bitmap images like PDF.

How can I improve ReportLab image quality?

I'm building a label printer. It consists of a logo and some text, not tough. I have already spent 3 days trying to get the original SVG logo to draw to screen but the SVG is too complex, using too many gradients, etc.
So I have a high quality bitmapped logo (as a JPG or PNG) and I'm drawing that on a ReportLab canvas. The image in question is much larger than 85*123px. I did this hoping ReportLab would embed the whole thing and scale it accordingly. Here's how I'm doing it:
canvas.drawImage('logo.jpg', 22+xoffset, 460, 85, 123)
The problem is, my assumption was incorrect. It seems to scale it down to 85*123px at screen resolution and that means when it's printed, it doesn't look great.
Does ReportLab have any DPI commands for canvases or documents so I can keep the quality sane?
Having previously worked at the ReportLab company, I can tell you that raster images do not go through any automatic resampling/downscaling while being included in the PDF. The 85*123 dimensions you are using are not pixels, but points (pt) which are a physical unit like millimetres or inches.
I would suggest printing the PDF with different quality images to confirm this or otherwise zooming in very, very far using your PDF viewer. It will always look a bit fuzzy in a PDF viewer as the image is resampled twice (once in the imaging software and then again to the pixels available to the PDF viewer).
This is how I would calculate what size in pixels to make a raster image for it to print well at a given physical size:
Assume I want the picture to be 2 inches wide. There are 72 points in an inch, so the width in my code would be 144. I know that a good crisp resolution to print at is 300 dpi (dots per inch), so the raster image is saved at 600px wide.
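The same calculation as a small sketch (the function names are made up; 300 dpi is the rule-of-thumb figure from the paragraph above, not a ReportLab setting):

```python
import math

POINTS_PER_INCH = 72


def inches_to_points(inches):
    """Size to pass to drawImage for a given physical size."""
    return inches * POINTS_PER_INCH


def pixels_needed(points, print_dpi=300):
    """Pixels the source image should have to print at print_dpi."""
    return math.ceil(points * print_dpi / POINTS_PER_INCH)


print(inches_to_points(2))    # width in points for 2 inches
print(pixels_needed(144))     # pixels for 144 points at 300 dpi
# For the 85 x 123 point logo from the question:
print(pixels_needed(85), pixels_needed(123))
```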
One option that I thought of while writing the question is: increase the size of the PDF and let the printer sort things out.
If I just multiplied all my numbers by 5 and the printer did manage to figure things out, I'd have close to 350DPI... But I'm making quite an assumption.
I don't know if it will work for all but in my case it did.
I only needed to add a logo on the top so I used drawImage()
but shrank the logo to a third of its real size
c.drawImage(company_logo,225,750,width=(483/3),height=(122/3))
I had to previously know the real company logo size so it does not get distorted.
I hope it helps!

Zooming into a Clutter CairoTexture while re-drawing

I am using python-clutter 1.0
My question in the form of a challenge
Write code to allow zooming up to a CairoTexture actor, by pressing a key, in steps such that at each step the actor can be re-drawn (by cairo) so that the image remains high-res but still scales as expected, without re-sizing the actor.
Think of something like Inkscape and how you can zoom into the vectors; how the vectors remain clean at any magnification. Put a path (a bunch of cairo line_to commands, say) onto a CairoTexture actor and then allow the same trick to happen.
More detail
I am aiming at a small SVG editor which uses groups of actors. Each actor is devoted to one path. I 'zoom' by using SomeGroup.set_depth(z) and then make z bigger/smaller. All fine so far. However, the closer the actor(s) get to the camera, the more the texture is stretched to fit their new apparent size.
I can't seem to find a way to get Clutter to do both:
Leave the actor's actual size static (i.e. what it started as.)
Swap-out its underlying surface for larger ones (on zooming in) that I can then re-draw the path onto (and use a cairo matrix to perform the scaling of the context.)
If I use set_size or set_surface_size, the actor gets larger, which is not intended. I only want its surface (underlying data) to get larger.
(I'm not sure of the terminology for this, mipmapping perhaps? )
Put another way: a polygon is getting larger, increase the size of its texture array so that it can map onto the larger polygon.
I have even tried an end-run around clutter by keeping a second surface (using pycairo) that I re-create to the apparent size of the actor (get_transformed_size) and then I use clutter's set_from_rgb_data and point it at my second surface, forcing a re-size of the surface but not of the actor's dimensions.
The problem with this is that a) clutter ignores the new size and only draws into the old width/height, and b) the RGBA vs ARGB32 thing kind of causes a colour meltdown.
I'm open to any alternative ideas, I hope I'm standing in the woods missing all the trees!
Well, despite all my tests and hacks, it was right under my nose all along.
Thanks to Neil on the clutter-project list, here's the scoop:
    CT = SomeCairoTextureActor()
    # Record the old size, once:
    old_width, old_height = CT.get_size()

    while zooming:  # "zooming" stands for whatever drives your zoom steps
        # Do stuff to the depth of CT (or its parent)
        ...
        # Get the apparent width and height (absolute size in pixels)
        appr_w, appr_h = CT.get_transformed_size()
        # Make the new surface to the new size
        CT.set_surface_size(appr_w, appr_h)
        # Crunch the actor back down to old size
        # but leave the texture surface something other!
        CT.set_size(old_width, old_height)
The surface size and the size of the
actor don't have to be the same. The
surface size is just by default the
preferred size of the actor. You can
override the preferred size by just
setting the size on the actor. If the
size of the actor is different from
the surface size then the texture will
be squished to fit in the actor size
(which I think is what you want).
Nice to put this little mystery to bed. Thanks clutter list!
