Showing page count with ReportLab

I'm trying to add a simple "page x of y" to a report made with ReportLab. I found this old post about it, but maybe six years later something more straightforward has emerged?
I found this recipe too, but when I use it, the resulting PDF is missing the images.

I was able to implement the NumberedCanvas approach from ActiveState. It was easy to do and required few changes to my existing code: I added the NumberedCanvas class and passed it as the canvasmaker argument when building my doc. I also changed the coordinates at which the "x of y" text is drawn. The build call
self.doc.build(pdf)
became
self.doc.build(pdf, canvasmaker=NumberedCanvas)
Here doc is a BaseDocTemplate and pdf is my list of flowable elements.

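Use doc.multiBuild, and in the page header function (the one passed as onLaterPages=):
# Assumes a module-level TOTALPAGES = 0 and: from reportlab.lib.units import mm
global TOTALPAGES
if doc.page > TOTALPAGES:
    # Total not known yet: record the highest page number seen so far.
    TOTALPAGES = doc.page
else:
    # Total is known from an earlier pass: draw "page/total".
    canvas.drawString(270 * mm, 5 * mm, "Page %d/%d" % (doc.page, TOTALPAGES))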

Just digging up some code for you. We use this:
SimpleDocTemplate(...).build(self.story,
                             onFirstPage=self._on_page,
                             onLaterPages=self._on_page)
Here self._on_page is a method that gets called for each page, like:
def _on_page(self, canvas, doc):
    # ... do any additional page formatting here for each page
    print(doc.page)

I came up with a solution for Platypus that is easier to understand (at least I think it is). You can manually do two builds: in the first build you record the total number of pages, so in the second build you already know it in advance. I think it is easier to use and understand because it works with Platypus-level event handlers instead of canvas-level events.
import copy
import io
from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer
from reportlab.lib.styles import getSampleStyleSheet
from reportlab.lib.units import inch
styles = getSampleStyleSheet()
Title = "Hello world"
pageinfo = "platypus example"
total_pages = 0
def on_page(canvas, doc: SimpleDocTemplate):
    global total_pages
    # First pass: track the highest page number. Second pass: already final.
    total_pages = max(total_pages, doc.page)
    canvas.saveState()
    canvas.setFont('Times-Roman', 9)
    canvas.drawString(inch, 0.75 * inch, "Page %d of %d" % (doc.page, total_pages))
    canvas.restoreState()
Story = [Spacer(1, 2 * inch)]
style = styles["Normal"]
for i in range(100):
    bogustext = ("This is Paragraph number %s. " % i) * 20
    p = Paragraph(bogustext, style)
    Story.append(p)
    Story.append(Spacer(1, 0.2 * inch))
# You MUST use a deep copy of the story!
# https://mail.python.org/pipermail/python-list/2022-March/905728.html
# First pass
with io.BytesIO() as out:
    doc = SimpleDocTemplate(out)
    doc.build(copy.deepcopy(Story), onFirstPage=on_page, onLaterPages=on_page)
# Second pass
with open("test.pdf", "wb+") as out:
    doc = SimpleDocTemplate(out)
    doc.build(copy.deepcopy(Story), onFirstPage=on_page, onLaterPages=on_page)
You just need to make sure that you always render a deep copy of your original story. Otherwise it won't work. (You will either get an empty page as the output, or a render error telling that a Flowable doesn't fit in the frame.)

Related

Issue with pdf reader returning empty pages in the output - pdfrw.pdfreader.PdfReader()

I want to
read a pdf of 70 pages
add a different footer in each page
save the new file
To achieve this, I started using pdfrw and reportlab. Although the results are mostly good, some pages of the 70-page document come back empty/blank from pdfrw.pdfreader.PdfReader, and I don't know why (these pages look fine before adding the footers, and the document isn't encrypted or password-protected).
Is there a way to make reading with pdfrw.pdfreader.PdfReader more robust, or an alternative way to read the file that still lets me overwrite the footers? Thanks.
Note: footer_array is a length-70 array of tags showing the last update status of each slide of the 70-page PDF.
from reportlab.pdfgen.canvas import Canvas
from reportlab.lib.colors import HexColor
from pdfrw import PdfReader
from pdfrw.buildxobj import pagexobj
from pdfrw.toreportlab import makerl
def add_footer(footer_array, doc_path, new_path):
    reader = PdfReader(doc_path)
    pages = [pagexobj(p) for p in reader.pages]
    canvas = Canvas(new_path)
    for page_num, page in enumerate(pages, start=1):
        print(page_num, page.BBox)
        if footer_array[page_num-1] == "updated":
            canvas.setPageSize((page.BBox[2], page.BBox[3]))
            canvas.doForm(makerl(canvas, page))
            canvas.setFont("Helvetica", 8)
            canvas.setFillColor(HexColor('#339966'))
            canvas.drawString(page.BBox[2]*0.90, 10, footer_array[page_num-1])
            canvas.showPage()
        elif footer_array[page_num-1] == "":
            canvas.setPageSize((page.BBox[2], page.BBox[3]))
            canvas.doForm(makerl(canvas, page))
            canvas.setFont("Helvetica", 8)
            canvas.setFillColor(HexColor('#000000'))
            canvas.drawString(page.BBox[2]*0.90, 10, footer_array[page_num-1])
            canvas.showPage()
        else:
            canvas.setPageSize((page.BBox[2], page.BBox[3]))
            canvas.doForm(makerl(canvas, page))
            canvas.setFont("Helvetica", 8)
            canvas.setFillColor(HexColor('#FF0000'))
            canvas.drawString(page.BBox[2]*0.85, 10, footer_array[page_num-1])
            canvas.showPage()
    canvas.save()
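
As a side note, the three branches above differ only in fill colour and horizontal offset, so that choice can be pulled out into a small lookup. This is a sketch of that refactoring only: footer_style is a hypothetical helper name, the "outdated" tag is a made-up example value, and none of this addresses the blank pages themselves.

```python
def footer_style(tag):
    """Return (hex_colour, x_factor) for a footer tag, as in add_footer."""
    if tag == "updated":
        return "#339966", 0.90  # green
    if tag == "":
        return "#000000", 0.90  # black (an empty string draws nothing visible)
    return "#FF0000", 0.85      # red for any other status text


if __name__ == "__main__":
    # "outdated" is a hypothetical tag used only for illustration.
    for tag in ["updated", "", "outdated"]:
        print(repr(tag), footer_style(tag))
```

Inside the loop, the body then shrinks to a single setPageSize / doForm / setFont / setFillColor / drawString / showPage sequence that uses HexColor(colour) and page.BBox[2] * x_factor.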

underline text with odfpy

I'd like to generate an odf file with odfpy, and am stuck on underlining text.
Here is a minimal example inspired by the official documentation, where I can't find any information about which attributes can be used and where.
Any suggestion?
from odf.opendocument import OpenDocumentText
from odf.style import Style, TextProperties
from odf.text import H, P, Span
textdoc = OpenDocumentText()
ustyle = Style(name="Underline", family="text")
#uprop = TextProperties(fontweight="bold") #uncommented, this works well
#uprop = TextProperties(attributes={"fontsize":"26pt"}) #this either
uprop = TextProperties(attributes={"underline":"solid"}) # bad guess, wont work !!
ustyle.addElement(uprop)
textdoc.automaticstyles.addElement(ustyle)
p = P(text="Hello world. ")
underlinedpart = Span(stylename=ustyle, text="This part would like to be underlined. ")
p.addElement(underlinedpart)
p.addText("This is after the style test.")
textdoc.text.addElement(p)
textdoc.save("myfirstdocument.odt")

Here is how I finally got it:
I created a sample document with underlining using LibreOffice and unzipped it. Looking in the styles.xml part of the extracted files, I found the part that produces underlining in the document:
<style:style style:name="Internet_20_link" style:display-name="Internet link" style:family="text">
<style:text-properties fo:color="#000080" fo:language="zxx" fo:country="none" style:text-underline-style="solid" style:text-underline-width="auto" style:text-underline-color="font-color" style:language-asian="zxx" style:country-asian="none" style:language-complex="zxx" style:country-complex="none"/>
</style:style>
The interesting style attributes are named text-underline-style, text-underline-width and text-underline-color.
To use them in odfpy, the '-' characters must be removed, and the attribute keys must be given as str (with quotes), as in the following code. A correct style family (text in our case) must be specified in the Style constructor call.
from odf.opendocument import OpenDocumentText
from odf.style import Style, TextProperties
from odf.text import H, P, Span
textdoc = OpenDocumentText()
#underline style
ustyle = Style(name="Underline", family="text") #here style family
uprop = TextProperties(attributes={
    "textunderlinestyle":"solid",
    "textunderlinewidth":"auto",
    "textunderlinecolor":"font-color"
})
ustyle.addElement(uprop)
textdoc.automaticstyles.addElement(ustyle)
p = P(text="Hello world. ")
underlinedpart = Span(stylename=ustyle, text="This part would like to be underlined. ")
p.addElement(underlinedpart)
p.addText("This is after the style test.")
textdoc.text.addElement(p)
textdoc.save("myfirstdocument.odt")
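
The naming rule above can be written down as a small helper. This is only a sketch of the rule as stated in this answer (drop the namespace prefix, remove the hyphens); odfpy_key is a hypothetical name and not part of odfpy.

```python
def odfpy_key(xml_attribute):
    """Turn an ODF attribute name such as 'style:text-underline-style'
    into the key form used in TextProperties(attributes={...})."""
    local_name = xml_attribute.split(":", 1)[-1]  # drop 'style:' / 'fo:'
    return local_name.replace("-", "")


# Build the attribute dict straight from the names found in styles.xml.
underline_attributes = {
    odfpy_key("style:text-underline-style"): "solid",
    odfpy_key("style:text-underline-width"): "auto",
    odfpy_key("style:text-underline-color"): "font-color",
}
print(underline_attributes)
```

The resulting dict has the same keys as the one passed to TextProperties in the code above.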

Dynamic framesize in Python Reportlab

I tried to generate a shipping list with ReportLab in Python.
I was trying to put all the parts (sender's address, receiver's address, a table) in place by using Platypus Frames.
The first problem I ran into was that I needed a lot of Frames to position everything the right way. Is there a better way using Platypus?
I want the sender's address and my address to be at the same height, but if I just add them to my story = [] they are placed one below the other.
The next problem is that the table I'm drawing is dynamic in size, and when it reaches the end of the Frame (the space I want the table to go in) it just does a FrameBreak and continues in the next frame. So how can I make the Frame (the space for my table) dynamic?

Your use case is a really common one, so ReportLab has a system to help you out.
If you read the user guide chapter on Platypus, it introduces these main concepts:
DocTemplates: the outermost container for the document;
PageTemplates: specifications for layouts of pages of various kinds;
Frames: specifications of regions in pages that can contain flowing text or graphics;
Flowables: text or graphic elements (paragraphs, tables, images) that are flowed into the frames.
Using PageTemplates you can combine "static" content, such as logos and addresses, with dynamic content on a page in a sensible way.
You already discovered Flowables and Frames, but you probably did not start on fancy PageTemplates or DocTemplates yet. That makes sense, because they aren't necessary for most simple documents. Sadly, a shipping list isn't a simple document: it holds addresses, logos and important info that has to be on every page. This is where PageTemplates come in.
So how do you use these templates? The concept is simple: each page has a certain structure, which may differ between pages. For example, on page one you want to put the addresses and then start the table, while on the second page you only want the table. This would be something like this:
Page 1: address block at the top, with a shorter table frame below it.
Page 2: a full-height table frame.
An example would look like this:
(This would be perfect for the SO Documentation if there was one for ReportLab)
from reportlab.lib.pagesizes import A4
from reportlab.lib.styles import getSampleStyleSheet
from reportlab.lib.units import cm
from reportlab.lib import colors
from reportlab.platypus import BaseDocTemplate, Frame, PageTemplate, NextPageTemplate, Paragraph, PageBreak, Table, \
    TableStyle
class ShippingListReport(BaseDocTemplate):
    def __init__(self, filename, their_adress, objects, **kwargs):
        # BaseDocTemplate's keyword is pagesize (not page_size).
        super().__init__(filename, pagesize=A4, _pageBreakQuick=0, **kwargs)
        self.their_adress = their_adress
        self.objects = objects
        # Full page dimensions, assuming symmetric margins.
        self.page_width = (self.width + self.leftMargin * 2)
        self.page_height = (self.height + self.bottomMargin * 2)
        styles = getSampleStyleSheet()
        # Setting up the frames; frames are used for dynamic content, not fixed page elements
        first_page_table_frame = Frame(self.leftMargin, self.bottomMargin, self.width, self.height - 6 * cm, id='small_table')
        later_pages_table_frame = Frame(self.leftMargin, self.bottomMargin, self.width, self.height, id='large_table')
        # Creating the page templates
        first_page = PageTemplate(id='FirstPage', frames=[first_page_table_frame], onPage=self.on_first_page)
        later_pages = PageTemplate(id='LaterPages', frames=[later_pages_table_frame], onPage=self.add_default_info)
        self.addPageTemplates([first_page, later_pages])
        # Tell ReportLab to use the other template on the later pages;
        # by default the first template that was added is used for the first page.
        story = [NextPageTemplate(['*', 'LaterPages'])]
        table_grid = [["Product", "Quantity"]]
        # Add the objects
        for shipped_object in self.objects:
            table_grid.append([shipped_object, "42"])
        story.append(Table(table_grid, repeatRows=1, colWidths=[0.5 * self.width, 0.5 * self.width],
                           style=TableStyle([('GRID',(0,1),(-1,-1),0.25,colors.gray),
                                             ('BOX', (0,0), (-1,-1), 1.0, colors.black),
                                             ('BOX', (0,0), (1,0), 1.0, colors.black),
                                             ])))
        self.build(story)
    def on_first_page(self, canvas, doc):
        canvas.saveState()
        # Add the logo and other default stuff
        self.add_default_info(canvas, doc)
        canvas.drawString(doc.leftMargin, doc.height, "My address")
        canvas.drawString(0.5 * doc.page_width, doc.height, self.their_adress)
        canvas.restoreState()
    def add_default_info(self, canvas, doc):
        canvas.saveState()
        canvas.drawCentredString(0.5 * (doc.page_width), doc.page_height - 2.5 * cm, "Company Name")
        canvas.restoreState()
if __name__ == '__main__':
    ShippingListReport('example.pdf', "Their address", ["Product", "Product"] * 50)

Extracting text from highlighted annotations in a PDF file

Since yesterday I've been trying to extract the text from some highlighted annotations in a PDF, using python-poppler-qt4.
According to this documentation, it looks like I have to get the text using the Page.text() method, passing a rectangle from the highlighted annotation, which I get using Annotation.boundary(). But I only get blank text. Can someone help me? I copied my code below and added a link to the PDF I am using. Thanks for any help!
import popplerqt4
import sys
import PyQt4
def main():
    doc = popplerqt4.Poppler.Document.load(sys.argv[1])
    total_annotations = 0
    for i in range(doc.numPages()):
        page = doc.page(i)
        annotations = page.annotations()
        if len(annotations) > 0:
            for annotation in annotations:
                if isinstance(annotation, popplerqt4.Poppler.Annotation):
                    total_annotations += 1
                    if(isinstance(annotation, popplerqt4.Poppler.HighlightAnnotation)):
                        print str(page.text(annotation.boundary()))
    if total_annotations > 0:
        print str(total_annotations) + " annotation(s) found"
    else:
        print "no annotations found"
if __name__ == "__main__":
    main()
Test pdf:
https://www.dropbox.com/s/10plnj67k9xd1ot/test.pdf

Looking at the documentation for Annotations, the boundary property "returns this annotation's boundary rectangle in normalized coordinates". Although this seems a strange decision, we can simply scale the coordinates by the page.pageSize().width() and .height() values.
import popplerqt4
import sys
import PyQt4
def main():
    doc = popplerqt4.Poppler.Document.load(sys.argv[1])
    total_annotations = 0
    for i in range(doc.numPages()):
        #print("========= PAGE {} =========".format(i+1))
        page = doc.page(i)
        annotations = page.annotations()
        (pwidth, pheight) = (page.pageSize().width(), page.pageSize().height())
        if len(annotations) > 0:
            for annotation in annotations:
                if isinstance(annotation, popplerqt4.Poppler.Annotation):
                    total_annotations += 1
                    if(isinstance(annotation, popplerqt4.Poppler.HighlightAnnotation)):
                        quads = annotation.highlightQuads()
                        txt = ""
                        for quad in quads:
                            rect = (quad.points[0].x() * pwidth,
                                    quad.points[0].y() * pheight,
                                    quad.points[2].x() * pwidth,
                                    quad.points[2].y() * pheight)
                            bdy = PyQt4.QtCore.QRectF()
                            bdy.setCoords(*rect)
                            txt = txt + unicode(page.text(bdy)) + ' '
                        #print("========= ANNOTATION =========")
                        print(unicode(txt))
    if total_annotations > 0:
        print str(total_annotations) + " annotation(s) found"
    else:
        print "no annotations found"
if __name__ == "__main__":
    main()
Additionally, I decided to concatenate the .highlightQuads() to get a better representation of what was actually highlighted.
Please be aware of the explicit <space> I have appended to each quad region of text.
In the example document the returned QString could not be passed directly to print() or str(), the solution to this was to use unicode() instead.
I hope this helps someone as it helped me.
Note: Page rotation may affect the scaling values, I have not been able to test this.
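
The scaling step can be checked on its own, without Poppler or Qt installed. A minimal sketch, assuming the quad corners are given as plain (x, y) tuples instead of QPointF objects; quad_to_page_rect is a hypothetical helper name.

```python
def quad_to_page_rect(points, page_width, page_height):
    """Scale a highlight quad from normalized (0..1) coordinates to page
    coordinates, using corners 0 and 2 as in the code above.

    `points` is a sequence of four (x, y) tuples.
    Returns (x1, y1, x2, y2), the argument order QRectF.setCoords() takes.
    """
    (x1, y1), (x2, y2) = points[0], points[2]
    return (x1 * page_width, y1 * page_height,
            x2 * page_width, y2 * page_height)


if __name__ == "__main__":
    # A made-up quad covering the middle of a 600 x 800 page.
    quad = [(0.25, 0.5), (0.75, 0.5), (0.75, 0.625), (0.25, 0.625)]
    print(quad_to_page_rect(quad, 600, 800))
```

Like the code above, this ignores page rotation.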

python win32print not printing

I need to print some information directly (without user confirmation) and I'm using Python and the win32print module.
I've already read the whole Tim Golden win32print page (even read the win32print doc, which is small) and I'm using the same example he wrote there himself, but I just print nothing.
If I go to the interactive shell and run one step at a time, I see the document in the printer queue (after StartDocPrinter), then I see the document size (after the StartPagePrinter, WritePrinter, EndPagePrinter block), and then the document disappears from the queue (after EndDocPrinter) without printing.
I'm aware of the ShellExecute method Tim Golden showed. It works here, but it needs to create a temp file and it prints this filename, two things I don't want.
Any ideas? Thanks in advance.
This is the code I'm testing (copy and paste of Tim Golden's):
import os, sys
import win32print
import time
printer_name = win32print.GetDefaultPrinter()
if sys.version_info >= (3,):
    raw_data = bytes ("This is a test", "utf-8")
else:
    raw_data = "This is a test"
hPrinter = win32print.OpenPrinter (printer_name)
try:
    hJob = win32print.StartDocPrinter (hPrinter, 1, ("test of raw data", None, "RAW"))
    try:
        win32print.StartPagePrinter (hPrinter)
        win32print.WritePrinter (hPrinter, raw_data)
        win32print.EndPagePrinter (hPrinter)
    finally:
        win32print.EndDocPrinter (hPrinter)
finally:
    win32print.ClosePrinter (hPrinter)
[EDIT]
I installed a PDF printer on my computer to test with another printer (CutePDF Writer), and I could generate the "test of raw data.pdf" file, but when I look inside there is nothing. In other words, all commands except WritePrinter appear to do what they are supposed to do. But again, as I said in the comments, WritePrinter returns the correct number of bytes that were supposed to be written to the printer. I have no other idea how to solve this, but this confirms there is nothing wrong with my printer.

I'm still looking for the best way to do this, but I found an answer that is good enough for my problem. On Tim Golden's site (linked in the question) you can find this example:
import win32ui
import win32print
import win32con
INCH = 1440
hDC = win32ui.CreateDC ()
hDC.CreatePrinterDC (win32print.GetDefaultPrinter ())
hDC.StartDoc ("Test doc")
hDC.StartPage ()
hDC.SetMapMode (win32con.MM_TWIPS)
hDC.DrawText ("TEST", (0, INCH * -1, INCH * 8, INCH * -2), win32con.DT_CENTER)
hDC.EndPage ()
hDC.EndDoc ()
I adapted it a little after reading a lot of the documentation. I'll be using the win32ui library and TextOut (a method of the device context object).
import win32ui
# X from the left margin, Y from top margin
# both in pixels
X=50; Y=50
# One entry per line of text (split() would break on every space instead)
multi_line_string = input_string.splitlines()
hDC = win32ui.CreateDC ()
hDC.CreatePrinterDC (your_printer_name)
hDC.StartDoc (the_name_will_appear_on_printer_spool)
hDC.StartPage ()
for line in multi_line_string:
    hDC.TextOut(X,Y,line)
    Y += 100
hDC.EndPage ()
hDC.EndDoc ()
I searched in meta stackoverflow before answering my own question and here I found it is an encouraged behavior, therefore I'm doing it. I'll wait a little more to see if I get any other answer.
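
The positioning loop above can be tried without a printer by separating the layout from the drawing. A minimal sketch, with layout_lines as a hypothetical helper; it uses splitlines() so that each text line, not each word, gets its own row.

```python
def layout_lines(text, x=50, y=50, line_height=100):
    """Return (x, y, line) tuples, one per line of `text`, moving down
    by `line_height` each time, as the TextOut loop does."""
    placed = []
    for line in text.splitlines():
        placed.append((x, y, line))
        y += line_height
    return placed


if __name__ == "__main__":
    for x, y, line in layout_lines("first line\nsecond line"):
        # With a real device context this would be: hDC.TextOut(x, y, line)
        print(x, y, line)
```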

# You must install pywin32 and import these modules:
import win32print, win32ui, win32con
# X from the left margin, Y from the top margin,
# both in pixels
X=50; Y=50
# Split your string (input_string here) into lines
# and store them in a new list (multi_line_string here)
multi_line_string = input_string.splitlines()
hDC = win32ui.CreateDC ()
# Use the Windows default printer:
hDC.CreatePrinterDC (win32print.GetDefaultPrinter ())
hDC.StartDoc (the_name_will_appear_on_printer_spool)
hDC.StartPage ()
for line in multi_line_string:
    hDC.TextOut(X,Y,line)
    Y += 100
hDC.EndPage ()
hDC.EndDoc ()

The problem is the driver version. If the version is 4, you need to pass XPS_PASS instead of RAW. Here is a sample:
# printer_name and raw_data are defined as in the question's code.
drivers = win32print.EnumPrinterDrivers(None, None, 2)
hPrinter = win32print.OpenPrinter(printer_name)
printer_info = win32print.GetPrinter(hPrinter, 2)
# Find the driver entry that belongs to this printer.
printer_driver = None
for driver in drivers:
    if driver["Name"] == printer_info["pDriverName"]:
        printer_driver = driver
# Version 4 drivers expect XPS_PASS; fall back to RAW otherwise.
raw_type = "XPS_PASS" if printer_driver and printer_driver["Version"] == 4 else "RAW"
try:
    hJob = win32print.StartDocPrinter(hPrinter, 1, ("test of raw data", None, raw_type))
    try:
        win32print.StartPagePrinter(hPrinter)
        win32print.WritePrinter(hPrinter, raw_data)
        win32print.EndPagePrinter(hPrinter)
    finally:
        win32print.EndDocPrinter(hPrinter)
finally:
    win32print.ClosePrinter(hPrinter)
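
The version check above can be isolated into a pure function, which makes it testable on a machine without a printer. This is a sketch under assumptions: pick_datatype is a hypothetical name, the driver names in the example are made up, and falling back to "RAW" when no driver matches is a choice of this sketch rather than documented behaviour.

```python
def pick_datatype(drivers, driver_name):
    """Return "XPS_PASS" for a version 4 driver, otherwise "RAW".

    `drivers` is a list of dicts with "Name" and "Version" keys, the shape
    the sample above reads from EnumPrinterDrivers(None, None, 2).
    """
    for driver in drivers:
        if driver["Name"] == driver_name:
            return "XPS_PASS" if driver["Version"] == 4 else "RAW"
    return "RAW"  # no matching driver found: assume a classic RAW queue


if __name__ == "__main__":
    # Hypothetical driver list for illustration only.
    fake_drivers = [
        {"Name": "Example v3 Driver", "Version": 3},
        {"Name": "Example v4 Driver", "Version": 4},
    ]
    print(pick_datatype(fake_drivers, "Example v4 Driver"))
```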
