Exclude first page from Table of Contents in pisa / xhtml2pdf - python

I'm using django-xhtml2pdf to generate a report. I'm using the first page as a cover sheet, followed by the table of contents, using the <pdf:toc /> tag.
I would like to discount the first page, so the page-numbering in the Table of Contents starts at 1 instead of 2.
Is this possible?

Reading through the xhtml2pdf code, there isn't support for offsetting the page numbering. There's an old discussion about a pisa fork trying to implement support for this, but I'm not sure how far it got.
An awkward but straightforward solution is to generate your cover sheet and the rest of the document as separate PDFs and then merge them. That way the page numbering will exclude the cover sheet. The question "pyPDF merging and displaying as httpresponse through django" has an accepted answer that will let you do just that.

Related

Python: How to append a FPDF2 table to an existing pdf page?

I have an FPDF2 table created using this script. I used to output it to a blank page and merge it to an existing pdf, which works fine.
But now we need to add the table to an existing page in the pdf and then if it doesn't fit, we insert new pages. And that's the problem.
FPDF doesn't seem to be able to draw to an existing page. I know I can use a reportlab canvas (can.drawString()) to draw to an existing page, but I don't know if reportlab can draw an FPDF object.
Also, if I were to ditch FPDF and use only reportlab to draw a table, I don't know how to detect the end of the page and insert a new page if needed. I'm not starting at the start of a page, I'll be starting somewhere in the middle.
I would prefer to be able to use the FPDF2 script I already have and somehow add the output at a specific x,y position in a page though, if possible. Have you ever had this issue?
I also have PyPDF2 installed and used in the same project, but I think that only reportlab can do the job. Maybe I need to detect the end of the page via PyPDF2 and write to the page via reportlab?
Since no answer exists as of now and since I need an answer to finish my task, I did the following:
I added the FPDF table to a blank PDF page and pushed it down with set_y(100), so that I had a half-blank page to work with.
Then I took a screenshot of the items that need to be placed above the table and added that image to the same page using reportlab's canvas.drawImage().
If there's a better solution, please post an answer and I'll accept it and refactor my code. For now, I'll accept my answer to close this question.
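On the "detect the end of the page" part of the question: if the rows have a fixed height, the page breaks can be worked out with plain arithmetic before anything is drawn. The following is only a sketch under that assumption. The function name paginate_rows and its parameters are made up for illustration and are not part of FPDF2 or reportlab. Coordinates follow reportlab's convention, where y grows upward from the bottom of the page.

```python
def paginate_rows(row_count, start_y, row_height, page_top, page_bottom):
    """Split row indices into per-page lists.

    start_y is where the table begins on the existing page; page_top is
    where it restarts on each inserted page. y decreases as rows are added.
    """
    if row_height <= 0 or page_top - row_height < page_bottom:
        raise ValueError("a row must fit on an empty page")
    pages, current, y = [], [], start_y
    for index in range(row_count):
        # Would this row cross the bottom margin? Then close the page.
        if y - row_height < page_bottom:
            pages.append(current)
            current, y = [], page_top
        current.append(index)
        y -= row_height
    pages.append(current)
    return pages


# 60 rows of 20 pt, starting mid-page at y=400 on a page usable from 800 to 50.
pages = paginate_rows(60, start_y=400, row_height=20, page_top=800, page_bottom=50)
print([len(p) for p in pages])
```

The first list holds the rows that still fit on the existing page, and each further list needs a newly inserted page. With a reportlab canvas, showPage() would end the current page between lists. Rows of varying height would need their real heights measured instead of a single row_height.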

report builder for django

Is there something like Stimulsoft or Crystal Reports for Django? I am not talking about a report viewer that just exports some Excel data. I am talking about a whole package: text with variables, tables, pages with headers and footers, watermarks and so on.
I want a footer on every page, and tables whose length I don't know in advance. They may run onto a second or third page, and each new page must be generated with the footer, just like the Stimulsoft reporter.
You can use Reportlab, which has such features. However, I haven't found a full package that connects models to reports. In Reportlab you can define page templates and fill them with data. For the Persian language, you will need external packages for RTL reshaping.
Check out ReportBro.
Commercial use requires a license. I'm not affiliated, but I'm currently evaluating it for use in my own project. It seems to offer everything you're looking for.

comparing PDF report to Database

I have a use case. Let's say there is a PDF report that contains test data for some manufacturing components,
and this PDF report is loaded into a DB using some internally developed software.
We need to develop a reconciliation program that compares the data in the PDF report with the database. We can assume the PDF file has a fixed template.
If the PDF contains many tables and some raw text, how would MySQL store this data: in one table or in many tables?
Please suggest an approach (preferably in Python) for comparing the data.
Have a look at the example in "Finding and extracting specific text from URL PDF files, without downloading or writing (solution)" and see if it helps. It worked efficiently for me. That example assumes the PDF is fetched from a URL, but you could simply change the input source to your DB. In your case you can remove the two if statements under the if isinstance(obj, pdfminer.layout.LTTextBoxHorizontal): line. You mention that your PDFs share the same template. If you want to extract text from one specific area of the template, use the print statement that has been commented out to find the coordinates of the desired data. Then, as is done in the example, use those coordinates in if statements.
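To make the coordinate idea concrete, here is a rough sketch of the comparison step on its own. It assumes the text boxes have already been pulled out of the PDF as (bbox, text) pairs, which is the information pdfminer's LTTextBoxHorizontal exposes through bbox and get_text(). The region coordinates, field names and table layout are invented for illustration, and sqlite3 stands in for the real MySQL database.

```python
import sqlite3

# Hypothetical template regions: field name -> (x0, y0, x1, y1) in PDF points.
REGIONS = {
    "serial_number": (50, 700, 300, 720),
    "max_pressure": (50, 650, 300, 670),
}


def inside(bbox, region):
    """True when bbox lies entirely within region."""
    x0, y0, x1, y1 = bbox
    rx0, ry0, rx1, ry1 = region
    return rx0 <= x0 and ry0 <= y0 and x1 <= rx1 and y1 <= ry1


def extract_fields(text_boxes, regions=REGIONS):
    """Map each field to the text of the first box found inside its region."""
    fields = {}
    for bbox, text in text_boxes:
        for name, region in regions.items():
            if name not in fields and inside(bbox, region):
                fields[name] = text.strip()
    return fields


def reconcile(pdf_fields, db_row):
    """Return {field: (pdf_value, db_value)} for every mismatch."""
    return {
        name: (value, db_row.get(name))
        for name, value in pdf_fields.items()
        if str(db_row.get(name)) != value
    }


# Stand-in data: (bbox, text) pairs as a PDF parser might yield them.
boxes = [((60, 702, 180, 716), "SN-1042\n"), ((60, 652, 120, 666), "12.5\n")]

conn = sqlite3.connect(":memory:")  # stand-in for the real database
conn.row_factory = sqlite3.Row
conn.execute("CREATE TABLE report (serial_number TEXT, max_pressure TEXT)")
conn.execute("INSERT INTO report VALUES ('SN-1042', '12.0')")
row = dict(conn.execute("SELECT * FROM report").fetchone())

mismatches = reconcile(extract_fields(boxes), row)
print(mismatches)
```

The comparison here is a plain string match, so numeric fields would probably need to be parsed and compared with a tolerance in a real reconciliation.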

Is it possible to specify a page template for the first page and another for following pages in reportlab

I am building a fairly simple document in reportlab with platypus. It is basically a header on all the pages and then a table object with line items that will extend onto multiple pages.
What I am trying to figure out is if there is a way to specify one page template for the first page, and a page template for all following pages.
From what I can tell you have to put a call to NextPageTemplate as a flowable in your story, but since one flowable takes up multiple pages I can't put a NextPageTemplate call in there.
I thought there was a way to specify a template onFirstPage and a template onLaterPages when building the document but I can't seem to find that anymore.
Any Ideas?
Alright I figured it out. Hope this will help someone else in the future.
Where I had seen onFirstPage and onLaterPages was in the build method of the SimpleDocTemplate class. For a simpler report that would have worked fine, but for mine it does not. I am using a Frame to specify the margins of my document (there is probably a better way to do this), and SimpleDocTemplate creates its own margin Frame, though I might be wrong about that.
Anyway, I subclassed BaseDocTemplate to override the handle_pageBegin method so that the build method switches to the second page template, like so:
    from reportlab.platypus import BaseDocTemplate

    class MyDocTemplate(BaseDocTemplate):  # subclass name is illustrative
        def handle_pageBegin(self):
            '''Override the base method to change page template after the first page.'''
            self._handle_pageBegin()
            self._handle_nextPageTemplate('Later')
Then I could just add 2 page templates into the document when I create it naming the second one 'Later'.
Seems to work great for now.

How do you make a PDF searchable with text in the sidebar?

I'm looking to create some PDFs from Python.
I've noticed that some PDFs have sidebar text that allows you to see the context of occurrences of search terms.
e.g. search for "dictionary"
View in Sidebar:
Page 10 Assigning a value to an existing dictionary key simply replaces the old value with a new one.
How is that done?
Is there any way to convert existing PDFs to render this sidebar text?
If you use Reportlab to generate your PDFs, then there are facilities in the library to bookmark as you want. Check out the bookmarkPage method on page 54 of the documentation.
I believe what you're referring to are bookmarks. The first hit on Google indicates that you can put them in by hand with Acrobat Pro.
The DocBook XSL templates when used with Apache FOP
The PyQt GUI toolkit has support for creating PDFs. See for example: Printing Rich Text with Qt
