Convert HTML to PDF in python on Google App Engine - python

I have had a look at this thread:
How to dynamically generate a pdf from Google's appengine?
I know we can use ReportLab; however, I am not sure how I can give it an HTML file and get a PDF.
Basically, HTML in and PDF out

If you can use an external library, there is xhtml2pdf.

Try this: html-2-pdf.com
It's built on top of wkhtmltopdf.
It's really easy to use. It's stand-alone, so just upload a file to your (Linux) system and you are ready to produce PDFs.

xhtml2pdf states that its source is pure Python, so you should be able to use it as a module in your application.
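For "HTML in, PDF out", a minimal sketch with xhtml2pdf might look like the following. It assumes the xhtml2pdf package (and its dependencies) can be installed in your environment, which is worth checking for your App Engine runtime; `html_to_pdf` is just an illustrative name.

```python
import io


def html_to_pdf(html):
    """Render an HTML string to PDF bytes in memory (sketch, assumes xhtml2pdf)."""
    # Imported lazily so this module still loads where xhtml2pdf is missing.
    from xhtml2pdf import pisa

    buffer = io.BytesIO()
    # CreatePDF writes the PDF into `dest` and returns a status object.
    status = pisa.CreatePDF(html, dest=buffer)
    if status.err:
        raise RuntimeError("xhtml2pdf reported errors while rendering")
    return buffer.getvalue()
```

Calling `html_to_pdf("<h1>Hello</h1>")` should give you bytes that you can return in a response or store, without writing a file to disk.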

Related

ASCIIDOC to PDF in Python without need for files

So I am generating reports with Python and Ninja in the AsciiDoc format.
But from within my app I need to convert them into PDF and upload them to another system.
I have seen that there are several how-tos for the command line that involve Asciidoctor or other tools, but they are always invoked at the OS level by starting a program or running a Docker container and writing the output to a file.
Isn't there a way to perform those actions within my app and get the PDF as a string that I can use for the upload?
You can certainly use the available tools to generate a PDF, which you could then read into memory as an opaque string that could be uploaded as needed.
If your question is how to generate and upload a PDF without installing any other tools, then the answer is that you'd have to implement the PDF generation logic yourself, rather than using tested tooling.
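To illustrate the first option (use an existing tool, but keep the result in memory), here is a rough sketch that pipes the source into a converter's stdin and reads the PDF from its stdout. The helper name `convert_via_tool` is made up, and the Asciidoctor PDF invocation shown is an assumption (`-o -` for stdout, a trailing `-` for stdin); please verify it against the version you install.

```python
import subprocess
from typing import Sequence


def convert_via_tool(command: Sequence[str], source: bytes) -> bytes:
    """Feed `source` to an external converter and return what it writes to stdout."""
    result = subprocess.run(
        list(command),
        input=source,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout


# Assumed command line, not verified here: read AsciiDoc from stdin, write PDF to stdout.
ASCIIDOCTOR_PDF = ["asciidoctor-pdf", "-o", "-", "-"]
```

With that in place, `convert_via_tool(ASCIIDOCTOR_PDF, report_text.encode("utf-8"))` would return the PDF as bytes ready for upload, with no output file involved.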

How to serve PDF files on the web?

I have the following code, which seems to serve a PDF without any content:
from pathlib import Path
pdf = Path("url/to/file.pdf")
print(f"Content-Type: application/pdf;\r\n")
print(pdf.read_bytes())
Any tips to correctly serve this PDF would be helpful!
Edit: for context, I am trying to serve PDF files and obscure original PDF file path on the server.
I'm no Python developer so I can't help you too specifically, but a couple things...
If your Python script is outputting headers and content in the same response (such as via CGI), you need to have a blank line between the headers and content. Right now you have one \r\n. Add a second \r\n.
The other thing is that you should find a way to stream the output from that file rather than reading all its bytes and printing them.
Finally, I don't know if print() in Python is interpreting that as a string, but that can be problematic for binary data. This again is solved by piping a stream directly to the output.
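Figured out the answer to my own question, if anyone else needs to know:
from pathlib import Path
from sys import stdout
pdf = Path("url/to/file.pdf")
print("Content-type: application/pdf\r\n\r\n")  # header, then the blank line that ends the headers
stdout.flush()  # send the text header before writing raw bytes
stdout.buffer.write(pdf.read_bytes())  # write the PDF as bytes, bypassing text encoding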
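Putting the suggestions above together (blank line after the headers, binary output, streaming instead of reading the whole file), a CGI-style sketch could look like this. `serve_pdf` and its `out` parameter are illustrative names, not part of any framework.

```python
import shutil
import sys
from pathlib import Path


def serve_pdf(path, out=None):
    """Write a CGI header block and the PDF body to a binary stream."""
    out = out if out is not None else sys.stdout.buffer
    # Headers are written as bytes to the same binary stream as the body,
    # so text and binary output cannot get out of order.
    out.write(b"Content-Type: application/pdf\r\n")
    out.write(b"\r\n")  # blank line that ends the header section
    with Path(path).open("rb") as src:
        shutil.copyfileobj(src, out)  # copy in chunks, not all at once
    out.flush()
```

Because the path is passed in by your script, the client never sees where the file actually lives on the server.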
I really don't know what your question is about. It looks like the code is unrelated to the question. It seems like it is supposed to find a PDF in the local file system (built-in open?) and then print its content(?). Do you use a framework (Flask/Django)?
If by dynamic serving you have in mind dynamic creation of a PDF based on some template:
a PDF may be constructed from a markup language such as HTML or a TeX file (LaTeX is unfortunately a complicated system and is not well suited to being deployed with a web app)
the markup file may in turn be rendered by templating software (jinja2, Django's built-in templates)
https://weasyprint.org/ is a library that converts HTML + CSS to PDF
P.S. Add more context.
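The template-to-PDF chain described above could be sketched roughly as follows, assuming both jinja2 and WeasyPrint are installed. The function names and the template are made up for illustration.

```python
def render_html(template_source, **context):
    """Fill a Jinja2 template string with values and return the HTML."""
    from jinja2 import Template  # lazy import: assumes jinja2 is installed

    return Template(template_source).render(**context)


def weasyprint_pdf(html):
    """Convert an HTML string to PDF bytes with WeasyPrint."""
    from weasyprint import HTML  # lazy import: assumes WeasyPrint is installed

    # write_pdf() without a target returns the document as bytes.
    return HTML(string=html).write_pdf()


# Hypothetical template used only for this example.
REPORT_TEMPLATE = "<h1>{{ title }}</h1><p>Generated for {{ user }}.</p>"
```

Something like `weasyprint_pdf(render_html(REPORT_TEMPLATE, title="Report", user="alice"))` would then give you the PDF bytes to send back from your handler.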

Convert Latex to PDF in Python web app

What is the easiest way to convert a string of Latex (e.g. "Consider the polynomial $x^2$") into a pdf within a Python web app? Ideally, this wouldn't require the creation of intermediate files that I would have to store in a database.
I tried downloading Texcaller (http://vog.github.io/texcaller/) but I could not get it to work. In particular, the key file python/texcaller.py has the line "import _texcaller" which gives the error "No module named _texcaller."
I'm thinking that there must be some way to do it because the Texer resource at AoPS (http://www.artofproblemsolving.com/Resources/texer.php) renders Tex as PDF almost instantaneously.
Thank you very much in advance!
I've recently written a library for the purpose of generating LaTeX code using python. It supports tables, plots, matrices and more. https://github.com/JelteF/PyLaTeX
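For the conversion step itself (a LaTeX string in, PDF bytes out), one possible sketch is to call pdflatex in a temporary directory that is removed afterwards. This is not PyLaTeX's API, just plain subprocess use; it assumes a TeX distribution with pdflatex is installed on the server, and the names `wrap_latex` and `latex_to_pdf` are illustrative.

```python
import subprocess
import tempfile
from pathlib import Path


def wrap_latex(body):
    """Wrap a LaTeX fragment in a minimal article document."""
    return "\\documentclass{article}\n\\begin{document}\n" + body + "\n\\end{document}\n"


def latex_to_pdf(body, engine="pdflatex"):
    """Compile a LaTeX fragment and return the PDF as bytes."""
    with tempfile.TemporaryDirectory() as workdir:
        (Path(workdir) / "snippet.tex").write_text(wrap_latex(body), encoding="utf-8")
        subprocess.run(
            [engine, "-interaction=nonstopmode", "-halt-on-error", "snippet.tex"],
            cwd=workdir,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            check=True,  # raise if compilation fails
            timeout=60,
        )
        # Read the result before the temporary directory is deleted.
        return (Path(workdir) / "snippet.pdf").read_bytes()
```

Intermediate files still exist briefly, but nothing has to be stored in a database. If the LaTeX comes from users, treat it as untrusted input.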

create office files from python

We have a project in python with django.
We need to generate complex word, excel and pdf files.
For the rest of our projects, which were done in PHP, we used PHPExcel, PHPWord and TCPDF for PDF.
What libraries for Python would you recommend for creating these kinds of files? (For Excel and Word it's important to use the Open XML file formats, xlsx and docx.)
Python-docx may help ( https://github.com/mikemaccana/python-docx ).
Python doesn't have highly-developed tools to manipulate word documents. I've found the java library xdocreport ( https://code.google.com/p/xdocreport/ ) to be the best by far for Word reporting. Because I need to generate PCL, which is efficiently done via FOP I also use docx4j.
To integrate this with my python, I use the spark framework to wrap it up with a simple web service, and use requests on the python side to talk to the service.
For Excel, there's openpyxl, which actually is a Python port of PHPExcel, afaik. I haven't used it yet, but it sounds ok to me.
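As a small illustration of the openpyxl route, here is a sketch that builds an xlsx file in memory, assuming openpyxl is installed. `rows_to_xlsx` is a made-up helper name.

```python
import io


def rows_to_xlsx(rows):
    """Write an iterable of rows to a new workbook and return the xlsx bytes."""
    from openpyxl import Workbook  # lazy import: assumes openpyxl is installed

    workbook = Workbook()
    sheet = workbook.active  # the default first worksheet
    for row in rows:
        sheet.append(list(row))  # each row becomes one spreadsheet row
    buffer = io.BytesIO()
    workbook.save(buffer)  # save accepts a file-like object as well as a path
    return buffer.getvalue()
```

The returned bytes can be used as the body of a Django response, so no temporary file is needed.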
I would recommend using Docutils. It takes reStructuredText files and converts them to a range of output files. Included in the package are HTML, LaTeX and .odf file writers but in the sandbox there are a whole load of other writers for writing to other formats, see for example, the WordML writer (disclaimer: I haven't used it).
The advantage of this solution is that you can write plain text (reStructuredText) master files, which are human readable as is, and then convert to a range of other file formats as required.
Whilst not a Python solution, you should also look at Pandoc, a Haskell library which supports a much wider range of output and input formats than Docutils. One major advantage of Pandoc over Docutils is that you can do the reverse translation, i.e. WordML to reStructuredText. You can try Pandoc here.
I have never used any libraries for this, but you can change the extension of any docx, xlsx file to zip, and see the magic!
Generating Open XML files is as simple as generating a couple of XML files (you can use templates) and zipping them.
The simplest way to generate a PDF is to generate HTML (with CSS and images) and convert it using the wkhtmltopdf tool.
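To make the "XML files plus zip" idea concrete, here is a sketch that builds a bare-bones docx with only the standard library. It writes the three parts a minimal package usually needs; real documents typically carry more parts (styles, properties), and Word may be stricter than other readers. `build_docx` is an illustrative name.

```python
import io
import zipfile
from xml.sax.saxutils import escape

CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/vnd.'
    'openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    "</Types>"
)
RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/'
    '2006/relationships/officeDocument" Target="word/document.xml"/>'
    "</Relationships>"
)
DOC_HEAD = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    "<w:body>"
)
DOC_TAIL = "</w:body></w:document>"


def build_docx(paragraphs):
    """Return the bytes of a minimal docx with one plain paragraph per string."""
    body = "".join(
        "<w:p><w:r><w:t>" + escape(text) + "</w:t></w:r></w:p>" for text in paragraphs
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", RELS)
        package.writestr("word/document.xml", DOC_HEAD + body + DOC_TAIL)
    return buffer.getvalue()
```

For anything beyond plain paragraphs, a dedicated library is likely less painful than hand-written XML.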

How to print gantt-charts generated on web using python?

I want to print or save a Gantt chart (in PDF format). These charts are generated on the web after a particular input. Our chart is a plug-in for Trac. I have used the Genshi library to generate the charts.
There's an open source Python library for generating PDF files by ReportLab. I've not used it myself, but other questions and answers on SO have revolved around this library, the ReportLab Toolkit.
Can you give more information about your plugin? There is a gantt chart plugin on trac-hacks.org; is that the one you are using, or a custom one? If custom, is it available as Open Source somewhere so we can see what you are doing?
If you implemented this as a wiki macro, you can use the WikiToPdf plugin to do this.
You could use WeasyPrint to convert HTML to PDF. From the example on their website:
weasyprint http://www.w3.org/TR/CSS21/intro.html CSS21-intro.pdf -s http://weasyprint.org/samples/CSS21-print.css
This creates a PDF file based on the HTML page and CSS provided. WeasyPrint is implemented in Python.
