Using Python to convert PNGs into PDFs - python

I want to write a Python script to convert PNGs into 2-page PDFs (i.e. 2 PNGs per PDF). The software needs to run on both a Mac and Windows 7.
My current solution uses ReportLab, but that doesn't install easily on a Mac. According to its website, it only has a compiled version for Windows; the cross-platform version requires a C compiler to install.
Is there a better way to do this (so that I don't have to install a C compiler on the Mac)? Should I use a different library, or a different language entirely? As long as I can call the program from a Python script, I could use any language for the PDF creation. Or is there a very straightforward C compiler (i.e. one a non-programmer could install) that I could install on a Mac?
What do you recommend?

The Unix program convert (part of ImageMagick) can handle the conversion:
convert file.png file.pdf
But you said you want it to work under Windows too. The PIL imaging library has a PDF module. You should try a simple
pilconvert file.png file.pdf
To put more than one image in a PDF, you can play with the library, which is quite flexible for resizing, stitching, etc. It is also available for Mac and Windows.
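For the two-PNGs-per-PDF case, here is a minimal sketch. It assumes a reasonably recent Pillow (the maintained PIL fork) whose PDF writer accepts save_all and append_images. The helper names pair_up and save_pages_as_pdf are made up for illustration.

```python
def pair_up(png_paths):
    """Group paths two at a time: [a, b, c] -> [(a, b), (c,)]."""
    return [tuple(png_paths[i:i + 2]) for i in range(0, len(png_paths), 2)]


def save_pages_as_pdf(png_paths, pdf_path):
    """Write one PDF page per PNG (assumes Pillow is installed)."""
    from PIL import Image  # imported lazily so pair_up() works without Pillow

    # Older Pillow releases cannot write RGBA images to PDF, so flatten to RGB.
    pages = [Image.open(p).convert("RGB") for p in png_paths]
    # save_all + append_images writes every image as its own page.
    pages[0].save(pdf_path, "PDF", save_all=True, append_images=pages[1:])
```

Calling save_pages_as_pdf(pair, "out.pdf") for each pair returned by pair_up(...) would give one 2-page PDF per pair; a leftover PNG at the end ends up in a 1-page PDF.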
Update 2011-02-15
On my Mac OS X machine, I installed ReportLab without difficulty:
% sudo easy_install reportlab
Searching for reportlab
Reading http://pypi.python.org/simple/reportlab/
Reading http://www.reportlab.com/
Best match: reportlab 2.5
Downloading http://pypi.python.org/packages/source/r/reportlab/reportlab-2.5.tar.gz#md5=cdf8b87a6cf1501de1b0a8d341a217d3
Processing reportlab-2.5.tar.gz
So you could use a combination of PIL and Reportlab for your own needs.
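One way the PIL/ReportLab combination might look, as a rough sketch: ReportLab's canvas draws one PNG per page and relies on PIL to read the PNG files. The function names and the US Letter page size (612 x 792 points) are my assumptions, not something from the question.

```python
def fit_within(img_w, img_h, page_w, page_h):
    """Scale (img_w, img_h) to fit the page while keeping the aspect ratio."""
    scale = min(page_w / img_w, page_h / img_h)
    return img_w * scale, img_h * scale


def pngs_to_pdf(png_paths, pdf_path, page_size=(612.0, 792.0)):
    """Draw each PNG centered on its own page (assumes ReportLab and PIL)."""
    from reportlab.lib.utils import ImageReader
    from reportlab.pdfgen import canvas

    page_w, page_h = page_size
    c = canvas.Canvas(pdf_path, pagesize=page_size)
    for path in png_paths:
        img = ImageReader(path)
        img_w, img_h = img.getSize()
        w, h = fit_within(img_w, img_h, page_w, page_h)
        # Center the scaled image on the page.
        c.drawImage(img, (page_w - w) / 2, (page_h - h) / 2, width=w, height=h)
        c.showPage()  # finish this page and start the next one
    c.save()
```

pngs_to_pdf(["first.png", "second.png"], "out.pdf") would then produce a 2-page PDF.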

ReportLab is one of the good tools for generating PDFs. If it works for your purposes, it is better to stick with it. For installation on a Mac, I see that DarwinPorts has a port for ReportLab called py-reportlab. Follow the instructions to install it with the port command; it will install the dependencies by itself.

PyCairo could be a good alternative. You'll have full control and fewer dependencies, but ReportLab is a lot simpler.

Why hasn't anyone tried this?
If you are not too concerned about quality, try the following solution.
import PIL.Image
filepath = "temp.png"
newfilename = "our.pdf"
# Convert to RGB first: older PIL/Pillow versions cannot write RGBA images to PDF.
im = PIL.Image.open(filepath).convert("RGB")
im.save(newfilename, "PDF", quality=100)

Related

Why is the Python build of OpenCV (cv2.pyd) so small?

I wanted to use OpenCV with Python, so I downloaded OpenCV for Windows and got a folder of ~3.7GB after decompression. What surprised me was that the only file I needed was cv2.pyd, which is tiny (~11MB) compared to the C builds (~674MB). I simply copied it to my Python site-packages folder without adding anything to my PATH, and it worked perfectly.
I don't know how Python bindings work, and I thought they should call C/C++ implementations under the hood. However, cv2 did not seem to require any C/C++ library. It just looks like magic to me.
Most likely it has something to do with static linking and using all the tricks found in "Reducing Executable Size" or "GCC x86 code size optimizations".
OpenCV uses CMake as its build system, which provides a "MinSizeRel" build type. It seems to auto-apply most of those tricks. I couldn't find any good documentation on that, hence: [citation needed]
(My original answer follows; it didn't quite address the actual question.)
A more convenient way to get OpenCV for Python may be to download it from: http://www.lfd.uci.edu/~gohlke/pythonlibs/#opencv
After running the installer you'll find cv2.pyd in c:\python27\lib\site-packages
As far as we are concerned, a .pyd file is the same as a .dll: http://docs.python.org/2/faq/windows.html#is-a-pyd-file-the-same-as-a-dll
This means that we can use Dependency Walker to look into it.
What it shows is that cv2.pyd is dynamically linked against the OpenCV libraries, which contain the actual functionality. These take around 45MB of disk space.
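If you want to check which file Python actually loads for a module and how big it is on disk, something like the following sketch could help. It assumes Python 3 (importlib.util.find_spec) rather than the Python 2.7 install mentioned above, and module_file_info is a hypothetical helper name.

```python
import importlib.util
import os


def module_file_info(name):
    """Return (path, size_in_bytes) for a top-level module, or None."""
    spec = importlib.util.find_spec(name)
    # Built-in and frozen modules have no real file behind them.
    if spec is None or not spec.origin or not os.path.isfile(spec.origin):
        return None
    return spec.origin, os.path.getsize(spec.origin)


if __name__ == "__main__":
    print(module_file_info("json"))
```

On a setup like the one described, module_file_info("cv2") would report the path and size of cv2.pyd. The size of any DLLs it links against is not included, which is exactly why the number looks so small.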

Is there any Image processing library for Python 2.5 that does not require external dependencies under Windows?

I noticed that there are (too) many ImageMagick libraries for Python, but I found none that can just be installed and that bundles the ImageMagick libraries.
I am looking for something that can be easily installed and that works with Python 2.5 (please, don't ask me why 2.5!)
Did you try the Python Imaging Library (PIL)? They have got an installer for Python 2.5 and I don't think there are further dependencies.
I won't ask you why 2.5, don't worry - but I am pretty happy with Python Imaging Library.
I work mostly on Windows, and while PIL has its glitches, it works just fine. Up to now I haven't had any compatibility problems with BMPs, JPGs and PNGs.
I suggest you look into PIL once again. I just tried it on a BMP made in Paint and it works fine.
BTW, one thing that can make stuff seem inoperable is forgetting to open the output file in "wb" mode when you pass a file object to the Image.save() function.
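To illustrate the "wb" point: when you hand Image.save() an already-open file object instead of a filename, the file needs to be in binary mode. A minimal sketch, where save_to_open_file is a hypothetical helper and image is assumed to be a PIL image (or anything else with a save(fp, format) method):

```python
def save_to_open_file(image, path, fmt="PNG"):
    """Save an image through an explicitly opened file object."""
    # Binary mode matters: image data is bytes, and text mode ("w") would
    # either raise an error or mangle line endings on Windows.
    with open(path, "wb") as fh:
        image.save(fh, fmt)
```

With PIL this would be used as save_to_open_file(Image.open("in.bmp"), "out.png"). Passing a filename straight to Image.save() avoids the issue altogether, since PIL then opens the file itself.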

Building PyGTK- and poppler-based binaries under Windows

For quite some time now, I've been fighting in vain to get a piece of software I'm working on to run under Windows. It's written in Python (the 2.x series), and although all Linux users can benefit from its GUI when they use the source package, it seems that many people who download it go for the Windows package, for which I've only been able to provide command-line binaries.
The GUI was built using Glade/GTK, and uses poppler to embed a PDF viewer. I've found various how-tos in the past (I don't have them at hand right now, sorry), which I've tried to follow religiously, but I never got things to work at all.
So, is there a reliable tutorial explaining exactly how to install the needed libraries (GTK and Poppler), so that I can build the corresponding binaries for my users?
There is a Windows installer here: http://download.gnome.org/binaries/win32/pygtk/2.22/

How do I install the Python gtkmozembed package, or is there a replacement for Python GtkMozEmbed on Windows?

self.widget = gtkmozembed.MozEmbed()
self.widget.set_size_request(x + 18, y)
I can't find a binary installer for gtkmozembed.
Is there an equivalent that can do the above?
You should specify what OS/platform you're on. Binary installers are OS-specific.
Also you should specify more accurately what you're trying to do. The code is pretty much out of context and it can be hard to see what you're trying to do.
I'm assuming you're trying to have an embedded browser widget in your Python app.
Gtkmozembed for Python is part of PyGTK (pygtk.org) - you'll need to install PyGTK to use gtkmozembed.
I couldn't find any binary installers, but at the bottom of pygtk.org/downloads.html, there are links to instructions for building the binary installer on different platforms.
Also this link could be helpful (assuming you're on Windows): http://faq.pygtk.org/index.py?file=faq21.001.htp&req=show
Other than gtkmozembed, there are pywebkitgtk and wx.HtmlWindow (wxWidgets), but I must admit I couldn't find much information on those.

Converting a PDF to a series of images with Python

I'm attempting to use Python to convert a multi-page PDF into a series of JPEGs. I can split the PDF up into individual pages easily enough with available tools, but I haven't been able to find anything that can convert PDFs to images.
PIL does not work, as it can't read PDFs. The two options I've found are using either GhostScript or ImageMagick through the shell. This is not a viable option for me, since this program needs to be cross-platform, and I can't be sure either of those programs will be available on the machines it will be installed and used on.
Are there any Python libraries out there that can do this?
ImageMagick has Python bindings.
Here's what's worked for me using the Python ghostscript module (installed with '$ pip install ghostscript'):
import ghostscript
def pdf2jpeg(pdf_input_path, jpeg_output_path):
    args = ["pdf2jpeg",  # actual value doesn't matter
            "-dNOPAUSE",
            "-sDEVICE=jpeg",
            "-r144",
            "-sOutputFile=" + jpeg_output_path,
            pdf_input_path]
    ghostscript.Ghostscript(*args)
I also installed Ghostscript 9.18 on my computer and it probably wouldn't have worked otherwise.
You can't avoid the Ghostscript dependency. Even ImageMagick relies on Ghostscript for its PDF reading functions. The reason for this is the complexity of the PDF format: a PDF doesn't just contain bitmap information, but mostly vector shapes, transparencies etc.
Furthermore, it is quite complex to figure out which of these objects appear on which page.
So correctly rendering a PDF page is clearly out of scope for a pure Python library.
The good news is that Ghostscript is pre-installed on many Windows and Linux systems, because it is also needed by all those PDF printers (except Adobe Acrobat).
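Since you can't be sure Ghostscript is there, one option is to look for it at runtime and fall back gracefully. A sketch using only the standard library (Python 3.3+ for shutil.which); the function names are hypothetical, and the executable names gs, gswin64c and gswin32c are the usual ones for Unix and Windows builds but may differ on your systems.

```python
import shutil

# Common Ghostscript executable names (assumption: Unix, 64-bit and 32-bit Windows).
GS_NAMES = ("gs", "gswin64c", "gswin32c")


def find_ghostscript():
    """Return the path of the first Ghostscript executable on PATH, or None."""
    for name in GS_NAMES:
        path = shutil.which(name)
        if path:
            return path
    return None


def build_gs_command(gs_path, pdf_path, out_pattern, dpi=144):
    """Build the argument list for rendering a PDF to JPEG files."""
    return [
        gs_path,
        "-dNOPAUSE",              # don't prompt between pages
        "-dBATCH",                # exit when the input is finished
        "-sDEVICE=jpeg",
        "-r%d" % dpi,             # resolution in dpi
        "-sOutputFile=" + out_pattern,
        pdf_path,
    ]
```

If find_ghostscript() returns a path, the list from build_gs_command(path, "in.pdf", "page-%03d.jpg") can be handed to subprocess.check_call; the %03d in the output pattern makes Ghostscript write one numbered file per page. If it returns None, the program can tell the user that Ghostscript needs to be installed.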
If you're using Linux, some distributions come with a command-line utility called 'pdftoppm' (from Poppler or Xpdf) out of the box. Check out netpbm for working with the resulting PPM files.
Perhaps relevant: http://www.swftools.org/gfx_tutorial.html
