How to add text to existing PDF file with Python

I am looking for a Python module, and some examples, for adding text to an existing PDF. The PDF is a one-page file, and I need to add the text at a predetermined position. The text can be added either as part of the document or as a comment.
I also need to read the comments that are already in this document.
What is the best Python module that I can use for this? The environment is Windows 7 and Python 2.7 x64.
I have tried to compile poppler, but it is a nightmare.
The other libraries that I have looked at are pyPDF2 and PDF1.0, but I could not locate the objects and methods that I need for this task. I am a beginner, so if I have overlooked anything, that is why.

This question has been asked, and was very thoroughly answered. Check it out here! The first answer is nicely general, and walks you through each step of the process. The second answer is instead straight code that you can run. Both are valuable and well-written; choose whichever works best for you. (Or both!)

Related

How can I prevent the filename from changing (rename) using python?

I have a project in mind, but there is one part that I don't know how to do. I'm using Python 3.6 on Windows 10. For example, given a file named "example.txt", I want to prevent both its name and its contents from being changed.
I researched this topic but could not find anything useful. Can we prevent a file's name (including its extension) or its contents from being changed? I think the program would need to run as an administrator to do this.
Thanks.

It is possible to stop another program from editing a file by locking it in Python.
There is a module called filelock that does this. Take a look at its source code to see how it is done.
It is also worth noting that more advanced ransomware will try to stop processes so they can encrypt files, so this might not work in all cases.
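
As a rough illustration of the lock-file idea, here is a minimal sketch that uses only the standard library. It is not filelock's actual code. The class name `SimpleLock` and the separate `.lock` path are assumptions made for this example. The lock is cooperative: it only holds back programs that check for the lock file, and it does not by itself stop a rename or an edit by a program that ignores it.

```python
import os


class SimpleLock:
    """Hypothetical cooperative lock backed by a separate lock file."""

    def __init__(self, lock_path):
        self.lock_path = lock_path
        self.fd = None

    def acquire(self):
        """Try to take the lock; return True on success, False if it is held."""
        try:
            # O_CREAT | O_EXCL makes creation fail if the lock file already exists.
            self.fd = os.open(self.lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            return True
        except FileExistsError:
            return False

    def release(self):
        """Release the lock by closing and deleting the lock file."""
        if self.fd is not None:
            os.close(self.fd)
            os.remove(self.lock_path)
            self.fd = None
```

A program would call `acquire()` on something like `example.txt.lock` before touching `example.txt`, and call `release()` when it is done.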

Inserting a comment in docx file using python 3

I'm trying to write an application that scans a Word .docx file and offers suggestions. Ideally, I want to put the suggestions in the document itself using MS Word's annotation functionality.
My question is how to achieve this in Python. I looked through python-docx's functionality but didn't find anything there.
Alternatively, I was thinking of doing it manually: programmatically opening the .docx file, going into the XML file inside it, and adding the comment that way. But I'm not sure how to approach this, or whether there is a better way to do it.
Please advise :)
Appreciate the help in advance!

For anyone else looking: this is a fork of python-docx that implements comment functionality.

How to parse a .shp file?

I am interested in gleaning information from an ESRI .shp file.
Specifically the .shp file of a polyline feature class.
When I open the .dbf of a feature class, I get what I would expect: a table that opens in Excel and contains the information from the feature class's table.
However, when I try to open a .shp file in any program (Excel, TextPad, etc.), all I get is a bunch of gibberish and unusual ASCII characters.
I would like to use Python (2.x) to interpret this file and get information out of it (in this case the vertices of the polyline).
I do not want to use any modules or non built-in tools, as I am genuinely interested in how this process would work and I don't want any dependencies.
Thank you for any hints or points in the right direction you can give!

Your question, basically, is "I have a file full of data stored in an arbitrary binary format. How can I use python to read such a file?"
The answer is, this link contains a description of the format of the file. Write a dissector based on the technical specification.
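
As a sketch of what such a dissector might look like, the following uses only the built-in `struct` module. It assumes the layout given in the ESRI Shapefile Technical Description: a 100-byte main header, big-endian record headers with lengths counted in 16-bit words, and little-endian record contents. The function name `read_polylines` is made up for this example. It handles only PolyLine records (shape type 3) and skips everything else.

```python
import struct


def read_polylines(data):
    """Return the vertices of every PolyLine record in the bytes of a .shp file.

    The result is a list of shapes; each shape is a list of parts;
    each part is a list of (x, y) tuples.
    """
    # Main header: file code (big-endian) at byte 0, file length at byte 24.
    file_code, = struct.unpack(">i", data[0:4])
    if file_code != 9994:
        raise ValueError("not a shapefile: bad file code %r" % file_code)
    file_len_words, = struct.unpack(">i", data[24:28])
    end = min(file_len_words * 2, len(data))  # lengths are in 16-bit words

    shapes = []
    pos = 100  # records start right after the 100-byte header
    while pos < end:
        # Record header: record number and content length, both big-endian.
        _rec_num, content_words = struct.unpack(">ii", data[pos:pos + 8])
        content = data[pos + 8:pos + 8 + content_words * 2]
        pos += 8 + content_words * 2

        # Record contents are little-endian and start with the shape type.
        shape_type, = struct.unpack("<i", content[0:4])
        if shape_type != 3:  # 3 = PolyLine; skip null shapes and other types
            continue

        # Bytes 4..36 hold the bounding box (4 doubles), then the two counts.
        num_parts, num_points = struct.unpack("<ii", content[36:44])
        parts = struct.unpack("<%di" % num_parts, content[44:44 + 4 * num_parts])
        offset = 44 + 4 * num_parts
        flat = struct.unpack("<%dd" % (2 * num_points),
                             content[offset:offset + 16 * num_points])
        points = list(zip(flat[0::2], flat[1::2]))

        # parts[i] is the index of the first point of part i.
        bounds = list(parts) + [num_points]
        shapes.append([points[bounds[i]:bounds[i + 1]] for i in range(num_parts)])
    return shapes
```

To use it on a real file, read the file in binary mode (`open(path, "rb").read()`) and pass the resulting bytes to `read_polylines`.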

If you don't want to go to all the trouble of writing a parser, you should take a look at pyshp, a pure Python shapefile library. I've been using it for a couple of months now, and have found it quite easy to use.
There's also a Python binding to shapelib, if you search the web. But I found the pure Python solution easier to hack around with.

Might be a long shot, but you should check out ctypes, and maybe use the .dll file that came with a program (if it even exists, lol) that can read that type of file. In my experience, things get weird when you start digging around in .dlls.

Sublime Text dynamic tmLanguage file

I have an issue on my GitHub project, which maintains an EJS syntax definition file for the Sublime Text editor (https://github.com/samholmes/EJS.tmLanguage/issues/1).
The issue is that users want to be able to customize what the opening and closing tags should be in EJS. I've set them to <? and ?> respectively, because I personally prefer this. However, the "correct", or should I say recommended, default opening and closing tags are <% and %>, as you'll find on the EJS website.
So what I'm wondering is whether there is a way to customize this per installation of the package. I don't know how this would work, though, since tmLanguage files are just XML files. So my question is about this line:
https://github.com/samholmes/EJS.tmLanguage/blob/master/EJS.tmLanguage#L579
Is there a way to have the regular expression generated from a settings file?
Any ideas on how I could solve this would be highly appreciated. I'm not familiar with Sublime's features or its Python API, so if anyone has more information on this, please let me know what you think I should do.

What program to write pdf including other pdf on Linux from Python?

On an Ubuntu server, I want to create PDFs which include other static PDFs. I have tried using ReportLab with pyPdf. Ideally I would use ReportLab to do the whole thing, but importing the PDFs requires their PageCatcher product, which has a large recurring fee.
So I use pyPdf to merge a page created with ReportLab and my other pdfs. The problem is that even though this looks fine in Acrobat and Foxit, part of one of the pages prints garbled on a Xerox 7400 color printer. I can't figure out the issue, but would be willing to buy a more integrated solution if it existed and was reasonably priced. I thought PDF Creator Pilot was it until I saw that it was Windows only.
So is there a reasonably priced ($1K or less) solution or a different suggestion?

I have had a lot of success with the Java library iText. They have a great library of samples for pretty much anything you could think of doing with PDF files. This example is for concatenating PDF files and sounds like it would do what you need: http://itextpdf.com/examples/index.php?page=example&id=123. There is also PDFBox which is another great Java based PDF manipulation library.
I realize that you are looking for a Python-based solution, but there may not be many other options. If you are using the Jython interpreter instead of CPython, integrating iText should be trivial. If not, you could consider calling out to it as a separate process. I realize that may not be ideal for your situation, but I figured I would mention it as an option.

Another non-Python answer. If you are just merging pages, then pdftk does that well (along with a lot of other things).
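
If you go the pdftk route from Python, calling it as a separate process could look roughly like the sketch below. It assumes `pdftk` is installed and on the PATH. The helper names `pdftk_merge_command` and `merge_pdfs` are made up for this example. The `cat output` form is pdftk's syntax for concatenating input files.

```python
import subprocess


def pdftk_merge_command(inputs, output):
    """Build the argument list for: pdftk in1.pdf in2.pdf ... cat output out.pdf"""
    if not inputs:
        raise ValueError("need at least one input PDF")
    return ["pdftk"] + list(inputs) + ["cat", "output", output]


def merge_pdfs(inputs, output):
    """Run pdftk to concatenate the input PDFs; raises if pdftk exits with an error."""
    subprocess.run(pdftk_merge_command(inputs, output), check=True)
```

For example, `merge_pdfs(["cover.pdf", "static.pdf"], "combined.pdf")` would write the two files, in order, into `combined.pdf`. The file names here are placeholders.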
