Inserting a comment in docx file using python 3 - python

I'm trying to write an application that scans a Word .docx file and offers suggestions. Ideally, I want to put the suggestions on the document itself using MS Word's annotation (comment) functionality. (Here is an image of what I'm talking about, in case I'm not being clear: MS Word Annotation Functionality.)
My question is: how do I achieve this in Python? I looked through python-docx's functionality, but didn't find anything there.
Alternatively, I was thinking of doing it manually: programmatically opening the .docx file, going into the XML inside it, and adding the comment that way. But I'm not sure how to approach this, or whether there is a better way to do it.
Please advise :)
Appreciate the help in advance!

For anyone else looking, this is a fork of python-docx that implements comment functionality.
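For the manual route mentioned in the question, a .docx file is a ZIP archive of XML parts, and Word keeps comment bodies in the part `word/comments.xml`. The sketch below only reads existing comments using the standard library; the function name `read_docx_comments` is made up for illustration. Writing a new comment is more involved, since besides an entry in `word/comments.xml` the package also needs a matching relationship, a content-type entry, and reference markers in `word/document.xml`, none of which this sketch attempts.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (the "w:" prefix in the XML parts).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_docx_comments(docx):
    """Return a list of (author, text) tuples for the comments in a .docx.

    `docx` may be a file path or a binary file-like object.
    `read_docx_comments` is a hypothetical helper, not part of python-docx.
    """
    with zipfile.ZipFile(docx) as archive:
        if "word/comments.xml" not in archive.namelist():
            return []  # the document has no comments part
        root = ET.fromstring(archive.read("word/comments.xml"))

    comments = []
    for comment in root.iter("{%s}comment" % W_NS):
        author = comment.get("{%s}author" % W_NS, "")
        # A comment body is made of paragraphs and runs; join all text nodes.
        text = "".join(t.text or "" for t in comment.iter("{%s}t" % W_NS))
        comments.append((author, text))
    return comments
```

Opening a document that already contains comments and inspecting this part is a reasonable first step toward seeing what Word expects before trying to generate it.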

Related

How To Extract Data From PDF In Python Using PDFrw

I am trying to use PDFrw to get data from a certain PDF (let's say the one at the top right of the page HERE). I have looked through the documentation they provide (I couldn't find much) and at the example code they posted on git, but I can't gather enough information to do what I would like to do. How would I make a simple program that opens the PDF using PDFrw (or another library, if there is a better one) and extracts a certain piece of text? I was thinking about converting it to HTML; would that be easier? Using the PDF above as an example, I would like to get, let's say, the voltage, which in the PDF is 600 w. How would I go about doing this in the simplest way? I couldn't find any other Stack Overflow questions about this, so hopefully someone who has used it before can help!
Thanks!
I am the author of pdfrw, and it's not really designed for this. You should probably look at pdfminer.

How to add text to existing PDF file with Python

I am looking for a Python module and some examples of how to add text to an existing PDF. The PDF file is a one-page PDF, and I would need to add the info at a predetermined position. The text can be added as part of the document or as a comment.
I would also need to read the comments that are in this document.
What is the best Python module that I can use for this? The environment is Windows 7 and Python 2.7 x64.
I have tried to compile poppler, but it is a nightmare.
The other libraries that I have looked at are pyPDF2 and PDF1.0, but I could not locate the objects and methods that I need to achieve my task. My level is "beginner", so if I overlooked anything, that is why.
This question has been asked, and was very thoroughly answered. Check it out here! The first answer is nicely general, and walks you through each step of the process. The second answer is instead straight code that you can run. Both are valuable and well-written; choose whichever works best for you. (Or both!)

How to parse a .shp file?

I am interested in gleaning information from an ESRI .shp file.
Specifically the .shp file of a polyline feature class.
When I open the .dbf of a feature class, I get what I would expect: a table that opens in Excel and contains the information from the feature class's table.
However, when I try to open a .shp file in any program (Excel, TextPad, etc.), all I get is a bunch of gibberish and unusual ASCII characters.
I would like to use Python (2.x) to interpret this file and get information out of it (in this case the vertices of the polyline).
I do not want to use any modules or non built-in tools, as I am genuinely interested in how this process would work and I don't want any dependencies.
Thank you for any hints or points in the right direction you can give!
Your question, basically, is "I have a file full of data stored in an arbitrary binary format. How can I use python to read such a file?"
The answer is: this link contains a description of the file format. Write a dissector based on the technical specification.
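As a rough illustration of such a dissector, here is a minimal sketch using only the built-in `struct` module. It assumes the layout given in the ESRI Shapefile Technical Description: a 100-byte file header (file code 9994 and file length stored big-endian, the rest little-endian), followed by records that each carry an 8-byte big-endian header and little-endian content. It handles only shape type 3 (PolyLine) and skips everything else; the function name `read_polylines` is made up for this example, and the code is written against Python 3.

```python
import struct


def read_polylines(data):
    """Parse the raw bytes of a .shp file and return its polylines.

    Each polyline is a list of parts; each part is a list of (x, y) tuples.
    `read_polylines` is a hypothetical helper written for illustration.
    """
    # File header: the file code is a big-endian int and must be 9994.
    (file_code,) = struct.unpack(">i", data[0:4])
    if file_code != 9994:
        raise ValueError("not a shapefile (bad file code)")
    # File length is big-endian and counted in 16-bit words.
    file_length = struct.unpack(">i", data[24:28])[0] * 2
    end = min(file_length, len(data))

    polylines = []
    offset = 100  # records start right after the 100-byte header
    while offset + 8 <= end:
        # Record header: record number and content length (in 16-bit words).
        _rec_num, content_words = struct.unpack(">ii", data[offset:offset + 8])
        content = data[offset + 8:offset + 8 + content_words * 2]
        offset += 8 + content_words * 2

        (shape_type,) = struct.unpack("<i", content[0:4])
        if shape_type != 3:  # 3 = PolyLine; skip null shapes and other types
            continue

        # Bytes 4..36 hold the bounding box; the counts follow it.
        num_parts, num_points = struct.unpack("<ii", content[36:44])
        parts = struct.unpack("<%di" % num_parts, content[44:44 + 4 * num_parts])
        start = 44 + 4 * num_parts
        flat = struct.unpack("<%dd" % (2 * num_points),
                             content[start:start + 16 * num_points])
        points = list(zip(flat[0::2], flat[1::2]))

        # `parts` holds the index of the first point of each part.
        bounds = list(parts) + [num_points]
        polylines.append([points[bounds[i]:bounds[i + 1]]
                          for i in range(num_parts)])
    return polylines
```

It could be called as `read_polylines(open("roads.shp", "rb").read())`, where `roads.shp` is a placeholder file name.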
If you don't want to go to all the trouble of writing a parser, you should take a look at pyshp, a pure Python shapefile library. I've been using it for a couple of months now, and have found it quite easy to use.
There's also a Python binding to shapelib, if you search the web. But I found the pure Python solution easier to hack around with.
This might be a long shot, but you should check out ctypes, and maybe use the .dll file that came with a program (if one even exists, lol) that can read that type of file. In my experience, things get weird when you start digging around in .dlls.

Sublime Text dynamic tmLanguage file

I have an issue on my GitHub project, which is meant to maintain an EJS syntax definition file for the Sublime Text editor. (https://github.com/samholmes/EJS.tmLanguage/issues/1)
The issue is that users want to be able to customize what the opening and closing tags should be in EJS. I've set them to <? and ?> respectively, because I personally prefer this. However, the "correct", or should I say the recommended default, opening and closing tags are <% and %>, as you'll find on the EJS website.
So, what I'm wondering is whether there is a way to customize this per installation of the package. I don't know how this would work, though, since tmLanguage files are just XML files. So, my question is about this line:
https://github.com/samholmes/EJS.tmLanguage/blob/master/EJS.tmLanguage#L579
Is there a way to have the regular expression generated from some settings file?
Any ideas on how I could solve this would be highly appreciated. I'm not familiar with Sublime's features or Python API, so if anyone has more information on this, please let me know what you think I should do.

pdf viewer for pyqt4 application?

I'm writing a Python+Qt4 application that would ideally need to pop up a window every once in a while to display PDF documents and allow very basic operations, namely scrolling through the different pages and printing the document.
I've found ReportLab for creating PDF files, but nothing about PDF viewers. Does anyone know of anything that might help? I was really hoping for the existence of something like the QWebView widget...
Thanks in advance to all.
You can use the Poppler library for that.
python-poppler-qt4 is a Python binding to poppler-qt4 that aims for completeness and for being actively maintained:
https://code.google.com/p/python-poppler-qt4/
What about Okular? It is a full app, but it can always be called from another app.
