Adding text to a 'paragraph' containing an image using python docx - python

I'm using python docx which claims in the documentation that:
'Often, a picture is placed in a paragraph by itself, but this is not required. It can have text before and after it in the paragraph in which it’s placed.'
But I can't find out how to do this. Could someone explain (ideally with a basic example) how to get text before the image within the same paragraph, so that the line of text ends with an image?
I've not found any answers to this but have seen people asking the same elsewhere with no solution.
Thanks
(Note: I'm not a hugely experienced programmer, and other than this awkward part the rest of my code will be very basic.)

At the time of this writing, python-docx doesn't have the features to support what you're trying to do.
The feature that would support it would be Run.add_picture(). If you add a feature request to the python-docx issue tracker, I'll see how soon we can get to it.
In the meantime, if you wanted to dig in and see what you could hack up, I'd recommend starting here, at Document.add_picture, as the structure would be analogous and use mostly the same calls.

If you just want to write docx files with Python, you can use another module:
https://github.com/rafaels88/py2docx

Related

How to use the remove-emoji library in Python?

Anyone know how to use the remove-emoji library in Python?
Documentation details or any solid code would be appreciated.
Follow the link that you provided to the library.
Extract the contents of the archive.
You'll notice that there is only one function in the entire library:
    def remove_emoji(text):
        text = text.decode('utf8')
        return emoji_pattern.sub(r'', text).encode('utf8')
The only thing you need to do to use this library is call remove_emoji with the text you wish to have the emoji removed from.
For libraries like this with no documentation but only one simple task, the best thing you can do is look at the source code. Even in larger libraries, the source code is the only point of truth.
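For reference, here is a minimal Python 3 sketch of the same idea. The emoji_pattern below is an assumption: the library's own pattern is not reproduced in this answer, and this one covers only a few common emoji code-point ranges. Python 3 strings are already Unicode, so the decode/encode steps are dropped.

```python
import re

# Assumed pattern: a few common emoji blocks, NOT the library's actual pattern.
emoji_pattern = re.compile(
    "["
    "\U0001F600-\U0001F64F"  # emoticons
    "\U0001F300-\U0001F5FF"  # symbols and pictographs
    "\U0001F680-\U0001F6FF"  # transport and map symbols
    "\U0001F1E6-\U0001F1FF"  # regional indicator symbols (flags)
    "]+"
)


def remove_emoji(text):
    """Return `text` with every character matched by emoji_pattern removed."""
    return emoji_pattern.sub("", text)
```

Characters outside the listed ranges are left untouched, so the pattern would need extending for fuller coverage.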

reading coreproperties keywords from docx file with python-docx

From the script here I see how to set document keywords with the coreproperties function of python-docx. I want to look at the keywords already in a document written by someone else. Is there a getcoreproperties function or a keywords attribute or something similar?
I've grepped in folder C:\Python27\Lib\site-packages\python_docx-0.5.0-py2.7.egg\docx and none of the .py files there have the string "core" in them, and I've called doc() on a few things but without finding anything promising. Where/how should I look for clues to this kind of thing?
The python-docx library doesn't have support for core properties as of v0.5.0. But as it happens, that should be relatively easy to remedy.
The python-pptx sister project has support for core properties, as explained here:
http://python-pptx.readthedocs.org/en/latest/api/presentation.html#coreproperties-objects
Since the two projects are based on the same architecture, that code should be reusable essentially as-is. It turns out the core-properties bits are common to the Open Packaging Conventions (OPC), which are shared by all three of the MS Office XML file formats.
If you'll add an issue on the GitHub issue tracker I'll see how soon we can get to it.
https://github.com/python-openxml/python-docx/issues
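Until the library supports it, the keywords can be read with the standard library alone, since a .docx file is a zip package. A rough sketch, assuming the core properties part sits at the conventional path docProps/core.xml (strictly, its location is given by the package relationships); read_keywords is a hypothetical helper name. Later python-docx releases expose the same value as document.core_properties.keywords.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by the core properties part.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def read_keywords(docx_file):
    """Return the keywords string from a .docx file, or None if absent.

    `docx_file` may be a path or a file-like object.
    """
    with zipfile.ZipFile(docx_file) as package:
        try:
            # Assumed location of the core properties part.
            xml_bytes = package.read("docProps/core.xml")
        except KeyError:
            return None
    root = ET.fromstring(xml_bytes)
    element = root.find("cp:keywords", NS)
    return None if element is None else element.text
```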

Sublime Text editor plug-in, scan div id's and classes

I'm aware I'm supposed to show some starting code to give you a clue as to what I'm trying to do, but I'm really at a basic level and I can't find any resources to show me what I'm after. Basically, I'm trying to write a plug-in for Sublime Text editor, which selects all div ID's then outputs them into a file. What's the best approach? It seems like it should be easy, but I'm not too sure.
Thanks in advance for your help,
Ewan
This looks like a good place to start: http://www.sublimetext.com/docs/plugin-basics
Look at http://www.sublimetext.com/docs/2/api_reference.html, though be advised that Sublime Text 3 is currently in beta. It introduces changes to the plugin api, and a requirement to support Python 3. See http://www.sublimetext.com/docs/3/porting_guide.html
Assuming you have some familiarity with Python, I would start with this tutorial on writing plugins (Link). The author of that tutorial wrote, among other things, Package Control. Granted, it is for ST2, but for what you are trying to do, I don't foresee any major issues with writing a plugin that is compatible with both ST2 and ST3.
How you go about writing your particular plugin is up to you. One approach may be leveraging the view.find_all() method. This takes a regular expression and returns a set of regions. From these regions, you can grab the text, and subsequently the IDs for the divs. There may be a better way, but that might work as an initial attempt. Writing to a file can be done through the usual python means.
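The regex part of that approach can be tried outside the editor first. The sketch below works on a plain string with Python's re module; in a plugin, the text would come from the view instead (for example from the regions that view.find_all() returns). The pattern and the helper names extract_div_ids and write_ids are illustrative assumptions, and a regex like this only handles quoted id attributes on ordinary markup.

```python
import re

# Matches an opening <div ...> tag and captures the value of its id attribute.
# The \s before "id" avoids matching attributes such as data-id.
DIV_ID_RE = re.compile(
    r"""<div\b[^>]*?\sid\s*=\s*["']([^"']+)["']""",
    re.IGNORECASE,
)


def extract_div_ids(html):
    """Return the id values of all <div> tags in `html`, in document order."""
    return DIV_ID_RE.findall(html)


def write_ids(ids, path):
    """Write one id per line to the file at `path`."""
    with open(path, "w") as handle:
        handle.write("\n".join(ids) + "\n")
```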

Setting Font Attributes Using Python-Docx

I am creating a word document programmatically using the Python-docx module.
I want to be able to center my headers, turn certain words to bold in a table I create, and do other basic mark up.
Unfortunately, reading over the source code in the module doesn't give me much of a lead on doing this.
I'm guessing it has something to do with the lxml/etree module that the docx code is based upon, but I don't have much familiarity with that library. Any ideas?
The link above points to the legacy repository for python-docx. The new one (v0.3.0 and later) is a complete rewrite and is located here: https://github.com/python-openxml/python-docx
All the features listed above are available in the current version.
The documentation is here: https://python-docx.readthedocs.org/en/latest/
Only bug fixes are being done on the legacy version, to support projects that still use it.
The python-docx SO tag is monitored and questions tagged with that usually get answered same day now.

Edit RTF file using Python

Maybe this is a dumb question, but I don't get it, so I apologize :)
I have an RTF document, and I want to change it. E.g. there is a table, I want to duplicate a row and change the text in the second row in my code in an object-oriented way.
I think pyparsing should be the way to go, but I've been fiddling around for hours and still don't get it. I'm providing no example code because it's all nonsense, I think :/
Am I on the right path or is there a better approach?
Has anyone done something like this before?
RTFs are text documents with special "symbols" to create the formatting (see http://search.cpan.org/~sburke/RTF-Writer/lib/RTF/Cookbook.pod#RTF_Document_Structure; it seems that Perl has a good RTF library, though), so yes, PyParsing is a good way to go. You have to learn the structure and then parse it. There are Perl code examples on the page I mentioned; if you are lucky, you can translate them into Python with some effort.
There is a basic RTF module available for python. Check - http://pyrtf.sourceforge.net/
Hope that helps you a little.
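Because RTF is plain text, a very simple case can be sketched with ordinary string handling. The example below uses re rather than pyparsing (a deliberate simplification) and assumes each table row runs from a \trowd control word to the next \row, with no nested tables; real documents may need a proper parser. The helper name duplicate_first_row and the sample document are made up for illustration.

```python
import re

# A row is assumed to start at \trowd and end at the next \row control word.
ROW_RE = re.compile(r"\\trowd.*?\\row(?![a-z])", re.DOTALL)

SAMPLE = (
    r"{\rtf1\ansi" "\n"
    r"\trowd\cellx3000\cellx6000" "\n"
    r"\intbl Name\cell Alice\cell\row" "\n"
    r"\pard\par}"
)


def duplicate_first_row(rtf, old_text, new_text):
    """Copy the first table row, replacing old_text with new_text in the copy."""
    match = ROW_RE.search(rtf)
    if match is None:
        raise ValueError("no table row found")
    copy = match.group(0).replace(old_text, new_text)
    # Line breaks between control words are ignored by RTF readers.
    return rtf[:match.end()] + "\n" + copy + rtf[match.end():]
```

For instance, duplicate_first_row(SAMPLE, "Alice", "Bob") returns the sample with a second row in which "Alice" is replaced by "Bob".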
