So I am starting to use Python's docx library. I create a table with multiple rows and only 2 columns; it looks like this:
Now, I would like the text in those cells to be centered horizontally. How can I do this? I've searched through docx API documentation but I only saw information about aligning paragraphs.
You can do this by setting the alignment as you create the cells.
from docx import Document
from docx.enum.text import WD_ALIGN_PARAGRAPH
doc = Document()
table = doc.add_table(rows=0, cols=2)  # the keyword is cols, not columns
row = table.add_row().cells
p = row[0].add_paragraph('left justified text')
p.alignment = WD_ALIGN_PARAGRAPH.LEFT
p = row[1].add_paragraph('right justified text')
p.alignment = WD_ALIGN_PARAGRAPH.RIGHT
code by: bnlawrence
and to align text to the center just change:
p.alignment=WD_ALIGN_PARAGRAPH.CENTER
solution found here: Modify the alignment of cells in a table
Well, it seems that adding a paragraph works, but (oh, really?) it adds a new paragraph, so in my case it wasn't an option. You could instead set the text of the existing cell and then change its paragraph's alignment:
row[0].text = "hey, beauty"
p = row[0].paragraphs[0]
p.alignment = docx.enum.text.WD_ALIGN_PARAGRAPH.CENTER
Actually, in the top answer this first "docx.enum.text" was missing :)
The most reliable way that I have found for setting the alignment of a table cell (or really any text property) is through styles. Define a style for center-aligned text in your document stub, either programmatically or through the Word UI. Then it just becomes a matter of applying the style to your text.
If you create the cell by setting its text property, you can just do
for col in table.columns:
    for cell in col.cells:
        cell.paragraphs[0].style = 'My Center Aligned Style'
If you have more advanced contents, you will have to add another loop to your function:
for col in table.columns:
    for cell in col.cells:
        for par in cell.paragraphs:
            par.style = 'My Center Aligned Style'
You can easily stick this code into a function that will accept a table object and a style name, and format the whole thing.
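A minimal sketch of such a function, assuming the named style already exists in the document. The name apply_paragraph_style is hypothetical and not part of python-docx; the function relies only on the columns, cells, paragraphs and style attributes used in the loops above.

```python
def apply_paragraph_style(table, style_name):
    """Apply style_name to every paragraph in every cell of table.

    Hypothetical helper: `table` is expected to behave like a python-docx
    Table (columns -> cells -> paragraphs). Returns the number of
    paragraphs that were restyled.
    """
    count = 0
    for col in table.columns:
        for cell in col.cells:
            for par in cell.paragraphs:
                par.style = style_name
                count += 1
    return count
```

With a real document this would be called as apply_paragraph_style(table, 'My Center Aligned Style') before saving.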
In my case I used this.
from docx.enum.text import WD_ALIGN_PARAGRAPH
from docx.shared import Pt
def addCellText(row_cells, index, text):
    row_cells[index].text = str(text)
    paragraph = row_cells[index].paragraphs[0]
    paragraph.alignment = WD_ALIGN_PARAGRAPH.LEFT
    font = paragraph.runs[0].font
    font.size = Pt(10)
def addCellTextRight(row_cells, index, text):
    row_cells[index].text = str(text)
    paragraph = row_cells[index].paragraphs[0]
    paragraph.alignment = WD_ALIGN_PARAGRAPH.RIGHT
    font = paragraph.runs[0].font
    font.size = Pt(10)
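The two helpers above differ only in the alignment constant, so they could be folded into one. The following is a sketch under that assumption; set_cell_text and its parameters are hypothetical names, with alignment expected to be a WD_ALIGN_PARAGRAPH member and font_size a docx.shared.Pt value.

```python
def set_cell_text(row_cells, index, text, alignment=None, font_size=None):
    """Set a cell's text, then format the paragraph that holds it.

    Hypothetical merge of addCellText / addCellTextRight. Both
    `alignment` and `font_size` are optional and are left untouched
    when they are None.
    """
    cell = row_cells[index]
    cell.text = str(text)              # replaces the existing cell content
    paragraph = cell.paragraphs[0]
    if alignment is not None:
        paragraph.alignment = alignment
    if font_size is not None:
        for run in paragraph.runs:     # apply the size to every run
            run.font.size = font_size
    return paragraph
```

For example, set_cell_text(row_cells, 1, total, WD_ALIGN_PARAGRAPH.RIGHT, Pt(10)) would stand in for addCellTextRight(row_cells, 1, total).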
For total alignment to center I use this code:
from docx.enum.text import WD_ALIGN_PARAGRAPH
from docx.enum.table import WD_ALIGN_VERTICAL
for row in table.rows:
    for cell in row.cells:
        cell.paragraphs[0].alignment = WD_ALIGN_PARAGRAPH.CENTER
        cell.vertical_alignment = WD_ALIGN_VERTICAL.CENTER
from docx.enum.table import WD_TABLE_ALIGNMENT
table = document.add_table(3, 3)
table.alignment = WD_TABLE_ALIGNMENT.CENTER
For details, see:
http://python-docx.readthedocs.io/en/latest/api/enum/WdRowAlignment.html
Goal
I am trying to add a text to a table cell where the text is a combination of 2 strings and the space between the strings of variable size so that the final text has the same length and it appears as if the second string is right aligned.
I can either use format or ljust to combine the strings in python.
period = "from Monday to Friday"
item_text = "Some txt"
item_text2 = "Some other txt"
t1 = "t1: {:<30}{:0}".format(item_text,period)
t2 = "t2: {:<30}{:0}".format(item_text2,period)
t3 = f"t3: {item_text.ljust(30)}{period}"
t4 = f"t4: {item_text2.ljust(30)}{period}"
from pprint import pprint
pprint(t1)
pprint(t2)
pprint(t3)
pprint(t4)
Text in python with variable space length between strings
However, if I add this text to a docx table, the space between the strings changes.
from docx import Document
from docx.shared import Cm
doc = Document()
# Creating a table object
table = doc.add_table(rows=2, cols=2, style="Table Grid")
table.rows[0].cells[0].text = f"{item_text.ljust(30)}{period}"
table.rows[1].cells[0].text = f"{item_text2.ljust(30)}{period}"
def set_col_widths(table):
    widths = tuple(Cm(val) for val in [15, 8])
    for row in table.rows:
        for idx, width in enumerate(widths):
            row.cells[idx].width = width
set_col_widths(table)
doc.save("test_whitespace.docx")
Text in word. Space between strings changed.
Note
I am aware that I could nest a table inside the table cell, left-align the left string and right-align the right one, but that seems like far more code to write.
Question
Why does the spacing change in the Word document, and how can I build the text differently to reach the desired goal?
I am looking to iterate through sentences/paragraphs within cells of a docx table, performing functions depending on their style tags using the pywin32 module.
I can manually select the cell using
cell = table.Cell(Row = 1, Column =2)
I tried something like for x in cell: #do something, but it fails with:
<class 'win32com.client.CDispatch'> objects 'do not support enumeration'
I looked through the Word OM reference for a solution, but to no avail (I understand it is written for VBA, but it can still be very useful).
Here is a simple example that reads the content of the first row / first column of the first table in a document and prints it word by word:
import win32com.client as win32
import os
wordApp = win32.gencache.EnsureDispatch("Word.Application")
wordApp.Visible = False
doc = wordApp.Documents.Open(os.getcwd() + "\\Test.docx")
table = doc.Tables(1)
for word in table.Cell(Row = 1, Column = 1).Range.Text.split():
    print(word)
wordApp.Application.Quit(-1)
The cell's content is just a string, so you could just as easily split it into paragraphs using split('\r') or into sentences using split('.').
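As a sketch of that splitting: text read from a Word table cell normally ends with an end-of-cell marker (a carriage return followed by the character \x07), so it helps to strip that first. split_cell_text is a hypothetical helper name, and splitting sentences on '.' is deliberately naive (it will break on abbreviations and decimals).

```python
def split_cell_text(raw_text):
    r"""Split the text of a Word table cell into paragraphs and sentences.

    Assumes `raw_text` comes from table.Cell(...).Range.Text, where
    paragraphs are separated by '\r' and the cell ends with '\r\x07'.
    """
    # Drop the end-of-cell marker and any trailing paragraph marks.
    text = raw_text.rstrip('\r\x07')
    paragraphs = [p for p in text.split('\r') if p.strip()]
    # Naive sentence split: every '.' is treated as the end of a sentence.
    sentences = [
        [s.strip() for s in p.split('.') if s.strip()]
        for p in paragraphs
    ]
    return paragraphs, sentences
```

In the example above, it would be called as split_cell_text(table.Cell(Row=1, Column=1).Range.Text).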
I am setting up a final slide with a table of contents containing all slide titles and their slide numbers from the presentation.
In this case I work with python-pptx 0.6.18 and Python 3.7. So far I've managed to separate the title and page number with a tab character; however, I don't know where to look to set the tab spacing for that character.
from pptx import Presentation
from pptx.util import Inches, Cm, Pt
path_to_presentation = 'your/path/to/file.pptx'
prs = Presentation(path_to_presentation)
list_of_titles = []
list_of_slide_pages = []
...
# some code populating both above mentioned lists
...
# create new slide
tslide_layout = prs.slide_layouts[1]
toc_slide = prs.slides.add_slide(tslide_layout)
# add content to TOC slide
toc_slide.shapes.title.text = 'Table of contents'
for numer, title in enumerate(list_of_titles):
    paragraph = toc_slide.shapes[1].text_frame.paragraphs[numer]
    paragraph.text = title+'\t'+str(list_of_slide_pages[numer])
    paragraph.level = 0
    paragraph.runs[0].font.size = Pt(18)
    toc_slide.shapes[1].text_frame.add_paragraph()
# save presentation
prs.save('your/path/to/file_with_TOC.pptx')
I am looking for parameters to set distance and alignment for tab stops in this shape/text_frame/paragraph or any other trick to elegantly bypass these parameters in a different way giving the desired final result. Any help or advice will be appreciated.
I am using the python docx library to manipulate a Word document. However, I can't find how to center a line in that library's documentation, and I can't find it through Google either.
from docx import Document
document = Document()
p = document.add_paragraph('A plain paragraph having some ')
p.add_run('bold').bold = True
p.add_run(' and some ')
p.add_run('italic.').italic = True
How can I align the text in docx?
With the new version of python-docx 0.7 https://github.com/python-openxml/python-docx/commit/158f2121bcd2c58b258dec1b83f8fef15316de19
Add feature #51: Paragraph.alignment (read/write)
Now it is possible to align a paragraph as here: http://python-docx.readthedocs.org/en/latest/dev/analysis/features/par-alignment.html
paragraph = document.add_paragraph("This is a text")
paragraph.alignment = 0 # for left, 1 for center, 2 right, 3 justify ....
edit from comments
actually it is 0 for left, 1 for center, 2 for right
edit 2 from comments
You shouldn't hard code magic numbers like this. Use WD_ALIGN_PARAGRAPH.CENTER to get the correct value for centering, etc. To do this use the following import
from docx.enum.text import WD_ALIGN_PARAGRAPH
p = document.add_paragraph('A plain paragraph having some ', style='BodyText', breakbefore=False, jc='left')
# @param string jc: Paragraph alignment, possible values: left, center, right, both (justified), ...
For reference, see the docstring of def paragraph in the library's source. Note that these keyword arguments belong to the legacy docx module; add_paragraph in current python-docx accepts only text and style.
Using Python, I need to find all substrings in a given Excel sheet cell that are either bold or italic.
My problem is similar to this:
Using XLRD module and Python to determine cell font style (italics or not)
..but the solution is not applicable for me as I cannot assume that the same formatting holds for all content in the cell. The value in a single cell can look like this:
1. Some bold text Some normal text. Some italic text.
Is there a way to find the formatting of a range of characters in a cell using xlrd (or any other Python Excel module)?
Thanks to @Vyassa for all of the right pointers, I've been able to write the following code, which iterates over the rows in an XLS file and outputs style information for cells with "single" style information (e.g., the whole cell is italic) or style "segments" (e.g., part of the cell is italic, part of it is not).
import xlrd
# accessing Column 'C' in this example
COL_IDX = 2
book = xlrd.open_workbook('your-file.xls', formatting_info=True)
first_sheet = book.sheet_by_index(0)
for row_idx in range(first_sheet.nrows):
    text_cell = first_sheet.cell_value(row_idx, COL_IDX)
    text_cell_xf = book.xf_list[first_sheet.cell_xf_index(row_idx, COL_IDX)]
    # skip rows where cell is empty
    if not text_cell:
        continue
    print(text_cell, end=' ')
    text_cell_runlist = first_sheet.rich_text_runlist_map.get((row_idx, COL_IDX))
    if text_cell_runlist:
        print('(cell multi style) SEGMENTS:')
        segments = []
        for segment_idx in range(len(text_cell_runlist)):
            start = text_cell_runlist[segment_idx][0]
            # the last segment starts at given 'start' and ends at the end of the string
            end = None
            if segment_idx != len(text_cell_runlist) - 1:
                end = text_cell_runlist[segment_idx + 1][0]
            segment_text = text_cell[start:end]
            segments.append({
                'text': segment_text,
                'font': book.font_list[text_cell_runlist[segment_idx][1]]
            })
        # segments did not start at beginning, assume cell starts with text styled as the cell
        if text_cell_runlist[0][0] != 0:
            segments.insert(0, {
                'text': text_cell[:text_cell_runlist[0][0]],
                'font': book.font_list[text_cell_xf.font_index]
            })
        for segment in segments:
            print(segment['text'], end=' ')
            print('italic:', segment['font'].italic, end=' ')
            print('bold:', segment['font'].bold)
    else:
        print('(cell single style)', end=' ')
        print('italic:', book.font_list[text_cell_xf.font_index].italic, end=' ')
        print('bold:', book.font_list[text_cell_xf.font_index].bold)
xlrd can do this. You must call open_workbook() with the kwarg formatting_info=True; sheet objects will then have an attribute rich_text_runlist_map, which is a dictionary mapping cell coordinates ((row, col) tuples) to a runlist for that cell. A runlist is a sequence of (offset, font_index) pairs, where offset tells you where in the cell the font begins and font_index indexes into the workbook object's font_list attribute (the workbook object is what open_workbook() returns), which gives you a Font object describing the properties of the font, including bold, italic, typeface, size, etc.
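To make the runlist layout concrete, here is a small sketch that turns a cell's text and its runlist into (text, font_index) segments. It works on plain Python data, so it makes no xlrd calls itself; split_rich_text and cell_font_index are hypothetical names, and the font for any text before the first run is assumed to be the cell's own font.

```python
def split_rich_text(text, runlist, cell_font_index):
    """Return (segment_text, font_index) pairs for one cell.

    `runlist` is a sequence of (offset, font_index) pairs;
    `cell_font_index` is the font index of the cell's own format,
    used for any text that precedes the first run.
    """
    if not runlist:
        # No rich text: the whole cell uses the cell's font.
        return [(text, cell_font_index)]
    segments = []
    first_offset = runlist[0][0]
    if first_offset > 0:
        segments.append((text[:first_offset], cell_font_index))
    for i, (start, font_index) in enumerate(runlist):
        # Each run extends to the next run's offset, or to the end of the text.
        end = runlist[i + 1][0] if i + 1 < len(runlist) else None
        segments.append((text[start:end], font_index))
    return segments
```

Each returned font index can then be looked up in the workbook's font_list to read attributes such as bold or italic.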
I don't know if you can do that with xlrd, but since you ask about any other Python Excel module: openpyxl cannot do this in version 1.6.1.
The rich text gets reconstructed away in the function get_string() in openpyxl/reader/strings.py. It would be relatively easy to set up a second table with 'raw' strings in that module.