Python docx library text align - python

I am using the python-docx library to manipulate a Word document. However, I can't find how to center-align a line in that library's documentation, and I couldn't find it through Google either.
from docx import Document
document = Document()
p = document.add_paragraph('A plain paragraph having some ')
p.add_run('bold').bold = True
p.add_run(' and some ')
p.add_run('italic.').italic = True
How can I align the text in docx?

With the new version of python-docx 0.7 https://github.com/python-openxml/python-docx/commit/158f2121bcd2c58b258dec1b83f8fef15316de19
Add feature #51: Paragraph.alignment (read/write)
Now it is possible to align a paragraph as here: http://python-docx.readthedocs.org/en/latest/dev/analysis/features/par-alignment.html
paragraph = document.add_paragraph("This is a text")
paragraph.alignment = 0 # for left, 1 for center, 2 right, 3 justify ....
edit from comments
actually it is 0 for left, 1 for center, 2 for right
edit 2 from comments
You shouldn't hard-code magic numbers like this. Use WD_ALIGN_PARAGRAPH.CENTER to get the correct value for centering, etc. To do this, use the following import:
from docx.enum.text import WD_ALIGN_PARAGRAPH

# param string jc: paragraph alignment; possible values: left, center, right, both (justified), ...
p = document.add_paragraph('A plain paragraph having some ', style='BodyText', breakbefore=False, jc='left')
For reference, see def paragraph in the documentation.

Related

MS Word Manipulation using Python

I have an MS Word document and I want to apply the following settings automatically using Python:
1.Font type = Trebuchet MS
2.Font Size = 11
3.Table and Appendices font Size = 10
4.Line spacing = multiple of 1.15
5.Space before paragraph = 0
6.Space after paragraph = 0
7.Paragraph should be Justified
8.Number should be aligned to the bottom right corner of document
9.Text in the table should be Justified
10.Insert footer and header automatically
11.Tool should ensure that a document with a single page shall not have a page number.
12.Tool should ensure that For documents exceeding one page, page numbers shall be inserted at the right hand side of the document.
13.Tool should ensure that the cover page shall not be assigned a page number.
14.Tool should ensure that Roman numbers (i, ii, iii. .. ) used only on the preliminary pages including table of contents, preface, abbreviations, list of tables, executive summary
15.Tool should ensure that Arabic numbers (1, 2, 3 ...) used only for main text of the report and appendices.
16.Tool should ensure that year in any document is written in full for the preceding year and two last digits for the current year. For example; instead of writing 2020/2021, write 2020/21
17.Tool should ensure that only English United Kingdom vocabularies are used and NOT English United States. Example: "analyse -English UK" vs "analyze -English US".
18.Tool should ensure that Numbers presented in a paragraph that are less than 10 should be written in words (one, two, three ...).
19.Tool should ensure that Numbers presented in a paragraph For 10 and above, they should be written in numerals.
20.Tool should ensure that Numbers within the table must be expressed in numerals even if they are less than 10.
This is my code so far
from docx import Document
from docx.shared import Pt
from docx.shared import Inches
from docx.enum.text import WD_ALIGN_PARAGRAPH
from docx.shared import Length
path = 'C:\\Users\\Gaston\\Documents\\Words\\test.docx'
doc = Document(path)
style = doc.styles['Normal']
font = style.font
font.name = 'Trebuchet MS'
font.size = Pt(11)
paragraph = doc.add_paragraph()
paragraph_format = paragraph.paragraph_format
paragraph_format.alignment = WD_ALIGN_PARAGRAPH.JUSTIFY
paragraph_format.right_indent = Inches(1)
paragraph_format.space_before = Pt(0)
paragraph_format.space_after = Pt(0)
paragraph_format.line_spacing = 1.15  # a float is read as a multiple of single line spacing; Length(1.15) would be a fixed height
doc.save(path)

Accessing additional paragraph-style properties with python-docx

I am trying to parse a Word document using python-docx, but have trouble getting the correct styles of paragraphs. I have uploaded a simplified version of the file to Dropbox.
The document's 'Normal' style uses the 'Garamond' font, but this is overridden so that everywhere I click in the file, the font is 'Calibri (Body)'.
When I use the 'Style inspector' in Word on the first line, it shows: "Paragraph formatting" is Normal + Plus: Centered, Left: 0 cm, Before: 0 pt, and "Text level formatting" is Default Paragraph Font + Plus: +Body (Calibri), 14 pt, Bold, Underline.
When I do the same on a non-bold text in the table, I get: "Paragraph formatting" is Normal + Plus: +Body (Calibri), Before: 0 pt, and "Text level formatting" is Default Paragraph Font + Plus: <none>.
That is, the font is changed on different levels inside and outside of the table. In both cases, however, I do not know how to get this info using python-docx:
import docx
doc = docx.Document('test.docx')
par = doc.paragraphs[0]
#par = doc.tables[0].cell(0,1).paragraphs[0]
print(f"'{par.style.name}'")
print(f"'{par.style.font.name}'")
print(f"'{par.runs[0].font.name}'")
print(f"'{par.runs[0].style.name}'")
print(f"'{par.runs[0].style.font.name}'")
c = doc.tables[0].cell(1,0)
for par in c.paragraphs:
    print(f"{len(par.runs)}", end=' ')
c.paragraphs[0].add_run('Very short summary')
doc.save('test_ed.docx')
returns
'Normal'
'Garamond'
'None'
'Default Paragraph Font'
'None'
1 0 0 0 0 0 0 0 0 1
In other words, I do not see any sign that the document actually uses the Calibri font.
It returns exactly the same if I use the second par definition (from the table).
Moreover, looking at the resulting test_ed.docx, the added line uses 'Garamond', even though Word shows the other empty paragraphs as using 'Calibri (Body)'.
So, my question is how to detect the actual format of the text and how to copy it to new paragraphs?

How to create a text shape with python pptx?

I want to add a text box to a presentation with python-pptx. I would like to add a text box with several paragraphs in a specific place and then format it (fonts, color, etc.). But since a text shape object always starts out with one paragraph, I cannot edit the first of my paragraphs. The code sample looks like this:
txBox = slide.shapes.add_textbox(left, top, width, height)
tf = txBox.text_frame
p = tf.add_paragraph()
p.text = "This is a first paragraph"
p.font.size = Pt(11)
p = tf.add_paragraph()
p.text = "This is a second paragraph"
p.font.size = Pt(11)
Which creates output like this:
I can add text to this first line with tf.text = "This is text inside a textbox", but then it won't be editable in terms of fonts or colors. Is there any way to omit or edit that line, so that all paragraphs in the box are the same?
Access the first paragraph differently, using:
p = tf.paragraphs[0]
Then you can add runs, set fonts and all the rest of it just like with a paragraph you get back from tf.add_paragraph().

python wrap text and reportlab

I have a little code and I would like to wrap my long string in every 10th character and then add it into a PDF using reportlab:
This is how I try:
text = '*long_text_long_text_long_text_long_text*'
text = "\n".join(wrap(text, 10))
canvas.drawString(5,227, text)
My pdf was created but where I want to break the lines I can only see black rectangles. You can see the attached picture:
Can you help me? Thank you!
drawString draws a single line, so you will need to adjust the coordinate for each line in a loop.
y = 227
for line in wrap(text, 10):
    canvas.drawString(5, y, line)
    y -= 15  # move down the page: by default the origin is at the bottom-left
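The position arithmetic can also be separated from the drawing, which makes it easy to check without generating a PDF. The sketch below assumes the default coordinate system with the origin at the bottom-left, so y decreases from one line to the next; the helper name line_positions and the 15-point line distance are illustrative choices.

```python
from textwrap import wrap

def line_positions(text, width, x, y_top, leading):
    """Return one (x, y, line) tuple per wrapped line, from top to bottom."""
    return [(x, y_top - i * leading, line)
            for i, line in enumerate(wrap(text, width))]

text = '*long_text_long_text_long_text_long_text*'
positions = line_positions(text, 10, 5, 227, 15)
# Each tuple can then be passed on as canvas.drawString(x, y, line)
```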
An alternative to placing each line individually is using Paragraph:
from reportlab.lib.styles import getSampleStyleSheet
from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import A5
from reportlab.platypus import Paragraph
text = "long text<br />long text<br />long text<br />"
text_width=A5[0] / 2
text_height=A5[1] / 2
x = A5[0]/4
y = A5[1]/4
pdf = canvas.Canvas(filename="test.pdf", pagesize=A5)
styles = getSampleStyleSheet()
p = Paragraph(text, styles["Normal"])
p.wrapOn(pdf, text_width, text_height)
p.drawOn(pdf, x, y)
pdf.save()
In addition to supporting manually put line breaks, the Paragraph also supports automatic line breaks.

docx center text in table cells

So I am starting to use Python's docx library. I created a table with multiple rows and only 2 columns; it looks like this:
Now, I would like the text in those cells to be centered horizontally. How can I do this? I've searched through docx API documentation but I only saw information about aligning paragraphs.
You can do this by setting the alignment as you create the cells:
from docx import Document
from docx.enum.text import WD_ALIGN_PARAGRAPH
doc = Document()
table = doc.add_table(rows=0, cols=2)  # the keyword is cols, not columns
row = table.add_row().cells
p = row[0].add_paragraph('left justified text')
p.alignment = WD_ALIGN_PARAGRAPH.LEFT
p = row[1].add_paragraph('right justified text')
p.alignment = WD_ALIGN_PARAGRAPH.RIGHT
code by: bnlawrence
and to align text to the center just change:
p.alignment=WD_ALIGN_PARAGRAPH.CENTER
solution found here: Modify the alignment of cells in a table
Well, it seems that adding a paragraph works, but (oh, really?) it adds a new paragraph, so in my case it wasn't an option. You could instead set the text of the existing cell and then change the alignment of its paragraph:
row[0].text = "hey, beauty"
p = row[0].paragraphs[0]
p.alignment = docx.enum.text.WD_ALIGN_PARAGRAPH.CENTER
Actually, in the top answer this first "docx.enum.text" was missing :)
The most reliable way that I have found for setting the alignment of a table cell (or really any text property) is through styles. Define a style for center-aligned text in your document stub, either programmatically or through the Word UI. Then it just becomes a matter of applying the style to your text.
If you create the cell by setting its text property, you can just do
for col in table.columns:
    for cell in col.cells:
        cell.paragraphs[0].style = 'My Center Aligned Style'
If you have more advanced contents, you will have to add another loop to your function:
for col in table.columns:
    for cell in col.cells:
        for par in cell.paragraphs:
            par.style = 'My Center Aligned Style'
You can easily stick this code into a function that will accept a table object and a style name, and format the whole thing.
In my case I used this.
from docx.enum.text import WD_ALIGN_PARAGRAPH
from docx.shared import Pt
def addCellText(row_cells, index, text):
    row_cells[index].text = str(text)
    paragraph = row_cells[index].paragraphs[0]
    paragraph.alignment = WD_ALIGN_PARAGRAPH.LEFT
    font = paragraph.runs[0].font
    font.size = Pt(10)
def addCellTextRight(row_cells, index, text):
    row_cells[index].text = str(text)
    paragraph = row_cells[index].paragraphs[0]
    paragraph.alignment = WD_ALIGN_PARAGRAPH.RIGHT
    font = paragraph.runs[0].font
    font.size = Pt(10)
For total alignment to center I use this code:
from docx.enum.text import WD_ALIGN_PARAGRAPH
from docx.enum.table import WD_ALIGN_VERTICAL
for row in table.rows:
    for cell in row.cells:
        cell.paragraphs[0].alignment = WD_ALIGN_PARAGRAPH.CENTER
        cell.vertical_alignment = WD_ALIGN_VERTICAL.CENTER
from docx.enum.table import WD_TABLE_ALIGNMENT
table = document.add_table(3, 3)
table.alignment = WD_TABLE_ALIGNMENT.CENTER
For details, see:
http://python-docx.readthedocs.io/en/latest/api/enum/WdRowAlignment.html
