How to create a text shape with python pptx? - python

I want to add a text box to a presentation with python pptx. I would like to add a text box with several paragraphs in the specific place and then format it (fonts, color, etc.). But since text shape object always comes with the one paragraph in the beginning, I cannot edit first of my paragraphs. The code sample looks like this:
txBox = slide.shapes.add_textbox(left, top, width, height)
tf = txBox.text_frame
p = tf.add_paragraph()
p.text = "This is a first paragraph"
p.font.size = Pt(11)
p = tf.add_paragraph()
p.text = "This is a second paragraph"
p.font.size = Pt(11)
Which creates output like this:
I can add text to this first line with tf.text = "This is text inside a textbox", but it won't be editable in terms of fonts or colors. So is there any way how I can omit or edit that line, so all paragraphs in the box would be the same?

Access the first paragraph differently, using:
p = tf.paragraphs[0]
Then you can add runs, set fonts and all the rest of it just like with a paragraph you get back from tf.add_paragraph().

Related

python-docx trailing trilling whitepsaces not showing correctly

Goal
I am trying to add a text to a table cell where the text is a combination of 2 strings and the space between the strings of variable size so that the final text has the same length and it appears as if the second string is right aligned.
I can either use format or ljust to combine the strings in python.
period = "from Monday to Friday"
item_text = "Some txt"
item_text2 = "Some other txt"
t1 = "t1: {:<30}{:0}".format(item_text,period)
t2 = "t2: {:<30}{:0}".format(item_text2,period)
t3 = f"t3: {item_text.ljust(30)}{period}"
t4 = f"t4: {item_text2.ljust(30)}{period}"
from pprint import pprint
pprint(t1)
pprint(t2)
pprint(t3)
pprint(t4)
Text in python with variable space length between strings
However, if I add this text to a docx table, the space between the strings changes.
from docx import Document
doc = Document()
# Creating a table object
table = doc.add_table(rows=2, cols=2, style="Table Grid")
table.rows[0].cells[0].text = f"{item_text.ljust(30)}{period}"
table.rows[1].cells[0].text = f"{item_text2.ljust(30)}{period}"
def set_col_widths(table):
widths = tuple( Cm(val) for val in [15,8])
for row in table.rows:
for idx, width in enumerate(widths):
row.cells[idx].width = width
set_col_widths(table)
doc.save("test_whitespace.docx")
Text in word. Space between strings changed.
Note
I am aware that I could add a table to the table cell and left adjust the left and right adjust the right but that seems like way more code to write.
Question
Why is the spacing changing in the word document and how can I create the text differently to get the desired goal?

docx center text in table cells

So I am starting to use pythons docx library. Now, I create a table with multiple rows, and only 2 columns, it looks like this:
Now, I would like the text in those cells to be centered horizontally. How can I do this? I've searched through docx API documentation but I only saw information about aligning paragraphs.
There is a code to do this by setting the alignment as you create cells.
doc=Document()
table = doc.add_table(rows=0, columns=2)
row=table.add_row().cells
p=row[0].add_paragraph('left justified text')
p.alignment=WD_ALIGN_PARAGRAPH.LEFT
p=row[1].add_paragraph('right justified text')
p.alignment=WD_ALIGN_PARAGRAPH.RIGHT
code by: bnlawrence
and to align text to the center just change:
p.alignment=WD_ALIGN_PARAGRAPH.CENTER
solution found here: Modify the alignment of cells in a table
Well, it seems that adding a paragraph works, but (oh, really?) it addes a new paragraph -- so in my case it wasn't an option. You could change the value of the existing cell and then change paragraph's alignment:
row[0].text = "hey, beauty"
p = row[0].paragraphs[0]
p.alignment = docx.enum.text.WD_ALIGN_PARAGRAPH.CENTER
Actually, in the top answer this first "docx.enum.text" was missing :)
The most reliable way that I have found for setting he alignment of a table cell (or really any text property) is through styles. Define a style for center-aligned text in your document stub, either programatically or through the Word UI. Then it just becomes a matter of applying the style to your text.
If you create the cell by setting its text property, you can just do
for col in table.columns:
for cell in col.cells:
cell.paragraphs[0].style = 'My Center Aligned Style'
If you have more advanced contents, you will have to add another loop to your function:
for col in table.columns:
for cell in col.cells:
for par in cell.paragraphs:
par.style = 'My Center Aligned Style'
You can easily stick this code into a function that will accept a table object and a style name, and format the whole thing.
In my case I used this.
from docx.enum.text import WD_ALIGN_PARAGRAPH
def addCellText(row_cells, index, text):
row_cells[index].text = str(text)
paragraph=row_cells[index].paragraphs[0]
paragraph.alignment = WD_ALIGN_PARAGRAPH.LEFT
font = paragraph.runs[0].font
font.size= Pt(10)
def addCellTextRight(row_cells, index, text):
row_cells[index].text = str(text)
paragraph=row_cells[index].paragraphs[0]
paragraph.alignment = WD_ALIGN_PARAGRAPH.RIGHT
font = paragraph.runs[0].font
font.size= Pt(10)
For total alignment to center I use this code:
from docx.enum.text import WD_ALIGN_PARAGRAPH
from docx.enum.table import WD_ALIGN_VERTICAL
for row in table.rows:
for cell in row.cells:
cell.paragraphs[0].alignment = WD_ALIGN_PARAGRAPH.CENTER
cell.vertical_alignment = WD_ALIGN_VERTICAL.CENTER
From docx.enum.table import WD_TABLE_ALIGNMENT
table = document.add_table(3, 3)
table.alignment = WD_TABLE_ALIGNMENT.CENTER
For details see a link .
http://python-docx.readthedocs.io/en/latest/api/enum/WdRowAlignment.html

Python pptx (Power Point) Find and replace text (ctrl + H)

Question in Short: How can I use the find and replace option (Ctrl+H) using the Python-pptx module?
Example Code:
from pptx import Presentation
nameOfFile = "NewPowerPoint.pptx" #Replace this with: path name on your computer + name of the new file.
def open_PowerPoint_Presentation(oldFileName, newFileName):
prs = Presentation(oldFileName)
prs.save(newFileName)
open_PowerPoint_Presentation('Template.pptx', nameOfFile)
I have a Power Point document named "Template.pptx". With my Python program I am adding some slides and putting some pictures in them. Once all the pictures are put into the document it saves it as another power point presentation.
The problem is that this "Template.pptx" has all the old week numbers in it, Like "Week 20". I want to make Python find and replace all these word combinations to "Week 25" (for example).
Posting code from my own project because none of the other answers quite managed to hit the mark with strings that have complex text with multiple paragraphs without losing formating:
prs = Presentation('blah.pptx')
# To get shapes in your slides
slides = [slide for slide in prs.slides]
shapes = []
for slide in slides:
for shape in slide.shapes:
shapes.append(shape)
def replace_text(self, replacements: dict, shapes: List):
"""Takes dict of {match: replacement, ... } and replaces all matches.
Currently not implemented for charts or graphics.
"""
for shape in shapes:
for match, replacement in replacements.items():
if shape.has_text_frame:
if (shape.text.find(match)) != -1:
text_frame = shape.text_frame
for paragraph in text_frame.paragraphs:
for run in paragraph.runs:
cur_text = run.text
new_text = cur_text.replace(str(match), str(replacement))
run.text = new_text
if shape.has_table:
for row in shape.table.rows:
for cell in row.cells:
if match in cell.text:
new_text = cell.text.replace(match, replacement)
cell.text = new_text
replace_text({'string to replace': 'replacement text'}, shapes)
For those of you who just want some code to copy and paste into your program that finds and replaces text in a PowerPoint while KEEPING formatting (just like I was,) here you go:
def search_and_replace(search_str, repl_str, input, output):
""""search and replace text in PowerPoint while preserving formatting"""
#Useful Links ;)
#https://stackoverflow.com/questions/37924808/python-pptx-power-point-find-and-replace-text-ctrl-h
#https://stackoverflow.com/questions/45247042/how-to-keep-original-text-formatting-of-text-with-python-powerpoint
from pptx import Presentation
prs = Presentation(input)
for slide in prs.slides:
for shape in slide.shapes:
if shape.has_text_frame:
if(shape.text.find(search_str))!=-1:
text_frame = shape.text_frame
cur_text = text_frame.paragraphs[0].runs[0].text
new_text = cur_text.replace(str(search_str), str(repl_str))
text_frame.paragraphs[0].runs[0].text = new_text
prs.save(output)
The prior is a combination of many answers, but it gets the job done. It simply replaces search_str with repl_str in every occurrence of search_str.
In the scope of this answer, you would use:
search_and_replace('Week 20', 'Week 25', "Template.pptx", "NewPowerPoint.pptx")
Merging responses above and other in a way that worked well for me (PYTHON 3). All the original format was keeped:
from pptx import Presentation
def replace_text(replacements, shapes):
"""Takes dict of {match: replacement, ... } and replaces all matches.
Currently not implemented for charts or graphics.
"""
for shape in shapes:
for match, replacement in replacements.items():
if shape.has_text_frame:
if (shape.text.find(match)) != -1:
text_frame = shape.text_frame
for paragraph in text_frame.paragraphs:
whole_text = "".join(run.text for run in paragraph.runs)
whole_text = whole_text.replace(str(match), str(replacement))
for idx, run in enumerate(paragraph.runs):
if idx != 0:
p = paragraph._p
p.remove(run._r)
if bool(paragraph.runs):
paragraph.runs[0].text = whole_text
if __name__ == '__main__':
prs = Presentation('input.pptx')
# To get shapes in your slides
slides = [slide for slide in prs.slides]
shapes = []
for slide in slides:
for shape in slide.shapes:
shapes.append(shape)
replaces = {
'{{var1}}': 'text 1',
'{{var2}}': 'text 2',
'{{var3}}': 'text 3'
}
replace_text(replaces, shapes)
prs.save('output.pptx')
You would have to visit each slide on each shape and look for a match using the available text features. It might not be pretty because PowerPoint has a habit of splitting runs up into what may seem like odd chunks. It does this to support features like spell checking and so forth, but its behavior there is unpredictable.
So finding the occurrences with things like Shape.text will probably be the easy part. Replacing them without losing any font formatting they have might be more difficult, depending on the particulars of your situation.
I know this question is old, but I have just finished a project that uses python to update a powerpoint daily. Bascially every morning the python script is run and it pulls the data for that day from a database, places the data in the powerpoint, and then executes powerpoint viewer to play the powerpoint.
To asnwer your question, you would have to loop through all the Shapes on the page and check if the string you're searching for is in the shape.text. You can check to see if the shape has text by checking if shape.has_text_frame is true. This avoids errors.
Here is where things get trickey. If you were to just replace the string in shape.text with the text you want to insert, you will probably loose formatting. shape.text is actually a concatination of all the text in the shape. That text may be split into lots of 'runs', and all of those runs may have different formatting that will be lost if you write over shape.text or replace part of the string.
On the slide you have shapes, and shapes can have a text_frame, and text_frames have paragraphs (atleast one. always. even when its blank), and paragraphs can have runs. Any level can have formatting, and you have no way of determining how many runs your string is split over.
In my case I made sure that any string that was going to be replaced was in its own shape. You still have to drill all the way down to the run and set the text there so that all formatting would be preserved. Also, the string you match in shape.text may actually be spread across multiple runs, so when setting the text in the first run, I also set the text in all other runs in that paragraph to blank.
random code snippit:
from pptx import Presentation
testString = '{{thingToReplace}}'
replaceString = 'this will be inserted'
ppt = Presentation('somepptxfile.pptx')
def replaceText(shape, string,replaceString):
#this is the hard part
#you know the string is in there, but it may be across many runs
for slide in ppt.slides:
for shape in slide.shapes:
if shape.has_text_frame:
if(shape.text.find(testString)!=-1:
replaceText(shape,testString,replaceString)
Sorry if there are any typos. Im at work.....
I encountered a similar issue that the formatted placeholder spreads over multiple run object. I would like to keep the format, so i could not do the replacement in the paragraph level. Finally, i figure out a way to replace the placeholder.
variable_pattern = re.compile("{{(\w+)}}")
def process_shape_with_text(shape, variable_pattern):
if not shape.has_text_frame:
return
whole_paragraph = shape.text
matches = variable_pattern.findall(whole_paragraph)
if len(matches) == 0:
return
is_found = False
for paragraph in shape.text_frame.paragraphs:
for run in paragraph.runs:
matches = variable_pattern.findall(run.text)
if len(matches) == 0:
continue
replace_variable_with(run, data, matches)
is_found = True
if not is_found:
print("Not found the matched variables in the run segment but in the paragraph, target -> %s" % whole_paragraph)
matches = variable_pattern.finditer(whole_paragraph)
space_prefix = re.match("^\s+", whole_paragraph)
match_container = [x for x in matches];
need_modification = {}
for i in range(len(match_container)):
m = match_container[i]
path_recorder = space_prefix.group(0)
(start_0, end_0) = m.span(0)
(start_1, end_1) = m.span(1)
if (i + 1) > len(match_container) - 1 :
right = end_0 + 1
else:
right = match_container[i + 1].start(0)
for paragraph in shape.text_frame.paragraphs:
for run in paragraph.runs:
segment = run.text
path_recorder += segment
if len(path_recorder) >= start_0 + 1 and len(path_recorder) <= right:
print("find it")
if len(path_recorder) <= start_1:
need_modification[run] = run.text.replace('{', '')
elif len(path_recorder) <= end_1:
need_modification[run] = data[m.group(1)]
elif len(path_recorder) <= right:
need_modification[run] = run.text.replace('}', '')
else:
None
if len(need_modification) > 0:
for key, value in need_modification.items():
key.text = value
Since PowerPoint splits the text of a paragraph into seemingly random runs (and on top each run carries its own - possibly different - character formatting) you can not just look for the text in every run, because the text may actually be distributed over a couple of runs and in each of those you'll only find part of the text you are looking for.
Doing it at the paragraph level is possible, but you'll lose all character formatting of that paragraph, which might screw up your presentation quite a bit.
Using the text on paragraph level, doing the replacement and assigning that result to the paragraph's first run while removing the other runs from the paragraph is better, but will change the character formatting of all runs to that of the first one, again screwing around in places, where it shouldn't.
Therefore I wrote a rather comprehensive script that can be installed with
python -m pip install python-pptx-text-replacer
and that creates a command python-pptx-text-replacer that you can use to do those replacements from the command line, or you can use the class TextReplacer in that package in your own Python scripts. It is able to change text in tables, charts and wherever else some text might appear, while preserving any character formatting specified for that text.
Read the README.md at https://github.com/fschaeck/python-pptx-text-replacer for more detailed information on usage. And open an issue there if you got any problems with the code!
Also see my answer at python-pptx - How to replace keyword across multiple runs? for an example of how the script deals with character formatting...
Here's some code that could help. I found it here:
search_str = '{{{old text}}}'
repl_str = 'changed Text'
ppt = Presentation('Presentation1.pptx')
for slide in ppt.slides:
for shape in slide.shapes:
if shape.has_text_frame:
shape.text = shape.text.replace(search_str, repl_str)
ppt.save('Presentation1.pptx')

Changing font for part of text in python-pptx

I'm using the python-pptx module to create presentations.
How can I change the font properties only for a part of the text?
I currently change the font like this:
# first create text for shape
ft = pres.slides[0].shapes[0].text_frame
ft.clear()
p = ft.paragraphs[0]
run = p.add_run()
run.text = "text"
# change font
from pptx.dml.color import RGBColor
from pptx.util import Pt
font = run.font
font.name = 'Lato'
font.size = Pt(32)
font.color.rgb = RGBColor(255, 255, 255)
Thanks!
In PowerPoint, font properties are applied to a run. In a way, it is what defines a run; a "run" of text sharing the same font treatment, including typeface, size, color, bold/italic, etc.
So to make two bits of text look differently, you need to make them separate runs.
As #scanny says in another post. The post
By "different run", it means you should write the text in first color in a run, then afterwards, write the text in the second color in another run as shown below.
There's also an example in the post, I'll just copy and paste it to here
from docx.shared import RGBColor
# ---reuse existing default single paragraph in cell---
paragraph = cell.paragraphs[0]
###(First run)
#---add distinct runs of text one after the other to
# --- form paragraph contents.
paragraph.add_run("A sentence with a ")
###(Second run)
# ---colorize the run with "red" in it.
red_run = paragraph.add_run("red")
red_run.font.color.rgb = RGBColor(255, 0, 0)
###(Third run)
paragraph.add_run(" word.")

How to add space between lines within a single paragraph with Reportlab

I have a block of text that is dynamically pulled from a database and is placed in a PDF before being served to a user. The text is being placed onto a lined background, much like notepad paper. I want to space the text so that only one line of text is between each background line.
I was able to use the following code to create a vertical spacing between paragraphs (used to generate another part of the PDF).
style = getSampleStyleSheet()['Normal']
style.fontName = 'Helvetica'
style.spaceAfter = 15
style.alignment = TA_JUSTIFY
story = [Paragraph(choice.value,style) for choice in chain(context['question1'].itervalues(),context['question2'].itervalues())]
generated_file = StringIO()
frame1 = Frame(50,100,245,240, showBoundary=0)
frame2 = Frame(320,100,245,240, showBoundary=0)
page_template = PageTemplate(frames=[frame1,frame2])
doc = BaseDocTemplate(generated_file,pageTemplates=[page_template])
doc.build(story)
However, this won't work here because I have only a single, large paragraph.
Pretty sure what yo u want to change is the leading. From the user manual in chapter 6.
To get double-spaced text, use a high
leading. If you set
autoLeading(default "off") to
"min"(use observed leading even if
smaller than specified) or "max"(use
the larger of observed and specified)
then an attempt is made to determine
the leading on a line by line basis.
This may be useful if the lines
contain different font sizes etc.
Leading is defined earlier in chapter 2:
Interline spacing (Leading)
The vertical offset between the point
at which one line starts and where the
next starts is called the leading
offset.
So try different values of leading, for example:
style = getSampleStyleSheet()['Normal']
style.leading = 24
Add leading to ParagraphStyle
orden = ParagraphStyle('orden')
orden.leading = 14
orden.borderPadding = 10
orden.backColor=colors.gray
orden.fontSize = 14
Generate PDF
buffer = BytesIO()
p = canvas.Canvas(buffer, pagesize=letter)
text = Paragraph("TEXT Nro 0001", orden)
text.wrapOn(p,500,10)
text.drawOn(p, 45, 200)
p.showPage()
p.save()
pdf = buffer.getvalue()
buffer.close()
The result

Categories