How to set slide number in generated slide using python-pptx - python

I am generating new slides using python-pptx and I see no way to update the slide number. I have set the footer in the master layout, but when I read the shapes of the new slide, that component is not there.
My code:
from pptx import Presentation
prs = Presentation('sample_ppt.pptx')
title_slide_layout = prs.slide_layouts[14]
slide = prs.slides.add_slide(title_slide_layout)
for shape in slide.shapes:
    print(shape.name)
prs.save('hh.pptx')

This is a common confusion, possibly because the way PowerPoint handles footers is, well, confusing :)
The short answer is that you need to put a plain-textbox shape (not a footer placeholder) on the master slide, and insert into that textbox a slide-number field, using Insert > Slide Number from the menu. On the master, it will appear something like <#>, but on the slide that inherits from that master, it will appear as the slide number.

Related

Select slides from pptx using python automation

To print specific slides from a PowerPoint file based on a list of items, we currently use Ctrl+F in PowerPoint: find the first item and note the slide ID, go to the next match and note the second slide ID, and so on, then print using those IDs. This takes a lot of work.
We would therefore like to automate the task with a Python script.
This is my attempt:
from pptx import Presentation
filename = "C:/Users/RElKassah/Desktop/test.pptx"
prs = Presentation(filename)
text="test"
for slide in prs.slides:
    if slide.shapes ==text:
        title = slide.shapes.text.find = 'test'
        print(title)
thank you very much and best regards
Here is a simple loop that finds text inside a pptx file and prints the number of each slide that contains it. Hope it helps.
from pptx import Presentation
filename = 'test.pptx'
prs = Presentation(filename)
text="test"
for slide in prs.slides:
    for shape in slide.shapes:
        if shape.has_text_frame:
            for paragraph in shape.text_frame.paragraphs:
                for run in paragraph.runs:
                    if text in run.text:
                        print(prs.slides.index(slide)+1)

Assign a variable/tag to specific text within a PowerPoint Text Box (in PowerPoint)

I would like to update specific text within a PowerPoint text box by assigning that specific text a variable within PowerPoint that I can then (1) identify in Python, and (2) replace the text based on data in an Excel model.
The ultimate goal is that in the output PowerPoint presentation, we should only see the underlying value of the variable (e.g., 32.4%), not the variable name (e.g., #pct_total_var#). And if changes need to be made, the underlying reference to the variable is kept, such that I can re-run Python to update the presentation.
Note: I have already implemented a "search and replace" type solution based on answers here, and elsewhere. However the issue with that approach is that you need to work with a template PowerPoint and an output PowerPoint, because once the variables (e.g., #pct_total_var#) have been replaced with values from an Excel model, it is no longer possible to identify the variables in the output presentation file in order to update.
Is it possible to assign a variable/tag to text within a PowerPoint Text Box?
Below is the current Python code that I have. I just need to be able to define variables inside PowerPoint. But thought this might help others searching for the same.
from pptx import Presentation
import pandas as pd
# Paths to model, and template deck
model = "path/to/model.xlsx"
template = "path/to/template.pptx"
# Step 1: Get data from model to replace in PowerPoint
df = pd.read_excel(model, sheet_name='UpdateReportPPT')
df = df[['variable_id', 'formatted_value']]
values_to_update = df.set_index('variable_id').T.to_dict('records')[0]
# Step 2: Open template PowerPoint presentation
prs = Presentation(template)
# Step 3.1: Get shapes within each slide
slides = [slide for slide in prs.slides]
shapes = []
for slide in slides:
    for shape in slide.shapes:
        shapes.append(shape)
# Step 3.2: Update variables
for shape in shapes:
    # THIS IS WHERE I'LL IDENTIFY THE VARIABLES STORED IN POWERPOINT & UPDATE BASED ON THE EXCEL MODEL
    pass  # placeholder so the loop body is valid Python
# Step 4: Save to output PowerPoint presentation
prs.save(template)
print('Done.')

How to extract ALL IMAGES and text from all pptx file slides using python?

I'm able to read images from a pptx file, but not all of them. I can't extract images from slides that also have a title or other text. Here is my code; any help is appreciated.
from pptx import Presentation
from pptx.enum.shapes import MSO_SHAPE_TYPE
import glob
import os
import codecs
from PIL import Image
import pytesseract
pytesseract.pytesseract.tesseract_cmd = '/usr/local/Cellar/tesseract/4.1.1/bin/tesseract'
from pytesseract import image_to_string
n=0
def write_image(shape):
    global n
    image = shape.image
    # get image
    image_bytes = image.blob
    # assigning file name, e.g. 'image.jpg'
    image_filename = fname[:-5]+'{:03d}.{}'.format(n, image.ext)
    n += 1
    print(image_filename)
    os.chdir("directory_path/readpptx/images")
    with open(image_filename, 'wb') as f:
        f.write(image_bytes)
    os.chdir("directory_path/readpptx")
def visitor(shape):
    if shape.shape_type == MSO_SHAPE_TYPE.PICTURE:
        write_image(shape)
def iter_picture_shapes(prs1):
    for slide in prs1.slides:
        for shape in slide.shapes:
            visitor(shape)
file = open("directory_path/MyFile.txt","a+")
for each_file in glob.glob("directory_path/*.pptx"):
    fname = os.path.basename(each_file)
    file.write("-------------------"+fname+"----------------------\n")
    prs = Presentation(each_file)
    print("---------------"+fname+"-------------------")
    for slide in prs.slides:
        for shape in slide.shapes:
            if hasattr(shape, "text"):
                print(shape.text)
                file.write(shape.text+"\n")
    iter_picture_shapes(prs)
file.close()
The code above extracts images from slides that have no text or title, but not from slides that do have text or a title.
Try also iterating over slide masters and slide layouts. If there are "background" images that's where they will be. The same for shape in slide.shapes: mechanism works on slide masters and slide layouts; they are a variant of the polymorphic Slide object with the same shape-access semantics.
I don't think your problem is strictly related to the presence of a title or text on the slide. Perhaps those particular slides use a layout that includes some background images. If you open the slide and clicking on the image does not select it (does not give it a bounding box), that indicates it is a background image and resides on the slide layout or possibly the slide master. This is how logos are commonly implemented to show up on every slide.
You may also want to consider iterating over the Notes slide for each slide when it has one, if there is text and/or images in there you are interested in. It is uncommon to find images in the slide notes but PowerPoint supports it.
Another approach is to traverse the underlying .pptx package (as a Zip archive) and extract the images that way.
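The Zip route needs only the standard library. The sketch below assumes the media files sit under `ppt/media/` inside the package, which is where PowerPoint and python-pptx normally put them. That folder can also hold audio or video, and the function name `extract_media` is hypothetical:

```python
import zipfile
from pathlib import Path


def extract_media(pptx_path, out_dir):
    """Copy every file under ppt/media/ in the .pptx archive into out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(pptx_path) as archive:
        for name in archive.namelist():
            if name.startswith("ppt/media/") and not name.endswith("/"):
                target = out_dir / Path(name).name
                target.write_bytes(archive.read(name))
                written.append(target)
    return written
```

This picks up every image in the file regardless of whether it lives on a slide, a layout, a master or a notes page, but it does not tell you which slide an image belongs to.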

How to set a background image in python pptx

I am currently working on a project that creates a PowerPoint file with python-pptx. I am trying to set an image as the background of a slide, but I can't find how to do it in the python-pptx docs. Is it possible to set an image as the background? If so, can someone help me? If not, does anyone know another solution using Python?
Thank you
import os
import fnmatch
from pptx import Presentation
#Create presentation and setting layout as blank (6)
prs = Presentation()
blank_slide_layout = prs.slide_layouts[6]
#Find number of slides to create
#First Method = Count number of images in screenshot files (change path depending on the user)
nbSlide = len(fnmatch.filter(os.listdir("mypath"), '*.jpeg'))
#Loop to create as number of slides as there is report pages
for i in range(nbSlide):
    slide = prs.slides.add_slide(blank_slide_layout)
    #change background with an image of the slide …
    background=slide.background
#Final step = Creation and saving of pptx
prs.save('test.pptx')
Is it possible to set an image as background ... ?
So far, python-pptx offers no direct way to set an image as the background of a slide.
If it is not does anyone know another solution using python ?
You could instead insert the picture into each slide as a regular picture shape, choosing suitable position and width/height parameters:
#Loop to create as number of slides as there is report pages
for i in range(nbSlide):
    slide = prs.slides.add_slide(blank_slide_layout)
    #change background with an image of the slide …
    left = top = 0
    pic = slide.shapes.add_picture('/your_file.jpeg', left-0.1*prs.slide_width, top, height = prs.slide_height)

Unable to delete PowerPoint Slides using Python-pptx

I am trying to delete PowerPoint slides containing a specific keyword using python-pptx. If the keyword is present anywhere in the slide, that slide should be deleted. My code is given below:
from pptx import Presentation
String = 'Macro'
ppt = Presentation('D:\\Shaon\\pptss\\Regional.pptx')
for slide in ppt.slides:
    for shape in slide.shapes:
        if shape.has_text_frame:
            shape.text = String
            slide.delete(slide)
ppt.save('BODd.pptx')
After execution I am getting a memory error. No clue how to resolve this issue. How can I delete ppt slides using some specific keywords?
You can delete a slide with a specific index value with the following code using the pptx library:
from pptx import Presentation
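# create slides ------
presentation = Presentation('new.pptx')
index = 0  # zero-based position of the slide to delete
xml_slides = presentation.slides._sldIdLst  # private attribute holding the slide id list
slides = list(xml_slides)
xml_slides.remove(slides[index])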
So to delete the first slide, index would be 0.
It is possible to delete all slides using the following code. Run it before generating your slides to start from a clean, empty PowerPoint file. By changing the index, you can also delete specific slides.
import os
import pptx.util
from pptx import Presentation
cwd = os.getcwd()
prs = Presentation(cwd + '\\ppt.pptx')
for i in range(len(prs.slides)-1, -1, -1):
    rId = prs.slides._sldIdLst[i].rId
    prs.part.drop_rel(rId)
    del prs.slides._sldIdLst[i]
I was trying to delete all slides except the cover from a pptx file in order to reuse its layouts, and the best solution I found was to loop over Ferhat's answer.
It worked :)
# Ferhat showed us how to list the slides:
xml_slides = prs.slides._sldIdLst
slides = list(xml_slides)
# Then I loop over all of them except the first (index 0):
for index in range(1,len(slides)):
    xml_slides.remove(slides[index])
