Unable to delete PowerPoint Slides using Python-pptx

I am trying to delete PowerPoint slides that contain a specific keyword using python-pptx. If the keyword appears anywhere in a slide, that slide should be deleted. My code is given below:
from pptx import Presentation
String = 'Macro'
ppt = Presentation('D:\\Shaon\\pptss\\Regional.pptx')
for slide in ppt.slides:
    for shape in slide.shapes:
        if shape.has_text_frame:
            shape.text = String
            slide.delete(slide)
ppt.save('BODd.pptx')
After execution I get a memory error and have no idea how to resolve it. How can I delete slides that contain a specific keyword?

You can delete the slide at a specific index with the following code using the python-pptx library:
from pptx import Presentation
# create slides ------
presentation = Presentation('new.pptx')
index = 0  # position of the slide to delete
xml_slides = presentation.slides._sldIdLst
slides = list(xml_slides)
xml_slides.remove(slides[index])
So to delete the first slide, index would be 0.

You can delete all slides with the following code. Run it before generating new slides to start from a clean, empty PowerPoint file. By changing the index, you can also delete specific slides.
import os
import pptx.util
from pptx import Presentation
cwd = os.getcwd()
prs = Presentation(cwd + '\\ppt.pptx')
for i in range(len(prs.slides)-1, -1, -1):
    rId = prs.slides._sldIdLst[i].rId
    prs.part.drop_rel(rId)
    del prs.slides._sldIdLst[i]

I was trying to delete all slides except the cover from a pptx file so I could reuse the layouts, and the best solution I found was to loop over Ferhat's answer.
It worked :)
# Ferhat's answer showed how to list the slides:
xml_slides = prs.slides._sldIdLst
slides = list(xml_slides)
# Then I loop over all except the first (index 0):
for index in range(1, len(slides)):
    xml_slides.remove(slides[index])
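Putting the answers above together for the original question, here is a minimal sketch that deletes slides by keyword. It assumes python-pptx is installed and that the keyword sits in ordinary text frames (text inside tables or grouped shapes is not checked, and the match is case-sensitive). The helper names `slide_contains` and `delete_slides_containing` are made up for illustration; `_sldIdLst` and `drop_rel` are the same internals used in the answers above, so they may change between library versions.

```python
def slide_contains(slide, keyword):
    """Return True if any text frame on the slide contains keyword."""
    for shape in slide.shapes:
        if shape.has_text_frame and keyword in shape.text_frame.text:
            return True
    return False


def delete_slides_containing(prs, keyword):
    """Remove every slide whose text contains keyword; return how many were removed."""
    sld_id_lst = prs.slides._sldIdLst  # private API, as in the answers above
    removed = 0
    # Go from the last slide to the first so deletions do not shift pending indices.
    for i in range(len(sld_id_lst) - 1, -1, -1):
        if slide_contains(prs.slides[i], keyword):
            prs.part.drop_rel(sld_id_lst[i].rId)  # drop the slide relationship
            del sld_id_lst[i]  # drop the slide from the deck order
            removed += 1
    return removed


# Usage (file names are placeholders):
# from pptx import Presentation
# prs = Presentation('Regional.pptx')
# delete_slides_containing(prs, 'Macro')
# prs.save('BODd.pptx')
```

Unlike the loop in the question, this only reads the text (it never assigns to `shape.text`) and does not modify `ppt.slides` while iterating over it directly.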

Related

Select slides from pptx using python automation

Dears,
To print specific slides from a presentation based on a list of items, we currently use PowerPoint's Ctrl+F search: find an item, note the slide ID, go to the next match, note the second slide ID, and so on, then print using those IDs. This takes a lot of work.
Based on the above, we would like to automate this task with a Python script.
This is my attempt:
from pptx import Presentation
filename = "C:/Users/RElKassah/Desktop/test.pptx"
prs = Presentation(filename)
text="test"
for slide in prs.slides:
    if slide.shapes ==text:
        title = slide.shapes.text.find = 'test'
        print(title)
thank you very much and best regards
Here is a simple loop that finds text inside a pptx file and prints the number of each slide that contains it. Hope it helps.
from pptx import Presentation
filename = 'test.pptx'
prs = Presentation(filename)
text="test"
for slide in prs.slides:
    for shape in slide.shapes:
        if shape.has_text_frame:
            for paragraph in shape.text_frame.paragraphs:
                for run in paragraph.runs:
                    if text in run.text:
                        print(prs.slides.index(slide)+1)
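The loop above prints a slide number once per matching run, and it can miss a keyword that PowerPoint has stored across two adjacent runs. A minimal sketch of a variant that returns each matching slide number once and compares against the joined text of each paragraph; the function names `slide_has_text` and `find_slide_numbers` are hypothetical, and only ordinary text frames are searched (not tables or grouped shapes).

```python
def slide_has_text(slide, text):
    """Return True if any paragraph on the slide contains text."""
    for shape in slide.shapes:
        if not shape.has_text_frame:
            continue
        for paragraph in shape.text_frame.paragraphs:
            # Join the runs so a keyword split across runs still matches.
            if text in "".join(run.text for run in paragraph.runs):
                return True
    return False


def find_slide_numbers(prs, text):
    """Return the 1-based numbers of the slides that contain text."""
    return [
        number
        for number, slide in enumerate(prs.slides, start=1)
        if slide_has_text(slide, text)
    ]


# Usage (file name is a placeholder):
# from pptx import Presentation
# print(find_slide_numbers(Presentation('test.pptx'), 'test'))
```

The returned list can then be used as the set of slide numbers to print.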

Assign a variable/tag to specific text within a PowerPoint Text Box (in PowerPoint)

I would like to update specific text within a PowerPoint text box by assigning that specific text a variable within PowerPoint that I can then (1) identify in Python, and (2) replace the text based on data in an Excel model.
The ultimate goal is that in the output PowerPoint presentation, we should only see the underlying value of the variable (e.g., 32.4%), not the variable name (e.g., #pct_total_var#). And if changes need to be made, the underlying reference to the variable is kept, such that I can re-run Python to update the presentation.
Note: I have already implemented a "search and replace" type solution based on answers here, and elsewhere. However the issue with that approach is that you need to work with a template PowerPoint and an output PowerPoint, because once the variables (e.g., #pct_total_var#) have been replaced with values from an Excel model, it is no longer possible to identify the variables in the output presentation file in order to update.
Is it possible to assign a variable/tag to text within a PowerPoint Text Box?
Below is the current Python code that I have. I just need to be able to define variables inside PowerPoint. But thought this might help others searching for the same.
from pptx import Presentation
import pandas as pd
# Paths to model, and template deck
model = "path/to/model.xlsx"
template = "path/to/template.pptx"
# Step 1: Get data from model to replace in PowerPoint
df = pd.read_excel(model, sheet_name='UpdateReportPPT')
df = df[['variable_id', 'formatted_value']]
values_to_update = df.set_index('variable_id').T.to_dict('records')[0]
# Step 2: Open template PowerPoint presentation
prs = Presentation(template)
# Step 3.1: Get shapes within each slide
slides = [slide for slide in prs.slides]
shapes = []
for slide in slides:
    for shape in slide.shapes:
        shapes.append(shape)
# Step 3.2: Update variables
for shape in shapes:
    # THIS IS WHERE I'LL IDENTIFY THE VARIABLES STORED IN POWERPOINT & UPDATE BASED ON THE EXCEL MODEL
    pass  # placeholder so the loop is valid Python
# Step 4: Save to output PowerPoint presentation
prs.save(template)
print('Done.')
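One possible way to keep a stable reference, offered as an assumption rather than something stated in the post: rename each text box in PowerPoint's Selection Pane to the variable id. python-pptx exposes that name as `shape.name`, and the name is unaffected when the text changes, so the same file can be updated again later. This tags a whole text box rather than a fragment of its text, and assigning `text_frame.text` replaces the existing paragraphs, so formatting applied to individual runs may be lost. The helper name `update_named_shapes` is hypothetical.

```python
def update_named_shapes(prs, values_to_update):
    """Set the text of every shape whose name is a key in values_to_update.

    Returns the names that were updated, in slide order.
    """
    updated = []
    for slide in prs.slides:
        for shape in slide.shapes:
            if shape.has_text_frame and shape.name in values_to_update:
                shape.text_frame.text = str(values_to_update[shape.name])
                updated.append(shape.name)
    return updated


# Usage with the names from the script above (a shape would be renamed
# to e.g. 'pct_total_var' in PowerPoint beforehand):
# update_named_shapes(prs, values_to_update)
# prs.save(template)
```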

How to extract ALL IMAGES and text from all pptx file slides using python?

I'm able to read images from pptx file but not all images. I'm unable to extract the images presented in a slide with title or other text. Here is my code and please help me.
from pptx import Presentation
from pptx.enum.shapes import MSO_SHAPE_TYPE
import glob
import os
import codecs
from PIL import Image
import pytesseract
pytesseract.pytesseract.tesseract_cmd = '/usr/local/Cellar/tesseract/4.1.1/bin/tesseract'
from pytesseract import image_to_string
n=0
def write_image(shape):
    global n
    image = shape.image
    # get image
    image_bytes = image.blob
    # assigning file name, e.g. 'image.jpg'
    image_filename = fname[:-5]+'{:03d}.{}'.format(n, image.ext)
    n += 1
    print(image_filename)
    os.chdir("directory_path/readpptx/images")
    with open(image_filename, 'wb') as f:
        f.write(image_bytes)
    os.chdir("directory_path/readpptx")
def visitor(shape):
    if shape.shape_type == MSO_SHAPE_TYPE.PICTURE:
        write_image(shape)
def iter_picture_shapes(prs1):
    for slide in prs1.slides:
        for shape in slide.shapes:
            visitor(shape)
file = open("directory_path/MyFile.txt","a+")
for each_file in glob.glob("directory_path/*.pptx"):
    fname = os.path.basename(each_file)
    file.write("-------------------"+fname+"----------------------\n")
    prs = Presentation(each_file)
    print("---------------"+fname+"-------------------")
    for slide in prs.slides:
        for shape in slide.shapes:
            if hasattr(shape, "text"):
                print(shape.text)
                file.write(shape.text+"\n")
    iter_picture_shapes(prs)
file.close()
Above code is able to extract images from pptx slides which have no text or title but not able to extract images in slides with text or title.
Try also iterating over slide masters and slide layouts. If there are "background" images that's where they will be. The same for shape in slide.shapes: mechanism works on slide masters and slide layouts; they are a variant of the polymorphic Slide object with the same shape-access semantics.
I don't think your problem is strictly related to the presence of a title or text on the slide. Perhaps those particular slides use a layout that includes some background images. If you open the slide and clicking on the image does not select it (give it a bounding box), that indicates it is a background image that resides on the slide layout or possibly the slide master. This is how logos are commonly implemented to show up on every slide.
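A minimal sketch of that iteration, assuming the usual python-pptx attributes `prs.slide_masters`, `master.slide_layouts` and `.shapes`. The generator name `iter_all_shapes` is hypothetical. Each shape comes with a label telling where it was found, and masters and layouts are visited once each rather than once per slide that uses them.

```python
def iter_all_shapes(prs):
    """Yield (source, shape) for shapes on slides, slide masters and slide layouts."""
    for slide in prs.slides:
        for shape in slide.shapes:
            yield "slide", shape
    for master in prs.slide_masters:
        for shape in master.shapes:
            yield "master", shape
        for layout in master.slide_layouts:
            for shape in layout.shapes:
                yield "layout", shape


# Usage with the visitor() function from the question:
# for source, shape in iter_all_shapes(prs):
#     visitor(shape)
```

Filtering the yielded shapes with `shape.shape_type == MSO_SHAPE_TYPE.PICTURE`, as `visitor` does in the question, would then also pick up pictures that live on a layout or master.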
You may also want to consider iterating over the Notes slide for each slide when it has one, if there is text and/or images in there you are interested in. It is uncommon to find images in the slide notes but PowerPoint supports it.
Another approach is to traverse the underlying .pptx package (as a Zip archive) and extract the images that way.
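The Zip approach needs only the standard library. A minimal sketch, assuming the images are stored under `ppt/media/` inside the package, which is where PowerPoint normally puts embedded media. The function name `extract_media` is hypothetical. That folder can also hold audio or video, and this method does not tell you which slide uses which file.

```python
import os
import zipfile


def extract_media(pptx_path, out_dir):
    """Copy every file under ppt/media/ in the package to out_dir; return the written paths."""
    os.makedirs(out_dir, exist_ok=True)
    written = []
    with zipfile.ZipFile(pptx_path) as package:
        for member in package.namelist():
            # Skip everything outside the media folder, and directory entries.
            if member.startswith("ppt/media/") and not member.endswith("/"):
                target = os.path.join(out_dir, os.path.basename(member))
                with package.open(member) as src, open(target, "wb") as dst:
                    dst.write(src.read())
                written.append(target)
    return written


# Usage (paths are placeholders):
# extract_media("directory_path/deck.pptx", "directory_path/readpptx/images")
```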

Saving area within a shape (rectangle) as image from multiple powerpoint slides

I'm attempting to grab an image of diagrams constructed within a rectangle on a power point slide deck. I found python-pptx and am able to identify the shapes on each slide. Is there any way to expand this to take a snapshot of the area within the rectangle shape and export it as an image?
# Auto grab the photos created in Powerpoint
from pptx import Presentation
prs = Presentation('ex.pptx')
for slide in prs.slides:
    print(slide)
    for shape in slide.shapes:
        print(shape)
        # Identify shape on each slide, find area within, and save as .png
I think you're going to be best off looking at a COM32 type of solution, either writing something in VBA or possibly using the win32com library in Python if you really want a Python solution.
Either way this is going to fire up a "live" PowerPoint application instance and basically run it by remote control. That sort of thing isn't a great idea server-side, but if it's just for personal productivity it might work fine.
python-pptx can't do this sort of thing and probably never will. The rendering engine needs to get involved in this type of work and python-pptx is strictly a .pptx file editor/generator.
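If the slides are first exported as images by something that can render them (PowerPoint itself, for example), python-pptx can still supply the geometry. Shape positions and the slide size are both expressed in EMU, so the rectangle scales linearly onto the exported image. A minimal sketch of that arithmetic; `crop_box` is a hypothetical helper, and it assumes the exported image covers exactly the full slide and that the shape is not rotated.

```python
def crop_box(shape, slide_width, slide_height, image_width, image_height):
    """Map a shape's EMU rectangle onto pixel coordinates of a full-slide image.

    Returns (left, top, right, bottom) in pixels.
    """
    x_scale = image_width / slide_width
    y_scale = image_height / slide_height
    left = round(shape.left * x_scale)
    top = round(shape.top * y_scale)
    right = round((shape.left + shape.width) * x_scale)
    bottom = round((shape.top + shape.height) * y_scale)
    return left, top, right, bottom


# Usage with python-pptx values (image size comes from the exported file):
# box = crop_box(shape, prs.slide_width, prs.slide_height, 960, 720)
```

The tuple is in (left, upper, right, lower) order, which is the order Pillow's `Image.crop` accepts, so the exported slide image can be cropped with it directly.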
With Aspose.Slides for Python, you can easily save presentation shapes to images. The following code example shows you how to save all charts from a presentation to PNG images:
import aspose.slides as slides
import aspose.slides.charts as charts
import aspose.pydrawing as draw
with slides.Presentation("example.pptx") as presentation:
    for slide_index, slide in enumerate(presentation.slides):
        for shape_index, shape in enumerate(slide.shapes):
            # Looking for charts, for example.
            if isinstance(shape, charts.Chart):
                # Get a chart image.
                with shape.get_thumbnail() as chart_image:
                    # Save the chart image to PNG.
                    image_path = "chart_image_{}_{}.png".format(slide_index, shape_index)
                    chart_image.save(image_path, draw.imaging.ImageFormat.png)
Aspose.Slides for Python is a paid product, but you can get a temporary license or use it in a trial mode to evaluate all features for managing presentations. Alternatively, you can use Aspose.Slides Cloud SDK for Python. This package provides a REST-based API for managing presentations as well. The code example below shows you how to do the same using Aspose.Slides Cloud:
import asposeslidescloud
import aspose.pydrawing as draw
from asposeslidescloud.apis.slides_api import SlidesApi
from asposeslidescloud.models import *
slides_api = SlidesApi(None, "my_client_id", "my_client_secret")
file_name = "example.pptx"
# Upload the presentation to the default storage.
with open(file_name, "rb") as file_stream:
    slides_api.upload_file(file_name, file_stream)
# Get the number of slides.
slides_info = slides_api.get_slides(file_name)
slide_count = len(slides_info.slide_list)
for slide_index in range(1, slide_count + 1):
    # Get the number of shapes on the current slide.
    shapes_info = slides_api.get_shapes(file_name, slide_index)
    shape_count = len(shapes_info.shapes_links)
    for shape_index in range(1, shape_count + 1):
        shape = slides_api.get_shape(file_name, slide_index, shape_index)
        # Looking for charts, for example.
        if shape.type == "Chart":
            # Get the chart as a PNG image.
            image_path = slides_api.download_shape(file_name, slide_index, shape_index, ShapeExportFormat.PNG)
            print("A chart image was saved to " + image_path)
This is also a paid product, but you can make 150 free API calls per month for any purposes.
I work as a Support Developer at Aspose and can answer your questions about these libraries on the Aspose.Slides forum.

How to set a background image in python pptx

I am currently working on a project that creates a PowerPoint file with python-pptx. I am trying to set an image as the slide background, but I can't find how to do it in the python-pptx docs. Is it possible to set an image as the background? If so, can someone help me? If not, does anyone know another solution using Python?
Thank you
import os
import fnmatch
from pptx import Presentation
#Create presentation and setting layout as blank (6)
prs = Presentation()
blank_slide_layout = prs.slide_layouts[6]
#Find number of slides to create
#First Method = Count number of images in screenshot files (change path depending on the user)
nbSlide = len(fnmatch.filter(os.listdir("mypath"), '*.jpeg'))
#Loop to create as number of slides as there is report pages
for i in range(nbSlide):
    slide = prs.slides.add_slide(blank_slide_layout)
    #change background with an image of the slide …
    background=slide.background
#Final step = Creation and saving of pptx
prs.save('test.pptx')
Is it possible to set an image as background ... ?
So far, python-pptx has no direct way to set an image as the background of a slide.
If it is not does anyone know another solution using python ?
You could insert the picture into the slide as a regular shape, choosing suitable width/height parameters:
#Loop to create as number of slides as there is report pages
for i in range(nbSlide):
    slide = prs.slides.add_slide(blank_slide_layout)
    #change background with an image of the slide …
    left = top = 0
    pic = slide.shapes.add_picture('/your_file.jpeg', left-0.1*prs.slide_width, top, height = prs.slide_height)
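A variation on the workaround above: size the picture to the slide and move it behind the other shapes, so it behaves like a background even on slides that already have content. This is a minimal sketch; `add_background_picture` is a hypothetical helper, the image is stretched to the slide size (its aspect ratio is not preserved), and the z-order change relies on the private `_spTree` element, so it may break in a future python-pptx release.

```python
def add_background_picture(prs, slide, image_path):
    """Add a picture covering the whole slide and send it behind the other shapes."""
    pic = slide.shapes.add_picture(
        image_path, 0, 0, width=prs.slide_width, height=prs.slide_height
    )
    sp_tree = slide.shapes._spTree  # private API: XML container of the slide's shapes
    sp_tree.remove(pic._element)
    # Children 0 and 1 of the shape tree are its own property elements,
    # so index 2 is the position of the bottom-most shape.
    sp_tree.insert(2, pic._element)
    return pic


# Usage inside the loop above (file name is a placeholder):
# slide = prs.slides.add_slide(blank_slide_layout)
# add_background_picture(prs, slide, '/your_file.jpeg')
```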
