Rearrange powerpoint slides automatically using python-pptx

We typically use PowerPoint to run our experiments. We use "sections" in PowerPoint to keep the slides for each experimental task grouped together. Moving the sections by hand to counterbalance the task order of the experiment has been a lot of work!
I thought we could predefine the counterbalance orders in a CSV (each one a string of numbers giving the order of the sections) and read it from Python. Then, using python-pptx, move the sections and save a file for each order. The problem I am having is understanding how to read sections with python-pptx. If anyone has a better solution than Python, please let me know.
Thanks for your help!

As explained in the documentation (specifically this section):
Right now, adding a slide is the only operation on the slide collection. On the backlog at the time of writing is deleting a slide and moving a slide to a different position in the list. Copying a slide from one presentation to another turns out to be pretty hard to get right in the general case, so that probably won’t come until more of the backlog is burned down.
In other words, it is currently not possible to move slides as you have suggested. The best workaround I have used is to generate a new presentation and add the slides to it in the new order (since you can add slides).
For instance say I have slides in Presentation1.pptx of:
[0]
[1]
[2]
[3]
[4]
But I want:
[2]
[3]
[4]
[0]
[1]
Your code (in untested pseudocode of sorts) would be:
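from pptx import Presentation

old_presentation = Presentation("Presentation1.pptx")
new_presentation = Presentation()
blank_layout = new_presentation.slide_layouts[6]  # add_slide() needs a layout; index 6 is "Blank" in the default template
old_slides = list(old_presentation.slides)  # copy to a list so it can be sliced
for slide in old_slides[2:]:
    # transfer the contents into the new presentation for slides [2] [3] [4]
    new_slide = new_presentation.slides.add_slide(blank_layout)
    populate_new_slide_from_old_slide(slide, new_slide)
for slide in old_slides[:2]:
    # transfer the contents into the new presentation for slides [0] [1]
    new_slide = new_presentation.slides.add_slide(blank_layout)
    populate_new_slide_from_old_slide(slide, new_slide)
new_presentation.save("Presentation1_reordered.pptx")  # save() needs a filename; this one is a placeholder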
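Where populate_new_slide_from_old_slide() would look something like this (again, I didn't test it):
import copy

def populate_new_slide_from_old_slide(slide, new_slide):
    # add_shape() cannot take an existing shape, so clone each shape's XML instead
    for shape in slide.shapes:
        new_element = copy.deepcopy(shape.element)
        new_slide.shapes._spTree.insert_element_before(new_element, 'p:extLst')
I believe that placeholders are shapes too, so they should be transferred via this method! Shapes that depend on other parts of the file, such as pictures and charts, would need their relationships copied as well.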
Take heed I haven't coded .pptx for a while, so the actual implementation might be slightly different. As a concept though, this is currently the only way to do what you're asking. In my opinion if you are actively generating the data (as opposed to just reorganizing it after the fact) it would probably be simpler to just make a new_presentation object and plug your data into that. It seems odd to me to keep generating output in the old format and then converting it to the new format. For instance, when DVDs came out people started putting their movies on that (the sensible option) instead of making VHS and then porting the VHS to DVD through some arbitrary method (the very peculiar option I am trying to dissuade you from).

Would it be feasible - if all we're doing is reordering - to read the XML and rewrite it with the slide elements permuted?
Further - for the "delete" case - is it feasible to simply delete a slide element in the XML? (I realise this could leave dangling objects such as images in the file.)
The process of extracting the XML and rewriting it to a copy of the file is probably not too onerous.

I found a nice solution here:
from pptx import Presentation

class PresentationBuilder(object):
    presentation = Presentation("presentation.pptx")

    def move_slide(self, old_index, new_index):
        # _sldIdLst is the private XML list that holds the slide order
        xml_slides = self.presentation.slides._sldIdLst
        slides = list(xml_slides)
        xml_slides.remove(slides[old_index])
        xml_slides.insert(new_index, slides[old_index])
So for example if you want to move the slide that is currently in position with index 5 to position with index 1, you use:
prs = PresentationBuilder()
prs.move_slide(5, 1)
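Building on the same idea, the counterbalancing workflow from the question could be sketched roughly as follows. This is only a sketch under stated assumptions: as far as I know python-pptx has no public API for reading sections, so section_sizes (the number of slides in each section, in file order) is something you would supply by hand, and reorder_slides relies on the same private _sldIdLst attribute used above. The helper names parse_orders, slide_order and reorder_slides are hypothetical.

```python
import csv
import io


def parse_orders(csv_text):
    """Read one counterbalance order per CSV row, e.g. "2,0,1"."""
    rows = csv.reader(io.StringIO(csv_text))
    return [[int(cell) for cell in row] for row in rows if row]


def slide_order(section_sizes, section_order):
    """Expand a section order into a flat list of original slide indices."""
    starts, total = [], 0
    for size in section_sizes:
        starts.append(total)
        total += size
    order = []
    for section in section_order:
        first = starts[section]
        order.extend(range(first, first + section_sizes[section]))
    return order


def reorder_slides(presentation, new_order):
    """Permute slides in place (assumption: private python-pptx _sldIdLst)."""
    id_list = presentation.slides._sldIdLst
    elements = list(id_list)
    for element in elements:
        id_list.remove(element)
    for index in new_order:
        id_list.append(elements[index])
```

With three sections of 2, 3 and 1 slides, slide_order([2, 3, 1], [2, 0, 1]) gives [5, 0, 1, 2, 3, 4]. After calling reorder_slides you would still need to save the presentation under a new filename for each order. I have not checked how PowerPoint treats the existing section markers after such a reorder.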

Related

OpenPyXL - DataValidation adding list to the cell (issue)

This is my code:
dv = DataValidation(type="list", formula1='"11111,22222,33333,44444,55555,66666,77777,88888,99999,111110,122221,133332,144443,155554,166665,177776,188887,199998,211109,222220,233331,244442,255553,266664,277775,288886,299997,311108,322219,333330,344441,355552,366663,377774,388885,399996,411107,422218,433329,444440,455551,466662,477773,488884,499995,511106,522217,533328,544439,555550,566661,577772,588883,599994,611105,622216,633327,644438,655549,666660,677771,688882,699993,711104,722215,733326,744437,755548,766659,777770,788881,799992,811103,822214,833325,844436,855547,866658,877769,888880,899991,911102,922213,933324,944435,955546,966657,977768,988879,999990,1011101,1022212,1033323,1044434,1055545,1066656,1077767,1088878,1099989,1111100,1122211"', allow_blank=False)
sheet.add_data_validation(dv)
dv.add('K5')
But then I have this issue:
But if the formula1 list is small, everything works fine.
What is the way to add a big list of options that will not cause issues (as you can see above)?

Excel may impose additional limits on what it accepts. See https://learn.microsoft.com/en-us/openspecs/office_standards/ms-oi29500/8ebf82e4-4fa4-43a6-9ecd-d2d793a6f4bf. In the implementers notes there is additional information but I cannot find the passage referred to.
Basically, I think it's generally easier to refer to values on a separate sheet.
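A minimal sketch of the separate-sheet approach, assuming openpyxl is installed. As far as I know Excel caps an inline list at 255 characters, which the list above far exceeds, whereas a cell range has no such limit. The helper names list_reference and add_long_list_validation and the sheet title "Options" are made up for this example. DataValidation is imported inside the function only so that the pure helper can be used on its own.

```python
def list_reference(sheet_title, column, first_row, count):
    """Build an absolute range such as 'Options'!$A$1:$A$101."""
    last_row = first_row + count - 1
    quoted = sheet_title.replace("'", "''")  # escape quotes inside the title
    return "'{0}'!${1}${2}:${1}${3}".format(quoted, column, first_row, last_row)


def add_long_list_validation(workbook, sheet, cell, options, options_title="Options"):
    """Write the options to their own sheet and validate `cell` against them."""
    from openpyxl.worksheet.datavalidation import DataValidation

    options_sheet = workbook.create_sheet(options_title)
    for row, value in enumerate(options, start=1):
        options_sheet.cell(row=row, column=1, value=value)

    reference = list_reference(options_title, "A", 1, len(options))
    dv = DataValidation(type="list", formula1=reference, allow_blank=False)
    sheet.add_data_validation(dv)
    dv.add(cell)
    return dv
```

For the case above this would be called as add_long_list_validation(workbook, sheet, 'K5', options), where options is the list of numbers that currently sits inside formula1.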

python-pptx duplicate slide PPT will be damaged

I found that when I duplicate a slide that contains a chart, the PPT gets damaged. So I used the method below to copy a slide with a chart, but now when I modify the title of the chart on one slide, the title of the chart on the other slide is inexplicably modified as well.
import copy
import six

def duplicate_slide(pres, index):
    template = pres.slides[index]
    blank_slide_layout = pres.slide_layouts[index]
    copied_slide = pres.slides.add_slide(blank_slide_layout)
    for shp in template.shapes:
        el = shp.element
        newel = copy.deepcopy(el)
        copied_slide.shapes._spTree.insert_element_before(newel, 'p:extLst')
    for _, value in six.iteritems(template.part.rels):
        # Make sure we don't copy a notesSlide relation as that won't exist
        if "notesSlide" not in value.reltype:
            copied_slide.part.rels.add_relationship(
                value.reltype, value._target, value.rId
            )
    return copied_slide

In the general case, duplicating a slide is not as simple as just cloning the slide XML, which is what your duplicate_slide() method does. That works for some simple cases, but not for slides with charts.
In particular, a chart is a separate package-part ("file") within the PPTX package (zip archive). If you just copy the relationships from one slide to the other, like you do here, then you have two slides pointing to the same chart-part. This is why changing the chart title in one slide changes it in the other as well, because the same single chart is displayed on both slides.
In order to get the behavior you seem to be looking for, you would need to also duplicate the chart part and form a relationship from the new slide to that new chart part.
That's not a simple enough process for me to just provide here a few lines of code to do it, but hopefully this explains for you why you are seeing the behavior you are.
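The sharing effect can be illustrated with a toy model in plain Python. This is not python-pptx code: Part is a made-up stand-in for a package part such as a chart, and the dictionaries stand in for each slide's relationships.

```python
import copy


class Part:
    """Toy stand-in for a package part such as a chart (not a python-pptx class)."""

    def __init__(self, title):
        self.title = title


original_rels = {"rId2": Part("Sales")}

# Copying only the relationships: both slides point at the same chart part.
shared_rels = dict(original_rels)
shared_rels["rId2"].title = "Revenue"
shared_title_on_original = original_rels["rId2"].title

# Duplicating the part as well gives the new slide its own chart.
independent_rels = {rid: copy.deepcopy(part) for rid, part in original_rels.items()}
independent_rels["rId2"].title = "Costs"
independent_title_on_original = original_rels["rId2"].title
```

In the first case the title change shows up on the original slide too. In the second case the original keeps its title, which is the behavior a real duplicate would need to reproduce by copying the chart part itself.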

Extending pyyaml to find and replace like xml ElementTree

I'd like to extend this SO question to treat a non-trivial use-case.
Background: pyyaml is pretty sweet insofar as it eats YAML and poops Python-native data structures. But what if you want to find a specific node in the YAML? The referenced question would suggest that, hey, you just know where in the data structure the node lives and index right into it. In fact pretty much every answer to every pyyaml question on SO seems to give this same advice.
But what if you don't know where the node lives in the YAML in advance?
If I were working with XML I'd solve this problem with an xml.etree.ElementTree. These provide nice facilities for loading an XML document into memory and finding elements based on certain search criteria. See find() and findall().
Questions:
Does pyyaml provide search capabilities analogous to ElementTree? (If yes, feel free to yell at me for being bad at Google.)
If no, does anyone have a nice recipe for extending pyyaml to achieve similar things? (Bonus points for not traversing the deserialized YAML all over again.)
Note that one important thing that ElementTree provides in addition to just being able to find things is the ability to modify the XML document given an element reference. I'd like to be able to do this on YAML as well.

The answer to question 1 is: no. PyYAML implements the YAML 1.1 language standard and there is nothing about finding scalars by any path in the standard nor in the library.
However, if you safe-load a YAML structure, everything is either a mapping, a sequence, or a scalar. Even such a simplistic representation (simple compared to full-fledged object instantiation with !type markers) can already contain recursive, self-referencing structures:
&a x: *a
This is not possible in XML without external semantic interpretation. This makes writing a generic tree walker much harder for YAML than for XML.
The type loading mechanism of YAML also makes it much more difficult to write a generic tree walker, even if you exclude the problem of self-references.
If you don't know where a node lives in advance, you still need to know how to identify the node, and since you don't know how you would walk to its parent (which might be represented in multiple layers of combined mappings and sequences), it is almost useless to have a generic mechanism that depends on context.
Without being able to rely on context (in general) the thing that is left is a uniquely identifiable value (like the HTML id attribute). If all your objects in YAML have such a unique id, then it is possible to search the (safeloaded) tree for such an id value and extract any structure underneath it (mappings, sequences) until you hit a leaf (scalar), or some structure that has an id of its own (another object).
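A minimal sketch of such an id search, assuming the document was safe-loaded into plain dicts and lists and that objects carry their identifier under an "id" key (both are assumptions for illustration). The function name find_by_id is hypothetical. The set of visited objects guards against self-referencing structures.

```python
def find_by_id(node, wanted, key="id", _seen=None):
    """Return the first mapping whose `key` equals `wanted`, or None."""
    _seen = set() if _seen is None else _seen
    if id(node) in _seen:
        # Already visited: this guards against self-referencing structures.
        return None
    if isinstance(node, dict):
        _seen.add(id(node))
        if node.get(key) == wanted:
            return node
        children = node.values()
    elif isinstance(node, list):
        _seen.add(id(node))
        children = node
    else:
        return None  # scalar leaf
    for child in children:
        found = find_by_id(child, wanted, key, _seen)
        if found is not None:
            return found
    return None
```

The returned mapping is the same object that lives in the loaded tree, so modifying it modifies the tree.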
I have been following the YAML development for quite some time now (the earliest emails from the YAML mailing list that I have in my YAML folder are from 2004) and I have not seen anything generic evolve since then. I do have some tools to walk the trees and find things, which I use for extracting parts of the simplified structure when testing my ruamel.yaml library, but no code that is in a releasable shape (it would already have been on PyPI if it were), and nothing near a generic solution like you can make for XML (which is IMO, on its own, syntactically less complex than YAML).

Do you know how to search through Python objects? Then you know how to search through the results of a yaml.load()...
YAML is different from XML in two important ways: one is that while every element in XML has a tag and a value, in YAML, there can be some things that are only values. But secondly... again, YAML creates python objects. There is no intermediate in-memory format to use.
E.G. if you load a YAML file like this:
- First
- Second
- Third
you'll get a list like ['First', 'Second', 'Third']. Want to find 'Third' and don't know where it is? You can use [x for x in my_list if 'Third' in x] to find it. Need to lookup an item in a dictionary? Just do it.
If you want to modify an object, you don't modify the YAML, you modify the object. E.g. now I want the second entry to be in German. I just do my_list[1] = 'zweite', modifying it in place. Now the Python list looks like ['First', 'zweite', 'Third'], and dumping it to YAML looks like
- First
- zweite
- Third
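When the value is nested at an unknown depth, the same idea can be wrapped in a small recursive helper. The following is a sketch under assumptions: the data consists only of plain dicts, lists and scalars with no self-references, and find_paths and set_at are hypothetical helper names, not part of PyYAML.

```python
def find_paths(node, predicate, path=()):
    """Yield the key/index path of every scalar for which predicate(value) is true."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from find_paths(value, predicate, path + (key,))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_paths(value, predicate, path + (index,))
    elif predicate(node):
        yield path


def set_at(root, path, value):
    """Replace the value stored at `path` in place."""
    target = root
    for step in path[:-1]:
        target = target[step]
    target[path[-1]] = value


data = {"tasks": ["First", "Second", "Third"]}
# Collect the paths first, then modify, so the walk never sees a changing tree.
paths = list(find_paths(data, lambda value: value == "Second"))
for found_path in paths:
    set_at(data, found_path, "zweite")
```

Here paths ends up as [("tasks", 1)] and data["tasks"] becomes ['First', 'zweite', 'Third'], ready to be dumped back to YAML.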
Note that PyYAML is pretty smart... you can even create objects with loops:
>>> a = [1,2,3]
>>> b = {}
>>> b[1] = a
>>> b[2] = a
>>> print yaml.dump(b)
1: &id001 [1, 2, 3]
2: *id001
>>> b[2] = [3,4,5]
>>> print yaml.dump(b)
1: [1, 2, 3]
2: [3, 4, 5]
In the first case, it even figured out that b[1] and b[2] point to the same object, so it created links and automatically put a link from one to the other... in the original object, if you did something like a.pop(), both b[1] and b[2] would show that one entry was gone. If you send that object to YAML, and then load it back in, that will still be true.
(and note in the second one, where they aren't the same, PyYAML doesn't create the extra notations, as it doesn't need to).
In short: Most likely, you're just overthinking it.

How to force Python dictionary to shrink?

I have experienced that in other languages. Now I have the same problem in Python. I have a dictionary that has a lot of CRUD actions. One would assume that deleting elements from a dictionary should decrease the memory footprint of it. It's not the case. Once a dictionary grows in size (doubling usually), it never(?) releases allocated memory back. I have run this experiment:
import random
import sys
import uuid
a = {}
for i in range(0, 100000):
    a[uuid.uuid4()] = uuid.uuid4()
    if i % 1000 == 0:
        print sys.getsizeof(a)
for i in range(0, 100000):
    e = random.choice(a.keys())
    del a[e]
    if i % 1000 == 0:
        print sys.getsizeof(a)
print len(a)
The last line of the first loop is 6291736. The last line of the second loop is 6291736 as well. And the size of the dictionary is 0.
So how to tackle this issue? Is there a way to force release of memory?
PS: don't really need to do random - I played with the range of the second loop.

The way to do this "rehashing" so it uses less memory is to create a new dictionary and copy the content over.
The Python dictionary implementation is explained really well in this video:
https://youtu.be/C4Kc8xzcA68
There is an attendee asking this same question (https://youtu.be/C4Kc8xzcA68?t=1593), and the answer given by the speaker is:
Resizes are only calculated upon insertion; as a dictionary shrinks it just gains a lot of dummy entries and as you refill it will just start reusing those to store keys. [...] you have to copy the keys and values out to a new dictionary

Actually a dictionary can shrink upon resize, but the resize only happens upon a key insert not removal. Here's a comment from the CPython source for dictresize:
Restructure the table by allocating a new table and reinserting all
items again. When entries have been deleted, the new table may
actually be smaller than the old one.
By the way, since the other answer quotes Brandon Rhodes' talk on the dictionary at PyCon 2010, and the quote seems to be at odds with the above (which has been there for years), I thought I would include the full quote, including the part that was left out.
Resizes are only calculated upon insertion. As a dictionary shrinks,
it just gains a lot of dummy entries and as you refill it, it will
just start re-using those to store keys. It will not resize until you
manage to make it two-thirds full again at its larger size. So it
does not resize as you delete keys. You have to do an insert to get
it to figure out it needs to shrink.
So he does say the resizing operation can "figure out [the dictionary] needs to shrink". But that only happens on insert. Apparently when copying over all the keys during resize, the dummy keys can get removed, reducing the size of the backing array.
It isn't clear, however, how to get this to happen, which is why Rhodes says to just copy over everything to a new dictionary.
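The copy-to-a-new-dictionary approach can be sketched as follows. This assumes CPython, where sys.getsizeof reports the size of the dictionary's own table (not of the keys and values), and it uses a dict comprehension so that the surviving items are inserted one by one into a fresh table.

```python
import sys

big = {i: i for i in range(100000)}
for i in range(10, 100000):
    del big[i]  # deletions alone do not shrink the table

size_after_deletes = sys.getsizeof(big)

# Rebuild: inserting the surviving items into a fresh dict sizes it for 10 entries.
compact = {key: value for key, value in big.items()}
size_after_rebuild = sys.getsizeof(compact)
```

On CPython, size_after_rebuild comes out far smaller than size_after_deletes, even though both dictionaries hold the same 10 items. Rebinding the original name to the rebuilt dictionary lets the old table be freed once nothing else refers to it.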

Selecting all items individually in a list

I was wondering if it is possible to re-select each and every item in rsList individually?
I am citing a simple example below, but I am dealing with hundreds of items in the scene, and the code below is the simplest form I have been able to come up with based on my limited knowledge of Python.
rsList = cmds.ls(type='resShdrSrf')
# Output: [u'pCube1_GenShdr', u'pPlane1_GenShdr', u'pSphere1_GenShdr']
I tried using cmds.select as shown below, but it only takes my last selection (in memory), pSphere1_GenShdr, into account and forgets the other two, even though all three items appear selected in the UI.
I also tried building a list with append, but that does not seem to work either and the selection remains the same...
selection = []
for item in rsList:
    selection.append(item)
    cmds.select(item)
# cmds.select(selection)
As such, is it possible for me to perform a cmds.select on each of the items individually?

If you're just trying to select each item:
import pymel.core as pm
for i in pm.ls(sl=True):
    i.select()
But this should have no effect on your rendering.

I think for mine, it is a special case in which I would need to add in mm.eval("autoUpdateAttrEd;") for the first creation of my shader before I can duplicate.
Apparently I need this command in order to get it to work
