How to replace text in PowerPoint? - python

This is the code I am using to replace text in PowerPoint. First I extract the text from the presentation and then store the translated and original sentences as a dictionary.
from pptx import Presentation

prs = Presentation('/content/drive/MyDrive/presentation1.pptx')

# To get shapes in your slides
slides = [slide for slide in prs.slides]
shapes = []
for slide in slides:
    for shape in slide.shapes:
        shapes.append(shape)
def replace_text(self, replacements: dict, shapes: List):
    """Takes dict of {match: replacement, ... } and replaces all matches.
    Currently not implemented for charts or graphics.
    """
    for shape in shapes:
        for match, replacement in replacements.items():
            if shape.has_text_frame:
                if (shape.text.find(match)) != -1:
                    text_frame = shape.text_frame
                    for paragraph in text_frame.paragraphs:
                        for run in paragraph.runs:
                            cur_text = run.text
                            new_text = cur_text.replace(str(match), str(replacement))
                            run.text = new_text
            if shape.has_table:
                for row in shape.table.rows:
                    for cell in row.cells:
                        if match in cell.text:
                            new_text = cell.text.replace(match, replacement)
                            cell.text = new_text

replace_text(translation, shapes)
I get an error:
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-97-181cdd92ff8c> in <module>()
9 shapes.append(shape)
10
---> 11 def replace_text(self, replacements: dict, shapes: List):
12 """Takes dict of {match: replacement, ... } and replaces all matches.
13 Currently not implemented for charts or graphics.
NameError: name 'List' is not defined
translation is a dictionary:
translation = {' Architecture': 'आर्किटेक्चर',
' Conclusion': 'निष्कर्ष',
' Motivation / Entity Extraction': 'प्रेरणा / इकाई निष्कर्षण',
' Recurrent Deep Neural Networks': 'आवर्तक गहरे तंत्रिका नेटवर्क',
' Results': 'परिणाम',
' Word Embeddings': 'शब्द एम्बेडिंग',
'Agenda': 'कार्यसूची',
'Goals': 'लक्ष्य'}
May I know why I am getting this error and what changes should be done to resolve it? Also, can I save the replaced text using prs.save('output.pptx')?
New Error
TypeError Traceback (most recent call last)
<ipython-input-104-957db45f970e> in <module>()
32 cell.text = new_text
33
---> 34 replace_text(translation, shapes)
35
36 prs.save('output.pptx')
TypeError: replace_text() missing 1 required positional argument: 'shapes'

The NameError: name 'List' is not defined occurs because List is not a built-in name; it comes from the typing module, which was never imported. Since Python 3.9 you can annotate with the built-in list directly, e.g. list[type].
For instance:
def replace_text(self, replacements: dict, shapes: list):
Alternatively, you can import List from typing, although that form is deprecated in newer Python versions:
from typing import List
def replace_text(self, replacements: dict, shapes: List):
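The later TypeError comes from the stray self parameter: replace_text is defined here as a plain function, not a method of a class, so the call replace_text(translation, shapes) binds translation to self and leaves shapes missing. Below is a minimal sketch of the function without self, plus the save call you asked about (it assumes the prs, shapes and translation objects from your code above; output.pptx is just an example file name):

from typing import List

def replace_text(replacements: dict, shapes: List) -> None:
    """Takes dict of {match: replacement, ...} and replaces all matches."""
    for shape in shapes:
        for match, replacement in replacements.items():
            if shape.has_text_frame and match in shape.text:
                for paragraph in shape.text_frame.paragraphs:
                    for run in paragraph.runs:
                        run.text = run.text.replace(match, replacement)
            if shape.has_table:
                for row in shape.table.rows:
                    for cell in row.cells:
                        if match in cell.text:
                            cell.text = cell.text.replace(match, replacement)

replace_text(translation, shapes)
prs.save('output.pptx')  # yes, this writes the modified presentation to a new file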

Related

Python nlpaug Sentence augmenter error (50256 is not in list)

The following code yields this error for both the GPT2 and XLNet bases. The download of the bases occurs, but the same error is displayed at the end every time.
I am using Google Colab, by the way.
ValueError: 50256 is not in list
import nlpaug
import nlpaug.augmenter.word as naw
import nlpaug.augmenter.sentence as nas
text = "The quick brown fox jumped over the lazy dog"
aug_cs_gpt2 = nas.ContextualWordEmbsForSentenceAug(model_type = 'gpt2')
temp = aug_cs_gpt2.augment(text)
print(temp)
aug_cs_xlnet = nas.ContextualWordEmbsForSentenceAug(model_type = 'xlnet')
temp = aug_cs_xlnet.augment(text)
print(temp)
Expecting the augmented text to be printed.
AttributeError Traceback (most recent call last)
<ipython-input-41-6810452650ff> in <module>
9
10 aug_cs_xlnet = nas.ContextualWordEmbsForSentenceAug(model_type = 'xlnet')
---> 11 temp = aug_cs_xlnet.augment(text)
12 print(temp)
2 frames
/usr/local/lib/python3.8/dist-packages/nlpaug/augmenter/sentence/context_word_embs_sentence.py in _custom_insert(self, all_data)
148 # Mask token is needed for xlnet. No mask token for gpt2
149 if self.model_type in ['xlnet']:
--> 150 text += ' ' + self.model.MASK_TOKEN
151
152 texts.append(text)
AttributeError: 'Gpt2' object has no attribute 'MASK_TOKEN'
This is a known bug in nlpaug versions prior to 0.0.16. You should upgrade nlpaug to a newer release and the issue should be gone.
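Since the question mentions Google Colab, the upgrade can be done directly in a notebook cell; a minimal sketch (restart the runtime afterwards so the newer nlpaug is actually loaded):

# Upgrade nlpaug to the latest release, then restart the runtime
!pip install --upgrade nlpaug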

How to run on an array of strings in a nested loop in python

I am trying to run this code (a small part of my program).
I fixed the indentation and added whitespace, and now I get a new error; maybe this array does not work with strings?
arrayOfPhotos = ["1.jpg", "2.jpg", "3.jpg", "4.jpg"]
for name in arrayOfPhotos:
    detections = detector.detectObjectsFromImage(input_image=arrayOfPhotos[name], output_image_path="holo3-detected.jpg")
    for detection in detections:
        print(arrayOfPhotos[name], " : ", detection["percentage_probability"])
I get the error:
Traceback (most recent call last):
File "dTEST.py", line 13, in <module>
detections = detector.detectObjectsFromImage(input_image=arrayOfPhotos[name], output_image_path="holo3-detected.jpg")
TypeError: list indices must be integers or slices, not str
Can you help me?
What you probably want is this:
arrayOfPhotos = ["1.jpg", "2.jpg", "3.jpg", "4.jpg"]
for name in arrayOfPhotos:
    detections = detector.detectObjectsFromImage(input_image=arrayOfPhotos[name], output_image_path="holo3-detected.jpg")
    for detection in detections:
        print(arrayOfPhotos[name], " : ", detection["percentage_probability"])
Whitespace is important in Python: for a statement to be part of a loop it needs to be indented relative to the loop (I've indented those lines above).
Edit: the OP edited the question.
Replace input_image=arrayOfPhotos[name] with input_image=name:
arrayOfPhotos = ["1.jpg", "2.jpg", "3.jpg", "4.jpg"]
for name in arrayOfPhotos:
    detections = detector.detectObjectsFromImage(input_image=name, output_image_path="holo3-detected.jpg")
    for detection in detections:
        print(name, " : ", detection["percentage_probability"])
The array of photos is a list, not a dictionary, so it can only be indexed with integers; name is already the string you want.
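A tiny illustration of the difference (a hypothetical snippet, not from the question):

arrayOfPhotos = ["1.jpg", "2.jpg"]

arrayOfPhotos[0]          # fine: lists are indexed by integer position
# arrayOfPhotos["1.jpg"]  # TypeError: list indices must be integers or slices, not str

# If you need the position as well as the name, enumerate gives you both:
for index, name in enumerate(arrayOfPhotos):
    print(index, name)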

AttributeError: 'float' object has no attribute 'translate' Python

I'm working on some NLP with textual data from doctors, just doing some basic preprocessing and text cleaning: removing stop words and punctuation. I have already given the program a list of punctuation characters and stop words.
My text data looks something like this:
"Cyclin-dependent kinases (CDKs) regulate a variety of fundamental cellular processes. CDK10 stands out as one of the last orphan CDKs for which no activating cyclin has been identified and no kinase activity revealed. Previous work has shown that CDK10 silencing increases ETS2 (v-ets erythroblastosis virus E26 oncogene homolog 2)-driven activation of the MAPK pathway, which confers tamoxifen resistance to breast cancer cells"
Then my code looks like:
import string

# Create a function to remove punctuations
def remove_punctuation(sentence: str) -> str:
    return sentence.translate(str.maketrans('', '', string.punctuation))

# Create a function to remove stop words
def remove_stop_words(x):
    x = ' '.join([i for i in x.split(' ') if i not in stop])
    return x

# Create a function to lowercase the words
def to_lower(x):
    return x.lower()
So then I try to apply the functions to the Text column
train['Text'] = train['Text'].apply(remove_punctuation)
train['Text'] = train['Text'].apply(remove_stop_words)
train['Text'] = train['Text'].apply(lower)
And I get an error message like:
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input> in <module>
----> 1 train['Text'] = train['Text'].apply(remove_punctuation)
      2 train['Text'] = train['Text'].apply(remove_stop_words)
      3 train['Text'] = train['Text'].apply(lower)

/opt/conda/lib/python3.6/site-packages/pandas/core/series.py in apply(self, func, convert_dtype, args, **kwds)
   3192             else:
   3193                 values = self.astype(object).values
-> 3194             mapped = lib.map_infer(values, f, convert=convert_dtype)
   3195
   3196             if len(mapped) and isinstance(mapped[0], Series):

pandas/_libs/src/inference.pyx in pandas._libs.lib.map_infer()

<ipython-input> in remove_punctuation(sentence)
      3 # Create a function to remove punctuations
      4 def remove_punctuation(sentence: str) -> str:
----> 5     return sentence.translate(str.maketrans('', '', string.punctuation))
      6
      7 # Create a function to remove stop words

AttributeError: 'float' object has no attribute 'translate'
Why am I getting this error? I'm guessing it's because digits appear in the text?

Parse multiline text using the Parsimonious Python library

I am trying to parse multiline text with the Python parsimonious library. I've been playing with it for a while and can't figure out how to deal effectively with newlines. One example is below; the behavior there makes sense. I saw a comment from Erik Rose in the parsimonious issues, but I could not figure out how to implement it without errors. Thanks for any tips here...
from parsimonious.grammar import Grammar

singleline_text = '''\
FIRST something cool'''

multiline_text = '''\
FIRST something very
cool
SECOND more awesomeness
'''

grammar = Grammar(
    """
    bin = ORDER spaces description
    ORDER = 'FIRST' / 'SECOND'
    spaces = ~'\s*'
    description = ~'[A-z0-9 ]*'
    """)
It works OK for single-line input; print(grammar.parse(singleline_text)) gives:
<Node called "bin" matching "FIRST something cool">
<Node called "ORDER" matching "FIRST">
<Node matching "FIRST">
<RegexNode called "spaces" matching " ">
<RegexNode called "description" matching "something cool">
But multiline input gives problems, which I was unable to resolve based on the link above; print(grammar.parse(multiline_text)) gives:
---------------------------------------------------------------------------
IncompleteParseError Traceback (most recent call last)
<ipython-input-123-c346891dc883> in <module>()
----> 1 print(grammar.parse(multiline_text))
/Users/me/anaconda3/lib/python3.6/site-packages/parsimonious/grammar.py in parse(self, text, pos)
121 """
122 self._check_default_rule()
--> 123 return self.default_rule.parse(text, pos=pos)
124
125 def match(self, text, pos=0):
/Users/me/anaconda3/lib/python3.6/site-packages/parsimonious/expressions.py in parse(self, text, pos)
110 node = self.match(text, pos=pos)
111 if node.end < len(text):
--> 112 raise IncompleteParseError(text, node.end, self)
113 return node
114
IncompleteParseError: Rule 'bin' matched in its entirety, but it didn't consume all the text. The non-matching portion of the text begins with '
cool
SECOND' (line 1, column 23).
Here is one thing I tried that did not work:
grammar2 = Grammar(
    """
    bin = ORDER spaces description newline
    ORDER = 'FIRST' / 'SECOND'
    spaces = ~'\s*'
    description = ~'[A-z0-9 \n]*'
    newline = ~r'#[^\r\n]*'
    """)
print(grammar2.parse(multiline_text))
(truncated from the 211-line stack trace):
ERROR:root:An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line string', (1, 4))
---------------------------------------------------------------------------
SyntaxError Traceback (most recent call last)
...
VisitationError: SyntaxError: EOL while scanning string literal (<unknown>, line 1)
Parse tree:
<Node called "spaceless_literal" matching "'[A-z0-9
]*'"> <-- *** We were here. ***
<RegexNode matching "'[A-z0-9
]*'">
It looks like you need to repeat the bin element in your grammar:
grammar = Grammar(
    r"""
    one = bin +
    bin = ORDER spaces description newline
    ORDER = 'FIRST' / 'SECOND'
    newline = ~"\n*"
    spaces = ~"\s*"
    description = ~"[A-z0-9 ]*"i
    """)
with that you can parse things like:
multiline_text = '''\
FIRST something very cool
SECOND more awesomeness
SECOND even better
'''
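For completeness, here is a minimal end-to-end sketch of exercising that grammar (it assumes parsimonious is installed; the only change from the answer's grammar is writing the quantifier as bin+ with no space):

from parsimonious.grammar import Grammar

grammar = Grammar(
    r"""
    one = bin+
    bin = ORDER spaces description newline
    ORDER = 'FIRST' / 'SECOND'
    newline = ~"\n*"
    spaces = ~"\s*"
    description = ~"[A-z0-9 ]*"i
    """)

multiline_text = '''\
FIRST something very cool
SECOND more awesomeness
SECOND even better
'''

print(grammar.parse(multiline_text))  # prints the full parse tree when the text matches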

'float' object has no attribute 'lower'

I'm facing this error and I'm really not able to find the reason for it.
Can somebody please point out the reason for it?
for i in tweet_raw.comments:
    mns_proc.append(processComUni(i))
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-416-439073b420d1> in <module>()
1 for i in tweet_raw.comments:
----> 2 tweet_processed.append(processtwt(i))
3
<ipython-input-414-4e1b8a8fb285> in processtwt(tweet)
4 #Convert to lower case
5 #tweet = re.sub('RT[\s]+','',tweet)
----> 6 tweet = tweet.lower()
7 #Convert www.* or https?://* to URL
8 #tweet = re.sub('((www\.[\s]+)|(https?://[^\s]+))','',tweet)
AttributeError: 'float' object has no attribute 'lower'
A second, similar error that I am facing is this:
for i in tweet_raw.comments:
    tweet_proc.append(processtwt(i))
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-423-439073b420d1> in <module>()
1 for i in tweet_raw.comments:
----> 2 tweet_proc.append(processtwt(i))
3
<ipython-input-421-38fab2ef704e> in processComUni(tweet)
11 tweet=re.sub(('[http]+s?://[^\s<>"]+|www\.[^\s<>"]+'),'', tweet)
12 #Convert #username to AT_USER
---> 13 tweet = re.sub('#[^\s]+',' ',tweet)
14 #Remove additional white spaces
15 tweet = re.sub('[\s]+', ' ', tweet)
C:\Users\m1027201\AppData\Local\Continuum\Anaconda\lib\re.pyc in sub(pattern, repl, string, count, flags)
149 a callable, it's passed the match object and must return
150 a replacement string to be used."""
--> 151 return _compile(pattern, flags).sub(repl, string, count)
152
153 def subn(pattern, repl, string, count=0, flags=0):
TypeError: expected string or buffer
Shall I check whether or not a particular tweet is a string before passing it to the processtwt() function? For this error I don't even know which line it's failing at.
Just try using this:
tweet = str(tweet).lower()
Lately, I've been facing many of these errors, and converting the value to a string before applying lower() has always worked for me.
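For context, a minimal sketch of where that coercion could sit in the question's processtwt function (the remaining cleanup steps are abbreviated):

def processtwt(tweet):
    # Coerce to str first: NaN values in a pandas column arrive as floats
    tweet = str(tweet).lower()
    # ... the rest of the cleaning (URL / hashtag substitutions) can stay as before
    return tweet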
My answer is broader than shalini's answer. If you want to check whether the object is of type str, then I suggest checking the object's type with isinstance(), as shown below. This is the more Pythonic way.
tweet = "stackoverflow"
## best way of doing it
if isinstance(tweet,(str,)):
print tweet
## other way of doing it
if type(tweet) is str:
print tweet
## This is one more way to do it
if type(tweet) == str:
print tweet
All of the above work fine for checking whether the object is a string or not.
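Applied to the loop from the question, the check could look like this (a minimal sketch; non-string comments are simply skipped here, but you could also coerce them with str() as in the other answer):

tweet_proc = []
for i in tweet_raw.comments:
    if isinstance(i, str):   # only pass real strings to the cleaner
        tweet_proc.append(processtwt(i))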
