Incrementing a value inside a dictionary - python

I am trying to model a bag of words. I am having some trouble incrementing the counter inside my dictionary when the word is found in my data (type series):
def build_voc(self, data):
    for document in data:
        for word in document.split(' '):
            if word in self.voc:
                self.voc_ctr[word] = self.voc_ctr[word] + 1
            else:
                self.voc.append(word)
                self.voc_ctr = 1
I tried indexing it as well this way just to test where the error was:
self.voc_ctr[word][0] = self.voc_ctr[word][0] + 1
But it still gives me the same error at that line:
TypeError: 'int' object is not subscriptable
Knowing that this is a function in the same class, where self.voc and self.voc_ctr are defined:
class BV:
    def __init__(self):
        self.voc = []
        self.voc_ctr = {}
    def build_voc(self, data):
        for document in data:
            for word in document.split(' '):
                if word in self.voc:
                    self.voc_ctr[word] = self.voc_ctr[word] + 1
                else:
                    self.voc.append(word)
                    self.voc_ctr = 1
The error seems to say self.voc_ctr is an int object, but I defined it as a dictionary so I don't know where I went wrong.

Your code isn't going into your "if" statement first, it's going into your "else" and initializing your self.voc_ctr to the integer, 1.
It looks like there is one more thing to watch for. In this part of the code:
if word in self.voc:
    self.voc_ctr[word] = self.voc_ctr[word] + 1
...the right-hand side, self.voc_ctr[word] + 1, is evaluated before the assignment happens, so the key has to exist in the dictionary already or you will get a KeyError. That means the "else" branch must create the entry for the new word instead of replacing the whole dictionary with an integer.
To implement a counter for each word, try doing this:
if word in self.voc:
    self.voc_ctr[word] = self.voc_ctr[word] + 1
else:
    self.voc.append(word)
    self.voc_ctr[word] = 1
I don't know what else you have to do with this program, but this will solve your counter issue.

def build_voc(self, data):
    for document in data:
        for word in document.split(' '):
            if word in self.voc:
                self.voc_ctr[word] = self.voc_ctr[word] + 1
            else:
                self.voc.append(word)
                self.voc_ctr = 1  ## <-------- The bug is here: this replaces the whole dictionary with the int 1
This is not the best way to do it: you do not need a separate list to check for the word before adding it to a dictionary.
The dictionary itself is the best way to check whether the word exists or not.
Try this modified version:
voc_ctr = {}
def build_voc(data):
    for document in data:
        for word in document.split(' '):
            if word in voc_ctr:
                voc_ctr[word] += 1
            else:
                voc_ctr[word] = 1
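The same per-word counting can probably be written more compactly with collections.Counter from the standard library. This is only a sketch, not the asker's class: the standalone build_voc function and the sample documents are assumptions made for illustration, and split() without an argument is used, which (unlike split(' ')) also collapses repeated whitespace.

```python
from collections import Counter

def build_voc(data):
    """Count how often each word appears across an iterable of documents."""
    voc_ctr = Counter()
    for document in data:
        # update() adds 1 for every word in the list it receives
        voc_ctr.update(document.split())
    return voc_ctr

# Hypothetical sample data standing in for the pandas Series
counts = build_voc(["the cat sat", "the cat ran"])
print(counts["the"], counts["sat"], counts["dog"])  # 2 1 0
```

A Counter returns 0 for missing keys, so no separate membership check or vocabulary list is needed; list(counts) gives the vocabulary if it is wanted.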

Related

Why is there a memory error showing when I try to run this code?

So I'm having to write a function that does this:
Write the function called remove_all_from_string that takes two strings and returns a copy of
the first string with all instances of the second string removed. You can assume that the
second string is only one letter, like "a". For example remove_all_from_string("house", "h")
would output "ouse" or remove_all_from_string("haus", "h") would output "aus".
It has to have:
A function definition with parameters.
A while loop.
The find method.
Slicing and the + operator.
A return statement.
def remove_all_from_string(word, letter):
    while letter in word:
        x = word.find(letter)
        if x == -1:
            continue
        else:
            new = list(word)
            new[x] = ""
            word = str(word[:x]) + word.join(new)
    return word
print(remove_all_from_string("Mississippi", "i"))
Every time I try to run this, Python displays an Error Message:
Traceback (most recent call last):
  File "scratchpad.py", line 22, in <module>
    print(remove_all_from_string("Mississippi", "i"))
  File "scratchpad.py", line 19, in remove_all_from_string
    word = str(word[:x]) + word.join(new)
MemoryError
Could anyone help with this? Thank you for any answers!
Your code gets trapped in an endless loop because of word.join(new): it uses the whole word as the separator between the characters, so you keep adding word to itself again and again and it grows until memory runs out.
Here is one way to fix your code:
def remove_all_from_string(word, letter):
    while letter in word:
        x = word.find(letter)
        if x == -1:
            continue
        else:
            new = list(word)
            new[x] = ""
            word = "".join(new[:x] + new[x+1:])
    return word
print(remove_all_from_string("Mississippi", "i"))
A better way to implement this function:
def remove_all_from_string(word, letter):
    return word.replace(letter, "")
Simplify the while loop's logic:
There's already an answer with the 'best' way to solve the problem. In general, if I think I need a while loop I'm normally wrong, but if they're a requirement they're a requirement I guess.
def remove_all_from_string_converted(word: str, letter: str) -> str:
    while letter in word:
        index: int = word.find(letter)
        word = word[:index] + word[index+1:] # 'join' isn't necessary unless it's also required by a rule
    return word
print(remove_all_from_string_converted("Mississippi", "i"))
Output: Msssspp
You are accumulating copies of the variable "word" into itself, which results in exponential growth of the data in the variable, resulting in the computer running out of RAM. Given the task description this is a bug.
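To get a feel for how fast the string grows, the buggy update can be run for just a couple of steps while recording the length. This is a minimal sketch: the helper name buggy_lengths and its steps parameter are made up for illustration and are not part of the original code.

```python
def buggy_lengths(word, letter, steps):
    """Return len(word) after each of the first `steps` buggy updates."""
    lengths = []
    for _ in range(steps):
        x = word.find(letter)
        if x == -1:
            break
        new = list(word)
        new[x] = ""
        # Same update as in the question: `word` is the separator, so a
        # full copy of it is inserted between every pair of characters.
        word = word[:x] + word.join(new)
        lengths.append(len(word))
    return lengths

print(buggy_lengths("Mississippi", "i", 2))  # [121, 14643]
```

Each step roughly squares the length, so a third step would already build a string of a couple of hundred million characters and a fourth cannot fit in memory. Keep steps small when trying this.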
As it seems you may be new to programming, I suggest this strategy for helping yourself: print the values of all your variables (and add new variables for important intermediate results) so you can see what your program is actually doing. Once you understand what the program is doing, you can begin to fix it.
def remove_all_from_string(word, letter):
    debug_loop_count = 0
    while letter in word:
        debug_loop_count += 1
        if debug_loop_count > 2:  # change this number to control how many loop iterations are printed
            print("breaking loop because debug_loop_count exceeded")
            break
        print(f"--- begin loop iteration #{debug_loop_count}, word: '{word}'")
        x = word.find(letter)
        if x == -1:
            print(f"--- end loop iteration #{debug_loop_count} x={x} (continue)")
            continue
        else:
            new = list(word)
            print(f"variable new is: '{new}' x={x}")
            new[x] = ""
            print(f"variable new is updated to: '{new}' x={x}")
            str1 = str(word[:x])
            str2 = word.join(new)
            print(f"variable str1 is: '{str1}'")
            print(f"variable str2 is: '{str2}'")
            word = str1 + str2
            print(f"variable word now contains: '{word}'")
        print(f"--- end iteration loop #{debug_loop_count}")
    print(f"!!! end of function, word = {word}")
    return word
print(remove_all_from_string("Mississippi", "i"))

Python - Dictionary function only one entry for dictionary

Sorry if this is a silly question but I am new to Python. I have a piece of code that opens a text file, reads it, creates a list of words, and then from that list creates a dictionary of each word with a count of how many times it appears in the list. This code was working fine and printed the dictionary correctly. However, when I put it in a function and called the function, it returned the dictionary with only one entry. Any ideas why? Any help is much appreciated.
def createDict():
wordlist = []
with open('superman.txt','r', encoding="utf8") as superman:
for line in superman:
for word in line.split():
wordlist.append(word)
#print(word)
table = str.maketrans("!#$%&()*+, ./:;<=>?#[\]^_`{|}~0123456789'“”-''—", 47*' ' )
lenght = len(wordlist)
i = 0
while i < lenght:
wordlist[i] = wordlist[i].translate(table)
wordlist[i] = wordlist[i].lower()
wordlist[i] = wordlist[i].strip()
i += 1
wordlist = list(filter(str.strip, wordlist))
word_dict = {}
for item in wordlist:
if item in word_dict.keys():
word_dict[item] += 1
else:
word_dict[item] = 1
return(word_dict)
Try initializing the dictionary outside of the function and then using global inside the function. Is that one item in the dictionary the last iteration?
Fix the indentation of the loop over wordlist. It should read:
for item in wordlist:
    if item in word_dict.keys():
        word_dict[item] += 1
    else:
        word_dict[item] = 1
This seems to be an indentation and whitespace issue. Make sure the if and else statements near the end of your function are at the same level.
Below is the code I got working with the indentation at the correct level, along with comments that explain the thought process:
def createDict():
    wordlist = []
    with open('superman.txt','r', encoding="utf8") as superman:
        for line in superman:
            for word in line.split():
                wordlist.append(word)
                #print(word)
    table = str.maketrans("!#$%&()*+, ./:;<=>?#[\]^_`{|}~0123456789'“”-''—", 47*' ' )
    lenght = len(wordlist)
    i = 0
    while i < lenght:
        wordlist[i] = wordlist[i].translate(table)
        wordlist[i] = wordlist[i].lower()
        wordlist[i] = wordlist[i].strip()
        i += 1
    wordlist = list(filter(str.strip, wordlist))
    # print(len(wordlist)) # check to see if wordlist is fine. Indeed it is
    word_dict = {}
    for item in wordlist:
        # For dictionaries you don't need the dict.keys() method.
        # You can use the shorter "if [value] in [dict]" conditional.
        # The issue in your code was the whitespace and indentation
        # of the else statement.
        # Please make sure that if and else are at the same indentation level.
        # Python reads blocks by indentation and whitespace because it
        # doesn't use curly brackets like other languages such as JavaScript.
        if item in word_dict:
            word_dict[item] += 1
        else:
            word_dict[item] = 1
    return word_dict # print here too
Please let me know if you have any questions. Cheers!
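For comparison, here is a shorter sketch of the same word count built on collections.Counter. It is an assumption-laden illustration rather than a drop-in replacement: count_words is a made-up name, it takes any iterable of lines instead of opening superman.txt, and it deletes ASCII punctuation and digits (via string.punctuation and string.digits) instead of replacing the exact character set from the question with spaces.

```python
import string
from collections import Counter

def count_words(lines):
    """Count lower-cased words, ignoring ASCII punctuation and digits."""
    # Third argument of maketrans lists characters to delete outright.
    table = str.maketrans("", "", string.punctuation + string.digits)
    counts = Counter()
    for line in lines:
        for word in line.split():
            cleaned = word.translate(table).lower()
            if cleaned:  # skip tokens that were only punctuation or digits
                counts[cleaned] += 1
    return dict(counts)

# Hypothetical sample lines standing in for the text file
print(count_words(["Superman flies!", "superman, flies fast 123"]))
```

Because there is no hand-written if/else inside the loop, there is no else branch to misalign.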

I am trying to make a program that tracks the most frequent character in a string presented by a user. What am I doing wrong here?

This is what I've come up with so far. It gives me an error saying that "each" is not defined and I don't know what to do to make it work. I am very new to coding so any advice is much appreciated.
my_string = input("Enter a sentence: ")
def main(my_string):
    count = {}
    for ch in my_string:
        if ch in count:
            count[each] += 1
        else:
            count[each] = 1
    return count
main(my_string)
Maybe you meant to say ch instead of each both times.
This error comes up because you never defined the each variable before using it.
Just change each to ch:
my_string = input("Enter a sentence: ")
def main(my_string):
    count = {}
    for ch in my_string:
        if ch in count:
            count[ch] += 1
        else:
            count[ch] = 1
    return count
main(my_string)
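The corrected function returns the counts but does not yet pick the most frequent character, which is what the question title asks for. One possible way to finish it is sketched below; the function name most_frequent_char and the fixed sample string are assumptions, and input() is left out so the snippet runs unattended.

```python
def most_frequent_char(text):
    """Return (character, count) for the most common character in text."""
    count = {}
    for ch in text:
        count[ch] = count.get(ch, 0) + 1
    if not count:
        return None  # empty input has no most frequent character
    # max() with key=count.get picks the key with the highest count;
    # on a tie it returns the first such key in insertion order.
    best = max(count, key=count.get)
    return best, count[best]

print(most_frequent_char("hello world"))  # ('l', 3)
```

Note that spaces are counted like any other character here; filter them out first if that is not wanted.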

python is leaving some words out in a string

In my high school class I have been assigned the task of creating a keyword cipher. However when I run it through a python visualizer I still can not see what it is doing wrong.
Here is my code so far:
n = 0
l = 0
string = ""
message = input("enter your sentence: ")
message = (message.lower())
keyword = input("enter keyword: ")
keyword = (keyword.lower())
listkw = list(keyword)
def change(message,listkw,n,string,l):
    try:
        for letter in message:
            if letter == " ":
                string += " "
            else:
                temp = listkw[n]
                mtemp = ord(letter)
                mtemp -= 96
                ktemp = ord(temp)
                ktemp -= 96
                letterans = mtemp+ktemp
                if letterans >= 27:
                    letterans -= 26
                letterans += 96
                ans = chr(letterans)
                string += ans
                print (string)
                n+=1
                message = message[l:]
                l+=1
    except IndexError:
        n= 0
        change(message,listkw,n,string,l)
change(message,listkw,n,string,l)
print (string)
When I run it with the following input
enter your sentence: computingisfun
enter keyword: gcse
it should print jrfubwbsnllkbq, because it gets the place in the alphabet for each letter adds them up and print that letter.
For example:
change('a', list('b'), 0, "", 0)
prints out c because a = 1 and b = 2 and a+b = 3 (which is (c))
But it prints out jrfupqzn, which is not at all what I expected.
I understand that you're in high school, so I replaced some pieces of code that you wrote unnecessarily; you'll do better with experience ;)
First of all, it isn't a good idea to base your program's flow on an exception. It is better to add a condition that resets your n value, so that the exception isn't necessary: n = n + 1 if n + 1 < len(listkw) else 0
Then, you have a small problem with the scope of your variables. You set string = "" at the start of your script, but the string inside the function is a separate local name, so when you print(string) at the end you still have the empty string. It is better to define the values used inside the function (n, l and string) within the function's own scope and finally return the desired value (the computed cipher string).
So, the code looks something like this:
Read and initialize your required data:
message = input("enter your sentence: ").lower()
keyword = input("enter keyword: ").lower()
listkw = list(keyword)
Define your function:
def change(message,listkw):
    n = l = 0
    string = ""
    for letter in message:
        if letter == " ":
            string += " "
        else:
            temp = listkw[n]
            mtemp = ord(letter) - 96
            ktemp = ord(temp) - 96
            letterans = mtemp + ktemp
            if letterans >= 27:
                letterans -= 26
            letterans += 96
            string += chr(letterans)
            message = message[l:]
            l+=1
            n = n + 1 if n + 1 < len(listkw) else 0
    return string
Call and print the return value at the same time ;)
print(change(message,listkw))
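The scoping point made in this answer can be seen in isolation. In the sketch below (append_exclamation is a made-up name used only for illustration), the += inside the function rebinds the local parameter; the module-level string is untouched, which is why the result has to be returned.

```python
text = ""

def append_exclamation(text):
    # Strings are immutable: += creates a new string and rebinds the
    # local name `text`; the module-level `text` is not changed.
    text += "!"
    return text

result = append_exclamation(text)
print(repr(text))    # ''
print(repr(result))  # '!'
```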
You've got quite a few problems with what you're doing here. You're mixing recursion, iteration, and exceptions in a bundle of don't do that.
I think you may have had a few ideas about what to do, and you started down one track and then changed to go down a different track. That's not surprising, given the fact that you're a beginner. But you should learn that it's a good idea to be consistent. And you can do this using recursion, iteration, slicing, or driven with exceptions. But combining them all without understanding why you're doing it is a problem.
Design
Let's unwind your application into what you actually are trying to do. Without writing any code, how would you describe the steps you're taking? This is what I would say:
For every letter in the message:
    take the next letter from the keyword
    combine the numeric value of the two letters
    if the letter is beyond Z(ebra), start back at A and keep counting
    when we reach the last letter in the keyword, loop back to the beginning
This gives us a hint as to how we could write this. Indeed the most straightforward way, and one that you've got partially done.
Iteratively
Here's another pointer - rather than starting off with a dynamic problem, let's make it pretty static:
message = 'computing is awesome'
for letter in message:
    print(letter)
You'll see that this prints out the message - one character per line. Great! We've got the first part of our problem done. Now the next step is to take letters from the key. Well, let's put a key in there. But how do we iterate over two strings at a time? If we search google for python iterate over two sequences, the very first result for me was How can I iterate through two lists in parallel?. Not bad. It tells us about the handy dandy zip function. If you want to learn about it you can search python3 zip or just run >>> help(zip) in your REPL.
So here's our code:
message = 'computing is awesome'
keyword = 'gcse'
for letter, key in zip(message, keyword):
    print(letter, key)
Now if we run this... uh oh!
c g
o c
m s
p e
Where's the rest of our string? It's stopping after we get to the end of the shortest string. If we look at the help for zip, we see:
continues until the shortest iterable in the argument sequence is exhausted
So it's only going to go until the shortest thing. Well that's a bummer. That means we need to have a key and message the same length, right? Or does it? What if our key is longer than the message? Hopefully by now you know that you can do something like this:
>>> 'ab'*10
'abababababababababab'
If we make sure that our key is at least as long as our message, that will work. So we can just multiply the key times the number of letters in our message. I mean, we'll have way more than we need, but that should work, right? Let's try it out:
message = 'computing is awesome'
keyword = 'gcse'*len(message)
for letter, key in zip(message, keyword):
    print(letter, key)
Sweet! It worked!
So now let's try just adding the ord values and let's see what we get:
for letter, key in zip(message, keyword):
    print(chr(ord(letter)+ord(key)))
Oh.. dear. Well those aren't ASCII letters. As you've already found out, you need to subtract 96 from each of those. As it turns out because math, you can actually just subtract 96*2 from the sum that we've already got.
for letter, key in zip(message, keyword):
    if letter == ' ':
        print()
    else:
        new_code = (ord(letter)+ord(key)-96*2)
        print(chr(new_code+96))
But we've still got non-alpha characters here. So if we make sure to just bring that value back around:
for letter, key in zip(message, keyword):
    if letter == ' ':
        print()
    else:
        new_code = (ord(letter)+ord(key)-96*2)
        if new_code > 26:
            new_code -= 26
        print(chr(new_code+96))
Now we're good. The only thing that we have left to do is combine our message into a string instead of print it out, and stick this code into a function. And then get our input from the user. We're also going to stick our key-length-increasing code into the function:
def change(message, keyword):
    if len(keyword) < len(message):
        keyword = keyword * len(message)
    result = ''
    for letter, key in zip(message, keyword):
        if letter == ' ':
            result += ' '
        else:
            new_code = (ord(letter)+ord(key)-96*2)
            if new_code > 26:
                new_code -= 26
            result += chr(new_code+96)
    return result
message = input('enter your sentence: ')
keyword = input('enter your keyword: ')
print(change(message, keyword))
Recursion
So we've got it working using iteration. What about recursion? You're definitely using recursion in your solution. Well, let's go back to the beginning, and figure out how to print out our message, letter by letter:
message = 'computing is awesome'
def change(message):
    if not message:
        return
    print(message[0])
    change(message[1:])
change(message)
That works. Now we want to add our key. As it turns out, we can actually do the same thing that we did before - just multiply it:
def change(message, keyword):
    if not message:
        return
    if len(keyword) < len(message):
        keyword = keyword*len(message)
    print(message[0], keyword[0])
    change(message[1:], keyword[1:])
Well that was surprisingly simple. Now let's print out the converted value:
def change(message, keyword):
    if not message:
        return
    if len(keyword) < len(message):
        keyword = keyword*len(message)
    new_code = (ord(message[0])+ord(keyword[0])-96*2)
    if new_code > 26:
        new_code -= 26
    print(chr(new_code+96))
    change(message[1:], keyword[1:])
Again we need to handle a space character:
def change(message, keyword):
    if not message:
        return
    if len(keyword) < len(message):
        keyword = keyword*len(message)
    if message[0] == ' ':
        print()
    else:
        new_code = (ord(message[0])+ord(keyword[0])-96*2)
        if new_code > 26:
            new_code -= 26
        print(chr(new_code+96))
    change(message[1:], keyword[1:])
Now the only thing that's left is to combine the result. In recursion you usually pass some kind of value around, and we're going to do that with our result:
def change(message, keyword, result=''):
    if not message:
        return result
    if len(keyword) < len(message):
        keyword = keyword*len(message)
    if message[0] == ' ':
        result += ' '
    else:
        new_code = (ord(message[0])+ord(keyword[0])-96*2)
        if new_code > 26:
            new_code -= 26
        result += chr(new_code+96)
    return change(message[1:], keyword[1:], result)
print(change(message, keyword))
Slicing
We used some slicing in our recursive approach. We even could have passed in the index, rather than slicing off parts of our string. But now we're going to slice and dice. It's going to be pretty similar to our recursive solution:
def change(message, keyword):
    if len(keyword) < len(message):
        keyword = keyword*len(message)
    while message:
        print(message[0], keyword[0])
        message = message[1:]
        keyword = keyword[1:]
When you see that, it shouldn't be much of a stretch to realize that you can just put in the code from our recursive solution:
while message:
    if message[0] == ' ':
        print()
    else:
        new_code = (ord(message[0])+ord(keyword[0])-96*2)
        if new_code > 26:
            new_code -= 26
        print(chr(new_code+96))
    message = message[1:]
    keyword = keyword[1:]
And then we just combine the characters into the result:
def change(message, keyword):
    if len(keyword) < len(message):
        keyword = keyword*len(message)
    result = ''
    while message:
        if message[0] == ' ':
            result += ' '
        else:
            new_code = (ord(message[0])+ord(keyword[0])-96*2)
            if new_code > 26:
                new_code -= 26
            result += chr(new_code+96)
        message = message[1:]
        keyword = keyword[1:]
    return result
Further Reading
You can do some nicer things. Rather than the silly multiplication we did with the key, how about itertools.cycle?
What happens when you use modulo division instead of subtraction?
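As a rough illustration of both suggestions, the iterative version might look like the sketch below. The name change_cycled is made up here, and the code assumes the message and keyword contain only lowercase a-z and spaces. As in the zip-based versions, a space in the message still consumes one key letter.

```python
from itertools import cycle

def change_cycled(message, keyword):
    """Keyword cipher using cycle() for the key and modulo for wrap-around."""
    result = ''
    # cycle() repeats the keyword forever; zip() stops when message ends.
    for letter, key in zip(message, cycle(keyword)):
        if letter == ' ':
            result += ' '
        else:
            total = (ord(letter) - 96) + (ord(key) - 96)  # a=1 ... z=26
            # (total - 1) % 26 + 1 maps 27 back to 1 while keeping 26 as 26
            result += chr((total - 1) % 26 + 1 + 96)
    return result

print(change_cycled('computingisfun', 'gcse'))  # jrfubwbsnllkbq
```

With cycle the key never has to be multiplied out to the message length, and the modulo expression replaces the explicit "if new_code > 26" check.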

Break from for loop

Here is my code:
def detLoser(frag, a):
    word = frag + a
    if word in wordlist:
        lost = True
    else:
        for words in wordlist:
            if words[:len(word) == word:
                return #I want this to break out.
            else:
                lost = True
Where I have a return, I've tried putting in both return and break and both give me errors. Both give me the following error: SyntaxError: invalid syntax. Any Ideas? What is the best way to handle this?
You've omitted the ] from the list slice. But what is the code trying to achieve, anyway?
foo[ : len( foo ) ] == foo
always!
I assume this isn't the complete code -- if so, where is wordlist defined? (is it a list? -- it's much faster to test containment for a set.)
def detLoser(frag, a):
    word = frag + a
    if word in wordlist:
        lost = True
    else:
        for words in wordlist:
            if word.startswith(words):
                return #I want this to break out.
            else:
                lost = True
You can probably rewrite the for loop using any or all, e.g. (you should use a set instead of a list for wordlist, though):
def detLoser(frag, a):
    word = frag + a
    return word in wordlist or any(w.startswith(word) for w in wordlist)
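The question does not spell out the rules, so the following is only a guess at the intent: a player loses when the new fragment is a complete word, or when no word in the list starts with it. Under that assumption, and with det_loser, the wordlist parameter, and the sample set all being hypothetical names and data, a set-based sketch could look like this:

```python
def det_loser(frag, a, wordlist):
    """Return True if adding letter `a` to `frag` loses (assumed rules)."""
    word = frag + a
    if word in wordlist:  # completed a word: assumed to lose
        return True
    # Still safe only if some word in the list starts with the fragment.
    return not any(w.startswith(word) for w in wordlist)

words = {"cat", "cart", "dog"}  # hypothetical sample data
print(det_loser("ca", "t", words))  # True
print(det_loser("c", "a", words))   # False
print(det_loser("c", "x", words))   # True
```

Returning the result instead of setting a local lost variable also removes the need to break out of the loop at all.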
