Instead of looping over each separate character of a string, I want to loop over parts of a string (multiple characters). Those parts are defined by the keys of a dictionary.
Example:
my_dict = {'010': 'a', '000': 'e', '1101': 'f', '1010': 'h', '1000': 'i', '0111': 'm', '0010': 'n', '1011': 's', '0110': 't', '11001': 'l', '00110': 'o', '10011': 'p', '11000': 'r', '00111': 'u', '10010': 'x'}
word = "1000001001100001100000100000110"
output = ""
What I've tried (looping over each character separately, indeed):
for i in word:
    letter = my_dict[i]
    output += letter
    word = word.lstrip(letter)
My output:
"KeyError: '1'"
But I want to get key "1000" and its value "i", and then continue with key "0010" and get its value "n", etc...
Expected output:
output = "internet"
Assuming it's a prefix code (otherwise you'd need to define how to deal with ambiguities), accumulate the bits until you have a match, then output the letter and clear the bits:
output = ""
bits = ""
for bit in word:
    bits += bit
    if bits in my_dict:
        letter = my_dict[bits]
        output += letter
        bits = ""
A slight variation of the lookup, prompted by Jnevill's answer:
    if letter := my_dict.get(bits):
        output += letter
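The accumulate-until-match idea can also be wrapped in a function. This is only a sketch: the name decode_prefix_code and the decision to raise ValueError on leftover bits are assumptions for illustration, not part of the answer above.

```python
def decode_prefix_code(bit_string, code_table):
    """Decode bit_string with a prefix-free table mapping bit patterns to letters."""
    decoded = []
    pending = ""
    for bit in bit_string:
        pending += bit
        letter = code_table.get(pending)
        if letter is not None:
            decoded.append(letter)
            pending = ""  # start collecting the next code
    if pending:
        # Assumption: leftover bits that match no key are treated as an error.
        raise ValueError(f"undecodable trailing bits: {pending!r}")
    return "".join(decoded)


my_dict = {'010': 'a', '000': 'e', '1101': 'f', '1010': 'h', '1000': 'i',
           '0111': 'm', '0010': 'n', '1011': 's', '0110': 't', '11001': 'l',
           '00110': 'o', '10011': 'p', '11000': 'r', '00111': 'u', '10010': 'x'}
word = "1000001001100001100000100000110"
print(decode_prefix_code(word, my_dict))  # internet
```

Collecting the letters in a list and joining once avoids repeated string concatenation, and the final check makes truncated input visible instead of silently dropping bits.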
You could use a regular expression to substitute the patterns with the corresponding letters. re.sub accepts a function as the replacement, which can look up each match in the dictionary to get the letter. The search pattern needs to list the longer keys first so that they are "consumed" in priority over shorter patterns that could start with the same bits:
import re

my_dict = {'010': 'a', '000': 'e', '1101': 'f', '1010': 'h', '1000': 'i', '0111': 'm', '0010': 'n', '1011': 's', '0110': 't', '11001': 'l', '00110': 'o', '10011': 'p', '11000': 'r', '00111': 'u', '10010': 'x'}
word = "1000001001100001100000100000110"
pattern = "|".join(sorted(my_dict.keys(), key=len, reverse=True))
output = re.sub(pattern, lambda m: my_dict[m.group(0)], word)
print(output)  # internet
[EDIT]
If there are no conflicts between short and long bit patterns, the sort is not needed (as Kelly pointed out), and the solution can be a single line:
output = re.sub('|'.join(my_dict),lambda m:my_dict[m[0]],word)
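To see why the ordering matters when the keys do conflict, here is a small sketch with a made-up table (ambiguous is hypothetical and is not the question's dictionary) in which '0' is a prefix of '01'. Regex alternation tries the alternatives left to right, so the shorter key wins unless the longer one is listed first.

```python
import re

# Hypothetical table that is NOT a prefix code: '0' is a prefix of '01'.
ambiguous = {'0': 'a', '01': 'b'}


def substitute(keys, text):
    """Replace every occurrence of a key in text with its letter."""
    pattern = "|".join(keys)
    return re.sub(pattern, lambda m: ambiguous[m[0]], text)


# Insertion order gives the pattern '0|01': '0' matches first, '1' is left over.
unsorted_result = substitute(ambiguous, "01")
# Longest-first gives the pattern '01|0': the whole input matches '01'.
sorted_result = substitute(sorted(ambiguous, key=len, reverse=True), "01")

print(unsorted_result)  # a1
print(sorted_result)    # b
```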
Issue with your code:
for i in word:  # here, i is a single character
    # so you can't get the corresponding value, since the keys are multiple characters long
    letter = my_dict[i]
    output += letter  # this would work fine
    word = word.lstrip(letter)
You can do a while loop on word, and remove the part you found in the dict each time. When word is empty, the loop stops and the program ends.
You can iterate over each key in the dict and test whether it matches the beginning of the word. If it does, you have the letter you are looking for. Do what you want instead of the print, and repeat.
translate_table = {'010': 'a', '000': 'e', '1101': 'f', '1010': 'h', '1000': 'i', '0111': 'm', '0010': 'n', '1011': 's', '0110': 't', '11001': 'l', '00110': 'o', '10011': 'p', '11000': 'r', '00111': 'u', '10010': 'x'}
message = "1000001001100001100000100000110"
while message:
    for code, letter in translate_table.items():
        if message.startswith(code):
            # replace this with whatever you want to do with the letter
            print(letter, end="")
            # "Cut" the word to keep the remaining characters
            message = message[len(code):]
            break  # a letter was found, move to next while iteration
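One caveat with the loop above: if the remaining message starts with something that matches no code, the while loop never finishes. A hedged sketch of a guarded variant follows; the function name decode, the reduced table, and the choice to raise ValueError are illustrative assumptions.

```python
def decode(message, translate_table):
    """Repeatedly strip a matching code from the front of message."""
    letters = []
    while message:
        for code, letter in translate_table.items():
            if message.startswith(code):
                letters.append(letter)
                message = message[len(code):]
                break
        else:
            # Runs only if the for loop found no match; without it the
            # while loop would spin forever on undecodable input.
            raise ValueError(f"no code matches the start of {message!r}")
    return "".join(letters)


# Subset of the table above, enough to decode the example message.
translate_table = {'000': 'e', '1000': 'i', '0010': 'n', '0110': 't', '11000': 'r'}
print(decode("1000001001100001100000100000110", translate_table))  # internet
```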
While iterating my_dict (as DorianTurba suggests) feels like a more elegant solution, your gut was suggesting that you should iterate word. To do this you can use a while loop and then manage the length of characters you jump in each iteration depending on the size of the my_dict key that matches the first 3, 4, or 5 characters in word.
Consider:
my_dict = {'010': 'a', '000': 'e', '1101': 'f', '1010': 'h', '1000': 'i', '0111': 'm', '0010': 'n', '1011': 's', '0110': 't', '11001': 'l', '00110': 'o', '10011': 'p', '11000': 'r', '00111': 'u', '10010': 'x'}
word = "1000001001100001100000100000110"
i=0
while len(word) > i:
    for size in [3,4,5]:
        if my_dict.get(word[i:i+size]):
            print(my_dict[word[i:i+size]])
            i += size
            break
First time posting, new to programming. The issue I'm having is that when I run my pangram function and turn the input string into a list built from a set, the list still contains 't' twice, though no other letter is duplicated. When I input "The quick brown fox jumps over the lazy dog" and my code turns it into a sorted list to compare against the alphabet, there are two t's. As you can see, I'm using print to see what everything is doing.
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 't', 'u', 'v', 'w', 'x', 'y', 'z']
Above is what my input string converts to: there are two t's, but all other duplicate letters are gone. I also tried manually making the uppercase T lowercase, and making other random letters uppercase, and it has no problem with other letters.
import string

def ispangram(str1, alphabet=string.ascii_lowercase):
    str1 = str1.replace(' ','')
    str1 = list(set(str1))
    str1 = [letter.lower() for letter in str1]
    str1.sort()
    print(str1)
    alphabet = list(set(alphabet))
    alphabet.sort()
    print(alphabet)
    if str1 == alphabet:
        return 'Is Pangram!'
    else:
        return 'Is not Pangram!'
You're converting your string to lowercase after building the set; you should lowercase it before building the set:
str1 = list(set(str1.lower()))
You are lowercasing the characters after building the set, so T and t are considered different.
Instead, lowercase before building the set:
def ispangram(str1, alphabet=string.ascii_lowercase):
    str1 = str1.replace(' ','')
    str1 = [letter.lower() for letter in str1]
    str1 = list(set(str1))
    str1.sort()
    print(str1)
Or in a much shorter way:
def ispangram(str1, alphabet=string.ascii_lowercase):
    str1 = sorted(set(str1.replace(' ','').lower()))
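An alternative that avoids sorting and list comparison altogether is a subset test. This is a sketch under the assumption that a boolean result is acceptable; the name is_pangram is hypothetical and differs from the question's ispangram, which returns strings.

```python
import string


def is_pangram(text, alphabet=string.ascii_lowercase):
    """Return True if every letter of alphabet occurs in text (case-insensitive)."""
    # Lowercase first, then build the set, so 'T' and 't' collapse into one entry.
    return set(alphabet) <= set(text.lower())


print(is_pangram("The quick brown fox jumps over the lazy dog"))  # True
print(is_pangram("Hello, world"))                                 # False
```

Because it only checks that the alphabet is a subset of the characters seen, spaces and punctuation in the input do not need to be removed first.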
Hey guys, I'm new to Python. I was playing around writing Python and I'm stuck.
words = ['w','hello.','my','.name.','(is)','james.','whats','your','name?']
alphabet = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
'''#position of the invalid words i.e the ones that '''
inpos = -1
for word in words:
    inpos = inpos + 1
    #pass
    for letter in word:
        #print(letter)
        if letter in alphabet:
            pass
            #print('Valid')
        elif letter not in alphabet:
            new_word = word.replace(letter,"")
            print(word)
            print(new_word)
            words[inpos] = new_word
print(words)
This code is meant to clean the text (remove all full stops, commas, and other characters)
The problem is that when I run it, it seems to remove the brackets and then add them back.
Here's the output:
[image of output]
Can anyone explain why this is happening?
No, it does not add anything. You're printing both old and new word:
print(word)
print(new_word)
so when new_word is "(is", word is still "(is)".
BTW your code has a logical error: when you remove a character you put back new_word in the list, but word is still the old value. So only the last change for every word will be saved in the list words.
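One way to avoid that logical error is to build each cleaned word in a single pass instead of replacing one character at a time. This is a sketch, not the original poster's code; the names allowed and cleaned are made up for illustration.

```python
import string

words = ['w', 'hello.', 'my', '.name.', '(is)', 'james.', 'whats', 'your', 'name?']

# Assumption: only lowercase ASCII letters count as valid, as in the question.
allowed = set(string.ascii_lowercase)

# Keep only the allowed characters of each word, so every removal is preserved.
cleaned = ["".join(ch for ch in word if ch in allowed) for word in words]

print(cleaned)
# ['w', 'hello', 'my', 'name', 'is', 'james', 'whats', 'your', 'name']
```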
I'm trying to pop any letters given by the user, for example if they give you a keyword "ROSES" then these letters should be popped out of the list.
Note: I have a lot more explanation after the SOURCE CODE
SOURCE CODE
alphabet = ["A","B","C","D","E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y","Z"]
encrypted_message = []
key_word = str(input("Please enter a word:"))
key_word = list(key_word)
#print(key_word)
check_for_dup = 0
for letter in key_word:
    for character in alphabet:
        if letter in character:
            check_for_dup +=1
            alphabet.pop(check_for_dup)
print(alphabet)
print(encrypted_message)
SAMPLE INPUT
Let's say the keyword is "Roses".
This is what it gives me, a list of the following: ['A', 'C', 'E', 'G', 'I', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
But that's wrong. It should have removed only the characters that exist in the keyword given by the user: for the word "Roses", each of its letters should have been removed from the list. As you can see, the letters "B","D","F","H", etc. are gone instead. What I'm trying to do is pop the alphabet entries at the indexes where the keyword's letters occur.
This is what should have happened:
["A","B","C","D","F","G","H","I","J","K","L","M","N","P","Q","T","U","V","W","X","Y","Z"]
The letters of the keyword "ROSES" were deleted from the list.
There are some shortcomings in your code. Here is an implementation that works:
alphabet = ["A","B","C","D","E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y","Z"]
key_word = str(input("Please enter a word:"))
for letter in key_word.upper():
    if letter in alphabet:
        alphabet.remove(letter)
print(alphabet)
Explanations
You can iterate over a string directly; there is no need to convert it to a list.
Use remove, since it takes the value to delete (here a str) rather than an index.
You need to .upper() the input because you want to remove A if the user inputs a.
Note that I did not handle encrypted_message since it is unused at the moment.
Also, as some comments say, you could use a set instead of a list, since lookups are faster for sets.
alphabet = {"A","B","C","D",...}
EDIT
alphabet = ["A","B","C","D","E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y","Z"]
key_word = str(input("Please enter a word:"))
encrypted_message = []
for letter in key_word.upper():
    if letter in alphabet:
        alphabet.remove(letter)
        encrypted_message.append(letter)
encrypted_message.extend(alphabet)
This is a new implementation with the handling of your encrypted_message. This will keep the order of the alphabet after the input of the user. Also, if you're wondering why there's no duplicate, you will be appending only if letter is in alphabet which means the second time it won't be in alphabet and therefore not added to your encrypted_message.
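The same result can be sketched without mutating the alphabet list, using dict.fromkeys to drop repeated keyword letters while keeping their order. The function name keyed_alphabet is hypothetical and the snippet assumes Python 3.7+, where dicts preserve insertion order.

```python
import string


def keyed_alphabet(key_word, alphabet=string.ascii_uppercase):
    """Return the keyword's unique letters followed by the rest of the alphabet."""
    # dict.fromkeys keeps the first occurrence of each letter, in order.
    key_letters = [c for c in dict.fromkeys(key_word.upper()) if c in alphabet]
    remaining = [c for c in alphabet if c not in key_letters]
    return key_letters + remaining


print("".join(keyed_alphabet("Roses")))  # ROSEABCDFGHIJKLMNPQTUVWXYZ
```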
You can check directly against the input key.
Iterate over all the letters in data and check whether each letter is in input_key; if it is, discard it.
data = ['A', 'C', 'E', 'G', 'I', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
input_key = 'Roses'
output = [l for l in data if l not in input_key.upper()]
print(output)
['A', 'C', 'G', 'I', 'K', 'L', 'M', 'N', 'P', 'Q', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
You can do something like this....
import string
alphabet = list(string.ascii_lowercase)
user_word = 'Roses'
user_word = user_word.lower()
letters_to_remove = set(list(user_word)) # Create a unique set of characters to remove
for letter in letters_to_remove:
    alphabet.remove(letter)
alphabet = ["A","B","C","D","E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y","Z"]
# this can be accepted as a user input as well
userinput = "ROSES"
# creates a list with unique values in uppercase
l = set(list(userinput.upper()))
# iterates over items in l to remove
# the corresponding items in alphabet
for x in l:
    alphabet.remove(x)
' '.join(alphabet).translate({ord(c): "" for c in keyword}).split()
Try this:
import string
key_word = set(input("Please enter a word:").lower())
print(sorted(set(string.ascii_lowercase) - key_word))
Explanation:
When checking for (non-)existence, it's better to use a set. https://docs.python.org/3/tutorial/datastructures.html#sets
With a list, each membership check scans the list, so you scan it N times (N = number of input characters). With a set, you hash each character and check whether there's already something at the resulting hash.
If you check the speed, try with "WXYZZZYW" and you'll see that it'll be a lot slower than if it were "ABCDDDCA" with the list way. With set, it will be always the same time.
The rest is pretty trivial. Casting to lowercase (or uppercase), to make sure it hits a match, case insensitive.
And then, we end by doing a set difference (-). It's all the items that are in the first set but not in the second one.
So I just learned how to manipulate single letters in a for loop from code academy.
But let's say I made a function and wanted this function to manipulate the vowels of an user inputted word and replace the vowel with four consecutive copies of itself. How would I go about that?
Expected output:
>>>Exclamation("car")
caaaar
>>>Exclamation("hello")
heeeelloooo
So far I have:
word = input("Enter a word: ")
vowels= ['a', 'e', 'i', 'o', 'u', 'A', 'E', 'I', 'O', 'U']
for char in word:
    if char in vowels:
        print(____,end='') #here I am unsure of how to replace it with consecutive copies of itself
    else:
        print(char,end='')
Your print statement can be:
print(4 * char,end='') # Or how many ever times you want to repeat it.
If word is 'car', this code:
>>> for char in word:
...     if char in vowels:
...         print(4 * char, end='')
...     else:
...         print(char, end='')
...
prints
caaaar
Note: You can include only the lower case vowels in your vowels list and in your if condition, check if char.lower() is in vowels.
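Putting the note into practice, here is a minimal sketch of a function version. The lowercase name exclamation and the repeat parameter are assumptions for illustration; the question only shows calls to Exclamation with a single argument.

```python
def exclamation(word, repeat=4):
    """Return word with every vowel replaced by `repeat` copies of itself."""
    vowels = "aeiou"  # lowercase only; char.lower() handles uppercase input
    return "".join(char * repeat if char.lower() in vowels else char
                   for char in word)


print(exclamation("car"))    # caaaar
print(exclamation("hello"))  # heeeelloooo
```

Returning the string rather than printing it character by character makes the result reusable, for example in tests or further string processing.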