Recursive function to convert characters - python

I am trying to write a program in Python which uses a recursive function to convert all the lower-case characters in a string to the next character. Here's my attempt:
def convert(s):
    if len(s) < 1:
        return ""
    else:
        return convert(chr(ord(s[0+1])))
print(convert("hello"))
When I try to run this program, it gives me the error: string index out of range. Could anyone please help me correct this? I'm not even sure if my program is coded correctly to give the required output :/

You want to return the shifted character and then call your convert function on the remainder of the string. If you must use recursion, you need to check if the string is exhausted (if not s is the same as if len(s) == 0 here because '' is equivalent to False) and bail:
def convert(s):
    if not s:
        return ''
    c = s[0]
    i = ord(c)
    if 96 < i < 123:
        # for lower-case characters permute a->b, b->c, ... y->z, z->a
        c = chr(((i-97)+1)%26 + 97)
    return c + convert(s[1:])
print(convert('hello'))
print(convert('abcdefghijklmnopqrstuvwxyz'))
Output:
ifmmp
bcdefghijklmnopqrstuvwxyza
The ASCII codes for 'a' and 'z' are 97 and 122 respectively, so we only apply the shift to characters whose codes, i, are in this range. Don't forget to wrap around if the character is z; you can do this with modular arithmetic: ((i-97)+1)%26 + 97.
EDIT explanation: Subtract 97 so that the code becomes 0 to 25, then add 1 mod 26 such that 0+1 = 1, 1+1 = 2, ..., 24+1 = 25, 25+1 = 0. Then add 97 back on so that the code represents a letter between a and z. This way your letters will cycle round.
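To see the wrap-around formula at work, here is a small sketch that applies it to a few letters. The helper name shift_lower is made up for this illustration and assumes its argument is a single lower-case ASCII letter.

```python
def shift_lower(c):
    """Return the lower-case letter after c, wrapping z back to a."""
    i = ord(c)
    # 97 is ord('a'), so i - 97 maps a..z onto 0..25
    return chr(((i - 97) + 1) % 26 + 97)


for letter in "ayz":
    print(letter, "->", shift_lower(letter))
# a -> b
# y -> z
# z -> a
```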

You are trying to index the second character each time; Python indexes start at 0 so 0+1 is 1 is the second character. Your len() test doesn't guard against that, it only tests for empty strings.
You also pass in just one character to the recursive call, so you always end up with a string of length 1, which doesn't have a second character.
So your test with 'hello' does this:
convert('hello')
    len('hello') < 1 -> False
    s[0+1] == s[1] == 'e'; chr(ord('e')) is 'e'
    return convert('e')
        len('e') < 1 -> False
        s[0+1] == s[1] -> 'e'[1] raises an index error
If you wanted to use recursion, then you need to decide how to detect the end of the recursion path correctly. You could test for strings shorter than 2 characters, for example, as there is no next character to use in that case.
You also need to decide what to delegate to the recursive call. For a conversion like this, you could pass in the remainder of the string.
Last but not least, you need to test if the character you are going to replace is actually lowercase.

Related

Get the Sum of the Values of Letters in a Name

Numerologists claim to be able to determine a person's character traits based on the "numeric value" of a name. The value of a name is determined by summing up the values of the letters of the name. For example, the name Zelle would have the value 26 + 5 + 12 + 12 + 5 = 60. Write a function called get_numeric_value that takes a sequence of characters (a string) as a parameter. It should return the numeric value of that string. You can use to_numerical in your solution.
First, my code for to_numerical (a function I had to make to convert letters to numbers [notably not the ASCII codes; had to start from A = 1, etc.]) was this:
def to_numerical(character):
    if character.isupper():
        return ord(character) - 64
    elif character.islower():
        return ord(character) - 96
In regards to the actual problem, I'm stuck since I can only get the function to return the value of the Z in Zelle (which is supposed to be 26 here). I've pasted what I have so far below:
def get_numeric_value(string):
    numerals = []
    for character in string:
        if character.isupper():
            return ord(character) - 64
        elif character.islower():
            return ord(character) - 96
    return numerals
    addition = sum(numerals)
    return addition
I was thinking I would probably have to use sum() at some point, but I think the issue is more that I'm not getting all of the letters returned to the numerals list. How can I get it so the code will add up and return all the letters? I've been trying to think something up for an hour but I'm stumped.
The problem is that you use return several times. Each time the program reaches a return, it stops the function and returns the specified value.
In your case the first letter satisfies one of the conditionals (specifically character.isupper()), so the function returns the value of that letter and ends.
I think you wanted the .append() method, which adds an element to a list, giving something like this:
def get_numeric_value(string):
    numerals = []
    for character in string:
        if character.isupper():
            numerals.append(ord(character) - 64)
        elif character.islower():
            numerals.append(ord(character) - 96)
    addition = sum(numerals)
    return addition
You can also use the to_numerical function you declared earlier, giving a much cleaner result (in my opinion).
def get_numeric_value(string):
    numerals = []
    for character in string:
        numerals.append(to_numerical(character))
    addition = sum(numerals)
    return addition
If you want it to be just one line you can use something called list comprehension, though you sacrifice some readability.
def get_numeric_value(string):
    return sum([to_numerical(character) for character in string])
You could also keep a running total, add each character's value to it, and then return that total:
def get_numeric_value(string):
    total = 0  # named total so the built-in sum() is not shadowed
    for character in string:
        if character.isupper():
            total += ord(character) - 64
        elif character.islower():
            total += ord(character) - 96
    return total
This one line answer should work as well:
def get_numeric_value(string):
    return sum(ord(character) - (64 if character.isupper() else 96) for character in string if character.isalpha())
Why not lowercase the string first to get rid of all the if-else blocks?
Since all the characters are then lowercase, we can use this function:
def get_numeric_value(string: str) -> int:
    return sum([ord(c) - 96 for c in string.lower() if c.isalpha()])
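As a quick check of the ideas in this thread, here is a self-contained sketch that pairs to_numerical with a summing get_numeric_value. Returning 0 for anything that is not a letter is an assumption added here so that spaces or digits do not break the sum; the sketch also assumes plain ASCII letters.

```python
def to_numerical(character):
    """Map a letter to its alphabet position (A/a = 1 ... Z/z = 26)."""
    if character.isupper():
        return ord(character) - 64
    elif character.islower():
        return ord(character) - 96
    return 0  # assumption: anything that is not a letter counts as 0


def get_numeric_value(string):
    """Sum the alphabet positions of every letter in the string."""
    return sum(to_numerical(character) for character in string)


print(get_numeric_value("Zelle"))  # 26 + 5 + 12 + 12 + 5 = 60
```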

Why does my code remove 999 in my replacement code?

I have the code below to replace all punctuation with 999 and all alphabet characters with its number position. I have included the print statement that confirms punctuation is being replaced. However I seem to override with my remaining code to replace the other characters.
import string
def encode(text):
    punct = '''!()-[]{};:'"\,<>./?##$%^&*_~'''
    for x in text.lower():
        if x in punct:
            text = text.replace(x, ".999")
    print(text)
    nums = [str(ord(x) - 96)
            for x in text.lower()
            if x >= 'a' and x <= 'z'
            ]
    return ".".join(nums)
print(encode(str(input("Enter Text: "))))
Input: 'Morning! \n'
Output: '13.15.18.14.9.14.7 \n'
Expected Output: 13.15.18.14.9.14.7.999
No, you have two independent logical "stories" here. One replaces punctuation with 999. The other filters out all the letters and builds an independent list of their alphabetic positions.
    nums = [str(ord(x) - 96)
            for x in text.lower()
            if x >= 'a' and x <= 'z'
            ]
    return ".".join(nums)
Note that this does nothing to alter text, and it takes nothing but letters from text. If you want to include the numbers, do so:
    nums = [str(ord(x) - 96)
            if x >= 'a' and x <= 'z'
            else x
            for x in text.lower()
            ]
    return ".".join(nums)
Output of print(encode("[hello]")):
..9.9.9.8.5.12.12.15...9.9.9
    nums = [str(ord(x) - 96)
            for x in text.lower()
            if x >= 'a' and x <= 'z'
            ]
This means: take every character from the lowercase version of the string, and only if it is between 'a' and 'z', convert the value and put the result in nums.
In the first step, you replace a bunch of punctuation with text that includes '.' and '9' characters. But neither '9' nor '.' is between 'a' and 'z', so of course neither is preserved in the second step.
Now that I understand what you are going for: you have fundamentally the wrong approach to splitting up the problem. You want to separate the two halves of the rule for "encoding" a given part of the input. But what you want to do is separate the whole rule for encoding a single element, from the process of applying a single-element rule to the whole input. After all - that is what list comprehensions do.
This is the concept of separation of concerns. The two business rules are part of the same concern - because implementing one rule doesn't help you implement the other. Being able to encode one input character, though, does help you encode the whole string, because there is a tool for that exact job.
We can have a complicated rule for single characters - no problem. Just put it in a separate function, so that we can give it a meaningful name and keep things simple to understand. Conceptually, our individual-character encoding is a numeric value, so we will consistently encode as a number, and then let the string-encoding process do the conversion.
def encode_char(c):
    if c in '''!()-[]{};:'"\,<>./?##$%^&*_~''':
        return 999
    if 'a' <= c.lower() <= 'z':
        # lowercase first so that 'A' and 'a' both map to 1
        return ord(c.lower()) - 96
    # You should think about what to do in other cases!
    # In particular, you don't want digit symbols 1 through 9 to be
    # confused with letters A through I.
    # So I leave the rest up to you, depending on your requirements.
Now we can apply the overall encoding process: we want a string that puts '.' in between the string representations of the values. That's straightforward:
def encode(text):
    return '.'.join(str(encode_char(c)) for c in text)
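Putting the two pieces together, a runnable sketch might look like the following. Two choices here are assumptions, not part of the answer: string.punctuation stands in for the hand-written punctuation string, and characters that are neither letters nor punctuation (spaces, digits, newlines) are simply skipped.

```python
import string


def encode_char(c):
    """Return 999 for punctuation, 1-26 for letters, None for anything else."""
    if c in string.punctuation:
        return 999
    lower = c.lower()
    if 'a' <= lower <= 'z':
        return ord(lower) - 96
    return None  # assumption: other characters are dropped by encode()


def encode(text):
    """Join the encoded values of all encodable characters with dots."""
    values = (encode_char(c) for c in text)
    return '.'.join(str(v) for v in values if v is not None)


print(encode("Morning!"))  # 13.15.18.14.9.14.7.999
```

Keeping the per-character rule in one function means the skip-or-keep decision for digits and whitespace can be changed in a single place.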

Different results when return multiple values in python (Cryptopal challenges)

I'm working on problem 3(set 1) of the cryptopals challenges (https://cryptopals.com/sets/1/challenges/3)
I've already found the key ('x') and decrypted the message ('Cooking mcs like a pound of bacon')
Here is my code:
from hexToBase64 import hexToBinary
from fixedXOR import xorBuffers
def binaryToChar(binaryString):
    asciiValue = 0
    for i in range(int(len(binaryString))-1,-1,-1):
        if(binaryString[i] == '1'):
            asciiValue = asciiValue + 2**(7-i)
    return chr(asciiValue)
def decimalToBinary(number):
    binaryString = ""
    while (number != 0):
        bit = number % 2
        binaryString = str(bit) + binaryString
        number = int(number/2)
    while(len(binaryString) < 8):
        binaryString = "0" + binaryString
    return binaryString
def breakSingleByteXOR(cipherString):
    decryptedMess = ""
    lowestError = 10000
    realKey = ""
    for i in range(0,128):
        errorChar = 0
        tempKey = decimalToBinary(i)
        tempMess = ""
        for j in range(0,len(cipherString),2):
            #Take each byte of the cipherString
            cipherChar = hexToBinary(cipherString[j:j+2])
            decryptedChar = binaryToChar(xorBuffers(cipherChar,tempKey))
            asciiValue = ord(decryptedChar)
            if (not ((asciiValue >= 65) and (asciiValue <= 90)) \
                or ((asciiValue >= 90) and (asciiValue <= 122)) \
                or ( asciiValue == 32 )):
                # if the character is not one of the characters ("A-Z" or "a-z"
                # or " ") consider it as an "error"
                errorChar += 1
            tempMess = tempMess + decryptedChar
        if(errorChar < lowestError):
            lowestError = errorChar
            decryptedMess = tempMess
            realKey = chr(i)
    return (realKey,decryptedMess)
if __name__ == "__main__":
    print(breakSingleByteXOR("1b37373331363f78151b7f2b783431333d78397828372d363c78373e783a393b3736"))
The problem is that when breakSingleByteXOR returns a single value (decryptedMess), the output looks okay: "cOOKING mcS LIKE A POUND OF BACON"
But when I return two values from the function (as in the code above, (realKey, decryptedMess)), I get a weird result: ('x', 'cOOKING\x00mc\x07S\x00LIKE\x00A\x00POUND\x00OF\x00BACON'). Can anybody explain why this is the case?
To be honest, I'm learning Python as I do the challenges, so I hope this code doesn't bother anyone. I'd also really appreciate any advice on writing good Python code.
Thanks guys :D
It's true that the reason for the difference in the printed string is a quirk of the print function.
The deeper problem with that program is that it's not producing the correct answer. That's because the big ugly if that tries to decide whether a decrypted character is in the acceptable range is incorrect.
It's incorrect in two ways. The first is that (asciiValue >= 90) should be (asciiValue >= 97). A better way to write all of those expressions, which would have avoided this error, is to express them as (asciiValue >= ord('a')) and (asciiValue == ord(' ')) and so on, avoiding the inscrutable numbers.
The second way is that the expressions are not properly grouped. As they stand they do this:
character is not in the range 'A' to 'Z',
or character is in the range 'a' to 'z',
or character is 'space',
then count this as an error
so some of the characters that should be good (specifically 'a' through 'z' and space) are counted as bad. To fix, you need to rework the parentheses so that the condition is:
character is not in the range 'A' to 'Z',
and character is not in the range 'a' to 'z',
and character is not space,
then count this as an error
or (this is style you were trying for)
character is not (in the range 'A' to 'Z'
or in the range 'a' to 'z'
or a space)
I'm not going to give you the exact drop-in expression to fix the program, it'll be better for you to work it out for yourself. (A good way to deal with this kind of complexity is to move it into a separate function that returns True or False. That makes it easy to test that your implementation is correct, just by calling the function with different characters and seeing that the result is what you wanted.)
When you get the correct expression, you'll find that the program discovers a different "best key" and the decrypted string for that key contains no goofy out-of-range characters that behave strangely with print.
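The suggestion to move the test into its own True/False function could be sketched like this. The names is_expected_char and count_errors are hypothetical, and this is one possible way to write the check rather than a drop-in patch for the program above.

```python
def is_expected_char(ch):
    """True for A-Z, a-z and the space character, False for anything else."""
    return 'A' <= ch <= 'Z' or 'a' <= ch <= 'z' or ch == ' '


def count_errors(text):
    """Count the characters that fall outside the expected set."""
    return sum(1 for ch in text if not is_expected_char(ch))


print(count_errors("Cooking"))              # 0
print(count_errors("cOOKING\x00mc\x07S"))   # 2
```

Because the predicate is a separate function, it can be called with individual characters to confirm it behaves as intended before it is used to score candidate keys.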
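The difference comes from how print displays each object. print(s) writes the string's characters to the terminal as they are, so the control characters \x00 and \x07 are sent raw and typically show up as blanks or as nothing at all. When you print a container such as your tuple, the container's text form uses repr() for each element, and repr() shows those characters as escape sequences.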
This is an example:
>>> s = 'This\x00string\x00is\x00an\x00\x07Example.'
>>> s
'This\x00string\x00is\x00an\x00\x07Example.'
>>> print(s)
This string is an Example.
If you put the string s inside a container (tuple, set, or list), print shows it through its repr(), with the escape sequences visible:
>>> s_list = [s]
>>> print(s_list) # List
['This\x00string\x00is\x00an\x00\x07Example.']
>>> print(set(s_list)) # Set
{'This\x00string\x00is\x00an\x00\x07Example.'}
>>> print(tuple(s_list)) # Tuple
('This\x00string\x00is\x00an\x00\x07Example.',)
Edit
Because the \x00 and \x07 bytes are ASCII control characters, (\x00 being NUL and \x07 being BEL), you can't represent them in any other way. So one of the only ways you could strip these characters from the string without printing would be to use the .replace() method; but given \x00 bytes are being treated as spaces by the terminal, you would have to use s.replace('\x00', ' ') to get the same output, which has now changed the true content of the string.
Otherwise, when building the string, you could add some logic that checks for ASCII control characters and either leaves them out of tempMess or adds a different character, such as a space, in their place.
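One way to sketch that filtering step is shown below. The function name replace_control_chars is hypothetical, and treating codes 0 to 31 plus 127 as control characters follows the ASCII table; note that this also replaces tabs and newlines.

```python
def replace_control_chars(text, replacement=' '):
    """Swap ASCII control characters (codes 0-31 and 127) for a replacement."""
    return ''.join(
        replacement if ord(ch) < 32 or ord(ch) == 127 else ch
        for ch in text
    )


print(replace_control_chars('cOOKING\x00mc\x07S\x00LIKE'))
# cOOKING mc S LIKE
```

Passing replacement='' drops the control characters instead of substituting a space.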
References
ASCII Wiki: https://en.wikipedia.org/wiki/ASCII
Curses Module: https://docs.python.org/3.7/library/curses.ascii.html?highlight=ascii#module-curses.ascii (Might be useful if you wish to implement any logic).

Taking long time to execute Python code for the definition

This is the problem definition:
Given a string of lowercase letters, determine the index of the
character whose removal will make the string a palindrome. If the string is already a
palindrome or no such character exists, then print -1. There will always
be a valid solution, and any correct answer is acceptable. For
example, if the string is "bcbc", we can either remove 'b' at index 0 or 'c' at index 3.
I tried this code:
# !/bin/python
import sys
def palindromeIndex(s):
    # Complete this function
    length = len(s)
    index = 0
    while index != length:
        string = list(s)
        del string[index]
        if string == list(reversed(string)):
            return index
        index += 1
    return -1
q = int(raw_input().strip())
for a0 in xrange(q):
    s = raw_input().strip()
    result = palindromeIndex(s)
    print(result)
This code works for small inputs, but it takes a very long time for larger ones.
Here is the sample: Link to sample
The sample above is the larger input that has to be handled. At a minimum, the solution must run for the following input:
Input (stdin)
3
aaab
baa
aaa
Expected Output
3
0
-1
How to optimize the solution?
Here is code that is optimized for this specific task:
def palindrome_index(s):
    # Complete this function
    rev = s[::-1]
    if rev == s:
        return -1
    for i, (a, b) in enumerate(zip(s, rev)):
        if a != b:
            candidate = s[:i] + s[i + 1:]
            if candidate == candidate[::-1]:
                return i
            else:
                return len(s) - i - 1
First we calculate the reverse of the string. If rev equals the original, it was a palindrome to begin with. Then we iterate over the characters from both ends, keeping track of the index as well:
for i, (a, b) in enumerate(zip(s, rev)):
a will hold the current character from the beginning of the string and b from the end. i will hold the index from the beginning of the string. If at any point a != b then it means that either a or b must be removed. Since there is always a solution, and it is always one character, we test if the removal of a results in a palindrome. If it does, we return the index of a, which is i. If it doesn't, then by necessity, the removal of b must result in a palindrome, therefore we return its index, counting from the end.
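The same idea can also be written with two indices moving inward, which avoids building the reversed copy up front. This is a sketch of a variant, not the answer's code; the names palindrome_index_two_pointer and is_palindrome_range are made up for illustration, and it returns -1 if neither removal works.

```python
def is_palindrome_range(s, lo, hi):
    """Check whether s[lo..hi] (inclusive) reads the same in both directions."""
    while lo < hi:
        if s[lo] != s[hi]:
            return False
        lo += 1
        hi -= 1
    return True


def palindrome_index_two_pointer(s):
    """Return an index whose removal makes s a palindrome, or -1."""
    lo, hi = 0, len(s) - 1
    while lo < hi:
        if s[lo] != s[hi]:
            # At the first mismatch, one of the two characters must go
            if is_palindrome_range(s, lo + 1, hi):
                return lo
            if is_palindrome_range(s, lo, hi - 1):
                return hi
            return -1
        lo += 1
        hi -= 1
    return -1  # already a palindrome


for word in ("aaab", "baa", "aaa"):
    print(palindrome_index_two_pointer(word))
# 3
# 0
# -1
```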
There is no need to convert the string to a list, as you can compare strings directly. This removes a conversion that is performed on every iteration, which speeds up the process. To reverse a string, all you need to do is use slicing:
>>> s = "abcdef"
>>> s[::-1]
'fedcba'
So using this, you can re-write your function to:
def palindromeIndex(s):
    if s == s[::-1]:
        return -1
    for i in range(len(s)):
        c = s[:i] + s[i+1:]
        if c == c[::-1]:
            return i
    return -1
and the tests from your question:
>>> palindromeIndex("aaab")
3
>>> palindromeIndex("baa")
0
>>> palindromeIndex("aaa")
-1
and for the first one in the link that you gave, the result was:
16722
which computed in about 900ms compared to your original function which took 17000ms but still gave the same result. So it is clear that this function is a drastic improvement. :)

shift each letter of words by value

I am trying to take a value from user and read a list of words from a Passwords.txt file, and shift each letter to right by value
def shift():
    value=eval(input("Please enter the value here."))
    file = open("Passwords.txt","w")
    with open ("word-text.txt","r") as m:
        for line in m:
            line=line.strip()
            print (line)
            newString = ""
            for char in line:
                char_int=ord(char)
                t=char_int+value
                if t==124:
                    t = t-27
                charme= chr(t)
                print (char,">>",charme)
                newString += charme
            file.writelines(line+" "+newString+"\n")
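You don't have to convert to ASCII codes; you can use str.maketrans together with str.translate:
def shift_string(text, shift):
    intab = 'abcdefghijklmnopqrstuvwxyz'
    shift %= 26  # keep the rotation within the alphabet
    outab = intab[shift:] + intab[:shift]
    # build the translation table, then apply it to the text
    return text.translate(str.maketrans(intab, outab))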
You need to do the assignment yourself (or there is no point in learning to program) and if you don't understand the question, you should ask your teacher for clarification.
That said, shifting is quite simple in principle. You can do it by hand. If you have a letter, say A, shifting it by 1 (key = 1) would transform it into B. In the assignment you shift by 2 places, so A would become C, B (in the original word) would become D and so on. You have to be a bit careful about the end of the alphabet. When shifting by 1, Z becomes A. When shifting by 2, Y becomes A and Z becomes B.
So in your example, HELLO becomes JGNNQ because when shifting 2 places:
H => J
E => G
L => N
O => Q
(Note: I'm using uppercase for readability but your assignment seems to be about working on lowercase characters. I'm assuming you're only asked to handle lowercase.)
How do you do this? Check out the links you were given. Basically ord() transforms a character into an integer and chr() transforms one such integer into a character. It's based on the way characters are represented as numbers in the computer. So for a given character, if you transform it into its ord(), you can add the key to shift it and then transform it back into a character with chr().
For wrapping from Y and Z to A and B, you can use the modulus operator (%), but be careful, it's a bit fiddly: calculate the difference between the ord of your character and the ord of 'a', add the key, apply % 26 (which gives you a number between 0 and 25), then add the result to ord('a') to get the correct ord. If that's too complicated, just do it with a couple of ifs.
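The modulus steps described above could be sketched as follows. The function names shift_char and shift_word are hypothetical, and leaving every character that is not a lower-case ASCII letter unchanged is an assumption about the assignment.

```python
def shift_char(c, key):
    """Shift one lower-case letter by key places, wrapping around after z."""
    if 'a' <= c <= 'z':
        offset = (ord(c) - ord('a') + key) % 26  # always between 0 and 25
        return chr(ord('a') + offset)
    return c  # assumption: other characters are left as they are


def shift_word(word, key):
    """Apply shift_char to every character of the word."""
    return ''.join(shift_char(c, key) for c in word)


print(shift_word("hello", 2))  # jgnnq
print(shift_word("yz", 2))     # ab
```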
I'd advise to start with a small program that takes input from the user and prints the output to check that it's working correctly. You won't need the input and print in the final version but it will help you to test that your shifting code works correctly.
Then you have the part about reading from a file and writing to a file. Your assignment doesn't ask the user for input, instead it reads from a file. Your line with open ("word-text.txt","r") as f: looks fine, this should give you the file handle you need to read the data. You can read the data with f.read() and assign it to a variable. I'm not sure what you've been taught, but I'd split the string into words with <string>.split() which creates a list of strings (your words).
Then for each word, you use the code you wrote previously to shift the string and you can just write both the original word and the shifted word into the output file. The simplest would probably be to start by opening the output file (in writing mode) and do both the shifting and the writing in one go by looping on the list.
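The read, split, shift, and write flow could be outlined like this. It is only a sketch under stated assumptions: shift_file and shift_word are hypothetical names, only lower-case ASCII letters are shifted, and the output format (original word, a space, shifted word) is taken from the code in the question.

```python
def shift_word(word, key):
    """Shift lower-case letters by key places, wrapping around after z."""
    return ''.join(
        chr((ord(c) - ord('a') + key) % 26 + ord('a')) if 'a' <= c <= 'z' else c
        for c in word
    )


def shift_file(in_path, out_path, key):
    """Write each word from in_path to out_path next to its shifted form."""
    with open(in_path, "r") as source, open(out_path, "w") as target:
        # read().split() breaks the text into words on any whitespace
        for word in source.read().split():
            target.write(word + " " + shift_word(word, key) + "\n")
```

With the file names from the question, a call would look like shift_file("word-text.txt", "Passwords.txt", 2). Opening both files in one with statement closes them even if an error occurs partway through.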
The heavy lifting is doing the word conversion, so I've done that - you can do the rest as it's very trivial. :)
This works by converting each character into its numeric representation and correcting for wrap-around (i.e. Z shifted by 2 will output B).
def limits_correction(character, distance, start, end):
    char = character
    if char >= start and char < end:
        if char + distance >= end:
            char = char + distance - 26
        else:
            char = char + distance
    return char
def modify_string(string, distance):
    ords = [ord(c) for c in string]
    corrected_distance = 0
    if distance > 26:
        corrected_distance = distance % 26
    elif distance > 0 and distance <= 26:
        corrected_distance = distance
    lower_start = 97
    lower_end = lower_start + 26
    upper_start = 65
    upper_end = upper_start + 26
    shifted_string = []
    for char in ords:
        if char >= lower_start and char < lower_end:
            char = limits_correction(char, corrected_distance, lower_start, lower_end)
        elif char >= upper_start and char < upper_end:
            char = limits_correction(char, corrected_distance, upper_start, upper_end)
        shifted_string.append(chr(char))
    return ''.join(shifted_string)
This also works for uppercase and lowercase for any integer shift number (read as from 0 to very large).
REFERENCE:
http://www.asciitable.com/
