Python Hamming distance rewrite countless for cycles into recursion - python

I have written code that generates all strings at Hamming distance n from a given binary string, but I am not able to rewrite it as a simple recursive function. The nested for loops follow a pattern (edit: actually only one thing changes, the upper bound of each range), but I don't know how to express it recursively. The function should take a string and a distance (int), whereas in my code the distance is represented by the number of nested for loops. Could you please help me?
(e.g. for string '00100' and distance 4, code returns ['11010', '11001', '11111', '10011', '01011'],
for string '00100' and distance 3, code returns ['11000', '11110', '11101', '10010', '10001', '10111', '01010', '01001', '01111', '00011'])
def change(string, i):
    if string[i] == '1':
        return string[:i] + '0' + string[i+1:]
    else:
        return string[:i] + '1' + string[i+1:] #'0' on input
def hamming_distance(number):
    array = []
    for i in range(len(number)-3): #change first bit
        a = number
        a = change(a, i) #change bit on index i
        for j in range(i+1, len(number)-2): #change second bit
            b = a
            b = change(b, j)
            for k in range(j+1, len(number)-1): #change third bit
                c = b
                c = change(c, k)
                for l in range(k+1, len(number)): #change fourth bit
                    d = c
                    d = change(d, l)
                    array.append(d)
    return array
print(hamming_distance('00100'))
Thank you!

Very briefly, you have three base cases:
len(string) == 0: # return; you've made all the needed changes
dist == 0 # return; no more changes to make
len(string) == dist # change all bits and return (no choice remaining)
... and two recursion cases; with and without the change:
ham1 = [str(1-int(string[0])) + alter
        for alter in change(string[1:], dist-1) ]
ham2 = [string[0] + alter for alter in change(string[1:], dist) ]
From each call, you return a list of strings that are dist from the input string. On each return, you have to append the initial character to each item in that list.
Is that clear?
CLARIFICATION
The above approach also generates only those that change the string. "Without" the change refers to only the first character. For instance, given input string="000", dist=2, the algorithm will carry out two operations:
'1' + change("00", 2-1) # for each returned string, "10" and "01"
'0' + change("00", 2) # for the only returned string, "11"
Those two ham lines go in the recursion part of your routine. Are you familiar with the structure of such a function? It consists of base cases and recursion cases.
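Putting the base cases and the two recursion cases together, the whole routine might look like the sketch below. The name hamming_neighbors is hypothetical (it avoids clashing with the change helper from the question), and the base cases are folded into two checks: nothing left to flip, or not enough characters left to flip.

```python
def hamming_neighbors(string, dist):
    """Return all binary strings at Hamming distance `dist` from `string`."""
    if dist == 0:
        # No more changes to make: the rest of the string stays as it is.
        return [string]
    if len(string) < dist:
        # Not enough characters left to make the remaining changes.
        return []
    flipped = str(1 - int(string[0]))
    # Case 1: flip the first character, then make dist-1 changes in the rest.
    with_flip = [flipped + rest
                 for rest in hamming_neighbors(string[1:], dist - 1)]
    # Case 2: keep the first character, make all dist changes in the rest.
    without_flip = [string[0] + rest
                    for rest in hamming_neighbors(string[1:], dist)]
    return with_flip + without_flip


print(hamming_neighbors('00100', 4))
```

For '00100' and distance 4 this produces the same five strings as the nested loops in the question.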

Related

Python- Expanded form of a number raised to the power of ten

I'm pretty new to coding and I am trying to write a Python script where a user enters an integer and the script displays that integer in expanded form, as a sum of digits multiplied by powers of 10.
Example: A user enters 643541 and the script outputs 643541 = (6x10^5)+(4x10^4)+(3x10^3)+(5x10^2)+(4x10^1)+(1x10^0)
This is my code
A = [7000, 400, 70,1]
cond = True
y = 0
i = 0
sizeArray = len(A)
for i in range(0, sizeArray-1):
    while cond == True:
        if A[i]%10 == 0:
            A[i] = A[i]/10
            y += 1
        else:
            cond = False
print(y)
I tried working with a sample array to count the number of zeros, but I don't know how I will be able to output the result as shown above.
How can I accomplish this?
You can transform your input integer 643541 into an array of digits [6,4,3,5,4,1], then maintain a variable for the exponent, which is decremented for each digit in the array.
def function(num):
    digits = str(num) # convert number to string
    output = []
    for i, digit in enumerate(digits):
        output.append("(" + digit + "x10^" + str(len(digits)-i-1) + ")")
    return " + ".join(output)
Here len(digits)-i-1 plays the role of the variable that maintains the exponent value.
Every question like this deserves a solution using a list comprehension:
>>> n = 123456
>>> '+'.join([ '({1}x10^{0})'.format(*t) for t in enumerate(str(n)[::-1]) ][::-1])
'(1x10^5)+(2x10^4)+(3x10^3)+(4x10^2)+(5x10^1)+(6x10^0)'
Explanation:
str(n)[::-1] converts the number to a string and then reverses the string, giving the digits of the number, as strings, from least-significant to most-significant.
enumerate returns pairs t = (i, d) where i is the index and d is the digit. Since the sequence is from least-significant to most-significant, the index equals the corresponding exponent of 10.
*t unpacks (i, d) for {0} and {1} in the format string, so the result is like ({d}x10^{i}).
The [::-1] applied to the list comprehension reverses the results back into the right order.
'+'.join joins those results together into a single string, with the + symbol between the parts.
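To get the full line shown in the question, with the number on the left of the equals sign, the terms can be joined and prefixed as in this minimal sketch. It assumes a non-negative integer, and expanded_form is a hypothetical name.

```python
def expanded_form(n):
    """Format a non-negative integer as a sum of digit x power-of-10 terms."""
    digits = str(n)
    # The exponent of each digit is its distance from the right-hand end.
    terms = ["({}x10^{})".format(d, len(digits) - i - 1)
             for i, d in enumerate(digits)]
    return "{} = {}".format(n, "+".join(terms))


print(expanded_form(643541))
# 643541 = (6x10^5)+(4x10^4)+(3x10^3)+(5x10^2)+(4x10^1)+(1x10^0)
```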

Taking long time to execute Python code for the definition

This is the problem definition:
Given a string of lowercase letters, determine the index of the
character whose removal will make a palindrome. If the string is already a
palindrome or no such character exists, then print -1. There will always
be a valid solution, and any correct answer is acceptable. For
example, if the string is "bcbc", we can either remove 'b' at index 0 or 'c' at index 3.
I tried this code:
#!/bin/python
import sys
def palindromeIndex(s):
    # Complete this function
    length = len(s)
    index = 0
    while index != length:
        string = list(s)
        del string[index]
        if string == list(reversed(string)):
            return index
        index += 1
    return -1
q = int(raw_input().strip())
for a0 in xrange(q):
    s = raw_input().strip()
    result = palindromeIndex(s)
    print(result)
This code works for small inputs, but it takes a very long time for larger ones.
Here is the sample: Link to sample
The sample above is the large input that needs to be processed. The solution must also produce the correct results for the following input:
Input (stdin)
3
aaab
baa
aaa
Expected Output
3
0
-1
How to optimize the solution?
Here is code that is optimized for this task:
def palindrome_index(s):
    # Complete this function
    rev = s[::-1]
    if rev == s:
        return -1
    for i, (a, b) in enumerate(zip(s, rev)):
        if a != b:
            candidate = s[:i] + s[i + 1:]
            if candidate == candidate[::-1]:
                return i
            else:
                return len(s) - i - 1
First we calculate the reverse of the string. If rev equals the original, it was a palindrome to begin with. Then we iterate over the characters from both ends, keeping track of the index as well:
for i, (a, b) in enumerate(zip(s, rev)):
a will hold the current character from the beginning of the string and b from the end. i will hold the index from the beginning of the string. If at any point a != b then it means that either a or b must be removed. Since there is always a solution, and it is always one character, we test if the removal of a results in a palindrome. If it does, we return the index of a, which is i. If it doesn't, then by necessity, the removal of b must result in a palindrome, therefore we return its index, counting from the end.
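The same idea can also be written with two indices moving inwards, which avoids building the reversed copy and the candidate string. This is only a sketch of a variant, not the code from the answer above; the helper name is_palindrome_range is hypothetical, and unlike the answer it returns -1 when neither removal works instead of assuming a solution exists.

```python
def is_palindrome_range(s, lo, hi):
    """Check whether s[lo..hi] (inclusive) reads the same both ways."""
    while lo < hi:
        if s[lo] != s[hi]:
            return False
        lo += 1
        hi -= 1
    return True


def palindrome_index(s):
    lo, hi = 0, len(s) - 1
    while lo < hi:
        if s[lo] != s[hi]:
            # First mismatch: one of the two characters has to go.
            if is_palindrome_range(s, lo + 1, hi):
                return lo
            if is_palindrome_range(s, lo, hi - 1):
                return hi
            return -1
        lo += 1
        hi -= 1
    return -1  # already a palindrome


for word in ("aaab", "baa", "aaa"):
    print(palindrome_index(word))
```

Each character is compared at most a couple of times, so the running time grows linearly with the length of the string.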
There is no need to convert the string to a list, as you can compare strings directly. This removes a computation that is performed many times, which speeds up the process. To reverse a string, all you need to do is use slicing:
>>> s = "abcdef"
>>> s[::-1]
'fedcba'
So using this, you can re-write your function to:
def palindromeIndex(s):
    if s == s[::-1]:
        return -1
    for i in range(len(s)):
        c = s[:i] + s[i+1:]
        if c == c[::-1]:
            return i
    return -1
and the tests from your question:
>>> palindromeIndex("aaab")
3
>>> palindromeIndex("baa")
0
>>> palindromeIndex("aaa")
-1
and for the first one in the link that you gave, the result was:
16722
which computed in about 900ms compared to your original function which took 17000ms but still gave the same result. So it is clear that this function is a drastic improvement. :)

How to turn an input string into a list, change it to a palindrome (if it isn't already) and convert it back to a string

A string is a palindrome if it reads the same forward and backward. Given a string that contains only lowercase English letters, you are required to create a new palindrome string from the given string following the rules given below:
1. You can reduce (but not increase) any character in a string by one; for example you can reduce the character h to g but not from g to h
2. In order to achieve your goal, if you have to then you can reduce a character of a string repeatedly until it becomes the letter a; but once it becomes a, you cannot reduce it any further.
Each reduction operation is counted as one. So you need to count as well how many reductions you make. Write a Python program that reads a string from a user input (using raw_input statement), creates a palindrome string from the given string with the minimum possible number of operations and then prints the palindrome string created and the number of operations needed to create the new palindrome string.
I tried to convert the string to a list first, then modify the list so that any given string that is not a palindrome is automatically edited into one, and then print the result. After modifying the list, I convert it back to a string.
c=raw_input("enter a string ")
x=list(c)
y = ""
i = 0
j = len(x)-1
a = 0
while i < j:
    if x[i] < x[j]:
        a += ord(x[j]) - ord(x[i])
        x[j] = x[i]
        print x
    else:
        a += ord(x[i]) - ord(x[j])
        x[i] = x[j]
        print x
    i = i + 1
    j = (len(x)-1)-1
print "The number of operations is ",a
print "The palindrome created is",( ''.join(x) )
Am I approaching it the right way, or is there something I'm missing?
Since only reduction is allowed, it is clear that the number of reductions for each pair will be the difference between them. For example, consider the string 'abcd'.
Here the pairs to check are (a,d) and (b,c).
Now difference between 'a' and 'd' is 3, which is obtained by (ord('d')-ord('a')).
I am using absolute value to avoid checking which alphabet has higher ASCII value.
I hope this approach will help.
s=input()
l=len(s)
count=0
m=0
n=l-1
while m<n:
    count+=abs(ord(s[m])-ord(s[n]))
    m+=1
    n-=1
print(count)
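The loop above only counts the reductions. Since the question also asks for the resulting palindrome, here is a minimal sketch that returns both, by lowering the larger character of each pair to the smaller one. The function name reduce_to_palindrome is hypothetical, and the input is assumed to contain only lowercase letters.

```python
def reduce_to_palindrome(s):
    """Return (palindrome, operations) using only character reductions."""
    chars = list(s)
    i, j = 0, len(chars) - 1
    operations = 0
    while i < j:
        # Only reductions are allowed, so both ends become the smaller letter.
        low = min(chars[i], chars[j])
        operations += abs(ord(chars[i]) - ord(chars[j]))
        chars[i] = chars[j] = low
        i += 1
        j -= 1
    return "".join(chars), operations


print(reduce_to_palindrome("abcd"))   # ('abba', 4)
```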
This is a common "homework" or competition question. The basic concept here is that you have to find a way to get to minimum values with as few reduction operations as possible. The trick here is to utilize string manipulation to keep that number low. For this particular problem, there are two very simple things to remember: 1) you have to split the string, and 2) you have to apply a bit of symmetry.
First, split the string in half. The following function should do it.
def split_string_to_halves(string):
    half, rem = divmod(len(string), 2)
    a, b, c = string[:half], string[half:], ''
    if rem > 0:
        # odd length: the middle character is kept aside in c
        b, c = string[half + 1:], string[half]
    return (a, b, c)
The above should recreate the string if you do a + c + b. Next, you have to convert a and b to lists and map the ord function on each half. Leave the remainder alone, if any.
def convert_to_ord_list(string):
    return list(map(ord, string))
Since you just have to do a one-way operation (only reduction, no need for addition), you can assume that for each pair of elements in the two converted lists, the higher value less the lower value is the number of operations needed. Easier shown than said:
def convert_to_palindrome(string):
    halfone, halftwo, rem = split_string_to_halves(string)
    if halfone == halftwo[::-1]:
        # already a palindrome: reassemble in the original order
        return halfone + rem + halftwo, 0
    halftwo = halftwo[::-1]
    # list() so that the pairs can be iterated over twice
    zipped = list(zip(convert_to_ord_list(halfone), convert_to_ord_list(halftwo)))
    counter = sum([max(x) - min(x) for x in zipped])
    floors = [min(x) for x in zipped]
    res = "".join(map(chr, floors))
    res += rem + res[::-1]
    return res, counter
Finally, some tests:
target = 'ideal'
print(convert_to_palindrome(target)) # ('iaeai', 6)
target = 'euler'
print(convert_to_palindrome(target)) # ('eelee', 29)
target = 'ohmygodthisisinsane'
print(convert_to_palindrome(target)) # ('ehasgidihihidigsahe', 84)
I'm not sure if this is optimized nor if I covered all bases. But I think this pretty much covers the general concept of the approach needed. Compared to your code, this is clearer and actually works (yours does not). Good luck and let us know how this works for you.

how to generate a set of similar strings in python

I am wondering how to generate a set of similar strings based on Levenshtein distance (string edit distance). Ideally, I would like to pass in a source string (i.e. the string from which the similar strings are generated), the number of strings to generate, and a threshold, such that the similarity among the strings in the generated set is greater than the threshold. What Python package(s) should I use to achieve that? Or any idea how to implement this?
I think you can think of the problem in another way (reversed).
Given a string, say it is sittin.
Given a threshold (edit distance), say it is k.
Then you apply combinations of different "edits" in k-steps.
For example, let's say k = 2. And assume the allowed edit modes you have are:
delete one character
add one character
substitute one character with another one.
Then the logic is something like below:
input = 'sittin'
for num in 1 ... n:  # suppose you want to have n strings generated
    my_input_ = input
    # suppose the edit distance should be smaller or equal to k;
    # but greater or equal to one
    for i in 1 ... randint(k):
        pick a random edit mode from (delete, add, substitute)
        do it! and update my_input_
If you need to stick with a pre-defined dictionary, that adds some complexity but it is still doable. In this case, the edit must be valid.
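The pseudocode above could be turned into runnable Python roughly as follows. This is a sketch under assumptions: the names random_edit and similar_strings are hypothetical, the alphabet is assumed to be lowercase ASCII letters, and no dictionary check is done. Note that the true edit distance of a result can be smaller than the number of edits applied, because edits may cancel out or substitute a character with itself.

```python
import random
import string


def random_edit(s, alphabet=string.ascii_lowercase):
    """Apply one random insert, delete, or substitute edit to `s`."""
    modes = ["insert"]
    if s:
        # Deleting or substituting needs at least one character.
        modes += ["delete", "substitute"]
    mode = random.choice(modes)
    if mode == "insert":
        pos = random.randint(0, len(s))  # may also insert at the very end
        return s[:pos] + random.choice(alphabet) + s[pos:]
    pos = random.randrange(len(s))
    if mode == "delete":
        return s[:pos] + s[pos + 1:]
    return s[:pos] + random.choice(alphabet) + s[pos + 1:]


def similar_strings(source, count, k):
    """Generate `count` strings, each made by 1..k random edits of `source`."""
    results = []
    for _ in range(count):
        candidate = source
        for _ in range(random.randint(1, k)):
            candidate = random_edit(candidate)
        results.append(candidate)
    return results


print(similar_strings("sittin", 5, 2))
```

Duplicates are possible in the returned list; collecting the candidates in a set until it has the required size would avoid them.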
Borrowing heavily from the pseudocode in @greeness's answer, I thought I would include the code I used to do this for DNA sequences.
This may not be your exact use case but I think it should be easily adaptable.
import random
dna = set(["A", "C", "G", "T"])
class Sequence(str):
    def mutate(self, d, n):
        mutants = set([self])
        while len(mutants) < n:
            k = random.randint(1, d)
            for _ in range(k):
                mutant_type = random.choice(["d", "s", "i"])
                if mutant_type == "i":
                    mutants.add(self.insertion(k))
                elif mutant_type == "d":
                    mutants.add(self.deletion(k))
                elif mutant_type == "s":
                    mutants.add(self.substitute(k))
        return list(mutants)
    def deletion(self, n):
        if n >= len(self):
            return ""
        chars = list(self)
        i = 0
        while i < n:
            idx = random.choice(range(len(chars)))
            del chars[idx]
            i += 1
        return "".join(chars)
    def insertion(self, n):
        chars = list(self)
        i = 0
        while i < n:
            idx = random.choice(range(len(chars)))
            new_base = random.choice(list(dna))
            chars.insert(idx, new_base)
            i += 1
        return "".join(chars)
    def substitute(self, n):
        idxs = random.sample(range(len(self)), n)
        chars = list(self)
        for i in idxs:
            new_base = random.choice(list(dna.difference(chars[i])))
            chars[i] = new_base
        return "".join(chars)
To use this you can do the following
s = Sequence("AAAAA")
d = 2 # max edit distance
n = 5 # number of strings in result
s.mutate(d, n)
>>> ['AAA', 'GACAAAA', 'AAAAA', 'CAGAA', 'AACAAAA']

How can I scramble a word with a factor?

I would like to scramble a word with a factor. The bigger the factor is, the more scrambled the word will become.
For example, the word "paragraphs" with factor of 1.00 would become "paaprahrgs", and it will become "paargarphs" with a factor of 0.50.
The distance from the original letter position and the number of scrambled letters should be taken into consideration.
This is my code so far, which only scrambles without a factor:
import random
def Scramble(s):
    return ''.join(random.sample(s, len(s)))
Any ideas?
P.S. This isn't an homework job - I'm trying to make something like this: http://d24w6bsrhbeh9d.cloudfront.net/photo/190546_700b.jpg
You could use the factor to decide how many character swaps to perform on the string.
As the factor seems to be between 0 and 1, you can multiply the factor by the string's length.
from random import randrange
def shuffle(string, factor):
    if len(string) < 2:
        return string
    chars = list(string)
    length = len(chars)
    shuffles = int(length * factor)
    for _ in range(shuffles):
        # pick two random positions and swap their characters
        i, j = randrange(length), randrange(length)
        chars[i], chars[j] = chars[j], chars[i]
    return "".join(chars)
x = "computer"
print(shuffle(x, .2))
print(shuffle(x, .5))
print(shuffle(x, .9))
coupmter
eocpumtr
rpmeutoc
If you want the first and the last characters to stay in place, simply split them and add them later on.
def CoolWordScramble(string, factor = .5):
    if len(string) < 2:
        return string
    first, string, last = string[0], string[1:-1], string[-1]
    return first + shuffle(string, factor) + last
You haven't defined what your "factor" should mean, so allow me to redefine it for you: A scrambling factor N (an integer) would be the result of swapping two random letters in a word, N times.
With this definition, 0 means the resulting word is the same as the input, 1 means only one pair of letters is swapped, and 10 means the swap is done 10 times.
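A minimal sketch of this integer definition, assuming the two swapped positions should always be distinct; swap_scramble is a hypothetical name.

```python
import random


def swap_scramble(word, n):
    """Swap two randomly chosen, distinct positions of `word`, `n` times."""
    chars = list(word)
    if len(chars) < 2:
        return word  # nothing to swap
    for _ in range(n):
        i, j = random.sample(range(len(chars)), 2)  # two distinct indices
        chars[i], chars[j] = chars[j], chars[i]
    return "".join(chars)


print(swap_scramble("paragraphs", 0))   # paragraphs
print(swap_scramble("paragraphs", 3))
```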
You can make the "factor" roughly correspond to the number of times two adjacent letters of the word switch their positions (a transposition).
In each transposition, choose a random position (from 0 through the length-minus-two), then switch the positions of the letter at that position and the letter that follows it.
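The adjacent-transposition idea could be sketched like this; transpose_scramble is a hypothetical name and n is the number of transpositions. Because each step moves a letter by only one position, no letter ends up more than n positions away from where it started, which ties the factor to the distance mentioned in the question.

```python
import random


def transpose_scramble(word, n):
    """Swap a randomly chosen letter with its right-hand neighbour, n times."""
    chars = list(word)
    if len(chars) < 2:
        return word  # no adjacent pair to swap
    for _ in range(n):
        # randint includes both bounds: 0 through length-minus-two.
        pos = random.randint(0, len(chars) - 2)
        chars[pos], chars[pos + 1] = chars[pos + 1], chars[pos]
    return "".join(chars)


print(transpose_scramble("paragraphs", 2))
```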
It could be implemented many ways, but here is my solution:
I wrote a function that just moves one letter to a new position:
import random
def scramble(s):
    s = list(s) # a list is easier to work with, at some cost in performance
    p = s.pop(random.randint(0, len(s)-1))
    s.insert(random.randint(0, len(s)-1), p)
    return "".join(s)
And a function that applies it to a string several times:
def scramble_factor(s, n):
    for i in range(n):
        s = scramble(s)
    return s
Now we can use it:
>>> s = "paragraph"
>>> scramble_factor(s, 0)
'paragraph'
>>> scramble_factor(s, 1)
'pgararaph'
>>> scramble_factor(s, 2)
'prahagrap'
>>> scramble_factor(s, 5)
'pgpaarrah'
>>> scramble_factor(s, 10)
'arpahprag'
Of course functions can be combined or nested, but it is clear I think.
Edit:
It doesn't consider distance, but the scramble function can easily be replaced by one that only swaps adjacent letters. Here is one:
def scramble(s):
    if len(s)<=1:
        return s
    index = random.randint(0, len(s)-2)
    return s[:index] + s[index + 1] + s[index] + s[index+2:]
You could do a for-loop that counts down to 0.
Convert the String into a Char-Array and use a RNG to choose 2 letters to swap.
