Generating permutations using Bitmasking - python

I was working through some programming problems on the internet and this one caught my interest. The problem is defined as follows:
This code prints all the permutations of the string lexicographically. Something is wrong with it. Find and fix it by modifying or adding one line!
Input:
The input consists of a single line containing a string of lowercase characters with no spaces in between. Its length is at most 7 characters, and its characters are sorted lexicographically.
Output:
All permutations of the string printed one in each line, listed lexicographically.
def permutations():
    global running
    global characters
    global bitmask
    if len(running) == len(characters):
        print(''.join(running))
    else:
        for i in xrange(len(characters)):
            if ((bitmask>>i)&1) == 0:
                bitmask |= 1<<i
                running.append(characters[i])
                permutations()
                running.pop()

raw = raw_input()
characters = list(raw)
running = []
bitmask = 0
permutations()
Can somebody answer it for me and explain how it works? I am not really familiar with the applications of bitmasking. Thank you.

You need to clear the bit in the bitmask again after the recursive call returns, by adding the line:
bitmask ^= 1<<i
Code:
def permutations():
    global running
    global characters
    global bitmask
    if len(running) == len(characters):
        print(''.join(running))
    else:
        for i in xrange(len(characters)):
            if ((bitmask>>i)&1) == 0:
                bitmask |= 1<<i
                running.append(characters[i])
                permutations()
                bitmask ^= 1<<i #make the bit zero again.
                running.pop()

raw = raw_input()
characters = list(raw)
running = []
bitmask = 0
permutations()
Explanation:
A bitmask is an integer that is treated as a string of bits. In your case, the length of this bit string equals the length of the input string.
Each position in the bit string records whether the corresponding character has already been added to the partially built string.
The code builds a new string starting from an empty one. Whenever a character is added, the bitmask records it, and the string is then passed deeper into the recursion so that further characters can be added. When the code returns from the recursion, the added character must be removed and the bitmask restored to its previous value. The original code removed the character but never cleared the bit, so every character stayed marked as used after the first permutation was printed.
More information about masking can be found here: http://en.wikipedia.org/wiki/Mask_%28computing%29
EDIT:
Say the input string is "abcde" and the bitmask at any point in the execution of the code is "00100". This means that only the character 'c' has been added so far to the partially built string.
Hence we should not add the character 'c' again.
The "if" condition ((bitmask >> i) & 1) == 0 checks whether the i-th bit of the bitmask is still clear, i.e., whether the i-th character has not yet been added to the string. Only in that case is the character appended.
If bit operations are new to you, I suggest reading up on the topic.
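For readers on Python 3, the same idea can be sketched without global variables. This is only an illustration, not the code from the problem: the name bitmask_permutations and the nested helper recurse are made up here, and the mask is passed down as an argument, so there is no bit to clear afterwards.

```python
def bitmask_permutations(characters):
    """Yield every permutation of `characters` (hypothetical helper name).

    Bit i of `bitmask` is 1 when characters[i] is already in `running`.
    """
    running = []

    def recurse(bitmask):
        if len(running) == len(characters):
            yield "".join(running)
            return
        for i, ch in enumerate(characters):
            if ((bitmask >> i) & 1) == 0:  # characters[i] not used yet
                running.append(ch)
                # Pass a copy of the mask with bit i set; the caller's
                # mask is untouched, so nothing has to be undone for it.
                yield from recurse(bitmask | (1 << i))
                running.pop()

    yield from recurse(0)


for permutation in bitmask_permutations("abc"):
    print(permutation)
```

Because the input is sorted and the indices are tried in increasing order, the permutations come out in lexicographic order.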

Related

Palindrome vs Symmetry and how to deal with 2 word character

Can we say that a word with 2 characters is a palindrome? For example, is "oo" a palindrome while "go" is not?
I am going through a program from GeeksForGeeks that detects palindromes, but it detects go as a palindrome as well, though it is not:
# Function to check whether the
# string is palindrome or not
def palindrome(a):

    # finding the mid, start
    # and last index of the string
    mid = (len(a)-1)//2
    start = 0
    last = len(a)-1
    flag = 0

    # A loop till the mid of the
    # string
    while(start<mid):

        # comparing letters from right
        # from the letters from left
        if (a[start]== a[last]):
            start += 1
            last -= 1
        else:
            flag = 1
            break;

    # Checking the flag variable to
    # check if the string is palindrome
    # or not
    if flag == 0:
        print("The entered string is palindrome")
    else:
        print("The entered string is not palindrome")

# ... other code ...
# Driver code
string = 'amaama'
palindrome(string)
Is there any particular length or condition defined for a word to be a palindrome? I read the Wikipedia article, but did not find any particular condition on the length of a palindrome.
The above program detects "go" as palindrome because the midpoint is 0, which is "g" and the starting point is 0, which is also "g", and so it determines it is a palindrome. But I am still confused about the number of characters. Can a two-character word be a palindrome? If yes, do we just need to add a specific condition for it: if word[0] == word[1]?
Let's take a look at the definition of palindrome, according to Merriam-Webster:
a word, verse, or sentence (such as "Able was I ere I saw Elba") or a number (such as 1881) that reads the same backward or forward
Therefore, two-character words (or any words with an even number of characters) can also be palindromes. The example code is simply poorly written and does not work correctly for two-character strings. As you have correctly deduced, it sets the mid variable to 0 if the length of the string is 2. The loop, while (start < mid), is then skipped immediately, as start is also initialised to 0. Therefore, the flag variable (initialised to 0, corresponding to 'is a palindrome') is never changed, so the function incorrectly prints that go is a palindrome.
There are a number of ways to adapt the algorithm; the simplest is to check up to and including the middle character index by changing the while condition to start <= mid. Note that this is only the simplest way to adapt the given code. The simplest Python code to check whether a string is palindromic is significantly shorter, as you can reverse a string using a[::-1] and compare the result to the original string.
(Edit to add: the other answer by trincot actually shows that the provided algorithm is incorrect for all even-numbered character words. The fix suggested in this answer still works.)
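The slicing approach mentioned above could look like the following minimal sketch. The function name is_palindrome is chosen here for illustration, and the function returns a boolean instead of printing.

```python
def is_palindrome(text):
    """Return True if `text` reads the same forwards and backwards."""
    # text[::-1] is a reversed copy of the string.
    return text == text[::-1]


for word in ("oo", "go", "gang", "amaama"):
    print(word, is_palindrome(word))
```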
Your question is justified. The code from GeeksForGeeks you have referenced is not giving the correct result. In fact it also produces wrong results for longer words, like "gang".
The above program detects "go" as palindrome because the midpoint is 0, which is "g" and the starting point is 0, which is also "g", and so it determines it is a palindrome.
This is indeed where the algorithm goes wrong.
...then do we need to just add a specific condition for it: if word[0] == word[1]?
Given the while condition is start<mid, the midpoint should be the first index after the first half of the string that must be verified, and so in the case of a 2-letter word, the midpoint should be 1, not 0.
It is easy to correct the error in the program. Change:
mid = (len(a)-1)//2
To:
mid = len(a)//2
That fixes the issue. No extra line of code is needed to treat this as a separate case.
I did not find any particular condition on the length of a palindrome.
And right you are: there is no such condition. The GeeksForGeeks code made you doubt, but you were right from the start, and the code was wrong.
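To see the one-line change in context, here is a sketch of the loop with mid = len(a)//2. It is reworked to return a boolean rather than print, and the name is_palindrome_loop is made up for this example.

```python
def is_palindrome_loop(a):
    """Two-pointer palindrome check using the corrected midpoint."""
    mid = len(a) // 2          # first index after the half that must be checked
    start = 0
    last = len(a) - 1
    while start < mid:
        if a[start] != a[last]:
            return False
        start += 1
        last -= 1
    return True


for word in ("go", "oo", "gang", "amaama", "aba"):
    print(word, is_palindrome_loop(word))
```

For a 2-letter word mid is now 1, so the loop runs once and compares the two letters.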

In Python, does a set count as a buffer?

I am working through Cracking the Coding Interview (4th ed), and one of the questions is as follows:
Design an algorithm and write code to remove the duplicate characters in a string
without using any additional buffer. NOTE: One or two additional variables are fine.
An extra copy of the array is not.
I have written the following solution, which satisfies all of the test cases specified by the author:
def remove_duplicate(s):
    return ''.join(sorted(set(s)))

print(remove_duplicate("abcd")) # output "abcd"
print(remove_duplicate("aaaa")) # output "a"
print(remove_duplicate("")) # output ""
print(remove_duplicate("aabb")) # output "ab"
Does my use of a set in my solution count as the use of an additional buffer, or is my solution adequate? If my solution is not adequate, what would be a better way to go about this?
Thank you very much!
Only the person administering the question or evaluating the answer could say for sure, but I would say that a set does count as a buffer.
If there are no repeated characters in the string, the length of the set would equal that of the string. In fact, since a set carries significant overhead (it is implemented as a hash table), the set would probably take more memory than the string. If the string holds Unicode, the number of unique characters could be very large.
If you do not know how many unique characters are in the string, you cannot predict the size of the set. The possibly long and probably unpredictable size of the set makes it count as a buffer, or worse, given that it may take more space than the string.
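A rough way to check the memory claim is sys.getsizeof. The sketch below assumes CPython; the numbers are implementation-specific, and getsizeof only measures the container itself, not the objects it refers to.

```python
import string
import sys

text = string.ascii_lowercase   # 26 distinct characters, no repeats
unique = set(text)

text_size = sys.getsizeof(text)
set_size = sys.getsizeof(unique)

print(len(text), len(unique))   # same number of elements
print(text_size, set_size)      # the set's hash table is far larger than the string
```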
To follow up on v.coder's comment, I rewrote the code he (or she) was referring to in Python, and added some comments to try to explain what is going on.
def removeduplicates(s):
    """Original java implementation by
    Druv Gairola (http://stackoverflow.com/users/495545/dhruv-gairola)
    in his/her answer
    http://stackoverflow.com/questions/2598129/function-to-remove-duplicate-characters-in-a-string/10473835#10473835
    """
    # python strings are immutable, so first converting the string to a list of integers,
    # each integer representing the ascii value of the letter
    # (hint: look up "ascii table" on the web)
    L = [ord(char) for char in s]

    # easiest solution is to use a set, but to use Druv Gairola's method...
    # (hint, look up "bitmaps" on the web to learn more!)
    bitmap = 0
    #seen = set()
    for index, char in enumerate(L):
        # first check for duplicates:
        # number of bits to shift left (the space is the "lowest"
        # character on the ascii table, and 'char' here is the position
        # of the current character in the ascii table. so if 'char' is
        # a space, the shift length will be 0, if 'char' is '!', shift
        # length will be 1, and so on. This naturally requires the
        # integer to actually have as many "bit positions" as there are
        # characters in the ascii table from the space to the ~,
        # but python integers have arbitrary precision, so that is fine.
        shift_length = char - ord(' ')

        # make a new integer where only one bit is set;
        # the bit position the character corresponds to
        bit_position = 1 << shift_length

        # if the same bit is already set [to 1] in the bitmap,
        # the result of AND'ing the two integers together
        # will be an integer where only that exact bit is
        # set - but that still means that the integer will be greater
        # than zero. (python integers have no fixed-width sign bit,
        # so a single set bit always gives a positive value.)
        bit_position_already_occupied = bitmap & bit_position > 0
        if bit_position_already_occupied:
            #if char in seen:
            L[index] = 0
        else:
            # update the bitmap to indicate that this character
            # is now seen.
            # so, same procedure as above. first find the bit position
            # this character represents...
            bit_position = char - ord(' ')

            # make an integer that has a single bit set:
            # the bit that corresponds to the position of the character
            integer = 1 << bit_position

            # "add" the bit to the bitmap. The way we do this is that
            # we OR the current bitmap with the integer that has the
            # required bit set to 1. The result of OR'ing two integers
            # is that all bits that are set to 1 in *either* of the two
            # will be set to 1 in the result.
            bitmap = bitmap | integer
            #seen.add(char)

    # finally, turn the list back to a string to be able to return it
    # (again, just kind of a way to "get around" immutable python strings)
    return ''.join(chr(i) for i in L if i != 0)

if __name__ == "__main__":
    print(removeduplicates('aaaa'))
    print(removeduplicates('aabcdee'))
    print(removeduplicates('aabbccddeeefffff'))
    print(removeduplicates('&%!%)(FNAFNZEFafaei515151iaaogh6161626)([][][ ao8faeo~~~````%!)"%fakfzzqqfaklnz'))

Replacing Odd and Even-indexed characters in a string

How can I replace even and odd-indexed letters in my strings? I'd like to replace odd-indexed characters with uppercased letters and even-indexed characters with lowercased ones.
x=input("Enter String: ")
How can I modify the inputted string?
This sounds a little like a "do my homework for me" post, but I'll help you out, as I need the training myself.
You can do this by breaking down the problem. (As I am quite new with python syntax, I'm gonna assume that the user has already given an input to string x)
1. Make a loop, or otherwise iterate through the characters of your string.
2. Make sure you have an index number for each character, which increments for each one.
3. Check whether the index is even by using modulus of 2 (%2). This returns the remainder of a number when divided by 2. For even numbers, that will be 0.
4. If index%2 == 0, set the letter to lower case, else set the letter to upper case.
5. Append the letter to a new string, which you defined before the loop. You cannot directly alter a single character in a string, because strings are immutable. This means that you cannot change the string itself, but you can assign a new string to the variable.
6. Done. Print and see if it worked.
Code:
x = "seMi Long StRing WiTH COMPLetely RaNDOM CasINg"
result_string = ""
index = 0
for c in x:
    if index % 2 == 0:
        result_string += c.lower()
    else:
        result_string += c.upper()
    index += 1
print(result_string)
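s=input()
s=s.lower()
# enumerate gives the real position of each character;
# s.index(i) would return the first occurrence of a repeated character
l=[c.upper() if idx%2==0 else c for idx, c in enumerate(s)]
print("".join(l))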
x = 'myname'
for item in range(len(x)):
    if item%2==0:
        print(x[item].upper())
    else:
        print(x[item].lower())
This is the for loop I was referring to. The drawback of this snippet is that it is tied to the value assigned to the variable x, whereas the function in my other post can take any string value without the code having to be repeated each time.
def myfunc(string):
    result=''
    for x in range(len(string)):
        if x%2==0:
            result=result+string[x].upper()
        else:
            result=result+string[x].lower()
    return result
The above is a function for the question you asked.
A plain for loop without a function might be easier to grasp right now. (Like you, I am very new to Python, so for me it was easier to understand the for loop before I got into functions.) Look at my next post for that version.
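The index bookkeeping in the loops above can also be handed to enumerate. The following is a sketch only; the name alternate_case is made up, and it follows the question's wording (0-based odd indices upper-cased, even indices lower-cased).

```python
def alternate_case(text):
    """Lower-case characters at even indices, upper-case those at odd indices."""
    return "".join(
        ch.upper() if i % 2 else ch.lower()   # i % 2 is 1 (truthy) for odd indices
        for i, ch in enumerate(text)
    )


print(alternate_case("myname"))
```

Swapping ch.upper() and ch.lower() gives the opposite convention used in some of the answers.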

Count occurrences of a given character in a string using recursion

I have to make a function called countLetterString(char, str) where
I need to use recursion to find the amount of times the given character appears in the string.
My code so far looks like this.
def countLetterString(char, str):
    if not str:
        return 0
    else:
        return 1 + countLetterString(char, str[1:])
All this does is count how many characters are in the string. I can't figure out how to split the string and then check whether the character that was split off is the one I'm looking for.
The first step is to break this problem into pieces:
1. How do I determine if a character is in a string?
If you are doing this recursively, you need to check whether the first character of the string is the character you are looking for.
2. How do I compare two characters?
Python has a == operator that determines whether or not two things are equivalent.
3. What do I do once I know whether or not the first character of the string matches?
You need to move on to the remainder of the string, yet somehow maintain a count of the characters you have seen so far. This is normally very easy with a for-loop because you can just declare a variable outside of it, but recursively you have to pass the state of the program to each new function call.
Here is an example where I compute the length of a string recursively:
def length(s):
    if not s: # test if there are no more characters in the string
        return 0
    else: # maintain a count by adding 1 each time you return
        # get all but the first character using a slice
        return 1 + length( s[1:] )
From this example, see if you can complete your problem. Yours will have a single additional step.
4. When do I stop recursing?
This is always a question when dealing with recursion: when do I need to stop calling myself? See if you can figure this one out.
EDIT:
not s will test if s is empty, because in Python the empty string "" evaluates to False; and not False == True
First of all, you shouldn't use str as a variable name as it will mask the built-in str type. Use something like s or text instead.
The if str == 0: line will not do what you expect, the correct way to check if a string is empty is with if not str: or if len(str) == 0: (the first method is preferred). See this answer for more info.
So now you have the base case of the recursion figured out, so what is the "step". You will either want to return 1 + countLetterString(...) or 0 + countLetterString(...) where you are calling countLetterString() with one less character. You will use the 1 if the character you remove matches char, or 0 otherwise. For example you could check to see if the first character from s matches char using s[0] == char.
To remove a single character in the string you can use slicing, so for the string s you can get all characters but the first using s[1:], or all characters but the last using s[:-1]. Hope that is enough to get you started!
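For reference, the pieces described above (an empty-string base case, s[0] == char, and the slice s[1:]) could be combined as in this sketch. The names count_letter and text are chosen here to avoid shadowing str; this is one possible shape, not the only one.

```python
def count_letter(char, text):
    """Recursively count how often `char` occurs in `text`."""
    if not text:                      # base case: nothing left to inspect
        return 0
    first_matches = 1 if text[0] == char else 0
    # Add 1 or 0 for the first character, then recurse on the rest.
    return first_matches + count_letter(char, text[1:])


print(count_letter("o", "foobar"))
```

Each call slices off one character, so very long strings can hit Python's recursion limit.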
Reasoning about recursion requires breaking the problem into "regular" and "special" cases. What are the special cases here? Well, if the string is empty, then char certainly isn't in the string. Return 0 in that case.
Are there other special cases? Not really! If the string isn't empty, you can break it into its first character (the_string[0]) and all the rest (the_string[1:]). Then you can recursively count the number of character occurrences in the rest, and add 1 if the first character equals the char you're looking for.
I assume this is an assignment, so I won't write the code for you. It's not hard. Note that your if str == 0: won't work: that's testing whether str is the integer 0. if len(str) == 0: is a way that will work, and if str == "": is another. There are shorter ways, but at this point those are probably clearest.
First of all, I would suggest not using char or str as names. str is a built-in type, and while I don't believe char would give you any problems, it's a reserved word in many other languages. Second, you can achieve the same functionality using count, as in:
letterstring="This is a string!"
letterstring.count("i")
which would give you the number of occurrences of i in the given string, in this case 3.
If you need to do it recursively purely as an exercise, the thing to remember with recursion is to carry some condition or counter over with each call and to place some kind of conditional within the code that will change it. For example:
def countToZero(count):
    print(str(count))
    if count > 0:
        countToZero(count-1)
Keep in mind this is a very quick example, but as you can see, on each call I print the current value and then the function calls itself again while decrementing the count. Once the count is no longer greater than 0, the function will end.
Knowing this, you will want to keep track of your count, the index you are comparing in the string, the character you are searching for, and the string itself. Without writing the code for you, I think that should at least give you a start.
You have to decide on a base case first: the point where the recursion unwinds and returns.
In this case the base case is the point where there are no (further) instances of a particular character, say X, in the string (if string.find(X) == -1: return count). The function then makes no further calls to itself and returns the number of instances found, trusting the information its caller passed in.
Recursion means a function calling itself from within, thereby creating a stack of calls (at least in Python). Every call is independent and has a specific purpose, with no knowledge of what happened before it was called unless that information is provided; it adds its own result to that information and returns. This information has to be supplied by its invoker, its parent, or it can be kept in global variables, which is not advisable.
So in this case that information is how many instances of the character the parent call found in the first part of the string. The initial call, made by us, also needs to be given that information. Since we are the root of all the calls and have not yet traversed the string, we can safely tell the initial call that we have found zero Xs so far, hand it the entire string, and ask it to traverse the string and find out how many Xs are in it. As a convenience, this 0 can be the default argument of the function; otherwise you have to supply the 0 every time you make the call.
When will the function call another function?
Recursion means breaking the task down to its most granular level and leaving the rest to the (grand)child calls. The most granular breakdown of this task is finding a single instance of X, passing the rest of the string, starting just after the point where X occurred (point + 1), to the next call, and adding 1 to the count supplied by the parent call.
if not string.find(char) == -1:  # char is the character X being counted
    string = string[string.find(char) + 1:]
    return countLetterString(char, string, count = count + 1)
Counting X in a file through iteration (a loop).
This would involve opening the file (TextFile) and reading its contents into text (text = TextFile.read()), so that text is a string. Then loop over each character (for char in text:), remember granularity, and each time char == X, increment count with count += 1. Before you run the loop, state that you have not gone through the string yet and therefore your count of X in text is 0. (Sounds familiar?)
Finally, return count.
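Both variants described above can be written out in full as a rough sketch. The names count_with_find and count_with_loop are made up for illustration, and the loop version works on a string that has already been read rather than on a file.

```python
def count_with_find(char, text, count=0):
    """Recursive version: jump to each occurrence with str.find."""
    position = text.find(char)
    if position == -1:                # base case: no further occurrence
        return count
    # Continue just after the occurrence, carrying the running count along.
    return count_with_find(char, text[position + 1:], count + 1)


def count_with_loop(char, text):
    """Iterative version: start at 0 and add 1 for every matching character."""
    count = 0
    for current in text:
        if current == char:
            count += 1
    return count


print(count_with_find("o", "foobar"), count_with_loop("o", "foobar"))
```

The default count=0 plays the role of the "I have found zero so far" information passed to the initial call.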
#This function will print the count using recursion.
def countrec(s, c, cnt = 0):
    if len(s) == 0:
        print(cnt)
        return 0
    if s[-1] == c:
        countrec(s[0:-1], c, cnt+1)
    else:
        countrec(s[0:-1], c, cnt)

#Function call
countrec('foobar', 'o')
With an extra parameter, the same function can be implemented.
Working function code:
def countLetterString(char, str, count = 0):
    if len(str) == 0:
        return count
    if str[-1] == char:
        return countLetterString(char, str[0:-1], count+1)
    else:
        return countLetterString(char, str[0:-1], count)
The function signature below accepts one more parameter: count.
(P.S.: I was given this question with the function signature pre-defined; I just had to complete the logic.)
Here is the code:
def count_occurrences(s, substr, count=0):
    ''' s - indicates the string,
        output : Returns the count of occurrences of substr found in s
    '''
    len_s = len(s)
    len_substr = len(substr)
    if len_s == 0:
        return count
    if len_s < len_substr:
        return count
    if substr == s[0:len_substr]:
        count += 1
    count = count_occurrences(s[1:], substr, count) ## RECURSIVE CALL
    return count
Output behavior:
count_occurrences("hishiihisha", "hi", 0) => 3
count_occurrences("xxAbx", "xx") => 1 (it is not mandatory to pass the count, since the parameter has a default value.)

Python text encryption: rot13

I am currently doing an assignment that encrypts text using ROT13, but some of my text won't register.
# cgi is to escape html
# import cgi
def rot13(s):
    #string encrypted
    scrypt=''
    alph='abcdefghijklmonpqrstuvwxyz'
    for c in s:
        # check if char is in alphabet
        if c.lower() in alph:
            #find c in alph and return its place
            i = alph.find(c.lower())
            #encrypt char = c incremented by 13
            ccrypt = alph[ i+13 : i+14 ]
            #add encrypted char to string
            if c==c.lower():
                scrypt+=ccrypt
            if c==c.upper():
                scrypt+=ccrypt.upper()
        #dont encrypt special chars or spaces
        else:
            scrypt+=c
    return scrypt
    # return cgi.escape(scrypt, quote = True)

given_string = 'Rot13 Test'
print rot13(given_string)
OUTPUT:
13 r
[Finished in 0.0s]
Hmmm, seems like a bunch of things are not working.
The main problem is in ccrypt = alph[ i+13 : i+14 ]: you're missing a % len(alph). Without it, if, for example, i is equal to 18, the index ends up beyond the end of the string.
In your output, in fact, only e is encoded to r, because it's the only letter in your test string which, moved by 13, doesn't go out of bounds.
The rest of this answer is just tips to clean up the code a little:
instead of alph='abc.. you can put import string at the beginning of the script and use string.lowercase (string.ascii_lowercase in Python 3)
instead of string slicing, for just one character it's better to use string[i], which gets the work done
instead of c == c.upper(), you can use the string method: if c.isupper() ....
The trouble you're having is with your slice. It will be empty if your character is in the second half of the alphabet, because i+13 will be off the end. There are a few ways you could fix it.
The simplest might be to simply double your alphabet string (literally: alph = alph * 2). This means you can access values up to 52, rather than just up to 26. This is a pretty crude solution though, and it would be better to just fix the indexing.
A better option would be to subtract 13 from your index, rather than adding 13. Rot13 is symmetric, so both will have the same effect, and it will work because negative indexes are legal in Python (they refer to positions counted backwards from the end).
In either case, it's not actually necessary to do a slice at all. You can simply grab a single value (unlike C, there's no char type in Python, so single characters are strings too). If you were to make only this change, it would probably make it clear why your current code is failing, as trying to access a single value off the end of a string will raise an exception.
Edit: Actually, after thinking about what solution is really best, I'm inclined to suggest avoiding index-math based solutions entirely. A better approach is to use Python's fantastic dictionaries to do your mapping from original characters to encrypted ones. You can build and use a Rot13 dictionary like this:
alph="abcdefghijklmnopqrstuvwxyz"
rot13_table = dict(zip(alph, alph[13:]+alph[:13])) # lowercase character mappings
rot13_table.update((c.upper(),rot13_table[c].upper()) for c in alph) # uppercase
def rot13(s):
    return "".join(rot13_table.get(c, c) for c in s) # non-letters are passed through unchanged
First thing that may have caused you some problems - your alphabet string has the n and the o switched, so you'll want to adjust that :) As for the algorithm, when you run:
ccrypt = alph[ i+13 : i+14 ]
Think of what happens when find returns 25 (for z). You are now looking up alph[38:39], which is far past the end of the 26-character string, so the slice is '' (side note: for a single character you can index directly instead of slicing, although an out-of-range index such as alph[38] raises an IndexError rather than returning ''):
In [1]: s = 'abcde'
In [2]: s[2]
Out[2]: 'c'
In [3]: s[2:3]
Out[3]: 'c'
In [4]: s[49:50]
Out[4]: ''
As for how to fix it, there are a number of interesting methods. Your code functions just fine with a few modifications. One thing you could do is create a mapping of characters that are already 'rotated' 13 positions:
alph = 'abcdefghijklmnopqrstuvwxyz'
coded = 'nopqrstuvwxyzabcdefghijklm'
All we did here is split the original string into halves of 13 and then swap them - we now know that if we take a letter like a and get its position (0), the same position in the coded string will be the rot13 value. As this is for an assignment I won't spell out how to do it, but see if that gets you on the right track (and @Makoto's suggestion is a perfect way to check your results).
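For readers who are not doing this as an assignment, the alph/coded lookup described above might be sketched as follows. The function name rot13_lookup is made up, and handling upper case by lower-casing first is an assumption of this sketch.

```python
alph = 'abcdefghijklmnopqrstuvwxyz'
coded = 'nopqrstuvwxyzabcdefghijklm'


def rot13_lookup(text):
    """Replace each letter with the letter at the same position in `coded`."""
    result = []
    for c in text:
        position = alph.find(c.lower())
        if position == -1:            # not a letter: keep it unchanged
            result.append(c)
            continue
        rotated = coded[position]
        result.append(rotated.upper() if c.isupper() else rotated)
    return "".join(result)


print(rot13_lookup('Rot13 Test'))
```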
This line
ccrypt = alph[ i+13 : i+14 ]
does not do what you think it does - it returns a string slice from i+13 to i+14, but if these indices are greater than the length of the string, the slice will be empty:
"abc"[5:6] #returns ''
This means your solution turns everything from n onward into an empty string, which produces your observed output.
The correct way of implementing this would be (1.) using a modulo operation to constrain the index to a valid number and (2.) using simple character access instead of string slices, which is easier to read, faster, and throws an IndexError for invalid indices, meaning your error would have been obvious.
ccrypt = alph[(i+13) % 26]
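Applied to the whole function, the modulo fix might look like the sketch below. It is written for Python 3, uses the corrected alphabet from string.ascii_lowercase, and cross-checks the result against the standard library's rot13 codec; the name rot13_mod is chosen here for illustration.

```python
import codecs
import string


def rot13_mod(text):
    """ROT13 using (i + 13) % 26 so the index wraps around the alphabet."""
    alph = string.ascii_lowercase
    out = []
    for c in text:
        i = alph.find(c.lower())
        if i == -1:                   # digits, spaces, punctuation stay as they are
            out.append(c)
        else:
            rotated = alph[(i + 13) % 26]
            out.append(rotated.upper() if c.isupper() else rotated)
    return "".join(out)


sample = 'Rot13 Test'
print(rot13_mod(sample))
print(rot13_mod(sample) == codecs.encode(sample, 'rot13'))
```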
If you're doing this as an exercise for a course in Python, ignore this, but just saying...
>>> import codecs
>>> codecs.encode('Some text', 'rot13')
'Fbzr grkg'
>>>
