How to properly unittest in Python - python

I have a function that does the following, and my question is how to unit test it. I am fairly new to Python's unittest module.
The question and solution are as follows:
Given a string consisting of ‘0’, ‘1’ and ‘?’ wildcard characters, generate all binary strings that can be formed by replacing each wildcard character with ‘0’ or ‘1’.
Example :
Input str = "1??0?101"
Output:
10000101
10001101
10100101
10101101
11000101
11001101
11100101
11101101
Solution:
def _print(string, index):
    if index == len(string):
        print(''.join(string))
        return
    if string[index] == "?":
        # replace '?' by '0' and recurse
        string[index] = '0'
        _print(string, index + 1)
        # replace '?' by '1' and recurse
        string[index] = '1'
        _print(string, index + 1)
        # NOTE: Need to backtrack as string
        # is passed by reference to the
        # function
        string[index] = '?'
    else:
        _print(string, index + 1)

# Driver code
if __name__ == "__main__":
    string = "1??0?101"
    string = list(string)  # don't forget to convert the string to a list
    _print(string, 0)
Output:
10000101
10001101
10100101
10101101
11000101
11001101
11100101
11101101
Questions:
1. Also, is there a way of returning a list as output instead of printing them out?
2. Which assert test cases are appropriate in this scenario?
3. What would be the best end to end test cases to cover in this case?
4. What could be a better approach of solving this in terms of time and space complexity?
I have tried this which doesn't seem to work:
import unittest
from wildcard import _print

class TestWildCard(unittest.TestCase):
    def test_0_print(self):
        print("Start wildCard _print test: \n")
        result = 111
        self.assertEquals(_print("1?1",0),result,"Results match")

Answers:
1: Sure. Instead of printing each value, append it to a list with result.append('some value'). Initialise the list at the start of the function with result = [] and return it once the function is done with return result. It is probably also better not to call the function _print, but something like bit_strings.
ad 1: Since your function is recursive, you also need to capture the return value of each recursive call and add it to the result, so result += _print(string, index + 1).
2: You should typically think of edge cases and test them separately, or group together those that really test a single aspect of your function. There is no one way to state what the tests should look like. If there were, the test framework would just generate them for you.
3: Same answer as 2.
Your code becomes:
def bit_strings(s, index):
    result = []
    if index == len(s):
        result.append(''.join(s))
        return result
    if s[index] == "?":
        # replace '?' by '0' and recurse
        s[index] = '0'
        result += bit_strings(s, index + 1)
        # replace '?' by '1' and recurse
        s[index] = '1'
        result += bit_strings(s, index + 1)
        # NOTE: Need to backtrack as string
        # is passed by reference to the
        # function
        s[index] = '?'
    else:
        result += bit_strings(s, index + 1)
    return result

# Driver code
if __name__ == "__main__":
    x = "1??0?101"
    xl = list(x)  # don't forget to convert the string to a list
    print(bit_strings(xl, 0))
There are more efficient ways of doing this, but I just modified your code in line with the questions and answers.
I've renamed string to s, since string is a bit confusing: it reminds readers of the type and shadows the standard-library string module.
As for the unit test:
import unittest
from wildcard import bit_strings

class TestWildCard(unittest.TestCase):
    def test_0_print(self):
        print("Start wildCard _print test: \n")
        # you only had one case here and it's a list now
        result = ['101', '111']
        # use assertEqual, not assertEquals
        # you were passing in a string, but your code assumed a list, so list() added
        self.assertEqual(bit_strings(list("1?1"), 0), result, "Results match")
When using an environment like PyCharm, it helps to call the file test<something>.py (i.e. have test in the name), so that it helps you run the unit tests more easily.
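The edge cases mentioned in answer 2 could be sketched roughly as follows. This is only an illustration, not the one right test suite: expand_wildcards is a hypothetical one-argument variant of the function (it takes the pattern as a plain string), and the particular cases chosen are assumptions about what is worth covering.

```python
import unittest


def expand_wildcards(pattern):
    """Hypothetical helper: fill each '?' in pattern with '0' and '1'."""
    results = ['']
    for ch in pattern:
        # A wildcard offers two choices, any other character just one.
        choices = '01' if ch == '?' else ch
        results = [prefix + c for prefix in results for c in choices]
    return results


class TestExpandWildcards(unittest.TestCase):
    def test_empty_pattern(self):
        self.assertEqual(expand_wildcards(""), [""])

    def test_no_wildcard(self):
        self.assertEqual(expand_wildcards("101"), ["101"])

    def test_single_wildcard(self):
        self.assertEqual(expand_wildcards("1?1"), ["101", "111"])

    def test_only_wildcards(self):
        self.assertEqual(expand_wildcards("??"), ["00", "01", "10", "11"])

    def test_result_count_doubles_per_wildcard(self):
        # Three wildcards should give 2 ** 3 results.
        self.assertEqual(len(expand_wildcards("1??0?101")), 8)
```

Saved in a file whose name starts with test, this can be run with python -m unittest.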
Two alternate solutions as requested in comment (one still recursive, just a lot more concise, the other not recursive but arguably a bit wasteful with result lists - just two quickies):
from timeit import timeit

def unblank_bits(bits):
    if not bits:
        yield ''
    else:
        for ch in '01' if bits[0] == '?' else bits[0]:
            for continuation in unblank_bits(bits[1:]):
                yield ch + continuation

print(list(unblank_bits('0??100?1')))

def unblank_bits_non_recursive(bits):
    result = ['']
    for ch in bits:
        if ch == '?':
            result = [s+'0' for s in result] + [s+'1' for s in result]
        else:
            result = [s+ch for s in result]
    return result

print(list(unblank_bits_non_recursive('0??100?1')))
print(timeit(lambda: list(unblank_bits('0??100?1'))))
print(timeit(lambda: list(unblank_bits_non_recursive('0??100?1'))))
These solutions don't convert between lists and strings, as there is no need to, and they don't modify the input values. As you can tell from the timings, the recursive one is slower, but I prefer it for readability. The output:
['00010001', '00010011', '00110001', '00110011', '01010001', '01010011', '01110001', '01110011']
['00010001', '01010001', '00110001', '01110001', '00010011', '01010011', '00110011', '01110011']
13.073874
3.9742709000000005
Note that your own solution ran in about 8 seconds using the same setup, so the "improved version" I suggested is simpler, but not faster, so you may prefer the latter solution.
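A third variant, not part of the answer above, would be to let itertools.product build the combinations. This is a sketch only: the name unblank_bits_product is made up here, and it has not been timed against the other versions.

```python
from itertools import product


def unblank_bits_product(bits):
    """Expand '?' wildcards using itertools.product (illustrative sketch)."""
    # Each position contributes its possible characters:
    # '01' for a wildcard, the character itself otherwise.
    choices = ['01' if ch == '?' else ch for ch in bits]
    return [''.join(combo) for combo in product(*choices)]


print(unblank_bits_product('1?1'))
```

The results come out in the same order as the recursive generator, because product varies the last position fastest.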

Related

how to recursively create nested list from string input

So, I would like to convert my string input
'f(g,h(a,b),a,b(g,h))'
into the following list
['f',['g','h',['a','b'],'a','b',['g','h']]]
Essentially, I would like to replace all '(' into [ and all ')' into ].
I have unsuccessfully tried to do this recursively. I thought I would iterate through all the variables through my word and then when I hit a '(' I would create a new list and start extending the values into that newest list. If I hit a ')', I would stop extending the values into the newest list and append the newest list to the closest outer list. But I am very new to recursion, so I am struggling to think of how to do it
word='f(a,f(a))'
empty=[]
def newlist(word):
    listy=[]
    for i, letter in enumerate(word):
        if letter=='(':
            return newlist([word[i+1:]])
        if letter==')':
            listy.append(newlist)
        else:
            listy.extend(letter)
    return empty.append(listy)
Assuming your input is something like this:
a = 'f,(g,h,(a,b),a,b,(g,h))'
We start by splitting it into primitive parts ("tokens"). Since your tokens are always a single symbol, this is rather easy:
tokens = list(a)
Now we need two functions to work with the list of tokens: next_token tells us which token we're about to process and pop_token marks a token as processed and removes it from the list:
def next_token():
    return tokens[0] if tokens else None

def pop_token():
    tokens.pop(0)
Your input consists of "items", separated by commas. Schematically, it can be expressed as
items = item ( ',' item )*
In the Python code, we first read one item and then keep reading further items while the next token is a comma:
def items():
    result = [item()]
    while next_token() == ',':
        pop_token()
        result.append(item())
    return result
An "item" is either a sublist in parentheses or a letter:
def item():
    return sublist() or letter()
To read a sublist, we check if the token is a '(', then use items above to read the content, and finally check for the ')' and panic if it is not there:
def sublist():
    if next_token() == '(':
        pop_token()
        result = items()
        if next_token() == ')':
            pop_token()
            return result
        raise SyntaxError()
letter simply returns the next token. You might want to add some checks here to make sure it's indeed a letter:
def letter():
    result = next_token()
    pop_token()
    return result
You can organize the above code like this: have one function parse that accepts a string and returns a list and put all functions above inside this function:
def parse(input_string):
    def items():
        ...
    def sublist():
        ...
    ...etc
    tokens = list(input_string)
    return items()
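Put together, the pieces above might look like the following sketch. The structure follows the functions described in this answer, but three details are assumptions added here: item() checks for '(' explicitly instead of relying on sublist() returning None, letter() rejects anything that is not a letter, and parse() complains about leftover tokens.

```python
def parse(input_string):
    tokens = list(input_string)

    def next_token():
        return tokens[0] if tokens else None

    def pop_token():
        tokens.pop(0)

    def items():
        # items = item ( ',' item )*
        result = [item()]
        while next_token() == ',':
            pop_token()
            result.append(item())
        return result

    def item():
        return sublist() if next_token() == '(' else letter()

    def sublist():
        pop_token()  # consume '('
        result = items()
        if next_token() != ')':
            raise SyntaxError("expected ')'")
        pop_token()  # consume ')'
        return result

    def letter():
        result = next_token()
        # Assumed check: only alphabetic tokens count as letters.
        if result is None or not result.isalpha():
            raise SyntaxError("expected a letter, got %r" % (result,))
        pop_token()
        return result

    result = items()
    if tokens:  # assumed check: nothing may follow the top-level items
        raise SyntaxError("unexpected %r" % (tokens[0],))
    return result


print(parse('f,(g,h,(a,b),a,b,(g,h))'))
```

For the input assumed at the start of this answer, this yields ['f', ['g', 'h', ['a', 'b'], 'a', 'b', ['g', 'h']]].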
Quite an interesting question, and one I originally misinterpreted. But now this solution works accordingly. Note that I have used list concatenation + operator for this solution (which you usually want to avoid) so feel free to improve upon it however you see fit.
Good luck, and I hope this helps!
# set some global values, I prefer to keep them
# as sets in case you need to add functionality
# eg if you also want {{a},b} or [ab<c>ed] to work
OPEN_PARENTHESIS = set(["("])
CLOSE_PARENTHESIS = set([")"])
SPACER = set([","])

def recursive_solution(input_str, index):
    # base case A: when index exceeds or equals len(input_str)
    if index >= len(input_str):
        return [], index
    char = input_str[index]
    # base case B: when we reach a closed parenthesis stop this level of recursive depth
    if char in CLOSE_PARENTHESIS:
        return [], index
    # do the next recursion, return its value and the index it stops at
    recur_val, recur_stop_i = recursive_solution(input_str, index + 1)
    # with an open parenthesis, we want to continue the recursion after its associated
    # closed parenthesis. and also the recur_val should be within a new dimension of the list
    if char in OPEN_PARENTHESIS:
        continued_recur_val, continued_recur_stop_i = recursive_solution(input_str, recur_stop_i + 1)
        return [recur_val] + continued_recur_val, continued_recur_stop_i
    # for spacers eg "," we just ignore it
    if char in SPACER:
        return recur_val, recur_stop_i
    # and finally with normal characters, we just extend it
    return [char] + recur_val, recur_stop_i
You can get the expected answer using the following code but it's still in string format and not a list.
import re
a='(f(g,h(a,b),a,b(g,h))'
ans=[]
sub=''
def rec(i,sub):
    if i>=len(a):
        return sub
    if a[i]=='(':
        if i==0:
            sub=rec(i+1,sub+'[')
        else:
            sub=rec(i+1,sub+',[')
    elif a[i]==')':
        sub=rec(i+1,sub+']')
    else:
        sub=rec(i+1,sub+a[i])
    return sub
b=rec(0,'')
print(b)
b=re.sub(r"([a-z]+)", r"'\1'", b)
print(b,type(b))
Output
[f,[g,h,[a,b],a,b,[g,h]]
['f',['g','h',['a','b'],'a','b',['g','h']] <class 'str'>

Recursive function to obtain non-repeated characters from string

I have this exercise:
Write a recursive function that takes a string and returns all the characters that are not repeated in said string.
The characters in the output don't need to have the same order as in the input string.
First I tried this, but given the condition for the function to stop, it never evaluates the last character:
i=0
lst = []
def list_of_letters_rec(str=""):
    if str[i] not in lst and i < len(str) - 1:
        lst.append(str[i])
        list_of_letters_rec(str[i+1:])
    elif str[i] in lst and i < len(str) - 1:
        list_of_letters_rec(str[i+1:])
    elif i > len(str) - 1:
        return lst
    return lst
word = input(str("Word?"))
print(list_of_letters_rec(word))
The main issue with this function is that it never evaluates the last character.
An example of an output:
['a', 'r', 'd', 'v'] for input 'aardvark'.
Since the characters don't need to be ordered, I suppose a better approach would be to do the recursion backwards, and I also tried another approach (below), but no luck:
lst = []
def list_of_letters_rec(str=""):
    n = len(str) - 1
    if str[n] not in lst and n >= 0:
        lst.append(str[n])
        list_of_letters_rec(str[:n-1])
    elif str[n] in lst and n >= 0:
        list_of_letters_rec(str[:n-1])
    return lst
word = input(str("Word?"))
print(list_of_letters_rec(word))
Apparently, the stop conditions are not well defined, especially in the last one, as the output I get is
IndexError: string index out of range
Could you give me any hints to help me correct the stop condition, either in the 1st or 2nd try?
You can try:
word = input("> ")
result = [l for l in word if word.count(l) < 2]
> aabc
['b', 'c']
One improvement I would offer on @trincot's answer is the use of a set, which has better look-up time, O(1), compared to lists, O(n).
1. If the input string, s, is empty, return the empty result.
2. (inductive) s has at least one character. If the first character, s[0], is in the memo, mem, the character has already been seen. Return the result of the sub-problem, s[1:].
3. (inductive) The first character is not in the memo. Add the first character to the memo and prepend the first character to the result of the sub-problem, s[1:].
def list_of_letters(s, mem = set()):
    if not s:
        return ""                                              #1
    elif s[0] in mem:
        return list_of_letters(s[1:], mem)                     #2
    else:
        return s[0] + list_of_letters(s[1:], {*mem, s[0]})     #3

print(list_of_letters("aardvark"))
ardvk
Per your comment, the exercise asks only for a string as input. We can easily modify our program to privatize mem -
def list_of_letters(s):                  # public api
    def loop(s, mem):                    # private api
        if not s:
            return ""
        elif s[0] in mem:
            return loop(s[1:], mem)
        else:
            return s[0] + loop(s[1:], {*mem, s[0]})
    return loop(s, set())                # run private func

print(list_of_letters("aardvark"))       # mem is invisible to caller
ardvk
Python's native set data type accepts an iterable which solves this problem instantly. However this doesn't teach you anything about recursion :D
print("".join(set("aardvark")))
akdrv
Some issues:
You miss the last character because of i < len(str) - 1 in the conditionals. That should be i < len(str) (but read the next points, as this still needs change)
The test for if i > len(str) - 1 should come first, before doing anything else, otherwise you'll get an invalid index reference. This also makes the other conditions on the length unnecessary.
Don't name your variable str, as that is already a used name for the string type.
Don't populate a list that is global. By doing this, you can only call the function once reliably. Any next time the list will still have the result of the previous call, and you'll be adding to that. Instead use the list that you get from the recursive call. In the base case, return an empty list.
The global i has no use, since you never change its value; it is always 0. So you should just reference index [0] and check that the string is not empty.
Here is your code with those corrections:
def list_of_letters_rec(s=""):
    if not s:
        return []
    result = list_of_letters_rec(s[1:])
    if s[0] not in result:
        result.append(s[0])
    return result

print(list_of_letters_rec("aardvark"))
NB: This is not the most optimal way to do it. But I guess this is what you are asked to do.
A possible solution would be to just use an index instead of slicing the string:
def list_of_letters_rec(string="", index=0, lst=None):
    if lst is None:  # a fresh list per top-level call; a default of [] would be shared between calls
        lst = []
    if len(string) == index:
        return lst
    char = string[index]
    if string.count(char) == 1:
        lst.append(char)
    return list_of_letters_rec(string, index + 1, lst)

word = input("Word?")
print(list_of_letters_rec(word))

how to use python to find the first not repeating character?

I am solving a problem: Given a string s consisting of lowercase English letters, find and return the first instance of a non-repeating character in it. If there is no such character, return '_'.
For example: s = "abacabad", the output should be firstNotRepeatingCharacter(s) = 'c'.
I wrote some simple code that passes all the sample tests, but when I submit it, it reports an error. Does anyone know what's wrong with my code? Thank you!
def firstNotRepeatingCharacter(s):
    for i in list(s):
        if list(s).count(i) == 1:
            return i
    return '_'
Could be a performance issue as your repeated count (and unnecessary list conversions) calls make this approach quadratic. You should aim for a linear solution:
from collections import Counter

def firstNotRepeatingCharacter(s):
    c = Counter(s)
    for i in s:
        if c[i] == 1:
            return i
    return '_'
You can also use next with a generator and a default value:
def firstNotRepeatingCharacter(s):
    c = Counter(s)
    return next((i for i in s if c[i] == 1), '_')
If you can only use built-ins, just make your own counter (or any other data structure that allows you to identify duplicates):
def firstNotRepeatingCharacter(s):
    c = {}
    for i in s:
        c[i] = c.get(i, 0) + 1
    return next((i for i in s if c[i] == 1), '_')
The task at hand is to find the first non-repeating character in a given string, e.g. s = 'aabsacbhsba'.
def solution(s):
    # This will take each character from the given string s one at a time
    for i in s:
        if s.index(i) == s.rindex(i):  # rindex() returns last index of i in s
            return i
    return '_'
Here s.rindex(i) finds the last occurrence of the character i in s, and s.index(i) finds its first occurrence. If both return the same index, the character occurs exactly once, so the first character for which this holds is the first non-repeating one.
You can find definition and usage of rindex() at : W3School rindex()
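To make the index/rindex comparison concrete, here is a small sketch using the example string from the question. The variable names positions and first_unique are only illustrative.

```python
s = "abacabad"

# First and last position of each distinct character, in order of appearance.
positions = {ch: (s.index(ch), s.rindex(ch)) for ch in dict.fromkeys(s)}
print(positions)

# A character is non-repeating when its first and last positions coincide.
first_unique = next(
    (ch for ch, (first, last) in positions.items() if first == last), '_'
)
print(first_unique)
```

Note that each index or rindex call scans the string, so this approach is not linear in the length of s the way the Counter version is.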

Recursive function in python does not call itself out

The problem is formulated as follows:
Write a recursive function that, given a string, checks if the string is formed by two halves equal to each other (i.e. s = s1 + s2, with s1 = s2), imposing the constraint that the equality operator == can only be applied to strings of length ≤1. If the length of the string is odd, return an error.
I wrote this code in Python 2.7 that is correct (it gives me the right answer every time) but does not enter that recursive loop at all. So can I omit that call here?
def recursiveHalfString(s):
    ##param s: string
    ##return bool
    if (len(s))%2==0: #verify if the rest of the division by 2 = 0 (even number)
        if len(s)<=1: # case in which I can use the == operator
            if s[0]==s[1]:
                return True
            else:
                return False
        if len(s)>1:
            if s[0:len(s)/2] != s[len(s)/2:len(s)]: # here I used != instead of ==
                if s!=0:
                    return False
                else:
                    return recursiveHalfString(s[0:(len(s)/2)-1]+s[(len(s)/2)+1:len(s)]) # broken call
            return True
    else:
        return "Error: odd string"
The expected result is True if the string is like "abbaabba",
or False for any string that does not follow the "wordword" pattern.
This is a much simplified recursive version that actually uses the single char comparison to reduce the problem size:
def rhs(s):
    half, rest = divmod(len(s), 2)
    if rest:  # odd length
        raise ValueError  # return 'error'
    if half == 0:  # simplest base case: empty string
        return True
    return s[0] == s[half] and rhs(s[1:half] + s[half+1:])
It has to be said though that, algorithmically, this problem does not lend itself well to a recursive approach, given the constraints.
Here is another recursive solution. A good rule of thumb when taking a recursive approach is to first think about your base case.
def recursiveHalfString(s):
    # base case, if string is empty
    if s == '':
        return True
    if (len(s))%2==0:
        # // is integer division, so the index stays an int on Python 3 as well
        if s[0] != s[(len(s)//2)]:
            return False
        else:
            left = s[1:len(s)//2]  # the left half of the string without first char
            right = s[(len(s)//2)+1: len(s)]  # the right half without first char
            return recursiveHalfString(left + right)
    else:
        return "Error: odd string"

print(recursiveHalfString('abbaabba'))   # True
print(recursiveHalfString('fail'))       # False
print(recursiveHalfString('oddstring'))  # Error: odd string
This function will split the string into two halves, compare the first characters and recursively call itself with the two halves concatenated together without the leading characters.
However like stated in another answer, recursion is not necessarily an efficient solution in this case. This approach creates a lot of new strings and is in no way an optimal way to do this. It is for demonstration purposes only.
Another recursive solution that doesn't involve creating a bunch of new strings might look like:
def recursiveHalfString(s, offset=0):
    half, odd = divmod(len(s), 2)
    assert(not odd)
    # stop once offset reaches half; s[half + offset] would otherwise run past the end
    if not s or offset >= half:
        return True
    if s[offset] != s[half + offset]:
        return False
    return recursiveHalfString(s, offset + 1)
However, as @schwobaseggl suggested, a recursive approach here is a bit clunkier than a simple iterative approach:
def recursiveHalfString(s, offset=0):
    half, odd = divmod(len(s), 2)
    assert(not odd)
    for offset in range(half):
        if s[offset] != s[half + offset]:
            return False
    return True

Getting the middle character in a odd length string

def get_middle_character(odd_string):
    variable = len(odd_string)
    x = str((variable/2))
    middle_character = odd_string.find(x)
    middle_character2 = odd_string[middle_character]
    return middle_character2

def main():
    print('Enter a odd length string: ')
    odd_string = input()
    print('The middle character is', get_middle_character(odd_string))

main()
I need to figure out how to print the middle character in a given odd length string. But when I run this code, I only get the last character. What is the problem?
You need to think more carefully about what your code is actually doing. Let's do this with an example:
def get_middle_character(odd_string):
Let's say that we call get_middle_character('hello'), so odd_string is 'hello':
variable = len(odd_string) # variable = 5
Everything is OK so far.
x = str((variable/2)) # x = '2'
This is the first thing that is obviously odd - why do you want the string '2'? That's the index of the middle character, don't you just want an integer? Also you only need one pair of parentheses there, the other set is redundant.
middle_character = odd_string.find(x) # middle_character = -1
Obviously you can't str.find the substring '2' in odd_string, because it was never there. str.find returns -1 if it cannot find the substring; you should use str.index instead, which gives you a nice clear ValueError when it can't find the substring.
Note that even if you were searching for the middle character, rather than the stringified index of the middle character, you would get into trouble as str.find gives the first index at which the substring appears, which may not be the one you're after (consider 'lolly'.find('l')...).
middle_character2 = odd_string[middle_character] # middle_character2 = 'o'
As Python allows negative indexing from the end of a sequence, -1 is the index of the last character.
return middle_character2 # return 'o'
You could actually have simplified to return odd_string[middle_character], and removed the superfluous assignment; you'd have still had the wrong answer, but from neater code (and without middle_character2, which is a terrible name).
Hopefully you can now see where you went wrong, and it's trivially obvious what you should do to fix it. Next time use e.g. Python Tutor to debug your code before asking a question here.
You simply need to access the character by its index in the string, using string slicing for the two halves. For example:
>>> s = '1234567'
>>> middle_index = len(s) // 2
>>> first_half, middle, second_half = s[:middle_index], s[middle_index], s[middle_index+1:]
>>> first_half, middle, second_half
('123', '4', '567')
Explanation:
str[:n]: returns the substring from index 0 to index n-1
str[n]: returns the character at index n
str[n:]: returns the substring from index n to the end of the string
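The indexing above can be wrapped into a small function. This is a minimal sketch: the name middle_character and the choice to raise ValueError for even-length input are assumptions made here, not something the question prescribes.

```python
def middle_character(text):
    """Return the middle character of an odd-length string."""
    if len(text) % 2 == 0:
        # Assumed behaviour: an even-length (or empty) string has no single middle.
        raise ValueError("expected an odd-length string")
    # Integer division gives the middle index, e.g. 5 // 2 == 2.
    return text[len(text) // 2]


print(middle_character('hello'))
print(middle_character('1234567'))
```

Using // rather than / keeps the index an integer on Python 3, where / always produces a float.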
It should look like this:
def get_middle_character(odd_string):
    variable = len(odd_string) // 2  # integer index of the middle character
    middle_character = odd_string[variable]
    return middle_character
I know it's too late, but here is my solution.
I hope it will be useful ;)
def get_middle_char(string):
    if len(string) % 2 == 0:
        return None
    elif len(string) <= 1:
        return None
    str_len = len(string) // 2
    return string[str_len]
reversedString = ''
print('What is your name')
str = input()
idx = len(str)
print(idx)
str_to_iterate = str
for char in str_to_iterate[::-1]:
    print(char)
evenodd = len(str) % 2
if evenodd == 0:
    print('even')
else:
    print('odd')
l = str
if len(l) % 2 == 0:
    x = len(l) // 2
    y = len(l) // 2 - 1
    print(l[x], l[y])
else:
    n = len(l) // 2
    print(l[n])
