I wanted to make an infix to prefix converter. When I run the code, all the operators end up at the beginning of the returned string.
How can I fix the code below?
class Stack:
    def __init__(self):
        self.a = []
    def isEmpty(self):
        return self.a == []
    def push(self,i):
        self.a.append(i)
    def pop(self):
        return self.a.pop()
    def peek(self):
        return self.a[len(self.a)-1]
def infixToPrefix(s):
    prec = {'/':3,'*':3,'+':2,'-':2,'^':4,'(':1}
    opStack = Stack()
    prefixList = []
    temp = []
    for token in s:
        if token in "ABCDEFGHIJKLMNOPQRSTUVWXYZ" or token in "0123456789":
            prefixList.append(token)
        elif token == '(':
            opStack.push(token)
        elif token == ')':
            topToken = opStack.pop()
            while topToken != '(':
                temp.append(topToken)
                topToken = opStack.pop()
            prefixList = temp + prefixList
            temp = []
        else:
            while (not opStack.isEmpty()) and \
                    (prec[opStack.peek()]>= prec[token]):
                temp.append(opStack.pop())
            prefixList = temp + prefixList
            temp = []
            opStack.push(token)
    while not opStack.isEmpty():
        temp.append(opStack.pop())
    prefixList = temp + prefixList
    return ''.join(prefixList)
print(infixToPrefix("(A+B)*C-(D-E)*(F+G)"))
Don't reinvent the wheel. Use a parser generator instead. For example, PLY (Python lex-yacc) is a good option. You can start by looking at a basic example and either do the conversion within the production rules themselves, or produce an abstract syntax tree equipped with flattening methods that return prefix, infix, or postfix notation. Note that the difference between these three is whether the operator is inserted in pre-, in between, or post-order during a depth-first traversal of the syntax tree (implemented either as a single function or recursively -- the latter leads to simpler and more modular code).
It might be late to post this answer, but I'm leaving it as a reference for anyone else. OP, it seems you have already solved the conversion from infix to postfix. If that's the case, you can reuse that same algorithm and code to convert your text to prefix notation.
All you need to do is reverse your text first, and then pass the reversed text through your algorithm. Because the input is reversed, the operators also go onto your Stack in reverse order. Once that is done, reverse the result again and you get your prefix notation.
Be sure to keep track of what you compare in your dictionary though: in the reversed text the parentheses are mirrored, so you will no longer be comparing against "(" but against ")".
Hope this helps.
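A minimal sketch of the reversal approach described above, assuming single-character operands and the same operator precedences as in the question. The function name infix_to_prefix and the right_assoc set are illustrative and not part of the original code. Two details matter: the parentheses are swapped while reversing, and operators of equal precedence are only popped for right-associative operators, which is the mirror image of the usual postfix rule.

```python
def infix_to_prefix(expr):
    """Convert an infix string with single-character tokens to prefix notation."""
    prec = {"^": 4, "*": 3, "/": 3, "+": 2, "-": 2}
    right_assoc = {"^"}  # assumption: only '^' is right-associative
    swap = {"(": ")", ")": "("}

    # Step 1: reverse the input and mirror the parentheses.
    tokens = [swap.get(ch, ch) for ch in reversed(expr) if not ch.isspace()]

    # Step 2: ordinary infix-to-postfix conversion on the reversed input,
    # with the tie-breaking rule for equal precedence flipped.
    output, stack = [], []
    for tok in tokens:
        if tok.isalnum():
            output.append(tok)
        elif tok == "(":
            stack.append(tok)
        elif tok == ")":
            while stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()  # discard the matching "("
        else:
            while stack and stack[-1] != "(":
                top = stack[-1]
                if prec[top] > prec[tok] or (
                    prec[top] == prec[tok] and tok in right_assoc
                ):
                    output.append(stack.pop())
                else:
                    break
            stack.append(tok)
    while stack:
        output.append(stack.pop())

    # Step 3: reverse the postfix result to obtain prefix notation.
    return "".join(reversed(output))


print(infix_to_prefix("(A+B)*C-(D-E)*(F+G)"))  # -*+ABC*-DE+FG
```

The sketch does no error handling: unbalanced parentheses or unknown operators raise IndexError or KeyError.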
So, I would like to convert my string input
'f(g,h(a,b),a,b(g,h))'
into the following list
['f',['g','h',['a','b'],'a','b',['g','h']]]
Essentially, I would like to replace every '(' with [ and every ')' with ].
I have unsuccessfully tried to do this recursively. My idea was to iterate through all the characters of my word, and when I hit a '(' I would create a new list and start extending the values into that newest list. If I hit a ')', I would stop extending the values into the newest list and append the newest list to the closest outer list. But I am very new to recursion, so I am struggling to think of how to do it.
word='f(a,f(a))'
empty=[]
def newlist(word):
    listy=[]
    for i, letter in enumerate(word):
        if letter=='(':
            return newlist([word[i+1:]])
        if letter==')':
            listy.append(newlist)
        else:
            listy.extend(letter)
    return empty.append(listy)
Assuming your input is something like this:
a = 'f,(g,h,(a,b),a,b,(g,h))'
We start by splitting it into primitive parts ("tokens"). Since your tokens are always a single symbol, this is rather easy:
tokens = list(a)
Now we need two functions to work with the list of tokens: next_token tells us which token we're about to process and pop_token marks a token as processed and removes it from the list:
def next_token():
    return tokens[0] if tokens else None
def pop_token():
    tokens.pop(0)
Your input consists of "items", separated by commas. Schematically, it can be expressed as
items = item ( ',' item )*
In the Python code, we first read one item and then keep reading further items while the next token is a comma:
def items():
    result = [item()]
    while next_token() == ',':
        pop_token()
        result.append(item())
    return result
An "item" is either a sublist in parentheses or a letter:
def item():
    return sublist() or letter()
To read a sublist, we check if the token is a '(', then use items above to read the content, and finally check for the ')' and panic if it is not there:
def sublist():
    if next_token() == '(':
        pop_token()
        result = items()
        if next_token() == ')':
            pop_token()
            return result
        raise SyntaxError()
letter simply returns the next token. You might want to add some checks here to make sure it's indeed a letter:
def letter():
    result = next_token()
    pop_token()
    return result
You can organize the above code like this: have one function parse that accepts a string and returns a list and put all functions above inside this function:
def parse(input_string):
    def items():
        ...
    def sublist():
        ...
    ...etc
    tokens = list(input_string)
    return items()
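Putting the pieces together, a self-contained version might look like the sketch below. It assumes the comma-separated input form used in this answer ('f,(g,...)' rather than 'f(g,...)'). The letter check and the check for trailing input are additions that the functions above do not have.

```python
def parse(input_string):
    """Parse 'f,(g,h,(a,b))' into ['f', ['g', 'h', ['a', 'b']]]."""
    tokens = list(input_string)

    def next_token():
        return tokens[0] if tokens else None

    def pop_token():
        tokens.pop(0)

    def items():
        # items = item ( ',' item )*
        result = [item()]
        while next_token() == ',':
            pop_token()
            result.append(item())
        return result

    def item():
        # item = sublist | letter
        if next_token() == '(':
            return sublist()
        return letter()

    def sublist():
        pop_token()  # consume '('
        result = items()
        if next_token() != ')':
            raise SyntaxError("expected ')'")
        pop_token()  # consume ')'
        return result

    def letter():
        token = next_token()
        # Added check (not in the original functions): only letters are accepted.
        if token is None or not token.isalpha():
            raise SyntaxError("expected a letter, got %r" % (token,))
        pop_token()
        return token

    result = items()
    # Added check: the whole input must have been consumed.
    if next_token() is not None:
        raise SyntaxError("unexpected trailing input %r" % next_token())
    return result


print(parse('f,(g,h,(a,b),a,b,(g,h))'))
# ['f', ['g', 'h', ['a', 'b'], 'a', 'b', ['g', 'h']]]
```

tokens.pop(0) is linear in the list length; for long inputs a collections.deque with popleft() or a position index would avoid that cost.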
Quite an interesting question, and one I originally misinterpreted, but this solution now works as expected. Note that I have used the list concatenation operator + for this solution (which you usually want to avoid), so feel free to improve upon it however you see fit.
Good luck, and I hope this helps!
# set some global values, I prefer to keep them
# as sets in case you need to add functionality
# e.g. if you also want {{a},b} or [ab<c>ed] to work
OPEN_PARENTHESIS = set(["("])
CLOSE_PARENTHESIS = set([")"])
SPACER = set([","])
def recursive_solution(input_str, index):
    # base case A: when index exceeds or equals len(input_str)
    if index >= len(input_str):
        return [], index
    char = input_str[index]
    # base case B: when we reach a closing parenthesis, stop this level of recursive depth
    if char in CLOSE_PARENTHESIS:
        return [], index
    # do the next recursion, return its value and the index it stops at
    recur_val, recur_stop_i = recursive_solution(input_str, index + 1)
    # with an open parenthesis, we want to continue the recursion after its associated
    # closing parenthesis, and the recur_val should be nested one level deeper in the list
    if char in OPEN_PARENTHESIS:
        continued_recur_val, continued_recur_stop_i = recursive_solution(input_str, recur_stop_i + 1)
        return [recur_val] + continued_recur_val, continued_recur_stop_i
    # spacers, e.g. ",", are simply ignored
    if char in SPACER:
        return recur_val, recur_stop_i
    # and finally, normal characters are prepended to the result
    return [char] + recur_val, recur_stop_i
You can get the expected answer using the following code but it's still in string format and not a list.
import re
a='(f(g,h(a,b),a,b(g,h))'
ans=[]
sub=''
def rec(i,sub):
    if i>=len(a):
        return sub
    if a[i]=='(':
        if i==0:
            sub=rec(i+1,sub+'[')
        else:
            sub=rec(i+1,sub+',[')
    elif a[i]==')':
        sub=rec(i+1,sub+']')
    else:
        sub=rec(i+1,sub+a[i])
    return sub
b=rec(0,'')
print(b)
b=re.sub(r"([a-z]+)", r"'\1'", b)
print(b,type(b))
Output
[f,[g,h,[a,b],a,b,[g,h]]
['f',['g','h',['a','b'],'a','b',['g','h']] <class 'str'>
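To go from such a string to an actual list, one option is ast.literal_eval from the standard library. The sketch below is a variation on the idea above, not the code above itself. It assumes the input has balanced parentheses, starts with a name, and only uses lowercase names (the input string shown above has one more '(' than ')', so its output cannot be evaluated as it stands). The helper name to_nested_list is made up for illustration.

```python
import ast
import re


def to_nested_list(text):
    """Turn 'f(g,h(a,b))' into ['f', ['g', 'h', ['a', 'b']]]."""
    # Quote every lowercase name so it becomes a Python string literal.
    quoted = re.sub(r"([a-z]+)", r"'\1'", text)
    # Each '(' opens a new list element, each ')' closes it.
    bracketed = quoted.replace("(", ",[").replace(")", "]")
    # literal_eval only evaluates literals, so it is safe for this purpose.
    return ast.literal_eval("[" + bracketed + "]")


print(to_nested_list('f(g,h(a,b),a,b(g,h))'))
# ['f', ['g', 'h', ['a', 'b'], 'a', 'b', ['g', 'h']]]
```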
I have a method which does the following. The question is how to unit test this method. I am pretty new to the Python unittest module.
The question and solution are as follows:
Given a string consisting of ‘0’, ‘1’ and ‘?’ wildcard characters, generate all binary strings that can be formed by replacing each wildcard character by ‘0’ or ‘1’.
Example :
Input str = "1??0?101"
Output:
10000101
10001101
10100101
10101101
11000101
11001101
11100101
11101101
Solution:
def _print(string, index):
    if index == len(string):
        print(''.join(string))
        return
    if string[index] == "?":
        # replace '?' by '0' and recurse
        string[index] = '0'
        _print(string, index + 1)
        # replace '?' by '1' and recurse
        string[index] = '1'
        _print(string, index + 1)
        # NOTE: Need to backtrack as the list
        # is passed by reference to the
        # function
        string[index] = '?'
    else:
        _print(string, index + 1)
# Driver code
if __name__ == "__main__":
    string = "1??0?101"
    string = list(string)  # don't forget to convert to a list
    _print(string, 0)
Output:
10000101
10001101
10100101
10101101
11000101
11001101
11100101
11101101
Questions:
1. Also, is there a way of returning a list as output instead of printing them out?
2. Which assert test cases are appropriate in this scenario?
3. What would be the best end to end test cases to cover in this case?
4. What could be a better approach of solving this in terms of time and space complexity?
I have tried this which doesn't seem to work:
import unittest
from wildcard import _print
class TestWildCard(unittest.TestCase):
    def test_0_print(self):
        print("Start wildCard _print test: \n")
        result = 111
        self.assertEquals(_print("1?1",0),result,"Results match")
Answers:
1: Sure. Instead of printing each value, append it to a list with result.append('some value'). Don't forget to initialise the list at the start of the function with result = [] and to return it once the function is done with return result. You should probably also not call the function _print, but something like bit_strings.
ad 1: Since your function is recursive, you also need to capture the return value of each recursive call and add it to the result, so result += _print(string, index + 1).
2: you should typically think of edge cases and test them separately, or group those together that really test a single aspect of your function. There is no one way to state what the test should look like - if there were, the test framework would just generate it for you.
3: same answer as 2.
Your code becomes:
def bit_strings(s, index):
    result = []
    if index == len(s):
        result.append(''.join(s))
        return result
    if s[index] == "?":
        # replace '?' by '0' and recurse
        s[index] = '0'
        result += bit_strings(s, index + 1)
        # replace '?' by '1' and recurse
        s[index] = '1'
        result += bit_strings(s, index + 1)
        # NOTE: Need to backtrack as the list
        # is passed by reference to the
        # function
        s[index] = '?'
    else:
        result += bit_strings(s, index + 1)
    return result
# Driver code
if __name__ == "__main__":
    x = "1??0?101"
    xl = list(x)  # don't forget to convert to a list
    print(bit_strings(xl, 0))
There are more efficient ways of doing this, but I just modified your code in line with the questions and answers.
I've renamed string to s, since string is a bit confusing: it reminds others of the type and shadows the standard library module of the same name.
As for the unit test:
import unittest
from wildcard import bit_strings
class TestWildCard(unittest.TestCase):
    def test_0_print(self):
        print("Start wildCard _print test: \n")
        # you only had one case here and it's a list now
        result = ['101', '111']
        # use assertEqual, not assertEquals
        # you were passing in a string, but your code assumed a list, so list() added
        self.assertEqual(bit_strings(list("1?1"), 0), result, "Results match")
When using an environment like PyCharm, it helps to call the file test<something>.py (i.e. have test in the name), so that it helps you run the unit tests more easily.
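For questions 2 and 3, a few edge cases worth covering could look like the sketch below. The test names and the choice of cases are suggestions only. bit_strings is repeated here in a slightly condensed form (with a default index, which is an addition) so that the file runs on its own.

```python
import unittest


def bit_strings(s, index=0):
    """Same algorithm as above: s is a list of characters, mutated and restored."""
    if index == len(s):
        return [''.join(s)]
    if s[index] != '?':
        return bit_strings(s, index + 1)
    result = []
    for bit in '01':
        s[index] = bit
        result += bit_strings(s, index + 1)
    s[index] = '?'  # backtrack so the caller's list is unchanged
    return result


class TestBitStrings(unittest.TestCase):
    def test_no_wildcard_returns_input(self):
        self.assertEqual(bit_strings(list("101")), ["101"])

    def test_empty_input(self):
        self.assertEqual(bit_strings([]), [""])

    def test_single_wildcard(self):
        self.assertEqual(bit_strings(list("?")), ["0", "1"])

    def test_only_wildcards(self):
        self.assertEqual(bit_strings(list("??")), ["00", "01", "10", "11"])

    def test_count_doubles_per_wildcard(self):
        results = bit_strings(list("1??0?101"))
        self.assertEqual(len(results), 2 ** 3)
        self.assertEqual(len(set(results)), len(results))  # no duplicates

    def test_input_is_restored(self):
        chars = list("1?0?")
        bit_strings(chars)
        self.assertEqual(chars, list("1?0?"))
```

Saved under a name such as test_wildcard.py (hypothetical), this can be run with python -m unittest.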
Two alternate solutions as requested in comment (one still recursive, just a lot more concise, the other not recursive but arguably a bit wasteful with result lists - just two quickies):
from timeit import timeit
def unblank_bits(bits):
    if not bits:
        yield ''
    else:
        for ch in '01' if bits[0] == '?' else bits[0]:
            for continuation in unblank_bits(bits[1:]):
                yield ch + continuation
print(list(unblank_bits('0??100?1')))
def unblank_bits_non_recursive(bits):
    result = ['']
    for ch in bits:
        if ch == '?':
            result = [s+'0' for s in result] + [s+'1' for s in result]
        else:
            result = [s+ch for s in result]
    return result
print(list(unblank_bits_non_recursive('0??100?1')))
print(timeit(lambda: list(unblank_bits('0??100?1'))))
print(timeit(lambda: list(unblank_bits_non_recursive('0??100?1'))))
These solutions don't convert between lists and strings, as there is no need to, and they don't modify the input. As the timings show, the recursive one is slower (roughly three times in this run), but I prefer it for readability. The output:
['00010001', '00010011', '00110001', '00110011', '01010001', '01010011', '01110001', '01110011']
['00010001', '01010001', '00110001', '01110001', '00010011', '01010011', '00110011', '01110011']
13.073874
3.9742709000000005
Note that your own solution ran in about 8 seconds using the same setup, so the "improved version" I suggested is simpler, but not faster, so you may prefer the latter solution.
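Another non-recursive option, not timed here, is to let itertools.product from the standard library do the enumeration. This is a sketch and the function name expand_wildcards is made up: each position contributes either both bits or its fixed character, and product yields every combination in lexicographic order.

```python
from itertools import product


def expand_wildcards(pattern):
    """Return all strings obtained by replacing each '?' with '0' or '1'."""
    # One iterable of choices per position: '01' for a wildcard,
    # otherwise the single fixed character.
    choices = ['01' if ch == '?' else ch for ch in pattern]
    return [''.join(combo) for combo in product(*choices)]


print(expand_wildcards('1?1'))  # ['101', '111']
```

Whatever the approach, the output has 2**k entries for k wildcards, so no solution can do better than exponential time and space in k when all results are materialised.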
I am trying to make a function that takes an equation as input and evaluates it based on the operations. The rule is that each operator (*, +, -, %, ^) must sit between correct mathematical expressions. Examples:
Input: 6**8
Result: Not correct
Reason: * has another * next to it instead of a digit or a mathematical expression
Input: -6+2
Result: Not correct
Reason: "-" was in the beginning and it didn't fall between two numbers.
Input: 6*(2+3)
Result: Correct
Reason: "*" was next to a mathematically correct expression "(2+3)"
1. Option: eval
eval the expression with try-except:
try:
    result = eval(expression)
    correct_sign = True
except SyntaxError:
    correct_sign = False
Advantages:
Very easy and fast
Disadvantages:
Python accepts expressions that you probably don't want (e.g. ** is valid in Python)
eval is not secure
2. Option: Algorithm
Compilers use algorithms to turn a math expression into a form the computer can process. These algorithms can also be used to check whether the expression is valid.
I don't aim to explain these algorithms here; there are plenty of resources available.
This is a very brief structure of what you can do:
Parsing an infix expression
Converting infix expression to a postfix expression
Evaluating the postfix expression
You need to understand what postfix and infix expressions mean.
Resources:
Shunting yard algorithm: https://en.wikipedia.org/wiki/Shunting-yard_algorithm
Reverse polish notation/ post fix notation: https://en.wikipedia.org/wiki/Reverse_Polish_notation
Python builtin tokenizer: https://docs.python.org/3.7/library/tokenize.html
Advantages:
Reliable
Works for complicated expressions
You don't have to reinvent the wheel
Disadvantages:
complicated to understand
complicated to implement
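As a rough illustration of the three steps listed above (parse, convert to postfix, evaluate), here is a compact sketch built around the shunting-yard algorithm. It is one possible arrangement under stated assumptions, not a reference implementation: the names tokenize, to_postfix, eval_postfix and check are made up, only unsigned numbers are accepted, and '^' is treated as right-associative exponentiation. An expect_operand flag enforces the rule from the question that every operator must sit between two operands.

```python
import operator
import re

# operator -> (precedence, function); '^' is assumed to mean exponentiation
OPS = {
    "+": (1, operator.add), "-": (1, operator.sub),
    "*": (2, operator.mul), "/": (2, operator.truediv),
    "%": (2, operator.mod), "^": (3, operator.pow),
}
TOKEN = re.compile(r"\s*(\d+(?:\.\d+)?|[-+*/%^()])")


def tokenize(text):
    """Step 1: split the text into numbers, operators and parentheses."""
    text = text.rstrip()
    pos, tokens = 0, []
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError("unexpected character at position %d" % pos)
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def to_postfix(tokens):
    """Step 2: shunting-yard conversion, rejecting misplaced operators."""
    output, stack = [], []
    expect_operand = True  # True at the start and right after an operator
    for tok in tokens:
        if tok in OPS:
            if expect_operand:
                raise ValueError("operator %r has no left operand" % tok)
            while stack and stack[-1] in OPS and (
                OPS[stack[-1]][0] > OPS[tok][0]
                or (OPS[stack[-1]][0] == OPS[tok][0] and tok != "^")
            ):
                output.append(stack.pop())
            stack.append(tok)
            expect_operand = True
        elif tok == "(":
            if not expect_operand:
                raise ValueError("missing operator before '('")
            stack.append(tok)
        elif tok == ")":
            if expect_operand:
                raise ValueError("missing operand before ')'")
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            if not stack:
                raise ValueError("unbalanced ')'")
            stack.pop()
        else:
            if not expect_operand:
                raise ValueError("missing operator before %r" % tok)
            output.append(float(tok))
            expect_operand = False
    if expect_operand:
        raise ValueError("expression is empty or ends with an operator")
    while stack:
        top = stack.pop()
        if top == "(":
            raise ValueError("unbalanced '('")
        output.append(top)
    return output


def eval_postfix(postfix):
    """Step 3: evaluate the postfix sequence with a value stack."""
    stack = []
    for item in postfix:
        if isinstance(item, float):
            stack.append(item)
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(OPS[item][1](left, right))
    return stack[0]


def check(expression):
    """Return (is_valid, value); value is None for invalid expressions."""
    try:
        return True, eval_postfix(to_postfix(tokenize(expression)))
    except ValueError:
        return False, None


for text in ["6**8", "-6+2", "6*(2+3)"]:
    print(text, check(text))
```

Division by zero is not handled and raises ZeroDivisionError.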
As mentioned in comments, this is called parsing and requires a grammar.
See an example with parsimonious, a PEG parser:
from parsimonious.grammar import Grammar
from parsimonious.nodes import NodeVisitor
from parsimonious.exceptions import ParseError
grammar = Grammar(
    r"""
    expr = (term operator term)+
    term = (lpar factor rpar) / number
    factor = (number operator number)
    operator = ws? (mod / mult / sub / add) ws?
    add = "+"
    sub = "-"
    mult = "*"
    mod = "/"
    number = ~"\d+(?:\.\d+)?"
    lpar = ws? "(" ws?
    rpar = ws? ")" ws?
    ws = ~"\s+"
    """
)
class SimpleCalculator(NodeVisitor):
    def generic_visit(self, node, children):
        return children or node
    def visit_expr(self, node, children):
        return self.calc(children[0])
    def visit_operator(self, node, children):
        _, operator, *_ = node
        return operator
    def visit_term(self, node, children):
        child = children[0]
        if isinstance(child, list):
            _, factor, *_ = child
            return factor
        else:
            return child
    def visit_factor(self, node, children):
        return self.calc(children)
    def calc(self, params):
        """ Calculates the actual equation. """
        x, op, y = params
        op = op.text
        if not isinstance(x, float):
            x = float(x.text)
        if not isinstance(y, float):
            y = float(y.text)
        if op == "+":
            return x+y
        elif op == "-":
            return x-y
        elif op == "/":
            return x/y
        elif op == "*":
            return x*y
equations = ["6 *(2+3)", "2+2", "4*8", "123-23", "-1+1", "100/10", "6**6"]
c = SimpleCalculator()
for equation in equations:
    try:
        tree = grammar.parse(equation)
        result = c.visit(tree)
        print("{} = {}".format(equation, result))
    except ParseError:
        print("The equation {} could not be parsed.".format(equation))
This yields
6 *(2+3) = 30.0
2+2 = 4.0
4*8 = 32.0
123-23 = 100.0
The equation -1+1 could not be parsed.
100/10 = 10.0
The equation 6**6 could not be parsed.
You need to use the correct data structures and algorithms to achieve your goal of parsing a mathematical equation and evaluating it.
You also have to be familiar with two concepts for creating a parser: stacks and trees.
I think the best approach is to convert the expression to RPN (Reverse Polish Notation) and evaluate that.
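For issue #1, you could always strip out the parentheses before checking the operators. Keep in mind that this changes the value of the expression (6*(2+3) is 30, while 6*2+3 is 15), so it only helps with validating operator placement, not with computing the result.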
input_string = "6*(2+3)"
it = filter(lambda x: x != '(' and x != ')', input_string)
after = ' '.join(list(it))
print(after)
# prints "6 * 2 + 3"
It looks like you might just be starting to use python. There are always many ways to solve a problem. One interesting one to sort of get you jump started would be to consider splitting the equation based on the operators.
For example the following uses what's called a regular expression to split the formula:
>>> import re
>>> formula2 = '6+3+5--5'
>>> re.split(r'\*|\/|\%|\^|\+|\-',formula2)
['6', '3', '5', '', '5']
>>> formula3 = '-2+5'
>>> re.split(r'\*|\/|\%|\^|\+|\-',formula3)
['', '2', '5']
It may look complex, but in the r'\*|\/|\%|\^|\+|\-' piece the \ means to take the next character literally and the | means 'or' so it evaluates to split on any one of those operators.
In this case you'd notice that any time there are two operators together, or when a formula starts with an operator you will have a blank value in your list - one for the second - in the first formula and one for the leading - in the second formula.
Based on that you could say something like:
if '' in re.split(r'\*|\/|\%|\^|\+|\-',formula):
    correctsign = False
Maybe this can serve as a good starting point to get the brain thinking about interesting ways to solve the problem.
It is important to mention first that ** stands for exponentiation, i.e. 6**8 is 6 to the power of 8.
The logic behind your algorithm is wrong because in your code the response depends only on whether the last digit/sign satisfies your conditions. This is because once the loop is complete, your boolean correctsigns ends up True or False based on the last digit/sign.
You can also use elif instead of nested else statements for cleaner code.
Without changing your core algorithm, your code would look something like this:
def checksigns(equation):
    signs = ["*","/","%","^","+","-"]
    for i in signs:
        if i in equation:
            index = equation.index((i))
            if (equation[index] == equation[0]):
                return "Not correct"
            elif (equation[index] == equation[len(equation) - 1]):
                return "Not correct"
            elif (equation[index + 1].isdigit() and equation[index - 1].isdigit()):
                return "Correct"
            else:
                return "Not correct"
You can use Python's ast module for parsing the expression:
import ast
import itertools as it
def check(expr):
    allowed = (ast.Add, ast.Sub, ast.Mult, ast.Mod)
    try:
        # islice skips the two wrapper nodes (Module and Expr)
        for node in it.islice(ast.walk(ast.parse(expr)), 2, None):
            # ast.Num is deprecated since Python 3.8; numeric literals
            # are represented by ast.Constant nodes
            if isinstance(node, (ast.BinOp, ast.Constant)):
                continue
            if not isinstance(node, allowed):
                return False
    except SyntaxError:
        return False
    return True
print(check('6**8')) # False
print(check('-6+2')) # False
print(check('6*(2+3)')) # True
The first case, 6**8, evaluates to False because it is represented by an ast.Pow node, and the second one because -6 corresponds to an ast.UnaryOp node.
I am trying to extract the strings inside nested brackets and take their product.
Let's say I have the following string
string = "(A((B|C)D|E|F))"
According to the answer in Extract string inside nested brackets
I can extract the strings inside nested brackets, but my case is different since I have "D" at the end of a bracket, so this is the result from that code. It looks far from my desired output
['B|C', 'D|E|F', 'A']
This is my desired output
[[['A'],['B|C'],['D']], [['A'],['E|F']]] # '|' means OR
Do you have any recommendation: should I implement this using a regular expression or just run through the whole string?
That way it can lead to my final result, which is
"ABD"
"ACD"
"AE"
"AF"
At that point, I will use itertools.product
You didn't specify the language precisely, but it looks like arbitrarily nested brackets are allowed, so it's not a regular language. I wouldn't recommend parsing it with a regular expression (it might be possible, as regular expressions in Python are not truly regular, but even if it is, it'll probably be a mess).
I'd recommend defining a context-free grammar for your language and parsing that instead. Here's how you can do it:
EXPR -> A EXPR (an expression is an expression preceded by an alphabetic character)
EXPR -> (LIST) EXPR (an expression is a list followed by an expression)
EXPR -> "" (an expression can be an empty string)
LIST -> EXPR | LIST (a list is an expression followed by "|" followed by a list)
LIST -> EXPR (or just one expression)
This grammar can be parsed by a simple top-down recursive parser which works in linear time. Here's a sample implementation:
class Parser:
    def __init__(self, data):
        self.data = data
        self.pos = 0
    def get_cur_char(self):
        """
        Returns the current character or None if the input is over
        """
        return None if self.pos == len(self.data) else self.data[self.pos]
    def advance(self):
        """
        Moves to the next character of the input if the input is not over.
        """
        if self.pos < len(self.data):
            self.pos += 1
    def get_and_advance(self):
        """
        Returns the current character and moves to the next one.
        """
        res = self.get_cur_char()
        self.advance()
        return res
    def parse_expr(self):
        """
        Parses the EXPR according to the specified grammar.
        """
        cur_char = self.get_cur_char()
        if cur_char == '(':
            # EXPR -> (LIST) EXPR rule
            self.advance()
            # Parses the list and the rest of the expression and combines
            # the results.
            prefixes = self.parse_list()
            suffices = self.parse_expr()
            return [p + s for p in prefixes for s in suffices]
        elif not cur_char or cur_char == ')' or cur_char == '|':
            # EXPR -> Empty rule. Returns a list with an empty string without
            # consuming the input.
            return ['']
        else:
            # EXPR -> A EXPR rule.
            # Parses the rest of the expression and prepends the current
            # character.
            self.advance()
            return [cur_char + s for s in self.parse_expr()]
    def parse_list(self):
        """
        Parses the LIST according to the specified grammar.
        """
        first_expr = self.parse_expr()
        # Uses the LIST -> EXPR | LIST rule if the next character is | and
        # LIST -> EXPR otherwise
        return first_expr + (self.parse_list() if self.get_and_advance() == '|' else [])
if __name__ == '__main__':
    string = "(A((B|C)D|E|F))"
    parser = Parser(string)
    print('\n'.join(parser.parse_expr()))
If you're not familiar with this technique, you can read more about it here.
This implementation is not the most efficient one (for instance, it uses lists explicitly instead of iterators), but it's a good starting point.
I would suggest going for a solution that targets the final result directly, i.e. a function that makes this transformation:
input: "(A((B|C)D|E|F))"
output: ['ABD', 'ACD', 'AE', 'AF']
Here is the code I would propose:
import re
def tokenize(text):
    return re.findall(r'[()|]|\w+', text)
def product(a, b):
    return [x+y for x in a for y in b] if a and b else a or b
def parse(text):
    tokens = tokenize(text)
    def recurse(tokens, i):
        factor = []
        term = []
        while i < len(tokens) and tokens[i] != ')':
            token = tokens[i]
            i += 1
            if token == '|':
                term.extend(factor)
                factor = []
            else:
                if token == '(':
                    expr, i = recurse(tokens, i)
                else:
                    expr = [token]
                factor = product(factor, expr)
        return term+factor, i+1
    return recurse(tokens, 0)[0]
string = "(A((B|C)D|E|F))"
print(parse(string))
See it run on repl.it
I am trying to find the length of a string without using the built-in len() function. Here's my Python code:
increment = -1
def lenRecur(aStr):
    '''
    aStr: a string
    returns: int, the length of aStr
    '''
    global increment
    if aStr == '':
        return increment
    else:
        increment += 1
        return lenRecur (aStr[increment:])
print(lenRecur ("abcdefq"))
The result I am expecting is 7 but what I got was 4. What I realized was that when increment became 2, the value passed to lenRecur (aStr[increment:]) was "defq", which means aStr[2:] is evaluated as "defq" instead of "cdefq".
Why is this happening?
Your function should not depend on external variables.
def lenRecur(aStr):
    '''
    aStr: a string
    returns: int, the length of aStr
    '''
    if aStr == '':
        return 0
    else:
        return 1 + lenRecur(aStr[1:])
print(lenRecur("abcdefq"))
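To see why the version from the question returns 4, it helps to trace what each call receives. The slice is taken from the string that was already shortened by the previous call, not from the original string, so a growing increment removes more characters each time. The sketch below reproduces that behaviour with a loop; trace_len_recur is a made-up helper written only for this illustration.

```python
def trace_len_recur(a_str, increment=-1):
    """Replay the original recursion step by step.

    Returns the final value of increment and the list of
    (increment, remaining string) pairs seen along the way.
    """
    steps = []
    while a_str != '':
        increment += 1
        # The slice applies to the already shortened string.
        a_str = a_str[increment:]
        steps.append((increment, a_str))
    return increment, steps


result, steps = trace_len_recur("abcdefq")
for increment, remaining in steps:
    print(increment, repr(remaining))
print(result)  # 4
```

With increment equal to 2, the string being sliced is already "bcdefq", and "bcdefq"[2:] is "defq".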
A common idiom is to use a default argument:
>>> def l(s, c=0): return l(s[1:], c+1) if s else c
This kind of solution works with anything that can be sliced
>>> l('pip')
3
>>> l([1,2,3])
3
>>> l('')
0
>>> l([])
0
>>>
As another option, you could write this:
def lenRecur(aStr):
    return _lenRecur(aStr, 0)
def _lenRecur(aStr, acc):
    if not aStr:
        return acc
    return _lenRecur(aStr[1:], acc+1)
As noticed by @gboffi in his answer, it is commonly accepted in Python to use a default argument instead of a helper function:
def lenRecur(aStr, acc = 0):
    if not aStr:
        return acc
    return lenRecur(aStr[1:], acc+1)
The choice of one form or the other depends on how much you want to allow the caller to set the initial accumulator value to something other than 0.
Anyway, the interesting point here is the use of an accumulator as the second parameter. That way, you have proper tail recursion. Unfortunately, Python does not optimize tail calls, so each call still uses a stack frame. It is a good habit nonetheless, as many other languages do have this optimization, and it will be a required skill if you someday switch to a functional programming language.
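Because CPython keeps a stack frame for every call, even the tail-recursive form fails on long inputs. The sketch below shows this, together with the loop that a tail-call-optimizing compiler would effectively produce. The names len_recur and len_iter are illustrative, and the recursion limit is read at run time rather than assumed.

```python
import sys


def len_recur(a_str, acc=0):
    """Tail-recursive length, as above; still one stack frame per character."""
    if not a_str:
        return acc
    return len_recur(a_str[1:], acc + 1)


def len_iter(seq):
    """The same accumulator idea written as a loop: constant stack depth."""
    acc = 0
    for _ in seq:
        acc += 1
    return acc


long_text = "x" * (sys.getrecursionlimit() + 100)
try:
    len_recur(long_text)
except RecursionError:
    print("len_recur hit the recursion limit")
print(len_iter(long_text) == sys.getrecursionlimit() + 100)  # True
```

The loop version also avoids the repeated slicing, which copies the remaining string on every call.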