I've got some dynamically-generated boolean logic expressions, like:
(A or B) and (C or D)
A or (A and B)
A
empty - evaluates to True
The placeholders get replaced with booleans. Should I:
Convert this information to a Python expression like True or (True or False) and eval it?
Create a binary tree where a node is either a bool or Conjunction/Disjunction object and recursively evaluate it?
Convert it into nested S-expressions and use a Lisp parser?
Something else?
Suggestions welcome.
Here's a small module (74 lines including whitespace) that I built in about an hour and a half, plus almost an hour of refactoring:
str_to_token = {'True': True,
                'False': False,
                'and': lambda left, right: left and right,
                'or': lambda left, right: left or right,
                '(': '(',
                ')': ')'}

empty_res = True

def create_token_lst(s, str_to_token=str_to_token):
    """create token list:
    'True or False' -> [True, lambda..., False]"""
    s = s.replace('(', ' ( ')
    s = s.replace(')', ' ) ')
    return [str_to_token[it] for it in s.split()]

def find(lst, what, start=0):
    return [i for i, it in enumerate(lst) if it == what and i >= start]

def parens(token_lst):
    """returns:
    (bool)parens_exist, left_paren_pos, right_paren_pos
    """
    left_lst = find(token_lst, '(')
    if not left_lst:
        return False, -1, -1
    left = left_lst[-1]
    # can not occur earlier, hence there are args and op.
    right = find(token_lst, ')', left + 4)[0]
    return True, left, right

def bool_eval(token_lst):
    """token_lst has length 3 and format: [left_arg, operator, right_arg]
    operator(left_arg, right_arg) is returned"""
    return token_lst[1](token_lst[0], token_lst[2])

def formatted_bool_eval(token_lst, empty_res=empty_res):
    """eval a formatted (i.e. of the form 'ToFa(ToF)') string"""
    if not token_lst:
        return empty_res
    if len(token_lst) == 1:
        return token_lst[0]
    has_parens, l_paren, r_paren = parens(token_lst)
    if not has_parens:
        return bool_eval(token_lst)
    # replace the innermost parenthesized group with its value, then recurse
    token_lst[l_paren:r_paren + 1] = [bool_eval(token_lst[l_paren+1:r_paren])]
    return formatted_bool_eval(token_lst, empty_res)

def nested_bool_eval(s):
    """The actual 'eval' routine,
    if 's' is empty, 'True' is returned,
    otherwise 's' is evaluated according to parentheses nesting.
    The format assumed:
    [1] 'LEFT OPERATOR RIGHT',
    where LEFT and RIGHT are either:
    True or False or '(' [1] ')' (subexpression in parentheses)
    """
    return formatted_bool_eval(create_token_lst(s))
The simple tests give:
>>> print nested_bool_eval('')
True
>>> print nested_bool_eval('False')
False
>>> print nested_bool_eval('True or False')
True
>>> print nested_bool_eval('True and False')
False
>>> print nested_bool_eval('(True or False) and (True or False)')
True
>>> print nested_bool_eval('(True or False) and (True and False)')
False
>>> print nested_bool_eval('(True or False) or (True and False)')
True
>>> print nested_bool_eval('(True and False) or (True and False)')
False
>>> print nested_bool_eval('(True and False) or (True and (True or False))')
True
[Partially off-topic possibly]
Note that you can easily configure the tokens (both operands and operators) with the poor man's dependency injection provided (str_to_token=str_to_token and friends), so that multiple different evaluators can coexist at the same time (just resetting the "injected-by-default" globals would leave you with a single behavior).
For example:
def fuzzy_bool_eval(s):
    """as normal, but:
    - an argument 'Maybe' may be :)) present
    - algebra is:
    [one of 'True', 'False', 'Maybe'] [one of 'or', 'and'] 'Maybe' -> 'Maybe'
    (except that 'False and Maybe' -> 'False', see and_op below)
    """
    Maybe = 'Maybe' # just an object with nice __str__

    def or_op(left, right):
        return (Maybe if Maybe in [left, right] else (left or right))

    def and_op(left, right):
        args = [left, right]
        if Maybe in args:
            if True in args:
                return Maybe # Maybe and True -> Maybe
            else:
                return False # Maybe and False -> False
        return left and right

    str_to_token = {'True': True,
                    'False': False,
                    'Maybe': Maybe,
                    'and': and_op,
                    'or': or_op,
                    '(': '(',
                    ')': ')'}
    token_lst = create_token_lst(s, str_to_token=str_to_token)
    return formatted_bool_eval(token_lst)
gives:
>>> print fuzzy_bool_eval('')
True
>>> print fuzzy_bool_eval('Maybe')
Maybe
>>> print fuzzy_bool_eval('True or False')
True
>>> print fuzzy_bool_eval('True or Maybe')
Maybe
>>> print fuzzy_bool_eval('False or (False and Maybe)')
False
It shouldn't be difficult at all to write an evaluator that can handle this, for example using pyparsing. You only have a few operations to handle (and, or, and grouping?), so you should be able to parse and evaluate it yourself.
You shouldn't need to explicitly form the binary tree to evaluate the expression.
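To illustrate the point about not needing an explicit tree, here is a minimal sketch of a recursive-descent evaluator in plain Python (no pyparsing). The function names (tokenize, evaluate), the choice that "and" binds tighter than "or", and the env dict that maps placeholders to booleans are assumptions made for this example; the question does not specify them.

```python
import re

def tokenize(text):
    # Split into parentheses and words, e.g. "(A or B)" -> ['(', 'A', 'or', 'B', ')']
    return re.findall(r"[()]|\w+", text)

def evaluate(text, env):
    """Evaluate a boolean expression while parsing it, without building a tree.

    Assumed grammar ('and' binds tighter than 'or'):
        expr   := term ('or' term)*
        term   := factor ('and' factor)*
        factor := NAME | '(' expr ')'
    An empty expression evaluates to True, as in the question.
    """
    tokens = tokenize(text)
    if not tokens:
        return True
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def expr():
        value = term()
        while peek() == "or":
            advance()
            rhs = term()  # always parse the right side so its tokens are consumed
            value = value or rhs
        return value

    def term():
        value = factor()
        while peek() == "and":
            advance()
            rhs = factor()
            value = value and rhs
        return value

    def factor():
        tok = advance()
        if tok == "(":
            value = expr()
            if advance() != ")":
                raise ValueError("expected ')'")
            return value
        if tok is None or tok in ("and", "or", ")"):
            raise ValueError("unexpected token: %r" % (tok,))
        return bool(env[tok])  # look up the placeholder's value

    result = expr()
    if pos != len(tokens):
        raise ValueError("unexpected trailing input: %r" % (tokens[pos:],))
    return result
```

For example, evaluate("A or (A and B)", {"A": True, "B": False}) walks the tokens once and returns a plain bool; malformed input such as "A or" raises ValueError.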
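If you set up dicts with the locals and globals you care about, you can pass them along with the expression into eval(). Restricting the namespaces limits which names the expression can reach, although eval() on untrusted input is still not fully safe.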
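A rough sketch of that eval() approach follows. The helper name eval_bool and the extra syntax check with the ast module are assumptions added for illustration (the check assumes Python 3.8+, where True and False parse as ast.Constant); treat it as a starting point, not a hardened sandbox.

```python
import ast

# Node types that a pure and/or expression over names is expected to contain.
ALLOWED_NODES = (ast.Expression, ast.BoolOp, ast.And, ast.Or, ast.Name, ast.Load)

def eval_bool(expression, values):
    """Evaluate 'expression' with eval(), exposing only the names in 'values'."""
    expression = expression.strip()
    if not expression:
        return True  # the question treats an empty expression as True
    tree = ast.parse(expression, mode="eval")
    # Reject anything other than names, True/False, 'and' and 'or' before evaluating.
    for node in ast.walk(tree):
        if isinstance(node, ast.Constant) and isinstance(node.value, bool):
            continue
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError("disallowed syntax: " + type(node).__name__)
    code = compile(tree, "<bool-expr>", "eval")
    # Empty __builtins__ plus an explicit locals dict limit what eval() can see.
    return bool(eval(code, {"__builtins__": {}}, dict(values)))
```

Called as eval_bool("(A or B) and (C or D)", {"A": True, "B": False, "C": False, "D": True}), it returns a bool; an expression such as "__import__('os')" is rejected with ValueError before eval() ever runs.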
Sounds like a piece of cake with SymPy's logic module. There is even an example of this in the docs: http://docs.sympy.org/0.7.1/modules/logic.html
I am writing this because I had to solve a similar problem today and ended up here while looking for clues (a Boolean parser with arbitrary string tokens that get converted to boolean values later).
After considering different options (implementing a solution myself or using some package), I settled on Lark, https://github.com/lark-parser/lark
It's easy to use and pretty fast if you use LALR(1).
Here is an example that could match your syntax:
from lark import Lark, Tree, Transformer

base_parser = Lark("""
    expr: and_expr
        | or_expr
    and_expr: token
            | "(" expr ")"
            | and_expr " " and " " and_expr
    or_expr: token
            | "(" expr ")"
            | or_expr " " or " " or_expr
    token: LETTER
    and: "and"
    or: "or"
    LETTER: /[A-Z]+/
""", start="expr")

class Cleaner(Transformer):
    def expr(self, children):
        num_children = len(children)
        if num_children == 1:
            return children[0]
        else:
            raise RuntimeError()

    def and_expr(self, children):
        num_children = len(children)
        if num_children == 1:
            return children[0]
        elif num_children == 3:
            first, middle, last = children
            return Tree(data="and_expr", children=[first, last])
        else:
            raise RuntimeError()

    def or_expr(self, children):
        num_children = len(children)
        if num_children == 1:
            return children[0]
        elif num_children == 3:
            first, middle, last = children
            return Tree(data="or_expr", children=[first, last])
        else:
            raise RuntimeError()

def get_syntax_tree(expression):
    return Cleaner().transform(base_parser.parse(expression))

print(get_syntax_tree("A and (B or C)").pretty())
Note: the regex I chose doesn't match the empty string on purpose (Lark for some reason doesn't allow it).
You can do that with the Lark parsing library, https://github.com/lark-parser/lark
from lark import Lark, Transformer, v_args, Token, Tree
from operator import or_, and_, not_

calc_grammar = f"""
    ?start: disjunction
    ?disjunction: conjunction
        | disjunction "or" conjunction -> {or_.__name__}
    ?conjunction: atom
        | conjunction "and" atom -> {and_.__name__}
    ?atom: BOOLEAN_LITTERAL -> bool_lit
        | "not" atom -> {not_.__name__}
        | "(" disjunction ")"
    BOOLEAN_LITTERAL: TRUE | FALSE
    TRUE: "True"
    FALSE: "False"
    %import common.WS_INLINE
    %ignore WS_INLINE
"""

# inline=True passes each rule's children as positional arguments
@v_args(inline=True)
class CalculateBoolTree(Transformer):
    or_ = or_
    not_ = not_
    and_ = and_
    allowed_value = {"True": True, "False": False}

    def bool_lit(self, val: Token) -> bool:
        return self.allowed_value[val]

calc_parser = Lark(calc_grammar, parser="lalr", transformer=CalculateBoolTree())
calc = calc_parser.parse

def eval_bool_expression(bool_expression: str) -> bool:
    return calc(bool_expression)

print(eval_bool_expression("(True or False) and (False and True)"))
print(eval_bool_expression("not (False and True)"))
print(eval_bool_expression("not True or False and True and True"))
line = "add(multiply(add(2,3),add(4,5)),1)"

def readLine(line):
    countLeftBracket=0
    string = ""
    for char in line:
        if char !=")":
            string += char
        else:
            string +=char
            break
    for i in string:
        if i=="(":
            countLeftBracket+=1
    if countLeftBracket>1:
        cutString(string)
    else:
        return execute(string)

def cutString(string):
    countLeftBracket=0
    for char in string:
        if char!="(":
            string.replace(char,'')
        elif char=="(":
            string.replace(char,'')
            break
    for char in string:
        if char=="(":
            countLeftBracket+=1
    if countLeftBracket>1:
        cutString(string)
    elif countLeftBracket==1:
        return execute(string)

def add(num1,num2):
    return print(num1+num2)

def multiply(num1,num2):
    return print(num1*num2)

readLine(line)
I need to execute the whole line string. I tried to cut each function inside of brackets one by one and replace them with the result, but I am kind of lost. Not sure how to continue, my code gets the error:
File "main.py", line 26, in cutString
if char!="(":
RuntimeError: maximum recursion depth exceeded in comparison
Give me an idea of where to move, which method to use?
Here is a solution that uses pyparsing, and as such will be much easier to expand:
from pyparsing import *
First, a convenience function (swap in the second tag function and print the parse tree to see why it helps):
def tag(name):
    """This version converts ["expr", 4] => 4
    comment in the version below to see the original parse tree
    """
    def tagfn(tokens):
        tklist = tokens.asList()
        if name == 'expr' and len(tklist) == 1:
            # LL1 artifact removal
            return tklist
        return tuple([name] + tklist)
    return tagfn

# def tag(name):
#     return lambda tokens: tuple([name] + tokens.asList())
Our lexer needs to recognize left and right parentheses, integers, and names. This is how you define them with pyparsing:
LPAR = Suppress("(")
RPAR = Suppress(")")
integer = Word(nums).setParseAction(lambda s,l,t: [int(t[0])])
name = Word(alphas)
Our parser has function calls, which take zero or more expressions as parameters. A function call is also an expression, so to deal with the circularity we have to forward-declare expr and fncall:
expr = Forward()
fncall = Forward()
expr << (integer | fncall).setParseAction(tag('expr'))
fnparams = delimitedList(expr)
fncall << (name + Group(LPAR + Optional(fnparams, default=[]) + RPAR)).setParseAction(tag('fncall'))
Now we can parse our string (we can add spaces and more or less than two parameters to functions as well):
line = "add(multiply(add(2,3),add(4,5)),1)"
res = fncall.parseString(line)
To see what is returned you can print it. This is called the parse tree (or, since our tag function has simplified it, an abstract syntax tree):
import pprint
pprint.pprint(list(res))
which outputs:
[('fncall',
'add',
[('fncall',
'multiply',
[('fncall', 'add', [2, 3]), ('fncall', 'add', [4, 5])]),
1])]
With the commented-out tag function the output would be the following, which is just more work to deal with for no added benefit:
[('fncall',
'add',
[('expr',
('fncall',
'multiply',
[('expr', ('fncall', 'add', [('expr', 2), ('expr', 3)])),
('expr', ('fncall', 'add', [('expr', 4), ('expr', 5)]))])),
('expr', 1)])]
Now define the functions that are available to our program:
from functools import reduce  # reduce is not a builtin on Python 3

FUNCTIONS = {
    'add': lambda *args: sum(args, 0),
    'multiply': lambda *args: reduce(lambda a, b: a*b, args, 1),
}
# print FUNCTIONS['multiply'](1,2,3,4) # test that it works ;-)
Our parser is now very simple to write:
def parse(ast):
    if not ast: # will not happen in our program, but it's good practice to exit early on no input
        return
    if isinstance(ast, tuple) and ast[0] == 'fncall':
        # ast is here ('fncall', <name-of-function>, [list-of-arguments])
        fn_name = ast[1] # get the function name
        fn_args = parse(ast[2]) # parse each parameter (see elif below)
        return FUNCTIONS[fn_name](*fn_args) # find and apply the function to its arguments
    elif isinstance(ast, list):
        # this is called when we hit a parameter list
        return [parse(item) for item in ast]
    elif isinstance(ast, int):
        return ast
Now call the parser on the result of the lexing phase:
>>> print parse(res[0]) # the outermost item is an expression
46
Sounds like this could be solved with regex.
So this is an example of a single reduction:
import re, operator

def apply(match):
    func_name = match.group(1) # what's outside the parentheses
    func_args = [int(x) for x in match.group(2).split(',')]
    func = {"add": operator.add, "multiply": operator.mul}
    return str(func[func_name](*func_args))

def single_step(line):
    # match only innermost calls: a name followed by parentheses with no nested parentheses
    return re.sub(r"([a-z]+)\(([^()]+)\)",apply,line)
For example:
line = "add(multiply(add(2,3),add(4,5)),1)"
print(single_step(line))
Would output:
add(multiply(5,9),1)
All that is left to do is to loop until the expression is a number:
while not line.isdigit():
    line = single_step(line)
print (line)
Will show
46
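Putting the two pieces together, a self-contained sketch might look like this. The names FUNCS, reduce_once and evaluate are made up for illustration, and two details differ from the snippets above by assumption: the loop stops on a signed integer rather than isdigit() (so a negative intermediate result cannot loop forever), and it raises if a pass changes nothing.

```python
import operator
import re

FUNCS = {"add": operator.add, "multiply": operator.mul}
# Innermost call: a name followed by parentheses that contain no parentheses.
INNERMOST_CALL = re.compile(r"([a-z]+)\(([^()]+)\)")

def reduce_once(line):
    """Replace every innermost call in 'line' with its numeric result."""
    def apply_call(match):
        func = FUNCS[match.group(1)]
        args = [int(x) for x in match.group(2).split(",")]
        return str(func(*args))
    return INNERMOST_CALL.sub(apply_call, line)

def evaluate(line):
    """Reduce repeatedly until only a (possibly negative) integer is left."""
    line = line.replace(" ", "")
    while not re.fullmatch(r"-?\d+", line):
        reduced = reduce_once(line)
        if reduced == line:  # nothing matched, so the input is malformed
            raise ValueError("cannot reduce: " + line)
        line = reduced
    return int(line)

print(evaluate("add(multiply(add(2,3),add(4,5)),1)"))  # prints 46
```

Each pass only touches calls whose arguments are already plain numbers, so nesting depth determines how many passes are needed.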
You can use a generator function to build a very simple parser:
import re, operator

line, f = "add(multiply(add(2,3),add(4,5)),1)", {'add':operator.add, 'multiply':operator.mul}

def parse(d):
    n = next(d, None)
    if n is not None and n != ')':
        if n == '(':
            # a nested generator stands for the parenthesized argument list
            yield iter(parse(d))
        else:
            yield n
        yield from parse(d)

parsed = parse(iter(re.findall(r'\(|\)|\w+', line)))

def _eval(d):
    _r = []
    n = next(d, None)
    while n is not None:
        if n.isdigit():
            _r.append(int(n))
        else:
            # n is a function name; the next item is its argument generator
            _r.append(f[n](*_eval(next(d))))
        n = next(d, None)
    return _r

print(_eval(parsed)[0])
Output:
46
isinstance can be used to check whether the object given as its first argument is an instance of the class given as its second argument (classinfo), or of a subclass of it.
a = 1
b = [1]
isinstance(a,list)
isinstance(b,list)
Is there a similar way to check whether an operator is valid in Python? Something like
isoperator('=')
isoperator(':=')
isoperator('<-')
I am trying to build an online executor for Python beginners.
When they enter something like this,
they should get a hint that the operator is not supported, rather than the current error message.
This can be achieved by wrapping a custom function around python's built in operator module. Of note, the operator can be inputted as a string or a lambda – I chose string arbitrarily.
import operator
import re

def test_operator(obj = [6], op_in = '> 1'):
    """
    returns true if operator can be performed, else false
    """
    # Use regex to parse the pattern
    pattern_funct = re.compile('[><=]+')
    pattern_num = re.compile('[0-9]+')
    # Access values
    funct = ''.join(pattern_funct.findall(op_in))
    num = float(''.join(pattern_num.findall(op_in)))
    # Lookup the operator function
    ops = {
        ">": operator.gt,
        ">=": operator.ge,
        "<": operator.lt,
        "<=": operator.le,
        "=": operator.eq,
    }
    op_func = ops[funct]
    # Try to perform
    print(f'Trying `{obj}` with {op_in}:')
    try:
        op_func(obj, num)
        return True
    except TypeError:
        return False

# Next!
x =[3]
print(test_operator(x,'>= 7 '), '\n' )
# False
print(test_operator(7,'>= 7 '))
# True
str.isalpha() returns True if every character in the string is a letter (as in "hello") and False otherwise (as in "+"). So:
"+".isalpha() returns False
"hi".isalpha() returns True
def isOperand(ch):
    return ch.isalpha()
def isOperator(ch):
    return not ch.isalpha()
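For the original question (whether a string such as := or <- is an operator Python knows), one option that needs no hand-written table is the standard library's token.EXACT_TOKEN_TYPES mapping, available since Python 3.8. The helper names is_operator and hint_for are hypothetical, and the mapping also contains delimiters such as ( and , so this is a sketch of a first check, not a complete validator.

```python
import token

def is_operator(text):
    """Return True if 'text' is an operator or delimiter token of the running Python."""
    return text in token.EXACT_TOKEN_TYPES

def hint_for(text):
    """Return a beginner-friendly hint for unsupported operators, or None if it is fine."""
    if is_operator(text):
        return None
    return "The operator %r is not supported in Python." % text

for candidate in ("=", ":=", "<-"):
    print(candidate, is_operator(candidate))
```

Because the table comes from the interpreter itself, the answer tracks the Python version the executor runs on (for example, := only appears from 3.8 onwards).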
I'm using the Pyparsing library to evaluate simple boolean queries like these ones:
(True AND True) OR False AND True
(True AND (True OR False OR True))
Using the simpleBool script from the examples section (simpleBool.py), I've hit a snag when trying to validate the expression syntax. Expressions like the ones below are considered valid even though they have clear syntax issues:
(True AND True) OR False AND True OR OR
(True AND (True OR False OR True))((((
Does anyone know how to validate syntax with Pyparsing?
Here is the testing script, as requested:
#
# simpleBool.py
#
# Example of defining a boolean logic parser using
# the operatorGrammar helper method in pyparsing.
#
# In this example, parse actions associated with each
# operator expression will "compile" the expression
# into BoolXXX class instances, which can then
# later be evaluated for their boolean value.
#
# Copyright 2006, by Paul McGuire
# Updated 2013-Sep-14 - improved Python 2/3 cross-compatibility
#
from pyparsing import infixNotation, opAssoc, Keyword, Word, alphas
# define classes to be built at parse time, as each matching
# expression type is parsed
class BoolOperand(object):
    def __init__(self,t):
        self.label = t[0]
        self.value = eval(t[0])
    def __bool__(self):
        return self.value
    def __str__(self):
        return self.label
    __repr__ = __str__
    __nonzero__ = __bool__

class BoolBinOp(object):
    def __init__(self,t):
        self.args = t[0][0::2]
    def __str__(self):
        sep = " %s " % self.reprsymbol
        return "(" + sep.join(map(str,self.args)) + ")"
    def __bool__(self):
        return self.evalop(bool(a) for a in self.args)
    __nonzero__ = __bool__
    __repr__ = __str__

class BoolAnd(BoolBinOp):
    reprsymbol = '&'
    evalop = all

class BoolOr(BoolBinOp):
    reprsymbol = '|'
    evalop = any

class BoolNot(object):
    def __init__(self,t):
        self.arg = t[0][1]
    def __bool__(self):
        v = bool(self.arg)
        return not v
    def __str__(self):
        return "~" + str(self.arg)
    __repr__ = __str__
    __nonzero__ = __bool__

TRUE = Keyword("True")
FALSE = Keyword("False")
boolOperand = TRUE | FALSE | Word(alphas,max=1)
boolOperand.setParseAction(BoolOperand)
# define expression, based on expression operand and
# list of operations in precedence order
boolExpr = infixNotation( boolOperand,
[
("not", 1, opAssoc.RIGHT, BoolNot),
("and", 2, opAssoc.LEFT, BoolAnd),
("or", 2, opAssoc.LEFT, BoolOr),
])
if __name__ == "__main__":
    p = True
    q = False
    r = True
    tests = [("p", True),
             ("q", False),
             ("p and q", False),
             ("p and not q", True),
             ("not not p", True),
             ("not(p and q)", True),
             ("q or not p and r", False),
             ("q or not p or not r", False),
             ("q or not (p and r)", False),
             ("p or q or r", True),
             ("p or q or r and False", True),
             ("(p or q or r) and False", False),
            ]
    print("p =", p)
    print("q =", q)
    print("r =", r)
    print()
    for t,expected in tests:
        res = boolExpr.parseString(t)[0]
        success = "PASS" if bool(res) == expected else "FAIL"
        print (t,'\n', res, '=', bool(res),'\n', success, '\n')
Answer by @PaulMcGuire:
Change boolExpr.parseString(t)[0] to boolExpr.parseString(t, parseAll=True)[0]. Pyparsing will not raise an exception if it can find a valid match in the leading part of the string, even if there is junk tacked on to the end. By adding parseAll=True, you tell pyparsing that the entire string must parse successfully.
As the question says, I want to print the entire expression object whenever the internal structure of its tree changes, but since SymPy objects are immutable I cannot do this through the name the object is bound to.
Here is an example of how I am changing the internal structure:
from sympy import *
from sympy.abc import x,y

input = 'x*(x+4)+3*x'
expr = sympify(input,evaluate=False)

def traverse(expr):
    if(expr.is_Number):
        return 1,True
    oldexpr = expr
    args = expr.args
    sargs = []
    hit = False
    for arg in args:
        arg,arghit = traverse(arg)
        hit |= arghit
        sargs.append(arg)
    if(hit):
        expr = expr.func(*sargs)
        return expr,True
    else:
        return oldexpr,False

print(srepr(expr))
expr,hit = traverse(expr)
print(expr)
Here I am changing every number to 1 whenever I encounter one in the expression tree, and I want to print the complete expression each time I make a change, like this: x*(x+1)+3*x and then x*(x+1)+x
Can anyone suggest how to achieve this?
Just a slight mod to what you have might be what you are looking for:
def traverse(expr):
    if any(a.is_Number and abs(a) != 1 for a in expr.args):
        print(expr,'->',expr.func(*[(a if not a.is_Number else 1) for a in expr.args]))
    if expr.is_Number and abs(expr) != 1:
        return 1, True
    oldexpr = expr
    args = expr.args
    sargs = []
    hit = False
    for arg in args:
        arg,arghit = traverse(arg)
        hit |= arghit
        sargs.append(arg)
    if(hit):
        expr = expr.func(*sargs)
        return expr, True
    else:
        return oldexpr, False
This produces
>>> traverse(2*x+3)
(2*x + 3, '->', 2*x + 1)
(2*x, '->', x)
(x + 1, True)
/c
I'm trying to parse lines of the form:
(OP something something (OP something something ) ) ( OP something something )
where OP is a symbol for a logical gate (AND, OR, NOT) and something is the thing I want to evaluate.
The output I'm looking for is something like:
{ 'OPERATOR': [condition1, condition2, .. , conditionN] }
where a condition can itself be a dict/list pair (nested conditions). So far I have tried something like:
tree = dict()
cond = list()
tree[OP] = cond
for string in conditions:
    self.counter += 1
    if string.startswith('('):
        try:
            OP = string[1]
        except IndexError:
            OP = 'AND'
        finally:
            if OP == '?':
                OP = 'OR'
            elif OP == '!':
                OP = 'N'
        # Recurse
        cond.append(self.parse_conditions(conditions[self.counter:], OP))
        break
    elif not string.endswith(")"):
        cond.append(string)
    else:
        return tree
return tree
I tried other ways as well, but I just can't wrap my head around this whole recursion thing, so I'm wondering if I could get some pointers here. I looked around the web and found some material on recursive descent parsing, but the tutorials were all trying to do something more complicated than I need.
PS: I realize I could do this with existing Python libraries, but what would I learn by doing that, eh?
I'm posting this without further comments, for learning purposes (in the real life please do use a library). Note that there's no error checking (a homework for you!)
Feel free to ask if there's something you don't understand.
# PART 1. The Lexer
symbols = None

def read(input):
    global symbols
    import re
    symbols = re.findall(r'\w+|[()]', input)

def getsym():
    global symbols
    return symbols[0] if symbols else None

def popsym():
    global symbols
    return symbols.pop(0)

# PART 2. The Parser
# Built upon the following grammar:
#
# program = expr*
# expr = '(' func args ')'
# func = AND|OR|NOT
# args = arg*
# arg = string|expr
# string = [a..z]

def program():
    r = []
    while getsym():
        r.append(expr())
    return r

def expr():
    popsym() # (
    f = func()
    a = args()
    popsym() # )
    return {f: a}

def func():
    return popsym()

def args():
    r = []
    while getsym() != ')':
        r.append(arg())
    return r

def arg():
    if getsym() == '(':
        return expr()
    return string()

def string():
    return popsym()

# TEST = Lexer + Parser
def parse(input):
    read(input)
    return program()

print parse('(AND a b (OR c d)) (NOT foo) (AND (OR x y))')
# [{'AND': ['a', 'b', {'OR': ['c', 'd']}]}, {'NOT': ['foo']}, {'AND': [{'OR': ['x', 'y']}]}]
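As one possible take on the error-checking "homework", here is a sketch of the same grammar written in Python 3 without global state. The class name Parser, the ParseError exception and the fixed set of accepted operators are assumptions made for this example, not part of the answer above.

```python
import re

class ParseError(Exception):
    """Raised when the input does not follow the grammar."""

class Parser:
    OPERATORS = {"AND", "OR", "NOT"}

    def __init__(self, text):
        # Same lexer as above: words and parentheses.
        self.symbols = re.findall(r"\w+|[()]", text)
        self.pos = 0

    def peek(self):
        return self.symbols[self.pos] if self.pos < len(self.symbols) else None

    def pop(self, expected=None):
        sym = self.peek()
        if sym is None:
            raise ParseError("unexpected end of input")
        if expected is not None and sym != expected:
            raise ParseError("expected %r but found %r" % (expected, sym))
        self.pos += 1
        return sym

    def program(self):
        # program = expr*
        result = []
        while self.peek() is not None:
            result.append(self.expr())
        return result

    def expr(self):
        # expr = '(' func args ')'
        self.pop("(")
        func = self.pop()
        if func not in self.OPERATORS:
            raise ParseError("unknown operator %r" % func)
        args = []
        while self.peek() != ")":
            args.append(self.arg())  # raises at end of input if ')' is missing
        self.pop(")")
        return {func: args}

    def arg(self):
        # arg = string | expr
        if self.peek() == "(":
            return self.expr()
        return self.pop()

print(Parser("(AND a b (OR c d)) (NOT foo)").program())
```

Keeping the position in an instance instead of module-level globals also means two inputs can be parsed independently, and a missing parenthesis now surfaces as a ParseError instead of an IndexError from pop(0).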