Expanding a logical statement (multiplying out) - python

I am looking for a way to expand a logical expression (in a string) of the form:
'(A or B) and ((C and D) or E)'
in Python to produce a list of all positive sets, i.e.
['A and C and D',
'A and E',
'B and C and D',
'B and E']
but I have been unable to find out how to do this. I have investigated pyparsing, but I cannot work out which example is relevant in this case. This may be very easy with some sort of logic manipulation, but I do not know any formal logic. Any help, or a reference to a resource that might help, would be greatly appreciated.

Here's the pyparsing bit, taken from the example SimpleBool.py. First, use infixNotation (formerly known as operatorPrecedence) to define an expression grammar that supports parenthetical grouping, and recognizes precedence of operations:
from pyparsing import *
term = Word(alphas)
AND = Keyword("and")
OR = Keyword("or")
expr = infixNotation(term,
    [
    (AND, 2, opAssoc.LEFT),
    (OR, 2, opAssoc.LEFT),
    ])
sample = '(A or B) and ((C and D) or E)'
result = expr.parseString(sample)
from pprint import pprint
pprint(result.asList())
prints:
[[['A', 'or', 'B'], 'and', [['C', 'and', 'D'], 'or', 'E']]]
From this, we can see that the expression is at least parsed properly.
Next, we add parse actions to each level of the hierarchy of operations. For parse actions here, we actually pass classes, so that instead of executing functions and returning some value, the parser will call the class constructor and initializer and return a class instance for the particular subexpression:
class Operation(object):
    def __init__(self, tokens):
        self._tokens = tokens[0]
        self.assign()
    def assign(self):
        """
        function to copy tokens to object attributes
        """
    def __repr__(self):
        return self.__class__.__name__ + ":" + repr(self.__dict__)
    __str__ = __repr__
class BinOp(Operation):
    def assign(self):
        self.op = self._tokens[1]
        self.terms = self._tokens[0::2]
        del self._tokens
class AndOp(BinOp):
    pass
class OrOp(BinOp):
    pass
expr = infixNotation(term,
    [
    (AND, 2, opAssoc.LEFT, AndOp),
    (OR, 2, opAssoc.LEFT, OrOp),
    ])
sample = '(A or B) and ((C and D) or E)'
result = expr.parseString(sample)
pprint(result.asList())
returns:
[AndOp:{'terms': [OrOp:{'terms': ['A', 'B'], 'op': 'or'},
OrOp:{'terms': [AndOp:{'terms': ['C', 'D'],
'op': 'and'}, 'E'], 'op': 'or'}],
'op': 'and'}]
Now that the expression has been converted to a data structure of subexpressions, I leave it to you to do the work of adding methods to AndOp and OrOp to generate the various combinations of terms that will evaluate overall to True. (Look at the logic in the invregex.py example that inverts regular expressions for ideas on how to add generator functions to the parsed classes to generate the different combinations of terms that you want.)
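As a rough illustration of that last step, here is a minimal sketch that multiplies out a parsed expression. It is not the invregex.py approach and does not use the AndOp/OrOp classes: it assumes the plain nested-list form printed by the first parsing step (without the outermost wrapper list), and the function name expand is made up for this example.

```python
from itertools import product

def expand(node):
    """Return a list of conjunctions; each conjunction is a list of term names."""
    if isinstance(node, str):              # a bare term such as 'A'
        return [[node]]
    operands = node[0::2]                  # every other item; skips the operator tokens
    op = node[1]
    expanded = [expand(child) for child in operands]
    if op == 'or':
        # OR: any alternative of any operand satisfies the expression
        return [conj for alternatives in expanded for conj in alternatives]
    # AND: pick one alternative from every operand and merge the picks
    return [sum(choice, []) for choice in product(*expanded)]

# Hand-written copy of the nested list shown earlier (assumed input shape)
tree = [['A', 'or', 'B'], 'and', [['C', 'and', 'D'], 'or', 'E']]
result = [' and '.join(conj) for conj in expand(tree)]
print(result)
# ['A and C and D', 'A and E', 'B and C and D', 'B and E']
```

The same recursion could be moved into methods on AndOp and OrOp, using self.terms in place of node[0::2].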

It sounds as if you want to convert these expressions to disjunctive normal form (DNF). A classic algorithm for producing a minimal DNF is the Quine-McCluskey algorithm; you can find some information about Python implementations thereof in the relevant Wikipedia article and in the answers to this SO question.

Related

use own logic expression in an if statement

I got an expression parsed with pyparsing to reconstruct it as a sort of boolean tree:
expr = '(A and B) or C'
parsed -> OR_:[AND_:['A', 'B'] , 'C']
A, B and C are keys in a dict with string values (no boolean values!)
OR_ (union) and AND_ (intersection) are only class names and don't do anything right now; I'm thinking of putting an evaluator inside those classes.
now my question is, how do I turn this parsed expression into one that Python can evaluate?
What I'm trying to do is either take some string value and see if it meets the conditions of the whole expression or have it iterate over every subexpression and appending it to a result list.
Example:
dict: {'A': ['Hi', 'No', 'Yes'], 'B': ['Why', 'No', 'Okay'], 'C': ['Okay']}
expression = '(A and B) or C'
if value in expression:
    output.append(value)
output -> ['No', 'Okay'] #intersection of A and B, union of that with C
It's something like that, but the part if value in expression is what bothers me because I can't think of any other way to write it.
We can parse and translate our expression using the ast module. We first parse our statement and then define a node transformer that swaps and with &, or with |, and wraps variable names in the set function. We can then compile this translated ast and evaluate it in the context of our dictionary.
import ast
from typing import Dict, Hashable, List, NoReturn, TypeVar
A = TypeVar('A', bound=Hashable)
def evaluate_logic(expr: str, context: Dict[str, List[A]]) -> List[A]:
    tr = ast.parse(expr, mode='eval')
    new_tr = ast.fix_missing_locations(TranslateLogic().visit(tr))
    co = compile(new_tr, filename='', mode='eval')
    return list(eval(co, context))
class TranslateLogic(ast.NodeTransformer):
    def visit_BoolOp(self, node: ast.BoolOp) -> ast.BinOp:
        # translate `and` to `&` and `or` to `|`
        op = node.op
        new_op = ast.BitAnd() if isinstance(op, ast.And) else ast.BitOr()
        return nested_op(new_op, [self.visit(value) for value in node.values])
    def visit_Name(self, node: ast.Name) -> ast.Call:
        return call_set(node)
    def visit_Expression(self, node: ast.Expression) -> ast.Expression:
        return super().generic_visit(node)
    def generic_visit(self, node: ast.AST) -> NoReturn:
        # reject anything other than names, `and` and `or`
        raise ValueError(f"cannot visit node: {node}")
def nested_op(op, values: List[ast.AST]) -> ast.BinOp:
    if len(values) < 2:
        raise ValueError("tried to nest operator with fewer than two values")
    elif len(values) == 2:
        left, right = values
        return ast.BinOp(left=left, op=op, right=right)
    else:
        left, *rest = values
        return ast.BinOp(left=left, op=op, right=nested_op(op, rest))
def call_set(node: ast.Name) -> ast.Call:
    # wrap a variable name as set(<name>)
    return ast.Call(func=ast.Name(id='set', ctx=node.ctx), args=[node], keywords=[])
if __name__ == '__main__':
    expr = '(A and B) or C'
    context = {'A': ['Hi', 'No', 'Yes'], 'B': ['Why', 'No', 'Okay'], 'C': ['Okay']}
    print(evaluate_logic(expr, context))
    # prints ['No', 'Okay'] (element order may vary, since sets are unordered)
I would say that this demonstrates the challenge with generic parsing in Python and then applying custom logic in Python even when leveraging existing parsing libraries.
A few notes. We're eventually evaluating the code that the user provides. There's some safety because generic_visit should raise an error if the user supplies something more complex than just ands and ors but I would be very wary of this code in a production situation.
Second, there's a bit of complication when translating and to & (and or to |) because of how Python represents a chain of ands vs a chain of &s. A chain of ands becomes a single BoolOp node with multiple values, whereas a chain of &s becomes nested BinOps, each with a left and a right. Compare
ast.dump(ast.parse('A and B and C', mode='eval'))
# "Expression(body=BoolOp(op=And(), values=[Name(id='A', ctx=Load()), Name(id='B', ctx=Load()), Name(id='C', ctx=Load())]))"
to
ast.dump(ast.parse('A & B & C', mode='eval'))
# "Expression(body=BinOp(left=BinOp(left=Name(id='A', ctx=Load()), op=BitAnd(), right=Name(id='B', ctx=Load())), op=BitAnd(), right=Name(id='C', ctx=Load())))"
This explains why we need the nested_op helper function.
Finally, without more information, we can't implement not. The reason is we haven't defined a "universe of discourse". In particular, what should not A evaluate to? There are two possible solutions I see:
Add an additional argument for specifying the universe of discourse. Add a visit_UnaryOp to translate not A into something like set(U) - set(A) where U is the universe of discourse.
Treat not like the set difference binary operator. In this case it would probably be easiest to preprocess the expression as a string to replace " not " with " - ".
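The first option could look roughly like the sketch below. To keep it self-contained it walks the parsed tree directly instead of extending the NodeTransformer above with a visit_UnaryOp, so it is a different technique from the answer's code; the evaluate function and its universe parameter are assumptions made for this example.

```python
import ast

def evaluate(expr, context, universe):
    """Evaluate and/or/not over sets; `universe` is what `not` subtracts from."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Name):
            return set(context[node.id])
        if isinstance(node, ast.BoolOp):
            parts = [walk(value) for value in node.values]
            if isinstance(node.op, ast.And):
                return set.intersection(*parts)
            return set.union(*parts)
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
            # not A  ->  set(U) - set(A)
            return set(universe) - walk(node.operand)
        raise ValueError(f"unsupported syntax: {ast.dump(node)}")
    return walk(ast.parse(expr, mode='eval'))

context = {'A': ['Hi', 'No', 'Yes'], 'B': ['Why', 'No', 'Okay'], 'C': ['Okay']}
# Assumption for this example: the universe is every value that appears in the dict
universe = {value for values in context.values() for value in values}
result = evaluate('(A and not B) or C', context, universe)
print(sorted(result))
# ['Hi', 'Okay', 'Yes']
```

Because nothing is compiled or passed to eval, unsupported syntax is rejected before any user-supplied code can run.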
With all this being said, you'll likely save yourself a lot of trouble if you just force your users into an interface that is easier (for you) to work with. Something like
from my_module import And, Or
expr = Or(And("A", "B"), "C")
context = {'A': ['Hi', 'No', 'Yes'], 'B': ['Why', 'No', 'Okay'], 'C': ['Okay']}
evaluate_logic(expr, context)
You force your users to pre-parse the expression they give you but you save yourself a lot of worry and trouble.
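A minimal sketch of what such an interface might contain is shown below. The answer does not show my_module, so the And and Or classes, their combine method, and this version of evaluate_logic are all hypothetical.

```python
class And:
    """Intersection of all operands (hypothetical helper class)."""
    def __init__(self, *operands):
        self.operands = operands

    def combine(self, sets):
        return set.intersection(*sets)

class Or:
    """Union of all operands (hypothetical helper class)."""
    def __init__(self, *operands):
        self.operands = operands

    def combine(self, sets):
        return set.union(*sets)

def evaluate_logic(expr, context):
    """Evaluate a tree of And/Or nodes whose leaves are keys of `context`."""
    if isinstance(expr, str):              # leaf: look the key up in the dict
        return set(context[expr])
    return expr.combine([evaluate_logic(o, context) for o in expr.operands])

expr = Or(And("A", "B"), "C")
context = {'A': ['Hi', 'No', 'Yes'], 'B': ['Why', 'No', 'Okay'], 'C': ['Okay']}
result = evaluate_logic(expr, context)
print(sorted(result))
# ['No', 'Okay']
```

No parsing and no eval are involved here, which is where the saved trouble comes from.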
You can use binary operators if you convert it into sets:
output = list(set(a) & set(b) | set(c))

Python objects qualified for function overloading?

I've been learning Python lately (Dec. '20) and find this concept very convoluted. Let's say I have a function. How can I tell whether a particular object's type can be used with this kind of function overloading?
def compute(a, b, c):
    return (a+b) * c
It works fine with some object types (numbers, strings, lists), but not others.
>>> a, b, c=1,2,3
>>> a, b, c = 'hi', 'U', 3 # okay
>>> a, b, c = [1], [2,3], 3 # okay
>>> a, b, c = {1}, {2}, 3 # it's set()
Taking the + as an example, a data type must implement the __add__() (magic/special/"dunder") method to meaningfully use the + operator. In your examples, it is defined for integers, strings and lists, but not for sets.
The Python Data Model is the reference document if you want to learn more about this topic, but as a starting point you may find this article useful.
If you want to know whether a particular object, say x, has an operator overloaded (that is, whether it defines the corresponding magic method, as it is called in Python), you can check:
hasattr(x, '__add__') # for + and similarly for any other.
Of course, you can define one such method for a class you want, if it does not exist.
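As a small illustration of defining those methods yourself, here is a sketch with a made-up Vec class (not from the question) that supplies __add__ and __mul__ so that compute accepts it:

```python
class Vec:
    """A tiny 2-D vector supporting + with another Vec and * with a number."""
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        if not isinstance(other, Vec):
            return NotImplemented          # lets Python raise the usual TypeError
        return Vec(self.x + other.x, self.y + other.y)

    def __mul__(self, factor):
        return Vec(self.x * factor, self.y * factor)

    def __repr__(self):
        return f"Vec({self.x}, {self.y})"

def compute(a, b, c):
    return (a + b) * c

print(compute(Vec(1, 2), Vec(3, 4), 3))    # Vec(12, 18)
print(hasattr({1}, '__add__'))             # False: sets do not define +
print(hasattr({1}, '__or__'))              # True: sets use | for union
```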

How to do math operations with string?

If I have the string calculation = '1+1x8', how can I convert this into calculation = 1+1*8? I tried doing something like
for char in calculation:
    if char == 'x':
        calculation = calculation.replace('x', *)
    # and
    if char == '1':
        calculation = calculation.replace('1', 1)
This clearly doesn't work, since you can't replace just one character with an integer. The entire string needs to be an integer, and if I do that it doesn't work either, since I can't convert 'x' and '+' to integers.
Let's use a more complicated string as an example: 1+12x8. What follows is a rough outline; you need to supply the implementation for each step.
First, you tokenize it, turning 1+12x8 into ['1', '+', '12', 'x', '8']. For this step you need to write a tokenizer or a lexical analyzer. This is the step where you define your operators and literals.
Next, you convert the token stream into a parse tree. Perhaps you represent the tree as an S-expression ['+', '1', ['x', '12', '8']] or [operator.add, 1, [operator.mul, 12, 8]]. This step requires writing a parser, which requires you to define things like the precedence of your operators.
Finally, you write an evaluator that can reduce your parse tree to a single value. Doing this in two steps might yield
[operator.add, 1, [operator.mul, 12, 8]] to [operator.add, 1, 96]
[operator.add, 1, 96] to 97
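One possible way to fill in those three steps is sketched below. It assumes non-negative integer literals, the four operators +, -, x and /, and no parentheses; the parser uses precedence climbing, which is just one of several techniques that would work here, and all function names are made up for this example.

```python
import operator
import re

# symbol -> (precedence, function); higher precedence binds tighter
OPS = {'+': (1, operator.add), '-': (1, operator.sub),
       'x': (2, operator.mul), '/': (2, operator.truediv)}

def tokenize(text):
    """'1+12x8' -> [1, '+', 12, 'x', 8]"""
    return [int(t) if t.isdigit() else t
            for t in re.findall(r'\d+|[-+x/]', text)]

def parse(tokens, min_prec=1):
    """Consume tokens and build a nested [function, left, right] tree."""
    left = tokens.pop(0)
    while tokens and OPS[tokens[0]][0] >= min_prec:
        prec, func = OPS[tokens.pop(0)]
        right = parse(tokens, prec + 1)    # prec + 1 keeps operators left-associative
        left = [func, left, right]
    return left

def evaluate(tree):
    """Reduce the tree to a single value, innermost operations first."""
    if not isinstance(tree, list):
        return tree
    func, left, right = tree
    return func(evaluate(left), evaluate(right))

tree = parse(tokenize('1+12x8'))
print(tree == [operator.add, 1, [operator.mul, 12, 8]])   # True
print(evaluate(tree))                                     # 97
print(evaluate(parse(tokenize('2-4/2+5'))))               # 5.0
```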
You could write something like:
def parse_exp(s):
    return eval(s.replace('x','*'))
and expand for whatever other exotic symbols you want to use.
To limit the risks of eval you can also eliminate bad characters:
import string
good = string.digits + '()/*+-x'
def parse_exp(s):
    s2 = ''.join([i for i in s if i in good])
    return eval(s2.replace('x','*'))
Edit: an additional bonus is that the built-in eval function will take care of things like parentheses and operator precedence :)
Edit 2: As another user pointed out, eval can be dangerous. As such, only use it if your code will only ever run locally on input you trust.
Adding code to what chepner suggested:
Tokenize '1+12x8' -> ['1', '+', '12', 'x', '8'].
Use order of operation '/*+-' -> reduce calculation 1 + (12*8)
Return the answer
import re
import operator
# Operators grouped by precedence; each group is applied left to right.
precedence = [
    {'/': operator.truediv, 'x': operator.mul},
    {'+': operator.add, '-': operator.sub},
]
def op(precedence, data):
    # reduce the token list one precedence level at a time
    for level in precedence:
        i = 1  # operators sit at the odd positions
        while i < len(data):
            if data[i] in level:
                # replace "left op right" with its result and re-check position i
                data[i-1:i+2] = [level[data[i]](data[i-1], data[i+1])]
            else:
                i += 2
    return data[0]
def func(data):
    # Tokenize into integers and operator symbols
    d = [int(t) if t.isdigit() else t for t in re.split(r'(\d+)', data) if t]
    # Use order of operations
    return op(precedence, d)
s1 = "1+1x8"
s2 = '2-4/2+5'
s = func(s1) # 9
print(s)
t = func(s2) # 5.0
print(t)

python string (with space) matching

While trying to eliminate a few strings from a list of strings, I tried some simple code similar to:
>>> s = ['a b', 'c d', 'e f', 'g h']
>>> for i in s:
...     if i is not 'e f':
...         print(i)
...
a b
c d
e f # this should not get printed, right?
g h
and I am unable to understand the underlying behavior.
Can you explain? The following seems logical, so the code above should work the same way:
>>> if 'a b' is not 'a b':
...     True
... else:
...     False
...
False
>>> s = ['a', 'c', 'e', 'g']
>>> for i in s:
...     if i is not 'e':
...         print(i)
...
a
c
g
Are spaces treated specially? What am I missing?
is not is an identity based test; when it works on strings, it's due to interning of strings or the small string cache; it's an implementation detail that should never be relied on.
Don't use is/is not in general, except for comparisons to None, until you really understand what it's doing. You want != here, which tests value (do the two objects represent the same logical information?), not is not, which tests identity (are both things referring to the exact same object?).
If you wanted to force this to work, you could do something terrible, like explicitly interning all the strings involved, but that doesn't save any work (the work is spent interning them), and it's generally frowned upon.
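A short sketch of the difference, using the list from the question. Whether two equal strings share one object depends on the interpreter, so the identity result below is described rather than promised.

```python
s = ['a b', 'c d', 'e f', 'g h']

# != compares the characters, so it works no matter how the strings were created
kept = [item for item in s if item != 'e f']
print(kept)                  # ['a b', 'c d', 'g h']

# Two equal strings need not be the same object. A string built at runtime
# is typically a separate object in CPython (an implementation detail).
built = ' '.join(['e', 'f'])
print(built == s[2])         # True: same value
print(built is s[2])         # usually False in CPython: different objects
```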

Replace multiple elements in string with str methods

I am trying to write a function that takes a string of DNA and returns the complement. I have been trying to solve this for a while now and have looked through the Python documentation, but couldn't work it out. I have written the docstring for the function so you can see what the answer should look like. I have seen a similar question asked on this forum, but I could not understand the answers. I would be grateful if someone could explain this using only str methods and loops / if statements, as I have not yet studied dictionaries/lists in detail.
I tried str.replace but could not get it to work for multiple elements, tried nested if statements and this didn't work either. I then tried writing 4 separate for loops, but to no avail.
def get_complementary_sequence(dna):
    """ (str) -> str
    Return the DNA sequence that is complementary
    to the given DNA sequence.
    >>> get_complementary_sequence('AT')
    TA
    >>> get_complementary_sequence('GCTTAA')
    CGAATT
    """
    for char in dna:
        if char == A:
            dna = dna.replace('A', 'T')
        elif char == T:
            dna = dna.replace('T', 'A')
        # ...and so on
For a problem like this, you can use str.maketrans (string.maketrans in Python 2) combined with str.translate:
table = str.maketrans('CGAT', 'GCTA')
print('GCTTAA'.translate(table))
# outputs CGAATT
You can map each letter to another letter with a dictionary.
You probably don't need to create a translation table covering every possible combination.
>>> M = {'A':'T', 'T':'A', 'C':'G', 'G':'C'}
>>> STR = 'CGAATT'
>>> S = "".join([M.get(c,c) for c in STR])
>>> S
'GCTTAA'
How this works:
# this returns a list of char according to your dict M
>>> L = [M.get(c,c) for c in STR]
>>> L
['G', 'C', 'T', 'T', 'A', 'A']
The join() method returns a string in which the string elements of the sequence are joined by the separator string it is called on.
>>> sep = "-"
>>> L = ['a','b','c']
>>> sep.join(L)
'a-b-c'
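Since the question asked for a version that uses only loops and if statements, here is a minimal sketch along those lines. It builds a new string instead of calling replace on the original, because replacing every 'A' with 'T' and later every 'T' with 'A' would undo the first substitution. Characters other than A, T, C and G are silently dropped, which is an assumption of this sketch.

```python
def get_complementary_sequence(dna):
    """Return the DNA sequence complementary to the given DNA sequence."""
    result = ''
    for char in dna:
        # append the partner of each base to a new string
        if char == 'A':
            result = result + 'T'
        elif char == 'T':
            result = result + 'A'
        elif char == 'C':
            result = result + 'G'
        elif char == 'G':
            result = result + 'C'
    return result

print(get_complementary_sequence('AT'))        # TA
print(get_complementary_sequence('GCTTAA'))    # CGAATT
```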
