I got an expression parsed with pyparsing to reconstruct it as a sort of boolean tree:
expr = '(A and B) or C'
parsed -> OR_:[AND_:['A', 'B'] , 'C']
A, B and C are keys in a dict with string values (no boolean values!)
OR_ (union) and AND_ (intersection) are only class names and don't do anything right now; I'm thinking of putting an evaluator inside those classes.
Now my question is: how do I turn this parsed expression into one that Python can evaluate?
What I'm trying to do is either take some string value and check whether it meets the conditions of the whole expression, or iterate over every subexpression and append the matches to a result list.
Example:
dict: {'A': ['Hi', 'No', 'Yes'], 'B': ['Why', 'No', 'Okay'], 'C': ['Okay']}
expression = '(A and B) or C'
if value in expression:
output.append(value)
output -> ['No', 'Okay'] #intersection of A and B, union of that with C
It's something like that, but the part if value in expression is what bothers me because I can't think of any other way to write it.
We can parse and translate our expression using the ast module. We first parse our statement and then define a node transformer that swaps and with &, or with |, and wraps variable names in the set function. We can then compile this translated ast and evaluate it in the context of our dictionary.
import ast
from typing import Dict, Hashable, List, NoReturn, TypeVar

A = TypeVar('A', bound=Hashable)

def evaluate_logic(expr: str, context: Dict[str, List[A]]) -> List[A]:
    tr = ast.parse(expr, mode='eval')
    new_tr = ast.fix_missing_locations(TranslateLogic().visit(tr))
    co = compile(new_tr, filename='', mode='eval')
    return list(eval(co, context))

class TranslateLogic(ast.NodeTransformer):
    def visit_BoolOp(self, node: ast.BoolOp) -> ast.BinOp:
        op = node.op
        new_op = ast.BitAnd() if isinstance(op, ast.And) else ast.BitOr()
        return nested_op(new_op, [self.visit(value) for value in node.values])

    def visit_Name(self, node: ast.Name) -> ast.Call:
        return call_set(node)

    def visit_Expression(self, node: ast.Expression) -> ast.Expression:
        return super().generic_visit(node)

    def generic_visit(self, node: ast.AST) -> NoReturn:
        raise ValueError(f"cannot visit node: {node}")

def nested_op(op, values: List[ast.AST]) -> ast.BinOp:
    if len(values) < 2:
        raise ValueError("tried to nest operator with fewer than two values")
    elif len(values) == 2:
        left, right = values
        return ast.BinOp(left=left, op=op, right=right)
    else:
        left, *rest = values
        return ast.BinOp(left=left, op=op, right=nested_op(op, rest))

def call_set(node: ast.Name) -> ast.Call:
    return ast.Call(func=ast.Name(id='set', ctx=node.ctx), args=[node], keywords=[])

if __name__ == '__main__':
    expr = '(A and B) or C'
    context = {'A': ['Hi', 'No', 'Yes'], 'B': ['Why', 'No', 'Okay'], 'C': ['Okay']}
    print(evaluate_logic(expr, context))
    # prints ['No', 'Okay'] (element order may vary, since sets are unordered)
I would say that this demonstrates the challenge with generic parsing in Python and then applying custom logic in Python even when leveraging existing parsing libraries.
A few notes. We're eventually evaluating the code that the user provides. There's some safety because generic_visit should raise an error if the user supplies something more complex than just ands and ors but I would be very wary of this code in a production situation.
Second, there's a bit of complication when translating and to & (and or to |) because of how Python represents a chain of ands vs a chain of &s. A chain of ands becomes a single BoolOp node with multiple values, whereas a chain of &s becomes nested BinOps, each with a left and a right. Compare
ast.dump(ast.parse('A and B and C', mode='eval'))
# "Expression(body=BoolOp(op=And(), values=[Name(id='A', ctx=Load()), Name(id='B', ctx=Load()), Name(id='C', ctx=Load())]))"
to
ast.dump(ast.parse('A & B & C', mode='eval'))
# "Expression(body=BinOp(left=BinOp(left=Name(id='A', ctx=Load()), op=BitAnd(), right=Name(id='B', ctx=Load())), op=BitAnd(), right=Name(id='C', ctx=Load())))"
This explains why we need the nested_op helper function.
Finally, without more information, we can't implement not. The reason is we haven't defined a "universe of discourse". In particular, what should not A evaluate to? There are two possible solutions I see:
Add an additional argument for specifying the universe of discourse. Add a visit_UnaryOp to translate not A into something like set(U) - set(A) where U is the universe of discourse.
Treat not like the set difference binary operator. In this case it would probably be easiest to preprocess the expression as a string to replace " not " with " - ".
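To make the first option concrete, here is a minimal sketch that walks the parsed tree directly instead of compiling it, so nothing the user typed is ever handed to eval. The function name evaluate_sets and the universe parameter are assumptions introduced for illustration; they are not part of the code above.

```python
import ast

def evaluate_sets(expr, context, universe=None):
    """Evaluate and/or/not over sets by walking the AST (no eval)."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Name):
            # a bare name is looked up in the user's dictionary
            return set(context[node.id])
        if isinstance(node, ast.BoolOp):
            parts = [walk(value) for value in node.values]
            if isinstance(node.op, ast.And):
                return set.intersection(*parts)
            return set.union(*parts)
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
            # "not A" only makes sense relative to a universe of discourse
            if universe is None:
                raise ValueError("'not' requires a universe of discourse")
            return set(universe) - walk(node.operand)
        raise ValueError(f"unsupported syntax: {ast.dump(node)}")

    return walk(ast.parse(expr, mode='eval'))

context = {'A': ['Hi', 'No', 'Yes'], 'B': ['Why', 'No', 'Okay'], 'C': ['Okay']}
universe = ['Hi', 'No', 'Yes', 'Why', 'Okay']
print(evaluate_sets('(A and B) or C', context))     # the set {'No', 'Okay'}
print(evaluate_sets('not A', context, universe))    # the set {'Why', 'Okay'}
```

Because BoolOp already holds all the operands of a chain, set.intersection and set.union can consume them in one call and no nesting helper is needed here.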
With all this being said, you'll likely save yourself a lot of trouble if you just force your users into an interface that is easier for you to work with. Something like
from my_module import And, Or
expr = Or(And("A", "B"), "C")
context = {'A': ['Hi', 'No', 'Yes'], 'B': ['Why', 'No', 'Okay'], 'C': ['Okay']}
evaluate_logic(expr, context)
You force your users to pre-parse the expression they give you but you save yourself a lot of worry and trouble.
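As a rough idea of what such a module could contain, here is a minimal sketch. my_module is hypothetical, and the And and Or classes, the resolve helper and this version of evaluate_logic are all assumptions made up for illustration.

```python
class And:
    """Intersection of all terms."""
    def __init__(self, *terms):
        self.terms = terms

    def evaluate(self, context):
        return set.intersection(*(resolve(t, context) for t in self.terms))

class Or:
    """Union of all terms."""
    def __init__(self, *terms):
        self.terms = terms

    def evaluate(self, context):
        return set.union(*(resolve(t, context) for t in self.terms))

def resolve(term, context):
    # a plain string is a key into the context; anything else is a sub-expression
    if isinstance(term, str):
        return set(context[term])
    return term.evaluate(context)

def evaluate_logic(expr, context):
    # sorted() gives a stable order, since sets themselves are unordered
    return sorted(resolve(expr, context))

expr = Or(And("A", "B"), "C")
context = {'A': ['Hi', 'No', 'Yes'], 'B': ['Why', 'No', 'Okay'], 'C': ['Okay']}
print(evaluate_logic(expr, context))
# prints ['No', 'Okay']
```

No parsing or eval is involved at all, which is where the saved worry comes from.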
You can use binary operators if you convert it into sets:
output = list(set(a) & set(b) | set(c))
If I have the string calculation = '1+1x8', how can I convert it into calculation = 1+1*8? I tried doing something like
for char in calculation:
    if char == 'x':
        calculation = calculation.replace('x', *)
    # and
    if char == '1':
        calculation = calculation.replace('1', 1)
This clearly doesn't work, since you can't replace just one character with an integer. The entire string needs to be an integer, and if I do that it doesn't work either, since I can't convert 'x' and '+' to integers.
Let's use a more complicated string as an example: 1+12x8. What follows is a rough outline; you need to supply the implementation for each step.
First, you tokenize it, turning 1+12x8 into ['1', '+', '12', 'x', '8']. For this step you need to write a tokenizer or a lexical analyzer. This is the step where you define your operators and literals.
Next, you convert the token stream into a parse tree. Perhaps you represent the tree as an S-expression ['+', '1', ['x', '12', '8']] or [operator.add, 1, [operator.mul, 12, 8]]. This step requires writing a parser, which requires you to define things like the precedence of your operators.
Finally, you write an evaluator that can reduce your parse tree to a single value. Doing this in two steps might yield
[operator.add, 1, [operator.mul, 12, 8]] to [operator.add, 1, 96]
[operator.add, 1, 96] to 97
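The three steps above can be sketched as follows. This is one possible implementation under assumptions of my own: only non-negative integers and the four operators +, -, x and / are supported, the parser uses precedence climbing, and the names OPS, tokenize, parse, evaluate and calculate are made up for illustration.

```python
import operator
import re

# operator symbol -> (precedence, function); a higher number binds tighter
OPS = {
    '+': (1, operator.add),
    '-': (1, operator.sub),
    'x': (2, operator.mul),
    '/': (2, operator.truediv),
}

def tokenize(text):
    """Step 1: split the string into number and operator tokens."""
    return re.findall(r'\d+|[+\-x/]', text)

def parse(tokens, min_prec=1):
    """Step 2: build a nested [func, left, right] tree (consumes tokens)."""
    left = int(tokens.pop(0))
    while tokens and OPS[tokens[0]][0] >= min_prec:
        prec, func = OPS[tokens.pop(0)]
        # operands of tighter-binding operators are grouped first
        right = parse(tokens, prec + 1)
        left = [func, left, right]
    return left

def evaluate(tree):
    """Step 3: reduce the tree to a single value."""
    if not isinstance(tree, list):
        return tree
    func, left, right = tree
    return func(evaluate(left), evaluate(right))

def calculate(text):
    return evaluate(parse(tokenize(text)))

print(tokenize('1+12x8'))   # ['1', '+', '12', 'x', '8']
print(calculate('1+12x8'))  # 97
```

Passing prec + 1 to the recursive call is what makes operators of equal precedence group left to right, so 2-4/2+5 is read as (2-(4/2))+5.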
You could write something like:
def parse_exp(s):
    return eval(s.replace('x','*'))
and expand for whatever other exotic symbols you want to use.
To limit the risks of eval you can also eliminate bad characters:
import string
good = string.digits + '()/*+-x'
def parse_exp(s):
    s2 = ''.join([i for i in s if i in good])
    return eval(s2.replace('x','*'))
Edit: an additional bonus is that the built-in eval function will take care of things like parentheses and general calculation rules :)
Edit 2: As another user pointed out, eval can be dangerous. As such, only use it if your code will only ever run locally.
Adding code to what chepner suggested:
Tokenize '1+12x8' -> ['1', '+', '12', 'x', '8'].
Use order of operation '/*+-' -> reduce calculation 1 + (12*8)
Return the answer
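import re
import operator

# Operators grouped by precedence; each group is applied left to right,
# so that '-' and '+' (or '/' and 'x') do not override each other.
precedence = [
    {'x': operator.mul, '/': operator.truediv},
    {'+': operator.add, '-': operator.sub},
]

def op(precedence, data):
    # reduce the token list one precedence level at a time
    for level in precedence:
        i = 1  # operators sit at the odd positions
        while i < len(data):
            if data[i] in level:
                # replace "left op right" with its result
                data[i-1:i+2] = [level[data[i]](data[i-1], data[i+1])]
            else:
                i += 2
    return data[0]

def func(data):
    # Tokenize: numbers become ints, operators stay strings
    d = [int(t) if t.isdigit() else t for t in re.split(r'(\d+)', data) if t]
    # Use order of operations
    return op(precedence, d)

s1 = "1+1x8"
s2 = '2-4/2+5'
print(func(s1))  # 9
print(func(s2))  # 5.0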
I am trying to write a function that takes a string of DNA and returns the complement. I have been trying to solve this for a while now and looked through the Python documentation but couldn't work it out. I have written the docstring for the function so you can see what the answer should look like. I have seen a similar question asked on this forum but I could not understand the answers. I would be grateful if someone can explain this using only str formatting and loops / if statements, as I have not yet studied dictionaries/lists in detail.
I tried str.replace but could not get it to work for multiple elements, tried nested if statements and this didn't work either. I then tried writing 4 separate for loops, but to no avail.
def get_complementary_sequence(dna):
    """ (str) -> str
    Return the DNA sequence that is complementary
    to the given DNA sequence.
    >>> get_complementary_sequence('AT')
    TA
    >>> get_complementary_sequence('GCTTAA')
    CGAATT
    """
    for char in dna:
        if char == A:
            dna = dna.replace('A', 'T')
        elif char == T:
            dna = dna.replace('T', 'A')
        # ...and so on
For a problem like this, you can use str.maketrans (string.maketrans in Python 2) combined with str.translate:
table = str.maketrans('CGAT', 'GCTA')
print('GCTTAA'.translate(table))
# outputs CGAATT
You can map each letter to another letter.
You probably don't need to create a translation table with every possible combination.
>>> M = {'A':'T', 'T':'A', 'C':'G', 'G':'C'}
>>> STR = 'CGAATT'
>>> S = "".join([M.get(c,c) for c in STR])
>>> S
'GCTTAA'
How this works:
# this returns a list of char according to your dict M
>>> L = [M.get(c,c) for c in STR]
>>> L
['G', 'C', 'T', 'T', 'A', 'A']
The join() method returns a string in which the string elements of a sequence are joined by a separator string.
>>> sep = "-"
>>> L = ['a','b','c']
>>> sep.join(L)
'a-b-c'
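Since the question asks for a version that uses only strings, loops and if statements, here is a minimal sketch along those lines. It builds a new string one character at a time instead of calling replace on the whole string, because repeated replace calls undo each other (every A becomes T, and then every T, including the new ones, becomes A). Silently skipping characters other than A, T, C and G is my own assumption.

```python
def get_complementary_sequence(dna):
    """ (str) -> str
    Return the DNA sequence that is complementary
    to the given DNA sequence.
    """
    complement = ''
    for char in dna:
        # append the partner of each base to the result
        if char == 'A':
            complement = complement + 'T'
        elif char == 'T':
            complement = complement + 'A'
        elif char == 'C':
            complement = complement + 'G'
        elif char == 'G':
            complement = complement + 'C'
    return complement

print(get_complementary_sequence('AT'))      # TA
print(get_complementary_sequence('GCTTAA'))  # CGAATT
```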
I am looking for a way to expand a logical expression (in a string) of the form:
'(A or B) and ((C and D) or E)'
in Python to produce a list of all positive sets, i.e.
['A and C and D',
'A and E',
'B and C and D',
'B and E']
but I have been unable to find how to do this. I have investigated pyparsing, but I cannot work out which example is relevant in this case. This may be very easy with some sort of logic manipulation, but I do not know any formal logic. Any help, or a reference to a resource that might help, would be greatly appreciated.
Here's the pyparsing bit, taken from the example SimpleBool.py. First, use infixNotation (formerly known as operatorPrecedence) to define an expression grammar that supports parenthetical grouping, and recognizes precedence of operations:
from pyparsing import *
term = Word(alphas)
AND = Keyword("and")
OR = Keyword("or")
expr = infixNotation(term,
    [
    (AND, 2, opAssoc.LEFT),
    (OR, 2, opAssoc.LEFT),
    ])
sample = '(A or B) and ((C and D) or E)'
result = expr.parseString(sample)
from pprint import pprint
pprint(result.asList())
prints:
[[['A', 'or', 'B'], 'and', [['C', 'and', 'D'], 'or', 'E']]]
From this, we can see that the expression is at least parsed properly.
Next, we add parse actions to each level of the hierarchy of operations. For parse actions here, we actually pass classes, so that instead of executing functions and returning some value, the parser will call the class constructor and initializer and return a class instance for the particular subexpression:
class Operation(object):
    def __init__(self, tokens):
        self._tokens = tokens[0]
        self.assign()

    def assign(self):
        """
        function to copy tokens to object attributes
        """

    def __repr__(self):
        return self.__class__.__name__ + ":" + repr(self.__dict__)
    __str__ = __repr__

class BinOp(Operation):
    def assign(self):
        self.op = self._tokens[1]
        self.terms = self._tokens[0::2]
        del self._tokens

class AndOp(BinOp):
    pass

class OrOp(BinOp):
    pass

expr = infixNotation(term,
    [
    (AND, 2, opAssoc.LEFT, AndOp),
    (OR, 2, opAssoc.LEFT, OrOp),
    ])
sample = '(A or B) and ((C and D) or E)'
result = expr.parseString(sample)
pprint(result.asList())
returns:
[AndOp:{'terms': [OrOp:{'terms': ['A', 'B'], 'op': 'or'},
OrOp:{'terms': [AndOp:{'terms': ['C', 'D'],
'op': 'and'}, 'E'], 'op': 'or'}],
'op': 'and'}]
Now that the expression has been converted to a data structure of subexpressions, I leave it to you to do the work of adding methods to AndOp and OrOp to generate the various combinations of terms that will evaluate overall to True. (Look at the logic in the invregex.py example that inverts regular expressions for ideas on how to add generator functions to the parsed classes to generate the different combinations of terms that you want.)
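One way to generate those combinations is sketched below on plain nested tuples rather than on the AndOp/OrOp instances, so that it runs without pyparsing. The tuple layout ('and', ...) / ('or', ...) and the function name expand are assumptions for illustration; the same recursion could be moved into methods on the parsed classes.

```python
from itertools import product

def expand(node):
    """Return a list of term lists; each inner list is one conjunction."""
    if isinstance(node, str):
        return [[node]]
    op, *children = node
    expanded = [expand(child) for child in children]
    if op == 'or':
        # every alternative of every child is an alternative of the "or"
        return [terms for alternatives in expanded for terms in alternatives]
    if op == 'and':
        # pick one alternative from each child and merge the picks
        return [sum(combo, []) for combo in product(*expanded)]
    raise ValueError(f"unknown operator: {op!r}")

# '(A or B) and ((C and D) or E)' written as nested tuples
tree = ('and', ('or', 'A', 'B'), ('or', ('and', 'C', 'D'), 'E'))
result = [' and '.join(terms) for terms in expand(tree)]
print(result)
# ['A and C and D', 'A and E', 'B and C and D', 'B and E']
```

Note that this only distributes and over or; it does not remove duplicate or redundant terms.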
It sounds as if you want to convert these expressions to Disjunctive Normal Form. A canonical algorithm for doing that is the Quine-McCluskey algorithm; you can find some information about Python implementations thereof in the relevant Wikipedia article and in the answers to this SO question.
I'm looking for a concise and functional style way to apply a function to one element of a tuple and return the new tuple, in Python.
For example, for the following input:
inp = ("hello", "my", "friend")
I would like to be able to get the following output:
out = ("hello", "MY", "friend")
I came up with two solutions which I'm not satisfied with.
One uses a higher-order function.
def apply_at(arr, func, i):
    # arr is a tuple, so the changed element must be wrapped in a one-element tuple
    return arr[0:i] + (func(arr[i]),) + arr[i+1:]
apply_at(inp, lambda x: x.upper(), 1)
One uses list comprehensions (this one assumes the length of the tuple is known).
[(a,b.upper(),c) for a,b,c in [inp]][0]
Is there a better way? Thanks!
Here is a version that works on any iterable and returns a generator:
>>> inp = ("hello", "my", "friend")
>>> def apply_nth(fn, n, iterable):
...     return (fn(x) if i==n else x for (i,x) in enumerate(iterable))
...
>>> tuple(apply_nth(str.upper, 1, inp))
('hello', 'MY', 'friend')
You can extend this so that instead of one position you can give it a list of positions:
>>> def apply_at(fn, pos_lst, iterable):
...     pos_lst = set(pos_lst)
...     return (fn(x) if i in pos_lst else x for (i,x) in enumerate(iterable))
...
>>> ''.join(apply_at(str.upper, [2,4,6,8], "abcdefghijklmno"))
'abCdEfGhIjklmno'
I commented in support of your first snippet, but here are a couple other ways for the record:
(lambda (a,b,c): [a,b.upper(),c])(inp)
(Won't work in Python 3.x.) And:
[inp[0], inp[1].upper(), inp[2]]
>>> inp = "hello", "my", "friend"
>>> index = 1
>>> inp[:index] + ( str.upper(inp[index]),) + inp[index + 1:]
('hello', 'MY', 'friend')
Seems simple, the only thing you may need to know is that to make a single element tuple, do (elt,)
Maybe something like this?
>>> inp = ("hello", "my", "friend")
>>> out = tuple([i == 1 and x.upper() or x for (x,i) in zip(inp,range(len(inp)))])
>>> out
('hello', 'MY', 'friend')
Note: rather than (x,i) in zip(inp, range(len(inp))) I should have thought of using the enumerate function: (i,x) in enumerate(inp)
Making it a bit more general:
Rather than hard-coding the 1, we can place it in a variable.
Also, by using a tuple for that purpose, we can apply the function to elements at multiple indexes.
>>> inp = ("hello", "my", "friend")
>>> ix = (0,2)
>>> out = tuple([i in ix and x.upper() or x for (i, x) in enumerate(inp)])
>>> out
('HELLO', 'my', 'FRIEND')
Also, we can "replace" the zip()/enumerate() by map(), in something like
out = tuple(map(lambda x,i : i == 1 and x.upper() or x, inp, range(len(inp)) ) )
Edit: (addressing comment about specifying the function to apply):
Could be something as simple as:
>>> f = str.upper # or whatever function taking a single argument
>>> out = tuple(map(lambda x,i : i == 1 and f(x) or x, inp, range(len(inp)) ) )
Since we're talking about applying any function, we should mention the small caveat with the condition and if_true or if_false construct which is not exactly a substitute for the if/else ternary operator found in other languages. The limitation is that the function cannot return a value which is equivalent to False (None, 0, 0.0, '' for example). A suggestion to avoid this problem, is, with Python 2.5 and up, to use the true if-else ternary operator, as shown in Dave Kirby's answer (note the when_true if condition else when_false syntax of this operator)
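A small sketch of that caveat follows. The helper names apply_and_or and apply_if_else, and the example function blank, are made up for illustration.

```python
def apply_and_or(f, x, condition):
    # old idiom: falls through to x whenever f(x) is falsy
    return condition and f(x) or x

def apply_if_else(f, x, condition):
    # conditional expression: always returns f(x) when condition is true
    return f(x) if condition else x

def blank(s):
    # hypothetical function whose result is falsy (an empty string)
    return ''

print(repr(apply_and_or(blank, 'my', True)))   # 'my'  (the empty result is lost)
print(repr(apply_if_else(blank, 'my', True)))  # ''
```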
I don't understand if you want to apply a certain function to every element in the tuple that passes some test, or if you would like it to apply the function to any element present at a certain index of the tuple. So I have coded both algorithms:
This is the algorithm (coded in Python) that I would use to solve this problem in a functional language like scheme:
This function will identify the element identifiable by id and apply func to it and return a list with that element changed to the output of func. It will do this for every element identifiable as id:
def doSomethingTo(tup, id, func):
    return tuple(doSomethingToHelper(list(tup), id, func))
def doSomethingToHelper(L, id, func):
    if len(L) == 0:
        return L
    elif L[0] == id:
        return [func(L[0])] + doSomethingToHelper(L[1:], id, func)
    else:
        return [L[0]] + doSomethingToHelper(L[1:], id, func)
This algorithm will find the element at the index of the tuple and apply func to it, and stick it back into its original index in the tuple
def doSomethingAt(tup, i, func):
    return tuple(doSomethingAtHelper(list(tup), i, 0, func))
def doSomethingAtHelper(L, index, i, func):
    if len(L) == 0:
        return L
    elif i == index:
        return [func(L[0])] + L[1:]
    else:
        return [L[0]] + doSomethingAtHelper(L[1:], index, i+1, func)
i also like the answer that Dave Kirby gave. however, as a public service announcement, i'd like to say that this is not a typical use case for tuples -- these are data structures that originated in Python as a means to move data (parameters, arguments) to and from functions... they were not meant for the programmer to use as general array-like data structures in applications -- this is why lists exist. naturally, if you're needing the read-only/immutable feature of tuples, that is a fair argument, but given the OP question, this should've been done with lists instead -- note how there is extra code to either pull the tuple apart and put the resulting one together and/or the need to temporarily convert to a list and back.
The list.index(x) function returns the index in the list of the first item whose value is x.
Is there a function, list_func_index(), similar to the index() function that has a function, f(), as a parameter. The function, f() is run on every element, e, of the list until f(e) returns True. Then list_func_index() returns the index of e.
Codewise:
>>> def list_func_index(lst, func):
        for i in range(len(lst)):
            if func(lst[i]):
                return i
        raise ValueError('no element making func True')
>>> l = [8,10,4,5,7]
>>> def is_odd(x): return x % 2 != 0
>>> list_func_index(l,is_odd)
3
Is there a more elegant solution? (and a better name for the function)
You could do that in a one-liner using generators:
next(i for i,v in enumerate(l) if is_odd(v))
The nice thing about generators is that they only compute up to the requested amount. So requesting the first two indices is (almost) just as easy:
y = (i for i,v in enumerate(l) if is_odd(v))
x1 = next(y)
x2 = next(y)
Though, expect a StopIteration exception after the last index (that is how generators work). This is also convenient in your "take-first" approach, to know that no such value was found --- the list.index() function would throw ValueError here.
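If an exception is not what you want when nothing matches, next also accepts a default value as its second argument. A minimal sketch, with first_index as a made-up name:

```python
def first_index(items, pred, default=None):
    # next() returns `default` instead of raising StopIteration
    # when the generator is exhausted without a match
    return next((i for i, v in enumerate(items) if pred(v)), default)

def is_odd(x):
    return x % 2 != 0

print(first_index([8, 10, 4, 5, 7], is_odd))  # 3
print(first_index([8, 10, 4], is_odd))        # None
```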
One possibility is the built-in enumerate function:
def index_of_first(lst, pred):
    for i,v in enumerate(lst):
        if pred(v):
            return i
    return None
It's typical to refer to a function like the one you describe as a "predicate"; it returns true or false for some question. That's why I call it pred in my example.
I also think it would be better form to return None, since that's the real answer to the question. The caller can choose to explode on None, if required.
@Paul's accepted answer is best, but here's a little lateral-thinking variant, mostly for amusement and instruction purposes...:
>>> class X(object):
...     def __init__(self, pred): self.pred = pred
...     def __eq__(self, other): return self.pred(other)
...
>>> l = [8,10,4,5,7]
>>> def is_odd(x): return x % 2 != 0
...
>>> l.index(X(is_odd))
3
essentially, X's purpose is to change the meaning of "equality" from the normal one to "satisfies this predicate", thereby allowing the use of predicates in all kinds of situations that are defined as checking for equality -- for example, it would also let you code, instead of if any(is_odd(x) for x in l):, the shorter if X(is_odd) in l:, and so forth.
Worth using? Not when a more explicit approach like that taken by @Paul is just as handy (especially when changed to use the new, shiny built-in next function rather than the older, less appropriate .next method, as I suggest in a comment to that answer), but there are other situations where it (or other variants of the idea "tweak the meaning of equality", and maybe other comparators and/or hashing) may be appropriate. Mostly, worth knowing about the idea, to avoid having to invent it from scratch one day;-).
Not one single function, but you can do it pretty easily (this relies on Python 2, where map returns a list):
>>> test = lambda c: c == 'x'
>>> data = ['a', 'b', 'c', 'x', 'y', 'z', 'x']
>>> map(test, data).index(True)
3
>>>
If you don't want to evaluate the entire list at once you can use itertools, but it's not as pretty:
>>> from itertools import imap, ifilter
>>> from operator import itemgetter
>>> test = lambda c: c == 'x'
>>> data = ['a', 'b', 'c', 'x', 'y', 'z']
>>> ifilter(itemgetter(1), enumerate(imap(test, data))).next()[0]
3
>>>
Just using a generator expression is probably more readable than itertools though.
Note in Python3, map and filter return lazy iterators and you can just use:
from operator import itemgetter
test = lambda c: c == 'x'
data = ['a', 'b', 'c', 'x', 'y', 'z']
next(filter(itemgetter(1), enumerate(map(test, data))))[0] # 3
A variation on Alex's answer. This avoids having to type X every time you want to use is_odd or whichever predicate
>>> class X(object):
...     def __init__(self, pred): self.pred = pred
...     def __eq__(self, other): return self.pred(other)
...
>>> L = [8,10,4,5,7]
>>> is_odd = X(lambda x: x%2 != 0)
>>> L.index(is_odd)
3
>>> less_than_six = X(lambda x: x<6)
>>> L.index(less_than_six)
2
You could do this with a list comprehension:
l = [8,10,4,5,7]
filterl = [a for a in l if a % 2 != 0]
Then filterl will contain all members of the list fulfilling the expression a % 2 != 0. I would say this is a more elegant method...
Intuitive one-liner solution:
i = list(map(lambda value: value > 0, data)).index(True)
Explanation:
we use the map function to produce True or False for each element of our list, depending on whether it passes the condition in the lambda.
then we convert the map output to a list.
then, using the index function, we get the index of the first True, which is also the index of the first value passing the condition.