Error when trying to build logical parser - python

I have strings stored in a database and I want to convert them to Python expressions so that I can use them in an if statement. I will put these strings into a list and loop over them.
For example:
string = "#apple and @banana or @grapes"
I am able to convert this string, by replacing # with "a==" and @ with "b==", to this:
if a == apple and b == banana or b == grapes
# refers to a
@ refers to b
But when I use eval it throws the error "apple is not defined", because apple is not in quotes. So what I want is this:
if a == "apple" and b == "banana" or b == "grapes"
Is there any way I can do this?
The strings stored in the DB can have any format and can contain multiple and/or conditions.
A few examples:
string[0] = "#apple and @banana or @grapes"
string[1] = "#apple or @banana and @grapes"
string[2] = "#apple and @banana and @grapes"
There will be an else branch for the case where no condition is fulfilled.
Thanks

If I understand correctly, you are trying to set up something like a logical parser: you want to evaluate whether the expression can possibly be true or not.
#word or #otherword
is always true since it's possible to satisfy this with #=word for example, but
#word and #otherword
is not, since it is impossible to satisfy. You were going the way of using Python's built-in interpreter, but you seem to "make up" variables a and b, which do not exist. Just to give you a starting point for such a parser, here is one bad implementation:
from itertools import product
def test(string):
    var_dict = {}
    word_dict = {}
    cur_var = ord('a')
    expression = []
    for i,w in enumerate(string.split()):
        if not i%2:
            if w[0] not in var_dict:
                var_dict[w[0]] = chr(cur_var)
                word_dict[var_dict[w[0]]] = []
                cur_var += 1
            word_dict[var_dict[w[0]]].append(w[1:])
            expression.append('{}=="{}"'.format(var_dict[w[0]],w[1:]))
        else: expression.append(w)
    expression = ' '.join(expression)
    result = {}
    for combination in product(
            *([(v,w) for w in word_dict[v]] for v in word_dict)):
        exec(';'.join('{}="{}"'.format(v,w) for v,w in combination)+';value='+expression,globals(),result)
        if result['value']: return True
    return False
Beyond not checking whether the string is valid, this is not great, but it is a place to start grasping what you're after.
What this does is build your expression in the first loop, while saving a dict (var_dict) that maps the first character of each word (w[0]) to variables named from a to z (if you want more you need to do better than cur_var += 1). It also maps each such variable to all the words it was assigned in the original expression (word_dict).
The second loop runs a pretty bad algorithm: product gives every possible pairing of variable and matching word, and I iterate over each combination and assign the words to our fake variables in an exec call. There are plenty of reasons to avoid exec, but it is the easiest way to set the variables. If I find a combination that satisfies the expression, I return True, otherwise False. You cannot use eval if you want to assign things (or use if, for, while, etc.).
Note that this can be drastically improved by writing your own logical parser to read the string, though it will probably be longer.
#Evaluated as (#apple and @banana) or @grapes by Python - #=apple with @=banana satisfies this.
>>> test("#apple and @banana or @grapes")
True
#Evaluated as #apple or (@banana and @grapes) by Python - all combinations satisfy this, as @ does not matter.
>>> test("#apple or @banana and @grapes")
True
#Demands both @=banana and @=grapes - impossible.
>>> test("#apple and @banana and @grapes")
False
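As a rough illustration of the "own logical parser" idea, here is a minimal Python 3 sketch that avoids exec altogether. It assumes the terms are separated by single " and " / " or " keywords with no parentheses, and it relies on "and" binding tighter than "or", so the string is an "or" of "and" groups. The name satisfiable is made up for this example.

```python
def satisfiable(condition):
    """Return True if one word per marker character can make the condition true."""
    # "and" binds tighter than "or", so each " or " piece is an independent group.
    for clause in condition.split(" or "):
        assignment = {}  # marker character -> the word it must equal in this group
        consistent = True
        for term in clause.split(" and "):
            term = term.strip()
            marker, word = term[0], term[1:]
            # setdefault returns the word already stored for this marker, if any
            if assignment.setdefault(marker, word) != word:
                consistent = False  # the same marker would need two different words
                break
        if consistent:
            return True
    return False


print(satisfiable("#word or #otherword"))
print(satisfiable("#word and #otherword"))
```

A group is satisfiable exactly when no marker is asked to equal two different words, so no brute-force search over combinations is needed.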

I am not sure what you are asking here, but you can use the replace and split functions:
string = "#apple and #banana"
fruits = string.replace("#", "").split(" and ")
if a == fruits[0] and b == fruits[1]:
    pass  # condition fulfilled; a and b are assumed to be defined elsewhere
Hope this helps
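For the literal question (getting the words quoted), a regular expression can do the rewriting in one pass. This is only a sketch in Python 3: the MARKERS mapping, the use of @ as the second marker character, and the function names are assumptions made for the example. It still uses eval, so it should only be fed strings you trust.

```python
import re

# Hypothetical mapping from marker character to the variable it stands for.
MARKERS = {"#": "a", "@": "b"}


def to_expression(condition):
    """Turn '#apple and @banana' into 'a == "apple" and b == "banana"'."""
    def quote(match):
        marker, word = match.group(1), match.group(2)
        return '{} == "{}"'.format(MARKERS[marker], word)
    return re.sub(r"([#@])(\w+)", quote, condition)


def matches(condition, a, b):
    """Evaluate the rewritten condition for concrete values of a and b."""
    # Builtins are hidden and only a and b are visible to the expression.
    return bool(eval(to_expression(condition), {"__builtins__": {}}, {"a": a, "b": b}))


print(to_expression("#apple and @banana or @grapes"))
print(matches("#apple and @banana or @grapes", "apple", "banana"))
```

Looping over the stored strings and falling through to an else branch when matches returns False for all of them covers the "no condition fulfilled" case.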

Related

Python: Using bool() to select which vars are TRUE, then use those values to call function

First question ever! I've built a GUI which asks the user to input 2 of 5 possible values. Each pair of values (10 possible pairs) is used to run one of 10 different solution functions named Case_n, to which all five values (both zero and non-zero) are passed.
The problem I'm having is getting the bool() results stripped down to 2 digits without brackets, etc., and then placed into a variable used to create the name of the function to call.
I've run the logic, with TRUE values added to a list, then converted the list to a string so I could strip it down to just the numerals, saved the 2-digit string and added it to the Case_n name. Now, when I try to use the name to call the function, I get an error that a string is not callable. Please help . . .
s = 5 #vars. For this example, I've pre-loaded 2 of them
a = 15
l = 0
r = 0
e_deg = 0
ve = 0
case = []
if bool(s):
    case.append(1)
if bool(a):
    case.append(2)
if bool(l):
    case.append(3)
if bool(r):
    case.append(4)
if bool(e_deg):
    case.append(5)
nm = str(case) # placeholder to convert case to string
case_num = nm[1] + nm[4] # this returns 12 as a string
# create case_num var, using the string
Case = "Case_" + case_num
print("Case = ",Case) # Should be Case_12
def Case_12(s,a,l,r,e_deg,ve):
    print("Case_12 running")
Case(s,a,l,r,e_deg,ve)
You could just use eval(Case) but I advise against it as you are processing user input and it could be a security risk.
An easy way would be to build the following dict :
my_dict = {"Case_1": Case_1, ..., "Case_12" : Case_12}
And then, instead of calling Case, you would do
my_dict[Case](s,a,l,r,e_deg,ve)
You could also create a function :
def choose_case(my_case_as_str):
    my_case_dict = {"Case_1": Case_1, ..., "Case_12": Case_12}
    return my_case_dict[my_case_as_str]
And then call
choose_case(Case)(s,a,l,r,e_deg,ve)
By the way, you probably don't want your function and variable names to start with an uppercase letter. You also probably want a safer way to handle user input (for example, string.Template).
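To make the dictionary idea concrete, here is a small Python 3 sketch. Instead of building the name "Case_12" as a string, it uses a tuple of the positions of the non-zero inputs as the dictionary key, which skips the str() slicing entirely. The function bodies, the CASES table and dispatch are placeholders invented for the example; only two of the ten cases are filled in.

```python
def case_12(s, a, l, r, e_deg, ve):
    return "case_12 running"


def case_13(s, a, l, r, e_deg, ve):
    return "case_13 running"


# Key: 1-based positions of the two non-zero inputs. Value: the function to call.
CASES = {
    (1, 2): case_12,
    (1, 3): case_13,
}


def dispatch(s, a, l, r, e_deg, ve=0):
    values = (s, a, l, r, e_deg)
    # bool(v) is implied: zero is falsy, any non-zero number is truthy
    key = tuple(i for i, v in enumerate(values, start=1) if v)
    return CASES[key](s, a, l, r, e_deg, ve)


print(dispatch(5, 15, 0, 0, 0))
```

A missing combination raises KeyError, which is a convenient place to report invalid input back to the GUI.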

Working on basic recursion- trying to recursively look through a string for two characters

I'm super new to Python, and I'm trying to create a very simple function to be used in a larger map-coloring program.
The idea of the function is to have a set of regions with colors (r, g, b) assigned to them (string1), and then test whether any region touches another region of the same color by recursively looking through a set of region borders (string2) for variables and colors that match.
The input format would look like this:
("Ar, Bg, Cb", "AB,CB,CA")
This should return True, meaning no two regions of the same color touch.
Here's my code segment so far:
def finding_double_char_function(string1, string2):
    if string2=="":
        return True
    elif string2[0]+"r" and string2[1]+"r" in string1 or string1[::-1]:
        return False
    elif string2[0]+"g" and string2[1]+"g" in string1 or string1[::-1]:
        return False
    elif string2[0]+"b" and string2[1]+"b" in string1 or string1[::-1]:
        return False
    else:
        return finding_double_char_function(string1, (string2[3:]))
I keep getting false when I expected True. Can anyone help? Thanks a lot.
You have several problems in this, but your main problem is that you don't seem to know the operator precedence in an expression: "in" binds tighter than "and", which binds tighter than "or". What you've written is parsed like this:
elif ((string2[0]+"r" and (string2[1]+"r" in string1))
      or string1[::-1]):
In other words, you've used strings as boolean values, and the value you get from this is not what you expected. A non-empty string is always true, so as long as string1 is not empty, string1[::-1] makes the whole condition true and the function returns False at the first elif. I think what you're trying to do is to see whether each constructed string (such as "Ar") is in string1, either forward or backward.
"in" can join only one pair of operands; there's no distributive property of "and" and "or" over "in".
Here's the first part rewritten properly:
elif (string2[0]+"r" in string1) and (string2[1]+"r" in string1):
Does this get you going?
Also, stick in print statements to trace your execution and print out useful values along the way.
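A quick way to see the difference is to evaluate both forms on the sample input from the question. This is just a Python 3 demonstration; the names original and fixed are made up here.

```python
string1 = "Ar, Bg, Cb"
string2 = "AB,CB,CA"

# The condition exactly as written in the question.
original = string2[0] + "r" and string2[1] + "r" in string1 or string1[::-1]

# The condition with each "in" test spelled out.
fixed = (string2[0] + "r" in string1) and (string2[1] + "r" in string1)

print(repr(original))  # the reversed string1, which counts as true
print(fixed)           # False: "Br" is not in string1
```

The original expression does not even produce a bool: "or" hands back the reversed string, and any non-empty string is truthy in an elif.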
If I understood correctly, your problem could be solved like this:
def intersect(str1, str2):
    if (not str2):
        return True
    if (str1[str1.find(str2[0]) + 1] == str1[str1.find(str2[1]) + 1]):
        return False
    else:
        return intersect(str1, str2[3:])
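If the string indexing gets hard to follow, one alternative is to parse the region string into a dictionary first and recurse over the list of borders. This is a Python 3 sketch under the assumption that every region and every color is a single character, as in the sample input; the function names are invented for the example.

```python
def no_same_color_neighbors(regions, borders):
    """Return True if no border joins two regions of the same color."""
    # "Ar, Bg, Cb" -> {"A": "r", "B": "g", "C": "b"}
    colors = {}
    for item in regions.split(","):
        item = item.strip()
        if item:
            colors[item[0]] = item[1]

    def check(pairs):
        if not pairs:
            return True  # every border was checked without a clash
        first, rest = pairs[0], pairs[1:]
        if colors[first[0]] == colors[first[1]]:
            return False  # two touching regions share a color
        return check(rest)

    return check([pair.strip() for pair in borders.split(",") if pair.strip()])


print(no_same_color_neighbors("Ar, Bg, Cb", "AB,CB,CA"))
```

The recursion has the same shape as in the question (check the first border, then recurse on the rest), but the color lookup no longer depends on substring searches.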

Python: Create dynamic loop based on pattern

I'm still learning to code in Python.
I want to generate strings based on a pattern; the only way I know is by using for loops.
In the example code below, I create a loop for the "vcvcv" pattern (c=consonant, v=vowel).
How can I create a dynamic loop based on a pattern that I provide to the script?
E.g. if the pattern is "cvcvc", the loop should be built to produce the matching strings.
Help appreciated.
Thanks.
#!/bin/env python
vowel="aeiou"
consonant="bcdfghjklmnpqrstvwxyz"
lvowel=list(vowel)
lconsonant=list(consonant)
# pattern for "vcvcv" = ababa
for a in lvowel:
    for b in lconsonant:
        for c in lvowel:
            for d in lconsonant:
                for e in lvowel:
                    myname=a+b+c+d+e
                    print myname
# pattern for "cvcvc" = babab
# how to make the loop dynamic based on pattern ?
Something like this should work:
import itertools
mapping = {
    'v': 'aeiou',
    'c': 'bcdfghjklmnpqrstvwxyz'
}
pattern = 'vcvcv'
for thing in itertools.product(*map(mapping.get, pattern)):
    print ''.join(thing)
Here's roughly how it works:
map(mapping.get, pattern) just converts 'vcv' to ['aeiou', 'bcdfghjklmnpqrstvwxyz', 'aeiou']. It replaces each letter with the corresponding string of characters.
*map(...) unpacks the argument list.
itertools.product() is like a bunch of nested for loops.
''.join(thing) joins the tuple of characters into a single string.
If you want to do this without itertools, you'll have to make a recursive function.
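The same idea can be wrapped in a generator so the caller decides what to do with each string. This is a Python 3 sketch; ALPHABETS and words are names chosen for the example, not part of the answer above.

```python
import itertools

ALPHABETS = {
    "v": "aeiou",
    "c": "bcdfghjklmnpqrstvwxyz",
}


def words(pattern):
    """Yield every string that matches the pattern, e.g. 'cvc'."""
    pools = [ALPHABETS[symbol] for symbol in pattern]  # one pool of letters per position
    for letters in itertools.product(*pools):  # behaves like len(pattern) nested loops
        yield "".join(letters)


for word in itertools.islice(words("cvcvc"), 3):
    print(word)
```

Because the generator is lazy, taking only the first few results with itertools.islice does not build the whole list.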
If you're just getting into programming and want to see a more general solution than the itertools one listed above, then recursion is your best bet, allowing you to arbitrarily nest loops.
There is a slight complication here, which you could use Python generators for, or else use simpler (but messier) constructs. An example of the latter is shown below.
Something like
def continuePattern(pat, strSoFar):
    if pat == '':
        print strSoFar
    elif pat[0] == 'v':
        for c in lvowel:
            continuePattern(pat[1:], strSoFar + c)
    elif pat[0] == 'c':
        for c in lconsonant:
            continuePattern(pat[1:], strSoFar + c)
This is one of several possible implementations, and one of the two most naive ones I can imagine.
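For comparison, the generator variant mentioned above could look roughly like this in Python 3 (yield from needs Python 3.3 or later). The names ALPHABETS and generate are assumptions made for the sketch.

```python
ALPHABETS = {
    "v": "aeiou",
    "c": "bcdfghjklmnpqrstvwxyz",
}


def generate(pattern, prefix=""):
    """Recursively yield every string matching the pattern."""
    if not pattern:
        yield prefix  # nothing left to expand: the prefix is a finished string
        return
    for letter in ALPHABETS[pattern[0]]:
        # Each recursion level plays the role of one nested for loop.
        yield from generate(pattern[1:], prefix + letter)


print(sum(1 for _ in generate("vcv")))  # number of strings for the pattern "vcv"
```

Yielding instead of printing keeps the recursion reusable: the caller can print, count, or filter the results.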
A somewhat more elaborate but easily customizable version for the first n permutations is given below,
def gen_pattern( seq, op = "" ):
    vowel="aeiou"
    consonant="bcdfghjklmnpqrstvwxyz"
    lvowel=list(vowel)
    lconsonant=list(consonant)
    if ( not seq ):
        print op
        return
    if ( seq[0] == 'v' ):
        for v in lvowel:
            gen_pattern( seq[1:], op+v )
    elif ( seq[0] == 'c' ):
        for c in lconsonant:
            gen_pattern( seq[1:],op+c )
if __name__ == "__main__":
    gen_pattern("vcvcv")
I agree it is more work though!

Is there a better way to create dynamic functions on the fly, without using string formatting and exec?

I have written a little program that parses log files of anywhere between a few thousand lines to a few hundred thousand lines. For this, I have a function in my code which parses every line, looks for keywords, and returns the keywords with the associated values.
These log files consist of little sections. Each section has some values I'm interested in and want to store as a dictionary.
I have simplified the sample below, but the idea is the same.
My original function looked like this. It gets called between 100 and 10000 times per run, so you can understand why I want to optimize it:
def parse_txt(f):
    d = {}
    for line in f:
        if not line:
            pass
        elif 'apples' in line:
            d['apples'] = True
        elif 'bananas' in line:
            d['bananas'] = True
        elif line.startswith('End of section'):
            return d
f = open('fruit.txt','r')
d = parse_txt(f)
print d
The problem I run into is that I have a lot of conditionals in my program, because it checks for a lot of different things and stores the values for them. When checking every line for anywhere between 0 and 30 keywords, this gets slow fast. I don't want to do that, because I'm not interested in everything every time I run the program. I'm only ever interested in 5-6 keywords, but I'm parsing every line for 30 or so keywords.
In order to optimize it, I wrote the following by using exec on a string:
def make_func(args):
    func_str = """
def parse_txt(f):
    d = {}
    for line in f:
        if not line:
            pass
    """
    if 'apples' in args:
        func_str += """
        elif 'apples' in line:
            d['apples'] = True
    """
    if 'bananas' in args:
        func_str += """
        elif 'bananas' in line:
            d['bananas'] = True
    """
    func_str += """
        elif line.startswith('End of section'):
            return d"""
    print func_str
    exec(func_str)
    return parse_txt
args = ['apples','bananas']
fun = make_func(args)
f = open('fruit.txt','r')
d = fun(f)
print d
This solution works great, because it speeds up the program by an order of magnitude and it is relatively simple. Depending on the arguments I put in, it will give me the first function, but without checking for all the stuff I don't need.
For example, if I give it args=['bananas'], it will not check for 'apples', which is exactly what I want to do.
This makes it much more efficient.
However, I do not like this solution very much, because it is not very readable, it is difficult to change and it is very error-prone whenever I modify something. Besides that, it feels a little bit dirty.
I am looking for alternative or better ways to do this. I have tried using a set of functions to call on every line, and while this worked, it did not offer me the speed increase that my current solution gives me, because it adds a few function calls for every line. My current solution doesn't have this problem, because it only has to be called once at the start of the program. I have read about the security issues with exec and eval, but I do not really care about that, because I'm the only one using it.
EDIT:
I should add that, for the sake of clarity, I have greatly simplified my function. From the answers I understand that I didn't make this clear enough.
I do not check for keywords in a consistent way. Sometimes I need to check for 2 or 3 keywords in a single line, sometimes just for 1. I also do not treat the result in the same way. For example, sometimes I extract a single value from the line I'm on, sometimes I need to parse the next 5 lines.
I would try defining a list of keywords you want to look for ("keywords") and doing this:
for word in keywords:
    if word in line:
        d[word] = True
Or, using a list comprehension:
dict([(word,True) for word in keywords if word in line])
Unless I'm mistaken this shouldn't be much slower than your version.
No need to use eval here, in my opinion. You're right in that an eval based solution should raise a red flag most of the time.
Edit: as you have to perform a different action depending on the keyword, I would just define function handlers and then use a dictionary like this:
def keyword_handler_word1(line):
    (...)
(...)
def keyword_handler_wordN(line):
    (...)
keyword_handlers = { 'word1': keyword_handler_word1, (...), 'wordN': keyword_handler_wordN }
Then, in the actual processing code:
for word in keywords:
    # keyword_handlers[word] is a function
    keyword_handlers[word](line)
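Filled in with toy handlers, the handler-dictionary approach might look like the Python 3 sketch below. The two handlers, the "count:" line format and parse_section are all invented for illustration; the point is that only the handlers for the requested keywords are looked up, once, before the loop.

```python
def handle_apples(line, result):
    result["apples"] = True


def handle_count(line, result):
    # Hypothetical line format "count: <number>"
    result["count"] = int(line.split(":", 1)[1])


HANDLERS = {"apples": handle_apples, "count": handle_count}


def parse_section(lines, keywords):
    # Resolve the handlers once, so the per-line loop only touches active keywords.
    active = [(word, HANDLERS[word]) for word in keywords]
    result = {}
    for line in lines:
        if line.startswith("End of section"):
            break
        for word, handler in active:
            if word in line:
                handler(line, result)
    return result


sample = ["apples here", "count: 3", "End of section", "count: 9"]
print(parse_section(sample, ["count"]))
```

Each matching line still costs one function call, which is the overhead the question mentions; whether that matters is best settled by timing it on a real log file.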
Use regular expressions. Something like the following:
>>> import re
>>> lookup = {'a': 'apple', 'b': 'banane'} # keyword: characters to look for
>>> pattern = '|'.join('(?P<%s>%s)' % (key, val) for key, val in lookup.items())
>>> re.search(pattern, 'apple aaa').groupdict()
{'a': 'apple', 'b': None}
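Applied to the log-parsing case, one option is to compile a single alternation from only the keywords you care about and scan each line once. This is a Python 3 sketch that assumes a non-empty keyword list and plain substring matching; make_scanner and scan are made-up names.

```python
import re


def make_scanner(keywords):
    """Build a function that records which of the keywords occur in a section."""
    # re.escape keeps keywords containing regex metacharacters literal.
    pattern = re.compile("|".join(re.escape(word) for word in keywords))

    def scan(lines):
        found = {}
        for line in lines:
            if line.startswith("End of section"):
                break
            for match in pattern.finditer(line):
                found[match.group(0)] = True
        return found

    return scan


scan = make_scanner(["apples", "bananas"])
print(scan(["I like apples", "and bananas", "End of section", "cherries"]))
```

The pattern is compiled once per run, so adding or removing keywords changes the setup cost rather than the per-line code.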
def create_parser(fruits):
    def parse_txt(f):
        d = {}
        for line in f:
            if not line:
                pass
            elif line.startswith('End of section'):
                return d
            else:
                for testfruit in fruits:
                    if testfruit in line:
                        d[testfruit] = True
    return parse_txt
This is what you want: it creates a test function dynamically.
Depending on what you really want to do, it is, of course, possible to remove one level of complexity and define
def parse_txt(f, fruits):
    [...]
or
def parse_txt(fruits, f):
    [...]
and work with functools.partial.
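A minimal Python 3 sketch of the functools.partial variant, using the (fruits, f) argument order so the keyword list can be bound first. The body of parse_txt is a guess at what the elided [...] would contain.

```python
from functools import partial


def parse_txt(fruits, f):
    d = {}
    for line in f:
        if line.startswith("End of section"):
            return d
        for fruit in fruits:
            if fruit in line:
                d[fruit] = True
    return d  # reached only if the section marker is missing


# Bind the keyword list once; the result takes just the file (or any iterable of lines).
parse_bananas = partial(parse_txt, ["bananas"])

print(parse_bananas(["apples", "bananas", "End of section"]))
```

partial saves writing the closure by hand; it does not by itself remove the inner loop over the keywords.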
You can use a set, like this:
fruit = set(['cocos', 'apple', 'lime'])
need = set(['cocos', 'pineapple'])
need.intersection(fruit)
This returns a set containing only 'cocos'.

Finding partial strings in a list of strings - python

I am trying to check if a user is a member of an Active Directory group, and I have this:
ldap.set_option(ldap.OPT_REFERRALS, 0)
try:
    con = ldap.initialize(LDAP_URL)
    con.simple_bind_s(userid+"@"+ad_settings.AD_DNS_NAME, password)
    ADUser = con.search_ext_s(ad_settings.AD_SEARCH_DN, ldap.SCOPE_SUBTREE, \
        "sAMAccountName=%s" % userid, ad_settings.AD_SEARCH_FIELDS)[0][1]
except ldap.LDAPError:
    return None
ADUser is a dictionary whose values are lists of strings:
{'givenName': ['xxxxx'],
 'mail': ['xxxxx@example.com'],
 'memberOf': ['CN=group1,OU=Projects,OU=Office,OU=company,DC=domain,DC=com',
              'CN=group2,OU=Projects,OU=Office,OU=company,DC=domain,DC=com',
              'CN=group3,OU=Projects,OU=Office,OU=company,DC=domain,DC=com',
              'CN=group4,OU=Projects,OU=Office,OU=company,DC=domain,DC=com'],
 'sAMAccountName': ['myloginid'],
 'sn': ['Xxxxxxxx']}
Of course in the real world the group names are verbose and of varied structure, and users will belong to tens or hundreds of groups.
If I get the list of groups out as ADUser.get('memberOf')[0], what is the best way to check if any members of a separate list exist in the main list?
For example, the check list would be ['group2', 'group16'] and I want to get a true/false answer as to whether any of the smaller list exist in the main list.
If the format example you give is somewhat reliable, something like:
import re
grps = re.compile(r'CN=(\w+)').findall
def anyof(short_group_list, adu):
    all_groups_of_user = set(g for gs in adu.get('memberOf',()) for g in grps(gs))
    return sorted(all_groups_of_user.intersection(short_group_list))
where you pass your list such as ['group2', 'group16'] as the first argument, your ADUser dict as the second argument; this returns an alphabetically sorted list (possibly empty, meaning "none") of the groups, among those in short_group_list, to which the user belongs.
It's probably not much faster to return just a bool, but, if you insist, changing the second statement of the function to:
return any(g for g in short_group_list if g in all_groups_of_user)
might possibly save a certain amount of time in the "true" case (since any short-circuits) though I suspect not in the "false" case (where the whole list must be traversed anyway). If you care about the performance issue, best is to benchmark both possibilities on data that's realistic for your use case!
If performance isn't yet good enough (and a bool yes/no is sufficient, as you say), try reversing the looping logic:
def anyof_v2(short_group_list, adu):
    gset = set(short_group_list)
    return any(g for gs in adu.get('memberOf',()) for g in grps(gs) if g in gset)
any's short-circuit abilities might prove more useful here (at least in the "true" case, again -- because, again, there's no way to give a "false" result without examining ALL the possibilities anyway!-).
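Put together with sample data shaped like the dictionary in the question, the short-circuiting check could be tried out as below. This is a Python 3 sketch: it assumes the memberOf values are already str (if your LDAP library hands back bytes, decode them first) and that CN values contain no escaped commas. The names find_cns and is_member_of_any are made up.

```python
import re

# Capture the value of every CN= component in a distinguished name.
find_cns = re.compile(r"CN=([^,]+)").findall


def is_member_of_any(wanted_groups, ad_user):
    """Return True if the user belongs to at least one of wanted_groups."""
    wanted = set(wanted_groups)
    return any(
        cn in wanted
        for dn in ad_user.get("memberOf", ())
        for cn in find_cns(dn)
    )


sample_user = {
    "memberOf": [
        "CN=group1,OU=Projects,OU=Office,OU=company,DC=domain,DC=com",
        "CN=group2,OU=Projects,OU=Office,OU=company,DC=domain,DC=com",
    ],
    "sAMAccountName": ["myloginid"],
}

print(is_member_of_any(["group2", "group16"], sample_user))
```

Only the CN components are compared, so an OU that happens to share a name with a group on the check list does not count as membership.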
You can use set intersection (& operator) once you parse the group list out. For example:
> memberOf = 'CN=group1,OU=Projects,OU=Office,OU=company,DC=domain,DC=com'
> groups = [token.split('=')[1] for token in memberOf.split(',')]
> groups
['group1', 'Projects', 'Office', 'company', 'domain', 'com']
> checklist1 = ['group1', 'group16']
> set(checklist1) & set(groups)
set(['group1'])
> checklist2 = ['group2', 'group16']
> set(checklist2) & set(groups)
set([])
Note that a conditional evaluation on a set works the same as for lists and tuples. True if there are any elements in the set, False otherwise. So, "if set(checklist2) & set(groups): ..." would not execute since the condition evaluates to False in the above example (the opposite is true for the checklist1 test).
Also see:
http://docs.python.org/library/sets.html
