Okay,
I am working on a linguistic prover, and I have a series of tuples that represent statements or expressions.
Sometimes, I end up with an embedded "and" statement, and I am trying to "bubble" it up to the surface.
I want to take a tuple that looks like this:
('pred', ('and', 'a', 'b'), 'x')
or, for a simpler example:
( ('and', 'a', 'b'), 'x')
and I want to separate out the ands into two statements such that the top one results in:
('and', ('pred', 'a', 'x',), ('pred', 'b', 'x') )
and the bottom one in:
('and', ('a', 'x'), ('b', 'x') )
I've tried a lot of things, but it always turns out to be quite ugly code. And I am having problems if there are more nested tuples such as:
('not', ('p', ('and', 'a', 'b'), 'x') )
which I want to result in
('not', ('and', ('p', 'a', 'x',), ('p', 'b', 'x') ) )
So basically, the problem is trying to replace a nested tuple with the value of the entire tuple, but the nested one modified. It's very ugly. :(
I'm not super duper python fluent so it gets very convoluted with lots of for loops that I know shouldn't be there. :(
Any help is much appreciated!
This recursive approach seems to work.
def recursive_bubble_ands_up(expr):
    """ Bubble all 'and's in the expression up one level, no matter how nested.
    """
    # if the expression is just a single thing, like 'a', just return it.
    if is_atomic(expr):
        return expr
    # if it has an 'and' in one of its subexpressions
    # (but the subexpression isn't just the 'and' operator itself)
    # rewrite it to bubble the and up
    and_clauses = [('and' in subexpr and not is_atomic(subexpr))
                   for subexpr in expr]
    if any(and_clauses):
        first_and_clause = and_clauses.index(True)
        expr_before_and = expr[:first_and_clause]
        expr_after_and = expr[first_and_clause+1:]
        and_parts = expr[first_and_clause][1:]
        expr = ('and',) + tuple([expr_before_and + (and_part,) + expr_after_and
                                 for and_part in and_parts])
    # apply recursive_bubble_ands_up to all the elements and return result
    return tuple([recursive_bubble_ands_up(subexpr) for subexpr in expr])

def is_atomic(expr):
    """ Return True if expr is an undividable component
    (operator or value, like 'and' or 'a'). """
    # not sure how this should be implemented in the real case,
    # if you're not really just working on strings
    return isinstance(expr, str)
Works on all your examples:
>>> tmp.recursive_bubble_ands_up(('pred', ('and', 'a', 'b'), 'x'))
('and', ('pred', 'a', 'x'), ('pred', 'b', 'x'))
>>> tmp.recursive_bubble_ands_up(( ('and', 'a', 'b'), 'x'))
('and', ('a', 'x'), ('b', 'x'))
>>> tmp.recursive_bubble_ands_up(('not', ('p', ('and', 'a', 'b'), 'x') ))
('not', ('and', ('p', 'a', 'x'), ('p', 'b', 'x')))
Note that this isn't aware of any other "special" operators, like not - as I said in my comment, I'm not sure what it should do with that. But it should give you something to start with.
Edit: Oh, oops, I just realized this only performs a single "bubble-up" operation, for example:
>>> tmp.recursive_bubble_ands_up(((('and', 'a', 'b'), 'x'), 'y' ))
(('and', ('a', 'x'), ('b', 'x')), 'y')
>>> tmp.recursive_bubble_ands_up((('and', ('a', 'x'), ('b', 'x')), 'y'))
('and', (('a', 'x'), 'y'), (('b', 'x'), 'y'))
So what you really want is probably to apply it in a while loop until the output is identical to the input, if you want your 'and' to bubble up from however many levels, like this:
def repeat_bubble_until_finished(expr):
    """ Repeat recursive_bubble_ands_up until there's no change
    (i.e. until all possible bubbling has been done).
    """
    while True:
        old_expr = expr
        expr = recursive_bubble_ands_up(old_expr)
        if expr == old_expr:
            break
    return expr
On the other hand, doing that shows that actually my program breaks your 'not' example, because it bubbles the 'and' ahead of the 'not', which you said you wanted left alone:
>>> tmp.recursive_bubble_ands_up(('not', ('p', ('and', 'a', 'b'), 'x')))
('not', ('and', ('p', 'a', 'x'), ('p', 'b', 'x')))
>>> tmp.repeat_bubble_until_finished(('not', ('p', ('and', 'a', 'b'), 'x')))
('and', ('not', ('p', 'a', 'x')), ('not', ('p', 'b', 'x')))
So I suppose you'd have to build in a special case for 'not' into recursive_bubble_ands_up, or just apply your not-handling function before running mine, and insert it before recursive_bubble_ands_up in repeat_bubble_until_finished so they're applied in alternation.
All right, I really should sleep now.
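One way to fold the special case for 'not' into a single pass is sketched below. It is only an illustration under assumptions: the name bubble_ands and the barriers parameter are made up here, atoms are assumed to be plain strings, and an operator is assumed to sit in the first position of its tuple. Children are rewritten first, so an 'and' can rise through any number of levels, but it is never lifted past an operator listed in barriers.

```python
def is_atomic(expr):
    """Assumption: atoms (operators and values) are plain strings."""
    return isinstance(expr, str)


def bubble_ands(expr, barriers=("not",)):
    """Bubble nested 'and' tuples upward, stopping below any barrier operator.

    Hypothetical helper, not part of the answer above.
    """
    if is_atomic(expr) or not expr:
        return expr
    # Rewrite the children first so inner 'and's reach this level.
    expr = tuple(bubble_ands(sub, barriers) for sub in expr)
    # An 'and' node or a barrier node keeps its children where they are.
    if expr[:1] == ("and",) or expr[0] in barriers:
        return expr
    for i, sub in enumerate(expr):
        if not is_atomic(sub) and sub[:1] == ("and",):
            # Distribute the surrounding tuple over each part of the 'and'.
            parts = [expr[:i] + (part,) + expr[i + 1:] for part in sub[1:]]
            return ("and",) + tuple(bubble_ands(p, barriers) for p in parts)
    return expr


print(bubble_ands(('pred', ('and', 'a', 'b'), 'x')))
# ('and', ('pred', 'a', 'x'), ('pred', 'b', 'x'))
print(bubble_ands(('not', ('p', ('and', 'a', 'b'), 'x'))))
# ('not', ('and', ('p', 'a', 'x'), ('p', 'b', 'x')))
print(bubble_ands(((('and', 'a', 'b'), 'x'), 'y')))
# ('and', (('a', 'x'), 'y'), (('b', 'x'), 'y'))
```

Passing barriers=() lets the 'and' rise past 'not' as well, which matches what the repeated version does.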
Related
Here is my list:
[(('A', 'B'), ('C', 'D')), (('E', 'F'), ('G', 'H'))]
Basically, I'd like to get:
[('A', 'C'), ('E', 'G')]
So, I'd like to select first elements from the lowest-level lists and build mid-level lists with them.
====================================================
Additional explanation below:
I could just zip them by
list(zip([w[0][0] for w in list1], [w[1][0] for w in list1]))
But later I'd like to add a condition: the second elements in the lowest level lists must be 'B' and 'D' respectively, so the final outcome should be:
[('A', 'C')] # ('E', 'G') must be filtered out
I'm a beginner, but can't find the case anywhere... Would be grateful for help.
I'd do it the following way
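# named list1 (as in the question) so the built-in list is not shadowed
list1 = [(('A', 'B'), ('C', 'D')), (('E', 'F'), ('G', 'H'))]
out = []
for i in list1:
    listAux = []
    for j in i:
        # keep the first element of each lowest-level tuple
        listAux.append(j[0])
    out.append((listAux[0], listAux[1]))
print(out)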
I hope that's what you're looking for.
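The same selection, plus the extra condition from the question, can also be written with comprehensions. This is a minimal sketch; the names firsts, wanted_seconds and filtered are made up for the example.

```python
list1 = [(('A', 'B'), ('C', 'D')), (('E', 'F'), ('G', 'H'))]

# First element of every lowest-level tuple, grouped per mid-level tuple.
firsts = [tuple(inner[0] for inner in group) for group in list1]
print(firsts)  # [('A', 'C'), ('E', 'G')]

# Keep a group only if its second elements are exactly 'B' and 'D'.
wanted_seconds = ('B', 'D')
filtered = [
    tuple(inner[0] for inner in group)
    for group in list1
    if tuple(inner[1] for inner in group) == wanted_seconds
]
print(filtered)  # [('A', 'C')]
```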
I am trying to write a Python (at least initially) function to generate all subsequences of some length k (where k > 0). Since I only need unique subsequences, I am storing both the subsequences and partial subsequences in sets. The following, adapted from a colleague, is the best I could come up with. It seems...overly complex...and like I should be able to abuse itertools, or recursion, to do what I want to do. Can anyone do better?
from typing import Set, Tuple
def subsequences(string: str, k: int) -> Set[Tuple[str, ...]]:
    if len(string) < k:
        return set()
    start = tuple(string[:k])
    result = {start}
    prev_state = [start]
    curr_state = set()
    for s in string[k:]:
        for p in prev_state:
            for i in range(k):
                new = p[:i] + p[i + 1 :] + (s,)
                curr_state.add(new)
        result.update(curr_state)
        prev_state = list(curr_state)
        curr_state.clear()
    return result
(For context, I am interested in induction of k-strictly piecewise languages, an efficiently learnable subclass of the regular languages, and the grammar can be characterized by all licit k-subsequences.
Ultimately I am also thinking about doing this in C++, where std::make_tuple isn't quite as powerful as Python tuple.)
You want the set of r-combinations of n items (without replacement); it has at most (n choose r) elements.
Given
import itertools as it
import more_itertools as mit
Code
Option 1 - itertools.combinations
set(it.combinations("foo", 2))
# {('f', 'o'), ('o', 'o')}
set(it.combinations("foobar", 3))
# {('b', 'a', 'r'),
# ('f', 'a', 'r'),
# ('f', 'b', 'a'),
# ('f', 'b', 'r'),
# ('f', 'o', 'a'),
# ('f', 'o', 'b'),
# ('f', 'o', 'o'),
# ('f', 'o', 'r'),
# ('o', 'a', 'r'),
# ('o', 'b', 'a'),
# ('o', 'b', 'r'),
# ('o', 'o', 'a'),
# ('o', 'o', 'b'),
# ('o', 'o', 'r')}
Option 2 - more_itertools.distinct_combinations
list(mit.distinct_combinations("foo", 2))
# [('f', 'o'), ('o', 'o')]
list(mit.distinct_combinations("foobar", 3))
# [('f', 'o', 'o'),
# ('f', 'o', 'b'),
# ('f', 'o', 'a'),
# ('f', 'o', 'r'),
# ('f', 'b', 'a'),
# ('f', 'b', 'r'),
# ('f', 'a', 'r'),
# ('o', 'o', 'b'),
# ('o', 'o', 'a'),
# ('o', 'o', 'r'),
# ('o', 'b', 'a'),
# ('o', 'b', 'r'),
# ('o', 'a', 'r'),
# ('b', 'a', 'r')]
Both options yield the same (unordered) output. However:
Option 1 generates every combination, duplicates included, and relies on set() to discard them
Option 2 does not compute duplicate intermediates
Install more_itertools via > pip install more_itertools.
See also a rough implementation of itertools.combinations written in Python.
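Option 1 can be wrapped so that it keeps the signature of the function in the question. This is a sketch, not a drop-in guarantee: it assumes that "subsequence" means characters taken in their original order, which is what itertools.combinations produces.

```python
import itertools
from typing import Set, Tuple


def subsequences(string: str, k: int) -> Set[Tuple[str, ...]]:
    """Return the unique length-k subsequences of string as tuples."""
    # combinations() yields nothing when k > len(string), so the
    # explicit length check from the question is not needed.
    return set(itertools.combinations(string, k))


print(subsequences("abc", 2) == {('a', 'b'), ('a', 'c'), ('b', 'c')})  # True
print(subsequences("foo", 2) == {('f', 'o'), ('o', 'o')})  # True
print(subsequences("ab", 3))  # set()
```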
I have a problem with tuples in python. I have the following tuple list:
gamma2 = [[('p', 'u'), ('r', 'w')], [('p', 'w'), ('r', 'u')], [('r', 'u'), ('p', 'w')],[('r', 'w'), ('p', 'u')]]
Now, the parts [('p', 'u'), ('r', 'w')] and [('r', 'w'), ('p', 'u')] are the same for me and also [('p', 'w'), ('r', 'u')] and [('r', 'u'), ('p', 'w')].
So I want to delete one of these double entries in my list, but I don't know how.
I've tried with hash tables and set, but the problem is, that this tuple pair is not the same for the hash table and it will be added by gamma2.add().
So do you have an idea?
You can try to use tuple and set:
gamma2 = [[('p', 'u'), ('r', 'w')], [('p', 'w'), ('r', 'u')], [('r', 'u'), ('p', 'w')],[('r', 'w'), ('p', 'u')]]
set([tuple(set(x)) for x in gamma2])
In some cases it is better to use sorted instead of the inner set, because a set has no guaranteed order (thanks #rockikz):
set([tuple(sorted(x)) for x in gamma2])
A third solution is to use frozenset:
set([frozenset(x) for x in gamma2])
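The first version gives a result like this (the order inside each tuple can vary, since sets are unordered):
{(('p', 'w'), ('r', 'u')), (('r', 'w'), ('p', 'u'))}
How it works:
a set is a collection of unique values
the inner set(x) normalizes each list of pairs so that the order no longer matters
tuple(...) is only there to make each item hashable, so it can go into the outer set
the outer set keeps only the unique values
If you want lists in the result, like in the input, you can convert back: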
[list(y) for y in set([tuple(set(x)) for x in gamma2])]
will give you
[[('r', 'w'), ('p', 'u')], [('p', 'w'), ('r', 'u')]]
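If the original order of the list matters, the frozenset idea can be combined with a plain loop. A small sketch (the names seen and unique are made up for the example): the first occurrence of each pairing is kept and later reorderings are skipped.

```python
gamma2 = [[('p', 'u'), ('r', 'w')], [('p', 'w'), ('r', 'u')],
          [('r', 'u'), ('p', 'w')], [('r', 'w'), ('p', 'u')]]

seen = set()
unique = []
for pairs in gamma2:
    # frozenset ignores order and is hashable, so it works as a lookup key.
    key = frozenset(pairs)
    if key not in seen:
        seen.add(key)
        unique.append(pairs)

print(unique)
# [[('p', 'u'), ('r', 'w')], [('p', 'w'), ('r', 'u')]]
```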
Bigram is a list which looks like-
[('a', 'b'), ('b', 'b'), ('b', 'b'), ('b', 'c'), ('c', 'c'), ('c', 'c'), ('c', 'd'), ('d', 'd'), ('d', 'e')]
Now I am trying to write each element of the list as a separate line in a file with this code-
bigram = list(nltk.bigrams(s.split()))
outfile1.write("%s" % ''.join(ele) for ele in bigram)
but I am getting this error :
TypeError: write() argument must be str, not generator
I want the result as in file-
('a', 'b')
('b', 'b')
('b', 'b')
('b', 'c')
('c', 'c')
......
You're passing a generator expression to write, which needs a string.
If I understand correctly, you want to write one tuple representation per line.
You can achieve that with:
outfile1.write("".join('{}\n'.format(ele) for ele in bigram))
or
outfile1.writelines('{}\n'.format(ele) for ele in bigram)
The second version passes a generator expression to writelines, which avoids building the whole string in memory before writing it (and looks more like your attempt).
It produces a file with this content:
('a', 'b')
('b', 'b')
('b', 'b')
('b', 'c')
('c', 'c')
('c', 'c')
('c', 'd')
('d', 'd')
('d', 'e')
Try this:
outfile1.writelines("{}\n".format(ele) for ele in bigram)
This is a question of how the expression is parsed.
The argument you pass to write is a single generator expression, equivalent to:
(("%s" % ''.join(ele)) for ele in bigram)
It is not parsed like this, which would at least produce a str:
"%s" % (''.join(ele) for ele in bigram)
So write receives a generator, hence the TypeError. You need to call write on each element the generator yields.
If you want to write each pair in a separate line, you have to add line separators explicitly. The easiest, to my mind, is an explicit loop:
for pair in bigram:
    # %r keeps the quotes, so each line reads ('a', 'b')
    outfile1.write("(%r, %r)\n" % pair)
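For completeness, here is a self-contained sketch of the writelines approach that runs without nltk. The zip call is only a stand-in for nltk.bigrams, and the temporary file path is an assumption made so the example does not touch real files.

```python
import os
import tempfile

# Stand-in for list(nltk.bigrams(s.split())): pair each word with its successor.
words = "a b b b c c c d d e".split()
bigram = list(zip(words, words[1:]))

# Hypothetical output location inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "bigrams.txt")
with open(path, "w") as outfile1:
    # One formatted tuple per line; writelines does not add newlines itself.
    outfile1.writelines("{}\n".format(ele) for ele in bigram)

with open(path) as infile:
    lines = infile.read().splitlines()

print(lines[0])    # ('a', 'b')
print(len(lines))  # 9
```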
I am writing a webapp in Python to convert a string with rot13, such that rot13(a)=n, rot13(b)=o and so on. But the code only works for letters after m. For letters a to m there is no change. What am I doing wrong? Here is my code:
import webapp2
import cgi

form = """<form method="post">
Type a string to convert:
<br>
<textarea name="text">%(text)s</textarea>
<br>
<input type="submit">
</form> """

class MainHandler(webapp2.RequestHandler):
    def write_form(self, text=""):
        self.response.out.write(form % {"text": cgi.escape(text)})

    def get(self):
        self.write_form()

    def post(self):
        new_text = self.request.get('text')
        text = ""
        for x in range(len(new_text)):
            text += convert(new_text[x])
        self.write_form(text)

def convert(t):
    for (i, o) in (('a', 'n'),
                   ('b', 'o'),
                   ('c', 'p'),
                   ('d', 'q'),
                   ('e', 'r'),
                   ('f', 's'),
                   ('g', 't'),
                   ('h', 'u'),
                   ('i', 'v'),
                   ('j', 'w'),
                   ('k', 'x'),
                   ('l', 'y'),
                   ('m', 'z'),
                   ('n', 'a'),
                   ('o', 'b'),
                   ('p', 'c'),
                   ('q', 'd'),
                   ('r', 'e'),
                   ('s', 'f'),
                   ('t', 'g'),
                   ('u', 'h'),
                   ('v', 'i'),
                   ('w', 'j'),
                   ('x', 'k'),
                   ('y', 'l'),
                   ('z', 'm')):
        t = t.replace(i, o)
    return t

app = webapp2.WSGIApplication([
    ('/', MainHandler)
], debug=True)
When I place the letters n to z above a, then a to m give the correct result.
The issue is in the convert() function. Let's take a simple example to understand it.
Take the string 'an' and try to convert it.
First we get a from the tuple of tuples and replace it with n in the string, so it becomes 'nn'. Then, after lots of misses, we get to n in the tuple of tuples and again call replace on the whole string, and this time we get 'aa'. As you can see, each replace works on the complete string, not just on the part that has not been converted yet. This is the basic issue in your code (at least the issue you mention).
To fix this, Python already provides a str.translate() (with str.maketrans) function to do what you are trying to do; you should use that instead. Example -
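The effect is easy to reproduce outside the webapp. The sketch below rebuilds the same a-to-z replacement order with zip (the name convert_sequential is made up for the example) and shows that every letter from the first half is swapped forward and then swapped straight back.

```python
import string

lower = string.ascii_lowercase
# Same pairs, in the same order, as the tuple of tuples in the question.
pairs = list(zip(lower, lower[13:] + lower[:13]))


def convert_sequential(t):
    """Apply the replacements one after another, as the question's code does."""
    for i, o in pairs:
        t = t.replace(i, o)
    return t


print(convert_sequential("an"))  # aa
print(convert_sequential("a"))   # a  (a -> n, then n -> a again)
print(convert_sequential("n"))   # a  (only replaced once)
```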
Python 2.x -
def convert(t):
    from string import maketrans
    tt = maketrans('abcdefghijklmnopqrstuvwxyz','nopqrstuvwxyzabcdefghijklm')
    t = t.translate(tt)
    return t
For Python 3.x , you should use str.maketrans() instead of string.maketrans() .
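A Python 3 version could look like the sketch below. The constant name ROT13_TABLE is made up for the example, and the table only covers lowercase letters, like the original code. The standard library's rot_13 codec is shown as an alternative; it also handles uppercase letters.

```python
import codecs
import string

lower = string.ascii_lowercase
# Build the table once: each letter maps to the one 13 places further on.
ROT13_TABLE = str.maketrans(lower, lower[13:] + lower[:13])


def convert(t):
    """Translate every character in a single pass, so nothing is swapped twice."""
    return t.translate(ROT13_TABLE)


print(convert("an"))                     # na
print(convert("hello"))                  # uryyb
print(codecs.encode("hello", "rot_13"))  # uryyb
```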