Trying to solve the n-parenthesis problem - but failing - python

I am trying to implement a solution to the 'n-parenthesis problem':
def gen_paren_pairs(n):
    def gen_pairs(left_count, right_count, build_str, build_list=[]):
        print(f'left count is:{left_count}, right count is:{right_count}, build string is:{build_str}')
        if left_count == 0 and right_count == 0:
            build_list.append(build_str)
            print(build_list)
            return build_list
        if left_count > 0:
            build_str += "("
            gen_pairs(left_count - 1, right_count, build_str, build_list)
        if left_count < right_count:
            build_str += ")"
            #print(f'left count is:{left_count}, right count is:{right_count}, build string is:{build_str}')
            gen_pairs(left_count, right_count - 1, build_str, build_list)
    in_str = ""
    gen_pairs(n,n,in_str)
gen_paren_pairs(2)
It almost works but isn't quite there.
The code is supposed to generate a list of correctly nested bracket strings, each containing 'n' pairs of brackets.
Here are the final contents of the list. Note that the last string starts with an unwanted left bracket.
['(())', '(()()']
Please advise.

Here's a less convoluted approach:
memory = {0:[""]}
def gp(n):
    if n not in memory:
        local_mem = []
        for a in range(n):
            part1s = list(gp(a))
            for p2 in gp(n-1-a):
                for p1 in part1s:
                    pat = "("+p1+")"+p2
                    local_mem.append(pat)
        memory[n] = local_mem
    return memory[n]
The idea is to take one pair of parentheses, go over all the ways to divide the remaining N-1 pairs between going inside that pair and going after it, find the set of patterns for each of those sizes, and make all of the combinations.
To eliminate redundant computation, we save the values returned for each input n, so if asked for the same n again, we can just look it up.
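For comparison, here is a minimal sketch of how the backtracking code from the question could be repaired. In the question's code, `build_str += "("` changes the string before the second `if` runs, so the ")" branch continues from a string that already contains the extra "(". Passing `build_str + "("` as an argument leaves the caller's string untouched. Collecting results in a local `results` list (a name chosen here for illustration) also avoids the mutable default argument, which would otherwise keep old results between calls.

```python
def gen_paren_pairs(n):
    """Return every correctly nested string made of n bracket pairs."""
    results = []

    def gen_pairs(left_count, right_count, build_str):
        # left_count / right_count are the brackets still available to place
        if left_count == 0 and right_count == 0:
            results.append(build_str)
            return
        if left_count > 0:
            # pass a new string; build_str itself stays unchanged
            gen_pairs(left_count - 1, right_count, build_str + "(")
        if left_count < right_count:
            # a ")" is only allowed while it still has an unmatched "("
            gen_pairs(left_count, right_count - 1, build_str + ")")

    gen_pairs(n, n, "")
    return results


print(gen_paren_pairs(2))  # ['(())', '()()']
```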


Parsing a block of mathematical expressions and separate the terms

In a text file, I have a block of text between two keywords (let's call them "keyword1" and "keyword2"). The block is one big mathematical expression: a sum of smaller expressions of varying complexity. Names of the form x"random_number" refer to numbered variables.
For example, it could look like this:
keyword1 x47*ln(x46+2*x38) + (x35*x24 + exp(x87 + x56))^2 - x34 + ...
+ .....
+ .....
keyword2
All I want is to split this big expression into the terms it is composed of and store these "atomic" terms in a list, one entry per term of the sum (a negative term should be stored as - term).
With the example above, this should return:
L = [x47*ln(x46+2*x38), (x35*x24 + exp(x87 + x56))^2, - x34, ...]
I would try a regex that matches the + or - symbols separating the terms, but I think this is wrong because it would also match the +/- symbols inside the smaller expressions, which I don't want to split.
So I'm a bit stuck on this.
Thank you in advance for helping me solve my problem.
I think for extracting the part between the keywords, a regex will work just fine. With the help of an online regex creator you should be able to create that. Then you have the string left with the mathematical formula in it.
Essentially what you want is to split the string at all places where the bracket 'depth' is 0. For example, if you have x1*(x2+x3)+x4 the + between the brackets should be ignored.
I wrote the following function, which walks through the string and keeps track of the current bracket depth. If the depth is 0 and a + or - is encountered, its index is stored. In the end, we can split the string at these indices to obtain the split you require. I first wrote a recursive variant, but the iterative variant works just as well and is probably easier to understand.
Recursive function
def find_split_indexes(block, index=0, depth=0, indexes=None):
    # start with a fresh list on each top-level call
    # (a mutable default such as indexes=[] would be shared between calls)
    if indexes is None:
        indexes = []
    # return when the string has been searched entirely
    if index >= len(block):
        return indexes
    # change the depth when a bracket is encountered
    if block[index] == '(':
        depth += 1
    elif block[index] == ')':
        depth -= 1
    # if a + or minus is encountered at depth 0, store the index
    if depth == 0 and (block[index] == '+' or block[index] == '-'):
        indexes.append(index)
    # continue with the next character
    return find_split_indexes(block, index+1, depth, indexes)
Iterative function
Of course, an iterative version of this function (using a loop) can also be created, and it is likely a bit simpler to understand.
def find_split_indexes_iterative(block):
    indexes = []
    depth = 0
    # iterate over the string
    for index in range(len(block)):
        if block[index] == '(':
            depth += 1
        elif block[index] == ')':
            depth -= 1
        elif depth == 0 and (block[index] == '+' or block[index] == '-'):
            indexes.append(index)
    return indexes
Using the indices
To then use these indices, you can, for instance, split the string as explained in this other question to obtain the parts you want. The only thing left to do is remove the leading and trailing spaces.
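As a rough sketch of that last step, the indices can be turned into a list of terms as follows. The helper name `split_terms` is made up for this illustration, and the block repeats the iterative search so that it runs on its own. A leading + is dropped and a leading - is kept with its term, as the question asks. It assumes every depth-0 + or - separates two terms, so a unary minus such as x1*-x2 would be split incorrectly.

```python
def find_split_indexes_iterative(block):
    """Indices of every '+' or '-' that sits outside all brackets."""
    indexes = []
    depth = 0
    for index, char in enumerate(block):
        if char == '(':
            depth += 1
        elif char == ')':
            depth -= 1
        elif depth == 0 and char in '+-':
            indexes.append(index)
    return indexes


def split_terms(block):
    """Hypothetical helper: cut block at the depth-0 signs, keeping '-' with its term."""
    cuts = find_split_indexes_iterative(block)
    starts = [0] + cuts            # each piece starts at a sign (or at 0)
    ends = cuts + [len(block)]     # and ends just before the next sign
    terms = []
    for start, end in zip(starts, ends):
        piece = block[start:end].strip()
        if piece.startswith('+'):
            piece = piece[1:].strip()   # a leading plus carries no information
        if piece:                       # skip the empty piece before a leading sign
            terms.append(piece)
    return terms


expression = "x47*ln(x46+2*x38) + (x35*x24 + exp(x87 + x56))^2 - x34"
print(split_terms(expression))
# ['x47*ln(x46+2*x38)', '(x35*x24 + exp(x87 + x56))^2', '- x34']
```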

Python: Optimizing the Van Eck sequence

I am writing code in Python for the platform Coding Games. The code is about Van Eck's sequence, and I pass 66% of the "tests".
Everything works as expected; the problem is that the process runs out of the allowed time.
Yes, the code is slow.
I am not a Python programmer, so I would like to ask whether you could optimize this piece of code. If your method is complex (for example, if it uses something like vectorized data) rather than a simple swap of an if (which is easy to understand), please give a good explanation of your choice.
Here is my code for the problem:
import sys
import math
def LastSeen(array):
    startingIndex = 0
    lastIndex = len(array) - 1
    closestNum = 0
    for startingIndex in range(len(array)-1,-1,-1):
        if array[lastIndex] == array[startingIndex] and startingIndex != lastIndex :
            closestNum = abs(startingIndex - lastIndex)
            break
    array.append(closestNum)
    return closestNum
def calculateEck(elementFirst,numSeq):
    number = numSeq
    first = elementFirst
    result = 0
    sequence.append(first)
    sequence.append(0)
    number -= 2
    while number != 0 :
        result = LastSeen(sequence)
        number -= 1
    print(result)
firstElement = int(input())
numSequence = int(input())
sequence = []
calculateEck(firstElement,numSequence)
Here is my code without dictionaries; van_eck contains the sequence at the end. Usually I would use a dict to track the last position of each element to save runtime. Otherwise you would need to iterate over the list to find the last occurrence, which can take very long.
Instead of a dict, I simply initialized an array of sufficient size and use it like a dict. To determine its size, keep in mind that all numbers in the Van Eck sequence are either 0 or tell you how far away the last occurrence is. So none of the first n numbers of the sequence can be greater than n. Hence, you can just give the array a length equal to the size of the sequence you want to have in the end.
A value of -1 means the element has not appeared before.
DIGITS = 100
van_eck = [0]
last_pos = [0] + [-1] * DIGITS
for i in range(DIGITS):
    current_element = van_eck[i]
    if last_pos[current_element] == -1:
        van_eck.append(0)
    else:
        van_eck.append(i - last_pos[current_element])
    last_pos[current_element] = i
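The dict variant mentioned above might look like the sketch below. It assumes the task is to return the n-th term (counting from 1) of a sequence that starts with an arbitrary first element, as the question's two inputs suggest; the function name `van_eck_nth` is made up for this illustration. Each step does one dict lookup and one update, so the whole run takes time proportional to n rather than rescanning the list for every term.

```python
def van_eck_nth(first, n):
    """Return the n-th term (1-based) of the Van Eck sequence that starts with `first`."""
    last_pos = {}      # value -> index of its most recent earlier occurrence
    current = first    # term at index 0
    for i in range(n - 1):
        previous_index = last_pos.get(current)
        last_pos[current] = i
        # 0 if the value is new, otherwise the distance to its previous occurrence
        current = 0 if previous_index is None else i - previous_index
    return current


# The sequence starting at 0 begins 0, 0, 1, 0, 2, 0, 2, 2, 1, 6
print([van_eck_nth(0, k) for k in range(1, 11)])
```

Only the last position of each value is stored, so the full sequence never has to be kept in memory.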

Algorithm to print all valid combinations of n pairs of parentheses [duplicate]

This question already has answers here:
Algorithm to print all valid combations of n pairs of parenthesis
(3 answers)
Closed 2 years ago.
This is a very popular interview question and there are tons of pages on the internet about the solution to this problem.
eg. Calculating the complexity of algorithm to print all valid (i.e., properly opened and closed) combinations of n-pairs of parentheses
So before marking this as a duplicate question please read the full details.
I implemented my own solution to this problem, but it misses some edge cases that I'm having a hard time figuring out.
def get_all_parens(num):
    if num == 0:
        return []
    if num == 1:
        return ['()']
    else:
        sub_parens = get_all_parens(num - 1)
        temp = []
        for parens in sub_parens:
            temp.append('(' + parens + ')')
            temp.append('()' + parens)
            temp.append(parens + '()')
        return set(temp)
There is basically a recursive call on the subproblem, and a new pair of parentheses is placed around, before, or after each combination it returns.
For num = 4, it returns 13 combinations, but the correct answer is 14; the missing one is (())(())
I'm not sure what I'm doing wrong here. Am I moving in the right direction, or is this a completely wrong approach?
For the first-time reader, here is the question:
Implement an algorithm to print all valid (e.g., properly opened and closed) combinations of n pairs of parentheses.
E.G Input: 3, Output: ()()(), ()(()), (())(), (()()), ((()))
It looks like the wrong approach.
As you can see in your failure case, (())(()), your algorithm could only obtain such a string by placing parentheses around ())(() . Unfortunately, the latter is not a valid combination, so it is never generated: the previous recursive call only builds valid ones.
There are several things to correct in your approach:
recursion: it is not the fastest solution
returning a set built from a list with duplicates (did you consider using a set from the start instead of a list?)
generating only 3 types of new combinations:
a) surrounding parentheses
b) parentheses on the left
c) parentheses on the right,
which produces many duplicates and omits the symmetrical results
You can try adding one additional loop. It will not fix the problems mentioned above, but it will add the missing results to the returned set.
I modified your function by adding only one loop (my proposal is to take every position of a ( and insert a new pair of parentheses into the string at that point):
def get_all_parens(num):
    if num == 0:
        return []
    if num == 1:
        return ['()']
    else:
        sub_parens = get_all_parens(num - 1)
        temp = []
        for parens in sub_parens:
            temp.append('()' + parens)
            temp.append('(' + parens + ')')
            temp.append(parens + '()')
            # added loop
            last_index = 0
            for _ in range(parens.count('(')):
                temp.append(parens[:last_index] + '()' + parens[last_index:])
                last_index = parens.index('(', last_index) + 1
            # end of added loop
        return set(temp)
EDIT:
I propose a loop-based (non-recursive) version of that algorithm:
def get_all_combinations(n):
    results = set()
    for i in range(n):
        new_results = set()
        if i == 0:
            results = {"()"}
            continue
        for it in results:
            output = set()
            last_index = 0
            for _ in range(it.count("(")):
                output.add(it[:last_index] + "()" + it[last_index:])
                last_index = it.index("(", last_index) + 1
            output.add(it[:last_index] + "()" + it[last_index:])
            new_results.update(output)
        results = new_results
    return list(results), len(results)
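One way to gain confidence in an insertion-based generator like this is to compare its output size with the Catalan numbers, which count the valid strings of n pairs (1, 2, 5, 14, 42, ...). The sketch below is a compact restatement of the insertion idea, not the code above: `insert_pairs` and `is_valid` are names made up for this check, and `math.comb` needs Python 3.8 or later.

```python
from math import comb


def insert_pairs(n):
    """Build strings of n pairs by inserting "()" at position 0 or right after a "("."""
    results = {""}
    for _ in range(n):
        grown = set()
        for s in results:
            positions = [0] + [i + 1 for i, ch in enumerate(s) if ch == "("]
            for pos in positions:
                grown.add(s[:pos] + "()" + s[pos:])
        results = grown
    return results


def is_valid(s):
    """True if every ")" closes an earlier "(" and nothing stays open."""
    depth = 0
    for ch in s:
        depth += 1 if ch == "(" else -1
        if depth < 0:
            return False
    return depth == 0


for n in range(1, 7):
    found = insert_pairs(n)
    catalan = comb(2 * n, n) // (n + 1)
    assert all(is_valid(s) for s in found)
    assert len(found) == catalan

print("(())(())" in insert_pairs(4))  # True
```

The check works because every valid string contains a first "()" that is preceded only by "(" characters, so removing it leaves a shorter valid string from which it can be re-inserted.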

Improving the time complexity of a function that returns the index of the first occurrence of an element in a list

UPDATE 1 (Oct.16): The original code had a few logic errors, which have been fixed. The updated code below should now produce the correct output for all lists L, provided they meet the criteria for a special list.
I am trying to decrease the running time of the following function:
The "firstrepeat" function takes a special list L and an index i, and returns the smallest index j such that L[i] == L[j]. In other words, whatever the element at L[i] is, the "firstrepeat" function returns the index of the first occurrence of this element in the list.
What is special about the list L?:
The list may contain repeated elements on the increasing side of the list, or the decreasing side, but not both. i.e [3,2,1,1,1,5,6] is fine but not [4,3,2,2,1,2,3]
The list is decreasing(or staying the same) and then increasing(or staying the same).
Examples:
L = [4,2,0,1,3]
L = [3,3,3,1,0,7,8,9,9]
L = [4,3,3,1,1,1]
L = [1,1,1,1]
Example Output:
Say we have L = [4,3,3,1,1,1]
firstrepeat(L,2) would output 1
firstrepeat(L,5) would output 3
I have the following code. I believe the complexity is O(log n) or better (though I could be missing something). I am looking for ways to improve the time complexity.
def firstrepeat(L, i):
    left = 0
    right = i
    doubling = 1
    #A Doubling Search
    #In doubling search, we start at one index and then we look at one step
    #forward, then two steps forward, then four steps, then 8, then 16, etc.
    #Once we have gone too far, we do a binary search on the subset of the list
    #between where we started and where we went too far.
    while True:
        if (right - doubling) < 0:
            left = 0
            break
        if L[i] != L[right - doubling]:
            left = right - doubling
            break
        if L[i] == L[right - doubling]:
            right = right - doubling
            doubling = doubling * 2
    #A generic Binary search
    while right - left > 1:
        median = (left + right) // 2
        if L[i] != L[median]:
            left = median
        else:
            right = median
    if L[left] == L[right]:
        return left
    else:
        return right
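When tuning a search like this, a slow but obvious reference can be useful for cross-checking results. The sketch below is such a reference, not an optimization. It assumes, as the special-list rules imply, that all copies of L[i] form one contiguous run ending at or after index i; the name `first_repeat_linear` is made up for this illustration.

```python
def first_repeat_linear(L, i):
    """Walk left from i while the neighbour equals L[i]; cost grows with the run length."""
    j = i
    while j > 0 and L[j - 1] == L[i]:
        j -= 1
    return j


L = [4, 3, 3, 1, 1, 1]
print(first_repeat_linear(L, 2))  # 1
print(first_repeat_linear(L, 5))  # 3
```

Comparing both functions over every index of a few test lists is a cheap way to catch off-by-one errors in the doubling and binary search steps.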

Don't quite understand logic of this nested loop - can't revise

I was looking for previously answered questions regarding finding repeated substrings in an array, and came across https://cs.stackexchange.com/questions/79182/im-looking-for-an-algorithm-to-find-unknown-patterns-in-a-string. It does exactly what I want, except it analyzes a single string (and finds repetitions of single characters), whereas I'd like to analyze an array (with integers, some exceeding 9). I can't accomplish this with the code as is, because for example "10" would be understood as "1" and "0".
So instead of the example "ABACBABAABBCBABA", I'd want to analyze [A, B, A, C...]. More to the point, I'd eventually want to work with integers [1, 4, 3, 1, 4...]
I've tried modifying the code, however I don't think I fully understand the logic of the nested loop. Could anyone please help?
I've been doing some work on the original problem. I still don't understand the reason for the outer loop, but I've managed to mimic the original reference script (for arrays instead of strings).
I wanted to post it in case someone else may find use for it. I'm sure it's not the most efficient way, but it seems to work. If anyone sees holes in it, by all means please advise:
def countSubs(total, sub):
    # count how many times sub occurs in total (overlapping matches included)
    totalCount = 0
    for i in range(len(total) - len(sub) + 1):
        testCount = 0
        for j in range(len(sub)):
            if sub[j] == total[i + j]:
                testCount += 1
        if testCount == len(sub):
            totalCount += 1
    return totalCount
minLength = 3
minCount = 2
test = [1,2,1,3,2,-1,2,-1,2,4,2,4,2,5,2,5,2,5,6,7,-1,7,-1,8,7,6,-1,9,-1,9,8,7,10]
recDict = {}
for sublen in range(minLength, int(len(test)/minCount)):
    for i in range(0, len(test) - sublen):
        sub = test[i:i + sublen]
        cnt = countSubs(test, sub)
        #not necessary to concatenate with commas, but for visual legibility
        subText = ','.join(str(e) for e in sub)
        if cnt >= minCount and subText not in recDict:
            recDict[subText] = cnt
print(recDict)
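A possible alternative, sketched here under a few assumptions, is to count all windows of a given length in one pass with `collections.Counter`, using tuples as keys so that multi-digit integers such as 10 stay intact. The function name `repeated_runs` and its parameters are made up for this illustration. Like the script above, it counts overlapping occurrences; unlike it, it also checks the longest candidate length and the last window.

```python
from collections import Counter


def repeated_runs(values, min_length=3, min_count=2):
    """Map each run (as a tuple) of at least min_length items to its number of occurrences."""
    found = {}
    for size in range(min_length, len(values) // min_count + 1):
        # every window of this size, counted in a single pass
        windows = Counter(
            tuple(values[i:i + size]) for i in range(len(values) - size + 1)
        )
        for run, count in windows.items():
            if count >= min_count:
                found[run] = count
    return found


print(repeated_runs([1, 2, 3, 1, 2, 3]))  # {(1, 2, 3): 2}
```

Counting per window length avoids rescanning the whole array for every candidate, which is where most of the time goes in the nested-loop version.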
