I was looking for previously answered questions regarding finding repeated substrings in an array, and came across https://cs.stackexchange.com/questions/79182/im-looking-for-an-algorithm-to-find-unknown-patterns-in-a-string. It does exactly what I want, except it analyzes a single string (and finds repetitions of single characters), whereas I'd like to analyze an array (with integers, some exceeding 9). I can't accomplish this with the code as is, because for example "10" would be understood as "1" and "0".
So instead of the example "ABACBABAABBCBABA", I'd want to analyze [A, B, A, C...]. More to the point, I'd eventually want to work with integers [1, 4, 3, 1, 4...]
I've tried modifying the code, however I don't think I fully understand the logic of the nested loop. Could anyone please help?
I've been doing some work on the original problem. I still don't understand the reason for the outer loop, but I've managed to mimic the original reference script (for arrays instead of strings).
I wanted to post it in case someone else may find use for it. I'm sure it's not the most efficient way, but it seems to work. If anyone sees holes in it, by all means please advise:
def countSubs(total, sub):
    # count how many times the list `sub` occurs in the list `total` (overlaps included)
    totalCount = 0
    for i in range(len(total) - len(sub) + 1):
        testCount = 0
        for j in range(len(sub)):
            if sub[j] == total[i + j]:
                testCount += 1
        if testCount == len(sub):
            totalCount += 1
    return totalCount

minLength = 3
minCount = 2
test = [1,2,1,3,2,-1,2,-1,2,4,2,4,2,5,2,5,2,5,6,7,-1,7,-1,8,7,6,-1,9,-1,9,8,7,10]
recDict = {}
for sublen in range(minLength, int(len(test)/minCount)):
    for i in range(0, len(test) - sublen):
        sub = test[i:i + sublen]
        cnt = countSubs(test, sub)
        # not necessary to join with commas, but it keeps the keys legible
        subText = ','.join(str(e) for e in sub)
        if cnt >= minCount and subText not in recDict:
            recDict[subText] = cnt
print(recDict)
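A possible variation on the script above, offered only as a sketch: tuples can serve as dictionary keys directly, so the integers never need to be joined into text and 10 cannot be confused with 1 followed by 0. The function name repeated_runs and its parameters are made up for this illustration, and the length cap mirrors the heuristic used in the script rather than any proven bound.

```python
from collections import Counter

def repeated_runs(seq, min_length=3, min_count=2):
    """Map each run (as a tuple) of at least min_length items to its count,
    keeping only runs that occur at least min_count times (overlaps included)."""
    found = {}
    # Assumption: a run longer than len(seq) // min_count is not searched for.
    for length in range(min_length, len(seq) // min_count + 1):
        windows = (tuple(seq[i:i + length]) for i in range(len(seq) - length + 1))
        counts = Counter(windows)
        for run, count in counts.items():
            if count >= min_count:
                found[run] = count
    return found

print(repeated_runs([10, 1, 0, 10, 1, 0]))  # {(10, 1, 0): 2}
```

Because each window is counted once per length, this avoids rescanning the whole list for every candidate run.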
I am trying to implement a solution to the 'n-parenthesis problem'
def gen_paren_pairs(n):
    def gen_pairs(left_count, right_count, build_str, build_list=[]):
        print(f'left count is:{left_count}, right count is:{right_count}, build string is:{build_str}')
        if left_count == 0 and right_count == 0:
            build_list.append(build_str)
            print(build_list)
            return build_list
        if left_count > 0:
            build_str += "("
            gen_pairs(left_count - 1, right_count, build_str, build_list)
        if left_count < right_count:
            build_str += ")"
            #print(f'left count is:{left_count}, right count is:{right_count}, build string is:{build_str}')
            gen_pairs(left_count, right_count - 1, build_str, build_list)
    in_str = ""
    gen_pairs(n,n,in_str)
gen_paren_pairs(2)
It almost works but isn't quite there.
The code is supposed to generate a list of correctly nested brackets whose count matches the input 'n'
Here is the final contents of a list. Note that the last string starts with an unwanted left bracket.
['(())', '(()()']
Please advise.
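One likely cause of the stray bracket, offered as a sketch rather than a definitive diagnosis: build_str += "(" changes the local string before the second if runs, so the ")" branch starts from a string that already carries the extra "(". Passing build_str + "(" as an argument leaves the local value untouched. The results list below is my own substitution for the build_list=[] default, which would otherwise be shared between calls.

```python
def gen_paren_pairs(n):
    results = []  # fresh list per call instead of a mutable default argument

    def gen_pairs(left_count, right_count, build_str):
        if left_count == 0 and right_count == 0:
            results.append(build_str)
            return
        if left_count > 0:
            # build a new string for the recursive call; build_str itself is unchanged
            gen_pairs(left_count - 1, right_count, build_str + "(")
        if left_count < right_count:
            gen_pairs(left_count, right_count - 1, build_str + ")")

    gen_pairs(n, n, "")
    return results

print(gen_paren_pairs(2))  # ['(())', '()()']
```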
Here's a less convoluted approach:
memory = {0:[""]}
def gp(n):
    if n not in memory:
        local_mem = []
        for a in range(n):
            part1s = list(gp(a))
            for p2 in gp(n-1-a):
                for p1 in part1s:
                    pat = "("+p1+")"+p2
                    local_mem.append(pat)
        memory[n] = local_mem
    return memory[n]
The idea is to take one pair of parentheses, go over all the ways to divide the remaining N-1 pairs between going inside that pair and going after it, find the set of patterns for each of those sizes, and make all of the combinations.
To eliminate redundant computation, we save the values returned for each input n, so if asked for the same n again, we can just look it up.
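The same split-and-combine idea can also be sketched with functools.lru_cache doing the bookkeeping instead of a hand-written dictionary. The name balanced is made up for this example, and tuples are returned so that cached results cannot be modified by a caller.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def balanced(n):
    """All balanced strings of n pairs, built as "(" + inside + ")" + after."""
    if n == 0:
        return ("",)
    patterns = []
    for inside_pairs in range(n):
        for inside in balanced(inside_pairs):
            for after in balanced(n - 1 - inside_pairs):
                patterns.append("(" + inside + ")" + after)
    return tuple(patterns)

print(len(balanced(4)))  # 14
```

Each string has exactly one decomposition of this form, so no duplicates need to be filtered out.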
This is a very popular interview question and there are tons of pages on the internet about the solution to this problem,
e.g. Calculating the complexity of algorithm to print all valid (i.e., properly opened and closed) combinations of n-pairs of parentheses
So before marking this as a duplicate question, please read the full details.
I implemented my own solution to this problem, but I'm missing some edge cases that I'm having a hard time figuring out.
def get_all_parens(num):
    if num == 0:
        return []
    if num == 1:
        return ['()']
    else:
        sub_parens = get_all_parens(num - 1)
        temp = []
        for parens in sub_parens:
            temp.append('(' + parens + ')')
            temp.append('()' + parens)
            temp.append(parens + '()')
        return set(temp)
There is basically a recursive call to the subproblem, and a pair of parentheses is put around, before, and after each combination from the subproblem.
For num = 4 it returns 13 possible combinations; however, the correct answer is 14, and the missing one is (())(()).
I'm not sure what I'm doing wrong here. Am I moving in the right direction, or is this a completely wrong approach?
For the first time reader here is the question:
Implement an algorithm to print all valid (i.e., properly opened and closed) combinations of n pairs of parentheses.
E.g. Input: 3, Output: ()()(), ()(()), (())(), (()()), ((()))
It looks like a wrong approach.
As you can see in your failure case (())(()), your algorithm could only obtain such a string by placing parentheses around ())((). Unfortunately the latter is not a valid combination and cannot be generated: the prior recursive call only builds valid ones.
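To see the gap concretely, here is a small check, written as a sketch: flawed restates the recurrence from the question using sets, and reference is an ordinary backtracking generator that I am using as the baseline. Both names are made up for this comparison.

```python
def flawed(num):
    # The question's recurrence: wrap, prepend, or append one pair.
    if num == 0:
        return set()
    if num == 1:
        return {"()"}
    out = set()
    for p in flawed(num - 1):
        out.update({"(" + p + ")", "()" + p, p + "()"})
    return out

def reference(n):
    # Backtracking: add "(" while any remain, add ")" while it keeps the prefix valid.
    out = set()

    def walk(prefix, opened, closed):
        if len(prefix) == 2 * n:
            out.add(prefix)
            return
        if opened < n:
            walk(prefix + "(", opened + 1, closed)
        if closed < opened:
            walk(prefix + ")", opened, closed + 1)

    walk("", 0, 0)
    return out

print(len(flawed(4)), len(reference(4)))  # 13 14
print(reference(4) - flawed(4))           # {'(())(())'}
```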
There are several things to correct in your approach:
1. Recursion: it is not the fastest solution.
2. Returning a set built from a list with duplicates (did you consider using only a set instead of a list?).
3. Generating only 3 types of new combinations:
   a) surrounding parentheses,
   b) parentheses on the left,
   c) parentheses on the right,
   which also generates many duplicates and omits the symmetrical results.
You can try adding one additional loop. It will not fix the problems mentioned above, but it will add the missing results to the returned set.
I modified your function by adding only one loop (my proposal is to go through the positions of ( and insert a new pair of parentheses into the string at each of them):
def get_all_parens(num):
    if num == 0:
        return []
    if num == 1:
        return ['()']
    else:
        sub_parens = get_all_parens(num - 1)
        temp = []
        for parens in sub_parens:
            temp.append('()' + parens)
            temp.append('(' + parens + ')')
            temp.append(parens + '()')
            # added loop
            last_index = 0
            for _ in range(parens.count('(')):
                temp.append(parens[:last_index] + '()' + parens[last_index:])
                last_index = parens.index('(', last_index) + 1
            # end of added loop
        return set(temp)
EDIT:
I propose an iterative version of that algorithm:
def get_all_combinations(n):
    results = set()
    for i in range(n):
        new_results = set()
        if i == 0:
            results = {"()"}
            continue
        for it in results:
            output = set()
            last_index = 0
            for _ in range(it.count("(")):
                output.add(it[:last_index] + "()" + it[last_index:])
                last_index = it.index("(", last_index) + 1
            output.add(it[:last_index] + "()" + it[last_index:])
            new_results.update(output)
        results = new_results
    return list(results), len(results)
Here is my code:
def string_match(a, b):
    count = 0
    if len(a) < 2 or len(b) < 2:
        return 0
    for i in range(len(a)):
        if a[i:i+2] == b[i:i+2]:
            count = count + 1
    return count
And here are the results:
Correct me if I am wrong, but I see that it didn't work, probably because the two string lengths are the same. If I were to change the for loop statement to:
for i in range(len(a)-1):
then it would work for all cases provided. But can someone explain to me why adding the -1 makes it work? Perhaps I'm misunderstanding how the for loop works in this case. And can someone tell me a more optimal way to write this, because this is probably really bad code? Thank you!
But can someone explain to me why adding the -1 makes it work?
Observe:
test = 'food'
i = len(test) - 1
test[i:i+2] # produces 'd'
Using len(a) as your bound means that len(a) - 1 will be used as an i value, and therefore a slice is taken at the end of a that would extend past the end. In Python, such slices succeed, but produce fewer characters.
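The slicing behaviour described here can be checked directly. A short sketch, with variable names of my own choosing:

```python
test = 'food'

last = test[3:5]     # starts at the final index and asks for two characters
beyond = test[4:6]   # starts past the end entirely

print(repr(last))    # 'd'
print(repr(beyond))  # ''

# Indexing, unlike slicing, does raise when the position is out of range.
try:
    test[4]
except IndexError as exc:
    print('IndexError:', exc)
```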
String slicing can return strings that are shorter than requested. In your first failing example that checks "abc" against "abc", in the third iteration of the for loop, both a[i:i+2] and b[i:i+2] are equal to "c", and therefore count is incremented.
Using range(len(a)-1) ensures that your loop stops before it gets to a slice that would be just one letter long.
Since the strings may be of different lengths, you want to iterate only up to the end of the shortest one. In addition, you're accessing i+2, so you only want i to iterate up to the index before the last item (otherwise you might get a false positive at the end of the string by going off the end and getting a single-character string).
def string_match(a: str, b: str) -> int:
    return len([
        a[i:i+2]
        for i in range(min(len(a), len(b)) - 1)
        if a[i:i+2] == b[i:i+2]
    ])
(You could also do this counting with a sum, but this makes it easy to get the actual matches as well!)
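For reference, the sum-based variant mentioned in that remark might look like the sketch below. It relies on True counting as 1 when summed; the sample strings are my own.

```python
def string_match(a: str, b: str) -> int:
    # Compare the two-character slices at each shared index and count the matches.
    return sum(
        a[i:i+2] == b[i:i+2]
        for i in range(min(len(a), len(b)) - 1)
    )

print(string_match("xxcaazz", "xxbaaz"))  # 3
print(string_match("abc", "abc"))         # 2
```

When either string has fewer than two characters the range is empty, so the result is 0 without a separate length check.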
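You can use this:
def string_match(a, b):
    if len(a) < 2 or len(b) < 2:
        return 0
    # every two-character slice of a
    subs = [a[i:i+2] for i in range(len(a)-1)]
    # note: this tests whether each slice occurs anywhere in b, not only at the same index
    occurrence = [sub in b for sub in subs]
    return occurrence.count(True)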
This is merge sort tweaked to count inversions. My code throws an odd error
(I'm implementing algos to learn python 3.x).
In line 11, in merge_sort
    first_sorted_half, x = merge_sort(arr[:half])
[Previous line repeated 12 more times]
ValueError: not enough values to unpack (expected 2, got 1)
Even though I explicitly return two values? I'm new to python 3 so I'd like to understand exactly what's going on here, I can't seem to find a similar issue anywhere. A link to python docs for more on this would also be appreciated!
def merge_sort(arr):
    if len(arr) <= 1:
        return arr
    half = int(len(arr)/2)
    first_sorted_half, x = merge_sort(arr[:half])
    second_sorted_half, y = merge_sort(arr[half:])
    merged_halves, z = merge(first_sorted_half, second_sorted_half)
    return merged_halves, x + y + z

def merge(first_half, second_half):
    n = len(first_half) + len(second_half)
    i = 0
    j = 0
    split_inversions = 0
    ans = []
    for k in range(n):
        if i >= len(first_half):
            ans.append(second_half[j])
            j += 1
            continue
        if j >= len(second_half):
            ans.append(first_half[i])
            i += 1
            continue
        if first_half[i] > second_half[j]:
            ans.append(second_half[j])
            j += 1
            split_inversions += len(first_half) - i
        elif first_half[i] < second_half[j]:
            ans.append(first_half[i])
            i += 1
    return ans, split_inversions

numbers = [3,2,1,4,5,6,8,10,9]
print(merge_sort(numbers))
The error you are getting says that your program executed that recursive call 12 times, and at the end it couldn't unpack the result.
What that means is that Python expects merge_sort to return two values, because you unpack the result into first_sorted_half and x. However, when len(arr) <= 1 you return only arr, so there is just the array and no second value to unpack.
So the fix is to return a second value in the base case as well, like return arr, 0 (a list with at most one element contains no inversions).
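A sketch of how the whole thing might look with a two-value base case. Two choices here are mine rather than the asker's: the base case reports 0 inversions, since a list of at most one element has none, and merge uses <= so that equal elements are still appended (the original if/elif pair appends nothing when the two values are equal).

```python
def merge_sort(arr):
    # Base case returns the same shape as the recursive case: (sorted list, inversion count).
    if len(arr) <= 1:
        return arr, 0
    half = len(arr) // 2
    first_sorted_half, x = merge_sort(arr[:half])
    second_sorted_half, y = merge_sort(arr[half:])
    merged_halves, z = merge(first_sorted_half, second_sorted_half)
    return merged_halves, x + y + z

def merge(first_half, second_half):
    i = j = 0
    split_inversions = 0
    ans = []
    while i < len(first_half) and j < len(second_half):
        if first_half[i] <= second_half[j]:
            ans.append(first_half[i])
            i += 1
        else:
            ans.append(second_half[j])
            j += 1
            # every element still waiting in first_half is larger than this one
            split_inversions += len(first_half) - i
    ans.extend(first_half[i:])
    ans.extend(second_half[j:])
    return ans, split_inversions

numbers = [3, 2, 1, 4, 5, 6, 8, 10, 9]
print(merge_sort(numbers))  # ([1, 2, 3, 4, 5, 6, 8, 9, 10], 4)
```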
Whilst ilke444 is right, a bit more clarification may help. To start: returning both values is what you need. I am quite new to Stack Overflow and I specialize in Pygame and the standard packages, so I can't add much about the len(arr) <= 1 check itself.
First thing: arr in this code snippet is the parameter of merge_sort, so it is defined whenever the function is called with a list. len stands for length, as you know, and it returns the number of items in whatever you pass it. If you put the name in quotes (' '), you measure that string instead of the variable.
Like so:
len('arr')
would print:
3
because there are 3 characters in the string 'arr'. You are obviously new to Python 3, as you said, and its syntax is slightly different from Python 2.
As this probably only solves the first bit, I will leave you with a few more things:
A call to print requires parentheses in Python 3,
Lists use [ ] brackets instead of ( ),
Dictionaries use { } brackets, and names must be defined, either by assignment or as a function parameter, before they are used, unless they are put in quote marks, in which case they are just text.
Thanks,
Jerry
NOTE: This is for a homework assignment, but the portion I have a question on is ok to ask help for.
I have to script out a sequence 11110000111000110010 (I am using Python) without using switches or if statements and with a maximum of 5 for and while loops.
I already have my script laid out to iterate; I just can't figure out whether the algorithm is recursive or explicit, let alone whether the elements are 1's, 2's or 4's =/
As far as we have learned so far, there is no equation or algorithm to use to figure OUT the algorithm for a sequence, just a set of instructions for defining one once we figure it out. Does anyone see a pattern here that I am missing?
EDIT: What I am looking for is the algorithm to determine the sequence.
I.e., the sequence 1,3,6,10,15 would come out to be a[n]=(a[n-1]+n) where n is the index of the sequence. This would be a recursive sequence because it relies on a previous element's value or index. In this case a[n-1] refers to the previous index's value.
Another sequence, 2, 4, 6, 8, would come out to be a[n] = (n*2), which is an explicit sequence because you only require the current index or value.
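The two kinds of definition can be sketched in code as follows. This assumes the index n starts at 1 and that a[0] = 0 for the recursive case, neither of which is stated above; the function names are made up.

```python
def recursive_sequence(count):
    # a[n] = a[n-1] + n: each value needs the one before it
    values = []
    previous = 0  # assumed starting value a[0]
    for n in range(1, count + 1):
        previous = previous + n
        values.append(previous)
    return values

def explicit_sequence(count):
    # a[n] = n * 2: each value depends only on its own index
    return [n * 2 for n in range(1, count + 1)]

print(recursive_sequence(5))  # [1, 3, 6, 10, 15]
print(explicit_sequence(4))   # [2, 4, 6, 8]
```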
EDIT: Figured it out thanks to all the helpful people that replied.... I can't believe I didn't see it =/
There are many possible solutions to this problem. Here's a reusable solution that simply decrements from 4 to 1 and adds the expected number of 1's and 0's.
Loops used : 1
def sequence(n):
    string = ""
    for i in range(n):
        # n-i ones followed by n-i zeros, for n-i = n down to 1
        string+='1'*(n-i)
        string+='0'*(n-i)
    return string
print(sequence(4))
There's another single-line elegant and more pythonic way to do this:
print(''.join(['1'*x+'0'*x for x in range(4,0,-1)]))
Loops used : 1, Lines of code : 1
;)
Note how there's a nested structure here. In pseudocode (so you do the python yourself):
for i in 4 .. 1:
    for b in 1 .. 0:
        for j in 1 .. i:
            print b
You could try this:
print(''.join(['1'*i+'0'*i for i in range(4,0,-1)]))
b_len = 4
ones = '1111'
zeros = '0000'
s = ''
for n in range(b_len, -1, -1):
    # take the first n ones and the first n zeros (n = 0 adds nothing)
    s = s + ones[:n] + zeros[:n]
print(s)
prints:
11110000111000110010
I see. Four "1" - four "0", three "1" - three "0", two "1" - two "0", one "1" - one "0". 20 digits in total. What it means I have no clue.
#!/usr/bin/python
s=''
i=4
while i >0:
    s=s+'1'*i+'0'*i
    i -=1
print(s)
11110000111000110010
Is it exactly this sequence, or do you want to be able to change the length of the first run of 1s?
You can use a reversed iteration loop like in this code:
def askedseq(max1):
    seq = [] # declaring temporary sequence
    for i in range(max1,0,-1): # decreasing iteration loop
        seq += i*[1] + i*[0] # adding the correctly sized subseq
    return seq
# the list is joined into one string so the digits print without brackets and commas
print(''.join(map(str, askedseq(4)))) #prints the required sequence
print(''.join(map(str, askedseq(5)))) #prints the equivalent sequence with 11111
prints:
11110000111000110010
111110000011110000111000110010
You can also look at NumPy to do such things.