list index not updating in for loop (python)

In the method separate I am trying to get the index of a ')' and then loop in reverse until I reach '('. The reversed loop works, but it keeps starting from the index of the first ')'. Why is the index not updating?
class elements:
    periodic_table = ['']
    def __init__(self, equation):
        self.equation = equation
    def poly(self):
        polyatomic = 'C2H3O2', 'HCO3', 'HSO4', 'ClO', 'ClO3', 'ClO2', 'OCN', 'CN', 'H2PO4', 'OH', 'NO3', 'NO2', 'ClO4', 'MnO4', 'SCN',
        return polyatomic
    def separate(self):
        element = elements.equation
        list1 = []
        for first, second in zip(element, element[1:]):
            if first == ')' and second.isdigit():
                multiply = int(second)
                print(first, second)
                print(element.index(first))
                for multiplcation in element[element.index(first)::-1]:
                    if multiplcation == '(':
                        break
                    elif multiplcation != ')':
                        final = multiplcation * multiply
                        print(final)
            if first == '=':
                list1.append(first)
            elif first.isupper() and second.islower():
                list1.append(first + second)
            elif first.isupper() and second.isdigit():
                amount = first * int(second)
                list1.append(amount)
            elif first.isupper():
                list1.append(first)
elements = elements(
    'K4Fe(CN)6 + KMnO4 + H2SO4 = KHSO4 + Fe2(SO4)3 + MnSO4 + HNO3 + CO2 + H2O')
print(elements.separate())

The crux of the problem is here:
for multiplcation in element[element.index(first)::-1]:
You're backing up from the first occurrence of ')' in your string, not from the one you just found, because element.index(first) always returns the index of the first match. In the given example, you will always go back and walk through "CN" for any later parentheses.
I recommend that you redesign your code: split the equation into molecules, write a function that returns the expansion of each individual molecule, and join the results if you need the entire equation reconstituted.
That should get you past the current problem.
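As a rough illustration of that redesign, here is a minimal sketch. The names expand_molecule and expand_equation are hypothetical, and the sketch assumes single-digit multipliers and no nested parentheses. The key difference from the question's code is that the search for ')' starts at the current position instead of at the beginning of the string.

```python
def expand_molecule(formula):
    """Expand groups such as (CN)6 into repeated text.

    Assumptions (not from the original post): multipliers are a single
    digit and parentheses are not nested.
    """
    out = []
    i = 0
    while i < len(formula):
        ch = formula[i]
        if ch == '(':
            # Search for the closing parenthesis starting at i, so a later
            # group never falls back to the first ')' in the string.
            close = formula.index(')', i)
            group = formula[i + 1:close]
            times = 1
            if close + 1 < len(formula) and formula[close + 1].isdigit():
                times = int(formula[close + 1])
                close += 1  # also skip the multiplier digit
            out.append(group * times)
            i = close + 1
        else:
            out.append(ch)
            i += 1
    return ''.join(out)


def expand_equation(equation):
    """Expand each whitespace-separated token and join them again."""
    return ' '.join(expand_molecule(token) for token in equation.split())


print(expand_equation('K4Fe(CN)6 + Fe2(SO4)3'))
```

Tokens such as '+' and '=' contain no parentheses, so they pass through unchanged.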

Related

Ignoring Changed Index Check (Python)

I have made a script:
our_word = "Success"
def duplicate_encode(word):
    char_list = []
    final_str = ""
    changed_index = []
    base_wrd = word.lower()
    for k in base_wrd:
        char_list.append(k)
    for i in range(0, len(char_list)):
        count = 0
        for j in range(i + 1, len(char_list)):
            if j not in changed_index:
                if char_list[j] == char_list[i]:
                    char_list[j] = ")"
                    changed_index.append(j)
                    count += 1
                else:
                    continue
        if count > 0:
            char_list[i] = ")"
        else:
            char_list[i] = "("
    print(changed_index)
    print(char_list)
    final_str = "".join(char_list)
    return final_str
print(duplicate_encode(our_word))
Essentially, the purpose of this script is to convert a string to a new string where each character is "(" if that character appears only once in the original string, or ")" if it appears more than once. I have made a rather layered script (I am relatively new to Python, so I didn't want to use any helpful built-in functions) that attempts to do this. My issue is that the check for whether the current index has already been edited (to prevent it from changing again) seems to be ignored. So instead of the intended )())()) I get )()((((. I'd really appreciate an insightful answer explaining why I am getting this issue and how to work around it, since I'm trying to build an intuitive understanding of Python. Thanks!
word = "Success"
print(''.join([')' if word.lower().count(c) > 1 else '(' for c in word.lower()]))
The issue here has nothing to do with your understanding of Python. It's purely algorithmic. If you retain this 'layered' algorithm, it is essential that you add one more check in the "i" loop.
our_word = "Success"
def duplicate_encode(word):
    char_list = list(word.lower())
    changed_index = []
    for i in range(len(word)):
        count = 0
        for j in range(i + 1, len(word)):
            if j not in changed_index:
                if char_list[j] == char_list[i]:
                    char_list[j] = ")"
                    changed_index.append(j)
                    count += 1
        if i not in changed_index:  # the new important check: avoids reverting an already assigned ')' to '('
            char_list[i] = ")" if count > 0 else "("
    return "".join(char_list)
print(duplicate_encode(our_word))
Your algorithm can be greatly simplified if you avoid using char_list as both the input and output. Instead, you can create an output list of the same length filled with ( by default, and then only change an element when a duplicate is found. The loops will simply walk along the entire input list once for each character looking for any matches (other than self-matches). If one is found, the output list can be updated and the inner loop will break and move on to the next character.
The final code should look like this:
def duplicate_encode(word):
    char_list = list(word.lower())
    output = list('(' * len(word))
    for i in range(len(char_list)):
        for j in range(len(char_list)):
            if i != j and char_list[i] == char_list[j]:
                output[i] = ')'
                break
    return ''.join(output)
for our_word in (
    'Success',
    'ChJsTk(u cIUzI htBp#qX)OTIHpVtHHhQ',
):
    result = duplicate_encode(our_word)
    print(our_word)
    print(result)
Output:
Success
)())())
ChJsTk(u cIUzI htBp#qX)OTIHpVtHHhQ
))(()(()))))())))()()((())))()))))
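If library helpers are acceptable, the nested loops can be replaced by counting each character once up front. The following is only a sketch of that alternative, not the approach from the answers above; the name duplicate_encode_counted is hypothetical, while collections.Counter is part of the standard library.

```python
from collections import Counter


def duplicate_encode_counted(word):
    """Return '(' for characters that occur once and ')' for repeated ones."""
    lowered = word.lower()
    counts = Counter(lowered)  # one pass to count every character
    # Second pass: look up each character's count instead of rescanning.
    return "".join(")" if counts[ch] > 1 else "(" for ch in lowered)


print(duplicate_encode_counted("Success"))  # )())())
```

Because the input is never modified, there is no need to track which indexes have already been changed.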

How to remove pair of small and capital letters in a string?

Basically, what I'm trying to do is write code that removes adjacent pairs made of a letter and the same letter in the other case, e.g.:
AbBax -» x
cCdatabasacCADde -» database
I've tried doing this but it gives me an error, maybe my train of thought is wrong.
def decode(c_p):
    t_cp=[]
    for i in c_p:
        t_cp+=[I,]
    #here I added each character from the string to a list so it would be easier to analyse each character
    new_c_p=""
    for c in range(len(t_cp)-1):
        if not t_cp[c]==chr(ord(c)) and t_cp[c+1]==chr(ord(c) + 32) or not t_cp[c]==chr(ord(c) + 32) and t_cp[c+1]==chr(ord(c)) :
            #here I analyse the index c and c+1 to know if the first character corresponds to the next in capital or vice-versa, if doesn't correspond, I add that character into new_c_p
            new_c_p+=c
    return new_c_p
Here's a slightly simpler approach:
def decode(c_p):
    while True:
        for i, pair in enumerate(zip(c_p, c_p[1:])):
            up, lo = sorted(pair)
            if up.lower() == lo and up == lo.upper():
                c_p = c_p[:i] + c_p[i+2:]
                break
        else:
            return c_p
decode("cCdatabasacCADde")
# 'database'
And here is an even better one that does not restart from the beginning after every removal and actually has linear time and space complexity:
def decode(c_p):
    stack = []
    for c in c_p:
        if not stack:
            stack.append(c)
        else:
            up, lo = sorted((stack[-1], c))
            if up.lower() == lo and lo.upper() == up:
                stack.pop()
            else:
                stack.append(c)
    return "".join(stack)
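The pair test in the stack version can also be expressed with str.swapcase(). This is only a sketch of the same stack idea under a hypothetical name, decode_swapcase. The extra inequality check keeps characters without case (digits, punctuation) from cancelling against themselves.

```python
def decode_swapcase(text):
    """Remove adjacent pairs of the same letter in opposite cases."""
    stack = []
    for ch in text:
        # A pair cancels when the last kept character differs from ch
        # but equals ch with its case flipped.
        if stack and stack[-1] != ch and stack[-1] == ch.swapcase():
            stack.pop()
        else:
            stack.append(ch)
    return "".join(stack)


print(decode_swapcase("AbBax"))             # x
print(decode_swapcase("cCdatabasacCADde"))  # database
```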

Parse nested expression to retrieve each inner functions

Suppose I have an expression as shown below:
expression = "LEN(Replace(Lower(UPPER([ProductName]+[ProductName])), 'chaichai', 'chai'))"
Required output:
['UPPER([ProductName]+[ProductName])','Lower(UPPER([ProductName]+[ProductName]))','Replace(Lower(UPPER([ProductName]+[ProductName])),'chaichai','chai')','LEN(Replace(Lower(UPPER([ProductName]+[ProductName])),'chaichai','chai'))']
I have tried the below code but not getting required result:
exp_splits = expression.strip(')').split('(')
for i_enm, i in enumerate(range(len(exp_splits)-2, -1, -1), start=1):
    result.append(f"{'('.join(exp_splits[i:])}{')'*i_enm}")
print(result)
my code's output:
["UPPER([ProductName]+[ProductName])),'chaichai','chai')", "Lower(UPPER([ProductName]+[ProductName])),'chaichai','chai'))", "Replace(Lower(UPPER([ProductName]+[ProductName])),'chaichai','chai')))", "LEN(Replace(Lower(UPPER([ProductName]+[ProductName])),'chaichai','chai'))))"]
import re
e = "LEN(Replace(Lower(UPPER([ProductName]+[ProductName])),'chaichai','chai'))"
print ([e[i:j+1] for i in range(len(e)) for j in range(len(e)) if e[i:j+1].count('(') == e[i:j+1].count(')') != 0 and (e[i-1] == '(' or i == 0) and e[j] == ')'])
Output:
["LEN(Replace(Lower(UPPER([ProductName]+[ProductName])),'chaichai','chai'))", "Replace(Lower(UPPER([ProductName]+[ProductName])),'chaichai','chai')", 'Lower(UPPER([ProductName]+[ProductName]))', 'UPPER([ProductName]+[ProductName])']
Unfolded version:
for i in range(len(e)):
    for j in range(len(e)):
        #Check for same number of opening/closing parenthesis
        if e[i:j+1].count('(') == e[i:j+1].count(')') != 0:
            #Check that (first char is preceded by an opening parenthesis OR that first char is the beginning of e) AND last char is a parenthesis
            if (e[i-1] == '(' or i == 0) and e[j] == ')':
                print (e[i:j+1])
Output:
LEN(Replace(Lower(UPPER([ProductName]+[ProductName])),'chaichai','chai'))
Replace(Lower(UPPER([ProductName]+[ProductName])),'chaichai','chai')
Lower(UPPER([ProductName]+[ProductName]))
UPPER([ProductName]+[ProductName])
Approach using split and for loop
Here's another approach. It splits the string into parts by left and right parentheses, then concatenates the parts step by step to rebuild each expression.
Assumption: the expression has an equal number of left and right parentheses.
Step 1: Count the number of left parentheses in the string.
Step 2: Split the expression by left parenthesis.
Step 3: Pop the last element from the list produced in step 2 and store it as the right expression. It contains the right parentheses.
Step 4: Split the right expression by right parenthesis.
Step 5: Now that you have both sides, stitch them together.
Note: While concatenating, the left side goes from right to left (index -1 through 0) and the right side goes from left to right (index 0 to -1).
Note: On each iteration, you concatenate the previous answer with the next left and right parts.
Code is as shown below:
expression = "LEN(Replace(Lower(UPPER([ProductName]+[ProductName])),'chaichai','chai'))"
n = expression.count('(')
exp_left = expression.split('(')
exp_right = exp_left.pop().split(')')
exp_list = []
exp_string = ''
for i in range(n):
    exp_string = exp_left[-i-1] + '(' + exp_string + exp_right[i] + ')'
    exp_list.append(exp_string)
for exp in exp_list: print (exp)
The output of this will be:
UPPER([ProductName]+[ProductName])
Lower(UPPER([ProductName]+[ProductName]))
Replace(Lower(UPPER([ProductName]+[ProductName])),'chaichai','chai')
LEN(Replace(Lower(UPPER([ProductName]+[ProductName])),'chaichai','chai'))
Below code is the same as above. I have added comments to each line for you to understand what's being done.
expression = "LEN(Replace(Lower(UPPER([ProductName]+[ProductName])),'chaichai','chai'))"
#find the number of equations in the string. Count of '(' will give you the number
n = expression.count('(')
#split the string by left parenthesis. You get all the functions + middle part + right hand side
exp_left = expression.split('(')
#middle + right hand part is at position index -1. Use pop to remove the last value
#Use the popped string to split by right parenthesis
#result will be middle part + all right hand side parts.
#empty string if there was no text between two right parenthesis
exp_right = exp_left.pop().split(')')
#define a list to store all the expressions
exp_list = []
#Now put it all together looping thru n times
#store the expression in a string so you can concat left and right to it each time
exp_string = ''
for i in range(n):
    #for each iteration, concat left side + ( + middle string + right side + )
    #left hand side: concat from right to left (-1 to 0)
    #right hand side: concat from left to right (0 to n-1)
    exp_string = exp_left[-i-1] + '(' + exp_string + exp_right[i] + ')'
    #append the expression to the expression list
    exp_list.append(exp_string)
#print each string separately
for exp in exp_list: print (exp)
Approach using While Statement
Here's how to do the search and extract version.
e = "LEN(Replace(Lower(UPPER([ProductName]+[ProductName])),'chaichai','chai'))"
x = e.count('(')
for i in range(x-1): e = e[e.find('(')+1:]
expression = e[:e.find(')')+1]
print (expression)
The result of this will be:
UPPER([ProductName]+[ProductName])
If you want all of them, then you can do this until you reach the innermost brackets.
e = "LEN(Replace(Lower(UPPER([ProductName]+[ProductName])),'chaichai','chai'))"
#x = e.count('(')
#for i in range(x-1): e = e[e.find('(')+1:]
#expression = e[:e.find(')')+1]
exp_list = [e]
while e.count('(') > 1:
    e = e[e.find('(')+1:e.rfind(')')]
    while e[-1] != ')': e = e[:e.rfind(')')+1]
    exp_list.append(e)
for exp in exp_list:
    print (exp)
The output of this will be:
LEN(Replace(Lower(UPPER([ProductName]+[ProductName])),'chaichai','chai'))
Replace(Lower(UPPER([ProductName]+[ProductName])),'chaichai','chai')
Lower(UPPER([ProductName]+[ProductName]))
UPPER([ProductName]+[ProductName])
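Another way to get the same list is a single pass that keeps a stack of where each open call begins. This is a sketch rather than any of the answers above. The function name inner_calls is hypothetical, and it assumes balanced parentheses, function names made of letters, digits and underscores, and no parentheses inside quoted strings.

```python
def inner_calls(expr):
    """Return every NAME(...) call in expr, innermost first."""
    starts = []  # start index of the name of each currently open call
    found = []
    for pos, ch in enumerate(expr):
        if ch == '(':
            # Walk left over the function name that precedes this '('.
            begin = pos
            while begin > 0 and (expr[begin - 1].isalnum() or expr[begin - 1] == '_'):
                begin -= 1
            starts.append(begin)
        elif ch == ')':
            # The most recently opened call is the one being closed.
            begin = starts.pop()
            found.append(expr[begin:pos + 1])
    return found


e = "LEN(Replace(Lower(UPPER([ProductName]+[ProductName])),'chaichai','chai'))"
for call in inner_calls(e):
    print(call)
```

Because a call is recorded when its closing parenthesis is seen, the results come out innermost first, which matches the order of the required output.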

Recursive Decompression of Strings

I'm trying to decompress strings using recursion. For example, the input:
3[b3[a]]
should output:
baaabaaabaaa
but I get:
baaaabaaaabaaaabbaaaabaaaabaaaaa
I have the following code but it is clearly off. The first find_end function works as intended. I am absolutely new to using recursion and any help understanding / tracking where the extra letters come from or any general tips to help me understand this really cool methodology would be greatly appreciated.
def find_end(original, start, level):
    if original[start] != "[":
        message = "ERROR in find_error, must start with [:", original[start:]
        raise ValueError(message)
    indent = level * " "
    index = start + 1
    count = 1
    while count != 0 and index < len(original):
        if original[index] == "[":
            count += 1
        elif original[index] == "]":
            count -= 1
        index += 1
    if count != 0:
        message = "ERROR in find_error, mismatched brackets:", original[start:]
        raise ValueError(message)
    return index - 1
def decompress(original, level):
    # set the result to an empty string
    result = ""
    # for any character in the string we have not looked at yet
    for i in range(len(original)):
        # if the character at the current index is a digit
        if original[i].isnumeric():
            # the character of the current index is the number of repetitions needed
            repititions = int(original[i])
            # start = the next index containing the '[' character
            x = 0
            while x < (len(original)):
                if original[x].isnumeric():
                    start = x + 1
                    x = len(original)
                else:
                    x += 1
            # last = the index of the matching ']'
            last = find_end(original, start, level)
            # calculate a substring using `original[start + 1:last]`
            sub_original = original[start + 1 : last]
            # RECURSIVELY call decompress with the substring
            # sub = decompress(original, level + 1)
            # concatenate the result of the recursive call times the number of repetitions needed to the result
            result += decompress(sub_original, level + 1) * repititions
            # set the current index to the index of the matching ']'
            i = last
        # else
        else:
            # concatenate the letter at the current index to the result
            if original[i] != "[" and original[i] != "]":
                result += original[i]
    # return the result
    return result
def main():
    passed = True
    ORIGINAL = 0
    EXPECTED = 1
    # The test cases
    provided = [
        ("3[b]", "bbb"),
        ("3[b3[a]]", "baaabaaabaaa"),
        ("3[b2[ca]]", "bcacabcacabcaca"),
        ("5[a3[b]1[ab]]", "abbbababbbababbbababbbababbbab"),
    ]
    # Run the provided tests cases
    for t in provided:
        actual = decompress(t[ORIGINAL], 0)
        if actual != t[EXPECTED]:
            print("Error decompressing:", t[ORIGINAL])
            print(" Expected:", t[EXPECTED])
            print(" Actual: ", actual)
            print()
            passed = False
    # print that all the tests passed
    if passed:
        print("All tests passed")
if __name__ == '__main__':
    main()
From what I gathered from your code, it probably gives the wrong result because of the approach you've taken to find the last matching closing brace at a given level (I'm not 100% sure, the code was a lot). However, I can suggest a cleaner approach using stacks (almost similar to DFS, without the complications):
def decomp(s):
    stack = []
    for i in s:
        if i.isalnum():
            stack.append(i)
        elif i == "]":
            temp = stack.pop()
            count = stack.pop()
            if count.isnumeric():
                stack.append(int(count)*temp)
            else:
                stack.append(count+temp)
    for i in range(len(stack)-2, -1, -1):
        if stack[i].isnumeric():
            stack[i] = int(stack[i])*stack[i+1]
        else:
            stack[i] += stack[i+1]
    return stack[0]
print(decomp("3[b]")) # bbb
print(decomp("3[b3[a]]")) # baaabaaabaaa
print(decomp("3[b2[ca]]")) # bcacabcacabcaca
print(decomp("5[a3[b]1[ab]]")) # abbbababbbababbbababbbababbbab
This works on a simple observation: rather than evaluating a substring as soon as you read a [, evaluate it when you encounter a ]. That allows you to build the result AFTER the pieces have been evaluated individually. (This is similar to how prefix/postfix expressions are evaluated.)
(You can add error checking to this as well, if you wish. It would be easier to check if the string is semantically correct in one pass and evaluate it in another pass, rather than doing both in one go)
Here is a solution based on a similar idea:
We go through the string, pushing everything onto a stack until we find ']'. Then we pop back to the matching '[', collecting what we removed, read the number before it, multiply, and push the result back onto the stack.
It is less costly because we work with lists instead of adding strings.
Note: the multiplier can't be more than 9, because it is parsed as a one-character string.
def decompress(string):
    stack = []
    letters = []
    for i in string:
        if i != ']':
            stack.append(i)
        elif i == ']':
            letter = stack.pop()
            while letter != '[':
                letters.append(letter)
                letter = stack.pop()
            word = ''.join(letters[::-1])
            letters = []
            stack.append(''.join([word for j in range(int(stack.pop()))]))
    return ''.join(stack)
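For readers who want to stay with recursion, two things in the question's decompress stand out. Assigning i = last inside a for loop over range() has no effect, because the loop overwrites i on the next iteration. Also, the inner while loop searches for a digit from index 0 rather than from the current index, so later groups reuse the first bracket. The sketch below is one possible recursive version and not the asker's intended solution. The name decompress_recursive is hypothetical, and it assumes well-formed input.

```python
def decompress_recursive(text):
    """Expand strings such as 3[b3[a]]; assumes well-formed input."""
    pieces = []
    i = 0
    while i < len(text):  # a while loop lets us jump past a whole group
        if text[i].isdigit():
            # Read the (possibly multi-digit) repeat count.
            j = i
            while text[j].isdigit():
                j += 1
            count = int(text[i:j])
            # text[j] is '['; scan forward to its matching ']'.
            depth = 0
            k = j
            while True:
                if text[k] == '[':
                    depth += 1
                elif text[k] == ']':
                    depth -= 1
                    if depth == 0:
                        break
                k += 1
            # Recurse on the bracket contents, then skip past the group.
            pieces.append(decompress_recursive(text[j + 1:k]) * count)
            i = k + 1
        else:
            pieces.append(text[i])
            i += 1
    return ''.join(pieces)


print(decompress_recursive("3[b3[a]]"))  # baaabaaabaaa
```

Each recursive call only sees the contents of one pair of brackets, so indexes found inside it never need to be translated back to the outer string.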

Subtracting substring from string in as many possible steps

The goal is to find the maximum number of times you can subtract t from s.
Example: t = ab, s = aabb. In the first step, we check whether t is contained in s. Here, t is contained in the middle, i.e. a(ab)b, so we remove it, which leaves ab, and increment the count by 1. We check again whether t is contained in s. Now t is equal to s, i.e. (ab), so we remove it from s and increment the count. Since t is no longer contained in s, we stop and print the count, which is 2 in this case.
A problem occurs with something like s = 'abbabbaa', t = 'abba'.
Now it matters whether you remove from the end or from the beginning, since you get more steps from the end.
def MaxNum(s,t):
    if not t in s:
        return 0
    elif s.count(t) == 1:
        front = s.find(t)
        sfront = s[:front] + s[front + len(t):]
        return 1 + MaxNum(sfront,t)
    else:
        back = s.rfind(t)
        front = s.find(t)
        sback = s[:back] + s[back +len(t):]
        sfront = s[:front] + s[front + len(t):]
        print (sfront,sback)
        return max(1 + MaxNum(sfront,t),1 + MaxNum(sback,t))
def foo(t,s):
    return max([0] + [
        1 + foo(t,s[:i]+s[i+len(t):]) for i in range(len(s)) if s[i:].startswith(t)])
Should I ask why you care?
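Trying every occurrence, as foo does, can reach the same intermediate string through different removal orders. A possible refinement is to cache results per string. This is a sketch only. The name max_removals is hypothetical, and functools.lru_cache is the standard-library memoization decorator.

```python
from functools import lru_cache


def max_removals(s, t):
    """Maximum number of times t can be removed from s, trying every occurrence."""
    if not t:
        raise ValueError("t must be non-empty")

    @lru_cache(maxsize=None)
    def best(current):
        result = 0
        start = current.find(t)
        while start != -1:
            # Remove this occurrence and solve the remainder recursively.
            remaining = current[:start] + current[start + len(t):]
            result = max(result, 1 + best(remaining))
            # Move on to the next (possibly overlapping) occurrence.
            start = current.find(t, start + 1)
        return result

    return best(s)


print(max_removals('aabb', 'ab'))        # 2
print(max_removals('abbabbaa', 'abba'))  # 2
```

In the second example, removing the occurrence at index 0 leaves 'bbaa' and ends the process after one step, while removing the one at index 3 leaves 'abba' and allows a second removal.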
