I wrote the piece of code below a while back, and had this issue then as well. I ignored it at the time and when I came back to it after asking an 'expert' to look at it, it was working fine.
The issue is that sometimes the program seems unable to run main() on my laptop, possibly due to how heavy the algorithm is. Is there a way around this? I would hate to keep having this problem in the future. The same code works perfectly on another computer, which I have limited access to.
(P.S. The laptop having the issue is a MacBook Air 2015, and it should have no problem running the program. Also, it stops after printing "hi".)
It does not give an error message; it just doesn't print anything from main(). It's supposed to print a series of strings which progressively converge to "methinks it is like a weasel". In Eclipse, it shows that the code is still being processed, but it does not output anything that it is supposed to.
import random

def generateOne(strlen):
    alphabet = "abcdefghijklmnopqrstuvwxyz "
    res = ""
    for i in range(strlen):
        res = res + alphabet[random.randrange(27)]
    return res

def score(goal, teststring):
    numSame = 0
    for i in range(len(goal)):
        if goal[i] == teststring[i]:
            numSame = numSame + 1
    return numSame / len(goal)

def main():
    goalstring = "methinks it is like a weasel"
    chgoal = [0]*len(goalstring)
    newstring = generateOne(28)
    workingstring = list(newstring)
    countvar = 0
    finalstring = ""
    while score(list(goalstring), workingstring) < 1:
        if score(goalstring, newstring) > 0:
            for j in range(len(goalstring)):
                if goalstring[j] == newstring[j] and chgoal[j] == 0:
                    workingstring[j] = newstring[j]
                    chgoal[j] = 1
                    finalstring = "".join(workingstring)
            countvar = countvar + 1
            print(finalstring)
        newstring = generateOne(28)
    finalstring = "".join(workingstring)
    print(finalstring)
    print(countvar)

print("hi")
if __name__ == '__main__':
    main()
print("ho")
You can optimize a bit. Strings are immutable: every time you append one char to a string, a new string is created and replaces the old one. Use lists of chars instead. Also, do not call "".join() every time just for printing; you can print the list of chars by unpacking it with a separator of "":
import datetime
import random

def generateOne(strlen):
    """Create one in one random-call, return as list, do not iteratively add to string"""
    alphabet = "abcdefghijklmnopqrstuvwxyz "
    return random.choices(alphabet, k=strlen)

def score(goal, teststring):
    """Use zip and generator expr. for summing/scoring"""
    return sum(1 if a == b else 0 for a, b in zip(goal, teststring)) / len(goal)

def main():
    goalstring = list("methinks it is like a weasel")  # use a list
    newstring = generateOne(28)  # also returns a list
    workingstring = newstring[:]  # copy
    countvar = 0
    while score(goalstring, workingstring) < 1:
        if score(goalstring, newstring) > 0:
            for pos, c in enumerate(goalstring):  # enumerate for getting the index
                # test if equal, only change if not yet ok
                if c == newstring[pos] and workingstring[pos] != c:
                    workingstring[pos] = newstring[pos]  # could use c instead
            countvar += 1
            print(*workingstring, sep="")  # print decomposed with sep of ""
            # instead of "".join()
        newstring = generateOne(28)
    finalstring = "".join(workingstring)  # create result once ...
    # although it's the same as goalstring
    # so we could just assign that one
    print(finalstring)
    print(countvar)

print("hi")
if __name__ == '__main__':
    s = datetime.datetime.now()
    main()
    print(datetime.datetime.now() - s)
print("ho")
Timings with printouts are very unreliable. If I comment out the print that shows the intermediate steps to the final solution and use `random.seed(42)`, I get for mine:
0:00:00.012536
0:00:00.012664
0:00:00.008590
0:00:00.012575
0:00:00.012576
and for yours:
0:00:00.017490
0:00:00.017427
0:00:00.013481
0:00:00.017657
0:00:00.013210
I am quite sure this won't solve your laptop's issues, but still, it is a bit faster.
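To compare versions without the noise from printing, something like the following could be used. It is only a sketch: `evolve` is a hypothetical print-free stand-in for main() that returns its result instead of printing it, and the seed and the number of repetitions are arbitrary choices, not values from the measurements above.

```python
import random
import timeit

GOAL = list("methinks it is like a weasel")
ALPHABET = "abcdefghijklmnopqrstuvwxyz "

def evolve(seed=42):
    """Hypothetical print-free variant of main(); returns (result, iterations)."""
    rng = random.Random(seed)  # private, seeded generator for repeatable runs
    working = rng.choices(ALPHABET, k=len(GOAL))
    count = 0
    while working != GOAL:
        candidate = rng.choices(ALPHABET, k=len(GOAL))
        for pos, c in enumerate(GOAL):
            if candidate[pos] == c:  # lock in any newly matched position
                working[pos] = c
        count += 1
    return "".join(working), count

if __name__ == "__main__":
    # timeit calls evolve() 20 times and reports the total time in seconds
    print(timeit.timeit(evolve, number=20))
```

Because the generator is seeded, every call does exactly the same work, so differences between runs come from the machine rather than from the random strings.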
I have made a script:
our_word = "Success"

def duplicate_encode(word):
    char_list = []
    final_str = ""
    changed_index = []
    base_wrd = word.lower()
    for k in base_wrd:
        char_list.append(k)
    for i in range(0, len(char_list)):
        count = 0
        for j in range(i + 1, len(char_list)):
            if j not in changed_index:
                if char_list[j] == char_list[i]:
                    char_list[j] = ")"
                    changed_index.append(j)
                    count += 1
            else:
                continue
        if count > 0:
            char_list[i] = ")"
        else:
            char_list[i] = "("
    print(changed_index)
    print(char_list)
    final_str = "".join(char_list)
    return final_str

print(duplicate_encode(our_word))
Essentially, the purpose of this script is to convert a string to a new string where each character in the new string is "(" if that character appears only once in the original string, or ")" if that character appears more than once in the original string. I have made a rather layered script (I am relatively new to Python, so I didn't want to use any helpful built-in functions) that attempts to do this. My issue is that the check for whether the current index has previously been edited (in order to prevent it from changing) seems to be ignored. So instead of the intended )())()) I get )()((((. I'd really appreciate an insightful answer explaining why I am getting this issue and ways to work around it, since I'm trying to build an intuitive understanding of Python. Thanks!
word = "Success"
print(''.join([')' if word.lower().count(c) > 1 else '(' for c in word.lower()]))
The issue here has nothing to do with your understanding of Python. It's purely algorithmic. If you retain this 'layered' algorithm, it is essential that you add one more check in the "i" loop.
our_word = "Success"

def duplicate_encode(word):
    char_list = list(word.lower())
    changed_index = []
    for i in range(len(word)):
        count = 0
        for j in range(i + 1, len(word)):
            if j not in changed_index:
                if char_list[j] == char_list[i]:
                    char_list[j] = ")"
                    changed_index.append(j)
                    count += 1
        if i not in changed_index:  # the new important check to avoid reversal of already assigned ')' to '('
            char_list[i] = ")" if count > 0 else "("
    return "".join(char_list)

print(duplicate_encode(our_word))
Your algorithm can be greatly simplified if you avoid using char_list as both the input and output. Instead, you can create an output list of the same length filled with ( by default, and then only change an element when a duplicate is found. The loops will simply walk along the entire input list once for each character looking for any matches (other than self-matches). If one is found, the output list can be updated and the inner loop will break and move on to the next character.
The final code should look like this:
def duplicate_encode(word):
    char_list = list(word.lower())
    output = list('(' * len(word))
    for i in range(len(char_list)):
        for j in range(len(char_list)):
            if i != j and char_list[i] == char_list[j]:
                output[i] = ')'
                break
    return ''.join(output)

for our_word in (
    'Success',
    'ChJsTk(u cIUzI htBp#qX)OTIHpVtHHhQ',
):
    result = duplicate_encode(our_word)
    print(our_word)
    print(result)
Output:
Success
)())())
ChJsTk(u cIUzI htBp#qX)OTIHpVtHHhQ
))(()(()))))())))()()((())))()))))
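If built-in helpers are acceptable after all, the nested loops can be avoided by counting every character once up front. The following is a minimal sketch of that idea; the function name `duplicate_encode_counted` is made up for illustration, and `collections.Counter` is swapped in for the manual comparison loops used above.

```python
from collections import Counter

def duplicate_encode_counted(word):
    """Sketch: count every character once, then map each one to '(' or ')'."""
    lowered = word.lower()
    counts = Counter(lowered)  # character -> number of occurrences
    return "".join(")" if counts[c] > 1 else "(" for c in lowered)

print(duplicate_encode_counted("Success"))  # )())())
```

This walks the string twice in total instead of once per character, which matters only for long inputs.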
I'm trying to decompress strings using recursion. For example, the input:
3[b3[a]]
should output:
baaabaaabaaa
but I get:
baaaabaaaabaaaabbaaaabaaaabaaaaa
I have the following code but it is clearly off. The first find_end function works as intended. I am absolutely new to using recursion and any help understanding / tracking where the extra letters come from or any general tips to help me understand this really cool methodology would be greatly appreciated.
def find_end(original, start, level):
    if original[start] != "[":
        message = "ERROR in find_error, must start with [:", original[start:]
        raise ValueError(message)
    indent = level * " "
    index = start + 1
    count = 1
    while count != 0 and index < len(original):
        if original[index] == "[":
            count += 1
        elif original[index] == "]":
            count -= 1
        index += 1
    if count != 0:
        message = "ERROR in find_error, mismatched brackets:", original[start:]
        raise ValueError(message)
    return index - 1

def decompress(original, level):
    # set the result to an empty string
    result = ""
    # for any character in the string we have not looked at yet
    for i in range(len(original)):
        # if the character at the current index is a digit
        if original[i].isnumeric():
            # the character of the current index is the number of repetitions needed
            repititions = int(original[i])
            # start = the next index containing the '[' character
            x = 0
            while x < (len(original)):
                if original[x].isnumeric():
                    start = x + 1
                    x = len(original)
                else:
                    x += 1
            # last = the index of the matching ']'
            last = find_end(original, start, level)
            # calculate a substring using `original[start + 1:last]
            sub_original = original[start + 1 : last]
            # RECURSIVELY call decompress with the substring
            # sub = decompress(original, level + 1)
            # concatenate the result of the recursive call times the number of repetitions needed to the result
            result += decompress(sub_original, level + 1) * repititions
            # set the current index to the index of the matching ']'
            i = last
        # else
        else:
            # concatenate the letter at the current index to the result
            if original[i] != "[" and original[i] != "]":
                result += original[i]
    # return the result
    return result

def main():
    passed = True
    ORIGINAL = 0
    EXPECTED = 1
    # The test cases
    provided = [
        ("3[b]", "bbb"),
        ("3[b3[a]]", "baaabaaabaaa"),
        ("3[b2[ca]]", "bcacabcacabcaca"),
        ("5[a3[b]1[ab]]", "abbbababbbababbbababbbababbbab"),
    ]
    # Run the provided tests cases
    for t in provided:
        actual = decompress(t[ORIGINAL], 0)
        if actual != t[EXPECTED]:
            print("Error decompressing:", t[ORIGINAL])
            print(" Expected:", t[EXPECTED])
            print(" Actual: ", actual)
            print()
            passed = False
    # print that all the tests passed
    if passed:
        print("All tests passed")

if __name__ == '__main__':
    main()
From what I gathered from your code, it probably gives the wrong result because of the approach you've taken to find the last matching closing brace at a given level (I'm not 100% sure, the code was a lot). However, I can suggest a cleaner approach using stacks (almost similar to DFS, without the complications):
def decomp(s):
    stack = []
    for i in s:
        if i.isalnum():
            stack.append(i)
        elif i == "]":
            temp = stack.pop()
            count = stack.pop()
            if count.isnumeric():
                stack.append(int(count)*temp)
            else:
                stack.append(count+temp)
    for i in range(len(stack)-2, -1, -1):
        if stack[i].isnumeric():
            stack[i] = int(stack[i])*stack[i+1]
        else:
            stack[i] += stack[i+1]
    return stack[0]

print(decomp("3[b]"))  # bbb
print(decomp("3[b3[a]]"))  # baaabaaabaaa
print(decomp("3[b2[ca]]"))  # bcacabcacabcaca
print(decomp("5[a3[b]1[ab]]"))  # abbbababbbababbbababbbababbbab
This works on a simple observation: rather than evaluating a substring upon reading a [, evaluate it after encountering a ]. That allows you to build the result AFTER the pieces have been evaluated individually. (This is similar to prefix/postfix expression evaluation.)
(You can add error checking to this as well, if you wish. It would be easier to check if the string is semantically correct in one pass and evaluate it in another pass, rather than doing both in one go)
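For anyone who wants to stay with recursion: tracing the question's code on "3[b3[a]]" suggests the extra letters come from two places. The assignment `i = last` has no effect inside a `for ... in range(...)` loop, so the characters inside a bracketed group are visited again after the recursive call, and the inner `while x` search always finds the first digit of the string rather than the one at index `i`. Below is a minimal sketch that avoids both by driving the index with a `while` loop. The name `decompress_recursive` is hypothetical, and the sketch assumes single-digit counts and balanced brackets (it does no error checking).

```python
def decompress_recursive(text):
    """Sketch of a recursive decoder; assumes single-digit counts and balanced brackets."""
    result = ""
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isdigit():
            start = i + 1      # the '[' directly after this digit
            depth = 1
            last = start + 1
            while depth != 0:  # scan forward to the matching ']'
                if text[last] == "[":
                    depth += 1
                elif text[last] == "]":
                    depth -= 1
                last += 1
            # last now points one past the matching ']'
            inner = text[start + 1:last - 1]
            result += decompress_recursive(inner) * int(ch)
            i = last           # skip the whole bracketed group
        else:
            result += ch
            i += 1
    return result

print(decompress_recursive("3[b3[a]]"))  # baaabaaabaaa
```

Each recursive call only ever sees the text between one pair of brackets, so no character is processed at more than one nesting level.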
Here is a solution based on a similar idea to the one above:
We go through the string, pushing everything onto a stack until we find ']'. Then we pop items back off until we reach '[', take the number before it, multiply, and push the result back onto the stack.
It is much less costly, as we don't concatenate strings but work with lists.
Note: the multiplier can't be more than 9, as each character is parsed as a single stack element.
def decompress(string):
    stack = []
    letters = []
    for i in string:
        if i != ']':
            stack.append(i)
        elif i == ']':
            letter = stack.pop()
            while letter != '[':
                letters.append(letter)
                letter = stack.pop()
            word = ''.join(letters[::-1])
            letters = []
            stack.append(''.join([word for j in range(int(stack.pop()))]))
    return ''.join(stack)
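The single-digit limit noted above can be lifted by collecting digits until the next '[' is seen. The following is a rough sketch of that variation, not a drop-in replacement: the name `decompress_multi` is made up, it uses plain string concatenation for brevity, and it assumes every '[' is preceded by a count and that the brackets are balanced.

```python
def decompress_multi(text):
    """Sketch: stack-based decoder that also accepts counts above 9."""
    stack = []    # saved (prefix, count) pairs, one per open bracket
    current = ""  # text built so far at the current nesting level
    number = ""   # digits of the count currently being read
    for ch in text:
        if ch.isdigit():
            number += ch
        elif ch == "[":
            # remember what came before this group and how often to repeat it
            stack.append((current, int(number)))
            current, number = "", ""
        elif ch == "]":
            prefix, count = stack.pop()
            current = prefix + current * count
        else:
            current += ch
    return current

print(decompress_multi("12[a]"))  # aaaaaaaaaaaa
```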
For an exercise, I have to create a simple profanity filter in order to learn about classes.
The filter gets initialized with a list of offensive keywords and a replacement template. Every occurrence of any of these words should be replaced with a string that is generated from the template. If the word size is shorter than the template, a substring should be used that starts from the beginning, for longer sizes, the template should be repeated as often as necessary.
The following are my results so far, with an example.
class ProfanityFilter:
    def __init__(self, keywords, template):
        self.__keywords = sorted(keywords, key=len, reverse=True)
        self.__template = template

    def filter(self, msg):
        def __replace_letters__(old_word, replace_str):
            replaced_word = ""
            old_index = 0
            replace_index = 0
            while old_index <= len(old_word):
                if replace_index == len(replace_str):
                    replace_index = 0
                else:
                    replaced_word += replace_str[replace_index]
                    replace_index += 1
                old_index += 1
            return replaced_word

        for keyword in self.__keywords:
            idx = 0
            while idx < len(msg):
                index_l = msg.lower().find(keyword.lower(), idx)
                if index_l == -1:
                    break
                msg = msg[:index_l] + __replace_letters__(keyword, self.__template) + msg[index_l + len(keyword):]
                idx = index_l + len(keyword)
        return msg

f = ProfanityFilter(["duck", "shot", "batch", "mastard"], "?#$")
offensive_msg = "this mastard shot my duck"
clean_msg = f.filter(offensive_msg)
print(clean_msg)  # should be: "this ?#$?#$? ?#$? my ?#$?"
The example should print:
this ?#$?#$? ?#$? my ?#$?
But it prints:
this ?#$?#$ ?#$? my ?#$?
For some reason it replaces the word "mastard" with 6 symbols instead of 7 (one for each letter). It works for the other keywords, why not for this one?
Also if you see anything else that seems off, feel free to tell me. Do keep in mind tho that I am a beginner and my "toolbox" is quite small atm.
Your problem is in the index logic. You have two errors.
Each time you reach the end of the replacement string, you skip a letter in the profanity:
while old_index <= len(old_word):
    if replace_index == len(replace_str):
        replace_index = 0
        # You don't replace a letter; you just reset the new index, but ...
    else:
        replaced_word += replace_str[replace_index]
        replace_index += 1
    old_index += 1  # ... but you still advance the old index.
The reason you didn't notice this is that you have a second bug: you run your old_index from 0 through len(old_word), which is one more character than you started with. For the canonical four-letter word (or words of 5 or 6 characters), the two errors cancel each other. You didn't see this because you didn't test enough. For instance, using:
f = ProfanityFilter(["StackOverflow", "PC"], "?#$")
offensive_msg = "StackOverflow on PC rulez!"
clean_msg = f.filter(offensive_msg)
Output:
?#$?#$?#$?# on ?#$ rulez!
The input words are 13 and 2 letters; the replacements are 11 and 3.
Fix those two errors: make old_index stay in bounds, and increment it only when you make a replacement.
while old_index < len(old_word):
    if replace_index == len(replace_str):
        replace_index = 0
    else:
        replaced_word += replace_str[replace_index]
        replace_index += 1
        old_index += 1  # now advanced only when a letter was actually replaced
Future improvements:
Refactor this into a for loop.
Don't reset your replace_index; in fact, get rid of it. Simply use old_index % len(replace_str).
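Those two improvements could look roughly like this. It is a sketch only: `replace_letters` is written as a standalone function here rather than nested inside filter(), and it assumes a non-empty template.

```python
def replace_letters(old_word, replace_str):
    """Sketch: build a mask of len(old_word) by cycling through replace_str."""
    replaced_word = ""
    for old_index in range(len(old_word)):
        # the modulo wraps the template index back to 0 at the end of replace_str
        replaced_word += replace_str[old_index % len(replace_str)]
    return replaced_word

print(replace_letters("mastard", "?#$"))  # ?#$?#$?
```

With a single index there is nothing left to reset, so the skipped-letter bug cannot reappear.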
I'd do this with a regular expression instead, since re.sub() has a handy API for dynamic replacements:
import re

class ProfanityFilter:
    def __init__(self, keywords, template):
        # Build a regular expression that will match all of the profane words
        self.keyword_re = re.compile("|".join(re.escape(keyword) for keyword in keywords), re.I)
        self.template = template

    def _generate_replacement(self, word):
        l = len(word)
        # Figure out how many times to repeat the template
        repeats = (l // len(self.template)) + 1
        # Since we may end up with a string longer than the original,
        # slice to the correct length.
        return (self.template * repeats)[:l]

    def filter(self, msg):
        # Replace all occurrences of the regular expression with
        # a dynamically computed replacement value.
        return self.keyword_re.sub(
            lambda m: self._generate_replacement(m.group(0)),
            msg,
        )

f = ProfanityFilter(["duck", "shot", "batch", "mastard"], "?#$")
offensive_msg = "this mastard shot my duck"
print(f.filter(offensive_msg))
Couldn't make a one-liner, but here's a terrible implementation anyway. Don't do what VoNWooDSoN does:
def replace(msg, keywords=["duck", "shot", "batch", "mastard"], template="?#$"):
    for keyword in keywords * len(msg):
        msg = (template*len(keyword))[:len(keyword)].join([msg[:msg.find(keyword)], msg[msg.find(keyword)+len(keyword):]]) if msg.find(keyword) > 0 else msg
    return msg

offensive_msg = "this mastard shot my duck"
clean_msg = replace(offensive_msg)
print(clean_msg)  # should be: "this ?#$?#$? ?#$? my ?#$?"
print(clean_msg=="this ?#$?#$? ?#$? my ?#$?")
edit
So, I guess that 3.8 has assignment expressions... So, but this'd be the one liner then (probably).
print ((lambda msg: [msg := (("?#$"*len(keyword))[:len(keyword)].join([msg[:msg.find(keyword)], msg[msg.find(keyword)+len(keyword):]]) if msg.find(keyword) > 0 else msg) for keyword in ["duck", "shot", "batch", "mastard"]])("this mastard shot my duck")[-1])
I have a string: "String"
The first thing you do is reverse it: "gnirtS"
Then you will take the string from the 1st position and reverse it again: "gStrin"
Then you will take the string from the 2nd position and reverse it again: "gSnirt"
Then you will take the string from the 3rd position and reverse it again: "gSntri"
Continue this pattern until you have done every single position, and then you will return the string you have created. For this particular string, you would return: "gSntir"
And I have to repeat this entire procedure x times, where the string and x can be very big (millions or billions).
My code works fine for small strings, but it gives a timeout error for very long strings.
def string_func(s,x):
    def reversal(st):
        n1=len(st)
        for i in range(0,n1):
            st=st[0:i]+st[i:n1][::-1]
        return st
    for i in range(0,x):
        s=reversal(s)
    return s
This linear implementation could point you in the right direction:
from collections import deque
from itertools import cycle

def special_reverse(s):
    d, res = deque(s), []
    ops = cycle((d.pop, d.popleft))
    while d:
        res.append(next(ops)())
    return ''.join(res)
You can recognize the slice patterns in the following examples:
>>> special_reverse('123456')
'615243'
>>> special_reverse('1234567')
'7162534'
This works too:
my_string = "String"
my_string_len = len(my_string)
result = ""
for i in range(my_string_len):
    my_string = my_string[::-1]
    result += my_string[0]
    my_string = my_string[1:]
print(result)
And this, though it looks like spaghetti :D
s = "String"
lenn = len(s)
resultStringList = []
first_half = list(s[0:int(len(s) / 2)])
second_half = None
middle = None
if lenn % 2 == 0:
    second_half = list(s[int(len(s) / 2) : len(s)][::-1])
else:
    second_half = list(s[int(len(s) / 2) + 1 : len(s)][::-1])
    middle = s[int(len(s) / 2)]
    lenn -= 1
for k in range(int(lenn / 2)):
    resultStringList.append(second_half.pop(0))
    resultStringList.append(first_half.pop(0))
if middle is not None:
    resultStringList.append(middle)
print(''.join(resultStringList))
From the pattern of the original string and the result I constructed this algorithm. It has minimal number of operations.
str = 'Strings'
lens = len(str)
lensh = int(lens/2)
nstr = ''
for i in range(lensh):
    nstr = nstr + str[lens - i - 1] + str[i]
if ((lens % 2) == 1):
    nstr = nstr + str[lensh]
print(nstr)
or a short version using iterator magic:
def string_func(s):
    ops = (iter(reversed(s)), iter(s))
    return ''.join(next(ops[i % 2]) for i in range(len(s)))
which does the right thing for me, while if you're happy using some library code, you can golf it down to:
from itertools import cycle, islice

def string_func(s):
    ops = (iter(reversed(s)), iter(s))
    return ''.join(map(next, islice(cycle(ops), len(s))))
My original version takes 80 µs for a 512-character string and this updated version takes 32 µs, while your version took 290 µs and schwobaseggl's solution is about 75 µs.
I've had a play in Cython and I can get runtime down to ~0.5µs. Measuring this under perf_event_open I can see my CPU is retiring ~8 instructions per character, which seems pretty good, while a hard-coded loop in C gets this down to ~4.5 instructions per ASCII char. These don't seem to be very "Pythonic" solutions so I'll leave them out of this answer. But included this paragraph to show that the OP has options to make things faster, and that running this a billion times on a string consisting of ~500 characters will still take hundreds of seconds even with relatively careful C code.
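A different way to attack the "repeat x times" part, which none of the timings above cover, is to avoid repeating at all. One pass always moves the character at a given index to the same new index, so it is a fixed permutation, and applying it x times only requires knowing where each index sits in its cycle. The following is a sketch of that swapped-in technique (cycle decomposition), not something taken from the answers here; `one_pass` and `repeat_passes` are hypothetical names.

```python
def one_pass(s):
    """Single application: alternately take from the end and from the start."""
    n = len(s)
    return "".join(s[n - 1 - k // 2] if k % 2 == 0 else s[k // 2] for k in range(n))

def repeat_passes(s, x):
    """Sketch: apply one_pass x times via the cycles of its index permutation."""
    n = len(s)
    # source[k] is the input index whose character lands at position k after one pass
    source = [n - 1 - k // 2 if k % 2 == 0 else k // 2 for k in range(n)]
    result = [""] * n
    seen = [False] * n
    for start in range(n):
        if seen[start]:
            continue
        # walk the cycle that contains `start`
        cycle = []
        k = start
        while not seen[k]:
            seen[k] = True
            cycle.append(k)
            k = source[k]
        shift = x % len(cycle)
        for pos, k in enumerate(cycle):
            # after x passes, position k holds the character that started
            # x steps further along the cycle
            result[k] = s[cycle[(pos + shift) % len(cycle)]]
    return "".join(result)

print(repeat_passes("String", 1))  # gSntir
```

The work is proportional to the length of the string and does not grow with x, so a billion repetitions cost about the same as one.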
def interleave(s1,s2): #This function interleaves s1,s2 together
    guess = 0
    total = 0
    while (guess < len(s1)) and (guess < len(s2)):
        x = s1[guess]
        y = s2[guess]
        m = x + y
        print ((m),end ="")
        guess += 1
    if (len(s1) == len(s2)):
        return ("")
    elif(len(s1) > len(s2)):
        return (s1[guess:])
    elif(len(s2) > len(s1)):
        return (s2[guess:])

print (interleave("Smlksgeneg n a!", "a ie re gsadhm"))
For some reason, my test function gives an assertion error even though the output is the same as what the code below expects.
E.g. "Smlksgeneg n a!", "a ie re gsadhm" returns "Sam likes green eggs and ham!"
but an assertion error still comes out.
def testInterleave():
    print("Testing interleave()...", end="")
    assert(interleave("abcdefg", "abcdefg")) == ("aabbccddeeffgg")
    assert(interleave("abcde", "abcdefgh") == "aabbccddeefgh")
    assert(interleave("abcdefgh","abcde") == "aabbccddeefgh")
    assert(interleave("Smlksgeneg n a!", "a ie re gsadhm") ==
           "Sam likes green eggs and ham!")
    assert(interleave("","") == "")
    print("Passed!")

testInterleave()
You are confusing what is printed by interleave() with what is returned by it. The assert is testing the returned value. For example, when s1 and s2 are the same length, your code prints the interleave (on the print((m),end="") line) but returns an empty string (in the line return ("")).
If you want interleave to return the interleaved string, you need to collect the x and y variables (not very well named if they are always holding characters) into a single string and return that.
The problem is that your function just prints the interleaved portion of the resulting string, it doesn't return it, it only returns the tail of the longer string.
Here's a repaired and simplified version of your code. You don't need to do those if... elif tests. Also, your code has a lot of superfluous parentheses (and one misplaced parenthesis), which I've removed.
def interleave(s1, s2):
    ''' Interleave strings s1 and s2 '''
    guess = 0
    result = ""
    while (guess < len(s1)) and (guess < len(s2)):
        x = s1[guess]
        y = s2[guess]
        result += x + y
        guess += 1
    return result + s1[guess:] + s2[guess:]

def testInterleave():
    print("Testing interleave()...", end="")
    assert interleave("abcdefg", "abcdefg") == "aabbccddeeffgg"
    assert interleave("abcde", "abcdefgh") == "aabbccddeefgh"
    assert interleave("abcdefgh","abcde") == "aabbccddeefgh"
    assert (interleave("Smlksgeneg n a!", "a ie re gsadhm")
            == "Sam likes green eggs and ham!")
    assert interleave("", "") == ""
    print("Passed!")

print(interleave("Smlksgeneg n a!", "a ie re gsadhm"))
testInterleave()
output
Sam likes green eggs and ham!
Testing interleave()...Passed!
Here's a slightly improved version of interleave. It uses a list to store the result, rather than using repeated string concatenation. Using lists to build strings like this is a common Python practice because it's more efficient than repeated string concatenation using + or +=; OTOH, + and += on strings have been optimised so that they're fairly efficient for short strings (up to 1000 chars or so).
def interleave(s1, s2):
    result = []
    i = 0
    for i, t in enumerate(zip(s1, s2)):
        result.extend(t)
        i += 1  # inside the loop, so i ends up as the number of pairs consumed
    result.extend(s1[i:] + s2[i:])
    return ''.join(result)
That i = 0 is necessary in case either s1 or s2 are empty strings. When that happens the for loop isn't entered and so i won't get assigned a value.
Finally, here's a compact version using a list comprehension and the standard itertools.zip_longest function.
from itertools import zip_longest

def interleave(s1, s2):
    return ''.join([u+v for u,v in zip_longest(s1, s2, fillvalue='')])