shortest repeated substring [PYTHON] [closed] - python

Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 6 years ago.
Is there a quick method to find the shortest repeated substring and how many times it occurs? If there is none, you only need to return the original string (last case).
>>> repeated('CTCTCTCTCTCTCTCTCTCTCTCT')
('CT', 12)
>>> repeated('GATCGATCGATCGATC')
('GATC', 4)
>>> repeated('GATCGATCGATCGATCG')
('GATCGATCGATCGATCG', 1)
Because some people think it's 'homework' I can show my efforts:
def repeated(sequentie):
    string = ''
    for i in sequentie:
        if i not in string:
            string += i
    items = sequentie.count(string)
    if items * len(string) == len(sequentie):
        return (string, items)
    else:
        return (sequentie, 1)

Your method unfortunately won't work, since it assumes that the repeating substring will have unique characters. This may not be the case:
abaabaabaabaabaaba
You were somewhat on the right track, though. The shortest way that I can think of is to just try and check over and over if some prefix indeed makes up the entire string:
def find_shorted_substring(s):
    for i in range(1, len(s) + 1):
        substring = s[:i]
        repeats = len(s) // len(substring)
        if substring * repeats == s:
            return (substring, repeats)
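It's not very efficient (each candidate prefix rebuilds and compares a string as long as the input, so the worst case is quadratic), but it works. There are better ways of doing it.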
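One of the better ways can be sketched with a rotation check: a string is built from a shorter repeated unit exactly when it reappears inside s + s at an offset smaller than its own length. This is a minimal sketch and not the answerer's method; the name `repeated_fast` and the handling of the empty string are assumptions made for illustration.

```python
def repeated_fast(s):
    """Return (shortest repeating unit, repeat count) for s."""
    if not s:
        # Assumption: an empty string counts as one repetition of itself.
        return (s, 1)
    # Search for s inside s + s, skipping the trivial match at offset 0.
    # The first hit is the length of the shortest unit; it equals len(s)
    # when the string is not a repetition of anything shorter.
    period = (s + s).find(s, 1)
    return (s[:period], len(s) // period)


print(repeated_fast('GATCGATCGATCGATC'))   # ('GATC', 4)
print(repeated_fast('GATCGATCGATCGATCG'))  # ('GATCGATCGATCGATCG', 1)
```

The search itself is delegated to str.find, so no Python-level loop over candidate prefixes is needed.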

Related

Run length decompression python [closed]

Closed 2 years ago.
I'm trying to make a run-length decoder that doesn't use 1s. For example, a string that could be passed in would be something like 'A2C3GTA'. I made what I thought would work and am having trouble finding where I went wrong. I'm a beginner to Python, so I am sorry for the simple question. Thank you!
def decode(compressed):
    decoded = ""
    count = 0
    for x in compressed:
        if x.isdigit():
            count += int(x)
            y = compressed
            decoded += y[int(x)+1] * count
            count = 0
        else:
            decoded += x
    print(decoded)
When you find a number-letter pair, you fail to skip the letter after you expand the pair. This is because you used a for loop, which is a more restrictive structure than your logic wants. Instead, try:
idx = 0
while idx < len(compressed):
    char = compressed[idx]
    if char.isdigit():
        # replicate next character
        idx += 2
    else:
        decoded += char
        idx += 1
That will take care of your iteration.
The inappropriate replication (the 22 in your output) comes from an incorrect reference to the position:
decoded += y[int(x)+1] * count
Here, x is the run length, not the position of the character. If the input were A7B, this ill-formed expression would fault because of an index out of bounds.
In the code I gave you above, simply continue to use idx as the index.
I trust you can finish from here.
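For reference, one way the loop could be finished is sketched below. It assumes every count is a single digit and is always followed by the character it applies to, as in the 'A2C3GTA' example; multi-digit counts and a trailing digit are not handled.

```python
def decode(compressed):
    decoded = ""
    idx = 0
    while idx < len(compressed):
        char = compressed[idx]
        if char.isdigit():
            # The digit is the run length; the character right after it
            # (position idx + 1) is the one to replicate.
            decoded += compressed[idx + 1] * int(char)
            idx += 2  # skip both the digit and the replicated character
        else:
            # A bare character stands for a run of length 1.
            decoded += char
            idx += 1
    return decoded


print(decode("A2C3GTA"))  # ACCGGGTA
```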

creating List within a list in recurrsion from string python [closed]

Closed 3 years ago.
I am trying to create a list from a string.
Input string:
st = "zzabcxzghfxx"
Each list is enclosed in 'z' and 'x'.
This is my attempt to create a recursive function:
st = "zzabcxzghfxx"
def createlist(strin):
    l1=[]
    for i in st:
        if(i=='x'):
            createlist(strin)
        elif(i=='z'):
            l1.append(i)
    return(l1)
The following is the desired output: [[abc][ghf]]
Another example: string = "zzabcxzghzfxx" => [[abc][ghzf]]
Using regex.
Ex:
import re
st = "zzabcxzghfxx"
print(re.findall(r"z+(.*?)(?=x)", st))
#or print([[i] for i in re.findall(r"z+(.*?)(?=x)", st)])
Output:
['abc', 'ghf']
You could strip the trailing x and z and split on xz:
st.strip('xz').split('xz')
# ['abc', 'ghf']
Does it have to be recursive? Here's a solution using itertools.groupby.
from itertools import groupby
string = "zzabcxzghfxx"
def is_good_char(char):
    return char not in "zx"
lists = [["".join(group)] for key, group in groupby(string, key=is_good_char) if key]
print(lists)
Output:
[['abc'], ['ghf']]
EDIT - Just realized that this might not actually produce the desired behavior. You said:
[a] list is enclosed in 'z' and 'x'
Which means a sublist starts with 'z' and must end with 'x', yes? In that case the itertools.groupby solution I posted will not work exactly. The way it's written now it will generate a new sublist that starts and ends with either 'z' or 'x'. Let me know if this really matters or not.
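The difference shows up on the second input from the question, where a 'z' appears inside a sublist. Here is a small comparison; `split_groupby` and `split_regex` are hypothetical wrapper names around the two approaches above, and both return flat lists of strings to keep the output short.

```python
import re
from itertools import groupby


def split_groupby(s):
    # Every run of characters other than 'z'/'x' becomes its own piece,
    # so an inner 'z' splits a sublist in two.
    return ["".join(group)
            for key, group in groupby(s, key=lambda c: c not in "zx")
            if key]


def split_regex(s):
    # A piece starts after one or more 'z' and runs up to the next 'x',
    # so an inner 'z' stays part of the piece.
    return re.findall(r"z+(.*?)(?=x)", s)


print(split_groupby("zzabcxzghzfxx"))  # ['abc', 'gh', 'f']
print(split_regex("zzabcxzghzfxx"))    # ['abc', 'ghzf']
```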
st.replace("z", "[").replace("x", "]")

How to convert any word into an integer in Python [closed]

Closed 4 years ago.
Is there a way to convert any word, which is obviously in the form of a string, to an integer in Python?
That might seem utterly stupid and impossible at first, but if you take a look at it, it's a good problem to work on, and one I have been struggling with for a long time.
And yes, I have tried other approaches, such as using a list to map letters to corresponding integers; however, it didn't go very well.
You could create a mapping from the characters in your alphabet (a, b, c, d, e, ...) to primes (2, 3, 5, 7, ...). Then you map each position inside your word to one of the next bigger primes.
Then you multiply each character value by its positional value and sum everything up:
Example:
alphabet = {"a":2, "b":3, "1":5, "2":7 }
position = [11,13,17,19,23,29,31]
text = "aabb12a"
def encode(t):
    """Throws error when not map-able"""
    return sum(alphabet[x] * position[pos] for pos,x in enumerate(t))
for i in alphabet:
    print(i,"=>",encode(i))
print(encode(text))
Output (Python 3):
a => 22
b => 33
1 => 55
2 => 77
536
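To reverse the number, you would have to decompose it back into its summands (character value times position prime), order the resulting summands by their bigger factor ascending, then reverse the mapping of your alphabet. Note that this decomposition is not unique in general: with the tables above, "12" and "ab1" both encode to 146, so distinct words can collide.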
In this case you would get :
536 = 2*11 + 2*13 + 3*17 + 3*19 + 5*23+ 7*29 + 2*31
and you can lookup position and character to reconstruct your word.
Given 128 characters (ASCII) and words of up to 50 characters, you would get big numbers.
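If the goal is a number that can always be turned back into the word, one option that needs no decomposition step is to read the word's bytes as a single base-256 integer. This is a sketch of a different technique, not the scheme above; the names `word_to_int` and `int_to_word` are hypothetical, and the round trip assumes the word does not start with a NUL character (leading zero bytes would be lost).

```python
def word_to_int(word):
    # Interpret the UTF-8 bytes of the word as one big-endian integer.
    return int.from_bytes(word.encode("utf-8"), "big")


def int_to_word(number):
    # Recover the byte length from the bit length, then decode the bytes.
    length = (number.bit_length() + 7) // 8
    return number.to_bytes(length, "big").decode("utf-8")


n = word_to_int("hello")
print(n)               # 448378203247
print(int_to_word(n))  # hello
```

Python integers have arbitrary precision, so long words simply produce large numbers.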
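You can use the built-in hash function (note that in Python 3 string hashes are randomized per process unless PYTHONHASHSEED is set, so the value differs between runs):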
>>> hash('hello')
-8733837932593682240
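When the same word has to map to the same integer across runs and machines, a cryptographic digest from the standard hashlib module is one alternative. This is a minimal sketch; the name `stable_hash` and the choice of SHA-256 are assumptions, and like hash() the result cannot be converted back into the word.

```python
import hashlib


def stable_hash(word):
    # SHA-256 of the UTF-8 bytes, read as a non-negative 256-bit integer.
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


print(stable_hash("hello") == stable_hash("hello"))  # True
```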

python find repeated substring in string [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 6 years ago.
I am looking for a function in Python that takes as input a string in which a certain word has been repeated several times until a certain length has been reached.
The output would then be that word. The repeated word isn't necessarily repeated in full, and it is also possible that it hasn't been repeated at all.
For example:
"pythonpythonp" => "python"
"hellohello" => "hello"
"appleappl" => "apple"
"spoon" => "spoon"
Can someone give me some hints on how to write this kind of function?
You can do it by repeating the substring a certain number of times and testing whether the result is equal to the original string.
You'll have to try every possible substring length, unless you already have the length saved as a variable.
Here's the code:
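def repeats(string):
    # Try every proper prefix, shortest first.
    for x in range(1, len(string)):
        substring = string[:x]
        full, rest = divmod(len(string), x)
        # Whole repeats plus one partial repeat must rebuild the string.
        if substring * full + substring[:rest] == string:
            return substring
    # No shorter prefix works, so the word was not repeated at all.
    return string
print(repeats("pythonpytho"))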
Start by building a prefix array.
Loop through it in reverse and stop the first time you find a prefix that's repeated in your string (that is, its str.count() is greater than 1).
Now if the same substring exists right next to itself, you can return it as the word you're looking for. However, you must take into consideration the 'appleappl' example, where the proposed algorithm would return 'appl'. For that, when you find a substring that exists more than once in your string, you return that substring plus whatever lies between it and its next occurrence; for 'appleappl' you return 'appl' + 'e' = 'apple'. If no such substring is found, you return the whole word, since there are no repetitions.
def repeat(s):
    prefix_array = []
    for i in range(len(s)):
        prefix_array.append(s[:i])
    # see what it holds to give you a better picture
    print(prefix_array)
    # walk from the longest prefix down, skipping index 0 (the empty prefix)
    for i in prefix_array[:0:-1]:
        if s.count(i) > 1:
            # find where the next repetition starts
            offset = s[len(i):].find(i)
            return s[:len(i) + offset]
    return s
print(repeat("appleappl"))
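The same prefix-based idea can also be expressed with the prefix function (failure table) from the Knuth-Morris-Pratt algorithm. This is a swapped-in technique, not the code above: the longest proper prefix of the string that is also a suffix tells you how much of the word has been repeated at the end, and the rest is the word itself. The function name `repeated_word` is hypothetical.

```python
def repeated_word(s):
    if not s:
        return s
    # pi[i] = length of the longest proper prefix of s[:i + 1]
    # that is also a suffix of s[:i + 1].
    pi = [0] * len(s)
    k = 0
    for i in range(1, len(s)):
        while k > 0 and s[i] != s[k]:
            k = pi[k - 1]
        if s[i] == s[k]:
            k += 1
        pi[i] = k
    # The shortest period is the length minus the longest border.
    period = len(s) - pi[-1]
    return s[:period]


print(repeated_word("pythonpythonp"))  # python
print(repeated_word("appleappl"))      # apple
```

This runs in time linear in the length of the string and also handles a single repeated character such as "aaaa".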

Detect string repetition in python without regular expression [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 8 years ago.
This question appears to be off-topic because it lacks sufficient information to diagnose the problem. Describe your problem in more detail or include a minimal example in the question itself.
Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. See also: Stack Overflow question checklist
I would like a function to detect string repetition, specifically:
repetition("abcabcabc")
abc
repetition("aaaaaaa")
a
repetition("ababab")
ab
repetition("abcd")
abcd
I am thinking of doing it in a recursive way but I am confused
Thanks for any help in advance!
I am trying something like
def repetition(r):
    if len(r) == 2:
        if r[0] == r[1]:
            return r[0]
    half = len(r) / 2
    repetition(r[:half])
    if r[:half] == r[half:]:
        return r[:half]
There probably is a better way to do this, but my first thought would be this:
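def repetition(string):
    substring = ''
    for character in string:
        substring += character
        # Only prefixes whose length divides the string length can tile it.
        if len(string) % len(substring) == 0:
            # Integer division: multiplying a str by a float fails in Python 3.
            if (len(string) // len(substring)) * substring == string:
                return substring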
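Since the question mentions a recursive approach, the same check can be written recursively by carrying the candidate length as a parameter. This is only a sketch: the `size` parameter is an assumption introduced here, and for inputs longer than roughly two thousand characters that are not repetitions, the loop version above is safer because of Python's recursion limit.

```python
def repetition(s, size=1):
    # Base case: a unit longer than half the string cannot repeat twice.
    if size > len(s) // 2:
        return s
    # The candidate unit must tile the whole string exactly.
    if len(s) % size == 0 and s[:size] * (len(s) // size) == s:
        return s[:size]
    # Otherwise try the next longer prefix.
    return repetition(s, size + 1)


print(repetition("abcabcabc"))  # abc
print(repetition("abcd"))       # abcd
```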
Using regular expressions:
import re
def repetitions(s):
    # The whole string must be one shortest unit repeated at least twice.
    match = re.fullmatch(r"(.+?)\1+", s)
    # With no such unit (e.g. "abcd"), fall back to the string itself.
    return match.group(1) if match else s
Test:
repetitions("oblabla")
#output: "oblabla"
repetitions("blabla")
#output: "bla"
