convert string to specific format string [duplicate] - python

Example input:
aaabbcca
Expected output:
a3b2c2a1
My attempt:
string = input()
ans = ""
i = 0
j = 0
while i < len(string):
    num = 1
    ans += string[i]
    j = i + 1
    if j >= len(string): break
    while j < len(string):
        if string[i] == string[j]:
            num += 1
        else:
            ans += str(num)
            i = j
            break
print(ans)
# i = 0
# for i in range(len(string)):
# num=0
# ans += string[i]
# for j in range(i,len(string)):
# if string[i] == string[j]:
# num +=1
# i = j
# else: break
# ans += str(num)
# print(ans)
My problem: nothing is printed.
How can I get this code right?
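One likely reason nothing is printed: in the inner while loop, j is never incremented when string[i] == string[j], so the loop never ends and print(ans) is never reached. A minimal corrected sketch of the same two-index idea follows (the function name rle_encode is my own, not from the question):

```python
def rle_encode(string: str) -> str:
    ans = ""
    i = 0
    while i < len(string):
        # advance j to the first position whose character differs from string[i]
        j = i + 1
        while j < len(string) and string[j] == string[i]:
            j += 1
        # j - i is the length of the run that starts at i
        ans += string[i] + str(j - i)
        i = j
    return ans

print(rle_encode("aaabbcca"))  # a3b2c2a1
```

Because the run length is emitted after the inner loop rather than inside an else branch, the last run is handled without a special case.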

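You might want to explore groupby from itertools.
For example:
from itertools import groupby

def encode(data: str) -> str:
    # groupby yields (character, run) pairs for consecutive equal characters
    return "".join(f"{x}{sum(1 for _ in y)}" for x, y in groupby(data))

print(encode("aaabbcca"))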
Output:
a3b2c2a1

The second loop is redundant here. You can take the first character as the base case and do everything in a single loop.
raw_str = 'aaabbcca'
last_char = raw_str[0]
res = ''
count = 0
for ch in raw_str:
    if ch == last_char:
        count += 1
    else:
        # The run of last_char has ended: emit it and start a new run
        res += last_char + str(count)
        last_char = ch
        count = 1
# The final run is never followed by a different character, so the loop
# does not emit it; append it explicitly.
res += last_char + str(count)
print(res)  # a3b2c2a1

Related

Parsing string of numbers not registering 0 in Python3

I have this code to reformat a string in Python. So 'abc123' could become 'a1b2c3' or '1a2b3c' or '1b3a2c' and so on: any permutation where a letter follows a number and a number follows a letter. It works for every case except one, as far as I can tell. When a zero exists somewhere in the string, the program completely disregards it when parsing, as if it were a blank space. So if the input is '0abc12' the output becomes 'a1b2c' when it should be something like '0a1b2c'. How can I fix this?
def reformat(s: str) -> str:
    nums = []
    chars = []
    new_str = ""
    for char in s:
        try:
            if int(char):
                nums.append(char)
        except ValueError:
            chars.append(char)
    if len(chars) == len(nums):
        for i in range(len(chars)):
            new_str += chars[i] + nums[i]
    elif len(chars) == len(nums) + 1:
        for i in range(len(nums)):
            new_str += chars[i] + nums[i]
        new_str += chars[-1]
    elif len(nums) == len(chars) + 1:
        for i in range(len(chars)):
            new_str += nums[i] + chars[i]
        new_str += nums[-1]
    else:
        new_str = ""
    return new_str
The issue is this line: if int(char):
int('0') evaluates to 0, and if 0: behaves like if False:, so nums.append(char) is skipped when the character is '0'.
Try removing the if statement and calling int(char) on its own line; the try/except clauses are enough to separate digits from letters.
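As an illustration of the fix, here is a minimal sketch that classifies characters with str.isdigit() instead of relying on the truthiness of int(char). It is a rewrite under my own assumptions, not the asker's exact code: it keeps the same rule (the longer group goes first, letters first on a tie).

```python
def reformat(s: str) -> str:
    # '0' is a digit like any other, so it is no longer dropped
    nums = [ch for ch in s if ch.isdigit()]
    chars = [ch for ch in s if not ch.isdigit()]
    if abs(len(nums) - len(chars)) > 1:
        return ""  # no valid alternation exists
    # the longer group must start; on a tie, letters go first as in the question
    first, second = (nums, chars) if len(nums) > len(chars) else (chars, nums)
    out = []
    for i, item in enumerate(first):
        out.append(item)
        if i < len(second):
            out.append(second[i])
    return "".join(out)

print(reformat("0abc12"))  # a0b1c2
```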

How do I run-length encode a pattern, rather than a character?

Here's my current RLE code:
import re

def decode(string):
    if string == '':
        return ''
    multiplier = 1
    count = 0
    rle_decoding = []
    rle_encoding = []
    rle_encoding = re.findall(r'[A-Za-z]|-?\d+\.\d+|\d+|[\w\s]', string)
    for item in rle_encoding:
        if item.isdigit():
            multiplier = int(item)
        elif item.isalpha() or item.isspace():
            while count < multiplier:
                rle_decoding.append('{0}'.format(item))
                count += 1
            multiplier = 1
            count = 0
    return(''.join(rle_decoding))

def encode(string):
    if string == '':
        return ''
    i = 0
    count = 0
    letter = string[i]
    rle = []
    while i <= len(string) - 1:
        while string[i] == letter:
            i += 1
            count += 1
            # catch the loop on the last character so it doesn't go back to the top and access out of bounds
            if i > len(string) - 1:
                break
        if count == 1:
            rle.append('{0}'.format(letter))
        else:
            rle.append('{0}{1}'.format(count, letter))
        if i > len(string) - 1: # ugly that I have to do it twice
            break
        letter = string[i]
        count = 0
    final = ''.join(rle)
    return final
The code might have been garbled when I removed all my comments, but the current code isn't too important. The problem is that I am running RLE on hexadecimal values that have all been converted to letters, so that 0-9 becomes g-p. There are a lot of patterns like 'kjkjkjkjkjkjkjkjlmlmlmlmlmlmlm' which don't compress at all, because they are not runs of a single character. How would I, if it is even possible, run my program so that it encodes patterns as well?
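One possible approach, sketched here under stated assumptions rather than as a definitive answer: a regular expression with a backreference can find the shortest unit that repeats immediately, and each such run can be written as count(unit). The function names and the count(unit) notation are my own, and the sketch assumes the input contains no digits or parentheses (which holds for the letters-only data described above).

```python
import re

def encode_patterns(s: str) -> str:
    # (.+?) lazily picks the shortest unit; \1+ requires at least one immediate repeat
    return re.sub(
        r"(.+?)\1+",
        lambda m: f"{len(m.group(0)) // len(m.group(1))}({m.group(1)})",
        s,
    )

def decode_patterns(s: str) -> str:
    # expand every count(unit) group back into the repeated unit
    return re.sub(r"(\d+)\(([^()]*)\)", lambda m: m.group(2) * int(m.group(1)), s)

data = "kjkjkjkjkjkjkjkjlmlmlmlmlmlmlm"
packed = encode_patterns(data)
print(packed)
print(decode_patterns(packed) == data)
```

The matching is greedy from left to right, so it does not always find the most compact encoding, and the backtracking can be slow on long inputs.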

Is there any problem with my code here? I am having trouble with the result

Here is my code: I'd like to decode the string from "3[b2[ca]]" to "bcacabcacabcaca". But my result is "[[[". Can someone help me with that?
def decompression(text):
intStack = []
charStack = []
temp = ""
result = ""
for i in range(len(text)):
if text[i].isdigit():
times = 0
while text[i].isdigit():
times = times * 10 + int(text[i])
i += 1
i -= 1
intStack.append(times)
elif text[i] == ']':
temp = ""
times = 0
if len(intStack) > 0:
times = intStack[-1]
intStack.pop()
while len(charStack) > 0 and charStack[-1] != '[':
temp += charStack[-1]
charStack.pop()
if len(charStack) > 0 and charStack[-1] == '[':
charStack.pop()
for j in range(times):
result += temp
for j in range(len(result)):
charStack.append(result[j])
result = ""
elif text[i] == '[':
if text[i-1].isdigit():
charStack.append(text[i])
else:
charStack.append(text[i])
intStack.append(1)
else:
charStack.append(text[1])
while len(charStack) != 0:
result += charStack[-1]
charStack.pop()
return result
print(decompression("3[b2[ca]]"))
I keep getting the wrong answer. I've checked the code several times and tested it in other online editors, but I still cannot figure out what's wrong with it.
There is a small typo in the last else statement. It says
else:
    charStack.append(text[1])
this should be
else:
    charStack.append(text[i])
Also, you should indent everything in the function, otherwise Python won't know it's part of the function.
Fixing the typo alone is not enough, though. Characters popped from charStack come off in reverse order, so temp has to be rebuilt by prepending, and the final result has to be read from the bottom of the stack upwards. Reassigning i inside a for loop also has no effect on the next iteration, so multi-digit counts need a while loop. With those changes:
def decompression(text):
    intStack = []
    charStack = []
    i = 0
    while i < len(text):
        if text[i].isdigit():
            # read a (possibly multi-digit) repeat count
            times = 0
            while i < len(text) and text[i].isdigit():
                times = times * 10 + int(text[i])
                i += 1
            intStack.append(times)
            continue  # i already points past the number
        elif text[i] == ']':
            times = intStack.pop() if intStack else 1
            temp = ""
            while len(charStack) > 0 and charStack[-1] != '[':
                # prepend, so the characters keep their original order
                temp = charStack.pop() + temp
            if len(charStack) > 0:
                charStack.pop()  # discard the matching '['
            charStack.extend(temp * times)
        elif text[i] == '[':
            charStack.append(text[i])
            if i == 0 or not text[i-1].isdigit():
                intStack.append(1)  # a bracket without a count repeats once
        else:
            charStack.append(text[i])
        i += 1
    # read the stack bottom-up instead of popping, which would reverse it
    return "".join(charStack)

print(decompression("3[b2[ca]]"))  # bcacabcacabcaca
In addition to the answers and comments above: your logic for building intStack doesn't work when the integers have more than one digit. For your sample input 3[b2[ca]] intStack would be [3, 2], which is fine, but for the input 33[b22[c6[a]]] intStack should be [33, 22, 6], and your logic fails in that case.
Try this for finding intStack:
intStack = []
index = []  # positions already consumed as part of a number
for i in range(len(text)):
    if i not in index:
        if text[i].isdigit():
            times = ""
            while i < len(text) and text[i].isdigit():
                times += text[i]
                index.append(i)
                i += 1
            intStack.append(int(times))
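If only the list of repeat counts is needed, a shorter alternative (swapping in the standard re module for the manual scan, with a helper name of my own) is to let a regular expression collect each run of digits:

```python
import re

def repeat_counts(text: str) -> list:
    # \d+ matches each maximal run of digits, so multi-digit counts stay together
    return [int(n) for n in re.findall(r"\d+", text)]

print(repeat_counts("33[b22[c6[a]]]"))  # [33, 22, 6]
```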
While this is not best practice (it relies on eval() and a bare except, so only use it on trusted input), it works:
text = "3[b2[ca]]"
list = list(text)
try:
    for i in range(len(list)*2):
        if list[i] == '[':
            list[i] = '*('
        if list[i] == ']':
            list[i] = ')'
        if list[i].isalpha() and list[i+1].isdigit():
            list.insert(i+1,"+")
        if list[i].isalpha() and list[i+1].isalpha():
            list.insert(i+1,"+")
        if list[i].isalpha():
            globals()[list[i]] = str(list[i])
except:
    pass
print(list)
result = ''.join(list)
print(result)
print(eval(result))
Basically, what it does is:
takes the text and transforms it into a list;
iterates over double the length of the list (we insert into the list during the loop itself, hence the doubled range);
the whole iteration is inside a try: except: pass so that we ignore the IndexError: list index out of range error;
during the iteration, we do some find and replace, gradually transforming the string into a math expression: [...] is replaced with *(...), and a + sign is inserted wherever a letter is followed by a number or by another letter;
finally, using the globals() builtin function (https://docs.python.org/3/library/functions.html#globals) we take each letter from the list and declare it as a variable whose value is the letter itself as a string;
all we have to do now is join the elements of the list and call eval() (https://docs.python.org/3/library/functions.html#eval) on the resulting string:
>>> print(text)
3[b2[ca]]
>>> print(list)
['3', '*(', 'b', '+', '2', '*(', 'c', '+', 'a', ')', ')']
>>> result = ''.join(list)
>>> print(result)
3*(b+2*(c+a))
>>> print(eval(result))
bcacabcacabcaca

How to fix 'String index out of range' error

I am trying to write code that replaces repeating symbols in a string with the symbol and the number of its repeats (like this: "aaaaggggtt" --> "a4g4t2"). But I'm getting a string index out of range error.
seq = input()
i = 0
j = 1
v = 1
while j<=len(seq)-1:
    if seq[i] == seq[j]:
        v += 1
        i += 1
        j += 1
    elif seq[i] != seq[j]:
        seq.replace(seq[i-v:j], seq[i] + str(v))
        v = 1
        i += 1
        j += 1
print(seq)
line 6, in <module>
    if seq[i] == seq[j]:
IndexError: string index out of range
UPD: After changing len(seq) to len(seq)-1 there is no more string index error, but the code still doesn't work.
Input: aaaaggggtt
Output:aaaaggggtt (same)
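A note on why the output is unchanged: Python strings are immutable, and str.replace returns a new string instead of modifying seq, so its result is lost unless it is assigned. A small illustration (the variable names are only for this example):

```python
seq = "aaaaggggtt"
seq.replace("aaaa", "a4")        # the result is discarded; seq is unchanged
unchanged = seq
seq = seq.replace("aaaa", "a4")  # rebinding seq keeps the result
print(unchanged)  # aaaaggggtt
print(seq)        # a4ggggtt
```

Reassigning seq inside the loop would also shift the positions that i and j refer to, which is one reason to build the result in a separate string.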
You can iterate over the string, keeping a running counter, and build the result as you go:
s = 'aaaaggggtt'
res = ''
counter = 1
# Iterate over the string
for idx in range(len(s)-1):
    # If the character changes
    if s[idx] != s[idx+1]:
        # Append the character and its counter, then reset the counter
        res += s[idx]+str(counter)
        counter = 1
    else:
        # Else increment the counter
        counter += 1
# Append the last character and its counter
res += s[-1]+str(counter)
print(res)
Or you can approach this using itertools.groupby
from itertools import groupby
s = 'aaaaggggtt'
#Count numbers and associated length in a list
res = ['{}{}'.format(model, len(list(group))) for model, group in groupby(s)]
#Convert list to string
res = ''.join(res)
print(res)
The output will be
a4g4t2
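A simpler way, but only when every character occurs in a single run: str.count counts all occurrences in the string, so 'aaabbcca' would give a4b2c2 rather than a3b2c2a1.
str1 = 'aaaaggggtt'
res = ''
# dict.fromkeys keeps first-occurrence order, unlike set, whose iteration order is arbitrary
for ch in dict.fromkeys(str1):
    res += ch + str(str1.count(ch))
print(res)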

Find the longest substring in alphabetical order

I have this code that I found on another topic, but it sorts the substring by contiguous characters and not by alphabetical order. How do I correct it for alphabetical order? It prints out lk, and I want to print ccl. Thanks
ps: I'm a beginner in python
s = 'cyqfjhcclkbxpbojgkar'
from itertools import count

def long_alphabet(input_string):
    maxsubstr = input_string[0:0] # empty slice (to accept subclasses of str)
    for start in range(len(input_string)): # O(n)
        for end in count(start + len(maxsubstr) + 1): # O(m)
            substr = input_string[start:end] # O(m)
            if len(set(substr)) != (end - start): # found duplicates or EOS
                break
            if (ord(max(sorted(substr))) - ord(min(sorted(substr))) + 1) == len(substr):
                maxsubstr = substr
    return maxsubstr

bla = (long_alphabet(s))
print("Longest substring in alphabetical order is: %s" % bla)
s = 'cyqfjhcclkbxpbojgkar'
r = ''  # longest run found so far
c = ''  # current run
for char in s:
    if c == '' or c[-1] <= char:
        c += char
    else:
        if len(r) < len(c):
            r = c
        c = char
if len(c) > len(r):
    r = c
print(r)
Try changing this:
    if len(set(substr)) != (end - start): # found duplicates or EOS
        break
    if (ord(max(sorted(substr))) - ord(min(sorted(substr))) + 1) == len(substr):
to this:
    if len(substr) != (end - start): # reached EOS
        break
    if sorted(substr) == list(substr):
That will display ccl for your example input string. The code is simpler because you're trying to solve a simpler problem :-)
You can improve your algorithm by noticing that the string can be broken into runs of ordered substrings of maximal length. Any ordered substring must be contained in one of these runs.
This allows you to iterate through the string just once, in O(n):
def longest_substring(string):
    curr, subs = '', ''
    for char in string:
        if not curr or char >= curr[-1]:
            curr += char
        else:
            # the run ended: keep the longest so far and start a new run at char
            curr, subs = char, max(curr, subs, key=len)
    return max(curr, subs, key=len)
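To make the "runs of maximal length" idea concrete, here is a small sketch that lists the runs explicitly (ordered_runs is a hypothetical helper name, not part of the answer above). The longest run is then the answer.

```python
def ordered_runs(s: str) -> list:
    runs = []
    start = 0
    for i in range(1, len(s) + 1):
        # a run ends at the end of the string or where the order breaks
        if i == len(s) or s[i] < s[i - 1]:
            runs.append(s[start:i])
            start = i
    return runs

runs = ordered_runs('cyqfjhcclkbxpbojgkar')
print(runs)
print(max(runs, key=len))  # ccl
```

Every character lands in exactly one run, so joining the runs gives back the original string.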
s = 'cyqfjhcclkbxpbojgkar'
longest = ""  # current run
best = ""     # longest run so far (renamed from max to avoid shadowing the builtin)
for i in range(len(s) - 1):
    if s[i] <= s[i+1]:
        longest = longest + s[i]
        if i == len(s) - 2:
            longest = longest + s[i+1]
    else:
        longest = longest + s[i]
        if len(longest) > len(best):
            best = longest
        longest = ""
if len(s) == 1:
    longest = s
if len(longest) > len(best):
    print("Longest substring in alphabetical order is: " + longest)
else:
    print("Longest substring in alphabetical order is: " + best)
Instead of importing count from itertools, you can define an equivalent generator yourself:
def loops(I=0, S=1):
    n = I
    while True:
        yield n
        n += S
This generator supplies the end index for each candidate substring during the analysis.
Now look at the anallize method (based on spacegame's question and Mr. Tim Petters' suggestion):
def anallize(inStr):
    # empty slice, so that maxStr supports the native str methods
    maxStr = inStr[0:0]
    # outer loop: start index of the candidate substring
    for i in range(len(inStr)):
        # inner loop: end index, starting just past the current best length,
        # i.e. i + len(maxStr) + 1
        for o in loops(i + len(maxStr) + 1):
            # candidate substring from the start index (i) to the end index (o)
            newStr = inStr[i:o]
            if len(newStr) != (o - i):  # ran past the end of the string
                break
            if sorted(newStr) == list(newStr):  # the substring is already in order
                # a longer ordered substring becomes the new answer
                maxStr = newStr
    # return the longest ordered substring found
    return maxStr
Finally, for testing or execution purposes:
# for execution purposes of the exercise:
s = "azcbobobegghakl"
print("Longest substring in alphabetical order is: " + anallize(s))
The strength of this approach, started by spacegame and refined by Mr. Tim Petters, is its use of the native str methods and the reusability of the code.
The answer is:
Longest substring in alphabetical order is: beggh
In Python, characters can be compared directly: 'a' > 'b' gives False and 'b' > 'a' gives True.
Using this, the longest substring in alphabetical order can be found with the following algorithm:
def comp(a, b):
    return a <= b

s = input("Enter the required string: ")
final = []
temp = []
for i in range(len(s) - 1):
    if comp(s[i], s[i+1]):
        if temp == []:
            temp.append(s[i])
        temp.append(s[i+1])
        final.append(temp)
    else:
        if temp == []:
            temp.append(s[i])
            final.append(temp)
        temp = []
lengths = [len(el) for el in final]
print(lengths)
print(final)
lngStr = ''.join(final[lengths.index(max(lengths))])
print("Longest substring in alphabetical order is: " + lngStr)
Use a list and the max function to reduce the code drastically.
actual_string = 'azcbobobegghakl'
strlist = []
i = 0
while i < len(actual_string) - 1:
    substr = ''
    # >= rather than >, so repeated letters such as 'gg' stay in the same run
    while actual_string[i + 1] >= actual_string[i]:
        substr += actual_string[i]
        i += 1
        if i > len(actual_string) - 2:
            break
    substr += actual_string[i]
    i += 1
    strlist.append(substr)
print(max(strlist, key=len))
Wow, some really impressive code snippets here...
I want to add my solution, as I think it's quite clean:
s = 'cyqfjhcclkbxpbojgkar'
res = ''
tmp = ''
for i in range(len(s)):
    tmp += s[i]
    if len(tmp) > len(res):
        res = tmp
    if i > len(s)-2:
        break
    if s[i] > s[i+1]:
        tmp = ''
print("Longest substring in alphabetical order is: {}".format(res))
Without using a library, but using the function ord(), which returns the code point of a character (its ASCII value, for ASCII characters).
Assumption: input will be in lowercase, and no special characters are used
s = 'azcbobobegghakl'
longest = ''
for i in range(len(s)):
    temp_longest = s[i]
    for j in range(i+1, len(s)):
        if ord(s[i]) <= ord(s[j]):
            temp_longest += s[j]
            # keep s[i] pointing at the previous character; the outer for resets i
            i += 1
        else:
            break
    if len(temp_longest) > len(longest):
        longest = temp_longest
print(longest)
Slightly different implementation, building up a list of all substrings in alphabetical order and returning the longest one:
def longest_substring(s):
    in_orders = ['' for i in range(len(s))]
    index = 0
    for i in range(len(s)):
        # s[i] always belongs to the current run
        in_orders[index] += s[i]
        # start a new run when the next character breaks the order;
        # the bounds check avoids reading s[i + 1] past the end of the string
        if i < len(s) - 1 and s[i] > s[i + 1]:
            index += 1
    return max(in_orders, key=len)
s = "azcbobobegghakl"
ls = ""
for i in range(len(s)):
    ks = s[i]  # a single character is always in order
    j = 2
    while i + j <= len(s):
        ss = s[i:i+j]
        if ''.join(sorted(ss)) == ss:
            ks = ss
            j += 1
        else:
            break
    if len(ks) > len(ls):
        ls = ks
print("The longest substring in alphabetical order is " + ls)
This worked for me:
s = 'cyqfjhcclkbxpbojgkar'
lstring = s[0]
slen = 1
for i in range(len(s)):
    for j in range(i, len(s)-1):
        if s[j+1] >= s[j]:
            if (j+1)-i+1 > slen:
                lstring = s[i:(j+1)+1]
                slen = (j+1)-i+1
        else:
            break
print("Longest substring in alphabetical order is: " + lstring)
Output: Longest substring in alphabetical order is: ccl
input_str = "cyqfjhcclkbxpbojgkar"
length = len(input_str)  # number of characters still to process
pos = 0  # start of the current substring (renamed from iter, which shadows a builtin)
result_str = ''  # latest processed substring
longest = ''  # longest substring in alphabetical order
while length > 1:  # loop until all characters are processed
    count = 1
    key = input_str[pos]  # last checked character
    result_str += key  # start of the new substring
    for i in range(pos+1, len(input_str)):  # skip already processed characters
        length -= 1
        if key <= input_str[i]:  # still in alphabetical order
            key = input_str[i]
            result_str += key
            count += 1
        else:
            if len(longest) < len(result_str):
                longest = result_str
            result_str = ''  # reset for the next substring
            pos += count  # move to the character that broke the order
            break
    if length == 1:  # last iteration of the while loop
        if len(longest) < len(result_str):
            longest = result_str
print(longest)
Finding the longest substring in alphabetical order in Python.
In the Python shell, 'a' < 'b' and 'a' <= 'a' are both True.
s = 'cyqfjhcclkbxpbojgkar'  # example input; s was not defined in the original snippet
result = ''
temp = ''
for char in s:
    if not temp or temp[-1] <= char:
        temp += char
    else:
        if len(result) < len(temp):
            result = temp
        temp = char
if len(temp) > len(result):
    result = temp
print('Longest substring in alphabetical order is:', result)
s = input()
temp = s[0]
output = s[0]
for i in range(len(s)-1):
    if s[i] <= s[i+1]:
        temp = temp + s[i+1]
        if len(temp) > len(output):
            output = temp
    else:
        temp = s[i+1]
print('Longest substring in alphabetic order is: ' + output)
I had a similar question on one of the edX online tests. I spent 20 minutes brainstorming and couldn't find a solution, but then the answer came to me, and it is very simple. What tripped me up in other solutions: the scan must not stop at repeated characters or require unique values. For the edX string s = 'azcbobobegghakl' it should output 'beggh', not 'begh' (unique set) or 'kl' (the longest run of consecutive alphabet letters). Here is my answer, and it works:
s = 'azcbobobegghakl'
n = 0
longest_try = ''
# the window s[n:i] never shrinks: it grows while sorted and slides otherwise
for i in range(1, len(s) + 1):  # len(s) + 1 so the last character is included
    if s[n:i] == ''.join(sorted(s[n:i])):
        longest_try = s[n:i]
    else:
        n += 1
print(longest_try)
In some cases, input is in mixed characters like "Hello" or "HelloWorld"
**Condition 1:** order determination is case insensitive, i.e. the string "Ab" is considered to be in alphabetical order.
**Condition 2:** You can assume that the input will not have a string where the number of possible consecutive sub-strings in alphabetical order is 0, i.e. the input will not have a string like " zxec ".
string = "HelloWorld"
s = string.lower()
r = ''
c = ''
last = ''
for char in s:
    if c == '' or c[-1] <= char:
        c += char
    else:
        if len(r) < len(c):
            r = c
        c = char
if len(c) > len(r):
    r = c
# restore the original case of each character in the result
for i in r:
    if i in string:
        last = last + i
    else:
        last = last + i.upper()
if len(r) == 1:
    print(0)
else:
    print(last)
Out: elloW
```python
s = "cyqfjhcclkbxpbojgkar"  # change this to any word
word, temp = "", s[0]  # temp = s[0] to handle the fence-post problem
for i in range(1, len(s)):  # start from 1, not 0, because the first char is already in temp
    x = temp[-1]  # last char in temp
    y = s[i]  # current char
    if x <= y:
        temp += s[i]
    elif x > y:
        if len(temp) > len(word):  # store the longest substring so temp can start another one
            word = temp
        temp = s[i]  # reset temp to the current char
if len(temp) > len(word):
    word = temp
print("Longest substring in alphabetical order is:", word)
```
My code stores the current substring in the variable temp and compares each character of the string in the for loop with the last character in temp (temp[-1]). If the character is greater than or equal to temp[-1], it is appended to temp. If it is smaller, the code keeps whichever of word and temp is longer in word, then resets temp to start a new substring, and continues until the last character of the string.
s = 'cyqfjhcclkbxpbojgkar'
long_sub = ''  # longest substring
temp = ''  # temporarily holds the current substring
if len(s) == 1:  # if only one character
    long_sub = s
else:
    for i in range(len(s) - 1):
        index = i
        temp = s[index]
        while index < len(s) - 1:
            if s[index] <= s[index + 1]:
                temp += s[index + 1]
            else:
                break
            index += 1
        if len(temp) > len(long_sub):
            long_sub = temp
        temp = ''
print(long_sub)
For completeness, I'm also adding this snippet based on regular expressions. It's hard-coded and looks clunky. On the other hand, it seems to be the shortest and simplest answer to this problem, and it is also among the most efficient in terms of runtime (see the graph at the link below).
import re

def longest_substring(s):
    substrings = re.findall('a*b*c*d*e*f*g*h*i*j*k*l*m*n*o*p*q*r*s*t*u*v*w*x*y*z*', s)
    return max(substrings, key=len)
(Unfortunately, I'm not allowed to paste a graph here as a "newbie".)
Source + Explanation + Graph: https://blog.finxter.com/python-how-to-find-the-longest-substring-in-alphabetical-order/
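The hard-coded pattern can also be generated, which may make the regex approach a little less clunky. This is a sketch under the same assumption as the answer above (lowercase a-z input); the PATTERN name is my own.

```python
import re
import string

# builds 'a*b*c*...z*' from the lowercase alphabet instead of typing it out
PATTERN = re.compile("".join(c + "*" for c in string.ascii_lowercase))

def longest_substring(s: str) -> str:
    # findall returns every maximal ordered run (plus empty matches); keep the longest
    return max(PATTERN.findall(s), key=len)

print(longest_substring('cyqfjhcclkbxpbojgkar'))  # ccl
```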
Another way:
s = input("Please enter a sentence: ")
count = 0
maxcount = 0
result = 0
for char in range(len(s)-1):
    if s[char] <= s[char+1]:
        count += 1
        if count > maxcount:
            maxcount = count
            result = char + 1
    else:
        count = 0
startposition = result - maxcount
print("Longest substring in alphabetical order is: ", s[startposition:result+1])
