remove substring from string using python

I would like to remove the last two substrings (the dot-separated segments) from a string, as in the following examples:
str="Dev.TTT.roker.{i}.ridge.{i}."
str1="Dev.TTT.roker.{i}.ridge.{i}.obj."
If a {i} appears between the dots within those last two segments, it has to be removed as well.
So the result of the Python script should look like this:
The expected result for str is: Dev.TTT.
The expected result for str1 is: Dev.TTT.roker.{i}.

You can simply split on . and skip any empty string or {i}.
Also, do not use the name of a built-in as a variable name. In your case, don't use str as a variable name.
def solve(s):
    x = s.split('.')
    cnt = 2
    l = len(x) - 1
    while cnt and l:
        if x[l] == '' or x[l] == '{i}':
            l -= 1
            continue
        else:
            cnt -= 1
        l -= 1
    return '.'.join(x[:l+1]) + '.'
str1="Dev.TTT.roker.{i}.ridge.{i}."
str2="Dev.TTT.roker.{i}.ridge.{i}.obj."
print(solve(str1))
print(solve(str2))
output:
Dev.TTT.
Dev.TTT.roker.{i}.
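The same idea can be sketched with the number of segments and the placeholder made configurable. The name strip_last_segments and its count and placeholder parameters are made up for illustration; they are assumptions, not part of the answer above.

```python
def strip_last_segments(path, count=2, placeholder="{i}"):
    """Drop the last `count` real segments of a dot-separated path.

    Placeholder segments met on the way are dropped too, but they
    do not count towards `count` (hypothetical helper, for illustration).
    """
    # Ignore the empty strings produced by leading/trailing dots.
    parts = [p for p in path.split(".") if p]
    removed = 0
    while parts and removed < count:
        segment = parts.pop()
        if segment != placeholder:
            removed += 1
    # Restore the trailing dot used by the original strings.
    return (".".join(parts) + ".") if parts else ""


print(strip_last_segments("Dev.TTT.roker.{i}.ridge.{i}."))
print(strip_last_segments("Dev.TTT.roker.{i}.ridge.{i}.obj."))
```

For the two example strings this prints Dev.TTT. and Dev.TTT.roker.{i}. respectively.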

Related

Creating string based on first letters of each element of the list

Example:
list = [abcc, typpaw, gfssdwww]
expected result = atgbyfcpscpsadwwww
Any ideas?
This is what I have made so far:
def lazy_scribe(sources: list):
    result: str = ''
    i = 0
    while i < len(max(sources, key=len)):
        for source in sources:
            for char in source:
                if i <= len(source):
                    result = result + source[int(i)]
                else:
                    continue
                i += 1 / (len(sources))
                break
    return result
sources = ["python", "java", "golang"]
print(lazy_scribe(sources))
print(len(sources))
Result: "pjgyaoyvlhaaononngn". I don't know why there is a "y" instead of a "t" (7th character in the result string).

If I understand the problem correctly, this should work.
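# named words rather than list, so the built-in list is not shadowed
words = ["abcc", "typpaw", "gfssdwww"]
max_len = len(max(words, key=len))
res = ""
char_iterator = 0
while char_iterator < max_len:
    for word in words:
        # skip words that are too short to have a character at this index
        if char_iterator < len(word):
            res += word[char_iterator]
    char_iterator += 1
print(res)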

Another possible solution is as follows:
l = ['abcc', 'typpaw', 'gfssdwww']
max_len = len(max(l, key=len))
padded_l = list(zip(*[e + " " * (max_len - len(e)) for e in l]))
''.join([''.join(e) for e in padded_l]).replace(' ', '')
This works as follows:
Find the longest string in the list.
Pad all the strings in the list with blank spaces up to that length.
Use zip on the resulting list.
Join the elements and remove the blank spaces to get the desired result.
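A variant that avoids the padding step is sketched below, using itertools.zip_longest from the standard library with an empty-string fill value. The function name interleave is made up for illustration. Unlike the space-padding approach, it also keeps any spaces that are part of the input strings.

```python
from itertools import zip_longest


def interleave(words):
    # zip_longest yields one tuple per character position; strings that
    # are too short contribute the fill value "" at that position.
    columns = zip_longest(*words, fillvalue="")
    return "".join("".join(column) for column in columns)


print(interleave(["abcc", "typpaw", "gfssdwww"]))  # atgbyfcpscpsadwwww
```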

Replace all occurrences of the substring in string using string slicing

I want to replace all substring occurrences in a string, but I wish not to use the replace method. At the moment, experiments have led me to this:
def count_substrings_and_replace(string, substring, rpl=None):
    string_size = len(string)
    substring_size = len(substring)
    count = 0
    _o = string
    for i in range(0, string_size - substring_size + 1):
        if string[i:i + substring_size] == substring:
            if rpl:
                print(_o[:i] + rpl + _o[i + substring_size:])
            count += 1
    return count, _o
count_substrings_and_replace("aaabaaa", "aaa", "ddd")
But I get output like this:
dddbaaa
aaabddd
not dddbddd.
Update 1:
I figured out that the replacement only comes out right when rpl has the same length as substring. For example, for count_substrings_and_replace("aaabaaa", "aaa", "d") I got the output (2, 'dbaad'), not dbd.
Update 2:
The issue described in Update 1 appeared because the comparison was made against the original string (line 8), which does not change throughout the process.
Fixed:
def count_substrings_and_replace(string, substring, rpl=None):
    string_size = len(string)
    substring_size = len(substring)
    count = 0
    _o = string
    for i in range(0, string_size - substring_size + 1):
        # compare against _o, the string as modified so far
        if _o[i:i + substring_size] == substring:
            if rpl:
                _o = _o[:i] + rpl + _o[i + substring_size:]
            count += 1
    return count, _o
count_substrings_and_replace("aaabaaa", "aaa", "d")
Output: (2, 'dbd')

You never update the value of _o when a match is found; you're only printing what it would look like if the replacement were made. Instead, the innermost if statement should contain two lines like:
_o = _o[:i] + rpl + _o[i + substring_size:]
print(_o)
That would print the string every time a match is found and replaced. Moving the print statement after the for loop would make it run only once, after the entire string has been processed and replaced appropriately.

Just my mistake. I had to assign the value to the variable on each iteration instead of printing it:
_o = _o[:i] + rpl + _o[i + substring_size:]
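Since the question avoids str.replace, another way to do it is sketched here: scan the original string once and collect the output pieces in a list, so indices never shift while the result is being built. The function name replace_all and the choice to reject an empty search string are assumptions made for this sketch.

```python
def replace_all(text, old, new):
    """Return (count, result) with every non-overlapping `old` replaced."""
    if not old:
        raise ValueError("old must be a non-empty string")
    pieces = []
    count = 0
    i = 0
    while i < len(text):
        if text[i:i + len(old)] == old:
            # Match: emit the replacement and jump past the matched text.
            pieces.append(new)
            count += 1
            i += len(old)
        else:
            # No match here: copy one character and move on.
            pieces.append(text[i])
            i += 1
    return count, "".join(pieces)


print(replace_all("aaabaaa", "aaa", "d"))    # (2, 'dbd')
print(replace_all("aaabaaa", "aaa", "ddd"))  # (2, 'dddbddd')
```

Because matching always happens against the untouched input, the replacement can be shorter or longer than the substring without affecting later matches.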

How to count the number of triplets found in a string?

string1 = "abbbcccd"
string2 = "abbbbdccc"
How do I find the number of triplets in a string? A triplet is a character that appears 3 times in a row. Triplets can also overlap, for example in string2 (abbbbdccc).
The output should be:
2 <-- string 1
3 <-- string 2
I'm new to Python and Stack Overflow, so any help or advice on question writing would be much appreciated.

Try iterating through the string with a while loop, checking whether the current character and the two characters after it are the same. This works for overlapping triplets as well.
string1 = "abbbcccd"
string2 = "abbbbdccc"
string3 = "abbbbddddddccc"
def triplet_count(string):
    it = 0 # iterator of string
    cnt = 0 # count of number of triplets
    while it < len(string) - 2:
        if string[it] == string[it + 1] == string[it + 2]:
            cnt += 1
        it += 1
    return cnt
print(triplet_count(string1)) # returns 2
print(triplet_count(string2)) # returns 3
print(triplet_count(string3)) # returns 7

This simple script should work...
my_string = "aaabbcccddddd"
# Some required variables
old_char = None
two_in_a_row = False
triplet_count = 0
# Iterates through characters of a given string
for char in my_string:
    # Checks if previous character matches current character
    if old_char == char:
        # Checks if there already has been two in a row (and hence now a triplet)
        if two_in_a_row:
            triplet_count += 1
        two_in_a_row = True
    # Resets the two_in_a_row boolean variable if there's a non-match.
    else:
        two_in_a_row = False
    old_char = char
print(triplet_count) # prints 5 for the example my_string I've given
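Overlapping triplets can also be counted with a regular expression, sketched here with a lookahead so that each match consumes no characters. The function name count_triplets is made up for illustration; note that . does not match a newline unless re.DOTALL is passed.

```python
import re


def count_triplets(text):
    # (?=...) is a lookahead: it tests for a match without consuming
    # characters, so the search resumes at the very next position and
    # overlapping triplets are all found.
    # (.)\1\1 means "any character followed by two more copies of itself".
    return len(re.findall(r"(?=(.)\1\1)", text))


print(count_triplets("abbbcccd"))   # 2
print(count_triplets("abbbbdccc"))  # 3
```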

Given a string containing uppercase alphabets (A-Z), compress the string using Run Length encoding

Given a string containing uppercase alphabets (A-Z), compress the string using Run Length encoding. Repetition of character has to be replaced by storing the length of that run.
I tried the following codes
#Code 1: Tried on my own
def encode(message):
    list1=[]
    for i in range (0,len(message)):
        count = 1
        while(i < len(message)-1 and message[i]==message[i+1]):
            count+=1
            i+=1
        list1=str(count)+message[i]
    return list1
encoded_message=encode("ABBBBCCCCCCCCAB")
print(encoded_message)
Input:AAAABBBBCCCCCCCC
Expected Output: 4A4B8C
#code 2:I tried this by looking at another code based on run-length encoding
def encode(message):
    list1=[]
    count=1
    for i in range (1,len(message)):
        if(message[i]==message[i-1]):
            count+=1
        else:
            list1.append((count,list1[i-1]))
            count=1
        if i == len(messege) - 1 :
            list1.append((count , data[i]))
    return list1
encoded_message=encode("ABBBBCCCCCCCCAB")
print(encoded_message)
Input:AAAABBBBCCCCCCCC
Expected Output: 4A4B8C
The first code gives output as 2B

def encode(message):
    pairs = []
    for char in message:
        if len(pairs) > 0:
            if pairs[-1][0] == char:
                pairs[-1] = (char, pairs[-1][1] + 1)
            else:
                pairs.append((char, 1))
        else:
            pairs.append((char, 1))
    strings = []
    for letter, count in pairs:
        strings.append(f"{count}{letter.upper()}")
    return "".join(strings)
print(encode("ABBBBCCCCCCCCAB"))
print(encode("AAAABBBBCCCCCCCC"))
This outputs:
1A4B8C1A1B
4A4B8C

This is a very good use for the groupby function from itertools:
from itertools import groupby
message = 'AAAABBBBCCCCCCCC'
''.join('{}{}'.format(len(list(g)), c) for c, g in groupby(message))

Based on your code #2, I have tweaked it to give the output you have in Expected Output: 4A4B8C.
Basically, you are appending tuples to a list, so you need to build a string instead and add to it. You are also using data, but there is no data variable, and you want to look up the character in message, not in your list. So the code would be:
def encode2(message):
    encoded_return_message = ""
    count=1
    for i in range (1,len(message)):
        if(message[i]==message[i-1]):
            count+=1
        else:
            encoded_return_message += (f'{count}{message[i-1]}')
            count=1
        if i == len(message) - 1 :
            encoded_return_message +=(f'{count}{message[i]}')
    return encoded_return_message
encoded_message=encode2("ABBBBCCCCCCCCAB")
print(str(encoded_message))
I also did a demo on Repl.it
https://repl.it/repls/RowdyFloralwhiteBlockchain

Personally, I would do this task using the re module in the following way:
import re
text = 'AAAABBBBCCCCCCCC'
def sub_function(m):
    span = m.span()
    return f"{span[1]-span[0]}"+m.groups()[0]
out = re.sub(r'(\w)(\1*)',sub_function,text)
print(out)
Output:
4A4B8C
Explanation: the pattern in re.sub looks for a letter followed by 0 or more occurrences of the same letter; every such substring is then fed to sub_function, which calculates the overall length of the substring and returns that value concatenated with the first letter of the substring (which is the same as all the others). Note that I used a so-called f-string in my code, which is not available in older versions (I tested my code in Python 3.6.7); if you have to use an older version, you need to use another string formatting method. Note also that my code as is replaces a single letter with the digit 1 plus that letter, so for example the input ABC would result in 1A1B1C; if you wish to retain single letters without adding 1, then change the 1st argument of re.sub from r'(\w)(\1*)' to r'(\w)(\1+)'.
Though maybe now I am the guy with hammer seeing nails everywhere.
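The variant mentioned in the explanation, which leaves single letters untouched, might look like the sketch below. The names rle_keep_singles and replace_run are made up for illustration. As in the answer, \w also matches digits and underscores, which does not matter for input limited to A-Z.

```python
import re


def rle_keep_singles(text):
    def replace_run(match):
        # group(0) is the whole run, group(1) is its first character.
        return f"{len(match.group(0))}{match.group(1)}"

    # \1+ requires at least one repeat, so runs of length 1 never match
    # and are copied to the output unchanged.
    return re.sub(r"(\w)\1+", replace_run, text)


print(rle_keep_singles("ABBBBCCCCCCCCAB"))  # A4B8CAB
```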

def encode(message):
    count=0
    previous_char=message[0]
    result=''
    length=len(message)
    i=0
    while(i!=length):
        character=message[i]
        if previous_char==character:
            count=count+1
        else:
            result=result+str(count)+previous_char
            count=1
            previous_char=character
        i=i+1
    return result+str(count)+str(previous_char)
encoded_message=encode("ABBBBCCCCCCCCAB")
print(encoded_message)
Input is: ABBBBCCCCCCCCAB
Output is: 1A4B8C1A1B

def encodeString(s):
    encoded = ""
    ctr = 1
    for i in range(len(s)-1):
        if s[i]==s[i+1]:
            ctr += 1
        else:
            # the run ends at s[i]: emit its length and its character
            encoded = encoded + str(ctr) + s[i]
            ctr = 1
    # emit the final run, which always ends at the last character
    encoded = encoded + str(ctr) + s[-1]
    return encoded
Input: "AAAAABBCCDDAB"
Output: 5A2B2C2D1A1B

def encode(message):
    list1=[]
    count=1
    for i in range (1,len(message)):
        if(message[i].upper()==message[i-1].upper()):
            count+=1
        else:
            list1.append(f"{count}{message[i-1].upper()}")
            count=1
        if i == len(message) - 1 :
            list1.append(f"{count}{message[i].upper()}")
    return "".join(list1)
encoded_message=encode("ABBBBCCCCCCCCAB")
print(encoded_message)
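Some of the loop-based versions above return an empty string for a one-character input, or raise an error on an empty one. A minimal sketch that treats those cases explicitly could look like this; the name rle_encode and the decision to return an empty string for empty input are assumptions made here, not something the question specifies.

```python
def rle_encode(message):
    # Assumption: an empty message encodes to an empty string.
    if not message:
        return ""
    parts = []
    run_char = message[0]
    run_len = 1
    for char in message[1:]:
        if char == run_char:
            run_len += 1
        else:
            # The run ended: record it and start a new one.
            parts.append(f"{run_len}{run_char}")
            run_char = char
            run_len = 1
    # Record the final run, which the loop never flushes.
    parts.append(f"{run_len}{run_char}")
    return "".join(parts)


print(rle_encode("AAAABBBBCCCCCCCC"))  # 4A4B8C
print(rle_encode("A"))                 # 1A
```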

Changing version number to single digits python

I have a version number in a file like this:
Testing x.x.x.x
So I am grabbing it off like this:
import re
def increment(match):
    # convert the four matches to integers
    a,b,c,d = [int(x) for x in match.groups()]
    # return the replacement string
    return f'{a}.{b}.{c}.{d}'
lines = open('file.txt', 'r').readlines()
lines[3] = re.sub(r"\b(\d+)\.(\d+)\.(\d+)\.(\d+)\b", increment, lines[3])
I want to make it so that if the last digit is a 9, it changes to 0 and the previous digit is incremented by 1. So 1.1.1.9 changes to 1.1.2.0.
I did that by doing:
def increment(match):
    # convert the four matches to integers
    a,b,c,d = [int(x) for x in match.groups()]
    # return the replacement string
    if (d == 9):
        return f'{a}.{b}.{c+1}.{0}'
    elif (c == 9):
        return f'{a}.{b+1}.{0}.{0}'
    elif (b == 9):
        return f'{a+1}.{0}.{0}.{0}'
The issue occurs when it's 1.1.9.9 or 1.9.9.9, where multiple digits need to roll over. How can I handle this?

Use integer addition?
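def increment(match):
    # convert the four matches to integers
    a,b,c,d = [int(x) for x in match.groups()]
    # add 1 to the combined number; zfill(4) keeps leading zeros so the
    # unpacking below always has at least four digits (e.g. for 0.1.2.3)
    *a,b,c,d = [int(x) for x in str(a*1000 + b*100 + c*10 + d + 1).zfill(4)]
    a = ''.join(map(str,a)) # fix for 2 digit 'a'
    # return the replacement string
    return f'{a}.{b}.{c}.{d}'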

If each component of your version is never going to go beyond 9 (single digits), it is better to just convert it to an integer, increment it and then convert it back to a string.
This allows you to go up to as many version numbers as you require and you are not limited to thousands.
def increment(match):
    # match is a regex match object, so take the matched text first
    digits = match.group(0).replace('.', '')
    number = int(digits) + 1
    # zfill keeps leading zeros, e.g. 0.0.1.9 becomes 0.0.2.0
    output = '.'.join(str(number).zfill(len(digits)))
    return output

Add 1 to the last element. If it's more than 9, set it to 0 and do the same for the previous element. Repeat as necessary:
import re
def increment(match):
    # convert the four matches to integers
    g = [int(x) for x in match.groups()]
    # increment, last one first
    pos = len(g)-1
    g[pos] += 1
    while pos > 0:
        if g[pos] > 9:
            g[pos] = 0
            pos -= 1
            g[pos] += 1
        else:
            break
    # return the replacement string
    return '.'.join(str(x) for x in g)
print (re.sub(r"\b(\d+)\.(\d+)\.(\d+)\.(\d+)\b", increment, '1.8.9.9'))
print (re.sub(r"\b(\d+)\.(\d+)\.(\d+)\.(\d+)\b", increment, '1.9.9.9'))
print (re.sub(r"\b(\d+)\.(\d+)\.(\d+)\.(\d+)\b", increment, '9.9.9.9'))
Result:
1.9.0.0
2.0.0.0
10.0.0.0
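The same carry logic can be sketched on a plain version string, without a regex match object and for any number of components. The name bump_version and its base parameter are hypothetical, introduced only for this example; as in the answer above, the first component is allowed to grow past 9.

```python
def bump_version(version, base=10):
    """Add 1 to the last component and carry leftwards.

    Every component except the first wraps to 0 once it reaches `base`.
    """
    parts = [int(p) for p in version.split(".")]
    pos = len(parts) - 1
    parts[pos] += 1
    # Carry while a non-leading component has overflowed.
    while pos > 0 and parts[pos] >= base:
        parts[pos] = 0
        pos -= 1
        parts[pos] += 1
    return ".".join(str(p) for p in parts)


print(bump_version("1.1.1.9"))  # 1.1.2.0
print(bump_version("1.9.9.9"))  # 2.0.0.0
print(bump_version("9.9.9.9"))  # 10.0.0.0
```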
