Increasing integer after having found occurrence - python

I have a text with several occurrences of websites, let's say "www.test.com". I want to replace all these occurrences by "website_NR", where NR should start at 1.
Example:
text = "bbsjsddh www.test.com shduh sudhuhd sjdjsdh wiowqiuedl www.test.de uasuisdhckjdfh www.test.de sudhdhdhfh"
Now, every occurrence of "www.test.com" should be replaced by "website_1", "website_2", and so on...
I tried it with:
n = 0
if 'www.test.com' in text:
    n = n+1
    text = text.replace('www.test.com', 'website_'+str(n))
But this method only counts once: every occurrence of "www.test.com" is replaced by "website_1".

The function below replaces old with new + '_{}'.format(n) one occurrence at a time, for as long as old is in the string. It increments n on each iteration, so you get 'website_1', 'website_2', etc.
def func(s, old, new):
    n = 1
    while old in s:
        s = s.replace(old, new + '_{}'.format(n), 1)
        n += 1
    return s
text = "bbsjsddh www.test.com shduh sudhuhd sjdjsdh wiowqiuedl www.test.com uasuisdhckjdfh www.test.com sudhdhdhfh"
print(func(text, 'www.test.com', 'website'))
# bbsjsddh website_1 shduh sudhuhd sjdjsdh wiowqiuedl website_2 uasuisdhckjdfh website_3 sudhdhdhfh
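One thing to watch with the loop above: if the replacement text itself contains old (say old is 'web' and new is 'website'), the while condition never becomes false. A single-pass alternative could be sketched with re.sub and a counter. The name number_occurrences is made up for this example, and this is only one possible approach, not the answer's method:

```python
import itertools
import re


def number_occurrences(s, old, new):
    """Replace each occurrence of `old` with `new_1`, `new_2`, ... in one pass.

    `number_occurrences` is a hypothetical helper name used for illustration.
    """
    counter = itertools.count(1)  # yields 1, 2, 3, ...
    # re.escape makes `old` match literally (the dots in a URL are not wildcards).
    # The replacement callable is invoked once per match, left to right.
    return re.sub(re.escape(old), lambda match: f"{new}_{next(counter)}", s)


text = "aa www.test.com bb www.test.com cc www.test.com dd"
print(number_occurrences(text, "www.test.com", "website"))
# aa website_1 bb website_2 cc website_3 dd
```

Because the string is scanned only once, already-inserted replacements are never rescanned.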

Related

find set of words in list

In the example I made up below, I am trying to find the words STEM Employment in the text list. If I find that sequence of words in order, I would like the index of the first word, so I can use the same index for the width and height lists, since they are parallel (their lengths are always the same).
teststring = ("STEM Employment").split(" ")
data = {"text":["some","more","STEM","Employment","data"],
        "width":[100,45,50,90,354],
        "height":[500,320,320,432,554]}
So for this example, the answer would be 50 and 320, because the first word is STEM. However, I am not just looking for STEM; I have to make sure that Employment follows right after STEM in the list.
I tried writing a for loop for this, but my loop stops short when it confirms the first word STEM. I am not sure how to fix it:
testchecker = 0
for testword in range(len(data)):
    print(data["text"][testword])
    for m in teststring:
        # print(m)
        print(testchecker)
        if m in data["text"][testword]:
            print("true")
            testchecker = testchecker + 1
            if testchecker == len(teststring):
                print("match")
                print(testword-testchecker+1)
                pass
        else:
            testchecker = 0
You can make data["text"] a string with join and check for "STEM Employment" in that. Then find the index of "STEM".
teststring = "STEM Employment"
data = {"text":["some","more","STEM","Employment","data"],
        "width":[100,45,50,90,354],
        "height":[500,320,320,432,554]}
if teststring in " ".join(data["text"]):
    idx = data["text"].index(teststring.split(' ')[0])
    print(data["width"][idx], data["height"][idx])
Output:
50 320
Another option:
teststring = "STEM Employment".split(' ')
# Make sure all words in teststring are in data["text"]
if all(s in data["text"] for s in teststring):
    # Get the indexes of each word
    indexes = [data["text"].index(s) for s in teststring]
    # Make sure all indexes are sequential
    if all(b - a == 1 for a, b in zip(indexes, indexes[1:])):
        print(data["width"][indexes[0]], data["height"][indexes[0]])
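Both options rely on list.index, which returns only the first position of a word, so they can miss a match when STEM also appears earlier in the list without Employment after it. A sliding-window comparison avoids that. This is a minimal sketch: find_sequence is a made-up helper name, and the data below is a modified version of the example with an extra leading "STEM" (and invented width/height values for it) added to show the difference:

```python
def find_sequence(items, sequence):
    """Return the index in `items` where `sequence` starts, or -1 if absent.

    `find_sequence` is a hypothetical helper written for this example.
    """
    width = len(sequence)
    # Compare every window of len(sequence) consecutive items.
    for start in range(len(items) - width + 1):
        if items[start:start + width] == sequence:
            return start
    return -1


# Modified sample data (assumption): an extra "STEM" at index 0 that is
# NOT followed by "Employment".
data = {
    "text": ["STEM", "some", "more", "STEM", "Employment", "data"],
    "width": [10, 100, 45, 50, 90, 354],
    "height": [20, 500, 320, 320, 432, 554],
}

idx = find_sequence(data["text"], "STEM Employment".split())
if idx != -1:
    print(data["width"][idx], data["height"][idx])  # 50 320
```

Comparing whole words in list slices also avoids false positives from joined strings, such as "SYSTEM Employment" containing "STEM Employment".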

How to encode (replace) parts of word from end to beginning for some N value (like abcabcab to cbacbaba for n=3)?

I would like to create a program for encoding and decoding words.
Specifically, the program should take part of the word (n characters at a time) and reverse it.
This cycle should run until the whole word is encoded.
First I compute the number of groups in the word, which is the number of full groups of n characters plus a possible remainder
(for example, Language with n = 3 has 3 parts: two parts of 3 characters and one remainder of 2 characters). I call this number general.
Then, depending on general, I run a loop that takes n characters one at a time and adds each to the group (a group has n characters).
At the end of each group cycle, I add the group (in reverse order) to new_word and reset group.
The goal is, for example, to encode the word Language with n = 2 as aLgnaueg,
or Language with n = 3 as naL aug eg, and so on.
Another example: abcabcab with n = 3 should become cba cba ba.
The output of my code isn't right. Output for n=3 is "naLaugeg"
Could I ask how to improve it? Is there a simpler Python function for this?
My code is here:
n = 3
word = "Language"
new_word = ""
group = ""
divisions = (len(word)//n)
residue = (len(word)%n)
general = divisions + residue
for i in range(general):
    j=2
    for l in range(n):
        group += word[i+j]
        print(word[i+j], l)
        j=j-1
    for j in range((len(group)-1),-1,-1):
        new_word += group[j]
        print(word[j])
    group = ""
    print(group)
print(new_word)
import textwrap
n = 3
word = "Language"
chunks = textwrap.wrap(word, n)
reversed_chunks = [chunk[::-1] for chunk in chunks]
print(' '.join(reversed_chunks))
# naL aug eg
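Note that textwrap.wrap treats whitespace as word boundaries, so it gives fixed-size chunks only for input without spaces. A slicing-based variant does not have that limitation. A minimal sketch, where reverse_chunks and its sep parameter are names invented for this example:

```python
def reverse_chunks(word, n, sep=""):
    """Split `word` into chunks of `n` characters and reverse each chunk.

    `reverse_chunks` and `sep` are hypothetical names used for illustration.
    """
    # Step through the string n characters at a time; the last chunk may be shorter.
    chunks = [word[i:i + n] for i in range(0, len(word), n)]
    return sep.join(chunk[::-1] for chunk in chunks)


print(reverse_chunks("Language", 2))           # aLgnaueg
print(reverse_chunks("Language", 3, sep=" "))  # naL aug eg
print(reverse_chunks("abcabcab", 3, sep=" "))  # cba cba ba
print(reverse_chunks("aLgnaueg", 2))           # Language
```

With the default empty separator the chunk boundaries stay in the same places, so applying the function a second time with the same n decodes the word.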

How to find the most amount of shared characters in two strings? (Python)

yamxxopd
yndfyamxx
Output: 5
I am not quite sure how to find the length of the longest run of characters shared by two strings. For example, for the strings above, the longest shared run is "yamxx", which is 5 characters long.
"xx" would not be a solution because it is not the longest shared run. Here the longest is "yamxx", so the output would be 5.
I am quite new to Python and Stack Overflow, so any help would be much appreciated!
Note: the characters should be in the same order in both strings.
Here is a simple, efficient solution using dynamic programming.
def longest_subtring(X, Y):
    m,n = len(X), len(Y)
    LCSuff = [[0 for k in range(n+1)] for l in range(m+1)]
    result = 0
    for i in range(m + 1):
        for j in range(n + 1):
            if (i == 0 or j == 0):
                LCSuff[i][j] = 0
            elif (X[i-1] == Y[j-1]):
                LCSuff[i][j] = LCSuff[i-1][j-1] + 1
                result = max(result, LCSuff[i][j])
            else:
                LCSuff[i][j] = 0
    print (result )
longest_subtring("abcd", "arcd") # prints 2
longest_subtring("yammxdj", "nhjdyammx") # prints 5
This solution starts with sub-strings of longest possible lengths. If, for a certain length, there are no matching sub-strings of that length, it moves on to the next lower length. This way, it can stop at the first successful match.
s_1 = "yamxxopd"
s_2 = "yndfyamxx"
l_1, l_2 = len(s_1), len(s_2)
found = False
sub_length = l_1 # Let's start with the longest possible sub-string
while (not found) and sub_length: # Loop, over decreasing lengths of sub-string
    for start in range(l_1 - sub_length + 1): # Loop, over all start-positions of sub-string
        sub_str = s_1[start:(start+sub_length)] # Get the sub-string at that start-position
        if sub_str in s_2: # If found a match for the sub-string, in s_2
            found = True # Stop trying with smaller lengths of sub-string
            break # Stop trying with this length of sub-string
    else: # If no matches found for this length of sub-string
        sub_length -= 1 # Let's try a smaller length for the sub-strings
print (f"Answer is {sub_length}" if found else "No common sub-string")
Output:
Answer is 5
s1 = "yamxxopd"
s2 = "yndfyamxx"
# initializing counter
counter = 0
# creating and initializing a string without repetition
s = ""
for x in s1:
    if x not in s:
        s = s + x
for x in s:
    if x in s2:
        counter = counter + 1
# Note: this counts the distinct characters of s1 that also occur in s2,
# which is not the length of the longest shared substring (it happens to be 5 here too)
print(counter) # displays 5
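If the matching substring itself is wanted and not only its length, the standard library's difflib.SequenceMatcher offers find_longest_match. The sketch below is an alternative to the answers above, not a replacement for them; longest_common_substring is a name chosen for this example:

```python
from difflib import SequenceMatcher


def longest_common_substring(a, b):
    """Return the longest run of characters that appears in both strings.

    `longest_common_substring` is a hypothetical helper name for this sketch.
    """
    # autojunk=False disables the heuristic that ignores very frequent
    # characters in long inputs, so every character is considered.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[match.a:match.a + match.size]


common = longest_common_substring("yamxxopd", "yndfyamxx")
print(common, len(common))  # yamxx 5
```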

Subtracting substring from string in as many possible steps

The goal is to find the maximum number of times you can remove t from s.
Take t = ab, s = aabb. In the first step, we check whether t is contained in s. Here, t is contained in the middle, i.e. a(ab)b. So we remove it, which leaves ab, and increment the count by 1. We check again whether t is contained in s. Now t is equal to s, i.e. (ab). So we remove it from s and increment the count. Since t is no longer contained in s, we stop and print the count, which is 2 in this case.
A problem occurs with something like s = 'abbabbaa', t = 'abba'.
Now it matters whether you remove from the end or the beginning, since you get more steps starting from the end.
def MaxNum(s,t):
    if not t in s:
        return 0
    elif s.count(t) == 1:
        front = s.find(t)
        sfront = s[:front] + s[front + len(t):]
        return 1 + MaxNum(sfront,t)
    else:
        back = s.rfind(t)
        front = s.find(t)
        sback = s[:back] + s[back +len(t):]
        sfront = s[:front] + s[front + len(t):]
        print (sfront,sback)
        return max(1 + MaxNum(sfront,t),1 + MaxNum(sback,t))
def foo(t,s):
    return max([0] + [
        1 + foo(t,s[:i]+s[i+len(t):]) for i in range(len(s)) if s[i:].startswith(t)])
Should I ask why you care?
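The exhaustive recursion above tries every occurrence of t, so the same intermediate strings can be reached by different removal orders. Caching results per string avoids recomputing them. A sketch under that assumption; max_removals and solve are names invented here, and the argument order (s, t) follows the question, not foo(t, s):

```python
from functools import lru_cache


def max_removals(s, t):
    """Maximum number of times `t` can be removed from `s`, trying every occurrence.

    `max_removals` is a hypothetical name; the search is exhaustive with memoization.
    """
    if not t:
        raise ValueError("t must be a non-empty string")

    @lru_cache(maxsize=None)
    def solve(current):
        best = 0
        start = current.find(t)
        while start != -1:
            # Remove this occurrence and solve the shorter string.
            remainder = current[:start] + current[start + len(t):]
            best = max(best, 1 + solve(remainder))
            # Look for the next (possibly overlapping) occurrence.
            start = current.find(t, start + 1)
        return best

    return solve(s)


print(max_removals("aabb", "ab"))        # 2
print(max_removals("abbabbaa", "abba"))  # 2
```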

How do you reverse the words in a string using python (manually)? [duplicate]

Possible Duplicate:
Reverse the ordering of words in a string
I know there are methods that python already provides for this, but I'm trying to understand the basics of how those methods work when you only have the list data structure to work with. If I have a string hello world and I want to make a new string world hello, how would I think about this?
And then, if I can do it with a new list, how would I avoid making a new list and do it in place?
Split the string, make a reverse iterator then join the parts back.
' '.join(reversed(my_string.split()))
If you want to preserve runs of multiple spaces, change split() to split(' ')
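The difference between the two split calls can be seen in a small sketch. The function name reverse_word_order and the keep_spacing flag are made up for this example:

```python
def reverse_word_order(text, keep_spacing=False):
    """Reverse the order of words in `text`.

    `reverse_word_order` and `keep_spacing` are hypothetical names for illustration.
    """
    # split() collapses any run of whitespace; split(' ') keeps an empty
    # string for each extra space, so the spacing survives the join.
    parts = text.split(' ') if keep_spacing else text.split()
    return ' '.join(reversed(parts))


print(repr(reverse_word_order("hello   world")))                     # 'world hello'
print(repr(reverse_word_order("hello   world", keep_spacing=True)))  # 'world   hello'
```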
As requested, I'm posting an implementation of split (by GvR himself from the oldest downloadable version of CPython's source code: Link)
def split(s,whitespace=' \n\t'):
    res = []
    i, n = 0, len(s)
    while i < n:
        while i < n and s[i] in whitespace:
            i = i+1
        if i == n:
            break
        j = i
        while j < n and s[j] not in whitespace:
            j = j+1
        res.append(s[i:j])
        i = j
    return res
I think there are now more Pythonic ways of doing that (maybe groupby). The original source had a bug (if i = n:, corrected to ==)
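For what the groupby idea might look like, here is one possible sketch. split_words is a name chosen for this example, and it is only meant to mirror the behaviour of the split function above, not to reproduce any library implementation:

```python
from itertools import groupby


def split_words(s, whitespace=" \n\t"):
    """Split `s` on runs of whitespace characters using itertools.groupby.

    `split_words` is a hypothetical helper name for this sketch.
    """
    # groupby yields consecutive runs of characters sharing the same key:
    # True for whitespace runs, False for word runs. Keep only the words.
    return [
        "".join(run)
        for is_space, run in groupby(s, key=lambda ch: ch in whitespace)
        if not is_space
    ]


print(split_words("  hello \t world\n"))  # ['hello', 'world']
```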
Original Answer
# Note: written for Python 2; the 'c' typecode and tostring() do not exist in current Python 3
from array import array
def reverse_array(letters, first=0, last=None):
    "reverses the letters in an array in-place"
    if last is None:
        last = len(letters)
    last -= 1
    while first < last:
        letters[first], letters[last] = letters[last], letters[first]
        first += 1
        last -= 1
def reverse_words(string):
    "reverses the words in a string using an array"
    words = array('c', string)
    # reverse the whole string, then reverse each word back into reading order
    reverse_array(words, first=0, last=len(words))
    first = last = 0
    while first < len(words) and last < len(words):
        if words[last] != ' ':
            last += 1
            continue
        reverse_array(words, first, last)
        last += 1
        first = last
    if first < last:
        reverse_array(words, first, last=len(words))
    return words.tostring()
Answer using list to match updated question
def reverse_list(letters, first=0, last=None):
    "reverses the elements of a list in-place"
    if last is None:
        last = len(letters)
    last -= 1
    while first < last:
        letters[first], letters[last] = letters[last], letters[first]
        first += 1
        last -= 1
def reverse_words(string):
    """reverses the words in a string using a list, with each character
    as a list element"""
    characters = list(string)
    reverse_list(characters)
    first = last = 0
    while first < len(characters) and last < len(characters):
        if characters[last] != ' ':
            last += 1
            continue
        reverse_list(characters, first, last)
        last += 1
        first = last
    if first < last:
        reverse_list(characters, first, last=len(characters))
    return ''.join(characters)
Besides renaming, the only change of interest is the last line.
You have a string:
text = "A long string to test this algorithm"
Split the string (at word boundaries -- no arguments to split):
words = text.split()
Reverse the list obtained -- either using slicing or a function:
reversed_words = words[::-1]
Concatenate all words with spaces in between -- also known as joining:
result = " ".join(reversed_words)
Now, you don't need so many temporaries; combining them into one line gives:
result = " ".join(text.split()[::-1])
text = "hello world"
" ".join(text.split()[::-1])
