Find index of two substrings with overlapping characters

I want to find the index of two substrings in a string of characters given like this:
find_start = '1L'
find_end = 'L'
>>> blah = 'A1LELST5W'
>>> blah.index('1L')
1
>>> blah.index('L')
2  # I want it to give me 4
If I use the index method, it gives me the "L" that's the third character in the string. But I want it to treat "1L" and "L" as separate strings and give me the fifth character instead.
Is there a simple way of doing this? Or would I have to store everything except find_start in a new string and then try to index through that? (But that would mess with the position of everything inside the string).

The str.index method has start and end arguments that allow you to constrain the search. So you just need to start the second search where the first one ends:
>>> find_start = '1L'
>>> find_end = 'L'
>>> blah = 'A1LELST5W'
>>> first = blah.index('1L')
>>> first
1
>>> blah.index('L', first + len(find_start))
4
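The same idea can be wrapped in a small helper. This is only a sketch: the name find_span is made up for illustration and is not part of the answer above, and it assumes both markers are present, since str.index raises ValueError when a substring is not found.

```python
def find_span(text, start_marker, end_marker):
    """Return (start_index, end_index) for the two markers.

    The search for end_marker begins just past start_marker, so the two
    matches can never share characters.
    """
    start = text.index(start_marker)
    # Begin the second search where the first match ends.
    end = text.index(end_marker, start + len(start_marker))
    return start, end


print(find_span('A1LELST5W', '1L', 'L'))  # (1, 4)
```

If a missing marker should not raise, str.find accepts the same start argument and returns -1 instead.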

Related

Python: How to move the position of an output variable using the split() method

This is my first SO post, so go easy! I have a script that counts how many matches occur in a string named postIdent for the substring ff. Based on this it then iterates over postIdent and extracts all of the data following it, like so:
substring = 'ff'
global occurences
occurences = postIdent.count(substring)
x = 0
while x <= occurences:
    for i in postIdent.split("ff"):
        rawData = i
        required_Id = rawData[-8:]
        x += 1
To explain further, if we take the string "090fd0909a9090ff90493090434390ff90904210412419ghfsdfs9000ff", it is clear there are 3 instances of ff. I need to get the 8 preceding characters at every instance of the substring ff, so for the first instance this would be 909a9090.
With the rawData, I essentially need to offset the variable required_Id by -1 when I get the data out of the split() method, as I am currently getting the last 8 characters of the current string, not the string I have just split. Another way of doing it could be to pass the current required_Id to the next iteration, but I've not been able to do this.
The split method gets everything after the matching string ff.
Using the partition method can get me the data I need, but does not allow me to iterate over the string in the same way.

Get the last 8 digits of each split using a slice operation in a list-comprehension:
s = "090fd0909a9090ff90493090434390ff90904210412419ghfsdfs9000ff"
print([x[-8:] for x in s.split('ff') if x])
# ['909a9090', '90434390', 'sdfs9000']

Not a difficult problem, but tricky for a beginner.
If you split the string on 'ff' then you appear to want the eight characters at the end of every substring but the last. The last eight characters of string s can be obtained using s[-8:]. All but the last element of a sequence x can similarly be obtained with the expression x[:-1].
Putting both those together, we get
subject = '090fd0909a9090ff90493090434390ff90904210412419ghfsdfs9000ff'
for x in subject.split('ff')[:-1]:
    print(x[-8:])
This should print
909a9090
90434390
sdfs9000
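As a rough sketch, the split-and-slice approach can be packaged as a reusable function. The name chars_before and the width parameter are illustrative choices, not something from the question. A piece shorter than width simply yields fewer characters.

```python
def chars_before(subject, marker, width=8):
    """Return the `width` characters preceding each occurrence of `marker`."""
    # split() yields the text between markers; the final piece is whatever
    # follows the last marker, so it is dropped with [:-1].
    pieces = subject.split(marker)[:-1]
    return [piece[-width:] for piece in pieces]


subject = '090fd0909a9090ff90493090434390ff90904210412419ghfsdfs9000ff'
print(chars_before(subject, 'ff'))
# ['909a9090', '90434390', 'sdfs9000']
```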

I wouldn't do this with split myself, I'd use str.find. This code isn't fancy but it's pretty easy to understand:
fullstr = "090fd0909a9090ff90493090434390ff90904210412419ghfsdfs9000ff"
search = "ff"
found = None  # offset of the next match
last = 0      # where the next search starts
l = 8         # number of preceding characters to extract
print(fullstr)
while True:
    found = fullstr.find(search, last)
    if found == -1:
        break
    preceding = fullstr[max(0, found - l):found]
    print("At position {} found preceding characters '{}'".format(found, preceding))
    last = found + len(search)
Overall I like Austin's answer more; it's a lot more elegant.
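For comparison, the str.find loop can also be written as a generator that yields each match position, which keeps the searching separate from the slicing. This is a sketch under assumptions: find_all is a hypothetical name, and the empty-needle check is an addition to avoid an endless loop.

```python
def find_all(text, needle):
    """Yield the index of each non-overlapping occurrence of `needle`."""
    if not needle:
        raise ValueError("needle must be a non-empty string")
    start = 0
    while True:
        found = text.find(needle, start)
        if found == -1:
            return
        yield found
        # Resume the search just past the current match.
        start = found + len(needle)


fullstr = "090fd0909a9090ff90493090434390ff90904210412419ghfsdfs9000ff"
positions = list(find_all(fullstr, "ff"))
preceding = [fullstr[max(0, p - 8):p] for p in positions]
print(positions)   # [14, 30, 57]
print(preceding)   # ['909a9090', '90434390', 'sdfs9000']
```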

AttributeError: 'str' object has no attribute 'remove' [duplicate]

There is a string, for example. EXAMPLE.
How can I remove the middle character, i.e., M from it? I don't need the code. I want to know:
Do strings in Python end in any special character?
Which is a better way - shifting everything right to left starting from the middle character OR creation of a new string and not copying the middle character?

In Python, strings are immutable, so you have to create a new string. You have a few options of how to create the new string. If you want to remove the 'M' wherever it appears:
newstr = oldstr.replace("M", "")
If you want to remove the central character:
midlen = len(oldstr) // 2
newstr = oldstr[:midlen] + oldstr[midlen+1:]
You asked if strings end with a special character. No, you are thinking like a C programmer. In Python, strings are stored with their length, so any byte value, including \0, can appear in a string.
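A short demonstration of both points, as a sketch rather than anything definitive: assigning to a position raises TypeError, slicing builds a new string and leaves the original untouched, and a NUL character counts toward the length like any other character. The variable names are illustrative.

```python
s = "EXAMPLE"
try:
    s[3] = ""  # strings do not support item assignment
except TypeError as exc:
    print("cannot modify in place:", exc)

mid = len(s) // 2
shorter = s[:mid] + s[mid + 1:]  # a new string; s itself is unchanged
print(s, shorter)                # EXAMPLE EXAPLE

with_nul = "a\0b"
print(len(with_nul))             # 3: the NUL is an ordinary character
```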

To remove the character at a specific position:
s = s[:pos] + s[(pos+1):]
To remove every occurrence of a specific character:
s = s.replace('M','')

This is probably the best way:
original = "EXAMPLE"
removed = original.replace("M", "")
Don't worry about shifting characters and such. Most Python code takes place on a much higher level of abstraction.

Strings are immutable. But you can convert them to a list, which is mutable, and then convert the list back to a string after you've changed it.
s = "this is a string"
l = list(s) # convert to list
l[1] = "" # "delete" letter h (the item actually still exists but is empty)
l[1:2] = [] # really delete letter h (the item is actually removed from the list)
del(l[1]) # another way to delete it
p = l.index("a") # find position of the letter "a"
del(l[p]) # delete it
s = "".join(l) # convert back to string
You can also create a new string, as others have shown, by taking everything except the character you want from the existing string.

How can I remove the middle character, i.e., M from it?
You can't, because strings in Python are immutable.
Do strings in Python end in any special character?
No. They are similar to lists of characters; the length of the list defines the length of the string, and no character acts as a terminator.
Which is a better way - shifting everything right to left starting from the middle character OR creation of a new string and not copying the middle character?
You cannot modify the existing string, so you must create a new one containing everything except the middle character.

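Use the translate() method. On Python 2:
>>> s = 'EXAMPLE'
>>> s.translate(None, 'M')
'EXAPLE'
On Python 3, str.translate() takes a single mapping argument, so build one with str.maketrans():
>>> s.translate(str.maketrans('', '', 'M'))
'EXAPLE'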

def kill_char(string, n):  # n = position of the character you want to remove
    begin = string[:n]     # from beginning to n (n not included)
    end = string[n+1:]     # n+1 through end of string
    return begin + end
print(kill_char("EXAMPLE", 3))  # "M" removed
I have seen this somewhere here.

How to remove one character from a string:
Here is an example where a stack of cards is represented as characters in a string.
One of them is drawn (import the random module for the random.choice() function, which picks a random character from the string).
A new string, cardsLeft, is created to hold the remaining cards, using the string method replace(); its last parameter indicates that only one "card" is to be replaced by the empty string.
import random
cards = "23456789TJQKA"  # example deck, one character per card (assumed for illustration)
card = random.choice(cards)
cardsLeft = cards.replace(card, '', 1)

On Python 2, you can use UserString.MutableString to do it in a mutable way:
>>> import UserString
>>> s = UserString.MutableString("EXAMPLE")
>>> type(s)
<class 'UserString.MutableString'>
>>> del s[3] # Delete 'M'
>>> s = str(s) # Turn it into an immutable value
>>> s
'EXAPLE'
MutableString was removed in Python 3.

Another way is with a function.
Below is a way to remove all vowels from a string, just by calling the function (Python 3 form; on Python 2 the call was s.translate(None, "aeiouAEIOU")):
def disemvowel(s):
    return s.translate(str.maketrans("", "", "aeiouAEIOU"))

Here's what I did to slice out the "M":
s = 'EXAMPLE'
s1 = s[:s.index('M')] + s[s.index('M')+1:]

To delete a char or a sub-string once (only the first occurrence):
main_string = main_string.replace(sub_str, replace_with, 1)
NOTE: Here 1 can be replaced with any int, giving the maximum number of occurrences you want to replace.
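To illustrate how the count argument of str.replace behaves, here is a minimal example; the string "banana" and the variable names are arbitrary choices for the demonstration.

```python
text = "banana"

first_only = text.replace("a", "", 1)  # remove only the first "a"
first_two = text.replace("a", "", 2)   # remove the first two
all_gone = text.replace("a", "")       # no count: remove every "a"

print(first_only)  # bnana
print(first_two)   # bnna
print(all_gone)    # bnn
```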

You can simply use a list comprehension.
Assume that you have the string "my name is" and you want to remove the character m. Use the following code:
"".join([x for x in "my name is" if x != 'm'])

If you want to delete/ignore characters in a string, and, for instance, you have this string,
"[11:L:0]"
from a web API response or something similar, like a CSV file, let's say you are using requests:
import requests
udid = 123456
url = 'http://webservices.yourserver.com/action/id-' + str(udid)
s = requests.Session()
s.verify = False
resp = s.get(url, stream=True)
content = resp.content
Loop and get rid of unwanted chars:
for line in resp.iter_lines():
    line = line.decode('utf-8')  # iter_lines() yields bytes on Python 3; UTF-8 is assumed here
    line = line.replace("[", "")
    line = line.replace("]", "")
    line = line.replace('"', "")
Optional split, and you will be able to read values individually:
listofvalues = line.split(':')
Now accessing each value is easier:
print(listofvalues[0])
print(listofvalues[1])
print(listofvalues[2])
This will print
11
L
0

Two new string removal methods were introduced in Python 3.9:
# str.removeprefix("prefix_to_be_removed")
# str.removesuffix("suffix_to_be_removed")
s = 'EXAMPLE'
In this case the position of 'M' is 3:
s = s[:3] + s[3:].removeprefix('M')
OR
s = s[:4].removesuffix('M') + s[4:]
# output: 'EXAPLE'

from random import randint
def shuffle_word(word):
    newWord = ""
    for i in range(0, len(word)):
        pos = randint(0, len(word) - 1)
        newWord += word[pos]
        word = word[:pos] + word[pos+1:]
    return newWord
word = "Sarajevo"
print(shuffle_word(word))

Strings are immutable in Python so both your options mean the same thing basically.

Get character based on index in Python

I am fairly new to Python and was wondering how to get a character in a string based on an index number.
Say I have the string "hello" and index number 3. How do I get it to return the character in that spot? It seems that there is some built-in method that I just can't seem to find.

You just need to index the string, just like you do with a list.
>>> 'hello'[3]
'l'
Note that Python indices (like those in most other languages) are zero-based, so the first element is index 0, the second is index 1, etc.
For example:
>>> 'hello'[0]
'h'
>>> 'hello'[1]
'e'

It's straightforward: index the string with any subscript, e.g. s[0]. Because a character is itself a one-character string, s[0][0] works too.

Check this page...
What you need is:
Strings can be subscripted (indexed); like in C, the first character of a string has subscript (index) 0.
There is no separate character type; a character is simply a string of size one.
Like in Icon, substrings can be specified with the slice notation: two indices separated by a colon.
Example:
>>> word = 'Help' + 'A'
>>> word[4]
'A'
>>> word[0:2]
'He'
>>> word[2:4]
'lp'
For your case try this:
>>> s = 'hello'
>>> s[3]
'l'
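One thing the examples above do not show is what happens when the index is past the end of the string: Python raises IndexError. The sketch below wraps the lookup in a helper that returns None instead; the name char_at is hypothetical and chosen only for illustration.

```python
def char_at(text, index):
    """Return the character at `index`, or None if the index is out of range."""
    try:
        return text[index]
    except IndexError:
        return None


print(char_at("hello", 3))   # l
print(char_at("hello", 10))  # None
print(char_at("hello", -1))  # o  (negative indices count from the end)
```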

How to differentiate lines with one dot and two dot?

I want to extract a specific part of a sentence. My problem is that I have a list of sentences that each have different formats. For instance:
X.y.com
x.no
x.com
y.com
z.co.uk
s.com
b.t.com
How can I split these lines based on the number of dots they have? I want the second part of the sentences with two dots and the first part of the sentences with one dot.

You want the part directly preceding the last dot; just split on the dots and take the second-to-last part:
for line in data:
    if '.' not in line:
        continue
    elem = line.strip().split('.')[-2]
For your input, that gives:
>>> for line in data:
...     print(line.strip().split('.')[-2])
...
y
x
x
y
co
s
t

To answer your question, you could use count to count the number of times '.' appears and then do whatever you need.
>>> 't.com'.count('.')
1
>>> 'x.t.com'.count('.')
2
You could use that in a loop:
for s in string_list:
    dots = s.count('.')
    if dots == 1:
        pass  # do something here
    elif dots == 2:
        pass  # do something else
    else:
        pass  # another piece of code
A more Pythonic way to solve your problem:
def test_function(s):
    """
    >>> test_function('b.t.com')
    't'
    >>> test_function('x.no')
    'x'
    >>> test_function('z')
    'z'
    """
    actions = {0: lambda x: x,
               1: lambda x: x.split('.')[0],
               2: lambda x: x.split('.')[1]}
    return actions[s.count('.')](s)

I would follow this logic:
For each line:
remove any spaces at beginning and end
split the line by dots
take the second-to-last part of the split list
This should give you the part of the sentence you're looking for.
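The steps above can be sketched as a small function. The name label_before_last_dot and the choice to return None for a line without a dot are assumptions made for this example, not part of the answer.

```python
def label_before_last_dot(line):
    """Return the part directly before the last dot, or None if there is no dot."""
    parts = line.strip().split('.')  # trim whitespace, then split on dots
    if len(parts) < 2:
        return None                  # no dot in this line
    return parts[-2]                 # second-to-last piece


lines = ['X.y.com', 'x.no', 'x.com', 'y.com', 'z.co.uk', 's.com', 'b.t.com']
print([label_before_last_dot(line) for line in lines])
# ['y', 'x', 'x', 'y', 'co', 's', 't']
```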

Simply use the split function.
a = 'x.com'
b = a.split('.')
This will make a list of 2 items in b. If you have two dots, the list will contain 3 items. The function actually splits the string based on the given character.
