strings match after a certain character - python

I wanted to see if two strings matched each other after the last time a certain character appears. For example:
same_list = ['Time - is an illusion.', 'Lunchtime - which is delicious - is an illusion.']
True
So in this case my certain character is '-'. I want to see if the string after the last '-' for both strings match each other. The string after '-' is 'is an illusion.' for both strings. Therefore it is true.
So far I have:
same_list = ['Time - is an illusion.', 'Lunchtime - which is delicious - is an illusion.']
some_list = []
for string in same_list:
    for ch in string:
        if ch == '-':
            new_string = string.split(ch)
            some_list.append(new_string)
i = 0
for sublist[-1] in some_list:
I feel like I'm forgetting an easier way of doing this in python. All I can remember is the while loop, but that would return the first occurrence of '-', not the last. Also, the function I have written returns the second string twice as it was split twice. How do I get rid of that?

You can do something simple like this (assuming that all endings in a list must match):
def end_matches(*phrases):  # phrases becomes a tuple holding any number of positional arguments
    phrase_endings = set()
    for phrase in phrases:
        end = phrase.split('-')[-1]  # take the last element of the split list
        phrase_endings.add(end)
    # a set holds only unique values, so a set of length 1 means all endings are the same
    return len(phrase_endings) == 1
You can now test it out:
>>> same_list = ['Time - is an illusion.', 'Lunchtime - which is delicious - is an illusion.']
>>> end_matches(*same_list) # unpack the list into individual positional arguments
True
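If surrounding whitespace should not affect the comparison, a variation on the same idea uses str.rpartition, which splits once at the last occurrence of the separator. This is only a sketch: the name endings_match, the sep parameter and the decision to strip whitespace are assumptions, not part of the answer above.

```python
def endings_match(phrases, sep="-"):
    """Return True if every phrase has the same text after its last `sep`."""
    # rpartition splits at the LAST occurrence of sep; element 2 is the tail.
    # If sep does not occur, the tail is the whole phrase.
    endings = {phrase.rpartition(sep)[2].strip() for phrase in phrases}
    return len(endings) == 1


same_list = ['Time - is an illusion.',
             'Lunchtime - which is delicious - is an illusion.']
print(endings_match(same_list))  # True
```

Unlike split('-'), rpartition never builds the full list of pieces, which keeps the intent (only the part after the last separator matters) visible in the code.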

Related

python string replace using for loop with if else

I am new to Python and am trying to replace parts of a string using a for loop with an if/else condition.
I have a string and want to replace its characters as follows: take the first character of the string and search for it in old_list. If it matches, replace it with the corresponding entry of new_list. If it does not match, take that character together with the next character of the string, search for the combined characters in old_list, and so on.
It should replace in this order (pieces picked from the string) = 010,101,010,010,100,101,00,00,011,1101,011,00,101,010,00,011,1111,1110,00,00,00,010,101,010,
replacing value = 1001,0000,0000,1000,1111,1001,1111,1111,100,1010101011,100,1111,1001,0000,1111,100,10100101,101010,1111,1111,1111,0000,1001,
With the example string above, after performing that operation the string becomes
final or result string = 10010000000010001111100111111111100101010101110011111001000011111001010010110101011111111111100001001
string = '01010101001010010100000111101011001010100001111111110000000010101010'
old_list = ['00','011','010','101','100','1010','1011','1101','1110','1111']
new_list = ['1111','100','0000','1001','1000','1111','0101','1010101011','101010','10100101']
i = 0
for i in range((old), 0):
    if i == old:
        my_str = my_str.replace(old[i],new[i], 0)
    else:
        i = i + 1
print(my_str)
As a result, the string should become = 10010000000010001111100111111111100101010101110011111001000011111001010010110101011111111111100001001
new = ['a ','local ','is ']
my_str = 'anindianaregreat'
old = ['an','indian','are']
for i, string in enumerate(old):
    my_str = my_str.replace(string, new[i], 1)
print(my_str)
Your usage of range is incorrect.
range goes from a lower bound (inclusive) to an upper bound (exclusive), or, with a single argument, from 0 to that bound (exclusive).
Your i == old condition is incorrect as well (i is an integer, while old is a list). Also, what is it supposed to do?
You can simply do:
for old_str, new_str in zip(old, new):
    my_str = my_str.replace(old_str, new_str, 1)
https://docs.python.org/3/library/stdtypes.html#str.replace
You can provide an argument to replace to specify how many occurrences to replace.
No conditional is required since if old_str is absent, nothing will be replaced anyway.
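The procedure described in the question (grow a piece of the string one character at a time until it matches an entry of old_list, then emit the corresponding entry of new_list) is not the same as calling str.replace once per entry. A minimal sketch of that reading follows. The function name translate_by_prefix is hypothetical, and appending unmatched trailing characters unchanged is an assumption, since the question does not say what should happen to them.

```python
def translate_by_prefix(text, old_list, new_list):
    """Scan left to right, replacing the shortest piece found in old_list."""
    mapping = dict(zip(old_list, new_list))
    out = []
    piece = ""
    for ch in text:
        piece += ch                      # extend the current piece by one character
        if piece in mapping:
            out.append(mapping[piece])   # emit the replacement and start a new piece
            piece = ""
    out.append(piece)                    # assumption: leftovers are kept as they are
    return "".join(out)


old_list = ['00', '011', '010', '101', '100', '1010', '1011', '1101', '1110', '1111']
new_list = ['1111', '100', '0000', '1001', '1000', '1111', '0101',
            '1010101011', '101010', '10100101']

# '01010100' is consumed as 010, 101, 00
print(translate_by_prefix('01010100', old_list, new_list))  # 000010011111
```

Because the shortest matching piece always wins, entries such as '1010' and '1011' can never be reached with this old_list: '101' matches first.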

Python: Is there a way to find and remove the first and last occurrence of a character in a string?

The problem:
Given a string in which the letter h occurs at least twice.
Remove from that string the first and the last occurrence of
the letter h, as well as all the characters between them.
How do I find the first and last occurrence of h? And how can I remove them and the characters in between them?
#initialize the index of the input string
index_count =0
#create a list to have indexes of 'h's
h_indexes = []
#accept input strings
origin_s = input("input:")
#search 'h' and save the index of each 'h' (and save indexes of searched 'h's into h_indexes
for i in origin_s:
first_h_index =
last_h_index =
#print the output string
print("Output:"+origin_s[ : ]+origin_s[ :])
Using a combination of index, rindex and slicing:
string = 'abc$def$ghi'
char = '$'
print(string[:string.index(char)] + string[string.rindex(char) + 1:])
# abcghi
You can also use a regex (.* rather than .+ so that two adjacent h's are handled as well):
>>> import re
>>> s = 'jusht exhamplhe'
>>> re.sub(r'h.*h', '', s)
'juse'
How do I find the first and last occurrence of h?
First occurrence:
first_h_index = origin_s.find("h")
Last occurrence:
last_h_index = origin_s.rfind("h")
And how can I remove them and the characters in between them?
Slicing
string = '1234-123456789'
char_list = []
for ch in string:
    char_list.append(ch)  # append the character itself; string[ch] would raise a TypeError
char_list.remove('-')  # replace '-' with the character to remove
According to the documentation, list.remove(x) removes the first item whose value equals x and raises a ValueError if there is no such item. It works on lists, which are mutable; strings have no such method.
This will help you to understand more clearly:
string = 'abchdef$ghi'
first=string.find('h')
last=string.rfind('h')
res=string[:first]+string[last+1:]
print(res)
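The find/rfind approach can be wrapped in a small function. This is a sketch only: the name remove_between is made up here, and returning the string unchanged when the character is absent is an assumption (the original problem guarantees at least two occurrences).

```python
def remove_between(text, char):
    """Drop everything from the first to the last occurrence of char, inclusive."""
    first = text.find(char)    # -1 if char does not occur
    last = text.rfind(char)
    if first == -1:
        return text            # assumption: nothing to remove, return as is
    return text[:first] + text[last + 1:]


print(remove_between('abchdef$ghi', 'h'))      # abci
print(remove_between('jusht exhamplhe', 'h'))  # juse
```

If the character occurs exactly once, first and last are equal and only that one character is removed.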

Python How to get the each letters put into a word?

A name is given, for example Aberdeen Scotland.
I need to get the result Adbnearldteoecns.
That is, keep the first word as is, reverse the last word, and put its letters in between the letters of the first word.
So far I have:
coordinatesf = "Aberdeen Scotland"
for line in coordinatesf:
    separate = line.split()
    for i in separate [0:-1]:
        lastw = separate[1][::-1]
        print(i)
A bit dirty, but it works:
coordinatesf = "Aberdeen Scotland"
new_word = []
# split the two words
words = coordinatesf.split(" ")
# reverse the second word and convert it to lowercase
words[1] = words[1][::-1].lower()
# populate the new string
for index in range(0, len(words[0])):
    new_word.insert(2*index, words[0][index])
for index in range(0, len(words[1])):
    new_word.insert(2*index+1, words[1][index])
outstring = ''.join(new_word)
print(outstring)
Note that what you want to do is only well-defined if the input string is composed of two words of the same length.
I use assertions to make sure that is true, but you can leave them out.
def scramble(s):
    words = s.split(" ")
    assert len(words) == 2
    assert len(words[0]) == len(words[1])
    scrambledLetters = zip(words[0], reversed(words[1]))
    return "".join(x[0] + x[1] for x in scrambledLetters)
>>> print(scramble("Aberdeen Scotland"))
AdbnearldteoecnS
You could replace the x[0] + x[1] part with "".join(x) (sum() does not work on strings), but I think that makes it less readable.
This splits the input, zips the first word with the reversed second word, joins the pairs, then joins the list of pairs.
coordinatesf = "Aberdeen Scotland"
a,b = coordinatesf.split()
print(''.join(map(''.join, zip(a,b[::-1]))))
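For words of different lengths, one possible extension is itertools.zip_longest, which pads the shorter word instead of truncating at its end. This is a sketch under assumptions: the name interleave is hypothetical, and appending the leftover letters of the longer word at the end is a choice the question does not specify.

```python
from itertools import zip_longest


def interleave(name):
    """Interleave the first word with the reversed second word."""
    first, second = name.split()   # assumes exactly two words
    # fillvalue="" pads the shorter word, so no letters are dropped
    pairs = zip_longest(first, reversed(second), fillvalue="")
    return "".join(a + b for a, b in pairs)


print(interleave("Aberdeen Scotland"))  # AdbnearldteoecnS
print(interleave("abc de"))             # aebdc
```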

Consecutive values in strings, getting indices

The following is a Python string more than 1000 characters long.
string1 = "XXXXXXXXXXXXXXXXXXXXXAAAAAAAAAAAAAAAAAABBBBBBBBBBBBBB........AAAAXXXXX"
len(string1) ## 1311
I would like to know the index of where the consecutive X's end and the non-X characters begin. Reading this string from left to right, the first non-X character is at index location 22, and the first non-X character from the right is at index location 1306.
How does one find these indices?
My guess would be:
for x in string1:
    if x != "X":
        print(string1.index(x))
The problem with this is it outputs all indices that are not X. It does not give me the index where the consecutive X's end.
Even more confusing for me is how to "check" for consecutive X's. Let's say I have this string:
string2 = "XXXXAAXAAAAAAAAAAAAAAABBBBBBBBBBBBBB........AAAAXXXXX"
Here, the consecutive X's end at index 4, not index 7. How could I check several characters ahead whether this is really no longer consecutive?
Using a regex, split off the first and last groups of Xs, then use their lengths to construct the indices.
import re
mystr = 'XXXXAAXAAAAAAAAAAAAAAABBBBBBBBBBBBBB........AAAAXXXXX'
xs = re.split('[^X]+', mystr)  # split on runs of non-X characters
indices = (len(xs[0]), len(mystr) - len(xs[-1]) - 1)
# (4, 47)
I simply need the outputs for the indices. I'm then going to put them in randint(first_index, second_index)
It's possible to pass the indices to the function like this:
randint(*indices)
However, I suspect that you want to use the output of randint(first_index, last_index) to select a random character from the middle. In that case, this would be a shorter alternative:
from random import choice
randchar = choice(mystr.strip('X'))
If I understood your question correctly, you can just do:
def getIndexs(string):
    lst = []
    flag = False
    for i, char in enumerate(string):
        if char == "x":
            flag = True
        if (char != "x") and flag:
            lst.append(i-1)  # index of the last "x" in the run that just ended
            flag = False
    return lst
print(getIndexs("xxxxbbbxxxxaaaxxxbb"))
[3, 10, 16]
If the sequences are, as you say, only in the beginning and at the end of your string, a simple loop / reversed loop would suffice:
string1 = "XXXXXXXXXXXXXXXXXXXXXAAAAAAAAAAAAAAAAAABBBBBBBBBBBBBB........AAAAXXXXX"
left_index = 0
for char in string1:
    left_index += 1
    if char != "X":
        break
right_index = len(string1)
for char in reversed(string1):
    if char != "X":
        break
    right_index -= 1
print(left_index) # 22
print(right_index) # 65
Regex can lookahead and identify characters that don't match the pattern:
>>> import re
>>> [match.span() for match in re.finditer(r'X{2,}((?=[^X])|$)', string2)]
[(0, 4), (48, 53)]
Breaking this down:
X - the character we're matching
{2,} - need to see at least two in a row to consider a match
((?=[^X])|$) - two conditions will satisfy the match
(?=[^X]) - lookahead for anything but an X
$ - the end of the string
As a result, finditer returns each instance where there are multiple X's, followed by a non-X or an end of line. match.span() extracts the position information from each match from the string.
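To see the pattern's pieces at work on a string short enough to count by hand, here is a small sketch. The helper name x_runs and the sample string are made up for illustration; the pattern is the one from the answer, with the two alternatives folded into a single lookahead.

```python
import re

# Two or more X's, followed by a non-X character or by the end of the string.
X_RUN = re.compile(r"X{2,}(?=[^X]|$)")


def x_runs(text):
    """Return (start, end) spans of every run of at least two X's."""
    return [m.span() for m in X_RUN.finditer(text)]


# 4 X's, "AA", a single X, 5 A's, 5 X's
sample = "XXXX" + "AA" + "X" + "AAAAA" + "XXXXX"
print(x_runs(sample))  # [(0, 4), (12, 17)]
```

The lone X at index 6 is skipped because {2,} requires at least two in a row. The end value of each span is exclusive, so the first non-X character sits at index 4 and the last one at index 11.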
This will give you the first index and last index (of non-'X' character).
s = 'XXABCDXXXEFGHXXXXX'
first_index = len(s) - len(s.lstrip('X'))
last_index = len(s.rstrip('X')) - len(s) - 1
print(first_index, last_index)
2 -6
How it works:
For first_index:
We strip all the 'X' characters at the beginning of our string. Finding the difference in length between the original and shortened string gives us the index of the first non-'X' character.
For last_index:
Similarly, we strip the 'X' characters at the end of our string. We also subtract 1 from the difference, since reverse indexing in Python starts from -1.
Note:
If you just want to randomly select one of the characters between first_index and last_index, you can do:
import random
shortened_s = s.strip('X')
random.choice(shortened_s)
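If non-negative indices are more convenient (for example to pass to randint), the same stripping idea can be packaged as below. This is a sketch: the name non_x_bounds and the choice to return None when the string contains only padding are assumptions.

```python
def non_x_bounds(s, pad="X"):
    """Return (first, last) indices of the non-pad region, or None if empty."""
    first = len(s) - len(s.lstrip(pad))   # number of leading pad characters
    last = len(s.rstrip(pad)) - 1         # index of the last non-pad character
    if first > last:
        return None                       # the string is empty or all padding
    return first, last


print(non_x_bounds('XXABCDXXXEFGHXXXXX'))  # (2, 12)
```

Both values are inclusive, which matches what random.randint expects for its two arguments.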

need to count number of occurrences of given strings in an if statement

In my function, one of my if statements should check that the string does not have more than 1 occurrence of a string that is in a list.
For example:
list = ['a','b']
I need to check whether a string has more than one of those 2 characters in it; if it does, it fails this check and moves on to the next if statement.
s = 'aloha'
s will pass the if statement because b is not in s.
s = 'abcd'
s should fail this if statement because both a and b are in s.
more examples if it's not clear enough.
s = 'aaab'
this will fail the if statement and move on.
s = 'aaloh'
this will also fail
My if statement was:
if s.count('a') == 1 or s.count('b') == 1:
This if statement doesn't work.
My question is: is there a way to check this without doing a for loop beforehand?
You'll need a for loop, but you can still do it in one line.
if sum(char in s for char in list) > 1:
If you wanted to look for specific numbers of a given character with some threshold, put your list of characters into a dict like d = {'a': 1, 'b': 4}.
if sum(s.count(char) >= count for char, count in d.items()) > THRESHOLD:
Update
The OP commented that he wants
a function that does something with the string if and only if that string doesn't have both 'a' and 'b' in the string. It can only have one of the two and not both and also only one occurrence of it.
Because OP's original post wanted to generalize 'a' and 'b' to a sequence of characters list, I interpret his meaning to be that he wants an expression that returns True for a string s if and only if s has exactly one occurrence of exactly one element of a list of characters list.
sum(s.count(char) for char in list) <= 1
In OP's simple example, list = ['a', 'b']. Note that <= 1 also accepts a string that contains none of the characters; use == 1 if exactly one occurrence is required.
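A quick way to check the expression against sample strings is to wrap it in a function. The name at_most_one_hit and the parameter name chars are hypothetical; chars is used instead of list so that the built-in list type is not shadowed.

```python
def at_most_one_hit(s, chars):
    """True if the characters in chars occur at most once in s, in total."""
    return sum(s.count(char) for char in chars) <= 1


chars = ['a', 'b']
for s in ['abcd', 'aaab', 'aaloh', 'bcd', 'hello']:
    print(s, at_most_one_hit(s, chars))
# abcd False
# aaab False
# aaloh False
# bcd True
# hello True
```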
