Exchanging characters in a string - python

I need to exchange the middle character in a numeric string of 15 numbers with the last number of the string.
This is what I have so far:
def string(str):
    return str[-1:] + str[1:-1] + str[:1]
print(string('abcd'))
print(string('12345'))
RESULTS:
dbca
52341
But how can I do this for the input string 012345678912345,
so that the middle character (7) is exchanged with the last character in the string (5)?

Consider
def last_to_mid(s):
    if len(s) == 1:
        return s
    if len(s)%2 == 0:
        raise ValueError('expected string of odd length')
    idx = len(s)//2
    return f'{s[:idx]}{s[-1]}{s[idx+1:-1]}{s[idx]}'
operating like this:
>>> last_to_mid('021')
'012'
>>> last_to_mid('0123x4567')
'01237456x'
>>> last_to_mid('1')
'1'
Assuming you have Python 3.6 or newer for f-strings.

You can have a function for this:
In [178]: def swap_index_values(my_string):
     ...:     l = list(my_string)
     ...:     middleIndex = (len(l) - 1) // 2  # integer division, so the result is a valid index
     ...:     middle_val = l[middleIndex]
     ...:     l[middleIndex] = l[-1]
     ...:     l[-1] = middle_val
     ...:     return ''.join(l)
     ...:
In [179]: a
Out[179]: '012345678912345'
In [180]: swap_index_values(a)
Out[180]: '012345658912347'
Above, you can see that the middle value and the last value have been exchanged.

In this very specific context (always the middle and last character of a string of length 15), your initial approach can be extended to:
text[0:7]+text[-1]+text[8:-1]+text[7]
Also try to avoid variable names like str, since they shadow the built-in type of the same name.
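The fixed indices 7 and 8 above only fit a 15-character string. As a minimal sketch, assuming the goal is always to swap the middle character with the last one, the same slicing can be written with a computed index. The function name swap_middle_and_last is hypothetical and not part of the answers here:

```python
def swap_middle_and_last(text):
    """Return text with its middle and last characters exchanged."""
    mid = len(text) // 2
    # For strings shorter than 3 characters the middle is the last
    # character (or there is none), so there is nothing to swap.
    if mid >= len(text) - 1:
        return text
    return text[:mid] + text[-1] + text[mid + 1:-1] + text[mid]


print(swap_middle_and_last('012345678912345'))  # prints 012345658912347
```

For even-length strings this sketch treats the character at index len(text) // 2 as the middle; raise an error instead if only odd lengths should be accepted.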

s1 = '1243125'
mid = len(s1) // 2
s2 = s1[:mid] + s1[-1] + s1[mid + 1:-1] + s1[mid]
print(s2)
1245123


Python - how to multiply characters in string by number after character

As the title says: for example, I want to make 'A3G3A' into 'AAAGGGA'.
I have this so far:
if any(i.isdigit() for i in string):
    for i in range(0, len(string)):
        if string[i].isdigit():
            (i am lost after this)
Here's a simplistic approach:
string = 'A3G3A'
expanded = ''
for character in string:
    if character.isdigit():
        expanded += expanded[-1] * (int(character) - 1)
    else:
        expanded += character
print(expanded)
OUTPUT: AAAGGGA
It assumes valid input. Its limitation is that the repetition factor has to be a single digit, e.g. 2 - 9. If we want repetition factors greater than 9, we have to do slightly more parsing of the string:
from itertools import groupby
groups = groupby('DA10G3ABC', str.isdigit)
expanded = []
for is_numeric, characters in groups:
    if is_numeric:
        expanded.append(expanded[-1] * (int(''.join(characters)) - 1))
    else:
        expanded.extend(characters)
print(''.join(expanded))
OUTPUT: DAAAAAAAAAAGGGABC
Assuming that the format is always a letter followed by an integer, with the last integer possibly missing:
>>> from itertools import zip_longest  # named izip_longest in Python 2
>>> s = 'A3G3A'
>>> ''.join(c*int(i) for c, i in zip_longest(*[iter(s)]*2, fillvalue=1))
'AAAGGGA'
Assuming that the format can be any substring followed by an integer, with the integer possibly longer than one digit and the last integer possibly missing:
>>> from itertools import zip_longest  # named izip_longest in Python 2
>>> import re
>>> s = 'AB10GY3ABC'
>>> sp = re.split(r'(\d+)', s)
>>> ''.join(c*int(i) for c, i in zip_longest(*[iter(sp)]*2, fillvalue=1))
'ABABABABABABABABABABGYGYGYABC'
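Minimal pure-Python code that handles all of these cases:
text = 'WA5OUH2!10'
output = ''
n = ''  # digits of the pending repeat count
c = ''  # character waiting to be repeated
for x in text + 'a':
    if x.isdigit():
        n += x
    else:
        if n == '':
            n = '1'
        output = output + c*int(n)
        n = ''
        c = x
With text = 'WA5OUH2!10', output is WAAAAAOUHH!!!!!!!!!!.
The trailing 'a' is a sentinel: each character is emitted only when the next non-digit is seen, so the extra character flushes the last one.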
Another approach could be -
import re
input_string = 'A3G3A'
alphabets = re.findall('[A-Z]', input_string) # List of all alphabets - ['A', 'G', 'A']
digits = re.findall('[0-9]+', input_string) # List of all numbers - ['3', '3']
final_output = "".join([alphabets[i]*int(digits[i]) for i in range(0, len(alphabets)-1)]) + alphabets[-1]
# This expression repeats each letter by the number next to it ( Except for the last letter ), joins the list of strings into a single string, and appends the last character
# final_output - 'AAAGGGA'
Explanation -
In [31]: alphabets # List of alphabets in the string
Out[31]: ['A', 'G', 'A']
In [32]: digits # List of numbers in the string ( Including numbers more than one digit)
Out[32]: ['3', '3']
In [33]: list_of_strings = [alphabets[i]*int(digits[i]) for i in range(0, len(alphabets)-1)] # List of strings after repetition
In [34]: list_of_strings
Out[34]: ['AAA', 'GGG']
In [35]: joined_string = "".join(list_of_strings) # Joined list of strings
In [36]: joined_string
Out[36]: 'AAAGGG'
In [38]: final_output = joined_string + input_string[-1] # Append last character of the string
In [39]: final_output
Out[39]: 'AAAGGGA'
Using * to repeat the characters, assuming each repeat count is a single digit in the range [1, 9]:
q = 'A3G3A'
try:
    int(q[-1])  # check whether the string ends with a digit
except ValueError:
    q = q + '1'  # the last character is repeated only once
"".join([q[i]*int(q[i+1]) for i in range(0, len(q), 2)])
One-line solution, assuming repeat counts in the range [1, 9].
>>> s = 'A3G3A'
>>> s = ''.join(s[i] if not s[i].isdigit() else s[i-1]*(int(s[i])-1) for i in range(0, len(s)))
>>> print(s)
AAAGGGA
Embrace regex! This finds all occurrences of a non-digit character followed by a non-negative integer (any number of digits) and replaces each match with that many copies of the character.
import re
re.sub(r'(\D)(\d+)', lambda m: m.group(1) * int(m.group(2)), 'A3G3A')
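As a quick check, the substitution above can be wrapped in a function and run against the inputs used elsewhere in this thread. This is only a sketch; the name expand_counts is hypothetical:

```python
import re


def expand_counts(text):
    """Repeat each non-digit character by the integer that follows it."""
    # (\D) captures one non-digit character, (\d+) the count after it.
    # Characters without a trailing count are left untouched.
    return re.sub(r'(\D)(\d+)', lambda m: m.group(1) * int(m.group(2)), text)


print(expand_counts('A3G3A'))       # prints AAAGGGA
print(expand_counts('WA5OUH2!10'))  # prints WAAAAAOUHH!!!!!!!!!!
```

Note that a count of 0 removes the character, and digits at the very start of the string (with no character before them) are kept as they are.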
This can be solved by numpy:
import numpy as np
x = 'A3G3A'
if not x[-1].isdigit():
    x += '1'
letters = list(x[::2])
times = list(map(int,x[1::2]))
lst = ''.join(np.repeat(letters, times))
#output
'AAAGGGA'

Exercises with String using Python 3

Write a function called remove_duplicates which will take one argument called string.
This string input will only have characters between a-z.
The function should remove all repeated characters in the string and return a tuple with two values:
A new string with only unique, sorted characters.
The total number of duplicates dropped.
For example:
remove_duplicates('aaabbbac') should produce ('abc', 5)
remove_duplicates('a') should produce ('a', 0)
remove_duplicates('thelexash') should produce ('aehlstx', 2)
My Code:
def remove_duplicates(string):
    for string in "abcdefghijklmnopqrstuvwxyz":
        k = set(string)
        x = len(string) - len(set(string))
        return k, x
print(remove_duplicates("aaabbbccc"))
Expected Output:
I'm expecting it to print ({a, b, c}, 6), but instead it prints ({a}, 0).
What is wrong with my code above? Why isn't it producing what I'm expecting?
You'll get the expected result if you don't iterate over each char of the alphabet string.
I've commented your code so you can see the difference between your script and mine.
Non-working commented code:
def remove_duplicates(string):
    # loop through each char in "abcdefghijklmnopqrstuvwxyz" and call it "string"
    for string in "abcdefghijklmnopqrstuvwxyz":
        # create variable k that holds a set of 1 char because of the loop
        k = set(string)
        # create a variable x that holds the difference between 1 and 1 = 0
        x = len(string) - len(set(string))
        # return these values in the first iteration
        return k, x
print(remove_duplicates("aaabbbccc"))
Outputs:
({'a'}, 0)
Working code:
def remove_duplicates(string):
    # create variable k that holds a set of each unique char present in string
    k = set(string)
    # create a variable x that holds the number of duplicates:
    # the length of the string minus the number of unique chars
    x = len(string) - len(set(string))
    # return these values
    return k, x
print(remove_duplicates("aaabbbccc"))
Outputs:
({'b', 'c', 'a'}, 6)
P.s.: if you want your results to be in order, you can change return k, x to return sorted(k), x, but then the output will be a list.
(['a', 'b', 'c'], 6)
EDIT: if you want your code to run only when a certain condition is met (for example, only if the string doesn't contain any digits), you can add an if/else clause:
Example Code:
def remove_duplicates(s):
    if not any(ch.isdigit() for ch in s):
        k = set(s)
        x = len(s) - len(set(s))
        return sorted(k), x
    else:
        msg = "This function only works with strings that don't contain any digits."
        return msg
print(remove_duplicates("aaabbbccc"))
print(remove_duplicates("123123122"))
Outputs:
(['a', 'b', 'c'], 6)
This function only works with strings that don't contain any digits.
In your code, the function returns during the first iteration of the loop.
The loop variable string shadows the parameter and refers to 'a', the first character of the alphabet literal, not to the input. I think you were trying to iterate over the input string character by character.
For this, you can use collections.Counter, which counts the occurrences of each character for you.
However, there is an alternative solution that doesn't involve computing the count of each character in the given string.
def remove_duplicates(s):
    unique_characters = set(s)  # extract the unique characters in the given string
    new_sorted_string = ''.join(sorted(unique_characters))  # create the sorted string with unique characters
    number_of_duplicates = len(s) - len(unique_characters)  # compute the number of duplicates in the original string
    return new_sorted_string, number_of_duplicates
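For comparison, here is a minimal sketch of the collections.Counter variant mentioned above. The function name remove_duplicates_with_counter is made up for illustration, and the sketch assumes the input contains only the characters a-z, as the exercise states:

```python
from collections import Counter


def remove_duplicates_with_counter(s):
    """Return (sorted unique characters, number of duplicates dropped)."""
    counts = Counter(s)  # maps each character to its number of occurrences
    unique_sorted = ''.join(sorted(counts))
    # Every occurrence after the first one is a dropped duplicate.
    dropped = sum(n - 1 for n in counts.values())
    return unique_sorted, dropped


print(remove_duplicates_with_counter('thelexash'))  # prints ('aehlstx', 2)
```

The set-based version is shorter; Counter is mainly useful if the per-character counts are needed as well.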
You are returning from the function at the first instance where a character is found. So it returns for the first "a".
Try this instead :
def remove_duplicates(string):
    temp = set(string)
    return temp, len(string) - len(temp)
print(remove_duplicates("aaabbbccc"))
Output :
({'c', 'b', 'a'}, 6)
If you want to remove everything except letters (as you mentioned in the comments), try this:
def remove_duplicates(string):
    a = set()
    for i in string:
        if i.isalpha() and i not in a:
            a.add(i)
    return a, len(string) - len(a)

Splitting an unspaced string of decimal values - Python

An awful person has given me a string like this
values = '.850000.900000.9500001.000001.50000'
and I need to split it to create the following list:
['.850000', '.900000', '.950000', '1.00000', '1.50000']
I know that if I were dealing only with numbers < 1, I could use the code
dl = '.'
splitvalues = [dl+e for e in values.split(dl) if e != ""]
But in cases like this one where there are numbers greater than 1 buried in the string, splitvalues would end up being
['.850000', '.900000', '.9500001', '.000001', '.50000']
So is there a way to split a string with multiple delimiters while also splitting the string differently based on which delimiter is encountered?
I think this is somewhat closer to a fixed width format string. Try a regular expression like this:
import re
pattern = r"(\d{1,2}\.\d{5})"
m = re.search(pattern, input_str)  # input_str is the string to split
your_first_number = m.group(0)
Try this repeatedly on the remaining string to consume all numbers.
>>> import re
>>> source = '0.850000.900000.9500001.000001.50000'
>>> re.findall(r"(.*?00+(?!0))", source)
['0.850000', '.900000', '.950000', '1.00000', '1.50000']
The split is based on looking for "anything (as little as possible), a double zero, then any further zeros, not followed by another zero".
Assume that the value before the decimal is less than 10, and then we have,
values = '0.850000.900000.9500001.000001.50000'
result = list()
last_digit = None
for value in values.split('.'):
    if value.endswith('0'):
        result.append(''.join([i for i in [last_digit, '.', value] if i]))
        last_digit = None
    else:
        result.append(''.join([i for i in [last_digit, '.', value[0:-1]] if i]))
        last_digit = value[-1]
if values.startswith('0'):
    result = result[1:]
print(result)
# Output
['.850000', '.900000', '.950000', '1.00000', '1.50000']
How about using re.split():
import re
values = '0.850000.900000.9500001.000001.50000'
print([a + b for a, b in zip(*(lambda x: (x[1::2], x[2::2]))(re.split(r"(\d\.)", values)))])
OUTPUT
['0.85000', '0.90000', '0.950000', '1.00000', '1.50000']
Here the values have a fixed width: 6 digits, or 7 characters including the dot. Take the slices from 0 to 7, from 7 to 14, and so on. Because we don't need the initial zero, I use the slice values[1:] for extraction.
values = '0.850000.900000.9500001.000001.50000'
[values[1:][start:start+7] for start in range(0,len(values[1:]),7)]
['.850000', '.900000', '.950000', '1.00000', '1.50000']
Test:
''.join([values[1:][start:start+7] for start in range(0,len(values[1:]),7)]) == values[1:]
True
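The same fixed-width idea can be sketched as a small helper that also refuses input whose length is not a multiple of the width. The name split_fixed_width and the default width of 7 are assumptions based on the sample data, not something the question guarantees:

```python
def split_fixed_width(values, width=7):
    """Cut values into consecutive chunks of exactly width characters."""
    if len(values) % width != 0:
        raise ValueError('length of input is not a multiple of the width')
    return [values[i:i + width] for i in range(0, len(values), width)]


# The string from the question, without a leading zero.
chunks = split_fixed_width('.850000.900000.9500001.000001.50000')
print(chunks)  # prints ['.850000', '.900000', '.950000', '1.00000', '1.50000']
```

This only works while every value really occupies the same number of characters; a value such as 10.0000 mixed with .850000 would still slice cleanly but at the wrong boundaries.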
With a fixed / variable string, you may try something like:
values = '0.850000.900000.9500001.000001.50000'
str_list = []
first_index = values.find('.')
while first_index > 0:
    last_index = values.find('.', first_index + 1)
    if last_index != -1:
        str_list.append(values[first_index - 1: last_index - 2])
        first_index = last_index
    else:
        str_list.append(values[first_index - 1: len(values) - 1])
        break
print(str_list)
Output:
['0.8500', '0.9000', '0.95000', '1.0000', '1.5000']
Assuming that there will always be a single digit before the decimal.
Please take this as a starting point and not a copy paste solution.

Getting the first appearance of any char from a set in a string - python

is there a better way to find the first appearance of one of the chars: 'x','y','z' in someStr?
def findFirstAppearance(someStr):
    x = someStr.find('x')
    y = someStr.find('y')
    z = someStr.find('z')
    if x == -1: x = len(someStr)
    if y == -1: y = len(someStr)
    if z == -1: z = len(someStr)
    return min(x, y, z)
for example: for someStr = "axby" it should return 1.
for someStr = "aybx" it should also return 1.
thanks!
Maybe:
>>> s = 'this string x contains y several letters z'
>>> next(i for i,c in enumerate(s) if c in 'xyz')
12
>>> s[12]
'x'
This will raise an exception if it's not found, which could be fixed by using a default value:
>>> next(i for i,c in enumerate(s) if c in 'Q')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
>>> next((i for i,c in enumerate(s) if c in 'Q'), -1)
-1
You could also pre-construct a set to test membership in:
>>> special = set("vmp")
>>> next((i for i,c in enumerate(s) if c in special), -1)
27
which might be faster if there were a lot of letters to test against; it'll depend a lot on the sizes involved. Easy to experiment if it matters, but (spoiler alert) it probably doesn't.
Here's an alternative using regular expressions.
import re
def find_first_dx(needles, haystack):
    match = re.search('|'.join(map(re.escape, needles)), haystack)
    return match.start() if match else -1
Examples:
>>> find_first_dx('xyz', 'abcyax')
3
>>> find_first_dx('xyz.', 'a.bcyax')
1
>>> find_first_dx('xyz', 'fee fi foe fum')
-1
>>> find_first_dx(['foe', 'fum'], 'fee fi foe fum')
7
I think this is what you're looking for. This finds the first occurrence of one of many chars (items) in a string. It works just like str.find.
def findany(string, items, start=0, end=-1):
    if end == -1:
        end = len(string)
    for i in range(start, end):
        c = string[i]
        if c in items:
            return i
    return -1
#      01234567
inp = "hellozxy"
print(findany(inp, "xyz"))  # 5 = z
print(findany(inp, "?"))  # -1 = not found
print(findany(inp, ["o", "l"], 3))  # 3, skips the first 'l'
Note: items should be a collection of single characters, either a list of 1-character strings or a plain string (in Python, a string is a sequence of 1-character strings). If you pass something like ["x", "y", "blah"], the multi-character entry "blah" is never matched.
This should work:
def findany(s1, s2):
    for i, x in enumerate(s1):
        if x in s2:
            return i
    return -1
Use enumerate(), it yields a tuple for each character of the string.
Tuple's first element is the index and second element is the character itself.
In [103]: def find_first(strs):
   .....:     for i, x in enumerate(strs):
   .....:         if x in 'xyz':  # current character is 'x', 'y' or 'z': return its index
   .....:             return i
   .....:     return -1  # the loop finished without finding a match
   .....:
In [104]: find_first("fooxbaryzx")
Out[104]: 3
In [105]: find_first("qwerty")
Out[105]: 5
In [106]: find_first("qwert")
Out[106]: -1
In [107]: find_first("axby")
Out[107]: 1
In [108]: find_first("aybx")
Out[108]: 1
For a lot of chars, you should seriously think about using a regular expression,
especially if you are doing this in a loop in your application:
import re
def findall(string, chars):
    m = re.search("[%s]" % chars, string, re.DOTALL)
    if m:
        return m.start()
    return -1
This should be at least 100x faster than a pure-Python loop with a call to "find"
for each char.
Just be aware that characters with a special meaning inside a regexp character class "[ ]" (like "-", "^") need to be escaped.
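One way to handle the escaping automatically is to pass the characters through re.escape before building the character class. The following is a sketch under that assumption; the function name find_first_of is hypothetical:

```python
import re


def find_first_of(text, chars):
    """Return the index of the first character of text found in chars, or -1."""
    if not chars:
        return -1  # an empty character class "[]" is not a valid pattern
    # re.escape backslash-escapes "-", "^", "]" and "\", so they are
    # matched literally inside the character class.
    match = re.search('[' + re.escape(chars) + ']', text)
    return match.start() if match else -1


print(find_first_of('axby', 'xyz'))   # prints 1
print(find_first_of('a-b^c', '^-'))   # prints 1
```

If the same set of characters is searched for many times, compiling the pattern once with re.compile and reusing it avoids rebuilding it on every call.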

Python, context sensitive string substitution

Is it possible to do something like this in Python using regular expressions?
Increment every character that is a number in a string by 1
So input "123ab5" would become "234ab6"
I know I could iterate over the string and manually increment each character if it's a number, but this seems unpythonic.
note. This is not homework. I've simplified my problem down to a level that sounds like a homework exercise.
a = "123ab5"
b = ''.join(map(lambda x: str(int(x) + 1) if x.isdigit() else x, a))
or:
b = ''.join(str(int(x) + 1) if x.isdigit() else x for x in a)
or:
b = a.translate(str.maketrans('0123456789', '1234567890'))  # string.maketrans in Python 2
In any of these cases:
# b == "234ab6"
EDIT - the first two map 9 to a 10, the last one wraps it to 0. To wrap the first two into zero, you will have to replace str(int(x) + 1) with str((int(x) + 1) % 10)
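Since the question asks about regular expressions specifically, here is a minimal sketch of the same idea with re.sub and a replacement function, using the wrap-around variant so that 9 becomes 0. The name increment_digits is hypothetical:

```python
import re


def increment_digits(text):
    """Add 1 to every ASCII digit in text, wrapping 9 around to 0."""
    # The replacement function is called once per matched digit.
    return re.sub(r'[0-9]', lambda m: str((int(m.group()) + 1) % 10), text)


print(increment_digits('123ab5'))  # prints 234ab6
print(increment_digits('x9'))      # prints x0
```

Dropping the "% 10" gives the non-wrapping behaviour, where 9 turns into the two characters 10.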
>>> test = '123ab5'
>>> def f(x):
...     try:
...         return str(int(x)+1)
...     except ValueError:
...         return x
...
>>> ''.join(map(f,test))
'234ab6'
>>> a = "123ab5"
>>> def foo(n):
...     try: n = int(n)+1
...     except ValueError: pass
...     return str(n)
...
>>> a = ''.join(map(foo, a))
>>> a
'234ab6'
By the way, whether with a simple if or with try/except, eumiro's solution with join + map is the most Pythonic one for me too.
