python - in-place string reversal [duplicate] - python

There is no built in reverse function for Python's str object. What is the best way of implementing this method?
If supplying a very concise answer, please elaborate on its efficiency. For example, whether the str object is converted to a different object, etc.

Using slicing:
>>> 'hello world'[::-1]
'dlrow olleh'
Slice notation takes the form [start:stop:step]. In this case, we omit the start and stop positions since we want the whole string. We also use step = -1, which means, "repeatedly step from right to left by 1 character".
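To make the start/stop/step description concrete, here is a small sketch that builds the same slice explicitly and asks Python which indices it will visit. The variable names are mine, chosen only for illustration:

```python
text = 'hello world'

# [::-1] is shorthand for a slice object with start and stop left unset.
whole_reversed = slice(None, None, -1)
assert text[whole_reversed] == text[::-1] == 'dlrow olleh'

# slice.indices() resolves the omitted values for a sequence of this length.
start, stop, step = whole_reversed.indices(len(text))
print(start, stop, step)  # 10 -1 -1

# Walking that range by hand visits the same characters, right to left.
manual = ''.join(text[i] for i in range(start, stop, step))
assert manual == text[::-1]
```

So with a negative step, an omitted start means "the last character" and an omitted stop means "just past the first character".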

@Paolo's s[::-1] is fastest; a slower approach (maybe more readable, but that's debatable) is ''.join(reversed(s)).

What is the best way of implementing a reverse function for strings?
My own experience with this question is academic. However, if you're a pro looking for the quick answer, use a slice that steps by -1:
>>> 'a string'[::-1]
'gnirts a'
or more readably (but slower due to the method name lookups and the fact that join forms a list when given an iterator), str.join:
>>> ''.join(reversed('a string'))
'gnirts a'
or for readability and reusability, put the slice in a function
def reversed_string(a_string):
    return a_string[::-1]
and then:
>>> reversed_string('a_string')
'gnirts_a'
Longer explanation
If you're interested in the academic exposition, please keep reading.
There is no built-in reverse function in Python's str object.
Here are a couple of things about Python's strings you should know:
In Python, strings are immutable. Changing a string does not modify the string. It creates a new one.
Strings are sliceable. Slicing a string gives you a new string from one point in the string, backwards or forwards, to another point, by given increments. They take slice notation or a slice object in a subscript:
string[subscript]
The subscript creates a slice by including a colon within the square brackets:
string[start:stop:step]
To create a slice outside of the square brackets, you'll need to create a slice object:
slice_obj = slice(start, stop, step)
string[slice_obj]
A readable approach:
While ''.join(reversed('foo')) is readable, it requires calling a string method, str.join, on the result of another function call, which can be relatively slow. Let's put this in a function - we'll come back to it:
def reverse_string_readable_answer(string):
    return ''.join(reversed(string))
Most performant approach:
Much faster is using a reverse slice:
'foo'[::-1]
But how can we make this more readable and understandable to someone less familiar with slices or the intent of the original author? Let's create a slice object outside of the subscript notation, give it a descriptive name, and pass it to the subscript notation.
start = stop = None
step = -1
reverse_slice = slice(start, stop, step)
'foo'[reverse_slice]
Implement as Function
To actually implement this as a function, I think it is semantically clear enough to simply use a descriptive name:
def reversed_string(a_string):
    return a_string[::-1]
And usage is simply:
reversed_string('foo')
What your teacher probably wants:
If you have an instructor, they probably want you to start with an empty string, and build up a new string from the old one. You can do this with pure syntax and literals using a while loop:
def reverse_a_string_slowly(a_string):
    new_string = ''
    index = len(a_string)
    while index:
        index -= 1                    # index = index - 1
        new_string += a_string[index] # new_string = new_string + character
    return new_string
This is theoretically bad because, remember, strings are immutable, so every time it looks like you're appending a character to new_string, you're theoretically creating a new string. However, CPython knows how to optimize this in certain cases, of which this trivial case is one.
Best Practice
Theoretically better is to collect your substrings in a list, and join them later:
def reverse_a_string_more_slowly(a_string):
    new_strings = []
    index = len(a_string)
    while index:
        index -= 1
        new_strings.append(a_string[index])
    return ''.join(new_strings)
However, as we will see in the timings below for CPython, this actually takes longer, because CPython can optimize the string concatenation.
Timings
Here are the timings:
>>> import timeit
>>> a_string = 'amanaplanacanalpanama' * 10
>>> min(timeit.repeat(lambda: reverse_string_readable_answer(a_string)))
10.38789987564087
>>> min(timeit.repeat(lambda: reversed_string(a_string)))
0.6622700691223145
>>> min(timeit.repeat(lambda: reverse_a_string_slowly(a_string)))
25.756799936294556
>>> min(timeit.repeat(lambda: reverse_a_string_more_slowly(a_string)))
38.73570013046265
CPython optimizes string concatenation, whereas other implementations may not:
... do not rely on CPython's efficient implementation of in-place string concatenation for statements in the form a += b or a = a + b . This optimization is fragile even in CPython (it only works for some types) and isn't present at all in implementations that don't use refcounting. In performance sensitive parts of the library, the ''.join() form should be used instead. This will ensure that concatenation occurs in linear time across various implementations.

Quick Answer (TL;DR)
Example
### example01 -------------------
mystring = 'coup_ate_grouping'
backwards = mystring[::-1]
print(backwards)
### ... or even ...
mystring = 'coup_ate_grouping'[::-1]
print(mystring)
### result01 -------------------
'''
gnipuorg_eta_puoc
'''
Detailed Answer
Background
This answer is provided to address the following concern from @odigity:
Wow. I was horrified at first by the solution Paolo proposed, but that
took a back seat to the horror I felt upon reading the first
comment: "That's very pythonic. Good job!" I'm so disturbed that such
a bright community thinks using such cryptic methods for something so
basic is a good idea. Why isn't it just s.reverse()?
Problem
Context
Python 2.x
Python 3.x
Scenario:
Developer wants to transform a string
Transformation is to reverse order of all the characters
Solution
example01 produces the desired result, using extended slice notation.
Pitfalls
Developer might expect something like string.reverse()
The native idiomatic (aka "pythonic") solution may not be readable to newer developers
Developer may be tempted to implement his or her own version of string.reverse() to avoid slice notation.
The output of slice notation may be counter-intuitive in some cases:
see e.g., example02
print 'coup_ate_grouping'[-4:] ## => 'ping'
compared to
print 'coup_ate_grouping'[-4:-1] ## => 'pin'
compared to
print 'coup_ate_grouping'[-1] ## => 'g'
the different outcomes of indexing on [-1] may throw some developers off
Rationale
Python has a special circumstance to be aware of: a string is an iterable type.
One rationale for excluding a string.reverse() method is to give python developers incentive to leverage the power of this special circumstance.
In simplified terms, this simply means each individual character in a string can be easily operated on as a part of a sequential arrangement of elements, just like arrays in other programming languages.
To understand how this works, reviewing example02 can provide a good overview.
Example02
### example02 -------------------
## start (with positive integers)
print 'coup_ate_grouping'[0] ## => 'c'
print 'coup_ate_grouping'[1] ## => 'o'
print 'coup_ate_grouping'[2] ## => 'u'
## start (with negative integers)
print 'coup_ate_grouping'[-1] ## => 'g'
print 'coup_ate_grouping'[-2] ## => 'n'
print 'coup_ate_grouping'[-3] ## => 'i'
## start:end
print 'coup_ate_grouping'[0:4] ## => 'coup'
print 'coup_ate_grouping'[4:8] ## => '_ate'
print 'coup_ate_grouping'[8:12] ## => '_gro'
## start:end
print 'coup_ate_grouping'[-4:] ## => 'ping' (counter-intuitive)
print 'coup_ate_grouping'[-4:-1] ## => 'pin'
print 'coup_ate_grouping'[-4:-2] ## => 'pi'
print 'coup_ate_grouping'[-4:-3] ## => 'p'
print 'coup_ate_grouping'[-4:-4] ## => ''
print 'coup_ate_grouping'[0:-1] ## => 'coup_ate_groupin'
print 'coup_ate_grouping'[0:] ## => 'coup_ate_grouping' (counter-intuitive)
## start:end:step (or start:end:stride)
print 'coup_ate_grouping'[-1::1] ## => 'g'
print 'coup_ate_grouping'[-1::-1] ## => 'gnipuorg_eta_puoc'
## combinations
print 'coup_ate_grouping'[-1::-1][-4:] ## => 'puoc'
Conclusion
The cognitive load associated with understanding how slice notation works in python may indeed be too much for some adopters and developers who do not wish to invest much time in learning the language.
Nevertheless, once the basic principles are understood, the power of this approach over fixed string manipulation methods can be quite favorable.
For those who think otherwise, there are alternate approaches, such as lambda functions, iterators, or simple one-off function declarations.
If desired, a developer can implement her own string.reverse() method, however it is good to understand the rationale behind this aspect of python.

This answer is a bit longer and contains 3 sections: Benchmarks of existing solutions, why most solutions here are wrong, my solution.
The existing answers are only correct if Unicode Modifiers / grapheme clusters are ignored. I'll deal with that later, but first have a look at the speed of some reversal algorithms:
NOTE: What I called list_comprehension should be called slicing.
list_comprehension : min: 0.6μs, mean: 0.6μs, max: 2.2μs
reverse_func : min: 1.9μs, mean: 2.0μs, max: 7.9μs
reverse_reduce : min: 5.7μs, mean: 5.9μs, max: 10.2μs
reverse_loop : min: 3.0μs, mean: 3.1μs, max: 6.8μs
list_comprehension : min: 4.2μs, mean: 4.5μs, max: 31.7μs
reverse_func : min: 75.4μs, mean: 76.6μs, max: 109.5μs
reverse_reduce : min: 749.2μs, mean: 882.4μs, max: 2310.4μs
reverse_loop : min: 469.7μs, mean: 577.2μs, max: 1227.6μs
You can see that the time for the list comprehension (reversed = string[::-1]) is in all cases by far the lowest (even after fixing my typo).
String Reversal
If you really want to reverse a string in the common sense, it is WAY more complicated. For example, take the following string (brown finger pointing left, yellow finger pointing up). Those are two graphemes, but 3 unicode code points. The additional one is a skin modifier.
example = "👈🏾👆"
But if you reverse it with any of the given methods, you get brown finger pointing up, yellow finger pointing left. The reason for this is that the "brown" color modifier is still in the middle and gets applied to whatever is before it. So we have
U: finger pointing up
M: brown modifier
L: finger pointing left
and
original: LMU 👈🏾👆
reversed: UML (above solutions) ☝🏾👈
reversed: ULM (correct reversal) 👆👈🏾
Unicode Grapheme Clusters are a bit more complicated than just modifier code points. Luckily, there is a library for handling graphemes:
>>> import grapheme
>>> g = grapheme.graphemes("👈🏾👆")
>>> list(g)
['👈🏾', '👆']
and hence the correct answer would be
def reverse_graphemes(string):
    g = list(grapheme.graphemes(string))
    return ''.join(g[::-1])
which also is by far the slowest:
list_comprehension : min: 0.5μs, mean: 0.5μs, max: 2.1μs
reverse_func : min: 68.9μs, mean: 70.3μs, max: 111.4μs
reverse_reduce : min: 742.7μs, mean: 810.1μs, max: 1821.9μs
reverse_loop : min: 513.7μs, mean: 552.6μs, max: 1125.8μs
reverse_graphemes : min: 3882.4μs, mean: 4130.9μs, max: 6416.2μs
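If installing a third-party library is not an option, a rough standard-library approximation is possible for one narrow case: combining marks such as accents. The sketch below is my own and is not full grapheme segmentation. It relies only on unicodedata.combining, so it does not handle emoji skin-tone modifiers (like the example above), zero-width joiner sequences, or flags; the function name is hypothetical.

```python
import unicodedata


def reverse_with_combining_marks(text):
    """Reverse text, keeping combining marks attached to their base character.

    Simplified approximation only: emoji modifiers and ZWJ sequences are
    NOT treated as clusters here.
    """
    clusters = []
    for ch in text:
        # A non-zero combining class means ch modifies the preceding character.
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return ''.join(reversed(clusters))


decomposed = 'e\u0301a'  # 'e' + COMBINING ACUTE ACCENT, then 'a'
naive = decomposed[::-1]  # the accent ends up attached to 'a'
careful = reverse_with_combining_marks(decomposed)
```

Here naive is 'a' + accent + 'e', while careful keeps the accent on the 'e'. For anything beyond accents, a real grapheme library remains the safer choice.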
The Code
#!/usr/bin/env python3
import numpy as np
import random
import timeit
from functools import reduce
random.seed(0)
def main():
    longstring = ''.join(random.choices("ABCDEFGHIJKLM", k=2000))
    functions = [(list_comprehension, 'list_comprehension', longstring),
                 (reverse_func, 'reverse_func', longstring),
                 (reverse_reduce, 'reverse_reduce', longstring),
                 (reverse_loop, 'reverse_loop', longstring)
                 ]
    duration_list = {}
    for func, name, params in functions:
        durations = timeit.repeat(lambda: func(params), repeat=100, number=3)
        duration_list[name] = list(np.array(durations) * 1000)
        print('{func:<20}: '
              'min: {min:5.1f}μs, mean: {mean:5.1f}μs, max: {max:6.1f}μs'
              .format(func=name,
                      min=min(durations) * 10**6,
                      mean=np.mean(durations) * 10**6,
                      max=max(durations) * 10**6,
                      ))
    create_boxplot('Reversing a string of length {}'.format(len(longstring)),
                   duration_list)
def list_comprehension(string):
    return string[::-1]
def reverse_func(string):
    return ''.join(reversed(string))
def reverse_reduce(string):
    return reduce(lambda x, y: y + x, string)
def reverse_loop(string):
    reversed_str = ""
    for i in string:
        reversed_str = i + reversed_str
    return reversed_str
def create_boxplot(title, duration_list, showfliers=False):
    import seaborn as sns
    import matplotlib.pyplot as plt
    import operator
    plt.figure(num=None, figsize=(8, 4), dpi=300,
               facecolor='w', edgecolor='k')
    sns.set(style="whitegrid")
    sorted_keys, sorted_vals = zip(*sorted(duration_list.items(),
                                           key=operator.itemgetter(1)))
    flierprops = dict(markerfacecolor='0.75', markersize=1,
                      linestyle='none')
    ax = sns.boxplot(data=sorted_vals, width=.3, orient='h',
                     flierprops=flierprops,
                     showfliers=showfliers)
    ax.set(xlabel="Time in ms", ylabel="")
    plt.yticks(plt.yticks()[0], sorted_keys)
    ax.set_title(title)
    plt.tight_layout()
    plt.savefig("output-string.png")
if __name__ == '__main__':
    main()

1. using slice notation
def rev_string(s):
    return s[::-1]
2. using reversed() function
def rev_string(s):
    return ''.join(reversed(s))
3. using recursion
def rev_string(s):
    # <= 1 also covers the empty string, which would otherwise raise IndexError
    if len(s) <= 1:
        return s
    return s[-1] + rev_string(s[:-1])
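As a quick sanity check, the three approaches can be compared on a few inputs, including the empty string. This is a minimal sketch with function names of my own choosing, so that all three can coexist in one file:

```python
def rev_slice(s):
    return s[::-1]


def rev_join(s):
    return ''.join(reversed(s))


def rev_recursive(s):
    # Base case covers both the empty string and a single character.
    if len(s) <= 1:
        return s
    return s[-1] + rev_recursive(s[:-1])


samples = ["", "a", "ab", "hello world"]
results = [(rev_slice(s), rev_join(s), rev_recursive(s)) for s in samples]

# All three implementations should agree on every sample.
all_agree = all(a == b == c for a, b, c in results)
print(all_agree)
```

Keep in mind that the recursive version uses one stack frame per character, so with CPython's default recursion limit of 1000 it is only suitable for short strings.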

A less perplexing way to look at it would be:
string = 'happy'
print(string)
'happy'
string_reversed = string[-1::-1]
print(string_reversed)
'yppah'
In English [-1::-1] reads as:
"Starting at -1, go all the way, taking steps of -1"

Reverse a string in python without using reversed() or [::-1]
def reverse(test):
    n = len(test)
    x=""
    for i in range(n-1,-1,-1):
        x += test[i]
    return x

This is also an interesting way:
def reverse_words_1(s):
    rev = ''
    for i in range(len(s)):
        j = ~i  # equivalent to j = -(i + 1)
        rev += s[j]
    return rev
or similar:
def reverse_words_2(s):
    rev = ''
    for i in reversed(range(len(s))):
        rev += s[i]
    return rev
Another more 'exotic' way using bytearray which supports .reverse()
b = bytearray('Reverse this!', 'UTF-8')
b.reverse()
b.decode('UTF-8')
will produce:
'!siht esreveR'
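One caveat worth illustrating: bytearray.reverse() reverses bytes, not characters, so this trick is only safe for pure-ASCII text. The sketch below (the helper name is mine) shows that a string with a multi-byte UTF-8 character no longer decodes after reversal:

```python
def reverse_via_bytearray(text):
    buf = bytearray(text, 'UTF-8')
    buf.reverse()  # reverses the raw bytes in place
    return buf.decode('UTF-8')


ascii_result = reverse_via_bytearray('Reverse this!')

# 'é' is two bytes in UTF-8; reversing them produces an invalid byte sequence.
try:
    reverse_via_bytearray('caf\u00e9')
    non_ascii_failed = False
except UnicodeDecodeError:
    non_ascii_failed = True

print(ascii_result, non_ascii_failed)
```

For text that may contain non-ASCII characters, slicing the str itself avoids this problem.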

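from functools import reduce  # reduce is a builtin only in Python 2
def reverse(input):
    # the '' initial value makes the empty string work too
    return reduce(lambda x,y : y+x, input, '')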

Here is a no-frills one:
def reverse(text):
    r_text = ''
    index = len(text) - 1
    while index >= 0:
        r_text += text[index]  # strings can be concatenated
        index -= 1
    return r_text
print(reverse("hello, world!"))

There are multiple ways to reverse a string in Python
Slicing Method
string = "python"
rev_string = string[::-1]
print(rev_string)
using reversed function
string = "python"
rev= reversed(string)
rev_string = "".join(rev)
print(rev_string)
Using Recursion
string = "python"
def reverse(string):
    if len(string)==0:
        return string
    else:
        return reverse(string[1:])+string[0]
print(reverse(string))
Using for Loop
string = "python"
rev_string =""
for s in string:
    rev_string = s+ rev_string
print(rev_string)
Using while Loop
string = "python"
rev_str =""
length = len(string)-1
while length >=0:
    rev_str += string[length]
    length -= 1
print(rev_str)

original = "string"
rev_index = original[::-1]
rev_func = list(reversed(list(original))) #nsfw
print(original)
print(rev_index)
print(''.join(rev_func))

To solve this in a programmatic way for an interview:
def reverse_a_string(string: str) -> str:
    """
    This method is used to reverse a string.
    Args:
        string: a string to reverse
    Returns: a reversed string
    """
    if type(string) != str:
        raise TypeError("{0} is not a string. Please provide a string!".format(type(string)))
    string_place_holder = ""
    start = 0
    end = len(string) - 1
    if end >= 1:
        while start <= end:
            string_place_holder = string_place_holder + string[end]
            end -= 1
        return string_place_holder
    else:
        return string
a = "hello world"
rev = reverse_a_string(a)
print(rev)
Output:
dlrow olleh

Recursive method:
def reverse(s): return s[0] if len(s)==1 else s[len(s)-1] + reverse(s[0:len(s)-1])
example:
print(reverse("Hello!")) #!olleH

def reverse_string(string):
    length = len(string)
    temp = ''
    for i in range(length):
        temp += string[length - i - 1]
    return temp
print(reverse_string('foo')) #prints "oof"
This works by looping through a string and assigning its values in reverse order to another string.

a=input()
print(a[::-1])
The above code receives the input from the user and prints the reverse of the input by applying [::-1].
OUTPUT:
>>> Happy
>>> yppaH
But when it comes to the case of sentences, view the code output below:
>>> Have a happy day
>>> yad yppah a evaH
But if you want each word reversed while the words stay in their original order, try this:
a = input().split()  # Splits the input on whitespace
for b in a:  # b takes each value in the list a in turn
    print(b[::-1], end=" ")  # end=" " prints a space after each word instead of a newline
In line 2 of the above code, b takes each value in the list a in turn. a is a list because calling split() on the input string returns a list of words. Also remember that split() can't be used on the result of int(input()), since an int has no split method.
OUTPUT:
>>> Have a happy day
>>> evaH a yppah yad
If we don't add end=" " in the above code, then it will print like the following:
>>> Have a happy day
>>> evaH
>>> a
>>> yppah
>>> yad
Below is an example to understand the end argument:
CODE:
for i in range(1,6):
    print(i)  # Without end
OUTPUT:
>>> 1
>>> 2
>>> 3
>>> 4
>>> 5
Now the code with end:
for i in range(1,6):
    print(i, end=" || ")
OUTPUT:
>>> 1 || 2 || 3 || 4 || 5 ||

Here is how we can reverse a string using for loop:
string = "hello,world"
for i in range(-1,-len(string)-1,-1):
    print (string[i], end=(" "))

Just as a different solution(because it's asked in interviews):
def reverse_checker(string):
    ns = ""
    for h in range(1,len(string)+1):
        ns += string[-h]
    print(ns)
    if ns == string:
        return True
    else:
        return False

Related

Convert list of similar ints to tuple of int and occurances [duplicate]

I'm trying to write a simple Python algorithm to solve this problem. Can you please help me figure out how to do this?
If any character is repeated more than 4 times, the entire set of
repeated characters should be replaced with a slash '/', followed by a
2-digit number which is the length of this run of repeated characters,
and the character. For example, "aaaaa" would be encoded as "/05a".
Runs of 4 or less characters should not be replaced since performing
the encoding would not decrease the length of the string.
I see many great solutions here but none that feels very pythonic to my eyes. So I'm contributing with a implementation I wrote myself today for this problem.
from itertools import groupby
from typing import Iterator, Tuple
def run_length_encode(data: str) -> Iterator[Tuple[str, int]]:
    """Returns run length encoded Tuples for string"""
    # A memory efficient (lazy) and pythonic solution using generators
    return ((x, sum(1 for _ in y)) for x, y in groupby(data))
This will return a generator of Tuples with the character and number of instances, but can easily be modified to return a string as well. A benefit of doing it this way is that it's all lazy evaluated and won't consume more memory or cpu than needed if you don't need to exhaust the entire search space.
If you still want string encoding the code can quite easily be modified for that use case like this:
from itertools import groupby
def run_length_encode(data: str) -> str:
    """Returns run length encoded string for data"""
    # A memory efficient (lazy) and pythonic solution using generators
    return "".join(f"{x}{sum(1 for _ in y)}" for x, y in groupby(data))
This is a more generic run length encoding for all lengths, and not just for those of over 4 characters. But this could also quite easily be adapted with a conditional for the string if wanted.
Rosetta Code has a lot of implementations, that should easily be adaptable to your usecase.
Here is Python code with regular expressions:
from re import sub
def encode(text):
    '''
    Doctest:
        >>> encode('WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWW')
        '12W1B12W3B24W1B14W'
    '''
    return sub(r'(.)\1*', lambda m: str(len(m.group(0))) + m.group(1),
               text)
def decode(text):
    '''
    Doctest:
        >>> decode('12W1B12W3B24W1B14W')
        'WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWW'
    '''
    return sub(r'(\d+)(\D)', lambda m: m.group(2) * int(m.group(1)),
               text)
textin = "WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWW"
assert decode(encode(textin)) == textin
Aside from setting a=i after encoding a sequence and setting a width for your int when it is formatted into the string, you could also do the following, which takes advantage of Python's groupby. It's also a good idea to use format when constructing strings.
from itertools import groupby
def runLengthEncode (plainText):
    res = []
    for k,i in groupby(plainText):
        run = list(i)
        if(len(run) > 4):
            res.append("/{:02}{}".format(len(run), k))
        else:
            res.extend(run)
    return "".join(res)
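The question only asks for encoding, but a matching decoder is a handy way to check the encoder. Below is a minimal sketch of a round trip. The decoder is my own addition and rests on two assumptions that the problem statement does not guarantee: the original text contains no '/' characters, and no run is longer than 99 (so the count always fits in two digits). The snake_case names are mine.

```python
import re
from itertools import groupby


def run_length_encode(plain_text):
    parts = []
    for char, group in groupby(plain_text):
        run = len(list(group))
        # Only runs longer than 4 are replaced, as the problem statement asks.
        parts.append("/{:02}{}".format(run, char) if run > 4 else char * run)
    return "".join(parts)


def run_length_decode(encoded):
    # Assumes '/' never appears in the original text and runs are < 100.
    return re.sub(r"/(\d{2})(.)",
                  lambda m: m.group(2) * int(m.group(1)),
                  encoded,
                  flags=re.DOTALL)


sample = "aaaaabbbbcdddddddddddd"  # 5 a's, 4 b's, 1 c, 12 d's
encoded = run_length_encode(sample)
print(encoded)  # /05abbbbc/12d
assert run_length_decode(encoded) == sample
```

If the input may contain '/' or very long runs, the format itself becomes ambiguous and would need an escaping rule.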
Just observe the behaviour:
>>> runLengthEncode("abcd")
'abc'
Last character is ignored. You have to append what you've collected.
>>> runLengthEncode("abbbbbcd")
'a/5b/5b'
Oops, problem after encoding. You should set a=i even if you found a long enough sequence.
I know this is not the most efficient solution, but we haven't studied functions like groupby() yet so here's what I did:
def runLengthEncode (plainText):
    res=''
    a=''
    count = 0
    for i in plainText:
        count+=1
        if a.count(i)>0:
            a+=i
        else:
            if len(a)>4:
                if len(a)<10:
                    res+="/0"+str(len(a))+a[0][:1]
                else:
                    res+="/" + str(len(a)) + a[0][:1]
                a=i
            else:
                res+=a
                a=i
        if count == len(plainText):
            if len(a)>4:
                if len(a)<10:
                    res+="/0"+str(len(a))+a[0][:1]
                else:
                    res+="/" + str(len(a)) + a[0][:1]
            else:
                res+=a
    return(res)
Split=(list(input("Enter string: ")))
Split.append("")
a = 0
for i in range(len(Split)):
    try:
        if (Split[i] in Split) >0:
            a = a + 1
        if Split[i] != Split[i+1]:
            print(Split[i],a)
            a = 0
    except IndexError:
        print()
this is much easier and works everytime
def RLE_comp_encode(text):
    if text == text[0]*len(text) :
        return str(len(text))+text[0]
    else:
        comp_text , r = '' , 1
        for i in range (1,len(text)):
            if text[i]==text[i-1]:
                r +=1
                if i == len(text)-1:
                    comp_text += str(r)+text[i]
            else :
                comp_text += str(r)+text[i-1]
                r = 1
        return comp_text
This worked for me,
You can use the groupby() function combined with a list/generator comprehension:
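from itertools import groupby
# itertools.imap exists only in Python 2; a generator expression works in both versions.
# Short runs are emitted in full (x * reps) rather than as a single character.
''.join(x * reps if reps <= 4 else "/%02d%s" % (reps, x) for x, reps in ((k, len(list(g))) for k, g in groupby(s)))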
An easy solution to run-length encoding which I can think of:
For encoding a string like "a4b5c6d7...":
def encode(s):
counts = {}
for c in s:
if counts.get(c) is None:
counts[c] = s.count(c)
return "".join(k+str(v) for k,v in counts.items())
For decoding a string like "aaaaaabbbdddddccccc....":
def decode(s):
return "".join((map(lambda tup: tup[0] * int(tup[1]), zip(s[0:len(s):2], s[1:len(s):2]))))
Fairly easy to read and simple.
text=input("Please enter the string to encode")
encoded=[]
index=0
amount=1
while index<=(len(text)-1):
    if index==(len(text)-1) or text[index]!=text[(index+1)]:
        encoded.append((text[index],amount))
        amount=1
    else:
        amount=amount+1
    index=index+1
print(encoded)

How to replace characters of string from a list entry in Python?

I have a string in which I want to replace certain characters with "*". But replace() function of python doesn't replace the characters. I understand that the strings in python are immutable and I am creating a new variable to store the replaced string. But still the function doesn't provide the replaced strings.
This is the following code that I have written. I have tried the process in two ways but still don't get the desired output:
1st way:
a = "AGGCFTFGADFADTRFCAGFADARTRADFACDGFLKLIAP"
rep = ['A','C','P']
for char in rep:
    new = a.replace(char, "*")
print(new)
Output:
AGGCFTFGADFADTRFCAGFADARTRADFACDGFLKLIA*
2nd way:
a = "AGGCFTFGADFADTRFCAGFADARTRADFACDGFLKLIAP"
rep = ['A','C','P']
for i in a:
    if(i in rep):
        new = a.replace(i, "*")
print(new)
Output:
AGGCFTFGADFADTRFCAGFADARTRADFACDGFLKLIA*
Any help would be much appreciated. Thanks
You assign the result of a.replace(char, "*") to new, but then on the next iteration of the for loop, you again replace parts of a, not new. Instead of assigning to new, just assign the result to a, replacing the original string.
a = "AGGCFTFGADFADTRFCAGFADARTRADFACDGFLKLIAP"
rep = ['A','C','P']
for char in rep:
    a = a.replace(char, "*")
print(a)
In addition to the answers offered, I would suggest that regular expressions make this perhaps more straightforward, accomplishing all of the substitutions with a single function call.
>>> import re
>>> a = "AGGCFTFGADFADTRFCAGFADARTRADFACDGFLKLIAP"
>>> rep = ['A','C','P']
>>> r = re.compile('|'.join(rep))
>>> r.sub('*', a)
'*GG*FTFG*DF*DTRF**GF*D*RTR*DF**DGFLKLI**'
Just in case someone decides to be clever and puts something regex significant in rep, you could escape those when compiling your regex.
r = re.compile('|'.join(re.escape(x) for x in rep))
Others have explained errors in posted code. An alternative using generator expression:
new = ''.join("*" if char in ['A','C','P'] else char for char in a)
print(new)
>>> '*GG*FTFG*DF*DTRF**GF*D*RTR*DF**DGFLKLI**'
A simple loop is easy to understand and efficient. The crucial part of the looping approach is to re-assign the string reference to the output of replace()
I've taken the liberty of plagiarising two pieces of code from other contributors in order to demonstrate the performance differences (in case that's important).
import re
from timeit import timeit
a = "AGGCFTFGADFADTRFCAGFADARTRADFACDGFLKLIAP"
rep = 'A', 'C', 'P'
p = re.compile('|'.join(rep))
def v1(s):
    for c in rep:
        s = s.replace(c, '*')
    return s
def v2(s):
    return p.sub('*', s)
def v3(s):
    return ''.join("*" if char in rep else char for char in s)
for func in v1, v2, v3:
    print(func.__name__, timeit(lambda: func(a)))
assert v1(a) == v2(a)
assert v1(a) == v3(a)
Output:
v1 0.3363962830003402
v2 1.8725565750000897
v3 3.3800653280000006
Platform:
macOS 13.0.1
Python 3.11.0
3 GHz 10-Core Intel Xeon W
As already mentioned, you should write a = a.replace(i, "*") because you are looping through rep and you want to do the replacement in the string a. Strings are immutable, and replace gives back a copy of the string.
The variable new only gives you the replacement from the last iteration over rep, which is the P char, and will result in AGGCFTFGADFADTRFCAGFADARTRADFACDGFLKLIA* because there is only a single P at the end of the string and you are never actually changing the value of a.
If you have single characters, you can use a character class [ACP] with a single call to re.sub
import re
a = "AGGCFTFGADFADTRFCAGFADARTRADFACDGFLKLIAP"
print(re.sub("[ACP]", "*", a))
Output
*GG*FTFG*DF*DTRF**GF*D*RTR*DF**DGFLKLI**
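Another single-pass option for single-character replacements, not mentioned in the answers above, is str.translate with a table built by str.maketrans. This is only a sketch using the variables from the question; it was not part of the timing comparison, so no claim is made here about its speed:

```python
a = "AGGCFTFGADFADTRFCAGFADARTRADFACDGFLKLIAP"
rep = ['A', 'C', 'P']

# Map every character in rep to "*"; characters not in the table are kept.
table = str.maketrans({char: "*" for char in rep})
masked = a.translate(table)
print(masked)
```

Like the character-class regex, this only works when every entry in rep is a single character.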

How do I reverse words in a string with Python

I am trying to reverse words of a string, but having difficulty, any assistance will be appreciated:
S = " what is my name"
def reversStr(S):
    for x in range(len(S)):
        return S[::-1]
        break
What I get now is: eman ym si tahw
However, I am trying to get: tahw is ym eman (individual words reversed)
def reverseStr(s):
    return ' '.join([x[::-1] for x in s.split(' ')])
orig = "what is my name"
reverse = ""
for word in orig.split():
    reverse = "{} {}".format(reverse, word[::-1])
print(reverse)
Since everyone else's covered the case where the punctuation moves, I'll cover the one where you don't want the punctuation to move.
import re
def reverse_words(sentence):
    return re.sub(r'[a-zA-Z]+', lambda x : x.group()[::-1], sentence)
Breaking this down.
re is python's regex module, and re.sub is the function in that module that handles substitutions. It has three required parameters.
The first is the regex you're matching by. In this case, I'm using r'[a-zA-Z]+'. The r denotes a raw string, [a-zA-Z] matches all letters, and + means "at least one".
The second is either a string to substitute in, or a function that takes in a match object and outputs a string. I'm using a lambda (or nameless) function that simply outputs the matched string, reversed.
The third is the string you want to do the find and replace in.
So "What is my name?" -> "tahW si ym eman?"
Addendum:
I considered a regex of r'\w+' initially, because of its better Unicode support (if the right flags are given), but \w also includes numbers and underscores. Matching - might also be desired behavior: the regexes would be r'[a-zA-Z-]+' (note trailing hyphen) and r'[\w-]+', but then you'd probably want to not match double-dashes (i.e. --), so more regex modifications might be needed.
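If moving punctuation is acceptable and the main concern is preserving the original spacing (leading spaces, double spaces), a variant of the same re.sub idea is to treat every run of non-whitespace as a word. This is a sketch of that variant, with a function name of my own, and it is not what the answer above does: here punctuation attached to a word is reversed along with it.

```python
import re


def reverse_each_word(sentence):
    # \S+ matches each maximal run of non-whitespace; whitespace is untouched.
    return re.sub(r'\S+', lambda m: m.group()[::-1], sentence)


spaced = reverse_each_word(" what  is my   name")
punctuated = reverse_each_word("What is my name?")
print(repr(spaced))
print(repr(punctuated))
```

With this pattern "name?" becomes "?eman", whereas the [a-zA-Z]+ pattern leaves the question mark in place.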
The built-in reversed outputs a reversed object, which you have to cast back to string, so I generally prefer the [::-1] option.
In-place refers to modifying the object without creating a copy. As many have already pointed out, Python strings are immutable, so technically we cannot reverse a Python str object in place. However, if you use a mutable data type, say bytearray, for storing the string characters, you can actually reverse it in place:
# slicing creates a copy, so this is not an in-place reversal
def rev(x):
    return x[-1::-1]
# in-place reversal, if the input is a bytearray
def rev_inplace(x: bytearray):
    i = 0; j = len(x)-1
    while i<j:
        t = x[i]
        x[i] = x[j]
        x[j] = t
        i += 1; j -= 1
    return x
Input:
x = bytearray(b'some string to reverse')
rev_inplace(x)
Output:
bytearray(b'esrever ot gnirts emos')
Try splitting each word in the string into a list (see: https://docs.python.org/2/library/stdtypes.html#str.split).
Example:
>>> string = "This will be split up"
>>> string_list = string.split(" ")
>>> string_list
['This', 'will', 'be', 'split', 'up']
Then iterate through the list and reverse each constituent list item (i.e. word) which you have working already.
def reverse_in_place(phrase):
    res = []
    phrase = phrase.split(" ")
    for word in phrase:
        word = word[::-1]
        res.append(word)
    res = " ".join(res)
    return res
[thread has been closed, but IMO, not well answered]
Python's str type doesn't include an in-place str.reverse() method.
So use the built in reversed() function call to accomplish the same thing.
>>> S = " what is my name"
>>> ("").join(reversed(S))
'eman ym si tahw'
There is no obvious way of reversing a string "truly" in-place with Python. However, you can do something like:
def reverse_string_inplace(string):
    w = len(string)-1
    p = w
    while True:
        q = string[p]
        string = ' ' + string + q
        w -= 1
        if w < 0:
            break
    return string[(p+1)*2:]
Hope this makes sense.
In Python, strings are immutable. This means you cannot change the string once you have created it. So in-place reverse is not possible.
There are many ways to reverse the string in python, but memory allocation is required for that reversed string.
print(' '.join(word[::-1] for word in string.split()))
s1 = input("Enter a string with multiple words:")
print(f'Original:{s1}')
print(f'Reverse is:{s1[::-1]}')
each_word_new_list = []
s1_split = s1.split()
for i in range(0,len(s1_split)):
    each_word_new_list.append(s1_split[i][::-1])
print(f'New Reverse as List:{each_word_new_list}')
each_word_new_string=' '.join(each_word_new_list)
print(f'New Reverse as String:{each_word_new_string}')
If the sentence contains multiple spaces then usage of split() function will cause trouble because you won't know then how many spaces you need to rejoin after you reverse each word in the sentence. Below snippet might help:
# Sentence having multiple spaces
given_str = "I know this country runs by mafia "
tmp = ""
tmp_list = []
for i in given_str:
    if i != ' ':
        tmp = tmp + i
    else:
        if tmp == "":
            tmp_list.append(i)
        else:
            tmp_list.append(tmp)
            tmp_list.append(i)
            tmp = ""
print(tmp_list)
rev_list = []
for x in tmp_list:
    rev = x[::-1]
    rev_list.append(rev)
print(rev_list)
print(''.join(rev_list))
output:
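def rev(a):
    if a == "":
        return ""
    else:
        z = rev(a[1:]) + a[0]
        return z
Reverse string --> gnirts esreveR
# renamed from rev so that it calls the character-reversing rev above instead of itself
def rev_words(k):
    y = rev(k).split()
    for i in range(len(y)-1,-1,-1):
        print(y[i], end=" ")
-->esreveR gnirts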

Remove Last instance of a character and rest of a string

If I have a string as follows:
foo_bar_one_two_three
Is there a clean way, with RegEx, to return: foo_bar_one_two?
I know I can use split, pop and join for this, but I'm looking for a cleaner solution.
result = my_string.rsplit('_', 1)[0]
Which behaves like this:
>>> my_string = 'foo_bar_one_two_three'
>>> print(my_string.rsplit('_', 1)[0])
foo_bar_one_two
See the documentation entry for str.rsplit([sep[, maxsplit]]).
One way is to use rfind to get the index of the last _ character and then slice the string to extract the characters up to that point:
>>> s = "foo_bar_one_two_three"
>>> idx = s.rfind("_")
>>> if idx >= 0:
...     s = s[:idx]
...
>>> print(s)
foo_bar_one_two
You need to check that the rfind call returns something greater than -1 before using it to get the substring; otherwise the slice becomes s[:-1] and strips off the last character.
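A short illustration of why the check matters: when the character is missing, rfind returns -1, and slicing with -1 silently drops the last character. The helper name strip_after_last is hypothetical.

```python
def strip_after_last(s, ch="_"):
    idx = s.rfind(ch)
    # rfind returns -1 when ch is absent; s[:-1] would then chop
    # the final character, so only slice on a real match.
    return s[:idx] if idx >= 0 else s

print(strip_after_last("foo_bar_one_two_three"))  # foo_bar_one_two
print(strip_after_last("foobar"))                 # foobar
print("foobar"[:"foobar".rfind("_")])             # fooba (the pitfall)
```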
If you must use regular expressions (and I tend to prefer non-regex solutions for simple cases like this), you can do it thus:
>>> import re
>>> s = "foo_bar_one_two_three"
>>> re.sub('_[^_]*$','',s)
'foo_bar_one_two'
Similar to the rsplit solution, rpartition will also work:
result = my_string.rpartition("_")[0]
You'll need to watch out for the case where the separator character is not found. In that case the original string will be in index 2, not 0.
doc string:
rpartition(...)
S.rpartition(sep) -> (head, sep, tail)
Search for the separator sep in S, starting at the end of S, and return
the part before it, the separator itself, and the part after it. If the
separator is not found, return two empty strings and S.
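To illustrate the not-found case described above, here is a minimal sketch (the function name before_last is invented here): unpack the three parts and fall back to the original string when the separator part comes back empty.

```python
def before_last(s, sep="_"):
    head, found, tail = s.rpartition(sep)
    # When sep is absent, rpartition returns ('', '', s),
    # so `found` is empty and the whole string is in `tail`.
    return head if found else tail

print("foo_bar".rpartition("_"))   # ('foo', '_', 'bar')
print("foobar".rpartition("_"))    # ('', '', 'foobar')
print(before_last("foo_bar_one_two_three"))  # foo_bar_one_two
```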
Here is a generic function to remove everything after the last occurrence of any specified string. For extra credit, it also supports removing everything after the nth last occurrence.
def removeEverythingAfterLast(needle, haystack, n=1):
    while n > 0:
        idx = haystack.rfind(needle)
        if idx >= 0:
            haystack = haystack[:idx]
            n -= 1
        else:
            break
    return haystack
In your case, to remove everything after the last '_', you would simply call it like this:
updatedString = removeEverythingAfterLast('_', yourString)
If you wanted to remove everything after the 2nd last '_', you would call it like this:
updatedString = removeEverythingAfterLast('_', yourString, 2)
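For comparison, the same behaviour can probably be had from str.rsplit and its maxsplit argument, assuming the needle is a non-empty string. The name remove_after_last is illustrative and not part of the answer above.

```python
def remove_after_last(needle, haystack, n=1):
    # A non-positive n means "remove nothing", like the loop version.
    if n <= 0:
        return haystack
    # rsplit performs at most n splits from the right; the first
    # element is everything before the n-th last occurrence.
    # With fewer than n occurrences it splits on all of them,
    # which mirrors the early break in the loop-based version.
    return haystack.rsplit(needle, n)[0]

print(remove_after_last('_', 'foo_bar_one_two_three'))     # foo_bar_one_two
print(remove_after_last('_', 'foo_bar_one_two_three', 2))  # foo_bar_one
```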
I know this is Python, and my answer may be a little off in syntax, but in Java you would do:
String a = "foo_bar_one_two_three";
String[] b = a.split("_");
String c = "";
for(int i=0; i<b.length-1; i++){
    c += b[i];
    if(i != b.length-2){
        c += "_";
    }
}
//and at this point, c is "foo_bar_one_two"
I hope the split function works the same way in Python. :)
EDIT:
Using the limit part of the function you can do:
String a = "foo_bar_one_two_three";
String[] b = a.split("_",StringUtils.countMatches(a,"_"));
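//and at this point, b is the array = [foo,bar,one,two_three]; the limit keeps the remainder in the last element, so drop that element to get the parts of foo_bar_one_two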

Python: find most frequent bytes?

I'm looking for a (preferably simple) way to find and order the most common bytes in a python stream element.
e.g.
>>> freq_bytes(b'hello world')
b'lohe wrd'
or even
>>> freq_bytes(b'hello world')
[108,111,104,101,32,119,114,100]
I currently have a function that returns a list in the form list[97] == occurrences of "a". I need that to be sorted.
I figure I basically need to flip the list, so that list[a] = b becomes list[b] = a, while removing the repeats at the same time.
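If the counts already live in a list indexed by byte value, as described above, one possible way to get the ordering is to sort the indices by their counts instead of flipping the list. This is only a sketch; counts and order_by_frequency are names assumed for illustration.

```python
def order_by_frequency(counts):
    # counts[b] is the number of occurrences of byte value b.
    present = [b for b, c in enumerate(counts) if c > 0]
    # Sort by descending count; ties stay in ascending byte order
    # because sorted() is stable and `present` is already ascending.
    return sorted(present, key=lambda b: -counts[b])

counts = [0] * 256
for b in b"hello world":   # iterating bytes yields ints in Python 3
    counts[b] += 1

print(order_by_frequency(counts))
# [108, 111, 32, 100, 101, 104, 114, 119]
```

Note that ties come out in byte-value order here, not in first-seen order as in the example output of the question.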
Try the Counter class in the collections module.
from collections import Counter
string = "hello world"
print(''.join(char[0] for char in Counter(string).most_common()))
Note you need Python 2.7 or later.
Edit: I forgot that the most_common() method returns a list of value/count tuples, so I used a generator expression to get just the values.
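On Python 3, the same idea applies directly to a bytes object, where iterating yields integers. Below is a sketch of the freq_bytes function from the question, assuming Python 3.7 or later so that ties keep their first-seen order.

```python
from collections import Counter

def freq_bytes(data):
    # most_common() returns (byte_value, count) pairs, highest count first.
    ordered = [value for value, _count in Counter(data).most_common()]
    return bytes(ordered)

print(freq_bytes(b"hello world"))        # b'lohe wrd'
print(list(freq_bytes(b"hello world")))  # [108, 111, 104, 101, 32, 119, 114, 100]
```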
def frequent_bytes(aStr):
    d = {}
    for char in aStr:
        d[char] = d.setdefault(char, 0) + 1
    myList = []
    for char, frequency in d.items():
        myList.append((frequency, char))
    myList.sort(reverse=True)
    # join only the characters; the tuples also carry the counts
    return ''.join(char for frequency, char in myList)
>>> frequent_bytes('hello world')
'lowrhed '
I just tried something obvious. @kindall's answer rocks, though. :)
