strip(char) on a string - python

I am trying to strip the characters '_ ' (underscore and space) away from my string. The first code fails to strip anything.
The code for word_1 works just as I intend. Could anyone enlighten me how to modify the first code to get output 'ale'?
word = 'a_ _ le'
word.strip('_ ')
word_1 = '_ _ le'
word_1.strip('_ ')

You need replace() in this use case, not strip():
word.replace('_ ', '')
strip():
string.strip(s[, chars])
Return a copy of the string with leading and trailing characters removed. If chars is omitted or None, whitespace characters are removed. If given and not None, chars must be a string; the characters in the string will be stripped from both ends of the string this method is called on.
replace():
string.replace(s, old, new[, maxreplace])
Return a copy of string s with all occurrences of substring old replaced by new. If the optional argument maxreplace is given, the first maxreplace occurrences are replaced.
Strings in Python

.strip() removes any of the given characters from the start and end of the source string; it never touches the middle.
You want .replace().
>>> word = 'a_ _ le'
>>> word = word.replace("_ ", "")
>>> word
'ale'

.strip() only removes characters from the start and end of a string; it does not touch anything in the middle. To remove a substring wherever it occurs, use .replace(): word.replace('_ ', '') returns 'ale'.
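To make the difference concrete, here is a small sketch using the two strings from the question. The str.translate line is an addition that the answers above do not mention: it removes every underscore and every space individually, which may be closer to what you want if the two characters do not always appear as the exact pair '_ '. It uses Python 3 syntax.

```python
word = 'a_ _ le'
word_1 = '_ _ le'

# strip() only peels characters off the two ends, so the middle of `word` survives.
stripped = word.strip('_ ')
stripped_1 = word_1.strip('_ ')

# replace() removes every occurrence of the exact substring '_ '.
replaced = word.replace('_ ', '')

# translate() deletes each listed character wherever it occurs.
translated = word.translate(str.maketrans('', '', '_ '))

print(repr(stripped), repr(stripped_1), repr(replaced), repr(translated))
```

Here strip() leaves word unchanged because its first and last characters ('a' and 'e') are not in the set, while word_1 starts with strippable characters and is reduced to 'le'.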

Related

Why does sentence.strip() remove certain characters but not others from the end of this string?

Trying to figure out how strip() works when reading characters in a string.
This:
sentence = "All the single ladies"
sentence = sentence.strip("All the si")
print(sentence)
returns this:
ngle lad
I get why 'All the si' is removed from the start of the string. But how does Python decide to remove the 'ies' from the end of the string? If the 'e' is being removed from the 'ies', why isn't it being removed from 'the' too? What are the rules for string stripping behavior?
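.strip() takes a string that it treats as a set of characters to remove, not as a substring. The characters i, e and s all appear in the argument you passed ('All the si'), so they are stripped from the end. The d (now the last character of the result) does not appear in it, so stripping stops there. The e in 'the' at the start is removed for the same reason, but strip() only works inward from the two ends, so nothing past the first non-matching character is touched.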
See more in the docs.
To remove the substring you would use:
sentence.replace("All the si", "")
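Note that replace() removes the substring wherever it occurs, not only at the start. If the goal is to drop it only when it is a prefix, a minimal sketch could check startswith() first. The variable names prefix, as_set and without_prefix are made up for this illustration.

```python
sentence = "All the single ladies"
prefix = "All the si"

# strip() treats its argument as a set of characters and trims both ends.
as_set = sentence.strip(prefix)

# Slice the prefix off only when the string really starts with it.
if sentence.startswith(prefix):
    without_prefix = sentence[len(prefix):]
else:
    without_prefix = sentence

print(repr(as_set))
print(repr(without_prefix))
```

On Python 3.9 and later, str.removeprefix() does the same check-and-slice in one call.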

How to use text strip() function?

I can strip numerics but not alpha characters:
>>> text
'132abcd13232111'
>>> text.strip('123')
'abcd'
Why is the following not working?
>>> text.strip('abcd')
'132abcd13232111'
The reason is simple and stated in the documentation of strip:
str.strip([chars])
Return a copy of the string with the leading and trailing characters removed.
The chars argument is a string specifying the set of characters to be removed.
'abcd' is neither leading nor trailing in the string '132abcd13232111' so it isn't stripped.
Just to add a few examples to Jim's answer, according to .strip() docs:
Return a copy of the string with the leading and trailing characters removed.
The chars argument is a string specifying the set of characters to be removed.
If omitted or None, the chars argument defaults to removing whitespace.
The chars argument is not a prefix or suffix; rather, all combinations of its values are stripped.
So it doesn't matter whether the characters are digits or letters. Your second call didn't work as you expected because 'abcd' sits in the middle of the string, surrounded by digits that are not in the set of characters to strip.
Example1:
s = '132abcd13232111'
print(s.strip('123'))
print(s.strip('abcd'))
Output:
abcd
132abcd13232111
Example2:
t = 'abcd12312313abcd'
print(t.strip('123'))
print(t.strip('abcd'))
Output:
abcd12312313abcd
12312313
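One way to picture the rule "all combinations of its values are stripped" is a hand-written equivalent. The function below, strip_chars, is a hypothetical re-implementation for illustration only, not how CPython implements str.strip(); it walks inward from each end and stops at the first character that is not in chars.

```python
def strip_chars(s, chars):
    """Illustrative stand-in for s.strip(chars); chars is used as a set."""
    start, end = 0, len(s)
    # Advance past leading characters that belong to the set.
    while start < end and s[start] in chars:
        start += 1
    # Retreat past trailing characters that belong to the set.
    while end > start and s[end - 1] in chars:
        end -= 1
    return s[start:end]

s = '132abcd13232111'
t = 'abcd12312313abcd'

# The sketch agrees with the built-in on the examples above.
for text in (s, t):
    for chars in ('123', 'abcd'):
        assert strip_chars(text, chars) == text.strip(chars)

print(strip_chars(s, '123'), strip_chars(t, 'abcd'))
```

Because each loop stops at the first non-matching character, 'abcd' in the middle of s is never examined when stripping 'abcd'.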

How can we strip punctuation at the start of a string using Python?

I want to strip all kinds of punctuation at the start of a string using Python. My list contains strings, and some of them start with some kind of punctuation. How can I strip all types of punctuation from the strings?
For example: If my word is like ,,gets, I want to strip ,, from the word, and I want gets as the result. Also, I want to strip away spaces as well as numbers from the list. I have tried with the following code but it is not producing the correct result.
If 'a' is a list containing some words:
for i in range(0, len(a)):
    a[i] = a[i].lstrip().rstrip()
    print(a[i])
You can use strip():
Return a copy of the string with the leading and trailing characters
removed. The chars argument is a string specifying the set of
characters to be removed.
Passing string.punctuation will remove all leading and trailing punctuation chars:
>>> import string
>>> string.punctuation
'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
>>> l = [',,gets', 'gets,,', ',,gets,,']
>>> for item in l:
...     print(item.strip(string.punctuation))
...
gets
gets
gets
Or use lstrip() if you need only leading characters removed, and rstrip() for trailing characters.
Hope that helps.
Pass the characters you want to remove in lstrip and rstrip
'..foo..'.lstrip('.').rstrip('.') == 'foo'
strip(), when called without arguments, strips only whitespace. To strip any other character, pass it as an argument to strip(). In your case you should be doing
a[i]=a[i].strip(',')
To remove punctuation, spaces, numbers from the beginning of each string in a list of strings:
import string
chars = string.punctuation + string.whitespace + string.digits
a[:] = [s.lstrip(chars) for s in a]
Note: it doesn't take into account non-ascii punctuation, whitespace, or digits.
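If non-ASCII input matters, one possible workaround is to classify each character with the standard unicodedata module instead of listing characters by hand. This is only a sketch: the helper names is_strippable and lstrip_unicode are hypothetical, and treating Unicode categories P* and S* as "punctuation" is an assumption (it covers everything in string.punctuation, but you may want a narrower rule).

```python
import unicodedata

def is_strippable(ch):
    # Assumption: "punctuation" means Unicode general categories P* and S*.
    # Whitespace and digits are detected with the str methods.
    return ch.isspace() or ch.isdigit() or unicodedata.category(ch)[0] in 'PS'

def lstrip_unicode(s):
    # Walk from the left until the first character we want to keep.
    i = 0
    while i < len(s) and is_strippable(s[i]):
        i += 1
    return s[i:]

# '\u00ab' is a left guillemet, '\uff0c' is a fullwidth comma.
a = [',,gets', '\u00ab 42gets', '\uff0c\uff0cgets,,']
a[:] = [lstrip_unicode(s) for s in a]
print(a)
```

Only the leading run is removed, so the trailing commas in the last item are kept.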
If you want to remove it only from the beginning, try this:
import re
s='"gets'
re.sub(r'("|,,)(.*)',r'\2',s)
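The pattern above is not anchored and only knows about '"' and ',,'. A more general variant, offered here as a sketch rather than as the answerer's method, anchors the match with ^ and uses a character class. The name LEADING_JUNK is made up for this example, and counting underscores and digits as removable is an assumption based on the question.

```python
import re

# ^ anchors at the start of the string; [\W\d_] matches any non-word
# character, any digit, or an underscore.
LEADING_JUNK = re.compile(r'^[\W\d_]+')

samples = ['"gets', ',,gets', ' 12,,gets,,', 'gets']
cleaned = [LEADING_JUNK.sub('', s) for s in samples]
print(cleaned)
```

Trailing punctuation is left alone because the pattern can only match at position 0.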
Assuming you want to remove all punctuation regardless of where it occurs in a list containing strings (which may contain multiple words), this should work:
test1 = ",,gets"
test2 = ",,gets,,"
test3 = ",,this is a sentence and it has commas, and many other punctuations!!"
test4 = [" ", "junk1", ",,gets", "simple", 90234, "234"]
test5 = "word1 word2 word3 word4 902344"
import string
remove_l = string.punctuation + " " + "1234567890"
for t in [test1, test2, test3, test4, test5]:
    if isinstance(t, str):
        print(" ".join([x.strip(remove_l) for x in t.split()]))
    else:
        print([x.strip(remove_l) for x in t
               if isinstance(x, str) and len(x.strip(remove_l))])
# lstrip() returns a new string, so the result has to be stored.
a = [each_string.lstrip(',./";:') for each_string in a]  # list all the characters you want to strip

Add string between tabs and text

I simply want to add a string after the (0 or more) tabs at the beginning of a string.
i.e.
a = '\t\t\tHere is the next part of string. More garbage.'
(insert Added String here.)
to
b = '\t\t\t Added String here. Here is the next part of string. More garbage.'
What is the easiest/simplest way to go about it?
Simple:
re.sub(r'^(\t*)', r'\1 Added String here. ', inputtext)
The ^ caret matches the start of the string, and \t matches a tab character, of which there should be zero or more (*). The parentheses capture the matched tabs for use in the replacement string, where \1 inserts them again in front of the string you are adding.
Demo:
>>> import re
>>> a = '\t\t\tHere is the next part of string. More garbage.'
>>> re.sub(r'^(\t*)', r'\1 Added String here. ', a)
'\t\t\t Added String here. Here is the next part of string. More garbage.'
>>> re.sub(r'^(\t*)', r'\1 Added String here. ', 'No leading tabs.')
' Added String here. No leading tabs.'
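If you would rather avoid a regular expression, the same result can be sketched with lstrip() and slicing. The function name insert_after_tabs is hypothetical; the idea is simply to count how many leading tabs lstrip('\t') would remove and splice the new text in at that index.

```python
def insert_after_tabs(text, addition):
    # The length difference before and after lstrip is the number of leading tabs.
    n_tabs = len(text) - len(text.lstrip('\t'))
    return text[:n_tabs] + addition + text[n_tabs:]

a = '\t\t\tHere is the next part of string. More garbage.'
b = insert_after_tabs(a, ' Added String here. ')
print(repr(b))
```

With no leading tabs, n_tabs is 0 and the addition is simply prepended, matching the behaviour of the regex demo.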

Replace first occurrence of string in Python

I have a sample string. How can I replace the first occurrence of this string in a longer string with an empty string?
regex = re.compile('text')
match = regex.match(url)
if match:
    url = url.replace(regex, '')
The string replace() function solves this problem perfectly:
string.replace(s, old, new[, maxreplace])
Return a copy of string s with all occurrences of substring old replaced by new. If the optional argument maxreplace is given, the first maxreplace occurrences are replaced.
>>> u'longlongTESTstringTEST'.replace('TEST', '?', 1)
u'longlong?stringTEST'
Use re.sub directly, this allows you to specify a count:
regex.sub('', url, 1)
(Note that the order of arguments is replacement first, then the original string, not the other way around as you might expect.)
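Both approaches side by side, as a minimal sketch that reuses the sample string from the answer above. The variable names via_replace and via_regex are only for illustration.

```python
import re

url = 'longlongTESTstringTEST'

# str.replace with a count of 1 touches only the first occurrence.
via_replace = url.replace('TEST', '', 1)

# The compiled-pattern equivalent; count=1 limits the number of substitutions.
regex = re.compile('TEST')
via_regex = regex.sub('', url, count=1)

print(via_replace, via_regex)
```

The code in the question fails for two separate reasons: regex.match() only matches at the very start of the string (regex.search() looks anywhere), and str.replace() expects a string as its first argument, not a compiled pattern.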
