How to strip comma in Python string - python

How can I strip the comma from a Python string such as Foo, bar? I tried 'Foo, bar'.strip(','), but it didn't work.

You want to replace it, not strip it:
s = s.replace(',', '')

Use the string's replace method, not strip:
s = s.replace(',','')
An example:
>>> s = 'Foo, bar'
>>> s.replace(',',' ')
'Foo  bar'
>>> s.replace(',','')
'Foo bar'
>>> s.strip(',') # removes commas only at the start and end of the string, and there are none here
'Foo, bar'
>>> s.strip(',') == s
True
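To make the difference visible, here is a small sketch using a made-up string that has commas at both ends as well as in the middle. The variable names are only for illustration.

```python
# A made-up sample with commas at the ends and in the middle.
s = ',Foo, bar,'

stripped = s.strip(',')        # removes only the leading and trailing commas
replaced = s.replace(',', '')  # removes every comma, wherever it is

print(repr(stripped))
print(repr(replaced))
```

So strip is the right tool only when the unwanted characters sit at the ends of the string.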

'foo,bar'.translate({ord(char): None for char in ','})  # Python 3; on Python 2, call this on a unicode string such as u'foo,bar'

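This removes any commas and spaces from the start and end of each value, which also left-justifies the text. Commas in the middle of the value are left in place.
for row in inputfile:
    place = row['your_row_number_here'].strip(', ')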
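A minimal sketch of that loop, assuming the rows come from csv.DictReader. The in-memory file and the column name 'place' are hypothetical stand-ins for your own input.

```python
import csv
import io

# Hypothetical in-memory CSV standing in for the real input file.
data = io.StringIO('id,place\n1,", Paris, France, "\n2,Rome\n')

# strip(', ') trims commas and spaces from both ends of each value only.
places = [row['place'].strip(', ') for row in csv.DictReader(data)]
print(places)
```

The comma inside "Paris, France" survives, because strip never looks past the ends of the string.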

Related

What's the difference between these two input statements?

I've tried these two input statements in Python. Both statements return the same output. What's the difference between split() and split(" ")?
a=[int(i) for i in input().split(" ")]
print(a)
and
a=[int(i) for i in input().split()]
print(a)
The default action of method split on a string is to split on any grouping of white space:
>>> 'foo bar'.split()
['foo', 'bar']
>>> 'foo \n \t bar'.split()
['foo', 'bar']
If you pass a literal space as the argument, however, the split is done differently, with only a literal space as the splitter, and with empty strings resulting from adjacent literal spaces:
>>> 'foo \n \t   bar'.split(' ')
['foo', '\n', '\t', '', '', 'bar']
If the input has only single, ordinary spaces, there will be no observable difference.
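A quick sketch of where the two calls diverge for the code in the question, assuming the user types two spaces between the first two numbers. The sample line is made up.

```python
line = "1  2 3"  # made-up input with two spaces between 1 and 2

print(line.split())     # runs of whitespace count as one separator
print(line.split(" "))  # each single space separates, leaving an empty string

numbers = [int(tok) for tok in line.split()]

# With split(" "), the empty string reaches int() and raises ValueError.
failed = False
try:
    [int(tok) for tok in line.split(" ")]
except ValueError:
    failed = True
print(numbers, failed)
```

So split() is usually the safer choice when parsing numbers typed by a user.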

list comprehension using regex conditional

I have a list of strings.
If any of these strings contains a 4-digit year, I want to truncate the string at the end of the year.
Otherwise I leave the strings alone.
I tried using:
for x in my_strings:
    m=re.search("\D\d\d\d\d\D",x)
    if m: x=x[:m.end()]
I also tried:
my_strings=[x[:re.search("\D\d\d\d\d\D",x).end()] if re.search("\D\d\d\d\d\D",x) for x in my_strings]
Neither of these is working.
Can you tell me what I am doing wrong?
Something like this seems to work on trivial data:
>>> regex = re.compile(r'^(.*(?<=\D)\d{4}(?=\D))(.*)')
>>> strings = ['foo', 'bar', 'baz', 'foo 1999', 'foo 1999 never see this', 'bar 2010 n 2015', 'bar 20156 see this']
>>> [regex.sub(r'\1', s) for s in strings]
['foo', 'bar', 'baz', 'foo 1999', 'foo 1999', 'bar 2010', 'bar 20156 see this']
Looks like your only bound on the result string is at the end(), so you should be using re.match() instead, and modify your regex to:
my_expr = r".*?\D\d{4}\D"
Then, in your code, do:
regex = re.compile(my_expr)
my_new_strings = []
for string in my_strings:
    match = regex.match(string)
    if match:
        my_new_strings.append(match.group())
    else:
        my_new_strings.append(string)
Or as a list comprehension:
regex = re.compile(my_expr)
matches = ((regex.match(string), string) for string in my_strings)
my_new_strings = [match.group() if match else string for match, string in matches]
Alternatively, you could use re.sub:
regex = re.compile(r'(\D\d{4}\D).*')  # the trailing .* consumes the rest of the string so that sub() drops it
new_strings = [regex.sub(r'\1', string) for string in my_strings]
I am not entirely sure of your usecase, but the following code can give you some hints:
import re
my_strings = ['abcd', 'ab12cd34', 'ab1234', 'ab1234cd', '1234cd', '123cd1234cd']
for index, string in enumerate(my_strings):
    match = re.search(r'\d{4}', string)
    if match:
        my_strings[index] = string[0:match.end()]
print(my_strings)
# ['abcd', 'ab12cd34', 'ab1234', 'ab1234', '1234', '123cd1234']
You were actually pretty close with the list comprehension, but your syntax is off - you need to make the first expression a "conditional expression" aka x if <boolean> else y:
[x[:re.search(r"\D\d\d\d\d\D",x).end()] if re.search(r"\D\d\d\d\d\D",x) else x for x in my_strings]
Obviously this is pretty ugly/hard to read. There are several better ways to split your string around a 4-digit year. Such as:
[re.split(r'(?<=\D\d{4})\D', x)[0] for x in my_strings]
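If it helps, here is one more sketch that wraps the idea in a small function. Note that it swaps the \D checks for lookarounds, on the assumption that a year at the very start or end of the string should also count, and it cuts right after the last digit of the year. The names YEAR and truncate_after_year are made up for this example.

```python
import re

# Four digits that are not part of a longer run of digits (an assumption
# about what "a 4-digit year" means in your data).
YEAR = re.compile(r'(?<!\d)\d{4}(?!\d)')

def truncate_after_year(text):
    """Cut text right after the first standalone 4-digit number, if any."""
    match = YEAR.search(text)
    return text[:match.end()] if match else text

samples = ['foo', 'foo 1999 never see this', 'bar 20156 see this', '1999 at start']
result = [truncate_after_year(s) for s in samples]
print(result)
```

Compiling the pattern once and searching only once per string also avoids the repeated re.search call in the one-line version.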

remove characters from a python string

I have several python strings from which I want unwanted characters removed.
Examples:
"This is '-' a test"
should be "This is a test"
"This is a test L)[_U_O-Y OH : l’J1.l'}/"
should be "This is a test"
"> FOO < BAR"
should be "FOO BAR"
"I<<W5§!‘1“¢!°\" I"
should be ""
(because if only words are extracted then it returns I W I and none of them form words)
"l‘?£§l%nbia ;‘\\~siI.ve_rswinq m"
should be ""
"2|'J]B"
should be ""
This is what I have so far; however, it does not keep the original spaces between words.
>>> line = re.sub(r"\W+","","This is '-' a test")
>>> line
'Thisisatest'
>>> line = re.sub(r"\W+","","This is a test L)[_U_O-Y OH : l’J1.l'}/")
>>> line
'ThisisatestL_U_OYOHlJ1l'
# although I would prefer this to be "This is a test", but if that is not possible I would
# prefer "This is a test L_U_OYOHlJ1l"
>>> line = re.sub(r"\W+","","> FOO < BAR")
>>> line
'FOOBAR'
>>> line = re.sub(r"\W+","","I<<W5§!‘1“¢!°\" I")
>>> line
'IW51I'
>>> line = re.sub(r"\W+","","l‘?£§l%nbia ;‘\\~siI.ve_rswinq m")
>>> line
'llnbiasiIve_rswinqm'
>>> line = re.sub(r"\W+","","2|'J]B")
>>> line
'2JB'
I will be filtering the regex cleaned words through a list of predefined words later.
I'd go with a split and filter, like this:
' '.join(word for word in line.split() if word.isalpha() and word.lower() in words)
This removes all non-alphabetic words, as well as alphabetic words that are not in words (your collection of predefined words).
Examples:
def myfilter(string):
    words = {'this', 'test', 'i', 'a', 'foo', 'bar'}
    return ' '.join(word for word in string.split() if word.isalpha() and word.lower() in words)
>>> myfilter("This is '-' a test")
'This a test'
>>> myfilter("This is a test L)[_U_O-Y OH : l’J1.l'}/")
'This a test'
>>> myfilter("> FOO < BAR")
'FOO BAR'
>>> myfilter("I<<W5§!‘1“¢!°\" I")
'I'
>>> myfilter("l‘?£§l%nbia ;‘\\~siI.ve_rswinq m")
''
>>> myfilter("2|'J]B")
''
This one removes any group of non-space characters that contains at least one non-alphabetic character. It will leave some unwanted groups of letters, though:
re.sub(r"\w*[^a-zA-Z ]+\w*","","This is a test L)[_U_O-Y OH : l’J1.l'}/")
gives:
'This is a test  OH  '
It will also leave runs of more than one space:
re.sub(r"[^a-zA-Z ]+\w*","","This is '-' a test")
'This is  a test' # two spaces
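One possible way to deal with the leftover spaces is to split the result on whitespace and join it back with single spaces. This is only a sketch built on the second regex above, and the function name strip_symbol_groups is made up.

```python
import re

def strip_symbol_groups(text):
    """Drop runs of non-letters (plus any word characters that follow),
    then collapse the leftover whitespace to single spaces."""
    cleaned = re.sub(r"[^a-zA-Z ]+\w*", "", text)
    # split() with no argument discards empty pieces, so joining with one
    # space also removes leading and trailing spaces.
    return " ".join(cleaned.split())

print(strip_symbol_groups("This is '-' a test"))
print(strip_symbol_groups("> FOO < BAR"))
```

It still keeps stray letter groups such as OH, so the word-list filter is needed as a second pass.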

Python and Line Breaks

With Python I know that the "\n" breaks to the next line in a string, but what I am trying to do is replace every "," in a string with a '\n'. Is that possible? I am kind of new to Python.
Try this:
text = 'a, b, c'
text = text.replace(',', '\n')
print(text)
For lists:
text = ['a', 'b', 'c']
text = '\n'.join(text)
print(text)
>>> s = 'Hello, world'
>>> s = s.replace(',','\n')
>>> print(s)
Hello
 world
>>> str_list = s.split('\n')
>>> print(str_list)
['Hello', ' world']
For further operations you may check: http://docs.python.org/library/stdtypes.html
You can insert a literal \n into your string by escaping the backslash, e.g.
>>> print('\n') # prints an empty line
>>> print('\\n') # prints \n
\n
The same principle is used in regular expressions. Use this expression to replace every , in a string with \n:
>>> re.sub(",", "\\n", "flurb, durb, hurr")
'flurb\n durb\n hurr'
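Note that a plain replace keeps the space that followed each comma, so every line after the first starts with a space. If that is not wanted, one option is to split on the comma and strip each piece before joining. This is a sketch, and the variable names are only for illustration.

```python
text = 'a, b, c'

# Split on commas, trim the whitespace around each piece, then join the
# pieces with newlines.
lines = '\n'.join(part.strip() for part in text.split(','))
print(lines)
```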

Joining multiple strings if they are not empty in Python

I have four strings and any of them can be empty. I need to join them into one string with spaces between them. If I use:
new_string = string1 + ' ' + string2 + ' ' + string3 + ' ' + string4
The result is a blank space on the beginning of the new string if string1 is empty. Also, I have three blank spaces if string2 and string3 are empty.
How can I easily join them without blank spaces when I don't need them?
>>> strings = ['foo','','bar','moo']
>>> ' '.join(filter(None, strings))
'foo bar moo'
By using None in the filter() call, it removes all falsy elements.
If you KNOW that the strings have no leading/trailing whitespace:
>>> strings = ['foo','','bar','moo']
>>> ' '.join(x for x in strings if x)
'foo bar moo'
otherwise:
>>> strings = ['foo ','',' bar', ' ', 'moo']
>>> ' '.join(x.strip() for x in strings if x.strip())
'foo bar moo'
and if any of the strings have non-leading/trailing whitespace, you may need to work harder still. Please clarify what it is that you actually have.
>>> strings = ['foo','','bar','moo']
>>> ' '.join([x for x in strings if x != ''])
'foo bar moo'
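If you need this in several places, the same idea can be wrapped in a small helper. This is a sketch for Python 3, and the name join_nonempty and its sep parameter are made up for this example.

```python
def join_nonempty(*parts, sep=' '):
    """Join the given strings with sep, skipping empty or whitespace-only ones."""
    stripped = (p.strip() for p in parts)
    return sep.join(p for p in stripped if p)

new_string = join_nonempty('foo ', '', ' bar', ' ', 'moo')
print(new_string)
```

Stripping each part first means whitespace-only strings are skipped as well, at the cost of also trimming the parts you keep.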
