How to remove whitespace from the strings in a list - python

I would like to remove the whitespace from the strings in a list, as follows:
original = ['16', '0000D1AE18', '1', '1', '1', 'S O S .jpg', '0']
After removing the whitespace:
['16', '0000D1AE18', '1', '1', '1', 'SOS.jpg', '0']

Use str.translate() on each element in a list comprehension. In Python 2:
[v.translate(None, ' ') for v in original]
Here None means don't replace characters with other characters, and ' ' means remove spaces altogether. This produces a new list to replace the original. In Python 3, str.translate() takes a single table argument, so build the table with str.maketrans('', '', ' ') and pass that instead.
The above removes only spaces. To remove all whitespace (newlines, tabs, form feeds, etc.), expand the set of characters to remove:
[v.translate(None, ' \t\r\n\f\x0b') for v in original]
str.translate() is usually one of the fastest options for removing characters from text.
Demo (Python 2):
>>> original = ['16', '0000D1AE18', '1', '1', '1', 'S O S .jpg', '0']
>>> [v.translate(None, ' \t\r\n\f\x0b') for v in original]
['16', '0000D1AE18', '1', '1', '1', 'SOS.jpg', '0']
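For Python 3, the same idea can be sketched with a deletion table. This is a minimal sketch, assuming that string.whitespace covers the characters you want removed; the names delete_whitespace and cleaned are illustrative and not part of the original answer.

```python
import string

original = ['16', '0000D1AE18', '1', '1', '1', 'S O S .jpg', '0']

# str.maketrans('', '', chars) builds a table that deletes every character in chars.
# string.whitespace is ' \t\n\r\x0b\x0c' (space, tab, newline, CR, vertical tab, form feed).
delete_whitespace = str.maketrans('', '', string.whitespace)

cleaned = [v.translate(delete_whitespace) for v in original]
print(cleaned)  # ['16', '0000D1AE18', '1', '1', '1', 'SOS.jpg', '0']
```

Building the table once outside the comprehension avoids recreating it for every element.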

If you want to remove any whitespace (i.e. space, tab, CR and newline), use this:
import re
without_spaces = [re.sub(r'\s+', '', item) for item in original]
If you only need to remove regular spaces, use the solution already suggested:
without_spaces = [item.replace(' ', '') for item in original]
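If you would rather avoid importing re, one alternative (not from the answers above, so treat it as a sketch) relies on str.split() with no arguments, which splits on any run of whitespace. The name without_whitespace is illustrative.

```python
original = ['16', '0000D1AE18', '1', '1', '1', 'S O S .jpg', '0']

# split() with no arguments splits on any run of whitespace (spaces, tabs,
# newlines, ...) and drops empty pieces; joining the pieces removes it all.
without_whitespace = [''.join(item.split()) for item in original]
print(without_whitespace)  # ['16', '0000D1AE18', '1', '1', '1', 'SOS.jpg', '0']
```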

You can use
k = []
for i in original:
    j = i.replace(' ', '')
    k.append(j)

Related

How to extract numbers from a string that has no spaces into a list

I have an assignment for which my script should be able to receive a string for input (e.g. "c27bdj3jddj45g" ) and extract the numbers into a list (not just the digits, it should be able to detect full numbers).
I am not allowed to use regex at all, only simple methods like split, count and append.
Any ideas? (Using python)
Example for the output needed for the string I gave as an example:
['27','3', '45']
Nothing I have tried so far is worth mentioning here. I am pretty lost on which approach to take without re.findall, which I cannot use.
One way to solve it is to use groupby from the itertools library:
from itertools import groupby
s = 'c27bdj3jdj45g11'  # the last number is 11
ans = []
for k, g in groupby(s, lambda x: x.isdigit()):
    if k:  # True when the group consists of digits
        ans.append(''.join(g))
print(ans)
Output:
['27', '3', '45', '11']
Second solution: the OP has ruled out regex, so this is just for reference, to show how much easier this type of puzzle becomes with it.
If there were no restriction, you could use the re library like this:
>>> import re
>>> s = 'c27bdj3jddj45g'
>>> re.findall(r'\d+', s)  # match one or more digits
['27', '3', '45']
>>> # or, to get integers
>>> list(map(int, re.findall(r'\d+', s)))
[27, 3, 45]
You can do this with a for-loop that accumulates digits in a string. When you reach a non-digit, append the accumulated number (if any) to the result and reset the string.
s = 'g38ff11'
prv = ''
res = []
for c in s:
    if c.isdigit():
        prv += c
    else:
        if prv != '':
            res.append(prv)
        prv = ''
# append the last number if the string ends with digits
if prv != '':
    res.append(prv)
print(res)
Output:
['38', '11']
You can also write a lambda to check and append:
s = 'g38ff11'
prv = ''
res = []
append_dgt = lambda prv, res: res.append(prv) if prv != "" else None
for c in s:
    if c.isdigit():
        prv += c
    else:
        append_dgt(prv, res)
        prv = ''
append_dgt(prv, res)
print(res)
Another approach is to replace every non-digit with a placeholder and split on it:
s = 'c27bdj3jddj45g'
lst = []
for x in s:
    if x.isdigit():
        lst.append(x)
    else:
        lst.append('$')  # '$' is a placeholder, so the digits of each number stay together
Now lst becomes:
# ['$', '2', '7', '$', '$', '$', '3', '$', '$', '$', '$', '4', '5', '$']
''.join(lst).split('$') becomes:
['', '27', '', '', '3', '', '', '', '45', '']
Finally, use a list comprehension to extract the numbers:
[x for x in ''.join(lst).split('$') if x.isdigit()]
['27', '3', '45']
The same idea as a complete script that prints integers:
string = 'c27bdj3jddj45g'
lst = []
for i in string:
    if i.isdigit():
        lst.append(i)
    else:
        lst.append('$')
print([int(i) for i in ''.join(lst).split('$') if i.isdigit()])
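A variation on the placeholder idea, sketched here under the assumption that join and split are allowed by the assignment: use a space as the placeholder, so that split() with no arguments discards the empty pieces and no extra filtering is needed. The function name extract_numbers is hypothetical.

```python
def extract_numbers(text):
    """Return the runs of consecutive digits in text as a list of strings."""
    # Replace every non-digit with a space; the digits of each number stay adjacent.
    spaced = ''.join(ch if ch.isdigit() else ' ' for ch in text)
    # split() with no arguments splits on runs of spaces and drops empty strings.
    return spaced.split()


print(extract_numbers('c27bdj3jddj45g'))  # ['27', '3', '45']
```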

How do I trim specific elements of a list?

I've got a list here that represents a line in a file after I split it:
['[0.111,', '-0.222]', '1', '2', '3']
and I'm trying to trim off the "[" and the "," in the first element and the "]" in the second element. How would I do that? I've started my thought process here, but this code doesn't work:
for line in file:
    line = line.split()
    line[0] = line[1:-1]
    line[1] = line[0:-1]
    print(line2)
You can use re.sub:
from re import sub
s = '[0.111, -0.222] 1 2 3'
s = sub(r'[\[\]]', '', s)
print(s.split())
Output:
['0.111,', '-0.222', '1', '2', '3']
If you would like to remove the comma as well, add it to the character class:
from re import sub
s = '[0.111, -0.222] 1 2 3'
s = sub(r'[\[\],]', '', s)
print(s.split())
Output:
['0.111', '-0.222', '1', '2', '3']
You can use replace to remove the brackets:
lst = ['[0.111,', '-0.222]', '1', '2', '3']
lst2 = [x.replace('[','').replace(']','') for x in lst]
print(lst2)
Output:
['0.111,', '-0.222', '1', '2', '3']
You can also be more specific and strip only a leading '[' or a trailing ']':
lst2 = [x[1:] if x.startswith('[') else x[:-1] if x.endswith(']') else x for x in lst]
You could keep only the numeric part of each string with a function like:
def clean(strings):
    def onlyNumeric(c):
        return c.isdigit() or c == '-' or c == '.'
    return list(map(lambda s: "".join(filter(onlyNumeric, s)), strings))
Then your example (and many other oddities) could be addressed.
>>> clean(['[0.111,', '-0.222]', '1', '2', '3'])
['0.111', '-0.222', '1', '2', '3']
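Since the unwanted characters only appear at the ends of each element, str.strip() with an explicit set of characters is another option. This is a sketch rather than one of the original answers; the variable names are illustrative.

```python
line = '[0.111, -0.222] 1 2 3'

# strip('[],') removes any leading or trailing '[', ']' or ',' from each token
# and leaves characters in the middle of the token untouched.
trimmed = [token.strip('[],') for token in line.split()]
print(trimmed)  # ['0.111', '-0.222', '1', '2', '3']
```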

Python regular expression retrieving numbers between two different delimiters

I have the following string
"h=56,7,1,d=88,9,1,h=58,8,1,d=45,h=100,d=,"
I would like to use regular expressions to extract the groups:
group1 56,7,1
group2 88,9,1
group3 58,8,1
group4 45
group5 100
group6 null
My ultimate goal is to have tuples such as (group1, group2), (group3, group4), (group5, group6). I am not sure if this all can be accomplished with regular expressions.
I have the following regular expression, which gives me partial results:
(?<=h=|d=)(.*?)(?=h=|d=)
The matches have an extra trailing comma (for example 56,7,1,), which I would like to remove, and d=, does not return a null.
You likely do not need regex. A list comprehension and .split() can do what you need:
Code:
def split_it(a_string):
    if not a_string.endswith(','):
        a_string += ','
    return [x.split(',')[:-1] for x in a_string.split('=') if len(x)][1:]
Test Code:
tests = (
    "h=56,7,1,d=88,9,1,h=58,8,1,d=45,h=100,d=,",
    "h=56,7,1,d=88,9,1,d=,h=58,8,1,d=45,h=100",
)
for test in tests:
    print(split_it(test))
Results:
[['56', '7', '1'], ['88', '9', '1'], ['58', '8', '1'], ['45'], ['100'], ['']]
[['56', '7', '1'], ['88', '9', '1'], [''], ['58', '8', '1'], ['45'], ['100']]
You could match rather than split using the expression
[dh]=([\d,]*),
and grab the first group, see a demo on regex101.com.
That is
[dh]= # d or h, followed by =
([\d,]*) # capture digits and commas, 0+ times
, # require a comma afterwards
In Python:
import re
rx = re.compile(r'[dh]=([\d,]*),')
string = "h=56,7,1,d=88,9,1,h=58,8,1,d=45,h=100,d=,"
numbers = [m.group(1) for m in rx.finditer(string)]
print(numbers)
Which yields
['56,7,1', '88,9,1', '58,8,1', '45', '100', '']
You can use ([a-z]=)([0-9,]+)(,)?
You just need to pick the group you want by its index.
You could use $ in the positive lookahead to match against the end of the string:
import re
input_str = "h=56,7,1,d=88,9,1,h=58,8,1,d=45,h=100,d=,"
groups = []
for x in re.findall(r'(?<=h=|d=)(.*?)(?=d=|h=|$)', input_str):
    m = x.strip(',')  # drop the trailing comma
    if m:
        groups.append(m.split(','))
    else:
        groups.append(None)
print(groups)
Output:
[['56', '7', '1'], ['88', '9', '1'], ['58', '8', '1'], ['45'], ['100'], None]
Here, I have assumed that parameters will only have numerical values. If it is so, then you can try this.
(?<=h=|d=)([0-9,]*)
Hope it helps.
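To get from the extracted groups to the tuples the question asks for, one possible sketch is to pair consecutive matches with zip. It reuses the [dh]=([\d,]*), pattern from the answer above and assumes that h and d entries strictly alternate, starting with h; the names text, groups and pairs are illustrative.

```python
import re

text = "h=56,7,1,d=88,9,1,h=58,8,1,d=45,h=100,d=,"

# Capture the comma-separated numbers after each 'h=' or 'd='.
# An empty capture (as in 'd=,') is turned into None.
groups = [m.group(1) or None for m in re.finditer(r'[dh]=([\d,]*),', text)]

# Pair items 0 and 1, 2 and 3, ... (assumes h and d strictly alternate).
pairs = list(zip(groups[0::2], groups[1::2]))
print(pairs)  # [('56,7,1', '88,9,1'), ('58,8,1', '45'), ('100', None)]
```

If the input can contain two h or two d entries in a row, this pairing would be wrong and the letter would need to be captured as well.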

How to remove whitespace in a list

I can't remove the whitespace in my list.
invoer = "5-9-7-1-7-8-3-2-4-8-7-9"
cijferlijst = []
for cijfer in invoer:
    cijferlijst.append(cijfer.strip('-'))
I tried the following, but it doesn't work. I already made a list from my string and separated everything, but each "-" is now a "".
filter(lambda x: x.strip(), cijferlijst)
filter(str.strip, cijferlijst)
filter(None, cijferlijst)
abc = [x.replace(' ', '') for x in cijferlijst]
Try this:
>>> ''.join(invoer.split('-'))
'597178324879'
If you want the numbers as a single string without the -, use .replace():
>>> string_list = "5-9-7-1-7-8-3-2-4-8-7-9"
>>> string_list.replace('-', '')
'597178324879'
If you want the numbers as a list of strings, use .split():
>>> string_list.split('-')
['5', '9', '7', '1', '7', '8', '3', '2', '4', '8', '7', '9']
This looks a lot like the following question:
Python: Removing spaces from list objects
The answer there is to use strip instead of replace. Have you tried this?
abc = [x.strip(' ') for x in cijferlijst]
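A short sketch of why the list in the question ends up with empty strings, and one way to drop them. It assumes Python 3, where filter() returns a lazy iterator whose result has to be kept; the name without_empty is illustrative.

```python
invoer = "5-9-7-1-7-8-3-2-4-8-7-9"

# Iterating over a string yields one character at a time, so each '-'
# becomes '' after strip('-') instead of disappearing from the list.
cijferlijst = [cijfer.strip('-') for cijfer in invoer]

# filter(None, ...) keeps only truthy items, which drops the empty strings.
# In Python 3 the result is an iterator, so wrap it in list() and keep it.
without_empty = list(filter(None, cijferlijst))
print(without_empty)

# Splitting on the delimiter produces the same list directly.
print(invoer.split('-'))
```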

python - splitting a string without removing delimiters

I'm trying to split a string without removing the delimiter and having trouble doing so. The string I want to split is:
'+ {- 9 4} {+ 3 2}'
and I want to end up with
['+', '{- 9 4}', '{+ 3 2}']
yet everything I've tried hasn't worked. I was looking through this stackoverflow post for answers as well as google: Python split() without removing the delimiter
Thanks!
re.split will keep the delimiters when they are captured, i.e., enclosed in parentheses:
import re
s = '+ {- 9 4} {+ 3 2}'
p = list(filter(lambda x: x.strip() != '', re.split(r"([+{} -])", s)))
will give you
['+', '{', '-', '9', '4', '}', '{', '+', '3', '2', '}']
which, IMO, is what you need to handle nested expressions
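If you want exactly the output shown in the question, with each brace group kept whole, one option is to match tokens with re.findall instead of splitting. This is a sketch that assumes the braces are never nested; the pattern is an illustration, not part of the answer above.

```python
import re

s = '+ {- 9 4} {+ 3 2}'

# Match either a whole '{...}' group (with no braces inside it)
# or a run of non-whitespace characters.
tokens = re.findall(r'\{[^{}]*\}|\S+', s)
print(tokens)  # ['+', '{- 9 4}', '{+ 3 2}']
```

For nested expressions such as '{+ {- 9 4} 2}' a regular expression like this is not enough, which is where the fully tokenized output above becomes useful.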
