This question already has answers here:
Dot notation string manipulation
(5 answers)
Closed 6 years ago.
I have a column with data like this that I'm accessing via Python:
501,555,570=3.5
I want to get 570=3.5.
How can I do that? Would it be a variation of the split command?
You can use the str.rsplit() function as follows:
In [34]: s = '501,555,570=3.5'
In [35]: s.rsplit(',', 1)[1]
Out[35]: '570=3.5'
>>> s = '501,555,570=3.5'
>>> s.split(",")[-1]
'570=3.5'
This will access the last element in the split string. It is not dependent on how many commas are in the string.
Example of longer string:
>>> s = "501,555,670,450,1,12,570=3.5"
>>> s.split(",")[-1]
'570=3.5'
A slight variation of wim's answer, if you're in a recent Python 3 version:
>>> s = '501,555,570=3.5'
>>> *others, last = s.split(',')
>>> last
'570=3.5'
>>> s = '501,555,570=3.5'
>>> others, last = s.rsplit(',', 1)
>>> last
'570=3.5'
Another variation, and how I would do it myself:
>>> s = '501,555,570=3.5'
>>> last = s.split(',')[-1]
>>> last
'570=3.5'
Using rpartition, as @MartijnPieters mentioned in his comment:
for a single split, str.rpartition() is going to be faster.
>>> s.rpartition(',')[-1]
'570=3.5'
You could take the substring that starts just after the last comma:
s[s.rfind(',') + 1 : ]
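The approaches above agree when the string contains a comma, but they diverge when it does not. Here is a minimal sketch of that edge case; the names `with_comma`, `no_comma`, `results`, and `rsplit_failed` are illustrative and do not come from the original posts.

```python
with_comma = '501,555,570=3.5'
no_comma = '570=3.5'  # hypothetical input without any comma

# With a comma present, all four approaches return the text after the last comma.
results = [
    with_comma.rsplit(',', 1)[1],
    with_comma.split(',')[-1],
    with_comma.rpartition(',')[-1],
    with_comma[with_comma.rfind(',') + 1:],
]
print(results)

# Without a comma, three of them fall back to the whole string.
print(no_comma.split(',')[-1])
print(no_comma.rpartition(',')[-1])
print(no_comma[no_comma.rfind(',') + 1:])  # rfind returns -1, so the slice starts at 0

# Indexing [1] after rsplit fails instead, because the list has only one element.
try:
    no_comma.rsplit(',', 1)[1]
except IndexError:
    rsplit_failed = True
else:
    rsplit_failed = False
print(rsplit_failed)
```

If the column may contain values without a comma, the `[-1]` variants are probably the safer choice.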
This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 3 years ago.
I've been trying to find a way to solve this, but it seems I'm poor at regex ;)
I need to strip the WindowsPath(...) wrappers from the strings in a list:
x= ["(WindowsPath('D:/test/1_birds_bp.png'),WindowsPath('D:/test/1_eagle_mp.png'))", "(WindowsPath('D:/test/2_reptiles_bp.png'),WindowsPath('D:/test/2_crocodile_mp.png'))"]
So I tried
import re
cleaned_x = [re.sub("(?<=WindowsPath\(').*?(?='\))",'',a) for a in x]
which outputs
["(WindowsPath(''),WindowsPath(''))", "(WindowsPath(''),WindowsPath(''))"]
What I need is:
cleaned_x= [('D:/test/1_birds_bp.png','D:/test/1_eagle_mp.png'), ('D:/test/2_reptiles_bp.png','D:/test/2_crocodile_mp.png')]
basically tuples in a list.
You can accomplish this by using re.findall like this:
>>> cleaned_x = [tuple(re.findall(r"[A-Z]:/[^']+", a)) for a in x]
>>> cleaned_x
[('D:/test/1_birds_bp.png', 'D:/test/1_eagle_mp.png'), ('D:/test/2_reptiles_bp.png',
'D:/test/2_crocodile_mp.png')]
>>>
Hope it helps.
Perhaps you could use capturing groups? For instance:
import re
re_winpath = re.compile(r'^\(WindowsPath\(\'(.*)\'\)\,WindowsPath\(\'(.*)\'\)\)$')
def extract_pair(s):
    m = re_winpath.match(s)
    if m is None:
        raise ValueError(f"cannot extract pair from string: {s}")
    return m.groups()
pairs = list(map(extract_pair, x))
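Here's my take.
It's not pretty, and I did it in two steps to avoid regex spaghetti. You could turn it into a list comprehension if you like, but it should work:
for a in x:
    # Drop each WindowsPath, along with the opening parenthesis before the first one
    a = re.sub(r'(\()?WindowsPath', '', a)
    # Drop the final closing parenthesis
    a = re.sub(r'\)$', '', a)
    print(a)  # note: each result is still a string, not a tuple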
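The two-step substitution above still leaves a string rather than a tuple. One way to get tuples directly is to capture whatever sits inside each wrapper. This is only a sketch: the name `PATH_IN_WRAPPER` is made up for illustration, and the pattern assumes the paths never contain a single quote.

```python
import re

x = [
    "(WindowsPath('D:/test/1_birds_bp.png'),WindowsPath('D:/test/1_eagle_mp.png'))",
    "(WindowsPath('D:/test/2_reptiles_bp.png'),WindowsPath('D:/test/2_crocodile_mp.png'))",
]

# Capture the text between WindowsPath(' and ') for every occurrence.
PATH_IN_WRAPPER = re.compile(r"WindowsPath\('([^']*)'\)")

# With a single capturing group, findall returns a list of the captured strings.
cleaned_x = [tuple(PATH_IN_WRAPPER.findall(a)) for a in x]
print(cleaned_x)
```

Unlike a pattern anchored on a drive letter, this one keys on the wrapper itself, so it should also cope with relative paths.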
This question already has answers here:
Keeping only certain characters in a string using Python?
(3 answers)
Closed 5 years ago.
So my code is
value = "123456"
I want to remove everything except 2 and 5,
so the output will be 25.
The program should still work if the value changes. For example, with
value = "463312"
the output will be 2.
I tried the remove() and replace() functions, but they didn't work.
I'm using Python 3.6.2.
Instead of trying to remove every unwanted character, you are better off building a whitelist of the characters you want to keep in the result:
>>> value = '123456'
>>> whitelist = set('25')
>>> ''.join([c for c in value if c in whitelist])
'25'
Here is another option where the loop is implicit. We build a mapping to use with str.translate where every character maps to '', unless specified otherwise:
>>> from collections import defaultdict
>>> d = defaultdict(str, str.maketrans('25', '25'))
>>> '123456'.translate(d)
'25'
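To make the mapping trick above less opaque, the sketch below prints what the table actually contains. The three-argument `str.maketrans` call at the end is a different technique, not the answer's: it lists the characters to delete explicitly, so it only fits when the unwanted characters are known in advance. The names `table` and `delete_others` are illustrative.

```python
from collections import defaultdict

table = str.maketrans('25', '25')   # maps code points to code points
d = defaultdict(str, table)         # any missing key produces str(), i.e. ''

print(table)
print(repr(d[ord('1')]))            # '' means the character '1' is deleted
print('463312'.translate(d))

# Alternative (assumption: the characters to drop are known up front).
delete_others = str.maketrans('', '', '1346')
print('123456'.translate(delete_others))
```

The defaultdict version works because `str.translate` looks characters up by indexing, which triggers the default factory for anything not in the table.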
If you are looking for a regex solution, you can use re.sub to replace every character other than 2 and 5 with ''.
import re
x = "463312"
new = re.sub('[^25]+' ,'', x)
x = "463532312"
new = re.sub('[^25]+' ,'', x)
Output:
2, 522
If you are using Python 2, you can use filter like this:
In [60]: value = "123456"
In [61]: whitelist = set("25")
In [62]: filter(lambda x: x in whitelist, value)
Out[62]: '25'
If you are using Python 3, you would need to "".join() the result of the filter.
value = "23456"
j = ""
for k in value:
    if k == '2' or k == '5':
        j = j + k
print(j)
This is a working program for what you described. You can assign any input to value, and it will print only the 2s and 5s it contains, in their original order (here, 25).
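value = "123456"
whitelist = '25'
# A set intersection drops repeated characters and has no guaranteed order,
# so sort the result to get a stable string
''.join(sorted(set(whitelist) & set(value)))
'25'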
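A set intersection keeps each character at most once and does not preserve the order of the input, so it only matches the other answers when nothing repeats. A small sketch of the difference, using a hypothetical input with repeated digits:

```python
value = "5225"          # hypothetical input with repeated characters
whitelist = set("25")

via_set = ''.join(set(value) & whitelist)                  # duplicates collapse
via_filter = ''.join(c for c in value if c in whitelist)   # order and repeats kept

print(sorted(via_set))  # two characters; their order in via_set is not guaranteed
print(via_filter)
```

If the result should mirror the input, the comprehension (or filter) form is the one to use.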
I approached this differently, using Python's built-in filter() function with a lambda. See below:
a = "1225866125" # Should return "22525"
whitelist = '25'
# Use filter to remove all elements that do not match the condition
a = "".join(filter(lambda c: c in whitelist, a))
Hope this helped!
(Edited to shorten the answer, credit @Akavall and again @salparadise)
This question already has answers here:
How to extract the substring between two markers?
(22 answers)
Closed 4 years ago.
I have a string - Python :
string = "/foo13546897/bar/Atlantis-GPS-coordinates/bar457822368/foo/"
Expected output is :
"Atlantis-GPS-coordinates"
I know that the expected output is ALWAYS surrounded by "/bar/" on the left and "/" on the right :
"/bar/Atlantis-GPS-coordinates/"
Proposed solution would look like :
a = string.find("/bar/")
b = string.find("/",a+5)
output=string[a+5:b]
This works, but I don't like it.
Does anyone know a more elegant function or trick?
You can use split:
>>> string.split("/bar/")[1].split("/")[0]
'Atlantis-GPS-coordinates'
Adding a maxsplit of 1 should be slightly more efficient, I suppose:
>>> string.split("/bar/", 1)[1].split("/", 1)[0]
'Atlantis-GPS-coordinates'
Or use partition:
>>> string.partition("/bar/")[2].partition("/")[0]
'Atlantis-GPS-coordinates'
Or a regex:
>>> re.search(r'/bar/([^/]+)', string).group(1)
'Atlantis-GPS-coordinates'
Depends on what speaks to you and your data.
What you have isn't all that bad. I'd write it as:
start = string.find('/bar/') + 5
end = string.find('/', start)
output = string[start:end]
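When the markers might be missing, `str.find` returns -1 and the slice silently produces the wrong text. A hedged sketch of a guarded version follows; the helper name `between` and the choice to return `None` on failure are assumptions, not part of the original answer.

```python
def between(text, left, right):
    """Return the text between the first `left` marker and the next `right`, or None."""
    start = text.find(left)
    if start == -1:
        return None          # left marker not present
    start += len(left)       # skip past the marker itself
    end = text.find(right, start)
    if end == -1:
        return None          # no closing marker after the left one
    return text[start:end]

string = "/foo13546897/bar/Atlantis-GPS-coordinates/bar457822368/foo/"
print(between(string, "/bar/", "/"))
print(between("/no/marker/here/", "/bar/", "/"))
```

Using `len(left)` instead of a hard-coded 5 keeps the helper correct if the marker changes.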
as long as you know that /bar/WHAT-YOU-WANT/ is always going to be present. Otherwise, I would reach for the regular expression knife:
>>> import re
>>> PATTERN = re.compile('^.*/bar/([^/]*)/.*$')
>>> s = '/foo13546897/bar/Atlantis-GPS-coordinates/bar457822368/foo/'
>>> match = PATTERN.match(s)
>>> match.group(1)
'Atlantis-GPS-coordinates'
import re
pattern = '(?<=/bar/).+?/'
string = "/foo13546897/bar/Atlantis-GPS-coordinates/bar457822368/foo/"
result = re.search(pattern, string)
print string[result.start():result.end() - 1]
# "Atlantis-GPS-coordinates"
That is a Python 2.x example. Here is what the pattern does:
1. (?<=/bar/) is a lookbehind: the rest of the pattern only matches where /bar/ comes immediately before it.
2. .+?/ matches as few characters as possible, up to and including the next '/'; the - 1 in the slice drops that trailing '/'.
Hope that helps some.
If you need to run this kind of search many times, it is better to compile the pattern once for performance; if you only need it once, don't bother.
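Compiling once and reusing the pattern could look like the sketch below, written for Python 3. It swaps `.+?/` for `[^/]+` so that no trailing slash needs trimming. The name `BETWEEN_BAR` and the second and third inputs are hypothetical.

```python
import re

# Compiled once, then reused for every string.
BETWEEN_BAR = re.compile(r'(?<=/bar/)[^/]+')

paths = [
    "/foo13546897/bar/Atlantis-GPS-coordinates/bar457822368/foo/",
    "/x/bar/second-value/y/",   # hypothetical extra input
    "/nothing/to/see/",         # hypothetical input without /bar/
]

found = []
for path in paths:
    match = BETWEEN_BAR.search(path)
    # search returns None when nothing matches, so guard before calling group()
    found.append(match.group() if match else None)

print(found)
```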
Using re (slower than other solutions):
>>> import re
>>> string = "/foo13546897/bar/Atlantis-GPS-coordinates/bar457822368/foo/"
>>> re.search(r'(?<=/bar/)[^/]+(?=/)', string).group()
'Atlantis-GPS-coordinates'
This question already has answers here:
How can I split and parse a string in Python? [duplicate]
(3 answers)
Closed 8 years ago.
I have a file which contains each line in the following format
"('-1259656819525938837', 598679497)\t0.036787946" # "\t" within the string is the tab sign
I need to get the components out
-1259656819525938837 #string, it is the content within ' '
598679497 # long
0.036787946 # float
Python 2.6
You can use regular expressions from re module:
import re
s = "('-1259656819525938837', 598679497)\t0.036787946"
re.findall(r'[-+]?[0-9]*\.?[0-9]+', s)
# gives: ['-1259656819525938837', '598679497', '0.036787946']
"2.7.0_bf4fda703454".split("_") gives a list of strings:
In [1]: "2.7.0_bf4fda703454".split("_")
Out[1]: ['2.7.0', 'bf4fda703454']
This splits the string at every underscore. If you want it to stop after the first split, use "2.7.0_bf4fda703454".split("_", 1).
If you know for a fact that the string contains an underscore, you can even unpack the LHS and RHS into separate variables:
In [8]: lhs, rhs = "2.7.0_bf4fda703454".split("_", 1)
In [9]: lhs
Out[9]: '2.7.0'
In [10]: rhs
Out[10]: 'bf4fda703454'
You can use a regex to extract number and float from string:
>>> import re
>>> a = "('-1259656819525938837', 598679497)\t0.036787946"
>>> re.findall(r'-?\d+(?:\.\d+)?', a)
['-1259656819525938837', '598679497', '0.036787946']
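The regex answers return three strings, while the question asks for a string, an integer, and a float. Since the part before the tab is a valid Python tuple literal, one option is to parse it with `ast.literal_eval`. This is a sketch that assumes every line has exactly one tab and that shape; the variable names are illustrative.

```python
import ast

line = "('-1259656819525938837', 598679497)\t0.036787946"

# Split the tuple text from the trailing float at the tab.
pair_text, score_text = line.split('\t')

# literal_eval parses Python literals only; it does not execute code.
key, number = ast.literal_eval(pair_text)
score = float(score_text)

print(repr(key), number, score)
```

Here `key` stays a string because it is quoted in the input, and `number` comes back as an int.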
This question already has answers here:
How to convert string to Title Case in Python?
(10 answers)
Closed 9 years ago.
I'm having trouble trying to create a function that can do this job. The objective is to convert strings like
one to One
hello_world to HelloWorld
foo_bar_baz to FooBarBaz
I know that the proper way to do this is using re.sub, but I'm having trouble creating the right regular expressions to do the job.
You can try something like this (Python 2, where filter() on a string returns a string):
>>> s = 'one'
>>> filter(str.isalnum, s.title())
'One'
>>>
>>> s = 'hello_world'
>>> filter(str.isalnum, s.title())
'HelloWorld'
>>>
>>> s = 'foo_bar_baz'
>>> filter(str.isalnum, s.title())
'FooBarBaz'
Relevant documentation:
str.title()
str.isalnum()
filter()
Found solution:
def uppercase(name):
    return ''.join(x for x in name.title() if not x.isspace()).replace('_', '')
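In Python 3, filter() returns an iterator, so the filter-based answer would need to be wrapped in ''.join(...). A different approach that avoids title() altogether is to split on underscores and capitalize each part. This is a minimal sketch, and `to_camel_case` is a made-up helper name.

```python
def to_camel_case(name):
    # Capitalize each underscore-separated part, then join the parts together.
    return ''.join(part.capitalize() for part in name.split('_'))

for s in ('one', 'hello_world', 'foo_bar_baz'):
    print(s, '->', to_camel_case(s))
```

One difference worth knowing: str.title() also uppercases a letter that follows a digit (for example 'foo2bar' becomes 'Foo2Bar'), whereas this version only starts a new word at an underscore.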