How to extract number from end of the string - python

I have a string like "Titile Something/17". I need to cut off the "/NNN" part, where NNN is a number of one to three digits; the part may also be absent.
How do you do this in Python? Thanks.

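\d{0,3} matches from zero up to three digits. $ asserts that we are at the end of the line. re.search returns None when nothing matches, so check the result before calling .group() if the "/NNN" part may be missing.
import re
re.search(r'/\d{0,3}$', st).group()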
Example:
>>> re.search(r'/\d{0,3}$', 'Titile Something/17').group()
'/17'
>>> re.search(r'/\d{0,3}$', 'Titile Something/1').group()
'/1'
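Since the question says the "/NNN" part may be absent, here is a minimal sketch that removes the suffix instead of extracting it, so a missing suffix needs no special handling. The helper name strip_numeric_suffix is hypothetical, and the pattern uses {1,3} rather than {0,3} on the assumption that a bare trailing "/" should be left alone.

```python
import re

def strip_numeric_suffix(text):
    """Remove a trailing '/N', '/NN' or '/NNN' if present (hypothetical helper)."""
    # {1,3} requires at least one digit; strings without the suffix are returned unchanged.
    return re.sub(r'/\d{1,3}$', '', text)

print(strip_numeric_suffix('Titile Something/17'))  # Titile Something
print(strip_numeric_suffix('Titile Something'))     # Titile Something
```

Because re.sub returns the input unchanged when the pattern does not match, there is no None result to check.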

You don't need a regex here; simply use the built-in str.rindex method and slicing, like this
>>> data = "Titile Something/17"
>>> data[:data.rindex("/")]
'Titile Something'
>>> data[data.rindex("/") + 1:]
'17'
Or you can use str.rpartition, like this
>>> data.rpartition('/')[0]
'Titile Something'
>>> data.rpartition('/')[2]
'17'
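Note: This will get any string after the last /, so use it with caution. Also, str.rindex raises ValueError if the string contains no / at all, whereas str.rpartition returns the whole string as the last element.
If you want to make sure that the part after the / consists only of digits, you can use the str.isdigit method, like this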
>>> data[data.rindex("/") + 1:].isdigit()
True
>>> data.rpartition('/')[2].isdigit()
True
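The rpartition and isdigit checks above can be combined into one function. This is only a sketch: the name split_trailing_number and the three-digit limit (taken from the question's "/NNN") are assumptions, not part of the original answer.

```python
def split_trailing_number(text):
    """Return (title, digits), or (text, None) when there is no numeric suffix."""
    head, sep, tail = text.rpartition('/')
    # rpartition returns ('', '', text) when '/' is absent, so sep is empty then.
    if sep and tail.isdigit() and len(tail) <= 3:
        return head, tail
    return text, None

print(split_trailing_number("Titile Something/17"))  # ('Titile Something', '17')
print(split_trailing_number("Titile Something"))     # ('Titile Something', None)
print(split_trailing_number("A/B test"))             # ('A/B test', None)
```

The third call shows why the isdigit check matters: a slash inside the title is not mistaken for the number separator.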

data = "Titile Something/17"
print(data.split("/")[0])
Titile Something
print(data.split("/")[-1])  # last part of the string after the separator /
17
or
print(data.split("/")[1])  # second part; the same as [-1] here because there is only one /
17
If the string may end with a newline "\n" (for example a line read from a file), use strip() to remove it before adding the value to a list:
print(data.split("/")[-1].strip())
17
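The indexes [1] and [-1] only agree when the string contains a single "/". A small sketch with a made-up title that contains slashes of its own shows the difference, and how str.rsplit with maxsplit=1 keeps the title in one piece:

```python
data = "Either/Or/17"  # made-up example with a slash inside the title

print(data.split("/")[1])   # Or  (the second field, not the number)
print(data.split("/")[-1])  # 17
print(data.rsplit("/", 1))  # ['Either/Or', '17']
```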

I need to cut out "/NNN"
import re
x = "Titile Something/17"
print(re.sub(r"/.*$", "", x))  # removes everything from the first / onward
print(re.sub(r"^.*?/", "", x))  # removes everything up to and including the first /
With re.sub you can do what you want.

Related

Python: How to remove [' and ']?

I want to remove [' from start and '] characters from the end of a string.
This is my text:
"['45453656565']"
I need to have this text:
"45453656565"
I've tried to use str.replace
text = text.replace("['","");
but it does not work.
You need to strip your text by passing the unwanted characters to the str.strip() method:
>>> s = "['45453656565']"
>>>
>>> s.strip("[']")
'45453656565'
Or, if you want to convert it to an integer, you can simply pass the stripped result to the int function:
>>> try:
...     val = int(s.strip("[']"))
... except ValueError:
...     print("Invalid string")
...
>>> val
45453656565
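It may help to see that the argument to str.strip is treated as a set of characters to remove from both ends, not as a literal prefix or suffix. A short sketch (the second sample string is made up for illustration):

```python
s = "['45453656565']"

# The order of the characters in the argument does not matter.
print(s.strip("[']"))  # 45453656565
print(s.strip("]'["))  # 45453656565

# Only the ends are stripped; the same characters in the middle are kept.
print("['12'], ['34']".strip("[']"))  # 12'], ['34
```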
Using re.sub:
>>> my_str = "['45453656565']"
>>> import re
>>> re.sub(r"['\]\[]", "", my_str)
'45453656565'
You could loop over the character filtering if the element is a digit:
>>> number_array = "['34325235235']"
>>> int(''.join(c for c in number_array if c.isdigit()))
34325235235
This solution works for both "['34325235235']" and '["34325235235"]', and for any other mix of digits and other characters.
You can also import the re module and use a regular expression to get it:
>>> import re
>>> theString = "['34325235235']"
>>> int(re.sub(r'\D', '', theString)) # Optionally parse to int
34325235235
Instead of hacking your data by stripping brackets, you should edit the script that created it to print out just the numbers. E.g., instead of lazily doing
output.write(str(mylist))
you can write
for elt in mylist:
    output.write(elt + "\n")
Then when you read your data back in, it'll contain the numbers (as strings) without any quotes, commas or brackets.
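A minimal round-trip sketch of that advice, using io.StringIO as a stand-in for the real output file. The sample values in mylist are assumptions; the original script is not shown.

```python
import io

mylist = ["45453656565", "34325235235"]  # sample data, assumed to be strings

output = io.StringIO()  # stands in for a real file object
for elt in mylist:
    output.write(elt + "\n")

# Reading back: one number per line, with no brackets or quotes to strip.
output.seek(0)
numbers = [line.strip() for line in output]
print(numbers)  # ['45453656565', '34325235235']
```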

Python string regular expression

I need to do a string compare to see if 2 strings are equal, like:
>>> x = 'a1h3c'
>>> x == 'a__c'
True
independent of the 3 characters in the middle of the string.
You need to use anchors.
>>> import re
>>> x = 'a1h3c'
>>> pattern = re.compile(r'^a.*c$')
>>> pattern.match(x) is not None
True
This checks that the first and last characters are a and c, and it does not care about the characters in the middle.
If you want to check for exactly three characters in the middle, you could use this:
>>> pattern = re.compile(r'^a...c$')
>>> pattern.match(x) is not None
True
Note that the end-of-line anchor $ is important: without $, a...c would also match afoocbarbuz.
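As an alternative to writing the anchors by hand, Python 3.4 and later offer fullmatch, which succeeds only when the entire string matches the pattern. This swaps in a different method from the one in the answer; a small sketch:

```python
import re

pattern = re.compile(r'a...c')

# fullmatch requires the whole string to match, so no ^ or $ is needed.
print(pattern.fullmatch('a1h3c') is not None)        # True
print(pattern.fullmatch('afoocbarbuz') is not None)  # False
# match only anchors the start, so the longer string still matches.
print(pattern.match('afoocbarbuz') is not None)      # True
```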
Your problem could be solved with string indexing, but if you want an intro to regex, here you go.
import re
your_match_object = re.match(pattern, string)  # general form; returns None if there is no match
The pattern in your case would be
pattern = re.compile("a...c$")  # a dot matches any character except a newline; $ anchors the end
From here, you can check whether your string fits this pattern with
print(pattern.match("a1h3c") is not None)
https://docs.python.org/2/howto/regex.html
https://docs.python.org/2/library/re.html#search-vs-match
if str1[0] == str2[0]:
    pass  # do something.
You can repeat this statement as many times as you like.
This is indexing. [0] gets the first character; to get the last character, use [-1].
I'll also mention that with indexing, the string can be of any size, as long as you know the position relative to the beginning or the end of the string.
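For the question as asked (same first and last character, exactly three characters in between), a regex-free check might look like the sketch below. The function name same_ends and its parameters are hypothetical.

```python
def same_ends(s, first, last, middle_len=3):
    """True if s is first + exactly middle_len characters + last (hypothetical helper)."""
    return (len(s) == len(first) + middle_len + len(last)
            and s.startswith(first)
            and s.endswith(last))

print(same_ends('a1h3c', 'a', 'c'))        # True
print(same_ends('afoocbarbuz', 'a', 'c'))  # False
```

Checking the length first is what enforces "exactly three characters in the middle"; startswith and endswith alone would accept strings of any length.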

Split Sentences of a String

Hi, I am trying to split a text.
For example, from
'C:/bye1.txt'
I would want 'C:/bye1.txt' only. From
'C:/bye1.txt C:/hello1.txt'
I would want C:/hello1.txt only. From
'C:/bye1.txt C:/hello1.txt C:/bye2 C:/bye3'
I would want C:/bye3, and so on.
Here is the code I tried. The problem is that it only prints y.
x = "Hello i am boey"
Last = x[len(x)-1]
print(Last)
Look at this:
>>> x = 'C:/bye1.txt'
>>> x.split()[-1]
'C:/bye1.txt'
>>> y = 'C:/bye1.txt C:/hello1.txt'
>>> y.split()[-1]
'C:/hello1.txt'
>>> z = 'C:/bye1.txt C:/hello1.txt C:/bye2 C:/bye3'
>>> z.split()[-1]
'C:/bye3'
>>>
Basically, you split the strings using str.split and then get the last item (that's what [-1] does).
x = "Hello i am boey".split()
Last = x[len(x)-1]
print(Last)
Though this is more Pythonic:
x = "Hello i am boey".split()
print(x[-1])
Try:
>>> x = 'C:/bye1.txt C:/hello1.txt C:/bye2 C:/bye3'
>>> x.split()[-1]
'C:/bye3'
>>>
x[len(x)-1] returns the last character of the string (a string behaves like a sequence of characters). It is the same as x[-1].
To get the output you want, do this:
x.split()[-1]
If your items are separated by another delimiter, you can specify the delimiter to split on, like so:
delimiter = ','
item = x.split(delimiter)[-1] # Split the string on the delimiter and get the last item
In addition, the lstrip(), rstrip(), and strip() methods might be useful for removing unnecessary characters from the ends of your string. Look them up in the Python documentation.
Answer:
'C:/bye1.txt C:/hello1.txt C:/bye2 C:/bye3'.split()[-1]
would give you 'C:/bye3'.
Details:
The split method without any argument splits on runs of whitespace (spaces, tabs, newlines). In the example above, it returns the following list:
['C:/bye1.txt', 'C:/hello1.txt', 'C:/bye2', 'C:/bye3']
An index of -1 selects the last element of that list (counting from the back).
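The difference between split() with no argument and split(' ') is easy to miss, so here is a short sketch. The sample text, with a double space, a tab, and a trailing newline, is made up for illustration.

```python
text = 'C:/bye1.txt  C:/hello1.txt\tC:/bye3\n'

# No argument: split on any run of whitespace and drop empty strings.
print(text.split())              # ['C:/bye1.txt', 'C:/hello1.txt', 'C:/bye3']

# Explicit ' ': split on every single space only.
print(text.split(' '))           # ['C:/bye1.txt', '', 'C:/hello1.txt\tC:/bye3\n']

# If only the last item is needed, rsplit can stop after one split.
print(text.rsplit(None, 1)[-1])  # C:/bye3
```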

removing part of a string (up to but not including) in python

I'm trying to strip off part of a string.
e.g. Strip:-
a = xyz-abc
to leave:-
a = -abc
I would usually use lstrip e.g.
a.lstrip('xyz')
but in this case I don't know what xyz is going to be, so I need a way to just strip everything to the left of '-'.
Is it possible to set that option with lstrip or do I have to go about it a different way?
Thanks.
If there's only one - character, this will work:
'xyz-abc'.split('-')[1]
If you want the '-' in there, you have to reattach it:
>>> '-' + 'xyz-abc'.split('-')[1]
'-abc'
There's also a maxsplit parameter that allows you to split only at the first - character.
>>> '-' + 'xyz-ab-c'.split('-', 1)[1]
'-ab-c'
partition is also potentially useful:
>>> 'xyz-abc'.partition('-')
('xyz', '-', 'abc')
It splits at the first occurrence of the separator:
>>> ''.join('xyz-ab-c'.partition('-')[1:])
'-ab-c'
>>> a = 'xyz-abc'
>>> a.find('-') # return the index of the first instance of '-'
3
>>> a[a.find('-'):] # return the string of everything past that index
'-abc'
You could use a combination of .find and slicing. Note that find returns -1 when the separator is missing.
If the text to the left of the final - may contain dashes of its own, the reversed version of find, called rfind, is even more useful:
>>> s = "xyv-er-hdgcfh-abc"
>>> print(s[s.rfind("-"):])
-abc
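One pitfall with the find and rfind approach: both return -1 when the separator is missing, and slicing from -1 silently yields the last character. A sketch of a guarded helper follows; the name from_separator is hypothetical, and returning the text unchanged when there is no separator is an assumption about the desired behavior.

```python
def from_separator(text, sep='-', last=False):
    """Return text from the first (or last) sep onward, sep included."""
    index = text.rfind(sep) if last else text.find(sep)
    # find/rfind return -1 when sep is missing; text[-1:] would then
    # give just the final character, so handle that case explicitly.
    if index == -1:
        return text
    return text[index:]

print(from_separator('xyz-abc'))                       # -abc
print(from_separator('xyv-er-hdgcfh-abc', last=True))  # -abc
print(from_separator('xyz'))                           # xyz
```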

Split a string in python

a="aaaa#b:c:"
>>> for i in a.split(":"):
...     print(i)
...     if ("#" in i): # i = aaaa#b
...         print only b
In the if statement, when i is aaaa#b, how do I get the value after the hash? Should we use rsplit to get the value?
The following can replace your if statement.
for i in a.split(':'):
    print(i.partition('#')[2])
>>> a="aaaa#b:c:"
>>> a.split(":",2)[0].split("#")[-1]
'b'
a = "aaaa#b:c:"
print(a.split(":")[0].split("#")[1])
I'd suggest from: Python Docs
str.rsplit([sep[, maxsplit]])
Return a list of the words in the string, using sep as the delimiter
string. If maxsplit is given, at most maxsplit splits are done, the
rightmost ones. If sep is not specified or None, any whitespace string
is a separator. Except for splitting from the right, rsplit() behaves
like split() which is described in detail below.
so to answer your question yes.
EDIT:
It depends on how you wish to index your strings too. rsplit splits from the right, so if the value you want is always the rightmost one, you can take index -1 of the result every time (Python indexes from 0, and negative indices count from the end), rather than having to check the size of the returned list.
Do you really need to use split? split creates a list, which is unnecessary work if you only need one piece.
What about something like this:
>>> a = "aaaa#b:c:"
>>> a[a.find('#') + 1]
'b'
This returns only the single character after the '#'. If you need a particular occurrence, use a regex instead.
split would do the job nicely. Use rsplit only if you need to split from the last '#'.
>>> a="aaaa#b:c:"
>>> for i in a.split(":"):
...     print(i)
...     b = i.split('#',1)
...     if len(b)==2:
...         print(b[1])
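The same idea can be written with str.partition, which avoids the length check because its middle element is empty when the separator is absent. A minimal sketch that collects the values instead of printing them (the list name values is an assumption):

```python
a = "aaaa#b:c:"

values = []
for field in a.split(":"):
    # partition returns (before, sep, after); sep is '' when '#' is absent.
    before, sep, after = field.partition("#")
    if sep:
        values.append(after)

print(values)  # ['b']
```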
