Split string by hyphen - python

I have strings in the format feet'-inches" (e.g. 18'-6") and I want to split them so that the feet and inches values are separated.
I have tried:
re.split(r'\s|-', `18'-6`)
but it still returns 18'-6.
Desired output: [18,6] or similar
Thanks!

Just remove the ' with replace and then split normally:
s="18'-6"
a, b = s.replace("'","").split("-")
print(a,b)
If you have both " and ' one must be escaped so just split and slice up to the second last character:
s = "18'-6\""
a, b = s.split("-")
print(a[:-1], b[:-1])
18 6

You can use
import re
p = re.compile(r'[-\'"]')
test_str = "18'-6\""
# split on -, ' and ", then drop the empty strings left between adjacent delimiters
print(list(filter(None, re.split(p, test_str))))
Output:
['18', '6']

A list comprehension will do the trick:
In [13]: [int(i[:-1]) for i in re.split(r'\s|-', "18'-6\"")]
Out[13]: [18, 6]
This assumes that your string is of the format feet(int)'-inches(int)", and you are trying to get the actual ints back, not just numbers in string format.

The built-in split method can take an argument that will cause it to split at the specified point.
"18'-16\"".replace("'", "").replace("\"", "").split("-")
A one-liner. :)
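If the input may not always match the feet'-inches" pattern, a stricter option is to validate it with a regular expression before converting. The following is only a minimal sketch: the helper name parse_feet_inches is made up for illustration, and it assumes both values are non-negative integers and that the trailing " may be missing.

```python
import re

# Hypothetical pattern: digits, a single quote, a hyphen, digits, optional double quote.
_FEET_INCHES = re.compile(r"(\d+)'-(\d+)\"?")


def parse_feet_inches(text):
    """Return (feet, inches) as ints, or raise ValueError if the format is off."""
    match = _FEET_INCHES.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not in feet'-inches\" format: {text!r}")
    feet, inches = match.groups()
    return int(feet), int(inches)


print(parse_feet_inches("18'-6\""))  # (18, 6)
```

Using fullmatch rather than split means malformed input raises an error instead of silently producing odd pieces.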

Related

pandas regex look ahead and behind from a 1st occurrence of character

I have python strings like below
"1234_4534_41247612_2462184_2131_GHI.xlsx"
"1234_4534__sfhaksj_DHJKhd_hJD_41247612_2462184_2131_PQRST.GHI.xlsx"
"12JSAF34_45aAF34__sfhaksj_DHJKhd_hJD_41247612_2f462184_2131_JKLMN.OPQ.xlsx"
"1234_4534__sfhaksj_DHJKhd_hJD_41FA247612_2462184_2131_WXY.TUV.xlsx"
I would like to do the below
a) extract characters that appear before and after 1st dot
b) The keywords that I want are always found after the last _ symbol
For ex: If you look at 2nd input string, I would like to get only PQRST.GHI as output. It is after last _ and before 1st . and we also get keyword after 1st .
So, I tried the below
for s in strings:
    after_part = s.split('.')[1]
    before_part = s.split('.')[0]
    before_part = before_part.split('_')[-1]
    expected_keyword = before_part + "." + after_part
    print(expected_keyword)
Though this works, it is definitely not as nice and elegant as a regex would be.
Is there any other better way to write this?
I expect my output to be as shown below. As you can see, we get the keywords before and after the 1st dot character.
GHI
PQRST.GHI
JKLMN.OPQ
WXY.TUV
Try:
import re
strings = [
    "1234_4534_41247612_2462184_2131_ABCDEF.GHI.xlsx",
    "1234_4534__sfhaksj_DHJKhd_hJD_41247612_2462184_2131_PQRST.GHI.xlsx",
    "12JSAF34_45aAF34__sfhaksj_DHJKhd_hJD_41247612_2f462184_2131_JKLMN.OPQ.xlsx",
    "1234_4534__sfhaksj_DHJKhd_hJD_41FA247612_2462184_2131_WXY.TUV.xlsx",
]
pat = re.compile(r"[^.]+_([^.]+\.[^.]+)")
for s in strings:
    print(pat.search(s).group(1))
Prints:
ABCDEF.GHI
PQRST.GHI
JKLMN.OPQ
WXY.TUV
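To make it easier to see what each piece of that pattern does, here is the same expression written with re.VERBOSE and commented. It is only a sketch for illustration; the variable names are arbitrary.

```python
import re

pat = re.compile(
    r"""
    [^.]+      # greedy run of non-dot characters; backtracks to the last underscore before the first dot
    _          # that last underscore
    (          # start of the captured keyword
      [^.]+    # text before the first dot
      \.       # the first dot itself
      [^.]+    # text up to, but not including, the next dot
    )
    """,
    re.VERBOSE,
)

names = [
    "1234_4534__sfhaksj_DHJKhd_hJD_41247612_2462184_2131_PQRST.GHI.xlsx",
    "1234_4534__sfhaksj_DHJKhd_hJD_41FA247612_2462184_2131_WXY.TUV.xlsx",
]
results = [pat.search(name).group(1) for name in names]
print(results)  # ['PQRST.GHI', 'WXY.TUV']
```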
You can do:
df['text'].str.extract(r'_([^._]+\.[^.]+)', expand=False)
Output:
0 ABCDEF.GHI
1 PQRST.GHI
2 JKLMN.OPQ
3 WXY.TUV
Name: text, dtype: object
You can also do it with rsplit(). Specify maxsplit, so that you don't split more than you need to (for efficiency):
[s.rsplit('_', maxsplit=1)[1].rsplit('.', maxsplit=1)[0] for s in strings]
# ['GHI', 'PQRST.GHI', 'JKLMN.OPQ', 'WXY.TUV']
If some strings have fewer than 2 dots and each returned string should still have one dot in it, then add a ternary operator that splits (or not) depending on the number of dots in the string.
[x.rsplit('.', maxsplit=1)[0] if x.count('.') > 1 else x
for s in strings
for x in [s.rsplit('_', maxsplit=1)[1]]]
# ['GHI.xlsx', 'PQRST.GHI', 'JKLMN.OPQ', 'WXY.TUV']
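Another way to express the same idea without a regex is str.rpartition together with os.path.splitext, which drops only the final extension. This is a sketch under the assumption that every name ends with a single extension such as .xlsx; the helper name keyword_from_filename is invented for illustration.

```python
import os


def keyword_from_filename(filename):
    """Return the text after the last underscore, without the final extension."""
    # rpartition splits once, at the last underscore; index 2 is the part after it.
    tail = filename.rpartition("_")[2]
    # splitext removes only the last ".ext", so inner dots are preserved.
    stem, _extension = os.path.splitext(tail)
    return stem


strings = [
    "1234_4534_41247612_2462184_2131_GHI.xlsx",
    "1234_4534__sfhaksj_DHJKhd_hJD_41247612_2462184_2131_PQRST.GHI.xlsx",
    "12JSAF34_45aAF34__sfhaksj_DHJKhd_hJD_41247612_2f462184_2131_JKLMN.OPQ.xlsx",
    "1234_4534__sfhaksj_DHJKhd_hJD_41FA247612_2462184_2131_WXY.TUV.xlsx",
]
keywords = [keyword_from_filename(s) for s in strings]
print(keywords)  # ['GHI', 'PQRST.GHI', 'JKLMN.OPQ', 'WXY.TUV']
```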

How to replace comma with space in python list

I want to replace the commas with spaces in the list. How can I do that? Thanks
input:
host_dict['actives'] = list(get_po_bound_ints['result'][1]['portChannels'][mac_to_eth2]['activePorts'].keys())
output:
[{'actives': ['PeerEthernet23',
'PeerEthernet24',
'Ethernet23',
'Ethernet22'],
Replacing commas with empty strings (assuming your list is named my_list):
print(str(my_list).replace(',', ''))
If I understand your question, you have an iterable that you want to convert to a space-separated string. The str.join method does that:
>>> test = ['PeerEthernet23', 'PeerEthernet24', 'Ethernet23', 'Ethernet22']
>>> " ".join(test)
'PeerEthernet23 PeerEthernet24 Ethernet23 Ethernet22'
Your script would be:
host_dict['actives'] = " ".join(get_po_bound_ints['result'][1]
['portChannels'][mac_to_eth2]['activePorts'].keys())
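str.join accepts any iterable of strings, so it also works directly on a dictionary's keys view. A small sketch with made-up data (the active_ports dictionary below is hypothetical and only mimics the shape of the response in the question):

```python
# Hypothetical stand-in for the nested 'activePorts' dictionary.
active_ports = {
    "PeerEthernet23": {},
    "PeerEthernet24": {},
    "Ethernet23": {},
    "Ethernet22": {},
}

as_list = list(active_ports.keys())      # a list, printed with commas and quotes
as_text = " ".join(active_ports.keys())  # one space-separated string

print(as_list)
print(as_text)
```

The list itself never contains commas; they only appear when the list is printed, which is why joining into a string is usually what is wanted here.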

Get two numbers from a string

I need to extract two numbers inside a string. They can look like this:
(0,0)
(122,158)
(1,22)
(883,8)
etc...
So I want to get the first number before the comma and the number after the comma and store them in variables. I can get the first number like this:
myString.split(',')[0][1:]
However, I can't figure out how to get the next number.
Thanks for the help everyone!
It should work with something like
myVar.split(',')[0][1:] # = 122 for the string in the second line
myVar.split(',')[1][:-1] # = 158 for the string in the second line
This should be the easiest way to do this
You could get rid of the parentheses, split the string, and convert each item to an int:
a, b = [int(x) for x in s[1:-1].split(',')]
Of course, if you are absolutely sure about the string's format, and don't care about security, you could just eval the string:
a, b = eval(s)
You can use ast.literal_eval() to convert your string into a tuple. This will also take care of extra whitespace, as in '( 123, 158)'.
>>> from ast import literal_eval
>>> tup = literal_eval('(122,158)')
>>> tup[0]
122
>>> tup[1]
158
Or just:
>>> first, second = literal_eval('(122,158)')
Multi assignment, stripping the parentheses and splitting will do:
a, b = myString.lstrip('(').rstrip(')').split(',')
# a, b = map(int, (a, b))
myVar.split(',')[1][:-1]
will get you the second number.
The simplest one-liner would be
a, b = (myString[1:-1].split(',')[0], myString[1:-1].split(',')[1])
Gets rid of the parentheses, then splits at the comma.
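If the strings might contain stray spaces, a regular expression that simply collects the integers is a little more forgiving than slicing. The following is a minimal sketch, assuming each string holds exactly two integers; the helper name parse_pair is made up for illustration.

```python
import re


def parse_pair(text):
    """Return the two integers found in text as a tuple of ints."""
    numbers = re.findall(r"-?\d+", text)  # optional minus sign, then digits
    if len(numbers) != 2:
        raise ValueError(f"expected exactly two integers in {text!r}")
    first, second = (int(n) for n in numbers)
    return first, second


a, b = parse_pair("(122,158)")
print(a, b)  # 122 158
```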

Remove Characters from string with replace not working

I have a number of strings from which I am trying to remove characters using replace. However, this doesn't seem to work. To give a simplified example, this code:
row = "b'James Bray,/citations?user=8IqSrdIAAAAJ&hl=en&oe=ASCII,1985,6020,188.12,42,1.31,76,2.38'"
row = row.replace("b'", "").replace("'", "").replace('b"', '').replace('"', '')
print(row.encode('ascii', errors='ignore'))
still outputs b'James Bray,/citations?user=8IqSrdIAAAAJ&hl=en&oe=ASCII,1985,6020,188.12,42,1.31,76,2.38' whereas I would like it to output James Bray,/citations?user=8IqSrdIAAAAJ&hl=en&oe=ASCII,1985,6020,188.12,42,1.31,76,2.38. How can I do this?
Edit: Updated the code with a better example.
You seem to be mistaking single quotes for double quotes. Simply replace 'b:
>>> row = "xyz'b"
>>> row.replace("'b", "")
'xyz'
As an alternative to str.replace, you can simply slice the string to remove the unwanted leading and trailing characters:
>>> row[2:-1]
'James Bray,/citations?user=8IqSrdIAAAAJ&hl=en&oe=ASCII,1985,6020,188.12,42,1.31,76,2.38'
In your first .replace, change b' to 'b. Hence your code should be:
>>> row = "xyz'b"
>>> row = row.replace("'b", "").replace("'", "").replace('b"', '').replace('"', '')
# ^ changed here
>>> print(row.encode('ascii', errors='ignore'))
xyz
I am assuming the rest of your conditions are part of other tasks/matches that you didn't mention here.
If all you want is to take the string before first ', then you may just do:
row.split("'")[0]
You haven't listed this to remove 'b:
.replace("'b", '')
import ast
row = "b'James Bray,/citations?user=8IqSrdIAAAAJ&hl=en&oe=ASCII,1985,6020,188.12,42,1.31,76,2.38'"
b_string = ast.literal_eval(row)
print(b_string)
u_string = b_string.decode('utf-8')
print(u_string)
out:
b_string:b'James Bray,/citations?user=8IqSrdIAAAAJ&hl=en&oe=ASCII,1985,6020,188.12,42,1.31,76,2.38'
u_string: James Bray,/citations?user=8IqSrdIAAAAJ&hl=en&oe=ASCII,1985,6020,188.12,42,1.31,76,2.38
The real question is how to convert a string to a Python object.
You have a string that contains the representation of a binary (bytes) string. To convert it to a Python bytes object you could use eval(); ast.literal_eval() is a safer way to do it.
Once you have the bytes object, you can convert it to a Unicode string, which does not start with "b", by using decode().
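It may help to see why the original attempt still printed a b'...' prefix: the replace calls do remove the characters, but calling encode afterwards turns the str into a bytes object again, and printing bytes in Python 3 shows its b'...' representation. A short sketch with a shortened, made-up row value:

```python
# Shortened, made-up example value in the same shape as the question's row.
row = "b'James Bray,1985'"

# The replace calls work as intended and yield a plain str.
cleaned = row.replace("b'", "").replace("'", "")
print(cleaned)  # James Bray,1985

# encode() returns bytes, so print() shows the b'...' form once more.
encoded = cleaned.encode("ascii", errors="ignore")
print(encoded)  # b'James Bray,1985'
```

Printing the cleaned str directly, without the final encode, avoids the prefix.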

how to split brackets using python abcd[00451.00]

I have tried below code to split but I am unable to split
import re
s = "abcd[00451.00]"
print(str(s).strip('[]'))
I need only the number in decimal format, 00451.00, as output, but the output I get is abcd[00451.00
If you know for sure that there will be one opening and closing brackets you can do
s = "abcd[00451.00]"
print(s[s.index("[") + 1:s.rindex("]")])
# 00451.00
str.index returns the index of the first [ in the string, whereas str.rindex returns the index of the last ]. The string is then sliced between those two indexes.
If you want to convert that to a floating point number, you can use the float function, like this:
print(float(s[s.index("[") + 1:s.rindex("]")]))
# 451.0
You should use re.search:
import re
s = "abcd[00451.00]"
print(re.search(r'\[([^\]]+)\]', s).group(1))
00451.00
You can first split on the '[' and then strip the resulting list of any ']' chars:
[p.strip(']') for p in s.split('[')]
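The re.search approach raises an AttributeError when the string has no brackets at all, because search returns None. A minimal sketch of a wrapper that handles that case; the helper name bracket_content is made up for illustration.

```python
import re


def bracket_content(text):
    """Return the text inside the first [...] pair, or None if there is none."""
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None


value = bracket_content("abcd[00451.00]")
print(value)         # 00451.00
print(float(value))  # 451.0
print(bracket_content("no brackets here"))  # None
```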
