I tried the code below to extract the number, but it does not work:
import re
s = "abcd[00451.00]"
print(str(s).strip('[]'))
I need only the number in decimal format, 00451.00, as the output, but what I get is abcd[00451.00
If you know for sure that there will be exactly one opening and one closing bracket, you can do
s = "abcd[00451.00]"
print(s[s.index("[") + 1:s.rindex("]")])
# 00451.00
str.index returns the index of the first occurrence of [ in the string, whereas str.rindex returns the index of the last occurrence of ]. The string is then sliced between those two indexes.
If you want to convert that to a floating point number, you can use the float function, like this
print(float(s[s.index("[") + 1:s.rindex("]")]))
# 451.0
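As a hedged sketch, the slicing approach can be wrapped in a small helper that also copes with input that has no brackets. The name `between_brackets` and the choice to return `None` are assumptions made for illustration, not part of the original answer.

```python
def between_brackets(text):
    """Return the text between the first '[' and the last ']', or None."""
    start = text.find("[")   # -1 if there is no opening bracket
    end = text.rfind("]")    # -1 if there is no closing bracket
    if start == -1 or end == -1 or end < start:
        return None
    return text[start + 1:end]


print(between_brackets("abcd[00451.00]"))         # 00451.00
print(float(between_brackets("abcd[00451.00]")))  # 451.0
print(between_brackets("no brackets here"))       # None
```

Using `find`/`rfind` instead of `index`/`rindex` avoids the `ValueError` that the latter raise when a bracket is missing.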
You should use re.search:
>>> import re
>>> s = "abcd[00451.00]"
>>> print(re.search(r'\[([^\]]+)\]', s).group(1))
00451.00
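One caveat worth sketching: `re.search` returns `None` when nothing matches, so calling `.group(1)` directly raises `AttributeError` on input without brackets. The helper name `first_bracketed` below is hypothetical.

```python
import re

# Same pattern as above: a '[' followed by one or more non-']' characters, then ']'.
BRACKETED = re.compile(r"\[([^\]]+)\]")


def first_bracketed(text):
    """Return the contents of the first [...] group, or None if there is none."""
    match = BRACKETED.search(text)
    return match.group(1) if match else None


print(first_bracketed("abcd[00451.00]"))  # 00451.00
print(first_bracketed("abcd"))            # None
```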
You can first split on the '[' and then strip any ']' characters from each element of the resulting list:
[p.strip(']') for p in s.split('[')]
# ['abcd', '00451.00']
I'm trying to get the data after the second underscore from the end of a string.
sample:
str
a_bc_def 12_23_this_6729
abc_def,122$3_this_6729
abc_def_1_2_23_this_6729
output
this_6729
You can first split your string by the '_', use a slice to get the last two substrings, then join them again by '_':
string = '''a_bc_def 12_23_this_6729
abc_def,122$3_this_6729
abc_def_1_2_23_this_6729'''
print('_'.join(string.split('_')[-2:]))
Output:
this_6729
You can also call rfind twice, like this:
a = "a_bc_def 12_23_this_6729"
a[a[:a.rfind("_")].rfind("_") + 1:]
Output
'this_6729'
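Another option, sketched here under the assumption that each sample is processed as a separate line, is `str.rsplit` with a `maxsplit` of 2, which splits from the right and so avoids splitting the whole string. The function name `last_two_fields` is made up for this example.

```python
def last_two_fields(text, sep="_"):
    """Return everything after the second-to-last separator."""
    parts = text.rsplit(sep, 2)   # split at most twice, starting from the right
    return sep.join(parts[-2:])   # rejoin the last two fields


samples = [
    "a_bc_def 12_23_this_6729",
    "abc_def,122$3_this_6729",
    "abc_def_1_2_23_this_6729",
]
for line in samples:
    print(last_two_fields(line))  # this_6729 each time
```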
What is the most effective way to cut out the part of a string that starts after the marker '=#=' and ends at the next '=#='? For example:
From a large string
'321#5=85#45#41=#=I-LOVE-STACK-OVER-FLOW=#=3234#41#=q#$^1=#=xx$q=#=xpa$=4319'
the Python code should return:
'I-LOVE-STACK-OVER-FLOW'
Any help will be appreciated.
Using split():
s = '321#5=85#45#41=#=I-LOVE-STACK-OVER-FLOW=#=3234#41#=q#$^1=#=xx$q=#=xpa$=4319'
st = '=#='
ed = '=#='
print((s.split(st))[1].split(ed)[0])
Using regex:
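import re
s = '321#5=85#45#41=#=I-LOVE-STACK-OVER-FLOW=#=3234#41#=q#$^1=#=xx$q=#=xpa$=4319'
st = '=#='
ed = '=#='
# non-greedy (.*?) stops at the first closing marker; re.escape guards against special characters
print(re.search('%s(.*?)%s' % (re.escape(st), re.escape(ed)), s).group(1))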
OUTPUT:
I-LOVE-STACK-OVER-FLOW
In addition to @DirtyBit's answer, if you also want to handle cases with more than two '=#=' markers, you can split the string and then join every other element:
s = '321#5=85#45#41=#=I-LOVE-STACK-OVER-FLOW=#=3234#41#=q#$^1=#=xx$q=#=xpa$=4319=#=|I-ALSO-LOVE-SO=#=3123123'
parts = s.split('=#=')
print(''.join([parts[i] for i in range(1,len(parts),2)]))
Output
I-LOVE-STACK-OVER-FLOWxx$q|I-ALSO-LOVE-SO
The explanation is in the code comments.
import re
ori_str = '321#5=85#45#41=#=I-LOVE-STACK-OVER-FLOW=#=3234#41#=q#$^1=#=xx$q=#=xpa$=4319'
ori_list = re.split("=#=", ori_str)
# The goal is to find the strings wrapped between "=#=" markers.
# After the split, the even positions hold the parts outside the markers
# and the odd positions hold what you want.
for i in range(len(ori_list)):
    if i % 2 == 1:  # odd position
        print(ori_list[i])
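The same "every odd position" idea can be sketched with `re.findall` and a non-greedy group, which pairs the markers up from left to right. The function name `wrapped_segments` is an assumption for illustration.

```python
import re


def wrapped_segments(text, marker="=#="):
    """Return every substring enclosed by a pair of markers, left to right."""
    delim = re.escape(marker)  # treat the marker as literal text
    # (.*?) is non-greedy, so each match ends at the nearest closing marker.
    return re.findall(delim + r"(.*?)" + delim, text)


s = '321#5=85#45#41=#=I-LOVE-STACK-OVER-FLOW=#=3234#41#=q#$^1=#=xx$q=#=xpa$=4319'
print(wrapped_segments(s))     # ['I-LOVE-STACK-OVER-FLOW', 'xx$q']
print(wrapped_segments(s)[0])  # I-LOVE-STACK-OVER-FLOW
```

Note that the sample string contains four markers, so it holds two wrapped segments, not one.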
I want to know how to construct a regular expression to extract the list.
Here is my string:
audit = "{## audit_filter = ['hostname.*','service.*'] ##}"
Here is my expression:
AUDIT_FILTER_RE = r'([.*])'
And here is my search statement:
audit_filter = re.search(AUDIT_FILTER_RE, audit).group(1)
I want to extract everything inside the square brackets including the brackets. '[...]'
Expected Output:
['hostname.*','service.*']
import re
audit = "{## audit_filter = ['hostname.*','service.*'] ##}"
print(eval(re.findall(r"\[.*\]", audit)[0])) # ['hostname.*', 'service.*']
findall returns a list of string matches. In your case, there should only be one, so I'm retrieving the string at index 0, which is a string representation of a list. Then, I use eval(...) to convert that string representation of a list to an actual list. Just beware:
If there are no matches, ...findall...[0] will throw a list index out of range error
Don't use eval() if you ever expect input coming from another source (i.e. input that is not yours) because that would be a security issue.
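Both caveats can be addressed in a short sketch: `ast.literal_eval` from the standard library only evaluates Python literals (lists, strings, numbers and so on), and checking the match first avoids the index error. The function name `parse_audit_filter` and the empty-list fallback are assumptions, not part of the original answer.

```python
import ast
import re


def parse_audit_filter(text):
    """Extract the first [...] literal from text as a Python list."""
    match = re.search(r"\[.*\]", text)
    if match is None:
        return []  # assumed fallback when no list is present
    # literal_eval rejects anything that is not a plain literal,
    # so arbitrary code in the input is never executed.
    return ast.literal_eval(match.group(0))


audit = "{## audit_filter = ['hostname.*','service.*'] ##}"
print(parse_audit_filter(audit))         # ['hostname.*', 'service.*']
print(parse_audit_filter("no list here"))  # []
```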
Use r"\[(.*?)\]"
Ex:
import re
audit = "{## audit_filter = ['hostname.*'] ##}"
print(re.findall(r"\[(.*?)\]", audit))
Output:
["'hostname.*'"]
A small issue I've encountered during coding.
I'm looking to print out the name of a .txt file.
For example, the file is named: verdata_florida.txt, or verdata_newyork.txt
How can I exclude .txt and verdata_, but keep the string between? It must work for any number of characters, but .txt and verdata_ must be excluded.
This is where I am so far, I've already defined filename to be input()
print("Average TAM at", str(filename[8:**????**]), "is higher than ")
3 ways of doing it:
using str.split twice:
>>> "verdata_florida.txt".split("_")[1].split(".")[0]
'florida'
using str.partition twice (you won't get an exception if the format doesn't match, and probably faster too):
>>> "verdata_florida.txt".partition("_")[2].partition(".")[0]
'florida'
using re, keeping only center part:
>>> import re
>>> re.sub(r".*_(.*)\..*", r"\1", "verdata_florida.txt")
'florida'
All of the above must be tuned if _ and . can appear multiple times (should we keep the longest or the shortest string?).
EDIT: In your case, though, prefixes & suffixes seem fixed. In that case, just use str.replace twice:
>>> "verdata_florida.txt".replace("verdata_","").replace(".txt","")
'florida'
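One thing to keep in mind is that `str.replace` removes every occurrence, not just a leading or trailing one. On Python 3.9 or later, `str.removeprefix` and `str.removesuffix` only touch the ends. A minimal sketch, with `location_from_filename` as a hypothetical helper name:

```python
def location_from_filename(filename, prefix="verdata_", suffix=".txt"):
    """Drop a fixed prefix and suffix, leaving the middle untouched (Python 3.9+)."""
    return filename.removeprefix(prefix).removesuffix(suffix)


print(location_from_filename("verdata_florida.txt"))       # florida
print(location_from_filename("verdata_my.txt_notes.txt"))  # my.txt_notes

# For comparison, replace() also removes the '.txt' in the middle:
print("verdata_my.txt_notes.txt".replace("verdata_", "").replace(".txt", ""))  # my_notes
```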
Assuming you want it to split on the first _ and the last . you can use slicing and the index and rindex functions to get this done. These functions will search for the first occurrence of the substring in the parenthesis and return the index number. If no substring is found, they will throw a ValueError. If the search is desired, but not the ValueError, you can also use find and rfind, which do the same thing but always return -1 if no match is found.
s = 'verdata_new_hampshire.txt'
s_trunc = s[s.index('_') + 1: s.rindex('.')] # or s[s.find('_') + 1: s.rfind('.')]
print(s_trunc) # new_hampshire
Of course, if you are always going to exclude verdata_ and .txt you could always hardcode the slice as well.
print(s[8:-4]) # new_hampshire
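The difference between `index`/`rindex` and `find`/`rfind` matters when the name does not match the expected format. The sketch below (function names are illustrative only) shows that the `find` variant fails silently, because `-1` is itself a valid slice bound.

```python
def middle_by_index(name):
    """Slice between the first '_' and the last '.'; raises ValueError if either is missing."""
    return name[name.index("_") + 1:name.rindex(".")]


def middle_by_find(name):
    """Same slice using find/rfind; never raises, but may return a wrong result."""
    return name[name.find("_") + 1:name.rfind(".")]


print(middle_by_index("verdata_new_hampshire.txt"))  # new_hampshire
print(middle_by_find("verdata_new_hampshire.txt"))   # new_hampshire

# With no '_' and no '.', find gives -1 for both, so the slice is name[0:-1]:
print(middle_by_find("florida"))                     # florid

try:
    middle_by_index("florida")
except ValueError as exc:
    print("index raised:", exc)
```

Which behaviour is preferable depends on whether malformed names should be rejected or tolerated.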
You can leverage str.split() on strings. For example:
s = 'verdata_newyork.txt'
s.split('verdata_')
# ['', 'newyork.txt']
s.split('verdata_')[1]
# 'newyork.txt'
s.split('verdata_')[1].split('.txt')
# ['newyork', '']
s.split('verdata_')[1].split('.txt')[0]
# 'newyork'
You can just split the string on the dot and the underscore, like this:
filename = "verdata_prague.txt"
name = filename.split(".")[0]  # 'verdata_prague'
name = name.split("_")[1]  # 'prague'
or use the replace function:
filename = "verdata_prague.txt"
name = filename.replace(".txt", "")  # 'verdata_prague'
name = name.replace("verdata_", "")  # 'prague'
I am new to Python and I have a string that looks like this
Temp = "', '/1412311.2121\n"
My desired output is just the number, including the decimal point, so I'm looking for
1412311.2121
as the output. I'm trying to get rid of the ', '/ and \n in the string. I have tried Temp.strip("\n") and Temp.rstrip("\n") to remove the \n, but it still seems to remain in my string. Does anyone have any ideas? Thanks for your help.
Strings are immutable. string.strip() doesn't change string, it's a function that returns a value. You need to do:
Temp = Temp.strip()
Note also that calling strip() without any parameters causes it to remove all whitespace characters, including \n
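As stalk said, you can achieve your desired result by calling strip("', /\n") on Temp. The set of characters must include the space, otherwise stripping stops at the space after the comma.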
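A short sketch of both points: the original string is left unchanged unless the result is assigned, and the argument to `strip` is a set of characters to remove from both ends, not a literal substring. The lowercase variable names are chosen for this example only.

```python
temp = "', '/1412311.2121\n"

temp.strip("\n")           # result is discarded; temp itself is unchanged
print(repr(temp))          # "', '/1412311.2121\n"

# Characters to strip from both ends: quote, comma, space, slash, newline.
cleaned = temp.strip("', /\n")
print(cleaned)             # 1412311.2121
print(float(cleaned))      # 1412311.2121
```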
If the data are like you show, a number wrapped on the left and right by non-numeric data, you can use a very simple regular expression:
import re
g = re.search('[0-9.]+', Temp) # capture the inner number only
print(g.group(0))
I would use a regular expression to do this:
In [8]: s = "', '/1412311.2121\n"
In [9]: re.findall(r'([+-]?\d+(?:\.\d+)?(?:[eE][+-]\d+)?)', s)
Out[9]: ['1412311.2121']
This returns a list of all floating-point numbers found in the string.
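If actual numbers rather than strings are needed, the matches can be passed to `float`. This is a sketch with two assumptions: the helper name `extract_floats` is made up, and the exponent sign is made optional (`[+-]?`) so that forms like `1e3` are also recognised, which differs slightly from the pattern above.

```python
import re

# Optional sign, digits, optional fraction, optional exponent with optional sign.
NUMBER = re.compile(r"[+-]?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?")


def extract_floats(text):
    """Return every number found in text, converted to float."""
    return [float(token) for token in NUMBER.findall(text)]


print(extract_floats("', '/1412311.2121\n"))  # [1412311.2121]
print(extract_floats("a=1e3, b=-2.5"))        # [1000.0, -2.5]
```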