regex to match a word and everything after it? - python

I need to extract some HTTP data as a string from an HTTP packet that I have in string format. I am trying to use the regular expression below to match 'data:' and everything after it, but it is not working. I am new to regex and Python.
>>>import re
>>>pat=re.compile(r'(?:/bdata:/b)?\w$')
>>>string=" dnfhndkn data: ndknfdjoj pop"
>>>res=re.match(pat,string)
>>>print res
None

re.match matches only at the beginning of the string. Use re.search to match at any position. (See search() vs. match())
>>> import re
>>> pat = re.compile(r'(?:/bdata:/b)?\w$')
>>> string = " dnfhndkn data: ndknfdjoj pop"
>>> res = re.search(pat,string)
>>> res
<_sre.SRE_Match object at 0x0000000002838100>
>>> res.group()
'p'
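To match everything after data:, replace \w with .*. Also remove the /b sequences: a word boundary is written \b, whereas /b matches a literal slash followed by the letter b.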
>>> import re
>>> pat = re.compile(r'(?:data:).*$')
>>> string = " dnfhndkn data: ndknfdjoj pop"
>>> res = re.search(pat,string)
>>> print res.group()
data: ndknfdjoj pop
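If only the text that follows data: is wanted, a capture group can pull it out, and checking the match for None avoids an AttributeError when the marker is absent. A minimal sketch; the helper name text_after_marker is hypothetical and not part of the answer above:

```python
import re

def text_after_marker(text, marker="data:"):
    """Return everything after the first occurrence of marker, or None."""
    # re.escape protects against regex metacharacters in the marker.
    # re.DOTALL lets '.' match newlines too, in case the payload spans lines.
    match = re.search(re.escape(marker) + r"(.*)", text, re.DOTALL)
    return match.group(1) if match else None

print(repr(text_after_marker(" dnfhndkn data: ndknfdjoj pop")))
print(text_after_marker("no marker in this string"))
```

The returned text keeps its leading space; call .strip() on it if that is unwanted.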

No need for a regular expression here. You can just slice the string:
>>> string
' dnfhndkn data: ndknfdjoj pop'
>>> string.index('data')
10
>>> string[string.index('data'):]
'data: ndknfdjoj pop'
string.index('data') returns the index at which the substring data first occurs, and raises ValueError if it is not found. The slice from this position to the end of the string, string[10:], gives you the part of the string you are interested in.
By the way, string is a potentially problematic variable name if you are planning on using the string module at any point...

You can just do:
string.split("data:")[1]
This assumes that "data:" appears exactly once in each string.
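When that assumption might not hold, str.partition is one alternative: it splits at the first occurrence only and reports whether the separator was found at all. A small sketch, with after_marker as a hypothetical helper name:

```python
def after_marker(text, marker="data:"):
    """Return the text after the first marker, or None if the marker is missing."""
    # partition returns (before, separator, after); separator is '' when not found.
    _, found, rest = text.partition(marker)
    return rest if found else None

print(repr(after_marker(" dnfhndkn data: ndknfdjoj pop")))
print(repr(after_marker("a data: b data: c")))
print(after_marker("nothing to see"))
```

Unlike split("data:")[1], this keeps any later occurrences of the marker in the result and does not raise IndexError when the marker is absent.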

Related

How to replace/delete a string in python

How can I replace or delete part of a string, like this?
string = '{DDBF1F} this is my string {DEBC1F}'
# The code between the braces, e.g. {DDBF1F}, is random; I only know that it is made up of 6 characters
The output should be
this is my string
I tried this. I know it doesn't work, but I tried :3
string = '{DDBF1F} Hello {DEBC1F}'
string.replace(f'{%s%s%s%s%s%s}', 'abc')
print(string)
Use the re library to perform a regex replace, like this:
import re
text = '{DDBF1F} Hello {DEBC1F}'
result = re.sub(r"(\s?\{[A-F0-9]{6}\}\s?)", "", text)
print(result)
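Note that the pattern above removes the optional whitespace on both sides of a tag, so a tag sitting between two words would join them. If tags can appear mid-sentence, one possible variation is to replace each tag with a space and then normalize whitespace. This is a sketch under the assumption that collapsing runs of whitespace is acceptable; strip_tags and TAG are hypothetical names, and the character class is widened to accept lowercase hex digits:

```python
import re

# Six hex digits in braces, e.g. {DDBF1F}
TAG = re.compile(r"\{[0-9A-Fa-f]{6}\}")

def strip_tags(text):
    """Remove {XXXXXX} tags and collapse the leftover whitespace."""
    without_tags = TAG.sub(" ", text)
    # split() with no arguments drops leading/trailing and repeated whitespace.
    return " ".join(without_tags.split())

print(strip_tags("{DDBF1F} Hello {DEBC1F}"))
print(strip_tags("one {AABBCC} two"))
```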
If the length of the strings within the brackets is fixed, you can use slicing to get the inner substring:
>>> string = '{DDBF1F} this is my string {DEBC1F}'
>>> string[8:-8]
' this is my string '
(string[9:-9] if you want to remove the surrounding spaces)
If hardcoding the indexes feels bad, they can be derived using str.index (if you can be certain that the string will not contain an embedded '}'):
>>> start = string.index('}')
>>> start
7
>>> end = string.index('{', start)
>>> end
27
>>> string[start+1:end]
' this is my string '
This code works:
string = '{DDBF1F} this is my string {DEBC1F}'
st=string.split(' ')
new_str=''
for i in st:
    if i.startswith('{') and i.endswith('}'):
        pass
    else:
        new_str=new_str+" "+ i
print(new_str)
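The same token-filtering idea can be written more compactly with a list comprehension and str.join, which also avoids the leading space that the loop above leaves in new_str. A sketch, with drop_brace_tokens as a hypothetical name:

```python
def drop_brace_tokens(text):
    """Remove whitespace-separated tokens of the form {...} and rejoin the rest."""
    words = [
        word for word in text.split()
        if not (word.startswith("{") and word.endswith("}"))
    ]
    return " ".join(words)

print(drop_brace_tokens("{DDBF1F} this is my string {DEBC1F}"))
```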

Python: How to remove [' and ']?

I want to remove [' from start and '] characters from the end of a string.
This is my text:
"['45453656565']"
I need to have this text:
"45453656565"
I've tried to use str.replace
text = text.replace("['","");
but it does not work.
Strip the unwanted characters by passing them to the str.strip() method:
>>> s = "['45453656565']"
>>>
>>> s.strip("[']")
'45453656565'
Or, if you want to convert it to an integer, you can simply pass the stripped result to the int function:
>>> try:
... val = int(s.strip("[']"))
... except ValueError:
... print("Invalid string")
...
>>> val
45453656565
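Keep in mind that str.strip treats its argument as a set of characters to remove from both ends, not as a literal prefix or suffix. If exactly the two-character wrappers [' and '] should be removed, and only when both are present, a slice guarded by startswith and endswith is one way to do it. A sketch; unwrap is a hypothetical helper name:

```python
def unwrap(text, prefix="['", suffix="']"):
    """Remove prefix and suffix only if the text has both; otherwise return it unchanged."""
    if text.startswith(prefix) and text.endswith(suffix):
        return text[len(prefix):len(text) - len(suffix)]
    return text

s = "['45453656565']"
print(unwrap(s))

# strip() removes any of the given characters from both ends, in any order:
print("]'[45453656565".strip("[']"))
```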
Using re.sub:
>>> my_str = "['45453656565']"
>>> import re
>>> re.sub(r"['\]\[]","",my_str)
'45453656565'
You could loop over the character filtering if the element is a digit:
>>> number_array = "['34325235235']"
>>> int(''.join(c for c in number_array if c.isdigit()))
34325235235
This solution works for both "['34325235235']" and '["34325235235"]', and for any other mix of digits and other characters.
You can also use a regular expression to get it:
>>> import re
>>> theString = "['34325235235']"
>>> int(re.sub(r'\D', '', theString)) # Optionally parse to int
34325235235
Instead of hacking your data by stripping brackets, you should edit the script that created it to print out just the numbers. E.g., instead of lazily doing
output.write(str(mylist))
you can write
for elt in mylist:
    output.write(elt + "\n")
Then when you read your data back in, it'll contain the numbers (as strings) without any quotes, commas or brackets.
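A round trip along those lines might look like the sketch below. It uses io.StringIO as a stand-in for a real file so the example is self-contained; mylist and output follow the names above, and the sample values are illustrative:

```python
import io

mylist = ["45453656565", "34325235235"]  # illustrative sample data

# Write one element per line (io.StringIO stands in for an open file).
output = io.StringIO()
for elt in mylist:
    output.write(elt + "\n")

# Read the data back: each line is one number as a string, minus the newline.
output.seek(0)
numbers = [line.rstrip("\n") for line in output]
print(numbers)
```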

python regex find characters from and end of the string

svn-backup-test,2014/09/24/18/Rev1223/FullSvnCheckout.tgz
From the string above I need to fetch Rev1223. I was wondering whether a regular expression can do that; I would like something like string.search("Rev" up to the next /).
So far I have split the string using split:
s1, s2, s3, s4, s5 = string.split("/", 4)
You don't need a regex to do this. It is as simple as:
path = 'svn-backup-test,2014/09/24/18/Rev1223/FullSvnCheckout.tgz'
path.split('/')[-2]
Here is a quick Python example:
>>> import re
>>> s = 'svn-backup-test,2014/09/24/18/Rev1223/FullSvnCheckout.tgz'
>>> p = re.compile(r'.*/(Rev\d+)/.*')
>>> p.match(s).groups()[0]
'Rev1223'
To find the second path component from the end using a regex, if preferred:
/(Rev\d+)/[^/]+$
http://regex101.com/r/cC6fO3/1
>>> import re
>>> m = re.search(r'/(Rev\d+)/[^/]+$', 'svn-backup-test,2014/09/24/18/Rev1223/FullSvnCheckout.tgz')
>>> m.groups()[0]
'Rev1223'
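Both regex answers call a method directly on the match object, which raises AttributeError if the path has no Rev component. A guarded version could look like this sketch, where find_revision and REV_PATTERN are hypothetical names:

```python
import re

# A path component of the form Rev<digits>, followed by exactly one more component.
REV_PATTERN = re.compile(r"/(Rev\d+)/[^/]+$")

def find_revision(path):
    """Return the RevNNNN component before the file name, or None if absent."""
    match = REV_PATTERN.search(path)
    return match.group(1) if match else None

print(find_revision("svn-backup-test,2014/09/24/18/Rev1223/FullSvnCheckout.tgz"))
print(find_revision("svn-backup-test,2014/09/24/18/FullSvnCheckout.tgz"))
```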

Python testing if whitespace(\s) symbol is in string

I have a problem testing whether "\s" sequences are present in a string. For example, '\sgoogle\s.com' should be reported as containing one.
# use raw strings to ignore escapes:
s = r'\sgoogle\s'
print s, s.find(r'\s') != -1
# and with regex:
import re
print re.search(r'\\s', s)
Gives:
\sgoogle\s True
<_sre.SRE_Match object at 0x7f2d696fc850>
You have to escape the \ as follows:
s = r'\sgoogle\s.com'
re.search(r'(\\s)', s)
Demo: http://regex101.com/r/tF9oW6
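It may help to see the difference between the two-character sequence backslash plus s and actual whitespace, since the regex \s (with a single backslash) matches the latter. A short sketch with illustrative variable names:

```python
import re

literal = r"\sgoogle\s.com"   # contains a backslash followed by 's', no real whitespace
spaced = " google .com"       # contains real space characters

# Plain substring test for the two characters backslash + 's'.
has_backslash_s = r"\s" in literal

# The regex \s matches whitespace characters, not the text "\s".
literal_has_whitespace = re.search(r"\s", literal) is not None
spaced_has_whitespace = re.search(r"\s", spaced) is not None

print(has_backslash_s, literal_has_whitespace, spaced_has_whitespace)
```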

python string pattern matching

new_str="##2##*##1"
new_str1="##3##*##5##7"
How can I split the strings above in Python?
for val in new_str.split("##*"):
logging.debug("=======")
logging.debug(val[2:]) // will give
for st in val.split("##*"):
//how to get the values after ## in new_str and new_str1
I don't understand the question.
Are you trying to split a string by a delimiter? Then use split:
>>> a = "##2##*##1"
>>> b = "##3##*##5##7"
>>>
>>> a.split("##*")
['##2', '##1']
>>> b.split("##*")
['##3', '##5##7']
Are you trying to strip extraneous characters from a string? Then use strip:
>>> c = b.split("##*")[1]
>>> c
'##5##7'
>>> c.strip("#")
'5##7'
Are you trying to remove all the hashes (#) from a string? Then use replace:
>>> c.replace("#","")
'57'
Are you trying to find all the characters after "##"? Then use rsplit with its optional argument to split only once:
>>> a.rsplit("##",1)
['##2##*', '1']
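If the goal is to collect every value that follows a "##" marker, which is only a guess at what the question intends, re.findall with a capture group is one option. In this sketch values_after_hashes is a hypothetical name, and a value is assumed to be any run of characters other than # and *:

```python
import re

def values_after_hashes(text):
    """Return each run of non-'#', non-'*' characters that directly follows '##'."""
    return re.findall(r"##([^#*]+)", text)

print(values_after_hashes("##2##*##1"))
print(values_after_hashes("##3##*##5##7"))
```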
