I am trying to match a string that contains only digits (with the regex I give below), but it is returning both of them.
string1 = '1234843847394645362'
string2 = 'this is what I have 1297643847381737345is a multi'
Regex used:
This gives me the numbers from both string1 and string2.
Can we avoid getting the number from string2?
I need help.
Try with this regex: ^\d{15,20}$
If you don't want a match when the digits are followed by a trailing newline, use \Z instead of $.
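As a minimal sketch of how these anchors behave in Python (the names `digits_only`, `candidates` and `with_newline` are made up for illustration), the pattern accepts string1 and rejects string2, and the last lines show how $ and \Z differ when the input ends in a newline:

```python
import re

string1 = '1234843847394645362'
string2 = 'this is what I have 1297643847381737345is a multi'

# ^ and $ anchor the match to the whole string, so digits embedded in text are rejected.
digits_only = re.compile(r'^\d{15,20}$')
candidates = [string1, string2]
matched = [s for s in candidates if digits_only.search(s)]
print(matched)  # ['1234843847394645362']

# $ still matches just before a trailing newline; \Z only matches at the very end.
with_newline = string1 + '\n'
print(bool(re.search(r'^\d{15,20}$', with_newline)))   # True
print(bool(re.search(r'^\d{15,20}\Z', with_newline)))  # False
```

In Python, re.fullmatch(r'\d{15,20}', s) behaves like the \Z form without needing explicit anchors.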
I have a regex
Hello!1 World
and I want to find the first mark (! or #) and change the number 1 after it to another number, 2.
I did
but it adds extra text, and I only want to change the number.
I expect the result to be
and, if using #, to be
Match and capture either ! or # in a named capture group, here called char, if followed by one or more digits and a whitespace:
Substitute with the named capture, \g<char> followed by 2_:
If you only want the substitution if there's a 1 following either ! or #, replace \d+ with 1.
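The description above can be sketched in Python as follows. The pattern is reconstructed from the prose (named group `char`, then digits, then whitespace), so treat it as an assumption rather than the answer's exact code; `only_one` is a made-up name for the variant that requires a literal 1.

```python
import re

# Named group "char" keeps the ! or #; \d+\s consumes the digits and the following whitespace.
pattern = re.compile(r'(?P<char>[!#])\d+\s')

outputs = [pattern.sub(r'\g<char>2_', text) for text in ('Hello!1 World', 'Hello#1 World')]
print(outputs)  # ['Hello!2_World', 'Hello#2_World']

# Variant that only substitutes when the digit is exactly 1.
only_one = re.compile(r'(?P<char>[!#])1\s')
unchanged = only_one.sub(r'\g<char>2_', 'Hello!12 World')
print(unchanged)  # Hello!12 World
```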
In your substitution you need to change the {\1}2_ to just 2_.
import re

string = "Hello!1 World"
pattern = r"(\!|\#)(1)"  # raw string avoids invalid-escape warnings
replacement = "2_"
result = re.sub(pattern, replacement, string)
Why not: string.replace('!1 ', '!2_').replace('#1 ', '#2_') ?
>>> string = "Hello!1 World"
>>> repl = lambda s: s.replace('!1 ', '!2_').replace('#1 ', '#2_')
>>> string2 = repl(string)
>>> string2
'Hello!2_World'
>>> string = "Hello!12 World"
>>> string2 = repl(string)
>>> string2
'Hello!12 World'
The replacement for your pattern should be \g<1>2_
You could also shorten your pattern to a single capture group containing a character class [!#], followed by a literal match of the digit, and use the same replacement as above.
Or use a lookbehind assertion without any groups and replace with 2_.
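A sketch of the three variants in Python. The exact patterns from this answer were not preserved, so the ones below are assumptions that follow its description. Note that these keep the space after the digit, because the patterns do not consume it.

```python
import re

text = 'Hello!1 World'

# Two-group pattern: \g<1> is needed because \12 would be read as a reference to group 12.
two_groups = re.sub(r'(!|#)(1)', r'\g<1>2_', text)

# Single capture group with a character class, same replacement.
one_group = re.sub(r'([!#])1', r'\g<1>2_', text)

# Lookbehind: nothing is captured, only the digit itself is replaced.
lookbehind = re.sub(r'(?<=[!#])1', '2_', text)

print(two_groups, one_group, lookbehind, sep='\n')
# Hello!2_ World
# Hello!2_ World
# Hello!2_ World
```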
I want to write a single regular expression to extract a substring from each of these two strings:
string1 = '#HISEQ:625:HC2T5BCXY:1:1101:1177:2101'
string2 = '#SRR7216015.1 HISEQ:630:HC2VKBCXY:1:1101:1177:2073/1'
I want to extract the string right after the # until it hits the end or a space, to get
HISEQ:625:HC2T5BCXY:1:1101:1177:2101 from string1
SRR7216015.1 from string2
How can I do this? I've tested a number of regular expressions but couldn't get it to work.
Below is the code I tried:
import re

string1 = '#HISEQ:625:HC2T5BCXY:1:1101:1177:2101'
string2 = '#SRR7216015.1 HISEQ:630:HC2VKBCXY:1:1101:1177:2073/1'
pattern1 = re.compile(r'#(\w*.*:*\d*:*\w*:*\d*:*\d*[$|\s])')
Thanks in advance!
Just use
and take the first group. Lookarounds or alternations, as suggested in other answers, tend to be more expensive than a plain capture group.
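The pattern this answer refers to was not preserved. As a sketch of the capture-group approach, assuming a pattern such as `#(\S+)` (a # followed by a run of non-whitespace characters; the name `after_hash` is made up here):

```python
import re

string1 = '#HISEQ:625:HC2T5BCXY:1:1101:1177:2101'
string2 = '#SRR7216015.1 HISEQ:630:HC2VKBCXY:1:1101:1177:2073/1'

# \S+ stops at the first whitespace character or at the end of the string.
after_hash = re.compile(r'#(\S+)')

extracted = []
for s in (string1, string2):
    m = after_hash.search(s)
    if m:  # search returns None when there is no # followed by text
        extracted.append(m.group(1))

print(extracted)
# ['HISEQ:625:HC2T5BCXY:1:1101:1177:2101', 'SRR7216015.1']
```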
You could use this regex for that:
(?<=#).*?(?= |$)
Use lookarounds. (?<=#) checks for a # sign before the match, and (?= |$) asserts that a space or the end of the string follows. .*? lazily matches everything in between.
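For reference, a short sketch of this lookaround pattern applied to the two strings from the question (the variable names `between` and `results` are illustrative):

```python
import re

string1 = '#HISEQ:625:HC2T5BCXY:1:1101:1177:2101'
string2 = '#SRR7216015.1 HISEQ:630:HC2VKBCXY:1:1101:1177:2073/1'

# The lookarounds assert context without consuming it, so group() is already the wanted text.
between = re.compile(r'(?<=#).*?(?= |$)')
results = [between.search(s).group() for s in (string1, string2)]

print(results)
# ['HISEQ:625:HC2T5BCXY:1:1101:1177:2101', 'SRR7216015.1']
```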
I am trying to create a list of tuples with the data after the strings string1 and string3, but I am not getting the expected result.
s = 'string1:1234string2string3:a1b2c3string1:2345string3:b5c6d7'
Actual result:
[('1234', 'b5c6d7')]
Expected result:
[('1234', 'a1b2c3'), ('2345', 'b5c6d7')]
Your current regex uses [\s,\S]+, which is greedy: it first consumes everything up to the end of the string and then backtracks only as far as it must, so the match runs from the first string1 to the last string3.
You could make it non-greedy and use a positive lookahead (?=string|$) after the last group, which asserts that what follows is either string or the end of the line $.
import re
s = 'string1:1234string2string3:a1b2c3string1:2345string3:b5c6d7'
The problem is that [\s,\S]+ is greedy and therefore consumes everything between the first string1 and the last string3.
You can fix that by using positive lookaheads and making the regex non-greedy like this:
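The patterns from the question and the answers were not preserved, so the following is only a sketch under assumptions: it uses `\d+` for the value after string1 and `[\s\S]` (the comma in `[\s,\S]` is redundant) for "any character". It reproduces both the reported and the expected result.

```python
import re

s = 'string1:1234string2string3:a1b2c3string1:2345string3:b5c6d7'

# Greedy: [\s\S]+ runs to the end and backtracks only to the LAST "string3:".
greedy = re.findall(r'string1:(\d+)[\s\S]+string3:([\s\S]+)', s)
print(greedy)  # [('1234', 'b5c6d7')]

# Lazy quantifiers plus a lookahead: each value stops before the next "string" or at the end.
lazy = re.findall(r'string1:(\d+)[\s\S]*?string3:([\s\S]+?)(?=string|$)', s)
print(lazy)    # [('1234', 'a1b2c3'), ('2345', 'b5c6d7')]
```

The lookahead only works as intended as long as the values themselves never contain the text "string".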
I want to use Python to manipulate a string I have.
Basically, I want to prepend "\x" to every hex byte except the bytes that already have "\x" prepended to them.
My original string looks like this:
mystr = r"30336237613131\x90\x01\x0A\x90\x02\x146F6D6D616E64\x90\x01\x06\x90\x02\x0F52656C6174\x90\x01\x02\x90\x02\x50656D31\x90\x00"
And I want to create the following string from it:
mystr = r"\x30\x33\x62\x37\x61\x31\x31\x90\x01\x0A\x90\x02\x14\x6F\x6D\x6D\x61\x6E\x64\x90\x01\x06\x90\x02\x0F\x52\x65\x6C\x61\x74\x90\x01\x02\x90\x02\x50\x65\x6D\x31\x90\x00"
I thought of using regular expressions to match everything except /\x../g and replace every match with "\x". Sadly, I struggled with it a lot without any success. Moreover, I'm not sure that using regex is the best approach to solve such case.
Regex: (?:\\x)?([0-9A-Z]{2}) Substitution: \\x$1
(?:) Non-capturing group
? Matches between zero and one time, match string \x if it exists.
() Capturing group
[] Match a single character present in the list 0-9 and A-Z
{n} Matches exactly n times
\\x String \x
$1 Group 1.
Python code:
import re
text = R'30336237613131\x90\x01\x0A\x90\x02\x146F6D6D616E64\x90\x01\x06\x90\x02\x0F52656C6174\x90\x01\x02\x90\x02\x50656D31\x90\x00'
text = re.sub(R'(?:\\x)?([0-9A-Z]{2})', R'\\x\1', text)
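Note that [0-9A-Z] accepts any uppercase letter and no lowercase ones. A variant sketch with a stricter hexadecimal class, shown on a shortened sample taken from the start of the question's string (`hex_pair` and `sample` are names chosen for this example):

```python
import re

sample = r'30336237613131\x90\x01\x0A'

# Optional existing \x prefix, then exactly two hex digits in either case.
hex_pair = re.compile(r'(?:\\x)?([0-9A-Fa-f]{2})')

# In the replacement template, \\ is a literal backslash and \1 is the captured pair.
result = hex_pair.sub(r'\\x\1', sample)
print(result)  # \x30\x33\x62\x37\x61\x31\x31\x90\x01\x0A
```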
You don't need regex for this. You can use simple string manipulation. First remove all of the "\x" from your string. Then add it back before every 2 characters.
replaced = mystr.replace(r"\x", "")
newstr = "".join([r"\x" + replaced[i*2:(i+1)*2] for i in range(len(replaced) // 2)])
>>> print(newstr)
You can get a list of your values to manipulate as you wish, with an even simpler re pattern:
import re

mystr = r"30336237613131\x90\x01\x0A\x90\x02\x146F6D6D616E64\x90\x01\x06\x90\x02\x0F52656C6174\x90\x01\x02\x90\x02\x50656D31\x90\x00"
pat = r'([a-fA-F0-9]{2})'
match = re.findall(pat, mystr)  # list of two-character hex strings
if match:
    print('\n\nNew string:')
    print('\\x' + '\\x'.join(match))
    # for elem in match:  # match gives you a list of strings with the hex values
    #     print('\\x{}'.format(elem), end='')
    print('\n\nOriginal string:')
    print(mystr)
This can be done without replacing existing \x by using a combination of positive lookbehinds and negative lookaheads.
import re

regex = r"(?!(?<=\\x)|(?<=\\x[a-f\d]))([a-f\d]{2})"
test_str = r"30336237613131\x90\x01\x0A\x90\x02\x146F6D6D616E64\x90\x01\x06\x90\x02\x0F52656C6174\x90\x01\x02\x90\x02\x50656D31\x90\x00"
subst = r"\\x\1"  # Python uses \1 for group 1; $1 would be inserted literally
result = re.sub(regex, subst, test_str, flags=re.IGNORECASE)
if result:
    print(result)
(?!(?<=\\x)|(?<=\\x[a-f\d])) Negative lookahead ensuring either of the following doesn't match.
(?<=\\x) Positive lookbehind ensuring what precedes is \x.
(?<=\\x[a-f\d]) Positive lookbehind ensuring what precedes is \x followed by a hexadecimal digit.
([a-f\d]{2}) Capture any two hexadecimal digits into capture group 1.
In this text:
I want to catch all IPs, which in this example are and an empty string.
I tried to use this regex:
which returns and ",
It took the first " from PolicerID instead of the last " in IPAddress.
Can you please help me?
You can keep it simple and just use a capturing group:
>>> import re
>>> text = r'"IPAddress":"","PolicerID":"","IPAddress":"","PolicerID":""'
>>> print(re.findall(r'"IPAddress":"([^"]*)', text))
['', '']
However, if you have to use a lookbehind assertion, then use this regex:
([^"]*) is a negated pattern to match 0 or more of any character that is not a double quote.
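The lookbehind pattern itself was not preserved. A sketch of both approaches, assuming a lookbehind of the form `(?<="IPAddress":")` and a hypothetical sample in which 192.0.2.1 (a documentation address, not from the original post) stands in for a real IP:

```python
import re

# Hypothetical sample text; the first IPAddress value is made up for illustration.
text = '"IPAddress":"192.0.2.1","PolicerID":"","IPAddress":"","PolicerID":""'

# Capturing group: findall returns only the group contents.
with_group = re.findall(r'"IPAddress":"([^"]*)', text)

# Lookbehind: the prefix is asserted but not consumed, so no group is needed.
# Python requires lookbehinds to be fixed-width, which this literal prefix is.
with_lookbehind = re.findall(r'(?<="IPAddress":")[^"]*', text)

print(with_group)       # ['192.0.2.1', '']
print(with_lookbehind)  # ['192.0.2.1', '']
```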
If you want all IPs in that text I would suggest this regex