Can someone explain why my regex does not match in the examples below? How can I match the [ and ] characters?
>>> str = li= "a.b.\[c\]"
>>> if re.search(li,str,re.IGNORECASE):
...     print("Matched")
...
>>>
>>> str = li= r"a.b.[c]"
>>> if re.search(li,str,re.IGNORECASE):
...     print("Matched")
...
>>>
If I remove the opening and closing brackets, I get a match:
>>> str = li= 'a.b.c'
>>> if re.search(li,str,re.IGNORECASE):
...     print("matched")
...
matched
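Because str and li are the same object, you are searching the text a.b.\[c\] (backslashes included) rather than a.b.[c]. Keep the pattern and the text separate, and escape the dots and brackets only in the pattern.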
Try this:
import re
li= r"a\.b\.\[c\]"
s = "a.b.[c]"
if re.search(li, s, re.IGNORECASE):
    print("Matched")
By the way, re.IGNORECASE is not needed here.
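If the text to match is a fixed literal, the escaping can also be left to re.escape instead of being written by hand. A minimal sketch, assuming the goal is to match the literal text a.b.[c] exactly (the variable names are only for illustration):

```python
import re

literal = "a.b.[c]"

# re.escape backslash-escapes the regex metacharacters in the literal,
# so '.', '[' and ']' are matched as plain characters.
pattern = re.escape(literal)

matched = re.search(pattern, literal) is not None
# With the dots escaped, 'aXbX[c]' no longer matches.
rejected = re.search(pattern, "aXbX[c]") is None

print(matched, rejected)
```

With an unescaped pattern the dots would match any character, so the second check is what shows the difference.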
You can try the following code:
import re
str = "a.b.[c]"
if re.search(r".*\[.*\].*", str):
    print("Matched")
Output:
Matched
I have a string in which I want to replace some variables, but in different steps, something like:
my_string = 'text_with_{var_1}_to_variables_{var_2}'
my_string.format(var_1='10')
### make process 1
my_string.format(var_2='22')
But when I try to replace the first variable I get an Error:
KeyError: 'var_2'
How can I accomplish this?
Edit:
I want to create a new list:
name = 'Luis'
ids = ['12344','553454','dadada']
def create_list(name,ids):
    my_string = 'text_with_{var_1}_to_variables_{var_2}'.replace('{var_1}',name)
    return [my_string.replace('{var_2}',_id) for _id in ids ]
this is the desired output:
['text_with_Luis_to_variables_12344',
'text_with_Luis_to_variables_553454',
'text_with_Luis_to_variables_dadada']
But using .format instead of .replace.
In short, format cannot fill in only some of the fields of a string (for example {var_1} but not {var_2}); every field must be supplied in the same call. I am not sure why you want to replace only part of the string, but there are a few workarounds:
Approach 1: Write the field you want to fill in at the second step as {{}} instead of {}. For example, replace {var_2} with {{var_2}}.
>>> my_string = 'text_with_{var_1}_to_variables_{{var_2}}'
>>> my_string = my_string.format(var_1='VAR_1')
>>> my_string
'text_with_VAR_1_to_variables_{var_2}'
>>> my_string = my_string.format(var_2='VAR_2')
>>> my_string
'text_with_VAR_1_to_variables_VAR_2'
Approach 2: Replace once using format and another using %.
>>> my_string = 'text_with_{var_1}_to_variables_%(var_2)s'
# Replace first variable
>>> my_string = my_string.format(var_1='VAR_1')
>>> my_string
'text_with_VAR_1_to_variables_%(var_2)s'
# Replace second variable
>>> my_string = my_string % {'var_2': 'VAR_2'}
>>> my_string
'text_with_VAR_1_to_variables_VAR_2'
Approach 3: Collect the args in a dict and unpack it once all values are available.
>>> my_string = 'text_with_{var_1}_to_variables_{var_2}'
>>> my_args = {}
# Assign value of `var_1`
>>> my_args['var_1'] = 'VAR_1'
# Assign value of `var_2`
>>> my_args['var_2'] = 'VAR_2'
>>> my_string.format(**my_args)
'text_with_VAR_1_to_variables_VAR_2'
Use the one which satisfies your requirement. :)
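Another option, offered only as a sketch and not as one of the approaches above: str.format_map accepts any mapping, so a dict subclass with __missing__ can hand unknown fields back unchanged. KeepMissing is a hypothetical helper name, and this only handles plain fields such as {var_2} (no format specs or attribute access).

```python
class KeepMissing(dict):
    """Hypothetical helper: return '{name}' for unknown fields instead of raising KeyError."""

    def __missing__(self, key):
        return "{" + key + "}"


def create_list(name, ids):
    template = 'text_with_{var_1}_to_variables_{var_2}'
    # First step: fill var_1 only; {var_2} is put back as-is.
    partial = template.format_map(KeepMissing(var_1=name))
    # Second step: an ordinary format call fills the remaining field.
    return [partial.format(var_2=_id) for _id in ids]


result = create_list('Luis', ['12344', '553454', 'dadada'])
print(result)
```

Note that if the first value itself contains braces, the second format call will try to interpret them, so this fits best when the inserted values are plain text.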
Do you have to use format? If not, can you just use str.replace? For example:
my_string = 'text_with_#var_1#_to_variables_#var2#'
my_string = my_string.replace("#var_1#", '10')
###
my_string = my_string.replace("#var2#", '22')
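If changing the placeholder syntax is acceptable, as it is in the replace version above, the standard library's string.Template does this kind of stepwise substitution directly. A minimal sketch, assuming the placeholders can be rewritten as ${var_1} and ${var_2}:

```python
from string import Template

template = Template('text_with_${var_1}_to_variables_${var_2}')

# safe_substitute fills what it is given and leaves the other
# placeholders in the text instead of raising KeyError.
step_1 = template.safe_substitute(var_1='10')

### make process 1

# substitute is the strict variant: every placeholder must be supplied.
step_2 = Template(step_1).substitute(var_2='22')

print(step_1)
print(step_2)
```

The braces in ${var_1} matter here, because $var_1_to would be read as one long placeholder name.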
The following seems to work now:
s = 'a {} {{}}'.format('b')
print(s) # prints a b {}
print(s.format('c')) # prints a b c
I need to capitalize a line of input, but if I just use the upper() function, link addresses get capitalized, thus making them unusable.
For example: "Cool Video www.youtube.com/watch?v=dQw4w9WgXcQ"
will turn to: "COOL VIDEO WWW.YOUTUBE.COM/WATCH?V=DQW4W9WGXCQ"
The link address has changed and won't work anymore. Is there any way to ignore links?
If I understand your goal correctly, you should first find the part of the string to upper-case and then join it back with the rest of the original string, like this:
>>> import re
>>> s = "Cool Video -> www.youtube.com/watch?v=dQw4w9WgXcQ"
>>> #Look for the part of string you want to upper case
>>> m = re.search(r'^.*(?=\s+->)', s)
>>> m
<_sre.SRE_Match object; span=(0, 10), match='Cool Video'>
>>> #m.start() and m.end() will give you the start and end positions of the matched string.
>>> new_s = s[m.start():m.end()].upper() + s[m.end():]
>>> #remember that strings are immutable, so make new one
>>> new_s
'COOL VIDEO -> www.youtube.com/watch?v=dQw4w9WgXcQ'
>>> #OR
>>> new_s = m.group().upper() + s[m.end():]
>>> new_s
'COOL VIDEO -> www.youtube.com/watch?v=dQw4w9WgXcQ'
EDIT:
Another way is to look for the text preceding the link and apply the upper method to it:
>>> s = "Cool Video www.youtube.com/watch?v=dQw4w9WgXcQ"
>>> m = re.search(r'(.*)(?=www.*)',s)
>>> s = m.group().upper() + s[m.end():]
>>> s
'COOL VIDEO www.youtube.com/watch?v=dQw4w9WgXcQ'
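Both snippets above assume a single link at the end of the line. If links can appear anywhere, one possible generalisation is to upper-case only the text between links. This is a rough sketch: URL_RE is my own loose assumption of what counts as a link (anything starting with http://, https:// or www. up to the next whitespace), and upper_except_links is a hypothetical helper name.

```python
import re

# Assumed, deliberately loose definition of a link.
URL_RE = re.compile(r'(?:https?://|www\.)\S+')


def upper_except_links(text):
    parts = []
    last = 0
    for m in URL_RE.finditer(text):
        # Upper-case the text before this link, keep the link untouched.
        parts.append(text[last:m.start()].upper())
        parts.append(m.group())
        last = m.end()
    # Upper-case whatever follows the last link.
    parts.append(text[last:].upper())
    return ''.join(parts)


print(upper_except_links("Cool Video www.youtube.com/watch?v=dQw4w9WgXcQ"))
```

A stricter URL pattern may be needed for real input, for example if a link is immediately followed by punctuation.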
I am trying to write regex in python for either single or double quotation marks from these examples:
animal="cat"
animal="horse"
animal='dog'
animal='cow'
It comes up empty when trying with |
re.compile("animal=\"|'(.+?)\"|'").findall
Please help. Thanks
You can take advantage of back-reference:
r = re.compile(r"""animal=(["'])(.+?)\1""")
This guarantees that the opening and closing characters are the same.
It's time to test it:
assert r.search('animal="cat"').group(2) == "cat"
assert r.search('animal="horse"').group(2) == "horse"
assert r.search("animal='dog'").group(2) == "dog"
assert r.search("animal='cow'").group(2) == "cow"
Your alternation does not apply only to ' and "; the | splits the whole pattern into three alternatives. Use a character class instead:
>>> s="""animal="cat"
...
... animal="horse"
...
... animal='dog'
...
... animal='cow'"""
>>>
>>> re.findall(r"""animal=["'](.+?)["']""",s)
['cat', 'horse', 'dog', 'cow']
>>>
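To see where the back-reference and the character class differ, here is a small sketch. The third sample line, with an apostrophe inside double quotes, is my own assumption and not from the question.

```python
import re

text = '''animal="cat"
animal='dog'
animal="it's"'''

# Back-reference: the closing quote must be the same character as the opening one.
strict = re.compile(r"""animal=(["'])(.+?)\1""")
# Character class: any quote may close the value.
loose = re.compile(r"""animal=["'](.+?)["']""")

# With two groups, findall would return tuples, so pick group 2 explicitly.
strict_values = [m.group(2) for m in strict.finditer(text)]
loose_values = loose.findall(text)

print(strict_values)
print(loose_values)
```

The character class stops at the first quote of either kind, so it cuts the third value short, while the back-reference keeps it whole.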
I have a 'NoneType' object like:
A='ABC:123'
I would like to get an object keeping only the digits:
A2=digitsof(A)='123'
Split at the colon:
>>> A='ABC:123'
>>> numA = int(A.split(':')[1])
>>> numA
123
How about:
>>> import re
>>> def digitsof(a):
...     return [int(x) for x in re.findall(r'\d+', a)]
...
>>> digitsof('ABC:123')
[123]
>>> digitsof('ABC:123,123')
[123, 123]
>>>
Regular Expressions?
>>> from re import sub
>>> A = 'ABC:123'
>>> sub(r'\D', '', A)
'123'
A simple filter function (in Python 3, filter returns an iterator, so join the result):
A='ABC:123'
''.join(filter(lambda s: s.isdigit(), A))
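The answers above agree on 'ABC:123' but behave differently once digits also appear before the colon. A short sketch, using an input of my own choosing ('A1B:23') to show the difference:

```python
import re

value = 'A1B:23'  # assumed sample input, not from the question

# Everything after the first colon.
after_colon = value.split(':', 1)[1]
# Every digit in the string, joined together.
all_digits = re.sub(r'\D', '', value)
# Each separate run of digits.
digit_runs = re.findall(r'\d+', value)
# Same result as all_digits, without a regex.
filtered = ''.join(filter(str.isdigit, value))

print(after_colon, all_digits, digit_runs, filtered)
```

Which one is right depends on whether the digits are always the part after the colon or can be anywhere in the string.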
Could someone please help me strip characters from a string to leave me with just the characters held within '[....]'?
For example:
a = newyork_74[mylocation]
b = # strip the first characters until you reach the first bracket [
c = [mylocation]
Something like this:
>>> import re
>>> strs = "newyork_74[mylocation]"
>>> re.sub(r'(.*)?(\[)', r'\g<2>', strs)
'[mylocation]'
Assuming no nested structures, one way would be using itertools.dropwhile,
>>> from itertools import dropwhile
>>> a = "newyork_74[mylocation]"
>>> b = ''.join(dropwhile(lambda c: c != '[', a))
>>> b
'[mylocation]'
Another would be to use a regex:
>>> import re
>>> pat = re.compile(r'\[.*\]')
>>> b = pat.search(a).group(0)
>>> b
'[mylocation]'
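For this particular case a plain slice may be enough, with no regex or itertools. A minimal sketch, assuming the string holds at most one bracketed part and that an empty string is an acceptable result when there is no bracket:

```python
a = "newyork_74[mylocation]"

# str.find returns -1 when '[' is absent, so guard against that case.
start = a.find('[')
b = a[start:] if start != -1 else ''

print(b)
```

Unlike the regex version, this keeps everything after the first '[' even if the closing ']' is missing.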