This question already has answers here:
Extracting a URL in Python
(10 answers)
Closed 2 years ago.
import re
string = """position":1,"url":"https://www.flipkart.com/honor-8c-black-64-gb/p/itmfc8c4fsekrpdp?pid=MOBFC8C8FXXNHZ7C&lid=LSTMOBFC8C8FXXNHZ7CZYQGKP&marketplace=FLIPKART,"""
regex = "\w\w\w\w\w\w\w\w\W\W\d\W\W\w\w\w\W\W\W\w\w\w\w\w\W\/\/(...)\"\W"
match = re.findall(regex, string)
print(match)
I want to capture just the link from the above variable.
The output should look like this: https://www.flipkart.com/honor-8c-black-64-gb/p/itmfc8c4fsekrpdp?pid=MOBFC8C8FXXNHZ7C&lid=LSTMOBFC8C8FXXNHZ7CZYQGKP&marketplace=FLIPKART
When I run the above code, it just prints an empty list.
I think something is wrong with my regex, so could anyone please help me?
Thanks in advance.
You have some formatting issues. Here you go (assuming this format is consistent, otherwise follow the advice from the comments):
import re
string = '"position":1,"url":"https://www.flipkart.com/honor-8c-black-64-gb/p/itmfc8c4fsekrpdp?pid=MOBFC8C8FXXNHZ7C&lid=LSTMOBFC8C8FXXNHZ7CZYQGKP&marketplace=FLIPKART"'
regex = r'"url":"(.*)"'
match = re.search(regex, string)
print(match.group(1))
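If more fields follow the URL in the same string, the greedy `.*` above would run on to the last double quote. Here is a minimal sketch that stops at the first closing quote instead. It assumes URLs never contain a double quote, and the trailing `"title"` field is invented for illustration (it is not from the question):

```python
import re

# Hypothetical input: the "title" field is invented for illustration.
string = '"position":1,"url":"https://www.flipkart.com/honor-8c-black-64-gb/p/itmfc8c4fsekrpdp?pid=MOBFC8C8FXXNHZ7C","title":"Honor 8C"'

# [^"]* matches any run of characters except a double quote,
# so the group ends at the quote that closes the url value.
url_pattern = re.compile(r'"url":"([^"]*)"')

match = url_pattern.search(string)
url = match.group(1) if match else None
print(url)
```

The `if match else None` guard avoids an `AttributeError` when the string has no `"url"` field at all.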
This question already has an answer here:
Learning Regular Expressions [closed]
(1 answer)
Closed 3 years ago.
I can't find a regex that looks for a pattern only within a specific range of the string.
I want to find $ $, but only if the two dollar signs are at positions 5 and 7 of the string; it doesn't matter which character is between them.
Example:
xxxx$x$xxxxx would match
xx$x$xxxxxxx would not
import re
should = "xxxx$x$xxxxx would match"
shouldnt = "xx$x$xxxxxxx would not"
pattern = r'^.{4}\$.\$.+'
re.match(pattern, should)
re.match(pattern, shouldnt)
The first call returns a match object; the second returns None.
https://regex101.com/r/RLHrZb/1
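The same check can be wrapped in a small helper. This is only a sketch: it assumes positions are counted from 1, and the name `has_dollars_at_5_and_7` is made up for the example. It drops the trailing `.+`, so a string that ends right after the second `$` also matches:

```python
import re

# ^.{4} skips exactly four characters, so the first \$ sits at position 5
# and the second at position 7, with any single character between them.
POSITION_PATTERN = re.compile(r'^.{4}\$.\$')

def has_dollars_at_5_and_7(s):
    """Return True if s has '$' at positions 5 and 7 (1-based)."""
    return POSITION_PATTERN.match(s) is not None

print(has_dollars_at_5_and_7("xxxx$x$xxxxx"))
print(has_dollars_at_5_and_7("xx$x$xxxxxxx"))
```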
This question already has answers here:
My regex is matching too much. How do I make it stop? [duplicate]
(5 answers)
What do 'lazy' and 'greedy' mean in the context of regular expressions?
(13 answers)
Closed 4 years ago.
I am new to Python and I am struggling a bit with regular expressions. If I have an input like this:
text = '<tag>xyz</tag>\n<tag>abc</tag>'
Is it possible to get an output list with elements like:
matches = ['<tag>xyz</tag>', '<tag>abc</tag>']
Right now I am using the following regex:
matches = re.findall(r"<tag>[\w\W]*</tag>", text)
But instead of a list with two elements, I am getting only one element containing the whole input string:
matches = ['<tag>xyz</tag>\n<tag>abc</tag>']
Could someone please guide me?
Thank you.
You just need to make your quantifier non-greedy, so it stops at the first </tag> instead of the last one.
Change this regex,
<tag>[\w\W]*</tag>
to
<tag>[\w\W]*?</tag>
import re
text = '<tag>xyz</tag>\n<tag>abc</tag>'
matches = re.findall(r"<tag>[\w\W]*?</tag>", text)
print(matches)
Prints,
['<tag>xyz</tag>', '<tag>abc</tag>']
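To see the difference side by side, here is a small sketch comparing the greedy and non-greedy quantifiers on the same input. It uses `.` with the `re.DOTALL` flag as an alternative to `[\w\W]`; that substitution is mine, not part of the answer above.

```python
import re

text = '<tag>xyz</tag>\n<tag>abc</tag>'

# Greedy: * grabs as much as possible, so the match runs to the last </tag>.
greedy = re.findall(r"<tag>.*</tag>", text, flags=re.DOTALL)

# Non-greedy: *? stops at the first </tag> that completes a match.
lazy = re.findall(r"<tag>.*?</tag>", text, flags=re.DOTALL)

print(greedy)
print(lazy)
```

Without `re.DOTALL`, `.` does not match the newline, which is why the original pattern used `[\w\W]` to match any character.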
This question already has answers here:
My regex is matching too much. How do I make it stop? [duplicate]
(5 answers)
Closed 4 years ago.
From the string "/path/to/%directory_1%/%directory_2%.csv" I would like to get
the list [directory_1, directory_2]. I would like to avoid splitting my string on "%", and I was hoping to find a regex that could help, but I cannot find the correct one.
For now, I have the following:
re.findall('%(.*)%', dirty_arg)
which outputs ["directory_1%/%directory_2"]
Do you have any recommendation?
Thank you very much for your help.
Try this:
import re
regex = r"%(.*?)%"
dirty_arg = "/path/to/%directory_1%/%directory_2%.csv"
print(re.findall(regex, dirty_arg))
I've added ? to your regex, which makes .* non-greedy: it matches as few characters as possible. The output of this code is ['directory_1', 'directory_2']
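Another option, sketched here as an alternative rather than a correction, is a negated character class: `[^%]*` can never cross a `%`, so the match stops at the next one without relying on lazy matching.

```python
import re

dirty_arg = "/path/to/%directory_1%/%directory_2%.csv"

# [^%]* matches any run of characters that are not '%'.
names = re.findall(r"%([^%]*)%", dirty_arg)
print(names)
```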
This question already has answers here:
What is the difference between re.search and re.match?
(9 answers)
Closed 5 years ago.
I know this is a very frequently asked question, but it's driving me mad.
I want to use regex to match a substring in my string.
line = '##ParameterValue[part I care about]=garbagegarbage'
And I would like to extract the part I care about.
My code looks like this:
import re
line = '##ParameterValue[part I care about]=garbagegarbage'
m = re.match(r'\[(.*)\]', line)
print(m.group(1))
But this gives me an AttributeError: 'NoneType' object has no attribute 'group'
I tested my regex on regex101 and it works. I don't understand why this fails for me.
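Change match to search. re.match only matches at the beginning of the string, and your line starts with ##, so it returns None; re.search looks for the pattern anywhere in the string.
import re
line = '##ParameterValue[part I care about]=garbagegarbage'
m = re.search(r'\[(.*)\]', line)
print(m.group(1))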
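A short sketch of how the two functions behave on this line, written for Python 3, with a `None` check so a failed match doesn't raise `AttributeError`:

```python
import re

line = '##ParameterValue[part I care about]=garbagegarbage'
pattern = r'\[(.*)\]'

# re.match anchors at index 0; the line starts with '#', not '[', so no match.
anchored = re.match(pattern, line)

# re.search scans the string and finds the bracketed part wherever it is.
found = re.search(pattern, line)

print(anchored)
if found is not None:
    print(found.group(1))
```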
This question already has answers here:
What is the difference between re.search and re.match?
(9 answers)
Closed 7 years ago.
I am trying to match string against mypattern, but I do not get the correct result. Can you please point out where I am going wrong?
import re
mypattern = '_U_[R|S]_data.csv'
string = 'X003_U_R_data.csv'
re.match(mypattern, string)
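I like to compile the regex first. Then I do whatever kind of matching/searching I would like. (The ur prefix only exists in Python 2; in Python 3 use a plain r prefix.)
mypattern = re.compile(r'_U_[R|S]_data.csv')
Then use search rather than match, because re.match only matches at the beginning of the string and yours starts with X003:
re.search(mypattern, string)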
Here's a great website for building and testing regexes: https://regex101.com/#python
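A sketch of the compiled-pattern approach in Python 3. Two changes here are my own assumptions about the intent: `[RS]` instead of `[R|S]` (inside a character class, `|` is a literal pipe, not alternation) and `\.` so the dot matches only a literal dot.

```python
import re

# Compile once, then reuse the pattern object's own methods.
mypattern = re.compile(r'_U_[RS]_data\.csv')

string = 'X003_U_R_data.csv'

# match() fails because the string starts with 'X003', not '_U_'.
print(mypattern.match(string))

# search() finds the pattern anywhere in the string.
result = mypattern.search(string)
print(result.group(0) if result else None)
```

If the whole filename should be validated rather than just located, the pattern could instead be extended to cover the prefix and used with `fullmatch()`.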