How can I use re.match to find numbers? - python

I'm trying to use python re module:
import re
res = re.match(r"\d+", 'editUserProfile!input.jspa?userId=2089')
print(res)
I got None for res, but if I replace match with findall, I can find the 2089.
Do you know where the problem is?

The problem is that you're using match() to search for a substring in a string.
match() only matches at the beginning of the string. If you want to find a match anywhere inside a string, you should use search().
As stated by khelwood in the comments, you should take a look at: Search vs Match.
Code:
import re
res = re.search(r"\d+", 'editUserProfile!input.jspa?userId=2089')
print(res.group(0))
Output:
2089
Alternatively you can use .split() to isolate the user id.
Code:
s = 'editUserProfile!input.jspa?userId=2089'
print(s.split('=')[1])
Output:
2089
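To see the difference side by side, here is a minimal sketch (Python 3) that runs the same pattern through match(), search(), findall() and fullmatch(). The string '2089abc' is a made-up example, not from the question.

```python
import re

s = 'editUserProfile!input.jspa?userId=2089'

# match() only succeeds if the pattern matches at position 0.
print(re.match(r"\d+", s))                   # None: the string starts with 'e'
# search() scans forward and returns the first match anywhere.
print(re.search(r"\d+", s).group())          # 2089
# findall() returns every non-overlapping match as a list of strings.
print(re.findall(r"\d+", s))                 # ['2089']
# match() does succeed when the digits come first (made-up input).
print(re.match(r"\d+", '2089abc').group())   # 2089
# fullmatch() requires the whole string to match the pattern.
print(re.fullmatch(r"\d+", '2089abc'))       # None
```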

Related

Regex URL Help: Word or Phrase

I am an absolute noob at regex (I kind of know the basics) and need help matching a word or a phrase. If it is a phrase, then each word is separated with a hyphen (-):
This is my current regex, which only matches one word:
r'^streams/search/(?P<stream_query>\w+)/$'
The ?P<stream_query> just names the group so that the URL can take a parameter.
Extra note: I am using python re module with the Django urls.py
Any suggestions?
Here are some examples:
game
gsl
starcraft-2014
final-fantasy-iv
word1-word2-word-3
Updated explanation:
I basically need a regular expression to expand the current one, so inside the same regex, no other one:
r'^streams/search/(?P<stream_query>\w+)/$'
So include the new regex INSIDE this one, where (?P<stream_query>\w+) is any word that Django treats as a parameter (and passes into a function).
URL definition, which includes the regex:
url(r'^streams/search/(?P<stream_query>\w+)/$', 'stream_search', name='stream_search')
Then, Django passes that parameter into the stream_search function, which takes that parameter:
def stream_search(request, stream_query):
    # here I manipulate the stream_query string, i.e. removing the hyphens
    pass
So, once again, I need a regex that matches a word or phrase, which is passed into the stream_query parameter (or, if necessary, a second one).
So, what I want stream_query to have is:
word1
or
word1-word2-word3
If I understand your question correctly, then you might not have to use regexes at all.
Based on your example:
example.com/streams/search/rocket-league-fsdfs-fsdfs
It seems that the term you want to deal with is always found after the last /. So you can rsplit and then check for -. Here is an example:
url = "example.com/streams/search/rocket-league-fsdfs-fsdfs"
result = url.rsplit("/", 1)[-1]
# url.rsplit("/", 1) == ["example.com/streams/search", "rocket-league-fsdfs-fsdfs"]
# so result == "rocket-league-fsdfs-fsdfs"
if "-" in result:
    pass  # phrase: do whatever you want with the string
else:
    pass  # single word: do whatever you want with the string
or a regex that would match either word or word-word-word would be: [\w-]+
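As a rough sketch of how [\w-]+ slots into the pattern from the question, the snippet below tests it with plain re rather than inside Django (Python 3). The helper name extract_query is hypothetical and only there for illustration.

```python
import re

# Pattern from the question, with \w+ widened to [\w-]+ so hyphens are accepted.
pattern = re.compile(r'^streams/search/(?P<stream_query>[\w-]+)/$')

def extract_query(path):
    """Hypothetical helper: return the captured stream_query, or None."""
    m = pattern.match(path)
    return m.group('stream_query') if m else None

print(extract_query('streams/search/game/'))              # game
print(extract_query('streams/search/final-fantasy-iv/'))  # final-fantasy-iv
print(extract_query('streams/search/two words/'))         # None (space is rejected)

# Inside the view, the hyphens can then be turned back into spaces.
print(extract_query('streams/search/starcraft-2014/').replace('-', ' '))  # starcraft 2014
```

Note that [\w-]+ also accepts a lone or leading hyphen. If that matters, \w+(?:-\w+)* is a stricter alternative.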
Try this:
import re
url = "http://example.com/something?id=123&action=yes"
regex = r"(\w+)=(\w+)"  # key=value pairs
re.findall(regex, url)
You can also use Python's urlparse library (urllib.parse in Python 3):
from urlparse import urlparse  # Python 3: from urllib.parse import urlparse
parsed = urlparse("http://example.com/something?id=123&action=yes")
Calling urlparse returns:
ParseResult(scheme='http', netloc='example.com', path='/something', params='', query='id=123&action=yes', fragment='')
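Once the URL is parsed, the individual parameters can be pulled out of the query string without any regex. A minimal sketch, assuming Python 3, where both functions live in urllib.parse:

```python
from urllib.parse import urlparse, parse_qs

parsed = urlparse("http://example.com/something?id=123&action=yes")
print(parsed.query)        # id=123&action=yes

# parse_qs maps each key to a list of values, since a key may repeat.
params = parse_qs(parsed.query)
print(params['id'][0])     # 123
print(params['action'])    # ['yes']
```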

Extract string using regex in Python

I'm struggling a bit on how to extract (i.e. assign to variable) a string based on a regex. I have the regex worked out -- I tested on regexpal. But I'm lost on how I actually implement that in Python. My regex string is:
http://jenkins.mycompany.com/job/[^\s]+
What I want to do is take string and if there's a pattern in there that matches the regex, put that entire "pattern" into a variable. So for example, given the following string:
There is a problem with http://jenkins.mycompany.com/job/app/4567. We should fix this.
I want to extract http://jenkins.mycompany.com/job/app/4567 and assign it to a variable. I know I'm supposed to use re but I'm not sure if I want re.match or re.search and how to get what I want. Any help or pointers would be greatly appreciated.
import re
p = re.compile('http://jenkins.mycompany.com/job/[^\s]+')
line = 'There is a problem with http://jenkins.mycompany.com/job/app/4567. We should fix this.'
result = p.search(line)
print result.group(0)
Output:
http://jenkins.mycompany.com/job/app/4567.
Alternatively, use the re.findall method and select the first match it finds:
re.findall(pattern_string, input_string)[0] # pick the first match that is found
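Both approaches fail when nothing matches: search() returns None and indexing an empty findall() list raises IndexError. Here is a hedged sketch (Python 3) that guards against that. The helper name first_job_url is hypothetical, and stripping trailing punctuation is an assumption about what the asker wants, since [^\s]+ also swallows the final period.

```python
import re

# Dots are escaped so they match literal dots; \S+ is equivalent to [^\s]+.
JOB_URL = re.compile(r'http://jenkins\.mycompany\.com/job/\S+')

def first_job_url(text):
    """Hypothetical helper: return the first job URL in text, or None."""
    m = JOB_URL.search(text)
    if m is None:
        return None
    # Assumption: trailing sentence punctuation is not part of the URL.
    return m.group(0).rstrip('.,;:!?')

line = 'There is a problem with http://jenkins.mycompany.com/job/app/4567. We should fix this.'
print(first_job_url(line))            # http://jenkins.mycompany.com/job/app/4567
print(first_job_url('no link here'))  # None
```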

python regex match querystring path

I'm trying to write a regex to match any path that contains /? to determine whether it is a querystring or not.
A sample string to be matched would be: /mysite/path/to/whatever/?page=1
So far I thought this would match: re.match(r'/\?', '/mysite/path/to/whatever/?page=1')
but it doesn't seem to match.
This code is already written for you. No need to reinvent the wheel:
import urlparse
print urlparse.urlparse('/mysite/path/to/whatever/?page=1')
http://docs.python.org/library/urlparse.html#module-urlparse
Your problem is that you're using re.match. That function looks for matches at the beginning of the string. So, either you change your regexp to '.*/\?', or use re.search instead.
You don't need a regular expression here. Just use the in operator: '/?' in the_string.
The problem is that re.match only looks at the beginning of the string.
You could use re.search instead, if you need the power of REs.
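The suggestions above can be compared in a short sketch (Python 3, where the urlparse module became urllib.parse). The second path, '/mysite/path/', is a made-up example without a query string.

```python
import re
from urllib.parse import urlsplit

path = '/mysite/path/to/whatever/?page=1'

print(re.match(r'/\?', path))          # None: the string does not start with '/?'
print(bool(re.search(r'/\?', path)))   # True
print('/?' in path)                    # True

# urlsplit separates out the query string, so no pattern is needed at all.
parts = urlsplit(path)
print(parts.path)                             # /mysite/path/to/whatever/
print(parts.query)                            # page=1
print(bool(urlsplit('/mysite/path/').query))  # False
```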

python and regex

#!/usr/bin/python
import re
str = raw_input("String containing email...\t")
match = re.search(r'[\w.-]+@[\w.-]+', str)
if match:
    print match.group()
It's not the most complicated code, and I'm looking for a way to get ALL of the matches, if that's possible.
It sounds like you want re.findall():
findall(pattern, string, flags=0)
Return a list of all non-overlapping matches in the string.
If one or more groups are present in the pattern, return a
list of groups; this will be a list of tuples if the pattern
has more than one group.
Empty matches are included in the result.
As far as the actual regular expression for identifying email addresses goes... See this question.
Also, be careful using str as a variable name. This will hide the str built-in.
I guess that re.findall is what you're looking for.
You should give findall() a try:
findall() matches all occurrences of a
pattern, not just the first one as
search() does. For example, if one was
a writer and wanted to find all of the
adverbs in some text, he or she might
use findall()
http://docs.python.org/library/re.html#finding-all-adverbs
You aren't using raw_input the way it is meant to be used. Just use raw_input to read the input from the console.
Don't override built-ins such as str. Use a meaningful name and assign the whole string to it.
It is also often a good idea to compile your pattern into a regex object and match the string against that (illustrated in the code).
I just realized that a complete regex that matches an email address exactly as per RFC 822 could fill a page; otherwise, this snippet should be useful.
import re
inputstr = "something@example.com, 121@airtelnet.com, ra@g.net, etc etc\t"
mailsrch = re.compile(r'[\w\-][\w\-\.]+@[\w\-][\w\-\.]+[a-zA-Z]{1,4}')
matches = mailsrch.findall(inputstr)
print matches
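For reference, here is a Python 3 sketch of the same idea, using the simpler pattern from the question and assuming addresses use the usual @ separator. The sample addresses are made up, and the pattern is deliberately loose rather than RFC-compliant.

```python
import re

# Simplified pattern; it will accept some strings that are not valid addresses.
EMAIL = re.compile(r'[\w.-]+@[\w.-]+')

text = "alice@example.com, bob.smith@mail.example.org and no-address-here"

# findall() returns every non-overlapping match as a list of strings.
print(EMAIL.findall(text))  # ['alice@example.com', 'bob.smith@mail.example.org']

# finditer() yields match objects instead, which also carry positions.
for m in EMAIL.finditer(text):
    print(m.start(), m.group())
```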

What is the syntax for evaluating string matches on regular expressions?

How do I determine if a string matches a regular expression?
I want to find True if a string matches a regular expression.
Regular expression:
r".*apps\.facebook\.com.*"
I tried:
if string == r".*apps\.facebook\.com.*":
But that doesn't seem to work.
From the Python docs on the re module:
import re
if re.search(r'.*apps\.facebook\.com.*', stringName):
    print('Yay, it matches!')
This works because re.search returns a match object if the pattern is found, or None if it is not.
You have to import the re module and test it that way:
import re
if re.match(r'.*apps\.facebook\.com.*', string):
    pass  # it matches!
You can use re.search instead of re.match if you want to search for the pattern anywhere in the string. re.match will only match if the pattern can be located at the beginning of the string.
import re
match = re.search(r'.*apps\.facebook\.com.*', string)
You're looking for re.match():
import re
if re.match(r'.*apps\.facebook\.com.*', string):
    do_something()
Or, if you want to match the pattern anywhere in the string, use re.search().
Why don't you also read through the Python documentation for the re module?
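Since the question asks for True or False, one way to wrap this up is a small predicate. This is only a sketch (Python 3), and the function name and sample URLs are made up for illustration.

```python
import re

def mentions_facebook_apps(url):
    """Hypothetical helper: True if 'apps.facebook.com' occurs anywhere in url."""
    # search() scans the whole string, so no leading or trailing .* is needed.
    return re.search(r'apps\.facebook\.com', url) is not None

print(mentions_facebook_apps('http://apps.facebook.com/some_app'))  # True
print(mentions_facebook_apps('http://www.example.com/'))            # False
# The escaped dots matter: an unescaped '.' would match any character here.
print(mentions_facebook_apps('http://appsXfacebookYcom/'))          # False
```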
