How to match begin URL with REGEX in Django - python

I'm using this syntax for my API urls:
http://myhost/myapiname/X-Y-Z/web-service-name/
In Django urls.py, it looks like this:
url(r'api_s/2-0-0/get_client_profile/$', GetClientProfile.as_detail(), name='get_client_profile'),
Now, I'd like to redirect all 1-0-0 urls (deprecated webservices) to a specific view.
I tried something like url(r'api_t/1-0-0*$', Deprecated.as_list(), name='deprecated'), but it never matches. I'm not used to regex, so I'm missing something here. Thanks.

Add a dot before *:
url(r'api_t/1-0-0.*$', Deprecated.as_list(), name='deprecated')
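The asterisk * means "repeat the previous symbol zero or more times". The dot . means "any character". So .* matches any string. In your original pattern the * applies only to the final 0, so nothing that follows the version number can match.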
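A minimal sketch of the difference, using only Python's re module rather than Django itself. The path string is a hypothetical example, written without the leading slash because Django strips it before matching.

```python
import re

# '*' repeats only the final '0'; '.*' accepts any trailing characters.
broken = re.compile(r'api_t/1-0-0*$')
fixed = re.compile(r'api_t/1-0-0.*$')

# Hypothetical deprecated path, without the leading slash.
path = 'api_t/1-0-0/get_client_profile/'

broken_hit = broken.search(path) is not None
fixed_hit = fixed.search(path) is not None

print(broken_hit, fixed_hit)
```

The broken pattern only matches paths that end right after the version number, such as 'api_t/1-0-0' or 'api_t/1-0-000'.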

Related

Regular expression for checking string outside of a set

I am writing a web crawler using Scrapy and as a result I get a set of URLs like: [Dummy URLs]
http://matrix.com/en/Zion
http://matrix.com/en/Machine_World
http://matrix.com/en/Matrix:Banner_guidelines
http://matrix.com/en/File:Link_Banner.jpg
http://matrix.com/wiki/en/index.php
In the rules in scrapy, I want to add a regex that allows urls ONLY of the kind "http://matrix.com/en/Machine_World" or "http://matrix.com/en/Zion"
i.e urls that contain anything outside of the set "http://matrix.com/en/<[a-zA-Z,_]>" must not be allowed.
Constraints :
The string after "/en/" could be of any length, so I cannot ask it to look only at the first 10 or 20 characters. E.g. when I use the regex [a-zA-Z,]{1,20} or [a-zA-Z,]{1,}, it still matches URLs like http://matrix.com/en/Matrix:Banner_guidelines because it finds the "http://matrix.com/en/Matrix" part of the URL to be a successful match. I want it to look at the whole string from "/en/" to the end of the URL and then apply this rule.
Unfortunately I cannot extract that string and write a sub-routine of any kind. It has to be done using a regex only!
i.e urls that contain anything outside of the set "http://matrix.com/en/<[a-zA-Z,_]>" must not be allowed.
Have you tried using this character class in your regex? Looks like you aren't including underscores.
Try
[a-zA-Z,_]+
The plus sign means "one or more" - which is the same as {1,} just a nice shorthand :)
If you want to exclude items with .php or .jpg, feel free to add a $ sign to the end, as so:
[a-zA-Z,_]+$
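The $ means "end of line", meaning that your matching sequence must run to the end of the line. As full stops and colons are not included in the character class, those URLs will be excluded, provided the pattern is also anchored to the "/en/" prefix (for example matrix.com/en/[a-zA-Z,_]+$). Without that prefix, the pattern could still match just the trailing "jpg" or "php".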
Let me know if that works,
Elliott
Reproducible evidence that the suggested regex works:
grep("matrix.com\\/en\\/[a-zA-Z,_]+$", x, perl=TRUE, value=TRUE)
#[1] "http://matrix.com/en/Zion"
#[2] "http://matrix.com/en/Machine_World"
Data
x <- c("http://matrix.com/en/Zion", "http://matrix.com/en/Machine_World",
"http://matrix.com/en/Matrix:Banner_guidelines",
"http://matrix.com/en/File:Link_Banner.jpg",
"http://matrix.com/wiki/en/index.php")
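The same check can be sketched in Python, which is what a Scrapy rule would ultimately run. This assumes the rule applies the regex as a search anywhere in the URL. The variable names are illustrative only.

```python
import re

# Anchor on the /en/ prefix and require the allowed characters up to the end.
pattern = re.compile(r'matrix\.com/en/[a-zA-Z,_]+$')

urls = [
    "http://matrix.com/en/Zion",
    "http://matrix.com/en/Machine_World",
    "http://matrix.com/en/Matrix:Banner_guidelines",
    "http://matrix.com/en/File:Link_Banner.jpg",
    "http://matrix.com/wiki/en/index.php",
]

allowed = [u for u in urls if pattern.search(u)]
print(allowed)
```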

Regex match string beginning with ?code=

I'm using python and django to match urls for my site. I need to match a url that looks like this:
/company/code/?code=34k3593d39k
The part after ?code= is any combination of letters and numbers, and any length.
I've tried this so far:
r'^company/code/(.+)/$'
r'^company/code/(\w+)/$'
r'^company/code/(\D+)/$'
r'^company/code/(.*)/$'
But so far none are catching the expression. Any ideas? Thanks
code=34k3593d39k is a GET parameter, so you don't need to define a pattern for it in the URL pattern. You can access it using request.GET.get('code') in the view. The pattern should be just:
r'^company/code/$'
Usage, accessing GET parameter:
def my_view(request):
    # Query-string values arrive in request.GET, not in the URL pattern.
    code = request.GET.get('code')
    print(code)
Check the documentation:
The URLconf searches against the requested URL, as a normal Python
string. This does not include GET or POST parameters, or the domain
name.
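To illustrate the split the documentation describes, here is a rough sketch using only the standard library, not Django's own resolver. It separates the path from the query string, then matches the pattern against the path alone. The variable names are assumptions for the example.

```python
import re
from urllib.parse import urlsplit, parse_qs

url = '/company/code/?code=34k3593d39k'
parts = urlsplit(url)

# Only the path is matched against the URL pattern;
# Django drops the leading slash before matching.
path_for_urlconf = parts.path.lstrip('/')
matched = re.search(r'^company/code/$', path_for_urlconf) is not None

# The query string is parsed separately, much as request.GET exposes it.
code = parse_qs(parts.query)['code'][0]

print(path_for_urlconf, matched, code)
```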
The first pattern will work if you move the last / to just after the ^:
>>> import re
>>> re.match(r'^/company/code/(.+)$', '/company/code/?code=34k3593d39k')
<_sre.SRE_Match object at 0x0209C4A0>
>>> re.match(r'^/company/code/(.+)$', '/company/code/?code=34k3593d39k').groups()
('?code=34k3593d39k',)
>>>
Note too that the ^ is unnecessary because re.match matches from the start of the string:
>>> re.match(r'/company/code/(.+)$', '/company/code/?code=34k3593d39k').groups()
('?code=34k3593d39k',)
>>>

Python regex with question mark literal

I'm using Django's URLconf, the URL I will receive is /?code=authenticationcode
I want to match the URL using r'^\?code=(?P<code>.*)$' , but it doesn't work.
Then I found out that the problem is the '?'.
Because I tried to match /aaa?aaa using r'aaa\?aaa', r'aaa\\?aaa' and even r'aaa.*aaa', and all failed, but it works when it's "+" or any other character.
How to match the '?', is it special?
>>> s="aaa?aaa"
>>> import re
>>> re.findall(r'aaa\?aaa', s)
['aaa?aaa']
The reason /aaa?aaa won't match inside your URL is because a ? begins a new GET query.
So, the matchable part of the URL is only up to the first 'aaa'. The remaining '?aaa' is a new query string separated by the '?' mark, containing a variable "aaa" being passed as a GET parameter.
What you can do here is encode the variable before it makes its way into the URL. The encoded form of ? is %3F.
You should also not match a GET query such as /?code=authenticationcode using regex at all. Instead, match your URL up to / using r'^$'. Django will pass the variable code as a GET parameter to the request object, which you can obtain in your view using request.GET.get('code').
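A small standard-library sketch of both points: how a raw ? cuts the URL into path and query, and how encoding it as %3F keeps it inside the path. The final step assumes the framework percent-decodes the path before matching it against the pattern.

```python
import re
from urllib.parse import quote, unquote, urlsplit

# A raw '?' ends the path; everything after it is the query string.
raw = urlsplit('/aaa?aaa')
raw_path, raw_query = raw.path, raw.query

# Percent-encoding the '?' keeps it in the path.
encoded_url = '/' + quote('aaa?aaa')
enc = urlsplit(encoded_url)
enc_path, enc_query = enc.path, enc.query

# After decoding, the escaped pattern from the question matches.
decoded_path = unquote(enc_path)
hit = re.search(r'aaa\?aaa', decoded_path) is not None

print(raw_path, raw_query, encoded_url, hit)
```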
You are not allowed to use ? in a URL as a variable value. The ? indicates that there are variables coming in.
Like: http://www.example.com?variable=1&another_variable=2
Replace it or escape it.
Django's urls.py does not parse query strings, so there is no way to get this information at the urls.py file.
Instead, parse it in your view:
def foo(request):
    code = request.GET.get('code')
    if code:
        pass  # do stuff
    else:
        pass  # No code!
"How to match the '?', is it special?"
Yes, but you are properly escaping it by using the backslash. I do not see where you have accounted for the leading forward slash, though. That bit just needs to be added in:
r'^/\?code=(?P<code>.*)$'
Suppress the regex metacharacters with []:
>>> s
'/?code=authenticationcode'
>>> r=re.compile(r'^/[?]code=(.+)')
>>> m=r.match(s)
>>> m.groups()
('authenticationcode',)

Django: Capturing data from URL will not work

I have the following statement in my URL.py file
(r'^confirm/(\d+)/$', confirm)
But this URL
http://127.0.0.1:8000/confirm/DMo32zPB15
Returns this
Page not found (404)
Request Method: GET
Request URL: http://127.0.0.1:8000/confirm/DMo32zPB15
Using the URLconf defined in BBN.urls, Django tried these URL patterns, in this order:
^login/$
^ajax/login$
^ajax/login/nact/$
^ajax/login/nact/cancel//$
^ajax/login/nact/resend/$
^confirm/(\d+)/$
The current URL, confirm/DMo32zPB15, didn't match any of these.
You're seeing this error because you have DEBUG = True in your Django settings file. Change that to False, and Django will display a standard 404 page.
Why won't it recognize the URL?
\d+ means one or more digits.
DMo32zPB15 has both digits and letters. Try r'^confirm/([a-zA-Z0-9]+)/$' instead.
More information about regular expressions can be found at http://www.regular-expressions.info for your reading pleasure.
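A quick way to see the difference is with plain re. The path below is the token from the question, with a trailing slash added as an assumption because both patterns require one.

```python
import re

# Token from the question, plus an assumed trailing slash.
path = 'confirm/DMo32zPB15/'

digits_only = re.compile(r'^confirm/(\d+)/$')
alnum = re.compile(r'^confirm/([a-zA-Z0-9]+)/$')

digits_match = digits_only.search(path)  # fails: the token contains letters
alnum_match = alnum.search(path)         # succeeds: letters and digits allowed

token = alnum_match.group(1)
print(digits_match, token)
```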
This is related to your other question about \d+. \d+ matches only digits. Your URL contains things that are not digits (like letters). You should take a look at the regex tutorial I linked in my answer to your other question and get a solid grasp of regular expressions before you try to write URL matchers using them.

Using URLS that accept slashes as part of the parameter in Django

Is there a way in Django to accept 'n' parameters which are delimited by a '/' (forward slash)?
I was thinking this may work, but it does not. Django still recognizes forward slashes as delimiters.
(r'^(?P<path>[-\w]+/)$', 'some.view', {}),
Add the right url to your urlpatterns:
# ...
("^foo/(.*)$", "foo"), # or whatever
# ...
And process it in your view, like AlbertoPL said:
fields = paramPassedInAccordingToThatUrl.split('/')
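As a rough sketch of that approach outside Django: capture everything after the prefix with (.*), then split on '/'. The original [-\w]+ class cannot do this because it does not include the slash itself. The helper name split_params and the sample paths are hypothetical.

```python
import re

pattern = re.compile(r'^foo/(.*)$')

def split_params(path):
    """Return the '/'-separated segments after the 'foo/' prefix."""
    match = pattern.search(path)
    if match is None:
        return None
    # Drop empty strings produced by a trailing or doubled slash.
    return [segment for segment in match.group(1).split('/') if segment]

fields = split_params('foo/a/b/c')
fields_trailing = split_params('foo/a/b/c/')
no_match = split_params('bar/a/b')
print(fields, fields_trailing, no_match)
```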
Certainly, Django can accept any URL which can be described by a regular expression - including one which has a prefix followed by a '/' followed by a variable number of segments separated by '/'. The exact regular expression will depend on what you want to accept - but an example in Django is given by /admin URLs which parse the suffix of the URL in the view.
