Pass email as regex in django url - python

I am trying to pass an email address as a parameter in a Django URL. I want the same URL to accept an email address, a plain string, or a number as the argument.
url(r"search_connections/(?P<data>[\w.%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4})/$", "search_connections", name="search_connections"),
It works when the parameter is an email address, but not for a plain string like "abc":
working for "/search_connections/abc@test.com/"
not working for "/search_connections/abc/"
I want this URL to work for both.

You could simply use | (or) to add \w+ as an alternative:
r'search_connections/(?P<data>\w+|[\w.%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4})/$'
However, I don't think a regex for email addresses is a robust way to match all valid emails.
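The alternation can be checked outside Django with Python's re module. This is only a minimal sketch: it assumes the separator in the pattern is @, that the path is matched without its leading slash, and the helper name extract_data and the sample paths are made up for illustration.

```python
import re

# Same pattern as in the answer, assuming "@" as the email separator.
PATTERN = re.compile(
    r"search_connections/(?P<data>\w+|[\w.%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4})/$"
)

def extract_data(path):
    """Return the captured 'data' group, or None when the path does not match."""
    match = PATTERN.search(path)
    return match.group("data") if match else None

print(extract_data("search_connections/abc/"))           # abc
print(extract_data("search_connections/abc@test.com/"))  # abc@test.com
```

The plain-string branch is tried first; when it cannot reach the trailing slash, the regex engine backtracks and falls through to the email branch.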

Related

URL pattern for query string parameters Django REST Framework

My current URL path is http://localhost:8000/api/projects/abcml/2021
In urls.py I define it as path("api/projects/<str:project_handle>/<int:year>", functionname...), and in the view I read these parameters with self.kwargs.get.
I want to pass the URL in this format instead: http://localhost:8000/api/projects/abcml/?year=2021
What changes do I need to make in the URL pattern?
I tried path("api/projects/<str:project_handle>/?year=<int:year> but that did not seem correct. In the view, I also changed self.kwargs.get to self.request.query_params.get for the year parameter. That did not work either. The error is Page not found (404).
The query string is not part of the path, and therefore cannot be matched by a URL pattern.
You thus specify the path as:
path('api/projects/<str:project_handle>/', functionname)
In the view, you can access the value with self.request.GET['year']. This returns a string, or raises a KeyError in case the year was not provided in the query string.
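To see why the path pattern never gets to look at ?year=2021, it can help to split the URL with the standard library. This sketch uses urllib.parse rather than Django itself, and get_year is a hypothetical helper that mirrors what a view might do with the query parameters.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://localhost:8000/api/projects/abcml/?year=2021"
parts = urlsplit(url)

# Only the path is what a URL pattern gets matched against.
print(parts.path)   # /api/projects/abcml/
print(parts.query)  # year=2021

def get_year(query, default=None):
    """Hypothetical helper: read 'year' from a query string as an int."""
    values = parse_qs(query).get("year")
    return int(values[0]) if values else default

print(get_year(parts.query))  # 2021
print(get_year(""))           # None
```

In a Django view, self.request.GET.get('year') behaves similarly for a missing parameter: it returns None instead of raising KeyError.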

Regex for Django signed token

I have created a signed token using Django in order to create a URL for validating email addresses. I now need to ensure my urlpattern matches the token Django creates.
The token is created using the following function:
from django.core import signing
def create_token(verify_code, partial_token):
    token = {
        'verify_code': verify_code,
        'partial_token': partial_token
    }
    return signing.dumps(token)
And I have the following url + regex:
url(r'^email/validation/(?P<signed_token>[^/]+)/$', core_views.email_validation,
name='email_validation')
Is (?P<signed_token>[^/]+) the correct regex to pick up all possible tokens created by that function?
The token generated by Django will be a URL-safe base64 string, with its signed parts joined by colons. You could write a regex that specifically matches that alphabet, but this is unnecessary unless you have other URL patterns that could conflict with this one (and if you do, the better solution would be to modify your URL patterns).
The regex you currently have simply captures every character that isn't a /, which should work just fine.
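A quick way to sanity-check the pattern is to run it against a token-shaped string. The sketch below does not call Django: sample_token is a made-up stand-in, on the assumption that a signed token consists of URL-safe base64 segments separated by colons.

```python
import re

# Pattern from the question, without the url() wrapper.
PATTERN = re.compile(r"^email/validation/(?P<signed_token>[^/]+)/$")

# Made-up stand-in for a signed token (assumption: URL-safe base64
# segments separated by colons; not generated by Django).
sample_token = "eyJ2ZXJpZnlfY29kZSI6IjEyMyJ9:1abcDE:Xy_z-0123456789"

match = PATTERN.match("email/validation/" + sample_token + "/")
print(match.group("signed_token") == sample_token)  # True

# A value containing "/" would not be captured by [^/]+.
print(PATTERN.match("email/validation/abc/def/"))  # None
```

To test against real tokens, you could feed the output of create_token into the same pattern inside a Django shell.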

django Url endswith regex in url path

I need to support the following URLs with a single URL regex.
/hotel_lists/view/
/photo_lists/view/
/review_lists/view/
How can I support all of the above URLs in a single view?
I tried something like this:
url(r'^\_lists$/(?P<resource>.*)/$', 'admin.views.customlist_handler'),
Edit:
hotel, photo and review are just examples. The first part is dynamic and can be anything.
If you wish to capture the resource type in the view, you could do this:
url(r'^(?P<resource>hotel|photo|review)_lists/view/$', 'admin.views.customlist_handler'),
Or to make it more generic,
url(r'^(?P<resource>[a-z]+)_lists/view/$', 'admin.views.customlist_handler'),  # or whatever regex pattern is more appropriate
and in the view:
def customlist_handler(request, resource):
    # You have access to the resource type specified in the URL.
    ...
You can read more about named groups in URL patterns in the Django URL dispatcher documentation.
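The generic pattern can be tried out with plain re before wiring it into urls.py. A small sketch, assuming Django matches against the path with the leading slash removed; resource_for is a hypothetical helper name.

```python
import re

PATTERN = re.compile(r"^(?P<resource>[a-z]+)_lists/view/$")

def resource_for(path):
    """Return the captured resource name, or None if the path does not match."""
    match = PATTERN.match(path.lstrip("/"))
    return match.group("resource") if match else None

for path in ["/hotel_lists/view/", "/photo_lists/view/", "/review_lists/view/", "/lists/view/"]:
    print(path, "->", resource_for(path))
```

The last path is there to show that a URL without a prefix before _lists is rejected, since [a-z]+ needs at least one character.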

Capture URL with query string as parameter in Flask route

Is there a way for Flask to accept a full URL as a URL parameter?
I am aware that <path:something> accepts paths with slashes. However I need to accept everything including a query string after ?, and path doesn't capture that.
http://example.com/someurl.com?andother?yetanother
I want to capture someurl.com?andother?yetanother. I don't know ahead of time what query args, if any, will be supplied. I'd like to avoid having to rebuild the query string from request.args.
The path converter lets you capture more complicated route patterns such as URLs:
from flask import Flask
app = Flask(__name__)
@app.route('/catch/<path:foo>')
def catch(foo):
    print(foo)
    return foo
Everything after the ? is the query string, so it won't be included in that pattern. You can either read it from request.query_string or rebuild it from request.args, as mentioned in the comments.
Due to the way routing works, you will not be able to capture the query string as part of the path. Use <path:path> in the rule to capture arbitrary paths. Then access request.url to get the full URL that was accessed, including the query string. request.url always includes a ? even if there was no query string. It's valid, but you can strip that off if you don't want it.
from flask import Flask, request
app = Flask(__name__)
@app.route("/<path:path>")
def index(path=None):
    return request.url.rstrip("?")
For example, accessing http://127.0.0.1:5000/hello?world would return http://127.0.0.1:5000/hello?world.
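The split between path and query string can be reproduced with the standard library, independent of Flask. In this sketch, embedded_target is a hypothetical helper: it rebuilds someurl.com?andother?yetanother from the example URL in the question, which is roughly what combining the captured path with request.query_string would give.

```python
from urllib.parse import urlsplit

def embedded_target(url):
    """Hypothetical helper: path (minus leading slash) plus raw query string."""
    parts = urlsplit(url)
    target = parts.path.lstrip("/")
    if parts.query:
        target += "?" + parts.query
    return target

print(embedded_target("http://example.com/someurl.com?andother?yetanother"))
# someurl.com?andother?yetanother
```

Only the first ? separates path from query; later ones stay inside the query string untouched, so nothing has to be rebuilt from parsed arguments.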

How to reliably extract URLs contained in URLs with Python?

Many search engines track clicked URLs by adding the result's URL to the query string which can take a format like: http://www.example.com/result?track=http://www.stackoverflow.com/questions/ask
In the above example the result URL is part of the query string but in some cases it takes the form http://www.example.com/http://www.stackoverflow.com/questions/ask or URL encoding is used.
The approach I tried first is searchengineurl.split("http://"). Some obvious problems with this:
it would return all parts of the query string that follow the result URL and not just the result URL. This would be a problem with a URL like this: http://www.example.com/result?track=http://www.stackoverflow.com/questions/ask&showauthor=False&display=None
it does not distinguish between any additional parts of the search engine tracking URL's query string and the result URL's query string. This would be a problem with a URL like this: http://www.example.com/result?track=http://www.stackoverflow.com/questions/ask?showauthor=False&display=None
it fails if the "http://" is omitted in the result URL
What is the most reliable, general and non-hacky way in Python to extract URLs contained in other URLs?
I would try urlparse.urlparse (urllib.parse.urlparse in Python 3). It will probably get you most of the way there, and a little extra work on your end will get you what you want.
This works for me.
from urllib.parse import urlparse, unquote  # Python 3 location of urlparse/unquote
urls = ["http://www.example.com/http://www.stackoverflow.com/questions/ask",
        "http://www.example.com/result?track=http://www.stackoverflow.com/questions/ask&showauthor=False&display=None",
        "http://www.example.com/result?track=http://www.stackoverflow.com/questions/ask?showauthor=False&display=None",
        "http://www.example.com/result?track=http%3A//www.stackoverflow.com/questions/ask%3Fshowauthor%3DFalse%26display%3DNone"]
def clean(url):
    parsed = urlparse(url)
    # Case 1: the result URL is embedded directly in the path.
    index = parsed.path.find("http")
    if index != -1:
        return parsed.path[index:]
    # Case 2: the result URL is the value of a query-string parameter.
    query = parsed.query
    query = query[query.index("http"):]
    index_questionmark = query.find("?")
    index_ampersand = query.find("&")
    # An "&" with no "?" before it belongs to the tracker's own parameters, so cut there.
    # Checking for -1 first avoids slicing off the last character when there is no "&".
    if index_ampersand != -1 and (index_questionmark == -1 or index_questionmark > index_ampersand):
        query = query[:index_ampersand]
    return unquote(query)
for url in urls:
    print(clean(url))
> http://www.stackoverflow.com/questions/ask
> http://www.stackoverflow.com/questions/ask
> http://www.stackoverflow.com/questions/ask?showauthor=False&display=None
> http://www.stackoverflow.com/questions/ask?showauthor=False&display=None
I don't know about Python specifically, but I would use a regular expression to get the parts (key=value) of the query string, with something like...
(?:\?|&)[^=]+=([^&]*)
That captures the "value" parts. I would then decode those and check them against another pattern (probably another regex) to see which one looks like a URL. I would just check the first part, then take the whole value. That way your pattern doesn't have to account for every possible type of URL (and presumably they didn't combine the URL with something else within a single value field). This should work with or without the protocol being specified (it's up to your pattern to determine what looks like a URL).
As for the second type of URL... I don't think there is a non-hacky way to parse that. You could URL-decode the entire URL, then look for the second instance of http:// (or https://, and/or any other protocols you might run across). You would have to decide whether any query strings are part of "your" URL or the tracker URL. You could also not decode the URL and attempt to match on the encoded values. Either way will be messy, and if they don't include the protocol it will be even worse! If you're working with a set of specific formats, you could work out good rules for them... but if you just have to handle whatever they happen to throw at you... I don't think there's a reliable way to handle the second type of embedding.
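In Python, the regex idea from this answer might look like the sketch below. The URL_LIKE pattern and the function name embedded_urls are assumptions for illustration; what counts as "looks like a URL" is a heuristic you would tune for your own data.

```python
import re
from urllib.parse import unquote

# Captures the value of each key=value pair in a query string.
VALUE_RE = re.compile(r"(?:\?|&)[^=]+=([^&]*)")
# Assumed heuristic for "looks like a URL": optional scheme, then host.tld.
URL_LIKE = re.compile(r"^(?:https?://)?[\w-]+(?:\.[\w-]+)+(?:[/?#]|$)")

def embedded_urls(url):
    """Return the decoded query values that look like URLs."""
    values = (unquote(v) for v in VALUE_RE.findall(url))
    return [v for v in values if URL_LIKE.match(v)]

tracker = ("http://www.example.com/result"
           "?track=http%3A//www.stackoverflow.com/questions/ask"
           "&showauthor=False&display=None")
print(embedded_urls(tracker))
# ['http://www.stackoverflow.com/questions/ask']
```

Because only the start of each decoded value is checked, the pattern does not have to describe every possible URL, and values without a protocol such as www.stackoverflow.com/questions/ask are still accepted.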
