Flask route with URI encoded component - python

It seems Flask doesn't support routes with a URI encoded component. I'm curious if I'm doing something wrong, or if there is a special flag I need to include.
My route looks something like this:
    @app.route('/foo/<encoded>/bar/')
    def foo(encoded):
        # ...
        pass
The URLs that this should match look like these:
http://foobar.com/foo/xxx/bar/ # matched correctly, no URI component
http://foobar.com/foo/x%2Fx%2Fx%2F/bar/ # not matched correctly, URI component
The former URL works; the latter spits out a lovely 404.
Thanks!

Add the path converter to your URL rule:
    @app.route('/foo/<path:encoded>/bar/')
Update per comment: the route API docs are here: http://flask.pocoo.org/docs/api/#flask.Flask.route. The underlying classes that implement the path-style route converter are documented here: http://werkzeug.pocoo.org/docs/routing/#custom-converters (this is one of the really nice parts of pocoostan). As for trailing slashes, there are special rules that amount to:
If a rule ends with a slash and is requested without a slash by the
user, the user is automatically redirected to the same page with a
trailing slash attached.
If a rule does not end with a trailing slash and the user requests the
page with a trailing slash, a 404 not found is raised.
Also keep in mind that if you are on Apache and are expecting a slash-trailed URL, e.g. a bookmarklet that submits to http://ex.com/foo/<path:encoded>/bar, and encoded contains double slashes, Apache will collapse multiple slashes into a single one.
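
To illustrate why the plain rule fails and the path converter helps, here is a minimal sketch using only the standard library. It assumes, as is usual for WSGI servers, that the application sees the already-decoded path, so %2F arrives as a literal slash. The two regular expressions are rough stand-ins for the default and path converters, not Werkzeug's actual implementation.

```python
import re
from urllib.parse import unquote

# The server decodes the request path before routing, so %2F becomes "/".
raw_path = "/foo/x%2Fx%2Fx%2F/bar/"
decoded_path = unquote(raw_path)

# Assumed approximations of the two converters (not Werkzeug's exact regexes):
# the default converter accepts a single segment without slashes,
# the "path" converter also accepts slashes.
default_rule = re.compile(r"^/foo/(?P<encoded>[^/]+)/bar/$")
path_rule = re.compile(r"^/foo/(?P<encoded>[^/].*?)/bar/$")

default_match = default_rule.match(decoded_path)  # no match, hence the 404
path_match = path_rule.match(decoded_path)        # matches, slashes included

print(decoded_path)
print(default_match)
print(path_match.group("encoded"))
```

Note that the value captured by the path-style rule contains the decoded slashes, so the view receives them as part of the argument.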


django.urls.reverse(): URL-encoding the slash

When one supplies URL args or kwargs to a django.urls.reverse() call,
Django will nicely URL-encode non-ASCII characters and URL-reserved characters.
For instance, given a declaration such as
path("prefix/<stuff>", view=MyView.as_view(), name="myurl")
we get
reverse('myurl', args=['aaa bbb']) == "/prefix/aaa%20bbb"
reverse('myurl', args=['aaa%bbb']) == "/prefix/aaa%25bbb"
reverse('myurl', args=['Ä']) == "/prefix/%C3%84"
and so on. So far, so good.
What Django will not encode, however, is the slash:
reverse('myurl', args=['aaa/bbb'])
will give us
django.urls.exceptions.NoReverseMatch:
Reverse for 'myurl' with arguments '('aaa/bbb',)' not found.
1 pattern(s) tried: ['prefix/(?P<stuff>[^/]+)$']
(The question what to encode and what not
has been discussed as a Django issue.
It's complicated.)
I found a remark in the code that may explain why
the slash is a special case:
_reverse_with_prefix in django/urls/resolvers.py contains a comment that says
# WSGI provides decoded URLs, without %xx escapes, and the URL
# resolver operates on such URLs. First substitute arguments
# without quoting to build a decoded URL and look for a match.
# Then, if we have a match, redo the substitution with quoted
# arguments in order to return a properly encoded URL.
Given that unencoded arguments are used in the matching initially,
it is no wonder that it does not work:
The slash looks like the end of the argument to Django and so there is
one more argument than expected.
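
The two-step procedure described in that comment can be sketched roughly as follows. This is a simplified illustration, not Django's code: sketch_reverse is a hypothetical helper, the pattern is the regex from the error message above, and quoting with safe="" is an assumption made to keep the example small.

```python
import re
from urllib.parse import quote

# Regex taken from the NoReverseMatch message above.
PATTERN = re.compile(r"^prefix/(?P<stuff>[^/]+)$")


def sketch_reverse(arg: str) -> str:
    """Hypothetical, simplified model of the match-then-quote procedure."""
    # Step 1: substitute the argument without quoting.
    candidate = "prefix/" + arg
    # Step 2: the decoded candidate must match the URL pattern.
    if not PATTERN.match(candidate):
        raise LookupError(f"no reverse match for {arg!r}")
    # Step 3: redo the substitution with a quoted argument.
    return "/prefix/" + quote(arg, safe="")


print(sketch_reverse("aaa bbb"))
print(sketch_reverse("aaa%bbb"))
try:
    sketch_reverse("aaa/bbb")
except LookupError as exc:
    print("failed:", exc)
```

The slash never reaches the quoting step, because the unquoted candidate prefix/aaa/bbb already fails to match [^/]+ in step 2.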
My question:
I dearly want to use user-supplied data in natural-looking URLs,
so slashes occur occasionally. How can I make them work?
The URL structure I need is basically
/show_rooms/<organization>/<department>/<building>
I can think of these approaches:
(1) Replace a slash in an argument with some exotic Unicode character
    that will never occur otherwise. And back for received arguments.
    This would sort of do the job, but is inconvenient,
    non-standard, and therefore ugly.
(2) Use slugs instead of the real names.
    This would require extending my models to store the slugs (because
    the ORM needs to find objects by them) and appears out of proportion to me.
(3) URL-quote my arguments before passing them to reverse()
    and unquote arguments when I receive them.
    This is as inconvenient as (1).
    It leads to URLs that are more difficult to read than
    those from (1), because each % produced by quoting
    will subsequently be encoded as %25.
    But at least it is a standard-ish approach.
Sigh. Is this really the "right" way?
Any comments or fourth solutions are welcome!
Now that I've written it up, solution (1) does not look quite
so horrible to me. What replacement character would you use for a slash?
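
The double encoding mentioned under approach (3) can be seen with the standard library alone. This is only a sketch of the idea: the second quote() call stands in for the encoding that reverse() applies, and the first unquote() stands in for the decoding done before the view is called.

```python
from urllib.parse import quote, unquote

name = "aaa/bbb"

# Quote the argument ourselves before handing it to reverse().
pre_quoted = quote(name, safe="")

# Stand-in for reverse(): the "%" from the first pass is encoded again as %25.
in_url = quote(pre_quoted, safe="")

# Stand-in for the server side: one decoding pass happens before the view runs,
# then the view unquotes once more to recover the original value.
received_by_view = unquote(in_url)
restored = unquote(received_by_view)

print(pre_quoted)
print(in_url)
print(restored)
```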
I suggest you try passing the slash in a regular URL and see whether your view is able to match it. If it is, then the problem is in the reverse function itself, not in the view. How about passing the slash already encoded as %2F?
It's me, the asker.
Here is the solution that I finally used:
I decided on solution (1).
It turned out less inconvenient than I had expected and
it works spectacularly well.
My Firefox browser shows the URL in text form, not urlencoded form,
and when you pick the right replacement character it looks almost natural.
Very nice.
Here is the code for the escaping (to be called in the template, hence a
custom templatetag)
and unescaping (to be called in the view):
    import django.template as djt

    register = djt.Library()
    # see https://docs.djangoproject.com/en/stable/howto/custom-template-tags/

    ALT_SLASH = '\N{DIVISION SLASH}'


    @register.filter
    def escape_slash(urlparam: str) -> str:
        """
        Avoid having a slash in the urlparam URL part,
        because it would not get URL-encoded.
        See https://stackoverflow.com/questions/67849991/django-urls-reverse-url-encoding-the-slash
        Possible replacement characters are
          codepoint  char  utf8      name               oldname
          U+2044     ⁄     e2 81 84  FRACTION SLASH
          U+2215     ∕     e2 88 95  DIVISION SLASH
          U+FF0F     ／    ef bc 8f  FULLWIDTH SOLIDUS  FULLWIDTH SLASH
        None of them will look quite right if the browser shows the char rather than
        the %-escape in the address line, but DIVISION SLASH comes close.
        The normal slash is
          U+002F     /     2f        SOLIDUS            SLASH
        To get back the urlparam after calling escape_slash,
        the URL will be formed (via {% url ... %} or reverse()) and URL-encoded,
        sent to the browser,
        received by Django in a request, URL-unencoded, split,
        its param parts handed to a view as args or kwargs,
        and finally unescape_slash will be called by the view.
        """
        return urlparam.replace('/', ALT_SLASH)


    def unescape_slash(urlparam_q: str) -> str:
        """Undo escape_slash; to be called in the view on received URL params."""
        return urlparam_q.replace(ALT_SLASH, '/')
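
As a quick check of the round trip described in the docstring, here is a standalone sketch without Django. The two functions are repeated without the template filter registration, quote() and unquote() stand in for the encoding and decoding that Django and the browser perform, and the sample value is made up.

```python
from urllib.parse import quote, unquote

ALT_SLASH = "\N{DIVISION SLASH}"


def escape_slash(urlparam: str) -> str:
    return urlparam.replace("/", ALT_SLASH)


def unescape_slash(urlparam_q: str) -> str:
    return urlparam_q.replace(ALT_SLASH, "/")


name = "R&D/Lab 2"                 # hypothetical user-supplied value
escaped = escape_slash(name)       # no ordinary slash left
in_url = quote(escaped, safe="")   # stand-in for the URL-encoding step
received = unquote(in_url)         # stand-in for the decoding on receipt
restored = unescape_slash(received)

print(in_url)
print(restored == name)
```

One caveat with this design: a value that already contains the replacement character would be changed by the round trip, so the character has to be one that never occurs in the data.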

Routes with a leading/trailing slash and without slashes

Could you please explain to me the difference between:
    @app.route('/something')
compared to:
    @app.route('something/')
and also compared to:
    @app.route('something')
so I can better distinguish them?
In a word: /foo is the normal use case, /foo/ is used when you want the URL to look like a path/folder, and foo is wrong. If I'm wrong, please correct me.
The URL rule should start with a slash (/).
/foo and /foo/ are two different URL rules; see the details in the docs:
The following two rules differ in their use of a trailing slash.
    @app.route('/projects/')
    def projects():
        return 'The project page'

    @app.route('/about')
    def about():
        return 'The about page'
The canonical URL for the projects endpoint has a trailing slash. It’s similar to a folder in a file
system. If you access the URL without a trailing slash, Flask
redirects you to the canonical URL with the trailing slash.
The canonical URL for the about endpoint does not have a trailing
slash. It’s similar to the pathname of a file. Accessing the URL with
a trailing slash produces a 404 “Not Found” error. This helps keep
URLs unique for these resources, which helps search engines avoid
indexing the same page twice.
Link: http://flask.pocoo.org/docs/1.0/quickstart/#unique-urls-redirection-behavior
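
The behaviour quoted above can be summed up in a small decision function. This is a toy model of the documented rules only, not Flask's routing code; trailing_slash_outcome is a hypothetical name.

```python
def trailing_slash_outcome(rule: str, requested: str) -> str:
    """Toy model of the documented trailing-slash rules for a single rule."""
    if requested == rule:
        return "200"
    # Rule ends with a slash, request lacks it: redirect to the canonical URL.
    if rule.endswith("/") and requested + "/" == rule:
        return "redirect to " + rule
    # Everything else, including an extra trailing slash, is not found.
    return "404"


print(trailing_slash_outcome("/projects/", "/projects"))
print(trailing_slash_outcome("/about", "/about/"))
print(trailing_slash_outcome("/about", "/about"))
```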

No trailing slash results in a 404 error in Flask app

-- Resolved without changing anything. Will update when I know what caused it to temporarily not work --
I made a route to a URL without a trailing slash on Flask. This should work according to the docs:
timestamp = "temptime"
    @app.route('/js/searchindex<time>.js')
    def searchindex(time):
        return render_template('searchindex.js')
What happens is that the URL gets redirected to the URL with trailing slash, which obviously results in a 404 error. So http://127.0.0.1:5000/js/searchindextemptime.js becomes http://127.0.0.1:5000/js/searchindextemptime.js/.
I am aware that I can set the slashes to non-strict to avoid the 404, but I really want the URL without a trailing slash.

Twisted, how to "putChild()" with trailing slash?

I'm using Twisted and made a web server, but when I try to request a page with a trailing slash I get
"No Such Resource - No such child resource."
I tried all of these
self.putChild('login', Login(self))
self.putChild('/login/', Login(self))
self.putChild('/login', Login(self))
self.putChild('login/', Login(self))
I even tried overriding the 'getChildWithDefault' method and requesting pages both with and without the trailing slash. It always says the path is 'login', with no slashes, so it should always match the first line, but for whatever reason it doesn't.
Does anyone know how to add a child resource with a trailing slash?
You can't pass a slash to putChild; it will be escaped by the URL traversal logic, because the argument is a single path segment.
Assuming that Login is itself a Resource, though, you can add it as its own child under the empty segment, so that both /login and /login/ will work, like so:
l = Login(self)
l.putChild("", l)
self.putChild("login", l)
You can of course make /login without the trailing slash a resource of your own design, or a twisted.web.util.Redirect that adds a slash; assemble your resources in whichever configuration you prefer :).
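
To see why the empty-string child makes /login/ work, here is a toy model of segment-by-segment traversal. ToyResource, put_child and locate are hypothetical names and this is not Twisted's Resource class; the sketch only assumes that the path is split on slashes, so a trailing slash produces one extra, empty segment.

```python
class ToyResource:
    """Toy stand-in for a resource tree; not Twisted's actual Resource."""

    def __init__(self, name):
        self.name = name
        self.children = {}

    def put_child(self, segment, child):
        # One path segment per child, so a segment never contains a slash.
        self.children[segment] = child

    def locate(self, path):
        node = self
        # "/login/" splits into ["", "login", ""]; drop the leading empty part.
        for segment in path.split("/")[1:]:
            node = node.children.get(segment)
            if node is None:
                return None  # corresponds to "No such child resource"
        return node


root = ToyResource("root")
login = ToyResource("login")
login.put_child("", login)      # the trailing slash maps to the empty segment
root.put_child("login", login)

other = ToyResource("other")    # no empty-segment child registered
root.put_child("other", other)

print(root.locate("/login").name, root.locate("/login/").name)
print(root.locate("/other/"))
```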

Urls.py unable to pass # (pound) character to a view in Django

I pass data through a Django URL, but the # character does not get through urls.py. I am using this pattern:
    url(r'^pass/(?P<sentence>[\w|\W]*)/$',pass)
I also tried this pattern:
    url(r'^pass/(?P<sentence>[a-zA-Z0-9-/:-?##{-~!^_\'\[\]*]*)/$',pass)
Thanks in advance.
The "#" character marks inline anchors (links within the same page) in a URL, so the browser will never send it to Django.
For example, if the URL is /something/pass/#test/something-else/ the browser will send only /something/pass/ to the server. You can try /something/pass/%23test/something-else/ instead; 23 is the hexadecimal ASCII code for #. It is not pretty (and if it is going to be ugly anyway, just pass it as a GET variable instead).
There is nothing you can do on the Django side. You had better avoid characters with special meanings in the URL path when designing your routes. Of course it is a matter of taste, but I really think that strings passed in the URL path should be "slugified" in order to remove any funny characters.
Browsers won't send the URL fragment part (which starts with "#") to servers. Why not convert your data to base64 first, then pass the data via the URL?
RFC 1808 (Relative Uniform Resource Locators) : Note that the fragment identifier (and the "#" that precedes it) is
not considered part of the URL. However, since it is commonly used
within the same string context as a URL, a parser must be able to
recognize the fragment when it is present and set it aside as part of
the parsing process.
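
The points made in these answers can be checked with the standard library. The sketch below shows how a URL parser sets the fragment aside, how # looks when percent-encoded, and the base64 idea as one possible variant (using the URL-safe alphabet is an assumption). The example.com URL and the sample value are made up.

```python
import base64
from urllib.parse import quote, unquote, urlsplit

# The parser separates the fragment; only the path would reach the server.
parts = urlsplit("http://example.com/something/pass/#test/something-else/")
print(parts.path)
print(parts.fragment)

# Percent-encoding the "#" keeps it inside the path component.
sentence = "#test"
encoded = quote(sentence, safe="")
print(encoded)

# Alternative from the last answer: base64 with the URL-safe alphabet.
token = base64.urlsafe_b64encode(sentence.encode("utf-8")).decode("ascii")
decoded = base64.urlsafe_b64decode(token).decode("utf-8")
print(decoded == sentence)
```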
