Flask URL encoding mis [duplicate] - python

My application frequently takes URL-encoded strings as a URL parameter. Often these strings look like paths with a leading slash, e.g. /file/foo. In Flask, I have an endpoint that takes a path parameter, and I send a URL-encoded path to it. So I have something that looks like:
import flask
app = flask.Flask("Hello World")
@app.route("/blah/<path:argument>", methods=["GET"])
def foo(argument):
    return "GOT: %s" % argument
if __name__ == "__main__":
    app.run(debug=True)
This works great if I visit this URL:
http://localhost:5000/blah/cats%2F
returns:
GOT: cats/
But a leading slash with %2F fails with 404 in the case of GET and 405 in the case of POST. In other words, this 404s:
http://localhost:5000/blah/%2Fcats
In my research on this problem, I was led to believe here that URL encoding was sufficient to solve the problem. However, that doesn't appear to be the case.

This is because of how Werkzeug parses URLs. It decodes the encoded slashes before matching the route, so they still appear as leading slashes. There are bug reports about this:
https://github.com/mitsuhiko/flask/issues/900
https://github.com/mitsuhiko/werkzeug/pull/478
The second link provides a patch to perform this decoding after routing, but it is not merged.
It looks like the best solution at this point is to follow Martijn's answer here.
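To make the failure mode concrete, here is a minimal standard-library sketch of "decode first, match second". It assumes a simplified stand-in for the path converter (a pattern whose first character may not be a slash); the names PATH_RULE and match_after_decoding are hypothetical, and this is not Werkzeug's actual routing code.

```python
import re
from urllib.parse import unquote

# Simplified stand-in for "/blah/<path:argument>": the captured value
# must not begin with a slash (an assumption made for illustration).
PATH_RULE = re.compile(r"^/blah/(?P<argument>[^/].*?)$")


def match_after_decoding(raw_path):
    """Percent-decode the path first, then try to match the rule."""
    decoded = unquote(raw_path)  # %2F becomes a literal "/"
    match = PATH_RULE.match(decoded)
    return match.group("argument") if match else None


print(match_after_decoding("/blah/cats%2F"))  # trailing slash: matches
print(match_after_decoding("/blah/%2Fcats"))  # leading slash: no match
```

Because the decoding happens before the match, the rule sees /blah//cats in the second case, which is why encoding the slash on the client does not help.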

One way to get around this without defining your own PathConverter is having two route filters:
import flask
app = flask.Flask("Hello World")
@app.route("/blah/<path:argument>", methods=["GET"])
@app.route("/blah//<path:argument>", methods=["GET"])
def foo(argument):
    return "GOT: %s" % argument
if __name__ == "__main__":
    app.run(debug=True)
Hitting this with:
http://localhost:5000/blah/%2Fcats
Gives me:
GOT: cats
And with:
http://localhost:5000/blah//cats
Gives me:
GOT: cats
But a better (cleaner) solution is probably the one described in this SO answer: Flask route using path with leading slash

Related

Cannot route non-ascii URLs Flask

I have a problem routing a URL to my Flask app, specifically when opening it in a web browser. All I want is to pass the hash symbol "#" together with some Russian words (such as "#привет" or "#ПомогитеМнеПожалуйста").
My programming code at the moment looks like this:
# -*- coding: utf-8 -*-
from flask import Flask, jsonify
app = Flask(__name__)
@app.route('/hashtags/<names>', methods=['GET'])
def get_hashtags(names):
    return jsonify({'Segmentation Hashtags': names})
if __name__ == '__main__':
    app.run(port=9876)
So, basically, <names> is the parameter of the get_hashtags function, which I use to return the hashtag to the web browser via jsonify. I need to find a way to pass any hashtag I want, including the hash symbol "#" and Russian letters. As far as I know, there are encoding methods (something like .decode('utf-8')), but I have no idea how to use them properly.
Thanks in advance!
The hashtags are causing the error. You can try to remove them on the client side, and just request this link instead:
/hashtags/привет
Hashtags in the url are often used to tell the browser which element on a page to jump to. For instance, on https://en.wikipedia.org/wiki/Stack_Overflow#Technology
the #Technology means jump to the technology section of the page.
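If the "#" has to reach the server, one option is to percent-encode it on the client. The sketch below uses only the Python 3 standard library and assumes the host and port from the question; it shows how a URL parser splits the unencoded and encoded forms, and does not involve Flask itself.

```python
from urllib.parse import quote, unquote, urlsplit

tag = "#привет"
base = "http://localhost:9876/hashtags/"

# A literal "#" starts the fragment, so the tag never becomes part of the path.
unencoded = urlsplit(base + tag)

# Percent-encoding the whole segment (safe="" also encodes "/") keeps "#" in the path.
encoded_tag = quote(tag, safe="")
encoded = urlsplit(base + encoded_tag)

print(unencoded.path, unencoded.fragment)
print(encoded_tag)
print(unquote(encoded.path))
```

The first URL has the path /hashtags/ with the word moved into the fragment, which browsers do not send to the server; the second keeps the whole tag in the path as %23 followed by the UTF-8 bytes of the word.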
Try unidecode:
from unidecode import unidecode
@app.route('/hashtags/<string:names>', methods=['GET'])
def get_hashtags(names):
    return jsonify({'Segmentation Hashtags': unidecode(names)})

Passing arguments containing slashes to bottle

I need to pass strings containing slashes through the last argument in a URL to my Bottle server, but since slashes are treated as argument separators, the server doesn't handle them the way I need.
I found a page about how flask supports this:
http://flask.pocoo.org/snippets/76/
But I haven't found a similar solution in Bottle yet.
Sounds like you want :path:
:path matches all characters including the slash character in a
non-greedy way and may be used to match more than one path segment.
For example,
from bottle import route
@route('/root/<thepath:path>')
def callback(thepath):
    # `thepath` is everything after "/root/" in the URI.
    ...
EDIT: In response to OP's comment (below), here's a snippet which works for me:
from bottle import Bottle
app = Bottle()
@app.route('/add/<uid>/<collection>/<group>/<items:path>')
def add(uid, collection, group, items):
    return 'your uri path args: {}, {}, {}, {}\n'.format(uid, collection, group, items)
app.run(host='0.0.0.0', port=8081)
Yields:
% ~>curl 'http://127.0.0.1:8081/add/1/2/3/and/now/a/path'
your uri path args: 1, 2, 3, and/now/a/path

Webapp2 strict_slash returns KeyError: 'Missing Argument' for url with trailing slash when method has 2 or more args... Bug?

I've built my urls as such:
#url = /index/test/argument/second
# Maps to the Index handler's test method and passes in the optional arguments 'argument' and 'second'
# So the handler function looks like this:
def test(argument=None, second=None):
    print 'test'
I'm using strict_slash from webapp2 so the handlers with a trailing slash get redirected to handlers without a trailing slash.
#url = /index/ redirects perfectly to /index
#url = /index/test/ # KEYERROR!!
So even though index/test is routed before index/test/second, webapp2 is ignoring the redirect for trailing slashes and returning an error because it's looking (too hard) for the second argument. I think it should recognize there is no second argument so follow the strict_slash redirect route.
This works in all cases except with argument passing. Any insight, anyone?
To solve this problem, you just need to set a unique name argument for each route.

Specifying custom URL schema in appengine using app.yaml?

I am trying to have a custom URL which looks like this:
example.com/site/yahoo.com
which would hit this script like this:
example.com/details?domain=yahoo.com
can this be done using app.yaml?
the basic idea is to call "details" with the input "yahoo.com"
You can't really rewrite the URLs per se, but you can use regular expression groups to perform a similar kind of thing.
In your app.yaml file, try something like:
handlers:
- url: /site/(.+)
  script: site.py
And in your site.py:
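from google.appengine.ext import webapp
from google.appengine.ext.webapp import util
class SiteHandler(webapp.RequestHandler):
    def get(self, site):
        # the site parameter will be what was passed in the URL!
        pass
def main():
    application = webapp.WSGIApplication([('/site/(.+)', SiteHandler)], debug=True)
    util.run_wsgi_app(application)
if __name__ == '__main__':
    main()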
What happens is, whatever you have after /site/ in the request URL will be passed to SiteHandler's get() method in the site parameter. From there you can do whatever it is you wanted to do at /details?domain=yahoo.com, or simply redirect to that URL.
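The capture-group idea can be tried outside App Engine. The following is a rough Python 3 sketch using only the standard library; SITE_ROUTE and details_url_for are hypothetical names, and the function only builds the /details URL that a handler could redirect to.

```python
import re
from urllib.parse import urlencode

# Same pattern as in app.yaml and the WSGIApplication route above.
SITE_ROUTE = re.compile(r"^/site/(.+)$")


def details_url_for(path):
    """Return the equivalent /details URL, or None if the path does not match."""
    match = SITE_ROUTE.match(path)
    if match is None:
        return None
    # The captured group plays the role of the `site` argument of get().
    return "/details?" + urlencode({"domain": match.group(1)})


print(details_url_for("/site/yahoo.com"))
```

For /site/yahoo.com this produces /details?domain=yahoo.com, the URL from the question.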

How to display a page in my browser with python code that is run locally on my computer with "GAE" SDK?

When I run this code on my computer with the help of "Google App Engine SDK", it displays (in my browser) the HTML code of the Google home page:
from google.appengine.api import urlfetch
url = "http://www.google.com/"
result = urlfetch.fetch(url)
print result.content
How can I make it display the page itself? I mean I want to see that page in my browser the way it would normally be seen by any user of the internet.
Update 1:
I see I have received a few questions that look a bit complicated to me, although I definitely remember I was able to do it, and it was very simple, except I don't remember what exactly I changed in this code.
Perhaps I didn't give you all enough details on how I run this code and where I found it. So, let me tell you what I did. I only installed Python 2.5 on my computer and then downloaded "Google App Engine SDK" and installed it, too. Following the instructions on the "GAE" page (http://code.google.com/appengine/docs/python/gettingstarted/helloworld.html), I created a directory and named it “My_test”, then I created a “my_test.py” in it containing the small piece of code that I mentioned in my question.
Then, continuing to follow on the said instructions, I created an “app.yaml” file in it, in which my “my_test.py” file was mentioned. After that in “Google App Engine Launcher” I found “My_test” directory and clicked on Run button, and then on Browse. Then, having visited this URL http://localhost:8080/ in my web browser, I saw the results.
I definitely remember I was able to display any page in my browser in this way, and it was very simple, except I don’t remember what exactly I changed in the code (it was a slight change). Now, all I can see is a raw HTML code of a page, but not a page itself.
Update 2:
(this update is my response to wescpy)
Hello, wescpy!!! I've tried your updated code and something didn't work well there. Perhaps it's because I am not using a certain framework that I am supposed to use for this code. Please take a look at this screenshot (I guess you'll need to right-click this image to see it in better resolution):
(source: narod.ru)
It's not that easy: you have to parse the content and convert relative paths to absolute ones for images and JavaScript.
Anyway, give it a try adding the correct Content-Type:
from google.appengine.api import urlfetch
url = "http://www.google.com/"
result = urlfetch.fetch(url)
print 'Content-Type: text/html'
print ''
print result.content
A more complete example would look something like this:
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app
from google.appengine.api import urlfetch
class MainHandler(webapp.RequestHandler):
    def get(self):
        url = "http://www.google.com/"
        result = urlfetch.fetch(url)
        self.response.out.write(result.content)
application = webapp.WSGIApplication([
    ('/', MainHandler),
], debug=True)
def main():
    run_wsgi_app(application)
if __name__ == '__main__':
    main()
But as others have said, it's not that easy to do because you're not in the server's domain, meaning the pages will likely not look correct due to missing static content (JS, CSS, and/or images), unless full pathnames are used or everything that's needed is embedded into the page itself.
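As a rough illustration of why relative references break, the Python 3 standard library can resolve them against the page's original URL. This is only a sketch: absolutize is a hypothetical helper, the example paths are made up, and a real page would also need its HTML parsed to find every reference.

```python
from urllib.parse import urljoin

PAGE_URL = "http://www.google.com/"


def absolutize(reference, page_url=PAGE_URL):
    """Resolve a link found in the fetched HTML against the page it came from."""
    return urljoin(page_url, reference)


# Hypothetical references as they might appear in src/href attributes.
for ref in ["/images/logo.png", "js/app.js", "http://example.com/style.css"]:
    print(ref, "->", absolutize(ref))
```

Served from localhost, the first two references would point at the local server instead of the original site; resolving them against PAGE_URL restores the intended targets, while already-absolute URLs are left unchanged.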
UPDATE 1:
As mentioned before, you cannot just download the HTML source and expect things to render correctly, because you don't necessarily have access to the static data. If you really want to render the page as it was meant to be seen, you have to redirect. Here's the modified piece of code:
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app
from google.appengine.api import urlfetch
class MainHandler(webapp.RequestHandler):
    def get(self):
        url = "http://www.google.com/"
        self.redirect(url)
application = webapp.WSGIApplication([
    ('/', MainHandler),
], debug=True)
def main():
    run_wsgi_app(application)
if __name__ == '__main__':
    main()
UPDATE 2:
Sorry! It was a cut-and-paste error. Now try it.
Special characters such as < and > are likely encoded; you'd have to decode them again for the browser to interpret them as code.
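If the markup really has been entity-encoded, Python 3's standard html module can reverse it. This is a small sketch with a made-up input string, not the actual content returned by urlfetch.

```python
import html

# Hypothetical entity-encoded markup, as it would show up as visible text.
escaped = "&lt;h1&gt;Hello&lt;/h1&gt;"

# unescape() turns the entities back into the characters a browser treats as tags.
decoded = html.unescape(escaped)

print(decoded)
```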
