Django sendfile download -- Page not found - python

I'm using the Django sendfile module to serve files to the user. Docs:
https://github.com/johnsensible/django-sendfile
I used the simple backend. More precisely, I put SENDFILE_BACKEND = 'sendfile.backends.simple' in my settings.py.
I have checked and double-checked that the files are there. This is my code (I think only the request function is relevant, but I'm including the whole function because I use pk in urls.py):
def permit(request, pk):
    if int(request.user.id) == int(pk) and int(request.user.id) >= 1:
        return sendfile(request, request.path)
    else:
        return render_to_response('forbidden.html')
    return HttpResponseRedirect('/notendur/list')
user is a Django User object. pk is the value captured by the regular expression in urls.py.
And the error I get is
404: [path to file] does not exist.
This is the relevant entry in project/urls.py:
url(r'^media/uploads/(?P<pk>[^/]+)', 'notendur.views.permit')
As you can see, urls.py routes the request to the permit function if the regex matches. If the user's ID is equal to the directory name (I name the directories by user ID), then the user is allowed to download the file, otherwise not.
I have confirmed that this error is due to the sendfile module because the download works fine if I serve the file directly, without the sendfile module.

First, a big warning: what you are doing is dangerous. You are trusting your user to give you a path. You must always sanitize this!
Now to your issue: rather than passing a path relative to the current directory, it is better practice to pass an absolute path built from a root media path set in your settings file, then do:
import os
from django.conf import settings
from django.http import Http404
sanitized_path = sanitize(request.path) # you'll have to write a sanitize function
media_path = "%s%s" % (settings.MEDIA_ROOT, sanitized_path)
if not os.path.exists(media_path): # Don't trust your visitors too much!
    raise Http404
return sendfile(request, media_path)
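A minimal sketch of what such a sanitize step might look like, using only the standard library. The function name resolve_under_root and its behaviour (returning None when the path escapes the root) are assumptions made for illustration; they are not part of django-sendfile.

```python
import os

def resolve_under_root(root, requested):
    """Map a user-supplied path onto `root`; return None if it would escape.

    `requested` is treated as relative to `root`, even if it starts with a
    slash. Hypothetical helper, not a django-sendfile API.
    """
    root = os.path.realpath(root)
    # Drop leading separators so os.path.join cannot discard `root`.
    relative = requested.lstrip("/\\")
    candidate = os.path.realpath(os.path.join(root, relative))
    try:
        inside = os.path.commonpath([root, candidate]) == root
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        inside = False
    return candidate if inside else None
```

A view could call this with settings.MEDIA_ROOT and the requested path, and return a 404 whenever the result is None or the file does not exist.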

Related

Is it always correct to use URLs like "./about.html" or "../about.htm" instead of Absolute URLS like /about?

I'm a computer science student. Recently we were tasked to develop a static HTTP server from scratch without using any HTTP modules, solely depending on socket programming. So this means that I had to write all the logic for HTTP message parsing, extracting headers, parsing URLs, etc.
However, I'm stuck on a point of confusion. Having some web development experience, I'm used to writing URLs in places like anchor tags as "/about" and "/articles/article-1". However, I've seen people sometimes use relative paths that follow their folder structure, like "./about.html" and "../contact.html". This always seemed like a bad idea to me. However, I realized that even though my code doesn't explicitly support these kinds of URLs, they seem to work anyhow.
Following is the Python code I'm using to get the path from the HTTP message and then get the corresponding path in the file system.
import os
from typing import Literal

def get_http_url(self, raw_request_headers: list[str]):
    """
    Method to get HTTP url by parsing request headers
    """
    if len(raw_request_headers) > 0:
        method_and_path_header = raw_request_headers[0]
        method_and_path_header_segments = method_and_path_header.split(" ")
        if len(method_and_path_header_segments) >= 2:
            """
            example: GET / HTTP/1.1 => ['GET', '/', 'HTTP/1.1'] => '/'
            """
            url = method_and_path_header_segments[1]
            return url
    return False

def get_resource_path_for_url(self, path: str | Literal[False]):
    """
    Method to get the resource path based on url
    """
    if not path:
        return False
    else:
        if path.endswith('/'):
            # Removing trailing '/' to make it easy to parse the url
            path = path[0:-1]
        # Split to see if the url also includes the file extension
        parts = path.split('.')
        if path == '':
            # if the requested path is "/"
            path_to_resource = os.path.join(
                os.getcwd(), "htdocs", "index.html")
        else:
            # Assumes the user entered a valid url with resources file extension as well, ex: http://localhost:2728/pages/about.html
            if len(parts) > 1:
                path_to_resource = os.path.join(
                    os.getcwd(), "htdocs", path[1:])  # Get the absolute path with the existing file extension
            else:
                # Assumes user requested a url without an extension and as such is hoping for a html response
                path_to_resource = os.path.join(
                    os.getcwd(), "htdocs", f"{path[1:]}.html")  # Get the absolute path to the corresponding html file
        return path_to_resource
So in my code, I'm not explicitly adding any logic to handle that kind of relative path. But when I use things like ../about.html in my test HTML files, it somehow works.
Is this the expected behavior? If so, I would like to know where this behavior is implemented. I'm on Windows for now, if that matters. And if this is expected, can I depend on this behavior and conclude that it's safe to refer to HTML files and other assets with relative paths like this on my web server?
Thanks in advance for any help, and I apologize if my question is not clear or well-formed.
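One likely explanation, offered as a sketch rather than a definitive answer: relative references such as ../about.html are normally resolved by the client against the URL of the current page before any request is sent, so the server only ever sees the resulting absolute path. Python's standard urllib.parse.urljoin performs the same kind of resolution. The page URL and link targets below are hypothetical.

```python
from urllib.parse import urljoin

def resolve_link(page_url, href):
    """Resolve an href found on page_url the way a client would."""
    return urljoin(page_url, href)

# Hypothetical page served by the static server from the question.
page = "http://localhost:2728/pages/about.html"

print(resolve_link(page, "../contact.html"))      # one directory up
print(resolve_link(page, "./team.html"))          # same directory
print(resolve_link(page, "/articles/article-1"))  # root-relative
```

If that is what is happening, the request line the server receives already contains a path like /contact.html, which would explain why get_resource_path_for_url needs no special handling. A hand-crafted request can still contain .. segments, so the server should not assume it never will.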

Different translations of the same static html file: how is it possible to force the user's browser to select one of them?

I have several translations (into different languages) of the same static html file. Those were obtained via Sphinx-Python, using the internationalization feature, which ends up with mo files. All this works fine.
However, I have no clue how to let the user's browser choose the translation that corresponds to the Accept-Language header sent by the browser itself.
The static files will be served via Flask's send_static_file.
Thanks a lot for any hint on how it's possible to force the browser to select the "good" translation from a bunch of static files.
I have a solution that I'd like to share.
I followed the instructions given here, which apply as long as the production server is Apache. The following lines are added for each translation of the same Sphinx files.
Alias /doc_en /location/of/sphinx/english_version/html
<Directory /location/of/sphinx/english_version/html>
    Order deny,allow
    Allow from all
</Directory>
On the Flask side, one can steer the user to the most appropriate translation (as defined by the user's browser) with the help of the Flask-Babel extension:
import os
from flask import Flask, redirect, request, url_for
from flask_babel import Babel
app = Flask(__name__)
app.config.from_pyfile('mysettings.cfg')
babel = Babel(app)
@babel.localeselector
def get_locale():
    bestl = request.accept_languages.best_match(['de', 'fr', 'en'])
    if bestl:
        return bestl
    else:
        return "en"
Then the static file is served via send_static_file:
@app.route('/<dir>/<filename>', methods=['GET'])
def doc(dir, filename):
    path = os.path.join(dir, filename)
    return app.send_static_file(path)
@app.route('/help')
def docs():
    return redirect(url_for('doc', dir='doc_' + get_locale(), filename='index.html'))
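To make the language negotiation step more concrete, here is a simplified, standard-library-only sketch of picking a translation from an Accept-Language header. It is not how best_match is implemented; the function names are hypothetical, and wildcards (*) and other details of the header grammar are deliberately ignored.

```python
def parse_accept_language(header):
    """Return (language, q) pairs from the header, highest q first."""
    prefs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        lang, _, params = part.partition(";")
        q = 1.0  # a missing q-value means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0
        prefs.append((lang.strip().lower(), q))
    # sorted() is stable, so equal q-values keep their header order.
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)

def pick_language(header, supported, default="en"):
    """Pick the first supported language, falling back to `default`."""
    for lang, q in parse_accept_language(header):
        if q <= 0:
            continue
        if lang in supported:
            return lang
        primary = lang.split("-")[0]  # e.g. "fr-ch" -> "fr"
        if primary in supported:
            return primary
    return default

print(pick_language("fr-CH, fr;q=0.9, en;q=0.8, de;q=0.7", ["de", "fr", "en"]))
```

The result could then be used to build the directory name, in the same way 'doc_' + get_locale() is used in the redirect.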

How to handle url path in web.py?

I'm new to web.py and use a lot of hardcoded URLs in my code for the href of a tags, like
/login or /?type=example.
The problem is,
when I set my application running under a certain path, not the root of a URL, like
http://example.com/appname/
The link will direct me to some place like
http://example.com/login
While the expected/wanted one is
http://example.com/appname/login
How do I handle this?
Make web.ctx.homepath available in your template globals, and output it before your paths.
From http://webpy.org/cookbook/ctx
homepath – The part of the path requested by the user which was
trimmed off the current app. That is homepath + path = the path
actually requested in HTTP by the user. E.g. /admin This seems to be
derived during startup from the environment variable REAL_SCRIPT_NAME.
It affects what web.url() will prepend to supplied urls. This in turn
affects where web.seeother() will go, which might interact badly with
your url rewriting scheme (e.g. mod_rewrite)
template_globals = {
    'app_path': lambda p: web.ctx.homepath + p,
}
render = template.render(my_template_dir, globals=template_globals, base="mylayout")
Then you should be able to output app_path in your templates:
<a href="$app_path('/login')">Login</a>
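As a rough illustration of the homepath + path idea, independent of web.py itself, the joining step can be sketched as a plain function. The name join_app_path and the slash handling are assumptions for this example; in the real application the prefix would come from web.ctx.homepath.

```python
def join_app_path(homepath, path):
    """Prepend the mount prefix to an app-relative path.

    Normalises slashes so that neither '//' nor a missing '/' appears
    at the join. Hypothetical helper for illustration only.
    """
    return homepath.rstrip("/") + "/" + path.lstrip("/")

print(join_app_path("/appname", "/login"))  # app mounted under /appname
print(join_app_path("", "/login"))          # app mounted at the root
```

With the application mounted under /appname, links built this way point at /appname/login instead of /login.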

Link generator using django or any python module

I want to generate temporary download links for my users.
Is it OK to use Django to generate the links using URL patterns?
Is that a correct way to do it? I may not understand how some of these processes work, and I'm worried it could overflow my memory or cause some other problem. Some kind of example or tools would be appreciated. Some nginx or Apache modules, probably?
So, what I want to achieve is a URL pattern that depends on the user and the time, which I decrypt in the view in order to return a file.
A simple scheme might be to use a hash digest of username and timestamp:
from datetime import datetime
from hashlib import sha1
user = 'bob'
time = datetime.now().isoformat()
plain = user + '\0' + time
token = sha1(plain.encode('utf-8'))
print(token.hexdigest())
# e.g. 1e2c5078bd0de12a79d1a49255a9bff9737aa4a4
Next you store that token in a memcache with an expiration time. This way any of your webservers can reach it and the token will auto-expire. Finally add a Django url handler for '^download/.+' where the controller just looks up that token in the memcache to determine if the token is valid. You can even store the filename to be downloaded as the token's value in memcache.
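The store-and-expire step could be sketched as follows. This is only an illustration under stated assumptions: an in-process dict stands in for memcache (so it would not be shared between web servers), the TokenStore class is hypothetical, and a random token from the standard secrets module is swapped in for the username and timestamp digest.

```python
import secrets
import time

class TokenStore:
    """In-memory stand-in for a cache with per-key expiration."""

    def __init__(self, clock=time.time):
        self._clock = clock   # injectable so expiry can be tested
        self._tokens = {}     # token -> (filename, expires_at)

    def issue(self, filename, ttl_seconds):
        """Create a token that maps to `filename` for `ttl_seconds`."""
        token = secrets.token_urlsafe(32)
        self._tokens[token] = (filename, self._clock() + ttl_seconds)
        return token

    def lookup(self, token):
        """Return the filename for a valid token, or None."""
        entry = self._tokens.get(token)
        if entry is None:
            return None
        filename, expires_at = entry
        if self._clock() >= expires_at:
            del self._tokens[token]  # expired: forget it
            return None
        return filename

store = TokenStore()
token = store.issue("report.pdf", ttl_seconds=3600)
print(store.lookup(token))
```

A download view would call lookup with the token taken from the URL and return a 404 when the result is None.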
Yes, it would be OK to let Django generate the URLs. That is separate from handling the URLs with urls.py. Typically you don't want Django to handle the serving of files (see the static file docs [1] about this), so get the notion of using URL patterns out of your head.
What you might want to do is generate a random key using a hash, like md5/sha1. Store the file, the key, and the datetime it was added in the database, and create the download directory in a root directory that's available from your webserver (Apache or nginx; I suggest nginx). Since it's temporary, you'll want to add a cron job that checks whether the time since the URL was generated has expired, cleans up the file, and removes the db entry. This should be a Django command for manage.py.
Please note this is example code written just for this and not tested! It may not work the way you were planning on achieving this goal, but it works. If you want the download to be password protected as well, look into HTTP basic auth. You can generate and remove entries on the fly in an httpd.auth file using htpasswd and the subprocess module when you create the link or at registration time.
import hashlib, random, datetime, os, shutil
# model to hold link info. has these fields: key (charfield), newpath (filepathfield),
# date (datetimefield), url (charfield), orgpath (filepathfield of the original path
# or a foreignkey to the files model).
from models import MyDlLink
# settings.py for the app
from myapp import settings as myapp_settings

# filepath: full path and name of file to dl.
def genUrl(filepath):
    # create a one-time salt for randomness
    salt = ''.join('{0}'.format(random.randrange(10)) for i in range(10))
    key = hashlib.sha1('{0}{1}'.format(salt, filepath).encode('utf-8')).hexdigest()
    # each link gets its own directory under DL_ROOT, named after the key,
    # so that the file path matches the url built below
    newdir = os.path.join(myapp_settings.DL_ROOT, key)
    os.makedirs(newdir)
    newpath = os.path.join(newdir, os.path.basename(filepath))
    shutil.copy2(filepath, newpath)
    newlink = MyDlLink()
    newlink.key = key
    newlink.date = datetime.datetime.now()
    newlink.orgpath = filepath
    newlink.newpath = newpath
    newlink.url = "{0}/{1}/{2}".format(myapp_settings.DL_URL, key, os.path.basename(filepath))
    newlink.save()
    return newlink

# in commands
def check_url_expired():
    maxage = datetime.timedelta(days=7)
    now = datetime.datetime.now()
    for link in MyDlLink.objects.all():
        if (now - link.date) > maxage:
            # remove the copied file together with its per-link directory
            shutil.rmtree(os.path.dirname(link.newpath))
            link.delete()
[1] http://docs.djangoproject.com/en/1.2/howto/static-files/
It sounds like you are suggesting using some kind of dynamic url conf.
Why not forget your concerns by simplifying and setting up a single url that captures a large encoded string that depends on user/time?
(r'^download/(?P<encrypted_id>.*)/$', 'download_file'), # use your own regexp
def download_file(request, encrypted_id):
    decrypted = decrypt(encrypted_id)
    _file = get_file(decrypted)
    return _file
A lot of sites just use a GET parameter too:
www.example.com/download_file/?09248903483o8a908423028a0df8032
If you are concerned about performance, look at the answers in this post: Having Django serve downloadable files, where the use of the Apache X-Sendfile module is highlighted.
Another alternative is to simply redirect from Django to the static file, served by whatever means.
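One way to build a single URL segment that depends on user and time is sketched below. Note that this signs the value with an HMAC rather than encrypting it, so the contents are readable but cannot be altered without the key. The names make_token, read_token and SECRET_KEY are hypothetical, and the sketch assumes user and file identifiers that contain no colon.

```python
import base64
import hashlib
import hmac
import time

SECRET_KEY = b"change-me"  # hypothetical; keep the real key out of source control

def _b64(data):
    return base64.urlsafe_b64encode(data).decode("ascii")

def make_token(user_id, file_id, expires_at, key=SECRET_KEY):
    """Return 'payload.signature', both URL-safe base64."""
    payload = "{0}:{1}:{2}".format(user_id, file_id, int(expires_at)).encode("utf-8")
    signature = hmac.new(key, payload, hashlib.sha256).digest()
    return _b64(payload) + "." + _b64(signature)

def read_token(token, now=None, key=SECRET_KEY):
    """Return (user_id, file_id) as strings, or None if invalid or expired."""
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64.encode("ascii"))
        signature = base64.urlsafe_b64decode(signature_b64.encode("ascii"))
    except ValueError:
        # Wrong number of parts or malformed base64.
        return None
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None
    user_id, file_id, expires_at = payload.decode("utf-8").split(":")
    current = time.time() if now is None else now
    if current >= int(expires_at):
        return None
    return user_id, file_id

token = make_token(7, 42, time.time() + 3600)
print(read_token(token))
```

A view like download_file could call read_token on the captured segment and compare the returned user id with the logged-in user before serving anything.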

Django localeURL when WSGIScriptAlias is /PREFIX

Introduction
I have a question about localeURL usage.
Everything works great for me with a URL like this:
http://www.example.com/
If I type http://www.example.com/ in the address bar, it correctly turns into http://www.example.com/en/, for example.
If I use the change_locale view, it also works fine (i.e. it changes www.example.com/en/ into www.example.com/fr/).
Problem
But my application uses Apache as its server, with mod_wsgi. The httpd.conf script contains this line:
WSGIScriptAlias /MY_PREFIX /path/to/django/app/apache/django.wsgi
which gives a URL like this:
http://www.example.com/MY_PREFIX/
If I type http://www.example.com/MY_PREFIX/ in the address bar, the address turns into http://www.example.com/en/ when the expected result is http://www.example.com/MY_PREFIX/en/
The same problem occurred with the change_locale view. I modified its code to manage this prefix (stored in settings.SERVER_PREFIX):
def change_locale(request):
    """
    Redirect to a given url while changing the locale in the path
    The url and the locale code need to be specified in the
    request parameters.
    O. Rochaix; Taken from localeURL view, and tuned to manage :
    - SERVER_PREFIX from settings.py
    """
    next = request.REQUEST.get('next', None)
    if not next:
        next = request.META.get('HTTP_REFERER', None)
    if not next:
        next = settings.SERVER_PREFIX + '/'
    next = urlsplit(next).path
    prefix = False
    if settings.SERVER_PREFIX != "" and next.startswith(settings.SERVER_PREFIX):
        prefix = True
        # slice the prefix off; str.lstrip() would strip a set of characters, not a prefix
        next = "/" + next[len(settings.SERVER_PREFIX):].lstrip("/")
    _, path = utils.strip_path(next)
    if request.method == 'POST':
        locale = request.POST.get('locale', None)
        if locale and check_for_language(locale):
            path = utils.locale_path(path, locale)
    if prefix:
        path = settings.SERVER_PREFIX + path
    response = http.HttpResponseRedirect(path)
    return response
With this customized view I'm able to change the language correctly, but I'm not sure it's the right way of doing things.
Question
1. When you use WSGIScriptAlias with a /PREFIX (e.g. "/Blog") in httpd.conf, do we need a variable on the Python side (here settings.SERVER_PREFIX) that matches WSGIScriptAlias? I use it for MEDIA_URL and other things, but maybe there is some configuration that makes it work "automatically" without having to manage this on the Python side.
2. Do you think this customized view (change_locale) is the right way to manage this issue? Or is there some kind of automagic stuff, as for 1.?
It doesn't solve the problem if I type the address (http://www.example.com/MY_PREFIX/) in the address bar. If customization is the way to go, I will change this as well, but I think there is a better solution!
You should not be hard-wiring SERVER_PREFIX in settings. The mount prefix for the site is available as SCRIPT_NAME in the WSGI environ dictionary. From memory, it is therefore available as request.META.get('SCRIPT_NAME').
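A small sketch of how a mount prefix read from SCRIPT_NAME might be stripped from and re-added to a path. The helper names are hypothetical and a plain dict stands in for request.META; this is not code from localeURL or Django.

```python
def split_mount_prefix(path, script_name):
    """Return (prefix, inner_path); prefix is '' if path is not under it.

    Matches whole path segments only, so '/MY_PREFIXES/...' is not
    treated as being under '/MY_PREFIX'.
    """
    prefix = script_name.rstrip("/")
    if prefix and (path == prefix or path.startswith(prefix + "/")):
        return prefix, path[len(prefix):] or "/"
    return "", path

def with_mount_prefix(path, script_name):
    """Prepend the mount prefix to an application-relative path."""
    return script_name.rstrip("/") + path

# Stand-in for request.META in a view.
meta = {"SCRIPT_NAME": "/MY_PREFIX"}
script_name = meta.get("SCRIPT_NAME", "")

prefix, inner = split_mount_prefix("/MY_PREFIX/en/", script_name)
print(prefix, inner)
print(with_mount_prefix("/fr/", script_name))
```

Reading the prefix from the request in this way keeps the view working if the WSGIScriptAlias mount point changes, without a matching edit to settings.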
Try this (I am not sure whether it will work, though):
WSGIScriptAliasMatch ^/MY_PREFIX(/.*)?$ /path/to/django/app/apache/django.wsgi$1
Basically, the idea is to make Django believe that there is no prefix,
but you need to make sure Django emits the correct URLs in its HTML output.
