I want to generate for my users temporary download link.
Is that ok if i use django to generate link using url patterns?
Could it be correct way to do that. Because can happen that I don't understand some processes how it works. And it will overflow my memory or something else. Some kind of example or tools will be appreciated. Some nginx, apache modules probably?
So, what i wanna to achieve is to make url pattern which depend on user and time. Decript it end return in view a file.
A simple scheme might be to use a hash digest of username and timestamp:
from datetime import datetime
from hashlib import sha1
user = 'bob'
time = datetime.now().isoformat()
plain = user + '\0' + time
token = sha1(plain)
print token.hexdigest()
"1e2c5078bd0de12a79d1a49255a9bff9737aa4a4"
Next you store that token in a memcache with an expiration time. This way any of your webservers can reach it and the token will auto-expire. Finally add a Django url handler for '^download/.+' where the controller just looks up that token in the memcache to determine if the token is valid. You can even store the filename to be downloaded as the token's value in memcache.
Yes it would be ok to allow django to generate the urls. This being exclusive from handling the urls, with urls.py. Typically you don't want django to handle the serving of files see the static file docs[1] about this, so get the notion of using url patterns out of your head.
What you might want to do is generate a random key using a hash, like md5/sha1. Store the file and the key, datetime it's added in the database, create the download directory in a root directory that's available from your webserver like apache or nginx... suggest nginx), Since it's temporary, you'll want to add a cron job that checks if the time since the url was generated has expired, cleans up the file and removes the db entry. This should be a django command for manage.py
Please note this is example code written just for this and not tested! It may not work the way you were planning on achieving this goal, but it works. If you want the dl to be pw protected also, then look into httpbasic auth. you can generate and remove entries on the fly in a httpd.auth file using htpasswd and the subprocess module when you create the link or at registration time.
import hashlib, random, datetime, os, shutil
# model to hold link info. has these fields: key (charfield), filepath (filepathfield)
# datetime (datetimefield), url (charfield), orgpath (filepathfield of the orignal path
# or a foreignkey to the files model.
from models import MyDlLink
# settings.py for the app
from myapp import settings as myapp_settings
# full path and name of file to dl.
def genUrl(filepath):
# create a onetime salt for randomness
salt = ''.join(['{0}'.format(random.randrange(10) for i in range(10)])
key = hashlib('{0}{1}'.format(salt, filepath).hexdigest()
newpath = os.path.join(myapp_settings.DL_ROOT, key)
shutil.copy2(fname, newpath)
newlink = MyDlink()
newlink.key = key
newlink.date = datetime.datetime.now()
newlink.orgpath = filepath
newlink.newpath = newpath
newlink.url = "{0}/{1}/{2}".format(myapp_settings.DL_URL, key, os.path.basename(fname))
newlink.save()
return newlink
# in commands
def check_url_expired():
maxage = datetime.timedelta(days=7)
now = datetime.datetime.now()
for link in MyDlink.objects.all():
if(now - link.date) > maxage:
os.path.remove(link.newpath)
link.delete()
[1] http://docs.djangoproject.com/en/1.2/howto/static-files/
It sounds like you are suggesting using some kind of dynamic url conf.
Why not forget your concerns by simplifying and setting up a single url that captures a large encoded string that depends on user/time?
(r'^download/(?P<encrypted_id>(.*)/$', 'download_file'), # use your own regexp
def download_file(request, encrypted_id):
decrypted = decrypt(encrypted_id)
_file = get_file(decrypted)
return _file
A lot of sites just use a get param too.
www.example.com/download_file/?09248903483o8a908423028a0df8032
If you are concerned about performance, look at the answers in this post: Having Django serve downloadable files
Where the use of the apache x-sendfile module is highlighted.
Another alternative is to simply redirect to the static file served by whatever means from django.
Related
I'm a computer science student. Recently we were tasked to develop a static HTTP server from scratch without using any HTTP modules, solely depending on socket programming. So this means that I had to write all the logic for HTTP message parsing, extracting headers, parsing URLs, etc.
However, I'm stuck with some confusion. As I'm somewhat experienced in web development before, I'm used to using URLs in places like anchor tags like this "/about", and "/articles/article-1".However, I've seen people sometimes people to relative paths according to their folder structure like this. "./about.html", "../contact.html".This always seemed to be a bad idea to me. However, I realized that even though in my code I'm not supporting these kinds of URLs explicitly, it seems to work anyhow.
Following is the python code I'm using to get the path from the HTTP message and then get the corresponding path in the file system.
def get_http_url(self, raw_request_headers: list[str]):
"""
Method to get HTTP url by parsing request headers
"""
if len(raw_request_headers) > 0:
method_and_path_header = raw_request_headers[0]
method_and_path_header_segments = method_and_path_header.split(" ")
if len(method_and_path_header_segments) >= 2:
"""
example: GET / HTTP/1.1 => ['GET', '/', 'HTTP/1.1] => '/'
"""
url = method_and_path_header_segments[1]
return url
return False
def get_resource_path_for_url(self, path: str | Literal[False]):
"""
Method to get the resource path based on url
"""
if not path:
return False
else:
if path.endswith('/'):
# Removing trailing '/' to make it easy to parse the url
path = path[0:-1]
# Split to see if the url also includes the file extension
parts = path.split('.')
if path == '':
# if the requested path is "/"
path_to_resource = os.path.join(
os.getcwd(), "htdocs", "index.html")
else:
# Assumes the user entered a valid url with resources file extension as well, ex: http://localhost:2728/pages/about.html
if len(parts) > 1:
path_to_resource = os.path.join(
os.getcwd(), "htdocs", path[1:]) # Get the abslute path with the existing file extension
else:
# Assumes user requested a url without an extension and as such is hoping for a html response
path_to_resource = os.path.join(
os.getcwd(), "htdocs", f"{path[1:]}.html") # Get the absolute path to the corresponding html file
return path_to_resource
So in my code, I'm not explicitly adding any logic to handle that kind of relative path. But somehow, when I use things like ../about.html in my test HTML files, it somehow works?
Is this the expected behavior? As of now (I would like to know where this behavior is implemented), I'm on Windows if that matters. And if this is expected, can I depend on this behavior and conclude that it's safe to refer to HTML files and other assets with relative paths like this on my web server?
Thanks in advance for any help, and I apologize if my question is not clear or well-formed.
I'm busy configuring a TensorFlow Serving client that asks a TensorFlow Serving server to produce predictions on a given input image, for a given model.
If the model being requested has not yet been served, it is downloaded from a remote URL to a folder where the server's models are located. (The client does this). At this point I need to update the model_config and trigger the server to reload it.
This functionality appears to exist (based on https://github.com/tensorflow/serving/pull/885 and https://github.com/tensorflow/serving/blob/master/tensorflow_serving/apis/model_service.proto#L22), but I can't find any documentation on how to actually use it.
I am essentially looking for a python script with which I can trigger the reload from client side (or otherwise to configure the server to listen for changes and trigger the reload itself).
So it took me ages of trawling through pull requests to finally find a code example for this. For the next person who has the same question as me, here is an example of how to do this. (You'll need the tensorflow_serving package for this; pip install tensorflow-serving-api).
Based on this pull request (which at the time of writing hadn't been accepted and was closed since it needed review): https://github.com/tensorflow/serving/pull/1065
from tensorflow_serving.apis import model_service_pb2_grpc
from tensorflow_serving.apis import model_management_pb2
from tensorflow_serving.config import model_server_config_pb2
import grpc
def add_model_config(host, name, base_path, model_platform):
channel = grpc.insecure_channel(host)
stub = model_service_pb2_grpc.ModelServiceStub(channel)
request = model_management_pb2.ReloadConfigRequest()
model_server_config = model_server_config_pb2.ModelServerConfig()
#Create a config to add to the list of served models
config_list = model_server_config_pb2.ModelConfigList()
one_config = config_list.config.add()
one_config.name= name
one_config.base_path=base_path
one_config.model_platform=model_platform
model_server_config.model_config_list.CopyFrom(config_list)
request.config.CopyFrom(model_server_config)
print(request.IsInitialized())
print(request.ListFields())
response = stub.HandleReloadConfigRequest(request,10)
if response.status.error_code == 0:
print("Reload sucessfully")
else:
print("Reload failed!")
print(response.status.error_code)
print(response.status.error_message)
add_model_config(host="localhost:8500",
name="my_model",
base_path="/models/my_model",
model_platform="tensorflow")
Add a model to TF Serving server and to the existing config file conf_filepath: Use arguments name, base_path, model_platform for the new model. Keeps the original models intact.
Notice a small difference from #Karl 's answer - using MergeFrom instead of CopyFrom
pip install tensorflow-serving-api
import grpc
from google.protobuf import text_format
from tensorflow_serving.apis import model_service_pb2_grpc, model_management_pb2
from tensorflow_serving.config import model_server_config_pb2
def add_model_config(conf_filepath, host, name, base_path, model_platform):
with open(conf_filepath, 'r+') as f:
config_ini = f.read()
channel = grpc.insecure_channel(host)
stub = model_service_pb2_grpc.ModelServiceStub(channel)
request = model_management_pb2.ReloadConfigRequest()
model_server_config = model_server_config_pb2.ModelServerConfig()
config_list = model_server_config_pb2.ModelConfigList()
model_server_config = text_format.Parse(text=config_ini, message=model_server_config)
# Create a config to add to the list of served models
one_config = config_list.config.add()
one_config.name = name
one_config.base_path = base_path
one_config.model_platform = model_platform
model_server_config.model_config_list.MergeFrom(config_list)
request.config.CopyFrom(model_server_config)
response = stub.HandleReloadConfigRequest(request, 10)
if response.status.error_code == 0:
with open(conf_filepath, 'w+') as f:
f.write(request.config.__str__())
print("Updated TF Serving conf file")
else:
print("Failed to update model_config_list!")
print(response.status.error_code)
print(response.status.error_message)
While the solutions mentioned here works fine, there is one more method that you can use to hot-reload your models. You can use --model_config_file_poll_wait_seconds
As mentioned here in the documentation -
By setting the --model_config_file_poll_wait_seconds flag to instruct the server to periodically check for a new config file at --model_config_file filepath.
So, you just have to update the config file at model_config_path and tf-serving will load any new models and unload any models removed from the config file.
Edit 1: I looked at the source code and it seems that the flag is present from the very early version of tf-serving but there have been instances where some users were not able to use this flag (see this). So, try to use the latest version if possible.
If you're using the method described in this answer, please note that you're actually launching multiple tensorflow model server instances instead of a single model server, effectively making the servers compete for resources instead of working together to optimize tail latency.
I am new to python and firebase and I am trying to flaten my firebase database.
I have a database in this format
each cat has thousands of data in it. All I want is to fetch the cat names and put them in an array. for example I want the output to be ['cat1','cat2'....]
I was using this tutorial
http://ozgur.github.io/python-firebase/
from firebase import firebase
firebase = firebase.FirebaseApplication('https://your_storage.firebaseio.com', None)
result = firebase.get('/Data', None)
the problem with the above code is it'll attempt to fetch all the data under Data. How can I only fetch the "cats"?
if you want to get the values inside the cats as columns, try using the pyrebase, using pip install pyrebase at cmd / anaconda prompt(later prefered if you didn't set up PIP or Python at your environment paths. after installing:
import pyrebase
config {"apiKey": yourapikey
"authDomain": yourapidomain
"databaseURL": yourdatabaseurl,
"storageBucket": yourstoragebucket,
"serviceAccount": yourserviceaccount
}
Note: you can find all the information above at your Firebase's console:
https://console.firebase.google.com/project/ >>> your project >>> click on the icon "<'/>" with the tag "add firebase to your web app
back to the code...
make a neat definition so you can store it into a py file:
def connect_firebase():
# add a way to encrypt those, I'm a starter myself and don't know how
username: "usernameyoucreatedatfirebase"
password: "passwordforaboveuser"
firebase = pyrebase.initialize_app(config)
auth = firebase.auth()
#authenticate a user > descobrir como não deixar hardcoded
user = auth.sign_in_with_email_and_password(username, password)
#user['idToken']
# At pyrebase's git the author said the token expires every 1 hour, so it's needed to refresh it
user = auth.refresh(user['refreshToken'])
#set database
db = firebase.database()
return db
Ok, now save this into a neat .py file
NEXT, at your new notebook or main .py you're going to import this new .py file that we'll call auth.py from now on...
from auth import *
# add do a variable
db = connect_firebase()
#and now the hard/ easy part that took me a while to figure out:
# notice the value inside the .child, it should be the parent name with all the cats keys
values = db.child('cats').get()
# adding all to a dataframe you'll need to use the .val()
data = pd.DataFrame(values.val())
and thats it, print(data.head()) to check if the values / columns are where they're expected to be.
Firebase Realtime Database is one big JSON tree:
when you fetch data at a location in your database, you also retrieve
all of its child nodes.
The best practice is to denormalize your data, creating multiple locations (nodes) for the same data:
Many times you can denormalize the data by using a query to retrieve a
subset of the data
In your case, you may create a second node named "categories" where you list "only" the category names.
/cat1
/...
/cat2
/...
/cat3
/...
/cat4
/...
/categories
/cat1
/cat2
/cat3
/cat4
In this scenario you can use the update() method to write to more than one location at the same time.
I was exploring pyrebase documentation. As per that, we may extract only keys from some path.
To return just the keys at a particular path use the shallow() method.
all_user_ids = db.child("users").shallow().get()
In your case, it'll be something like:
firebase = pyrebase.initialize_app(config)
db = firebase.database()
allCats = db.child("data").shallow().get()
Let me know if it didn't help.
I'm using the Django sendfile module to serve files to the user. Docs:
https://github.com/johnsensible/django-sendfile
I used the simple backend. More precisely, I put SENDFILE_BACKEND = 'sendfile.backends.simple' in my settings.py.
I have checked and doucle-checked that the files are there. This is my code (I think only the request function is relevant but I'm including the whole function because I use pk in urls.py):
def permit(request, pk)
if int(request.user.id) == int(pk) and int(request.user.id) >= 1:
return sendfile(request, request.path)
else:
return render_to_response('forbidden.html')
return HttpResponseRedirect('/notendur/list')
user is a Django User object. pk is the regular expression taken from urls.py.
And the error I get is
404: [path to file] does not exist.
This is the relevant entry in project/urls.py:
url(r'^media/uploads/(?P<pk>[^/]+)', 'notendur.views.permit')
As you can see, the urls.py redirects the user to the permit function if the regx matches. If the user's ID is equal to the directory name (I name the directories by user ID) then the user is allowed to download the file, otherwise not.
I have confirmed that this error is due to the sendfile module because the download works fine if I serve the file directly, without the sendfile module.
Firstly a big warning, what you are doing is dangerous. You are trusting your user to give you a path. You must always sanitize this!
Now to your issue: rather than giving a relative file to the current directory, it is better practice to give an absolute file based on some root media path set in your settings file then do:
sanitized_path = sanitize(request.path) # you'll have to write a sanitize function
media_path = "%s%s" (settings.MEDIA_ROOT, sanitized_path)
if not path.exists(media_path): # Don't trust your visitors too much!
# raise 404
return sendfile(request, media_path)
Introduction
I Got a question about localeURL usage.
Everything works great for me with url like this :
http://www.example.com/
If I type http://www.example.com/ in address bar, it turns correctly in http://www.example.com/en/ for example.
If I use the view change_locale, it's also all right (ie change www.example.com/en/ in www.example.com/fr/).
problem
But my application use apache as server, with mod_wsgi. The httpd.conf script contains this line :
WSGIScriptAlias /MY_PREFIX /path/to/django/app/apache/django.wsgi
that gives url like this :
http://www.example.com/MY_PREFIX/
If I type http://www.example.com/MY_PREFIX/ in the address bar, the adress turns into http://www.example.com/en/ when the expected result should be http://www.example.com/MY_PREFIX/en/
The same problem occurred with the change_locale view. I modified this code in order to manage this prefix (store in settings.SERVER_PREFIX):
def change_locale(request) :
"""
Redirect to a given url while changing the locale in the path
The url and the locale code need to be specified in the
request parameters.
O. Rochaix; Taken from localeURL view, and tuned to manage :
- SERVER_PREFIX from settings.py
"""
next = request.REQUEST.get('next', None)
if not next:
next = request.META.get('HTTP_REFERER', None)
if not next:
next = settings.SERVER_PREFIX + '/'
next = urlsplit(next).path
prefix = False
if settings.SERVER_PREFIX!="" and next.startswith(settings.SERVER_PREFIX) :
prefix = True
next = "/" + next.lstrip(settings.SERVER_PREFIX)
_, path = utils.strip_path (next)
if request.method == 'POST':
locale = request.POST.get('locale', None)
if locale and check_for_language(locale):
path = utils.locale_path(path, locale)
if prefix :
path = settings.SERVER_PREFIX + path
response = http.HttpResponseRedirect(path)
return response
with this customized view, i'm able to correctly change language, but i'm not sure that's the right way of doing stuff.
Question
when, in httpd.conf you use WSGIScriptAlias with /PREFIX (ie "/Blog"), do we need, on python side to use a variable (here settings.SERVER_PREFIX) that match WSGIScriptAlias ? i use it for MEDIA_URL and other stuff, but maybe there is some configuration to do in order to make it work "automatically" without having to manage this on python side
Do you think that this customized view (change_locale) is the right way to manage this issue ? Or is there some kind of automagic stuff as for 1. ?
It doesn't solve the problem if I type the address (http://www.example.com/MY_PREFIX/) in address bar. If customization is the way to go, i will change this as well, but I think there is a better solution!
You should not be hard wiring SERVER_PREFIX in settings. The mount prefix for the site is available as SCRIPT_NAME in the WSGI environ dictionary. Thus from memory is available as request.META.get('SCRIPT_NAME').
try this (I am not sure whether it will work though):
WSGIScriptAliasMatch ^/MY_PREFIX(/.*)?$ /path/to/django/app/apache/django.wsgi$1
basically the idea s to make django believe that there is no prefix
but you need to make sure django emits the correct URLs in its HTML output.