Tamporarily saving and sanitizing image objects in Django - python

I'm creating a Django website where users post details of used items they want to sell/barter.
When posting an item, the second-last step is to upload (upto 3) photos of the item on sale. The last step is to provide personal details (name, address, mobile). After this, the ad is finalized and goes into a "pending approval" queue.
I don't want to save photos to the DB until an advert is finalized. Hence I'm thinking I'll temporarily save a reference to the UploadedFile object retrieved from the form like so:
request.session["photo"] = form.cleaned_data.get('photo',None)
So far so good, or is this problematic? Assuming everything is correct so far.
Next, once ad is final, I'll save the images to my storage backend and pop photo from the request.session dictionary.
But what if the user drops out before finalizing the ad? How would I handle cases where a request.session["photo"] entry was created, but the user never finished the ad? Where are UploadedFile objects saved?
Basically, I need an efficient way to process 'orphaned' request.session image entries and related UploadedFile objects.

The main problem of using sessions is that they're not meant to handle a lot amount of data, and they could potentially loss information due to encoding, the RAM filling up, the framework's process of garbage collection, and such. That really dependes on how you store sessions objects, if you already solved this problem, go to step two.
Step One: You may want to make a separate model to store the image temporarily, storing the date and image file reference (through an ImageField), you can tell this model to upload the file to a temporary location on the Media directory using the upload_to parameter of the field.
Step Two: Set up a management command that filters the orphaned models from a certain date (images older than 10 minutes or so), and then deletes them along with the images (this is very important, as deleting the database record does not delete the file). The management command should be able to run without user intervention. Read more about management commands here.
Step Three: Set up your machine's Crontab or whatever you use to schedule processes, to run the management command in a daily or hourly basis depending on your needs.

Related

Posting data to database through a "workflow" (Ex: on field changed to 20, create new record)

I'm looking to post new records on a user triggered basis (i.e. workflow). I've spent the last couple of days reasearching the best way to approach this and so far I've come up with the following ideas:
(1) Utilize Django signals to check for conditions on a field change, and then post data originating from my Django app.
(2) Utilize JS/AJAX on the front-end to post data to the app based upon a user changing certain fields.
(3) Utilize a prebuilt workflow app like http://viewflow.io/, again based upon changes triggers by users.
Of the three above options, is there a best practice? Are there any other options I'm not considering for how to take this workflow based approach to post new records?
The second approach of monitoring the changes in the front end and then calling a backend view to update go database would be a better approach because processing on the backend or any other site would put the processing on the server which would slow down the site whereas second approach is more of a client side solution thereby keeping server relieved.
I do not think there will be a data loss, you are just trying to monitor a change, as soon as it changes your view will update the database, you can also use cookies or sessions to keep appending values as a list and update the database when site closes. Also django gives https errors you could put proper try and except conditions in that case as well. Anyways cookies would be a good approach I think
For anyone that finds this post I ended up deciding to take the Signals route. Essentially I'm utilizing Signals to track when users change a fields, and based on the field that changes I'm performing certain actions on the database.
For testing purposes this has been working well. When I reach production with this project I'll try to update this post with any challenges I run into.
Example:
#receiver(pre_save, sender=subTaskChecklist)
def do_something_if_changed(sender, instance, **kwargs):
try:
obj = sender.objects.get(pk=instance.pk) #define obj as "old" before change values
except sender.DoesNotExist:
pass
else:
previous_Value = obj.FieldToTrack
new_Value = instance.FieldToTrack #instance represents the "new" after change object
DoSomethingWithChangedField(new_Value)

Updating my Django website's database from a third party service, strategies?

I'm learning Django and to practice I'm currently developing a clone page of YTS, it's a movie torrents repository*.
As of right now, I scrapped all the movies in the website and have them on a single db table called Movie with all the basic information of each movie (I'm planning on adding one more for Genre).
Every few days YTS will post new movies and I want my clone-web to automatically add them to the database. I'm currently stuck on deciding how to do this:
I was planning on comparing the movie id of the last movie in my db against the last movie in the YTS db each time the user enters the website, but that'd mean make a request to YTS every time my page loads, it'd also mean some very slow code should be executed inside my index() views method.
Another strategy would be to query the last time my db was updated (new entries were introduced) and if it's let's say bigger than a day then request new movies to YTS. Problem with this is I don't seem to find any method to query the time of last db updates. Does it even exist such method?
I could also set a cron job to update the information but I'm having problems to make changes from a separated Python function (I import django.db and such but the interpreter refuses to execute django db instructions).
So, all in all, what's the best strategy to update my database from a third party service/website without bothering the user with loading times? How do you set such updates in non-intrusive way to the user? How do you generally do it?
* I know a torrents website borders the illegal and I'm not intended, in any way, to make my project available to the public
I think you should choose definetely the third alternative, a cron job to update the database regularly seems the best option.
You don' t need to use a seperate python function, you can schedule a task with celery, which can be easily integrated with django using django-celery
The simplest way would be to write a custom management command and run it periodically from a cron job.

Save form data on every step using Django FormWizard

Background
I'm building a very large form to process customer submissions, so the end goal is to allow the user to resume the form where they left off at a later date. The form is fully functional using a FormWizard (NamedUrlSessionWizardView, actually). The Django docs mention a final save is accomplished in the done method, and leave this as an exercise to the reader. This works OK if the user completes this in one sitting, but not if you want to restore this later.
In my case, an email address is used to lookup past progress, and send a unique link to the user. This sets up the form and returns the user to where they left off. This works fine as long as your session is still valid, but not if it isn't (different computer, etc). What I would like to do is save the form data (these are ModelForms) after each step. I'll restore the user's state when they return.
Research
This answer is about the only solution I can find, but the solution is the same thing that the standard FormWizard.post() method does:
if form.is_valid():
# if the form is valid, store the cleaned data and files.
self.storage.set_step_data(self.steps.current, self.process_step(form))
self.storage.set_step_files(self.steps.current, self.process_step_files(form))
My Question
What is the proper way/place in a FormWizard to take action on, and save the form data out after each step?
You should be able to save the data directly to the ModelForm as you go along by simply writing it into the post method.
if self.steps.current == "form1":
data = self.request.POST["form1-response"]
user = CustomerModel.objects.get(id=self.request.user.id)
user.response = data
user.form_step = "form1"
user.save()
form_step, in this case, is simply a bookmark that you can use to direct the user back to the right step on their return. You should remove any already-saved fields from the done method, so they don't get overwritten.
If you do it this way, you may need to construct a dispatch method that rebuilds the management form when the user logs back in.
Alternatively, you might be able to get away with saving the user's session (or the relevant parts) into a session field on the model, then write a dispatch method for the SessionWizardView that injects the relevant information back in. I've never attempted it, but if you can get it to work, it might be preferable from an aesthetic standpoint depending on how many steps you have to cover.
Finally, if you can rely on your users not to clear their cookies and to use the same browser when they return, you can maybe cheat and set use persistent cookies.
Hopefully that will get you started. I'd be interested to see how you end up getting it to work. Good luck!

Persistent object with Django?

So I have a site that on a per-user basis, and it is expected to query a very large database, and flip through the results. Due to the size of the number of entries returned, I run the query once (which takes some time...), store the result in a global, and let folks iterate through the results (or download them) as they want.
Of course, this isn't scalable, as the globals are shared across sessions. What is the correct way to do this in Django? I looked at session management, but I always ran into the "xyz is not serializeable on json" issue. Do I look into how I do this correctly using sessions, or is there another preferred way to do this?
If the user is flipping through the results, you probably don't want to pull back and render any more than you have to. Most SQL dialects have TOP and LIMIT clauses that will let you pull back a limited range of results, as long as your data is ordered consistently. Django's Pagination classes are a nice abstraction of this on top of Django Model classes: https://docs.djangoproject.com/en/dev/topics/pagination/
I would be careful of storing large amounts of data in user sessions, as it won't scale as your number of users grows, and user sessions can stay around for a while after the user has left the site. If you're set on this option, make sure you read about clearing the expired sessions. Django doesn't do it for you:
https://docs.djangoproject.com/en/1.7/topics/http/sessions/#clearing-the-session-store

Dynamic database tables in django

I am working on a project which requires me to create a table of every user who registers on the website using the username of that user. The columns in the table are same for every user.
While researching I found this Django dynamic model fields. I am not sure how to use django-mutant to accomplish this. Also, is there any way I could do this without using any external apps?
PS : The backend that I am using is Mysql
An interesting question, which might be of wider interest.
Creating one table per user is a maintenance nightmare. You should instead define a single table to hold all users' data, and then use the database's capabilities to retrieve only those rows pertaining to the user of interest (after checking permissions if necessary, since it is not a good idea to give any user unrestricted access to another user's data without specific permissions having been set).
Adopting your proposed solution requires that you construct SQL statements containing the relevant user's table name. Successive queries to the database will mostly be different, and this will slow the work down because every SQL statement has to be “prepared” (the syntax has to be checked, the names of table and columns has to be verified, the requesting user's permission to access the named resources has to be authorized, and so on).
By using a single table (model) the same queries can be used repeatedly, with parameters used to vary specific data values (in this case the name of the user whose data is being sought). Your database work will move along faster, you will only need a single model to describe all users' data, and database management will not be a nightmare.
A further advantage is that Django (which you appear to be using) has an extensive user-based permission model, and can easily be used to authenticate user login (once you know how). These advantages are so compelling I hope you will recant from your heresy and decide you can get away with a single table (and, if you planning to use standard Django logins, a relationship with the User model that comes as a central part of any Django project).
Please feel free to ask more questions as you proceed. It seems you are new to database work, and so I have tried to present an appropriate level of detail. There are many pitfalls such as this if you cannot access knowledgable advice. People on SO will help you.
This page shows how to create a model and install table to database on the fly. So, you could use type('table_with_username', (models.Model,), attrs) to create a model and use django.core.management to install it to the database.

Categories