I'm using Django REST framework to serve up JSON content for a website front end. On the back end, I have two Django models, Player and Match, that each reference multiple of the other. A Match contains multiple Players, and a Player contains multiple Matches. This data is originally retrieved from a third-party API.
Matches and Players must be fetched separately from the API, and can only be fetched one at a time. When an object is fetched, its data is converted from the external JSON format into my Django model. At this point, the Match/Player will live forever in Django. The hard part is that I want this external fetching to be seamless. If I query for a player or match and it's in the DB, then just serve what we have there. Otherwise, I want to fetch that object from the external DB.
My question is, does Django provide any convenient way of handling this? Ideally, any query along the lines of Match.objects.get(id=...) will handle this API fallback transparently (I don't mind the fact that this query may take significantly longer in some cases).
If a way is "convenient" depends on your expectations ...
You could create a custom QuerySet where you override the get() method to include your fetch-from-API logic. Then you create a custom manager based on that QuerySet, like the docs show here.
Finally add that custom manager to your model.
See also this question from 2011.
Related
I'm building a web app using Django and couchdb 2.0.
The new version of couchdb doesn't support temporary views. They recommend using a Mongo query but I couldn't find any useful documentation.
What is the best approach or library to use couchdb 2.0 with Django?
Temporary views were indeed abandoned in CouchDB 2.0. With mango, you could emulate them using a Hack, but that's just as bad (read: performance-wise). The recommendation is to actually use persistent views. As only the delta of new or updated documents need indexing, this will likely need significantly less resources.
As opposed to relational DBs, the created view (which is a persisted index by keys), is meant to be queried many times with different parameters (there is no such a thing as a query optimizer taking your temp view definition or something). So, when you're built heavily on temporary views, you might consider changing the way you query in the first place. One place to start is thinking about which attribute will collapse the result set most quickly to what you're looking for and build a view for that. Then, go query this view with keys and post-filter for the rest.
The closest thing you can do to a temporary view (when you really, really need it) is creating a design doc (e.g. _design/temp<uuid>) and use it for the one query execution.
Just to add a link (not new - but timeless) on the details: http://guide.couchdb.org/draft/views.html
Basically I have a program which scraps some data from a website, I need to either print it out to a django template or to REST API without using a database. How do I do this without a database?
Your best bet is to
a.) Perform the scraping in views themselves, and pass the info in a context dict to the template
or
b.) Write to a file and have your view pull info from the file.
Django can be run without a database, but it depends on what applications you enable. Some of the default functionality (auth, sites, contenttypes) requires a database. So you'd need to disable those. If you need to use them, you're SOL.
Other functionality (like sessions) usually uses a database, but you can configure it to use a cache or file or something else.
I've taken two approaches in the past:
1) Disable the database completely and disable the applications that require the database:
DATABASES = {}
2) Use a dummy sqlite database just so it works out of box with the default apps without too much tweaking, but don't really use it for anything. I find this method faster and good for setting up quick testing/prototyping.
And to actually get the data from the scraper into your view, you can take a number of approaches. Store the data in a cache, or just write it directly to your context variables, etc.
I noticed that there's no guarantee that the data base is updated synchronously after calling save() on a model.
I have done a simple test by making an ajax call to the following method
def save(request, id)
product = ProductModel.objects.find(id = id)
product.name = 'New Product Name'
product.save()
return HTTPResponse('success')
On the client side I wait for a response from the above method and then execute findAll method that retrieves the list of products. The returned list of products contains the old value for the name of the updated product.
However, if I delay the request for the list of products then it contains the new value, just like it is should.
This means that return HTTPResponse('success') if fired before the new values are written into the data base.
If the above is true then is there a way to return the HTTP response only after the data base is updated.
You should have mentioned App Engine more prominently. I've added it to the tags.
This is very definitely because of your lack of understanding of GAE, rather than anything to do with Django. You should read the GAE documentation on eventual consistency in the datastore, and structure your models and queries appropriately.
Normal Django, running with a standard relational database, would not have this issue.
The view should not return anything prior to the .save() function ends its flow.
As for the flow itself, the Django's docs declare it quite explicitly:
When you save an object, Django performs the following steps:
1) Emit a pre-save signal. The signal django.db.models.signals.pre_save is sent, allowing any functions listening for that signal to take some customized action.
2) Pre-process the data. Each field on the object is asked to perform any automated data modification that the field may need to perform.
Most fields do no pre-processing — the field data is kept as-is. Pre-processing is only used on fields that have special behavior. For example, if your model has a DateField with auto_now=True, the pre-save phase will alter the data in the object to ensure that the date field contains the current date stamp. (Our documentation doesn’t yet include a list of all the fields with this “special behavior.”)
3) Prepare the data for the database. Each field is asked to provide its current value in a data type that can be written to the database.
Most fields require no data preparation. Simple data types, such as integers and strings, are ‘ready to write’ as a Python object. However, more complex data types often require some modification.
For example, DateField fields use a Python datetime object to store data. Databases don’t store datetime objects, so the field value must be converted into an ISO-compliant date string for insertion into the database.
4) Insert the data into the database. The pre-processed, prepared data is then composed into an SQL statement for insertion into the database.
5) Emit a post-save signal. The signal django.db.models.signals.post_save is sent, allowing any functions listening for that signal to take some customized action.
Let me note that the behaviour you're receiving is possible if you've applied #transaction.commit_on_success decorator to your view, though, I don't see it in your code.
More on transactions: https://docs.djangoproject.com/en/1.5/topics/db/transactions/
EDIT:
I have added [MVC] and [design-patterns] tags to expand the audience for this question as it is more of a generic programming question than something that has direclty to do with Python or SQLalchemy. It applies to all applications with business logic and an ORM.
The basic question is if it is better to keep business logic in separate modules, or to add it to the classes that our ORM provides:
We have a flask/sqlalchemy project for which we have to setup a structure to work in. There are two valid opinions on how to set things up, and before the project really starts taking off we would like to make our minds up on one of them.
If any of you could give us some insights on which of the two would make more sense and why, and what the advantages/disadvantages would be, it would be greatly appreciated.
My example is an HTML letter that needs to be sent in bulk and/or displayed to a single user. The letter can have sections that display an invoice and/or a list of articles for the user it is addressed to.
Method 1:
Split the code into 3 tiers - 1st tier: web interface, 2nd tier: processing of the letter, 3rd tier: the models from the ORM (sqlalchemy).
The website will call a server side method in a class in the 2nd tier, the 2nd tier will loop through the users that need to get this letter and it will have internal methods that generate the HTML and replace some generic fields in the letter, with information for the current user. It also has internal methods to generate an invoice or a list of articles to be placed in the letter.
In this method, the 3rd tier is only used for fetching data from the database and perhaps some database related logic like generating a full name from a users' first name and last name. The 2nd tier performs most of the work.
Method 2:
Split the code into the same three tiers, but only perform the loop through the collection of users in the 2nd tier.
The methods for generating HTML, invoices and lists of articles are all added as methods to the model definitions in tier 3 that the ORM provides. The 2nd tier performs the loop, but the actual functionality is enclosed in the model classes in the 3rd tier.
We concluded that both methods could work, and both have pros and cons:
Method 1:
separates business logic completely from database access
prevents that importing an ORM model also imports a lot of methods/functionality that we might not need, also keeps the code for the model classes more compact.
might be easier to use when mocking out ORM models for testing
Method 2:
seems to be in line with the way Django does things in Python
allows simple access to methods: when a model instance is present, any function it
performs can be immediately called. (in my example: when I have a letter-instance available, I can directly call a method on it that generates the HTML for that letter)
you can pass instances around, having all appropriate methods at hand.
Normally, you use the MVC pattern for this kind of stuff, but most web frameworks in python have dropped the "Controller" part for since they believe that it is an unnecessary component. In my development I have realized, that this is somewhat true: I can live without it. That would leave you with two layers: The view and the model.
The question is where to put business logic now. In a practical sense, there are two ways of doing this, at least two ways in which I am confrontet with where to put logic:
Create special internal view methods that handle logic, that might be needed in more than one view, e.g. _process_list_data
Create functions that are related to a model, but not directly tied to a single instance inside a corresponding model module, e.g. check_login.
To elaborate: I use the first one for strictly display-related methods, i.e. they are somehow concerned with processing data for displaying purposes. My above example, _process_list_data lives inside a view class (which groups methods by purpose), but could also be a normal function in a module. It recieves some parameters, e.g. the data list and somehow formats it (for example it may add additional view parameters so the template can have less logic). It then returns the data set to the original view function which can either pass it along or process it further.
The second one is used for most other logic which I like to keep out of my direct view code for easier testing. My example of check_login does this: It is a function that is not directly tied to display output as its purpose is to check the users login credentials and decide to either return a user or report a login failure (by throwing an exception, return False or returning None). However, this functionality is not directly tied to a model either, so it cannot live inside an ORM class (well it could be a staticmethod for the User object). Instead it is just a function inside a module (remember, this is Python, you should use the simplest approach available, and functions are there for something)
To sum this up: Display logic in the view, all the other stuff in the model, since most logic is somehow tied to specific models. And if it is not, create a new module or package just for logic of this kind. This could be a separate module or even a package. For example, I often create a util module/package for helper functions, that are not directly tied for any view, model or else, for example a function to format dates that is called from the template but contains so much python could it would be ugly being defined inside a template.
Now we bring this logic to your task: Processing/Creation of letters. Since I don't know exactly what processing needs to be done, I can only give general recommendations based on my assumptions.
Let's say you have some data and want to bring it into a letter. So for example you have a list of articles and a costumer who bought these articles. In that case, you already have the data. The only thing that may need to be done before passing it to the template is reformatting it in such a way that the template can easily use it. For example it may be desired to order the purchased articles, for example by the amount, the price or the article number. This is something that is independent of the model, the order is now only display related (you could have specified the order already in your database query, but let's assume you didn't). In this case, this is an operation your view would do, so your template has the data ready formatted to be displayed.
Now let's say you want to get the data to create a specifc letter, for example a list of articles the user bough over time, together with the date when they were bought and other details. This would be the model's job, e.g. create a query, fetch the data and make sure it is has all the properties required for this specifc task.
Let's say in both cases you with to retrieve a price for the product and that price is determined by a base value and some percentages based on other properties: This would make sense as a model method, as it operates on a single product or order instance. You would then pass the model to the template and call the price method inside it. But you might as well reformat it in such a way, that the call is made already in the view and the template only gets tuples or dictionaries. This would make it easier to pass the same data out as an API (see below) but it might not necessarily be the easiest/best way.
A good rule for this decision is to ask yourself If I were to provide a JSON API additionally to my standard view, how would I need to modify my code to be as DRY as possible?. If theoretical is not enough at the start, build some APIs for the templates and see where you need to change things to the API makes sense next to the views themselves. You may never use this API and so it does not need to be perfect, but it can help you figure out how to structure your code. However, as you saw above, this doesn't necessarily mean that you should do preprocessing of the data in such a way that you only return things that can be turned into JSON, instead you might want to make some JSON specifc formatting for the API view.
So I went on a little longer than I intended, but I wanted to provide some examples to you because that is what I missed when I started and found out those things via trial and error.
At my work, we use Oracle for our database. Which works great. I am not the main db admin, but I do work with it. One thing I like is that the DB has a built in logic layer using PL/SQL which ca handle logic related to saving the data and retrieve it. I really like this because it allows our MVC application (PHP/Zend Framework) to be lighter, and makes it easier to tie in another platform into the data, such as desktop or mobile.
Although, I have a personal project where I want to use couchdb or mongodb, and I want to try and accomplish a similar goal. outside of the mvc/framework, I want to have an API layer that the main applications talk to. they dont actually talk directly to the database. They specify the design document (couchdb) or something similar for mongo, to get the results. And that API layer will validate the incoming data and make sure that data itself is saved and updated properly. Such as saving a new user, in the framework I only need to send a json obejct with the keys/values that need to be saved and the api layer saves the data in the proper places where needed.
This API would probably have a UI, but only for administrative purposes and to make my life easier. In general it will always reply with json strings, or pre-rendered/cached html in some cases. Since each api layer would be specific to the application anyways.
I was wondering if anyone has done anything like this, or had any tips on nethods I could accomplish this. I am currently looking to write my application in python, and the front end will likely be something like Angularjs. Although I am also looking at node.js for a back end.
We do this exact thing at my current job. We have MongoDB on the back end, a RESTful API on top of it and then PHP/Zend on the front end.
Most of our data is read only, so we import that data into MongoDB and then the RESTful API (in Java) just serves it up.
Some things to think about with this approach:
Write generic sorting/paging logic in your API. You'll need this for lists of data. The user can pass in things like http://yourapi.com/entity/1?pageSize=10&page=3.
Make sure to create appropriate indexes in Mongo to match what people will query on. Imagine you are storing users. Make an index in Mongo on the user id field, or just use the _id field that is already indexed in all your calls.
Make sure to include all relevant data in a given document. Mongo doesn't do joins like you're used to in Oracle. Just keep in mind modeling data is very different with a document database.
You seem to want to write a layer (the middle tier API) that is database agnostic. That's a good goal. Just be careful not to let Mongo specific terminology creep into your exposed API. Mongo has specific operators/concepts that you'll need to mask with more generic terms. For example, they have a $set operator. Don't expose that directly.
Finally after having a decent amount of experience with CouchDB and Mongo, I'd definitely go with Mongo.