Model.objects.only("columnname") doesn't work. It shows me everything - python

I have a model called Theme. It has a lot of columns, but I need to retrieve only the field called "name", so I did this:
Theme.objects.only("name")
But it doesn't work, it is still retrieving all the columns.
PD: I don't want to use values() because it returns only a python dictionary. I need to return a set of model instances, to access to its attributes and methods.

Using only or its counterpart defer does not prevent accessing the deferred attributes. It only delays retrieval of said attributes until they are accessed. So take the following:
for theme in Theme.objects.all():
print theme.name
print theme.other_attribute
This will execute a single query when the loop starts. Now consider the following:
for theme in Theme.objects.only('name'):
print theme.name
print theme.other_attribute
In this case, the other_attribute is not loaded in the initial query at the start of the loop. However, it is added to the model's list of deferred attributes. When you try to access it, another query is executed to retrieve the value of other_attribute. In the second case, a total of n+1 queries is executed for n Theme objects.
The only and defer methods should only ever be used in advanced use-cases, after the need for optimization arises, and after proper analysing of your code. Even then, there are often workarounds that work better than deferring fields. Please read the note at the bottom of the defer documentation.

If what you want is a single column, I think what you are looking for is .values() instead of .only.

Related

Django ORM objects.get() Related Questin: Is it legit to use .get() as if it's .filter()?

I often see that, regardless of the model, people often use Model.objects.get(id=id) or .get(product_name=product_name) or .get(cart=my_cart) -- but I now see a piece of code that is using .get() like it's a filter such as .get(product=product, cart=my_cart), is this going to work as intended?
.get() is used to only return one record, as opposed to .filter() which returns a set of records. You can use as many criteria as you like in order to positively identify that one record.
An example might be:
the_batman = Movie.objects.get(category = "superhero", lead__full_name="Robert Pattinson")
In this case, either criteria alone will produce a set of many movies (and thus error out in a .get() request), but in combination they will only produce one, and so is working as intended.

SQLAlchemy add vs add_all

I'm trying to figure out when to use session.add and when to use session.add_all with SQLAlchemy.
Specifically, I don't understand the downsides of using add_all. It can do everything that add can do, so why not just always use it? There is no mention of this in the SQLalchemy documentation.
If you only have one new record to add, then use sqlalchemy.orm.session.Session.add() but if you have multiple records then use sqlalchemy.orm.session.Session.add_all(). There's not really a significant difference, except the API of the first method is for a single instance whereas the second is for multiple instances. Is that a big difference? No. It's just convenience.
I was wondering about the same and as mentioned by others, there is no real difference. However, I would like to add that using add in a loop, instead of using add_all allows you to be more fine grained regarding exception handling. Passing a list of mapped class instances to add_all will cause a rollback for all instances if, for example, one of these objects violates a constraint (e.g., unique). I prefer to decouple my data logic from my service logic and decide what to do with instances not stored in the service layer by returning them from my data layer. However, I think it depends on how you are handling exceptions.

Are query objects in SqlAlchemy immutable?

Let's say I got a query like this ...
baseQuery = MyDbObj.query.filter_by(someProp='foo')
If, at a later point, I extend that query with something else (let's say, another filter) ...
derivedQuery = baseQuery.filter_by(anotherProp='bar')
will this result in the original query be modified, internally, or is a new query instance created?
Background: My use case is that I got multiple cases that only differ in one filter. Right now there is a ton of copy pasted query code (not my fault, I inherited this codebase) which I am cleaning up. For the cases where only one query is ultimately executed, I don't care if the original query gets modified. However I also have cases where two queries are executed, so here it matters that I can extend two queries from a base-query without them interfering with each other.
Though maybe a solution here could be to do that filtering in python itself, and not making two queries against the DB in the first place (I will keep that as a 2nd option).
SQLAlchemy creates a copy when filtering. So when you do
derivedQuery = baseQuery.filter_by(anotherProp='bar')
then derivedQuery is a copy of baseQuery with the filter applied. See the docs for more details.

sqlalchemy automatically extend query or update or insert upon table definition

in my app I have a mixin that defines 2 fields like start_date and end_date. I've added this mixin to all table declarations which require these fields.
I've also defined a function that returns filters (conditions) to test a timestamp (e.g. now) to be >= start_date and < end_date. Currently I'm manually adding these filters whenever I need to query a table with these fields.
However sometimes me or my colleagues forget to add the filters, and I wonder whether it is possible to automatically extend any query on such a table. Like e.g. an additional function in the mixin that is invoked by SQLalchemy whenever it "compiles" the statement. I'm using 'compile' only as an example here, actually I don't know when or how to best do that.
Any idea how to achieve this?
In case it works for SELECT, does it also work for INSERT and UPDATE?
thanks a lot for your help
Juergen
Take a look at this example. You can change the criteria expressed in the private method to refer to your start and end dates.
Note that this query will be less efficient because it overrides the get method to bypass the identity map.
I'm not sure what the enable_assertions false call does; I'd recommend understanding that before proceeding.
I tried extending Query but had a hard time. Eventually (and unfortunately) I moved back to my previous approach of little helper functions returning filters and applying them to queries.
I still wish I would find an approach that automatically adds certain filters if a table (Base) has certain columns.
Juergen

Is fetch() better than list(Model.all().run()) for returning a list from a datastore query?

Using Google App Engine Python 2.7 Query Class -
I need to produce a list of results that I pass to my django template. There are two ways I've found to do this.
Use fetch, however in the docs it says that fetch should almost never be used. https://developers.google.com/appengine/docs/python/datastore/queryclass#Query_fetch
Use run() and then wrap it into list() thereby creating the list object.
Is one preferable to the other in terms of memory usage? Is there another way I could be doing this?
The key here is why fetch “should almost never be used”. The documentation says that fetch will get all the results, therefore having to keep all of them in memory at the same time. If the data you get is big, you will need lots of memory.
You say you can wrap run inside list. Sure, you can do that, but you will hit exactly the same problem—list will force all the elements into memory. So, this solution is actually discouraged on the same basis as using fetch.
Now, you could say: so what should I do? The answer is: in most cases you can deal with elements of your data one by one, without keeping them all in memory at the same time. For example, if all you need is to put the result data into a django template, and you know that it will be used at most once in your template, then the django template will happily take any iterator—so you can pass the run call result directly without wrapping it into list.
Similarly, if you need to do some processing, for example go over the results to find the element with the highest price or ranking, or whatever, you can just iterate over the result of run.
But if your usage requires having all the elements in memory (e.g.: your django template uses the data from the query several times), then you have a case where fetch or list(run(…)) actually has sense. In the end—this is just the typical trade-off: if you need for your application to apply an algorithm which requires all the data in memory, you need to pay for it by using up memory. So, you can either redesign your algorithms and usage to work with an iterator, or use fetch and pay for it by longer processing times and higher memory usage. Google of course encourages you to do the first thing. And this is what “should almost never be used” actually means.

Categories