How to grab one random item from a database in Django/postgreSQL?

So I have database.objects.all() and database.objects.get('name'), but how would I go about getting one random item from the database? I'm having trouble figuring out how to select one random item.

Selecting a random element from a list of all database objects isn't a good solution, as retrieving every row of the table can have a big impact on performance. Neither is using order_by('?'), which the Django documentation warns can be expensive and slow.
The best solution should be to retrieve an element with a random index:
import random
random_idx = random.randint(0, Model.objects.count() - 1)
random_obj = Model.objects.all()[random_idx]
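The same idea can be wrapped in a small helper that also covers the empty-table case, where random.randint(0, -1) would raise ValueError. This is only a sketch: the name get_random_object is made up for illustration, and it assumes it is handed a Django queryset (anything with count() and integer indexing behaves the same way).

```python
import random


def get_random_object(queryset):
    """Return one random row from the queryset, or None if it is empty."""
    count = queryset.count()  # one COUNT query, no rows are loaded
    if count == 0:
        return None
    random_idx = random.randint(0, count - 1)  # randint is inclusive on both ends
    return queryset[random_idx]  # fetches only the row at that offset
```

It would be called as get_random_object(Model.objects.all()). A row deleted between the two queries can still lead to an IndexError, so callers that care may want to retry.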

Aamir's solution will select all objects before discarding all but one. This is extremely wasteful and, besides, this sort of calculation should be done in the database.
model.objects.all().order_by('?')[0]
Read more here: https://docs.djangoproject.com/en/dev/ref/models/querysets/#order-by
Edit: lazerscience's answer is indeed faster, as shown here.

I would do it slightly differently. Querysets are lazy anyway in Django.
import random
def get_my_random_object():
    # random.choice() calls len() on the queryset, which loads every row
    obj = random.choice(model.objects.all())
    return obj
https://docs.djangoproject.com/en/dev/topics/db/queries/#querysets-are-lazy
https://docs.djangoproject.com/en/dev/ref/models/querysets/#when-querysets-are-evaluated

Related

Efficient way of dividing a querySet with a filter, while keeping all data?

I have a 'Parts' model, and these parts are either linked to a 'Device' model or not yet. The actual "link" is done via more than just one ForeignKey, i.e. I have to go through 3 or 4 Models all linked between each other with ForeignKeys to finally get the data I want.
My question is: what is the most efficient way of getting both the linked and non-linked parts?
Right now, I am getting all parts and simply outputting that, but I would like a little separation:
allParts = Parts.objects.all()
I know I could do something similar to this:
allParts = Parts.objects.all()
linkedParts = allParts.filter(...device_id=id)
nonLinkedParts = allParts.exclude(...device_id__in=[o.id for o in linkedParts])
But is that really the most efficient solution? I feel like there would be a better way, but I have not yet found anything in the docs about it.
Just to clarify, there are only linked, and non-linked parts. These are mutually exclusive and exhaustive.
Thank you very much

If you are only interested in obtaining the elements, for example to iterate over them, you can work with two lists:
allParts = Parts.objects.all()
linkedParts = []
nonLinkedParts = []
for part in allParts:
    if part.device_id == id:
        linkedParts.append(part)
    else:
        nonLinkedParts.append(part)
Since these are lists, you can no longer (efficiently) filter further, or order by a specific condition. If you want to order by a certain field, you should do that already in the allParts database query.
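The loop above is a single-pass partition, and as a sketch it can be written as a reusable function. The name split_parts and its arguments are hypothetical, and it assumes each part exposes a device_id attribute as in the loop above. The SimpleNamespace objects are stand-ins for model instances, only there to show the call.

```python
from types import SimpleNamespace


def split_parts(parts, device_id):
    """Split parts into (linked, non_linked) in one pass over the iterable."""
    linked, non_linked = [], []
    for part in parts:
        if part.device_id == device_id:
            linked.append(part)
        else:
            non_linked.append(part)
    return linked, non_linked


# Stand-in objects instead of real model instances (assumption for the demo).
parts = [
    SimpleNamespace(name="a", device_id=1),
    SimpleNamespace(name="b", device_id=None),
    SimpleNamespace(name="c", device_id=1),
]
linked, non_linked = split_parts(parts, device_id=1)
print([p.name for p in linked])      # ['a', 'c']
print([p.name for p in non_linked])  # ['b']
```

Passing a queryset as parts evaluates it once, so the table is read with a single query.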

How do I do a Simple Scan in DynamoDB?

Trying to understand the docs has been very difficult in relation to trying to understand how to do a simple scan in AWS DynamoDB.
Can someone please explain to me in simple terms how to do a basic scan?

What is a Scan?
The Scan operation returns one or more items and item attributes by accessing every item in a table or a secondary index.
Explanation
A scan operation in its simplest form looks through everything in your table. Most of the time, you probably don't need the whole table to be returned or even looked at. As a result, many decide to use filters to cut down on what is looked through, processed and returned.
How do I Scan?
Here is a simple scan operation in python. Even if you aren't using python, this guide will be very helpful.
# Table = 'grades'
# Year_levels = {0-12}
# Sort_key = overall_rank
# Attribute_categories = math, english, science | out of 100
import boto3
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table("grades")
result = table.scan(
    FilterExpression='math > :math AND english > :eng',
    ExpressionAttributeValues={':math': 80, ':eng': 70},
    Select='SPECIFIC_ATTRIBUTES',
    ProjectionExpression='year_level,overall_rank,math,english',
    Limit=50  # This is the number of items to SCAN, not necessarily RETURN.
)
# return or print result
Explanation
FilterExpression and ExpressionAttributeValues. There are multiple ways of understanding how these work. One way is to see them as an item attribute value checker: for every item the scan goes through, the filter applied to its attributes must be true for the item to be returned, e.g. a math score above 80 and an english score above 70.
Select and ProjectionExpression. In technical terms, the way in which I explain this is incorrect; however, in practical terms this way of understanding holds up: you can see that there is a SECOND filter, not for the item, but for the ATTRIBUTES of the item that will be returned, e.g. I only want year_level, overall_rank, math and english to be returned, but not science.
Now if we combine the two we have an example: If an item is checked and matches the criteria placed upon it by the FilterExpression, it will be returned. HOWEVER, we only want SPECIFIC_ATTRIBUTES to be returned. At this point, the item will then be checked AGAIN against, this time, the Select criteria. The select criteria tells you what attributes FROM the item to return.
Limit is just the amount of items to check through, but not necessarily return.
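Because Limit caps the items examined rather than the items returned, a single call may return only part of the matches. A scan response that stopped early carries a LastEvaluatedKey, which can be passed back as ExclusiveStartKey to continue from that point. A minimal sketch of that loop, assuming table is a boto3 Table resource like the one above; the helper name scan_all is made up here:

```python
def scan_all(table, **scan_kwargs):
    """Keep calling table.scan() until DynamoDB reports no more pages."""
    items = []
    while True:
        page = table.scan(**scan_kwargs)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            # No LastEvaluatedKey means the scan reached the end of the table.
            return items
        # Resume the next scan right after the last item that was examined.
        scan_kwargs["ExclusiveStartKey"] = last_key
```

It would be called with the same keyword arguments as table.scan in the example above. Every page still reads (and is billed for) the items it examines, so this is best kept for small tables.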

How to delete first N items from queryset in django

I'm looking to delete only the first N results returned from a query in django. Following the django examples here which I found while reading this SO answer, I was able to limit the resulting set using the following code
m = Model.objects.all()[:N]
but attempting to delete it generates the following error
m.delete()
AssertionError: Cannot use 'limit' or 'offset' with delete.
Is there a way to accomplish this in django?

You cannot delete through a limit. Most databases do not support this.
You can however accomplish this in two steps, like:
Model.objects.filter(id__in=list(Model.objects.values_list('pk', flat=True)[:N])).delete()
We thus first retrieve the primary keys of the first N elements, and then use this in a .filter(..) part to delete those items in bulk.

There is no direct option for this, so you have to work around it. For example:
not_ideal = Model.objects.all()[N:].values_list("id", flat=True)
Model.objects.exclude(pk__in=list(not_ideal)).delete()
This way you select the objects you want to keep and delete everything except them.
You can use any field besides id, but id is unique and will help the query optimize.
Notice that the first line selects the items from N to the end, not the first N.

Try this.
Loop through all filtered objects:
deletable_objects = Model.objects.all()[:N]
for m in deletable_objects:
    m.delete()

You can loop through the queryset and apply delete method to the objects.
for obj in m:
    obj.delete()

Selecting all items individually in a list

I was wondering if it is possible to re-select each and every item in the rsList?
I am citing a simple example below, but I am looking at hundreds of items in the scene, so this is the simplest code I was able to come up with based on my limited knowledge of Python:
rsList = cmds.ls(type='resShdrSrf')
# Output: [u'pCube1_GenShdr', u'pPlane1_GenShdr', u'pSphere1_GenShdr']
I tried using the following cmds.select but it is taking my last selection (in memory) - pSphere1_GenShdr into account while forgetting the other 2 even though all three items are seen selected in the UI.
I tried using a list and append, but that does not seem to work either and the selection remains the same...
list = []
for item in rsList:
    list.append(item)
    cmds.select(item)
#cmds.select(list)
As such, will it be possible for me to perform a cmds.select on each of the items individually?

If you're just trying to select each item:
import pymel.core as pm
for i in pm.ls(sl=True):
    i.select()
But this should have no effect on your rendering.

I think for mine, it is a special case in which I would need to add in mm.eval("autoUpdateAttrEd;") for the first creation of my shader before I can duplicate.
Apparently I need this command in order to get it to work

Length of _QueryIterator

I'm trying to get the length of the result of the following query:
matchingTitles = db.GqlQuery("SELECT * FROM Post WHERE title=:1",title).run()
I tried doing this:
if(len(matchingTitles)>0):
But I get the following error:
TypeError: object of type '_QueryIterator' has no len()
I've been searching all over for the _QueryIterator object docs, but can't seem to find any. I instead just iterated over it and incremented a counter for each item in the set. Wondering if there was a better way...
Thanks!
EDIT
There's a better way to do this. Instead of running and then counting, you can simply do:
matchingTitles = db.GqlQuery("SELECT * FROM Post WHERE title=:1",title).count()
and it returns the number of entities.

This can take a lot of memory, but you could use itertools.tee:
https://docs.python.org/2/library/itertools.html#itertools.tee
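A rough sketch of the tee approach, using a plain iterator as a stand-in for the query result (the helper name count_and_keep is hypothetical). tee buffers every item that one copy has consumed and the other has not, which is where the memory cost comes from:

```python
import itertools


def count_and_keep(iterator):
    """Count the items of a one-shot iterator and return a fresh copy of it."""
    counting_copy, results = itertools.tee(iterator)
    count = sum(1 for _ in counting_copy)  # exhausts only the counting copy
    return count, results  # results still yields every item


count, results = count_and_keep(iter(["post1", "post2", "post3"]))
print(count)          # 3
print(list(results))  # ['post1', 'post2', 'post3']
```

The original iterator should not be touched after it has been handed to tee, otherwise the copies can miss items.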

For anyone that comes across this question actually looking for the length of a _QueryIterator, you can try:
len(list(matchingTitles)) # This will load all the results into memory before counting.
# OR
sum([1 for _ in matchingTitles])
As mentioned though - it's usually better / faster / cheaper to use the database's count functionality than loading all the records and iterating over them. There may be a reason you can't use that - in which case those two options are available.
