Django ORM exclude records using an array of dictionaries - python

I have code that returns all entries from a table:
all_entries = Entry.objects.all()
and I have the following array:
exclusion_list = [
    {
        "username": "Tom",
        "start_date": "01/03/2019",
        "end_date": "29/02/2020",
    },
    {
        "username": "Mark",
        "start_date": "01/02/2020",
        "end_date": "29/02/2020",
    },
    {
        "username": "Pam",
        "start_date": "01/03/2019",
        "end_date": "29/02/2020",
    }
]
I want to exclude all of Tom's records from "01/03/2019" to "29/02/2020", all of Mark's records from "01/02/2020" to "29/02/2020" and all of Pam's records from "01/03/2019" to "29/02/2020".
I want to do that in a loop, so I believe I should do something like:
for entry in all_entries:
    filtered_entry = all_entries.exclude(username=entry.username).filter(date__gte=entry.start_date, date__lte=entry.end_date)
Is this approach correct? I am new to Django ORM. Is there a better and more efficient solution?
Thank you for your help

Yes, you can do this with a loop.
This results in a query whose WHERE-clause gets extended every cycle of your loop. But to do this, you have to use the filtered queryset of your previous cycle:
filtered_entry = all_entries
for exclude_entry in exclusion_list:
    # exclusion_list holds dicts, so read the values by key, not by attribute
    filtered_entry = filtered_entry.exclude(
        username=exclude_entry["username"], date__gte=exclude_entry["start_date"], date__lte=exclude_entry["end_date"])
Notes
Each loop cycle reuses the queryset from the previous cycle, so the results are narrowed further every time.
To combine multiple criteria with AND, pass them as multiple keyword arguments to exclude() (see the docs [here][1]).
Be aware that this can produce a large WHERE clause, and your database may limit how large a query can get.
So if your exclusion_list is not too big, I think you can use this without concerns.
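One detail the loop above depends on: the dates in the question are written as day/month/year strings, while Django's date lookups generally expect date objects (or ISO-formatted strings). The following is a minimal sketch of parsing the list first, assuming the "%d/%m/%Y" format; the helper names parse_exclusions and is_excluded are hypothetical, and is_excluded only mirrors the exclude() condition in plain Python so the logic can be checked without a database.

```python
from datetime import date, datetime

# Assumption: dates are written as day/month/year, as in the question.
DATE_FORMAT = "%d/%m/%Y"


def parse_exclusions(exclusion_list):
    """Return a copy of exclusion_list with the date strings parsed into date objects."""
    parsed = []
    for item in exclusion_list:
        parsed.append({
            "username": item["username"],
            "start_date": datetime.strptime(item["start_date"], DATE_FORMAT).date(),
            "end_date": datetime.strptime(item["end_date"], DATE_FORMAT).date(),
        })
    return parsed


def is_excluded(username, day, exclusions):
    """Plain-Python mirror of the exclude() condition: same user AND date inside the range."""
    return any(
        e["username"] == username and e["start_date"] <= day <= e["end_date"]
        for e in exclusions
    )


exclusions = parse_exclusions([
    {"username": "Tom", "start_date": "01/03/2019", "end_date": "29/02/2020"},
    {"username": "Mark", "start_date": "01/02/2020", "end_date": "29/02/2020"},
])
```

The parsed dictionaries can then be passed to exclude() in the loop in place of the raw strings.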
If your exclusion_list grows, the best option would be to save it in the database itself. The ORM can then generate subqueries instead of passing single values. Just an example (this one excludes by username only and ignores the date range):
exclusion_query = ExclusionEntry.objects.all().values('username')
filtered = all_entries.exclude(username__in=exclusion_query)
[1]: https://docs.djangoproject.com/en/3.1/topics/db/queries/#retrieving-specific-objects-with-filters

Related

Get list value by comparing values

I have a list like this:
data.append(
    {
        "type": type,
        "description": description,
        "amount": 1,
    }
)
Every time there is a new object I want to check if there already is an entry in the list with the same description. If there is, I need to add 1 to the amount.
What is the most efficient way to do this? Is going through all the entries the only way?
I suggest making data a dict and using the description as a key.
If you are concerned about the efficiency of using the string as a key, read this: efficiency of long (str) keys in python dictionary.
Example:
data = {}
while loop():  # your code here
    existing = data.get(description)
    if existing is None:
        data[description] = {
            "type": type,
            "description": description,
            "amount": 1,
        }
    else:
        existing["amount"] += 1
In either case you should first benchmark the two solutions (the other one being the iterative approach) before reaching any conclusions about efficiency.
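A rough way to run such a benchmark is sketched below with the standard timeit module. The workload (2000 items over 200 distinct descriptions) is made up for illustration, the function names are hypothetical, and the "type" key is left out to keep the sketch short; timings will vary by machine, so none are shown.

```python
import timeit


def count_with_list(descriptions):
    """Iterative approach: scan the list for an entry with the same description."""
    data = []
    for description in descriptions:
        for entry in data:
            if entry["description"] == description:
                entry["amount"] += 1
                break
        else:
            # No existing entry was found, so add a new one.
            data.append({"description": description, "amount": 1})
    return data


def count_with_dict(descriptions):
    """Dict approach: use the description as the key."""
    data = {}
    for description in descriptions:
        existing = data.get(description)
        if existing is None:
            data[description] = {"description": description, "amount": 1}
        else:
            existing["amount"] += 1
    return data


# Hypothetical workload: 2000 items spread over 200 distinct descriptions.
sample = [f"item-{i % 200}" for i in range(2000)]

list_time = timeit.timeit(lambda: count_with_list(sample), number=5)
dict_time = timeit.timeit(lambda: count_with_dict(sample), number=5)
print(f"list scan: {list_time:.4f}s, dict lookup: {dict_time:.4f}s")
```

Replacing sample with data shaped like your real input gives a more meaningful comparison.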

Update DynamoDB nested map within a set using conditional

I'm trying to update an item in DynamoDB that has somewhat complicated(?) data structure.
Item:
{
    'user_id': 'abc123',
    'groups': [
        {
            'group_id': 'Group1',
            'games_won': [],
            'games_lost': []
        },
        {
            'group_id': 'Group2',
            'games_won': [],
            'games_lost': []
        },
    ]
}
I am trying to append a string to games_won on a specific group_id. I am trying to use a conditional to avoid multiple db queries but I can't seem to figure out how to iterate over groups in my conditional.
Basically, I want to do this:
for g in groups:
    if g.group_id == 'Group2':
        g.games_won.append('game12345')
Sorry for the complicated title. I'm a bit new to DynamoDB and NoSQL in general.
You could read the 'groups' attribute, change the data outside of the query, and write the whole thing back when you're done. That way, no matter how many groups you change, you always have just one read and one write action. The number of read or write capacity units consumed is of course related to the size of your 'groups' attribute.
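The "change the data outside of the query" step could look roughly like this. It is a sketch on a plain dictionary shaped like the item in the question; the function name add_game_won is hypothetical, and the actual read and write calls to DynamoDB are deliberately left out.

```python
def add_game_won(item, group_id, game_id):
    """Append game_id to games_won of the group with the given group_id.

    Returns True if a matching group was found and updated, False otherwise.
    """
    for group in item["groups"]:
        if group["group_id"] == group_id:
            group["games_won"].append(game_id)
            return True
    return False


# In practice `item` would come from a single read of the table.
item = {
    "user_id": "abc123",
    "groups": [
        {"group_id": "Group1", "games_won": [], "games_lost": []},
        {"group_id": "Group2", "games_won": [], "games_lost": []},
    ],
}

updated = add_game_won(item, "Group2", "game12345")
# If `updated` is True, the whole 'groups' attribute would be written back once.
```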

Unwind multiple arrays from different structure in document

I have these two types of structures in my collection:
{
    _id: "date1",
    users: [{"user": "123", ...}, {"user": "456", ...}]
}
and
{
    _id: "date2",
    points: [{"point": "1234", ...}, {"point": "5678", ...}]
}
I need to make an aggregation that returns a list of these documents, each with only the specific point or user information, and that supports skip and limit. Something like:
[
    {_id: "date1", user: {"user": "123", ...}},
    {_id: "date2", point: {"point": "1234", ...}},
]
I'm new to Mongo. This is what I have tried; do you have any recommendations?
collection.aggregate([
    {"$unwind": "$users"},
    {"$unwind": "$points"},
    {"$match": {"$or": [
        {'users.user': an_user},
        {'points.point': a_point}]}},
    {"$sort": {"_id": -1}},
    {"$skip": 10},
    {"$limit": 10}
])
Each result should carry the information of one specific user or one specific point, depending on which key is in that document.
Given your provided document example you may be able to just utilise db.collection.find(), for example:
db.collection.find({"users.user": a_user}, {"users.$":1});
db.collection.find({"points.point": a_point}, {"points.$":1});
Depending on your use case, this may not be ideal because you're executing find() twice.
This would return a list of documents with their _id. The above queries are equivalent to saying:
Find all documents in the collection
where in array field users contain a user 'a_user'
OR in array field points contain a point 'a_point'
See also MongoDB Indexes
Having said the above, you should really reconsider your document schema. Depending on your use case, you may find it difficult to query the data later on, and query performance may suffer. Please review MongoDB Data Models for more information on how to design your schema.
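If the two find() calls are used, their results still need to be combined, sorted and paged in application code. The following is a sketch of that step only, using plain lists in place of the query results; the function name merge_results and the sample data are made up for illustration, and it assumes each returned document carries exactly one matched element in its users or points array.

```python
def merge_results(user_docs, point_docs, skip=0, limit=10):
    """Combine both result sets into the shape asked for in the question."""
    merged = []
    for doc in user_docs:
        merged.append({"_id": doc["_id"], "user": doc["users"][0]})
    for doc in point_docs:
        merged.append({"_id": doc["_id"], "point": doc["points"][0]})
    # Newest first, mirroring {"$sort": {"_id": -1}} from the question.
    merged.sort(key=lambda d: d["_id"], reverse=True)
    return merged[skip:skip + limit]


# Stand-ins for what the two queries might return.
user_docs = [{"_id": "date1", "users": [{"user": "123"}]}]
point_docs = [{"_id": "date2", "points": [{"point": "1234"}]}]

page = merge_results(user_docs, point_docs)
```

Paging in application code means both result sets are fetched in full first, which may be too costly for large collections.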

how to fetch data from json schema? error shown-TypeError: string indices must be integers

I have a JSON response from an API like this:
{
    "meta": {
        "code": 200
    },
    "data": {
        "username": "luxury_mpan",
        "bio": "Recruitment Agents👑👑👑👑\nThe most powerful manufacturers,\nwe have the best quality.\n📱Wechat:13255996580💜💜\n📱Whatsapp:+8618820784535",
        "website": "",
        "profile_picture": "https://scontent.cdninstagram.com/t51.2885-19/10895140_395629273936966_528329141_a.jpg",
        "full_name": "Mpan",
        "counts": {
            "media": 17774,
            "followed_by": 7982,
            "follows": 7264
        },
        "id": "1552277710"
    }
}
I want to fetch the values of "media", "followed_by" and "follows" and store them in three different lists, as shown in the code below:
for r in range(1,5):
    var=r,st.cell(row=r,column=3).value
    xy=var[1]
    ij=str(xy)
    myopener=Myopener()
    url=myopener.open('https://api.instagram.com/v1/users/'+ij+'/?access_token=641567093.1fb234f.a0ffbe574e844e1c818145097050cf33')
    beta=json.load(url)
    for item in beta['data']:
        list1.append(item['media'])
        list2.append(item['followed_by'])
        list3.append(item['follows'])
When I run it, it shows the error TypeError: string indices must be integers
How would my loop change in order to fetch the above mentioned values?
Also, asking out of curiosity: is there any way to fetch the WhatsApp number from the "bio" key in the data dictionary?
I have referred questions similar to this and still did not get my answer. Please help!
beta['data'] is a dictionary object. When you iterate over it with for item in beta['data'], the values taken by item will be the keys of the dictionary: "username", "bio", etc.
So then when you ask for, e.g., item['media'] it's like asking for "username"['media'], which of course doesn't make any sense.
It isn't quite clear what it is that you want: is it just the stuff inside counts? If so, then instead of for item in beta['data']: you could just say item = beta['data']['counts'], and then item['media'] etc. will be the values you want.
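As a small sketch of that change, here is the lookup run against a trimmed copy of the response from the question. The variable response_text is a stand-in for what the API returns; in the original loop the parsed value comes from json.load(url) instead.

```python
import json

# Trimmed copy of the response shown in the question.
response_text = """
{
    "meta": {"code": 200},
    "data": {
        "username": "luxury_mpan",
        "counts": {"media": 17774, "followed_by": 7982, "follows": 7264},
        "id": "1552277710"
    }
}
"""

list1, list2, list3 = [], [], []

beta = json.loads(response_text)
# Index into the nested dict directly instead of iterating over its keys.
counts = beta["data"]["counts"]
list1.append(counts["media"])
list2.append(counts["followed_by"])
list3.append(counts["follows"])
```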
As to your secondary question: I suggest looking into regular expressions.
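For instance, a regular expression along these lines could pull the number out of the bio. It is only a sketch: it assumes the number always follows the word "Whatsapp" and a colon, as in the sample, and the pattern and the function name extract_whatsapp are illustrative. The emoji from the original bio are omitted here for readability.

```python
import re

# Bio text from the question, with the emoji left out.
bio = (
    "Recruitment Agents\nThe most powerful manufacturers,\n"
    "we have the best quality.\nWechat:13255996580\nWhatsapp:+8618820784535"
)

# "whatsapp", optional spaces around a colon, then an optional "+" and digits.
WHATSAPP_RE = re.compile(r"whatsapp\s*:\s*(\+?\d+)", re.IGNORECASE)


def extract_whatsapp(text):
    """Return the number that follows 'Whatsapp:' in text, or None if absent."""
    match = WHATSAPP_RE.search(text)
    return match.group(1) if match else None


number = extract_whatsapp(bio)
```

Bios that format the number differently (spaces, dashes, a different label) would need a looser pattern.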

How can I re-assemble list after using aggregate and $unwind in mongo?

I'm building an aggregate pipeline as follows
pipeline = [
    {"$unwind": "$categories"}
]
if len(cat_comp) > 0:
    pipeline.append({"$match": {"categories": {"$in": cat_comp}}})
result = mongo.db.xxx.aggregate(pipeline)['result']
The question is: after the aggregation, how can I re-assemble the list of categories in the results? Because of the $unwind, the categories field of each returned record corresponds to just one item of the original list. How can I rebuild the results so that I can match ($match) against a list of possibilities but still recover the original list of categories?
It has been suggested that I try:
pipeline.append({"$group": {"categories": {"$push": "$categories"}}})
which I have modified to:
pipeline.append({"$group": {"_id": "anything", "categories": {"$push": "$categories"}}})
However, now I get only one record back, whose categories field is one massive list built from all results. So what I would like to do is take a document like this:
{
    "_id": 45666,
    "categories": ['Fiction', 'Biography'],
    "other": "sss"
}
and search it against a user-supplied list cat_list = ['Anything', ...] using regular expressions like this:
cat_comp = [re.compile(cat, re.IGNORECASE) for cat in cat_list]
In the end, what is happening with aggregate(pipeline) is that I am losing "categories" as a list because of the $unwind. Now, how can I perform the query over the input data but return the matching records with categories still as a list?
I'm also trying:
pipeline.append({"$group": {"_id": "$_id", "categories": { "$addToSet": "$categories" } } })
This usefully returns a list of records with categories as a list. However, how can I see the rest of the record? I can only see _id and categories.
You need to use a $group step in the pipeline with a $push to re-build the lists:
pipeline.append({"$group": {"categories": {"$push": "$categories"},"_id":"$_id","other": {"$first":"$other"}}})
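Put together with the stages from the question, the whole pipeline might be assembled as in the sketch below. The function name build_pipeline is hypothetical, and the code only builds the list of stages; running it against a collection is left out.

```python
import re


def build_pipeline(cat_list):
    """Build the unwind / match / group stages described above."""
    pipeline = [{"$unwind": "$categories"}]

    # Case-insensitive patterns, as in the question.
    cat_comp = [re.compile(cat, re.IGNORECASE) for cat in cat_list]
    if cat_comp:
        pipeline.append({"$match": {"categories": {"$in": cat_comp}}})

    # Regroup by the original document _id and carry other fields along.
    pipeline.append({"$group": {
        "_id": "$_id",
        "categories": {"$push": "$categories"},
        "other": {"$first": "$other"},
    }})
    return pipeline


pipeline = build_pipeline(["fiction"])
```

Note that because $match runs on the unwound documents, the rebuilt categories list holds only the categories that matched, and every additional field has to be listed in the $group stage the same way as "other".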
