Elasticsearch DSL query with specific Output - python

I have a few objects in the database:
object 1, object 2, object 3, ..., object n
I currently filter them like this:
MyDocument.search().filter("match", is_active=True).sort('id').execute()
Output:
searchDocumentobject 1, searchDocumentobject 2, searchDocumentobject 3, ....
Now I need searchDocumentobject 2 to come last in the list.
The output should look like this:
searchDocumentobject 1, searchDocumentobject 3, ...., searchDocumentobject 2
Thank you

In MyModel, add a new method that returns 0 for the document you want to keep last and 1 for every other document.
class MyModel(models.Model):
    # Add new method here
    def get_rank(self):
        if self.id == 2:  # your condition here
            return 0  # return 0 to keep this document last
        return 1
Now you can use this method in MyDocument. Add a new field to MyDocument, which we'll use for sorting.
class MyDocument(Document):
    # Add new field here
    rank = fields.IntegerField(attr='get_rank')
Now you can query like this. Sorting by '-rank' (descending) puts the rank 1 documents first and the rank 0 document last; 'id' orders documents within the same rank:
MyDocument.search().filter("match", is_active=True).sort('-rank', 'id').execute()

You can achieve this behavior in your search request using function_score. The idea is that every other document keeps a score of 1 (the default), while document 2 gets a lower score; you then sort by "_score" and "id". Here is the DSL; try constructing the query from your Python API:
{
  "_source": ["id"],
  "query": {
    "function_score": {
      "query": {
        "bool": {
          //add your query here
          "must": [
            {
              "terms": {
                "id": [1, 2, 3, 70]
              }
            }
          ]
        }
      },
      "functions": [
        {
          "filter": {
            "term": {
              "id": 2
            }
          },
          "weight": 0.5
        }
      ]
    }
  },
  "sort": [
    "_score",
    {
      "id": {
        "order": "asc"
      }
    }
  ]
}
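The same request body can be written as a plain Python dict, which is one way to hand it to a Python client. This is only a sketch: `build_demote_query` and its parameters are hypothetical names, while the `id` field, the example ids, and the 0.5 weight are copied from the DSL above.

```python
def build_demote_query(ids, demoted_id, weight=0.5):
    """Build a function_score body that scores one document lower than the rest.

    Hypothetical helper: it only assembles the dict shown in the DSL above.
    """
    return {
        "_source": ["id"],
        "query": {
            "function_score": {
                # Replace this bool query with your own query.
                "query": {"bool": {"must": [{"terms": {"id": list(ids)}}]}},
                # Only the document matching the filter gets the lower weight.
                "functions": [
                    {"filter": {"term": {"id": demoted_id}}, "weight": weight}
                ],
            }
        },
        # Higher scores first, then ascending id within the same score.
        "sort": ["_score", {"id": {"order": "asc"}}],
    }


body = build_demote_query([1, 2, 3, 70], demoted_id=2)
```

Keeping the body in a small function makes it easy to change which document is demoted without editing the nested structure by hand.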
Also, as Yifeng stated in the comments, you can re-sort the results after you query ES.
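Re-sorting on the client side can be as small as a stable sort with a boolean key. A minimal sketch, assuming the hits are already sorted by id; plain dicts stand in for the search results here, and `move_to_end` is a hypothetical helper name.

```python
def move_to_end(hits, should_be_last):
    """Return hits with the matching items moved to the end.

    sorted() is stable and False sorts before True, so the remaining
    items keep their original relative order.
    """
    return sorted(hits, key=should_be_last)


# Stand-in data: real hits would come from the executed search.
hits = [{"id": 1}, {"id": 2}, {"id": 3}, {"id": 70}]
reordered = move_to_end(hits, lambda hit: hit["id"] == 2)
print([hit["id"] for hit in reordered])  # [1, 3, 70, 2]
```

This only reorders the page of results already fetched, so it fits best when the whole result set is retrieved in one request.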

Related

Pymongo aggregate pipeline

I would like to use an aggregate pipeline to get the most common value given another value.
How can I use an aggregate pipeline to find the most common StudentID for TeacherID 212?
I have been trying the code below, but it does not give the desired outcome.
pl = [
    '$project': {
        '_id': 1,
        'StudentId': 1,
        "TeacherID": 1,
        "$group": {
            "__id": 'TeacherID',
            "__id": {
                "$first": "StudentID",
            }
        }
    }
]
db.collection.aggregate(pl)

Demo - https://mongoplayground.net/p/ksay82IaGHs
Group by TeacherID and StudentID to count the occurrences of each combination, then $sort by occurrence in descending order.
db.collection.aggregate([
  { $group: { _id: { TeacherID: "$TeacherID", StudentID: "$StudentID" }, occurrence: { $sum: 1 } } },
  { $sort: { "occurrence": -1 } }
]);
Output
[
  {
    "_id": {
      "StudentID": 2,
      "TeacherID": 212
    },
    "occurrence": 3
  },
  {
    "_id": {
      "StudentID": 4,
      "TeacherID": 223
    },
    "occurrence": 1
  }, .....
]
If you want only the top record, append a $limit stage to the pipeline.
Demo - https://mongoplayground.net/p/zBsGdAOdYwy
{
  "$limit": 1
}
If you want to check a specific TeacherID, add a $match stage at the start of the pipeline.
Demo - https://mongoplayground.net/p/G2KIVcjtYII
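Put together for pymongo, the stages above could look like the list below. This is a sketch rather than the answer's exact demo: the field names follow the answer, the sample rows are hypothetical, and `most_common_student` is a pure-Python stand-in used only to check the logic without a database.

```python
from collections import Counter

TEACHER_ID = 212

# Stages intended for collection.aggregate(pipeline).
pipeline = [
    {"$match": {"TeacherID": TEACHER_ID}},                         # one teacher only
    {"$group": {"_id": "$StudentID", "occurrence": {"$sum": 1}}},  # count per student
    {"$sort": {"occurrence": -1}},                                 # most frequent first
    {"$limit": 1},                                                 # keep the top record
]


def most_common_student(docs, teacher_id):
    """Pure-Python equivalent of the pipeline: (StudentID, occurrence) or None."""
    counts = Counter(d["StudentID"] for d in docs if d["TeacherID"] == teacher_id)
    return counts.most_common(1)[0] if counts else None


# Hypothetical sample rows, not taken from the question.
sample = [
    {"TeacherID": 212, "StudentID": 2},
    {"TeacherID": 212, "StudentID": 2},
    {"TeacherID": 212, "StudentID": 2},
    {"TeacherID": 212, "StudentID": 5},
    {"TeacherID": 223, "StudentID": 4},
]
print(most_common_student(sample, TEACHER_ID))  # (2, 3)
```

Placing $match first means only the documents for that teacher reach the $group stage.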

How to filter nested array in python pymongo 3.7.2 for mongodb

I am using pymongo version 3.7.2 with python 3.6.8. I have documents in the following format in my database:
{
  "_id": 1,
  "main_array": [
    {
      "subid": 222,
      "subarray": [{"name": "hari", "status": 1}, {"name": "henry", "status": 1}]
    },
    {
      "subid": 333,
      "subarray": [{"name": "james", "status": 0}, {"name": "jason", "status": 1}]
    }
  ]
},
{
  "_id": 2,
  "main_array": [
    {
      "subid": 222,
      "subarray": [{"name": "alex", "status": 1}, {"name": "anna", "status": 1}]
    },
    {
      "subid": 333,
      "subarray": [{"name": "bob", "status": 0}, {"name": "bunny", "status": 1}]
    }
  ]
}
I need to get the objects with subid = 222 from all the documents in the collection. The required result should be as follows:
{
  "_id": 1,
  "main_array": [
    {
      "subid": 222,
      "subarray": [{"name": "hari", "status": 1}, {"name": "henry", "status": 1}]
    }
  ]
},
{
  "_id": 2,
  "main_array": [
    {
      "subid": 222,
      "subarray": [{"name": "alex", "status": 1}, {"name": "anna", "status": 1}]
    }
  ]
}
I tried the following code:
myclient = pymongo.MongoClient(<mongoclient url>)
mydb = myclient["test"]
mycol = mydb["user"]
subid = 222
_id = 1
x = mycol.find({"_id":_id},{"main_array":{"$elemMatch":{"subid":subid}}})
This gives the required result for one particular document, but I need it for all the documents. I tried the following query:
x = mycol.find({"main_array":{"$elemMatch":{"subid":subid}}})
But this time it returns the entire collection. What did I miss?

Used as a query filter, $elemMatch returns every document in which ANY array item passes the condition, and it returns those documents whole.
You should use the aggregation pipeline with $unwind and $match.
Basically, do:
db.collection.aggregate([
  {
    $unwind: "$main_array"
  },
  {
    $match: {
      "main_array.subid": 222
    }
  }
])
Note that this returns main_array as a single object rather than an array, but you should be able to work with that.
Output of the above:
[
  {
    "_id": 1,
    "main_array": {
      "subarray": [
        {"name": "hari", "status": 1},
        {"name": "henry", "status": 1}
      ],
      "subid": 222
    }
  },
  {
    "_id": 2,
    "main_array": {
      "subarray": [
        {"name": "alex", "status": 1},
        {"name": "anna", "status": 1}
      ],
      "subid": 222
    }
  }
]
Fiddle: https://mongoplayground.net/p/-sg_d2h5wIJ
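If main_array needs to stay an array, as in the required result, one alternative to $unwind is the $filter aggregation operator. The sketch below is a swapped-in technique, not the answer's method: the pipeline is written for pymongo's aggregate(), and `filter_main_array` is a hypothetical pure-Python equivalent for checking the expected shape.

```python
SUBID = 222

# Stages intended for mycol.aggregate(pipeline).
pipeline = [
    # Skip documents that have no element with this subid at all.
    {"$match": {"main_array.subid": SUBID}},
    # Keep only the matching elements; main_array stays an array.
    {"$project": {
        "main_array": {
            "$filter": {
                "input": "$main_array",
                "as": "item",
                "cond": {"$eq": ["$$item.subid", SUBID]},
            }
        }
    }},
]


def filter_main_array(docs, subid):
    """Pure-Python equivalent of the pipeline, for checking the logic."""
    result = []
    for doc in docs:
        kept = [item for item in doc.get("main_array", []) if item.get("subid") == subid]
        if kept:
            result.append({"_id": doc["_id"], "main_array": kept})
    return result


# Documents copied from the question.
docs = [
    {"_id": 1, "main_array": [
        {"subid": 222, "subarray": [{"name": "hari", "status": 1}, {"name": "henry", "status": 1}]},
        {"subid": 333, "subarray": [{"name": "james", "status": 0}, {"name": "jason", "status": 1}]},
    ]},
    {"_id": 2, "main_array": [
        {"subid": 222, "subarray": [{"name": "alex", "status": 1}, {"name": "anna", "status": 1}]},
        {"subid": 333, "subarray": [{"name": "bob", "status": 0}, {"name": "bunny", "status": 1}]},
    ]},
]
filtered = filter_main_array(docs, SUBID)
```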

Count items through documents on mongo query

My mongo cursor looks like this :
{
  "_id": ObjectId("57558ee01807ce2f774569cc"),
  "description": "Lorem Ipnsun ....",
  "results": [
    {
      "name": "Alica James",
      "gender": "male"
    },
    {
      "name": "Alica James",
      "gender": "female"
    },
    {
      "name": "Alica James",
      "gender": "female"
    }
  ]
},
{
  "_id": ObjectId("57558ee01807ce2f774569c6"),
  "description": "Lorem Ipnsun ....",
  "results": [
    {
      "name": "Van Ban",
      "gender": "unclear"
    }
  ]
},
{
  "_id": ObjectId("57558ee01807ce2f774569c7"),
  "description": "Lorem Ipnsun ....",
  "results": []
}
As you can see, the results key can be empty or can have values. Each entry in it has a name and a gender, which can be male, female, or unclear.
I want to find all documents in my collection, then go through each document and check the gender distribution for each name.
So for the name "Alica James" I want my query to return:
female_numbers_for_document = 2
male_numbers_for_document = 1
unclear_numbers_for_document = 0
For Van Ban:
female_numbers_for_document = 0
male_numbers_for_document = 0
unclear_numbers_for_document = 1
In Python, I first fetched all the documents in the collection, then iterated through each document in the cursor and declared some variables to hold the gender counts. This doesn't work, since it takes only the first value and doesn't go through results. The code looks like this:
def find_gender_distribution(self):
    cursor = self.mongo.db[self.collection_name].find()
    for document in cursor:
        female_numbers_for_document = document.find({"results.gender": "female"}).count()
        male_numbers_for_document = document.find({"results.gender": "male"}).count()
        unclear_numbers_for_document = document.find({"results.gender": "unclear"}).count()
How do I count how many entries inside results have the same gender? Please help.

You are using the wrong method for this. You need the .aggregate() method, which gives access to the aggregation pipeline. The pipeline below unwinds the results array, counts each (name, gender) pair, and then sums those counts per name.
# "$results" must match the name of the array field in the documents
unwind1 = {"$unwind": "$results"}
group1 = {
    "$group": {
        "_id": {"name": "$results.name", "gender": "$results.gender"},
        "count": {"$sum": 1}
    }
}
group2 = {
    "$group": {
        "_id": "$_id.name",
        "nmale": {
            "$sum": {"$cond": [
                {"$eq": ["$_id.gender", "male"]},
                "$count",
                0
            ]}
        },
        "nfemale": {
            "$sum": {"$cond": [
                {"$eq": ["$_id.gender", "female"]},
                "$count",
                0
            ]}
        },
        # anything that is neither male nor female counts as unclear
        "nunclear": {
            "$sum": {"$cond": [
                {"$and": [
                    {"$ne": ["$_id.gender", "male"]},
                    {"$ne": ["$_id.gender", "female"]}
                ]},
                "$count",
                0
            ]}
        }
    }
}
pipeline = [unwind1, group1, group2]
def find_gender_distribution(self):
    collection = self.mongo.db[self.collection_name]
    cursor = collection.aggregate(pipeline)
    for document in cursor:
        print(document)  # or do something
Printing each document in the cursor yields something like:
{ "_id" : "Alica James", "nmale" : 1, "nfemale" : 2, "nunclear" : 0 }
{ "_id" : "Van Ban", "nmale" : 0, "nfemale" : 0, "nunclear" : 1 }
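To check the counting logic without a database, the same per-name grouping can be reproduced in plain Python. A minimal sketch: `gender_distribution` is a hypothetical helper, the sample documents are the ones from the question (without `_id` and `description`), and treating every gender other than male or female as unclear is an assumption.

```python
from collections import defaultdict


def gender_distribution(docs):
    """Count male / female / unclear entries per name across all documents."""
    totals = defaultdict(lambda: {"nmale": 0, "nfemale": 0, "nunclear": 0})
    for doc in docs:
        # An empty "results" list simply contributes nothing.
        for entry in doc.get("results", []):
            # Assumption: any gender other than male/female counts as unclear.
            key = {"male": "nmale", "female": "nfemale"}.get(entry.get("gender"), "nunclear")
            totals[entry["name"]][key] += 1
    return dict(totals)


docs = [
    {"results": [
        {"name": "Alica James", "gender": "male"},
        {"name": "Alica James", "gender": "female"},
        {"name": "Alica James", "gender": "female"},
    ]},
    {"results": [{"name": "Van Ban", "gender": "unclear"}]},
    {"results": []},
]
distribution = gender_distribution(docs)
for name, counts in distribution.items():
    print(name, counts)
```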

how to find values that are common for different fields [pymongo aggregate]

Let's say we have mongodb documents:
{'shop':'yes'}
{'shop':'ice_cream'}
{'shop':'grocery'}
{'amenity':'yes'}
{'amenity':'hotel'}
How do I write an aggregate query in pymongo which would return the values that are common for both keys? In that example it should return 'yes'.

Your aggregation pipeline can use the $setIntersection operator in a $project stage. It takes two or more arrays and returns an array containing the elements that appear in every input array. Another useful operator is the $addToSet accumulator, which builds the distinct list of values for each grouped field so that the lists can be compared afterwards.
In the mongo shell, insert the documents:
db.collection.insert([
  {'shop':'yes'},
  {'shop':'ice_cream'},
  {'shop':'grocery'},
  {'amenity':'yes'},
  {'amenity':'hotel'}
])
You could try the following aggregation pipeline:
db.collection.aggregate([
  {
    "$group": {
      "_id": null,
      "shops": {
        "$addToSet": "$shop"
      },
      "amenities": {
        "$addToSet": "$amenity"
      }
    }
  },
  {
    "$project": {
      "_id": 0,
      "commonToBoth": { "$setIntersection": [ "$shops", "$amenities" ] }
    }
  }
]);
Output:
/* 0 */
{
  "result" : [
    {
      "commonToBoth" : [
        "yes"
      ]
    }
  ],
  "ok" : 1
}
Pymongo:
>>> pipe = [
... {"$group": { "_id": None, "shops": {"$addToSet": "$shop"}, "amenities": {"$addToSet": "$amenity"}}},
... { "$project": {"_id": 0, "commonToBoth":{"$setIntersection": ["$shops", "$amenities"]}}}
... ]
>>>
>>> for doc in collection.aggregate(pipe):
... print(doc)
...
{u'commonToBoth': [u'yes']}
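What the two stages compute can be mirrored with Python sets, which may help when reasoning about the result. This is a sketch for illustration only: `common_values` is a hypothetical helper, and the documents are the ones from the question.

```python
def common_values(docs, first_key, second_key):
    """Return the values that occur under both keys across the documents."""
    # Equivalent of the two $addToSet accumulators: distinct values per key.
    first = {doc[first_key] for doc in docs if first_key in doc}
    second = {doc[second_key] for doc in docs if second_key in doc}
    # Equivalent of $setIntersection.
    return first & second


docs = [
    {"shop": "yes"},
    {"shop": "ice_cream"},
    {"shop": "grocery"},
    {"amenity": "yes"},
    {"amenity": "hotel"},
]
common = common_values(docs, "shop", "amenity")
print(common)  # {'yes'}
```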

mongodb get element in multi array

I have a MongoDB document like this:
{
  "post": [
    {
      "name": "post1",
      "part": [
        {
          "name": "part1",
          ...
        }, {
          "name": "part2",
          ...
        }
      ]
    }, {
      "name": "post2",
      "part": [
        {
          "name": "part3",
          ...
        }, {
          "name": "part4",
          ...
        }
      ]
    }
    ...
  ]
}
I want to get output like this:
{
  "post": [
    {
      "part": [
        {
          "name": "part2"
        }
      ]
    }
  ]
}
My query looks like this:
db.find_one({"_id": 123}, {
    "post.%s.part.%s.name" % (0, 1): 1
})
I know the index in the post list (0) and in the part list (1).
I can't get the element by index in the output. Can you help me get a specific element of the array?
I have tried $slice, but how do I use $slice on more than one level of nested arrays?
Thanks!

A projection can't pick out specific elements of nested arrays, except by matching with the positional $ operator. You can restrict the result to just the field "post.part.name", but you will still get that field's value for every element of the bottom-level array.
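Given that limitation, one option is to fetch the document with the "post.part.name" projection and pick the element by index in Python. A minimal sketch, assuming both indexes exist in the document; `pick_part_name` is a hypothetical helper, and the sample document mirrors the question with the elided fields left out.

```python
def pick_part_name(document, post_index, part_index):
    """Return the requested part name in the output shape from the question."""
    name = document["post"][post_index]["part"][part_index]["name"]
    return {"post": [{"part": [{"name": name}]}]}


# Stand-in for the result of find_one({"_id": 123}, {"post.part.name": 1}).
document = {
    "post": [
        {"part": [{"name": "part1"}, {"name": "part2"}]},
        {"part": [{"name": "part3"}, {"name": "part4"}]},
    ]
}
picked = pick_part_name(document, 0, 1)
print(picked)  # {'post': [{'part': [{'name': 'part2'}]}]}
```

The indexing raises IndexError or KeyError when a position or field is missing, so real code would probably want to handle those cases.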
