Pymongo Remove duplicates with condition in mongodb - python

Thanks for reading my question.
Please excuse any mistakes; I'm working on improving my English.
I have more than 4000 records in my MongoDB collection. This is one of them:
{
"_id" : ObjectId("5763821ffefb61074041477e"),
"sessionId" : "5138A3B4A5966CE4B2203B8BFC90055F",
"objects" : [
{
"id" : "334449673730",
"point" : 0.5
},
{
"id" : "790373008255",
"point" : 0.5
},
{
"id" : "790373008255",
"point" : 1.0
},
{
"id" : "572453522243",
"point" : 0.5
},
{
"id" : "572453522243",
"point" : 1.0
}
]
}
I want to delete the entries with a duplicate id and, for each duplicated id, keep only the entry with point : 1.0.
Expected result:
{
"_id" : ObjectId("5763821ffefb61074041477e"),
"sessionId" : "5138A3B4A5966CE4B2203B8BFC90055F",
"objects" : [
{
"id" : "334449673730",
"point" : 0.5
},
{
"id" : "790373008255",
"point" : 1.0
},
{
"id" : "572453522243",
"point" : 1.0
}
]
}
I followed this post: How to remove duplicates with a certain condition in mongodb?
It is similar to my question, but I don't know why the result is not what I expect:
pipeline = ([
{
"$group": {
"_id": "$id",
"count": { "$sum": 1 },
#"uniqueIds": { "$addToSet": "$_id" },
"Point": { "$max": "$point" }
}
},
{
"$match": {
"count": { "$gte": 1 }
}
}
])
for test_item in collection_forTest.aggregate(pipeline):
    print(test_item)
Result :
{'Point': None, 'count': 1, '_id': None}
I could do this in Python: load all records, look for repeated ids in each list, and remove the entries whose point is not 1. But I think that would be slower than an aggregation.
Can you help me solve this for all 4000+ records?
Thanks very much!
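A note on the {'Point': None, 'count': 1, '_id': None} output: in the sample document, id and point live inside the objects array, so "$id" and "$point" do not resolve at the top level and $group sees them as missing. Below is a minimal sketch of the Python approach described in the question. The names dedupe_objects and dedupe_collection are made up for illustration, and collection is assumed to be a pymongo Collection. It keeps the highest point per id, which for the sample data is the 1.0 entry.

```python
def dedupe_objects(objects):
    """Keep one entry per id (the one with the highest point), in first-seen order."""
    best = {}
    for entry in objects:
        kept = best.get(entry["id"])
        if kept is None or entry["point"] > kept["point"]:
            # Re-assigning an existing key keeps its original position in the dict.
            best[entry["id"]] = entry
    return list(best.values())


def dedupe_collection(collection):
    """Rewrite the objects array of every document that contains duplicate ids.

    `collection` is assumed to be a pymongo Collection; returns the number
    of documents that were rewritten.
    """
    changed = 0
    for doc in collection.find({}, {"objects": 1}):
        objects = doc.get("objects", [])
        deduped = dedupe_objects(objects)
        if len(deduped) != len(objects):
            collection.update_one(
                {"_id": doc["_id"]}, {"$set": {"objects": deduped}}
            )
            changed += 1
    return changed


# For a read-only, server-side view of the same thing, the array has to be
# unwound first so that the nested fields can be addressed:
pipeline = [
    {"$unwind": "$objects"},
    {"$group": {
        "_id": {"doc": "$_id", "id": "$objects.id"},
        "point": {"$max": "$objects.point"},
    }},
]
```

With roughly 4000 documents, one pass with find plus one update_one per affected document should be fast enough in practice, although I have not measured it against an aggregation.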

Related

Mongodb find nested dict element

{
"_id" : ObjectId("63920f965d15e98e3d7c450c"),
"first_name" : "mymy",
"last_activity" : 1669278303.4341061,
"username" : null,
"dates" : {
"29.11.2022" : {
},
"30.11.2022" : {
}
},
"user_id" : "1085116517"
}
How can I find all documents that have the key 29.11.2022 inside dates? I tried many things, but every query treats the dots in the key as something else.
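Use $getField in $expr ($getField is available from MongoDB 5.0). The query below matches documents where the value stored under that key is an empty document, as in the sample: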
db.collection.find({
$expr: {
$eq: [
{},
{
"$getField": {
"field": "29.11.2022",
"input": "$dates"
}
}
]
}
})
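For pymongo, the same filter can be written as a plain dictionary. This is only a sketch: dotted_key_filter and find_by_date are hypothetical helper names, and collection is assumed to be a pymongo Collection. Like the shell query, it compares the stored value with an empty document.

```python
def dotted_key_filter(date_key, parent_field="dates"):
    """Build a find() filter for a key that contains dots, e.g. "29.11.2022".

    $getField reads the key literally instead of treating the dots as a path.
    """
    return {
        "$expr": {
            "$eq": [
                {},  # the sample stores an empty document under each date
                {"$getField": {"field": date_key, "input": "$" + parent_field}},
            ]
        }
    }


def find_by_date(collection, date_key):
    """Return all documents whose dates[date_key] is an empty document."""
    return list(collection.find(dotted_key_filter(date_key)))
```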

MongoDB - Pull and Update in a single query

I have the following schema -
{
"_id" : ObjectId("60c3253f19862e6347bc9f4e"),
"farm_id": "Gustavo-chainer",
"first_ts" : ISODate("2021-05-18T09:53:00.000Z"),
"last_ts" : ISODate("2021-05-18T12:53:00.000Z"),
"sensor_data" : [
{
"data" : 76.0,
"sensor": "temperature-sensor",
"start_ts" : ISODate("2021-05-18T09:33:00.000Z"),
"end_ts" : ISODate("2021-05-18T09:53:00.000Z")
},
{
"data" : 74.0,
"sensor": "temperature-sensor",
"start_ts" : ISODate("2021-05-18T12:33:00.000Z"),
"end_ts" : ISODate("2021-05-18T12:53:00.000Z")
}
]
}
where first_ts = minimum of all the values of start_ts present in the sensor_data array and last_ts = maximum of all the values of end_ts present in the sensor_data array.
I want to delete a data point from the sensor_data array, given its start_ts and end_ts, and then update first_ts and last_ts accordingly.
Example -
Delete data point with "start_ts" : ISODate("2021-05-18T12:33:00.000Z") and "end_ts" : ISODate("2021-05-18T12:53:00.000Z"). After deletion, the document should look like -
{
"_id" : ObjectId("60c3253f19862e6347bc9f4e"),
"first_ts" : ISODate("2021-05-18T09:53:00.000Z"),
"last_ts" : ISODate("2021-05-18T09:53:00.000Z"),
"sensor_data" : [
{
"data" : 76.0,
"sensor": "temperature-sensor",
"start_ts" : ISODate("2021-05-18T09:33:00.000Z"),
"end_ts" : ISODate("2021-05-18T09:53:00.000Z")
}
]
}
I need to write a pymongo query that can do the above task in a single query.
You can use an update with an aggregation pipeline, available from MongoDB 4.2:
$filter iterates over the sensor_data array; the condition checks both date fields, and $not inverts it so that the matching element is excluded
$min gets the minimum start_ts from sensor_data.start_ts
$max gets the maximum end_ts from sensor_data.end_ts
collection.update(
{
sensor_data: {
$elemMatch: {
start_ts: ISODate("2021-05-18T12:33:00.000Z"),
end_ts: ISODate("2021-05-18T12:53:00.000Z")
}
}
},
[{
$set: {
sensor_data: {
$filter: {
input: "$sensor_data",
cond: {
$not: {
$and: [
{ $eq: ["$$this.start_ts", ISODate("2021-05-18T12:33:00.000Z")] },
{ $eq: ["$$this.end_ts", ISODate("2021-05-18T12:53:00.000Z")] }
]
}
}
}
}
}
},
{
$set: {
first_ts: { $min: "$sensor_data.start_ts" },
last_ts: { $max: "$sensor_data.end_ts" }
}
}],
{ multi: true }
)
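Since the question asks for pymongo, here is a rough translation of the shell update above. It is a sketch under a few assumptions: build_pull_and_recompute and remove_data_point are hypothetical names, collection is a pymongo Collection, and the timestamps are passed as naive datetime objects in UTC (which is how pymongo maps ISODate values by default). update_many accepts a pipeline list from PyMongo 3.9 onward.

```python
from datetime import datetime


def build_pull_and_recompute(start_ts, end_ts):
    """Return (filter, pipeline) that removes one data point and refreshes the bounds."""
    query = {
        "sensor_data": {"$elemMatch": {"start_ts": start_ts, "end_ts": end_ts}}
    }
    pipeline = [
        # Stage 1: keep every element except the one matching both timestamps.
        {"$set": {"sensor_data": {"$filter": {
            "input": "$sensor_data",
            "cond": {"$not": [{"$and": [
                {"$eq": ["$$this.start_ts", start_ts]},
                {"$eq": ["$$this.end_ts", end_ts]},
            ]}]},
        }}}},
        # Stage 2: recompute the bounds from what is left in the array.
        {"$set": {
            "first_ts": {"$min": "$sensor_data.start_ts"},
            "last_ts": {"$max": "$sensor_data.end_ts"},
        }},
    ]
    return query, pipeline


def remove_data_point(collection, start_ts, end_ts):
    """Run the update on every matching document in a single call."""
    query, pipeline = build_pull_and_recompute(start_ts, end_ts)
    return collection.update_many(query, pipeline)


# Example call (not executed here):
# remove_data_point(collection, datetime(2021, 5, 18, 12, 33), datetime(2021, 5, 18, 12, 53))
```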

My code is working in mongodb but not working in pymongo

I have documents in a collection, and I want to find a document and update elements of its list.
Here is sample data:
{
{
"_id" : ObjectId("5edd3faaf6c9d938e0bfd966"),
"id" : 1,
"status" : "XXX",
"number" : [
{
"code" : "AAA"
},
{
"code" : "CVB"
},
{
"code" : "AAA"
},
{
"code" : "BBB"
}
]
},
{
"_id" : ObjectId("asseffsfpo2dedefwef"),
"id" : 2,
"status" : "TUY",
"number" : [
{
"code" : "PPP"
},
{
"code" : "SSD"
},
{
"code" : "HDD"
},
{
"code" : "IOO"
}
]
}
}
I planned to find the document where "id": 1 and number.code is in ["AAA", "BBB"], and change number.code to "VVV". I did it with the following code:
db.test.update(
{
id: 1,
"number.code": {$in: ["AAA", "BBB"]}
},
{
$set: {"number.$[elem].code": "VVV"}
},
{ "arrayFilters": [{ "elem.code": {$in: ["AAA", "BBB"]} }], "multi": true, "upsert": false
}
)
It works in the mongodb shell, but in Python (with pymongo) it fails with the following error:
raise TypeError("%s must be True or False" % (option,))
TypeError: upsert must be True or False
Please help me. What can I do?
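PyMongo's syntax is just a tad different. The third positional argument of its update methods is upsert, so an options dictionary passed in that position is most likely what triggers the TypeError; the options are keyword arguments instead. It would look like this: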
db.test.update_many(
{
"id": 1,
"number.code": {"$in": ["AAA", "BBB"]}
},
{
"$set": {"number.$[elem].code": "VVV"}
},
array_filters=[{"elem.code": {"$in": ["AAA", "BBB"]}}],
upsert=False
)
The multi flag is not needed with update_many.
upsert is False by default, so it is also redundant here.
See the PyMongo documentation for update_many for the full list of options.
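If you want to check what the array filter should produce without touching the database, a small local stand-in can help. This is only an illustration of the intended effect of number.$[elem].code with the elem.code filter, not PyMongo code; apply_array_filter is a hypothetical helper.

```python
import copy


def apply_array_filter(doc, codes, new_code):
    """Mimic locally: set number.$[elem].code = new_code where elem.code is in codes."""
    updated = copy.deepcopy(doc)  # leave the caller's document untouched
    for elem in updated.get("number", []):
        if elem.get("code") in codes:
            elem["code"] = new_code
    return updated


doc = {
    "id": 1,
    "status": "XXX",
    "number": [{"code": "AAA"}, {"code": "CVB"}, {"code": "AAA"}, {"code": "BBB"}],
}
result = apply_array_filter(doc, ["AAA", "BBB"], "VVV")
print([elem["code"] for elem in result["number"]])
```

Only the elements whose code is AAA or BBB change; CVB stays as it is.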

Multi $project mongodb

I have this query in MongoDB:
db.questions.aggregate([
{ $project: {
total: { $add: [ "$answear_false", "$answear_true" ] }
}},
{ $project: {
percent_true: {
$cond: [
{ $eq: [ "$total", null ] },
0 ,
{ $divide: [ "$answear_true", "$total" ] }
]
}
}},
{ $project: { _id: 1, total: 1, percent_true: 1 } }
])
But the result is not what I expect; the total field does not show up in the result:
{ "_id" : "1004121032231110394769", "percent_true" : 0 }
{ "_id" : "1004121035679127802289", "percent_true" : 0 }
{ "_id" : "1004121038562570811362", "percent_true" : 0 }
Is it possible to use more than two $project stages in one MongoDB query?
Yes. The second $project filters out the total field, because a $project stage only passes on the fields it lists. Try adding this to the second $project:
total: "$total"
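One more thing that may be worth checking, based on how $project works rather than on anything stated above: the first $project also drops answear_true, so the $divide in the second stage would not see it. Here is a sketch of the pipeline as a pymongo-style list that carries both fields forward. The helper percent_true_locally is hypothetical and only mirrors the logic in plain Python so the numbers can be checked without a server.

```python
pipeline = [
    # Stage 1: compute total and keep answear_true for the next stage.
    {"$project": {
        "answear_true": 1,
        "total": {"$add": ["$answear_false", "$answear_true"]},
    }},
    # Stage 2: keep total and derive percent_true from it.
    {"$project": {
        "total": 1,
        "percent_true": {"$cond": [
            {"$eq": ["$total", None]},
            0,
            {"$divide": ["$answear_true", "$total"]},
        ]},
    }},
]


def percent_true_locally(doc):
    """Plain-Python mirror of the two stages for one document."""
    true_count = doc.get("answear_true")
    false_count = doc.get("answear_false")
    # $add yields null when either operand is missing or null.
    total = None if true_count is None or false_count is None else true_count + false_count
    percent = 0 if total is None else true_count / total
    return {"total": total, "percent_true": percent}
```

Note that, like the original query, this does not guard against a total of 0.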

how to aggregate on each item in collection in mongoDB

MongoDB noob here...
when I do db.students.find().pretty() in the shell I get a long list from my collection...like so..
{
"_id" : 19,
"name" : "Gisela Levin",
"scores" : [
{
"type" : "exam",
"score" : 44.51211101958831
},
{
"type" : "quiz",
"score" : 0.6578497966368002
},
{
"type" : "homework",
"score" : 93.36341655949683
},
{
"type" : "homework",
"score" : 49.43132782777443
}
]
}
Now I have over 100 of these, and I need to run the following on each of them:
lowest_hw_score =
db.students.aggregate(
// Initial document match (uses index, if a suitable one is available)
{ $match: {
_id : 0
}},
// Expand the scores array into a stream of documents
{ $unwind: '$scores' },
// Filter to 'homework' scores
{ $match: {
'scores.type': 'homework'
}},
// Sort in ascending order (lowest score first)
{ $sort: {
'scores.score': 1
}},
{ $limit: 1}
)
So that I can run something like this on each result:
for item in lowest_hw_score:
    print(item)
Right now lowest_hw_score works on only one item. I want to run this on all items in the collection. How do I do this?
> db.students.aggregate(
{ $match : { 'scores.type': 'homework' } },
{ $unwind: "$scores" },
{ $match:{"scores.type":"homework"} },
{ $group: {
_id : "$_id",
maxScore : { $max : "$scores.score"},
minScore: { $min:"$scores.score"}
}
});
You don't really need the first $match, but if "scores.type" is indexed, the index can be used there before the scores are unwound. (I don't believe MongoDB can use the index after the $unwind.)
Result:
{
"result" : [
{
"_id" : 19,
"maxScore" : 93.36341655949683,
"minScore" : 49.43132782777443
}
],
"ok" : 1
}
Edit: tested and updated in mongo shell
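For the Python side of the question, the grouping idea above could look roughly like this with pymongo. It is a sketch: print_lowest_homework is a hypothetical name, students_collection is assumed to be a pymongo Collection, and lowest_homework_scores is a plain-Python equivalent included only so the logic can be checked without a server.

```python
pipeline = [
    {"$unwind": "$scores"},
    {"$match": {"scores.type": "homework"}},
    # One output document per student, holding the lowest homework score.
    {"$group": {"_id": "$_id", "minScore": {"$min": "$scores.score"}}},
]


def print_lowest_homework(students_collection):
    """Print the lowest homework score for every student in the collection."""
    for row in students_collection.aggregate(pipeline):
        print(row["_id"], row["minScore"])


def lowest_homework_scores(students):
    """Same result computed locally from an iterable of student documents."""
    lowest = {}
    for student in students:
        homework = [s["score"] for s in student.get("scores", [])
                    if s["type"] == "homework"]
        if homework:
            lowest[student["_id"]] = min(homework)
    return lowest
```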
