I have a document which is structured like this:
{
    "_id" : ObjectId("564c0cb748f9fa2c8cdeb20f"),
    "username" : "blah",
    "useremail" : "blah#blahblah.com",
    "groupTypeCustomer" : true,
    "addedpartners" : [
        "562f1a629410d3271ba74f74",
        "562f1a6f9410d3271ba74f83"
    ],
    "groupName" : "Mojito",
    "groupTypeSupplier" : false,
    "groupDescription" : "A group for fashion designers"
}
Now I want to delete one of the values from this 'addedpartners' array and update the document.
I just want to delete 562f1a6f9410d3271ba74f83 from the addedpartners array.
This is what I tried earlier:
db.myCollection.update({'_id':'564c0cb748f9fa2c8cdeb20f'},{'$pull':{'addedpartners':'562f1a6f9410d3271ba74f83'}})
The $pull syntax is right; the problem is the filter. _id is stored as an ObjectId, so matching it against the plain string '564c0cb748f9fa2c8cdeb20f' finds nothing. Wrap the id in ObjectId() and it works:
db.myCollection.update(
    { _id: ObjectId("564c0cb748f9fa2c8cdeb20f") },
    { $pull: { 'addedpartners': '562f1a6f9410d3271ba74f83' } }
);
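If you are driving this from Python with pymongo, the same $pull works; here is a minimal sketch (the database name is an assumption), with the important part again being that _id is an ObjectId rather than a plain string:

from bson import ObjectId
from pymongo import MongoClient

coll = MongoClient()["mydb"]["myCollection"]  # database name assumed

coll.update_one(
    {"_id": ObjectId("564c0cb748f9fa2c8cdeb20f")},
    {"$pull": {"addedpartners": "562f1a6f9410d3271ba74f83"}},
)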
Try this:
db.myCollection.update({}, {$unset : {"addedpartners.1" : 1 }})
db.myCollection.update({}, {$pull : {"addedpartners" : null}})
There is no way to remove an array element by its index directly, so the first update nulls out the element at index 1 and the second pulls the null out. I think this is going to work, but I haven't tried it yet.
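The same two-step trick from pymongo, as a sketch (database and collection names are assumptions), this time targeting the single document by _id instead of an empty filter:

from bson import ObjectId
from pymongo import MongoClient

coll = MongoClient()["mydb"]["myCollection"]  # database name assumed
doc_id = ObjectId("564c0cb748f9fa2c8cdeb20f")

# Step 1: null out the element at index 1 (array indexes are zero-based)
coll.update_one({"_id": doc_id}, {"$unset": {"addedpartners.1": 1}})
# Step 2: pull the null that was left behind
coll.update_one({"_id": doc_id}, {"$pull": {"addedpartners": None}})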
I am using Python and MongoEngine to try to query the document below in MongoDB.
I need a query that efficiently returns the documents only when they contain embedded 'keywords' documents matching the following criteria:
Filter the keywords where the property 'sfr' is less than or equal to 100000
Count the filtered keywords
Return the parent documents where the count of keywords matching the criteria is greater than 9
Example structure:
{
"_id" : ObjectId("5eae60e4055ef0e717f06a50"),
"registered_data" : ISODate("2020-05-03T16:12:51.999+0000"),
"UniqueName" : "SomeUniqueNameHere",
"keywords" : [
{
"keyword" : "carport",
"search_volume" : NumberInt(10532),
"sfr" : NumberInt(20127),
"percent_contribution" : 6.47,
"competing_product_count" : NumberInt(997),
"avg_review_count" : NumberInt(143),
"avg_review_score" : 4.05,
"avg_price" : 331.77,
"exact_ppc_bid" : 3.44,
"broad_ppc_bid" : 2.98,
"exact_hsa_bid" : 8.33,
"broad_hsa_bid" : 9.29
},
{
"keyword" : "party tent",
"search_volume" : NumberInt(6944),
"sfr" : NumberInt(35970),
"percent_contribution" : 4.27,
"competing_product_count" : NumberInt(2000),
"avg_review_count" : NumberInt(216),
"avg_review_score" : 3.72,
"avg_price" : 210.16,
"exact_ppc_bid" : 1.13,
"broad_ppc_bid" : 0.55,
"exact_hsa_bid" : 9.66,
"broad_hsa_bid" : 8.29
}
]
}
From the research I have been doing, I believe an aggregation query might do what I am attempting.
Unfortunately, being new to MongoDB / MongoEngine, I am struggling to figure out how to structure the query and have failed to find an example similar to what I am attempting (red flag, right?).
I did find an aggregate example, but I am unsure how to structure my criteria in it; maybe something like this is getting close, but it does not work.
pipeline = [
{
"$lte": {
"$sum" : {
"keywords" : {
"$lte": {
"keyword": 100000
}
}
}: 9
}
}
]
data = product.objects().aggregate(pipeline)
Any guidance would be greatly appreciated.
Thanks,
Ben
You can try something like this:
db.collection.aggregate([
    {
        $project: { // the first project to filter the keywords array
            registered_data: 1,
            UniqueName: 1,
            keywords: {
                $filter: {
                    input: "$keywords",
                    as: "item",
                    cond: {
                        $lte: [
                            "$$item.sfr",
                            100000
                        ]
                    }
                }
            }
        }
    },
    {
        $project: { // the second project to get the length of the keywords array
            registered_data: 1,
            UniqueName: 1,
            keywords: 1,
            keywordsLength: {
                $size: "$keywords"
            }
        }
    },
    {
        $match: { // then do the match
            keywordsLength: {
                $gte: 9
            }
        }
    }
])
You can test it here: Mongo Playground.
Hope it helps.
Note: I used only the sfr property from the keywords array for simplicity.
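Since the question uses MongoEngine, the same pipeline can also be passed in from Python as plain dicts. A minimal sketch, assuming the Product document class implied by product.objects() in the question (older MongoEngine versions may expect the stages unpacked, i.e. aggregate(*pipeline)):

pipeline = [
    # Stage 1: keep only the keywords whose sfr is <= 100000
    {"$project": {
        "registered_data": 1,
        "UniqueName": 1,
        "keywords": {
            "$filter": {
                "input": "$keywords",
                "as": "item",
                "cond": {"$lte": ["$$item.sfr", 100000]},
            }
        },
    }},
    # Stage 2: count the filtered keywords
    {"$project": {
        "registered_data": 1,
        "UniqueName": 1,
        "keywords": 1,
        "keywordsLength": {"$size": "$keywords"},
    }},
    # Stage 3: keep only parents with enough matching keywords
    {"$match": {"keywordsLength": {"$gte": 9}}},
]

# "Product" is assumed to be the MongoEngine Document class from the question
results = list(Product.objects().aggregate(pipeline))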
I want to remove duplicate data from my collection in MongoDB. How can I accomplish this?
Please refer to this example to understand my problem:
My collection stores the questions as documents like the following:
{
"questionText" : "what is android ?",
"__v" : 0,
"_id" : ObjectId("540f346c3e7fc1234ffa7085"),
"userId" : "102"
},
{
"questionText" : "what is android ?",
"__v" : 0,
"_id" : ObjectId("540f346c3e7fc1054ffa7086"),
"userId" : "102"
}
How do I remove the duplicate question by the same userId? Any help?
I'm using Python and MongoDB.
IMPORTANT: The dropDups option was removed starting with MongoDB 3.x, so this solution is only valid for MongoDB versions 2.x and before. There is no direct replacement for the dropDups option. The answers to the question posed at http://stackoverflow.com/questions/30187688/mongo-3-duplicates-on-unique-index-dropdups offer some possible alternative ways to remove duplicates in Mongo 3.x.
Duplicate records can be removed from a MongoDB collection by creating a unique index on the collection and specifying the dropDups option.
Assuming the collection includes a field named record_id that uniquely identifies a record in the collection, the command to use to create a unique index and drop duplicates is:
db.collection.ensureIndex( { record_id:1 }, { unique:true, dropDups:true } )
Here is the trace of a session that shows the contents of a collection before and after creating a unique index with dropDups. Notice that duplicate records are no longer present after the index is created.
> db.pages.find()
{ "_id" : ObjectId("52829c886602e2c8428d1d8c"), "leaf_num" : "1", "scan_id" : "smithsoniancont251985smit", "height" : 3464, "width" : 2548 }
{ "_id" : ObjectId("52829c886602e2c8428d1d8d"), "leaf_num" : "1", "scan_id" : "smithsoniancont251985smit", "height" : 3464, "width" : 2548 }
{ "_id" : ObjectId("52829c886602e2c8428d1d8e"), "leaf_num" : "2", "scan_id" : "smithsoniancont251985smit", "height" : 3587, "width" : 2503 }
{ "_id" : ObjectId("52829c886602e2c8428d1d8f"), "leaf_num" : "2", "scan_id" : "smithsoniancont251985smit", "height" : 3587, "width" : 2503 }
>
> db.pages.ensureIndex( { scan_id:1, leaf_num:1 }, { unique:true, dropDups:true } )
>
> db.pages.find()
{ "_id" : ObjectId("52829c886602e2c8428d1d8c"), "leaf_num" : "1", "scan_id" : "smithsoniancont251985smit", "height" : 3464, "width" : 2548 }
{ "_id" : ObjectId("52829c886602e2c8428d1d8e"), "leaf_num" : "2", "scan_id" : "smithsoniancont251985smit", "height" : 3587, "width" : 2503 }
>
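On MongoDB 3.x and later, where dropDups is no longer available, one common workaround (a sketch only, not taken from the linked question; database and collection names are assumptions) is to group on the fields that define a duplicate and delete every _id except the first in each group:

from pymongo import MongoClient

coll = MongoClient()["mydb"]["questions"]  # assumed names

pipeline = [
    # Group documents that share the same questionText and userId
    {"$group": {
        "_id": {"questionText": "$questionText", "userId": "$userId"},
        "ids": {"$push": "$_id"},
        "count": {"$sum": 1},
    }},
    # Keep only the groups that actually contain duplicates
    {"$match": {"count": {"$gt": 1}}},
]

for group in coll.aggregate(pipeline, allowDiskUse=True):
    duplicate_ids = group["ids"][1:]  # keep the first document, drop the rest
    coll.delete_many({"_id": {"$in": duplicate_ids}})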
Since dropDups is now gone, you can use pandas instead:
select the fields you need from MongoDB
use pandas.DataFrame.duplicated to mark all duplicates as True except the first occurrence
remove the marked duplicates from the collection using their _ids
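A minimal sketch of that pandas approach, using the example questions above (database, collection, and field names are assumptions):

import pandas as pd
from pymongo import MongoClient

coll = MongoClient()["mydb"]["questions"]  # assumed names

# 1. Select only the fields needed to decide what counts as a duplicate
df = pd.DataFrame(list(coll.find({}, {"_id": 1, "questionText": 1, "userId": 1})))

# 2. Mark all duplicates as True except the first occurrence
dup_mask = df.duplicated(subset=["questionText", "userId"], keep="first")

# 3. Remove the marked documents from the collection using their _ids
dup_ids = df.loc[dup_mask, "_id"].tolist()
if dup_ids:
    coll.delete_many({"_id": {"$in": dup_ids}})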
I must be really slow, because I spent a whole day googling and trying to write Python code to simply list the "code" values so my output would be Service1, Service2, Service4. I have extracted JSON values from complex JSON or dict structures before, but now I must have hit a mental block.
This is my JSON structure:
myjson='''
{
"formatVersion" : "ABC",
"publicationDate" : "2017-10-06",
"offers" : {
"Service1" : {
"code" : "Service1",
"version" : "1a1a1a1a",
"index" : "1c1c1c1c1c1c1"
},
"Service2" : {
"code" : "Service2",
"version" : "2a2a2a2a2",
"index" : "2c2c2c2c2c2"
},
"Service3" : {
"code" : "Service4",
"version" : "3a3a3a3a3a",
"index" : "3c3c3c3c3c3"
}
}
}
'''
import json

# convert the above string to a Python dict
somejson = json.loads(myjson)
print(somejson["offers"])  # I tried so many variations to no avail.
Or, if you want the "code" values:
>>> [s['code'] for s in somejson['offers'].values()]
['Service1', 'Service2', 'Service4']
somejson["offers"] is a dictionary. It seems you want to print its keys.
In Python 2:
print(somejson["offers"].keys())
In Python 3:
print([x for x in somejson["offers"].keys()])
In Python 3, keys() returns a 'view' rather than a list, so wrap it in a list comprehension (or list()) if you need an actual list.
This should probably do the trick if you are not certain about the number of services in the JSON.
import json
myjson='''
{
"formatVersion" : "ABC",
"publicationDate" : "2017-10-06",
"offers" : {
"Service1" : {
"code" : "Service1",
"version" : "1a1a1a1a",
"index" : "1c1c1c1c1c1c1"
},
"Service2" : {
"code" : "Service2",
"version" : "2a2a2a2a2",
"index" : "2c2c2c2c2c2"
},
"Service3" : {
"code" : "Service4",
"version" : "3a3a3a3a3a",
"index" : "3c3c3c3c3c3"
}
}
}
'''
# convert the above string to a Python dict
somejson = json.loads(myjson)
#Without knowing the Services:
offers = somejson["offers"]
keys = offers.keys()
for service in keys:
    print(somejson["offers"][service]["code"])
I'm writing some code in which I query this kind of document (I'm using pymongo).
How can I load the arrays inside the wishlist field into Python lists?
Alternatively, how can I search my database for a value inside an array inside the wishlist field? E.g. I want to find all IDs that have, say, ["feldon", "c15", "sp"] in their wishlists.
{
"_id" : "david",
"password" : "azzzzzaa",
"url" : "url3",
"old_url" : "url3",
"new_url" : ["url1", "url2"],
"wishlist" : [
["all is dust", "mm4", "nm"],
["feldon", "c15", "sp"],
["feldon", "c15", "sp"],
["jenara", "shards", "nm"],
["rafiq", "shards", "nm"]
]
}
You can use distinct if the elements in your sub-list are exactly in the same order.
db.collection.distinct("_id", {"wishlist": ["feldon", "c15", "sp"]})
If not, you need to use the aggregate method and the $redact operator.
db.collection.aggregate([
    {"$redact": {
        "$cond": [
            {"$setIsSubset": [
                [["feldon", "c15", "sp"]],
                "$wishlist"
            ]},
            "$$KEEP",
            "$$PRUNE"
        ]
    }},
    {"$group": {
        "_id": None,
        "ids": {"$push": "$_id"}
    }}
])
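From pymongo (which the question is using), the same two options look roughly like this; the database and collection names are assumptions:

from pymongo import MongoClient

coll = MongoClient()["mydb"]["cards"]  # assumed names

# Option 1: distinct, when the sub-list matches exactly (same elements, same order)
ids = coll.distinct("_id", {"wishlist": ["feldon", "c15", "sp"]})

# Option 2: $redact, keeping documents whose wishlist contains that sub-list
result = list(coll.aggregate([
    {"$redact": {
        "$cond": [
            {"$setIsSubset": [[["feldon", "c15", "sp"]], "$wishlist"]},
            "$$KEEP",
            "$$PRUNE",
        ]
    }},
    {"$group": {"_id": None, "ids": {"$push": "$_id"}}},
]))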
Data:
{
    "_id" : ObjectId("50cda9741d41c81da6000002"),
    "template_name" : "common_MH",
    "role" : "MH",
    "options" : [
        {
            "sections" : [
                {
                    "tpl_option_name" : "test321",
                    "tpl_option_type" : "string",
                    "tpl_default_value" : "test321"
                }
            ],
            "tpl_section_name" : "Test"
        }
    ]
}
Could I modify tpl_default_value where options.$.sections.$.tpl_option_name = 'test321'?
I have already tried many times, but I can't solve it.
Please assist me, thanks.
This is a bad schema for doing these kinds of updates; there is a JIRA ticket for a multi-level positional operator, but it is not done yet: https://jira.mongodb.org/browse/SERVER-831
Ideally you either have to update this client side and then atomically set that section of the array:
$section = {
    "tpl_option_name" : "test321",
    "tpl_option_type" : "string",
    "tpl_default_value" : "test321"
};
// the key must be quoted, and the positional $ needs the query to match the options element
db.col.update(
    { "options.tpl_section_name": "Test" },
    { $set: { "options.$.sections.0": $section } }
)
Or you need to change your schema. Do the sections really need to be embedded? I noticed that you have a tpl_section_name at the top level but then you are nesting sections within that; it sounds more logical that only one section should be there.
That document would be easier to update.
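If you are doing the client-side variant from Python, a minimal sketch (database and collection names are assumptions; the _id and field values come from the example document) would read the document, modify the matching section, and $set just that array slot back:

from bson import ObjectId
from pymongo import MongoClient

coll = MongoClient()["mydb"]["templates"]  # assumed names
doc_id = ObjectId("50cda9741d41c81da6000002")

doc = coll.find_one({"_id": doc_id})

# Find the option/section we want client side and modify it
for opt_idx, option in enumerate(doc["options"]):
    for sec_idx, section in enumerate(option["sections"]):
        if section["tpl_option_name"] == "test321":
            section["tpl_default_value"] = "new value"
            # Atomically write back just that one section
            coll.update_one(
                {"_id": doc_id},
                {"$set": {f"options.{opt_idx}.sections.{sec_idx}": section}},
            )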