Pull from nested Array in MongoDB - python

I have a document structure like this:
{
"name": "Example",
"description": "foo",
"vocabulary": [
["apple", "pomme"],
["hello", "bonjour"],
["bus", "bus"]
]
}
Now I want to pull an array inside the vocabulary array by specifying its first item, e.g.:
{"$pull": {"vocabulary.$": ["apple"]}}
Which should remove the array ["apple", "pomme"] from vocabulary, but this doesn't work.
I tried the approach from "$pull from nested array", but it did not work either; it raised:
pymongo.errors.WriteError:
The positional operator did not find the match needed from the query., full error: {'index': 0, 'code': 2, 'errmsg': 'The positional operator did not find the match needed from the query.'}

Very tricky question.
I think the $ positional operator is not suitable for this case.
Instead, you need an aggregation pipeline in the update query.
Use $filter to keep only the inner arrays that do not ($not) contain ($in) the word "apple", then $set the result back to the vocabulary field.
db.collection.update({},
[
  {
    "$set": {
      "vocabulary": {
        $filter: {
          input: "$vocabulary",
          cond: {
            $not: {
              $in: [
                "apple",
                "$$this"
              ]
            }
          }
        }
      }
    }
  }
])
Sample Mongo Playground
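Since the question uses PyMongo, here is a minimal sketch of the same pipeline as a Python structure, together with a pure-Python mirror of the $filter condition for checking the logic without a server. The helper names build_remove_pairs_pipeline and remove_pairs_locally, and the collection handle in the comment, are hypothetical. Updates with an aggregation pipeline need MongoDB 4.2 or later and a PyMongo release that accepts a pipeline list as the update argument.

```python
def build_remove_pairs_pipeline(word):
    """Return an update pipeline that drops every inner array containing `word`."""
    return [
        {
            "$set": {
                "vocabulary": {
                    "$filter": {
                        "input": "$vocabulary",
                        # Keep an inner array only if `word` is NOT one of its items.
                        "cond": {"$not": [{"$in": [word, "$$this"]}]},
                    }
                }
            }
        }
    ]


def remove_pairs_locally(vocabulary, word):
    """Pure-Python equivalent of the $filter condition above."""
    return [pair for pair in vocabulary if word not in pair]


pipeline = build_remove_pairs_pipeline("apple")
# With a live connection (hypothetical `collection` handle) this would be:
# collection.update_many({}, pipeline)

doc = {
    "name": "Example",
    "description": "foo",
    "vocabulary": [["apple", "pomme"], ["hello", "bonjour"], ["bus", "bus"]],
}
remaining = remove_pairs_locally(doc["vocabulary"], "apple")
print(remaining)
```

Note that this removes any inner array that contains the word anywhere, not only as its first item, which matches the behaviour of the shell pipeline above.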

Related

Appending list inside JSON object

This is my dictionary format:
quest_attr = {
"questions": [
{
"Tags": [
{
"tagname": ""
}
],
"Title": "",
"Authors": [
{
"name": ""
}
],
"Answers": [
{
"ans": ""
}
],
"Related_Questions": [
{
"quest": ""
}
]
}
]
}
I want to add list of "Tags" such that the result will be:
"questions":[
{
"Tags": [
{"tagname":"#Education"}, {"tagname":"#Social"}
],
remaining fields...
}
The remaining fields can be assumed to be null. And I want to add multiple questions to the main "questions" list.
I am using this code but the results are not as expected.
ind=0
size=len(tags)
while ind<size:
    quest_attr["questions"].append({["Tags"].append({"tagname":tags[ind]})})
    ind=ind+1
And if I maintain a variable for looping through the list of questions like:
quest_attr["questions"][ind]["Tags"].append({"tagname":tags[ind]})
It gives an error that the index is out of range. What should I do?
It appears that the index variable ind is intended to iterate only through the list of tags. The way you have the append structured, your loop will attempt to attach the next tag to the next question in the questions list, instead of adding the rest of the tags to the same question.
If you want to add the same set of tags to multiple questions, you need to loop through the questions list separately and nest the append statement for the tags inside a second loop. On the other hand, if there's only one question you want to target, just use its index number, [0] in this case.
Something like this would perhaps work better but more context would help:
for question in quest_attr["questions"]:
    for tag in tags:
        question["Tags"].append({"tagname":tag})
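To add several questions, each with its own tags, one possible approach is to build each question dictionary in full and append it to the list. The sketch below assumes the remaining fields can start out empty; make_question is a hypothetical helper name, and "#Python" is just an illustrative tag.

```python
def make_question(tags, title=""):
    """Build one question entry with its tags filled in and other fields empty."""
    return {
        "Tags": [{"tagname": tag} for tag in tags],
        "Title": title,
        "Authors": [],
        "Answers": [],
        "Related_Questions": [],
    }


quest_attr = {"questions": []}

# Each call creates a separate question, so no index into the list is needed.
quest_attr["questions"].append(make_question(["#Education", "#Social"]))
quest_attr["questions"].append(make_question(["#Python"]))

print(quest_attr["questions"][0]["Tags"])
```

Because each question is created before it is appended, there is no index into an element that does not exist yet, which is what caused the "index out of range" error.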
Try not to nest dicts and lists more deeply than you need to, as in your code.
Here I recommend a simpler layout.
quest_attr = {
'questions': {
"Tags":[],
"Title":"",
"Authors":[],
"Answers":[],
"Related_Questions":[]
}
}
tags = [ {"tagname":"#Education"},{"tagname":"#Social"} ]
quest_attr["questions"]['Tags'] += tags
print(quest_attr)

How can I mitigate the error in Python Dictionary?

How do I access visibilites?
I am trying it like this: dev1['data']['results :visibilites ']
dev1 = {
"status": "OK",
"data": {
"results": [
{
"tradeRelCode": "ZT55",
"customerCode": "ZC0",
"customerName": "XYZ",
"tier": "null1",
"visibilites": [
{
"code": "ZS0004207",
"name": "Aabc Systems Co.,Ltd",
"siteVisibilityMap": {
},
"customerRef": "null1"
}
]
}
],
"pageNumber": 3,
"limit": 1,
"total": 186
}
}
You can use dev1['data']['results'][0]['visibilites'].
It will contain a list of one dictionary.
To access this dictionary directly, use dev1['data']['results'][0]['visibilites'][0]
dev1['data'] is a dictionary that has results as one of its keys.
You can access the item associated with the results key (a list) using (dev1['data'])['results'].
To access the only member of this list, use ((dev1['data'])['results'])[0].
This gives you a dictionary that has the keys tradeRelCode, customerCode, customerName, tier and visibilites.
To access the item associated with the visibilites key (a list), you have to use (((dev1['data'])['results'])[0])['visibilites'].
To finally access the only dictionary contained in this list, you have to use ((((dev1['data'])['results'])[0])['visibilites'])[0].
The parentheses only show that Python digs into each dictionary or list in order from left to right. They are optional, so you can keep them if that is clearer for you.
In your data structure use path
dev1['data']['results'][0]['visibilites']
Try this
dev1['data']['results'][0]['visibilites']
Reason:
This is a list -> dev1['data']['results']
So, access this -> dev1['data']['results'][0]
and then you obtain this ->
{'tradeRelCode': 'ZT55',
'customerCode': 'ZC0',
'customerName': 'XYZ',
'tier': 'null1',
'visibilites': [{'code': 'ZS0004207',
'name': 'Aabc Systems Co.,Ltd',
'siteVisibilityMap': {},
'customerRef': 'null1'}]}
and then you can have -> dev1['data']['results'][0]['visibilites']
which results in ->
[{'code': 'ZS0004207',
'name': 'Aabc Systems Co.,Ltd',
'siteVisibilityMap': {},
'customerRef': 'null1'}]
which is a list and you can index the first element which is another dictionary
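If the response may sometimes lack a key or contain an empty list, a defensive variant of the same path can avoid KeyError and IndexError. This is only a sketch: first_visibility is a hypothetical helper, and the data below is a trimmed copy of the structure from the question.

```python
dev1 = {
    "status": "OK",
    "data": {
        "results": [
            {
                "tradeRelCode": "ZT55",
                "visibilites": [
                    {"code": "ZS0004207", "name": "Aabc Systems Co.,Ltd"}
                ],
            }
        ],
        "pageNumber": 3,
    },
}


def first_visibility(payload):
    """Return the first visibility dict, or None if any level is missing or empty."""
    results = payload.get("data", {}).get("results", [])
    if not results:
        return None
    # The key is spelled "visibilites" in the data, so the lookup must match it.
    visibilities = results[0].get("visibilites", [])
    return visibilities[0] if visibilities else None


print(first_visibility(dev1))
print(first_visibility({"status": "OK"}))
```

The second call returns None instead of raising, because each lookup falls back to an empty container.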

How to use PyMongo find() to search nested array attribute?

Using PyMongo, how would one find the documents where an object inside a nested array matches a given string?
Given the following 2 Product JSON documents in a MongoDB collection:
[{
"_id" : ObjectId("5be1a1b2aa21bb3ceac339b0"),
"id" : "1",
"prod_attr" : [
{
"name" : "Branded X 1 Sneaker"
},
{
"hierarchy" : {
"dept" : "10",
"class" : "101",
"subclass" : "1011"
}
}
]
},
{
"_id" : ObjectId("7be1a1b2aa21bb3ceac339xx"),
"id" : "2",
"prod_attr" : [
{
"name" : "Branded Y 2 Sneaker"
},
{
"hierarchy" : {
"dept" : "10",
"class" : "101",
"subclass" : "2022"
}
}
]
}
]
I would like to
1. return all documents where prod_attr.hierarchy.subclass = "2022"
2. return all documents where prod_attr.name contains "Sneaker"
I appreciate the JSON could be structured differently, unfortunately that is not within my control to change.
1. Return all documents where prod_attr.hierarchy.subclass = "2022"
Based on the Query an Array of Embedded Documents documentation of MongoDB you can use dot notation concatenating the name of the array field (prod_attr), with a dot (.) and the name of the field in the nested document (hierarchy.subclass):
collection.find({"prod_attr.hierarchy.subclass": "2022"})
2. Return all documents where prod_attr.name contains "Sneaker"
As before, you can use the dot notation to query a field of a nested element inside an array.
To perform the "contains" query you have to use the $regex operator:
collection.find({"prod_attr.name": {"$regex": "Sneaker"}})
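When the search text comes from user input, it may contain characters that have a special meaning in a regular expression. A minimal sketch of a filter builder that escapes the text first is shown below; contains_filter is a hypothetical helper, and it assumes that Python's re.escape output is acceptable to the server's regex engine for the characters involved.

```python
import re


def contains_filter(field, text, ignore_case=False):
    """Build a filter matching documents whose `field` contains `text` literally."""
    condition = {"$regex": re.escape(text)}
    if ignore_case:
        condition["$options"] = "i"
    return {field: condition}


query = contains_filter("prod_attr.name", "Sneaker")
# With a live connection (hypothetical `collection` handle):
# for doc in collection.find(query): ...

# Local sanity check of the pattern against a sample value from the question.
pattern = query["prod_attr.name"]["$regex"]
print(bool(re.search(pattern, "Branded Y 2 Sneaker")))
```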
Another option is to use the MongoDB Aggregation framework:
collection.aggregate([
{"$unwind": "$prod_attr"},
{"$match": {"prod_attr.hierarchy.subclass": "2022"}}
])
The $unwind operator creates a new document for each object inside the prod_attr array, so you will have only nested documents and no array (check the documentation for details).
The next step is the $match operator, which actually performs the query on the nested object.
This is a simple example, but the aggregation operators give you a lot of flexibility.

Add pattern inside $in, and aggregate

I am adding a pattern inside an $in in an aggregate function.
I know the values exist, but my query is returning nothing for the pattern.
Here is my query:
db.collection.aggregate([{"$unwind":"$tags"},
{'$match':
{
'tags.tag.name': {
"$in": ['AA', 'CS', '/Nie/i']},
'auditRun': 12345}},
{'$project': {
'tags.tag.name':1,
'_id': 0}},])
I am getting results for AA and CS but I am not getting anything back for the Nie.
I am expecting a few results for that as well because I have a bunch of names starting with Nie.
What am I doing wrong?
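The quoted '/Nie/i' is a plain string, so $in compares it literally and never treats it as a pattern. You could use the $or operator along with $regex to look up separately the items that need a regular expression search. A $regex operator expression cannot be placed inside the $in list.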
db.collection.aggregate([
  {
    "$unwind": "$tags"
  },
  {
    '$match': {
      $or: [
        {'tags.tag.name': {"$in": ['AA','CS']}},
        {'tags.tag.name': { $regex: 'Nie', $options: 'i'}}
      ],
      'auditRun': 12345
    }
  },
  {
    '$project': {
      'tags.tag.name': 1,
      '_id': 0
    }
  },
])
Also note that if you only need the names that start with Nie, your regex should be ^Nie.
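From Python, an alternative worth knowing is that $in accepts regular expression objects (as opposed to $regex operator expressions), and PyMongo encodes compiled re patterns as BSON regular expressions. The sketch below builds such a $match stage; build_tag_match is a hypothetical helper, and the field names and auditRun value are taken from the question.

```python
import re


def build_tag_match(names, patterns, audit_run):
    """Build a $match stage mixing exact names and case-insensitive patterns in $in."""
    candidates = list(names) + [re.compile(p, re.IGNORECASE) for p in patterns]
    return {
        "$match": {
            "tags.tag.name": {"$in": candidates},
            "auditRun": audit_run,
        }
    }


match_stage = build_tag_match(["AA", "CS"], ["^Nie"], 12345)
pipeline = [
    {"$unwind": "$tags"},
    match_stage,
    {"$project": {"tags.tag.name": 1, "_id": 0}},
]
# With a live connection (hypothetical `collection` handle):
# results = list(collection.aggregate(pipeline))

candidates = match_stage["$match"]["tags.tag.name"]["$in"]
print(candidates)
```

The key point is that the pattern is passed as a compiled object, not as a quoted string such as '/Nie/i'.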

MongoDb: $sort by $in

I am running a mongodb find query with an $in operator:
collection.find({name: {$in: [name1, name2, ...]}})
I would like the results to be sorted in the same order as my name array: [name1, name2, ...]. How do I achieve this?
Note: I am accessing MongoDb through pymongo, but I don't think that's of any importance.
EDIT: as it's impossible to achieve this natively in MongoDb, I ended up using a typical Python solution:
names = [name1, name2, ...]
results = list(collection.find({"name": {"$in": names}}))
results.sort(key=lambda x: names.index(x["name"]))
You can achieve this with the aggregation framework starting with MongoDB 3.4 (released November 2016).
Assuming the order you want is the array order=["David", "Charlie", "Tess"] you do it via this pipeline:
m = { "$match" : { "name" : { "$in" : order } } };
a = { "$addFields" : { "__order" : { "$indexOfArray" : [ order, "$name" ] } } };
s = { "$sort" : { "__order" : 1 } };
db.collection.aggregate( m, a, s );
The "$addFields" stage is new in 3.4 and allows you to "$project" new fields onto existing documents without knowing all the other existing fields. The new "$indexOfArray" expression returns the position of a particular element in a given array.
The result of this aggregation will be the documents that match your condition, in the order specified in the input array order. The documents will include all original fields, plus an additional field called __order.
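Since the question mentions pymongo, the same pipeline can be sketched as Python structures, with a client-side fallback that avoids calling names.index for every document. Both helper names are hypothetical, and the aggregation variant assumes a server that supports $addFields and $indexOfArray (3.4 or later).

```python
def build_ordered_pipeline(order):
    """Build a pipeline that returns matching documents in the order of `order`."""
    return [
        {"$match": {"name": {"$in": order}}},
        {"$addFields": {"__order": {"$indexOfArray": [order, "$name"]}}},
        {"$sort": {"__order": 1}},
    ]


def sort_like(docs, order):
    """Client-side alternative: sort documents by the position of their name in `order`."""
    rank = {}
    for position, name in enumerate(order):
        rank.setdefault(name, position)  # keep the first position if a name repeats
    return sorted(docs, key=lambda doc: rank[doc["name"]])


order = ["David", "Charlie", "Tess"]
pipeline = build_ordered_pipeline(order)
# With a live connection (hypothetical `collection` handle):
# results = list(collection.aggregate(pipeline))

docs = [{"name": "Tess"}, {"name": "David"}, {"name": "Charlie"}]
print([doc["name"] for doc in sort_like(docs, order)])
```

The dictionary lookup makes the client-side sort scale better than list.index when the name list is long.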
Impossible. The $in operator only checks for presence; the list is treated as a set.
Options:
1. Split the work into several queries, one each for name1 ... nameN, or filter the result the same way. More names means more queries.
2. Use itertools groupby/ifilter. In that case, add a "sorting precedence" flag to every document, matching name1 to PREC1, name2 to PREC2, and so on, then sort by PREC and group by PREC.
If your collection has an index on the "name" field, option 1 is better.
If it does not have the index, or you cannot create it due to a high write/read ratio, option 2 is for you.
Vitaly is correct that it's impossible to do this with find, but it can be achieved with an aggregation:
db.collection.aggregate([
{ $match: { name: { $in: [name1, name2, /* ... */] } } },
{
$project: {
name: 1,
name1: { $eq: ['name1', '$name'] },
name2: { $eq: ['name2', '$name'] },
},
},
{ $sort: { name1: 1, name2: 1 } },
])
Tested on 2.6.5.
I hope this points other people in the right direction.
