Add pattern inside $in, and aggregate - python

I am adding a pattern inside an $in in an aggregate function.
I know the values exist, but my query is returning nothing for the pattern.
Here is my query:
db.collection.aggregate([{"$unwind":"$tags"},
{'$match':
{
'tags.tag.name': {
"$in": ['AA', 'CS', '/Nie/i']},
'auditRun': 12345}},
{'$project': {
'tags.tag.name':1,
'_id': 0}},])
I am getting results for AA and CS but I am not getting anything back for the Nie.
I am expecting a few results for that as well because I have a bunch of names starting with Nie.
What am I doing wrong?

You could use the $or operator along with $regex to look up separately the items that need a regular expression search. A $regex operator expression cannot be placed inside an $in list.
db.collection.aggregate([{
"$unwind": "$tags"
},
{
'$match': {
'$or': [
{'tags.tag.name': {"$in": ['AA','CS']}},
{'tags.tag.name': {'$regex': 'Nie', '$options': 'i'}}
],
'auditRun': 12345
}
},
{
'$project': {
'tags.tag.name': 1,
'_id': 0
}
},
])
Also, if you only need the names that start with Nie, then your regex should be ^Nie.
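Since the question is tagged python, it may be worth noting that $in does accept regular expression objects (it only rejects $regex operator expressions), and PyMongo encodes compiled Python patterns as BSON regular expressions. The following is a minimal sketch under that assumption, reusing the field names from the question; the aggregate call is commented out because it needs a live connection.

```python
import re

# Plain strings are matched exactly; the compiled pattern is sent to the
# server as a regular expression (here: names starting with "Nie",
# case-insensitive).
name_filter = ["AA", "CS", re.compile(r"^Nie", re.IGNORECASE)]

pipeline = [
    {"$unwind": "$tags"},
    {"$match": {"tags.tag.name": {"$in": name_filter}, "auditRun": 12345}},
    {"$project": {"tags.tag.name": 1, "_id": 0}},
]

# results = list(db.collection.aggregate(pipeline))  # requires a MongoDB connection
```

The original query failed because '/Nie/i' was a plain string, so the server looked for a name literally equal to "/Nie/i".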


Pull from nested Array in MongoDB

I have a document structure like this:
{
"name": "Example",
"description": "foo",
"vocabulary": [
["apple", "pomme"],
["hello", "bonjour"],
["bus", "bus"]
]
}
Now I want to pull an array inside the vocabulary array by specifying the first item, e.g.:
{"$pull": {"vocabulary.$": ["apple"]}}
Which should remove the array ["apple", "pomme"] from vocabulary, but this doesn't work.
I tried this ($pull from nested array), but it did not work, it threw
pymongo.errors.WriteError:
The positional operator did not find the match needed from the query., full error: {'index': 0, 'code': 2, 'errmsg': 'The positional operator did not find the match needed from the query.'}
Very tricky question.
I think the $ positional operator is not suitable for this case.
Instead, you need an aggregation pipeline in the update query (supported since MongoDB 4.2).
Use $filter to keep only the inner arrays in which "apple" does not ($not) appear ($in), then $set the result as the new value of the vocabulary field.
db.collection.update({},
[
{
"$set": {
"vocabulary": {
$filter: {
input: "$vocabulary",
cond: {
$not: {
$in: [
"apple",
"$$this"
]
}
}
}
}
}
}
])
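For PyMongo, the same update pipeline could be sketched as below. The helper names build_remove_pairs_update and filter_locally are hypothetical, introduced only for illustration; filter_locally is a pure-Python stand-in that mirrors the $filter condition, so the effect can be checked without a server. Note that the condition removes every inner array containing the word anywhere, not only in first position.

```python
def build_remove_pairs_update(word):
    """Return an update pipeline that drops every inner array containing `word`."""
    return [
        {
            "$set": {
                "vocabulary": {
                    "$filter": {
                        "input": "$vocabulary",
                        # Keep an inner array only if `word` is NOT one of its items.
                        "cond": {"$not": {"$in": [word, "$$this"]}},
                    }
                }
            }
        }
    ]


def filter_locally(vocabulary, word):
    """Pure-Python equivalent of the $filter condition, for illustration only."""
    return [pair for pair in vocabulary if word not in pair]


update = build_remove_pairs_update("apple")
# collection.update_many({}, update)  # requires MongoDB 4.2+ and a live connection

sample = [["apple", "pomme"], ["hello", "bonjour"], ["bus", "bus"]]
remaining = filter_locally(sample, "apple")
```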

Select multiple values from a document

Background
I have data stored in the following format
{
"player_id": "VU3R5HNTAGMK",
"markers": {
"BICF2P964092": "GC",
"BICF2G630653981": "CG",
"BICF2P483996": "CT",
"BICF2S23452916": "CG",
"chr26_19147949": "TC",
}
}
You can imagine I have data stored for multiple players; each has a unique player_id and a varying number of markers with different marker values.
In the above case, one marker is BICF2P964092 and its marker value is GC.
I am trying to query my MongoDB collection in various ways. One obvious way is by player_id. To do that, I do the following: col.find({"player_id": "VU3R5HNTAGMK"})
Another thing I want to do is get the value of a specific marker for a specific player. For that I can do the following: col.find({"player_id": "VU3R5HNTAGMK"}, {'markers.BICF2P964092'})
ISSUE
I also want to be able to get the values of multiple markers for a specific player, and I am not able to do so. I have tried the following with no luck.
col.find({"player_id": "VU3R5HNTAGMK"},{'markers': {'$in': ["BICF2P964092", "chr26_19147949"]}})
col.find({"player_id": "VU3R5HNTAGMK"}, {'markers.BICF2P964092'}, {'markers.chr26_19147949'})
I would really appreciate it if someone could help me write a query that returns multiple marker values for the specified markers and player_id.
You can simply do the following
col.find({"player_id": "VU3R5HNTAGMK"}, {"markers." + m: 1 for m in ["BICF2P964092", "BICF2G630653981"]})
As you've tagged this pymongo, you might be best off processing the marker values in Python after the find; e.g.
docs = col.find({"player_id": "VU3R5HNTAGMK"})
for doc in docs:
    for marker, value in doc.get('markers').items():
        if marker in ["BICF2P964092", "chr26_19147949"]:
            print(marker, value)
@Belly Buster's solution is good if you want to handle this using Python.
But, there is a way to completely handle this on the MongoDB side using Aggregation.
You can combine $objectToArray, $filter, and $arrayToObject operators in $project stage.
collection.aggregate([
{
"$match": {
"player_id": "VU3R5HNTAGMK" # <-- All your match conditions
}
},
{
"$project": {
"player_id": 1, # All the other keys which you want to project
"markers": {
"$arrayToObject": {
"$filter": {
"input": {
"$objectToArray": "$markers"
},
"as": "elem",
"cond": {
"$in": [
"$$elem.k",
[
# <-- List of key names you want to project
"BICF2G630653981",
"BICF2P483996"
]
]
},
},
},
},
}
},
])
Note: You have to use MongoDB version >= 3.4.4 for this aggregation query to work.
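To make the list of markers a parameter, the pipeline above could be wrapped in a small function. This is only a sketch: markers_pipeline and select_markers are hypothetical helper names, and select_markers is a local Python equivalent of the $objectToArray / $filter / $arrayToObject chain, included so the intended result can be checked without a server.

```python
def markers_pipeline(player_id, wanted):
    """Build the aggregation pipeline for one player and a list of marker names."""
    return [
        {"$match": {"player_id": player_id}},
        {
            "$project": {
                "player_id": 1,
                "markers": {
                    "$arrayToObject": {
                        "$filter": {
                            "input": {"$objectToArray": "$markers"},
                            "as": "elem",
                            # Keep only the key/value pairs whose key is wanted.
                            "cond": {"$in": ["$$elem.k", list(wanted)]},
                        }
                    }
                },
            }
        },
    ]


def select_markers(doc, wanted):
    """Local equivalent of the $project stage: keep only the requested markers."""
    wanted = set(wanted)
    return {k: v for k, v in doc.get("markers", {}).items() if k in wanted}


doc = {
    "player_id": "VU3R5HNTAGMK",
    "markers": {"BICF2P964092": "GC", "BICF2G630653981": "CG", "chr26_19147949": "TC"},
}
pipeline = markers_pipeline("VU3R5HNTAGMK", ["BICF2P964092", "chr26_19147949"])
selected = select_markers(doc, ["BICF2P964092", "chr26_19147949"])
# docs = list(collection.aggregate(pipeline))  # requires a live connection
```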

How can I mitigate the error in Python Dictionary?

How do I access visibilites?
I am trying like this: dev1['data']['results :visibilites ']
dev1 = {
"status": "OK",
"data": {
"results": [
{
"tradeRelCode": "ZT55",
"customerCode": "ZC0",
"customerName": "XYZ",
"tier": "null1",
"visibilites": [
{
"code": "ZS0004207",
"name": "Aabc Systems Co.,Ltd",
"siteVisibilityMap": {
},
"customerRef": "null1"
}
]
}
],
"pageNumber": 3,
"limit": 1,
"total": 186
}
}
You can use dev1['data']['results'][0]['visibilites'].
It will contain a list of one dictionary.
To access this dictionary directly, use dev1['data']['results'][0]['visibilites'][0]
dev1['data'] is a dictionary that has a results key.
You can access the item associated with the results key (a list) using (dev1['data'])['results'].
To access the only member of this list, you use ((dev1['data'])['results'])[0].
This gives you a dictionary that has tradeRelCode, customerCode, customerName, tier and visibilites keys.
To access the item associated with the visibilites key (a list), you have to use (((dev1['data'])['results'])[0])['visibilites'].
To finally access the only dictionary contained in this list, you have to use ((((dev1['data'])['results'])[0])['visibilites'])[0].
The parentheses are only here to show that Python digs into each dictionary or list in order from left to right (Python ignores the redundant parentheses, so you can keep them if that is clearer for you).
In your data structure, use the path
dev1['data']['results'][0]['visibilites']
Try this
dev1['data']['results'][0]['visibilites']
Reason:
This is a list -> dev1['data']['results']
So, access this -> dev1['data']['results'][0]
and then you obtain this ->
{'tradeRelCode': 'ZT55',
'customerCode': 'ZC0',
'customerName': 'XYZ',
'tier': 'null1',
'visibilites': [{'code': 'ZS0004207',
'name': 'Aabc Systems Co.,Ltd',
'siteVisibilityMap': {},
'customerRef': 'null1'}]}
and then you can have -> dev1['data']['results'][0]['visibilites']
which results in ->
[{'code': 'ZS0004207',
'name': 'Aabc Systems Co.,Ltd',
'siteVisibilityMap': {},
'customerRef': 'null1'}]
which is a list, and you can index its first element, which is another dictionary.
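The access path described in the answers can be tried on a trimmed copy of the structure. This is a small sketch: the data is shortened from the question, and the use of .get with a default is an added suggestion for results that might lack the key, not something the answers above require.

```python
dev1 = {
    "status": "OK",
    "data": {
        "results": [
            {
                "tradeRelCode": "ZT55",
                "customerName": "XYZ",
                "visibilites": [
                    {"code": "ZS0004207", "name": "Aabc Systems Co.,Ltd"}
                ],
            }
        ],
        "total": 186,
    },
}

# "results" is a list, so it must be indexed before the next key is used.
visibilites = dev1["data"]["results"][0]["visibilites"]
first = visibilites[0]

# Walk every result instead of hard-coding index 0; .get avoids a KeyError
# for any result that has no "visibilites" key.
names = [
    entry["name"]
    for result in dev1["data"]["results"]
    for entry in result.get("visibilites", [])
]
```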

Selectively retrieving depending on existence of key in map

Is it possible to selectively retrieve depending on the existence of keys in a map in mongodb? And if so, how do you go about doing it?
Suppose I have a document that looks like this for example..
{ "_id": 1234,
"parentfield1" : {
"childfield1" : { ...},
"childfield2" : { ...},
"childfield5" : { ...}, // There might be many childfields.. > 50
},
}
How would I be able to selectively retrieve from the document a/some particular childfields given multiple options to choose from? Some of which may not exist in the document.
i.e.
input "childfield1", "childfield2", "childfield3"
-> output
{ "_id": 1234,
"parentfield1": {
"childfield1" : { ... },
"childfield2" : { ... },
},
}
Is it even doable? Is it possible to do efficiently also?
Any help would be great (python, go).
Yes, that's the purpose of the projection parameter of find:
db.collection.find({_id: 1234}, {
'parentfield1.childfield1': 1,
'parentfield1.childfield2': 1,
'parentfield1.childfield3': 1
});
If a specified field isn't present in a given doc, the other matching fields will still be included.
Build up the projection parameter object programmatically if you want it to be dynamic.
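In Python, building the projection programmatically could look like the sketch below. build_projection is a hypothetical helper name; the field names are the ones from the question, and the find_one call is commented out because it needs a live connection.

```python
def build_projection(parent, children):
    """Map each requested child field to 1 using dot notation."""
    return {f"{parent}.{child}": 1 for child in children}


projection = build_projection(
    "parentfield1", ["childfield1", "childfield2", "childfield3"]
)
# doc = db.collection.find_one({"_id": 1234}, projection)  # requires a live connection
```

Fields listed in the projection but missing from a document are simply absent from the result, as described above.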

MongoDb: $sort by $in

I am running a mongodb find query with an $in operator:
collection.find({name: {$in: [name1, name2, ...]}})
I would like the results to be sorted in the same order as my name array: [name1, name2, ...]. How do I achieve this?
Note: I am accessing MongoDb through pymongo, but I don't think that's of any importance.
EDIT: as it's impossible to achieve this natively in MongoDb, I ended up using a typical Python solution:
names = [name1, name2, ...]
results = list(collection.find({"name": {"$in": names}}))
results.sort(key=lambda x: names.index(x["name"]))
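A variation on the client-side sort above, sketched under the assumption that the names are unique: names.index scans the list for every document, whereas a precomputed name-to-position dictionary makes each lookup constant time. The sample names and the results list are stand-ins for what find would return.

```python
names = ["David", "Charlie", "Tess"]

# Precompute each name's position once instead of calling names.index per document.
rank = {name: position for position, name in enumerate(names)}

# Stand-in for list(collection.find({"name": {"$in": names}}))
results = [{"name": "Tess"}, {"name": "David"}, {"name": "Charlie"}]

ordered = sorted(results, key=lambda doc: rank[doc["name"]])
```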
You can achieve this with the aggregation framework starting with MongoDB 3.4 (released November 2016).
Assuming the order you want is the array order=["David", "Charlie", "Tess"] you do it via this pipeline:
m = { "$match" : { "name" : { "$in" : order } } };
a = { "$addFields" : { "__order" : { "$indexOfArray" : [ order, "$name" ] } } };
s = { "$sort" : { "__order" : 1 } };
db.collection.aggregate([ m, a, s ]);
The "$addFields" stage is new in 3.4 and allows you to "$project" new fields onto existing documents without knowing all the other existing fields. The new "$indexOfArray" expression returns the position of a particular element in a given array.
The result of this aggregation will be the documents that match your condition, in the order specified in the input array order; the documents will include all original fields, plus an additional field called __order.
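As the question mentions PyMongo, the same three stages could be assembled in Python as follows. This is a sketch only: ordered_in_pipeline is a hypothetical helper name, and the aggregate call is commented out because it needs MongoDB 3.4 or later and a live connection.

```python
def ordered_in_pipeline(field, order):
    """Match documents whose `field` is in `order`, sorted by position in `order`."""
    return [
        {"$match": {field: {"$in": order}}},
        # __order holds the index of the document's value within `order`.
        {"$addFields": {"__order": {"$indexOfArray": [order, f"${field}"]}}},
        {"$sort": {"__order": 1}},
    ]


pipeline = ordered_in_pipeline("name", ["David", "Charlie", "Tess"])
# docs = list(collection.aggregate(pipeline))  # requires a live connection
```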
Impossible. The $in operator only checks for presence; the list is treated as a set.
Options:
1. Split the query into several queries, one for each of name1 ... nameN, or filter the result the same way. More names means more queries.
2. Use itertools groupby/ifilter. In that case, add a "sorting precedence" flag to every document, matching name1 to PREC1, name2 to PREC2, and so on, then sort by PREC and group by PREC.
If your collection has an index on the "name" field, option 1 is better.
If it does not have the index, or you cannot create it due to a high write/read ratio, option 2 is for you.
Vitaly is correct: it's impossible to do that with find, but it can be achieved with an aggregation:
db.collection.aggregate([
{ $match: { name: { $in: [name1, name2, /* ... */] } } },
{
$project: {
name: 1,
name1: { $eq: ['name1', '$name'] },
name2: { $eq: ['name2', '$name'] },
},
},
{ $sort: { name1: -1, name2: -1 } }, // true sorts after false, so descending puts matches first
])
tested on 2.6.5
I hope this will hint other people in the right direction.
