Updating an object inside an array with PyMongo - python

How do you update a nested array with PyMongo/MongoDB by selecting a document (row) and then selecting a specific object inside its nested array?
{
"_id" : "12345",
"name" : "John Doe",
"mylist" : [
{
"nested_id" : "1",
"data1" : "lorem ipsum",
"data2" : "stackoverflow",
"data3" : "james bond"
},
{
"nested_id" : "2",
"data1" : "lorem ipsum",
"data2" : "stackoverflow",
"data3" : "james bond"
},
{
....
}
]
}
Then, let's say you pass a dictionary with the elements you want to update. In this example, only data1 and data3 should be updated:
data = {
"data1" : "new lorem",
"data3" : "goldeneye"
}
I have tried with the following syntax, but with no success.
db.testing.find_and_modify(
query={"_id": "12345", 'mylist.nested_id' : "1"},
update={"$set": {'mylist' : data}})
This is what it should look like after the update:
{
"_id" : "12345",
"name" : "John Doe",
"mylist" : [
{
"nested_id" : "1",
"data1" : "new lorem",
"data2" : "stackoverflow",
"data3" : "goldeneye"
},
{
"nested_id" : "2",
"data1" : "lorem ipsum",
"data2" : "stackoverflow",
"data3" : "james bond"
},
{
....
}
]
}

Use "dot notation" and the positional operator in the update portion. Also transform your input to match the "dot notation" form for the key representation:
# Transform to "dot notation" on explicit field
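# Build a new dict instead of adding and deleting keys in place:
# mutating a dict while iterating over it is unsafe in Python.
data = {"mylist.$." + key: value for key, value in data.items()}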
# Basically makes
# {
# "mylist.$.data1": "new lorem",
# "mylist.$.data3": "goldeneye"
# }
db.testing.find_and_modify(
query = {"_id": "12345", 'mylist.nested_id' : "1"},
update = { "$set": data }
)
The $ is replaced with the position of the array element actually matched in the query portion of the update. Only that matched element is updated, and because of the "dot notation" only the fields you mention are affected.
Have no idea what "service" is supposed to mean in this context and I am just treating it as a "transcribing error" since you are clearly trying to match an array element in position.
That could be cleaner, but this should give you the general idea.
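As a rough sketch of the same idea in a reusable form: the helper name to_positional_set below is made up for illustration, and the update_one call is left as a comment because it assumes a live connection and the testing collection from the question. As far as I know, find_and_modify was deprecated in PyMongo 3 and removed in PyMongo 4, so update_one (or find_one_and_update) is the likely replacement on current versions.

```python
def to_positional_set(array_field, changes):
    """Build a $set document that targets the array element matched by the query.

    Hypothetical helper: it returns a new dict and leaves `changes` untouched.
    """
    return {
        "$set": {
            f"{array_field}.$.{key}": value
            for key, value in changes.items()
        }
    }


data = {"data1": "new lorem", "data3": "goldeneye"}
query = {"_id": "12345", "mylist.nested_id": "1"}
update = to_positional_set("mylist", data)
print(update)

# With a live connection (assumed, not run here):
# db.testing.update_one(query, update)
```

The query has to reference the array field (here mylist.nested_id), otherwise the positional $ has no matched element to resolve to.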

Related

Formatting issue, when python dictionary is dumped in to json objects

I have two dictionaries, test1 and test2. I have to compare them recursively: if test1 contains the key description$$, I have to replace the value of the corresponding key (description) in test2 with the value from test1, and then dump the result into a JSON file. I was able to get this working, but the output JSON file is not in the expected format.
sample.py
import json
test1 = {
"info" : {
"title" : "some random data",
"description$$" : "CHANGED::::",
"version" : "x.x.1"
},
"schemes" : [ "https" ],
"basePath" : "/sch/f1"
}
test2 = {
"info" : {
"title" : "some random data",
"description" : "before change",
"version" : "x.x.4"
},
"schemes" : [ "https" ],
"basePath" : "/sch/f2"
}
def walk(test1, test2):
    for key, item in test1.items():
        if type(item) is dict:
            walk(item, test2[key])
        else:
            if str(key) == "description$$" or str(key) == "summary$$":
                modfied_key = str(key)[:-2]
                test2[modfied_key] = test1[key]
walk(test1, test2)
json.dump(test2, open('outputFile.json', "w"), indent=2)
My output is:
outputFile.json
{
"info": {
"title": "some random data",
"description": "CHANGED::::",
"version": "x.x.4"
},
"schemes": [
"https"
],
"basePath": "/sch/f2"
}
but the expected output should be -
{
"info": {
"title": "some random data",
"description": "CHANGED::::",
"version": "x.x.4"
},
"schemes": ["https"],
"basePath": "/sch/f2"
}
The schemes array should be printed on a single line, but in my output it takes three lines. How can I fix this?
Thank you

How to iterate over a mongo document with an embedded doc using pymongo?

I have MongoDB data that looks like this:
{
"_id" : ObjectId("602ad096f7449063397d41bd"),
"name" : "1234/1236 Main St",
"city" : "Indianapolis",
"state" : "IN",
"zip" : "46208",
"total_units" : 2,
"units" : [
{
"unit_num" : 1,
"street" : "1234 Main",
"bedrooms" : 3,
"bath" : 1,
"sqft" : 1225,
"monthly_rent" : 800,
"lease_expiration" : "2021-06-30"
},
{
"unit_num" : 2,
"street" : "1236 Main",
"bedrooms" : 3,
"bath" : 1,
"sqft" : 1225,
"monthly_rent" : 800,
"lease_expiration" : "2021-07-31"
}
]
}
Using Python and PyMongo, I'm trying to iterate over all the entries (not all of them have multiple units) and return the monthly_rent and lease_expiration values. This is what I have in the shell:
db.property.find({active: true}, {_id: 0, name: 1, "street": 1,"units.monthly_rent": 1, "units.lease_expiration": 1}).sort({"units.lease_expiration": 1})
and it returns this:
{
"name" : "1234/1236 Main St",
"units" : [
{
"street" : "1234 Main St",
"monthly_rent" : 800,
"lease_expiration" : "2021-06-30"
},
{
"street" : "1236 Main St",
"monthly_rent" : 800,
"lease_expiration" : "2021-07-31"
}
]
}
Within the python script, I want to iterate over each property and unit and print out name,"units.street","units.monthly_rent", "units.lease_expiration", but I can't seem to get it to traverse the units array. I'm testing this:
for prop in list(db.mycol.find({})):
print(prop)
print(prop["units.monthly_rent"])
The print(prop) prints out all the data as expected, but the other print statement gives an error: KeyError: 'units.monthly_rent'
Can someone point me in the right direction?
Adding aggregation query from comments below:
db.property.aggregate([
{ $project: {
"units": {
$filter: { input: "$units", as: "u",
cond: { $gte: [ "$$u.unit_num", 0 ] }
} } }
},
{ $unwind: "$units" },
{ $project: {
"street": "$units.street",
"monthly_rent": "$units.monthly_rent",
"lease_expiration": "$units.lease_expiration" }
}
])
You have to loop over the units and print each unit's monthly_rent:
for prop in list(db.mycol.find({})):
    print(prop)
    for unit in list(prop['units']):
        print(unit['monthly_rent'])
units is an array, so you can't do prop['units']['monthly_rent']; you have to index into it instead, e.g. prop['units'][0]['monthly_rent'], and so on.
You have to get each element first, and then you can access the fields of the object inside the array.
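Since the question mentions that not every entry has multiple units, here is a small sketch of the same loop that tolerates a missing units field. The function name rent_rows and the second sample document are invented for illustration; plain dicts stand in for what db.mycol.find({}) would return.

```python
def rent_rows(properties):
    """Yield (name, street, monthly_rent, lease_expiration) for every unit.

    Hypothetical helper: documents without a "units" array are skipped.
    """
    for prop in properties:
        for unit in prop.get("units", []):
            yield (
                prop.get("name"),
                unit.get("street"),
                unit.get("monthly_rent"),
                unit.get("lease_expiration"),
            )


# Stand-in for the cursor; with a live connection this would be db.mycol.find({}).
sample = [
    {
        "name": "1234/1236 Main St",
        "units": [
            {"street": "1234 Main", "monthly_rent": 800, "lease_expiration": "2021-06-30"},
            {"street": "1236 Main", "monthly_rent": 800, "lease_expiration": "2021-07-31"},
        ],
    },
    {"name": "Made-up property without units"},  # illustrative only
]

rows = list(rent_rows(sample))
for row in rows:
    print(row)
```

Using .get with a default avoids the KeyError for documents where a field is absent.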

How to insert an array into an embedded document in MongoDB

I have a document like this:
b = { "_id":"10001", "comments":[{"comid":"3","comtime":"2014","author":"jenny"}]}
I want to insert another one like:
c = {"comid":"34","comtime":"2015","author":"jack"}
into the comment whose comid is "3".
The result I want is:
{
"_id" : "10001",
"comments" : [
{
"comid" : "3",
"comtime" : "2014",
"author" : "jenny",
"replycomment" : [
{
"comid" : "34",
"comtime" : "2015",
"author" : "jack"
}
]
}
]
}
That is, I want to have another embedded document in the array for replied comments.
Any ideas?
You need to use the update_one method and the $push update operator.
replycomment = {"comid": "34", "comtime": "2015", "author": "jack"}
collection.update_one(
{"comments.comid": "3"},
{"$push": {"comments.$.replycomment": replycomment}}
)

MongoDB: Remove duplicate records from Projection

How can I remove duplicate records from a MongoDB projection?
Let's say I have my Mongo documents in the following form:
{"_id":"55555454", "From":"Bob", "To":"Alice", "subject":"Hi", "date":"04102011"}
{"_id":"55555455", "From":"Bob", "To":"Dave", "subject":"Hello", "date":"04102014"}
{"_id":"55555456", "From":"Bob", "To":"Alice", "subject":"Bye", "date":"04112013"}
When I do a simple projection
db.col.find({}, {"From":1, "To":1, "_id":0})
it will obviously give me all three records, like this:
{"From":"Bob", "To":"Alice"} {"From":"Bob","To":"Dave"} {"From":"Bob", "To":"Alice"}
However, what I want is only two records, like this:
{"From":"Bob", "To":"Alice"} {"From":"Bob","To":"Dave"}
As my application is currently in Python (using pymongo), I am removing the duplicates in the application from the list of records using
result = [dict(tupleized) for tupleized in set(tuple(item.items()) for item in l)]
Is there any DB method that I can apply to the projection so that it gives me only two records?
You can't eliminate duplicate documents in MongoDB using just find with a projection.
find won't work here because it returns a cursor to the client, and as such it can't reduce the results to only the unique documents without a second pass.
Using this as test data (removed the _id):
> db.test.find()
{ "From" : "Bob", "To" : "Alice", "subject" : "Hi", "date" : "04102011" }
{ "From" : "Bob", "To" : "Dave", "subject" : "Hello", "date" : "04102014" }
{ "From" : "Bob", "To" : "Alice", "subject" : "Bye", "date" : "04112013" }
{ "From" : "Bob", "To" : "Alice", "subject" : "Hi", "date" : "04102011" }
{ "From" : "Bob", "To" : "Dave", "subject" : "Hello", "date" : "04102014" }
{ "From" : "Bob", "To" : "Alice", "subject" : "Bye", "date" : "04112013" }
{ "From" : "Bob", "To" : "Dave", "subject" : "Hello", "date" : "04102014" }
{ "From" : "Bob", "To" : "Alice", "subject" : "Bye", "date" : "04112013" }
{ "From" : "George", "To" : "Carl", "subject" : "Bye", "date" : "04112013" }
{ "From" : "David", "To" : "Carl", "subject" : "Bye", "date" : "04112013" }
You could use aggregation:
> db.test.aggregate({ $group: { _id: { "From": "$From", "To": "$To" }}})
Results:
{
"result" : [
{
"_id" : {
"From" : "David",
"To" : "Carl"
}
},
{
"_id" : {
"From" : "George",
"To" : "Carl"
}
},
{
"_id" : {
"From" : "Bob",
"To" : "Dave"
}
},
{
"_id" : {
"From" : "Bob",
"To" : "Alice"
}
}
],
"ok" : 1
}
The Python code should look very similar to the aggregation pipeline suggested above.
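A minimal sketch of what that might look like with PyMongo. The pipeline is the $group stage from above written as a Python list; the helper unique_pairs is a made-up name, and the aggregate call is left as a comment because it assumes a live connection and the test collection used here. Recent PyMongo versions return a cursor of documents rather than the "result" wrapper shown above.

```python
# The $group stage from the shell example, as a PyMongo pipeline (a list of stages).
pipeline = [
    {"$group": {"_id": {"From": "$From", "To": "$To"}}},
]


def unique_pairs(grouped_docs):
    """Unwrap the grouped _id so each item looks like {"From": ..., "To": ...}.

    Hypothetical helper for illustration.
    """
    return [doc["_id"] for doc in grouped_docs]


# With a live connection (assumed, not run here):
# grouped = db.test.aggregate(pipeline)
grouped = [
    {"_id": {"From": "Bob", "To": "Dave"}},
    {"_id": {"From": "Bob", "To": "Alice"}},
]

pairs = unique_pairs(grouped)
print(pairs)
```

The order of the groups is not guaranteed unless a $sort stage is added to the pipeline.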
Projection only defines which fields you want to appear in the result. It is much like the statement starting with:
SELECT From, To
as opposed to the basic form of
SELECT *
So what you actually wanted to do was the equivalent of this:
db.collection.find(
{ "From": "Bob", "To": "Alice" },
{ "From": 1, "To": 1 }
)
Which actually selects the records that you want and is much the same form as:
SELECT From, To
FROM collection
WHERE
From = "Bob"
AND To = "Alice"
Should that actually somehow produce "duplicate" results, then you can remove them by using aggregate:
db.collection.aggregate([
{ "$match": {
"From": "Bob", "To": "Alice"
}},
{ "$group": {
"_id": {
"From": "$From", "To": "$To"
}
}}
])

ElasticSearch: Finding documents with field value that is in an array

I have some customer documents that I want to be retrieved using ElasticSearch based on where the customers come from (country field is IN an array of countries).
[
{
"name": "A1",
"address": {
"street": "1 Downing Street",
"country": {
"code": "GB",
"name": "United Kingdom"
}
}
},
{
"name": "A2",
"address": {
"street": "25 Gormut Street",
"country": {
"code": "FR",
"name": "France"
}
}
},
{
"name": "A3",
"address": {
"street": "Bonjour Street",
"country": {
"code": "FR",
"name": "France"
}
}
}
]
Now, I have another array in my Python code:
["DE", "FR", "IT"]
I'd like to obtain the two documents, A2 and A3.
How would I write this in PyES/Query DSL? Am I supposed to be using an ExistsFilter or a TermQuery for this? ExistsFilter seems to only check whether the field exists or not, but doesn't care about the value.
In NoSQL-type document stores, all you get back is the document, not parts of the document.
Your requirement: "I'd like to obtain the two documents, A2 and A3." implies that you need to index each of those documents separately, not as an array inside another "parent" document.
If you need to match values of the parent document alongside country then you need to denormalize your data and store those values from the parent doc inside each sub-doc as well.
Once you've done the above, then the query is easy. I'm assuming that the country field is mapped as:
country: {
type: "string",
index: "not_analyzed"
}
To find docs with DE, you can do:
curl -XGET 'http://127.0.0.1:9200/_all/_search?pretty=1' -d '
{
"query" : {
"constant_score" : {
"filter" : {
"term" : {
"country" : "DE"
}
}
}
}
}
'
To find docs with either DE or FR:
curl -XGET 'http://127.0.0.1:9200/_all/_search?pretty=1' -d '
{
"query" : {
"constant_score" : {
"filter" : {
"terms" : {
"country" : [
"DE",
"FR"
]
}
}
}
}
}
'
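Since the country codes in the question live in a Python list, the request body above can be built from that list directly. This is only a sketch: the function name country_terms_query is invented, the field name country follows the mapping assumed in this answer, and the body is printed as JSON rather than sent with any particular client.

```python
import json


def country_terms_query(codes, field="country"):
    """Build the constant_score / terms request body shown above.

    Hypothetical helper; `field` defaults to the "country" mapping assumed here.
    """
    return {
        "query": {
            "constant_score": {
                "filter": {
                    "terms": {field: list(codes)}
                }
            }
        }
    }


countries = ["DE", "FR", "IT"]
body = country_terms_query(countries)
print(json.dumps(body, indent=2))
```

The printed JSON can be passed as the -d payload of the curl command, or handed to whichever Python client you use.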
To combine the above with some other query terms:
curl -XGET 'http://127.0.0.1:9200/_all/_search?pretty=1' -d '
{
"query" : {
"filtered" : {
"filter" : {
"terms" : {
"country" : [
"DE",
"FR"
]
}
},
"query" : {
"text" : {
"address.street" : "bonjour"
}
}
}
}
}
'
Also see this answer for an explanation of how arrays of objects can be tricky, because of the way they are flattened:
Is it possible to sort nested documents in ElasticSearch?
