If I have a document that looks like this:
{
  "_id": 1,
  "name": "Homer J. Simpson",
  "income": 45000,
  "address": {
    "street": "742 Evergreen Terrace",
    "city": "Springfield",
    "state": "???",
    "email": "homer@springfield.com",
    "zipcode": "12345",
    "country": "USA"
  }
}
And I want to update some of the fields in the address sub-document (leaving the others unchanged), inserting new fields if they do not already exist, such as this:
{
  "address": {
    "email": "homer@gmail.com",
    "zipcode": "77788",
    "latitude": 23.43545,
    "longitude": 123.45553
  }
}
Is there a way to do an atomic update all at once, or do you need to loop over the key/values in the new data and do a .update() for each one?
Use dot notation with a $set to target multiple embedded fields in a single update:
{ "$set": {
"address.email": "homer#gmail.com",
"address.zipcode": "77788",
"address.latitude" : 23.43545,
"address.longitude" : 123.45553
} }
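For example, with pymongo (the connection, collection name, and filter are assumptions for illustration), this is a single atomic update_one call:
from pymongo import MongoClient

# Assumed connection and collection names
client = MongoClient()
people = client["test"]["people"]

# One atomic update: each dotted path touches only that embedded field,
# creating it if it does not exist and leaving the rest of "address" intact
people.update_one(
    {"_id": 1},
    {"$set": {
        "address.email": "homer@gmail.com",
        "address.zipcode": "77788",
        "address.latitude": 23.43545,
        "address.longitude": 123.45553,
    }}
)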
As Sergio mentioned, use $set with dot notation, e.g.:
{ "$set": { "address.zipcode": "77788" } }
Related example:
_id = 001
field 'location' = PARIS FRANCE
field 'country' = FRANCE
_id = 002
field 'location' = TORONTO
field 'country' = CANADA
Desired result:
The ability to recognize that for _id 001, "FRANCE" also appears in the value of the location field, whereas for _id 002 the country value does not appear in the location value.
Instead of relying on pandas, I would like to see if there are more efficient options using pymongo, for example.
This is sensitive to case, possible abbreviations, etc., but here is one way to identify whether one string is contained within the other.
Given an example collection like this:
[
{
"_id": "001",
"location": "PARIS FRANCE",
"country": "FRANCE"
},
{
"_id": "002",
"location": "TORONTO",
"country": "CANADA"
}
]
This will set "isIn" if "country" is contained within "location" or vice-versa.
db.collection.aggregate([
{
"$set": {
"isIn": {
"$gte": [
{
"$sum": [
{ // returns pos or -1 if not found
"$indexOfCP": ["$location", "$country"]
},
{"$indexOfCP": ["$country", "$location"]}
]
},
-1
]
}
}
}
])
Example output:
[
{
"_id": "001",
"country": "FRANCE",
"isIn": true,
"location": "PARIS FRANCE"
},
{
"_id": "002",
"country": "CANADA",
"isIn": false,
"location": "TORONTO"
}
]
Try it on mongoplayground.net.
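Since the question mentions pymongo, here is a minimal sketch of running the same pipeline from Python (the connection details and collection name are assumptions; the $set aggregation stage requires MongoDB 4.2+):
from pymongo import MongoClient

client = MongoClient()                     # assumed local mongod
coll = client["test"]["collection"]        # assumed database/collection names

pipeline = [
    {"$set": {
        "isIn": {
            "$gte": [
                {"$sum": [
                    {"$indexOfCP": ["$location", "$country"]},   # position, or -1 if not found
                    {"$indexOfCP": ["$country", "$location"]},
                ]},
                -1,
            ]
        }
    }}
]

for doc in coll.aggregate(pipeline):
    print(doc["_id"], doc["isIn"])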
I was following the book "Elasticsearch: The Definitive Guide". The book is outdated, so whenever something was not working I searched the internet and made it work with newer versions. But I can't find anything useful for Parent-Child Mapping and Indexing.
For example:
{
"mappings": {
"branch": {},
"employee": {
"_parent": {
"type": "branch"
}
}
}
}
How can I represent the following mapping in the new version of Elasticsearch?
And how can I index the following parent:
{ "name": "London Westminster", "city": "London", "country": "UK" }
and the following child:
PUT company/employee/1?parent=London
{
"name": "Alice Smith",
"dob": "1970-10-24",
"hobby": "hiking"
}
Also, I am using the Elasticsearch Python client, so examples using it would be appreciated.
The _parent field has been removed in favor of the join field.
The join data type is a special field that creates parent/child
relation within documents of the same index. The relations section
defines a set of possible relations within the documents, each
relation being a parent name and a child name.
Consider company as the parent and employee as its child
Index Mapping:
{
"mappings": {
"properties": {
"my_join_field": {
"type": "join",
"relations": {
"company": "employee"
}
}
}
}
}
Parent document in the company context
PUT /index-name/_doc/1
{
"name": "London Westminster",
"city": "London",
"country": "UK",
"my_join_field": {
"name": "company"
}
}
Child document
PUT /index-name/_doc/2?routing=1&refresh
{
"name": "Alice Smith",
"dob": "1970-10-24",
"hobby": "hiking",
"my_join_field": {
"name": "employee",
"parent": "1"
}
}
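Since the question asks for the Python client, here is a minimal sketch using the official elasticsearch package (index name, document ids, and the 7.x-style body keyword are assumptions; newer client versions use mappings=/document= keyword arguments instead of body):
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")   # assumed local cluster

# Create the index with the join mapping
es.indices.create(
    index="company",
    body={
        "mappings": {
            "properties": {
                "my_join_field": {
                    "type": "join",
                    "relations": {"company": "employee"}
                }
            }
        }
    },
)

# Parent document
es.index(
    index="company",
    id=1,
    body={
        "name": "London Westminster",
        "city": "London",
        "country": "UK",
        "my_join_field": {"name": "company"},
    },
)

# Child document: must be routed to the same shard as its parent
es.index(
    index="company",
    id=2,
    routing=1,
    body={
        "name": "Alice Smith",
        "dob": "1970-10-24",
        "hobby": "hiking",
        "my_join_field": {"name": "employee", "parent": "1"},
    },
)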
I am having difficulty updating a nested JSON structure in Mongo.
I am using pymongo along with Mongoengine-Rest-framework.
Since this particular JSON has a dynamic structure and is heavily nested, I chose to use pymongo over the mongoengine ORM.
The create, retrieve and delete operations are working fine.
I would like some suggestions on the update issue.
Let's consider a sample object which is already present in mongo:
st1 = {
"name": "Some_name",
"details": {
"address1": {
"house_no": "731",
"street": "Some_street",
"city": "some_city"
"state": "some_state"
}
}
}
If I try to update st1 by adding address2 to details, sending the JSON st2 in the update command with _id as the filter condition,
st2 = {
"details": {
"address2": {
"house_no": "5102",
"street": "Some_street",
"city": "some_city"
"state": "some_state"
}
}
}
I get the following object st3 as the result in mongo,
st3 = {
"name": "Some_name",
"details": {
"address2": {
"house_no": " 5102",
"street": "Some_street",
"city": "some_city"
"state": "some_state"
}
}
}
instead of the expected st4 object.
st4 = {
"name": "Some_name",
"details": {
"address1": {
"house_no": "731",
"street": "Some_street",
"city": "some_city"
"state": "some_state"
},
"address2": {
"house_no": "5102",
"street": "Some_street",
"city": "some_city"
"state": "some_state"
}
}
}
my update command is:
result = collection.update_one({'_id': id}, doc)
where
id: _id of document
doc: (here) st2
collection: pymongo collection object
The original JSON depth is 6 and the keys are dynamic. Updates will be needed at different depths.
First, change the object to update to this:
to_update = {
"house_no": "5102",
"street": "Some_street",
"city": "some_city",
"state": "some_state"
}
And then use it to update the specific part of the document you want:
collection.update_one({'_id': id}, {'$set': {"details.address2": to_update}})
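Since the original question says the keys are dynamic and updates may be needed at different depths, one option (a sketch; the helper name is hypothetical) is to flatten the incoming nested dict into dotted paths, so a single $set merges it at any depth without overwriting sibling keys:
def to_dotted_paths(doc, prefix=""):
    """Flatten a nested dict into {"a.b.c": value} pairs suitable for $set."""
    flat = {}
    for key, value in doc.items():
        path = "%s.%s" % (prefix, key) if prefix else key
        if isinstance(value, dict):
            flat.update(to_dotted_paths(value, path))
        else:
            flat[path] = value
    return flat

# st2 = {"details": {"address2": {...}}} becomes
# {"details.address2.house_no": "5102", "details.address2.street": "Some_street", ...}
collection.update_one({'_id': id}, {'$set': to_dotted_paths(st2)})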
Use this to add address2:
collection.update_one({'_id': ObjectId(doc_id)}, {'$set': {'details.%s' % 'address2': address2}}, upsert=True)
Check out the complete code:
import pymongo
from bson.objectid import ObjectId

data = {
    "name": "Some_name",
    "details": {
        "address1": {"house_no": "731", "street": "Some_street", "city": "some_city", "state": "some_state"}
    }
}
address2 = {"house_no": "731", "street": "Some_street", "city": "some_city", "state": "some_state"}

connect = pymongo.MongoClient('192.168.4.202', 20020)
database = connect['my_test']
collection = database['coll']

# # CREATE COLLECTION AND INSERT DATA
# _id = collection.insert_one(data).inserted_id
# print(_id)

doc_id = '57568aa11ec52522343ee695'
# Build the dotted key dynamically and $set only that sub-document; upsert creates the doc if missing
collection.update_one({'_id': ObjectId(doc_id)}, {'$set': {'details.%s' % 'address2': address2}}, upsert=True)
I have a nested JSON structure. I'm using objectpath (the Python API version), but I don't understand how to select and filter some information (more precisely, the nested information in the structure).
E.g. I want to select the "description" of the action "reading" for the user "John".
JSON:
{
"user":
{
"actions":
[
{
"name": "reading",
"description": "blablabla"
}
]
"name": "John"
}
}
CODE:
$.user[@.name is 'John' and @.actions.name is 'reading'].actions.description
but it doesn't work (it returns an empty set, although the data is present in my JSON).
Any suggestion?
Is this what you are trying to do?
import objectpath
data = {
"user": {
"actions": {
"name": "reading",
"description": "blablabla"
},
"name": "John"
}
}
tree = objectpath.Tree(data)
result = tree.execute("$.user[@.name is 'John'].actions[@.name is 'reading'].description")
for entry in result:
    print(entry)
Output
blablabla
I had to fix your JSON. Also, tree.execute returns a generator. You could replace the for loop with print(next(result)), but the for loop seemed clearer.
from objectpath import *
your_json = {"name": "felix", "last_name": "diaz"}
# This json path will bring all the key-values of your json
your_json_path='$.*'
my_key_values = Tree(your_json).execute(your_json_path)
# If you want to retrieve the name node...then specify it.
my_name= Tree(your_json).execute('$.name')
# If you want to retrieve the last_name node...then specify it.
last_name= Tree(your_json).execute('$.last_name')
I believe you're just missing a comma in JSON:
{
"user":
{
"actions": [
{
"name": "reading",
"description": "blablabla"
}
],
"name": "John"
}
}
Assuming there is only one "John", with only one "reading" activity, the following query works:
$.user[@.name is 'John'].actions[0][@.name is 'reading'][0].description
If there could be multiple "John"s, with multiple "reading" activities, the following query will almost work:
$.user.*[@.name is 'John'].actions..*[@.name is 'reading'].description
I say almost because the use of .. will be problematic if there are other nested dictionaries with "name" and "description" entries, such as
{
"user": {
"actions": [
{
"name": "reading",
"description": "blablabla",
"nested": {
"name": "reading",
"description": "broken"
}
}
],
"name": "John"
}
}
For a fully correct query, there is an open issue about properly implementing queries into arrays: https://github.com/adriank/ObjectPath/issues/60
I have some customer documents that I want to retrieve using Elasticsearch, based on where the customers come from (the country field is IN an array of countries).
[
{
"name": "A1",
"address": {
"street": "1 Downing Street"
"country": {
"code": "GB",
"name": "United Kingdom"
}
}
},
{
"name": "A2",
"address": {
"street": "25 Gormut Street"
"country": {
"code": "FR",
"name": "France"
}
}
},
{
"name": "A3",
"address": {
"street": "Bonjour Street"
"country": {
"code": "FR",
"name": "France"
}
}
}
]
Now, I have an array in my Python code:
["DE", "FR", "IT"]
I'd like to obtain the two documents, A2 and A3.
How would I write this in PyES/Query DSL? Am I supposed to be using an ExistsFilter or a TermQuery for this? ExistsFilter seems to only check whether the field exists or not, but doesn't care about the value.
In NoSQL-type document stores, all you get back is the document, not parts of the document.
Your requirement: "I'd like to obtain the two documents, A2 and A3." implies that you need to index each of those documents separately, not as an array inside another "parent" document.
If you need to match values of the parent document alongside country then you need to denormalize your data and store those values from the parent doc inside each sub-doc as well.
Once you've done the above, then the query is easy. I'm assuming that the country field is mapped as:
country: {
type: "string",
index: "not_analyzed"
}
To find docs with DE, you can do:
curl -XGET 'http://127.0.0.1:9200/_all/_search?pretty=1' -d '
{
"query" : {
"constant_score" : {
"filter" : {
"term" : {
"country" : "DE"
}
}
}
}
}
'
To find docs with either DE or FR:
curl -XGET 'http://127.0.0.1:9200/_all/_search?pretty=1' -d '
{
"query" : {
"constant_score" : {
"filter" : {
"terms" : {
"country" : [
"DE",
"FR"
]
}
}
}
}
}
'
To combine the above with some other query terms:
curl -XGET 'http://127.0.0.1:9200/_all/_search?pretty=1' -d '
{
"query" : {
"filtered" : {
"filter" : {
"terms" : {
"country" : [
"DE",
"FR"
]
}
},
"query" : {
"text" : {
"address.street" : "bonjour"
}
}
}
}
}
'
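PyES is largely unmaintained; with the official elasticsearch Python client the same thing would look roughly like the sketch below, using a bool filter (the modern replacement for the filtered query). The index name and the address.country.code field path are assumptions based on the sample documents, and the 7.x-style body keyword may differ on newer client versions:
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")    # assumed local cluster

countries = ["DE", "FR", "IT"]

response = es.search(
    index="customers",                         # assumed index name
    body={
        "query": {
            "bool": {
                "filter": [
                    {"terms": {"address.country.code": countries}}
                ]
            }
        }
    },
)

for hit in response["hits"]["hits"]:
    print(hit["_source"]["name"])              # expects A2 and A3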
Also see this answer for an explanation of how arrays of objects can be tricky, because of the way they are flattened:
Is it possible to sort nested documents in ElasticSearch?