Perform nested search using elasticsearch dsl - python

Hi, I want to perform a nested search using elasticsearch-dsl, where a document field contains nested JSON data and I want to pull specific nested key values out of it.
Below is the document:
{
  "_index" : "data",
  "_type" : "users",
  "_id" : "15",
  "_version" : 1,
  "found" : true,
  "_source" : {
    "data" : {
      "Gender" : "M",
      "Marks" : "80",
      "name" : "Mayank",
      "Address" : "India"
    },
    "last_updated" : "2017-04-09T01:54:33.764573"
  }
}
I only want the values of the fields listed in an array.
fields_want = ['name', 'Marks']
Output should be like -> {"name":"Mayank", "Marks":"80"}
The elasticsearch-dsl documentation is pretty hard for me to understand.
https://elasticsearch-dsl.readthedocs.io/en/latest/search_dsl.html#
DSL code:
from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search

client = Elasticsearch()
s = Search(using=client, index="data") \
    .query("match", _type="users") \
    .query("match", _id=15)
response = s.execute()
for hit in response:
    print(hit.data)
From this code I can get the whole JSON object under the data field.
Can somebody guide me here?

It was solved.
I used source filtering to get the nested output.
client = Elasticsearch()
s = Search(using=client, index="data") \
    .query("match", _type="users") \
    .query("match", _id=15) \
    .source(['data.Name', 'data.Marks'])
response = s.execute()
print(response)
Output -
{u'Name': u'Mayank', u'Marks': u'80'}
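For completeness, a minimal sketch (assuming the field names from the original document, i.e. 'name' and 'Marks') of turning a source-filtered hit into the requested dict:
fields_want = ['name', 'Marks']

s = Search(using=client, index="data") \
    .query("match", _type="users") \
    .query("match", _id=15) \
    .source(['data.' + f for f in fields_want])   # fetch only the wanted keys
response = s.execute()

for hit in response:
    data = hit.data.to_dict()   # AttrDict -> plain dict
    print({f: data[f] for f in fields_want if f in data})
    # e.g. {'name': 'Mayank', 'Marks': '80'}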

Related

MongoDB values to Dict in python

Basically I need to connect to MongoDB, read the document records, and put their values into a dict.
**MongoDB Values**
{ "_id" : "LAC1397", "code" : "MIS", "label" : "Marshall Islands", "mappingName" : "RESIDENTIAL_COUNTRY" }
{ "_id" : "LAC1852", "code" : "COP", "label" : "Colombian peso", "mappingName" : "FOREIGN_CURRENCY_CODE"}
How do I map it to a dict in the fashion below, in Python?
**syntax :**
dict = {"mappingName|Code" : "Value" }
**Example :**
dict = { "RESIDENTIAL_COUNTRY|MIS" : "Marshall Islands" , "FOREIGN_CURRENCY_CODE|COP" : "Colombian peso" , "COMM_LANG|ENG" : "English" }
**Python Code**
from pymongo import MongoClient
client = MongoClient('localhost', 27017)
db = client.mongo
collection = db.masters
for post in collection.find():
I got stuck after this; I'm not sure how to put the values into a dict in the format mentioned above.
post will be a dict with the values from Mongo, so you can loop over the records and add entries to a new dictionary. As the comments mention, any duplicate keys will be overridden by the last value found. If this might be an issue, consider a sort() on the find() call.
Sample code:
from pymongo import MongoClient
db = MongoClient()['mydatabase']
db.mycollection.insert_one({ "_id" : "LAC1397", "code" : "MIS", "label" : "Marshall Islands", "mappingName" : "RESIDENTIAL_COUNTRY" })
db.mycollection.insert_one({ "_id" : "LAC1852", "code" : "COP", "label" : "Colombian peso", "mappingName" : "FOREIGN_CURRENCY_CODE"})
mydict = {}
for post in db.mycollection.find():
    k = f"{post.get('mappingName')}|{post.get('code')}"
    mydict[k] = post.get('label')
print(mydict)
Gives:
{'RESIDENTIAL_COUNTRY|MIS': 'Marshall Islands', 'FOREIGN_CURRENCY_CODE|COP': 'Colombian peso'}
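If the "last one wins" behaviour matters for duplicates, a sort() on the cursor controls which record is seen last. A sketch only, assuming a hypothetical last_updated field to sort on:
mydict = {}
for post in db.mycollection.find().sort('last_updated', 1):   # oldest first, newest wins
    k = f"{post.get('mappingName')}|{post.get('code')}"
    mydict[k] = post.get('label')
print(mydict)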

indexing synonyms in ElasticSearch Python

Problem description
I want to run a query string like this, for example:
{"query": {
"query_string" : {
"fields" : ["description"],
"query" : "illegal~"
}
}
}
I have a separate synonyms.txt file that contains the synonyms:
illegal, banned, criminal, illegitimate, illicit, irregular, outlawed, prohibited
otherWord, synonym1, synonym2...
I want to find all elements having any one of these synonyms.
What I tried
First I want to index those synonyms in my ES database.
I tried to run this request with curl:
curl -X PUT "https://instanceAdress.europe-west1.gcp.cloud.es.io:9243/app/kibana#/dev_tools/console/sources" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "index" : {
      "analysis" : {
        "analyzer" : {
          "synonym" : {
            "tokenizer" : "whitespace",
            "filter" : ["synonym"]
          }
        },
        "filter" : {
          "synonym" : {
            "type" : "synonym",
            "synonyms_path" : "synonyms.txt"
          }
        }
      }
    }
  }
}
'
but it doesn't work: {"statusCode":404,"error":"Not Found"}
I then need to change my query so that it takes into account the synonyms but I have no idea how.
So my questions are:
How can I index my synonyms?
How can I change my query so that it searches for all the synonyms?
Is there any way to index them in Python?
Example of a GET query using the Python Elasticsearch client:
from elasticsearch import Elasticsearch

es = Elasticsearch(
    ['fullAdress.europe-west1.gcp.cloud.es.io'],
    http_auth=('login', 'password'),
    scheme="https",
    port=9243,
)
es.get(index="sources", doc_type='rcp', id="301495")
You can index using synonyms with Python as follows.
First, create a token filter:
from elasticsearch_dsl import analyzer, token_filter

synonyms_token_filter = token_filter(
    'synonyms_token_filter',   # any name for the filter
    'synonym',                 # synonym filter type
    synonyms=your_synonyms     # synonyms mapping will be inlined
)
And then create an analyzer:
custom_analyzer = analyzer(
    'custom_analyzer',
    tokenizer='standard',
    filter=[
        'lowercase',
        synonyms_token_filter
    ]
)
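To actually attach this to an index from Python, here is a minimal sketch using elasticsearch-dsl's Index, reusing the custom_analyzer defined above (the index name 'sources' and the es client variable are assumptions, not from the original post):
from elasticsearch_dsl import Index

idx = Index('sources', using=es)   # hypothetical index name and client
idx.analyzer(custom_analyzer)      # adds the analyzer (and its synonym filter) to the index settings
idx.create()
The field you search (e.g. description) then needs to be mapped with this analyzer so that the query_string query picks up the synonyms.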
There's also a package for this: https://github.com/agora-team/elasticsearch-synonyms

db.collection.update $set is not working

I am using MongoDB v3.6.2 and using $set to update a field, and it just doesn't work; I am clueless as to why. Any pointers are appreciated.
from pymongo import MongoClient
from bson import ObjectId
import os, pymongo

dbuser = os.environ.get('user', '')
dbpass = os.environ.get('pwd', '')
uri = 'mongodb://{dbuser}:{dbpass}@machineip/data'.format(**locals())
client = MongoClient(uri)
db = client.data
collection = db['test']
print(db.version)
db.collection.update(
    { "_id" : ObjectId("5a95a1c32a2e2e0025e6d6e2") },
    { "$set":
        {
            "status": "submission"
        }
    }
)
Document:
{
  "_id" : ObjectId("5a95a1c32a2e2e0025e6d6e2"),
  "status" : "Submitting",
  "endRev" : "9531c3448d3f7713dc74c4b05d177ecf0c6e4df6",
  "chip" : "4364"
}
Your update isn't working because of the match portion of your query:
{ "_id": "5a95a1c32a2e2e0025e6d6e2" }
That is searching for a document with a string _id. You must cast to an ObjectId in order for it to find the matching document and perform the update.
{ "_id" : ObjectId("5a95a1c32a2e2e0025e6d6e2") }
Also be sure to include from bson import ObjectId.
Use update_many to update more than one document. If you want to update one document, use update_one; the old update method is deprecated.
from bson import ObjectId
db.collection.update_many(
    {"_id": ObjectId("5a95a1c32a2e2e0025e6d6e2")},
    {"$set": {"status": "submission"}}
)
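A hedged sketch of the single-document variant, reusing the collection handle from the question and checking the result:
from bson import ObjectId

result = collection.update_one(
    {"_id": ObjectId("5a95a1c32a2e2e0025e6d6e2")},
    {"$set": {"status": "submission"}}
)
print(result.matched_count, result.modified_count)   # expect 1 1 if the document was found and changed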
I hope it helps.

Extract values from oddly-nested Python

I must be really slow, because I spent a whole day googling and trying to write Python code that simply lists only the "code" values, so my output would be Service1, Service2, Service4. I have extracted JSON values from complex JSON/dict structures before, but now I must have hit a mental block.
This is my json structure.
myjson = '''
{
  "formatVersion" : "ABC",
  "publicationDate" : "2017-10-06",
  "offers" : {
    "Service1" : {
      "code" : "Service1",
      "version" : "1a1a1a1a",
      "index" : "1c1c1c1c1c1c1"
    },
    "Service2" : {
      "code" : "Service2",
      "version" : "2a2a2a2a2",
      "index" : "2c2c2c2c2c2"
    },
    "Service3" : {
      "code" : "Service4",
      "version" : "3a3a3a3a3a",
      "index" : "3c3c3c3c3c3"
    }
  }
}
'''
import json

# convert above string to json
somejson = json.loads(myjson)
print(somejson["offers"])   # I tried so many variations to no avail.
Or, if you want the "code" values:
>>> [s['code'] for s in somejson['offers'].values()]
['Service1', 'Service2', 'Service4']
somejson["offers"] is a dictionary. It seems you want to print its keys.
In Python 2:
print(somejson["offers"].keys())
In Python 3:
print([x for x in somejson["offers"].keys()])
In Python 3, keys() returns a 'view', not a list, so wrap it in a list comprehension (or list()) if you need an actual list.
This should probably do the trick, if you are not certain about the number of Services in the JSON.
import json
myjson = '''
{
  "formatVersion" : "ABC",
  "publicationDate" : "2017-10-06",
  "offers" : {
    "Service1" : {
      "code" : "Service1",
      "version" : "1a1a1a1a",
      "index" : "1c1c1c1c1c1c1"
    },
    "Service2" : {
      "code" : "Service2",
      "version" : "2a2a2a2a2",
      "index" : "2c2c2c2c2c2"
    },
    "Service3" : {
      "code" : "Service4",
      "version" : "3a3a3a3a3a",
      "index" : "3c3c3c3c3c3"
    }
  }
}
'''
# convert above string to json
somejson = json.loads(myjson)

# without knowing the Services:
offers = somejson["offers"]
keys = offers.keys()
for service in keys:
    print(somejson["offers"][service]["code"])

How to use "suggest" in elasticsearch pyes?

How to use the "suggest" feature in pyes? Cannot seem to figure it out due to poor documentation. Could someone provide a working example? None of what I tried appears to work. In the docs its listed under query, but using:
query = Suggest(fields="fieldname")
connectionobject.search(query=query)
Since version 5:
_suggest endpoint has been deprecated in favour of using suggest via _search endpoint. In 5.0, the _search endpoint has been optimized for suggest only search requests.
(from https://www.elastic.co/guide/en/elasticsearch/reference/5.5/search-suggesters.html)
A better way to do this is to use the search API with the suggest option:
from elasticsearch import Elasticsearch

es = Elasticsearch()
text = 'ra'
suggest_dictionary = {
    "my-entity-suggest" : {
        'text' : text,
        "completion" : {
            "field" : "suggest"
        }
    }
}
query_dictionary = {'suggest' : suggest_dictionary}
res = es.search(
    index='auto_sugg',
    doc_type='entity',
    body=query_dictionary)
print(res)
Make sure you have indexed each document with a suggest field:
sample_entity = {
    'id' : 'test123',
    'name' : 'Ramtin Seraj',
    'title' : 'XYZ',
    "suggest" : {
        "input" : [ 'Ramtin', 'Seraj', 'XYZ' ],
        "output" : "Ramtin Seraj",
        "weight" : 34   # a prior weight
    }
}
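For the completion suggester to work, the suggest field has to be mapped with type completion before documents like sample_entity are indexed. A rough sketch (index and type names taken from the example above; mapping details depend on your Elasticsearch version):
es.indices.create(index='auto_sugg', body={
    'mappings': {
        'entity': {
            'properties': {
                'suggest': {'type': 'completion'}
            }
        }
    }
})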
Here is my code, which runs perfectly:
from elasticsearch import Elasticsearch
es = Elasticsearch()
text = 'ra'
suggDoc = {
    "entity-suggest" : {
        'text' : text,
        "completion" : {
            "field" : "suggest"
        }
    }
}
res = es.suggest(body=suggDoc, index="auto_sugg", params=None)
print(res)
I used the same client mentioned on the Elasticsearch site.
I indexed the data in the Elasticsearch index using the completion suggester.
