simple Elasticsearch nested search query - python

I have documents in ES (Tweepy JSON) like this:
{
"_source": {
"id": 792477813014224900,
"metadata": {
"iso_language_code": "en",
"result_type": "recent"
},
"retweeted": false,
"retweet_count": 330,
"user": {
"id": 149250899,
"listed_count": 0,
"protected": false,
"followers_count": 347,
"entities": {
"description": {
"urls": []
}
},
"screen_name": "Zwido_"
}
I would like to search for and retrieve one full document based on the user_name field.
I tried this query:
{
"nested": {
"path": "_source",
"score_mode": "avg",
"query": {
"bool": {
"must": [
{
"text": {"_source.user.user_name": user}
}
]
}
}
}
}
But it doesn't work, and I received this error:
TransportError(400, 'search_phase_execution_exception', 'failed to parse search source. unknown search element [nested]
What am I doing wrong?
Thanks for your help.

You don't need to specify the _source field, and you're missing the top-level query element. Note also that the field in your sample document is user.screen_name, not user_name. Do it like this instead:
{
"query": {
"nested": {
"path": "user",
"score_mode": "avg",
"query": {
"bool": {
"must": [
{
"match": {"user.screen_name": user}
}
]
}
}
}
}
}
UPDATE
If your user field is not of nested type, then you can simply do it like this:
{
"query": {
"bool": {
"must": [
{
"match": {
"user.screen_name": user
}
}
]
}
}
}
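Since the question is about Python, both variants above can be built as plain dicts before being handed to the client. A minimal sketch, assuming the field is user.screen_name as in the sample document; the function name build_user_query and its nested flag are hypothetical, not part of any library:

```python
def build_user_query(screen_name, nested=False):
    """Return a search body matching documents by user.screen_name.

    nested=True assumes `user` is mapped with type "nested";
    otherwise a plain bool/match query on the object field is built.
    """
    match_clause = {"match": {"user.screen_name": screen_name}}
    inner = {"bool": {"must": [match_clause]}}
    if nested:
        return {
            "query": {
                "nested": {
                    "path": "user",
                    "score_mode": "avg",
                    "query": inner,
                }
            }
        }
    return {"query": inner}


body = build_user_query("Zwido_")
```

The resulting dict would then be passed as the search body to the client's search call; the index name is left out here because the question does not give it.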

As mentioned in the Elasticsearch documentation, you should change the mapping of your data to tell Elasticsearch that user is a nested object. Once that is done, you can query the object with a nested query.
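To make the mapping step concrete, here is a rough sketch of a mapping body that declares user as nested. It assumes the typeless mapping layout of Elasticsearch 7 and later (older versions put a document type between mappings and properties), and the variable name mapping_body is only illustrative. Just two sub-fields from the sample document are listed:

```python
import json

# Assumption: Elasticsearch 7+ style mapping without a document type.
mapping_body = {
    "mappings": {
        "properties": {
            "user": {
                "type": "nested",  # lets the nested query address `user`
                "properties": {
                    "screen_name": {"type": "text"},
                    "followers_count": {"type": "integer"},
                },
            }
        }
    }
}

print(json.dumps(mapping_body, indent=2))
```

The mapping has to be in place when the index is created (or the data reindexed), since the type of an existing field cannot be changed in place.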

Related

Match query returns nothing

I want to search text inside fields.
I tried to fix my problem by following the documentation.
One of my indices contains items with the following structure:
{
"url": "https://exampleurl.com",
"username": "some_username"
}
Here are my queries:
"query": {
"multi_match": {
"query": keyword,
"type": "phrase",
"fields": [ "username", "url" ]
}
}
I also tried bool queries:
"query": {
"bool": {
"must": {
"multi_match": {
"query": keyword,
"type": "phrase",
"fields": [ "username", "url" ]
}
}
}
}
"query": {
"bool": {
"must": [{
"match": {
"username": keyword
}
}, {
"match": {
"url": keyword
}
}]
}
}
But the result is an empty array.
Please try the query below.
Create Index
PUT test
{
"settings" : {
"number_of_shards" : 1
},
"mappings" : {
"properties" : {
"url" : { "type" : "text" },
"username" : { "type" : "text" }
}
}
}
Insert Document
PUT test/_doc/1
{
"url" : "https://exampleurl.com",
"username" : "Arjun Das"
}
Search
GET test/_search
{
"query": {
"multi_match": {
"query": "http",
"type": "best_fields",
"fields": [ "username", "url" ],
"fuzziness":"2"
}
}
}
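The question's snippets use a Python variable named keyword, so the search body from the answer can be assembled in Python as well. A small sketch; build_multi_match is a hypothetical helper, and fuzziness is left out unless asked for, since as far as I know multi_match does not accept fuzziness together with the phrase types:

```python
def build_multi_match(keyword, fields, match_type="best_fields", fuzziness=None):
    """Build a multi_match search body over the given fields."""
    multi_match = {
        "query": keyword,
        "type": match_type,
        "fields": list(fields),
    }
    # Only add fuzziness when requested (assumption: the caller does not
    # combine it with match_type="phrase").
    if fuzziness is not None:
        multi_match["fuzziness"] = fuzziness
    return {"query": {"multi_match": multi_match}}


body = build_multi_match("http", ["username", "url"], fuzziness="2")
```

Here body reproduces the search request from the answer, and calling the helper with match_type="phrase" and no fuzziness reproduces the first query from the question.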

ElasticSearch error: [function_score] malformed query, expected [END_OBJECT] but found [FIELD_NAME]

The following JSON structure gives me an error when doing a query:
{
"query": {
"function_score": {
"query": {
"bool": {
"should": [
{
"multi_match": {
"query": "BRCA1",
"fuzziness": "AUTO",
"fields": [
"Long_Name",
"Short_Name",
"Uniprot_ID^10",
"Genes^2",
"Diseases^2",
"Function",
"Domains"
]
}
},
{
"term": {
"Is_Reviewed": true
}
},
{
"term": {
"Has_Function": true
}
}
]
}
}
},
"field_value_factor": {
"field": "Number_Of_Structures"
}
},
"size": 100
}
The error is:
[function_score] malformed query, expected [END_OBJECT] but found [FIELD_NAME]
The bool query on its own works perfectly, but as soon as I use function_score, it stops working. I have tried to follow this example: https://www.elastic.co/guide/en/elasticsearch/guide/master/boosting-by-popularity.html
Any ideas as to what I am doing wrong would be much appreciated!
You must move field_value_factor inside function_score, next to the inner query (in your request it sits outside function_score, directly under the top-level query):
{
"query": {
"function_score": {
"query": {
"bool": {
"should": [
{
"multi_match": {
"query": "BRCA1",
"fuzziness": "AUTO",
"fields": [
"Long_Name",
"Short_Name",
"Uniprot_ID^10",
"Genes^2",
"Diseases^2",
"Function",
"Domains"
]
}
},
{
"term": {
"Is_Reviewed": true
}
},
{
"term": {
"Has_Function": true
}
}
]
}
},
"field_value_factor": {
"field": "Number_Of_Structures"
}
}
},
"size": 100
}
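The structural point of the fix is that query and field_value_factor are siblings inside function_score. A minimal sketch of a helper that always produces that layout; the name with_field_value_factor is hypothetical, and the inner query is shortened compared to the one in the question:

```python
def with_field_value_factor(inner_query, field, size=100):
    """Wrap inner_query in function_score, boosting by a numeric field."""
    return {
        "query": {
            "function_score": {
                # `query` and `field_value_factor` are siblings here
                "query": inner_query,
                "field_value_factor": {"field": field},
            }
        },
        "size": size,
    }


inner = {
    "bool": {
        "should": [
            {"multi_match": {"query": "BRCA1", "fuzziness": "AUTO",
                             "fields": ["Long_Name", "Uniprot_ID^10"]}},
            {"term": {"Is_Reviewed": True}},
        ]
    }
}
body = with_field_value_factor(inner, "Number_Of_Structures")
```

Building the wrapper in one place makes it harder to misplace a closing brace the way the original request did.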

OR in an Elasticsearch filter

I want to query my Elasticsearch index (using a Python library) and filter out some of the documents. Since I don't need a score, I'm using only the filter and must_not clauses:
{
"_source": ["entities"],
"query": {
"bool": {
"must_not": [
{"exists": {"field": "retweeted_status"}}
],
"filter": [
{"match": {"entities.urls.display_url": "blabla.com"}},
{"match": {"entities.urls.display_url": "blibli.com"}}]
}
}
}
This is the query I wrote, but the problem is that clauses in the same filter are apparently combined with an AND operation. I would like it to be an OR. How can I change my query to get all the documents that contain "blibli.com" OR "blabla.com"?
You can nest a bool query inside another bool query, so you can write the query like this:
{
"query": {
"bool": {
"must_not": [
{
"exists": {
"field": "retweeted_status"
}
}
],
"filter": [
{
"bool": {
"should": [
{
"match": {
"entities.urls.display_url": "blabla.com"
}
},
{
"match": {
"entities.urls.display_url": "blibli.com"
}
}
]
}
}
]
}
}
}
Tested on ES 5.3. You can use the Explain API to check whether this also works in your version of Elasticsearch.
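As the question mentions a Python library, the nested bool can be generated from a list of URLs. A sketch under a few assumptions: build_url_filter is a hypothetical helper, and the explicit minimum_should_match is an addition that is not in the answer above, included only to spell out that at least one should clause has to match:

```python
def build_url_filter(display_urls):
    """Match documents whose display_url matches any of display_urls,
    excluding retweets, without scoring."""
    should = [
        {"match": {"entities.urls.display_url": url}} for url in display_urls
    ]
    return {
        "_source": ["entities"],
        "query": {
            "bool": {
                "must_not": [{"exists": {"field": "retweeted_status"}}],
                "filter": [
                    {
                        "bool": {
                            "should": should,
                            # explicit, so that at least one clause must match
                            "minimum_should_match": 1,
                        }
                    }
                ],
            }
        },
    }


body = build_url_filter(["blabla.com", "blibli.com"])
```

Adding a third URL is then a matter of extending the list rather than editing the JSON by hand.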

How to generate queries and skip parts of queries in Elasticsearch?

I am using Python to query Elasticsearch with a custom query. Let's look at a very simple example that will search for a given term in the field 'name' and another one in the 'surname' field of the document:
from elasticsearch import Elasticsearch
import json

# read query from external JSON
with open('query.json') as data_file:
    read_query = json.load(data_file)

# search with Elasticsearch and show hits
es = Elasticsearch()
# set query through body parameter
res = es.search(index="test", doc_type="articles", body=read_query)
print("%d documents found" % res['hits']['total'])
for doc in res['hits']['hits']:
    print("%s) %s" % (doc['_id'], doc['_source']['content']))
'query.json'
{
"query": {
"bool": {
"should": [
{
"match": {
"name": {
"query": "Star",
"boost": 2
}
}
},
{
"match": {
"surname": "Fox"
}
}
]
}
}
}
Now, I expect the user to input the search words: the first word typed in is used for the field 'name' and the second one for 'surname'. Let's imagine I replace {$name} and {$surname} with the two words typed in by the user, using Python:
'query.json'
{
"query": {
"bool": {
"should": [
{
"match": {
"name": {
"query": "{$name}",
"boost": 2
}
}
},
{
"match": {
"surname": "{$surname}"
}
}
]
}
}
}
Now the problem arises when the user doesn't input the surname but only the name, so I end up with the following query:
'query.json'
{
"query": {
"bool": {
"should": [
{
"match": {
"name": {
"query": "Star",
"boost": 2
}
}
},
{
"match": {
"surname": ""
}
}
]
}
}
}
The field "surname" is now empty and elasticsearch will look for hits where "surname" is an empty string, which is not what I want. I want to ignore the surname field if the input term is empty. Is there any mechanism in elasticsearch to set a part of query to be ignored if the given term is empty?
{
"query": {
"bool": {
"should": [
{
"match": {
"name": {
"query": "Star",
"boost": 2
}
}
},
{
"match": {
"surname": "",
"ignore_if_empty" <--- this would be really cool
}
}
]
}
}
}
Maybe there is some other way of generating queries? I can't seem to find anything about query generation in Elasticsearch. How do you do it? Any input is welcome!
The Python DSL seems to be the proper way of doing it: https://github.com/elastic/elasticsearch-dsl-py/
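Short of pulling in a dedicated library, one option is to build the query dict in Python and append only the clauses whose input is non-empty. A minimal sketch with plain dicts; build_name_query is a hypothetical helper, and falling back to match_all when nothing was typed is an assumption about the desired behaviour:

```python
def build_name_query(name="", surname=""):
    """Build a bool/should query, skipping clauses whose term is empty."""
    should = []
    if name and name.strip():
        should.append({"match": {"name": {"query": name.strip(), "boost": 2}}})
    if surname and surname.strip():
        should.append({"match": {"surname": surname.strip()}})
    if not should:
        # Assumption: with no input at all, fall back to match_all.
        return {"query": {"match_all": {}}}
    return {"query": {"bool": {"should": should}}}


# First word goes to 'name', second (if any) to 'surname'.
words = "Star".split()
body = build_name_query(*words[:2])
```

With only "Star" typed in, body contains a single match clause on name and no clause for surname at all, so the empty string never reaches Elasticsearch.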

Elasticsearch match multiple fields

I have recently started using Elasticsearch in a website. The scenario is that I have to search for a string in a field. So, if the field is named title, then my search query was:
"query" :{"match": {"title": my_query_string}}
But now I need to add another field, let's say category. So I need to find the matches that are in category: some_category and that have title: my_query_string. I tried multi_match, but it does not give me the result I am looking for. I am looking into query filters now. But is there a way of adding two fields with such criteria to my match query?
GET indice/_search
{
"query": {
"bool": {
"should": [
{
"match": {
"title": "title"
}
},
{
"match": {
"category": "category"
}
}
]
}
}
}
Replace should with must if desired.
OK, so I think that what you need is something like this:
"query": {
"filtered": {
"query": {
"match": {
"title": YOUR_QUERY_STRING
}
},
"filter": {
"term": {
"category": YOUR_CATEGORY
}
}
}
}
If your category field is analyzed, then you will need to use match instead of term in the filter.
"query": {
"filtered": {
"query": {
"bool": {
"should": [
{"match": {"title": "bold title"}},
{"match": {"body": "nice body"}}
]
}
},
"filter": {
"term": {
"category": "xxx"
}
}
}
}
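Note that the filtered query used in the snippets above was deprecated in Elasticsearch 2.0 and removed in 5.0; on newer versions the same intent is expressed with a bool query whose filter clause holds the category condition. A sketch, with build_title_category_query as a hypothetical helper name and the category field assumed to be not analyzed, as the term filter requires:

```python
def build_title_category_query(query_string, category):
    """Score on title, restrict to one category without affecting the score."""
    return {
        "query": {
            "bool": {
                # scored full-text part
                "must": [{"match": {"title": query_string}}],
                # unscored exact-match restriction
                "filter": [{"term": {"category": category}}],
            }
        }
    }


body = build_title_category_query("bold title", "xxx")
```

If category is an analyzed text field, the term clause in the filter would be swapped for a match clause, as the answer above already points out.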
