I have an elasticsearch query which returns the top 10 results for a given querystring. I now need to use the response to create a sum aggregation for each of the 10 top results. This is my query to return the top 10:
GET my_index/_search
{
"query": {
"match": {
"name": {
"query": "hello world",
"fuzziness": 2
}
}
}
}
With the response from the above request, I build a list of the 10 org_ids and iterate over them. For each ID I have to make another request using the query below (where "org_id": "12345" is the first element in my array of IDs).
POST my_index/_search
{ "size": 0,
"query": {
"bool": {
"must": [
{
"match": {
"org_id": "12345"
}
}
]
}
},
"aggs": {
"aggregation_1": {
"sum": {
"field": "dollar_amount"
}
},
"aggregation_2": {
"sum": {
"field": "employees"
}
}
}
}
However, I think this approach is inefficient: it takes 11 requests in total, which won't scale well. Ideally, I would like to make one request that does all of this.
Is there any functionality in ES that would make this possible, or would I have to make individual requests for each search parameter? I've looked through the docs and can't find anything that involves iterating over the array of results.
EDIT: For simplicity, I think having 2 requests is fine for now. So I just need to figure out how to pass an array of org_ids into the 2nd query and do all aggregations in that 2nd query.
E.g.
POST my_index/_search
{ "size": 0,
"query": {
"bool": {
"must": [
{
"match": {
"org_id": ["12345", "67891", "98765"]
}
}
]
}
},
"aggs": {
"aggregation_1": {
"sum": {
"field": "dollar_amount"
}
},
"aggregation_2": {
"sum": {
"field": "employees"
}
}
}
}
To start, you can do all the aggregations in one step (so 2 requests in total).
I am looking into the fuzziness part, but I don't see how to do it in a single query.
Edit: are your org_ids unique (i.e. the same as the document IDs)? Can you describe your data (how org_id relates to the fuzzy query)?
{ "size": 0,
"query": {
"bool": {
"must": [
{
"match": {
"org_id": "12 13 14 15 16 17 18...."
}
}
]
}
},
"aggs": {
"group_org_id": {
"terms": {
"field": "org_id"
},
"aggs": {
"aggregation_1": {
"sum": {
"field": "dollar_amount"
}
},
"aggregation_2": {
"sum": {
"field": "employees"
}
}
}
}
}
}
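A minimal Python sketch of the two-request approach, assuming org_id is indexed as an exact-value field. It swaps the match clause for a terms query, which accepts a list of values, and nests the two sums under a terms aggregation so that each org_id gets its own totals. The helper names build_org_totals_body and totals_by_org and the sample response are made up for illustration; the body would be sent to my_index/_search.

```python
def build_org_totals_body(org_ids):
    """Build one search body that sums dollar_amount and employees per org_id."""
    ids = list(org_ids)
    return {
        "size": 0,  # only the aggregations are needed, not the hits
        "query": {
            "bool": {
                # terms matches documents whose org_id equals any listed value
                "filter": [{"terms": {"org_id": ids}}]
            }
        },
        "aggs": {
            "group_org_id": {
                # ask for one bucket per requested ID
                "terms": {"field": "org_id", "size": max(len(ids), 1)},
                "aggs": {
                    "aggregation_1": {"sum": {"field": "dollar_amount"}},
                    "aggregation_2": {"sum": {"field": "employees"}},
                },
            }
        },
    }


def totals_by_org(response):
    """Map each org_id bucket to its (dollar_amount, employees) sums."""
    buckets = response["aggregations"]["group_org_id"]["buckets"]
    return {
        b["key"]: (b["aggregation_1"]["value"], b["aggregation_2"]["value"])
        for b in buckets
    }


# Made-up response fragment in the shape a terms aggregation with sub-sums returns.
sample_response = {
    "aggregations": {
        "group_org_id": {
            "buckets": [
                {"key": "12345", "doc_count": 3,
                 "aggregation_1": {"value": 1500.0},
                 "aggregation_2": {"value": 40.0}},
                {"key": "67891", "doc_count": 1,
                 "aggregation_1": {"value": 250.0},
                 "aggregation_2": {"value": 7.0}},
            ]
        }
    }
}

print(totals_by_org(sample_response))
```

The list of IDs would come from the hits of the first (fuzzy) request, so the whole flow stays at two requests regardless of how many results are kept.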
I have an Elasticsearch index with documents like the one below:
{
"_index": "test",
"_type": "abc",
"_source": {
"file_name": "xyz.ex",
"metadata": {
"format": ".ex",
"profile": [
{
"date_value": "2018-05-30T00:00:00",
"key_id": "1",
"type": "date",
"value": ["30-05-2018"]
},
{
"key_id": "2",
"type": "freetext",
"value": ["New york"]
}
]
}
}
}
Now I need to search for documents by matching a key_id to its value. (key_id identifies a field whose value is stored in "value".)
E.g. for the field with key_id = "1", if its value is "30-05-2018", the query should match the above document.
I tried mapping this as a nested object, but I am not able to write a query that matches 2 or more key_ids against their respective values.
This is how I would do it. You need to AND together, via bool/filter (or bool/must), one nested query per condition pair, since you want to match two different nested elements of the same parent document.
{
"query": {
"bool": {
"filter": [
{
"nested": {
"path": "metadata.profile",
"query": {
"bool": {
"filter": [
{
"term": {
"metadata.profile.f1": "a"
}
},
{
"term": {
"metadata.profile.f2": true
}
}
]
}
}
}
},
{
"nested": {
"path": "metadata.profile",
"query": {
"bool": {
"filter": [
{
"term": {
"metadata.profile.f1": "b"
}
},
{
"term": {
"metadata.profile.f2": false
}
}
]
}
}
}
}
]
}
}
}
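The same pattern applied to the key_id/value pairs from the question could be sketched in Python as follows. This assumes metadata.profile is mapped as nested and that key_id and value are indexed as exact values (otherwise match would be needed instead of term). The function name nested_pair_query is hypothetical.

```python
def nested_pair_query(pairs, path="metadata.profile"):
    """Build a bool/filter with one nested query per (key_id, value) pair."""
    filters = []
    for key_id, value in pairs:
        filters.append({
            "nested": {
                "path": path,
                "query": {
                    "bool": {
                        # both conditions must hold on the SAME nested element
                        "filter": [
                            {"term": {path + ".key_id": key_id}},
                            {"term": {path + ".value": value}},
                        ]
                    }
                },
            }
        })
    # every pair must be satisfied by some nested element of the parent
    return {"query": {"bool": {"filter": filters}}}


body = nested_pair_query([("1", "30-05-2018"), ("2", "New york")])
print(len(body["query"]["bool"]["filter"]))
```

Keeping one nested clause per pair matters: putting all four term filters into a single nested query would require one profile entry to have both key_ids at once, which can never match.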
I'd like to "translate" a string like:
A AND (C OR B) AND NOT D
into an Elasticsearch query like:
{
"query": {
"bool": {
"must": {
"term": {
"text": "A"
}
},
"must_not": {
"term": {
"text": "D"
}
},
"should": [
{
"term": {
"text": "B"
}
},
{
"term": {
"text": "C"
}
}
],
"minimum_should_match": 1,
"boost": 1
}
}
}
Does a library exist that I can use?
Any help appreciated.
Thanks!
ok according to:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html
I can do a query like:
{
"query": {
"query_string" : {
"default_field" : "text",
"query" : "this AND (submitted OR flowers) AND NOT blight"
}
}
}
which works great.
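For reference, a small Python sketch that wraps the original expression in a query_string body. The helper name query_string_body is made up; the only assumption is that the boolean expression already uses the AND / OR / NOT operators and parentheses that query_string parses itself, so no translation step is needed.

```python
import json


def query_string_body(expression, field="text"):
    """Wrap a boolean expression string in a query_string search body."""
    return {
        "query": {
            "query_string": {
                "default_field": field,  # field used for terms without a prefix
                "query": expression,
            }
        }
    }


body = query_string_body("A AND (C OR B) AND NOT D")
print(json.dumps(body, indent=2))
```

One difference from the hand-written bool/term version: query_string runs the terms through the field's analyzer, while term looks up the exact token, so results can differ on analyzed fields.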
I need to aggregate documents with the following mapping on multiple levels, grouping by one field and then by another.
Mapping:
{
'predictions': {
'properties': {
'Company':{'type':'string'},
'TxnsId':{'type':'string'},
'Emp':{'type':'string'},
'Amount':{'type':'float'},
'Cash/online':{'type':'string'},
'items':{'type':'float'},
'timestamp':{'type':'date'}
}
}
}
My requirement is a bit complex. I need to:
For each Emp (getting the distinct employees)
Check whether it is an online or a cash transaction
Group by items in ranges like 0-10, 11-20, 21-30, ...
Sum the Amount
The final output should look like:
>Emp-online-range-Amount
>a-online-(0-10)-1240$
>a-online-(21-30)-3543$
>b-online-(0-10)-2345$
>b-online-(11-20)-3456$
Something like this should do the job:
{
"size": 0,
"aggs": {
"by_emp": {
"terms": {
"field": "Emp"
},
"aggs": {
"cash_online": {
"filters": {
"filters": {
"cashed": {
"term": {
"Cash/online": "cashed"
}
},
"online": {
"term": {
"Cash/online": "online"
}
}
}
},
"aggs": {
"ranges": {
"range": {
"field": "items",
"ranges": [
{
"from": 0,
"to": 11
},
{
"from": 11,
"to": 21
},
{
"from": 21,
"to": 31
}
]
},
"aggs": {
"total": {
"sum": {
"field": "Amount"
}
}
}
}
}
}
}
}
}
}
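To get from the nested response to the flat Emp-online-range-Amount lines, the buckets can be walked level by level. The sketch below assumes the aggregation names used above (by_emp, cash_online, ranges, total) and integer item counts; the function name flatten_emp_totals and the sample response are made up for illustration.

```python
def flatten_emp_totals(response):
    """Turn the by_emp > cash_online > ranges > total tree into flat rows."""
    rows = []
    for emp in response["aggregations"]["by_emp"]["buckets"]:
        # a filters aggregation with named filters returns a dict of buckets
        for kind, kind_bucket in emp["cash_online"]["buckets"].items():
            for rng in kind_bucket["ranges"]["buckets"]:
                if rng["doc_count"] == 0:
                    continue  # skip empty ranges
                # "to" is exclusive, so from=0 / to=11 covers items 0..10
                label = "({}-{})".format(int(rng["from"]), int(rng["to"]) - 1)
                amount = int(rng["total"]["value"])
                rows.append("{}-{}-{}-{}$".format(emp["key"], kind, label, amount))
    return rows


# Made-up response fragment in the shape this aggregation tree returns.
sample = {"aggregations": {"by_emp": {"buckets": [
    {"key": "a", "doc_count": 5, "cash_online": {"buckets": {
        "cashed": {"doc_count": 0, "ranges": {"buckets": [
            {"key": "0.0-11.0", "from": 0.0, "to": 11.0,
             "doc_count": 0, "total": {"value": 0.0}},
        ]}},
        "online": {"doc_count": 5, "ranges": {"buckets": [
            {"key": "0.0-11.0", "from": 0.0, "to": 11.0,
             "doc_count": 3, "total": {"value": 1240.0}},
            {"key": "11.0-21.0", "from": 11.0, "to": 21.0,
             "doc_count": 0, "total": {"value": 0.0}},
            {"key": "21.0-31.0", "from": 21.0, "to": 31.0,
             "doc_count": 2, "total": {"value": 3543.0}},
        ]}},
    }}},
]}}}

for row in flatten_emp_totals(sample):
    print(row)
```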
I recently started using Elasticsearch in a website. The scenario is that I have to search for a string in a field. So, if the field is named title, my search query was:
"query": {"match": {"title": my_query_string}}
But now I need to add another field to it, say category. So I need to find the documents that are in category: some_category and whose title matches my_query_string. I tried multi_match, but it does not give me the result I am looking for. I am looking into filters now. Is there a way to add two fields with such criteria to my match query?
GET indice/_search
{
"query": {
"bool": {
"should": [
{
"match": {
"title": "title"
}
},
{
"match": {
"category": "category"
}
}
]
}
}
}
Replace should with must if both clauses are required to match (should behaves like OR here, must like AND).
OK, so I think that what you need is something like this:
"query": {
"filtered": {
"query": {
"match": {
"title": YOUR_QUERY_STRING
}
},
"filter": {
"term": {
"category": YOUR_CATEGORY
}
}
}
}
If your category field is analyzed, then you will need to use match instead of term in the filter.
"query": {
"filtered": {
"query": {
"bool": {
"should": [
{"match": {"title": "bold title"}},
{"match": {"body": "nice body"}}
]
}
},
"filter": {
"term": {
"category": "xxx"
}
}
}
}
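The filtered query was deprecated in Elasticsearch 2.0 and removed in 5.0; on those versions the same intent is usually expressed with a bool query that has a must clause for the scored match and a filter clause for the category. A minimal Python sketch, with the hypothetical helper name title_in_category and the same assumption as above that category is not analyzed:

```python
def title_in_category(query_string, category):
    """Scored match on title, restricted to one category via a filter."""
    return {
        "query": {
            "bool": {
                # must: contributes to the relevance score
                "must": [{"match": {"title": query_string}}],
                # filter: yes/no restriction, no effect on the score
                "filter": [{"term": {"category": category}}],
            }
        }
    }


body = title_in_category("bold title", "xxx")
print(body["query"]["bool"]["filter"])
```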
I was running a search in Elasticsearch using this code:
es.search(index="article-index", fields="url", body={
"query": {
"query_string": {
"query": "keywordstr",
"fields": [
"text",
"title",
"tags",
"domain"
]
}
}
})
Now I want to add another parameter to the search scoring: "recencyboost".
I was told function_score should solve the problem:
res = es.search(index="article-index", fields="url", body={
"query": {
"function_score": {
"functions": {
"DECAY_FUNCTION": {
"recencyboost": {
"origin": "0",
"scale": "20"
}
}
},
"query": {
{
"query_string": {
"query": keywordstr
}
}
},
"score_mode": "multiply"
}
}
})
It gives me an error that the dictionary {"query_string": {"query": keywordstr}} is not hashable.
1) How can I fix the error?
2) How can I change the decay function so that it gives higher weight to a higher recency boost?
You appear to have an extra query in your search (giving a total of three), which is giving you an unwanted top-level. You need to remove the top-level query and replace it with function_score as the top level key.
res = es.search(index="article-index", fields="url", body={"function_score": {
"query": {"query_string": {"query": keywordstr}},
"functions": {
"DECAY_FUNCTION": {
"recencyboost": {
"origin": "0",
"scale": "20"
}
}
},
"score_mode": "multiply"
})
Note: score_mode defaults to "multiply", as does the unused boost_mode, so it should be unnecessary to supply it.
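You can't put a dictionary inside a set (or use it as a dictionary key), because dictionaries are not hashable. The extra pair of braces in the following segment makes Python build a set that contains a dictionary: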
"query": {
{"query_string": {"query": keywordstr}}
},
The following should work fine:
"query": {
"query_string": {"query": keywordstr}
},
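The error can be reproduced without Elasticsearch at all, which makes the cause easier to see. The sketch below first triggers the TypeError, then builds a well-formed body. For the second question it swaps in field_value_factor instead of a decay function, on the assumption that recencyboost is a numeric field where larger means more recent; decay functions score highest at the origin, so they would favour low values with origin 0. The variable names and the chosen modifier are illustrative only.

```python
keywordstr = "example keywords"  # placeholder search text

# The doubled braces create a set literal whose element is a dict.
try:
    broken = {"query": {{"query_string": {"query": keywordstr}}}}
except TypeError as exc:
    error_message = str(exc)  # mentions that dict is unhashable

# Corrected body: "query" maps directly to the query_string clause, and
# function_score sits under the top-level "query" key.
body = {
    "query": {
        "function_score": {
            "query": {
                "query_string": {
                    "query": keywordstr,
                    "fields": ["text", "title", "tags", "domain"],
                }
            },
            "functions": [
                # Assumption: score grows with the recencyboost field value.
                {"field_value_factor": {"field": "recencyboost",
                                        "modifier": "log1p"}}
            ],
            # add the function result to the text relevance score
            "boost_mode": "sum",
        }
    }
}

print(error_message)
```

The body dict can then be passed as the body argument of es.search in place of the original one.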
Use it like this:
query: {
function_score: {
query: {
filtered: {
query: {
bool: {
must: [
{
query_string: {
query: shop_search,
fields: [ 'shop_name']
},
boost: 2.0
},
{
query_string: {
query: shop_search,
fields: [ 'shop_name']
},
boost: 3.0
}
]
}
},
filter: {
// { term: { search_city: }}
}
},
exp: {
location: {
origin: { lat: 12.8748964,
lon: 77.6413239
},
scale: "10000m",
offset: "0m",
decay: "0.5"
}
}
// score_mode: "sum"
}