How can I search for multiple values in Elasticsearch with Python? - python

body = {
    "query": {
        "bool": {
            "must": {
                "terms": {
                    "reason": ["A", "B"]
                }
            }
        }
    }
}
The 'reason' field is in _source.
I want to find documents whose reason is A or B in the index 'test_index' from Python.
But this query finds nothing; the result is empty.
When I use "/_search?q=reason:A|Bsize=50&from=5000", the result is correct.
I want to get the same result in Python.
How can I do that?

Try this.
test_index/_search
{
    "query": {
        "bool": {
            "should": [
                { "term": { "reason": "A" } },
                { "term": { "reason": "B" } }
            ]
        }
    }
}
must is a strict match, analogous to AND in an SQL query; should is analogous to OR.
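The same OR-style body can be built from Python as a plain dict. This is only a minimal sketch: the helper name build_any_of_query is made up for illustration, and minimum_should_match is set explicitly so that at least one of the should clauses has to match.

```python
def build_any_of_query(field, values):
    """Return a search body matching documents where `field` equals any of `values`."""
    # One term clause per accepted value; "should" combines them like OR.
    should_clauses = [{"term": {field: value}} for value in values]
    return {
        "query": {
            "bool": {
                "should": should_clauses,
                # Require at least one of the OR-ed clauses to match.
                "minimum_should_match": 1,
            }
        }
    }


body = build_any_of_query("reason", ["A", "B"])
# With an elasticsearch-py client named `es` (assumed to exist already):
# res = es.search(index="test_index", body=body)
```

The resulting dict is passed to the client unchanged, so it can be printed and compared with the JSON above before it is sent.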

This is not a Python problem: when you pass body as a dict, it is simply your query.
Terms Query:
https://www.elastic.co/guide/en/elasticsearch/reference/7.4/query-dsl-terms-query.html
ES Client for Python:
https://elasticsearch-py.readthedocs.io/en/master/
This should work:
body = {"query": {"bool": {"must": {"terms": {"reason": ["A", "B"]}}}}}
res = es.search(index="test-index", body=body)
Check:
res['hits']['hits']

Related

How to get all document with max date?

I'm trying to get, from my MongoDB, all documents with the latest date.
My db looks like this:
_id:"1"
date:"21-12-20"
report:"some stuff"
_id:"2"
date:"11-11-11"
report:"qualcosa"
_id:5fe08735b5a28812866cbc8a
date:"21-12-20"
report:Object
_id:5fe0b35e2f465c2a2bbfc0fd
date:"20-12-20"
report:"ciao"
and I would like to get a result like this:
_id:"1"
date:"21-12-20"
report:"some stuff"
_id:5fe08735b5a28812866cbc8a
date:"21-12-20"
report:Object
I tried running this script:
db.collection.find({}).sort([("date", -1)]).limit(1)
but it gives me only one document.
How can I get all the documents with the greatest date automatically?
Try removing limit(1); note that this returns every document sorted by date, not only those with the latest date.
If you add .limit(1), it will only ever give you one document.
Either use the result as the filter for a second .find(), or write an aggregate query. If your data set is a modest size, I prefer the former for clarity.
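# limit(1) must be applied to the cursor, before converting it to a list
max_date = list(db.collection.find({}).sort([("date", -1)]).limit(1))
if len(max_date) > 0:
    # second query: every document that shares the greatest date
    docs = db.collection.find({'date': max_date[0]['date']})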
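The two-step approach can be wrapped in a small function. This is a sketch under assumptions: find_docs_with_max_date is a hypothetical helper name, and collection is expected to be a PyMongo Collection (only find_one with a sort argument and find are used). Since the dates in the question are stored as strings, the sort compares them lexicographically.

```python
def find_docs_with_max_date(collection, date_field="date"):
    """Return all documents sharing the greatest value of `date_field`."""
    # Step 1: fetch the single document that sorts last on the date field.
    newest = collection.find_one({}, sort=[(date_field, -1)])
    if newest is None:
        return []  # the collection is empty
    # Step 2: fetch every document carrying that same date value.
    return list(collection.find({date_field: newest[date_field]}))
```

It would be called as find_docs_with_max_date(db.collection), which costs two round trips to the server but keeps each query easy to read.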
Use an aggregation pipeline like this:
db.collection.aggregate([
    { $group: { _id: null, data: { $push: "$$ROOT" } } },
    {
        $set: {
            data: {
                $filter: {
                    input: "$data",
                    cond: { $eq: [{ $max: "$data.date" }, "$$this.date"] }
                }
            }
        }
    },
    { $unwind: "$data" },
    { $replaceRoot: { newRoot: "$data" } }
])

What is the correct way to use the elasticsearch-dsl Percolate query?

from elasticsearch_dsl import Search, Q

# client and list_of_documents are defined elsewhere
s = Search(index='test-index').using(client)
q = Q('percolate',
      field="query",
      documents=list_of_documents)
s = s.query(q)
p = s.execute()
I am attempting to run a percolate query against an index with a list of documents, and I am getting this error:
RequestError(400, 'search_phase_execution_exception', 'Field [_id] is a metadata field and cannot be added inside a document. Use the index API request parameters.').
Any help solving this is very much appreciated.
I'll start by explaining this via the REST API.
The Percolate query can be used to match queries stored in an index.
When creating an index with a Percolate field, you specify a mapping like this:
PUT /my-index
{
    "mappings": {
        "properties": {
            "message": {
                "type": "text"
            },
            "query": {
                "type": "percolator"
            }
        }
    }
}
In this mapping, query (type percolator) is the field that stores the registered queries, and message is the field those queries are matched against.
To match a list of documents, send documents that contain only this field, as in this example from the docs:
GET /my-index/_search
{
    "query": {
        "percolate": {
            "field": "query",
            "documents": [
                { "message": "bonsai tree" },
                { "message": "new tree" },
                { "message": "the office" },
                { "message": "office tree" }
            ]
        }
    }
}
That said, you should:
Set the proper mappings in your ES index, with a percolator field and the fields the stored queries refer to.
In the DSL, send documents that contain only the fields to be percolated, instead of the whole ES document (metadata such as _id must not be included).
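One way to prepare such a list in Python is to copy only the mapped fields out of each document before percolating. This is a rough sketch: to_percolate_docs is a hypothetical helper, and the field name message is taken from the example mapping above, so adjust it to your own mapping.

```python
def to_percolate_docs(documents, fields=("message",)):
    """Keep only the mapped fields of each document, dropping metadata like _id."""
    return [
        {name: doc[name] for name in fields if name in doc}
        for doc in documents
    ]


# Documents as they might come back from another search, metadata included.
raw_documents = [
    {"_id": "1", "message": "bonsai tree"},
    {"_id": "2", "message": "new tree"},
]
list_of_documents = to_percolate_docs(raw_documents)
# list_of_documents can then be passed as documents=... to Q('percolate', ...).
```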
Hope this is helpful :D

Elasticsearch query where a field is null

I am trying to write a query that searches Elasticsearch for documents where a particular field is null. The query is executed in Python using the Python Elasticsearch Client.
query:
{
    "_source": ["name"],
    "query": {
        "nested": {
            "path": "experience",
            "query": {
                "match": {
                    "experience.resignation_date": {
                        "query": None
                    }
                }
            }
        }
    }
}
Since it's Python, I used None in the query part, but it throws this error:
elasticsearch.exceptions.RequestError: TransportError(400, 'parsing_exception', '[match] unknown token [VALUE_NULL] after [query]')
The missing query is deprecated; you're looking for bool/must_not + exists:
{
    "_source": [
        "name"
    ],
    "query": {
        "nested": {
            "path": "experience",
            "query": {
                "bool": {
                    "must_not": {
                        "exists": {
                            "field": "experience.resignation_date"
                        }
                    }
                }
            }
        }
    }
}
With this expression you're not querying for null; you're telling Elasticsearch to use null as the query.
The query must always be one of the query types that Elasticsearch defines, such as the "match" query.
The kind of syntax you wanted to write here was:
query: {
    match: {
        "experience.resignation_date": None
    }
}
Here you would be asserting that the value matches None.
However, there is a more specific query type for matching documents with empty field values, called "missing".
It matches null fields as well as documents that lack the property altogether. Whether this is more appropriate for you depends on your needs.
EDIT: As a subsequent answer points out, "missing" is now deprecated. The equivalent is to negate the "exists" query instead:
query: {
    bool: {
        must_not: {
            exists: { field: "experience.resignation_date" }
        }
    }
}
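In Python the negated exists query is just a nested dict, so it can be produced by a small builder. This is a sketch only: build_missing_field_query is a made-up helper name, and the path and field names come from the question.

```python
def build_missing_field_query(path, field, source_fields=None):
    """Build a search body for documents that have at least one nested
    object under `path` in which `field` is missing or null."""
    body = {
        "query": {
            "nested": {
                "path": path,
                "query": {
                    "bool": {
                        # exists takes an object with a "field" key
                        "must_not": {"exists": {"field": f"{path}.{field}"}}
                    }
                },
            }
        }
    }
    if source_fields is not None:
        # Optionally restrict which _source fields are returned.
        body["_source"] = list(source_fields)
    return body


query = build_missing_field_query("experience", "resignation_date", ["name"])
```

Note that no None value appears anywhere in the body, which is what avoids the VALUE_NULL parsing error from the question.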

Elasticsearch, Python 2.7: configure an analyzer for an index

I am trying to build an index using the python API, with the following code (In particular I am trying to configure an analyzer):
doc = {
    "settings": {
        "analysis": {
            "analyzer": {
                "folding": {
                    "tokenizer": "standard",
                    "filter": ["lowercase", "asciifolding"]
                }
            }
        }
    }
}
res = es.indices.create(index='index_db', body=doc)
But when I try to feed the database with some example data, 'My œsophagus caused a débâcle' (the same example as on the website), I don't obtain 'my, oesophagus, caused, a, debacle' but instead 'my, œsophagus caused, a, débâcle'. I think the problem is in the creation of the index. Am I using the correct syntax?
After several attempts I found the solution. It was a syntax problem.
The correct answer is:
doc = {
    "index": {
        "analysis": {
            "analyzer": {
                "default": {
                    "tokenizer": "standard",
                    "filter": ["standard", "asciifolding"]
                }
            }
        }
    }
}
es.indices.create(index='forensic_db', body=doc)

PyMongo returns an empty result set when a MongoDB client returns correct results

I have a simple MongoDB collection that I am accessing using PyMongo in my Python script.
I am filtering the query in Python using the dictionary:
{ "$and" : [
    { "bettinginterests" : { "$elemMatch" : { "runner.name" : "Jailhouse King" } } },
    { "bettinginterests" : { "$elemMatch" : { "runner.name" : "Tyrone Haji" } } }
] }
And this returns correct results. However, I would like to expand the filter to be:
{ "$and" : [
    { "bettinginterests" : { "$elemMatch" : { "runner.name" : "Jailhouse King" } } },
    { "bettinginterests" : { "$elemMatch" : { "runner.name" : "Tyrone Haji" } } },
    { "summary.dist" : "1" }
] }
And this is returning an empty result set. Now when I do this same query in my MongoDB client using:
db.race_results.find({ "$and" : [
    { "bettinginterests" : { "$elemMatch" : { "runner.name" : "Jailhouse King" } } },
    { "bettinginterests" : { "$elemMatch" : { "runner.name" : "Tyrone Haji" } } },
    { "summary.dist" : "1" }
] })
The results are returned correctly as expected.
I don't see any difference between the Python dictionary being passed as the query filter, and the js code being executed on my MongoDB client.
Does anyone see where there might be a difference? I'm at a loss here.
UPDATE:
Here is a sample record in my DB:
https://gist.github.com/brspurri/8cefcd20a7f995145a81
UPDATE 2:
Python Code to perform the query:
runner = "Jailhouse King"
opponent = "Tyrone Haji"
query_filter = {"$and": [
    {"bettinginterests": {"$elemMatch": {"runner.name": runner}}},
    {"bettinginterests": {"$elemMatch": {"runner.name": opponent}}},
    {"summary.dist": "1"}
]}
try:
    collection = db.databases['race_results']
    entities = None
    if not query_filter:
        entities = collection.find().sort([("date", -1)])
    else:
        entities = collection.find(query_filter).sort([("date", -1)])
except BaseException as e:
    print('An error occurred in query: %s\n' % e)
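This line is probably the culprit:
collection = db.databases['race_results']
If db is your database, this is wrong: in PyMongo, db.databases is a collection named "databases", and indexing it with ['race_results'] gives the collection "databases.race_results", which presumably does not exist, so the query returns nothing. It should be:
collection = db['race_results']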
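Because PyMongo creates collection objects lazily and raises no error for a wrong name, a quick guard on the collection's name attribute can catch this kind of slip early. The helper below is a hypothetical sketch, not part of PyMongo; it relies only on the name attribute that PyMongo Collection objects expose.

```python
def require_collection(collection, expected_name):
    """Return `collection` if it has the expected name, otherwise raise."""
    if collection.name != expected_name:
        raise ValueError(
            "expected collection %r, got %r" % (expected_name, collection.name)
        )
    return collection
```

For example, require_collection(db['race_results'], 'race_results') passes, while the original db.databases['race_results'] would be rejected because its name is "databases.race_results".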
