Elasticsearch with python: query specific field

Elasticsearch with python: query specific field - python

I'm using python's elasticsearch module to connect and search through my elasticsearch cluster.
In the cluster, one of the fields in my index is 'message' - I want to query my elastic, from python, for a specific value in this 'message' field.
Here is my basic search which simply returns all logs of a specific index.
es = elasticsearch.Elasticsearch(source_cluster)
doc = {
'size' : 10000,
'query': {
'match_all' : {}
}
}
res = es.search(index='test-index', body=doc, scroll='1m')
How should I change this query in order to find all results with the word 'moved' in their 'message' field?
The equivalent query that does it from Kibana is:
_index:test-index && message: moved
Thanks,
Noam

You need to use the match query. Try this:
doc = {
'size' : 10000,
'query': {
'match' : {
'message': 'moved'
}
}
}

Related

MongoDB - Querying a nested boolean field

I have the following mongoclient query:
db = db.getSiblingDB("test-db");
hosts = db.getCollection("test-collection")
db.hosts.aggregate([
{$match: {"ip_str": {$in: ["52.217.105.116"]}}}
]);
Which outputs this:
{
"_id" : ObjectId("..."),
"ip_str" : "52.217.105.116",
"data" : [
{"ssl" : {"cert" : {"expired" : "False"}}}
]
}
I'm trying to build the query so it returns a boolean True or False depending on the value of the ssl.cert.expired field. I'm not quite sure how to do this though. I've had a look into the $lookup and $where operators, but am not overly familiar with querying nested objects in Mongo yet.

As the data is an array, in order to get the (first) element of the nested expired, you should work with $arrayElemAt and provide an index as 0 to indicate the first element.
{
$project: {
ip_str: 1,
expired: {
$arrayElemAt: [
"$data.ssl.cert.expired",
0
]
}
}
}
Demo # Mongo Playgound

How to scroll through elastic query results, python

I'm querying my elastic search server and limiting it to 100 results, but there could be a potential of 5000+ results, but for speed I don't want to overload the users connection trying to send it all in bulk.
data = es.search(index=case_to_view, size=100,body={
"query": {
"range" : {
"someRandomFIeld" : {
"gte" : 1,
}
}
}
})
This is doing two things, getting me results that have the field type and only getting the results where that field type exists if its value is greater than equal to 1.
data['hits']['total'] # 5089
How do I let the user get the next lot of results from the same query, ie. The next 100, previous 100, etc

You'll want to utilize the "from" and "size" properties.
You can see it here in the 7.0 documentation.
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-from-size.html
ex :
{
"from" : 0, "size" : 10,
"query" : {
"term" : { "user" : "kimchy" }
}
}

db.collection.update $set is not working

I am using gv3.6.2 mongo db and using $set to-update a field and it just doesn't work and am clueless as to why?any pointers are appreciated?
from pymongo import MongoClient .
from bson import ObjectId
import os,pymongo
dbuser = os.environ.get('user', '')
dbpass = os.environ.get('pwd', '')
uri = 'mongodb://{dbuser}:{dbpass}#machineip/data'.format(**locals())
client = MongoClient(uri)
db = client.data
collection = db['test']
print db.version
db.collection.update(
{ "_id" : ObjectId("5a95a1c32a2e2e0025e6d6e2") },
{ "$set":
{
"status": "submission"
}
}
)
Document:
{
"_id" : ObjectId("5a95a1c32a2e2e0025e6d6e2"),
"status" : "Submitting",
"endRev" : "9531c3448d3f7713dc74c4b05d177ecf0c6e4df6",
"chip" : "4364",
}

Your update isn't working because of the match portion of your query:
{ "_id": "5a95a1c32a2e2e0025e6d6e2" }
That is searching for a document with a string _id. You must cast to an ObjectId in order for it to find the matching document and perform the update.
{ "_id" : ObjectId("5a95a1c32a2e2e0025e6d6e2") }
Also be sure to include from pymongo import ObjectId.

Use update_many to update more than one document. If you want to update one document use update_one.
Actually update is deprecated.
from bson import ObjectId
db.collection.update_many({"_id" : ObjectId("5a95a1c32a2e2e0025e6d6e2")}, {"status": "submission"})
I hope it helps.

How to use find() nested documents for two levels or more?

Here is my sample mongodb database
database image for one object
The above is a database with an array of articles. I fetched only one object for simplicity purposes.
database image for multiple objects ( max 20 as it's the size limit )
I have about 18k such entries.
I have to extract the description and title tags present inside the (articles and 0) subsections.
The find() method is the question here.. i have tried this :
for i in db.ncollec.find({'status':"ok"}, { 'articles.0.title' : 1 , 'articles.0.description' : 1}):
for j in i:
save.write(j)
After executing the code, the file save has this :
_id
articles
_id
articles
and it goes on and on..
Any help on how to print what i stated above?
My entire code for reference :
import json
import newsapi
from newsapi import NewsApiClient
import pymongo
from pymongo import MongoClient
client = MongoClient()
db = client.dbasenews
ncollec = db.ncollec
newsapi = NewsApiClient(api_key='**********')
source = open('TextsExtractedTemp.txt', 'r')
destination = open('NewsExtracteddict.txt', "w")
for word in source:
if word == '\n':
continue
all_articles = newsapi.get_everything(q=word, language='en', page_size=1)
print(all_articles)
json.dump(all_articles, destination)
destination.write("\n")
try:
ncollec.insert(all_articles)
except:
pass

Okay, so I checked a little to update my rusty memory of pymongo, and here is what I found.
The correct query should be :
db.ncollec.find({ 'status':"ok",
'articles.title' : { '$exists' : 'True' },
'articles.description' : { '$exists' : 'True' } })
Now, if you do this :
query = { 'status' : "ok",
'articles.title' : { '$exists' : 'True' },
'articles.description' : { '$exists' : 'True' } }
for item in db.ncollect.find(query):
print item
And that it doesn't show anything, the query is correct, but you don't have the right database, or the right tree, or whatever.
But I assure you, that with the database you showed me, that if you do...
query = { 'status' : "ok",
'articles.title' : { '$exists' : 'True' },
'articles.description' : { '$exists' : 'True' } }
for item in db.ncollect.find(query):
save.write(item[0]['title'])
save.write(item[0]['description'])
It'll do what you wished to do in the first place.
Now, the key item[0] might not be good, but for this, I can't really be of any help since it is was you are showing on the screen. :)
Okay, now. I have found something for you that is a bit more complicated, but is cool :)
But I'm not sure if it'll work for you. I suspect you're giving us a wrong tree, since when you do .find( {'status' : 'ok'} ), it doesn't return anything, and it should return all the documents with a 'status' : 'ok', and since you have lots...
Anyways, here is the query, that you should use with .aggregate() method, instead of .find() :
elem = { '$match' : { 'status' : 'ok', 'articles.title' : { '$exists' : 'True'}, 'articles.description' : { '$exists' : 'True'}} }
[ elem, { '$unwind' : '$articles' }, elem ]
If you want an explanation as to how this works, I invite you to read this page.
This query will return ONLY the elements in your array that have a title, and a description, with a status OK. If an element doesn't have a title, or a description, it will be ignored.

How to use "suggest" in elasticsearch pyes?

How to use the "suggest" feature in pyes? Cannot seem to figure it out due to poor documentation. Could someone provide a working example? None of what I tried appears to work. In the docs its listed under query, but using:
query = Suggest(fields="fieldname")
connectionobject.search(query=query)

Since version 5:
_suggest endpoint has been deprecated in favour of using suggest via _search endpoint. In 5.0, the _search endpoint has been optimized for suggest only search requests.
(from https://www.elastic.co/guide/en/elasticsearch/reference/5.5/search-suggesters.html)
Better way to do this is using search api with suggest option
from elasticsearch import Elasticsearch
es = Elasticsearch()
text = 'ra'
suggest_dictionary = {"my-entity-suggest" : {
'text' : text,
"completion" : {
"field" : "suggest"
}
}
}
query_dictionary = {'suggest' : suggest_dictionary}
res = es.search(
index='auto_sugg',
doc_type='entity',
body=query_dictionary)
print(res)
Make sure you have indexed each document with suggest field
sample_entity= {
'id' : 'test123',
'name': 'Ramtin Seraj',
'title' : 'XYZ',
"suggest" : {
"input": [ 'Ramtin', 'Seraj', 'XYZ'],
"output": "Ramtin Seraj",
"weight" : 34 # a prior weight
}
}

Here is my code which runs perfectly.
from elasticsearch import Elasticsearch
es = Elasticsearch()
text = 'ra'
suggDoc = {
"entity-suggest" : {
'text' : text,
"completion" : {
"field" : "suggest"
}
}
}
res = es.suggest(body=suggDoc, index="auto_sugg", params=None)
print(res)
I used the same client mentioned on the elasticsearch site here
I indexed the data in the elasticsearch index by using completion suggester from here

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Elasticsearch with python: query specific field - python

You need to use the match query. Try this: doc = { 'size' : 10000, 'query': { 'match' : { 'message': 'moved' } } }

Related

MongoDB - Querying a nested boolean field

How to scroll through elastic query results, python

db.collection.update $set is not working

How to use find() nested documents for two levels or more?

How to use "suggest" in elasticsearch pyes?

Categories

Resources