Elasticsearch: creating a custom analyzer with the Python client returns HTTP 400

I am trying to create a custom analyzer with the Elasticsearch Python client, following this article in the Elasticsearch documentation:
elastic docs article
When I send a PUT request with the following JSON settings, it returns 200 (success).
PUT my-index-000001
{
"settings": {
"analysis": {
"analyzer": {
"my_custom_analyzer": {
"char_filter": [
"emoticons"
],
"tokenizer": "punctuation",
"filter": [
"lowercase",
"english_stop"
]
}
},
"tokenizer": {
"punctuation": {
"type": "pattern",
"pattern": "[ .,!?]"
}
},
"char_filter": {
"emoticons": {
"type": "mapping",
"mappings": [
":) => _happy_",
":( => _sad_"
]
}
},
"filter": {
"english_stop": {
"type": "stop",
"stopwords": "_english_"
}
}
}
}
}
The issue comes when I try to do the same with the python client. Here's how I am using it.
settings.py to define settings
settings = {
"settings": {
"analysis": {
"analyzer": {
"my_custom_analyzer": {
"char_filter": [
"emoticons"
],
"tokenizer": "punctuation",
"filter": [
"lowercase",
"english_stop"
]
}
},
"tokenizer": {
"punctuation": {
"type": "pattern",
"pattern": "[ .,!?]"
}
},
"char_filter": {
"emoticons": {
"type": "mapping",
"mappings": [
":) => _happy_",
":( => _sad_"
]
}
},
"filter": {
"english_stop": {
"type": "stop",
"stopwords": "_english_"
}
}
}
}
}
create-index helper method
es_connection.create_index(index_name="test", mapping=mapping, settings=settings)
es-client call
def create_index(self, index_name: str, mapping: Dict, settings: Dict) -> None:
    """
    Create an ES index.
    :param index_name: Name of the index.
    :param mapping: Mapping of the index.
    :param settings: Settings of the index.
    """
    logging.info(f"Creating index {index_name} with the following schema: {json.dumps(mapping, indent=2)}")
    # ignore=400 means an HTTP 400 response does not raise an exception
    self.es_client.indices.create(index=index_name, ignore=400, mappings=mapping, settings=settings)
I get the following error from logs
{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"unknown setting [index.settings.analysis.analyzer.my_custom_analyzer.char_filter] please check that any required plugins are installed, or check the breaking changes documentation for removed settings"}],"type":"illegal_argument_exception","reason":"unknown setting [index.settings.analysis.analyzer.my_custom_analyzer.char_filter] please check that any required plugins are installed, or check the breaking changes documentation for removed settings","suppressed":[{"type":"illegal_argument_exception","reason":"unknown setting [index.settings.analysis.analyzer.my_custom_analyzer.filter] please check that any required plugins are installed, or check the breaking changes documentation for removed settings"},{"type":"illegal_argument_exception","reason":"unknown setting [index.settings.analysis.analyzer.my_custom_analyzer.tokenizer] please check that any required plugins are installed, or check the breaking changes documentation for removed settings"},{"type":"illegal_argument_exception","reason":"unknown setting [index.settings.analysis.char_filter.emoticons.mappings] please check that any required plugins are installed, or check the breaking changes documentation for removed settings"},{"type":"illegal_argument_exception","reason":"unknown setting [index.settings.analysis.char_filter.emoticons.type] please check that any required plugins are installed, or check the breaking changes documentation for removed settings"},{"type":"illegal_argument_exception","reason":"unknown setting [index.settings.analysis.filter.english_stop.stopwords] please check that any required plugins are installed, or check the breaking changes documentation for removed settings"},{"type":"illegal_argument_exception","reason":"unknown setting [index.settings.analysis.filter.english_stop.type] please check that any required plugins are installed, or check the breaking changes documentation for removed settings"},{"type":"illegal_argument_exception","reason":"unknown setting [index.settings.analysis.tokenizer.punctuation.pattern] please check that any required plugins are installed, or check the breaking changes documentation for removed settings"},{"type":"illegal_argument_exception","reason":"unknown setting [index.settings.analysis.tokenizer.punctuation.type] please check that any required plugins are installed, or check the breaking changes documentation for removed settings"}]},"status":400}
Any idea what causes this issue? Is it related to ignore=400? Thanks in advance.
PS: I'm using docker.elastic.co/elasticsearch/elasticsearch:7.15.1 and the Python elasticsearch client 7.15.1.

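You simply need to remove the top-level "settings" key, because the client adds it automatically when you pass the dictionary through the settings parameter. With the extra wrapper, the request ends up with settings nested inside settings, which is why the error mentions index.settings.analysis instead of index.analysis:
settings = {
"settings": { <--- remove this line (and its closing brace)
"analysis": {
"analyzer": {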

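The fix above can be sketched in Python. This is a minimal illustration, not the client's own code: unwrap_settings is a hypothetical helper, and the dictionary is a shortened copy of the one in the question. The commented-out call at the end assumes the es_client and mapping objects from the question.

```python
from typing import Any, Dict


def unwrap_settings(body: Dict[str, Any]) -> Dict[str, Any]:
    """Return the inner dict if it is wrapped in a single top-level "settings" key."""
    if set(body.keys()) == {"settings"} and isinstance(body["settings"], dict):
        return body["settings"]
    return body


# Same shape as the REST body from the question, shortened for illustration.
rest_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                "my_custom_analyzer": {
                    "char_filter": ["emoticons"],
                    "tokenizer": "punctuation",
                    "filter": ["lowercase", "english_stop"],
                }
            }
        }
    }
}

index_settings = unwrap_settings(rest_body)

# With the 7.15 client this would then be passed as (not executed here):
# es_client.indices.create(index="test", mappings=mapping, settings=index_settings)
```

Dropping ignore=400 while debugging also helps, because the client then raises on the 400 response instead of returning it silently.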
Related

Warning while trying to add mapping with dynamic_templates having analyzer and search_analyzer

I am using elasticsearch python client to connect to elasticsearch.
While trying to add mapping to index, I am getting following warning:
es.indices.put_mapping(index=index, body=mappings)
/usr/local/lib/python2.7/dist-packages/elasticsearch/connection/base.py:209: ElasticsearchWarning: }}], attempted to validate it with the following match_mapping_type: [string], caused by [unknown parameter [search_analyzer] on mapper [__dynamic__attributes] of type [keyword]]
/usr/local/lib/python2.7/dist-packages/elasticsearch/connection/base.py:209: ElasticsearchWarning: }}], attempted to validate it with the following match_mapping_type: [string], caused by [unknown parameter [search_analyzer] on mapper [__dynamic__metadata] of type [keyword]]
warnings.warn(message, category=ElasticsearchWarning)
And while indexing the record, got this warning:
/usr/local/lib/python2.7/dist-packages/elasticsearch/connection/base.py:209: ElasticsearchWarning: Parameter [search_analyzer] is used in a dynamic template mapping and has no effect on type [keyword]. Usage will result in an error in future major versions and should be removed.
warnings.warn(message, category=ElasticsearchWarning)
/usr/local/lib/python2.7/dist-packages/elasticsearch/connection/base.py:209: ElasticsearchWarning: Parameter [analyzer] is used in a dynamic template mapping and has no effect on type [keyword]. Usage will result in an error in future major versions and should be removed.
warnings.warn(message, category=ElasticsearchWarning)
I am using Elasticsearch 7.15.1.
pip packages:
elasticsearch==7.15.1
elasticsearch-dsl==7.4.0
My settings and mappings are:
settings = {"analysis": {"analyzer": {"my_analyzer": {
"type": "custom",
"tokenizer": "keyword",
"filter": ["trim"]}
}
}
}
mappings = {"dynamic_templates": [
{"attributes": {
"match_mapping_type": "string",
"path_match": "attributes.*",
"mapping": {
"type": "keyword",
"analyzer": "my_analyzer",
"search_analyzer": "my_analyzer"
}
}
},
{"metadata": {
"match_mapping_type": "string",
"path_match": "metadata.*",
"mapping": {
"type": "keyword",
"analyzer": "my_analyzer",
"search_analyzer": "my_analyzer"
}
}
}
]
}
I need help adjusting the mapping. It was working fine on Elasticsearch 6.0.1; after upgrading to 7.15.1 I started getting these warnings.
You are trying to set an analyzer on a keyword field. The Elasticsearch analyzer documentation states at the top of the page:
Only text fields support the analyzer mapping parameter.
You have to change the type of your field to text, or specify no analyzer at all for the keyword fields. You can also use normalizers to apply token filters to your keyword fields, as mentioned in the answer to this question on the Elastic discuss page.
The trim token filter that you want to use is not explicitly mentioned in the list of compatible filters, but I tried it with the Kibana dev tools, and it seems to work:
PUT normalizer_trim
{
"settings": {
"analysis": {
"normalizer": {
"my_normalizer": {
"type": "custom",
"filter": ["lowercase", "trim"]
}
}
}
},
"mappings": {
"properties": {
"foo": {
"type": "keyword",
"normalizer": "my_normalizer"
}
}
}
}
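Applied to the Python dictionaries from the question, the same idea could look like the sketch below. It is an illustration under assumptions: the normalizer name my_normalizer is taken from the REST example, keyword_template is a hypothetical helper, and only the trim filter is kept because that is what the original analyzer used.

```python
from typing import Any, Dict

settings: Dict[str, Any] = {
    "analysis": {
        "normalizer": {
            # A normalizer has no tokenizer; it only lists filters.
            "my_normalizer": {"type": "custom", "filter": ["trim"]}
        }
    }
}


def keyword_template(name: str, path_match: str) -> Dict[str, Any]:
    """Build one dynamic template that maps strings to normalized keyword fields."""
    return {
        name: {
            "match_mapping_type": "string",
            "path_match": path_match,
            # "normalizer" replaces the "analyzer"/"search_analyzer" pair from the question.
            "mapping": {"type": "keyword", "normalizer": "my_normalizer"},
        }
    }


mappings: Dict[str, Any] = {
    "dynamic_templates": [
        keyword_template("attributes", "attributes.*"),
        keyword_template("metadata", "metadata.*"),
    ]
}
```

These two dictionaries would be used in place of the settings and mappings from the question when the index is created.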

Firebase Hosting REST API: Page not found after successful deploy

I was trying to use Python to deploy sites to Firebase Hosting. I followed this guide.
My code seems to be working fine, I'm not getting any errors and I'm getting 200 status codes in the API responses. I'm getting all the same responses as they show in the guide:
# versions.create
200, {
"name": "sites/xxxxx/versions/bd94931c702c6150",
"status": "CREATED",
"config": {
"headers": [
{
"headers": {
"Cache-Control": "max-age=1800"
},
"glob": "**"
}
]
}
}
# versions.populateFiles
200, {
"uploadRequiredHashes": [
"13f7dc725fc6c937322b1614479fdb916f5d27f027fef1bee83c7bc61fc393c6",
"8529e2e12706f35232fce346d3fe23166b72a8fa029c153533e1139a8cc7b08d",
"30e3a300bf4c8ab3fc5e3906772c9ccabfcbe18447143edf7ab6c9cb22a18d73"
],
"uploadUrl": "https://upload-firebasehosting.googleapis.com/upload/sites/xxxxx/versions/bd94931c702c6150/files"
}
200 # file1 upload
200 # file2 upload
200 # file3 upload
# versions.patch
200, {
"name": "sites/xxxxx/versions/bd94931c702c6150",
"status": "FINALIZED",
"config": {
"headers": [
{
"headers": {
"Cache-Control": "max-age=1800"
},
"glob": "**"
}
]
},
"createTime": "2021-10-01T11:38:24.345049Z",
"createUser": {
"email": "firebase-adminsdk-xj8ro@xxxxx.iam.gserviceaccount.com"
},
"finalizeTime": "2021-10-01T11:38:37.780419Z",
"finalizeUser": {
"email": "firebase-adminsdk-xj8ro@xxxxx.iam.gserviceaccount.com"
}
}
# releases.create
200, {
"name": "sites/xxxxx/releases/1633088318665339",
"version": {
"name": "sites/xxxxx/versions/bd94931c702c6150",
"status": "FINALIZED",
"config": {
"headers": [
{
"headers": {
"Cache-Control": "max-age=1800"
},
"glob": "**"
}
]
},
"createTime": "2021-10-01T11:38:24.345049Z",
"createUser": {
"email": "firebase-adminsdk-xj8ro@xxxxx.iam.gserviceaccount.com"
},
"finalizeTime": "2021-10-01T11:38:37.780419Z",
"finalizeUser": {
"email": "firebase-adminsdk-xj8ro@xxxxx.iam.gserviceaccount.com"
}
},
"type": "DEPLOY",
"releaseTime": "2021-10-01T11:38:38.665339693Z",
"releaseUser": {
"email": "firebase-adminsdk-xj8ro@xxxxx.iam.gserviceaccount.com"
}
}
(I replaced my site ID with xxxxx)
I don't know what the problem is.
Maybe it's due to the way I gzip my files? I do it using the gzip module in Python.
for file_name in file_names:
    with open(f"{folder_path}/{file_name}", 'rb') as f_in, gzip.open(f"{OUTPUT_DIR}/{file_name}.gz", 'wb') as f_out:
        f_out.writelines(f_in)
And then I read and upload them like this:
headers = {
"Authorization": f"Bearer {access_token}",
"Content-Type": "application/octet-stream",
"Content-Length": "500"
}
f = open(file_path, "rb")
bytes = f.read()
r = requests.post(API_ENDPOINT, headers=headers, data=bytes)
However, I did notice that my response to the versions.patch call is missing the following part that is present in the tutorial:
"fileCount": "5",
"versionBytes": "114951"
The tutorial seems to be from 2018, so it could be an API change.
After doing everything as shown in the tutorial, I still get a Page Not Found error when I go to the URL of my site.
I can add more code if it is needed. Please help me. Thanks in advance.
There can be multiple reasons for the error you are facing. Below are possible fixes and workarounds for you to analyse and try:
The steps to deploy are as follows:
STEP 1:
ng build --prod
STEP 2:
firebase init
Are you ready to proceed? Yes
What do you want to use as your public directory? dist/{your-application-name}
Configure as a single-page app (rewrite all urls to /index.html)? (y/N) Yes
File dist/{your-application-name}/index.html already exists. Overwrite? (y/N) No
STEP 3:
firebase deploy --only hosting
If you still get the same page, press Ctrl+F5 to reload the page without the browser cache.
Add a dot before /dist in the "public" entry, i.e. "public": "./dist/my-app-name". Example of firebase.json:
{
"hosting": {
"public": "./dist/my-app-name",
"ignore": [
"firebase.json",
"**/.*",
"**/node_modules/**"
],
"rewrites": [
{
"source": "**",
"destination": "/index.html"
}
]
}
}
It may be that the hosting -> public entry in the firebase.json file is not pointing to the directory you built to. If you're building a single-page app with React, double-check that you have a rewrites entry to redirect all requests to index.html:
"hosting": {
"public": "build",
"ignore": [ "firebase.json", "**/.*", "**/node_modules/**" ],
"rewrites": [
{ "source": "**",
"destination": "/index.html"
} ] }
Add the site property to firebase.json:
{
"hosting": {
"site": "my-app-id",
"public": "app", ... … }
It can also be because index.html was overwritten when you answered "Y" to the overwrite prompt while initializing Firebase, which replaces your own index file with the default one. Restore your own index file and, next time, do not overwrite index.html.
Activate Firestore for the project and specify the resource location ID by following Get started with Cloud Firestore.
You might have deleted the project in the console while the source code was still referencing it, hence the 404 resource not found error. In that case, try the following:
Delete the .firebaserc file (it contains your project alias) located in the root of the project
Run firebase init and link to your project
Run firebase deploy again
Make sure you choose a default storage location for your Firebase project, then deploy again: Firebase project > Project Overview > gear icon > Project settings > Default GCP resource location.
You can also go to Firebase Console > Storage and enable Firebase Storage; it may resolve the issue.
Please update your installation of firebase-tools to the latest via 'npm i -g firebase-tools'. Also note that we are not actively testing or supporting node versions greater than 10 - if you continue having issues, downgrade your version of node to 10 and see if the issue remains. You may have the latest version locally but not globally installed. See Get started: write and deploy your first functions for more details.
Also, a small suggestion: go through the fixes and workarounds above, try them, and follow the guide exactly as written. If you still get Page Not Found after a successful deployment, please open a public issue here.
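Since the question deploys through the REST API rather than the CLI, one consistency check may also be useful. It is a sketch, not a confirmed diagnosis. As far as the REST guide describes it, the hash sent in versions.populateFiles is the SHA-256 of the gzipped file, and each file is uploaded to the uploadUrl followed by that hash, so the hash has to be computed from exactly the bytes that are uploaded. Python's gzip output includes a timestamp by default, so compressing the same file twice can give different hashes. The helper names below are hypothetical, and Content-Length is left out on the assumption that the HTTP client derives it from the body.

```python
import gzip
import hashlib
from typing import Dict, Tuple


def gzip_and_hash(raw: bytes) -> Tuple[bytes, str]:
    """Gzip raw bytes in memory and return (gzipped bytes, SHA-256 hex digest of them)."""
    # mtime=0 keeps the gzip header free of the current time, so repeated runs match.
    gzipped = gzip.compress(raw, mtime=0)
    return gzipped, hashlib.sha256(gzipped).hexdigest()


def upload_headers(access_token: str) -> Dict[str, str]:
    """Headers for one file upload; no hard-coded Content-Length."""
    return {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/octet-stream",
    }


def upload_target(upload_url: str, file_hash: str) -> str:
    """Per-file upload URL: the uploadUrl from populateFiles plus the file hash."""
    return f"{upload_url}/{file_hash}"


gzipped, digest = gzip_and_hash(b"<h1>Hello</h1>")

# A real upload would then be (not executed here, needs the requests package):
# requests.post(upload_target(upload_url, digest), headers=upload_headers(token), data=gzipped)
```

The digest is the value that would go into the "files" map of versions.populateFiles for that file's path.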

Operation Failure : Not authorized on aggregations to execute the command

I am new to MongoDB and have been learning some of the methods using pymongo 3.8.0 and a Jupyter notebook. It was going fine until I tried the "$lookup" stage; now it throws the error
OperationFailure: not authorized on aggregations to execute command. Any help or suggestions on solving the issue will be highly appreciated.
I have tried reinstalling the packages and enabling Windows administrator privileges, but so far that has not solved the problem.
OperationFailure: not authorized on aggregations to execute command
{ aggregate: "air_routes", pipeline: [ { $match: { airplane: { $regex: "747|380" } } }, { $lookup: { from: "air_alliance", localField: "airline.name", foreignField: "airlines", as: "data_src" } },
{ $unwind: "$data_src" }, { $group: { _id: { name: "$name", airlines: "$airlines" }, numberofflights: { $sum: 1 } } }, { $sort: { numberofflights: -1 } },
{ allowDiskUse: true } ], cursor: {}, lsid: { id: UUID("af942a3d-309b-4cd2-a99b-3ebcd60406f4") }, $clusterTime: { clusterTime: Timestamp(1557101096, 1),
signature: { hash: BinData(0, AD50B7BE136F58D794C75C6AD031E92168EF61D1), keyId: 6627672121604571137 } }, $db: "aggregations", $readPreference: { mode: "primary" } }
Please help resolve this issue. Thanks,
Okay, I have found the answer. Apparently it is a permissions-related issue: the second call to the database (the databases are stored on an Atlas cluster) was passing some parameters that were either empty or not fetched properly, for reasons that are still not clear. As a result, the lookup against the second collection, "air_alliance", produced the error.
A helpful thread is given here https://jira.mongodb.org/browse/CSHARP-1722
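One detail in the logged command may be worth a look, although it is not confirmed as the cause of the authorization error: { allowDiskUse: true } appears as the last element of the pipeline array. In PyMongo, allowDiskUse is a keyword argument of aggregate(), not a pipeline stage. A minimal sketch with the stages copied from the logged command; the db handle in the final comment is hypothetical.

```python
from typing import Any, Dict, List


def build_pipeline() -> List[Dict[str, Any]]:
    """Return the aggregation stages from the logged command, without any options mixed in."""
    return [
        {"$match": {"airplane": {"$regex": "747|380"}}},
        {
            "$lookup": {
                "from": "air_alliance",
                "localField": "airline.name",
                "foreignField": "airlines",
                "as": "data_src",
            }
        },
        {"$unwind": "$data_src"},
        {
            "$group": {
                "_id": {"name": "$name", "airlines": "$airlines"},
                "numberofflights": {"$sum": 1},
            }
        },
        {"$sort": {"numberofflights": -1}},
    ]


pipeline = build_pipeline()

# Options go next to the pipeline, not inside it (not executed here):
# results = db.air_routes.aggregate(pipeline, allowDiskUse=True)
```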

Configure Vs code version 2.0.0 Build Task for python

I need help configuring VS Code to run Python scripts using Ctrl+Shift+B. It was working fine until VS Code moved to version 2.0.0 of the tasks format; now it wants me to configure the build task, and I am clueless about what the build is all about.
In the past it worked well when I only needed to configure the task runner. There are YouTube videos for the task runner, but I can't seem to put my finger on what the build is all about.
In VS Code, go to Tasks -> Configure Tasks:
{
// See https://go.microsoft.com/fwlink/?LinkId=733558
// for the documentation about the tasks.json format
"version": "2.0.0",
"tasks": [
{
"label": "Run File",
"command": "python ${file}",
"type": "shell",
"group": {
"kind": "build",
"isDefault": true
},
"presentation": {
"reveal": "always",
"panel": "new",
"focus": true
}
},
{
"label": "nosetest",
"command": "nosetests -v",
"type": "shell",
"group": {
"kind": "test",
"isDefault": true
},
"presentation": {
"reveal": "always",
"panel": "new",
"focus": true
}
}
]
}
command: runs current python file
group: 'build'
presentation:
always shows the shell when run
uses a new shell
focuses the shell (i.e. keyboard is captured in shell window)
The second task is configured as the default test task and just runs nosetests -v in the folder that's currently open in VS Code.
The "Run Build Task" (the one that's bound to Ctrl+Shift+B) is the one that's configured as the default build task, task 1 in this example (see the group entry).
EDIT:
Suggested by @RafaelZayas in the comments (this will use the Python interpreter that's specified in VS Code's settings rather than the system default; see his comment for more info):
"command": "${command:python.interpreterPath} ${file}"
...Don't have enough reputation to comment on the accepted answer...
At least in my environment (Ubuntu 18.04, w/ virtual env) if arguments are being passed in with "args", the file must be the first argument, as @VladBezden is doing, and not part of the command as @orangeInk is doing. Otherwise I get the message "No such file or directory".
Specifically, the answer @VladBezden has does work for me, where the following does not.
{
// See https://go.microsoft.com/fwlink/?LinkId=733558
// for the documentation about the tasks.json format
"version": "2.0.0",
"tasks": [
{
"label": "build",
"command": "${config:python.pythonPath} setup.py", // WRONG, move setup.py to args
"group": {
"kind": "build",
"isDefault": true
},
"args": [
"install"
],
"presentation": {
"echo": true,
"panel": "shared",
"focus": true
}
}
]
}
This took me a while to figure out, so I thought I would share.
Here is my configuration for the build (Ctrl+Shift+B)
tasks.json
{
// See https://go.microsoft.com/fwlink/?LinkId=733558
// for the documentation about the tasks.json format
"version": "2.0.0",
"tasks": [
{
"label": "build",
"command": "python",
"group": {
"kind": "build",
"isDefault": true
},
"args": [
"setup.py",
"install"
],
"presentation": {
"echo": true,
"panel": "shared",
"focus": true
}
}
]
}
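The "No such file or directory" behaviour described above can be illustrated outside VS Code. This is only an analogy, assuming the task's command is treated as one executable name once "args" is present: a command string that also contains the script name is looked up as a single program that does not exist, while passing the script as the first argument works. Here sys.executable stands in for "python", and run is a hypothetical helper.

```python
import subprocess
import sys
from typing import List


def run(command: str, args: List[str]) -> str:
    """Run command with args (no shell) and return stdout, or the OS error name on failure."""
    try:
        done = subprocess.run([command, *args], capture_output=True, text=True, check=True)
        return done.stdout.strip()
    except OSError as exc:
        return type(exc).__name__


# Extra text folded into the command: the OS looks for a program literally named "<python> -c".
wrong = run(f"{sys.executable} -c", ["print('install')"])

# Interpreter as the command, everything else in args (like "command": "python", "args": [...]).
right = run(sys.executable, ["-c", "print('install')"])
```

The second call mirrors the working tasks.json above, where "setup.py" and "install" are both entries in "args".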

Python custom scripting in ElasticSearch

The index has the capability of taking custom scripting in Python, but I can't find an example of custom scripting written in Python anywhere. Does anybody have an example of a working script? One with something as simple as an if-statement would be amazing.
A simple custom scoring query using Python (assuming you have the Python scripting plugin installed):
{
"sort": [
{
"_score": {
"order": "desc"
}
}
],
"query": {
"function_score": {
"query": {
"match_all": {}
},
"script_score": {
"lang": "python",
"script": [
"if _score:",
" _score"
]
},
"boost_mode": "replace"
}
},
"track_scores": true
}
Quoted from the Elasticsearch mailing list:
Luca pointed out that ES calls python with an 'eval'
PyObject ret = interp.eval((PyCode) compiledScript);
Just make sure your code passes through the eval.
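To get a feel for what "passing through the eval" can mean, here is a sketch in plain CPython. It is only an analogy and assumes that the plugin's Jython eval compiles the script as a single expression, which the quoted thread does not confirm. In CPython, source compiled in "eval" mode may contain a conditional expression but not an if statement. The helper passes_eval and the stand-in value for _score are hypothetical.

```python
def passes_eval(source: str) -> bool:
    """Return True if source compiles as a single Python expression (eval mode)."""
    try:
        compile(source, "<script>", "eval")
        return True
    except SyntaxError:
        return False


statement_form = "if _score:\n    _score"      # an if statement
expression_form = "_score if _score else 0"   # a conditional expression

# Evaluate the expression form with a stand-in value for _score.
result = eval(compile(expression_form, "<script>", "eval"), {"_score": 2.5})
```

Under that assumption, writing the condition as an expression would be the safer form for a script that has to go through eval.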
