I am using elasticsearch python client 6.4.0
I want to use the optimize API
https://www.elastic.co/guide/en/elasticsearch/reference/1.7/indices-optimize.html
But I could not find anything about it in the elasticsearch python api doc
https://elasticsearch-py.readthedocs.io/en/master/api.html
I tried using es.indices.optimize(...) but that function does not exist.
I would prefer to use the Python client instead of a direct API call.
You are looking at the documentation for a very old version of Elasticsearch (1.7). The optimize API was renamed to forcemerge, and under that name it is available in the Python client as es.indices.forcemerge (docs).
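A minimal sketch of the call, assuming `es` is an `elasticsearch.Elasticsearch` client; the helper name `force_merge` and the index name in the usage comment are placeholders, not part of the library:

```python
def force_merge(es, index, max_num_segments=1):
    """Merge the segments of `index` down to at most `max_num_segments`.

    `es` is expected to be an elasticsearch.Elasticsearch client. The old
    optimize API is exposed by the client as indices.forcemerge.
    """
    return es.indices.forcemerge(index=index, max_num_segments=max_num_segments)


# Usage (assumes a node on localhost:9200 and an index called "my-index"):
#   from elasticsearch import Elasticsearch
#   es = Elasticsearch()
#   force_merge(es, "my-index")
```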
I am a C# developer who has been tasked with converting some deployed C# Azure functions (mostly webhooks / SB) to Python.
I am struggling to find the Python equivalent of dependency injection.
Take, for example, an API client class that continuously calls some third-party API to push and pull data. In .NET, if I had a webhook function that needed this API client, I would register a singleton service in the Startup.cs class and inject it into my Azure webhook function. This is advantageous because I can handle token refreshing and the like inside the service class itself and keep the token in memory, instead of having to re-create an instance of the API client class each time the webhook fires.
How do I do this in Python? Or what is the right way to do something similar in the same environment (Azure Functions), where we store tokens in memory AND create a service once and use the same service across multiple functions?
Thanks
Although it is not widely used because of the way Python is set up, dependency injection can be implemented in Python. There is a Python package called dependency-injector that helps with implementing DI.
We should not store tokens in the function's local memory, because functions are stateless and serverless: we have neither control over nor knowledge of the servers on which the function runs, so we don't know how long anything will stay in memory. Instead, once you acquire the tokens, add them to Azure Key Vault so that you can reuse them.
You can share a Python class among functions: just place it in a separate folder and import it in your functions.
from ..common_files import test_file
Refer to this article by Szymon Miks for examples of dependency injection in Python.
Refer to this documentation on sharing files between functions.
Refer to the following Azure Key Vault documentation on retrieving keys.
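As a rough illustration of the shared-module idea, here is a sketch of a client that is created once per Python worker process and caches its token until it expires. All names (`ApiClient`, `get_api_client`, `_fetch_token_from_provider`) are hypothetical, and the cache only lives as long as the worker process does, so the token source (for example Key Vault) still has to be the source of truth:

```python
import time
from functools import lru_cache


class ApiClient:
    """Hypothetical third-party API client that caches its access token."""

    def __init__(self, fetch_token, clock=time.monotonic):
        self._fetch_token = fetch_token  # callable returning (token, lifetime_seconds)
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def token(self):
        # Refresh only when there is no token yet or the cached one has expired.
        if self._token is None or self._clock() >= self._expires_at:
            self._token, lifetime = self._fetch_token()
            self._expires_at = self._clock() + lifetime
        return self._token


def _fetch_token_from_provider():
    # Placeholder: replace with the real token request or a Key Vault lookup.
    return "dummy-token", 3600


@lru_cache(maxsize=None)
def get_api_client():
    """Return one shared ApiClient per Python worker process."""
    return ApiClient(_fetch_token_from_provider)
```

If this lives in a shared folder, each function can import `get_api_client` the same way as the import shown above and call `get_api_client().token()`; every function running in the same worker process then gets the same instance.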
I need to download my company's marketing campaigns on Bing Ads from Airflow. I have seen there is a Bing Ads SDK in Python that may be a good starting point to build a custom operator that would make an API call to Bing Ads and upload the resulting csv to S3 using the s3 hook. I previously used the Google Ads client library to download campaigns as a list of Python objects inside the PythonOperator, but this library was way more intuitive than BingAds.
There is no BingAds hook in Airflow unfortunately, so I have to rely fully on this Python library and wrap it inside the PythonOperator. Has someone built a similar solution in the past that can point me in the right direction?
Best,
I think it's OK to use PythonOperator to run Python code, including importing the Bing Ads SDK; you just have to make sure that the packages are installed on both the scheduler and the workers. There is no need to build a custom operator for that.
You can even use the TaskFlow API to integrate it nicely: https://airflow.apache.org/docs/apache-airflow/stable/tutorial_taskflow_api.html
Airflow is meant to be extended and customized, so writing your own Python code is entirely expected. Moreover, if you find it works for you, you might want to contribute it back to Airflow after adding all the "generalisations"; a BingAdsProvider would be nice.
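A possible shape for the callable you would hand to PythonOperator (or decorate with `@task`), as a sketch only. `fetch_campaigns` is a hypothetical stand-in for whatever Bing Ads SDK code returns the campaigns as dicts, and `s3_hook` is assumed to be an `S3Hook` from the Amazon provider, of which only `load_string` is used:

```python
import csv
import io


def campaigns_to_csv(rows, fieldnames):
    """Serialize a list of dicts to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def export_campaigns(fetch_campaigns, s3_hook, bucket, key, fieldnames):
    """Download campaigns and upload them to S3 as a CSV file.

    fetch_campaigns: hypothetical callable wrapping the Bing Ads SDK calls.
    s3_hook: object with S3Hook's load_string method.
    """
    rows = list(fetch_campaigns())
    csv_text = campaigns_to_csv(rows, fieldnames)
    s3_hook.load_string(string_data=csv_text, key=key, bucket_name=bucket, replace=True)
    return len(rows)


# Usage inside a task (connection id, bucket and key are placeholders):
#   from airflow.providers.amazon.aws.hooks.s3 import S3Hook
#   export_campaigns(my_bing_download, S3Hook(aws_conn_id="aws_default"),
#                    "my-bucket", "bing/campaigns.csv", ["id", "name"])
```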
I'm working on automated deletion of Azure resources based on tags attached to these resources.
I'm using the Azure SDK for Python (https://github.com/Azure/azure-sdk-for-python). I found out how to get a list of my resources and that I can delete them with ResourceManagementClient using the resources.delete_by_id method.
However, this method requires two arguments: the resource ID (which I have from the resources listed by ResourceManagementClient) and the API version (which is different for every resource type).
How can I determine which API version should be passed to the method?
I've tried to find something in docs and code of SDK, but I couldn't come up with a proper solution.
API version can be even hardcoded, but it needs to work for all resource types.
When using some api versions (e.g. 2018-05-01) I get error for some of resource types:
Azure Error: NoRegisteredProviderFound
Message: No registered resource provider found for location 'westeurope' and API version '['2018-05-01']' for type 'virtualMachines'. The supported api-versions are '2015-05-01-preview, 2015-06-15, 2016-03-30, 2016-04-30-preview, 2016-08-30, 2017-03-30, 2017-12-01, 2018-04-01, 2018-06-01, 2018-10-01, 2019-03-01'. The supported locations are 'eastus, eastus2, westus, centralus, northcentralus, southcentralus, northeurope, westeurope, eastasia, southeastasia, japaneast, japanwest, australiaeast, australiasoutheast, brazilsouth, southindia, centralindia, westindia, canadacentral, canadaeast, westus2, westcentralus, uksouth, ukwest, koreacentral, koreasouth, francecentral, southafricanorth'.
ERROR: 'CloudError' object has no attribute '__traceback__'
I would recommend the same approach as the CLI implementation: make an initial call to ARM to get the possible mappings from Resource Provider / Resource Type to API version(s), and use that to inject the correct API version into your call.
Getting this mapping would be a providers list call.
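A minimal sketch of that approach. It assumes each provider object returned by `client.providers.list()` has a `namespace` and a list of `resource_types` carrying `resource_type` and `api_versions`; the helper names are made up for illustration, and preferring the newest non-preview version is my own choice rather than something the SDK prescribes:

```python
def build_api_version_map(providers):
    """Map 'namespace/type' (lowercased) to the list of supported API versions."""
    mapping = {}
    for provider in providers:
        for rt in provider.resource_types or []:
            key = (provider.namespace + "/" + rt.resource_type).lower()
            mapping[key] = list(rt.api_versions or [])
    return mapping


def resource_type_from_id(resource_id):
    """Extract 'namespace/type[/subtype...]' (lowercased) from an ARM resource ID."""
    parts = resource_id.strip("/").split("/")
    lowered = [p.lower() for p in parts]
    # Use the last "providers" segment; raises ValueError if there is none.
    idx = len(lowered) - 1 - lowered[::-1].index("providers")
    namespace = parts[idx + 1]
    # After the namespace, segments alternate: type, name, subtype, name, ...
    types = parts[idx + 2:][0::2]
    return (namespace + "/" + "/".join(types)).lower()


def pick_api_version(mapping, resource_id):
    """Pick the newest non-preview version, falling back to preview ones."""
    versions = mapping[resource_type_from_id(resource_id)]
    stable = [v for v in versions if "preview" not in v.lower()]
    return max(stable or versions)  # date-style versions sort lexicographically


# Usage (client is a ResourceManagementClient):
#   mapping = build_api_version_map(client.providers.list())
#   client.resources.delete_by_id(res.id, pick_api_version(mapping, res.id))
# Newer SDK releases name the delete call begin_delete_by_id.
```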
Adding a Mgmt sample repo, look for ResourceManagementClient:
https://github.com/Azure-Samples/azure-samples-python-management
Edit: I work at MS in the Python SDK team.
If I am not mistaken, resources.delete_by_id is a wrapper over Delete By Id REST API method. Currently the latest API Version for this operation is 2018-05-01. You can use that in your method call.
I am a newbie to the real-time distributed search engine Elasticsearch, but I would like to ask a technical question.
I have written a Python crawler module that parses a web page and creates JSON objects with native information. The next step for my crawler module is to store the native information using Elasticsearch.
The real question is the following.
Which technique is better for my use case: the Elasticsearch RESTful API or the Python API for Elasticsearch (elasticsearch-py)?
If you already have Python code, then the most natural way for you would be to use the elasticsearch-py client.
After installing the elasticsearch-py library via pip install elasticsearch, here is a simple code example to get you going:
# import the elasticsearch library
from elasticsearch import Elasticsearch
# get your JSON data
json_page = {...}
# create a new client to connect to ES running on localhost:9200
es = Elasticsearch()
# index your JSON data
es.index(index="webpages", doc_type="webpage", id=1, body=json_page)
You may also try elasticsearch_dsl; it is a high-level wrapper around elasticsearch-py.
I am using Aerospike as the caching layer and need to implement CAS (Compare and Set/Swap). While I was able to find support for this in the PHP client (http://www.aerospike.com/docs/client/php/usage/kvs/write.html), I could not find the same for the Python client. Does anyone know whether CAS is supported by the Python client as well, and whether there is any documentation for it?
Thanks!
When you get a record, the metadata contains the generation:
http://www.aerospike.com/apidocs/python/client.html#aerospike-record-tuple
You then need to supply a gen policy to put: http://www.aerospike.com/apidocs/python/client.html#write-policies
Then on the put call, you need the meta dict to contain the expected generation. There is an example here: http://www.aerospike.com/apidocs/python/client.html#aerospike.Client.put
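Put together, those steps might look like the sketch below. The helper `cas_update` and its parameters are my own naming; the generation policy constant is passed in so the function has no hard dependency on the client library, and in real use it would be `aerospike.POLICY_GEN_EQ`:

```python
def cas_update(client, key, update_bins, gen_eq_policy):
    """Read a record, compute new bins, and write them back only if the
    record's generation has not changed since the read.

    client: a connected Aerospike client (uses get and put).
    update_bins: function taking the current bins dict and returning new bins.
    gen_eq_policy: the "generation must be equal" policy value.
    """
    _, meta, bins = client.get(key)  # record tuple: (key, meta, bins)
    new_bins = update_bins(dict(bins))
    client.put(
        key,
        new_bins,
        meta={"gen": meta["gen"]},      # generation we expect on the server
        policy={"gen": gen_eq_policy},  # reject the write if it differs
    )
    return new_bins


# Usage (namespace, set and key names are placeholders):
#   import aerospike
#   client = aerospike.client({"hosts": [("127.0.0.1", 3000)]}).connect()
#   cas_update(client, ("test", "demo", "k1"),
#              lambda b: {**b, "n": b.get("n", 0) + 1},
#              aerospike.POLICY_GEN_EQ)
# If another writer got in first, put raises
# aerospike.exception.RecordGenerationError and the caller can retry.
```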