how to feed data to Elasticseach as Integer using Python? - python

i am using this python script to feed my data to elasticsearch 6.0. How can i store the variable Value with type float in Elasticsearch?
I can't use the metric options for the visualization in Kibana, because all the data is stored automatically as string
from elasticsearch import Elasticsearch
Device=""
Value=""
for key, value in row.items():
Device = key
Value = value
print("Dev",Device, "Val:", Value)
doc = {'Device':Device, 'Measure':Value , 'Sourcefile':filename}
print(' doc: ', doc)
es.index(index=name, doc_type='trends', body=doc)
Thanks
EDIT:
After the advice of #Saul, i could fix this problem with the following code:
import os,csv
import time
from elasticsearch import Elasticsearch
#import pandas as pd
import requests
Datum = time.strftime("%Y-%m-%d_")
path = '/home/pi/Desktop/Data'
os.chdir(path)
name = 'test'
es = Elasticsearch()
#'Time': time ,
#url = 'http://localhost:9200/index_name?pretty'
doc = {
"mappings": {
"doc": {
"properties": {
"device": { "type": "text" },
"measure": { "type": "text" },
"age": { "type": "integer" },
"created": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
}
}
}
}
}
#headers = {'Content-type': 'application/json', 'Accept': 'text/plain'}
#r = requests.post(url, data=json.dumps(data), headers=headers)
r= es.index(index=name, doc_type='trends', body=doc)
print(r)

You need to send a HTTP Post request using python request, as follows:
url = "http://localhost:9200/index_name?pretty”
data = {
"mappings": {
"doc": {
"properties": {
"title": { "type": "text" },
"name": { "type": "text" },
"age": { "type": "integer" },
"created": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
}
}
}
}
}
headers = {'Content-type': 'application/json', 'Accept': 'text/plain'}
r = requests.post(url, data=json.dumps(data), headers=headers)
Please replace index_name in the URL with the name of the index you are defining in to elasticsearch engine.
If you want to delete the index before creating it again, please do as follows:
url = "http://localhost:9200/index_name”
data = { }
headers = {'Content-type': 'application/json', 'Accept': 'text/plain'}
r = requests.delete(url, data=json.dumps(data), headers=headers)
please replace index_name in the URL with your actual index name. After deleting the index, create it again with the first code example above including the mappings that you would need. Enjoy.

Elasticsearch defines field types in the index mapping. It looks like you probably have dynamic mapping enabled, so when you send data to Elasticsearch for the first time, it makes an educated guess about the shape of your data and the field types.
Once those types are set, they are fixed for that index, and Elasticsearch will continue to interpret your data according to those types no matter what you do in your python script.
To fix this you need to either:
Define the index mapping before you load any data. This is the better option as it gives you complete control over how your data is interpreted. https://www.elastic.co/guide/en/elasticsearch/reference/6.0/mapping.html
Make sure that, the first time you send data into the index, you use the correct data types. This will rely dynamic mapping generation, but it will typically do the right thing.
Defining the index mapping is the best option. It's common to do that once off, in Kibana or with curl, or if you create a lot of indices, with a template.
However if you want to use python, you should look at the create or put_mapping functions on IndicesClient

Related

Reading key values a JSON array which is a set in Python

I have the following code
import requests
import json
import sys
credentials_User=sys.argv[1]
credentials_Password=sys.argv[2]
email=sys.argv[3]
def auth_api(login_User,login_Password,):
gooddata_user=login_User
gooddata_password=login_Password
body = json.dumps({
"postUserLogin":{
"login": gooddata_user,
"password": gooddata_password,
"remember":1,
"verify_level":0
}
})
headers = {
'Content-Type': 'application/json',
'Accept': 'application/json'
}
url="https://reports.domain.com/gdc/account/login"
response = requests.request(
"POST",
url,
headers=headers,
data=body
)
sst=response.headers.get('Set-Cookie')
return sst
def query_api(cookie,email):
url="https://reports.domain.com/gdc/account/domains/domain/users?login="+email
body={}
headers={
'Content-Type': 'application/json',
'Accept': 'application/json',
'Cookie': cookie
}
response = requests.request(
"GET",
url,
headers=headers,
data=body
)
jsonContent=[]
jsonContent.append({response.text})
accountSettings=jsonContent[0]
print(accountSettings)
cookie=auth_api(credentials_User,credentials_Password)
profilehash=query_api(cookie,email)
The code itself works and sends a request to the Gooddata API.
The query_api() function returns JSON similar to below
{
"accountSettings": {
"items": [
{
"accountSetting": {
"login": "user#example.com",
"email": "user#example.com",
"firstName": "First Name",
"lastName": "Last Name",
"companyName": "Company Name",
"position": "Data Analyst",
"created": "2020-01-08 15:44:23",
"updated": "2020-01-08 15:44:23",
"timezone": null,
"country": "United States",
"phoneNumber": "(425) 555-1111",
"old_password": "secret$123",
"password": "secret$234",
"verifyPassword": "secret$234",
"authenticationModes": [
"SSO"
],
"ssoProvider": "sso-domain.com",
"language": "en-US",
"ipWhitelist": [
"127.0.0.1"
],
"links": {
"projects": "/gdc/account/profile/{profile_id}/projects",
"self": "/gdc/account/profile/{profile_id}",
"domain": "/gdc/domains/default",
"auditEvents": "/gdc/account/profile/{profile_id}/auditEvents"
},
"effectiveIpWhitelist": "[ 127.0.0.1 ]"
}
}
],
"paging": {
"offset": 20,
"count": 100,
"next": "/gdc/uri?offset=100"
}
}
}
The issue I am having is reading specific keys from this JSON Dict, I can use accountSettings=jsonContent[0] but that just returns the same JSON.
What I want to do is read the value of the project key within links
How would I do this with a dict?
Thanks
Based on your description, uyou have your value inside a list, (not a set. Foergt about set: sets are not used with JSON). Inside your list, you either your content as a single string, which then you'd have to parse with json.loads, or it is simply a well behaved nested data structure already extracted from JSON, but which is inside a single element list. This seems the most likely.
So, you should be able to do:
accountlink = jsonContent[0]["items"][0]["accountSetting"]["login"]
otherwise, if it is encoded as a a json string, you have to parse it first:
import json
accountlink = json.loads(jsonContent[0])["items"][0]["accountSetting"]["login"]
Now, given your question, I'd say your are on a begginer level as a programmer, or a casual user, just using Python to automatize something either way, I'd recommend you do try some exercising before proceeding: it will save you time (a lot of time). I am not trying to bully or mock anything here: this is the best advice I can offer you. Seek for tutorials that play around on the interactive mode, rather than trying entire programs at once that you'd just copy and paste.
Using the below code fixed the issue
jsonContent=json.loads(response.text)
print(type(jsonContent))
test=jsonContent["accountSettings"]["items"][0]
test2=test["accountSetting"]["links"]["self"]
print(test)
print(test2)
I believe this works because for some reason I didn't notice I was using .append for my jsonContent. This resulted in the data type being something other than it should have been.
Thanks to everyone who tried helping me.

python Issue with getting correct info from the json

I have issue then i try get from correct information.
For example i have very big json output after request i made in post (i cant use get).
"offers": [
{
"rank": 1,
"provider": {
"id": 6653,
"isLocalProvider": false,
"logoUrl": "https://img.vxcdn.com/i/partner-energy/c_6653.png?v=878adaf9ed",
"userRatings": {
"additonalCustomerRatings": {
"price": {
"percent": 73.80
},
"service": {
"percent": 67.50
},
"switching": {
"percent": 76.37
},
"caption": {
"text": "Zusätzliche Kundenbewertungen"
}
},
I cant show it all because its very big.
Like you see "rank" 1 in this request exist 20 ranks with information like content , totalCost and i need pick them all. Like 6 rank content and totalCost, 8 rank content and totalCost.
So first off all in python i use code for getting what json data.
import requests
import json
url = "https://www.verivox.de/api/energy/offers/electricity/postCode/10555/custom?"
payload="{\"profile\":\"H0\",\"prepayment\":true,\"signupOnly\":true,\"includePackageTariffs\":true,\"includeTariffsWithDeposit\":true,\"includeNonCompliantTariffs\":true,\"bonusIncluded\":\"non-compliant\",\"maxResultsPerPage\":20,\"onlyProductsWithGoodCustomerRating\":false,\"benchmarkTariffId\":741122,\"benchmarkPermanentTariffId\":38,\"paolaLocationId\":\"71085\",\"includeEcoTariffs\":{\"includesNonEcoTariffs\":true},\"maxContractDuration\":240,\"maxContractProlongation\":240,\"usage\":{\"annualTotal\":3500,\"offPeakUsage\":0},\"priceGuarantee\":{\"minDurationInMonths\":0},\"maxTariffsPerProvider\":999,\"cancellationPeriod\":null,\"previewDisplayTime\":null,\"onlyRegionalTariffs\":false,\"sorting\":{\"criterion\":\"TotalCosts\",\"direction\":\"Ascending\"},\"includeSpecialBonusesInCalculation\":\"None\",\"totalCostViewMode\":1,\"ecoProductType\":0}"
headers = {
'Content-Type': 'application/json',
'Cookie': '__cfduid=d97a159bb287de284487ebdfa0fd097b41606303469; ASP.NET_SessionId=jfg3y20s31hclqywloocjamz; 0e3a873fd211409ead79e21fffd2d021=product=Power&ReturnToCalcLink=/power/&CustomErrorsEnabled=False&IsSignupWhiteLabelled=False; __RequestVerificationToken=vrxksNqu8CiEk9yV-_QHiinfCqmzyATcGg18dAqYXqR0L8HZNlvoHZSZienIAVQ60cB40aqfQOXFL9bsvJu7cFOcS2s1'
}
response = requests.request("POST", url, headers=headers, data=payload)
jsondata = response.json()
# print(response.text)
For it working fine, but then i try pick some data what i needed like i say before im getting
for Rankdata in str(jsondata['rank']):
KeyError: 'rank'
my code for this error.
dataRank = []
for Rankdata in str(jsondata['rank']):
dataRank.append({
'tariff':Rankdata['content'],
'cost': Rankdata['totalCost'],
'sumOfOneTimeBonuses': Rankdata['content'],
'savings': Rankdata['content']
})
Then i try do another way. Just get one or some data, but not working too.
data = response.json()
#print(data)
test = float((data['rank']['totalCost']['content']))
I know my code not perfect, but i first time deal with json what are so big and are so difficult. I will be very grateful if show my in my case example how i can pick rank 1 - rank 20 data and print it.
Thank you for your help.
If you look closely at the highest level in the json, you can see that the value for key offers is a list of dicts. You can therefore loop through it like this:
for offer in jsondata['offers']:
print(offer.get('rank'))
print(offer.get('provider').get('id'))
And the same goes for other keys in the offers.

Using the Firestore REST API to Update a Document Field

I've been searching for a pretty long time but I can't figure out how to update a field in a document using the Firestore REST API. I've looked on other questions but they haven't helped me since I'm getting a different error:
{'error': {'code': 400, 'message': 'Request contains an invalid argument.', 'status': 'INVALID_ARGUMENT', 'details': [{'#type': 'type.googleapis.com/google.rpc.BadRequest', 'fieldViolations': [{'field': 'oil', 'description': "Error expanding 'fields' parameter. Cannot find matching fields for path 'oil'."}]}]}}
I'm getting this error even though I know that the "oil" field exists in the document. I'm writing this in Python.
My request body (field is the field in a document and value is the value to set that field to, both strings received from user input):
{
"fields": {
field: {
"integerValue": value
}
}
}
My request (authorizationToken is from a different request, dir is also a string from user input which controls the directory):
requests.patch("https://firestore.googleapis.com/v1beta1/projects/aethia-resource-management/databases/(default)/documents/" + dir + "?updateMask.fieldPaths=" + field, data = body, headers = {"Authorization": "Bearer " + authorizationToken}).json()
Based on the the official docs (1,2, and 3), GitHub and a nice article, for the example you have provided you should use the following:
requests.patch("https://firestore.googleapis.com/v1beta1/projects{projectId}/databases/{databaseId}/documents/{document_path}?updateMask.fieldPaths=field")
Your request body should be:
{
"fields": {
"field": {
"integerValue": Value
}
}
}
Also keep in mind that if you want to update multiple fields and values you should specify each one separately.
Example:
https://firestore.googleapis.com/v1beta1/projects/{projectId}/databases/{databaseId}/documents/{document_path}?updateMask.fieldPaths=[Field1]&updateMask.fieldPaths=[Field2]
and the request body would have been:
{
"fields": {
"field": {
"integerValue": Value
},
"Field2": {
"stringValue": "Value2"
}
}
}
EDIT:
Here is a way I have tested which allows you to update some fields of a document without affecting the rest.
This sample code creates a document under collection users with 4 fields, then tries to update 3 out of 4 fields (which leaves the one not mentioned unaffected)
from google.cloud import firestore
db = firestore.Client()
#Creating a sample new Document “aturing” under collection “users”
doc_ref = db.collection(u'users').document(u'aturing')
doc_ref.set({
u'first': u'Alan',
u'middle': u'Mathison',
u'last': u'Turing',
u'born': 1912
})
#updating 3 out of 4 fields (so the last should remain unaffected)
doc_ref = db.collection(u'users').document(u'aturing')
doc_ref.update({
u'first': u'Alan',
u'middle': u'Mathison',
u'born': 2000
})
#printing the content of all docs under users
users_ref = db.collection(u'users')
docs = users_ref.stream()
for doc in docs:
print(u'{} => {}'.format(doc.id, doc.to_dict()))
EDIT: 10/12/2019
PATCH with REST API
I have reproduced your issue and it seems like you are not converting your request body to a json format properly.
You need to use json.dumps() to convert your request body to a valid json format.
A working example is the following:
import requests
import json
endpoint = "https://firestore.googleapis.com/v1/projects/[PROJECT_ID]/databases/(default)/documents/[COLLECTION]/[DOCUMENT_ID]?currentDocument.exists=true&updateMask.fieldPaths=[FIELD_1]"
body = {
"fields" : {
"[FIELD_1]" : {
"stringValue" : "random new value"
}
}
}
data = json.dumps(body)
headers = {"Authorization": "Bearer [AUTH_TOKEN]"}
print(requests.patch(endpoint, data=data, headers=headers).json())
I found the official documentation to not to be of much use since there was no example mentioned. This is the API end-point for your firestore database
PATCH https://firestore.googleapis.com/v1beta1/projects/{YOUR_PROJECT_ID}/databases/(default)/documents/{COLLECTION_NAME}/{DOCUMENT_NAME}
the following code is the body of your API request
{
"fields": {
"first_name": {
"stringValue":"Kurt"
},
"name": {
"stringValue":"Cobain"
},
"band": {
"stringValue":"Nirvana"
}
}
}
The response you should get upon successful update of the database should look like
{
"name": "projects/{YOUR_PROJECT_ID}/databases/(default)/documents/{COLLECTION_ID/{DOC_ID}",
{
"fields": {
"first_name": {
"stringValue":"Kurt"
},
"name": {
"stringValue":"Cobain"
},
"band": {
"stringValue":"Nirvana"
}
}
"createTime": "{CREATE_TIME}",
"updateTime": "{UPDATE_TIME}"
Note that performing the above action would result in a new document being created, meaning that any fields that existed previously but have NOT been mentioned in the "fields" body will be deleted. In order to preserve fields, you'll have to add
?updateMask.fieldPaths={FIELD_NAME} --> to the end of your API call (for each individual field that you want to preserve).
For example:
PATCH https://firestore.googleapis.com/v1beta1/projects/{YOUR_PROJECT_ID}/databases/(default)/documents/{COLLECTION_NAME}/{DOCUMENT_NAME}?updateMask.fieldPaths=name&updateMask.fieldPaths=band&updateMask.fieldPaths=age. --> and so on

Converting Curl command line to python

I am using the Fiware Orion Context Broker and I want to POST some data using a python script. The command line (which works fine) looks like this:
curl -X POST -H "Accept: application/json" -H "Fiware-ServicePath: /orion" -H "Fiware-Service: orion" -H "Content-Type: application/json" -d '{"id": "JetsonTX1", "type": "sensor", "title": {"type": "Text","value": "Init"}, "percentage": { "type": "Text", "value": "0%"}}' "http://141.39.159.63:1026/v2/entities/"
My Python script:
import requests
import json
url = 'http://141.39.159.63:1026/v2/entities/'
data = '''{
"title": {
"value": "demo",
"type": "Text"
},
"percentage": {
"type": "Text",
"value": "0%"
}'''
data_json = json.dumps(data)
headers = {"Accept": "application/json", "Fiware-ServicePath": "/bonseyes", "Fiware-Service": "bonseyes", "Content-Type": "application/json"}
response = requests.post(url, data=data_json, headers=headers)
print(response.json())
This is what response.json() returns:
{u'description': u'Errors found in incoming JSON buffer', u'error': u'ParseError'}
Any ideas how to fix this?
Thank you!
You probably should not pass data as string like so:
data = '''{
"title": {
"value": "demo",
"type": "Text"
},
"percentage": {
"type": "Text",
"value": "0%"
}'''
Pass it as normal dict:
data = {
"title": {
"value": "demo",
"type": "Text"
},
"percentage": {
"type": "Text",
"value": "0%"
}}
The request library will automatically convert this dictionary for you. Also make sure you want to use the data parameter and not the json. Below expert from the documentation should clear out why.
def post(url, data=None, json=None, **kwargs):
r"""Sends a POST request.
:param url: URL for the new :class:`Request` object.
:param data: (optional) Dictionary (will be form-encoded), bytes, or file-like object to send in the body of the :class:`Request`.
:param json: (optional) json data to send in the body of the :class:`Request`.
:param \*\*kwargs: Optional arguments that ``request`` takes.
:return: :class:`Response <Response>` object
:rtype: requests.Response
"""
From your comment it seems you should pass your data like so:
response = requests.post(url, json=data_json, headers=headers)
Because your endpoint requires json and not form-encoded bytes
And there is also missing curly brace at the end.
In my opinion you should try using keyValues option for OCB. It will make your payload way shorter. I use similar python program for updating values, therefore PATCH request in my approach:
#Sorting out url and payload for request
data = '{"' + attribute + '":' + value + '}'
headers = {'Content-type': 'application/json'}
url = urlOfOCB + '/v2/entities/' + entityId + '/attrs?options=keyValues'
r = requests.patch(url, (data), headers=headers)
You can read about this option here. As I can see you are not defining any new type for your attributes, therefore it will be "Text" by default, while using keyValues.
Attribute/metadata type may be omitted in requests. When omitted in attribute/metadata creation or in update operations, a default is used for the type depending on the value:
If value is a string, then type Text is used
If value is a number, then type Number is used.
If value is a boolean, then type Boolean is used.
If value is an object or array, then StructuredValue is used.
If value is null, then None is used.
More on those things you can find here.

Get all values from a JSON object and store in a flat array with Python

I am returning a JSON object from a requests call. I would like to get all the values from it and store them in a flat array.
My JSON object:
[
{
"link": "https://f.com/1"
},
{
"link": "https://f.com/2"
},
{
"link": "https://f.com/3"
}
]
I would like to store this as:
[https://f.com/things/1, https://f.com/things/2, https://f.com/things/3]
My code is as follows.. it is just printing each link out:
import requests
import json
def start_urls_data():
url = 'http://106309.n.com:3000/api/v1/product_urls?q%5Bcompany_in%5D%5B%5D=F'
headers = {'X-Api-Key': '1', 'Content-Type': 'application/json'}
r = requests.get(url, headers=headers)
start_urls_data = json.loads(r.content)
for i in start_urls_data:
print i['link']
You can use a simple list comprehension:
data = [
{
"link": "https://f.com/1"
},
{
"link": "https://f.com/2"
},
{
"link": "https://f.com/3"
}
]
print([x["link"] for x in data])
This code just loops through the list data and put the value of the key link from the dict element to a new list.

Categories