I am trying to update a document in Elasticsearch using the official Python client, with the command below.
res = es.update(index='its', doc_type='vents', id=txid, body={"doc":{"f_vent" :{"b_vent":rx_buffer}}})
The updated document is shown below.
{
"_index": "its",
"_type": "vents",
"_id": "4752956038",
"_score": null,
"_source": {
"ResponseTime": 0,
"Session": "None",
"Severity": "warn",
"StatusCode": 0,
"Subject": "Reporting Page Load Time",
"Time": "Fri Jun 05 2015 12:23:46 GMT+1200 (NZST)",
"Timestamp": "1433463826535",
"TransactionId": "4752956038",
"msgType": "0",
"tid": "1",
"f_vent": {
"b_vent": "{\"ActiveTransactions\": 6, \"AppName\": \"undefined\", \"TransactionId\": \"4752956038\", \"UserInfo\": \"Unknown\"}"
}
},
"fields": {
"_timestamp": 1433818222372
},
"sort": [
1433818222372
]
}
I copied this from the Kibana 4 Discover tab by expanding the document. The TransactionId inside b_vent has to be accessed as f_vent.b_vent.TransactionId. I suspect this is placing some restrictions on plotting a graph on TransactionId. I tried using
res = es.update(index='its', doc_type='vents', id=txid, body={"doc":{"b_vent":rx_buffer}})
so that I could use b_vent.TransactionId but I am getting the following error when calling es.update().
raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
RequestError: TransportError(400, u'MapperParsingException[object mapping for [vents] tried to parse field [b_vent] as object, but got EOF, has a concrete value been provided to it?]')
What am I doing wrong? How can I fix this problem?
This is almost the full structure of b_vent.
"{
\"ActiveTr\": 6,
\"ErrorM\": \"None\",
\"HError\": \"false\",
\"HMPct\": 62,
\"NHMPct\": 57,
\"Parameter\": \"1433195852706\",
\"ParameterD\": \"false\",
\"ProcessCPU\": 1,
\"Proxies\": \"None\",
\"RStatusCode\": \"34500\",
\"Severity\": \"info\",
\"ThrWtTi\": -1,
\"ThrWai\": 16,
\"Timestamp\": \"TueJun0209: 58: 16NZST2015\",
\"TxId\": \"316029416\",
\"UserInfo\": \"Unknown\"
}"
It does seem to have some strange escape sequences, and I am not sure why they are there, but json.loads() does parse the string. I don't know how to fix this issue.
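The escape sequences suggest b_vent is stored as a JSON-encoded string rather than an object, which is also why the mapping error complains about a concrete value in an object field. A minimal sketch of one possible fix, assuming rx_buffer holds the JSON string shown above: decode it with json.loads() before sending the update, so Elasticsearch can map b_vent as an object (the es and txid names mirror the question and are placeholders here):

```python
import json

# rx_buffer arrives as a JSON-encoded string, as seen in the stored document.
rx_buffer = '{"ActiveTransactions": 6, "TransactionId": "4752956038"}'

# Decode it into a dict before indexing, so b_vent becomes a real object
# and b_vent.TransactionId is addressable as a field.
parsed = json.loads(rx_buffer)

# Hypothetical update call, mirroring the one in the question:
# res = es.update(index='its', doc_type='vents', id=txid,
#                 body={"doc": {"b_vent": parsed}})
print(parsed["TransactionId"])
```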
I am working on a new project in HubSpot that returns nested JSON like the sample below. I am trying to access the associated contact's id but am struggling to reference it correctly (the id I am looking for is the value '201' in the example below). I've put together the script below, but it only returns the entire associations portion of the JSON, and I only want the id. How do I reference the id correctly?
Here is the output from the script:
{'contacts': {'paging': None, 'results': [{'id': '201', 'type': 'ticket_to_contact'}]}}
And here is the script I put together:
import hubspot
from hubspot.crm.tickets import ApiException
from pprint import pprint
client = hubspot.Client.create(api_key="API_KEY")
try:
api_response = client.crm.tickets.basic_api.get_page(limit=2, associations=["contacts"], archived=False)
for x in range(2):
pprint(api_response.results[x].associations)
except ApiException as e:
print("Exception when calling basic_api->get_page: %s\n" % e)
Here is what the full JSON looks like ('contacts' property shortened for readability):
{
"results": [
{
"id": "34018123",
"properties": {
"content": "Hi xxxxx,\r\n\r\nCan you clarify on how the blocking of script happens? Is it because of any CSP (or) the script will decide run time for every URL’s getting triggered from browser?\r\n\r\nRegards,\r\nLogan",
"createdate": "2019-07-03T04:20:12.366Z",
"hs_lastmodifieddate": "2020-12-09T01:16:12.974Z",
"hs_object_id": "34018123",
"hs_pipeline": "0",
"hs_pipeline_stage": "4",
"hs_ticket_category": null,
"hs_ticket_priority": null,
"subject": "RE: call followup"
},
"createdAt": "2019-07-03T04:20:12.366Z",
"updatedAt": "2020-12-09T01:16:12.974Z",
"archived": false
},
{
"id": "34018892",
"properties": {
"content": "Hi Guys,\r\n\r\nI see that we were placed back on the staging and then removed again.",
"createdate": "2019-07-03T07:59:10.606Z",
"hs_lastmodifieddate": "2021-12-17T09:04:46.316Z",
"hs_object_id": "34018892",
"hs_pipeline": "0",
"hs_pipeline_stage": "3",
"hs_ticket_category": null,
"hs_ticket_priority": null,
"subject": "Re: Issue due to server"
},
"createdAt": "2019-07-03T07:59:10.606Z",
"updatedAt": "2021-12-17T09:04:46.316Z",
"archived": false,
"associations": {
"contacts": {
"results": [
{
"id": "201",
"type": "ticket_to_contact"
}
]
}
}
}
],
"paging": {
"next": {
"after": "35406270",
"link": "https://api.hubapi.com/crm/v3/objects/tickets?associations=contacts&archived=false&hs_static_app=developer-docs-ui&limit=2&after=35406270&hs_static_app_version=1.3488"
}
}
}
You can do api_response.results[x].associations["contacts"]["results"][0]["id"].
Sorted this out, posting in case anyone else is struggling with the response from the HubSpot v3 Api. The response schema for this call is:
Response schema type: Object
String results[].id
Object results[].properties
String results[].createdAt
String results[].updatedAt
Boolean results[].archived
String results[].archivedAt
Object results[].associations
Object paging
Object paging.next
String paging.next.after
String paging.next.link
So to access the id of the contact associated with the ticket, you need to reference it using this notation:
api_response.results[1].associations["contacts"].results[0].id
notes:
results[x] - references the result at index x
associations["contacts"] - associations is a dictionary object, so you can access the contacts item by its name
associations["contacts"].results - a list, so reference an element by its index []
id - a string
In my case the type was ModelProperty or CollectionResponseProperty, so I couldn't get at it as a dict at all.
For the record, this got me through the results:
for result in list(api_response.results):
ID = result.id
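For readers who just need the contact ids, the lookup can be sketched on plain dicts shaped like the sample response above (the SDK may instead return model objects whose attributes mirror these keys; the api_result name and contact_ids helper are illustrative, not part of the HubSpot client):

```python
# One ticket result, using the structure from the sample JSON above.
api_result = {
    "id": "34018892",
    "associations": {
        "contacts": {"results": [{"id": "201", "type": "ticket_to_contact"}]}
    },
}

def contact_ids(result):
    """Return the ids of all contacts associated with one ticket result.

    Tickets without associations (like the first one in the sample)
    simply yield an empty list instead of raising KeyError.
    """
    assoc = (result.get("associations") or {}).get("contacts") or {}
    return [c["id"] for c in assoc.get("results", [])]

print(contact_ids(api_result))  # ['201']
```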
I am trying to extract data from NetSuite and load it into Azure Databricks by scripting a JSON config and running it through an Azure Data Factory pipeline. I get the error mentioned above:
ERROR ScalaDriverLocal: User Code Stack Trace:
java.lang.Exception: max value not updated
Could this be related to an error in the checkpoint table update?
I am providing the JSON script I used below. I hope someone can help me figure out the error. Thanks.
{
"parallelism": 1,
"onJobFailure": "Fail",
"onEmptyDF": "Fail",
"ignoreInvalidRows": true,
"cleanColumnNames": true,
"jobs": [
{
"name": "GenericPassThroughBatchJob.CURRENCY_EXCHANGE_RATE_L1",
"description": "Extract CURRENCY_EXCHANGE_RATE_L1 data from NetSuite",
"ignoreInvalidRows": true,
"cleanColumnNames": true,
"jdbcInputs": [
{
"dataFrameName": "CURRENCY_EXCHANGE_RATE_L1",
"driver": "com.netsuite.jdbc.openaccess.OpenAccessDriver",
"flavor": "oracle",
"url": "${spark.wsgc.jdbcUrl}",
"keyVaultAuth": {
"keyVaultParams": {
"clientId": "${spark.wsgc.clientId}",
"usernameKey": "${spark.wsgc.usernamekey}",
"passwordKey": "${spark.wsgc.passwordkey}",
"clientKey": "${spark.wsgc.clientkey}",
"vaultBaseUrl": "${spark.wsgc.vaultbaseurl}"
}
},
"incrementalParams": {
"checkpointTablePath": "dbfs:/mnt/data/governed/l1/audit/log/checkpoint_log/",
"extractId": "NETSUITE_CURRENCY_EXCHANGE_RATE",
"incrementalSql": "(select b.NAME as BASE_CURRENCY_CD, c.NAME as CURRENCY_CD, a.EXCHANGE_RATE, a.DATE_EFFECTIVE from Administrator.CURRENCY_EXCHANGE_RATES a left join Administrator.CURRENCIES b on a.BASE_CURRENCY_ID = b.CURRENCY_ID left join Administrator.CURRENCIES c on a.CURRENCY_ID = c.CURRENCY_ID) a1",
"maxCheckPoint1": "(select to_char(max(DATE_EFFECTIVE), 'DD-MM-YYYY HH24:MI:SS') from Administrator.CURRENCY_EXCHANGE_RATES where DATE_EFFECTIVE > to_date('%%{CHECKPOINT_VALUE_1}', 'YYYY-MM-DD HH24:MI:SS'))"
}
}
],
"fileOutputs": [
{
"dataFrameName": "CURRENCY_EXCHANGE_RATE_L1",
"format": "PARQUET",
"path": "dbfs:/mnt/data/governed/l1/global_netsuite/CurrencyExchangeRate/table/inbound/All_Currency_Exchange_Rate/",
"saveMode": "Overwrite"
},
{
"dataFrameName": "CURRENCY_EXCHANGE_RATE_L1",
"format": "DELTA",
"path": "dbfs:/mnt/data/governed/l1/global_netsuite/CurrencyExchangeRate/table/inbound_archive/All_Currency_Exchange_Rate/",
"saveMode": "Append"
}
]
}
]
}
This was an issue with the checkpoint table update. Once I corrected the checkpoint value, the error was resolved. I had to set the checkpoint value to min(DATE_LAST_MODIFIED) over the records in the table.
I have a collection containing country records, and I need to find a particular country by uid and its countryId.
Below is the sample collection data:
{
"uid": 15024,
"countries": [{
"countryId": 123,
"population": 45000000
},
{
"countryId": 456,
"population": 9000000000
}
]
},
{
"uid": 15025,
"countries": [{
"countryId": 987,
"population": 560000000
},
{
"countryId": 456,
"population": 8900000000
}
]
}
I have tried the below query in Python but am unable to find any result:
foundRecord = collection.find_one({"uid" : 15024, "countries.countryId": 456})
but it returns None.
Any help or suggestions appreciated.
I think the following will work better:
foundRecord = collection.find_one({"uid": 15024,
    "countries": {"$elemMatch": {"countryId": 456}}})
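Both the dotted-path form and $elemMatch match a field inside an array of embedded documents; the matching semantics can be replayed in plain Python on the sample data above (a sketch of what the server-side filter does, not a substitute for it):

```python
# The sample documents from the question, trimmed to the fields the query uses.
docs = [
    {"uid": 15024, "countries": [{"countryId": 123}, {"countryId": 456}]},
    {"uid": 15025, "countries": [{"countryId": 987}, {"countryId": 456}]},
]

def matches(doc, uid, country_id):
    """Mimics {"uid": uid, "countries.countryId": country_id}:
    the dotted path matches if ANY array element has that countryId."""
    return doc["uid"] == uid and any(
        c.get("countryId") == country_id for c in doc["countries"]
    )

found = next((d for d in docs if matches(d, 15024, 456)), None)
print(found["uid"])  # 15024
```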
Are you sure you're querying the same database / collection?
It seems that you're saving results in another collection.
I've tried to reproduce your problem and it works on my MongoDB (note that I'm using v4).
EDIT: It would be nice to see the piece of code where you define "collection".
This is my JSON:
{
"fInstructions": [
{
"id": 155,
"type":"finstruction",
"ref": "/spm/finstruction/155",
"iLineItem":[
{
"id": 156,
"type":"ilineitem",
"ref": "/spm/ilineitem/156",
"creationDate": "2018-03-09",
"dueDate":"2018-02-01",
"effectiveDate":"2018-03-09",
"frequency":"01",
"coveredPeriodFrom":"2018-02-28",
"coveredPeriodTo":"2018-02-28",
"statusCode":"PRO",
"amount": 6
},
{
"id": 157,
"type":"ilineitem",
"ref": "/spm/ilineitem/157",
"creationDate": "2018-03-09",
"dueDate":"2018-02-01",
"effectiveDate":"2018-03-09",
"frequency":"01",
"coveredPeriodFrom":"2018-03-01",
"coveredPeriodTo":"2018-03-31",
"statusCode":"PRO",
"amount": 192
}
]
}
]
}
If I do:
json_normalize(data['fInstructions'], record_path=['iLineItem'])
I get two rows as expected with all the ILIs. However, I also want the parent attributes id and type in the result set. To do that I try:
json_normalize(df_data_1['fInstructions'], record_path=['iLineItem'], meta=['id', 'type'])
But then I get:
ValueError: Conflicting metadata name id, need distinguishing prefix
So I try:
json_normalize(df_data_1['fInstructions'], record_path=['iLineItem'], meta=['fInstructions.id'])
Which gives me:
KeyError: "Try running with errors='ignore' as key 'fInstructions.id' is not always present"
The answer is:
json_normalize(df_data_1['fInstructions'], record_path=['iLineItem'], meta='id', record_prefix='ils.')
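With the sample document above, the accepted call can be checked end-to-end. The key point is that record_prefix renames the child columns (id becomes ils.id), so the parent meta column 'id' no longer collides with the record's own 'id' (trimmed data for brevity; the modern spelling is pd.json_normalize):

```python
import pandas as pd

data = {
    "fInstructions": [
        {
            "id": 155,
            "type": "finstruction",
            "iLineItem": [
                {"id": 156, "amount": 6},
                {"id": 157, "amount": 192},
            ],
        }
    ]
}

# record_prefix renames the flattened record columns, avoiding the
# "Conflicting metadata name id" error when meta='id' is requested.
df = pd.json_normalize(data["fInstructions"],
                       record_path=["iLineItem"],
                       meta="id",
                       record_prefix="ils.")
print(df.columns.tolist())  # ['ils.id', 'ils.amount', 'id']
```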
I have the following code:
client = Elasticsearch(hosts=['host'], port=9200)
scan_args = {'query': {'slice': {'max': 1, 'id': 0}}, 'preference': '_shards:0', 'index': u'my_index'}
for hit in scan(client, **scan_args):
# do something with hit
and I get the following error
RequestError: TransportError(400, u'parsing_exception', u'[slice] failed to parse field [max]')
How should the slice parameter be passed in the scan function?
In my experience, "max" needs to be greater than 1; I saw the same error when using "max": 1.
The raw error from the HTTP API confirms that max must be greater than 1:
{
"error": {
"root_cause": [
{
"type": "x_content_parse_exception",
"reason": "[3:20] [slice] failed to parse field [max]"
}
],
"type": "x_content_parse_exception",
"reason": "[3:20] [slice] failed to parse field [max]",
"caused_by": {
"type": "illegal_argument_exception",
"reason": "max must be greater than 1"
}
},
"status": 400
}
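Sliced scrolling is meant to split one scroll across several workers, so each worker gets its own id in [0, max) with max >= 2. A sketch of building the per-worker slice queries (the slice_query helper is illustrative; no cluster is needed to build the dicts, which would then be passed to elasticsearch.helpers.scan):

```python
def slice_query(slice_id, max_slices, base_query=None):
    """Return a copy of base_query with a slice clause for one worker."""
    q = dict(base_query or {"query": {"match_all": {}}})
    q["slice"] = {"id": slice_id, "max": max_slices}
    return q

# Two workers, each scanning half the index:
queries = [slice_query(i, 2) for i in range(2)]

# Each query would then be used like:
# for hit in scan(client, query=queries[0], index="my_index"): ...
print(queries[0]["slice"])  # {'id': 0, 'max': 2}
```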