This question already has answers here:
How to extract data from dictionary in the list
(3 answers)
Closed 11 months ago.
I have the following json output.
"detections": [
{
"source": "detection",
"uuid": "50594028",
"detectionTime": "2022-03-27T06:50:56Z",
"ingestionTime": "2022-03-27T07:04:50Z",
"filters": [
{
"id": "F2058",
"unique_id": "3638f7c0",
"level": "critical",
"name": "Possible Right-To-Left Override Attack",
"description": "Possible Right-To-Left Override Detected in the Filename",
"tactics": [
"TA0005"
],
"techniques": [
"T1036.002"
],
"highlightedObjects": [
{
"field": "fileName",
"type": "filename",
"value": [
"1465940311.,S=473394(NONAMEFL(Z00057-PIfdp.exe))"
]
},
{
"field": "filePathName",
"type": "fullpath",
"value": "/exports/10_19/mail/12/91/20193/new/1465940311.,S=473394(NONAMEFL(Z00057-PIfdp.exe))"
},
{
"field": "malName",
"type": "detection_name",
"value": "HEUR_RLOTRICK.A"
},
{
"field": "actResult",
"type": "text",
"value": [
"Passed"
]
},
{
"field": "scanType",
"type": "text",
"value": "REALTIME"
}
]
},
{
"id": "F2140",
"unique_id": "5a313874",
"level": "medium",
"name": "Malicious Software",
"description": "A malicious software was detected on an endpoint.",
"tactics": [],
"techniques": [],
"highlightedObjects": [
{
"field": "fileName",
"type": "filename",
"value": [
"1465940311.,S=473394(NONAMEFL(Z00057-PIfdp.exe))"
]
},
{
"field": "filePathName",
"type": "fullpath",
"value": "/exports/10_19/mail/12/91/rs001291-excluido-20193/new/1465940311.,S=473394(NONAMEFL(Z00057-PIfdp.exe))"
},
{
"field": "malName",
"type": "detection_name",
"value": "HEUR_RLOTRICK.A"
},
{
"field": "actResult",
"type": "text",
"value": [
"Passed"
]
},
{
"field": "scanType",
"type": "text",
"value": "REALTIME"
},
{
"field": "endpointIp",
"type": "ip",
"value": [
"xxx.xxx.xxx"
]
}
]
}
],
"entityType": "endpoint",
"entityName": "xxx(xxx.xxx.xxx)",
"endpoint": {
"name": "xxx",
"guid": "d1dd7e61",
"ips": [
"2xx.xxx.xxx"
]
}
}
Inside the 'filters' offset it brings me two levels, one critical and one medim, both with the variable 'name'.
I want to print only the first name, but when I print the 'name', it returns both names:
How do I print only the first one?
If I put print in for filters, it returns both names:
If I put print in for detections, it only returns the second 'name' and that's not what I want:
If you only want to print the name of the first filter, why iterate over it, just index it and print the value under "name":
for d in r['detections']:
print(d['filters'][0]['name'])
Related
I started using Python Cubes Olap recently.
I'm trying to sum/avg a JSON postgres column, how can i do this?
my db structure:
events
id
object_type
sn_name
spectra
id
snx_wavelengths (json column)
event_id
my json:
{
"dimensions": [
{
"name": "event",
"levels": [
{
"name": "object_type",
"label": "Object Type",
"attributes": [
"object_type"
]
},
{
"name": "sn_name",
"label": "name",
"attributes": [
"sn_name"
]
}
]
},
{
"name": "spectra",
"levels": [
{
"name": "catalog_name",
"label": "Catalog Name",
"attributes": [
"catalog_name"
]
},
{
"name": "capture_date",
"label": "Capture Date",
"attributes": [
"capture_date"
]
}
]
},
{
"name": "date"
}
],
"cubes": [
{
"id": "uid",
"name": "14G31Yx98ZG8aEhFHjOWNNBmFOETg5APjZo5AiHaqog5YxLMK5",
"dimensions": [
"event",
"spectra",
"date"
],
"aggregates": [
{
"name": "event_snx_wavelengths_sum",
"function": "sum",
"measure": "event.snx_wavelengths"
},
{
"name": "record_count",
"function": "count"
}
],
"joins": [
{
"master": "14G31Yx98ZG8aEhFHjOWNNBmFOETg5APjZo5AiHaqog5YxLMK5.id",
"detail": "spectra.event_id"
},
],
"mappings": {
"event.sn_name": "sn_name",
"event.object_type": "object_type",
"spectra.catalog_name": "spectra.catalog_name",
"spectra.capture_date": "spectra.capture_date",
"event.snx_wavelengths": "spectra.snx_wavelengths",
"date": "spectra.capture_date"
},
}
]
}
I'm getting the follow error:
Unknown attribute ''event.snx_wavelengths''
Anyone can help?
I already tried use mongodb to do the sum, i didnt had success.
I currently have two JSONS that I want to merge into one singular JSON, additionally I want to add in a slight change.
Firstly, these are the two JSONS in question.
An intents JSON:
[
{
"ID": "G1",
"intent": "password_reset",
"examples": [
{
"text": "I forgot my password"
},
{
"text": "I can't log in"
},
{
"text": "I can't access the site"
},
{
"text": "My log in is failing"
},
{
"text": "I need to reset my password"
}
]
},
{
"ID": "G2",
"intent": "account_closure",
"examples": [
{
"text": "I want to close my account"
},
{
"text": "I want to terminate my account"
}
]
},
{
"ID": "G3",
"intent": "account_creation",
"examples": [
{
"text": "I want to open an account"
},
{
"text": "Create account"
}
]
},
{
"ID": "G4",
"intent": "complaint",
"examples": [
{
"text": "A member of staff was being rude"
},
{
"text": "I have a complaint"
}
]
}
]
and an entities JSON:
[
{
"ID": "K1",
"entity": "account_type",
"values": [
{
"type": "synonyms",
"value": "business",
"synonyms": [
"corporate"
]
},
{
"type": "synonyms",
"value": "personal",
"synonyms": [
"vanguard",
"student"
]
}
]
},
{
"ID": "K2",
"entity": "beverage",
"values": [
{
"type": "synonyms",
"value": "hot",
"synonyms": [
"heated",
"warm"
]
},
{
"type": "synonyms",
"value": "cold",
"synonyms": [
"ice",
"freezing"
]
}
]
}
]
The expected outcome is to create a JSON file that mimics this structure:
{
"intents": [
{
"intent": "password_reset",
"examples": [
{
"text": "I forgot my password"
},
{
"text": "I want to reset my password"
}
],
"description": "Reset a user password"
}
],
"entities": [
{
"entity": "account_type",
"values": [
{
"type": "synonyms",
"value": "business",
"synonyms": [
"company",
"corporate",
"enterprise"
]
},
{
"type": "synonyms",
"value": "personal",
"synonyms": []
}
],
"fuzzy_match": true
}
],
"metadata": {
"api_version": {
"major_version": "v2",
"minor_version": "2018-11-08"
}
},
"dialog_nodes": [
{
"type": "standard",
"title": "anything_else",
"output": {
"generic": [
{
"values": [
{
"text": "I didn't understand. You can try rephrasing."
},
{
"text": "Can you reword your statement? I'm not understanding."
},
{
"text": "I didn't get your meaning."
}
],
"response_type": "text",
"selection_policy": "sequential"
}
]
},
"conditions": "anything_else",
"dialog_node": "Anything else",
"previous_sibling": "node_4_1655399659061",
"disambiguation_opt_out": true
},
{
"type": "event_handler",
"output": {
"generic": [
{
"title": "What type of account do you hold with us?",
"options": [
{
"label": "Personal",
"value": {
"input": {
"text": "personal"
}
}
},
{
"label": "Business",
"value": {
"input": {
"text": "business"
}
}
}
],
"response_type": "option"
}
]
},
"parent": "slot_9_1655398217028",
"event_name": "focus",
"dialog_node": "handler_6_1655398217052",
"previous_sibling": "handler_7_1655398217052"
},
{
"type": "event_handler",
"output": {},
"parent": "slot_9_1655398217028",
"context": {
"account_type": "#account_type"
},
"conditions": "#account_type",
"event_name": "input",
"dialog_node": "handler_7_1655398217052"
},
{
"type": "standard",
"title": "business_account",
"output": {
"generic": [
{
"values": [
{
"text": "We have notified your corporate security team, they will be in touch to reset your password."
}
],
"response_type": "text",
"selection_policy": "sequential"
}
]
},
"parent": "node_3_1655397279884",
"next_step": {
"behavior": "jump_to",
"selector": "body",
"dialog_node": "node_4_1655399659061"
},
"conditions": "#account_type:business",
"dialog_node": "node_1_1655399028379",
"previous_sibling": "node_3_1655399027429"
},
{
"type": "standard",
"title": "intent_collection",
"output": {
"generic": [
{
"values": [
{
"text": "Thank you for confirming that you want to reset your password."
}
],
"response_type": "text",
"selection_policy": "sequential"
}
]
},
"next_step": {
"behavior": "jump_to",
"selector": "body",
"dialog_node": "node_3_1655397279884"
},
"conditions": "#password_reset",
"dialog_node": "node_3_1655396920143",
"previous_sibling": "Welcome"
},
{
"type": "frame",
"title": "account_type_confirmation",
"output": {
"generic": [
{
"values": [
{
"text": "Thank you"
}
],
"response_type": "text",
"selection_policy": "sequential"
}
]
},
"parent": "node_3_1655396920143",
"context": {},
"next_step": {
"behavior": "skip_user_input"
},
"conditions": "#password_reset",
"dialog_node": "node_3_1655397279884"
},
{
"type": "standard",
"title": "personal_account",
"output": {
"generic": [
{
"values": [
{
"text": "We have sent you an email with a password reset link."
}
],
"response_type": "text",
"selection_policy": "sequential"
}
]
},
"parent": "node_3_1655397279884",
"next_step": {
"behavior": "jump_to",
"selector": "body",
"dialog_node": "node_4_1655399659061"
},
"conditions": "#account_type:personal",
"dialog_node": "node_3_1655399027429"
},
{
"type": "standard",
"title": "reset_confirmation",
"output": {
"generic": [
{
"values": [
{
"text": "Do you need assistance with anything else today?"
}
],
"response_type": "text",
"selection_policy": "sequential"
}
]
},
"digress_in": "does_not_return",
"dialog_node": "node_4_1655399659061",
"previous_sibling": "node_3_1655396920143"
},
{
"type": "slot",
"output": {},
"parent": "node_3_1655397279884",
"variable": "$account_type",
"dialog_node": "slot_9_1655398217028",
"previous_sibling": "node_1_1655399028379"
},
{
"type": "standard",
"title": "welcome",
"output": {
"generic": [
{
"values": [
{
"text": "Hello. How can I help you?"
}
],
"response_type": "text",
"selection_policy": "sequential"
}
]
},
"conditions": "welcome",
"dialog_node": "Welcome"
}
],
"counterexamples": [],
"system_settings": {
"off_topic": {
"enabled": true
},
"disambiguation": {
"prompt": "Did you mean:",
"enabled": true,
"randomize": true,
"max_suggestions": 5,
"suggestion_text_policy": "title",
"none_of_the_above_prompt": "None of the above"
},
"human_agent_assist": {
"prompt": "Did you mean:"
},
"intent_classification": {
"training_backend_version": "v2"
},
"spelling_auto_correct": true
},
"learning_opt_out": false,
"name": "Reset Password",
"language": "en",
"description": "Basic Password Reset Request"
}
So what I am missing in my original files, is essentially:
"intents":
and for the entities file:
"entities"
at the start of each list of dictionaries.
Additionally, I would need to wrap the whole thing in curly braces to comply with json formatting.
As seen, the final goal is not just appending these two to one another but the file technically continues with some other JSON code that I have yet to write and deal with.
My question now is as follows; by what method can I either add in these words and the braces to the individual files, then combine them into a singular JSON or alternatively by what method can I read in these files and combine them with the changes all in one go?
The new output file closing on a curly brace after the entities list of dicts is an acceptable outcome for me at the time, so that I can continue to make changes and hopefully further learn from this how to do these changes in future when I get there.
TIA
JSON is only a string format, you can it load in a language structure, in python that is list and dict, do what you need then dump it back, so you don't "add strings" and "add brackets", on modify the structure
file = 'intents.txt'
intents = json.load(open(file)) # load a list
file = 'entities.txt'
entities = json.load(open(file)) # load a list
# create a dict
content = {
"intents": intents,
"entities": entities
}
json.dump(content, open(file, "w"))
If you're reading all the json in as a string, you can just prepend "{'intents':" to the start and append a closing "}".
myJson = "your json string"
myWrappedJson = '{"intents":' + myJson + "}"
How to convert the complex Json format to python? I feel difficulty in converting the attached complex json to python object and I have to validate this data later against the DB.
Json:
{
"namespace":"Data.Datapoint",
"type":"record",
"name":"Blood Donar",
"fields":[
{
"name":"id",
"type":"int"
},
{
"name":"donor_number",
"type":"string"
},
{
"name":"birth_date",
"type":{
"type":"int",
"logicalType":"date"
},
"doc":"Birth Date"
},
{
"name":"height",
"type":[
"int",
"null"
],
"doc":"Height"
},
{
"name":"applicant_ts",
"type":[
{
"type":"long",
"logicalType":"timestamp-millis"
},
"null"
],
"doc":"Creation Timestamp"
},
{
"name":"arm_preference_ind",
"type":[
"string",
"null"
],
"doc":"Arm Preference; Selection from list"
},
{
"name":"abo_ind",
"type":[
"string",
"null"
],
"doc":"Blood Type/ABO"
},
{
"name":"vein_grading_ind",
"type":[
"string",
"null"
],
"doc":"Vein Grade"
}
]
}
import json
data = '''
{ "namespace": "Data.Datapoint", "type": "record", "name": "Blood Donar", "fields": [ { "name": "id", "type": "int" }, { "name": "donor_number", "type": "string" }, { "name": "birth_date", "type": { "type": "int", "logicalType": "date" }, "doc": "Birth Date" }, { "name": "height", "type": [ "int", "null" ], "doc": "Height" }, { "name": "applicant_ts", "type": [ { "type": "long", "logicalType": "timestamp-millis" }, "null" ], "doc": "Creation Timestamp" }, { "name": "arm_preference_ind", "type": [ "string", "null" ], "doc": "Arm Preference; Selection from list" }, { "name": "abo_ind", "type": [ "string", "null" ], "doc": "Blood Type/ABO" }, { "name": "vein_grading_ind", "type": [ "string", "null" ], "doc": "Vein Grade" } ] }
'''
json_data = json.loads(data)
json_data is your python dict obj.
if you want json data from web you can try this
import json
import requests
response = requests.get("https://jsonplaceholder.typicode.com/todos")
todos = json.loads(response.text)
I have a json file where I need to read it in a structured way to insert in a database each value in its respective column, but in the tag "customFields" the fields change index, example: "Tribe / Customer" can be index 0 (row['customFields'][0]) in a json block, and in the other one be index 3 (row['customFields'][3]), so I tried to read the data using the name of the row field ['customFields'] ['Tribe / Customer'], but I got the error below:
TypeError: list indices must be integers or slices, not str
Script:
def getCustomField(ModelData):
for row in ModelData["data"]["squads"][0]["cards"]:
print(row['identifier'],
row['customFields']['Tribe / Customer'],
row['customFields']['Stopped with'],
row['customFields']['Sub-Activity'],
row['customFields']['Activity'],
row['customFields']['Complexity'],
row['customFields']['Effort'])
if __name__ == "__main__":
f = open('test.json')
json_file = json.load(f)
getCustomField(json_file)
JSON:
{
"data": {
"squads": [
{
"name": "TESTE",
"cards": [
{
"identifier": "0102",
"title": "TESTE",
"description": " TESTE ",
"status": "on_track",
"priority": null,
"assignees": [
{
"fullname": "TESTE",
"email": "TESTE"
}
],
"createdAt": "2020-04-16T15:00:31-03:00",
"secondaryLabel": null,
"primaryLabels": [
"TESTE",
"TESTE"
],
"swimlane": "TESTE",
"workstate": "Active",
"customFields": [
{
"name": "Tribe / Customer",
"value": "TESTE 1"
},
{
"name": "Checkpoint",
"value": "GNN"
},
{
"name": "Stopped with",
"value": null
},
{
"name": "Sub-Activity",
"value": "DEPLOY"
},
{
"name": "Activity",
"value": "TOOL"
},
{
"name": "Complexity",
"value": "HIGH"
},
{
"name": "Effort",
"value": "20"
}
]
},
{
"identifier": "0103",
"title": "TESTE",
"description": " TESTE ",
"status": "on_track",
"priority": null,
"assignees": [
{
"fullname": "TESTE",
"email": "TESTE"
}
],
"createdAt": "2020-04-16T15:00:31-03:00",
"secondaryLabel": null,
"primaryLabels": [
"TESTE",
"TESTE"
],
"swimlane": "TESTE",
"workstate": "Active",
"customFields": [
{
"name": "Tribe / Customer",
"value": "TESTE 1"
},
{
"name": "Stopped with",
"value": null
},
{
"name": "Checkpoint",
"value": "GNN"
},
{
"name": "Sub-Activity",
"value": "DEPLOY"
},
{
"name": "Activity",
"value": "TOOL"
},
{
"name": "Complexity",
"value": "HIGH"
},
{
"name": "Effort",
"value": "20"
}
]
}
]
}
]
}
}
You'll have to parse the list of custom fields into something you can access by name. Since you're accessing multiple entries from the same list, a dictionary is the most appropriate choice.
for row in ModelData["data"]["squads"][0]["cards"]:
custom_fields_dict = {field['name']: field['value'] for field in row['customFields']}
print(row['identifier'],
custom_fields_dict['Tribe / Customer'],
...
)
If you only wanted a single field you could traverse the list looking for a match, but it would be less efficient to do that repeatedly.
I'm skipping over dealing with missing fields - you'd probably want to use get('Tribe / Customer', some_reasonable_default) if there's any possibility of the field not being present in the json list.
I have the following json
{
"response": {
"message": null,
"exception": null,
"context": [
{
"headers": null,
"name": "aname",
"children": [
{
"type": "cluster-connectivity",
"name": "cluster-connectivity"
},
{
"type": "consistency-groups",
"name": "consistency-groups"
},
{
"type": "devices",
"name": "devices"
},
{
"type": "exports",
"name": "exports"
},
{
"type": "storage-elements",
"name": "storage-elements"
},
{
"type": "system-volumes",
"name": "system-volumes"
},
{
"type": "uninterruptible-power-supplies",
"name": "uninterruptible-power-supplies"
},
{
"type": "virtual-volumes",
"name": "virtual-volumes"
}
],
"parent": "/clusters",
"attributes": [
{
"value": "true",
"name": "allow-auto-join"
},
{
"value": "0",
"name": "auto-expel-count"
},
{
"value": "0",
"name": "auto-expel-period"
},
{
"value": "0",
"name": "auto-join-delay"
},
{
"value": "1",
"name": "cluster-id"
},
{
"value": "true",
"name": "connected"
},
{
"value": "synchronous",
"name": "default-cache-mode"
},
{
"value": "true",
"name": "default-caw-template"
},
{
"value": "blah",
"name": "default-director"
},
{
"value": [
"blah",
"blah"
],
"name": "director-names"
},
{
"value": [
],
"name": "health-indications"
},
{
"value": "ok",
"name": "health-state"
},
{
"value": "1",
"name": "island-id"
},
{
"value": "blah",
"name": "name"
},
{
"value": "ok",
"name": "operational-status"
},
{
"value": [
],
"name": "transition-indications"
},
{
"value": [
],
"name": "transition-progress"
}
],
"type": "cluster"
}
],
"custom-data": null
}
}
which im trying to parse using the json module in python. I am only intrested in getting the following information out of it.
Name Value
operational-status Value
health-state Value
Here is what i have tried.
in the below script data is the json returned from a webpage
json = json.loads(data)
healthstate= json['response']['context']['operational-status']
operationalstatus = json['response']['context']['health-status']
Unfortunately i think i must be missing something as the above results in an error that indexes must be integers not string.
if I try
healthstate= json['response'][0]
it errors saying index 0 is out of range.
Any help would be gratefully received.
json['response']['context'] is a list, so that object requires you to use integer indices.
Each item in that list is itself a dictionary again. In this case there is only one such item.
To get all "name": "health-state" dictionaries out of that structure you'd need to do a little more processing:
[attr['value'] for attr in json['response']['context'][0]['attributes'] if attr['name'] == 'health-state']
would give you a list of of matching values for health-state in the first context.
Demo:
>>> [attr['value'] for attr in json['response']['context'][0]['attributes'] if attr['name'] == 'health-state']
[u'ok']
You have to follow the data structure. It's best to interactively manipulate the data and check what every item is. If it's a list you'll have to index it positionally or iterate through it and check the values. If it's a dict you'll have to index it by it's keys. For example here is a function that get's the context and then iterates through it's attributes checking for a particular name.
def get_attribute(data, attribute):
for attrib in data['response']['context'][0]['attributes']:
if attrib['name'] == attribute:
return attrib['value']
return 'Not Found'
>>> data = json.loads(s)
>>> get_attribute(data, 'operational-status')
u'ok'
>>> get_attribute(data, 'health-state')
u'ok'
json['reponse']['context'] is a list, not a dict. The structure is not exactly what you think it is.
For example, the only "operational status" I see in there can be read with the following:
json['response']['context'][0]['attributes'][0]['operational-status']