python parse from a json array with duplicate names (mitre att&ck)

I am a novice trying to parse data from a repo of mitre att&ck json files and am stuck on how to parse data for one of the fields - attack phase names. They are stored in an array and there are sometimes duplicate names, see below:
"type": "attack-pattern",
"kill_chain_phases": [
{
"kill_chain_name": "mitre-attack",
"phase_name": "persistence"
},
{
"kill_chain_name": "mitre-attack",
"phase_name": "privilege-escalation"
}
],
If I try to return values for get_phase(attack.kill_chain_phases[0].phase_name), python only returns one value when there are sometimes multiple values, like privilege-escalation
If I try to mess around and use something like this get_phase(attack.kill_chain_phases[0].phase_name[0]) the output is the first character of one of the phase names c
If I try to do something like get_phase(attack_pattern.kill_chain_phases[1].phase_name) I get an out of index error...
Does anyone have an idea on how I can go about using python to grab these fields? Also does anyone know how to describe this data format and/or what I'm trying to do so I can try to search for solutions? Thanks in advance!

You're probably looking for something like a for loop. A simple example would be something like:
for phase in attack_pattern.kill_chain_phases:
    get_phase(phase.phase_name)

You will want to use a loop for this. Get the parent of the items you want to find all the values for, and then you can loop through and get all the children's values.
import json
json_string = """{
"type": "attack-pattern",
"kill_chain_phases": [
{
"kill_chain_name": "mitre-attack",
"phase_name": "persistence"
},
{
"kill_chain_name": "mitre-attack",
"phase_name": "privilege-escalation"
}
]
}
"""
parsed_json = json.loads(json_string)
# Loop through the parent "kill_chain_phases"
for kill_chain_phase in parsed_json["kill_chain_phases"]:
    # Print out each child's "phase_name" value
    print(kill_chain_phase["phase_name"])
You get the first character for get_phase(attack.kill_chain_phases[0].phase_name[0]) because:
attack.kill_chain_phases[0].phase_name = "persistence"
Indexing a string works like indexing a list of its characters:
["p", "e", "r", "s", "i", "s", "t", "e", "n", "c", "e"]
So phase_name[0] is "p".
Hope that makes sense.
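On the "how do I describe this data format" part: kill_chain_phases is a JSON array of objects, which json.loads turns into a Python list of dicts, so "iterate over a list of dicts" is a useful search phrase. Here is a minimal sketch that collects every phase name and drops repeats. The helper name phase_names and the third, duplicated entry in the sample are my own assumptions for illustration, not part of the ATT&CK data.

```python
import json

# Hypothetical sample with the same shape as the question's data,
# plus one repeated phase to show the de-duplication.
raw = """{
  "type": "attack-pattern",
  "kill_chain_phases": [
    {"kill_chain_name": "mitre-attack", "phase_name": "persistence"},
    {"kill_chain_name": "mitre-attack", "phase_name": "privilege-escalation"},
    {"kill_chain_name": "mitre-attack", "phase_name": "persistence"}
  ]
}"""


def phase_names(attack_pattern):
    """Return the unique phase names in first-seen order."""
    # .get() with a default avoids a KeyError when the field is missing.
    phases = attack_pattern.get("kill_chain_phases", [])
    names = [phase["phase_name"] for phase in phases]
    # dict.fromkeys keeps insertion order and drops repeated names.
    return list(dict.fromkeys(names))


pattern = json.loads(raw)
print(phase_names(pattern))
```

Because the helper loops over whatever is in the list, it also avoids the index error you get from hard-coding [1] on an entry that has only one phase.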

Related

Is there an efficient way to write data to a JSON file using a dictionary in Python?

I'm writing a program in Python to use an API that needs to get input from a JSON payload in a really specific way which is shown below. The poid element will contain a different number with each run of the program, the inventories element contains a list of dictionaries that I am trying to send to the API.
[
{
"poid":"22130",
"inventories":
[
{
"item": "SAMPLE-ITEM-1",
"mfgr": "SAMPLE-MANUFACTURER-1",
"quantity": "1",
"condition": "REF"
},
{
"item": "SAMPLE-ITEM-2",
"mfgr": "SAMPLE-MANUFACTURER-2",
"quantity": "3",
"condition": "REF"
}
]
}
]
The data I need to put into the file is stored in a dictionary and a list as shown below. For simplicity of this post, I'm showing what the dictionary and list would look like after another method creates them. I'm not sure if this is the most efficient way of storing this data when I'm having to write it to JSON.
pn_and_mfgr_dict = {'SAMPLE-ITEM-1': 'SAMPLE-MANUFACTURER-1', 'SAMPLE-ITEM-2': 'SAMPLE-MANUFACTURER-2'}
quantities = ["1","3"]
poid = 22130 #this will be different each run
If it makes sense from what I've written above, I need to generate a JSON file that looks like the first codeblock given the information from the second codeblock. The item at index 0 in the quantities list corresponds to the first key/value pair in the dictionary and so on. The "condition" value in the first codeblock will always have "REF" as its value for my use, but I need to also include that in the final payload that gets sent to the API. Since the part number and manufacturer dictionary will be a different length with each run, I also need this method to work regardless of how many values are in the dictionary. This dictionary and the quantities list will always be the same length though. I think the best way I can solve this is making a for loop that iterates through the dictionary and puts respective data where it needs to be, then reading the file when the for loop is done and sending it as the payload but please correct me if there's a better way to do this like storing everything in variables. I also have no experience with JSON so I have attempted to use JSON libraries to accomplish this with no idea what I'm doing wrong. I can edit this with my attempts tonight but I wanted to post this as soon as possible.
Here is one possible solution:
import json
pn_and_mfgr_dict = {
    'SAMPLE-ITEM-1': 'SAMPLE-MANUFACTURER-1',
    'SAMPLE-ITEM-2': 'SAMPLE-MANUFACTURER-2'
}
quantities = ['1', '3']
poid = 22130
payload = {
    'poid': poid,
    'inventories': [{
        'item': item,
        'mfgr': mfgr,
        'quantity': quantity,
        'condition': 'REF'
    } for (item, mfgr), quantity in zip(pn_and_mfgr_dict.items(), quantities)]
}
print(json.dumps(payload, indent=2))
The code above will result in:
{
"poid": 22130,
"inventories": [
{
"item": "SAMPLE-ITEM-1",
"mfgr": "SAMPLE-MANUFACTURER-1",
"quantity": "1",
"condition": "REF"
},
{
"item": "SAMPLE-ITEM-2",
"mfgr": "SAMPLE-MANUFACTURER-2",
"quantity": "3",
"condition": "REF"
}
]
}
Naturally, you can adjust that for multiple poids with something like this:
poids = [22130, 22131, 22132]
for poid in poids:
    # implement here the logic to get items and quantities for
    # each poid
    payload = {
        'poid': poid,
        'inventories': [{
            'item': item,
            'mfgr': mfgr,
            'quantity': quantity,
            'condition': 'REF'
        } for (item, mfgr), quantity in zip(pn_and_mfgr_dict.items(), quantities)]
    }
    print(json.dumps(payload, indent=2))
You will need to change it so that each poid gets its corresponding items and quantities; I leave that as a starting point for you to implement.
Your second code block is your input, so you could start right away by writing a function that takes those inputs and returns a JSON string.
import json
from typing import Dict, List
def jsonify_data(pn_and_mfgr_dict: Dict, quantities: List, poid: int):
    constructed_data = []  # TODO
    return json.dumps(constructed_data)
Then you can work on using the inputs to construct the output data you want. You already know how to do it:
I think the best way I can solve this is making a for loop that iterates through the dictionary and puts respective data where it needs to be
Yes, that's the way to do it.
Here's my version of the solution:
import json
from typing import Dict, List
def jsonify_data(pn_and_mfgr_dict: Dict, quantities: List, poid: int):
    inventories = [
        {
            'item': item,
            'mfgr': mfgr,
            'quantity': quantity,
            'condition': 'REF',
        } for (item, mfgr), quantity in zip(pn_and_mfgr_dict.items(), quantities)
    ]
    constructed_data = [
        {
            'poid': f'{poid}',
            'inventories': inventories,
        }
    ]
    return json.dumps(constructed_data)
import json
data = {'inventories': [{'SAMPLE-ITEM-1': 'SAMPLE-MANUFACTURER-1'}, {'SAMPLE-ITEM-2': 'SAMPLE-MANUFACTURER-2'}]}
quantities = ["1", "3"]
poid = 22130
# Add poid to data
data['poid'] = poid
# Add quantities to data
for item in data['inventories']:
    item['quantity'] = quantities.pop(0)
# Serializing json
json_object = json.dumps(data, indent=4)
print(json_object)
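On the question of writing to a file first: json.dumps already returns the payload as a string, so an intermediate file is probably not needed unless the API client requires one. Below is a minimal sketch along the lines of the answers above that also checks the two inputs have the same length, since zip would otherwise silently drop the extras. The function name build_payload and the condition parameter are my own assumptions, and the pairing relies on dicts keeping insertion order (guaranteed from Python 3.7).

```python
import json


def build_payload(pn_and_mfgr, quantities, poid, condition="REF"):
    """Build the list-wrapped payload shown in the question."""
    # zip() stops at the shorter input, so check the lengths explicitly.
    if len(pn_and_mfgr) != len(quantities):
        raise ValueError("each item needs exactly one quantity")
    inventories = [
        {"item": item, "mfgr": mfgr, "quantity": qty, "condition": condition}
        for (item, mfgr), qty in zip(pn_and_mfgr.items(), quantities)
    ]
    # The target format stores poid as a string inside a one-element list.
    return [{"poid": str(poid), "inventories": inventories}]


payload = build_payload(
    {
        "SAMPLE-ITEM-1": "SAMPLE-MANUFACTURER-1",
        "SAMPLE-ITEM-2": "SAMPLE-MANUFACTURER-2",
    },
    ["1", "3"],
    22130,
)
body = json.dumps(payload, indent=2)  # string ready to send as the request body
print(body)
```

If a file is needed after all, json.dump(payload, f) writes the same structure to an open file object f.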

Need Help to Realize a Function in a Python

I need to make a function that can check a given word and return another word. For example, if the input is "a1", the function will check for this word in the dictionary and return "a".
I know how to code it if it is just a single input word per category using a simple if-else, but I'm still confused if a category has more than 3 words. And I plan to have a lot of data in this dictionary. So a simple if-else would need a lot of code to be written.
this is the example for the input & output that i want:
Input : a2 Output : a
Input : b3 Output : b
If you really just need to strip a single digit number on the end (per your example):
words=['a1','a2','a3','a4','b1','b2','b3','c1','c2','d1']
realwords= set() # empty set, like a list, but can only have each item once, no duplication
for w in words:
    realwords.add(w[:-1])  # drop the last character (the digit)
print(realwords)
{'c', 'a', 'b', 'd'}
If you have a more complex problem than single digits, please append it to your question. There are many ways to solve such problems in Python. In the above example I used a set, which is a simple way to ensure that no duplication happens. You can easily convert the set back to a list with list(realwords).
If I understand your question correctly, you have a list of words and you want to replace a certain word with another word? In many programming languages I think you would use a switch-case statement for something like this. This can be implemented in Python by using a dict:
switch = {
"a1" : "a",
"a2" : "a",
"a3" : "a",
"a4" : "a",
"b1" : "b",
"b2" : "b",
"b3" : "b",
"c1" : "c",
"c2" : "c",
"c3" : "c",
"d1" : "d",
"d2" : "d"
}
test_word = "d1"
answer = switch[test_word]
print(answer) #d
The bracket notation of [ ] used on a dict looks in the dict for a key matching the value within the brackets. It returns the corresponding value of that key in the dictionary. If it is not found it will raise a KeyError.
If you want to return a different value in the case that the key is not found, then you can use .get instead, like so:
switch.get(test_word, "not found")
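If typing one entry per word gets tedious as the data grows, one option is to store the words grouped by category and invert that into the lookup dict once. This is only a sketch: the names categories, lookup and categorize are made up for illustration, and it assumes each word belongs to exactly one category.

```python
# Hypothetical grouping: each category maps to the words that belong to it.
categories = {
    "a": ["a1", "a2", "a3", "a4"],
    "b": ["b1", "b2", "b3"],
}

# Invert the grouping once into a word -> category lookup table.
lookup = {
    word: category
    for category, words in categories.items()
    for word in words
}


def categorize(word, default=None):
    """Return the category for word, or default if the word is unknown."""
    return lookup.get(word, default)


print(categorize("a2"))  # a
print(categorize("b3"))  # b
```

Adding a new word then means appending it to one list instead of writing another if-else branch.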

Reference all indexes in list and check for existence of value in python

I'm trying to create if block in my python3 script that checks if a value exists within a list I pull from JSON. The JSON data is below:
[
{
"id": 59616405645,
"name": "Foo"
},
{
"id": 990164054345,
"name": "FindMe"
},
{
"id": 2009167874,
"name": "Bar"
}
]
I'm trying to determine whether or not the value of Bar exists within the list. To do so I'm doing the following which directly references the index:
if "FindMe" in m_orgs[1].values():
    print("Yo it's here")
else:
    print("Yo it's not here.")
But the JSON data I'm pulling will always have different results and we will never know the index numbers, so direct reference will not work. How do I reference all indexes in a list at once?
You can't reference all indexes at once, but you can loop through them and stop as soon as you find the first match. Something like:
found = any("FindMe" in item.values() for item in m_orgs)
any() stops at the first True value, so in the worst case it looks at every item and finds nothing.
You can use any() like this:
if any(d['name'] == 'Foo' for d in m_orgs):
    print("Yo it's here")
You can first convert the original JSON data to a set of names, and then check membership with a set operation:
name_set = {org['name'] for org in m_orgs}
print('FindMe' in name_set)
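If you also need the matching entry itself (for example its id) and not only a yes/no answer, next() with a default does the same short-circuiting search as any(). A minimal sketch, assuming m_orgs is the list already parsed from the JSON above; the helper name find_org is hypothetical.

```python
# Assumed to be the already-parsed JSON from the question.
m_orgs = [
    {"id": 59616405645, "name": "Foo"},
    {"id": 990164054345, "name": "FindMe"},
    {"id": 2009167874, "name": "Bar"},
]


def find_org(orgs, name):
    """Return the first dict whose "name" equals name, or None."""
    # next() stops at the first match; None is returned if nothing matches.
    return next((org for org in orgs if org.get("name") == name), None)


match = find_org(m_orgs, "FindMe")
if match is not None:
    print("Yo it's here:", match["id"])
else:
    print("Yo it's not here.")
```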

Comparing Swift and Python Dictionary objects

I'm trying to get familiar with Swift, so I'm doing some basic computations that I would normally do in Python.
I want to get a value from a dictionary using a key. In Python I would simply do:
sequences = ["ATG","AAA","TAG"]
D_codon_aa = {"ATG": "M", "AAA": "R", "TAG": "*"}
for seq in sequences:
    print(D_codon_aa[seq])
>>>
M
R
*
When I try this in Swift.
let sequences = ["ATG","AAA","TAG"]
let D_codon_aa = ["ATG": "M", "AAA": "R", "TAG": "*"]
for seq in sequences
{
var codon = D_codon_aa[seq]
println(codon)
}
>>>
Optional("M")
Optional("R")
Optional("*")
1) What is Optional() and why is it around the dictionary value?
2) Why can't I make a dictionary with multiple types of objects inside?
In Python I can do this:
sequence= {'A':0,'C':1, 'G':'2', 'T':3.0}
In Swift I can't do this:
let sequences = ["A":0,"C":1, "G":"2", "T":3.0]
1:
Look at the declaration of the dictionary's subscript:
subscript(key: Key) -> Value?
It returns an optional because you can pass any key to the subscript, but that key might not be associated with a value. In that case it returns nil; otherwise it returns the value wrapped in an optional.
2: Actually, you can, if you declare your dictionary as, for example, [String: AnyObject]. You can then associate keys with any values that conform to the AnyObject protocol.
Updated
And your example let sequences = ["A":0,"C":1, "G":"2", "T":3.0] compiles fine in Xcode 6.1.1.

Python parsing json issue

I'm having troubles parsing a complicated json. That is it:
{
"response": {
"players": [
{
"bar": "76561198034360615",
"foo": "49329432943232423"
}
]
}
}
My code:
url = urllib.urlopen("foobar").read()
js = json.loads(url)
data = js['response']
print(data['players'])
The problem is that this would print the dict. What I want is to reach the key's values, like foo and bar. What I tried so far is doing data['players']['foo'] and it gives me an error that list indices should be integers, I tried of course, it didn't work. So my question is how do I reach these values? Thanks in advance.
data['players'] (that is, js['response']['players']) is an array, as shown by the brackets [ and ], so you need to access its items using a specific index (0 in this case):
data['players'][0]['foo']
Or iterate over all players:
for player in data['players']:
    print(player['foo'])
The problem is that players is a list of items ([ ] in JSON), so you need to select the first (and here only) item using [0]:
print(data['players'][0]['foo'])
But keep in mind that you may have more than one player, in which case you either need to specify the player or loop through the players with a for loop:
for player in data['players']:
    print(player['foo'])
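For completeness, here is a self-contained sketch of the same idea that does not depend on the URL, using the JSON from the question as a string. The .get() defaults are my own addition so that a response without "response" or "players" yields an empty list instead of raising a KeyError.

```python
import json

# The response body from the question, inlined as a string for illustration.
raw = """{
  "response": {
    "players": [
      {"bar": "76561198034360615", "foo": "49329432943232423"}
    ]
  }
}"""

js = json.loads(raw)  # loads() parses a string; load() expects a file object

# Fall back to empty containers if either level is missing.
players = js.get("response", {}).get("players", [])

# Collect "foo" from every player that has it.
foo_values = [player["foo"] for player in players if "foo" in player]
print(foo_values)
```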
