Using pop on multidimensional lists in Python with DynamoDB - python

I want to pop an item from a list of lists.
So I have a scan for all dynamo Items and want to "pop" one field from each list.
For example:
response = table.query(
    KeyConditionExpression=Key('userId').eq(userId)
)
agentList = response['Items']
The List:
"result": [
{
"credentials": {
"key": "xxx",
"secret": "xxxx"
},
"active": true,
"totalImported": "12345",
}]
From this example, I have a number of results, and from every result I want to remove the item "credentials", like this:
agentList.pop('credentials')
However, this isn't working.

You're doing the pop() on a list. So, you can specify a position in the list, but, as a list has no key values, you can't use a string. Hence the error.
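The difference between the two pop() methods can be seen in a minimal sketch. The sample data below is a shortened, hypothetical version of the item from the question:

```python
# A list is indexed by position, so list.pop() expects an integer.
items = [{"credentials": {"key": "xxx"}, "active": True}]

try:
    items.pop("credentials")  # a string is not a valid list index
except TypeError as exc:
    list_error = str(exc)
    print("list.pop failed:", list_error)

# A dict is indexed by key, so dict.pop() accepts the string.
removed = items[0].pop("credentials")
print(removed)
print(items)
```
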
I'm not sure exactly what you're trying to do, but, assuming you want to remove the 'credentials' from every item in the list you could do something like:
agentList = [
{
"credentials": {
"key": "xxx",
"secret": "xxxx"
},
"active": True,
"totalImported": "12345",
},
{
"credentials": {
"key": "yyy",
"secret": "yyy"
},
"active": True,
"totalImported": "2222",
}
]
for result in agentList:
    result.pop('credentials')
print(agentList)
Which would result in:
[{'active': True, 'totalImported': '12345'}, {'active': True, 'totalImported': '2222'}]
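If some items might lack the key, or the original items should stay untouched, two variations may help. This is only a sketch: the second item without "credentials" is a made-up case, and the names agent_list and cleaned are illustrative.

```python
agent_list = [
    {"credentials": {"key": "xxx", "secret": "xxxx"}, "active": True, "totalImported": "12345"},
    {"active": True, "totalImported": "2222"},  # hypothetical item with no credentials
]

# Option 1: build new dictionaries and leave the originals untouched.
cleaned = [
    {key: value for key, value in item.items() if key != "credentials"}
    for item in agent_list
]

# Option 2: mutate in place; the default (None) avoids a KeyError
# when an item has no "credentials" key.
for item in agent_list:
    item.pop("credentials", None)

print(cleaned)
```

Option 1 is the safer choice when other code still needs the credentials; option 2 matches the loop above but tolerates missing keys.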

I finally found this way:
print(items)
agentList =[]
for result in items['Items']:
    agentList.append(result)
    result.pop('credentials')
This appends each item to the list one by one and then pops the "credentials" field from it.

Related

Append to a json file using python

Trying to append to a nested json file
My goal is to append some values to a JSON file.
Here is my original JSON file
{
"web": {
"all": {
"side": {
"tags": [
"admin"
],
"summary": "Generates",
"operationId": "Key",
"consumes": [],
"produces": [
"application/json"
],
"responses": {
"200": {
"description": "YES",
"schema": {
"type": "string"
}
}
},
"Honor": [
{
"presidential": []
}
]
}
}
}
}
It is my intention to add two additional lines inside the key "Honor", with the values "Required" : "YES" and "Prepay" : "NO". As a result of appending the two values, I will have the following JSON file.
{
"web": {
"all": {
"side": {
"tags": [
"admin"
],
"summary": "Generates",
"operationId": "Key",
"consumes": [],
"produces": [
"application/json"
],
"responses": {
"200": {
"description": "YES",
"schema": {
"type": "string"
}
}
},
"Honor": [
{
"presidential": [],
"Required" : "YES",
"Prepay" : "NO"
}
]
}
}
}
}
Below is the Python code that I have written
import json
def write_json(data,filename ="exmpleData.json"):
    with open(filename,"w") as f:
        json.dump(data,f,indent=2)
with open ("exmpleData.json") as json_files:
    data= json.load(json_files)
    temp = data["Honor"]
    y = {"required": "YES","type": "String"}
    temp.append(y)
write_json(data)
I am receiving the following error message:
temp = data["Honor"]
KeyError: 'Honor'
I would appreciate any guidance that you can provide to help me achieve my goal. I am running Python 3.7
'Honor' is deeply nested in other dictionaries, and its value is a 1-element list containing a dictionary. Here's how to access:
import json
def write_json(data, filename='exmpleData.json'):
    with open(filename, 'w') as f:
        json.dump(data, f, indent=2)
with open('exmpleData.json') as json_files:
    data = json.load(json_files)
# 'Honor' is deeply nested in other dictionaries
honor = data['web']['all']['side']['Honor']
# Its value is a 1-element list containing another dictionary.
honor[0]['Required'] = 'YES'
honor[0]['Prepay'] = 'NO'
write_json(data)
I'd recommend that you practice your fundamentals a bit more since you're making many mistakes in your data structure handling. The good news is, your JSON load/dump is fine.
The cause of your error message is that data doesn't have an "Honor" property. Data only has a "web" property, which contains "all" which contains "side" which contains "Honor", which contains an array with a dictionary that holds the properties you are trying to add to. So you want to set temp with temp = data['web']['all']['side']['Honor'][0]
You also cannot use append on python dictionaries. Instead, check out dict.update().
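As a rough illustration of the dict.update() suggestion, the sketch below works on an in-memory copy instead of the file. The JSON string is a trimmed-down version of the document from the question, and honor_entry is just an illustrative name.

```python
import json

# Trimmed-down copy of the document from the question.
data = json.loads('{"web": {"all": {"side": {"Honor": [{"presidential": []}]}}}}')

# Walk down to the dictionary inside the one-element "Honor" list.
honor_entry = data["web"]["all"]["side"]["Honor"][0]

# dict.update() merges the new key-value pairs into the existing dictionary.
honor_entry.update({"Required": "YES", "Prepay": "NO"})

print(json.dumps(data["web"]["all"]["side"]["Honor"], indent=2))
```
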

Populating JSON data from API in Python pandas DataFrame - TypeError and IndexError

I am trying to populate a pandas DataFrame with select information from JSON output fetched from an API.
candidate_list = []
for candidate in candidate_response['data']:
    if 'error' not in candidate_response:
        candidate_list.append([candidate['id'], candidate['attributes']['first_name'], candidate['attributes']['last_name'],
                               candidate['relationships']['educations']['data']['id']])
The DataFrame populates fine until I add candidate['relationships']['educations']['data']['id'], which throws TypeError: list indices must be integers or slices, not str.
When trying to get the values of the indexes for ['id'] by using candidate['relationships']['educations']['data'][0]['id'] instead, I get IndexError: list index out of range.
The JSON output looks something like:
"data": [
{
"attributes": {
"first_name": "Tester",
"last_name": "Testman",
"other stuff": "stuff",
},
"id": "732887",
"relationships": {
"educations": {
"data": [
{
"id": "605372",
"type": "educations"
},
{
"id": "605371",
"type": "educations"
},
{
"id": "605370",
"type": "educations"
}
]
}
},
How would I go about successfully filling a column in the DataFrame with the 'id's under 'relationships'>'educations'>'data'?
Note that candidate['relationships']['educations']['data']['id'] raises that error because the value at data is a list, not a dictionary, and a list cannot be indexed with a string key.
Assuming you want one row per entry in data[].relationships.educations.data, complete code that works and does this is:
import json
json_string = """{
"data": [
{
"attributes": {
"first_name": "Tester",
"last_name": "Testman",
"other stuff": "stuff"
},
"id": "732887",
"relationships": {
"educations": {
"data": [
{
"id": "605372",
"type": "educations"
},
{
"id": "605371",
"type": "educations"
},
{
"id": "605370",
"type": "educations"
}
]
}
}
}
]
}"""
candidate_response = json.loads(json_string)
candidate_list = []
for candidate in candidate_response['data']:
    if 'error' not in candidate_response:
        for data in candidate['relationships']['educations']['data']:
            candidate_list.append(
                [
                    candidate['id'],
                    candidate['attributes']['first_name'],
                    candidate['attributes']['last_name'],
                    data['id']
                ]
            )
print(candidate_list)
Code run available at ideone.
I analyzed your code and ran it in a Jupyter notebook; it looks fine and I get the output.
The error list indices must be integers or slices, not str occurs because you did not use an index. An index is required because the value you are looking for is inside a list.
As for IndexError: list index out of range, there may be a typo on your side; otherwise the code is fine.
Here is the output of your code:
candidate_list = []
for candidate in candidate_response['data']:
    if 'error' not in candidate_response:
        candidate_list.append([candidate['id'], candidate['attributes']['first_name'], candidate['attributes']['last_name'],candidate['relationships']['educations']['data'][0]['id']])
Output
Probably, for some candidate, candidate['relationships']['educations']['data'] is an empty list.
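If an empty educations list is indeed the cause of the IndexError, one way to handle it is to collect all ids per candidate instead of indexing [0]. The sketch below uses a hypothetical response in which the second candidate has no educations; the real API output may differ.

```python
# Hypothetical response: the second candidate has no educations.
candidate_response = {
    "data": [
        {
            "id": "732887",
            "attributes": {"first_name": "Tester", "last_name": "Testman"},
            "relationships": {"educations": {"data": [
                {"id": "605372", "type": "educations"},
                {"id": "605371", "type": "educations"},
            ]}},
        },
        {
            "id": "732888",
            "attributes": {"first_name": "Empty", "last_name": "Case"},
            "relationships": {"educations": {"data": []}},
        },
    ]
}

candidate_list = []
for candidate in candidate_response["data"]:
    educations = candidate["relationships"]["educations"]["data"]
    # Collect every education id; an empty list simply yields [].
    education_ids = [education["id"] for education in educations]
    candidate_list.append([
        candidate["id"],
        candidate["attributes"]["first_name"],
        candidate["attributes"]["last_name"],
        education_ids,
    ])

print(candidate_list)
```

Each row then carries a list of ids in its last position, so a candidate without educations still produces a row rather than an exception.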

Working with multiple JSONs from API calls in Python

I'm trying to make multiple API calls to retrieve JSON files. The JSONs all follow the same schema. I want to merge all the JSON files together as one file so I can do two things:
1) Extract all the IP addresses from the JSON to work with later
2) Convert the JSON into a Pandas Dataframe
When I first wrote the code, I made a single request and it returned a JSON that I could work with. Now I have used a for loop to collect multiple JSONs and append them to a list called results_list so that the next JSON does not overwrite the previous one I requested.
Here's the code
headers = {
    'Accept': 'application/json',
    'key': 'MY_API_KEY'
}
query_type = 'QUERY_TYPE'
locations_list = ['London', 'Amsterdam', 'Berlin']
results_list = []
for location in locations_list:
    url = ('https://API_URL' )
    r = requests.get(url, params={'query':str(query_type)+str(location)}, headers = headers)
    results_list.append(r)
with open('my_search_results.json' ,'w') as outfile:
    json.dump(results_list, outfile)
The JSON file my_search_results.json has a separate row for each API query e.g. 0 is London, 1 is Amsterdam, 2 is Berlin etc. Like this:
[
{
"complete": true,
"count": 51,
"data": [
{
"actor": "unknown",
"classification": "malicious",
"cve": [],
"first_seen": "2020-03-11",
"ip": "1.2.3.4",
"last_seen": "2020-03-28",
"metadata": {
"asn": "xxxxx",
"category": "isp",
"city": "London",
"country": "United Kingdom",
"country_code": "GB",
"organization": "British Telecommunications PLC",
"os": "Linux 2.2-3.x",
"rdns": "xxxx",
"tor": false
},
"raw_data": {
"ja3": [],
"scan": [
{
"port": 23,
"protocol": "TCP"
},
{
"port": 81,
"protocol": "TCP"
}
],
"web": {}
},
"seen": true,
"spoofable": false,
"tags": [
"some tag",
]
}
(I've redacted any sensitive data. There is a separate row in the JSON for each API request, representing each city, but it's too big to show here)
Now I want to go through the JSON and pick out all the IP addresses:
for d in results_list['data']:
    ips = (d['ip'])
    print(ips)
However this gives the error:
TypeError: list indices must be integers or slices, not str
When I was working with a single JSON from a single API request this worked fine, but now it seems like either the JSON is not formatted properly or Python is seeing my big JSON as a list and not a dictionary, even though I used json.dump() on results_list earlier in the script. I'm sure it has to do with the way I had to take all the API calls and append them to a list but I can't work out where I'm going wrong.
I'm struggling to figure out how to pick out the IP addresses or if there is just a better way to collect and merge multiple JSONs. Any advice appreciated.
To get the IPs, index into results_list before accessing 'data':
for d in results_list[0]['data']: # this works only if all results are in the first element
    ips = d['ip']
    print(ips)
Reason why you received the error:
results_list is a list whose elements are dictionaries, and the 'data' key of each dictionary holds a list of dictionaries containing the ip you need. So when you write results_list['data'], you are indexing the outer list with a string, which raises the error:
TypeError: list indices must be integers or slices, not str
So if:
results_list= [
{
"complete": True,
"count": 51,
"data": [
{
"actor": "unknown",
"classification": "malicious",
"cve": [],
"first_seen": "2020-03-11",
"ip": "1.2.3.4",
"last_seen": "2020-03-28",
"metadata": {
"asn": "xxxxx",
"category": "isp",
"city": "London",
"country": "United Kingdom",
"country_code": "GB",
"organization": "British Telecommunications PLC",
"os": "Linux 2.2-3.x",
"rdns": "xxxx",
"tor": False
},
"raw_data": {
"ja3": [],
"scan": [
{
"port": 23,
"protocol": "TCP"
},
{
"port": 81,
"protocol": "TCP"
}
],
"web": {}
},
"seen": True,
"spoofable": False,
"tags": [
"some tag",
]
}...(here is your rest data)
]}]
to get all IP addresses, run:
ip_address=[]
# this works only if each result is a separate dictionary in the results_list
# (it takes only the first entry of each result's data list)
for d in results_list:
    ips = d['data'][0]['ip']
    ip_address.append(ips)
    print(ips)
# if all results are within data
for d in results_list[0]['data']:
    ips = d['ip']
    ip_address.append(ips)
    print(ips)
results_list is a list, not a dictionary, so results_list['data'] raises an error. Instead, you should get each dictionary from that list, then access the 'data' attribute. Noting also that the value for the key 'data' is of type list, you also need to access the element of that list:
for result in results_list:
    for d in result["data"]:
        ips = d["ip"]
        print(ips)
If you know that your JSON list only has one element, you may simplify this to:
for d in results_list[0]["data"]:
    ips = d["ip"]
    print(ips)
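To gather every IP into one list for later use, the nested loop can be written as a comprehension. This sketch assumes each element of results_list is the parsed response body (a dictionary, for example the value returned by r.json()) rather than the raw response object; the sample data is made up.

```python
# Hypothetical stand-in for the parsed responses, one dictionary per city.
results_list = [
    {"count": 2, "data": [{"ip": "1.2.3.4"}, {"ip": "5.6.7.8"}]},
    {"count": 1, "data": [{"ip": "9.10.11.12"}]},
    {"count": 0, "data": []},
]

# Flatten: every entry of every result's "data" list contributes one IP.
ip_addresses = [
    entry["ip"]
    for result in results_list
    for entry in result.get("data", [])
]

print(ip_addresses)
```

Using result.get("data", []) means a result without a "data" key is skipped instead of raising a KeyError.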

How to add a string to JSON list in python

Here is my code sample:
try:
    REST_Call = Session.get(CC_URL_REST) #Getting the session for a particular url.
    REST_CALL = REST_Call.content #Retrieving the contents from the url.
    JSON_Data = json.loads(REST_CALL) #Loading data as JSON.
    Report_JSON.append(JSON_Data) #Appending the data to an empty list
The JSON data that is returned and appended to the 'Report_JSON' is:
[
{
"content": [
{
"id": 001,
"name": "Sample_Name",
"description": "Sample_description",
"status": "STARTED",
"source": null,
"user": "username"
}
},
],
I just want to add the below data in string format to the above JSON list:
{
"CronExpression": "0 1,30 4,5,6,7,8,9,10,11,12,13,14 ? * 2,3,4,5,6"
},
Sample code for the above string data:
Cron_Call = Session.get(Cron_URL_REST)
Cron_CALL = Cron_Call.content
Cron_Data = json.loads(Cron_CALL)
cron_value = Cron_Data["cronExpression"]
Report_JSON.append({
"CronExpression": cron_value
})
When trying to append it to the 'Report_JSON' this is the output I get:
[
{
"content": [
{
"id": 001,
"name": "Sample_Name",
"description": "Sample_description",
"status": "STARTED",
"source": null,
"user": "username"
}
},
],
{
"CronExpression": "0 1,30 4,5,6,7,8,9,10,11,12,13,14 ? * 2,3,4,5,6"
},
I'm trying to show both pieces of data under the same "content" entry rather than separately.
This is the result I'm trying to get:
{
"id": 001,
"name": "Sample_Name",
"description": "Sample_description",
"status": "STARTED",
"source": null,
"user": "username"
"CronExpression": "0 1,30 4,5,6,7,8,9,10,11,12,13,14 ? * 2,3,4,5,6"
},
Any ideas on how to implement it?
Loop over JSON_Data['content'] and add the new key to each of them.
Cron_Call = Session.get(Cron_URL_REST)
Cron_CALL = Cron_Call.content
Cron_Data = json.loads(Cron_CALL)
cron_value = Cron_Data["cronExpression"]
for x in JSON_Data['content']:
    x['CronExpression'] = cron_value
Here, Report_JSON is loaded as a list type in Python (JSON data can be interpreted by Python as either a list, if it is surrounded by [] square brackets, or a dict if it is surrounded by {} curly brackets).
When you call Report_JSON.append(), it will append a new item to the list. You are creating a new dictionary with a single key-value pair (CronExpression) and adding it to the list, which is why the two dictionaries are side-by-side.
What you should do instead is get the first item in the Report_JSON list, which is the dictionary; then ask for the value corresponding to the content key, which will be a list; then ask for the first item in that list, which will be the dictionary you want to modify (with keys id, name, description, etc.)
Modify that dictionary, then put it back in the list. Here's the code that will do that:
# Get first item in Report_JSON list
content_dict = Report_JSON[0]
# Get the value corresponding to the 'content' key, which is a list
content_value = content_dict['content']
# Get the first item in the list, which is the dict you want to modify
dict_to_modify = content_value[0]
# Now add your new key-value pair
dict_to_modify['CronExpression'] = "0 1,30 4,5,6,7 ..."
Or, to do it in one shot:
Report_JSON[0]['content'][0]['CronExpression'] = "0 1,30 4,5,6,7 ..."
UPDATE: If the "content" list has multiple items, you can iterate over each item in that list:
for content_dict in Report_JSON[0]['content']:
    content_dict['CronExpression'] = "0 1,30 4,5,6,7 ..."
which will result in something like this:
[
{
"content": [
{
"id": 001,
"name": "Sample_Name",
"description": "Sample_description",
"status": "STARTED",
"source": null,
"user": "username",
"CronExpression": "0 1,30 4,5,6,7 ..."
},
{
"id": 002,
"name": "Another_Sample_Name",
"description": "Another_sample_description",
"status": "STARTED",
"source": null,
"user": "another_username",
"CronExpression": "0 1,30 4,5,6,7 ..."
},
]
},
],
UPDATE 2: If you are not interested in keeping the original structure and you want to strip everything up to and including the "content" key, you can just do this to start off:
Report_JSON = Report_JSON[0]['content']
and Report_JSON is now just the inner "content" list.
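Putting the pieces together, a self-contained sketch might look like the following. The dictionaries stand in for the two parsed REST responses, so report_json and cron_data are hypothetical names and values, not the real payloads.

```python
# Hypothetical stand-ins for the two parsed responses.
report_json = [{"content": [
    {"id": 1, "name": "Sample_Name", "status": "STARTED"},
    {"id": 2, "name": "Another_Sample_Name", "status": "STARTED"},
]}]
cron_data = {"cronExpression": "0 1,30 4,5,6,7,8,9,10,11,12,13,14 ? * 2,3,4,5,6"}

cron_value = cron_data["cronExpression"]

# Add the key to every dictionary inside each "content" list.
for report in report_json:
    for content_dict in report.get("content", []):
        content_dict["CronExpression"] = cron_value

print(report_json[0]["content"][0])
```
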

Modify value for specified key in list of dictionaries

I have a list of json objects (dictionaries) ds_list
ds_list = [ { "status": "NEW" }, { "status": "UP_TO_DATE" }]
I need to modify an attribute of each object.
So here is my solution:
if we_are_processing:
    result = list(map(lambda ds: ds.update({'status': 'PROCESSING'}) or ds, ds_list))
result = [ { "status": "PROCESSING" }, { "status": "PROCESSING" }]
It works, but I don't like it very much, in particular update() and or ds.
What is a more Pythonic (readable) way of implementing it?
The Pythonic way is to use a for loop:
ds_list = [ { "status": "NEW" }, { "status": "UP_TO_DATE" }]
for item in ds_list:
    item['status'] = 'PENDING'
# [{'status': 'PENDING'}, {'status': 'PENDING'}]
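If the original dictionaries should not be modified, a list comprehension that builds new dictionaries is another option. This is a sketch of an alternative, not part of the answer above; it uses the 'PROCESSING' value from the question.

```python
ds_list = [{"status": "NEW"}, {"status": "UP_TO_DATE"}]

# {**ds, ...} copies each dictionary and overrides "status" in the copy.
result = [{**ds, "status": "PROCESSING"} for ds in ds_list]

print(result)
print(ds_list)
```

The for loop is the better fit when in-place mutation is intended; the comprehension keeps ds_list unchanged.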
