JSON filter "smaller than" condition - python

I have a JSON which looks like this:
{
"data": [
{
"Name": "Hello",
"Number": "20"
},
{
"Name": "Beautiful",
"Number": "22"
},
{
"Name": "World",
"Number": "25"
},
{
"Name": "!",
"Number": "28"
}
]
}
I want to keep every entry whose Number is smaller than 28. The result should look like this:
{
"data": [
{
"Name": "Hello",
"Number": "20"
},
{
"Name": "Beautiful",
"Number": "22"
},
{
"Name": "World",
"Number": "25"
}
]
}
I looked for a solution, but all I found was how to remove an exact value.
I'm doing this with a much larger file; this is just an example.

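You can do it with a simple for loop:
import json
with open('your_path_here.json', 'r') as f:
    data = json.load(f)
# Iterate over a copy ([:]) so that removing from the original list
# does not make the loop skip elements.
for elem in data['data'][:]:
    if int(elem['Number']) >= 28:
        data['data'].remove(elem)
print(data)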
Output (shown here formatted as JSON for readability):
{
"data": [
{
"Name": "Hello",
"Number": "20"
},
{
"Name": "Beautiful",
"Number": "22"
},
{
"Name": "World",
"Number": "25"
}
]
}
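A word of caution about the loop approach: removing items from the very list you are iterating over can skip elements when two matching entries sit next to each other. The sketch below uses made-up sample records (not the data from the question) to show the effect, and how iterating over a copy avoids it.

```python
# Hypothetical sample with two adjacent entries that should both be removed.
records = [{"Number": "20"}, {"Number": "28"}, {"Number": "30"}, {"Number": "22"}]

# Removing from the list being iterated: once "28" is removed, "30" slides
# into its slot and the loop steps past it.
buggy = list(records)
for elem in buggy:
    if int(elem["Number"]) >= 28:
        buggy.remove(elem)

# Iterating over a copy (safe[:]) while removing from the original list.
safe = list(records)
for elem in safe[:]:
    if int(elem["Number"]) >= 28:
        safe.remove(elem)

print([e["Number"] for e in buggy])
print([e["Number"] for e in safe])
```

The first list still contains "30", while the second contains only the entries below 28.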

An example could use list comprehension:
data = {
"data": [
{
"Name": "Hello",
"Number": "20"
},
{
"Name": "Beautiful",
"Number": "22"
},
{
"Name": "World",
"Number": "25"
},
{
"Name": "!",
"Number": "28"
}
]
}
filter_ = 28
filtered = {
"data": [
item for item in data["data"]
if int(item["Number"]) < filter_
]
}
print(filtered)
Basically, this iterates through data["data"], checks whether the current item's Number is less than the filter (28 in this case), and collects the matching items in a new list. You're left with:
{'data': [{'Name': 'Hello', 'Number': '20'}, {'Name': 'Beautiful', 'Number': '22'}, {'Name': 'World', 'Number': '25'}]}
...which should be what you need, but unformatted.
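If you also want the formatted layout, one option is the indent parameter of json.dumps. This is a minimal sketch; the filtered dict is typed in by hand here so the snippet runs on its own.

```python
import json

# Same shape as the filtered result above (typed in by hand for this sketch).
filtered = {
    "data": [
        {"Name": "Hello", "Number": "20"},
        {"Name": "Beautiful", "Number": "22"},
        {"Name": "World", "Number": "25"},
    ]
}

# indent=2 produces a multi-line layout; the default is a single line.
pretty = json.dumps(filtered, indent=2)
print(pretty)
```

To write the result to a file instead, json.dump(filtered, f, indent=2) takes the same indent argument.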
However, for larger JSON files, you might want to look into ijson, which parses JSON incrementally instead of loading the whole document into memory. Here's an example:
import ijson
import json
filter_ = 28
# Binary mode is what ijson expects; text mode may emit a warning in recent versions.
with open('data.json', 'rb') as file:
    items = ijson.items(file, 'data.item')
    filtered = [item for item in items if int(item["Number"]) < filter_]
with open('filtered.json', 'w') as output:
    # Wrap the list in "data" so the output has the layout asked for in the question.
    json.dump({"data": filtered}, output, indent=2)

Related

How to delete an element of an array in a JSON file by its key in Python?

I am building a simple database on top of a JSON file, and I want to delete a specific element of an array in that file with Python, but I can't get it to work the way I want. Here's what the info.json file looks like:
{
"dates": [
{
"date": "10/10",
"desc": "test1"
},
{
"date": "09/09",
"desc": "test3"
}
],
"data": [
{
"name": "123",
"infotext": "1234"
},
{
"name": "!##",
"infotext": "!##$"
}
]
}
Here's what the json_import.py file looks like:
import json
def delete_data():
    name = input("Enter data name\n")
    with open("info.json", "r+") as f:
        file_data = json.load(f)
        for x in file_data["data"]:
            if x["name"] == name:
                file_data["data"].remove(x)
        f.seek(0)
        json.dump(file_data, f, indent = 4)
delete_data()
TERMINAL:
Enter data name
!##
Expected:
{
"dates": [
{
"date": "10/10",
"desc": "test1"
},
{
"date": "09/09",
"desc": "test3"
}
],
"data": [
{
"name": "123",
"datatext": "1234"
}
]
}
Actual result:
{
"dates": [
{
"date": "10/10",
"desc": "test1"
},
{
"date": "09/09",
"desc": "test3"
}
],
"data": [
{
"name": "123",
"datatext": "1234"
}
]
} {
"name": "!##",
"infotext": "!##$"
}
]
}
So how to fix it?
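A likely cause is that the new JSON is shorter than the old one: after f.seek(0) the file is overwritten from the start, but the leftover tail of the old content stays in place. A minimal sketch of one possible fix is to call f.truncate() after writing. The function below takes the path and name as parameters (instead of using input()) and runs against a temporary file with made-up sample data, purely so it can be executed on its own.

```python
import json
import os
import tempfile


def delete_data(path, name):
    """Remove every entry of file_data["data"] whose "name" equals `name`."""
    with open(path, "r+") as f:
        file_data = json.load(f)
        # Build a new list instead of removing while iterating.
        file_data["data"] = [x for x in file_data["data"] if x["name"] != name]
        f.seek(0)                          # go back to the start of the file
        json.dump(file_data, f, indent=4)
        f.truncate()                       # cut off leftover bytes of the old, longer content


# Demo on a temporary file (hypothetical sample data).
sample = {"dates": [], "data": [{"name": "123", "infotext": "1234"},
                                {"name": "!##", "infotext": "!##$"}]}
fd, tmp_path = tempfile.mkstemp(suffix=".json")
with os.fdopen(fd, "w") as f:
    json.dump(sample, f, indent=4)

delete_data(tmp_path, "!##")
with open(tmp_path) as f:
    result = json.load(f)   # parses cleanly because no stale tail is left
os.remove(tmp_path)
print(result)
```

An alternative with the same effect is to read the file in "r" mode, close it, and then reopen it in "w" mode for writing, since "w" empties the file first.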

Getting all the Keys from JSON Object?

Goal: create a script that takes a nested JSON object as input and outputs a CSV file with every key as a row.
Example:
{
"Document": {
"DocumentType": 945,
"Version": "V007",
"ClientCode": "WI",
"Shipment": [
{
"ShipmentHeader": {
"ShipmentID": 123456789,
"OrderChannel": "Shopify",
"CustomerNumber": 234234,
"VendorID": "2343SDF",
"ShipViaCode": "FEDX2D",
"AsnDate": "2018-01-27",
"AsnTime": "09:30:47-08:00",
"ShipmentDate": "2018-01-23",
"ShipmentTime": "09:30:47-08:00",
"MBOL": 12345678901234568,
"BOL": 12345678901234566,
"ShippingNumber": "1ZTESTTEST",
"LoadID": 321456987,
"ShipmentWeight": 10,
"ShipmentCost": 2.3,
"CartonsTotal": 2,
"CartonPackagingCode": "CTN25",
"OrdersTotal": 2
},
"References": [
{
"Reference": {
"ReferenceQualifier": "TST",
"ReferenceText": "Testing text"
}
}
],
"Addresses": {
"Address": [
{
"AddressLocationQualifier": "ST",
"LocationNumber": 23234234,
"Name": "John Smith",
"Address1": "123 Main St",
"Address2": "Suite 12",
"City": "Hometown",
"State": "WA",
"Zip": 92345,
"Country": "USA"
},
{
"AddressLocationQualifier": "BT",
"LocationNumber": 2342342,
"Name": "Jane Smith",
"Address1": "345 Second Ave",
"Address2": "Building 32",
"City": "Sometown",
"State": "CA",
"Zip": "23665-0987",
"Country": "USA"
}
]
},
"Orders": {
"Order": [
{
"OrderHeader": {
"PurchaseOrderNumber": 23456342,
"RetailerPurchaseOrderNumber": 234234234,
"RetailerOrderNumber": 23423423,
"CustomerOrderNumber": 234234234,
"Department": 3333,
"Division": 23423,
"OrderWeight": 10.23,
"CartonsTotal": 2,
"QTYOrdered": 12,
"QTYShipped": 23
},
"Cartons": {
"Carton": [
{
"SSCC18": 12345678901234567000,
"TrackingNumber": "1ZTESTTESTTEST",
"CartonContentsQty": 10,
"CartonWeight": 10.23,
"LineItems": {
"LineItem": [
{
"LineNumber": 1,
"ItemNumber": 1234567890,
"UPC": 9876543212,
"QTYOrdered": 34,
"QTYShipped": 32,
"QTYUOM": "EA",
"Description": "Shoes",
"Style": "Tall",
"Size": 9.5,
"Color": "Bllack",
"RetailerItemNumber": 2342333,
"OuterPack": 10
},
{
"LineNumber": 2,
"ItemNumber": 987654321,
"UPC": 7654324567,
"QTYOrdered": 12,
"QTYShipped": 23,
"QTYUOM": "EA",
"Description": "Sunglasses",
"Style": "Short",
"Size": 10,
"Color": "White",
"RetailerItemNumber": 565465456,
"OuterPack": 12
}
]
}
}
]
}
}
]
}
}
]
}
}
From the JSON object above, I want all the keys (nested ones included) in a list; duplicates can be removed with a set. If a nested key occurs several times, as in the actual JSON, it may also appear multiple times in the CSV.
Recursion is a good fit for this type of problem when the depth of nesting is unpredictable. Here is an example in Python of how you can use recursion to extract all keys. Cheers.
import json
row = ""
def extract_keys(data):
    global row
    if isinstance(data, dict):
        for key, value in data.items():
            row += key + "\n"
            extract_keys(value)
    elif isinstance(data, list):
        for element in data:
            extract_keys(element)
# MAIN
with open("input.json", "r") as rfile:
    dicts = json.load(rfile)
extract_keys(dicts)
with open("output.csv", "w") as wfile:
    wfile.write(row)
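A possible variation on the same recursive idea avoids the global variable by using a generator and lets the csv module write the rows. The function name iter_keys and the tiny sample document are assumptions made for this sketch; an in-memory buffer stands in for the output file.

```python
import csv
import io


def iter_keys(data):
    """Yield every dict key found in nested dicts and lists, in document order."""
    if isinstance(data, dict):
        for key, value in data.items():
            yield key
            yield from iter_keys(value)
    elif isinstance(data, list):
        for element in data:
            yield from iter_keys(element)


# Hypothetical miniature of the document in the question.
sample = {"Document": {"Version": "V007",
                       "Shipment": [{"ShipmentHeader": {"ShipmentID": 1}},
                                    {"ShipmentHeader": {"ShipmentID": 2}}]}}

all_keys = list(iter_keys(sample))            # duplicates kept
unique_keys = list(dict.fromkeys(all_keys))   # de-duplicated, first-seen order

buffer = io.StringIO()  # stands in for open("output.csv", "w", newline="")
writer = csv.writer(buffer)
for key in unique_keys:
    writer.writerow([key])
print(buffer.getvalue())
```

Using dict.fromkeys rather than a set removes duplicates while keeping the keys in the order they first appear.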

How can I extract data from json

I want to extract a code from JSON data.
import json
json_data = '''
{
"Body": {
"stkCallback": {
"MerchantRequestID": "22531-976234-1",
"CheckoutRequestID": "ws_CO_DMZ_250600506_23022019144745852",
"ResultCode": 0,
"ResultDesc": "The service request is processed successfully.",
"CallbackMetadata": {
"Item": [
{
"Name": "Amount",
"Value": 1.0
},
{
"Name": "MpesaReceiptNumber",
"Value": "NBN52K8A1J"
},
{
"Name": "Balance"
},
{
"Name": "TransactionDate",
"Value": 20190223144807
},
{
"Name": "PhoneNumber",
"Value": 254725696042
}
]
}
}
}
}
'''
json_da = data['Body']
list_data = data['Body']['MpesaReceiptNumber']
print (json_da)
print (list_data)
I want to print this: NBN52K8A1J
The problem is that you have a list of dicts you need to search first:
>>> for obj in data['Body']['stkCallback']['CallbackMetadata']['Item']:
...     print(obj)
...
{'Name': 'Amount', 'Value': 1.0}
{'Name': 'MpesaReceiptNumber', 'Value': 'NBN52K8A1J'}
{'Name': 'Balance'}
{'Name': 'TransactionDate', 'Value': 20190223144807}
{'Name': 'PhoneNumber', 'Value': 254725696042}
One possibility is
>>> [x['Value'] for x in data['Body']['stkCallback']['CallbackMetadata']['Item'] if x['Name'] == 'MpesaReceiptNumber'][0]
'NBN52K8A1J'
Just use the json library. Then you can print the inner elements:
import json
json_data = '{"Body":{"stkCallback":{"MerchantRequestID":"22531-976234-1","CheckoutRequestID":"ws_CO_DMZ_250600506_23022019144745852","ResultCode":0,"ResultDesc":"The service request is processed successfully.","CallbackMetadata":{"Item":[{"Name":"Amount","Value":1.00},{"Name":"MpesaReceiptNumber","Value":"NBN52K8A1J"},{"Name":"Balance"},{"Name":"TransactionDate","Value":20190223144807},{"Name":"PhoneNumber","Value":254725696042}]}}}}'
a = json.loads(json_data)
print(a["Body"]["stkCallback"]["CallbackMetadata"]["Item"][1]["Value"])
You were almost there. You just have to look at each dict in the list and check whether its Name is the one you wanted:
data = json.loads(json_data)
list_data = data['Body']["stkCallback"]['CallbackMetadata']['Item']
var = None  # stays None if no item matches
for x in list_data:
    if x['Name'] == 'MpesaReceiptNumber':
        var = x['Value']
        break
print(var)
You can reuse this by replacing the hard-coded name in the if check with a variable, so the value you grab depends on that variable.
I find that using pprint to see the shape of the data structure helps when you're learning how to navigate it.
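One way to make the lookup reusable is to wrap the loop in a small function that takes the name as a parameter. The function name get_item_value and its default parameter are made up for this sketch; the list is the "Item" array from the question.

```python
def get_item_value(items, name, default=None):
    """Return the "Value" of the first item whose "Name" equals `name`."""
    for item in items:
        if item.get("Name") == name:
            # .get() because some entries (e.g. "Balance") have no "Value"
            return item.get("Value", default)
    return default


# The "Item" list from the question's JSON.
items = [
    {"Name": "Amount", "Value": 1.0},
    {"Name": "MpesaReceiptNumber", "Value": "NBN52K8A1J"},
    {"Name": "Balance"},
    {"Name": "TransactionDate", "Value": 20190223144807},
    {"Name": "PhoneNumber", "Value": 254725696042},
]

print(get_item_value(items, "MpesaReceiptNumber"))
print(get_item_value(items, "Balance"))
print(get_item_value(items, "NoSuchName", default="n/a"))
```

The default is returned both when no item has the requested name and when the matching item has no "Value" key.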
import json
import pprint
json_data = '{"Body":{"stkCallback":{"MerchantRequestID":"22531-976234-1","CheckoutRequestID":"ws_CO_DMZ_250600506_23022019144745852","ResultCode":0,"ResultDesc":"The service request is processed successfully.","CallbackMetadata":{"Item":[{"Name":"Amount","Value":1.00},{"Name":"MpesaReceiptNumber","Value":"NBN52K8A1J"},{"Name":"Balance"},{"Name":"TransactionDate","Value":20190223144807},{"Name":"PhoneNumber","Value":254725696042}]}}}}'
data = json.loads(json_data)
pprint.pprint(data)
Results in:
{'Body': {'stkCallback': {'CallbackMetadata': {'Item': [{'Name': 'Amount', 'Value': 1.0},
{'Name': 'MpesaReceiptNumber', 'Value': 'NBN52K8A1J'},
{'Name': 'Balance'},
{'Name': 'TransactionDate', 'Value': 20190223144807},
{'Name': 'PhoneNumber', 'Value': 254725696042}]},
'CheckoutRequestID': 'ws_CO_DMZ_250600506_23022019144745852',
'MerchantRequestID': '22531-976234-1',
'ResultCode': 0,
'ResultDesc': 'The service request is processed successfully.'}}}
So you should be able to see that data["Body"]["stkCallback"]["CallbackMetadata"]["Item"] gets you to the depth you need for your data.
>>> pprint.pprint(data["Body"]["stkCallback"]["CallbackMetadata"]["Item"])
[{'Name': 'Amount', 'Value': 1.0},
{'Name': 'MpesaReceiptNumber', 'Value': 'NBN52K8A1J'},
{'Name': 'Balance'},
{'Name': 'TransactionDate', 'Value': 20190223144807},
{'Name': 'PhoneNumber', 'Value': 254725696042}]
So next you need to iterate through that list and find the entry (if one exists) whose Name is MpesaReceiptNumber.
receipt_no = None
for item in data["Body"]["stkCallback"]["CallbackMetadata"]["Item"]:
    if item.get('Name') == 'MpesaReceiptNumber':
        receipt_no = item.get('Value')
print(f"The receipt # is: {receipt_no}")
If you parse the json you will notice that the path to the data is not simply ['Body']['MpesaReceiptNumber']. In fact you have a list of dicts inside ['Item'] that needs to be searched.
One suggestion is to run the following code to find the data you are looking for:
import json
json_data = '{"Body":{"stkCallback":{"MerchantRequestID":"22531-976234-1","CheckoutRequestID":"ws_CO_DMZ_250600506_23022019144745852","ResultCode":0,"ResultDesc":"The service request is processed successfully.","CallbackMetadata":{"Item":[{"Name":"Amount","Value":1.00},{"Name":"MpesaReceiptNumber","Value":"NBN52K8A1J"},{"Name":"Balance"},{"Name":"TransactionDate","Value":20190223144807},{"Name":"PhoneNumber","Value":254725696042}]}}}}'
data = (json.loads(json_data))
list_data = data['Body']['stkCallback']['CallbackMetadata']['Item']
# Returns:
# [{'Name': 'Amount', 'Value': 1.0}, {'Name': 'MpesaReceiptNumber', 'Value':'NBN52K8A1J'}, {'Name': 'Balance'}, {'Name': 'TransactionDate', 'Value': 20190223144807}, {'Name': 'PhoneNumber', 'Value': 254725696042}]
# Now find Name: 'MpesaReceiptNumber' inside the dict list
find_it = next(item for item in list_data if item["Name"] == "MpesaReceiptNumber")
find_it = find_it['Value']
print (find_it)
Result
NBN52K8A1J
Use jq.
First off, it can "pretty print" any JSON data.
Put the value of json_data into a file test.json, and then show the formatted output of the JSON data with:
$ jq . <test.json
{
"Body": {
"stkCallback": {
"MerchantRequestID": "22531-976234-1",
"CheckoutRequestID": "ws_CO_DMZ_250600506_23022019144745852",
"ResultCode": 0,
"ResultDesc": "The service request is processed successfully.",
"CallbackMetadata": {
"Item": [
{
"Name": "Amount",
"Value": 1
},
{
"Name": "MpesaReceiptNumber",
"Value": "NBN52K8A1J"
},
{
"Name": "Balance"
},
{
"Name": "TransactionDate",
"Value": 20190223144807
},
{
"Name": "PhoneNumber",
"Value": 254725696042
}
]
}
}
}
}
Next, to extract values, a selector path needs to be given on the jq command line:
$ jq '.Body.stkCallback.CallbackMetadata.Item|.[]|select(.Name == "MpesaReceiptNumber")|.Value' test.json
"NBN52K8A1J"
Now to make this sequence easier to understand, let's break it down component by component.
To extract and return only the .Body:
$ jq '.Body' <test.json
{
"stkCallback": {
"MerchantRequestID": "22531-976234-1",
"CheckoutRequestID": "ws_CO_DMZ_250600506_23022019144745852",
"ResultCode": 0,
"ResultDesc": "The service request is processed successfully.",
"CallbackMetadata": {
"Item": [
{
"Name": "Amount",
"Value": 1
},
{
"Name": "MpesaReceiptNumber",
"Value": "NBN52K8A1J"
},
{
"Name": "Balance"
},
{
"Name": "TransactionDate",
"Value": 20190223144807
},
{
"Name": "PhoneNumber",
"Value": 254725696042
}
]
}
}
}
Now let's fetch the stkCallback component:
$ jq '.Body.stkCallback' <test.json
{
"MerchantRequestID": "22531-976234-1",
"CheckoutRequestID": "ws_CO_DMZ_250600506_23022019144745852",
"ResultCode": 0,
"ResultDesc": "The service request is processed successfully.",
"CallbackMetadata": {
"Item": [
{
"Name": "Amount",
"Value": 1
},
{
"Name": "MpesaReceiptNumber",
"Value": "NBN52K8A1J"
},
{
"Name": "Balance"
},
{
"Name": "TransactionDate",
"Value": 20190223144807
},
{
"Name": "PhoneNumber",
"Value": 254725696042
}
]
}
}
OK, now the CallbackMetadata:
$ jq '.Body.stkCallback.CallbackMetadata' <test.json
{
"Item": [
{
"Name": "Amount",
"Value": 1
},
{
"Name": "MpesaReceiptNumber",
"Value": "NBN52K8A1J"
},
{
"Name": "Balance"
},
{
"Name": "TransactionDate",
"Value": 20190223144807
},
{
"Name": "PhoneNumber",
"Value": 254725696042
}
]
}
Next, the Item part:
$ jq '.Body.stkCallback.CallbackMetadata.Item' <test.json
[
{
"Name": "Amount",
"Value": 1
},
{
"Name": "MpesaReceiptNumber",
"Value": "NBN52K8A1J"
},
{
"Name": "Balance"
},
{
"Name": "TransactionDate",
"Value": 20190223144807
},
{
"Name": "PhoneNumber",
"Value": 254725696042
}
]
Notice that the result is an array of objects? Let's unpack the elements of the array:
$ jq '.Body.stkCallback.CallbackMetadata.Item|.[]' <test.json
{
"Name": "Amount",
"Value": 1
}
{
"Name": "MpesaReceiptNumber",
"Value": "NBN52K8A1J"
}
{
"Name": "Balance"
}
{
"Name": "TransactionDate",
"Value": 20190223144807
}
{
"Name": "PhoneNumber",
"Value": 254725696042
}
Now the result is just a stream of objects, each with a "Name" and (usually) a "Value". So, let's select just the one you wanted:
$ jq '.Body.stkCallback.CallbackMetadata.Item|.[]|select(.Name == "MpesaReceiptNumber")' <test.json
{
"Name": "MpesaReceiptNumber",
"Value": "NBN52K8A1J"
}
Cool. We've got the object we wanted. Let's extract just the value now:
$ jq '.Body.stkCallback.CallbackMetadata.Item|.[]|select(.Name == "MpesaReceiptNumber")|.Value' <test.json
"NBN52K8A1J"
And, there you go.
Simplest way to get the value associated with a specific CallbackMetadata item name:
import json
json_string = '''
{
"Body": {
"stkCallback": {
...
}
'''
json_data = json.loads(json_string)
for item in json_data["Body"]["stkCallback"]["CallbackMetadata"]["Item"]:
    if item["Name"] == "MpesaReceiptNumber":
        print(item["Value"]) # -> NBN52K8A1J
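If several values are needed from the same callback, another option is to build a name-to-value dictionary once and then look values up by name. This is only a sketch: the variable name values_by_name is made up, and the list below is the "Item" array from the question typed in directly.

```python
# The "Item" array from the question, typed in directly for this sketch.
items = [
    {"Name": "Amount", "Value": 1.0},
    {"Name": "MpesaReceiptNumber", "Value": "NBN52K8A1J"},
    {"Name": "Balance"},
    {"Name": "TransactionDate", "Value": 20190223144807},
    {"Name": "PhoneNumber", "Value": 254725696042},
]

# Map each "Name" to its "Value"; entries without a "Value" map to None.
values_by_name = {item["Name"]: item.get("Value") for item in items}

print(values_by_name["MpesaReceiptNumber"])
print(values_by_name["Balance"])
```

This assumes the names are unique within the list; if a name repeats, the last occurrence wins.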

Flatten nested json to csv with nested column names

I have a rather unusual requirement. I have the JSON below and need to convert it into a flat CSV.
[
{
"authorizationQualifier": "SDA",
"authorizationInformation": " ",
"securityQualifier": "ASD",
"securityInformation": " ",
"senderQualifier": "ASDAD",
"senderId": "FADA ",
"receiverQualifier": "ADSAS",
"receiverId": "ADAD ",
"date": "140101",
"time": "0730",
"standardsId": null,
"version": "00501",
"interchangeControlNumber": "123456789",
"acknowledgmentRequested": "0",
"testIndicator": "T",
"functionalGroups": [
{
"functionalIdentifierCode": "ADSAD",
"applicationSenderCode": "ASDAD",
"applicationReceiverCode": "ADSADS",
"date": "20140101",
"time": "07294900",
"groupControlNumber": "123456789",
"responsibleAgencyCode": "X",
"version": "005010X221A1",
"transactions": [
{
"name": "ASDADAD",
"transactionSetIdentifierCode": "adADS",
"transactionSetControlNumber": "123456789",
"implementationConventionReference": null,
"segments": [
{
"BPR03": "ad",
"BPR14": "QWQWDQ",
"BPR02": "1.57",
"BPR13": "23223",
"BPR01": "sad",
"BPR12": "56",
"BPR10": "32424",
"BPR09": "12313",
"BPR08": "DA",
"BPR07": "123456789",
"BPR06": "12313",
"BPR05": "ASDADSAD",
"BPR16": "21313",
"BPR04": "SDADSAS",
"BPR15": "11212",
"id": "aDSASD"
},
{
"TRN02": "2424",
"TRN03": "35435345",
"TRN01": "3435345",
"id": "FSDF"
},
{
"REF02": "fdsffs",
"REF01": "sfsfs",
"id": "fsfdsfd"
},
{
"DTM02": "2432424",
"id": "sfsfd",
"DTM01": "234243"
}
],
"loops": [
{
"id": "24324234234",
"segments": [
{
"N101": "sfsfsdf",
"N102": "sfsf",
"id": "dgfdgf"
},
{
"N301": "sfdssfdsfsf",
"N302": "effdssf",
"id": "fdssf"
},
{
"N401": "sdffssf",
"id": "sfds",
"N402": "sfdsf",
"N403": "23424"
},
{
"PER06": "Wsfsfdsfsf",
"PER05": "sfsf",
"PER04": "23424",
"PER03": "fdfbvcb",
"PER02": "Pedsdsf",
"PER01": "sfsfsf",
"id": "fdsdf"
}
]
},
{
"id": "2342",
"segments": [
{
"N101": "sdfsfds",
"N102": "vcbvcb",
"N103": "dsfsdfs",
"N104": "343443",
"id": "fdgfdg"
},
{
"N401": "dfsgdfg",
"id": "dfgdgdf",
"N402": "dgdgdg",
"N403": "234244"
},
{
"REF02": "23423342",
"REF01": "fsdfs",
"id": "sfdsfds"
}
]
}
]
}
]
}
]
}
]
The column header for a deeper key-value pair may take a nested form, like functionalGroups[0].transactions[0].segments[0].BPR15.
I am able to do this in Java in one line using this GitHub project (its explanation shows the output format I want):
flatJson = JSONFlattener.parseJson(new File("files/simple.json"), "UTF-8");
The output was:
date,securityQualifier,testIndicator,functionalGroups[1].functionalIdentifierCode,functionalGroups[1].date,functionalGroups[1].applicationReceiverCode, ...
140101,00,T,HP,20140101,ETIN,...
But I want to do this in Python. I tried what this answer suggests:
import json
from pandas import json_normalize  # top-level import available in pandas >= 1.0
with open('data.json') as data_file:
    data = json.load(data_file)
df = json_normalize(data, record_prefix=True)
with open('temp2.csv', "w", newline='\n') as csv_file:
    csv_file.write(df.to_csv())
However, for column functionalGroups, it dumps json as a cell value.
I also tried what this other answer suggests:
with open('data.json') as f: # this ensures opening and closing file
    a = json.loads(f.read())
df = pandas.DataFrame(a)
print(df.transpose())
But this also seems to do the same:
0
acknowledgmentRequested 0
authorizationInformation
authorizationQualifier SDA
date 140101
functionalGroups [{'functionalIdentifierCode': 'ADSAD', 'applic...
interchangeControlNumber 123456789
receiverId ADAD
receiverQualifier ADSAS
securityInformation
securityQualifier ASD
senderId FADA
senderQualifier ASDAD
standardsId None
testIndicator T
time 0730
version 00501
Is it possible to do what I want in Python?
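One possible approach in plain Python is a recursive flattener that builds the column name as it descends: ".key" for dict entries and "[index]" for list elements. This is a minimal sketch rather than a port of the Java project. The function name flatten and the small sample record are assumptions, list indices are 0-based, and an in-memory buffer stands in for the output file.

```python
import csv
import io


def flatten(value, prefix=""):
    """Return {path: scalar} for nested dicts/lists, with paths like 'a[0].b'."""
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            flat.update(flatten(child, path))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            flat.update(flatten(child, f"{prefix}[{index}]"))
    else:
        flat[prefix] = value  # scalar leaf (str, number, bool or None)
    return flat


# Hypothetical, much smaller record with the same nesting pattern as the question.
records = [{
    "date": "140101",
    "standardsId": None,
    "functionalGroups": [{
        "version": "005010X221A1",
        "transactions": [{"segments": [{"BPR15": "11212", "id": "aDSASD"}]}],
    }],
}]

rows = [flatten(record) for record in records]
# Union of all column names across rows, in first-seen order.
columns = list(dict.fromkeys(col for row in rows for col in row))

buffer = io.StringIO()  # stands in for open("out.csv", "w", newline="")
writer = csv.DictWriter(buffer, fieldnames=columns)
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Rows that lack a column are written with an empty cell, and None values also come out empty. Empty dicts and empty lists produce no column at all in this sketch.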

python append dictionary to list

According to this post, I need to use .copy() on a dictionary, if I want to reference a dictionary which gets updated in a loop (instead of always referencing the same dictionary). However, in my code example below this doesn't seem to work:
main.py:
import collections
import json
nodes_list = ['donald', 'daisy', 'mickey', 'minnie']
edges_list = [('donald', 'daisy', '3'), ('mickey', 'minnie', '3'), ('daisy', 'minnie', '2')]
node_dict, edge_dict = collections.defaultdict(dict), collections.defaultdict(dict)
ultimate_list = []
for n in nodes_list:
    node_dict["data"]["id"] = str(n)
    ultimate_list.append(node_dict.copy())
for e in edges_list:
    edge_dict["data"]["id"] = str(e[2])
    edge_dict["data"]["source"] = e[0]
    edge_dict["data"]["target"] = e[1]
    ultimate_list.append(edge_dict.copy())
print(json.dumps(ultimate_list, indent=2))
As a result here I get the following:
[
{
"data": {
"id": "minnie"
}
},
{
"data": {
"id": "minnie"
}
},
{
"data": {
"id": "minnie"
}
},
{
"data": {
"id": "minnie"
}
},
{
"data": {
"target": "minnie",
"id": "2",
"source": "daisy"
}
},
{
"data": {
"target": "minnie",
"id": "2",
"source": "daisy"
}
},
{
"data": {
"target": "minnie",
"id": "2",
"source": "daisy"
}
}
]
Whereas I would actually expect to get this:
[
{
"data": {
"id": "donald"
}
},
{
"data": {
"id": "daisy"
}
},
{
"data": {
"id": "mickey"
}
},
{
"data": {
"id": "minnie"
}
},
{
"data": {
"target": "donald",
"id": "3",
"source": "daisy"
}
},
{
"data": {
"target": "mickey",
"id": "3",
"source": "minnie"
}
},
{
"data": {
"target": "minnie",
"id": "2",
"source": "daisy"
}
}
]
Can anyone please tell me what I'm doing wrong here?
dict.copy only makes a shallow copy of the dict; the nested dictionaries are never copied. You need deep copies to have those copied over too.
However, you can simply define each new dict at each iteration of the loop and append the new dict at that iteration instead:
for n in nodes_list:
    node_dict = collections.defaultdict(dict) # create new instance of data structure
    node_dict["data"]["id"] = str(n)
    ultimate_list.append(node_dict)
The same applies to the edge_dict:
for e in edges_list:
    edge_dict = collections.defaultdict(dict)
    ...
    ultimate_list.append(edge_dict)
Use copy.deepcopy(your_dict) from the standard copy module.
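To illustrate the difference, here is a small sketch with a made-up template dict that is reused across loop iterations, as in the question. The shallow copies all share one inner dict, while the deep copies each get their own.

```python
import copy

# Hypothetical template that is mutated and copied on every iteration.
template = {"data": {"id": None}}
shallow_list, deep_list = [], []

for name in ["donald", "daisy"]:
    template["data"]["id"] = name
    shallow_list.append(template.copy())        # new outer dict, same inner dict
    deep_list.append(copy.deepcopy(template))   # inner dict is copied as well

print(shallow_list)
print(deep_list)
```

Every entry of shallow_list ends up showing the last name, because they all point at the same nested "data" dict; deep_list keeps one name per entry.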
I see a few things. According to your desired results, your edges_list is a bit off.
Change:
('daisy', 'minnie', '2')
To:
('minnie', 'daisy', '2')
To create the data the way you would like in your desired output we can do this with a more basic approach to dicts.
If you are trying to match the desired results in your question, then you are using the wrong indices in your for e in edges_list loop.
It should be:
"target" : e[0]
"id" : str(e[2])
"source" : e[1]
First I removed
node_dict, edge_dict = collections.defaultdict(dict), collections.defaultdict(dict)
as it's not needed for my method.
Next I changed how you are defining the data.
Instead of using pre-defined dictionaries we can just append the results of each set of data to the ultimate_list directly. This shortens the code and is a bit easier to set up.
for n in nodes_list:
    ultimate_list.append({"data" : {"id" : str(n)}})
for e in edges_list:
    ultimate_list.append({"data" : {"target" : e[0], "id" : str(e[2]), "source" : e[1]}})
print(json.dumps(ultimate_list, indent=2))
So the following code:
import json
nodes_list = ['donald', 'daisy', 'mickey', 'minnie']
edges_list = [('donald', 'daisy', '3'), ('mickey', 'minnie', '3'), ('minnie', 'daisy', '2')]
ultimate_list = []
for n in nodes_list:
    ultimate_list.append({"data" : {"id" : str(n)}})
for e in edges_list:
    ultimate_list.append({"data" : {"target" : e[0], "id" : str(e[2]), "source" : e[1]}})
print(json.dumps(ultimate_list, indent=2))
Should result in:
[
{
"data": {
"id": "donald"
}
},
{
"data": {
"id": "daisy"
}
},
{
"data": {
"id": "mickey"
}
},
{
"data": {
"id": "minnie"
}
},
{
"data": {
"target": "donald",
"id": "3",
"source": "daisy"
}
},
{
"data": {
"target": "mickey",
"id": "3",
"source": "minnie"
}
},
{
"data": {
"target": "minnie",
"id": "2",
"source": "daisy"
}
}
]
