Compare two separate JSONs - python

I have a resulting JSON from an intermediate stage, as follows:
a=[{
"ID": "1201",
"SubID": "S1201",
"Information": {
"Name": "Kim",
"Age": "41"
}
}, {
"ID": "1433",
"subID": "G1433",
"Information": {
"Name": "John",
"Age": "32"
}
}]
I have another JSON that needs to be compared with the one above:
c= [{
"ID": "1201",
"SubID": "S1201"
},
{
"ID": "3211",
"subID": "G3211"
}
]
Since one JSON object in my intermediate result (a) is also present in the other JSON (c), I want to retain only the objects that appear in both.
Expected output:
[{
"ID": "1201",
"SubID": "S1201",
"Information": {
"Name": "Kim",
"Age": "41"
}
}]
I'm not clear on how to approach this. Please guide me. Thanks.

ids = [e['ID'] for e in c]
repeated = [e for e in a if e['ID'] in ids]
print(repeated)
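For larger inputs, the same idea can be sketched with a set, so that each membership test is constant time on average instead of a scan over a list. The lists a and c below are copied from the question; the name wanted_ids is only illustrative.

```python
a = [
    {"ID": "1201", "SubID": "S1201", "Information": {"Name": "Kim", "Age": "41"}},
    {"ID": "1433", "subID": "G1433", "Information": {"Name": "John", "Age": "32"}},
]
c = [
    {"ID": "1201", "SubID": "S1201"},
    {"ID": "3211", "subID": "G3211"},
]

# Collect the IDs from c once; set lookups are O(1) on average.
wanted_ids = {entry["ID"] for entry in c}

# Keep only the objects from a whose ID also appears in c.
repeated = [entry for entry in a if entry["ID"] in wanted_ids]
print(repeated)
```

Like the answer above, this matches on "ID" alone and ignores the other keys.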

Related

Referring to parts of large JSON files

I am currently trying to have python parse JSON similar to the one at https://petition.parliament.uk/petitions/560216.json.
My problem is that the data I need is nested in a lot of parts and I don't know how to tell python which part to take.
A simplified version of the data I need is below
{
"data": {
"attributes": {
"signatures_by_country": [
{
"name": "Afghanistan",
"code": "AF",
"signature_count": 1
},
{
"name": "Algeria",
"code": "DZ",
"signature_count": 2
}
]
}
}
}
I am trying to pull the "signature_count" part.

The code below collects the values you asked for into a list:
data = {
"data": {
"attributes": {
"signatures_by_country": [
{
"name": "Afghanistan",
"code": "AF",
"signature_count": 1
},
{
"name": "Algeria",
"code": "DZ",
"signature_count": 2
},
]
}
}
}
counts = [x['signature_count'] for x in data['data']['attributes']['signatures_by_country']]
print(counts)
Output:
[1, 2]
To get the count by country:
counts = [{x['name']: x['signature_count']} for x in data['data']['attributes']['signatures_by_country']]
print(counts)
Output:
[{'Afghanistan': 1}, {'Algeria': 2}]
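If a lookup by country name is what you need, a single dictionary may be more convenient than a list of one-item dictionaries. This is a minimal sketch using the simplified data from the question; count_by_name is an illustrative name, and "France" is just an example of a country missing from the sample.

```python
data = {
    "data": {
        "attributes": {
            "signatures_by_country": [
                {"name": "Afghanistan", "code": "AF", "signature_count": 1},
                {"name": "Algeria", "code": "DZ", "signature_count": 2},
            ]
        }
    }
}

# Walk down the nesting one key at a time to reach the list of countries.
countries = data["data"]["attributes"]["signatures_by_country"]

# Map each country name directly to its signature count.
count_by_name = {entry["name"]: entry["signature_count"] for entry in countries}

print(count_by_name)                   # {'Afghanistan': 1, 'Algeria': 2}
print(count_by_name.get("France", 0))  # 0, because the name is not in the sample
```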

Parse specific data from JSON

I have a JSON file with lots of data, and I want to keep only specific data.
I thought of reading the file, getting all the data I want, and saving it as a new JSON.
The JSON is like this:
{
"event": [
{
"date": "2019-01-01",
"location": "world",
"url": "www.com",
"comments": "null",
"country": "china",
"genre": "blues"
},
{
"date": "2000-01-01",
"location": "street x",
"url": "www.cn",
"comments": "null",
"country":"turkey",
"genre": "reds"
},
{...
and I want it to be like this (with just date and url from each event):
{
"event": [
{
"date": "2019-01-01",
"url": "www.com"
},
{
"date": "2000-01-01",
"url": "www.cn"
},
{...
I can open the JSON and read from it using
with open('xx.json') as f:
    data = json.load(f)
data2=data["events"]["date"]
But I still need to understand how to save the data I want in a new JSON while keeping its structure.

You can use a list comprehension to loop over the events and build, for each one, a dictionary containing only the keys that you want.
data = { "event": [
{
"date": "2019-01-01",
"location": "world",
"url": "www.com",
"comments": None,
"country": "china",
"genre": "blues",
},
{
"date": "2000-01-01",
"location": "street x",
"url": "www.cn",
"comments": None,
"country" :"turkey",
"genre":"reds",
}
]}
# List comprehension
data["event"] = [{"date": x["date"], "url": x["url"]} for x in data["event"]]
Alternatively, you can map a function over the events list:
keys_to_keep = ["date", "url"]
def subset_dict(d):
    # Build a new dict holding only the keys we want to keep
    return {x: d[x] for x in keys_to_keep}
data["event"] = list(map(subset_dict, data["event"]))
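To cover the "save as a new JSON" part of the question, the filtering above can be combined with json.load and json.dump. The following is a sketch under assumptions: the function name filter_events and the output file name filtered.json are made up for illustration, and a temporary directory stands in for wherever your real xx.json lives.

```python
import json
import os
import tempfile

KEYS_TO_KEEP = ("date", "url")

def filter_events(in_path, out_path):
    """Read in_path, keep only KEYS_TO_KEEP in each event, write out_path."""
    with open(in_path) as f:
        data = json.load(f)
    # Same structure as the input: a top-level "event" list of dictionaries.
    data["event"] = [{k: e[k] for k in KEYS_TO_KEEP} for e in data["event"]]
    with open(out_path, "w") as f:
        json.dump(data, f, indent=2)
    return data

# Demo: write a small sample file first so the example is self-contained.
sample = {"event": [
    {"date": "2019-01-01", "location": "world", "url": "www.com",
     "comments": "null", "country": "china", "genre": "blues"},
    {"date": "2000-01-01", "location": "street x", "url": "www.cn",
     "comments": "null", "country": "turkey", "genre": "reds"},
]}
tmp_dir = tempfile.mkdtemp()
in_path = os.path.join(tmp_dir, "xx.json")
out_path = os.path.join(tmp_dir, "filtered.json")
with open(in_path, "w") as f:
    json.dump(sample, f)

result = filter_events(in_path, out_path)
print(result)
```

Writing to a separate output file leaves the original untouched, which makes it easy to rerun with a different set of keys.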

Group and sort JSON array of dictionaries by repeatable keys in Python

I have a JSON that is a list of dictionaries, which I am getting from MySQL with pymysql. It looks like this:
[{
"id": "123",
"name": "test",
"group": "test_group"
},
{
"id": "123",
"name": "test",
"group": "test2_group"
},
{
"id": "456",
"name": "test2",
"group": "test_group2"
},
{
"id": "456",
"name": "test2",
"group": "test_group3"
}]
I need to group it so that each "name" has just one dict, containing a list of all the groups under that name,
something like this:
[{
"id": "123",
"name": "test",
"group": ["test2_group", "test_group"]
},
{
"id": "456",
"name": "test2",
"group": ["test_group2", "test_group3"]
}]
I would like to get some help,
Thanks !

You can use itertools.groupby to group the data.
I don't guarantee that the solution below is the shortest way, but it should do the job.
from itertools import groupby
# Your input data (the list of dictionaries from the question)
data = []
res = []
key_func = lambda k: k['id']
# groupby only merges adjacent items, so sort by the same key first
for k, g in groupby(sorted(data, key=key_func), key=key_func):
    obj = {'id': k, 'name': '', 'group': []}
    for group in g:
        # Take the name from the first row of each group
        if not obj['name']:
            obj['name'] = group['name']
        obj['group'].append(group['group'])
    res.append(obj)
print(res)
It should print the data in the required format.
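If sorting is not needed, a plain dictionary keyed by id gives the same grouping in a single pass. This is an alternative sketch, not the groupby approach above; the function name group_by_id is illustrative, and the rows are the sample data from the question.

```python
def group_by_id(rows):
    """Merge rows that share an id, collecting their groups into a list."""
    grouped = {}
    for row in rows:
        # Create the merged entry the first time an id is seen.
        entry = grouped.setdefault(
            row["id"], {"id": row["id"], "name": row["name"], "group": []}
        )
        entry["group"].append(row["group"])
    # Dicts preserve insertion order, so ids come out in first-seen order.
    return list(grouped.values())

rows = [
    {"id": "123", "name": "test", "group": "test_group"},
    {"id": "123", "name": "test", "group": "test2_group"},
    {"id": "456", "name": "test2", "group": "test_group2"},
    {"id": "456", "name": "test2", "group": "test_group3"},
]
result = group_by_id(rows)
print(result)
```

Groups appear in the order the rows arrive; call sorted() on each list if you need them ordered.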

Extracting elements from json in python

I have the following json:
{
"request": {
"id": "123",
"url": "/aa/bb/cc",
"method": "GET",
"timestamp": "2018-08-09T08:41:38.432Z"
},
"response": {
"status": {
"code": 200,
"message": "OK"
},
"items": [
{
"id": "aaa",
"name": "w1"
},
{
"id": "bbb",
"name": "w2"
},
{
"id": "ccc",
"name": "w3"
}
]
}
}
I need to loop over items and print each name. I've tried the following code, which doesn't work.
response = requests.get(url)
data = json.loads(response.content)
for group in data['response']['items']:
    print data['response']['items'][group]['name']
When I replace group with 0, for example, I can access the first name:
data['response']['items'][0]['name']
However, I don't know in advance how many elements are in the array.

As Joel mentioned, in the for loop,
for group in data['response']['items']:
each iteration assigns group one element of data['response']['items'], not an index. So group takes, in turn, each of the dictionaries in this list:
[
    {
        "id": "aaa",
        "name": "w1"
    },
    {
        "id": "bbb",
        "name": "w2"
    },
    {
        "id": "ccc",
        "name": "w3"
    }
]
So all you need to do inside the loop is
print(group['name'])
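Put together, a complete version might look like the sketch below. To keep it runnable without a network call, it parses a trimmed copy of the JSON from the question out of a string (the variable raw is an assumption standing in for response.content).

```python
import json

# Trimmed copy of the response body from the question.
raw = """
{
  "response": {
    "status": {"code": 200, "message": "OK"},
    "items": [
      {"id": "aaa", "name": "w1"},
      {"id": "bbb", "name": "w2"},
      {"id": "ccc", "name": "w3"}
    ]
  }
}
"""

data = json.loads(raw)

# Each loop variable is one item dictionary, so no index is needed.
for item in data["response"]["items"]:
    print(item["name"])

# The same traversal as a list, if the names are needed later.
names = [item["name"] for item in data["response"]["items"]]
```

This works for any number of items, since the loop simply runs once per element.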

You can use the pandas module and call its read_json function.
import pandas as pd
df = pd.read_json("your_json_file.json")
for i in df.response['items']:
    print(i['name'])
# w1
# w2
# w3

You could try this, iterating by index over the parsed JSON:
for i in range(len(data['response']['items'])):
    print(data['response']['items'][i]['name'])
Output:
w1
w2
w3

Find a value in JSON using Python

I’ve previously succeeded in parsing data from a JSON file, but now I’m facing a problem with the function I want to achieve. I have a list of names, identification numbers and birthdate in a JSON. What I want to get in Python is to be able to let a user input a name and retrieve his identification number and the birthdate (if present).
This is my JSON example file:
[
{
"id_number": "SA4784",
"name": "Mark",
"birthdate": null
},
{
"id_number": "V410Z8",
"name": "Vincent",
"birthdate": "15/02/1989"
},
{
"id_number": "CZ1094",
"name": "Paul",
"birthdate": "27/09/1994"
}
]
To be clear, I want to input "V410Z8" and get his name and his birthdate.
I tried to write some code in Python but I only succeed in searching for “id_number” and not for what is inside “id_number” for example "V410Z8".
#!/usr/bin/python
# -*- coding: utf-8 -*-
import json
database = "example.json"
data = json.loads(open(database).read())
id_number = data[0]["id_number"]
print id_number
Thank you for your support, guys :)

You have to iterate over the list of dictionaries and search for the one with the given id_number. Once you find it you can print the rest of its data and break, assuming id_number is unique.
data = [
{
"id_number": "SA4784",
"name": "Mark",
"birthdate": None
},
{
"id_number": "V410Z8",
"name": "Vincent",
"birthdate": "15/02/1989"
},
{
"id_number": "CZ1094",
"name": "Paul",
"birthdate": "27/09/1994"
}
]
for i in data:
    if i['id_number'] == 'V410Z8':
        print(i['birthdate'])
        print(i['name'])
        break
If you have control over the data structure, a more efficient way would be to use the id_number as a key (again, assuming id_number is unique):
data = { "SA4784" : {"name": "Mark", "birthdate": None},
"V410Z8" : { "name": "Vincent", "birthdate": "15/02/1989"},
"CZ1094" : {"name": "Paul", "birthdate": "27/09/1994"}
}
Then all you need to do is try to access it directly:
try:
    print(data["V410Z8"]["name"])
except KeyError:
    print("ID doesn't exist")
# Vincent
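If the file has to stay a list, the keyed structure can still be built once after loading. The sketch below assumes id_number is unique, as above; the names records and by_id are illustrative.

```python
records = [
    {"id_number": "SA4784", "name": "Mark", "birthdate": None},
    {"id_number": "V410Z8", "name": "Vincent", "birthdate": "15/02/1989"},
    {"id_number": "CZ1094", "name": "Paul", "birthdate": "27/09/1994"},
]

# Build the index once; every later lookup is a single dictionary access.
by_id = {record["id_number"]: record for record in records}

# dict.get returns None instead of raising KeyError for an unknown ID.
person = by_id.get("V410Z8")
if person is None:
    print("ID doesn't exist")
else:
    print(person["name"], person["birthdate"])
```

Building the index costs one pass over the list, so it pays off when several lookups are made against the same data.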

Using lambda in Python
data = [
{
"id_number": "SA4784",
"name": "Mark",
"birthdate": None
},
{
"id_number": "V410Z8",
"name": "Vincent",
"birthdate": "15/02/1989"
},
{
"id_number": "CZ1094",
"name": "Paul",
"birthdate": "27/09/1994"
}
]
Using Lambda and filter
print(list(filter(lambda x:x["id_number"]=="CZ1094",data)))
Output
[{'id_number': 'CZ1094', 'name': 'Paul', 'birthdate': '27/09/1994'}]

You can use list comprehension:
Given
data = [
{
"id_number": "SA4784",
"name": "Mark",
"birthdate": None
},
{
"id_number": "V410Z8",
"name": "Vincent",
"birthdate": "15/02/1989"
},
{
"id_number": "CZ1094",
"name": "Paul",
"birthdate": "27/09/1994"
}
]
to get the list item(s) with id_number equal to "V410Z8" you may use:
result = [x for x in data if x["id_number"]=="V410Z8"]
result will contain:
[{'id_number': 'V410Z8', 'name': 'Vincent', 'birthdate': '15/02/1989'}]
In case the if condition is not satisfied, result will contain an empty list: []

data = [
{
"id_number": "SA4784",
"name": "Mark",
"birthdate": None
},
{
"id_number": "V410Z8",
"name": "Vincent",
"birthdate": "14/02/1989"
},
{
"id_number": "CZ1093",
"name": "Paul",
"birthdate": "26/09/1994"
}
]
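# map() cannot drop items, so use filter() to keep only the matching entries.
# The comparison is case-sensitive, so the ID must be written as "CZ1093".
print(list(filter(lambda x: x["id_number"] == "CZ1093", data)))
Output
[{'id_number': 'CZ1093', 'name': 'Paul', 'birthdate': '26/09/1994'}]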

If you are only interested in one result, or a subset of the results, I'd suggest a generator function. It is lazy, so it stops iterating as soon as you stop asking for results, and it is more memory efficient because it never builds a full list:
def gen_func(data, search_term):
    for i in data:
        if i['id_number'] == search_term:
            yield i
You can then run the following to retrieve results for CZ1094:
foo = gen_func(data, 'CZ1094')
next(foo)
{'id_number': 'CZ1094', 'name': 'Paul', 'birthdate': '27/09/1994'}
NB: You'll need to handle StopIteration at end of iterable.
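One way to avoid handling StopIteration by hand is to pass a default to next(). This is a minimal sketch; the helper name find_first is made up for illustration, and the data is the sample list from the question.

```python
data = [
    {"id_number": "SA4784", "name": "Mark", "birthdate": None},
    {"id_number": "V410Z8", "name": "Vincent", "birthdate": "15/02/1989"},
    {"id_number": "CZ1094", "name": "Paul", "birthdate": "27/09/1994"},
]

def find_first(records, id_number):
    """Return the first record with the given id_number, or None."""
    # The generator expression is lazy: iteration stops at the first match.
    matches = (r for r in records if r["id_number"] == id_number)
    # The second argument to next() is returned when nothing matches.
    return next(matches, None)

found = find_first(data, "CZ1094")
missing = find_first(data, "NOPE00")
print(found)
print(missing)
```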
