How to convert raw json to required format in pythonic way - python

I have json from some service, where each value is different row.
Input example:
[
{'author': 'alf', 'topic': 'topic1', 'lang': 'ge', 'value': 11},
{'author': 'alf', 'topic': 'topic1', 'lang': 'ge', 'value': 22},
{'author': 'bob', 'topic': 'topic1', 'lang': 'ge', 'value': 33},
{'author': 'bob', 'topic': 'topic1', 'lang': 'ge', 'value': 44},
{'author': 'alf', 'topic': 'topic1', 'lang': 'fr', 'value': 99},
{'author': 'alf', 'topic': 'topic2', 'lang': 'ge', 'value': -20},
]
Output example:
{
'alf': {
'topic1': [
{'ge': [11, 22]},
{'fr': [99]}
],
'topic2': [
{'ge': [-20]}
]
},
'bob': {
'topic1': [
{'ge': [33, 44]}
]
}
}
So basically this is simple transformation via grouping specified keys to collect all values in to one array.
I done this transformation via checking and creating required key if it is missing:
for entry in self._raw_data:
parsed = {}
author = entry["author"]
topic = entry["topic"]
lang = entry["lang"]
value = entry["value"]
if not parsed.get(author):
parsed[author] = {}
if not parsed[author].get(topic):
parsed[author][topic] = []
#etc
I am sure, that could be done in more transparent way. Can anyone recommend something?

If you're willing to change the type of "topic"'s value from list to dict, you can use .setdefault():
res = {}
for entry in raw_data:
res.setdefault(entry['author'], {}).setdefault(entry["topic"], {}).setdefault(entry["lang"], []).append(entry["value"])
OUTPUT:
{
"alf": {
"topic1": {
"fr": [99],
"ge": [11, 22]
},
"topic2": {
"ge": [-20]
}
},
"bob": {
"topic1": {
"ge": [33, 44]
}
}
}

Related

TimestampedGeoJson with MultiPolygon shows Time Not Available

I want to display a shape over the Canada map.
The idea is 2 shapes in different years.
But my slide at the end says:
"Time Not Available"
I tried to find here at the community, but I haven't found a problem like it.
Here you can find my file and here you can find my code:
import folium
from folium.plugins import TimestampedGeoJson
import json
import pandas as pd
with open('outputfile.json') as f:
poly = json.load(f)
features = [
{
'type': 'Feature',
'geometry': {
'type': 'MultiPolygon',
'coordinates': pol['coordinates'],
},
'properties': {
'ABBREVNAME': pol['ABBREVNAME'],
'time': pol['date'],
}
} for pol in poly
]
mapa = folium.Map(
location = [56.130,-106.35],
tiles='openstreetmap',
zoom_start = 3
)
TimestampedGeoJson({'type': 'FeatureCollection', 'features': features}).add_to(mapa)
mapa
Thanks!!!
I had the same problem of yours of time not available and solved by following the example on the documentation and this other post.
Basically, some key points from doc:
1- It's is not 'time' but "times" and it must be the same length of the list of coordinates
2- Lookout for time format it only takes ISO or ms epoch
enter image description here
Here is a code example of a store location code i was working, I didn't try Polygon yet but hope it helps u:
m = folium.Map([-23.579782, -46.687754], zoom_start=6, tiles="cartodbpositron")
TimestampedGeoJson({
'type': 'FeatureCollection',
'features': [
{
'type': 'Feature',
'geometry': {
'type': 'LineString',
'coordinates': [[-46.687754, -23.579782]],
},
'properties': {
'icon': 'marker',
'iconstyle': {
'iconSize': [20, 20],
'iconUrl':
'https://img.icons8.com/ios-filled/50/000000/online-store.png'
},
'id': 'house',
'popup': 1,
'times': [1633046400000.0]
}
}, {
'type': 'Feature',
'geometry': {
'type': 'LineString',
'coordinates': [[-46.887754, -23.579782]],
},
'properties': {
'icon': 'marker',
'iconstyle': {
'iconSize': [20, 20],
'iconUrl':
'https://img.icons8.com/ios-filled/50/000000/online-store.png'
},
'id': 'house',
'popup': 1,
'times': [1635046400000.0]
}
}
]
}).add_to(m)
folium_static(m)
m.save('test.html')

Mapping JSON key-value pairs from source to destination using Python

Using Python requests I want to grab a piece of JSON from one source and post it to a destination. The structure of the JSON received and the one required by the destination, however, differs a bit so my question is, how do I best map the items from the source structure onto the destination structure?
To illustrate, imagine we get a list of all purchases made by John and Mary. And now we want to post the individual items purchased linking these to the individuals who purchased them (NOTE: The actual use case involves thousands of entries so I am looking for an approach that would scale accordingly):
Source JSON:
{
'Total Results': 2,
'Results': [
{
'Name': 'John',
'Age': 25,
'Purchases': [
{
'Fruits': {
'Type': 'Apple',
'Quantity': 3,
'Color': 'Red'}
},
{
'Veggie': {
'Type': 'Salad',
'Quantity': 2,
'Color': 'Green'
}
}
]
},
{
'Name': 'Mary',
'Age': 20,
'Purchases': [
{
'Fruits': {
'Type': 'Orange',
'Quantity': 2,
'Color': 'Orange'
}
}
]
}
]
}
Destination JSON:
{
[
{
'Purchase': 'Apple',
'Purchased by': 'John',
'Quantity': 3,
'Type': 'Red',
},
{
'Purchase': 'Salad',
'Purchased by': 'John',
'Quantity': 2,
'Type': 'Green',
},
{
'Purchase': 'Orange',
'Purchased by': 'Mary',
'Quantity': 2,
'Type': 'Orange',
}
]
}
Any help on this would be greatly appreciated! Cheers!
Just consider loop through the dict.
res = []
for result in d['Results']:
value = {}
for purchase in result['Purchases']:
item = list(purchase.values())[0]
value['Purchase'] = item['Type']
value['Purchased by'] = result['Name']
value['Quantity'] = item['Quantity']
value['Type'] = item['Color']
res.append(value)
pprint(res)
[{'Purchase': 'Apple', 'Purchased by': 'John', 'Quantity': 3, 'Type': 'Red'},
{'Purchase': 'Salad', 'Purchased by': 'John', 'Quantity': 2, 'Type': 'Green'},
{'Purchase': 'Orange', 'Purchased by': 'Mary', 'Quantity': 2, 'Type': 'Orange'}]

Create avro schema for python dict

I want to create an avro-schema for following python-dictionary:
d = {
'topic': 'example',
'content': (
{ 'description': {'name': 'alex', 'value': 12}, 'id': '234ba' },
{ 'description': {'name': 'john', 'value': 14}, 'id': '823cx' }
)
}
How can I do this?
Have you tried to use the default serialization and deserialization included in the avro library for python?
https://avro.apache.org/docs/1.10.0/gettingstartedpython.html
Verify that is what you want

Loop through multidimensional JSON (Python)

I have a JSON with following structure:
{
'count': 93,
'apps' : [
{
'last_modified_at': '2016-10-21T12:20:26Z',
'frequency_caps': [],
'ios': {
'enabled': True,
'push_enabled': False,
'app_store_id': 'bbb',
'connection_type': 'certificate',
'sdk_api_secret': '--'
},
'organization_id': '--',
'name': '---',
'app_id': 27,
'control_group_percentage': 0,
'created_by': {
'user_id': 'abc',
'user_name': 'def'
},
'created_at': '2016-09-28T11:41:24Z',
'web': {}
}, {
'last_modified_at': '2016-10-12T08:58:57Z',
'frequency_caps': [],
'ios': {
'enabled': True,
'push_enabled': True,
'app_store_id': '386304604',
'connection_type': 'certificate',
'sdk_api_secret': '---',
'push_expiry': '2018-01-14T08:24:09Z'
},
'organization_id': '---',
'name': '---',
'app_id': 87,
'control_group_percentage': 0,
'created_by': {
'user_id': '----',
'user_name': '---'
},
'created_at': '2016-10-12T08:58:57Z',
'web': {}
}
]
}
It's a JSON with two key-value-pairs. The second pair's value is a List of more JSON's.
For me it is too much information and I want to have a JSON like this:
{
'apps' : [
{
'name': 'Appname',
'app_id' : 1234,
'organization_id' : 'Blablabla'
},
{
'name': 'Appname2',
'app_id' : 5678,
'organization_id' : 'Some other Organization'
}
]
}
I want to have a JSON that only contains one key ("apps") and its value, which would be a List of more JSONs that only have three key-value-pairs..
I am thankful for any advice.
Thank you for your help!
#bishakh-ghosh I don't think you need to use the input json as string. It can be used straight as a dictionary. (thus avoid ast)
One more concise way :
# your original json
input_ = { 'count': 93, ... }
And here are the steps :
Define what keys you want to keep
slice_keys = ['name', 'app_id', 'organization_id']
Define the new dictionary as a slice on the slice_keys
dict(apps=[{key:value for key,value in d.items() if key in slice_keys} for d in input_['apps']])
And that's it.
That should yield the JSON formatted as you want, e.g
{
'apps':
[
{'app_id': 27, 'name': '---', 'organization_id': '--'},
{'app_id': 87, 'name': '---', 'organization_id': '---'}
]
}
This might be what you are looking for:
import ast
import json
json_str = """{
'count': 93,
'apps' : [
{
'last_modified_at': '2016-10-21T12:20:26Z',
'frequency_caps': [],
'ios': {
'enabled': True,
'push_enabled': False,
'app_store_id': 'bbb',
'connection_type': 'certificate',
'sdk_api_secret': '--'
},
'organization_id': '--',
'name': '---',
'app_id': 27,
'control_group_percentage': 0,
'created_by': {
'user_id': 'abc',
'user_name': 'def'
},
'created_at': '2016-09-28T11:41:24Z',
'web': {}
}, {
'last_modified_at': '2016-10-12T08:58:57Z',
'frequency_caps': [],
'ios': {
'enabled': True,
'push_enabled': True,
'app_store_id': '386304604',
'connection_type': 'certificate',
'sdk_api_secret': '---',
'push_expiry': '2018-01-14T08:24:09Z'
},
'organization_id': '---',
'name': '---',
'app_id': 87,
'control_group_percentage': 0,
'created_by': {
'user_id': '----',
'user_name': '---'
},
'created_at': '2016-10-12T08:58:57Z',
'web': {}
}
]
}"""
json_dict = ast.literal_eval(json_str)
new_dict = {}
app_list = []
for appdata in json_dict['apps']:
appdata_dict = {}
appdata_dict['name'] = appdata['name']
appdata_dict['app_id'] = appdata['app_id']
appdata_dict['organization_id'] = appdata['organization_id']
app_list.append(appdata_dict)
new_dict['apps'] = app_list
new_json_str = json.dumps(new_dict)
print(new_json_str) # This is your resulting json string

Updating complex JSON object in Python

I am grabbing sort of a complex MongoDB document with Python (v3.5) and I should update some values in it which are scattered all around the object and have no particular pattern in the structure and save it back to a different MongoDB collection. The object looks like this:
# after json.loads(mongo_db_document) my dict looks like this
notification = {
'_id': '570f934f45213b0d14b1256f',
'key': 'receipt',
'label': 'Delivery Receipt',
'version': '0.0.1',
'active': True,
'children': [
{
'key': 'started',
'label': 'Started',
'children': [
'date',
'time',
'offset'
]
},
{
'key': 'stop',
'label': 'Ended',
'children': [
'date',
'time',
'offset'
]
},
{
'label': '1. Particulars',
'template': 'formGroup',
'children': [
{
'children': [
{
'key': 'name',
'label': '2.1 Name',
'value': '********** THIS SHOULD BE UPDATED **********',
'readonly': 'true'
},
{
'key': 'ims_id',
'label': '2.2 IMS Number',
'value': '********** THIS SHOULD BE UPDATED **********',
'readonly': 'true'
}
]
},
{
'children': [
{
'key': 'type',
'readonly': '********** THIS SHOULD BE UPDATED **********',
'label': '2.3 Type',
'options': [
{
'label': 'Passenger',
'value': 'A37'
},
{
'label': 'Cargo',
'value': 'A35'
},
{
'label': 'Other',
'value': '********** THIS SHOULD BE UPDATED **********'
}
]
}
]
}
]
},
{
'template': 'formGroup',
'key': 'waste',
'label': '3. Waste',
'children': [
{
'label': 'Waste',
'children': [
{
'label': 'Plastics',
'key': 'A',
'inputType': 'number',
'inputAttributes': {
'min': 0
},
'value': '********** THIS SHOULD BE UPDATED **********'
},
{
'label': 'B. Oil',
'key': 'B',
'inputType': 'number',
'inputAttributes': {
'min': 0
},
'value': '********** THIS SHOULD BE UPDATED **********'
},
{
'label': 'C. Operational',
'key': 'C',
'inputType': 'number',
'inputAttributes': {
'min': 0
},
'value': '********** THIS SHOULD BE UPDATED **********'
}
]
}
]
},
{
'template': 'formRow',
'children': [
'empty',
'signature'
]
}
],
'filter': {
'timestamp_of_record': [
'date',
'time',
'offset'
]
}
}
My initial idea was to put placeholders (like $var_name) in places where I need to update values, and load the string with Python's string.Template, but that approach unfortunately breaks lots of stuff to other users of the same MongoDB document for some reason.
Is there a solution to simply modify this kind of object without "hardcoding" path to find the values I need to update?
There's this small script that I had written a couple years ago - I used it to find entries in some very long and unnerving JSONs. Admittedly it's not beautiful, but it might help in your case, perhaps?
You can find the script on Bitbucket, here (and here is the code).
Unfortunately it's not documented; at the time I wasn't really believing other people would use it, I guess.
Anyways, if you'd like to try it, save the script in your working directory and then use something like this:
from RecursiveSearch import Retriever
def alter_data(json_data, key, original, newval):
'''
Alter *all* values of said keys
'''
retr = Retriever(json_data)
for item_no, item in enumerate(retr.__track__(key)): # i.e. all 'value'
# Pick parent objects with a last element False in the __track__() result,
# indicating that `key` is either a dict key or a set element
if not item[-1]:
parent = retr.get_parent(key, item_no)
try:
if parent[key] == original:
parent[key] = newval
except TypeError:
# It's a set, this is not the key you're looking for
pass
if __name__ == '__main__':
alter_data(notification, key='value',
original = '********** THIS SHOULD BE UPDATED **********',
newval = '*UPDATED*')
Unfortunately as I said the script isn't well documented, so if you want to try it and need more info, I'll be glad to provide it.
Not sure if I understood correctly, but this will dynamically find all keys "value" and "readonly" and print out the paths to address the fields.
def findem(data, trail):
if isinstance(data, dict):
for k in data.keys():
if k in ('value', 'readonly'):
print("{}['{}']".format(trail, k))
else:
findem(data[k], "{}['{}']".format(trail, k))
elif isinstance(data, list):
for k in data:
findem(k, '{}[{}]'.format(trail, data.index(k)))
if __name__ == '__main__':
findem(notification, 'notification')
notification['children'][2]['children'][0]['children'][0]['readonly']
notification['children'][2]['children'][0]['children'][0]['value']
notification['children'][2]['children'][0]['children'][1]['readonly']
notification['children'][2]['children'][0]['children'][1]['value']
notification['children'][2]['children'][1]['children'][0]['readonly']
notification['children'][2]['children'][1]['children'][0]['options'][0]['value']
notification['children'][2]['children'][1]['children'][0]['options'][1]['value']
notification['children'][2]['children'][1]['children'][0]['options'][2]['value']
notification['children'][3]['children'][0]['children'][0]['value']
notification['children'][3]['children'][0]['children'][1]['value']
notification['children'][3]['children'][0]['children'][2]['value']
Add another list to the JSON object. Each item in that list would be a list of keys that lead to the values to be changed. An example for one such list is: ['children', 2, 'children', 'children', 0, 'value'].
Then, to access the value you could use a loop:
def change(json, path, newVal):
cur = json
for key in path[:-1]:
cur = cur[key]
cur[path[-1]] = newVal
path = notification['paths'][0]
#path, for example, could be ['children', 2, 'children', 'children', 0, 'value']
newVal = 'what ever you want'
change(notification, path, newVal)

Categories