I have an input list from users and a standard config. Only user_input can change. based on the user_input , would need to select only required data in a dictionary. ie Most of the config would remain as it is, just fruits are filtered based on user_input.
user_input = ['Apple','Grapes','Watermelon']
superset_config = """
[
{
"input":"source_1",
"operation":"add",
"fruits": {
"Apple":"Red",
"Grapes": ["Red","Yellow"],
"Orange": "Yellow"
},
"output":"target_1"
},
{
"input":"source_2",
"fruits": { "Watermelon":"green"},
"output":"target_2"
}
]
"""
Desired results: just remove 'Orange' from fruits, as Orange is not part of user input.rest everything is same.
[
{
"input":"source_1",
"operation":"add",
"fruits": {
"Apple":"Red",
"Grapes": ["Red","Yellow"]
},
"output":"target_1"
},
{
"input":"source_2",
"fruits": { "Watermelon":"green"},
"output":"target_2"
}
]
Transform:
import json
superset_definitions = json.loads(superset_config)
superset_definitions
filtered_common_defintion = []
for each_input in user_input:
for each_node in superset_definitions:
if each_input in each_node['fruits'].keys():
temp_dictionary = {}
temp_dictionary[each_input] = each_node['fruits'][each_input]
filtered_common_defintion.append(temp_dictionary)
filtered_common_defintion
The above code performs filter on fruits, but I am not sure how to capture remaining elements of the config. Can someone please guide?
You can use json.load to convert JSON string to a python dictionary, then iterate the list of the dictionary, and create a temporary dictionary to hold the values, if the key is fruits take only the key in user_input, and corresponding values from the dictionary, otherwise, just store it in temporary dictionary, finally, append each such dictionary to a resulting list:
result = []
for d in json.loads(superset_config):
temp = {}
for k in d:
if k=='fruits':
fruits = {key:value for key,value in d[k].items() if key in user_input}
temp[k] = fruits
else:
temp[k] = d[k ]
result.append(temp)
OUTPUT:
[{'input': 'source_1',
'operation': 'add',
'fruits': {'Apple': 'Red',
'Grapes': ['Red', 'Yellow']
},
'output': 'target_1'},
{'input': 'source_2',
'fruits': {'Watermelon': 'green'
},
'output': 'target_2'}]
You can use Python's Dictionary Comprehension
for each in superset_definitions:
each['fruits'] = {k: v for k, v in each['fruits'].items() if k in user_input}
Here is another approach for the same:
import json
import copy
user_input = ['Apple','Grapes','Watermelon']
superset_config = """
[
{
"input":"source_1",
"operation":"add",
"fruits": {
"Apple":"Red",
"Grapes": ["Red","Yellow"],
"Orange": "Yellow"
},
"output":"target_1"
},
{
"input":"source_2",
"fruits": { "Watermelon":"green"},
"output":"target_2"
}
]
"""
config = json.loads(superset_config)
for item in config:
for fruit_name, fruit_value in list(item["fruits"].items()):
if fruit_name not in user_input:
del item["fruits"][fruit_name]
print (config)
Output:
[{'input': 'source_1', 'operation': 'add', 'fruits': {'Apple': 'Red', 'Grapes': ['Red', 'Yellow']}, 'output': 'target_1'}, {'input': 'source_2', 'fruits': {'Watermelon': 'green'}, 'output': 'target_2'}]
Related
There are two dict main and input, I want to validate the "input" such that all the keys in the list of dictionary and nested dictionary (if present/all keys are optional) matches that of the main if not the wrong/different key should be returned as the output.
main = "app":[{
"name": str,
"info": [
{
"role": str,
"scope": {"groups": list}
}
]
},{
"name": str,
"info": [
{"role": str}
]
}]
input_data = "app":[{
'name': 'nms',
'info': [
{
'role': 'user',
'scope': {'groups': ['xyz']
}
}]
},{
'name': 'abc',
'info': [
{'rol': 'user'}
]
}]
when compared input with main the wrong/different key should be given as output, in this case
['rol']
The schema module does exactly this.
You can catch SchemaUnexpectedTypeError to see which data doesn't match your pattern.
Also, make sure you don't use the word input as a variable name, as it's the name of a built-in function.
keys = []
def print_dict(d):
if type(d) == dict:
for val in d.keys():
df = d[val]
try:
if type(df) == list:
for i in range(0,len(df)):
if type(df[i]) == dict:
print_dict(df[i])
except AttributeError:
pass
keys.append(val)
else:
try:
x = d[0]
if type(x) == dict:
print_dict(d[0])
except:
pass
return keys
keys_input = print_dict(input)
keys = []
keys_main = print_dict(main)
print(keys_input)
print(keys_main)
for i in keys_input[:]:
if i in keys_main:
keys_input.remove(i)
print(keys_input)
This has worked for me. you can check above code snippet and if any changes provide more information so any chances if required.
Dictionary and lists compare theire content nested by default.
input_data == main should result in the right output if you format your dicts correctly. Try adding curly brackets "{"/"}" arround your dicts. It should probably look like something like this:
main = {"app": [{
"name": str,
"info": [
{
"role": str,
"scope": {"groups": list}
}
]
},{
"name": str,
"info": [
{"role": str}
]
}]}
input_data = {"app":[{
'name': 'nms',
'info': [
{
'role': 'user',
'scope': {'groups': ['xyz']
}
}]
},{
'name': 'abc',
'info': [
{'rol': 'user'}
]
}]}
input_data2 = {"app": [{
'name': 'nms',
'info': [
{
'role': 'user',
'scope': {'groups': ['xyz']
}
}]
}, {
'name': 'abc',
'info': [
{'rol': 'user'}
]
}]}
Comparision results should look like this:
input_data2 == input_data # True
main == input_data # False
I have some json that I would like to transform from this:
[
{
"name":"field1",
"intValue":"1"
},
{
"name":"field2",
"intValue":"2"
},
...
{
"name":"fieldN",
"intValue":"N"
}
]
into this:
{ "field1" : "1",
"field2" : "2",
...
"fieldN" : "N",
}
For each pair, I need to change the value of the name field to a key, and the values of the intValue field to a value. This doesn't seem like flattening or denormalizing. Are there any tools that might do this out-of-the-box, or will this have to be brute-forced? What's the most pythonic way to accomplish this?
parameters = [ # assuming this is loaded already
{
"name":"field1",
"intValue":"1"
},
{
"name":"field2",
"intValue":"2"
},
{
"name":"fieldN",
"intValue":"N"
}
]
field_int_map = dict()
for p in parameters:
field_int_map[p['name']] = p['intValue']
yields {'field1': '1', 'field2': '2', 'fieldN': 'N'}
or as a dict comprehension
field_int_map = {p['name']:p['intValue'] for p in parameters}
This works to combine the name attribute with the intValue as key:value pairs, but the result is a dictionary instead of the original input type which was a list.
Use dictionary comprehension:
json_dct = {"parameters":
[
{
"name":"field1",
"intValue":"1"
},
{
"name":"field2",
"intValue":"2"
},
{
"name":"fieldN",
"intValue":"N"
}
]}
dct = {d["name"]: d["intValue"] for d in json_dct["parameters"]}
print(dct)
# {'field1': '1', 'field2': '2', 'fieldN': 'N'}
I have two dictionaries, as below. Both dictionaries have a list of dictionaries as the value associated with their properties key; each dictionary within these lists has an id key. I wish to merge my two dictionaries into one such that the properties list in the resulting dictionary only has one dictionary for each id.
{
"name":"harry",
"properties":[
{
"id":"N3",
"status":"OPEN",
"type":"energetic"
},
{
"id":"N5",
"status":"OPEN",
"type":"hot"
}
]
}
and the other list:
{
"name":"harry",
"properties":[
{
"id":"N3",
"type":"energetic",
"language": "english"
},
{
"id":"N6",
"status":"OPEN",
"type":"cool"
}
]
}
The output I am trying to achieve is:
"name":"harry",
"properties":[
{
"id":"N3",
"status":"OPEN",
"type":"energetic",
"language": "english"
},
{
"id":"N5",
"status":"OPEN",
"type":"hot"
},
{
"id":"N6",
"status":"OPEN",
"type":"cool"
}
]
}
As id: N3 is common in both the lists, those 2 dicts should be merged with all the fields. So far I have tried using itertools and
ds = [d1, d2]
d = {}
for k in d1.keys():
d[k] = tuple(d[k] for d in ds)
Could someone please help in figuring this out?
Here is one of the approach:
a = {
"name":"harry",
"properties":[
{
"id":"N3",
"status":"OPEN",
"type":"energetic"
},
{
"id":"N5",
"status":"OPEN",
"type":"hot"
}
]
}
b = {
"name":"harry",
"properties":[
{
"id":"N3",
"type":"energetic",
"language": "english"
},
{
"id":"N6",
"status":"OPEN",
"type":"cool"
}
]
}
# Create dic maintaining the index of each id in resp dict
a_ids = {item['id']: index for index,item in enumerate(a['properties'])} #{'N3': 0, 'N5': 1}
b_ids = {item['id']: index for index,item in enumerate(b['properties'])} #{'N3': 0, 'N6': 1}
# Loop through one of the dict created
for id in a_ids.keys():
# If same ID exists in another dict, update it with the key value
if id in b_ids:
b['properties'][b_ids[id]].update(a['properties'][a_ids[id]])
# If it does not exist, then just append the new dict
else:
b['properties'].append(a['properties'][a_ids[id]])
print (b)
Output:
{'name': 'harry', 'properties': [{'id': 'N3', 'type': 'energetic', 'language': 'english', 'status': 'OPEN'}, {'id': 'N6', 'status': 'OPEN', 'type': 'cool'}, {'id': 'N5', 'status': 'OPEN', 'type': 'hot'}]}
It might help to treat the two objects as elements each in their own lists. Maybe you have other objects with different name values, such as might come out of a JSON-formatted REST request.
Then you could do a left outer join on both name and id keys:
#!/usr/bin/env python
a = [
{
"name": "harry",
"properties": [
{
"id":"N3",
"status":"OPEN",
"type":"energetic"
},
{
"id":"N5",
"status":"OPEN",
"type":"hot"
}
]
}
]
b = [
{
"name": "harry",
"properties": [
{
"id":"N3",
"type":"energetic",
"language": "english"
},
{
"id":"N6",
"status":"OPEN",
"type":"cool"
}
]
}
]
a_names = set()
a_prop_ids_by_name = {}
a_by_name = {}
for ao in a:
an = ao['name']
a_names.add(an)
if an not in a_prop_ids_by_name:
a_prop_ids_by_name[an] = set()
for ap in ao['properties']:
api = ap['id']
a_prop_ids_by_name[an].add(api)
a_by_name[an] = ao
res = []
for bo in b:
bn = bo['name']
if bn not in a_names:
res.append(bo)
else:
ao = a_by_name[bn]
bp = bo['properties']
for bpo in bp:
if bpo['id'] not in a_prop_ids_by_name[bn]:
ao['properties'].append(bpo)
res.append(ao)
print(res)
The idea above is to process list a for names and ids. The names and ids-by-name are instances of a Python set. So members are always unique.
Once you have these sets, you can do the left outer join on the contents of list b.
Either there's an object in b that doesn't exist in a (i.e. shares a common name), in which case you add that object to the result as-is. But if there is an object in b that does exist in a (which shares a common name), then you iterate over that object's id values and look for ids not already in the a ids-by-name set. You add missing properties to a, and then add that processed object to the result.
Output:
[{'name': 'harry', 'properties': [{'id': 'N3', 'status': 'OPEN', 'type': 'energetic'}, {'id': 'N5', 'status': 'OPEN', 'type': 'hot'}, {'id': 'N6', 'status': 'OPEN', 'type': 'cool'}]}]
This doesn't do any error checking on input. This relies on name values being unique per object. So if you have duplicate keys in objects in both lists, you may get garbage (incorrect or unexpected output).
I am trying to write a program where I am having a list of dictionaries in the following manner
[
{
'unique':1,
'duplicate':2,
},
{
'unique':1,
'duplicate':2,
},
{
'unique':1,
'duplicate':2,
},
{
'unique':1,
'duplicate':2,
}
]
Can we form it as a dictionary, where the first key in tuple should become unique Key in a dictionary
and it's corresponding values as a list for that values
Example:
[
{
'unique':1,
'duplicate':2,
},
{
'unique':1,
'duplicate':8,
},
{
'unique':2,
'duplicate':2,
},
{
'unique':1,
'duplicate':4,
}
]
The above list should be converted into the following
---- Expected Outcome ---
[
{
'unique':1,
'duplicates':[2,8,4]
},
{
'unique':2,
'duplicates':[2]
}
]
PS: I am doing this in python
Thanks for the code in advance
you can also use itertools.groupby:
from itertools import groupby
from operator import itemgetter
l = [
{
'unique':1,
'duplicate':2,
},
{
'unique':1,
'duplicate':8,
},
{
'unique':2,
'duplicate':2,
},
{
'unique':1,
'duplicate':4,
}
]
key = itemgetter('unique')
result = [{'unique':k, 'duplicate': list(map(itemgetter('duplicate'), g))}
for k, g in groupby(sorted(l, key=key ), key = key)]
print(result)
output:
[{'unique': 1, 'duplicate': [2, 8, 4]}, {'unique': 2, 'duplicate': [2]}]
I think this list comprehension can solve your problem:
result = [{'unique': id, 'duplicates': [d['duplicate'] for d in l if d['unique'] == id]} for id in set(map(lambda d: d['unique'], l))]
This might help you:
l = [
{
'unique':1,
'duplicate':2,
},
{
'unique':1,
'duplicate':8,
},
{
'unique':2,
'duplicate':2,
},
{
'unique':1,
'duplicate':4,
}
]
a = set()
for i in l:
a.add(i['unique'])
d = {i:[] for i in a }
for i in l:
d[i['unique']].append(i['duplicate'])
output = [{'unique': i, 'duplicate': j}for i, j in d.items()]
The output will be:
[{'unique': 1, 'duplicate': [2, 8, 4]}, {'unique': 2, 'duplicate': [2]}]
defaultdict(list) may help you here:
from collections import defaultdict
# data = [ {'unique': 1, 'duplicate': 2}, ... ] # your data
dups = defaultdict(list) # {unique: [duplicate]}
for dd in data:
dups[dd['unique']].append(dd['duplicate'])
answer = [dict(unique = k, duplicates = v) for k, v in dups.items()]
If you don't know the name of unique key, then replace 'unique' with something like
unique_key = list(data[0].keys())[0]
unique=[]
duplicate ={}
for items in data:
if items['unique'] not in unique:
unique.append(items['unique'])
duplicate[items['unique']]=[items['duplicate']]
else:
duplicate[items['unique']].append(items['duplicate'])
new_data=[]
for key in unique:
new_data.append({'unique':key,'duplicate':duplicate[key]})
Explanation: In the first for loop, I am appending unique keys to 'unique'. If the key doesn't exists in 'unique', I will append it in 'unique' & add a key in 'duplicate' with value as single element list. If the same key is found again, I simply append that value to 'duplicate' corresponding the key. In the 2nd loop, I am creating a 'new_dict' where I am adding these unique keys & its duplicate value list
I have the following object in python:
{
name: John,
age: {
years:18
},
computer_skills: {
years:4
},
mile_runner: {
years:2
}
}
I have an array with 100 people with the same structure.
What is the best way to go through all 100 people and make it such that there is no more "years"? In other words, each object in the 100 would look something like:
{
name: John,
age:18,
computer_skills:4,
mile_runner:2
}
I know I can do something in pseudocode:
for(item in list):
if('years' in (specific key)):
specifickey = item[(specific key)][(years)]
But is there a smarter/more efficent way?
Your pseudo-code is already pretty good I think:
for person in persons:
for k, v in person.items():
if isinstance(v, dict) and 'years' in v:
person[k] = v['years']
This overwrites every property which is a dictionary that has a years property with that property’s value.
Unlike other solutions (like dict comprehensions), this will modify the object in-place, so no new memory to keep everything is required.
def flatten(d):
ret = {}
for key, value in d.iteritems():
if isinstance(value, dict) and len(value) == 1 and "years" in value:
ret[key] = value["years"]
else:
ret[key] = value
return ret
d = {
"name": "John",
"age": {
"years":18
},
"computer_skills": {
"years":4
},
"mile_runner": {
"years":2
}
}
print flatten(d)
Result:
{'age': 18, 'mile_runner': 2, 'name': 'John', 'computer_skills': 4}
Dictionary comprehension:
import json
with open("input.json") as f:
cont = json.load(f)
print {el:cont[el]["years"] if "years" in cont[el] else cont[el] for el in cont}
prints
{u'age': 18, u'mile_runner': 2, u'name': u'John', u'computer_skills': 4}
where input.json contains
{
"name": "John",
"age": {
"years":18
},
"computer_skills": {
"years":4
},
"mile_runner": {
"years":2
}
}
Linear with regards to number of elements, you can't really hope for any lower.
As people said in the comments, it isn't exactly clear what your "object" is, but assuming that you actually have a list of dicts like this:
list = [{
'name': 'John',
'age': {
'years': 18
},
'computer_skills': {
'years':4
},
'mile_runner': {
'years':2
}
}]
Then you can do something like this:
for item in list:
for key in item:
try:
item[key] = item[key]['years']
except (TypeError, KeyError):
pass
Result:
list = [{'age': 18, 'mile_runner': 2, 'name': 'John', 'computer_skills': 4}]