I'm writing a Python function called clean_data with the following specification:
Parameters: A dictionary of dictionaries (all strings), and a list of strings containing the fields we care about.
Returns: A dictionary of dictionaries with only the fields we care about, converted to the appropriate data types.
Fields we care about:
Opponent (string)
Power Plays ("PP", int)
Sample input:
{"1/1" : {"opponent" : "BU", "X" : "3", "PP" : "0"},
"1/2" : {"opponent" : "HC", "X" : "4", "PP" : "1"},
"1/5" : {"opponent" : "BC", "X" : "8", "PP" : "0"}}
["opponent", "PP"]
Expected output:
{"1/1" : {"opponent" : "BU", "PP" : 0},
"1/2" : {"opponent" : "HC", "PP" : 1},
"1/5" : {"opponent" : "BC", "PP" : 0}}
Currently, I have
def clean_data(my_dict, field_ls):
    t = {}
    for x in field_ls:
        field = x
        for line in my_dict:
            for i in (my_dict[line]):
                if i != field:
                    my_dict[line].pop(i)
However, I don't quite seem to get the output I want as the pop(i) isn't really working. How can I fix my problem?
Instead of popping keys, it is easier (since there are only two of them) to select the key-value pairs you actually want.
def clean_data(my_dict):
    out = {}
    for k,d in my_dict.items():
        out[k] = {'opponent': str(d['opponent']), 'PP': int(d['PP'])}
    return out
Output:
{'1/1': {'opponent': 'BU', 'PP': 0},
'1/2': {'opponent': 'HC', 'PP': 1},
'1/5': {'opponent': 'BC', 'PP': 0}}
If you absolutely have to pop keys from a dictionary, you can copy the keys using the list constructor, then iterate over that list and pop the unwanted fields:
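import copy

def clean_data(input_dict, dict_of_fields):
    # Deep-copy the input: a shallow .copy() would still share the inner
    # dictionaries, so popping from them would modify the caller's data.
    my_dict = copy.deepcopy(input_dict)
    for d in my_dict.values():
        # Iterate over a snapshot of the keys so popping is safe.
        for field in list(d.keys()):
            if field not in dict_of_fields:
                d.pop(field)
            else:
                # Convert the value with the callable mapped to this field.
                d[field] = dict_of_fields[field](d[field])
    return my_dict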
Example:
sample_data = {"1/1" : {"opponent" : "BU", "X" : "3", "PP" : "0"},
"1/2" : {"opponent" : "HC", "X" : "4", "PP" : "1"},
"1/5" : {"opponent" : "BC", "X" : "8", "PP" : "0"}}
fields = {"opponent": str, "PP": int}
Output:
>>> print(clean_data(sample_data, fields))
{'1/1': {'opponent': 'BU', 'PP': 0},
'1/2': {'opponent': 'HC', 'PP': 1},
'1/5': {'opponent': 'BC', 'PP': 0}}
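For comparison, here is a minimal sketch of the same filtering written as a nested dictionary comprehension. It builds new inner dictionaries instead of popping keys, so the input is left untouched. The name clean_data_comprehension is made up for this illustration, and the sketch assumes every wanted field is present in every inner dictionary.

```python
def clean_data_comprehension(my_dict, converters):
    # converters maps each wanted field name to the callable that converts it
    # (hypothetical helper, same idea as the fields dictionary above).
    return {
        outer_key: {field: convert(inner[field]) for field, convert in converters.items()}
        for outer_key, inner in my_dict.items()
    }


sample_data = {"1/1": {"opponent": "BU", "X": "3", "PP": "0"},
               "1/2": {"opponent": "HC", "X": "4", "PP": "1"},
               "1/5": {"opponent": "BC", "X": "8", "PP": "0"}}

cleaned = clean_data_comprehension(sample_data, {"opponent": str, "PP": int})
print(cleaned)
```

Because nothing is removed from the original dictionaries, sample_data still contains the "X" field afterwards.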
I have a list of dicts:
input = [{'name':'A', 'Status':'Passed','id':'x1'},
{'name':'A', 'Status':'Passed','id':'x2'},
{'name':'A','Status':'Failed','id':'x3'},
{'name':'B', 'Status':'Passed','id':'x4'},
{'name':'B', 'Status':'Passed','id':'x5'}]
I want output like this:
output = [{'name':'A', 'Passed':'2', 'Failed':'1', 'Total':'3', '%Pass':'66%'},
{'name':'B', 'Passed':'2', 'Failed':'0', 'Total':'2', '%Pass':'100%'},
{'name':'Total', 'Passed':'4', 'Failed':'1', 'Total':'5', '%Pass':'80%'}]
I started by retrieving the different names using a lookup:
lookup = {(d["name"]): d for d in input [::-1]}
names= [e for e in lookup.values()]
names= names[::-1]
and then tried a list comprehension, something like:
for name in names:
    name_passed = sum(["Passed" and "name" for d in input if 'Status' in d and name in d])
    name_failed = sum(["Failed" and "name" for d in input if 'Status' in d and name in d])
But I am not sure whether there is a smarter way. Would a simple loop that compares the dict values be simpler?
Assuming your input entries will always be grouped according to the "name" key-value pair:
entries = [
{"name": "A", "Status": "Passed", "id": "x1"},
{"name": "A", "Status": "Passed", "id": "x2"},
{"name": "A", "Status": "Failed", "id": "x3"},
{"name": "B", "Status": "Passed", "id": "x4"},
{"name": "B", "Status": "Passed", "id": "x5"}
]
from itertools import groupby
from operator import itemgetter

def to_grouped(entries):
    # groupby only merges adjacent entries, hence the grouping assumption above.
    for key, group_iter in groupby(entries, key=itemgetter("name")):
        group = list(group_iter)
        total = len(group)
        passed = sum(1 for entry in group if entry["Status"] == "Passed")
        failed = total - passed
        # Multiply before dividing; (100 // total) * passed would round too
        # early (for example, 7 of 7 passed would give 98 instead of 100).
        perc_pass = (100 * passed) // total
        yield {
            "name": key,
            "Passed": str(passed),
            "Failed": str(failed),
            "Total": str(total),
            "%Pass": f"{perc_pass}%"
        }
print(list(to_grouped(entries)))
Output:
[{'name': 'A', 'Passed': '2', 'Failed': '1', 'Total': '3', '%Pass': '66%'}, {'name': 'B', 'Passed': '2', 'Failed': '0', 'Total': '2', '%Pass': '100%'}]
This will not create the final entry you're looking for, which sums the statistics of all other entries. Though, that shouldn't be too hard to do.
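One possible way to add that final entry is sketched below. The helper name add_total_row is made up for this illustration, and it assumes the rows have the same shape as the output above (counts stored as strings, percentage rounded down).

```python
def add_total_row(rows):
    # rows: dictionaries shaped like the grouped output, with counts as strings.
    passed = sum(int(row["Passed"]) for row in rows)
    failed = sum(int(row["Failed"]) for row in rows)
    total = passed + failed
    # Integer percentage, rounded down; 0 when there are no entries at all.
    perc_pass = (100 * passed) // total if total else 0
    summary = {
        "name": "Total",
        "Passed": str(passed),
        "Failed": str(failed),
        "Total": str(total),
        "%Pass": f"{perc_pass}%",
    }
    # Return a new list so the caller's rows are not modified.
    return rows + [summary]


grouped = [
    {"name": "A", "Passed": "2", "Failed": "1", "Total": "3", "%Pass": "66%"},
    {"name": "B", "Passed": "2", "Failed": "0", "Total": "2", "%Pass": "100%"},
]
result = add_total_row(grouped)
print(result[-1])
```

For the sample data this yields 4 passed, 1 failed, 5 in total and 80%, which matches the last row requested in the question.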
I currently have a dictionary of dictionaries in which some of the items are related. The items which are related to one another invariably follow a pattern illustrated here:
{ "item" : { "foo" : "bar", "fizz" : "buzz"},
"itemSuper" : { "boo" : "far", "bizz" : "fuzz"},
"itemDuper" : { "omg" : "wtf", "rofl" : "lmao"}}
As you can see, the keys of all the dictionaries which are related have a substring in common which is equal to the full key of one of the dictionaries. I would like to go through my dictionary-of-dictionaries combining all of the contents of such related groups into single dictionaries whose keys are the substring by which the matching was done. So, the end-goal is to end up with something like this for all of these groups:
{ "item" : { "foo" : "bar", "fizz" : "buzz", "boo" : "far", "bizz" : "fuzz", "omg" : "wtf", "rofl" : "lmao"}}
The substring is always the leading part of the key, but may be arbitrary from group to group. So in addition to the "item", "itemSuper" and "itemDuper" above, there are "thingy", "thingySuper", and "thingyDuper", along with others of that sort.
There are three possible suffixes for the substring; let's call them Super, Duper and Uber. Any of the groups of items I am interested in can have any or all three of these, but there are no other suffixes that may occur.
I would do it the following way:
data = { "item" : { "foo" : "bar", "fizz" : "buzz"},
"itemSuper" : { "boo" : "far", "bizz" : "fuzz"},
"itemDuper" : { "omg" : "wtf", "rofl" : "lmao"}}
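for key1 in list(data.keys()):
    if key1 not in data:
        # key1 was already merged into another entry and deleted.
        continue
    for key2 in list(data.keys()):
        # The shared substring is always the leading part of the key.
        if key1 != key2 and key2.startswith(key1):
            data[key1].update(data[key2])
            del data[key2]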
print(data)
Output:
{'item': {'foo': 'bar', 'fizz': 'buzz', 'boo': 'far', 'bizz': 'fuzz', 'omg': 'wtf', 'rofl': 'lmao'}}
Note that this solution works in place (it alters data) and that I use for ... in list(...). This is crucial, because otherwise I would not be able to del inside the loop.
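A small illustration of why the list(...) copy matters, using shortened made-up data in the same shape: deleting from a dictionary while iterating over it directly raises a RuntimeError, while iterating over a snapshot of the keys does not.

```python
data = {"item": {"foo": "bar"}, "itemSuper": {"boo": "far"}, "itemDuper": {"omg": "wtf"}}

error_message = None
try:
    for key in data:  # iterating over the live dictionary
        if key != "item":
            del data[key]  # changes the dictionary's size mid-iteration
except RuntimeError as exc:
    error_message = str(exc)
print("live iteration:", error_message)

# Iterating over a snapshot of the keys makes the same deletion safe.
snapshot_data = {"item": {"foo": "bar"}, "itemSuper": {"boo": "far"}, "itemDuper": {"omg": "wtf"}}
for key in list(snapshot_data):
    if key != "item":
        del snapshot_data[key]
print(snapshot_data)
```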
dict_of_dict = {
"item" : { "foo" : "bar", "fizz" : "buzz"},
"itemSuper" : { "boo" : "far", "bizz" : "fuzz"},
"itemDuper" : { "omg" : "wtf", "rofl" : "lmao"}
}
suffixes = {'Super', 'Duper', 'Uber'}
def get_base(key, suffix_lst):
    for suffix in suffix_lst:
        if key.endswith(suffix):
            return key[:-len(suffix)]
    return key

res = {}
for k,d in dict_of_dict.items():
    base = get_base(k, suffixes)
    res.setdefault(base, {}).update(d)
print(res)
Output
{'item': {'foo': 'bar', 'fizz': 'buzz', 'boo': 'far', 'bizz': 'fuzz', 'omg': 'wtf', 'rofl': 'lmao'}}
def recombine(k, substring):
    newd = dict()
    newk = dict()
    key = [i for i in k if (substring in i)]  # select the keys that contain the substring
    value = [k[i] for i in key]  # select the corresponding values of the target keys
    for i in value:
        for j in i.items():
            newk[j[0]] = j[1]
    newd[substring] = newk
    return newd

k = { "item" : { "foo" : "bar", "fizz" : "buzz"},
"itemSuper" : { "boo" : "far", "bizz" : "fuzz"},
"itemDuper" : { "omg" : "wtf", "rofl" : "lmao"}}
recombine(k, 'item')
Output
{'item': {'foo': 'bar',
'fizz': 'buzz',
'boo': 'far',
'bizz': 'fuzz',
'omg': 'wtf',
'rofl': 'lmao'}}
I have a file with this type of structure:
{
"key" : "A",
"description" : "1",
"uninterestingInformation" : "whatever"
}
{
"key" : "B",
"description" : "2",
"uninterestingInformation" : "whatever"
}
{
"key" : "C",
"description" : "3",
"uninterestingInformation" : "whatever"
}
I want to build a dictionary in Python that uses the key as the key and the description as the value. There are more fields, but only these two are of interest to me.
This file is not exactly a .json file; it is a file containing many similar JSON objects.
json.loads does not work on it, obviously.
Any suggestions on how to read the data?
I've already read this post, but my JSON objects are not each on one line...
EDIT:
In case my explanation wasn't clear: the example is quite accurate. I have a lot of similar JSON objects, one after another, separated by a newline (\n), with no comma between them. So the file as a whole is not valid JSON, while each object on its own is a valid JSON object.
The solution I finally applied was:
import json

# Read the whole file, then turn the sequence of objects into one JSON array.
with open('mongo-config.json') as f:
    api_key_file = f.read()
api_key_file = '[' + api_key_file + ']'
api_key_file = api_key_file.replace("}\n{", "},\n{")
api_key_data = json.loads(api_key_file)
api_key_description = {}
for data in api_key_data:
    # Field names as used in this code; the simplified example above
    # shows the first one as "key".
    api_key_description[data['apiKey']] = data['description']
It worked well for my situation. There may be better ways of doing this, explained in the comments below.
Another option is to use the literal_eval function from the ast module, after editing the text so that it forms a valid Python literal (here, a list of dictionaries). Note that the sample below has no commas between the fields, so the code inserts them:
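import re
from ast import literal_eval

inJson = '''{
    "key" : "A"
    "description" : "1"
    "uninterestingInformation" : "whatever"
}
{
    "key" : "B"
    "description" : "2"
    "uninterestingInformation" : "whatever"
}
{
    "key" : "C"
    "description" : "3"
    "uninterestingInformation" : "whatever"
}'''
# Insert the missing comma between each value and the following key,
# regardless of how the lines are indented.
inJson = re.sub(r'"\s*\n\s*"', '", "', inJson)
# Separate the objects with commas, drop the trailing comma, and wrap them in a list.
inJson = "[" + inJson.replace("}", "},")[:-1] + "]"
newObject = literal_eval(inJson)
print(newObject)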
Output:
[{'key': 'A', 'description': '1', 'uninterestingInformation': 'whatever'}, {'key': 'B', 'description': '2', 'uninterestingInformation': 'whatever'}, {'key': 'C', 'description': '3', 'uninterestingInformation': 'whatever'}]
You can use re.split to split the file content into appropriate JSON strings for parsing:
import re
import json
j='''{
"key" : "A",
"description" : "1",
"uninterestingInformation" : "whatever"
}
{
"key" : "B",
"description" : "2",
"uninterestingInformation" : "whatever"
}
{
"key" : "C",
"description" : "3",
"uninterestingInformation" : "whatever"
}'''
print(list(map(json.loads, re.split(r'(?<=})\n(?={)', j))))
This outputs:
[{'key': 'A', 'description': '1', 'uninterestingInformation': 'whatever'}, {'key': 'B', 'description': '2', 'uninterestingInformation': 'whatever'}, {'key': 'C', 'description': '3', 'uninterestingInformation': 'whatever'}]
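As a further alternative that avoids editing the text at all, the standard library's json.JSONDecoder.raw_decode can parse one object and report where it stopped, so the objects can be read one after another. This is only a sketch: the helper name iter_json_objects is made up for this illustration, and the field names follow the example in the question.

```python
import json


def iter_json_objects(text):
    # Yield each JSON value found in text, one after another.
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not accept leading whitespace, so skip it first.
        if text[pos].isspace():
            pos += 1
            continue
        # Returns the decoded object and the index just past its end.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


content = '''{
  "key" : "A",
  "description" : "1",
  "uninterestingInformation" : "whatever"
}
{
  "key" : "B",
  "description" : "2",
  "uninterestingInformation" : "whatever"
}
{
  "key" : "C",
  "description" : "3",
  "uninterestingInformation" : "whatever"
}'''

descriptions = {obj["key"]: obj["description"] for obj in iter_json_objects(content)}
print(descriptions)
```

Unlike the string-replacement approaches, this does not depend on exactly how the objects are separated, as long as only whitespace sits between them.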
In MongoDB I am storing a list of dictionaries,
something like [{"a": 1, "city" : "pune"}, {"b": 2, "city" : "abad"}].
When I update this list with a new list, some dictionaries are new and some already exist, so only distinct dictionaries should be stored. To do that, I fetch the existing record from MongoDB and append the new dictionaries:
record['result'].extend([k for k in new_key if k not in record['result']])
This line builds the list of distinct dictionaries in record['result'].
The dictionaries in the input record contain str values, but those fetched from MongoDB come back as unicode, so to avoid a mismatch I convert the fetched record to strings:
new_key.append({ str(key):str(value) for key,value in val.items() })
Code:
def saveEntity(self, record):
try:
self.collection.insert(record)
print "mongo done"
except Exception:
print 'Failed to save value '
data = self.collection.find({'date': 2})
new_key = []
for key in data:
for val in key['result']:
new_key.append({ str(key):str(value) for key,value in val.items() })
record['result'].extend([k for k in new_key if k not in record['result']])
self.collection.update(
{'date' : 2},
{ '$set' : {'result': record['result']}},
True
)
result = {'date': 2, 'result' : [{"a": 1, "city" : "pune"}, {"b": 2, "city" : "abad"}]}
#result = {'date': 2, 'result' : [{"a": 1, "city" : "pune"}, {"c": 3, "city" : "mum"}]}
YoutubeTrend().saveEntity(result)
But the records stored in MongoDB are still not distinct. Can anyone help?
Update
Suppose the input list is:
[{"a": 1, "city" : "pune"}, {"b": 2, "city" : "abad"}]
and the list fetched from MongoDB, with all values in unicode format, is:
[{"a": 1, "city" : "pune"}, {"c": 3, "city" : "mum"}]
The updated list in MongoDB should look like:
[{"a": 1, "city" : "pune"}, {"b": 2, "city" : "abad"}, {"c": 3, "city" : "mum"}]
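The merge itself can be sketched in plain Python, without any database calls. The helper name merge_distinct is made up for this illustration. One thing worth checking in the code above (an observation, not a confirmed diagnosis): str(value) turns the integer 1 into the string "1", so a converted dictionary no longer compares equal to the incoming one.

```python
def merge_distinct(existing, incoming):
    # Return existing plus every dictionary from incoming that is not already in it.
    merged = list(existing)
    for item in incoming:
        if item not in merged:
            merged.append(item)
    return merged


new_list = [{"a": 1, "city": "pune"}, {"b": 2, "city": "abad"}]
stored_list = [{"a": 1, "city": "pune"}, {"c": 3, "city": "mum"}]

merged = merge_distinct(new_list, stored_list)
print(merged)

# Converting every value with str() changes 1 into "1", which breaks equality.
stringified = [{str(k): str(v) for k, v in d.items()} for d in stored_list]
still_equal = stringified[0] == new_list[0]
print(still_equal)
```

With the sample lists from the update, the merged result contains the three distinct dictionaries, while the stringified copy of the first stored dictionary is no longer equal to its counterpart in the input.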
I have two very complex dictionary objects, created by converting large XML files into Python dictionaries.
I don't know the depth of the dictionaries; I just want to compare them and get the following output.
For example, my dictionaries look like this:
d1 = {"great grand father":
{"name":"John",
"grand father":
{"name":"Tom",
"father":
{"name":"Andy",
"Me":
{"name":"Mike",
"son":
{"name":"Tom"}
}
}
}
}
}
d2 is similar, but any one of the fields may be missing or changed, as below:
d2 = {"great grand father":
{"name":"John",
"grand father":
{"name":"Tom",
"father":
{"name":"Andy",
"Me":
{"name":"Tonny",
"son":
{"name":"Tom"}
}
}
}
}
}
The dictionary comparison should give me results like this:
Expected Key/Val : Me->name/"Mike"
Actual Key/Val : Me->name/"Tonny"
If the key "name" does not exist in "Me" in d2, it should give me the following output:
Expected Key/Val : Me->name/"Mike"
Actual Key/Val : Me->name/NOT_FOUND
To repeat, the dictionary depth can vary or be generated dynamically; the two dictionaries here are only examples.
All the dictionary comparison questions and answers I have seen on SO deal with fixed-depth dictionaries.
You're in luck, I did this as part of a project where I worked.
You need a recursive function, something like:
def checkDifferences(dict_a, dict_b, differences=None):  # None rather than [] so calls do not share one list
You can first check for keys that don't exist in one or the other.
e.g.
Expected Name/Tom Actual None
Then you compare the types of the values, i.e. check whether the value is a dict, a list, etc.
If it is a dict, you can recursively call the function using the values as dict_a and dict_b. When calling recursively, pass the differences list along.
If the value is a list that may contain dictionaries, you need to convert the list to a dict and call the function on the converted dictionary.
I'm sorry I can't help more but I no longer have access to the source code. Hopefully this is enough to get you started.
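The steps described above could be sketched roughly as follows. This is not the original project code (which is no longer available); the names check_differences, as_dict and NOT_FOUND are made up for this illustration, and lists are compared by position, which is only one possible way to convert a list to a dict.

```python
NOT_FOUND = "NOT_FOUND"


def as_dict(value):
    # Treat a list as a dictionary keyed by position so it can be recursed into.
    if isinstance(value, list):
        return dict(enumerate(value))
    return value


def check_differences(expected, actual, path=(), differences=None):
    # Collect (path, expected value, actual value) tuples for every mismatch.
    if differences is None:
        differences = []
    # Visit the expected keys first, then keys that exist only in actual.
    keys = list(expected) + [k for k in actual if k not in expected]
    for key in keys:
        exp_val = expected.get(key, NOT_FOUND)
        act_val = actual.get(key, NOT_FOUND)
        exp_cmp, act_cmp = as_dict(exp_val), as_dict(act_val)
        if isinstance(exp_cmp, dict) and isinstance(act_cmp, dict):
            # Both sides are containers: descend, passing the same list along.
            check_differences(exp_cmp, act_cmp, path + (key,), differences)
        elif exp_val != act_val:
            differences.append((path + (key,), exp_val, act_val))
    return differences


d1 = {"Me": {"name": "Mike", "son": {"name": "Tom"}}}
d2 = {"Me": {"name": "Tonny", "son": {}}}

diffs = check_differences(d1, d2)
for path, exp_val, act_val in diffs:
    joined = "->".join(str(part) for part in path)
    print("Expected Key/Val :", joined + "/" + repr(exp_val))
    print("Actual Key/Val   :", joined + "/" + repr(act_val))
```

Returning the path as a tuple keeps the formatting separate from the comparison, so the same result can be printed in the Expected/Actual layout from the question or processed further.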
Here is a way I found to compare any two dictionaries.
I have tried it with various dictionaries of different depths and it worked for me. The code is not very modular, but it is offered for reference:
import pprint
pp = pprint.PrettyPrinter(indent=4)
dict1 = { 'Person' : { 'Male' : {'Boys' : {'Roger' : {'age' : 20},
'Rafa' : {'age' : 25}
}
},
'Female' : { 'Girls' : {'Serena' : {'age' : 23},
'Maria' : {'age' : 15}
}
}
},
'Animal' : { 'Huge' : {'Elephant' : {'color' : 'black' }
}
}
}
'''
dict2 = { 'Person' : { 'Male' : {'Boys' : {'Roger' : {'age' : 20}
}
},
'Female' : { 'Girls' : {'Serena' : {'age' : 23},
'Maria' : {'age' : 1}
}
}
}
}
dict2 = { 'Person' : { 'Male' : {'Boys' : {'Roger' : {'age' : 20},
'Rafa' : {'age' : 2}
}
}
}
}
'''
dict2 = { 'Person' : { 'Male' : {'Boys' : {'Roger' : {'age' : 2}}},
'Female' : 'Serena'}
}
key_list = []  # keys on the path from the root to the current value
err_list = {}  # maps "a-->b-->c" paths to {"Exp": ..., "Act": ...}

def add_error(exp_val, act_val):
    # Record a mismatch under the path of the key currently being compared.
    err_key = "-->".join(key_list)
    err_list.setdefault(err_key, {}).update({"Exp": exp_val, "Act": act_val})

def comp(exp, act):
    for key in exp:
        key_list.append(key)
        exp_val = exp[key]
        if key not in act:
            add_error(exp_val, "NOT_FOUND")
        elif isinstance(exp_val, dict) and isinstance(act[key], dict):
            # Both values are dictionaries: compare them one level down.
            comp(exp_val, act[key])
        elif exp_val != act[key]:
            # Two leaf values differ, or a dictionary faces a non-dictionary.
            add_error(exp_val, act[key])
        key_list.pop()

comp(dict1, dict2)
pp.pprint(err_list)
Here is the output of my code -
{ 'Animal': { 'Act': 'NOT_FOUND',
'Exp': { 'Huge': { 'Elephant': { 'color': 'black'}}}},
'Person-->Female': { 'Act': 'Serena',
'Exp': { 'Girls': { 'Maria': { 'age': 15},
'Serena': { 'age': 23}}}},
'Person-->Male-->Boys-->Rafa': { 'Act': 'NOT_FOUND', 'Exp': { 'age': 25}},
'Person-->Male-->Boys-->Roger-->age': { 'Act': 2, 'Exp': 20}
}
You can also try the other dictionaries given in the commented-out code.
One more thing: the keys are taken from the expected dictionary and then matched against the actual one. Passing the dictionaries in the opposite order performs the matching the other way round (err_list is a module-level dictionary, so clear it first if you want separate results):
comp(dict2,dict1)