Flattened List in Dict - python

I would like to convert the following data into a single dict inside a list. If a key already exists in the result with the same value, it should not be added again.
data1 = """
[{'In': ['5,000 MByte']},
{'Out': ['155 MByte', '10,100 MByte']},
{'Total': ['5,000 MByte']}]
"""
Expected:
[{'In': '5,000 MByte',
'Out': '155 MByte',
'Total': '5,000 MByte'}]

This should work:
data1 = """
[{'In': ['5,000 MByte']},
{'Out': ['155 MByte', '10,100 MByte']},
{'Total': ['5,000 MByte']}]
"""
import ast
data1_dict = {}
for item in ast.literal_eval(data1):
    for key in item:
        data1_dict[key] = item[key][0]
res = [data1_dict]
print(res)
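A minimal sketch of the same idea wrapped in a function, assuming that only the first value of each list is wanted and that the first occurrence of a key should win. The name flatten_first_values is made up for illustration:

```python
import ast

data1 = """
[{'In': ['5,000 MByte']},
{'Out': ['155 MByte', '10,100 MByte']},
{'Total': ['5,000 MByte']}]
"""

def flatten_first_values(text):
    """Merge a list of single-key dicts into one dict, keeping the first value of each list."""
    merged = {}
    for record in ast.literal_eval(text):
        for key, values in record.items():
            # Assumption: keep the first occurrence of a key and skip empty lists.
            if key not in merged and values:
                merged[key] = values[0]
    return [merged]

res = flatten_first_values(data1)
print(res)
```

Because a key that is already present is skipped, a repeated key with the same value is not added a second time.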

Related

How to generate a dictionary dynamically from a list in python?

I want to run a script which grabs all the titles of the files in a folder and collects them in a dictionary. I want the output structured like this:
{
1: {"title": "one"},
2: {"title": "two"},
...
}
I have tried the following, but how do I add the "title" part and build the dictionary dynamically?
from os import walk
mypath = '/Volumes/yahiaAmin-1'
filenames = next(walk(mypath), (None, None, []))[2] # [] if no file
courseData = {}
for index, x in enumerate(filenames):
    # print(index, x)
    # courseData[index]["title"].append(x)
    # courseData[index].["tlt"].append(x)
    courseData.setdefault(index).append(x)
print(courseData)
Assign the value dict directly to the index:
courseData = {}
filenames = ["one", "two"]
for index, x in enumerate(filenames, 1):
    courseData[index] = {"title": x}
print(courseData)
# {1: {'title': 'one'}, 2: {'title': 'two'}}
Note that a dict whose keys are just incrementing ints is generally pointless, as a list would do the same job.
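To illustrate that last point, here is a small sketch that builds both shapes from the same hypothetical filenames list. The variable names course_list and course_dict are chosen for this example only:

```python
filenames = ["one", "two"]  # hypothetical sample titles

# List version: the position in the list plays the role of the key (0-based).
course_list = [{"title": name} for name in filenames]

# Dict version: explicit 1-based integer keys, as in the question.
course_dict = {index: {"title": name} for index, name in enumerate(filenames, 1)}

print(course_list[0])
print(course_dict[1])
```

Both lookups return the same entry. The dict is only worth it if the keys need to be something other than consecutive positions.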

Hierarchical grouping in key value pair with python

I have a list like this:
data = [
{'date':'2017-01-02', 'model': 'iphone5', 'feature':'feature1'},
{'date':'2017-01-02', 'model': 'iphone7', 'feature':'feature2'},
{'date':'2017-01-03', 'model': 'iphone6', 'feature':'feature2'},
{'date':'2017-01-03', 'model': 'iphone6', 'feature':'feature2'},
{'date':'2017-01-03', 'model': 'iphone7', 'feature':'feature3'},
{'date':'2017-01-10', 'model': 'iphone7', 'feature':'feature2'},
{'date':'2017-01-10', 'model': 'iphone7', 'feature':'feature1'},
]
I want to achieve this:
[
{
'2017-01-02':[{'iphone5':['feature1']}, {'iphone7':['feature2']}]
},
{
'2017-01-03': [{'iphone6':['feature2']}, {'iphone7':['feature3']}]
},
{
'2017-01-10':[{'iphone7':['feature2', 'feature1']}]
}
]
I need an efficient way, since there could be a lot of data.
I was trying this:
data = sorted(data, key=itemgetter('date'))
date = itertools.groupby(data, key=itemgetter('date'))
But I'm getting nothing for the value of the 'date' key.
Later I will iterate over this structure to build an HTML page.
You can do this pretty efficiently and cleanly using defaultdict. Unfortunately it's a pretty advanced use and it gets hard to read.
from collections import defaultdict
from pprint import pprint
# create a dictionary whose elements are automatically dictionaries of sets
result_dict = defaultdict(lambda: defaultdict(set))
# Construct a dictionary with one key for each date and another dict ('model_dict')
# as the value.
# The model_dict has one key for each model and a set of features as the value.
for d in data:
    result_dict[d["date"]][d["model"]].add(d["feature"])
# more explicit version:
# for d in data:
#     model_dict = result_dict[d["date"]]  # created automatically if needed
#     feature_set = model_dict[d["model"]]  # created automatically if needed
#     feature_set.add(d["feature"])
# convert the result_dict into the required form
result_list = [
    {
        date: [
            {phone: list(feature_set)}
            for phone, feature_set in sorted(model_dict.items())
        ]
    } for date, model_dict in sorted(result_dict.items())
]
pprint(result_list)
# [{'2017-01-02': [{'iphone5': ['feature1']}, {'iphone7': ['feature2']}]},
# {'2017-01-03': [{'iphone6': ['feature2']}, {'iphone7': ['feature3']}]},
# {'2017-01-10': [{'iphone7': ['feature2', 'feature1']}]}]
# (the order inside each feature list may vary, because sets are unordered)
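If the order of the features matters, one possible variation is to use a plain dict as an ordered set instead of a set. This is only a sketch and assumes Python 3.7+, where dicts keep insertion order. The function name group_features is made up for the example:

```python
from collections import defaultdict

# Sample rows copied from the question.
data = [
    {'date': '2017-01-02', 'model': 'iphone5', 'feature': 'feature1'},
    {'date': '2017-01-02', 'model': 'iphone7', 'feature': 'feature2'},
    {'date': '2017-01-03', 'model': 'iphone6', 'feature': 'feature2'},
    {'date': '2017-01-03', 'model': 'iphone6', 'feature': 'feature2'},
    {'date': '2017-01-03', 'model': 'iphone7', 'feature': 'feature3'},
    {'date': '2017-01-10', 'model': 'iphone7', 'feature': 'feature2'},
    {'date': '2017-01-10', 'model': 'iphone7', 'feature': 'feature1'},
]

def group_features(rows):
    # date -> model -> {feature: None}; the innermost dict acts as an ordered set
    grouped = defaultdict(lambda: defaultdict(dict))
    for row in rows:
        grouped[row["date"]][row["model"]][row["feature"]] = None
    return [
        {date: [{model: list(features)} for model, features in sorted(models.items())]}
        for date, models in sorted(grouped.items())
    ]

result_list = group_features(data)
print(result_list)
```

Duplicates are still dropped, because assigning to an existing key does not add a new entry, and each feature list keeps the order in which features first appeared.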
You can try this approach. Here td is a dict that maps each model to its index in tl ({model: index}), so I can check whether a model already exists in the list of dicts:
from itertools import groupby
from operator import itemgetter
r = []
for i in groupby(sorted(data, key=itemgetter('date')), key=itemgetter('date')):
    td, tl = {}, []
    for j in i[1]:
        if j["model"] not in td:
            tl.append({j["model"]: [j["feature"]]})
            td[j["model"]] = len(tl) - 1
        elif j["feature"] not in tl[td[j["model"]]][j["model"]]:
            tl[td[j["model"]]][j["model"]].append(j["feature"])
    r.append({i[0]: tl})
Result:
[
{'2017-01-02': [{'iphone5': ['feature1']}, {'iphone7': ['feature2']}]},
{'2017-01-03': [{'iphone6': ['feature2']}, {'iphone7': ['feature3']}]},
{'2017-01-10': [{'iphone7': ['feature2', 'feature1']}]}
]
As a matter of fact, I think the data structure could be simplified; maybe you don't need so much nesting.
total_result = list()
result = dict()
inner_value = dict()
# note: this relies on data already being sorted by date
for d in data:
    if d["date"] not in result:
        if result:
            total_result.append(result)
            result = dict()
        result[d["date"]] = set()
        inner_value = dict()
    if d["model"] not in inner_value:
        inner_value[d["model"]] = set()
    inner_value[d["model"]].add(d["feature"])
    tmp_v = [{key: list(inner_value[key])} for key in inner_value]
    result[d["date"]] = tmp_v
total_result.append(result)
total_result
[{'2017-01-02': [{'iphone7': ['feature2']}, {'iphone5': ['feature1']}]},
{'2017-01-03': [{'iphone6': ['feature2']}, {'iphone7': ['feature3']}]},
{'2017-01-10': [{'iphone7': ['feature2', 'feature1']}]}]

Dot notation to Json in python

I receive data from the Loggly service in dot notation, but to put data back in, it must be in JSON.
Hence, I need to convert:
{'json.message.status.time':50, 'json.message.code.response':80, 'json.time':100}
Into:
{'message': {'code': {'response': 80}, 'status': {'time': 50}}, 'time': 100}
I have put together a function to do so, but I wonder if there is a more direct and simpler way to accomplish the same result.
import re

def dot_to_json(a):
    # Create root for JSON tree structure
    resp = {}
    for k,v in a.items():
        # eliminate json. (if metric comes from another type, it will keep its root)
        k = re.sub(r'\bjson.\b','',k)
        if '.' in k:
            # Field has a dot
            r = resp
            s = ''
            k2 = k.split('.')
            l = len(k2)
            count = 0
            t = {}
            for f in k2:
                count += 1
                if f not in resp.keys():
                    r[f]={}
                r = r[f]
                if count < l:
                    s += "['" + f + "']"
                else:
                    s = "resp%s" % s
                    t = eval(s)
                    # Assign value to the last branch
                    t[f] = v
        else:
            r2 = resp
            if k not in resp.keys():
                r2[k] = {}
            r2[k] = v
    return resp
You can turn the path into dictionary access with:
from functools import reduce  # built in on Python 2, must be imported on Python 3

def dot_to_json(a):
    output = {}
    for key, value in a.items():
        path = key.split('.')
        if path[0] == 'json':
            path = path[1:]
        target = reduce(lambda d, k: d.setdefault(k, {}), path[:-1], output)
        target[path[-1]] = value
    return output
This takes the key as a path, ignoring the first json part. With reduce() you can walk the elements of path (except for the last one) and fetch the nested dictionary with it.
Essentially you start at output and for each element in path fetch the value and use that value as the input for the next iteration. Here dict.setdefault() is used to default to a new empty dictionary each time a key doesn't yet exist. For intermediate path elements ['foo', 'bar', 'baz'] this comes down to the call output.setdefault('foo', {}).setdefault('bar', {}).setdefault('baz', {}), only more compact and supporting arbitrary-length paths.
The innermost dictionary is then used to set the value with the last element of the path as the key.
Demo:
>>> from functools import reduce
>>> def dot_to_json(a):
...     output = {}
...     for key, value in a.items():
...         path = key.split('.')[1:]  # ignore the json. prefix
...         target = reduce(lambda d, k: d.setdefault(k, {}), path[:-1], output)
...         target[path[-1]] = value
...     return output
...
>>> dot_to_json({'json.message.status.time':50, 'json.message.code.response':80, 'json.time':100})
{'message': {'status': {'time': 50}, 'code': {'response': 80}}, 'time': 100}
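For readers who find reduce() hard to follow, the same walk can be written as an explicit loop. This is only a sketch of the idea described above, not a different algorithm. The function name dot_to_nested and the prefix parameter are made up for the example, and it assumes every key has at least one part left after the prefix is removed:

```python
def dot_to_nested(flat, prefix="json"):
    """Turn {'json.a.b': 1} into {'a': {'b': 1}} by walking each dotted path."""
    output = {}
    for key, value in flat.items():
        path = key.split(".")
        if path[0] == prefix:
            path = path[1:]  # drop the leading prefix if present
        target = output
        for part in path[:-1]:
            # Descend one level, creating an empty dict when the key is missing.
            target = target.setdefault(part, {})
        target[path[-1]] = value  # the last part is the key for the value itself
    return output

nested = dot_to_nested({
    "json.message.status.time": 50,
    "json.message.code.response": 80,
    "json.time": 100,
})
print(nested)
```

Each pass through the inner loop does exactly what one step of the reduce() call does: it takes the current dictionary and returns the child dictionary for the next path element.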

Append values in the same key of a dictionary

How can I add different values under the same key of a dictionary? The values are added in a loop.
Below are the entries I want in the dictionary data_dict:
data_dict = {}
During each iteration, the output should look like:
Iteration1 -> {'HUBER': {'100': 5.42}}
Iteration2 -> {'HUBER': {'100': 5.42, '10': 8.34}}
Iteration3 -> {'HUBER': {'100': 5.42, '10': 8.34, '20': 7.75}} etc
However, at the end of the iterations, data_dict is left with the last entry only:
{'HUBER': {'80': 5.50}}
Here's the code:
import glob
path = "./meanFilesRun2/*.txt"
all_files = glob.glob(path)
data_dict = {}
def func_(all_lines, method, points, data_dict):
    if method == "HUBER":
        mean_error = float(all_lines[-1]) # end of the file contains total_error
        data_dict["HUBER"] = {points: mean_error}
        return data_dict
    elif method == "L1":
        mean_error = float(all_lines[-1])
        data_dict["L1"] = {points: mean_error}
        return data_dict
for file_ in all_files:
    lineMthds = file_.split("_")[1] # reading line methods like "HUBER/L1/L2..."
    algoNum = file_.split("_")[-2] # reading diff. algos number used like "1/2.."
    points = file_.split("_")[2] # diff. points used like "10/20/30..."
    if algoNum == "1":
        FI = open(file_, "r")
        all_lines = FI.readlines()
        data_dict = func_(all_lines, lineMthds, points, data_dict)
        print(data_dict)
        FI.close()
You can use dict.setdefault here. Currently the problem with your code is that in each call to func_ you're re-assigning data_dict["HUBER"] to a new dict.
Change:
data_dict["HUBER"] = {points: mean_error}
to:
data_dict.setdefault("HUBER", {})[points] = mean_error
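A small self-contained sketch of that change, using made-up (points, mean_error) pairs in place of the values read from the files. The helper name record_error is hypothetical:

```python
def record_error(data_dict, method, points, mean_error):
    # setdefault returns the existing inner dict, or inserts and returns a new one,
    # so earlier entries for the same method are kept.
    data_dict.setdefault(method, {})[points] = mean_error
    return data_dict

data_dict = {}
# Hypothetical sample values standing in for what is parsed from each file.
for points, mean_error in [("100", 5.42), ("10", 8.34), ("20", 7.75)]:
    record_error(data_dict, "HUBER", points, mean_error)
    print(data_dict)
```

After the loop, data_dict holds all three entries under "HUBER" instead of only the last one.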
You can use defaultdict from the collections module:
import collections
d = collections.defaultdict(dict)
d['HUBER']['100'] = 5.42
d['HUBER']['10'] = 3.45

python generating nested dictionary key error

I am trying to create a nested dictionary from a MySQL query, but I am getting a KeyError:
result = {}
for i, q in enumerate(query):
    result['data'][i]['firstName'] = q.first_name
    result['data'][i]['lastName'] = q.last_name
    result['data'][i]['email'] = q.email
Error:
KeyError: 'data'
Desired result:
result = {
'data': {
0: {'firstName': ''...}
1: {'firstName': ''...}
2: {'firstName': ''...}
}
}
You wanted to create a nested dictionary.
result = {} creates a flat dictionary, whose values can be of any type, such as str, int, list, or dict.
With this flat dictionary, Python knows how to handle an assignment to result["first"].
If you want "first" to be another dictionary as well, you need to tell Python with an assignment:
result['first'] = {}
Otherwise, Python raises a KeyError.
I think you are looking for this :)
>>> from collections import defaultdict
>>> mydict = lambda: defaultdict(mydict)
>>> result = mydict()
>>> result['Python']['rules']['the world'] = "Yes I Agree"
>>> result['Python']['rules']['the world']
'Yes I Agree'
result = {}
result['data'] = {}
for i, q in enumerate(query):
    result['data'][i] = {}
    result['data'][i]['firstName'] = q.first_name
    result['data'][i]['lastName'] = q.last_name
    result['data'][i]['email'] = q.email
Alternatively, you can use your own class, which adds the extra dicts automatically:
class AutoDict(dict):
    def __missing__(self, k):
        self[k] = AutoDict()
        return self[k]
result = AutoDict()
for i, q in enumerate(query):
    result['data'][i]['firstName'] = q.first_name
    result['data'][i]['lastName'] = q.last_name
    result['data'][i]['email'] = q.email
result['data'] does not exist, so you cannot add data to it.
Try this at the start (each result['data'][i] entry also has to be created before its fields are assigned):
result = {'data': {}}
You have to create the key data first:
result = {}
result['data'] = {}
for i, q in enumerate(query):
    result['data'][i] = {}
    result['data'][i]['firstName'] = q.first_name
    result['data'][i]['lastName'] = q.last_name
    result['data'][i]['email'] = q.email
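As a sketch, the whole structure can also be built in one step with a dict comprehension, which avoids the KeyError because every level is created as a literal. The Row namedtuple and its sample values are hypothetical stand-ins for the rows returned by the real query:

```python
from collections import namedtuple

# Hypothetical stand-in for the rows returned by the MySQL query.
Row = namedtuple("Row", ["first_name", "last_name", "email"])
query = [
    Row("Ada", "Lovelace", "ada@example.com"),
    Row("Alan", "Turing", "alan@example.com"),
]

result = {
    "data": {
        i: {"firstName": q.first_name, "lastName": q.last_name, "email": q.email}
        for i, q in enumerate(query)
    }
}
print(result["data"][0]["firstName"])
```

Nothing is ever looked up before it exists here, so no intermediate dictionaries have to be created by hand.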
