Append values in the same key of a dictionary

How can I add different values under the same key of a dictionary? The values are added
in a loop.
Below are the entries I want in the dictionary data_dict:
data_dict = {}
After each iteration, the output should look like:
Iteration1 -> {'HUBER': {'100': 5.42}}
Iteration2 -> {'HUBER': {'100': 5.42, '10': 8.34}}
Iteration3 -> {'HUBER': {'100': 5.42, '10': 8.34, '20': 7.75}} etc
However, at the end of the iterations, data_dict is left with the last entry only:
{'HUBER': {'80': 5.50}}
Here's the code:
import glob
path = "./meanFilesRun2/*.txt"
all_files = glob.glob(path)
data_dict = {}
def func_(all_lines, method, points, data_dict):
    if method == "HUBER":
        mean_error = float(all_lines[-1]) # end of the file contains total_error
        data_dict["HUBER"] = {points: mean_error}
        return data_dict
    elif method == "L1":
        mean_error = float(all_lines[-1])
        data_dict["L1"] = {points: mean_error}
        return data_dict
for file_ in all_files:
    lineMthds = file_.split("_")[1] # reading line methods like "HUBER/L1/L2..."
    algoNum = file_.split("_")[-2] # reading diff. algos number used like "1/2.."
    points = file_.split("_")[2] # diff. points used like "10/20/30..."
    if algoNum == "1":
        FI = open(file_, "r")
        all_lines = FI.readlines()
        data_dict = func_(all_lines, lineMthds, points, data_dict)
        print(data_dict)
        FI.close()

You can use dict.setdefault here. Currently the problem with your code is that in each call to func_ you're re-assigning data_dict["HUBER"] to a new dict.
Change:
data_dict["HUBER"] = {points: mean_error}
to:
data_dict.setdefault("HUBER", {})[points] = mean_error
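As a minimal sketch of how this behaves inside a loop, the snippet below replaces the file parsing with a hypothetical list of (method, points, mean_error) tuples; the values are made up for illustration. Each iteration adds to the existing inner dict instead of replacing it.

```python
# Hypothetical records standing in for the values parsed from the files.
records = [
    ("HUBER", "100", 5.42),
    ("HUBER", "10", 8.34),
    ("L1", "100", 6.10),
    ("HUBER", "20", 7.75),
]

data_dict = {}
for method, points, mean_error in records:
    # setdefault returns the inner dict for this method,
    # inserting an empty one first if the key is new.
    data_dict.setdefault(method, {})[points] = mean_error

print(data_dict)
```

Because the key is passed in rather than hard-coded, the separate HUBER and L1 branches are no longer needed.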

You can use defaultdict from the collections module:
import collections
d = collections.defaultdict(dict)
d['HUBER']['100'] = 5.42
d['HUBER']['10'] = 3.45
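A small sketch of the same idea wrapped in a function; collect_errors is a hypothetical helper name, and the input tuples are illustrative. Converting the result back to a plain dict at the end is a design choice: a defaultdict creates an entry whenever a missing key is read, which can hide typos later.

```python
from collections import defaultdict

def collect_errors(records):
    """Group (method, points, mean_error) tuples into {method: {points: mean_error}}."""
    grouped = defaultdict(dict)
    for method, points, mean_error in records:
        grouped[method][points] = mean_error
    # Return a plain dict so later lookups of unknown methods raise KeyError.
    return dict(grouped)

result = collect_errors([("HUBER", "100", 5.42), ("HUBER", "10", 3.45)])
print(result)
```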

Related

Creating nested subdict python

I have names and scores for different types of student work (exams, quizzes, homework), each stored in a list.
For example:
students_exam_names = [exam_name1, exam_name2, exam_name3]
students_exam_score = [exam_score1, exam_score2, exam_score3]
students_quiz_names = [quiz_name1, quiz_name2]
students_quiz_score = [quiz_score1, quiz_score2]
students_homework_names = [homework_name1, homework_name2, homework_name3, homework_name4]
students_homework_score = [homework_score1, homework_score2, homework_score3, homework_score4]
I want the details in the form of a nested dict, as follows:
details = {
    'students_exam': {
        'exam_name1': exam_score1,
        'exam_name2': exam_score2,
        'exam_name3': exam_score3
    },
    'students_quiz': {
        'quiz_name1': quiz_score1,
        'quiz_name2': quiz_score2
    },
    'students_homework': {
        'homework_name1': homework_score1,
        'homework_name2': homework_score2,
        'homework_name3': homework_score3,
        'homework_name4': homework_score4,
    }
}
The lists for each type have different lengths. I tried building one dictionary per type as below, but couldn't get further.
students_exam = {}
for i in range(len(students_exam_names)):
    students_exam[students_exam_names[i]] = students_exam_score[i]

Remember to put quotes (') around the strings when you define your inputs:
students_exam_names = ['exam_name1', 'exam_name2', 'exam_name3']
students_exam_score = ['exam_score1', 'exam_score2', 'exam_score3']
students_quiz_names = ['quiz_name1', 'quiz_name2']
students_quiz_score = ['quiz_score1', 'quiz_score2']
students_homework_names = ['homework_name1', 'homework_name2', 'homework_name3', 'homework_name4']
students_homework_score = ['homework_score1', 'homework_score2', 'homework_score3', 'homework_score4']
Then, simply use the zip function:
details = {'students_exam': dict(zip(students_exam_names, students_exam_score)),
           'students_quiz': dict(zip(students_quiz_names, students_quiz_score)),
           'students_homework': dict(zip(students_homework_names, students_homework_score))}
The output is:
{'students_exam': {'exam_name1': 'exam_score1', 'exam_name2': 'exam_score2', 'exam_name3': 'exam_score3'}, 'students_quiz': {'quiz_name1': 'quiz_score1', 'quiz_name2': 'quiz_score2'}, 'students_homework': {'homework_name1': 'homework_score1', 'homework_name2': 'homework_score2', 'homework_name3': 'homework_score3', 'homework_name4': 'homework_score4'}}

Assuming your complete set of inputs looks like this:
students_exam_names = ['name1', 'name2', 'name3']
students_exam_score = ['score1', 'score2', 'score3']
students_quiz_names = ['name1', 'name2']
students_quiz_score = ['score1', 'score2']
students_homework_names = ['name1', 'name2', 'name3', 'name4']
students_homework_score = ['score1', 'score2', 'score3', 'score4']
If so, the following code should do the job:
details={}
details['students_exam']={sexam: students_exam_score[students_exam_names.index(sexam)] for sexam in students_exam_names}
details['students_quiz']={squiz: students_quiz_score[students_quiz_names.index(squiz)] for squiz in students_quiz_names}
details['students_homework']={shome: students_homework_score[students_homework_names.index(shome)] for shome in students_homework_names}
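One caveat with the lookup via list.index, shown here as a small sketch with made-up data: index always returns the position of the first match, so a repeated name picks up the first score, and each lookup rescans the list. Pairing the lists with zip avoids both issues.

```python
# Hypothetical data with a repeated name.
names = ["name1", "name2", "name1"]
scores = [10, 20, 30]

# index() finds the first "name1" both times, so the score 30 is never used.
by_index = {n: scores[names.index(n)] for n in names}

# zip() pairs items by position; the later duplicate overwrites the earlier one.
by_zip = dict(zip(names, scores))

print(by_index)  # {'name1': 10, 'name2': 20}
print(by_zip)    # {'name1': 30, 'name2': 20}
```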

It looks like you need some functions to do these updates:
def update_exam(details, names, scores):
    results = {}
    for name, score in zip(names, scores):
        results[name] = score
    details['students_exam'] = results
def update_quiz(details, names, scores):
    results = {}
    for name, score in zip(names, scores):
        results[name] = score
    details['students_quiz'] = results
def update_homework(details, names, scores):
    results = {}
    for name, score in zip(names, scores):
        results[name] = score
    details['students_homework'] = results
details = {}
update_exam(details, students_exam_names, students_exam_score)
update_quiz(details, students_quiz_names, students_quiz_score)
update_homework(details, students_homework_names, students_homework_score)
But since the above functions only really differ in the text name of the key, they can be collapsed further:
def update(details, key, names, scores):
    results = {}
    for name, score in zip(names, scores):
        results[name] = score
    details[key] = results
details = {}
update(details, 'students_exam', students_exam_names, students_exam_score)
update(details, 'students_quiz', students_quiz_names, students_quiz_score)
update(details, 'students_homework', students_homework_names, students_homework_score)
And then the loop can become a dictionary comprehension:
def update(details, key, names, scores):
    details[key] = {name: score for (name, score) in zip(names, scores)}
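Taking this one step further, here is a sketch that describes all the name and score lists in one mapping and builds the nested dict in a single pass. The helper name build_details and the sample values are assumptions for illustration. Since zip silently stops at the shorter input, the sketch checks the lengths first.

```python
def build_details(groups):
    """Turn {key: (names, scores)} into {key: {name: score}}."""
    details = {}
    for key, (names, scores) in groups.items():
        if len(names) != len(scores):
            raise ValueError("names and scores differ in length for %r" % key)
        details[key] = dict(zip(names, scores))
    return details

# Hypothetical sample input.
details = build_details({
    "students_exam": (["exam_name1", "exam_name2"], [90, 85]),
    "students_quiz": (["quiz_name1"], [7]),
})
print(details)
```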

Hierarchical grouping in key value pair with python

I have a list like this:
data = [
{'date':'2017-01-02', 'model': 'iphone5', 'feature':'feature1'},
{'date':'2017-01-02', 'model': 'iphone7', 'feature':'feature2'},
{'date':'2017-01-03', 'model': 'iphone6', 'feature':'feature2'},
{'date':'2017-01-03', 'model': 'iphone6', 'feature':'feature2'},
{'date':'2017-01-03', 'model': 'iphone7', 'feature':'feature3'},
{'date':'2017-01-10', 'model': 'iphone7', 'feature':'feature2'},
{'date':'2017-01-10', 'model': 'iphone7', 'feature':'feature1'},
]
I want to achieve this:
[
    {
        '2017-01-02': [{'iphone5': ['feature1']}, {'iphone7': ['feature2']}]
    },
    {
        '2017-01-03': [{'iphone6': ['feature2']}, {'iphone7': ['feature3']}]
    },
    {
        '2017-01-10': [{'iphone7': ['feature2', 'feature1']}]
    }
]
I need an efficient way to do this, since there could be a lot of data.
I was trying this:
data = sorted(data, key=itemgetter('date'))
date = itertools.groupby(data, key=itemgetter('date'))
But I'm getting nothing for the value of the 'date' key.
Later I will iterate over this structure to build an HTML page.

You can do this pretty efficiently and cleanly using defaultdict. Unfortunately it's a pretty advanced use and it gets hard to read.
from collections import defaultdict
from pprint import pprint
# create a dictionary whose elements are automatically dictionaries of sets
result_dict = defaultdict(lambda: defaultdict(set))
# Construct a dictionary with one key for each date and another dict ('model_dict')
# as the value.
# The model_dict has one key for each model and a set of features as the value.
for d in data:
    result_dict[d["date"]][d["model"]].add(d["feature"])
# more explicit version:
# for d in data:
#     model_dict = result_dict[d["date"]] # created automatically if needed
#     feature_set = model_dict[d["model"]] # created automatically if needed
#     feature_set.add(d["feature"])
# convert the result_dict into the required form
result_list = [
    {
        date: [
            {phone: list(feature_set)}
            for phone, feature_set in sorted(model_dict.items())
        ]
    } for date, model_dict in sorted(result_dict.items())
]
pprint(result_list)
# [{'2017-01-02': [{'iphone5': ['feature1']}, {'iphone7': ['feature2']}]},
# {'2017-01-03': [{'iphone6': ['feature2']}, {'iphone7': ['feature3']}]},
# {'2017-01-10': [{'iphone7': ['feature2', 'feature1']}]}]
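A set has no defined order, so the order of features within each list above is not guaranteed. If the first-seen order matters, one possible variant (a sketch, assuming Python 3.7+ where dicts keep insertion order) uses the keys of an inner dict as an ordered set:

```python
from collections import defaultdict

data = [
    {'date': '2017-01-02', 'model': 'iphone5', 'feature': 'feature1'},
    {'date': '2017-01-02', 'model': 'iphone7', 'feature': 'feature2'},
    {'date': '2017-01-03', 'model': 'iphone6', 'feature': 'feature2'},
    {'date': '2017-01-03', 'model': 'iphone6', 'feature': 'feature2'},
    {'date': '2017-01-03', 'model': 'iphone7', 'feature': 'feature3'},
    {'date': '2017-01-10', 'model': 'iphone7', 'feature': 'feature2'},
    {'date': '2017-01-10', 'model': 'iphone7', 'feature': 'feature1'},
]

# date -> model -> {feature: None}; the innermost keys act as an ordered set.
grouped = defaultdict(lambda: defaultdict(dict))
for d in data:
    grouped[d["date"]][d["model"]][d["feature"]] = None

result_list = [
    {date: [{model: list(features)} for model, features in sorted(models.items())]}
    for date, models in sorted(grouped.items())
]
print(result_list)
```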

You can try this. Here td is a dict that stores {model: index}, so we can check whether a model already exists in the list of dicts:
from itertools import groupby
from operator import itemgetter
r = []
for i in groupby(sorted(data, key=itemgetter('date')), key=itemgetter('date')):
    td, tl = {}, []
    for j in i[1]:
        if j["model"] not in td:
            tl.append({j["model"]: [j["feature"]]})
            td[j["model"]] = len(tl) - 1
        elif j["feature"] not in tl[td[j["model"]]][j["model"]]:
            tl[td[j["model"]]][j["model"]].append(j["feature"])
    r.append({i[0]: tl})
Result:
[
{'2017-01-02': [{'iphone5': ['feature1']}, {'iphone7': ['feature2']}]},
{'2017-01-03': [{'iphone6': ['feature2']}, {'iphone7': ['feature3']}]},
{'2017-01-10': [{'iphone7': ['feature2', 'feature1']}]}
]
As a matter of fact, I think the data structure could be simplified; you may not need so much nesting.
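One possible simplification, sketched here under the assumption that a plain {date: {model: [features]}} mapping is enough for rendering: the single-key dicts and the outer list disappear, and iterating for output stays straightforward. The helper name group_features and the sample rows are illustrative.

```python
def group_features(rows):
    """Group rows into {date: {model: [features]}}, keeping first-seen feature order."""
    grouped = {}
    for row in rows:
        features = grouped.setdefault(row["date"], {}).setdefault(row["model"], [])
        if row["feature"] not in features:
            features.append(row["feature"])
    return grouped

rows = [
    {'date': '2017-01-10', 'model': 'iphone7', 'feature': 'feature2'},
    {'date': '2017-01-10', 'model': 'iphone7', 'feature': 'feature1'},
    {'date': '2017-01-02', 'model': 'iphone5', 'feature': 'feature1'},
]
grouped = group_features(rows)

# Walk the structure in date order, e.g. to emit one line (or HTML row) per model.
for date in sorted(grouped):
    for model, features in grouped[date].items():
        print(date, model, ", ".join(features))
```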

total_result = list()
result = dict()
inner_value = dict()
for d in data:
    if d["date"] not in result:
        if result:
            total_result.append(result)
        result = dict()
        result[d["date"]] = set()
        inner_value = dict()
    if d["model"] not in inner_value:
        inner_value[d["model"]] = set()
    inner_value[d["model"]].add(d["feature"])
    tmp_v = [{key: list(inner_value[key])} for key in inner_value]
    result[d["date"]] = tmp_v
total_result.append(result)
total_result
[{'2017-01-02': [{'iphone7': ['feature2']}, {'iphone5': ['feature1']}]},
{'2017-01-03': [{'iphone6': ['feature2']}, {'iphone7': ['feature3']}]},
{'2017-01-10': [{'iphone7': ['feature2', 'feature1']}]}]

Aggregating values in one column by their corresponding value in another from two files

I have a question about summing the values of duplicate keys into a single key with the aggregate total. For example:
1:5
2:4
3:2
1:4
Very basic but I'm looking for an output that looks like:
1:9
2:4
3:2
The two files I am using cover 51 users. Each row of user_artists.dat gives the userID (column 1), an artistID (column 2), and the weight (column 3), which is how many times that user has listened to that artist.
I am attempting to aggregate the total number of times each artist has been played across all users, and display it in a format such as:
Britney Spears (289) 2393140. Any help or input would be much appreciated.
import codecs
#from collections import defaultdict
with codecs.open("artists.dat", encoding = "utf-8") as f:
    artists = f.readlines()
with codecs.open("user_artists.dat", encoding = "utf-8") as f:
    users = f.readlines()
artist_list = [x.strip().split('\t') for x in artists][1:]
user_stats_list = [x.strip().split('\t') for x in users][1:]
artists = {}
for a in artist_list:
    artistID, name = a[0], a[1]
    artists[artistID] = name
grouped_user_stats = {}
for u in user_stats_list:
    userID, artistID, weight = u
    grouped_user_stats[artistID] = grouped_user_stats[artistID].astype(int)
    grouped_user_stats[weight] = grouped_user_stats[weight].astype(int)
    for artistID, weight in u:
        grouped_user_stats.groupby('artistID')['weight'].sum()
        print(grouped_user_stats.groupby('artistID')['weight'].sum())
    #if userID not in grouped_user_stats:
        #grouped_user_stats[userID] = { artistID: {'name': artists[artistID], 'plays': 1} }
    #else:
        #if artistID not in grouped_user_stats[userID]:
            #grouped_user_stats[userID][artistID] = {'name': artists[artistID], 'plays': 1}
        #else:
            #grouped_user_stats[userID][artistID]['plays'] += 1
            #print('this never happens')
#print(grouped_user_stats)

How about:
import codecs
from collections import defaultdict
# read stuff
with codecs.open("artists.dat", encoding="utf-8") as f:
    artists = f.readlines()
with codecs.open("user_artists.dat", encoding="utf-8") as f:
    users = f.readlines()
# transform artist data into a dict with "artist id" as key and "artist name" as value
artist_repo = dict(x.strip().split('\t')[:2] for x in artists[1:])
user_stats_list = [x.strip().split('\t') for x in users][1:]
grouped_user_stats = defaultdict(int)
for u in user_stats_list:
    # userID, artistID, weight = u
    grouped_user_stats[u[1]] += int(u[2]) # accumulate weights in a dict with artist id as key and sum of weights as value
# extra: "fancying" the data by transforming the keys of the dict into "<artist name> (artist id)" format
grouped_user_stats = dict(("%s (%s)" % (artist_repo.get(k, "Unknown artist"), k), v) for k, v in grouped_user_stats.items())
# lastly print it
for k, v in grouped_user_stats.items():
    print(k, v)
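For the small key:value example from the question, the same accumulation can be sketched with collections.Counter, which behaves like a dict whose missing keys count as zero. The input list below just reproduces the sample lines; nothing here depends on the .dat files.

```python
from collections import Counter

# The sample "key:value" lines from the question.
lines = ["1:5", "2:4", "3:2", "1:4"]

totals = Counter()
for line in lines:
    key, value = line.split(":")
    totals[key] += int(value)  # missing keys start at 0

for key in sorted(totals):
    print("%s:%d" % (key, totals[key]))
# prints 1:9, 2:4 and 3:2 on separate lines
```

Counter also offers most_common(), which is handy if the artists should be listed by total plays.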

python generating nested dictionary key error

I am trying to create a nested dictionary from a mysql query but I am getting a key error
result = {}
for i, q in enumerate(query):
    result['data'][i]['firstName'] = q.first_name
    result['data'][i]['lastName'] = q.last_name
    result['data'][i]['email'] = q.email
Error:
KeyError: 'data'
Desired result:
result = {
    'data': {
        0: {'firstName': ''...},
        1: {'firstName': ''...},
        2: {'firstName': ''...}
    }
}

You wanted to create a nested dictionary.
result = {} creates a flat dictionary, whose items can have any values like "string", "int", "list" or "dict".
For this flat dictionary,
Python knows how to handle an assignment to result["first"].
If you want "first" to be another dictionary as well, you need to tell Python with an assignment:
result['first'] = {}
Otherwise, Python raises a KeyError.
I think you are looking for this :)
>>> from collections import defaultdict
>>> mydict = lambda: defaultdict(mydict)
>>> result = mydict()
>>> result['Python']['rules']['the world'] = "Yes I Agree"
>>> result['Python']['rules']['the world']
'Yes I Agree'
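One thing to keep in mind with this recursive defaultdict, shown as a sketch: merely reading a missing key creates it. Converting the finished structure back to plain dicts is one way to stop that once the data is built; tree and to_plain are hypothetical helper names and the sample values are made up.

```python
from collections import defaultdict

def tree():
    """Return a defaultdict whose missing keys become nested trees."""
    return defaultdict(tree)

def to_plain(node):
    """Recursively convert nested defaultdicts into ordinary dicts."""
    if isinstance(node, defaultdict):
        return {key: to_plain(value) for key, value in node.items()}
    return node

result = tree()
result["data"][0]["firstName"] = "Ada"
result["typo"]  # reading a missing key silently creates an empty entry

plain = to_plain(result)
print(plain)  # {'data': {0: {'firstName': 'Ada'}}, 'typo': {}}
```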

result = {}
result['data'] = {}
for i, q in enumerate(query):
    result['data'][i] = {}
    result['data'][i]['firstName'] = q.first_name
    result['data'][i]['lastName'] = q.last_name
    result['data'][i]['email'] = q.email
Alternatively, you can use your own class which adds the extra dicts automatically:
class AutoDict(dict):
    def __missing__(self, k):
        self[k] = AutoDict()
        return self[k]
result = AutoDict()
for i, q in enumerate(query):
    result['data'][i]['firstName'] = q.first_name
    result['data'][i]['lastName'] = q.last_name
    result['data'][i]['email'] = q.email

result['data'] does not exist yet, so you cannot add data to it.
Try this at the start:
result = {'data': {}}
Each result['data'][i] must likewise be set to {} before its fields are assigned.

You have to create the key data first:
result = {}
result['data'] = {}
for i, q in enumerate(query):
    result['data'][i] = {}
    result['data'][i]['firstName'] = q.first_name
    result['data'][i]['lastName'] = q.last_name
    result['data'][i]['email'] = q.email
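Since every inner dict is built in full on each iteration, the whole structure can also be written as a nested dict comprehension. This is only a sketch: the Row namedtuple and its sample values are hypothetical stand-ins for the rows returned by the real query.

```python
from collections import namedtuple

# Hypothetical stand-in for the rows returned by the database query.
Row = namedtuple("Row", ["first_name", "last_name", "email"])
query = [
    Row("Ada", "Lovelace", "ada@example.com"),
    Row("Alan", "Turing", "alan@example.com"),
]

result = {
    "data": {
        i: {"firstName": q.first_name, "lastName": q.last_name, "email": q.email}
        for i, q in enumerate(query)
    }
}
print(result["data"][1]["firstName"])  # Alan
```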

Learning Python: Store values in dict from stdout

How can I do the following in Python:
I have a command that outputs this:
Datexxxx
Clientxxx
Timexxx
Datexxxx
Client2xxx
Timexxx
Datexxxx
Client3xxx
Timexxx
And I want to store this in a dict like:
Client:(date,time), Client2:(date,time) ...

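After reading the data into a string subject, you could do this:
import re
d = {}
# raw string, so that \r, \n and \d reach the regex engine unchanged
for match in re.finditer(
    r"""(?mx)
    ^Date(.*)\r?\n
    Client\d*(.*)\r?\n
    Time(.*)""",
    subject):
    # group 1 is the date, group 2 the client, group 3 the time
    d[match.group(2)] = (match.group(1), match.group(3))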

How about something like:
rows = {}
thisrow = []
for line in output.split('\n'):
    if line[:4].lower() == 'date':
        thisrow.append(line)
    elif line[:6].lower() == 'client':
        thisrow.append(line)
    elif line[:4].lower() == 'time':
        thisrow.append(line)
    elif line.strip() == '':
        rows[thisrow[1]] = (thisrow[0], thisrow[2])
        thisrow = []
print(rows)
Assumes a trailing newline, no spaces before lines, etc.
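The sample output in the question shows no blank lines between records. If every record really is exactly three lines in Date, Client, Time order, a sketch that reads the lines in fixed groups of three avoids relying on separators. The sample text and its values below are made up for illustration.

```python
# Hypothetical command output: three lines per record, no blank separators.
output = (
    "Date2020-01-01\nClientA\nTime10:00\n"
    "Date2020-01-02\nClient2B\nTime11:00\n"
)

lines = [line.strip() for line in output.splitlines() if line.strip()]

records = {}
line_iter = iter(lines)
# Zipping one iterator with itself three times yields consecutive triples.
for date_line, client_line, time_line in zip(line_iter, line_iter, line_iter):
    # Drop the "Date" and "Time" prefixes; keep the whole client line as the key.
    records[client_line] = (date_line[len("Date"):], time_line[len("Time"):])

print(records)
```

An incomplete trailing record (fewer than three lines) is silently dropped by zip, which may or may not be what you want.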

What about using a dict with tuples?
Create a dictionary and add the entries:
clients = {}
clients['Client'] = ('date1','time1')
clients['Client2'] = ('date2','time2')
Accessing the entries:
>>> clients['Client']
('date1', 'time1')
