Python Compare List to Dict Value - python

Updated in an attempt to be more clear
I have three lists of dictionaries that I want to merge into one based on a value.
The lists look like this. The number of dictionaries in each list can vary.
unplanned = [{'service__name': u'Email', 'service_sum': 4}, {'service__name': u'Peoplesoft', 'service_sum': 2}]
planned = [{'service__name': u'Email', 'service_sum': 2}, {'service__name': u'Gopher', 'service_sum': 2}, {'service__name': u'Peoplesoft', 'service_sum': 4}]
emerg = [{'service__name': u'Internet', 'service_sum': 1}]
I want to take the 3 lists and create a new list that has the names from all 3 lists and the values (or 0 where a name is missing) in a set order. So I am thinking of something like this.
[(Email, (4, 2, 0)), (Peoplesoft, (2, 4, 0)), (Gopher, (0, 2, 0)), (Internet, (0, 0, 1))]
I thought I should create a list of the service__name values to compare against each list, so I did that, but I am not sure how to compare the 3 lists against this name list. I thought izip_longest would work but have no idea how to implement it. I am using 2.7.

Just use a dict, then convert it into a list afterwards:
some_list = [{'service__name': u'Email', 'service_sum': 4}, {'service__name': u'Email', 'service_sum': 1}, {'service__name': u'Network', 'service_sum': 0}]
def combine(items):  # parameter renamed from "list" so it does not shadow the built-in
    combined = {}
    for item in items:
        if item['service__name'] not in combined:
            combined[item['service__name']] = []
        combined[item['service__name']].append(item['service_sum'])
    return combined.items()
combine(some_list) # [(u'Email', [4, 1]), (u'Network', [0])]
combine(unplanned)
combine(emerg + planned)
.....
Here's the version of the function that uses defaultdict:
def combine(items):  # must not be named "list", or defaultdict(list) below receives the argument instead of the type
    from collections import defaultdict
    combined = defaultdict(list)
    for item in items:
        combined[item['service__name']].append(item['service_sum'])
    return combined.items()
A little cleaner, but there's an unnecessary import, and a few other problems with it that may pop up in the future if the function definition is changed (see comments).

It seems like you could do something like:
output = []
for dicts in zip(unplanned,planned,emerg):
    output.append(('Email',tuple(d['service_sum'] if d['service__name'] == 'Email' else 0 for d in dicts)))

Try the following code. You can give the variables better names, since you know the context better.
def convert(unplanned, planned, emerg):
    chain = (unplanned, planned, emerg)
    names = map(lambda lst: [d['service__name'] for d in lst], chain)
    sums = map(lambda lst: [d['service_sum'] for d in lst], chain)
    ds = [dict(zip(n, s)) for n,s in zip(names, sums)]
    unique_names = set([])
    unique_names = reduce(unique_names.union,names)
    results = []
    for n in unique_names:
        s = []
        for i in range(3):
            s.append(ds[i].get(n,0))
        results.append((n, tuple(s)))
    return results
print convert(unplanned, planned, emerg)
The output on my machine is
[(u'Internet', (0, 0, 1)), (u'Peoplesoft', (2, 4, 0)), (u'Email', (4, 2, 0)), (u'Gopher', (0, 2, 0))]
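For readers on Python 3, the same merge can be sketched with one lookup dict per input list. This is only an illustration, not the approach of any answer above: the function name merge_counts is hypothetical, and the sketch assumes each service name appears at most once per list and that names should come out in first-seen order.

```python
def merge_counts(*lists):
    """Return [(name, (sum_in_list_0, sum_in_list_1, ...)), ...]."""
    # One name -> sum lookup per input list.
    lookups = [{d['service__name']: d['service_sum'] for d in lst} for lst in lists]

    # Collect names in first-seen order across all lists.
    names = []
    seen = set()
    for lst in lists:
        for d in lst:
            name = d['service__name']
            if name not in seen:
                seen.add(name)
                names.append(name)

    # A name missing from a list contributes 0 in that position.
    return [(name, tuple(lk.get(name, 0) for lk in lookups)) for name in names]


unplanned = [{'service__name': 'Email', 'service_sum': 4},
             {'service__name': 'Peoplesoft', 'service_sum': 2}]
planned = [{'service__name': 'Email', 'service_sum': 2},
           {'service__name': 'Gopher', 'service_sum': 2},
           {'service__name': 'Peoplesoft', 'service_sum': 4}]
emerg = [{'service__name': 'Internet', 'service_sum': 1}]

result = merge_counts(unplanned, planned, emerg)
print(result)
```

Because the position in each tuple follows the argument order, passing the lists as (unplanned, planned, emerg) gives the "set order" asked for in the question.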

Related

What's a more Pythonic way of grabbing N items from a dictionary?

In Python, suppose I want to grab N arbitrary items from a dictionary—say, to print them, to inspect a few items. I don't care which items I get. I don't want to turn the dictionary into a list (as does some code I have seen); that seems like a waste. I can do it with the following code (where N = 5), but it seems like there has to be a more Pythonic way:
count = 0
for item in my_dict.items():
    if count >= 5:
        break
    print(item)
    count += 1
Thanks in advance!
You can use itertools.islice to slice any iterable (not only lists):
>>> import itertools
>>> my_dict = {i: i for i in range(10)}
>>> list(itertools.islice(my_dict.items(), 5))
[(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]
I might use zip and range:
>>> my_dict = {i: i for i in range(10)}
>>> for _, item in zip(range(5), my_dict.items()):
... print(item)
...
(0, 0)
(1, 1)
(2, 2)
(3, 3)
(4, 4)
The only purpose of the range here is to give an iterable that will cause zip to stop after 5 iterations.
You can modify what you have slightly:
for count, item in enumerate(dict.items()):
    if count >= 5:
        break
    print(item)
Note: in this case when you're looping through .items(), you're getting a key/value pair, which can be unpacked as you iterate:
for count, (key, value) in enumerate(dict.items()):
    if count >= 5:
        break
    print(f"{key=} {value=}")
If you want just the keys, you can just iterate over the dict.
for count, key in enumerate(dict):
    if count >= 5:
        break
    print(f"{key=}")
If you want just the values:
for count, value in enumerate(dict.values()):
    if count >= 5:
        break
    print(f"{value=}")
One last note: using dict as a variable name shadows the built-in dict and makes it unavailable in your code.
Typically, I would like to use slice notation to do this, but dict.items() returns a view object, which is not sliceable.
You have two main options:
Make it something that slice notation works on:
x = {'a':1, 'b':2, 'c': 3, 'd': 4, 'e': 5, 'f': 6}
for item in list(x.items())[:5]:
    print(item)
Use something that works on any iterable. In this case, the standard-library (and exceedingly popular) itertools module:
import itertools
x = {'a':1, 'b':2, 'c': 3, 'd': 4, 'e': 5, 'f': 6}
for item in itertools.islice(x.items(), 5):
    print(item)
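If this comes up often, the islice call can be wrapped in a small helper. The sketch below is only an illustration: the name take is an assumption (not something any answer above defines), and it relies on dicts preserving insertion order, which holds on Python 3.7 and later.

```python
from itertools import islice


def take(n, iterable):
    """Return the first n items of any iterable as a list."""
    # islice stops after n items, so only n items are ever materialized.
    return list(islice(iterable, n))


my_dict = {i: i * i for i in range(10)}
first_three = take(3, my_dict.items())
print(first_three)  # [(0, 0), (1, 1), (2, 4)]
```

Asking for more items than the dictionary holds is harmless: islice simply stops when the underlying view is exhausted.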

Convert nested dictionary into list of tuples

I have a nested dictionary that looks like the following
db1 = {
'Diane': {'Laundry': 2, 'Cleaning': 4, 'Gardening': 3},
'Betty': {'Gardening': 2, 'Tutoring': 1, 'Cleaning': 3},
'Charles': {'Plumbing': 2, 'Cleaning': 5},
'Adam': {'Cleaning': 4, 'Tutoring': 2, 'Baking': 1},
}
The desired sorted output looks like the following
[(5, [('Cleaning', ['Charles'])]),
(4, [('Cleaning', ['Adam', 'Diane'])]),
(3, [('Cleaning', ['Betty']), ('Gardening', ['Diane'])]),
(2, [('Gardening', ['Betty']), ('Laundry', ['Diane']),
('Plumbing', ['Charles']), ('Tutoring', ['Adam'])]),
(1, [('Baking', ['Adam']), ('Tutoring', ['Betty'])])]
It is a list of 2-tuples: the first element is the skill level, sorted in decreasing order, and the second element is another list of 2-tuples, each pairing a skill with a list of the names that have it at that level. Do I need to extract the information from the original dictionary and build a completely new list of tuples, or can I simply convert the original dictionary into a list of 2-tuples?
You can build up an intermediate dictionary, and then use it to produce your final output, as follows:
from pprint import pprint
db1 = {
'Diane': {'Laundry': 2, 'Cleaning': 4, 'Gardening': 3},
'Betty': {'Gardening': 2, 'Tutoring': 1, 'Cleaning': 3},
'Charles': {'Plumbing': 2, 'Cleaning': 5},
'Adam': {'Cleaning': 4, 'Tutoring': 2, 'Baking': 1},
}
d = {}
for k, v in db1.items():
    for kk, vv in v.items():
        if vv not in d:
            d[vv] = {}
        if kk not in d[vv]:
            d[vv][kk] = []
        d[vv][kk].append(k)
out = sorted([(k,
               [(kk, sorted(v[kk])) for kk in sorted(v.keys())])
              for k, v in d.items()],
             key=lambda t:t[0],
             reverse=True)
pprint(out)
Gives:
[(5, [('Cleaning', ['Charles'])]),
(4, [('Cleaning', ['Adam', 'Diane'])]),
(3, [('Cleaning', ['Betty']), ('Gardening', ['Diane'])]),
(2,
[('Gardening', ['Betty']),
('Laundry', ['Diane']),
('Plumbing', ['Charles']),
('Tutoring', ['Adam'])]),
(1, [('Baking', ['Adam']), ('Tutoring', ['Betty'])])]
(Note: it might be possible to use some kind of nested defaultdict to avoid the two if statements shown here, but I have not attempted this. If you did d=defaultdict(dict), that would avoid the first if statement, but the second would remain.)
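The nested defaultdict mentioned in the note can be sketched as follows. This is one possible way to do it, not something the answer above tested; the variable names are illustrative. Giving the outer defaultdict a factory that itself creates a defaultdict(list) removes both if statements.

```python
from collections import defaultdict

db1 = {
    'Diane': {'Laundry': 2, 'Cleaning': 4, 'Gardening': 3},
    'Betty': {'Gardening': 2, 'Tutoring': 1, 'Cleaning': 3},
    'Charles': {'Plumbing': 2, 'Cleaning': 5},
    'Adam': {'Cleaning': 4, 'Tutoring': 2, 'Baking': 1},
}

# level -> skill -> list of names; missing keys are created on first access.
by_level = defaultdict(lambda: defaultdict(list))
for name, skills in db1.items():
    for skill, level in skills.items():
        by_level[level][skill].append(name)

# Levels in decreasing order, skills and names alphabetically.
out = [
    (level, [(skill, sorted(by_level[level][skill]))
             for skill in sorted(by_level[level])])
    for level in sorted(by_level, reverse=True)
]
print(out)
```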

Is there a way to self generate similar key names and values in a dictionary with a loop?

I want to create a dictionary using a loop or a similar technique. Something like the variable assignment below is possible.
my_dict = {v:int(v*random()) for v in range(10)}
The question I am stuck on, though, is this: how can I generate similar names for the item keys? An example is given below:
{'Item-1': 1, 'Item-2':3, 'Item-3':3 ....}
Thanks in advance!
from random import random
my_dict = {f'item-{v+1}': int(v*random()) for v in range(10)}
print(my_dict)
Output:
{'item-1': 0, 'item-2': 0, 'item-3': 1, 'item-4': 1, 'item-5': 0, 'item-6': 3, 'item-7': 2, 'item-8': 4, 'item-9': 6, 'item-10': 2}
This uses an f-string to create the key, the corresponding value is randomly generated like in your question.
You can use a dictionary comprehension too.
from random import randint
dic = {f"item-{i}": randint(0, 10) for i in range(1, 11)}
print(dic)
Create keys and values and add them to my_dict in a loop:
my_dict = {}
for v in range(10):
    my_dict[f'Item-{v}'] = v
print(my_dict)
{'Item-0': 0, 'Item-1': 1, 'Item-2': 2, 'Item-3': 3, 'Item-4': 4, 'Item-5': 5, 'Item-6': 6, 'Item-7': 7, 'Item-8': 8, 'Item-9': 9}

Finding item frequency in list of lists

Let's say I have a list of lists and I want to find the frequency with which pairs (or larger groups) of elements appear in total.
For example, if I have [[a,b,c],[b,c,d],[c,d,e]]
I want: (a,b) = 1, (b,c) = 2, (c,d) = 2, etc.
I tried finding a usable apriori algorithm that would allow me to do this, but I couldn't find an easy-to-implement one in Python.
How would I approach this problem in a better way?
This is a way to do it:
from itertools import combinations
l = [['a','b','c'],['b','c','d'],['c','d','e']]
d = {}
for i in l:
    # for every item on l take all the possible combinations of 2
    comb = combinations(i, 2)
    for c in comb:
        k = ''.join(c)
        if d.get(k):
            d[k] += 1
        else:
            d[k] = 1
Result:
>>> d
{'bd': 1, 'ac': 1, 'ab': 1, 'bc': 2, 'de': 1, 'ce': 1, 'cd': 2}
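A variation on the same idea, sketched here with collections.Counter and tuple keys instead of joined strings. Two things in it are assumptions rather than part of the answer above: each sublist is sorted first so that a pair counts the same regardless of the order in which its elements appear, and the names lists and pair_counts are illustrative.

```python
from collections import Counter
from itertools import combinations

lists = [['a', 'b', 'c'], ['b', 'c', 'd'], ['c', 'd', 'e']]

# Count every 2-element combination of every sublist.
# Tuple keys stay unambiguous even when elements are longer than one character.
pair_counts = Counter(
    pair
    for sub in lists
    for pair in combinations(sorted(sub), 2)
)

print(pair_counts[('b', 'c')])  # 2
print(pair_counts[('a', 'b')])  # 1
```

Changing the 2 passed to combinations to 3 counts triples instead of pairs.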

Is there a way in Python to index a list of containers (tuples, lists, dictionaries) by an element of a container?

I have been poking around for a recipe / example to index a list of tuples without taking a modification of the decorate, sort, undecorate approach.
For example:
l=[(a,b,c),(x,c,b),(z,c,b),(z,c,d),(a,d,d),(x,d,c) . . .]
The approach I have been using is to build a dictionary using defaultdict of the second element
from collections import defaultdict
tdict=defaultdict(int)
for myTuple in l:
    tdict[myTuple[1]]+=1
Then I have to build a list consisting of only the second item in the tuple for each item in the list. While there are a number of ways to get there a simple approach is to:
tempList=[myTuple[1] for myTuple in l]
and then generate an index of each item in tdict
indexDict=defaultdict(dict)
for key in tdict:
    indexDict[key]['index']=tempList.index(key)
Clearly this does not seem very Pythonic. I have been trying to find examples or insights thinking that I should be able to use something magical to get the index directly. No such luck so far.
Note: I understand that I could take a more direct approach and skip generating tdict.
The output could be a dictionary with the index:
indexDict={'b':{'index':0},'c':{'index':1},'d':{'index':4},. . .}
After learning a lot from Nadia's responses I think the answer is no.
While her response works, I think it is more complicated than needed. I would simply do this:
def build_index(someList):
    indexDict={}
    for item in enumerate(someList):
        if item[1][1] not in indexDict:
            indexDict[item[1][1]]=item[0]
    return indexDict
This will generate the result you want
dict((myTuple[1], index) for index, myTuple in enumerate(l))
>>> l = [(1, 2, 3), (4, 5, 6), (1, 4, 6)]
>>> dict((myTuple[1], index) for index, myTuple in enumerate(l))
{2: 0, 4: 2, 5: 1}
And if you insist on using a dictionary to represent the index:
dict((myTuple[1], {'index': index}) for index, myTuple in enumerate(l))
The result will be:
{2: {'index': 0}, 4: {'index': 2}, 5: {'index': 1}}
EDIT
If you want to handle key collision then you'll have to extend the solution like this:
def build_index(l):
    indexes = [(myTuple[1], index) for index, myTuple in enumerate(l)]
    d = {}
    for e, index in indexes:
        d[e] = min(index, d.get(e, index))
    return d
>>> l = [(1, 2, 3), (4, 5, 6), (1, 4, 6), (2, 4, 6)]
>>> build_index(l)
{2: 0, 4: 2, 5: 1}
EDIT 2
And a more generalized and compact solution (in a similar definition to sorted)
def index(l, key):
    d = {}
    for index, myTuple in enumerate(l):
        d[key(myTuple)] = min(index, d.get(key(myTuple), index))
    return d
>>> index(l, lambda a: a[1])
{2: 0, 4: 2, 5: 1}
So the answer to your question is yes: there is a way in Python to index a list of containers (tuples, lists, dictionaries) by an element of a container without preprocessing. However, your request to store the result in a dictionary makes it impossible to do as a one-liner. Still, there is no preprocessing here: the list is iterated only once.
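The key-collision handling above can also be sketched with dict.setdefault, which stores a key only the first time it is seen. This is an alternative illustration rather than the answerer's code; the name first_index_by is hypothetical.

```python
def first_index_by(items, key):
    """Map key(item) to the position of the first item that produces it."""
    index = {}
    for position, item in enumerate(items):
        # setdefault leaves an existing entry untouched, so the earliest
        # position wins without needing min().
        index.setdefault(key(item), position)
    return index


rows = [(1, 2, 3), (4, 5, 6), (1, 4, 6), (2, 4, 6)]
result = first_index_by(rows, lambda row: row[1])
print(result)  # {2: 0, 5: 1, 4: 2}
```

Like the version above, this walks the list once, and the key function is called only once per item.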
If I understand what you're asking...
l = ['asd', 'asdxzc']
d = {}
for i, x in enumerate(l):
    d[x] = {'index': i}
