Python - Add new entries to dictionary while iterating over another dictionary - python

I want to check if there is an overlap between points in one dictionary with another.
If there is no overlap, then create a new key in the dictionary with the value.
I am getting an error while implementing the following code.
RuntimeError: dictionary changed size during iteration
for k1 in d1:
    for k2 in d2:
        if(overlap(k1,k2)==False):
            d2[len(d2)+1]=d1[k1]
Is there another way to implement this?
Edit:
d1 = {"file1":[2,3],"file2":[11,15]}
d2 = {1:[1,5],2:[6,10]}
Output:
d2 = {1:[1,5],2:[6,10],3:[11,15]}

I believe you are looking for dict.update():
d1 = {'foo': 0,
'bar': 1}
d2 = {'foo': 0,
'bar': 1,
'baz': 2}
d1.update(d2)
This results in:
d1 = {'foo': 0,
'bar': 1,
'baz': 2}
Edit:
import itertools
d1 = {"file1":[2,3],
"file2":[11,15]}
d2 = {1:[1,5],
2:[6,10]}
d2 = {**d2, **{max(d2.keys())+i+1: v for i, (k, v) in enumerate({k: v for k, v in d1.items() if not any(i in range(v[0], v[1]+1) for i in itertools.chain.from_iterable(range(v[0], v[1]+1) for v in d2.values()))}.items())}}
Produces:
d2 = {1: [1, 5],
2: [6, 10],
3: [11, 15]}
😃

If you are trying to merge dictionaries then the faster way is using one of these
a = dict(one=1,two=2,three=3,four=4,five=5,six=6)
b = dict(one=1,two=2,three=5,six=6, seven=7, nine=9)
a.update(b)
Which gives {'one': 1, 'two': 2, 'three': 5, 'four': 4, 'five': 5, 'six': 6, 'seven': 7, 'nine': 9}
Or you can use c = dict(a, **b)
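As a rough side-by-side sketch of the options above (the names merged_unpack, merged_ctor and merged_copy are made up for illustration), all three build the same merged dictionary without modifying a. Note that dict(a, **b) only works when every key in b is a string, because the keys are passed as keyword arguments.

```python
a = dict(one=1, two=2, three=3)
b = dict(three=5, seven=7)

# Unpacking: later entries win on duplicate keys
merged_unpack = {**a, **b}

# Constructor form: requires all keys of b to be strings
merged_ctor = dict(a, **b)

# Copy first, then update, so that a itself stays unchanged
merged_copy = a.copy()
merged_copy.update(b)

print(merged_unpack)
```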

You don't need a double loop there. Loop through the keys in d1 and check whether each one is already a key in d2. If not, add it to d2. Something along these lines:
def add_missing(d1, d2):  # function name is illustrative
    for k1 in d1:
        if k1 not in d2:
            d2[k1] = d1[k1]
    return d2

Based on your description, edit, and comments it doesn't sound like dict.update() will give you the results you want.
The error you're hitting occurs because the loop assumes the number of items it iterates over stays fixed once it starts. During the loop you change that number, which invalidates the assumption and causes the program to fail.
To fix this, you need to iterate a copy of the items instead. You also mentioned you're concerned with the values and not the keys, so you need to adjust what you're iterating as well.
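To make the failure and the fix concrete, here is a minimal sketch with made-up data (the dictionaries broken and fixed are not from the question). The first loop resizes the dictionary it is iterating and raises the error; the second iterates a snapshot of the keys instead.

```python
# Resizing a dict while looping over it directly raises RuntimeError.
broken = {1: "a", 2: "b"}
message = None
try:
    for key in broken:
        broken[key + 10] = broken[key]  # adds a key mid-iteration
except RuntimeError as exc:
    message = str(exc)

# Looping over a snapshot of the keys avoids the problem.
fixed = {1: "a", 2: "b"}
for key in list(fixed):
    fixed[key + 10] = fixed[key]

print(message)
print(fixed)
```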
Now, you didn't actually post what your overlap function does, so I'm assuming something like this based on your comments/sample:
def overlap(left, right):
    return left[0] - right[-1] == 1
For the most part, the actual implementation of your overlap is irrelevant to the question, but the code above at least gave me your expected output.
I've then adjusted your code to the following:
for v1 in list(d1.values()):
    for v2 in list(d2.values()):
        # If v1 overlaps v2
        if overlap(v1, v2):
            # Add v1 to d2 under the next sequential key
            d2[len(d2) + 1] = v1
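By iterating over list(d1.values()) and list(d2.values()), I'm no longer iterating a dictionary itself while modifying it. Each list is a "snapshot" of the values taken at the start of its loop. The one that matters for the error is list(d2.values()), since d2 is the dictionary being modified.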
Using this and your sample data, d2 contains {1: [1, 5], 2: [6, 10], 3: [11, 15]} after the loop.
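If "overlap" is meant in the usual interval sense, another option is to collect the additions first and only then write them to d2. The sketch below is an assumption about the intent, not the asker's actual overlap function: overlaps and add_non_overlapping are hypothetical names, and the values are treated as closed [start, end] intervals.

```python
def overlaps(a, b):
    """Hypothetical helper: True if closed intervals a and b share a point."""
    return a[0] <= b[1] and b[0] <= a[1]


def add_non_overlapping(d1, d2):
    # Collect the additions first, so d2 is never resized while it is iterated.
    additions = [v1 for v1 in d1.values()
                 if not any(overlaps(v1, v2) for v2 in d2.values())]
    next_key = max(d2, default=0) + 1
    for offset, value in enumerate(additions):
        d2[next_key + offset] = value
    return d2


d1 = {"file1": [2, 3], "file2": [11, 15]}
d2 = {1: [1, 5], 2: [6, 10]}
result = add_non_overlapping(d1, d2)
print(result)
```

With the sample data, [2, 3] overlaps [1, 5] and is skipped, while [11, 15] overlaps nothing and is stored under key 3. The sketch does not check the new values against each other, only against what was already in d2.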

Related

Python Dictionary Comprehension Not Outputting As Expected

I am playing around with dictionaries, and thought how would I create a dictionary using comprehensions. I thought
{k:v for k in [0,1,2] for v in [5,8,7]}
would print as
{0:5, 1:8, 2:7}
But instead it prints as
{0: 7, 1: 7, 2: 7}
Why is this happening and what modifications would I need to make to get the first output?
Your dict comprehension is equivalent to nested loops:
result = {}
for k in [0, 1, 2]:
    for v in [5, 8, 7]:
        result[k] = v
So for each key, the inner loop assigns every value in turn, and each key ends up holding the last value.
Use zip() to iterate over two lists in parallel.
{k: v for k, v in zip([0, 1, 2], [5, 8, 7])}
You can also just use the dict() constructor:
dict(zip([0, 1, 2], [5, 8, 7]))
Whenever you have trouble with a comprehension, unroll it into the equivalent loops. In this case that gives:
mydict = {}
for k in [0,1,2]:
    for v in [5,8,7]:
        mydict[k] = v
Each successive assignment to mydict[k] overwrites the previous one.
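A quick way to convince yourself is to run the comprehension, the unrolled loops, and the zip() version next to each other. This is only a sketch; the variable names are illustrative. In a comprehension, the first for clause is the outermost loop.

```python
keys = [0, 1, 2]
values = [5, 8, 7]

# Two for clauses: every key is paired with every value in turn
nested = {k: v for k in keys for v in values}

# The same thing written as explicit loops (first clause = outer loop)
unrolled = {}
for k in keys:
    for v in values:
        unrolled[k] = v  # later values overwrite earlier ones

# Pairing the two lists element by element
paired = dict(zip(keys, values))

print(nested, unrolled, paired)
```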

Quickest way to merge dictionaries based on key match

I have two dictionaries:
dic_1={'1234567890': 1, '1234567891': 2, '1234567880': 3, '1234567881': 4}
dic_2={'1234567890': 5, '1234567891': 6}
Now I want to merge them based on key values such that the merged dictionary looks like the following:
merged_dic={'1234567890': 1, '1234567891': 2, '1234567880': 3, '1234567881': 4}
We only want to keep unique keys and only one distinct value associated with them. What's the best way to do that?
This should be what you need. It iterates through all dictionaries adding key/values only if the key is not already in the merged dictionary.
from itertools import chain
merged_dic = {}
for k, v in chain(dic_1.items(), dic_2.items()):
    if k not in merged_dic:
        merged_dic[k] = v
print(merged_dic)
# {'1234567890': 1, '1234567891': 2, '1234567880': 3, '1234567881': 4}
If, for example, you were wanting to keep all values for a key you could use:
from collections import defaultdict
from itertools import chain
merged_dic = defaultdict(list)
for k, v in chain(dic_1.items(), dic_2.items()):
    merged_dic[k].append(v)
print(merged_dic)
# defaultdict(<class 'list'>, {'1234567890': [1, 5], '1234567891': [2, 6], '1234567880': [3], '1234567881': [4]})
Using chain() can allow you to iterate over many dictionaries. In the question you showed 2 dictionaries, but if you had 4 you could easily merge them all. E.g.
for k, v in chain(dic_1.items(), dic_2.items(), dic_3.items(), dic_4.items()):
All you're really trying to do is update dic_2 with any values in dic_1 so you can just do
merged_dic = {**dic_2,**dic_1}
This will merge the two dictionaries, taking all the values from dic_2, updating any keys in the new dictionary with any new values that exist in dic_1 and then adding any unique keys in dic_1
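To check which dictionary wins on a shared key, here is a small sketch using the question's data. The collections.ChainMap variant is an alternative that is not mentioned above: it looks each key up in dic_1 first and falls back to dic_2. The result names are illustrative.

```python
from collections import ChainMap

dic_1 = {'1234567890': 1, '1234567891': 2, '1234567880': 3, '1234567881': 4}
dic_2 = {'1234567890': 5, '1234567891': 6}

# dic_1 is unpacked last, so its values win on shared keys
merged_unpack = {**dic_2, **dic_1}

# ChainMap searches dic_1 first, then dic_2
merged_chain = dict(ChainMap(dic_1, dic_2))

print(merged_unpack)
```

Both results hold the same key/value pairs, although the key order of the two dictionaries can differ.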
The sample data does not fully illustrate the question. If dic_2 has a key in common with dic_1, keep the item from dic_1; if dic_2 contains a new key, add that item to the merged dictionary.
import copy
dic_1={'1234567890': 1, '1234567891': 2, '1234567880': 3, '1234567881': 4}
dic_2={'1234567890': 5, '8234567890': 6}
merged_d = copy.copy(dic_1)
diff = set(dic_2)-set(dic_1)
merged_d.update({k: dic_2[k] for k in diff})
print(merged_d)
Result:
{'1234567890': 1, '1234567891': 2, '1234567880': 3, '1234567881': 4, '8234567890': 6}
If you want the first dict to override the keys in the second dict then:
dic_2.update(dic_1)

Merging dictionaries

I need to append the values from one dictionary (N) to another (M) - pseudocode below
if x in M:
    M[x]=M[x]+N[x]
else:
    M[x]=N[x]
Doing this for every key in N seems quite untidy coding.
What would be the most efficient way to achieve this?
Of course you should be iterating your keys in "x" already - but a single line solution is:
M.update({key:((M[key] + value) if key in M else value) for key, value in N.items()})
I'm not entirely sure what your x is (I'm guessing the keys of both M and N), but this might work:
M = {key: M.get(key, 0) + N.get(key, 0) for key in set((*M, *N))}
for the example:
M = {'a': 1, 'c': 3, 'e': 5}
N = {'a': 2, 'b': 4, 'c': 6, 'd': 8}
you get:
print(M) # {'a': 3, 'e': 5, 'd': 8, 'c': 9, 'b': 4}
or please clarify what the desired output for the given example would be.
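If the values are numbers, as assumed in the example above, a plain loop with dict.get may be easier to read than the comprehension. A small sketch that updates M in place rather than rebuilding it:

```python
M = {'a': 1, 'c': 3, 'e': 5}
N = {'a': 2, 'b': 4, 'c': 6, 'd': 8}

for key, value in N.items():
    # Missing keys start from 0, so new keys simply take N's value
    M[key] = M.get(key, 0) + value

print(M)
```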
When you say "append", I assume that means that the values in your dicts are lists. However the techniques below can easily be adapted if they're simple objects like integers or strings.
Python doesn't provide any built-in methods for handling dicts of lists, and the most efficient way to do this depends on the data: some approaches work best when there is a high proportion of shared keys, while others are better when there aren't many.
If the proportion of shared keys isn't too high, this code is reasonably efficient:
m = {1:[1, 2], 2:[3]}
n = {1:[4], 3:[5]}
for k, v in n.items():
    m.setdefault(k, []).extend(v)
print(m)
output
{1: [1, 2, 4], 2: [3], 3: [5]}
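You can make this slightly faster by caching the .setdefault method. Don't cache the empty list as well: setdefault would then insert that one list object under every new key, so all the new keys would share it.
def merge(m, n):
    setdef = m.setdefault
    for k, v in n.items():
        # A fresh list per call, so new keys never share a list
        setdef(k, []).extend(v)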
If you expect a high proportion of shared keys, then it's better to perform set operations on the keys (in Python 3, dict.keys() returns a set-like View object, which is extremely efficient to construct), and handle the shared keys separately from the unique keys of N.
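A rough sketch of that set-based approach, assuming Python 3 and list values. The name merge_shared is hypothetical, and whether this is actually faster depends on your data, so it is worth timing before relying on it.

```python
def merge_shared(m, n):
    # Keys present in both dicts: extend the existing lists
    shared = m.keys() & n.keys()
    for k in shared:
        m[k].extend(n[k])
    # Keys only in n: copy the list so m and n don't share list objects
    for k in n.keys() - shared:
        m[k] = list(n[k])
    return m


m = {1: [1, 2], 2: [3]}
n = {1: [4], 3: [5]}
merge_shared(m, n)
print(m)
```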

python : list of dictionary values by alphabetical order of keys

Is there a simple way of getting a list of values from a dictionary, but in the way that all values are ordered by alphabetical order of keys in dictionary?
You have several options; the easiest is to just sort the items, picking out the values with a list comprehension:
[v for k, v in sorted(dictionary.iteritems())]
as tuples are sorted lexicographically; by key first, then on value. Replace iteritems() with items() if you are using Python 3.
You can sort just the keys and translate those to values:
[dictionary[k] for k in sorted(dictionary)]
Demo:
>>> dictionary = {'foo': 42, 'bar': 38, 'baz': 20}
>>> [v for k, v in sorted(dictionary.iteritems())]
[38, 20, 42]
>>> [dictionary[k] for k in sorted(dictionary)]
[38, 20, 42]
Accessing keys afterwards is also the faster option:
>>> timeit.timeit('[v for k, v in sorted(dictionary.iteritems())]', 'from __main__ import dictionary')
3.4159910678863525
>>> timeit.timeit('[d[key] for key in sorted(d)]', 'from __main__ import dictionary as d')
1.5645101070404053
Yes, that's more than twice as fast to sort a small dictionary a million times.
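For Python 3, where iteritems() no longer exists, the two approaches above might look like this (same sample data as the demo; the result names are illustrative):

```python
dictionary = {'foo': 42, 'bar': 38, 'baz': 20}

# Sort the (key, value) pairs, then keep only the values
by_items = [v for k, v in sorted(dictionary.items())]

# Sort only the keys, then look each value up
by_keys = [dictionary[k] for k in sorted(dictionary)]

print(by_items, by_keys)
```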
There are numerous ways to do that. One way is to use sorted on the dict:
>>> d = {'c': 1, 'b': 2, 'e': 3, 'a': 4}
>>> l = [d[key] for key in sorted(d)]
>>> print(l)
[4, 2, 1, 3]
Yes, for that you can use zip. Here is an example:
y = {'a': 1, 'd':4, 'h':3, 'b': 2}
a = y.keys()
b = y.values()
print([d for (c,d) in sorted(zip(a,b))])
[1, 2, 4, 3]
or simply:
print([i[1] for i in sorted(y.items())])
You can try it out here: http://repl.it/R8e

Flatten a dictionary of dictionaries (2 levels deep) of lists

I'm trying to wrap my brain around this but it's not flexible enough.
In my Python script I have a dictionary of dictionaries of lists. (Actually it gets a little deeper but that level is not involved in this question.) I want to flatten all this into one long list, throwing away all the dictionary keys.
Thus I want to transform
{1: {'a': [1, 2, 3], 'b': [0]},
2: {'c': [4, 5, 1], 'd': [3, 8]}}
to
[1, 2, 3, 0, 4, 5, 1, 3, 8]
I could probably set up a map-reduce to iterate over items of the outer dictionary to build a sublist from each subdictionary and then concatenate all the sublists together.
But that seems inefficient for large data sets, because of the intermediate data structures (sublists) that will get thrown away. Is there a way to do it in one pass?
Barring that, I would be happy to accept a two-level implementation that works... my map-reduce is rusty!
Update:
For those who are interested, below is the code I ended up using.
Note that although I asked above for a list as output, what I really needed was a sorted list; i.e. the output of the flattening could be any iterable that can be sorted.
def genSessions(d):
    """Given the ipDict, return an iterator that provides all the sessions,
    one by one, converted to tuples."""
    for uaDict in d.itervalues():
        for sessions in uaDict.itervalues():
            for session in sessions:
                yield tuple(session)
...
# Flatten dict of dicts of lists of sessions into a list of sessions.
# Sort that list by start time
sessionsByStartTime = sorted(genSessions(ipDict), key=operator.itemgetter(0))
# Then make another copy sorted by end time.
sessionsByEndTime = sorted(sessionsByStartTime, key=operator.itemgetter(1))
Thanks again to all who helped.
[Update: replaced nthGetter() with operator.itemgetter(), thanks to @intuited.]
I hope you realize that any order you see in a dict is accidental -- it's there only because, when shown on screen, some order has to be picked, but there's absolutely no guarantee.
Setting aside the order in which the various sublists get concatenated,
[x for d in thedict.itervalues()
   for alist in d.itervalues()
   for x in alist]
does what you want without any inefficiency or intermediate lists.
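On Python 3, where itervalues() is gone, one possible sketch uses values() together with itertools.chain.from_iterable, which likewise avoids building intermediate sublists. The variable names are illustrative, and the data is the example from the question.

```python
from itertools import chain

nested = {1: {'a': [1, 2, 3], 'b': [0]},
          2: {'c': [4, 5, 1], 'd': [3, 8]}}

# The generator yields each innermost list; chain stitches them together lazily
flat = list(chain.from_iterable(
    lst for inner in nested.values() for lst in inner.values()))

print(flat)
```

Since the question ultimately needed a sorted result, the chained iterator could also be passed straight to sorted() instead of list().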
edit: re-read the original question and reworked answer to assume that all non-dictionaries are lists to be flattened.
In cases where you're not sure how far down the dictionaries go, you would want to use a recursive function. @Arrieta has already posted a function that recursively builds a list of non-dictionary values.
This one is a generator that yields successive non-dictionary values in the dictionary tree:
def flatten(d):
    """Recursively flatten dictionary values in `d`.
    >>> hat = {'cat': ['images/cat-in-the-hat.png'],
    ...        'fish': {'colours': {'red': [0xFF0000], 'blue': [0x0000FF]},
    ...                 'numbers': {'one': [1], 'two': [2]}},
    ...        'food': {'eggs': {'green': [0x00FF00]},
    ...                 'ham': ['lean', 'medium', 'fat']}}
    >>> set_of_values = set(flatten(hat))
    >>> sorted(set_of_values)
    [1, 2, 255, 65280, 16711680, 'fat', 'images/cat-in-the-hat.png', 'lean', 'medium']
    """
    try:
        for v in d.itervalues():
            for nested_v in flatten(v):
                yield nested_v
    except AttributeError:
        for list_v in d:
            yield list_v
The doctest passes the resulting iterator to the set function. This is likely to be what you want, since, as Mr. Martelli points out, there's no intrinsic order to the values of a dictionary, and therefore no reason to keep track of the order in which they were found.
You may want to keep track of the number of occurrences of each value; this information will be lost if you pass the iterator to set. If you want to track that, just pass the result of flatten(hat) to some other function instead of set. Under Python 2.7, that other function could be collections.Counter. For compatibility with less-evolved pythons, you can write your own function or (with some loss of efficiency) combine sorted with itertools.groupby.
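As a sketch of the counting idea on Python 3, the variant below swaps the AttributeError check for an isinstance test and uses yield from; that is a substitution of mine, not the code above. It assumes every non-dictionary value is a list (a bare string leaf would be split into characters). The function name flatten_values is hypothetical.

```python
from collections import Counter


def flatten_values(d):
    """Yield every item of every list found anywhere in the nested dict d."""
    if isinstance(d, dict):
        for v in d.values():
            yield from flatten_values(v)
    else:
        # Assumed to be a list (or other iterable) of leaf values
        yield from d


nested = {1: {'a': [1, 2, 3], 'b': [0]},
          2: {'c': [4, 5, 1], 'd': [3, 8]}}

# Counter keeps the number of occurrences that set() would discard
counts = Counter(flatten_values(nested))
print(counts)
```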
A recursive function may work:
def flat(d, out):
    for val in d.values():
        if isinstance(val, dict):
            # Recurse into the nested dict, not into d itself
            flat(val, out)
        else:
            out += val
If you try it with :
>>> d = {1: {'a': [1, 2, 3], 'b': [0]}, 2: {'c': [4, 5, 6], 'd': [3, 8]}}
>>> out = []
>>> flat(d, out)
>>> print(out)
[1, 2, 3, 0, 4, 5, 6, 3, 8]
Notice that dictionaries have no guaranteed order (before Python 3.7), so the list may come out in arbitrary order.
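You can also return out at the end of the function and call it without a list argument:
def flat(d, out=None):
    # Create a fresh list per top-level call; a mutable default would be shared between calls
    if out is None:
        out = []
    for val in d.values():
        if isinstance(val, dict):
            flat(val, out)
        else:
            out += val
    return out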
call as:
my_list = flat(d)
