merge python dictionary of sets


I have a graph with 2 kinds of nodes- 'Letter nodes' (L) and 'Number nodes' (N). I have 2 dictionaries, one shows edges from L to N and the other shows edges from N to L.
A = {0:(b,), 1:(c,), 2:(c,), 3:(c,)}
B = {a:(3,), b:(0,), c:(1,2,3)}
A key,value pair c:(1,2,3) means there are edges from c to 1,2,3 (3 edges)
I want to merge these to one dictionary C so that the result is a new dictionary:
C = {(0,): (b,), (1, 2, 3): (a, c)}
or
C = {(b,):(0,), (a, c):(1, 2, 3)}
In the resulting dictionary I want the letter nodes and number nodes to be on separate sides, one as keys and the other as values. I don't care which is the key and which is the value; I just need them separated. How can I solve this efficiently?
CLARIFICATION: think of a graph with 2 types of nodes: number nodes and letter nodes. The dictionary C says that from letter nodes (a,c) you can reach the number nodes (1,2,3), i.e. a->3->c->1 and a->3->c->2, so you can get to 1,2,3 from a, EVEN THOUGH THERE IS NO DIRECT EDGE FROM a to 2 or a to 1.

According to your description, I think you are looking for a graph algorithm (finding connected components).
import itertools
def update_dict(A, result): # update values so connected nodes share the same set
    for k in A:
        result[k] = result.get(k, {k}).union(set(A[k]))
        tmp = None
        for i in result[k]:
            tmp = result.get(k, {k}).union(result.get(i, {i}))
            result[k] = tmp
        for i in result[k]:
            result[i] = result.get(i, {i}).union(result.get(k, {k}))
A = {0:('b',), 1:('c',), 2:('c',), 3:('c',)}
B = {'a':(3,), 'b':(0,), 'c':(1,2,3)}
result = dict()
update_dict(A, result)
update_dict(B, result)
update_dict(A, result) # second pass to propagate sets that the first pass missed
update_dict(B, result)
k = sorted([sorted(list(v)) for v in result.values()])
k = list( k for k, _ in itertools.groupby(k)) # sort and remove duplicated sets
final_result = dict()
for v in k: # merge the result as expected
    final_result.update({tuple([i for i in v if isinstance(i, int)]):tuple([i for i in v if not isinstance(i, int)])})
print final_result
#output
{(0,): ('b',), (1, 2, 3): ('a', 'c')}

So I'm not sure if this is the most efficient way of doing this at this point, but it works:
A = {0:('b',), 1:('c',), 2:('c',), 3:('c',)}
B = {'a':(3,), 'b':(0,), 'c':(1,2,3)}
# Put B in the same form as A
B_inv = {}
for k, v in B.items():
    for i in v:
        if B_inv.get(i) is not None:
            B_inv[i] = B_inv[i].union([k])  # wrap k in a list so multi-character keys stay whole
        else:
            B_inv[i] = set([k])
B_inv = {k: tuple(v) for k, v in B_inv.items()}
AB = set(B_inv.items() + A.items()) # get AB as merged
This gets you the merged dictionaries. From here:
new_dict = {}
for a in AB:
    for i in a[1]:
        if new_dict.get(i) is not None:
            new_dict[i] = new_dict[i].union([a[0]])
        else:
            new_dict[i] = set([a[0]])
# put in tuple form; (k,) keeps multi-character keys whole
new_dict = {(k,): tuple(v) for k,v in new_dict.items()}
This gives me:
{('a',): (3,), ('b',): (0,), ('c',): (1, 2, 3)}
Basically, I'm relying on the mutability of sets and their built-in elimination of duplicates to keep the number of loops through each dictionary to a minimum. Unless I missed something, this should run in linear time.
From here, I need to compare the entries, relying on sets again to avoid a worst-case pairwise comparison of every single element.
merge_list = []
for k, v in new_dict.items():
    matched = False
    nodeset = set([k[0]]).union(v)
    for i in range(len(merge_list)):
        if len(nodeset.intersection(merge_list[i])) != 0:
            merge_list[i] = merge_list[i].union(nodeset)
            matched = True
    # did not find shared edges
    if not matched:
        merge_list.append(nodeset)
Finally, turn it into the form with a single "layer" and tuples.
C = {}
for item in merge_list:
    temp_key = []
    temp_val = []
    for i in item:
        if str(i).isalpha():
            temp_key.append(i)
        else:
            temp_val.append(i)
    C[tuple(temp_key)] = tuple(temp_val)
C gives me {('a', 'c'): (1, 3, 2), ('b',): (0,)}.
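
Both answers above effectively group nodes that are reachable from one another. As a rough alternative sketch (not taken from either answer), the same grouping can be done with a single depth-first search over an undirected adjacency map. It is written for Python 3, the name merge_components is made up for illustration, and it assumes number nodes are ints and letter nodes are strings.

```python
from collections import defaultdict

def merge_components(A, B):
    """Hypothetical helper: group connected nodes, return {numbers: letters}."""
    # Build an undirected adjacency map from both edge dictionaries.
    adjacency = defaultdict(set)
    for edges in (A, B):
        for node, neighbours in edges.items():
            adjacency[node].update(neighbours)
            for other in neighbours:
                adjacency[other].add(node)

    seen = set()
    merged = {}
    for start in adjacency:
        if start in seen:
            continue
        # Depth-first search collects every node reachable from `start`.
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        # Assumption: number nodes are ints, letter nodes are strings.
        numbers = tuple(sorted(n for n in component if isinstance(n, int)))
        letters = tuple(sorted(n for n in component if isinstance(n, str)))
        merged[numbers] = letters
    return merged

A = {0: ('b',), 1: ('c',), 2: ('c',), 3: ('c',)}
B = {'a': (3,), 'b': (0,), 'c': (1, 2, 3)}
C = merge_components(A, B)
print(C)  # {(0,): ('b',), (1, 2, 3): ('a', 'c')}
```

Each node and edge is visited a constant number of times, so the grouping itself is linear in the size of the two dictionaries; only the final sorting of each component adds a logarithmic factor.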

try this:
c = a.copy()
c.update(b)

Related

How to get multiple most frequent k-mers of a string using Python?

If I insert the following
Insert the Text:
ACACACA
Insert a value for k:
2
into the following code:
print("Insert the Text:")
Text = input()
print("Insert a value for k:")
k = int(input())
Pattern = " "
count = [ ]
FrequentPatterns = [ ]
def FrequentWords(Text, k):
    for i in range (len(Text)-k+1):
        Pattern = Text[i: i+k]
        c = 0
        for i in range (len(Text)-len(Pattern)+1):
            if Text[i: i+len(Pattern)] == Pattern:
                c = c+1
            else:
                continue
        count.extend([c])
    print(count)
    if count[i] == max(count):
        FrequentPatterns.extend([Pattern])
    return FrequentPatterns
FrequentWords(Text, k)
I get the following output:
Insert the Text:
ACACACA
Insert a value for k:
2
[3, 3, 3, 3, 3, 3]
['CA']
Clearly there are two FrequentPatterns. So the last list output should be ['AC', 'CA']
I don't know why this code isn't working. Really appreciate if anyone could help.

Here's how I would solve this:
from itertools import groupby
def find_kgrams(string, k):
    kgrams = sorted(
        string[j:j+k]
        for i in range(k)
        for j in range(i, (len(string) - i) // k * k, k)
    )
    groups = [(kgram, len(list(g))) for kgram, g in groupby(kgrams)]
    return sorted(groups, key=lambda i: i[1], reverse=True)
The way this works is:
it produces string chunks of the given length k, e.g.:
starting from 0: 'ACACACA' -> 'AC', 'AC', 'AC'
starting from 1: 'ACACACA' -> 'CA', 'CA', 'CA'
...up to k - 1 (1 is the maximum for k == 2)
groupby() groups those chunks
sorted() sorts them by count
a list of tuples of kgrams and their count is returned
Test:
s = 'ACACACA'
kgrams = find_kgrams(s, 2)
print(kgrams)
prints:
[('AC', 3), ('CA', 3)]
It's already sorted, you can pick the most frequent one(s) from the front of the returned list:
max_kgrams = [k for k, s in kgrams if s == kgrams[0][1]]
print(max_kgrams)
prints:
['AC', 'CA']
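
As a rough alternative sketch (not the answerer's method), the counting can also be done with collections.Counter over every sliding window of length k, which mirrors what the loops in the question attempt. The function name most_frequent_kmers is made up for illustration.

```python
from collections import Counter

def most_frequent_kmers(text, k):
    """Hypothetical helper: return all k-mers that share the highest count."""
    # Count every window of length k, one starting at each position.
    counts = Counter(text[i:i + k] for i in range(len(text) - k + 1))
    if not counts:
        return []
    best = max(counts.values())
    # Keep every k-mer that reaches the maximum, not just the last one seen.
    return sorted(kmer for kmer, n in counts.items() if n == best)

print(most_frequent_kmers('ACACACA', 2))  # ['AC', 'CA']
```

Comparing each count against the maximum only after all windows have been counted avoids the problem in the question, where the comparison ran once after the loop and only the last pattern was kept.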

merging list of dictionaries while merging the values of similar ke

I need to write a function that takes a list of dictionaries and returns their sum.
For example, given [{(1,3):2, (2,7):1} , {(1,3):6}] it needs to return {(1,3): 8, (2,7): 1}.
If the sum for a key is 0, that key should be removed from the dictionary. The problem is that my function returns [(1, 3), 6]:
def swe(lst):
    s = []
    a = []
    v = []
    q = 0
    l = 0
    for d in lst:
        s.append(d.keys())
        v.append(d.values())
    for i in s:
        for j in i:
            if len(i) == 1:
                a.append(j)
            if len(i) > 1:
                a.append(j)
    for t in a:
        if a.count(t) == 1:
            for q in range(len(v)):
                for q in range(len(s)):
                    dct1 = v[q]
                    dct2 = s[q]
                    dct3 = dct2+ dct1
                    q = q+1
                    continue
            return dct3
        if a.count(t) > 1:
            for l in range(len(a)):
                dct5 = v[q]
                dct6 = s[q]
                dct7 = dct5 + dct6
                l = l+ 1
            return dct7
print swe([{(1,3):2, (2,7):1} , {(1,3):6}])

Your code is quite hard to follow with the one-letter variable names, so I wrote something new which I think does what you want:
def merge_dictionaries(list_of_dictionaries):
    results_dict = dict()
    for dictionary in list_of_dictionaries:
        for key, value in dictionary.items():
            results_dict[key] = results_dict.get(key, 0) + value
    return {key: value for (key, value) in results_dict.iteritems() if value != 0}
print merge_dictionaries([{(1,3):2, (2,7):1} , {(1,3):6 , (9,9) : 0}])
>>> {(2, 7): 1, (1, 3): 8}
It goes through each dictionary in your list and adds the value to the sum so far and then filters out answers with a sum of 0 at the end.

Probably you are looking for something like this:
def swe(lst):
    res = dict()
    for d in lst:
        for key,value in d.items():
            if key in res:
                res[key] += value
            else:
                res[key] = value
    # iterate over a copy of the items so popping is safe
    for key,value in list(res.items()):
        if value == 0:
            res.pop(key)
    return res
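
The two answers above follow the same pattern: accumulate per-key totals, then drop the zeros. A compact Python 3 sketch of that pattern using collections.defaultdict is shown below; the name sum_dicts is made up for illustration.

```python
from collections import defaultdict

def sum_dicts(dicts):
    """Hypothetical helper: add values per key, dropping keys whose total is 0."""
    totals = defaultdict(int)  # missing keys start at 0
    for d in dicts:
        for key, value in d.items():
            totals[key] += value
    # Filter in a new dict rather than deleting while iterating.
    return {key: total for key, total in totals.items() if total != 0}

result = sum_dicts([{(1, 3): 2, (2, 7): 1}, {(1, 3): 6, (9, 9): 0}])
print(result)  # {(1, 3): 8, (2, 7): 1}
```

Building the filtered result as a new dictionary sidesteps the question of whether it is safe to remove keys from a dictionary while looping over it.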

OrderedDict Changing Order after Double Iterator Loop

I set up an OrderedDict and perform dictionary comprehensions with different signatures, which I have simplified to a function dictcomp(fn, dictionary, key_or_value):
x = OrderedDict(self._Median_Colors)
x = self.dictcomp(hex2color, x, 'v')
x = self.dictcomp(rgb_to_hsv, x, 'v_tuple')
At this point I am able to sort the dictionary:
x = self.dictcomp(self.sort_by_hue, x, 'v')
Everything seems to check out so far:
print x
Now I need to rename keys, so I will create a new ordered dictionary:
color_indexes = list(xrange(0, len(x.keys())))
print color_indexes
newkeys = [self.rename(color_index) for color_index in color_indexes]
print x.values()
vi = iter(x.values())
x = OrderedDict.fromkeys(newkeys)
I had no idea how to fill in the old values immediately, so I did this:
ki = iter(x.keys())
for k, v in zip(ki, vi):
    #print "k:", k
    print v
    x[k] = tuple(v)
Checks out fine:
print x.items()
Here comes trouble:
x = self.dictcomp(hsv_to_rgb, x, 'v_tuple')
print x.items()
where dictcomp does this:
dictionary = {k: fn(*v) for k, v in dictionary.items()}
where fn=hsv_to_rgb, dictionary=x
Now, I have:
[('Blue', (0.9764705882352941, 0.5529411764705883, 0.0)), ....
instead of the expected:
[('Red', (0.4745098039215686, 0.7372549019607844, 0.23137254901960794)), ....
The keys are the same, but the values have changed. I am guessing that the insertion order was somehow affected. How did this happen and how can I keep the order of keys in the dictionary?

The problem is caused by this behavior:
for i, j in zip([4, 5, 6], [1, 2, 3]):
    print i
    print j
which prints the column:
4
1
5
2
6
3
It turns out that zip acts as a zipper when given two iterators.
The fix is to get each key-value pair as a tuple:
for i in zip([4, 5, 6], [1, 2, 3]):
    print i
which prints:
(4, 1)
(5, 2)
(6, 3)
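
One more thing that may be worth checking, since the question uses Python 2: a plain dict comprehension such as {k: fn(*v) for k, v in dictionary.items()} returns a regular dict, and a regular dict in Python 2 does not remember insertion order, so the result can come back reordered even if the input was an OrderedDict. A minimal sketch of keeping the order by building an OrderedDict from (key, value) pairs follows; map_values and rename_keys are made-up helper names, and the lambda merely stands in for a function like hsv_to_rgb.

```python
from collections import OrderedDict

def map_values(fn, ordered):
    """Hypothetical helper: apply fn(*value) to each value, keeping key order."""
    return OrderedDict((key, fn(*value)) for key, value in ordered.items())

def rename_keys(new_keys, ordered):
    """Hypothetical helper: pair new keys with the old values, in order."""
    return OrderedDict(zip(new_keys, ordered.values()))

colors = OrderedDict([('c0', (1, 2, 3)), ('c1', (4, 5, 6))])
renamed = rename_keys(['Red', 'Blue'], colors)
# Stand-in for a real conversion function such as hsv_to_rgb.
scaled = map_values(lambda a, b, c: (a * 2, b * 2, c * 2), renamed)
print(list(scaled.items()))  # [('Red', (2, 4, 6)), ('Blue', (8, 10, 12))]
```

Because every step returns an OrderedDict, the key order established by the sort survives both the renaming and the value conversion.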

In Python, How can I get the next and previous key:value of a particular key in a dictionary?

Okay, so this is a little hard to explain, but here goes:
I have a dictionary, which I'm adding content to. The content is a hashed username (key) with an IP address (value).
I was putting the hashes into order by interpreting them as base-16 numbers, and then using collections.OrderedDict.
So, the dictionary looked a little like this:
d = {'1234': '8.8.8.8', '2345':'0.0.0.0', '3213':'4.4.4.4', '4523':'1.1.1.1', '7654':'1.3.3.7', '9999':'127.0.0.1'}
What I needed was a mechanism that would allow me to pick one of those keys, and get the key/value item one higher and one lower. So, for example, If I were to pick 2345, the code would return the key:value combinations '1234:8.8.8.8' and '3213:4.4.4.4'
So, something like:
for i in d:
    while i < len(d)
        if i == '2345':
            print i.nextItem
            print i.previousItem
            break()

Edit: OP now states that they are using OrderedDicts but the use case still requires this sort of approach.
Since dicts are not ordered you cannot directly do this. From your example, you are trying to reference the item like you would use a linked list.
A quick solution would be instead to extract the keys and sort them then iterate over that list:
keyList=sorted(d.keys())
for i,v in enumerate(keyList):
    if v=='eeee':
        if i+1 < len(keyList):  # a next key exists
            print d[keyList[i+1]]
        if i > 0:  # a previous key exists
            print d[keyList[i-1]]
The keyList holds the order of your items and you have to go back to it to find out what the next/previous key is to get the next/previous value. You also have to check for i+1 being greater than the list length and i-1 being less than 0.
You can use an OrderedDict similarly but I believe that you still have to do the above with a separate list as OrderedDict doesn't have next/prev methods.

As seen in the OrderedDict source code,
if you have a key and you want to find the next and prev in O(1) here's how you do that.
>>> from collections import OrderedDict
>>> d = OrderedDict([('aaaa', 'a',), ('bbbb', 'b'), ('cccc', 'c'), ('dddd', 'd'), ('eeee', 'e'), ('ffff', 'f')])
>>> i = 'eeee'
>>> link_prev, link_next, key = d._OrderedDict__map['eeee']
>>> print 'nextKey: ', link_next[2], 'prevKey: ', link_prev[2]
nextKey: ffff prevKey: dddd
This will give you next and prev by insertion order. If you add items in random order then just keep track of your items in sorted order.

You could also use the list.index() method.
This function is more generic (you can check positions +n and -n), it will catch attempts at searching for a key that's not in the dict, and it will also return None if there's nothing before or after the key:
def keyshift(dictionary, key, diff):
    if key in dictionary:
        token = object()
        keys = [token]*(diff*-1) + sorted(dictionary) + [token]*diff
        newkey = keys[keys.index(key)+diff]
        if newkey is token:
            print None
        else:
            print {newkey: dictionary[newkey]}
    else:
        print 'Key not found'
keyshift(d, 'bbbb', -1)
keyshift(d, 'eeee', +1)

Try:
pos = 0
d = {'aaaa': 'a', 'bbbb':'b', 'cccc':'c', 'dddd':'d', 'eeee':'e', 'ffff':'f'}
for i in d:
    if i == 'eeee':
        listForm = list(d.values())
        print(listForm[pos-1])  # previous value
        print(listForm[pos+1])  # next value
    pos+=1  # advance only after the check so pos is the index of i
As in @AdamKerz's answer, enumerate seems more Pythonic, but if you are a beginner this code might help you understand it in an easy way.
I also think it is faster and shorter than sorting, building a list, and then enumerating.

You could use a generic function, based on iterators, to get a moving window (taken from this question):
import itertools
def window(iterable, n=3):
    it = iter(iterable)
    result = tuple(itertools.islice(it, n))
    if len(result) == n:
        yield result
    for element in it:
        result = result[1:] + (element,)
        yield result
l = range(8)
for i in window(l, 3):
    print i
Using the above function with OrderedDict.items() will give you three (key, value) pairs, in order:
d = collections.OrderedDict(...)
for p_item, item, n_item in window(d.items()):
    p_key, p_value = p_item
    key, value = item
    # Or, if you don't care about the next value:
    n_key, _ = n_item
Of course using this function the first and last values will never be in the middle position (although this should not be difficult to do with some adaptation).
I think the biggest advantage is that it does not require table lookups in the previous and next keys, and also that it is generic and works with any iterable.

Maybe it is overkill, but you can keep track of the inserted keys with a helper class and use that list to retrieve the previous or next key. Just don't forget to check the border conditions, i.e. whether the object is already the first or last element. This way, you will not need to re-sort the ordered list or search for the element every time.
from collections import OrderedDict
class Helper(object):
    """Helper Class for Keeping track of Insert Order"""
    def __init__(self, arg):
        super(Helper, self).__init__()
    # class-level containers shared by the static methods below
    dictContainer = dict()
    ordering = list()
    @staticmethod
    def addItem(dictItem):
        for key,value in dictItem.iteritems():
            print key,value
            Helper.ordering.append(key)
            Helper.dictContainer[key] = value
    @staticmethod
    def getPrevious(key):
        index = (Helper.ordering.index(key)-1)
        return Helper.dictContainer[Helper.ordering[index]]
#Your unordered dictionary
d = {'aaaa': 'a', 'bbbb':'b', 'cccc':'c', 'dddd':'d', 'eeee':'e', 'ffff':'f'}
#Create Order over keys
ordered = OrderedDict(sorted(d.items(), key=lambda t: t[0]))
#Push your ordered list to your Helper class
Helper.addItem(ordered)
#Get Previous of
print Helper.getPrevious('eeee')
>>> d

You can store the keys and values in temp variable in prior, and can access previous and next key,value pair using index.
It is pretty dynamic and will work for any key you query. Please check this code:
d = {'1234': '8.8.8.8', '2345':'0.0.0.0', '3213':'4.4.4.4', '4523':'1.1.1.1', '7654':'1.3.3.7', '9999':'127.0.0.1'}
ch = raw_input('Please enter your choice : ')
keys = d.keys()
values = d.values()
#print keys, values
for k,v in d.iteritems():
    if k == ch:
        ind = d.keys().index(k)
        print keys[ind-1], ':',values[ind-1]
        print keys[ind+1], ':',values[ind+1]

I think this is a nice Pythonic way of resolving your problem using a lambda and list comprehension, although it may not be optimal in execution time:
import collections
x = collections.OrderedDict([('a','v1'),('b','v2'),('c','v3'),('d','v4')])
previousItem = lambda currentKey, thisOrderedDict : [
list( thisOrderedDict.items() )[ z - 1 ] if (z != 0) else None
for z in range( len( thisOrderedDict.items() ) )
if (list( thisOrderedDict.keys() )[ z ] == currentKey) ][ 0 ]
nextItem = lambda currentKey, thisOrderedDict : [
list( thisOrderedDict.items() )[ z + 1 ] if (z != (len( thisOrderedDict.items() ) - 1)) else None
for z in range( len( thisOrderedDict.items() ) )
if (list( thisOrderedDict.keys() )[ z ] == currentKey) ][ 0 ]
assert previousItem('c', x) == ('b', 'v2')
assert nextItem('c', x) == ('d', 'v4')
assert previousItem('a', x) is None
assert nextItem('d',x) is None

Another way that seems simple and straightforward: this function returns the key which is offset positions away from k
def get_shifted_key(d:dict, k:str, offset:int) -> str:
    l = list(d.keys())
    if k in l:
        i = l.index(k) + offset
        if 0 <= i < len(l):
            return l[i]
    return None
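
Several answers above sort the keys and then look at the positions on either side. A small Python 3 sketch of that idea, returning whole (key, value) pairs and None at either end, could look like the following. The name neighbours is made up for illustration, and the sketch assumes the keys can be compared with one another; the bisect lookup is a swapped-in detail rather than something the answers use.

```python
from bisect import bisect_left

def neighbours(d, key):
    """Hypothetical helper: return the (previous, next) items in sorted key order."""
    keys = sorted(d)
    i = bisect_left(keys, key)  # binary search in the sorted key list
    if i == len(keys) or keys[i] != key:
        raise KeyError(key)
    # Guard both ends so the first and last keys yield None instead of wrapping.
    prev_item = (keys[i - 1], d[keys[i - 1]]) if i > 0 else None
    next_item = (keys[i + 1], d[keys[i + 1]]) if i + 1 < len(keys) else None
    return prev_item, next_item

d = {'1234': '8.8.8.8', '2345': '0.0.0.0', '3213': '4.4.4.4',
     '4523': '1.1.1.1', '7654': '1.3.3.7', '9999': '127.0.0.1'}
print(neighbours(d, '2345'))  # (('1234', '8.8.8.8'), ('3213', '4.4.4.4'))
```

Sorting on every call costs O(n log n); if lookups are frequent, keeping the sorted key list around and only running the bisect step would be cheaper.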

I know how to get the next key:value of a particular key in a dictionary:
flag = 0
for k, v in dic.items():
    if flag == 0:
        code...
        flag += 1
        continue
    code...{next key and value in for}

if correct :
d = { "a": 1, "b":2, "c":3 }
l = list( d.keys() ) # make a list of the keys
k = "b" # the actual key
i = l.index( k ) # get index of the actual key
for the next :
i = i+1 if i+1 < len( l ) else 0 # select next index or restart 0
n = l [ i ]
d [ n ]
for the previous :
i = i-1 if i-1 >= 0 else len( l ) -1 # select previous index or go end
p = l [ i ]
d [ p ]

Compare two list of tuples to create a new tuple containing highest values

I have the following code which will compare the two dictionaries b and c and create a third one from them called d which contains the product of comparing b to c and taking the highest one:
b = {1:0,2:0,3:0,4:0,5:0}
c = {1:1,4:4,5:5}
d={k:c[k] if k in c and c[k]>b[k] else v for k,v in b.items()}
However, because dictionaries are not sorted I have had to use the following syntax to convert b and c into tuples, before comparing them, so that they are in the right order:
b = sorted(b.iteritems())
c = sorted(c.iteritems())
This produces an output of:
b = [(1,0),(2,0),(3,0),(4,0),(5,0)]
c = [(1,1),(4,4),(5,5)]
However I am now unsure of how I could compare the two tuples and produce an output that looks like this:
d = [(1,1),(2,0),(3,0),(4,4),(5,5)]
There does not seem to be a tuple comprehension available in Python 2.7, unless I have not been looking for answers in the right places.
Can anyone assist?

You can add a one liner (the last line) to the upper code snippet:
b = {1:0,2:0,3:0,4:0,5:0}
c = {1:1,4:4,5:5}
d={k:c[k] if k in c and c[k]>b[k] else v for k,v in b.items()}
print [(k, v) for k, v in d.items()] # [(1, 1), (2, 0), (3, 0), (4, 4), (5, 5)]

Why can't you just sort the items of d?
b = {1:0, 2:0, 3:0, 4:0, 5:0}
c = {1:1, 4:4, 5:5}
d = {k: c[k] if k in c and c[k] > b[k] else v for k, v in b.items()}
sorted(d.items())
I suppose this implies that you know the keys of c are a subset of the keys in b. If this isn't true, you need to get the union of the keys and then compare the values from each dict:
b = {1:0, 2:0, 3:0, 5:0}
c = {1:1, 4:4, 5:5} # note, 4 not in "b"
ninf = -float('inf') # some value smaller than all others.
all_keys = b.viewkeys() | c # Union of keys.
result = [(k, max(b.get(k, ninf), c.get(k, ninf))) for k in sorted(all_keys)]
print(result) # [(1, 1), (2, 0), (3, 0), (4, 4), (5, 5)]

You can do the same thing, but in the order of the sorted keys of b, and make a tuple on the fly
d=tuple(
(k , (c[k] if k in c and c[k]>b[k] else b[k]) )
for k in sorted(b.keys()) )
or some might prefer
d=tuple(
(k , max(b[k],c.get(k,minus_inf)) )
for k in sorted(b.keys()) )
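... if a suitable minus_inf exists which is less than all possible b[k]. For numbers in Python 2, None works because None compares less than any number (Python 3 raises a TypeError for this comparison), so you could use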
d=tuple(
(k , max(b[k],c.get(k)) ) # note, max( x,None)==x here
for k in sorted(b.keys()) )
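
The answers above are written for Python 2. A rough Python 3 sketch of the same idea, which needs neither a minus-infinity sentinel nor a comparison against None, is to take the maximum only over the dictionaries that actually contain each key. The name merge_max is made up for illustration.

```python
def merge_max(b, c):
    """Hypothetical helper: sorted (key, highest value) pairs over both dicts."""
    merged = []
    for key in sorted(set(b) | set(c)):  # union of keys, in sorted order
        # Only consider the dictionaries that actually hold this key.
        candidates = [d[key] for d in (b, c) if key in d]
        merged.append((key, max(candidates)))
    return merged

b = {1: 0, 2: 0, 3: 0, 4: 0, 5: 0}
c = {1: 1, 4: 4, 5: 5}
d = merge_max(b, c)
print(d)  # [(1, 1), (2, 0), (3, 0), (4, 4), (5, 5)]
```

Wrap the result in tuple(...) if a tuple of pairs is needed rather than a list.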
