dynamically nesting a python dictionary - python

I want to create a function that will create dynamic levels of nesting in a python dictionary.
e.g. if I call my function nesting, I want the outputs like the following:
nesting(1) : dict = {key1:<value>}
nesting(2) : dict = {key1:{key2:<value>}}
nesting(3) : dict = {key1:{key2:{key3:<value>}}}
and so on. I have all the keys and values before calling this function, but not before I start executing the code.
I have the keys stored in a variable 'm' where m is obtained from:
m=re.match(pattern,string)
the pattern is constructed dynamically for this case.

You can iterate over the keys like this:
def nesting(level):
    ret = 'value'
    for l in range(level, 0, -1):
        ret = {'key%d' % l: ret}
    return ret
Replace the range(...) fragment with the code which yields the keys in the desired order. So, if we assume that the keys are the captured groups, you should change the function as follows:
def nesting(match): # `match' is a match object like your `m' variable
    ret = 'value'
    for key in match.groups():
        ret = {key: ret}
    return ret
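Iterating over match.groups() in order makes the last captured group the outermost key. Use reversed(match.groups()) if you want the keys in the opposite order, with the first group outermost as in the question's examples.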
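As a minimal sketch of the match-object variant, assuming a hypothetical pattern with three capture groups and a placeholder value (neither comes from the question), the keys can be wrapped from the innermost outward:

```python
import re

def nesting_from_match(match, value):
    """Nest `value` under the captured groups, first group outermost."""
    ret = value
    # Wrap from the innermost key outward, so walk the groups in reverse.
    for key in reversed(match.groups()):
        ret = {key: ret}
    return ret

# Hypothetical pattern and input string, for illustration only.
m = re.match(r"(\w+)\.(\w+)\.(\w+)", "key1.key2.key3")
print(nesting_from_match(m, "value"))
# {'key1': {'key2': {'key3': 'value'}}}
```

Passing the value in as an argument keeps the function independent of where the value comes from.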

def nesting(level, l=None):
    # assuming `m` is accessible in the function
    if l is None:
        l = level
    if level == 1:
        return {m[l-level]: 'some_value'}
    return {m[l-level]: nesting(level-1, l)}
For reasonable levels, this won't exceed the recursion depth. This is also assuming that the value is always the same and that m is of the form:
['key1', 'key2', ...]
An iterative form of this function can be written as such:
def nesting(level):
    # also assuming `m` is accessible within the function
    d = 'some_value'
    while level > 0:
        d = {m[level-1]: d}  # wrap from the innermost key outward
        level -= 1
    return d
Or:
def nesting(level):
    # also assuming `m` is accessible within the function
    d = 'some_value'
    for l in range(level, 0, -1): # or xrange in Python 2
        d = {m[l-1]: d}  # m[level-1] ends up innermost, m[0] outermost
    return d
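The versions above read `m` from an enclosing scope. A hedged alternative, assuming the keys are available as a list or tuple, is to pass the keys and the value in explicitly and fold them with `functools.reduce`. The name `nest` is made up for this sketch:

```python
from functools import reduce

def nest(keys, value):
    """Return value wrapped in one dict level per key, first key outermost."""
    # Fold from the right: the last key wraps the value first.
    return reduce(lambda inner, key: {key: inner}, reversed(keys), value)

print(nest(["key1", "key2", "key3"], "some_value"))
# {'key1': {'key2': {'key3': 'some_value'}}}
print(nest([], "some_value"))
# some_value
```

With an empty key sequence the value is returned unchanged, which may or may not be the behaviour you want for level 0.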

Related

Rearranging a dictionary based on a function-condition over its items

(In relation to this question I posed a few days ago)
I have a dictionary whose keys are strings, and whose values are sets of integers, for example:
db = {"a":{1,2,3}, "b":{5,6,7}, "c":{2,5,4}, "d":{8,11,10,18}, "e":{0,3,2}}
I would like to have a procedure that joins the keys whose values satisfy a certain generic condition given in an external function. The new item will therefore have as its key the union of both keys (the order is not important). The value will be determined by the condition itself.
For example: given this condition function:
def condition(kv1: tuple, kv2: tuple):
    key1, val1 = kv1
    key2, val2 = kv2
    union = val1 | val2 #just needed for the following line
    maxDif = max(union) - min(union)
    newVal = set()
    for i in range(maxDif):
        auxVal1 = {pos - i for pos in val2}
        auxVal2 = {pos + i for pos in val2}
        intersection1 = val1.intersection(auxVal1)
        intersection2 = val1.intersection(auxVal2)
        print(intersection1, intersection2)
        if (len(intersection1) >= 3):
            newVal.update(intersection1)
        if (len(intersection2) >= 3):
            newVal.update({pos - i for pos in intersection2})
    if len(newVal)==0:
        return False
    else:
        newKey = "".join(sorted(key1+key2))
        return newKey, newVal
That is, the satisfying pair of items have at least 3 numbers in their values at the same distance (difference) between them. As said, if satisfied, the resulting key is the union of the two keys. And for this particular example, the value is the (minimum) matching numbers in the original value sets.
How can I smartly apply a function like this to a dictionary like db? Given the aforementioned dictionary, the expected result would be:
result = {"ab":{1,2,3}, "cde":{0,3,2}, "d":{18}}
Your "condition" in this case is more than a mere condition. It is actually a merging rule that identifies values to keep and values to drop. This may or may not allow a generalized approach, depending on how the patterns and merge rules vary.
Given this, each merge operation could leave values in the original keys that may be merged with some of the remaining keys. Multiple merges can also occur (e.g. key "cde"). In theory the merging process would need to cover a power set of all keys which may be impractical. Alternatively, this can be performed by successive refinements using pairings of (original and/or merged) keys.
The merge condition/function:
db = {"a":{1,2,3}, "b":{5,6,7}, "c":{2,5,4}, "d":{8,11,10,18}, "e":{0,3,2}}
from itertools import product
from collections import Counter
# Apply condition and return a keep-set and a remove-set
# the keep-set will be empty if the matching condition is not met
def merge(A,B,inverted=False):
    minMatch = 3
    distances = Counter(b-a for a,b in product(A,B) if b>=a)
    delta = [d for d,count in distances.items() if count>=minMatch]
    keep = {a for a in A if any(a+d in B for d in delta)}
    remove = {b for b in B if any(b-d in A for d in delta)}
    if len(keep)>=minMatch: return keep,remove
    return None,None
print( merge(db["a"],db["b"]) ) # ({1, 2, 3}, {5, 6, 7})
print( merge(db["e"],db["d"]) ) # ({0, 2, 3}, {8, 10, 11})
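To see why the first pair matches, the offset counting at the heart of `merge` can be run on its own. This is a small sketch rather than part of the answer; the helper name `shared_offsets` is an assumption introduced here:

```python
from collections import Counter
from itertools import product

def shared_offsets(A, B, min_match=3):
    """Return the offsets d >= 0 for which at least min_match pairs satisfy b - a == d."""
    distances = Counter(b - a for a, b in product(A, B) if b >= a)
    return sorted(d for d, count in distances.items() if count >= min_match)

print(shared_offsets({1, 2, 3}, {5, 6, 7}))   # [4]
print(shared_offsets({1, 2, 3}, {2, 5, 4}))   # []
```

Sets "a" and "b" share the offset 4 three times (1→5, 2→6, 3→7), whereas "a" and "c" have no offset that occurs three times.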
Merge Process:
# combine dictionary keys using a merging function/condition
def combine(D,mergeFunction):
    result = { k:set(v) for k,v in D.items() } # start with copy of input
    merging = True
    while merging: # keep merging until no more merges are performed
        merging = False
        for a,b in product(*2*[list(result.keys())]): # all key pairs
            if a==b: continue
            if a not in result or b not in result: continue # keys still there?
            m,n = mergeFunction(result[a],result[b]) # call merge function
            if not m : continue # if merged ...
            mergedKey = "".join(sorted(set(a+b))) # combine keys
            result[mergedKey] = m # add merged set
            if mergedKey != a: result[a] -= m; merging = True # clean/clear
            if not result[a]: del result[a] # original sets,
            if mergedKey != b: result[b] -= n; merging = True # do more merges
            if not result[b]: del result[b]
    return result

Python 3: Creating list of multiple dictionaries have same keys but different values coming from multiple lists

I'm parsing through a response of XML using xpath from lxml library.
I'm getting the results and creating lists out of them like below:
object_name = [o.text for o in response.xpath('//*[name()="objectName"]')]
object_size_KB = [o.text for o in response.xpath('//*[name()="objectSize"]')]
I want to use the lists to create a dictionary per element in list and then add them to a final list like this:
[{'object_name': 'file1234', 'object_size_KB': 9347627},
 {'object_name': 'file5671', 'object_size_KB': 9406875}]
I wanted a generator because I might need to search for more metadata from the response in the future so I want my code to be future proof and reduce repetition:
meta_names = {
'object_name': '//*[name()="objectName"]',
'object_size_KB': '//*[name()="objectSize"]'
}
def parse_response(response, meta_names):
    """
    input: response: api xml response text from lxml xpath
    input: meta_names: key names used to generate dictionary per object
    return: list of objects dictionary
    """
    mylist = []
    # create list of each xpath match assign them to variables
    for key, value in meta_names.items():
        mylist.append({key: [o.text for o in response.xpath(value)]})
    return mylist
However the function gives me this:
[{'object_name': ['file1234', 'file5671']}, {'object_size_KB': ['9347627', '9406875']}]
I've been searching for a similar case in the forums but couldn't find something to match my needs.
Appreciate your help.
UPDATE: Renney's answer was what I wanted. I only adjusted the length used for the range over my results, since I don't always have the same number of xpath results per object key; because my lists always have identical lengths, I use the length of the first one (index [0]).
The function now looks like this:
def create_entries(root, keys):
    tmp = []
    for key in keys:
        tmp.append([o.text for o in root.xpath('//*[name()="' + key + '"]')])
    ret = []
    # print(len(tmp[0]))
    for i in range(len(tmp[0])):
        add = {}
        for j in range(len(keys)):
            add[keys[j]] = tmp[j][i]
        ret.append(add)
    return ret
Use a two dimensional array:
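def createEntries(root, keys):
    tmp = []
    for key in keys:
        tmp.append([o.text for o in root.xpath('//*[name()="' + key + '"]')])
    ret = []
    # NOTE: len(tmp) is the number of keys; use len(tmp[0]) to loop over
    # the number of entries per key (see the update in the question)
    for i in range(len(tmp)):
        add = {}
        for j in range(len(keys)):
            add[keys[j]] = tmp[j][i]
        ret.append(add)
    return ret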
I think this is what you are looking for.
You can use zip to combine your two lists into a list of value pairs.
Then, you can use a list comprehension or a generator expression to pair your value pairs with your desired keys.
import pprint
object_name = ['file1234', 'file5671']
object_size = [9347627, 9406875]
# List Comprehension
obj_list = [{'object_name': name, 'object_size': size} for name,size in zip(object_name,object_size)]
pprint.pprint(obj_list)
print('\n')
# Generator Expression
generator = ({'object_name': name, 'object_size': size} for name,size in zip(object_name,object_size))
for obj in generator:
    print(obj)
Live Code Example -> https://onlinegdb.com/SyNSwd7jU
I think the accepted answer is more efficient, but here's an example of how list comprehensions could be used.
import pprint
meta_names = {
    'object_name': ['file1234', 'file5671'],
    'object_size_KB': ['9347627', '9406875'],
    'object_text': ['Bob', 'Ross']
}
def parse_response(meta_names):
    """
    input: meta_names: key names mapped to their lists of values
    return: None; pretty-prints the list of per-object dictionaries
    """
    # List comprehensions
    to_dict = lambda l: [{key:val for key,val in pairs} for pairs in l]
    objs = list(zip(*list([[key,val] for val in vals] for key,vals in meta_names.items())))
    pprint.pprint(to_dict(objs))
parse_response(meta_names)
Live Code -> https://onlinegdb.com/ryLq4PVjL
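When the number of metadata fields may grow, a sketch that does not hard-code the keys can help. The following assumes the per-key lists have already been extracted (sample values copied from the question) and that they all have the same length; `rows_from_columns` is a name made up for this example:

```python
def rows_from_columns(columns):
    """Turn {key: [v1, v2, ...]} into [{key: v1, ...}, {key: v2, ...}, ...]."""
    keys = list(columns)
    # zip(*values) walks the lists in parallel, one tuple per object.
    return [dict(zip(keys, row)) for row in zip(*columns.values())]

columns = {
    'object_name': ['file1234', 'file5671'],
    'object_size_KB': ['9347627', '9406875'],
}
print(rows_from_columns(columns))
# Result (wrapped here for readability):
# [{'object_name': 'file1234', 'object_size_KB': '9347627'},
#  {'object_name': 'file5671', 'object_size_KB': '9406875'}]
```

Note that `zip` stops at the shortest list, so lists of unequal length would silently drop trailing values.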

Python Tuple Key For Dict Partial Lookup

I have a dictionary with a tuple of 5 values as a key. For example:
D[i,j,k,g,h] = value.
Now I need to process all elements with a certain partial key pair (i1,g1):
for each pair (i1,g1), I need all values that have i == i1 and g == g1 in the full key.
What is a pythonic and efficient way to retrieve this, knowing that I need the elements for all pairs and that each full key belongs to exactly one partial key?
Is there a more appropriate data structure than dictionaries?
One reference implementation is this:
results = {}
for i1 in I:
    for g1 in G:
        results[i1,g1] = []
        for i,j,k,g,h in D:
            if i1 == i and g1 == g:
                results[i1,g1].append(D[i,j,k,g,h])
Assuming you know all the valid values for the different indices you can get all possible keys using itertools.product:
import itertools
I = [3,6,9]
J = range(10)
K = "abcde"
G = ["first","second"]
H = range(10,20)
my_dict = {}
for tup in itertools.product(I,J,K,G,H):
    my_dict[tup] = 0
To restrict the indices generated just put a limit on one / several of the indices that gets generated, for instance all of the keys where i = 6 would be:
itertools.product((6,), J,K,G,H)
A function to let you specify you want all the indices where i==6 and g =="first" would look like this:
def partial_indices(i_vals=I, j_vals=J, k_vals=K, g_vals = G, h_vals = H):
    return itertools.product(i_vals, j_vals, k_vals, g_vals, h_vals)
partial_indices(i_vals=(6,), g_vals=("first",))
Or assuming that not all of these are present in the dictionary you can also pass the dictionary as an argument and check for membership before generating the keys:
def items_with_partial_indices(d, i_vals=I, j_vals=J, k_vals=K, g_vals = G, h_vals = H):
    for tup in itertools.product(i_vals, j_vals, k_vals, g_vals, h_vals):
        try:
            yield tup, d[tup]
        except KeyError:
            pass
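# `i` and `p` are the partial key values to look for; note that `in`
# matches them at any position of the key tuple
for k,v in D.items(): # D.iteritems() in Python 2
    if i in k and p in k:
        print(v)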
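Since each full key belongs to exactly one partial key, a single pass over the dictionary is enough to build every group. A minimal sketch, assuming the key layout (i, j, k, g, h) from the question and made-up sample data:

```python
from collections import defaultdict

def group_by_partial_key(D):
    """Group the values of D by the (i, g) components of each 5-tuple key."""
    results = defaultdict(list)
    for (i, j, k, g, h), value in D.items():
        results[i, g].append(value)
    return results

# Made-up sample data for illustration.
D = {
    (3, 0, 'a', 'first', 10): 1.0,
    (3, 1, 'b', 'first', 11): 2.0,
    (6, 0, 'a', 'second', 10): 3.0,
}
groups = group_by_partial_key(D)
print(groups[3, 'first'])    # [1.0, 2.0]
print(groups[6, 'second'])   # [3.0]
```

This visits each entry once, instead of scanning the whole dictionary for every (i1, g1) pair as the reference implementation does.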

In Python, How can I get the next and previous key:value of a particular key in a dictionary?

Okay, so this is a little hard to explain, but here goes:
I have a dictionary, which I'm adding content to. The content is a hashed username (key) with an IP address (value).
I was putting the hashes in order by interpreting them as base-16 numbers, and then using collections.OrderedDict.
So, the dictionary looked a little like this:
d = {'1234': '8.8.8.8', '2345':'0.0.0.0', '3213':'4.4.4.4', '4523':'1.1.1.1', '7654':'1.3.3.7', '9999':'127.0.0.1'}
What I needed was a mechanism that would allow me to pick one of those keys, and get the key/value item one higher and one lower. So, for example, If I were to pick 2345, the code would return the key:value combinations '1234:8.8.8.8' and '3213:4.4.4.4'
So, something like:
for i in d:
    while i < len(d)
        if i == '2345':
            print i.nextItem
            print i.previousItem
            break()
Edit: OP now states that they are using OrderedDicts but the use case still requires this sort of approach.
Since dicts are not ordered you cannot directly do this. From your example, you are trying to reference the item like you would use a linked list.
A quick solution would be instead to extract the keys and sort them then iterate over that list:
keyList=sorted(d.keys())
for i,v in enumerate(keyList):
    if v=='eeee':
        print(d[keyList[i+1]])
        print(d[keyList[i-1]])
The keyList holds the order of your items and you have to go back to it to find out what the next/previous key is to get the next/previous value. You also have to check for i+1 being greater than the list length and i-1 being less than 0.
You can use an OrderedDict similarly but I believe that you still have to do the above with a separate list as OrderedDict doesn't have next/prev methods.
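The boundary checks mentioned above can be folded into a small helper. This sketch assumes the keys compare in the desired order when sorted, and uses `bisect` to locate the key; the function name `neighbours` is hypothetical:

```python
from bisect import bisect_left

def neighbours(d, key):
    """Return ((prev_key, prev_val), (next_key, next_val)); None at either end."""
    keys = sorted(d)
    i = bisect_left(keys, key)
    if i == len(keys) or keys[i] != key:
        raise KeyError(key)
    prev_item = (keys[i - 1], d[keys[i - 1]]) if i > 0 else None
    next_item = (keys[i + 1], d[keys[i + 1]]) if i + 1 < len(keys) else None
    return prev_item, next_item

d = {'1234': '8.8.8.8', '2345': '0.0.0.0', '3213': '4.4.4.4',
     '4523': '1.1.1.1', '7654': '1.3.3.7', '9999': '127.0.0.1'}
print(neighbours(d, '2345'))
# (('1234', '8.8.8.8'), ('3213', '4.4.4.4'))
print(neighbours(d, '1234'))
# (None, ('2345', '0.0.0.0'))
```

The keys are sorted on every call here; for many lookups it would probably be better to sort once and reuse the list.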
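As seen in the OrderedDict source code,
if you have a key and you want to find the next and prev in O(1), here's how you do that. Note that this relies on a private attribute of the pure-Python implementation (as in Python 2); it is not part of the public API and is not available in the C implementation used by CPython 3.5 and later.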
>>> from collections import OrderedDict
>>> d = OrderedDict([('aaaa', 'a',), ('bbbb', 'b'), ('cccc', 'c'), ('dddd', 'd'), ('eeee', 'e'), ('ffff', 'f')])
>>> i = 'eeee'
>>> link_prev, link_next, key = d._OrderedDict__map['eeee']
>>> print 'nextKey: ', link_next[2], 'prevKey: ', link_prev[2]
nextKey: ffff prevKey: dddd
This will give you next and prev by insertion order. If you add items in random order then just keep track of your items in sorted order.
You could also use the list.index() method.
This function is more generic (you can check positions +n and -n), it will catch attempts at searching for a key that's not in the dict, and it will also print None if there's nothing before or after the key:
def keyshift(dictionary, key, diff):
    if key in dictionary:
        token = object()
        keys = [token]*(diff*-1) + sorted(dictionary) + [token]*diff
        newkey = keys[keys.index(key)+diff]
        if newkey is token:
            print(None)
        else:
            print({newkey: dictionary[newkey]})
    else:
        print('Key not found')
keyshift(d, 'bbbb', -1)
keyshift(d, 'eeee', +1)
Try:
pos = 0
d = {'aaaa': 'a', 'bbbb':'b', 'cccc':'c', 'dddd':'d', 'eeee':'e', 'ffff':'f'}
for i in d:
    if i == 'eeee':
        listForm = list(d.values())
        print(listForm[pos-1])
        print(listForm[pos+1])
    pos+=1 # increment after the check so pos is the index of the current key
As in @AdamKerz's answer, enumerate seems pythonic, but if you are a beginner this code might help you understand it in an easy way.
And I think it's faster and smaller compared to sorting followed by building a list and then enumerating.
You could use a generic function, based on iterators, to get a moving window (taken from this question):
import itertools
def window(iterable, n=3):
    it = iter(iterable)
    result = tuple(itertools.islice(it, n))
    if len(result) == n:
        yield result
    for element in it:
        result = result[1:] + (element,)
        yield result
l = range(8)
for i in window(l, 3):
    print(i)
Using the above function with OrderedDict.items() will give you three (key, value) pairs, in order:
d = collections.OrderedDict(...)
for p_item, item, n_item in window(d.items()):
    p_key, p_value = p_item
    key, value = item
    # Or, if you don't care about the next value:
    n_key, _ = n_item
Of course using this function the first and last values will never be in the middle position (although this should not be difficult to do with some adaptation).
I think the biggest advantage is that it does not require table lookups in the previous and next keys, and also that it is generic and works with any iterable.
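One possible adaptation, sketched here under the assumption that `None` is an acceptable placeholder for a missing neighbour, pads the sequence on both sides so that every item appears once in the middle position. The name `with_neighbours` is invented for this example:

```python
import itertools
from collections import OrderedDict

def with_neighbours(iterable, fill=None):
    """Yield (previous, current, next) for every element, padding the ends with fill."""
    padded = itertools.chain([fill], iterable, [fill])
    it = iter(padded)
    window = tuple(itertools.islice(it, 3))
    if len(window) == 3:
        yield window
    for element in it:
        window = window[1:] + (element,)
        yield window

d = OrderedDict([('a', 1), ('b', 2), ('c', 3)])
for prev_item, item, next_item in with_neighbours(d.items()):
    print(prev_item, item, next_item)
# None ('a', 1) ('b', 2)
# ('a', 1) ('b', 2) ('c', 3)
# ('b', 2) ('c', 3) None
```

If `None` can itself be an element of the iterable, a dedicated sentinel object passed as `fill` would avoid the ambiguity.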
Maybe it is overkill, but you can keep track of the inserted keys with a helper class and use that list to retrieve the previous or next key. Just don't forget to check the border conditions, i.e. whether the object is already the first or last element. This way, you will not need to re-sort the ordered list or search for the element every time.
from collections import OrderedDict
class Helper(object):
    """Helper Class for Keeping track of Insert Order"""
    def __init__(self, arg):
        super(Helper, self).__init__()
    # class-level containers shared by the static methods below
    dictContainer = dict()
    ordering = list()
    @staticmethod
    def addItem(dictItem):
        for key,value in dictItem.items():
            print(key,value)
            Helper.ordering.append(key)
            Helper.dictContainer[key] = value
    @staticmethod
    def getPrevious(key):
        index = (Helper.ordering.index(key)-1)
        return Helper.dictContainer[Helper.ordering[index]]
#Your unordered dictionary
d = {'aaaa': 'a', 'bbbb':'b', 'cccc':'c', 'dddd':'d', 'eeee':'e', 'ffff':'f'}
#Create Order over keys
ordered = OrderedDict(sorted(d.items(), key=lambda t: t[0]))
#Push your ordered list to your Helper class
Helper.addItem(ordered)
#Get Previous of
print(Helper.getPrevious('eeee'))
# prints: d
You can store the keys and values in temporary variables beforehand, and then access the previous and next key/value pair by index.
It is pretty dynamic and will work for any key you query. Please check this code:
d = {'1234': '8.8.8.8', '2345':'0.0.0.0', '3213':'4.4.4.4', '4523':'1.1.1.1', '7654':'1.3.3.7', '9999':'127.0.0.1'}
ch = input('Please enter your choice : ') # raw_input in Python 2
keys = list(d.keys())
values = list(d.values())
#print(keys, values)
for k,v in d.items():
    if k == ch:
        ind = keys.index(k)
        print(keys[ind-1], ':', values[ind-1])
        print(keys[ind+1], ':', values[ind+1])
I think this is a nice Pythonic way of resolving your problem using a lambda and list comprehension, although it may not be optimal in execution time:
import collections
x = collections.OrderedDict([('a','v1'),('b','v2'),('c','v3'),('d','v4')])
previousItem = lambda currentKey, thisOrderedDict : [
    list( thisOrderedDict.items() )[ z - 1 ] if (z != 0) else None
    for z in range( len( thisOrderedDict.items() ) )
    if (list( thisOrderedDict.keys() )[ z ] == currentKey) ][ 0 ]
nextItem = lambda currentKey, thisOrderedDict : [
    list( thisOrderedDict.items() )[ z + 1 ] if (z != (len( thisOrderedDict.items() ) - 1)) else None
    for z in range( len( thisOrderedDict.items() ) )
    if (list( thisOrderedDict.keys() )[ z ] == currentKey) ][ 0 ]
assert previousItem('c', x) == ('b', 'v2')
assert nextItem('c', x) == ('d', 'v4')
assert previousItem('a', x) is None
assert nextItem('d',x) is None
Another way that seems simple and straightforward: this function returns the key that is offset positions away from k, or None if there is no such key.
def get_shifted_key(d:dict, k:str, offset:int) -> str:
    l = list(d.keys())
    if k in l:
        i = l.index(k) + offset
        if 0 <= i < len(l):
            return l[i]
    return None
I know how to get the next key:value of a particular key in a dictionary:
flag = 0
for k, v in dic.items():
    if flag == 0:
        code...
        flag += 1
        continue
    code...{next key and value in for}
If I understand correctly:
d = { "a": 1, "b":2, "c":3 }
l = list( d.keys() ) # make a list of the keys
k = "b" # the actual key
i = l.index( k ) # get index of the actual key
for the next :
i = i+1 if i+1 < len( l ) else 0 # select next index or restart 0
n = l [ i ]
d [ n ]
for the previous :
i = i-1 if i-1 >= 0 else len( l ) -1 # select previous index or go end
p = l [ i ]
d [ p ]

searching and adding a python list

I have a TList, which is a list of lists. I would like to add new items to the list only if they are not already present. For instance, if item I is not present, add it to TList; otherwise skip it. Is there a more pythonic way of doing this? Note: at first TList may be empty, and elements are added in this code. After adding Z, for example, TList = [ [A,B,C],[D,F,G],[H,I,J],[Z,aa,bb]]. The other elements are based on calculations on Z.
item = 'C' # for example this item will be given by user
TList = [ [A,B,C],[D,F,G],[H,I,J]]
if not TList:
    ## do something
# check if files not previously present in our TList and then add to our TList
elif item not in zip(*TList)[0]:
    ## do something
Since it would appear that the first entry in each sublist is a key of some sort, and the remaining entries are somehow derived from that key, a dictionary might be a more suitable data structure:
vals = {'A': ['B','C'], 'D':['F','G'], 'H':['I','J']}
if 'Z' in vals:
    print('found Z')
else:
    vals['Z'] = ['aa','bb']
@aix made a good suggestion to use a dict as your data structure; it seems to fit your use case well.
Consider wrapping up the value checking (i.e. 'Does it exist?') and the calculation of the derived values ('aa' and 'bb' in your example?).
class TList(object):
    def __init__(self):
        self.data = {}
    def __iter__(self):
        return iter(self.data)
    def set(self, key):
        if key not in self:
            self.data[key] = self.do_something(key)
    def get(self, key):
        return self.data[key]
    def do_something(self, key):
        print('Calculating values')
        return ['aa', 'bb']
    def as_old_list(self):
        return [[k, v[0], v[1]] for k, v in self.data.items()]
t = TList()
## Add some values. If new, `do_something()` will be called
t.set('aval')
t.set('bval')
t.set('aval') ## Note, do_something() is not called
## Get a value
t.get('aval')
## 'in ' tests work
'aval' in t
## Give you back your old data structure
t.as_old_list()
If you need to keep the same data structure, something like this should work:
# create a set of already seen items (the first element of each sublist)
seen = set(sub[0] for sub in TList)
# now start adding new items
if item not in seen:
    seen.add(item)
    # add new sublist to TList
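Put together, that approach could look like the sketch below. It assumes the first element of each sublist is the identifying item, and `compute_rest` is a hypothetical stand-in for the calculation that derives the remaining elements:

```python
def compute_rest(item):
    """Hypothetical placeholder for the real calculation based on `item`."""
    return [item.lower() * 2, item.lower() * 3]

def add_if_new(tlist, seen, item):
    """Append [item, ...derived values] to tlist unless item was already added."""
    if item in seen:
        return False
    seen.add(item)
    tlist.append([item] + compute_rest(item))
    return True

TList = [['A', 'B', 'C'], ['D', 'F', 'G'], ['H', 'I', 'J']]
seen = {sub[0] for sub in TList}   # first element of every sublist

print(add_if_new(TList, seen, 'Z'))  # True
print(add_if_new(TList, seen, 'A'))  # False
print(TList[-1])                     # ['Z', 'zz', 'zzz']
```

Keeping `seen` alongside the list avoids rescanning every sublist on each insertion, and the same code works when TList starts out empty.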
Here is a method using sets and set.union:
a = {1, 2, 3}
b = {4, 5, 6}
c = set()
master = [a,b,c]
if 2 in set.union(*master):
    pass # Found it, do something
else:
    pass # Not in set, do something else
If the reason for testing for membership is simply to avoid adding an entry twice, the set structure uses a.add(12) to add something to a set, but only add it once, thus eliminating the need to test. Thus the following:
>>> a=set()
>>> a.add(1)
>>> a
set([1])
>>> a.add(1)
>>> a
set([1])
If you need the set elsewhere as a list you simply say "list(a)" to get "a" as a list, or "tuple(a)" to get it as a tuple.
