Parse dict into specific list format using Python

Say I have the following dict
{'red':'boop','white':'beep','rose':'blip'}
And I want to get it to a list like so
['red','boop','end','white','beep','rose','blip','end']
The key/value pair to be placed at the front of the list is an input.
So essentially I want [first_key, first_value, end, .. rest of the k/v pairs .., end]
I wrote a brute-force approach, but I feel there's a more Pythonic way of doing it (and once implemented, the snippet would make my code O(n^2)):
for item in lst_items:
    data_lst = []
    for key, value in item.iteritems():
        data_lst.append(key)
        data_lst.append(value)
    # insert 'end' at the appropriate indices
    # more code ...
Is there a more Pythonic approach?

The below relies on itertools.chain.from_iterable to flatten the items into a single list. We pull the first two values from the chain and then use them to build a new list, which we extend with the rest of the values.
from itertools import chain
def ends(d):
    if not d:
        return []
    c = chain.from_iterable(d.iteritems())
    l = [next(c), next(c), "end"]
    l.extend(c)
    l.append("end")
    return l
ends({'red':'boop','white':'beep','rose':'blip'})
# ['rose', 'blip', 'end', 'white', 'beep', 'red', 'boop', 'end']
If you know the key you want first, and don't care about the rest, we can use a lazily evaluated generator expression to remove it from the flattened list.
def ends(d, first):
    if not d:
        return []
    c = chain.from_iterable((k, v) for k, v in d.iteritems() if k != first)
    l = [first, d[first], "end"]
    l.extend(c)
    l.append("end")
    return l
ends({'red':'boop','white':'beep','rose':'blip'}, 'red')
# ['red', 'boop', 'end', 'rose', 'blip', 'white', 'beep', 'end']

The first key is specified in the variable first:
first = 'red'
d = {'red':'boop','white':'beep','rose':'blip'}
new_l = [first, d[first], 'end']
for k, v in d.items():
    if k == first:
        continue
    new_l.append(k)
    new_l.append(v)
new_l.append('end')
print(new_l)
Prints:
['red', 'boop', 'end', 'white', 'beep', 'rose', 'blip', 'end']

You could use enumerate and check the current index:
>>> d = {'red':'boop','white':'beep','rose':'blip'}
>>> [x for i, e in enumerate(d.items())
... for x in (e + ("end",) if i in (0, len(d)-1) else e)]
...
['white', 'beep', 'end', 'red', 'boop', 'rose', 'blip', 'end']
However, your original idea, first chaining the keys and values and then inserting the "end" items would not have O(n²), either. It would be O(n) followed by another O(n), hence still O(n).
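
As a rough illustration of the linear-time point above, here is a minimal Python 3 sketch. The name to_marked_list and its marker parameter are hypothetical (they do not come from the question), and the sketch assumes the chosen first key is present in the dict and that dicts keep insertion order (Python 3.7+).

```python
def to_marked_list(d, first, marker="end"):
    """Flatten d into [first, d[first], marker, <other pairs...>, marker]."""
    flat = [first, d[first]]          # raises KeyError if first is missing
    for key, value in d.items():      # one O(n) pass over the remaining pairs
        if key != first:
            flat.extend((key, value))
    flat.insert(2, marker)            # a single insert, O(n) at worst
    flat.append(marker)
    return flat


colors = {'red': 'boop', 'white': 'beep', 'rose': 'blip'}
print(to_marked_list(colors, 'rose'))
# ['rose', 'blip', 'end', 'red', 'boop', 'white', 'beep', 'end']
```

The flattening pass and the single insert are each linear, so the whole function stays O(n) per dict.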

from itertools import chain
list(chain(*item.items())) + ['end']

data_lst = [x for k, v in item.iteritems() for x in (k, v)]
data_lst.insert(2, 'end')
data_lst.append('end')
This is Pythonic, though it will likely have the same efficiency (which can't be helped here).
It should be faster than placing if blocks inside the loops.

Related

Python - itertools.groupby 2

I'm having trouble with itertools.groupby. Given a list of strings,
my_list = [
    "AD01", "AD01AA", "AD01AB", "AD01AC", "AD01AD", "AD02", "AD02AA", "AD02AB", "AD02AC"]
I want to create a list of dictionaries, where "Legacy" holds the shortest name and "rphy" holds the longer names that start with it.
Example:
[
{"Legacy" : "AD01", "rphy" : ["AD01AA", "AD01AB", "AD01AC", "AD01AD"]},
{"Legacy" : "AD02", "rphy" : ["AD02AA", "AD02AB", "AD02AC"]},
]
Could you help me, please?

You can use itertools.groupby, with some nexts:
from itertools import groupby
my_list= ["AD01", "AD01AA", "AD01AB", "AD01AC", "AD01AD","AD02", "AD02AA", "AD02AB", "AD02AC"]
groups = groupby(my_list, len)
output = [{'Legacy': next(g), 'rphy': list(next(groups)[1])} for _, g in groups]
print(output)
# [{'Legacy': 'AD01', 'rphy': ['AD01AA', 'AD01AB', 'AD01AC', 'AD01AD']},
# {'Legacy': 'AD02', 'rphy': ['AD02AA', 'AD02AB', 'AD02AC']}]
This is not robust to reordering of the input list.
Also, if there is a "gap" in the input, e.g., if "AD01" has no corresponding 'rphy' entries, it can raise a StopIteration error, as you have found out. In that case you can use a more conventional approach:
my_list = ["AD01", "AD02", "AD02AA", "AD02AB", "AD02AC"]
output = []
for item in my_list:
    if len(item) == 4:
        dct = {'Legacy': item, 'rphy': []}
        output.append(dct)
    else:
        dct['rphy'].append(item)
print(output)
# [{'Legacy': 'AD01', 'rphy': []}, {'Legacy': 'AD02', 'rphy': ['AD02AA', 'AD02AB', 'AD02AC']}]
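
If the input order cannot be trusted either, one option is to group by prefix in a plain dict instead of relying on adjacency. This is only a sketch under an assumption taken from the loop above: every "Legacy" name is exactly 4 characters long and every longer name starts with its Legacy name. The function name group_by_prefix and the prefix_len parameter are hypothetical.

```python
def group_by_prefix(names, prefix_len=4):
    """Group names by their first prefix_len characters, in first-seen order."""
    groups = {}
    for name in names:
        prefix = name[:prefix_len]
        # Create the entry on first sight of this prefix, even if the bare
        # Legacy name itself never appears in the input (an assumption).
        entry = groups.setdefault(prefix, {'Legacy': prefix, 'rphy': []})
        if len(name) > prefix_len:
            entry['rphy'].append(name)
    return list(groups.values())


shuffled = ["AD02AA", "AD01", "AD01AB", "AD02", "AD03"]
print(group_by_prefix(shuffled))
# [{'Legacy': 'AD02', 'rphy': ['AD02AA']},
#  {'Legacy': 'AD01', 'rphy': ['AD01AB']},
#  {'Legacy': 'AD03', 'rphy': []}]
```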

One approach would be: (see the note at the end of the answer)
from itertools import groupby
from pprint import pprint
my_list = [
"AD01",
"AD01AA",
"AD01AB",
"AD01AC",
"AD01AD",
"AD02",
"AD02AA",
"AD02AB",
"AD02AC",
]
res = []
for _, g in groupby(my_list, len):
    lst = list(g)
    if len(lst) == 1:
        res.append({"Legacy": lst[0], "rphy": []})
    else:
        res[-1]["rphy"].extend(lst)
pprint(res)
output:
[{'Legacy': 'AD01', 'rphy': ['AD01AA', 'AD01AB', 'AD01AC', 'AD01AD']},
 {'Legacy': 'AD02', 'rphy': ['AD02AA', 'AD02AB', 'AD02AC']}]
This assumes that your data always starts with the desired key (the name that is shorter than the values that follow it).
In every iteration you check the length of the list created from the groupby group. If it is 1, it is a key; otherwise, its items are added to the 'rphy' list of the most recent dictionary.
Note: This code breaks if fewer than 2 longer names appear between two keys.

Handle dictionary collision in python3

I currently have the code below, which works fine.
Can someone help me handle the collision created by having two keys with the same value in the dictionary?
I tried multiple approaches (not listed here) to create an array to handle it, but they were unsuccessful.
I am using Python 3.7.
def find_key(dic1, n):
    '''
    Return the key for the value '3' from the dict
    below.
    '''
    d = {}
    for x, y in dic1.items():
        # swap keys and values
        # and update the result to 'd'
        d[y] = x
    try:
        if n in d:
            return d[n]
    except Exception as e:
        return (e)
dic1 = {'james':2,'david':3}
# Case to test that returns a 'collision':
# comment out 'dic1' above and replace it with
# dic1 below to create a 'collision'
# dic1 = {'james':2,'david':3, 'sandra':3}
n = 3
print(find_key(dic1,n))
Any help would be much appreciated.

You know there should be multiple returns, so plan for that in advance.
def find_keys_for_value(d, value):
    for k, v in d.items():
        if v == value:
            yield k
data = {'james': 2, 'david': 3, 'sandra':3}
for result in find_keys_for_value(data, 3):
    print(result)

You can use a defaultdict:
from collections import defaultdict
def find_key(dct, n):
    dd = defaultdict(list)
    for x, y in dct.items():
        dd[y].append(x)
    return dd[n]
dic1 = {'james':2, 'david':3, 'sandra':3}
print(find_key(dic1, 3))
print(find_key(dic1, 2))
print(find_key(dic1, 1))
Output:
['david', 'sandra']
['james']
[]
Building a defaultdict from all keys and values is only justified if you will repeatedly search the same dict for keys given different values, though. Otherwise, the approach of Kenny Ostrom is preferable. As written, the function above rebuilds the whole mapping on every call, so it gains nothing over a plain scan.
If you are not at ease with generators and yield, here is the approach of Kenny Ostrom translated to lists (less efficient than generators, better than the above for one-shot searches):
def find_key(dct, n):
    return [x for x, y in dct.items() if y == n]
The output is the same as above.
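
For the repeated-lookup case mentioned above, a minimal sketch could build the reverse mapping once and reuse it. The helper name build_value_index is hypothetical; only collections.defaultdict comes from the answer.

```python
from collections import defaultdict


def build_value_index(dct):
    """Map each value to the list of keys that carry it (built once, O(n))."""
    index = defaultdict(list)
    for key, value in dct.items():
        index[value].append(key)
    return index


dic1 = {'james': 2, 'david': 3, 'sandra': 3}
index = build_value_index(dic1)

# Each later lookup is a dict access; .get avoids creating empty entries.
print(index.get(3, []))  # ['david', 'sandra']
print(index.get(1, []))  # []
```

The index has to be rebuilt (or updated by hand) whenever the original dict changes.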

Create sublists with duplicate list elements

I'm new to Python and I'm trying to create sublists for list elements sharing the same base:
listRaw = ['AKS/STB', 'SBHS/AME', 'SBJ/OAK', 'SBJ/ALS', 'AKS/OSMX', 'SBHS/ABNX', 'AKS/AKX']
desiredOutput = [['AKS/STB', 'AKS/OSMX', 'AKS/AKX'], ['SBHS/AME', 'SBHS/ABNX'], ['SBJ/OAK', 'SBJ/ALS']]
I've tried to first isolate the base from each list element using:
def commonNumerator(self):
    checkPosition = self.find('/')
    commonNumerator = self[:checkPosition]
    return commonNumerator
listRawModified = [commonNumerator(x) for x in listRaw]
print(listRawModified)
which gets me:
['AKS', 'SBHS', 'SBJ', 'SBJ', 'AKS', 'SBHS', 'AKS']
but from there I don't know how to proceed to get to the desired output.
Can someone explain to me how to do it?

Typical use case for itertools.groupby():
from itertools import groupby
listRaw = ['AKS/STB', 'SBHS/AME', 'SBJ/OAK', 'SBJ/ALS', 'AKS/OSMX', 'SBHS/ABNX', 'AKS/AKX']
def key(s):
    return s.split('/')[0]
[list(g) for k, g in groupby(sorted(listRaw, key=key), key=key)]
# [['AKS/STB', 'AKS/OSMX', 'AKS/AKX'], ['SBHS/AME', 'SBHS/ABNX'], ['SBJ/OAK', 'SBJ/ALS']]
The key() function helps in extracting the sorting/grouping key: key('AKS/STB') == 'AKS'.
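
The sorted() call matters because groupby only merges adjacent items with equal keys. A small sketch of the difference, using a shortened sample list and a key function renamed to base purely for illustration:

```python
from itertools import groupby


def base(s):
    """Return the part before the slash, e.g. 'AKS' for 'AKS/STB'."""
    return s.split('/')[0]


items = ['AKS/STB', 'SBHS/AME', 'AKS/OSMX']

# Without sorting, the two AKS entries are not adjacent, so they land
# in separate groups.
unsorted_groups = [list(g) for _, g in groupby(items, key=base)]
print(unsorted_groups)
# [['AKS/STB'], ['SBHS/AME'], ['AKS/OSMX']]

# Sorting by the same key first makes equal keys adjacent; sorted() is
# stable, so the original relative order inside each group is kept.
sorted_groups = [list(g) for _, g in groupby(sorted(items, key=base), key=base)]
print(sorted_groups)
# [['AKS/STB', 'AKS/OSMX'], ['SBHS/AME']]
```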

Another way to do this would be to split each element and create a dictionary and then construct your desired output from that dictionary, e.g.:
In []:
d = {}
for i in listRaw:
    k, v = i.split('/')
    d.setdefault(k, []).append(v)
[['/'.join([k, v]) for v in d[k]] for k in d]
Out[]:
[['AKS/STB', 'AKS/OSMX', 'AKS/AKX'], ['SBHS/AME', 'SBHS/ABNX'], ['SBJ/OAK', 'SBJ/ALS']]

This is a typical usecase for itertools. But you could also consider storing the values in a dictionary:
from collections import defaultdict
d = defaultdict(list)
listRaw = ['AKS/STB', 'SBHS/AME', 'SBJ/OAK', 'SBJ/ALS', 'AKS/OSMX', 'SBHS/ABNX', 'AKS/AKX']
for item in listRaw:
    i, y = item.split('/')
    d[i].append(y)
print(dict(d))
# {'AKS': ['STB', 'OSMX', 'AKX'], 'SBHS': ['AME', 'ABNX'], 'SBJ': ['OAK', 'ALS']}
You can then access the values for AKS with a simple lookup:
d['AKS'] # ['STB', 'OSMX', 'AKX']

In Python, How can I get the next and previous key:value of a particular key in a dictionary?

Okay, so this is a little hard to explain, but here goes:
I have a dictionary, which I'm adding content to. The content is a hashed username (key) with an IP address (value).
I was putting the hashes into an order by running them against base 16, and then using collections.OrderedDict.
So, the dictionary looked a little like this:
d = {'1234': '8.8.8.8', '2345':'0.0.0.0', '3213':'4.4.4.4', '4523':'1.1.1.1', '7654':'1.3.3.7', '9999':'127.0.0.1'}
What I needed was a mechanism that would allow me to pick one of those keys, and get the key/value item one higher and one lower. So, for example, If I were to pick 2345, the code would return the key:value combinations '1234:8.8.8.8' and '3213:4.4.4.4'
So, something like (pseudocode):
for i in d:
    if i == '2345':
        print i.nextItem
        print i.previousItem
        break

Edit: OP now states that they are using OrderedDicts but the use case still requires this sort of approach.
Since dicts are not ordered you cannot directly do this. From your example, you are trying to reference the item like you would use a linked list.
A quick solution would be instead to extract the keys and sort them then iterate over that list:
keyList = sorted(d.keys())
for i, v in enumerate(keyList):
    if v == 'eeee':
        print d[keyList[i+1]]
        print d[keyList[i-1]]
The keyList holds the order of your items and you have to go back to it to find out what the next/previous key is to get the next/previous value. You also have to check for i+1 being greater than the list length and i-1 being less than 0.
You can use an OrderedDict similarly but I believe that you still have to do the above with a separate list as OrderedDict doesn't have next/prev methods.
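
A minimal Python 3 sketch of the sorted-key approach with the boundary checks described above. The function name neighbours is hypothetical, and returning None at either end is an assumption about the desired behaviour rather than something the question specifies.

```python
def neighbours(d, key):
    """Return ((prev_key, prev_value), (next_key, next_value)) in sorted key order.

    Either element is None when key is the first or last key.
    Raises ValueError if key is not in d.
    """
    keys = sorted(d)
    i = keys.index(key)
    prev_item = (keys[i - 1], d[keys[i - 1]]) if i > 0 else None
    next_item = (keys[i + 1], d[keys[i + 1]]) if i + 1 < len(keys) else None
    return prev_item, next_item


d = {'1234': '8.8.8.8', '2345': '0.0.0.0', '3213': '4.4.4.4',
     '4523': '1.1.1.1', '7654': '1.3.3.7', '9999': '127.0.0.1'}
print(neighbours(d, '2345'))
# (('1234', '8.8.8.8'), ('3213', '4.4.4.4'))
```

Sorting on every call costs O(n log n); if lookups are frequent, keeping the sorted key list around would avoid repeating that work.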

As seen in the OrderedDict source code,
if you have a key and you want to find the next and prev in O(1) here's how you do that.
>>> from collections import OrderedDict
>>> d = OrderedDict([('aaaa', 'a',), ('bbbb', 'b'), ('cccc', 'c'), ('dddd', 'd'), ('eeee', 'e'), ('ffff', 'f')])
>>> i = 'eeee'
>>> link_prev, link_next, key = d._OrderedDict__map['eeee']
>>> print 'nextKey: ', link_next[2], 'prevKey: ', link_prev[2]
nextKey: ffff prevKey: dddd
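This will give you next and prev by insertion order. If you add items in random order then just keep track of your items in sorted order. Note that _OrderedDict__map is a private attribute of the pure-Python OrderedDict implementation, so this is not portable; the C implementation used by CPython 3.5 and later does not expose it.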

You could also use the list.index() method.
This function is more generic (you can check positions +n and -n), it catches attempts to look up a key that's not in the dict, and it prints None if there's nothing before or after the key:
def keyshift(dictionary, key, diff):
    if key in dictionary:
        token = object()
        keys = [token]*(diff*-1) + sorted(dictionary) + [token]*diff
        newkey = keys[keys.index(key)+diff]
        if newkey is token:
            print None
        else:
            print {newkey: dictionary[newkey]}
    else:
        print 'Key not found'
keyshift(d, 'bbbb', -1)
keyshift(d, 'eeee', +1)

Try:
pos = 0
d = {'aaaa': 'a', 'bbbb':'b', 'cccc':'c', 'dddd':'d', 'eeee':'e', 'ffff':'f'}
for i in d:
    if i == 'eeee':
        listForm = list(d.values())
        print(listForm[pos-1])
        print(listForm[pos+1])
    pos += 1
As in @AdamKerz's answer, enumerate seems Pythonic, but if you are a beginner this code might help you understand it more easily.
I also think it's faster and smaller than sorting, building a list, and then enumerating.

You could use a generic function, based on iterators, to get a moving window (taken from this question):
import itertools
def window(iterable, n=3):
    it = iter(iterable)
    result = tuple(itertools.islice(it, n))
    if len(result) == n:
        yield result
    for element in it:
        result = result[1:] + (element,)
        yield result
l = range(8)
for i in window(l, 3):
    print i
Using the above function with OrderedDict.items() will give you three (key, value) pairs, in order:
d = collections.OrderedDict(...)
for p_item, item, n_item in window(d.items()):
    p_key, p_value = p_item
    key, value = item
    # Or, if you don't care about the next value:
    n_key, _ = n_item
Of course using this function the first and last values will never be in the middle position (although this should not be difficult to do with some adaptation).
I think the biggest advantage is that it does not require table lookups in the previous and next keys, and also that it is generic and works with any iterable.
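
One possible adaptation for the edge case mentioned above is to pad both ends with a filler, so that the first and last items also appear in the middle position. This is a sketch, not the original window function: the name padded_triples and the fill parameter are hypothetical, and unlike window it copies the input into a list first.

```python
def padded_triples(iterable, fill=None):
    """Yield (previous, current, next) for every item, using fill at both ends."""
    items = list(iterable)
    padded = [fill] + items + [fill]
    # Three views of the padded list, each shifted by one position.
    return zip(padded, padded[1:], padded[2:])


for prev_item, item, next_item in padded_triples(['a', 'b', 'c']):
    print(prev_item, item, next_item)
# None a b
# a b c
# b c None
```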

Maybe it is overkill, but you can keep track of the inserted keys with a helper class and, using that list, retrieve the previous or next key. Just don't forget to check the border conditions, i.e. whether the object is already the first or last element. This way you don't need to re-sort the ordered list or search for the element every time.
from collections import OrderedDict
class Helper(object):
    """Helper class for keeping track of insertion order"""
    def __init__(self, arg):
        super(Helper, self).__init__()
    dictContainer = dict()
    ordering = list()
    @staticmethod
    def addItem(dictItem):
        for key, value in dictItem.iteritems():
            print key, value
            Helper.ordering.append(key)
            Helper.dictContainer[key] = value
    @staticmethod
    def getPrevious(key):
        index = (Helper.ordering.index(key)-1)
        return Helper.dictContainer[Helper.ordering[index]]
# Your unordered dictionary
d = {'aaaa': 'a', 'bbbb':'b', 'cccc':'c', 'dddd':'d', 'eeee':'e', 'ffff':'f'}
# Create an order over the keys
ordered = OrderedDict(sorted(d.items(), key=lambda t: t[0]))
# Push your ordered dict to the Helper class
Helper.addItem(ordered)
# Get the previous value of 'eeee'
print Helper.getPrevious('eeee')
>>> d

You can store the keys and values in temporary variables beforehand, and then access the previous and next key/value pair by index.
It is fairly dynamic and will work for any key you query. Please check this code:
d = {'1234': '8.8.8.8', '2345':'0.0.0.0', '3213':'4.4.4.4', '4523':'1.1.1.1', '7654':'1.3.3.7', '9999':'127.0.0.1'}
ch = raw_input('Please enter your choice: ')
keys = d.keys()
values = d.values()
#print keys, values
for k, v in d.iteritems():
    if k == ch:
        ind = keys.index(k)
        print keys[ind-1], ':', values[ind-1]
        print keys[ind+1], ':', values[ind+1]

I think this is a nice Pythonic way of resolving your problem using a lambda and list comprehension, although it may not be optimal in execution time:
import collections
x = collections.OrderedDict([('a','v1'),('b','v2'),('c','v3'),('d','v4')])
previousItem = lambda currentKey, thisOrderedDict : [
list( thisOrderedDict.items() )[ z - 1 ] if (z != 0) else None
for z in range( len( thisOrderedDict.items() ) )
if (list( thisOrderedDict.keys() )[ z ] == currentKey) ][ 0 ]
nextItem = lambda currentKey, thisOrderedDict : [
list( thisOrderedDict.items() )[ z + 1 ] if (z != (len( thisOrderedDict.items() ) - 1)) else None
for z in range( len( thisOrderedDict.items() ) )
if (list( thisOrderedDict.keys() )[ z ] == currentKey) ][ 0 ]
assert previousItem('c', x) == ('b', 'v2')
assert nextItem('c', x) == ('d', 'v4')
assert previousItem('a', x) is None
assert nextItem('d',x) is None

Another way that seems simple and straightforward: this function returns the key that is offset positions away from k
def get_shifted_key(d:dict, k:str, offset:int) -> str:
    l = list(d.keys())
    if k in l:
        i = l.index(k) + offset
        if 0 <= i < len(l):
            return l[i]
    return None

I know how to get the next key:value of a particular key in a dictionary:
flag = 0
for k, v in dic.items():
    if flag == 0:
        code...
        flag += 1
        continue
    code...{next key and value in for}

if correct :
d = { "a": 1, "b":2, "c":3 }
l = list( d.keys() ) # make a list of the keys
k = "b" # the actual key
i = l.index( k ) # get index of the actual key
for the next :
i = i+1 if i+1 < len( l ) else 0 # select next index or restart 0
n = l [ i ]
d [ n ]
for the previous :
i = i-1 if i-1 >= 0 else len( l ) -1 # select previous index or go end
p = l [ i ]
d [ p ]

Merge nested list items based on a repeating value

Although poorly written, this code:
marker_array = [['hard','2','soft'],['heavy','2','light'],['rock','2','feather'],['fast','3'], ['turtle','4','wet']]
marker_array_DS = []
for i in range(len(marker_array)):
    if marker_array[i-1][1] != marker_array[i][1]:
        marker_array_DS.append(marker_array[i])
print marker_array_DS
Returns:
[['hard', '2', 'soft'], ['fast', '3'], ['turtle', '4', 'wet']]
It accomplishes part of the task which is to create a new list containing all nested lists except those that have duplicate values in index [1]. But what I really need is to concatenate the matching index values from the removed lists creating a list like this:
[['hard heavy rock', '2', 'soft light feather'], ['fast', '3'], ['turtle', '4', 'wet']]
The values in index [1] must not be concatenated. I kind of managed to do the concatenation part using a tip from another post:
newlist = [i + n for i, n in zip(list_a, list_b)]
But I am struggling with figuring out the way to produce the desired result. The "marker_array" list will be already sorted in ascending order before being passed to this code. All like-values in index [1] position will be contiguous. Some nested lists may not have any values beyond [0] and [1] as illustrated above.

Quick stab at it... use itertools.groupby to do the grouping for you, but do it over a generator that converts the 2 element list into a 3 element.
from itertools import groupby
from operator import itemgetter
marker_array = [['hard','2','soft'],['heavy','2','light'],['rock','2','feather'],['fast','3'], ['turtle','4','wet']]
def my_group(iterable):
    # pad 2-element lists with '' so every row has exactly 3 elements
    temp = ((el + [''])[:3] for el in iterable)
    for k, g in groupby(temp, key=itemgetter(1)):
        fst, snd = map(' '.join, zip(*map(itemgetter(0, 2), g)))
        yield filter(None, [fst, k, snd])
print list(my_group(marker_array))

from collections import defaultdict
d1 = defaultdict(list)
d2 = defaultdict(list)
for pxa in marker_array:
    d1[pxa[1]].extend(pxa[:1])
    d2[pxa[1]].extend(pxa[2:])
res = [[' '.join(d1[x]), x, ' '.join(d2[x])] for x in sorted(d1)]
If you really need 2-element lists (which I think is unlikely):
for p in res:
    if not p[-1]:
        p.pop()

marker_array = [['hard','2','soft'],['heavy','2','light'],['rock','2','feather'],['fast','3'], ['turtle','4','wet']]
marker_array_DS = []
marker_array_hit = []
for i in range(len(marker_array)):
    if marker_array[i][1] not in marker_array_hit:
        marker_array_hit.append(marker_array[i][1])
for i in marker_array_hit:
    lists = [item for item in marker_array if item[1] == i]
    temp = []
    first_part = ' '.join([str(item[0]) for item in lists])
    temp.append(first_part)
    temp.append(i)
    second_part = ' '.join([str(item[2]) for item in lists if len(item) > 2])
    if second_part != '':
        temp.append(second_part)
    marker_array_DS.append(temp)
print marker_array_DS

marker_array = [
['hard','2','soft'],
['heavy','2','light'],
['rock','2','feather'],
['fast','3'],
['turtle','4','wet'],
]
data = {}
for arr in marker_array:
    if len(arr) == 2:
        arr.append('')
    (first, index, last) = arr
    firsts, lasts = data.setdefault(index, [[],[]])
    firsts.append(first)
    lasts.append(last)
results = []
for key in sorted(data.keys()):
    current = [
        " ".join(data[key][0]),
        key,
        " ".join(data[key][1])
    ]
    if current[-1] == '':
        current = current[:-1]
    results.append(current)
print results
--output:--
[['hard heavy rock', '2', 'soft light feather'], ['fast', '3'], ['turtle', '4', 'wet']]

A different solution based on itertools.groupby:
from itertools import groupby
# normalizes the list of markers so all markers have 3 elements
def normalized(markers):
    for marker in markers:
        yield marker + [""] * (3 - len(marker))
def concatenated(markers):
    # use groupby to iterate over lists of markers sharing the same key
    for key, markers_in_category in groupby(normalized(markers), lambda m: m[1]):
        # get separate lists of left and right words
        lefts, rights = zip(*[(m[0],m[2]) for m in markers_in_category])
        # remove empty strings from both lists
        lefts, rights = filter(bool, lefts), filter(bool, rights)
        # yield the concatenated entry for this key (also removing the empty string at the end, if necessary)
        yield filter(bool, [" ".join(lefts), key, " ".join(rights)])
The generator concatenated(markers) will yield the results. This code correctly handles the ['fast', '3'] case and doesn't return an additional third element in such cases.
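
The answers above are written for Python 2, where filter returns a list. For Python 3, a rough equivalent could look like the sketch below. The function name merge_markers is hypothetical, and the sketch relies on the guarantee from the question that rows with equal index [1] values are contiguous.

```python
from itertools import groupby
from operator import itemgetter


def merge_markers(markers):
    """Merge contiguous rows that share the same value at index 1."""
    merged = []
    for key, group in groupby(markers, key=itemgetter(1)):
        rows = list(group)
        lefts = " ".join(row[0] for row in rows)
        # Rows may have no third element, so only join the ones that do.
        rights = " ".join(row[2] for row in rows if len(row) > 2)
        merged.append([lefts, key, rights] if rights else [lefts, key])
    return merged


marker_array = [['hard', '2', 'soft'], ['heavy', '2', 'light'],
                ['rock', '2', 'feather'], ['fast', '3'], ['turtle', '4', 'wet']]
print(merge_markers(marker_array))
# [['hard heavy rock', '2', 'soft light feather'], ['fast', '3'], ['turtle', '4', 'wet']]
```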
