Related
Say I have the following dict
{'red':'boop','white':'beep','rose':'blip'}
And I want to get it to a list like so
['red','boop','end','white','beep','rose','blip','end']
The key / value which is to be placed in front of the list is an input.
So I essentially I want [first_key, first_value,end, .. rest of the k/v pairs..,end]
I wrote a brute force approach but I feel like there's a more pythonic way of doing it (and also because once implemented the snippet would make my code O(n^2) )
for item in lst_items
data_lst = []
for key, value in item.iteritems():
data_lst.append(key)
ata_lst.append(value)
#insert 'end' at the appropiate indeces
#more code ...
Any pythonic approach?
The below relies on itertools.chain.from_iterable to flatten the items into a single list. We pull the first two values from the chain and then use them to build a new list, which we extend with the rest of the values.
from itertools import chain
def ends(d):
if not d:
return []
c = chain.from_iterable(d.iteritems())
l = [next(c), next(c), "end"]
l.extend(c)
l.append("end")
return l
ends({'red':'boop','white':'beep','rose':'blip'})
# ['rose', 'blip', 'end', 'white', 'beep', 'red', 'boop', 'end']
If you know the key you want first, and don't care about the rest, we can use a lazily evaluated generator expression to remove it from the flattened list.
def ends(d, first):
if not d:
return []
c = chain.from_iterable((k, v) for k, v in d.iteritems() if k != first)
l = [first, d[first], "end"]
l.extend(c)
l.append("end")
return l
ends({'red':'boop','white':'beep','rose':'blip'}, 'red')
# ['red', 'boop', 'end', 'rose', 'blip', 'white', 'beep', 'end']
The first key is specified in first variable:
first = 'red'
d = {'red':'boop','white':'beep','rose':'blip'}
new_l = [first, d[first], 'end']
for k, v in d.items():
if k == first:
continue
new_l.append(k)
new_l.append(v)
new_l.append('end')
print(new_l)
Prints:
['red', 'boop', 'end', 'white', 'beep', 'rose', 'blip', 'end']
You could use enumerate and check the current index:
>>> d = {'red':'boop','white':'beep','rose':'blip'}
>>> [x for i, e in enumerate(d.items())
... for x in (e + ("end",) if i in (0, len(d)-1) else e)]
...
['white', 'beep', 'end', 'red', 'boop', 'rose', 'blip', 'end']
However, your original idea, first chaining the keys and values and then inserting the "end" items would not have O(n²), either. It would be O(n) followed by another O(n), hence still O(n).
from itertools import chain
list(chain(*item.items())) + ['end']
data_lst = [x for k, v in lst_itemsL.iteritems() for x in (k, v) ]
data_lst.insert(2, 'end')
data_lst.append('end')
This is pythonic; though will likely have the same efficiency (which can't be helped here).
This should be faster than placing if blocks inside the loops...
I'm new to Python and I'm trying to create sublists for list elements sharing the same base:
listRaw = ['AKS/STB', 'SBHS/AME', 'SBJ/OAK', 'SBJ/ALS', 'AKS/OSMX', 'SBHS/ABNX', 'AKS/AKX']
desiredOutput = [['AKS/STB', 'AKS/OSMX', 'AKS/AKX'], ['SBHS/AME', 'SBHS/ABNX'], ['SBJ/OAK', 'SBJ/ALS']]
I've tried to first isolate the base from each list element using:
def commonNumerator(self):
checkPosition = self.find('/')
commonNumerator = self[:checkPosition]
return commonNumerator
listRawModified = [commonNumerator(x) for x in listRaw]
print(listRawModified)
which gets me:
['AKS', 'SBHS', 'SBJ', 'SBJ', 'AKS', 'SBHS', 'AKS']
but from then I don't know how to proceed to get to the desired ouput.
Can someone explain to me how to do it?
Typical usecase for itertools.groupby():
from itertools import groupby
listRaw = ['AKS/STB', 'SBHS/AME', 'SBJ/OAK', 'SBJ/ALS', 'AKS/OSMX', 'SBHS/ABNX', 'AKS/AKX']
def key(s):
return s.split('/')[0]
[list(g) for k, g in groupby(sorted(listRaw, key=key), key=key)]
# [['AKS/STB', 'AKS/OSMX', 'AKS/AKX'], ['SBHS/AME', 'SBHS/ABNX'], ['SBJ/OAK', 'SBJ/ALS']]
The key() function helps in extracting the sorting/grouping key: key('AKS/STB') == 'AKS'.
Another way to do this would be to split each element and create a dictionary and then construct your desired output from that dictionary, e.g.:
In []:
d = {}
for i in listRaw:
k, v = i.split('/')
d.setdefault(k, []).append(v)
[['/'.join([k, v]) for v in d[k]] for k in d]
Out[]:
[['AKS/STB', 'AKS/OSMX', 'AKS/AKX'], ['SBHS/AME', 'SBHS/ABNX'], ['SBJ/OAK', 'SBJ/ALS']]
This is a typical usecase for itertools. But you could also consider storing the values in a dictionary:
from collections import defaultdict
d = defaultdict(list)
listRaw = ['AKS/STB', 'SBHS/AME', 'SBJ/OAK', 'SBJ/ALS', 'AKS/OSMX', 'SBHS/ABNX', 'AKS/AKX']
for item in listRaw:
i,y = item.split('/')
d[i].append(y)
print(dict(d))
# {'AKS': ['STB', 'OSMX', 'AKX'], 'SBHS': ['AME', 'ABNX'], 'SBJ': ['OAK', 'ALS']}
You can then access the values to AKS with a simple command as:
d['AKS'] # ['STB', 'OSMX', 'AKX']
I have an array and I want to count the occurrence of each item in the array.
I have managed to use a map function to produce a list of tuples.
def mapper(a):
return (a, 1)
r = list(map(lambda a: mapper(a), arr));
//output example:
//(11817685, 1), (2014036792, 1), (2014047115, 1), (11817685, 1)
I'm expecting the reduce function can help me to group counts by the first number (id) in each tuple. For example:
(11817685, 2), (2014036792, 1), (2014047115, 1)
I tried
cnt = reduce(lambda a, b: a + b, r);
and some other ways but they all don't do the trick.
NOTE
Thanks for all the advice on other ways to solve the problems, but I'm just learning Python and how to implement a map-reduce here, and I have simplified my real business problem a lot to make it easy to understand, so please kindly show me a correct way of doing map-reduce.
You could use Counter:
from collections import Counter
arr = [11817685, 2014036792, 2014047115, 11817685]
counter = Counter(arr)
print zip(counter.keys(), counter.values())
EDIT:
As pointed by #ShadowRanger Counter has items() method:
from collections import Counter
arr = [11817685, 2014036792, 2014047115, 11817685]
print Counter(arr).items()
Instead of using any external module you can use some logic and do it without any module:
track={}
if intr not in track:
track[intr]=1
else:
track[intr]+=1
Example code :
For these types of list problems there is a pattern :
So suppose you have a list :
a=[(2006,1),(2007,4),(2008,9),(2006,5)]
And you want to convert this to a dict as the first element of the tuple as key and second element of the tuple. something like :
{2008: [9], 2006: [5], 2007: [4]}
But there is a catch you also want that those keys which have different values but keys are same like (2006,1) and (2006,5) keys are same but values are different. you want that those values append with only one key so expected output :
{2008: [9], 2006: [1, 5], 2007: [4]}
for this type of problem we do something like this:
first create a new dict then we follow this pattern:
if item[0] not in new_dict:
new_dict[item[0]]=[item[1]]
else:
new_dict[item[0]].append(item[1])
So we first check if key is in new dict and if it already then add the value of duplicate key to its value:
full code:
a=[(2006,1),(2007,4),(2008,9),(2006,5)]
new_dict={}
for item in a:
if item[0] not in new_dict:
new_dict[item[0]]=[item[1]]
else:
new_dict[item[0]].append(item[1])
print(new_dict)
output:
{2008: [9], 2006: [1, 5], 2007: [4]}
After writing my answer to a different question, I remembered this post and thought it would be helpful to write a similar answer here.
Here is a way to use reduce on your list to get the desired output.
arr = [11817685, 2014036792, 2014047115, 11817685]
def mapper(a):
return (a, 1)
def reducer(x, y):
if isinstance(x, dict):
ykey, yval = y
if ykey not in x:
x[ykey] = yval
else:
x[ykey] += yval
return x
else:
xkey, xval = x
ykey, yval = y
a = {xkey: xval}
if ykey in a:
a[ykey] += yval
else:
a[ykey] = yval
return a
mapred = reduce(reducer, map(mapper, arr))
print mapred.items()
Which prints:
[(2014036792, 1), (2014047115, 1), (11817685, 2)]
Please see the linked answer for a more detailed explanation.
If all you need is cnt, then a dict would probably be better than a list of tuples here (if you need this format, just use dict.items).
The collections module has a useful data structure for this, a defaultdict.
from collections import defaultdict
cnt = defaultdict(int) # create a default dict where the default value is
# the result of calling int
for key in arr:
cnt[key] += 1 # if key is not in cnt, it will put in the default
# cnt_list = list(cnt.items())
I have a dictionary with a tuple of 5 values as a key. For example:
D[i,j,k,g,h] = value.
Now i need to process all elements with a certain partial key pair (i1,g1):
I need now for each pair (i1,g1) all values that have i == i1 and g == g1 in the full key.
What is an pythonic and efficient way to retrieve this, knowing that i need the elements for all pairs and each full key belongs to exactly one partial key?
Is there a more appropriate data structure than dictionaries?
One reference implementation is this:
results = {}
for i in I:
for g in G:
results[i,g] = []
for i,j,k,g,h in D:
if i1 == i and g1 == g:
results[i,g].append(D[i,j,k,g,h])
Assuming you know all the valid values for the different indices you can get all possible keys using itertools.product:
import itertools
I = [3,6,9]
J = range(10)
K = "abcde"
G = ["first","second"]
H = range(10,20)
for tup in itertools.product(I,J,K,G,H):
my_dict[tup] = 0
To restrict the indices generated just put a limit on one / several of the indices that gets generated, for instance all of the keys where i = 6 would be:
itertools.product((6,), J,K,G,H)
A function to let you specify you want all the indices where i==6 and g =="first" would look like this:
def partial_indices(i_vals=I, j_vals=J, k_vals=K, g_vals = G, h_vals = H):
return itertools.product(i_vals, j_vals, k_vals, g_vals, h_vals)
partial_indices(i_vals=(6,), g_vals=("first",))
Or assuming that not all of these are present in the dictionary you can also pass the dictionary as an argument and check for membership before generating the keys:
def items_with_partial_indices(d, i_vals=I, j_vals=J, k_vals=K, g_vals = G, h_vals = H):
for tup in itertools.product(i_vals, j_vals, k_vals, g_vals, h_vals):
try:
yield tup, d[tup]
except KeyError:
pass
for k,v in D.iteritems():
if i in k and p in k:
print v
Okay, so this is a little hard to explain, but here goes:
I have a dictionary, which I'm adding content to. The content is a hashed username (key) with an IP address (value).
I was putting the hashes into an order by running them against base 16, and then using Collection.orderedDict.
So, the dictionary looked a little like this:
d = {'1234': '8.8.8.8', '2345':'0.0.0.0', '3213':'4.4.4.4', '4523':'1.1.1.1', '7654':'1.3.3.7', '9999':'127.0.0.1'}
What I needed was a mechanism that would allow me to pick one of those keys, and get the key/value item one higher and one lower. So, for example, If I were to pick 2345, the code would return the key:value combinations '1234:8.8.8.8' and '3213:4.4.4.4'
So, something like:
for i in d:
while i < len(d)
if i == '2345':
print i.nextItem
print i.previousItem
break()
Edit: OP now states that they are using OrderedDicts but the use case still requires this sort of approach.
Since dicts are not ordered you cannot directly do this. From your example, you are trying to reference the item like you would use a linked list.
A quick solution would be instead to extract the keys and sort them then iterate over that list:
keyList=sorted(d.keys())
for i,v in enumerate(keyList):
if v=='eeee':
print d[keyList[i+1]]
print d[keyList[i-1]]
The keyList holds the order of your items and you have to go back to it to find out what the next/previous key is to get the next/previous value. You also have to check for i+1 being greater than the list length and i-1 being less than 0.
You can use an OrderedDict similarly but I believe that you still have to do the above with a separate list as OrderedDict doesn't have next/prev methods.
As seen in the OrderedDict source code,
if you have a key and you want to find the next and prev in O(1) here's how you do that.
>>> from collections import OrderedDict
>>> d = OrderedDict([('aaaa', 'a',), ('bbbb', 'b'), ('cccc', 'c'), ('dddd', 'd'), ('eeee', 'e'), ('ffff', 'f')])
>>> i = 'eeee'
>>> link_prev, link_next, key = d._OrderedDict__map['eeee']
>>> print 'nextKey: ', link_next[2], 'prevKey: ', link_prev[2]
nextKey: ffff prevKey: dddd
This will give you next and prev by insertion order. If you add items in random order then just keep track of your items in sorted order.
You could also use the list.index() method.
This function is more generic (you can check positions +n and -n), it will catch attempts at searching a key that's not in the dict, and it will also return None if there's nothing before of after the key:
def keyshift(dictionary, key, diff):
if key in dictionary:
token = object()
keys = [token]*(diff*-1) + sorted(dictionary) + [token]*diff
newkey = keys[keys.index(key)+diff]
if newkey is token:
print None
else:
print {newkey: dictionary[newkey]}
else:
print 'Key not found'
keyshift(d, 'bbbb', -1)
keyshift(d, 'eeee', +1)
Try:
pos = 0
d = {'aaaa': 'a', 'bbbb':'b', 'cccc':'c', 'dddd':'d', 'eeee':'e', 'ffff':'f'}
for i in d:
pos+=1
if i == 'eeee':
listForm = list(d.values())
print(listForm[pos-1])
print(listForm[pos+1])
As in #AdamKerz's answer enumerate seems pythonic, but if you are a beginner this code might help you understand it in an easy way.
And I think its faster + smaller compared to sorting followed by building list & then enumerating
You could use a generic function, based on iterators, to get a moving window (taken from this question):
import itertools
def window(iterable, n=3):
it = iter(iterable)
result = tuple(itertools.islice(it, n))
if len(result) == n:
yield result
for element in it:
result = result[1:] + (element,)
yield result
l = range(8)
for i in window(l, 3):
print i
Using the above function with OrderedDict.items() will give you three (key, value) pairs, in order:
d = collections.OrderedDict(...)
for p_item, item, n_item in window(d.items()):
p_key, p_value = p_item
key, value = item
# Or, if you don't care about the next value:
n_key, _ = n_item
Of course using this function the first and last values will never be in the middle position (although this should not be difficult to do with some adaptation).
I think the biggest advantage is that it does not require table lookups in the previous and next keys, and also that it is generic and works with any iterable.
Maybe it is an overkill, but you can keep Track of the Keys inserted with a Helper Class and according to that list, you can retrieve the Key for Previous or Next. Just don't forget to check for border conditions, if the objects is already first or last element. This way, you will not need to always resort the ordered list or search for the element.
from collections import OrderedDict
class Helper(object):
"""Helper Class for Keeping track of Insert Order"""
def __init__(self, arg):
super(Helper, self).__init__()
dictContainer = dict()
ordering = list()
#staticmethod
def addItem(dictItem):
for key,value in dictItem.iteritems():
print key,value
Helper.ordering.append(key)
Helper.dictContainer[key] = value
#staticmethod
def getPrevious(key):
index = (Helper.ordering.index(key)-1)
return Helper.dictContainer[Helper.ordering[index]]
#Your unordered dictionary
d = {'aaaa': 'a', 'bbbb':'b', 'cccc':'c', 'dddd':'d', 'eeee':'e', 'ffff':'f'}
#Create Order over keys
ordered = OrderedDict(sorted(d.items(), key=lambda t: t[0]))
#Push your ordered list to your Helper class
Helper.addItem(ordered)
#Get Previous of
print Helper.getPrevious('eeee')
>>> d
You can store the keys and values in temp variable in prior, and can access previous and next key,value pair using index.
It is pretty dynamic, will work for any key you query. Please check this code :
d = {'1234': '8.8.8.8', '2345':'0.0.0.0', '3213':'4.4.4.4', '4523':'1.1.1.1', '7654':'1.3.3.7', '9999':'127.0.0.1'}
ch = raw_input('Pleasure Enter your choice : ')
keys = d.keys()
values = d.values()
#print keys, values
for k,v in d.iteritems():
if k == ch:
ind = d.keys().index(k)
print keys[ind-1], ':',values[ind-1]
print keys[ind+1], ':',values[ind+1]
I think this is a nice Pythonic way of resolving your problem using a lambda and list comprehension, although it may not be optimal in execution time:
import collections
x = collections.OrderedDict([('a','v1'),('b','v2'),('c','v3'),('d','v4')])
previousItem = lambda currentKey, thisOrderedDict : [
list( thisOrderedDict.items() )[ z - 1 ] if (z != 0) else None
for z in range( len( thisOrderedDict.items() ) )
if (list( thisOrderedDict.keys() )[ z ] == currentKey) ][ 0 ]
nextItem = lambda currentKey, thisOrderedDict : [
list( thisOrderedDict.items() )[ z + 1 ] if (z != (len( thisOrderedDict.items() ) - 1)) else None
for z in range( len( thisOrderedDict.items() ) )
if (list( thisOrderedDict.keys() )[ z ] == currentKey) ][ 0 ]
assert previousItem('c', x) == ('b', 'v2')
assert nextItem('c', x) == ('d', 'v4')
assert previousItem('a', x) is None
assert nextItem('d',x) is None
Another way that seems simple and straight forward: this function returns the key which is offset positions away from k
def get_shifted_key(d:dict, k:str, offset:int) -> str:
l = list(d.keys())
if k in l:
i = l.index(k) + offset
if 0 <= i < len(l):
return l[i]
return None
i know how to get next key:value of a particular key in a dictionary:
flag = 0
for k, v in dic.items():
if flag == 0:
code...
flag += 1
continue
code...{next key and value in for}
if correct :
d = { "a": 1, "b":2, "c":3 }
l = list( d.keys() ) # make a list of the keys
k = "b" # the actual key
i = l.index( k ) # get index of the actual key
for the next :
i = i+1 if i+1 < len( l ) else 0 # select next index or restart 0
n = l [ i ]
d [ n ]
for the previous :
i = i-1 if i-1 >= 0 else len( l ) -1 # select previous index or go end
p = l [ i ]
d [ p ]