Iterating over the first half of a dictionary in Python

How do I iterate over the first half of a dictionary in Python?
This iterates over all items in the dictionary:
for key, value in checkbox_dict.iteritems():
    print key, value
But I want to iterate over the first half of the dictionary only.
One way is to do it like this:
for key, value in dict(checkbox_dict.items()[:11]).iteritems():
    print key, value
Is there a better way?

If you mean iterating over half of the items, here's a way (Python 2, where items() returns a list):
for key, value in checkbox_dict.items()[:len(checkbox_dict) // 2]:
    pass
But be aware: the elements of a normal dictionary don't necessarily keep the order in which they were inserted, unless you use an OrderedDict.

You can use itertools.islice to slice the dict.iteritems() iterator. Unlike slicing dict.items(), this won't create an intermediate list in memory.
>>> from itertools import islice
>>> d = dict.fromkeys('abcdefgh')
>>> for k, v in islice(d.iteritems(), len(d) // 2):
...     print k, v
...
a None
c None
b None
e None
Note that normal dictionaries are unordered, so the items are returned in arbitrary order.
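The snippets above are Python 2. As a minimal sketch of the same islice idea in Python 3 (where iteritems() is gone and items() returns a view that cannot be sliced directly), assuming Python 3.7+ so that iteration follows insertion order; first_half_items is a hypothetical helper name and the sample data is made up:

```python
from itertools import islice

def first_half_items(d):
    """Lazily yield the first len(d) // 2 (key, value) pairs of d."""
    # islice stops after the requested count, so no intermediate list is built
    return islice(d.items(), len(d) // 2)

checkbox_dict = {name: None for name in "abcdefgh"}  # hypothetical sample data
first_half = list(first_half_items(checkbox_dict))
print(first_half)
```

On Python 3.7+ this yields the four pairs that were inserted first.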

Related

How to get a subset from an OrderedDict?

I have an OrderedDict in Python, and I only want to get the first few key-value pairs. How can I get them? For example, to get the first 4 elements, I did the following:
subdict = {}
for index, pair in enumerate(my_ordered_dict.items()):
    if index < 4:
        subdict[pair[0]] = pair[1]
Is this a good way to do it?
That approach runs over the whole dictionary even though you only need the first four elements, checks the index on every iteration, and unpacks the pairs manually.
Making it short-circuit is easy:
subdict = {}
for index, pair in enumerate(my_ordered_dict.items()):
    if index >= 4:
        break  # Ends the loop without iterating all of my_ordered_dict
    subdict[pair[0]] = pair[1]
and you can nest the unpacking to get nicer names:
subdict = {}
# Inner parentheses mandatory for nested unpacking
for index, (key, val) in enumerate(my_ordered_dict.items()):
    if index >= 4:
        break  # Ends the loop
    subdict[key] = val
but you can improve on that with itertools.islice to remove the manual index checking:
from itertools import islice  # At top of file
subdict = {}
# islice lazily produces the first four pairs then stops for you
for key, val in islice(my_ordered_dict.items(), 4):
    subdict[key] = val
at which point you can actually one-line the whole thing (because now you have an iterable of exactly the four pairs you want, and the dict constructor accepts an iterable of pairs):
subdict = dict(islice(my_ordered_dict.items(), 4))
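If the result should stay an OrderedDict rather than become a plain dict, the one-liner can be wrapped in a small helper. This is only a sketch: take_first is a hypothetical name, the sample data is made up, and it assumes the mapping type accepts an iterable of pairs in its constructor (dict and OrderedDict both do).

```python
from collections import OrderedDict
from itertools import islice

def take_first(mapping, n):
    """Return a new mapping of the same type holding at most the first n items."""
    # type(mapping) gives OrderedDict in, OrderedDict out; islice stops after n pairs
    return type(mapping)(islice(mapping.items(), n))

my_ordered_dict = OrderedDict((letter, i) for i, letter in enumerate("abcdefg"))
subdict = take_first(my_ordered_dict, 4)
print(subdict)
```

Asking for more items than the mapping holds simply returns a copy of everything, because islice stops when the input runs out.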
You can also use map:
item = dict(map(lambda x: (x, my_ordered_dict[x]), [*my_ordered_dict][:4]))
Here is one approach:
sub_dict = dict(pair for i, pair in zip(range(4), my_ordered_dict.items()))
zip(a, b) stops as soon as the shorter of a and b is exhausted, so if my_ordered_dict.items() has more than 4 items, zip(range(4), my_ordered_dict.items()) takes only the first 4. These key-value pairs are passed to the dict builtin to make a new dict.
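A quick way to see the truncation at work, using made-up sample pairs in place of my_ordered_dict.items():

```python
pairs = [("a", 1), ("b", 2), ("c", 3), ("d", 4), ("e", 5), ("f", 6)]  # made-up sample data

# range(4) is the shorter input, so zip stops after four pairs
first_four = dict(pair for _, pair in zip(range(4), pairs))

# with only two pairs available, the pairs are the shorter input
first_two = dict(pair for _, pair in zip(range(4), pairs[:2]))

print(first_four)
print(first_two)
```

Either way no error is raised when the input is shorter than 4; you just get fewer items.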

Remove items from dictionary if the length of the item is 1 or less

Is there a way to remove a key from a dictionary using its index position (if it has one) instead of the actual key (that is, to avoid del d['key'] and use an index position instead)?
If there is then don't bother reading the rest of this question as that's what I'm looking for too.
So, as an example for my case, I have the dictionary d which uses lists for the values:
d = {'acd': ['cad'], 'abd': ['bad', 'dab']}
I want to remove every key whose value list has fewer than 2 items (that is, 1 item or none).
So, in this example, I want to remove the key 'acd' because its value ['cad'] has only 1 item. 'abd' has 2 items, ['bad', 'dab'], so I don't want to delete it. This dictionary is just an example; I am working with a much bigger version and need to remove all keys whose value has a single item.
I wrote this for testing, but I'm not sure how to remove the keys I want, or how to determine what they are.
d = {'acd': ['cad'], 'abd': ['bad', 'dab']}
index_pos = 0
for i in d.values():
    # Testing and seeing stuff
    print("pos:", index_pos)
    print(i)
    print(len(i))
    if len(i) < 2:
        del d[???]
        # What do I do?
    index_pos += 1
I used index_pos because I thought it might be useful but I'm not sure.
I know I can delete an entry from the dictionary using
del d['key']
But how do I avoid using the key and e.g. use the index position instead, or how do I find out what the key is, so I can delete it?
Just use a dictionary comprehension:
d = {'acd': ['cad'], 'abd': ['bad', 'dab']}
res = {k: v for k, v in d.items() if len(v) >= 2}
Yes, you are creating a new dictionary, but this in itself is not usually a problem. Any solution will take O(n) time.
You can iterate a copy of your dictionary while modifying your original one. However, you should find the dictionary comprehension more efficient. Don't, under any circumstances, remove or add keys while you iterate your original dictionary.
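To illustrate why deleting during iteration is ruled out, here is a small sketch of what CPython does when a key is removed from the dictionary being looped over (the error_message variable is only there to capture the result):

```python
d = {'acd': ['cad'], 'abd': ['bad', 'dab']}

try:
    for key in d:              # iterating the live dictionary
        if len(d[key]) < 2:
            del d[key]         # changes the dict's size mid-iteration
except RuntimeError as exc:
    error_message = str(exc)
    print("RuntimeError:", error_message)
```

The deletion itself succeeds, and the error is raised when the loop tries to fetch the next key, so the dictionary can be left half-processed.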
If you don't want to create a new dict, here's how you could change your code. We iterate over a copy of the dict and delete a key from the original if the length of its value is less than 2.
d = {'acd': ['cad'], 'abd': ['bad', 'dab']}
for key in d.copy():
    if len(d[key]) < 2:
        del d[key]
print(d)
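A variation on the same in-place idea, sketched here as an alternative rather than a correction: collect only the keys to delete first, then delete them, which avoids copying the whole dictionary when few keys need to go. The 'xyz' entry is made up to show the empty-list case.

```python
d = {'acd': ['cad'], 'abd': ['bad', 'dab'], 'xyz': []}  # 'xyz' added for illustration

# First pass: find the keys whose value has fewer than 2 items
short_keys = [key for key, value in d.items() if len(value) < 2]

# Second pass: delete them; d is no longer being iterated at this point
for key in short_keys:
    del d[key]

print(d)
```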

Parse a list, check if it has elements from another list and print out these elements

I have a list populated from entries of a log; for the sake of simplicity, something like
listlog = ["entry1:abcde", "entry2:abbds", "entry1:eorieo", "entry3:orieqor", "entry2:iroewiow"......]
This list can have any number of entries, which may or may not be in sequence, since I run multiple operations asynchronously.
Then I have another list, which I use as a reference to pick out entries; it may look like
list_template = ["entry1", "entry2", "entry3"]
I am trying to use the second list to extract a sequence of entries, taking only the first instance found of each entry.
Since I am not dealing with numbers, I can't use set, so I tried a loop inside a loop, comparing the values in each list.
This does not work, because another entry may appear before the one I am looking for (say I want entry1, entry2, entry3, and the loop finds entry1 but then finds entry3; since I compare every element of each list, it will happily accept it).
for item in listlog:
    entry, value = item.split(":")
    for reference_entry in list_template:
        if entry == reference_entry:
            print item
            break
In a nutshell, I have to find a sequence as in the template list, even though the items are not necessarily in order. I am trying to parse the list only once; otherwise I could do a very expensive multi-pass for each element of the template list, until I find the first occurrence and bail out. I thought that the loop inside a loop would be more efficient, since my reference list, which usually has only a few elements, is always smaller than the log list.
How would you approach this problem in the most efficient and Pythonic way? All I can think of is multiple passes over the log list.
You can use a dict:
>>> listlog
['entry1:abcde', 'entry2:abbds', 'entry1:eorieo', 'entry3:orieqor', 'entry2:iroewiow']
>>> list_template
['entry1', 'entry2', 'entry3']
>>> my_dict = {}
>>> for x in listlog:
...     key, value = x.split(":")
...     if key not in my_dict and key in list_template:
...         my_dict[key] = value
...
>>> my_dict
{'entry2': 'abbds', 'entry3': 'orieqor', 'entry1': 'abcde'}
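The same single-pass idea can be wrapped in a function. This sketch makes two changes that are my own assumptions about what is wanted, not part of the answer above: the template is turned into a set so membership tests do not scan a list, and the loop stops as soon as every template entry has been seen. first_occurrences is a hypothetical name.

```python
def first_occurrences(log_lines, wanted_names):
    """Map each wanted name to the value of its first occurrence in log_lines."""
    wanted = set(wanted_names)          # set membership tests are O(1) on average
    found = {}
    for line in log_lines:
        name, _, value = line.partition(":")   # split on the first ':' only
        if name in wanted and name not in found:
            found[name] = value
            if len(found) == len(wanted):
                break                   # every template entry seen, stop early
    return found

listlog = ["entry1:abcde", "entry2:abbds", "entry1:eorieo", "entry3:orieqor", "entry2:iroewiow"]
list_template = ["entry1", "entry2", "entry3"]
result = first_occurrences(listlog, list_template)
print(result)
```

Strings are hashable, so a set works fine here even though the entries are not numbers.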
Disclaimer: this answer could use someone's insight on performance. Sure, list/dict comprehensions and zip are Pythonic, but the following may very well be a poor use of those tools.
You could use zip. Note that it pairs ref with the filtered values purely by position, so the pairs are only correct when ref lists exactly the keys found in data, in the same order:
>>> data = ["a:12", "b:32", "c:54"]
>>> ref = ['b', 'c']
>>> matches = zip(ref, [val for key,val in [item.split(':') for item in data] if key in ref])
>>> for k, v in matches:
...     print("{}:{}".format(k, v))
...
b:32
c:54
Here's another way to get around this (worse? I'm not sure, performance-wise):
>>> data = ["a:12", "b:32", "c:54"]
>>> data_dict = {x: y for x, y in [item.split(':') for item in data]}
>>> ["{}:{}".format(key, val) for key, val in data_dict.items() if key in ref]
['b:32', 'c:54']
Explanation:
Convert your initial list into a dict using a dict comprehension.
For each (key, val) pair found in the dict, join both into a string if the key is found in the 'ref' list.
You can use a list comprehension, something like this:
import re
listlog = ["entry1:abcde", "entry2:abbds", "entry1:eorieo", "entry3:orieqor", "entry2:iroewiow"]
print([item for item in listlog if re.search('entry', item)])
# ['entry1:abcde', 'entry2:abbds', 'entry1:eorieo', 'entry3:orieqor', 'entry2:iroewiow']
Then you can split them as you wish and create a dictionary if you want:
import re
listlog = ["entry1:abcde", "entry2:abbds", "entry1:eorieo", "entry3:orieqor", "entry2:iroewiow"]
mylist = [item for item in listlog if re.search('entry', item)]
def create_dict(string, dict_splitter=':'):
    _dict = {}
    temp = string.split(dict_splitter)
    key = temp[0]
    value = temp[1]
    _dict[key] = value
    return _dict
mydictionary = {}
for x in mylist:
    x = str(x)
    mydictionary.update(create_dict(x))
for k, v in mydictionary.items():
    print(k, v)
# entry1 eorieo
# entry2 iroewiow
# entry3 orieqor
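As you can see, this method still needs work: update() overwrites the stored value every time a key repeats, so each key ends up with its last value instead of its first. That's bad. It would be better to keep the value already stored for a key, which is much easier than you might think.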
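One way to keep the first value per key instead of the last, sketched with dict.setdefault (which stores a value only when the key is not present yet). The first_values name is made up for this example:

```python
listlog = ["entry1:abcde", "entry2:abbds", "entry1:eorieo", "entry3:orieqor", "entry2:iroewiow"]

first_values = {}
for item in listlog:
    key, _, value = item.partition(":")   # split on the first ':' only
    first_values.setdefault(key, value)   # no effect if key is already stored

for k, v in first_values.items():
    print(k, v)
```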

Creating a list by iterating over a dictionary

I defined a dictionary like this (list is a list of integers):
my_dictionary = {'list_name' : list, 'another_list_name': another_list}
Now, I want to create a new list by iterating over this dictionary. In the end, I want it to look like this:
my_list = [list_name_list_item1, list_name_list_item2,
list_name_list_item3, another_list_name_another_list_item1]
And so on.
So my question is: how can I achieve this?
I tried
for key in my_dictionary.keys():
    k = my_dictionary[key]
    for value in my_dictionary.values():
        v = my_dictionary[value]
        v = str(v)
        my_list.append(k + '_' + v)
But instead of the desired output I get a TypeError (unhashable type: 'list') in line 4 of this example.
You're trying to look up a dictionary item by its value, whereas you already have the value.
Do it in one line using a list comprehension:
my_dictionary = {'list_name' : [1,4,5], 'another_list_name': [6,7,8]}
my_list = [k+"_"+str(v) for k,lv in my_dictionary.items() for v in lv]
print(my_list)
result:
['another_list_name_6', 'another_list_name_7', 'another_list_name_8', 'list_name_1', 'list_name_4', 'list_name_5']
Note that since the order in your dictionary is not guaranteed (before Python 3.7), the order of the list isn't either. You could fix the order by sorting the items according to keys:
my_list = [k+"_"+str(v) for k,lv in sorted(my_dictionary.items()) for v in lv]
Try this:
my_list = []
for key in my_dictionary:
    for item in my_dictionary[key]:
        my_list.append(str(key) + '_' + str(item))
Hope this helps.
Your immediate problem is that dict.values() yields the values from the dictionary, not the keys, so when you attempt to do a lookup on line 4, it fails (in this case) because the values in the dictionary can't be used as keys. In another case, say {1:2, 3:4}, it would fail with a KeyError, and {1:2, 2:1} would not raise an error, but would likely give confusing behaviour.
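The three cases can be checked with a short sketch. lookup_by_values is a hypothetical helper that repeats the mistaken lookup from the question, and the sample dictionaries are made up:

```python
def lookup_by_values(mapping):
    """Index the mapping with each of its own values, as the question's inner loop did."""
    return [mapping[value] for value in mapping.values()]

samples = [
    ("lists", {"a": [1, 2]}),      # a list value is unhashable, so TypeError
    ("missing", {1: 2, 3: 4}),     # value 2 is not a key, so KeyError
    ("swapped", {1: 2, 2: 1}),     # every value happens to be a key, so no error
]

outcomes = {}
for label, sample in samples:
    try:
        outcomes[label] = lookup_by_values(sample)
    except (TypeError, KeyError) as exc:
        outcomes[label] = type(exc).__name__

print(outcomes)
```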
As for your actual question: unlike dictionaries, lists do not attach names to their data; items are identified only by their position.
def f():
    a = 1
    b = 2
    c = 3
    l = [a, b, c]
    return l
Calling f() will return [1, 2, 3], with any concept of a, b, and c being lost entirely.
If you want to simply concatenate the lists in your dictionary, making a copy of the first, then calling .extend() on it will suffice:
my_list = my_dictionary['list_name'][:]
my_list.extend(my_dictionary['another_list_name'])
If you're looking to keep the order of the lists' items, while still referring to them by name, look into the OrderedDict class in collections.
You've written an outer loop over keys, then an inner loop over values, and tried to use each value as a key, which is where the program failed. Simply use the dictionary's items method to iterate over key,value pairs instead:
["{}_{}".format(k,v) for k,v in d.items()]
Oops, failed to parse the format desired; we were to produce each item in the inner list. Not to worry...
import itertools
d = {1: [1, 2, 3], 2: [4, 5, 6]}
list(itertools.chain(*(
    ["{}_{}".format(k, i) for i in l]
    for (k, l) in d.items())))
This is a little more complex. We again take key,value pairs from the dictionary, then make an inner loop over the list that was the value and format those into strings. This produces inner sequences, so we flatten it using chain and *, and finally save the result as one list.
Edit: Turns out Python 3.4.3 gets quite confused when doing this nested as generator expressions; I had to turn the inner one into a list, or it would replace some combination of k and l before doing the formatting.
Edit again: As someone posted in a since deleted answer (which confuses me), I'm overcomplicating things. You can do the flattened nesting in a chained comprehension:
["{}_{}".format(k,v) for k,l in d.items() for v in l]
That method was also posted by Jean-François Fabre.
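The confusion described in the first edit is consistent with late binding: chain(*...) unpacks, and therefore exhausts, the outer generator before any inner generator runs, so each inner generator would see the final value of k. Assuming that is the cause, a sketch that keeps everything lazy is itertools.chain.from_iterable, which consumes each inner generator before asking the outer one for the next pair:

```python
from itertools import chain

d = {1: [1, 2, 3], 2: [4, 5, 6]}

flattened = list(chain.from_iterable(
    ("{}_{}".format(k, i) for i in l)   # inner generator, consumed before k advances
    for k, l in d.items()
))
print(flattened)
```

The chained comprehension shown above is still the simpler choice; this only shows that the generator version can be made to work.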
Use list comprehensions like this
d = {"test1":[1,2,3,],"test2":[4,5,6],"test3":[7,8,9]}
new_list = [str(item[0])+'_'+str(v) for item in d.items() for v in item[1]]
Output:
new_list:
['test1_1',
'test1_2',
'test1_3',
'test3_7',
'test3_8',
'test3_9',
'test2_4',
'test2_5',
'test2_6']
Let's initialize our data
In [1]: l0 = [1, 2, 3, 4]
In [2]: l1 = [10, 20, 30, 40]
In [3]: d = {'name0': l0, 'name1': l1}
Note that in my example, unlike yours, the lists' contents are not strings... aren't lists heterogeneous containers?
That said, you cannot simply join the keys and the lists' items; you had better convert these values to strings using the str(...) builtin.
Now comes the solution to your problem: I use a list comprehension with two loops. The outer loop comes first and runs over the items (i.e., key-value pairs) of the dictionary; the inner loop comes second and runs over the items in the corresponding list.
In [4]: res = ['_'.join((str(k), str(i))) for k, l in d.items() for i in l]
In [5]: print(res)
['name0_1', 'name0_2', 'name0_3', 'name0_4', 'name1_10', 'name1_20', 'name1_30', 'name1_40']
In your case, using str(k)+'_'+str(i) would be fine as well, but the current idiom for joining strings with a fixed 'text' is the 'text'.join(...) method. Note that .join takes a SINGLE argument, an iterable, and hence in the list comprehension I used join((..., ...)) to collect the joinands in a single argument.
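A small sketch of both points, with made-up values: join accepts one iterable, and every element of it must already be a string, which is why str(...) is applied first.

```python
key, item = 'name0', 10   # made-up sample values

# One argument: a tuple holding the already-converted strings
joined = '_'.join((str(key), str(item)))
print(joined)

# Without str(), join rejects the integer element
try:
    '_'.join((key, item))
    raised = False
except TypeError:
    raised = True
print(raised)
```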

Get elements of a tuple-indexed dictionary specifying only one field of the tuple

I've got a big (20k+) set of data in the form of a dictionary indexed by tuples, e.g.
a = {(1,'000200','l1p'): 53, (15,'230512','l3c'): 81, ...}
I would like to filter that dictionary by providing only one field of the tuple, e.g.
a[(_,_,'l1p')] or a[(:,:,'l1p')]
Is there any better way than creating a list, like
[i for i in a.keys() if 'l1p' in i]
As I said, I'm trying to avoid copying elements as there are many entries in the dictionary.
EDIT: Is there any other way of obtaining the elements with 'l1p' in the key tuple than iterating over the whole dictionary? I would like to perform a recursive least-squares fit on the resulting sub-list.
First of all, what you have is a dictionary, not a list (and definitely not a tuple). Lists and tuples are just sequences of values numbered 0, 1, 2, ..., etc., while a dictionary is an unordered set of values each labelled & accessed with a unique key (in this case, the tuples).
With that out of the way, to get all of the values of a where the third element of the key is 'l1p', you can just do:
[v for k,v in a.items() if k[2] == 'l1p']
If you're concerned about saving memory and won't be trying to evaluate the entire result at once, this can be rewritten as a generator expression:
(v for k,v in a.items() if k[2] == 'l1p')
Note that, if you're using Python 2, a.items() will need to be changed to a.iteritems(), or the change to a generator will have been for naught.
Alternatively, if you want to instead get a sub-dictionary that includes the matching keys, do:
{k: v for k,v in a.items() if k[2] == 'l1p'}
Note that this is not a memory-friendly option. The closest analogue using a generator would be to create a generator of (key, value) pairs rather than a proper dictionary:
((k,v) for k,v in a.items() if k[2] == 'l1p')
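On the EDIT's question about avoiding a full scan: a plain dict cannot look up part of a tuple key, so every approach above visits all keys. If the same kind of lookup is needed many times, one option is to scan once and build a secondary index from the third field to the matching keys. This is only a sketch; keys_by_label is a hypothetical name, the third sample entry is made up, and the index has to be rebuilt or updated whenever a changes.

```python
from collections import defaultdict

# Sample data in the question's shape; the third entry is added for illustration
a = {(1, '000200', 'l1p'): 53, (15, '230512', 'l3c'): 81, (2, '000300', 'l1p'): 60}

# One pass over the keys: group them by their third field
keys_by_label = defaultdict(list)
for key in a:
    keys_by_label[key[2]].append(key)

# Later lookups touch only the matching keys, not the whole dictionary
l1p_values = [a[key] for key in keys_by_label['l1p']]
print(l1p_values)
```

The index stores references to the existing key tuples, so the values themselves are not copied.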
