Python: checking if an item is in a dictionary. Key OR value

I'm trying to check if a dictionary has a certain value in its keys as well as its values with just one command instead of having to OR two searches. I.e.
'b' in d.keys()
'b' in d.keys() or 'b' in d.values()
Searching the internet with these terms has returned nothing but instructions on how to search just keys or just values.
d = {'a':'b'}
d.items() ## this is the closest thing I could find to all the items
'b' in d.items() ## but this returns false

There's no single function to do this. You'll have to either do this (leaving out keys() is faster, because 'b' in d is a direct hash lookup):
'b' in d or 'b' in d.values()
or some kind of loop over items:
def in_items(d):
    for i in d.items():
        if 'b' in i:
            return True
    return False
or:
any(('b' in i) for i in d.items())
PS. It also points at a bad design. Dictionaries are cool for key lookups, because they're fast at that. If you check both keys and values, you're just looking through all the stored items anyway. (and it shows you're not even sure which side you're looking at) I'd suggest checking if maybe some combination of sets and dicts is better suited for what you want to do.
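A minimal sketch of both ideas, assuming Python 3 and hashable values. The names in_keys_or_values and all_items_set are hypothetical, introduced only for illustration:

```python
def in_keys_or_values(d, item):
    """Return True if item is a key or a value of d."""
    # The key test is a hash lookup; the value test is a linear scan.
    return item in d or item in d.values()


d = {'a': 'b'}
print(in_keys_or_values(d, 'b'))  # True, found among the values
print(in_keys_or_values(d, 'z'))  # False

# If the same lookup runs many times, a precomputed set of every key and
# value gives constant-time membership tests. This assumes the values are
# hashable and that the set is rebuilt whenever d changes.
all_items_set = set(d) | set(d.values())
print('a' in all_items_set)  # True
```

The set only pays off when lookups are frequent and the dictionary changes rarely; otherwise the simple or-expression is probably enough.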

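The problem with d.items() is that it returns the key/value pairs as tuples, so 'b' is not in [('a', 'b')]; only the whole tuple ('a', 'b') is. Concatenate the keys and values instead (in Python 3, keys() and values() return views, so convert them to lists first):
all_items = list(d.keys()) + list(d.values())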
'b' in all_items
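If building a combined list is a concern for a large dictionary, itertools.chain can walk the keys and then the values lazily. This is only a sketch, and it is still a linear scan in the worst case:

```python
from itertools import chain

d = {'a': 'b'}

# chain yields the keys first, then the values, without building a new list;
# the `in` test stops at the first match.
found = 'b' in chain(d, d.values())
print(found)  # True
```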

Related

Remove items from dictionary if the length of the item is 1 or less

Is there a way to remove a key from a dictionary using its index position (if it has one) instead of using the actual key (to avoid e.g. del d['key'], but use index position instead)?
If there is then don't bother reading the rest of this question as that's what I'm looking for too.
So, as an example for my case, I have the dictionary d which uses lists for the values:
d = {'acd': ['cad'], 'abd': ['bad', 'dab']}
I want to completely remove every key whose value is a list with fewer than 2 items (so if there's only 1 item).
So, in this example, I would want to remove the key 'acd' because its value's list only has 1 item, ['cad']. 'abd' has 2 items, ['bad', 'dab'], so I don't want to delete it; only keys whose lists contain 1 item or fewer. This dictionary is just an example: I am working with a much bigger version than this, and I need to remove all of the keys with single-item values.
I wrote this for testing, but I'm not sure how to go about removing the keys I want, or determining what they are.
d = {'acd': ['cad'], 'abd': ['bad', 'dab']}
index_pos = 0
for i in d.values():
    # Testing and seeing stuff
    print("pos:", index_pos)
    print(i)
    print(len(i))
    if len(i) < 2:
        del d[???]
        # What do I do?
    index_pos += 1
I used index_pos because I thought it might be useful but I'm not sure.
I know I can delete an entry from the dictionary using
del d['key']
But how do I avoid using the key and e.g. use the index position instead, or how do I find out what the key is, so I can delete it?
Just use a dictionary comprehension:
d = {'acd': ['cad'], 'abd': ['bad', 'dab']}
res = {k: v for k, v in d.items() if len(v) >= 2}
Yes, you are creating a new dictionary, but this in itself is not usually a problem. Any solution will take O(n) time.
You can iterate a copy of your dictionary while modifying your original one. However, you should find the dictionary comprehension more efficient. Don't, under any circumstances, remove or add keys while you iterate your original dictionary.
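To see why that warning matters, here is a small sketch (assuming CPython 3) of what happens when keys are deleted while iterating the original dictionary, using the example data from the question. The variable error_message is introduced only for illustration:

```python
d = {'acd': ['cad'], 'abd': ['bad', 'dab']}
error_message = None

try:
    for key in d:
        if len(d[key]) < 2:
            del d[key]  # mutates d while it is being iterated
except RuntimeError as exc:
    # CPython notices the size change on the next iteration step.
    error_message = str(exc)

print(error_message)
```

The loop aborts partway through, so the dictionary can be left only partially cleaned up, which is why iterating over a copy or building a new dictionary is the safer route.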
If you don't want to create a new dict, here's how you could change your code. Here we iterate over a copy of the dict and delete a key from the original dict if the length of its value is less than 2.
d = {'acd': ['cad'], 'abd': ['bad', 'dab']}
for key in d.copy():
    if len(d[key]) < 2:
        del d[key]
print(d)
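On the index-position part of the question: dictionaries have no positional delete, but in Python 3.7 and later they preserve insertion order, so a position can be mapped back to a key first. A sketch under that assumption, with the hypothetical helper name delete_at:

```python
def delete_at(d, index):
    """Delete the entry at the given insertion-order position (Python 3.7+)."""
    key = list(d)[index]  # list(d) holds the keys in insertion order
    del d[key]
    return key


d = {'acd': ['cad'], 'abd': ['bad', 'dab']}
removed = delete_at(d, 0)
print(removed)  # acd
print(d)        # {'abd': ['bad', 'dab']}
```

Building list(d) costs O(n) on every call, so this is no cheaper than finding the key directly; it mainly shows that the position is just an indirect way of naming the key.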

Parse a list, check if it has elements from another list and print out these elements

I have a list populated from entries of a log; for sake of simplicity, something like
listlog = ["entry1:abcde", "entry2:abbds", "entry1:eorieo", "entry3:orieqor", "entry2:iroewiow"......]
This list can have an arbitrary number of entries, which may or may not be in sequence, since I run multiple operations asynchronously.
Then I have another list, which I use as reference to get only the list of entries; which may be like
list_template = ["entry1", "entry2", "entry3"]
I am trying to use the second list, to get sequences of entries, so I can isolate the single sequence, taking only the first instance found of each entry.
Since I am not dealing with numbers, I can't use set, so I tried a loop inside a loop, comparing the values in each list.
This does not work, because another entry may occur before the one I am looking for (say I want entry1, entry2, entry3, and the loop finds entry1 but then finds entry3; since I compare every element of each list, it will be happy to find any matching element).
for item in listlog:
    entry, value = item.split(":")
    for reference_entry in list_template:
        if entry == reference_entry:
            print(item)
            break
I have to, in a nutshell, find a sequence as in the template list, while these items are not necessarily in order. I am trying to parse the list once, otherwise I could do a very expensive multi-pass for each element of the template list, until I find the first occurrence and bail out. I thought that doing the loop in the loop is more efficient, since my reference list is always smaller than the log list, which is usually few elements.
How would you approach this problem, in the most efficient and pythonic way? All that I can think of, is multiple passes on the log list
You can use a dict:
>>> listlog
['entry1:abcde', 'entry2:abbds', 'entry1:eorieo', 'entry3:orieqor', 'entry2:iroewiow']
>>> list_template
['entry1', 'entry2', 'entry3']
>>> my_dict = {}
>>> for x in listlog:
...     key, value = x.split(":")
...     if key not in my_dict and key in list_template:
...         my_dict[key] = value
...
>>> my_dict
{'entry2': 'abbds', 'entry3': 'orieqor', 'entry1': 'abcde'}
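The same single-pass idea can be wrapped in a function. This is a sketch rather than a tuned implementation; the name first_entries is hypothetical. It uses a set for the template (sets work fine with strings, not only numbers) and stops as soon as every template entry has been seen:

```python
def first_entries(listlog, list_template):
    """Return the first value seen for each template entry, in one pass."""
    wanted = set(list_template)  # set membership is O(1) per lookup
    found = {}
    for item in listlog:
        key, _, value = item.partition(":")
        if key in wanted and key not in found:
            found[key] = value
            if len(found) == len(wanted):
                break  # every template entry has been seen
    return found


listlog = ["entry1:abcde", "entry2:abbds", "entry1:eorieo",
           "entry3:orieqor", "entry2:iroewiow"]
list_template = ["entry1", "entry2", "entry3"]
result = first_entries(listlog, list_template)
print(result)
```

With the sample data, the loop stops after the fourth item, since all three entries have been found by then.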
Disclaimer : This answer could use someone's insight on performance. Sure, list/dict comprehensions and zip are pythonic but the following may very well be a poor use of those tools.
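You could use a list comprehension (zipping ref against the filtered values would pair them by position, which mislabels them when ref is ordered differently from the data):
>>> data = ["a:12", "b:32", "c:54"]
>>> ref = ['c', 'b']
>>> matches = [(key, val) for key, val in [item.split(':') for item in data] if key in ref]
>>> for k, v in matches:
...     print("{}:{}".format(k, v))
...
b:32
c:54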
Here's another (worse? I'm not sure, performance-wise) way to get around this :
>>> data = ["a:12", "b:32", "c:54"]
>>> ref = ['c', 'b']
>>> data_dict = {x: y for x, y in [item.split(':') for item in data]}
>>> ["{}:{}".format(key, val) for key, val in data_dict.items() if key in ref]
['b:32', 'c:54']
Explanation :
Convert your initial list into a dict using a dict comprehension.
For each (key, val) pair found in the dict, join both into a string if the key is found in the 'ref' list.
You can use a list comprehension, something like this:
import re
listlog = ["entry1:abcde", "entry2:abbds", "entry1:eorieo", "entry3:orieqor", "entry2:iroewiow"]
print([item for item in listlog if re.search('entry', item)])
# ['entry1:abcde', 'entry2:abbds', 'entry1:eorieo', 'entry3:orieqor', 'entry2:iroewiow']
Then you can split them as you wish and create a dictionary if you want:
import re
listlog = ["entry1:abcde", "entry2:abbds", "entry1:eorieo", "entry3:orieqor", "entry2:iroewiow"]
mylist = [item for item in listlog if re.search('entry', item)]
def create_dict(string, dict_splitter=':'):
    # Build a one-entry dict from a "key:value" string
    _dict = {}
    temp = string.split(dict_splitter)
    key = temp[0]
    value = temp[1]
    _dict[key] = value
    return _dict
mydictionary = {}
for x in mylist:
    x = str(x)
    mydictionary.update(create_dict(x))
for k, v in mydictionary.items():
    print(k, v)
# entry1 eorieo
# entry2 iroewiow
# entry3 orieqor
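As you can see, this method needs refinement, because update() overwrites the value already stored for a key, so you end up with the last occurrence of each entry rather than the first. That's bad. It would be better to keep the value already stored for the same key, and that is much easier than you might think.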
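One way to keep the first value seen for each key instead of the last is dict.setdefault, which only stores a value when the key is not present yet. A sketch using the same listlog:

```python
listlog = ["entry1:abcde", "entry2:abbds", "entry1:eorieo",
           "entry3:orieqor", "entry2:iroewiow"]

mydictionary = {}
for item in listlog:
    key, _, value = item.partition(":")
    # setdefault leaves an existing entry untouched, so the first
    # occurrence of each key wins.
    mydictionary.setdefault(key, value)

for k, v in mydictionary.items():
    print(k, v)
```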

Iterating over first half a dictionary in python

How to iterate over first half of dictionary in python
This iterates over all values in the dictionary
for key, value in checkbox_dict.iteritems():
    print key,value
But I want to iterate over the first half of the dictionary only.
One way is to do it like this:
for key, value in dict(checkbox_dict.items()[:11]).iteritems():
    print key,value
Is there a better way?
If you mean half of the items, here's a way:
for key, value in list(checkbox_dict.items())[:len(checkbox_dict) // 2]:
    pass
But be aware: the elements in a normal dictionary don't necessarily keep the iteration order in which they were inserted, unless you use an OrderedDict (or Python 3.7 and later, where plain dicts preserve insertion order).
You can use itertools.islice to slice the dict.iteritems iterator; unlike slicing dict.items(), this won't create an intermediate list in memory.
>>> from itertools import islice
>>> d = dict.fromkeys('abcdefgh')
>>> for k, v in islice(d.iteritems(), len(d)/2):
...     print k, v
...
a None
c None
b None
e None
Note that normal dictionaries are unordered, so the items are returned in arbitrary order.
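For Python 3, where iteritems() no longer exists and / returns a float, the islice approach might look like the sketch below. It assumes Python 3.7 or later, where items come back in insertion order:

```python
from itertools import islice

d = dict.fromkeys('abcdefgh')

# d.items() is a lazy view in Python 3, so no intermediate list is built.
# Floor division keeps the stop value an int, which islice requires.
first_half = list(islice(d.items(), len(d) // 2))
for k, v in first_half:
    print(k, v)
```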

Dictionary with lists as values - find longest list

I have a dictionary where the values are lists. I need to find which key has the longest list as its value, after removing the duplicates. If I just find the longest list, this won't work, as there may be a lot of duplicates. I have tried several things, but nothing is remotely close to being correct.
d = # your dictionary of lists
max_key = max(d, key= lambda x: len(set(d[x])))
# here's the short version. I'll explain....
max(                      # the function that grabs the biggest value
    d,                    # this is the dictionary, it iterates through and grabs each key...
    key=                  # this overrides the default behavior of max
        lambda x:         # defines a lambda to handle new behavior for max
            len(          # the length of...
                set(      # the set containing (sets have no duplicates)
                    d[x]  # the list defined by key `x`
                )
            )
)
Since max iterates through the dictionary's keys (that's what iterating over a dictionary yields, by the by: for x in d: print(x) will print each key in d), it will return the key that gives the highest result when it applies the function we passed as key= (that's what the lambda is). You could do literally anything here; that's the beauty of it. However, if you wanted the key and the value, you might be able to do something like this:
d = # your dictionary
max_key, max_value = max(d.items(), key = lambda k,v: len(set(v)))
# THIS DOESN'T WORK, SEE MY NOTE AT BOTTOM
This differs because instead of passing d, which is a dictionary, we pass d.items(), which is a list of tuples built from d's keys and values. As example:
d = {"foo":"bar", "spam":['green','eggs','and','ham']}
print(d.items())
# [ ("foo", "bar"),
# ("spam", ["green","eggs","and","ham"])]
We're not looking at a dictionary anymore, but all the data is still there! It makes it easier to deal with using the unpack statement I used: max_key, max_value =. This works the same way as if you did WIDTH, HEIGHT = 1024, 768. max still works as usual, it iterates through the new list we built with d.items() and passes those values to its key function (the lambda k,v: len(set(v))). You'll also notice we don't have to do len(set(d[k])) but instead are operating directly on v, that's because d.items() has already created the d[k] value, and using lambda k,v is using that same unpack statement to assign the key to k and the value to v.
Magic! Magic that doesn't work, apparently. I didn't dig deep enough here, and lambdas cannot, in fact, unpack values on their own. Instead, do:
max_key, max_value = max(d.items(), key = lambda x: len(set(x[1])))
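Here is a runnable sketch of both forms. The sample dictionary is made up for illustration: 'x' has the longest raw list, but 'y' has the most unique items.

```python
# Hypothetical sample data
d = {'x': [1, 1, 1, 1], 'y': [1, 2, 3], 'z': [5, 5]}

# Key only: max iterates over the keys and ranks them by unique-item count.
max_key = max(d, key=lambda k: len(set(d[k])))

# Key and value: each item is a (key, value) tuple, so index it with [1].
max_key2, max_value = max(d.items(), key=lambda kv: len(set(kv[1])))

print(max_key)    # y
print(max_value)  # [1, 2, 3]
```

If several keys tie, max returns the first one it encounters.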
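For less advanced users, this can be a solution (set removes the duplicates before counting, and the result holds every key that ties for the longest list):
longest = max(len(set(item)) for item in your_dict.values())
result = [key for key, item in your_dict.items() if len(set(item)) == longest]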

Get elements of a tuple-indexed dictionary specifying only one field of the tuple

I've got a big (20k+) set of data in a form of a dictionary indexed by tuple, e.g.
a = {(1,'000200','l1p'): 53, (15,'230512','l3c'): 81, ...}
I would like to filter that dictionary providing only one field of that tuple, e.g.
a[(_,_,'l1p')], or a[(:,:,'l1p')]
Is there any better way than creating a list, like
[i for i in a.keys() if 'l1p' in i]
As I said, I'm trying to avoid copying elements as there are many entries in the dictionary.
EDIT: Is there any other way of obtaining the elements with 'l1p' in the key-tuple than iterating over the whole dictionary? I would like to perform a recursive least-square fitting on resultant sub-list.
First of all, what you have is a dictionary, not a list (and definitely not a tuple). Lists and tuples are just sequences of values numbered 0, 1, 2, ..., etc., while a dictionary is an unordered set of values each labelled & accessed with a unique key (in this case, the tuples).
With that out of the way, to get all of the values of a where the third element of the key is 'l1p', you can just do:
[v for k,v in a.items() if k[2] == 'l1p']
If you're concerned about saving memory and won't be trying to evaluate the entire result at once, this can be rewritten as a generator expression:
(v for k,v in a.items() if k[2] == 'l1p')
Note that, if you're using Python 2, a.items() will need to be changed to a.iteritems(), or the change to a generator will have been for naught.
Alternatively, if you want to instead get a sub-dictionary that includes the matching keys, do:
{k: v for k,v in a.items() if k[2] == 'l1p'}
Note that this is not a memory-friendly option. The closest analogue using a generator would be to create a generator of (key, value) pairs rather than a proper dictionary:
((k,v) for k,v in a.items() if k[2] == 'l1p')
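On the EDIT in the question: a plain dictionary cannot look up by part of a key, so some full pass is unavoidable. If the same kind of lookup is repeated, though, that pass can be done once to build a secondary index. A sketch, where by_third is a hypothetical name and the third sample entry is made up for illustration:

```python
from collections import defaultdict

a = {(1, '000200', 'l1p'): 53,
     (15, '230512', 'l3c'): 81,
     (2, '000300', 'l1p'): 47}  # third entry is made up for illustration

# One pass to build the index: third tuple field -> list of full keys.
by_third = defaultdict(list)
for key in a:
    by_third[key[2]].append(key)

# Later lookups touch only the matching keys, not the whole dictionary.
l1p_values = [a[key] for key in by_third.get('l1p', [])]
print(l1p_values)  # [53, 47]
```

The index stores only references to the existing key tuples, not copies of the values, but it has to be updated whenever entries are added to or removed from a.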
