Time complexity for lookup in dictionary.values() lists vs sets

I have referred to this page, but it only shows the time complexity of Dict.keys():
https://wiki.python.org/moin/TimeComplexity
This cheat sheet shows the same: https://www.geeksforgeeks.org/complexity-cheat-sheet-for-python-operations/
Both cover the case where each value is a list, so they didn't help me; in my case, every value in the Dict is a single integer.
Q(1): Is membership testing in Dict.values() O(1) or O(n)?
Dict = {1: 10, 2: 20, 3: 30, 4: 40}
if 10 in Dict.values():
    print("YES")
Q(2): In Python, is it possible to get a key by supplying a value? (If the supplied value occurs multiple times in Dict.values(), I would like to get all corresponding keys.)
Dict = {1: 10, 2: 20, 3: 30, 4: 40}
value = 20
I want to find key=2 from this value. Is it possible in O(1)? Because in O(n) I have to check every key's value!
Q(1): I thought it was O(1).
Edit: I was wrong, it is O(n). Thanks to @Roy Cohen and @kaya3.
test code:
import timeit
import random

def timeis(func):
    # Decorator that prints a function's wall-clock run time
    def wrap(*args, **kwargs):
        start = timeit.default_timer()
        result = func(*args, **kwargs)
        end = timeit.default_timer()
        print(func.__name__, end - start)
        return result
    return wrap

@timeis
def dict_values_test(dic, value):
    return value in dic.values()

tiny_dic = {i: 10 * i for i in range(1000)}
value = random.randint(1, 1000)
dict_values_test(tiny_dic, value)

small_dic = {i: 10 * i for i in range(1000000)}
value = random.randint(1, 1000000)
dict_values_test(small_dic, value)

big_dic = {i: 10 * i for i in range(100000000)}
value = random.randint(1, 100000000)
dict_values_test(big_dic, value)
result:
dict_values_test 2.580000000002025e-05
dict_values_test 0.015847600000000073
dict_values_test 1.4836825999999999
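Since membership in Dict.values() is a linear scan, repeated lookups are usually better served by converting the values to a set once. This sketch (not part of the original answer) shows the common pattern; the O(n) build cost pays off after the first lookup:

```python
Dict = {1: 10, 2: 20, 3: 30, 4: 40}

# Build the set once: O(n). Each later membership test is O(1) on average.
values = set(Dict.values())

print(10 in values)  # True
print(55 in values)  # False
```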
Q(2):
code:
def find_key_by_value(dic, find_value):
    return [k for k, v in dic.items() if v == find_value]

dic = {1: 10, 2: 20, 3: 30, 4: 40, 5: 40}
print(find_key_by_value(dic, 40))
result:
[4, 5]
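If lookups by value happen often, another option (a common pattern, not from the original answer) is to build an inverted index once, after which each lookup is O(1) on average:

```python
from collections import defaultdict

def invert(dic):
    # Map each value to the list of keys holding it
    inv = defaultdict(list)
    for k, v in dic.items():
        inv[v].append(k)
    return inv

inv = invert({1: 10, 2: 20, 3: 30, 4: 40, 5: 40})
print(inv[40])  # [4, 5]
```

The index must be rebuilt (or updated alongside the dict) whenever the original dictionary changes.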
Related
I have a list of tuples which might have duplicates. I need to remove the duplicates (strictly speaking, every occurrence after the initial appearance of a given tuple). However, the gimmicks I found online using set-based list comprehensions don't work for me because I need to preserve the order:
>>> x = [(4,0),(4,0),(1,3) ...]
>>> x = [t for t in (set(tuple(i) for i in x))]
>>> print(x)
[(1,3), (4,0)] # bad, order was trashed!
I tried something like this:
>>> d = {}
>>> newlist = []
>>> for t in x:
...     if t not in d:
...         newlist.append(t)
...         d[t] = 1
# then we have newlist with one instance of every tuple in x.
I haven't got the syntax right but what I mean is, create a dictionary with a key for the first time a tuple in x is found. Is this possible? Is there a more pythonic way?
Aside from the dict.fromkeys method suggested by @andrej-kesely, I can think of three different solutions to this problem.
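For context, the dict.fromkeys approach mentioned above is a one-liner; it is order-preserving because dicts keep insertion order (guaranteed since Python 3.7):

```python
x = [(4, 0), (4, 0), (1, 3), (4, 0), (1, 3)]

# dict.fromkeys keeps the first occurrence of each tuple, in order
deduped = list(dict.fromkeys(x))
print(deduped)  # [(4, 0), (1, 3)]
```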
I know you did not ask for this, but I wrote a little test script for fun to see how the various correct approaches hold up in terms of execution time. It should work under Python 3.7+:
import sys
from random import randint
from timeit import timeit
from typing import List, Tuple

TuplesT = List[Tuple[int, ...]]
TUP_LEN = 2
MIN, MAX = 0, 9

def get_tuples(n: int) -> TuplesT:
    """Randomly generates a list of `n` tuples."""
    return [
        tuple(randint(MIN, MAX) for __ in range(TUP_LEN))
        for _ in range(n)
    ]

def new_list_contains(tuples: TuplesT) -> TuplesT:
    new = []
    for tup in tuples:
        if tup not in new:
            new.append(tup)
    return new

def helper_set_lookup(tuples: TuplesT) -> TuplesT:
    new = []
    already_seen = set()
    for tup in tuples:
        if tup not in already_seen:
            already_seen.add(tup)
            new.append(tup)
    return new

def dict_from_keys(tuples: TuplesT) -> TuplesT:
    return list(dict.fromkeys(tuples))

def in_place_with_set_lookup(tuples: TuplesT) -> TuplesT:
    to_delete = []
    already_seen = set()
    for i in range(len(tuples)):
        if tuples[i] in already_seen:
            to_delete.append(i)
        else:
            already_seen.add(tuples[i])
    for i in reversed(to_delete):
        del tuples[i]
    return tuples  # return it as well to make function comparisons easier

FUNCTIONS = (
    new_list_contains,  # reference implementation
    helper_set_lookup,
    dict_from_keys,
    in_place_with_set_lookup,
)

def test_correctness(tuples: TuplesT) -> None:
    correct_list = new_list_contains(tuples)
    for func in FUNCTIONS[1:]:
        assert correct_list == func(tuples.copy()), f"{func.__name__} incorrect!"

def main(n: int) -> None:
    test_correctness(test_tuples)
    for func in FUNCTIONS:
        name = func.__name__
        t = timeit(
            stmt=f"{name}(tuples_copy)",
            setup=f"from __main__ import test_tuples, {name};"
                  f"tuples_copy = test_tuples.copy()",
            number=n,
        )
        print(f"{name:<30} {round(t, 4)} seconds")

if __name__ == '__main__':
    num_tuples, num_repeats = sys.argv[1:3]
    test_tuples = get_tuples(int(num_tuples))
    main(int(num_repeats))
The script needs to be called with two arguments, the first being the number of test tuples to randomly generate in advance and the second being the number of repetitions to use for each function.
Here are some non-scientific results from my environment.
num_tuples, num_repeats = 100, 100000:
new_list_contains 3.2087 seconds
helper_set_lookup 0.6152 seconds
dict_from_keys 0.3408 seconds
in_place_with_set_lookup 0.4801 seconds
num_tuples, num_repeats = 1000, 10000:
new_list_contains 5.0503 seconds
helper_set_lookup 0.3682 seconds
dict_from_keys 0.3088 seconds
in_place_with_set_lookup 0.0792 seconds
num_tuples, num_repeats = 100000, 100:
new_list_contains 5.5827 seconds
helper_set_lookup 0.3122 seconds
dict_from_keys 0.3023 seconds
in_place_with_set_lookup 0.0114 seconds
The dict.fromkeys method suggested by @andrej-kesely appears to be pretty fast, but an in-place implementation utilizing a set as a cache seems to be significantly faster still. The difference becomes more pronounced as the number of tuples in the list grows.
Additionally, I would argue that the "in-place" approach is more memory efficient, because the cache set and index list likely use less memory than a whole additional dictionary and a whole additional list.
If your list of tuples isn't very large, I would still definitely recommend the approach via dictionary keys because "readability counts" and it is just hilariously more concise and elegant than any other method that comes to mind.
I need help creating a function that follows chains through a given dictionary: the value associated with a key may itself be another key in the dictionary. I need the function to keep looking up keys until it reaches a value that has no associated entry.
def follow_me(d, s):
    while d:
        if s in d:
            return d[s]
I can return the value in the dictionary that s maps to, but I have no idea how to keep iterating until I reach a value with no associated entry. I can get that 'badger' maps to 'doe', but how do I follow the chain from 'doe' to 'fox', then 'fox' to 'hen', etc.?
d = {'badger':'doe', 'doe':'fox', 'fox':'hen','hen':'flea',
'sparrow':'spider', 'zebra':'lion', 'lion':'zebra'}
print(follow_me(d, 'badger'))
print(follow_me(d, 'fox'))
print(follow_me(d, 'sparrow'))
print(follow_me(d, 'zebra'))
print(follow_me(d, 'aardvark'))
and this is what I currently have of the function that makes sense to me because everything else I've tried is just wrong.
def follow_me(d, s):
    while d:
        if s in d:
            return d[s]
and the output needs to be:
flea
flea
spider
aardvark
but my code right now is producing:
doe
hen
spider
lion
To extend on the other answers (which are still valid): with a very large dictionary, guarding every lookup with a membership test means each iteration does two hash lookups, the `in` check and the subsequent `dic[key]`, which adds overhead.
To get around this, one can use a try/except:
def follow_me(dic, key):
    while True:
        if key not in dic:
            return key
        key = dic[key]

def follow_me2(dic, key):
    try:
        while True:
            key = dic[key]
    except KeyError:
        return key

import time

d = {i: (i + 1) for i in range(10000000)}

start = time.time()
follow_me(d, 0)
print("Using 'in' takes", time.time() - start, "s")

start = time.time()
follow_me2(d, 0)
print("Using 'try' takes", time.time() - start, "s")
gives the output:
Using 'in' takes 2.476428747177124 s
Using 'try' takes 0.9100546836853027 s
I think this is what you are looking for, though your problem description is very unclear:
def follow_me(d, k):
    while k in d:
        k = d[k]
    return k
Note that the loop in this function will run forever if there is a cycle between keys and values in your dictionary. Your example has one between 'lion' and 'zebra', and it's not entirely clear how you intend such a cycle to be broken. If you want to expand each key only once, you could handle it by keeping track of the values you've seen so far in a set:
def follow_me(d, k):
    seen = set()
    while k in d and k not in seen:
        seen.add(k)
        k = d[k]
    return k
This will return whichever key in the cycle you reach first (so follow_me(d, 'zebra') with your example dictionary will return 'zebra' after going zebra => lion => zebra). If you want some other outcome, you'd need different logic and it might be tricky to do.
If you request a key that's not in the dictionary (like 'aardvark' in your example), the requested key will be returned immediately. You could add special handling for the first key you look up, but it would again make things more complicated.
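For illustration, here is a self-contained run of the cycle-safe version against the question's dictionary, showing the behavior described above:

```python
def follow_me(d, k):
    # Follow the chain of key -> value lookups, stopping on a
    # missing key or on the first key seen twice (a cycle).
    seen = set()
    while k in d and k not in seen:
        seen.add(k)
        k = d[k]
    return k

d = {'badger': 'doe', 'doe': 'fox', 'fox': 'hen', 'hen': 'flea',
     'sparrow': 'spider', 'zebra': 'lion', 'lion': 'zebra'}

for start in ('badger', 'fox', 'sparrow', 'zebra', 'aardvark'):
    print(follow_me(d, start))
# flea, flea, spider, zebra, aardvark
```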
Infinite loops (cycles between keys and values) also have to be handled. Your description isn't clear about what should happen in this case, so here a fixed message is returned.
def follow_me(d, key):
    visited_keys = []
    while key in d and key not in visited_keys:
        visited_keys.append(key)
        key = d[key]
    if key not in d:
        return key
    return "this hunt has no end"
Given a basic class Item:
class Item(object):
def __init__(self, val):
self.val = val
a list of objects of this class (the number of items can be much larger):
items = [ Item(0), Item(11), Item(25), Item(16), Item(31) ]
and a function compute that processes a value and returns a result.
How can one find two items of this list for which compute returns the same value when applied to the attribute val? If nothing is found, an exception should be raised. If more than two items match, simply return any two of them.
For example, let's define compute:
def compute(x):
    return x % 10
The expected pair would be: (Item(11), Item(31)).
You can check the length of the set of resulting values:
class Item(object):
    def __init__(self, val):
        self.val = val
    def __repr__(self):
        return f'Item({self.val})'

def compute(x):
    return x % 10

items = [Item(0), Item(11), Item(25), Item(16), Item(31)]
c = list(map(lambda x: compute(x.val), items))
if len(set(c)) == len(c):  # no two or more equal values exist in the list
    raise Exception("All elements have unique computational results")
To find values with similar computational results, a dictionary can be used:
from collections import Counter
new_d = {i:compute(i.val) for i in items}
d = Counter(new_d.values())
multiple = [a for a, b in new_d.items() if d[b] > 1]
Output:
[Item(11), Item(31)]
A slightly more efficient way to check whether multiple objects share a computational value is to scan the counts directly: the generator short-circuits on the first repeated count, requiring at most a single pass over the Counter object, whereas building a set and comparing lengths requires several full iterations:
if all(b == 1 for b in d.values()):
    raise Exception("All elements have unique computational results")
Assuming the values returned by compute are hashable (e.g., float values), you can use a dict to store results.
And you don't need to do anything fancy, like a multidict storing all items that produce a result. As soon as you see a duplicate, you're done. Besides being simpler, this also means we short-circuit the search as soon as we find a match, without even calling compute on the rest of the elements.
def find_pair(items, compute):
    results = {}
    for item in items:
        result = compute(item.val)
        if result in results:
            return results[result], item
        results[result] = item
    raise ValueError('No pair of items')
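A quick usage sketch with the question's data (the class and function are repeated here so the snippet runs standalone):

```python
class Item:
    def __init__(self, val):
        self.val = val
    def __repr__(self):
        return f'Item({self.val})'

def find_pair(items, compute):
    # Remember the first item producing each result; stop on the
    # first duplicate, without calling compute on remaining items.
    results = {}
    for item in items:
        result = compute(item.val)
        if result in results:
            return results[result], item
        results[result] = item
    raise ValueError('No pair of items')

items = [Item(0), Item(11), Item(25), Item(16), Item(31)]
a, b = find_pair(items, lambda x: x % 10)
print(a, b)  # Item(11) Item(31)
```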
A dictionary val_to_it that contains Items keyed by computed val can be used:
val_to_it = {}
for it in items:
    computed_val = compute(it.val)
    # Check if an Item in val_to_it has the same computed val
    dict_it = val_to_it.get(computed_val)
    if dict_it is None:
        # If not, add it to val_to_it so it can be referred to
        val_to_it[computed_val] = it
    else:
        # We found the two elements!
        res = [dict_it, it]
        break
else:
    raise Exception("Can't find two items")
The for block can be rewritten to handle any number n of elements:
for it in items:
    computed_val = compute(it.val)
    dict_lit = val_to_it.get(computed_val)
    if dict_lit is None:
        val_to_it[computed_val] = [it]
    else:
        dict_lit.append(it)
        # Check if we have the expected number of elements
        if len(dict_lit) == n:
            # Found n elements!
            res = dict_lit
            break
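For convenience, the n-element loop above can be wrapped into a function (the name find_n_items is hypothetical, and dict.setdefault replaces the get/None dance):

```python
def find_n_items(items, compute, n):
    # Group items by computed value; stop as soon as one group reaches n
    val_to_it = {}
    for it in items:
        group = val_to_it.setdefault(compute(it.val), [])
        group.append(it)
        if len(group) == n:
            return group
    raise Exception(f"Can't find {n} items")
```

With the question's data, find_n_items(items, compute, 2) returns the same pair as the two-element version.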
I am looking for a way to update/access a Python dictionary by addressing all keys that do NOT match the key given.
That is, instead of the usual dict[key], I want to do something like dict[!key]. I found a workaround, but figured there must be a better way which I cannot figure out at the moment.
# I have a dictionary of counts
dicti = {"male": 1, "female": 200, "other": 0}
# Problem: I encounter a record (cannot reproduce here) that
# requires me to add 1 to every key in dicti that is NOT "male",
# i.e. dicti["female"], and dicti["other"],
# and other keys I might add later
# Here is what I am doing and I don't like it
dicti.update({k: v + 1 for k,v in dicti.items() if k != "male"})
dicti.update({k: v + 1 for k,v in dicti.items() if k != "male"})
That creates a sub-dictionary (hashing, memory overhead) and then passes it back to the old dictionary: more hashing and reference copying.
Why not a good old loop on the keys (since the values aren't mutable):
for k in dicti:
    if k != "male":
        dicti[k] += 1
It may be faster if there are a lot of keys and only one key to avoid: add to all the keys, then cancel the operation on the one key you want to skip (saves a lot of string comparisons):
for k in dicti:
    dicti[k] += 1
dicti["male"] -= 1
If the values were mutable (e.g., lists), we would avoid one hashing operation and mutate the value instead:
for k, v in dicti.items():
    if k != "male":
        v.append("something")
One-liners are cool, but sometimes it's better to avoid them (for both performance and readability in this case).
If you have to perform this "add to others" operation more often, and if all the values are numeric, you could also subtract from the given key and add the same value to some global variable counting towards all the values (including that same key). For example, as a wrapper class:
import collections

class Wrapper:
    def __init__(self, **values):
        self.d = collections.Counter(values)
        self.n = 0

    def add(self, key, value):
        self.d[key] += value

    def add_others(self, key, value):
        self.d[key] -= value
        self.n += value

    def get(self, key):
        return self.d[key] + self.n

    def to_dict(self):
        if self.n != 0:  # recompute dict and reset global offset
            self.d = {k: v + self.n for k, v in self.d.items()}
            self.n = 0
        return self.d
Example:
>>> dicti = Wrapper(**{"male": 1, "female": 200, "other": 0})
>>> dicti.add("male", 2)
>>> dicti.add_others("male", 5)
>>> dicti.get("male")
3
>>> dicti.to_dict()
{'other': 5, 'female': 205, 'male': 3}
The advantage is that both the add and the add_others operations are O(1); only when you actually need the values do you apply the global offset. Of course, the to_dict operation is still O(n), but the updated dict can be saved and only recomputed when add_others has been called again in between.
Is there a function which can take in a dictionary and modify the dictionary by increasing only the values in it by 1?
i.e
f({'1':0.3, '11':2, '111':{'a':7, 't':2}})
becomes
{'1':1.3, '11':3, '111':{'a':8, 't':3}}
and
f({'a':{'b':{'c':5}}})
becomes
{'a':{'b':{'c':6}}}
Thanks!
Not the best...
def incr(d):
    try:
        return d + 1
    except TypeError:  # not a number, so recurse into the nested dict
        return g_incr(d)

def g_incr(d):
    return {k: incr(v) for k, v in d.items()}

test = {'1': 0.3, '11': 2, '111': {'a': 7, 't': 2}}
print(g_incr(test))
I think you should try this:
def increment(d):
    # note: this only handles flat dictionaries, not nested ones
    return {k: v + 1 for k, v in d.items()}

result = increment({'1': 0.3, '11': 2})
print(result)