I have a dictionary which consists of {str: list}.
What I want to do is find the keys whose value contains a specific sequence.
For example, the content of the dictionary is like this:
DOC3187 [1, 2, 3, 6, 7]
DOC4552 [5, 2, 3, 6]
DOC4974 [1, 2, 3, 6]
DOC8365 [1, 2, 3, 5, 6, 7]
DOC3738 [1, 4, 2, 3, 6]
DOC5311 [1, 5, 2, 3, 6, 7]
I need to find the keys whose list contains the sequence [5, 2, 3], so the desired result is:
DOC4552, DOC5311
I'm using Python 3.3.2, and the dictionary has about 400 items.
For any sequence 'seq' and a longer sequence 'myseq' from your dictionary, the expression:
any(myseq[a:a+len(seq)] == seq for a in range(len(myseq)))
evaluates to True if seq occurs in myseq as a contiguous run, and False otherwise.
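Applied to the dictionary from the question, the check above might be wrapped as follows. This is only a minimal sketch: the names contains_run and docs are illustrative and do not come from the original posts, and the range is trimmed to the start positions where a full window can fit.

```python
def contains_run(myseq, seq):
    """Return True if seq occurs in myseq as a contiguous run."""
    n = len(seq)
    # Only start positions that leave room for a full window are checked.
    return any(myseq[a:a + n] == seq for a in range(len(myseq) - n + 1))

# Hypothetical name for the dictionary shown in the question.
docs = {
    'DOC3187': [1, 2, 3, 6, 7],
    'DOC4552': [5, 2, 3, 6],
    'DOC4974': [1, 2, 3, 6],
    'DOC8365': [1, 2, 3, 5, 6, 7],
    'DOC3738': [1, 4, 2, 3, 6],
    'DOC5311': [1, 5, 2, 3, 6, 7],
}

matches = [key for key, values in docs.items() if contains_run(values, [5, 2, 3])]
print(sorted(matches))  # ['DOC4552', 'DOC5311']
```

Because whole list slices are compared, a value such as 15 or 36 cannot partially match 5 or 3.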
NOTE: I realized that this gives a false positive if a list contains, for example, [15, 2, 36], whose string form does contain the string '5, 2, 3', so it is only safe for special cases.
Since you have a dictionary, maybe list comprehension on the keys and string matching? It is actually the same speed as walking through the elements, according to timeit...
s_list = [5,2,3] # sequence to search for
# Setting up your dictionary
MyD = {'DOC3187' : [1, 2, 3, 6, 7],
'DOC4552' : [5, 2, 3, 6],
'DOC4974' : [1, 2, 3, 6],
'DOC8365' : [1, 2, 3, 5, 6, 7],
'DOC3738' : [1, 4, 2, 3, 6],
'DOC5311' : [1, 5, 2, 3, 6, 7]}
query = str(s_list)[1:-1] # make a string of '5, 2, 3'
Matches = [ k for k in MyD if query in str(MyD[k]) ]
Result:
['DOC5311', 'DOC4552']
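The false positive mentioned in the note can be avoided while still using string matching, by making sure every number is bounded by a delimiter on both sides. The following is a hedged sketch rather than part of the original answer; the helper name as_delimited and the variable my_d are made up for illustration, and a non-empty search list is assumed.

```python
def as_delimited(values):
    """Render [5, 2, 3] as ',5,2,3,' so every number is bounded by commas."""
    return "," + ",".join(map(str, values)) + ","

# Same data as in the answer, plus one entry that fools the plain string match.
my_d = {
    'DOC4552': [5, 2, 3, 6],
    'DOC5311': [1, 5, 2, 3, 6, 7],
    'DOC4974': [1, 2, 3, 6],
    'TRICKY': [15, 2, 36],  # hypothetical entry, not in the original question
}

query = as_delimited([5, 2, 3])
matches = [k for k, v in my_d.items() if query in as_delimited(v)]
print(sorted(matches))  # ['DOC4552', 'DOC5311']
```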
You can use this function:
from functools import reduce  # reduce is not a builtin in Python 3

def find_key_in_dict(d, t):
    """ d is dict for searching, t is target list.
    -> return matching key list.
    """
    # The '' initial value makes reduce return a string even for one-element lists.
    b_str = reduce(lambda x, y: str(x) + str(y), t, '')
    return [k for k, v in d.items() if b_str in reduce(lambda x, y: str(x) + str(y), v, '')]
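To search the values, use reduce() to convert both the dict value (a list of integers) and the target list (also a list of integers) to strings, then use 'in' to check whether the target string occurs in the value string. Because the numbers are concatenated without separators, this can also match across number boundaries (for example, [52, 3] matches a target of [5, 2, 3]).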
How to sort values of A based on the order of occurrence in B where values in A may be repetitive and values in B are unique
A=[1, 2, 2, 2, 3, 4, 4, 5]
B=[8, 5, 6, 2, 10, 3, 1, 9, 4]
The expected list is C which should contain
C = [5, 2, 2, 2, 3, 1, 4, 4]
Solution:
Try using sorted:
C = sorted(A, key=B.index)
And now:
print(C)
Output:
[5, 2, 2, 2, 3, 1, 4, 4]
Documentation reference:
As mentioned in the documentation of sorted:
Return a new sorted list from the items in iterable.
Has two optional arguments which must be specified as keyword
arguments.
key specifies a function of one argument that is used to extract a
comparison key from each element in iterable (for example,
key=str.lower). The default value is None (compare the elements
directly).
reverse is a boolean value. If set to True, then the list elements are
sorted as if each comparison were reversed.
You can use the key argument of the sorted function:
A=[1, 2, 2, 2, 3, 4, 4, 5]
B=[8, 5, 6, 2, 10, 3, 1, 9, 4]
C = ((i, B.index(i)) for i in A) # <generator object <genexpr> at 0x000001CE8FFBE0A0>
output = [i[0] for i in sorted(C, key=lambda x: x[1])] #[5, 2, 2, 2, 3, 1, 4, 4]
You can sort it without actually using a sort. The Counter class (from collections) is a special dictionary that maintains counts for a set of keys. In this case, your B list contains all the keys that are possible, so you can use it to initialize a Counter object with zero occurrences of each key (this preserves the order) and then add the A list to that. Finally, get the repeated elements out of the resulting Counter object.
from collections import Counter
A=[1, 2, 2, 2, 3, 4, 4, 5]
B=[8, 5, 6, 2, 10, 3, 1, 9, 4]
C = Counter(dict.fromkeys(B,0)) # initialize order
C.update(A) # 'sort' A
C = list(C.elements()) # get sorted elements
print(C)
[5, 2, 2, 2, 3, 1, 4, 4]
You could also write it in a single line:
C = list((Counter(dict.fromkeys(B,0))+Counter(A)).elements())
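While using sorted(A, key=B.index) is simpler to write, this solution has lower complexity, O(K + N), than a sort keyed on an index lookup, which is O(N x K + N x logN) with N = len(A) and K = len(B): sorted calls the key function once per element, and each B.index call scans up to K items.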
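If the simpler sorted call is preferred, the repeated B.index scans can be avoided by precomputing each value's position in a dictionary. A minimal sketch, assuming every value of A appears in B (otherwise a KeyError is raised); the name position is illustrative:

```python
A = [1, 2, 2, 2, 3, 4, 4, 5]
B = [8, 5, 6, 2, 10, 3, 1, 9, 4]

# Map each value of B to its index once, so each key lookup is O(1) on average.
position = {value: index for index, value in enumerate(B)}

C = sorted(A, key=position.__getitem__)
print(C)  # [5, 2, 2, 2, 3, 1, 4, 4]
```

This keeps the one-line sorted idiom while building the lookup table costs a single pass over B.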
I have a list of integers. Each number can appear several times, the list is unordered.
I want to get the list of relative sizes. Meaning, if for example the original list is [2, 5, 7, 7, 3, 10] then the desired output is [0, 2, 3, 3, 1, 4]
Because 2 is the zero'th smallest number in the original list, 3 is one'th, etc.
Any clear easy way to do this?
Try a list comprehension with a dictionary that maps each unique value to its rank, using set to get the unique values, like below:
>>> lst = [2, 5, 7, 7, 3, 10]
>>> newl = dict(zip(sorted(set(lst)), range(len(set(lst)))))
>>> [newl[i] for i in lst]
[0, 2, 3, 3, 1, 4]
>>>
Or use index:
>>> lst = [2, 5, 7, 7, 3, 10]
>>> newl = sorted(set(lst))
>>> [newl.index(i) for i in lst]
[0, 2, 3, 3, 1, 4]
>>>
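Since the list of unique values is already sorted, the linear scan done by index can be swapped for a binary search. This is a sketch using the standard bisect module, not something from the original answer; the function name relative_ranks is made up for illustration.

```python
from bisect import bisect_left

def relative_ranks(values):
    """Return, for each value, its rank among the distinct values."""
    ordered = sorted(set(values))
    # Every value is present in ordered, so bisect_left returns its exact index.
    return [bisect_left(ordered, v) for v in values]

print(relative_ranks([2, 5, 7, 7, 3, 10]))  # [0, 2, 3, 3, 1, 4]
```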
I have 3 lists.
A_set = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Q_act = [2, 3]
dur = [0, 4, 5, 2, 1, 3, 4, 8, 2, 3]
All lists are integers.
What I am trying to do is to compare Q_act with A_set then obtain the indices of the numbers that match from A_set.
(Example:
Q_act has the elements [2,3]
it is located in indices [1,2] from A_set)
Afterwards, I will use those indices to obtain the corresponding value in dur and store this in a list called p_dur_Q_act.
(Example: using the result from the previous example, [1,2]
The values in the dur list corresponding to the indices [1,2] should be stored in another list called p_dur_Q_act
i.e. [4,5] should be the values stored in the list p_dur_Q_act)
So, how do I get the index of the common integer element (which is [1,2]) from two separate lists and plug it to another list?
So far here are the code(s) I used:
This one, I wrote because it returns the index. But not [4,5].
p_Q = set(Q_act).intersection(A_set)
p_dur_Q_act = [i + 1 for i, x in enumerate(p_Q)]
print(p_dur_Q_act)
I also tried this but I receive an error TypeError: argument of type 'int' is not iterable
p_dur_Q_act = [i + 1 for i, x in enumerate(Q_act) if any(elem in x for elem in A_set)]
print(p_dur_Q_act)
Another option is to use the enumerate iterator to generate every index, and then select only the ones you want:
a_set = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
q_act = [2, 3]
dur = [0, 4, 5, 2, 1, 3, 4, 8, 2, 3]
p_dur_q_act = [i for i,v in enumerate(a_set) if v in q_act]
print([dur[p] for p in p_dur_q_act]) # [4, 5]
This is more efficient than repeatedly calling index when the number of matches is large: the number of index calls is proportional to the number of matches, and each call takes time proportional to the length of a_set. The enumerate approach can be made even more efficient by turning q_act into a set, since the in operator scales better with sets than with lists. At these scales, though, there will be no observable difference.
You don't need to map these to index values, though. You can get the same result if you use zip to map a_set to dur and then select the d values whose a values are in q_act.
a_set = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
q_act = {2, 3}
dur = [0, 4, 5, 2, 1, 3, 4, 8, 2, 3]
p_dur_q_act = [d for a, d in zip(a_set, dur) if a in q_act]
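A related variation is to build a lookup table from a_set to dur once and then query it for each element of q_act. This is a sketch that assumes the values in a_set are unique; the name dur_by_activity is illustrative and not from the question.

```python
a_set = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
q_act = [2, 3]
dur = [0, 4, 5, 2, 1, 3, 4, 8, 2, 3]

# Pair each activity with its duration so lookups do not need an index at all.
dur_by_activity = dict(zip(a_set, dur))

# Values of q_act that are missing from a_set are skipped.
p_dur_q_act = [dur_by_activity[q] for q in q_act if q in dur_by_activity]
print(p_dur_q_act)  # [4, 5]
```

Unlike the zip filter, the result here follows the order of q_act rather than the order of a_set.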
Use the list's index method to get the index of each element in the list.
>>> a_set = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> q_act = [2, 3]
>>> dur = [0, 4, 5, 2, 1, 3, 4, 8, 2, 3]
>>>
>>> print([dur[a_set.index(q)] for q in set(a_set).intersection(q_act)])
[4, 5]
I am trying to sort two lists together:
list1 = [1, 2, 5, 4, 4, 3, 6]
list2 = [3, 2, 1, 2, 1, 7, 8]
list1, list2 = (list(x) for x in zip(*sorted(zip(list1, list2))))
However, doing this gives me the output
list1 = [1, 2, 3, 4, 4, 5, 6]
list2 = [3, 2, 7, 1, 2, 1, 8]
while I would want to keep the initial order for equal number 4 in the first list: what I want is
list1 = [1, 2, 3, 4, 4, 5, 6]
list2 = [3, 2, 7, 2, 1, 1, 8]
What do I have to do? I wouldn't want to use loop for bubble-sorting. Any help appreciated.
Use a key parameter for your sort that only compares the first element of the pair. Since Python's sort is stable, this guarantees that the order of the second elements will remain the same when the first elements are equal.
>>> from operator import itemgetter
>>> [list(x) for x in zip(*sorted(zip(list1, list2), key=itemgetter(0)))]
[[1, 2, 3, 4, 4, 5, 6], [3, 2, 7, 2, 1, 1, 8]]
Which is equivalent to:
>>> [list(x) for x in zip(*sorted(zip(list1, list2), key=lambda pair: pair[0]))]
[[1, 2, 3, 4, 4, 5, 6], [3, 2, 7, 2, 1, 1, 8]]
The trick here is that when Python does tuple comparisons, it compares the elements in order from left to right (for example, (4, 1) < (4, 2), which is the reason that you don't get the ordering you want in your particular case). That means you need to pass in a key argument to the sorted function that tells it to only use the first element of the pair tuple as its sort expression, rather than the entire tuple.
This is guaranteed to retain the ordering you want because:
sorts are guaranteed to be stable. That means that when multiple records have the same key, their original order is preserved.
(source)
>>> list1 = [1, 2, 5, 4, 4, 3, 6]
>>> list2 = [3, 2, 1, 2, 1, 7, 8]
>>>
>>> list1, list2 = (list(x) for x in zip(*sorted(zip(list1, list2), key=lambda pair: pair[0])))
>>>
>>> print(list1)
[1, 2, 3, 4, 4, 5, 6]
>>> print(list2)
[3, 2, 7, 2, 1, 1, 8]
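The difference between whole-tuple comparison and a first-element key can be seen on a small input. The pairs below are made up for illustration only:

```python
pairs = [(4, 2), (4, 1), (3, 7)]

# Without a key, ties on the first element are broken by the second element.
default_order = sorted(pairs)
print(default_order)  # [(3, 7), (4, 1), (4, 2)]

# With a key on the first element only, the stable sort keeps (4, 2) before (4, 1).
keyed_order = sorted(pairs, key=lambda pair: pair[0])
print(keyed_order)  # [(3, 7), (4, 2), (4, 1)]
```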
In your code, the sorting is based on both the first and the second elements of the tuples, so for equal elements of the first list the corresponding elements of the second list end up in sorted order.
To avoid sorting based on the second list, just specify that only the elements from the first list should be used in the comparison of the tuples:
>>> from operator import itemgetter
>>> list1, list2 = (list(x) for x in zip(*sorted(zip(list1, list2),key=itemgetter(0))))
>>> list1, list2
([1, 2, 3, 4, 4, 5, 6], [3, 2, 7, 2, 1, 1, 8])
itemgetter(0) takes the first element from each tuple, which belongs to the first list.
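Another way to get the same result, sketched here as an alternative rather than as part of the answers above, is to sort the index positions by the first list and then reorder both lists with them. The names order, sorted1 and sorted2 are illustrative.

```python
list1 = [1, 2, 5, 4, 4, 3, 6]
list2 = [3, 2, 1, 2, 1, 7, 8]

# Sort the positions by the value in list1; stability keeps tied positions in order.
order = sorted(range(len(list1)), key=list1.__getitem__)

sorted1 = [list1[i] for i in order]
sorted2 = [list2[i] for i in order]
print(sorted1)  # [1, 2, 3, 4, 4, 5, 6]
print(sorted2)  # [3, 2, 7, 2, 1, 1, 8]
```

The same order list can be reused to rearrange any number of additional parallel lists.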
I have one list like this:
a = [3, 4, [1], 8, 9, [3, 4, 5]]
I would like to detect when a nested list contains only one value, and then extract that value into the main list:
Expected output
a = [3, 4, 1, 8, 9, [3, 4, 5]]
I know how to extract values from a list composed of lists, but I don't know how to do it in this case.
My solution is simple and straightforward:
result = []
for x in a:
    if isinstance(x, list) and len(x) == 1: # check item type and length
        result.append(x[0])
    else:
        result.append(x)
Or the same but one line
>>> [x[0] if isinstance(x, list) and len(x) == 1 else x for x in a]
[3, 4, 1, 8, 9, [3, 4, 5]]
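Wrapped in a function, the same expression makes the edge cases easy to check. This is a small sketch; the name unwrap_singletons is invented here, and only one level of nesting is unwrapped.

```python
def unwrap_singletons(items):
    """Replace each one-element list in items with its single element."""
    return [x[0] if isinstance(x, list) and len(x) == 1 else x for x in items]

a = [3, 4, [1], 8, 9, [3, 4, 5]]
print(unwrap_singletons(a))            # [3, 4, 1, 8, 9, [3, 4, 5]]

# Empty lists are left alone, and a nested singleton loses only its outer level.
print(unwrap_singletons([[], [[2]]]))  # [[], [2]]
```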