I have two lists:
lookup_list = [1,2,3]
my_list = [1,2,3,4,5,2,1,2,2,1,2,3,4,5,1,3,2,3,1]
I want to count how many times the lookup_list appeared in my_list with the following logic:
The order should be 1 -> 2 -> 3
In my_list, the lookup_list items don't have to be next to each other: 1,4,2,1,5,3 should generate a match, since a 2 comes after a 1 and a 3 comes after the 2.
The matches based on this logic:
1st match: [1,2,3,4,5,2,1,2,2,1,2,3,4,5,1,3,2,3,1]
2nd match: [1,2,3,4,5,2,1,2,2,1,2,3,4,5,1,3,2,3,1]
3rd match: [1,2,3,4,5,2,1,2,2,1,2,3,4,5,1,3,2,3,1]
4th match: [1,2,3,4,5,2,1,2,2,1,2,3,4,5,1,3,2,3,1]
The lookup_list is dynamic; it could be defined as [1,2] or [1,2,3,4], etc. How can I solve this? All the answers I've found are about finding matches where 1,2,3 appear next to each other in order, like this one: Find matching sequence of items in a list
I can find the count of consecutive sequences with the below code but it doesn't count the nonconsecutive sequences:
from collections import Counter
from nltk import ngrams
lookup_list = [1,2,3]
my_list = [1,2,3,4,5,2,1,2,2,1,2,3,4,5,1,3,2,3,1]
all_counts = Counter(ngrams(my_list, len(lookup_list)))
counts = {k: all_counts[k] for k in [tuple(lookup_list)]}
counts
>>> {(1, 2, 3): 2}
I tried using pandas rolling window functions but they don't have a custom reset option.
def find_all_sequences(source, sequence):
    def find_sequence(source, sequence, index, used):
        for i in sequence:
            while True:
                index = source.index(i, index + 1)
                if index not in used:
                    break
            yield index

    first, *rest = sequence
    index = -1
    used = set()
    while True:
        try:
            index = source.index(first, index + 1)
            indexes = index, *find_sequence(source, rest, index, used)
        except ValueError:
            break
        else:
            used.update(indexes)
            yield indexes
Usage:
lookup_list = [1,2,3]
my_list = [1,2,3,4,5,2,1,2,2,1,2,3,4,5,1,3,2,3,1]
print(*find_all_sequences(my_list, lookup_list), sep="\n")
Output:
(0, 1, 2)
(6, 7, 11)
(9, 10, 15)
(14, 16, 17)
The generator function find_all_sequences() yields tuples of indexes, one per sequence match. It loops until a list.index() call raises ValueError. The inner generator function find_sequence() yields the index of each remaining sequence item, skipping indexes that an earlier match has already used.
According to this benchmark, my method is about 60% faster than the one in Andrej Kesely's answer.
The function find_matches() returns indices where the matches from lookup_list are:
def find_matches(lookup_list, lst):
    buckets = []

    def _find_bucket(i, v):
        # Try to extend the first open bucket that is waiting for v.
        for b in buckets:
            if lst[b[-1]] == lookup_list[len(b) - 1] and v == lookup_list[len(b)]:
                b.append(i)
                if len(b) == len(lookup_list):
                    buckets.remove(b)
                    return b
                break
        else:
            # No bucket was extended: start a new one if v opens the sequence.
            if v == lookup_list[0]:
                buckets.append([i])

    rv = []
    for i, v in enumerate(lst):
        b = _find_bucket(i, v)
        if b:
            rv.append(b)
    return rv
lookup_list = [1, 2, 3]
my_list = [1, 2, 3, 4, 5, 2, 1, 2, 2, 1, 2, 3, 4, 5, 1, 3, 2, 3, 1]
print(find_matches(lookup_list, my_list))
Prints:
[[0, 1, 2], [6, 7, 11], [9, 10, 15], [14, 16, 17]]
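If only the count is needed rather than the indices, the bucket idea above can probably be reduced to one counter per stage. The following is a minimal sketch, not part of the original answer: count_ordered_matches is a hypothetical name, and it assumes the values in the lookup list are distinct and that each element of the input may be used in at most one match.

```python
def count_ordered_matches(lookup, items):
    """Count greedy in-order matches of lookup in items.

    Hypothetical helper; assumes the values in lookup are distinct.
    """
    if not lookup:
        return 0
    stage_of = {value: k for k, value in enumerate(lookup)}
    # partial[k] = number of open matches that have consumed lookup[:k + 1]
    partial = [0] * len(lookup)
    complete = 0
    last = len(lookup) - 1
    for value in items:
        k = stage_of.get(value)
        if k is None:
            continue  # value is not part of the lookup sequence
        if k > 0:
            if partial[k - 1] == 0:
                continue  # no open match is waiting for this value
            partial[k - 1] -= 1
        if k == last:
            complete += 1
        else:
            partial[k] += 1
    return complete


my_list = [1, 2, 3, 4, 5, 2, 1, 2, 2, 1, 2, 3, 4, 5, 1, 3, 2, 3, 1]
print(count_ordered_matches([1, 2, 3], my_list))  # 4
```

This runs in a single pass over the input and keeps no index lists, at the cost of not reporting where the matches are.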
Here is a recursive solution:
lookup_list = [1,2,3]
my_list = [1,2,3,4,5,2,1,2,2,1,2,3,4,5,1,3,2,3,1]

def find(my_list, continue_from_index):
    if continue_from_index > (len(my_list) - 1):
        return 0
    last_found_index = 0
    found_indizes = []
    first_occuring_index = 0
    found = False
    for l in lookup_list:
        for m_index in range(continue_from_index, len(my_list)):
            # compare values with ==, not identity (is)
            if my_list[m_index] == l and m_index >= last_found_index:
                if not found:
                    found = True
                    first_occuring_index = m_index
                last_found_index = m_index
                found_indizes.append(str(m_index))
                break
    if len(found_indizes) == len(lookup_list):
        # one match found; continue searching after its first element
        return find(my_list, first_occuring_index+1) + 1
    return 0

print(find(my_list, 0))
my_list = [5, 6, 3, 8, 2, 1, 7, 1]
lookup_list = [8, 2, 7]
counter = 0
result = False
for i in my_list:
    if i in lookup_list:
        counter += 1
if counter == len(lookup_list):
    result = True
print(result)
Given an array of size N containing positive integers, I want to count the number of smaller elements to the right of each element.
For example:
Input:
N = 7
arr[] = {12, 1, 2, 3, 0, 11, 4}
Output: 6 1 1 1 0 1 0
Explanation: There are 6 smaller elements to the right of 12, 1 smaller element to the right of 1, and so on.
My code for this problem is:
# python code here
n = int(input())
arr = list(map(int, input().split()))
ans = 0
ANS = []
for i in range(n-1):
    for j in range(i+1, n):
        if arr[i] > arr[j]:
            ans += 1
    ANS.append(ans)
    ans = 0
ANS.append(0)
print(ANS)
However, my code above has O(n^2) time complexity, and I want to reduce it. If anyone has an idea for reducing the time complexity of this Python code, please help. Thank you.
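This solution is O(n log(n)): it makes three passes over the values and one sort. Note that it reproduces the expected output for this input, but the index-difference rule is not correct in general (for example, [1, 2, 3] yields [1, 1, 1] rather than [0, 0, 0]).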
arr = [12, 1, 2, 3, 0, 11, 4]

# Gather original index and values
tups = []
for origin_index, el in enumerate(arr):
    tups.append([origin_index, el])

# Sort on value
tups.sort(key=lambda t: t[1])

res = []
for sorted_index, values in enumerate(tups):
    # Check the difference between the sorted and the original index.
    # If the difference is positive, it is taken as the number of
    # smaller values to the right of this element.
    if sorted_index - values[0] > 0:
        res.append([values[0], (sorted_index - values[0])])
    elif sorted_index - values[0] == 0:
        res.append([values[0], (sorted_index - values[0]) + 1])
    else:
        res.append([values[0], 0])

origin_sort_res = [0 for i in range(len(arr))]
for v in res:
    # Map the results from sorted order back to the original indexing
    origin_sort_res[v[0]] = v[1]

print(origin_sort_res)
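Try this, using bisect. The binary searches cost O(n log n) in total, although each list.insert is O(n), so the worst case overall is still O(n^2):
import bisect

def solution(nums):
    sortns = []  # values seen so far (to the right), kept sorted
    res = []
    for n in reversed(nums):
        idx = bisect.bisect_left(sortns, n)  # number of seen values smaller than n
        res.append(idx)
        sortns.insert(idx, n)
    return res[::-1]

print(solution([12, 1, 2, 3, 0, 11, 4]))
# [6, 1, 1, 1, 0, 1, 0]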
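For a version that stays at O(n log n) even in the worst case, one common option is a Fenwick tree (binary indexed tree) over the ranks of the values. The sketch below is an illustration under that assumption, not code from any of the answers here; count_smaller_right is a hypothetical name.

```python
def count_smaller_right(nums):
    """For each element, count the strictly smaller elements to its right."""
    # Coordinate compression: map each distinct value to a rank 1..m.
    ranks = {v: r for r, v in enumerate(sorted(set(nums)), start=1)}
    tree = [0] * (len(ranks) + 1)  # Fenwick tree over ranks, 1-indexed

    def add(pos):
        # Record one occurrence of the value with rank pos.
        while pos < len(tree):
            tree[pos] += 1
            pos += pos & -pos

    def prefix_sum(pos):
        # Number of recorded values with rank <= pos.
        total = 0
        while pos > 0:
            total += tree[pos]
            pos -= pos & -pos
        return total

    result = []
    for v in reversed(nums):
        r = ranks[v]
        result.append(prefix_sum(r - 1))  # already-seen values with a smaller rank
        add(r)
    return result[::-1]


print(count_smaller_right([12, 1, 2, 3, 0, 11, 4]))
# [6, 1, 1, 1, 0, 1, 0]
```

Both the update and the query touch O(log n) tree cells, so the sort used for the ranks and the main loop are each O(n log n).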
How can I loop over a list using a dictionary and return the value that occurs most often, and, if several values occur equally often, return the greatest of them?
Here is some context, with my unfinished code:
def most_frequent(lst):
    dict = {}
    count, itm = 0, ''
    for item in lst:
        dict[item] = dict.get(item, 0) + 1
        if dict[item] >= count:
            count, itm = dict[item], item
    return itm

#lst = ["a","b","b","c","a","c"]
lst = [2, 3, 2, 2, 1, 3, 3,1,1,1,1] #this should return 1
lst2 = [2, 3, 2, 2, 1, 3, 3] # should return 3
print(most_frequent(lst))
Here is a different way to go about it:
def most_frequent(lst):
    # Simple check to ensure lst has something.
    if not lst:
        return -1
    # Organize your data as: {number: count, ...}
    dct = {}
    for i in lst:
        dct[i] = dct[i] + 1 if i in dct else 1
    # Iterate through your data and create a list of all large elements.
    large_list, large_count = [], 0
    for num, count in dct.items():
        if count > large_count:
            large_count = count
            large_list = [num]
        elif count == large_count:
            large_list.append(num)
    # Return the largest element in the large_list list.
    return max(large_list)
There are many other ways to solve this problem, including using filter and other built-ins, but this is intended to give you a working solution so that you can start thinking about how to optimize it further.
Things to take away from this; always ask yourself:
How can I break this problem down into smaller parts?
How can I organize my data so that it is more useful and easier to manipulate?
What shortcuts can I use along the way to make this function easier/better/faster?
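As one example of such a shortcut, the counting step can be handed to collections.Counter and the tie-break expressed as a sort key. This is a minimal sketch rather than a replacement for the answer above; most_frequent_largest is a hypothetical name, and returning None for an empty list is an assumption.

```python
from collections import Counter


def most_frequent_largest(lst):
    """Return the most frequent value; ties go to the larger value."""
    if not lst:
        return None  # assumption: an empty input has no answer
    counts = Counter(lst)
    # Compare by (count, value): the highest count wins, then the larger value.
    value, _ = max(counts.items(), key=lambda kv: (kv[1], kv[0]))
    return value


print(most_frequent_largest([2, 3, 2, 2, 1, 3, 3, 1, 1, 1, 1]))  # 1
print(most_frequent_largest([2, 3, 2, 2, 1, 3, 3]))  # 3
```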
Your code produces the result as you describe in your question, i.e. 1. However, your question states that you want to consider the case where two list elements are co-equals in maximum occurrence and return the largest. Therefore, tracking and returning a single element doesn't satisfy this requirement. You need to compile the dict and then evaluate the result.
def most_frequent(lst):
    dict = {}
    for item in lst:
        dict[item] = dict.get(item, 0) + 1
    # Sort by count (descending), then by value (descending)
    itm = sorted(dict.items(), key = lambda kv:(-kv[1], -kv[0]))
    return itm[0]

#lst = ["a","b","b","c","a","c"]
lst = [2, 3, 2, 2, 2, 2, 1, 3, 3,1,1,1,1] # 1 and 2 both occur 5 times
lst2 = [2, 3, 2, 2, 1, 3, 3] # 2 and 3 both occur 3 times
print(most_frequent(lst))
I edited the list 'lst' so that '1' and '2' both occur 5 times. The result returned is a tuple:
(2,5)
I reuse your idea which is quite neat, and I just modified your program a bit.
def get_most_frequent(lst):
    counts = dict()
    most_frequent = (None, 0) # (item, count)
    ITEM_IDX = 0
    COUNT_IDX = 1
    for item in lst:
        counts[item] = counts.get(item, 0) + 1
        if most_frequent[ITEM_IDX] is None:
            # first loop, most_frequent is "None"
            most_frequent = (item, counts[item])
        elif counts[item] > most_frequent[COUNT_IDX]:
            # if current item's "counter" is bigger than the most_frequent's counter
            most_frequent = (item, counts[item])
        elif counts[item] == most_frequent[COUNT_IDX] and item > most_frequent[ITEM_IDX]:
            # if the current item's "counter" is the same as the most_frequent's counter
            most_frequent = (item, counts[item])
        else:
            pass # do nothing
    return most_frequent

lst1 = [2, 3, 2, 2, 1, 3, 3,1,1,1,1, 2] # 1: 5 times
lst2 = [2, 3, 1, 3, 3, 2, 2] # 3: 3 times
lst3 = [1]
lst4 = []
print(get_most_frequent(lst1))
print(get_most_frequent(lst2))
print(get_most_frequent(lst3))
print(get_most_frequent(lst4))
I'm doing an algorithm challenge on www.edabit.com, where you have a list of dice rolls, and:
if the number is 6, the next number on the list is amplified by a factor of 2
if the number is 1, the next number on the list is 0
This is my code:
def rolls(lst):
    out = 0
    iterate = 0
    if lst[iterate] == 1:
        out+=lst[iterate]
        lst[iterate+1] = 0
        iterate+=1
        rolls(lst[iterate])
    elif lst[iterate] == 6:
        out+=lst[iterate]
        lst[iterate+1] = lst[iterate+1]*2
        iterate+=1
        rolls(lst[iterate])
    else:
        out+=lst[iterate]
        iterate+=1
The console gives me "TypeError: 'int' object is not subscriptable".
Any ideas? It would also be useful to hear about any other errors you spot.
I tried other IDEs, but they give the same output.
For a series like "1, 6, 2, 3, 2, 4, 5, 6, 2" I expect 27.
As stated in the comments, for this kind of problem I wouldn't use recursion. A loop will be enough:
from itertools import accumulate
l = [1, 6, 2, 3, 2, 4, 5, 6, 2]
print(sum(accumulate([0] + l, lambda a, b: b*{1:0, 6:2}.get(a, 1))))
Prints:
27
If you have to use recursion for this problem, you will need to pass two arguments: lst and iterate. Note that lst[iterate] is a single element, and that is what you are passing to the function when you call it recursively; this is what causes the TypeError.
Thus, modify the function to take the two arguments lst and iterate, and initially pass the full list for lst and 0 for iterate. rolls(lst, 0) should be your initial function call.
I suppose you want the out variable to hold the sum of all entries in lst as you visit them, so it also needs to be passed as an argument, making your initial call rolls(lst, 0, 0). I have edited the function to return the sum accumulated in out accordingly.
def rolls(lst, iterate, out):
    if iterate == len(lst):
        return out
    if lst[iterate] == 1:
        out += lst[iterate]
        if iterate + 1 < len(lst): # avoid an index-out-of-bounds error
            lst[iterate + 1] = 0
        # return the recursive result so the sum reaches the caller
        return rolls(lst, iterate + 1, out)
    elif lst[iterate] == 6:
        out += lst[iterate]
        if iterate + 1 < len(lst): # avoid an index-out-of-bounds error
            lst[iterate + 1] = lst[iterate + 1] * 2
        return rolls(lst, iterate + 1, out)
    else:
        out += lst[iterate]
        return rolls(lst, iterate + 1, out)
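The recursive function above writes the adjusted values back into lst, so the caller's list is changed as a side effect. If that is unwanted, one possible variant passes the previous effective roll along instead. This is only a sketch: rolls_recursive and its prev parameter are hypothetical names, and it assumes (as the expected result of 27 suggests) that a 6 which has been zeroed by a preceding 1 no longer doubles the next roll.

```python
def rolls_recursive(lst, i=0, prev=None):
    """Sum the rolls recursively without modifying lst."""
    if i == len(lst):
        return 0
    x = lst[i]
    if prev == 1:
        x = 0       # the roll after a 1 counts as 0
    elif prev == 6:
        x *= 2      # the roll after a 6 is doubled
    # Pass the effective value on, so a zeroed 6 does not double its successor.
    return x + rolls_recursive(lst, i + 1, x)


print(rolls_recursive([1, 6, 2, 3, 2, 4, 5, 6, 2]))  # 27
```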
Instead of looking to the next item, you can look at the previous item:
from itertools import islice

def rolls(lst):
    if not lst:
        return 0
    total = prev = lst[0]
    for x in islice(lst, 1, None):
        if prev == 1:
            x = 0
        elif prev == 6:
            x *= 2
        prev = x
        total += x
    return total
For example:
>>> rolls([1, 6, 2, 3, 2, 4, 5, 6, 2])
27
>>> rolls([])
0
>>> rolls([1])
1
>>> rolls([2])
2
>>> rolls([3])
3
>>> rolls([4])
4
>>> rolls([6,1])
8
>>> rolls([6,2])
10
>>> rolls([6,1,5])
13
I am searching for a clean and pythonic way of checking if the contents of a list are greater than a given number (first threshold) for a certain number of times (second threshold). If both statements are true, I want to return the index of the first value which exceeds the given threshold.
Example:
# Set first and second threshold
thr1 = 4
thr2 = 5
# Example 1: Both thresholds exceeded, looking for index (3)
list1 = [1, 1, 1, 5, 1, 6, 7, 3, 6, 8]
# Example 2: Only threshold 1 is exceeded, no index return needed
list2 = [1, 1, 6, 1, 1, 1, 2, 1, 1, 1]
I don't know if it's considered pythonic to abuse the fact that booleans are ints, but I like doing it like this:
def check(l, thr1, thr2):
    c = [n > thr1 for n in l]
    if sum(c) >= thr2:
        return c.index(1)
Try this:
def check_list(testlist):
    overages = [x for x in testlist if x > thr1]
    if len(overages) >= thr2:
        return testlist.index(overages[0])
    # This return is not needed. Removing it will not change
    # the outcome of the function.
    return None
This uses the fact that you can use if statements in list comprehensions to ignore non-important values.
As mentioned by Chris_Rands in the comments, the return None is unnecessary. Removing it will not change the result of the function.
If you are looking for a one-liner (or almost)
a = list(filter(lambda z: z is not None, map(lambda p: p[0] if p[1] > thr1 else None, enumerate(list1))))
print(a[0] if len(a) >= thr2 else False)
A naive and straightforward approach would be to iterate over the list counting the number of items greater than the first threshold and returning the index of the first match if the count exceeds the second threshold:
def answer(l, thr1, thr2):
    count = 0
    first_index = None
    for index, item in enumerate(l):
        if item > thr1:
            count += 1
            if first_index is None:  # "not first_index" would misfire on index 0
                first_index = index
            if count >= thr2:  # TODO: check if ">" is required instead
                return first_index
thr1 = 4
thr2 = 5
list1 = [1, 1, 1, 5, 1, 6, 7, 3, 6, 8]
list2 = [1, 1, 6, 1, 1, 1, 2, 1, 1, 1]
print(answer(list1, thr1, thr2)) # prints 3
print(answer(list2, thr1, thr2)) # prints None
This is probably not very pythonic, but the solution has a couple of advantages: we keep only the index of the first match, and we exit the loop early if we hit the second threshold.
In other words, we have O(k) in the best case and O(n) in the worst case, where k is the number of items before reaching the second threshold; n is the total number of items in the input list.
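The same early-exit behaviour can probably also be had from a lazy generator combined with itertools.islice, which stops consuming the input once enough hits have been seen. This is a sketch under assumptions: first_index_if_enough is a hypothetical name, and thr2 is assumed to be at least 1.

```python
from itertools import islice


def first_index_if_enough(values, thr1, thr2):
    """Return the index of the first value > thr1 if at least thr2 values
    exceed thr1, otherwise None. Assumes thr2 >= 1."""
    # Lazy generator of the indexes whose value exceeds thr1.
    hits = (i for i, v in enumerate(values) if v > thr1)
    # Take at most thr2 hits; iteration stops as soon as they are found.
    first_hits = list(islice(hits, thr2))
    return first_hits[0] if len(first_hits) == thr2 else None


print(first_index_if_enough([1, 1, 1, 5, 1, 6, 7, 3, 6, 8], 4, 5))  # 3
print(first_index_if_enough([1, 1, 6, 1, 1, 1, 2, 1, 1, 1], 4, 5))  # None
```

Unlike the loop above, this keeps up to thr2 indexes in memory rather than only the first one.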
I don't know if I'd call it clean or pythonic, but this should work
def get_index(list1, thr1, thr2):
    cnt = 0
    first_element = None
    for i in list1:
        if i > thr1:
            cnt += 1
            if first_element is None:
                first_element = i
    # ">=" so that exactly thr2 occurrences also count as a match
    if cnt >= thr2:
        return list1.index(first_element)
    else:
        return "criteria not met"
thr1 = 4
thr2 = 5
list1 = [1, 1, 1, 5, 1, 6, 7, 3, 6, 8]
list2 = [1, 1, 6, 1, 1, 1, 2, 1, 1, 1]
def func(lst):
    res = [i for i, j in enumerate(lst) if j > thr1]
    return len(res) >= thr2 and res[0]
Output:
func(list1)
3
func(list2)
False