Find the number of consecutively increasing elements in a list - python

I got a problem in TalentBuddy, which sounds like this
A student's performance in lab activities should always improve, but that is not always the case.
Since progress is one of the most important metrics for a student, let’s write a program that computes the longest period of increasing performance for any given student.
For example, if his grades for all lab activities in a course are: 9, 7, 8, 2, 5, 5, 8, 7 then the longest period would be 4 consecutive labs (2, 5, 5, 8).
So far, I'm too confused to work out the code. The only thing I have is
def longest_improvement(grades):
    res = 0
    for i in xrange(len(grades) - 2):
        while grades[i] <= grades[i + 1]:
            res += 1
            i += 1
    print res
But that prints 17, rather than 6 when grades = [1, 7, 2, 5, 6, 9, 11, 11, 1, 6, 1].
How can I work out the rest of the code? Thanks.

Solved with some old-fashioned tail-recursion:
grades = [1, 7, 2, 5, 6, 9, 11, 11, 1, 6, 1]

def streak(grades):
    def streak_rec(longest, challenger, previous, rest):
        if rest == []:                # Base case
            return max(longest, challenger)
        elif previous <= rest[0]:     # Streak continues
            return streak_rec(longest, challenger + 1, rest[0], rest[1:])
        else:                         # Streak is reset
            return streak_rec(max(longest, challenger), 1, rest[0], rest[1:])
    return streak_rec(0, 0, 0, grades)

print streak(grades)  # => 6
print streak([2])     # => 1

Since the current solution involves yield, map, and additional memory overhead, it's probably a good idea to at least mention the simple iterative solution:
def length_of_longest_sublist(lst):
    max_length, cur_length = 1, 1
    prev_val = lst[0]
    for val in lst[1:]:
        if val >= prev_val:
            cur_length += 1
        else:
            max_length = max(max_length, cur_length)
            cur_length = 1
        prev_val = val
    return max(max_length, cur_length)
We could reduce that code by getting the previous value directly:
def length_of_longest_sublist2(lst):
    max_length, cur_length = int(bool(lst)), int(bool(lst))
    for prev_val, val in zip(lst, lst[1:]):
        if val >= prev_val:
            cur_length += 1
        else:
            max_length = max(max_length, cur_length)
            cur_length = 1
    return max(max_length, cur_length)
which is a nice trick to know (and allows it to easily return the right result for an empty list), but confusing to people who don't know the idiom.
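A quick demonstration of the two idioms at work (Python 3 syntax), showing why the empty list falls out correctly:

```python
lst = []
print(int(bool(lst)))           # 0: an empty list is falsy, so both lengths start at 0
print(list(zip(lst, lst[1:])))  # []: zip over an empty list yields nothing, the loop never runs

lst = [3, 1, 4]
print(int(bool(lst)))           # 1: a non-empty list is truthy
print(list(zip(lst, lst[1:])))  # [(3, 1), (1, 4)]: each element paired with its successor
```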

This method uses fairly basic Python, and the return statement can be quickly modified so that you get a list of all the streak lengths.
def longest_streak(grades):
    if len(grades) < 2:
        return len(grades)
    else:
        start, streaks = -1, []
        for idx, (x, y) in enumerate(zip(grades, grades[1:])):
            if x > y:
                streaks.append(idx - start)
                start = idx
            else:
                streaks.append(idx - start + 1)
        return max(streaks)

I would solve it this way:
from itertools import groupby
from funcy import pairwise, ilen

def streak(grades):
    if len(grades) <= 1:
        return len(grades)
    orders = (x <= y for x, y in pairwise(grades))
    return max(ilen(l) for asc, l in groupby(orders) if asc) + 1
Very explicit: orders is an iterator of Trues for ascending pairs and Falses for descending ones. Then we just need to find the longest run of Trues and add 1.
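For readers without funcy, the same idea can be sketched with the standard library alone, replacing pairwise with zip and ilen with a generator sum (and guarding against a list with no ascending pair, where max would otherwise fail):

```python
from itertools import groupby

def streak(grades):
    if len(grades) <= 1:
        return len(grades)
    # True for each non-decreasing adjacent pair, False otherwise
    orders = (x <= y for x, y in zip(grades, grades[1:]))
    runs = [sum(1 for _ in g) for asc, g in groupby(orders) if asc]
    return max(runs) + 1 if runs else 1  # no ascending pair: longest run is a single grade

print(streak([1, 7, 2, 5, 6, 9, 11, 11, 1, 6, 1]))  # 6
```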

You're using the same res variable in each iteration of the inner while loop. You probably want to reset it, and keep the highest intermediate result in a different variable.
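Applied to the original loop, that advice might look like this sketch (Python 3; assumes a non-empty grades list):

```python
def longest_improvement(grades):
    best = run = 1  # run tracks the current non-decreasing stretch
    for i in range(1, len(grades)):
        if grades[i] >= grades[i - 1]:
            run += 1
        else:
            run = 1  # reset the intermediate counter on a drop
        best = max(best, run)  # keep the highest intermediate result separately
    return best

print(longest_improvement([1, 7, 2, 5, 6, 9, 11, 11, 1, 6, 1]))  # 6
print(longest_improvement([9, 7, 8, 2, 5, 5, 8, 7]))             # 4
```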

A little bit late, but here's my updated version:
from funcy import ilen, ireductions

def streak(last, x):
    if last and x >= last[-1]:
        last.append(x)
        return last
    return [x]

def longest_streak(grades):
    xs = map(ilen, ireductions(streak, grades, None))
    return xs and max(xs) or 1

grades = [1, 7, 2, 5, 6, 9, 11, 11, 1, 6, 1]
print longest_streak(grades)
print longest_streak([2])
In the end I decided not only to produce a correct version without bugs, but also to use a library I quite like, funcy :)
Output:
6
1

Maybe not as efficient as previous answers, but it's short :P
import numpy as np
from itertools import groupby

diffgrades = np.diff(grades)
maxlen = max([len(list(g)) for k, g in groupby(diffgrades, lambda x: x >= 0) if k]) + 1

Building on the idea of @M4rtini to use itertools.groupby.
def longest_streak(grades):
    from itertools import groupby
    if len(grades) > 1:
        streak = [x <= y for x, y in zip(grades, grades[1:])]
        return max([sum(g, 1) for k, g in groupby(streak) if k])
    else:
        return len(grades)


Optimizing permutation generator where total of each permutation totals to same value

I'm wanting to create a list of permutations or cartesian products (not sure which one applies here) where the sum of values in each permutation totals to a provided value.
There should be three parameters required for the function.
Sample Size: The number of items in each permutation
Desired Sum: The total that each permutation should add up to
Set of Numbers: The set of numbers that can be included with repetition in the permutations
I have an implementation working below, but it seems quite slow. I would prefer to use an iterator to stream the results, but I would also need a function that can calculate the total number of items the iterator would produce.
def buildPerms(sample_size, desired_sum, set_of_number):
    blank = [0] * sample_size
    return recurseBuildPerms([], blank, set_of_number, desired_sum)

def recurseBuildPerms(perms, blank, values, desired_size, search_index=0):
    for i in range(0, len(values)):
        for j in range(search_index, len(blank)):
            if blank[j] == 0:
                new_blank = blank.copy()
                new_blank[j] = values[i]
                remainder = desired_size - sum(new_blank)
                new_values = list(filter(lambda x: x <= remainder, values))
                if len(new_values) > 0:
                    recurseBuildPerms(perms, new_blank, new_values, desired_size, j)
                elif sum(new_blank) <= desired_size:
                    perms.append(new_blank)
    return perms

perms = buildPerms(4, 10, [1, 2, 3])
print(perms)
## Output
[[1, 3, 3, 3], [2, 2, 3, 3], [2, 3, 2, 3],
[2, 3, 3, 2], [3, 1, 3, 3], [3, 2, 2, 3],
[3, 2, 3, 2], [3, 3, 1, 3], [3, 3, 2, 2],
[3, 3, 3, 1]]
https://www.online-python.com/9cmOev3zlg
Questions:
Can someone help me convert my solution into an iterator?
Is it possible to have a calculation to know the total number of items without seeing the full list?
Here is one way to break this down into two subproblems:
Find all restricted integer partitions of target_sum into sample_size summands such that all summands come from set_of_number.
Compute multiset permutations for each partition (takes up most of the time).
Problem 1 can be solved with dynamic programming. I used multiset_permutations from sympy for part 2, although you might be able to get better performance by writing your own numba code.
Here is the code:
from functools import lru_cache
from sympy.utilities.iterables import multiset_permutations

@lru_cache(None)
def restricted_partitions(n, k, *xs):
    'partitions of n into k summands using only elements in xs (assumed positive integers)'
    if n == k == 0:
        # case of unique empty partition
        return [[]]
    elif n <= 0 or k <= 0 or not xs:
        # case where no partition is possible
        return []
    # general case
    result = list()
    x = xs[0]  # element x we consider including in a partition
    i = 0      # number of times x should be included
    while True:
        i += 1
        if i > k or x * i > n:
            break
        for rest in restricted_partitions(n - x * i, k - i, *xs[1:]):
            result.append([x] * i + rest)
    result.extend(restricted_partitions(n, k, *xs[1:]))
    return result

def buildPerms2(sample_size, desired_sum, set_of_number):
    for part in restricted_partitions(desired_sum, sample_size, *set_of_number):
        yield from multiset_permutations(part)

# %timeit sum(1 for _ in buildPerms2(8, 16, [1, 2, 3, 4]))  # 16 ms
# %timeit sum(1 for _ in buildPerms (8, 16, [1, 2, 3, 4]))  # 604 ms
The current solution requires computing all restricted partitions before iteration can begin, but it may still be practical if restricted partitions can be computed quickly. It may be possible to compute partitions iteratively as well, although this may require more work.
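As a sketch of that last point, partition generation can be made lazy with a recursive generator (same assumptions as above: positive integer summands; the name iter_restricted_partitions is made up for illustration):

```python
def iter_restricted_partitions(n, k, xs):
    """Lazily yield partitions of n into k summands drawn (with repetition) from xs."""
    if n == k == 0:
        yield []          # the unique empty partition
        return
    if n <= 0 or k <= 0 or not xs:
        return            # no partition possible
    x, rest = xs[0], xs[1:]
    i = 0                 # number of times the first element x is included
    while i <= k and x * i <= n:
        for tail in iter_restricted_partitions(n - x * i, k - i, rest):
            yield [x] * i + tail
        i += 1

print(sorted(sorted(p) for p in iter_restricted_partitions(10, 4, [1, 2, 3])))
# [[1, 3, 3, 3], [2, 2, 3, 3]]
```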
On the second question, you can indeed count the number of such permutations without generating them all:
# equivalent to math.comb, present in the builtin math library for Python 3.8+
@lru_cache(None)
def binomial(n, k):
    if k == 0:
        return 1
    if n == 0:
        return 0
    return binomial(n - 1, k) + binomial(n - 1, k - 1)

@lru_cache(None)
def perm_counts(n, k, *xs):
    if n == k == 0:
        # case of unique empty partition
        return 1
    elif n <= 0 or k <= 0 or not xs:
        # case where no partition is possible
        return 0
    # general case
    result = 0
    x = xs[0]  # element x we consider including in a partition
    i = 0      # number of times x should be included
    while True:
        i += 1
        if i > k or x * i > n:
            break
        result += binomial(k, i) * perm_counts(n - x * i, k - i, *xs[1:])
    result += perm_counts(n, k, *xs[1:])
    return result

# assert perm_counts(15, 6, *[1, 2, 3, 4]) == sum(1 for _ in buildPerms2(6, 15, [1, 2, 3, 4])) == 580
# perm_counts(1000, 100, *[1, 2, 4, 8, 16, 32, 64])
# 902366143258890463230784240045750280765827746908124462169947051257879292738672
The function used to count all restricted permutations looks very similar to the function that generates partitions above. The only significant change is in the following line:
result += binomial(k, i) * perm_counts(n - x * i, k - i, *xs[1:])
There are i copies of x to include and k possible positions where x's may end up. To account for this multiplicity, the number of ways to resolve the recursive sub-problem is multiplied by k choose i.
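As a sanity check (Python 3.8+), the Pascal's-rule binomial above agrees with math.comb:

```python
from math import comb

def binomial(n, k):
    # Pascal's rule, as defined above (memoisation omitted for this small check)
    if k == 0:
        return 1
    if n == 0:
        return 0
    return binomial(n - 1, k) + binomial(n - 1, k - 1)

assert all(binomial(n, k) == comb(n, k)
           for n in range(10) for k in range(n + 1))
print("binomial matches math.comb for all n < 10")
```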

Find large number in a list, where all previous numbers are also in the list

I am trying to implement a Yellowstone Integer calculation which suggests that "Every number appears exactly once: this is a permutation of the positive numbers". The formula I have implemented to derive the values is as follows:
import math

yellowstone_list = []
item_list = []
i = 0
while i <= 1000:
    if i <= 3:
        yellowstone_list.append(i)
    else:
        j = 1
        inList = 1
        while inList == 1:
            minus_1 = math.gcd(j, yellowstone_list[i - 1])
            minus_2 = math.gcd(j, yellowstone_list[i - 2])
            if minus_1 == 1 and minus_2 > 1:
                if j in yellowstone_list:
                    inList = 1
                else:
                    inList = 0
            j += 1
        yellowstone_list.append(j - 1)
    item_list.append(i)
    i += 1
The issue is that as i increases, the time taken to determine the value of j also increases (naturally, since each search for j starts again from 1, ever further from the value eventually found).
What I would like to do is determine the largest value of j in the yellowstone_list, where all the values of 1 to j are already in the list.
As an example, in the below list, j would be 9, as all the values 0 - 9 are in the list:
yellowstone_list = [0, 1, 2, 3, 4, 9, 8, 15, 14, 5, 6, 25, 12, 35, 16, 7]
Any suggestions on how to implement this in an efficient manner?
For the "standalone" problem as stated the algorithm would be:
Sort the list.
Run a counter from 0 while in parallel traversing the list. Once the counter value is unequal to the list element, then you have found one-past the wanted element.
Something like the following:
x = [0, 1, 2, 3, 4, 9, 8, 15, 14, 5, 6, 25, 12, 35, 16, 7]
y = sorted(x)
for i in range(1, len(y)):
    if y[i] != i:
        print(i - 1)
        break
But in your case it appears that the initial list is being built gradually. So each time a number is added to the list, it can be inserted in a sorted manner and can be checked against the previous element and the traversal can start from there for more efficient process.
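A sketch of that incremental idea, using bisect.insort and resuming the scan from the last known contiguous prefix (the class name is made up for illustration; values are assumed distinct, as they are in the Yellowstone sequence):

```python
import bisect

class ContiguousTracker:
    """Keep values sorted as they arrive and track the largest j
    such that all of 0..j are present."""
    def __init__(self):
        self.sorted_vals = []
        self.contig = -1  # largest j with 0..j all present so far

    def add(self, v):
        bisect.insort(self.sorted_vals, v)
        # values 0..contig occupy indices 0..contig exactly, so we can
        # resume the scan where it last stopped instead of restarting
        while (self.contig + 1 < len(self.sorted_vals)
               and self.sorted_vals[self.contig + 1] == self.contig + 1):
            self.contig += 1
        return self.contig

tracker = ContiguousTracker()
for v in [0, 1, 2, 3, 4, 9, 8, 15, 14, 5, 6, 25, 12, 35, 16, 7]:
    j = tracker.add(v)
print(j)  # 9: every value 0..9 is now present
```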
This is how I would do it:
lst.sort()
for c, i in enumerate(lst):
    if c + 1 < len(lst) and lst[c + 1] != i + 1:
        j = i
        break
else:
    j = i
Basically, the list is sorted, and then it loops through each value, checking whether the next value is exactly 1 greater than the current one.
After some time to sit down and think about it, and using the suggestions to sort the list, I came up with two solutions:
Sorting
I implemented @Eugene Sh.'s solution within the while i <= 1000 loop as follows:
while i <= 1000:
    m = sorted(yellowstone_list)
    for n in range(1, len(m)):
        if m[n] != n:
            break
    if i == 0:
        ....
In List
I ran an increment to check if the value was in the list using the "in" function, also within the while i < 1000 loop, as follows:
while i <= 1000:
    while k in yellowstone_list:
        k += 1
    if i == 0:
        ....
Running both codes 100 times, I got the following:
Sorting: Total: 1:56.403527 seconds, Average: 1.164035 seconds.
In List: Total: 1:14.225230 seconds, Average: 0.742252 seconds.
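For what it's worth, the "in" test above rescans the list on every check; keeping a set alongside the list makes each membership test O(1) on average while preserving the same incremental logic (a sketch, assuming all values are distinct, as they are in the Yellowstone sequence):

```python
seen = set()
smallest_missing = 0  # the first value not yet generated

def note(value):
    # call once for each newly generated term of the sequence
    global smallest_missing
    seen.add(value)
    while smallest_missing in seen:  # advance past any newly filled gap
        smallest_missing += 1

for v in [0, 1, 2, 3, 4, 9, 8, 15, 14, 5, 6, 25, 12, 35, 16, 7]:
    note(v)
print(smallest_missing - 1)  # 9: all of 0..9 have appeared
```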

using recursion to find the integer appearing odd times

I am looking for some guidance with the following code please. I am learning Python and I come from Java and C# where I was a beginner. I want to write a function which returns the number which appears an odd number of times. Assumption is that the array is always greater than 1 and there is always only one integer appearing an odd number of times. I want to use recursion.
The function does not return a value as when I store the result I get a NoneType. Please, I am not looking for a solution but some advice of where to look and how to think when debugging.
def find_it(seq):
    seqSort = seq
    seqSort.sort()
    def recurfinder(arg, start, end):
        seqSort = arg
        start = 0
        end = seqSort.length() - 1
        for i in range(start, end):
            counter = 1
            pos = 0
            if seqSort[i + 1] == seqSort[i]:
                counter += 1
                pos = counter - 1
            else:
                if counter % 2 == 0:
                    recurfinder(seqSort, pos + 1, end)
                else:
                    return seqSort[i]
    return -1
You need to actually call recurfinder from somewhere outside of recurfinder to get the ball rolling.
def getOddOccurrence(arr, arr_size):
    for i in range(0, arr_size):
        count = 0
        for j in range(0, arr_size):
            if arr[i] == arr[j]:
                count += 1
        if count % 2 != 0:
            return arr[i]
    return -1

arr = [2, 3, 5, 4, 5, 2, 4, 3, 5, 2, 4, 4, 2]
n = len(arr)
print(getOddOccurrence(arr, n))
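Since the problem guarantees exactly one value appears an odd number of times, there is also a well-known non-recursive trick: XOR all elements together, and the even-count values cancel in pairs (shown here for contrast; it is not the recursive approach the question asks about):

```python
from functools import reduce
from operator import xor

def find_odd_one(seq):
    # x ^ x == 0 and x ^ 0 == x, so values occurring an even
    # number of times cancel out, leaving the odd one
    return reduce(xor, seq)

print(find_odd_one([2, 3, 5, 4, 5, 2, 4, 3, 5, 2, 4, 4, 2]))  # 5
```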
This answer uses recursion and a dict for fast counter lookups -
def find_it(a = [], i = 0, d = {}):
if i >= len(a):
return [ n for (n, count) in d.items() if count % 2 == 1 ]
else:
d = d.copy()
d[a[i]] = d.get(a[i], 0) + 1
return find_it(a, i + 1, d)
It works like this -
print(find_it([ 1, 2, 2, 2, 3, 3, 4, 5, 5, 5, 5 ]))
# [ 1, 2, 4 ]
print(find_it([ 1, 2, 3 ]))
# [ 1, 2, 3 ]
print(find_it([ 1, 1, 2, 2, 3, 3 ]))
# []
print(find_it([]))
# []
Above, i and d are exposed at the call site. Additionally, because we're relying on Python's default arguments, we have to call d.copy() to avoid mutating d. Using an inner helper, loop, mitigates both issues -
def find_it(a = []):
    def loop(i, d):
        if i >= len(a):
            return [ n for (n, count) in d.items() if count % 2 == 1 ]
        else:
            d = d.copy()
            d[a[i]] = d.get(a[i], 0) + 1
            return loop(i + 1, d)
    return loop(0, {})
It works the same as above.
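For comparison, collections.Counter gives the same result without recursion (a sketch; Counter preserves first-seen order on Python 3.7+):

```python
from collections import Counter

def find_it_counter(a):
    # one pass to count, then one pass over distinct values to filter
    return [n for n, count in Counter(a).items() if count % 2 == 1]

print(find_it_counter([1, 2, 2, 2, 3, 3, 4, 5, 5, 5, 5]))  # [1, 2, 4]
print(find_it_counter([]))                                 # []
```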

More pythonic way of filtering out stable values in a list

I wrote a function that lets me run through a list, compare each value with its predecessor, and determine the point at which the list becomes "stable" for a certain number of entries.
The values in the list represent a signal, that may or not reach a stable point.
I came up with this:
def unstableFor(points, maxStable):
    count = 0
    prev = points[0]
    for i in range(1, len(points)):
        if points[i] == prev:
            count = count + 1
        else:
            count = 0
        prev = points[i]
        if count >= maxStable:
            return i
    return len(points) - 1
The returned value is then used by the caller for cutting away the last part of the list.
It does its job, however, I am quite dissatisfied with how cumbersome it looks. Can you think of a more pythonic, possibly purely-functional way of performing this filtering operation?
Use enumerate and zip:
def unstableFor(points, threshold):
    count = 0  # must be initialised, or the first iteration raises NameError
    for i, (a, b) in enumerate(zip(points, points[1:])):
        count = count + 1 if a == b else 0
        if count >= threshold:
            return i
    return i
Your code looks fine to me: it is easy to read and to understand. I would just remove some repetitions to make it look like:
def unstableFor(points, maxStable):
    count = 0
    prev = None  # assuming None is not a member of points
    for i, point in enumerate(points):
        if point == prev:
            count = count + 1
        else:
            count = 0
        prev = point
        if count >= maxStable:
            break
    return i
Here's a sketch for a functional approach. It's a bit cryptic. Indeed, I would likely use your approach (with enumerate, as is idiomatic, instead of range(len(x))). Anyway, supposing max_stable is 3:
>>> from itertools import groupby
>>> grouped = groupby(enumerate(x), lambda i_e: i_e[1])
>>> gen = (g for g in map(lambda e: list(e[1]), grouped) if len(g) >= 3)
>>> run = next(gen)
>>> run[2][0]
10
Here it is cleaned up:
>>> from operator import itemgetter
>>> from itertools import islice
>>> def unstable_for(points, max_stable):
... grouped = groupby(enumerate(points), itemgetter(1))
... gen = (g for g in (tuple(gg) for _, gg in grouped) if len(g) >= max_stable)
... run = tuple(islice(gen,1))
... if len(run) == 0:
... return len(points) - 1
... else:
... return run[0][max_stable - 1][0]
...
>>> x
[1, 2, 3, 3, 4, 5, 6, 7, 8, 8, 8, 8, 8, 9]
>>> unstable_for(x, 3)
10
>>> unstable_for(x, 2)
3
>>> unstable_for(x, 1)
0
>>> unstable_for(x, 20)
13
>>>
Not very elegant. Again, I would go with the imperative solution. Maybe someone has a more elegant functional solution, though.

Python - Memoization and Collatz Sequence

When I was struggling to do Problem 14 in Project Euler, I discovered that I could use a thing called memoization to speed up my process (I let it run for a good 15 minutes, and it still hadn't returned an answer). The thing is, how do I implement it? I've tried to, but I get a keyerror(the value being returned is invalid). This bugs me because I am positive I can apply memoization to this and get this faster.
lookup = {}

def countTerms(n):
    arg = n
    count = 1
    while n is not 1:
        count += 1
        if not n % 2:
            n /= 2
        else:
            n = (n*3 + 1)
    if n not in lookup:
        lookup[n] = count
    return lookup[n], arg

print max(countTerms(i) for i in range(500001, 1000000, 2))
Thanks.
There is also a nice recursive way to do this, which probably will be slower than poorsod's solution, but it is more similar to your initial code, so it may be easier for you to understand.
lookup = {}

def countTerms(n):
    if n not in lookup:
        if n == 1:
            lookup[n] = 1
        elif not n % 2:
            lookup[n] = countTerms(n / 2)[0] + 1
        else:
            lookup[n] = countTerms(n*3 + 1)[0] + 1
    return lookup[n], n

print max(countTerms(i) for i in range(500001, 1000000, 2))
The point of memoising, for the Collatz sequence, is to avoid calculating parts of the list that you've already done. The remainder of a sequence is fully determined by the current value. So we want to check the table as often as possible, and bail out of the rest of the calculation as soon as we can.
def collatz_sequence(start, table={}):  # cheeky trick: store the (mutable) table as a default argument
    """Returns the Collatz sequence for a given starting number"""
    l = []
    n = start
    while n not in l:  # break if we find ourself in a cycle
                       # (don't assume the Collatz conjecture!)
        if n in table:
            l += table[n]
            break
        elif n % 2 == 0:
            l.append(n)
            n = n // 2
        else:
            l.append(n)
            n = (3*n) + 1
    table.update({n: l[i:] for i, n in enumerate(l) if n not in table})
    return l
Is it working? Let's spy on it to make sure the memoised elements are being used:
class NoisyDict(dict):
    def __getitem__(self, item):
        print("getting", item)
        return dict.__getitem__(self, item)

def collatz_sequence(start, table=NoisyDict()):
    # etc
In [26]: collatz_sequence(5)
Out[26]: [5, 16, 8, 4, 2, 1]
In [27]: collatz_sequence(5)
getting 5
Out[27]: [5, 16, 8, 4, 2, 1]
In [28]: collatz_sequence(32)
getting 16
Out[28]: [32, 16, 8, 4, 2, 1]
In [29]: collatz_sequence.__defaults__[0]
Out[29]:
{1: [1],
 2: [2, 1],
 4: [4, 2, 1],
 5: [5, 16, 8, 4, 2, 1],
 8: [8, 4, 2, 1],
 16: [16, 8, 4, 2, 1],
 32: [32, 16, 8, 4, 2, 1]}
Edit: I knew it could be optimised! The secret is that there are two places in the function (the two return points) that we know l and table share no elements. While previously I avoided calling table.update with elements already in table by testing them, this version of the function instead exploits our knowledge of the control flow, saving lots of time.
[collatz_sequence(x) for x in range(500001, 1000000)] now times around 2 seconds on my computer, while a similar expression with @welter's version clocks in at 400 ms. I think this is because the functions don't actually compute the same thing - my version generates the whole sequence, while @welter's just finds its length. So I don't think I can get my implementation down to the same speed.
def collatz_sequence(start, table={}):  # cheeky trick: store the (mutable) table as a default argument
    """Returns the Collatz sequence for a given starting number"""
    l = []
    n = start
    while n not in l:  # break if we find ourself in a cycle
                       # (don't assume the Collatz conjecture!)
        if n in table:
            table.update({x: l[i:] for i, x in enumerate(l)})
            return l + table[n]
        elif n % 2 == 0:
            l.append(n)
            n = n // 2
        else:
            l.append(n)
            n = (3*n) + 1
    table.update({x: l[i:] for i, x in enumerate(l)})
    return l
PS - spot the bug!
This is my solution to PE14:
memo = {1: 1}

def get_collatz(n):
    if n in memo:
        return memo[n]
    if n % 2 == 0:
        terms = get_collatz(n/2) + 1
    else:
        terms = get_collatz(3*n + 1) + 1
    memo[n] = terms
    return terms

compare = 0
for x in xrange(1, 999999):
    if x not in memo:
        ctz = get_collatz(x)
        if ctz > compare:
            compare = ctz
            culprit = x
print culprit
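On Python 3, the same memoisation pattern can be written with functools.lru_cache doing the bookkeeping (a sketch; the recursion limit is raised because each term of a chain adds a stack frame):

```python
import sys
from functools import lru_cache

sys.setrecursionlimit(10000)  # Collatz chains recurse once per term

@lru_cache(maxsize=None)
def collatz_len(n):
    # number of terms in the Collatz sequence starting at n
    if n == 1:
        return 1
    return collatz_len(n // 2 if n % 2 == 0 else 3 * n + 1) + 1

print(max(range(1, 10000), key=collatz_len))  # 6171: longest chain below 10000
print(collatz_len(6171))                      # 262 terms
```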
