I am looking for some guidance with the following code. I am learning Python, coming from Java and C#, where I was a beginner. I want to write a function that returns the number which appears an odd number of times. The assumptions are that the array always has more than one element and that exactly one integer appears an odd number of times. I want to use recursion.
The function does not return a value: when I store the result I get a NoneType. I am not looking for a solution, just some advice on where to look and how to think when debugging.
def find_it(seq):
    seqSort = seq
    seqSort.sort()
    def recurfinder(arg,start,end):
        seqSort = arg
        start = 0
        end = seqSort.length()-1
        for i in range(start,end):
            counter = 1
            pos = 0
            if seqSort[i+1] == seqSort[i]:
                counter+=1
                pos = counter -1
            else:
                if(counter % 2 == 0):
                    recurfinder(seqSort, pos+1, end)
                else:
                    return seqSort[i]
        return -1
You need to actually call recurfinder from somewhere outside of recurfinder to get the ball rolling. As written, find_it only defines it and has no return statement of its own, so it returns None.
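To illustrate the point with a deliberately unrelated toy example (the names `outer_broken`, `outer_fixed`, and `helper` are hypothetical and not from the question): defining a nested function does not run it, and a function that reaches its end without a `return` hands back `None`.

```python
def outer_broken(values):
    def helper(items):
        # Defined but never called, so this body never runs.
        return len(items)
    # No return statement here: Python returns None implicitly.


def outer_fixed(values):
    def helper(items):
        return len(items)
    # Call the helper and pass its result back to the caller.
    return helper(values)


print(outer_broken([1, 2, 3]))  # None
print(outer_fixed([1, 2, 3]))   # 3
```

The same applies to a recursive call: its result is discarded unless the caller returns it.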
def getOddOccurrence(arr, arr_size):
    for i in range(0, arr_size):
        count = 0
        for j in range(0, arr_size):
            if arr[i] == arr[j]:
                count+= 1
        if (count % 2 != 0):
            return arr[i]
    return -1
arr = [2, 3, 5, 4, 5, 2, 4, 3, 5, 2, 4, 4, 2 ]
n = len(arr)
print(getOddOccurrence(arr, n))
This answer uses recursion and a dict for fast counter lookups -
def find_it(a = [], i = 0, d = {}):
    if i >= len(a):
        return [ n for (n, count) in d.items() if count % 2 == 1 ]
    else:
        d = d.copy()
        d[a[i]] = d.get(a[i], 0) + 1
        return find_it(a, i + 1, d)
It works like this -
print(find_it([ 1, 2, 2, 2, 3, 3, 4, 5, 5, 5, 5 ]))
# [ 1, 2, 4 ]
print(find_it([ 1, 2, 3 ]))
# [ 1, 2, 3 ]
print(find_it([ 1, 1, 2, 2, 3, 3 ]))
# []
print(find_it([]))
# []
Above i and d are exposed at the call-site. Additionally, because we're relying on Python's default arguments, we have to call d.copy() to avoid mutating d. Using an inner loop mitigates both issues -
def find_it(a = []):
    def loop(i, d):
        if i >= len(a):
            return [ n for (n, count) in d.items() if count % 2 == 1 ]
        else:
            d = d.copy()
            d[a[i]] = d.get(a[i], 0) + 1
            return loop(i + 1, d)
    return loop(0, {})
It works the same as above.
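The default-argument concern mentioned above can be seen in a small sketch. The function names `count_shared` and `count_fresh` are made up for illustration; the behaviour shown (a default value is created once, when the function is defined) is standard Python.

```python
def count_shared(item, seen={}):
    # The default dict is created once, at definition time, so every
    # call that omits `seen` mutates the same object.
    seen[item] = seen.get(item, 0) + 1
    return seen


def count_fresh(item, seen=None):
    # Using None as a sentinel gives each call its own dict.
    if seen is None:
        seen = {}
    seen[item] = seen.get(item, 0) + 1
    return seen


first = dict(count_shared("a"))   # snapshot after the first call
second = dict(count_shared("a"))  # snapshot after the second call
print(first, second)              # {'a': 1} {'a': 2}
print(count_fresh("a"), count_fresh("a"))  # {'a': 1} {'a': 1}
```

Copying the dict before writing to it, as the answer does, is another way to keep the shared default untouched.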
I want to check whether each element in a list is smaller than the next one, and stop the function at the moment when it is not. I have written this code and I am not sure what is wrong, because the output is [1] when it should be [1, 3, 4, 5, 6, 7]. It is probably a small error, but I have no one to ask.
def check_order(a):
    i = 0
    while i < (len(a)-1):
        b = []
        if a[i] < a[(i + 1)]:
            b.insert(i, a[i])
            i = i + 1
            return b
        else:
            break
a = [1, 3, 4, 5, 6, 7, 22, 10]
print(check_order(a))
There are two issues: you are redefining b on every loop iteration, and you are also returning early inside the if statement.
def check_order(a):
    i = 0
    b = []
    while i < (len(a) - 1):
        if a[i] < a[(i + 1)]:
            b.insert(i, a[i])
            i = i + 1
        else:
            break
    return b
I think you need to understand what happens when a return statement is executed.
When return runs, control goes back to the place the function was called from.
So during the very first iteration, when execution reaches the 8th line (return b), control goes back to the call on the 11th line. That is why the output is only [1].
You can change it to something like this:
def check_order(a):
    i = 0
    b = []
    while i < (len(a) - 1):
        if a[i] < a[(i + 1)]:
            b.insert(i, a[i])
            i = i + 1
        else:
            break
    return b
a = [1, 3, 4, 5, 6, 7, 22, 10]
print(check_order(a))
Here the return comes after the end of the while loop, so control goes back to the function call only once the loop has finished.
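A stripped-down sketch of the same effect, using hypothetical function names (`first_only`, `all_items`) that are not part of the question:

```python
def first_only(items):
    collected = []
    for item in items:
        collected.append(item)
        return collected  # exits the function during the first iteration
    return collected


def all_items(items):
    collected = []
    for item in items:
        collected.append(item)
    return collected  # runs once, after the loop has finished


print(first_only([1, 3, 4]))  # [1]
print(all_items([1, 3, 4]))   # [1, 3, 4]
```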
My interview question was to return the length of an array after removing duplicates, where at most 2 copies of each value may be kept.
For example, for [1, 1, 1, 2, 2, 3] the new array would be [1, 1, 2, 2, 3], so the new length would be 5. I came up with an algorithm that I believe is O(2n). How can I make it as fast as possible?
def removeDuplicates(nums):
    if nums is None:
        return 0
    if len(nums) == 0:
        return 0
    if len(nums) == 1:
        return 1
    new_array = {}
    for num in nums:
        new_array[num] = new_array.get(num, 0) + 1
    new_length = 0
    for key in new_array:
        if new_array[key] > 2:
            new_length = new_length + 2
        else:
            new_length = new_length + new_array[key]
    return new_length
new_length = removeDuplicates([1, 1, 1, 2, 2, 3])
assert new_length == 5
My first question would be is my algorithm even correct?
Your logic is correct; however, here is a simpler method to reach the goal you mentioned in your question.
Here is my logic.
myl = [1, 1, 1, 2, 2, 3, 1, 1, 1, 2, 2, 3, 1, 1, 1, 2, 2, 3]
newl = []
for i in myl:
    if newl.count(i) != 2:
        newl.append(i)
print newl
[1, 1, 2, 2, 3, 3]
Hope this helps.
Suppose your original array has size n.
Count the distinct numbers in your array.
If you have d distinct numbers, then your answer will be
d (when n == d)
d+1 (when n == d+1)
d+r (when n >= d+2, where r is the number of distinct values that appear at least twice)
If all the numbers in your array are less than n-1, you can even solve this without using any extra space. If that's the case check this and you can count distinct numbers very easily without using extra space.
I'd forget about generating the new array and just focus on counting:
from collections import Counter
def count_non_2dups(nums):
    new_len = 0
    for num, count in Counter(nums).items():
        new_len += min(2, count)
    return new_len
int removeDuplicates(vector<int>& nums) {
    if (nums.size() == 0) return nums.size();
    int state = 1;
    int idx = 1;
    for (int i = 1; i < nums.size(); ++i) {
        if (nums[i] != nums[i-1]) {
            state = 1;
            nums[idx++] = nums[i];
        }
        else if (state == 1) {
            state++;
            nums[idx++] = nums[i];
        }
        else {
            state++;
        }
    }
    return idx;
}
Idea: maintain a variable (state) that records the current repeat count (more precisely, state records how many times the element immediately to the left of the current element has occurred in its run so far). This algorithm is O(n) with a single scan of the array. It relies on equal elements being adjacent, i.e. on the array being sorted.
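For readers following the thread in Python, here is a rough port of the same single-scan idea. It is a sketch, not the answerer's code: the name `remove_duplicates_sorted` and the variables `write` and `run` are my own, and it assumes the input list is already sorted so that equal values are adjacent.

```python
def remove_duplicates_sorted(nums):
    """Keep at most two copies of each value in a sorted list, in place.

    Returns the new length; only nums[:length] is meaningful afterwards.
    """
    if not nums:
        return 0
    write = 1  # next position to write a kept element
    run = 1    # length of the current run of equal values
    for read in range(1, len(nums)):
        if nums[read] != nums[read - 1]:
            run = 1       # a new value starts a new run
        else:
            run += 1      # same value as its left neighbour
        if run <= 2:      # keep only the first two of each run
            nums[write] = nums[read]
            write += 1
    return write


data = [1, 1, 1, 2, 2, 3]
length = remove_duplicates_sorted(data)
print(length, data[:length])  # 5 [1, 1, 2, 2, 3]
```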
def removeDuplicates(nums):
    if nums is None:
        return 0
    if len(nums) == 0:
        return 0
    if len(nums) == 1:
        return 1
    new_array_a = set()
    new_array_b = set()
    while nums:
        i = nums.pop()
        if i not in new_array_a:
            new_array_a.add(i)
        elif i not in new_array_b:
            new_array_b.add(i)
    return len(new_array_a) + len(new_array_b)
I got a problem in TalentBuddy, which sounds like this
A student's performance in lab activities should always improve, but that is not always the case.
Since progress is one of the most important metrics for a student, let’s write a program that computes the longest period of increasing performance for any given student.
For example, if his grades for all lab activities in a course are: 9, 7, 8, 2, 5, 5, 8, 7 then the longest period would be 4 consecutive labs (2, 5, 5, 8).
So far I am too confused to get the code working. The only thing I have managed is
def longest_improvement(grades):
    res = 0
    for i in xrange(len(grades) - 2):
        while grades[i] <= grades[i + 1]:
            res += 1
            i += 1
    print res
But that prints 17, rather than 6 when grades = [1, 7, 2, 5, 6, 9, 11, 11, 1, 6, 1].
How to work out the rest of the code? Thanks
Solved with some old-fashioned tail-recursion:
grades = [1, 7, 2, 5, 6, 9, 11, 11, 1, 6, 1]
def streak(grades):
    def streak_rec(longest, challenger, previous, rest):
        if rest == []:                # Base case
            return max(longest, challenger)
        elif previous <= rest[0]:     # Streak continues
            return streak_rec(longest, challenger + 1, rest[0], rest[1:])
        else:                         # Streak is reset
            return streak_rec(max(longest, challenger), 1, rest[0], rest[1:])
    return streak_rec(0, 0, 0, grades)
print streak(grades) # => 6
print streak([2]) # => 1
Since the current solution involves yield and maps and additional memory overhead, it's probably a good idea to at least mention the simple solution:
def length_of_longest_sublist(lst):
    max_length, cur_length = 1, 1
    prev_val = lst[0]
    for val in lst[1:]:
        if val >= prev_val :
            cur_length += 1
        else:
            max_length = max(max_length, cur_length)
            cur_length = 1
        prev_val = val
    return max(max_length, cur_length)
We could reduce that code by getting the previous value directly:
def length_of_longest_sublist2(lst):
    max_length, cur_length = int(bool(lst)), int(bool(lst))
    for prev_val, val in zip(lst, lst[1:]):
        if val >= prev_val:
            cur_length += 1
        else:
            max_length = max(max_length, cur_length)
            cur_length = 1
    return max(max_length, cur_length)
which is a nice trick to know (and allows it to easily return the right result for an empty list), but confusing to people who don't know the idiom.
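For anyone unfamiliar with the idiom, a short sketch of what `zip(lst, lst[1:])` produces, using the grades from the question's example:

```python
grades = [9, 7, 8, 2, 5, 5, 8, 7]

# zip stops at the shorter input, so pairing a list with itself shifted
# by one yields every (previous, current) pair exactly once.
pairs = list(zip(grades, grades[1:]))
print(pairs[:3])  # [(9, 7), (7, 8), (8, 2)]

# An empty or single-element list simply produces no pairs,
# which is why the loop body is skipped for those inputs.
print(list(zip([], [][1:])))    # []
print(list(zip([4], [4][1:])))  # []
```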
This method uses fairly basic python and the return statement can be quickly modified so that you have a list of all the streak lengths.
def longest_streak(grades):
    if len(grades) < 2:
        return len(grades)
    else:
        start, streaks = -1, []
        for idx, (x, y) in enumerate(zip(grades, grades[1:])):
            if x > y:
                streaks.append(idx - start)
                start = idx
        else:
            streaks.append(idx - start + 1)
        return max(streaks)
I would solve it this way:
from itertools import groupby
from funcy import pairwise, ilen
def streak(grades):
    if len(grades) <= 1:
        return len(grades)
    orders = (x <= y for x, y in pairwise(grades))
    return max(ilen(l) for asc, l in groupby(orders) if asc) + 1
Very explicit: orders is an iterator of Trues for ascending pairs and Falses for descending ones. Then we just need to find the longest run of ascending pairs and add 1.
You're using the same res variable in each iteration of the inner while loop. You probably want to reset it, and keep the highest intermediate result in a different variable.
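One way that advice might look in practice is sketched below. It reuses the question's function name but is not the asker's code; `best` and `current` are names chosen here for the two separate variables.

```python
def longest_improvement(grades):
    if not grades:
        return 0
    best = 1     # longest streak seen so far
    current = 1  # length of the streak ending at the current grade
    for previous, grade in zip(grades, grades[1:]):
        if grade >= previous:
            current += 1
        else:
            current = 1  # streak broken: start counting again
        best = max(best, current)
    return best


print(longest_improvement([1, 7, 2, 5, 6, 9, 11, 11, 1, 6, 1]))  # 6
print(longest_improvement([9, 7, 8, 2, 5, 5, 8, 7]))             # 4
```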
A little bit late, but here's my updated version:
from funcy import ilen, ireductions
def streak(last, x):
    if last and x >= last[-1]:
        last.append(x)
        return last
    return [x]
def longest_streak(grades):
    xs = map(ilen, ireductions(streak, grades, None))
    return xs and max(xs) or 1
grades = [1, 7, 2, 5, 6, 9, 11, 11, 1, 6, 1]
print longest_streak(grades)
print longest_streak([2])
I decided in the end not only to produce a correct version without bugs, but also to use a library I quite like, funcy :)
Output:
6
1
Maybe not as efficient as previous answers, but it's short :P
import numpy as np
from itertools import groupby
diffgrades = np.diff(grades)
maxlen = max([len(list(g)) for k,g in groupby(diffgrades, lambda x: x >= 0) if k]) + 1
Building on the idea of @M4rtini to use itertools.groupby.
def longest_streak(grades):
    from itertools import groupby
    if len(grades) > 1:
        streak = [x <= y for x, y in zip(grades,grades[1:])]
        # [1] covers a strictly decreasing list, where no group is True
        return max([1] + [sum(g, 1) for k, g in groupby(streak) if k])
    else:
        return len(grades)
I am trying to implement the merge sort algorithm described in these notes by Jeff Erickson on page 3, but even though the algorithm is correct and my implementation seems correct, I am getting the input list back as output without any change. Can someone point out the anomalies, if any, in it?
def merge(appnd_lst, m):
    result = []
    n = len(appnd_lst)
    i, j = 0, m
    for k in range(0, n):
        if j < n:
            result.append(appnd_lst[i])
            i += 1
        elif i > m:
            result.append(appnd_lst[j])
            j += 1
        elif appnd_lst[i] < appnd_lst[j]:
            result.append(appnd_lst[i])
            i += 1
        else:
            result.append(appnd_lst[j])
            j += 1
    return result
def mergesort(lst):
    n = len(lst)
    if n > 1:
        m = int(n / 2)
        left = mergesort(lst[:m])
        right = mergesort(lst[m:])
        appnd_lst = left
        appnd_lst.extend(right)
        return merge(appnd_lst, m)
    else:
        return lst
if __name__ == "__main__":
    print mergesort([3, 4, 8, 0, 6, 7, 4, 2, 1, 9, 4, 5])
There are three errors in your merge function: a couple of indexing errors and the wrong comparison operator. Remember that Python list indices go from 0 to len(list)-1.
* ...
6 if j > n-1: # operator wrong and off by 1
* ...
9 elif i > m-1: # off by 1
* ...
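Putting those two corrected conditions back into the question's code gives roughly the following. This is a sketch that keeps the asker's names (`merge`, `mergesort`, `appnd_lst`, `m`); the only liberties taken are the comments, integer division with `//`, and building the combined list with `+`.

```python
def merge(appnd_lst, m):
    """Merge the sorted halves appnd_lst[:m] and appnd_lst[m:]."""
    result = []
    n = len(appnd_lst)
    i, j = 0, m
    for _ in range(n):
        if j > n - 1:      # right half exhausted: take from the left
            result.append(appnd_lst[i])
            i += 1
        elif i > m - 1:    # left half exhausted: take from the right
            result.append(appnd_lst[j])
            j += 1
        elif appnd_lst[i] < appnd_lst[j]:
            result.append(appnd_lst[i])
            i += 1
        else:
            result.append(appnd_lst[j])
            j += 1
    return result


def mergesort(lst):
    if len(lst) <= 1:
        return lst
    m = len(lst) // 2
    return merge(mergesort(lst[:m]) + mergesort(lst[m:]), m)


print(mergesort([3, 4, 8, 0, 6, 7, 4, 2, 1, 9, 4, 5]))
# [0, 1, 2, 3, 4, 4, 4, 5, 6, 7, 8, 9]
```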
When I was struggling with Problem 14 in Project Euler, I discovered that I could use a thing called memoization to speed up my process (I let it run for a good 15 minutes, and it still hadn't returned an answer). The thing is, how do I implement it? I've tried to, but I get a KeyError (the value being returned is invalid). This bugs me because I am positive I can apply memoization to this and make it faster.
lookup = {}
def countTerms(n):
    arg = n
    count = 1
    while n is not 1:
        count += 1
        if not n%2:
            n /= 2
        else:
            n = (n*3 + 1)
        if n not in lookup:
            lookup[n] = count
    return lookup[n], arg
print max(countTerms(i) for i in range(500001, 1000000, 2))
Thanks.
There is also a nice recursive way to do this, which probably will be slower than poorsod's solution, but it is more similar to your initial code, so it may be easier for you to understand.
lookup = {}
def countTerms(n):
    if n not in lookup:
        if n == 1:
            lookup[n] = 1
        elif not n % 2:
            lookup[n] = countTerms(n / 2)[0] + 1
        else:
            lookup[n] = countTerms(n*3 + 1)[0] + 1
    return lookup[n], n
print max(countTerms(i) for i in range(500001, 1000000, 2))
The point of memoising, for the Collatz sequence, is to avoid calculating parts of the list that you've already done. The remainder of a sequence is fully determined by the current value. So we want to check the table as often as possible, and bail out of the rest of the calculation as soon as we can.
def collatz_sequence(start, table={}):  # cheeky trick: store the (mutable) table as a default argument
    """Returns the Collatz sequence for a given starting number"""
    l = []
    n = start
    while n not in l:  # break if we find ourself in a cycle
                       # (don't assume the Collatz conjecture!)
        if n in table:
            l += table[n]
            break
        elif n%2 == 0:
            l.append(n)
            n = n//2
        else:
            l.append(n)
            n = (3*n) + 1
    table.update({n: l[i:] for i, n in enumerate(l) if n not in table})
    return l
Is it working? Let's spy on it to make sure the memoised elements are being used:
class NoisyDict(dict):
    def __getitem__(self, item):
        print("getting", item)
        return dict.__getitem__(self, item)
def collatz_sequence(start, table=NoisyDict()):
    # etc
In [26]: collatz_sequence(5)
Out[26]: [5, 16, 8, 4, 2, 1]
In [27]: collatz_sequence(5)
getting 5
Out[27]: [5, 16, 8, 4, 2, 1]
In [28]: collatz_sequence(32)
getting 16
Out[28]: [32, 16, 8, 4, 2, 1]
In [29]: collatz_sequence.__defaults__[0]
Out[29]:
{1: [1],
2: [2, 1],
4: [4, 2, 1],
5: [5, 16, 8, 4, 2, 1],
8: [8, 4, 2, 1],
16: [16, 8, 4, 2, 1],
32: [32, 16, 8, 4, 2, 1]}
Edit: I knew it could be optimised! The secret is that there are two places in the function (the two return points) that we know l and table share no elements. While previously I avoided calling table.update with elements already in table by testing them, this version of the function instead exploits our knowledge of the control flow, saving lots of time.
[collatz_sequence(x) for x in range(500001, 1000000)] now times around 2 seconds on my computer, while a similar expression with @welter's version clocks in 400ms. I think this is because the functions don't actually compute the same thing - my version generates the whole sequence, while @welter's just finds its length. So I don't think I can get my implementation down to the same speed.
def collatz_sequence(start, table={}):  # cheeky trick: store the (mutable) table as a default argument
    """Returns the Collatz sequence for a given starting number"""
    l = []
    n = start
    while n not in l:  # break if we find ourself in a cycle
                       # (don't assume the Collatz conjecture!)
        if n in table:
            table.update({x: l[i:] for i, x in enumerate(l)})
            return l + table[n]
        elif n%2 == 0:
            l.append(n)
            n = n//2
        else:
            l.append(n)
            n = (3*n) + 1
    table.update({x: l[i:] for i, x in enumerate(l)})
    return l
PS - spot the bug!
This is my solution to PE14:
memo = {1:1}
def get_collatz(n):
    if n in memo : return memo[n]
    if n % 2 == 0:
        terms = get_collatz(n/2) + 1
    else:
        terms = get_collatz(3*n + 1) + 1
    memo[n] = terms
    return terms
compare = 0
for x in xrange(1, 999999):
    if x not in memo:
        ctz = get_collatz(x)
        if ctz > compare:
            compare = ctz
            culprit = x
print culprit
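The memoisation idea discussed in this thread (stop as soon as a cached value is reached, then record everything learned on the way) can also be sketched without recursion. The function name `collatz_length` and its `cache` parameter are my own; the sketch assumes positive integer input and, unlike one answer above, assumes every sequence does reach 1.

```python
def collatz_length(start, cache={1: 1}):
    # `cache` is a deliberately shared mutable default, used as the memo table.
    path = []
    n = start
    # Walk forward only until a value with a known length is reached.
    while n not in cache:
        path.append(n)
        n = n // 2 if n % 2 == 0 else 3 * n + 1
    length = cache[n]
    # Backfill: each value on the path is one term longer than its successor.
    for value in reversed(path):
        length += 1
        cache[value] = length
    return cache[start]


print(collatz_length(5))   # 6  (5, 16, 8, 4, 2, 1)
print(collatz_length(32))  # 6  (32, 16, 8, 4, 2, 1)
```

Because it stores only lengths and uses a loop instead of recursive calls, this variant avoids both the per-sequence lists and any concern about recursion depth.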