I was solving the Find the min problem on facebook hackercup using python, my code works fine for sample inputs but for large inputs(10^9) it is taking hours to complete.
So, is it possible that the solution of that problem can't be computed within 6 minutes using python? Or may be my approaches are too bad?
Problem statement:
After sending smileys, John decided to play with arrays. Did you know that hackers enjoy playing with arrays? John has a zero-based index array, m, which contains n non-negative integers. However, only the first k values of the array are known to him, and he wants to figure out the rest.
John knows the following: for each index i, where k <= i < n, m[i] is the minimum non-negative integer which is not contained in the previous *k* values of m.
For example, if k = 3, n = 4 and the known values of m are [2, 3, 0], he can figure out that m[3] = 1.
John is very busy making the world more open and connected, as such, he doesn't have time to figure out the rest of the array. It is your task to help him.
Given the first k values of m, calculate the nth value of this array. (i.e. m[n - 1]).
Because the values of n and k can be very large, we use a pseudo-random number generator to calculate the first k values of m. Given positive integers a, b, c and r, the known values of m can be calculated as follows:
m[0] = a
m[i] = (b * m[i - 1] + c) % r, 0 < i < k
Input
The first line contains an integer T (T <= 20), the number of test
cases.
This is followed by T test cases, consisting of 2 lines each.
The first line of each test case contains 2 space separated integers,
n, k (1 <= k <= 10^5, k < n <= 10^9).
The second line of each test case contains 4 space separated integers
a, b, c, r (0 <= a, b, c <= 10^9, 1 <= r <= 10^9).
I tried two approaches but both failed to return results in 6 minutes, Here's my two approaches:
first:
import sys
cases=sys.stdin.readlines()
def func(line1,line2):
n,k=map(int,line1.split())
a,b,c,r =map(int,line2.split())
m=[None]*n #initialize the list
m[0]=a
for i in xrange(1,k): #set the first k values using the formula
m[i]= (b * m[i - 1] + c) % r
#print m
for j in range(0,n-k): #now set the value of m[k], m[k+1],.. upto m[n-1]
temp=set(m[j:k+j]) # create a set from the K values relative to current index
i=-1 #start at 0, lowest +ve integer
while True:
i+=1
if i not in temp: #if that +ve integer is not present in temp
m[k+j]=i
break
return m[-1]
for ind,case in enumerate(xrange(1,len(cases),2)):
ans=func(cases[case],cases[case+1])
print "Case #{0}: {1}".format(ind+1,ans)
Second:
import sys
cases=sys.stdin.readlines()
def func(line1,line2):
n,k=map(int,line1.split())
a,b,c,r =map(int,line2.split())
m=[None]*n #initialize
m[0]=a
for i in xrange(1,k): #same as above
m[i]= (b * m[i - 1] + c) % r
#instead of generating a set in each iteration , I used a
# dictionary this time.
#Now, if the count of an item is 0 then it
#means the item is not present in the previous K items
#and can be added as the min value
temp={}
for x in m[0:k]:
temp[x]=temp.get(x,0)+1
i=-1
while True:
i+=1
if i not in temp:
m[k]=i #set the value of m[k]
break
for j in range(1,n-k): #now set the values of m[k+1] to m[n-1]
i=-1
temp[m[j-1]] -= 1 #decrement it's value, as it is now out of K items
temp[m[k+j-1]]=temp.get(m[k+j-1],0)+1 # new item added to the current K-1 items
while True:
i+=1
if i not in temp or temp[i]==0: #if i not found in dict or it's val is 0
m[k+j]=i
break
return m[-1]
for ind,case in enumerate(xrange(1,len(cases),2)):
ans=func(cases[case],cases[case+1])
print "Case #{0}: {1}".format(ind+1,ans)
The last for-loop in second approach can also be written as :
for j in range(1,n-k):
i=-1
temp[m[j-1]] -= 1
if temp[m[j-1]]==0:
temp.pop(m[j-1]) #same as above but pop the key this time
temp[m[k+j-1]]=temp.get(m[k+j-1],0)+1
while True:
i+=1
if i not in temp:
m[k+j]=i
break
sample input :
5
97 39
34 37 656 97
186 75
68 16 539 186
137 49
48 17 461 137
98 59
6 30 524 98
46 18
7 11 9 46
output:
Case #1: 8
Case #2: 38
Case #3: 41
Case #4: 40
Case #5: 12
I already tried codereview, but no one replied there yet.
After at most k+1 steps, the last k+1 numbers in the array will be 0...k (in some order). Subsequently, the sequence is predictable: m[i] = m[i-k-1]. So the way to solve this problem is run your naive implementation for k+1 steps. Then you've got an array with 2k+1 elements (the first k were generated from the random sequence, and the other k+1 from iterating).
Now, the last k+1 elements are going to repeat infinitely. So you can just return the result for m[n] immediately: it's m[k + (n-k-1) % (k+1)].
Here's some code that implements it.
import collections
def initial_seq(k, a, b, c, r):
v = a
for _ in xrange(k):
yield v
v = (b * v + c) % r
def find_min(n, k, a, b, c, r):
m = [0] * (2 * k + 1)
for i, v in enumerate(initial_seq(k, a, b, c, r)):
m[i] = v
ks = range(k+1)
s = collections.Counter(m[:k])
for i in xrange(k, len(m)):
m[i] = next(j for j in ks if not s[j])
ks.remove(m[i])
s[m[i-k]] -= 1
return m[k + (n - k - 1) % (k + 1)]
print find_min(97, 39, 34, 37, 656, 97)
print find_min(186, 75, 68, 16, 539, 186)
print find_min(137, 49, 48, 17, 461, 137)
print find_min(1000000000, 100000, 48, 17, 461, 137)
The four cases run in 4 seconds on my machine, and the last case has the largest possible n.
Here is my O(k) solution, which is based on the same idea as above, but runs much faster.
import os, sys
f = open(sys.argv[1], 'r')
T = int(f.readline())
def next(ary, start):
j = start
l = len(ary)
ret = start - 1
while j < l and ary[j]:
ret = j
j += 1
return ret
for t in range(T):
n, k = map(int, f.readline().strip().split(' '))
a, b, c, r = map(int, f.readline().strip().split(' '))
m = [0] * (4 * k)
s = [0] * (k+1)
m[0] = a
if m[0] <= k:
s[m[0]] = 1
for i in xrange(1, k):
m[i] = (b * m[i-1] + c) % r
if m[i] < k+1:
s[m[i]] += 1
p = next(s, 0)
m[k] = p + 1
p = next(s, p+2)
for i in xrange(k+1, n):
if m[i-k-1] > p or s[m[i-k-1]] > 1:
m[i] = p + 1
if m[i-k-1] <= k:
s[m[i-k-1]] -= 1
s[m[i]] += 1
p = next(s, p+2)
else:
m[i] = m[i-k-1]
if p == k:
break
if p != k:
print 'Case #%d: %d' % (t+1, m[n-1])
else:
print 'Case #%d: %d' % (t+1, m[i-k + (n-i+k+k) % (k+1)])
The key point here is, m[i] will never exceeds k, and if we remember the consecutive numbers we can find in previous k numbers from 0 to p, then p will never reduce.
If number m[i-k-1] is larger than p, then it's obviously we should set m[i] to p+1, and p will increase at least 1.
If number m[i-k-1] is smaller or equal to p, then we should consider whether the same number exists in m[i-k:i], if not, m[i] should set equal to m[i-k-1], if yes, we should set m[i] to p+1 just as the "m[i-k-1]-larger-than-p" case.
Whenever p is equal to k, the loop begin, and the loop size is (k+1), so we can jump out of the calculation and print out the answer now.
I enhanced the performance through adding map.
import sys, os
import collections
def min(str1, str2):
para1 = str1.split()
para2 = str2.split()
n = int(para1[0])
k = int(para1[1])
a = int(para2[0])
b = int(para2[1])
c = int(para2[2])
r = int(para2[3])
m = [0] * (2*k+1)
m[0] = a
s = collections.Counter()
s[a] += 1
rs = {}
for i in range(k+1):
rs[i] = 1
for i in xrange(1,k):
v = (b * m[i - 1] + c) % r
m[i] = v
s[v] += 1
if v < k:
if v in rs:
rs[v] -= 1
if rs[v] == 0:
del rs[v]
for j in xrange(0,k+1):
for t in rs:
if not s[t]:
m[k+j] = t
if m[j] < k:
if m[j] in rs:
rs[m[j]] += 1
else:
rs[m[j]] = 0
rs[t] -= 1
if rs[t] == 0:
del rs[t]
s[t] = 1
break
s[m[j]] -= 1
return m[k + ((n-k-1)%(k+1))]
if __name__=='__main__':
lines = []
user_input = raw_input()
num = int(user_input)
for i in xrange(num):
input1 = raw_input()
input2 = raw_input()
print "Case #%s: %s"%(i+1, min(input1, input2))
Related
By using this equation:
second order Central finite difference coefficients equation
I need to find the coefficients given it for a second order differential. Without using a library that will just do it for me.
Central finite difference coefficients table
I tried using nested for loops to get each number but for some reason it never went to one of my elif sections.I have print values there to check when it does go there and what values the variables are.
I am not getting any error just the incorrect numbers.
"""
Filter for 2nd derivative
"""
n = 2
k = n/2
#k = range(-k,k+1)
#k = list(k)
h = 0.01
for j in range(0, n+1):
Isum = 0
for i in range(0, n+1):
Msum = 0
if i == j:
continue
else:
for m in range(0, n+1):
product = 1
if m == j or m == i:
continue
else:
for l in range(0, n+1):
if l == j or l == i or l == m:
continue
print("j is equal to:", + j)
print("i is equal to:", + l)
print("j - i is equal to:", + j - l)
product = product * ((k-l)/j-l)
product = product * (1/(j - m))
Msum += product
Msum = Msum * (1/(j-i))
Isum += Msum
print(Isum)
I need to find the biggest sequence of zeros next to each other (up down left right).
for example in this example the function should return 6
mat = [[1,**0**,**0**,3,0],
[**0**,**0**,2,3,0],
[2,**0**,**0**,2,0],
[0,1,2,3,3],]
the zeros that i marked as bold should be the answer (6)
the solution should be implemented without any loop (using recursion)
this is what i tried so far
def question_3_b(some_list,index_cord):
y = index_cord[0]
x = index_cord[1]
list_of_nums = []
def main(some_list,index_cord):
y = index_cord[0]
x = index_cord[1]
def check_right(x,y):
if x + 1 < 0:
return 0
if some_list[y][x+1] == 0:
main(some_list,(y,x+1))
else:
return 0
def check_left(x,y):
if x -1 < 0:
return 0
if some_list[y][x - 1] == 0:
main(some_list,(y, x - 1))
def check_down(x,y):
if y + 1 < 0:
return 0
try:
if some_list[y + 1][x] == 0:
main(some_list,(y + 1, x))
except:
print("out of range")
def check_up(x,y):
counter_up = 0
if y - 1 < 0:
return 0
if some_list[y - 1][x] == 0:
counter_up += 1
main(some_list,(y - 1, x))
list_of_nums.append((x,y))
right = check_right(x,y)
down = check_down(x,y)
left = check_left(x,y)
up = check_up(x, y)
main(some_list,index_cord)
print(list_of_nums)
question_3_b(mat,(0,1))
Solution #1: classic BFS
As I mention in a comment, you can tackle this problem using BFS (Breadth First Search), it will be something like this:
# This function will give the valid adjacent positions
# of a given position according the matrix size (NxM)
def valid_adj(i, j, N, M):
adjs = [[i + 1, j], [i - 1, j], [i, j + 1], [i, j - 1]]
for a_i, a_j in adjs:
if 0 <= a_i < N and 0 <= a_j < M:
yield a_i, a_j
def biggest_zero_chunk(mat):
answer = 0
N, M = len(mat), len(mat[0])
# Mark all non zero position as visited (we are not instrested in them)
mask = [[mat[i][j] != 0 for j in range(M)] for i in range(N)]
queue = []
for i in range(N):
for j in range(M):
if mask[i][j]: # You have visited this position
continue
# Here comes the BFS
# It visits all the adjacent zeros recursively,
# count them and mark them as visited
current_ans = 1
queue = [[i,j]]
while queue:
pos_i, pos_j = queue.pop(0)
mask[pos_i][pos_j] = True
for a_i, a_j in valid_adj(pos_i, pos_j, N, M):
if mat[a_i][a_j] == 0 and not mask[a_i][a_j]:
queue.append([a_i, a_j])
current_ans += 1
answer = max(answer, current_ans)
return answer
mat = [[1,0,0,3,0],
[0,0,2,3,0],
[2,0,0,2,0],
[0,1,2,3,3],]
mat2 = [[1,0,0,3,0],
[0,0,2,3,0],
[2,0,0,0,0], # A slight modification in this row to connect two chunks
[0,1,2,3,3],]
print(biggest_zero_chunk(mat))
print(biggest_zero_chunk(mat2))
Output:
6
10
Solution #2: using only recursion (no for statements)
def count_zeros(mat, i, j, N, M):
# Base case
# Don't search zero chunks if invalid position or non zero values
if i < 0 or i >= N or j < 0 or j >= M or mat[i][j] != 0:
return 0
ans = 1 # To count the current zero we start at 1
mat[i][j] = 1 # To erase the current zero and don't count it again
ans += count_zeros(mat, i - 1, j, N, M) # Up
ans += count_zeros(mat, i + 1, j, N, M) # Down
ans += count_zeros(mat, i, j - 1, N, M) # Left
ans += count_zeros(mat, i, j + 1, N, M) # Right
return ans
def biggest_zero_chunk(mat, i = 0, j = 0, current_ans = 0):
N, M = len(mat), len(mat[0])
# Base case (last position of mat)
if i == N - 1 and j == M - 1:
return current_ans
next_j = (j + 1) % M # Move to next column, 0 if j is the last one
next_i = i + 1 if next_j == 0 else i # Move to next row if j is 0
ans = count_zeros(mat, i, j, N, M) # Count zeros from this position
current_ans = max(ans, current_ans) # Update the current answer
return biggest_zero_chunk(mat, next_i, next_j, current_ans) # Check the rest of mat
mat = [[1,0,0,3,0],
[0,0,2,3,0],
[2,0,0,2,0],
[0,1,2,3,3],]
mat2 = [[1,0,0,3,0],
[0,0,2,3,0],
[2,0,0,0,0], # A slight modification in this row to connect two chunks
[0,1,2,3,3],]
print(biggest_zero_chunk(mat.copy()))
print(biggest_zero_chunk(mat2.copy()))
Output:
6
10
Notes:
The idea behind this solution is still BFS (represented mainly in the count_zeros function). Also, if you are interested in using the matrix values after this you should call the biggest_zero_chunk with a copy of the matrix (because it is modified in the algorithm)
import random, timeit
#Qucik sort
def quick_sort(A,first,last):
global Qs,Qc
if first>=last: return
left, right= first+1, last
pivot = A[first]
while left <= right:
while left <=last and A[left]<pivot:
Qc= Qc+1
left= left + 1
while right > first and A[right] >= pivot:
Qc=Qc+1
right = right -1
if left <= right:
A[left],A[right]=A[right],A[left]
Qs = Qs+1
left= left +1
right= right-1
A[first],A[right]=A[right],A[first]
Qs=Qs+1
quick_sort(A,first,right-1)
quick_sort(A,right+1,last)
#Merge sort
def merge_sort(A, first, last): # merge sort A[first] ~ A[last]
global Ms,Mc
if first >= last: return
middle = (first+last)//2
merge_sort(A, first, middle)
merge_sort(A, middle+1, last)
B = []
i = first
j = middle+1
while i <= middle and j <= last:
Mc=Mc+1
if A[i] <= A[j]:
B.append(A[i])
i += 1
else:
B.append(A[j])
j += 1
for i in range(i, middle+1):
B.append(A[i])
Ms=Ms+1
for j in range(j, last+1):
B.append(A[j])
for k in range(first, last+1): A[k] = B[k-first]
#Heap sort
def heap_sort(A):
global Hs, Hc
n = len(A)
for i in range(n - 1, -1, -1):
while 2 * i + 1 < n:
left, right = 2 * i + 1, 2 * i + 2
if left < n and A[left] > A[i]:
m = left
Hc += 1
else:
m = i
Hc += 1
if right < n and A[right] > A[m]:
m = right
Hc += 1
if m != i:
A[i], A[m] = A[m], A[i]
i = m
Hs += 1
else:
break
for i in range(n - 1, -1, -1):
A[0], A[i] = A[i], A[0]
n -= 1
k = 0
while 2 * k + 1 < n:
left, right = 2 * k + 1, 2 * k + 2
if left < n and A[left] > A[k]:
m = left
Hc += 1
else:
m = k
Hc += 1
if right < n and A[right] > A[m]:
m = right
Hc += 1
if m != k:
A[k], A[m] = A[m], A[k]
k = m
Hs += 1
else:
break
#
def check_sorted(A):
for i in range(n-1):
if A[i] > A[i+1]: return False
return True
#
#
Qc, Qs, Mc, Ms, Hc, Hs = 0, 0, 0, 0, 0, 0
n = int(input())
random.seed()
A = []
for i in range(n):
A.append(random.randint(-1000,1000))
B = A[:]
C = A[:]
print("")
print("Quick sort:")
print("time =", timeit.timeit("quick_sort(A, 0, n-1)", globals=globals(), number=1))
print(" comparisons = {:10d}, swaps = {:10d}\n".format(Qc, Qs))
print("Merge sort:")
print("time =", timeit.timeit("merge_sort(B, 0, n-1)", globals=globals(), number=1))
print(" comparisons = {:10d}, swaps = {:10d}\n".format(Mc, Ms))
print("Heap sort:")
print("time =", timeit.timeit("heap_sort(C)", globals=globals(), number=1))
print(" comparisons = {:10d}, swaps = {:10d}\n".format(Hc, Hs))
assert(check_sorted(A))
assert(check_sorted(B))
assert(check_sorted(C))
I made the code that tells how much time it takes to sort list size n(number input) with 3 ways of sorts. However, I found that my result is quite unexpected.
Quick sort:
time = 0.0001289689971599728
comparisons = 474, swaps = 168
Merge sort:
time = 0.00027709499408956617
comparisons = 541, swaps = 80
Heap sort:
time = 0.0002578190033091232
comparisons = 744, swaps = 478
Quick sort:
time = 1.1767549149953993
comparisons = 3489112, swaps = 352047
Merge sort:
time = 0.9040642600011779
comparisons = 1536584, swaps = 77011
Heap sort:
time = 1.665754442990874
comparisons = 2227949, swaps = 1474542
Quick sort:
time = 4.749891302999458
comparisons = 11884246, swaps = 709221
Merge sort:
time = 3.1966246420051903
comparisons = 3272492, swaps = 154723
Heap sort:
time = 6.2041203819972
comparisons = 4754829, swaps = 3148479
as you see, my results are very different from what I learned. Can you please tell me why quick sort is not the fastest in my code? and why merge is the fastest one.
I can see that you are choosing the first element of the array as the pivot in quicksort. Now, consider the order of the elements of the unsorted array. Is it random? How do you generate the input array?
You see, if the pivot was either the min or max value of the aray, or somewhere close to the mind/max value, the running time of quicksort in that case (worst case) will be in the order of O(n^2). That is because on each iteration, you are partitioning the arry by breaking off only one element.
For optimal quicksort performance of O(n log n), your pivot should be as close to the median value as possible. In order to increase the likelihood of that being the case, consider initially picking 3 values at random in from the array, and use the median value as the pivot. Obviously, the more values you choose the median from initially the better the probability that your pivot is more efficient, but you are adding extra moves by choosing those values to begin with, so it's a trade off. I imagine one would even be able to calculate exactly how many elements should be selected in relation to the size of the array for optimal performance.
Merge sort on the other hand, always has the complexity in the order of O(n log n) irrespective of input, which is why you got consistent results with it over different samples.
TL:DR my guess is that the input array's first element is very close to being the smallest or largest value of that array, and it ends up being the pivot of your quicksort algorithm.
I want to make a pattern as shown below:
N = 1
*
N = 3
*
***
*
N = 5
*
***
*****
***
*
and so on...
Nevertheless, I write this program only works for N = 5. Could anybody please explain what the mistake is and what the solution is? Thanks.
N = int(input())
k = 1
for i in range(1, N - 1):
for j in range(1, N + 1):
if(j <= N - k):
print(' ', end = '')
else:
print('*', end = '')
k = k + 2
print()
k = 2
for i in range(1, N - 2):
for j in range(1, N + 1):
if(j >= N - k):
print('*', end = '')
else:
print(' ', end = '')
k = k - 2
print()
You are trying to print a pyramid of N lines with two for-loops using range. So the sum of of the lengths of your two ranges should be N.
Now consider the sum of for i in range(1, N - 1) and for i in range(1, N - 2) with some large N. Take N=99 for example:
len(range(1, 99 - 1)) + len(range(1, 99 - 2))
>>> 193
This is your first mistake. The function only works with N=5 because of the two minuses you have chosen for your ranges. What you need to do here is to rewrite your math:
len(range((99+1)//2)) + len(range((99-1)//2))
>>> 99
Note that I removed the starting value from ranges, as it is not needed here. By default, range starts to count from 0 and ends one number before the ending value.
The second mistake you have is with how you are printing the bottom half of the triangle. Basically, you have tried to reverse everything to print everything in reverse, but that has somehow broken the logic. What here we want to do is to print k empty spaces for each line, incrementing k by two for each line.
Function with minimal fixes:
N = int(input())
k = 1
for i in range((N+1)//2):
for j in range(1, N + 1):
if(j <= N - k):
print(' ', end = '')
else:
print('*', end = '')
k = k + 2
print()
k = 2
for i in range((N-1)//2):
for j in range(1, N + 1):
if(j > k):
print('*', end = '')
else:
print(' ', end = '')
k = k + 2
print()
List comprehension is one pythonic trick to avoid multiple nested loops and can be very efficient. Another general guideline is to break down into smaller testable functions
N = 5
def print_pyramid(n):
for i in [min(2*i+1, 2*n-2*i-1) for i in range(n)]:
print((" " * (n-i)) + ("*" * i))
k = 1
while k <= N:
print("N = " + str(k))
print_pyramid(k)
print("")
k += 2
The longest arithmetic progression subsequence problem is as follows. Given an array of integers A, devise an algorithm to find the longest arithmetic progression in it. In other words find a sequence i1 < i2 < … < ik, such that A[i1], A[i2], …, A[ik] form an arithmetic progression, and k is maximal. The following code solves the problem in O(n^2) time and space. (Modified from http://www.geeksforgeeks.org/length-of-the-longest-arithmatic-progression-in-a-sorted-array/ . )
#!/usr/bin/env python
import sys
def arithmetic(arr):
n = len(arr)
if (n<=2):
return n
llap = 2
L = [[0]*n for i in xrange(n)]
for i in xrange(n):
L[i][n-1] = 2
for j in xrange(n-2,0,-1):
i = j-1
k = j+1
while (i >=0 and k <= n-1):
if (arr[i] + arr[k] < 2*arr[j]):
k = k + 1
elif (arr[i] + arr[k] > 2*arr[j]):
L[i][j] = 2
i -= 1
else:
L[i][j] = L[j][k] + 1
llap = max(llap, L[i][j])
i = i - 1
k = j + 1
while (i >=0):
L[i][j] = 2
i -= 1
return llap
arr = [1,4,5,7,8,10]
print arithmetic(arr)
This outputs 4.
However I would like to be able to find arithmetic progressions where up to one value is missing. So if arr = [1,4,5,8,10,13] I would like it to report that there is a progression of length 5 with one value missing.
Can this be done efficiently?
Adapted from my answer to Longest equally-spaced subsequence. n is the length of A, and d is the range, i.e. the largest item minus the smallest item.
A = [1, 4, 5, 8, 10, 13] # in sorted order
Aset = set(A)
for d in range(1, 13):
already_seen = set()
for a in A:
if a not in already_seen:
b = a
count = 1
while b + d in Aset:
b += d
count += 1
already_seen.add(b)
# if there is a hole to jump over:
if b + 2 * d in Aset:
b += 2 * d
count += 1
while b + d in Aset:
b += d
count += 1
# don't record in already_seen here
print "found %d items in %d .. %d" % (count, a, b)
# collect here the largest 'count'
I believe that this solution is still O(n*d), simply with larger constants than looking without a hole, despite the two "while" loops inside the two nested "for" loops. Indeed, fix a value of d: then we are in the "a" loop that runs n times; but each of the inner two while loops run at most n times in total over all values of a, giving a complexity O(n+n+n) = O(n) again.
Like the original, this solution is adaptable to the case where you're not interested in the absolute best answer but only in subsequences with a relatively small step d: e.g. n might be 1'000'000, but you're only interested in subsequences of step at most 1'000. Then you can make the outer loop stop at 1'000.