Google CodeJam Past Exercise - Decrease runtime - python

I have been working on a past Google Codejam algorithm from 2010, but the time complexity is awful.
Here is the question from Google Codejam: https://code.google.com/codejam/contest/619102/dashboard
TLDR - Imagine two towers that each have a number line running up the side. We draw a line from a point on one building's number line (say 10) to a point on the other building's number line (say 1). If we do this n times, how many times will those lines intersect?
I was wondering if anyone here is able to suggest a way in which I can speed up my algorithm? After 4 hours I really can't see one and I'm losing my miinnnnddd.
Here is my code as of right now.
An example input would be:
2 - (Number of cases)
3 - (Number of wires in case # 1)
1 10
5 5
7 7
Case #1: 2 - (2 intersections among lines 1,10 5,5 7,7)
2 - (Number of wires in case #2)
5 5
2 2
Case #2: 0 - (No lines intersect)
def solve(wire_ints, test_case):
    answer_integer = 0
    for iterI in range(number_wires):
        for iterJ in range(iterI):
            holder = [wire_ints[iterI], wire_ints[iterJ]]
            holder.sort()
            if holder[0][1] > holder[1][1]:
                answer_integer = answer_integer + 1
    return("Case #" + str(test_case) + ":" + " " + str(answer_integer))

for test_case in range(1, int(input()) + 1):
    number_wires = int(input())
    wire_ints = []
    for count1 in range(number_wires):
        left_port,right_port = map(int, input().split())
        wire_ints.append((left_port,right_port))
    answer_string = solve(wire_ints, test_case)
    print(answer_string)
This algorithm does WORK for any input I give it, but as I said it's very ugly and slow.
Help would be appreciated!

Since N is at most 1000, an O(N^2) algorithm is acceptable. What you have to do is sort the wires by one of their end points.
//sorted by first number
1 10
5 5
7 7
Then you process each line from the beginning and check whether it intersects any of the lines before it. If the second end point of an earlier line is bigger than the second end point of the current line, the two intersect. This requires two nested loops, hence the O(N^2) complexity, which suffices for N=1000. You can also interpret this as an inversion count: you have to count the number of inversions among the second end points when the list is sorted by first end point.
10 5 7 -> number of inversions is 2, because of (10,5) and (10,7)
There is also an O(N log N) approach to counting inversions, which you don't need for this question.
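For reference, the O(N log N) inversion count mentioned above could look roughly like this. It is a minimal sketch using a merge sort, assuming all end points are distinct (as in the problem); the name count_intersections is made up for illustration and is not from the question.

```python
def count_intersections(wires):
    """Count crossing pairs among wires given as (left, right) tuples."""
    # Sort by left end point, then count inversions among the right end points.
    rights = [right for _, right in sorted(wires)]

    def sort_count(seq):
        # Returns (sorted copy of seq, number of inversions in seq).
        if len(seq) <= 1:
            return seq, 0
        mid = len(seq) // 2
        left, left_inv = sort_count(seq[:mid])
        right, right_inv = sort_count(seq[mid:])
        merged = []
        inversions = left_inv + right_inv
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                # right[j] is smaller than every remaining element of left,
                # so it forms an inversion with each of them.
                merged.append(right[j])
                j += 1
                inversions += len(left) - i
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged, inversions

    return sort_count(rights)[1]
```

Calling count_intersections([(1, 10), (5, 5), (7, 7)]) reproduces the first sample case from the question.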

Related

PYTHON - "Love for Mathematics"

I just finished a challenge on Dcoder ("Love for Mathematics") using Python. I failed two test cases, but got one right. I used fairly basic Python, as I haven't explored much more yet, so I'm sorry if it looks a bit too basic. The challenge reads:
Students of Dcoder school love Mathematics. They love to read a variety of Mathematics books. To make sure they remain happy, their Mathematics teacher decided to get more books for them.
A student would become happy if there are at least X Mathematics books in the class and not more than Y books, because they know "All work and no play makes Jack a dull boy". The teacher wants to buy a minimum number of books to make the maximum number of students happy.
The Input
The first line of input contains an integer N indicating the number of students in the class. This is followed up by N lines where every line contains two integers X and Y respectively.
#Sample Input
5
3 6
1 6
7 11
2 15
5 8
The Output
Output two space-separated integers that denote the minimum number of mathematics books required and the maximum number of happy students.
Explanation: The teacher could buy 5 books and keep student 1, 2, 4 and 5 happy.
#Sample Output
5 4
Constraints:
1 <= N <= 10000
1 <= X, Y <= 10^9
My code:
n = int(input())
l = []
mi = []
ma = []
for i in range(n):
    x, y = input().split()
    mi.append(int(x))
    ma.append(int(y))
    if i == 0:
        h=ma[0]
    else:
        if ma[i]>h:
            h=ma[i]
for i in range(h):
    c = 0
    for j in range(len(mi)):
        if ma[j]>=i and mi[j]<=i:
            c+=1
    l.append(c)
great = max(l)
for i in range(1,len(l)+1):
    if l[i]==great:
        print(i,l[i])
        break
My Approach:
I first assigned the two minimum and maximum variables to two different lists - one containing the minimum values, and the other, the maximum. Then I created a loop that processes all numbers from 0 to the maximum possible value of the list containing maximum values and increasing the count for each no. by 1 every time it lies within the favorable range of students.
In this specific case, I got that count list to be (for the above given input):
[1,2,3,3,4,4,3,3,2 ...] and so on. So I could finalize that 4 would be the maximum no. of students and that the first index of 4 in the list would be the minimum no. of textbooks required.
But only 1 test-case worked and two failed. I would really appreciate it if anyone could help me out here.
Thank You.
This problem is similar to the minimum platform problem.
As in that problem, you need to sort the min and max books arrays separately, each in ascending order. Try to understand the problem from the above link (platform problem); then this will be a piece of cake.
Here is your solution:
n = int(input())
min_books = []
max_books = []
for i in range(n):
    x, y = input().split()
    min_books.append(int(x))
    max_books.append(int(y))
min_books.sort()
max_books.sort()
happy_st_result = 1
happy_st = 1
books_needed = min_books[0]
i = 1
j = 0
while (i < n and j < n):
    if (min_books[i] <= max_books[j]):
        happy_st+= 1
        i+= 1
    elif (min_books[i] > max_books[j]):
        happy_st-= 1
        j+= 1
    if happy_st > happy_st_result:
        happy_st_result = happy_st
        books_needed = min_books[i-1]
print(books_needed, happy_st_result)
Try this, and let me know if you need any clarification.
@Vinay Gupta's logic and explanation are correct. If you think along those lines, the answer should become immediately clear to you.
I have implemented the same logic in my code below, except using fewer lines and some built-in Python functions.
# python 3.7.1
import itertools
d = {}
for _ in range(int(input())):
    x, y = map(int, input().strip().split())
    d.setdefault(x, [0, 0])[0] += 1
    d.setdefault(y, [0, 0])[1] += 1
a = list(sorted(d.items(), key=lambda x: x[0]))
vals = list(itertools.accumulate(list(map(lambda x: x[1][0] - x[1][1], a))))
print(a[vals.index(max(vals))][0], max(vals))
The above answer got accepted in Dcoder too.

A while loop time complexity

I'm interested in determining the big O time complexity of the following:
def f(x):
    r = x / 2
    d = 1e-10
    while abs(x - r**2) > d:
        r = (r + x/r) / 2
    return r
I believe this is O(log n). To arrive at this, I merely collected empirical data via the timeit module, plotted the results, and saw a plot that looked logarithmic, using the following code:
import timeit
import numpy as np
import matplotlib.pyplot as plt

ns = np.linspace(1, 50_000, 100, dtype=int)
ts = [timeit.timeit('f({})'.format(n),
                    number=100,
                    globals=globals())
      for n in ns]
plt.plot(ns, ts, 'or')
But this seems like a corny way to go about figuring this out. Intuitively, I understand that the body of the while loop involves dividing an expression by 2 some number k times until the while expression is equal to d. This repeated division by 2 gives something like 1/2^k, from which I can see where a log is involved to solve for k. I can't seem to write down a more explicit derivation, though. Any help?
This is Heron's (or Babylonian) method for calculating the square root of a number. https://en.wikipedia.org/wiki/Methods_of_computing_square_roots
Big O notation for this requires a numerical analysis approach. For more details on the analysis you can check the wikipedia page listed or look for Heron's error convergence or fixed point iteration. (or look here https://mathcirclesofchicago.org/wp-content/uploads/2015/08/johnson.pdf)
In broad strokes: if we define the relative error e_n = r_n/sqrt(x) - 1, we can write the error in terms of itself as e_(n+1) = (e_n**2)/(2*(e_n+1)).
Then we can see that e_(n+1) <= min{(e_n**2)/2, e_n/2}, so the error decreases quadratically once it is small, with the number of accurate digits effectively doubling each iteration.
What's different between this analysis and Big-O is that the time it takes does NOT depend on the size of the input, but on the wanted accuracy. So in terms of input, this while loop is O(1), because its number of iterations is bounded by the accuracy, not the input.
In terms of accuracy, the error is bounded above by e_n < 2**(-n), so we need to find n such that 2**(-n) < d, i.e. n > log_2(1/d). Assuming d < 1, n = ceil(log_2(1/d)) would work. So in terms of d, it is O(log(1/d)).
EDIT: Some more info on error analysis of fixed point iteration http://www.maths.lth.se/na/courses/FMN050/media/material/part3_1.pdf
I believe you're correct that it's O(log n).
Here you can see the successive values of r when x = 100000:
1 50000
2 25001
3 12502
4 6255
5 3136
6 1584
7 823
8 472
9 342
10 317
11 316
12 316
(I've rounded them off because the fractions are not interesting).
What you can see is that it goes through two phases.
Phase 1 is when r is large. During these first few iterations, x/r is tiny compared to r. As a result, r + x/r is close to r, so (r + x/r) / 2 is approximately r/2. You can see this in the first 8 iterations.
Phase 2 is when it gets close to the final result. During the last few iterations, x/r is close to r, so r + x/r is close to 2 * r, so (r + x/r) / 2 is close to r. At this point we're just improving the approximation by small amounts. These iterations are not really very dependent on the magnitude of x.
Here's the succession for x = 1000000 (10x the above):
1 500000
2 250001
3 125002
4 62505
5 31261
6 15646
7 7855
8 3991
9 2121
10 1296
11 1034
12 1001
13 1000
14 1000
This time there are 10 iterations in Phase 1, then we again have 4 iterations in Phase 2.
The complexity of the algorithm is dominated by Phase 1, which is logarithmic because it's approximately dividing by 2 each time.
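To check the two-phase picture without relying on wall-clock timings, one option is to count loop iterations directly. The sketch below is the same loop as f(x) from the question with a counter added; the name heron_iterations and the max_steps safety cap are additions for illustration, not part of the original code. The cap is there because, for large x, floating-point rounding could keep abs(x - r**2) above d indefinitely.

```python
def heron_iterations(x, d=1e-10, max_steps=200):
    """Return how many times the while loop in f(x) runs for input x."""
    r = x / 2
    steps = 0
    # Same stopping rule as the question's f(x); max_steps is a safety cap
    # added here (an assumption, not in the original code).
    while abs(x - r**2) > d and steps < max_steps:
        r = (r + x / r) / 2
        steps += 1
    return steps
```

Printing heron_iterations(10**k) for k = 1, 2, 3, ... should show the count growing by roughly a constant amount per factor of 10, which is what logarithmic growth in x looks like.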

Python exercise gives wrong answers

The question is here and my attempt is below; I can't complete it no matter how I try.
N baskets are lined up, numbered 1 . . . N from left to right. Basket number i contains Ki apples. John and Mary want to draw a line between two baskets, and then John would get all the baskets to the left of the line and Mary all the baskets to the right of the line. Help them draw the line to divide the apples as equally as possible!
Input. The first line of the file jagasis.txt contains N, the number of baskets (2 ≤ N ≤ 1 000 000). Each of the following N lines contains an integer Ki: the number of apples in basket number i (1 ≤ i ≤ N, 0 ≤ Ki ≤ 10 000).
Output. The only line of the file jagaval.txt should contain a single integer: the number of the basket to the right of which the line should be drawn, so that the absolute value of the difference between the number of apples John gets and the number of apples Mary gets is as small as possible. If there are multiple possible answers, output any one of them.
Example.
jagasis.txt
7
4
2
10
2
9
3
7
jagaval.txt
4
When the line is drawn between the fourth and the fifth basket, John gets 4 + 2 + 10 + 2 = 18 apples and Mary gets 9 + 3 + 7 = 19 apples. The difference between these numbers is 1, which is the smallest possible.
Here is my code, but its not working for some reason:
f = open("jagasis.txt")
inputs = []
for line in f.read().split():
    inputs.append(int(line))
n=[]
location=[]
for x in range(inputs[0]):
    n = inputs[1:]
    m = n[:]
    del n[:x]
    m = set(m) - set(n)
    jagamine=sum(m)/sum(n)
    location.append(jagamine)
p=min(location, key=lambda x:abs(x-1))
uu = location.index(p)
print(location)
f = open("jagaval.txt", "w")
f.write(str(uu))
There's no reason to use set(). You should just sum the numbers before and after the dividing lines.
You should put the number of apples in a separate list from the number of baskets, so you don't have to keep skipping the first element of the list. And you don't need to make a copy and then delete things, use slices of the original list.
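with open("jagasis.txt") as f:
    count = int(f.readline().strip())
    # One integer per line; skip any blank trailing lines.
    baskets = [int(x.strip()) for x in f if x.strip()]
sums = []
# x is the number of baskets John gets (line drawn to the right of basket x).
for x in range(1, count):
    left = sum(baskets[:x])
    right = sum(baskets[x:])
    sums.append(abs(left - right))
# sums[0] corresponds to basket 1, so add 1 to the index.
result = sums.index(min(sums)) + 1
with open("jagaval.txt", "w") as f:
    f.write(str(result))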
This is not the most efficient algorithm; it is O(n**2). A more efficient algorithm uses the fact that whenever you move the dividing line one basket to the right, the element at the dividing line is added to the left sum and subtracted from the right sum. So if this is a coding competition, you should use that algorithm or you'll probably fail due to exceeding the time limit.
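The running-sum idea could be sketched as follows. This is a minimal sketch, not the answerer's code: the function name best_split is made up for illustration, and it takes the list of basket counts directly instead of reading jagasis.txt.

```python
def best_split(baskets):
    """Return the 1-based basket number to the right of which to draw the line."""
    total = sum(baskets)
    left = 0
    best_diff = None
    best_index = None
    # The line can go after basket 1 .. N-1, so both sides get at least one basket.
    for i in range(1, len(baskets)):
        left += baskets[i - 1]          # basket i moves to John's side
        diff = abs(total - 2 * left)    # |left - right| with right = total - left
        if best_diff is None or diff < best_diff:
            best_diff = diff
            best_index = i
    return best_index
```

Each basket is visited once, so the loop is O(n) instead of re-summing both slices for every candidate position.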
I would just like to add that next time it would be nice if you said what exactly isn't working for you, as many different things may not "be working". Cheers.

find if a number divisible by the input numbers

Given two numbers a and b, we have to find the nth number which is divisible by a or b.
The format looks like below:
Input :
First line consists of an integer T, denoting the number of test cases.
Second line contains three integers a, b and N
Output :
For each test case, print the Nth number in a new line.
Constraints :
1≤T≤10^5
1≤a,b≤10^4
1≤N≤10^9
Sample Input
1
2 3 10
Sample Output
15
Explanation
The numbers which are divisible by 2 or 3 are: 2,3,4,6,8,9,10,12,14,15 and the 10th number is 15
My code
test_case=input()
if int(test_case)<=100000 and int(test_case)>=1:
    for p in range(int(test_case)):
        count=1
        j=1
        inp=list(map(int,input().strip('').split()))
        if inp[0]<=10000 and inp[0]>=1 and inp[1]<=10000 and inp[1]>=1 and inp[1]<=1000000000 and inp[1]>=1:
            while(True ):
                if count<=inp[2] :
                    k=j
                    if j%inp[0]==0 or j%inp[1] ==0:
                        count=count+1
                        j=j+1
                    else :
                        j=j+1
                else:
                    break
            print(k)
        else:
            break
Problem Statement:
For the single test case input 2000 3000 100000 it takes more than one second to complete. I would like to get the result in less than 1 second. Is there a more time-efficient approach to this problem, maybe using some data structure or algorithm?
For every two numbers a and b there is a number k such that k=a*b. There are only so many multiples of a and b up to k. This set can be created like so:
s = set(a*1, b*1, ... a*(b-1), b*(a-1), a*b)
Say we take the values a=2, b=3; then s = (2,3,4,6). Writing c for the N in the question, the results for each range of c are:
[1 - 4] => (2,3,4,6)
[5 - 8] => 6 + (2,3,4,6)
[9 - 12] => 6*2 + (2,3,4,6)
...
Notice that the values repeat with a predictable pattern. To get the row, take the value of c and divide by the length of the set s (call it n). The index into s (kept in ascending order) is c mod n. Subtract 1 from c first, because the problem uses 1-based indexing.
row = floor((c-1)/n)
column = (c-1) % n
result = (a*b)*row + s[column]
Python impl:
a = 2000
b = 3000
c = 100000
# A set has no defined order, so sort it before indexing.
s = sorted(set([a*i for i in range(1, b+1)] + [b*i for i in range(1, a+1)]))
print((((c-1)//len(s)) * (a*b)) + s[(c - 1)%len(s)])
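Another option, not taken from the answers here, avoids building any list at all: count how many numbers up to x are divisible by a or b using inclusion-exclusion, and binary-search for the smallest x whose count reaches N. The sketch below assumes positive integers; the name nth_multiple is made up for illustration.

```python
from math import gcd

def nth_multiple(a, b, n):
    """Return the n-th positive integer divisible by a or b."""
    lcm = a * b // gcd(a, b)

    def count(x):
        # Multiples of a, plus multiples of b, minus those counted twice.
        return x // a + x // b - x // lcm

    # The n-th multiple of min(a, b) alone is an upper bound on the answer.
    lo, hi = 1, min(a, b) * n
    while lo < hi:
        mid = (lo + hi) // 2
        if count(mid) >= n:
            hi = mid
        else:
            lo = mid + 1
    return lo
```

For the sample input, nth_multiple(2, 3, 10) gives 15. Each query needs only about log2(min(a, b) * N) steps, so even 10^5 test cases stay cheap.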
I'm not certain I grasp exactly what you're trying to accomplish. But if I get it right, isn't the answer simply b*(N/2)? Since you are listing the multiples of both numbers, the Nth will always be the second number you list times N/2.
In your initial example that would be 3*10/2=15.
In the code example, it would be 3000*100000/2=150'000'000
Update:
Code to compute the desired values using sets and lists to speed up the calculation process. I'm still wondering what the recurrence for the odd indexes could be, if anyone happens to stumble upon it...
a = 2000
b = 3000
c = 100000
# c multiples of each number guarantee at least c distinct values, even when a == b.
a_list = [a*x for x in range(1, c + 1)]
b_list = [b*x for x in range(1, c + 1)]
nums = set(a_list)
nums.update(b_list)
nums = sorted(nums)
print(nums[c-1])
This code runs in 0.14s on my laptop, which is significantly below the requested threshold. Nonetheless, these values will depend on the machine the code is run on.

Sum of Digits, properties, hint please

This is the problem:
How many integers 0 ≤ n < 10^18 have the property that the sum of the digits of n equals the sum of digits of 137n?
This solution is grossly inefficient. What am I missing?
#!/usr/bin/env python
#coding: utf-8
import time
from timestrings import *
start = time.clock()
maxpower = 18
count = 0
for i in range(0, 10 ** maxpower - 1):
    if i % 9 == 0:
        result1 = list(str(i))
        result2 = list(str(137 * i))
        sum1 = 0
        for j in result1:
            sum1 += int(j)
        sum2 = 0
        for j in result2:
            sum2 += int(j)
        if sum1 == sum2:
            print (i, sum1)
            count += 1
finish = time.clock()
print ("Project Euler, Project 290")
print ()
print ("Answer:", count)
print ("Time:", stringifytime(finish - start))
First of all, you are to count, not to show the hits.
That is very important. All you have to do is devise an efficient way to count them. As Jon Bentley wrote in Programming Pearls: "Any method that considers all permutations of letters for a word is doomed to failure". In fact, when I tried this in Python, the system froze as soon as "i" hit 10^9, with 1.5 GB of memory consumed, let alone 10^18. And this also tells us, to cite Bentley again, "Defining the problem was about ninety percent of this battle."
To solve this problem, I can't see a way without dynamic programming (dp). In fact, most of those ridiculously huge Euler problems require some sort of dp. The theory of dp itself is rather academic and dry, but implementing the idea of dp to solve real problems is not; in fact, the practice is fun and colorful.
One solution to the problem is to go from 0-9, then 10-99, then 100-999 and so on, extract the signatures of the numbers, summarize the numbers with the same signature and deal with all of them as one piece, thus saving space and time.
Observation:
3 * 137 = 411 and 13 * 137 = 1781. Let's break the first result "411" down into two parts: the first two digits "41" and the last digit "1". The "1" is staying, but the "41" part is going to be "carried" into further calculations. Let's call "41" the carry, the first element of the signature. The "1" will stay as the rightmost digit as we go on calculating 13 * 137, 23 * 137, 33 * 137 or 43 * 137. All these *3 numbers have a "3" as their rightmost digit, and the last digit of 137*n is always 1. That is, the difference between this "3" and "1" is +2; call this +2 the "diff", the second element of the signature.
OK, if we are gonna find a two-digit number with 3 as its last digit, we have to find a digit "m" that satisfies
diff_of_digitsum (m, 137*m+carry) = -2 (1)
to neutralize our +2 diff accumulated earlier. If m could do that, then you know m * 10 + 3, on the paper you write: "m3", is a hit.
For example, in our case we tried digit 1: diff_of_digitsum (digit, 137*digit+carry) = diff_of_digitsum (1, 137*1+41) = diff_of_digitsum (1, 178) = 1 - (1+7+8) = -15, which is not -2, so 13 is not a hit.
Let's see 99. 9 * 137 = 1233. The "diff" is 9 - 3 = +6. "Carry" is 123. In the second iteration when we try to add a digit 9 to 9 and make it 99, we have diff_of_digitsum (digit, 137*digit+carry) = diff_of_digitsum (9, 137*9+123) = diff_of_digitsum (9, 1356) = -6 and it neutralizes our surplus 6. So 99 is a hit!
In code, we just need 18 iterations. In the first round we deal with the single-digit numbers, in the 2nd round the 2-digit numbers, then 3-digit ... until we get to the 18-digit numbers. Before the iterations, make a table with a structure like this:
table[(diff, carry)] = amount_of_numbers_with_the_same_diff_and_carry
Then the iteration begins, you need to keep updating the table as you go. Add new entries if you encounter a new signature, and always update amount_of_numbers_with_the_same_diff_and_carry. First round, the single digits, populate the table:
0: 0 * 137 = 0, diff: 0; carry: 0. table[(0, 0)] = 1
1: 1 * 137 = 137. diff: 1 - 7 = -6; carry: 13. table[(-6, 13)] = 1
2: 2 * 137 = 274. diff: 2 - 7 = -5; carry: 27. table[(-5, 27)] = 1
And so on.
Second iteration, the tens digit: we go over the digits 0-9 as "m" and use each in (1) to see if it produces a result that neutralizes the "diff". If yes, this m turns all amount_of_numbers_with_the_same_diff_and_carry numbers into hits. Hence counting, not showing. Then we calculate the new diff and carry with this digit added. In the example, 9 has diff 6 and carry 123; adding a second 9 contributes 9 - 6 (the last digit of 1356) = 3, so 99 has accumulated diff 6 + 3 = 9 and carry 135. Replace the old table using the new info.
Last comment: be careful with the digit 0. It will appear many times in the iteration, so don't overcount it, because 0009 = 009 = 09 = 9. If you use C++, make sure the sum is an unsigned long long or similar, because it is big. Good luck.
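The table-of-signatures idea can be sketched in Python roughly as follows. This is a minimal sketch under a few assumptions, not the answerer's own code: the names count_hits and digit_sum are made up, and instead of treating the digit 0 specially it pads every number with leading zeros to a fixed number of digits, so each n below 10**num_digits is counted exactly once. A number is a hit when its accumulated diff equals the digit sum of the carry that is still left over.

```python
from collections import defaultdict

def digit_sum(v):
    """Sum of the decimal digits of a non-negative integer."""
    return sum(map(int, str(v)))

def count_hits(num_digits, mult=137):
    """Count n in [0, 10**num_digits) with digit_sum(n) == digit_sum(mult * n)."""
    # table[(diff, carry)] = how many digit strings share this signature, where
    # diff  = (digits of n so far) minus (digits of mult*n already fixed)
    # carry = the part of the product still to be added to higher digits
    table = {(0, 0): 1}
    for _ in range(num_digits):
        new_table = defaultdict(int)
        for (diff, carry), ways in table.items():
            for digit in range(10):
                total = mult * digit + carry
                out_digit = total % 10      # this digit of mult*n is now final
                new_carry = total // 10
                new_table[(diff + digit - out_digit, new_carry)] += ways
        table = dict(new_table)
    # The leftover carry supplies the remaining high digits of mult*n.
    return sum(ways for (diff, carry), ways in table.items()
               if diff == digit_sum(carry))
```

The carry never exceeds 136 and the diff stays within a few hundred, so the table remains small and count_hits(18) needs only 18 passes over it. For small sizes the result can be checked against a brute-force loop.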
You are trying to solve a Project Euler problem by brute force. That may work for the first few problems, but for most problems you need to think of a more sophisticated approach.
Since it is IMHO not OK to give advice specific to this problem, take a look at the general advice in this answer.
This brute force Python solution of 7 digits ran for 19 seconds for me:
print sum(sum(map(int, str(n))) == sum(map(int, str(137 * n)))
for n in xrange(0, 10 ** 7, 9))
On the same machine, single core, same Python interpreter, the same code would take about 3170 years to compute for 18 digits (as the problem asked).
See dgg32's answer for inspiration on a faster way of counting.
