Programming Challenge Code taking too much time - python

I wrote the following code for this problem.
prof = sorted([int(input()) for x in range(int(input()))])
student = sorted([int(input()) for x in range(int(input()))])
prof_dates = len(prof)
stud_dates = len(student)
amount = 0
prof_index = 0
stud_index = 0
while stud_index < stud_dates and prof_index < prof_dates:
    if student[stud_index] == prof[prof_index]:
        amount += 1
        stud_index += 1
    elif student[stud_index] > prof[prof_index]:
        prof_index += 1
    elif student[stud_index] < prof[prof_index]:
        stud_index += 1
print(amount)
But the code is producing a Time Limit Exceeded error. Earlier I had tried an "in" test for every item in student, but that also produced a TLE, and I believe that's because "in" on a list is O(n). So I wrote this code, whose number of steps is roughly the sum of the lengths of the two lists. But this is also producing a TLE. What changes should I make to my code? Is there some particular part that has a high time cost?
Thanks.

You are using sorting plus merging, which takes O(N log N + M log M + N + M) time.
But you can put the professor data in a set, check every student year value (from the unsorted list), and get O(M + N) complexity on average.
Note that this approach avoids sorting the student list at all.
Addition: Python has built-in sets. For languages without them, note that the professor list is already sorted, so you can just binary-search it for every student year; the complexity would then be O(N log M).
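For illustration only, here is a minimal sketch of that binary-search variant using Python's bisect module, reusing the sorted prof and student lists from the question (everything else is my own assumption):
import bisect

amount = 0
for year in student:                        # student does not even need to be sorted here
    pos = bisect.bisect_left(prof, year)    # binary search in the sorted prof list
    if pos < len(prof) and prof[pos] == year:
        amount += 1
print(amount)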

Since the problem is basically to find the intersection of two sets of integers, the following code solves it in O(M + N), assuming a hash (set) lookup takes O(1):
prof = set([int(input()) for x in range(int(input()))])
student = set([int(input()) for x in range(int(input()))])
equals_dates = len(prof.intersection(student))
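To produce the output the judge expects, the count presumably still needs to be printed at the end, just as in the question's code:
print(equals_dates)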

Related

Does this binary sorting algorithm need to be optimized significantly in memory usage and speed of execution

I made a sort based on binary-search insertion, and I was wondering whether there is a way to improve its memory usage and its speed of execution. Specifically, is there a way to insert an item quickly? I know that using the insert method requires shifting all the values to its right by one index, which takes time, so is there a faster way to do this, maybe a different data structure that is mutable and indexable? I wrote this with the intention that the values are all real numbers (floats or integers). Also, am I correct to assume I won't be using much more memory at any given point in the execution, since values are popped off the original array as they are sorted into the new list? That is, A (length of the original array) - B (number of values popped off) + B (number of popped values moved into the sorted array) = A.
def binary_sort(array: list):
    """ Returns an organized version of the array """
    number = 0
    length = len(array)
    sorted_array = []
    while number < length:
        value = array.pop()
        start = 0
        end = number
        while start < end:                 # binary search for the insertion point
            avg = (start + end) // 2
            v = sorted_array[avg]
            if v <= value:
                start = 1 + avg
            else:
                end = avg
        sorted_array.insert(start, value)  # O(n) shift to make room
        number += 1
    return sorted_array
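No answer is included here, but as a point of comparison, the standard library's bisect.insort performs the same search-then-insert step (the insert still shifts elements, but both the search and the shift run in C). A minimal sketch under the same assumptions as the question's code (the function name is mine):
import bisect

def binary_sort_bisect(array: list) -> list:
    """Same idea as binary_sort, but lets bisect.insort do the search and insert."""
    sorted_array = []
    for value in array:
        bisect.insort(sorted_array, value)   # binary search, then O(n) insert
    return sorted_array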

How to implement a bubble sort that only performs "K" times

I am solving the following bubble sort problem.
However, the problem does not seem to call for a typical bubble sort implementation. I have implemented the following code, but it times out. The reason is that the time complexity of my code is still O(n^2).
How should I write the bubble sort code to correctly understand and solve this problem?
Solution
Bubble sort is an algorithm that sorts a sequence of length N by examining two adjacent elements at a time and swapping them when they are out of order. One pass of bubble sort works as shown below.
Compare the first value with the second value, and swap them if the first value is greater.
Compare the second value with the third value, and swap them if the second value is greater.
...
Compare the (N-1)-th and N-th values, and swap them if the (N-1)-th value is greater.
Performing this pass N times sorts the whole sequence, so the final result is already known; what we want is the intermediate state. However, since N is very large, performing the above pass K times naively takes a long time. Write a program that finds the intermediate state of the bubble sort.
Input
N and K are given in the first line.
The second line gives the status of the first sequence. That is, N integers forming the first sequence are given in turn, with spaces between them.
1 <= N <= 100,000
1 <= K <= N
Each term in the sequence is an integer from 1 to 1,000,000,000.
Output
Output the state of the sequence after the above pass has been performed K times.
Example input
4 1
62 23 32 15
Example output
23 32 15 62
My Code
n_k = input()          # 4 1
n_k_s = [int(num) for num in n_k.split()]
progression = input()  # 62 23 32 15  =>  23 32 15 62
progressions = [int(num) for num in progression.split()]

def bubble(n_k_s, progressions):
    for i in range(0, n_k_s[1]):
        for j in range(i, n_k_s[0] - 1):
            if progressions[j] > progressions[j + 1]:
                temp = progressions[j]
                progressions[j] = progressions[j + 1]
                progressions[j + 1] = temp
    for k in progressions:
        print(k, end=" ")

bubble(n_k_s, progressions)
I'm confused as to why you're saying "The reason is that the time complexity of my code is still O(n^2)".
The time complexity is always O(n²), unless you add a flag to check whether your list is already sorted (the complexity would then be O(n) if the list is sorted at the beginning of your program).
As best I can tell, you have implemented the algorithm requested. It is O(nk); Phillippe already covered the rationale I was typing.
Yes, you can set a flag to indicate whether you've made any exchanges on this pass. That doesn't change any complexity except for best-case -- although it does reduce the constant factor in many other cases.
One possibility I see for speeding up your process is to use a more efficient value exchange: use the Python idiom a, b = b, a. In your case, the inner loop might become:
done = True
for j in range(i, n_k_s[0] - 1):
    if progressions[j] > progressions[j + 1]:
        progressions[j], progressions[j + 1] = progressions[j + 1], progressions[j]
        done = False
if done:
    break

Processing big list using python

So I'm trying to solve a challenge and have come to a dead end. My solution works when the list is small or medium, but when it has over 50,000 elements it just times out.
a = int(input().strip())
b = list(map(int, input().split()))
result = []
flag = []
for i in range(len(b)):
    temp = a - b[i]
    if temp >= 0 and temp in flag:
        if temp < b[i]:
            result.append((temp, b[i]))
        else:
            result.append((b[i], temp))
        flag.remove(temp)
    else:
        flag.append(b[i])
    result.sort()
for i in result:
    print(i[0], i[1])
Where
a = 10
and b = [2, 4, 6, 8, 5]
The task is to find any two elements of b whose sum equals a.
Edit: updated with the full code.
flag is a list, potentially of the same order of magnitude as b. So, when you do temp in flag, that's a linear search: it has to check every value in flag to see whether that value is == temp. That's up to 50,000 comparisons. And you're doing that once per iteration of a linear walk over b. So, your total time is quadratic: 50,000 * 50,000 = 2,500,000,000. (And flag.remove is also linear time.)
If you replace flag with a set, you can test it for membership (and remove from it) in constant time. So your total time drops from quadratic to linear, or 50,000 steps, which is a lot faster than 2.5 billion:
flagset = set(flag)
for i in range(len(b)):
    temp = a - b[i]
    if temp >= 0 and temp in flagset:
        if temp < b[i]:
            result.append((temp, b[i]))
        else:
            result.append((b[i], temp))
        flagset.remove(temp)
    else:
        flagset.add(b[i])
flag = list(flagset)
If flag needs to retain duplicate values, then it's a multiset, not a set, which means you can implement it with a collections.Counter:
import collections

flagset = collections.Counter(flag)
for i in range(len(b)):
    temp = a - b[i]
    if temp >= 0 and flagset[temp]:
        if temp < b[i]:
            result.append((temp, b[i]))
        else:
            result.append((b[i], temp))
        flagset[temp] -= 1
    else:
        flagset[b[i]] += 1   # count b[i] as unmatched, mirroring flag.append(b[i])
flag = list(flagset.elements())
In your edited code, you’ve got another list that’s potentially of the same size, result, and you’re sorting that list every time through the loop.
Sorting takes log-linear time. Since you do it up to 50,000 times, that's around log(50,000) * 50,000 * 50,000, or around 30 billion steps.
If you needed to keep result in order throughout the operation, you'd want to use a logarithmic data structure, like a binary search tree or a skiplist, so you could insert a new element in the right place in logarithmic time, which would mean just about 800,000 steps.
But you don’t need it in order until the end. So, much more simply, just move the result.sort out of the loop and do it at the end.
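Putting both of those fixes together, a minimal sketch of the corrected loop might look like this (my own assembly of the suggestions above; the name seen is mine, not from the question):
a = int(input().strip())
b = list(map(int, input().split()))
result = []
seen = set()                      # unmatched values seen so far
for x in b:
    temp = a - x
    if temp >= 0 and temp in seen:
        result.append((temp, x) if temp < x else (x, temp))
        seen.remove(temp)
    else:
        seen.add(x)
result.sort()                     # sort once, after the loop
for lo, hi in result:
    print(lo, hi)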

What's a fast and pythonic/clean way of removing a sorted list from another sorted list in python?

I am creating a fast method of generating a list of primes in the range(0, limit+1). In the function I end up removing all integers in the list named removable from the list named primes. I am looking for a fast and pythonic way of removing the integers, knowing that both lists are always sorted.
I might be wrong, but I believe list.remove(n) iterates over the list comparing each element with n, meaning that the following code runs in O(n^2) time.
# removable and primes are both sorted lists of integers
for composite in removable:
    primes.remove(composite)
Based on my assumption (which could be wrong, so please confirm whether or not it is correct) and the fact that both lists are always sorted, I would think that the following code runs faster, since it only loops over the lists once, for O(n) time. However, it is not at all pythonic or clean.
i = 0
j = 0
while i < len(primes) and j < len(removable):
    if primes[i] == removable[j]:
        primes = primes[:i] + primes[i+1:]
        j += 1
    else:
        i += 1
Is there perhaps a built in function or simpler way of doing this? And what is the fastest way?
Side notes: I have not actually timed the functions or code above. Also, it doesn't matter if the list removable is changed/destroyed in the process.
For anyone interested, the full function is below:
import math

# returns a list of primes in range(0, limit+1)
def fastPrimeList(limit):
    if limit < 2:
        return list()
    sqrtLimit = int(math.ceil(math.sqrt(limit)))
    primes = [2] + list(range(3, limit + 1, 2))
    index = 1
    while primes[index] <= sqrtLimit:
        removable = list()
        index2 = index
        while primes[index] * primes[index2] <= limit:
            composite = primes[index] * primes[index2]
            removable.append(composite)
            index2 += 1
        for composite in removable:
            primes.remove(composite)
        index += 1
    return primes
This is quite fast and clean: it does O(n) set membership checks and runs in O(n) expected time (the first line is O(n), the second is O(n * 1), because a set membership check takes O(1) on average):
removable_set = set(removable)
primes = [p for p in primes if p not in removable_set]
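As a tiny sanity check of that approach (my own example values, not from the question):
removable = [9, 15]
primes = [2, 3, 5, 7, 9, 11, 13, 15]   # deliberately includes the removable values
removable_set = set(removable)
primes = [p for p in primes if p not in removable_set]
print(primes)   # [2, 3, 5, 7, 11, 13]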
Here is the modification of your 2nd solution. It does O(n) basic operations (worst case):
tmp = []
i = j = 0
while i < len(primes) and j < len(removable):
    if primes[i] < removable[j]:
        tmp.append(primes[i])
        i += 1
    elif primes[i] == removable[j]:
        i += 1
    else:
        j += 1
primes[:i] = tmp
del tmp
Please note that constants also matter. The Python interpreter is quite slow (i.e. with a large constant) to execute Python code. The 2nd solution has lots of Python code, and it can indeed be slower for small practical values of n than the solution with sets, because the set operations are implemented in C, thus they are fast (i.e. with a small constant).
If you have multiple working solutions, run them on typical input sizes, and measure the time. You may get surprised about their relative speed, often it is not what you would predict.
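For example, a rough way to time the two approaches on representative input sizes (illustrative only: the names with_set and with_merge are mine, the data is random rather than real primes, and the merge variant is adapted to return a new list):
import random
import timeit

primes = sorted(random.sample(range(10_000_000), 50_000))
removable = sorted(random.sample(primes, 10_000))

def with_set():
    removable_set = set(removable)
    return [p for p in primes if p not in removable_set]

def with_merge():
    tmp, i, j = [], 0, 0
    while i < len(primes) and j < len(removable):
        if primes[i] < removable[j]:
            tmp.append(primes[i])
            i += 1
        elif primes[i] == removable[j]:
            i += 1
        else:
            j += 1
    return tmp + primes[i:]

print("set version:  ", timeit.timeit(with_set, number=10))
print("merge version:", timeit.timeit(with_merge, number=10))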
The most important thing here is to remove the quadratic behavior. You have this for two reasons.
First, calling remove searches the entire list for values to remove. Doing this takes linear time, and you're doing it once for each element in removable, so your total time is O(NM) (where N is the length of primes and M is the length of removable).
Second, removing elements from the middle of a list forces you to shift the whole rest of the list up one slot. So, each one takes linear time, and again you're doing it M times, so again it's O(NM).
How can you avoid these?
For the first, you either need to take advantage of the sorting, or just use something that allows you to do constant-time lookups instead of linear-time, like a set.
For the second, you either need to create a list of indices to delete and then do a second pass to move each element up the appropriate number of indices all at once, or just build a new list instead of trying to mutate the original in-place.
So, there are a variety of options here. Which one is best? It almost certainly doesn't matter; changing your O(NM) time to just O(N+M) will probably be more than enough of an optimization that you're happy with the results. But if you need to squeeze out more performance, then you'll have to implement all of them and test them on realistic data.
The only one of these that I think isn't obvious is how to "use the sorting". The idea is to use the same kind of staggered-zip iteration that you'd use in a merge sort, like this:
def sorted_subtract(seq1, seq2):
    # Yield the elements of the sorted seq1 that do not appear in the sorted seq2.
    i1, i2 = 0, 0
    while i1 < len(seq1):
        if i2 == len(seq2):           # nothing left to remove
            yield from seq1[i1:]
            return
        if seq1[i1] == seq2[i2]:      # this value is in seq2: skip it
            i1 += 1
        elif seq1[i1] > seq2[i2]:     # advance the removal pointer
            i2 += 1
        else:                         # smaller than the next removal: keep it
            yield seq1[i1]
            i1 += 1
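A short usage example under the question's setup (my own illustration):
primes = [2, 3, 5, 7, 9, 11, 13, 15]
removable = [9, 15]
primes = list(sorted_subtract(primes, removable))
print(primes)   # [2, 3, 5, 7, 11, 13]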

Python Time Complexity (run-time)

def f2(L):
    sum = 0
    i = 1
    while i < len(L):
        sum = sum + L[i]
        i = i * 2
    return sum
Let n be the size of the list L passed to this function. Which of the following most accurately describes how the runtime of this function grows as n grows?
(a) It grows linearly, like n does.
(b) It grows quadratically, like n^2 does.
(c) It grows less than linearly.
(d) It grows more than quadratically.
I don't understand how you figure out the relationship between the runtime of the function and the growth of n. Can someone please explain this to me?
ok, since this is homework:
this is the code:
def f2(L):
    sum = 0
    i = 1
    while i < len(L):
        sum = sum + L[i]
        i = i * 2
    return sum
It obviously depends on len(L).
So lets see for each line, what it costs:
sum = 0
i = 1
# [...]
return sum
Those are obviously constant time, independent of L.
In the loop we have:
sum = sum + L[i] # time to lookup L[i] (`timelookup(L)`) plus time to add to the sum (obviously constant time)
i = i * 2 # obviously constant time
And how many times is the loop executed?
It obviously depends on the size of L.
Let's call that loops(L).
So we get an overall complexity of
loops(L) * (timelookup(L) + const)
Being the nice guy I am, I'll tell you that list lookup is constant time in Python, so it boils down to
O(loops(L)) (constant factors ignored, as big-O convention implies)
And how often do you loop, based on the len() of L?
(a) as often as there are items in the list, (b) quadratically as often as there are items in the list,
(c) less often than there are items in the list, or (d) more often than (b)?
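If you want to check the pattern empirically rather than reason it out, here is a tiny sketch (mine, not part of the answer above) that just counts the loop iterations for doubling list sizes:
def loops(n):
    # Count how many times "while i < n: i *= 2" runs, starting from i = 1.
    i, count = 1, 0
    while i < n:
        i *= 2
        count += 1
    return count

for n in (8, 16, 32, 64, 128):
    print(n, loops(n))   # 8->3, 16->4, 32->5, 64->6, 128->7: one extra pass per doubling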
I am not a computer science major and I don't claim to have a strong grasp of this kind of theory, but I thought it might be relevant for someone from my perspective to try and contribute an answer.
Your function will always take time to execute, and if it is operating on a list argument of varying length, then the time it takes to run that function will be relative to how many elements are in that list.
Let's assume it takes 1 unit of time to process a list of length == 1. What the question is asking is the relationship between the list getting bigger and the increase in the time this function takes to execute.
This link breaks down some basics of Big O notation: http://rob-bell.net/2009/06/a-beginners-guide-to-big-o-notation/
If it were O(1) complexity (which is not actually one of your A-D options), it would mean the runtime never grows regardless of the size of L. Obviously, in your example, the while loop keeps running until the counter i reaches the length of L. I would focus on the fact that i is being multiplied each time, which indicates the relationship between how long it takes to get through that while loop and the length of L. Basically, compare how many iterations the while loop needs at various values of len(L), and that will determine your complexity; 1 unit of time can be 1 iteration through the while loop.
Hopefully I have made some form of contribution here, with my own lack of expertise on the subject.
Update
To clarify, based on the comment from ch3ka: if you were doing more than what you currently have inside your while loop, then you would also have to consider the added complexity for each loop. But because your list lookup L[i] is constant complexity, as is the math that follows it, we can ignore those in terms of the complexity.
Here's a quick-and-dirty way to find out:
import matplotlib.pyplot as plt

def f2(L):
    sum = 0
    i = 1
    times = 0
    while i < len(L):
        sum = sum + L[i]
        i = i * 2
        times += 1   # track how many times the loop gets called
    return times

def main():
    i = range(1200)
    f_i = [f2([1] * n) for n in i]
    plt.plot(i, f_i)
    plt.show()

if __name__ == "__main__":
    main()
... which results in a plot whose horizontal axis is the size of L and whose vertical axis is how many times the function loops; the big-O behaviour should be pretty obvious from it.
Consider what happens with an input of length n=10. Now consider what happens if the input size is doubled to 20. Will the runtime double as well? Then it's linear. If the runtime grows by a factor of 4, then it's quadratic. And so on.
When you look at the function, you have to determine how the size of the list will affect the number of loops that will occur.
In your specific situation, let's increment n and see how many times the while loop will run.
n = 0, loop = 0 times
n = 1, loop = 1 time
n = 2, loop = 1 time
n = 3, loop = 2 times
n = 4, loop = 2 times
See the pattern? Now answer your question, does it:
(a) It grows linearly, like n does. (b) It grows quadratically, like n^2 does.
(c) It grows less than linearly. (d) It grows more than quadratically.
Check out Hugh's answer for an empirical result :)
It's O(log(len(L))), as list lookup is a constant-time operation, independent of the size of the list.
