I am trying an algorithm for a bubble sort and there is a part I don't understand
nums = [1,4,3,2,10,6,8,5]
for i in range(len(nums)-1,0,-1):
    for j in range(i):
        if nums[j] > nums[j+1]:
            temp = nums[j]
            nums[j] = nums[j+1]
            nums[j+1] = temp
print(nums)
What do the numbers (-1, 0, -1) mean in this part of the code? (It doesn't sort properly without them.)
for i in range (len(nums)-1,0,-1):
The syntax for range in Python is:
range(start, stop, step)
In your case, the loop starts at the index of the last element (n-1) and moves towards the first element one step at a time. The stop value is exclusive, so i takes the values n-1, n-2, ..., 1 and never 0.
OK:
The first argument is the starting point, the second tells Python where to stop, and the last one is the step.
len(nums) gives you the length of the list, in our case 8.
len(nums)-1 is 8-1 = 7. We do this because list indices start at 0 and end at 7 (still 8 elements, but the last one has index 7, not 8).
We stop before 0 (the stop value itself is excluded), with step -1. So the iteration will look like:
nums[len(nums)-1] = nums[7]
nums[len(nums)-1-1] = nums[6]
nums[len(nums)-1-1-1] = nums[5]
.....
nums[len(nums)-1-1-1-1-1-1-1] = nums[1]
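A quick way to check which indices the outer loop visits is to turn the range into a list. This is only a small sketch using the list from the question; the names outer and inner_for_first_pass are made up for the illustration.

```python
nums = [1, 4, 3, 2, 10, 6, 8, 5]

# Values taken by i in the outer loop: start at len(nums)-1, stop before 0.
outer = list(range(len(nums) - 1, 0, -1))
print(outer)

# On the first pass (i = 7) the inner loop compares nums[j] with nums[j+1]
# for j = 0..6, so j+1 never goes past the last index.
inner_for_first_pass = list(range(outer[0]))
print(inner_for_first_pass)
```

Because the largest remaining value has bubbled to position i after each pass, the next pass can stop one element earlier, which is why i counts downwards.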
I'm trying to design a function that, given an array A of N integers, returns the smallest positive integer (greater than 0) that does not occur in A.
This code works fine but has a high order of complexity. Is there another solution that reduces the order of complexity?
Note: 10000000 is the range of the integers in array A. I tried the sort function, but does it reduce the complexity?
def solution(A):
    for i in range(10000000):
        if(A.count(i)) <= 0:
            return(i)
The following is O(n logn):
a = [2, 1, 10, 3, 2, 15]
a.sort()
if a[0] > 1:
    print(1)
else:
    for i in range(1, len(a)):
        if a[i] > a[i - 1] + 1:
            print(a[i - 1] + 1)
            break
If you don't like the special handling of 1, you could just append zero to the array and have the same logic handle both cases:
a = sorted(a + [0])
for i in range(1, len(a)):
    if a[i] > a[i - 1] + 1:
        print(a[i - 1] + 1)
        break
Caveats (both trivial to fix and both left as an exercise for the reader):
Neither version handles empty input.
The code assumes there are no negative numbers in the input.
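One possible way to cover those caveats, still O(n log n) because of the sort, is to walk the sorted values while tracking the next candidate. This is a sketch rather than the answerer's own fix; the function name smallest_missing_sorted is made up for the illustration.

```python
def smallest_missing_sorted(a):
    """Return the smallest positive integer not present in a (sort-based sketch)."""
    candidate = 1
    for x in sorted(a):
        if x == candidate:
            # candidate is present, so the answer must be larger
            candidate += 1
        elif x > candidate:
            # sorted order: candidate can no longer appear
            break
        # x < candidate: a non-positive number or a duplicate, ignore it
    return candidate


print(smallest_missing_sorted([2, 1, 10, 3, 2, 15]))
print(smallest_missing_sorted([]))
print(smallest_missing_sorted([-5, 0, 1, 2, 3]))
```

Empty input and negative numbers fall out of the same loop, and a fully consecutive input such as [1, 2, 3] returns the next integer.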
O(n) time and O(n) space:
def solution(A):
    count = [0] * len(A)
    for x in A:
        if 0 < x <= len(A):
            count[x-1] = 1 # count[0] is to count 1
    for i in range(len(count)):
        if count[i] == 0:
            return i+1
    return len(A)+1 # only if A = [1, 2, ..., len(A)]
This should be O(n). Utilizes a temporary set to speed things along.
a = [2, 1, 10, 3, 2, 15]
#use a set of only the positive numbers for lookup
temp_set = set()
for i in a:
    if i > 0:
        temp_set.add(i)
#iterate from 1 up to length of set +1 (to ensure edge case is handled)
for i in range(1, len(temp_set) + 2):
    if i not in temp_set:
        print(i)
        break
My proposal is a recursive function inspired by quicksort.
Each step divides the input sequence into two sublists (lt = less than pivot; ge = greater than or equal to pivot) and decides which of the sublists is to be processed in the next step. Note that there is no sorting.
The idea is that a set of integers n such that lo <= n < hi contains "gaps" only if it has fewer than (hi - lo) elements.
The input sequence must not contain duplicates. A set can be passed directly.
# all cseq items > 0 assumed, no duplicates!
def find(cseq, cmin=1):
    # cmin = possible minimum not ruled out yet
    size = len(cseq)
    if size <= 1:
        return cmin+1 if cmin in cseq else cmin
    lt = []
    ge = []
    pivot = cmin + size // 2
    for n in cseq:
        (lt if n < pivot else ge).append(n)
    return find(lt, cmin) if cmin + len(lt) < pivot else find(ge, pivot)

test = set(range(1,100))
print(find(test)) # 100
test.remove(42)
print(find(test)) # 42
test.remove(1)
print(find(test)) # 1
Inspired by various solutions and comments above, about 20%-50% faster in my (simplistic) tests than the fastest of them (though I'm sure it could be made faster), and handling all the corner cases mentioned (non-positive numbers, duplicates, and empty list):
import numpy

def firstNotPresent(l):
    positive = numpy.fromiter(set(l), dtype=int) # deduplicate
    positive = positive[positive > 0] # only keep positive numbers
    positive.sort()
    top = positive.size + 1
    if top == 1: # empty list
        return 1
    sequence = numpy.arange(1, top)
    try:
        # where() gives the 0-based position of the first gap;
        # the missing value itself is that position + 1
        return numpy.where(sequence < positive)[0][0] + 1
    except IndexError: # no numbers are missing, top is next
        return top
The idea is: if you enumerate the positive, deduplicated, sorted list starting from one, the first time the index is less than the list value, the index value is missing from the list, and hence is the lowest positive number missing from the list.
This and the other solutions I tested against (those from adrtam, Paritosh Singh, and VPfB) all appear to be roughly O(n), as expected. (It is, I think, fairly obvious that this is a lower bound, since every element in the list must be examined to find the answer.) Edit: looking at this again, of course the big-O for this approach is at least O(n log(n)), because of the sort. It's just that the sort is so fast, comparatively speaking, that it looked linear overall.
I'm having trouble understanding the 2nd for loop in the code below:
di = [4,5,6]
for i in range(len(di)):
    total = di[i]
    for j in range(i+1,len(di)):
        total += di[j]
        curr_di = total / ((j-i+1)**2)
I can't visualize what happens in for j in range(i+1,len(di)):, in particular the i+1 portion confuses me. How does the i in the first loop affect the 2nd loop, if at all?
The first loop is simply looping over the indices available in list di. For each entry in that loop, the second loop then examines the remaining portions of di.
So, on the first iteration, we're examining the value 4. The second loop will then walk the list starting at that position, and running to the end (it will examine items 5 and 6).
On the second iteration, we'll examine entry 5, then walk the remainder of the list in the second loop (in this case 6). Make sense?
As a commenter pointed out, print statements are your friend. Here's an example with some print statements to show how i and j change:
di = [4,5,6,7]
for i in range(len(di)):
    print(f"i: {i}")
    total = di[i]
    for j in range(i+1, len(di)):
        print(f" - j: {j}")
        total += di[j]
        curr_di = total / ((j-i+1)**2)
Output:
i: 0
 - j: 1
 - j: 2
 - j: 3
i: 1
 - j: 2
 - j: 3
i: 2
 - j: 3
i: 3
This is a typical "combinations of 2" loop. Each item in the list (index i) is processed with every subsequent item (index j).
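To make the "combinations of 2" remark concrete, the (i, j) index pairs visited by the nested loops can be compared with what itertools.combinations produces. A small sketch, using the four-element list from the print example; the name pairs is made up for the illustration.

```python
from itertools import combinations

di = [4, 5, 6, 7]

# Collect every (i, j) pair the nested loops visit.
pairs = []
for i in range(len(di)):
    for j in range(i + 1, len(di)):
        pairs.append((i, j))

print(pairs)
# The same pairs, in the same order, come out of combinations().
print(pairs == list(combinations(range(len(di)), 2)))
```

Starting j at i+1 is what prevents a pair from being visited twice or an item from being paired with itself.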
It looks like sequence processing to compute a sum of partial sums:
total = ∑(i=1..n) ( ∑(j=i..n) a[j] )
The problem is to find the total number of sub-lists of a given list that contain no number greater than a specified upper bound, say right, and whose maximum is greater than or equal to a lower bound, say left. Suppose my list is x=[2, 0, 11, 3, 0], the upper bound for sub-list elements is 10 and the lower bound is 1; then the valid sub-lists are [[2],[2,0],[3],[3,0]], since sub-lists are always contiguous. My script runs well and produces correct output but needs some optimization.
def query(sliced,left,right):
    end_index=0
    count=0
    leng=len(sliced)
    for i in range(leng):
        stack=[]
        end_index=i
        while(end_index<leng and sliced[end_index]<=right):
            stack.append(sliced[end_index])
            if max(stack)>=left:
                count+=1
            end_index+=1
    print (count)

origin=[2,0,11,3,0]
left=1
right=10
query(origin,left,right)
Output: 4
For a list such as x=[2,0,0,1,11,14,3,5], the valid sub-lists are [[2],[2,0],[2,0,0],[2,0,0,1],[0,0,1],[0,1],[1],[3],[5],[3,5]], 10 in total.
Brute force
Generate every possible sub-list and check if the given criteria hold for each sub-list.
Worst case scenario: For every element e in the array, left < e < right.
Time complexity: O(n^3)
Optimized brute force (OP's code)
For every index in the array, incrementally build a temporary list (not really needed though) which is valid.
Worst case scenario: For every element e in the array, left < e < right.
Time complexity: O(n^2)
A more optimized solution
If the array has n elements, then the number of sub-lists in the array is 1 + 2 + 3 + ... + n = (n * (n + 1)) / 2 = O(n^2). We can use this formula strategically.
First, as @Tim mentioned, we can partition the list around the numbers greater than right, since no valid sub-list can contain one of them. This reduces the task to segments in which every element is less than or equal to right; we solve each segment separately and sum the answers.
Next, within each segment, count the invalid sub-lists: those whose maximum is less than left. Partition the segment around the numbers greater than or equal to left; each remaining run of k consecutive elements that are all less than left contributes k * (k + 1) / 2 invalid sub-lists. Add these together (store the sum in, say, w), then compute the total number of sub-lists of the segment and subtract w.
Then aggregate your results by sum.
Worst case scenario: For every element e in the array, e < left.
Time Complexity: O(n)
I know this is very difficult to understand, so I have included working code:
def compute(sliced, lo, hi, left):
    num_invalid = 0
    start = 0
    search_for_start = True
    for end in range(lo, hi):
        if search_for_start and sliced[end] < left:
            start = end
            search_for_start = False
        elif not search_for_start and sliced[end] >= left:
            num_invalid += (end - start) * (end - start + 1) // 2
            search_for_start = True
    if not search_for_start:
        num_invalid += (hi - start) * (hi - start + 1) // 2
    return ((hi - lo) * (hi - lo + 1)) // 2 - num_invalid

def query(sliced, left, right):
    ans = 0
    start = 0
    search_for_start = True
    for end in range(len(sliced)):
        if search_for_start and sliced[end] <= right:
            start = end
            search_for_start = False
        elif not search_for_start and sliced[end] > right:
            ans += compute(sliced, start, end, left)
            search_for_start = True
    if not search_for_start:
        ans += compute(sliced, start, len(sliced), left)
    return ans
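For comparison, here is a shorter O(n) formulation of the same counting idea. It is not the code above but a different arrangement: count the sub-lists whose elements are all <= right, then subtract the sub-lists whose elements are all < left. It assumes left <= right; the function names count_all_satisfying and query_by_difference are made up for this sketch.

```python
def count_all_satisfying(nums, keep):
    """Count contiguous sub-lists in which every element satisfies keep()."""
    total = 0
    run = 0  # length of the current streak of elements satisfying keep()
    for x in nums:
        run = run + 1 if keep(x) else 0
        # exactly `run` qualifying sub-lists end at this position
        total += run
    return total


def query_by_difference(nums, left, right):
    """Sub-lists with no element > right whose maximum is >= left (assumes left <= right)."""
    within_upper = count_all_satisfying(nums, lambda x: x <= right)
    below_lower = count_all_satisfying(nums, lambda x: x < left)
    return within_upper - below_lower


print(query_by_difference([2, 0, 11, 3, 0], 1, 10))
print(query_by_difference([2, 0, 0, 1, 11, 14, 3, 5], 1, 10))
```

The two calls reproduce the counts given in the question for the same inputs (4 and 10).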
Categorise the numbers as small, valid and large (S, V and L) and further index the valid numbers: V_1, V_2, V_3 etc. Let us start off by assuming there are no large numbers.
Consider the list A = [S,S,…,S,V_1,X,X,X,X,…,X]. If V_1 has index n, there are n+1 sub-lists ending at V_1: [V_1], [S,V_1], [S,S,V_1] and so on. Each of these n+1 sub-lists can be left as it is or extended with one of the len(A)-n-1 suffixes [X], [X,X], [X,X,X] and so on, giving a total of (n+1)(len(A)-n) sub-lists containing V_1.
We can partition the set of all valid sub-lists into those containing V_k but no V_m for m less than k. Hence we simply perform the same calculation on the remaining X,X,…,X part of the list using V_2, and iterate. This would require something like this:
def query(sliced,left,right,total):
    index=0
    while index<len(sliced):
        if sliced[index]>=left:
            total+=(index+1)*(len(sliced)-index)
            return total+query(sliced[index+1:],left,right,0)
        else:
            index+=1
    return total
To incorporate the large numbers, we can just partition the whole set according to where the large numbers occur and add the total number of sequence for each. If we call our first function, sub_query, then we arrive at the following:
def sub_query(sliced,left,right,total):
    index=0
    while index<len(sliced):
        if sliced[index]>=left:
            total+=(index+1)*(len(sliced)-index)
            return total+sub_query(sliced[index+1:],left,right,0)
        else:
            index+=1
    return total

def query(sliced,left,right):
    index=0
    count=0
    while index<len(sliced):
        if sliced[index]>right:
            count+=sub_query(sliced[:index],left,right,0)
            sliced=sliced[index+1:]
            index=0
        else:
            index+=1
    count+=sub_query(sliced,left,right,0)
    print (count)
This seems to run through the list and check for max/min values fewer times. Note that it doesn't distinguish between sub-lists that are the same but come from different positions in the original list (as would arise from a list such as [0,1,0,0,1,0]). But the code from the original post wouldn't do that either, so I am guessing this is not a requirement.
I want to compare a number (dist) against each element of a sorted list (mylist).
If the number is smaller than the first element in mylist, then I have to find the right place for dist, eliminate the first element of mylist and shift the list.
My main problem now is the case when dist is smaller than the 1st element in mylist. The index is out of range ...
dist = 10
mylist = [40, 30, 20, 15] # this is a sorted list
for j in range(0, len(mylist)):
    if mylist[j] < dist & dist> mylist[j+1]:
        print (mylist[j], '<' ,dist, '>', mylist[j+1])
        #drop 40
        #shift the list so that is becomes: [30,20, 15,10]
IIUC, you want to insert dist at its right position and knock off the first element. This is fine, but you have a few issues. The major one is your condition, and I'm not talking about the & (that is the bitwise operator; use and for a logical test). Since mylist is sorted in descending order, dist belongs between two neighbours when it is smaller than the current element and greater than the next. You can write that as a chained comparison:
if mylist[j] > dist > mylist[j+1]:
You'll also have to stop one short of len(mylist) so that mylist[j+1] stays in bounds.
Another trick you can use is for...else: the else branch runs only when the loop finishes without hitting break, which covers the corner case where dist was not inserted anywhere (for example when it is smaller than every element, as in your example).
In short, try this:
for j in range(len(mylist) - 1):
    if mylist[j] > dist > mylist[j + 1]:
        mylist.insert(j + 1, dist)
        mylist = mylist[1:]
        break
else:
    mylist.append(dist)
    mylist = mylist[1:]
Alternatively, you can run from 1 to len(mylist) and check for mylist[j - 1] > dist > mylist[j].
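The same idea can be wrapped in a function that leaves the input untouched. This is a rough sketch that assumes, as in the question's data, a list sorted in descending order; the name insert_and_drop_first is made up for the illustration.

```python
def insert_and_drop_first(values, dist):
    """Insert dist into a descending list, then drop the first (largest) item.

    Assumes values is sorted in descending order; returns a new list.
    """
    position = len(values)  # default: dist is smaller than everything
    for index, value in enumerate(values):
        if value < dist:
            position = index  # first element smaller than dist
            break
    updated = values[:position] + [dist] + values[position:]
    return updated[1:]


print(insert_and_drop_first([40, 30, 20, 15], 10))
print(insert_and_drop_first([40, 30, 20, 15], 25))
```

Note that if dist is larger than every element it becomes the first item and is therefore the one that gets dropped, so the list comes back unchanged.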
I was solving a programming puzzle involving combinations. It led me to a wonderful itertools.combinations function and I'd like to know how it works under the hood. Documentation says that the algorithm is roughly equivalent to the following:
def combinations(iterable, r):
    # combinations('ABCD', 2) --> AB AC AD BC BD CD
    # combinations(range(4), 3) --> 012 013 023 123
    pool = tuple(iterable)
    n = len(pool)
    if r > n:
        return
    indices = list(range(r))
    yield tuple(pool[i] for i in indices)
    while True:
        for i in reversed(range(r)):
            if indices[i] != i + n - r:
                break
        else:
            return
        indices[i] += 1
        for j in range(i+1, r):
            indices[j] = indices[j-1] + 1
        yield tuple(pool[i] for i in indices)
I got the idea: we start with the most obvious combination (r first consecutive elements). Then we change one (last) item to get each subsequent combination.
The thing I'm struggling with is the conditional inside the for loop.
for i in reversed(range(r)):
    if indices[i] != i + n - r:
        break
This expression is very terse, and I suspect it's where all the magic happens. Please give me a hint so I can figure it out.
The loop has two purposes:
Terminating if the last index-list has been reached.
Determining the right-most position in the index-list that can legally be increased. This position is then the starting point for resetting all indices to its right.
Let us say you have an iterable over 5 elements, and want combinations of length 3. What you essentially need for this is to generate lists of indexes. The juicy part of the above algorithm generates the next such index-list from the current one:
# obvious
index-pool:       [0,1,2,3,4]
first index-list: [0,1,2]
                  [0,1,3]
                  ...
                  [1,3,4]
last index-list:  [2,3,4]
i + n - r is the max value for index i in the index-list:
index 0: i + n - r = 0 + 5 - 3 = 2
index 1: i + n - r = 1 + 5 - 3 = 3
index 2: i + n - r = 2 + 5 - 3 = 4
# compare last index-list above
=>
for i in reversed(range(r)):
    if indices[i] != i + n - r:
        break
else:
    return
This loops backwards through the current index-list and stops at the first position that doesn't hold its maximum index-value. If all positions hold their maximum index-value, there is no further index-list, thus return.
In the general case of [0,1,4] one can verify that the next list should be [0,2,3]. The loop stops at position 1, the subsequent code
indices[i] += 1
increments the value of indices[i] (1 -> 2). Finally
for j in range(i+1, r):
    indices[j] = indices[j-1] + 1
resets all positions > i to the smallest legal index-values, each 1 larger than its predecessor.
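The single step from one index-list to the next can be pulled out into its own function to experiment with. A minimal sketch of that step, following the loop discussed here; the function name next_indices is made up for the illustration and is not part of itertools.

```python
def next_indices(indices, n):
    """Return the index-list that follows `indices`, or None if it was the last one."""
    r = len(indices)
    result = list(indices)
    # Scan right-to-left for a position that is not yet at its maximum.
    for i in reversed(range(r)):
        if result[i] != i + n - r:
            break
    else:
        return None  # every position is at its maximum: nothing follows
    result[i] += 1
    # Reset everything to the right to the smallest legal values.
    for j in range(i + 1, r):
        result[j] = result[j - 1] + 1
    return result


print(next_indices([0, 1, 4], 5))
print(next_indices([2, 3, 4], 5))
```

The first call reproduces the [0,1,4] to [0,2,3] step from the text, and the second shows the terminating case.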
This for loop does a simple thing: it checks whether the algorithm should terminate.
The algorithm starts with the first r items and increases the indices until it reaches the last r items in the iterable, which are [S(n-r+1), ..., S(n-1), S(n)] (if we let S be the iterable, numbered from 1).
Now, the algorithm scans every item in indices and makes sure it still has somewhere to go: it verifies that the ith index is not n - r + i, which by the previous paragraph is its final position (we ignore the 1 here because lists are 0-based).
If all of the indices are at the last r positions, the loop falls through to the else, executing the return and terminating the algorithm.
We could create the same functionality by using
if indices == list(range(n-r, n)): return
but the main reason for this "mess" (using reversed and break) is that the first index from the end that doesn't match is saved in i and is used in the next stage of the algorithm, which increments this index and takes care of resetting the rest.
You could check this by replacing the yields with
print('Combination: {} Indices: {}'.format(tuple(pool[i] for i in indices), indices))
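As one way to try that out without editing the original, here is a sketch of a traced variant that yields each combination together with the index-list that produced it. The stepping logic is copied from the "roughly equivalent" code in the question; the name traced_combinations is made up for the illustration.

```python
from itertools import combinations as reference_combinations


def traced_combinations(iterable, r):
    """Yield (combination, indices) pairs using the documented stepping logic."""
    pool = tuple(iterable)
    n = len(pool)
    if r > n:
        return
    indices = list(range(r))
    yield tuple(pool[i] for i in indices), tuple(indices)
    while True:
        for i in reversed(range(r)):
            if indices[i] != i + n - r:
                break
        else:
            return
        indices[i] += 1
        for j in range(i + 1, r):
            indices[j] = indices[j - 1] + 1
        yield tuple(pool[i] for i in indices), tuple(indices)


trace = list(traced_combinations("ABCD", 2))
for combo, idx in trace:
    print("Combination: {} Indices: {}".format(combo, idx))
```

Comparing the combinations in trace with itertools.combinations("ABCD", 2) is a quick sanity check that the traced copy behaves like the real thing.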
Source code has some additional information about what is going on.
The yield statement before the while loop returns a trivial combination of elements (simply the first r elements of A, (A[0], ..., A[r-1])) and prepares indices for future work.
Let's say that we have A='ABCDE' and r=3. Then, after the first step the value of indices is [0, 1, 2], which points to ('A', 'B', 'C').
Let's look at the source code of the loop in question:
/* Scan indices right-to-left until finding one that is not
   at its maximum (i + n - r). */
for (i=r-1 ; i >= 0 && indices[i] == i+n-r ; i--)
    ;
This loop searches for the rightmost element of indices that hasn't reached its maximum value yet. After the very first yield statement the value of indices is [0, 1, 2]. The last element, indices[2] = 2, is still below its maximum of 4, so the for loop terminates immediately at indices[2].
Next, the following code increments the ith element of indices:
/* Increment the current index which we know is not at its
   maximum. Then move back to the right setting each index
   to its lowest possible value (one higher than the index
   to its left -- this maintains the sort order invariant). */
indices[i]++;
As a result, we get index combination [0, 1, 3], which points to ('A', 'B', 'D').
Then we roll back the subsequent indices if they are too big:
for (j=i+1 ; j<r ; j++)
    indices[j] = indices[j-1] + 1;
Indices increase step by step:
step indices
(0, 1, 2)
(0, 1, 3)
(0, 1, 4)
(0, 2, 3)
(0, 2, 4)
(0, 3, 4)
(1, 2, 3)
...
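The listed progression can be reproduced from Python. When the pool is range(5), each combination is identical to its index-list, so itertools.combinations shows the indices directly. A small sketch; the name first_steps is made up for the illustration.

```python
from itertools import combinations, islice

# With pool = range(5), every yielded tuple equals the index-list behind it.
first_steps = list(islice(combinations(range(5), 3), 7))
for step, indices in enumerate(first_steps, start=1):
    print(step, indices)
```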