Python ~ First Covering Prefix in Array - python

So I recently gave an online interview for a job, although my expertise is in networks and cyber security.
I came across this question:
Write a function which takes an array of integers and returns the
first covering prefix of that array. The "first covering prefix" of an
array, A, of length N is the smallest index P such that 0 <= P <= N
and each element in A also appears in the list of elements A[0]
through A[P]. For example, the first covering prefix of the following
array: A = [5, 3, 19, 7, 3, 5, 7, 3] is 3, because the elements from
A[0] to A[3] (equal to [5, 3, 19, 7]) contains all values that occur
in array A.
Although I am not a programmer (chose python3 for the interview),
I would like someone to explain the logic behind this.
I just want to learn; it's been bugging me for a day now.

You can iterate over all elements and, whenever an element has not been seen yet (use a set to keep track efficiently), update P:
A = [5, 3, 19, 7, 3, 5, 7, 3]
S = set()
P = 0  # you could set -1/None as default to account for empty lists?
for i, item in enumerate(A):  # iterate elements together with indices
    if item not in S:  # if we haven't seen this element yet
        P = i  # update P as the current index
        S.add(item)  # add the element to the set
print(P)
output: 3
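The same idea can be wrapped in a function. This is only a minimal sketch: the name first_covering_prefix and the choice to return None for an empty list are assumptions, since the interview question does not say what an empty array should produce.

```python
def first_covering_prefix(values):
    """Return the smallest index P such that values[0..P] contains every
    distinct value in the list. Returns None for an empty list (assumption)."""
    seen = set()
    prefix_end = None
    for index, value in enumerate(values):
        if value not in seen:
            seen.add(value)
            # A value appearing for the first time pushes the prefix end here.
            prefix_end = index
    return prefix_end


print(first_covering_prefix([5, 3, 19, 7, 3, 5, 7, 3]))  # 3
```

The answer is simply the last index at which a previously unseen value shows up, so one pass with a set is enough.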

Related

Find index of minimum value in a Python sublist - min() returns index of minimum value in list

I've been working on implementing common sorting algorithms into Python, and whilst working on selection sort I ran into a problem finding the minimum value of a sublist and swapping it with the first value of the sublist, which from my testing appears to be due to a problem with how I am using min() in my program.
Here is my code:
def selection_sort(li):
    for i in range(0, len(li)):
        a, b = i, li.index(min(li[i:]))
        li[a], li[b] = li[b], li[a]
This works fine for lists that have zero duplicate elements within them:
>>> selection_sort([9,8,7,6,5,4,3,2,1])
[1, 2, 3, 4, 5, 6, 7, 8, 9]
However, it completely fails when there are duplicate elements within the list.
>>> selection_sort([9,8,8,7,6,6,5,5,5,4,2,1,1])
[8, 8, 7, 6, 6, 5, 5, 5, 4, 2, 9, 1, 1]
I tried to solve this problem by examining what min() is doing on line 3 of my code, and found that the index I get for the smallest element of the sublist is its index within the larger list rather than within the sublist, which I hope this experiment illustrates more clearly:
>>> a = [1,2,1,1,2]
>>> min(a)
1 # expected
>>> a.index(min(a))
0 # also expected
>>> a.index(min(a[1:]))
0 # should be 1?
I'm not sure what is causing this behaviour; it could be possible to copy li[i:] into a temporary variable b and then do b.index(min(b)), but copying li[i:] into b for each loop might require a lot of memory, and selection sort is an in-place algorithm so I am uncertain as to whether this approach is ideal.
You're not quite getting the concept correctly!
li.index(item) will return the index of the first appearance of that item in the list li.
If you are finding the minimum element of the sublist, search for that element in the sublist as well, instead of in the whole list. Searching in the sliced list gives you an index relative to the sublist, but you can easily fix that by adding the slice's starting offset to the returned index.
A small fix for your problem would be:
def selection_sort(li):
    for i in range(0, len(li)):
        a, b = i, i + li[i:].index(min(li[i:]))
        li[a], li[b] = li[b], li[a]
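If the copies made by li[i:] are a concern, the search can be done over indices instead of over a slice. The following is a hedged sketch rather than the answerer's code: it uses min() with a key function over range(i, len(li)), and it returns the list purely for convenience.

```python
def selection_sort(li):
    """Sort li in place with selection sort and return it for convenience."""
    for i in range(len(li)):
        # Index of the smallest element among li[i], li[i+1], ...,
        # found without building a slice.
        smallest = min(range(i, len(li)), key=li.__getitem__)
        li[i], li[smallest] = li[smallest], li[i]
    return li


print(selection_sort([9, 8, 8, 7, 6, 6, 5, 5, 5, 4, 2, 1, 1]))
# [1, 1, 2, 4, 5, 5, 5, 6, 6, 7, 8, 8, 9]
```

Because the candidates are positions in the full list, no offset has to be added back, and duplicates are handled correctly.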
When you write a.index(min(a[1:])) you are searching for the first occurrence of the min of a[1:], but you are searching in the original list. That's why you get 0 as a result.
By the way, the function you are looking for is generally called argmin. It is not part of pure Python, but the numpy module has it.
One way to get every index at which the minimum occurs is a list comprehension:
idxs = [i for i, val in enumerate(a) if val == min(a)]
Or even better, write your own loop, which is asymptotically faster because it does not recompute min(a) for every element:
idxs = []
minval = None
for i, val in enumerate(a):
    if minval is None or minval > val:
        idxs = [i]
        minval = val
    elif minval == val:
        idxs.append(i)
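For the single-index case, the argmin idea mentioned above can also be written in pure Python. This is a small sketch; the helper name argmin is my own, and it assumes a non-empty list.

```python
def argmin(values):
    """Return the index of the first minimum of a non-empty list."""
    return min(range(len(values)), key=values.__getitem__)


a = [1, 2, 1, 1, 2]
print(argmin(a))           # 0
print(argmin(a[1:]))       # 1, relative to the sublist
print(1 + argmin(a[1:]))   # 2, relative to the original list
```

The last two lines show the offset issue from the question: an index found in a[1:] has to be shifted by the slice start to refer to a.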

python pop() giving unexpected results [duplicate]

I was practicing using a Python list with a for loop, but was surprised to see that the order of the items in the list changed.
xlist=[1,2,3,4,5]
print(xlist)
#loop all items in the xlist
for item in xlist:
    print(item)
    #multiply each item by 5
    xlist[xlist.index(item)] = item * 5
    #print the list
    print(xlist)
I was expecting the list to end up as [5,10,15,20,25], but instead I got [25, 10, 15, 20, 5].
I am using Python 3.8 (32-bit version) with the PyCharm IDE.
Can anyone clarify why the order of the list has changed?
You are not using the .index method correctly. There are two problems. First, semantically, it doesn't mean what you think it means: it gives you the first index of some object in a list. So note that on your last iteration:
xlist.index(5) == 0
Because on your first iteration, you set:
xlist[0] = 1 * 5
The correct way to do this is to maintain an index as you iterate, either manually by using something like index = 0 outside the loop and incrementing it, or by iterating over a range and extracting the item using that index. But the pythonic way to do this is to use enumerate, which automatically provides a counter when you loop:
for index, item in enumerate(xlist):
    xlist[index] = item*5
The other problem is that even if your items were all unique and the index returned was correct, using .index in a loop unnecessarily makes your algorithm quadratic time, since .index takes linear time.
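As a small illustration of the enumerate approach, here is a sketch with a hypothetical helper named scale_in_place, followed by an alternative that uses slice assignment with a list comprehension. Both update the existing list object rather than creating a new variable.

```python
def scale_in_place(values, factor):
    """Multiply every element of values by factor, modifying the list itself."""
    for index, item in enumerate(values):
        # The index comes from enumerate, so duplicate values cannot confuse it.
        values[index] = item * factor


xlist = [1, 2, 3, 4, 5]
scale_in_place(xlist, 5)
print(xlist)  # [5, 10, 15, 20, 25]

# Alternative: build the new values first, then overwrite the contents in place.
ylist = [5, 1, 5]
ylist[:] = [item * 5 for item in ylist]
print(ylist)  # [25, 5, 25]
```

Each element is visited once, so both versions run in linear time.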
The index method returns the index of the first occurrence of the item you have passed as an argument (assuming it exists). So, by the time you reach the last element, i.e. 5 at index 4, the item at index 0 is also 5, so you get 5 * 5 at index 0 in the final result.
When the index method is searching for the 5th number (5), it locates the first index that has that value. At this point in time, index 0 (the 1st number) is also 5, so the loop overwrites index 0 with 5 * 5. A better way to loop through is to use the enumerate function to get each index together with its item and modify the number at that index, rather than finding the index afterwards. This eliminates the trouble with the index method.
xlist=[1,2,3,4,5]
print(xlist)
#loop all items in the xlist
for i, item in enumerate(xlist):
    print(item)
    #multiply each item by 5
    xlist[i] *= 5
    #print the list
    print(xlist)
Results:
[1, 2, 3, 4, 5]
1
[5, 2, 3, 4, 5]
2
[5, 10, 3, 4, 5]
3
[5, 10, 15, 4, 5]
4
[5, 10, 15, 20, 5]
5
[5, 10, 15, 20, 25]

Generic algorithm for element wise operation between hundreds of lists

I have been tasked with writing an algorithm for a project. Basically, I scan the data to get the unique items and store their positions in an array, so I end up with multiple arrays of variable length. Now I have to do element-wise operations on ALL of these arrays and their elements. Note that these will always be sorted (if that matters).
a = [0, 7, 13, 18]
b = [1, 2, 8, 10]
c = [0, 3, 5, 6, 7]
The current solution I have is a pretty basic loop: I loop through every array and compare its elements with every other array and its elements. It works for a small number of arrays but, as you can imagine, doesn't work well when I have a lot of unique items, each with its own array/list.
def add(a, b):
    result = []
    for i in range(len(a)):
        for j in range(len(b)):
            result.append(a[i] + b[j])
    return result
a = [0, 7, 13, 18]
b = [1, 2, 8, 10]
c = [0, 3, 5, 6, 7]
total_unique_items = [a, b, c]
calc = []
for i in range(len(total_unique_items)):
    for j in range(i+1, len(total_unique_items)):
        calc.append(add(total_unique_items[i], total_unique_items[j]))
print(calc)
I know there are pythonic solutions like zip, but my teacher is asking for a generic, language-independent solution here.
I am not really sure how to tackle this problem. One way would be to use a data structure like a tree or a graph and traverse it. The other way would be to find a way to perform the operation on the ith element of every array in the ith iteration of the loop. This way, my main loop would run for the length of the longest array. I am just really confused about it and would love to get an idea of the direction I should go from here.
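The second idea (handling the ith element of every array in the ith iteration) could look roughly like the sketch below. It rests on assumptions the question does not confirm: that "element-wise" means combining values that share the same index, that the operation is addition, and that arrays shorter than the current index simply contribute nothing. The function name elementwise_sum is hypothetical, and only plain loops are used so the logic carries over to other languages.

```python
def elementwise_sum(arrays):
    """Sum the values found at each index across all arrays (assumed meaning)."""
    # Find the length of the longest array with a plain loop.
    longest = 0
    for array in arrays:
        if len(array) > longest:
            longest = len(array)

    result = []
    for i in range(longest):
        total = 0
        for array in arrays:
            if i < len(array):  # shorter arrays have no ith element
                total += array[i]
        result.append(total)
    return result


a = [0, 7, 13, 18]
b = [1, 2, 8, 10]
c = [0, 3, 5, 6, 7]
print(elementwise_sum([a, b, c]))  # [1, 12, 26, 34, 7]
```

The work here grows with the number of arrays times the length of the longest one, instead of with every pair of elements from every pair of arrays.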

(Python) I need to remove the duplicate elements in a list (use remove funct; avoid typecasting). Trying to determine why my solution is incorrect

Specifications:
I want to use the remove function (in lists) and I'd prefer to avoid typecasting.
l = [2, 3, 3, 4, 6, 4, 6, 5]
q=len(l)
for i in range (0, q):
    for g in range (i+1, q):
        if l[g]==l[i]:
            q-=1 #decremented q to account for reduction in list size.
            l.remove(l[g])
print(l)
Error: if l[g]==l[i]:
IndexError: list index out of range
I know that similar questions have been asked by users previously. As the aforementioned constraints were absent in them, I would like to request you to treat this as a separate question. Thanks!
>>> l = [2, 3, 3, 4, 6, 4, 6, 5]
>>> s = set(l)
>>> t = sorted(s)
>>> print(t)
[2, 3, 4, 5, 6]
Using set is a simple and straightforward way to filter your collection. If you don't need the list in a specific order, you can just use the set from there on. The sorted function returns a list (using the default ordering).
Since you mentioned you don't want typecasting, my solution uses while loops:
l = [2, 3, 3, 4, 6, 4, 6, 5]
i = 0
while i < len(l):  # len(l) is re-evaluated on every pass, unlike range()
    g = i + 1
    while g < len(l):
        if l[g] == l[i]:
            l.remove(l[g])  # removes the first occurrence, i.e. the element at index i
            g = i + 1  # l[i] is now a different element, so restart the scan
        else:
            g += 1
    i += 1
print(l)
Now, allow me to explain the problem in your code. range() evaluates its arguments once, when the loop starts, so changing q inside the loop does not change how many times the loop runs, and eventually you index past the end of the shortened list and get the IndexError.
Hope this helps you :)
Your solution does not work because range(q) is evaluated once, when the loop starts, so later changes to q's value are ignored. E.g.:
>>> m = 10
>>> for i in range(m):
...     m=0
...     print(i)
...
0
1
2
3
4
5
6
7
8
9
Even if I change m, range() will still go through the loop 10 times. So, when you change the size of the list, even if you change q, you will still try to reach elements that do not exist anymore.
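The difference between the two loop types can be seen side by side in the sketch below. The helper names are hypothetical: a for loop over range(limit) fixes its bounds up front, whereas a while loop re-checks its condition on every pass.

```python
def iterations_with_range(limit):
    """Count loop passes when the bound is changed inside a for/range loop."""
    count = 0
    for _ in range(limit):  # range(limit) is built once, before the first pass
        limit = 0           # has no effect on the loop that is already running
        count += 1
    return count


def iterations_with_while(limit):
    """Count loop passes when the bound is changed inside a while loop."""
    count = 0
    i = 0
    while i < limit:        # the condition is re-evaluated on every pass
        limit = 0
        count += 1
        i += 1
    return count


print(iterations_with_range(10))  # 10
print(iterations_with_while(10))  # 1
```

This is why a while loop whose condition reads the current length is the usual choice when a list shrinks during iteration.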

Python: Inplace Merge sort implementation issue

I am implementing an in-place merge sort algorithm in python3. The code takes an input array and calls itself recursively (with the split array as input) if the length of the input array is more than one. After that, it joins the two sorted arrays. Here is the code:
def merge_sort(array):
    """
    Input : list of values
    Note :
        It divides input array in two halves, calls itself for the two halves and then merges the two sorted halves.
    Returns : sorted list of values
    """
    def join_sorted_arrays(array1, array2):
        """
        Input : 2 sorted arrays.
        Returns : New sorted array
        """
        new_array = [] # this array will contain values from both input arrays.
        j = 0 # Index to keep track where we have reached in second array
        n2 = len(array2)
        for i, element in enumerate(array1):
            # We will compare current element in array1 to current element in array2, if element in array2 is smaller, append it
            # to new array and look at next element in array2. Keep doing this until either array2 is exhausted or an element of
            # array2 greater than current element of array1 is found.
            while j < n2 and element > array2[j]:
                new_array.append(array2[j])
                j += 1
            new_array.append(element)
        # If there are any remaining values in array2, that are bigger than last element in array1, then append those to
        # new array.
        for i in range(j,n2):
            new_array.append(array2[i])
        return new_array
    n = len(array)
    if n == 1:
        return array
    else:
        # print('array1 = {0}, array2 = {1}'.format(array[:int(n/2)], array[int(n/2):]))
        array[:int(n/2)] = merge_sort(array[:int(n/2)])
        array[int(n/2):] = merge_sort(array[int(n/2):])
        # print('array before joining : ',array)
        array = join_sorted_arrays(array[:int(n/2)],array[int(n/2):])
        # print('array after joining : ',array)
        return array
Now if the code is tested,
a = [2,1,4,3,1,2,3,4,2,7,8,10,3,4]
merge_sort(a)
print(a)
out : [1, 1, 2, 2, 3, 3, 4, 2, 3, 4, 4, 7, 8, 10]
If you uncomment the print statements in the above function, you will notice that a equals the output shown above just before the last call of join_sorted_arrays. After this function has been called, array 'a' should be sorted. To my surprise, if I do the following, the output is correct.
a = [2,1,4,3,1,2,3,4,2,7,8,10,3,4]
a = merge_sort(a)
print(a)
out : [1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 7, 8, 10]
I need some help to understand why this is happening.
I am a beginner, so any other comments about coding practices etc. are also welcome.
When you reassign array to the output of join_sorted_arrays() with
array = join_sorted_arrays(array[:int(n/2)],array[int(n/2):])
you're not updating the value of a anymore.
Seeing as you pass in a as the argument array, it's understandable why every assignment to array inside the function might seem like it should update the original list (aka a). But array = join_sorted_arrays(...) rebinds the local name array to a brand-new list object and leaves the list that a refers to untouched. Returning array from the function returns that new, sorted list.
The list that a refers to was being modified in place (by the two slice assignments) up until that last statement, which is why print(a) shows something different after merge_sort(a). But you'll only get the final, sorted output from the return value of merge_sort().
It might be clearer if you look at:
b = merge_sort(a)
print(a) # [1, 1, 2, 2, 3, 3, 4, 2, 3, 4, 4, 7, 8, 10]
print(b) # [1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 7, 8, 10]
Note that Python isn't a pass-by-reference language, and the details of what it actually is can be a little weird to suss out at first. I'm always going back to read on how it works when I get tripped up. There are plenty of SO posts on the topic, which may be of some use to you here.
For example, this one and this one.
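The distinction between rebinding a name and mutating the list it refers to can be seen in a stripped-down sketch. The function names rebind and mutate are hypothetical, and sorted() stands in for the merge step.

```python
def rebind(array):
    # Plain assignment points the local name at a new list;
    # the caller's list is left untouched.
    array = sorted(array)
    return array


def mutate(array):
    # Slice assignment replaces the contents of the existing list object,
    # so the caller sees the change.
    array[:] = sorted(array)
    return array


a = [3, 1, 2]
b = rebind(a)
print(a, b)        # [3, 1, 2] [1, 2, 3]

c = mutate(a)
print(a, c is a)   # [1, 2, 3] True
```

Under that reading, writing the last assignment in merge_sort as array[:] = join_sorted_arrays(...) should make merge_sort(a) leave a sorted as well, although returning the result and using a = merge_sort(a) works just as well.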
