Given a sorted list such as:
[1, 2, 2, 3, 3, 4]
My goal is to check if there are any numbers repeated, and if so, shift the element and all the numbers before it by one to the left as such:
[1, 2, 2, 3, 3, 4]
[0, 1, 2, 3, 3, 4]
[-1, 0, 1, 2, 3, 4]
Right now this is my approach:
def shifting(data):
    i = 0
    while i < len(data) - 1:
        if data[i] == data[i+1]:
            j = i
            while j >= 0:
                data[j] -= 1
                j -= 1
        i += 1
    return data
But this is an O(n^2) algorithm and takes a lot of time to run with very long lists. I want to find a more efficient approach. Any ideas?
I would agree with Stef. The main idea is that you don't really need to have this nested while loop. All you need is a single pass to pinpoint where the duplications occur and apply a shift accordingly.
I'll propose something a little bit more complex but might be more compact:
import numpy as np
input_list = [1, 2, 2, 3, 3, 4]
# Convert to array for easier indexing
output_ary = np.array(input_list)
# Pinpoint the location at which duplications occur
duplication_indicator = output_ary[:-1] - output_ary[1:] == 0
# Compute the corresponding shift
shift = np.cumsum(duplication_indicator[::-1])[::-1]
# Apply the shift
output_ary[:-1] -= shift
# Convert back to list
output_list = output_ary.tolist()
The main idea is that after you've pinpointed the duplication locations, you can compute the corresponding shift by looking at how many more duplications occur to the right. This could be done by simply doing a reversed cumulative sum (summing from the right to left). Applying this shift to the original list then gives the desired output.
Iterate on the data from right to left. Keep a counter decrement that tells you how many duplicates you've encountered so far, and thus, by how much you want to decrement every element you see.
This is linear instead of quadratic: you only iterate on the data once.
When writing Python code, I strongly suggest using for-loops rather than while-loops whenever you can, and in particular when you know the length of the loop in advance.
In your code, i = 0; while i < len(data) - 1: i += 1 can be replaced by for i in range(len(data)-1):.
To iterate from right to left: for i in range(len(data)-1, -1, -1):
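The right-to-left pass described above could be sketched as follows. The function name shift_left is made up for illustration, and the sketch assumes the input is sorted and returns a new list rather than modifying the argument, so that neighbours are always compared using their original values:

```python
def shift_left(data):
    """Return a shifted copy of the sorted list `data` using one right-to-left pass."""
    result = list(data)   # copy, so comparisons below always see the original values
    decrement = 0         # number of duplicate pairs seen so far (to the right)
    for i in range(len(data) - 2, -1, -1):
        if data[i] == data[i + 1]:
            decrement += 1
        result[i] = data[i] - decrement
    return result


print(shift_left([1, 2, 2, 3, 3, 4]))  # [-1, 0, 1, 2, 3, 4]
```

Each element is visited once, so the running time grows linearly with the length of the list.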
The logic is not fully clear, but IIUC it seems easy to remove the duplicates with np.unique, then to left-fill the array with a range to get back to the initial length:
a2 = np.unique(a)
out = np.r_[np.arange(-(a.shape[0]-a2.shape[0]), 0)+1, a2]
output:
array([-1, 0, 1, 2, 3, 4])
on a = np.array([1,2,2,6,6,7]):
array([-1, 0, 1, 2, 6, 7])
I was able to get the list to contain the correct number of occurrences for each element; however, the output I am getting is in the wrong order. So my question is: how would I reorder this to match the expected output? (Or write code that outputs what's expected. I would prefer help with fixing the code I wrote, however.)
My code:
def delete_nth(order,max_e):
    for i in order:
        if i in order:
            if order.count(i)>max_e:
                order.remove(i)
    return order
My Inputs:
order = [1, 2, 3, 1, 1, 2, 1, 2, 3, 3, 2, 4, 5, 3, 1], max_e = 3
My output:
[1, 2, 1, 2, 3, 3, 2, 4, 5, 3, 1]
Should equal:
[1, 2, 3, 1, 1, 2, 2, 3, 3, 4, 5]
The prompt:
Alice and Bob were on a holiday. Both of them took many pictures of
the places they've been, and now they want to show Charlie their
entire collection. However, Charlie doesn't like these sessions,
since the motive usually repeats. He isn't fond of seeing the Eiffel
tower 40 times. He tells them that he will only sit during the
session if they show the same motive at most N times. Luckily, Alice
and Bob are able to encode the motive as a number. Can you help them
to remove numbers such that their list contains each number only up
to N times, without changing the order?
Task
Given a list lst and a number N, create a new list that contains each
number of lst at most N times without reordering. For example if N = 2, and the input is [1, 2, 3, 1, 2, 1, 2, 3], you take [1, 2, 3, 1, 2], drop the
next [1, 2] since this would lead to 1 and 2 being in the result 3
times, and then take 3, which leads to [1, 2, 3, 1, 2, 3].
Let's take a look at your code first:
def delete_nth(order, max_e):
    for i in order:
        if i in order:  # this is (or should be) redundant
            if order.count(i) > max_e:  # note this scans order many times
                order.remove(i)  # removes the "first" occurrence not the current
    return order
Word of warning, removing items from a list while iterating over it is full of pitfalls. For example after we do a remove() and go to the next iteration, are we sure what the "next" i is going to be?
Anyway, the main problem is that we are removing the items from the start of the list rather than from the end of it. There is also the issue of scanning items many times, but we can look at that later.
One option then might be to work on a reversed version of the list:
NOTE: do not use delete_nth_v2(); it is just here to illustrate a point.
def delete_nth_v2(order, max_e):
    order_reversed = list(reversed(order))
    for i in order_reversed:
        if order_reversed.count(i) > max_e:
            order_reversed.remove(i)
    return list(reversed(order_reversed))
This looks like it will do the trick a little better, but we actually still have the problem that i is likely NOT what we expect it to be.
To see this, add in a print() statement:
def delete_nth_v2(order, max_e):
    order_reversed = list(reversed(order))
    for i in order_reversed:
        print(i, order_reversed)
        if order_reversed.count(i) > max_e:
            order_reversed.remove(i)
    return list(reversed(order_reversed))
delete_nth_v2([1,2,3,1,2,3,1,2,3,4], 2)
You will see that on the 3rd iteration, we skip what we might have hoped was i == 2. Bummer :-(
Perhaps though there is a way we can track indexes more manually ourselves. This might allow us to also avoid reversing the lists
def delete_nth_v3(order, max_e):
    for index in range(len(order) - 1, -1, -1):
        if order.count(order[index]) > max_e:
            order.pop(index)
    return order
Now we are getting someplace. This even produces the "correct" result :-) This seems better and is in line with how we started, but there is still the nagging opportunity to avoid searching the entire list for each item in the list.
Why don't we just build a new list while keeping track of how many of each item we have already seen. This trades a little extra space to avoid the repeated searches.
def delete_nth_v4(items, at_most):
    counts = {}
    keepers = []
    for item in items:
        if counts.setdefault(item, 0) + 1 <= at_most:
            counts[item] += 1
            keepers.append(item)
    return keepers
Again we get the correct result and this time it is potentially much faster (at least with larger lists) :-)
Finally, if it was me, I would duck being responsible for the space of the "new" list and I would look at yielding the results back to the caller. I would also probably swap out setdefault() for a collections.defaultdict()
import collections
def delete_nth_v5(items, at_most):
    counts = collections.defaultdict(int)
    for item in items:
        if counts[item] < at_most:
            counts[item] += 1
            yield item
Note we can verify the equivalence via:
import random
motives = [random.randint(0, 100) for _ in range(1_000)]
print(list(delete_nth_v5(motives, 2)) == delete_nth_v4(motives, 2))
Your code has a few flaws, but the one thing you should never, ever do (unless you really know what is going on) is remove elements from a list (or other collection) while iterating over it:
l = [1, 2, 3, 4, 5]
for el in l:
    l.remove(el)
print(l)  # it's not empty!
I suggest iterating over the list and counting elements (with a dict or Counter) while building a new list that holds each element at most max_e times:
from collections import Counter

def delete_nth(order, max_e):
    c = Counter()  # how many times each element has been kept so far
    res = []
    for el in order:
        if c[el] < max_e:
            c[el] += 1
            res.append(el)
    return res
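As a quick check of the counting approach against the two examples from the question (the inputs and expected outputs are copied from the question; the function is repeated here only so the snippet runs on its own):

```python
from collections import Counter


def delete_nth(order, max_e):
    """Keep each element at most max_e times, preserving the original order."""
    kept = Counter()
    res = []
    for el in order:
        if kept[el] < max_e:
            kept[el] += 1
            res.append(el)
    return res


order = [1, 2, 3, 1, 1, 2, 1, 2, 3, 3, 2, 4, 5, 3, 1]
print(delete_nth(order, 3))                     # [1, 2, 3, 1, 1, 2, 2, 3, 3, 4, 5]
print(delete_nth([1, 2, 3, 1, 2, 1, 2, 3], 2))  # [1, 2, 3, 1, 2, 3]
```

Because a new list is built, the input list is left untouched and nothing is removed during iteration.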
I am new to Python and learning data structure in Python. I am trying to implement a bubble sort algorithm in python and I did well but I was not getting a correct result. Then I found some tutorial and there I saw that they are first setting a base range for checking.
So the syntax of range in python is:
range([start], stop[, step])
And the bubble sort algorithm is:
def bubbleSort(alist):
    for i in range(len(alist) - 1, 0, -1):
        for j in range(i):
            if alist[j] > alist[j+1]:
                temp = alist[j]
                alist[j] = alist[j+1]
                alist[j+1] = temp
    return alist
print(bubbleSort([5, 1, 2, 3, 9, 8, 0]))
I understood all the other logic of the algorithm but I am not able to get why the loop is starting from the end of the list and going till first element of the list:
for i in range(len(alist) - 1, 0, -1):
Why is this traversing the list in reverse? The main purpose of this loop is setting the range condition only so why can't we traverse from the first element to len(list) - 1 like this:
for i in range(0, len(alist) - 1, 1):
In your code, the index i is the largest index that the inner loop will consider when swapping the elements. The way bubble sort works is by swapping sibling elements to move the largest element to the right.
This means that after the first outer iteration (or the first full cycle of the inner loop), the largest element of your list is positioned at the far end of the list. So it’s already in its correct place and does not need to be considered again. That’s why for the next iteration, i is one less to skip the last element and only look at the items 0..len(lst)-1.
Then in the next iteration, the last two elements will be sorted correctly, so it only needs to look at the item 0..len(lst)-2, and so on.
So you want to decrement i since more and more elements at the end of the list will already be in their correct positions and don’t need to be looked at any longer. You don’t have to do that; you could also just always have the inner loop go up to the very end, but you don’t need to, so you can skip a few iterations by not doing it.
I asked why we are going reverse in the list like len(list)-1,0. Why are we not going forward way like 0,len(list)-1?
I was hoping that the above explanation would already cover that but let’s go into detail. Try adding a print(i, alist) at the end of the outer loop. So you get the result for every iteration of i:
>>> bubbleSort([5, 1, 3, 9, 2, 8, 0])
6 [1, 3, 5, 2, 8, 0, 9]
5 [1, 3, 2, 5, 0, 8, 9]
4 [1, 2, 3, 0, 5, 8, 9]
3 [1, 2, 0, 3, 5, 8, 9]
2 [1, 0, 2, 3, 5, 8, 9]
1 [0, 1, 2, 3, 5, 8, 9]
As you can see, the list will be sorted from the right to the left. This works well for our index i which will limit how far the inner loop will go: For i = 4, for example, two elements at the end are already in place, so the inner loop only has to compare pairs among the first 5 elements (indices 0 to 4).
Now, let’s try changing the range to go in the other direction. The loop will be for i in range(0, len(alist)). Then we get this result:
>>> bubbleSort([5, 1, 3, 9, 2, 8, 0])
0 [5, 1, 3, 9, 2, 8, 0]
1 [1, 5, 3, 9, 2, 8, 0]
2 [1, 3, 5, 9, 2, 8, 0]
3 [1, 3, 5, 9, 2, 8, 0]
4 [1, 3, 5, 2, 9, 8, 0]
5 [1, 3, 2, 5, 8, 9, 0]
6 [1, 2, 3, 5, 8, 0, 9]
As you can see, this is not sorted at all. But why? i still limits how far the inner loop will go, so at i = 1, the loop will only look at the first pair and sort that; the rest will stay the same. At i = 2, the loop will look at the first two pairs and swap those (once!); the rest will stay the same. And so on. By the time the inner loop can reach the last element (which is only on the final iteration), there aren’t enough iterations left to swap the zero (which also happens to be the smallest element) to the very left.
This is again because bubble sort works by sorting the largest elements to the rightmost side first. So we have to start the algorithm by making the inner loop be able to reach that right side completely. Only when we are certain that those elements are in the right position, we can stop going that far.
There is one way to use an incrementing outer loop: By sorting the smallest elements first. But this also means that we have to start the inner loop on the far right side to make sure that we check all elements as we look for the smallest element. So we really have to make those loops go in the opposite directions.
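A minimal sketch of that variant, with an incrementing outer loop and an inner loop that runs from the far right towards the left (the function name bubble_sort_smallest_first is made up for illustration):

```python
def bubble_sort_smallest_first(alist):
    """Bubble sort that moves the smallest remaining element to the left on each pass."""
    n = len(alist)
    for i in range(0, n - 1):          # after pass i, positions 0..i hold their final values
        for j in range(n - 1, i, -1):  # walk from the right end down to position i + 1
            if alist[j - 1] > alist[j]:
                alist[j - 1], alist[j] = alist[j], alist[j - 1]
    return alist


print(bubble_sort_smallest_first([5, 1, 3, 9, 2, 8, 0]))  # [0, 1, 2, 3, 5, 8, 9]
```

Here i marks the left boundary of the unsorted part, mirroring how i marks the right boundary in the decrementing version.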
It's because when you bubble from the start of the list to the end, the final result is that the last item in the list will be sorted (you've bubbled the largest item to the end). As a result, you don't want to include the last item in the list when you do the next bubble (you know it's already in the right place). This means the list you need to sort gets shorter, starting at the end and going down towards the start. In this code, i is always the length of the remaining unsorted list.
You can use this for loop:
for i in range(0,len(alist)-1,1):
but consequently you should change your second iteration (the -1 keeps alist[j+1] inside the list):
for j in range(0,len(alist)-1-i,1):
I think the purpose of using reverse iteration in the first line is to simplify the second iteration. This is an advantage of using Python.
As @Jeremy McGibbon's answer explains, the logic behind bubble sort is to keep j from reaching the "sorted part" at the end of the list. In the example code, the range of j shrinks as the value of i decreases. When you change i to increase instead, you have to handle the j iteration differently.
You can write the code as follows:
lst = [9,6,5,7,8,3,2,1,0,4]
lengthOfArray = len(lst) - 1
for i in range(lengthOfArray):
    for j in range(lengthOfArray - i):
        if lst[j] > lst[j + 1]:
            lst[j], lst[j + 1] = lst[j + 1], lst[j]
print(lst)
I do not understand how the result is 10...
Specifically where in the function does it create the loop that adds 1, 2, 3 and 4?
I am also new to Stack Overflow, so if there is a relevant article that I overlooked then please do refer me.
def func(x):
    res=0
    for i in range(x):
        res += i
    return res
print(func(5))
def func(x):  # defines the function name and input parameter
    res=0  # creates a variable that is initially set to 0
    for i in range(x):  # this has two parts, a for-loop and a range function, both explained below
        res += i  # this adds the current value of i to the variable 'res' each time the for-loop loops
    return res  # end of the function, passes the variable's value back to the caller
print(func(5))  # call the function while passing 5 as an argument
This is how a for loop works,
it will loop over each element you provide it.
so,
myPets = ['dog', 'cat', 'rabbit'] # create a list of pets
for pet in myPets:
    print(pet)  # print each pet
When this runs, you get
dog
cat
rabbit
Now the range function, creates a sequence of x numbers ranging from 0 to x-1 so,
range(5)
is equivalent to:
[0,1,2,3,4]
Keep in mind, it starts at 0 and ends at x-1
we could also do
range(3, 6)
which would be equivalent to:
[3,4,5]
Note that in Python 2 range actually returns a list, whereas in Python 3 range is a separate sequence type. For the purposes of a for loop, they do the same thing.
As mentioned in the comments, you need to know what the range function does to understand that loop.
The range(x) function creates a sequence which contains the numbers from 0 to x-1. So range(5) creates the sequence [0, 1, 2, 3, 4]. So the loop:
for i in range(5):
is equivalent to:
for i in [0, 1, 2, 3, 4]:
for i in range(x):
This is the loop you are looking for. It iterates over every element in range(x).
When you have range(5), you are telling python to generate 5 integers, which are up to but not including 5. So they are 0, 1, 2, 3, 4.
The += operator adds right operand to the left operand and assign the result to left operand.
So in your case, with the function and the loop iterating from 0 to 4, you are telling python to generate a function called func(x); within this function, generate a range of integers, from 0 up to but not including 5; add whatever i is to res (to add 0, 1, 2, 3, 4 sequentially to res).
Finally it becomes 10.
res += i means res= res+i
so for loop loops as below
res = 0, initially
The supplied list for looping is
[0,1,2,3,4]
so res += i is applied for each element of the list
the variable 'i' is a temporary variable used to loop through the for loop function.
so value of 'i' will be changing every time it loops i.e
i=0
i=1
i=2
i=3
i=4
the value of res keeps on changing as per the for loop
res= 0+0 =0
res= 0+1 =1
res= 1+2 =3
res= 3+3 =6
res= 6+4 =10
Final returned value is 10 as for loop ends at 4 value in the list
From Python.org:
If you do need to iterate over a sequence of numbers, the built-in function range() comes in handy. It generates lists containing arithmetic progressions, e.g.:
>>> range(10)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Essentially, for i in range(10) is the same as saying for i in [0,1,2,3,4,5,6,7,8,9]
The += operator is the same as saying res = res + value
So, in combination with the range statement, this piece of code is looking at the first element of the list, 0, adding that to zero(your starting element: res = 0), then adding one to that, then adding two to the result of the previous computation (resulting in 3), and so on.
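To watch the loop add 0, 1, 2, 3 and 4 one step at a time, a small sketch can record the running total after each iteration. The helper name func_traced is made up for illustration; it does the same work as func from the question and additionally collects the intermediate values:

```python
def func_traced(x):
    """Same sum as func(x), but also returns (i, running total) for every iteration."""
    res = 0
    steps = []
    for i in range(x):
        res += i                # same as res = res + i
        steps.append((i, res))
    return res, steps


total, steps = func_traced(5)
for i, running in steps:
    print(f"i={i} -> res={running}")
print(total)                    # 10
print(total == sum(range(5)))   # True
```

The last line compares the result with the built-in sum over the same range, which adds up the same five numbers.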
Why isn't this for loop working? My goal is to delete every 1 from my list.
>>> s=[1,4,1,4,1,4,1,1,0,1]
>>> for i in s:
...     if i == 1: s.remove(i)
...
>>> s
[4, 4, 4, 0, 1]
Never change a list while iterating over it. The results are unpredictable, as you're seeing here. One simple alternative is to construct a new list:
s = [i for i in s if i != 1]
If for some reason you absolutely have to edit the list in place rather than constructing a new one, my off-the-cuff best answer is to traverse it finding the indices that must be deleted, then reverse traverse that list of indices removing them one by one:
indices_to_remove = [i for (i, val) in enumerate(s) if val == 1]
for i in reversed(indices_to_remove):
del s[i]
Because that removes elements from the end of the list first, the original indices computed remain valid. But I would generally prefer computing the new list unless special circumstances apply.
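If the requirement is only that the same list object is kept (for example because other names refer to it), slice assignment is another option that is not mentioned above. A sketch, where alias is a made-up name used to show that the list object itself is preserved:

```python
s = [1, 4, 1, 4, 1, 4, 1, 1, 0, 1]
alias = s                          # a second name for the same list object

# Build the filtered list first, then replace the contents of s in place.
s[:] = [i for i in s if i != 1]

print(s)            # [4, 4, 4, 0]
print(alias is s)   # True: the object is the same, only its contents changed
```

The list comprehension is fully evaluated before the assignment happens, so nothing is removed while the list is being iterated.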
Consider this code:
#!/usr/bin/env python
s=[1, 4, 1, 4, 1, 4, 1, 1, 0, 1]
list_size=len(s)
i=0
while i!=list_size:
    if s[i]==1:
        del s[i]
        list_size=len(s)
    else:
        i=i + 1
print(s)
Result:
[4, 4, 4, 0]
In short, your code gives an undesirable result because the size and the index positions of your list change every time you remove a 1, and a Python for loop does not cope with a list whose size changes while it is being iterated.
You should not change the content of a list while iterating over it.
But in your case you could iterate over a copy of the list and change the original:
Code:
s=[1,4,1,4,1,4,1,1,0,1]
for i in s[:]:
    if i ==1: s.remove(i)
print(s)
Output:
[4, 4, 4, 0]
As @metatoaster stated, you could use filter:
Code:
s=[1,4,1,4,1,4,1,1,0,1]
s=list(filter(lambda x:x !=1,s))
print(s)
[4, 4, 4, 0]
You could also use filter to remove several different values at once, for example:
Code:
s=[1,4,1,4,1,4,1,1,0,1,2,3,5,6,7,8,9,10,20]
remove_element=[1,2,3,5,6,7,8,9]
s=list(filter(lambda x:x not in remove_element,s))
print(s)
[4, 4, 4, 0, 10, 20]
This doesn't work because you are modifying the list as it is iterating, and the current pointer moves past one of the 1 you check against. We can illustrate this:
>>> for i in s:
...     print(s)
...     if i == 1:
...         s.remove(i)
...
[1_, 4, 1, 4, 1, 4, 1, 1, 0, 1]
[4, 1_, 4, 1, 4, 1, 1, 0, 1]
[4, 4, 1_, 4, 1, 1, 0, 1]
[4, 4, 4, 1_, 1, 0, 1]
[4, 4, 4, 1, 0_, 1]
[4, 4, 4, 1, 0, 1_]
I added _ to the element being compared. Note how there were only 6 passes in total, with one of the 1s skipped over and never looked at. That skipped 1 is the one removed on the last pass, because list.remove removes the first occurrence of the element specified. list.remove is also an O(n) operation on its own, which gets very expensive once your list gets big. It is O(n) even if the item is at the beginning, because every item after it has to be moved one position forward: Python lists are more like C-style arrays than Java linked lists (if you want to use linked lists, use collections.deque). It is O(n) towards the end too, because it has to iterate through the entire list to do its own comparison. Calling remove inside the loop can therefore result in a worst-case runtime complexity of O(n^2).
See Python's data structure time complexity
Peter's answer already covered the generation of a new list, I am only answering why and how your original code did not work exactly.
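The skipping can also be reproduced without reading the printed lists by recording which values the loop actually visits. This is only a sketch of the behaviour described above; visited is a made-up name:

```python
s = [1, 4, 1, 4, 1, 4, 1, 1, 0, 1]
visited = []                # values the for loop actually hands to the body

for i in s:
    visited.append(i)
    if i == 1:
        s.remove(i)         # removes the first 1 in the list, shifting later items left

print(visited)              # [1, 1, 1, 1, 0, 1]
print(len(visited))         # 6
print(s)                    # [4, 4, 4, 0, 1]
```

The original list has ten elements, but the body runs only six times, and none of the 4s is ever visited because each removal shifts the next element into a position the loop has already passed.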