Remove adjacent duplicate elements from a list [duplicate] - python

Google Python Class | List Exercise -
Given a list of numbers, return a list where
all adjacent == elements have been reduced to a single element,
so [1, 2, 2, 3] returns [1, 2, 3]. You may create a new list or
modify the passed in list.
My solution using a new list is -
def remove_adjacent(nums):
    a = []
    for item in nums:
        if len(a):
            if a[-1] != item:
                a.append(item)
        else: a.append(item)
    return a
The question even suggests that it could be done by modifying the passed-in list. However, the Python documentation warns against modifying a list while iterating over it with a for loop.
I am wondering what else I can try, apart from iterating over the list, to get this done. I am not looking for the solution, just a hint that points me in the right direction.
UPDATE
-updated the above code with suggested improvements.
-tried the following with a while loop using suggested hints -
def remove_adjacent(nums):
    i = 1
    while i < len(nums):
        if nums[i] == nums[i-1]:
            nums.pop(i)
            i -= 1
        i += 1
    return nums

Here's the traditional way, deleting adjacent duplicates in situ, while traversing the list backwards:
Python 1.5.2 (#0, Apr 13 1999, 10:51:12) [MSC 32 bit (Intel)] on win32
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> def dedupe_adjacent(alist):
...     for i in xrange(len(alist) - 1, 0, -1):
...         if alist[i] == alist[i-1]:
...             del alist[i]
...
>>> data = [1,2,2,3,2,2,4]; dedupe_adjacent(data); print data
[1, 2, 3, 2, 4]
>>> data = []; dedupe_adjacent(data); print data
[]
>>> data = [2]; dedupe_adjacent(data); print data
[2]
>>> data = [2,2]; dedupe_adjacent(data); print data
[2]
>>> data = [2,3]; dedupe_adjacent(data); print data
[2, 3]
>>> data = [2,2,2,2,2]; dedupe_adjacent(data); print data
[2]
>>>
Update: If you want a generator but (don't have itertools.groupby or (you can type faster than you can read its docs and understand its default behaviour)), here's a six-liner that does the job:
Python 2.3.5 (#62, Feb 8 2005, 16:23:02) [MSC v.1200 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> def dedupe_adjacent(iterable):
...     prev = object()
...     for item in iterable:
...         if item != prev:
...             prev = item
...             yield item
...
>>> data = [1,2,2,3,2,2,4]; print list(dedupe_adjacent(data))
[1, 2, 3, 2, 4]
>>>
Update 2: Concerning the baroque itertools.groupby() and the minimalist object() ...
To get the dedupe_adjacent effect out of itertools.groupby(), you need to wrap a list comprehension around it to throw away the unwanted groupers:
>>> [k for k, g in itertools.groupby([1,2,2,3,2,2,4])]
[1, 2, 3, 2, 4]
>>>
... or muck about with itertools.imap and/or operator.itemgetter, as seen in another answer.
Expected behaviour with object instances is that none of them compares equal to any other instance of any class, including object itself. Consequently they are extremely useful as sentinels.
>>> object() == object()
False
It's worth noting that the Python reference code for itertools.groupby uses object() as a sentinel:
self.tgtkey = self.currkey = self.currvalue = object()
and that code does the right thing when you run it:
>>> data = [object(), object()]
>>> data
[<object object at 0x00BBF098>, <object object at 0x00BBF050>]
>>> [k for k, g in groupby(data)]
[<object object at 0x00BBF098>, <object object at 0x00BBF050>]
Update 3: Remarks on forward-index in-situ operation
The OP's revised code:
def remove_adjacent(nums):
    i = 1
    while i < len(nums):
        if nums[i] == nums[i-1]:
            nums.pop(i)
            i -= 1
        i += 1
    return nums
is better written as:
def remove_adjacent(seq): # works on any sequence, not just on numbers
    i = 1
    n = len(seq)
    while i < n: # avoid calling len(seq) each time around
        if seq[i] == seq[i-1]:
            del seq[i]
            # value returned by seq.pop(i) is ignored; slower than del seq[i]
            n -= 1
        else:
            i += 1
    #### return seq #### don't do this
    # function acts in situ; should follow convention and return None

Use a generator to iterate over the elements of the list, and yield a new one only when it has changed.
itertools.groupby does exactly this.
You can modify the passed-in list if you iterate over a copy:
for elt in theList[:]:
    ...
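A minimal sketch of the iterate-over-a-copy idea, assuming Python 3 and a hypothetical function name remove_adjacent_in_place (neither comes from the answer above). The loop reads from a snapshot while the original list object is emptied and rebuilt, so every caller holding a reference sees the change:

```python
def remove_adjacent_in_place(the_list):
    """Collapse runs of equal neighbours, mutating the_list itself."""
    snapshot = the_list[:]      # iterate over a copy, as suggested above
    the_list.clear()            # empty the original object in place
    for elt in snapshot:
        # keep elt only if it differs from the last element kept
        if not the_list or the_list[-1] != elt:
            the_list.append(elt)
    # falls through and returns None, the convention for in-place operations


data = [1, 2, 2, 3, 2, 2, 4]
remove_adjacent_in_place(data)
print(data)  # [1, 2, 3, 2, 4]
```

The price is one temporary copy of the list, so this is in place only in the sense that the caller's list object is the one that gets modified.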

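Just to show one more way, here is another one-liner version without indexes:
def remove_adjacent(nums):
    return [a for a,b in zip(nums, nums[1:]+[not nums[-1]]) if a != b]
The appended not nums[-1] acts as a sentinel that differs from the last element, so the last value always reaches the result; only a is ever emitted. Note that nums[-1] raises IndexError when nums is empty.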

As usual, I am just here to advertise the impressive recipes in the Python itertools documentation.
What you are looking for is the function unique_justseen:
from itertools import imap, groupby
from operator import itemgetter
def unique_justseen(iterable, key=None):
    "List unique elements, preserving order. Remember only the element just seen."
    # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
    # unique_justseen('ABBCcAD', str.lower) --> A B C A D
    return imap(next, imap(itemgetter(1), groupby(iterable, key)))
list(unique_justseen([1,2,2,3])) # [1, 2, 3]
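The recipe above is written for Python 2. As a sketch for readers on Python 3, where itertools.imap no longer exists and the built-in map is already lazy, the same recipe might look like this (the port is an assumption of this note, not part of the original answer):

```python
from itertools import groupby
from operator import itemgetter


def unique_justseen(iterable, key=None):
    "List unique elements, preserving order. Remember only the element just seen."
    # groupby yields (key, group) pairs; itemgetter(1) picks the group
    # iterator and next() takes the first element of each group.
    return map(next, map(itemgetter(1), groupby(iterable, key)))


print(list(unique_justseen([1, 2, 2, 3])))  # [1, 2, 3]
```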

Well, katrielalex is right about itertools, but the OP seems to be rather more interested (or should be!) in learning to manipulate the basics of the built-in data structures. As for manipulating a list in place, it does need thought, but my recommendation would be to read through this section of the documentation and try a few list methods (hint: list.pop(), list.remove(), and learn everything about slices.)
The posted code could be simplified, by the way (you should however add handling of error conditions):
def remove_adjacent(nums):
    a = nums[:1]
    for item in nums[1:]:
        if item != a[-1]:
            a.append(item)
    return a

Extremely elegant solution from Google (source is here: https://developers.google.com/edu/python/exercises/basic):
def remove_adjacent(nums):
    result = []
    for num in nums:
        if len(result) == 0 or num != result[-1]:
            result.append(num)
    return result

You can use list comprehension. For example something like this should do the job:
def remove_adjacent(L):
    return [elem for i, elem in enumerate(L) if i == 0 or L[i-1] != elem]
or:
def remove_adjacent(L):
    return [L[i] for i in xrange(len(L)) if i == 0 or L[i-1] != L[i]]

Try this:
def remove_adjacent(nums):
    result = []
    if len(nums) > 0:
        result = [nums[0]]
        for i in range(len(nums)-1):
            if nums[i] != nums[i+1]:
                result.append(nums[i+1])
    return result

itertools.groupby is superior, but there is also
reduce(lambda x, y: x + [y] if x[-1] != y else x, seq[1:], seq[0:1])
e.g.
>>> seq = [[1,1], [2,2], [3,3], [3,3], [2,2], [2,2], [1,1]]
>>> print reduce(lambda x, y: x + [y] if x[-1] != y else x, seq[1:], seq[0:1])
[[1, 1], [2, 2], [3, 3], [2, 2], [1, 1]]
When coming from functional languages where this sort of thing is done with a fold, then using reduce often feels natural.
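In Python 3, reduce is no longer a builtin and has to be imported from functools. A sketch of the same fold under that assumption, wrapped in a hypothetical remove_adjacent function for convenience:

```python
from functools import reduce


def remove_adjacent(seq):
    # Fold over seq[1:], starting from a one-element (or empty) list.
    # acc + [y] builds a new list on every step, so this is quadratic
    # in the worst case; fine for short inputs.
    return reduce(lambda acc, y: acc + [y] if acc[-1] != y else acc,
                  seq[1:], seq[0:1])


seq = [[1, 1], [2, 2], [3, 3], [3, 3], [2, 2], [2, 2], [1, 1]]
print(remove_adjacent(seq))  # [[1, 1], [2, 2], [3, 3], [2, 2], [1, 1]]
```

Because the initial value is seq[0:1], an empty input simply folds over nothing and returns an empty list.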

You can modify a list you're iterating over if you use indices explicitly:
def remove_adjacent(l):
    if len(l)<2:
        return l
    prev,i = l[0],1
    while i < len(l):
        if l[i] == prev:
            del l[i]
        else:
            prev = l[i]
            i += 1
It doesn't work with iterators because iterators don't "know" how to modify the index when you remove arbitrary elements, so it's easier to just forbid it. Some languages have iterators with functions to remove the "current item".
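To make the failure mode concrete, here is a small Python 3 sketch (the function name naive_remove_adjacent is made up for illustration) of what happens when elements are deleted inside a plain for loop. The list iterator keeps advancing its internal index, so the element that slides into the freed slot is never examined:

```python
def naive_remove_adjacent(nums):
    # Deliberately wrong: deletes from the list that is being iterated.
    for i, item in enumerate(nums):
        if i > 0 and item == nums[i - 1]:
            del nums[i]
    return nums


print(naive_remove_adjacent([2, 2, 2, 3]))  # [2, 2, 3]: one duplicate survives
```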

@katrielalex's solution is more pythonic, but if you did need to modify the list in-place without making a copy, you could use a while loop and break when you catch an IndexError.
e.g.
nums = [1,1,1,2,2,3,3,3,5,5,1,1,1]
def remove_adjacent(nums):
    """Removes adjacent items by modifying "nums" in-place. Returns None!"""
    i = 0
    while True:
        try:
            if nums[i] == nums[i+1]:
                # Letting you figure this part out,
                # as it's a homework question
        except IndexError:
            break
print nums
remove_adjacent(nums)
print nums
Edit: pastebin of one way to do it here, in case you get stuck and want to know..

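def remove_adjacent(nums):
    # Note: this drops every repeated value and sorts the result, so it only
    # matches the exercise when the input is already sorted.
    newList=[]
    for num in nums:
        if num not in newList:
            newList.append(num)
    newList.sort()
    return newList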

Another approach. Comments welcome.
def remove_adjacent(nums):
    '''modifies the list passed in'''
    l, r = 0, 1
    while r < len(nums):
        if nums[l] == nums[r]:
            r += 1
        else:
            l += 1
            nums[l] = nums[r]
            r += 1
    del nums[l+1:]

Seeing the code written by Google is humbling, lol. This is what I came up with:
def remove_adjacent(nums):
    rmvelement = []
    checkedIndex = []
    for num in nums:
        if nums.index(num) not in checkedIndex:
            index = nums.index(num)
            checkedIndex.append(index)
            skip = False
        else:
            skip = True
        if skip == False:
            for x in nums[index+1:]:
                if x == num:
                    rmvelement.append(x)
                else:
                    break
    [nums.remove(_) for _ in rmvelement]
    return nums

This should work for a transparent (albeit roundabout) solution:
def remove_adjacent(nums):
    numstail = [i for i in range(0,len(nums))]
    nums = nums + numstail
    for i in nums:
        if nums[i] == nums[i-1]:
            del nums[i]
    return nums[:-len(numstail)]
The logic is as follows:
Create a tail-list equal to the length of the original list of numbers and append this to the end of the original list.
Run a 'for-loop' that checks if a given element of nums is the same as the previous element. If so, delete it.
Return the new nums list, with the necessary deletions, up to len(numstail) index positions from the end of the list.
(numstail is defined to avoid indices being out of range for any length list)

def removeDupAdj2(a):
    b=[]
    for i in reversed(range(1,len(a))):
        if(a[i-1] == a[i]):
            del(a[i])
    #print(a)
    return a
a = [int(i) for i in '1 2 3 3 4 4 3 5 4 4 6 6 6 7 8 8 8 9 1 1 0 0'.split(' ')]
a
res = removeDupAdj2(a)
res

Since you are in a Python class, I'm going to guess that you are new to the language. Thus, for you and any other beginners out there, I wrote a simple version of the code to help others get through the logic.
original= [1, 2, 2, 3]
newlist=[]
for item in original:
    if item in newlist:
        print "You don't need to add "+str(item)+" again."
    else:
        newlist.append(item)
        print "Added "+str(item)
print newlist

Related

Deep list count - count lists within lists

I am trying to get the length of a list, but that list can contain other lists that are deeply nested (e.g. [[[[[[[]]]]]]]).
But you have to get any non list elements in the count as well: [1, 2, [3, 4]] would output 5 (1, 2, 3, 4 and a list). ["a", "b", ["c", "d", ["e"]]] would output 7.
It first comes to mind to use recursion to solve this problem.
I wrote this:
def deep_count(arr, count=0):
    if any(type(i) == list for i in arr):
        for i in arr:
            if type(i) == list:
                count += len(arr)
            else:
                count += 1
        return deep_count(arr[1:], count)
    return count
It's outputting 9 instead of 7. And if it's just nested lists like [[[]]] it only outputs 1.
There doesn't seem to be a need for supplying an initial value, you can just use arr as the only parameter, then just get the count and if the current element is a list then also add that list's count.
def f(arr):
    count = len(arr)
    for element in arr:
        if isinstance(element, list):
            count += f(element)
    return count
>>> f([1, 2, [3, 4]])
5
>>> f(["a", "b", ["c", "d", ["e"]]])
7
Recursively adding all the list lengths:
def f(xs):
    return len(xs) + sum(f(x) for x in xs
                         if isinstance(x, list))
A variation:
def f(xs):
    return isinstance(xs, list) and len(xs) + sum(map(f, xs))
When talking recursion, always think of the base case (which case would not make a recursion). In your problem, it would be when the list is empty, then it would return a count of 0.
So you could start implementing it like so:
def f(arr):
    if arr == []:
        return 0
For the rest of the implementation, you have two approaches:
Loop over the list, adding f(sublist) to the count when the element is a sublist and adding 1 when it isn't.
Go full recursion: always call f(x), whether or not x is a sublist, and always add the result to the count. In this case you need a second base case, where f(not_a_list) returns 1.
I think this should get you unstuck.
Note: I just read that recursion is not required; you came up with it yourself. I think it is a good approach for this kind of problem.
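One possible reading of the second ("full recursion") approach, sketched in Python 3. The names count_node and deep_count are illustrative, and the "- 1" for the outermost list is an assumption made here so that the results match the question's examples (5 and 7):

```python
def count_node(x):
    """Return 1 for a non-list; for a list, 1 plus the counts of its elements."""
    if not isinstance(x, list):
        return 1                      # base case: f(not_a_list) == 1
    return 1 + sum(count_node(element) for element in x)


def deep_count(arr):
    # The outermost list itself is not part of the count, hence the - 1.
    return count_node(arr) - 1


print(deep_count([1, 2, [3, 4]]))                # 5
print(deep_count(["a", "b", ["c", "d", ["e"]]]))  # 7
```

An empty list needs no special case here: sum() over nothing is 0, so count_node([]) is 1 and deep_count([]) is 0.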
My old answer takes the result of the last example of OP as the expected output. If it is not considered, the answer is very simple:
>>> def count_deep_list(lst):
...     return sum(count_deep_list(val) for val in lst if isinstance(val, list)) + len(lst)
...
>>> count_deep_list([1, 2, [3, 4]])
5
>>> count_deep_list(["a", "b", ["c", "d", ["e"]]])
7
>>> count_deep_list([[[]]])
2
Old Answer:
It seems that you need a special judgment on the internal empty list and the incoming list itself. I think it needs to be implemented with the help of closure:
>>> def count_deep_list(lst):
...     def count(_lst):
...         if isinstance(_lst, list):
...             if not _lst:
...                 return 0
...             else:
...                 return sum(map(count, _lst)) + 1
...         else:
...             return 1
...
...     return count(lst) - 1 if lst else 0
...
>>> count_deep_list([1, 2, [3, 4]])
5
>>> count_deep_list(["a", "b", ["c", "d", ["e"]]])
7
>>> count_deep_list([[[]]])
1
Simpler implementation:
>>> def count_deep_list(lst):
...     def count(_lst):
...         return sum(count(val) if val else -1 for val in _lst if isinstance(val, list)) + len(_lst)
...     return count(lst) if lst else 0
...
>>> count_deep_list([1, 2, [3, 4]])
5
>>> count_deep_list(["a", "b", ["c", "d", ["e"]]])
7
>>> count_deep_list([[[]]])
1
A two-line answer would be:
from collections.abc import Sized
def lenrecursive(seq):
    return len(seq) + sum(lenrecursive(s) for s in seq if isinstance(s,Sized) and not isinstance(s, str))
However, this does not take into account that other things may have a len.
A more complete handling using the EAFP approach is:
def lenrecursive(seq):
    """
    Returns total summed length of all elements recursively.
    If no elements support len then return will be 0, no TypeError will be raised
    """
    def _len(x):
        try:
            return len(x)
        except TypeError: #no len
            return 0
    try:
        return _len(seq) + sum(lenrecursive(s) for s in seq if not isinstance(s, str))
    except TypeError: #not iterable
        return _len(seq)
Notes:
Why not isinstance(s, str)? - Strings will lead to infinite recursion as a one-character string is a sequence so lenrecursive(a) would return 1 + lenrecursive(a) which is 1 + 1 + lenrecursive(a) which is 1 + 1 + 1 + ...
Why EAFP using try: except: rather than LBYL checking with isinstance? Because there are so many things out there that support len, and you don't know whether the next time you call this you will have created a custom class which has a __len__ but doesn't subclass Sequence, Mapping, or something else that inherits from collections.abc.Sized (collections.Sized in older Python versions).
class Counter:
    def __init__(self):
        self.counter = 0
    def count_elements(self, l):
        for el in l:
            if type(el) == list:
                self.counter += 1
                self.count_elements(el)
            else:
                self.counter += 1
l = ["a", "b", ["c", "d", ["e"]]]
c = Counter()
c.count_elements(l)
print(c.counter)

How to split an array without using a loop

Determine if it possible to divide a list (lst) at some index such that the sum of values in the first part of the list equals the sum in the latter part of the list.
No loops are used.
def split_array2(lst, all_sum=0):
    first = split_array(lst[0:-1])
    last = lst[-1]
def sum_(lst):
    if len(lst) == 0:
        return 0
    return lst[0] + sum_(lst[1:])
So something like this?
def canBeSplitted(lst,ind=1):
    if ind == len(lst):
        return False
    return sum(lst[0:ind]) == sum(lst[ind:]) or canBeSplitted(lst,ind+1)
canBeSplitted(lst) #usage
a = [1, 2, 3, 4, 5, 7, 8]
def split_array(a, i=1): # changed to 1 to save 1 function call. AND to be sure that there is a valid split when sum(a)=0.
    if i == len(a):
        return False
    if sum(a[:i]) == sum(a[i:]):
        return i
    return split_array(a, i+1)
split_array(a)
>>> 5
#sum(a[:5]) = 15
#sum(a[5:]) = 15
This is a recursive function. Recursion is in a lot of cases an alternative to loops.
There are always going to be some loops going on, but you can hide them using recursion or library functions (if you're merely aiming to avoid the for/while constructs):
For example:
from itertools import accumulate
from bisect import bisect_right
def splitsum(lst):
    i = bisect_right([*accumulate(lst)],sum(lst)/2)
    return lst[:i],lst[i:]
L = [1,2,3,4,5,7,8]
print(*splitsum(L))
If you only need to determine if it is possible or not, you could return sum(lst[:i])==sum(lst[i:]) or better yet simply check if sum(lst)/2 is in accumulate(lst).
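A sketch of that last suggestion in Python 3, under two assumptions of this note: the function name can_split is hypothetical, and both parts must be non-empty (which is why the final prefix is left out). Comparing 2 * prefix with the total avoids the floating-point division in sum(lst)/2:

```python
from itertools import accumulate


def can_split(lst):
    """True if some split point leaves equal sums on both sides."""
    total = sum(lst)
    # accumulate yields the running prefix sums; the last element is
    # excluded so that the second part is never empty.
    return any(2 * prefix == total for prefix in accumulate(lst[:-1]))


print(can_split([1, 2, 3, 4, 5, 7, 8]))  # True (1+2+3+4+5 == 7+8)
print(can_split([1, 2]))                 # False
```

The generator expression is still a loop in disguise, in line with the remark above that the loops are hidden rather than eliminated.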

Trying to remove common elements in the list with recursion

My code:
def shorter(lst):
    if len(lst) == 0:
        return []
    if lst[0] in lst[1:]:
        lst.remove(lst[0])
    shorter(lst[1:])
    return lst
print shorter(["c","g",1,"t",1])
Why does it print ["c","g",1,"t",1] instead of ["c","g","t",1]?
For a recursive method, what you can do is check a specific index in the list, much as you already do. If we remove the current element, we stay at the same index; otherwise we increase the index by one. The base case is when we are at or beyond the last element of the list, since the last element never needs checking.
def shorter(lst, ind=0):
    if ind >= len(lst)-1: #Base Case
        return lst
    if lst[ind] in lst[ind+1:]:
        lst.pop(ind)
        return shorter(lst,ind)
    return shorter(lst, ind+1)
#Stuff to test the function
import random
x = [random.randint(1,10) for i in range(20)]
print(x)
x = shorter(x)
print(x)
Another way to solve this in a single line is to convert the list into a set and then back into a list. Sets hold only unique values, so we can use that property to remove any repeated elements. Note that a set does not preserve the original order of the list.
import random
x = [random.randint(1,10) for i in range(20)]
print(x)
x = list(set(x)) #Converts to set and back to list
print(x)
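If the order matters, as it does in the question's expected output ["c","g","t",1], one alternative is a dict, which keeps insertion order in Python 3.7 and later. The sketch below is an assumption of this note rather than part of the answer, and the name dedupe_keep_last is hypothetical. It keeps the last occurrence of each value, like the recursive versions do, and needs hashable elements:

```python
def dedupe_keep_last(lst):
    # Walk the list backwards so the first time a value is seen is its
    # last occurrence, then reverse again to restore the original order.
    return list(dict.fromkeys(reversed(lst)))[::-1]


print(dedupe_keep_last(["c", "g", 1, "t", 1]))  # ['c', 'g', 't', 1]
```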
A possible recursive solution could be:
def shorter(lst):
    if lst:
        if lst[0] in lst[1:]:
            prefix = [] # Skip repeated item.
        else:
            prefix = [lst[0]] # Keep unique item.
        return prefix + shorter(lst[1:])
    else:
        return lst
The previous code can also be compacted to:
def shorter(lst):
    if lst:
        return lst[0:(lst[0] not in lst[1:])] + shorter(lst[1:])
    else:
        return lst
and the function body can also be reduced to a one-liner:
def shorter(lst):
    return (lst[0:(lst[0] not in lst[1:])] + shorter(lst[1:])) if lst else lst
or even:
def shorter(lst):
    return lst and (lst[0:(lst[0] not in lst[1:])] + shorter(lst[1:]))

How to split a list only using car, cdr, cons and other functions (python)

We need to write a function that takes one list as input. It has to split the even numbers from the odd numbers and put them in separate lists. We are not permitted to make a second list and should solve it using only recursion and car, cdr, cons, ... .
This is what I already have:
def split_list(list_a):
    if null(list_a):
        return []
    elif not even(car(list_a)):
        return cons(car(list_a), split_list(cdr(list_a)))
    else:
        return cons(split_list(cdr(list_a)), car(list_a))
print(split_list([1, 2, 3, 4]))
I got the output: [1, 3, 4, 2]
While it should be: [1, 3][2, 4]
I really have no clue how to do this without making a secondary list.
Just to be clear, the functions in the function 'split_list' are car, cdr, cons, null and even. Here you see the contents of those functions:
def car(a_list):
    return a_list[0]
def cdr(a_list):
    return a_list[1:]
def null(a_list):
    return True if len(a_list) == 0 else False
def cons(a, b):
    new_list = []
    if type(a) is list:
        for item in a:
            new_list.append(item)
    else:
        new_list.append(a)
    if type(b) is list:
        for item in b:
            new_list.append(item)
    else:
        new_list.append(b)
    return new_list
def even(x):
    if x == 1:
        return False
    elif x == 0:
        return True
    else:
        return even(x-2)
You need a way to make two lists during your iteration of one list. The best way to do that is to make a function that takes 3 arguments. one for the input and 2 for the two output lists, often called accumulators.
The logic would be to return a list of the two accumulators when you have reached the end of the list. If not you check the element for evenness and recurse by adding it to the even accumulator. If not you recurse by adding the element to the odd accumulator.
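The accumulator idea described above could be sketched as follows in Python 3. The helpers are compact stand-ins for the question's car, cdr, null and cons (cons keeps the question's flattening behaviour, so cons(acc, item) appends), the default-argument handling is an assumption of this sketch, and % is used instead of the question's recursive even:

```python
def car(a_list):
    return a_list[0]


def cdr(a_list):
    return a_list[1:]


def null(a_list):
    return len(a_list) == 0


def cons(a, b):
    # Like the question's cons: lists are spliced, scalars are wrapped.
    left = a if isinstance(a, list) else [a]
    right = b if isinstance(b, list) else [b]
    return left + right


def split_list(list_a, odds=None, evens=None):
    # odds and evens are the two accumulators carried through the recursion.
    odds = [] if odds is None else odds
    evens = [] if evens is None else evens
    if null(list_a):
        return [odds, evens]          # end of input: return both accumulators
    head = car(list_a)
    if head % 2 == 0:
        return split_list(cdr(list_a), odds, cons(evens, head))
    return split_list(cdr(list_a), cons(odds, head), evens)


print(split_list([1, 2, 3, 4]))  # [[1, 3], [2, 4]]
```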
I think this would help you.
l=list(range(10))
evens=[]
odds=[]
for x in l:
    if x%2==0:
        evens.append(x)
    else:
        odds.append(x)
print(evens,odds)
The version below uses only one additional list (in case you don't need the original one):
l=list(range(10))
odds=[]
for x in l:
    if x%2==0:
        continue
    else:
        l.remove(x)
        odds.append(x)
print(l,odds)

What is the best way to fix a list index out of range exception?

I have written this recursive code in Python:
def suma(i,l):
    if i == 0:
        return l[i]
    else:
        return suma(i-1,l)+l[i]
And whenever I call the function by suma(3,[7,2,3]) and run it, I get this error message:
List index out of range on line return suma(i-1,l)+l[i]
Ok, I'm going to assume that the intent here is to recursively add the first i elements of l and return the result. If so, here's a concise way to do so:
def suma(i,l):
    return 0 if i == 0 else suma(i-1,l)+l[i-1]
This is equivalent to:
def suma(i,l):
    if i == 0:
        return 0
    else:
        return suma(i-1,l)+l[i-1]
It's unorthodox, but you could just call your suma() function with the first argument reduced by 1:
>>> l = [7, 2, 3]
>>> suma(len(l)-1, l)
12
But it could be better written like this:
def suma(l):
    if len(l) > 1:
        return suma(l[:-1]) + l[-1]
    return l[0]
>>> suma([7,2,3])
12
>>> suma(range(1,11))
55
This has the advantage of not requiring the length of the list to be passed to the sum function at all - the length is always available using len().
