Removing earlier duplicates from a list and keeping order - python

I want to define a function that takes a list as an argument and removes all duplicates from the list except the last one.
For example:
remove_duplicates([3,4,4,3,6,3]) should be [4,6,3]. The other post answers do not solve this one.
The function is removing each element if it exists later in the list.
This is my code:
def remove(y):
    for x in y:
        if y.count(x) > 1:
            y.remove(x)
    return y
and for this list:
[1,2,1,2,1,2,3] I am getting this output:
[2,1,2,3]. The real output should be [1,2,3].
Where am I going wrong and how do I fix it?

The other post does actually answer the question, but there's an extra step: reverse the input then reverse the output. You could use reversed to do this, with an OrderedDict:
from collections import OrderedDict
def remove_earlier_duplicates(sequence):
    d = OrderedDict.fromkeys(reversed(sequence))
    return reversed(d)
The output is a reversed iterator object for greater flexibility, but you can easily convert it to a list.
>>> list(remove_earlier_duplicates([3,4,4,3,6,3]))
[4, 6, 3]
>>> list(remove_earlier_duplicates([1,2,1,2,1,2,3]))
[1, 2, 3]
BTW, your remove function doesn't work because you're changing the size of the list as you're iterating over it, meaning certain items get skipped.
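To avoid the skipping problem without OrderedDict, one option is to build a new list instead of mutating the one being iterated. The following is only a minimal sketch, not the approach from the answer above: the name remove_earlier_duplicates_loop is made up for illustration, and it assumes the items are hashable.

```python
def remove_earlier_duplicates_loop(items):
    """Keep only the last occurrence of each item, preserving order."""
    seen = set()
    kept = []
    # Walk backwards so the first time we meet a value is its LAST occurrence.
    for item in reversed(items):
        if item not in seen:
            seen.add(item)
            kept.append(item)
    # kept was built back to front, so flip it to restore the original order.
    kept.reverse()
    return kept


result_a = remove_earlier_duplicates_loop([3, 4, 4, 3, 6, 3])
result_b = remove_earlier_duplicates_loop([1, 2, 1, 2, 1, 2, 3])
```

Because the input list is never modified, no element can be skipped during iteration.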

I found this way to do it after a bit of research. @wjandrea gave me the fromkeys idea and helped me out a lot.
def retain_order(arr):
    return list(dict.fromkeys(arr[::-1]))[::-1]

Related

Code does not work for all test cases of flattening a nested list [duplicate]

Write a function to flatten a list. The list contains other lists, strings, or ints.
A list such as the example given below is to be flattened:
[[1,'a',['cat'],2],[[[3]],'dog'],4,5]
to produce a flattened list:
[1,'a','cat',2,3,'dog',4,5] (order matters)
I wrote my code as follows and got the required flattened list above. However, I did not score full marks for the question, as apparently my code did not work for other test cases. The question did not show me any test cases, so I only know that my code works properly for the example given above. Could you think of any cases that show my code is not correct?
My thinking process: Look through every element in the input list 'aList' using a for loop. If that element is a str/int, then it is already flattened and I add it to my output list 'new' using append. However, if that element is still a list, it has not yet been flattened, so I make a recursive call 'flatten(aList[n])' so that I now look within the nested list, i.e. aList[0][0] etc., until I find an element that is not a list.
new = []
def flatten(aList):
    for n in range(len(aList)):
        if type(aList[n]) == type([]):
            flatten(aList[n])
        else:
            new.append(aList[n])
    return new
Here, I found a code online that gets me the full marks for the question.
def flatten(aList):
    new = []
    for i in aList:
        if type(i) == type([]):
            new.extend(flatten(i))
        else:
            new.append(i)
    return new
It is very similar to my code; the only difference is the way it calls the recursive function to flatten any nested lists. I directly call my function 'flatten(aList[n])' while the sample answer uses 'new.extend(flatten(i))'. However, I could not see why my code does not work for all cases. Also, how does using extend solve the problem?
If it's a multi test-case problem, you need to reset new after each call to 'flatten' method.
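To make the failure concrete, here is a small sketch that reproduces the asker's structure and calls it twice. The name flatten_shared is made up for this illustration; the logic mirrors the question's code.

```python
new = []  # module-level list, shared by every call


def flatten_shared(aList):
    for item in aList:
        if type(item) == type([]):
            flatten_shared(item)
        else:
            new.append(item)
    return new


# Copy the result each time so we can compare what each call reported.
first_call = list(flatten_shared([1, [2]]))
second_call = list(flatten_shared([3]))
# second_call still contains 1 and 2 left over from the first call.
```

The first call looks correct, which is why a single example does not reveal the problem; only a second call in the same process exposes the leftover items.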
Your algorithm is correct. Your test cases are probably failing because of the new variable declared outside of the flatten function. This causes the result of the function to be extended to the new list that already contains results from previous function calls. Pass the list as parameter for a recursive solution:
def flatten(aList, new=None):
    if new is None:
        new = []
    for n in range(len(aList)):
        if type(aList[n]) == type([]):
            flatten(aList[n], new)
        else:
            new.append(aList[n])
    return new
What's important here is the default value of the argument new: it is None, and the list is created inside the function when new is None, rather than writing def flatten(aList, new=[]):. Never use a mutable datatype like a list as a default argument: the default is evaluated only once, when the function is defined, and that same instance is reused every time the function is called rather than a fresh empty list being created. Check this out for more details.
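The mutable-default pitfall can be shown in isolation. This is a minimal sketch; append_bad and append_good are hypothetical names used only here.

```python
def append_bad(item, bucket=[]):
    # The default list is created once, at definition time, and then shared.
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    # A fresh list is created on every call that omits the argument.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


bad_first = append_bad(1)
bad_second = append_bad(2)   # same list object as bad_first

good_first = append_good(1)
good_second = append_good(2)  # independent list
```

After these calls, bad_first and bad_second are the same object and hold both items, while good_first and good_second each hold only their own item.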
If you want slightly better performance, try using two functions so the None check disappears from the recursive calls:
flatten = lambda aList: flatten_aux(aList, [])
def flatten_aux(aList, new):
    for n in range(len(aList)):
        if type(aList[n]) == type([]):
            flatten_aux(aList[n], new)
        else:
            new.append(aList[n])
    return new
This is not an answer to the question, but rather an alternative way to solve the problem using generators (I would post this as a comment, but comments don't format multi-line code, to the best of my knowledge):
def flat_gen(nd_list):
    if type(nd_list) is list:
        for i in nd_list:
            yield from flat_gen(i)
    else:
        yield nd_list
def flatten(nd_list):
    return list(flat_gen(nd_list))
nd_list = [[1,'a',['cat'],2],[[[3]],'dog'],4,5]
flat_list = flatten(nd_list)
print(flat_list)
answer = [1,'a','cat',2,3,'dog',4,5]
print(flat_list == answer)
Output:
[1, 'a', 'cat', 2, 3, 'dog', 4, 5]
True

How do I 'replace' an array by filling an empty array's elements using the pop method from another array?

I'm trying to implement a stack in Python and I'm experimenting with the list data structure. If I want to use the pop method to 'fill' an empty array using the elements from an existing array, what do I do?
# Implementing Stacks in Python through the given list structure
practiceStack = []
practiceStack.append(['HyunSoo', 'Shah'])
practiceStack.append('Jack')
practiceStack.append('Queen')
practiceStack.append(('Aces'))
# printing every element in the list/array
for i in practiceStack:
    print(i)
# stacks are LIFO (last in, first out), so the pop method removes the most recently added item
emptyArrayPop = []
This is what I tried (using a for loop), and I keep getting a "use integers, not list" error (a TypeError about list indices):
for i in practiceStack:
    emptyArrayPop[i].append(practiceStack.pop)
print(emptyArrayPop)
pop is a method, not a value. In other words, practiceStack.pop refers to the method object itself (you can mostly ignore this until you've spent more time around code); to actually call it and get the popped element, you likely want this instead:
practiceStack.pop()
You also need to append to the list; when adding something with append, the List will automatically add it at the end; you do not need to provide an index.
Further explanation: The List.append method will take the value that you pass to it and add that to the end of the List. For example:
A = [1, 2, 3]
A.append(4)
A is now [1, 2, 3, 4]. If you try to run the following:
A[2].append(4)
...then you are effectively saying, "append 4 to the end of the value at position 2 in A" (A[2] is 3 in the above example; remember that Python lists start counting at 0, i.e. they are "0-indexed"), which is like saying "append 4 to 3." This makes no sense; it doesn't mean anything to append an integer to another integer.
Instead, you want to append to the LIST itself; you do not need to specify a position.
Don't get this confused with assigning a value to a position in a List; if you were setting a value at an existing position of a list, you can use the = operator:
>>> B = [1, 2, 3]
>>> B[2]
3
>>> B[2] = 4
>>> print(B)
[1, 2, 4]
>>> B.append(8)
>>> print(B)
[1, 2, 4, 8]
So to answer your original question, the line you want is the following:
emptyArrayPop.append(practiceStack.pop())
(note the [i] has been removed)
[edit] That is not the only issue, as @selcuk pointed out.
You will also need to fix the way you're looping over practiceStack: calling pop modifies the list in place, and changing a list's length while a for loop is iterating over it causes elements to be skipped.
Instead, loop a fixed number of times (once per element originally in practiceStack) and pop on each pass:
for i in range(len(practiceStack)):
    emptyArrayPop.append(practiceStack.pop())
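An equivalent way to drain the stack, shown here only as a sketch, is a while loop that runs until the source list is empty. The snake_case variable names are chosen for this example and are not from the question.

```python
practice_stack = [['HyunSoo', 'Shah'], 'Jack', 'Queen', 'Aces']
popped_items = []

# An empty list is falsy, so the loop stops once everything has been popped.
while practice_stack:
    popped_items.append(practice_stack.pop())
```

Since pop() takes items from the end, popped_items ends up in the reverse of the original order, which is the LIFO behaviour a stack is meant to have.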

Extract list of objects from another Python list based on attribute

I have a Python list filled with instances of a class I defined.
I would like to create a new list with all the instances from the original list that meet a certain criterion in one of their attributes.
That is, for all the elements in list1, filled with instances of some class obj, I want all the elements for which obj.attrib == someValue.
I've done this using just a normal for loop, looping through and checking each object, but my list is extremely long and I was wondering if there was a faster/more concise way.
Thanks!
There's definitely a more concise way to do it, with a list comprehension:
filtered_list = [obj for obj in list1 if obj.attrib==someValue]
I don't know if you can go any faster though, because in the problem as you described you'll always have to check every object in the list somehow.
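For a self-contained illustration of the comprehension, here is a small sketch. The Item class and its fields are hypothetical stand-ins for the asker's own class.

```python
from dataclasses import dataclass


@dataclass
class Item:  # hypothetical class standing in for the asker's "obj"
    name: str
    attrib: int


list1 = [Item("a", 1), Item("b", 2), Item("c", 1)]
some_value = 1

# Keep only the instances whose attribute matches the wanted value.
filtered_list = [obj for obj in list1 if obj.attrib == some_value]
filtered_names = [obj.name for obj in filtered_list]
```

The new list holds references to the same instances as list1; the objects themselves are not copied.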
You can perhaps use a pandas Series. Read your list into a pandas Series your_list.
Then you can filter by using the [] syntax and the .apply method, passing the value to compare against through args:
def check_attrib(obj, value):
    return obj.attrib==value
new_list = your_list[your_list.apply(check_attrib, args=(someValue,))]
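Remember that the [] syntax filters a series by the boolean values of the series inside the brackets. For example, if spam and eggs are pandas Series holding these values:
spam = [1, 5, 3, 7]
eggs = [True, True, False, False]
Then spam[eggs] keeps only the values:
[1, 5]
The boolean indexing is a vector operation, but .apply still calls the Python function once per element, so for arbitrary objects it may not be faster than a plain loop or list comprehension.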
For completeness, you could also use filter and a lambda expression (in Python 3, filter returns a lazy iterator, so wrap it in list() if you need an actual list):
filtered_list1 = filter(lambda obj: obj.attrib==someValue, list1)

Calling "del" on result of Function call results in "SyntaxError: can't delete function call"

Consider the example below:
random_tile = random.choice(tiles)
del random_tile
It first assigns a random element from the list tiles to a variable, and then calls a function on that variable.
Then if we want to shorten the code as follows:
del random.choice(tiles)
We would get a SyntaxError: can't delete function call. I tried eval() with no luck. How can this be solved?
EDIT:
What I am trying to do is remove an element from the tiles list at random. I thought I had found a more compact way of doing it than using random.randint() as the index, but I guess not.
Is there a pythonic/conventional way of doing this otherwise?
del is not a function; it is a keyword that forms del statements, e.g. del var. You can't use it as a function or on a function call. It can only be applied to a variable name, a list of variable names, or subscriptions and slicings, e.g.
del a
del a,b
del l[0]
del l[:]
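The del forms listed above can be tried in a short runnable sketch. The variable names here are made up for the example.

```python
first = 1
second = 2
numbers = [10, 20, 30, 40]

del first, second            # unbind two names at once
del numbers[0]               # delete by subscription (index)
after_index_delete = list(numbers)
del numbers[1:]              # delete a slice
after_slice_delete = list(numbers)
del numbers[:]               # empty the list in place

# Using a deleted name raises NameError, which shows del removed the name.
try:
    first
except NameError:
    name_still_bound = False
else:
    name_still_bound = True
```

Note that del numbers[:] empties the existing list object rather than unbinding the name numbers, so numbers is still usable afterwards.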
To remove a random item you can try this:
tiles.remove(random.choice(tiles))
BUT it is not the best way to remove items, because remove has to search the list and compare items, and if the items are not unique it removes the first equal item rather than necessarily the one that was chosen. It is better to pick a random index and delete that:
index = random.randint(0, len(l)-1)
del l[index]
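A compact variant of the index-based approach, given here as a sketch rather than the answerer's code, combines random.randrange with list.pop so the removed element is also returned. The sample tiles list is invented for the example.

```python
import random

tiles = ["a", "b", "c", "d"]
original_tiles = list(tiles)

# randrange(n) picks an index in 0..n-1, so no "- 1" adjustment is needed.
index = random.randrange(len(tiles))
removed_tile = tiles.pop(index)
```

This deletes by position, so it behaves the same whether or not the list contains duplicate values.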
If you're trying to actually modify your tiles list, your first attempt doesn't do it either. It just binds the random element to a variable, then deletes that variable. That's because del works on the name itself, not on the object it refers to, whereas most functions work on the value a variable refers to. So del random_tile leaves the tiles list unchanged.
Also, if you are trying to use del, I'm pretty sure you're using it wrong. If you elaborate more on what you're trying to do, we can probably help you.
There are a number of ways to remove a random element from a list. Here's one.
>>> import random
>>> my_list = [1, 2, 3, 4, 5]
>>> my_list.remove(random.choice(my_list))
>>> my_list
[1, 2, 4, 5]
Given a list tiles this does the trick:
import random
tiles.remove(random.choice(tiles))

Why does '.sort()' cause the list to be 'None' in Python? [duplicate]

I am attempting to sort a Python list of ints and then use the .pop() function to return the highest one. I have tried writing the method in different ways:
def LongestPath(T):
    paths = [Ancestors(T,x) for x in OrdLeaves(T)]
    #^ Creating a lists of lists of ints, this part works
    result =[len(y) for y in paths ]
    #^ Creating a list of ints where each int is a length of the a list in paths
    result = result.sort()
    #^meant to sort the result
    return result.pop()
    #^meant to return the largest int in the list (the last one)
I have also tried
def LongestPath(T):
    return [len(y) for y in [Ancestors(T,x) for x in OrdLeaves(T)] ].sort().pop()
In both cases .sort() causes the list to be None (which has no .pop() function and returns an error). When I remove the .sort() it works fine but does not return the largest int since the list is not sorted.
Simply remove the assignment from
result = result.sort()
leaving just
result.sort()
The sort method works in-place (it modifies the existing list), so it returns None. When you assign its result to the name of the list, you're assigning None. So no assignment is necessary.
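The difference between the in-place method and the built-in sorted() can be checked with a short sketch. The variable names are chosen for the example.

```python
lengths = [3, 1, 2]
returned = lengths.sort()        # sorts lengths in place; returned is None

original = [3, 1, 2]
sorted_copy = sorted(original)   # returns a new sorted list; original is untouched

largest = lengths[-1]            # last element of an ascending sort is the maximum
```

If a sorted copy is what you want to assign to a name, sorted() is the call that actually returns one.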
But in any case, what you're trying to accomplish can easily (and more efficiently) be written as a one-liner:
max(len(Ancestors(T,x)) for x in OrdLeaves(T))
max operates in linear time, O(n), while sorting is O(n log n). You also don't need nested list comprehensions; a single generator expression will do.
This
result = result.sort()
should be this
result.sort()
It is a convention in Python that methods that mutate sequences return None.
Consider:
>>> a_list = [3, 2, 1]
>>> print(a_list.sort())
None
>>> a_list
[1, 2, 3]
>>> a_dict = {}
>>> print(a_dict.__setitem__('a', 1))
None
>>> a_dict
{'a': 1}
>>> a_set = set()
>>> print(a_set.add(1))
None
>>> a_set
{1}
Python's Design and History FAQ gives the reasoning behind this design decision (with respect to lists):
Why doesn’t list.sort() return the sorted list?
In situations where performance matters, making a copy of the list
just to sort it would be wasteful. Therefore, list.sort() sorts the
list in place. In order to remind you of that fact, it does not return
the sorted list. This way, you won’t be fooled into accidentally
overwriting a list when you need a sorted copy but also need to keep
the unsorted version around.
In Python 2.4 a new built-in function – sorted() – has been added.
This function creates a new list from a provided iterable, sorts it
and returns it.
.sort() returns None and sorts the list in place.
This has already been correctly answered: list.sort() returns None. The reason why is "Command-Query Separation":
http://en.wikipedia.org/wiki/Command-query_separation
Python returns None because every function must return something, and the convention is that a function that doesn't produce any useful value should return None.
I have never before seen your convention of putting a comment after the line it references, but starting the comment with a caret to point at the line. Please put comments before the lines they reference.
While you can use the .pop() method, you can also just index the list. The last value in the list can always be indexed with -1, because in Python negative indices "wrap around" and index backward from the end.
But we can simplify even further. The only reason you are sorting the list is so you can find its max value. There is a built-in function in Python for this: max()
Using list.sort() requires building a whole list. You will then pull one value from the list and discard the rest. max() can consume an iterator without needing to allocate a potentially large amount of memory to store the list.
Also, in Python, the community prefers the use of a coding standard called PEP 8. In PEP 8, you should use lower-case for function names, and an underscore to separate words, rather than CamelCase.
http://www.python.org/dev/peps/pep-0008/
With the above comments in mind, here is my rewrite of your function:
def longest_path(T):
    paths = [Ancestors(T,x) for x in OrdLeaves(T)]
    return max(len(path) for path in paths)
Inside the call to max() we have a "generator expression" that computes a length for each value in the list paths. max() will pull values out of this, keeping the biggest, until all values are exhausted.
But now it's clear that we don't even really need the paths list. Here's the final version:
def longest_path(T):
    return max(len(Ancestors(T, x)) for x in OrdLeaves(T))
I actually think the version with the explicit paths variable is a bit more readable, but this isn't horrible, and if there might be a large number of paths, you might notice a performance improvement due to not building and destroying the paths list.
list.sort() does not return a list - it destructively modifies the list you are sorting:
In [177]: list(range(10))
Out[177]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
In [178]: list(range(10)).sort()
In [179]:
That said, max finds the largest element in a list, and will be more efficient than your method.
In Python sort() is an in-place operation. So result.sort() returns None, but changes result to be sorted. So to avoid your issue, don't overwrite result when you call sort().
Is there any reason not to use the sorted function? sort() is only defined on lists, but sorted() works with any iterable, and functions the way you are expecting. See this article for sorting details.
Also, because the sort is stable (internally it uses Timsort), it works well if you need to order by two keys: sort on the secondary key first, then on the primary key, and ties keep their earlier order.
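The two-pass sort on two keys can be sketched as follows. The sample records are invented for the example; the technique relies only on sorted() being stable.

```python
# (name, group) pairs; we want them ordered by group, then by name within a group.
records = [("bob", 2), ("alice", 1), ("carol", 2), ("dave", 1)]

# Pass 1: sort by the secondary key (name).
by_name = sorted(records, key=lambda record: record[0])

# Pass 2: sort by the primary key (group). Stability keeps names ordered within ties.
by_group_then_name = sorted(by_name, key=lambda record: record[1])
```

The same ordering could be produced in one pass with a tuple key such as key=lambda record: (record[1], record[0]); the two-pass form is mainly useful when the keys need different sort directions.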
You don't need a custom function for what you want to achieve; you first need to understand the methods you are using!
Calling sort() on a list in Python sorts it in place; that is, the return value of sort() is None. The list itself is modified, and a new list is not returned.
>>> results = ['list', 'of', 'items']
>>> results
['list', 'of', 'items']
>>> results.sort()
>>> type(results)
<class 'list'>
>>> results
['items', 'list', 'of']
>>> results = results.sort()
>>> results
>>>
>>> type(results)
<class 'NoneType'>
As you can see, when you assign the result of sort() back to the name, you no longer have a list.
