Python unique list using set [duplicate] - python

This question already has answers here:
How do I remove duplicates from a list, while preserving order?
(31 answers)
Closed 8 months ago.
What I am trying to do is write a method that takes a list as an argument and uses a set to return a copy of the list in which each element occurs only once, with the elements in order of their first occurrence in the original list. I HAVE to use a set for this. However, I can't get the output in the right order and still get a quick result.
If I put something like this:
def unique(a):
    return list(set(a))
and passed a list with millions of elements, it would give me a result quickly, but it wouldn't be ordered.
So what I have right now is this:
def unique(a):
    b = set(a)
    c = {}
    d = []
    for i in b:
        c[a.index(i)] = i
    for i in c:
        d.append(c[i])
    return d
This gives me the result I want, but not fast enough. If I pass a list with a million elements, I could be waiting for half an hour, whereas the one-liner above takes less than a second. How could I solve this problem?

>>> from collections import OrderedDict
>>> items = [1, 2, 3, 'a', 2, 4, 'a']
>>> list(OrderedDict.fromkeys(items))
[1, 2, 3, 'a', 4]
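Since the question requires a set, here is a minimal sketch of the usual "seen set" approach. The names unique, seen and result are illustrative, and the elements are assumed to be hashable. Each element is tested against the set once, so the whole pass is roughly linear in the length of the list instead of calling a.index() for every element.

```python
def unique(items):
    """Return the elements of items in first-occurrence order, without repeats."""
    seen = set()      # elements already emitted; membership tests are O(1) on average
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

print(unique([1, 2, 3, 'a', 2, 4, 'a']))  # [1, 2, 3, 'a', 4]
```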

Related

how do you remove a list from another list? [duplicate]

This question already has answers here:
Remove all the elements that occur in one list from another
(13 answers)
Compute list difference [duplicate]
(17 answers)
Closed 6 years ago.
I have something like:
a = [4,3,1,6,3,5,3]
b = [4,2,6]
and I wanted to remove the 3 elements of b from a. I was trying to do:
c = a - b
but I suspected that the inverse of merge (+) might not be a thing, and I was correct: unsupported operand type(s) for -: 'list' and 'list'. I was contemplating just looping over them, but that doesn't really sound Pythonic.
My end state is going to be: c = [3,1,3,5,3]
In case you had not noticed, b is not a subset of a, and both lists are unordered. They are two different lists that may contain duplicates, and I only want to remove one instance from a for each item in b, not every instance.
EDIT: It seems that the current answer does not resolve my question.
a = [1,1,2,2,2,3,4,5]
b = [1,3]
c = [x for x in a if x not in b]
#c result is [2,2,2,4,5]
I want c to return: [1,2,2,2,4,5]
For the sake of speed typing, I just put sorted numbers, but the lists ARE IN FACT unsorted, though for the sake of cleanliness we can sort them.
They are just unsorted at init. Since there are duplicates in list items, I can't use a set as per the definition of set.
You can use the following code:
[x for x in a if x not in b]
Since the input and output lists are unordered, using Counter is one way to go:
from collections import Counter
a = [4,3,1,6,3,5,3]
b = [4,2,6]
c = list((Counter(a) - Counter(b)).elements())
print(c)
This solution handles the following case differently than ysearka's answer:
a = [4,3,1,6,3,5,3]
b = [3]
Where many answers here lead to c = [4,1,6,5], the one using Counter removes only one of the 3s and outputs c = [4,1,6,3,3,5] (the order of the elements may differ).
This behavior is also implemented by DAXaholic's answer, but by modifying an existing list, which is not a Pythonic way to go and could be costly on big lists.
As you noted you only want to remove the items once per occurrence:
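# Remove one occurrence from a for each item of b that is present in a
for x in [tmp for tmp in b if tmp in a]:
    a.remove(x)
# To remove every occurrence instead:
for x in b:
    while x in a:
        a.remove(x)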
Now a is
[3, 1, 3, 5, 3]
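If the original list should stay untouched and its order should be kept, the two ideas above can be combined. The following is only a sketch: remove_once and to_remove are hypothetical names, and it assumes the elements are hashable. A Counter tracks how many removals are still owed for each value, so the list is scanned just once.

```python
from collections import Counter

def remove_once(a, b):
    """Return a copy of a with one occurrence removed per item in b, keeping a's order."""
    to_remove = Counter(b)      # how many copies of each value still need removing
    result = []
    for x in a:
        if to_remove[x] > 0:
            to_remove[x] -= 1   # drop this occurrence and use up one removal
        else:
            result.append(x)
    return result

print(remove_once([4, 3, 1, 6, 3, 5, 3], [4, 2, 6]))   # [3, 1, 3, 5, 3]
print(remove_once([1, 1, 2, 2, 2, 3, 4, 5], [1, 3]))   # [1, 2, 2, 2, 4, 5]
```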

Python: Assign values to variables using two lists [duplicate]

This question already has answers here:
Convert string to variable name in python [duplicate]
(3 answers)
Closed 7 years ago.
How can I assign values to variables by using two lists:
Numbers = [1, 2, 3, 4]
Element = ["ElementA", "ElementB", "ElementC", "ElementD"]
for e, n in zip(Element, Numbers):
    e + ' = ' + n  # this part is wrong
What I want is a result like this in the end:
print ElementA
>1
print ElementB
>2
print ElementC
>3
print ElementD
>4
So what I'm trying to do is fill variables created from one list (Element) with values from another list (Numbers), in a kind of loop.
Does anyone know how to achieve something like that?
It should also be possible to assign many values contained in a list/array to variables.
Do not do that.
You're already working with lists, so just create a list instead of hacking some hard-coded names. You could use a dictionary if you really wanted to use identifiers like A, B, etc., but you'd have a problem with more than 26 items, and you're just using it to make a sequence anyway. Integers are great for that, and of course we'll start at 0 because that's how it works.
>>> numbers = [1, 2, 3, 4]
>>> elements = [item for item in numbers]
>>> elements[0]
1
And at this point we can see that, for this example at least, you already had what you were looking for this whole time, in numbers.
>>> numbers = [1, 2, 3, 4]
>>> numbers[0]
1
Perfect.
You may use exec, but this is generally the wrong way to go.
for e, n in zip(Element, Numbers):
    exec(e + ' = ' + str(n))  # n is an int, so convert it before concatenating
You would be better off using a dictionary:
my_dict = {}
for e, n in zip(Element, Numbers):
    my_dict[e] = n
Even simpler:
my_dict = dict(zip(Element, Numbers))
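As a small usage sketch of the dictionary approach (the name values is just illustrative), the lookups the question asks for become key accesses instead of separate variables:

```python
Numbers = [1, 2, 3, 4]
Element = ["ElementA", "ElementB", "ElementC", "ElementD"]

values = dict(zip(Element, Numbers))   # name -> value mapping instead of separate variables

print(values["ElementA"])              # 1
for name, number in values.items():
    print(name, number)
```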

"list index out of range" - Python

I was trying to write a piece of program that will remove any repeating items in the list, but I get a list index out of range
Here's the code:
a_list = [1, 4, 3, 2, 3]
def repeating(any_list):
    list_item, comparable = any_list, any_list
    for x in any_list:
        list_item[x]
        comparable[x]
        if list_item == comparable:
            any_list.remove(x)
    print(any_list)
repeating(a_list)
So my question is, what's wrong?
Your code does not do what you think it does.
First you are creating additional references to the same list here:
list_item, comparable = any_list, any_list
list_item and comparable are just additional names to access the same list object.
You then loop over the values contained in any_list:
for x in any_list:
This assigns first 1, then 4, then 3, then 2, then 3 again to x.
Next, you use those values as indexes into the other two references to the list, but ignore the result of those expressions:
list_item[x]
comparable[x]
This doesn't do anything, other than test if those indexes exist.
The following line then is always true:
if list_item == comparable:
because the two variables reference the same list object.
Because that is always true, the following line is always executed:
any_list.remove(x)
This removes the first x from the list, making the list shorter, while still iterating. This causes the for loop to skip items as it'll move the pointer to the next element. See Loop "Forgets" to Remove Some Items for why that is.
All in all, you end up with 4, then 3 items in the list, so list_item[3] then fails and throws the exception.
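To see the skipping in isolation, here is a minimal sketch that leaves out the indexing and only removes while iterating. The names items and visited are illustrative. The loop visits only three of the five values, because each removal shifts the remaining elements left under the loop's position.

```python
items = [1, 4, 3, 2, 3]
visited = []
for x in items:
    visited.append(x)   # record which values the loop actually sees
    items.remove(x)     # shrinks the list while it is being iterated

print(visited)  # [1, 3, 3]
print(items)    # [4, 2]
```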
The proper way to remove duplicates is to use a set object:
def repeating(any_list):
    return list(set(any_list))
because a set can only hold unique items. It'll alter the order however. If the order is important, you can use a collections.OrderedDict() object:
def repeating(any_list):
    return list(OrderedDict.fromkeys(any_list))
Like a set, a dictionary can only hold unique keys, but an OrderedDict also keeps track of the order of insertion; the dict.fromkeys() method gives each element in any_list a value of None unless the element was already there. Turning that back into a list gives you the unique elements in first-come, first-served order:
>>> from collections import OrderedDict
>>> a_list = [1, 4, 3, 2, 3]
>>> list(set(a_list))
[1, 2, 3, 4]
>>> list(OrderedDict.fromkeys(a_list))
[1, 4, 3, 2]
See How do you remove duplicates from a list in whilst preserving order? for more options still.
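On Python 3.7 and later, a plain dict is guaranteed to preserve insertion order, so the same idea works without the import. This is a sketch under that version assumption, and it assumes hashable elements:

```python
def repeating(any_list):
    # dict keys are unique and, on Python 3.7+, iterate in insertion order
    return list(dict.fromkeys(any_list))

print(repeating([1, 4, 3, 2, 3]))  # [1, 4, 3, 2]
```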
The easiest way to solve your issue is to convert the list to a set and then, back to a list...
def repeating(any_list):
    print(list(set(any_list)))
You're probably having an issue because you're modifying the list (removing items) while iterating over it.
If you want to remove duplicates from a list and don't care about the order of the elements, then you can use
def removeDuplicate(numlist):
    return list(set(numlist))
If you want to preserve the order, then
def removeDuplicate(numlist):
    return sorted(set(numlist), key=numlist.index)

Python lists get specific length of elements from index [duplicate]

This question already has answers here:
Understanding slicing
(38 answers)
Closed 9 years ago.
my_list = [1,2,3,4,5,6,7,8,9,10,11,12,13]
I need to obtain a specific number of elements from a list, starting at a specific index, in Python. For instance, I would like to get the next three elements starting from the [2] element above. Is there any way to get three elements from a specific index? I won't always know in advance how many elements I want: sometimes two, sometimes eight, so in general x elements.
I know I can do my_list[2:] to get all of the elements from the third element to the end of the list. What I want to do is specify how many elements to read starting from the third element. Conceptually, in my mind, the example above would look like my_list[2:+3]; however, I know this won't work.
How can I achieve this or is it better to define my own function to give me this functionality?
You are actually very close:
>>> my_list = [1,2,3,4,5,6,7,8,9,10,11,12,13]
>>> x = 3
>>> my_list[2:2+x]
[3, 4, 5]
>>>
As you can see, the answer to your question is to slice the list.
The syntax for slicing is list[start:stop:step]. start is where to begin, stop is where to end, and step is what to count by (I didn't use step in my answer).
my_list = [1,2,3,4,5,6,7,8,9,10,11,12,13]
n = 3
print(my_list[2:2+n])
Nothing smart here. But, you can pull n out and tweak it the way you want.
You should simply use
my_list[2:2+3]
and in general
list[ STARTING_INDEX : END_INDEX ]
which is equivalent to
list[ STARTING_INDEX : STARTING_INDEX + LENGTH ]
>>> my_list = [1,2,3,4,5,6,7,8,9,10,11,12,13]
>>> my_list[2:3]
[3]
>>> my_list[2:2+3]
[3, 4, 5]
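If this comes up often, the slice can be wrapped in a small helper. The name take and its parameters are hypothetical, and the sketch assumes a non-negative start index. Note that slicing never raises IndexError: asking for more elements than remain simply returns what is left.

```python
def take(items, start, length):
    """Return up to `length` elements of `items`, beginning at index `start`."""
    return items[start:start + length]

my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
print(take(my_list, 2, 3))    # [3, 4, 5]
print(take(my_list, 10, 8))   # [11, 12, 13]  (the slice stops at the end of the list)
```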

how to convert two lists into a dictionary (one list is the keys and the other is the values)? [duplicate]

This question already has answers here:
How can I make a dictionary (dict) from separate lists of keys and values?
(21 answers)
Closed 6 years ago.
This is my code in IDLE (Python 2), along with the error.
I need to add each element of "dato" as a key, with the corresponding element of "otro" as its value, in order. "dato" and "otro" are lists of 38 strings each, and "dik" is a dictionary.
>>> for i in range(len(otro)+1):
        dik[dato[i]] = otro[i]

Traceback (most recent call last):
  File "<pyshell#206>", line 2, in <module>
    dik[dato[i]] = otro[i]
IndexError: list index out of range
>>>
The problem is with range(0, 38):
the output is (0, 1, 2, 3 ... 37) and it is all messy.
I think something like:
dik = dict(zip(dato,otro))
is a little cleaner...
If dik already exists and you're just updating it:
dik.update(zip(dato,otro))
If you don't know about zip, you should invest a little time learning it. It's super useful.
a = [ 1 , 2 , 3 , 4 ]
b = ['a','b','c','d']
zip(a,b) #=> [(1,'a'),(2,'b'),(3,'c'),(4,'d')] #(This is actually a zip-object on python 3.x)
zip can also take more arguments: zip(a,b,c), for example, will give you a list of 3-tuples, but that's not terribly important for the discussion here.
This happens to be exactly one of the things that the dict "constructor" (type) likes to initialize a set of key-value pairs. The first element in each tuple is the key and the second element is the value.
The error comes from this: range(len(otro)+1). When you use range, the upper value isn't actually included, so range(5), for instance, iterates over 0, 1, 2, 3, 4. A list nums with 5 elements has exactly those valid indexes, the fifth element being at index 4. If we then said for i in range(len(nums)+1): print(nums[i]), the final i would be len(nums) = 5, one past the last valid index, which raises the IndexError you see.
The more 'Pythonic' way to iterate over something is to not use the len of the list - you iterate over the list itself, pulling out the index if necessary by using enumerate:
In [1]: my_list = ['one', 'two', 'three']
In [2]: for index, item in enumerate(my_list):
   ...:     print(index, item)
   ...:
0 one
1 two
2 three
Applying this to your case, you can then say:
>>> for index, item in enumerate(otro):
...     dik[dato[index]] = item
However, in keeping with the Pythonic theme, @mgilson's zip is the better version of this construct.
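For comparison, a short sketch of the zip version. The three-element lists below are hypothetical stand-ins for the asker's lists of 38 strings. Because zip stops at the shorter input, a length mismatch silently drops the extra items instead of raising IndexError, which is worth keeping in mind if the two lists are expected to match.

```python
# Hypothetical stand-ins for the asker's lists of 38 strings.
dato = ["k1", "k2", "k3"]
otro = ["v1", "v2", "v3"]

dik = dict(zip(dato, otro))
print(dik)    # {'k1': 'v1', 'k2': 'v2', 'k3': 'v3'}

# zip stops at the shorter input, so a length mismatch never raises IndexError.
print(dict(zip(dato, otro[:2])))    # {'k1': 'v1', 'k2': 'v2'}
```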
