Distinct List of Lists With Set - python

I created a list of lists and then tried to get a distinct list of the lists using set(), but it appears that I can't call set() on a list of lists.
Is there another way to accomplish this with a concise statement that performs well?
CODE
x = [1,2]
y = [1,2]
z = [2,3]
xyz = []
xyz.append(x)
xyz.append(y)
xyz.append(z)
set(xyz)
Error
TypeError: unhashable type: 'list'
Goal
xyz = [[1,2],[2,3]]

If you want to preserve order and keep your lists, you could use a generator function to remove the dupes from your list:
xyz = [x, y, z]
def remove_dupe_subs(l):
    seen = set()
    for sub in l:
        tup = tuple(sub)
        if tup not in seen:
            yield sub
            seen.add(tup)
xyz[:] = remove_dupe_subs(xyz)
Or use a generator expression, taking advantage of the fact that set.add returns None:
seen = set()
xyz[:] = (seen.add(tup) or sub for sub, tup in zip(xyz, map(tuple, xyz)) if tup not in seen)
print(xyz)

If the list members are hashable (for example, tuples instead of lists), set() will work:
x = [1,2]
y = [1,2]
z = [2,3]
xyz = []
xyz.append(tuple(x))
xyz.append(tuple(y))
xyz.append(tuple(z))
print(xyz)
xyz_set = set(xyz)
print(xyz_set)

It's a little bit convoluted, but this will do the trick in a single line:
xyz=[list(x) for x in list(set((tuple(x),tuple(y),tuple(z))))]
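If the original order matters, one alternative to set() is dict.fromkeys. This is a different technique from the one-liner above and is only a sketch: it relies on dicts keeping insertion order (Python 3.7+), and it keeps the first occurrence of each sublist. The variable names follow the question.

```python
x = [1, 2]
y = [1, 2]
z = [2, 3]
xyz = [x, y, z]

# Tuples are hashable, so they can serve as dict keys; dict.fromkeys keeps
# the first occurrence of each key, in insertion order (Python 3.7+).
distinct = [list(t) for t in dict.fromkeys(map(tuple, xyz))]
print(distinct)  # [[1, 2], [2, 3]]
```

Note that this builds new list objects rather than reusing the original sublists.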

Related

Differences between declaring a list & appending and list()

The code below appends the Series s itself to the list l:
s = pd.Series([1], name='foo')
l = []
l.append(s)
whereas this puts only the value 1 in l:
s = pd.Series([1], name='foo')
l = list(s)
What is the best way to get the result of the first script without declaring a list and then appending to it?
[x] makes a list with x as an element.
list(x) makes a list produced by iterating over x. x has to be iterable, otherwise you'll get an error.
It is, in effect, [i for i in x], or
alist = []
for i in x:
    alist.append(i)
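A small sketch of the difference, using a plain tuple in place of the pandas Series so that it runs without extra dependencies (the tuple is a stand-in, not the asker's data). It also suggests that [s] gives the result of the first script without a separate append.

```python
s = (1, 2)  # stand-in for any iterable, such as the Series in the question

wrapped = [s]        # one element: the object itself
unpacked = list(s)   # one element per item produced by iterating over s

print(wrapped)   # [(1, 2)]
print(unpacked)  # [1, 2]

# [s] is equivalent to declaring an empty list and appending s to it
l = []
l.append(s)
print(l == wrapped)  # True
```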

Python take modified list outside for loop

I have been working on a way to take a list of tuples and find the average of each tuple.
myList = [(1,2,3),(4,12,6)]
def GS(myList):
    for miniList in myList:
        r = miniList[0]
        g = miniList[1]
        b = miniList[2]
        GS = round((r+g+b)/3,2)
        miniList = list(miniList)
        miniList[0] = GS
        miniList[1] = GS
        miniList[2] = GS
        miniList = tuple(miniList)
    return myList
print(GS(myList))
My list is [(1,2,3),(4,12,6)].
I should get the average of each tuple and replace its three values with that average.
Expected output: [(2.0,2.0,2.0),(7.33,7.33,7.33)]
You can use a list comprehension. Below is an example which avoids calculating the length of each tuple twice via map and zip iterators.
myList = [(1,2,3),(4,12,6)]
def GS(L):
    lens = map(len, L)
    res = [(sum(i)/i_len,)*i_len for i, i_len in zip(L, lens)]
    return res
print(GS(myList))
[(2.0, 2.0, 2.0), (7.333333333333333, 7.333333333333333, 7.333333333333333)]
If you wish to round decimals, you can use:
res = [(round(sum(i)/i_len, 2),)*i_len for i, i_len in zip(L, lens)]
myList = [(1,2,3),(4,12,6)]
[(round(sum(e)/len(e), 2),)*len(e) for e in myList]
# [(2.0, 2.0, 2.0), (7.33, 7.33, 7.33)]
The problem is that in your for loop, you write:
for miniList in myList:
    # ...
    miniList = tuple(miniList)
Here you seem to want to assign by reference, but that is not possible in Python (or at least not with this syntax): assigning to miniList only rebinds the loop variable and leaves myList unchanged.
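A minimal sketch of why the assignment has no effect on the list (the names nums and n are made up for this example):

```python
nums = [1, 2, 3]

for n in nums:
    n = n * 10  # rebinds the local name n only; nums is not modified

print(nums)  # [1, 2, 3]
```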
You can, however, use an index and assign back into the list:
def GS(myList):
    for idx, miniList in enumerate(myList):
        r = miniList[0]
        g = miniList[1]
        b = miniList[2]
        GS = round((r+g+b)/3,2)
        miniList = list(miniList)
        miniList[0] = GS
        miniList[1] = GS
        miniList[2] = GS
        myList[idx] = tuple(miniList)
    return myList
That being said, this is a rather complex way to do this. You can for example use:
def GS(myList):
    for idx, miniList in enumerate(myList):
        myList[idx] = (round(sum(miniList)/len(miniList), 2),) * len(miniList)
    return myList
This also works on tuples that contain more or fewer than three elements. We calculate the sum(..) of the miniList and divide it by the len(..) of the miniList to obtain the average. We then use the round(.., 2) function as in the original function.
Next we wrap this in a singleton tuple, with (.., ), and multiply it by the length of the tuple, to obtain a tuple in which the element of the singleton tuple is repeated len(miniList) times.
That being said, it is typically more Pythonic to construct a new list than to change an existing one, since other variables may refer to the same list and would see the change as well.
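A sketch of that last suggestion, building a new list and leaving the input untouched. The function name averaged is made up for this example, and Python 3 division is assumed.

```python
def averaged(tuples):
    """Return a new list where each tuple is replaced by its rounded mean, repeated."""
    return [(round(sum(t) / len(t), 2),) * len(t) for t in tuples]

myList = [(1, 2, 3), (4, 12, 6)]
result = averaged(myList)
print(result)  # [(2.0, 2.0, 2.0), (7.33, 7.33, 7.33)]
print(myList)  # [(1, 2, 3), (4, 12, 6)]  (unchanged)
```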

Lists comparison in loop: print second tuple element if condition

I have the following list and nested list:
first_var = ["id1","id2","id3"]
second_var = [("id1","name1"),("id2","name2"),("id3","name3"),("id4","name4"),]
For each tuple in 'second_var' whose first element doesn't exist in 'first_var', I want to print the second element.
My code is:
for x in [x[0] for x in second_var]:
    if x not in first_var:
        print(...)
For now, if I execute print(x) it prints:
id4
but I need it to print
name4
How can I achieve that?
The problem with your code is that you are not iterating over the original list. You are iterating only over the first entry of each tuple, so the second entry is no longer available inside the loop.
This is how you can adapt your code:
first_var = ["id1","id2","id3"]
second_var = [("id1","name1"),("id2","name2"),("id3","name3"),("id4","name4"),]
for x in second_var:
    if x[0] not in first_var:
        print(x[1])
The Pythonic solution is to convert this to a list comprehension:
values = set(first_var)
res = [x[1] for x in second_var if x[0] not in values]
for item in res:
    print(item)
Or the functional version; not recommended, but another way of seeing the logic:
from operator import itemgetter
values = set(first_var)
res = map(itemgetter(1), filter(lambda x: x[0] not in values, second_var))
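Note that in Python 3 map and filter return lazy iterators, so the result has to be consumed, for example with list(). A minimal sketch with the data from the question:

```python
from operator import itemgetter

first_var = ["id1", "id2", "id3"]
second_var = [("id1", "name1"), ("id2", "name2"), ("id3", "name3"), ("id4", "name4")]

values = set(first_var)
res = map(itemgetter(1), filter(lambda x: x[0] not in values, second_var))
names = list(res)  # materialize the iterator; res is exhausted afterwards
print(names)  # ['name4']
```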
>>> [v[1] for v in second_var if v[0] not in first_var]
['name4']
You can use a list comprehension. (The loop variable is named pair here rather than tuple, to avoid shadowing the built-in.)
ids = [pair[1] for pair in second_var if pair[0] not in first_var]
print(ids)
Output
['name4']
The list comprehension statement above is equivalent to:
>>> result = []
>>> for pair in second_var:
...     if pair[0] not in first_var:
...         result.append(pair[1])
...
>>> result
['name4']
If you have a lot of data, build a dictionary and use sets, so that membership tests rely on the hashing and set-difference machinery already in Python (O(1) on average) instead of a linear lookup in a list (O(n)).
first_var = ["id1","id2","id3"]
second_var = [("id1","name1"),("id2","name2"),("id3","name3"),("id4","name4"),]
second_d = dict(second_var) # create a dict directly from tuples
missing = set(second_d).difference(first_var)
for m in missing:
    print(second_d[m])
This prints name4.
missing is the set difference between the dict keys and the list.
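One caveat with the set difference is that a set has no guaranteed iteration order, so with several missing ids the names may print in any order. If the order of second_var matters, a sketch like the following keeps the hash-based membership test but walks the dict in insertion order (Python 3.7+). The sample data is altered from the question, and the "id0" entry is invented here purely to show the ordering.

```python
first_var = ["id1", "id2", "id3"]
second_var = [("id4", "name4"), ("id1", "name1"), ("id0", "name0")]

known = set(first_var)       # hash-based membership test
second_d = dict(second_var)  # dicts keep insertion order in Python 3.7+

missing_names = [name for key, name in second_d.items() if key not in known]
print(missing_names)  # ['name4', 'name0']
```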

Finding indices of items from a list in another list even if they repeat

This answer works very well for finding indices of items from a list in another list, but the problem with it is, it only gives them once. However, I would like my list of indices to have the same length as the searched for list.
Here is an example:
thelist = ['A','B','C','D','E'] # the list whose indices I want
Mylist = ['B','C','B','E'] # my list of values that I am searching in the other list
ilist = [i for i, x in enumerate(thelist) if any(thing in x for thing in Mylist)]
With this solution, ilist = [1,2,4] but what I want is ilist = [1,2,1,4] so that len(ilist) = len(Mylist). It leaves out the index that has already been found, but if my items repeat in the list, it will not give me the duplicates.
thelist = ['A','B','C','D','E']
Mylist = ['B','C','B','E']
ilist = [thelist.index(x) for x in Mylist]
print(ilist) # [1, 2, 1, 4]
Basically, "for each element of Mylist, get its position in thelist."
This assumes that every element in Mylist exists in thelist. If the element occurs in thelist more than once, it takes the first location.
UPDATE
For substrings:
thelist = ['A','boB','C','D','E']
Mylist = ['B','C','B','E']
ilist = [next(i for i, y in enumerate(thelist) if x in y) for x in Mylist]
print(ilist) # [1, 2, 1, 4]
UPDATE 2
Here's a version that does substrings in the other direction using the example in the comments below:
thelist = ['A','B','C','D','E']
Mylist = ['Boo','Cup','Bee','Eerr','Cool','Aah']
ilist = [next(i for i, y in enumerate(thelist) if y in x) for x in Mylist]
print(ilist) # [1, 2, 1, 4, 2, 0]
The code below would work:
ilist = [thelist.index(i) for i in Mylist]
Make a reverse lookup from strings to indices:
string_indices = {c: i for i, c in enumerate(thelist)}
ilist = [string_indices[c] for c in Mylist]
This avoids the quadratic behaviour of repeated .index() lookups.
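Two details of the reverse lookup are worth a sketch. The dict comprehension keeps the last index when thelist contains duplicates, whereas list.index returns the first; setdefault restores the first-occurrence behaviour. And dict.get can return a placeholder (None here, an arbitrary choice) for values absent from thelist instead of raising an error. The sample data below is made up for the illustration.

```python
thelist = ['A', 'B', 'C', 'B', 'E']  # 'B' appears twice
Mylist = ['B', 'C', 'B', 'E', 'Z']   # 'Z' is not in thelist

string_indices = {}
for i, c in enumerate(thelist):
    string_indices.setdefault(c, i)  # keep the first index seen for each value

ilist = [string_indices.get(c) for c in Mylist]  # None for missing values
print(ilist)  # [1, 2, 1, 4, None]
```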
If your data can be implicitly converted to an ndarray, as your example implies, you could use numpy_indexed (disclaimer: I am its author) to perform this kind of operation in an efficient (fully vectorized and NlogN) manner.
import numpy_indexed as npi
ilist = npi.indices(thelist, Mylist)
npi.indices is essentially the array-generalization of list.index. Also, it has a kwarg to give you control over how to deal with missing values and such.

Python - add information to items in a list thereby creating a list of tuples

I have a list:
list1 = [1,2,3]
I'm looking up info for each item in the list via some arbitrary function, and want to add the results so the list becomes:
list1 = [(1,a),(2,b),(3,x)]
How to best accomplish this in Python3?
for item in list1:
    newinfo = some_arbitrary_function(item)
    item = (item, newinfo)
This does not appear to work.
You need a list comprehension:
lst = [1,2,3]
result = [(item, function(item)) for item in lst]
Using list as a name is not a good idea: you'll shadow the built-in list, making it inaccessible later in your code.
In case you want to keep the reference to the original list:
lst[:] = [(item, function(item)) for item in lst]
It is not possible to assign to a list item through the iteration variable (item in your case). In fact, there is nothing special about the variable item: assigning to it with = simply rebinds the name. If the operation is intended to be in-place, you should do
for i, item in enumerate(list1):
    list1[i] = (item, func(item))
Also, you should not name your list list, because it will hide the built-in type list.
It looks like you just want to change each value in the list from a single value to a tuple containing the original value and the result of some lookup on that value. This should do what you want.
zzz = [1,2,3]
i = 0
for num in zzz:
    zzz[i] = (num, somefunc(num))
    i += 1
Running this
zzz = [1,2,3]
i = 0
for num in zzz:
    zzz[i] = (num, 8)
    i += 1
gives the result zzz = [(1,8), (2,8), (3,8)]
If you have list1 = [1,2,3] and list2 = [x,y,z], zip(list1, list2) will give you what you're looking for. You will need to iterate over the first list with the function to find x, y, z and put the results in a list. In Python 3, zip returns an iterator, so wrap it in list() to see the pairs:
list(zip(list1, list2))
[(1, x), (2, y), (3, z)]
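A sketch of that idea in Python 3. The lookup function and its return values are hypothetical stand-ins for the asker's arbitrary function.

```python
def lookup(item):
    # Hypothetical stand-in for the arbitrary function in the question
    return {1: "a", 2: "b", 3: "x"}[item]

list1 = [1, 2, 3]
list2 = [lookup(item) for item in list1]  # the looked-up values
pairs = list(zip(list1, list2))           # zip is lazy in Python 3
print(pairs)  # [(1, 'a'), (2, 'b'), (3, 'x')]
```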
