Condensing a for loop into a comprehension (x for x in ...) in Python

I am trying to condense the following simple loop:
a = [1,2,3,2,1,5,6,5,5,5]
for x in set(a):
    a.remove(x)
This works well, but I would like to know whether it can be written as a one-line comprehension, something like this:
a = [x for x in set(a):a.remove(x)]
My desired output is a list of only the duplicated values, so here it should be [1,2,5].
This code works:
a = [1,2,3,2,1,5,6,5,5,5]
for x in set(a):
    a.remove(x)
print(list(set(a)))
My goal is not this particular code but learning how to condense such a loop into a one-liner.
Update: I found a simple and effective solution:
print(list(set([x for x in a if a.count(x) > 1])))

Original question
a = a if not all([a.remove(i) for i in set(a) ]) else []
print(a)
As suggested by Copperfield, the following also works:
a = any(a.remove(i) for i in set(a) ) or a
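Both one-liners rely on list.remove returning None, so every result is falsy and the expression falls through to a. A small sketch of what happens underneath (the name results is only for illustration):

```python
a = [1, 2, 3, 2, 1, 5, 6, 5, 5, 5]

# list.remove mutates the list in place and returns None,
# so this builds a throwaway list of None values.
results = [a.remove(i) for i in set(a)]

print(results)       # five None values, one per distinct element
print(any(results))  # False, so `any(...) or a` evaluates to a
print(a)             # one occurrence of each distinct value is gone
```

The comprehension is used purely for its side effect here, which is why the plain for loop from the question is usually considered easier to read.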
Updated question
from collections import Counter
a = [1,2,3,2,1,5,6,5,5,5]
print([k for k, v in Counter(a).items() if v > 1])

find dups in a list
print ([c for i,c in enumerate(a) if a.count(c) > 1 and i==a.index(c)])
The output of this will be:
[1, 2, 5]
Alternative to set(a)
Here's a list comprehension that produces the same result as
print(list(set(a)))
It can be written as follows:
print([c for i,c in enumerate(a) if i==a.index(c)])
Here I check whether index i is the first position at which element c occurs; if so, c is added to the list, otherwise it is skipped.
The output of both these will be:
[1, 2, 3, 5, 6]
While the output is the same, I would strongly recommend the first method, list(set(a)), over looping and checking the index. a.index(c) rescans the list for every element, so the cost is far higher than that of list(set(a)).
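To make the cost point concrete: if the first-seen order matters and the elements are hashable, one alternative that avoids the repeated index lookups is dict.fromkeys. This is not from the answer above, and it assumes Python 3.7 or later, where dicts preserve insertion order:

```python
a = [1, 2, 3, 2, 1, 5, 6, 5, 5, 5]

# dict keys are unique and, from Python 3.7 on, keep insertion order,
# so this removes duplicates in a single pass over the list.
unique_in_order = list(dict.fromkeys(a))
print(unique_in_order)  # [1, 2, 3, 5, 6]
```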

You can just do:
a = [1,2,3,2,1,5,6,5,5,5]
[a.remove(x) for x in set(a)]
print(a)
a will have the same items as after your for loop.
You can read more about list comprehensions.

Related

How can I check if a list of nodes have already been included in a list within a list of lists?

I have the following list: a = [[1,2,3],[4,5,6],[7,8,9]] which contains 3 lists, each being a list of nodes of a graph.
I am also given a tuple of nodes z = ([1,2], [4,9]). Now, I would like to check whether either of the lists in z is included in one of the lists in a. For example, [1,2] is in [1,2,3], which is in a, but [4,9] is not in [4,5,6], although there is an overlapping node.
Remark: To clarify, I am also checking for sub-list of a list, or whether every item in a list is in another list. For example, I consider [1,3] to be "in" [1,2,3].
How can I do this? I tried implementing something similar to what I found at Python 3 How to check if a value is already in a list in a list, but I have reached a mental deadlock.
Some insight on this issue would be great!
You can use any and all:
a = [[1,2,3],[4,5,6],[7,8,9]]
z = ([1,2], [4,9])
results = [i for i in z if any(all(c in b for c in i) for b in a)]
Output:
[[1, 2]]
You can use sets to check whether the nodes appear in a; the <= operator for sets is equivalent to issubset().
The itertools module provides some useful functions; itertools.product() is equivalent to nested for loops.
E.g.:
import itertools as it
[m for m, n in it.product(z, a) if set(m) <= set(n)]
Output:
[[1, 2]]
a = [[1,2,3],[4,5,6],[7,8,9]]
z = ([1,2], [4,9])
for z_ in z:
    for a_ in a:
        if set(z_).issubset(a_):
            print(z_)
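itertools.combinations is your friend (itertools is part of the standard library, so nothing needs to be installed). Note that this check assumes the items of each sub-list appear in the same relative order as in the list from a:
from itertools import combinations
print([i for i in z if any(tuple(i) in combinations(l, len(i)) for l in a)])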
Output:
[[1, 2]]
Since you're only looking to test the sub-lists as if they were subsets, you can convert the sub-lists to sets and then use set.issubset() for the test:
s = list(map(set, a))  # a list, because in Python 3 a bare map object is exhausted after one pass
print([l for l in z for i in s if set(l).issubset(i)])
This outputs:
[[1, 2]]
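The set-based answers above can be wrapped in a small helper. This is only a sketch: the function name sublists_contained is made up for illustration, and it assumes all nodes are hashable.

```python
def sublists_contained(candidates, groups):
    """Return the candidates whose items all appear in at least one group."""
    group_sets = [set(g) for g in groups]  # build each set only once
    return [c for c in candidates
            if any(set(c) <= g for g in group_sets)]

a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
z = ([1, 2], [4, 9])
print(sublists_contained(z, a))         # [[1, 2]]
print(sublists_contained([[1, 3]], a))  # [[1, 3]]
```

Using any means a candidate is reported once even if it fits inside several lists of a.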

list comprehension with iteration

Suppose I have a list L of integers. I want to use a list comprehension to construct a list T that contains L[i] for each index i in range(len(L)), provided L[i] is larger than 0.
I tried the following:
T=[L[i] if L[i]>0 for i in range(len(L))]
but I keep getting an invalid syntax error. How would I do this correctly with a list comprehension in Python?
Your syntax is wrong:
L = [1,2,-4,5,-6,7,8,9]
T = [L[i] for i in range(len(L)) if L[i]>0]
Output:
[1, 2, 5, 7, 8, 9]
You can also iterate over the elements of the list directly, with no need for range: T = [i for i in L if i>0]
Remember:
When there is only an if, the syntax is [expression for var in list if ...]
When there is both an if and an else, the syntax is [expression1 if ... else expression2 for var in list]
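A short sketch of the two forms side by side, using the list from this answer (the names kept and replaced are only for illustration):

```python
L = [1, 2, -4, 5, -6, 7, 8, 9]

# Filter: the `if` comes after the `for` and drops elements.
kept = [x for x in L if x > 0]

# Conditional expression: `if ... else` comes before the `for`
# and replaces elements, so the length stays the same.
replaced = [x if x > 0 else 0 for x in L]

print(kept)      # [1, 2, 5, 7, 8, 9]
print(replaced)  # [1, 2, 0, 5, 0, 7, 8, 9]
```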
You need to write the filter after the iteration part. So:
T=[L[i] for i in range(len(L)) if L[i]>0]
#  \__/ \_______ ____________/ \___ ___/
#  yield        v                  v
#           iteration            filter
Right now Python thinks that you want to write a ternary operator, like:
T=[L[i] if L[i] > 0 else 0 for i in range(len(L))]
That is not what you want here: it evaluates L[i] if L[i] > 0 else 0 for every element, so it adds a 0 for every item L[i] that is less than or equal to zero.
That being said, you can write your list comprehension more elegantly (and make it faster) with:
T = [l for l in L if l > 0]
So instead of iterating over indices, we iterate over the elements l in L. We also filter on l and yield l in case the filtering is successful.
While you can use a list comprehension to solve this problem, you can also use filter with a lambda function:
final_l = list(filter(lambda x:x > 0, L))

Finding indices of items from a list in another list even if they repeat

This answer works very well for finding indices of items from a list in another list, but the problem with it is, it only gives them once. However, I would like my list of indices to have the same length as the searched for list.
Here is an example:
thelist = ['A','B','C','D','E'] # the list whose indices I want
Mylist = ['B','C','B','E'] # my list of values that I am searching in the other list
ilist = [i for i, x in enumerate(thelist) if any(thing in x for thing in Mylist)]
With this solution, ilist = [1,2,4] but what I want is ilist = [1,2,1,4] so that len(ilist) = len(Mylist). It leaves out the index that has already been found, but if my items repeat in the list, it will not give me the duplicates.
thelist = ['A','B','C','D','E']
Mylist = ['B','C','B','E']
ilist = [thelist.index(x) for x in Mylist]
print(ilist) # [1, 2, 1, 4]
Basically, "for each element of Mylist, get its position in thelist."
This assumes that every element in Mylist exists in thelist. If the element occurs in thelist more than once, it takes the first location.
UPDATE
For substrings:
thelist = ['A','boB','C','D','E']
Mylist = ['B','C','B','E']
ilist = [next(i for i, y in enumerate(thelist) if x in y) for x in Mylist]
print(ilist) # [1, 2, 1, 4]
UPDATE 2
Here's a version that does substrings in the other direction using the example in the comments below:
thelist = ['A','B','C','D','E']
Mylist = ['Boo','Cup','Bee','Eerr','Cool','Aah']
ilist = [next(i for i, y in enumerate(thelist) if y in x) for x in Mylist]
print(ilist) # [1, 2, 1, 4, 2, 0]
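One caveat for both substring versions: next raises StopIteration when nothing matches. A hedged sketch that returns None instead, using the two-argument form of next (the helper name first_match and the 'zzz' entry are hypothetical):

```python
def first_match(value, candidates):
    """Index of the first candidate that is a substring of value, or None."""
    return next((i for i, c in enumerate(candidates) if c in value), None)

thelist = ['A', 'B', 'C', 'D', 'E']
Mylist = ['Boo', 'Cup', 'zzz']   # 'zzz' matches nothing in thelist
ilist = [first_match(x, thelist) for x in Mylist]
print(ilist)  # [1, 2, None]
```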
The code below would work:
ilist = [thelist.index(i) for i in Mylist]
Make a reverse lookup from strings to indices:
string_indices = {c: i for i, c in enumerate(thelist)}
ilist = [string_indices[c] for c in Mylist]
This avoids the quadratic behaviour of repeated .index() lookups.
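Two details of the dictionary approach are worth noting: if thelist contains repeated items, the dict comprehension keeps the last index (unlike .index(), which returns the first), and a missing key raises KeyError. A sketch that handles both, assuming None is an acceptable placeholder for missing items:

```python
thelist = ['A', 'B', 'C', 'B', 'E']   # note the repeated 'B'
Mylist = ['B', 'C', 'B', 'E', 'Z']    # 'Z' is not in thelist

string_indices = {}
for i, c in enumerate(thelist):
    # setdefault only stores the first index seen for each item
    string_indices.setdefault(c, i)

# dict.get returns None instead of raising KeyError
ilist = [string_indices.get(c) for c in Mylist]
print(ilist)  # [1, 2, 1, 4, None]
```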
If your data can be implicitly converted to an ndarray, as your example implies, you could use numpy_indexed (disclaimer: I am its author) to perform this kind of operation in an efficient (fully vectorized, N log N) manner.
import numpy_indexed as npi
ilist = npi.indices(thelist, Mylist)
npi.indices is essentially the array-generalization of list.index. Also, it has a kwarg to give you control over how to deal with missing values and such.

Operation with values at specific indexes in a list of list

I'm currently writing a little script in python (2.x), and there's a portion of the code that I'd like to improve without knowing how to do so.
I have a list of lists that looks like the following:
my_list = [["abc",1,2,"def"],["ghi",4,5,"klm"],["nop",6,7,"qrs"]]
I need to get the sum of all the integers at the index 1 and the sum of all the integers at the index 2. To do so, I currently have:
sum1, sum2 = 0, 0
for i in my_list:
    sum1 += i[1]
    sum2 += i[2]
What could be a more pythonic way to do that? Maybe using reduce and a lambda function or something?
A more pythonic way to do that would be using the sum function along with the for ... in ... generator and do all the work in a single line, like this:
sum1, sum2 = sum(x[1] for x in my_list), sum(x[2] for x in my_list)
The most Pythonic would probably be generator expressions (close relatives of list comprehensions):
my_list = [["abc",1,2,"def"],["ghi",4,5,"klm"],["nop",6,7,"qrs"]]
Summations:
sum1 = sum(l[1] for l in my_list)
sum2 = sum(l[2] for l in my_list)
Which returns:
sum1 = 11
sum2 = 14
You could do sum(x[1] for x in my_list), sum(x[2] for x in my_list) if you don't mind looping twice,
or reduce(lambda acc, l: (acc[0] + l[1], acc[1] + l[2]), my_list, (0, 0)) if you want to compute both at once. This returns a tuple with the sum of the [1] items as its first element and the sum of the [2] items as its second.
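In Python 3, reduce is no longer a builtin and has to be imported from functools (the import also works in Python 2.6+). A sketch of the single-pass version under that assumption:

```python
from functools import reduce

my_list = [["abc", 1, 2, "def"], ["ghi", 4, 5, "klm"], ["nop", 6, 7, "qrs"]]

# The accumulator is a (sum1, sum2) tuple, starting at (0, 0).
sum1, sum2 = reduce(
    lambda acc, row: (acc[0] + row[1], acc[1] + row[2]),
    my_list,
    (0, 0),
)
print(sum1, sum2)  # 11 14
```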
You can use zip and list comprehension
>>> lst = [["abc",1,2,"def"],["ghi",4,5,"klm"],["nop",6,7,"qrs"]]
>>> [sum(i) for i in list(zip(*lst))[1:3]]
[11, 14]
Or use zip and islice class from itertools
>>> from itertools import islice
>>> [sum(i) for i in islice(zip(*lst), 1, 3)]
[11, 14]
Here's a one liner that doesn't use a generator expression. Use zip plus unpacking to transpose the list, then run sum on all the numeric columns using map.
>>> map(sum, zip(*my_list)[1:-1])
[11, 14]
Unfortunately it's a little wordier in 3.X since you can't slice a zip object.
a,b = map(sum, list(zip(*my_list))[1:-1])
You can use reduce, but since the elements in your sequence are lists, you'll need to set an initial value of 0. For example:
reduce(lambda total, row: total+row[1], my_list, 0) # integers at index 1
reduce(lambda total, row: total+row[2], my_list, 0) # integers at index 2

Comparing two lists and only printing the differences? (XORing two lists)

I'm trying to create a function that takes in 2 lists and returns the list that only has the differences of the two lists.
Example:
a = [1,2,5,7,9]
b = [1,2,4,8,9]
The result should print [4,5,7,8]
The function so far:
def xor(list1, list2):
    list3=list1+list2
    for i in range(0, len(list3)):
        x=list3[i]
        y=i
        while y>0 and x<list3[y-1]:
            list3[y]=list3[y-1]
            y=y-1
        list3[y]=x
    last=list3[-1]
    for i in range(len(list3) -2, -1, -1):
        if last==list3[i]:
            del list3[i]
        else:
            last=list3[i]
    return list3
print xor([1,2,5,7,8],[1,2,4,8,9])
The first for loop sorts the list, and the second one removes the duplicates. The problem is that the result is [1,2,4,5,7,8,9], not [4,5,7,8], so it keeps one copy of each duplicated value instead of removing it completely. What can I add to fix this?
I can't use any special modules, .sort, set or anything like that; basically just loops.
You basically want to add an element to your new list if it is present in one list and not in the other. Here is a compact comprehension that does it: for each element of the two lists (concatenated with list1+list2), we keep the element if it is missing from one of them:
[a for a in list1+list2 if (a not in list1) or (a not in list2)]
You can easily turn it into less Pythonic code that loops over the elements explicitly, as you do now, although I honestly don't see the point:
def xor(list1, list2):
    outputlist = []
    list3 = list1 + list2
    for i in range(0, len(list3)):
        if ((list3[i] not in list1) or (list3[i] not in list2)) and (list3[i] not in outputlist):
            outputlist[len(outputlist):] = [list3[i]]
    return outputlist
Using set is better:
>>> a = [1,2,5,7,9]
>>> b = [1,2,4,8,9]
>>> set(a).symmetric_difference(b)
{4, 5, 7, 8}
Thanks to @DSM, a better expression is:
>>> set(a)^set(b)
These two expressions are equivalent, but the latter is clearer.
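A set has no defined order, so if the result must be the list [4, 5, 7, 8] from the question, it can be sorted afterwards. A minimal sketch:

```python
a = [1, 2, 5, 7, 9]
b = [1, 2, 4, 8, 9]

# ^ on two sets is the symmetric difference; sorted() returns a list
result = sorted(set(a) ^ set(b))
print(result)  # [4, 5, 7, 8]
```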
Update: sorry, I did not see the last requirement that set cannot be used. As far as I can see, the solution provided by @sashkello is the best.
Note: This is really unpythonic and should only be used as a homework answer :)
After you have sorted both lists, you can find the differences as follows:
1) Place iterators at the start of A and B
2) If Aitr's value is greater than Bitr's, place Bitr's value in the return list and advance Bitr
3) Else if Bitr's value is greater than Aitr's, place Aitr's value in the return list and advance Aitr
4) Else you have found a duplicate; advance both Aitr and Bitr
5) When one list is exhausted, place whatever remains of the other in the return list
This code assumes the lists are already sorted. It runs in linear time, rather than quadratic time like many of the other solutions given.
def diff(sl0, sl1):
    i0, i1 = 0, 0
    while i0 < len(sl0) and i1 < len(sl1):
        if sl0[i0] == sl1[i1]:
            i0 += 1
            i1 += 1
        elif sl0[i0] < sl1[i1]:
            yield sl0[i0]
            i0 += 1
        else:
            yield sl1[i1]
            i1 += 1
    for i in xrange(i0, len(sl0)):
        yield sl0[i]
    for i in xrange(i1, len(sl1)):
        yield sl1[i]
print list(diff([1,2,5,7,9], [1,2,4,8,9]))
Try this,
a = [1,2,5,7,9]
b = [1,2,4,8,9]
print set(a).symmetric_difference(set(b))
Simple, but not particularly efficient :)
>>> a = [1,2,5,7,9]
>>> b = [1,2,4,8,9]
>>> [i for i in a+b if (a+b).count(i)==1]
[5, 7, 4, 8]
Or with "just loops"
>>> res = []
>>> for i in a+b:
...     c = 0
...     for j in a+b:
...         if i==j:
...             c += 1
...     if c == 1:
...         res.append(i)
...
>>> res
[5, 7, 4, 8]
