Let's say I have a list of lists like this one:
l = [[3,4], [2,3], [1,2], [1,3], [1,2]]
How can this list be sorted by the first element of each sub-list, and how can I then get a list of the first equal sorted elements,
e.g. [[1,2],[1,2]]?
Here is my suggestion, not optimized at all but easily readable and understandable, without any additional package to import:
l = [[3,4], [2,3], [1,2], [1,3], [1,2]]
# Sort by first then second element of sub-lists
l.sort(key=lambda x:(x[0],x[1]))
print(l)
# Get a list with the first equal sorted sub-lists
s = [l[0]]* l.count(l[0])
print(s)
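The same idea can be wrapped in a small function. This is only a sketch: the name first_equal_group is made up for illustration, and it swaps list.count for itertools.takewhile from the standard library, so scanning stops as soon as the leading run of equal sub-lists ends.

```python
from itertools import takewhile

def first_equal_group(rows):
    """Sort rows and return the leading run of sub-lists equal to the smallest one.

    `first_equal_group` is a hypothetical helper name, not part of the answer above.
    """
    # Lists compare element by element, so this sorts by first, then second item.
    ordered = sorted(rows)
    if not ordered:
        return []
    head = ordered[0]
    # After sorting, equal sub-lists are adjacent; collect them until one differs.
    return list(takewhile(lambda row: row == head, ordered))

l = [[3, 4], [2, 3], [1, 2], [1, 3], [1, 2]]
print(first_equal_group(l))
```

Unlike l.sort(), this leaves the original list untouched because it works on a sorted copy.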
First you can find all the doubles:
l = [x for n, x in enumerate(l) if x in l[:n]]
And then you can sort by the first element using:
l = sorted(l)
This is faster than doing it the other way around, especially if l is very long: first reduce it to the minimal list, then sort.
If an item appears three times, this returns a list containing that item twice. You can remove the extra copies with:
l = [x for n, x in enumerate(l) if x not in l[:n]]
Largely based on another answer from @georg.
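For long lists, the repeated `x in l[:n]` checks rescan the list for every element. A possible alternative, sketched here under the assumption that the sub-lists contain only hashable items, is to count them with collections.Counter. The function name duplicated_sublists is hypothetical.

```python
from collections import Counter

def duplicated_sublists(rows):
    """Return each sub-list that occurs more than once, listed once, in sorted order.

    Assumes the items inside each sub-list are hashable (ints, strings, ...).
    """
    # Lists are unhashable, so count tuple copies of them instead.
    counts = Counter(tuple(row) for row in rows)
    return sorted(list(key) for key, n in counts.items() if n > 1)

l = [[3, 4], [2, 3], [1, 2], [1, 3], [1, 2]]
print(duplicated_sublists(l))
```

Each duplicated sub-list is reported once regardless of how many times it occurs, so no second de-duplication pass is needed.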
I have the following list: a = [[1,2,3],[4,5,6],[7,8,9]] which contains 3 lists, each being a list of nodes of a graph.
I am also given a tuple of nodes z = ([1,2], [4,9]). Now, I would like to check whether either of the lists in z is contained in one of the lists in a. For example, [1,2] is in [1,2,3], which is in a, but [4,9] is not in [4,5,6], although there is an overlapping node.
Remark: To clarify, I am also checking for sub-list of a list, or whether every item in a list is in another list. For example, I consider [1,3] to be "in" [1,2,3].
How can I do this? I tried implementing something similar found at Python 3 How to check if a value is already in a list in a list, but I have reached a mental deadlock..
Some insight on this issue will be great!
You can use any and all:
a = [[1,2,3],[4,5,6],[7,8,9]]
z = ([1,2], [4,9])
results = [i for i in z if any(all(c in b for c in i) for b in a)]
Output:
[[1, 2]]
You can use sets to check whether the nodes appear in a; the <= operator on sets is equivalent to issubset().
The itertools module provides some useful functions; itertools.product() is equivalent to nested for loops.
E.g.:
In []:
import itertools as it
[m for m, n in it.product(z, a) if set(m) <= set(n)]
Out[]:
[[1, 2]]
a = [[1,2,3],[4,5,6],[7,8,9]]
z = ([1,2], [4,9])
for z_ in z:
    for a_ in a:
        if set(z_).issubset(a_):
            print(z_)
itertools.product is your friend (it is part of the standard library, so there is nothing to install):
from itertools import product
print([i for i in z if any(tuple(i) in product(l, repeat=len(i)) for l in a)])
Output:
[[1, 2]]
Since you're only looking to test the sub-lists as if they were subsets, you can convert the sub-lists to sets and then use set.issubset() for the test:
s = list(map(set, a))  # materialize the sets; in Python 3 a bare map() iterator is exhausted after one pass
print([l for l in z for i in s if set(l).issubset(i)])
This outputs:
[[1, 2]]
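The set-based answers above can be folded into one reusable function. This is a minimal sketch, assuming the nodes are hashable; contained_in_any is a made-up name. It builds each set once and uses any(), so a candidate is reported at most once even if it fits inside several lists of a.

```python
def contained_in_any(candidates, groups):
    """Return the candidates whose items all appear in at least one group.

    `contained_in_any` is a hypothetical helper; items are assumed hashable.
    """
    # Build the sets once up front so they can be reused for every candidate.
    group_sets = [set(g) for g in groups]
    # `<=` on sets is the subset test, the same as set.issubset().
    return [c for c in candidates if any(set(c) <= g for g in group_sets)]

a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
z = ([1, 2], [4, 9])
print(contained_in_any(z, a))
```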
I want to join the elements of two lists into one list and add some characters, like so:
list_1 = ['some1','some2','some3']
list_2 = ['thing1','thing2','thing3']
joined_list = ['some1_thing1', 'some2_thing2', 'some3_thing3']
However, I don't know in advance how many lists I will have to do this for, i.e. I want to do this for an arbitrary number of lists.
Also, I currently receive a list in the following form:
list_A = [('some1','thing1'),('some2','thing2'),('some3','thing3')]
so I split it up into lists like so:
list_B = [i for i in zip(*list_A)]
I do this because sometimes I have an int instead of a string
list_A = [('some1','thing1',32),('some1','thing1',42),('some2','thing3', 52)]
so I can do this after
list_C = [list(map(str, list_B[i])) for i in range(0, len(list_B))]
and basically list_1 and list_2 are the elements of list_C.
So is there a more efficient way to do all this ?
Try this if you are using python>=3.6:
[f'{i}_{j}' for i,j in zip(list_1, list_2)]
If you are using Python 3.5, you can do this:
['{}_{}'.format(i,j) for i,j in zip(list_1, list_2)]
You can also use this if you don't want to use a formatted string:
['_'.join([i,j]) for i,j in zip(list_1, list_2)]
You can apply join directly to list_A itself; there is no need to split it up to handle possible int values:
list_A = [('some1','thing1',32),('some1','thing1',42), ('some2','thing3', 52)]
["_".join(map(str, i)) for i in list_A]
Output:
['some1_thing1_32', 'some1_thing1_42', 'some2_thing3_52']
Update:
For your requirement, where you want to ignore the last element of the last tuple in list_A, you need to add an if-else condition inside the list comprehension, as below:
["_".join(map(str, i)) if list_A.index(i) != len(list_A)-1 else "_".join(map(str, i[:-1])) for i in list_A ]
Updated Output:
['some1_thing1_32', 'some1_thing1_42', 'some2_thing3']
For ignoring the last element of every tuple in list_A, I found this to be the quickest way:
["_".join(map(str, i)) for i in [x[:-1] for x in list_A] ]
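For the "arbitrary number of lists" part of the question, the zip-based answers generalize by accepting any number of lists and zipping them together. The sketch below is one way to do that; join_columns and list_3 are names invented for illustration, and str() is applied to every item so ints can be mixed in.

```python
def join_columns(*columns, sep="_"):
    """Join the i-th items of every input list into one string per position.

    `join_columns` is a hypothetical helper, not from the answers above.
    """
    # zip(*columns) groups items by position; str() copes with ints and strings alike.
    return [sep.join(map(str, row)) for row in zip(*columns)]

list_1 = ['some1', 'some2', 'some3']
list_2 = ['thing1', 'thing2', 'thing3']
list_3 = [32, 42, 52]  # made-up third list to show a mixed-type case

print(join_columns(list_1, list_2))
print(join_columns(list_1, list_2, list_3))
```

Note that zip stops at the shortest input, so lists of unequal length are silently truncated.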
This answer works very well for finding indices of items from a list in another list, but the problem with it is, it only gives them once. However, I would like my list of indices to have the same length as the searched for list.
Here is an example:
thelist = ['A','B','C','D','E'] # the list whose indices I want
Mylist = ['B','C','B','E'] # my list of values that I am searching in the other list
ilist = [i for i, x in enumerate(thelist) if any(thing in x for thing in Mylist)]
With this solution, ilist = [1,2,4] but what I want is ilist = [1,2,1,4] so that len(ilist) = len(Mylist). It leaves out the index that has already been found, but if my items repeat in the list, it will not give me the duplicates.
thelist = ['A','B','C','D','E']
Mylist = ['B','C','B','E']
ilist = [thelist.index(x) for x in Mylist]
print(ilist) # [1, 2, 1, 4]
Basically, "for each element of Mylist, get its position in thelist."
This assumes that every element in Mylist exists in thelist. If the element occurs in thelist more than once, it takes the first location.
UPDATE
For substrings:
thelist = ['A','boB','C','D','E']
Mylist = ['B','C','B','E']
ilist = [next(i for i, y in enumerate(thelist) if x in y) for x in Mylist]
print(ilist) # [1, 2, 1, 4]
UPDATE 2
Here's a version that does substrings in the other direction using the example in the comments below:
thelist = ['A','B','C','D','E']
Mylist = ['Boo','Cup','Bee','Eerr','Cool','Aah']
ilist = [next(i for i, y in enumerate(thelist) if y in x) for x in Mylist]
print(ilist) # [1, 2, 1, 4, 2, 0]
The code below would work:
ilist = [thelist.index(i) for i in Mylist]
Make a reverse lookup from strings to indices:
string_indices = {c: i for i, c in enumerate(thelist)}
ilist = [string_indices[c] for c in Mylist]
This avoids the quadratic behaviour of repeated .index() lookups.
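One detail worth knowing: when thelist contains repeated values, a dict comprehension keeps the last index for each value, whereas list.index returns the first. The sketch below keeps the first occurrence via dict.setdefault and also tolerates values that are missing. The name indices_in and the missing parameter are assumptions for illustration.

```python
def indices_in(reference, wanted, missing=None):
    """Map each item of `wanted` to its first index in `reference`.

    Items not found in `reference` are reported as `missing` instead of raising.
    `indices_in` is a hypothetical helper name.
    """
    first_index = {}
    for i, item in enumerate(reference):
        # setdefault only stores the index the first time an item is seen.
        first_index.setdefault(item, i)
    return [first_index.get(item, missing) for item in wanted]

thelist = ['A', 'B', 'C', 'D', 'E']
Mylist = ['B', 'C', 'B', 'E']
print(indices_in(thelist, Mylist))
```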
If your data can be implicitly converted to an ndarray, as your example implies, you could use numpy_indexed (disclaimer: I am its author) to perform this kind of operation in an efficient (fully vectorized and NlogN) manner.
import numpy_indexed as npi
ilist = npi.indices(thelist, Mylist)
npi.indices is essentially the array-generalization of list.index. Also, it has a kwarg to give you control over how to deal with missing values and such.
I am stuck on an awkward list comprehension problem, which I am not able to solve. So, I have two lists looking like the following:
a=[[....],[....],[....]]
b=[[....],[....],[....]]
len(a)==len(b) including sublists i.e sublists also have the same dimension.
Now I want to do a re.compile which looks something like:
[re.compile(_subelement_in_a).search(_subelement_in_b).group(1)]
and I am wondering how I can achieve the above using a list comprehension - something like:
[[re.compile(str(x)).search(str(y)).group(1) for x in a] for y in b]
..but obviously the above does not seem to work and I was wondering if anyone could point me in the right direction.
EDIT
I have just realized that the sublists of b have more elements than the sublists of a. So, for example:
a=[[1 items],[1 items],[1 items]]
b=[[10 item], [10 item], [10 item]]
I would still like to do the same as my above question:
[[re.compile(str(x)).search(str(y)).group(1) for x in b] for y in a]
and the output looking like:
c = [[b[0] in a[0] items],[b[1] in a[1] items],[b[2] in a[2] items]]
Example:
a=[["hgjkhukhkh"],["78hkugkgkug"],["ukkhukhylh"]]
b=[[r"""a(.*?)b""",r"""c(.*?)d""",r"""e(.*?)f""",r"""g(.*?)h""",r"""i(.*?)j"""],[r"""k(.*?)l""",r"""m(.*?)n""",r"""o(.*?)p""",r"""q(.*?)r"""],[r"""s(.*?)t""",r"""u(.*?)v""",r"""x(.*?)y""",r"""z(.*?)>"""]]
using one to one mapping. i.e check if:
elements of sublists of b[0] are present in sublist element of a[0]
elements of sublists of b[1] are present in sublist element of a[1]
elements of sublists of b[2] are present in sublist element of a[2]
Sounds like you are looking for zip? It takes a pair of lists and turns it into a list of pairs.
[
[my_operation(x,y) for x,y in zip(xs, ys)]
for xs, ys in zip(a, b)
]
-- Edit. Requirements changed:
[
[[regex(p, s) for p in patterns] for s in strings]
for strings, patterns in zip(a, b)
]
Use zip liberally:
[[re.search(x, y).group(1) for x,y in zip(s,t)] for s,t in zip(a,b)]
The first zip(a,b) produces lists of sublist pairs. The second zip pairs the elements in parallel sublists together.
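Applied to the example data from the question, the zip approach might look like the sketch below. The helper name first_groups is made up. Because re.search returns None when a pattern does not match, calling .group(1) directly would raise AttributeError, so this version records None for those cases instead.

```python
import re

def first_groups(strings, patterns):
    """For each string, try every pattern and collect group(1), or None on no match.

    `first_groups` is a hypothetical helper name.
    """
    results = []
    for s in strings:
        row = []
        for p in patterns:
            m = re.search(p, s)
            # re.search returns None when nothing matches, so guard before .group(1).
            row.append(m.group(1) if m else None)
        results.append(row)
    return results

a = [["hgjkhukhkh"], ["78hkugkgkug"], ["ukkhukhylh"]]
b = [[r"a(.*?)b", r"c(.*?)d", r"e(.*?)f", r"g(.*?)h", r"i(.*?)j"],
     [r"k(.*?)l", r"m(.*?)n", r"o(.*?)p", r"q(.*?)r"],
     [r"s(.*?)t", r"u(.*?)v", r"x(.*?)y", r"z(.*?)>"]]

# One-to-one mapping: the patterns in b[i] are only tried against the strings in a[i].
c = [first_groups(strings, patterns) for strings, patterns in zip(a, b)]
print(c)
```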
I have the following input list
A = [['A',[1,2,3]],['D',[3,4]],['E',[6,7]],['F',[1]]]
I want to keep only the entries whose inner list has length 2.
In the above example, I want to remove ['A',[1,2,3]], ['F',[1]], etc.
At the moment I create a new list and append all entries whose inner list has length == 2.
If I could remove the unwanted sublists directly from A, that would be ideal.
Shouldn't it be like this?
A = [x for x in A if len(x[1]) == 2]
Or even
A = [[a, b] for a, b in A if len(b) == 2]
You might want to use filter.
filter(lambda x: len(x[1]) == 2, A)
This assumes each element of A is a list with 2 elements, the second of which is itself a list. The filter keeps the elements whose inner list has exactly 2 elements.
More on filter:
filter(...)
filter(function or None, sequence) -> list, tuple, or string
Return those items of sequence for which function(item) is true.
The same can be achieved via a list comprehension:
[x for x in A if len(x[1]) == 2]
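The filter docstring quoted above is the Python 2 one. In Python 3, filter returns a lazy iterator rather than a list, so it needs to be wrapped in list() to get the same result as the comprehension. A short sketch (the variable name kept is arbitrary):

```python
A = [['A', [1, 2, 3]], ['D', [3, 4]], ['E', [6, 7]], ['F', [1]]]

# In Python 3, filter() is lazy; list() consumes it and builds the filtered list.
kept = list(filter(lambda x: len(x[1]) == 2, A))
print(kept)
```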
It's better and easier to create a new filtered list. A[:] = ... is a slice assignment which means that the content of the new list is copied back into A so that other references to the same list will see the update. If you don't need to keep the identity of A, you can just use A = ...
>>> A = [['A',[1,2,3]],['D',[3,4]],['E',[6,7]],['F',[1]]]
>>> A[:] = [x for x in A if len(x[1]) == 2]
>>> A
[['D', [3, 4]], ['E', [6, 7]]]
Removing in place is usually inefficient because you need to move the remaining items down the list each time you remove one. You also need to take care to not skip over elements when you remove from the list you are iterating over.
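If you really do want to delete from the list itself, one common way to avoid skipping elements is to walk the indices from the end. This is only a sketch, with a hypothetical helper name remove_in_place; the slice assignment above is usually the simpler choice.

```python
def remove_in_place(rows, keep):
    """Delete, in place, every row for which keep(row) is false.

    `remove_in_place` is a hypothetical helper name.
    """
    # Walk from the last index down so a deletion never shifts rows still to be visited.
    for i in range(len(rows) - 1, -1, -1):
        if not keep(rows[i]):
            del rows[i]

A = [['A', [1, 2, 3]], ['D', [3, 4]], ['E', [6, 7]], ['F', [1]]]
alias = A  # a second reference to the same list object
remove_in_place(A, lambda x: len(x[1]) == 2)
print(A)
print(alias is A)
```

Each del still shifts the items after it, so this remains slower than building a new list when many rows are removed.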
A = [x for x in A if len(x[1]) == 2]
fyi, your E and F lists don't have closing brackets, but I'm assuming that's just a copy/paste error or similar.