Speed up a function in Python 2.7 - python

I want to know, if there is a way to speed up the function shown here. I know this looks not very pythonic...
def MakePairs(inputlist):
'''
#param inputlist: [[["a","b","c"],["d","e","f"]],[["g","h","i"],["j","k","l"]],...]
#return returnlist: [[["a","d"],["b","e],["c","f"]],[["g","j"],["h","k"],["i","l"]],...]
'''
returnlist = []
for Pair in xrange(len(inputlist)):
dummy2 = []
for item in xrange(len(inputlist[Pair][0])):
dummy = [Pair[0][item], Pair[1][item]]
dummy2.append(dummy)
returnlist.append(dummy2)
return returnlist
Edit: The pairs in the returnlist have to be lists.
Thanks in advance!!!

Looks like a job for zip():
>>> l = [[["a","b","c"],["d","e","f"]],[["g","h","i"],["j","k","l"]]]
>>> [zip(*item) for item in l]
[[('a', 'd'), ('b', 'e'), ('c', 'f')], [('g', 'j'), ('h', 'k'), ('i', 'l')]]
So, your function will be:
def MakePairs(inputlist):
return [zip(*item) for item in inputlist]
Also, consider using itertools.izip() instead of zip().

Related

Ngram in python with start_pad

i'm know in python i'm take some basic thing about list and tuple but my not full understand the my cod i want create list have three index in each index have tuple with tow index like this [('~','a'),('a','b'),('b','c')] the first index in tuple have tow char or the length context when have like this [('~a','a'),('ab','b'),('bc',' c')] can any one help my ? Her my code
def getNGrams(wordlist, n):
ngrams = []
padded_tokens = "~"*(n) + wordlist
t = tuple(wordlist)
for i in range(3):
t = tuple(padded_tokens[i:i+n])
ngrams.append(t)
return ngrams
IIUC, You can change the function like below and get what you want:
def getNGrams(wordlist, n):
ngrams = []
padded_tokens = "~"*n + wordlist
for idx, i in enumerate(range(len(wordlist))):
t = tuple((padded_tokens[i:i+n], wordlist[idx]))
ngrams.append(t)
return ngrams
print(getNGrams('abc',1))
print(getNGrams('abc',2))
print(getNGrams('abc',3))
Output:
[('~', 'a'), ('a', 'b'), ('b', 'c')]
[('~~', 'a'), ('~a', 'b'), ('ab', 'c')]
[('~~~', 'a'), ('~~a', 'b'), ('~ab', 'c')]

Why tuples in a set won't convert to any other type in a loop?

I'm trying to remove a certain item from a set of tuples. to do so I must convert the tuples to a list or a set (i.e. a mutable object). I'm trying to do in a for loop but the tuples won't convert and my item is yet to be removed.
a = [('A', 'C'), ('B', 'C'), ('B', 'C')]
for i in a:
i = list(i)
if 'C' in i:
i.remove('C')
print(a)
This is the output:
[('A', 'C'), ('B', 'C'), ('B', 'C')]
You got the right intuition. As your tuples are immutable, you need to create new ones.
However, in your code, you create lists, modify them, but fail to save them back in the original list.
You could use a list comprehension.
[tuple(e for e in t if e != 'C') for t in a]
Output:
[('A',), ('B',), ('B',)]
You are modifying the list but are not creating a new list.
Try this:
a = [('A', 'C'), ('B', 'C'), ('B', 'C')]
b = []
for i in a:
i = list(i)
if 'C' in i:
i.remove('C')
b.append(i)
print(b)

how to pair each 2 elements of a list?

I have a list like this
attach=['a','b','c','d','e','f','g','k']
I wanna pair each two elements that followed by each other:
lis2 = [('a', 'b'), ('c', 'd'), ('e', 'f'), ('g', 'k')]
I did the following:
Category=[]
for i in range(len(attach)):
if i+1< len(attach):
Category.append(f'{attach[i]},{attach[i+1]}')
but then I have to remove half of rows because it also give 'b' ,'c' and so on. I thought maybe there is a better way
You can use zip() to achieve this as:
my_list = ['a','b','c','d','e','f','g','k']
new_list = list(zip(my_list[::2], my_list[1::2]))
where new_list will hold:
[('a', 'b'), ('c', 'd'), ('e', 'f'), ('g', 'k')]
This will work to get only the pairs, i.e. if number of the elements in the list are odd, you'll loose the last element which is not as part of any pair.
If you want to preserve the last odd element from list as single element tuple in the final list, then you can use itertools.zip_longest() (in Python 3.x, or itertools.izip_longest() in Python 2.x) with list comprehension as:
from itertools import zip_longest # In Python 3.x
# from itertools import izip_longest ## In Python 2.x
my_list = ['a','b','c','d','e','f','g','h', 'k']
new_list = [(i, j) if j is not None else (i,) for i, j in zip_longest(my_list[::2], my_list[1::2])]
where new_list will hold:
[('a', 'b'), ('c', 'd'), ('e', 'f'), ('g', 'h'), ('k',)]
# last odd number as single element in the tuple ^
You have to increment iterator i.e by i by 2 when moving forward
Category=[]
for i in range(0, len(attach), 2):
Category.append(f'{attach[i]},{attach[i+1]}')
Also, you don't need the if condition, if the len(list) is always even
lis2 = [(lis[i],lis[i+1]) for i in range(0,len(lis),2)]
lis2
You can use list comprehension

Select first item in each list

Here is my list:
[(('A', 'B'), ('C', 'D')), (('E', 'F'), ('G', 'H'))]
Basically, I'd like to get:
[('A', 'C'), ('E', 'G')]
So, I'd like to select first elements from the lowest-level lists and build mid-level lists with them.
====================================================
Additional explanation below:
I could just zip them by
list(zip([w[0][0] for w in list1], [w[1][0] for w in list1]))
But later I'd like to add a condition: the second elements in the lowest level lists must be 'B' and 'D' respectively, so the final outcome should be:
[('A', 'C')] # ('E', 'G') must be sorted out
I'm a beginner, but can't find the case anywhere... Would be grateful for help.
I'd do it the following way
list = [(('A', 'B'), ('C', 'D')), (('E', 'F'), ('G', 'H'))]
out = []
for i in list:
listAux = []
for j in i:
listAux.append(j[0])
out.append((listAux[0],listAux[1]))
print(out)
I hope that's what you're looking for.

How to efficiently group pairs based on shared item?

I have a list of pairs (tuples), for simplification something like this:
L = [("A","B"), ("B","C"), ("C","D"), ("E","F"), ("G","H"), ("H","I"), ("G","I"), ("G","J")]
Using python I want efficiently split this list to:
L1 = [("A","B"), ("B","C"), ("C","D")]
L2 = [("E","F")]
L3 = [("G","H"), ("G","I"), ("G","J"), ("H","I")]
How to efficiently split list into groups of pairs, where for pairs in the group there must be always at least one pair which shares one item with others? As stated in one of the answers this is actually network problem. The goal is to efficiently split network into disconnected (isolated) network parts.
Type lists, tuples (sets) may be changed for achieving higher efficiency.
This is more like a network problem, so we can use networkx:
import networkx as nx
G=nx.from_edgelist(L)
l=list(nx.connected_components(G))
# after that we create the map dict , for get the unique id for each nodes
mapdict={z:x for x, y in enumerate(l) for z in y }
# then append the id back to original data for groupby
newlist=[ x+(mapdict[x[0]],)for x in L]
import itertools
#using groupby make the same id into one sublist
newlist=sorted(newlist,key=lambda x : x[2])
yourlist=[list(y) for x , y in itertools.groupby(newlist,key=lambda x : x[2])]
yourlist
[[('A', 'B', 0), ('B', 'C', 0), ('C', 'D', 0)], [('E', 'F', 1)], [('G', 'H', 2), ('H', 'I', 2), ('G', 'I', 2), ('G', 'J', 2)]]
Then to match your output format:
L1,L2,L3=[[y[:2]for y in x] for x in yourlist]
L1
[('A', 'B'), ('B', 'C'), ('C', 'D')]
L2
[('E', 'F')]
L3
[('G', 'H'), ('H', 'I'), ('G', 'I'), ('G', 'J')]
Initialise a list of groups as empty
Let (a, b) be the next pair
Collect all groups that contain any elements with a or b
Remove them all, join them, add (a, b), and insert as a new group
Repeat till done
That'd be something like this:
import itertools, functools
def partition(pred, iterable):
t1, t2 = itertools.tee(iterable)
return itertools.filterfalse(pred, t1), filter(pred, t2)
groups = []
for a, b in L:
unrelated, related = partition(lambda group: any(aa == a or bb == b or aa == b or bb == a for aa, bb in group), groups)
groups = [*unrelated, sum(related, [(a, b)])]
An efficient and Pythonic approach is to convert the list of tuples to a set of frozensets as a pool of candidates, and in a while loop, create a set as group and use a nested while loop to keep expanding the group by adding the first candidate set and then performing set union with other candidate sets that intersects with the group until there is no more intersecting candidate, at which point go back to the outer loop to form a new group:
pool = set(map(frozenset, L))
groups = []
while pool:
group = set()
groups.append([])
while True:
for candidate in pool:
if not group or group & candidate:
group |= candidate
groups[-1].append(tuple(candidate))
pool.remove(candidate)
break
else:
break
Given your sample input, groups will become:
[[('A', 'B'), ('C', 'B'), ('C', 'D')],
[('G', 'H'), ('H', 'I'), ('G', 'J'), ('G', 'I')],
[('E', 'F')]]
Keep in mind that sets are unordered in Python, which is why the order of the above output doesn't match your expected output, but for your purpose the order should not matter.
You can use the following code:
l = [("A","B"), ("B","C"), ("C","D"), ("E","F"), ("G","H"), ("H","I"), ("G","I"), ("G","J")]
result = []
if len(l) > 1:
tmp = [l[0]]
for i in range(1,len(l)):
if l[i][0] == l[i-1][1] or l[i][1] == l[i-1][0] or l[i][1] == l[i-1][1] or l[i][0] == l[i-1][0]:
tmp.append(l[i])
else:
result.append(tmp)
tmp = [l[i]]
result.append(tmp)
else:
result = l
for elem in result:
print(elem)
output:
[('A', 'B'), ('B', 'C'), ('C', 'D')]
[('E', 'F')]
[('G', 'H'), ('H', 'I'), ('G', 'I'), ('G', 'J')]
Note: this code is based on the hypothesis that your initial array is sorted. If this is not the case it will not work as it does only one pass on the whole list to create the groups (complexity O(n)).
Explanations:
result will store your groups
if len(l) > 1: if you have only one element in your list or an empty list no need to do any processing you have the answer
You will to a one pass on each element of the list and compare the 4 possible equality between the tuple at position i and the one at position i-1.
tmp is used to construct your groups, as long as the condition is met you add tuples to tmp
when the condition is not respected you add tmp (the current group that has been created to the result, reinitiate tmp with the current tuple) and you continue.
You can use a while loop and start iteration from first member of L(using a for loop inside). Check for the whole list if any member(either of the two) is shared or not. Then append it to a list L1 and pop that member from original list L. Then while loop would run again (till list L is nonempty). And for loop inside would run for each element in list to append to a new list L2. You can try this. (I will provide code it doesn't help)

Categories