Large amount of lists concatenation [duplicate] - python

This question already has answers here:
Merge lists that share common elements
(15 answers)
Closed 4 years ago.
I'm trying to write a function that concatenates multiple lists whenever two or more of them share an element.
Example:
[[1,2],[3,4,5],[0,4]] would become [[1,2],[0,3,4,5]]
[[1],[1,2],[0,2]] would become [[0,1,2]]
[[1, 2], [2, 3], [3, 4]] would become [[1,2,3,4]]
In other words, lists that have a common element are merged and the shared element is kept only once. The final lists must contain unique elements.
I tried the following function. It works, but with a big input (around 100 or 200 lists), I get the following recursion error:
RecursionError: maximum recursion depth exceeded while getting the repr of an object
def concat(L):
    break_cond = False
    print(L)
    for L1 in L:
        for L2 in L:
            if (bool(set(L1) & set(L2)) and L1 != L2):
                break_cond = True
    if (break_cond):
        i, j = 0, 0
        while i < len(L):
            while j < len(L):
                if (bool(set(L[i]) & set(L[j])) and i != j):
                    L[i] = sorted(L[i] + list(set(L[j]) - set(L[i])))
                    L.pop(j)
                j += 1
            i += 1
        return concat(L)
Moreover, I would like to do it using only basic Python, without relying on many libraries. Any ideas? Thanks.
Example of a list where I get the error:
[[0, 64], [1, 120, 172], [2, 130], [3, 81, 102], [5, 126], [6, 176], [7, 21, 94], [8, 111, 167], [9, 53, 60, 138], [10, 102, 179], [11, 45, 72], [12, 53, 129], [14, 35, 40, 58, 188], [15, 86], [18, 70, 94], [19, 28], [20, 152], [21, 24], [22, 143, 154], [23, 110, 171], [24, 102, 144], [25, 73, 106, 187], [26, 189], [28, 114, 137], [29, 148], [30, 39], [31, 159], [33, 44, 132, 139], [34, 81, 100, 136, 185], [35, 53], [37, 61, 138], [38, 144, 147, 165], [41, 42, 174], [42, 74, 107, 162], [43, 99, 123], [44, 71, 122, 126], [45, 74, 144], [47, 94, 151], [48, 114, 133], [49, 130, 144], [50, 51], [51, 187], [52, 124, 142, 146, 167, 184], [54, 97], [55, 94], [56, 88, 128, 166], [57, 63, 80], [59, 89], [60, 106, 134, 142], [61, 128, 145], [62, 70], [63, 73, 76, 101, 106], [64, 80, 176], [65, 187, 198], [66, 111, 131, 150], [67, 97, 128, 159], [68, 85, 128], [69, 85, 169], [70, 182], [71, 123], [72, 85, 94], [73, 112, 161], [74, 93, 124, 151, 191], [75, 163], [76, 99, 106, 129, 138, 152, 179], [77, 89, 92], [78, 146, 156], [79, 182], [82, 87, 130, 179], [83, 148], [84, 110, 146], [85, 98, 137, 177], [86, 198], [87, 101], [88, 134, 149], [89, 99, 107, 130, 193], [93, 147], [95, 193], [96, 98, 109], [104, 105], [106, 115, 154, 167, 190], [107, 185, 193], [111, 144, 153], [112, 128, 188], [114, 136], [115, 146], [118, 195], [119, 152], [121, 182], [124, 129, 177], [125, 156], [126, 194], [127, 198], [128, 149], [129, 153], [130, 164, 196], [132, 140], [133, 181], [135, 165, 170, 171], [136, 145], [141, 162], [142, 170, 187], [147, 171], [148, 173], [150, 180], [153, 191], [154, 196], [156, 165], [157, 177], [158, 159], [159, 172], [161, 166], [162, 192], [164, 184, 197], [172, 199], [186, 197], [187, 192]]

As mentioned by @ScottBoston, this is a graph problem known as connected components. I suggest you use networkx as he indicates; in case you cannot, here is a version without networkx:
from itertools import combinations


def bfs(graph, start):
    visited, queue = set(), [start]
    while queue:
        vertex = queue.pop(0)
        if vertex not in visited:
            visited.add(vertex)
            queue.extend(graph[vertex] - visited)
    return visited


def connected_components(G):
    seen = set()
    for v in G:
        if v not in seen:
            c = set(bfs(G, v))
            yield c
            seen.update(c)


def graph(edge_list):
    result = {}
    for source, target in edge_list:
        result.setdefault(source, set()).add(target)
        result.setdefault(target, set()).add(source)
    return result


def concat(l):
    edges = []
    s = list(map(set, l))
    for i, j in combinations(range(len(s)), r=2):
        if s[i].intersection(s[j]):
            edges.append((i, j))
    G = graph(edges)
    result = []
    unassigned = list(range(len(s)))
    for component in connected_components(G):
        union = set().union(*(s[i] for i in component))
        result.append(sorted(union))
        unassigned = [i for i in unassigned if i not in component]
    result.extend(map(sorted, (s[i] for i in unassigned)))
    return result


print(concat([[1, 2], [3, 4, 5], [0, 4]]))
print(concat([[1], [1, 2], [0, 2]]))
print(concat([[1, 2], [2, 3], [3, 4]]))
Output
[[0, 3, 4, 5], [1, 2]]
[[0, 1, 2]]
[[1, 2, 3, 4]]

You can use the networkx library, because this is a graph theory problem: finding connected components.
import networkx as nx
l = [[1,2],[3,4,5],[0,4]]
#l = [[1],[1,2],[0,2]]
#l = [[1, 2], [2, 3], [3, 4]]
G = nx.Graph()
#Add nodes to Graph
G.add_nodes_from(sum(l, []))
#Create edges from list of nodes
q = [[(s[i],s[i+1]) for i in range(len(s)-1)] for s in l]
for i in q:
    #Add edges to Graph
    G.add_edges_from(i)
#Find all connected components in graph and list nodes for each component
[list(i) for i in nx.connected_components(G)]
Output:
[[1, 2], [0, 3, 4, 5]]
Output if you comment out the first l assignment and uncomment the second:
[[0, 1, 2]]
Likewise for the third:
[[1, 2, 3, 4]]

Here's an iterative approach that should be about as efficient as you can get in pure Python. One drawback is that it needs an extra pass at the end to remove duplicates.
original_list = [[1,2],[3,4,5],[0,4]]
mapping = {}
rev_mapping = {}
for i, candidate in enumerate(original_list):
    sentinel = -1
    for item in candidate:
        if mapping.get(item, -1) != -1:
            merge_pos = mapping[item]
            #update previous list with all new candidates
            for item in candidate:
                mapping[item] = merge_pos
            rev_mapping[merge_pos].extend(candidate)
            break
    else:
        for item in candidate:
            mapping[item] = i
        rev_mapping.setdefault(i, []).extend(candidate)
result = [list(set(item)) for item in rev_mapping.values()]
print(result)
Output:
[[1, 2], [0, 3, 4, 5]]
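A one-pass mapping like this can miss a merge when a later list bridges two groups that already exist, for example [[1, 2], [3, 4], [2, 3]]. One way to cover that case in pure Python is a union-find (disjoint-set) structure. The following is a minimal sketch, not the answerer's method; the names merge_lists, find and parent are my own assumptions.

```python
def merge_lists(lists):
    """Merge lists that share at least one element (hypothetical helper)."""
    parent = {}  # maps each element to its parent in the disjoint-set forest

    def find(x):
        # Follow parent links to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for lst in lists:
        for item in lst:
            parent.setdefault(item, item)
        # Union every element of the list with the list's first element.
        for item in lst[1:]:
            root_a, root_b = find(lst[0]), find(item)
            if root_a != root_b:
                parent[root_b] = root_a

    # Collect the elements of each group under their root.
    groups = {}
    for item in parent:
        groups.setdefault(find(item), set()).add(item)
    return sorted(sorted(g) for g in groups.values())


print(merge_lists([[1, 2], [3, 4, 5], [0, 4]]))  # [[0, 3, 4, 5], [1, 2]]
print(merge_lists([[1, 2], [3, 4], [2, 3]]))     # [[1, 2, 3, 4]]
```

Because it is iterative, this sketch does not depend on the recursion limit, and each element is touched a small number of times rather than once per pair of lists.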

You can use a recursive version of the breadth-first search with no imports:
def group_vals(d, current, _groups, _seen, _master_seen):
    if not any(set(current)&set(i) for i in d if i not in _seen):
        yield list({i for b in _groups for i in b})
        for i in d:
            if i not in _master_seen:
                yield from group_vals(d, i, [i], [i], _master_seen+[i])
    else:
        for i in d:
            if i not in _seen and set(current)&set(i):
                yield from group_vals(d, i, _groups+[i], _seen+[i], _master_seen+[i])

def join_data(_data):
    _final_result = list(group_vals(_data, _data[0], [_data[0]], [_data[0]], []))
    return [a for i, a in enumerate(_final_result) if a not in _final_result[:i]]
c = [[[1,2],[3,4,5],[0,4]], [[1],[1,2],[0,2]], [[1, 2], [2, 3], [3, 4]]]
print(list(map(join_data, c)))
Output:
[
[[1, 2], [0, 3, 4, 5]],
[[0, 1, 2]],
[[1, 2, 3, 4]]
]

If you want it in a simple form, here is a solution:
def concate(l):
    len_l = len(l)
    i = 0
    while i < (len_l - 1):
        for j in range(i + 1, len_l):
            # i,j iterate over all pairs of l's elements including new
            # elements from merged pairs. We use len_l because len(l)
            # may change as we iterate
            i_set = set(l[i])
            j_set = set(l[j])
            if len(i_set.intersection(j_set)) > 0:
                # Remove these two from list
                l.pop(j)
                l.pop(i)
                # Merge them and append to the orig. list
                ij_union = list(i_set.union(j_set))
                l.append(ij_union)
                # len(l) has changed
                len_l -= 1
                # adjust 'i' because elements shifted
                i -= 1
                # abort inner loop, continue with next l[i]
                break
        i += 1
    return l

If you want to see how the algorithm works, you can use this script which uses the connectivity matrix:
import numpy

def Concatenate(L):
    result = []
    Ls_length = len(L)
    conn_mat = numpy.zeros( [Ls_length, Ls_length] ) # you can use a list of lists instead of a numpy array
    check_vector = numpy.zeros( Ls_length ) # you can use a list instead of a numpy array
    idx1 = 0
    while idx1 < Ls_length:
        idx2 = idx1 + 1
        conn_mat[idx1,idx1] = 1 # the diagonal is always 1 since every set intersects itself.
        while idx2 < Ls_length:
            if bool(set(L[idx1]) & set(L[idx2]) ): # 1 if the sets idx1 idx2 intersect, and 0 if they don't.
                conn_mat[idx1,idx2] = 1 # this is clearly a symmetric matrix.
                conn_mat[idx2,idx1] = 1
            idx2 += 1
        idx1 += 1
    print (conn_mat)
    idx = 0
    while idx < Ls_length:
        if check_vector[idx] == 1: # check if we already concatenated the idx element of L.
            idx += 1
            continue
        connected = GetAllPositiveIntersections(idx, conn_mat, Ls_length)
        r = set()
        for idx_ in connected:
            r = r.union(set(L[idx_]))
            check_vector[idx_] = 1
        result.append(list(r))
    return result

def GetAllPositiveIntersections(idx, conn_mat, Ls_length):
    # the elements that intersect idx are coded with 1s in idx's row (or column, since it's a symmetric matrix) of conn_mat.
    connected = [idx]
    i = 0
    idx_ = idx
    while i < len(connected):
        j = 0
        while j < Ls_length:
            if bool(conn_mat[idx_][j]):
                if j not in connected: connected.append(j)
            j += 1
        i += 1
        if i < len(connected): idx_ = connected[i]
    return list(set(connected))
Then you just call it:
L = [[1,2],[3,4,5],[0,4]]
r = Concatenate(L)
print(r)

Related

How to interchange an element within multidimensional list in python?

I'm unable to change the elements of a multidimensional list. The code works on a simple list, but not on lists within a list. I assume there is a correct way of modifying a multidimensional list?
The code I'm stuck with:
curpos = [[88, 118, 1], [200, 118, 0], [312, 118, 2],
[88, 230, 3], [200, 230, 4], [312, 230, 5],
[88, 342, 6], [200, 342, 7], [312, 342, 8]]
movls = ['right', 'down', 'left']
def move_tile(current_loc, movlist):
    result = []
    temp = current_loc.copy()
    for i in temp:
        if i[2] == 0:
            empty = temp.index(i)
    for j in movlist:
        if j == 'right':
            temp[empty][2] = temp[empty+1][2] # the problem is here, it changes the origin current_loc instead of the copy temp
            temp[empty+1][2] = 0
        elif j == 'down':
            temp[empty][2] = temp[empty+3][2]
            temp[empty+3][2] = 0
        elif j == 'left':
            temp[empty][2] = temp[empty-1][2]
            temp[empty-1][2] = 0
        elif j == 'up':
            temp[empty][2] = temp[empty-3][2]
            temp[empty-3][2] = 0
        print(temp) # this outputs [[88, 118, 1], [200, 118, 2], [312, 118, 0], [88, 230, 3], [200, 230, 4], [312, 230, 5], [88, 342, 6], [200, 342, 7], [312, 342, 8]]
        result.append(temp) # but this appends [88, 118, 0], [200, 118, 1], [312, 118, 0], [88, 230, 3], [200, 230, 0], [312, 230, 5], [88, 342, 6], [200, 342, 7], [312, 342, 8]
        temp = current_loc.copy() # does not reset the temporary var
    return result
print(move_tile(curpos, movls))
The correct output (without the print(temp)) should be something like this:
[[[88, 118, 1], [200, 118, 2], [312, 118, 0],
[88, 230, 3], [200, 230, 4], [312, 230, 5],
[88, 342, 6], [200, 342, 7], [312, 342, 8]],
[[88, 118, 1], [200, 118, 4], [312, 118, 2],
[88, 230, 3], [200, 230, 0], [312, 230, 5],
[88, 342, 6], [200, 342, 7], [312, 342, 8]],
[[88, 118, 0], [200, 118, 1], [312, 118, 2],
[88, 230, 3], [200, 230, 4], [312, 230, 5],
[88, 342, 6], [200, 342, 7], [312, 342, 8]]]
You can use deepcopy to ensure that temp is a separate list. Making this change will generate the output that you have described above.
from copy import deepcopy
curpos = [[88, 118, 1], [200, 118, 0], [312, 118, 2],
[88, 230, 3], [200, 230, 4], [312, 230, 5],
[88, 342, 6], [200, 342, 7], [312, 342, 8]]
movls = ['right', 'down', 'left']
def move_tile(current_loc, movlist):
    result = []
    temp = deepcopy(current_loc)
    for i in temp:
        if i[2] == 0:
            empty = temp.index(i)
    for j in movlist:
        if j == 'right':
            temp[empty][2] = temp[empty+1][2]
            temp[empty+1][2] = 0
        elif j == 'down':
            temp[empty][2] = temp[empty+3][2]
            temp[empty+3][2] = 0
        elif j == 'left':
            temp[empty][2] = temp[empty-1][2]
            temp[empty-1][2] = 0
        elif j == 'up':
            temp[empty][2] = temp[empty-3][2]
            temp[empty-3][2] = 0
        result.append(temp)
        temp = deepcopy(current_loc)
    return result
print(move_tile(curpos, movls))
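To see why list.copy() was not enough here, it may help to compare the two kinds of copy on a small nested list. This is only an illustrative sketch; the variable names grid, shallow and deep are made up for the example.

```python
from copy import deepcopy

grid = [[88, 118, 1], [200, 118, 0]]

shallow = grid.copy()    # new outer list, but the same inner lists
deep = deepcopy(grid)    # new outer list and new inner lists

shallow[0][2] = 99       # also changes grid[0], because the inner list is shared
deep[1][2] = 42          # grid is unaffected

print(grid)                    # [[88, 118, 99], [200, 118, 0]]
print(shallow[0] is grid[0])   # True
print(deep[0] is grid[0])      # False
```

A shallow copy only protects against changes to the outer list itself (appending, removing or replacing rows), not against changes inside the rows.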

How to apply a certain function on all the combinations along a dimension of two tensors?

I am trying to achieve something like this in torch.
I have two tensors of shapes [X, 4] and [Y, 4]. The 4s are basically 4 coordinates of something. So for each combination of a row from X and a row from Y (2 vectors of length 4), I want to apply some function (elementwise average, for example) and form a result of shape [X, Y, 4].
How to do this?
By elementwise average, I mean this operation,
[2 4 6 8] OP [8 6 4 2] = [5 5 5 5]
But it can be any arbitrary operation.
N.B. I was able to solve it using loops, but searching for a vectorized solution.
Your answer depends on the function. For example, for element wise mean:
np.mean([x[:,None,:],y[None,...]], axis=0)
or for einsum:
np.einsum('ij,kj->ikj',x,y)
or for summation:
x[:,None,:]+y[None,...]
And if you implement your CUSTOM function properly (elementwise), you can use broadcasting to do the job using this:
np.frompyfunc(myfunc,2,1)(x[:,None,:],y[None,...])
sample:
x = array([[0, 1, 2, 3],
           [4, 5, 6, 7]])
y = array([[100, 101, 102, 103],
           [104, 105, 106, 107],
           [108, 109, 110, 111]])
#np.mean([x[:,None,:],y[None,...]], axis=0)
array([[[50, 51, 52, 53],
        [52, 53, 54, 55],
        [54, 55, 56, 57]],
       [[52, 53, 54, 55],
        [54, 55, 56, 57],
        [56, 57, 58, 59]]])
#np.einsum('ij,kj->ikj',x,y)
array([[[  0, 101, 204, 309],
        [  0, 105, 212, 321],
        [  0, 109, 220, 333]],
       [[400, 505, 612, 721],
        [416, 525, 636, 749],
        [432, 545, 660, 777]]])
#x[:,None,:]+y[None,...]
array([[[100, 102, 104, 106],
        [104, 106, 108, 110],
        [108, 110, 112, 114]],
       [[104, 106, 108, 110],
        [108, 110, 112, 114],
        [112, 114, 116, 118]]])
def myfunc(x, y):
    return x+y
#np.frompyfunc(myfunc,2,1)(x[:,None,:],y[None,...])
array([[[100, 102, 104, 106],
        [104, 106, 108, 110],
        [108, 110, 112, 114]],
       [[104, 106, 108, 110],
        [108, 110, 112, 114],
        [112, 114, 116, 118]]], dtype=object)
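The broadcasting idiom used in this answer can be checked on small inputs by looking at the intermediate shapes. The sketch below uses NumPy and the elementwise average from the question; the variable name pairwise_mean is an assumption of mine. The same indexing pattern should carry over to torch tensors, though that is not tested here.

```python
import numpy as np

x = np.arange(8).reshape(2, 4)           # shape (X, 4) with X = 2
y = np.arange(100, 112).reshape(3, 4)    # shape (Y, 4) with Y = 3

# x[:, None, :] has shape (2, 1, 4) and y[None, :, :] has shape (1, 3, 4).
# Broadcasting stretches both to (2, 3, 4) before the arithmetic runs.
pairwise_mean = (x[:, None, :] + y[None, :, :]) / 2

print(pairwise_mean.shape)   # (2, 3, 4)
print(pairwise_mean[0, 0])   # [50. 51. 52. 53.]
```

Entry [i, j] of the result is the function applied to row i of x and row j of y, which is the [X, Y, 4] layout asked for.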

how to unflatten a matrix

How can I populate a matrix from a list of numbers?
I know this creates a matrix: result = [[0 for x in range(3)] for y in range(3)].
I would like to know how to populate this matrix from a list of numbers.
List1= [30, 18, 32, 66, 48, 77, 102, 78, 122]. This list comes from a matrix that I flattened. Now I would like to unflatten it.
output = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
I am trying to iterate to populate the matrix, but I end up with only the last number in every cell.
List1= [30, 18, 32, 66, 48, 77, 102, 78, 122]
d=0
while d < len(List1):
    result= [[List1 [d] for j in range(3)] for i in range(3)]
    d+= 1
result= [[122, 122, 122], [122, 122, 122], [122, 122, 122]]
If you can't use numpy, then you need to index into List1 using the i and j iterator values, using one as the row address and the other as the column. Depending on your desired output, you would use either:
result = [[List1[i*3+j] for j in range(3)] for i in range(3)]
Output
[[30, 18, 32], [66, 48, 77], [102, 78, 122]]
or
result = [[List1[j*3+i] for j in range(3)] for i in range(3)]
Output
[[30, 66, 102], [18, 48, 78], [32, 77, 122]]
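The same idea can be written with slices, so the row width is not hard-coded into the index arithmetic. This is a minimal sketch; the helper name unflatten and its ncols parameter are hypothetical, not from the answer.

```python
def unflatten(flat, ncols):
    """Split a flat list into rows of ncols items (hypothetical helper)."""
    if len(flat) % ncols != 0:
        raise ValueError("length of flat must be a multiple of ncols")
    # Take consecutive slices of width ncols.
    return [flat[i:i + ncols] for i in range(0, len(flat), ncols)]


List1 = [30, 18, 32, 66, 48, 77, 102, 78, 122]
rows = unflatten(List1, 3)
print(rows)                            # [[30, 18, 32], [66, 48, 77], [102, 78, 122]]

# Transposing the rows gives the second layout shown above.
print([list(col) for col in zip(*rows)])  # [[30, 66, 102], [18, 48, 78], [32, 77, 122]]
```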

script is changing the value of a variable for no reason [duplicate]

This question already has answers here:
How do I clone a list so that it doesn't change unexpectedly after assignment?
(24 answers)
Closed 4 years ago.
When running this specific block of my script, record[r] is changing its value. Here're some lines I get printed out:
record[r] [[70, 190, 526, 9], [100, 160, 354, 60], [61, 45, 276, 15], [45, 61, 454, 28], [254, 192, 15, 20]] r : 0
[[190, 70, 524, 15], [160, 100, 353, 60], [45, 61, 280, 15], [45, 61, 456, 25], [245, 186, 14, 24]]
record[r] [[190, 70, 524, 15], [160, 100, 353, 60], [45, 61, 280, 15], [45, 61, 456, 25], [245, 186, 14, 24]] r : 0
[[190, 70, 528, 18], [100, 160, 355, 69], [45, 61, 277, 17], [45, 61, 454, 23], [233, 184, 9, 27]]
record[r] [[190, 70, 528, 18], [100, 160, 355, 69], [45, 61, 277, 17], [45, 61, 454, 23], [233, 184, 9, 27]] r : 0
[[190, 70, 526, 16], [160, 100, 354, 66], [45, 61, 277, 11], [61, 45, 450, 17], [242, 181, 6, 37]]
record[r] [[190, 70, 526, 16], [160, 100, 354, 66], [45, 61, 277, 11], [61, 45, 450, 17], [242, 181, 6, 37]] r : 0
[[190, 70, 531, 8], [100, 160, 358, 72], [61, 45, 280, 8], [45, 61, 448, 7], [240, 178, 4, 28]]
record[r] [[190, 70, 531, 8], [100, 160, 358, 72], [61, 45, 280, 8], [45, 61, 448, 7], [240, 178, 4, 28]] r : 0
[[190, 70, 531, 5], [100, 160, 360, 71], [45, 61, 277, 9], [45, 61, 452, 12], [238, 175, 8, 20]]
record[r] [[190, 70, 531, 5], [100, 160, 360, 71], [45, 61, 277, 9], [45, 61, 452, 12], [238, 175, 8, 20]] r : 0
Code:
for i in range(10):
    print "loop {} of 100".format(i)
    for r in range(3):
        boo = False
        while boo == False:
            print "record[r]",record[r],"r :",r
            data = place2(record[r])
            print(data)
            if validate(data, True):
                boo = True
        print "GETTING PAST WHILE"
        record, gen2 = measure2(data, gen2, record)

def place2(inp):
    out = inp
    for i in range(4):
        n = randint(0,1)
        if n == 1:
            out[i] = flip(out[i])
        out[i][2] += randint(-5,5)
        out[i][3] += randint(-10,10)
    out[4][2] += randint(-5,5)
    out[4][3] += randint(-10,10)
    out[4][1] += randint(-10,10)
    out[4][0] += randint(-15,15)
    return out

def validate(inp, check):
    p = 0
    q = 0
    r = 0
    s = 0
    for i in range(5):
        for j in range(5):
            if i != j:
                if inp[i][2] - inp[j][2] <= (-1 * inp[i][0] )or inp[i][2] - inp[j][2] >= inp[j][0]:
                    p +=1
                if inp[i][3] - inp[j][3] <= (-1 * inp[i][1]) or inp[i][3] - inp[j][3] >= inp[j][1]:
                    q += 1
                if inp[i][2] >= 0 and inp[i][2] <= 600 - inp[i][0]:
                    r +=1
                if inp[i][3] >= 0 and inp[i][3] <= 225 - inp[i][1]:
                    s +=1
    if check:
        print(p,q,r,s)
    if p == 20 and s + r == 40:
        return True
    else:
        return False
It's also worth noting that I never get GETTING PAST WHILE printed out, so I know the culprit must be in the while loop.
record[r] should be static during the while loop and I can't explain for the life of me why it's not. I've isolated out the validate function to see if that's causing it and the problem still happens and I have no idea why the place2 function would be causing the issue.
I have spent probably 3 hours in total looking for a solution and have not found one so I'm hoping that SO can help.
When you execute place2(inp), you assign out = inp. This is not a copy!
Instead, out becomes another name for the same list object as inp. So when you change out, you also change inp.
You should use deepcopy if you don't want to modify your inp variable.
import copy
def place2(inp):
    out = copy.deepcopy(inp) # This will do a copy instead of pointing.
    for i in range(4):
        n = randint(0,1)
        if n == 1:
            out[i] = flip(out[i])
        # etc.
To be clearer, here's what happens without deepcopy :
a = [1,2]
b = a
b[0] = 10
print(a) # [10,2]
with deepcopy:
a = [1,2]
b = copy.deepcopy(a)
b[0] = 10
print(a) # [1,2]

Python 2d array

I want to make a 2D array that is 50x75.
The computer has to pick random coordinates inside the array, about 15 to 20 of them.
What should I do TT
I'm stuck at the first step, making the 50x75 2D array, so please help me TT
You can generate a 2D array using random numbers:
from random import randint
coordinates = [[randint(1, 100), randint(1, 100)] for i in range(20)]
Output: [[81, 52], [12, 79], [24, 90], [93, 53], [98, 17], [40, 44], [31, 1], [1, 40], [8, 34], [81, 31], [87, 50], [45, 72], [86, 70], [43, 78], [64, 80], [85, 76], [28, 43], [81, 78], [80, 55], [82, 58]]
A 50 x 75 2D array can be made with NumPy's reshape. Here is an example; hope this helps.
import numpy as np
np.arange(3750).reshape(50, 75) # the array has 50 rows and 75 cols
array([[ 0, 1, 2, ..., 72, 73, 74],
[ 75, 76, 77, ..., 147, 148, 149],
[ 150, 151, 152, ..., 222, 223, 224],
...,
[3525, 3526, 3527, ..., 3597, 3598, 3599],
[3600, 3601, 3602, ..., 3672, 3673, 3674],
[3675, 3676, 3677, ..., 3747, 3748, 3749]])
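Putting the two parts of the question together without NumPy, one possible sketch is to build the grid as a list of lists and draw 15 to 20 distinct cells with random.sample. The names ROWS, COLS, grid and cells are assumptions for this example, and marking a chosen cell with 1 is just one way to record it.

```python
import random

ROWS, COLS = 50, 75

# 50 rows of 75 zeros; the comprehension makes each row a separate list.
grid = [[0] * COLS for _ in range(ROWS)]

# Pick between 15 and 20 distinct flat indices, then convert each one
# to a (row, col) pair with divmod.
count = random.randint(15, 20)
cells = [divmod(k, COLS) for k in random.sample(range(ROWS * COLS), count)]

for r, c in cells:
    grid[r][c] = 1   # mark the chosen coordinates

print(cells)
```

Sampling flat indices guarantees the coordinates are distinct, which independent randint calls for row and column would not.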
