Merge two lists based on condition


I am trying to merge two lists based on matching indices, a sort of proximity intersection.
A set doesn't work in this case. What I am trying to do is match the index (the first element) of each sublist across the two lists; then, if an element in the first list is one less than an element in the other list, only then do I collect it.
An example will explain my scenario better.
Sample Input:
print merge_list([[0, 1, 3], [1, 2], [4, 1, 3, 5]],
[[0, 2, 6], [1, 4], [2, 2], [4, 1, 6]])
Sample Output:
[[0,2],[4,6]]
So for index 0, list1 has 1, 3 and list2 has 2, 6. Since 1 is one less than 2, we collect 2 and move on. 3 is less than 6, but not exactly one less (it is not 5), so we ignore it. Next we have [1, 2] and [1, 4]: both have index 1, but 2 is not one less than 4, so we ignore that. Next, [2, 2] in list2 has index 2, which doesn't match any index in the first list, so there is no comparison. Finally we compare [4, 1, 3, 5] with [4, 1, 6]. The indices match, and only 5 in list1 is one less than an element of list2 (6), so we collect 6, giving [4, 6]: index 4 followed by the match.
I have tried to make it work, but I can't seem to get it right.
This is my code so far.
def merge_list(my_list1, my_list2):
    merged_list = []
    bigger_list = []
    smaller_list = []
    temp_outer_index = 0
    temp_inner_index = 0
    if(len(my_list1) > len(my_list2)):
        bigger_list = my_list1
        smaller_list = my_list2
    elif(len(my_list2) > len(my_list1)):
        bigger_list = my_list2
        smaller_list = my_list1
    else:
        bigger_list = my_list1
        smaller_list = my_list2
    for i, sublist in enumerate(bigger_list):
        for index1 , val in enumerate(sublist):
            for k, sublist2 in enumerate(smaller_list):
                for index2, val2 in enumerate(sublist2):
                    temp_outer_index = index1 + 1
                    temp_inner_index = index2 + 1
                    if(temp_inner_index < len(sublist2) and temp_outer_index < len(sublist)):
                        # print "temp_outer:%s , temp_inner:%s, sublist[temp_outer]:%s, sublist2[temp_inner_index]:%s" % (temp_outer_index, temp_inner_index, sublist[temp_outer_index], sublist2[temp_inner_index])
                        if(sublist2[temp_inner_index] < sublist[temp_outer_index]):
                            merged_list.append(sublist[temp_outer_index])
                            break
    return merged_list

No clue what you are doing, but this should work.
First, convert the list of lists to a mapping of indices to set of digits contained in that list:
def convert_list(l):
    return dict((sublist[0], set(sublist[1:])) for sublist in l)
This will make the lists a lot easier to work with:
>>> convert_list([[0, 1, 3], [1, 2], [4, 1, 3, 5]])
{0: set([1, 3]), 1: set([2]), 4: set([1, 3, 5])}
>>> convert_list([[0, 2, 6], [1, 4], [2, 2], [4, 1, 6]])
{0: set([2, 6]), 1: set([4]), 2: set([2]), 4: set([1, 6])}
Now the merge_lists function can be written as such:
def merge_lists(l1, l2):
    result = []
    d1 = convert_list(l1)
    d2 = convert_list(l2)
    for index, l2_nums in d2.items():
        if index not in d1:
            # no matching index
            continue
        l1_nums = d1[index]
        sub_nums = [l2_num for l2_num in l2_nums if l2_num - 1 in l1_nums]
        if sub_nums:
            result.append([index] + sorted(list(sub_nums)))
    return result
Works for your test case:
>>> print merge_lists([[0, 1, 3], [1, 2], [4, 1, 3, 5]],
[[0, 2, 6], [1, 4], [2, 2], [4, 1, 6]])
[[0, 2], [4, 6]]

I believe this does what you want it to do:
import itertools
def to_dict(lst):
    dct = {sub[0]: sub[1:] for sub in lst}
    return dct
def merge_dicts(a, b):
    result = []
    overlapping_keys = set.intersection(set(a.keys()), set(b.keys()))
    for key in overlapping_keys:
        temp = [key] # initialize sublist with index
        for i, j in itertools.product(a[key], b[key]):
            if i == j - 1:
                temp.append(j)
        if len(temp) > 1: # if the sublist has anything besides the index
            result.append(temp)
    return result
dict1 = to_dict([[0, 1, 3], [1, 2], [4, 1, 3, 5]])
dict2 = to_dict([[0, 2, 6], [1, 4], [2, 2], [4, 1, 6]])
result = merge_dicts(dict1, dict2)
print(result)
Result:
[[0, 2], [4, 6]]
First, we convert your lists to dicts because they're easier to work with (this separates the key out from the other values). Then, we look for the keys that exist in both dicts (in the example, this is 0, 1, 4) and look at all pairs of values between the two dicts for each key (in the example, 1,2; 1,6; 3,2; 3,6; 2,4; 1,1; 1,6; 3,1; 3,6; 5,1; 5,6). Whenever the first element of a pair is one less than the second element, we add the second element to our temp list. If the temp list ends up containing anything besides the key (i.e. is longer than 1), we add it to the result list, which we eventually return.
(It just occurred to me that this has pretty bad performance characteristics - quadratic in the length of the sublists - so you might want to use Claudiu's answer instead if your sublists are going to be long. If they're going to be short, though, I think the cost of initializing a set is large enough that my solution might be faster.)

def merge_list(a, b):
    d = dict((val[0], set(val[1:])) for val in a)
    result = []
    for val in b:
        k = val[0]
        if k in d:
            match = [x for x in val[1:] if x - 1 in d[k]]
            if match:
                result.append([k] + match)
    return result
Similar to the other answers, this will first convert one of the lists to a dictionary with the first element of each inner list as the key and the remainder of the list as the value. Then we walk through the other list and if the first element exists as a key in the dictionary, we find all values that meet your criteria using the list comprehension and if there were any, add an entry to the result list which is returned at the end.

Related

Permutations between 2 lists

Given 2 lists, I would like to know an optimal way in Python to do a sort of "indexed permutation".
This is what it would look like:
Input:
list2 = [3,4,5]
list1 = [0,1,2]
Output:
[[0,1,2], [0,1,5], [0,4,2], [3,1,2],
[3,4,5], [3,4,2], [3,1,5], [0,4,5],
]
So each element of the lists remains in the same index.

You basically want two variables: list_to_pick, which can vary in range(number_of_lists), and index_to_swap, which can vary in range(-1, len(list1)). Then, you want the product of these two ranges to decide which list to pick and which item to swap. When index_to_swap is -1, we don't swap any items:
import itertools
source = [list1, list2]
result = []
for list_to_pick, index_to_swap in itertools.product(range(len(source)), range(-1, len(source[0]))):
    # Make a copy so we don't mess up the original list
    selected_list = source[list_to_pick].copy()
    # There are only two lists, so the other list is at index abs(list_to_pick - 1)
    other_list = source[abs(list_to_pick - 1)]
    # We swap only if index_to_swap >= 0
    if index_to_swap >= 0:
        selected_list[index_to_swap] = other_list[index_to_swap]
    result.append(selected_list)
Which gives:
[[0, 1, 2],
[3, 1, 2],
[0, 4, 2],
[0, 1, 5],
[3, 4, 5],
[0, 4, 5],
[3, 1, 5],
[3, 4, 2]]
The order is not the same as in your required list, but all the "permutations" are there. If you want the same order as in your question, define the second argument to itertools.product as:
swap_indices = [-1] + list(range(len(source[0]) - 1, -1, -1))
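As a possible alternative (not the approach used in the answer above), the "one element per index, taken from either list" idea can be sketched with zip and itertools.product. The name combos is made up for this illustration. For the three-element lists in the question it yields the same eight lists; for longer lists it produces every mix of the two lists, which may or may not be what the question intends.

```python
import itertools

list1 = [0, 1, 2]
list2 = [3, 4, 5]

# zip pairs up the candidates for each position: (0, 3), (1, 4), (2, 5).
# product then picks one candidate per position, in every possible way.
combos = [list(pick) for pick in itertools.product(*zip(list1, list2))]
print(combos)
```

The order differs from the one in the question; sort or reorder the result if a specific order matters.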

Sort a list by frequency and value

I am trying to solve the following problem: a function takes a list A. The result must be an ordered list of lists. Each inner list contains the elements that have the same frequency in the original list A.
Example:
Input: [3, 1, 2, 2, 4]
Output: [[1, 3, 4], [2, 2]]
I managed to sort the initial list A and determine the frequency of each element.
However, I do not know how to split the original list A based on the frequencies.
My code:
from collections import Counter

def customSort(arr):
    counter = Counter(arr)
    y = sorted(arr, key=lambda x: (counter[x], x))
    print(y)
    x = Counter(arr)
    a = sorted(x.values())
    print(a)

customSort([3,1,2,2,4])
My current output:
[1, 3, 4, 2, 2]
[1, 1, 1, 2]

You can use a defaultdict of lists and iterate your Counter:
from collections import defaultdict, Counter
def customSort(arr):
    counter = Counter(arr)
    dd = defaultdict(list)
    for value, count in counter.items():
        dd[count].extend([value]*count)
    return dd
res = customSort([3,1,2,2,4])
# defaultdict(list, {1: [3, 1, 4], 2: [2, 2]})
This gives additional information, i.e. the key represents how many times the values in the lists are seen. If you require a list of lists, you can simply access values:
res = list(res.values())
# [[3, 1, 4], [2, 2]]

Doing the grunt work suggested by Scott Hunter (Python 3):
#!/usr/bin/env python3
from collections import Counter
def custom_sort(arr):
    v = {}
    for key, value in sorted(Counter(arr).items()):
        v.setdefault(value, []).append(key)
    return [v * k for k,v in v.items()]
if __name__ == '__main__':
    print(custom_sort([3, 1, 2, 2, 4])) # [[1, 3, 4], [2, 2]]
For Python 2.7 or lower, use iteritems() instead of items().
Partially taken from this answer

Having sorted the list as you do:
counter = Counter(x)
y = sorted(x, key=lambda x: (counter[x], x))
#[1, 3, 4, 2, 2]
You could then use itertools.groupby, using the result from Counter(x) in the key argument to create groups according to the counts:
[list(v) for k,v in groupby(y, key = lambda x: counter[x])]
#[[1, 3, 4], [2, 2]]
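Putting the two steps together as a self-contained function might look like the sketch below. It adds the imports the snippets above leave out; the function name group_by_frequency is only illustrative.

```python
from collections import Counter
from itertools import groupby

def group_by_frequency(values):
    counter = Counter(values)
    # Sort by (frequency, value) so that elements with equal frequency
    # end up adjacent, which is what groupby requires.
    ordered = sorted(values, key=lambda v: (counter[v], v))
    return [list(group) for _, group in groupby(ordered, key=lambda v: counter[v])]

print(group_by_frequency([3, 1, 2, 2, 4]))  # [[1, 3, 4], [2, 2]]
```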

Find your maximum frequency, and create a list of that many empty lists.
Loop over your values, and add each to the element of the above corresponding to its frequency.
There might be something in the collections module that does at least part of the above.
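A minimal sketch of the steps described above, assuming collections.Counter is acceptable for the counting part. The function name bucket_by_frequency is hypothetical, and sorting the values first and dropping empty buckets at the end are assumptions made to match the expected output in the question.

```python
from collections import Counter

def bucket_by_frequency(values):
    if not values:
        return []
    counts = Counter(values)
    # One (initially empty) bucket per possible frequency 1..max.
    buckets = [[] for _ in range(max(counts.values()))]
    # Place each value in the bucket for its frequency.
    for v in sorted(values):
        buckets[counts[v] - 1].append(v)
    # Assumption: frequencies that never occur are dropped from the result.
    return [b for b in buckets if b]

print(bucket_by_frequency([3, 1, 2, 2, 4]))  # [[1, 3, 4], [2, 2]]
```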

Another variation on the same theme, using a Counter to get the counts and then inserting the elements into the respective position in the result list-of-lists. This retains the original order of the elements (does not group same elements together) and keeps empty lists for absent counts.
>>> lst = [1,4,2,3,4,3,2,5,4,4]
>>> import collections
>>> counts = collections.Counter(lst)
>>> res = [[] for _ in range(max(counts.values()))]
>>> for x in lst:
... res[counts[x]-1].append(x)
...
>>> res
[[1, 5], [2, 3, 3, 2], [], [4, 4, 4, 4]]

A bit late to the party, but with plain Python:
test = [3, 1, 2, 2, 4]
def my_sort(arr):
    # count[x] holds (number of occurrences of x) - 1
    count = {}
    for x in arr:
        if x in count:
            count[x] += 1
        else:
            count[x] = 0
    max_frequency = max(count.values()) + 1
    res = [[] for i in range(max_frequency)]
    for k,v in count.items():
        for j in range(v + 1):
            res[v].append(k)
    return res
print(my_sort(test))

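Using only Python's built-in functions, no imports and a single for loop. Note that this only separates elements seen once from elements seen more than once.
def customSort(mylist):
    l1 = []  # elements that occur more than once
    l2 = []  # elements that occur exactly once
    sl = sorted(mylist)
    for i in sl:
        n = sl.count(i)
        if n > 1:
            l1.append(i)
        if i not in l1:
            l2.append(i)
    return [l2, l1]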
print(customSort([3, 1, 2, 2, 4]))
Output:
[[1, 3, 4], [2, 2]]

Generate all combinations of 2 lists (game playing)

I am trying to generate all possible combinations between 2 lists A and B in python with a few constraints. A and B alternate in picking values, A always picks first. A and B may have overlapping values. If A has already picked a value, then B cannot pick it, and vice versa.
Both lists need not be of equal lengths. If one list has no available values to pick then I stop generating combinations
Also the elements picked by each must be in increasing order, i.e. A[1] < A[2] < .... A[n] and B[1] < B[2] < .... B[n] where A[i] and B[i] is the i-th element picked by A and B respectively
Example:
A = [1, 2, 3, 4]
B = [2, 5]
Solution I need is
(1), (2), (3), (4),
(1,2), (1,5), (2,5), (3,2), (3,5), (4,2), (4,5),
(1,2,3), (1,2,4), (3,2,4), (1,5,2), (1,5,3), (1,5,4), (2,5,3), (2,5,4), (3,5,4),
(1,2,3,5), (1,2,4,5), (3,2,4,5)
(1,2,3,5,4)
I believe itertools in Python can be useful for this, but I haven't really figured out how to implement it for this case.
As of now, this is how I am solving it:
import itertools
A = [1, 2, 3, 4]
B = [2, 5]
A_set = set(A)
B_set = set(B)
# Union of both sets
C = A_set.union(B_set)
for L in range(len(C), 0, -1):
    for subset in itertools.combinations(C, L):
        # Check if subset meets constraints and print it if it does
        pass

As noted in comments, this is probably much too specific to be easily solved using itertools, and you should use a recursive (generator) function instead. Just pick the next element from whichever list's turn it is, keeping track of the elements already selected, and recursively call the function again, swapping and shortening the lists and adding the element to the set of selected elements, until you've got the required number.
Something like this (this might be improved by adding parameters for the current index in both lists instead of actually slicing the lists for the recursive calls):
def solve(n, lst1, lst2, selected):
    if n == 0:
        yield []
    elif lst1:
        for i, x in enumerate(lst1):
            if x not in selected:
                selected.add(x)
                for rest in solve(n-1, lst2, lst1[i+1:], selected):
                    yield [x] + rest
                selected.remove(x)
Or a bit more condensed:
def solve(n, lst1, lst2, selected):
    if n == 0:
        yield []
    elif lst1:
        yield from ([x] + rest for i, x in enumerate(lst1) if x not in selected
                    for rest in solve(n-1, lst2, lst1[i+1:], selected.union({x})))
Example:
A = [1, 2, 3, 4]
B = [2, 5]
result = [res for n in range(1, len(A)+len(B)+1) for res in solve(n, A, B, set())]
Afterwards, result is:
[[1], [2], [3], [4],
[1, 2], [1, 5], [2, 5], [3, 2], [3, 5], [4, 2], [4, 5],
[1, 2, 3], [1, 2, 4], [1, 5, 2], [1, 5, 3], [1, 5, 4], [2, 5, 3], [2, 5, 4], [3, 2, 4], [3, 5, 4],
[1, 2, 3, 5], [1, 2, 4, 5], [3, 2, 4, 5],
[1, 2, 3, 5, 4]]
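The improvement mentioned earlier, passing start indices instead of slicing the lists on every recursive call, could be sketched roughly as follows. The name solve_idx and its parameters i and j are made up for this illustration, and like the versions above it assumes both input lists are sorted in increasing order.

```python
def solve_idx(n, cur, other, i=0, j=0, selected=None):
    """Yield all pick sequences of length n.

    cur is the list whose turn it is (it may pick from cur[i:]);
    other is the opponent's list (it may pick from other[j:]).
    """
    if selected is None:
        selected = set()
    if n == 0:
        yield []
        return
    for pos in range(i, len(cur)):
        x = cur[pos]
        if x not in selected:
            selected.add(x)
            # Swap roles: the opponent moves next, starting at its own index j,
            # and the current player will later resume after position pos.
            for rest in solve_idx(n - 1, other, cur, j, pos + 1, selected):
                yield [x] + rest
            selected.remove(x)

A = [1, 2, 3, 4]
B = [2, 5]
result = [res for n in range(1, len(A) + len(B) + 1) for res in solve_idx(n, A, B)]
print(len(result))  # 24
```

No list copies are made during the recursion; only the two integer offsets change between calls.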

Removing duplicates from a list of lists based on a comparison of an element of the inner lists

I have a large list of lists and need to remove duplicate elements based on specific criteria:
Uniqueness is determined by the first element of the lists.
Removal of duplicates is determined by comparing the value of the second element of the duplicate lists, namely keep the list with the lowest second element.
[[1, 4, 5], [1, 3, 4], [1, 2, 3]]
All the above lists are considered duplicates since their first elements are equal. The third list needs to be kept since its second element is the smallest. Note the actual list of lists has over 4 million elements, is double sorted and ordering needs to be preserved.
The list is first sorted based on the second element of the inner lists and in reverse (descending) order, followed by normal (ascending) order based on the first element:
sorted(sorted(the_list, key=itemgetter(1), reverse=True), key=itemgetter(0))
An example of three duplicate lists in their actual ordering:
[...
[33554432, 50331647, 1695008306],
[33554432, 34603007, 1904606324],
[33554432, 33554687, 2208089473],
...]
The goal is to prepare the list for bisect searching. Can someone provide me with insight on how this might be achieved using Python?

You can group the elements using a dict, always keeping the sublist with the smaller second element:
l = [[1, 2, 3], [1, 3, 4], [1, 4, 5], [2, 4, 3], [2, 5, 6], [2, 1, 3]]
d = {}
for sub in l:
    k = sub[0]
    if k not in d or sub[1] < d[k][1]:
        d[k] = sub
Also, you can sort on both elements in a single sorted call by returning a tuple from the key function; you don't need to call sorted twice:
In [3]: l = [[1,4,6,2],[2,2,4,6],[1,2,4,5]]
In [4]: sorted(l,key=lambda x: (-x[1],x[0]))
Out[4]: [[1, 4, 6, 2], [1, 2, 4, 5], [2, 2, 4, 6]]
If you want the dict to maintain order, since you say ordering needs to be preserved:
from collections import OrderedDict
l = [[1, 2, 3], [1, 3, 4], [1, 4, 5], [2, 4, 3], [2, 5, 6], [2, 1, 3]]
d = OrderedDict()
for sub in l:
    k = sub[0]
    if k not in d or sub[1] < d[k][1]:
        d[sub[0]] = sub
But I'm not sure how that fits, as you are sorting the data afterwards, so you will lose any order.
What you may find very useful is a sortedcontainers.sorteddict:
A SortedDict provides the same methods as a dict. Additionally, a SortedDict efficiently maintains its keys in sorted order. Consequently, the keys method will return the keys in sorted order, the popitem method will remove the item with the highest key, etc.
An optional key argument defines a callable that, like the key argument to Python’s sorted function, extracts a comparison key from each dict key. If no function is specified, the default compares the dict keys directly. The key argument must be provided as a positional argument and must come before all other arguments.
from sortedcontainers import SortedDict
l = [[1, 2, 3], [1, 3, 4], [1, 4, 5], [2, 4, 3], [2, 5, 6], [2, 1, 3]]
d = SortedDict()
for sub in l:
    k = sub[0]
    if k not in d or sub[1] < d[k][1]:
        d[k] = sub
print(list(d.values()))
It has all the methods you want: bisect, bisect_left, etc.

If I got it correctly, the solution might be like this:
mylist = [[1, 2, 3], [1, 3, 4], [1, 4, 5], [7, 3, 6], [7, 1, 8]]
ordering = []
newdata = {}
for a, b, c in mylist:
    if a in newdata:
        if b < newdata[a][1]:
            newdata[a] = [a, b, c]
    else:
        newdata[a] = [a, b, c]
        ordering.append(a)
newlist = [newdata[v] for v in ordering]
So newlist ends up holding the reduced list [[1, 2, 3], [7, 1, 8]].
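Since the question says the list is already sorted by its first element, a single pass with itertools.groupby might also work and keeps the groups in their existing order. This is only a sketch under that assumption; the function name keep_lowest_second is hypothetical.

```python
from itertools import groupby
from operator import itemgetter

def keep_lowest_second(rows):
    """Assumes rows are already sorted (or at least grouped) by first element."""
    # For each run of rows sharing a first element, keep the row whose
    # second element is smallest.
    return [min(group, key=itemgetter(1))
            for _, group in groupby(rows, key=itemgetter(0))]

rows = [
    [33554432, 50331647, 1695008306],
    [33554432, 34603007, 1904606324],
    [33554432, 33554687, 2208089473],
]
print(keep_lowest_second(rows))  # [[33554432, 33554687, 2208089473]]
```

Using min rather than taking the last row of each group means the sketch does not rely on the descending order of the second element.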

Remove a column from a nested list in Python

I need help figuring out how to remove a 'column' from a nested list, modifying the list itself.
Say I have
L = [[1,2,3,4],
[5,6,7,8],
[9,1,2,3]]
and I want to remove the second column (so values 2,6,1) to get:
L = [[1,3,4],
[5,7,8],
[9,2,3]]
I'm stuck on how to modify the list by just taking out a column. I've done something sort of like this before, except we were printing it instead, and of course that wouldn't work in this case because I believe the break conflicts with the rest of the values I want in the list.
def L_break(L):
    i = 0
    while i < len(L):
        k = 0
        while k < len(L[i]):
            print( L[i][k] , end = " ")
            if k == 1:
                break
            k = k + 1
        print()
        i = i + 1
So, how would you go about modifying this nested list?
Is my mind in the right place comparing it to the code I have posted or does this require something different?

You can simply delete the appropriate element from each row using del:
L = [[1,2,3,4],
[5,6,7,8],
[9,1,2,3]]
for row in L:
    del row[1] # 0 for column 1, 1 for column 2, etc.
print(L)
# outputs [[1, 3, 4], [5, 7, 8], [9, 2, 3]]

If you want to extract that column for later use, while removing it from the original list, use a list comprehension with pop:
>>> L = [[1,2,3,4],
... [5,6,7,8],
... [9,1,2,3]]
>>>
>>> [r.pop(1) for r in L]
[2, 6, 1]
>>> L
[[1, 3, 4], [5, 7, 8], [9, 2, 3]]
Otherwise, just loop over the list and delete the fields you no longer want, as in arshajii's answer

You can use operator.itemgetter, which is created for this very purpose.
from operator import itemgetter
getter = itemgetter(0, 2, 3) # Only indexes which are needed
print(list(map(list, map(getter, L))))
# [[1, 3, 4], [5, 7, 8], [9, 2, 3]]
You can use it in List comprehension like this
print([list(getter(item)) for item in L])
# [[1, 3, 4], [5, 7, 8], [9, 2, 3]]
You can also use nested List Comprehension, in which we skip the elements if the index is 1, like this
print([[item for index, item in enumerate(items) if index != 1] for items in L])
# [[1, 3, 4], [5, 7, 8], [9, 2, 3]]
Note: All these suggested in this answer will not affect the original list. They will generate new lists without the unwanted elements.

Use map with a lambda (in Python 3, wrap it in list() to see the result):
print(list(map(lambda x: x[:1]+x[2:], L)))

Here is one way, updated to take in kojiro's advice.
>>> L[:] = [i[:1]+i[2:] for i in L]
>>> L
[[1, 3, 4], [5, 7, 8], [9, 2, 3]]
You can generalize this to remove any column:
def remove_column(matrix, column):
    return [row[:column] + row[column+1:] for row in matrix]
# Remove 2nd column
copyofL = remove_column(L, 1) # Column is zero-base, so, 1=second column

When you use del, it removes that index and the remaining elements shift down, so you have to reduce each later index accordingly. Here I use count to adjust each index taken from the list of column indices to remove (which must be in ascending order). Hope this helps. Thanks
nested_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
remove_cols_index = [1,2]
count = 0
for i in remove_cols_index:
    i = i-count
    count = count+1
    # delete column i from every row (del nested_list[i] would delete a whole row)
    for row in nested_list:
        del row[i]
print (nested_list)
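If mutating the rows in place is not required, dropping several columns at once could also be sketched without any index bookkeeping. The helper name remove_columns is hypothetical, and it returns new lists rather than modifying the original.

```python
def remove_columns(matrix, columns):
    # Put the unwanted column indices in a set for fast membership tests.
    drop = set(columns)
    return [[value for idx, value in enumerate(row) if idx not in drop]
            for row in matrix]

nested_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(remove_columns(nested_list, [1, 2]))  # [[1], [4], [7]]
```

Because the indices are checked against the original positions, the order of the entries in columns does not matter here.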

[j.pop(1) for j in nested_list]
from https://www.geeksforgeeks.org/python-column-deletion-from-list-of-lists/
