Comparing two lists and performing an operation in Python


I have two lists, I and i. For each element of i, I want to find the values in I that are less than it, and add the number of such values to that element. For example, the element 15 in i has two smaller values in I, namely [8, 11], so 2 should be added to 15 to give 17, and [8, 11] should be stored in Values. The expected output is shown below.
I = [8, 11, 19, 37, 40, 42]
i=[15, 17, 27, 28, 31, 41]
The expected output is
New i=[17,19,30,31,34,46]
Values=[[8,11],[8,11],[8,11,19],[8,11,19],[8,11,19],[8,11,19,37,40]]

Assuming your list I is sorted, you can use bisect_left to find the insertion point in I for each element of i, and then slice I up to that point. bisect_left uses binary search.
With that you can do:
from bisect import bisect_left
Values = [I[:bisect_left(I, e)] for e in i]
New_i = [e + len(Values[j]) for j, e in enumerate(i)]
print(Values)
# [[8, 11], [8, 11], [8, 11, 19], [8, 11, 19], [8, 11, 19], [8, 11, 19, 37, 40]]
print(New_i)
# [17, 19, 30, 31, 34, 46]
BTW, I highly recommend not using I and i as variable names.
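If the reference list is not guaranteed to be sorted, one option is to sort a copy before bisecting. The following is a minimal sketch under that assumption; the function name add_smaller_counts and its parameter names are made up for illustration, and the smaller values come back in ascending order rather than in their original order.

```python
from bisect import bisect_left

def add_smaller_counts(reference, queries):
    """Return (new_queries, values) for the task described in the question."""
    ordered = sorted(reference)  # bisect_left requires ascending order
    # For each query, everything before the insertion point is strictly smaller
    values = [ordered[:bisect_left(ordered, q)] for q in queries]
    new_queries = [q + len(v) for q, v in zip(queries, values)]
    return new_queries, values

# The reference list is deliberately shuffled here
new_i, values = add_smaller_counts([42, 8, 37, 11, 40, 19], [15, 17, 27, 28, 31, 41])
print(new_i)   # [17, 19, 30, 31, 34, 46]
print(values)
```

Sorting costs O(n log n) once, after which each lookup is a binary search.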

You can use two list comprehensions:
>>> # for each element in i, add the elements in I to a list if they are smaller
>>> values = [[e for e in I if e < n] for n in i]
>>> # add the element of i to the number of elements in I that are smaller
>>> new_i = [sum(x) for x in zip(i, map(len, values))]
>>> values
[[8, 11], [8, 11], [8, 11, 19], [8, 11, 19], [8, 11, 19], [8, 11, 19, 37, 40]]
>>> new_i
[17, 19, 30, 31, 34, 46]

The other solutions are correct; here is a more old-school version with explicit loops, which some may find more readable:
l1 = [15, 17, 27, 28, 31, 41]
l2 = [8, 11, 19, 37, 40, 42]
new_l1 = []
comp = []
for i in l1:
    c = []
    for j in l2:
        # compare against the original value of i, not a running total
        if j < i:
            c.append(j)
    new_l1.append(i + len(c))
    comp.append(c)
print(new_l1)
print(comp)
Input:
l1 = [15, 17, 27, 28, 31, 41]
l2 = [8, 11, 19, 37, 40, 42]
Output:
[17, 19, 30, 31, 34, 46]
[[8, 11], [8, 11], [8, 11, 19], [8, 11, 19], [8, 11, 19], [8, 11, 19, 37, 40]]

One way to do this is to use NumPy, which performs fast vectorized operations on arrays and is generally better optimized than looping over Python lists. For example:
import numpy as np
list_1 = [8, 11, 19, 37, 40, 42]
list_2 = [15, 17, 27, 28, 31, 41]
arr_1, arr_2 = np.array(list_1), np.array(list_2)
# Broadcast to a 2D array where elem (i, j) is whether list_2[i] > list_1[j]
arr_sup = (arr_2[:, None] - arr_1[None, :]) > 0
# Add sum of rows to list_2 to create the new list
new_list = (np.sum(arr_sup, axis=1) + list_2).tolist()
# Get indexes (i, j) where arr_sup[i, j] is True
idx_sup = np.where(arr_sup)
# One (possibly empty) list per element of list_2
values = [[] for _ in list_2]
for i, j in zip(idx_sup[0], idx_sup[1]):  # browse i and j together
    # Add value j of list 1 to i-th list in values
    values[i].append(list_1[j])
print(new_list)  # [17, 19, 30, 31, 34, 46]
print(values)  # [[8, 11], [8, 11], [8, 11, 19], [8, 11, 19], [8, 11, 19], [8, 11, 19, 37, 40]]
It works even if the lists are not sorted.
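When the first list can be sorted, the counts can also be obtained without building the full len(list_2) by len(list_1) comparison matrix. This is a sketch of that alternative using np.searchsorted, not the method of the answer above; the name sorted_1 is introduced here for illustration.

```python
import numpy as np

list_1 = [8, 11, 19, 37, 40, 42]
list_2 = [15, 17, 27, 28, 31, 41]

sorted_1 = np.sort(np.asarray(list_1))
# With side="left", the insertion index equals the number of
# elements of sorted_1 that are strictly smaller than each query
counts = np.searchsorted(sorted_1, list_2, side="left")
new_list = (np.asarray(list_2) + counts).tolist()
# Slice the sorted array once per query to collect the smaller values
values = [sorted_1[:c].tolist() for c in counts]

print(new_list)  # [17, 19, 30, 31, 34, 46]
print(values)
```

The memory use here grows with the size of the output rather than with the product of the two list lengths.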

Related

NumPy Array Fill Rows Downward By Indexed Sections

Let's say I have the following (fictitious) NumPy array:
arr = np.array(
[[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
[17, 18, 19, 20],
[21, 22, 23, 24],
[25, 26, 27, 28],
[29, 30, 31, 32],
[33, 34, 35, 36],
[37, 38, 39, 40]
]
)
And for row indices idx = [0, 2, 3, 5, 8, 9] I'd like to repeat the values in each row downward until it reaches the next row index:
np.array(
[[1, 2, 3, 4],
[1, 2, 3, 4],
[9, 10, 11, 12],
[13, 14, 15, 16],
[13, 14, 15, 16],
[21, 22, 23, 24],
[21, 22, 23, 24],
[21, 22, 23, 24],
[33, 34, 35, 36],
[37, 38, 39, 40]
]
)
Note that idx will always be sorted and have no repeated values. I can accomplish this with something like:
for start, stop in zip(idx[:-1], idx[1:]):
    for i in range(start, stop):
        arr[i] = arr[start]
# Handle last index in `idx`
start, stop = idx[-1], arr.shape[0]
for i in range(start, stop):
    arr[i] = arr[start]
Unfortunately, I have many, many arrays like this, and the loop becomes slow as the arrays grow (in both rows and columns) and as idx gets longer. The final goal is to plot these as heatmaps in matplotlib, which I already know how to do. Another approach that I tried was using np.tile:
for start, stop in zip(idx[:-1], idx[1:]):
    reps = max(0, stop - start)
    arr[start:stop] = np.tile(arr[start], (reps, 1))
# Handle last index in `idx`
start, stop = idx[-1], arr.shape[0]
reps = max(0, stop - start)  # recompute for the final section
arr[start:stop] = np.tile(arr[start], (reps, 1))
But I am hoping that there's a way to get rid of the slow for-loop.

Try np.diff to find the repetition for each row, then np.repeat:
# this assumes `idx` is a standard list as in the question
np.repeat(arr[idx], np.diff(idx+[len(arr)]), axis=0)
Output:
array([[ 1, 2, 3, 4],
[ 1, 2, 3, 4],
[ 9, 10, 11, 12],
[13, 14, 15, 16],
[13, 14, 15, 16],
[21, 22, 23, 24],
[21, 22, 23, 24],
[21, 22, 23, 24],
[33, 34, 35, 36],
[37, 38, 39, 40]])
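For reference, here is a self-contained sketch of the np.diff plus np.repeat idea wrapped in a function, so that idx may be a list or an array. The name fill_down is hypothetical, and the sketch assumes idx starts at 0; otherwise the result has fewer rows than arr.

```python
import numpy as np

def fill_down(arr, idx):
    """Repeat each indexed row until the next index (or the end of arr)."""
    idx = np.asarray(idx)
    # Number of rows each indexed row has to cover
    counts = np.diff(np.append(idx, len(arr)))
    return np.repeat(arr[idx], counts, axis=0)

arr = np.arange(1, 41).reshape(10, 4)  # same values as the question's array
idx = [0, 2, 3, 5, 8, 9]
filled = fill_down(arr, idx)
print(filled)
```

np.repeat returns a new array; arr itself is left unchanged.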

list comprehensions with break

I have the following code that I would like to write in one line with a list comprehension.
list1 = [4, 5, 6, 9, 10, 16, 21, 23, 25, 27]
list2 = [1, 3, 5, 7, 8, 11, 12, 13, 14, 15, 17, 20, 24, 26, 56]
list3 = []
for i in list1:
    for j in list2:
        if j>i:
            # print(i,j)
            list3.append(j)
            break
print(list1)
print(list3)
The output is:
[4, 5, 6, 9, 10, 16, 21, 23, 25, 27]
[5, 7, 7, 11, 11, 17, 24, 24, 26, 56]
It's the break statement that throws me off; I don't know where to put it.
Thank you

To build the expression it helps to ignore the break condition at first:
In [32]: [[j for j in list2 if j > i] for i in list1]
Out[32]:
[[5, 7, 8, 11, 12, 13, 14, 15, 17, 20, 24, 26, 56],
[7, 8, 11, 12, 13, 14, 15, 17, 20, 24, 26, 56],
[7, 8, 11, 12, 13, 14, 15, 17, 20, 24, 26, 56],
[11, 12, 13, 14, 15, 17, 20, 24, 26, 56],
[11, 12, 13, 14, 15, 17, 20, 24, 26, 56],
[17, 20, 24, 26, 56],
[24, 26, 56],
[24, 26, 56],
[26, 56],
[56]]
From there you can add the min constraint:
In [33]: [min([j for j in list2 if j > i]) for i in list1]
Out[33]: [5, 7, 7, 11, 11, 17, 24, 24, 26, 56]

You can't really break out of a list comprehension's internal for loop. What you can do is avoid needing a break at all, by using the next function to find the first matching value:
list1 = [4, 5, 6, 9, 10, 16, 21, 23, 25, 27]
list2 = [1, 3, 5, 7, 8, 11, 12, 13, 14, 15, 17, 20, 24, 26, 56]
list3 = [ next(j for j in list2 if j>i) for i in list1 ]
output:
print(list1)
print(list3)
[4, 5, 6, 9, 10, 16, 21, 23, 25, 27]
[5, 7, 7, 11, 11, 17, 24, 24, 26, 56]
If you are concerned about performance (since the list comprehension will be slower than the loops), you could use a binary search in list2 to find the next higher value:
from bisect import bisect_left
list3 = [ list2[bisect_left(list2,i+1)] for i in list1 ]
This assumes that list2 is sorted in ascending order, that max(list2) > max(list1), and that the values are integers (because of the i+1).
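Both variants above raise an exception when no larger value exists (StopIteration for next, IndexError for the bisect lookup). The sketch below is one way to handle that case, assuming that returning None is acceptable; the helper name next_larger is made up for illustration. It uses bisect_right so that the values do not need to be integers.

```python
from bisect import bisect_right

def next_larger(sorted_values, x, default=None):
    """Return the first element of sorted_values greater than x, else default."""
    pos = bisect_right(sorted_values, x)  # index of the first element > x
    return sorted_values[pos] if pos < len(sorted_values) else default

list1 = [4, 5, 6, 9, 10, 16, 21, 23, 25, 27, 60]  # 60 has no larger value
list2 = [1, 3, 5, 7, 8, 11, 12, 13, 14, 15, 17, 20, 24, 26, 56]

list3 = [next_larger(list2, i) for i in list1]
# The generator version accepts a default as the second argument to next()
list3_gen = [next((j for j in list2 if j > i), None) for i in list1]

print(list3)  # [5, 7, 7, 11, 11, 17, 24, 24, 26, 56, None]
```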

I have tried timing the answer posted by AbbeGijly.
It turns out that it is slower than the original solution. Check it out.
import timeit
print(timeit.timeit('''
list1 = [4, 5, 6, 9, 10, 16, 21, 23, 25, 27]
list2 = [1, 3, 5, 7, 8, 11, 12, 13, 14, 15, 17, 20, 24, 26, 40, 56]
list3 = []
for i in list1:
    for j in list2:
        if j>i:
            # print(i,j)
            list3.append(j)
            break
'''))
print(timeit.timeit('''
list1 = [4, 5, 6, 9, 10, 16, 21, 23, 25, 27]
list2 = [1, 3, 5, 7, 8, 11, 12, 13, 14, 15, 17, 20, 24, 26, 40, 56]
list4 = [[j for j in list2 if j > i] for i in list1]
'''))
The output is:
3.6144596
8.731578200000001

You could move the break logic into a separate function, then put that function inside a list comprehension:
def smallest_value_larger_than_i(candidate_values, i):
    for value in candidate_values:
        if value > i:
            return value
    return None  # Not sure how you want to handle this case
list3 = [smallest_value_larger_than_i(list2, i) for i in list1]
This runs slightly slower than your original solution, but if the goal of using a list comprehension is speed, you'll get much better results by improving the algorithm instead. For example, if both lists are sorted, then you can discard elements from list2 as soon as you skip over them once, instead of checking them against the rest of list1. You could also do a binary search of list2 instead of scanning through it linearly.
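The idea of discarding elements of list2 once they have been skipped can be sketched as a single forward pass. This assumes both lists are sorted in ascending order; the function name next_larger_sorted and the choice of None for "no larger value" are illustrative assumptions.

```python
def next_larger_sorted(list1, list2):
    """For each item of sorted list1, find the first larger item in sorted list2."""
    result = []
    pos = 0  # never moves backwards: skipped elements are not revisited
    for i in list1:
        while pos < len(list2) and list2[pos] <= i:
            pos += 1
        result.append(list2[pos] if pos < len(list2) else None)
    return result

list1 = [4, 5, 6, 9, 10, 16, 21, 23, 25, 27]
list2 = [1, 3, 5, 7, 8, 11, 12, 13, 14, 15, 17, 20, 24, 26, 56]
print(next_larger_sorted(list1, list2))
# [5, 7, 7, 11, 11, 17, 24, 24, 26, 56]
```

Each list is traversed once, so the work is proportional to len(list1) + len(list2) rather than their product.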

make list as nested list with consecutive elements in separate list

I want to split a list of elements into a nested list, with each sublist holding consecutive elements. An element that has no consecutive neighbour should go in a list of its own.
Input:
l1 = [1, 2, 3, 11, 12, 13, 23, 33, 34, 35, 45]
l2 = [11, 12, 13, 22, 23, 24, 33, 34]
l3 = [1, 2, 3, 11, 12, 13, 32, 33, 34, 45]
expected output:
l1 = [[1, 2, 3], [11, 12, 13], [23], [33, 34, 35], [45]]
l2 = [[11, 12, 13], [22, 23, 24], [33, 34]]
l3 = [[1, 2, 3], [11, 12, 13], [32, 33, 34], [45]]
I have tried the code below but it is not giving the expected result, printing an empty list:
def split_into_list(l):
    t = []
    for i in range(len(l) - 1):
        if abs(l[i] - l[i + 1]) == 0:
            t.append(l[i])
        elif abs(l[i] - l[i + 1]) != 0 and abs(l[i - 1] - l[i]) == 0:
            t.append(l[i])
            yield t
            split_into_list(l[i:])
        if i + 1 == len(l):
            t.append(l[i])
            yield t
l = [1, 2, 3, 11, 12, 13, 32, 33, 34, 45]
li = []
li.append(split_into_list(l))
for i in li:
    print(i, list(i))

Shorter approach with custom split_adjacent function:
def split_adjacent(lst):
    res = [[lst[0]]]  # start/init with the 1st item/number
    for i in range(1, len(lst)):
        if lst[i] - res[-1][-1] > 1:  # compare current and previous item
            res.append([])
        res[-1].append(lst[i])
    return res
l1 = [1, 2, 3, 11, 12, 13, 23, 33, 34, 35, 45]
l2 = [11, 12, 13, 22, 23, 24, 33, 34]
l3 = [1, 2, 3, 11, 12, 13, 32, 33, 34, 45]
print(split_adjacent(l1))
print(split_adjacent(l2))
print(split_adjacent(l3))
Final output:
[[1, 2, 3], [11, 12, 13], [23], [33, 34, 35], [45]]
[[11, 12, 13], [22, 23, 24], [33, 34]]
[[1, 2, 3], [11, 12, 13], [32, 33, 34], [45]]

def split_into_list(l):
    result = [[l[0]]]  # start the first group with the first element
    for i, elt in enumerate(l[1:]):
        # l[i] is the element just before elt
        diff = abs(elt - l[i])
        if diff == 1:
            # still the same group
            result[-1].append(elt)
        else:
            # new group
            result.append([elt])
    return result
l = [1,2,3,11,12,13,32,33,34,45]
print(split_into_list(l))
yields
[[1, 2, 3], [11, 12, 13], [32, 33, 34], [45]]

def split_into_list(l):
    t = []
    temp = [l[0]]
    prev = l[0]
    for i in l[1:]:
        if i == prev+1:
            temp.append(i)
        else:
            t.append(temp)
            temp = [i]
        prev = i
    t.append(temp)  # do not forget the last group
    return t
Mind that all solutions so far rely on sorted lists, which you didn't explicitly specify in your question.
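If the input may be unsorted, one possibility is to sort it first and then group runs with itertools.groupby. This is only a sketch: it assumes that the original order and any duplicates do not matter, which the question does not state, and the name split_consecutive is made up for illustration.

```python
from itertools import groupby

def split_consecutive(values):
    """Group integers into runs of consecutive values."""
    ordered = sorted(set(values))  # assumption: order and duplicates are irrelevant
    groups = []
    # Within a run of consecutive integers, value - position is constant
    for _, run in groupby(enumerate(ordered), key=lambda pair: pair[1] - pair[0]):
        groups.append([value for _, value in run])
    return groups

print(split_consecutive([45, 3, 1, 2, 13, 11, 12, 34, 32, 33]))
# [[1, 2, 3], [11, 12, 13], [32, 33, 34], [45]]
```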

This is something that lends itself well to a numpy solution using diff and split:
import numpy as np
def consecutives(x):
    return np.split(x, np.flatnonzero(np.diff(x) != 1) + 1)
For example, consecutives(l1) will result in
[array([1, 2, 3]),
array([11, 12, 13]),
array([23]),
array([33, 34, 35]),
array([45])]
If you need nested lists, you can apply list or ndarray.tolist:
def consecutives(x):
    return [a.tolist() for a in np.split(x, np.flatnonzero(np.diff(x) != 1) + 1)]
Now the result of consecutives(l1) is
[[1, 2, 3], [11, 12, 13], [23], [33, 34, 35], [45]]

Slice list of lists without numpy

In Python, how could I slice my list of lists and get a sub list of lists without numpy?
For example, get a list of lists from A[1][1] to A[2][2] and store it in B:
A = [[1, 2, 3, 4 ],
[11, 12, 13, 14],
[21, 22, 23, 24],
[31, 32, 33, 34]]
B = [[12, 13],
[22, 23]]

You can slice A and its sublists:
In [1]: A = [[1, 2, 3, 4 ],
...: [11, 12, 13, 14],
...: [21, 22, 23, 24],
...: [31, 32, 33, 34]]
In [2]: B = [l[1:3] for l in A[1:3]]
In [3]: B
Out[3]: [[12, 13], [22, 23]]

You can also perform nested list slicing with the map() function:
B = list(map(lambda x: x[1:3], A[1:3]))
# Value of B: [[12, 13], [22, 23]]
where A is the list from the question. In Python 3, map() returns an iterator, hence the list() call.
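If this kind of slicing is needed in several places, it can be wrapped in a small helper. The function slice_2d and its parameter names are hypothetical; like ordinary slices, the stop indices are exclusive.

```python
def slice_2d(rows, row_start, row_stop, col_start, col_stop):
    """Return a new list of lists covering the given row and column ranges."""
    return [row[col_start:col_stop] for row in rows[row_start:row_stop]]

A = [[1, 2, 3, 4],
     [11, 12, 13, 14],
     [21, 22, 23, 24],
     [31, 32, 33, 34]]

B = slice_2d(A, 1, 3, 1, 3)
print(B)  # [[12, 13], [22, 23]]
```

The inner lists of B are copies, so modifying B does not change A.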

Why am I getting "list assignment index out of range" error?

I keep getting the same error:
Traceback (most recent call last):
File "<stdin>", line 3, in <module>
IndexError: list assignment index out of range
What's wrong with the following code?
myList = [([None] * 8) for x in range(16)]
for i in range(0,7):
    for j in range(0,15):
        myList[i][j] = 2 * i

The list comprehension
myList = [([None] * 8) for x in range(16)]
can be understood like this
myList = []
for x in range(16):
    myList.append([None] * 8)
So, you are creating a list of 16 lists, each containing 8 Nones. But with the loops
for i in range(0,7):
    for j in range(0,15):
you are trying to access 15 elements from the first 7 lists. That is why it is failing. Instead, you might want to do
for i in range(16):
    for j in range(8):
        ...
Actually you can do the same in list comprehension, like this
[[2 * i for j in range(8)] for i in range(16)]
Demo:
>>> from pprint import pprint
>>> pprint([[2 * i for j in range(8)] for i in range(16)])
[[0, 0, 0, 0, 0, 0, 0, 0],
[2, 2, 2, 2, 2, 2, 2, 2],
[4, 4, 4, 4, 4, 4, 4, 4],
[6, 6, 6, 6, 6, 6, 6, 6],
[8, 8, 8, 8, 8, 8, 8, 8],
[10, 10, 10, 10, 10, 10, 10, 10],
[12, 12, 12, 12, 12, 12, 12, 12],
[14, 14, 14, 14, 14, 14, 14, 14],
[16, 16, 16, 16, 16, 16, 16, 16],
[18, 18, 18, 18, 18, 18, 18, 18],
[20, 20, 20, 20, 20, 20, 20, 20],
[22, 22, 22, 22, 22, 22, 22, 22],
[24, 24, 24, 24, 24, 24, 24, 24],
[26, 26, 26, 26, 26, 26, 26, 26],
[28, 28, 28, 28, 28, 28, 28, 28],
[30, 30, 30, 30, 30, 30, 30, 30]]
Note: I have used 16 and 8 in the range calls because range starts from 0 by default and stops one short of the argument passed to it. So the indices run from 0 to 15 and from 0 to 7, respectively.

You have your indices reversed; the outer list has 16 elements, but you are trying to index the inner list past the 8 elements that are there.
That's because your list comprehension built 16 lists of 8 values each, so len(myList) is 16 and len(myList[0]) is 8.
Reverse the ranges:
for i in range(15):
    for j in range(7):
        myList[i][j] = 2 * i
or reverse the use of i and j when indexing into myList:
for i in range(7):
    for j in range(15):
        myList[j][i] = 2 * i
Note that the outer myList is now indexed with j and each nested inner list with i.
Both i and j omit the last element; if that wasn't intentional, use 8 and 16 rather than 7 and 15.
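One way to avoid mismatched bounds altogether is to derive the loop ranges from the lists themselves instead of hard-coding them. A minimal sketch, with the variable names rows, cols, and my_list chosen here for illustration:

```python
rows, cols = 16, 8
my_list = [[None] * cols for _ in range(rows)]

# enumerate() and len() follow the actual shape, so the indices
# cannot run past the end of either the outer or the inner list
for i, row in enumerate(my_list):
    for j in range(len(row)):
        row[j] = 2 * i

print(my_list[0])   # [0, 0, 0, 0, 0, 0, 0, 0]
print(my_list[15])  # [30, 30, 30, 30, 30, 30, 30, 30]
```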
