List comprehension using index values - python

How do I turn the below for loop into a list comprehension?
I have a list of lists. Each sublist has a number in index 4 position, and I would like to find a way to add them up. The for loop works, but I'd like a one-line solution with list comprehension.
frequencies = []
for row in table:
    frequencies.append(int(row[4]))
sum(frequencies)
Here's my attempt:
frequencies = [sum(i) for i, x in enumerate(table) if x == 4]
However, the result is an empty object.
In[52]: total
Out[52]: []

Sum the converted values directly:
frequencies = sum([int(row[4]) for row in table])
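Your attempt compares each whole row x to 4, which is never true, so the comprehension filters out every row and returns an empty list. Index into each row with row[4] instead of filtering on the row itself.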
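As a minimal sketch of the same idea, a generator expression avoids building the intermediate list. The sample table below is made up for illustration; the only assumption is that index 4 of each row holds something int() can convert.

```python
# Hypothetical sample data: each row holds a numeric string at index 4.
table = [
    ["a", "b", "c", "d", "3"],
    ["e", "f", "g", "h", "10"],
    ["i", "j", "k", "l", "7"],
]

# sum() accepts any iterable, so the square brackets are optional.
total = sum(int(row[4]) for row in table)
print(total)  # 20
```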

Related

Grouping a grouped list of str without duplicates

I have a grouped list of strings that sort of looks like this, the lists inside of these groups will always contain 5 elements:
text_list = [['aaa','bbb','ccc','ddd','eee'],
['fff','ggg','hhh','iii','jjj'],
['xxx','mmm','ccc','bbb','aaa'],
['fff','xxx','aaa','bbb','ddd'],
['aaa','bbb','ccc','ddd','eee'],
['fff','xxx','aaa','ddd','eee'],
['iii','xxx','ggg','jjj','aaa']]
The objective is simple: group the lists that are similar, where the first 3 elements of one list are compared against all of the elements of every other list.
So from the above example the output might look like this (output is the index of the list):
[[0,2,4],[3,5]]
Notice that a group containing the same indices in a different order is removed.
I've written the following code to extract the groups, but it returns duplicates and I am unsure how to proceed. I also think this might not be the most efficient way to do the extraction, as the real list can contain upwards of millions of groups:
grouped_list = []
for i in range(0,len(text_list)):
    int_temp = []
    for m in range(0,len(text_list)):
        if i == m:
            continue
        bool_check = all( x in text_list[m] for x in text_list[i][0:3])
        if bool_check:
            if len(int_temp) == 0:
                int_temp.append(i)
                int_temp.append(m)
                continue
            int_temp.append(m)
    grouped_list.append(int_temp)
## remove index with no groups
grouped_list = [x for x in grouped_list if x != []]
Is there a better way to go about this? How do I remove the duplicate group afterwards? Thank you.
Edit:
To be clearer, I would like to retrieve the lists that are similar to each other, using only the first 3 elements of each list as the reference. For example, take the first 3 elements of list A and check whether lists B, C, D... contain all 3 of them. Repeat for every list, then remove any group that duplicates another.
You can build a set of frozensets to keep track of indices of groups with the first 3 items being a subset of the rest of the members:
groups = set()
sets = list(map(set, text_list))
for i, lst in enumerate(text_list):
    groups.add(frozenset((i, *(j for j, s in enumerate(sets) if set(lst[:3]) <= s))))
print([sorted(group) for group in groups if len(group) > 1])
If the input list is long, it would be faster to build a set of frozensets of the first 3 items of every sub-list, then use that set to filter the 3-item combinations of each sub-list. The time complexity is then essentially linear in the length of the input list rather than quadratic, despite the overhead of generating combinations:
from itertools import combinations
sets = {frozenset(lst[:3]) for lst in text_list}
groups = {}
for i, lst in enumerate(text_list):
    for c in map(frozenset, combinations(lst, 3)):
        if c in sets:
            groups.setdefault(c, []).append(i)
print([sorted(group) for group in groups.values() if len(group) > 1])
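For reference, here is a sketch of the combinations approach wrapped in a function and run on the question's sample data. The function name group_by_first_three is made up for this example, and it assumes the items within each sub-list are distinct (a repeated item could add the same index to a group twice).

```python
from itertools import combinations

def group_by_first_three(rows):
    """Group row indexes by the first-three-item sets that each row contains."""
    # Keys: the first three items of every row, ignoring order.
    keys = {frozenset(row[:3]) for row in rows}
    groups = {}
    for i, row in enumerate(rows):
        # Every 3-item subset of the row is a candidate key.
        for combo in map(frozenset, combinations(row, 3)):
            if combo in keys:
                groups.setdefault(combo, []).append(i)
    # Keep only groups with more than one member.
    return sorted(g for g in groups.values() if len(g) > 1)

text_list = [['aaa', 'bbb', 'ccc', 'ddd', 'eee'],
             ['fff', 'ggg', 'hhh', 'iii', 'jjj'],
             ['xxx', 'mmm', 'ccc', 'bbb', 'aaa'],
             ['fff', 'xxx', 'aaa', 'bbb', 'ddd'],
             ['aaa', 'bbb', 'ccc', 'ddd', 'eee'],
             ['fff', 'xxx', 'aaa', 'ddd', 'eee'],
             ['iii', 'xxx', 'ggg', 'jjj', 'aaa']]

print(group_by_first_three(text_list))  # [[0, 2, 4], [3, 5]]
```

Because each group is keyed by a frozenset, two rows whose first three items are the same in a different order end up in one group rather than producing duplicates.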

Get the n top elements from a list of lists

I'm using the following example for demonstration purposes.
[["apple",10],
["oranges",5],
["strawberies",2],
["pineapples",12],
["bananas",9],
["tomattoes",8],
["watermelon",1],
["mangos",7],
["grapes",11],
["potattoes",3]]
I want to get, say, the top 3 fruits by quantity (top 3 elements returned); however, I don't want the order to change.
So the end result will be
[["apple",10],
["pineapples",12],
["grapes",11]]
Any help will be appreciated.
arr = [["apple",10],
["oranges",5],
["strawberies",2],
["pineapples",12],
["bananas",9],
["tomattoes",8],
["watermelon",1],
["mangos",7],
["grapes",11],
["potattoes",3]]
sorted_arr = sorted(arr,key=lambda x: x[1],reverse=True)[:3]
output = [elem for elem in arr if elem in sorted_arr]
print(sorted_arr)
print(output)
First, we sort the array by quantity in descending order and keep the first 3 elements. Then, we use a list comprehension to loop through the original array and keep the elements that are in the top 3. This preserves the original order.
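One caveat: the elem in sorted_arr check matches rows by value, so if the input contains two identical rows that both equal a top-3 row, more than 3 rows come back. A possible alternative, sketched here with heapq.nlargest from the standard library (a different technique from the one above), selects positions instead of values. The function name top_n_in_order is made up for this example.

```python
import heapq

def top_n_in_order(rows, n):
    """Return the n rows with the largest quantity, in their original order."""
    # Pick the positions of the n largest quantities, then restore input order.
    top_positions = heapq.nlargest(n, range(len(rows)), key=lambda i: rows[i][1])
    return [rows[i] for i in sorted(top_positions)]

arr = [["apple", 10], ["oranges", 5], ["strawberies", 2], ["pineapples", 12],
       ["bananas", 9], ["tomattoes", 8], ["watermelon", 1], ["mangos", 7],
       ["grapes", 11], ["potattoes", 3]]

print(top_n_in_order(arr, 3))
# [['apple', 10], ['pineapples', 12], ['grapes', 11]]
```

This also avoids sorting the whole list when only a few elements are needed.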
Lists have a sort method, which sorts the list in place. You can pass a function as the key argument to control how the list is sorted. Note that this changes the order of the original list.
l = [["apple",10],
["oranges",5],
["strawberies",2],
["pineapples",12],
["bananas",9],
["tomattoes",8],
["watermelon",1],
["mangos",7],
["grapes",11],
["potattoes",3]]
l.sort(key=lambda x: x[1])  # sort in place by the second element (ascending)
result = l[-3:]  # the three largest elements, in ascending order

Generating a list using another list and an index list

Suppose I have the following two lists and a smaller list of indices:
list1=[2,3,4,6,7]
list2=[0,0,0,0,0]
idx=[1,2]
I want to replace the values in list2 using the values in list1 at the specified indices.
I could do so using the following loop:
for i in idx:
    list2[i]=list1[i]
If I just have list1 and idx, how could I write a list comprehension to generate list2 (same length as list1) such that list2 has the values of list1 at the indices in idx and 0 elsewhere?
This calls __contains__ on idx once per element, which scans the list each time, but it should be reasonable for small(ish) lists.
list2 = [list1[i] if i in idx else 0 for i in range(len(list1))]
or
list2 = [e if i in idx else 0 for i, e in enumerate(list1)]
Also, do not write code like this. It is much less readable than your example. Furthermore, numpy may give you the kind of syntax you desire without sacrificing readability or speed.
import numpy as np
...
arr1 = np.array(list1)
arr2 = np.zeros_like(list1)
arr2[idx] = arr1[idx]
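If NumPy is not an option and idx is long, the comprehension shown earlier can avoid the repeated list scan by converting idx to a set first. This is a small sketch using the question's data; idx_set is a name introduced here.

```python
list1 = [2, 3, 4, 6, 7]
idx = [1, 2]

# A set gives constant-time membership tests on average.
idx_set = set(idx)
list2 = [value if i in idx_set else 0 for i, value in enumerate(list1)]
print(list2)  # [0, 3, 4, 0, 0]
```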
I assume that you want to generate list2 by appending the values of list1 at specific indexes. To do this, check whether idx contains any values, then use a for loop to append the corresponding list1 values to list2. If idx is empty, append only list1[0] to list2.
if(len(idx) > 0):
    for i in idx:
        list2.append(list1[i])
else:
    list2.append(list1[0])

Find indexes of common items in two python lists

I have two lists in Python, list_A and list_B, and I want to find the common items they share. My code to do so is the following:
both = []
for i in list_A:
    for j in list_B:
        if i == j:
            both.append(i)
In the end, the list both contains the common items. However, I also want to return the indexes of those elements in the two initial lists. How can I do so?
In Python it is generally advisable to avoid explicit for loops when a better method is available. You can find the common elements of the two lists efficiently with a set, as follows:
both = set(list_A).intersection(list_B)
Then you can find the indices using the built-in index method:
indices_A = [list_A.index(x) for x in both]
indices_B = [list_B.index(x) for x in both]
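Note that list.index returns only the first occurrence of an item and scans the list on every call. If repeated items matter, one possible approach is to record every position in a dictionary first. The sketch below uses made-up sample lists and a hypothetical helper named positions, and it assumes the items are hashable (the set approach above assumes the same).

```python
from collections import defaultdict

def positions(items):
    """Map each value to the list of indexes where it occurs."""
    found = defaultdict(list)
    for i, item in enumerate(items):
        found[item].append(i)
    return found

# Hypothetical sample lists.
list_A = ["a", "b", "c", "b"]
list_B = ["b", "d", "a"]

pos_A, pos_B = positions(list_A), positions(list_B)
# Keys present in both maps are the common items.
common = {item: (pos_A[item], pos_B[item]) for item in pos_A.keys() & pos_B.keys()}
print(common)
```

Here common maps "a" to ([0], [2]) and "b" to ([1, 3], [0]); the order of the keys in the printed dictionary is not guaranteed.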
Instead of iterating through the list, access elements by index:
both = []
for i in range(len(list_A)):
    for j in range(len(list_B)):
        if list_A[i] == list_B[j]:
            both.append((i,j))
Here i and j will take integer values and you can check values in list_A and list_B by index.
You can also get the common elements and their indexes with numpy.intersect1d():
import numpy as np
common_elements, a_indexes, b_indexes = np.intersect1d(list_A, list_B, return_indices=True)

adding to a variable in a nested list comprehension

I'm attempting to make a nested list comprehension, but I can't figure out how I should do it. Currently, I have a loop like this:
filtered = []
p = -1
for i in list:
    p += 1
    for k in list_of_lists[p]:
        if not k in filter:
            filtered.append(k)
While this works, it takes about 5-8 seconds to complete, which is nearly unacceptable for the circumstances in which it is used. I'm trying to turn it into a list comprehension, but I can't figure out how to do the p += 1 inside the list comprehension. I attempted this:
filtered = [i for i in list for k ind list_of_list[p], p+=1]
but it clearly doesn't work. I was wondering if there is any way to get around this.
I would flatten it and then convert it to a set, because you can't self-reference inside a list comprehension. The difference is that a set can only hold one of each item and order is not preserved.
list_of_lists = [["blue","green","red"],["red","yellow","white"],["orange","yellow","green"]]
filtered = set(y for x in list_of_lists for y in x)
print(filtered)
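If order and duplicates need to be kept, as in the original loop, a nested comprehension can do the flattening and filtering without the p counter. This is only a sketch: excluded is a hypothetical name standing in for the question's filter, and the sample data is made up. Storing the excluded items in a set probably helps the most with speed if filter was a list, since each membership test then no longer scans it.

```python
# Hypothetical data; `excluded` stands in for the question's `filter`.
list_of_lists = [["blue", "green", "red"], ["red", "yellow", "white"]]
excluded = {"red", "white"}

# Nested comprehension: outer loop first, inner loop second, condition last.
filtered = [item for sublist in list_of_lists for item in sublist
            if item not in excluded]
print(filtered)  # ['blue', 'green', 'yellow']
```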
