Remove sublist after first element appears n times [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I have a long nested list. Each sublist contains 2 elements. What I would like to do is iterate over the full list and remove sublists once I've found the first element more than 3 times.
Example:
ls = [[1,1], [1,2], [1,3], [1,4], [2,2], [2,3], [3,4], [3,5], [3,6], [3,7]]
desired_result = [[1,1], [1,2], [1,3], [2,2], [2,3], [3,4], [3,5], [3,6]]

If the input is sorted by the first element, you could use groupby and islice:
from itertools import groupby, islice
from operator import itemgetter
ls = [[1, 1], [1, 2], [1, 3], [1, 4], [2, 2], [2, 3], [3, 4], [3, 5], [3, 6], [3, 7]]
result = [e for _, group in groupby(ls, key=itemgetter(0)) for e in islice(group, 3)]
print(result)
Output
[[1, 1], [1, 2], [1, 3], [2, 2], [2, 3], [3, 4], [3, 5], [3, 6]]
The idea is to group the elements by the first value using groupby, and then fetch the first 3 values, if they exist, using islice.
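Note that groupby only groups consecutive items, which is why this approach depends on the input being sorted. A minimal sketch of the difference, wrapping the same expression in a hypothetical helper named take_first_n (not part of the original answer) and using made-up unsorted data:

```python
from itertools import groupby, islice
from operator import itemgetter


def take_first_n(pairs, n):
    """Keep at most n sublists per run of equal first elements (hypothetical helper)."""
    return [e for _, group in groupby(pairs, key=itemgetter(0)) for e in islice(group, n)]


sorted_ls = [[1, 1], [1, 2], [1, 3], [1, 4], [2, 2]]
unsorted_ls = [[1, 1], [2, 2], [1, 2], [1, 3], [1, 4]]

print(take_first_n(sorted_ls, 3))
# [[1, 1], [1, 2], [1, 3], [2, 2]]

# groupby starts a new group each time the key changes, so the key 1
# forms two separate runs here and ends up being kept four times.
print(take_first_n(unsorted_ls, 3))
# [[1, 1], [2, 2], [1, 2], [1, 3], [1, 4]]
```

If the input might not be sorted, sorting it first with sorted(ls, key=itemgetter(0)) should restore the expected behaviour, at the cost of changing the original order.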

You can do it like below:
ls = [[1,1], [1,2], [1,3], [1,4], [2,2], [2,3], [3,4], [3,5], [3,6], [3,7]]
val_count = dict.fromkeys(set([i[0] for i in ls]), 0)
new_ls = []
for i in ls:
    if val_count[i[0]] < 3:
        val_count[i[0]] += 1
        new_ls.append(i)
print(new_ls)
Output:
[[1, 1], [1, 2], [1, 3], [2, 2], [2, 3], [3, 4], [3, 5], [3, 6]]

Probably not the shortest answer.
The idea is to count occurrences while you're iterating over ls
from collections import defaultdict
filtered_ls = []
counter = defaultdict(int)
for l in ls:
    counter[l[0]] += 1
    if counter[l[0]] > 3:
        continue
    filtered_ls += [l]
print(filtered_ls)
# [[1, 1], [1, 2], [1, 3], [2, 2], [2, 3], [3, 4], [3, 5], [3, 6]]

You can use collections.defaultdict to aggregate by first value in O(n) time. Then use itertools.chain to construct a list of lists.
from collections import defaultdict
from itertools import chain
dd = defaultdict(list)
for key, val in ls:
    if len(dd[key]) < 3:
        dd[key].append([key, val])
res = list(chain.from_iterable(dd.values()))
print(res)
# [[1, 1], [1, 2], [1, 3], [2, 2], [2, 3], [3, 4], [3, 5], [3, 6]]
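One thing to be aware of: because the result is rebuilt from the dictionary values, the sublists come out grouped by key in first-seen order, which can differ from the input order when the input is not sorted. A small sketch with made-up data (the name data is illustrative only):

```python
from collections import defaultdict
from itertools import chain

# Hypothetical unsorted input, used only for illustration.
data = [[2, 9], [1, 1], [2, 8], [1, 2]]

dd = defaultdict(list)
for key, val in data:
    if len(dd[key]) < 3:
        dd[key].append([key, val])

# Dictionaries preserve insertion order (Python 3.7+), so key 2 comes first.
res = list(chain.from_iterable(dd.values()))
print(res)
# [[2, 9], [2, 8], [1, 1], [1, 2]]
```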

Ghillas BELHADJ's answer is good, but you should consider defaultdict for this task. The idea comes from Raymond Hettinger, who suggested using defaultdict for grouping and counting tasks:
from collections import defaultdict
def remove_sub_lists(a_list, nth_occurence):
    found = defaultdict(int)
    for sublist in a_list:
        first_index = sublist[0]
        found[first_index] += 1
        # yield the sublist only while its first element has been seen at most nth_occurence times
        if found[first_index] <= nth_occurence:
            yield sublist
max_3_times_first_index = list(remove_sub_lists(ls, 3))

If the list is already sorted, you can use itertools.groupby then just keep the first three items from each group
>>> import itertools
>>> ls = [[1,1], [1,2], [1,3], [1,4], [2,2], [2,3], [3,4], [3,5], [3,6], [3,7]]
>>> list(itertools.chain.from_iterable(list(g)[:3] for _,g in itertools.groupby(ls, key=lambda i: i[0])))
[[1, 1], [1, 2], [1, 3], [2, 2], [2, 3], [3, 4], [3, 5], [3, 6]]

Here's an option that doesn't use any modules:
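countDict = {}
kept = []
for i in ls:
    # count how many times this first element has been seen so far
    countDict[i[0]] = countDict.get(i[0], 0) + 1
    if countDict[i[0]] <= 3:
        kept.append(i)
# update ls in place; calling ls.remove(i) inside the loop would skip elements
ls[:] = kept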
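A caveat worth knowing with this kind of loop: removing items from a list while a for loop is iterating over it shifts the remaining items to the left, so the item right after each removal is never visited. A small demonstration with made-up data, followed by an illustrative helper (keep_first_n is a hypothetical name) that collects the kept items in a new list instead:

```python
# Hypothetical data: five sublists that all start with 1.
ls_demo = [[1, 1], [1, 2], [1, 3], [1, 4], [1, 5]]
counts = {}
for item in ls_demo:
    counts[item[0]] = counts.get(item[0], 0) + 1
    if counts[item[0]] > 3:
        ls_demo.remove(item)  # shifts later items left, so [1, 5] is never visited
print(ls_demo)
# [[1, 1], [1, 2], [1, 3], [1, 5]]


def keep_first_n(pairs, n):
    """Return a new list with at most n sublists per first element (illustrative helper)."""
    seen = {}
    kept = []
    for pair in pairs:
        seen[pair[0]] = seen.get(pair[0], 0) + 1
        if seen[pair[0]] <= n:
            kept.append(pair)
    return kept


print(keep_first_n([[1, 1], [1, 2], [1, 3], [1, 4], [1, 5]], 3))
# [[1, 1], [1, 2], [1, 3]]
```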

Related

How can i sum up all values with the same index in a dictionary which each key has a nested list as a value?

I have a dictionary, each key of dictionary has a list of list (nested list) as its value. What I want is imagine we have:
x = {1: [[1, 2], [3, 5]], 2: [[2, 1], [2, 6]], 3: [[1, 5], [5, 4]]}
My question is how can I access each element of the dictionary and concatenate those with the same index, for example the first list from all keys:
[1,2] from the first key +
[2,1] from the second and
[1,5] from the third one
How can I do this?

You can access the nested lists while iterating through your dictionary, append each one to a new list, and then apply the sum function.
Code:
x={1: [[1,2],[3,5]] , 2:[[2,1],[2,6]], 3:[[1,5],[5,4]]}
ans=[]
for key in x:
    ans += x[key][0]
print(sum(ans))
Output:
12

Assuming you want a list of the first elements, you can do:
>>> x={1: [[1,2],[3,5]] , 2:[[2,1],[2,6]], 3:[[1,5],[5,4]]}
>>> y = [a[0] for a in x.values()]
>>> y
[[1, 2], [2, 1], [1, 5]]
If you want the second element, you can use a[1], etc.

The output you expect is not entirely clear (do you want to sum? concatenate?), but what seems clear is that you want to handle the values as matrices.
You can use numpy for that:
summing the values
import numpy as np
sum(map(np.array, x.values())).tolist()
output:
[[4, 8], [10, 15]] # [[1+2+1, 2+1+5], [3+2+5, 5+6+4]]
concatenating the matrices (horizontally)
import numpy as np
np.hstack(list(map(np.array, x.values()))).tolist()
output:
[[1, 2, 2, 1, 1, 5], [3, 5, 2, 6, 5, 4]]
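If pulling in numpy is not desirable, the element-wise sum can also be sketched in plain Python with two levels of zip. This is an alternative to the numpy calls above rather than what the answer itself proposes; the name summed is illustrative:

```python
x = {1: [[1, 2], [3, 5]], 2: [[2, 1], [2, 6]], 3: [[1, 5], [5, 4]]}

# zip(*x.values()) pairs up the rows at the same index across all keys;
# zip(*rows) then pairs up the columns within those rows.
summed = [[sum(col) for col in zip(*rows)] for rows in zip(*x.values())]
print(summed)
# [[4, 8], [10, 15]]
```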

As explained in How to iterate through two lists in parallel?, zip does exactly that: iterates over a few iterables at the same time and generates tuples of matching-index items from all iterables.
In your case, the iterables are the values of the dict. So just unpack the values to zip:
x = {1: [[1, 2], [3, 5]], 2: [[2, 1], [2, 6]], 3: [[1, 5], [5, 4]]}
for y in zip(*x.values()):
    print(y)
Gives:
([1, 2], [2, 1], [1, 5])
([3, 5], [2, 6], [5, 4])
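From those tuples, concatenating or summing the lists at each index is a short step. A minimal sketch, assuming "concatenate" means joining the matching-index lists end to end (the names concatenated and totals are illustrative):

```python
from itertools import chain

x = {1: [[1, 2], [3, 5]], 2: [[2, 1], [2, 6]], 3: [[1, 5], [5, 4]]}

# Join the lists that share an index into one flat list per index.
concatenated = [list(chain.from_iterable(group)) for group in zip(*x.values())]
print(concatenated)
# [[1, 2, 2, 1, 1, 5], [3, 5, 2, 6, 5, 4]]

# Sum each of those flat lists.
totals = [sum(row) for row in concatenated]
print(totals)
# [12, 25]
```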

Removing duplicates from a 4D list

I'm working with a 4D list and I'm trying to remove some duplicate inner lists. I have tried something, but it's not working as intended. Here is my code:
mylist = [[[], [[4, 3], [4, 3]], [[3, 2], [2, 3], [3, 4]]], [[[4, 2], [2, 3]], [[4, 3], [4, 3]], [[3, 2], [2, 3], [3, 4]]]]
final_list = []
for i in mylist:
    current = []
    for j in i:
        for k in j:
            for l in zip(k, k[1:]):
                if list(l) not in current:
                    current.append(list(l))
    final_list.append(current)
print(final_list)
final_list = [[[3, 2], [2, 3], [3, 4]], [[3, 2], [2, 3], [3, 4]]]
So instead of removing elements, I end up appending the values that are the same. This should be my desired output, with the duplicate [4, 3] removed from the second sublist of both outer lists:
final_list = [[[], [[4, 3]], [[3, 2], [2, 3], [3, 4]]], [[[4, 2], [2, 3]], [[4, 3]], [[3, 2], [2, 3], [3, 4]]]]
I think there should be an easy way, too many nested for loops, so any help would be appreciated, thank you so much!

You can try this with itertools.groupby:
import itertools
final_list=[[list(sbls for sbls,_ in itertools.groupby(sbls)) for sbls in ls] for ls in mylist]
Same as:
final_list=[[[sbls[i] for i in range(len(sbls)) if i == 0 or sbls[i] != sbls[i-1]] for sbls in ls] for ls in mylist]
Both outputs:
final_list
[[[], [[4, 3]], [[3, 2], [2, 3], [3, 4]]],
[[[4, 2], [2, 3]], [[4, 3]], [[3, 2], [2, 3], [3, 4]]]]
It can be done manually as well, with for loops, similar to your original approach:
flist=[]
for ls in mylist:
    new_ls=[]
    for sbls in ls:
        new_sbls = []
        for elem in sbls:
            if elem not in new_sbls:
                new_sbls.append(elem)
        new_ls.append(new_sbls)
    flist.append(new_ls)
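The two approaches give the same result for this input, but they are not equivalent in general: groupby only collapses adjacent duplicates, while the manual loop drops every repeated element. A small sketch with a made-up sublist that contains a non-adjacent duplicate:

```python
from itertools import groupby

# Hypothetical sublist: [4, 3] appears twice, but not next to each other.
sbls = [[4, 3], [1, 2], [4, 3]]

# groupby only merges runs of equal neighbours, so nothing is removed here.
consecutive_only = [key for key, _ in groupby(sbls)]
print(consecutive_only)
# [[4, 3], [1, 2], [4, 3]]

# The manual membership check removes the later occurrence as well.
deduped = []
for elem in sbls:
    if elem not in deduped:
        deduped.append(elem)
print(deduped)
# [[4, 3], [1, 2]]
```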

You could use itertools.chain twice to reduce your list to a two-dimensional list. Then you can search for duplicates, e.g. by using count to count the number of occurrences (there are many solutions for this). Once you have found all your duplicate entries, iterate over your original list and remove all but one occurrence of each duplicate:
import itertools
flat_list = itertools.chain(*itertools.chain(*mylist))
# TODO find duplicates in flat list
duplicates = ...
# TODO remove all duplicates from the original list
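The answer leaves the duplicate search open. One possible way to fill in that first step, assuming collections.Counter is acceptable and that the innermost pairs can be converted to tuples so they become hashable (this is a sketch, not the answerer's own code):

```python
from collections import Counter
from itertools import chain

mylist = [[[], [[4, 3], [4, 3]], [[3, 2], [2, 3], [3, 4]]],
          [[[4, 2], [2, 3]], [[4, 3], [4, 3]], [[3, 2], [2, 3], [3, 4]]]]

# Flatten two levels so that only the innermost pairs remain.
flat_list = list(chain.from_iterable(chain.from_iterable(mylist)))

# Lists are not hashable, so count tuples instead.
counts = Counter(tuple(pair) for pair in flat_list)
duplicates = [list(pair) for pair, count in counts.items() if count > 1]
print(duplicates)
# [[4, 3], [3, 2], [2, 3], [3, 4]]
```

Note that this counts duplicates across the whole structure, whereas the desired output in the question only removes repeats within each innermost group.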

Combination of betting odds in Python

So I'm new to Python and I've decided to work on a project that I'm interested in. I've connected to an API to get betting odds from different bookies. I've successfully got the data and stored in a Sqlite3 database. The next step is to compare the odds, and this is where I'm getting stuck.
So let's say I have a list of odds from 3 bookies:
bookie1 = [1,2]
bookie2 = [3,4]
bookie3 = [5,6]
then I have the odds from all bookies in 1 list, such as:
bookies_all = [ [1,2], [3,4], [5,6] ]
How do I get the combinations of odds from the 3 bookies?
I expect the output to look something like this:
combos = [[1,3], [1,5], [1,4], [1,6], [2,3], [2,5], [2,4], [2,6], [3,5], [3,6],[4,5], [4,6]]
Is the best option to loop through the list?

I've coded this up and it gives me all the combinations I need.
bookies_all = [[1, 2], [3, 4], [5, 6]]
combos = []
count = 0
for outer in bookies_all:
    for inner in bookies_all:
        temp_list = [outer[0], inner[1]]
        count += 1
        combos.append(temp_list)
print(combos)
Output: [[1, 2], [1, 4], [1, 6], [3, 2], [3, 4], [3, 6], [5, 2], [5, 4], [5, 6]]
The combinations in bold are the ones I want. This code works for this example.
I will test it out for scenarios where the bookies_all list has more values.

You can use itertools.combinations to find the combinations of bookies, then use a list comprehension to interleave the items:
from itertools import combinations
bookies_all = [[1, 2], [3, 4], [5, 6]]
all_comb = list(combinations(bookies_all, 2))
#print(all_comb)
combos = [[i, j] for c in all_comb for i in c[0] for j in c[1]]
print(combos)
Output:
[[1, 3], [1, 4], [2, 3], [2, 4], [1, 5], [1, 6], [2, 5], [2, 6], [3, 5], [3, 6], [4, 5], [4, 6]]
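The nested loops over c[0] and c[1] form a Cartesian product of the two bookies' odds, so itertools.product could be used for that part as well. A sketch of this variant, which should produce the same list in the same order:

```python
from itertools import combinations, product

bookies_all = [[1, 2], [3, 4], [5, 6]]

# For every pair of bookies, pair each of the first bookie's odds
# with each of the second bookie's odds.
combos = [list(pair)
          for first, second in combinations(bookies_all, 2)
          for pair in product(first, second)]
print(combos)
# [[1, 3], [1, 4], [2, 3], [2, 4], [1, 5], [1, 6], [2, 5], [2, 6], [3, 5], [3, 6], [4, 5], [4, 6]]
```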

Creating a unique list with the each subsequent item from a series of lists

How can I create a new list combining the first values of my old lists, then the second ones, and so on?
list_1 = [1,2,3,4]
list_2 = [1,2,3,4]
list_3 = [1,2,3,4]
new_list = [[1,1,1],[2,2,2],[3,3,3],[4,4,4]]

Pure python:
You can use zip:
new_list = list(map(list, zip(list_1,list_2,list_3)))
>>> new_list
[[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
Alternative:
numpy:
import numpy as np
new_list = np.array([list_1,list_2,list_3]).T.tolist()
>>> new_list
[[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]

Here's another way to do it using list comprehensions.
new_list = [list(args) for args in zip(list_1, list_2, list_3)]

If we enumerate one list, we can apply the index from that list to all three:
new_list = [[list_1[i], list_2[i], list_3[i]] for i, _ in enumerate(list_1)]
# [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
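All of the approaches above assume the lists have the same length. If they might not, zip silently stops at the shortest list, while itertools.zip_longest pads the missing positions. A small sketch with made-up lists of unequal length:

```python
from itertools import zip_longest

# Hypothetical lists of different lengths.
list_1 = [1, 2, 3, 4]
list_2 = [1, 2, 3]
list_3 = [1, 2]

# zip stops as soon as the shortest input is exhausted.
truncated = [list(t) for t in zip(list_1, list_2, list_3)]
print(truncated)
# [[1, 1, 1], [2, 2, 2]]

# zip_longest keeps going and fills the gaps with fillvalue.
padded = [list(t) for t in zip_longest(list_1, list_2, list_3, fillvalue=None)]
print(padded)
# [[1, 1, 1], [2, 2, 2], [3, 3, None], [4, None, None]]
```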

Filtering the two first matching elements in a list

I have a list of lists sorted in an ascending order, similar to this one:
input = [[1,1],[1,2],[1,3],[1,4],[2,1],[2,2],[2,3],[3,1],[6,1],[6,2]]
I want to filter this list so that the new list would only contain the first two (or the only) element with matching integers in position 0, like so:
output = [[1,1],[1,2],[2,1],[2,2],[3,1],[6,1],[6,2]]
It would be ideal if the remaining elements (the ones which did not meet the criteria) would remain on the input list, while the matching elements would be stored separately.
How do I go about doing this?
Thank you in advance!
Edit: The elements on the index 1 could be virtually any integers, e.g. [[1,6],[1,7],[1,8],[2,1],[2,2]]

Pandas
Although this is a bit overkill, we can use pandas for this:
import pandas as pd
pd.DataFrame(d).groupby(0).head(2).values.tolist()
With d the original list. This then yields:
>>> pd.DataFrame(d).groupby(0).head(2).values.tolist()
[[1, 1], [1, 2], [2, 1], [2, 2], [3, 1], [6, 1], [6, 2]]
Note that this will return copies of the lists, not the original lists. Furthermore all the rows should have the same number of items.
Itertools groupby and islice
If the list is ordered lexicographically, then we can use itertools.groupby:
from operator import itemgetter
from itertools import groupby, islice
[e for _, g in groupby(d, itemgetter(0)) for e in islice(g, 2)]
this again yields:
>>> [e for _, g in groupby(d, itemgetter(0)) for e in islice(g, 2)]
[[1, 1], [1, 2], [2, 1], [2, 2], [3, 1], [6, 1], [6, 2]]
It is also more flexible since we copy the reference to the list, and all lists can have a different number of elements (at least one here).
EDIT
The rest of the values can be obtained by letting islice work the opposite way, retaining everything but the first two:
[e for _, g in groupby(d, itemgetter(0)) for e in islice(g, 2, None)]
we then obtain:
>>> [e for _, g in groupby(d, itemgetter(0)) for e in islice(g, 2, None)]
[[1, 3], [1, 4], [2, 3]]
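Since the question asks for both the matching elements and the remainder, the two lists can also be built in a single pass with a counter. This is a sketch of an alternative that does not rely on the input being sorted; the names kept and rest are illustrative:

```python
from collections import defaultdict

d = [[1, 1], [1, 2], [1, 3], [1, 4], [2, 1], [2, 2], [2, 3], [3, 1], [6, 1], [6, 2]]

seen = defaultdict(int)  # how many times each first element has appeared so far
kept, rest = [], []
for item in d:
    seen[item[0]] += 1
    # The first two sublists per key go to kept, everything after that to rest.
    (kept if seen[item[0]] <= 2 else rest).append(item)

print(kept)
# [[1, 1], [1, 2], [2, 1], [2, 2], [3, 1], [6, 1], [6, 2]]
print(rest)
# [[1, 3], [1, 4], [2, 3]]
```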

You could also use a collections.defaultdict to group the sublists by the first index:
from collections import defaultdict
from pprint import pprint
input_lst = [[1,1],[1,2],[1,3],[1,4],[2,1],[2,2],[2,3],[3,1],[6,1],[6,2]]
groups = defaultdict(list)
for lst in input_lst:
    key = lst[0]
    groups[key].append(lst)
pprint(groups)
Which gives this grouped dictionary:
defaultdict(<class 'list'>,
            {1: [[1, 1], [1, 2], [1, 3], [1, 4]],
             2: [[2, 1], [2, 2], [2, 3]],
             3: [[3, 1]],
             6: [[6, 1], [6, 2]]})
Then you could just take the first two [:2] values from each key, and make sure the result is flattened and sorted in the end:
from itertools import chain
result = sorted(chain.from_iterable(x[:2] for x in groups.values()))
print(result)
Which outputs:
[[1, 1], [1, 2], [2, 1], [2, 2], [3, 1], [6, 1], [6, 2]]
