Combinations Including Select Elements (Python) - python

In order to make the set of all combinations of numbers 0 to x, with length y, we do:
from itertools import combinations
list_of_combinations=list(combinations(range(0,x+1),y))
list_of_combinations=map(list,list_of_combinations)
print list_of_combinations
This will output the result as a list of lists.
For example, x=4, y=3:
[[0, 1, 2], [0, 1, 3], [0, 1, 4], [0, 2, 3], [0, 2, 4], [0, 3, 4], [1, 2, 3], [1, 2, 4],
[1, 3, 4], [2, 3, 4]]
I am trying to do the above, but only outputting lists that have 2 members chosen beforehand.
For instance, I would like to only output the set of the combos that has 1 and 4 inside it. The output would then be (for x=4, y=3):
[[0, 1, 4], [1, 2, 4], [1, 3, 4]]
The best approach I have now is to make a list that is y-2 length with all numbers of the set without the chosen numbers, and then append the chosen numbers, but this seems very inefficient. Any help appreciated.
*Edit: I am doing this for large x and y, so I can't just write out all the combos and then search for the selected elements, I need to find a better method.

combinations() returns an iterable, so loop over that while producing the list:
[list(combo) for combo in combinations(range(x + 1), y) if 1 in combo]
This produces one list, the list of all combinations that match the criteria.
Demo:
>>> from itertools import combinations
>>> x, y = 4, 3
>>> [list(combo) for combo in combinations(range(x + 1), y) if 1 in combo]
[[0, 1, 2], [0, 1, 3], [0, 1, 4], [1, 2, 3], [1, 2, 4], [1, 3, 4]]
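Since the question asks for combinations containing both 1 and 4, the same filter can check a whole set of required values at once. A minimal sketch for Python 3 (the names combos_containing and required are illustrative, not from the answer above). It still enumerates every combination, so it does not address the large x and y concern:

```python
from itertools import combinations

def combos_containing(x, y, required):
    """Return the y-length combinations of 0..x that contain every required value."""
    required = set(required)
    return [list(combo)
            for combo in combinations(range(x + 1), y)
            if required.issubset(combo)]

print(combos_containing(4, 3, {1, 4}))  # [[0, 1, 4], [1, 2, 4], [1, 3, 4]]
```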
The alternative would be to produce y - 1 combinations of range(x + 1) with 1 removed, then adding 1 back in (using bisect.insort() to avoid having to sort afterwards):
import bisect
def combinations_with_guaranteed(x, y, *guaranteed):
    values = set(range(x + 1))
    values.difference_update(guaranteed)
    for combo in combinations(sorted(values), y - len(guaranteed)):
        combo = list(combo)
        for value in guaranteed:
            bisect.insort(combo, value)
        yield combo
then loop over that generator:
>>> list(combinations_with_guaranteed(4, 3, 1))
[[0, 1, 2], [0, 1, 3], [0, 1, 4], [1, 2, 3], [1, 2, 4], [1, 3, 4]]
>>> list(combinations_with_guaranteed(4, 3, 1, 2))
[[0, 1, 2], [1, 2, 3], [1, 2, 4]]
This avoids generating combinations only for the filter to discard them again.
It may well be that for larger values of y and more guaranteed numbers, just using yield sorted(combo + list(guaranteed)) is going to beat repeated bisect.insort() calls.
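A rough sketch of that sorted() variant, written for Python 3 and applied to the question's case of guaranteeing both 1 and 4. The function name is made up for illustration, and whether it is actually faster than bisect.insort() would need to be measured:

```python
from itertools import combinations

def combinations_with_guaranteed_sorted(x, y, *guaranteed):
    """Yield sorted y-length combinations of 0..x that contain all guaranteed values."""
    remaining = sorted(set(range(x + 1)) - set(guaranteed))
    fixed = list(guaranteed)
    for combo in combinations(remaining, y - len(fixed)):
        # One sort per combination instead of one insort per guaranteed value.
        yield sorted(list(combo) + fixed)

print(list(combinations_with_guaranteed_sorted(4, 3, 1, 4)))
# [[0, 1, 4], [1, 2, 4], [1, 3, 4]]
```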

This should do the trick:
filtered_list = filter(lambda x: 1 in x and 4 in x, list_of_combinations)
To make your code nicer (by using more generators), I'd use this:
combs = combinations(xrange(0, x+1), y)
filtered_list = map(list, filter(lambda x: 1 in x and 4 in x, combs))
If you don't need the filtered_list to be a list and it can be an iterable, you could even do
from itertools import ifilter, imap, combinations
combs = combinations(xrange(0, x+1), y)
filtered_list = imap(list, ifilter(lambda x: 1 in x and 4 in x, combs))
filtered_list.next()
> [0, 1, 4]
filtered_list.next()
> [1, 2, 4]
filtered_list.next()
> [1, 3, 4]
filtered_list.next()
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> StopIteration
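The snippets above are Python 2 (xrange, ifilter, imap, .next()). In Python 3, filter() and map() are already lazy, so a rough equivalent could look like the sketch below. The helper name iter_combos_with is hypothetical:

```python
from itertools import combinations

def iter_combos_with(x, y, first, second):
    """Lazily yield y-length combinations of 0..x containing both first and second."""
    combs = combinations(range(x + 1), y)
    # filter() and map() return iterators in Python 3, so nothing is built up front.
    return map(list, filter(lambda c: first in c and second in c, combs))

filtered = iter_combos_with(4, 3, 1, 4)
print(next(filtered))  # [0, 1, 4]
print(next(filtered))  # [1, 2, 4]
print(list(filtered))  # [[1, 3, 4]]
```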

Related

Python: how to splice a list into sublists of given lengths?

x = [2, 1, 2, 0, 1, 2, 2]
I want to splice the above list into sublists of length = [1, 2, 3, 1]. In other words, I want my output to look something like this:
[[2], [1, 2], [0, 1, 2], [2]]
where my first sublist is of length 1, the second sublist is of length 2, and so forth.
You can use itertools.islice here to consume N many elements of the source list each iteration, eg:
from itertools import islice
x = [2, 1, 2, 0, 1, 2, 2]
length = [1, 2, 3, 1]
# get an iterable to consume x
it = iter(x)
new_list = [list(islice(it, n)) for n in length]
Gives you:
[[2], [1, 2], [0, 1, 2], [2]]
Basically we want to extract sublists of certain lengths.
For that we need a start_index and an end_index. The end_index is your start_index + the current length which we want to extract:
x = [2, 1, 2, 0, 1, 2, 2]
lengths = [1,2,3,1]
res = []
start_index = 0
for length in lengths:
    res.append(x[start_index:start_index+length])
    start_index += length
print(res) # [[2], [1, 2], [0, 1, 2], [2]]
I added this solution alongside the other answer because it does not need any imported modules.
You can use the following listcomp:
from itertools import accumulate
x = [2, 1, 2, 0, 1, 2, 2]
length = [1, 2, 3, 1]
[x[i - j: i] for i, j in zip(accumulate(length), length)]
# [[2], [1, 2], [0, 1, 2], [2]]
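All three approaches quietly accept lengths that do not add up to len(x): extra elements are left out, or the last sublists come back short. If that matters, a small wrapper can check first. This is only a sketch, and the name split_by_lengths is made up here:

```python
from itertools import islice

def split_by_lengths(items, lengths):
    """Split items into consecutive sublists with the given lengths."""
    if sum(lengths) != len(items):
        raise ValueError("lengths must add up to len(items)")
    it = iter(items)
    return [list(islice(it, n)) for n in lengths]

print(split_by_lengths([2, 1, 2, 0, 1, 2, 2], [1, 2, 3, 1]))
# [[2], [1, 2], [0, 1, 2], [2]]
```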

Python groupby to split list by delimiter

I am pretty new to Python (3.6) and struggling to understand itertools groupby.
I've got the following list containing integers:
list1 = [1, 2, 0, 2, 3, 0, 4, 5, 0]
But the list could also be much longer and the '0' doesn't have to appear after every pair of numbers. It can also be after 3, 4 or more numbers. My goal is to split this list into sublists where the '0' is used as a delimiter and doesn't appear in any of these sublists.
list2 = [[1, 2], [2, 3], [4, 5]]
A similar problem has been solved here already:
Python spliting a list based on a delimiter word
Answer 2 seemed to help me a lot but unfortunately it only gave me a TypeError.
import itertools as it
list1 = [1, 2, 0, 2, 3, 0, 4, 5, 0]
list2 = [list(group) for key, group in it.groupby(list1, lambda x: x == 0) if not key]
print(list2)
File "H:/Python/work/ps0001/example.py", line 13, in
list2 = [list(group) for key, group in it.groupby(list, lambda x: x == '0') if not key]
TypeError: 'list' object is not callable
I would appreciate any help and be very happy to finally understand groupby.
You were checking for "0" (str) but you only have 0 (int) in your list. Also, you were using list as a variable name for your first list, which shadows the built-in list type in Python; that is why list(group) raised the TypeError.
from itertools import groupby
list1 = [1, 2, 0, 2, 7, 3, 0, 4, 5, 0]
list2 = [list(group) for key, group in groupby(list1, lambda x: x == 0) if not key]
print(list2)
This should give you:
[[1, 2], [2, 7, 3], [4, 5]]
In your code, you need to change lambda x: x == '0' to lambda x: x == 0, since you're working with a list of int, not a list of str.
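To see what groupby is doing here, it can help to print every run together with its key before filtering. groupby starts a new group each time the key function's result changes, so the delimiter runs show up with key True and the data runs with key False. A small illustration (the variable name runs is just for this example):

```python
from itertools import groupby

list1 = [1, 2, 0, 2, 3, 0, 4, 5, 0]
# Keep the key next to each group instead of discarding the delimiter groups.
runs = [(key, list(group)) for key, group in groupby(list1, lambda x: x == 0)]
print(runs)
# [(False, [1, 2]), (True, [0]), (False, [2, 3]), (True, [0]), (False, [4, 5]), (True, [0])]
```

The "if not key" filter in the comprehension then simply drops the (True, [0]) entries.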
Since others have shown how to improve your solution with itertools.groupby, you can also do this task with no libraries:
>>> list1 = [1, 2, 0, 2, 3, 0, 4, 5, 0]
>>> zeroes = [-1] + [i for i, e in enumerate(list1) if e == 0]
>>> result = [list1[zeroes[i] + 1: zeroes[i + 1]] for i in range(len(zeroes) - 1)]
>>> print(result)
[[1, 2], [2, 3], [4, 5]]
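You can use string replacement together with ast.literal_eval for this (note that it depends on the exact text produced by str() for the list):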
>>> import ast
>>> your_list = [1, 2, 0, 2, 3, 0, 4, 5, 0]
>>> a_list = str(your_list).replace(', 0,', '], [').replace(', 0]', ']')
>>> your_result = ast.literal_eval(a_list)
>>> your_result
([1, 2], [2, 3], [4, 5])
>>> your_result[0]
[1, 2]
>>>
Or a single line solution:
ast.literal_eval(str(your_list).replace(', 0,', '], [').replace(', 0]', ']'))
You could do that within a Loop as depicted in the commented Snippet below:
list1 = [1, 2, 0, 2, 3, 0, 4, 5, 0]
tmp,result = ([],[]) # tmp HOLDS A TEMPORARY LIST :: result => RESULT
for i in list1:
    if not i:
        # CURRENT VALUE IS 0 SO WE BUILD THE SUB-LIST
        result.append(tmp)
        # RE-INITIALIZE THE tmp VARIABLE
        tmp = []
    else:
        # SINCE CURRENT VALUE IS NOT 0, WE POPULATE THE tmp LIST
        tmp.append(i)
print(result) # [[1, 2], [2, 3], [4, 5]]
Effectively:
list1 = [1, 2, 0, 2, 3, 0, 4, 5, 0]
tmp,result = ([],[]) # tmp HOLDS A TEMPORARY LIST
for i in list1:
    if not i:
        result.append(tmp); tmp = []
    else:
        tmp.append(i)
print(result) # [[1, 2], [2, 3], [4, 5]]
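One thing to watch with this loop: if the list does not end with 0, the final group stays in tmp and never reaches result. A possible generator variant that also emits a trailing group is sketched below. The name split_on and its delimiter parameter are illustrative, and it compares against the delimiter explicitly rather than testing truthiness:

```python
def split_on(items, delimiter=0):
    """Yield the runs of items found between occurrences of delimiter."""
    current = []
    for item in items:
        if item == delimiter:
            yield current
            current = []
        else:
            current.append(item)
    if current:
        # Emit whatever is left when the input does not end with the delimiter.
        yield current

print(list(split_on([1, 2, 0, 2, 3, 0, 4, 5, 0])))  # [[1, 2], [2, 3], [4, 5]]
print(list(split_on([1, 2, 0, 2, 3, 0, 4, 5])))     # [[1, 2], [2, 3], [4, 5]]
```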
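Use zip to get a list of tuples and convert them to lists afterwards. Note that this relies on the 0 appearing at every third position, so it does not handle groups of varying length (the output shown is from Python 2, where zip returns a list):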
>>> a
[1, 2, 0, 2, 3, 0, 4, 5, 0]
>>> a[0::3]
[1, 2, 4]
>>> a[1::3]
[2, 3, 5]
>>> zip(a[0::3],a[1::3])
[(1, 2), (2, 3), (4, 5)]
>>> [list(i) for i in zip(a[0::3],a[1::3])]
[[1, 2], [2, 3], [4, 5]]
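Try using join and then splitting on '0'. Note that this only works when every value is a single digit, and a trailing 0 in the input leaves an empty list at the end of the result: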
lst = [1, 2, 0, 2, 3, 0, 4, 5, 0]
lst_string = "".join([str(x) for x in lst])
lst2 = lst_string.split('0')
lst3 = [list(y) for y in lst2]
lst4 = [list(map(int, z)) for z in lst3]
print(lst4)

randomly partition list without duplicates

I've got an array that contains each of a set of numbers n times. Example with n=2:
[0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
What I would like is a partition of this array in which the members of the partition:
- contain elements that are drawn randomly from the array
- contain no duplicates
- contain the same number of elements (up to rounding) k
Example output for k=4:
[[3,0,2,1], [0,1,4,2], [3,4]]
Invalid output for k=4:
[[3,0,2,2], [3,1,4,0], [1,4]]
(this is a partition but the first element of the partition contains duplicates)
What's the most pythonic way of achieving this?
A combination of collections.Counter and random.sample can be used:
from collections import Counter
import random
def random_partition(seq, k):
    cnts = Counter(seq)
    # as long as there are enough items to "sample" take a random sample
    while len(cnts) >= k:
        sample = random.sample(list(cnts), k)
        cnts -= Counter(sample)
        yield sample
    # Fewer different items than the sample size, just return the unique
    # items until the Counter is empty
    while cnts:
        sample = list(cnts)
        cnts -= Counter(sample)
        yield sample
This is a generator that yields the samples, so you can simply cast it to a list:
>>> l = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
>>> list(random_partition(l, 4))
[[1, 0, 2, 4], [1, 0, 2, 3], [3, 4]]
>>> list(random_partition(l, 2))
[[1, 0], [3, 0], [1, 4], [2, 3], [4, 2]]
>>> list(random_partition(l, 6))
[[0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]
>>> list(random_partition(l, 4))
[[4, 1, 0, 3], [1, 3, 4, 0], [2], [2]]
The last case shows that this method can give weird results if the "random" part in the function returns the "wrong" samples. If that shouldn't happen or at least not often you need to figure out how the samples could be weighted (for example using random.choices) to minimize that possibility.
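One possible way to avoid the lopsided case, sketched here as a different technique rather than the random.choices weighting mentioned above: always take the k values with the most copies remaining and break ties randomly. The function name is hypothetical, and the result is less random than pure sampling because values with more remaining copies are always preferred:

```python
from collections import Counter
import random

def balanced_random_partition(seq, k):
    """Yield duplicate-free chunks of up to k items, using up frequent values first."""
    counts = Counter(seq)
    while counts:
        candidates = list(counts)
        random.shuffle(candidates)  # random order decides ties
        # The sort is stable, so equal counts keep their shuffled order.
        candidates.sort(key=lambda value: counts[value], reverse=True)
        chunk = candidates[:k]
        random.shuffle(chunk)       # do not expose the count ordering
        counts -= Counter(chunk)    # in-place subtraction drops exhausted values
        yield chunk

l = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
parts = list(balanced_random_partition(l, 4))
print([len(p) for p in parts])  # [4, 4, 2]
```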

Split a list into increasing sequences using itertools

I have a list with mixed sequences like
[1,2,3,4,5,2,3,4,1,2]
I want to know how I can use itertools to split the list into increasing sequences cutting the list at decreasing points. For instance the above would output
[[1, 2, 3, 4, 5], [2, 3, 4], [1, 2]]
This is obtained by noting that the sequence first decreases at 2, so we cut the first piece there; the next decrease is at 1, so we cut again there.
Another example is with the sequence
[3,2,1]
the output should be
[[3], [2], [1]]
In the event that the given sequence is increasing we return the same sequence. For example
[1,2,3]
returns the same result. i.e
[[1, 2, 3]]
For a repeating list like
[ 1, 2,2,2, 1, 2, 3, 3, 1,1,1, 2, 3, 4, 1, 2, 3, 4, 5, 6]
the output should be
[[1, 2, 2, 2], [1, 2, 3, 3], [1, 1, 1, 2, 3, 4], [1, 2, 3, 4, 5, 6]]
What I did to achieve this is define the following function
def splitter (L):
    result = []
    tmp = 0
    initialPoint=0
    for i in range(len(L)):
        if (L[i] < tmp):
            tmpp = L[initialPoint:i]
            result.append(tmpp)
            initialPoint=i
        tmp = L[i]
    result.append(L[initialPoint:])
    return result
The function is working 100% but what I need is to do the same with itertools so that I can improve efficiency of my code. Is there a way to do this with itertools package to avoid the explicit looping?
With numpy, you can use numpy.split, which takes the indices at which to split. Since you want to split wherever the value decreases, use numpy.diff to compute the differences between consecutive elements, check where a difference is smaller than zero, and use numpy.where to retrieve the corresponding indices (adding 1 so that each split falls just after the drop). An example with the last case in the question:
import numpy as np
lst = [ 1, 2,2,2, 1, 2, 3, 3, 1,1,1, 2, 3, 4, 1, 2, 3, 4, 5, 6]
np.split(lst, np.where(np.diff(lst) < 0)[0] + 1)
# [array([1, 2, 2, 2]),
# array([1, 2, 3, 3]),
# array([1, 1, 1, 2, 3, 4]),
# array([1, 2, 3, 4, 5, 6])]
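Psidom already has you covered with a good answer, but another solution, using SciPy together with NumPy, would be to use scipy.signal.argrelmax to acquire the local maxima, then np.split. Note that argrelmax requires a strict inequality on both sides of a value, so flat maxima (runs of repeated values such as 2, 2, 2) are not detected:
import numpy as np
from scipy.signal import argrelmax
arr = np.random.randint(1000, size=10**6)
splits = np.split(arr, argrelmax(arr)[0]+1)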
Assume your original input array:
a = [1, 2, 3, 4, 5, 2, 3, 4, 1, 2]
First find the places where the splits shall occur:
p = [ i+1 for i, (x, y) in enumerate(zip(a, a[1:])) if x > y ]
Then create slices for each such split:
print([ a[m:n] for m, n in zip([ 0 ] + p, p + [ None ]) ])
This will print this:
[[1, 2, 3, 4, 5], [2, 3, 4], [1, 2]]
I suggest using more descriptive names than p, n, m, etc. ;-)
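Since the question asked specifically for itertools, here is one way it could be done without an explicit index loop: number the runs with accumulate, then group on that number with groupby. This is only a sketch (split_increasing is a made-up name), and it is not claimed to be faster than the original splitter function:

```python
from itertools import accumulate, chain, groupby

def split_increasing(seq):
    """Split seq into non-decreasing runs, cutting wherever a value drops."""
    seq = list(seq)
    # 1 at each position where the value is smaller than its predecessor, else 0.
    drops = chain([0], (int(b < a) for a, b in zip(seq, seq[1:])))
    # The running total of drops gives every element the id of its run.
    run_ids = accumulate(drops)
    return [[value for _, value in group]
            for _, group in groupby(zip(run_ids, seq), key=lambda pair: pair[0])]

print(split_increasing([1, 2, 3, 4, 5, 2, 3, 4, 1, 2]))
# [[1, 2, 3, 4, 5], [2, 3, 4], [1, 2]]
print(split_increasing([3, 2, 1]))
# [[3], [2], [1]]
```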

Sum of two nested lists

I have two nested lists:
a = [[1, 1, 1], [1, 1, 1]]
b = [[2, 2, 2], [2, 2, 2]]
I want to make:
c = [[3, 3, 3], [3, 3, 3]]
I have been referencing the zip documentation, and researching other posts, but don't really understand how they work. Any help would be greatly appreciated!
You may use list comprehension with zip() as:
>>> a = [[1, 1, 1], [1, 1, 1]]
>>> b = [[2, 2, 2], [2, 2, 2]]
>>> [[i1+j1 for i1, j1 in zip(i,j)] for i, j in zip(a, b)]
[[3, 3, 3], [3, 3, 3]]
More generic way is to create a function as:
def my_sum(*nested_lists):
    return [[sum(items) for items in zip(*zipped_list)] for zipped_list in zip(*nested_lists)]
which can accept any number of lists. Sample run:
>>> a = [[1, 1, 1], [1, 1, 1]]
>>> b = [[2, 2, 2], [2, 2, 2]]
>>> c = [[3, 3, 3], [3, 3, 3]]
>>> my_sum(a, b, c)
[[6, 6, 6], [6, 6, 6]]
If you're going to do this a whole bunch, you'll be better off using numpy:
import numpy as np
a = [[1, 1, 1], [1, 1, 1]]
b = [[2, 2, 2], [2, 2, 2]]
aa = np.array(a)
bb = np.array(b)
c = aa + bb
Working with numpy arrays will be much more efficient than repeated uses of zip on lists. On top of that, numpy allows you to work with arrays much more expressively so the resulting code is usually much easier to read.
If you don't want the third party dependency, you'll need to do something a little different:
c = []
for a_sublist, b_sublist in zip(a, b):
    c.append([a_sublist_item + b_sublist_item for a_sublist_item, b_sublist_item in zip(a_sublist, b_sublist)])
Hopefully the variable names make it clear enough what is going on here, but basically, each zip takes the inputs and combines them (one element from each input). We need 2 zips here -- the outermost zip pairs lists from a with lists from b whereas the inner zip pairs up individual elements from the sublists that were already paired.
I use Python's built-in function map() to do this (the outputs below are from Python 2, where map() returns a list).
If I have simple lists a and b, I sum them this way:
>>> a = [1,1,1]
>>> b = [2,2,2]
>>> map(lambda x, y: x + y, a, b)
[3, 3, 3]
If I have nested list a and b, I sum them as a similar way:
>>> a = [[1, 1, 1], [1, 1, 1]]
>>> b = [[2, 2, 2], [2, 2, 2]]
>>> map(lambda x, y: map(lambda i, j: i + j, x, y), a, b)
[[3, 3, 3], [3, 3, 3]]
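In Python 3, map() returns a lazy iterator rather than a list, so the nested call above would print map objects instead of numbers. A rough Python 3 equivalent, using operator.add in place of the lambda, might be:

```python
from operator import add

a = [[1, 1, 1], [1, 1, 1]]
b = [[2, 2, 2], [2, 2, 2]]
# list() forces each lazy map object into a concrete row.
c = [list(map(add, row_a, row_b)) for row_a, row_b in zip(a, b)]
print(c)  # [[3, 3, 3], [3, 3, 3]]
```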
