Is there a way to generate combinations, one at a time? - python

I can generate combinations from a list of numbers using itertools.combinations, such as the following:
from itertools import combinations
l = [1, 2, 3, 4, 5]
for i in combinations(l, 2):
    print(list(i))
This generates the following:
[1, 2]
[1, 3]
[1, 4]
[1, 5]
[2, 3]
[2, 4]
[2, 5]
[3, 4]
[3, 5]
[4, 5]
How can I generate just one of these list pairs at a time and save it to a variable? I want to use each pair of numbers, one pair at a time, and then go to the next pair of numbers. I don't want to generate all of them at once.

Most functions in the itertools library (including itertools.combinations()) return a lazy iterator, which behaves like a generator. You can read more about generators in the Python documentation. Basically, it calculates values on demand instead of generating everything all at once.
You can get a single value out of such an iterator using the built-in next() function. Take a look at the following snippet:
import itertools
l = [1, 2, 3, 4, 5]
comb_generator = itertools.combinations(l, 2)
temp = next(comb_generator) #Get the first combo into a var
print(temp)
#Do some other stuff
temp = next(comb_generator) #Get another combo
print(temp)
Whenever you want a new combination, you can get it by calling next() on your generator.
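As a small sketch of the same idea, the snippet below also shows what happens when the iterator runs out: a plain next() call would raise StopIteration, while passing a default (None here, an arbitrary choice) returns that default instead. The variable names are illustrative only.

```python
from itertools import combinations

numbers = [1, 2, 3, 4, 5]
pairs = combinations(numbers, 2)   # nothing is computed yet

first = next(pairs)                # first pair, produced on demand
second = next(pairs)               # second pair

# A loop picks up where next() left off and still yields one pair at a time.
remaining = [pair for pair in pairs]

# The iterator is now exhausted; the second argument to next() is returned
# instead of raising StopIteration.
exhausted = next(pairs, None)
```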

Related

my_list.extend([my_list.pop(), my_list.pop()])

My code:
my_list = [1,2,3,4,5]
my_list.extend([my_list.pop(), my_list.pop(), my_list.pop()])
When I print it, the result is
[1, 2, 5, 4, 3]
Why does this reverse the last three elements?
Python evaluates expressions inside-out and left-to-right.
So here Python first calls pop() three times, left to right, and builds a list from the results: [5, 4, 3]. Then it executes extend, appending those elements front-to-back.
Essentially what your code does is:
my_list = [1, 2, 3, 4, 5]
second = []
second.append(my_list.pop())
second.append(my_list.pop())
second.append(my_list.pop())
my_list.extend(second)
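To check the evaluation order described above, here is a minimal sketch that builds the inner list first and inspects it before extending. The last part, which uses reversed() to put the popped elements back in their original order, is an addition for illustration and not part of the original code.

```python
my_list = [1, 2, 3, 4, 5]

# The list display is evaluated left to right, so pop() runs three times
# before extend() is ever called.
popped = [my_list.pop(), my_list.pop(), my_list.pop()]
after_pops = list(my_list)      # snapshot of what is left after the three pops
my_list.extend(popped)          # appends the popped elements in pop order

# To restore the original order instead, reverse the popped elements first.
restored = after_pops + list(reversed(popped))
```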

Difference between a loop in a function call and listing all the arguments explicitly

I have a function that sorts a list of lists by the first list. When I use the function with the variables like so:
sort_lists(IN[0],IN[1],IN[2])
it works perfectly. However, since I don't know how many lists my input contains, I want to pass the arguments like this:
sort_lists(IN[idx] for idx in range(len(IN)))
However, this returns a sorting of just one list (the superlist). Why do these two calls behave differently, and how can I improve the code?
Here is the function, in case it matters (here IN[0] is the input with a number of sublists):
def sort_lists(*args):
    zipped_list = zip(*sorted(zip(*args)))
    return [list(l) for l in zipped_list]
OUT = sort_lists(data_sort[0],data_sort[1],data_sort[2])
I want to use this output:
OUT = sort_lists(data_sort[idx] for idx in range(len(IN[0])))
Two things to understand here:
*args will give you all function parameters as a tuple
IN[idx] for idx in range(len(IN)) is a generator expression
You can see how your inputs differ if you simply add a print statement to your function:
def sort_lists(*args):
    print(args)
    zipped_list = zip(*sorted(zip(*args)))
    return [list(l) for l in zipped_list]
Let the input list of lists be: lists = [[2, 1, 3], [1, 3, 4], [5, 4, 2]].
sort_lists(lists[0], lists[1], lists[2])
will print: ([2, 1, 3], [1, 3, 4], [5, 4, 2]). That's a tuple of inner lists.
Though, if you call it like this:
sort_lists(lists[idx] for idx in range(len(lists)))
or
sort_lists(sublist for sublist in lists)
this will print (<generator object <genexpr> at 0x0000007001D3FBA0>,), a one-element tuple of a generator.
You can make your function work with a generator by accepting only one parameter:
def sort_lists(arg):
    zipped_list = zip(*sorted(zip(*arg)))
    return [list(l) for l in zipped_list]
sort_lists(lists[idx] for idx in range(len(lists)))
# [[1, 2, 3], [3, 1, 4], [4, 5, 2]]
but I suggest leaving your function as is and unpacking your lists at the call site instead:
>>> sort_lists(*lists)
[[1, 2, 3], [3, 1, 4], [4, 5, 2]]
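A minimal sketch of the difference, using a hypothetical helper describe_args (not part of the question) that only reports how many positional arguments it received and what their types are:

```python
def describe_args(*args):
    """Return the number of positional arguments and their type names."""
    return len(args), [type(arg).__name__ for arg in args]

lists = [[2, 1, 3], [1, 3, 4], [5, 4, 2]]

explicit = describe_args(lists[0], lists[1], lists[2])      # three separate arguments
as_generator = describe_args(sublist for sublist in lists)  # one generator argument
unpacked = describe_args(*lists)                            # unpacked into three arguments
```

Here explicit and unpacked both report three list arguments, while as_generator reports a single generator argument.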
Just change the function to accept a single list of lists; is that a problem? This piece of code:
IN[idx] for idx in range(len(IN))
just yields the sublists of IN again, one at a time.
As Georgy pointed out, the difference is whether the function receives a single generator or the individual lists as arguments. I would also like to point out that this is an opportunity to use and practice the built-in map function. map applies the same function to each item of an iterable. The function can be a built-in like sorted.
list_a = [[2, 1, 3], [1, 3, 4], [5, 4, 2]]
sorted_list_a = list(map(sorted, list_a)) # sorted is the python built-in function
print(sorted_list_a)
Returns:
[[1, 2, 3], [1, 3, 4], [2, 4, 5]]
Note that you have to pass the result of map to list, because in Python 3 map returns a lazy map object rather than a list.
See the Python documentation for map for details.
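Note that map(sorted, ...) sorts every sublist independently, which is not the same operation as the sort_lists function from the question: that one reorders all sublists together according to the first one. A short sketch contrasting the two on the example data:

```python
def sort_lists(*args):
    # Sort the "columns" by the first list, keeping items at the same position together.
    zipped_list = zip(*sorted(zip(*args)))
    return [list(l) for l in zipped_list]

lists = [[2, 1, 3], [1, 3, 4], [5, 4, 2]]

sorted_together = sort_lists(*lists)          # sublists stay aligned with the first one
sorted_separately = list(map(sorted, lists))  # each sublist sorted on its own
```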

Functionally shuffling a list

There are quite a few questions on Stack Overflow regarding the random.shuffle function of the random module.
Something that irks me about shuffle is that it shuffles in place rather than returning a shuffled copy.
Note that shuffle works in place, and returns None.
Therefore expressions like
for index, (parent1, parent2) in enumerate(zip(sorted(population)[::2], shuffle(population)[1::2])):
don't work. Writing it with a side effect seems unnecessarily verbose:
other_half = population[1::2]
random.shuffle(other_half)
for index, (parent1, parent2) in enumerate(zip(sorted(population)[::2], other_half)):
What's a pythonic way of functionally shuffling a list?
This looks like a duplicate of this question
The accepted answer was
shuffled = sorted(x, key=lambda k: random.random())
A good alternative would be random.sample with k being the len of the list:
import random
li = [1, 2, 3, 4, 5]
for _ in range(4):  # showing we get a new, 'shuffled' list each time
    print(random.sample(li, len(li)))
# [5, 2, 3, 1, 4]
# [1, 5, 4, 3, 2]
# [4, 2, 5, 1, 3]
# [4, 2, 3, 5, 1]
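If this is needed in several places, the random.sample call can be wrapped in a small helper. The name shuffled is hypothetical (the random module does not provide such a function); this is a sketch, not a standard API.

```python
import random

def shuffled(items):
    """Return a new list with the elements of items in random order."""
    items = list(items)  # accept any iterable and never touch the caller's object
    return random.sample(items, len(items))

population = [1, 2, 3, 4, 5]
result = shuffled(population)
```

Because it returns a value, it can be used inside expressions such as zip(sorted(population)[::2], shuffled(population)[1::2]).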

Euclidean distance among dictionary elements

I am trying to compute the euclidean distance among the dictionary elements as shown below
#!/usr/bin/python
import itertools
from scipy.spatial import distance
def Distance(data):
    for subset in itertools.combinations(data, 2):
        print subset
        #This shows a tuple of two element instead of the elements of the dictionary.
        dst = distance.euclidean(subset)
if __name__ == "__main__":
    data = {}
    data['1'] = [5, 3, 4, 4, 6]
    data['2'] = [1, 2, 3, 4, 5]
    data['3'] = [3, 1, 2, 3, 3]
    data['4'] = [4, 3, 4, 3, 3]
    data['5'] = [3, 3, 1, 5, 4]
    Distance(data)
The issue is that when I try to compute the combination of the dictionary elements I get something that I do not expect, as commented in the code. I think I am doing something wrong with itertools.combinations...
You're taking combinations of whatever iterating over data produces, and that iteration is:
for element in data:
    print(element)
If you run that code, I think you'll see the issue: you're iterating over the keys of the dictionary, not the elements. Try again with .values():
for subset in itertools.combinations(data.values(), 2):
    print subset
You have a couple of other issues in your code though: Firstly distance.euclidean takes 2 parameters, so you'll need to do something like this:
dst = distance.euclidean(*subset)
...or this...
dst = distance.euclidean(subset[0], subset[1])
...and secondly your function isn't doing anything with dst, so it will be overwritten every time through your loop.
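Putting these fixes together, one possible sketch collects the distances in a dictionary keyed by the pair of dictionary keys. To keep it dependency-free it uses math.dist from the standard library (Python 3.8+) in place of scipy's distance.euclidean; the function name pairwise_distances and the choice of return type are assumptions, not part of the original code.

```python
import itertools
import math

def pairwise_distances(data):
    """Map each pair of keys to the Euclidean distance between their vectors."""
    distances = {}
    # combinations over items() gives ((key_a, vec_a), (key_b, vec_b)) pairs
    for (key_a, vec_a), (key_b, vec_b) in itertools.combinations(data.items(), 2):
        distances[(key_a, key_b)] = math.dist(vec_a, vec_b)
    return distances

data = {
    '1': [5, 3, 4, 4, 6],
    '2': [1, 2, 3, 4, 5],
    '3': [3, 1, 2, 3, 3],
    '4': [4, 3, 4, 3, 3],
    '5': [3, 3, 1, 5, 4],
}
distances = pairwise_distances(data)
```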

Returning a list of list elements

I need help writing a function that will take a single list and return a different list in which equal elements are grouped together, each group in its own sublist.
I know that I'll have to iterate through the original list that I pass through and then append the value depending on whether or not the value is already in my list or create a sublist and add that sublist to the final list.
An example would be:
Input: [1, 2, 2, 2, 3, 1, 1, 3]
Output: [[1, 1, 1], [2, 2, 2], [3, 3]]
I'd do this in two steps:
>>> import collections
>>> inputs = [1, 2, 2, 2, 3, 1, 1, 3]
>>> counts = collections.Counter(inputs)
>>> counts
Counter({1: 3, 2: 3, 3: 2})
>>> outputs = [[key] * count for key, count in counts.items()]
>>> outputs
[[1, 1, 1], [2, 2, 2], [3, 3]]
(The fact that these happen to be in sorted numerical order is just a coincidence here. On Python 3.7 and later, a Counter, like a normal dictionary, keeps its keys in insertion order, so the groups come out in order of first appearance. On older versions the key order is arbitrary, and you should assume that [[3, 3], [1, 1, 1], [2, 2, 2]] would be just as possible a result. If that's not acceptable, you need a bit more work.)
So, how does it work?
The first step creates a Counter, which is just a special subclass of dict made for counting occurrences of each key. One of the many nifty things about it is that you can just pass it any iterable (like a list) and it will count up how many times each element appears. It's a trivial one-liner, it's obvious and readable once you know how Counter works, and it's even about as efficient as anything could possibly be.*
But that isn't the output format you wanted. How do we get that? Well, we have to get back from 1: 3 (meaning "3 copies of 1") to [1, 1, 1]. You can write that as [key] * count.** And the rest is just a bog-standard list comprehension.
If you look at the docs for the collections module, they start with a link to the source. Many modules in the stdlib are like this, because they're meant to serve as source code for learning from as well as usable code. So, you should be able to figure out how the Counter constructor works. (It's basically just calling that _count_elements function.) Since that's the only part of Counter you're actually using beyond a basic dict, you could just write that part yourself. (But really, once you've understood how it works, there's no good reason not to use it, right?)
* For each element, it's just doing a hash table lookup (and insert if needed) and a += 1. And in CPython, it all happens in reasonably-optimized C.
** Note that we don't have to worry about whether to use [key] * count vs. [key for _ in range(count)] here, because the values have to be immutable, or at least of an "equality is as good as identity" type, or they wouldn't be usable as keys.
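If a deterministic order is required regardless of Python version, one option (assuming the elements are mutually comparable) is to sort the counted items before building the sublists. The function name group_sorted is hypothetical.

```python
import collections

def group_sorted(values):
    """Group equal elements into sublists, ordered by element value."""
    counts = collections.Counter(values)
    # sorted() orders the (key, count) pairs by key, since keys are unique
    return [[key] * count for key, count in sorted(counts.items())]

groups = group_sorted([3, 1, 2, 2, 1, 3, 1])
```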
The most time-efficient approach would be to use a dictionary:
collector = {}
for elem in inputlist:
    collector.setdefault(elem, []).append(elem)
output = list(collector.values())
The other, more costly option is to sort, then group using itertools.groupby():
from itertools import groupby
output = [list(g) for k, g in groupby(sorted(inputlist))]
Demo:
>>> inputlist = [1, 2, 2, 2, 3, 1, 1, 3]
>>> collector = {}
>>> for elem in inputlist:
...     collector.setdefault(elem, []).append(elem)
...
>>> list(collector.values())
[[1, 1, 1], [2, 2, 2], [3, 3]]
>>> from itertools import groupby
>>> [list(g) for k, g in groupby(sorted(inputlist))]
[[1, 1, 1], [2, 2, 2], [3, 3]]
What about this, as you said you wanted a function:
def makeList(user_list):
    user_list.sort()
    x = user_list[0]
    output = [[]]
    for i in user_list:
        if i == x:
            output[-1].append(i)
        else:
            output.append([i])
            x = i
    return output
>>> print(makeList([1, 2, 2, 2, 3, 1, 1, 3]))
[[1, 1, 1], [2, 2, 2], [3, 3]]
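Note that makeList sorts the caller's list in place and raises an IndexError for an empty list. A variant that avoids both, sketched here with itertools.groupby over a sorted copy (the name group_equal is hypothetical):

```python
from itertools import groupby

def group_equal(user_list):
    """Return sublists of equal elements without modifying user_list."""
    # sorted() works on a copy, and groupby() yields nothing for empty input
    return [list(group) for _, group in groupby(sorted(user_list))]

original = [1, 2, 2, 2, 3, 1, 1, 3]
grouped = group_equal(original)
```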
