I would like to know how to use the python random.sample() function within a for-loop to generate multiple sample lists that are not identical.
For example, right now I have:
for i in range(3):
    sample = random.sample(range(10), k=2)
This will generate 3 sample lists containing two numbers each, but I would like to make sure none of those sample lists are identical. (It is okay if there are repeating values, i.e., (2,1), (3,2), (3,7) would be okay, but (2,1), (1,2), (5,4) would not.)
If you specifically need to "use random.sample() within a for-loop", then you could keep track of samples that you've seen, and check that new ones haven't been seen yet.
import random
seen = set()
for i in range(3):
    while True:
        sample = random.sample(range(10), k=2)
        print(f'TESTING: {sample = }')  # For demo
        fr = frozenset(sample)
        if fr not in seen:
            seen.add(fr)
            break
    print(sample)
Example output:
TESTING: sample = [0, 7]
[0, 7]
TESTING: sample = [0, 7]
TESTING: sample = [1, 5]
[1, 5]
TESTING: sample = [7, 0]
TESTING: sample = [3, 5]
[3, 5]
Here I made seen a set to allow fast lookups, and I converted sample to a frozenset so that order doesn't matter in comparisons. It has to be frozen because a set can't contain another set.
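However, this could be very slow with different inputs, especially with more iterations or a smaller range to draw samples from, because collisions with already-seen samples become more likely. If you ask for more distinct samples than exist, the inner loop never terminates. Otherwise the runtime is unbounded in theory, but in practice random's number generator is finite.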
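As a small illustration of why the samples are stored as frozensets, the sketch below compares two samples that differ only in order. The variable names a and b are made up for the demonstration.

```python
# Two samples with the same members in a different order
a = frozenset([0, 7])
b = frozenset([7, 0])

seen = set()
seen.add(a)

print(a == b)     # order is ignored when comparing frozensets
print(b in seen)  # so the reordered sample counts as already seen

# A plain (mutable) set is unhashable and cannot be stored in another set
try:
    seen.add({1, 5})
except TypeError as err:
    print("TypeError:", err)
```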
Alternatives
There are other ways to do the same thing that could be much more performant. For example, you could take a big random sample, then chunk it into the desired size:
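import random
n = 3
k = 2
upper = 10
sample = random.sample(range(upper), k=k*n)
# Split the flat sample into n consecutive chunks of size k
for start in range(0, len(sample), k):
    chunk = sample[start:start + k]
    print(chunk)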
Example output:
[6, 5]
[3, 0]
[1, 8]
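With this approach, you'll never get any duplicate numbers like [[2,1], [3,2], [3,7]] because the sample contains all unique numbers. It also requires k*n <= upper, since random.sample() raises ValueError when asked for more items than the population holds.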
This approach was inspired by Sven Marnach's answer on "Non-repetitive random number in numpy", which I coincidentally just read today.
It looks like you are trying to build a nested list of items drawn from the original list without repetition. You can try the code below.
import random
mylist = list(range(50))
def randomlist(mylist, k):
    length = lambda: len(mylist)
    newlist = []
    while length() >= k:
        # pop k random items, so no item can be picked twice
        newlist.append([mylist.pop(random.randint(0, length() - 1)) for _ in range(k)])
    newlist.append(mylist)  # leftover items (fewer than k)
    return newlist
randomlist(mylist, 6)
[[2, 20, 36, 46, 14, 30],
[4, 12, 13, 3, 28, 5],
[45, 37, 18, 9, 34, 24],
[31, 48, 11, 6, 19, 17],
[40, 38, 0, 7, 22, 42],
[23, 25, 47, 41, 16, 39],
[8, 33, 10, 43, 15, 26],
[1, 49, 35, 44, 27, 21],
[29, 32]]
This should do the trick.
import random
import math
# create set to store samples
a = set()
# number of distinct elements in the population
m = 10
# sample size
k = 2
# number of samples
n = 3
# this protects against an infinite loop (see Safety Note)
if n > math.comb(m, k):
    print(
        f"Error: {math.comb(m, k)} is the number of {k}-combinations "
        f"from a set of {m} distinct elements."
    )
    exit()
# the meat
while len(a) < n:
    a.add(tuple(sorted(random.sample(range(m), k=k))))
print(a)
With a set you are guaranteed to get a collection with no duplicate elements. In a set, you would be allowed to have (1, 2) and (2, 1) inside, which is why sorted is applied. So if [1, 2] is drawn, sorted([1, 2]) returns [1, 2]. And if [2, 1] is subsequently drawn, sorted([2, 1]) returns [1, 2], which won't be added to the set because (1, 2) is already in the set. We use tuple because objects in a set have to be hashable and list objects are not.
I hope this helps. Any questions, please let me know.
Safety Note
To avoid an infinite loop when you change 3 to some large number, you need to know the maximum number of possible samples of the type that you desire.
The relevant mathematical concept for this is a combination.
Suppose your first argument of random.sample() is range(m) where
m is some arbitrary positive integer. Note that this means that the
sample will be drawn from a population of m distinct members
without replacement.
Suppose that you wish to have n samples of length k in total.
The number of possible k-combinations from the set of m distinct elements is
m! / (k! * (m - k)!)
You can get this value via
from math import comb
num_comb = comb(m, k)
comb(m, k) gives the number of different ways to choose k elements from m elements without repetition and without order, which is exactly what we want.
So in the example above, m = 10, k = 2, n = 3.
With these m and k, the number of possible k-combinations from the set of m distinct elements is 45.
You need to ensure that n is no greater than 45 if you want to use those specific m and k and avoid an infinite loop.
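One way to sidestep the retry loop entirely, sketched here under the assumption that comb(m, k) is small enough to enumerate in memory, is to list every k-combination once and draw n of them. The function name draw_distinct_samples is made up for this illustration.

```python
import random
from itertools import combinations
from math import comb


def draw_distinct_samples(m, k, n):
    """Return n distinct k-combinations of range(m), ignoring order."""
    if n > comb(m, k):
        raise ValueError(
            f"only {comb(m, k)} distinct {k}-combinations exist for {m} elements"
        )
    # Enumerate all combinations once, then sample without replacement
    all_combos = list(combinations(range(m), k))
    return random.sample(all_combos, n)


samples = draw_distinct_samples(10, 2, 3)
print(samples)
```

Because itertools.combinations emits each combination in sorted order exactly once, (1, 2) and (2, 1) cannot both appear. For large m and k the enumeration becomes expensive, and the set-based loop above is likely the better fit.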
I find myself in a unique situation in which I need to multiply single elements within a listed pair of numbers where each pair is nested within a parent list of elements. For example, I have my pre-defined variables as:
output = []
initial_list = [[1,2],[3,4],[5,6]]
I am trying to calculate an output such that each element is the product of a unique combination (always of length len(initial_list)) of a single element from each pair. Using my example of initial_list, I am looking to generate an output of length pow(2, len(initial_list)) that is scalable for any "n" number of pairs in initial_list (with a minimum of 2 pairs). So in this case each element of the output would be as follows:
output[0] = 1 * 3 * 5
output[1] = 1 * 3 * 6
output[2] = 1 * 4 * 5
output[3] = 1 * 4 * 6
output[4] = 2 * 3 * 5
output[5] = 2 * 3 * 6
output[6] = 2 * 4 * 5
output[7] = 2 * 4 * 6
In my specific case, the order of output assignments does not matter other than output[0], which I need to be equivalent to the product of the first element in each pair in initial_list. What is the best way to proceed to generate an output list such that each element is a unique combination of every element in each list?
...
My initial approach consisted of using:
from itertools import combinations
from itertools import permutations
from itertools import product
to somehow generate a list of every possible combination, then multiply the elements together and append each product to the output list, but I couldn't figure out a way to implement the tools successfully. I have since tried to create a recursive function that combines for x in range(2): with nested recursive calls, but once again I cannot figure out a solution.
Someone more experienced and smarter than me please help me out; Any and all help is appreciated! Thank you!
Without using any external library
def multi_comb(my_list):
    """
    This returns the product of
    every possible combination of
    the `my_list` of type [[a1, a2], [b1, b2], ...]
    Arg: List
    Return: List
    """
    if not my_list:
        return [1]
    # Slice instead of pop(0) so the caller's list is not modified
    a, b = my_list[0]
    result = multi_comb(my_list[1:])
    left = [a * i for i in result]
    right = [b * i for i in result]
    return left + right
print(multi_comb([[1, 2], [3, 4], [5, 6]]))
# Output
# [15, 18, 20, 24, 30, 36, 40, 48]
I am using recursion to get the result.
Instead of taking a top-down approach, we can take a bottom-up approach to better understand how this program works.
At the last step, a and b become 5 and 6 respectively. Calling multi_comb() with an empty list returns [1] as the result, so left and right become [5] and [6]. Thus we return [5, 6] to the previous step.
At the second-to-last step, a and b were 3 and 4 respectively. From the last step we got [5, 6] as the result. After multiplying each of the values inside the result by a and b (notice left and right), we return [15, 18, 20, 24] to the previous step.
At the first step, that is, our starting step, a and b were 1 and 2 respectively. The value returned from the second step becomes our result, i.e., [15, 18, 20, 24]. Now we multiply this result by both a and b and return our final output.
Note:
This program works only if the list is in the form [ [a1, a2], [b1, b2], [c1, c2], ... ], as stated by the OP in the comments. Handling a list whose sub-lists contain n items needs slightly different code, but the concept is the same as in this answer.
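As a rough sketch of that generalization (not part of the original answer), the same recursion can loop over every element of the first sub-list instead of unpacking exactly two. The name multi_comb_general is made up here.

```python
def multi_comb_general(lists):
    """Products of every way to pick one element from each sub-list."""
    if not lists:
        return [1]
    first, rest = lists[0], lists[1:]
    tail = multi_comb_general(rest)
    # One block of products per element of the first sub-list
    return [x * t for x in first for t in tail]


print(multi_comb_general([[1, 2], [3, 4], [5, 6]]))
print(multi_comb_general([[1, 2, 3], [4, 5]]))
```

For pairs this produces the same list as the two-element version, and sub-lists of different lengths are handled without any further change.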
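This problem can also be solved using dynamic programming:
initial_list = [[1, 2], [3, 4], [5, 6]]
output = [1, ]
for arr in initial_list:
    # multiply every product so far by each element of the new subarray
    output = [a * b for a in arr for b in output]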
This problem is easy to solve if you have just one subarray -- the output is the given subarray.
Suppose you solved the problem for the first n - 1 subarrays and got the output. Now a new subarray is appended. How should the output change? The new output is all pair-wise products of the previous output and the "new" subarray.
Look closely, there's an easy pattern. Let there be n sublists, and 2 elements in each: at index 0 and 1. Now, the indexes selected can be represented as a binary string of length n.
It'll start with 0000..000, then 0000...001, 0000...010 and so on. So all you need to do is:
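lst = [[1, 2], [3, 4], [5, 6]]
n = len(lst)
output = []
for i in range(2**n):
    binary = format(i, f'0{n}b')  # binary representation, zero-padded to n digits
    p = 1
    for j in range(n):
        if binary[j] == "1":
            p *= lst[j][1]  # include jth list's 1st index in product
        else:
            p *= lst[j][0]  # include jth list's 0th index in product
    output.append(p)
print(output)
# [15, 18, 20, 24, 30, 36, 40, 48]
The problem with scaling this is that, since you're generating all possible combinations, the time complexity will be O(2^N).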
Your idea to use itertools.product is great!
import itertools
initial_list = [[1,2],[3,4],[5,6]]
combinations = list(itertools.product(*initial_list))
# [(1, 3, 5), (1, 3, 6), (1, 4, 5), (1, 4, 6), (2, 3, 5), (2, 3, 6), (2, 4, 5), (2, 4, 6)]
Now, you can get the product of each tuple in combinations using for-loops, or using functools.reduce, or you can use math.prod, which was introduced in Python 3.8:
import itertools
import math
initial_list = [[1,2],[3,4],[5,6]]
output = [math.prod(c) for c in itertools.product(*initial_list)]
# [15, 18, 20, 24, 30, 36, 40, 48]
import itertools
import functools
import operator
initial_list = [[1,2],[3,4],[5,6]]
output = [functools.reduce(operator.mul, c) for c in itertools.product(*initial_list)]
# [15, 18, 20, 24, 30, 36, 40, 48]
import itertools
output = []
for c in itertools.product(*initial_list):
    p = 1
    for x in c:
        p *= x
    output.append(p)
# output == [15, 18, 20, 24, 30, 36, 40, 48]
Note: if you are more familiar with lambdas, operator.mul is pretty much equivalent to lambda x,y: x*y.
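To make that note concrete, here is a minimal comparison of the two spellings on a single tuple. The variable names are only for illustration.

```python
import functools
import operator

values = (2, 4, 6)

# Same reduction, once with operator.mul and once with an explicit lambda
with_operator = functools.reduce(operator.mul, values)
with_lambda = functools.reduce(lambda x, y: x * y, values)

print(with_operator, with_lambda)  # both are 48
```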
itertools.product and math.prod are a nice fit:
from itertools import product
from math import prod
input = [[1,2],[3,4],[5,6]]
output = [prod(x) for x in product(*input)]
print(output)
[15, 18, 20, 24, 30, 36, 40, 48]
I'm trying to practice Python exercises, but using list comprehension to solve problems rather than the beginner style loops shown in the book. There is one example where it asks for a list of numbers to be put into a list of even numbers only, BUT they must be in sublists so that if the numbers follow after one another without being interrupted by an odd number, they should be put into a sublist together:
my_list = [2,3,5,7,8,9,10,12,14,15,17,25,31,32]
desired_output = [[2],[8],[10,12,14],[32]]
So you can see in the desired output above, 10,12,14 are evens that follow on from one another without being interrupted by an odd, so they get put into a sublist together. 8 has an odd on either side of it, so it gets put into a sublist alone after the odds are removed.
I can put together an evens list easily using list comprehension like this below, but I have no idea how to get it into sublists like the desired output shows. Could someone please suggest an idea for this using list comprehension (or generators, I don't mind which as I'm trying to learn both at the moment). Thanks!
evens = [x for x in my_list if x%2==0]
print(evens)
[2, 8, 10, 12, 14, 32]
As explained in the comments, list comprehensions should not be deemed "for beginners" - first focus on writing your logic using simple for loops.
When you're ready, you can look at comprehension-based methods. Here's one:
from itertools import groupby
my_list = [2,3,5,7,8,9,10,12,14,15,17,25,31,32]
condition = lambda x: all(i%2==0 for i in x)
grouper = (list(j) for _, j in groupby(my_list, key=lambda x: x%2))
res = filter(condition, grouper)
print(list(res))
# [[2], [8], [10, 12, 14], [32]]
The main point to note in this solution is that nothing is computed until you call list(res). This is because filter (in Python 3) and generator expressions are lazy.
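The laziness can be observed directly. The sketch below assumes Python 3, where filter returns an iterator, and uses a made-up key function parity that records every call.

```python
from itertools import groupby

calls = []


def parity(x):
    calls.append(x)  # record every time the key function runs
    return x % 2


my_list = [2, 3, 5, 7, 8]
grouper = (list(j) for _, j in groupby(my_list, key=parity))
res = filter(lambda g: all(i % 2 == 0 for i in g), grouper)

calls_before = list(calls)  # snapshot: nothing has been evaluated yet
result = list(res)          # consuming the filter drives the whole pipeline

print(calls_before)
print(result)
print(calls)
```

Building grouper and res does not touch any element. The key function only runs once list(res) starts pulling groups through the pipeline.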
You mentioned also wanting to learn generators, so here is a version that's also a bit more readable, imho.
from itertools import groupby
def is_even(n):
    return n%2 == 0
def runs(lst):
    for even, run in groupby(lst, key=is_even):
        if even:
            yield list(run)
if __name__ == '__main__':
    lst = [2, 3, 5, 7, 8, 9, 10, 12, 14, 15, 17, 25, 31, 32]
    res = list(runs(lst))
    print(res)
Incidentally, if you absolutely, positively want to implement it as a list comprehension, this solution falls out of the above quite naturally:
[list(run) for even, run in groupby(lst, key=is_even) if even]
If you don't want to use itertools, there's another way to do it with list comprehensions.
First, take the indices of the odd elements:
[i for i,x in enumerate(my_list) if x%2==1]
And add two sentinels: [-1] before and [len(my_list)] after:
odd_indices = [-1]+[i for i,x in enumerate(my_list) if x%2==1]+[len(my_list)]
# [-1, 1, 2, 3, 5, 9, 10, 11, 12, 14]
You now have something like this:
[2,3,5,7,8,9,10,12,14,15,17,25,31,32]
^---^-^-^---^-----------^--^--^--^----^
You can see your sequences. Now, take the elements between those indices. To do that, zip odd_indices with itself to get the intervals as tuples:
list(zip(odd_indices, odd_indices[1:]))
# [(-1, 1), (1, 2), (2, 3), (3, 5), (5, 9), (9, 10), (10, 11), (11, 12), (12, 14)]
even_groups = [my_list[a+1:b] for a,b in zip(odd_indices, odd_indices[1:])]
# [[2], [], [], [8], [10, 12, 14], [], [], [], [32]]
You just have to filter out the empty lists:
even_groups = [my_list[a+1:b] for a,b in zip(odd_indices, odd_indices[1:]) if a+1<b]
# [[2], [8], [10, 12, 14], [32]]
You can merge the two steps into one list comprehension, but that is a bit unreadable:
>>> my_list = [2,3,5,7,8,9,10,12,14,15,17,25,31,32]
>>> [my_list[a+1:b] for l1 in [[-1]+[i for i,x in enumerate(my_list) if x%2==1]+[len(my_list)]] for a,b in zip(l1, l1[1:]) if b>a+1]
[[2], [8], [10, 12, 14], [32]]
As pointed out by @jpp, prefer basic loops until you feel comfortable. And maybe avoid those nested list comprehensions forever...
I am practicing exercises for functional programming concepts using python. I came across this problem. I have tried a lot and couldn't find a solution using functional programming constructs like map/reduce, closures.
Problem: Given a list of numbers
list = [10, 9, 8, 7, 6, 5, 4, 3]
Find the sum of the differences in each pair using map/reduce or any other functional programming concept, e.g.
(10 - 9) + (8 - 7) + (6 - 5) + (4 - 3) = 4
For me, the tricky part is isolating the pairs using map/reduce/recursion/closures.
The recursive relationship you are looking for is
f([4, 3, 2, 1]) = 4 - 3 + 2 - 1 = 4 - (3 - 2 + 1) = 4 - f([3, 2, 1])
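That relationship translates almost directly into code. This is a minimal sketch; the base case f([]) = 0 is an assumption added here so the recursion terminates.

```python
def f(nums):
    """Alternating sum: nums[0] - nums[1] + nums[2] - ..."""
    if not nums:
        return 0  # assumed base case: an empty list contributes nothing
    return nums[0] - f(nums[1:])


print(f([4, 3, 2, 1]))               # 2
print(f([10, 9, 8, 7, 6, 5, 4, 3]))  # 4
```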
One of the mantras followed by many functional programmers is the following:
Data should be organized into data structures that mirror the processing you want to perform on it
Applying this to your question, you're running into a simple problem: the list data structure doesn't encode in any way the relationship between the pairs that you want to operate on. So map/reduce operations, since they work on the structure of lists, don't have any natural visibility into the pairs! This means you're "swimming against the current" of these operations, so to speak.
So the first step should be to organize the data as a list or stream of pairs:
pairs = [(10, 9), (8, 7), (6, 5), (4, 3)]
Now after you've done this, it's trivial to use map to take the difference of the elements of each pair. So basically I'm telling you to split the problem into two easier subproblems:
Write a function that groups lists into pairs like that.
Use map to compute the difference of each pair.
And the hint I'll give is that neither map nor reduce is particularly useful for step #1.
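One possible reading of those two steps is sketched below. It uses slicing and zip for the grouping, and map only for the differences. The helper name pairs is made up, and the list is assumed to have an even length.

```python
from operator import sub


def pairs(nums):
    """Group a flat list into consecutive pairs (assumes an even length)."""
    return list(zip(nums[::2], nums[1::2]))


numbers = [10, 9, 8, 7, 6, 5, 4, 3]
grouped = pairs(numbers)                             # step 1: build the pairs
differences = list(map(lambda p: sub(*p), grouped))  # step 2: map over them
total = sum(differences)

print(grouped)
print(total)
```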
Using itertools.starmap:
l = [10, 9, 8, 7, 6, 5, 4, 3]
from operator import sub
from itertools import starmap
print(sum(starmap(sub, zip(*[iter(l)] * 2))))
4
Or just a lambda:
print(sum(map(lambda x: sub(*x), zip(*[iter(l)] * 2))))
Or range and itemgetter:
from operator import itemgetter as itgt
print(sum(itgt(*range(0, len(l), 2))(l)) - sum(itgt(*range(1, len(l), 2))(l)))
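Another approach is to isolate the pairs first. (You may need to sort the list beforehand.)
>>> import operator
>>> from functools import reduce  # reduce is no longer a builtin in Python 3
>>> l = [10, 9, 8, 7, 6, 5, 4, 3]
>>> d = list(zip(l, l[1:]))  # list() so the result can be indexed in Python 3
>>> w = [d[i] for i in range(0, len(d), 2)]  # isolate pairs, i.e. [(10, 9), (8, 7), (6, 5), (4, 3)]
>>> reduce(operator.add, [reduce(operator.sub, i) for i in w])
4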
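You can do it in a simple way (this relies on the list having an even length).
from functools import reduce  # needed in Python 3
l = [10, 9, 8, 7, 6, 5, 4, 3]
reduce(lambda x, y : y - x, l) * -1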
I have a sumranges() function, which counts all the ranges of consecutive numbers found in a tuple of tuples. To illustrate:
def sumranges(nums):
    return sum([sum([1 for j in range(len(nums[i])) if
                     nums[i][j] == 0 or
                     nums[i][j - 1] + 1 != nums[i][j]]) for
                i in range(len(nums))])
>>> nums = ((1, 2, 3, 4), (1, 5, 6), (19, 20, 24, 29, 400))
>>> print(sumranges(nums))
7
As you can see, it returns the number of ranges of consecutive numbers within the tuples, that is: len(((1, 2, 3, 4), (1,), (5, 6), (19, 20), (24,), (29,), (400,))) = 7. The tuples are always ordered.
My problem is that my sumranges() is terrible. I hate looking at it. I'm currently just iterating through the tuple and each subtuple, assigning a 1 if the number is not (1 + previous number), and summing the total. I feel like I am missing a much easier way to accomplish my stated objective. Does anyone know a more pythonic way to do this?
Edit: I have benchmarked all the answers given thus far. Thanks to all of you for your answers.
The benchmarking code is as follows, using a sample size of 100K:
from time import time
from random import randrange
nums = [sorted(list(set(randrange(1, 10) for i in range(10)))) for
        j in range(100000)]
for func in sumranges, alex, matt, redglyph, ephemient, ferdinand:
    start = time()
    result = func(nums)
    end = time()
    print(', '.join([func.__name__, str(result), str(end - start) + ' s']))
Results are as follows. Actual answer shown to verify that all functions return the correct answer:
sumranges, 250281, 0.54171204567 s
alex, 250281, 0.531121015549 s
matt, 250281, 0.843333005905 s
redglyph, 250281, 0.366822004318 s
ephemient, 250281, 0.805964946747 s
ferdinand, 250281, 0.405596971512 s
RedGlyph does edge out in terms of speed, but the simplest answer is probably Ferdinand's, and probably wins for most pythonic.
My 2 cents:
>>> sum(len(set(x - i for i, x in enumerate(t))) for t in nums)
7
It's basically the same idea as described in Alex's post, but using a set instead of itertools.groupby, resulting in a shorter expression. Since sets are implemented in C and len() of a set runs in constant time, this should also be pretty fast.
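To see why the set trick works, it may help to print the intermediate values. The sketch below assumes, as the question states, that each tuple is ordered, and additionally that it has no repeated values.

```python
nums = ((1, 2, 3, 4), (1, 5, 6), (19, 20, 24, 29, 400))

for t in nums:
    # Within a run of consecutive numbers, value minus index stays constant
    offsets = [x - i for i, x in enumerate(t)]
    print(t, offsets, set(offsets))

total = sum(len(set(x - i for i, x in enumerate(t))) for t in nums)
print(total)
```

Each distinct offset corresponds to one run, so the size of the set is the number of ranges in that tuple.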
Consider:
>>> nums = ((1, 2, 3, 4), (1, 5, 6), (19, 20, 24, 29, 400))
>>> flat = [[(x - i) for i, x in enumerate(tu)] for tu in nums]
>>> print(flat)
[[1, 1, 1, 1], [1, 4, 4], [19, 19, 22, 26, 396]]
>>> import itertools
>>> print(sum(1 for tu in flat for _ in itertools.groupby(tu)))
7
>>>
We "flatten" the "increasing ramps" of interest by subtracting the index from the value, turning them into consecutive "runs" of identical values; then we identify and count the "runs" with the precious itertools.groupby. This seems to be a pretty elegant (and speedy) solution to your problem.
Just to show something closer to your original code:
def sumranges(nums):
    return sum( (1 for i in nums
                 for j, v in enumerate(i)
                 if j == 0 or v != i[j-1] + 1) )
The idea here was to:
avoid building intermediate lists but use a generator instead, it will save some resources
avoid using indices when you already have selected a subelement (i and v above).
The remaining sum() is still necessary with my example though.
Here's my attempt:
def ranges(ls):
    for l in ls:
        consec = False
        for (a,b) in zip(l, l[1:]+(None,)):
            if b == a+1:
                consec = True
            if b is not None and b != a+1:
                consec = False
            if consec:
                yield 1
'''
>>> nums = ((1, 2, 3, 4), (1, 5, 6), (19, 20, 24, 29, 400))
>>> print(sum(ranges(nums)))
7
'''
It looks at the numbers pairwise, checking if they are a consecutive pair (unless it's at the last element of the list). Each time there's a consecutive pair of numbers it yields 1.
This could probably be put together in a more compact form, but I think clarity would suffer:
def pairs(seq):
    for i in range(1,len(seq)):
        yield (seq[i-1], seq[i])
def isadjacent(pair):
    return pair[0]+1 == pair[1]
def sumrange(seq):
    return 1 + sum([1 for pair in pairs(seq) if not isadjacent(pair)])
def sumranges(nums):
    return sum([sumrange(seq) for seq in nums])
nums = ((1, 2, 3, 4), (1, 5, 6), (19, 20, 24, 29, 400))
print(sumranges(nums)) # prints 7
You could probably do this better if you had an IntervalSet class because then you would scan through your ranges to build your IntervalSet, then just use the count of set members.
Some tasks don't always lend themselves to neat code, particularly if you need to write the code for performance.
There is a formula for this: the sum of the first n numbers is 1 + 2 + ... + n = n(n+1)/2. So if you want the sum of the numbers from i through j inclusive, it is j(j+1)/2 - (i-1)i/2, which simplifies to (j - i + 1)(i + j)/2. It might not be pythonic, but it is what I would use.
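A small sketch of that closed form, assuming "the sum of i-j" means the sum of the integers from i through j inclusive, with 1 <= i <= j. The name range_sum is made up for the example.

```python
def range_sum(i, j):
    """Sum of the integers i..j inclusive, via n(n+1)/2."""
    # Sum of 1..j minus sum of 1..(i-1); integer division is exact here
    return j * (j + 1) // 2 - (i - 1) * i // 2


print(range_sum(1, 4))
print(range_sum(5, 6))
```

Note that this adds up the values in a range, which is a different quantity from the count of ranges that the question asks for.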