Python: Generate matrix of Nx4 integers who sum is a constant

Python: Generate matrix of Nx4 integers who sum is a constant - python

I have search the web which has provided various solution on how to produce a matrix of random numbers whose sum is a constant. My problem is slightly different. I want to generate an NX4 matrix of exhaustive list of integers such that sum of all numbers in the row is exactly 100. and integers have a range from [0,100]. I want to the integers to increment sequentially as opposed to random. How can I do it in Python?
Thank you.

product is a handy way of generating combinations
In [774]: from itertools import product
In [775]: [x for x in product(range(10),range(10)) if sum(x)==10]
Out[775]: [(1, 9), (2, 8), (3, 7), (4, 6), (5, 5), (6, 4), (7, 3), (8, 2), (9, 1)]
The tuples sum to 10, and step sequentially (in the first value at least).
I can generalize it to 3 tuples, and it still runs pretty fast.
In [778]: len([x for x in product(range(100),range(100),range(100)) if sum(x)==100])
Out[778]: 5148
Length 4 tuples takes much longer (on an old machine),
In [780]: len([x for x in product(range(100),range(100),range(100),range(100)) if sum(x)==100])
Out[780]: 176847
So there's probably case to be made for solving this incrementally.
[x for x in product(range(100),range(100),range(100)) if sum(x)<=100]
runs much faster, producing the same number of of 3 tuples (within 1 or 2). And the 4th value can be derived that that x.
In [790]: timeit len([x+(100-sum(x),) for x in product(range(100),range(100),range(100)) if sum(x)<=100])
1 loops, best of 3: 444 ms per loop

import itertools
import random
def makerow(L, T, R):
# make a row of size L and sum T, with the integers from 0-R, in ascending
answer = []
pool = list(itertools.takewhile(lambda x: x<T, range(R+1)))
for i in range(L-1):
answer.append(random.choice(pool))
T -= answer[-1]
pool = list(itertools.takewhile(lambda x: x<T, range(R+1)))
answer.append(T)
answer.sort()
return answer
def makematrix(M, N, T, R):
# make a matrix of M rows and N columns per row
# each row adds up to T
# using the numbers between 0-R
return [makerow(N, T, R) for _ in range(M)]

Related

Generate all the possible values using info from random binary tuples (Python)

Here's part of the code I'm working on using Python:
import random
pairs = [
(0, 1),
(1, 2),
(2, 3),
(3, 0),
]
alphasori = [(random.choice([1, -1]) * random.uniform(5, 15), pairs[n]) for n in range(4)]
binum = np.random.randint(2, size=4).tolist()
d = dict(zip([0,1,2,3], binum))
alpbi = [(i, tuple(d[j] for j in c)) for i, c in alphasori]
print(alpbi)
And this is a sample output (we can call this list as alpbi):
[(-6.16111614207135, (1, 1)), (-9.39824028732309, (1, 1)), (12.1294338553467, (1, 0)), (8.192565262190904, (0, 1))]
I'm now trying to calculate the linear combination (call that S) of the random numbers (the first terms) inside each tuple in alpbi, (-6.16111614207135,-9.39824028732309...), which is followed by the following rules:
if the inner tuple is (1,1) or (0,0), then the random number is multiplied by (-1)
if the inner tuple is (0,1) or (1,0), then keep the original number.
From 1&2, We're calculating the linear combinations S of those random numbers.
For example, for the random sample generated above, we have
S = (-1)(-6.16111614207135)+ (-1)(-9.39824028732309) +12.1294338553467+8.192565262190904 = 35.8813
Here's the code I have to figure out a single case:
S = 0
for i in range (len(alpbi)):
if alpbi[i][1][0] == alpbi[i][1][1]:
S += (-1)*alpbi[i][0]
else:
S += alpbi[i][0]
print(S)
However, given that '1's and '0's in the inner tuple are random binary numbers, how can I calculate all the possible values of S? (There're 16 combinations, and 8 distinct values in total, I'm wondering is there a way I can write a function to return all the possible values of S at the same time? (like a list containing all of them))
Thanks a lot for reading my question, I really appreciate the help:)

The answer should be the following:
from itertools import product
def all_combinations(numbers):
linear_combinations = product([-1, 1], repeat=len(numbers))
result = [sum([a * b for a, b in zip(numbers, factors)]) for factors in linear_combinations]
return result
alpbi = [(-6.16111614207135, (1, 1)), (-9.39824028732309, (1, 1)), (12.1294338553467, (1, 0)), (8.192565262190904, (0, 1))]
numbers = [item[0] for item in alpbi]
combinations = all_combinations(numbers)
However, there are indeed 16 combinations. I assume that when you say there are just 8 distinct value, you mean ignoring the negative pair to the positive one?
In that case you can just filter all negative numbers:
combinations = [num for num in combinations if num >= 0]

Python3: What is the most efficient way to calculate all permutations of two lists summing to 100?

Imagine we have a list of stocks:
stocks = ['AAPL','GOOGL','IBM']
The specific stocks don't matter, what matters is that we have n items in this list.
Imagine we also have a list of weights, from 0% to 100%:
weights = list(range(101))
Given n = 3 (or any other number) I need to produce a matrix with every possible combinations of weights that sum to a full 100%. E.g.
0%, 0%, 100%
1%, 0%, 99%
0%, 1%, 99%
etc...
Is there some method of itertools that can do this? Something in numpy? What is the most efficient way to do this?

The way to optimize this isn't to figure out a faster way to generate the permutations, it's to generate as few permutations as possible.
First, how would you do this if you only wanted the combination that were in sorted order?
You don't need to generate all possible combinations of 0 to 100 and then filter that. The first number, a, can be anywhere from 0 to 100. The second number, b, can be anywhere from 0 to (100-a). The third number, c, can only be 100-a-b. So:
for a in range(0, 101):
for b in range(0, 101-a):
c = 100-a-b
yield a, b, c
Now, instead of generating 100*100*100 combination to filter them down to 100*50*1+1, we're just generating the 100*50*1+1, for a 2000x speedup.
However, keep in mind that there are still around X * (X/2)**N answers. So, computing them in X * (X/2)**N time instead of X**N may be optimal—but it's still exponential time. And there's no way around that; you want an exponential number of results, after all.
You can look for ways to make the first part more concise with itertools.product combined with reduce or accumulate, but I think it's going to end up less readable, and you want to be able to extend to any arbitrary N, and also to get all permutations rather than just the sorted ones. So keep it understandable until you do that, and then look for ways to condense it after you're done.
You obviously need to either go through N steps. I think this is easier to understand with recursion than a loop.
When n is 1, the only combination is (x,).
Otherwise, for each of the values a from 0 to x, you can have that value, together with all of the combinations of n-1 numbers that sum to x-a. So:
def sum_to_x(x, n):
if n == 1:
yield (x,)
return
for a in range(x+1):
for result in sum_to_x(x-a, n-1):
yield (a, *result)
Now you just need to add in the permutations, and you're done:
def perm_sum_to_x(x, n):
for combi in sum_to_x(x, n):
yield from itertools.permutations(combi)
But there's one problem: permutations permutes positions, not values. So if you have, say, (100, 0, 0), the six permutations of that are (100, 0, 0), (100, 0, 0), (0, 100, 0), (0, 0, 100), (0, 100, 0), (0, 0, 100).
If N is very small—as it is in your example, with N=3 and X=100—it may be fine to just generate all 6 permutations of each combination and filter them:
def perm_sum_to_x(x, n):
for combi in sum_to_x(x, n):
yield from set(itertools.permutations(combi))
… but if N can grow large, we're talking about a lot of wasted work there as well.
There are plenty of good answers here on how to do permutations without repeated values. See this question, for example. Borrowing an implementation from that answer:
def perm_sum_to_x(x, n):
for combi in sum_to_x(x, n):
yield from unique_permutations(combi)
Or, if we can drag in SymPy or more-itertools:
def perm_sum_to_x(x, n):
for combi in sum_to_x(x, n):
yield from sympy.multiset_permutations(combi)
def perm_sum_to_x(x, n):
for combi in sum_to_x(x, n):
yield from more_itertools.distinct_permutations(combi)

What you are looking for is product from itertools module
you can use it as shown below
from itertools import product
weights = list(range(101))
n = 3
lst_of_weights = [i for i in product(weights,repeat=n) if sum(i)==100]

What you need is combinations_with_replacement because in your question you wrote 0, 0, 100 which means you expect repetition, like 20, 20, 60 etc.
from itertools import combinations_with_replacement
weights = range(11)
n = 3
list = [i for i in combinations_with_replacement(weights, n) if sum(i) == 10]
print (list)
The above code results in
[(0, 0, 10), (0, 1, 9), (0, 2, 8), (0, 3, 7), (0, 4, 6), (0, 5, 5), (1, 1, 8), (1, 2, 7), (1, 3, 6), (1, 4, 5), (2, 2, 6), (2, 3, 5), (2, 4, 4), (3, 3, 4)]
Replace range(10), n and sum(i) == 10 by whatever you need.

This is a classic Stars and bars problem, and Python's itertools module does indeed provide a solution that's both simple and efficient, without any additional filtering needed.
Some explanation first: you want to divide 100 "points" between 3 stocks in all possible ways. For illustration purposes, let's reduce to 10 points instead of 100, with each one worth 10% instead of 1%. Imagine writing those points as a string of ten * characters:
**********
These are the "stars" of "stars and bars". Now to divide the ten stars amongst the 3 stocks, we insert two | divider characters (the "bars" of "stars and bars"). For example, one such division might look like this::
**|*******|*
This particular combination of stars and bars would correspond to the division 20% AAPL, 70% GOOGL, 10% IBM. Another division might look like:
******||****
which would correspond to 60% AAPL, 0% GOOGL, 40% IBM.
It's easy to convince yourself that every string consisting of ten * characters and two | characters corresponds to exactly one possible division of the ten points amongst the three stocks.
So to solve your problem, all we need to do is generate all possible strings containing ten * star characters and two | bar characters. Or, to think of this another way, we want to find all possible pairs of positions that we can place the two bar characters, in a string of total length twelve. Python's itertools.combinations function can be used to give us those possible positions, (for example with itertools.combinations(range(12), 2)) and then it's simple to translate each pair of positions back to a division of range(10) into three pieces: insert an extra imaginary divider character at the start and end of the string, then find the number of stars between each pair of dividers. That number of stars is simply one less than the distance between the two dividers.
Here's the code:
import itertools
def all_partitions(n, k):
"""
Generate all partitions of range(n) into k pieces.
"""
for c in itertools.combinations(range(n+k-1), k-1):
yield tuple(y-x-1 for x, y in zip((-1,) + c, c + (n+k-1,)))
For the case you give in the question, you want all_partitions(100, 3). But that yields 5151 partitions, starting with (0, 0, 100) and ending with (100, 0, 0), so it's impractical to show the results here. Instead, here are the results in a smaller case:
>>> for partition in all_partitions(5, 3):
... print(partition)
...
(0, 0, 5)
(0, 1, 4)
(0, 2, 3)
(0, 3, 2)
(0, 4, 1)
(0, 5, 0)
(1, 0, 4)
(1, 1, 3)
(1, 2, 2)
(1, 3, 1)
(1, 4, 0)
(2, 0, 3)
(2, 1, 2)
(2, 2, 1)
(2, 3, 0)
(3, 0, 2)
(3, 1, 1)
(3, 2, 0)
(4, 0, 1)
(4, 1, 0)
(5, 0, 0)

Python - Combination of Numbers Summing to Greater than or Equal to a Value

I was recently posed the following interview question to be answered in Python - given a list of quantity-value pairs, find the optimal combination(s) of sets of values whose sum is as close to, and at least as large as, some provided value.
For example, given: [(1, $5), (3, $10), (2, $15)], and a desired value of $36, the answer would be [(2,$15), (1,$10)] or [(1,$15), (2,$10), (1,$5)]. The reason is that $40 is the least sum greater than or equal to $36 that can be achieved, and these are the two ways to achieve that sum.
I got stumped. Does anyone have a solution?

The numbers are so small you can just brute force it:
In []:
notes = [(1, 5), (3, 10), (2, 15)]
wallet = [n for a, b in notes for n in [b]*a]
combs = {sum(x): x for i in range(1, len(wallet)) for x in it.combinations(wallet, i)}
target = 36
for i in sorted(combs):
if i >= target:
break
i, combs[i]
Out[]:
(40, (5, 10, 10, 15))
You can extend this for all combinations, just replace the combs dictionary comprehension with:
combs = {}
for i in range(1, len(wallet)):
for x in it.combinations(wallet, i):
combs.setdefault(sum(x), set()).add(x)
...
i, combs[i]
Out[]:
(40, {(5, 10, 10, 15), (10, 15, 15)})

iterate through an array looking at non-consecutive values

for i,(x,y,z) in enumerate( zip(analysisValues, analysisValues[1:], analysisValues[2:]) ):
if all(k<0.5 for k in (x,y,z)):
instance = i
break
this code iterates through an array and looks for the first 3 consecutive values that meet the condition '<0.5'
==============================
i'm working with 'timeseries' data and comparing the values at t, t+1s and t+2s
if the data is sampled at 1Hz then 3 consecutive values are compared and the code above is correct (points 0,1,2)
if the data is sampled at 2Hz then every other point must be compared (points 0,2,4) or
if the data is sampled at 3Hz then every third point must be compared (points 0,3,6)
the sample rate of input data can vary, but is known and recorded as the variable 'SRate'
==============================
please can you help me incorporate 'time' into this point-by-point analysis

You can use extended slice notation, giving the step value as SRate:
for i,(x,y,z) in enumerate(zip(analysisValues, \
analysisValues[SRate::SRate], \
analysisValues[2 * SRate::SRate])):

Let us first construct helper generator which does the following:
from itertools import izip, tee, ifilter
def sparsed_window(iterator, elements=2, step=1):
its = tee(iterator, elements)
for i,it in enumerate(its):
for _ in range(i*step):
next(it,None) # wind forward each iterator for the needed number of items
return izip(*its)
print list(sparsed_window([1,2,3,4,5,6,7,8,9,10],3,2))
Output:
>>>
[(1, 3, 5), (2, 4, 6), (3, 5, 7), (4, 6, 8), (5, 7, 9), (6, 8, 10)]
This helper avoids us of creating nearly the same lists in memory. It uses tee to clever cache only the part that is needed.
The helper code is based on pairwise recipe
Then we can use this helper to get what we want:
def find_instance(iterator, hz=1):
iterated_in_sparsed_window = sparsed_window(iterator, elements=3, step=hz)
fitting_values = ifilter(lambda (i,els): all(el<0.5 for el in els), enumerate(iterated_in_sparsed_window))
i, first_fitting = next(fitting_values, (None,None))
return i
print find_instance([1,0.4,1,0.4,1,0.4,1,0.4,1], hz=2)
Output:
>>>
1

Return a sequence of a variable length whose summation is equal to a given integer

In the form f(x,y,z) where x is a given integer sum, y is the minimum length of the sequence, and z is the maximum length of the sequence. But for now let's pretend we're dealing with a sequence of a fixed length, because it will take me a long time to write the question otherwise.
So our function is f(x,r) where x is a given integer sum and r is the length of a sequence in the list of possible sequences.
For x = 10, and r = 2, these are the possible combinations:
1 + 9
2 + 8
3 + 7
4 + 6
5 + 5
Let's store that in Python as a list of pairs:
[(1,9), (2,8), (3,7), (4,6), (5,5)]
So usage looks like:
>>> f(10,2)
[(1,9), (2,8), (3,7), (4,6), (5,5)]
Back to the original question, where a sequence is return for each length in the range (y,x). I the form f(x,y,z), defined earlier, and leaving out sequences of length 1 (where y-z == 0), this would look like:
>>> f(10,1,3)
[{1: [(1,9), (2,8), (3,7), (4,6), (5,5)],
2: [(1,1,8), (1,2,7), (1,3,6) ... (2,4,4) ...],
3: [(1,1,1,7) ...]}]
So the output is a list of dictionaries where the value is a list of pairs. Not exactly optimal.
So my questions are:
Is there a library that handles this already?
If not, can someone help me write both of the functions I mentioned? (fixed sequence length first)?
Because of the huge gaps in my knowledge of fairly trivial math, could you ignore my approach to integer storage and use whatever structure the makes the most sense?
Sorry about all of these arithmetic questions today. Thanks!

The itertools module will definately be helpful as we're dealing with premutations - however, this looks suspiciously like a homework task...
Edit: Looks like fun though, so I'll do an attempt.
Edit 2: This what you want?
from itertools import combinations_with_replacement
from pprint import pprint
f = lambda target_sum, length: [sequence for sequence in combinations_with_replacement(range(1, target_sum+1), length) if sum(sequence) == target_sum]
def f2(target_sum, min_length, max_length):
sequences = {}
for length in range(min_length, max_length + 1):
sequence = f(target_sum, length)
if len(sequence):
sequences[length] = sequence
return sequences
if __name__ == "__main__":
print("f(10,2):")
print(f(10,2))
print()
print("f(10,1,3)")
pprint(f2(10,1,3))
Output:
f(10,2):
[(1, 9), (2, 8), (3, 7), (4, 6), (5, 5)]
f(10,1,3)
{1: [(10,)],
2: [(1, 9), (2, 8), (3, 7), (4, 6), (5, 5)],
3: [(1, 1, 8),
(1, 2, 7),
(1, 3, 6),
(1, 4, 5),
(2, 2, 6),
(2, 3, 5),
(2, 4, 4),
(3, 3, 4)]}

The problem is known as Integer Partitions, and has been widely studied.
Here you can find a paper comparing the performance of several algorithms (and proposing a particular one), but there are a lot of references all over the Net.

I just wrote a recursive generator function, you should figure out how to get a list out of it yourself...
def f(x,y):
if y == 1:
yield (x, )
elif y > 1:
for head in range(1, x-y+2):
for tail in f(x-head, y-1):
yield tuple([head] + list(tail))
def f2(x,y,z):
for u in range(y, z+1):
for v in f(x, u):
yield v
EDIT: I just see it is not exactly what you wanted, my version also generates duplicates where only the ordering differs. But you can simply filter them out by ordering all results and check for duplicate tuples.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Python: Generate matrix of Nx4 integers who sum is a constant - python

Related

Generate all the possible values using info from random binary tuples (Python)

Python3: What is the most efficient way to calculate all permutations of two lists summing to 100?

Python - Combination of Numbers Summing to Greater than or Equal to a Value

iterate through an array looking at non-consecutive values

Return a sequence of a variable length whose summation is equal to a given integer

Categories

Resources