array of Weighted random numbers with constant sum - python

Is there a well-known algorithm that generates an array of weighted random numbers with a constant sum?
The problem can be stated another way: say I have a total value of 20, which should be distributed into 3 parts with weights of 2, 4, and 3 respectively.
So I need 3 random numbers that sum to 20, with a distribution that follows the weights.
I have tried:
import numpy as np
Range = 20
W = [2, 4, 3]
Prob = [i / float(sum(W)) for i in W]
Weighted_array = np.random.multinomial(Range, Prob)
Is there any better option?
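For reference, the multinomial draw in the attempt above can be mimicked with the standard library alone. This is only a sketch: the helper name weighted_split is hypothetical, and it assumes the total is a whole number of unit items, each assigned independently to a part with probability proportional to its weight.

```python
import random
from collections import Counter

def weighted_split(total, weights):
    """Split `total` unit items among len(weights) parts (hypothetical helper).

    Each item lands in part i with probability weights[i] / sum(weights),
    which is the same model as a multinomial draw.
    """
    bins = range(len(weights))
    counts = Counter(random.choices(bins, weights=weights, k=total))
    return [counts[i] for i in bins]

parts = weighted_split(20, [2, 4, 3])
print(parts, sum(parts))  # the parts vary per run, the sum is always 20
```

The sum is constant by construction, and the parts follow the weights only on average; any single draw can deviate from the 2:4:3 ratio.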

Symbolic:
This is a linear Diophantine equation of n=3 variables. To solve it analytically in python you can use sympy, as in this answer:
from random import randint
from sympy.solvers.diophantine import diop_linear
from sympy.abc import x, y, z
# Our original equation is 2 * x + 4 * y + 3 * z = 20
# Re-arrange to 2 * x + 4 * y + 3 * z - 20 = 0, and input the left side to sympy
general_solution = diop_linear(2 * x + 4 * y + 3 * z - 20)
def get_random_valid_triple():
    t0,t1 = general_solution[2].free_symbols
    # You can pick whatever bounds you want here
    a = randint(-100, 100)
    b = randint(-100, 100)
    solution = [s.subs({t0: a, t1:b}) for s in general_solution]
    print(solution)
# Get a random solution
get_random_valid_triple()
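As a side note, whether such an equation has any integer solution at all can be checked before solving it: a linear Diophantine equation is solvable in integers exactly when the gcd of the coefficients divides the right-hand side. A minimal sketch, where the function name has_integer_solution is hypothetical:

```python
from functools import reduce
from math import gcd

def has_integer_solution(coefficients, total):
    """True if sum(c * x) == total has a solution in (possibly negative) integers."""
    return total % reduce(gcd, coefficients) == 0

print(has_integer_solution([2, 4, 3], 20))  # gcd is 1, so a solution exists
print(has_integer_solution([2, 4], 7))      # gcd is 2 and 7 is odd, so none exists
```

This says nothing about non-negative solutions; those depend on the bounds placed on each variable.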
Brute Force:
Alternatively, at least for small n and tight bounds on each variable, you can precompute all possible solutions and use random.choice to pick one each time. For example, if we restrict all of the variables to be non-negative, then each is forced to lie in [0, 20 / coefficient], and there are only 14 solutions. We can generate these in Python as follows:
import random
import itertools
n = 20
coefficients = [2, 4, 3]
valid_solutions = []
ranges = [range(0, n // k + 1) for k in coefficients]
for value in itertools.product(*ranges):
    if sum(value[j] * i for j, i in enumerate(coefficients)) == n:
        valid_solutions.append(value)
print("All solutions:")
print("\n".join(str(i) for i in valid_solutions))
print("Random solution:")
print(random.choice(valid_solutions))
This yields:
All solutions:
(0, 2, 4)
(0, 5, 0)
(1, 0, 6)
(1, 3, 2)
(2, 1, 4)
(2, 4, 0)
(3, 2, 2)
(4, 0, 4)
(4, 3, 0)
(5, 1, 2)
(6, 2, 0)
(7, 0, 2)
(8, 1, 0)
(10, 0, 0)
Random solution:
(10, 0, 0)
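The product above visits every combination inside the bounds and filters afterwards. A possible variation, sketched here with a hypothetical generator named solutions, fixes one variable at a time and only recurses on the remaining total, so branches that overshoot are never explored:

```python
import random

def solutions(coefficients, total):
    """Yield non-negative integer tuples v with sum(c * v) == total."""
    if not coefficients:
        if total == 0:
            yield ()
        return
    first, rest = coefficients[0], coefficients[1:]
    # first * value may never exceed what is left of the total
    for value in range(total // first + 1):
        for tail in solutions(rest, total - first * value):
            yield (value,) + tail

found = list(solutions([2, 4, 3], 20))
print(len(found))            # 14, matching the list above
print(random.choice(found))  # one random solution
```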

Related

In numpy, multiply two structured matrices concisely

I have two matrices. The first has the following structure:
[[1, 0, a],
[0, 1, b],
[1, 0, c],
[0, 1, d]]
where 1, 0, a, b, c, and d are scalars. The matrix is 4 by 3
The second is just a 2 by 3 matrix:
[[r1],
[r2]]
where r1 and r2 are the first and second rows respectively, each having 3 elements.
I would like the output to be:
[[r1, 0, a*r1],
[0, r1, b*r1],
[r2, 0, c*r2],
[0, r2, d*r2]]
which would be a 4 by 9 matrix.
This is similar to the Kronecker product, except separately for each row of the second matrix. Of course this could be done with cumbersome loops which I want to avoid.
How can I do this concisely?
You can do exactly what you said in the last line: do a separate Kronecker product for each row of the second matrix and then concatenate the results.
Let's assume that the two matrices are called x (4 by 3) and y (2 by 3). The first thing to do is to split x into two parts, because only half of the matrix participates in each part of the product.
x = x.reshape(2, 2, 3)
Then you can calculate the two products separately:
z0 = np.kron(x[0], y[0])
z1 = np.kron(x[1], y[1])
Finally, concatenate the two results along the first axis:
z = np.concatenate([z0, z1], axis=0)
Or if, like me, you enjoy big ugly one-liners you can do:
z = np.concatenate([np.kron(xr, yr) for xr, yr in zip(x.reshape(2, 2, 3), y)], axis=0)
In the general case you mentioned in the comments, it would become:
z = np.concatenate([np.kron(xr, yr) for xr, yr in zip(x.reshape(int(n / 2), 2, 3), y)], axis=0)
This gives equal results to the explicit loop, which can be numba.jit compiled I believe:
def solve_explicit(x, y):
    # sanity checks
    assert x.shape[0] == 2*y.shape[0]
    assert x.shape[1] == y.shape[1]
    n = x.shape[0]
    z = np.zeros((n, 9))
    for i in range(n):
        for j in range(3):
            for k in range(3):
                z[i, k + 3 * j] = x[i, j] * y[i // 2, k]
    return z
Using broadcasting, with x.shape (n, 3), and y.shape (n//2, 3):
out = (x.reshape(-1, 2, 3, 1) * y.reshape(-1, 1, 1, 3)).reshape(-1, 9)
I personally would use np.einsum in this situation because I think it's easier to understand than broadcasting.
import numpy as np
(a, b, c, d) = np.random.rand(4)
x = np.array([[1, 0, a], [0, 1, b], [1, 0, c], [0, 1, d]])
y = np.random.rand(2, 3)
z = np.einsum("ij,ik->ijk", x.reshape(-1, 6), y).reshape(-1, 9)
# timeit magic commands.
# %timeit -n 50000 np.einsum("ij,ik->ijk", x.reshape(-1, 6), y).reshape(-1, 9)
# %timeit -n 50000 (x.reshape(-1, 2, 3, 1) * y.reshape(-1, 1, 1, 3)).reshape(-1, 9)
Some good references on Einstein summation in NumPy: [2, 3, 4].
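A quick way to gain confidence in the approaches above is to run them on the same random input and compare the results. The sketch below assumes NumPy is installed; the size n = 6 and the variable names via_kron, via_broadcast and via_einsum are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6                          # number of rows in x, must be even
x = rng.random((n, 3))
y = rng.random((n // 2, 3))

# one Kronecker product per pair of x rows, then stack the blocks
via_kron = np.concatenate(
    [np.kron(xr, yr) for xr, yr in zip(x.reshape(-1, 2, 3), y)], axis=0
)
# broadcasting over a trailing axis for the columns of y
via_broadcast = (x.reshape(-1, 2, 3, 1) * y.reshape(-1, 1, 1, 3)).reshape(-1, 9)
# einsum with each pair of x rows flattened into one row of 6
via_einsum = np.einsum("ij,ik->ijk", x.reshape(-1, 6), y).reshape(-1, 9)

print(np.allclose(via_kron, via_broadcast), np.allclose(via_kron, via_einsum))
```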

How to find the number of pairs of non-coprimes in two given arrays in python?

How to go about finding the number of non-coprimes in a given array?
Suppose
a = [2, 5, 6, 7]
b = [4, 9, 10, 12]
Then the number of non-coprime pairs will be 3, since you can remove:
(2, 4)
(5, 10)
(6, 9)
import math
n = int(input())
a = list(map(int, input().split()))
b = list(map(int, input().split()))
count = 0
len_a = len(a)
len_b = len(b)
for i in range(len_a):
    for j in range(len_b):
        x = a[i]
        y = b[j]
        if(math.gcd(x,y) != 1):
            count += 1
print(count)
This is in reference to: https://www.hackerrank.com/challenges/computer-game/problem
I am receiving 8 as output.
Why do you expect the answer to be 3?
You're pairing 5 and 10, so you're obviously looking at pairs of elements from a and b disregarding their position.
Just print out the pairs and you'll see why you're getting 8...
import math
from itertools import product
a=[2, 5, 6, 7]
b=[4, 9, 10, 12]
print(sum([math.gcd(x, y) != 1 for x, y in product(a, b)])) # 8
print([(x, y) for x, y in product(a, b) if math.gcd(x, y) != 1]) # the pairs
Update: After reading the problem the OP is trying to handle, it's worth pointing out that the expected output (3) is the answer to a different question!
Not how many pairs of elements are not coprime, but rather how many non-coprime pairs can be removed without returning them into the arrays.
This question is actually an order of magnitude more difficult, and is not a matter of fixing one's code, but rather about giving the actual problem a lot of mathematical and algorithmic thought.
See some discussion here
Last edit, a sort-of solution, albeit an extremely inefficient one. The only point is to suggest some code that can help the OP understand the point of the original question, by seeing some form of solution, however low-quality or bad-runtime it is.
import math
from itertools import product, permutations
n = 4
def get_pairs_list_not_coprime_count(pairs_list):
    x, y = zip(*pairs_list)
    # number of pairs before hitting a coprime pair (n if every pair is non-coprime)
    return min((i for i in range(n) if math.gcd(x[i], y[i]) == 1), default=n)
a = [2, 5, 6, 7]
b = [4, 9, 10, 12]
a_perms = permutations(a) # so that the pairing product with b includes all pairing options
b_perms = permutations(b) # so that the pairing product with a includes all pairing options
pairing_options = product(a_perms, b_perms) # pairs-off different orderings of a and b
actual_pairs = [zip(*p) for p in pairing_options] # turn a pair of a&b orderings into number-pairs (for each of the orderings possible as realized by the product)
print(max(get_pairs_list_not_coprime_count(pairs_list) for pairs_list in actual_pairs)) # The most pairings managed over all possible options: 3 for this example
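The removal question can also be read as a maximum bipartite matching: elements of a on one side, elements of b on the other, and an edge wherever the gcd exceeds 1. The following is a sketch using the classic augmenting-path technique, not something taken from the answers above; the function name max_noncoprime_pairs is hypothetical. It avoids enumerating permutations, but it still builds every edge explicitly, so it is meant for small inputs rather than as a solution sized for the linked challenge.

```python
from math import gcd

def max_noncoprime_pairs(a, b):
    """Maximum number of disjoint (a[i], b[j]) pairs with gcd > 1."""
    # adj[i] lists the indices j in b that a[i] may be paired with
    adj = [[j for j, y in enumerate(b) if gcd(x, y) > 1] for x in a]
    match_b = [-1] * len(b)  # match_b[j] = index in a currently paired with b[j]

    def try_assign(i, seen):
        # Try to pair a[i], re-routing earlier pairs if that frees a partner.
        for j in adj[i]:
            if j in seen:
                continue
            seen.add(j)
            if match_b[j] == -1 or try_assign(match_b[j], seen):
                match_b[j] = i
                return True
        return False

    return sum(try_assign(i, set()) for i in range(len(a)))

print(max_noncoprime_pairs([2, 5, 6, 7], [4, 9, 10, 12]))  # 3
```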
I believe the answer should be 8 itself. Out of the 4*4 possible combinations of numbers that you are comparing, there are 8 coprimes and 8 non-coprimes.
Here is an implementation with a hand-written gcd function (without using math), using broadcasting to avoid multiple loops.
import numpy as np
a = '2 5 6 7'
b = '4 9 10 12'
a = np.array(list(map(int,a.split())))
b = np.array(list(map(int,b.split())))
def gcd(p,q):
    while q != 0:
        p, q = q, p%q
    return p
def is_coprime(x, y):
    return gcd(x, y) == 1
is_coprime_v = np.vectorize(is_coprime)
compare = is_coprime_v(a[:, None], b[None, :])
noncoprime_pairs = [(a[i],b[j]) for i,j in np.argwhere(~compare)]
coprime_pairs = [(a[i],b[j]) for i,j in np.argwhere(compare)]
print('non-coprime',noncoprime_pairs)
print('coprime',coprime_pairs)
non-coprime [(2, 4), (2, 10), (2, 12), (5, 10), (6, 4), (6, 9), (6, 10), (6, 12)]
coprime [(2, 9), (5, 4), (5, 9), (5, 12), (7, 4), (7, 9), (7, 10), (7, 12)]
The same solution, but using math.gcd():
import math
import numpy as np
a = '2 5 6 7'
b = '4 9 10 12'
a = np.array(list(map(int,a.split())))
b = np.array(list(map(int,b.split())))
def f(x,y):
    return math.gcd(x, y) == 1
fv = np.vectorize(f)
compare = fv(a[:, None], b[None, :])
noncoprime_pairs = [(a[i],b[j]) for i,j in np.argwhere(~compare)]
print(noncoprime_pairs)
[(2, 4), (2, 10), (2, 12), (5, 10), (6, 4), (6, 9), (6, 10), (6, 12)]
If you are looking for the answer to be 3 in your example, I would assume you are counting the number of values in a that have at least one non-coprime in b.
If that is the case you could do it like this:
from math import gcd
def nonCoprimes(A,B):
    return sum(any(gcd(a,b)>1 for b in B) for a in A)
print(nonCoprimes([2,5,6,7],[4,9,10,12])) # 3
So, for each value in a, check whether any value in b has a gcd greater than 1 with it.

List multiplication in Python

I need to multiply different lists to calculate areas of irregular polygons.
X = [1,1,1,1,1,1]
Y = [5,4,3,2,1,0]
This means that the coordinates of point 1 are (1,5), of point 2 are (1,4), and so on. To calculate the area I need to multiply X[i] * Y[i+1], which gives 1 * 4, 1 * 3, 1 * 2, and so on, and exclude the last multiplication, which would be 1 * empty.
How can I do this?
So, in my understanding, you need a lag between X and Y, where the first element of Y is excluded, and the last element of X is excluded. In other words, you need something like:
[(1, 4), (1, 3), (1, 2), (1, 1), (1, 0)]
You can produce the above via:
zipped = zip(X[:-1], Y[1:])
and you can compute the products of each pair like so:
[a * b for a, b in zipped]
Of course, if X and Y were numpy arrays, you could do this much more efficiently:
>>> X[:-1] * Y[1:]
array([4, 3, 2, 1, 0])
Something like
[x * y for x, y in zip(X, Y[1:])]
would do it. But you should really use Numpy for anything non-trivial.
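For context, lagged products like these are the building block of the shoelace formula for polygon area. Below is a minimal sketch of that formula; it assumes the vertices are given in order around the polygon as two plain lists, and that the last vertex connects back to the first. The function name polygon_area is hypothetical.

```python
def polygon_area(xs, ys):
    """Area of a simple polygon from ordered vertex coordinates (shoelace formula)."""
    # pair each vertex with the next one, wrapping around to close the polygon
    x_next = xs[1:] + xs[:1]
    y_next = ys[1:] + ys[:1]
    twice_area = sum(
        x * yn - xn * y for x, y, xn, yn in zip(xs, ys, x_next, y_next)
    )
    return abs(twice_area) / 2

# a 2 by 2 square
print(polygon_area([0, 2, 2, 0], [0, 0, 2, 2]))  # 4.0
```

Note that the points in the question all share x = 1, so they lie on one vertical line and enclose no area.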

Python Multiply tuples of equal length

I was hoping for an elegant or effective way to multiply sequences of integers (or floats).
My first thought was that (1, 2, 3) * (1, 2, 2) would result in (1, 4, 6), the products of the individual multiplications.
Python isn't set up to do that for sequences, though, which is fine; I wouldn't really expect it to. So what's the Pythonic way to multiply (or possibly apply other arithmetic operations to) the items at matching indices in two sequences?
A second example (0.6, 3.5) * (4, 4) = (2.4, 14)
The simplest way is to use zip function, with a generator expression, like this
tuple(l * r for l, r in zip(left, right))
For example,
>>> tuple(l * r for l, r in zip((1, 2, 3), (1, 2, 3)))
(1, 4, 9)
>>> tuple(l * r for l, r in zip((0.6, 3.5), (4, 4)))
(2.4, 14.0)
In Python 2.x, zip returns a list of tuples. If you want to avoid creating the temporary list, you can use itertools.izip, like this
>>> from itertools import izip
>>> tuple(l * r for l, r in izip((1, 2, 3), (1, 2, 3)))
(1, 4, 9)
>>> tuple(l * r for l, r in izip((0.6, 3.5), (4, 4)))
(2.4, 14.0)
You can read more about the differences between zip and itertools.izip in this question.
A simpler way would be:
from operator import mul
In [19]: tuple(map(mul, [0, 1, 2, 3], [10, 20, 30, 40]))
Out[19]: (0, 20, 60, 120)
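The same map pattern extends to the other arithmetic operations the question mentions, since the operator module provides a function for each of them. A small sketch, where the helper name elementwise is hypothetical and the length check is an added assumption (plain map and zip would silently truncate to the shorter input):

```python
import operator

def elementwise(op, left, right):
    """Apply a two-argument function to items at matching indices."""
    if len(left) != len(right):
        raise ValueError("sequences must have equal length")
    return tuple(map(op, left, right))

print(elementwise(operator.mul, (1, 2, 3), (1, 2, 2)))  # (1, 4, 6)
print(elementwise(operator.add, (1, 2, 3), (1, 2, 2)))  # (2, 4, 5)
```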
If you are interested in element-wise multiplication, you'll probably find that many other element-wise mathematical operations are also useful. If that is the case, consider using the numpy library.
For example:
>>> import numpy as np
>>> x = np.array([1, 2, 3])
>>> y = np.array([1, 2, 2])
>>> x * y
array([1, 4, 6])
>>> x + y
array([2, 4, 5])
With list comprehensions the operation could be completed like
def seqMul(left, right):
    return tuple([value*right[idx] for idx, value in enumerate(left)])
seqMul((0.6, 3.5), (4, 4))
A = (1, 2, 3)
B = (4, 5, 6)
AB = [a * b for a, b in zip(A, B)]
In Python 2, use itertools.izip instead of zip for larger inputs; in Python 3, zip is already lazy.

filling numpy array with random element from another array

I'm not sure if this is possible but here goes. Suppose I have an array:
array1 = [0,.1,.2,.3,.4,.5,.6,.7,.8,.9,1]
and now I would like to create a numpy 1D array consisting of 5 elements that are randomly drawn from array1 AND with the condition that the sum is equal to 1. Example is something like, a numpy array that looks like [.2,.2,.2,.1,.1].
currently I use the random module, and choice function that looks like this:
range1= np.array([choice(array1),choice(array1),choice(array1),choice(array1),choice(array1)])
then checking range1 to see if it meets the criteria; I'm wondering if there is faster way , something similar to
randomArray = np.random.random() instead.
Would be even better if I can store this array in some library so that if I try to generate 100 of such array, that there is no repeat but this is not necessary.
You can use numpy.random.choice if you use numpy 1.7.0+:
>>> import numpy as np
>>> array1 = np.array([0,.1,.2,.3,.4,.5,.6,.7,.8,.9,1])
>>> np.random.choice(array1, 5)
array([ 0. , 0. , 0.3, 1. , 0.3])
>>> np.random.choice(array1, 5, replace=False)
array([ 0.6, 0.8, 0.1, 0. , 0.4])
To get 5 elements whose sum is equal to 1:
1. Generate 4 random numbers.
2. Subtract the sum of those 4 numbers from 1; call the result x.
3. If x is in array1, use it as the final number; otherwise repeat.
>>> import numpy as np
>>>
>>> def solve(arr, total, n):
...     while True:
...         xs = np.random.choice(arr, n-1)
...         remain = total - xs.sum()
...         if remain in arr:
...             return np.append(xs, remain)
...
>>> array1 = np.array([0,.1,.2,.3,.4,.5,.6,.7,.8,.9,1])
>>> print(solve(array1, 1, 5))
[ 0.1 0.3 0.4 0.2 0. ]
Another version (assume given array is sorted):
EPS = 0.0000001
def solve(arr, total, n):
    while True:
        xs = np.random.choice(arr, n-1)
        t = xs.sum()
        i = arr.searchsorted(total - t)
        if abs(t + arr[i] - total) < EPS:
            return np.append(xs, arr[i])
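Both versions above compare floating point values, which is where the rounding trouble comes from. One way around it, sketched below under the assumption that the allowed values are exactly the tenths 0, 0.1, ..., 1, is to do the same rejection loop on integers and divide by 10 only at the end. The function name draw_tenths and its parameters are hypothetical.

```python
import random

def draw_tenths(n=5, total=10, max_value=10):
    """Return n values from {0, 0.1, ..., 1} that sum to total / 10."""
    while True:
        # draw the first n - 1 values as integer tenths
        head = random.choices(range(max_value + 1), k=n - 1)
        remainder = total - sum(head)
        # accept only if the last value is itself an allowed tenth
        if 0 <= remainder <= max_value:
            return [v / 10 for v in head + [remainder]]

print(draw_tenths())  # five values that add up to 1, up to float rounding
```

The integer comparison is exact, so no tolerance such as EPS is needed inside the loop.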
I had to do something similar a while ago.
def getRandomList(n, source):
    '''
    Returns a list of n elements randomly selected from source.
    Selection is done without replacement.
    '''
    indices = list(range(len(source)))  # a list, so that pop() works in Python 3
    randIndices = []
    for i in range(n):
        randIndex = indices.pop(np.random.randint(0, high=len(indices)))
        randIndices += [randIndex]
    return [source[index] for index in randIndices]
data = [1,2,3,4,5,6,7,8,9]
randomData = getRandomList(4, data)
print(randomData)
If you don't care about the order of the values in the output sequences, the number of 5-value combinations of values from your list that add up to 1 is pretty small. In the specific case you proposed, though, it's a bit complicated to calculate, since floating point values have rounding issues. You can solve the issue more easily if you use a set of integers (e.g. range(11)) and find combinations that add up to 10. Then, if you need the fractional values, just divide the values in the results by 10.
Anyway, here's a generator that yields all the possible sets that add up to a given value:
def picks(values, n, target):
    if n == 1:
        if target in values:
            yield (target,)
        return
    for i, v in enumerate(values):
        if v <= target:
            for r in picks(values[i:], n-1, target-v):
                yield (v,)+r
Here are the results for the numbers zero through ten:
>>> for r in picks(range(11), 5, 10):
...     print(r)
(0, 0, 0, 0, 10)
(0, 0, 0, 1, 9)
(0, 0, 0, 2, 8)
(0, 0, 0, 3, 7)
(0, 0, 0, 4, 6)
(0, 0, 0, 5, 5)
(0, 0, 1, 1, 8)
(0, 0, 1, 2, 7)
(0, 0, 1, 3, 6)
(0, 0, 1, 4, 5)
(0, 0, 2, 2, 6)
(0, 0, 2, 3, 5)
(0, 0, 2, 4, 4)
(0, 0, 3, 3, 4)
(0, 1, 1, 1, 7)
(0, 1, 1, 2, 6)
(0, 1, 1, 3, 5)
(0, 1, 1, 4, 4)
(0, 1, 2, 2, 5)
(0, 1, 2, 3, 4)
(0, 1, 3, 3, 3)
(0, 2, 2, 2, 4)
(0, 2, 2, 3, 3)
(1, 1, 1, 1, 6)
(1, 1, 1, 2, 5)
(1, 1, 1, 3, 4)
(1, 1, 2, 2, 4)
(1, 1, 2, 3, 3)
(1, 2, 2, 2, 3)
(2, 2, 2, 2, 2)
You can select one of them at random (with random.choice), or if you plan on using many of them and you don't want to repeat yourself, you can use random.shuffle, then iterate.
import random
results = list(picks(range(11), 5, 10))
random.shuffle(results)
for r in results:
    pass  # do whatever you want with r
