Rounding floats while maintaining total sum equal [duplicate] - python

This question already has answers here:
Round a Python list of numbers and maintain their sum
(5 answers)
Closed 1 year ago.
I have a list of floats that add up to an integer. For circumstantial reasons, I have to iterate using a for loop x times, x being each float in the list, but since the argument for the range() function must be an integer, each float must be rounded. However, I want the total number of loops to remain equal to the sum of the original floats, which doesn't usually add up to the sum of the rounded numbers. How would you solve this problem?
Thank you.

I recently had to solve a similar problem and decided it was something common enough to various projects of mine to generalize a python solution and package it. Check out iteround.

Ok, this is going to be a bit mathematical:
You have a series of real numbers Xi
Their sum equals N
sum(Xi) = N
Let's break each real number into its floor integer part and its residual real part (between 0 and 1): Xi = Ri + fi
Now, you need a series of integers Yi that are as close as possible to Xi and also sum to N. We can break them down like this: Yi = Ri + Fi (where Fi is an integer, either 0 or 1).
Now we need that:
sum(Yi) = sum(Xi) = N
If you break that, you'll get this equation as a requirement for the solution:
sum(Fi) = sum(fi) = N - sum(Ri)
Let's denote: K = N - sum(Ri)
Now the solution is simple: choose the K elements which have the largest fi values and assign their corresponding Fi to 1; assign the other Fi to 0.
Now you have your values for Yi, which in your case are the loop sizes.
Here's the code for it:
import math

def round_series_retain_integer_sum(xs):
    N = sum(xs)
    Rs = [math.floor(x) for x in xs]  # integer parts Ri
    K = round(N - sum(Rs))  # how many values must be rounded up
    assert abs(N - sum(Rs) - K) < 1e-9  # the floats must sum to an integer
    fs = [x - R for x, R in zip(xs, Rs)]  # fractional parts fi
    # indices of the K largest fractional parts
    indices = set(sorted(range(len(xs)), key=lambda i: fs[i], reverse=True)[:K])
    ys = [R + 1 if i in indices else R for i, R in enumerate(Rs)]
    return ys

xs = [5.2, 3.4, 2.1, 7.3, 3.25, 6.25, 8.2, 9.1, 10.1, 55.1]
ys = round_series_retain_integer_sum(xs)
print(xs, sum(xs))
print(ys, sum(ys))
I don't think there are any bugs, but I hope the idea is clear even if there are.
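To connect this back to the loops in the question, here is a minimal sketch that compares plain round() with the largest-fractional-part idea described above. The helper name largest_remainder_round and the sample loop_sizes list are made up for illustration, and the sketch assumes the floats really do sum to an integer.

```python
import math

def largest_remainder_round(xs):
    """Round floats to ints whose sum equals the (integer) sum of xs."""
    floors = [math.floor(x) for x in xs]
    k = round(sum(xs)) - sum(floors)  # how many values must be rounded up
    # Indices ordered by fractional part, largest first.
    order = sorted(range(len(xs)), key=lambda i: xs[i] - floors[i], reverse=True)
    for i in order[:k]:
        floors[i] += 1
    return floors

loop_sizes = [2.5, 2.5, 2.5, 2.5]        # hypothetical input, sums to 10
naive = [round(x) for x in loop_sizes]    # Python 3 rounds halves to even
adjusted = largest_remainder_round(loop_sizes)

iterations = 0
for count in adjusted:
    for _ in range(count):                # range() now gets an int
        iterations += 1

print(sum(naive), sum(adjusted), iterations)  # 8 10 10
```

With plain round() two iterations are lost here, while the adjusted counts keep the total at 10.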

Related

Maximum value of given equation with range of x values

How can I find the maximum value of the following equation: Fp=(1000 + 9*(x**2) - (183)*x) Given values of x in the range of (1-10), using python. This is what I have tried already:
L = range(1, 11)
for x in L:
    Fp = (1000 + 9*(x**2) - (183)*x)
    Td = 20 - 0.12*(x**2) + (4.2)*x
    print(Fp, Td)
print(max(Fp))
Assuming that you have the set of natural numbers in mind, since you have a small range of numbers to check (only 10 numbers), the first approach would be to check the value of the equation for every number, save it in a list, and find the maximum of that list. Take a look at the code below.
max_list = []
for x in range(1, 11):
    Fp = (1000 + 9*(x**2) - (183)*x)
    max_list.append(Fp)
print(max(max_list))
Another, more elegant approach is to analyze the equation. Since Fp is a quadratic with a positive coefficient on the second-power term, it is convex, so its maximum over a range is reached at either the first or the last element of the range.
So you only need to check those values.
value_range = (1,10)
Fp_first = 1000 + 9*(value_range[0]**2) - (183)*value_range[0]
Fp_last = 1000 + 9*(value_range[1]**2) - (183)*value_range[1]
max_val = max(Fp_first , Fp_last)
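As a quick sanity check of the endpoint argument, the sketch below compares the endpoint maximum with a brute-force scan over the same range. The helper name fp is only for illustration. The vertex of the parabola is at x = 183 / (2 * 9), roughly 10.17, which lies to the right of the range, so the function is decreasing over 1..10 and the maximum falls on the first element.

```python
def fp(x):
    """The Fp expression from the question."""
    return 1000 + 9 * x**2 - 183 * x

lo, hi = 1, 10

# Convex quadratic: the maximum over [lo, hi] is at one of the endpoints.
endpoint_max = max(fp(lo), fp(hi))

# Brute-force check over every integer in the range.
brute_force_max = max(fp(x) for x in range(lo, hi + 1))

print(endpoint_max, brute_force_max)  # 826 826
```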
You could do it like this:
def func(x):
    return 1_000 + 9 * x * x - 183 * x

print(max([func(x) for x in range(1, 11)]))
The problem with your code is that you're taking the max of a scalar rather than of the values of Fp for each of the values of x.
For a small range of integer values of x, you can iterate over them as you do. And, if you only need the max value, you can keep track of the highest value seen so far:
L = range(1, 11)
highest_Fp = float('-inf')  # lower than any possible Fp value
for x in L:
    Fp = (1000 + 9*(x**2) - (183)*x)
    Td = 20 - 0.12*(x**2) + (4.2)*x
    print(Fp, Td)
    if Fp > highest_Fp:
        highest_Fp = Fp
print(highest_Fp)

converting floats to fractions

I’m writing in Python3.
I created two lists in my code and I want to ‘connect’ them in a loop as fractions. Is there any possibility to do it in another way than using Fractions library? Unfortunately I can’t use it because it’s the task requirement. The problem comes up when fraction is a floating point number (for example 1/3).
How can I solve this problem?
Here's an example:
p = [1,2,3]
q = [3,5,9]
frac = []
for i in p:
for j in q:
f = i/j
if f not in frac:
frac.append(f)
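You can use the fractions.Fraction type.
Import it using: from fractions import Fraction
Build each fraction from the integer numerator and denominator rather than from the float quotient: f = Fraction(i, j). Fraction(i/j) would capture the rounding error of the float (for example, Fraction(1/3) is not exactly 1/3).
Then use the string conversion as well: f = str(Fraction(i, j))
from fractions import Fraction
f = str(Fraction(i, j))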
If I understood correctly, your problem is not "how to convert floats to fractions" but rather "how to get a string representation of a fraction from lists of numbers", right?
Actually you can do that in one line:
p = [1,2,3]
q = [3,5,9]
list(map(lambda pair: f"{pair[0]}/{pair[1]}", [(x, y) for x in p for y in q]))
Explaining:
map - receives a function and an iterable, and passes each element of the iterable to that function.
[(x, y) for x in p for y in q] - this is a list comprehension; it generates pairs of numbers "for each x in list p, for each y in list q".
lambda pair - this is an anonymous function that receives an argument pair (which we know will be a tuple '(x, y)') and returns the string "x/y" (which is "pair[0]/pair[1]")
Optional procedures
Eliminate zeros in denominator
If you want to avoid impossible fractions (like anything over 0), the list comprehension should be this one:
[(x, y) for x in p for y in q if y != 0]
Eliminate duplicates
Also, if on top of that you want to eliminate duplicate items, just wrap the entire list in a set() operation (sets are iterables with unique elements, and converting a list to a set automatically removes the duplicate elements):
set([(x, y) for x in p for y in q if y != 0])
Eliminate unnecessary duplicate negative signs
The list comprehension is getting a little bigger, but still ok:
set([(x, y) if x>0 or y>0 else (-x,-y) for x in p for y in q if y != 0])
Explaining: if x>0 or y>0, at most one of them can be negative, so that's ok, return (x,y). If not, both of them are negative (or zero), so flip both signs and return (-x,-y).
Testing
The final result of the script is:
p = [1, -1, 0, 2, 3]
q = [3, -5, 9, 0]
print(list(map(lambda pair: f"{pair[0]}/{pair[1]}", set([(x, y) if x>0 or y>0 else (-x,-y) for x in p for y in q if y != 0]))))
# output:
# ['3/-5', '2/-5', '1/5', '1/-5', '0/3', '0/9', '2/3', '2/9', '3/3', '-1/3', '-1/9', '0/5', '3/9', '1/3', '1/9']
(0.33).as_integer_ratio() could work for your problem. Obviously 0.33 would be replaced by whatever float.
Per this question,
def float_to_ratio(flt):
    if int(flt) == flt:
        return int(flt), 1
    flt_str = str(flt)
    flt_split = flt_str.split('.')
    numerator = int(''.join(flt_split))
    denominator = 10 ** len(flt_split[1])
    return numerator, denominator
this is also a solution.
You can use a loop to figure out the fraction by the simple code below
x = 0.6725
a = 0
b = 1
while (x != a/b):
    if x > a/b:
        a += 1
    elif x < a/b:
        b += 1
print(a, b)
The result of a and b is going to be
269 400
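Since the question rules out the fractions module, another option is to avoid floats altogether: keep each fraction as an integer (numerator, denominator) pair and reduce it with math.gcd, so that 3/9 and 1/3 compare equal. This is only a sketch; the function name reduced_fractions is made up, and it assumes p and q contain integers.

```python
from math import gcd

def reduced_fractions(p, q):
    """Return unique fractions i/j (i in p, j in q) as reduced 'num/den' strings."""
    seen = set()
    result = []
    for i in p:
        for j in q:
            if j == 0:
                continue  # skip undefined fractions
            g = gcd(i, j)  # positive because j != 0
            num, den = i // g, j // g
            if den < 0:
                # Keep the sign in the numerator only.
                num, den = -num, -den
            if (num, den) not in seen:
                seen.add((num, den))
                result.append(f"{num}/{den}")
    return result

print(reduced_fractions([1, 2, 3], [3, 5, 9]))
# ['1/3', '1/5', '1/9', '2/3', '2/5', '2/9', '1/1', '3/5']
```

Because the comparison is done on reduced integer pairs, no floating-point equality test is involved.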

How to generate random values in range (-1, 1) such that the total sum is 0?

If the sum is 1, I could just divide the values by their sum. However, this approach is not applicable when the sum is 0.
Maybe I could compute the opposite of each value I sample, so I would always have a pair of numbers, such that their sum is 0. However this approach reduces the "randomness" I would like to have in my random array.
Are there better approaches?
Edit: the array length can vary (from 3 to few hundreds), but it has to be fixed before sampling.
There is a Dirichlet-Rescale (DRS) algorithm that generates random numbers summing up to a given number. As it says, it has the feature that
the vectors are uniformly distributed over the valid region of the
domain of all possible vectors, bounded by the constraints.
There is also a Python library for it.
You could use sklearn's StandardScaler. It scales your data to have a variance of 1 and a mean of 0. A mean of 0 is equivalent to a sum of 0.
from sklearn.preprocessing import StandardScaler, MinMaxScaler
import numpy as np
rand_numbers = StandardScaler().fit_transform(np.random.rand(100,1, ))
If you don't want to use sklearn you can standardize by hand, the formula is pretty simple:
rand_numbers = np.random.rand(1000,1, )
rand_numbers = (rand_numbers - np.mean(rand_numbers)) / np.std(rand_numbers)
The problem here is the variance of 1, which causes numbers greater than 1 or smaller than -1. Therefore you divide the array by its max abs value.
rand_numbers = rand_numbers*(1/max(abs(rand_numbers)))
Now you have an array with values between -1 and 1 with a sum really close to zero.
print(sum(rand_numbers))
print(min(rand_numbers))
print(max(rand_numbers))
Output:
[-1.51822999e-14]
[-0.99356294]
[1.]
With this solution you will always have either exactly one 1 or exactly one -1 in your data. If you want to avoid this, you could add a positive random factor to the max abs value before dividing: rand_numbers*(1/(max(abs(rand_numbers))+randomfactor))
Edit
As @KarlKnechtel mentioned, the division by the standard deviation is redundant with the division by the max absolute value.
The above can be simply done by:
rand_numbers = np.random.rand(100000,1, )
rand_numbers = rand_numbers - np.mean(rand_numbers)
rand_numbers = rand_numbers / max(abs(rand_numbers))
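For anyone without NumPy, the same center-and-rescale idea can be sketched with the standard library only. The function name zero_sum_sample is made up for this illustration; like the version above, the value with the largest magnitude ends up at exactly 1 or -1, and the sum is zero only up to floating-point error.

```python
import random

def zero_sum_sample(n):
    """Draw n values, center them on their mean, and rescale into [-1, 1]."""
    values = [random.uniform(-1.0, 1.0) for _ in range(n)]
    mean = sum(values) / n
    centered = [v - mean for v in values]  # now sums to (almost exactly) 0
    scale = max(abs(v) for v in centered)
    if scale == 0.0:
        return centered  # degenerate case: all draws were identical
    # Dividing by a positive constant keeps the sum at zero.
    return [v / scale for v in centered]

random.seed(1)
sample = zero_sum_sample(50)
print(len(sample), min(sample) >= -1.0, max(sample) <= 1.0, abs(sum(sample)) < 1e-9)
```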
I would try the following solution:
import random

def draw_randoms_while_sum_not_zero(eps):
    r = random.uniform(-1, 1)
    total = r  # running sum (named total to avoid shadowing the built-in sum)
    yield r
    while abs(total) > eps:
        if total > 0:
            r = random.uniform(-1, 0)
        else:
            r = random.uniform(0, 1)
        total += r
        yield r
As floating-point numbers are not perfectly accurate, you can never be sure that the numbers you draw will sum to exactly 0. You need to decide what margin is acceptable and call the above generator with it.
It'll yield (lazily return) random numbers for as long as their sum is not within 0 ± eps.
epss = [0.1, 0.01, 0.001, 0.0001, 0.00001]
for eps in epss:
    lengths = []
    for _ in range(100):
        lengths.append(len(list(draw_randoms_while_sum_not_zero(eps))))
    print(f'{eps}: min={min(lengths)}, max={max(lengths)}, avg={sum(lengths)/len(lengths)}')
Results:
0.1: min=1, max=24, avg=6.1
0.01: min=1, max=174, avg=49.27
0.001: min=4, max=2837, avg=421.41
0.0001: min=5, max=21830, avg=4486.51
1e-05: min=183, max=226286, avg=48754.42
Since you are fine with the approach of generating lots of numbers and dividing by the sum, why not generate n/2 positive numbers and divide them by their sum, then generate n/2 negative numbers and divide them by the absolute value of their sum?
Want a random mix of positive and negative values? Generate that split randomly first, then continue.
One way to generate such list is by having the opposite number.
If that is not a desirable property, you can introduce some extra randomness by adding / subtracting the same random value to different opposite couples, e.g.:
import random

def exact_sum_uniform_random(num, min_val=-1.0, max_val=1.0, epsilon=0.1):
    items = [random.uniform(min_val, max_val) for _ in range(num // 2)]
    opposites = [-x for x in items]
    if num % 2 != 0:
        items.append(0.0)
    for i in range(len(items)):
        diff = random.random() * epsilon
        if items[i] + diff <= max_val \
                and any(opposite - diff >= min_val for opposite in opposites):
            items[i] += diff
            modified = False
            while not modified:
                j = random.randint(0, num // 2 - 1)
                if opposites[j] - diff >= min_val:
                    opposites[j] -= diff
                    modified = True
    result = items + opposites
    random.shuffle(result)
    return result
random.seed(0)
x = exact_sum_uniform_random(3)
print(x, sum(x))
# [0.7646391433441265, -0.7686875811622043, 0.004048437818077755] 2.2551405187698492e-17
EDIT
If the upper and lower limits are not strict, a simple way to construct a zero sum sequence is to sum-normalize two separate sequences to 1 and -1 and join them together:
def norm(items, scale):
    return [item / scale for item in items]

def zero_sum_uniform_random(num, min_val=-1.0, max_val=1.0):
    a = [random.uniform(min_val, max_val) for _ in range(num // 2)]
    a = norm(a, sum(a))
    b = [random.uniform(min_val, max_val) for _ in range(num - len(a))]
    b = norm(b, -sum(b))
    result = a + b
    random.shuffle(result)
    return result

random.seed(0)
n = 3
x = zero_sum_uniform_random(n)
print(zero_sum_uniform_random(n), sum(x))
# [1.0, 2.2578843364303585, -3.2578843364303585] 0.0
Note that neither approach will, in general, produce a uniform distribution.

How to optimize (3*O(n**2)) + O(n) algorithm?

I am trying to solve the arithmetic progression problem from USACO. Here is the problem statement.
An arithmetic progression is a sequence of the form a, a+b, a+2b, ..., a+nb where n=0, 1, 2, 3, ... . For this problem, a is a non-negative integer and b is a positive integer.
Write a program that finds all arithmetic progressions of length n in the set S of bisquares. The set of bisquares is defined as the set of all integers of the form p^2 + q^2 (where p and q are non-negative integers).
The two lines of input are n and m, which are the length of each sequence, and the upper bound to limit the search of the bi squares respectively.
I have implemented an algorithm which correctly solves the problem, yet it takes too long. With the max constraints of n = 25 and m = 250, my program does not solve the problem in the 5 second time limit.
Here is the code:
n = 25
m = 250
bisq = set()
for i in range(m+1):
    for j in range(i, m+1):
        bisq.add(i**2 + j**2)
seq = []
for b in range(1, max(bisq)):
    for a in bisq:
        x = a
        for i in range(n):
            if x not in bisq:
                break
            x += b
        else:
            seq.append((a, b))
The program outputs the correct answer, but it takes too long. I tried running the program with the max n/m values, and after 30 seconds, it was still going.
Disclaimer: this is not a full answer. This is more of a general direction where to look for.
For each member of a sequence, you're looking for four parameters: two numbers to be squared and summed (q_i and p_i), and two differences to be used in the next step (x and y) such that
q_i**2 + p_i**2 + b = (q_i + x)**2 + (p_i + y)**2
Subject to:
0 <= q_i <= m
0 <= p_i <= m
0 <= q_i + x <= m
0 <= p_i + y <= m
There are too many unknowns so we can't get a closed form solution.
let's fix b: (still too many unknowns)
let's fix q_i, and also state that this is the first member of the sequence. I.e., let's start searching from q_1 = 0, extend as much as possible and then extract all sequences of length n. Still, there are too many unknowns.
let's fix x: we only have p_i and y to solve for. At this point, note that the range of possible values to satisfy the equation is much smaller than the full range of 0..m. After some algebra, b = x*(2*q_i + x) + y*(2*p_i + y), and there are really not many values to check.
This last step prune is what distinguishes it from the full search. If you write down this condition explicitly, you can get the range of possible p_i values and from that find the length of possible sequence with step b as a function of q_i and x. Rejecting sequences smaller than n should further prune the search.
This should get you from O(m**4) complexity to ~O(m**2). It should be enough to get into the time limit.
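A couple more things that might help prune the search space. The last term of a length-n sequence is a + (n-1)*b, and it cannot exceed the largest bisquare 2*m*m, so:
b <= 2*m*m // (n-1)
a <= 2*m*m - b*(n-1)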
An answer on math.stackexchange says that for a number x to be a bisquare, any prime factor of x of the form 3 + 4k (e.g., 3, 7, 11, 19, ...) must have an even power. I think this means that for any n > 3, b has to be even. The first item in the sequence a is a bisquare, so it has an even number of factors of 3. If b is odd, then one of a+1b or a+2b will have an odd number of factors of 3 and therefore isn't a bisquare.
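To illustrate the pruning idea, here is a rough sketch that uses a boolean lookup table and stops b at the point where the last term a + (n-1)*b would exceed the largest bisquare 2*m*m. The function name find_progressions is made up, n is assumed to be at least 2, and no claim is made that this pure-Python version fits the time limit; it only shows how the search space shrinks.

```python
def find_progressions(n, m):
    """Return (a, b) pairs of length-n progressions of bisquares p*p + q*q, 0 <= p, q <= m."""
    limit = 2 * m * m  # largest possible bisquare
    is_bisq = [False] * (limit + 1)
    for p in range(m + 1):
        for q in range(p, m + 1):
            is_bisq[p * p + q * q] = True
    bisquares = [v for v in range(limit + 1) if is_bisq[v]]

    found = []
    for a in bisquares:
        # The last term a + (n - 1) * b must stay within the table.
        max_b = (limit - a) // (n - 1)
        for b in range(1, max_b + 1):
            # Check terms from the far end; a itself is already a bisquare.
            if all(is_bisq[a + k * b] for k in range(n - 1, 0, -1)):
                found.append((a, b))
    # Order by difference b first, then by start a.
    found.sort(key=lambda ab: (ab[1], ab[0]))
    return found

print(find_progressions(5, 7)[:3])
```

The list lookup replaces the set membership test, and the bound on b removes most of the (a, b) pairs that the original double loop visits.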

Finding number of pythagorean triples in a list using python?

I am coding a solution for a problem where the code will find the number of Pythagorean triples in a list given a list a. However, when I submit my code to the auto-grader, there are some test cases where my code fails, but I have no idea what went wrong. Please help me point out my mistake.....
def Q3(a):
    lst = [i ** 2 for i in a]
    lst.sort()
    ans = 0
    for x in lst:
        for y in lst:
            if (x + y) in lst:
                ans += 1
    return ans // 2
"Pythagorean triples" are integer solutions to the Pythagorean Theorem, for example, 3^2 + 4^2 = 5^2. Given a list of positive integers, find the number of Pythagorean triplets. Two Pythagorean triplets are different if at least one integer is different.
Implementation
· Implement a function Q3(A), where the A is a list of positive integers. The size of list A is up to 250.
· There are no duplicates in the list A
· This function returns the number of Pythagorean triplets.
Sample
· Q3( [3,4,6,5] ) = 1
· Q3( [4,5,6] ) = 0
A simple but not very efficient solution would be to loop over the numbers in the range (I have taken the numbers from 1 to 100, for instance) in 3 nested for loops as below. It is slow, though: for 100 elements it needs 100^3 operations.
triplets = []
for base in range(1, 101):
    for height in range(1, 101):
        for hypotenuse in range(1, 101):
            # check if forms a triplet
            if hypotenuse**2 == base**2 + height**2:
                triplets.append((base, height, hypotenuse))
This can be made slightly more efficient (there are better solutions)
by calculating the hypotenuse for each base and height combination and then checking whether the hypotenuse is an integer
import math

triplets = []
for base in range(1, 101):
    for height in range(1, 101):
        hypotenuse = math.sqrt(base**2 + height**2)
        # check if hypotenuse is an integer via the remainder of division by 1
        if hypotenuse % 1 == 0:
            triplets.append((base, height, hypotenuse))
# the above solution written as a list comprehension
a = range(1, 101)
[(i, j, math.sqrt(i*i + j*j)) for i in a for j in a if math.sqrt(i*i + j*j) % 1 == 0]
Each triple appears twice this way, as (3, 4, 5) and as (4, 3, 5). If you want to count them only once, store each triple in sorted order in a set instead of a list and take len(triplets_set) at the end.
Problem 1: Suppose your input is
[3,4,5,5,5]
Though it's somewhat unclear in your question, my presumption is that this should count as three Pythagorean triples, each using one of the three 5s.
Your function would only return 1.
Problem 2: As Sayse points out, your "triple" might be trying to use the same number twice.
You would be better off using itertools.combinations to get distinct combinations from your squares list, and counting how many suitable triples appear.
from itertools import combinations

def Q3(a):
    squares = [i**2 for i in a]
    squares.sort()
    ans = 0
    for x, y, z in combinations(squares, 3):
        if x + y == z:
            ans += 1
    return ans
Given the constraints of the input you now added to your question with an edit, I don't think there's anything logically wrong with your implementation. The only type of test cases that your code can fail to pass has to be performance-related as you are using one of the slowest solutions by using 3 nested loops iterating over the full range of the list (the in operator itself is implemented with a loop).
Since the list is sorted and we want x < y < z, we should make y start from x + 1 and make z start from y + 1. And since, for a given x, the value of z depends on the value of y, for each given y we can increment z until z * z < x * x + y * y no longer holds, and if z * z == x * x + y * y at that point, we've found a Pythagorean triple. This allows y and z to sweep through the values above x only once and therefore reduces the time complexity from O(n^3) to O(n^2), making it around 40 times faster when the size of the list is 250:
def Q3(a):
    lst = [i * i for i in sorted(a)]
    ans = 0
    for x in range(len(lst) - 2):
        y = x + 1
        z = y + 1
        while z < len(lst):
            while z < len(lst) and lst[z] < lst[x] + lst[y]:
                z += 1
            if z < len(lst) and lst[z] == lst[x] + lst[y]:
                ans += 1
            y += 1
    return ans
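Another way to get to roughly O(n^2), sketched here as an alternative rather than as the approach above, is to put the squares in a set and test every pair of distinct values for membership. The function name count_pythagorean_triples is made up, and the sketch relies on the stated constraint that the list has no duplicates.

```python
from itertools import combinations

def count_pythagorean_triples(a):
    """Count triples x < y < z from a with x*x + y*y == z*z (no duplicates in a)."""
    squares = {v * v for v in a}
    count = 0
    # Each unordered pair (x, y) is visited once, so no halving is needed.
    for x, y in combinations(a, 2):
        if x * x + y * y in squares:  # average O(1) set lookup
            count += 1
    return count

print(count_pythagorean_triples([3, 4, 6, 5]))  # 1
print(count_pythagorean_triples([4, 5, 6]))     # 0
```

Set membership is constant time on average, unlike the in operator on a list, which is what makes the original version effectively cubic.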
