Algorithm for distributing tasks to two printers? - python

I am doing the programming exercise online, and I found this question:
Two printers work in different speed. The first printer produces one paper in x minutes, while the second does it in y minutes. To print N papers in total, how to distribute the tasks to those printers so the printing time is minimum?
The exercise gives me three inputs x,y,N and asks for the minimum time as output.
input data:
1 1 5
3 5 4
answer:
3 9
I have tried to set tasks for first printer as a, and the tasks for the second printer as N-a. The most efficient way to print is to let them have the same time, so the minimum time would be ((n*b)/(a+b))+1. However this formula is wrong.
Then I tried to use a brute force way to solve this problem. I first distinguished which one is smaller (faster) in a and b. Then I keep adding one paper to the faster printer. When the time needed for that printer is longer than the time to print one paper of the other printer, I give one paper to the slower printer, and subtract the time of faster printer.
The code is like:
def fastest_time (a, b, n):
""" Return the smalles time when keep two machine working at the same time.
The parameter a and b each should be a float/integer referring to the two
productivities of two machines. n should be an int, refering to the total
number of tasks. Return an int standing for the minimal time needed."""
# Assign the one-paper-time in terms of the magnitude of it, the reason
# for doing that is my algorithm is counting along the faster printer.
if a > b:
slower_time_each = a
faster_time_each = b
elif a < b :
slower_time_each = b
faster_time_each = a
# If a and b are the same, then we just run the formula as one printer
else :
return (a * n) / 2 + 1
faster_paper = 0
faster_time = 0
slower_paper = 0
# Loop until the total papers satisfy the total task
while faster_paper + slower_paper < n:
# We keep adding one task to the faster printer
faster_time += 1 * faster_time_each
faster_paper += 1
# If the time is exceeding the time needed for the slower machine,
# we then assign one task to it
if faster_time >= slower_time_each:
slower_paper += 1
faster_time -= 1 * slower_time_each
# Return the total time needed
return faster_paper * faster_time_each
It works when N is small or x and y are big, but it needs a lot of time (more than 10 minutes I guess) to compute when x and y are very small, i.e. the input is 1 2 159958878.
I believe there is an better algorithm to solve this problem, can anyone gives me some suggestions or hints please?

Given the input in form
x, y, n = 1, 2, 159958878
this should work
import math
math.ceil((max((x,y)) / float(x+y)) * n) * min((x,y))
This works for all your sample inputs.
In [61]: x, y, n = 1,1,5
In [62]: math.ceil((max((x,y)) / float(x+y)) * n) * min((x,y))
Out[62]: 3.0
In [63]: x, y, n = 3,5,4
In [64]: math.ceil((max((x,y)) / float(x+y)) * n) * min((x,y))
Out[64]: 9.0
In [65]: x, y, n = 1,2,159958878
In [66]: math.ceil((max((x,y)) / float(x+y)) * n) * min((x,y))
Out[66]: 106639252.0
EDIT:
This does not work for the case mentioned by #Antti i.e. x, y, n = 4,7,2.
Reason is that we are considering smaller time first. So the solution is to find both the values i.e. considering smaller time and considering larger time, and then choose whichever of the resultant value is smaller.
So, this works for all the cases including #Antii's
min((math.ceil((max((x,y)) / float(x+y)) * n) * min((x,y)),
math.ceil((min((x,y)) / float(x+y)) * n) * max((x,y))))
Although there might be some extreme cases where you might have to change it a little bit.

Related

A while loop time complexity

I'm interested in determining the big O time complexity of the following:
def f(x):
r = x / 2
d = 1e-10
while abs(x - r**2) > d:
r = (r + x/r) / 2
return r
I believe this is O(log n). To arrive at this, I merely collected empirical data via the timeit module and plotted the results, and saw that a plot that looked logarithmic using the following code:
ns = np.linspace(1, 50_000, 100, dtype=int)
ts = [timeit.timeit('f({})'.format(n),
number=100,
globals=globals())
for n in ns]
plt.plot(ns, ts, 'or')
But this seems like a corny way to go about figuring this out. Intuitively, I understand that the body of the while loop involves dividing an expression by 2 some number k times until the while expression is equal to d. This repeated division by 2 gives something like 1/2^k, from which I can see where a log is involved to solve for k. I can't seem to write down a more explicit derivation, though. Any help?
This is Heron's (Or Babylonian) method for calculating the square root of a number. https://en.wikipedia.org/wiki/Methods_of_computing_square_roots
Big O notation for this requires a numerical analysis approach. For more details on the analysis you can check the wikipedia page listed or look for Heron's error convergence or fixed point iteration. (or look here https://mathcirclesofchicago.org/wp-content/uploads/2015/08/johnson.pdf)
Broad-strokes, if we can write the error e_n = (x-r_n**2) in terms of itself to where e_n = (e_n**2)/(2*(e_n+1))
Then we can see that e_n+1 <= min{(e_n**2)/2,e_n/2} so we have the error decrease quadratically. With the degrees of accuracy effectively doubling each iteration.
Whats different between this analysis and Big-O, is that the time it takes does NOT depend on the size of the input, but instead of the wanted accuracy. So in terms of input, this while loop is O(1) because its number of iterations is bounded by the accuracy not the input.
In terms of accuracy the error is bounded by above by e_n < 2**(-n) so we would need to find -n such that 2**(-n) < d. So log_2(d) = b such that 2^b = d. Assuming d < 2, then n = floor(log_2(d)) would work. So in terms of d, it is O(log(d)).
EDIT: Some more info on error analysis of fixed point iteration http://www.maths.lth.se/na/courses/FMN050/media/material/part3_1.pdf
I believe you're correct that it's O(log n).
Here you can see the successive values of r when x = 100000:
1 50000
2 25001
3 12502
4 6255
5 3136
6 1584
7 823
8 472
9 342
10 317
11 316
12 316
(I've rounded them off because the fractions are not interesting).
What you can see if that it goes through two phases.
Phase 1 is when r is large. During these first few iterations, x/r is tiny compared to r. As a result, r + x/r is close to r, so (r + x/r) / 2 is approximately r/2. You can see this in the first 8 iterations.
Phase 2 is when it gets close to the final result. During the last few iterations, x/r is close to r, so r + x/r is close to 2 * r, so (r + x/r) / 2 is close to r. At this point we're just improving the approximation by small amounts. These iterations are not really very dependent on the magnitude of x.
Here's the succession for x = 1000000 (10x the above):
1 500000
2 250001
3 125002
4 62505
5 31261
6 15646
7 7855
8 3991
9 2121
10 1296
11 1034
12 1001
13 1000
14 1000
This time there are 10 iterations in Phase 1, then we again have 4 iterations in Phase 2.
The complexity of the algorithm is dominated by Phase 1, which is logarithmic because it's approximately dividing by 2 each time.

Why is Z3 slow for tiny search space?

I'm trying to make a Z3 program (in Python) that generates boolean circuits that do certain tasks (e.g. adding two n-bit numbers) but the performance is terrible to the point where a brute-force search of the entire solution space would be faster. This is my first time using Z3 so I could be doing something that impacts my performance, but my code seems fine.
The following is copied from my code here:
from z3 import *
BITLEN = 1 # Number of bits in input
STEPS = 1 # How many steps to take (e.g. time)
WIDTH = 2 # How many operations/values can be stored in parallel, has to be at least BITLEN * #inputs
# Input variables
x = BitVec('x', BITLEN)
y = BitVec('y', BITLEN)
# Define operations used
op_list = [BitVecRef.__and__, BitVecRef.__or__, BitVecRef.__xor__, BitVecRef.__xor__]
unary_op_list = [BitVecRef.__invert__]
for uop in unary_op_list:
op_list.append(lambda x, y : uop(x))
# Chooses a function to use by setting all others to 0
def chooseFunc(i, x, y):
res = 0
for ind, op in enumerate(op_list):
res = res + (ind == i) * op(x, y)
return res
s = Solver()
steps = []
# First step is just the bits of the input padded with constants
firststep = Array("firststep", IntSort(), BitVecSort(1))
for i in range(BITLEN):
firststep = Store(firststep, i * 2, Extract(i, i, x))
firststep = Store(firststep, i * 2 + 1, Extract(i, i, y))
for i in range(BITLEN * 2, WIDTH):
firststep = Store(firststep, i, BitVec("const_0_%d" % i, 1))
steps.append(firststep)
# Generate remaining steps
for i in range(1, STEPS + 1):
this_step = Array("step_%d" % i, IntSort(), BitVecSort(1))
last_step = steps[-1]
for j in range(WIDTH):
func_ind = Int("func_%d_%d" % (i,j))
s.add(func_ind >= 0, func_ind < len(op_list))
x_ind = Int("x_%d_%d" % (i,j))
s.add(x_ind >= 0, x_ind < WIDTH)
y_ind = Int("y_%d_%d" % (i,j))
s.add(y_ind >= 0, y_ind < WIDTH)
node = chooseFunc(func_ind, Select(last_step, x_ind), Select(last_step, y_ind))
this_step = Store(this_step, j, node)
steps.append(this_step)
# Set the result to the first BITLEN bits of the last step
if BITLEN == 1:
result = Select(steps[-1], 0)
else:
result = Concat(*[Select(steps[-1], i) for i in range(BITLEN)])
# Set goal
goal = x | y
s.add(ForAll([x, y], goal == result))
print(s)
print(s.check())
print(s.model())
The code basically lays out the inputs as individual bits, then at each "step" one of 5 boolean functions can operate on the values from the previous step, where the final step represents the end result.
In this example, I generate a circuit to calculate the boolean OR of two 1-bit inputs, and an OR function is available in the circuit, so the solution is trivial.
I have a solution space of only 5*5*2*2*2*2=400:
5 Possible functions (two function nodes)
2 Inputs for each function, each of which has two possible values
This code takes a few seconds to run and provides a correct answer, but I feel like it should run instantaneously as there are only 400 possible solutions, of which quite a few are valid. If I increase the inputs to be two bits long, the solution space has a size of (5^4)*(4^8)=40,960,000 and never finishes on my computer, though I feel this should be easily doable with Z3.
I also tried effectively the same code but substituted Arrays/Store/Select for Python lists and "selected" the variables by using the same trick I used in chooseFunc(). The code is here and it runs in around the same time the original code does, so no speedup.
Am I doing something that would drastically slow down the solver? Thanks!
You have a duplicated __xor__ in your op_list; but that's not really the major problem. The slowdown is inevitable as you increase bit-size, but on a first look you can (and should) avoid mixing integer reasoning with booleans here. I'd code your chooseFunc as follows:
def chooseFunc(i, x, y):
res = False;
for ind, op in enumerate(op_list):
res = If(ind == i, op (x, y), res)
return res
See if that improves run-times in any meaningful way. If not, the next thing to do would be to get rid of arrays as much as possible.

Standard deviation of combinations of dices

I am trying to find stdev for a sequence of numbers that were extracted from combinations of dice (30) that sum up to 120. I am very new to Python, so this code makes the console freeze because the numbers are endless and I am not sure how to fit them all into a smaller, more efficient function. What I did is:
found all possible combinations of 30 dice;
filtered combinations that sum up to 120;
multiplied all items in the list within result list;
tried extracting standard deviation.
Here is the code:
import itertools
import numpy
dice = [1,2,3,4,5,6]
subset = itertools.product(dice, repeat = 30)
result = []
for x in subset:
if sum(x) == 120:
result.append(x)
my_result = numpy.product(result, axis = 1).tolist()
std = numpy.std(my_result)
print(std)
Note that D(X^2) = E(X^2) - E(X)^2, you can solve this problem analytically by following equations.
f[i][N] = sum(k*f[i-1][N-k]) (1<=k<=6)
g[i][N] = sum(k^2*g[i-1][N-k])
h[i][N] = sum(h[i-1][N-k])
f[1][k] = k ( 1<=k<=6)
g[1][k] = k^2 ( 1<=k<=6)
h[1][k] = 1 ( 1<=k<=6)
Sample implementation:
import numpy as np
Nmax = 120
nmax = 30
min_value = 1
max_value = 6
f = np.zeros((nmax+1, Nmax+1), dtype ='object')
g = np.zeros((nmax+1, Nmax+1), dtype ='object') # the intermediate results will be really huge, to keep them accurate we have to utilize python big-int
h = np.zeros((nmax+1, Nmax+1), dtype ='object')
for i in range(min_value, max_value+1):
f[1][i] = i
g[1][i] = i**2
h[1][i] = 1
for i in range(2, nmax+1):
for N in range(1, Nmax+1):
f[i][N] = 0
g[i][N] = 0
h[i][N] = 0
for k in range(min_value, max_value+1):
f[i][N] += k*f[i-1][N-k]
g[i][N] += (k**2)*g[i-1][N-k]
h[i][N] += h[i-1][N-k]
result = np.sqrt(float(g[nmax][Nmax]) / h[nmax][Nmax] - (float(f[nmax][Nmax]) / h[nmax][Nmax]) ** 2)
# result = 32128174994365296.0
You ask for a result of an unfiltered lengths of 630 = 2*1023, impossible to handle as such.
There are two possibilities that can be combined:
Include more thinking to pre-treat the problem, e.g. on how to sample only
those with sum 120.
Do a Monte Carlo simulation instead, i.e. don't sample all
combinations, but only a random couple of 1000 to obtain a representative
sample to determine std sufficiently accurate.
Now, I only apply (2), giving the brute force code:
N = 30 # number of dices
M = 100000 # number of samples
S = 120 # required sum
result = [[random.randint(1,6) for _ in xrange(N)] for _ in xrange(M)]
result = [s for s in result if sum(s) == S]
Now, that result should be comparable to your result before using numpy.product ... that part I couldn't follow, though...
Ok, if you are out after the standard deviation of the product of the 30 dices, that is what your code does. Then I need 1 000 000 samples to get roughly reproducible values for std (1 digit) - takes my PC about 20 seconds, still considerably less than 1 million years :-D.
Is a number like 3.22*1016 what you are looking for?
Edit after comments:
Well, sampling the frequency of numbers instead gives only 6 independent variables - even 4 actually, by substituting in the constraints (sum = 120, total number = 30). My current code looks like this:
def p2(b, s):
return 2**b * 3**s[0] * 4**s[1] * 5**s[2] * 6**s[3]
hits = range(31)
subset = itertools.product(hits, repeat=4) # only 3,4,5,6 frequencies
product = []
permutations = []
for s in subset:
b = 90 - (2*s[0] + 3*s[1] + 4*s[2] + 5*s[3]) # 2 frequency
a = 30 - (b + sum(s)) # 1 frequency
if 0 <= b <= 30 and 0 <= a <= 30:
product.append(p2(b, s))
permutations.append(1) # TODO: Replace 1 with possible permutations
print numpy.std(product) # TODO: calculate std manually, considering permutations
This computes in about 1 second, but the confusing part is that I get as a result 1.28737023733e+17. Either my previous approaches or this one has a bug - or both.
Sorry - not that easy: The sampling is not of the same probability - that is the problem here. Each sample has a different number of possible combinations, giving its weight, which has to be considered before taking the std-deviation. I have drafted that in the code above.

Project Euler #45: How is my logic wrong?

From Project Euler, problem 45:
Triangle, pentagonal, and hexagonal numbers are generated by the following formulae:
Triangle T_(n)=n(n+1)/2 1, 3, 6, 10, 15, ...
Pentagonal P_(n)=n(3n−1)/2 1, 5, 12, 22, 35, ...
Hexagonal H_(n)=n(2n−1) 1, 6, 15, 28, 45, ...
It can be verified that T_(285) = P_(165) = H_(143) = 40755.
Find the next triangle number that is also pentagonal and hexagonal.
[ http://projecteuler.net/problem=45 ]
Now to solve them I took three variables and equated the equations to A.
n(n + 1)/2 = a(3a - 1)/2 = b(2b - 1) = A
A = number at which the threee function coincide for values of n, a, b
Resultant we get 3 equations with n and A. Solving with quarditic formula, we get 3 equations.
(-1 + sqrt(1 + 8*A ) )/2
( 1 + sqrt(1 + 24*A) )/6
( 1 + sqrt(1 + 8*A ) )/4
So my logic is to test for values of A at which the three equation give a natural +ve value. So far it works correct for number 40755 but fails to find the next one upto 10 million.
(Edit): Here is my code in python
from math import *
i=10000000
while(1):
i = i + 1
if(((-1+sqrt(1+8*i))/2).is_integer()):
if(((1+sqrt(1+24*i))/6).is_integer()):
if(((1+sqrt(1+8*i))/4).is_integer()):
print i
break
How is my logic wrong? (Apologies for a bit of maths involved. :) )
Given that:
all hexagonals are also triangulars
heapq.merge is quite handy for the task at hand (efficient and saves code)
then this:
import heapq
def hexagonals():
"Simplified generation of hexagonals"
n= 1
dn= 5
while 1:
yield n
n+= dn
dn+= 4
def pentagonals():
n= 1
dn= 4
while 1:
yield n
n+= dn
dn+= 3
def main():
last_n= 0
for n in heapq.merge(hexagonals(), pentagonals()):
if n == last_n:
print n
last_n= n
main()
produces 1, 40755 and the other number you're seeking in almost no time, and a few seconds later a 14-digit number. Just stop the program when you think you burned enough electricity.
In case you want to avoid “opaque” libraries, use the following main (essentially the same algorithm, only spelled out):
def main():
hexagonal= hexagonals()
pentagonal= pentagonals()
h= next(hexagonal)
p= next(pentagonal)
while 1:
while p < h:
p= next(pentagonal)
if p == h:
print p
h= next(hexagonal)
Times look similar, but I didn't bother to benchmark.
Your logic is not wrong, your program just takes a long time to run (by my estimate it should provide an answer in about an hour). I know the answer and tested your program by setting i to a value just below it. Your program then popped out the right answer at once.
Heed the advice of ypercube.
simplest way to implement is to make 3 generators for each sequence and route them in
heapq.merge
and then if you find 3 same consecitive keys you got solution
Simplest way to find this is usnig
itertools.groupby

Speeding up computations with numpy matrices

I have two matrices. Both are filled with zeros and ones. One is a big one (3000 x 2000 elements), and the other is smaller ( 20 x 20 ) elements. I am doing something like:
newMatrix = (size of bigMatrix), filled with zeros
l = (a constant)
for y in xrange(0, len(bigMatrix[0])):
for x in xrange(0, len(bigMatrix)):
for b in xrange(0, len(smallMatrix[0])):
for a in xrange(0, len(smallMatrix)):
if (bigMatrix[x, y] == smallMatrix[x + a - l, y + b - l]):
newMatrix[x, y] = 1
Which is being painfully slow. Am I doing anything wrong? Is there a smart way to make this work faster?
edit: Basically I am, for each (x,y) in the big matrix, checking all the pixels of both big matrix and the small matrix around (x,y) to see if they are 1. If they are 1, then I set that value on newMatrix. I am doing a sort of collision detection.
I can think of a couple of optimisations there -
As you are using 4 nested python "for" statements, you are about as slow as you can be.
I can't figure out exactly what you are looking for -
but for one thing, if your big matrix "1"s density is low, you can certainly use python's "any" function on bigMtarix's slices to quickly check if there are any set elements there -- you could get a several-fold speed increase there:
step = len(smallMatrix[0])
for y in xrange(0, len(bigMatrix[0], step)):
for x in xrange(0, len(bigMatrix), step):
if not any(bigMatrix[x: x+step, y: y + step]):
continue
(...)
At this point, if still need to interact on each element, you do another pair of indexes to walk each position inside the step - but I think you got the idea.
Apart from using inner Numeric operations like this "any" usage, you could certainly add some control flow code to break-off the (b,a) loop when the first matching pixel is found.
(Like, inserting a "break" statement inside your last "if" and another if..break pair for the "b" loop.
I really can't figure out exactly what your intent is - so I can't give you more specifc code.
Your example code makes no sense, but the description of your problem sounds like you are trying to do a 2d convolution of a small bitarray over the big bitarray. There's a convolve2d function in scipy.signal package that does exactly this. Just do convolve2d(bigMatrix, smallMatrix) to get the result. Unfortunately the scipy implementation doesn't have a special case for boolean arrays so the full convolution is rather slow. Here's a function that takes advantage of the fact that the arrays contain only ones and zeroes:
import numpy as np
def sparse_convolve_of_bools(a, b):
if a.size < b.size:
a, b = b, a
offsets = zip(*np.nonzero(b))
n = len(offsets)
dtype = np.byte if n < 128 else np.short if n < 32768 else np.int
result = np.zeros(np.array(a.shape) + b.shape - (1,1), dtype=dtype)
for o in offsets:
result[o[0]:o[0] + a.shape[0], o[1]:o[1] + a.shape[1]] += a
return result
On my machine it runs in less than 9 seconds for a 3000x2000 by 20x20 convolution. The running time depends on the number of ones in the smaller array, being 20ms per each nonzero element.
If your bits are really packed 8 per byte / 32 per int,
and you can reduce your smallMatrix to 20x16,
then try the following, here for a single row.
(newMatrix[x, y] = 1 when any bit of the 20x16 around x,y is 1 ??
What are you really looking for ?)
python -m timeit -s '
""" slide 16-bit mask across 32-bit pairs bits[j], bits[j+1] """
import numpy as np
bits = np.zeros( 2000 // 16, np.uint16 ) # 2000 bits
bits[::8] = 1
mask = 32+16
nhit = 16 * [0]
def hit16( bits, mask, nhit ):
"""
slide 16-bit mask across 32-bit pairs bits[j], bits[j+1]
bits: long np.array( uint16 )
mask: 16 bits, int
out: nhit[j] += 1 where pair & mask != 0
"""
left = bits[0]
for b in bits[1:]:
pair = (left << 16) | b
if pair: # np idiom for non-0 words ?
m = mask
for j in range(16):
if pair & m:
nhit[j] += 1
# hitposition = jb*16 + j
m <<= 1
left = b
# if any(nhit): print "hit16:", nhit
' \
'
hit16( bits, mask, nhit )
'
# 15 msec per loop, bits[::4] = 1
# 11 msec per loop, bits[::8] = 1
# mac g4 ppc

Categories