Combining functions to produce a desired integer output


I'm not sure if this question belongs on stackoverflow or math, but I'll give stackoverflow a shot.
I'm attempting to create an algorithm / function that will allow me to solve a puzzle, but I am simply unable to. The puzzle is as stated.
Let a unit be a function that accepts an integer i between 0 and 15.
A unit can add or subtract any number in the range 0 - 15 to/from i.
Additionally, instead of adding or subtracting a number to/from i, a unit can also contain a number from 0-15 and subtract i from that.
A unit can only perform 2 operations on a number, and the operation producing the largest value will be the output of the unit.
Values only go from 0 - 15, so 9 - 15 = 0 and 13 + 5 = 15.
We may combine units together to produce a more complex result.
The first unit may only accept numbers ranging from 0 - 9.
In my examples I will string together 3 units.
This is a problem that is unrelated to coding, but it seems that I need a program to figure out possible solutions. I've attempted to create a brute force algorithm to find solutions, but I've been unable to do so, as I'm not that great with coding.
For example, a problem could be:
For values 1 and 4, let the output be 0. For all others, e.g. 0, 2, 3, 5, 6, 7, 8, 9, the output must be greater than 0.
A solution here might be:
def unit1(input):
    return max(5 - input, input)
def unit2(input):
    return max(14 - input, input)
def unit3(input):
    return max(10 - input, input - 10)
print(unit3(unit2(unit1(4))))
Another example might be:
For values 4, 5, 6 and 8 the output must be 3 or greater. For all others, e.g. 0, 1, 2, 3, 7, 9, the output must be less than 3.
A solution here might be:
def unit1(input):
    return max(4 - input, input - 4)
def unit2(input):
    return max(2, input)
def unit3(input):
    return max(1 - input, input - 6)
print(unit3(unit2(unit1(5))))
Given an example as the two stated above, is there a general algorithm / formula I can use to find my desired output?
Additionally, is it possible to solve the problems above using only 2 units?
Please do let me know if I need to elaborate on something, and know that your help is extremely appreciated!

There seems to be basically two kinds of things you have to do: map inputs that should be handled the same way to contiguous ranges, and then move contiguous ranges to the right place.
max(A-x,x-B) is the only kind of unit that can map non-contiguous ranges together. It has limitations: It always maps 2 inputs onto one output, and you've gotta be careful never to map two inputs that have to be handled differently onto the same output.
In terms of what gets mapped together, you only need one parameter, since max(x,A-x) handles all cases. You can try all 16 possibilities to see if any of them help. Sometimes you may need to do a saturating add before the max to collapse inputs at the top or bottom of the range.
In your first example, you need to map 1 and 4 together.
In max(x,A-x), we need 4 = A-1.
Solving that we get A=5, so we start with
max(x, 5-x)
That maps 4 and 1 to 4, and everything else to other values.
Now we need to combine the ranges above and below 4. Everything less than 4 has to map to something higher than 4. We solve 5 = A-3 to get A = 8:
max(x, 8-x)
Now the ranges of things that need to be handled the same way are contiguous, so we just need to move them to the right place. We have values >=4 and we need 4->0. We could add a subtraction unit, but it's shorter just to shift the previous max by subtracting 4 from both cases. We're left with a final solution
max(x, 5-x)
max(x-4, 4-x)
You haven't really defined all the possible questions you might be asked, but it looks like they can all be solved by this two step combine-and-shift process. Sometimes there will be no solution because you can't combine ranges in any valid way with max.
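The two-unit solution above can be checked mechanically. The following is only a sketch: it assumes that saturation to the range 0 - 15 is applied after each individual operation (the question does not say exactly where clamping happens), and the helper name clamp is made up for the illustration.

```python
def clamp(value):
    """Saturate to the puzzle's 0..15 range (assumed to apply after each operation)."""
    return max(0, min(15, value))

def unit1(x):
    # max(x, 5 - x): maps 1 and 4 onto the same value, 4
    return max(clamp(x), clamp(5 - x))

def unit2(x):
    # max(x - 4, 4 - x): folds everything around 4 and shifts 4 down to 0
    return max(clamp(x - 4), clamp(4 - x))

# The first unit only receives inputs 0..9.
outputs = {x: unit2(unit1(x)) for x in range(10)}
print(outputs)
```

Under these assumptions only inputs 1 and 4 end up at 0, which is what the first example asks for.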

I think your missing piece you need to proceed is a higher-order function. That's a function that returns another function. Something like:
def createUnit(sub, add):
    def unit(input):
        return max(sub - input, input + add)
    return unit
You can use it like:
unit1 = createUnit(5, 0)
unit2 = createUnit(14, 0)
unit3 = createUnit(10, -10)
Your first example can be solved with two units, but your second example can't.
I think the key to solving this efficiently is to work backwards from the output to the input, but I haven't spent enough time on it to figure out precisely how to do so.
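For the brute-force search the question asks about, here is a rough sketch under one possible reading of the rules. It assumes a unit is the max of two operations, each of the form x + c, x - c or c - x with c in 0..15, and that every operation saturates to 0..15. Constant operands such as the 2 in max(2, input) are not covered by this model. All function names here are hypothetical. Units that behave identically on 0..15 are merged first, which keeps the search over chains small.

```python
from itertools import combinations_with_replacement, product

def clamp(value):
    # Assumed saturation rule: every operation result stays within 0..15.
    return max(0, min(15, value))

def single_operations():
    """All assumed operations: x + c, x - c and c - x for c in 0..15."""
    ops = {}
    for c in range(16):
        ops[f"x + {c}"] = lambda x, c=c: clamp(x + c)
        ops[f"x - {c}"] = lambda x, c=c: clamp(x - c)
        ops[f"{c} - x"] = lambda x, c=c: clamp(c - x)
    return ops

def distinct_units():
    """Map each distinct 16-entry lookup table to one description of a unit."""
    ops = single_operations()
    tables = {}
    for (name1, f1), (name2, f2) in combinations_with_replacement(ops.items(), 2):
        table = tuple(max(f1(x), f2(x)) for x in range(16))
        tables.setdefault(table, f"max({name1}, {name2})")
    return tables

def apply_chain(chain, x):
    """Feed x through the lookup tables of a chain of units."""
    for table, _ in chain:
        x = table[x]
    return x

def find_chain(accept, n_units, inputs=range(10)):
    """Return descriptions of the first chain of n_units units for which
    accept(x, output) holds for every input x, or None if none exists."""
    units = list(distinct_units().items())
    for chain in product(units, repeat=n_units):
        if all(accept(x, apply_chain(chain, x)) for x in inputs):
            return [name for _, name in chain]
    return None

# First example: inputs 1 and 4 must give 0, every other input something > 0.
first = find_chain(lambda x, y: (y == 0) == (x in (1, 4)), n_units=2)
print(first)
```

The second example can be tried the same way by passing a predicate such as lambda x, y: (y >= 3) == (x in (4, 5, 6, 8)). A None result would only mean there is no solution within this assumed model.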

Related

Find all the times in a (very large) array that have a difference greater than x

My array is time, so it is sorted and increasing.
I have to pull out the beginning/end where the difference in the array is greater than 30. The problem which other solutions don't cover, is that the array is thousands of values so looping through the array seems inefficient.
hugeArr = np.array([0, 2.072, 50.0, 90.0, 91.1])
My desired output for the above array would be something like: (2.072,50) (50,90).
Is there a way to accomplish this?

You can use np.diff and np.where to find the correct indices:
>>> idxs = np.where(np.diff(hugeArr) > 30)[0]
>>> list(zip(hugeArr[idxs], hugeArr[idxs + 1]))
[(2.072, 50.0), (50.0, 90.0)]
(Assuming you require only consecutive values)
And as @not_speshal mentioned, you can use np.column_stack instead of list(zip(...)) to stay within NumPy boundaries:
>>> np.column_stack((hugeArr[idxs], hugeArr[idxs+1]))
array([[ 2.072, 50.   ],
       [50.   , 90.   ]])

Try to think about what you're trying to do. For each value in the array, if the next value is larger by more than 30, you'd like to save the tuple of them.
The key words here are "for each". This is a classic O(n) algorithm, so decreasing its time complexity seems impossible to me.
However, you can make changes specific to your array to make the algorithm faster.
For example, if you're looking for a difference of 30 and you know that the average difference is 1, you might be better off computing, at index i,
difference = hugeArr[i+15] - hugeArr[i]
and checking whether this is bigger than 30. If it isn't (and it probably won't be), you can skip these 15 indices, as you know that no gap between two consecutive values is larger than the big gap.
If this works for you, run tests; 15 is completely arbitrary and maybe your magic number is 25. Change it a bit and time how long your function takes to run.
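The skipping idea could look roughly like the sketch below. The function name and the stride and threshold parameters are made up for the illustration. It relies on the array being sorted in increasing order, so that a window whose end-to-end difference is at most the threshold cannot contain a larger consecutive gap.

```python
def find_gaps_with_skipping(arr, threshold=30, stride=15):
    """Return (left, right) pairs of consecutive values more than `threshold` apart.

    Looks `stride` elements ahead; if the whole window spans no more than
    `threshold`, no consecutive pair inside it can, so the window is skipped.
    """
    pairs = []
    n = len(arr)
    i = 0
    while i < n - 1:
        j = min(i + stride, n - 1)
        if arr[j] - arr[i] > threshold:
            # The window might contain a large gap: check its consecutive pairs.
            for k in range(i, j):
                if arr[k + 1] - arr[k] > threshold:
                    pairs.append((arr[k], arr[k + 1]))
        i = j
    return pairs

hugeArr = [0, 2.072, 50.0, 90.0, 91.1]
print(find_gaps_with_skipping(hugeArr))
```

On data with many large gaps this does more work than a plain pass, so it is worth timing it against np.diff on the real array before adopting it.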

A strategy that comes to mind is that we don't have to check the numbers between two numbers that are less than 30 apart; we can do this because the array is sorted. For example, if abs(hugeArr[0] - hugeArr[-1]) < 30 we don't have to check anything, because nothing will have a distance of over 30.
We would start at the ends and work our way inwards. So check the starting number and ending number first. Then we go halfway, to hugeArr[len(hugeArr)//2], and check that number's distance against hugeArr[0] and hugeArr[-1]. Then we go into the two ranges (hugeArr[0:len(hugeArr)//2 + 1] and hugeArr[len(hugeArr)//2:]). We break those two ranges in half again, and wherever the distance from end to end is smaller than 30 we don't check that range. We can make this a recursive algorithm.
Worst case you'll have a distance over 30 everywhere and end up with O(n) but it could give you some advantage.
Something like this; however, you might want to refactor it to use NumPy.
def check(arr):
    pairs = []
    def check_range(hugeArr):
        difference = abs(hugeArr[0] - hugeArr[-1])
        # a span of at most 30 cannot contain a consecutive gap greater than 30
        if difference <= 30:
            return
        if len(hugeArr) == 2:
            pairs.append((hugeArr[0], hugeArr[1]))
            return
        halfway = len(hugeArr)//2
        # both halves share the middle element so no consecutive pair is lost
        check_range(hugeArr[:halfway+1])
        check_range(hugeArr[halfway:])
    check_range(arr)
    return pairs

Converting binary to decimal using only Boolean and logic comparisons

I am taking a Python Certification class and have taken two practice exams to prepare for the timed exam I will be scheduling soon. However, there is limited interaction with professors and the discussion board is mostly students. I have a question that has been on both practice exams, so I imagine it will be on the real exam as well, and I cannot seem to wrap my head around how to solve it. There is no way in the class to see how to solve coding problems you have gotten incorrect, which is a major disappointment as that helps me in the future. I know there are built-in functions for solving binary/decimal conversions, but the professor wants this done using Boolean logic and numerical comparisons as we are still in the early stages of the course. If anyone could assist in "walking" through the why's of the answer I would greatly appreciate it. Thank you.
number = 1101
You may modify the lines of code above, but don't move them! When you
Submit your code, we'll change these lines to assign different values
to the variables.
The number above represents a binary number. It will always be up to
eight digits, and all eight digits will always be either 1 or 0.
The string gives the binary representation of a number. In binary,
each digit of that string corresponds to a power of
2. The far left digit represents 128, then 64, then 32, then 16, then 8, then 4, then 2, and then finally 1 at the far right.
So, to convert the number to a decimal number, you want to (for
example) add 128 to the total if the first digit is 1, 64 if the
second digit is 1, 32 if the third digit is 1, etc.
For example, 00001101 is the number 13: there is a 0 in the 128s
place, 64s place, 32s place, 16s place, and 2s place. There are 1s in
the 8s, 4s, and 1s place. 8 + 4 + 1 = 13.
Note that although we use 'if' a lot to describe this problem, this
can be done entirely with boolean logic and numerical comparisons.
Print the number that results from this conversion.

number = "00001101"  # in Python, leading zeros are not permitted, so use a string
total = 0  # this variable keeps track of the number in decimal form
index = len(number) - 1  # e.g. 1100 has 4 digits and the max power is 3, 2^3
for str_digit in number:  # for each digit (as a string) in the number,
    # total += int(str_digit) * 2**index  # add the value (0 or 1) multiplied by 2 raised to the index power
    if int(str_digit):  # either 'if 0' or 'if 1'
        total += 2**index  # add 2 raised to the index power
    index -= 1  # decrease the index
print(total)
Note that the line if int(str_digit): is actually redundant if you use the commented line total += int(str_digit)* 2**index instead, but I included it because your question specified that you want to test the Boolean value.
This line is the same as if 0: or if 1: which is the same as if False: or if True:.
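If the exercise really rules out if statements and string handling, one possible reading of "Boolean logic and numerical comparisons" is to use the fact that True and False behave as 1 and 0 in arithmetic. The sketch below assumes number is kept as an integer whose decimal digits are all 0 or 1 (as in number = 1101). The function name is invented for the example and this is not necessarily the solution the course expects.

```python
def binary_digits_to_decimal(number):
    """Convert an int whose decimal digits are all 0 or 1 (e.g. 1101) to its value.

    Each comparison yields True or False; multiplying it by the place value
    gives either the place value or 0, so no `if` is needed.
    """
    total = 0
    total += 128 * (number // 10000000 % 10 == 1)
    total += 64 * (number // 1000000 % 10 == 1)
    total += 32 * (number // 100000 % 10 == 1)
    total += 16 * (number // 10000 % 10 == 1)
    total += 8 * (number // 1000 % 10 == 1)
    total += 4 * (number // 100 % 10 == 1)
    total += 2 * (number // 10 % 10 == 1)
    total += 1 * (number % 10 == 1)
    return total

number = 1101
print(binary_digits_to_decimal(number))  # prints 13
```

Integer division by a power of ten followed by % 10 isolates one decimal digit, which is why missing leading zeros do not matter here.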

If number is a string such as "00001101", all you need is this:
int(number, base=2)

How to fix negative values in log?

So, I am getting the data from a txt file and I want to get specific data within the whole set. In the code, I am trying to grab it by specifying which indexes and which frequencies are being used for those indexes. But my log is showing a negative value and I don't know how to fix that. Code is below, thanks!
indexes = [9,10,11,12,13]
frequenciesmh = [151,610,1400,4860,18000]
frequenciesgh = [i*10**-3 for i in frequenciesmh]
bigclusterallfluxes = bigcluster[indexes]
bigclusterlogflux151mhandredshift = [i[indexes] for i in bigcluster]
shiftedlogflux151mh = [
    np.interp(np.log10((151*10**-3)*i[0]), np.log10(frequenciesgh), i[1:])
    for i in bigclusterlogflux151mhandredshift
]
shiftflux151mh = [10**i for i in shiftedlogflux151mh]
bigclusterflux151mhandredshift = np.array(list(zip(shiftflux151mh, np.transpose(bigcluster)[9])))

I don't know exactly what you are trying to fix, but I would definitely NOT flip the sign of the negative values, because that makes every exponent positive (if you know some maths you will understand that this means 1/16 ==> 16 and also 16 ==> 16).
What you probably want, since you are working with frequencies (which always lie between 0 and 1 once you normalize them by dividing each of them by the sum of all of them, so their logarithm is always less than or equal to 0), is to take the -log10 of each value so that the results are all positive. This is quite a common quantity to use: 1 then corresponds to 1/10, 2 to 1/100, and so on (in genetics, at least, I believe these are called phred values).
In summary, always take the minus log, not the log:
-math.log10(0.0001)
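As background for where negative values can come from: the base-10 logarithm of any number between 0 and 1 is negative. In the question, the first two entries of frequenciesgh are below 1 (0.151 and 0.61), so their logs are negative even though nothing is wrong with the data. A small check using only the frequency list from the question:

```python
import math

frequenciesmh = [151, 610, 1400, 4860, 18000]
frequenciesgh = [f * 10**-3 for f in frequenciesmh]

# log10 is negative below 1, zero at 1 and positive above 1
logs = [math.log10(f) for f in frequenciesgh]
for f, value in zip(frequenciesgh, logs):
    print(f, value)
```

Whether a negative value is actually a problem depends on what the log is fed into afterwards, which the question does not show.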

The abs() function is what you are looking for.

Finding the number of values that satisfy something in a matrix, without using loops. Python

I am to write a function that takes 3 arguments: Matrix 1, Matrix 2, and a number p. The functions outputs the number of entries in which the difference between Matrix 1 and Matrix 2 is bigger than p. I was instructed not to use loops.
I was advised to use X.sum() function where X is an ndarray.
I don't know what to do here.
The first thing I want to do is to subtract M2 from M1. Now I have entries, each of which either is or is not bigger than p.
I tried to find a way to use the sum function, but I am afraid I can't see how it can help me.
The only thing I can think of is going through the entries, which I am not allowed to do. I would appreciate your help with this. No recursion is allowed either.

import pandas as pd
# Pick value of p
p = 20
# Instantiate fake frames
a = pd.DataFrame({'foo':[4, 10], 'bar':[34, -12]})
b = pd.DataFrame({'foo':[64, 0], 'bar':[21, 354]})
# Get absolute value of difference
c = (b - a).abs()
# Boolean mask, then sum along each axis to get total number of "True"s
print((c > p).sum().sum())
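Since the question mentions ndarrays and X.sum(), the same idea can be written with NumPy directly. This is a sketch that assumes "difference" means the absolute difference, as in the pandas version above. The function name is made up.

```python
import numpy as np

def count_entries_exceeding(m1, m2, p):
    """Count entries where |m1 - m2| > p, without any explicit loop."""
    difference = np.abs(np.asarray(m1) - np.asarray(m2))
    # The comparison gives a boolean array; True counts as 1 when summed.
    return int((difference > p).sum())

m1 = np.array([[4, 34], [10, -12]])
m2 = np.array([[64, 21], [0, 354]])
print(count_entries_exceeding(m1, m2, 20))  # prints 2
```

If only the signed difference m1 - m2 is meant, dropping np.abs is the only change needed.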

Challenging dynamic programming problem

This is a toned down version of a computer vision problem I need to solve. Suppose you are given parameters n,q and have to count the number of ways of assigning integers 0..(q-1) to elements of n-by-n grid so that for each assignment the following are all true
No two neighbors (horizontally or vertically) get the same value.
Value at positions (i,j) is 0
Value at position (k,l) is 0
Since (i,j,k,l) are not given, the output should be an array of evaluations above, one for every valid setting of (i,j,k,l)
A brute force approach is below. The goal is to get an efficient algorithm that works for q<=100 and for n<=18.
def tuples(n,q):
    return [[a,]+b for a in range(q) for b in tuples(n-1,q)] if n>1 else [[a] for a in range(q)]
def isvalid(t,n):
    grid=[t[n*i:n*(i+1)] for i in range(n)]
    for r in range(n):
        for c in range(n):
            v=grid[r][c]
            left=grid[r][c-1] if c>0 else -1
            right=grid[r][c+1] if c<n-1 else -1
            top=grid[r-1][c] if r > 0 else -1
            bottom=grid[r+1][c] if r < n-1 else -1
            if v==left or v==right or v==top or v==bottom:
                return False
    return True
def count(n,q):
    result=[]
    for pos1 in range(n**2):
        for pos2 in range(n**2):
            total=0
            for t in tuples(n**2,q):
                if t[pos1]==0 and t[pos2]==0 and isvalid(t,n):
                    total+=1
            result.append(total)
    return result
assert count(2,2)==[1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1]
Update 11/11
I've also asked this on TopCoder forums, and their solution is the most efficient one I've seen so far (about 3 hours for n=10, any q, from author's estimate)

Maybe this sounds too simple, but it works. Randomly assign values to all the cells until only two are empty. Test all values for adjacency conflicts. Compute the running average of the percentage of successful casts versus all casts until the variance drops to within an acceptable margin.
The risk goes to zero, and all that is at risk is a little runtime.

This isn't an answer, just a contribution to the discussion which is too long for a comment.
tl;dr: Any algorithm which boils down to "compute the possibilities and count them", such as Eric Lippert's or a brute force approach, won't work for @Yaroslav's goal of q <= 100 and n <= 18.
Let's first think about a single n x 1 column. How many valid numberings of this one column exist? For the first cell we can pick between q numbers. Since we can't repeat vertically, we can pick between q - 1 numbers for the second cell, and therefore q - 1 numbers for the third cell, and so on. For q == 100 and n == 18 that means there are q * (q - 1) ^ (n - 1) = 100 * 99 ^ 17 valid colorings which is very roughly 10 ^ 36.
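The single-column count is easy to evaluate exactly with Python's arbitrary-precision integers. A small sketch (the function name is invented here):

```python
def single_column_colorings(n, q):
    """Valid numberings of one n x 1 column: q choices for the first cell,
    then q - 1 for each following cell (it must differ from the cell above)."""
    return q * (q - 1) ** (n - 1)

value = single_column_colorings(18, 100)
print(value)
print(len(str(value)))  # number of decimal digits
```

The digit count is what backs the "very roughly 10 ^ 36" estimate.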
Now consider any two valid columns (call them the bread columns) separated by a buffer column (call it the mustard column). Here is a trivial algorithm to find a valid set of values for the mustard column when q >= 4. Start at the top cell of the mustard column. We only have to worry about the adjacent cells of the bread columns which have at most 2 unique values. Pick any third number for the mustard column. Consider the second cell of the mustard column. We must consider the previous mustard cell and the 2 adjacent bread cells with a total of at most 3 unique values. Pick the 4th value. Continue to fill out the mustard column.
We have at most 2 columns containing a hard coded cell of 0. Using mustard columns, we can therefore make at least 6 bread columns, each with about 10 ^ 36 solutions for a total of at least 10 ^ 216 valid solutions, give or take an order of magnitude for rounding errors.
There are, according to Wikipedia, about 10 ^ 80 atoms in the universe.
Therefore, be cleverer.

"Update 11/11 I've also asked this on TopCoder forums, and their solution is the most efficient one I've seen so far (about 41 hours for n=10, any q, from author's estimate)"
I'm the author. Not 41, just 3 embarrassingly parallelizable CPU hours. I've counted symmetries. For n=10 there are only 675 really distinct pairs of (i,j) and (k,l). My program needs ~ 16 seconds for each.

I'm building a contribution based on the contribution to the discussion by Dave Aaron Smith.
Let's not consider for now the last two constraints ((i,j) and (k,l)).
With only one column (nx1) the solution is q * (q - 1) ^ (n - 1).
How many choices are there for a second column? (q-1) for the top cell (1,2), but then q-1 or q-2 for the cell (2,2), depending on whether or not (1,2) and (2,1) have the same color.
Same thing for (3,2): q-1 or q-2 solutions.
We can see we have a binary tree of possibilities and we need to sum over that tree. Let's assume the left child is always "same color on top and at left" and the right child is "different colors".
By computing, over the tree, the number of ways the left column can create each such configuration and the number of possibilities for the new cells we are coloring, we would count the number of ways of coloring two columns.
But let's now consider the probability distribution for the coloring of the second column: if we want to iterate the process, we need to have a uniform distribution on the second column. It would be as if the first one never existed, and among all colorings of the first two columns we could say things like 1/q of them have color 0 in the top cell of the second column.
Without a uniform distribution it would be impossible.
The problem: is the distribution uniform?
Answer:
We would have obtained the same number of solutions by building the second column first, then the first one and then the third one. The distribution of the second column is uniform in that case, so it also is in the first case.
We can now apply the same "tree idea" to count the number of possibilities for the third column.
I will try to develop on that and build a general formula (since the tree is of size 2^n we don't want to explicitly explore it).
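For very small grids, the column-by-column idea can be cross-checked numerically. The sketch below is not the tree formula described above: it swaps in a plain enumeration of whole column states (exponential in n), it ignores the (i,j) and (k,l) constraints, and its function names are hypothetical. It only counts all valid colorings by extending the grid one column at a time.

```python
from itertools import product

def column_states(n, q):
    """All colorings of a single column with no two vertical neighbours equal."""
    return [s for s in product(range(q), repeat=n)
            if all(s[i] != s[i + 1] for i in range(n - 1))]

def count_colorings(n, q):
    """Count valid colorings of an n x n grid, one column at a time.

    counts[s] is the number of ways to fill the columns so far such that
    the most recent column is exactly s.
    """
    states = column_states(n, q)
    counts = {s: 1 for s in states}
    for _ in range(n - 1):
        new_counts = {}
        for s in states:
            # A new column s may follow column t if they differ in every row.
            new_counts[s] = sum(c for t, c in counts.items()
                                if all(a != b for a, b in zip(s, t)))
        counts = new_counts
    return sum(counts.values())

print(count_colorings(2, 2), count_colorings(2, 3))
```

For a 2 x 2 grid this matches the brute-force picture in the question (two checkerboards when q = 2), which makes it usable as a reference when testing a closed-form formula on small cases.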

A few observations which might help other answerers as well:
1. The values 1..q are interchangeable - they could be letters and the result would be the same.
2. The constraint that no neighbours match is a very mild one, so a brute force approach will be excessively expensive. Even if you knew the values in all but one cell, there would still be at least q-8 possibilities for q>8.
3. The output of this will be pretty long - every set of i,j,k,l will need a line. The number of combinations is something like n^2(n^2-3), since the two fixed zeroes can be anywhere except adjacent to each other, unless they need not obey the first rule. For n=100 and q=18, the maximally hard case, this is ~ 100^4 = 100 million. So that's your minimum complexity, and is unavoidable as the problem is currently stated.
4. There are simple cases - when q=2, there are the two possible checkerboards, so for any given pair of zeroes the answer is 1 (or 0 when no checkerboard puts a zero on both positions, as the assert in the question shows).
Point 3 makes the whole program O( n^2(n^2-3) ) as a minimum, and also suggests that you will need something reasonably efficient for each pair of zeroes as simply writing 100 million lines without any computation will take a while. For reference, at a second per line, that is 1x10^8 s ~ 3 years, or 3 months on a 12-core box.
I suspect that there is an elegant answer given a pair of zeroes, but I'm not sure that there is an analytic solution to it. Given that you can do it with 2 or 3 colours depending on the positions of the zeroes, you could split the map into a series of regions, each of which uses only 2 or 3 colours, and then it's just the number of different combinations of 2 or 3 in q (qC2 or qC3) for each region times the number of regions, times the number of ways of splitting the map.

I'm not a mathematician, but it occurs to me that there ought to be an analytical solution to this problem, namely:
First, compute how many different colourings are possible for an NxN board with Q colours (given that neighbours, defined as cells sharing a common edge, don't get the same colour). This ought to be a pretty simple formula.
Then figure out how many of these solutions have 0 in (i,j); this should be a fraction 1/Q of them.
Then figure out how many of remaining solutions have 0 in (k,l) depending on manhattan distance |i-k|+|j-l|, and possibly distance to the board edge and "parity" of these distances, as in distance divisible by 2, divisible by 3, divisible by Q.
The last part is the hardest, though I think it might still be doable if you are really good at math.
