Modelling a probability distribution as a fuzzy set in Python3 - python

I'm trying to build a fuzzy set from a series of example values with python3.
For instance, given [6, 7, 8, 9, 27] I'd like to obtain a function that:
returns 0.0 from 0 to 5ca,
goes gradually up to 1.0 from 5ca to 6,
stays at 1.0 from 6 to 9,
goes gradually down to 0.0 from 9 to 10ca,
stays at 0.0 from 10ca to 26ca,
goes gradually up to 1.0 from 26ca to 27,
goes gradually down to 0.0 from 27 to 28ca,
returns 0.0 from 28ca and afterwards.
Notice that the y values are always in the range [0.0, 1.0] and if a series is missing a value, the y of that value is 0.0.
Please consider that in the most general case, the input values might be something like [9, 41, 20, 13 ,11, 12, 14, 40, 4, 4, 4, 3, 34, 22] (values can always be sorted, but notice that in this series the value 4 is repeated 3 times therefore I'd expect to have a probability of 1 and all the other values a lower probability value -- not necessarily 1/3 as in this case).
The top part of this picture shows the desired function plotted up to x=16 (hand drawn). I'd be more than happy to obtain anything like it.
The bottom part of the picture shows some extra features that would be nice to have but are not strictly mandatory:
better smoothing than shown in my drawing (A),
cumulative effect (B) provided that...
the function never goes above 1 (C) and...
the function never goes below 0 (D).
I've tried some approaches adapted from polyfit, bezier, gauss or others, for instance, but the results weren't what I expected.
I've also tried the fuzzpy package, but I couldn't make it work because of its dependency on epydoc, which does not seem to be compatible with python3. No luck with StatModels either.
Can anyone suggest how to achieve the desired function? Thanks in advance.
If you wonder, I plan to use the resulting function to predict the likelihood of a given value; with respect to the fuzzy set described above, for instance, 4.0 returns 0.0, 6.5 returns 1.0 and 5.8 something like 0.85. Maybe there is another simpler way to do this?
This is how I usually process the input values (I'm not sure whether the part that adds the 0s is needed). What should I put in place of ??? to compute the desired f?
import numpy as np
import matplotlib.pyplot as plt

def prepare(values, normalize=True):
    # count how often each value occurs
    max_count = 0
    table = {}
    for value in values:
        table[value] = (table[value] if value in table else 0) + 1
        if normalize and table[value] > max_count:
            max_count = table[value]
    # scale the counts so that the most frequent value maps to 1.0
    if normalize:
        for value in table:
            table[value] /= float(max_count)
    # give every missing integer up to max + 1 a count of 0
    for value in range(sorted(table)[-1] + 2):
        if value not in table:
            table[value] = 0
    x = sorted(table)
    y = [table[value] for value in x]
    return x, y

if __name__ == '__main__':
    # get x and y vectors
    x, y = prepare([9, 41, 20, 13, 11, 12, 14, 40, 4, 4, 4, 3, 34, 22], normalize=True)
    # calculate fitting function
    f = ???
    # calculate new x's and y's
    x_new = np.linspace(x[0], x[-1], 50)
    y_new = f(x_new)
    # plot the results
    plt.plot(x, y, 'o', x_new, y_new)
    plt.xlim([x[0] - 1, x[-1] + 1])
    plt.show()
    print("Done.")
A practical example, just to clarify the motivations for this...
The series of values might be the number of minutes after which people give up standing in line in front of a kiosk. With such a model, we could try to predict how likely somebody is to leave the queue, knowing how long they have been waiting. The value read in this way can then be defuzzified, for instance, into happily waiting [0.00, 0.33], just waiting (0.33, 0.66] and about to leave (0.66, 1.00]. In the about to leave case, that person could be engaged by something (an ad?) to convince them to stay.

This only works (due to np.bincount) with a set of integers.
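import numpy as np
def fuzzy_interp(x, vals):
    vals = np.asarray(vals)  # allow plain lists of integers
    vmn, vmx = np.amin(vals), np.amax(vals)
    # shift so the smallest value lands at index 1; index 0 and the last index stay 0
    v = vals - vmn + 1
    b = np.bincount(v, minlength=vmx - vmn + 3)
    b = b / np.amax(b)
    # map x onto the same shifted index axis before interpolating
    return np.interp(x - vmn + 1, np.arange(b.size), b, left=0, right=0)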

def pulse(x):
    return np.maximum(0, 1 - abs(x))
def fuzzy_in_unscaled(x, xs):
    return pulse(np.subtract.outer(x, xs)).sum(axis=-1)
def fuzzy_in(x, xs):
    largest = fuzzy_in_unscaled(xs, xs).max()
    return fuzzy_in_unscaled(x, xs) / largest
>>> fuzzy_in(1.5, [1, 3, 4, 5]) # single membership
0.5
>>> fuzzy_in([[1.5, 3], [3.5, 10]], [1, 3, 4, 5]) # vectorized in the first argument
array([[0.5, 1], [1, 0]])
This exploits the fact that the peak values must lie on the elements. This is not true for all pulse functions.
You'd do well to precompute largest, as it's O(N^2)
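One way to precompute the normaliser is to build the membership function once per data set. The following is only a dependency-free sketch of that idea, not the answer's own code: `make_fuzzy_in` is a hypothetical helper name, and it assumes the same triangular pulse of width 1 on each side.

```python
def pulse(d):
    """Triangular pulse: 1 at d == 0, falling linearly to 0 at |d| >= 1."""
    return max(0.0, 1.0 - abs(d))


def make_fuzzy_in(xs):
    """Return a membership function for the samples xs (hypothetical helper).

    The normaliser is computed once here, in O(N^2), instead of on every call.
    """
    xs = list(xs)

    def unscaled(x):
        return sum(pulse(x - sample) for sample in xs)

    # For this triangular pulse the peak value lies on one of the samples.
    largest = max(unscaled(sample) for sample in xs)

    def membership(x):
        return unscaled(x) / largest

    return membership


member = make_fuzzy_in([1, 3, 4, 5])
print(member(1.5))  # 0.5
print(member(10))   # 0.0

repeated = make_fuzzy_in([4, 4, 4, 3])
print(repeated(4))  # 1.0, because 4 is the most frequent sample
```

Each later call then costs O(N) rather than O(N^2).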

Related

Guessing a missing value based on historical data

Let's assume I have 100 different kinds of items; each item has a name and a physical weight.
I know the names of all 100 items but the weights of only 80 of them.
When I ship items, I pack them in groups of 10 and sum the weights of these items.
Because some items are missing their weight, the sum is inaccurate when I'm about to ship.
I have different shipments with missing weights:
Shipment 1
Item Name | Item Weight
Item 2    | 10
Item 27   | 20
Item 42   | 20
Item 71   | -
Item 77   | -
Total weight: 75
Shipment 2
Item Name | Item Weight
Item 2    | 10
Item 27   | 20
Item 42   | 20
Item 71   | -
Item 92   | -
Total weight: 90
Shipment 3
Item Name | Item Weight
Item 2    | 10
Item 27   | 20
Item 42   | 20
Item 55   | 35
Item 77   | -
Total weight: 100
Since some of the shipments share the same items with missing weights, and I have each shipment's total weight, is there a way to determine the weights of these items with machine learning, without unpacking the entire shipment?
Or would it just be, in this case, a 100x3 matrix with a lot of empty values?
At this point I'm not really sure whether I should use some type of regression to solve this, or whether it's just a matrix that would grow a lot if I had n more items to ship.
I also wondered whether this is some type of knapsack problem. I hope someone can point me in the right direction.
Forget about machine learning. This is a simple system of linear equations.
w_71 + w_77 = 25
w_71 + w_92 = 40
w_77 = 15
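These three equations come from subtracting the known weights from each shipment's total. As a minimal sketch of that step, using the data from the question (the variable names `known`, `shipments` and `unknown` are only illustrative):

```python
# Known weights (only the items that appear in the question's shipments).
known = {"Item 2": 10, "Item 27": 20, "Item 42": 20, "Item 55": 35}

# Each shipment: (list of item names, total weight).
shipments = [
    (["Item 2", "Item 27", "Item 42", "Item 71", "Item 77"], 75),
    (["Item 2", "Item 27", "Item 42", "Item 71", "Item 92"], 90),
    (["Item 2", "Item 27", "Item 42", "Item 55", "Item 77"], 100),
]

# Unknown items in a fixed order: one matrix column per unknown.
unknown = sorted({name for items, _ in shipments for name in items if name not in known})

A, b = [], []
for items, total in shipments:
    # 1 where the unknown item is in this shipment, 0 otherwise.
    A.append([1 if name in items else 0 for name in unknown])
    # Right-hand side: total weight minus everything already known.
    b.append(total - sum(known[name] for name in items if name in known))

print(unknown)  # ['Item 71', 'Item 77', 'Item 92']
print(A)        # [[1, 1, 0], [1, 0, 1], [0, 1, 0]]
print(b)        # [25, 40, 15]
```

With more shipments or more unknown items, only the `known` and `shipments` data change; the matrix grows by one row per shipment and one column per unknown item.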
You can solve it with sympy.solvers.solveset.linsolve, or scipy.optimize.linprog, or scipy.linalg.lstsq, or numpy.linalg.lstsq
sympy.linsolve is maybe the easiest to understand if you are not familiar with matrices; however, if the system is underdetermined, then instead of returning a particular solution to the system, sympy.linsolve will return the general solution in parametric form.
scipy.linalg.lstsq and numpy.linalg.lstsq expect the problem to be given in matrix form. If there is more than one possible solution, they return the most "average" one, that is, the solution with the smallest Euclidean norm. However, they cannot take any positivity constraint into account: they might return a solution where one of the variables is negative. You may be able to work around this by adding a new equation to the system that manually forces a variable to be positive, then solving again.
scipy.optimize.linprog expects the problem to be given in matrix form; it also expects you to specify a linear objective function, which decides which particular solution is "best" in case there is more than one possible solution. By default linprog treats all variables as nonnegative, and it lets you specify explicit bounds for the variables yourself. It also lets you add inequality constraints in addition to the equations.
Using sympy.solvers.solveset.linsolve
from sympy.solvers.solveset import linsolve
from sympy import symbols
w71, w77, w92 = symbols('w71 w77 w92')
eqs = [w71+w77-25, w71+w92-40, w77-15]
solution = linsolve(eqs, [w71, w77, w92])
# solution = {(10, 15, 30)}
In your example, there is only one possible solution, so linsolve returned that solution: w71 = 10, w77 = 15, w92 = 30.
However, in case there is more than one possible solution, linsolve will return a parametric form for the general solution:
x,y,z = symbols('x y z')
eqs = [x+y-10, y+z-20]
solution = linsolve(eqs, [x, y, z])
# solution = {(z - 10, 20 - z, z)}
Here there is an infinity of possible solutions. linsolve is telling us that we can pick any value for z, and then we'll get the corresponding x and y as x = z - 10 and y = 20 - z.
Using numpy.linalg.lstsq
lstsq expects the system of equations to be given in matrix form. If there is more than one possible solution, then it will return the most "average" solution. For instance, if the system of equations is simply x + y = 10, then lstsq will return the particular solution x = 5, y = 5 and will ignore more "extreme" solutions such as x = 10, y = 0.
from numpy.linalg import lstsq
# w_71 + w_77 = 25
# w_71 + w_92 = 40
# w_77 = 15
A = [[1, 1, 0], [1, 0, 1], [0, 1, 0]]
b = [25, 40, 15]
solution = lstsq(A, b)
solution[0]
# array([10., 15., 30.])
Here lstsq found the unique solution, w71 = 10, w77=15, w92 = 30.
# x + y = 10
# y + z = 20
A = [[1, 1, 0], [0, 1, 1]]
b = [10, 20]
solution = lstsq(A, b)
solution[0]
# array([-3.55271368e-15, 1.00000000e+01, 1.00000000e+01])
Here lstsq had to choose a particular solution, and chose the one it considered most "average", x = 0, y = 10, z = 10. You might want to round the solution to integers.
One drawback of lstsq is that it doesn't take into account your non-negativity constraint. That is, it might return a solution where one of the variables is negative:
# x + y = 2
# y + z = 20
A = [[1, 1, 0], [0, 1, 1]]
b = [2, 20]
solution = lstsq(A, b)
solution[0]
# array([-5.33333333, 7.33333333, 12.66666667])
See how lstsq ignored the possible positive solution x = 1, y = 1, z = 19 and instead returned the solution it considered most "average", x = -5.33, y = 7.33, z = 12.67.
One way to fix this is to add an equation yourself to force the offending variable to be positive. For instance, here we noticed that lstsq wanted x to be negative, so we can manually force x to be equal to 1 instead, and solve again:
# x + y = 2
# y + z = 20
# x = 1
A = [[1, 1, 0], [0, 1, 1], [1, 0, 0]]
b = [2, 20, 1]
solution = lstsq(A, b)
solution[0]
# array([ 1., 1., 19.])
Now that we manually forced x to be 1, lstsq found solution x=1, y=1, z=19 which we're more happy with.
Using scipy.optimize.linprog
The particularity of linprog is that it expects you to specify the "objective" used to choose a particular solution, in case there is more than one possible solution.
Also, linprog allows you to specify bounds for the variables. The default is that all variables are nonnegative, which is what you want.
from scipy.optimize import linprog
# w_71 + w_77 = 25
# w_71 + w_92 = 40
# w_77 = 15
A = [[1, 1, 0], [1, 0, 1], [0, 1, 0]]
b = [25, 40, 15]
c = [1, 1, 1] # coefficients for objective: minimise w71 + w77 + w92.
solution = linprog(c, A_eq = A, b_eq = b)
solution.x
# array([10., 15., 30.])
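The default nonnegativity bounds matter most when the system is underdetermined. As a hedged sketch, here is the x + y = 2, y + z = 20 system that gave lstsq a negative value, solved with linprog. The objective (minimise x + y + z) is an arbitrary choice for illustration; a different objective would pick a different solution.

```python
from scipy.optimize import linprog

# x + y = 2
# y + z = 20
A = [[1, 1, 0], [0, 1, 1]]
b = [2, 20]
c = [1, 1, 1]  # illustrative objective: minimise x + y + z

# No bounds are passed, so every variable defaults to (0, None), i.e. >= 0.
result = linprog(c, A_eq=A, b_eq=b)
print(result.success)
print(result.x)  # approximately [0, 2, 18]
```

Here x + y + z equals 20 + x on the solution set, so the minimum is reached at x = 0, and no variable is negative.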

Google Foobar L4: Bringing a gun to a trainer fight

I started this problem a couple days ago (code below).
Bringing a Gun to a Trainer Fight
=================================
Uh-oh -- you've been cornered by one of Commander Lambdas elite bunny trainers! Fortunately, you grabbed a beam weapon from an abandoned storeroom while you were running through the station, so you have a chance to fight your way out. But the beam weapon is potentially dangerous to you as well as to the bunny trainers: its beams reflect off walls, meaning you'll have to be very careful where you shoot to avoid bouncing a shot toward yourself!
Luckily, the beams can only travel a certain maximum distance before becoming too weak to cause damage. You also know that if a beam hits a corner, it will bounce back in exactly the same direction. And of course, if the beam hits either you or the bunny trainer, it will stop immediately (albeit painfully).
Write a function solution(dimensions, your_position, trainer_position, distance) that gives an array of 2 integers of the width and height of the room, an array of 2 integers of your x and y coordinates in the room, an array of 2 integers of the trainer's x and y coordinates in the room, and returns an integer of the number of distinct directions that you can fire to hit the elite trainer, given the maximum distance that the beam can travel.
The room has integer dimensions [1 < x_dim <= 1250, 1 < y_dim <= 1250]. You and the elite trainer are both positioned on the integer lattice at different distinct positions (x, y) inside the room such that [0 < x < x_dim, 0 < y < y_dim]. Finally, the maximum distance that the beam can travel before becoming harmless will be given as an integer 1 < distance <= 10000.
For example, if you and the elite trainer were positioned in a room with dimensions [3, 2], your_position [1, 1], trainer_position [2, 1], and a maximum shot distance of 4, you could shoot in seven different directions to hit the elite trainer (given as vector bearings from your location): [1, 0], [1, 2], [1, -2], [3, 2], [3, -2], [-3, 2], and [-3, -2]. As specific examples, the shot at bearing [1, 0] is the straight line horizontal shot of distance 1, the shot at bearing [-3, -2] bounces off the left wall and then the bottom wall before hitting the elite trainer with a total shot distance of sqrt(13), and the shot at bearing [1, 2] bounces off just the top wall before hitting the elite trainer with a total shot distance of sqrt(5).
Languages
=========
To provide a Java solution, edit Solution.java
To provide a Python solution, edit solution.py
Test cases
==========
Your code should pass the following test cases.
Note that it may also be run against hidden test cases not shown here.
-- Java cases --
Input:
Solution.solution([3,2], [1,1], [2,1], 4)
Output:
7
Input:
Solution.solution([300,275], [150,150], [185,100], 500)
Output:
9
-- Python cases --
Input:
solution.solution([3,2], [1,1], [2,1], 4)
Output:
7
Input:
solution.solution([300,275], [150,150], [185,100], 500)
Output:
9
Use verify [file] to test your solution and see how it does. When you are finished editing your code, use submit [file] to submit your answer. If your solution passes the test cases, it will be removed from your home folder.
I got both test cases to pass on my computer, but for some reason when I verify the code on the platform, only the second out of the two pass (the first one fails). In addition, the 4th, 5th, and 6th test cases pass (all hidden) and the rest (10 total) fail. Here is my code:
from math import sqrt, ceil, atan2

def solution(dimensions, your_position, trainer_position, distance):
    # calculate maximum repetitions of current room in mirrored room
    cp_x = int(ceil((your_position[0] + distance) / dimensions[0]))
    cp_y = int(ceil((your_position[1] + distance) / dimensions[1]))
    # generate all possible positions in q1
    q1_player = [your_position]
    q1_trainer = [trainer_position]
    for i in range(0, cp_x):
        for j in range(0, cp_y):
            if i == 0 and j == 0:
                continue
            else:
                temp_player = [your_position[0] + i * dimensions[0], your_position[1] + j * dimensions[1]]
                temp_trainer = [trainer_position[0] + i * dimensions[0], trainer_position[1] + j * dimensions[1]]
                if i % 2 != 0:
                    temp_player[0] = temp_player[0] - (2 * your_position[0]) + dimensions[0]
                    temp_trainer[0] = temp_trainer[0] - (2 * trainer_position[0]) + dimensions[0]
                if j % 2 != 0:
                    temp_player[1] = temp_player[1] - (2 * your_position[1]) + dimensions[1]
                    temp_trainer[1] = temp_trainer[1] - (2 * trainer_position[1]) + dimensions[1]
                q1_player.append(temp_player)
                q1_trainer.append(temp_trainer)
    # generate all possible positions in q2, q3, and q4
    q2_player = [[-x, y] for [x, y] in q1_player]
    q2_trainer = [[-x, y] for [x, y] in q1_trainer]
    q3_player = [[-x, -y] for [x, y] in q1_player]
    q3_trainer = [[-x, -y] for [x, y] in q1_trainer]
    q4_player = [[x, -y] for [x, y] in q1_player]
    q4_trainer = [[x, -y] for [x, y] in q1_trainer]
    # generate distances from original player
    player = [[x, y, dist(your_position, [x, y]), 1] for [x, y] in q1_player + q2_player + q3_player + q4_player]
    trainer = [[x, y, dist(your_position, [x, y]), 2] for [x, y] in q1_trainer + q2_trainer + q3_trainer + q4_trainer]
    corners = [[x, y, dist(your_position, [x, y]), 3] for [x, y] in [(0, 0), (dimensions[0], 0), (dimensions[0], dimensions[1]), (0, dimensions[1])]]
    # filter for distances that are too far away
    positions = filter(lambda x: x[2] <= float(distance), player + trainer + corners)
    positions = sorted(positions, key=lambda x: x[2])  # so least distance is always first
    # reduce list of lists with same angle but grab least distance
    angles = {}
    for pos in positions[1:]:
        agl = ang(your_position, [pos[0], pos[1]])
        if agl not in angles:
            angles[agl] = pos
    # uncomment to see the list of vectors
    # print([(pos[0] - your_position[0], pos[1] - your_position[1]) for pos in angles.values() if pos[3] == 2])
    # return number of times only trainer is hit
    return sum(1 for pos in angles.values() if pos[3] == 2)

def dist(p1, p2):
    return sqrt((p1[0] - p2[0])**2 + (p1[1] - p2[1])**2)

def ang(p1, p2):
    return atan2((p2[1] - p1[1]), (p2[0] - p1[0]))
I got a few extra test cases from online and by running other people's submitted code to check the answers:
def test():
    assert solution([3, 2], [1, 1], [2, 1], 4) == 7
    assert solution([2, 5], [1, 2], [1, 4], 11) == 27
    assert solution([23, 10], [6, 4], [3, 2], 23) == 8
    assert solution([1250, 1250], [1000, 1000], [500, 400], 10000) == 196
    assert solution([10, 10], [4, 4], [3, 3], 5000) == 739323
    assert solution([3, 2], [1, 1], [2, 1], 7) == 19
    assert solution([2, 3], [1, 1], [1, 2], 4) == 7
    assert solution([3, 4], [1, 2], [2, 1], 7) == 10
    assert solution([4, 4], [2, 2], [3, 1], 6) == 7
    assert solution([300, 275], [150, 150], [180, 100], 500) == 9
    assert solution([3, 4], [1, 1], [2, 2], 500) == 54243
Everything here passes except for the very last case, solution([3, 4], [1, 1], [2, 2], 500) == 54243, for which I actually get 54239.
I've been stuck on this for several hours and honestly can't figure out why a) I'm failing a visible test on the platform that I know passes quite quickly on my own machine (even though I'm using verified libraries and all that) and b) why I'm passing all other of my own test cases except the last one. I'm hoping this will also help me figure out why I fail the other hidden test cases on the platform.
In Python 2, / performs integer (floor) division when both operands are integers. Thus, in code like
int(ceil((your_position[0] + distance) / dimensions[0]))
the ceil is useless, as the value has already been rounded down.
Floating-point arithmetic is not necessary for this calculation, and it's better to avoid floating-point for these cases for the usual reasons.
Instead, we'll use a function to get "ceiling integer division" results. The trick is to add to the numerator first, such that the value increases by 1 except when the numerator was already evenly divisible. The amount we need to add, then, is the denominator minus one.
This version should work the same way in both Python 2 and 3, as // performs floored division regardless (in 2, / is floored division for integers, but "true" division for floating-point values).
def ceil_div(numerator, denominator):
    return (numerator + denominator - 1) // denominator
And now we can do
def solution(dimensions, your_position, trainer_position, distance):
    # calculate maximum repetitions of current room in mirrored room
    cp_x = ceil_div((your_position[0] + distance), dimensions[0])
    cp_y = ceil_div((your_position[1] + distance), dimensions[1])
and it should work in either Python 2 or Python 3. We no longer need to coerce to int, because the inputs are integer and thus the floored division will also produce an integer.
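A quick check of the integer-only ceiling division, written as a self-contained sketch (the function is repeated here so the snippet runs on its own; it assumes positive integers, as in this problem):

```python
def ceil_div(numerator, denominator):
    """Ceiling of numerator / denominator for positive integers, integer arithmetic only."""
    return (numerator + denominator - 1) // denominator


print(ceil_div(5, 3))  # 2
print(ceil_div(6, 3))  # 2 (already evenly divisible, so nothing is added)
print(ceil_div(7, 3))  # 3

# First visible test case: your_position[0] + distance = 1 + 4 = 5, dimensions[0] = 3.
# Python 2 evaluates int(ceil(5 / 3)) as int(ceil(1)) == 1, one room copy too few.
print(ceil_div(1 + 4, 3))  # 2
```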
I managed to figure out what I was missing — my solution was correct in Python 3, and I thought I had accounted for all the version differences in Python 2.7, but it turns out there's one more. I believe it had something to do with how range() works or how I calculated for cp_x and cp_y, the maximum number of copies in the first quadrant. Adding one to my iteration, such that:
# calculate maximum repetitions of current room in mirrored room
cp_x = int(ceil((your_position[0] + distance) / dimensions[0]))
cp_y = int(ceil((your_position[1] + distance) / dimensions[1]))
# generate all possible positions in q1
q1_player = [your_position]
q1_trainer = [trainer_position]
for i in range(0, cp_x + 1):  # ADD ONE HERE
    for j in range(0, cp_y + 1):  # ADD ONE HERE
fixed it.
I managed to figure out that test case 5 is this:
solution([1000,1000], [250,25], [257,49], 25) #=1
I had one version that passed every test except test case 3 (it was too slow) and another that passed every test except test case 5. I combined them, running the first version only when the distance was above a certain threshold, and the combination passed. I then kept changing the threshold until it stopped passing, which told me the distance used in test case 5, and I did the same for every other parameter. Once I had the test case I could see where I went wrong, and I finally passed it. If anyone wants some help with the code, let me know. My advice is:
Get the angles and distances for all the shots that hit yourself, and the angles and distances for all the shots that hit the target. Do this with as few for loops as you can, and use list comprehensions where loops are necessary. Compile both into two separate lists, then use functions that operate on whole lists to handle the rest. I don't want to give too much away, but I will happily help if people want.
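One possible way to keep the "angle" bookkeeping exact is to replace the floating-point atan2 key with a reduced integer direction vector. This is an alternative technique, not something taken from the posts above; `direction` and `nearest_per_direction` are hypothetical helper names, and the sketch uses Python 3's math.gcd (Python 2.7 would need fractions.gcd instead).

```python
from math import gcd


def direction(origin, target):
    """Direction from origin to target as a reduced integer vector (dx, dy)."""
    dx = target[0] - origin[0]
    dy = target[1] - origin[1]
    g = gcd(abs(dx), abs(dy))
    if g == 0:
        raise ValueError("origin and target coincide")
    # g divides both components exactly, so floor division loses nothing.
    return (dx // g, dy // g)


def nearest_per_direction(origin, points):
    """Map each direction to the smallest squared distance among points."""
    best = {}
    for p in points:
        if tuple(p) == tuple(origin):
            continue  # the shooter's own position has no direction
        d2 = (p[0] - origin[0]) ** 2 + (p[1] - origin[1]) ** 2
        key = direction(origin, p)
        if key not in best or d2 < best[key]:
            best[key] = d2
    return best


print(direction((1, 1), (3, 5)))    # (1, 2)
print(direction((1, 1), (-2, -1)))  # (-3, -2)
print(nearest_per_direction((0, 0), [(2, 2), (1, 1), (-1, 0)]))
# {(1, 1): 2, (-1, 0): 1}
```

Building one such dictionary for the mirrored shooter positions and one for the mirrored trainer positions lets the two be compared direction by direction, using squared integer distances throughout.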

Phase correlation for rotation registration using opencv

I'm trying to register two images that are a rotated and translated version of one another using opencv. Generally speaking, the procedure is (pseudo code):
a. IF1 = FFT2(I1); IF2 = FFT2(I2)
b. R_translation = (IF1).*(IF2_conjugate)
c. R_translation = R_translation./abs(R_translation)
d. r_translation = IFFT2(R_translation)
where the maximum of r_translation corresponds to the translation. Moving on to calculate the rotation, the abs value removes the translation part,
e. IF1_abs = abs(IF1); IF2_abs = abs(IF2)
Converting to Linear-Polar coordinates,
f. IF1_abs_pol = LINPOL(IF1_abs); IF2_abs_pol = LINPOL(IF2_abs)
g. IFF1 = FFT2(IF1_abs_pol); IFF2 = FFT2(IF2_abs_pol)
h. R_rot = (IFF1).*(IFF2_conjugate)
i. R_rot = R_rot./abs(R_rot)
j. r_rot = IFFT2(R_rot)
where the maximum of r_rot corresponds to the rotation. For translation alone, the cv2.phaseCorrelate function returns the expected results, but for rotation it returns odd results. So I tried the following.
I took two 5x5 numpy arrays, one a rotated version of the other, like so:
a = numpy.array([[1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [1, 2, 3, 4, 5]])
a = a.astype('float')/a.astype('float').max()
b = numpy.array([[5, 5, 5, 5, 5], [4, 4, 4, 4, 4], [3, 3, 3, 3, 3], [2, 2, 2, 2, 2], [1, 1, 1, 1, 1]])
b = b.astype('float') / b.astype('float').max()
First I calculated the phase correlation myself:
center_x = numpy.floor(a.shape[0] / 2.0)  # the x center of rotation (= x center of image)
center_y = numpy.floor(a.shape[1] / 2.0)  # the y center of rotation (= y center of image)
Mvalue = a.shape[1] / numpy.sqrt(
    ((a.shape[0] / 2.0) ** 2.0) + ((a.shape[1] / 2.0) ** 2.0))  # rotation radius
Calculating the FFT, taking the absolute value (losing the translation difference data if existed), and switching to Linear-Polar coordinates and normalizing:
a_polar = cv2.linearPolar(numpy.abs(numpy.fft.fft2(a)), (center_x, center_y), Mvalue, cv2.WARP_FILL_OUTLIERS)
b_polar = cv2.linearPolar(numpy.abs(numpy.fft.fft2(b)), (center_x, center_y), Mvalue, cv2.WARP_FILL_OUTLIERS)
a_polar = a_polar/a_polar.max()
b_polar = b_polar / b_polar.max()
Another FFT step, multiplying point wise, and IFFT back:
aff = numpy.fft.fft2(a_polar)
bff = numpy.fft.fft2(b_polar)
R = aff * numpy.ma.conjugate(bff)
R = R / numpy.absolute(R)
r = numpy.fft.ifft2(R).real
r = r/r.max()
yields,
[Figure: Phase correlation for rotation, b with respect to a]
According to cv2.linearPolar(), the rows span the angle (in this case with a step size of 360/5 = 72 degrees) and the columns span the radius (from 0 to the maximum radius given in Mvalue). The maximum is evident in the last row (corresponding to a shift of approximately -90 degrees). So far so good.
The second method is using cv2.phaseCorrelate() directly,
r_direct = cv2.phaseCorrelate(a_polar, b_polar)
which yields,
[Figure: Phase correlation for rotation, b with respect to a, direct method]
The first tuple is the X, Y correlation coefficient (in pixels?) and the third number is the fit grade. When it is close to unity, the correlation coefficient represents the data better (the blob around the maximum is more distinct).
Other than the fact that the result is not distinct enough (why?), the result is confusing...
Generally, the first FFT step in this 5x5 example was not necessary. If rotation is the only interference, one can switch to Linear-Polar coordinates immediately and use cv2.phaseCorrelate. In that case, the result is also confusing.
Any help would be appreciated :)
Thanks!
David
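For reference, steps a to d of the pseudocode above can be sketched in plain NumPy. This is only an illustration of the translation part on a synthetic, circularly shifted array: `phase_correlation_shift` and its `eps` guard are assumptions of this sketch, and it says nothing about how cv2.phaseCorrelate is implemented internally.

```python
import numpy as np


def phase_correlation_shift(img1, img2, eps=1e-12):
    """Return the peak location of the phase correlation as a signed shift (hypothetical helper)."""
    F1 = np.fft.fft2(img1)             # step a
    F2 = np.fft.fft2(img2)
    R = F1 * np.conj(F2)               # step b: cross-power spectrum
    R /= np.abs(R) + eps               # step c: keep only the phase (eps avoids 0/0)
    r = np.fft.ifft2(R).real           # step d
    peak = np.unravel_index(np.argmax(r), r.shape)
    # Indices past the midpoint correspond to negative (wrapped) shifts.
    shift = tuple(int(p) if p <= n // 2 else int(p) - n for p, n in zip(peak, r.shape))
    return shift, r


rng = np.random.default_rng(0)
a = rng.random((8, 8))
b = np.roll(a, shift=(2, 3), axis=(0, 1))  # b is a, circularly shifted by (2, 3)

shift, r = phase_correlation_shift(a, b)
print(shift)  # (-2, -3)
```

With R = F1 * conj(F2), the peak sits at minus the displacement of the second image; swapping the arguments flips the sign. The same function applied to the two polar-resampled magnitude spectra would give the rotation estimate along the angle axis.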

Wrapped (circular) 2D interpolation in Python

I have angular data on a domain that is wrapped at pi radians (i.e. 0 = pi). The data are 2D, where one dimension represents the angle. I need to interpolate this data onto another grid in a wrapped way.
In one dimension, the np.interp function takes a period kwarg (for NumPy 1.10 and later):
http://docs.scipy.org/doc/numpy/reference/generated/numpy.interp.html
This is exactly what I need, but I need it in two dimensions. I'm currently just stepping through columns in my array and using np.interp, but this is of course slow.
Anything out there that could achieve this same outcome but faster?
An explanation of how np.interp works
Use the source, Luke!
The numpy doc for np.interp makes the source particularly easy to find, since it has the link right there, along with the documentation. Let's go through this, line by line.
First, recall the parameters:
"""
x : array_like
    The x-coordinates of the interpolated values.
xp : 1-D sequence of floats
    The x-coordinates of the data points, must be increasing if argument
    `period` is not specified. Otherwise, `xp` is internally sorted after
    normalizing the periodic boundaries with ``xp = xp % period``.
fp : 1-D sequence of floats
    The y-coordinates of the data points, same length as `xp`.
period : None or float, optional
    A period for the x-coordinates. This parameter allows the proper
    interpolation of angular x-coordinates. Parameters `left` and `right`
    are ignored if `period` is specified.
"""
Let's take a simple example of a triangular wave while going through this:
xp = np.array([-np.pi/2, -np.pi/4, 0, np.pi/4])
fp = np.array([0, -1, 0, 1])
x = np.array([-np.pi/8, -5*np.pi/8]) # Peskiest points possible }:)
period = np.pi
Now, I start off with the period != None branch in the source code, after all the type-checking happens:
# normalizing periodic boundaries
x = x % period
xp = xp % period
This just ensures that all values of x and xp supplied are between 0 and period. So, since the period is pi, but we specified x and xp to be between -pi/2 and pi/2, this will adjust for that by adding pi to all values in the range [-pi/2, 0), so that they effectively appear after pi/2. So our xp now reads [pi/2, 3*pi/4, 0, pi/4].
asort_xp = np.argsort(xp)
xp = xp[asort_xp]
fp = fp[asort_xp]
This is just ordering xp in increasing order. This is especially required after performing that modulo operation in the previous step. So, now xp is [0, pi/4, pi/2, 3*pi/4]. fp has also been shuffled accordingly, [0, 1, 0, -1].
xp = np.concatenate((xp[-1:]-period, xp, xp[0:1]+period))
fp = np.concatenate((fp[-1:], fp, fp[0:1]))
return compiled_interp(x, xp, fp, left, right) # Paraphrasing a little
np.interp does linear interpolation. When trying to interpolate between two points a and b present in xp, it only uses the values of f(a) and f(b) (i.e., the values of fp at the corresponding indices). So what np.interp is doing in this last step is to take the point xp[-1] and put it in front of the array, and take the point xp[0] and put it after the array, but after subtracting and adding one period respectively. So you now have a new xp that looks like [-pi/4, 0, pi/4, pi/2, 3*pi/4, pi]. Likewise, fp[0] and fp[-1] have been concatenated around, so fp is now [-1, 0, 1, 0, -1, 0].
Note that after the modulo operations, x had been brought into the [0, pi] range too, so x is now [7*pi/8, 3*pi/8]. Which lets you easily see that you'll get back [-0.5, 0.5].
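The walkthrough above can be checked by doing the same steps by hand and comparing against np.interp with the period argument. A minimal sketch using the triangular-wave example from this answer (the intermediate variable names are only illustrative):

```python
import numpy as np

xp = np.array([-np.pi/2, -np.pi/4, 0, np.pi/4])
fp = np.array([0, -1, 0, 1])
x = np.array([-np.pi/8, -5*np.pi/8])
period = np.pi

# Step 1: wrap everything into [0, period).
x_mod = x % period
xp_mod = xp % period

# Step 2: sort xp and reorder fp to match.
order = np.argsort(xp_mod)
xp_sorted = xp_mod[order]
fp_sorted = fp[order]

# Step 3: pad one point on each side, shifted by one period.
xp_ext = np.concatenate((xp_sorted[-1:] - period, xp_sorted, xp_sorted[0:1] + period))
fp_ext = np.concatenate((fp_sorted[-1:], fp_sorted, fp_sorted[0:1]))

manual = np.interp(x_mod, xp_ext, fp_ext)
builtin = np.interp(x, xp, fp, period=period)
print(manual)   # approximately [-0.5  0.5]
print(builtin)  # same values
```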
Now, coming to your 2D case:
Say you have a grid and some values. Let's just take all values to be between [0, pi] off the bat so we don't need to worry about modulos and shufflings.
xp = np.array([0, np.pi/4, np.pi/2, 3*np.pi/4])
yp = np.array([0, 1, 2, 3])
period = np.pi
# Put x on the 1st dim and y on the 2nd dim; f is linear in y
fp = np.array([0, 1, 0, -1])[:, np.newaxis] + yp[np.newaxis, :]
# >>> fp
# array([[ 0, 1, 2, 3],
# [ 1, 2, 3, 4],
# [ 0, 1, 2, 3],
# [-1, 0, 1, 2]])
We now know that all you need to do is to add xp[[-1]] in front of the array and xp[[0]] at the end, adjusting for the period. Note how I've indexed using the singleton lists [-1] and [0]. This is a trick to ensure that dimensions are preserved.
xp = np.concatenate((xp[[-1]]-period, xp, xp[[0]]+period))
fp = np.concatenate((fp[[-1], :], fp, fp[[0], :]))
Finally, you are free to use scipy.interpolate.interpn to achieve your result. Let's get the value at x = pi/8 for all y:
from scipy.interpolate import interpn
interp_points = np.hstack(( (np.pi/8 * np.ones(4))[:, np.newaxis], yp[:, np.newaxis] ))
result = interpn((xp, yp), fp, interp_points)
# >>> result
# array([ 0.5, 1.5, 2.5, 3.5])
interp_points has to be specified as an Nx2 matrix of points, where the first dimension indexes the points you want to interpolate at and the second dimension gives the x- and y-coordinates of each point. See this answer for a detailed explanation.
If you want to get the value outside of the range [0, period], you'll need to modulo it yourself:
x = 21 * np.pi / 8
x_equiv = x % period # Now within [0, period]
interp_points = np.hstack(( (x_equiv * np.ones(4))[:, np.newaxis], yp[:, np.newaxis] ))
result = interpn((xp, yp), fp, interp_points)
# >>> result
# array([-0.5, 0.5, 1.5, 2.5])
Again, if you want to generate interp_points for a bunch of x- and y- values, look at this answer.
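If only the wrapped axis needs interpolating (the column-by-column loop from the question), the same padding trick can be applied to all columns at once in NumPy without SciPy. The following is a sketch under stated assumptions: `interp_wrapped_axis0` is a hypothetical helper, xp must be sorted and lie in [0, period), and the query points must be one-dimensional.

```python
import numpy as np


def interp_wrapped_axis0(x_new, xp, fp, period):
    """Linearly interpolate fp (shape (len(xp), M)) along axis 0 at x_new, wrapping at period.

    Assumes xp is sorted and lies in [0, period).
    """
    # Pad one grid line on each side, shifted by one period.
    xp_ext = np.concatenate((xp[-1:] - period, xp, xp[:1] + period))
    fp_ext = np.concatenate((fp[-1:], fp, fp[:1]), axis=0)
    x_mod = np.asarray(x_new, dtype=float) % period
    # Index of the left neighbour of every query point.
    idx = np.searchsorted(xp_ext, x_mod, side="right") - 1
    idx = np.clip(idx, 0, len(xp_ext) - 2)
    # Fractional position between the left and right neighbours.
    t = (x_mod - xp_ext[idx]) / (xp_ext[idx + 1] - xp_ext[idx])
    return (1 - t)[:, np.newaxis] * fp_ext[idx] + t[:, np.newaxis] * fp_ext[idx + 1]


xp = np.array([0, np.pi/4, np.pi/2, 3*np.pi/4])
yp = np.array([0, 1, 2, 3])
fp = np.array([0, 1, 0, -1])[:, np.newaxis] + yp[np.newaxis, :]

result = interp_wrapped_axis0([np.pi/8, 21*np.pi/8], xp, fp, np.pi)
print(result)
# approximately [[ 0.5  1.5  2.5  3.5]
#                [-0.5  0.5  1.5  2.5]]
```

Each output row holds the interpolated values for one query angle across all columns, matching the two interpn results shown above.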

Return value for which a function reaches its smallest negative value in python

It's pretty late, so I don't know how clear this will be.
I have a function f(x), and I want to get the value of x from a list for which f(x) reaches its smallest negative value, namely:
x = [0, 2, 4, 6]
f(x) = [200, 0, -3, -1000]
In this case, I would like something to return the value 4 in x, which gave me -3. I don't want the absolute minimum (-1000), but the negative value with the lowest absolute value.
I hope that makes sense, thanks a lot for your help.
UPDATE
I was trying to simplify the problem, maybe too much. Here's the thing: I have a list of 2D points that form a polygon and I want to order them clockwise.
For that, I take the cross product between each point and the rest and select the next point based on getting the negative cross product (which tells me the sense of rotation) from the previous and which has the smallest absolute value (which tells me it really is the next point).
so, say:
x = [(1,1), (-1,-1), (-1,1), (1,-1)]
and I would like to get
x = [(1,1), (1,-1), (-1,-1), (-1,1)]
I'm doing
for point in x:
    cp = [numpy.cross(point, p) for p in x]
    # and then some magic to select the right point...
Thanks for your help again.
a = [0, 2, 4, 6]
b = [200, 0, -3, -1000]
value = max([x for x in b if x < 0])
print(a[b.index(value)])
Try this:
inputs = [0, 2, 4, 6]
outputs = [200, 0, -3, -1000]
max = min(outputs)
for n in outputs:
    if n >= 0:
        continue
    if n > max:
        max = n
print(inputs[outputs.index(max)])
x = [0, 2, 4, 6]
fx = [200, 0, -3, -1000]
print(x[fx.index(max(n for n in fx if n < 0))])
The following solution should work well in most cases
[ z for z in x if f(z) == max( f(y) for y in x if f(y) < 0 ) ]
One characteristic of this solution is that if there is a repetition, i.e. several x's producing the same biggest negative, all of them will be returned.
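A variant that evaluates f only once per element and copes with the case where no value is negative could look like the sketch below. `closest_negative_argument` is a hypothetical name, and returning None for "no negative value" is an assumption about the desired behaviour; on ties it returns a single x rather than all of them.

```python
def closest_negative_argument(xs, f):
    """Return the x in xs whose f(x) is negative and closest to zero, or None."""
    # Evaluate f once per x and keep only the negative results.
    negatives = [(fx, x) for x, fx in ((x, f(x)) for x in xs) if fx < 0]
    if not negatives:
        return None
    # The largest negative value is the one closest to zero.
    return max(negatives, key=lambda pair: pair[0])[1]


table = {0: 200, 2: 0, 4: -3, 6: -1000}
print(closest_negative_argument([0, 2, 4, 6], table.get))  # 4
print(closest_negative_argument([0, 2], table.get))        # None
```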
