Automatically reduce piecewise function components - Pyomo - python

In pyomo, I have a piece-wise linear constraint defined through pyomo.environ.Piecewise. I keep getting a warning along the lines of
Piecewise component '<component name>' has detected slopes of consecutive piecewise segments to be within <tolerance> of one another. Refer to the Piecewise help documentation for information on how to disable this warning.
I know I could increase the tolerance and get rid of the warning, but I'm wondering if there is a general approach (through Pyomo or numpy) to reduce the number of "segments" if two consecutive slopes are below a given tolerance.
I could obviously implement this myself, but I'd like to avoid reinventing the wheel.

Ok, this is what I came up with. Definitely not optimized for performance, but my case depends on few points. It also lacks more validations on the inputs (e.g. x being sorted and unique).
def reduce_piecewise(x, y, abs_tol):
"""
Remove unnecessary points from piece-wise curve.
Points are remove if the slopes of consecutive segments
differ by less than `abs_tol`.
x points must be sorted and unique.
Consecutive y points can be the same though!
Parameters
----------
x : List[float]
Points along x-axis.
y : List[float]
abs_tol : float
Tolerance between consecutive segments.
Returns
-------
(np.array, np.array)
x and y points - reduced.
"""
if not len(x) == len(y):
raise ValueError("x and y must have same shape")
x_reduced = [x[0]]
y_reduced = [y[0]]
for i in range(1, len(x) - 1):
left_slope = (y[i] - y_reduced[-1])/(x[i] - x_reduced[-1])
right_slope = (y[i+1] - y[i])/(x[i+1] - x[i])
if abs(right_slope - left_slope) > abs_tol:
x_reduced.append(x[i])
y_reduced.append(y[i])
x_reduced.append(x[-1])
y_reduced.append(y[-1])
return np.array(x_reduced), np.array(y_reduced)
And here are some examples:
>>> x = np.array([0, 1, 2, 3])
>>> y = np.array([0, 1, 2, 3])
>>> reduce_piecewise(x, y, 0.01)
(array([0, 3]), array([0, 3]))
>>> x = np.array([0, 1, 2, 3, 4, 5])
>>> y = np.array([0, 2, -1, 3, 4.001, 5]) # 4.001 should be removed
>>> reduce_piecewise(x, y, 0.01)
(array([0, 1, 2, 3, 5]), array([ 0., 2., -1., 3., 5.]))

Related

Getting the right sign when calculating repeated sign switches in numpy array

I am trying to simulate a grid of spins in python that can change their orientation (represented by the sign):
>>> import numpy as np
>>> spin_values = np.random.choice([-1, 1], (2, 2))
>>> spin_values
array([[-1, 1],
[ 1, 1]])
I then throw two sets of random indices of that grid for spins that have a certain probability to switch their orientation, let's say:
>>> i = np.array([1, 1])
>>> j = np.array([0, 0])
>>> switches = np.array([-1, -1])
i and j here contain the indices that might change and switches states whether they do switch (-1) or keep their orientation (1). My idea for calculating the new orientations was:
>>> spin_values[i, j] *= switches
When a spin orientation only changes once this works fine. However, when it is supposed to change twice (as with the example values) it only changes once, therefore giving me a wrong result.
>>> spin_values
array([[-1, 1],
[-1, 1]])
How could I get the right results while having a short run time (this has to be done many times on a bigger grid)?
I would use numpy.unique to get the count of unique pairs of indices and compute -1 ** n:
idx, cnt = np.unique(np.vstack([i, j]), axis=1, return_counts=True)
spin_values[tuple(idx)] = (-1) ** cnt
Updated spin_values:
array([[-1, 1],
[ 1, 1]])

PyTorch's torch.as_strided with negative strides for making a Toeplitz matrix

I am writing a jury-rigged PyTorch version of scipy.linalg.toeplitz, which currently has the following form:
def toeplitz_torch(c, r=None):
c = torch.tensor(c).ravel()
if r is None:
r = torch.conj(c)
else:
r = torch.tensor(r).ravel()
# Flip c left to right.
idx = [i for i in range(c.size(0)-1, -1, -1)]
idx = torch.LongTensor(idx)
c = c.index_select(0, idx)
vals = torch.cat((c, r[1:]))
out_shp = len(c), len(r)
n = vals.stride(0)
return torch.as_strided(vals[len(c)-1:], size=out_shp, stride=(-n, n)).copy()
But torch.as_strided currently does not support negative strides. My function, therefore, throws the error:
RuntimeError: as_strided: Negative strides are not supported at the moment, got strides: [-1, 1].
My (perhaps incorrect) understanding of as_strided is that it inserts the values of the first argument into a new array whose size is specified by the second argument and it does so by linearly indexing those values in the original array and placing them at subscript-indexed strides given by the final argument.
Both the NumPy and PyTorch documentation concerning as_strided have scary warnings about using the function with "extreme care" and I don't understand this function fully, so I'd like to ask:
Is my understanding of as_strided correct?
Is there a simple way to rewrite this so negative strides work?
Will I be able to pass a gradient w.r.t c (or r) through toeplitz_torch?
> 1. Is my understanding of as_strided correct?
The stride is an interface for your tensor to access the underlying contiguous data buffer. It does not insert values, no copies of the values are done by torch.as_strided, the strides define the artificial layout of what we refer to as multi-dimensional array (in NumPy) or tensor (in PyTorch).
As Andreas K. puts it in another answer:
Strides are the number of bytes to jump over in the memory in order to get from one item to the next item along each direction/dimension of the array. In other words, it's the byte-separation between consecutive items for each dimension.
Please feel free to read the answers over there if you have some trouble with strides. Here we will take your example and look at how it is implemented with as_strided.
The example given by Scipy for linalg.toeplitz is the following:
>>> toeplitz([1,2,3], [1,4,5,6])
array([[1, 4, 5, 6],
[2, 1, 4, 5],
[3, 2, 1, 4]])
To do so they first construct the list of values (what we can refer to as the underlying values, not actually underlying data): vals which is constructed as [3 2 1 4 5 6], i.e. the Toeplitz column and row flattened.
Now notice the arguments passed to np.lib.stride_tricks.as_strided:
values: vals[len(c)-1:] notice the slice: the tensors show up smaller, yet the underlying values remain, and they correspond to those of vals. Go ahead and compare the two with storage_offset: it's just an offset of 2, the values are still there! How this works is that it essentially shifts the indices such that index=0 will refer to value 1, index=1 to 4, etc...
shape: given by the column/row inputs, here (3, 4). This is the shape of the resulting object.
strides: this is the most important piece: (-n, n), in this case (-1, 1)
The most intuitive thing to do with strides is to describe a mapping between the multi-dimensional space: (i, j) ∈ [0,3[ x [0,4[ and the flattened 1D space: k ∈ [0, 3*4[. Since the strides are equal to (-n, n) = (-1, 1), the mapping is -n*i + n*j = -1*i + 1*j = j-i. Mathematically you can describe your matrix as M[i, j] = F[j-i] where F is the flattened values vector [3 2 1 4 5 6].
For instance, let's try with i=1 and j=2. If you look at the Topleitz matrix above M[1, 2] = 4. Indeed F[k] = F[j-i] = F[1] = 4
If you look closely you will see the trick behind negative strides: they allow you to 'reference' to negative indices: for instance, if you take j=0 and i=2, then you see k=-2. Remember how vals was given with an offset of 2 by slicing vals[len(c)-1:]. If you look at its own underlying data storage it's still [3 2 1 4 5 6], but has an offset. The mapping for vals (in this case i: 1D -> k: 1D) would be M'[i] = F'[k] = F'[i+2] because of the offset. This means M'[-2] = F'[0] = 3.
In the above I defined M' as vals[len(c)-1:] which basically equivalent to the following tensor:
>>> torch.as_strided(vals, size=(len(vals)-2,), stride=(1,), storage_offset=2)
tensor([1, 4, 5, 6])
Similarly, I defined F' as the flattened vector of underlying values: [3 2 1 4 5 6].
The usage of strides is indeed a very clever way to define a Toeplitz matrix!
> 2. Is there a simple way to rewrite this so negative strides work?
The issue is, negative strides are not implemented in PyTorch... I don't believe there is a way around it with torch.as_strided, otherwise it would be rather easy to extend the current implementation and provide support for that feature.
There are however alternative ways to solve the problem. It is entirely possible to construct a Toeplitz matrix in PyTorch, but that won't be with torch.as_strided.
We will do the mapping ourselves: for each element of M indexed by (i, j), we will find out the corresponding index k which is simply j-i. This can be done with ease, first by gathering all (i, j) pairs from M:
>>> i, j = torch.ones(3, 4).nonzero().T
(tensor([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]),
tensor([0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]))
Now we essentially have k:
>>> j-i
tensor([ 0, 1, 2, 3, -1, 0, 1, 2, -2, -1, 0, 1])
We just need to construct a flattened tensor of all possible values from the row r and column c inputs. Negative indexed values (the content of c) are put last and flipped:
>>> values = torch.cat((r, c[1:].flip(0)))
tensor([1, 4, 5, 6, 3, 2])
Finally index values with k and reshape:
>>> values[j-i].reshape(3, 4)
tensor([[1, 4, 5, 6],
[2, 1, 4, 5],
[3, 2, 1, 4]])
To sum it up, my proposed implementation would be:
def toeplitz(c, r):
vals = torch.cat((r, c[1:].flip(0)))
shape = len(c), len(r)
i, j = torch.ones(*shape).nonzero().T
return vals[j-i].reshape(*shape)
> 3. Will I be able to pass a gradient w.r.t c (or r) through toeplitz_torch?
That's an interesting question because torch.as_strided doesn't have a backward function implemented. This means you wouldn't have been able to backpropagate to c and r! With the above method, however, which uses 'backward-compatible' builtins, the backward pass comes free of charge.
Notice the grad_fn on the output:
>>> toeplitz(torch.tensor([1.,2.,3.], requires_grad=True),
torch.tensor([1.,4.,5.,6.], requires_grad=True))
tensor([[1., 4., 5., 6.],
[2., 1., 4., 5.],
[3., 2., 1., 4.]], grad_fn=<ViewBackward>)
This was a quick draft (that did take a little while to write down), I will make some edits. If you have some questions or remarks, don't hesitate to comment! I would be interested in seeing other answers as I am not an expert with strides, this is just my take on the problem.

Optimize the python function with numpy without using the for loop

I have the following python function:
def npnearest(u: np.ndarray, X: np.ndarray, Y: np.ndarray, distance: 'callbale'=npdistance):
'''
Finds x1 so that x1 is in X and u and x1 have a minimal distance (according to the
provided distance function) compared to all other data points in X. Returns the label of x1
Args:
u (np.ndarray): The vector (ndim=1) we want to classify
X (np.ndarray): A matrix (ndim=2) with training data points (vectors)
Y (np.ndarray): A vector containing the label of each data point in X
distance (callable): A function that receives two inputs and defines the distance function used
Returns:
int: The label of the data point which is closest to `u`
'''
xbest = None
ybest = None
dbest = float('inf')
for x, y in zip(X, Y):
d = distance(u, x)
if d < dbest:
ybest = y
xbest = x
dbest = d
return ybest
Where, npdistance simply gives distance between two points i.e.
def npdistance(x1, x2):
return(np.sum((x1-x2)**2))
I want to optimize npnearest by performing nearest neighbor search directly in numpy. This means that the function cannot use for/while loops.
Thanks
Since you don't need to use that exact function, you can simply change the sum to work over a particular axis. This will return a new list with the calculations and you can call argmin to get the index of the minimum value. Use that and lookup your label:
import numpy as np
def npdistance_idx(x1, x2):
return np.argmin(np.sum((x1-x2)**2, axis=1))
Y = ["label 0", "label 1", "label 2", "label 3"]
u = np.array([[1, 5.5]])
X = np.array([[1,2], [1, 5], [0, 0], [7, 7]])
idx = npdistance_idx(X, u)
print(Y[idx]) # label 1
Numpy supports vectorized operations (broadcasting)
This means you can pass in arrays and operations will be applied to entire arrays in an optimized way (SIMD - single instruction, multiple data)
You can then get the address of the array minimum using .argmin()
Hope this helps
In [9]: numbers = np.arange(10); numbers
Out[9]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [10]: numbers -= 5; numbers
Out[10]: array([-5, -4, -3, -2, -1, 0, 1, 2, 3, 4])
In [11]: numbers = np.power(numbers, 2); numbers
Out[11]: array([25, 16, 9, 4, 1, 0, 1, 4, 9, 16])
In [12]: numbers.argmin()
Out[12]: 5

Map numbers to their percentiles

I would like to apply the result of numpy.percentile to its argument, i.e., map every number in the input vector to its quantile.
E.g., if v=np.array([1,2,3,4]), and I want just two quantiles (bigger and smaller than the median), I would get np.array([0,0,1,1]) telling me that the first two elements of v are smaller than the median and the last two are bigger than the median.
Note that I am interested in, say, deciles, not just the median!
IOW, #PaulPanzer hit the nail:
np.digitize(v,np.percentile(v,quantiles))
thanks!
(v > np.percentile(v, 50)).astype(int)
Out[93]:
array([0, 0, 1, 1])
Use np.digitize:
perc = np.percentile(data, q)
indices = np.digitize(data, perc)
Example q = [25,50,75], data = np.arange(8):
indices
# array([0, 0, 1, 1, 2, 2, 3, 3])

Wrapped (circular) 2D interpolation in Python

I have angular data on a domain that is wrapped at pi radians (i.e. 0 = pi). The data are 2D, where one dimension represents the angle. I need to interpolate this data onto another grid in a wrapped way.
In one dimension, the np.interp function takes a period kwarg (for NumPy 1.10 and later):
http://docs.scipy.org/doc/numpy/reference/generated/numpy.interp.html
This is exactly what I need, but I need it in two dimensions. I'm currently just stepping through columns in my array and using np.interp, but this is of course slow.
Anything out there that could achieve this same outcome but faster?
An explanation of how np.interp works
Use the source, Luke!
The numpy doc for np.interp makes the source particularly easy to find, since it has the link right there, along with the documentation. Let's go through this, line by line.
First, recall the parameters:
"""
x : array_like
The x-coordinates of the interpolated values.
xp : 1-D sequence of floats
The x-coordinates of the data points, must be increasing if argument
`period` is not specified. Otherwise, `xp` is internally sorted after
normalizing the periodic boundaries with ``xp = xp % period``.
fp : 1-D sequence of floats
The y-coordinates of the data points, same length as `xp`.
period : None or float, optional
A period for the x-coordinates. This parameter allows the proper
interpolation of angular x-coordinates. Parameters `left` and `right`
are ignored if `period` is specified.
"""
Let's take a simple example of a triangular wave while going through this:
xp = np.array([-np.pi/2, -np.pi/4, 0, np.pi/4])
fp = np.array([0, -1, 0, 1])
x = np.array([-np.pi/8, -5*np.pi/8]) # Peskiest points possible }:)
period = np.pi
Now, I start off with the period != None branch in the source code, after all the type-checking happens:
# normalizing periodic boundaries
x = x % period
xp = xp % period
This just ensures that all values of x and xp supplied are between 0 and period. So, since the period is pi, but we specified x and xp to be between -pi/2 and pi/2, this will adjust for that by adding pi to all values in the range [-pi/2, 0), so that they effectively appear after pi/2. So our xp now reads [pi/2, 3*pi/4, 0, pi/4].
asort_xp = np.argsort(xp)
xp = xp[asort_xp]
fp = fp[asort_xp]
This is just ordering xp in increasing order. This is especially required after performing that modulo operation in the previous step. So, now xp is [0, pi/4, pi/2, 3*pi/4]. fp has also been shuffled accordingly, [0, 1, 0, -1].
xp = np.concatenate((xp[-1:]-period, xp, xp[0:1]+period))
fp = np.concatenate((fp[-1:], fp, fp[0:1]))
return compiled_interp(x, xp, fp, left, right) # Paraphrasing a little
np.interp does linear interpolation. When trying to interpolate between two points a and b present in xp, it only uses the values of f(a) and f(b) (i.e., the values of fp at the corresponding indices). So what np.interp is doing in this last step is to take the point xp[-1] and put it in front of the array, and take the point xp[0] and put it after the array, but after subtracting and adding one period respectively. So you now have a new xp that looks like [-pi/4, 0, pi/4, pi/2, 3*pi/4, pi]. Likewise, fp[0] and fp[-1] have been concatenated around, so fp is now [-1, 0, 1, 0, -1, 0].
Note that after the modulo operations, x had been brought into the [0, pi] range too, so x is now [7*pi/8, 3*pi/8]. Which lets you easily see that you'll get back [-0.5, 0.5].
Now, coming to your 2D case:
Say you have a grid and some values. Let's just take all values to be between [0, pi] off the bat so we don't need to worry about modulos and shufflings.
xp = np.array([0, np.pi/4, np.pi/2, 3*np.pi/4])
yp = np.array([0, 1, 2, 3])
period = np.pi
# Put x on the 1st dim and y on the 2nd dim; f is linear in y
fp = np.array([0, 1, 0, -1])[:, np.newaxis] + yp[np.newaxis, :]
# >>> fp
# array([[ 0, 1, 2, 3],
# [ 1, 2, 3, 4],
# [ 0, 1, 2, 3],
# [-1, 0, 1, 2]])
We now know that all you need to do is to add xp[[-1]] in front of the array and xp[[0]] at the end, adjusting for the period. Note how I've indexed using the singleton lists [-1] and [0]. This is a trick to ensure that dimensions are preserved.
xp = np.concatenate((xp[[-1]]-period, xp, xp[[0]]+period))
fp = np.concatenate((fp[[-1], :], fp, fp[[0], :]))
Finally, you are free to use scipy.interpolate.interpn to achieve your result. Let's get the value at x = pi/8 for all y:
from scipy.interpolate import interpn
interp_points = np.hstack(( (np.pi/8 * np.ones(4))[:, np.newaxis], yp[:, np.newaxis] ))
result = interpn((xp, yp), fp, interp_points)
# >>> result
# array([ 0.5, 1.5, 2.5, 3.5])
interp_points has to be specified as an Nx2 matrix of points, where the first dimension is for each point you want interpolation at the second dimension gives the x- and y-coordinate of that point. See this answer for a detailed explanation.
If you want to get the value outside of the range [0, period], you'll need to modulo it yourself:
x = 21 * np.pi / 8
x_equiv = x % period # Now within [0, period]
interp_points = np.hstack(( (x_equiv * np.ones(4))[:, np.newaxis], yp[:, np.newaxis] ))
result = interpn((xp, yp), fp, interp_points)
# >>> result
# array([-0.5, 0.5, 1.5, 2.5])
Again, if you want to generate interp_points for a bunch of x- and y- values, look at this answer.

Categories