numpy iterate over two 2d arrays - python

Say I have two matrices:
X, Y = np.meshgrid(np.arange(0, 2, 0.1), np.arange(3, 5, 0.1))
And a function, something like:
def func(x):
    return x[0]**2 + x[1]**2
How can I fill a matrix Z (of size np.shape(X)), where each entry is formed by calling func on the two corresponding values of X and Y, i.e.:
Z[i, j] = func([X[i, j], Y[i, j]])
Is there a way without using a double nested for-loop?

This also works as a vectorized form of function evaluation:
import numpy as np
X, Y = np.meshgrid(np.arange(0, 2, 0.1), np.arange(3, 5, 0.1))
def func(x):
    return x[0]**2 + x[1]**2
Z = func([X,Y])

For given numpy arrays X and Y, you could just do -
Zout = X**2 + Y**2
If you are actually constructing X and Y like that, there is a direct way to get Z with broadcasting and thus avoid np.meshgrid, like so -
Zout = np.arange(0, 2, 0.1)**2 + np.arange(3, 5, 0.1)[:,None]**2
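As a quick check of the broadcasting claim, the sketch below compares the meshgrid route with the broadcast route on the same inputs. The names a, b, Z_mesh and Z_bcast are illustrative and do not come from the question.

```python
import numpy as np

a = np.arange(0, 2, 0.1)   # values that vary along the columns (x direction)
b = np.arange(3, 5, 0.1)   # values that vary along the rows (y direction)

# Route 1: build full 2D coordinate arrays, then evaluate elementwise.
X, Y = np.meshgrid(a, b)
Z_mesh = X**2 + Y**2

# Route 2: let broadcasting combine a row vector with a column vector.
Z_bcast = a**2 + b[:, None]**2

print(Z_mesh.shape, Z_bcast.shape)
print(np.allclose(Z_mesh, Z_bcast))
```

The broadcast version never materializes X and Y, which matters mostly when the grid is large.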

Related

Python: Expand 2D array to multiple 1D arrays

Consider the following example from the np.meshgrid docs:
nx, ny = (3, 2)
x = np.linspace(0, 1, nx)
y = np.linspace(0, 1, ny)
xv, yv = np.meshgrid(x, y)
In my application, instead of x and y, I have 25 variables. To create a grid out of the 25 variables, one way would be:
v1 = np.linspace(0, 1, 10)
v2 = np.linspace(0, 1, 10)
...
v25 = np.linspace(0, 1, 10)
z_grid = np.meshgrid(v1, v2, ..., v25)
However, the code will look ugly and not modular w.r.t. the number of variables (since each variable is hard-coded). Therefore, I am interested in something like the following:
n_variables = 25
z = np.array([np.linspace(0, 1, 10)] * n_variables)
z_grid = np.dstack(np.meshgrid(z))
However, I am guessing meshgrid(z) is not the correct call, and I should expand z to n_variables arrays. Any thoughts on how I can expand the 2D array into multiple 1D arrays?
This should do it:
n_variables = 25
z = np.array([np.linspace(0, 1, 10)] * n_variables)
z_grid = np.dstack(np.meshgrid(*z))
The * operator before a list unpacks its elements into separate arguments. Consider the following:
v1 = [1,2,3]
v2 = [4,5,6]
list_of_v = [v1,v2]
some_function(v1,v2) == some_function(*list_of_v)
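A minimal sketch of the unpacking applied to np.meshgrid, using only 3 variables with 4 points each so it runs quickly. A dense grid over 25 axes with 10 points each would need 10**25 entries per coordinate array, so the sketch also shows sparse=True as one possible way around that; whether it suits the application is an assumption.

```python
import numpy as np

n_variables = 3          # small on purpose; 25 dense axes would not fit in memory
points_per_axis = 4
z = np.array([np.linspace(0, 1, points_per_axis)] * n_variables)

# Iterating over a 2D array yields its rows, so *z passes each row as one argument.
dense = np.meshgrid(*z, indexing="ij")
print(len(dense), dense[0].shape)

# sparse=True returns arrays that have length 1 along every axis but their own.
sparse = np.meshgrid(*z, indexing="ij", sparse=True)
print([g.shape for g in sparse])

# Broadcasting the sparse arrays reproduces the dense values on demand.
total_dense = sum(dense)
total_sparse = sum(sparse)
print(np.allclose(total_dense, total_sparse))
```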

How can I use multiple dimensional polynomials with numpy.polynomial?

I'm able to use numpy.polynomial to fit terms to 1D polynomials like f(x) = 1 + x + x^2. How can I fit multidimensional polynomials, like f(x,y) = 1 + x + x^2 + y + yx + y x^2 + y^2 + y^2 x + y^2 x^2? It looks like numpy doesn't support multidimensional polynomials at all: is that the case? In my real application, I have 5 dimensions of input and I am interested in hermite polynomials. It looks like the polynomials in scipy.special are also only available for one dimension of inputs.
# One dimension of data can be fit
x = np.random.random(100)
y = np.sin(x)
params = np.polynomial.polynomial.polyfit(x, y, 6)
np.polynomial.polynomial.polyval([0, .2, .5, 1.5], params)
array([ -5.01799432e-08, 1.98669317e-01, 4.79425535e-01,
9.97606096e-01])
# When I try two dimensions, it fails.
x = np.random.random((100, 2))
y = np.sin(5 * x[:,0]) + .4 * np.sin(x[:,1])
params = np.polynomial.polynomial.polyvander2d(x, y, [6, 6])
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-13-5409f9a3e632> in <module>()
----> 1 params = np.polynomial.polynomial.polyvander2d(x, y, [6, 6])
/usr/local/lib/python2.7/site-packages/numpy/polynomial/polynomial.pyc in polyvander2d(x, y, deg)
1201 raise ValueError("degrees must be non-negative integers")
1202 degx, degy = ideg
-> 1203 x, y = np.array((x, y), copy=0) + 0.0
1204
1205 vx = polyvander(x, degx)
ValueError: could not broadcast input array from shape (100,2) into shape (100)
I got annoyed that there is no simple function for a 2d polynomial fit of any number of degrees so I made my own. Like the other answers it uses numpy lstsq to find the best coefficients.
import numpy as np
from scipy.linalg import lstsq
from scipy.special import binom
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
def _get_coeff_idx(coeff):
    idx = np.indices(coeff.shape)
    idx = idx.T.swapaxes(0, 1).reshape((-1, 2))
    return idx
def _scale(x, y):
    # Normalize x and y to avoid huge numbers
    # Mean 0, variance 1
    offset_x, offset_y = np.mean(x), np.mean(y)
    norm_x, norm_y = np.std(x), np.std(y)
    x = (x - offset_x) / norm_x
    y = (y - offset_y) / norm_y
    return x, y, (norm_x, norm_y), (offset_x, offset_y)
def _unscale(x, y, norm, offset):
    x = x * norm[0] + offset[0]
    y = y * norm[1] + offset[1]
    return x, y
def polyvander2d(x, y, degree):
    A = np.polynomial.polynomial.polyvander2d(x, y, degree)
    return A
def polyscale2d(coeff, scale_x, scale_y, copy=True):
    if copy:
        coeff = np.copy(coeff)
    idx = _get_coeff_idx(coeff)
    for k, (i, j) in enumerate(idx):
        coeff[i, j] /= scale_x ** i * scale_y ** j
    return coeff
def polyshift2d(coeff, offset_x, offset_y, copy=True):
    if copy:
        coeff = np.copy(coeff)
    idx = _get_coeff_idx(coeff)
    # Copy coeff because it changes during the loop
    coeff2 = np.copy(coeff)
    for k, m in idx:
        not_the_same = ~((idx[:, 0] == k) & (idx[:, 1] == m))
        above = (idx[:, 0] >= k) & (idx[:, 1] >= m) & not_the_same
        for i, j in idx[above]:
            b = binom(i, k) * binom(j, m)
            sign = (-1) ** ((i - k) + (j - m))
            offset = offset_x ** (i - k) * offset_y ** (j - m)
            coeff[k, m] += sign * b * coeff2[i, j] * offset
    return coeff
def plot2d(x, y, z, coeff):
    # Plot at most 500 randomly chosen data points
    if x.size > 500:
        choice = np.random.choice(x.size, size=500, replace=False)
    else:
        choice = slice(None, None, None)
    x, y, z = x[choice], y[choice], z[choice]
    # regular grid covering the domain of the data
    X, Y = np.meshgrid(
        np.linspace(np.min(x), np.max(x), 20), np.linspace(np.min(y), np.max(y), 20)
    )
    Z = np.polynomial.polynomial.polyval2d(X, Y, coeff)
    fig = plt.figure()
    # fig.gca(projection="3d") was removed in newer Matplotlib releases
    ax = fig.add_subplot(projection="3d")
    ax.plot_surface(X, Y, Z, rstride=1, cstride=1, alpha=0.2)
    ax.scatter(x, y, z, c="r", s=50)
    plt.xlabel("X")
    plt.ylabel("Y")
    ax.set_zlabel("Z")
    plt.show()
def polyfit2d(x, y, z, degree=1, max_degree=None, scale=True, plot=False):
    """A simple 2D polynomial fit to data x, y, z
    The polynomial can be evaluated with numpy.polynomial.polynomial.polyval2d
    Parameters
    ----------
    x : array[n]
        x coordinates
    y : array[n]
        y coordinates
    z : array[n]
        data values
    degree : {int, 2-tuple}, optional
        degree of the polynomial fit in x and y direction (default: 1)
    max_degree : {int, None}, optional
        if given the maximum combined degree of the coefficients is limited to this value
    scale : bool, optional
        Whether to scale the input arrays x and y to mean 0 and variance 1, to avoid numerical overflows.
        Especially useful at higher degrees. (default: True)
    plot : bool, optional
        whether to plot the fitted surface and data (slow) (default: False)
    Returns
    -------
    coeff : array[degree+1, degree+1]
        the polynomial coefficients in numpy 2d format, i.e. coeff[i, j] for x**i * y**j
    """
    # Flatten input
    x = np.asarray(x).ravel()
    y = np.asarray(y).ravel()
    z = np.asarray(z).ravel()
    # Remove masked values
    mask = ~(np.ma.getmask(z) | np.ma.getmask(x) | np.ma.getmask(y))
    x, y, z = x[mask].ravel(), y[mask].ravel(), z[mask].ravel()
    # Scale coordinates to smaller values to avoid numerical problems at larger degrees
    if scale:
        x, y, norm, offset = _scale(x, y)
    if np.isscalar(degree):
        degree = (int(degree), int(degree))
    degree = [int(degree[0]), int(degree[1])]
    coeff = np.zeros((degree[0] + 1, degree[1] + 1))
    idx = _get_coeff_idx(coeff)
    # Calculate elements 1, x, y, x*y, x**2, y**2, ...
    A = polyvander2d(x, y, degree)
    # We only want the combinations with maximum order COMBINED power
    if max_degree is not None:
        mask = idx[:, 0] + idx[:, 1] <= int(max_degree)
        idx = idx[mask]
        A = A[:, mask]
    # Do the actual least squares fit
    C, *_ = lstsq(A, z)
    # Reorder coefficients into numpy compatible 2d array
    for k, (i, j) in enumerate(idx):
        coeff[i, j] = C[k]
    # Reverse the scaling
    if scale:
        coeff = polyscale2d(coeff, *norm, copy=False)
        coeff = polyshift2d(coeff, *offset, copy=False)
    if plot:
        if scale:
            x, y = _unscale(x, y, norm, offset)
        plot2d(x, y, z, coeff)
    return coeff
if __name__ == "__main__":
    n = 100
    x, y = np.meshgrid(np.arange(n), np.arange(n))
    z = x ** 2 + y ** 2
    c = polyfit2d(x, y, z, degree=2, plot=True)
    print(c)
It doesn't look like polyfit supports fitting multivariate polynomials, but you can do it by hand, with linalg.lstsq. The steps are as follows:
1. Gather the degrees of the monomials x**i * y**j you wish to use in the model. Think carefully about it: your current model already has 9 parameters, and if you push to 5 variables with the current approach you'll end up with 3**5 = 243 parameters, a sure road to overfitting. Maybe limit the model to monomials of total degree at most two or three.
2. Plug the x-points into each monomial; this gives a 1D array. Stack all such arrays as columns of a matrix.
3. Solve a linear system with the aforementioned matrix, with the right-hand side being the target values (I call them z because y is confusing when you also use x, y for the two variables).
Here it is:
import numpy as np
x = np.random.random((100, 2))
z = np.sin(5 * x[:,0]) + .4 * np.sin(x[:,1])
degrees = [(i, j) for i in range(3) for j in range(3)] # list of monomials x**i * y**j to use
matrix = np.stack([np.prod(x**d, axis=1) for d in degrees], axis=-1) # stack monomials like columns
coeff = np.linalg.lstsq(matrix, z)[0] # lstsq returns some additional info we ignore
print("Coefficients", coeff) # in the same order as the monomials listed in "degrees"
fit = np.dot(matrix, coeff)
print("Fitted values", fit)
print("Original values", z)
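The advice about limiting the total degree can be sketched for the 5-variable case as follows. This is an illustration under stated assumptions: the synthetic target z is made up for the demo, and n_vars and max_total_degree are hypothetical names.

```python
import itertools
import numpy as np

n_vars = 5
max_total_degree = 2

# All exponent tuples (d_1, ..., d_5) whose entries sum to at most max_total_degree.
degrees = [d for d in itertools.product(range(max_total_degree + 1), repeat=n_vars)
           if sum(d) <= max_total_degree]
print(len(degrees))  # 21 monomials instead of 3**5 = 243

rng = np.random.default_rng(0)
x = rng.random((200, n_vars))
# Synthetic target that is itself a polynomial of total degree 2 (assumption for the demo).
z = 1.0 + 2.0 * x[:, 0] - 3.0 * x[:, 1] * x[:, 4] + 0.5 * x[:, 2] ** 2

# One column per monomial: the product over k of x_k ** d_k.
matrix = np.stack([np.prod(x ** np.array(d), axis=1) for d in degrees], axis=-1)
coeff, *_ = np.linalg.lstsq(matrix, z, rcond=None)

fit = matrix @ coeff
print(np.max(np.abs(fit - z)))
```

Because the target here lies exactly in the span of the chosen monomials, the residual should be at rounding-error level; with real data it will not be.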
I believe you have misunderstood what polyvander2d does and how it should be used. polyvander2d() returns the pseudo-Vandermonde matrix of degrees deg and sample points (x, y).
Here, y is not the value(s) of the polynomial at point(s) x but rather it is the y-coordinate of the point(s) and x is the x-coordinate. Roughly speaking, the returned array is a set of combinations of (x**i) * (y**j) and x and y are essentially 2D "mesh-grids". Therefore, both x and y must have identical shapes.
Your x and y arrays, however, have different shapes:
>>> x.shape
(100, 2)
>>> y.shape
(100,)
I do not believe numpy has a 5D-polyvander of the form polyvander5D(x, y, z, v, w, deg). Notice, all the variables here are coordinates and not the values of the polynomial p=p(x,y,z,v,w). You, however, seem to be using y (in the 2D case) as f.
It appears that numpy does not have 2D or higher equivalents for the polyfit() function. If your intention is to find the coefficients of the best-fitting polynomial in higher-dimensions, I would suggest that you generalize the approach described here: Equivalent of `polyfit` for a 2D polynomial in Python
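For the 2D case, a minimal sketch of how polyvander2d can be combined with a least-squares solve, assuming the two columns of the question's x are the coordinates and the computed values are the target. The names pts, deg and z_fit are illustrative.

```python
import numpy as np
from numpy.polynomial import polynomial as P

rng = np.random.default_rng(0)
pts = rng.random((100, 2))
x, y = pts[:, 0], pts[:, 1]          # two coordinate arrays of identical shape (100,)
z = np.sin(5 * x) + 0.4 * np.sin(y)  # values to fit, kept separate from the coordinates

deg = (6, 6)
A = P.polyvander2d(x, y, deg)        # shape (100, 49): one column per x**i * y**j
c, *_ = np.linalg.lstsq(A, z, rcond=None)
c = c.reshape(deg[0] + 1, deg[1] + 1)  # c[i, j] multiplies x**i * y**j

z_fit = P.polyval2d(x, y, c)
print(A.shape, c.shape)
print(np.sqrt(np.mean((z_fit - z) ** 2)))
```

The reshape relies on the column order of polyvander2d, which places the x**i * y**j term at index (deg[1] + 1) * i + j. With 49 coefficients and only 100 points this fit is prone to overfitting, so a lower degree may be preferable in practice.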
The option isn't there because nobody wants to do that. Combine the polynomials linearly (f(x,y) = 1 + x + y + x^2 + y^2) and solve the system of equations yourself.

Select values in arrays

I have two arrays of the same length:
x = [2,3,6,100,2,3,5,8,100,100,5]
y = [2,3,4,5,5,5,2,1,0,2,4]
I selected the position where x==100 in this way:
How can I get the values of y where x==100 (that is, y=5,0,2)?
I tried in this way:
x100=np.where(x==100)
y100=y[x100]
but it doesn't give me the values I want. How can I solve the problem?
Your code works fine when actually using numpy arrays. You can also write it more succinctly like so.
>>> import numpy as np
>>> x = np.array([2,3,6,100,2,3,5,8,100,100,5])
>>> y = np.array([2,3,4,5,5,5,2,1,0,2,4])
>>> y[x == 100]
array([5, 0, 2])
x and y should be numpy arrays:
x = np.array([2,3,6,100,2,3,5,8,100,100,5])
y = np.array([2,3,4,5,5,5,2,1,0,2,4])
Then your code should work as you expect.
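To illustrate why the original attempt fails with plain lists, here is a small sketch; the _list suffixes are only there to keep the two versions apart.

```python
import numpy as np

x_list = [2, 3, 6, 100, 2, 3, 5, 8, 100, 100, 5]
y_list = [2, 3, 4, 5, 5, 5, 2, 1, 0, 2, 4]

# Comparing a list with an int is a single Python comparison, not an elementwise one.
print(x_list == 100)            # False

x = np.array(x_list)
y = np.array(y_list)

# With arrays, == is elementwise and np.where returns the matching positions.
x100 = np.where(x == 100)
y100 = y[x100]
print(x100[0])                  # positions 3, 8 and 9
print(y100)
```

With lists, np.where receives the single value False and finds no matches, and indexing a list with the resulting tuple is not supported either.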
What about
[b for (a,b) in zip(x,y) if a==100]
or
itertools.compress(y, [a==100 for a in x])
Iterate over both and check for 100:
x = [2,3,6,100,2,3,5,8,100,100,5]
y = [2,3,4,5,5,5,2,1,0,2,4]
for xi, yi in zip(x, y):
    if xi == 100:
        print(yi)
Prints:
5
0
2
Or as list comprehension:
>>> [yi for xi, yi in zip(x, y) if xi == 100]
[5, 0, 2]

Python 3D polynomial surface fit, order dependent

I am currently working with astronomical data among which I have comet images. I would like to remove the background sky gradient in these images due to the time of capture (twilight). The first program I developed to do so took user selected points from Matplotlib's "ginput" (x,y) pulled the data for each coordinate (z) and then gridded the data in a new array with SciPy's "griddata."
Since the background is assumed to vary only slightly, I would like to fit a 3d low order polynomial to this set of (x,y,z) points. However, the "griddata" does not allow for an input order:
griddata(points,values, (dimension_x,dimension_y), method='nearest/linear/cubic')
Any ideas on another function that may be used, or a method for developing a least-squares fit that will allow me to control the order?
Griddata uses a spline fitting. A 3rd order spline is not the same thing as a 3rd order polynomial (instead, it's a different 3rd order polynomial at every point).
If you just want to fit a 2D, 3rd order polynomial to your data, then do something like the following to estimate the 16 coefficients using all of your data points.
import itertools
import numpy as np
import matplotlib.pyplot as plt
def main():
    # Generate Data...
    numdata = 100
    x = np.random.random(numdata)
    y = np.random.random(numdata)
    z = x**2 + y**2 + 3*x**3 + y + np.random.random(numdata)
    # Fit a 3rd order, 2d polynomial
    m = polyfit2d(x,y,z)
    # Evaluate it on a grid...
    nx, ny = 20, 20
    xx, yy = np.meshgrid(np.linspace(x.min(), x.max(), nx),
                         np.linspace(y.min(), y.max(), ny))
    zz = polyval2d(xx, yy, m)
    # Plot
    # extent is (left, right, bottom, top); row 0 of zz is y.min() and is drawn at the top
    plt.imshow(zz, extent=(x.min(), x.max(), y.max(), y.min()))
    plt.scatter(x, y, c=z)
    plt.show()
def polyfit2d(x, y, z, order=3):
    ncols = (order + 1)**2
    G = np.zeros((x.size, ncols))
    ij = itertools.product(range(order+1), range(order+1))
    for k, (i,j) in enumerate(ij):
        G[:,k] = x**i * y**j
    m, _, _, _ = np.linalg.lstsq(G, z)
    return m
def polyval2d(x, y, m):
    order = int(np.sqrt(len(m))) - 1
    ij = itertools.product(range(order+1), range(order+1))
    z = np.zeros_like(x)
    for a, (i,j) in zip(m, ij):
        z += a * x**i * y**j
    return z
main()
The following implementation of polyfit2d uses the available numpy methods numpy.polynomial.polynomial.polyvander2d and numpy.polynomial.polynomial.polyval2d
#!/usr/bin/env python3
import unittest
def polyfit2d(x, y, f, deg):
    from numpy.polynomial import polynomial
    import numpy as np
    x = np.asarray(x)
    y = np.asarray(y)
    f = np.asarray(f)
    deg = np.asarray(deg)
    vander = polynomial.polyvander2d(x, y, deg)
    vander = vander.reshape((-1,vander.shape[-1]))
    f = f.reshape((vander.shape[0],))
    c = np.linalg.lstsq(vander, f)[0]
    return c.reshape(deg+1)
class MyTest(unittest.TestCase):
    def setUp(self):
        return self
    def test_1(self):
        self._test_fit(
            [-1,2,3],
            [ 4,5,6],
            [[1,2,3],[4,5,6],[7,8,9]],
            [2,2])
    def test_2(self):
        self._test_fit(
            [-1,2],
            [ 4,5],
            [[1,2],[4,5]],
            [1,1])
    def test_3(self):
        self._test_fit(
            [-1,2,3],
            [ 4,5],
            [[1,2],[4,5],[7,8]],
            [2,1])
    def test_4(self):
        self._test_fit(
            [-1,2,3],
            [ 4,5],
            [[1,2],[4,5],[0,0]],
            [2,1])
    def test_5(self):
        self._test_fit(
            [-1,2,3],
            [ 4,5],
            [[1,2],[4,5],[0,0]],
            [1,1])
    def _test_fit(self, x, y, c, deg):
        from numpy.polynomial import polynomial
        import numpy as np
        X = np.array(np.meshgrid(x,y))
        f = polynomial.polyval2d(X[0], X[1], c)
        c1 = polyfit2d(X[0], X[1], f, deg)
        np.testing.assert_allclose(c1,
            np.asarray(c)[:deg[0]+1,:deg[1]+1],
            atol=1e-12)
unittest.main()
The following applies the principle of least squares and imitates Kington's style,
but splits the single order argument into two arguments, m_1 and m_2 (the maximum powers of x and y).
import numpy as np
import matplotlib.pyplot as plt
import itertools
# w = (Phi^T Phi)^{-1} Phi^T t
# where Phi_{k, j + i (m_2 + 1)} = x_k^i y_k^j,
# t_k = z_k,
# i = 0, 1, ..., m_1,
# j = 0, 1, ..., m_2,
# k = 0, 1, ..., n - 1
def polyfit2d(x, y, z, m_1, m_2):
    # Generate Phi by setting Phi as x^i y^j
    nrows = x.size
    ncols = (m_1 + 1) * (m_2 + 1)
    Phi = np.zeros((nrows, ncols))
    ij = itertools.product(range(m_1 + 1), range(m_2 + 1))
    for h, (i, j) in enumerate(ij):
        Phi[:, h] = x ** i * y ** j
    # Generate t by setting t as Z
    t = z
    # Generate w by solving (Phi^T Phi) w = Phi^T t
    w = np.linalg.solve(Phi.T.dot(Phi), (Phi.T.dot(t)))
    return w
# t' = Phi' w
# where Phi'_{k, j + i (m_2 + 1)} = x'_k^i y'_k^j
# t'_k = z'_k,
# i = 0, 1, ..., m_1,
# j = 0, 1, ..., m_2,
# k = 0, 1, ..., n' - 1
def polyval2d(x_, y_, w, m_1, m_2):
    # Generate Phi' by setting Phi' as x'^i y'^j
    nrows = x_.size
    ncols = (m_1 + 1) * (m_2 + 1)
    Phi_ = np.zeros((nrows, ncols))
    ij = itertools.product(range(m_1 + 1), range(m_2 + 1))
    for h, (i, j) in enumerate(ij):
        Phi_[:, h] = x_ ** i * y_ ** j
    # Generate t' by setting t' as Phi' w
    t_ = Phi_.dot(w)
    # Generate z_ by setting z_ as t_
    z_ = t_
    return z_
if __name__ == "__main__":
    # Generate x, y, z
    n = 100
    x = np.random.random(n)
    y = np.random.random(n)
    z = x ** 2 + y ** 2 + 3 * x ** 3 + y + np.random.random(n)
    # Generate w
    w = polyfit2d(x, y, z, m_1=3, m_2=2)
    # Generate x', y', z'
    n_ = 1000
    x_, y_ = np.meshgrid(np.linspace(x.min(), x.max(), n_),
                         np.linspace(y.min(), y.max(), n_))
    z_ = np.zeros((n_, n_))
    for i in range(n_):
        z_[i, :] = polyval2d(x_[i, :], y_[i, :], w, m_1=3, m_2=2)
    # Plot
    # extent is (left, right, bottom, top); row 0 of z_ is y_.min() and is drawn at the top
    plt.imshow(z_, extent=(x_.min(), x_.max(), y_.max(), y_.min()))
    plt.scatter(x, y, c=z)
    plt.show()
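As a side note on the normal-equations approach used in the answer above: forming Phi^T Phi squares the condition number of the problem, so for higher degrees it may be worth solving the same least-squares problem directly from Phi. The sketch below compares the two routes on data like the example's; powers, w_normal and w_lstsq are illustrative names.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(100)
y = rng.random(100)
z = x**2 + y**2 + 3 * x**3 + y + rng.random(100)

m_1, m_2 = 3, 2
powers = list(itertools.product(range(m_1 + 1), range(m_2 + 1)))
# Design matrix: column h holds x**i * y**j for the h-th (i, j) pair.
Phi = np.column_stack([x**i * y**j for i, j in powers])

# Normal equations: solve (Phi^T Phi) w = Phi^T z.
w_normal = np.linalg.solve(Phi.T @ Phi, Phi.T @ z)

# Same least-squares problem solved directly from Phi.
w_lstsq, *_ = np.linalg.lstsq(Phi, z, rcond=None)

print(np.allclose(Phi @ w_normal, Phi @ w_lstsq))
print(np.linalg.cond(Phi), np.linalg.cond(Phi.T @ Phi))
```

At these low degrees both routes should agree closely; the gap in condition numbers indicates how quickly the normal equations lose accuracy as the degrees grow.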
If anyone is looking to fit a polynomial of a specific total order (rather than polynomials where the highest power of each variable is equal to order), you can make this adjustment to the accepted answer's polyfit and polyval:
instead of:
ij = itertools.product(range(order+1), range(order+1))
which, for order=2 gives [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)] (aka up to a 4th degree polynomial), you can use
def xy_powers(order):
    powers = itertools.product(range(order + 1), range(order + 1))
    return [tup for tup in powers if sum(tup) <= order]
This returns [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (2, 0)] for order=2
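One way this adjustment might be wired into a fit is sketched below. The functions polyfit2d_total and polyval2d_total are hypothetical variants, not part of the accepted answer. The list of powers is passed around explicitly because the accepted answer's polyval2d infers the order from the square root of the number of coefficients, which no longer holds for the restricted set.

```python
import itertools
import numpy as np

def xy_powers(order):
    # Keep only exponent pairs whose total degree does not exceed order.
    powers = itertools.product(range(order + 1), range(order + 1))
    return [tup for tup in powers if sum(tup) <= order]

def polyfit2d_total(x, y, z, order=3):
    # Hypothetical variant of polyfit2d using the restricted set of powers.
    powers = xy_powers(order)
    G = np.column_stack([x**i * y**j for i, j in powers])
    m, *_ = np.linalg.lstsq(G, z, rcond=None)
    return m, powers

def polyval2d_total(x, y, m, powers):
    # Evaluate using the same (i, j) ordering that the fit used.
    return sum(a * x**i * y**j for a, (i, j) in zip(m, powers))

rng = np.random.default_rng(0)
x, y = rng.random(50), rng.random(50)
z = 1 + 2 * x + 3 * x * y + 4 * y**2      # exact total-degree-2 polynomial
m, powers = polyfit2d_total(x, y, z, order=2)
print(len(powers))                         # 6 coefficients instead of 9
print(np.allclose(polyval2d_total(x, y, m, powers), z))
```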

numpy: operating on multidimensional arrays

Sorry for title's vagueness. I have two related questions.
First, let's say I have a function "hessian" that given two parameters (x, y) returns a matrix. I now want to compute that matrix for (x,y) running over a two dimensional space. I'd like to do something like:
x = linspace(1, 4, 100).reshape(-1,1)
y = linspace(1, 4, 100).reshape(1,-1)
H = vectorize(hessian)(x, y)
with the resulting H of shape (100,100,2,2). The above doesn't work (ValueError: setting an array element with a sequence). The only thing I came up with is
H = array([ hessian(xx, yy) for xx in x.flat for yy in y.flat ]).reshape(100,100,2,2)
Is there a better, more direct way?
Second, now H has shape (100,100,2,2) and dominant_eigenvector(X) does exactly what you think.
U, V = hsplit(array(map(dominant_eigenvector, H.reshape(10000,2,2))), 2)
I again need to use a list comprehension to do the iteration and repack the result in an array, specifying the shape manually. Is there a more direct way to achieve the same result?
Thanks!
edit: as suggested by Paul and JoshAdel, I implemented a version of hessian that works with arrays, here it is
def hessian(w1, w2):
    w1 = atleast_1d(w1)[...,newaxis,newaxis]
    w2 = atleast_1d(w2)[...,newaxis,newaxis]
    o1, o2 = ix_(*map(xrange, Z.shape))
    W = Z * pow(w1, o1) * pow(w2, o2)
    A = (W).sum()
    B = (W * o1).sum()
    C = (W * o2).sum()
    D = (W * o1 * o1).sum()
    E = (W * o1 * o2).sum()
    F = (W * o2 * o2).sum()
    return array([[ D/A - B/A*B/A, E/A - B/A*C/A ],
                  [ E/A - B/A*C/A, F/A - C/A*C/A ]])
Z can be considered a global array of roughly 250x150.
o1 and o2 index the two dimensions of Z to compute
things like $\sum_{i,j} Z_{ij} * i * j$.
The problem with this version is that intermediate arrays
are just too big. If w1 and w2 are arrays like w1_k w2_l
W becomes W_{k,l,i,j} on which numpy gives ValueError: too big.
You could try to use meshgrid, maybe you have to flatten xn, yn:
x = linspace(1, 4, 100)
y = linspace(1, 4, 100)
xn,yn=meshgrid(x,y)
H = vectorize(hessian)(xn, yn)
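Another possibility, sketched here under assumptions, is the signature argument of np.vectorize, which lets the wrapped function return a matrix per (x, y) pair. The hessian below is a made-up stand-in for the question's function, and "dominant" is taken to mean the eigenvector of the largest eigenvalue of a symmetric matrix.

```python
import numpy as np

def hessian(x, y):
    # Stand-in for the question's function: scalar (x, y) -> 2x2 symmetric matrix.
    return np.array([[2.0 * x, x * y],
                     [x * y, 3.0 * y]])

x = np.linspace(1, 4, 5).reshape(-1, 1)
y = np.linspace(1, 4, 7).reshape(1, -1)

# The signature says each call consumes two scalars and returns a matrix.
vhessian = np.vectorize(hessian, signature="(),()->(m,n)")
H = vhessian(x, y)
print(H.shape)                      # (5, 7, 2, 2)

# eigh works on stacks of symmetric matrices; eigenvalues come back in ascending order.
vals, vecs = np.linalg.eigh(H)
dominant = vecs[..., :, -1]         # eigenvector of the largest eigenvalue
U, V = dominant[..., 0], dominant[..., 1]
print(U.shape, V.shape)
```

np.vectorize still loops in Python, so this tidies the bookkeeping rather than speeding things up. The eigh call, in contrast, processes the whole stack at once and removes the need for the reshape and map step.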
