I was wondering how I could fix the error in the following code:
import numpy as np
import matplotlib.pyplot as plt
from sympy.functions.special.polynomials import assoc_legendre
from scipy.misc import factorial, derivative
import sympy as sym
def main():
    t = 36000
    a = 637000000
    H = 200
    g = 9.81
    x = sym.symbols('x')
    for l in range(1, 6):
        ω = np.sqrt(g*H*l*(l+1))/a
        for n in range(l+1):
            nθ, nφ = 128, 256
            θ, φ = np.linspace(0, np.pi, nθ), np.linspace(0, 2*np.pi, nφ)
            legfun_sym = sym.functions.special.polynomials.assoc_legendre(l, n, x)
            legfun_num = sym.lambdify(x, legfun_sym)
            X, Y = np.meshgrid(θ, φ)
            uθ = (g/(a*ω))*Der_Assoc_Legendre(legfun_num, l, n, X)*np.sin(n*Y-ω*t)
            uφ = (g/(a*ω*np.sin(X)))*Assoc_Legendre(l, n, X)*np.cos(n*Y-ω*t)
            #speed = np.sqrt(uθ**2 + uφ**2)
            fig0, ax = plt.subplots()
            strm = ax.streamplot(φ, θ, uφ, uθ, linewidth=2, cmap=plt.cm.autumn)
            fig0.colorbar(strm.lines)
            plt.show()

def Assoc_Legendre(m, n, X):
    L = []
    for i in X:
        k = []
        for j in i:
            k.append(assoc_legendre(m, n, np.cos(j)))
        L.append(k)
    return np.array(L)

def Der_Assoc_Legendre(legfun_num, m, n, X):
    L = []
    for i in X:
        k = []
        for j in i:
            k.append(derivative(legfun_num, j, dx=1e-7))
        L.append(k)
    return np.array(L)

if __name__ == '__main__':
    main()
The error message 'u' and 'v' must be of shape 'Grid(x,y)' comes up with regard to the strm = ax.streamplot(φ, θ, uφ, uθ, linewidth=2, cmap=plt.cm.autumn) line. How should I fix this?
For reference, I am trying to do a streamplot of $u_{\theta}$ and $u_{\phi}$, where $u_{\theta}=\frac{g}{\omega a}\frac{d}{d\theta}\left(P^n_l(\cos\theta)\right)\sin(n\phi-\omega t)$ and $u_{\phi}=\frac{gn}{\omega a\sin\theta}P^n_l(\cos\theta)\cos(n\phi-\omega t)$.
EDIT:
This is the current code I have:
import numpy as np
import matplotlib.pyplot as plt
from sympy.functions.special.polynomials import assoc_legendre
from scipy.misc import factorial, derivative
import sympy as sym
def main():
    t = 36000
    a = 637000000
    H = 200
    g = 9.81
    x = sym.symbols('x')
    X, Y = np.mgrid[0.01:np.pi-0.01:100j, 0:2*np.pi:100j]
    for l in range(1, 6):
        ω = np.sqrt(g*H*l*(l+1))/a
        for n in range(l+1):
            #nθ, nφ = 128, 256
            #θ, φ = np.linspace(0.001, np.pi-0.001, nθ), np.linspace(0, 2*np.pi, nφ)
            legfun_sym = sym.functions.special.polynomials.assoc_legendre(l, n, x)
            legfun_num = sym.lambdify(x, legfun_sym)
            uθ = (g/(a*ω*np.sin(X)))*Der_Assoc_Legendre(legfun_num, l, n, X)*np.sin(n*Y-ω*t)
            uφ = (g/(a*ω))*Assoc_Legendre(l, n, X)*np.cos(n*Y-ω*t)
            #speed = np.sqrt(uθ**2 + uφ**2)
            fig0, ax = plt.subplots()
            strm = ax.streamplot(Y, X, uθ, uφ, linewidth=0.5, cmap=plt.cm.autumn)
            #fig0.colorbar(strm.lines)
            plt.show()
            print("next")

def Assoc_Legendre(m, n, X):
    L = []
    for i in X:
        k = []
        for j in i:
            k.append(assoc_legendre(m, n, np.cos(j)))
        L.append(k)
    return np.float64(np.array(L))

def Der_Assoc_Legendre(legfun_num, m, n, X):
    L = []
    for i in X:
        k = []
        for j in i:
            k.append(derivative(legfun_num, j, dx=0.001))
        L.append(k)
    return np.float64(np.array(L))

if __name__ == '__main__':
    main()
The current issue seems to be with the derivative function in Der_Assoc_Legendre, which raises ValueError: math domain error after the first plot is drawn, while moving on to the second.
While Python 3 allows you to use Greek characters in variable names, I can assure you that most programmers will find code riddled with φ and θ unreadable, and it will be a nightmare for others to maintain and develop.
Secondly, your code quickly throws a RuntimeWarning concerning a division by zero, which you should most certainly trace down and fix safely.
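One way to do that (my suggestion, not code from the question; it mirrors the question's variable names) is to keep θ strictly away from the poles, where sin(θ) vanishes:
eps = 1e-3
θ = np.linspace(eps, np.pi - eps, nθ)  # keeps 1/sin(X) finite after meshgrid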
As for your question, the problem is two-fold. The first problem is that the dimensions of your inputs don't match in the call to streamplot:
>>> print(φ.shape, θ.shape, uφ.shape, uθ.shape)
(256,) (128,) (256, 128) (256, 128)
The trick is that a lot of matplotlib plotting functions expect their 2d array dimensions transposed, which is closely related to the somewhat counterintuitive convention of numpy.meshgrid:
>>> i,j = np.meshgrid(range(3),range(4))
>>> print(i.shape)
(4, 3)
Probably due to this reason, the definition of streamplot is as follows:
Axes.streamplot(ax, *args, **kwargs)
Draws streamlines of a vector flow.
x, y : 1d arrays
an evenly spaced grid.
u, v : 2d arrays
x and y-velocities. Number of rows should match length of y, and the number of columns should match x.
Note the last bit about dimensions. All you need to do is swap x/y, or transpose the angles; you need to check which one of these will lead to a more meaningful plot in your application.
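For example, keeping the 1d angle arrays as they are and transposing the 2d velocity fields would look like this (a sketch; whether this or swapping x/y gives the physically meaningful picture is for you to check):
strm = ax.streamplot(φ, θ, uφ.T, uθ.T, linewidth=2, cmap=plt.cm.autumn)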
Now, if you fix this, the following happens:
TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
Now this is fishy. All input types should be numeric...right? Well, yeah, but they aren't:
>>> print(φ.dtype, θ.dtype, uφ.dtype, uθ.dtype)
float64 float64 object float64
What's that third object-typed array about?
>>> print(uφ[0,0],uθ[0,0])
+inf -0.00055441014491
>>> print(type(uφ[0,0]),type(uθ[0,0]))
<class 'sympy.core.numbers.Float'> <class 'numpy.float64'>
As #Jelmes noted in a comment, the above type of uφ is the direct consequence of its construction using sympy. If one converts these sympy Floats to python or numpy floats before constructing the resulting array, the dtype issue should go away. Whether the remaining infinities (the consequence of the division by zero in 1/sin(X)) will be handled gracefully by streamplot is another question.
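A minimal sketch of that conversion, using the Assoc_Legendre helper from the question (the float(...) call is the only change):
def Assoc_Legendre(m, n, X):
    L = []
    for i in X:
        k = []
        for j in i:
            k.append(float(assoc_legendre(m, n, np.cos(j))))  # sympy Float -> python float
        L.append(k)
    return np.array(L)  # dtype is now float64 instead of object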
I use the following Python code to illustrate the generation of random variables to students:
import numpy as np
import scipy.stats as stats
def lcg(n, x0, M=2**32, a=1103515245, c=12345):
    result = np.zeros(n)
    for i in range(n):
        result[i] = (a*x0 + c) % M
        x0 = result[i]
    return np.array([x/M for x in result])
x = lcg(10**6, 3)
print(stats.kstest(x, 'uniform'))
The default parameters are the ones used by glibc, according to Wikipedia. The last line of the code prints
KstestResult(statistic=0.043427751892089805, pvalue=0.0)
The pvalue of 0.0 indicates that the observation would basically never occur if the elements of x were truly distributed according to a uniform distribution.
My question is: is there a bug in my code, or does the LCG with the parameters given not pass the Kolmogorov-Smirnov test with 10**6 replicas?
There is a problem with your code: the distribution it produces is not actually uniform.
I've changed your LCG implementation a bit, and all is good now (Python 3.7, Anaconda, Win10 x64)
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt
def lcg(n, x0, M=2**32, a=1103515245, c=12345):
    result = np.zeros(n)
    for i in range(n):
        x0 = (a*x0 + c) % M
        result[i] = x0
    return np.array([x/float(M) for x in result])

#x = np.random.uniform(0.0, 1.0, 1000000)
x = lcg(1000000, 3)
print(stats.kstest(x, 'uniform'))

count, bins, ignored = plt.hist(x, 15, density=True)
plt.plot(bins, np.ones_like(bins), linewidth=2, color='r')
plt.show()
which prints
KstestResult(statistic=0.0007238884545415214, pvalue=0.6711878724246786)
and plots a flat histogram, as expected for uniform samples.
UPDATE
As #pjs pointed out, you'd better divide by float(M) right in the loop; there is no need for a second pass over the whole array:
def lcg(n, x0, M=2**32, a=1103515245, c=12345):
    result = np.empty(n)
    for i in range(n):
        x0 = (a*x0 + c) % M
        result[i] = x0 / float(M)
    return result
To complement Severin's answer, the reason my code was not working properly is that result was an array of floating point numbers.
We can see the difference between the two implementations already at the second iteration.
After the first iteration, x0 = 3310558080.
In [8]: a, c, M = 1103515245, 12345, 2**32
In [9]: x0 = 3310558080
In [10]: float_x0 = float(x0)
In [11]: (a*x0 + c) % M
Out[11]: 465823161
In [12]: (a*float_x0 + c) % M
Out[12]: 465823232.0
In [13]: a*x0
Out[13]: 3653251310737929600
In [14]: a*float_x0
Out[14]: 3.6532513107379297e+18
So the problem had to do with the use of floating point numbers: a*x0 is of order 3.7e18, far beyond 2**53 ≈ 9.0e15, the point past which a double can no longer represent every integer exactly, so the low-order bits, exactly the ones the modulus keeps, get lost.
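You can check the 53-bit mantissa limit directly in an interactive session:
>>> 2.0**53 == 2.0**53 + 1  # doubles cannot distinguish consecutive integers above 2**53
True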
Usually I use scipy.optimize.curve_fit to fit custom functions to data. The data in this case was always a 1-dimensional array. Is there a similar function for a two-dimensional array?
So, for example, I have a 10x10 numpy array. Then I have a function that does some stuff and creates a 10x10 numpy array, and I want to fit the function, so that the resulting 10x10 array has the best fit to the input array.
Maybe an example is better :)
data = pyfits.getdata('data.fits') #fits is an image format, this gives me a NxM numpy array
mod1 = pyfits.getdata('mod1.fits')
mod2 = pyfits.getdata('mod2.fits')
mod3 = pyfits.getdata('mod3.fits')
mod1_1D = numpy.ravel(mod1)
mod2_1D = numpy.ravel(mod2)
mod3_1D = numpy.ravel(mod3)
def dostuff(a, b): #originally this is a function for 2D arrays
    newdata = (mod1_1D*12) + (mod2_1D)**a - mod3_1D/b
    return newdata
Now a and b should be fitted, so that newdata is as close as possible to data.
What I got so far:
data1D = numpy.ravel(data)
data_X = numpy.arange(data1D.size)
fit = curve_fit(dostuff,data_X,data1D)
But print fit only gives me
(array([ 1.]), inf)
I do have some nans in the arrays, maybe that's a problem?
The goal is to express the 2D function as a 1D function: g(x, y, ...) --> f(xy, ...)
Converting the coordinate pair (x, y) into a single number xy may seem tricky at first. But it's actually quite simple: just enumerate all data points and you have a single number that uniquely defines each coordinate pair. The fitted function simply has to reconstruct the original coordinates, do its calculations and return the result.
Example that fits a 2D linear gradient in a 20x10 image:
import scipy as sp
import scipy.optimize  # importing scipy alone does not load the optimize submodule
import numpy as np
import matplotlib.pyplot as plt

n, m = 10, 20

# noisy example data
x = np.arange(m).reshape(1, m)
y = np.arange(n).reshape(n, 1)
z = x + y * 2 + np.random.randn(n, m) * 3

def f(xy, a, b):
    i = xy // m  # reconstruct y coordinates
    j = xy % m   # reconstruct x coordinates
    out = i * a + j * b
    return out

xy = np.arange(z.size)  # 0 is the top left pixel and 199 is the bottom right pixel
res = sp.optimize.curve_fit(f, xy, np.ravel(z))
z_est = f(xy, *res[0])
z_est2d = z_est.reshape(n, m)
plt.subplot(2, 1, 1)
plt.plot(np.ravel(z), label='original')
plt.plot(z_est, label='fitted')
plt.legend()
plt.subplot(2, 2, 3)
plt.imshow(z)
plt.xlabel('original')
plt.subplot(2, 2, 4)
plt.imshow(z_est2d)
plt.xlabel('fitted')
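If everything worked, the fitted parameters should land close to the true coefficients of the synthetic gradient (a ≈ 2 for y, b ≈ 1 for x); you'll also want a plt.show() at the end when running this as a script:
print(res[0])  # expect values near [2., 1.] for the data generated above
plt.show()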
I would recommend using symfit for this, I wrote that to take care of all of the magic for you automatically.
In symfit you would just write the equation pretty much as you would on paper, and then you can run the fit.
I would do something like this:
from symfit import parameters, variables, Fit
# Assuming all this data is in the form of NxM arrays
data = pyfits.getdata('data.fits')
mod1 = pyfits.getdata('mod1.fits')
mod2 = pyfits.getdata('mod2.fits')
mod3 = pyfits.getdata('mod3.fits')
a, b = parameters('a, b')
x, y, z, u = variables('x, y, z, u')
model = {u: (x * 12) + y**a - z / b}
fit = Fit(model, x=mod1, y=mod2, z=mod3, u=data)
fit_result = fit.execute()
print(fit_result)
Unfortunately I have not yet included examples of this kind in the docs, but if you just look at the docs I think you can figure it out in case this doesn't work out of the box.
I'm trying to solve a nonlinear equation and plot the result with matplotlib, but there is an error saying:
TypeError: zip argument #1 must support iteration
Can you help me fix it?
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import fsolve
r = np.arange(-100, 100, step=0.01, dtype=float)
def equation(p, r0):
    x = p
    r = r0
    return (r * x + np.power(x, 3) - np.power(x, 5))

temp = []
for i in r:
    x = fsolve(equation, 0, args=(i,))
    temp.extend((i, x))
my_array = np.array(temp)
#print(my_array)
x, y = zip(*my_array)
plt.plot(x,y)
As #Julien said, you must use append instead of extend. Furthermore, I guess you can't see the result because there is no plt.show() in your snippet. You need to add that right after plt.plot(x,y); then the plot will show up.
You'd better change your initial guess to something other than 0, because x = 0 solves the equation for every r, so fsolve just returns the trivial root everywhere. A nonzero guess such as 2 picks out a nontrivial branch.
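Putting both fixes together, here's a sketch of the corrected loop (the [0] unpacks the length-1 array that fsolve returns):
temp = []
for i in r:
    x = fsolve(equation, 2, args=(i,))[0]  # nonzero initial guess avoids the trivial root
    temp.append((i, x))                    # append the pair; extend would flatten it

x, y = zip(*temp)
plt.plot(x, y)
plt.show()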
I have been trying to get this to work for a while now, but still haven't found a way. I am trying to compute the look-ahead estimate density of a piecewise Gaussian function, that is, to estimate the stationary distribution of a piecewise normally distributed process. Is there a way to avoid this error:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
for instance with y = np.linspace(-200.0, 200.0, 100) and x = np.linspace(-200.0, 200.0, 100), while still verifying the conditions as stated in the code below?
import numpy as np
import sympy as sp
from numpy import exp,sqrt,pi
from sympy import Integral, log, exp, sqrt, pi
import math
import matplotlib.pyplot as plt
import scipy.integrate
from scipy.special import erf
from scipy.stats import norm, gaussian_kde
from quantecon import LAE
from sympy.abc import q
#from sympy import symbols
#var('q')
#q= symbols('q')
## == Define parameters == #
mu=80
sigma=20
b=0.2
Q=80
Q1=Q*(1-b)
Q2=Q*(1+b)
d = (sigma*np.sqrt(2*np.pi))
phi = norm()
n = 500
#Phi(z) = 1/2[1 + erf(z/sqrt(2))].
def p(x, y):
    # x, y = np.array(x, dtype=float), np.array(y, dtype=float)
    Positive_RG = norm.pdf(x-y+Q1, mu, sigma)
    print('Positive_R = ', Positive_RG)
    Negative_RG = norm.pdf(x-y+Q2, mu, sigma)
    print('Negative_RG = ', Negative_RG)
    pdf_0 = (1/(2*math.sqrt(2*math.pi)))*(erf((x+Q2-mu)/(sigma*np.sqrt(2)))-erf((x+Q1-mu)/(sigma*np.sqrt(2))))
    Zero_RG = norm.pdf
    print('Zero_RG', Zero_RG)
    print('y', y)
    if y > 0.0 and x-y >= -Q1:
        #print('printA', Positive_RG)
        return Positive_RG
    elif y < 0.0 and x-y >= -Q2:
        #print('printC', Negative_RG)
        return Negative_RG
    elif y == 0.0 and x >= -Q1:
        #print('printB', Zero_RG)
        return Zero_RG
    return 0.0
Z = phi.rvs(n)
X = np.empty(n)
for t in range(n-1):
    X[t+1] = X[t] + Z[t]
    #X[t+1] = np.abs(X[t]) + Z[t]
psi_est = LAE(p, X)
k_est = gaussian_kde(X)
fig, ax = plt.subplots(figsize=(10,7))
ys = np.linspace(-200.0, 200.0, 200)
ax.plot(ys, psi_est(ys), 'g-', lw=2, alpha=0.6, label='look ahead estimate')
ax.plot(ys, k_est(ys), 'k-', lw=2, alpha=0.6, label='kernel based estimate')
ax.legend(loc='upper left')
plt.show()
See all those ValueError questions in the sidebar?
This error is produced when a boolean array is used in a scalar boolean context, such as if or or/and.
Try your y or x in this test, or an even simpler one. Experiment in an interactive shell:
if y>0.0 and x -y>=-Q1: ....
if y>0:
(y>0.0) and (x-y>=10)
will all produce this error with your x and y.
Notice also that I edited your question for clarity.
The error starts with quantecon.LAE(p, X), which expects a vectorized function p. Your function isn't vectorized, which is why everything else doesn't work. You copied some vectorized code, but left a lot of things as sympy-style functions, which is why the numpy folks were confused about what you wanted.
In this case, "vectorized" means transforming two 1D arrays of length n into a 2D n x n array. You don't want to return 0.0; you want to return a 2D ndarray that has the value 0.0 at locations out[i,j] where a boolean mask based on a function of x[i], y[j] is false.
You can do this by broadcasting:
def sum_function(x, y):
    return x[:, None] + y[None, :]  # or however you want to combine them, broadcast to 2D

def myFilter(x, y):
    x, y = x.squeeze(), y.squeeze()
    out = np.zeros((x.size, y.size))
    xyDiff = x[:, None] - y[None, :]
    out = np.where(np.bitwise_and(y[None, :] >= 0.0, xyDiff >= -Q1), sum_function(x, y), out)  # unless the sum functions differ
    out = np.where(np.bitwise_and(y[None, :] < 0.0, xyDiff >= -Q2), sum_function(x, y), out)
    return out
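A quick sanity check with arrays like those in the question (a sketch; it assumes Q1 and Q2 are defined as above):
x = np.linspace(-200.0, 200.0, 100)
y = np.linspace(-200.0, 200.0, 100)
out = myFilter(x, y)
print(out.shape)  # (100, 100); entries stay 0.0 where neither mask matched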
I have two 2D arrays, x(ni, nj) and y(ni, nj), that I need to interpolate over one axis: I want to interpolate along the last axis for every ni.
I wrote
import numpy as np
from scipy.interpolate import interp1d
z = np.asarray([200,300,400,500,600])
out = []
for i in range(ni):
    f = interp1d(x[i,:], y[i,:], kind='linear')
    out.append(f(z))
out = np.asarray(out)
However, I think this method is inefficient and slow, because of the loop, when the array size is large. What is the fastest way to interpolate a multi-dimensional array like this? Is there any way to perform linear and cubic interpolation without a loop? Thanks.
The method you propose does have a python loop, so for large values of ni it is going to get slow. That said, unless you are going to have large ni you shouldn't worry much.
I have created sample input data with the following code:
def sample_data(n_i, n_j, z_shape):
    x = np.random.rand(n_i, n_j) * 1000
    x.sort()
    x[:, 0] = 0
    x[:, -1] = 1000
    y = np.random.rand(n_i, n_j)
    z = np.random.rand(*z_shape) * 1000
    return x, y, z
And have tested them with these two versions of linear interpolation:
def interp_1(x, y, z):
    rows, cols = x.shape
    out = np.empty((rows,) + z.shape, dtype=y.dtype)
    for j in range(rows):
        out[j] = interp1d(x[j], y[j], kind='linear', copy=False)(z)
    return out

def interp_2(x, y, z):
    rows, cols = x.shape
    row_idx = np.arange(rows).reshape((rows,) + (1,) * z.ndim)
    col_idx = np.argmax(x.reshape(x.shape + (1,) * z.ndim) > z, axis=1) - 1
    ret = y[row_idx, col_idx + 1] - y[row_idx, col_idx]
    ret /= x[row_idx, col_idx + 1] - x[row_idx, col_idx]
    ret *= z - x[row_idx, col_idx]
    ret += y[row_idx, col_idx]
    return ret
interp_1 is an optimized version of your code, following Dave's answer. interp_2 is a vectorized implementation of linear interpolation that avoids any python loop whatsoever. Coding something like this requires a sound understanding of broadcasting and indexing in numpy, and some things are going to be less optimized than what interp1d does. A prime example is finding the bin in which to interpolate a value: interp1d will surely break out of its loop early once it finds the bin, while the above function compares the value to all bins.
So the result is going to be very dependent on what n_i and n_j are, and even how long your array z of values to interpolate is. If n_j is small and n_i is large, you should expect an advantage from interp_2, and from interp_1 if it is the other way around. Smaller z should be an advantage to interp_2, longer ones to interp_1.
I have actually timed both approaches with a variety of n_i and n_j, for z of shape (5,) and (50,); here are the graphs:
So it seems that for z of shape (5,) you should go with interp_2 whenever n_j < 1000, and with interp_1 elsewhere. Not surprisingly, the threshold is different for z of shape (50,), now being around n_j < 100. It seems tempting to conclude that you should stick with your code if n_j * len(z) > 5000, but change it to something like interp_2 above if not, but there is a great deal of extrapolating in that statement! If you want to further experiment yourself, here's the code I used to produce the graphs.
import timeit
import numpy as np
import matplotlib.pyplot as plt

n_s = np.logspace(1, 3.3, 25)
int_1 = np.empty((len(n_s),) * 2)
int_2 = np.empty((len(n_s),) * 2)
z_shape = (5,)

for i, n_i in enumerate(n_s):
    print(int(n_i))
    for j, n_j in enumerate(n_s):
        x, y, z = sample_data(int(n_i), int(n_j), z_shape)
        int_1[i, j] = min(timeit.repeat('interp_1(x, y, z)',
                                        'from __main__ import interp_1, x, y, z',
                                        repeat=10, number=1))
        int_2[i, j] = min(timeit.repeat('interp_2(x, y, z)',
                                        'from __main__ import interp_2, x, y, z',
                                        repeat=10, number=1))

cs = plt.contour(n_s, n_s, np.transpose(int_1 - int_2))
plt.clabel(cs, inline=1, fontsize=10)
plt.xlabel('n_i')
plt.ylabel('n_j')
plt.title('timeit(interp_2) - timeit(interp_1), z.shape=' + str(z_shape))
plt.show()
One optimization is to allocate the result array once like so:
import numpy as np
from scipy.interpolate import interp1d
z = np.asarray([200,300,400,500,600])
out = np.zeros( [ni, len(z)], dtype=np.float32 )
for i in range(ni):
    f = interp1d(x[i,:], y[i,:], kind='linear')
    out[i,:] = f(z)
This will save you some memory copying that occurs in your implementation, where the list built by out.append(...) must be copied into a fresh array by np.asarray(out).