I'm trying to create a function, but it involves two variables of different lengths. My setup is as follows:
import pandas as pd
import numpy as np
u = np.random.normal(0,1,50)
t = 25
x = t*u/(1-u)
x = np.sort(x, axis=0)
theta = list(range(1, 1001, 1))
theta = np.divide(theta, 10) # theta is now 1000 numbers, going from 0.1 to 100
fx = np.zeros(1000)*np.nan
fx = np.reshape(fx, (1000,1))
I want my function to be the following:
def function(theta):
    fx = 50/theta - 2 * np.sum(1/(theta + x))
    return fx
but it won't work because theta is length 1000 and x is length 50. I want it to work iteratively for each theta, and for the part at the end:
np.sum(1/(theta + x))
I want it to add the single theta to each of the fifty numbers in x. If I were to do this once, it would look like:
fx[0] = 50/theta[0] - 2 * np.sum(1/(theta[0] + x))
I can get this to work with a "for" loop, but I eventually need to input this into a maximum likelihood function so using that won't work. Any thoughts?
The critical piece for 'vectorizing' your function in 2D rather than just 1D is meshgrid. See below, and print xv and tv to understand how it works.
import numpy as np
u = np.random.normal(0,1,50)
t = 25
x = t*u/(1-u)
x = np.sort(x, axis=0)
theta = np.array( range(1, 1001, 1))
theta = theta/10.0 # theta is now 1000 numbers, going from 0.1 to 100
def function(x,theta):
    fx = 50/theta - 2 * np.sum(1/(theta + x))
    return fx
xv, tv = np.meshgrid(x,theta)
print(function(xv,tv))
output:
[[-6582.19087928 -6582.19087928 -6582.19087928 ..., -6582.19087928
-6582.19087928 -6582.19087928]
[-6832.19087928 -6832.19087928 -6832.19087928 ..., -6832.19087928
-6832.19087928 -6832.19087928]
[-6915.52421261 -6915.52421261 -6915.52421261 ..., -6915.52421261
-6915.52421261 -6915.52421261]
...,
[-7081.68987727 -7081.68987727 -7081.68987727 ..., -7081.68987727
-7081.68987727 -7081.68987727]
[-7081.69037878 -7081.69037878 -7081.69037878 ..., -7081.69037878
-7081.69037878 -7081.69037878]
[-7081.69087928 -7081.69087928 -7081.69087928 ..., -7081.69087928
-7081.69087928 -7081.69087928]]
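Note that np.sum with no axis argument reduces over the whole grid, which is why every row in the output above repeats a single value. If the goal is one result per theta, as in the fx[0] expression from the question, a minimal sketch using broadcasting and a sum over the x axis only might look like this. The function name f_all and the fixed random seed are assumptions made for illustration, not part of the original code.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only so the sketch is reproducible
u = rng.normal(0, 1, 50)
t = 25
x = np.sort(t * u / (1 - u))
theta = np.arange(1, 1001) / 10.0  # 0.1, 0.2, ..., 100.0

def f_all(theta, x):
    # theta[:, None] has shape (1000, 1); adding x (shape (50,)) broadcasts to (1000, 50)
    inner = 1.0 / (theta[:, None] + x)
    # sum over the x axis only, leaving one value per theta
    return 50 / theta - 2 * inner.sum(axis=1)

fx = f_all(theta, x)
print(fx.shape)  # (1000,)
```

Each entry fx[j] should then agree with the single-theta expression 50/theta[j] - 2 * np.sum(1/(theta[j] + x)).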
You might be interested in Numba.
The @vectorize decorator allows you to define a function on a scalar and use it on an array.
from numba import vectorize
import pandas as pd
import numpy as np
u = np.random.normal(0,1,50)
t = 25
x = t*u/(1-u)
x = np.sort(x, axis=0)
theta = list(range(1, 1001, 1))
theta = np.divide(theta, 10) # theta is now 1000 numbers, going from 0.1 to 100
@vectorize
def myFunction(theta):
    fx = 50/theta - 2 * np.sum(1/(theta + x))
    return fx
myFunction(theta)
If you want to check that the function can be trusted, you can run the following code.
theta = 1
print(50/theta - 2 * np.sum(1/(theta + x)))
theta = 2
print(50/theta - 2 * np.sum(1/(theta + x)))
print(myFunction(np.array([1,2])))
Output :
21.1464816231
32.8089699838
[ 21.14648162 32.80896998]
By the way, I think Numba is well optimized, so it could be useful for your statistical calculations (the @jit decorator seems very powerful).
I'm trying to port this MATLAB code to Python:
fs = 128;
x = (0:1:999)/fs;
y_orig = sin(2*pi*15*x);
y_noised = y_orig + 0.5*randn(1,length(x));
[yseg] = mapstd(y_noised);
I wrote this code (which runs, so there are no problems with missing variables or anything like that):
Norm_Y = 0
Y_Normalized = []
for i in range(0, len(YSeg), 1):
    Norm_Y = Norm_Y + (pow(YSeg[i],2))
Norm_Y = sqrt(Norm_Y)
for i in range(0, len(YSeg), 1):
    Y_Normalized.append(YSeg[i] / Norm_Y)
    print("%3d %f" %(i, Y_Normalized[i]))
YSeg is Y_Noised (I wrote it in another section of the code).
Now, I don't expect the values to be the same in the MATLAB code and mine, because YSeg (Y_Noised) is generated from random values, so it's fine that they differ, but they differ far too much.
These are the first 10 values in Matlab:
0.145728655284548
1.41918657039301
1.72322238170491
0.684826842884694
0.125379108969931
-0.188899711186140
-1.03820858801652
-0.402591786430960
-0.844782236884026
0.626897216311757
While these are the first 10 numbers in my python code:
0.052015
0.051132
0.041209
0.034144
0.034450
0.003812
0.048629
0.016854
0.024484
0.021435
It's like mine are 100 times lower, so I feel like I've missed a step during normalization. Can you help?
You can normalize a vector quite easily in python with numpy:
import numpy as np
def normalize_vector(input_vector):
    return input_vector / np.sqrt(np.sum(input_vector**2))
random_vec = np.random.rand(10)
vec_norm = normalize_vector(random_vec)
print(vec_norm)
You can call the provided function with your input vector (YSeg) and check the output. I would expect a similar output as in matlab.
This is an implementation in numpy:
import numpy as np
fs = 128
x = np.arange(1000) / fs
y_orig = np.sin(2 * np.pi * 15 * x)
y_noised = y_orig + 0.5 * np.random.randn(len(x))
yseg = (y_noised - y_noised.mean()) / y_noised.std()
However, why do you consider the values to be "too much different"? After all, the values of y_orig are in range [-1, 1] and you are randomly distorting them by ~0.4 on average.
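To make the scale difference concrete: dividing by the Euclidean norm gives entries roughly sqrt(N) times smaller than a zero-mean, unit-variance version of the same data, provided the mean is close to zero. The sketch below assumes, as the answer above does, that mapstd amounts to subtracting the mean and dividing by the standard deviation. The names unit_norm, zscore and ratio, and the fixed seed, are chosen here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only for reproducibility
fs = 128
x = np.arange(1000) / fs
y_noised = np.sin(2 * np.pi * 15 * x) + 0.5 * rng.standard_normal(len(x))

# Approach from the question: scale the vector to unit Euclidean length
unit_norm = y_noised / np.linalg.norm(y_noised)

# Approach assumed for mapstd: zero mean, unit standard deviation
zscore = (y_noised - y_noised.mean()) / y_noised.std()

# Typical magnitude of the two results; expected to be on the order of sqrt(1000)
ratio = np.abs(zscore).mean() / np.abs(unit_norm).mean()
print(np.linalg.norm(unit_norm), zscore.std(), ratio)
```

With 1000 samples the ratio should come out near 30, which is the kind of gap between the two lists of values in the question.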
Is it possible to solve a cubic equation without using sympy?
Example:
import sympy as sp
xp = 30
num = xp + 4.44
sp.var('x, a, b, c, d')
Sol3 = sp.solve(0.0509 * x ** 3 + 0.0192 * x ** 2 + 3.68 * x - num, x)
The result is:
[6.07118098358257, -3.2241955998463 - 10.0524891203436*I, -3.2241955998463 + 10.0524891203436*I]
But I want to find a way to do it with numpy, or without any third-party library at all.
I tried with numpy:
import numpy as np
coeff = [0.0509, 0.0192, 3.68, --4.44]
print(np.roots(coeff))
But the result is :
[ 0.40668245+8.54994773j 0.40668245-8.54994773j -1.19057511+0.j]
In your numpy method you are making two slight mistakes with the final coefficient.
In the SymPy example your last coefficient is - num, which is, according to your code: -num = -(xp + 4.44) = -(30 + 4.44) = -34.44
In your NumPy example your last coefficient is --4.44, which is 4.44 and does not equal -34.44.
If you edit the NumPy code you will get:
import numpy as np
coeff = [0.0509, 0.0192, 3.68, -34.44]
print(np.roots(coeff))
[-3.2241956 +10.05248912j -3.2241956 -10.05248912j
6.07118098 +0.j ]
The answers are thus the same (note that NumPy uses j to indicate the imaginary unit, while SymPy uses I).
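One way to double-check roots like these is to substitute them back into the polynomial. A small sketch using np.polyval, with the variable names residuals and real_roots chosen here for illustration:

```python
import numpy as np

coeff = [0.0509, 0.0192, 3.68, -34.44]
roots = np.roots(coeff)

# Evaluate the polynomial at each root; the values should be close to zero
residuals = np.polyval(coeff, roots)
print(np.abs(residuals))

# Keep only the roots whose imaginary part is numerically zero
real_roots = roots[np.isclose(roots.imag, 0)].real
print(real_roots)
```

For this cubic, real_roots should contain the single real solution near 6.07 reported by both libraries.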
You could implement the cubic formula.
This YouTube video from Mathologer could help you understand it.
Based on that, the cubic function for ax^3 + bx^2 + cx + d = 0 can be written like this:
def cubic(a,b,c,d):
    n = -b**3/27/a**3 + b*c/6/a**2 - d/2/a
    s = (n**2 + (c/3/a - b**2/9/a**2)**3)**0.5
    r0 = (n-s)**(1/3)+(n+s)**(1/3) - b/3/a
    r1 = (n+s)**(1/3)+(n+s)**(1/3) - b/3/a
    r2 = (n-s)**(1/3)+(n-s)**(1/3) - b/3/a
    return (r0,r1,r2)
The simplified version of the formula only needs to get c and d as parameters (aka p and q) and can be implemented like this:
def cubic(p,q):
    n = -q/2
    s = (q*q/4+p**3/27)**0.5
    r0 = (n-s)**(1/3)+(n+s)**(1/3)
    r1 = (n+s)**(1/3)+(n+s)**(1/3)
    r2 = (n-s)**(1/3)+(n-s)**(1/3)
    return (r0,r1,r2)
print(cubic(-15,-126))
(5.999999999999999, 9.999999999999998, 2.0)
I'll let you mix in complex number operations to properly get all 3 roots
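As one possible way to mix in the complex arithmetic, here is a sketch of Cardano's method using only the standard library. It is not the code from the video: the function name cubic_roots, the reduction to a depressed cubic, and the use of the cube roots of unity to enumerate all three roots are choices made here for illustration.

```python
import cmath
import math

def cubic_roots(a, b, c, d):
    """Return the three (possibly complex) roots of a*x^3 + b*x^2 + c*x + d = 0."""
    # Substituting x = t - b/(3a) gives the depressed cubic t^3 + p*t + q = 0
    p = (3 * a * c - b * b) / (3 * a * a)
    q = (2 * b**3 - 9 * a * b * c + 27 * a * a * d) / (27 * a**3)
    shift = -b / (3 * a)

    s = cmath.sqrt(q * q / 4 + p**3 / 27)
    # Two candidates for u^3; take the larger one to limit cancellation
    u_cubed = max(-q / 2 + s, -q / 2 - s, key=abs)
    if u_cubed == 0:
        # Only happens when p == q == 0: a triple root at t = 0
        return (shift, shift, shift)

    u = u_cubed ** (1 / 3)                   # principal complex cube root
    omega = complex(-0.5, math.sqrt(3) / 2)  # primitive cube root of unity
    roots = []
    for k in range(3):
        uk = u * omega**k
        # t = u_k + v_k with u_k * v_k = -p/3
        roots.append(uk - p / (3 * uk) + shift)
    return tuple(roots)

print(cubic_roots(1, 0, -15, -126))
print(cubic_roots(0.0509, 0.0192, 3.68, -34.44))
```

The results come back as complex numbers, so a real root shows up with an imaginary part that is zero or very close to it.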
I'm trying to solve the following system: d²i/dt² + R'(i)/L di/dt + 1/LC i(t) = 1/L dE/dt as a set of coupled first order differential equations:
di/dt = k
dk/dt = 1/L dE/dt - R'(i)/L k - 1/LC i(t)
Here is the code I'm using:
import numpy as np
import sympy as sp
import matplotlib.pyplot as plt
from scipy.integrate import odeint
#Define model: x = [i , k]
def RLC(x , t):
    i = sp.Symbol('i')
    t = sp.Symbol('t')
    #Data:
    E = sp.ln(t + 1)
    dE_dt = E.diff(t)
    R1 = 1000 #1 kOhm
    R2 = 100 #100 Ohm
    R = R1 * i + R2 * i**3
    dR_di = R.diff(i)
    i = x[0]
    k = x[1]
    L = 10e-3 #10 mHy
    C = 1.56e-6 #1.56 uF
    #Model
    di_dt = k
    dk_dt = 1/L * dE_dt - dR_di/L * k - 1/(L*C) * i
    dx_dt = np.array([di_dt , dk_dt])
    return dx_dt
#init cond:
x0 = np.array([0 , 0])
#time points:
time = np.linspace(0, 30, 1000)
#solve ODE:
x = odeint(RLC, x0, time)
i = x[: , 0]
However, I get the following error: TypeError: Cannot cast array data from dtype('O') to dtype('float64') according to the rule 'safe'
So I don't know whether sympy and odeint just don't work well together, or whether the problem is that I defined t as sp.Symbol.
When you differentiate a function, you get a function back. So you need to evaluate it at a point in order to get a number. To evaluate a sympy expression, you could use .subs() but I prefer .replace() which feels more powerful (at least for me).
You must try and make every single variable have its own name in order to avoid confusion. For example, you replace the float input t with a sympy Symbol from the very beginning, thus losing the value of t. The variables x and i are also repeated in the outer scope which is not good practice if they mean different things.
The following should avoid confusion and hopefully produce something that you were expecting:
import numpy as np
import sympy as sp
import matplotlib.pyplot as plt
from scipy.integrate import odeint
# Define model: x = [i , k]
def RLC(x, t):
    # define constants first
    i = x[0]
    k = x[1]
    L = 10e-3 # 10 mHy
    C = 1.56e-6 # 1.56 uF
    R1 = 1000 # 1 kOhm
    R2 = 100 # 100 Ohm
    # define symbols (used to find derivatives)
    i_symbol = sp.Symbol('i')
    t_symbol = sp.Symbol('t')
    # Data (differentiate and evaluate)
    E = sp.ln(t_symbol + 1)
    dE_dt = E.diff(t_symbol).replace(t_symbol, t)
    R = R1 * i_symbol + R2 * i_symbol ** 3
    dR_di = R.diff(i_symbol).replace(i_symbol, i)
    # nothing should contain symbols from here onwards
    # variables can however contain sympy expressions
    # Model (convert sympy expressions to floats)
    di_dt = float(k)
    dk_dt = float(1 / L * dE_dt - dR_di / L * k - 1 / (L * C) * i)
    dx_dt = np.array([di_dt, dk_dt])
    return dx_dt
# init cond:
x0 = np.array([0, 0])
# time points:
time = np.linspace(0, 30, 1000)
# solve ODE:
solution = odeint(RLC, x0, time)
result = solution[:, 0]
print(result)
Just something to note: the value i = x[0] seemed to sit very close to 0 throughout each iteration. This means dR_di stayed basically at 1000 the whole time. I'm not familiar with odeint or your specific ODE, but hopefully this phenomenon is expected and isn't a problem.
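A possible variation, sketched here under the assumption that the symbolic derivatives do not change between calls, is to differentiate once outside the right-hand-side function and turn the results into plain numeric functions with sp.lambdify. The names rlc, dE_dt_func and dR_di_func are illustrative and not part of the code above.

```python
import numpy as np
import sympy as sp
from scipy.integrate import odeint

L = 10e-3    # inductance, same value as above
C = 1.56e-6  # capacitance, same value as above
R1, R2 = 1000, 100

# Differentiate symbolically once, then convert to ordinary Python functions
i_sym, t_sym = sp.symbols('i t')
dE_dt_func = sp.lambdify(t_sym, sp.diff(sp.ln(t_sym + 1), t_sym), 'math')
dR_di_func = sp.lambdify(i_sym, sp.diff(R1 * i_sym + R2 * i_sym**3, i_sym), 'math')

def rlc(state, t):
    # state = [i, k] with k = di/dt; only floats are involved from here on
    i, k = state
    di_dt = k
    dk_dt = dE_dt_func(t) / L - dR_di_func(i) / L * k - i / (L * C)
    return [di_dt, dk_dt]

time = np.linspace(0, 30, 1000)
solution = odeint(rlc, [0.0, 0.0], time)
current = solution[:, 0]
print(current[:5])
```

This keeps sympy out of the integration loop, so each call to rlc does only float arithmetic.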
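I need to plot the following function using Python, numpy and matplotlib:
[equation image not preserved; the answers below implement Psi(x) = 2/(N+1) * sum over odd n from 1 to N of (-1)^((n-1)/2) * sin(n*x)]
for the values of N = 5, 20 and 60.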
I've created a list of odd numbers using:
def odd(n):
    nums = []
    for i in range(1, 2*n, 2):
        nums.append(i)
    return nums
But I don't know how to use this in a sigma function because I need to vary my x values and sum over the function for the range of odd(n).
If you want to plot (i.e. visualise) the function for some N, then the procedure is as follows:
1. Generate an array of x values. In this case, ranging from -pi to pi makes the most sense.
2. Write a loop that computes one sin() term at a time and adds the result to a separate array, which we call Psi.
3. Finally, multiply Psi by the constant 2/(N+1).
4. Plot the result.
import numpy as np
import matplotlib.pyplot as plt
# x is 100 equally spaced points from -pi to pi, inclusive
x = np.linspace(-np.pi, np.pi, 100)
Psi = 0*x # now Psi is an array of zeros
N = 60
# second input of range is N+1 since our index n satisfies 1 <= n < N+1
# third input makes n increment by 2 each loop instead of the default 1
for n in range(1, N+1, 2):
    # parentheses matter: -1**k parses as -(1**k), which is always -1
    Psi += (-1)**((n-1)//2) * np.sin(n*x)
Psi *= 2/(N+1)
plt.plot(x, Psi)
Code without pure Python loops:
def Psi(x, N=7):
    """Note: N should be odd """
    _s = np.arange(1, int((N + 1) / 2) + 1)
    return 2 * np.sum(np.where(_s % 2, 1, -1) * np.sin((2 * _s - 1) * x)) / (N + 1)
This code has no loops and should work for any N. Note that x must be an array or list with more than one element.
import numpy as np
from numpy import matlib
import matplotlib.pyplot as plt
def psi(x,N):
    n=np.arange(0,N,2)+1
    sigma = matlib.repmat((-1)**((n-1)/2),len(x),1).T*np.sin(matlib.repmat(n,len(x),1).T*x)
    PSI = (2/(N+1))*np.sum(sigma,axis=0)
    return PSI
x=np.linspace(0,2*np.pi,50)
N=5
y = psi(x,N)
plt.plot(y)
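The same sum can also be written with plain NumPy broadcasting instead of repmat. The following is a sketch under the assumption that the function is 2/(N+1) times the sum over odd n up to N of (-1)^((n-1)/2) * sin(n*x), as the loop-based answer above computes it. The name psi_broadcast is chosen here for illustration.

```python
import numpy as np

def psi_broadcast(x, N):
    # accept scalars, lists or arrays
    x = np.atleast_1d(np.asarray(x, dtype=float))
    n = np.arange(1, N + 1, 2)        # odd harmonics 1, 3, ..., up to N
    signs = (-1.0) ** ((n - 1) // 2)  # +1, -1, +1, ...
    # n[:, None] * x has shape (len(n), len(x)); sum over the harmonic axis
    terms = signs[:, None] * np.sin(n[:, None] * x)
    return 2.0 / (N + 1) * terms.sum(axis=0)

x = np.linspace(-np.pi, np.pi, 100)
curves = {N: psi_broadcast(x, N) for N in (5, 20, 60)}
print({N: y.shape for N, y in curves.items()})
```

Each entry of curves can then be passed to plt.plot(x, curves[N]) to draw the three requested cases on one figure.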
I started with this code to calculate a simple matrix multiplication. It runs with %timeit in around 7.85s on my machine.
To try to speed this up I tried Cython, which reduced the time to 0.4s. I also want to try the numba jit compiler to see if I can get similar speed-ups (with less effort). But adding the @jit decorator appears to give exactly the same timings (~7.8s). I know it can't figure out the types of the calculate_z_numpy() call, but I'm not sure what I can do to coerce it. Any ideas?
from numba import jit
import numpy as np
@jit('f8(c8[:],c8[:],uint)')
def calculate_z_numpy(q, z, maxiter):
    """use vector operations to update all zs and qs to create new output array"""
    output = np.resize(np.array(0, dtype=np.int32), q.shape)
    for iteration in range(maxiter):
        z = z*z + q
        done = np.greater(abs(z), 2.0)
        q = np.where(done, 0+0j, q)
        z = np.where(done, 0+0j, z)
        output = np.where(done, iteration, output)
    return output
def calc_test():
    w = h = 1000
    maxiter = 1000
    # make a list of x and y values which will represent q
    # xx and yy are the co-ordinates, for the default configuration they'll look like:
    # if we have a 1000x1000 plot
    # xx = [-2.13, -2.1242,-2.1184000000000003, ..., 0.7526000000000064, 0.7584000000000064, 0.7642000000000064]
    # yy = [1.3, 1.2948, 1.2895999999999999, ..., -1.2844000000000058, -1.2896000000000059, -1.294800000000006]
    x1, x2, y1, y2 = -2.13, 0.77, -1.3, 1.3
    x_step = (float(x2 - x1) / float(w)) * 2
    y_step = (float(y1 - y2) / float(h)) * 2
    y = np.arange(y2,y1-y_step,y_step,dtype=np.complex128)
    x = np.arange(x1,x2,x_step)
    q1 = np.empty(y.shape[0],dtype=np.complex128)
    q1.real = x
    q1.imag = y
    # Transpose y
    x_y_square_matrix = x+y[:, np.newaxis] # it is np.complex128
    # convert square matrix to a flatted vector using ravel
    q2 = np.ravel(x_y_square_matrix)
    # create z as a 0+0j array of the same length as q
    # note that it defaults to reals (float64) unless told otherwise
    z = np.zeros(q2.shape, np.complex128)
    output = calculate_z_numpy(q2, z, maxiter)
    print(output)
calc_test()
I figured out how to do this with some help from someone else.
@jit('i4[:](c16[:],c16[:],i4,i4[:])',nopython=True)
def calculate_z_numpy(q, z, maxiter,output):
    """use explicit loops to update all zs and qs and fill the output array"""
    for iteration in range(maxiter):
        for i in range(len(z)):
            # same update as the vectorized version: z = z*z + q
            z[i] = z[i]*z[i] + q[i]
            # complex numbers have no ordering, so compare the magnitude
            if abs(z[i]) > 2:
                output[i] = iteration
                z[i] = 0+0j
                q[i] = 0+0j
    return output
What I learnt is to use numpy data structures as inputs (for typing), but to use C-like paradigms for looping inside the function.
This runs in 402 ms, which is a touch faster than the Cython code (0.45 s), so for the fairly minimal work of rewriting the loop explicitly we have a Python version that is (just) faster than C.
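For reference, the same loop-style idea can be sketched with one inner loop per point and an early exit, which avoids mutating q and z. This is an illustrative variant, not the code timed above: the name escape_counts is made up here, and the try/except fallback is an assumption added so the snippet still runs (slowly, as plain Python) when numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs without numba (no speed-up in that case)
    def njit(func):
        return func

@njit
def escape_counts(q, maxiter):
    # output[i] is the first iteration at which |z| exceeds 2, or 0 if it never does
    output = np.zeros(q.shape[0], dtype=np.int32)
    for i in range(q.shape[0]):
        z = 0j
        for iteration in range(maxiter):
            z = z * z + q[i]
            if abs(z) > 2.0:
                output[i] = iteration
                break
    return output

q = np.array([0j, 1 + 1j, 2 + 2j])
counts = escape_counts(q, 50)
print(counts)  # [0 1 0]
```

Points that never escape keep the value 0, which matches how the vectorized version in the question initializes its output array.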