I am getting this error:
ValueError: could not broadcast input array from shape (4,1) into shape (4,)
from my script below, which is trying to integrate some differential equations using interpolated data from a dataset. It might seem slightly nonsensical, since the integrated data will be the same as simply interpolating, but in a bigger project this seemed to me to be the best way to access the rate of change of these variables inside the integration function.
from scipy.integrate import solve_ivp
import numpy as np
import scipy.interpolate
def interpolateVariables(VariableData,t):
    FSH_interp = scipy.interpolate.interp1d(VariableData[:,4], VariableData[:,[0]], axis = 0)
    LH_interp = scipy.interpolate.interp1d(VariableData[:,4], VariableData[:,[1]], axis = 0)
    E2_interp = scipy.interpolate.interp1d(VariableData[:,4], VariableData[:,[2]], axis = 0)
    P4_interp = scipy.interpolate.interp1d(VariableData[:,4], VariableData[:,[3]], axis = 0)
    return FSH_interp(t), LH_interp(t), E2_interp(t), P4_interp(t)
def simulate_model(VariableData):
    tspan = [0,24]
    InitialValues = [0.1,0.1,0.1,0.1]
    result = solve_ivp(fun = lambda t, y: func(t,y, VariableData), t_span = tspan, y0 = InitialValues, method = "RK45")
    return result
def func(t, y, VariableData):
    FSH_int, LH_int, E2_int, P4_int = interpolateVariables(VariableData, t)
    dFSH = FSH_int - y[0]
    dLH = LH_int - y[1]
    dE2 = E2_int - y[2]
    dP4 = P4_int - y[3]
    dy = [dFSH, dLH, dE2, dP4]
    print(dy)
    return dy
Variable_data = np.array([[6.5000000e+00, 4.8000000e+00, 1.9721760e+01, 1.8870000e-01, 0.0000000e+00],
[6.2000000e+00, 4.1000000e+00, 2.9065080e+01, 1.8870000e-01, 2.0000000e+00],
[7.4000000e+00, 4.3000000e+00, 3.8353920e+01, 1.8870000e-01, 4.0000000e+00],
[6.1000000e+00, 4.9000000e+00, 4.8596160e+01, 1.8870000e-01, 6.0000000e+00],
[4.8000000e+00, 5.2000000e+00, 1.0830624e+02, 3.4595000e-01, 8.0000000e+00],
[3.6000000e+00, 6.0000000e+00, 1.8822840e+02, 2.2015000e-01, 1.0000000e+01],
[1.2900000e+01, 4.8300000e+01, 2.6142228e+02, 7.5480000e-01, 1.2000000e+01],
[6.3000000e+00, 7.2000000e+00, 6.4994640e+01, 3.7111000e+00, 1.4000000e+01],
[4.0000000e+00, 5.9000000e+00, 1.8024708e+02, 1.7769250e+01, 1.8000000e+01],
[3.2000000e+00, 5.3000000e+00, 2.0506272e+02, 1.8272450e+01, 2.0000000e+01],
[2.9000000e+00, 3.0000000e+00, 1.4941140e+02, 1.3680750e+01, 2.2000000e+01],
[3.4000000e+00, 4.8000000e+00, 8.6241840e+01, 3.1450000e+00, 2.4000000e+01]])
test = simulate_model(Variable_data)
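The error most likely comes from the list index in `VariableData[:,[0]]`: it keeps a trailing axis of length 1, so each interpolant returns an array of shape (1,) and `dy` ends up with shape (4,1), while `solve_ivp` expects shape (4,). Below is a minimal sketch of one way around it, assuming it is acceptable to interpolate all four columns with a single `interp1d` call. The shortened data table and the names `times`, `values`, `interp` and `rhs` are illustrative, not from the original script.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.interpolate import interp1d

# Shortened, illustrative table: four variables in columns 0-3, time in column 4.
variable_data = np.array([
    [6.5, 4.8, 19.72, 0.19, 0.0],
    [6.1, 4.9, 48.60, 0.19, 6.0],
    [12.9, 48.3, 261.42, 0.75, 12.0],
    [4.0, 5.9, 180.25, 17.77, 18.0],
    [3.4, 4.8, 86.24, 3.15, 24.0],
])

times = variable_data[:, 4]
values = variable_data[:, :4]  # shape (n_times, 4)

# One interpolant for all four columns; interp(t) has shape (4,) for scalar t.
# "extrapolate" guards against a tiny floating-point overshoot at the interval ends.
interp = interp1d(times, values, axis=0, fill_value="extrapolate")

def rhs(t, y):
    # Both operands have shape (4,), so the derivative does too.
    return interp(t) - y

result = solve_ivp(rhs, t_span=[0, 24], y0=[0.1, 0.1, 0.1, 0.1], method="RK45")
print(result.y.shape)
```

Keeping the four separate interpolants should also work if the columns are selected as `VariableData[:,0]` (a plain integer index) instead of `VariableData[:,[0]]`.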
My script calculates the location error using a set of the equation for different values of x and y and stores the output into an empty array t_error. However, there are two issues that need to be resolved:
1: How to store the output in a 20_by_20 matrix instead of a 400_by_1 dimension.
2: How to make a contour plot (error surface) using x, y, and out_put parameter that is t_error in our case.
The sample script is as below:
import pandas as pd
import numpy as np
import math
ev_loc= pd.read_csv("test_grid.txt", sep='\t',header=None)
x=np.array(ev_loc[1])
y=np.array(ev_loc[0])
v=3.5
t_error=[]
for s in x:
    for t in y:
        for i, j, k in [[73.9,33.1, 1.268571], [73.5,33.1, 1.268571], [73.4,33.1, 2.854286], [73.7,33.2, 0.317143],[73.7,33.0, 0.317143]]:
            u=((np.sqrt((t-j)**2 + (s-i)**2)/v)*111 - k)
            # keep the squared residual under its own name so the velocity v is not overwritten
            sq_err=u*u
            t_error.append(float(sq_err))
df_hr = pd.DataFrame(t_error)
numbers = np.array(df_hr)
window_size = 5
i = 0
moving_averages = []
while i < len(numbers) - window_size + 1:
    this_window = numbers[i : i + window_size]
    window_average = sum(this_window)
    moving_averages.append(window_average)
    i += 5
Error = pd.DataFrame(moving_averages)
Error.to_csv('test_total_error.csv')
print(Error)
The data of test_grid.txt is as below
import numpy as np
import matplotlib.pyplot as plt
x1=np.linspace(73,75,num=41)
y1=np.linspace(33,35,num=41)
v=3.5
t_error=[]
for i, j, k in [[71.91500,33.82850, 57.2], [72.32200,33.16267, 38.28], [72.57900, 33.61317, 37.48], [73.44883, 33.83300, 27.8], [71.52967,33.15267, 58.8],
                [73.27017,33.65167, 18.44], [73.14017, 33.75200, 29.97], [72.46550,32.63183, 39.98], [73.22900, 32.99867, 14.77], [72.67167, 31.92100, 48.71],
                [71.91817, 32.53983, 54.73],[71.92333,33.04400, 49.67],[71.74417,32.79617, 57.39]]:
    u=((np.sqrt((y1-j)**2 + (x1-i)**2)/v)*111 - k)
    c=np.sum(u)
    t_error.append(c)
plt.plot(t_error)
plt.show()
What is the error supposed to show?
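For the two points raised in the question (a 20-by-20 result and a contour plot), here is a rough sketch that evaluates the summed squared error directly on a 2D grid built with `np.meshgrid`, so no reshaping is needed afterwards. The 20-point axes are placeholders for the contents of test_grid.txt, and the names `velocity`, `stations`, `X`, `Y` are my own. The station list is the one from the first script.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical 20-point axes standing in for the columns of test_grid.txt.
x = np.linspace(73.0, 74.0, 20)
y = np.linspace(33.0, 34.0, 20)
velocity = 3.5

# Stations as (x_station, y_station, observed_time), taken from the script above.
stations = np.array([
    [73.9, 33.1, 1.268571],
    [73.5, 33.1, 1.268571],
    [73.4, 33.1, 2.854286],
    [73.7, 33.2, 0.317143],
    [73.7, 33.0, 0.317143],
])

# X[r, c] = x[c] and Y[r, c] = y[r]; both have shape (20, 20).
X, Y = np.meshgrid(x, y)

# Sum of squared travel-time residuals over all stations, one value per grid node.
t_error = np.zeros_like(X)
for sx, sy, t_obs in stations:
    residual = np.sqrt((Y - sy) ** 2 + (X - sx) ** 2) / velocity * 111 - t_obs
    t_error += residual ** 2

# Filled contour plot of the error surface.
fig, ax = plt.subplots()
contours = ax.contourf(X, Y, t_error, levels=20)
fig.colorbar(contours, ax=ax, label="summed squared error")
ax.set_xlabel("x")
ax.set_ylabel("y")
```

Call `plt.show()` (with an interactive backend) or `fig.savefig(...)` to see the figure. If you would rather keep the existing loops, the flat list of 400 sums can be turned into a matrix with `np.array(...).reshape(len(x), len(y))`. Since x is the outer loop, that matrix is indexed [x, y] and would need a transpose before being passed to `contourf(X, Y, ...)`.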
Despite having a working script for curve fitting using the lmfit library, I am not able to solve a display issue. Indeed, having only 5 dependent values, the resulting graph is rather coarse.
Before switching to lmfit, I was using curve_fit and could solve the display issue by simply using np.linspace and plotting the optimized values resulting from the fit procedure. I then displayed the "real" values through plt.errorbar. With lmfit, the same approach fails, since it picks up the "fake" independent variable and raises a shape mismatch error.
My full script is the following:
import lmfit as lf
from lmfit import Model, Parameters
import numpy as np
import matplotlib.pyplot as plt
from math import atan
def on_res(omega_eff, thetas, R2avg=5, k_ex=0.1, phi_ex=500):
    return R2avg*(np.sin(thetas))**2 + ((np.sin(thetas))**2)*(phi_ex*k_ex/(k_ex**2 + omega_eff**2))
model = Model(on_res,independent_vars=['omega_eff','thetas'])
params = model.make_params(R2avg=5, k_ex=0.01, phi_ex=1500)
carrier = 6146.53
O_1 = 5846
spin_locks = (1000, 2000, 3000, 4000, 5000)
delta_omega = (O_1 - carrier)
omega_eff1 = ((delta_omega**2) + (spin_locks[0]**2))**0.5
omega_eff2 = ((delta_omega**2) + (spin_locks[1]**2))**0.5
omega_eff3 = ((delta_omega**2) + (spin_locks[2]**2))**0.5
omega_eff4 = ((delta_omega**2) + (spin_locks[3]**2))**0.5
omega_eff5 = ((delta_omega**2) + (spin_locks[4]**2))**0.5
theta_rad1 = atan(spin_locks[0]/delta_omega)
theta_rad2 = atan(spin_locks[1]/delta_omega)
theta_rad3 = atan(spin_locks[2]/delta_omega)
theta_rad4 = atan(spin_locks[3]/delta_omega)
theta_rad5 = atan(spin_locks[4]/delta_omega)
x = (omega_eff1/1000, omega_eff2/1000, omega_eff3/1000, omega_eff4/1000, omega_eff5/1000)# , omega_eff6/1000)# , omega_eff7/1000)
theta = (theta_rad1, theta_rad2, theta_rad3, theta_rad4, theta_rad5)
R1rho_vals = (7.9328, 6.2642, 6.0005, 5.9972, 5.988)
e = (0.2, 0.2, 0.2, 0.2, 0.2)
new_x = np.linspace(0, 6, 1000)
omega_eff = np.array(x, dtype=float)
thetas = np.array(theta, dtype=float)
R1rho_vals = np.array(R1rho_vals, dtype=float)
error = np.array(e, dtype=float)
R2avg = []
k_ex = []
phi_ex = []
result = model.fit(R1rho_vals, params, weights=1/error, thetas=thetas, omega_eff=omega_eff, method = "emcee", steps = 1000)
print(result.fit_report())
plt.errorbar(x, R1rho_vals, yerr = error, fmt = ".k", markersize = 8, capsize = 3)
plt.plot(new_x, result.best_fit)
plt.show()
As you can see when running it, it raises a shape mismatch error. Changing the plt.plot line to plt.plot(x, result.best_fit) plots the graph correctly, but with a very coarse profile (as one would expect, having only 5 points on the x-axis).
Are you aware of any way to solve this? Checking the documentation, I noticed the examples provided all plot the results via the actual independent variables values, since they have enough experimental values.
You need to re-evaluate the ModelResult with your new values for the independent variables:
plt.plot(new_x, result.eval(omega_eff=new_x/1000., thetas=thetas))
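One caveat with the call above: `on_res` takes two independent variables and combines them elementwise, so a 1000-point `omega_eff` cannot be paired with the 5-point `thetas`. A possible sketch, assuming a dense spin-lock grid is an acceptable way to generate both variables consistently. The names `spin_locks_dense`, `omega_eff_dense` and `thetas_dense` are hypothetical, and the curve is evaluated with the default parameters of `on_res` as a placeholder, not with actual fit results.

```python
import numpy as np

def on_res(omega_eff, thetas, R2avg=5, k_ex=0.1, phi_ex=500):
    # Same model function as in the question.
    sin2 = np.sin(thetas) ** 2
    return R2avg * sin2 + sin2 * (phi_ex * k_ex / (k_ex ** 2 + omega_eff ** 2))

carrier = 6146.53
O_1 = 5846
delta_omega = O_1 - carrier

# Dense spin-lock grid spanning the measured range (1000 to 5000).
spin_locks_dense = np.linspace(1000, 5000, 1000)

# Derive BOTH independent variables from the same grid so their lengths match.
omega_eff_dense = np.sqrt(delta_omega ** 2 + spin_locks_dense ** 2) / 1000
thetas_dense = np.arctan(spin_locks_dense / delta_omega)

# Placeholder evaluation with the default parameters of on_res.
smooth_curve = on_res(omega_eff_dense, thetas_dense)
print(smooth_curve.shape)
```

With the fitted result, the same two arrays would go into `result.eval(omega_eff=omega_eff_dense, thetas=thetas_dense)`, plotted against `omega_eff_dense`. Note that `omega_eff` in the question is already divided by 1000 before the fit, so the dense values should not be divided a second time.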
I wrote a piece of code to build a simple linear regression model in Python. However, I am having trouble getting the correct cost function and, most importantly, the correct theta parameters. The model is implemented from scratch, without the scikit-learn module. I used Andrew Ng's notes from his ML Coursera course to create the model. The correct values of theta are [[-3.630291] [1.166362]].
Would be really grateful if someone could offer their expertise, and point out what I'm doing wrong.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
#Load The Dataset
dataset = pd.read_csv("Population vs Profit.txt",names=["Population" ,
"Profit"])
print (dataset.head())
col = len(dataset.columns)
x = dataset.iloc[:,:col-1].values
y = dataset.iloc[:,col-1].values
#Visualizing The Dataset
plt.scatter(x, y, color="red", marker="x", label="Profit")
plt.title("Population vs Profit")
plt.xlabel("Population")
plt.ylabel("Profit")
plt.legend()
plt.show()
#Preprocessing Data
dataset.insert(0,"x0",1)
col = len(dataset.columns)
x = dataset.iloc[:,:col-1].values
b = np.zeros(col-1)
m = len(y)
costlist = []
alpha = 0.001
iteration = 10000
#Defining Functions
def hypothesis(x,b,y):
    h = x.dot(b.T) - y
    return h
def cost(x,b,y,m):
    j = np.sum(hypothesis(x,b,y)**2)
    j = j/(2*m)
    return j
print (cost(x,b,y,m))
def gradient_descent(x,b,y,m,alpha):
    for i in range (iteration):
        h = hypothesis(x,b,y)
        product = np.sum(h.dot(x))
        b = b - ((alpha/m)*product)
        costlist.append(cost(x,b,y,m))
    return b,cost(x,b,y,m)
b , mincost = gradient_descent(x,b,y,m,alpha)
print (b , mincost)
print (cost(x,b,y,m))
plt.plot(b,color="green")
plt.show()
The dataset I'm using is the following text.
6.1101,17.592
5.5277,9.1302
8.5186,13.662
7.0032,11.854
5.8598,6.8233
8.3829,11.886
7.4764,4.3483
8.5781,12
6.4862,6.5987
5.0546,3.8166
5.7107,3.2522
14.164,15.505
5.734,3.1551
8.4084,7.2258
5.6407,0.71618
5.3794,3.5129
6.3654,5.3048
5.1301,0.56077
6.4296,3.6518
7.0708,5.3893
6.1891,3.1386
20.27,21.767
5.4901,4.263
6.3261,5.1875
5.5649,3.0825
18.945,22.638
12.828,13.501
10.957,7.0467
13.176,14.692
22.203,24.147
5.2524,-1.22
6.5894,5.9966
9.2482,12.134
5.8918,1.8495
8.2111,6.5426
7.9334,4.5623
8.0959,4.1164
5.6063,3.3928
12.836,10.117
6.3534,5.4974
5.4069,0.55657
6.8825,3.9115
11.708,5.3854
5.7737,2.4406
7.8247,6.7318
7.0931,1.0463
5.0702,5.1337
5.8014,1.844
11.7,8.0043
5.5416,1.0179
7.5402,6.7504
5.3077,1.8396
7.4239,4.2885
7.6031,4.9981
6.3328,1.4233
6.3589,-1.4211
6.2742,2.4756
5.6397,4.6042
9.3102,3.9624
9.4536,5.4141
8.8254,5.1694
5.1793,-0.74279
21.279,17.929
14.908,12.054
18.959,17.054
7.2182,4.8852
8.2951,5.7442
10.236,7.7754
5.4994,1.0173
20.341,20.992
10.136,6.6799
7.3345,4.0259
6.0062,1.2784
7.2259,3.3411
5.0269,-2.6807
6.5479,0.29678
7.5386,3.8845
5.0365,5.7014
10.274,6.7526
5.1077,2.0576
5.7292,0.47953
5.1884,0.20421
6.3557,0.67861
9.7687,7.5435
6.5159,5.3436
8.5172,4.2415
9.1802,6.7981
6.002,0.92695
5.5204,0.152
5.0594,2.8214
5.7077,1.8451
7.6366,4.2959
5.8707,7.2029
5.3054,1.9869
8.2934,0.14454
13.394,9.0551
5.4369,0.61705
One issue is with your "product". It is currently a number when it should be a vector. I was able to get the values [-3.24044334 1.12719788] by rewriting your for-loop as follows:
def gradient_descent(x,b,y,m,alpha):
    for i in range (iteration):
        h = hypothesis(x,b,y)
        #product = np.sum(h.dot(x))
        xvalue = x[:,1]
        product = h.dot(xvalue)
        hsum = np.sum(h)
        b = b - ((alpha/m)* np.array([hsum , product]) )
        costlist.append(cost(x,b,y,m))
    return b,cost(x,b,y,m)
There's possibly another issue besides this as it doesn't converge to your answer. You should make sure you are using the same alpha also.
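The same update can be written for any number of parameters by computing the gradient as a vector, `X.T @ (X @ b - y) / m`. The following is only a sketch on synthetic data (not the Population vs Profit file), with an illustrative step size and iteration count. It compares the result against the least-squares solution from `np.linalg.lstsq` as a sanity check.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the population/profit data (not the real dataset).
population = np.linspace(5, 20, 50)
profit = -3.0 + 1.2 * population + rng.normal(scale=0.5, size=population.size)

X = np.column_stack([np.ones_like(population), population])  # prepend x0 = 1
y = profit

def cost(X, b, y):
    residual = X @ b - y
    return residual @ residual / (2 * len(y))

def gradient_descent(X, y, alpha, iterations):
    b = np.zeros(X.shape[1])
    for _ in range(iterations):
        residual = X @ b - y                # shape (m,)
        gradient = X.T @ residual / len(y)  # shape (2,): one entry per parameter
        b = b - alpha * gradient
    return b

b = gradient_descent(X, y, alpha=0.005, iterations=40000)
b_exact, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b, b_exact, cost(X, b, y))
```

The gap between [-3.24, 1.13] and the expected [-3.63, 1.17] looks consistent with gradient descent simply being stopped at a different point, since the result depends on both alpha and the number of iterations. If I remember the course exercise correctly, it uses alpha = 0.01 with 1500 iterations, but treat that as an assumption and check it against your notes.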
I have a 4D dataset (time, z, y, x) and I would like to interpolate the data to get a higher resolution, this is a simple example code:
import numpy as np
from scipy.interpolate import griddata
x_0 = 10
cut_index = 10
res = 200j
x_index = x_0
y_index = np.linspace(0, 100, 50).astype(int)
z_index = np.linspace(0, 50, 25).astype(int)
#Time, zyx-coordinate
u = np.random.randn(20, 110, 110, 110)
z_index, y_index = np.meshgrid(z_index, y_index)
data = u[cut_index, z_index, y_index, x_index]
res = 200j
y_f = np.mgrid[0:100:res]
z_f = np.mgrid[0:50:res]
z_f, y_f = np.meshgrid(z_f, y_f)
data = griddata((z_index, y_index), data, (z_f, y_f))
I am getting the ValueError: invalid shape for input data points error. What kind of input is expected by the griddata function?
griddata expects the sample points as 1D coordinate arrays (or a single array of shape (n, D)) and the values as a matching 1D array. Try flattening the arrays:
data = griddata((z_index.flatten(), y_index.flatten()), data.flatten(), (z_f, y_f))
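For reference, a self-contained sketch of the flattened call. It uses a much smaller random field than the one in the question to keep memory use low, and the names `z_grid`, `y_grid`, `points`, `values` and `fine` are illustrative.

```python
import numpy as np
from scipy.interpolate import griddata

rng = np.random.default_rng(0)

# Smaller hypothetical field than in the question: (time, z, y, x).
u = rng.standard_normal((2, 51, 101, 11))
cut_index, x_index = 1, 10

y_index = np.linspace(0, 100, 50).astype(int)
z_index = np.linspace(0, 50, 25).astype(int)
z_grid, y_grid = np.meshgrid(z_index, y_index)  # each of shape (50, 25)
data = u[cut_index, z_grid, y_grid, x_index]    # shape (50, 25)

# Sample locations as an (n_points, 2) array and values as a matching 1D array.
points = np.column_stack([z_grid.ravel(), y_grid.ravel()])
values = data.ravel()

# Fine target grid; the query points may keep their 2D shape.
res = 200j
z_f, y_f = np.meshgrid(np.mgrid[0:50:res], np.mgrid[0:100:res])

fine = griddata(points, values, (z_f, y_f), method="linear")
print(fine.shape)
```

Only the input points and values need to be flattened. The output takes the shape of the query grid, here (200, 200).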
I'm trying to make a fit using matplotlib.psd function. My datafile has 8 columns with displacement and speed for a particle (positionX, positionY, positionZ, AveragePositionXYZ, speedX, speedY, speedZ, AverageSpeedXYZ). Using the positionX for example, I try to get the Power Spectrum with matplotlib.psd:
power, freqs = plt.psd(data, len(data), Fs = 256, scale_by_freq=True, return_line=0)
Then, I try to fit a curve using linear regression with scipy's stats.linregress:
slope, inter, r2, p, stderr = stats.linregress(x, y)
However, my results are very bad. I try to plot with:
line = (inter + slope * (10 * np.log10(freqs)))
plt.semilogx(freqs, line)
plt.show()
And get the following image:
I know that I have a lot of mistakes, and I have looked for solutions on the web, without much success. So, I'm asking if there's someone here who could help me.
The datafile has the following format (first 10 lines):
1.50000000,0.00000000,0.00000000,0.50000000,0.00000000,0.00000000,0.00000000,0.00000000
1.49788889,0.00000000,0.00000000,0.49929630,-0.06333333,0.00000000,0.00000000,-0.02111111
1.49367078,0.00000005,0.00000000,0.49789028,-0.12654314,0.00000165,0.00000000,-0.04218050
1.48735391,0.00000027,0.00000000,0.49578473,-0.18950635,0.00000659,0.00000000,-0.06316659
1.47895054,0.00000082,0.00000000,0.49298379,-0.25210085,0.00001647,0.00000000,-0.08402813
1.46847701,0.00000192,0.00000000,0.48949298,-0.31420588,0.00003296,0.00000000,-0.10472431
1.45595360,0.00000385,0.00000000,0.48531915,-0.37570257,0.00005769,0.00000000,-0.12521496
1.44140445,0.00000692,0.00000000,0.48047046,-0.43647431,0.00009232,0.00000000,-0.14546066
1.42485754,0.00001154,0.00000000,0.47495636,-0.49640723,0.00013851,0.00000000,-0.16542291
1.40634452,0.00001814,0.00000000,0.46878755,-0.55539066,0.00019789,0.00000000,-0.18506426
My complete Python code is as follows:
import matplotlib.pyplot as plt
from scipy import stats
import numpy as np
filename = 'datafile.txt'
# Load data
file = np.genfromtxt(filename,
skip_header = 0,
skip_footer = 0,
delimiter = ',',
dtype = 'float32',
filling_values = 0,
usecols = (0, 1, 2, 3, 4, 5, 6, 7),
names = ['posX', 'posY', 'posZ', 'posMedias', 'velX', 'velY', 'velZ', 'velMedias'])
# Map values
posX = file['posX']
posY = file['posY']
posZ = file['posZ']
posMedia = file['posMedias']
velX = file['velX']
velY = file['velY']
velZ = file['velZ']
velMedia = file['velMedias']
# Column data that will be used
data = posMedia
# PSD calculation
power, freqs = plt.psd(data, len(data), Fs = 256, scale_by_freq=True, return_line=0)
# Linear fit
x = np.log10(freqs[1:])
y = np.log10(power[1:])
slope, inter, r2, p, stderr = stats.linregress(x, y)
print(slope, inter)
# Plot
line = (inter + slope * (10 * np.log10(freqs)))
plt.semilogx(freqs, line)
plt.show()
Thank you so much!
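One likely culprit is the plotted line: the regression is done on log10(freqs) versus log10(power), but the line is built as `inter + slope * (10 * np.log10(freqs))`, which applies the factor of 10 to the frequency term only. A small sketch of the conversion, using an exact synthetic power law instead of the real data file (the spectrum and the name `line_db` are assumptions for illustration):

```python
import numpy as np
from scipy import stats

# Hypothetical spectrum: an exact power law, power = 4 * f**-2.
freqs = np.linspace(0.0, 128.0, 257)
power = np.empty_like(freqs)
power[1:] = 4.0 * freqs[1:] ** -2.0
power[0] = power[1]  # placeholder for the zero-frequency bin, excluded below

# Fit a straight line in log10-log10 space, skipping f = 0.
log_f = np.log10(freqs[1:])
log_p = np.log10(power[1:])
slope, inter, r_value, p_value, stderr = stats.linregress(log_f, log_p)

# The fit predicts log10(power); plt.psd draws 10*log10(power) in dB,
# so the factor of 10 multiplies the whole line, not just the frequency term.
line_db = 10 * (inter + slope * log_f)
print(slope, inter)
```

To overlay the fit on the axes drawn by plt.psd, something like `plt.semilogx(freqs[1:], line_db)` should then line up with the spectrum. Also note that the third value returned by linregress is the correlation coefficient r, not r squared.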