How do I switch this MATLAB loop to a Python loop?

I have MATLAB code that works fine, but I want to write the same code in Python, and I don't get the same results.
I have tried using a different for loop than the one in MATLAB. These should give the same results, but something goes wrong at some point in the loop, and I couldn't figure out where the mistake is.
% MATLAB code
for ii = 1:100
    healthy=2*randn(100,1000)+5;
    patient=2*randn(100,1000)+7;
    threshold=mu_healthy-sd_healthy:0.1:mu_patient+sd_patient;
    for i=1:length(threshold)
        TP(i)=sum(patient>=threshold(i));
        FP(i)=sum(healthy>=threshold(i));
        TN(i)=sum(healthy<threshold(i));
        FN(i)=sum(patient<threshold(i));
    end
    FPR(ii,:)=FP/1000;
    TPR(ii,:)=TP/1000;
end
def appending(): #python code
    for n in range(0,50):
        for x in range(0,1000):
            for a in range(0,61):
                if Apatient[x,n]>=newthreshold[a]:
                    TP[a].append(Apatient[x,n])
                elif Ahealthy[x,n]>=newthreshold[a]:
                    FP.append(Ahealthy[x,n])
                elif Apatient[x,n]<newthreshold[a]:
                    TN.append(Apatient[x,n])
                elif Ahealthy[x,n]<newthreshold[a]:
                    FN.append(Ahealthy[x,n])
If you run this in MATLAB, you will see that FN and TN have 61 values in each column. I want the same to happen in my loop as well; however, I get far more elements when I run this code. Thanks.

I just followed the MATLAB script and tried a direct translation.
import numpy as np
mu_healthy = 5
sd_healthy = 2
mu_patient = 7
sd_patient = 2
threshold = np.arange(mu_healthy-sd_healthy, mu_patient+sd_patient+0.1, 0.1)
L = len(threshold)
TP = np.zeros([L,1])
FP = np.zeros([L,1])
TN = np.zeros([L,1])
FN = np.zeros([L,1])
FPR = np.zeros([100,L])
TPR = np.zeros([100,L])
for ii in range(0,100):
    # MATLAB's sd*randn(...)+mu is a normal with mean mu and standard deviation sd
    healthy = np.random.normal(mu_healthy,sd_healthy,[100,1000])
    patient = np.random.normal(mu_patient,sd_patient,[100,1000])
    for i in range(0, L):
        TP[i] = np.sum(patient>=threshold[i])
        FP[i] = np.sum(healthy>=threshold[i])
        TN[i] = np.sum(healthy<threshold[i])
        FN[i] = np.sum(patient<threshold[i])
    FPR[ii,:] = FP[:,0]
    TPR[ii,:] = TP[:,0]
# Divide by 1000 as in the MATLAB script; note that np.sum above counts over all 100*1000 samples
FPR = FPR/1000
TPR = TPR/1000
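The inner loop over thresholds can also be replaced by broadcasting. The following is only a minimal sketch and not part of the translation above: it assumes a single trial of 1000 samples per group, the function name roc_rates and the seeded generator are made up for illustration, and the thresholds are built with np.linspace so that the number of points does not depend on floating-point rounding of the step.

```python
import numpy as np

def roc_rates(healthy, patient, threshold):
    """Return (FPR, TPR) for 1-D samples at each threshold."""
    # Compare every sample against every threshold: shape (n_thresholds, n_samples)
    fp = healthy[np.newaxis, :] >= threshold[:, np.newaxis]
    tp = patient[np.newaxis, :] >= threshold[:, np.newaxis]
    # The mean of a boolean array is the fraction of True values
    return fp.mean(axis=1), tp.mean(axis=1)

mu_healthy, sd_healthy = 5, 2
mu_patient, sd_patient = 7, 2
# 61 evenly spaced thresholds from 3 to 9, matching MATLAB's 3:0.1:9
threshold = np.linspace(mu_healthy - sd_healthy, mu_patient + sd_patient, 61)

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
healthy = rng.normal(mu_healthy, sd_healthy, 1000)
patient = rng.normal(mu_patient, sd_patient, 1000)
FPR, TPR = roc_rates(healthy, patient, threshold)
print(FPR.shape, TPR.shape)
```

Each rate vector has one entry per threshold, which is the "61 values" shape the question asks for.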

Related

I want to convert the following MATLAB code into python? Is this the correct way?

How can I change this part [AIF,j]=get_AIF_j(InterpFact) and [~,j_index] = min(InterpFact-AIF_vect) correctly? And what about the remaining code? Thanks in advance.
%Matlab code
InterpFact = (fs_h/2/2)/(fd_max);
[AIF,j]=get_AIF_j(InterpFact);
function [AIF,j] = get_AIF_j (InterpFact)
    j_vect = 1:10;
    AIF_vect = floor(j_vect*InterpFact)./j_vect;
    [~,j_index] = min(InterpFact-AIF_vect);
    j = j_vect(j_index);
    AIF = AIF_vect(j_index);
end
#Python code
InterpFact = (fs_h/2/2)/(fd_max)
[AIF,j]=get_AIF_j(InterpFact)
def get_AIF_j (InterpFact):
    j_vect =np.arange(1,11)
    AIF_vect = np.floor(j_vect*InterpFact)/j_vect
    [~,j_index] = min(InterpFact-AIF_vect)
    j = j_vect[j_index]
    AIF = AIF_vect[j_index];
    return AIF,j
This MATLAB:
[~,j_index] = min(InterpFact-AIF_vect);
would be translated to Python as:
j_index = np.argmin(InterpFact-AIF_vect)
Also, …/(fd_max) can only be translated the way you did if fd_max is a scalar. A division with a matrix in MATLAB solves a system of linear equations.
I strongly recommend that you run the two pieces of code side by side with the same input, to verify that they do the same thing. You cannot go by guesses as to what a piece of code does.
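Putting the pieces together, a minimal sketch of the whole function could look like the following. It assumes InterpFact is a scalar; the input value 2.5 is made up purely to have something to run.

```python
import numpy as np

def get_AIF_j(InterpFact):
    j_vect = np.arange(1, 11)                      # MATLAB 1:10
    AIF_vect = np.floor(j_vect * InterpFact) / j_vect
    # np.argmin returns the index of the first minimum,
    # like the second output of MATLAB's min
    j_index = np.argmin(InterpFact - AIF_vect)
    return AIF_vect[j_index], j_vect[j_index]

AIF, j = get_AIF_j(2.5)
print(AIF, j)   # 2.5 2
```

Note that j_index is zero-based here, while MATLAB's j_index is one-based; since it is only used to index j_vect and AIF_vect, the returned values are the same.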
Try this to see if it delivers what it should (I am not sure here as I am not fluent in matlab):
#Python code
import numpy as np
def get_AIF_j (InterpFact):
    j_vect = np.arange(1,11)
    AIF_vect = np.floor(j_vect*InterpFact)/j_vect
    # np.argmin returns the index of the minimum, like the second output of MATLAB's min
    j_index = np.argmin(InterpFact-AIF_vect)
    print(j_index)
    j = j_vect[j_index]
    AIF = AIF_vect[j_index]
    return AIF, j
fs_h = 24; fd_max = 1
InterpFact = (fs_h/2/2)/(fd_max)
AIF, j = get_AIF_j(InterpFact)
print(AIF,j)
gives:
0
6.0 1

For loop Python- from Matlab

I am starting to code up in Python and I come from a Matlab background. I have a problem with a for loop that I am trying to do.
So this is my for loop from Matlab,
ix = indoor(1);
idx = indoor(2)-indoor(1);
%Initialize X apply I.C
X = [ix;idx];
for k=(1:1:287)
    X(:,k+1) = Abest*X(:,k) + Bbest*outdoor(k+1) + B1best* (cbest4/cbest1);
end
In this code Abest is a 2x2 matrix, Bbest is a 2x1 matrix, outdoor is a 288x1 vector, and B1best is a 2x1 matrix. The matrices are returned by a function that uses the matrix exponential command. c4 and c1 are constants defined earlier.
In Python I have been able to get the matrix exponential command to work in my function but I can't get that for loop to work.
Xo = np.array([[ix],[idx]])
num1 = range(0,276)
for k in num1:
    Xo[:,k+1] = Ae*Xo[:,k] + Be*outdoor[k+1] + Be1*(c4/c1)
Again Ae,Be,Be1 are matrices of the same size just like the Matlab ones. Same thing for the outdoor vector.
I have tried everything I can think of to make it work... The only thing that worked for me was,
Xo = np.zeros(())
#Initial COnditions
ix = np.array(indoor[0])
idx = np.array(indoor[1]-indoor[0])
Xo = np.array([[ix],[idx]])
#Range for the for loop
num1 = range(0,1)
for k in num1:
    Xo = Ae*Xo[k] + Be*outdoor[k+1] + Be1*(c4/c1)
Now, this will work but only gives me two points. If I change the range I get an error. I'm assuming this code works because my original Xo is just two states, so k goes through those two states, but that's not what I want.
If anyone could help me out that would be very helpful! If I'm making some coding error, it's honestly because I don't understand the for loop in Python too well when it comes to data analysis, looping through the rows and incrementing the columns. Thank you for your time.
Upon Request here is my full code:
import scipy.io as sc
import math as m
import numpy as np
import matplotlib.pyplot as plt
import sys
from scipy.linalg import expm, sinm, cosm
import pandas as pd
df = pd.read_excel('datatemp.xlsx')
outdoor = np.array(df[['Outdoor']])
indoor = np.array(df[['Indoor']])
###########################. FUNCTION DEFINE. #################################################
#Progress bar
def progress(count, total, status=''):
    percents = round(100.0 * count / float(total), 1)
    sys.stdout.write(' %s%s ...%s\r' % ( percents, '%', status))
    sys.stdout.flush()
#Define Matrix for Model
def Matrixbuild(c1,c2,c3):
    A = np.array([[0,1],[-c3/c1,-c2/c1]])
    B = np.array([[0],[1/c1]])
    B1 = np.array([[1],[0]])
    C = np.zeros((2,2))
    D = np.zeros((2,2))
    F = np.array([[0,1,0,1],[-c3/c1,-c2/c1,1/c1,0],[0,0,0,0],[0,0,0,0]])
    R = np.array(expm(F))
    Ae = np.array([[R.item(0),R.item(1)],[R.item(4),R.item(5)]])
    Be = np.array([[R.item(2)],[R.item(6)]])
    Be1 = np.array([[R.item(3)],[R.item(7)]])
    return Ae,Be,Be1
###########################. Data. #################################################
#USED FOR JUST TRYING WITHOUT ACTUAL DATA
# outdoor = np.array([5.8115,4.394,5.094,5.1123,5.1224])
# indoor = np.array([15.595,15.2429,15.0867,14.9982,14.8993])
###########################. Model Define. #################################################
Xo = np.zeros((2,288))
ix = np.array(indoor[0])
idx = np.array(indoor[1])
err_min = m.inf
c1spam = np.linspace(0.05,0.001,30)
c2spam = np.linspace(6.2,6.5,30)
c3spam = np.linspace(7.1,7.45,30)
totalspam = len(c1spam)*len(c2spam)*len(c3spam)
ind = 0
for c1 in c1spam:
    for c2 in c2spam:
        for c3 in c3spam:
            c4 = 1.1
            #MatrixBuild Function
            result = Matrixbuild(c1,c2,c3)
            Ae,Be,Be1 = result
            Xo = np.array([ix,idx])
            Datarange = range(0,len(outdoor)-1,1)
            for k in Datarange:
                Xo[:,k+1] = np.matmul(Ae,Xo[:,k]) + np.matmul(Be,outdoor[k+1]) + Be1*(c4/c1)
            ind = ind + 1
            print(Xo)
            err = np.linalg.norm(Xo[0,range(0,287)]-indoor.T)
            if err<err_min:
                err_min = err
                cbest = np.array([[c1],[c2],[c3],[c4]])
            progress(ind,totalspam,status='Done')
# print(X)
# print(err)
# print(cbest)
###########################. Model with Cbest Values. #################################################
c1 = cbest[0]
c2 = cbest[1]
c3 = cbest[2]
result2 = Matrixbuild(c1,c2,c3)
AeBest,BeBest,Be1Best = result2
Xo = np.array([ix,idx])
Datarange = np.arange(0,len(outdoor)-1)
for k in Datarange:
    Xo[:,k+1] = np.matmul(AeBest,Xo[:,k]) + np.matmul(BeBest,outdoor[k+1]) + Be1Best*(c4/c1)
err = np.linalg.norm(Xo[0,range(0,287)]-indoor.T)
print(cbest)
print(err)
###########################. Plots. #################################################
plt.figure(0)
time = np.linspace(1,2,2)
plt.scatter(time,X[0],s=15,c="blue")
plt.scatter(time,indoor[0:2],s=15,c="red")
plt.show()
And again my error occurs in the line with the for loop of
for k in Datarange:
    Xo[:,k+1] = np.matmul(Ae,Xo[k]) + np.matmul(Be,outdoor[k+1]) + Be1*(c4/c1)
I was trying to use np.matmul for matrix multiplication but even without it, it wasn't working.
If there are any other questions about my code please ask. Essentially I'm trying to find the best c1,c2,c3 coefficients that fit my data which is indoor temperature by using a basic second order constant coefficient model.
Have you tried with Xo[:,k+1] instead of Xo(:,k+1)? Python uses [] for slicing and indexing.
EDIT:
Xo = np.array([[ix],[idx]])
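This creates an array with a single column holding ix and idx, so there is no column k+1 to assign into. I think you're looking for something like Xo = np.zeros((2, N)), where N is the number of time steps, which will give you a 2xN array initialized to zeros that you can then fill column by column. If you don't need the zeros you can use Xo = np.empty((2, N)).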
See the docs on array creation.
After reading a little more about how Python works and how arrays/matrices are allocated, I was able to find out how to do it. I needed to allocate Xo first and then fill in the initial conditions for the for loop to work.
Xo = np.zeros((2,num2))
Xo = np.asmatrix(Xo)
Xo[0,0] = ix
Xo[1,0] = idx
Also, for the for loop, I defined the range like this,
num1 = range(0,4)
num2 = len(num1) + 1
This gives the total number of columns of Xo, which I called num2. It is defined that way because my for loop writes to column k+1, so Xo needs one more column than the loop has iterations, e.g.:
for k in num1:
    Xo[:,k+1] = Ae*Xo[:,k] + Be*outdoor[k+1] + Be1*(c4/c1)
But there it is! I figured it out by comparing MATLAB printouts to Python printouts and debugging one line at a time. Now I get exactly the same values printed in both, so it is time to start using the Python code!
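For reference, the same preallocate-then-fill pattern can be written with plain NumPy arrays and the @ operator instead of np.asmatrix. This is only a sketch under assumptions: the function name simulate and all the numbers below are made up to show the shapes, Be and Be1 are taken as 1-D arrays of length 2, and outdoor as a 1-D array.

```python
import numpy as np

def simulate(Ae, Be, Be1, outdoor, x0, c4, c1):
    """Propagate X[:, k+1] = Ae @ X[:, k] + Be*outdoor[k+1] + Be1*(c4/c1)."""
    n = len(outdoor)
    X = np.zeros((2, n))          # preallocate all columns up front
    X[:, 0] = x0                  # initial conditions go in column 0
    for k in range(n - 1):        # MATLAB k = 1:n-1 becomes 0..n-2
        X[:, k + 1] = Ae @ X[:, k] + Be * outdoor[k + 1] + Be1 * (c4 / c1)
    return X

# Made-up values, only to show the shapes involved
Ae = np.array([[1.0, 0.1], [0.0, 0.9]])
Be = np.array([0.0, 0.1])         # 1-D, so it adds to a column slice directly
Be1 = np.array([0.01, 0.0])
outdoor = np.array([5.8, 4.4, 5.1, 5.1, 5.1])
X = simulate(Ae, Be, Be1, outdoor, x0=[15.6, -0.35], c4=1.1, c1=0.05)
print(X.shape)   # (2, 5)
```

With 2x1 column arrays such as the ones Matrixbuild returns, the right-hand side would have shape (2,1) rather than (2,), so they would need flattening (for example with .ravel()) before being assigned to a column slice.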

using jacobi method to solve laplace equation PYTHON

I am fairly new to Python and am trying to recreate the electric potential in a metal box using the Laplace equation and the Jacobi method. I have written code that seems to work initially; however, I am getting the error: IndexError: index 8 is out of bounds for axis 0 with size 7 and cannot figure out why. Any help would be awesome!
from visual import*
from visual.graph import*
import numpy as np
lenx = leny = 7
delta = 2
vtop = [-1,-.67,-.33,.00,.33,.67,1]
vbottom = [-1,-.67,-.33,.00,.33,.67,1]
vleft = -1
vright = 1
vguess= 0
x,y = np.meshgrid(np.arange(0,lenx), np.arange(0,leny))
v = np.empty((lenx,leny))
v.fill(vguess)
v[(leny-1):,:] = vtop
v [:1,:] = vbottom
v[:,(lenx-1):] = vright
v[:,:1] = vleft
maxit = 500
for iteration in range (0,maxit):
    for i in range(1,lenx):
        for j in range(1,leny-1):
            v[i,j] = .25*(v[i+i][j] + v[i-1][j] + v[i][j+1] + v[i][j-1])
print v
Just from a quick glance at your code, it seems the indexing error is happening at this line, which can be changed as follows:
# you had v[i+i][j] instead of v[i+1][j]
v[i,j] = .25*(v[i+1][j] + v[i-1][j] + v[i][j+1] + v[i][j-1])
You simply added an extra i to your indexing, which goes out of range as soon as i+i exceeds 6 (i = 4 gives index 8). Note that the outer loop should also stop at lenx-1, i.e. for i in range(1,lenx-1); otherwise v[i+1] is still out of range when i = 6.
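One more point worth knowing: updating v in place inside the loop means each point already sees its freshly updated neighbours, which is the Gauss-Seidel variant rather than Jacobi. A Jacobi sweep reads only the previous iterate. Here is a minimal sketch using array slicing instead of the two inner loops; the function name jacobi_laplace is made up, and the edge values use np.linspace(-1, 1, 7) as an assumption in place of the rounded list in the question.

```python
import numpy as np

def jacobi_laplace(v, maxit=500):
    """Jacobi sweeps on the interior of v; boundary rows/columns stay fixed."""
    v = v.copy()
    for _ in range(maxit):
        old = v.copy()  # Jacobi reads only the previous iterate
        v[1:-1, 1:-1] = 0.25 * (old[2:, 1:-1] + old[:-2, 1:-1]
                                + old[1:-1, 2:] + old[1:-1, :-2])
    return v

n = 7
v = np.zeros((n, n))             # initial guess of 0 everywhere
edge = np.linspace(-1, 1, n)     # -1 ... 1 along the first and last rows
v[0, :] = edge
v[-1, :] = edge
v[:, 0] = -1
v[:, -1] = 1
result = jacobi_laplace(v)
print(np.round(result, 2))
```

With these boundary values the converged potential simply varies linearly from -1 to 1 across the columns, which makes it easy to check the result by eye.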

Nested while loop only iterates once

I have written some code which takes data from a csv file, stores it in lists, then iterates over the data returning only the information I need.
I had it working for single lists:
# Import modules
import csv
import datetime
# import numpy as np
import matplotlib.pyplot as plt
# Time code (as slow to run)
tin = []
tout = []
tin = datetime.datetime.now() #tic
plt.close()
# Assign variables
pktime = []
pkey1 = []
pkey2 = []
pkey3 = []
pkey4 = []
pkey5 = []
pkey6 = []
pkeys=[pkey1, pkey2, pkey3, pkey4, pkey5, pkey6]
delt1 = []
delt2 = []
delt3 = []
delt4 = []
delt5 = []
delt6 = []
delts=[delt1, delt2, delt3, delt4, delt5, delt6]
pkey1full=[]
pkey2full=[]
pkey3full=[]
pkey4full=[]
pkey5full=[]
pkey6full=[]
pkeyfull=[pkey1full, pkey2full, pkey3full, pkey4full, pkey5full, pkey6full]
# Read in PK weight/deltaT/time values
with open('PKweight.csv') as pkweight:
    red = csv.reader(pkweight)
    for t, pk1, pk2, pk3, pk4, pk5, pk6, dt1, dt2, dt3, dt4, dt5, dt6 in red:
        pktime.append(datetime.datetime.strptime(t,'%H:%M:%S'))
        pkey1.append(float(pk1))
        pkey2.append(float(pk2))
        pkey3.append(float(pk3))
        pkey4.append(float(pk4))
        pkey5.append(float(pk5))
        pkey6.append(float(pk6))
        delt1.append(float(dt1))
        delt2.append(float(dt2))
        delt3.append(float(dt3))
        delt4.append(float(dt4))
        delt5.append(float(dt5))
        delt6.append(float(dt6))
#calculate the pkweight for each cell, then append it to pkey*full
def pkweight1_calc():
    i=1
    while i<=(len(pkey1)-1):
        if pkey1[i] == 0.0 and pkey1[i-1]!=0.0:
            pkey1full.append(pkey1[i-2])
        i+=1
    pkey1full.reverse()
    return pkey1full
pkweight1_calc()
I had this code written out 6 times to process each of the data sets (1-6); however, I want to have it all as one function. I have tried nesting the while loop inside a for loop, but it only fills one of the lists, whichever one the initial value of j points to:
def pkweight_calc():
    i=1
    for j in range(0,5):
        while i<=(len(pkeys[j])-1):
            if (pkeys[j][i]) == 0.0 and (pkeys[j][i-1])!=0.0:
                pkeyfull[j].append(pkeys[j][i-2])
            i+=1
        pkeyfull[j].reverse()
pkweight_calc()
Can anyone help me with this? Thanks in advance!!
EDIT- updated indenting, Sorry!
Thanks for the help; I managed to find someone at work who could help me. He wasn't sure why, but changing the while loop
while i<=(len(pkeys[j])-1):
to a for loop:
for i in range(2, len(pkeys[j])):
solved it. Not sure why but it did!
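A likely explanation, judging only from the code shown: in the nested version i is set to 1 once, before the for loop over j, and never reset. After the first list has been processed, i already equals the list length, so the while condition is false for every later list (they all have the same length, since they come from the same CSV rows). A for loop over range(...) restarts i for each j, which is why it works. Separately, range(0,5) visits only five of the six lists. A minimal sketch that avoids both issues follows; passing the lists in as an argument and the sample data are assumptions made for illustration.

```python
def pkweight_calc(pkeys):
    """For each list, collect the value two steps before every drop to 0.0."""
    pkeyfull = []
    for key in pkeys:                      # visits every list, not just five
        full = []
        for i in range(2, len(key)):       # i restarts at 2 for each list
            if key[i] == 0.0 and key[i - 1] != 0.0:
                full.append(key[i - 2])
        full.reverse()
        pkeyfull.append(full)
    return pkeyfull

# Made-up data standing in for the CSV columns
sample = [[1.0, 2.0, 3.0, 0.0, 4.0, 5.0, 0.0],
          [7.0, 8.0, 0.0, 0.0, 9.0, 6.0, 0.0]]
print(pkweight_calc(sample))   # [[4.0, 2.0], [9.0, 7.0]]
```

If you prefer to keep the while loop, moving i=1 inside the for loop, just before the while, has the same effect.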

I need to vectorize the following so that the code can run faster

This portion I was able to vectorize and get rid of a nested loop.
def EMalgofast(obsdata, beta, pjt):
    n = np.shape(obsdata)[0]
    g = np.shape(pjt)[0]
    zijtpo = np.zeros(shape=(n,g))
    for j in range(g):
        zijtpo[:,j] = pjt[j]*stats.expon.pdf(obsdata,scale=beta[j])
    zijdenom = np.sum(zijtpo, axis=1)
    zijtpo = zijtpo/np.reshape(zijdenom, (n,1))
    pjtpo = np.mean(zijtpo, axis=0)
I wasn't able to vectorize the portion below. I need to figure that out
    betajtpo_1 = []
    for j in range(g):
        num = 0
        denom = 0
        for i in range(n):
            num = num + zijtpo[i][j]*obsdata[i]
            denom = denom + zijtpo[i][j]
        betajtpo_1.append(num/denom)
    betajtpo = np.asarray(betajtpo_1)
    return(pjtpo,betajtpo)
I'm guessing Python is not your first programming language, based on what I see. The reason I'm saying this is that in Python we normally don't have to manipulate indexes; you act directly on the value or the key returned. Please don't take this as an offense, I do the same coming from C++ myself. It's a hard habit to break ;).
If you're interested in performance, there is a good presentation by Raymond Hettinger on how to write Python that is both optimised and beautiful:
https://www.youtube.com/watch?v=OSGv2VnC0go
As for the code you need help with, does this help? It's unfortunately untested as I need to leave...
ref:
Iterating over a numpy array
http://docs.scipy.org/doc/numpy/reference/generated/numpy.true_divide.html
import numpy as np
from scipy import stats

def EMalgofast(obsdata, beta, pjt):
    n = np.shape(obsdata)[0]
    g = np.shape(pjt)[0]
    zijtpo = np.zeros(shape=(n,g))
    for j in range(g):
        zijtpo[:,j] = pjt[j]*stats.expon.pdf(obsdata,scale=beta[j])
    zijdenom = np.sum(zijtpo, axis=1)
    zijtpo = zijtpo/np.reshape(zijdenom, (n,1))
    pjtpo = np.mean(zijtpo, axis=0)
    #manipulating an array of numerator and denominator instead of creating objects each iteration
    num=np.zeros(g)
    denom=np.zeros(g)
    #generating the num and denom real value for the end result
    #zijtpo has shape (n,g), so x indexes the observation and y the component
    for (x,y), value in np.ndenumerate(zijtpo):
        num[y],denom[y] = num[y] + value*obsdata[x],denom[y] + value
    #dividing all at once after instead of inside the loop
    betajtpo = np.true_divide(num, denom)
    return(pjtpo,betajtpo)
Please leave me some feedback !
Regards,
Eric Lafontaine
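For a fully vectorized version of the remaining double loop, note that it computes, for each component j, a weighted mean of obsdata with weights zijtpo[:, j]. That is a matrix-vector product divided by a column sum. A minimal sketch, assuming obsdata is a 1-D array of length n and zijtpo has shape (n, g); the helper name update_beta and the sample numbers are made up:

```python
import numpy as np

def update_beta(zijtpo, obsdata):
    """Weighted mean of obsdata per component: sum_i z[i,j]*x[i] / sum_i z[i,j]."""
    num = zijtpo.T @ obsdata        # shape (g,)
    denom = zijtpo.sum(axis=0)      # shape (g,)
    return num / denom

# Made-up responsibilities (n=3 observations, g=2 components)
z = np.array([[0.2, 0.8], [0.5, 0.5], [0.9, 0.1]])
x = np.array([1.0, 2.0, 4.0])
print(update_beta(z, x))
```

Inside EMalgofast this would replace the whole betajtpo_1 loop with betajtpo = update_beta(zijtpo, obsdata).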
