How can I fix Runtime error for log in python

# assumed imports: np is NumPy, sp is scipy.stats
import numpy as np
import scipy.stats as sp

num_runs = 100000
p_values = []
for i in range(1, num_runs+1):
    ornek1 = 2*np.random.normal(0,1,5)
    ornek2 = 3*np.random.normal(0,2,5)
    ornek3 = 10*np.random.normal(0,5,5)
    levene = sp.levene(ornek1,ornek2,ornek3, center = 'mean')
    if levene[1] < 0.05:
        ornek1 = np.log10(ornek1)
        ornek2 = np.log10(ornek2)
        ornek3 = np.log10(ornek3)
    else:
        continue
    anova = sp.f_oneway(ornek1,ornek2,ornek3)
    p_values.append(anova[1])
hesap = sum(map(lambda x: x<0.05, p_values))
print(hesap/100000)
RuntimeWarning: invalid value encountered in log10
ornek1 = np.log10(ornek1)
How can I fix this problem? I can't find anything. I want to transform my samples after the Levene test, but I can't. Is it related to numpy?

Just as a sample:
>>> np.random.normal(0, 1, 5)
array([ 1.68127583, -0.82660143, -1.82465141, 0.60495851, -0.90369304])
You're taking the log of negative numbers. What are you expecting to happen?
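To make the point above concrete, here is a minimal sketch of what `np.log10` does with negative inputs. The seeded generator and the shift-based workaround are assumptions added for illustration, not something from the question; whether shifting the data is statistically appropriate for the ANOVA is a separate matter.

```python
import numpy as np

rng = np.random.default_rng(0)      # seeded only to make the sketch repeatable
sample = 2 * rng.normal(0, 1, 5)    # same kind of draw as ornek1

# Silence the warning locally so the nan values can be inspected.
with np.errstate(invalid="ignore", divide="ignore"):
    logged = np.log10(sample)

# Every negative entry has turned into nan.
print(np.isnan(logged[sample < 0]).all())

# Hypothetical workaround: shift the data so its minimum becomes 1 before taking logs.
shifted = sample - sample.min() + 1.0
shifted_log = np.log10(shifted)
print(np.isfinite(shifted_log).all())
```

Both checks print `True`: the negatives become `nan`, and the shifted sample has a finite logarithm everywhere.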

Related

Python 1 of 2 stop conditions gets ignored in a loop

I am writing this simple function to use power iteration for the dominant eigenvalue. I want to put in 2 stop conditions: one for the number of iterations and one for a precision threshold. But this error calculation does not work.
What am I doing wrong here in principle?
#power ite. vanilla
import numpy as np
A = np.random.uniform(low=-5.0, high=10.0, size=[3,3])
def power_iteration(A, maxiter, threshold):
    b0 = np.random.rand(A.shape[1])
    it = 0
    error = 0
    while True:
        for i in range(maxiter):
            b1 = np.dot(A, b0)
            b1norm = np.linalg.norm(b1)
            error = np.linalg.norm(b1-b0)
            b0 = b1/b1norm
            domeig = (b0@A@b0)/np.dot(b0, b0)
        if error<threshold:
            break
        elif it>maxiter:
            break
        else:
            error = 0
            it = it + 1
    return b0, domeig, it, error
result = power_iteration(A, 10, 0.1)
result
The output shows a correct eigenvalue of ~9 and the corresponding eigenvector (I checked with numpy).
But the error is off. There is no way the length of the difference vector is 8, considering the result is very close to the actual one.
I want to calculate the error as the norm of the difference between the current eigenvector and the previous one (b0). I start with error = 0 because the first iteration is guaranteed to give a big difference if b0 is chosen randomly.
(array([ 0.06009408, 0.95411524, -0.2933476 ]),
9.001665234545708,
11,
8.001665234545815)
Tried to make a loop stop by 2 conditions. One gets ignored
Seems to work much better like this.
def power_it(matrix, iterations, threshold):
    domeigenvector = np.random.rand(matrix.shape[1])
    counter = np.random.rand(matrix.shape[1])
    it = 0
    error = 0
    for i in range(iterations):
        # use the matrix argument rather than the global A
        k1 = np.dot(matrix, domeigenvector)
        k1norm = np.linalg.norm(k1)
        domeigenvector = k1/k1norm
        # compare the normalized vector with the previous normalized vector
        error = np.linalg.norm(domeigenvector-counter)
        counter = domeigenvector
        domeigenvalue = (domeigenvector@matrix@domeigenvector)/np.dot(domeigenvector, domeigenvector)
        it = it + 1
        if error < threshold:
            break
    return domeigenvalue, domeigenvector, it
I can now use Schur deflation to calculate the rest of the eigenpairs.
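As a side note on why the original error came out near 8: it compared the unnormalized product `A @ b0` with the unit vector `b0`. Once `b0` has converged to an eigenvector with eigenvalue λ, that difference has norm |λ − 1|, which matches 9.0017 − 1 ≈ 8.0017 in the output above. A minimal sketch with a made-up diagonal matrix (the matrix and vector here are illustrative assumptions, not the asker's data):

```python
import numpy as np

# Hypothetical matrix whose dominant eigenvalue is exactly 9.
A = np.array([[2.0, 0.0],
              [0.0, 9.0]])
b0 = np.array([0.0, 1.0])  # unit eigenvector for eigenvalue 9

b1 = A @ b0

# Comparing the unnormalized product with b0 gives |9 - 1| = 8.
unnormalized_error = np.linalg.norm(b1 - b0)

# Normalizing first gives the intended convergence measure.
normalized_error = np.linalg.norm(b1 / np.linalg.norm(b1) - b0)

print(unnormalized_error, normalized_error)  # 8.0 0.0
```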

I want to convert the following MATLAB code into python? Is this the correct way?

How can I translate the parts [AIF,j]=get_AIF_j(InterpFact) and [~,j_index] = min(InterpFact-AIF_vect) correctly? And what about the remaining code? Thanks in advance.
%Matlab code
InterpFact = (fs_h/2/2)/(fd_max);
[AIF,j]=get_AIF_j(InterpFact);
function [AIF,j] = get_AIF_j (InterpFact)
    j_vect = 1:10;
    AIF_vect = floor(j_vect*InterpFact)./j_vect;
    [~,j_index] = min(InterpFact-AIF_vect);
    j = j_vect(j_index);
    AIF = AIF_vect(j_index);
end
#Python code
InterpFact = (fs_h/2/2)/(fd_max)
[AIF,j]=get_AIF_j(InterpFact)
def get_AIF_j (InterpFact):
    j_vect =np.arange(1,11)
    AIF_vect = np.floor(j_vect*InterpFact)/j_vect
    [~,j_index] = min(InterpFact-AIF_vect)
    j = j_vect[j_index]
    AIF = AIF_vect[j_index];
    return AIF,j
This MATLAB:
[~,j_index] = min(InterpFact-AIF_vect);
would be translated to Python as:
j_index = np.argmin(InterpFact-AIF_vect)
Also, …/(fd_max) can only be translated the way you did if fd_max is a scalar. A division with a matrix in MATLAB solves a system of linear equations.
I strongly recommend that you run the two pieces of code side by side with the same input, to verify that they do the same thing. You cannot go by guesses as to what a piece of code does.
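To illustrate the `min` translation discussed above: MATLAB's two-output `min` returns the smallest value and its 1-based index, while NumPy splits these into `min` and `argmin`, with a 0-based index. Both pick the first occurrence when the minimum is repeated. A small sketch with made-up numbers (the array `diff` is hypothetical):

```python
import numpy as np

# Hypothetical differences; the minimum 0.0 appears twice.
diff = np.array([0.5, 0.0, 0.25, 0.0])

smallest_value = diff.min()          # counterpart of MATLAB's first output of min
first_index = int(np.argmin(diff))   # counterpart of the second output, but 0-based

print(smallest_value, first_index)   # 0.0 1
```

Because `j_vect` is indexed with the same shifted index in Python, `j_vect[j_index]` still selects the same element as `j_vect(j_index)` does in MATLAB.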
Try this to see if it delivers what it should (I am not sure here as I am not fluent in matlab):
#Python code
import numpy as np
def get_AIF_j (InterpFact):
    j_vect = np.arange(1,11)
    AIF_vect = np.floor(j_vect*InterpFact)/j_vect
    # np.argmin returns the (0-based) index of the smallest value;
    # min() would return the value itself, not its position
    j_index = np.argmin(InterpFact-AIF_vect)
    print(j_index)
    j = j_vect[j_index]
    AIF = AIF_vect[j_index]
    return AIF, j
fs_h = 24; fd_max = 1
InterpFact = (fs_h/2/2)/(fd_max)
AIF, j = get_AIF_j(InterpFact)
print(AIF,j)
gives:
0
6.0 1

When I try to print the mean and standard deviation, I am prompted with a name error: variable not defined

I am trying to print the mean and standard deviation, but in its current form the code doesn't recognize anything inside the loop. How would I go about correcting this so that it properly displays what is intended? When I try to print the mean it says ex is not defined.
import numpy as np
p = 0.44
q = 0.56
mu_1 = 26.5
sigma = 4.3
mu_2 = 76.4
n = 7
print( 'total number of jobs =', n)
lst_times = []
j = 0
def calc_avg_std(n):
    while j < 100:
        m = np.random.binomial(n,p)
        easy_jobs = np.random.normal(mu_1,sigma,m)
        n_chall = n-m
        chall_jobs = np.random.exponential(mu_2,n_chall)
        totalTime = sum(easy_jobs) + sum(chall_jobs)
        lst_times.append(totalTime)
        j = j + 1
    ex = (mu_1 * p) + (mu_2 * q)
    ex2 = (p *((mu_1**2)))+ (q*(mu_2**2)*2)
    var = ex2-(ex**2)
    stdev = np.sqrt(var)
    return [ex , stdev]
print(' mean is',ex)
I tried this code without the def and return and it runs properly, but the professor insists that they should be used.
def is used to create a function. When you use return, you return the values to the caller.
Replace your last print with the following lines:
# call the function and keep the return values
mean, stdev = calc_avg_std(n)
# print the returned values
print(mean)
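One more thing to watch for: because `j` is assigned inside the function, Python treats it as a local variable, so reading `j < 100` before the assignment raises `UnboundLocalError`. A minimal sketch of one way to restructure the function so that it runs, assuming a `for` loop is acceptable (the `runs` parameter and the extra returned list are assumptions; the formulas are copied unchanged from the question):

```python
import numpy as np

p, q = 0.44, 0.56
mu_1, sigma, mu_2 = 26.5, 4.3, 76.4

def calc_avg_std(n, runs=100):
    """Simulate total job times and return the mean and std computed as in the question."""
    lst_times = []
    for _ in range(runs):                 # replaces the global counter j
        m = np.random.binomial(n, p)      # number of easy jobs
        easy_jobs = np.random.normal(mu_1, sigma, m)
        chall_jobs = np.random.exponential(mu_2, n - m)
        lst_times.append(easy_jobs.sum() + chall_jobs.sum())
    ex = (mu_1 * p) + (mu_2 * q)
    ex2 = (p * mu_1**2) + (q * mu_2**2 * 2)
    stdev = np.sqrt(ex2 - ex**2)
    return ex, stdev, lst_times

# Call the function and keep the returned values; ex itself stays local.
mean, stdev, times = calc_avg_std(7)
print("mean is", mean)
print("standard deviation is", stdev)
```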

How to use multiple variables in a for loop over range in Python

I can't seem to get the following for loop over range to work. This is the entire code; perhaps it is more helpful to post the entire program.
import pandas as pd
from alpha_vantage.timeseries import TimeSeries
from alpha_vantage.foreignexchange import ForeignExchange
from scipy.stats import ttest_ind
from pandas import ExcelWriter
import time
import numpy as np
import os
api_key="xxx"
cc = ForeignExchange(key=api_key, output_format="pandas", indexing_type= "date")
#ti = TechIndicators(key= api_key, output_format="pandas", indexing_type= "date")
ts = TimeSeries(key= api_key, output_format="pandas", indexing_type= "date")
filePath = r"/Users/LaCasa/PycharmProjects/Forex_Breakout_Backtest_15MIN/forex_pairs.xlsx"
filePath1 = r"/Users/LaCasa/PycharmProjects/Forex_Breakout_Backtest_15MIN/"
stocklist = pd.read_excel(filePath, engine='openpyxl')
stocklist = stocklist.head(5)
exportList = pd.DataFrame(columns=['Base','Quote'])
for i in stocklist.index:
    fx_from = str(stocklist["fx_from"][i])
    fx_to = str(stocklist["fx_to"][i])
    data_fx, meta_data_fx = cc.get_currency_exchange_intraday(from_symbol=fx_from,to_symbol=fx_to,interval='15min',
                                                              outputsize='full')
    data_fx.sort_index(inplace=True)
    total_df = data_fx
    total_df["BASE"] = fx_from
    total_df["QUOTE"] = fx_to
    total_df.rename(columns={'1. open': 'OPEN','2. high': 'HIGH','3. low': 'LOW','4. close':'CLOSE'},inplace=True)
    result = []
    train_size = 0.6
    n_forward = 5
    total_df['Forward Close'] = total_df['CLOSE'].shift(-n_forward)
    total_df['Forward Return'] = (total_df['Forward Close'] - total_df['CLOSE']) / total_df['CLOSE']
    for sma_length, sma_length2, sma_length3 in range(10, 200, 10):
        print(sma_length)
        total_df["MA1"] = round(total_df["CLOSE"].rolling(window=sma_length).mean(), 5)
        total_df["SD1"] = round(total_df["CLOSE"].rolling(window=sma_length).std(), 5)
        total_df["z1"] = round((total_df["CLOSE"].sub(total_df["MA1"])).div(total_df["SD1"]), 3)
        total_df["Z1"] = round(total_df["z1"].rolling(window=1).mean(), 3)
        total_df["MAZ1"] = round(total_df["Z1"].rolling(window=3).mean(), 5)
        total_df["SDMA1"] = round(total_df["SD1"].rolling(window=sma_length).mean(), 5)
        total_df["STD1"] = round(total_df["SD1"].rolling(window=sma_length).std(), 5)
        total_df["zd1"] = round((total_df["SD1"].sub(total_df["SDMA1"])).div(total_df["STD1"]), 3)
        total_df["ZDEV1"] = round(total_df["zd1"].rolling(window=1).mean(), 3)
        total_df["MAZDEV1"] = round(total_df["ZDEV1"].rolling(window=3).mean(), 3)
        total_df["MA2"] = round(total_df["CLOSE"].rolling(window=sma_length2).mean(), 5)
        total_df["SD2"] = round(total_df["CLOSE"].rolling(window=sma_length2).std(), 5)
        total_df["z2"] = round((total_df["CLOSE"].sub(total_df["MA2"])).div(total_df["SD2"]), 3)
        total_df["Z2"] = round(total_df["z2"].rolling(window=1).mean(), 3)
        total_df["MAZ2"] = round(total_df["Z2"].rolling(window=3).mean(), 5)
        total_df["SDMA2"] = round(total_df["SD2"].rolling(window=sma_length2).mean(), 5)
        total_df["STD2"] = round(total_df["SD2"].rolling(window=sma_length2).std(), 5)
        total_df["zd2"] = round((total_df["SD2"].sub(total_df["SDMA2"])).div(total_df["STD2"]), 3)
        total_df["ZDEV2"] = round(total_df["zd2"].rolling(window=1).mean(), 3)
        total_df["MAZDEV2"] = round(total_df["ZDEV2"].rolling(window=3).mean(), 3)
        total_df["MA3"] = round(total_df["CLOSE"].rolling(window=sma_length3).mean(), 5)
        total_df["SD3"] = round(total_df["CLOSE"].rolling(window=sma_length3).std(), 5)
        total_df["z3"] = round((total_df["CLOSE"].sub(total_df["MA3"])).div(total_df["SD3"]), 3)
        total_df["Z3"] = round(total_df["z3"].rolling(window=1).mean(), 3)
        total_df["MAZ3"] = round(total_df["Z3"].rolling(window=3).mean(), 5)
        total_df["SDMA3"] = round(total_df["SD3"].rolling(window=sma_length3).mean(), 5)
        total_df["STD3"] = round(total_df["SD3"].rolling(window=sma_length3).std(), 5)
        total_df["zd3"] = round((total_df["SD3"].sub(total_df["SDMA3"])).div(total_df["STD3"]), 3)
        total_df["ZDEV3"] = round(total_df["zd3"].rolling(window=1).mean(), 3)
        total_df["MAZDEV3"] = round(total_df["ZDEV3"].rolling(window=3).mean(), 3)
        # BREAKOUT
        total_df['input1'] = [int(x) for x in total_df['Z1'] > 2]
        total_df['input2'] = [int(x) for x in total_df['Z1'].shift(1) < 2]
        # VOLATILITY
        total_df['input3'] = [int(x) for x in (total_df['ZDEV2'] > total_df['MAZDEV2'])]
        #total_df = total_df.dropna(subset=["MAZDEV2"], inplace=False)
        #VOLATILITY #2
        total_df['input4'] = [int(x) for x in (total_df['ZDEV3'] > total_df['MAZDEV3'])]
        total_df['input5'] = [int(x) for x in total_df['ZDEV3'] < 1]
        #
        # #TREND
        total_df['input6'] = [int(x) for x in (total_df['Z3'] > total_df['MAZ3'])]
        #total_df['input7'] = [int(x) for x in total_df['Z3'] > 1]
        print(total_df['input4'])
        training = total_df.head(int(train_size * total_df.shape[0]))
        test = total_df.tail(int((1 - train_size) * total_df.shape[0]))
        tr_returns = training[training['input1' and 'input2' and 'input3' and 'input4' and 'input5' and 'input6'] == 1]['Forward Return']
        test_returns = test[test['input1' and 'input2' and 'input3' and 'input4' and 'input5' and 'input6'] == 1]['Forward Return']
        mean_forward_return_training = tr_returns.mean()
        mean_forward_return_test = test_returns.mean()
        pvalue = ttest_ind(tr_returns, test_returns, equal_var=False)[1]
        result.append({
            'base': fx_from,
            'quote': fx_to,
            'sma_length': sma_length,
            'sma_length2': sma_length2,
            'sma_length3': sma_length3,
            'training_forward_return': mean_forward_return_training,
            'test_forward_return': mean_forward_return_test,
            'p-value': pvalue
        })
    result.sort(key=lambda x: -x['training_forward_return'])
    print(result[0])
    time.sleep(15)
newFile = os.path.dirname(filePath1) + "/period.xlsx"
writer = ExcelWriter(newFile)
total_df.to_excel(writer, "Sheet1", float_format="%.7f")
writer.save()
error: TypeError: cannot unpack non-iterable int object
Ideally I would like to find the best rolling window for each of the Z-score formulas you see above, but I don't know how to make the loop work.
As the error says, you are trying to unpack a single integer into 3 values.
A range object such as range(20, 50) yields only a single integer on each iteration:
for i in range(20, 50):
    do_something(i)
    do_something_else(i)
    do_a_third_thing(i)
    # i is the same single integer in each case
Other than that I'm not sure what you are trying to do. If you need 3 different values you could use an iterator with 3 different values. You could do something like:
for i, j, k in [(a1, b1, c1), (a2, b2, c2), ...]:
    ...
but there have to be three values to 'unpack'.
EDIT:
As far as I can see from your script there might be two things you could try...
You could remove the sma_length2 and sma_length3 variables entirely: a for loop iterates over every value it is given, so you will get results for every window length in the range you define.
Something like this:
for sma_length in range(10, 201): # remember that with the iterable returned from range, the last value will not be included.
    print(sma_length)
    total_df["MA1"] = round(total_df["CLOSE"].rolling(window=sma_length).mean(), 5)
    total_df["SD1"] = round(total_df["CLOSE"].rolling(window=sma_length).std(), 5)
    total_df["z1"] = round((total_df["CLOSE"].sub(total_df["MA1"])).div(total_df["SD1"]), 3)
    total_df["Z1"] = round(total_df["z1"].rolling(window=1).mean(), 3)
    total_df["MAZ1"] = round(total_df["Z1"].rolling(window=3).mean(), 5)
    total_df["SDMA1"] = round(total_df["SD1"].rolling(window=sma_length).mean(), 5)
    total_df["STD1"] = round(total_df["SD1"].rolling(window=sma_length).std(), 5)
    total_df["zd1"] = round((total_df["SD1"].sub(total_df["SDMA1"])).div(total_df["STD1"]), 3)
    total_df["ZDEV1"] = round(total_df["zd1"].rolling(window=1).mean(), 3)
    total_df["MAZDEV1"] = round(total_df["ZDEV1"].rolling(window=3).mean(), 3)
Since you are testing all of those values between 10 and 200 anyway with this loop, I'm not sure why you need the other 2 sma_length variables.
One thing to note separately about the code above is that you have some "magic numbers" which will be the same on every iteration.
When you write "window=3" or "window=1" for example, this will never change and you are just wastefully recalculating the same value for every loop.
If, however, you actually want 3 different sma_lengths at the same time, you could use the built-in zip() function to create an iterable as described above (a sequence of tuples of length 3).
You could do something like this:
iterable = zip(range(100), range(100, 200), range(200, 300))
for a, b, c in iterable:
    print((a, b, c))
# (0, 100, 200)
# (1, 101, 201)
# (2, 102, 202)
# ... etc.
But as I said, I think you can do what you require with only one variable as in the previous example. Hope this helps.
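If the goal is instead to try every combination of three window lengths (which `zip` does not do, since it only pairs values position by position), `itertools.product` from the standard library is one option. This is a suggestion beyond the answer above, and the short range used here is an assumption chosen to keep the output small:

```python
from itertools import product

# Hypothetical candidate window lengths: 10, 20, 30.
windows = range(10, 40, 10)

# Every (sma_length, sma_length2, sma_length3) combination: 3 ** 3 = 27 tuples.
combos = list(product(windows, repeat=3))

for sma_length, sma_length2, sma_length3 in combos[:3]:
    print(sma_length, sma_length2, sma_length3)
# 10 10 10
# 10 10 20
# 10 10 30

print(len(combos))  # 27
```

Note that the number of combinations grows as the cube of the number of candidates, so `range(10, 200, 10)` with its 19 values would already give 6859 iterations per currency pair.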

Codechef - NZEC Error in python code

The code runs fine on my machine, but when I compile it on CodeChef it gives an NZEC (Runtime Error).
The link to the problem : https://www.codechef.com/problems/PPTEST
About my solution: I have calculated the percentile of each test case based on their time and point values. Then I have sorted the entries in each test case based on the percentile.
import sys
def check(x):
    if not(x in range(1,100)):
        sys.exit(1)
T = input()
check(T)
N_W = []
C_P_T = {}
tp = []
tt = []
for i in range(0,T):
    tp.append(0)
    tt.append(0)
    N_W.append(map(int, raw_input().split()))
    check(N_W[i][0])
    check(N_W[i][1])
    C_P_T[i] = []
    for j in range(0,N_W[i][0]):
        C_P_T[i].append(map(int, raw_input().split()))
        check(C_P_T[i][j][0])
        check(C_P_T[i][j][1])
        check(C_P_T[i][j][2])
        C_P_T[i][j].append(N_W[i][1]-C_P_T[i][j][2])
        C_P_T[i][j].append(C_P_T[i][j][1]*C_P_T[i][j][0])
        C_P_T[i][j].pop(0)
        C_P_T[i][j].pop(0)
        C_P_T[i][j].pop(0)
        tp[i]+= C_P_T[i][j][1]
        tt[i]+=C_P_T[i][j][0]
for i in range(0,T):
    C_P_T[i].sort(key = lambda x : x[0] , reverse = True)
    item_time = C_P_T[i][0][0]
    percentile_time = (C_P_T[i][0][0]/float(tt[i]))*((len(C_P_T[i])-1)/float(len(C_P_T[i])))
    for j in range(0,N_W[i][0]):
        if C_P_T[i][j][0] == item_time:
            C_P_T[i][j].append(percentile_time)
        else:
            item_time = C_P_T[i][j][0]
            percentile_time = (C_P_T[i][j][0]/float(tt[i]))*((len(C_P_T[i])-j-1)/float(len(C_P_T[i])))
            C_P_T[i][j].append(percentile_time)
for i in range(0,T):
    C_P_T[i].sort(key = lambda x : x[1] , reverse = True)
    item_points = C_P_T[i][0][1]
    percentile_points = (C_P_T[i][0][1]/float(tp[i]))*((len(C_P_T[i])-1)/float(len(C_P_T[i])))
    for j in range(0,N_W[i][0]):
        if C_P_T[i][j][1] == item_points:
            C_P_T[i][j].append(percentile_points)
        else:
            item_points = C_P_T[i][j][1]
            percentile_points = ((C_P_T[i][j][1])/float(tp[i]))*((len(C_P_T[i])-j-1)/float(len(C_P_T[i])))
            C_P_T[i][j].append(percentile_points)
        C_P_T[i][j].append(C_P_T[i][j][2]+C_P_T[i][j][3])
        C_P_T[i][j].append(N_W[i][1]-C_P_T[i][j][0])
        C_P_T[i][j].pop(2)
        C_P_T[i][j].pop(2)
    C_P_T[i].sort(key = lambda x : x[2],reverse = True)
for i in range(0,T):
    points = 0
    for j in range(0,N_W[i][0]):
        if N_W[i][1]-C_P_T[i][j][3] >= 0:
            points+=C_P_T[i][j][1]
            N_W[i][1]-=C_P_T[i][j][3]
    print points
NZEC means "non-zero exit code", so that is probably happening in sys.exit(1) in your check() function. What you are receiving from input() is either not an integer or not in the right range.
Update: I notice that you use range(1, 100) for validity testing.
But the problem description at codechef states that 1 ≤ T ≤ 100. That is equivalent to range(1, 101).
So, codechef could be passing your code a perfectly valid 100, and your code would reject it, probably with the exact error you are seeing.
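The off-by-one can be checked directly. This is a small sketch in Python 3 (the question's code is Python 2), and the two helper names are hypothetical, standing in for the original check() and a corrected version:

```python
def check_old(x):
    # range(1, 100) covers 1..99, so 100 is rejected
    return x in range(1, 100)

def check_fixed(x):
    # matches the stated constraint 1 <= T <= 100
    return 1 <= x <= 100

print(check_old(100), check_fixed(100))  # False True
print(check_old(99), check_fixed(99))    # True True
```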
