Problem with a for loop with if conditions operating on a pandas DataFrame - python

I have a dataset for which I want to create a new column based on a division of two other columns, using a for loop with if-conditions.
This is the dataset, with the empty 'solo_fare' column created beforehand.
The task is to loop through each row and divide 'Fare' by 'relatives' to get the per-passenger fare. However, there are certain if-conditions to follow (passengers in this category should see per-passenger prices of between 3 and 8).
The code I have tried here doesn't seem to fill in the 'solo_fare' rows at all. It returns an empty column (same as the df above).
for i in range(0, len(fare_result)):
    p = fare_result.iloc[i]['Fare'] / fare_result.iloc[i]['relatives']
    q = fare_result.iloc[i]['Fare']
    r = fare_result.iloc[i]['relatives']
    # if relatives == 0, return original Fare amount
    if (r == 0):
        fare_result.iloc[i]['solo_fare'] = q
    # if the divided fare is below 3 or more than 8, return original Fare amount again
    elif (p < 3) or (p > 8):
        fare_result.iloc[i]['solo_fare'] = q
    # else, return the divided fare to get solo_fare
    else:
        fare_result.iloc[i]['solo_fare'] = p
How can I get this to work?

You should probably not use a loop for this, but instead just use loc.
If you first create the 'solo_fare' column and give every row the default value from Fare, you can then change the value for the conditions you have set out:
fare_result['solo_fare'] = fare_result['Fare']
fare_results.loc[(
(fare_results.Fare / fare_results.relatives) >= 3) & (
(fare_results.Fare / fare_results.relatives) <= 8), 'solo_fare'] = (
fare_results.Fare / fare_results.relatives)
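For reference, here is a minimal self-contained sketch of that idea with made-up sample data; the division by zero for relatives == 0 produces inf, which fails the between-3-and-8 test, so those rows keep the original Fare as required:

import pandas as pd

# hypothetical sample data, purely for illustration
fare_result = pd.DataFrame({'Fare': [20.0, 7.0, 30.0, 5.0],
                            'relatives': [0, 2, 10, 1]})

per_head = fare_result['Fare'] / fare_result['relatives']  # inf where relatives == 0
fare_result['solo_fare'] = fare_result['Fare']             # default: the original Fare
mask = per_head.between(3, 8)                              # accept per-passenger fares in [3, 8] only
fare_result.loc[mask, 'solo_fare'] = per_head[mask]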

Did you try to initialize the new column first?
By that I mean that the statement fare_result.iloc[i]['solo_fare'] = q
only assigns the value q to the field solo_fare of row i.
The issue is that, at this point, row i does not have any solo_fare key, so you end up filling only the last value of your table here.
To solve this issue, try declaring the solo_fare column before the for loop, like:
fare_result['solo_fare'] = np.nan
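If you do want to keep the loop, note that chained assignment such as fare_result.iloc[i]['solo_fare'] = q can end up writing to a temporary copy rather than the original frame. A rough sketch (assuming a default RangeIndex) that assigns through a single .loc call instead:

import numpy as np

fare_result['solo_fare'] = np.nan  # create the column up front

for i in range(len(fare_result)):
    q = fare_result.loc[i, 'Fare']
    r = fare_result.loc[i, 'relatives']
    p = q / r if r != 0 else q                              # keep the original Fare when relatives == 0
    fare_result.loc[i, 'solo_fare'] = p if 3 <= p <= 8 else q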

One way to do it is to define a row-wise function and apply it to the dataframe:
# row-wise function (mockup)
def foo(fare, relative):
    # your logic here. Mine just serves as an example
    if relative > 100:
        res = fare / relative
    elif relative < 10:
        res = fare
    else:
        res = 10
    return res
Then apply it to the dataframe (row-wise):
fare_result['solo_fare'] = fare_result.apply(lambda row: foo(row['Fare'], row['relatives']) , axis=1)
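Adapting that pattern to the conditions from the question could look something like this (a sketch, assuming 'Fare' and 'relatives' are numeric columns of fare_result):

def solo_fare(fare, relatives):
    # no one to split with: keep the original fare
    if relatives == 0:
        return fare
    per_head = fare / relatives
    # accept per-passenger fares between 3 and 8 only, otherwise keep the original fare
    return per_head if 3 <= per_head <= 8 else fare

fare_result['solo_fare'] = fare_result.apply(
    lambda row: solo_fare(row['Fare'], row['relatives']), axis=1)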

Related

I'm getting a different output than expected when using df.loc to change some values of the df

I have a data frame, and I want to assign a quartile number based on the quartile variable, which gives me the ranges that I later use in the for loop. The problem is that instead of just changing the quartile number, it is creating n (the length of the dataframe) rows and then using the row number for the loop.
expected result
actual output
quartile = numpy.quantile(pivot['AHT'], [0.25, 0.5, 0.75])
pivot['Quartile'] = 0
for i in range(0, len(pivot) - 1):
    if i <= quartile[0]:
        pivot.loc[i, 'Quartile'] = 1
    elif i <= quartile[1]:
        pivot.loc[i, 'Quartile'] = 2
    elif i <= quartile[2]:
        pivot.loc[i, 'Quartile'] = 3
    else:
        pivot.loc[i, 'Quartile'] = 4
Use qcut with labels=False and add 1, or specify the values of labels in a list:
pivot['Quartile'] = pd.qcut(pivot['AHT'], 4, labels=False) + 1
pivot['Quartile'] = pd.qcut(pivot['AHT'], 4, labels=[1,2,3,4])
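A small self-contained example (with made-up AHT values) showing what qcut produces; note that it bins on the AHT values themselves, which also sidesteps the original bug of comparing the loop counter i against the quantile cut points:

import pandas as pd

pivot = pd.DataFrame({'AHT': [120, 150, 180, 210, 240, 300, 360, 400]})
pivot['Quartile'] = pd.qcut(pivot['AHT'], 4, labels=False) + 1
print(pivot)
#    AHT  Quartile
# 0  120         1
# 1  150         1
# 2  180         2
# 3  210         2
# 4  240         3
# 5  300         3
# 6  360         4
# 7  400         4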

Remove following rows that are above or under by X amount from the current row['x']

I am calculating correlations and the data frame I have needs to be filtered.
I am looking to remove rows below the current row that are within X (above or below) of it, starting with the first row and looping through the dataframe all the way to the last row.
example:
df['y'] has the values 50,51,52,53,54,55,70,71,72,73,74,75
if X = 10 it would start at 50 and see 51,52,53,54,55 as within that +-10 range and delete those rows. 70 would stay as it is not within that range, and the same test would start again at 70, where 71,72,73,74,75 and their respective rows would be deleted.
The filter with X=10 would thus leave us with the rows containing 50 and 70 in df.
It would leave me with a clean dataframe that deletes the instances that are linked to the first instance of what is essentially the same observed period. I tried coding a loop to do that, but I am left with the wrong result and am desperate at this point. Hopefully someone can correct the mistake or point me in the right direction.
df6['index'] = df6.index
df6.sort_values('index')
boom = len(dataframe1.index)/3

#Taking initial comparison values from first row
c = df6.iloc[0]['index']
#Including first row in result
filters = [True]

#Skipping first row in comparisons
for index, row in df6.iloc[1:].iterrows():
    if c - boom <= row['index'] <= c + boom:
        filters.append(False)
    else:
        filters.append(True)
        # Updating values to compare based on latest accepted row
        c = row['index']

df2 = df6.loc[filters].sort_values('correlation').drop('index', 1)
df2
OUTPUT BEFORE
OUTPUT AFTER
IIUC, your main issue is to filter consecutive values within a threshold.
You can use a custom function for that which acts on a Series (= a column) and returns the list of valid indices:
def consecutive(s, threshold=10):
    prev = float('-inf')
    idx = []
    for i, val in s.iteritems():
        if val - prev > threshold:
            idx.append(i)
            prev = val
    return idx
Example of use:
import pandas as pd
df = pd.DataFrame({'y': [50,51,52,53,54,55,70,71,72,73,74,75]})
df2 = df.loc[consecutive(df['y'])]
Output:

    y
0  50
6  70
Variant
If you prefer the function to return a boolean indexer, here is a variant:
def consecutive(s, threshold=10):
    prev = float('-inf')
    idx = [False] * len(s)
    for i, val in s.iteritems():
        if val - prev > threshold:
            idx[i] = True
            prev = val
    return idx
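Usage is the same, except the boolean list is used as a mask (this assumes a default RangeIndex, since idx[i] treats the label i as a position):

df2 = df[consecutive(df['y'])]   # keeps the rows with y = 50 and y = 70, as above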

How to iterate over rows of each column in a dataframe

My current code functions and produces a graph if there is only 1 sensor, i.e. if col2 and col3 are deleted from the example data provided below, leaving one column.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
d = {'col1': [-2587.944231, -1897.324231,-2510.304231,-2203.814231,-2105.734231,-2446.964231,-2963.904231,-2177.254231, 2796.354231,-2085.304231], 'col2': [-3764.468462,-3723.608462,-3750.168462,-3694.998462,-3991.268462,-3972.878462,3676.608462,-3827.808462,-3629.618462,-1841.758462,], 'col3': [-166.1357692,-35.36576923, 321.4157692,108.9257692,-123.2257692, -10.84576923, -100.7457692, 89.27423077, -211.0857692, 101.5342308]}
df = pd.DataFrame(data=d)
sensors = 3
window_size = 5
dfn = df.rolling(window_size).corr(pairwise = True)
index = df.index #index of values in the data frame.
rows = len(index) #len(index) returns number of rows in the data.
sensors = 3
baseline_num = [0]*(rows) #baseline numerator, by default zero
baseline = [0]*(rows) #initialize baseline value
baseline = pd.DataFrame(baseline)
baseline_num = pd.DataFrame(baseline_num)
v = [None]*(rows) # Initialize an empty array v[] equal to amount of rows in .csv file
s = [None]*(rows) #Initialize another empty array for the slope values for detecting when there is an exposure
d = [0]*(rows)
sensors_on = True #Is the sensor detecting something (True) or not (False).
off_count = 0
off_require = 8 # how many offs until baseline is updated
sensitivity = 1000
for i in range(0, (rows)): #This iterates over each index value, i.e. each row, and sums the values and returns them in list format.
    v[i] = dfn.loc[i].to_numpy().sum() - sensors

for colname, colitems in df.iteritems():
    for rownum, rowitem in colitems.iteritems():
        #d[rownum] = dfone.loc[rownum].to_numpy()
        #d[colname][rownum] = df.loc[colname][rownum]
        if v[rownum] >= sensitivity:
            sensors_on = True
            off_count = 0
            baseline_num[rownum] = 0
        else:
            sensors_on = False
            off_count += 1
            if off_count == off_require:
                for x in range(0, (off_require)):
                    baseline_num[colname][rownum] += df[colname][rownum - x]
            elif off_count > off_require:
                baseline_num[colname][rownum] += baseline_num[colname][rownum - 1] + df[colname][rownum] - (df[colname][rownum - off_require]) #this loop is just an optimization, one calculation per loop once the first calculation is established
        baseline[colname][rownum] = ((baseline_num[colname][rownum])//(off_require)) #mean of the last "off_require" points
dfx = pd.DataFrame(v, columns=['Sensor Correlation']) #converts the summed correlation tables back from list format to a DataFrame, with the sole column name 'Sensor Correlation'
dft = pd.DataFrame(baseline, columns =['baseline'])
dft = dft.astype(float)
dfx.plot(figsize=(50,25), linewidth=5, fontsize=40) # plots dfx dataframe which contains correlated and summed data
dft.plot(figsize=(50,25), linewidth=5, fontsize=40)
Basically, instead of the single graph this produces, I would like to iterate over each column, but only for this loop:
for colname, colitems in df.iteritems():
    for rownum, rowitem in colitems.iteritems():
        #d[rownum] = dfone.loc[rownum].to_numpy()
        #d[colname][rownum] = df.loc[colname][rownum]
        if v[rownum] >= sensitivity:
            sensors_on = True
            off_count = 0
            baseline_num[rownum] = 0
        else:
            sensors_on = False
            off_count += 1
            if off_count == off_require:
                for x in range(0, (off_require)):
                    baseline_num[colname][rownum] += df[colname][rownum - x]
            elif off_count > off_require:
                baseline_num[colname][rownum] += baseline_num[colname][rownum - 1] + df[colname][rownum] - (df[colname][rownum - off_require]) #this loop is just an optimization, one calculation per loop once the first calculation is established
I've tried some other solutions from other questions but none of them seem to solve this case.
As of now, I've tried multiple conversions to things like lists and tuples, and then calling them something like this:
baseline_num[i,column] += d[i - x,column]
as well as
baseline_num[i][column] += d[i - x][column]
while iterating over the loop using
for column in columns
However, no matter how I arrange the solution, there is always some KeyError about expecting integer or slice indices, among other errors.
See the pictures for expected/possible outputs of one column on actual data, with varying input parameters (the sensitivity value and off_require are varied in the different cases).
One such solution which didn't work was the looping method from this link:
https://www.geeksforgeeks.org/iterating-over-rows-and-columns-in-pandas-dataframe/
I've also tried creating a loop using iteritems as the outer loop. That did not work either.
Below are links to possible graph outputs for various sensitivity values and windows in my actual dataset, with only one column (i.e. I manually deleted the other columns and plotted just the one using the current program).
sensitivity 1000, window 8
sensitivity 800, window 5
sensitivity 1500, window 5
If there's anything I've left out that would be helpful for solving this, please let me know so I can rectify it immediately.
See this picture for my original df.head:
df.head
Did you try:
for colname, colitems in df.iteritems():
    for rownum, rowitem in colitems.iteritems():
        print(df[colname][rownum])
The first loop iterates over all the columns, and the second loop iterates over all the rows for that column.
Edit:
From our conversation below, I think that your baseline and df dataframes don't have the same column names, because of how you created them and how you are accessing the elements.
My suggestion is that you create the baseline dataframe as a copy of your df dataframe and edit the information within it from there.
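Something along these lines (sketch):

# start both working frames as copies of df so they share its column names and index
baseline = df.copy()
baseline_num = df.copy()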
Edit:
I have managed to make your code work for one loop, but I run into an index error. I am not sure what your optimisation branch does, but I think that is what is causing it; take a look.
It is this part: baseline_num[colname][rownum - 1]. In the second loop, I guess, because you do rownum (0) - 1, you get index -1. You need to change it so that in the first loop rownum is 1 or something; I am not sure what you are trying to do there.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
d = {'col1': [-2587.944231, -1897.324231,-2510.304231,-2203.814231,-2105.734231,-2446.964231,-2963.904231,-2177.254231, 2796.354231,-2085.304231], 'col2': [-3764.468462,-3723.608462,-3750.168462,-3694.998462,-3991.268462,-3972.878462,3676.608462,-3827.808462,-3629.618462,-1841.758462,], 'col3': [-166.1357692,-35.36576923, 321.4157692,108.9257692,-123.2257692, -10.84576923, -100.7457692, 89.27423077, -211.0857692, 101.5342308]}
df = pd.DataFrame(data=d)
sensors = 3
window_size = 5
dfn = df.rolling(window_size).corr(pairwise = True)
index = df.index #index of values in the data frame.
rows = len(index) #len(index) returns number of rows in the data.
sensors = 3
baseline_num = [0]*(rows) #baseline numerator, by default zero
baseline = [0]*(rows) #initialize baseline value
baseline = pd.DataFrame(df)
baseline_num = pd.DataFrame(df)
#print(baseline_num)
v = [None]*(rows) # Initialize an empty array v[] equal to amount of rows in .csv file
s = [None]*(rows) #Initialize another empty array for the slope values for detecting when there is an exposure
d = [0]*(rows)
sensors_on = True #Is the sensor detecting something (True) or not (False).
off_count = 0
off_require = 8 # how many offs until baseline is updated
sensitivity = 1000
for i in range(0, (rows)): #This iterates over each index value, i.e. each row, and sums the values and returns them in list format.
    v[i] = dfn.loc[i].to_numpy().sum() - sensors

for colname, colitems in df.iteritems():
    #print(colname)
    for rownum, rowitem in colitems.iteritems():
        #print(rownum)
        #display(baseline[colname][rownum])
        #d[rownum] = dfone.loc[rownum].to_numpy()
        #d[colname][rownum] = df.loc[colname][rownum]
        if v[rownum] >= sensitivity:
            sensors_on = True
            off_count = 0
            baseline_num[rownum] = 0
        else:
            sensors_on = False
            off_count += 1
            if off_count == off_require:
                for x in range(0, (off_require)):
                    baseline_num[colname][rownum] += df[colname][rownum - x]
            elif off_count > off_require:
                baseline_num[colname][rownum] += baseline_num[colname][rownum - 1] + df[colname][rownum] - (df[colname][rownum - off_require]) #this loop is just an optimization, one calculation per loop once the first calculation is established
        baseline[colname][rownum] = ((baseline_num[colname][rownum])//(off_require)) #mean of the last "off_require" points
        print(baseline[colname][rownum])
dfx = pd.DataFrame(v, columns =['Sensor Correlation']) #converts the summed correlation tables back from list format to a DataFrame, with the sole column name 'Sensor Correlation'
dft = pd.DataFrame(baseline, columns =['baseline'])
dft = dft.astype(float)
dfx.plot(figsize=(50,25), linewidth=5, fontsize=40) # plots dfx dataframe which contains correlated and summed data
dft.plot(figsize=(50,25), linewidth=5, fontsize=40)
My output looks like this:
-324.0
-238.0
-314.0
-276.0
-264.0
-306.0
-371.0
-806.0
638.0
-412.0
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/pandas/core/indexes/range.py in get_loc(self, key, method, tolerance)
354 try:
--> 355 return self._range.index(new_key)
356 except ValueError as err:
ValueError: -1 is not in range
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
3 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/indexes/range.py in get_loc(self, key, method, tolerance)
355 return self._range.index(new_key)
356 except ValueError as err:
--> 357 raise KeyError(key) from err
358 raise KeyError(key)
359 return super().get_loc(key, method=method, tolerance=tolerance)
KeyError: -1
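One way to avoid that KeyError: -1 (a sketch, not tested against the real data, keeping the rest of the logic unchanged) is to guard the look-back so the loop never indexes before the start of the column, i.e. only use the window once at least off_require earlier rows exist:

# inside the row loop, replacing the off_count checks (sketch)
if off_count >= off_require and rownum >= off_require:
    if off_count == off_require:
        # first full window: sum the last off_require values explicitly
        for x in range(0, off_require):
            baseline_num[colname][rownum] += df[colname][rownum - x]
    else:
        # later rows: slide the window forward by one value
        baseline_num[colname][rownum] += (baseline_num[colname][rownum - 1]
                                          + df[colname][rownum]
                                          - df[colname][rownum - off_require])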
I don't have enough rep to comment, but below is what I was able to work out. Hope it helps!
I tried to use the to_list() function while working out an answer, and it threw me an error:
AttributeError: 'DataFrame' object has no attribute 'to_list'
So, I decided to circumvent that method and came up with this:
indexes = [x for x in df.index]
row_vals = []
for index in indexes:
    for val in df.loc[index].values:
        row_vals.append(val)
The object row_vals will contain all values in row order.
If you only want to get the row values for a particular row or set of rows, you would need to do this:
indx_subset = [`list of row indices`] #(Ex. [1, 2, 5, 6, etc...])
row_vals = []
for indx in indx_subset:
    for val in df.loc[indx].values:
        row_vals.append(val)
row_vals will then have all the row values from the specified indices.

Calculate column in Pandas Dataframe using adjacent rows without iterating through each row

I would like to see if there is a way to calculate a column in a dataframe that uses something similar to a moving average without iterating through each row.
Current working code:
def create_candles(ticks, instrument, time_slice):
    candlesticks = ticks.price.resample(time_slice, base=00).ohlc().bfill()
    volume = ticks.amount.resample(time_slice, base=00).sum()
    candlesticks['volume'] = volume
    candlesticks['instrument'] = instrument
    candlesticks['ttr'] = 0
    # candlesticks['vr_7'] = 0
    candlesticks['vr_10'] = 0
    candlesticks = calculate_indicators(candlesticks, instrument, time_slice)
    return candlesticks

def calculate_indicators(candlesticks, instrument, time_slice):
    candlesticks.sort_index(inplace=True)
    # candlesticks['rsi_14'] = talib.RSI(candlesticks.close, timeperiod=14)
    candlesticks['lr_50'] = talib.LINEARREG(candlesticks.close, timeperiod=50)
    # candlesticks['lr_150'] = talib.LINEARREG(candlesticks.close, timeperiod=150)
    # candlesticks['ema_55'] = talib.EMA(candlesticks.close, timeperiod=55)
    # candlesticks['ema_28'] = talib.EMA(candlesticks.close, timeperiod=28)
    # candlesticks['ema_18'] = talib.EMA(candlesticks.close, timeperiod=18)
    # candlesticks['ema_9'] = talib.EMA(candlesticks.close, timeperiod=9)
    # candlesticks['wma_21'] = talib.WMA(candlesticks.close, timeperiod=21)
    # candlesticks['wma_12'] = talib.WMA(candlesticks.close, timeperiod=12)
    # candlesticks['wma_11'] = talib.WMA(candlesticks.close, timeperiod=11)
    # candlesticks['wma_5'] = talib.WMA(candlesticks.close, timeperiod=5)
    candlesticks['cmo_9'] = talib.CMO(candlesticks.close, timeperiod=9)

    for row in candlesticks.itertuples():
        current_index = candlesticks.index.get_loc(row.Index)
        if current_index >= 1:
            previous_close = candlesticks.iloc[current_index - 1, candlesticks.columns.get_loc('close')]
            candlesticks.iloc[current_index, candlesticks.columns.get_loc('ttr')] = max(
                row.high - row.low,
                abs(row.high - previous_close),
                abs(row.low - previous_close))
            if current_index > 10:
                candlesticks.iloc[current_index, candlesticks.columns.get_loc('vr_10')] = candlesticks.iloc[current_index, candlesticks.columns.get_loc('ttr')] / (
                    max(candlesticks.high[current_index - 9: current_index].max(), candlesticks.close[current_index - 11]) -
                    min(candlesticks.low[current_index - 9: current_index].min(), candlesticks.close[current_index - 11]))

    candlesticks['timestamp'] = pd.to_datetime(candlesticks.index)
    candlesticks['instrument'] = instrument
    candlesticks.fillna(0, inplace=True)
    return candlesticks
In the iteration, I am calculating the True Range ('TTR') and then the Volatility Ratio ('VR_10').
TTR is calculated on every row in the DF except for the first one. It uses the previous row's close column and the current row's high and low columns.
VR_10 is calculated on every row except for the first 10. It uses the high and low columns of the previous 9 rows and the close of the 10th row back.
EDIT 2
I have tried many ways to add a text-based data frame to this question; there just doesn't seem to be a solution given the width of my frame. There is no difference between the input and output dataframes other than that the TTR and VR_10 columns are all 0s in the input and have non-zero values in the output.
an example would be this dataframe:
Is there a way I can do this without iteration?
With the nudge from Andreas to use rolling, I came to an answer:
First, I had to find out how to use rolling with multiple columns. I found that here.
I made a modification because I need to roll up, not down:
# `stride` is numpy's as_strided (e.g. from numpy.lib.stride_tricks import as_strided as stride)
def roll(df, w, **kwargs):
    df.sort_values(by='timestamp', ascending=0, inplace=True)
    v = df.values
    d0, d1 = v.shape
    s0, s1 = v.strides
    a = stride(v, (d0 - (w - 1), w, d1), (s0, s0, s1))
    rolled_df = pd.concat({
        row: pd.DataFrame(values, columns=df.columns)
        for row, values in zip(df.index, a)
    })
    return rolled_df.groupby(level=0, **kwargs)
After that, I created two functions:
def calculate_vr(window):
    return window.iloc[0].ttr / (max(window.high[1:9].max(), window.iloc[10].close) -
                                 min(window.low[1:9].min(), window.iloc[10].close))

def calculate_ttr(window):
    return max(window.iloc[0].high - window.iloc[0].low,
               abs(window.iloc[0].high - window.iloc[1].close),
               abs(window.iloc[0].low - window.iloc[1].close))
and called those functions like this:
candlesticks['ttr'] = roll(candlesticks, 3).apply(calculate_ttr)
candlesticks['vr_10'] = roll(candlesticks, 11).apply(calculate_vr)
I added timers to both approaches, and this way is roughly 3X slower than iteration.
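For comparison, here is a rough sketch of what a fully vectorized version could look like with shift/rolling on an ascending, time-sorted frame; the window offsets are my reading of the original loop and would need checking against it:

import pandas as pd

prev_close = candlesticks['close'].shift(1)

# true range: max of (high - low), |high - previous close|, |low - previous close|
candlesticks['ttr'] = pd.concat([candlesticks['high'] - candlesticks['low'],
                                 (candlesticks['high'] - prev_close).abs(),
                                 (candlesticks['low'] - prev_close).abs()], axis=1).max(axis=1)

# volatility ratio: ttr divided by the range spanned by the previous 9 highs/lows
# and the close from 11 rows back
range_hi = pd.concat([candlesticks['high'].shift(1).rolling(9).max(),
                      candlesticks['close'].shift(11)], axis=1).max(axis=1)
range_lo = pd.concat([candlesticks['low'].shift(1).rolling(9).min(),
                      candlesticks['close'].shift(11)], axis=1).min(axis=1)
candlesticks['vr_10'] = candlesticks['ttr'] / (range_hi - range_lo)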

Python - faster alternative to 'for' loops

I am trying to construct a binomial lattice model in Python. The idea is that there are multiple binomial lattices and, based on the value in a particular lattice, a series of operations is performed in other lattices.
These operations are similar to an option pricing model (cf. Black-Scholes-style models) in that calculations start at the last column of the lattice and are iterated back to the previous column one step at a time.
For example,
If I have a binomial lattice with n columns,
1. I calculate the values in the nth column for a single lattice or multiple lattices.
2. Based on these values, I update the values in the (n-1)th column in the same or other binomial lattices.
3. This process continues until I reach the first column.
So, in short, I cannot process the calculations for the whole lattice simultaneously, as the value in each column depends on the values in the next column, and so on.
From a coding perspective,
I have written a function that does the calculations for a particular column in a lattice and outputs the numbers that are used as input for the next column in the process.
def column_calc(StockPrices_col, ConvertProb_col, y_col, ContinuationValue_col, ConversionValue_col,
                coupon_dates_index, convert_dates_index, call_dates_index, put_dates_index,
                ConvertProb_col_new, ContinuationValue_col_new, y_col_new, tau, r, cs, dt,
                call_trigger, putPrice, callPrice):

    for k in range(1, n+1-tau):

        ConvertProb_col_new[n-k] = 0.5*(ConvertProb_col[n-1-k] + ConvertProb_col[n-k])

        y_col_new[n-k] = ConvertProb_col_new[n-k]*r + (1 - ConvertProb_col_new[n-k])*(r + cs)

        # Calculate the holding value
        ContinuationValue_col_new[n-k] = 0.5*(ContinuationValue_col[n-1-k]/(1+y_col[n-1-k]*dt) + ContinuationValue_col[n-k]/(1+y_col[n-k]*dt))

        # Coupon payment date
        if np.isin(n-1-tau, coupon_dates_index) == True:
            ContinuationValue_col_new[n-k] = ContinuationValue_col_new[n-k] + Principal*(1/2*c)

        # check put/call schedule
        callflag = (np.isin(n-1-tau, call_dates_index)) & (StockPrices_col[n-k] >= call_trigger)
        putflag = np.isin(n-1-tau, put_dates_index)
        convertflag = np.isin(n-1-tau, convert_dates_index)

        # if t is a call date
        if (np.isin(n-1-tau, call_dates_index) == True) & (StockPrices_col[n-k] >= call_trigger):
            node_val = max([putPrice * putflag, ConversionValue_col[n-k] * convertflag, min(callPrice, ContinuationValue_col_new[n-k])])
        # if t is not a call date
        else:
            node_val = max([putPrice * putflag, ConversionValue_col[n-k] * convertflag, ContinuationValue_col_new[n-k]])

        # 1. if Conversion happens
        if node_val == ConversionValue_col[n-k]*convertflag:
            ContinuationValue_col_new[n-k] = node_val
            ConvertProb_col_new[n-k] = 1
        # 2. if put happens
        elif node_val == putPrice*putflag:
            ContinuationValue_col_new[n-k] = node_val
            ConvertProb_col_new[n-k] = 0
        # 3. if call happens
        elif node_val == callPrice*callflag:
            ContinuationValue_col_new[n-k] = node_val
            ConvertProb_col_new[n-k] = 0
        else:
            ContinuationValue_col_new[n-k] = node_val

    return ConvertProb_col_new, ContinuationValue_col_new, y_col_new
I am calling this function for every column in the lattice through a for loop.
So essentially I am running a nested for loop for all the calculations.
My issue is: this is very slow.
The function itself doesn't take much time, but the second part, where I call the function through the for loop, is very time consuming (on average the function is called in the loop below close to 1000 or 1500 times). It takes almost 2.5 minutes to run the complete model, which is very slow from a standard modelling standpoint.
As mentioned above, most of the time is taken by the nested for loop shown below:
temp_mat = np.empty((n,3))*(np.nan)
temp_mat[:,0] = ConvertProb[:, n-1]
temp_mat[:,1] = ContinuationValue[:, n-1]
temp_mat[:,2] = y[:, n-1]
ConvertProb_col_new = np.empty((n,1))*(np.nan)
ContinuationValue_col_new = np.empty((n,1))*(np.nan)
y_col_new = np.empty((n,1))*(np.nan)
for tau in range(1, n):

    ConvertProb_col = temp_mat[:,0]
    ContinuationValue_col = temp_mat[:,1]
    y_col = temp_mat[:,2]
    ConversionValue_col = ConversionValue[:, n-tau-1]
    StockPrices_col = StockPrices[:, n-tau-1]

    out = column_calc(StockPrices_col, ConvertProb_col, y_col, ContinuationValue_col, ConversionValue_col,
                      coupon_dates_index, convert_dates_index, call_dates_index, put_dates_index,
                      ConvertProb_col_new, ContinuationValue_col_new, y_col_new, tau, r, cs, dt,
                      call_trigger, putPrice, callPrice)

    temp_mat[:,0] = out[0].reshape(np.shape(out[0])[0],)
    temp_mat[:,1] = out[1].reshape(np.shape(out[1])[0],)
    temp_mat[:,2] = out[2].reshape(np.shape(out[2])[0],)

#Final value
print(temp_mat[-1][1])
Is there any way I can reduce the time consumed in the nested for loop? Or is there any alternative that I can use instead of the nested for loop?
Please let me know. Thanks a lot!
