How to price a SimpleCashFlow - python

I would like to use QuantLib to price a portfolio of liabilities, which are modeled to be deterministic future cash-flows. I am now modelling them as a strip of FixedRateBonds with zero coupons, which seems like a very inelegant solution.
Problem:
Question 1: Is there a way to create an 'Instrument' that is just a 'SimpleCashFlow', 'Redemption' etc. and price it on a discount curve?
Question 2: Is it possible to construct a 'CashFlows' object or Instrument from multiple SimpleCashFlow's and price it on a curve?
Many thanks in advance
Code Example:
See code below for an example of what I am trying to do.
from QuantLib import *
# set params
calc_date = Date(30, 3, 2017)
risk_free_rate = 0.01
discount_curve = YieldTermStructureHandle(
    FlatForward(calc_date, risk_free_rate, ActualActual()))
bond_engine = DiscountingBondEngine(discount_curve)
# characteristics of the cash-flow that I am trying to NPV
paymentdate = Date(30, 3, 2018)
paymentamount = 1000
# this works: pricing a fixed rate bond with no coupons
schedule = Schedule(paymentdate-1, paymentdate, Period(Annual), TARGET(),
                    Unadjusted, Unadjusted, DateGeneration.Backward, False)
fixed_rate_bond = FixedRateBond(0, paymentamount, schedule, [0.0],ActualActual())
bond_engine = DiscountingBondEngine(discount_curve)
fixed_rate_bond.setPricingEngine(bond_engine)
print(fixed_rate_bond.NPV())
# create a simple cashflow
simple_cash_flow = SimpleCashFlow(paymentamount, paymentdate)
# Q1: how to create instrument, set pricing engine and price a SimpleCashFlow?
#wrongcode:# simple_cash_flow.setPricingEngine(bond_engine)
#wrongcode:# print(simple_cash_flow.NPV())
# Q2: can I stick multiple cashflows into a single instrument, e.g.:
# how do I construct and price a CashFlows object from multiple 'SimpleCashFlow's?
simple_cash_flow2 = SimpleCashFlow(paymentamount, Date(30, 3, 2019))
#wrongcode:# cashflows_multiple = CashFlows([simple_cash_flow, simple_cash_flow2])
#wrongcode:# cashflows_multiple.setPricingEngine(bond_engine)
#wrongcode:# print(cashflows_multiple.NPV())

There are a couple of possible approaches. If you want to use an instrument, you can use a ZeroCouponBond instead of the fixed-rate one you're currently using:
bond = ZeroCouponBond(0, TARGET(), paymentamount, paymentdate)
bond.setPricingEngine(bond_engine)
print(bond.NPV())
Using an instrument will give you notifications and recalculation if the discount curve were to change, but might be overkill if you want a single pricing. In that case, you might work directly with the cashflows by using the methods of the CashFlows class:
cf = SimpleCashFlow(paymentamount, paymentdate)
print(CashFlows.npv([cf], discount_curve, True))
where the last parameter is True if you want to include any cashflow happening on today's date and False otherwise (note that this will give you a result a bit different from your calculation; that's because the payment date you used is a TARGET holiday, and the FixedRateBond constructor adjusts it to the next business day).
The above also works with several cash flows:
cfs = [SimpleCashFlow(paymentamount, paymentdate),
       SimpleCashFlow(paymentamount*0.5, paymentdate+180),
       SimpleCashFlow(paymentamount*2, paymentdate+360)]
print(CashFlows.npv(cfs, discount_curve, True))
Finally, if you want to do the same with an instrument, you can use the base Bond class and pass the cashflows directly:
custom_bond = Bond(0, TARGET(), 100.0, Date(), Date(), cfs)
custom_bond.setPricingEngine(bond_engine)
print(custom_bond.NPV())
This works, but it is kind of a kludge: the bond uses the passed cashflows directly and ignores the passed face amount and maturity date.
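As a sanity check on the numbers, the discounting itself can be sketched in plain Python without QuantLib. This is only an illustration under stated assumptions: the flat curve is taken to be continuously compounded, and the year fraction uses Actual/365 (Fixed) rather than the ActualActual() day counter from the question, so the result may differ slightly from what QuantLib prints. The function names are made up for this sketch.

```python
from datetime import date
from math import exp


def discount_factor(calc_date, pay_date, rate):
    """Continuously compounded discount factor.

    Assumption: Actual/365 (Fixed) year fraction, not QuantLib's ActualActual().
    """
    t = (pay_date - calc_date).days / 365.0
    return exp(-rate * t)


def npv(cashflows, calc_date, rate):
    """Sum of discounted amounts; cash flows dated before calc_date are skipped."""
    return sum(
        amount * discount_factor(calc_date, pay_date, rate)
        for pay_date, amount in cashflows
        if pay_date >= calc_date
    )


calc_date = date(2017, 3, 30)
# The two deterministic liabilities from the question, as (date, amount) pairs
cashflows = [
    (date(2018, 3, 30), 1000.0),
    (date(2019, 3, 30), 1000.0),
]
print(round(npv(cashflows, calc_date, 0.01), 2))
```

Each cash flow contributes amount * exp(-r * t), which is the same sum that the cash-flow based pricing above performs against the discount curve.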

Related

How to save results (and recall them when needed) of a simulation in Python?

I started an actuarial project in Python (based on the idea shown in this model) in which I want to simulate, from a set of inputs and with some degree of internal randomness (as done here: https://github.com/Saurabh0503/Financial-modelling-and-valuationn/blob/main/Dynamic%20Salary%20Retirement%20Model%20Internal%20Randomness.ipynb), how long it will take an individual to retire, given a certain amount of wealth, a certain annual salary and a certain annual payment (calculated as the desired cash divided by the years needed to retire). In my variation of the model, the user can define his/her own parameters, which makes the model more flexible and user friendly, and there is a function that calculates the desired retirement cash based on the individual's propensity both to save and to spend.
The problem is that I want to summarize the output of the model (by taking the mean, max, min and standard deviation of wealth, salary and years to retirement), so I have to save the results and recall them when needed, but I have no idea how to accomplish this.
I tried the following solution, which saves the simulation's output in a pandas dataframe. In particular, I wrote this function:
def get_salary_wealth_year_case_df(data):
    all_ytrs = []
    salary = []
    wealth = []
    annual_payments = []
    for i in range(data.n_iter):
        ytr = years_to_retirement(data, print_output=False)
        sal = salary_at_year(data, year, case, print_output=False)
        wlt = wealth_at_year(data, year, prior_wealth, case, print_output=False)
        pmt = annual_pmts_case_df(wealth_at_year, year, case, print_output=False)
        all_ytrs.append(ytr)
        salary.append(sal)
        annual_payments.append(pmt)
    df = pd.DataFrame()
    df['Years to Retirement'] = all_ytrs
    df['Salary'] = sal
    df['Wealth'] = wlt
    df['Annual Payments'] = pmt
    return df
I need feedback on what I'm doing. Am I doing it right? If so, are there more efficient ways to do it? If not, what should I do? Thanks in advance!
Given the inputs used for the function, I'm assuming your code (as it is) will do just fine in terms of computation speed.
As suggested, you can add a saving option to your function so the results that are being returned are stored in a .csv file.
def get_salary_wealth_year_case_df(data, path):
    all_ytrs = []
    salary = []
    wealth = []
    annual_payments = []
    # NOTE: year, case and prior_wealth are not defined in this function (as in the question);
    # they are assumed to exist in the enclosing scope
    for i in range(data.n_iter):
        ytr = years_to_retirement(data, print_output=False)
        sal = salary_at_year(data, year, case, print_output=False)
        wlt = wealth_at_year(data, year, prior_wealth, case, print_output=False)
        pmt = annual_pmts_case_df(wealth_at_year, year, case, print_output=False)
        all_ytrs.append(ytr)
        salary.append(sal)
        wealth.append(wlt)
        annual_payments.append(pmt)
    # Build the dataframe from the collected lists, one row per iteration
    # (the question assigned only the last sal, wlt and pmt to every row)
    df = pd.DataFrame()
    df['Years to Retirement'] = all_ytrs
    df['Salary'] = salary
    df['Wealth'] = wealth
    df['Annual Payments'] = annual_payments
    # Save the dataframe to a given path inside your workspace
    df.to_csv(path, header=False)
    return df
After saving, returning the object might be optional. This depends on whether you are going to use this dataframe in your code moving forward.
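For the "recall and summarize" half of the question, here is a minimal sketch of the full round trip using only the standard library. The numbers are made up and merely stand in for simulation output, and the helper names (save_results, load_column, summarize) are hypothetical. With pandas the equivalent would be df.to_csv(path) followed later by pd.read_csv(path) and df.describe(). Keeping the header row in the file makes it easier to recall columns by name.

```python
import csv
import os
import statistics
import tempfile


def save_results(rows, path):
    """Write one simulation run per row, with a header row for the column names."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


def load_column(path, column):
    """Read a single column back from the saved file as floats."""
    with open(path, newline="") as f:
        return [float(row[column]) for row in csv.DictReader(f)]


def summarize(values):
    """Mean, max, min and sample standard deviation of a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "max": max(values),
        "min": min(values),
        "std": statistics.stdev(values),
    }


# Made-up values standing in for three simulation iterations
rows = [
    {"Years to Retirement": 28, "Salary": 81000.0, "Wealth": 1510000.0},
    {"Years to Retirement": 30, "Salary": 79500.0, "Wealth": 1495000.0},
    {"Years to Retirement": 32, "Salary": 83000.0, "Wealth": 1502000.0},
]
path = os.path.join(tempfile.gettempdir(), "retirement_runs.csv")
save_results(rows, path)

# Later (even in another session): recall the saved results and summarize them
summary = summarize(load_column(path, "Years to Retirement"))
print(summary)
```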

Concatenate Instance attribute using class with input

I am super new to Python, and so also new to OOP and classes (I am originally a MATLAB user, as an engineer...), so please teach me as much as possible.
Anyways I am trying to do the following.
Create a class called Stock - something like below
class Stock:
    def __init__(self, estimate, earning):
        self.estimate = estimate # estimation of quarterly earnings
        self.earning = earning # actual quarterly earnings
JPM = Stock(11.7, 10.9) # JPM is the JP Morgan stock name
However, the estimate and earning values are reported every quarter and I want to create a numerical vector for each. The idea is like below, but of course it does not work.
JPM.estimate(1) = 11.9 # the second quarter earnings value at index 1 of the estimate
JPM.estimate(2) = 12.1 # the third quarter earnings value at index 2 of the estimate
JPM.estimate(3) = XX.XX # and so on.
Using .estimate(#) is just to show what I want to do. Using .append() or other methods you would like to teach me is fine.
The reason I am trying to do it this way is because I need 3 vectors for one stock(and I have about 1000 stocks so at the end I would have 3000 vectors to take care of). So I am planning on creating an instance of a stock and having 3 vectors as instance attributes. (Hopefully I got the terminology right.)
earnings vector
estimate vector
the date those earnings were reported.
Am I using the class function wrong(as it was never intended to be used this way?) or what can I do to achieve such concatenation for instance attributes as the data are received from web scraping?
It is not at all clear what you are trying to do with the Stock Class, but if all you want to do is create a list of stock price and earnings organized by date, you could do the following :
from collections import namedtuple, defaultdict
# Create an easily referenced tuple for defining stock data
StockData = namedtuple('StockData', ['date', 'earn', 'est'])
class Stock:
    def __init__(self, data: StockData) -> None:
        self._quotes = defaultdict()
        self._quotes[data.date] = (data.earn, data.est)
    def add(self, data: StockData) -> None:
        self._quotes[data.date] = (data.earn, data.est)
    def value(self, date: str) -> tuple:
        # return tuple of (Earnings, Estimate) for date if it exists, else KeyError
        return self._quotes[date]
    def __repr__(self):
        return str(self._quotes)
To load the stock class with data, you can do something along the lines of:
stk = Stock(StockData('1/20/2021', 123.5, 124.0))
stk.add(StockData('6/23/2021', 132.7, 119.4))
Then print(stk) yields:
defaultdict(None, {'1/20/2021': (123.5, 124.0), '6/23/2021': (132.7, 119.4)})
and stk.value('1/20/2021') yields (123.5, 124.0).
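If you would rather keep the three vectors from the question as instance attributes, a minimal sketch could look like the following. The attribute and method names (dates, estimates, earnings, add_quarter) are only illustrative, and the dates and the second earnings figure are made up. Note that Python lists are indexed with square brackets, not with parentheses as in MATLAB.

```python
class Stock:
    """Holds three parallel lists that grow by one entry per reported quarter."""

    def __init__(self, ticker):
        self.ticker = ticker
        self.dates = []      # date each quarter was reported
        self.estimates = []  # estimated quarterly earnings
        self.earnings = []   # actual quarterly earnings

    def add_quarter(self, date, estimate, earning):
        # Append to all three lists together so the indices stay aligned
        self.dates.append(date)
        self.estimates.append(estimate)
        self.earnings.append(earning)


jpm = Stock("JPM")
jpm.add_quarter("2021-01-15", 11.7, 10.9)
jpm.add_quarter("2021-04-14", 11.9, 12.3)

print(jpm.estimates[1])  # second quarter estimate, index 1

# With about 1000 stocks, a dict keyed by ticker avoids 1000 separate variables
stocks = {jpm.ticker: jpm}
print(stocks["JPM"].earnings)
```

Appending through a single method keeps the three lists the same length, which matters when the data arrives piecemeal from web scraping.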

How to filter Django objects based on value returned by a method?

I have a Django object with a method get_volume_sum(self) that returns a float, and I would like to query the top n objects with the highest value. How can I do that?
For example I could do a loop like this but I would like a more elegant solution.
vol = []
obj = []
for m in Market.objects.filter(**args): # 3000 objects
    sum = m.get_volume_sum()
    vol.append(sum)
    obj.append(m.id)
top = search_top_n(obj, vol, n)
And this is what the method looks like:
# return sum of volume over last hours
def get_volume_sum(self, hours):
    return Candle.objects.filter(market=self,
        dt__gte=timezone.now()-timedelta(hours=hours)
        ).aggregate(models.Sum('vo'))
From what I see here even with Python there isn't a single line solution.
You should not filter with the method, since this results in an N+1 problem: for 3'000 Market objects, it will generate an additional 3'000 queries to obtain the volumes.
You can do this in bulk with .annotate(…):
from datetime import timedelta
from django.db.models import Sum
from django.utils import timezone
hours = 12 # some value for hours
Market.objects.filter(
    **args,
    candle__dt__gte=timezone.now()-timedelta(hours=hours),
).annotate(
    candle_vol=Sum('candle__vo')
).order_by('-candle_vol')
Here there is however a small caveat: if there is no related Candle, then these Markets will be filtered out. We can prevent that by allowing also Markets without Candles with:
from datetime import timedelta
from django.db.models import Q, Sum
from django.utils import timezone
hours = 12 # some value for hours
Market.objects.filter(
    Q(candle__dt__gte=timezone.now()-timedelta(hours=hours)) |
    Q(candle=None),
    **args
).annotate(
    candle_vol=Sum('candle__vo')
).order_by('-candle_vol')
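To get only the top n from the ordered queryset, slicing it (for example [:n]) limits the result on the database side. If the volumes are already in Python for some other reason, the search_top_n helper from the question (which is not shown there) could be replaced by something like this sketch based on heapq.nlargest. The function name top_n and the sample data are made up for illustration.

```python
import heapq


def top_n(volume_by_id, n):
    """Return the n (id, volume) pairs with the highest volume, largest first."""
    return heapq.nlargest(n, volume_by_id.items(), key=lambda item: item[1])


# Made-up market ids and volume sums
volumes = {"btc": 120.5, "eth": 98.0, "ltc": 12.25, "xrp": 40.0}
print(top_n(volumes, 2))  # [('btc', 120.5), ('eth', 98.0)]
```

This avoids sorting the whole collection when n is small, but it still needs every volume to be loaded first, so the annotated query above remains the better fit for 3'000 objects.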

Alpha Vantage stockinfo only collects 4 dfs properly formatted, not 6

I can get 4 tickers of stockinfo from Alpha Vantage before the rest of the DataFrames are not getting the stockinfo I ask for. So my resulting concatenated df gets interpreted as Nonetype (because the 4 first dfs are formatted differently than the last 2). This is not my problem. The fact that I only get 4 of my requests is... If I can fix that - the resulting concatenated df will be intact.
My code
import pandas as pd
import datetime
import requests
from alpha_vantage.timeseries import TimeSeries
import time
tickers = []
def alvan_csv(stocklist):
    api_key = 'demo' # For use with Alpha Vantage stock-info retrieval.
    for ticker in stocklist:
        #data=requests.get('https://www.alphavantage.co/query?function=TIME_SERIES_DAILY_ADJUSTED&symbol=%s&apikey={}'.format(api_key) %(ticker))
        df = pd.read_csv('https://www.alphavantage.co/query?function=TIME_SERIES_DAILY_ADJUSTED&datatype=csv&symbol=%s&apikey={}'.format(api_key) %(ticker))#, index_col = 0) &outputsize=full
        df['ticker'] = ticker
        tickers.append(df)
        # concatenate all the dfs
        df = pd.concat(tickers)
        print('\ndata before json parsing for === %s ===\n%s' %(ticker,df))
        df['adj_close'] = df['adjusted_close']
        del df['adjusted_close']
        df['date'] = df['timestamp']
        del df['timestamp']
        df = df[['date','ticker','adj_close','volume','dividend_amount','split_coefficient','open','high','low']] #
        df=df.sort_values(['ticker','date'], inplace=True)
        time.sleep(20.3)
        print('\ndata after col reshaping for === %s ===\n%s' %(ticker,df))
    return df
if __name__ == '__main__':
    stocklist = ['vws.co','nflx','mmm','abt','msft','aapl']
    df = alvan_csv(stocklist)
NB. Please note that to use the Alpha Vantage API, you need a free API key, which you can obtain here: https://www.alphavantage.co/support/#api-key
Replace the demo API Key with your API Key to make this code work.
Any ideas as to get this to work?
Apparently Alpha Vantage has a pretty low fair-usage allowance, measured in number of queries per minute. So in effect only the first 4 stocks are allowed at full speed. The rest of the stocks need to pause before downloading so as not to violate the fair-usage policy.
I have now introduced a pause between my stock-queries. At the moment I get approx 55% of my stocks, if I pause for 10 sec. between calls, and 100% if I pause for 15 seconds.
I will be testing exactly how low the pause can be set to allow for 100% of stocks to come through.
I must say compared to the super high-speed train we had at finance.yahoo.com, this strikes me as steam-train. Really really slow downloads. To get my 500 worth of tickers it takes me 2½ hours. But I guess beggars can't be choosers. This is a free service and I will manage with this.
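The pause-between-calls idea can be sketched independently of the actual download code. In this sketch the fetch function is a stand-in (in practice it would wrap the pd.read_csv call from the question), and the sleep function is passed in so the loop can be tried without actually waiting. The names fetch_all and estimated_minutes are made up.

```python
import time


def fetch_all(tickers, fetch, pause_seconds=15.0, sleep=time.sleep):
    """Call fetch(ticker) for each ticker, pausing between consecutive calls."""
    results = {}
    for i, ticker in enumerate(tickers):
        if i > 0:
            sleep(pause_seconds)  # no pause is needed before the first request
        results[ticker] = fetch(ticker)
    return results


def estimated_minutes(n_tickers, pause_seconds):
    """Lower bound on total run time in minutes, counting only the pauses."""
    return (n_tickers - 1) * pause_seconds / 60.0


# Dry run: a fake fetch and a fake sleep that just records the requested pauses
pauses = []
data = fetch_all(["nflx", "mmm", "abt"], fetch=lambda t: t.upper(),
                 pause_seconds=15.0, sleep=pauses.append)
print(data)
print(len(pauses))
print(estimated_minutes(500, 15.0))
```

With a 15 second pause, 500 tickers need a bit over two hours of waiting alone, which is in line with the run time mentioned above. Separately, note that sort_values(..., inplace=True) returns None, so assigning its result back to df leaves df as None regardless of the rate limit.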

creating signals based on current and prior time periods

I'm trying to write a trading algo and I am very new to python.
Lots of things are easy to understand but I get lost easily. I have a strategy I want to use, but the coding is getting in the way.
I want to create two moving averages and when they cross I want that to be a signal.
The part I am currently struggling with is how to also include information about the prior period.
When
MovingAverage1( last 10 candles ) == MovingAverage2( Last 20 candles ),
that's a signal,
but is it a buy or sell?
When
MovingAverage1( last 10 candles after skipping most recent ) > MovingAverage2( last 10 candles after skipping most recent )
then sell.
Here is what I've got so far, where the MA-s I am using are being simplified for this question:
class MyMACrossStrategy(Strategy):
    """
    Requires:
    symbol - A stock symbol on which to form a strategy on.
    bars - A DataFrame of bars for the above symbol.
    short_window - Lookback period for short moving average.
    long_window - Lookback period for long moving average."""
    def __init__(self, symbol, bars, short_window=4, long_window=9):
        self.symbol = symbol
        self.bars = bars
        self.short_window = short_window
        self.long_window = long_window
    # Function Helper for indicators
    def fill_for_noncomputable_vals(input_data, result_data):
        non_computable_values = np.repeat(
            np.nan, len(input_data) - len(result_data)
        )
        filled_result_data = np.append(non_computable_values, result_data)
        return filled_result_data
    def simple_moving_average(data, period):
        """
        Simple Moving Average.
        Formula:
        SUM(data / N)
        """
        catch_errors.check_for_period_error(data, period)
        # Mean of Empty Slice RuntimeWarning doesn't affect output so it is
        # suppressed
        with warnings.catch_warnings():
            warnings.simplefilter("ignore", category=RuntimeWarning)
            sma = [np.mean(data[idx-(period-1):idx+1]) for idx in range(0, len(data))]
        sma = fill_for_noncomputable_vals(data, sma)
        return sma
    def hull_moving_average(data, period):
        """
        Hull Moving Average.
        Formula:
        HMA = WMA(2*WMA(n/2) - WMA(n)), sqrt(n)
        """
        catch_errors.check_for_period_error(data, period)
        hma = wma(
            2 * wma(data, int(period/2)) - wma(data, period), int(np.sqrt(period))
        )
        return hma
    def generate_signals(self):
        """Returns the DataFrame of symbols containing the signals
        to go long, short or hold (1, -1 or 0)."""
        signals = pd.DataFrame(index=self.bars.index)
        signals['signal'] = 0.0
        # Create the set of moving averages over the
        # respective periods
        signals['Fast_Line'] = sma(bars['Close'], self.short_window)
        signals['Slow_line'] = hma(bars['Close'], self.long_window)
        signals1['Fast_Line'] = sma(bars['Close'], self.short_window[-1])
        signals1['Slow_line'] = hma(bars['Close'], self.long_window[-1])
        # Create a 'signal' (invested or not invested) when the short moving average crosses the long
        # moving average, but only for the period greater than the shortest moving average window
        signals['signal'][self.short_window:] = np.where(signals['Fast_Line'][self.short_window:]
            > signals['Slow_line'][self.short_window:], 1.0, 0.0)
        # Take the difference of the signals in order to generate actual trading orders
        signals['positions'] = signals['signal'].diff()
        if signals['Fast_Line'] = signals['Slow_Line'] and ...
        return signals
Hopefully my question makes sense.
I am assuming that you want to test your strategy first before using it in a live market. You can download the stock data from Yahoo Finance in csv format and load it with the code below:
import pandas as pd
import numpy as np
data = pd.read_csv('MSFT.csv')
Once the data is stored in the pandas dataframe data, you can compute the moving averages of the closing price with the following code, if you are planning a crossover strategy:
sma_days=20
lma_days=50
data['SMA_20']=data['Close'].rolling(window=sma_days,center=False).mean()
data['SMA_50']=data['Close'].rolling(window=lma_days,center=False).mean()
data['SIGNAL']=np.where(data['SMA_20']>data['SMA_50'],'BUY','SELL')
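The snippet above labels every row as BUY or SELL. To act only when the averages actually cross, each period has to be compared with the prior one, which is the part the question asks about. Here is a minimal sketch of that idea in plain Python lists; the function names and the sample prices are made up. In pandas, the prior-period values would typically come from .shift(1) on the two moving-average columns.

```python
def sma(values, window):
    """Simple moving average; None until `window` values are available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out


def crossover_signals(closes, short_window, long_window):
    """BUY when the fast average crosses above the slow one, SELL when it crosses below."""
    fast = sma(closes, short_window)
    slow = sma(closes, long_window)
    signals = ["HOLD"] * len(closes)
    for i in range(1, len(closes)):
        if None in (fast[i - 1], slow[i - 1], fast[i], slow[i]):
            continue  # not enough history yet for both averages
        # Prior period: fast at or below slow; current period: fast above slow
        if fast[i - 1] <= slow[i - 1] and fast[i] > slow[i]:
            signals[i] = "BUY"
        # Prior period: fast at or above slow; current period: fast below slow
        elif fast[i - 1] >= slow[i - 1] and fast[i] < slow[i]:
            signals[i] = "SELL"
    return signals


# Made-up closing prices: a dip, a rally, then a sell-off
closes = [10, 9, 8, 7, 8, 10, 12, 11, 8, 6]
print(crossover_signals(closes, 2, 4))
```

Testing for a change of sign between the prior and the current period is more robust than waiting for the two averages to be exactly equal, since floating-point averages rarely match exactly.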
