python FIFO stock portfolio calculations - python

I'm calculating returns for a stock portfolio with hundreds of symbols over a period of many years. I need to evaluate the tax efficiency of this portfolio, which means that I need to account for stock purchases and sales according to FIFO accounting rules. If I buy a stock and sell it in less than a year, I pay one tax rate, but if I hold it for over a year, I pay a different tax rate. So, for every date, I'd like to calculate the long-term and short-term capital gains for the portfolio from the trades that happen on that date. I'm trying to think of an efficient way to perform these calculations. Ideally, this would be something that operates on a Pandas DataFrame containing all of the trades. Perhaps someone knows of an open-source portfolio app that already does this? Thanks.
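One way to approach the FIFO matching described above is to keep a queue of open lots per symbol and consume the oldest lot first on every sale. The following is only a minimal sketch under stated assumptions: the function names and the `(date, symbol, quantity, price)` tuple layout are made up for illustration, share counts are integers, short selling is not handled, and "long-term" is taken to mean sold after the one-year anniversary of the purchase. The rows could come from something like `df.itertuples(index=False)` on a trades DataFrame sorted by date, assuming the columns are in that order.

```python
from collections import defaultdict, deque
from datetime import date


def is_long_term(buy_date, sell_date):
    """Assumption: long-term means sold after the one-year anniversary."""
    try:
        anniversary = buy_date.replace(year=buy_date.year + 1)
    except ValueError:  # bought on Feb 29; use Feb 28 of the next year
        anniversary = date(buy_date.year + 1, 2, 28)
    return sell_date > anniversary


def fifo_gains(trades):
    """trades: iterable of (trade_date, symbol, quantity, price), sorted by date.

    quantity > 0 is a buy, quantity < 0 is a sell (integer share counts assumed).
    Returns {sell_date: {"short": gain, "long": gain}}.
    """
    lots = defaultdict(deque)  # symbol -> deque of [buy_date, remaining_qty, buy_price]
    gains = defaultdict(lambda: {"short": 0.0, "long": 0.0})
    for trade_date, symbol, qty, price in trades:
        if qty > 0:
            lots[symbol].append([trade_date, qty, price])
            continue
        to_sell = -qty
        while to_sell > 0:
            if not lots[symbol]:
                raise ValueError(f"sell exceeds holdings for {symbol} on {trade_date}")
            lot = lots[symbol][0]  # oldest open lot first (FIFO)
            matched = min(to_sell, lot[1])
            term = "long" if is_long_term(lot[0], trade_date) else "short"
            gains[trade_date][term] += matched * (price - lot[2])
            lot[1] -= matched
            to_sell -= matched
            if lot[1] == 0:
                lots[symbol].popleft()
    return dict(gains)


example_trades = [
    (date(2020, 1, 10), "AAA", 100, 10.0),
    (date(2020, 9, 1), "AAA", 100, 12.0),
    (date(2021, 2, 1), "AAA", -150, 15.0),
]
result = fifo_gains(example_trades)
```

In the example, the sale of 150 shares uses up the whole January 2020 lot (held over a year) and half of the September 2020 lot (held under a year), so the gain is split between the two buckets.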

Related

Efficient algorithms to perform Market Basket Analysis

I want to perform Market Basket Analysis (or Association Analysis) on a retail e-commerce dataset.
The problem I am facing is the huge data size: 3.3 million transactions in a single month. I cannot cut down the transactions, as I may miss some products. The structure of the data is as follows:
Order_ID = Unique transaction identifier
Customer_ID = Identifier of the customer who placed the order
Product_ID = List of all the products the customer has purchased
Date = Date on which the sale has happened
When I feed this data to the apriori algorithm in Python, my system cannot handle the memory requirements. It can run with only about 100K transactions. I have 16 GB of RAM.
Any help in suggesting a better (and faster) algorithm is much appreciated.
I can use SQL as well to sort out data size issues, but then I will get only 1 Antecedent --> 1 Consequent rules. Is there a way to get multiset rules such as {A,B,C} --> {D,E}, i.e., if a customer purchases products A, B and C, then there is a high chance they will purchase products D and E?
For a huge data size, try FP-Growth, as it is an improvement on the Apriori method.
It also scans the data only twice, whereas Apriori makes repeated passes over it.
from mlxtend.frequent_patterns import fpgrowth
Then just change:
apriori(df, min_support=0.6)
To
fpgrowth(df, min_support=0.6)
There is also research comparing the algorithms; for the memory issue I recommend:
Evaluation of Apriori, FP growth and Eclat association rule mining algorithms, or
Comparing the Performance of Frequent Pattern Mining Algorithms.
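On the multi-item rule part of the question: once you have a table of frequent itemsets with their supports (which is what FP-Growth produces), rules like {A,B,C} --> {D,E} come from splitting each itemset into an antecedent and a consequent and computing confidence = support(itemset) / support(antecedent). Here is a rough plain-Python sketch of that step. The function name and the `supports` dict layout are assumptions for illustration; mlxtend also has an `association_rules` helper in `mlxtend.frequent_patterns` for the same purpose.

```python
from itertools import combinations


def rules_from_itemsets(supports, min_confidence):
    """supports: dict mapping frozenset(items) -> support (fraction of transactions).

    Returns a list of (antecedent, consequent, confidence) tuples for every way
    of splitting a frequent itemset whose confidence reaches min_confidence.
    """
    rules = []
    for itemset, support in supports.items():
        if len(itemset) < 2:
            continue  # a rule needs at least one item on each side
        for size in range(1, len(itemset)):
            for combo in combinations(sorted(itemset), size):
                antecedent = frozenset(combo)
                if antecedent not in supports:
                    continue  # support of the antecedent is unknown, skip
                confidence = support / supports[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules


# Hypothetical supports, for illustration only
supports = {
    frozenset({"A"}): 0.5,
    frozenset({"B"}): 0.4,
    frozenset({"A", "B"}): 0.3,
}
rules = rules_from_itemsets(supports, min_confidence=0.7)
```

With these made-up numbers only B --> A passes the threshold (confidence 0.3 / 0.4), while A --> B (0.3 / 0.5) does not.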

Coding a simple trading strategy on Python

Apologies in advance for the long post (please advise if these long posts are poor form). :(
I'm attempting to code a simple trading strategy to learn how to calculate expected returns and financial trading methods.
I have loaded S&P 500 data from Yahoo Finance using yfinance, and I wanted the user to be able to input how far back the data goes.
Here my problem already begins. My dataframe is loaded such that the "close_price" series has the dates as its index (this can also be seen in the attached image). It's not my biggest concern, as I'm able to call all the dates and close prices for the stock I've selected.
From here, I'm trying to calculate the expected returns based on two strategies:
Buy $x on the first date. Buy $x every month thereafter. Calculate the portfolio value (or returns on each investment/total returns) on a specified date.
Buy $x on the first date. Buy $x again if the price drops by 10%. Sell 0.5*$x if the price increases by 10%. Buy $x if 30 days have surpassed and no buy/sell order has been made.
Picture of my data table
import pandas as pd
import matplotlib.pyplot as plt
import yfinance as yf
from pandas_datareader import data as web
# Load stock data
stock_ticker = '^GSPC'
df = yf.download(stock_ticker)
# Allows user to input number of days to trackback prior to today (excluding weekends) for analysis
timescale = int(input("Enter No. of days prior to today (excluding weekends):"))
# List arrays for close price and dates
close_price = df["Adj Close"][-timescale:]
dates = df.index.tolist()[-timescale:]
# Returns
daily_returns = close_price.pct_change(fill_method='pad')
monthly_returns = close_price.resample('M').ffill().pct_change(fill_method='pad')
Things I've tried:
-- Writing a for loop that multiplies my monthly stock return (monthly return values are of the order 0.01 and stock prices of the order $4000) by my investment per month. So for a $1000 investment, if the return one month later is 0.04, the gain is $40 and the value of the portfolio is $1040.
-- Writing a while loop that runs while the stock price stays above 90% of its initial value. Once that is no longer true, put $1000 into the stock. If the stock goes up 10% (that is, to more than 1.1x the price at the last buy/sell order), then sell 50%.
I've tried many ways to express this logic in code, but to no avail. Would love your help guys!
Thanks!
There are some modules such as Zipline that can help simulate these trading strategies I think. It might be less time consuming to use that and it incorporates things like slippage and trading fees as well.
However if you want to build your own code from scratch I’d suggest breaking it down into a few smaller steps.
Firstly a section of code that finds trade signals buy/sell, based on your previously stated criteria.
Then another section of code that takes the trading signals and finds trades with entry/exit dates.
When you have a list of trades with entry/exit dates, you can create a section of code that turns the list of trades into a portfolio database. The database shows the value of your portfolio over time and how your trades affect that value.
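The steps above could be sketched roughly like this for the second strategy in the question. This is only an illustration under assumptions: prices are a plain list of `(date, close)` pairs rather than a DataFrame, the 10% moves are measured from the price at the last order, "30 days" means calendar days, and the function names are made up. Fees and slippage are ignored.

```python
from datetime import date, timedelta


def generate_signals(prices, x=1000.0, move=0.10, max_idle_days=30):
    """prices: list of (date, close) sorted by date.

    Returns a list of (date, dollars): positive = buy, negative = sell.
    """
    first_day, ref_price = prices[0]
    signals = [(first_day, x)]  # always buy $x on the first date
    last_order_day = first_day
    for day, price in prices[1:]:
        if price <= ref_price * (1 - move):
            amount = x  # price dropped 10% since the last order
        elif price >= ref_price * (1 + move):
            amount = -0.5 * x  # price rose 10% since the last order
        elif (day - last_order_day).days >= max_idle_days:
            amount = x  # no order for 30 days
        else:
            continue
        signals.append((day, amount))
        ref_price, last_order_day = price, day
    return signals


def portfolio_history(prices, signals):
    """Replay the signals; returns (date, shares, value, net_invested) per day."""
    orders = dict(signals)
    shares = invested = 0.0
    history = []
    for day, price in prices:
        amount = orders.get(day, 0.0)
        if amount < 0:
            amount = -min(-amount, shares * price)  # cannot sell more than is held
        shares += amount / price
        invested += amount
        history.append((day, shares, shares * price, invested))
    return history


start = date(2021, 1, 4)
example_prices = [(start + timedelta(days=i), p) for i, p in enumerate([100.0, 89.0, 99.0, 100.0])]
example_signals = generate_signals(example_prices)
example_history = portfolio_history(example_prices, example_signals)
```

Keeping signal generation separate from the portfolio replay means you can swap in the first strategy (buy $x every month) by writing only a different signal function.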

binance python api margin order incomplete repay loan

I am trying to repay a loan using the binance python API. I retrieve the loan size from the acct dictionary and input that as the quoteOrderQty in the auto repay margin order.
When I run closeLong() the loan is not paid off completely, a small balance remains in the base and a small USDT debt remains in the quote. What am I doing wrong here?
acct = client.get_isolated_margin_account()
def quoteDebt():
    quoteLoan = round(Decimal(acct['assets'][1]['quoteAsset']['borrowed']),2)
    print("USDT Debt: "+str(quoteLoan))
    return quoteLoan
def closeLong():
    client.create_margin_order(
        symbol=sym,
        side=SIDE_SELL,
        type=ORDER_TYPE_MARKET,
        sideEffectType='AUTO_REPAY',
        isIsolated='TRUE',
        quoteOrderQty=quoteDebt())
    print("Closed Long")
I've had the same issue. I believe it's that interest is compounded continuously, so a tiny amount of interest is incurred between you checking how much is owed and you repaying it. I hacked around this by looping indefinitely: the first loop repays the principal, the second the interest on that, the third the interest on that, and by the 4th or 5th loop the interest rounds down to 0. I'd much prefer if there were a repay_all() API call available, of course, but I'm not aware of anything like that.
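A rough sketch of that loop, with one change: it is capped at a fixed number of rounds instead of running indefinitely, which is my own design choice. `get_debt` and `repay` are hypothetical callables that you would wrap around whatever account-query and repayment calls your client exposes; nothing here is an actual Binance API signature. The important part is re-reading the debt inside the loop, so interest accrued since the previous round is included.

```python
from decimal import Decimal


def repay_until_clear(get_debt, repay, max_rounds=10, dust=Decimal("0")):
    """Repeatedly repay whatever is currently owed until the debt is <= dust.

    get_debt() -> Decimal amount currently owed (hypothetical wrapper).
    repay(amount) performs one repayment (hypothetical wrapper).
    Returns the number of repayments made.
    """
    for rounds_done in range(max_rounds):
        debt = get_debt()  # re-read every round to pick up new interest
        if debt <= dust:
            return rounds_done
        repay(debt)
    if get_debt() <= dust:
        return max_rounds
    raise RuntimeError(f"debt still outstanding after {max_rounds} rounds")
```

You would call it with something like `repay_until_clear(my_debt_reader, my_repay_call)`, where both arguments are your own functions. Setting `dust` to a small positive amount stops the loop early if you are happy to leave a negligible remainder.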
Same issue here; it's not nice of Binance not to let us fully repay automatically through the API. If you can lock up some extra funds for this, the ideal solution for me is a two-step approach: buy the borrowed asset on the spot account (fees are paid in BNB, which costs less than on the isolated margin account), then transfer the asset to the isolated margin account and repay. You do need that reserve, though, because you can't transfer all the money out of the margin account while in a position, as it would get liquidated.
If you try to repay from the isolated account directly, a 0.1% fee is deducted, so the loan is not fully repaid. If you buy on spot, you can pay the 0.075% fee in BNB and receive the full amount of the asset you asked for. No tiny amounts remain.
The ideal approach for you depends on how frequently you trade. With high-frequency trading, fees matter, trades are small, and you can allocate extra funds to this. With low-frequency trading, fees don't matter as much, you might want to go all-in without extra funds, and you might clean up those tiny amounts manually ("Close all positions" in the UI).

Arbitrage algorithm for multiple exchanges, multiple currencies, and multiple amounts

I'm searching for a way to apply an arbitrage algorithm across multiple exchanges, multiple currencies, and multiple trading amounts. I've seen examples using Bellman-Ford and Floyd-Warshall, but the ones I've tried all seem to assume the graph data set is made up of prices for multiple currencies on one single exchange. I've tried tinkering with them to support prices across multiple exchanges, but I haven't had any success.
One article I read said that I can use Bellman-Ford and simply put only the best exchange's price in the graph (as opposed to all the exchanges' prices). While it sounds like that should work, I feel like it could be missing out on value. Is this the right way to go about it?
And regarding multiple amounts, should I just make one graph per trade amount? Say I want to run the algorithm for $100 and for $1000: do I literally populate the graph twice, once for each set of data? The prices will be different at $100 than at $1000, so the exchange that has the best price at $100 may be different from the one that has the best price at $1000.
Examples:
The graph would look like this:
rates = [
[1, 0.23, 0.26, 17.41],
[4.31, 1, 1.14, 75.01],
[3.79, 0.88, 1, 65.93],
[0.057, 0.013, 0.015, 1],
]
currencies = ('PLN', 'EUR', 'USD', 'RUB')
REFERENCES:
Here is the code I've been using, but this assumes one exchange and one single trade quantity
Here is where someone mentions you can just include the best exchange's price in the graph in order to support multiple exchanges
Trying for accuracy over speed, there's a way to represent the whole order book of each exchange inside of a linear program. For each bid, we have a real-valued variable that represents how much we want to sell at that price, ranging between zero and the amount bid for. For each ask, we have a real-valued variable that represents how much we want to buy at that price, ranging between zero and the amount asked for. (I'm assuming that it's possible to get partial fills, but if not, you could switch to integer programming.) The constraints say, for each currency aside from dollars (or whatever you want more of), the total amount bought equals the total amount sold. You can strengthen this by requiring detailed balance for each (currency, exchange) pair, but then you might leave some opportunities on the table. Beware counterparty risk and slippage.
For different amounts of starting capital, you can split dollars into "in-dollars" and "out-dollars" and constrain your supply of "in-dollars", maximizing "out-dollars", with a one-to-one conversion with no limit from in- to out-dollars. Then you can solve for one in-dollars constraint, adjust the constraint, and use dual simplex to re-solve the LP faster than from scratch.
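For comparison, the simpler approach mentioned in the question (keep only the best rate per currency pair across exchanges, then look for a negative cycle with Bellman-Ford on -log(rate) weights) can be sketched as follows. This is not the LP described above, and it ignores order sizes, fees and partial fills. The function names, the `(exchange, src, dst, rate)` tuple layout and the exchange names "X" and "Y" are assumptions for illustration.

```python
import math


def best_rates(quotes):
    """quotes: iterable of (exchange, src, dst, rate). Keep the highest rate per pair."""
    best = {}
    for exchange, src, dst, rate in quotes:
        if (src, dst) not in best or rate > best[(src, dst)][0]:
            best[(src, dst)] = (rate, exchange)
    return best


def find_arbitrage(best):
    """Return currencies forming a cycle whose rate product exceeds 1, or None."""
    nodes = sorted({c for pair in best for c in pair})
    edges = [(s, d, -math.log(r)) for (s, d), (r, _) in best.items()]
    dist = {n: 0.0 for n in nodes}  # all zero = virtual source linked to every node
    pred = {n: None for n in nodes}
    last = None
    for _ in range(len(nodes)):
        last = None
        for s, d, w in edges:
            if dist[s] + w < dist[d] - 1e-12:
                dist[d] = dist[s] + w
                pred[d] = s
                last = d
        if last is None:
            return None  # a full pass without relaxation: no negative cycle
    for _ in range(len(nodes)):
        last = pred[last]  # step back far enough to be inside the cycle
    cycle = [last]
    node = pred[last]
    while node != last:
        cycle.append(node)
        node = pred[node]
    cycle.append(last)
    cycle.reverse()  # predecessors were followed backwards
    return cycle


example_quotes = [
    ("X", "EUR", "USD", 1.14), ("Y", "EUR", "USD", 1.10),
    ("X", "USD", "EUR", 0.86), ("Y", "USD", "EUR", 0.88),
]
example_best = best_rates(example_quotes)
example_cycle = find_arbitrage(example_best)
```

In the example the best EUR to USD rate comes from one exchange and the best USD to EUR rate from the other, and their product (1.14 * 0.88) is just above 1, so a cycle is reported. For different trade amounts you would rebuild `quotes` with the effective rate at each size and run it again.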

Is there any way to calculate market beta from the Yahoo Finance DataReader in Python?

I'm currently trying to get market betas for tickers obtained through the Yahoo Finance DataReader. I was wondering if there is a way to calculate each stock's market beta and put it in a dataframe?
This is what I have for my code so far:
import pandas_datareader.data as pdr
Tickers=['SBUX','TLRY']
SD='2005-01-31'
ED='2018-12-31'
TickerW=pdr.DataReader(Tickers,'yahoo',SD,ED)
TickerW.head()
Okay, to make sure we're on the same page, we use the formula and definition of market beta from here: https://www.investopedia.com/terms/b/beta.asp
Beta = Covariance(Stock Returns, Market Returns) / Variance(Market Returns)
So first of all, we need the tickers for the market as well as the tickers for the stock. Which ticker you use here depends a lot on what market you want to compare against: Total stock market? Just the S&P 500? Maybe some other international equity index? There's no 100% right answer here, but a good way to pick is think about who the "movers" of your stock are, and what other stocks they hold. (Check out Damodaran's course on valuation, free on the interwebs if you google it).
So now your question becomes: How do I compute the covariance and variance of stock returns?
First, the pandas tickers have a bunch of information. The one we want is the "Adjusted Close". That's the daily closing price of the stock, retroactively adjusted for any "special" events like stock splits, reverse splits, and dividends. Let's say a stock trades for $1000 a pop one day, but then undergoes a 2-for-1 stock split, so now instead of 1 share for $1000, you have 2 shares for $500 each. In a "raw" price chart, it would appear as if your stock just lost 50% of its value in a single day when in reality nothing happened. The Adjusted Close time series takes care of that to make sure that only "real" changes to the stock's value are reflected.
You can get that by calling prices = TickerW['Adj Close'] or whatever key yahoo finance uses these days. By just looking at the TickerW dataframe you should be able to figure that out on your own :)
Next, we'd be changing prices into returns. That's just prices / prices.shift(1) - 1, which is what prices.pct_change() gives you (consult the documentation and try it out yourself). (Nerd note: Instead of these returns, it is mathematically more sound to use the logarithmic returns, because they have certain reasonable properties. If you want, take the log of prices / prices.shift(1) instead.)
Finally, we now have a series of returns (or log returns). One series for the stock returns, one for the market returns (e.g. from SPY, for the S&P 500). Now we just need to use them in the formula for beta.
Well, the way to go here is to do what I just did: Hit up google for "pandas covariance between two series" and that gets us to https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.cov.html
So basically, cov = stock_returns.cov(market_returns) and var = market_returns.var() and then beta = cov / var.
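To make the arithmetic concrete, here is the same calculation written out in plain Python on lists of prices, as a sketch rather than a drop-in for the pandas version. The function names and the example prices are made up; the covariance and variance use the n - 1 denominator, which matches the pandas defaults for `Series.cov` and `Series.var()`.

```python
def simple_returns(prices):
    """Period-over-period returns: price[t] / price[t-1] - 1."""
    return [curr / prev - 1 for prev, curr in zip(prices, prices[1:])]


def market_beta(stock_returns, market_returns):
    """Beta = Cov(stock, market) / Var(market), both with an n - 1 denominator."""
    n = len(stock_returns)
    if n != len(market_returns) or n < 2:
        raise ValueError("need two return series of equal length >= 2")
    mean_s = sum(stock_returns) / n
    mean_m = sum(market_returns) / n
    cov = sum((s - mean_s) * (m - mean_m)
              for s, m in zip(stock_returns, market_returns)) / (n - 1)
    var = sum((m - mean_m) ** 2 for m in market_returns) / (n - 1)
    return cov / var


# Made-up prices: the stock moves exactly twice as much as the market each period
market_prices = [100.0, 110.0, 99.0, 108.9]
stock_prices = [50.0, 60.0, 48.0, 57.6]
example_beta = market_beta(simple_returns(stock_prices), simple_returns(market_prices))
```

Because the made-up stock returns are exactly double the market returns in every period, the resulting beta comes out at about 2, which is a handy sanity check before running it on real data.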
I'd say that should be enough info to send you on your way. Good luck.
