How to select a specific day from a dataframe? - python

I have a dataframe df:
                  Open    Volume   Adj Close Ticker
Date
2006-11-18  140.750000  45505300  114.480649    SPY
2006-11-18  100.470001    274000   72.382071    AGG
2006-11-19  140.750000  45505300  114.480649    SPY
2006-11-19  100.470001    274000   72.382071    AGG
2006-11-22  140.750000  45505300  114.480649    SPY
2006-11-22  100.470001    274000   72.382071    AGG
I use this command to select the rows I want, where today is "2006-11-22":
df[df.index==today]
But what I finally want is the rows for 2006-11-19 (the previous trading day).
I don't know in advance that the previous trading day is "2006-11-19".
I only know that today is "2006-11-22".
How can I write this code?
Thank you very much.
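One possible approach, sketched under the assumption that the index is a DatetimeIndex and that every trading day appears in it at least once: take the dates strictly earlier than today and pick the latest one. The sample frame below only mirrors the question's layout, and the names `earlier_days` and `previous_day` are illustrative.

```python
import pandas as pd

# Sample frame mirroring the question: two tickers per trading day.
df = pd.DataFrame(
    {
        "Open": [140.75, 100.470001] * 3,
        "Ticker": ["SPY", "AGG"] * 3,
    },
    index=pd.DatetimeIndex(
        ["2006-11-18", "2006-11-18", "2006-11-19",
         "2006-11-19", "2006-11-22", "2006-11-22"],
        name="Date",
    ),
)

today = pd.Timestamp("2006-11-22")

# Index entries strictly before today, then the latest of them.
earlier_days = df.index[df.index < today]
previous_day = earlier_days.max()  # NaT if today is the first day in the frame

# Same selection style as in the question, applied to the previous trading day.
previous_rows = df[df.index == previous_day]
print(previous_rows)
```

Because the lookup uses only the dates present in the data, weekends and holidays are skipped without a separate trading calendar.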

Related

How to calculate rolling Beta of a stock

I have the following table and want to calculate the rolling beta of each stock with respect to QQQ.
Date      AAPL     AMD     NVDA     SPY      QQQ
20200121  316.550  51.050  256.930  223.340  331.310
20200123  319.290  51.700  259.940  224.530  331.740
20200127  308.960  49.260  266.680  218.050  323.520
20200129  324.320  47.510  265.750  221.720  326.620
20200131  309.330  46.970  268.050  219.010  321.740
20200204  318.860  49.470  264.540  227.420  329.080
20200205  321.580  49.850  266.720  228.250  332.820
20200206  325.210  49.320  268.340  230.170  333.960
20200210  321.540  52.270  271.760  231.960  334.690
I am trying something along these lines:
df = df.pct_change()
for i in df.columns:
    df[f'{i}_Beta'] = df['QQQ'].rolling(10).cov(df[i])
but I can't figure out how to get the proper output. I need help.
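A rolling covariance alone is not a beta. The usual definition divides the covariance of the stock's returns with the benchmark's returns by the variance of the benchmark's returns. The sketch below applies that definition to part of the table above. The window of 5 is an assumption made only because the sample has nine rows, so a window of 10 would produce nothing but NaN.

```python
import pandas as pd

# Three of the columns from the question's table.
prices = pd.DataFrame(
    {
        "AAPL": [316.55, 319.29, 308.96, 324.32, 309.33, 318.86, 321.58, 325.21, 321.54],
        "AMD": [51.05, 51.70, 49.26, 47.51, 46.97, 49.47, 49.85, 49.32, 52.27],
        "QQQ": [331.31, 331.74, 323.52, 326.62, 321.74, 329.08, 332.82, 333.96, 334.69],
    },
    index=pd.to_datetime(
        ["20200121", "20200123", "20200127", "20200129", "20200131",
         "20200204", "20200205", "20200206", "20200210"],
        format="%Y%m%d",
    ),
)

window = 5  # assumed window length; pick whatever suits the real data
returns = prices.pct_change().dropna()

# beta = rolling cov(stock, benchmark) / rolling var(benchmark)
benchmark_var = returns["QQQ"].rolling(window).var()
betas = pd.DataFrame(
    {
        f"{col}_Beta": returns[col].rolling(window).cov(returns["QQQ"]) / benchmark_var
        for col in returns.columns
        if col != "QQQ"
    }
)
print(betas.dropna())
```

Keeping the betas in a separate frame avoids adding columns to the frame that is being looped over.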

How to iterate over a columns of tickers of stocks and retrieve their financial data?

My data is basically columns of stock tickers such as this:
MMM ABT ABBV AC AGRO AUB AIR AAF BABA AGN ... UBER UTX VRTX DG V VOW VOW3 VTBR
I want to use a for loop to iterate over each ticker and retrieve its closing prices. This is what I have so far:
for ticker in data:
    df = DataReader(ticker,'yahoo', start, end)['Adj Close']
It only outputs information about one stock ticker.
I feel that I am missing something.
If data is a string, try using the split function:
for ticker in data.split():
    df = DataReader(ticker,'yahoo', start, end)['Adj Close']
I hope this helps.
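Another likely cause is that `df` is reassigned on every pass, so only the last ticker's data survives the loop. A minimal sketch of collecting every result instead: `fetch_adj_close` is a hypothetical stand-in for the `DataReader(...)['Adj Close']` call and returns made-up numbers so the example runs without network access.

```python
import pandas as pd

def fetch_adj_close(ticker):
    """Hypothetical stand-in for DataReader(ticker, 'yahoo', start, end)['Adj Close'].

    The values are made up; only the shape (a date-indexed Series) matters here.
    """
    dates = pd.date_range("2020-01-01", periods=3, freq="D")
    return pd.Series([float(len(ticker)) + i for i in range(3)],
                     index=dates, name="Adj Close")

data = "MMM ABT ABBV"  # tickers as one whitespace-separated string

# Keep one Series per ticker instead of overwriting a single variable.
closes = {}
for ticker in data.split():
    closes[ticker] = fetch_adj_close(ticker)

# The dict keys become the column names of the combined frame.
df = pd.concat(closes, axis=1)
print(df)
```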

How to add data grouped week by week to a database?

I am attempting to write a program that should handle a small part of my personal budget. I need to add data to a database that is handled somewhat like this:
Week of 1/1/19 -> [[1/1/19, Walmart, 13.43], [1/2/19, Walgreens, 10.54]]
Week of 1/7/19 -> [[1/7/19, Taco Bell, 24.12]]
...
Basically after a new week, a new "Week of" entry will be created with sub-entries within that. I am stuck on how to create the "Week of" entries and how to add entries within that week.
What is the best way to accomplish this?
I would keep the data in a flat database format and only summarise or group by week when you need to. That way, adding or deleting individual transactions is easy.
Using Pandas, you would do something like this:
import pandas as pd
data = pd.DataFrame([['1/1/19', 'Walmart', 13.43], ['1/2/19', 'Walgreens', 10.54], ['1/7/19', 'Taco Bell', 24.12]], columns=['Date', 'Payee', 'Value'])
data['Date'] = pd.to_datetime(data['Date'], format='%m/%d/%y')
# ISO week number (dt.weekofyear was removed in pandas 2.0)
data['Week'] = data['Date'].dt.isocalendar().week
data
#         Date      Payee  Value  Week
# 0 2019-01-01    Walmart  13.43     1
# 1 2019-01-02  Walgreens  10.54     1
# 2 2019-01-07  Taco Bell  24.12     2
# Sum only the numeric columns for each week
data.groupby(by='Week').sum(numeric_only=True)
#       Value
# Week
# 1     23.97
# 2     24.12
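If the goal is the "Week of" layout from the question rather than a sum, one option is to derive each week's start date from the flat table only when it is needed. This is a sketch, assuming weeks run Monday to Sunday (the pandas default for weekly periods). The `WeekOf` column and the `weekly` dict are illustrative names.

```python
import pandas as pd

data = pd.DataFrame(
    [["1/1/19", "Walmart", 13.43],
     ["1/2/19", "Walgreens", 10.54],
     ["1/7/19", "Taco Bell", 24.12]],
    columns=["Date", "Payee", "Value"],
)
data["Date"] = pd.to_datetime(data["Date"], format="%m/%d/%y")

# Monday that starts the week each transaction falls in.
data["WeekOf"] = data["Date"].dt.to_period("W").dt.start_time

# Map each week's start date to the list of (payee, value) entries in that week.
weekly = {
    week_start.date().isoformat(): list(zip(group["Payee"], group["Value"]))
    for week_start, group in data.groupby("WeekOf")
}

for week_start, entries in weekly.items():
    print(f"Week of {week_start} -> {entries}")
```

Keying on the start date instead of the week number keeps weeks from different years apart.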

how to extract a certain value from a data frame?

I am scraping data from Yahoo and trying to pull a certain value from its data frame, using a value that is in another file.
I have managed to scrape the data and show it as a data frame. The thing is, I am trying to extract a certain value from the data using another df.
This is the file I have:
df_earnings=pd.read_excel(r"C:Earnings to Update.xlsx",index_col=2)
stock_symbols = df_earnings.index
output:
                    Date           E Time        Company Name
Stock Symbol
CALM          2019-04-01  Before The Open     Cal-Maine Foods
CTRA          2019-04-01  Before The Open      Contura Energy
NVGS          2019-04-01  Before The Open  Navigator Holdings
ANGO          2019-04-02  Before The Open       AngioDynamics
LW            2019-04-02  Before The Open         Lamb Weston
then I download the csv for each stock with the data from yahoo finance:
driver.get(f'https://finance.yahoo.com/quote/{stock_symbol}/history?period1=0&period2=2597263000&interval=1d&filter=history&frequency=1d')
output:
               Open    High      Low  ...  Adj Close   Volume  Stock Name
Date                                  ...
1996-12-12  1.81250  1.8125  1.68750  ...   0.743409  1984400        CALM
1996-12-13  1.71875  1.8125  1.65625  ...   0.777510   996800        CALM
1996-12-16  1.81250  1.8125  1.71875  ...   0.750229   122000        CALM
1996-12-17  1.75000  1.8125  1.75000  ...   0.774094   239200        CALM
1996-12-18  1.81250  1.8125  1.75000  ...   0.791151   216400        CALM
My problem is that I don't know how to take the date from my data frame and extract the matching row from the downloaded file.
I don't want to insert the date manually like this:
df = pd.DataFrame.from_csv(file_path)
df['Stock Name'] = stock_symbol
print(df.head())
df = df.reset_index()
print(df.loc[df['Date'] == '2019-04-01'])
output:
Date Open High ... Adj Close Volume Stock Name
5610 2019-04-01 46.700001 47.0 ... 42.987827 846900 CALM
I want a condition that will run through my data frame for each stock and pull the row for the date needed:
print(df.loc[df['Date'] == the date that is next to the symbol that i just downloaded the file for])
I suppose you could use a variable to hold the date:
for sy in stock_symbols:
    # The value from the 'Date' column in df_earnings for this symbol
    dt = df_earnings.loc[sy, 'Date']
    # From the second block of your code relating to the 'manual' date;
    # pd.DataFrame.from_csv was removed from pandas, so read_csv is used instead
    df = pd.read_csv(file_path, index_col=0, parse_dates=True)
    df['Stock Name'] = sy
    df = df.reset_index()
    print(df.loc[df['Date'] == dt])
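The lookup can be tried without any downloads using small in-memory frames. This is a sketch only: the helper name `earnings_day_rows` is hypothetical, and the second price row is a placeholder rather than real market data.

```python
import pandas as pd

# Earnings dates keyed by symbol, shaped like df_earnings in the question.
df_earnings = pd.DataFrame(
    {"Date": pd.to_datetime(["2019-04-01", "2019-04-02"])},
    index=pd.Index(["CALM", "ANGO"], name="Stock Symbol"),
)

# Stand-in for one downloaded price history. The 2019-04-01 row uses values
# from the question; the 2019-04-02 row is a placeholder, not real data.
prices = pd.DataFrame(
    {"Open": [46.700001, 46.0], "Volume": [846900, 0]},
    index=pd.DatetimeIndex(["2019-04-01", "2019-04-02"], name="Date"),
)

def earnings_day_rows(prices, symbol, earnings):
    """Return the rows of `prices` that fall on the symbol's earnings date."""
    # Scalar Timestamp as long as each symbol appears once in `earnings`.
    earnings_date = earnings.loc[symbol, "Date"]
    out = prices.reset_index()
    out["Stock Name"] = symbol
    return out.loc[out["Date"] == earnings_date]

row = earnings_day_rows(prices, "CALM", df_earnings)
print(row)
```

The comparison only matches when both `Date` columns hold real datetimes, which is why the price file should be read with date parsing enabled.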

python pandas get first available datapoint of a year / calculate YTD return

I need to calculate the year-to-date relative return of a given dataset. I usually calculate the cumulative relative return with this simple function:
def RelPerf(price):
    RelPerf = (price/price[0])
    return RelPerf
The problem is that instead of "price[0]" I need to use the price at the start of each year (the first available datapoint of the year). Since the dataset does not contain data for each day of the year, I can't simply use something like +365. So the question is: how do I dynamically get the location of the first available datapoint into the formula?
This is a short example of the dataframe used:
CLOSE_SPX Close_iBoxx A_Returns B_Returns A_Vola B_Vola
2014-05-15 1870.85 234.3017 -0.009362 0.003412 0.170535 0.075468
2014-05-16 1877.86 234.0216 0.003747 -0.001195 0.170153 0.075378
2014-05-19 1885.08 233.7717 0.003845 -0.001068 0.170059 0.075384
2014-05-20 1872.83 234.2596 -0.006498 0.002087 0.170135 0.075410
2014-05-21 1888.03 233.9101 0.008116 -0.001492 0.169560 0.075326
2014-05-22 1892.49 233.5429 0.002362 -0.001570 0.169370 0.075341
2014-05-23 1900.53 233.8605 0.004248 0.001360 0.168716 0.075333
2014-05-27 1911.91 234.0368 0.005988 0.000754 0.168797 0.075294
2014-05-28 1909.78 235.4454 -0.001114 0.006019 0.168805 0.075474
2014-05-29 1920.03 235.1813 0.005367 -0.001122 0.168866 0.075451
2014-05-30 1923.57 235.2161 0.001844 0.000148 0.168844 0.075430
2014-06-02 1924.97 233.8868 0.000728 -0.005651 0.168528 0.075641
2014-06-03 1924.24 232.9049 -0.000379 -0.004198 0.167852 0.075267
Let df be the dataframe.
Group the data by year. pd.TimeGrouper has since been removed from pandas, so this version groups on the year of the DatetimeIndex instead:
GroupedDat = df.groupby(df.index.year)
Create a new column with the YTD return of the close, using a lambda transformation applied to each yearly group:
df["YTD"] = GroupedDat['CLOSE_SPX'].transform(lambda x: x/x.iloc[0]-1.0)
Solution was provided by MarkD: https://quant.stackexchange.com/questions/18085/calculate-ytd-return-find-first-available-datapoint-of-a-year-in-python
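To check that the base price really resets at each year boundary, the grouping idea can be run on a short series that spans two years. The prices below are made up for illustration, and `ytd_return` is a hypothetical helper name. The sketch assumes a Series with a DatetimeIndex sorted by date.

```python
import pandas as pd

# Made-up prices spanning a year boundary, with gaps as in real trading data.
close = pd.Series(
    [100.0, 110.0, 121.0, 200.0, 210.0],
    index=pd.to_datetime(
        ["2013-12-27", "2013-12-30", "2013-12-31", "2014-01-02", "2014-01-03"]
    ),
    name="CLOSE_SPX",
)

def ytd_return(price):
    """Return relative to the first available price of each calendar year."""
    # "first" broadcasts each year's first non-missing price to all its rows.
    first_of_year = price.groupby(price.index.year).transform("first")
    return price / first_of_year - 1.0

ytd = ytd_return(close)
print(ytd)
```

The 2014 rows are measured against 200.0, the first price available in 2014, rather than against the first price in the whole series.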
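I have hourly data and found it easier to locate the first datapoint of a year with this command (written with .loc and .iloc, since df['2004'] and .first('1H') no longer work in recent pandas):
df.loc['2004'].iloc[:1]
Maybe this helps someone who finds this page through search.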
