After Yahoo and Google stopped working, I found someone who suggested downloading data from Morningstar, but it only gives me 5 days of prices. I've tried different dates, but nothing makes it work.
I'm using Python 3.6.5 and PyCharm.
import datetime
import pandas_datareader.data as web
start = datetime.datetime(2016, 1, 1)
end = datetime.datetime(2016, 1, 10)
df = web.DataReader("AAPL", 'morningstar', start, end)
df.reset_index(inplace=True)
df.set_index("Date", inplace=True)
df = df.drop("Symbol", axis=1)
print(df.head())
I'm new to Python and coding in general, so if there is anything I can be more specific about, just let me know.
Thanks
Just drop .head() from
print(df.head())
because head() returns only the first 5 rows by default.
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.head.html
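To make the effect of head() concrete, here is a minimal sketch on a made-up 10-row frame (the dates and Close values are placeholders, not Morningstar data):

```python
import pandas as pd

# Synthetic stand-in for the downloaded prices: 10 daily rows, made-up values.
dates = pd.date_range("2016-01-01", periods=10, freq="D")
df = pd.DataFrame({"Close": range(100, 110)}, index=dates)

first_five = df.head()     # default is n=5
first_eight = df.head(8)   # ask for more rows explicitly

print(len(first_five))   # 5
print(len(first_eight))  # 8
print(len(df))           # 10, the full frame is still there
```

So the download itself is likely fine; only the printed view was limited to five rows.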
I've been stuck on this problem for two days. Below is the csv file.
import pandas as pd
df = pd.read_csv('/14100017.csv')
df.head()
df_year = df.groupby('REF_DATE')['REF_DATE'].count()
print(df_year)
This is my code. Could you please tell me, give me a hint, or point me to a website with similar questions? How do I convert monthly employment data into annual data by taking the average? This is so confusing.
Thank you very much! Much appreciated.
I tried searching for similar questions on Reddit and Stack Overflow; they all used resample to get the result.
Just convert REF_DATE to datetime and then extract year:
df['date'] = pd.to_datetime(df['REF_DATE'])
df['year'] = pd.DatetimeIndex(df['date']).year
After that, aggregate VALUE by year:
monthly_year_avg = df.groupby('year')['VALUE'].mean()
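As a quick check of the two steps above, here is a small sketch on made-up rows. It assumes REF_DATE holds strings like "2019-01" and that the numbers live in a VALUE column; the values themselves are invented for illustration:

```python
import pandas as pd

# Made-up monthly rows; REF_DATE and VALUE mirror the column names used above.
df = pd.DataFrame({
    "REF_DATE": ["2019-01", "2019-02", "2019-03", "2020-01", "2020-02"],
    "VALUE": [10.0, 20.0, 30.0, 40.0, 60.0],
})

# Parse the month strings, pull out the year, then average within each year.
df["date"] = pd.to_datetime(df["REF_DATE"], format="%Y-%m")
df["year"] = df["date"].dt.year
annual_avg = df.groupby("year")["VALUE"].mean()

print(annual_avg)  # 2019 -> 20.0, 2020 -> 50.0
```

The count() in the question only tells you how many rows each REF_DATE has; mean() on VALUE is what produces the annual average.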
I am using a script which uses this line:
res = requests.get('https://query1.finance.yahoo.com/v8/finance/chart/{symbol}?range={data_range}&interval={data_interval}'.format(**locals()))
to collect stock data from Yahoo Finance. Currently, I am inputting values such as '1d' for data_range which gives me the data from the past day. However, what do I enter if I want to collect data say from 2020-11-24 to 2020-11-25 (instead of from past x days)?
I would suggest using the Selenium library, because you need to handle the click event and wait for the page to reload to get the updated stock data. See the link below:
http://lmari.hatenablog.com/entry/selenium-fin
Do the following
#!pip install pandas-datareader
import datetime
import pandas as pd
from pandas_datareader import data
stock = 'RENT3.SA'
source = 'yahoo'
# Set date range
start = datetime.datetime(2019, 8, 19)
end = datetime.datetime(2020, 11, 27)
# Collect stock data
dataset = data.DataReader(stock, source, start, end)
# Plot the adjusted close prices
dataset['Adj Close'].plot(kind='line', grid=True, title='RENT3 Adjusted Closes')
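If you would rather keep the requests call from the question, one possibility is to replace range with period1 and period2, given as Unix timestamps in seconds. This is an assumption on my part, since the endpoint is not officially documented and may change. The helper name chart_url is hypothetical, and the sketch only builds the URL without sending a request:

```python
from datetime import datetime, timezone
from urllib.parse import quote, urlencode


def chart_url(symbol, start, end, interval="1d"):
    """Build a chart URL for an explicit date window (hypothetical helper)."""
    # Assumption: period1/period2 are Unix timestamps in seconds (UTC).
    params = {
        "period1": int(start.replace(tzinfo=timezone.utc).timestamp()),
        "period2": int(end.replace(tzinfo=timezone.utc).timestamp()),
        "interval": interval,
    }
    base = "https://query1.finance.yahoo.com/v8/finance/chart/" + quote(symbol)
    return base + "?" + urlencode(params)


url = chart_url("AAPL", datetime(2020, 11, 24), datetime(2020, 11, 25))
print(url)
```

The resulting string can be passed to requests.get in place of the formatted URL in the question.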
Please excuse my lack of knowledge but I am completely new at this, coming from a social services background. My classmates and I are all having trouble following our prof sadly. We have a data frame that I've reduced to the 2 columns needed (an excel doc). One column has different dates. We'd like to create a new df that tells us how many months are between all those dates and May 31, 2019, using DateTime. I'd appreciate any input or reference to something similar. The most recent step I've tried is x = DateTime.datetime(2019, 5, 31) but I'm not sure what to do next. I also made the df into an array but I'm also not sure if I'm even supposed to do that, to begin with.
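Let's say the column is called 'date'.
First convert it to datetime:
df.date = pd.to_datetime(df.date)
Then create a new column with the difference in days (x is the reference date from the question):
df['days_difference'] = (df.date - x).dt.days
If you want it in months, you can divide by about 30.42 (365/12, the average number of days in a month). That gives an approximation, not an exact calendar count.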
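If whole calendar months are wanted instead of days divided by 30.42, a rough sketch with the standard datetime module could look like this. The function name months_between is hypothetical, and it counts month boundaries only, ignoring the day of the month:

```python
import datetime


def months_between(d, ref):
    """Calendar months from d to ref, ignoring the day of the month.

    Negative when d falls in a later month than ref.
    """
    return (ref.year - d.year) * 12 + (ref.month - d.month)


x = datetime.date(2019, 5, 31)  # reference date from the question

print(months_between(datetime.date(2019, 1, 15), x))  # 4
print(months_between(datetime.date(2018, 5, 31), x))  # 12
print(months_between(datetime.date(2019, 6, 1), x))   # -1
```

On a DataFrame whose 'date' column has already been converted with pd.to_datetime, the same function could be applied row by row, for example df['months'] = df['date'].apply(lambda d: months_between(d, x)).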
From the daily stock price data, I want to sample and select the end-of-month price. I am doing this with the following code.
import datetime
from pandas_datareader import data as pdr
import pandas as pd
end = datetime.date.today()
begin=end-pd.DateOffset(365*2)
st=begin.strftime('%Y-%m-%d')
ed=end.strftime('%Y-%m-%d')
data = pdr.get_data_yahoo("AAPL",st,ed)
mon_data=pd.DataFrame(data['Adj Close'].resample('M').apply(lambda x: x[-1])).set_index(data.index)
The line above selects end of the month data and here is the output.
If I want to select the penultimate value of the month, I can do it using the following code.
mon_data=pd.DataFrame(data['Adj Close'].resample('M').apply(lambda x: x[-2]))
Here is the output.
However, the index still shows the end-of-month date. When I choose the penultimate value of the month, I want the index to be 2015-12-30 instead of 2015-12-31.
Please suggest the way forward. I hope my question is clear.
Thanking you in anticipation.
Regards,
Abhishek
I am not sure if there is a way to do it with resample. But you can get what you want using groupby and pd.Grouper (called pd.TimeGrouper in older pandas versions; that name was later removed).
import datetime
from pandas_datareader import data as pdr
import pandas as pd
end = datetime.date.today()
begin = end - pd.DateOffset(365*2)
st = begin.strftime('%Y-%m-%d')
ed = end.strftime('%Y-%m-%d')
data = pdr.get_data_yahoo("AAPL", st, ed)
# Keep the actual trading date as a column so it survives the grouping
data['Date'] = data.index
mon_data = (
    data[['Date', 'Adj Close']]
    .groupby(pd.Grouper(freq='M')).nth(-2)  # second-to-last row of each month
    .set_index('Date')
)
The simplest solution is to take the index of your newly created dataframe and subtract the number of days you want to go back:
n = 1
mon_data=pd.DataFrame(data['Adj Close'].resample('M').apply(lambda x: x[-1-n]))
mon_data.index = mon_data.index - datetime.timedelta(days=n)
Also, looking at your data, I think you should resample not to 'month end frequency' but rather to 'business month end frequency':
.resample('BM')
But even that won't cover everything. For instance, December 29, 2017 is a business month end, but this date doesn't appear in your data (which ends on December 08, 2017). So you could add a small fix for that (assuming the original data is sorted by date):
end_of_months = mon_data.index.tolist()
end_of_months[-1] = data.index[-1]
mon_data.index = end_of_months
So the full code will look like:
n = 1
mon_data=pd.DataFrame(data['Adj Close'].resample('BM').apply(lambda x: x[-1-n]))
end_of_months = mon_data.index.tolist()
end_of_months[-1] = data.index[-1]
mon_data.index = end_of_months
mon_data.index = mon_data.index - datetime.timedelta(days=n)
By the way, your .set_index(data.index) throws an error because data and mon_data have different lengths (mon_data is grouped by month).
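One caveat with subtracting a fixed number of days is that the result may not be a trading day (for example, a month end on a Monday minus one day is a Sunday). As an alternative sketch, the rows can be ranked from the end of each month so the real trading date stays in the index. The data here is synthetic (business days only, made-up prices) and merely stands in for data['Adj Close']:

```python
import pandas as pd

# Synthetic stand-in for data['Adj Close']: business days only, made-up prices.
idx = pd.bdate_range("2015-11-02", "2015-12-31")
prices = pd.Series(range(len(idx)), index=idx, dtype=float, name="Adj Close")

n = 1  # 0 = last trading day of the month, 1 = penultimate, and so on

# Number the rows of each (year, month) group from the end: last row gets 0.
rank_from_end = prices.groupby(
    [prices.index.year, prices.index.month]
).cumcount(ascending=False)

# Keep the rows at the requested position; their original dates are preserved.
penultimate = prices[rank_from_end == n]

print(penultimate.index.strftime("%Y-%m-%d").tolist())
# ['2015-11-27', '2015-12-30']
```

Because the selection is a boolean mask on the original series, a month with fewer than n + 1 rows is simply skipped instead of raising an error.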
I want to use look ups for any online index, including those with digits. A random example is:
https://uk.finance.yahoo.com/quote/YSM6.AX/futures?p=YSM6.AX
A naive method is to use pandas-datareader:
from pandas_datareader import data as datareader
online_data = datareader.DataReader('YSM6.AX', 'yahoo', start, end)
However, this doesn't work. I think the digits in the ticker aren't handled properly. This command works fine with e.g. "AAPL".
How do I get this to work for any index?
The YSM6.AX link shows that there is no data for this stock.
If you want to grab multiple stocks and get specifically the adjusted close, you can use this code. It takes into account any unusual tickers that have either a "-" or, as in the case of YSM6.AX, a "." inside the ticker.
import pandas as pd
import datetime
from pandas_datareader import data, wb
tickers = ["BRK.B", "AAPL", "MSFT", "YHOO", "JPM"]
series_list = []
start = datetime.datetime(2012, 4, 5)
end = datetime.datetime(2017, 3, 28)
for security in tickers:
    # Yahoo uses "-" where some tickers are written with "."
    s = data.DataReader(security.replace(".", "-"), "yahoo", start, end)["Adj Close"]
    s.name = security
    series_list.append(s)
df = pd.concat(series_list, axis=1)
stocks= pd.DataFrame(df)
stocks
If you look at the link you have provided, YSM6 is a futures contract on ASX. Specifically it is the M6 expiry, meaning 2016-06. And Yahoo has no data for this contract on their site anymore--perhaps because it is expired, or perhaps because there was never any data available for it. Furthermore, this product (3 year AU interest rate swap futures) seems to have been discontinued by the exchange.
Your question says you want "stock" data. Here's an example of an actual stock with a numeric symbol:
https://uk.finance.yahoo.com/quote/7203.KL/?p=7203.KL