I can't figure out how to get data for a given day. Using the annual line in my code, I know the millisecond value of a given date:
1612159200000.00 AAPL 2/1/2021 6:00
1612418400000.00 AAPL 2/4/2021 6:00
But putting these values in the code doesn't work:
data=get_price_history(symbol=i, endDate=1612418400000, startDate=1612159200000, frequency=1, frequencyType='daily')
import requests
import pandas as pd
import time
import datetime

# tickers_list = ['AAPL', 'AMGN', 'AXP']
# print(len(tickers_list))
key = '****'

def get_price_history(**kwargs):
    url = 'https://api.tdameritrade.com/v1/marketdata/{}/pricehistory'.format(kwargs.get('symbol'))
    params = {}
    params.update({'apikey': key})
    for arg in kwargs:
        parameter = {arg: kwargs.get(arg)}
        params.update(parameter)
    return requests.get(url, params=params).json()

tickers_list = ['AAPL', 'AMGN', 'WMT']
historical = pd.DataFrame()
for i in tickers_list:
    # get data 1 year 1 day frequency -- good
    # data = get_price_history(symbol=i, period=1, periodType='year', frequency=1, frequencyType='daily')
    data = get_price_history(symbol=i, endDate=1612418400000, startDate=1612159200000, frequency=1, frequencyType='daily')
    info = pd.DataFrame(data['candles'])
    historical = pd.concat([historical, info])
historical['date'] = pd.to_datetime(historical['datetime'], unit='ms')
historical
From the Ameritrade Price History API documentation:
6 Months / 1 Day, including today's data:
https://api.tdameritrade.com/v1/marketdata/XYZ/pricehistory?periodType=month&frequencyType=daily&endDate=1464825600000
Note that periodType=month is specified because the default periodType is day, which is not compatible with the frequencyType daily.
So it seems that this line in your code:
data=get_price_history(symbol=i, endDate=1612418400000, startDate=1612159200000, frequency=1, frequencyType='daily')
is missing a valid periodType parameter. Try:
data=get_price_history(symbol=i, endDate=1612418400000, startDate=1612159200000, frequency=1, periodType='month', frequencyType='daily')
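If you need to produce those millisecond values yourself, here is a minimal sketch, assuming the API expects milliseconds since the Unix epoch (as the documentation example above suggests) and that you want midnight UTC; adjust the timezone if your dates are in local market time:

from datetime import datetime, timezone

def to_epoch_ms(year, month, day, hour=0, minute=0):
    # Convert a UTC calendar date to milliseconds since the Unix epoch
    dt = datetime(year, month, day, hour, minute, tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)

startDate = to_epoch_ms(2021, 2, 1)  # 1612137600000 (midnight UTC)
endDate = to_epoch_ms(2021, 2, 4)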
Step 1: you need a valid session.
Step 2: you can use the tda-api function get_price_history().
See the example below, which I successfully used to get daily data given a start and end date:
import httpx
import pandas as pd
from datetime import datetime
from tda.auth import easy_client

# need a valid refresh token to use easy_client
Client = easy_client(
    api_key='APIKEY',
    redirect_uri='https://localhost',
    token_path='/tmp/token.json')

# get daily data given a start and end date
resp = Client.get_price_history('AAPL',
    period_type=Client.PriceHistory.PeriodType.YEAR,
    start_datetime=datetime(2019, 9, 30),
    end_datetime=datetime(2019, 10, 30),
    frequency_type=Client.PriceHistory.FrequencyType.DAILY,
    frequency=Client.PriceHistory.Frequency.DAILY)
assert resp.status_code == httpx.codes.OK
history = resp.json()
aapl = pd.DataFrame(history)
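One caveat: in the JSON returned by this endpoint, the OHLCV rows are nested under a 'candles' key (the same key the question's code uses), so flattening that list tends to give a cleaner frame. A minimal sketch:

candles = pd.DataFrame(history['candles'])
candles['date'] = pd.to_datetime(candles['datetime'], unit='ms')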
I'm working on Python code to update and append token price and volume data from gate.io's API to a .csv file. Basically, it checks whether the file is up to date and, if not, updates it with the most recent hour's data. The code below isn't throwing any errors, but it's not working. My columns are all in the same order as they are in the code. Any assistance would be greatly appreciated, thank you.
import requests
import pandas as pd
from datetime import datetime
# Define API endpoint and parameters
host = "https://api.gateio.ws"
prefix = "/api/v4"
url = '/spot/candlesticks'
currency_pair = "BTC_USDT"
interval = "1h"
# Read the existing data from the csv file
df = pd.read_csv("price_calcs.csv")
# Extract the last timestamp from the csv file
last_timestamp = df["time1"].iloc[-1]
# Convert the timestamp to datetime and add an hour to get the new "from" parameter
from_time = datetime.utcfromtimestamp(last_timestamp).strftime('%Y-%m-%d %H:%M:%S')
to_time = datetime.utcnow().strftime('%Y-%m-%d %H:%M:%S')
# Use the last timestamp to make a 'GET' request to the API to get the latest hourly data for the token
query_params = {"currency_pair": currency_pair, "from": from_time, "to": to_time, "interval": interval}
r = requests.get(host + prefix + url, params=query_params)
# Append the new data to the existing data from the csv file
new_data = pd.DataFrame(r.json(), columns=["time1", "volume1", "close1", "high1", "low1", "open1", "volume2"])
df = pd.concat([df, new_data])
# Write the updated data to the csv file
df.to_csv("price_calcs.csv", index=False)
Never mind, I figured it out myself.
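For anyone who lands here with the same problem: the most likely culprit is the timestamp format. The gate.io /spot/candlesticks endpoint expects from and to as Unix timestamps in seconds, not formatted datetime strings. A minimal sketch of the corrected request, keeping the question's column order (an assumption carried over from the original code):

import requests
import pandas as pd
from datetime import datetime, timezone

host = "https://api.gateio.ws"
prefix = "/api/v4"
url = '/spot/candlesticks'

df = pd.read_csv("price_calcs.csv")
last_timestamp = int(df["time1"].iloc[-1])

# Pass raw Unix timestamps (seconds); start one hour after the last saved candle
query_params = {
    "currency_pair": "BTC_USDT",
    "from": last_timestamp + 3600,
    "to": int(datetime.now(timezone.utc).timestamp()),
    "interval": "1h",
}
r = requests.get(host + prefix + url, params=query_params)
new_data = pd.DataFrame(r.json(), columns=["time1", "volume1", "close1", "high1", "low1", "open1", "volume2"])
pd.concat([df, new_data]).to_csv("price_calcs.csv", index=False)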
I have a class assignment to write a Python program to download end-of-day data for the last 25 years for the major global stock market indices from Yahoo Finance:
Dow Jones Index (USA)
S&P 500 (USA)
NASDAQ (USA)
DAX (Germany)
FTSE (UK)
HANGSENG (Hong Kong)
KOSPI (Korea)
CNX NIFTY (India)
Unfortunately, when I run the program an error occurs.
File "C:\ProgramData\Anaconda2\lib\site-packages\yahoofinancials__init__.py", line 91, in format_date
form_date = datetime.datetime.fromtimestamp(int(in_date)).strftime('%Y-%m-%d')
ValueError: timestamp out of range for platform localtime()/gmtime() function
Below you can see the code that I have written. I'm trying to debug my mistakes. Can you help me out please? Thanks
from yahoofinancials import YahooFinancials
import pandas as pd
# Select Tickers and stock history dates
index1 = '^DJI'
index2 = '^GSPC'
index3 = '^IXIC'
index4 = '^GDAXI'
index5 = '^FTSE'
index6 = '^HSI'
index7 = '^KS11'
index8 = '^NSEI'
freq = 'daily'
start_date = '1993-06-30'
end_date = '2018-06-30'
# Function to clean data extracts
def clean_stock_data(stock_data_list):
    new_list = []
    for rec in stock_data_list:
        if 'type' not in rec.keys():
            new_list.append(rec)
    return new_list
# Construct yahoo financials objects for data extraction
dji_financials = YahooFinancials(index1)
gspc_financials = YahooFinancials(index2)
ixic_financials = YahooFinancials(index3)
gdaxi_financials = YahooFinancials(index4)
ftse_financials = YahooFinancials(index5)
hsi_financials = YahooFinancials(index6)
ks11_financials = YahooFinancials(index7)
nsei_financials = YahooFinancials(index8)
# Clean returned stock history data and remove dividend events from price history
daily_dji_data = clean_stock_data(dji_financials
    .get_historical_stock_data(start_date, end_date, freq)[index1]['prices'])
daily_gspc_data = clean_stock_data(gspc_financials
    .get_historical_stock_data(start_date, end_date, freq)[index2]['prices'])
daily_ixic_data = clean_stock_data(ixic_financials
    .get_historical_stock_data(start_date, end_date, freq)[index3]['prices'])
daily_gdaxi_data = clean_stock_data(gdaxi_financials
    .get_historical_stock_data(start_date, end_date, freq)[index4]['prices'])
daily_ftse_data = clean_stock_data(ftse_financials
    .get_historical_stock_data(start_date, end_date, freq)[index5]['prices'])
daily_hsi_data = clean_stock_data(hsi_financials
    .get_historical_stock_data(start_date, end_date, freq)[index6]['prices'])
daily_ks11_data = clean_stock_data(ks11_financials
    .get_historical_stock_data(start_date, end_date, freq)[index7]['prices'])
daily_nsei_data = clean_stock_data(nsei_financials
    .get_historical_stock_data(start_date, end_date, freq)[index8]['prices'])
stock_hist_data_list = [{'^DJI': daily_dji_data}, {'^GSPC': daily_gspc_data}, {'^IXIC': daily_ixic_data},
                        {'^GDAXI': daily_gdaxi_data}, {'^FTSE': daily_ftse_data}, {'^HSI': daily_hsi_data},
                        {'^KS11': daily_ks11_data}, {'^NSEI': daily_nsei_data}]
# Function to construct data frame based on a stock and its market index
def build_data_frame(data_list1, data_list2, data_list3, data_list4, data_list5, data_list6, data_list7, data_list8):
    data_dict = {}
    i = 0
    for list_item in data_list2:
        if 'type' not in list_item.keys():
            data_dict.update({list_item['formatted_date']: {
                '^DJI': data_list1[i]['close'], '^GSPC': list_item['close'],
                '^IXIC': data_list3[i]['close'], '^GDAXI': data_list4[i]['close'],
                '^FTSE': data_list5[i]['close'], '^HSI': data_list6[i]['close'],
                '^KS11': data_list7[i]['close'], '^NSEI': data_list8[i]['close']}})
        i += 1
    tseries = pd.to_datetime(list(data_dict.keys()))
    df = pd.DataFrame(data=list(data_dict.values()), index=tseries,
                      columns=['^DJI', '^GSPC', '^IXIC', '^GDAXI', '^FTSE', '^HSI', '^KS11', '^NSEI']).sort_index()
    return df
Your problem is that your datetime stamps are in the wrong format. If you look at the error message, it gives you a clue:
datetime.datetime.fromtimestamp(int(in_date)).strftime('%Y-%m-%d')
Notice the int(in_date) part?
It wants a Unix timestamp. There are several ways to get this, from the time module or the calendar module, or using Arrow.
import datetime
import calendar
date = datetime.datetime.strptime("1993-06-30", "%Y-%m-%d")
start_date = calendar.timegm(date.utctimetuple())
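For comparison, a local-time variant using only the time module (a sketch; note that calendar.timegm above interprets the date as UTC, while time.mktime uses your local timezone):

import time
import datetime

date = datetime.datetime.strptime("1993-06-30", "%Y-%m-%d")
start_date = int(time.mktime(date.timetuple()))  # seconds since epoch, local time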
* UPDATED *
OK so I fixed up to the dataframes portion. Here is my current code:
# Select Tickers and stock history dates
index = {'DJI': YahooFinancials('^DJI'),
         'GSPC': YahooFinancials('^GSPC'),
         'IXIC': YahooFinancials('^IXIC'),
         'GDAXI': YahooFinancials('^GDAXI'),
         'FTSE': YahooFinancials('^FTSE'),
         'HSI': YahooFinancials('^HSI'),
         'KS11': YahooFinancials('^KS11'),
         'NSEI': YahooFinancials('^NSEI')}
freq = 'daily'
start_date = '1993-06-30'
end_date = '2018-06-30'

# Clean returned stock history data and remove dividend events from price history
daily = {}
for k in index:
    tmp = index[k].get_historical_stock_data(start_date, end_date, freq)
    if tmp:
        daily[k] = tmp['^{}'.format(k)]['prices'] if 'prices' in tmp['^{}'.format(k)] else []
Unfortunately I had to fix a couple of things in the yahoo module. For the class YahooFinanceETL:
@staticmethod
def format_date(in_date, convert_type):
    try:
        x = int(in_date)
        convert_type = 'standard'
    except:
        convert_type = 'unixstamp'
    if convert_type == 'standard':
        if in_date < 0:
            form_date = datetime.datetime(1970, 1, 1) + datetime.timedelta(seconds=in_date)
        else:
            form_date = datetime.datetime.fromtimestamp(int(in_date)).strftime('%Y-%m-%d')
    else:
        split_date = in_date.split('-')
        d = date(int(split_date[0]), int(split_date[1]), int(split_date[2]))
        form_date = int(time.mktime(d.timetuple()))
    return form_date
AND:
# private static method to scrape data from yahoo finance
@staticmethod
def _scrape_data(url, tech_type, statement_type):
    response = requests.get(url)
    soup = BeautifulSoup(response.content, "html.parser")
    script = soup.find("script", text=re.compile("root.App.main")).text
    data = loads(re.search(r"root.App.main\s+=\s+(\{.*\})", script).group(1))
    if tech_type == '' and statement_type != 'history':
        stores = data["context"]["dispatcher"]["stores"]["QuoteSummaryStore"]
    elif tech_type != '' and statement_type != 'history':
        stores = data["context"]["dispatcher"]["stores"]["QuoteSummaryStore"][tech_type]
    else:
        if "HistoricalPriceStore" in data["context"]["dispatcher"]["stores"]:
            stores = data["context"]["dispatcher"]["stores"]["HistoricalPriceStore"]
        else:
            stores = data["context"]["dispatcher"]["stores"]["QuoteSummaryStore"]
    return stores
You will want to look at the daily dict and rewrite your build_data_frame function; it should be a lot simpler now since you are working with a dictionary already. A sketch is given below.
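For example, here is a minimal sketch of a rewritten build_data_frame, assuming each entry of daily is a list of dicts with 'formatted_date' and 'close' keys as in the original prices data:

import pandas as pd

def build_data_frame(daily):
    # One column per index, keyed by formatted date
    columns = {}
    for name, prices in daily.items():
        clean = [p for p in prices if 'type' not in p]
        columns['^{}'.format(name)] = pd.Series(
            {p['formatted_date']: p['close'] for p in clean})
    df = pd.DataFrame(columns)
    df.index = pd.to_datetime(df.index)
    return df.sort_index()

df = build_data_frame(daily)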
I am actually the maintainer and author of YahooFinancials. I just saw this post and wanted to personally apologize for the inconvenience and let you all know I will be working on fixing the module this evening.
Could you please open an issue on the module's Github page detailing this?
It would also be very helpful to know which version of python you were running when you encountered these issues.
https://github.com/JECSand/yahoofinancials/issues
I am at work right now, however as soon as I get home in ~7 hours or so I will attempt to code a fix and release it. I'll also work on the exception handling. I try my best to maintain this module, but my day (and often night time) job is rather demanding. I will report back with the final results of these fixes and publish to pypi when it is done and stable.
Also, if anyone else has any feedback or personal fixes you can offer, it would be a huge help in fixing this. Proper credit will be given, of course. I am also in desperate need of contributors, so if anyone is interested in that as well, let me know. I really want to take YahooFinancials to the next level and have this project become a stable and reliable alternative for free financial data for Python projects.
Thank you for your patience and for using YahooFinancials.
I came across a very useful set of scripts on the Shane Lynn blog for the analysis of weather data. The first script, used to scrape data from Weather Underground, is as follows:
import requests
import pandas as pd
from dateutil import parser, rrule
from datetime import datetime, time, date
import time
def getRainfallData(station, day, month, year):
    """
    Function to return a data frame of minute-level weather data for a single Wunderground PWS station.

    Args:
        station (string): Station code from the Wunderground website
        day (int): Day of month for which data is requested
        month (int): Month for which data is requested
        year (int): Year for which data is requested

    Returns:
        Pandas Dataframe with weather data for specified station and date.
    """
    url = "http://www.wunderground.com/weatherstation/WXDailyHistory.asp?ID={station}&day={day}&month={month}&year={year}&graphspan=day&format=1"
    full_url = url.format(station=station, day=day, month=month, year=year)
    # Request data from wunderground data
    response = requests.get(full_url, headers={'User-agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36'})
    data = response.text
    # remove the excess <br> from the text data
    data = data.replace('<br>', '')
    # Convert to pandas dataframe (fails if issues with weather station)
    try:
        dataframe = pd.read_csv(io.StringIO(data), index_col=False)
        dataframe['station'] = station
    except Exception as e:
        print("Issue with date: {}-{}-{} for station {}".format(day, month, year, station))
        return None
    return dataframe
# Generate a list of all of the dates we want data for
start_date = "2016-08-01"
end_date = "2016-08-31"
start = parser.parse(start_date)
end = parser.parse(end_date)
dates = list(rrule.rrule(rrule.DAILY, dtstart=start, until=end))

# Create a list of stations here to download data for
stations = ["ILONDON28"]

# Set a backoff time in seconds if a request fails
backoff_time = 10
data = {}

# Gather data for each station in turn and save to CSV.
for station in stations:
    print("Working on {}".format(station))
    data[station] = []
    for date in dates:
        # Print period status update messages
        if date.day % 10 == 0:
            print("Working on date: {} for station {}".format(date, station))
        done = False
        while done == False:
            try:
                weather_data = getRainfallData(station, date.day, date.month, date.year)
                done = True
            except ConnectionError as e:
                # May get rate limited by Wunderground.com, backoff if so.
                print("Got connection error on {}".format(date))
                print("Will retry in {} seconds".format(backoff_time))
                time.sleep(10)
        # Add each processed date to the overall data
        data[station].append(weather_data)
    # Finally combine all of the individual days and output to CSV for analysis.
    pd.concat(data[station]).to_csv("data/{}_weather.csv".format(station))
However, I get the error:
Working on ILONDONL28
Issue with date: 1-8-2016 for station ILONDONL28
Issue with date: 2-8-2016 for station ILONDONL28
Issue with date: 3-8-2016 for station ILONDONL28
Issue with date: 4-8-2016 for station ILONDONL28
Issue with date: 5-8-2016 for station ILONDONL28
Issue with date: 6-8-2016 for station ILONDONL28
Can anyone help me with this error?
The data for the chosen station and the time period is available, as shown at this link.
The output you are getting is because an exception is being raised. If you added a print(e) you would see that this is because import io was missing from the top of the script. Secondly, the station name you gave was out by one character. Try the following:
import io
import requests
import pandas as pd
from dateutil import parser, rrule
from datetime import datetime, time, date
import time
def getRainfallData(station, day, month, year):
    """
    Function to return a data frame of minute-level weather data for a single Wunderground PWS station.

    Args:
        station (string): Station code from the Wunderground website
        day (int): Day of month for which data is requested
        month (int): Month for which data is requested
        year (int): Year for which data is requested

    Returns:
        Pandas Dataframe with weather data for specified station and date.
    """
    url = "http://www.wunderground.com/weatherstation/WXDailyHistory.asp?ID={station}&day={day}&month={month}&year={year}&graphspan=day&format=1"
    full_url = url.format(station=station, day=day, month=month, year=year)
    # Request data from wunderground data
    response = requests.get(full_url)
    data = response.text
    # remove the excess <br> from the text data
    data = data.replace('<br>', '')
    # Convert to pandas dataframe (fails if issues with weather station)
    try:
        dataframe = pd.read_csv(io.StringIO(data), index_col=False)
        dataframe['station'] = station
    except Exception as e:
        print("Issue with date: {}-{}-{} for station {}".format(day, month, year, station))
        return None
    return dataframe

# Generate a list of all of the dates we want data for
start_date = "2016-08-01"
end_date = "2016-08-31"
start = parser.parse(start_date)
end = parser.parse(end_date)
dates = list(rrule.rrule(rrule.DAILY, dtstart=start, until=end))

# Create a list of stations here to download data for
stations = ["ILONDONL28"]

# Set a backoff time in seconds if a request fails
backoff_time = 10
data = {}

# Gather data for each station in turn and save to CSV.
for station in stations:
    print("Working on {}".format(station))
    data[station] = []
    for date in dates:
        # Print period status update messages
        if date.day % 10 == 0:
            print("Working on date: {} for station {}".format(date, station))
        done = False
        while done == False:
            try:
                weather_data = getRainfallData(station, date.day, date.month, date.year)
                done = True
            except ConnectionError as e:
                # May get rate limited by Wunderground.com, backoff if so.
                print("Got connection error on {}".format(date))
                print("Will retry in {} seconds".format(backoff_time))
                time.sleep(10)
        # Add each processed date to the overall data
        data[station].append(weather_data)
    # Finally combine all of the individual days and output to CSV for analysis.
    pd.concat(data[station]).to_csv(r"data/{}_weather.csv".format(station))
Giving you an output CSV file starting as follows:
,Time,TemperatureC,DewpointC,PressurehPa,WindDirection,WindDirectionDegrees,WindSpeedKMH,WindSpeedGustKMH,Humidity,HourlyPrecipMM,Conditions,Clouds,dailyrainMM,SoftwareType,DateUTC,station
0,2016-08-01 00:05:00,17.8,11.6,1017.5,ESE,120,0.0,0.0,67,0.0,,,0.0,WeatherCatV2.31B93,2016-07-31 23:05:00,ILONDONL28
1,2016-08-01 00:20:00,17.7,11.0,1017.5,SE,141,0.0,0.0,65,0.0,,,0.0,WeatherCatV2.31B93,2016-07-31 23:20:00,ILONDONL28
2,2016-08-01 00:35:00,17.5,10.8,1017.5,South,174,0.0,0.0,65,0.0,,,0.0,WeatherCatV2.31B93,2016-07-31 23:35:00,ILONDONL28
If you are not getting a CSV file, I suggest you add a full path to the output filename.
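Relatedly, to_csv will raise an error if the data/ directory does not already exist, so creating it up front is a cheap safeguard (a sketch):

import os

os.makedirs("data", exist_ok=True)  # ensure the output directory exists
pd.concat(data[station]).to_csv(r"data/{}_weather.csv".format(station))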
PLEASE NOTE: This question was successfully answered by ptrj below. I have also written a post about my experiences with zipline on my blog, which you can find here: https://financialzipline.wordpress.com
I'm based in South Africa and I'm trying to load South African shares into a dataframe so that it will feed zipline with share price information. Let's say I'm looking at AdCorp Holdings Limited as listed on the JSE (Johannesburg Stock Exchange):
Google Finance gives me the historical price info:
https://www.google.com/finance/historical?q=JSE%3AADR&ei=5G6OV4ibBIi8UcP-nfgB
Yahoo Finance has no information on the company.
https://finance.yahoo.com/quote/adcorp?ltr=1
Typing the following code into an IPython Notebook gets me the dataframe for the information from Google Finance:
import datetime
import pandas_datareader.data as web

start = datetime.datetime(2016, 7, 1)
end = datetime.datetime(2016, 7, 18)
f = web.DataReader('JSE:ADR', 'google', start, end)
If I display f, I see that the information actually corresponds to the info off Google Finance as well:
The prices are exactly those from Google Finance; you can see that the info for 2016-07-18 on the Google Finance website matches my dataframe exactly.
However, I'm not sure how to load this dataframe so that it can be used by zipline as a data bundle.
If you look at the example given for buyapple.py, you can see that it just pulls the data for Apple shares (AAPL) from the ingested data bundle quantopian-quandl. The challenge here is to replace AAPL with JSE:ADR so that it will order 10 JSE:ADR shares a day, as fed from the dataframe instead of the data bundle quantopian-quandl, and plot it on a graph.
Does anyone know how to do this?
There are almost no examples on the net that deal with this...
This is the buyapple.py code as supplied in zipline's example folder:
from zipline.api import order, record, symbol

def initialize(context):
    pass

def handle_data(context, data):
    order(symbol('AAPL'), 10)
    record(AAPL=data.current(symbol('AAPL'), 'price'))

# Note: this function can be removed if running
# this algorithm on quantopian.com
def analyze(context=None, results=None):
    import matplotlib.pyplot as plt
    # Plot the portfolio and asset data.
    ax1 = plt.subplot(211)
    results.portfolio_value.plot(ax=ax1)
    ax1.set_ylabel('Portfolio value (USD)')
    ax2 = plt.subplot(212, sharex=ax1)
    results.AAPL.plot(ax=ax2)
    ax2.set_ylabel('AAPL price (USD)')
    # Show the plot.
    plt.gcf().set_size_inches(18, 8)
    plt.show()

def _test_args():
    """Extra arguments to use when zipline's automated tests run this example.
    """
    import pandas as pd
    return {
        'start': pd.Timestamp('2014-01-01', tz='utc'),
        'end': pd.Timestamp('2014-11-01', tz='utc'),
    }
EDIT:
I looked at the code for ingesting the data from Yahoo Finance and modified it a little to make it take on Google Finance data. The code for the Yahoo Finance can be found here: http://www.zipline.io/_modules/zipline/data/bundles/yahoo.html.
This is my code to ingest Google Finance data; sadly, it is not working. Can someone more fluent in Python assist me?
import os

import numpy as np
import pandas as pd
from pandas_datareader.data import DataReader
import requests

from zipline.utils.cli import maybe_show_progress

def _cachpath(symbol, type_):
    return '-'.join((symbol.replace(os.path.sep, '_'), type_))

def google_equities(symbols, start=None, end=None):
    """Create a data bundle ingest function from a set of symbols loaded from
    Google Finance.

    Parameters
    ----------
    symbols : iterable[str]
        The ticker symbols to load data for.
    start : datetime, optional
        The start date to query for. By default this pulls the full history
        for the calendar.
    end : datetime, optional
        The end date to query for. By default this pulls the full history
        for the calendar.

    Returns
    -------
    ingest : callable
        The bundle ingest function for the given set of symbols.

    Examples
    --------
    This code should be added to ~/.zipline/extension.py

    .. code-block:: python

        from zipline.data.bundles import google_equities, register

        symbols = (
            'AAPL',
            'IBM',
            'MSFT',
        )
        register('my_bundle', google_equities(symbols))

    Notes
    -----
    The sids for each symbol will be the index into the symbols sequence.
    """
    # strict this in memory so that we can reiterate over it
    symbols = tuple(symbols)

    def ingest(environ,
               asset_db_writer,
               minute_bar_writer,  # unused
               daily_bar_writer,
               adjustment_writer,
               calendar,
               cache,
               show_progress,
               output_dir,
               # pass these as defaults to make them 'nonlocal' in py2
               start=start,
               end=end):
        if start is None:
            start = calendar[0]
        if end is None:
            end = None

        metadata = pd.DataFrame(np.empty(len(symbols), dtype=[
            ('start_date', 'datetime64[ns]'),
            ('end_date', 'datetime64[ns]'),
            ('auto_close_date', 'datetime64[ns]'),
            ('symbol', 'object'),
        ]))

        def _pricing_iter():
            sid = 0
            with maybe_show_progress(
                    symbols,
                    show_progress,
                    label='Downloading Google pricing data: ') as it, \
                    requests.Session() as session:
                for symbol in it:
                    path = _cachpath(symbol, 'ohlcv')
                    try:
                        df = cache[path]
                    except KeyError:
                        df = cache[path] = DataReader(
                            symbol,
                            'google',
                            start,
                            end,
                            session=session,
                        ).sort_index()

                    # the start date is the date of the first trade and
                    # the end date is the date of the last trade
                    start_date = df.index[0]
                    end_date = df.index[-1]
                    # The auto_close date is the day after the last trade.
                    ac_date = end_date + pd.Timedelta(days=1)
                    metadata.iloc[sid] = start_date, end_date, ac_date, symbol

                    df.rename(
                        columns={
                            'Open': 'open',
                            'High': 'high',
                            'Low': 'low',
                            'Close': 'close',
                            'Volume': 'volume',
                        },
                        inplace=True,
                    )
                    yield sid, df
                    sid += 1

        daily_bar_writer.write(_pricing_iter(), show_progress=True)

        symbol_map = pd.Series(metadata.symbol.index, metadata.symbol)
        asset_db_writer.write(equities=metadata)

        adjustment_writer.write(splits=pd.DataFrame(), dividends=pd.DataFrame())
        # adjustments = []
        # with maybe_show_progress(
        #         symbols,
        #         show_progress,
        #         label='Downloading Google adjustment data: ') as it, \
        #         requests.Session() as session:
        #     for symbol in it:
        #         path = _cachpath(symbol, 'adjustment')
        #         try:
        #             df = cache[path]
        #         except KeyError:
        #             df = cache[path] = DataReader(
        #                 symbol,
        #                 'google-actions',
        #                 start,
        #                 end,
        #                 session=session,
        #             ).sort_index()
        #         df['sid'] = symbol_map[symbol]
        #         adjustments.append(df)
        # adj_df = pd.concat(adjustments)
        # adj_df.index.name = 'date'
        # adj_df.reset_index(inplace=True)
        # splits = adj_df[adj_df.action == 'SPLIT']
        # splits = splits.rename(
        #     columns={'value': 'ratio', 'date': 'effective_date'},
        # )
        # splits.drop('action', axis=1, inplace=True)
        # dividends = adj_df[adj_df.action == 'DIVIDEND']
        # dividends = dividends.rename(
        #     columns={'value': 'amount', 'date': 'ex_date'},
        # )
        # dividends.drop('action', axis=1, inplace=True)
        # # we do not have this data in the google dataset
        # dividends['record_date'] = pd.NaT
        # dividends['declared_date'] = pd.NaT
        # dividends['pay_date'] = pd.NaT
        # adjustment_writer.write(splits=splits, dividends=dividends)

    return ingest
I followed tutorials on http://www.zipline.io/ and I made it work with the following steps:
Prepare an ingestion function for google equities.
The same code you pasted (based on file yahoo.py) with the following modification:
# Replace line
# adjustment_writer.write(splits=pd.DataFrame(), dividends=pd.DataFrame())
# with line
adjustment_writer.write()
I named the file google.py and copied it to the subdirectory zipline/data/bundles of the zipline install directory. (It can be placed anywhere on the Python path. Or you can modify zipline/data/bundles/__init__.py to be able to call it the same way as yahoo_equities; see the sketch below.)
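For reference, a hypothetical one-line addition to zipline/data/bundles/__init__.py, mirroring how yahoo_equities is exposed there (an assumption based on the yahoo module, not verified against every zipline version):

from .google import google_equities  # hypothetical; mirrors the yahoo_equities import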
Ingest (see http://www.zipline.io/bundles.html)
Add the following lines to the file .zipline/extension.py in the home directory; the home directory is your user directory on Windows (C:\Users\your username). The .zipline folder is a hidden folder, so you may have to unhide files to see it.
from zipline.data.bundles import register
from zipline.data.bundles.google import google_equities
equities2 = {
    'JSE:ADR',
}
register(
    'my-google-equities-bundle',  # name this whatever you like
    google_equities(equities2),
)
And run
zipline ingest -b my-google-equities-bundle
Test (as in http://www.zipline.io/beginner-tutorial.html)
I took the example file zipline/examples/buyapple.py (the same one you pasted), replaced both occurrences of the symbol 'AAPL' with 'JSE:ADR', renamed it to buyadcorp.py, and ran
python -m zipline run -f buyadcorp.py --bundle my-google-equities-bundle --start 2000-1-1 --end 2014-1-1
The outcome was consistent with the data downloaded directly from Google Finance.
I've been working with the Quandl API recently and I've been stuck on an issue for a while. My question is how to create a method that returns the difference between one date and the date before for a stock index. Data seems to come out as an array, for example: [[u'2015-04-30', 17840.52]] for the Dow Jones Industrial Average. I'd also like a way to get the change from one day before the latest one, say getting Friday's price and the change between that and the day before.
My code:
import requests
import json

def fetchData(apikey, url):
    '''Returns JSON data of the Dow Jones Average.'''
    parameters = {'rows': 1, 'auth_token': apikey}
    req = requests.get(url, params=parameters)
    data = json.loads(req.content)
    parsedData = []
    stockData = {}
    for datum in data:
        if data['code'] == 'COMP':
            stockData['name'] = data['name']
            stockData['description'] = '''The NASDAQ Composite Index measures all
            NASDAQ domestic and international based common type stocks listed on The NASDAQ Stock Market.'''
            stockData['data'] = data['data']
            stockData['code'] = data['code']
        else:
            stockData['name'] = data['name']
            stockData['description'] = data['description']
            stockData['data'] = data['data']
            stockData['code'] = data['code']
    parsedData.append(stockData)
    return parsedData
I've attempted to just tack a [1] onto data to get just the current day, but the issue of getting the day before has me stumped.
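Since this never got a proper answer here, a minimal sketch of one approach: request two rows instead of one and diff them. The response shape ([[date, value], ...] under a 'data' key) is assumed from the question's own code:

import requests

def fetchLatestChange(apikey, url):
    '''Return (latest_date, latest_value, change_vs_previous_day).'''
    parameters = {'rows': 2, 'auth_token': apikey}  # two most recent rows
    data = requests.get(url, params=parameters).json()
    rows = data['data']  # e.g. [[u'2015-04-30', 17840.52], [u'2015-04-29', 18035.53]]
    (latest_date, latest_value), (_, previous_value) = rows[0], rows[1]
    return latest_date, latest_value, latest_value - previous_value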