I'm trying to get stock prices to buy or sell on a specific date. When the buy price is given, the sell price should be NaN. Likewise, if the sell price is given, the buy price has to be NaN. This function and code were originally proposed by Joseph Hart (https://medium.com/analytics-vidhya/sma-short-moving-average-in-python-c656956a08f8).
The function returns (sig_price_buy, sig_price_sell). My data source is a Pandas DataFrame, namely qqq_df. The sma_30 and sma_100 columns are computed from qqq_df.
The output is not giving me the expected result described above. The code is shown below. I need specific steps and code to resolve the issue. I look forward to hearing from forum members. Thanks.
def buy_sell(qqq_df):
    sig_price_buy = []
    sig_price_sell = []
    flag = -1
    for i in range(len(qqq_df)):
        if qqq_df['sma_30'][i] > qqq_df['sma_100'][i]:
            if flag != 1:
                sig_price_buy.append(qqq_df['close'][i])
                sig_price_sell.append(np.nan)
                print(qqq_df['date'][i])
            else:
                sig_price_buy.append(np.nan)
                sig_price_buy.append(np.nan)
        elif qqq_df['sma_30'][i] < qqq_df['sma_100'][i]:
            if flag != 0:
                sig_price_buy.append(np.nan)
                sig_price_sell.append(qqq_df['close'][i])
                print(qqq_df['date'][i])
                flag = 0
            else:
                sig_price_buy.append(np.nan)
                sig_price_sell.append(np.nan)
        else:
            sig_price_buy.append(np.nan)
            sig_price_sell.append(np.nan)
    return (sig_price_buy, sig_price_sell)

b, s = buy_sell(qqq_df=qqq_df)
print(b, s)
The following code worked for me:
def buy_sell(data):
    sig_price_buy = []
    sig_price_sell = []
    flag = -1
    for i in range(len(data)):
        if data['SMA30'][i] >= data['SMA100'][i] and flag != 1:
            sig_price_buy.append(data['TSLA'][i])
            sig_price_sell.append(np.nan)
            flag = 1
        elif data['SMA30'][i] <= data['SMA100'][i] and flag != 0:
            sig_price_buy.append(np.nan)
            sig_price_sell.append(data['TSLA'][i])
            flag = 0
        else:
            sig_price_buy.append(np.nan)
            sig_price_sell.append(np.nan)
    return (sig_price_buy, sig_price_sell)
The format of the input data must be:
TSLA = pd.read_csv("TSLA.csv")
close = TSLA.Close
data = pd.DataFrame()
data['SMA30'] = close.rolling(window = 30).mean()
data['SMA100'] = close.rolling(window = 100).mean()
data['TSLA'] = close
I tested it on TSLA stock, so never mind the different naming.
If you want to print it in a certain format, which I read in one of your comments, I recommend first running the function and then printing the resulting arrays formatted the way you like in a second step.
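For example, here is a minimal sketch of that two-step approach (assuming the data DataFrame is built as shown above; the signal column names are just illustrative):

import pandas as pd

buy, sell = buy_sell(data)
data['buy_signal'] = buy
data['sell_signal'] = sell

# print only the rows where a signal fired, in whatever format you prefer
signals = data.dropna(how='all', subset=['buy_signal', 'sell_signal'])
for idx, row in signals.iterrows():
    if not pd.isna(row['buy_signal']):
        print(f"row {idx}: BUY  at {row['buy_signal']:.2f}")
    else:
        print(f"row {idx}: SELL at {row['sell_signal']:.2f}")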
EDIT:
If you wish to access a yahoo finance stock directly from python, you can use the following framework:
import datetime
from os import mkdir, path
from urllib import request

def convertDateToYahooTime(date: datetime.datetime) -> int:  # converts a date to a POSIX timestamp for the yahoo finance website
    return int(date.replace(hour=2, minute=0, second=0, microsecond=0).timestamp())

def downloadStock(stock: str, olderDate: datetime.datetime, recentDate: datetime.datetime) -> str:  # downloads a stock from yahoo finance and returns the filename
    p1: int = convertDateToYahooTime(olderDate)
    p2: int = convertDateToYahooTime(recentDate)
    link = f"https://query1.finance.yahoo.com/v7/finance/download/{stock}?period1={p1}&period2={p2}&interval=1d&events=history&includeAdjustedClose=true"
    file = f"{stock}_{olderDate.date().strftime('%Y-%m-%d')}_{recentDate.date().strftime('%Y-%m-%d')}"
    if not path.exists('stocks'):
        mkdir('stocks')
    response = request.urlretrieve(link, f'stocks//{file}')  # save the stock to a folder named 'stocks'
    return file

def fetchStock(stock: str, period: int) -> str:  # fetches a stock and returns the filename
    today = datetime.datetime.today().replace(hour=0, minute=0, second=0, microsecond=0)
    past = datetime.datetime.fromtimestamp(today.timestamp() - period*86400)
    return downloadStock(stock, past, today)
downloadStock accesses the download link for any Yahoo Finance stock you want. fetchStock just makes your life easier: you enter the name of your stock and the period, i.e. how many days of data you wish to access, counting backward from the present day. So if you do this:
STOCK = 'GOOG'
PERIOD = 1000
file = fetchStock(STOCK, PERIOD)
df = pd.read_csv('stocks//' + file)
df will contain the Google stock data for the past 1000 days. You can do this to load any stock you want. After this, you can repeat the steps I mentioned above to format the stock and run the SMA30/100 algorithm on it, for example as sketched below.
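Putting the pieces together, a rough end-to-end sketch could look like this (assuming the buy_sell function and the download helpers above, and that the downloaded CSV still has a Close column):

import pandas as pd

file = fetchStock('GOOG', 1000)
raw = pd.read_csv('stocks//' + file)

# format the data as described earlier; note that buy_sell above reads the
# price from a column literally named 'TSLA', so we reuse that column name
close = raw.Close
data = pd.DataFrame()
data['SMA30'] = close.rolling(window=30).mean()
data['SMA100'] = close.rolling(window=100).mean()
data['TSLA'] = close

buy, sell = buy_sell(data)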
I'm not sure whether this is what you wanted, but I hope it helps and you have fun with it.
I'm trying to understand why this pipeline writes no output to BigQuery.
What I'm trying to achieve is to calculate the USD index for the last 10 years, starting from different currency pairs observations.
All the data is in BigQuery and I need to organize it and sort it chronologically (if there is a better way to achieve this, I'd be glad to read about it, because I think this might not be the optimal way to do it).
The idea behind the class Currencies() is to start grouping (and keep) the last observation of a currency pair (e.g. EURUSD), update all currency pair values as they "arrive", sort them chronologically and finally get the open, high, low and close value of the USD index for that day.
This code works in my Jupyter notebook and in Cloud Shell using DirectRunner, but when I use DataflowRunner it does not write any output. To see if I could figure it out, I tried to just create the data using beam.Create() and then write it to BigQuery (which worked), and also just read something from BQ and write it to another table (which also worked), so my best guess is that the problem is in the beam.CombineGlobally part, but I don't know what it is.
The code is as follows:
import logging
import collections
import apache_beam as beam
from datetime import datetime

SYMBOLS = ['usdjpy', 'usdcad', 'usdchf', 'eurusd', 'audusd', 'nzdusd', 'gbpusd']
TABLE_SCHEMA = "date:DATETIME,index:STRING,open:FLOAT,high:FLOAT,low:FLOAT,close:FLOAT"

class Currencies(beam.CombineFn):
    def create_accumulator(self):
        return {}

    def add_input(self, accumulator, inputs):
        logging.info(inputs)
        date, currency, bid = inputs.values()
        if '.' not in date:
            date = date + '.0'
        date = datetime.strptime(date, '%Y-%m-%dT%H:%M:%S.%f')
        data = currency + ':' + str(bid)
        accumulator[date] = [data]
        return accumulator

    def merge_accumulators(self, accumulators):
        merged = {}
        for accum in accumulators:
            ordered_data = collections.OrderedDict(sorted(accum.items()))
            prev_date = None
            for date, date_data in ordered_data.items():
                if date not in merged:
                    merged[date] = {}
                if prev_date is None:
                    prev_date = date
                else:
                    prev_data = merged[prev_date]
                    merged[date].update(prev_data)
                    prev_date = date
                for data in date_data:
                    currency, bid = data.split(':')
                    bid = float(bid)
                    currency = currency.lower()
                    merged[date].update({
                        currency: bid
                    })
        return merged

    def calculate_index_value(self, data):
        return data['usdjpy'] * data['usdcad'] * data['usdchf'] / (data['eurusd'] * data['audusd'] * data['nzdusd'] * data['gbpusd'])

    def extract_output(self, accumulator):
        ordered = collections.OrderedDict(sorted(accumulator.items()))
        index = {}
        for dt, currencies in ordered.items():
            if not all([symbol in currencies.keys() for symbol in SYMBOLS]):
                continue
            date = str(dt.date())
            index_value = self.calculate_index_value(currencies)
            if date not in index:
                index[date] = {
                    'date': date,
                    'index': 'usd',
                    'open': index_value,
                    'high': index_value,
                    'low': index_value,
                    'close': index_value
                }
            else:
                max_value = max(index_value, index[date]['high'])
                min_value = min(index_value, index[date]['low'])
                close_value = index_value
                index[date].update({
                    'high': max_value,
                    'low': min_value,
                    'close': close_value
                })
        return index

def main():
    query = """
    select date,currency,bid from data_table
    where date(date) between '2022-01-13' and '2022-01-16'
    and currency like ('%USD%')
    """
    options = beam.options.pipeline_options.PipelineOptions(
        temp_location='gs://PROJECT/temp',
        project='PROJECT',
        runner='DataflowRunner',
        region='REGION',
        num_workers=1,
        max_num_workers=1,
        machine_type='n1-standard-1',
        save_main_session=True,
        staging_location='gs://PROJECT/stag'
    )
    with beam.Pipeline(options=options) as pipeline:
        inputs = (pipeline
                  | 'Read From BQ' >> beam.io.ReadFromBigQuery(query=query, use_standard_sql=True)
                  | 'Accumulate' >> beam.CombineGlobally(Currencies())
                  | 'Flat' >> beam.ParDo(lambda x: x.values())
                  | beam.io.Write(beam.io.WriteToBigQuery(
                      table='TABLE',
                      dataset='DATASET',
                      project='PROJECT',
                      schema=TABLE_SCHEMA))
                  )

if __name__ == '__main__':
    logging.getLogger().setLevel(logging.INFO)
    main()
The way I execute this is from a shell, using python3 -m first_script (is this the way I should run these batch jobs?).
What am I missing or doing wrong? This is my first attempt to use Dataflow, so I'm probably making every mistake in the book.
For whomever it may help: I faced a similar problem. I had already used the same code for a different flow with a Pub/Sub input, where it worked flawlessly, whereas with a file-based (bounded) input it simply did not write anything. After a lot of experimenting I found that in the options I had to change the flag
options = PipelineOptions(streaming=True, ..
to
options = PipelineOptions(streaming=False,
since it is of course not a streaming source but a bounded source, i.e. a batch. After I set this flag to False, I found my rows in the BigQuery table. Once it had finished, it even stopped the pipeline, as you would expect for a batch operation. Hope this helps.
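Applied to the pipeline in the question, the relevant part of the options might look like this (just a sketch; the placeholder values are from the question and everything except the streaming flag is unchanged):

from apache_beam.options.pipeline_options import PipelineOptions

# make sure the pipeline is NOT marked as streaming when the source is bounded,
# e.g. a BigQuery query read in batch mode
options = PipelineOptions(
    streaming=False,              # False (the default) for batch / bounded sources
    runner='DataflowRunner',
    project='PROJECT',
    region='REGION',
    temp_location='gs://PROJECT/temp',
    staging_location='gs://PROJECT/stag',
    save_main_session=True,
)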
I am pretty new to coding and trying to create a voice assistant. Everything is working fine, but I don't know why I am getting this error.
Python version: 3.9 64bit
windows version: 10 pro 64bit
I live in India and the timezone we follow is Asia/Kolkata (just mentioning it in case that is causing the issue).
So I have made two functions:
The first function get_date will take a speech-to-text converted string and then try to find the date we are talking about.
def get_date(text):
    '''
    This function will return a date value in the form of MM/DD/YYYY.
    It will convert any phrase passed to it into a date value if it contains the required data.
    :param text:
    :return:
    '''
    text = text.lower()
    today = datetime.date.today()
    # if the query contains 'today' simply return today's date
    if text.count('today') > 0:
        return today
    # if the query contains 'tomorrow' simply return tomorrow's date
    if text.count('tomorrow') > 0:
        return today + datetime.timedelta(1)
    month = -1
    day_of_week = -1
    day = -1
    year = today.year
    # looping over the given phrase
    for word in text.split():
        # if the phrase contains the name of a month, find its value from the list
        if word in MONTHS:
            month = MONTHS.index(word) + 1
        # if the phrase contains a day of the week, get its index from the list
        if word in DAYS:
            day_of_week = DAYS.index(word)
        # if the phrase contains the digit itself then convert it from str to int
        if word.isdigit():
            day = int(word)
        # again run a loop over DAY_EXTENSIONS and check if we find one in the word
        for ext in DAY_EXTENSIONS:
            x = word.find(ext)
            if x > 0:
                try:
                    day = int(word[:x])
                except:
                    pass
    # if the given month has already passed, add one to the year
    if month < today.month and month != -1:
        year = year + 1
    # if only the day is given, work out whether the month is this one or the next
    if month == -1 and day != -1:
        if day < today.day:
            month = today.month + 1
        else:
            month = today.month
    # if only the day of the week is provided then find the date
    if day_of_week != -1 and month == -1:
        current_day_of_week = today.weekday()
        diff = day_of_week - current_day_of_week
        if diff < 0:
            diff += 7
        if text.count('next') >= 1:
            diff += 7
        return today + datetime.timedelta(diff)
    if day != -1:
        return datetime.date(month=month, day=day, year=year)
Then I have a second function which uses the Google Calendar API to get the events in the calendar.
I don't know much about this but I am following a guy on YouTube.
Here is the function:
def get_events(day, service):
    # the below four lines convert the date we provide into UTC time format
    # If you know what they really mean, tell me, please :)
    date = datetime.datetime.combine(day, datetime.datetime.min.time())
    end = datetime.datetime.combine(day, datetime.datetime.max.time())
    utc = pytz.UTC
    date = date.astimezone(utc)
    end = end.astimezone(utc)
    events_result = service.events().list(calendarId='primary', timeMin=date.isoformat(),
                                          timeMax=end.isoformat(), singleEvents=True,
                                          orderBy='startTime').execute()
    events = events_result.get('items', [])
    if not events:
        print('No upcoming events found.')
    for event in events:
        start = event['start'].get('dateTime', event['start'].get('date'))
        print(start, event['summary'])
Here is how I use these functions:
query = take_command().lower()
for phrases in GET_DATE_STRINGS:
    if phrases in query.lower():
        day = datetime.date(get_date(query))
        get_events(day, service)
and then here is the error:
<googleapiclient.discovery.Resource object at 0x00000189C9F60AC0>
Traceback (most recent call last):
File "D:\Coding\jarvis\main.py", line 248, in <module>
get_events(3, service)
File "D:\Coding\jarvis\main.py", line 157, in get_events
date = datetime.datetime.combine(day, datetime.datetime.min.time())
TypeError: combine() argument 1 must be datetime.date, not int
Now, as I understand it, the error says the day argument I am giving to the get_events() function is an int, but as far as I can see get_date() returns a datetime.date object, not an int.
Kindly help if you know the fix.
If get_date already returns a date object, why are you creating another date object here:
day = datetime.date(get_date(query)) ?
You can just do:
day = get_date(query)
get_events(day, service)
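For context, a minimal sketch of the calling code with that change applied (take_command, GET_DATE_STRINGS and service are the names from the question; the None check is an extra safeguard, since get_date can fall through without returning anything):

query = take_command().lower()
for phrases in GET_DATE_STRINGS:
    if phrases in query:
        day = get_date(query)        # already a datetime.date, no extra wrapping needed
        if day is not None:          # get_date may not find a date in the phrase
            get_events(day, service)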
I have a class assignment to write a Python program to download end-of-day data for the last 25 years for the major global stock market indices from Yahoo Finance:
Dow Jones Index (USA)
S&P 500 (USA)
NASDAQ (USA)
DAX (Germany)
FTSE (UK)
HANGSENG (Hong Kong)
KOSPI (Korea)
CNX NIFTY (India)
Unfortunately, when I run the program an error occurs:
File "C:\ProgramData\Anaconda2\lib\site-packages\yahoofinancials\__init__.py", line 91, in format_date
form_date = datetime.datetime.fromtimestamp(int(in_date)).strftime('%Y-%m-%d')
ValueError: timestamp out of range for platform localtime()/gmtime() function
Below you can see the code that I have written. I'm trying to debug my mistakes. Can you help me out, please? Thanks.
from yahoofinancials import YahooFinancials
import pandas as pd
# Select Tickers and stock history dates
index1 = '^DJI'
index2 = '^GSPC'
index3 = '^IXIC'
index4 = '^GDAXI'
index5 = '^FTSE'
index6 = '^HSI'
index7 = '^KS11'
index8 = '^NSEI'
freq = 'daily'
start_date = '1993-06-30'
end_date = '2018-06-30'
# Function to clean data extracts
def clean_stock_data(stock_data_list):
    new_list = []
    for rec in stock_data_list:
        if 'type' not in rec.keys():
            new_list.append(rec)
    return new_list
# Construct yahoo financials objects for data extraction
dji_financials = YahooFinancials(index1)
gspc_financials = YahooFinancials(index2)
ixic_financials = YahooFinancials(index3)
gdaxi_financials = YahooFinancials(index4)
ftse_financials = YahooFinancials(index5)
hsi_financials = YahooFinancials(index6)
ks11_financials = YahooFinancials(index7)
nsei_financials = YahooFinancials(index8)
# Clean returned stock history data and remove dividend events from price history
daily_dji_data = clean_stock_data(dji_financials.get_historical_stock_data(start_date, end_date, freq)[index1]['prices'])
daily_gspc_data = clean_stock_data(gspc_financials.get_historical_stock_data(start_date, end_date, freq)[index2]['prices'])
daily_ixic_data = clean_stock_data(ixic_financials.get_historical_stock_data(start_date, end_date, freq)[index3]['prices'])
daily_gdaxi_data = clean_stock_data(gdaxi_financials.get_historical_stock_data(start_date, end_date, freq)[index4]['prices'])
daily_ftse_data = clean_stock_data(ftse_financials.get_historical_stock_data(start_date, end_date, freq)[index5]['prices'])
daily_hsi_data = clean_stock_data(hsi_financials.get_historical_stock_data(start_date, end_date, freq)[index6]['prices'])
daily_ks11_data = clean_stock_data(ks11_financials.get_historical_stock_data(start_date, end_date, freq)[index7]['prices'])
daily_nsei_data = clean_stock_data(nsei_financials.get_historical_stock_data(start_date, end_date, freq)[index8]['prices'])
stock_hist_data_list = [{'^DJI': daily_dji_data}, {'^GSPC': daily_gspc_data}, {'^IXIC': daily_ixic_data},
{'^GDAXI': daily_gdaxi_data}, {'^FTSE': daily_ftse_data}, {'^HSI': daily_hsi_data},
{'^KS11': daily_ks11_data}, {'^NSEI': daily_nsei_data}]
# Function to construct data frame based on a stock and its market index
def build_data_frame(data_list1, data_list2, data_list3, data_list4, data_list5, data_list6, data_list7, data_list8):
    data_dict = {}
    i = 0
    for list_item in data_list2:
        if 'type' not in list_item.keys():
            data_dict.update({list_item['formatted_date']: {'^DJI': data_list1[i]['close'], '^GSPC': list_item['close'],
                                                            '^IXIC': data_list3[i]['close'], '^GDAXI': data_list4[i]['close'],
                                                            '^FTSE': data_list5[i]['close'], '^HSI': data_list6[i]['close'],
                                                            '^KS11': data_list7[i]['close'], '^NSEI': data_list8[i]['close']}})
            i += 1
    tseries = pd.to_datetime(list(data_dict.keys()))
    df = pd.DataFrame(data=list(data_dict.values()), index=tseries,
                      columns=['^DJI', '^GSPC', '^IXIC', '^GDAXI', '^FTSE', '^HSI', '^KS11', '^NSEI']).sort_index()
    return df
Your problem is that your datetime stamps are in the wrong format. If you look at the error message, it clearly tells you:
datetime.datetime.fromtimestamp(int(in_date)).strftime('%Y-%m-%d')
Notice the int(in_date) part?
It wants a Unix timestamp. There are several ways to get this: from the time module, from the calendar module, or using Arrow.
import datetime
import calendar
date = datetime.datetime.strptime("1993-06-30", "%Y-%m-%d")
start_date = calendar.timegm(date.utctimetuple())
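For completeness, here is a sketch of the time-module alternative mentioned above (this interprets the date in local time rather than UTC, which is usually close enough for daily data):

import time
import datetime

date = datetime.datetime.strptime("1993-06-30", "%Y-%m-%d")
start_date = int(time.mktime(date.timetuple()))  # Unix timestamp, local time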
* UPDATED *
OK so I fixed up to the dataframes portion. Here is my current code:
# Select Tickers and stock history dates
index = {'DJI': YahooFinancials('^DJI'),
         'GSPC': YahooFinancials('^GSPC'),
         'IXIC': YahooFinancials('^IXIC'),
         'GDAXI': YahooFinancials('^GDAXI'),
         'FTSE': YahooFinancials('^FTSE'),
         'HSI': YahooFinancials('^HSI'),
         'KS11': YahooFinancials('^KS11'),
         'NSEI': YahooFinancials('^NSEI')}

freq = 'daily'
start_date = '1993-06-30'
end_date = '2018-06-30'

# Clean returned stock history data and remove dividend events from price history
daily = {}
for k in index:
    tmp = index[k].get_historical_stock_data(start_date, end_date, freq)
    if tmp:
        daily[k] = tmp['^{}'.format(k)]['prices'] if 'prices' in tmp['^{}'.format(k)] else []
Unfortunately I had to fix a couple of things in the yahoofinancials module. For the class YahooFinanceETL:
@staticmethod
def format_date(in_date, convert_type):
    try:
        x = int(in_date)
        convert_type = 'standard'
    except:
        convert_type = 'unixstamp'
    if convert_type == 'standard':
        if in_date < 0:
            form_date = datetime.datetime(1970, 1, 1) + datetime.timedelta(seconds=in_date)
        else:
            form_date = datetime.datetime.fromtimestamp(int(in_date)).strftime('%Y-%m-%d')
    else:
        split_date = in_date.split('-')
        d = date(int(split_date[0]), int(split_date[1]), int(split_date[2]))
        form_date = int(time.mktime(d.timetuple()))
    return form_date
AND:
# private static method to scrape data from yahoo finance
@staticmethod
def _scrape_data(url, tech_type, statement_type):
    response = requests.get(url)
    soup = BeautifulSoup(response.content, "html.parser")
    script = soup.find("script", text=re.compile("root.App.main")).text
    data = loads(re.search(r"root.App.main\s+=\s+(\{.*\})", script).group(1))
    if tech_type == '' and statement_type != 'history':
        stores = data["context"]["dispatcher"]["stores"]["QuoteSummaryStore"]
    elif tech_type != '' and statement_type != 'history':
        stores = data["context"]["dispatcher"]["stores"]["QuoteSummaryStore"][tech_type]
    else:
        if "HistoricalPriceStore" in data["context"]["dispatcher"]["stores"]:
            stores = data["context"]["dispatcher"]["stores"]["HistoricalPriceStore"]
        else:
            stores = data["context"]["dispatcher"]["stores"]["QuoteSummaryStore"]
    return stores
You will want to look at the daily dict and rewrite your build_data_frame function, which should be a lot simpler now since you are already working with a dictionary.
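A rough sketch of what that simplified version might look like, assuming each record in daily[k] has the 'formatted_date' and 'close' keys used in the original code:

import pandas as pd

def build_data_frame(daily):
    columns = {}
    for k, records in daily.items():
        # map each trading day to its closing price for this index
        columns['^{}'.format(k)] = {rec['formatted_date']: rec['close']
                                    for rec in records if 'close' in rec}
    df = pd.DataFrame(columns)          # columns aligned on the union of dates
    df.index = pd.to_datetime(df.index)
    return df.sort_index()

df = build_data_frame(daily)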
I am actually the maintainer and author of YahooFinancials. I just saw this post and wanted to personally apologize for the inconvenience and let you all know I will be working on fixing the module this evening.
Could you please open an issue on the module's Github page detailing this?
It would also be very helpful to know which version of python you were running when you encountered these issues.
https://github.com/JECSand/yahoofinancials/issues
I am at work right now, however as soon as I get home in ~7 hours or so I will attempt to code a fix and release it. I'll also work on the exception handling. I try my best to maintain this module, but my day (and often night time) job is rather demanding. I will report back with the final results of these fixes and publish to pypi when it is done and stable.
Also, if anyone else has any feedback or personal fixes they can offer, it would be a huge help in fixing this. Proper credit will be given, of course. I am also in desperate need of contributors, so if anyone is interested in that as well, let me know. I really want to take YahooFinancials to the next level and make this project a stable and reliable source of free financial data for Python projects.
Thank you for your patience and for using YahooFinancials.
I created a web scraping program that opens several URLs, checks which one of the URLs has information related to tomorrow's date, and then prints some specific information from that URL. My problem is that sometimes none of the URLs in the list has information concerning "tomorrow". In that case, I would like the program to print something else, like "no data found". How could I accomplish that? Another doubt I have: do I need the while loop at the beginning? Thanks.
My code is:
from datetime import datetime, timedelta

tomorrow = datetime.now() + timedelta(days=1)
tomorrow = tomorrow.strftime('%d-%m-%Y')

day = ""
while day != tomorrow:
    for url in list_urls:
        browser.get(url)
        time.sleep(1)
        dia_page = browser.find_element_by_xpath("//*[@id='item2']/b").text
        dia_page = dia_page[-10:]
        day_uns = datetime.strptime(dia_page, "%d-%m-%Y")
        day = day_uns.strftime('%d-%m-%Y')
        if day == tomorrow:
            meals = browser.find_elements_by_xpath("//*[@id='item2']/span")
            meal_reg = browser.find_element_by_xpath("//*[@id='item_frm']/span[1]").text
            sopa2 = meals[0].text
            refeicao2 = meals[1].text
            sobremesa2 = meals[2].text
            print(meal_reg)
            print(sopa2)
            print(refeicao2)
            print(sobremesa2)
            break
No need for a while loop, you can use the for-else Python construct for this:
for url in list_urls:
    # do stuff
    if day == tomorrow:
        # do and print stuff
        break
else:  # break never encountered
    print("no data found")
I've been working with the Quandl API recently and I've been stuck on an issue for a while.
My question is how to compute the difference between one date and the date before it for a stock index. The data seems to come out as an array, for example [[u'2015-04-30', 17840.52]] for the Dow Jones Industrial Average. I'd also like a way to get the change relative to one day before the latest one, say getting Friday's value and the change between that and the day before.
My code:
def fetchData(apikey, url):
    '''Returns JSON data of the Dow Jones Average.'''
    parameters = {'rows': 1, 'auth_token': apikey}
    req = requests.get(url, params=parameters)
    data = json.loads(req.content)
    parsedData = []
    stockData = {}
    for datum in data:
        if data['code'] == 'COMP':
            stockData['name'] = data['name']
            stockData['description'] = '''The NASDAQ Composite Index measures all
            NASDAQ domestic and international based common type stocks listed on The NASDAQ Stock Market.'''
            stockData['data'] = data['data']
            stockData['code'] = data['code']
        else:
            stockData['name'] = data['name']
            stockData['description'] = data['description']
            stockData['data'] = data['data']
            stockData['code'] = data['code']
        parsedData.append(stockData)
    return parsedData
I've attempted to just tack [1] onto data to get just the current day, but the issue of getting the day before has kinda stumped me.
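To make it concrete, here is a rough sketch of the kind of helper I'm after (assuming the 'rows' parameter limits the response to the N most recent records, that the records come back newest first as the rows=1 example above suggests, and that data['data'] holds the [date, close] pairs shown above):

import json
import requests

def fetchChange(apikey, url):
    '''Returns (latest_date, latest_close, change_from_previous_day).'''
    parameters = {'rows': 2, 'auth_token': apikey}   # ask for the two most recent records
    req = requests.get(url, params=parameters)
    data = json.loads(req.content)
    rows = data['data']                              # e.g. [[u'2015-05-01', ...], [u'2015-04-30', 17840.52]]
    latest_date, latest_close = rows[0][0], rows[0][1]
    prev_close = rows[1][1]
    return latest_date, latest_close, latest_close - prev_close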