Querying a mongodb raises stop iteration error - python

Using the following code, I am trying to extract two dates from an object in MongoDB and then calculate the time difference between them, provided both years are 2016 or later. My current code produces the following warning:
DeprecationWarning: generator 'QuerySet._iter_results' raised StopIteration
from ipykernel import kernelapp as app
My code:
raw_data = Document.objects()
data = []
for i in raw_data[:10]:
    scored = i.date_scored
    scored_date = pd.to_datetime(scored, format='%Y-%m-%d %H:%M')
    if scored_date == "NoneType":
        print("none")
    elif scored_date.year >= 2016:
        extracted = i.date_extracted
        extracted_date = pd.to_datetime(extracted, format='%Y-%m-%d %H:%M')
        bank = i.bank.name
        diff = scored_date - extracted_date
        print(diff)
        datum = [str(bank), str(extracted), str(scored), str(diff)]
        data.append(datum)
    else:
        pass
Any help would be appreciated, thank you!
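One thing that stands out in the code above is the comparison `scored_date == "NoneType"`: it compares against a string, so it is never true for a missing date. A minimal sketch of the intended check follows, using plain `datetime` values and a made-up list of (scored, extracted) pairs in place of the MongoDB documents; the function name is hypothetical. It is not a fix for the DeprecationWarning itself.

```python
from datetime import datetime

# Hypothetical stand-ins for (date_scored, date_extracted) on each document.
records = [
    (datetime(2016, 3, 1, 12, 0), datetime(2016, 2, 28, 9, 30)),
    (None, datetime(2016, 1, 1, 0, 0)),                         # missing scored date
    (datetime(2015, 6, 1, 8, 0), datetime(2015, 5, 1, 8, 0)),   # before 2016
]

def time_differences(rows, min_year=2016):
    """Return scored - extracted for rows where both dates exist and are recent."""
    diffs = []
    for scored, extracted in rows:
        # Compare against None itself, not the string "NoneType".
        if scored is None or extracted is None:
            continue
        if scored.year >= min_year and extracted.year >= min_year:
            diffs.append(scored - extracted)
    return diffs

diffs = time_differences(records)
print(diffs)
```

Only the first pair survives both checks, so the list holds a single timedelta.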

Related

def Function Is Not Displaying Expected Result

I'm trying to get the stock prices at which to buy or sell on specific dates. When the buy price is given, the sell price should be NaN. Likewise, if the sell price is given, the buy price has to be NaN. The function and code were originally proposed by Joseph Hart (https://medium.com/analytics-vidhya/sma-short-moving-average-in-python-c656956a08f8).
The function returns (sig_price_buy, sig_price_sell). My data source is a pandas DataFrame named qqq_df; sma_30 and sma_100 are columns of qqq_df.
The output does not match the expected result described above. The code is below. I need specific steps and code to resolve the issue. I look forward to hearing from forum members. Thanks.
def buy_sell(qqq_df):
    sig_price_buy = []
    sig_price_sell = []
    flag = -1
    for i in range(len(qqq_df)):
        if qqq_df['sma_30'][i] > qqq_df['sma_100'][i]:
            if flag != 1:
                sig_price_buy.append(qqq_df['close'] [i])
                sig_price_sell.append(np.nan)
                print(qqq_df['date'][i])
            else:
                sig_price_buy.append(np.nan)
                sig_price_buy.append(np.nan)
        elif qqq_df['sma_30'][i] < qqq_df['sma_100'][i]:
            if flag != 0:
                sig_price_buy.append(np.nan)
                sig_price_sell.append(qqq_df ['close'] [i])
                print(qqq_df['date'][i])
                flag = 0
            else:
                sig_price_buy.append(np.nan)
                sig_price_sell.append(np.nan)
        else:
            sig_price_buy.append(np.nan)
            sig_price_sell.append(np.nan)
    return(sig_price_buy, sig_price_sell)
b, s = buy_sell(qqq_df = qqq_df)
print(b, s)
The following code worked for me:
def buy_sell(data):
    sig_price_buy = []
    sig_price_sell = []
    flag = -1
    for i in range(len(data)):
        if data['SMA30'][i] >= data['SMA100'][i] and flag != 1:
            sig_price_buy.append(data['TSLA'][i])
            sig_price_sell.append(np.nan)
            flag = 1
        elif data['SMA30'][i] <= data['SMA100'][i] and flag != 0:
            sig_price_buy.append(np.nan)
            sig_price_sell.append(data['TSLA'][i])
            flag = 0
        else:
            sig_price_buy.append(np.nan)
            sig_price_sell.append(np.nan)
    return (sig_price_buy, sig_price_sell)
The format of the input data must be:
TSLA = pd.read_csv("TSLA.csv")
close = TSLA.Close
data = pd.DataFrame()
data['SMA30'] = close.rolling(window = 30).mean()
data['SMA100'] = close.rolling(window = 100).mean()
data['TSLA'] = close
I tested it on the TSLA stock, so never mind the different naming.
If you want to print it in a certain format, which I read in one of your comments, I recommend first running the function and then printing the resulting arrays formatted the way you like in a second step.
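For comparison with the question's version: as far as I can tell, it differs in two places. It never sets flag = 1 after a buy signal, and in the first inner else it appends NaN to sig_price_buy twice instead of once to each list, so the two lists drift out of step. The sketch below shows the flag logic on plain Python lists with made-up numbers, so it runs without pandas; the argument names are my own, and it uses the strict comparisons from the question rather than >= and <=.

```python
import math

def buy_sell(sma_short, sma_long, close):
    """Return (buy, sell) lists with a price at each crossover and NaN elsewhere."""
    buy, sell = [], []
    flag = -1  # -1: no signal yet, 1: last signal was a buy, 0: last signal was a sell
    for short, long_, price in zip(sma_short, sma_long, close):
        if short > long_ and flag != 1:
            buy.append(price)
            sell.append(math.nan)
            flag = 1
        elif short < long_ and flag != 0:
            buy.append(math.nan)
            sell.append(price)
            flag = 0
        else:
            # No new signal: exactly one NaN goes into each list.
            buy.append(math.nan)
            sell.append(math.nan)
    return buy, sell

# Made-up values, purely for illustration.
sma_30 = [1.0, 2.0, 3.0, 2.0, 1.0]
sma_100 = [2.0, 1.5, 2.0, 2.5, 3.0]
close = [10.0, 11.0, 12.0, 13.0, 14.0]

b, s = buy_sell(sma_30, sma_100, close)
print(b)
print(s)
```

Both lists keep the same length as the input, and each position holds a price in at most one of them.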
EDIT:
If you wish to access a yahoo finance stock directly from python, you can use the following framework:
import datetime
from os import mkdir, path
from urllib import request
def convertDateToYahooTime(date: datetime.datetime) -> int: #converts a date to a POSIX for the yahoo finance website
    return int(date.replace(hour=2, minute=0, second=0, microsecond=0).timestamp())
def downloadStock(stock: str, olderDate: datetime.datetime, recentDate: datetime.datetime) -> str: #downloads a stock from yahoo finance and returns the filename
    p1: int = convertDateToYahooTime(olderDate)
    p2: int = convertDateToYahooTime(recentDate)
    link = f"https://query1.finance.yahoo.com/v7/finance/download/{stock}?period1={p1}&period2={p2}&interval=1d&events=history&includeAdjustedClose=true"
    file = f"{stock}_{olderDate.date().strftime('%Y-%m-%d')}_{recentDate.date().strftime('%Y-%m-%d')}"
    if not path.exists('stocks'):
        mkdir('stocks')
    response = request.urlretrieve(link, f'stocks//{file}') #save the stock to a folder named 'stocks'
    return file
def fetchStock(stock: str, period: int) -> str: #fetches a stock and returns the filename
    today = datetime.datetime.today().replace(hour=0, minute=0, second=0, microsecond=0)
    past = datetime.datetime.fromtimestamp(today.timestamp() - period*86400)
    return downloadStock(stock, past, today)
downloadStock accesses the download link for any Yahoo Finance stock you want. fetchStock just makes your life easier: you enter the name of your stock and the number of days of history you wish to access, counting backward from the present day. So if you do this:
STOCK = 'GOOG'
PERIOD = 1000
file = fetchStock(STOCK, PERIOD)
df = pd.read_csv('stocks//' + file)
df will hold the Google stock data for the past 1000 days. You can do this to load any stock you want. After this, you can repeat the steps I mentioned above to format the stock and run the SMA30/100 algorithm on it.
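The date arithmetic behind period1 and period2 can be checked on its own, without touching the network. The following is only a sketch: the helper name and the fixed "today" are made up so the output is reproducible, timedelta replaces the period*86400 arithmetic, and UTC midnight is used instead of local time, so the values will differ from the functions above by the local offset.

```python
from datetime import datetime, timedelta, timezone

def period_bounds(today, period_days):
    """Return (period1, period2) as POSIX timestamps for the last period_days days."""
    recent = today.replace(hour=0, minute=0, second=0, microsecond=0)
    older = recent - timedelta(days=period_days)
    return int(older.timestamp()), int(recent.timestamp())

# Fixed example date (an assumption, so the result does not change from day to day).
fixed_today = datetime(2022, 2, 10, 15, 30, tzinfo=timezone.utc)
p1, p2 = period_bounds(fixed_today, 1000)
print(p1, p2)
```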
I'm not sure whether this is what you wanted, but I hope it helps and you have fun with it.

python gives invalid argument error on OS level

I have this piece of code:
symbol = 'PSTV'
end_point="https://api.polygon.io/v2/snapshot/locale/us/markets/stocks/tickers/"+symbol+"?apiKey=my_key"
a_json=requests.get(end_point).json()
if a_json['status'] == 'OK':
    candle_open = a_json['ticker']['min']['o']
    candle_close = a_json['ticker']['min']['c']
    candle_high = a_json['ticker']['min']['h']
    candle_low = a_json['ticker']['min']['l']
    candle_ts = a_json['ticker']['lastQuote']['t']
    print(candle_ts/1000000)
    candle_ts = datetime.fromtimestamp(candle_ts/1000000).strftime('%Y-%m-%d %H:%M:%S')
    print("OK")
I'm trying to convert the timestamp to a readable format like so:
candle_ts = a_json['ticker']['lastQuote']['t'] #get the timestamp
print(candle_ts/1000000)
candle_ts = datetime.fromtimestamp(candle_ts/1000000).strftime('%Y-%m-%d %H:%M:%S')
The print output is: 1644529277457.4104
I have no clue why, but the error is:
candle_ts = datetime.fromtimestamp(candle_ts/1000000).strftime('%Y-%m-%d %H:%M:%S')
OSError: [Errno 22] Invalid argument
Why do I get such an unusual error?
The value for candle_ts is out of range, as the sample script below shows. The upper limit is around year 5138, which corresponds to a timestamp of about 11 digits; your value for candle_ts has 13 digits.
from datetime import datetime
candle_ts = 1644529277457.4104
try:
    candle_ts = datetime.fromtimestamp(candle_ts).strftime('%Y-%m-%d %H:%M:%S')
    print(candle_ts)
except Exception as e:
    print(e)
Result:
year 54083 is out of range
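The printed value, 1644529277457.4104, has 13 digits before the decimal point, which is what a millisecond timestamp looks like. That suggests the raw `t` field is in nanoseconds, so dividing by 1e9 rather than 1e6 would yield the seconds that fromtimestamp expects. A sketch under that assumption follows; the raw integer is reconstructed from the printed value rather than taken from the API, and UTC is used so the result does not depend on the local time zone.

```python
from datetime import datetime, timezone

raw_ts_ns = 1644529277457410400  # reconstructed example value, assumed to be nanoseconds

# Nanoseconds -> seconds since the epoch.
seconds = raw_ts_ns / 1_000_000_000
readable = datetime.fromtimestamp(seconds, tz=timezone.utc).strftime('%Y-%m-%d %H:%M:%S')
print(readable)
```

With the value above this gives a date in February 2022 instead of an out-of-range year.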

"TypeError: combine() argument 1 must be datetime.date, not int" when using the google calendar API

I am pretty new to coding and am trying to create a voice assistant. Everything is working fine, but I don't know why I am getting this error.
Python version: 3.9 64bit
Windows version: 10 Pro 64bit
I live in India and the time zone we follow is Asia/Kolkata (mentioning it in case that is the cause).
So I have made two functions:
The first function get_date will take a speech-to-text converted string and then try to find the date we are talking about.
def get_date(text):
    '''
    This function will return a date value in the form MM/DD/YYYY.
    It will convert any phrase passed to it into a date value if it contains the required data.
    :param text:
    :return:
    '''
    text = text.lower()
    today = datetime.date.today()
    # if the query contains 'today' simply return today's date
    if text.count('today') > 0:
        return today
    # if the query contains 'tomorrow' simply return tomorrow's date
    if text.count('tomorrow') > 0:
        return today + datetime.timedelta(1)
    month = -1
    day_of_week = -1
    day = -1
    year = today.year
    # looping over the given phrase
    for word in text.split():
        # if the phrase contains the name of a month, find its value from the list
        if word in MONTHS:
            month = MONTHS.index(word) + 1
        # if the phrase contains a day of the week, find its index in the list
        if word in DAYS:
            day_of_week = DAYS.index(word)
        # if the phrase contains the digit itself then convert it from str to int
        if word.isdigit():
            day = int(word)
        # again run a loop to get values of DAY_EXTENSIONS and then check if we have them in the word
        for ext in DAY_EXTENSIONS:
            x = word.find(ext)
            if x > 0:
                try:
                    day = int(word[:x])
                except:
                    pass
    # if the given month has already passed, move to the next year
    if month < today.month and month != -1:
        year = year+1
    # if only the day is given, work out whether it is this month or the next
    if month == -1 and day != -1:
        if day < today.day:
            month = today.month + 1
        else:
            month = today.month
    # if only day of the week is provided then find the date
    if day_of_week != -1 and month == -1:
        current_day_of_week = today.weekday()
        diff = day_of_week - current_day_of_week
        if diff < 0:
            diff += 7
            if text.count('next') >= 1:
                diff += 7
        return today + datetime.timedelta(diff)
    if day != -1:
        return datetime.date(month=month, day=day, year=year)
Then I have a second function, which uses the Google Calendar API to get the events in the calendar.
I don't know much about this, but I am following a guy on YouTube.
Here is the function:
def get_events(day, service):
    # the below four lines convert the date we provide in terms of utctime format
    # If you know what they really mean tell me, please 😅
    date = datetime.datetime.combine(day, datetime.datetime.min.time())
    end = datetime.datetime.combine(day, datetime.datetime.max.time())
    utc = pytz.UTC
    date = date.astimezone(utc)
    end = end.astimezone(utc)
    events_result = service.events().list(calendarId='primary', timeMin=date.isoformat(),
                                          timeMax=end.isoformat(), singleEvents=True,
                                          orderBy='startTime').execute()
    events = events_result.get('items', [])
    if not events:
        print('No upcoming events found.')
    for event in events:
        start = event['start'].get('dateTime', event['start'].get('date'))
        print(start, event['summary'])
Here is how I use these functions:
query = take_command().lower()
for phrases in GET_DATE_STRINGS:
    if phrases in query.lower():
        day = datetime.date(get_date(query))
        get_events(day, service)
And here is the error:
<googleapiclient.discovery.Resource object at 0x00000189C9F60AC0>
Traceback (most recent call last):
  File "D:\Coding\jarvis\main.py", line 248, in <module>
    get_events(3, service)
  File "D:\Coding\jarvis\main.py", line 157, in get_events
    date = datetime.datetime.combine(day, datetime.datetime.min.time())
TypeError: combine() argument 1 must be datetime.date, not int
As I understand it, the error means that the day argument I am passing to get_events() is an int, but as far as I can see the return statement in get_date() returns a datetime.date object, not an int.
Kindly help if you know the fix.
If get_date returns a date object, why are you creating a date object again here:
day = datetime.date(get_date(query))
You can just do:
day = get_date(query)
get_events(day, service)
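As for the four lines in get_events that the question asks about: datetime.datetime.combine joins a date with a time of day, so the first two lines produce the first and last instant of the given day, and astimezone then expresses those instants in UTC for the API call. Here is a short sketch of the first part with a hard-coded example date, which also shows why wrapping an existing date in datetime.date(...) fails. It uses datetime.time.min and datetime.time.max, which are the same values as datetime.datetime.min.time() and datetime.datetime.max.time().

```python
import datetime

day = datetime.date(2022, 1, 5)  # example date standing in for get_date(query)

# First and last representable instants of that day (naive datetimes).
start = datetime.datetime.combine(day, datetime.time.min)
end = datetime.datetime.combine(day, datetime.time.max)
print(start.isoformat())
print(end.isoformat())

# datetime.date() expects year, month and day as integers, so passing a date fails.
try:
    datetime.date(day)
    wrapped_ok = True
except TypeError as exc:
    wrapped_ok = False
    print('TypeError:', exc)
```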

ValueError: time data 'None' does not match format '%Y-%m-%dT%H:%M:%S.%f'

For the node 'TransactionDate' I have some logic to apply before updating it for policy "POL000002NGJ".
The logic I am trying to implement is: if the existing 'TransactionDate' is earlier than today, add 5 days to the current value and write it back to the XML.
Transaction date format in the XML: 2020-03-23T10:56:15.00
Please note that if I hardcode the datetime value as below, it works fine. But I don't want to hardcode the value; I want to read it as a string so that it works for any datetime in the format "%Y-%m-%dT%H:%M:%S.%f".
# <TransactionDate>
today = datetime.now()
TransactionDate = doc.find('TransactionDate')
Date = '2020-03-24T10:56:15.00'
previous_update = datetime.strptime(Date, "%Y-%m-%dT%H:%M:%S.%f")
if previous_update < today:
    today = previous_update - timedelta(days=-5)
    TransactionDate = today.strftime("%Y-%m-%dT%H:%M:%S.%f")
With the code below, which reads the value from the parsed document instead, I have an issue. I got stuck here and looked at other answers on Stack Overflow and Python forums, but I am still stuck and unable to resolve it.
Any help fixing this would be appreciated. Thanks. The code below uses lxml, and help that keeps to lxml would be most useful, because I have already completed the other nodes that way. My understanding is that the Date variable ends up as None, but I am stuck on how to fix it. Please help.
# <TransactionDate>
today = datetime.now()
TransactionDate = doc.find('TransactionDate')
Date = str(TransactionDate)
previous_update = datetime.strptime(Date, "%Y-%m-%dT%H:%M:%S.%f")
if previous_update < today:
    today = previous_update - timedelta(days=-5)
    TransactionDate = today.strftime("%Y-%m-%dT%H:%M:%S.%f")
Full Code is Below
from lxml import etree
from datetime import datetime, timedelta
import random, string
doc = etree.parse(r'C:\Users\python.xml')
# <PolicyId> - Random generated policy number
Policy_Random_Choice = 'POL' + ''.join(random.choices(string.digits, k=6)) + 'NGJ'
# <TransactionDate>
today = datetime.now()
TransactionDate = doc.find('TransactionDate')
Date = str(TransactionDate)
previous_update = datetime.strptime(Date, "%Y-%m-%dT%H:%M:%S.%f")
if previous_update < today:
    today = previous_update - timedelta(days=-5)
    TransactionDate = today.strftime("%Y-%m-%dT%H:%M:%S.%f")
#Parsing the Variables
replacements = [Policy_Random_Choice , TransactionDate ]
targets = doc.xpath('//ROW[PolicyId="POL000002NGJ"]')
for target in targets:
    target.xpath('./PolicyId')[0].text = replacements[0]
    target.xpath('.//TransactionDate')[0].text = replacements[1]
print(etree.tostring(doc).decode())
Sample XML
<TABLE>
  <ROW>
    <PolicyId>POL000002NGJ</PolicyId>
    <BusinessCoverageCode>COV00002D3X1</BusinessCoverageCode>
    <TransactionDate>2020-03-23T10:56:15.00</TransactionDate>
  </ROW>
  <ROW>
    <PolicyId>POL111111NGJ</PolicyId>
    <BusinessCoverageCode>COV00002D3X4</BusinessCoverageCode>
    <TransactionDate>2020-03-23T10:56:15.00</TransactionDate>
  </ROW>
</TABLE>
Maybe the find method is wrong. Try this one
# <TransactionDate>
today = datetime.now()
TransactionDate = doc.xpath('//ROW/TransactionDate') # Change find to xpath
Date = str(TransactionDate[0].text) # Use the first one
previous_update = datetime.strptime(Date, "%Y-%m-%dT%H:%M:%S.%f")
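To see why Date ends up as the string 'None': find('TransactionDate') only looks at direct children of the root <TABLE> element, and TransactionDate sits one level deeper inside <ROW>, so find returns None and str(None) is 'None'. The sketch below reproduces this with the standard library's xml.etree.ElementTree, used here in place of lxml only so that the example runs without extra packages; for these simple paths it behaves the same way. The XML is trimmed from the sample above, and the sketch adds 5 days unconditionally for brevity.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta

FMT = "%Y-%m-%dT%H:%M:%S.%f"
xml_text = """<TABLE>
  <ROW>
    <PolicyId>POL000002NGJ</PolicyId>
    <TransactionDate>2020-03-23T10:56:15.00</TransactionDate>
  </ROW>
</TABLE>"""

root = ET.fromstring(xml_text)

# A bare tag name matches direct children of <TABLE> only, so this is None.
direct = root.find('TransactionDate')
print(direct)

# './/' searches all descendants and finds the element inside <ROW>.
node = root.find('.//TransactionDate')
previous_update = datetime.strptime(node.text, FMT)
node.text = (previous_update + timedelta(days=5)).strftime(FMT)
print(node.text)
```

Reading node.text rather than calling str() on the element is what yields the date string that strptime can parse.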

ValueError: timestamp out of range for platform localtime()/gmtime() function

I have a class assignment to write a Python program that downloads the last 25 years of end-of-day data for the major global stock market indices from Yahoo Finance:
Dow Jones Index (USA)
S&P 500 (USA)
NASDAQ (USA)
DAX (Germany)
FTSE (UK)
HANGSENG (Hong Kong)
KOSPI (Korea)
CNX NIFTY (India)
Unfortunately, when I run the program an error occurs.
  File "C:\ProgramData\Anaconda2\lib\site-packages\yahoofinancials\__init__.py", line 91, in format_date
    form_date = datetime.datetime.fromtimestamp(int(in_date)).strftime('%Y-%m-%d')
ValueError: timestamp out of range for platform localtime()/gmtime() function
The code I have written is below. I'm trying to debug my mistakes. Can you help me out, please? Thanks.
from yahoofinancials import YahooFinancials
import pandas as pd
# Select Tickers and stock history dates
index1 = '^DJI'
index2 = '^GSPC'
index3 = '^IXIC'
index4 = '^GDAXI'
index5 = '^FTSE'
index6 = '^HSI'
index7 = '^KS11'
index8 = '^NSEI'
freq = 'daily'
start_date = '1993-06-30'
end_date = '2018-06-30'
# Function to clean data extracts
def clean_stock_data(stock_data_list):
    new_list = []
    for rec in stock_data_list:
        if 'type' not in rec.keys():
            new_list.append(rec)
    return new_list
# Construct yahoo financials objects for data extraction
dji_financials = YahooFinancials(index1)
gspc_financials = YahooFinancials(index2)
ixic_financials = YahooFinancials(index3)
gdaxi_financials = YahooFinancials(index4)
ftse_financials = YahooFinancials(index5)
hsi_financials = YahooFinancials(index6)
ks11_financials = YahooFinancials(index7)
nsei_financials = YahooFinancials(index8)
# Clean returned stock history data and remove dividend events from price history
daily_dji_data = clean_stock_data(dji_financials
                                  .get_historical_stock_data(start_date, end_date, freq)[index1]['prices'])
daily_gspc_data = clean_stock_data(gspc_financials
                                   .get_historical_stock_data(start_date, end_date, freq)[index2]['prices'])
daily_ixic_data = clean_stock_data(ixic_financials
                                   .get_historical_stock_data(start_date, end_date, freq)[index3]['prices'])
daily_gdaxi_data = clean_stock_data(gdaxi_financials
                                    .get_historical_stock_data(start_date, end_date, freq)[index4]['prices'])
daily_ftse_data = clean_stock_data(ftse_financials
                                   .get_historical_stock_data(start_date, end_date, freq)[index5]['prices'])
daily_hsi_data = clean_stock_data(hsi_financials
                                  .get_historical_stock_data(start_date, end_date, freq)[index6]['prices'])
daily_ks11_data = clean_stock_data(ks11_financials
                                   .get_historical_stock_data(start_date, end_date, freq)[index7]['prices'])
daily_nsei_data = clean_stock_data(nsei_financials
                                   .get_historical_stock_data(start_date, end_date, freq)[index8]['prices'])
stock_hist_data_list = [{'^DJI': daily_dji_data}, {'^GSPC': daily_gspc_data}, {'^IXIC': daily_ixic_data},
                        {'^GDAXI': daily_gdaxi_data}, {'^FTSE': daily_ftse_data}, {'^HSI': daily_hsi_data},
                        {'^KS11': daily_ks11_data}, {'^NSEI': daily_nsei_data}]
# Function to construct data frame based on a stock and its market index
def build_data_frame(data_list1, data_list2, data_list3, data_list4, data_list5, data_list6, data_list7, data_list8):
    data_dict = {}
    i = 0
    for list_item in data_list2:
        if 'type' not in list_item.keys():
            data_dict.update({list_item['formatted_date']: {'^DJI': data_list1[i]['close'], '^GSPC': list_item['close'],
                                                            '^IXIC': data_list3[i]['close'], '^GDAXI': data_list4[i]['close'],
                                                            '^FTSE': data_list5[i]['close'], '^HSI': data_list6[i]['close'],
                                                            '^KS11': data_list7[i]['close'], '^NSEI': data_list8[i]['close']}})
            i += 1
    tseries = pd.to_datetime(list(data_dict.keys()))
    df = pd.DataFrame(data=list(data_dict.values()), index=tseries,
                      columns=['^DJI', '^GSPC', '^IXIC', '^GDAXI', '^FTSE', '^HSI', '^KS11', '^NSEI']).sort_index()
    return df
Your problem is that your date stamps are in the wrong format. If you look at the error, it tells you so in a roundabout way:
datetime.datetime.fromtimestamp(int(in_date)).strftime('%Y-%m-%d')
Notice the int(in_date) part?
It wants the Unix timestamp. There are several ways to get this: from the time module, from the calendar module, or using Arrow.
import datetime
import calendar
date = datetime.datetime.strptime("1993-06-30", "%Y-%m-%d")
start_date = calendar.timegm(date.utctimetuple())
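As a runnable illustration of the conversion above, the sketch below turns the question's start date into a Unix timestamp and back again. The helper names are hypothetical. The reverse direction builds the date from the epoch with timedelta instead of calling fromtimestamp, which avoids the platform range limits that fromtimestamp is subject to and also works for negative (pre-1970) timestamps.

```python
import calendar
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_unix(date_string):
    """Convert 'YYYY-MM-DD' (interpreted as UTC midnight) to a Unix timestamp."""
    parsed = datetime.strptime(date_string, "%Y-%m-%d")
    return calendar.timegm(parsed.utctimetuple())

def from_unix(timestamp):
    """Convert a Unix timestamp (possibly negative) back to 'YYYY-MM-DD' in UTC."""
    return (EPOCH + timedelta(seconds=timestamp)).strftime("%Y-%m-%d")

start_ts = to_unix("1993-06-30")
print(start_ts)
print(from_unix(start_ts))
print(from_unix(-86400))  # one day before the epoch
```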
* UPDATED *
OK so I fixed up to the dataframes portion. Here is my current code:
# Select Tickers and stock history dates
index = {'DJI' : YahooFinancials('^DJI'),
         'GSPC' : YahooFinancials('^GSPC'),
         'IXIC':YahooFinancials('^IXIC'),
         'GDAXI':YahooFinancials('^GDAXI'),
         'FTSE':YahooFinancials('^FTSE'),
         'HSI':YahooFinancials('^HSI'),
         'KS11':YahooFinancials('^KS11'),
         'NSEI':YahooFinancials('^NSEI')}
freq = 'daily'
start_date = '1993-06-30'
end_date = '2018-06-30'
# Clean returned stock history data and remove dividend events from price history
daily = {}
for k in index:
    tmp = index[k].get_historical_stock_data(start_date, end_date, freq)
    if tmp:
        daily[k] = tmp['^{}'.format(k)]['prices'] if 'prices' in tmp['^{}'.format(k)] else []
Unfortunately, I had to fix a couple of things in the yahoo module. For the class YahooFinanceETL:
@staticmethod
def format_date(in_date, convert_type):
    try:
        x = int(in_date)
        convert_type = 'standard'
    except:
        convert_type = 'unixstamp'
    if convert_type == 'standard':
        if in_date < 0:
            form_date = datetime.datetime(1970, 1, 1) + datetime.timedelta(seconds=in_date)
        else:
            form_date = datetime.datetime.fromtimestamp(int(in_date)).strftime('%Y-%m-%d')
    else:
        split_date = in_date.split('-')
        d = date(int(split_date[0]), int(split_date[1]), int(split_date[2]))
        form_date = int(time.mktime(d.timetuple()))
    return form_date
AND:
# private static method to scrape data from yahoo finance
@staticmethod
def _scrape_data(url, tech_type, statement_type):
    response = requests.get(url)
    soup = BeautifulSoup(response.content, "html.parser")
    script = soup.find("script", text=re.compile("root.App.main")).text
    data = loads(re.search("root.App.main\s+=\s+(\{.*\})", script).group(1))
    if tech_type == '' and statement_type != 'history':
        stores = data["context"]["dispatcher"]["stores"]["QuoteSummaryStore"]
    elif tech_type != '' and statement_type != 'history':
        stores = data["context"]["dispatcher"]["stores"]["QuoteSummaryStore"][tech_type]
    else:
        if "HistoricalPriceStore" in data["context"]["dispatcher"]["stores"] :
            stores = data["context"]["dispatcher"]["stores"]["HistoricalPriceStore"]
        else:
            stores = data["context"]["dispatcher"]["stores"]["QuoteSummaryStore"]
    return stores
You will want to look at the daily dict and rewrite your build_data_frame function, which should be a lot simpler now since you are already working with a dictionary.
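One possible shape for that rewrite, sketched with made-up records: it assumes each entry in daily[k] is a dict with 'formatted_date' and 'close' keys, as in the question's build_data_frame, and the helper name is hypothetical. It builds a plain date-keyed table first, so a date missing for one index simply has no value for that ticker instead of shifting the other columns.

```python
def closes_by_date(daily):
    """Pivot {ticker: [price records]} into {date: {ticker: close}}."""
    table = {}
    for ticker, records in daily.items():
        for rec in records:
            if 'type' in rec:  # skip event records, as clean_stock_data does
                continue
            table.setdefault(rec['formatted_date'], {})[ticker] = rec['close']
    return table

# Made-up sample data in the assumed record shape.
daily = {
    'DJI': [{'formatted_date': '2018-06-28', 'close': 100.0},
            {'formatted_date': '2018-06-29', 'close': 101.0}],
    'GSPC': [{'formatted_date': '2018-06-29', 'close': 50.0},
             {'formatted_date': '2018-06-29', 'type': 'DIVIDEND'}],
}

table = closes_by_date(daily)
print(table)
```

The result can then be handed to pandas, for example with pd.DataFrame.from_dict(table, orient='index').sort_index().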
I am actually the maintainer and author of YahooFinancials. I just saw this post and wanted to personally apologize for the inconvenience and let you all know I will be working on fixing the module this evening.
Could you please open an issue on the module's Github page detailing this?
It would also be very helpful to know which version of python you were running when you encountered these issues.
https://github.com/JECSand/yahoofinancials/issues
I am at work right now, however as soon as I get home in ~7 hours or so I will attempt to code a fix and release it. I'll also work on the exception handling. I try my best to maintain this module, but my day (and often night time) job is rather demanding. I will report back with the final results of these fixes and publish to pypi when it is done and stable.
Also, if anyone else has feedback or personal fixes to offer, it would be a huge help in fixing this. Proper credit will be given, of course. I am also in desperate need of contributors, so if anyone is interested in that as well, let me know. I really want to take YahooFinancials to the next level and have this project become a stable and reliable alternative for free financial data for Python projects.
Thank you for your patience and for using YahooFinancials.
