Python: Same code plotting different charts during different runs

I am running a mini project for remotely sensing the water level in an underground tank.
I collect data with an ESP8266 (WiFi) and upload it to a Supabase table, then read the table with Python and plot the data using Streamlit.
For some reason, on one run the chart shows the correct local IST timestamps, while on other runs it plots UTC timestamps instead, seemingly at random. I am unable to troubleshoot the problem.
Any help would be appreciated.
Below is my code (ignore the redundancies; I am still learning to code and will iron it out over time):
from supabase import create_client
import pandas as pd
import streamlit as st
import plotly.express as px
from datetime import datetime, timedelta
API_URL = [redacted]
API_KEY = [redacted]
supabase = create_client(API_URL, API_KEY)  # create the client from the URL/key above
supabaseList = supabase.table('Water Level').select('*').execute().data
# time range variables used for the x-axis range parameters
today = datetime.now() + timedelta(minutes=30)
present_date = today.strftime("%Y-%m-%d %X")
hrs48 = today - timedelta(days=2)
back_date = hrs48.strftime("%Y-%m-%d %X")
rows = []
for row in supabaseList:
    row["created_at"] = row["created_at"].split(".")[0]
    row["time"] = row["created_at"].split("T")[1]
    row["date"] = row["created_at"].split("T")[0]
    row["DateTime"] = row["created_at"]
    rows.append(row)
df = pd.DataFrame(rows)  # DataFrame.append is deprecated, so build from a list of dicts
orignal_title = '<h1 style="font-family:Helvetica; color:Black; font-size: 45px; text-align: center">[Redacted]</h1>'
st.markdown(orignal_title, unsafe_allow_html=True)
st.text("")
custom_range = [back_date, present_date]
st.write(custom_range)
fig = px.area(df, x="DateTime", y="water_level", title='',markers=False)
fig.update_layout(
    title={
        'text': "Water level in %",
        'y': 0.9,
        'x': 0.5,
        'xanchor': 'center',
        'yanchor': 'top'})
fig.update_layout(yaxis_range=[0,120])
fig.update_layout(xaxis_range=custom_range)
#Add Horizontal line for pump trigger level
fig.add_hline(y=80, line_width=3, line_color="black",
              annotation_text="Pump Start Level",
              annotation_position="top left",
              annotation_font_size=15,
              annotation_font_color="black"
              )
st.plotly_chart(fig,use_container_width=True)
I am expecting it to always plot IST timestamps, but on seemingly random runs it plots UTC instead.

It seems the issue was with the database. I created a new project with a new table and it runs flawlessly now. I still couldn't figure out what caused Supabase to send different timestamps for the same data on different runs, but creating a completely new table and relinking it to the rest of the system has solved it.
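A defensive option, regardless of what the backend returns, is to parse created_at as a timezone-aware timestamp and convert it to IST explicitly before plotting. This is a minimal sketch using pandas; it assumes the column arrives as an ISO 8601 string and that the stored values are UTC (the default for Supabase timestamptz columns):
import pandas as pd

# Parse as UTC-aware datetimes, then convert to IST (Asia/Kolkata)
df["DateTime"] = (
    pd.to_datetime(df["created_at"], utc=True)
      .dt.tz_convert("Asia/Kolkata")
      .dt.tz_localize(None)  # drop tz info so Plotly shows plain IST wall-clock times
)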

Related

Streamlit Session being shared with multiple users

So I am making a Streamlit web application with Python. Here is the link to the app: https://jensen-holm-mlb-simapp-app-kclnz9.streamlitapp.com/
I have already deployed the app but have come across a strange problem. When I first booted the app it worked great, running simulations for the two teams that I put into it. But after that first run, if I enter two other teams, the simulation yields the same results as the first two teams. I know this because the download play-by-play button always returns the same players from the first simulation. It's a very weird problem I did not foresee. Even when run on another person's computer, it still yields the same results as that first test I ran on my computer.
I would really like it to re-run everything when you enter two different teams and hit Go again, which I thought happens by default.
Any help is greatly appreciated. Below is the main app.py file that I am running in the cloud. Here is a link to the GitHub repo: https://github.com/Jensen-holm/MLB-Simapp
import sqlite3
import streamlit as st
from objects import Team
import game_functions
from streamlit_option_menu import option_menu
import pandas as pd
import numpy as np
def convert_df(df):
    return df.to_csv().encode("utf-8")
# title and stuff
st.set_page_config(page_title = 'Ball.Sim (Beta)', page_icon = '⚾️')
st.title('Ball.Sim (Beta)')
st.write('\nCreated by Jensen Holm')
st.write('Data Source: [Sports-Reference](https://sports-reference.com)')
st.write('\n[Donate](https://www.paypal.com/donate/?business=HPLUVQJA6GFMN&no_recurring=0&currency_code=USD)')
# get rid of streamlit option menu stuff
# hide_streamlit_style = """
# <style>
# #MainMenu {visibility: hidden;}
# footer {visibility: hidden;}
# </style>
# """
# st.markdown(hide_streamlit_style, unsafe_allow_html=True)
# import the player data dict dictionaries for the keys so we can have options in the select box
sim_data_db = sqlite3.connect("Sim_Data.db")
# get list of all table names with the cursor to put into the st.select_box
all_teams_array = np.array(pd.read_sql_query("SELECT name FROM sqlite_master WHERE type='table'", sim_data_db))
# the all teams array is nested so lets unnest it
all_teams_with_year = []
for sublist in all_teams_array:
    for thing in sublist:
        all_teams_with_year.append(thing)
all_teams = [team[len('Year '):] for team in all_teams_with_year]
all_teams.sort()
all_teams.insert(0, "Start typing and select team")
# user unput buttons and sliders
team1 = st.selectbox(label = "Team 1", options = all_teams)
team2 = st.selectbox(label = "Team 2", options = all_teams)
number_of_simulations = st.slider("Number of Simulations", min_value = 162, max_value = 16200, step = 162)
# initialize simulation button
init_button = st.button("Go")
if init_button:
    if team1 == "Start typing and select team" or team2 == "Start typing and select team":
        st.error("Must select team from select boxes.")
        st.stop()
    # select database tables based on user input
    team1_data = pd.read_sql_query(f"SELECT * FROM 'Year {team1}'", con = sim_data_db)
    team2_data = pd.read_sql_query(f"SELECT * FROM 'Year {team2}'", con = sim_data_db)
    team1_year = team1[:5]
    team2_year = team2[:5]  # was team1[:5], which copied team 1's year
    team1_name = team1[5:]
    team2_name = team2[5:]
    # generate teams with the data queried above
    Team1 = Team(team1_name, team1_year, team1_data, lineup_settings = 'auto')
    Team2 = Team(team2_name, team2_year, team2_data, lineup_settings = 'auto')
    # begin simulation
    pbp_df = game_functions.simulation(number_of_simulations, Team1, Team2, 0)
    # make it a csv and use st.download_button()
    pbp_csv = convert_df(pbp_df)
    st.download_button("Download Play By Play Data", pbp_csv, file_name = "BallSimPBP.csv")
# use st.tabs to filter what to see, like a chart tab for the graph, a place where they can view stats per 162 for each player
# and download data and stuff on another one (may need to figure out the st.session state thing tho)
# footer type stuff, plugging myself.
st.write('\n\nFeedback and report bugs: holmj@mail.gvsu.edu')
st.write('\nSocials: [Twitter](https://twitter.com/JensenH_) [GitHub](https://github.com/Jensen-holm) [Linkedin](https://www.linkedin.com/in/jensen-holm-3584981bb/)')
st.write('[Documentation / Code](https://github.com/Jensen-holm/MLB-Simapp)')
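Not confirmed as the root cause here, but one common way a Streamlit app ends up showing the same results to everyone is module-level mutable state: Streamlit keeps the server process (and every imported module) alive between reruns and shares it across sessions, so anything accumulated at module scope in objects or game_functions persists. A minimal sketch of the pattern to look for (all names hypothetical):
# Anti-pattern: module-level mutable state is shared across reruns and across users,
# because Streamlit keeps the server process (and its imported modules) alive.
results_log = []  # hypothetical accumulator living at module scope

def run_simulation(n_sims, team1, team2):
    results_log.append((team1, team2, n_sims))  # grows forever, visible to everyone
    return results_log

# Safer: build the state inside the function so each click starts fresh,
# or use st.session_state for genuinely per-user state.
def run_simulation_fresh(n_sims, team1, team2):
    results = []
    results.append((team1, team2, n_sims))
    return results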

How can I refresh the data in the background of a running flask app?

I have a simple Flask app that queries a database, writes the results to a CSV, and then uses pyplot to create a chart from it.
I would like to refresh the data in the background every 10 minutes while the app is running. The page doesn't need to refresh the HTML automatically; it just needs to have fresh data when someone opens the page.
Can I do that in a single script? Or do I need to run a separate script from crontab or something?
I could just restart the container every 10 minutes, but the query takes about 5 minutes, so that would mean a 5-minute outage. Not a great idea. I'd prefer it to fetch in the background.
Here is what I'm working with:
import os
from datetime import date
import teradatasql
import pandas as pd
import matplotlib.pyplot as plt
from flask import Flask, render_template
import time
import multitasking
### variables
ausername = os.environ.get('dbuser')
apassword = os.environ.get('dbpassword')
ahost = os.environ.get('dbserver')
systems = ["prd1", "prd2", "frz1", "frz2", "devl"]
qgsystems = ["", "@Tera_Prd2_v2", "@Tera_Frz1_v2", "@Tera_Frz2_v2", "@Tera_Devl_v2"]
weeks = ["0", "7", "30"]
query = """{{fn teradata_write_csv({system}_{week}_output.csv)}}select (bdi.infodata) as sysname,
to_char (thedate, 'MM/DD' ) || ' ' || Cast (thetime as varchar(11)) as Logtime,
sum(drc.cpuuexec)/sum(drc.secs) (decimal(7,2)) as "User CPU",
sum(drc.cpuuserv)/sum(drc.secs) (decimal(7,2)) as "System CPU",
sum(drc.cpuiowait)/sum(drc.secs) (decimal(7,2)) as "CPU IO Wait"
from dbc.resusagescpu{qgsystem} as drc
left outer join boeing_tables.dbcinfotbl{qgsystem} as bdi
on bdi.infokey = 'sysname'
where drc.thedate >= (current_date - {week})
group by sysname, logtime
order by logtime asc
;
"""
### functions
@multitasking.task
def fetch(system, qgsystem, week):
    with teradatasql.connect(host=ahost, user=ausername, password=apassword) as con:
        with con.cursor() as cur:
            cur.execute(query.format(system=system, qgsystem=qgsystem, week=week))
            [print(row) for row in cur.fetchall()]
@multitasking.task
def plot(system, week):
    for week in weeks:
        for system in systems:
            df = pd.read_csv(system + "_" + week + "_output.csv")
            df.pop('sysname')
            df.plot.area(x="Logtime")
            figure = plt.gcf()
            figure.set_size_inches(12, 6)
            plt.savefig("/app/static/" + system + "_" + week + "_webchart.png", bbox_inches='tight', dpi=100)
### main
for week in weeks:
    for system, qgsystem in zip(systems, qgsystems):
        fetch(system, qgsystem, week)
for week in weeks:
    for system in systems:
        plot(system, week)
app = Flask(__name__,template_folder='templates')
@app.route('/')
def index():
    return render_template("index.html")
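Yes, this can be done in a single script: start a background scheduler (or a plain daemon thread) inside the Flask process that re-runs the fetch/plot pipeline every 10 minutes, so the PNGs on disk are always reasonably fresh when a request comes in. A minimal sketch using APScheduler (a library choice, not something from the question); it assumes fetch and plot run synchronously inside the job:
from apscheduler.schedulers.background import BackgroundScheduler

def refresh_data():
    # re-run the existing fetch/plot loops; writes fresh PNGs to /app/static
    for week in weeks:
        for system, qgsystem in zip(systems, qgsystems):
            fetch(system, qgsystem, week)
    for week in weeks:
        for system in systems:
            plot(system, week)

scheduler = BackgroundScheduler()
scheduler.add_job(refresh_data, trigger="interval", minutes=10)
scheduler.start()  # runs in a background thread inside the same process

app = Flask(__name__, template_folder="templates")

@app.route('/')
def index():
    # always serves whatever charts the last background refresh produced
    return render_template("index.html")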

How can i create a live chart with the help of Live streaming data?

I'm receiving live streaming data from the Kite API every second, and now I want to create a chart from that data.
Any chart type will work; I don't need anything specific. But I don't understand how to create a chart that updates automatically every second.
I want the chart to change the way a stock market chart changes every second.
I'm getting values like this every second.
And this is my code:
from kiteconnect import KiteConnect
import time
from kiteconnect import KiteTicker
kws = KiteTicker(zerodha_api_key,acc_tkn)
tokens=[ 60702215, 60700167, 60701191]
# dict={9410818:'BANKNIFTY22MAR27000CE',9411074:'BANKNIFTY22MAR27000PE'}
def on_ticks(ws, ticks):
    bid_price_1 = ticks[0]['depth']['buy'][0]['price']
    ask_price_1 = ticks[1]['depth']['sell'][0]['price']
    combination_1 = bid_price_1 - ask_price_1
    print(combination_1)
def on_connect(ws, response):
    ws.subscribe(tokens)
    ws.set_mode(ws.MODE_FULL, tokens)
kws.on_ticks = on_ticks
kws.on_connect = on_connect
kws.connect(threaded=True)
count = 0
while True:
    count += 1
    if count % 2 == 0:
        if kws.is_connected():
            kws.set_mode(kws.MODE_FULL, tokens)
    else:
        if kws.is_connected():
            kws.set_mode(kws.MODE_FULL, tokens)
    time.sleep(1)
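One way to do this (a sketch, not tied to any particular charting library in the question) is to have the tick callback only append to a shared buffer, and let a separate plotting loop redraw from that buffer every second. A minimal matplotlib example; on_ticks mirrors the callback above and the deque keeps the last 300 points:
from collections import deque
import matplotlib.pyplot as plt
import matplotlib.animation as animation

# shared buffer filled by the KiteTicker callback, read by the plot
values = deque(maxlen=300)

def on_ticks(ws, ticks):
    bid_price_1 = ticks[0]['depth']['buy'][0]['price']
    ask_price_1 = ticks[1]['depth']['sell'][0]['price']
    values.append(bid_price_1 - ask_price_1)

fig, ax = plt.subplots()
(line,) = ax.plot([], [])

def update(_frame):
    line.set_data(range(len(values)), list(values))
    ax.relim()
    ax.autoscale_view()
    return (line,)

# redraw once a second; kws.connect(threaded=True) keeps the websocket running alongside
ani = animation.FuncAnimation(fig, update, interval=1000)
plt.show()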

Convert Tick Data to OHLC in Realtime with Python Pandas?

I know how to convert static Tick data to OHLC Candlestick data using resampling with Pandas Module of Python.
But how can I do this in Realtime (Just like a Websocket sends it)?
Let's say I am using this while loop code to prepare Tick Data
import time, requests
ohlc = []
while True:
    r = requests.get('https://api1.binance.com/api/v3/ticker/price?symbol=ETHUSDT')
    resp_dict = r.json()
    print({'time': time.time(), 'price': resp_dict["price"]})
Now, how can I keep resampling this data in real time with pandas (just like a websocket keeps sending OHLC data every second, which makes it possible to plot candlestick data in real time)?
Thanks in advance.
Unless you want to run a data-provider business, it's perfectly fine to build a standalone autonomous bot that does its own aggregation.
In real time you have to actually aggregate the OHLC bars live. The pipeline is:
Ticks (trades) arrive via a WebSocket (or another streaming API); ->
a queue holds those ticks; ->
logic takes ticks from the queue, sends them to the aggregator, and pops them off the queue so no tick is accidentally processed more than once; ->
logic aggregates OHLCV live;
logic decides when to start constructing a new bar.
As an example: FTX exchange and Python, 15-second bar aggregation. FTX has an "official" WebSocket client; I modified it to process all the ticks correctly:
def _handle_trades_message(self, message: Dict) -> None:
    self._trades[message['market']].extend(reversed(message['data']))
After that, here's what I got:
import ftx
import time
import pandas as pd
from apscheduler.schedulers.background import BackgroundScheduler
from datetime import datetime
def process_tick(tick):
    global data
    global flag
    global frontier
    if (flag == True) and (tick['time'] >= frontier):
        start_time = datetime.utcnow().isoformat()  # "almost"
        time_ = time.time() * 1000                  # with higher precision
        op = tick['price']
        hi = tick['price']
        lo = tick['price']
        cl = tick['price']
        vol = tick['size']
        row = {'startTime': start_time,
               'time': time_,
               'open': op,
               'high': hi,
               'low': lo,
               'close': cl,
               'volume': vol}
        # DataFrame.append is deprecated/removed in newer pandas; use concat instead
        data = pd.concat([data, pd.DataFrame([row])], ignore_index=True)
        flag = False
        print('Opened')
        print(tick)
    else:
        last = data.index[-1]  # update the bar in place with .at to avoid chained assignment
        if tick['price'] > data.at[last, 'high']:
            data.at[last, 'high'] = tick['price']
        elif tick['price'] < data.at[last, 'low']:
            data.at[last, 'low'] = tick['price']
        data.at[last, 'close'] = tick['price']
        data.at[last, 'volume'] += tick['size']
        print(tick)
def on_tick():
    while True:
        try:
            try:
                process_tick(trades[-1])
                trades.pop()
            except IndexError:
                pass
            time.sleep(0.001)
        except KeyboardInterrupt:
            client_ws._reset_data()
            scheduler.remove_job('onClose')
            scheduler.shutdown()
            print('Shutdown')
            break
def on_close():
    global flag
    global frontier
    flag = True
    frontier = pd.Timestamp.utcnow().floor('15S').isoformat()  # floor to 15 secs
    print('Closed')
ASSET = 'LTC-PERP'
RES = '15'
flag = True
frontier = pd.Timestamp.utcnow().floor('15S').isoformat() #floor to 15 secs
cols = ['startTime', 'time', 'open', 'high', 'low', 'close', 'volume']
data = pd.DataFrame(columns=cols)
client_ws = ftx.FtxClientWs()
client_ws._subscribe({'channel': 'trades', 'market': ASSET})
trades = client_ws._trades[ASSET]
scheduler = BackgroundScheduler()
scheduler.configure(timezone='utc')
scheduler.add_job(on_close, trigger='cron', second='0, 15, 30, 45', id='onClose')
scheduler.start()
on_tick()
On top of that, just run on_tick() in its own thread; the data variable will then be updated constantly in the background and hold the OHLCV bars, so you can use it elsewhere in the same program.
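For the simpler shape of the original question (polling the price endpoint in a loop rather than consuming a websocket), a minimal sketch that appends ticks and recomputes OHLC bars with pandas resample on each pass could look like this; the 1-minute bar size is just a placeholder:
import time
import requests
import pandas as pd

ticks = []  # list of {'time': ..., 'price': ...} dicts

while True:
    r = requests.get('https://api1.binance.com/api/v3/ticker/price?symbol=ETHUSDT')
    ticks.append({'time': pd.Timestamp.now(tz='UTC'), 'price': float(r.json()['price'])})

    # Rebuild OHLC bars from everything collected so far (fine for small tick counts;
    # for long-running processes, aggregate incrementally as in the answer above)
    df = pd.DataFrame(ticks).set_index('time')
    ohlc = df['price'].resample('1min').ohlc()
    print(ohlc.tail(1))

    time.sleep(1)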

How to cache bokeh plots using redis

I'm using bokeh server to render a timeseries graph over a map. As the timeseries progresses, the focus of the map moves.
The code below works, but each progression triggers a call to the Google Maps API (GMap) to fetch the backdrop, which then takes time to render. At points where the timeseries has shifted the focus a few times in quick succession, the backdrop hasn't had time to render before it is updated again.
I've been trying to work out if/how these requests can be made in advance and cached (using Redis), so the user views the cache with all data already loaded for each tick on the timeseries.
main.py
import settings
from bokeh.plotting import figure, gmap
from bokeh.embed import components
from bokeh.models import CustomJS, ColumnDataSource, Slider, GMapOptions, GMapPlot, Range1d, Button
from bokeh.models.widgets import DataTable, TableColumn, HTMLTemplateFormatter
from bokeh.layouts import column, row, gridplot, layout
from bokeh.io import show, export_png, curdoc
from filehandler import get_graph_data
from timeit import default_timer as timer  # timer() is called in slider_update below
"""
Get arguments from request
"""
try:
    args = curdoc().session_context.request.arguments
    pk = int(args.get('pk')[0])
except:
    pass
"""
get data for graph from file and initialise variables
"""
#load data into dictionary from file referenced by pk
data_dict = get_graph_data(pk)
no_of_markers = data_dict.get('markers')
series_length = data_dict.get('length')  # referenced as series_length in slider_update and the Slider below
series_data = data_dict.get('data') #lat/lon position of each series at each point in time
series_names = series_data.get('series_names') #names of series
range_x_axis = data_dict.get('xaxis') #min/max lat co-ords
range_y_axis = data_dict.get('yaxis') #min/max lon co-ords
"""
Build data
"""
graph_source = ColumnDataSource(series_data)
"""
Build markers to show current location
"""
markers = ColumnDataSource(data=dict(lon=[], lat=[]))
"""
Build mapping layer
"""
def create_map_backdrop(centroid, zoom, tools):
    """
    Create the map backdrop, centered on the starting point,
    using the Google Maps API.
    """
    map_options = GMapOptions(lng=centroid[1],
                              lat=centroid[0],
                              map_type='roadmap',
                              zoom=zoom,
                              )
    return gmap(google_api_key=settings.MAP_KEY,
                map_options=map_options,
                tools=tools,
                )
#set map focus
centroid = (graph_source.data['lats'][0][0],
graph_source.data['lons'][0][0],
)
"""
Build Plot
"""
tools="pan, wheel_zoom, reset"
p = create_map_backdrop(centroid, 18, tools)
p.multi_line(xs='lons',
ys='lats',
source=graph_source,
line_color='color',
)
p.toolbar.logo = None
p.circle(x='lon', y='lat', source=markers)
"""
User Interactions
"""
def animate_update():
    tick = slider.value + 1
    slider.value = tick

def slider_update(attr, old, new):
    """
    Updates all of the datasources, depending on the current value of the slider
    """
    start = timer()
    if slider.value > series_length:
        animate()
    else:
        tick = slider.value
        i = 0
        lons, lats = [], []
        marker_lons, marker_lats = [], []
        while i < no_of_markers:
            #update lines
            lons.append(series_data['lons'][i][0:tick])
            lats.append(series_data['lats'][i][0:tick])
            #update markers
            marker_lons.append(series_data['lons'][i][tick])
            marker_lats.append(series_data['lats'][i][tick])
            #update iterators
            i += 1
        #update marker display
        markers.data['lon'] = marker_lons
        markers.data['lat'] = marker_lats
        #update line display
        graph_source.data['lons'] = lons
        graph_source.data['lats'] = lats
        #set map_focus
        map_focus_lon = series_data['lons'][tick]
        map_focus_lat = series_data['lats'][tick]
        #update map focus
        p.map_options.lng = map_focus_lon
        p.map_options.lat = map_focus_lat
slider = Slider(start=0, end=series_length, value=0, step=5)
slider.on_change('value', slider_update)
callback_id = None
def animate():
    global callback_id
    if button.label == "► Play":
        button.label = "❚❚ Pause"
        callback_id = curdoc().add_periodic_callback(animate_update, 1)
    else:
        button.label = "► Play"
        curdoc().remove_periodic_callback(callback_id)
button = Button(label="► Play", width=60)
button.on_click(animate)
"""
Display plot
"""
grid = layout([[p, data_table],
[slider, button],
])
curdoc().add_root(grid)
I've tried caching the plot data (p), but it looks like this is persisted before the call to the google api is made.
I've explored caching the map tiles direct from the api and then stitching them into the plot as a background image (using bokeh ImageURL), but I can't get ImageUrl to recognise the in-memory image.
The server documentation suggests that redis can be used as a backend so I wondered whether this might speed thing up, but when I try to start it bokeh serve myapp --allow-websocket-origin=127.0.0.1:5006 --backend=redis I get --backend is not a recognised command.
Is there a way to either cache the fully rendered graph (possibly the graph document itself), whilst retaining the ability for users to interact with the plot; or to cache the gmap plot once it has been rendered and then add it to the rest of the plot?
If this were standalone Bokeh content (i.e. not a Bokeh server app), then you could serialize the JSON representation of the plot with json_item and re-hydrate it explicitly in the browser with Bokeh.embed.embed_item. That JSON could potentially be stored in Redis, and maybe that would be relevant. But a Bokeh server is not like that. After the initial session creation, there is never any "whole document" to store or cache, just a sequence of incremental, partial updates that happen over a websocket protocol. E.g. the server says "this specific data source changed" and the browser says "OK, I should recompute bounds and re-render".
That said, there are some changes I would suggest.
The first is that you should not update CDS columns one by one. You should not do this:
# BAD
markers.data['lon'] = marker_lons
markers.data['lat'] = marker_lats
This will generate two separate update events and two separate re-render requests. Apart from the extra work this causes, it's also the case that the first update is guaranteed to have mismatched old/new coordinates. Instead, you should always update CDS .data dict "atomically", in one go:
source.data = new_data_dict
Additionally, you might try curdoc().hold to collect updates into fewer events.
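Applied to the slider callback in the question, a sketch of the atomic update (plus an optional hold around the map-option changes) might look like this; it reuses the names from the code above and keeps any other columns already present in graph_source:
def slider_update(attr, old, new):
    # ... compute lons, lats, marker_lons, marker_lats, map_focus_lon, map_focus_lat as before ...

    # one atomic assignment per data source: a single change event per source,
    # and the lon/lat columns can never be momentarily mismatched in length
    markers.data = dict(lon=marker_lons, lat=marker_lats)
    graph_source.data = dict(graph_source.data, lons=lons, lats=lats)

    # optionally collect several model updates into fewer websocket events
    curdoc().hold('combine')
    p.map_options.lng = map_focus_lon
    p.map_options.lat = map_focus_lat
    curdoc().unhold()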
