Plotting values against time in Matplotlib using SQL - python

I am trying to plot a list of numerical values against a list of timestamps, both of which are stored in a SQL database called 'scores'. After selecting two columns and fetching the data from the database, I convert each tuple of tuples into a list and then remove the None records, so that the data can be plotted using plt.plot(x,y).
However, when I run the program below, I get this error:
ValueError: x and y must have same first dimension, but have shapes (5,) and (4,)
Here is some sample data for yaxis_list and xaxis_final:
yaxis_list = [20,40,60,80]
xaxis_final = ['2023-02-18 02:09:15', '2023-02-18 02:18:00', '2023-02-18 18:52:10']
my imports:
from matplotlib import pyplot as plt
import _sqlite3
I am a beginner with matplotlib, so any other suggestions for restructuring the code would really help me :)
with _sqlite3.connect("scores.db") as db:
    cursor = db.cursor()
    cursor.execute("SELECT (phy) FROM scores WHERE phy > 0")
    yaxis_tuple = cursor.fetchall()
    yaxis_list = [item for t in yaxis_tuple for item in t]
    cursor.execute("SELECT (phystamp) FROM scores")
    xaxis_tuple = cursor.fetchall()
    xaxis_list = [item for t in xaxis_tuple for item in t]
    xaxis_final = [i for i in xaxis_list if i is not None]
plt.plot(xaxis_final,yaxis_list)
plt.show()
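The two queries in the question filter rows differently (one on phy > 0, the other on non-None timestamps), so the resulting lists can end up with different lengths, which is what the ValueError reports. One possible fix is to fetch both columns in a single query with both filters, so every timestamp stays paired with its value. The sketch below assumes both columns live in the same table and uses an in-memory database with made-up rows in place of scores.db:

```python
import sqlite3
from datetime import datetime

# In-memory stand-in for scores.db; the rows are invented for illustration.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE scores (phy INTEGER, phystamp TEXT)")
db.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [
        (20, "2023-02-18 02:09:15"),
        (0, "2023-02-18 02:12:00"),   # dropped by phy > 0
        (40, "2023-02-18 02:18:00"),
        (60, None),                   # dropped: no timestamp
        (80, "2023-02-18 18:52:10"),
    ],
)

# One query, both filters: each row is a (timestamp, value) pair.
rows = db.execute(
    "SELECT phystamp, phy FROM scores "
    "WHERE phy > 0 AND phystamp IS NOT NULL "
    "ORDER BY phystamp"
).fetchall()
db.close()

# Parse the text timestamps so they can be used as real dates on an axis.
x = [datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S") for stamp, _ in rows]
y = [value for _, value in rows]
print(len(x), len(y))
```

Because x and y come from the same rows, they always have the same length and can be passed straight to plt.plot(x, y).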

Related

Making lists store all data of the loop and not only last one

I want to store the JSON I get from an API, but I only end up with the JSON from the last loop iteration. How can I make the lists dynamic so that they keep every response? I also need to run the last query with pandas, but it's not working.
Lastly, how can I build an API to:
List latest forecast for each location for every day.
List average the_temp of last 3 forecasts for each location for every day.
Get the top n locations based on each available metric where n is a parameter given in the API call.
import requests
import json
import sqlite3
import pandas as pd #library for data frame
print(sqlite3.sqlite_version)
for x in range(20,28): # I need the LONDjson/BERLjson/SANjson lists to be dynamic so they store the JSON from each URL
    r = requests.get('https://www.metaweather.com/api/location/44418/2021/4/'+str(x)+'/') #GET request from the source url
    LONDjson=r.json() #JSON object of the result
    r2 = requests.get('https://www.metaweather.com//api/location/2487956/2021/4/'+str(x)+'/')
    SANjson=r2.json()
    r3 = requests.get('https://www.metaweather.com//api/location/638242/2021/4/'+str(x)+'/')
    BERLjson=r3.json()
conn= sqlite3.connect(r'D:\weatherdb.db') #create db in path
cursor = conn.cursor()
#import pprint
#pprint.pprint(LONDjson)
cursor.executescript('''
DROP TABLE IF EXISTS LONDjson;
DROP TABLE IF EXISTS SANjson;
DROP TABLE IF EXISTS BERLjson;
CREATE TABLE LONDjson (id int, data json);
''')
for LOND in LONDjson:
    cursor.execute("insert into LONDjson values (?, ?)",
                   [LOND['id'], json.dumps(LOND)])
conn.commit()
z=cursor.execute('''select json_extract(data, '$.id', '$.the_temp', '$.weather_state_name', '$.applicable_date' ) from LONDjson;
''').fetchall() #query the data
Hint: in your initial for loop you are not storing the results of the API call. You are storing them in a variable, but that variable just gets overwritten on each iteration.
A common solution is to start with an empty list that you append to. If you need to store multiple values per iteration, you can store a dictionary as each element of the list.
Example:
results = []
for x in range(10):
    results.append(
        {
            'x': x,
            'x_squared': x*x,
            'abs_x': abs(x)
        }
    )
print(results)
It looks like there are at least two things that can be improved in the data manipulation part of your code.
Using a list to store the retrieved data
LONDjson = []
SANjson = []
BERLjson = []
for x in range(20,28):
    r = requests.get('https://www.metaweather.com/api/location/44418/2021/4/'+str(x)+'/')
    LONDjson.append(r.json())
    r2 = requests.get('https://www.metaweather.com//api/location/2487956/2021/4/'+str(x)+'/')
    SANjson.append(r2.json())
    r3 = requests.get('https://www.metaweather.com//api/location/638242/2021/4/'+str(x)+'/')
    BERLjson.append(r3.json())
Retrieving the data from the list
# The retrieved data is a dictionary inside a list with only one entry
for LOND in LONDjson:
    print(LOND[0]['id'])
Hope this helps you out.
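The append approach can also be applied to all three locations at once by keeping one list per location in a dictionary and writing everything to a single table. This is only a sketch: fetch_forecasts is a hypothetical stand-in for requests.get(url).json() that returns made-up records so the example runs offline, and the forecasts table layout is an assumption, not part of the question.

```python
import json
import sqlite3

# Location IDs copied from the URLs in the question; the keys reuse its names.
LOCATIONS = {"LOND": 44418, "SAN": 2487956, "BERL": 638242}


def fetch_forecasts(location_id, day):
    """Hypothetical stand-in for requests.get(url).json(); returns fake data."""
    return [{
        "id": location_id * 100 + day,
        "the_temp": 10.0 + day,
        "applicable_date": "2021-04-%02d" % day,
    }]


# One list per location; each day's records are appended, never overwritten.
results = {name: [] for name in LOCATIONS}
for day in range(20, 28):
    for name, location_id in LOCATIONS.items():
        results[name].extend(fetch_forecasts(location_id, day))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE forecasts (location TEXT, id INTEGER, data TEXT)")
conn.executemany(
    "INSERT INTO forecasts VALUES (?, ?, ?)",
    [(name, rec["id"], json.dumps(rec))
     for name, records in results.items()
     for rec in records],
)
conn.commit()

# Count the stored rows per location to confirm nothing was lost.
counts = dict(conn.execute(
    "SELECT location, COUNT(*) FROM forecasts GROUP BY location"))
print(counts)
```

A single table with a location column keeps later queries, such as the latest forecast per location, in one place instead of spreading them over three tables.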

How to iterate over Postgresql rows in a Python script?

I'm writing a script which selects from a DB table and iterates over the rows.
In MySQL I would do:
import MySQLdb
db_mysql=MySQLdb.Connect(user=...,passwd=...,db=..., host=...)
cur = db_mysql.cursor(MySQLdb.cursors.DictCursor)
cur.execute ("""SELECT X,Y,Z FROM tab_a""")
for row in cur.fetchall():
    pass  # do things...
But I don't know how to do it in PostgreSQL.
Basically this question could be how to translate the above MySQL code to work with PostgreSQL.
This is what I have so far (I am using PyGreSQL).
import pg
pos = pg.connect(dbname=...,user=...,passwd=...,host=..., port=...)
pos.query("""SELECT X,Y,Z FROM tab_a""")
How do I iterate over the query results?
Retrieved from http://www.pygresql.org/contents/tutorial.html, which you should read.
q = db.query('select * from fruits')
q.getresult()
The result is a Python list of tuples; each tuple contains a row. You just need to iterate over the list and then iterate over or index each tuple.
I think it is the same: you must create a cursor, call one of the fetch methods and iterate, just like in MySQL:
import pgdb
pos = pgdb.connect(database=...,user=...,password=...,host=..., port=...)
sel = "select version() as x, current_timestamp as y, current_user as z"
cursor = pos.cursor()
cursor.execute(sel)
columns_descr = cursor.description
rows = cursor.fetchall()
for row in rows:
    x, y, z = row
    print('variables:')
    print('%s\t%s\t%s' % (x, y, z))
    print('\nrow:')
    print(row)
    print('\ncolumns:')
    for i in range(len(columns_descr)):
        print('-- %s (%s) --' % (columns_descr[i][0], columns_descr[i][1]))
        print('%s' % (row[i]))
    # this will work with PyGreSQL >= 5.0
    print('\n testing named tuples')
    print('%s\t%s\t%s' % (row.x, row.y, row.z))
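If dictionary-style rows like those from the MySQL DictCursor in the question are wanted, they can be built from cursor.description, which every DB-API 2.0 driver provides. The sketch below uses sqlite3 as a stand-in so it runs without a PostgreSQL server; the assumption is that a pgdb cursor can be passed to the same helper. rows_as_dicts is a hypothetical helper name, not part of any library.

```python
import sqlite3  # stand-in driver; pgdb follows the same DB-API 2.0 interface


def rows_as_dicts(cursor):
    """Yield each remaining row of an executed cursor as a {column: value} dict."""
    names = [col[0] for col in cursor.description]
    for row in cursor.fetchall():
        yield dict(zip(names, row))


conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE tab_a (X, Y, Z)")
cur.executemany("INSERT INTO tab_a VALUES (?, ?, ?)", [(1, 2, 3), (4, 5, 6)])

cur.execute("SELECT X, Y, Z FROM tab_a ORDER BY X")
records = list(rows_as_dicts(cur))
for rec in records:
    print(rec["X"], rec["Y"], rec["Z"])
conn.close()
```

PostgreSQL folds unquoted identifiers to lower case, so with a real PostgreSQL connection the keys would likely be x, y and z rather than X, Y and Z.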

How to store numpy.array in sqlite3 to take benefit of the sum function?

I am trying to use sqlite3 to compute the average of a numpy.array and I would like to take advantage of the sum function.
So far I have relied on this post:
stackoverflow numpy.array
which helps me store and retrieve the arrays I need easily.
import sqlite3
import numpy
import io
def adapt_array(arr):
    out = io.BytesIO()
    numpy.save(out, arr)
    out.seek(0)
    a = out.read()
    return buffer(a)
def convert_array(text):
    out = io.BytesIO(text)
    out.seek(0)
    return numpy.load(out)
sqlite3.register_adapter(numpy.ndarray, adapt_array)
sqlite3.register_converter("array", convert_array)
x1 = numpy.arange(12)
x2 = numpy.arange(12, 24)
con = sqlite3.connect(":memory:", detect_types = sqlite3.PARSE_DECLTYPES)
cur = con.cursor()
cur.execute("create table test (idx int, arr array)")
cur.execute("insert into test (idx, arr) values (?, ?)", (1, x1))
cur.execute("insert into test (idx, arr) values (?, ?)", (2, x2))
cur.execute("select idx, sum(arr) from test")
data = cur.fetchall()
print data
But unfortunately the query output does not give me the sum of the arrays.
[(2, 0.0)]
I would like to go one step further and get directly the result I want from an sql request. Thanks.
Edit: after reading stackoverflow: manipulation of numpy.array with sqlite3, I am more sceptical about the feasibility of this. Any way to get a result close to what I want would be appreciated.
Edit2: in other words, what I am trying to do is to redefine the sum function for the particular kind of data I am using. Is it doable? That's what was done to compress / uncompress the numpy.array.
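The built-in sum cannot interpret the serialized blobs, but sqlite3 lets Python register a user-defined aggregate with create_aggregate, which is one possible way to "redefine" a sum for this data. The sketch below is written for Python 3, so plain bytes take the place of buffer. The names array_sum, ArraySum, to_blob and from_blob are hypothetical, and it assumes all stored arrays have the same shape.

```python
import io
import sqlite3

import numpy as np


def to_blob(arr):
    """Serialize an array to bytes using the .npy format."""
    out = io.BytesIO()
    np.save(out, arr)
    return out.getvalue()


def from_blob(blob):
    """Rebuild an array from bytes produced by to_blob."""
    return np.load(io.BytesIO(blob))


class ArraySum:
    """Aggregate that adds the stored arrays element by element."""

    def __init__(self):
        self.total = None

    def step(self, blob):
        # Aggregates receive the raw stored bytes, not converted arrays.
        if blob is None:
            return
        arr = from_blob(blob)
        self.total = arr if self.total is None else self.total + arr

    def finalize(self):
        # The return value must be a native SQLite type, so hand back bytes.
        return None if self.total is None else to_blob(self.total)


sqlite3.register_adapter(np.ndarray, to_blob)
sqlite3.register_converter("array", from_blob)

con = sqlite3.connect(
    ":memory:",
    detect_types=sqlite3.PARSE_DECLTYPES | sqlite3.PARSE_COLNAMES,
)
con.create_aggregate("array_sum", 1, ArraySum)
cur = con.cursor()
cur.execute("create table test (idx int, arr array)")
cur.execute("insert into test (idx, arr) values (?, ?)", (1, np.arange(12)))
cur.execute("insert into test (idx, arr) values (?, ?)", (2, np.arange(12, 24)))

# The "[array]" suffix in the alias tells sqlite3 which converter to apply.
cur.execute('select array_sum(arr) as "total [array]" from test')
total = cur.fetchone()[0]
print(total)
```

An average could be built the same way by also counting rows in step and dividing in finalize.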

Iterating through python sqlite database

I am trying to iterate through an SQLite database and perform checks or operations on the objects in the list. I need to use a database because the eventual number of objects will be quite large and all the operations are serial in nature (after basic sorting).
My question is how can I iterate through a list and after checking an object for certain qualities put it into a new database object? I would like to perform several serial 'checks' where at most two objects are brought into memory at a time and then re-assigned.
Below is a sample of my code. When I run the last operation I cannot 're-run' the same loop. How can I, instead of just printing the object, save it to a new database?
import os
import sqlite3 as lite
import sys
import random
import gc
import pprint
def make_boxspace():
    refine_zone_cube_size = 1
    refine_zone_x1 = 1*refine_zone_cube_size
    refine_zone_y1 = 1*refine_zone_cube_size
    refine_zone_z1 = 1*refine_zone_cube_size
    refine_zone_x2 = refine_zone_x1+(2*refine_zone_cube_size)
    refine_zone_y2 = refine_zone_y1+(1*refine_zone_cube_size)
    refine_zone_z2 = refine_zone_z1+(1*refine_zone_cube_size)
    point_pass_length = (1.0/4.0)
    outlist = []
    for i in range(int((refine_zone_x2-refine_zone_x1)/point_pass_length)):
        for j in range(int((refine_zone_y2-refine_zone_y1)/point_pass_length)):
            for k in range(int((refine_zone_z2-refine_zone_z1)/point_pass_length)):
                if (random.random() > 0.5):
                    binary = True
                else:
                    binary = False
                if binary:
                    x1 = point_pass_length*i
                    y1 = point_pass_length*j
                    z1 = point_pass_length*k
                    x2 = x1+point_pass_length
                    y2 = y1+point_pass_length
                    z2 = z1+point_pass_length
                    vr_lev = int(random.random()*3)
                    outlist.append([\
                        float(str("%.3f" % (x1))),\
                        float(str("%.3f" % (y1))),\
                        float(str("%.3f" % (z1))),\
                        float(str("%.3f" % (x2))),\
                        float(str("%.3f" % (y2))),\
                        float(str("%.3f" % (z2))),\
                        vr_lev
                        ])
    return outlist
### make field of "boxes"
boxes = make_boxspace()
### define database object and cursor object
box_data = lite.connect('boxes.db')
cur = box_data.cursor()
### write the list in memory to the database
cur.execute("DROP TABLE IF EXISTS boxes")
cur.execute("CREATE TABLE boxes(x1,y1,z1,x2,y2,z2,vr)")
cur.executemany("INSERT INTO boxes VALUES(?, ?, ?, ?, ?, ?, ?)", boxes)
### clear the 'boxes' list from memory
del boxes
### re-order the boxes
cur.execute("SELECT * FROM boxes ORDER BY z1 ASC")
cur.execute("SELECT * FROM boxes ORDER BY y1 ASC")
cur.execute("SELECT * FROM boxes ORDER BY x1 ASC")
### save the database
box_data.commit()
### print each item
while True:
    row = cur.fetchone()
    if row is None:
        break
    print(row)
Thanks guys!!!
I really don't understand what you're asking, but I think you have some fairly fundamental misunderstandings of SQL.
SELECT... ORDER BY does not "order the table", and running commit after a SELECT does not do anything. Sending three separate SELECTs with different ORDER BY but only running fetch once also does not make any sense: you'll only fetch what was provided by the last SELECT.
Perhaps you just want to order by multiple columns at once?
result = cur.execute("SELECT * FROM boxes ORDER BY z1, y1, x1 ASC")
rows = result.fetchall()
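To save rows somewhere instead of printing them, one option is to read with one cursor and write with a second cursor on the same connection, so only the current row is held in Python at a time. This is a sketch under assumptions: the kept table and the passes_check function are hypothetical, and an in-memory database with three made-up boxes stands in for boxes.db.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # the question uses 'boxes.db'
conn.execute("CREATE TABLE boxes(x1,y1,z1,x2,y2,z2,vr)")
conn.executemany(
    "INSERT INTO boxes VALUES(?, ?, ?, ?, ?, ?, ?)",
    [
        (0.0, 0.0, 0.0, 0.25, 0.25, 0.25, 2),
        (0.25, 0.0, 0.0, 0.5, 0.25, 0.25, 0),
        (0.5, 0.0, 0.0, 0.75, 0.25, 0.25, 1),
    ],
)
conn.execute("CREATE TABLE kept(x1,y1,z1,x2,y2,z2,vr)")


def passes_check(row):
    """Hypothetical check: keep boxes whose vr level is above zero."""
    return row[6] > 0


read_cur = conn.cursor()
write_cur = conn.cursor()
read_cur.execute("SELECT * FROM boxes ORDER BY x1, y1, z1")
for row in read_cur:  # iterating the cursor hands over one row at a time
    if passes_check(row):
        write_cur.execute("INSERT INTO kept VALUES(?, ?, ?, ?, ?, ?, ?)", row)
conn.commit()  # commit matters here because rows were inserted

kept = conn.execute("SELECT x1, vr FROM kept ORDER BY x1").fetchall()
print(kept)
```

Each further check can repeat the pattern, reading from the table produced by the previous pass.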
Connecting to a sqlite3 database is simple.
Since the query results come directly from the database, we need to collect them in a list with the fetchall() method and iterate through that list. This way you can fetch any number of result lists over a single connection.
The following is a simple Python example:
import sqlite3
conn = sqlite3.connect("database file name")
cur = conn.cursor()
cur.execute("your query")
a_list = cur.fetchall()
for i in a_list:
    "process your list"
    "perform another operation using cursor object"

Downloading arrays off cur.fetchall() in Python and Oracle 11g

I'm trying to download a single array off of Oracle 11g into Python using the cur.fetchall command. I'm using the following syntax:
con = cx_Oracle.connect('xxx')
print con.version
cur = con.cursor()
cur.execute("select zc.latitude from orders o, zip_code zc where o.date> '24-DEC-12' and TO_CHAR(zc.ZIP_CODE)=o.POSTAL_CODE")
latitudes = cur.fetchall()
cur.close()
print latitudes
When I print latitudes, I get this:
[(-73.98353999999999,), (-73.96565,), (-73.9531,),....]
The problem is that when I try to manipulate the data, in this case via:
x,y = map(longitudes,latitudes)
I get the following error (note that I use exactly the same syntax to create 'longitudes'):
TypeError: a float is required
I suspect this is because cur.fetchall is returning tuples with commas inside the tuple elements. How do I run the query so that I don't get the comma inside the parentheses, and get an array instead of a tuple? Is there a nice "catch all" command like cur.fetchall, or do I have to loop manually to get the results into an array?
My full code is below:
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
import numpy as np
import cx_Oracle
con = cx_Oracle.connect('xxx')
print con.version
cur = con.cursor()
cur.execute("select zc.latitude from orders o, zip_code zc where psh.ship_date> '24-DEC-12' and TO_CHAR(zc.ZIP_CODE)=o.CONSIGNEE_POSTAL_CODE")
latitudes = cur.fetchall()
cur.close()
cur = con.cursor()
cur.execute("select zc.longitude from orders o, zip_code zc where psh.ship_date> '24-DEC-12' and TO_CHAR(zc.ZIP_CODE)=o.CONSIGNEE_POSTAL_CODE")
longitudes = cur.fetchall()
print 'i made it!'
print latitudes
print longitudes
cur.close()
con.close()
map = Basemap(resolution='l',projection='merc', llcrnrlat=25.0,urcrnrlat=52.0,llcrnrlon=-135.,urcrnrlon=-60.0,lat_ts=51.0)
# draw coastlines, country boundaries, fill continents.
map.drawcoastlines(color ='C')
map.drawcountries(color ='C')
map.fillcontinents(color ='k')
# draw the edge of the map projection region (the projection limb)
map.drawmapboundary()
# draw lat/lon grid lines every 30 degrees.
map.drawmeridians(np.arange(0, 360, 30))
map.drawparallels(np.arange(-90, 90, 30))
plt.show()
# compute the native map projection coordinates for the orders.
x,y = map(longitudes,latitudes)
# plot filled circles at the locations of the orders.
map.plot(x,y,'yo')
The trailing commas are fine: that is valid tuple syntax and what you get when you print a one-element tuple.
I don't know what you are trying to achieve, but map is probably not what you want. map takes a function and a list as arguments but you are giving it 2 lists. Something more useful might be to retrieve the latitude and longitude from the database together:
cur.execute("select zc.longitude, zc.latitude from orders o, zip_code zc where o.date> '24-DEC-12' and TO_CHAR(zc.ZIP_CODE)=o.POSTAL_CODE")
Update to Comments
From the original code it looked like you were trying to use the built-in map function, which is not the case in your updated code.
The reason you are getting the TypeError is that matplotlib is expecting a list of floats but you are providing a list of one-element tuples. You can unwrap the tuples from your original latitudes with a simple list comprehension (the map built-in would also do the trick):
[row[0] for row in latitudes]
Using one query to return the latitudes and longitudes:
cur.execute("select zc.longitude, zc.latitude from...")
points = cur.fetchall()
longitudes = [point[0] for point in points]
latitudes = [point[1] for point in points]
Now longitudes and latitudes are lists of floats.
Found another way to tackle this -- check out TypeError: "list indices must be integers" looping through tuples in python. Thanks for your hard work in helping me out!
I faced the same issue, and one workaround is to unwrap the tuple elements in a loop as you mentioned (not so efficient though), something like this:
cleaned_up_latitudes=[] #initialize the list
for x in xrange(len(latitudes)):
    cleaned_up_latitudes.append(latitudes[x][0]) #add first element
print cleaned_up_latitudes
[-73.98353999999999, -73.96565, -73.9531,....]
I hope that helps.
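Both unwrapping approaches from the answers can be written compactly in Python 3. The sketch below uses made-up coordinate values shaped like cur.fetchall() output, not real query results:

```python
# Rows shaped like cur.fetchall() output for a two-column
# "select zc.longitude, zc.latitude ..." query (illustrative values only).
points = [(-73.98, 40.75), (-73.97, 40.78), (-73.95, 40.77)]

# zip(*points) transposes the rows into one sequence per column.
# Note: this unpacking fails if the query returned no rows.
longitudes, latitudes = (list(column) for column in zip(*points))

# A single-column query returns one-element tuples; index 0 unwraps them.
one_tuples = [(-73.98,), (-73.97,), (-73.95,)]
flat = [row[0] for row in one_tuples]

print(longitudes)
print(latitudes)
print(flat)
```

Fetching both columns in one query also guarantees that each longitude stays paired with the latitude from the same row.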
