I am trying to iterate through an SQLite database and perform checks or operations on the objects in the list. I need to use a database because the eventual number of objects will be quite large and all the operations are serial in nature (after basic sorting).
My question is how can I iterate through a list and after checking an object for certain qualities put it into a new database object? I would like to perform several serial 'checks' where at most two objects are brought into memory at a time and then re-assigned.
Below is a sample of my code. When I run the last operation I cannot 're-run' the same loop. How can I, instead of just printing the object, save it to a new database?
import os
import sqlite3 as lite
import sys
import random
import gc
import pprint

def make_boxspace():
    refine_zone_cube_size = 1
    refine_zone_x1 = 1*refine_zone_cube_size
    refine_zone_y1 = 1*refine_zone_cube_size
    refine_zone_z1 = 1*refine_zone_cube_size
    refine_zone_x2 = refine_zone_x1+(2*refine_zone_cube_size)
    refine_zone_y2 = refine_zone_y1+(1*refine_zone_cube_size)
    refine_zone_z2 = refine_zone_z1+(1*refine_zone_cube_size)
    point_pass_length = (1.0/4.0)
    outlist = []
    for i in range(int((refine_zone_x2-refine_zone_x1)/point_pass_length)):
        for j in range(int((refine_zone_y2-refine_zone_y1)/point_pass_length)):
            for k in range(int((refine_zone_z2-refine_zone_z1)/point_pass_length)):
                binary = random.random() > 0.5
                if binary:
                    x1 = point_pass_length*i
                    y1 = point_pass_length*j
                    z1 = point_pass_length*k
                    x2 = x1+point_pass_length
                    y2 = y1+point_pass_length
                    z2 = z1+point_pass_length
                    vr_lev = int(random.random()*3)
                    outlist.append([
                        float("%.3f" % x1),
                        float("%.3f" % y1),
                        float("%.3f" % z1),
                        float("%.3f" % x2),
                        float("%.3f" % y2),
                        float("%.3f" % z2),
                        vr_lev
                    ])
    return outlist
### make field of "boxes"
boxes = make_boxspace()
### define database object and cursor object
box_data = lite.connect('boxes.db')
cur = box_data.cursor()
### write the list in memory to the database
cur.execute("DROP TABLE IF EXISTS boxes")
cur.execute("CREATE TABLE boxes(x1,y1,z1,x2,y2,z2,vr)")
cur.executemany("INSERT INTO boxes VALUES(?, ?, ?, ?, ?, ?, ?)", boxes)
### clear the 'boxes' list from memory
del boxes
### re-order the boxes
cur.execute("SELECT * FROM boxes ORDER BY z1 ASC")
cur.execute("SELECT * FROM boxes ORDER BY y1 ASC")
cur.execute("SELECT * FROM boxes ORDER BY x1 ASC")
### save the database
box_data.commit()
### print each item
while True:
    row = cur.fetchone()
    if row == None:
        break
    print(row)
Thanks guys!!!
I really don't understand what you're asking, but I think you have some fairly fundamental misunderstandings of SQL.
SELECT... ORDER BY does not "order the table", and running commit after a SELECT does not do anything. Sending three separate SELECTs with different ORDER BY but only running fetch once also does not make any sense: you'll only fetch what was provided by the last SELECT.
Perhaps you just want to order by multiple columns at once?
result = cur.execute("SELECT * FROM boxes ORDER BY z1, y1, x1 ASC")
rows = result.fetchall()
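To also answer the "save it to a new database" part of the question: you can read from one cursor and insert the rows that pass your check into a second table (or a second database file). A minimal sketch against the boxes.db schema from the question; the filtered_boxes table and the vr > 1 check are hypothetical:
import sqlite3 as lite

box_data = lite.connect('boxes.db')
cur = box_data.cursor()
cur.execute("DROP TABLE IF EXISTS filtered_boxes")
cur.execute("CREATE TABLE filtered_boxes(x1,y1,z1,x2,y2,z2,vr)")

read_cur = box_data.cursor()   # a second cursor, so the INSERTs don't clobber the iteration
read_cur.execute("SELECT * FROM boxes ORDER BY z1, y1, x1 ASC")
for row in read_cur:
    if row[6] > 1:             # hypothetical check on the vr column
        cur.execute("INSERT INTO filtered_boxes VALUES(?, ?, ?, ?, ?, ?, ?)", row)
box_data.commit()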
Connecting to the sqlite3 database is as simple as shown above.
Since we are reading queries and results directly from the database, we pull all the results into a list with the fetchall() method and iterate through that list. That way you can collect any number of results into multiple lists over a single connection, and the cursor is free for further queries while you loop.
Following is the simple Python code:
import sqlite3

conn = sqlite3.connect("database file name")
cur = conn.cursor()
cur.execute("your query")
a_list = cur.fetchall()
for i in a_list:
    pass  # process your list here, and perform further operations using the cursor object
I am new to Python, so I need some help with this:
I am trying to insert values into an Oracle database. When I hard-code the values, the insertion works.
Note: common_test2 has only 2 words
But when I write it like below:
import cx_Oracle

con = cx_Oracle.connect('sys/sys#127.0.0.1:1599/xe')
print(con.version)  # just to check the connection
print("this connection is established")  # connection is tested
cur = con.cursor()
f3 = open("common_test2", 'r+')
string = f3.read()
common_words = string.split()
x = common_words[0]
y = common_words[1]
cur.execute("INSERT INTO test(name,id) VALUES (%s,%d)", ('hello',30))
con.commit()
Error is cx_Oracle.DatabaseError: ORA-01036: illegal variable name/number
Also tried cur.execute("INSERT INTO test(name,id) VALUES (x, y)")
but no luck
Error is cx_Oracle.DatabaseError: ORA-00984: column not allowed here
Any help?
#I am using this for updating the table
y = "100%"   # as read from the file; note it is a string
x = "text"
cur.execute("update temp set perc=(:1)", (y))
#Please note: I only want 100 to be updated in the table, not the % (only 100 as numeric)
cur.execute("update temp set remarks=(:1)", (x))
Error comes from here:
cur.execute("INSERT INTO test(name,id) VALUES (%s,%d)", ('hello',30))
Try using :n bind variables:
cur.execute("INSERT INTO test(name, id) VALUES (:1, :2)", ('hello', 30))
Update
For your second case: if y is a string like y = "100%", you can strip the trailing % before binding:
cur.execute("update temp set perc = :1", (y[:-1],))
This binds the string "100"; Oracle will convert it to a number for a NUMBER column.
Note that a 1-item tuple is written (x,), not (x).
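If you want to bind an actual int rather than the string "100", convert it first (same statement, one extra step):
cur.execute("update temp set perc = :1", (int(y[:-1]),))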
Code I'm Using
import cx_Oracle

con = cx_Oracle.connect('system/system#127.0.0.1:1521/xe')
# print(con.version)
# print("this connection is established")
cur = con.cursor()
f3 = open("common.txt", 'r+')  # this text file has only 2 words, let's say "10% increased"
string = f3.read()             # will read common.txt
common_words = string.split()
x = common_words[0]            # here x holds "10%"
y = common_words[1]            # here y holds "increased"
# Now I want 10 to be stored in the temp table, **not 10%**
cur.execute("update temp set perc=(:1)", (x[:-1],))
cur.execute("update temp set remarks=(:1)", (y,))
con.commit()
Note: I have to retrieve that 10 and do further calculations
Table Temp:
perc remarks
10 increased
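To retrieve that 10 afterwards for further calculations, a minimal sketch (assuming perc is a NUMBER column, so cx_Oracle hands it back as a Python number):
cur.execute("select perc from temp")
(perc,) = cur.fetchone()
print(perc * 2)  # further calculation, e.g. prints 20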
I am very new to Python and sqlite. I am trying to create a script that reads data from a table (rawdata) and then performs some calculations which are stored in a new table. I am counting the number of races that a player has won before that date at a particular track position and calculating the percentage. There are 15 track positions in total. Overall the script is very slow. Any suggestions to improve its speed? I have already used the PRAGMA parameters.
Below is the script.
for item in result:
    l1 = str(item[0])
    l2 = item[1]
    l3 = int(item[2])
    winpost = []
    key = l1.split("|")
    dt = l2
    ### Denominator --------------
    cursor.execute(
        "SELECT rowid FROM rawdata WHERE Track = ? AND Date< ? AND Distance = ? AND Surface =? AND OfficialFinish=1",
        (key[2], dt, str(key[4]), str(key[5]),))
    result_den1 = cursor.fetchall()
    cursor.execute(
        "SELECT rowid FROM rawdata WHERE Track = ? AND RaceSN<= ? AND Date= ? AND Distance = ? AND Surface =? AND OfficialFinish=1",
        (key[2], int(key[3]), dt, str(key[4]), str(key[5]),))
    result_den2 = cursor.fetchall()
    totalmat = len(result_den1) + len(result_den2)
    if totalmat > 0:
        for i in range(1, 16):
            cursor.execute(
                "SELECT rowid FROM rawdata WHERE Track = ? AND Date< ? AND PolPosition = ? AND Distance = ? AND Surface =? AND OfficialFinish=1",
                (key[2], dt, i, str(key[4]), str(key[5]),))
            result_num1 = cursor.fetchall()
            cursor.execute(
                "SELECT rowid FROM rawdata WHERE Track = ? AND RaceSN<= ? AND Date= ? AND PolPosition = ? AND Distance = ? AND Surface =? AND OfficialFinish=1",
                (key[2], int(key[3]), dt, i, str(key[4]), str(key[5]),))
            result_num2 = cursor.fetchall()
            winpost.append(len(result_num1) + len(result_num2))
        winpost = [float(x) / totalmat for x in winpost]
        rank = rankmin(winpost)
        franks = list(rank)
        franks.insert(0, int(key[3]))
        franks.insert(0, dt)
        franks.insert(0, l1)
        table1.append(franks)
        franks = []
cursor.executemany("INSERT INTO posttable VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)", table1)
Sending and retrieving an SQL query is "expensive" in terms of time. The easiest way to speed things up would be to use SQL functions to reduce the number of queries.
For example, the first two queries could be reduced to a single call using COUNT(), UNION, and Aliases.
SELECT COUNT(*)
FROM
( SELECT rowid FROM rawdata where ...
UNION
SELECT rowid FROM rawdata where ...
) totalmatch
In this case we take the two original queries (with your conditions in place of the "...") combine them with a UNION statement, give that union the alias "totalmatch", and count all the rows in it.
Same thing can be done with the second set of queries. Instead of cycling 15 times over 2 queries (range(1, 16) runs 15 iterations, so 30 calls to the SQL engine per item), you can replace them with one query by also using GROUP BY.
SELECT PolPosition, COUNT(*)
FROM
( SELECT rowid, PolPosition FROM rawdata WHERE ...
  UNION
  SELECT rowid, PolPosition FROM rawdata WHERE ...
) totalmatch
GROUP BY PolPosition
In this case we take the same pair of queries as before and group the union by PolPosition, using COUNT to display how many rows are in each group. Keeping rowid in the subquery matters: UNION removes duplicate rows, so selecting PolPosition alone would collapse distinct races that share a position and throw the counts off.
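Putting that back into Python, the 30 per-item calls collapse into one; a sketch reusing the key, dt, and cursor variables from the question, with the WHERE clauses copied from the original queries:
cursor.execute(
    "SELECT PolPosition, COUNT(*) FROM "
    "( SELECT rowid, PolPosition FROM rawdata "
    "  WHERE Track = ? AND Date < ? AND Distance = ? AND Surface = ? AND OfficialFinish = 1 "
    "  UNION "
    "  SELECT rowid, PolPosition FROM rawdata "
    "  WHERE Track = ? AND RaceSN <= ? AND Date = ? AND Distance = ? AND Surface = ? AND OfficialFinish = 1 "
    ") totalmatch GROUP BY PolPosition",
    (key[2], dt, str(key[4]), str(key[5]),
     key[2], int(key[3]), dt, str(key[4]), str(key[5])))
counts = dict(cursor.fetchall())                    # {PolPosition: number of wins}
winpost = [counts.get(i, 0) for i in range(1, 16)]  # zero for positions never seen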
W3Schools is a great resource for how these functions work:
http://www.w3schools.com/sql/default.asp
I am trying to use sqlite3 to compute the average of a numpy.array and I would like to take advantage of the sum function.
So far I have taken advantage of this post :
stackoverflow numpy.array
which helped me store and retrieve the arrays I need easily.
import sqlite3
import numpy
import io

def adapt_array(arr):
    out = io.BytesIO()
    numpy.save(out, arr)
    out.seek(0)
    a = out.read()
    return buffer(a)

def convert_array(text):
    out = io.BytesIO(text)
    out.seek(0)
    return numpy.load(out)

sqlite3.register_adapter(numpy.ndarray, adapt_array)
sqlite3.register_converter("array", convert_array)

x1 = numpy.arange(12)
x2 = numpy.arange(12, 24)

con = sqlite3.connect(":memory:", detect_types = sqlite3.PARSE_DECLTYPES)
cur = con.cursor()
cur.execute("create table test (idx int, arr array)")
cur.execute("insert into test (idx, arr) values (?, ?)", (1, x1))
cur.execute("insert into test (idx, arr) values (?, ?)", (2, x2))
cur.execute("select idx, sum(arr) from test")
data = cur.fetchall()
print data
but unfortunately the query output does not give me the sum of the arrays:
[(2, 0.0)]
I would like to go one step further and get directly the result I want from an sql request. Thanks.
Edit: after reading stackoverflow: manipulation of numpy.array with sqlite3, I am more sceptical about the feasibility of this. Any way to get a result close to what I want would be appreciated.
Edit2: in other words, what I am trying to do is to redefine the sum function for the particular kind of data I am using. Is it doable? That's what was done to compress / uncompress the numpy.array.
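It is doable: sqlite3 lets you register a custom aggregate class with con.create_aggregate, the aggregate counterpart of the adapter/converter registration used above. A minimal sketch reusing adapt_array and convert_array from the question; note that step() receives the raw blob (declared-type converters only run on fetched columns), and finalize() must return an SQLite-storable value, so the result is serialized again and converted back by hand:
class NumpyArraySum(object):
    # custom aggregate: element-wise sum of numpy arrays stored as blobs
    def __init__(self):
        self.total = None

    def step(self, blob):
        arr = convert_array(bytes(blob))  # deserialize the raw blob ourselves
        self.total = arr if self.total is None else self.total + arr

    def finalize(self):
        # return an SQLite-storable value: serialize the summed array again
        return None if self.total is None else adapt_array(self.total)

con.create_aggregate("np_sum", 1, NumpyArraySum)
cur.execute("select np_sum(arr) from test")
data = convert_array(bytes(cur.fetchone()[0]))
print data  # [12 14 16 18 20 22 24 26 28 30 32 34]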
I want to test my MySQL database and how it handles my future data. It is only a table with two columns: one column holds a single word and the other holds 30,000 characters. I copied one row and inserted it into the same table 20,000 times, so the table size is now 2.0 GB. When I try to browse the table through phpMyAdmin it shows nothing, and every attempt to display the table fails. When I output it through Python, it only shows the 5 rows that were inserted before this copy. A script deleting rows with IDs between 5,000 and 10,000 works. That means the data is there but doesn't come out. Any explanation?
import MySQLdb as mdb

con = mdb.connect('127.0.0.1', 'root', 'password', 'database')
title = []
entry = []
y = 0
with con:
    conn = con.cursor()
    conn.execute("SELECT * FROM mydatabase WHERE id='2' AND myword = 'jungleboy'")
    rows = conn.fetchall()
    for i in rows:
        title.append(i[1])
        entry.append(i[2])
    for x in range(20000):
        conn.execute("INSERT INTO mydatabase(myword,explanation) VALUES (%s,%s)", (str(title[0]), str(entry[0])))
        if x > y + 50:
            print str(x)
            y = x
I'm not sure I understand your question, but here are some tips on the code you have pasted.
After any INSERT or other query that adds, removes or changes data in a table, you need to commit the transaction with con.commit().
Note that fetchall() returns every remaining row at once, which can be very large here; the cursor's arraysize attribute only controls how many rows fetchmany() returns per call:
print 'fetchmany() returns {0.arraysize} rows at a time.'.format(cur)
To read every row while keeping memory use bounded, loop through the results like this:
q = "SELECT .... "  # some query that returns a lot of results
conn.execute(q)
row = conn.fetchone()
while row:
    print row
    row = conn.fetchone()
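If you'd rather fetch in batches than a single row at a time, fetchmany() is a middle ground; a sketch with a hypothetical batch size:
conn.execute(q)
while True:
    batch = conn.fetchmany(1000)  # up to 1000 rows per call
    if not batch:
        break
    for row in batch:
        print row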
I have two different SQLite databases XXX and YYY.
XXX contains table A and YYY contains B respectively.
A and B have the same structure (columns).
How can I append the rows of B to A with the Python sqlite3 API?
After appending, A should contain the rows of A plus the rows of B.
You first get a connection to the database using sqlite3.connect, then create a cursor so you can execute sql. Once you have a cursor, you can execute arbitrary sql commands.
Example:
import sqlite3

# Get connections to the databases
db_a = sqlite3.connect('database_a.db')
db_b = sqlite3.connect('database_b.db')

# Get the contents of a table
b_cursor = db_b.cursor()
b_cursor.execute('SELECT * FROM mytable')
output = b_cursor.fetchall()   # Returns the results as a list.

# Insert those contents into another table.
a_cursor = db_a.cursor()
for row in output:
    a_cursor.execute('INSERT INTO myothertable VALUES (?, ?, ...etc..., ?, ?)', row)

# Cleanup
db_a.commit()
a_cursor.close()
b_cursor.close()
Caveat: I haven't actually tested this, so it might have a few bugs in it, but the basic idea is sound, I think.
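Alternatively, SQLite can do the whole copy in SQL with ATTACH DATABASE, which avoids pulling the rows through Python at all. A sketch with the same hypothetical file and table names:
import sqlite3

db_a = sqlite3.connect('database_a.db')
# make database_b.db visible inside this connection under the alias "other"
db_a.execute("ATTACH DATABASE 'database_b.db' AS other")
# copy every row of B's table into A's table (assumes identical column layout)
db_a.execute("INSERT INTO myothertable SELECT * FROM other.mytable")
db_a.commit()
db_a.execute("DETACH DATABASE other")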
This is a generalized function and should be customized to your particular environment. If your schema is fixed, you may replace the "dynamically determine SQL expression requirements" section with static SQL parameters (rather than querying PRAGMA table_info on every call). This should improve performance.
import sqlite3

def merge_tables(cursor_new: sqlite3.Cursor, cursor_old: sqlite3.Cursor, table_name: str, del_old_table: bool = False) -> None:
    '''
    This function merges the content of a specific table from an old cursor into a new cursor.

    :param cursor_new: [sqlite3.Cursor] the primary cursor
    :param cursor_old: [sqlite3.Cursor] the secondary cursor
    :param table_name: [str] the name of the table
    :param del_old_table: [bool] if True, clear the table behind the secondary cursor after merging
    :return: None
    '''
    # dynamically determine SQL expression requirements
    column_names = cursor_new.execute(f"PRAGMA table_info({table_name})").fetchall()
    column_names = tuple([x[1] for x in column_names][1:])  # drop the first column (assumed to be the primary key)
    columns_sql = ', '.join(column_names)                   # format appropriately
    values_placeholders = ', '.join(['?' for x in column_names])

    # SQL select columns from table
    data = cursor_old.execute(f"SELECT {columns_sql} FROM {table_name}").fetchall()

    # insert the data into the primary cursor
    cursor_new.executemany(f"INSERT INTO {table_name} ({columns_sql}) VALUES ({values_placeholders})", data)
    cursor_new.connection.commit()

    # With ephemeral RAM connections & testing, deleting the old table may be ill-advised
    if del_old_table:
        cursor_old.execute(f"DELETE FROM {table_name}")  # or: cursor_old.execute(f'DROP TABLE {table_name}')
        cursor_old.connection.commit()

    print(f"Table {table_name} merged from {cursor_old.connection} to {cursor_new.connection}")  # Consider logging.info()
    return None
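A hypothetical usage example (file and table names are placeholders):
import sqlite3

con_new = sqlite3.connect('primary.db')
con_old = sqlite3.connect('secondary.db')
merge_tables(con_new.cursor(), con_old.cursor(), 'mytable')
con_new.close()
con_old.close()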