Load existing sqlite db into memory/ execute and close - python

I have a function in Python which connects to a SQLite database with 20k rows and executes a simple select query, as below:
import sqlite3
def viewdata(mul):
    conn = sqlite3.connect("mynew.db")
    cursor = conn.cursor()
    # Fetch the row with the largest mul that does not exceed the argument
    cursor.execute("SELECT ad,abd,acd,ard FROM allrds WHERE mul<=? ORDER BY mul DESC LIMIT 1", (mul,))
    data = cursor.fetchall()
    conn.close()
    return data
It's kind of slow, so I want to move this into an in-memory SQLite database. How can I copy the existing database into an in-memory one, open a connection, fetch the data, and close it once the operations are over? Is there anything different I need to do when connecting to in-memory databases? Are select queries executed the same way as for an on-disk database? Can someone please give me an example function?
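One possible approach, as a minimal sketch rather than a definitive answer: sqlite3.Connection.backup (available in Python 3.7 and later) can copy an on-disk database into a connection opened on ":memory:". The helper name load_into_memory and the split into two functions are assumptions made for illustration; the table and column names come from the question. Queries run against the in-memory connection exactly as they do against a file.

```python
import sqlite3


def load_into_memory(path):
    """Copy an on-disk SQLite database into a new in-memory database."""
    disk = sqlite3.connect(path)
    mem = sqlite3.connect(":memory:")
    try:
        # Connection.backup copies every page of the source database.
        disk.backup(mem)
    finally:
        disk.close()
    return mem


def viewdata(conn, mul):
    """Same query as in the question, run on an already open connection."""
    cursor = conn.execute(
        "SELECT ad, abd, acd, ard FROM allrds "
        "WHERE mul <= ? ORDER BY mul DESC LIMIT 1",
        (mul,),
    )
    return cursor.fetchall()
```

An in-memory database disappears when its connection is closed, so the copy only pays off if the connection is kept open across many calls to viewdata and closed once at the end. With only 20k rows, an index on mul in the on-disk database may help just as much.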

Related

About python sqlite3 order by

I am studying the Python sqlite3 module. I think this is a very simple problem, but I cannot get past it. Could you help me?
The rows print fine in the VS Code terminal, but the DB file is not changed. I have searched several times but cannot fix it.
When I execute the code, the rows in the DB file are not sorted.
import sqlite3
conn = sqlite3.connect('sqliteDB1.db')
cursor = conn.cursor()
cursor.execute("SELECT * FROM member")
temp123 = cursor.fetchall()
print(temp123)
cursor.execute("SELECT * FROM member ORDER BY -code")
temp321 = cursor.fetchall()
conn.commit()
print(temp321)
conn.close()
A select statement just returns data from a database; it will not modify it. Moreover, tables in SQL databases are inherently unordered sets. They have no intrinsic order, and you should never rely on the order of the rows that happens to be returned unless you explicitly sort it with an order by clause.
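To illustrate the point, here is a small sketch using an in-memory database. The columns of the member table (code, name) and the sample rows are assumptions, since the question does not show the schema. The sort is requested in the query each time the data is read; nothing is written back to the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE member (code INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO member VALUES (?, ?)",
    [(2, "b"), (3, "c"), (1, "a")],
)
conn.commit()

# The ordering exists only in this result set, not in the stored table.
descending = conn.execute(
    "SELECT code, name FROM member ORDER BY code DESC"
).fetchall()
print(descending)  # [(3, 'c'), (2, 'b'), (1, 'a')]
conn.close()
```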

Ending a SELECT transaction psycopg2 and postgres

I am executing a number of SELECT queries on a postgres database using psycopg2, but I am getting ERROR: Out of shared memory. It suggests that I should increase max_locks_per_transaction, but this confuses me because each SELECT query is operating on only one table, and max_locks_per_transaction is already set to 512, 8 times the default.
I am using TimescaleDB, which could be the cause of a larger than normal number of locks (one for each chunk rather than one for each table, maybe), but this still can't explain running out when so many are allowed. I'm assuming what is happening here is that all the queries are being run as part of one transaction.
The code I am using looks something like the following.
db = DatabaseConnector(**connection_params)
tables = db.get_table_list()
for table in tables:
    result = db.query(f"""SELECT a, b, COUNT(c) FROM {table} GROUP BY a, b""")
    print(result)
Where db.query is defined as:
def query(self, sql):
    with self._connection.cursor() as cur:
        cur.execute(sql)
        return_value = cur.fetchall()
    return return_value
and self._connection is:
self._connection = psycopg2.connect(**connection_params)
Do I need to explicitly end the transaction in some way to free up locks? And how can I go about doing this in psycopg2? I would have assumed that there was an implicit end to the transaction when the cursor is closed on __exit__. I know if I was inserting or deleting rows I would use COMMIT at the end, but it seems strange to use as I am not changing the table.
UPDATE: When I explicitly open and close the connection in the loop, the error does not show. However, I assume there is a better way to end the transaction after each SELECT than this.
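A hedged sketch of one way to end the transaction after each read: in psycopg2 a transaction starts with the first execute and stays open until commit() or rollback() is called on the connection; closing the cursor does not end it. The function below is a standalone variant of the query method from the question (that shape is an assumption for illustration) and uses only generic DB-API calls, so it does not depend on psycopg2 itself.

```python
from contextlib import closing


def query(connection, sql):
    """Run a read-only query and end the transaction before returning."""
    try:
        with closing(connection.cursor()) as cur:
            cur.execute(sql)
            return cur.fetchall()
    finally:
        # Ending the transaction releases the locks the SELECT acquired.
        # commit() would work as well; nothing was modified either way.
        connection.rollback()
```

Alternatives in psycopg2 are setting connection.autocommit = True once after connecting, or wrapping each query in "with connection:", which commits or rolls back at the end of the block without closing the connection.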

SQL command SELECT fetches uncommitted data from Postgresql database

In short:
I have a PostgreSQL database and I connect to it through Python's psycopg2 module. Such a script might look like this:
import psycopg2
# connect to my database
conn = psycopg2.connect(dbname="<my-dbname>",
                        user="postgres",
                        password="<password>",
                        host="localhost",
                        port="5432")
cur = conn.cursor()
ins = "insert into testtable (age, name) values (%s,%s);"
data = ("90", "George")
sel = "select * from testtable;"
cur.execute(sel)
print(cur.fetchall())
# prints out
# [(100, 'Paul')]
#
# db looks like this
# age | name
# ----+-----
# 100 | Paul
# insert new data - no commit!
cur.execute(ins, data)
# perform the same select again
cur.execute(sel)
print(cur.fetchall())
# prints out
# [(100, 'Paul'),(90, 'George')]
#
# db still looks the same
# age | name
# ----+-----
# 100 | Paul
cur.close()
conn.close()
That is, I connect to that database which at the start of the script looks like this:
age | name
----+-----
100 | Paul
I perform an SQL select and retrieve only Paul's data. Then I do an SQL insert without any commit, but the second SQL select still fetches both Paul and George, and I don't want that. I've looked into both the psycopg and PostgreSQL docs and found out about isolation levels. The PostgreSQL docs (under 13.2.1. Read Committed Isolation Level) explicitly say:
However, SELECT does see the effects of previous updates executed within its own transaction, even though they are not yet committed.
I've tried different isolation levels. I understand that Read Committed and Repeatable Read don't work; I thought that Serializable might, but it does not, meaning that I can still fetch uncommitted data with select.
I could do conn.set_isolation_level(0), where 0 represents psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT, or I could probably wrap the execute commands inside with statements.
After all this, I am a bit confused about whether I understand transactions and isolation correctly (and the behavior of select without commit is completely normal) or not. Can somebody shed some light on this topic for me?
Your two SELECT statements are using the same connection, and therefore the same transaction. From the psycopg manual you linked:
By default, the first time a command is sent to the database ... a new transaction is created. The following database commands will be executed in the context of the same transaction.
Your code is therefore equivalent to the following:
BEGIN TRANSACTION;
select * from testtable;
insert into testtable (age, name) values (90, 'George');
select * from testtable;
ROLLBACK TRANSACTION;
Isolation levels control how a transaction interacts with other transactions. Within a transaction, you can always see the effects of commands within that transaction.
If you want to isolate two different parts of your code, you will need to open two connections to the database, each of which will (unless you enable autocommit) create a separate transaction.
Note that according to the document already linked, creating a new cursor will not be enough:
...not only the commands issued by the first cursor, but the ones issued by all the cursors created by the same connection
Using autocommit will not solve your problem. When autocommit is on, every insert and update is automatically committed to the database and all subsequent reads will see that data.
It's most unusual to not want to see data that has been written to the database by you. But if that's what you want, you need two separate connections and you must make sure that your select is executed prior to the commit.
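The two-connection behaviour can be sketched as follows. To keep the example runnable without a server, it swaps in the standard-library sqlite3 module and a temporary file for PostgreSQL and psycopg2; the file name and the variable names are assumptions. The pattern with two psycopg2.connect() calls should look the same: the writing connection sees its own uncommitted insert, the other connection does not.

```python
import os
import sqlite3
import tempfile

# Two independent connections to the same database file.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(path)
reader = sqlite3.connect(path)

writer.execute("CREATE TABLE testtable (age INTEGER, name TEXT)")
writer.execute("INSERT INTO testtable VALUES (100, 'Paul')")
writer.commit()

# Insert new data on the writer connection, without committing.
writer.execute("INSERT INTO testtable VALUES (90, 'George')")

select = "SELECT age, name FROM testtable ORDER BY age DESC"
own_view = writer.execute(select).fetchall()    # includes the uncommitted row
other_view = reader.execute(select).fetchall()  # only committed data

print(own_view)    # [(100, 'Paul'), (90, 'George')]
print(other_view)  # [(100, 'Paul')]

writer.rollback()
writer.close()
reader.close()
```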

how to make Python faster when processing Mysql query

I have a very simple MySQL query, as follows:
db = getDB()
cursor = db.cursor()
cursor.execute('select * from users')
results = cursor.fetchall()
for row in results:
    process(row)
Suppose the users table has 1 billion records and the process method takes 10 ms per record.
The above code finishes fetching all of the data to the client side and only then starts the process method, which wastes time. Should I run the query and the processing in parallel?
I'd like to change fetchall() to fetchmany() and start a new thread to process each retrieved batch while the cursor fetches the next one.
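The fetchmany() idea could be sketched like this. It is a minimal illustration, not a tuned solution: iter_batches, process_in_background and process_batch are hypothetical names, and only generic DB-API cursor methods are used. One worker thread processes a batch while the main thread fetches the next one, so at most two batches are held in memory at a time.

```python
from concurrent.futures import ThreadPoolExecutor


def iter_batches(cursor, batch_size=1000):
    """Yield lists of rows until the result set is exhausted."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


def process_in_background(cursor, process_batch, batch_size=1000):
    """Fetch the next batch while a worker processes the previous one."""
    results = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = None
        for rows in iter_batches(cursor, batch_size):
            if pending is not None:
                # Wait for the previous batch before queueing another one.
                results.append(pending.result())
            pending = pool.submit(process_batch, rows)
        if pending is not None:
            results.append(pending.result())
    return results
```

Note that many MySQL client libraries buffer the whole result set on the client by default, in which case fetchmany() alone does not reduce the initial wait; an unbuffered (server-side) cursor such as SSCursor in MySQLdb or PyMySQL is probably needed as well. If process is CPU-bound, threads will not speed it up in CPython, and a process pool may be a better fit.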

My program isn't changing MySQL database yet presents no error

I've written a program to scrape a website for data, place it into several arrays, iterate through each array and place it in a query and then execute the query. The code looks like this:
for count in range(391):
    query = #long query
    values = (doctor_names[count].encode("utf-8"), ...) #continues for about a dozen arrays
    cur.execute(query, values)
cur.close()
db.close()
I run the program and aside from a few truncation warnings everything goes fine. I open the database in MySQL Workbench and nothing has changed. I tried changing the arrays in the values to constant strings and running it but still nothing would change.
I then created an array to hold the last executed query: sql_queries.append(cur._last_executed) and pushed them out to a text file:
fo = open("foo.txt", "wb")
for q in sql_queries:
    fo.write(q)
fo.close()
Which gives me a large text file with multiple queries. When I copy the whole text file and create a new query in MySQL Workbench and execute it, it populates the database as desired. What is my program missing?
If your table is using a transactional storage engine, like InnoDB, then you need to call db.commit() to have the transaction stored:
for count in range(391):
    query = #long query
    values = (doctor_names[count].encode("utf-8"), ...)
    cur.execute(query, values)
db.commit()
cur.close()
db.close()
Note that with a transactional database, besides committing you also have the opportunity to handle errors by rolling back inserts or updates with db.rollback(). The db.commit() call is required to finalize the transaction. Otherwise,
Closing a connection without committing the changes first will cause an implicit rollback to be performed.
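A sketch of the commit-or-rollback pattern, assuming a generic DB-API connection: insert_all is a hypothetical helper, and the query and its parameter rows are supplied by the caller. Either all rows are committed together, or none are if any insert fails.

```python
def insert_all(db, query, rows):
    """Execute query once per row, committing only if every row succeeds."""
    cur = db.cursor()
    try:
        for values in rows:
            cur.execute(query, values)
        db.commit()      # make all inserts permanent
    except Exception:
        db.rollback()    # undo the inserts made so far in this transaction
        raise
    finally:
        cur.close()
```

The placeholder style inside query depends on the driver: MySQL drivers such as MySQLdb use %s, while sqlite3 uses ?.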
