Accessing a SQLite database with multiple connections - Python

I want to access the same SQLite database from multiple instances.
I tried it from two Python shells, but I didn't get really consistent results: new entries made on one connection didn't reliably show up on the other. Does this actually work, or was it simply a fluke (or a misunderstanding on my side)?
I was using the following code snippets:
>>> import sqlite3
>>> conn = sqlite3.connect("test.db")
>>> conn.cursor().execute("SELECT * from foo").fetchall()
>>> conn.execute("INSERT INTO foo VALUES (1, 2)")
Of course I wasn't always adding new entries.

It's not a fluke, just a misunderstanding of how the connections are being handled. From the docs:
When a database is accessed by multiple connections, and one of the
processes modifies the database, the SQLite database is locked until
that transaction is committed. The timeout parameter specifies how
long the connection should wait for the lock to go away until raising
an exception. The default for the timeout parameter is 5.0 (five
seconds).
In order to see the changes on the other connection, you will have to commit() the changes made by your execute() command. Again, from the docs:
If you don’t call this method, anything you did since the last call to
commit() is not visible from other database connections. If you wonder
why you don’t see the data you’ve written to the database, please
check you didn’t forget to call this method.
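For illustration, here is a rough sketch with two simultaneous connections (it assumes the test.db file and the foo table from the question already exist); the second connection only sees the new row after the first one commits:
import sqlite3

conn_a = sqlite3.connect("test.db", timeout=5.0)  # timeout = how long to wait for a lock
conn_b = sqlite3.connect("test.db")

conn_a.execute("INSERT INTO foo VALUES (1, 2)")
print(conn_b.execute("SELECT * FROM foo").fetchall())  # (1, 2) not visible yet

conn_a.commit()                                         # make the change visible
print(conn_b.execute("SELECT * FROM foo").fetchall())   # now includes (1, 2)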

You should also call commit() after any DML statement if your connection is not in autocommit mode:
>>> import sqlite3
>>> conn = sqlite3.connect("test.db")
>>> conn.cursor().execute("SELECT * from foo").fetchall()
>>> conn.execute("INSERT INTO foo VALUES (1, 2)")
>>> conn.commit()


Should I pass Database connection or Cursor to a class

I'm writing a Python script to move data from production db to dev db. I'm using vertica-python (something very similar to pyodbc) for db connection and airflow for scheduling.
The script is divided into two files, one for the DAG and one for the actual migration job. I use a try-except-finally block for all SQL execution functions in the migration job:
try:
    # autocommit set to False
    # Execute a SQL script
except DatabaseError:
    # Logging information
    # Rollback
finally:
    # autocommit set to False
You can see that setting autocommit and rolling back need access to the connection, while executing a SQL script needs access to the cursor. The current solution is to simply create two DB connections in the DAG and pass them to the migration script. But I also read in a Stack Overflow post that I should pass only the cursor:
Python, sharing mysql connection in multiple functions - pass connection or cursor?
My question is: is it possible to pass only the cursor from the DAG to the migration script and still retain the ability to roll back and set autocommit?
Yes, you can change the autocommit setting via the Cursor:
>>> import pyodbc
>>> cnxn = pyodbc.connect("DSN=mssqlLocal")
>>> cnxn.autocommit
False
>>> crsr = cnxn.cursor()
>>> crsr.connection.autocommit = True
>>> cnxn.autocommit
True
>>>
pyodbc also provides commit() and rollback() methods on the Cursor object, but be aware that they affect all cursors created by the same connection, i.e., crsr.rollback() is exactly the same as calling cnxn.rollback().
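So one possible shape for the migration job is to accept just the cursor and reach the connection through it. This is only a sketch: run_migration, the table names, and the SQL are made up for illustration, while the DSN is the one from the snippet above:
import pyodbc

def run_migration(crsr):
    # Everything needed for transaction control is reachable from the cursor.
    crsr.connection.autocommit = False
    try:
        crsr.execute("INSERT INTO dev_table SELECT * FROM prod_table")  # hypothetical tables
        crsr.commit()                    # same effect as crsr.connection.commit()
    except pyodbc.DatabaseError:
        crsr.rollback()                  # rolls back the whole connection's transaction
        raise

cnxn = pyodbc.connect("DSN=mssqlLocal")
run_migration(cnxn.cursor())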

What is the difference (in MySQL) between transaction-rollback and not commiting?

I have a question regarding MySQL and transactions. I work with MySQL 5.7.18, Python 3 and Oracle's MySQL Connector/Python v2.1.4.
I do not understand the difference between
a) having a transaction and, in case of error, rolling back, and
b) not having a transaction and, in case of error, simply not committing the changes.
Both seem to leave me with exactly the same results (i.e. no entries in table, see code example below). Does this have to do with using InnoDB – would the results differ otherwise?
What is the advantage of using a transaction if
1) I cannot roll back committed changes and
2) I could just as well not commit changes (until I am done with my task or sure that some query didn’t raise any exceptions)?
I have tried to find the answers to those questions in https://downloads.mysql.com/docs/connector-python-en.a4.pdf but failed to find the essential difference.
Somebody asked an almost identical question and received some replies, but I don't think those actually contain an answer: Mysql transaction : commit and rollback. The replies focused on having multiple connections open and the visibility of changes. Is that all there is to it?
import mysql.connector

# Connect to MySQL-Server
conn = mysql.connector.connect(user='test', password='blub',
                               host='127.0.0.1', db='my_test')
cursor = conn.cursor(buffered=True)

# This is anyway the default in mysql.connector
# cursor.autocommit = False

sql = """CREATE TABLE IF NOT EXISTS `my_test`.`employees` (
    `emp_no` int(11) NOT NULL AUTO_INCREMENT,
    `first_name` varchar(14) NOT NULL,
    PRIMARY KEY (`emp_no`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8"""

try:
    cursor.execute(sql)
    conn.commit()
except:
    print("error")

# Arguments on default values
# conn.start_transaction(consistent_snapshot=False,
#                        isolation_level=None, readonly=False)

sql = """INSERT INTO `my_test`.`employees`
    (`first_name`)
VALUES
    (%s);"""

employees = {}
employees["1"] = ["Peter"]
employees["2"] = ["Bruce"]

for employee, value in employees.items():
    cursor.execute(sql, (value[0],))

print(conn.in_transaction)

# If I do not commit the changes, table is left empty (whether I write
# start_transaction or not)
# If I rollback the changes (without commit first), table is left empty
# (whether I write start_transaction or not)
# If I commit and then rollback, the rollback had no effect (i.e. there are
# values in the table (whether I write start_transaction or not)

conn.commit()
conn.rollback()
Thank you very much for your help in advance! I appreciate it.
I think having neither committed nor rolled back leaves the transaction in a running state, in which it may still hold resources such as locks.
It doesn't matter which DB you are using: when you run a transaction, it will lock the resource (i.e. the table) until the transaction is completed or rolled back. For example, if I write a transaction to insert something into a table test, the test table will be locked until the transaction is completed, and this may lead to deadlocks since others may need that table. You can try it yourself: open two instances of your MySQL client, run a transaction without committing in the first, and try to insert something into the same table in the second. It will clear your doubt.
Transactions prevent other queries from modifying the data while your query is running. Furthermore, a transaction scope can contain multiple queries, so you can roll back ALL of them in the event of an error. Without one, if some of them run successfully and only one query results in an error, you may end up with partially committed results, as JLH said.
Your decision to have a transaction should take into account the numerous reasons for having one, including having multiple statements, each of which writes to the database.
In your example I don't think it makes a difference, but in more complicated scenarios you need a transaction to ensure ACID.
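To make that advantage concrete, here is a hedged sketch (reusing the connection settings and the employees table from the question) in which the last of several inserts fails; the rollback then undoes all of them together instead of leaving a partial result behind:
import mysql.connector

conn = mysql.connector.connect(user='test', password='blub',
                               host='127.0.0.1', db='my_test')
cursor = conn.cursor()

try:
    conn.start_transaction()
    cursor.execute("INSERT INTO employees (first_name) VALUES (%s)", ("Peter",))
    cursor.execute("INSERT INTO employees (first_name) VALUES (%s)", ("Bruce",))
    cursor.execute("INSERT INTO employees (first_name) VALUES (%s)", (None,))  # violates NOT NULL
    conn.commit()                  # only reached if every statement succeeded
except mysql.connector.Error:
    conn.rollback()                # undoes all three inserts, not just the failing one
finally:
    cursor.close()
    conn.close()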

How to reach postgres maximum connection with sqlalchemy?

This is related to sqlalchemy and pg8000.
I have read everywhere that I should close the ResultProxy object so that the connection can be returned to the pool.
The local test database allows a maximum of 100 connections:
$ psql -h 127.0.0.1 -U postgres
Password for user postgres:
psql (9.5.5, server 9.6.0)
WARNING: psql major version 9.5, server major version 9.6.
Some psql features might not work.
Type "help" for help.
postgres=# show max_connections;
max_connections
-----------------
100
(1 row)
The following test script creates an engine in every loop iteration and neither reads nor closes the ResultProxy object. It really is as bad as it can get.
The weird thing is, it also does not generate a "too many connections" kind of error. This is really confusing to me. Does SQLAlchemy perform some magic? Or maybe Postgres is actually magic?
#!/usr/bin/env python2.7
from __future__ import print_function
import sqlalchemy

def handle():
    url = 'postgresql+pg8000://{}:{}@{}:{}/{}'
    url = url.format("postgres", "pass", "127.0.0.1", "5432", "usercity")
    conn = sqlalchemy.create_engine(url, client_encoding='utf8')
    meta = sqlalchemy.MetaData(bind=conn, reflect=True)
    table = meta.tables['events']
    clause = table.select()
    result = conn.execute(clause)

if __name__ == '__main__':
    for i in range(2000):
        print(i)
        handle()
No magic, just garbage collection. Since handle() doesn't return anything (or modify global data), there is no way for a reference to the connection or cursor it creates to live beyond the scope of handle(). When they go out of scope, their reference counts drop to 0, and they get deleted (there is no hard guarantee about when this happens, but in practice, in CPython this happens immediately).
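Relying on the garbage collector like that is fragile, though. A more deliberate version (sketch only, assuming the SQLAlchemy 1.x API used in the question) creates the engine once, returns each pooled connection when the with block exits, and disposes of the engine at the end:
from __future__ import print_function
import sqlalchemy

url = 'postgresql+pg8000://{}:{}@{}:{}/{}'
url = url.format("postgres", "pass", "127.0.0.1", "5432", "usercity")
engine = sqlalchemy.create_engine(url, client_encoding='utf8')
meta = sqlalchemy.MetaData(bind=engine, reflect=True)
table = meta.tables['events']

def handle():
    with engine.connect() as conn:          # checked out from the pool, returned on exit
        result = conn.execute(table.select())
        rows = result.fetchall()
        result.close()                      # releases the cursor explicitly
        return rows

if __name__ == '__main__':
    for i in range(2000):
        print(i)
        handle()
    engine.dispose()                        # closes the pooled connections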

Should I call connect() and close() for every Sqlite3 transaction?

I want to write a Python module to abstract away database transactions for my application. My question is whether I need to call connect() and close() for every transaction? In code:
import sqlite3

# Can I put connect() here?
conn = sqlite3.connect('db.py')

def insert(args):
    # Or should I put it here?
    conn = sqlite3.connect('db.py')

    # Perform the transaction.
    c = conn.cursor()
    c.execute(''' insert args ''')
    conn.commit()

    # Do I close the connection here?
    conn.close()

# Or can I close the connection whenever the application restarts (ideally, very rarely)
conn.close()
I don't have much experience with databases, so I'd appreciate an explanation of why one method is preferred over the other.
You can use the same connection repeatedly. You can also use the connection as a context manager (in sqlite3 this commits or rolls back the transaction rather than closing the connection) and wrap the cursor in contextlib.closing() so you don't need an explicit close() on it.
from contextlib import closing
import sqlite3

def insert(conn, args):
    with closing(conn.cursor()) as c:
        c.execute(...)
    conn.commit()

with sqlite3.connect('db.py') as conn:
    insert(conn, ...)
    insert(conn, ...)
    insert(conn, ...)
There's no reason to close the connection to the database, and re-opening the connection each time can be expensive. (For example, you may need to establish a TCP session to connect to a remote database.)
Using a single connection will be faster, and operationally should be fine.
Use the atexit module if you want to ensure the closing eventually happens (even if your program is terminated by an exception). Specifically, import atexit at the start of your program, and call atexit.register(conn.close) right after you connect -- note, no () after close: you want to register the function to be called at program exit (whether normal or via an exception), not to call the function.
Unfortunately, if Python should crash due, e.g., to an error in a C-coded module that Python can't catch, or a kill -9, etc., the registered exit function(s) may end up not being called. Fortunately, in this case it shouldn't hurt anyway (besides being, one hopes, a rare and extreme occurrence).
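A minimal sketch of that pattern, assuming the single module-level connection from the question (the items table is a made-up placeholder):
import atexit
import sqlite3

conn = sqlite3.connect('db.py')
atexit.register(conn.close)          # register the function itself -- no ()

def insert(value):
    # 'items' is a hypothetical table, purely for illustration
    conn.execute("INSERT INTO items VALUES (?)", (value,))
    conn.commit()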

Python SQLite - How to manually BEGIN and END transactions?

Context
So I am trying to figure out how to properly override the automatic transaction handling when using SQLite in Python. When I try to run
cursor.execute("BEGIN;")
.....an assortment of insert statements...
cursor.execute("END;")
I get the following error:
OperationalError: cannot commit - no transaction is active
Which I understand is because SQLite in Python automatically opens a transaction on each modifying statement, which in this case is an INSERT.
Question:
I am trying to speed up my insertions by doing one transaction per several thousand records.
How can I overcome the automatic opening of transactions?
As @CL. said, you have to set the isolation level to None. Code example:
import sqlite3

s = sqlite3.connect("./data.db")
s.isolation_level = None
try:
    c = s.cursor()
    c.execute("begin")
    ...
    c.execute("commit")
except:
    c.execute("rollback")
The documentation says:
You can control which kind of BEGIN statements sqlite3 implicitly executes (or none at all) via the isolation_level parameter to the connect() call, or via the isolation_level property of connections.
If you want autocommit mode, then set isolation_level to None.
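Putting it together for the bulk-insert case in the question, a sketch (the items table and the generated records are placeholders) that opens the connection in autocommit mode and wraps several thousand inserts in one explicit transaction:
import sqlite3

conn = sqlite3.connect("./data.db", isolation_level=None)   # autocommit mode: no implicit BEGIN
c = conn.cursor()
c.execute("CREATE TABLE IF NOT EXISTS items (a INTEGER, b INTEGER)")

records = [(i, i * i) for i in range(10000)]
c.execute("BEGIN")
try:
    c.executemany("INSERT INTO items VALUES (?, ?)", records)
    c.execute("COMMIT")
except sqlite3.Error:
    c.execute("ROLLBACK")
    raise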
