Problem with concurrent reading and writing (distinct processes) using MySQL and Python

I have two concurrent processes:
1.) Writer - inserts new rows into a MySQL database on a regular basis (10-20 rows/sec)
2.) Reader - reads from the same table being inserted into
I notice that the Reader process only seems to see a snapshot of the database at about the time of its startup. Inserts occurring before this startup are found, but inserts occurring after are not. If I shut the Reader process down and restart it (but leave the Writer running), it will sometimes (but not always) see more data, but again seems to get a point-in-time view of the database.
I'm running a commit after each insert (code snippet below). I investigated whether this was a function of change buffering/pooling, but doing a "set @@global.innodb_change_buffering=none;" had no effect. Also, if I go in through MySQL Workbench, I can query the most current data being inserted by the Writer. So this seems to be a function of how the Python/MySQL connection is getting set up.
My environment is:
Windows 7
MySQL 5.5.9
Python 2.6.6 -- EPD 6.3-1 (32-bit)
MySQL python connector
The insert code is:
def insert(dbConnection, statement):
    cursor = dbConnection.cursor()
    cursor.execute(statement)
    warnings = cursor.fetchwarnings()
    if warnings:
        print warnings
        rowid = []
    else:
        rowid = cursor.lastrowid
    cursor.close()
    dbConnection.commit()
    return rowid
The reader code is:
def select(dbConnection, statement):
    cursor = dbConnection.cursor()
    cursor.execute(statement)
    warnings = cursor.fetchwarnings()
    if warnings:
        print warnings
        values = []
    else:
        values = np.asarray(cursor.fetchall())
    cursor.close()
    return values

What does the read side look like?
I bet this is a problem with the isolation level on the read side. Most likely your read connection is sitting in an implicit transaction, and the default InnoDB isolation level is REPEATABLE READ, so the reader keeps seeing the snapshot taken when that transaction started.
Try issuing:
cursor.execute("SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED")
on the read side.
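For reference, a minimal sketch of a reader connection with the isolation level relaxed; it assumes MySQL Connector/Python, and the helper name and connection parameters are illustrative:

import mysql.connector

def open_reader_connection(host, user, password, database):
    # hypothetical helper: each new SELECT on this connection will see
    # rows committed by the Writer, instead of a repeatable-read snapshot
    conn = mysql.connector.connect(host=host, user=user,
                                   password=password, database=database)
    cursor = conn.cursor()
    cursor.execute("SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED")
    cursor.close()
    return conn

Alternatively, calling dbConnection.commit() after each select in the reader ends its open transaction, so the next select starts from a fresh snapshot even under REPEATABLE READ.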

Related

flask sqlalchemy update sql raw command does not provide any response

I'm trying to perform an update using flask-sqlalchemy, but when it gets to the update script it does not return anything. It seems the script is hanging, or it is not doing anything.
I tried to wrap a try/except around the code that does not complete, but there are no errors.
I gave it 10 minutes to complete the update statement, which only updates 1 record, and still it will not do anything for some reason.
When I cancel the script, it reports Communication link failure (0) (SQLEndTran), but I don't think this is the root cause, because in the same script I have other SQL statements that work fine, so the connection to the database is good.
What my script does is get a list of filenames that I need to process (I have no issues with this). Then, using the retrieved list of filenames, I look in the directory to check whether each file exists. If it does not exist, I update the database to tag the file as not found. This is where I get the issue: it does not perform the update, nor does it provide an error message of any sort.
I even tried to create a new engine just for the update script, but I still get the same behavior.
I also tried printing out the SQL statement in Python before executing it. I ran the printed SQL command in my SQL browser and it worked fine.
The code is very simple, so I'm not really sure why it's having the issue.
#!/usr/bin/env python3
from flask_sqlalchemy import sqlalchemy
import glob

files_directory = "/files_dir/"

sql_string = """
select *
from table
where status is null
"""

# omitted conn_string
engine1 = sqlalchemy.create_engine(conn_string)

result = engine1.execute(sql_string)
for r in result:
    engine2 = sqlalchemy.create_engine(conn_string)
    filename = r[11]
    match = glob.glob(f"{files_directory}/**/{filename}.wav")
    if not match:
        print('no match')
        script = "update table set status = 'not_found' where filename = '" + filename + "' "
        engine2.execute(script)
        engine2.dispose()
        continue
engine1.dispose()
It appears that if I try to loop through 26k records in one go, the script doesn't work, but when I process batches of 2k records per run, the script works. So my SQL string becomes (with top 2000 added to the query):
sql_string = """
select top 2000 *
from table
where status is null
"""
It's manual, yeah, but it works for me since I just need to run this script once (well, 13 times).
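For what it's worth, here is a minimal sketch of a variant that avoids holding the full result set open while updating: it fetches the candidate rows first, then runs parameterized updates on a single engine. It uses plain SQLAlchemy rather than the flask_sqlalchemy re-export, and the column index and table/column names are carried over from the question as assumptions:

import glob
from sqlalchemy import create_engine, text

files_directory = "/files_dir/"
engine = create_engine(conn_string)  # conn_string omitted, as in the question

# fetch all candidate rows up front so no open result set is pending
with engine.connect() as conn:
    rows = conn.execute(text("select * from table where status is null")).fetchall()

for r in rows:
    filename = r[11]  # column index assumed from the question
    if not glob.glob(f"{files_directory}/**/{filename}.wav"):
        # begin() commits the update when the block exits
        with engine.begin() as conn:
            conn.execute(
                text("update table set status = 'not_found' where filename = :fn"),
                {"fn": filename},
            )

engine.dispose()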

inserting data to postgres using python

Below is a sample of the code I am using to push data from one Postgres server to another Postgres server. I am trying to move 28 million records. This worked perfectly going from SQL Server to Postgres, but now that it's Postgres to Postgres, it hangs on this line:
sourcecursor.execute('select * from "schema"."reallylargetable"; ')
It never reaches any of the other statements to get to the iterator.
I get this message at the select statement:
psycopg2.DatabaseError: out of memory for query result
# cursors for aiods and ili
sourcecursor = sourceconn.cursor()
destcursor = destconn.cursor()

# name of temp csv file
filenme = 'filename.csv'

# definition that uses fetchmany to iterate through data in batches;
# the default batch size is 1000
def ResultIterator(cursor, arraysize=1000):
    'iterator using fetchmany and consumes less memory'
    while True:
        results = cursor.fetchmany(arraysize)
        if not results:
            break
        for result in results:
            yield result

# set data for the cursor
print("start get data")
# it is not going past the line below; it errors with "out of memory for query result"
sourcecursor.execute('select * from "schema"."reallylargetable"; ')

print("iterator")
dataresults = ResultIterator(sourcecursor)

# ***** do something with dataresults *****
Please change this line:
sourcecursor = sourceconn.cursor()
to name your cursor (use whatever name pleases you):
sourcecursor = sourceconn.cursor('mysourcecursor')
This directs psycopg2 to open a PostgreSQL server-side (named) cursor for your query. Without a named cursor on the server side, psycopg2 fetches all rows into client memory when the query executes, which is what exhausts memory here.
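For context, a minimal sketch of the full copy loop once the cursor is named; the batch size, the insert target, and the use of execute_values are illustrative assumptions, not part of the original code:

import psycopg2
from psycopg2.extras import execute_values

# server-side (named) cursor: rows are streamed from the server in batches
sourcecursor = sourceconn.cursor('mysourcecursor')
sourcecursor.itersize = 10000  # rows fetched per network round trip

destcursor = destconn.cursor()

sourcecursor.execute('select * from "schema"."reallylargetable";')

batch = []
for row in sourcecursor:
    batch.append(row)
    if len(batch) >= 10000:
        execute_values(destcursor,
                       'insert into "schema"."reallylargetable" values %s',
                       batch)
        destconn.commit()
        batch = []
if batch:
    execute_values(destcursor,
                   'insert into "schema"."reallylargetable" values %s',
                   batch)
    destconn.commit()

sourcecursor.close()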

Connect / Store sqlite3 database in specified directory other than default -- "conn = sqlite3.connect('name_of_database')"

I'm working with an SQLite3 database, trying to access its data from a different directory than the one in which it was originally created.
The Python script (test case) that I run through our Squish GUI automation IDE is located in the directory
C:\Squish_Automation_Functional_nVP2\suite_Production_Checklist\tst_dashboard_functional_setup
Inside that same script I create a database with the following table:
def get_var_value_pass():
    conn = sqlite3.connect('test_result.db')
    c = conn.cursor()
    c.execute('CREATE TABLE IF NOT EXISTS test_result_pass (pass TEXT, result INTEGER)')
    c.execute("INSERT INTO test_result_pass VALUES ('Pass', 0)")
    conn.commit()
    c.execute("SELECT * FROM test_result_pass WHERE pass='Pass'")
    pass_result = c.fetchone()
    test.log(str(pass_result[1]))
    test_result = pass_result[1]
    c.close()
    conn.close()
    return test_result
Now I'd like to access the same database, which I created earlier via "conn = sqlite3.connect('test_result.db')", inside another test case located in a different directory:
C:\Squish_Automation_Functional_nVP2\suite_Production_Checklist\tst_Calling_In-Call-Options_Text_Share-Text_Update-Real-Time
The problem is that when I try a select statement against the same database inside a different script (test case), like so:
def get_var_value_pass(pass_value):
    conn = sqlite3.connect('test_result.db')
    c = conn.cursor()
    c.execute("SELECT * FROM test_result_pass WHERE pass='Pass'")
my test fails as soon as it reaches the c.execute() statement because the table can't be found. Instead, the most recent "conn = sqlite3.connect('test_result.db')" has just created a new, empty database rather than referring to my original one. Therefore, I've concluded that I want to store my original database where both test cases can use it as a test-suite resource, basically in another directory that the other test cases can reference. Ideally here:
C:\Squish_Automation_Functional_nVP2\suite_Production_Checklist
Is there an sqlite3 function that lets you specify where to connect to / store your database, similar to sqlite3.connect('test_result.db') but with a path of your choosing?
BTW, I have tried the second snippet of code inside the same test script and it runs perfectly. I have also tried an in-memory approach with sqlite3.connect(':memory:'), still no luck. Please help! Thanks.
It's not clear to me that you received an answer. You have a number of options. I happen to have an SQLite database stored in one directory which I can open, from that directory or any other, by specifying it in one of the following ways.
import sqlite3
conn = sqlite3.connect(r'C:\Television\programs.sqlite')
conn.close()
print('Hello')
conn = sqlite3.connect('C:\\Television\\programs.sqlite')
conn.close()
print('Hello')
conn = sqlite3.connect('C:/Television/programs.sqlite')
conn.close()
print('Hello')
All three connection attempts succeed. I see three Hellos as output.
Stick an 'r' ahead of the string to make it a raw string literal; see the documentation on strings for the reason.
Replace each backslash with a pair of backslashes.
Replace each backslash with a forward slash.
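Applied to the question, a minimal sketch of pointing both test cases at one shared file; the suite directory path is taken from the question, and the helper name is illustrative:

import os
import sqlite3

# shared location referenced by both test cases
SUITE_DIR = r'C:\Squish_Automation_Functional_nVP2\suite_Production_Checklist'
DB_PATH = os.path.join(SUITE_DIR, 'test_result.db')

def open_results_db():
    # every test case opens the same file, regardless of its own directory
    return sqlite3.connect(DB_PATH)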

Python - printing the SQL statement record count in the log file

I am currently using a Python program to insert records, using the statement below. The issue is that I am trying to print the number of records inserted into the log file, but it only prints 0, even though I can see the inserted record count in the console while running the program. Can you help me print the record count in the log file?
I also know that redirecting the Python program's output to a file (>) would capture the record count, but I want to keep all the details in the same log file after the insert statement is done, since I am looping over different statements.
log="/fs/logfile.txt"
log_file = open(log,'w')
_op = os.system('psql ' + db_host_connection + ' -c "insert into emp select * from emp1;"')
print date , "printing" , _op
os.system returns the command's exit status (0 on success), not a row count, which is why you always see 0. You should probably switch to a "proper" Python module for PostgreSQL interactions.
I haven't used PostgreSQL from Python before, but one of the first search engine hits leads to:
http://initd.org/psycopg/docs/usage.html
You could then do something along the following lines:
import psycopg2
conn = psycopg2.connect("dbname=test user=postgres")
# create a cursor for interaction with the database
cursor = conn.cursor()
# execute your sql statement
cursor.execute("insert into emp select * from emp1")
# retrieve the number of selected rows
number_rows_inserted = cursor.rowcount
# commit the changes
conn.commit()
This should also make things significantly faster than using os.system calls, especially if you're planning to execute multiple statements.
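Tying this back to the question, a small sketch of writing that count into the existing log file; the log path comes from the question, while the date handling and append mode are assumptions:

import datetime
import psycopg2

log = "/fs/logfile.txt"

conn = psycopg2.connect("dbname=test user=postgres")  # connection details assumed
cursor = conn.cursor()
cursor.execute("insert into emp select * from emp1")
conn.commit()

# rowcount holds the number of rows affected by the last execute()
with open(log, 'a') as log_file:
    log_file.write("%s inserted %d rows\n" % (datetime.date.today(), cursor.rowcount))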

Python sqlite3 "unable to open database file" on windows

I am working on a Windows Vista machine with Python 3.1.1. I am trying to insert a large number of rows into an SQLite3 database. The file exists, and my program properly inserts some rows into the database. However, at some point in the insertion process, the program dies with this message:
sqlite3.OperationalError: unable to open database file
However, before it dies, there are several rows that are properly added to the database.
Here is the code which specifically handles the insertion:
idx = 0
lst_to_ins = []
for addl_img in all_jpegs:
    lst_to_ins.append((addl_img['col1'], addl_img['col2']))
    idx = idx + 1
    if idx % 10 == 0:
        logging.debug('adding rows [%s]', lst_to_ins)
        conn.executemany(ins_sql, lst_to_ins)
        conn.commit()
        lst_to_ins = []
        logging.debug('added 10 rows [%d]', idx)
if len(lst_to_ins) > 0:
    conn.executemany(ins_sql, lst_to_ins)
    conn.commit()
    logging.debug('adding the last few rows to the db')
This code inserts anywhere from 10 to 400 rows, then dies with the error message
conn.executemany(ins_sql, lst_to_ins)
sqlite3.OperationalError: unable to open database file
How is it possible that I can insert some rows, but then get this error?
SQLite does not have record locking; it uses a simple locking mechanism that locks the entire database file briefly during a write. It sounds like you are running into a lock that hasn't cleared yet.
The author of SQLite recommends that you create a transaction prior to doing your inserts, and then complete the transaction at the end. This causes SQLite to queue the insert requests, and perform them using a single file lock when the transaction is committed.
In the newest version of SQLite, the locking mechanism has been enhanced, so it might not require a full file lock anymore.
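Applied to the code above, a minimal sketch of batching everything into one explicit transaction; ins_sql, all_jpegs, and the column keys are carried over from the question, and the database path is an assumption:

import sqlite3

conn = sqlite3.connect('images.db')  # path assumed

# the connection context manager commits once when the block exits
# (and rolls back if an exception is raised inside it)
with conn:
    conn.executemany(
        ins_sql,
        ((img['col1'], img['col2']) for img in all_jpegs),
    )

conn.close()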
Same error here on Windows 7 (Python 2.6, Django 1.1.1 and SQLite) after some records were inserted correctly: sqlite3.OperationalError: unable to open database file
I ran my script from Eclipse several times and always got that error, but when I ran it from the command line (after setting PYTHONPATH and DJANGO_SETTINGS_MODULE) it worked like a charm.
Just my 2 cents!
