Python multiple MySQL Inserts

I'm trying to do multiple inserts on a MySQL db like this:
p = 1
orglist = buildjson(buildorgs(p, p))
while (orglist is not None):
    for org in orglist:
        sid = org['sid']
        try:
            sql = "INSERT INTO `Orgs` (`sid`) VALUES (\"{0}\");".format(sid)
            cursor.execute(sql)
            print("Added {0}".format(org['title']))
        except Exception as bug:
            print(bug)
        conn.commit()
        conn.close()
    p += 1
    orglist = buildjson(buildorgs(p, p))
However, I keep getting a bunch of errors like 2055: Lost connection to MySQL server at 'localhost:3306', system error: 9 Bad file descriptor.
How can I correctly do multiple inserts at once so I don't have to commit after every single insert? Also, should I only call conn.close() after the while loop, or is it better to keep it where it is?

This may be related to this question and/or this question. A couple ideas from the answers to those questions which you might try:
Try closing the cursor before closing the connection (cursor.close() before conn.close(); I don't know if you should close the cursor before or after conn.commit(), so try both.)
If you're using the Oracle MySQL connector, try using PyMySQL instead; several people said that it fixed this problem for them.
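If the goal is to avoid committing after every single insert, one option is to send each page of orgs as a single batch with executemany() and commit once per batch, closing the connection only after the loop. A rough sketch, assuming PyMySQL and the buildjson()/buildorgs() helpers from the question (connection parameters are placeholders):

import pymysql

# Placeholder connection parameters; replace with your own.
conn = pymysql.connect(host='localhost', user='user', password='secret', db='mydb')
try:
    with conn.cursor() as cursor:
        p = 1
        orglist = buildjson(buildorgs(p, p))
        while orglist is not None:
            # Parameterized query; executemany sends the whole batch in one call.
            sql = "INSERT INTO `Orgs` (`sid`) VALUES (%s)"
            cursor.executemany(sql, [(org['sid'],) for org in orglist])
            conn.commit()  # one commit per batch instead of one per row
            p += 1
            orglist = buildjson(buildorgs(p, p))
finally:
    conn.close()  # close the connection once, after all batches are done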

Related

Inserting JPEG-filenames into PostgreSQL table using Psycopg2 causes "not all arguments converted during string formatting" error

I'm trying to fill a PostgreSQL table (psycopg2, Python) with the filenames I have in a specific folder. I have created a function that should do the trick, but I get the error:
not all arguments converted during string formatting,
when I run my function. I did a test run and called the function in the following way:
insert_file_names_into_database(["filename1_without_extension", "filename2_without_extension"]),
and I had no problems and the INSERT worked fine. If I did the following:
insert_file_names_into_database(["filename1.extension", "filename2.extension"]),
then I get the error above. So the problem seems to be the "." character (e.g. image.jpg), which causes the SQL INSERT to fail. I tried to consult the psycopg2 docs about this, but I found no examples relating to this specific case.
How should I edit the code so it works even with "." characters in the filenames?
import psycopg2

def insert_file_names_into_database(file_name_list):
    """ insert multiple filenames into the table """
    sql = "INSERT INTO mytable(filename) VALUES(%s)"
    conn = None
    try:
        # read database configuration
        # connect to the PostgreSQL database
        conn = psycopg2.connect(
            host="localhost",
            database="mydatabase",
            user="myusername",
            password="mypassword")
        # create a new cursor
        cur = conn.cursor()
        # execute the INSERT statement
        cur.executemany(sql, file_name_list)
        # commit the changes to the database
        conn.commit()
        # close communication with the database
        cur.close()
    except (Exception, psycopg2.DatabaseError) as error:
        print(error)
    finally:
        if conn is not None:
            conn.close()
Solved it myself already. I knew I should be using tuples when working with the INSERT, but my function worked fine with a list of strings as long as the filenames had no "." characters.
The solution I got working was to convert the list of strings into a list of tuples like so:
tuple_file_name = [tuple((file_name,)) for file_name in file_name_list]
So for example if:
file_name_list = ["filename1.jpg", "filename2.jpg"]
Then giving this as input to my function fails. But by making it a list of tuples:
tuple_file_name = [tuple((file_name,)) for file_name in file_name_list]
print(tuple_file_name)
[('filename1.jpg',), ('filename2.jpg',)]
Now the function accepts tuple_file_name as input and the filenames are saved into the SQL table.
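For completeness, a minimal sketch of the whole call with the tuple conversion applied (connection parameters are placeholders, as in the question):

import psycopg2

def insert_file_names_into_database(file_name_list):
    """Insert multiple filenames; executemany expects one parameter tuple per row."""
    sql = "INSERT INTO mytable(filename) VALUES (%s)"
    conn = psycopg2.connect(host="localhost", database="mydatabase",
                            user="myusername", password="mypassword")
    try:
        with conn, conn.cursor() as cur:
            # e.g. ["filename1.jpg", "filename2.jpg"] -> [('filename1.jpg',), ('filename2.jpg',)]
            cur.executemany(sql, [(name,) for name in file_name_list])
    finally:
        conn.close()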

Ending a SELECT transaction psycopg2 and postgres

I am executing a number of SELECT queries on a Postgres database using psycopg2, but I am getting ERROR: Out of shared memory. It suggests that I should increase max_locks_per_transaction, but this confuses me because each SELECT query operates on only one table, and max_locks_per_transaction is already set to 512, 8 times the default.
I am using TimescaleDB, which could result in a larger than normal number of locks (one for each chunk rather than one for each table, maybe), but this still can't explain running out when so many are allowed. I'm assuming what is happening here is that all the queries are being run as part of one transaction.
The code I am using looks something as follows.
db = DatabaseConnector(**connection_params)
tables = db.get_table_list()
for table in tables:
    result = db.query(f"""SELECT a, b, COUNT(c) FROM {table} GROUP BY a, b""")
    print(result)
Where db.query is defined as:
def query(self, sql):
    with self._connection.cursor() as cur:
        cur.execute(sql)
        return_value = cur.fetchall()
    return return_value
and self._connection is:
self._connection = psycopg2.connect(**connection_params)
Do I need to explicitly end the transaction in some way to free up the locks? And how can I go about doing this in psycopg2? I would have assumed that the transaction ended implicitly when the cursor is closed on __exit__. I know that if I were inserting or deleting rows I would use COMMIT at the end, but it seems strange to use COMMIT when I am not changing the table.
UPDATE: When I explicitly open and close the connection in the loop, the error does not show. However, I assume there is a better way to end the transaction after each SELECT than this.
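A hedged sketch of one such way, based on psycopg2's documented behavior that a transaction stays open until commit() or rollback() is called (or autocommit is enabled), applied to the query method from the question:

def query(self, sql):
    with self._connection.cursor() as cur:
        cur.execute(sql)
        return_value = cur.fetchall()
    # Closing the cursor does not end the transaction, so the locks taken by
    # the SELECT are still held; explicitly finish the transaction here.
    self._connection.rollback()
    return return_value

# Alternatively, avoid opening a transaction per statement altogether:
# self._connection.autocommit = True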

Python psycopg2 cursor.fetchall() returns empty list but cursor.rowcount is > 1

I am getting an issue here:
conn = psycopg2.connect(conn_string)
cursor = conn.cursor()
sql = """
SELECT DISTINCT (tenor_years)
FROM bond_pnl
WHERE country = '%s'
""" % country
cursor.execute(sql)
print(cursor.fetchall())
print(cursor.rowcount)
It gives the following output:
[]
11
which means that cursor.rowcount is 11 but cursor.fetchall() returns an empty list. I have already tried doing this:
conn.set_session(readonly=True, autocommit=True)
and this solution as well: Click to see.
Any help regarding this will be appreciated.
EDIT: Just came across another thing: this code works fine when executed for the first time, but executing it again (second, third, ..., nth time) gives the behavior above.
I also faced the same issue. I figured out that, while debugging, we might be allowing some time to pass after the connection has been made:
#conn = psycopg2.connect(conn_string)
#cursor = conn.cursor()
By the time we hit the execution button for the next line (which contains the query), the connection has timed out and the query returns an empty list.
If anyone has another explanation for why this happens, please do share.
After trying different solutions, I figured out that the problem described in the question arises when I execute the code in "Debug Mode" in PyCharm. If, on the other hand, I execute the code in "Run Mode" in PyCharm, it returns the expected output (a list with 11 elements):
[a,b,c,d,e,f,g,h,i,j,k]
I'm not sure about the exact reason, but somehow the cursor was breaking somewhere when run in "Debug Mode".
If anyone can describe the exact reason, it will be highly appreciated.
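One way to test the timing hypothesis above is to connect, execute, and fetch in a single uninterrupted block, so a debugger pause cannot sit between connect and execute. A hedged sketch, reusing conn_string and country from the question and switching to psycopg2's parameter passing instead of % string formatting:

import psycopg2

# Connect, execute, and fetch back-to-back; let psycopg2 quote the value.
conn = psycopg2.connect(conn_string)
cursor = conn.cursor()
cursor.execute(
    "SELECT DISTINCT tenor_years FROM bond_pnl WHERE country = %s",
    (country,),
)
print(cursor.fetchall())
print(cursor.rowcount)
cursor.close()
conn.close()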

PyMySQL throws 'BrokenPipeError' after making frequent reads

I have written a script to help me work with a database. Specifically, I am trying to work with files on disk and add the result of this work to my database. I have copied the code below, but removed most of the logic which isn't related to my database to try to keep this question broad and helpful.
I used the code to operate on the files and add the result to the database, overwriting any files with the same identifier as the one I was working on. Later, I modified the script to ignore documents which have already been added to the database, and now whenever I run it I get an error:
pymysql.err.OperationalError: (2006, "MySQL server has gone away (BrokenPipeError(32, 'Broken pipe'))")
It seems like the server is rejecting the requests, possibly because I have written my code poorly? I have noticed that the error always occurs at the same place in the list of files, which doesn't change. If I re-run the code, replacing the file list with a list containing only the file on which the program crashes, it works fine. This makes me think that after making a certain number of requests, the database just bottoms out.
I'm using Python 3 and MySQL Community Edition Version 14.14 on OS X.
Code (stripped of stuff that doesn't have to do with the database):
import pymysql

# Stars for user-specific stuff
connection = pymysql.connect(host='localhost',
                             user='root',
                             password='*******',
                             db='*******',
                             use_unicode=True,
                             charset="utf8mb4",
                             )
cursor = connection.cursor()

f_arr = # An array of all of my data objects

def convertF(file_):
    # General layout: Try to work with the input and add the result to the DB. The work can raise an exception
    # If the record already exists in the DB, ignore it
    # Elif the work was already done and the result is on disk, put it in the database
    # Else do the work and put it in the database - this can raise exceptions
    # Except: Try another way to do the work, and put the result in the database. This can raise an error
    # Second (nested) except: Add the record to the database with an indicator that the work failed
    # This worked before I added the initial check on whether or not the record already exists in the database. Now, for some reason, I get the error:
    # pymysql.err.OperationalError: (2006, "MySQL server has gone away (BrokenPipeError(32, 'Broken pipe'))")
    # I'm pretty sure that I have written code that works poorly with the database. I had hoped to finish this task quickly rather than efficiently.
    try:
        # Find the record in the DB; if text exists, just ignore the record
        rc = cursor.execute("SELECT LENGTH(text) FROM table WHERE name = '{0}'".format(file_["name"]))
        length = cursor.fetchall()[0][0]  # Gets the length
        if length != None and length > 4:
            pass
        elif ( "work already finished on disk" ):
            # get "result_text" from disk
            cmd = "UPDATE table SET text = %s, hascontent = 1 WHERE name = %s"
            cursor.execute(cmd, ( pymysql.escape_string(result_text), file_["name"] ))
            connection.commit()
        else:
            # do work to get result_text
            cmd = "UPDATE table SET text = %s, hascontent = 1 WHERE name = %s"
            cursor.execute(cmd, ( pymysql.escape_string(result_text), file_["name"] ))
            connection.commit()
    except:
        try:
            # Alternate method of work to get result_text
            cmd = "UPDATE table SET text = %s, hascontent = 1 WHERE name = %s"
            cursor.execute(cmd, ( pymysql.escape_string(result_text), file_["name"] ))
            connection.commit()
        except:
            # Since the job can't be done, tell the database
            cmd = "UPDATE table SET text = %s, hascontent = 0 WHERE name = %s"
            cursor.execute(cmd, ( "NO CONTENT", file_["name"]) )
            connection.commit()

for file in f_arr:
    convertF(file)
MySQL Server Has Gone Away
This problem is described extensively at http://dev.mysql.com/doc/refman/5.7/en/gone-away.html. The usual cause is that the server has disconnected for whatever reason, and the usual remedy is to retry the query, or to reconnect and retry.
But the reason this breaks your code is the way you have written it. See below.
Possibly because I have written my code poorly?
Since you asked.
rc = cursor.execute("SELECT LENGTH(text) FROM table WHERE name = '{0}'".format(file_["name"]))
This is a bad habit. The manual explicitly warns you against doing this to avoid SQL injection. The correct way is
rc = cursor.execute("SELECT LENGTH(text) FROM table WHERE name = %s", (file_["name"],))
The second problem with the above code is that you don't need to check whether a value exists before you try to update it. You can delete the above line and its associated if/else and jump straight to the update. Besides, your elif and else seem to do exactly the same thing. So your code can just be
try:
    cmd = "UPDATE table SET text = %s, hascontent = 1 WHERE name = %s"
    cursor.execute(cmd, ( pymysql.escape_string(result_text), file_["name"] ))
    connection.commit()
except:  # <-- next problem.
And we come to the next problem. Never ever catch generic exceptions like this. You should always catch specific exceptions like TypeError, AttributeError, etc. When catching generic exceptions is unavoidable, you should at least log them.
For example, here you could catch connection errors and attempt to reconnect to the database. Then the code would not stop executing when your "server has gone away" problem happens.
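As an illustration of that idea (a hypothetical helper, not the answer's exact code), the connection-error handling with PyMySQL could look something like this:

import pymysql

def execute_with_retry(connection, cmd, params, retries=1):
    # Hypothetical helper: reconnect and retry if the server has gone away.
    for attempt in range(retries + 1):
        try:
            with connection.cursor() as cur:
                cur.execute(cmd, params)
            connection.commit()
            return
        except pymysql.err.OperationalError as exc:
            print("MySQL connection error, reconnecting:", exc)
            connection.ping(reconnect=True)  # re-establish the connection
    raise RuntimeError("query failed after {0} retries".format(retries))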
I solved the same error, in the case of bulk inserts, by reducing the number of rows I tried to insert in a single command.
Even though the allowed maximum number of rows for a bulk insert was much higher, I still got this kind of error.
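A minimal sketch of that chunking idea (the table, column, and chunk size are placeholders):

def insert_in_chunks(connection, rows, chunk_size=500):
    # Hypothetical helper: split one huge bulk insert into smaller batches.
    # rows is a list of parameter tuples, e.g. [('a',), ('b',)].
    sql = "INSERT INTO mytable (col) VALUES (%s)"
    with connection.cursor() as cur:
        for start in range(0, len(rows), chunk_size):
            cur.executemany(sql, rows[start:start + chunk_size])
            connection.commit()  # commit each chunk separately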

How to fix "can't adapt error" when saving binary data using python psycopg2

I ran across this bug three times today in one of our projects. Putting the problem and solution online for future reference.
import struct
import psycopg2

con = psycopg2.connect(...)

def save(long_blob):
    cur = con.cursor()
    long_data = struct.unpack('<L', long_blob)
    cur.execute('insert into blob_records( blob_data ) values (%s)', [long_data])
This will fail with the error "can't adapt" from psycopg2.
The problem is that struct.unpack returns a tuple, even if there is only one value to unpack. You need to make sure you grab the first item from the tuple, even if there is only one item. Otherwise psycopg2's SQL argument parsing will fail trying to convert the tuple to a string, giving the "can't adapt" error message.
import struct
import psycopg2

con = psycopg2.connect(...)

def save(long_blob):
    cur = con.cursor()
    long_data = struct.unpack('<L', long_blob)
    # grab the first result of the tuple
    long_data = long_data[0]
    cur.execute('insert into blob_records( blob_data ) values (%s)', [long_data])
"Can't adapt" is raised when psycopg doesn't know the type of your long_blob variable. What type is it?
You can easily register an adapter to tell psycopg how to convert the value for the database.
Because it is a numerical value, chances are that the AsIs adapter would already work for you.
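For illustration, a minimal sketch of registering an adapter with AsIs for a custom wrapper type (the MyNumber class is hypothetical, just to show the mechanism):

import psycopg2
from psycopg2.extensions import register_adapter, AsIs

class MyNumber:
    # Hypothetical type that psycopg2 does not know how to adapt.
    def __init__(self, value):
        self.value = value

def adapt_my_number(obj):
    # Splice the numeric value into the SQL as-is.
    return AsIs(obj.value)

register_adapter(MyNumber, adapt_my_number)

# After registration, MyNumber instances can be passed as query parameters:
# cur.execute('insert into blob_records (blob_data) values (%s)', [MyNumber(42)])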
