I'm trying to insert a number (the size of a set) into my MySQL database using Python's MySQL Connector. I am able to add data manually, but the following expression with %s won't work. I tried several variations on it, but nothing from the documentation seems to work in my case. The table was already built, as you can see:
# Create the table:
#cursor.execute('''CREATE TABLE anzahlids( tweetid INT )''')
Here is my code and the error:
print len(idset)
id_data = [
    len(idset)
]
print id_data
insert = ("""INSERT INTO anzahlids (idnummer) VALUES (%s)""")
cursor.executemany(insert, id_data)
db_connection.commit()
"Failed processing format-parameters; %s" % e)
mysql.connector.errors.ProgrammingError: Failed processing format-parameters; argument 2 to map() must support iteration
Late answer, but I would like to post some nicer code. Also, the original question was using MySQL Connector/Python.
The use of executemany() is wrong. The executemany() method expects a sequence of tuples, for example, [ (1,), (2,) ].
For the problem at hand, executemany() is actually not useful and execute() should be used:
cur.execute("DROP TABLE IF EXISTS anzahlids")
cur.execute("CREATE TABLE anzahlids (tweetid INT)")
some_ids = [ 1, 2, 3, 4, 5]
cur.execute("INSERT INTO anzahlids (tweetid) VALUES (%s)",
(len(some_ids),))
cnx.commit()
And with MySQL Connector/Python (unlike with MySQLdb), you have to make sure you are committing.
(Note for non-German speakers: 'anzahlids' means 'number_of_ids')
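For completeness: if the goal were to insert every id from idset rather than just the count, executemany() would be the right tool. A minimal sketch, assuming the same idset, cur and cnx as above:

rows = [(tweet_id,) for tweet_id in idset]  # a sequence of one-element tuples
cur.executemany("INSERT INTO anzahlids (tweetid) VALUES (%s)", rows)
cnx.commit()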
The following is an example that worked on my machine.
import MySQLdb

db = MySQLdb.connect(host="localhost", user="stackoverflow", passwd="", db="stackoverflow")
cursor = db.cursor()

# Create the table if it does not exist yet
cursor.execute('CREATE TABLE IF NOT EXISTS anzahlids (tweetid INT)')

sql = "INSERT INTO anzahlids (tweetid) VALUES (%s)"
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
# executemany() expects a sequence of parameter tuples, so wrap the single value
length = [(len(data),)]
cursor.executemany(sql, length)
db.commit()
If you only need to store a single value such as len(idset), pass it as a one-element parameter tuple instead of formatting it into the SQL string:

sql = "INSERT INTO anzahlids (tweetid) VALUES (%s)"
cursor.execute(sql, (len(idset),))
db.commit()
Related
I am trying to learn how to save a dataframe created in pandas into a postgresql db (hosted on Azure). I planned to start with simple dummy data:
data = {'a': ['x', 'y'],
        'b': ['z', 'p'],
        'c': [3, 5]
        }
df = pd.DataFrame(data, columns=['a', 'b', 'c'])
I found a function that pushes df data into a psql table. It starts by defining the connection:
def connect(params_dic):
    """ Connect to the PostgreSQL database server """
    conn = None
    try:
        # connect to the PostgreSQL server
        print('Connecting to the PostgreSQL database...')
        conn = psycopg2.connect(**params_dic)
    except (Exception, psycopg2.DatabaseError) as error:
        print(error)
        sys.exit(1)
    print("Connection successful")
    return conn
conn = connect(param_dic)
*param_dic contains all connection details (user/pass/host/db)
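For reference, param_dic here is assumed to be a plain dict of psycopg2 connection keywords, something like the following (placeholder values):

param_dic = {
    "host": "mydemoserver.postgres.database.azure.com",
    "dbname": "mydatabase",
    "user": "myuser",
    "password": "mypassword"
}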
Once the connection is established, I define the execute function:
def execute_many(conn, df, table):
    """
    Using cursor.executemany() to insert the dataframe
    """
    # Create a list of tuples from the dataframe values
    tuples = [tuple(x) for x in df.to_numpy()]
    # Comma-separated dataframe columns
    cols = ','.join(list(df.columns))
    # SQL query to execute
    query = "INSERT INTO %s(%s) VALUES(%%s,%%s,%%s)" % (table, cols)
    cursor = conn.cursor()
    try:
        cursor.executemany(query, tuples)
        conn.commit()
    except (Exception, psycopg2.DatabaseError) as error:
        print("Error: %s" % error)
        conn.rollback()
        cursor.close()
        return 1
    print("execute_many() done")
    cursor.close()
I ran this function against a psql table that I created in the DB:
execute_many(conn,df,"raw_data.test")
The table raw_data.test consists of columns a(char[]), b(char[]), c(numeric).
When I run the code I get following information in the console:
Connecting to the PostgreSQL database...
Connection successful
Error: malformed array literal: "x"
LINE 1: INSERT INTO raw_data.test(a,b,c) VALUES('x','z',3)
^
DETAIL: Array value must start with "{" or dimension information.
I don't know how to interpret this, because none of the columns in df is an array:
df.dtypes
Out[185]:
a object
b object
c int64
dtype: object
Any ideas what goes wrong there, or suggestions on how to save the df to PostgreSQL in a simpler manner? I found quite a lot of solutions that use sqlalchemy, creating the connection string in the following way:
conn_string = 'postgres://user:password@host/database'
But I am not sure whether that works with a cloud db; if I edit such a connection string with the Azure host details, it does not work.
The usual data type for strings in PostgreSQL is TEXT or VARCHAR(n) or CHAR(n), with round brackets; not CHAR[] with square brackets.
I'm guessing that you want the column to contain a string and that CHAR[] was a typo; in that case, you'll need to recreate (or migrate) the table column to the correct type - most likely TEXT.
(You might use CHAR(n) for fixed-length data, if it's genuinely fixed-length; VARCHAR(n) is mostly of historical interest. In most cases, use TEXT.)
Alternately, if you do mean to make the column an array, you'll need to pass a list in that position from Python.
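A minimal sketch of recreating the table with text columns (the table and column names are taken from the question; dropping and recreating is shown only for illustration, you could also ALTER the column types):

cur = conn.cursor()
cur.execute("DROP TABLE IF EXISTS raw_data.test")
cur.execute("CREATE TABLE raw_data.test (a TEXT, b TEXT, c NUMERIC)")
conn.commit()
cur.close()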
Consider adjusting your parameterization approach, since psycopg2 provides a better way to format identifiers such as table or column names in SQL statements.
In fact, the docs indicate your current approach is not optimal and poses a security risk:
# This works, but it is not optimal
query = "INSERT INTO %s(%s) VALUES(%%s,%%s,%%s)" % (table, cols)
Instead, use the psycopg2.sql module:
from psycopg2 import sql
...
query = (
    sql.SQL("insert into {} values (%s, %s, %s)")
    .format(sql.Identifier('table'))
)
...
cur.executemany(query, tuples)
Also, as a best practice in SQL, always include column names in INSERT queries and do not rely on the column order of the stored table:
query = (
    sql.SQL("insert into {0} ({1}, {2}, {3}) values (%s, %s, %s)")
    .format(
        sql.Identifier('table'),
        sql.Identifier('col1'),
        sql.Identifier('col2'),
        sql.Identifier('col3')
    )
)
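Applied to the question's execute_many(), that might look roughly like this (a sketch; the schema-qualified table raw_data.test and columns a, b, c come from the question, and passing two arguments to sql.Identifier requires psycopg2 2.8+):

from psycopg2 import sql

query = sql.SQL("INSERT INTO {} ({}, {}, {}) VALUES (%s, %s, %s)").format(
    sql.Identifier("raw_data", "test"),  # schema-qualified table name
    sql.Identifier("a"),
    sql.Identifier("b"),
    sql.Identifier("c"),
)
cursor.executemany(query, tuples)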
Finally, consider discontinuing the use of % for string formatting across all your Python code (not just psycopg2). As of Python 3 it has been de-emphasized, though not yet deprecated. Instead, use str.format (Python 2.6+) or f-strings (Python 3.6+).
I have a list of lists that I'm trying to query using the (x,y) IN clause in MySQL via Python. The list is like the below:
list = [(item, item), (item, item) ...]
and query
cursor.execute("select * from table where (x,y) in {}", tuple(list))
This is giving me TypeError: not all arguments converted during string formatting. What is the correct way of passing the list of lists as a parameter to the select query?
Here are a few ways you could do this.
If you need to execute from the cursor directly, then this approach works. You need to manually create the placeholders to match the length of items, which is not ideal. I found this worked when the engine connected using pymysql or MySQLdb, but not mysql.connector.
items = [(1, 2), (12, 10)]

dbapi_conn = engine.raw_connection()
cursor = dbapi_conn.cursor()
cursor.execute("SELECT * FROM username WHERE (user_id, batch_id) IN (%s, %s)",
               items)
res = cursor.fetchall()
for row in res:
    print(row)
print()
dbapi_conn.close()
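If the number of pairs varies, the placeholder groups can also be built dynamically; a sketch (my own variation, not part of the original answer), reusing the same items and cursor as above:

placeholders = ", ".join(["(%s, %s)"] * len(items))
params = [value for pair in items for value in pair]  # flatten the pairs
cursor.execute(
    "SELECT * FROM username WHERE (user_id, batch_id) IN ({})".format(placeholders),
    params,
)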
If a raw connection method is not a requirement, this is how you might execute a raw SQL query in SQLAlchemy 1.4+. Here we can expand the bind parameters to handle a variable number of values. This approach also does not work with mysql.connector.
# Assumes `import sqlalchemy as sa` and an engine created with sa.create_engine()
with engine.connect() as conn:
    query = sa.text("""SELECT * FROM username WHERE (user_id, batch_id) IN :values""")
    query = query.bindparams(sa.bindparam('values', expanding=True))
    res = conn.execute(query, {'values': items})
    for row in res:
        print(row)
    print()
Finally, this approach is pure SQLAlchemy, using the tuple_() construct. It does not require any special handling for values placeholders, but the tuple_ must be configured. This method is the most portable: it worked with all three connectors that I tried.
metadata = sa.MetaData()
username = sa.Table('username', metadata, autoload_with=engine)

tup = sa.tuple_(sa.column('user_id', sa.Integer),
                sa.column('batch_id', sa.Integer))
stmt = sa.select(username).where(tup.in_(items))

with engine.connect() as conn:
    res = conn.execute(stmt)
    for row in res:
        print(row)
    print()
All of these methods delegate escaping of values to the DBAPI connector to mitigate SQL injections.
I have a .sql file in which there are many insert statements for the same table:
insert into tbname
values (xxx1);
insert into tbname values (xxx2);
insert into
tbname values (xxx3);
...
How do I convert this file into a new file containing one sql statement like:
insert into tbname values (xxx1),(xxx2),(xxx3)...;
Because the insert statements in the sql file are formatted inconsistently, as shown above, it is hard to use regular expressions in Python.
If you want to insert multiple rows into the table, use the executemany() method.
import mysql.connector

mydb = mysql.connector.connect(
    host="localhost",
    user="yourusername",
    passwd="yourpassword",
    database="mydatabase"
)
mycursor = mydb.cursor()

sql = "INSERT INTO tbname VALUES (%s)"
# executemany() expects a sequence of tuples; note the trailing commas
val = [
    ('xx1',),
    ('xx2',),
    ('xx3',)
]
mycursor.executemany(sql, val)
mydb.commit()
For more info, follow this.
Read your file into a string and use replace to join all the values for the table back into a single statement. Something like:
s = "insert into tbname values (val1, val2);insert into tbname values (val3, val4);insert into tbname values (val5, val6);"
values = s.replace(";insert into tbname values", ', ')
It's an unorthodox method, but it could work in your case to get everything into one insert.
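If you want to handle the inconsistent formatting programmatically, a minimal Python sketch could split on semicolons and normalize whitespace before matching. The file names here are placeholders, and it assumes no semicolons appear inside the value lists:

import re

with open("inserts.sql") as f:
    content = f.read()

values = []
for stmt in content.split(";"):
    stmt = " ".join(stmt.split())  # collapse newlines and repeated spaces
    m = re.match(r"(?i)insert\s+into\s+tbname\s+values\s*(\(.*\))$", stmt)
    if m:
        values.append(m.group(1))

with open("combined.sql", "w") as f:
    f.write("INSERT INTO tbname VALUES " + ",".join(values) + ";")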
I'm trying to insert latitude & longitude that are stored as python variables into a table in PostgreSQL via the INSERT query. Any suggestions on how to cast Point other than what I've tried?
I tried the insert query first as shown -
This is the table:
cur.execute('''CREATE TABLE AccidentList (
                   accidentId SERIAL PRIMARY KEY,
                   cameraGeoLocation POINT,
                   accidentTimeStamp TIMESTAMPTZ);''')
Try 1:
cur.execute("INSERT INTO AccidentList(cameraGeoLocation,accidentTimeStamp)
VALUES {}".format((lat,lon),ts));
Error:
psycopg2.ProgrammingError: column "camerageolocation" is of type point but expression is of type numeric
LINE 1: ...ist (cameraGeoLocation,accidentTimeStamp) VALUES (13.0843, 8...
^
HINT: You will need to rewrite or cast the expression.
Try 2:
query = "INSERT INTO AccidentList (cameraGeoLocation,accidentTimeStamp)
VALUES(cameraGeoLocation::POINT, accidentTimeStamp::TIMESTAMPTZ);"
data = ((lat,lon),ts)
cur.execute(query,data)
Error:
LINE 1: ...List (cameraGeoLocation,accidentTimeStamp) VALUES(cameraGeoL...
^
HINT: There is a column named "camerageolocation" in table "accidentlist", but it cannot be referenced from this part of the query.
Try 3:
query = "INSERT INTO AccidentList (camerageolocation ,accidenttimestamp) VALUES(%s::POINT, %s);"
data = (POINT(lat,lon),ts)
cur.execute(query,data)
Error:
cur.execute(query,data)
psycopg2.ProgrammingError: cannot cast type record to point
LINE 1: ...tion ,accidenttimestamp) VALUES((13.0843, 80.2805)::POINT, '...
Single quote your third attempt.
This works: SELECT '(13.0843, 80.2805)'::POINT
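Applied to the insert from the question, that could look like this (a sketch; lat, lon, ts and cur are the variables from the question):

query = "INSERT INTO AccidentList (cameraGeoLocation, accidentTimeStamp) VALUES (%s::point, %s);"
# Pass the coordinates as a string so PostgreSQL can cast it to point
data = ("({}, {})".format(lat, lon), ts)
cur.execute(query, data)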
I had a similar problem trying to insert data of type point into Postgres.
Using quotes around the tuple (making it a string) worked for me.
conn = psycopg2.connect(...)
cursor = conn.cursor()
conn.autocommit = True
sql = 'insert into cities (name,location) values (%s,%s);'
values = ('City A','(10.,20.)')
cursor.execute(sql,values)
cursor.close()
conn.close()
My environment:
PostgreSQL 12.4,
Python 3.7.2,
psycopg2-binary 2.8.5
Hi I'm doing something like:
# pyodbc extension
cursor.execute("select a from tbl where b=? and c=?", x, y)
Some values in the query are provided by variables, but sometimes the variable is interpreted as @P1 in the query.
For example:
import pyodbc
ch = pyodbc.connect('DRIVER={SQL Server};SERVER=xxxx;DATABASE=xxx;Trusted_Connection=True')
cur = ch.cursor()
x = 123
cur.execute('''
CREATE TABLE table_? (
id int IDENTITY(1,1) PRIMARY KEY,
obj varchar(max) NOT NULL
)
''', x).commit()
This results in a new table named table_@P1 (I want table_123).
Another example:
x = 123
cur.execute('''
CREATE TABLE table_2 (
id int IDENTITY(1,1) PRIMARY KEY,
obj varchar(?) NOT NULL
)
''', x).commit()
it reports an error:
ProgrammingError: ('42000', "[42000] [Microsoft][ODBC SQL Server Driver][SQL Server]Incorrect syntax near '@P1'. (102) (SQLExecDirectW)")
Again, the variable is interpreted as @P1.
Anyone know how to fix this? Any help's appreciated. Thanks-
In your first case, parameter substitution does not work for table/column names. This is common to the vast majority of (if not all) database platforms.
In your second case, SQL Server does not appear to support parameter substitution for DDL statements. The SQL Server ODBC driver converts the pyodbc parameter placeholders (?) to T-SQL parameter placeholders (@P1, @P2, ...), so the statement passed to SQL Server is
CREATE TABLE table_2 (id int IDENTITY(1,1) PRIMARY KEY, obj varchar(@P1) NOT NULL)
specifically
exec sp_prepexec @p1 output,N'@P1 int',N'CREATE TABLE table_2 (id int IDENTITY(1,1) PRIMARY KEY, obj varchar(@P1) NOT NULL)',123
and when SQL Server tries to prepare that statement it expects a literal value, not a parameter placeholder.
So, in both cases you will need to use dynamic SQL (string formatting) to insert the appropriate values.
There is a way to do this sort of thing. What you need to do is dynamically build the command as a string variable (in T-SQL this would be nvarchar(MAX), not varchar(MAX)) and pass that variable to the cur.execute() - or any other - command. Modifying your first example accordingly:
ch = pyodbc.connect( 'DRIVER={SQL Server};SERVER=xxxx;DATABASE=xxx;Trusted_Connection=True' )
cur = ch.cursor()
x = 123
SQL_Commands = 'CREATE TABLE table_' + str( x ) + '''
(
    id int IDENTITY(1,1) PRIMARY KEY,
    obj varchar(max) NOT NULL
)
'''
cur.execute( SQL_Commands ).commit()
BTW, you shouldn't try to do everything in one line, if only to avoid problems like this one. I'd also suggest looking into adding "autocommit=True" to your connect string, that way you wouldn't have to append .commit() to cur.execute().
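For example, enabling autocommit when connecting might look like this (a sketch, using the placeholder connection string from the question):

ch = pyodbc.connect(
    'DRIVER={SQL Server};SERVER=xxxx;DATABASE=xxx;Trusted_Connection=True',
    autocommit=True
)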