What is the best way to insert a Python dictionary with many keys into a Postgres database without having to enumerate all keys?
I would like to do something like...
song = dict()
song['title'] = 'song 1'
song['artist'] = 'artist 1'
...
cursor.execute('INSERT INTO song_table (song.keys()) VALUES (song)')
from psycopg2.extensions import AsIs

song = {
    'title': 'song 1',
    'artist': 'artist 1'
}

columns = list(song.keys())
values = [song[column] for column in columns]

insert_statement = 'insert into song_table (%s) values %s'

# cursor.execute(insert_statement, (AsIs(','.join(columns)), tuple(values)))
# mogrify returns the final query as bytes on Python 3, so decode it for display
print(cursor.mogrify(insert_statement, (AsIs(','.join(columns)), tuple(values))).decode())
Prints:
insert into song_table (title,artist) values ('song 1', 'artist 1')
Psycopg adapts a tuple to a record, and AsIs inserts the joined column names verbatim, as Python's string substitution would, without quoting or escaping.
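Because AsIs passes the column names through unescaped, the dictionary keys have to be trusted. A minimal sketch, assuming a hypothetical allowlist ALLOWED_COLUMNS and helper split_trusted (neither is part of the answer or of psycopg2), of checking the keys before they reach AsIs:

```python
# Hypothetical allowlist of column names that may appear in the statement.
ALLOWED_COLUMNS = {"title", "artist", "album"}


def split_trusted(row, allowed=ALLOWED_COLUMNS):
    """Return (comma-joined column names, tuple of values) for a dict.

    Raises ValueError if a key is not in the allowlist, since the column
    string is meant to be passed through AsIs without any escaping.
    """
    unknown = [key for key in row if key not in allowed]
    if unknown:
        raise ValueError("unexpected column names: {}".format(unknown))
    columns = list(row)
    return ",".join(columns), tuple(row[column] for column in columns)


column_sql, values = split_trusted({"title": "song 1", "artist": "artist 1"})
print(column_sql)  # title,artist
print(values)      # ('song 1', 'artist 1')
# With psycopg2 the pair would then be used as in the answer above:
# cursor.execute('insert into song_table (%s) values %s', (AsIs(column_sql), values))
```

An allowlist is only one option; the psycopg2.sql module covers the same need by quoting identifiers.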
You can also insert multiple rows from a sequence of dictionaries. If you had the following:
namedict = ({"first_name":"Joshua", "last_name":"Drake"},
            {"first_name":"Steven", "last_name":"Foo"},
            {"first_name":"David", "last_name":"Bar"})
You could insert all three rows in the tuple by using:
cur = conn.cursor()
cur.executemany("""INSERT INTO bar(first_name,last_name) VALUES (%(first_name)s, %(last_name)s)""", namedict)
The cur.executemany statement will automatically iterate over the sequence and execute the INSERT query once for each dictionary.
PS: This example is taken from here
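Named placeholders such as %(first_name)s are looked up in each dictionary, so every row needs the same keys. A small sketch, using a hypothetical helper check_rows that is not part of psycopg2, for verifying this before calling executemany:

```python
def check_rows(rows, required):
    """Return the rows as a list if every dict provides all required keys."""
    rows = list(rows)
    for index, row in enumerate(rows):
        missing = [key for key in required if key not in row]
        if missing:
            raise KeyError("row {} is missing {}".format(index, missing))
    return rows


namedict = ({"first_name": "Joshua", "last_name": "Drake"},
            {"first_name": "Steven", "last_name": "Foo"},
            {"first_name": "David", "last_name": "Bar"})

rows = check_rows(namedict, ["first_name", "last_name"])
print(len(rows))  # 3
# cur.executemany("INSERT INTO bar(first_name,last_name) "
#                 "VALUES (%(first_name)s, %(last_name)s)", rows)
```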
Something along these lines should do it:
song = dict()
song['title'] = 'song 1'
song['artist'] = 'artist 1'

cols = list(song.keys())
vals = [song[x] for x in cols]
vals_str_list = ["%s"] * len(vals)
vals_str = ", ".join(vals_str_list)

# Join the column names into a comma-separated list; formatting the raw
# keys object would embed its Python repr in the SQL.
cursor.execute("INSERT INTO song_table ({cols}) VALUES ({vals_str})".format(
    cols=", ".join(cols), vals_str=vals_str), vals)
The key part is the generated string of %s elements, and using that in format, with the list passed directly to the execute call, so that psycopg2 can interpolate each item in the vals list (thus preventing possible SQL Injection).
Another variation, passing the dict to execute, would be to use these lines instead of vals, vals_str_list and vals_str from above:
vals_str2 = ", ".join(["%({0})s".format(x) for x in cols])

cursor.execute("INSERT INTO song_table ({cols}) VALUES ({vals_str})".format(
    cols=", ".join(cols), vals_str=vals_str2), song)
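The two variations can be wrapped in one function. This is only a sketch: build_insert is a hypothetical helper, and it assumes the table and column names come from trusted code, because they are formatted into the string rather than passed as parameters.

```python
def build_insert(table, row, named=False):
    """Build an INSERT statement and its parameters from a dict.

    named=False -> positional %s placeholders and a list of values
    named=True  -> %(key)s placeholders and the dict itself
    """
    cols = list(row)
    if named:
        placeholders = ", ".join("%({0})s".format(c) for c in cols)
        params = row
    else:
        placeholders = ", ".join(["%s"] * len(cols))
        params = [row[c] for c in cols]
    query = "INSERT INTO {table} ({cols}) VALUES ({vals})".format(
        table=table, cols=", ".join(cols), vals=placeholders)
    return query, params


song = {"title": "song 1", "artist": "artist 1"}
query, params = build_insert("song_table", song)
print(query)   # INSERT INTO song_table (title, artist) VALUES (%s, %s)
print(params)  # ['song 1', 'artist 1']
# cursor.execute(query, params)
```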
The new sql module was created for this purpose and added in psycopg2 version 2.7. According to the documentation:
If you need to generate dynamically an SQL query (for instance choosing dynamically a table name) you can use the facilities provided by the psycopg2.sql module.
Two examples are given in the documentation: http://initd.org/psycopg/docs/sql.html
from psycopg2 import sql

names = ['foo', 'bar', 'baz']

q1 = sql.SQL("insert into table ({}) values ({})").format(
    sql.SQL(', ').join(map(sql.Identifier, names)),
    sql.SQL(', ').join(sql.Placeholder() * len(names)))
print(q1.as_string(conn))
# insert into table ("foo", "bar", "baz") values (%s, %s, %s)

q2 = sql.SQL("insert into table ({}) values ({})").format(
    sql.SQL(', ').join(map(sql.Identifier, names)),
    sql.SQL(', ').join(map(sql.Placeholder, names)))
print(q2.as_string(conn))
# insert into table ("foo", "bar", "baz") values (%(foo)s, %(bar)s, %(baz)s)
Though string concatenation would produce the same result, it should not be used for this purpose, according to psycopg2 documentation:
Warning: Never, never, NEVER use Python string concatenation (+) or string parameters interpolation (%) to pass variables to a SQL query string. Not even at gunpoint.
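To illustrate the warning, here is a sketch that uses Python's built-in sqlite3 module as a stand-in for a Postgres connection (an assumption made so the example runs without a server; sqlite3 uses ? placeholders where psycopg2 uses %s). A value containing an apostrophe breaks the interpolated query, while the parameterized one stores it intact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE song_table (title TEXT, artist TEXT)")

title, artist = "Don't Stop", "artist 1"

# String interpolation: the apostrophe ends the SQL literal early.
broken = "INSERT INTO song_table (title, artist) VALUES ('%s', '%s')" % (title, artist)
try:
    conn.execute(broken)
    interpolation_failed = False
except sqlite3.Error:
    interpolation_failed = True

# Parameter binding: the driver passes the values separately from the SQL.
conn.execute("INSERT INTO song_table (title, artist) VALUES (?, ?)", (title, artist))
stored = conn.execute("SELECT title, artist FROM song_table").fetchall()

print(interpolation_failed)
print(stored)
```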
Another approach for querying MySQL or PostgreSQL from a dictionary is the %(dic_key)s construction: each placeholder is replaced by the value stored under dic_key in the dictionary, e.g. {'dic_key': 'dic value'}.
This works well and prevents SQL injection.
Tested on Python 2.7.
See below:
# in_dict = {u'report_range': None, u'report_description': None, 'user_id': 6, u'rtype': None, u'datapool_id': 1, u'report_name': u'test suka 1', u'category_id': 3, u'report_id': None}
cursor.execute('INSERT INTO report_template (report_id, report_name, report_description, report_range, datapool_id, category_id, rtype, user_id) VALUES ' \
'(DEFAULT, %(report_name)s, %(report_description)s, %(report_range)s, %(datapool_id)s, %(category_id)s, %(rtype)s, %(user_id)s) ' \
'RETURNING "report_id";', in_dict)
OUT:
INSERT INTO report_template (report_id, report_name, report_description, report_range, datapool_id, category_id, rtype, user_id) VALUES (DEFAULT, E'test suka 1', NULL, NULL, 1, 3, NULL, 6) RETURNING "report_id";
Using execute_values (https://www.psycopg.org/docs/extras.html) is faster and has a fetch argument for returning results. The code below might help.
columns is a string like col_name1, col_name2
template is what allows the matching: a string like %(col_name1)s, %(col_name2)s
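from psycopg2.extras import RealDictCursor, execute_values


def insert(cur: RealDictCursor,
           table_name: str,
           values: list[dict],
           returning: str = ''
           ):
    if not values:
        return []

    # Look up the columns that have no default; the table name is passed as a
    # query parameter rather than formatted into the string.
    query = """SELECT
            column_name AS c
        FROM
            information_schema.columns
        WHERE
            table_name = %s
            AND column_default IS NULL;"""
    cur.execute(query, (table_name,))
    columns_names = cur.fetchall()

    fetch = False
    if returning:
        returning = f'RETURNING {returning}'
        fetch = True

    columns = ''
    template = ''
    for col in columns_names:
        col_name = col['c']
        # Fill in None for rows that do not provide this column
        for val in values:
            if col_name in val:
                continue
            val[col_name] = None
        columns += f'{col_name}, '
        template += f'%({col_name})s, '
    # Drop the trailing ", " from both strings
    columns = columns[:-2]
    template = template[:-2]

    # table_name and the column names are formatted in as identifiers,
    # so they must come from a trusted source.
    query = f"""INSERT INTO {table_name}
        ({columns})
        VALUES %s {returning}"""

    return execute_values(cur, query, values,
                          template=f'({template})', fetch=fetch)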
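The columns and template strings can also be built with join instead of trimming a trailing separator. A sketch, with build_columns_and_template as a hypothetical helper name and the column names assumed to be trusted:

```python
def build_columns_and_template(column_names):
    """Return the column list and the execute_values row template."""
    columns = ", ".join(column_names)
    template = "(" + ", ".join("%({})s".format(name) for name in column_names) + ")"
    return columns, template


columns, template = build_columns_and_template(["col_name1", "col_name2"])
print(columns)   # col_name1, col_name2
print(template)  # (%(col_name1)s, %(col_name2)s)
```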
Python has built-in features such as join and list that can be used to generate the query. The dictionary's keys() and values() methods return the column names and column values respectively. This is the approach I used, and it should work for trusted input, though it builds the SQL by string concatenation and so offers no protection against SQL injection.
song = dict()
song['title'] = 'song 1'
song['artist'] = 'artist 1'
query = '''insert into song_table (''' +','.join(list(song.keys()))+''') values '''+ str(tuple(song.values()))
cursor.execute(query)
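Two cases where str(tuple(...)) does not produce valid SQL are worth knowing about. The snippet below only builds the strings and needs no database; concat_insert is a hypothetical wrapper around the same construction.

```python
def concat_insert(table, row):
    # Same construction as in the answer above, wrapped in a function.
    return ("insert into " + table + " (" + ",".join(list(row.keys()))
            + ") values " + str(tuple(row.values())))


# A single-column dict leaves Python's trailing comma in the VALUES list.
single = concat_insert("song_table", {"title": "song 1"})
print(single)
# insert into song_table (title) values ('song 1',)

# A value containing an apostrophe is rendered by repr() with double quotes,
# which PostgreSQL reads as an identifier rather than a string literal.
quoted = concat_insert("song_table", {"title": "Don't Stop", "artist": "artist 1"})
print(quoted)
# insert into song_table (title,artist) values ("Don't Stop", 'artist 1')
```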
Related
I am trying to create a method in Python that inserts records into a table, given a list of column names and an associated list of records.
I was able to set it up so that the column names are populated dynamically via a for loop, but I can't figure out how to do the same thing with the values, because the psycopg2 executemany function relies on having %s placeholders.
Is it possible to populate the number of %s placeholders in the string dynamically via a loop? Is there another way to do this?
def load_table(dbname, table_name, fields, records):
    try:
        # Variable Qty Column Loop
        sql_fields = []
        for i in fields:
            i = sql.Identifier(i)
            sql_fields.append(i)

        # Need similar loop to replace %s values
        # Replace (%s,%s,%s) ???
        # .....
        # .....
        sql_values = []
        for i in fields:
            sql_values.append('%s')
        print(sql_values)

        flist = sql.SQL(',').join(sql_fields)
        connection, cursor = create_connection(dbname)
        insert_query = sql.SQL('INSERT INTO {table_name} ({fields}) VALUES (%s,%s,%s)').format(
            table_name=sql.Identifier(table_name),
            fields=flist,
        )
        cursor.executemany(insert_query, records)
        print('Records Loaded Successfully')
    except (Exception, psycopg2.Error) as error:
        print("Failed to insert record into table {error}".format(error=error))
    finally:
        # closing database connection.
        if (connection):
            close_connection(connection, cursor)
You can use sql.Placeholder to populate the SQL statement with as many %s placeholders as you need:
def load_table(dbname, table_name, fields, records):
    con, cur = create_connection(dbname)
    query = sql.SQL("insert into {} ({}) values ({})").format(
        sql.Identifier(table_name),
        sql.SQL(', ').join(map(sql.Identifier, fields)),
        sql.SQL(', ').join(sql.Placeholder() * len(fields)))
    print(query.as_string(con))


if __name__ == '__main__':
    dbname = '...'
    table_name = 'messages'
    fields = ['user_id', 'message_type', 'message_title']
    records = [['12345', 'json', 'my first message'], ]
    load_table(dbname, table_name, fields, records)
Output:
insert into "messages" ("user_id", "message_type", "message_title") values (%s, %s, %s)
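The function above only prints the query; executing it with executemany requires each record to supply exactly one value per field, in the same order as fields. A sketch of a hypothetical helper, rows_for_fields (not part of psycopg2), that arranges records given as lists or dicts accordingly:

```python
def rows_for_fields(fields, records):
    """Return records as tuples ordered like fields, checking their length."""
    rows = []
    for record in records:
        if isinstance(record, dict):
            row = tuple(record[field] for field in fields)
        else:
            row = tuple(record)
        if len(row) != len(fields):
            raise ValueError("expected {} values, got {}".format(len(fields), len(row)))
        rows.append(row)
    return rows


fields = ['user_id', 'message_type', 'message_title']
records = [['12345', 'json', 'my first message'],
           {'message_title': 'second', 'user_id': '67890', 'message_type': 'json'}]
rows = rows_for_fields(fields, records)
print(rows)
# [('12345', 'json', 'my first message'), ('67890', 'json', 'second')]
# cur.executemany(query, rows)
```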
Convert a dictionary to an SQL insert for the cx_Oracle driver in Python
custom_dictionary= {'ID':2, 'Price': '7.95', 'Type': 'Sports'}
I need to generate a dynamic SQL insert for the cx_Oracle driver from a custom dictionary.
con = cx_Oracle.connect(connectString)
cur = con.cursor()
statement = 'insert into cx_people(ID, Price, Type) values (:2, :3, :4)'
cur.execute(statement, (2, '7.95', 'Sports'))
con.commit()
If you have a known set of columns to be inserted, simply use the insert with named params and pass the dictionary to the execute() method.
statement = 'insert into cx_people(ID, Price, Type) values (:ID, :Price, :Type)'
cur.execute(statement,custom_dictionary)
If the columns are dynamic, construct the insert statement from the dictionary's keys and matching bind parameters, then pass it to execute() in the same way.
cols = ','.join( list(custom_dictionary.keys() ))
params= ','.join( ':' + str(k) for k in list(custom_dictionary.keys()))
statement = 'insert into cx_people(' + cols +' ) values (' + params + ')'
cur.execute(statement,custom_dictionary)
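The dynamic variant can be tried without an Oracle instance. The sketch below uses the standard-library sqlite3 module purely as a stand-in, on the assumption that its :name bind syntax behaves like the cx_Oracle named binds shown above; the table is created in memory for the demonstration.

```python
import sqlite3

custom_dictionary = {'ID': 2, 'Price': '7.95', 'Type': 'Sports'}

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("create table cx_people (ID integer, Price text, Type text)")

# Column list and bind names are both derived from the dictionary keys.
cols = ','.join(custom_dictionary.keys())
params = ','.join(':' + str(k) for k in custom_dictionary.keys())
statement = 'insert into cx_people (' + cols + ') values (' + params + ')'
print(statement)
# insert into cx_people (ID,Price,Type) values (:ID,:Price,:Type)

cur.execute(statement, custom_dictionary)
con.commit()
inserted = cur.execute("select ID, Price, Type from cx_people").fetchall()
print(inserted)
# [(2, '7.95', 'Sports')]
```

As with the other approaches, the keys become part of the SQL text, so they should come from a trusted source.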
You can use the pandas.read_json method and iterate over the DataFrame's rows converted to lists:
import pandas as pd
import cx_Oracle
con = cx_Oracle.connect(connectString)
cursor = con.cursor()
custom_dictionary= '[{"ID":2, "Price": 7.95, "Type": "Sports"}]'
df = pd.read_json(custom_dictionary)
statement='insert into cx_people values(:1,:2,:3)'
df_list = df.values.tolist()
n = 0
for i in df.iterrows():
    cursor.execute(statement, df_list[n])
    n += 1

con.commit()
I'm trying to insert dummy data into a mysql database.
The database structure looks like:
database name: messaround
database table name: test
table structure:
id (Primary key, auto increment)
path (varchar(254))
UPDATED 2 method below, and error.
I have a method to try to insert via:
def insert_into_db(dbcursor, table, *cols, **vals):
    try:
        query = "INSERT INTO {} ({}) VALUES ('{}')".format(table, ",".join(cols), "'),('".join(vals))
        print(query)
        dbcursor.execute(query)
        dbcursor.commit()
        print("inserted!")
    except pymysql.Error as exc:
        print("error inserting...\n {}".format(exc))

connection=conn_db()
insertstmt=insert_into_db(connection, table='test', cols=['path'], vals=['test.com/test2'])
However, this is failing saying:
INSERT INTO test () VALUES ('vals'),('cols')
error inserting...
(1136, "Column count doesn't match value count at row 1")
Can you please assist?
Thank you.
If you use your code:
def insert_into_db(dbcursor, table, *cols, **vals):
    query = "INSERT INTO {} ({}) VALUES ({})".format(table,",".join(cols), ",".join(vals))
    print(query)

insert_into_db('cursor_here', 'table_here', 'name', 'city', name_person='diego', city_person='Sao Paulo')
Python returns:
INSERT INTO table_here (name,city) VALUES (name_person,city_person)
Now with this other:
def new_insert_into_db(dbcursor, table, *cols, **vals):
    vals2 = ''
    for first_part, second_part in vals.items():
        vals2 += '\'' + second_part + '\','
    vals2 = vals2[:-1]
    query = "INSERT INTO {} ({}) VALUES ({})".format(table,",".join(cols), vals2)
    print(query)

new_insert_into_db('cursor_here', 'table_here', 'name', 'city', name_person='diego', city_person='Sao Paulo')
Python will return the correct SQL:
INSERT INTO table_here (name,city) VALUES ('diego','Sao Paulo')
Generally in Python you pass a parameterized query to the DB driver. See this example in PyMySQL's documentation; it constructs the INSERT query with placeholder characters, then calls cursor.execute() passing the query, and a tuple of the actual values.
Using parameterized queries is also recommended for security purposes, as it defeats many common SQL injection attacks.
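A sketch of what such a parameterized insert could look like for the table in the question. The helper name build_mysql_insert and the identifier check are assumptions for illustration; only the values travel as parameters, so the table and column names are validated separately.

```python
import re

IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_mysql_insert(table, cols):
    """Return an INSERT statement with one %s placeholder per column."""
    for name in [table] + list(cols):
        if not IDENTIFIER.match(name):
            raise ValueError("invalid identifier: {!r}".format(name))
    column_list = ", ".join("`{}`".format(c) for c in cols)
    placeholders = ", ".join(["%s"] * len(cols))
    return "INSERT INTO `{}` ({}) VALUES ({})".format(table, column_list, placeholders)


query = build_mysql_insert("test", ["path"])
rows = [("test.com/test2",), ("another.com/anothertest",)]
print(query)  # INSERT INTO `test` (`path`) VALUES (%s)
# With a PyMySQL connection the rows would be sent separately from the SQL:
# with connection.cursor() as cursor:
#     cursor.executemany(query, rows)
# connection.commit()
```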
You should print the SQL statement you've generated; that makes it a lot easier to see what's wrong.
I guess you need quotes ' around the string values in your ",".join(vals) (in case there are string values).
So your code is producing
insert into test (path,) values (test.com/test2,);
but it should produce
insert into test (`path`) values ('test.com/test2');
Otherwise, try https://github.com/markuman/MariaSQL/, which makes it easy to insert data into MariaDB/MySQL using pymysql.
Change your query as below
query = "INSERT INTO {} ({}) VALUES ('{}')".format(table, ",".join(cols), "'),('".join(vals))
Because you are using join, the variables are expected to be lists, not strings:
table = 'test'
cols = ['path']
vals = ['test.com/test2', 'another.com/anothertest']
print(query)
"INSERT INTO test (path) VALUES ('test.com/test2'),('another.com/anothertest')"
Update:
def insert_into_db(dbconnection=None, table='', cols=None, vals=None):
    # Validate the arguments before touching the connection
    if not (dbconnection and table and cols and vals):
        print('Must need all values')
        quit()
    mycursor = dbconnection.cursor()
    try:
        query = "INSERT INTO {} ({}) VALUES ('{}')".format(table, ",".join(cols), "'),('".join(vals))
        mycursor.execute(query)
        dbconnection.commit()
        print("inserted!")
    except pymysql.Error as exc:
        print("error inserting...\n {}".format(exc))

connection=conn_db()
insertstmt=insert_into_db(dbconnection=connection, table='test', cols=['path'], vals=['test.com/test2'])
I have this situation where I created a method that will insert rows in database. I provide to that method columns, values and table name.
COLUMNS = [['NAME','SURNAME','AGE'],['SURNAME','NAME','AGE']]
VALUES = [['John','Doe',56],['Doe','John',56]]
TABLE = 'people'
This is how I would like to pass but it doesn't work:
db = DB_CONN.MSSQL() #method for connecting to MS SQL or ORACLE etc.
cursor = db.cursor()
sql = "insert into %s (?) VALUES(?)" % TABLE
cursor.executemany([sql,[COLUMNS[0],VALUES[0]],[COLUMNS[1],VALUES[1]]])
db.commit()
This version runs, but I must predefine the column names, and that's not good: what if the other list has a different column order? Then the name would end up in surname and the surname in name.
db = DB_CONN.MSSQL() #method for connecting to MS SQL or ORACLE etc.
cursor = db.cursor()
sql = 'insert into %s (NAME,SURNAME,AGE) VALUES (?,?,?)' % TABLE
cursor.executemany(sql,[['John','Doe',56],['Doe','John',56]])
db.commit()
I hope I explained it clearly enough.
Ps. COLUMNS and VALUES are extracted from json dictionary
[{'NAME':'John','SURNAME':'Doe','AGE':56...},{'SURNAME':'Doe','NAME':'John','AGE':77...}]
if that helps.
SOLUTION:
class INSERT(object):

    def __init__(self):
        self.BASE_COL = ''

    def call(self):
        GATHER_DATA = [{'NAME':'John','SURNAME':'Doe','AGE':56},{'SURNAME':'Doe','NAME':'John','AGE':77}]
        self.BASE_COL = ''
        TABLE = 'person'

        # check dictionary keys
        for DATA_EVAL in GATHER_DATA:
            if self.BASE_COL == '':
                self.BASE_COL = list(DATA_EVAL.keys())
            else:
                if set(self.BASE_COL) != set(DATA_EVAL.keys()):
                    print("columns in DATA_EVAL.keys() have different columns")
                    # send mail or insert to log or remove dict from list
                    exit(403)

        # if everything goes well make an insert
        columns = ','.join(self.BASE_COL)
        sql = 'insert into %s (%s) VALUES (?,?,?)' % (TABLE, columns)
        db = DB_CONN.MSSQL()
        cursor = db.cursor()
        # Read each value by column name so every row follows the column order
        cursor.executemany(sql, [[DATA_EVAL[col] for col in self.BASE_COL] for DATA_EVAL in GATHER_DATA])
        db.commit()


if __name__ == "__main__":
    ins = INSERT()
    ins.call()
You could take advantage of the fact that a Python dictionary lists its keys() and values() in corresponding order.
You should check that all items in the JSON array of records have the same fields; otherwise you'll run into an exception in your query.
columns = ','.join(records[0].keys())
sql = 'insert into %s (%s) VALUES (?,?,?)' % (TABLE, columns)
cursor.executemany(sql,[record.values() for record in records])
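Since the records may list their keys in different orders, one option is to read every value by column name and to size the ? list from the number of columns. A sketch using sqlite3 in place of the MSSQL connection (an assumption for runnability; both use ? placeholders), with the column names taken from the first record and sample data invented for the demonstration:

```python
import sqlite3

records = [{'NAME': 'John', 'SURNAME': 'Doe', 'AGE': 56},
           {'SURNAME': 'Roe', 'NAME': 'Jane', 'AGE': 77}]
TABLE = 'people'

db = sqlite3.connect(":memory:")
cursor = db.cursor()
cursor.execute("create table people (NAME text, SURNAME text, AGE integer)")

columns = list(records[0].keys())
placeholders = ','.join('?' * len(columns))
sql = 'insert into %s (%s) VALUES (%s)' % (TABLE, ','.join(columns), placeholders)

# Look each value up by column name so key order in a record does not matter.
rows = [[record[column] for column in columns] for record in records]
cursor.executemany(sql, rows)
db.commit()

print(sql)  # insert into people (NAME,SURNAME,AGE) VALUES (?,?,?)
result = cursor.execute("select NAME, SURNAME, AGE from people order by AGE").fetchall()
print(result)
# [('John', 'Doe', 56), ('Jane', 'Roe', 77)]
```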
References:
https://stackoverflow.com/a/835430/5189811