I have a list of dictionaries with, for example, the following data:
columns = [{
'name': 'column1',
'type': 'varchar'
},
{
'name': 'column2',
'type': 'decimal'
},
# ...
]
From that list I need to dynamically build a CREATE TABLE statement, where each dictionary supplies a column name and type, and execute it on a PostgreSQL database using the psycopg2 adapter.
I managed to do it with:
columns = "(" + ",\n".join(["{} {}".format(col['name'], col['type']) for col in columns]) + ")"
cursor.execute("CREATE TABLE some_table_name\n {}".format(columns))
But this solution is vulnerable to SQL injection. I tried to do the exact same thing with the sql module from psycopg2, but without luck: I always get a syntax error, because it wraps the type in quotes.
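For illustration, a hypothetical malicious type value shows what the naive formatting allows:
# Hypothetical malicious input: the "type" value smuggles extra SQL into the statement.
columns = [{'name': 'column1', 'type': "varchar); DROP TABLE some_table_name; --"}]
# The string formatting above would then produce:
#   CREATE TABLE some_table_name
#    (column1 varchar); DROP TABLE some_table_name; --)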
Is there some way this can be done safely?
You can make use of AsIs to have the column types added without quoting:
import psycopg2
from psycopg2.extensions import AsIs
import psycopg2.sql as sql
conn = psycopg2.connect("dbname=mf port=5959 host=localhost user=mf_usr")
columns = [{
'name': "column1",
'type': "varchar"
},
{
'name': "column2",
'type': "decimal"
}]
# create a dict, so we can use dict placeholders in the CREATE TABLE query.
column_dict = {c['name']: AsIs(c['type']) for c in columns}
createSQL = sql.SQL("CREATE TABLE some_table_name\n ({columns})").format(
columns = sql.SQL(',').join(
sql.SQL(' ').join([sql.Identifier(col), sql.Placeholder(col)]) for col in column_dict)
)
print(createSQL.as_string(conn))
cur = conn.cursor()
cur.execute(createSQL, column_dict)
cur.execute("insert into some_table_name (column1) VALUES ('foo')")
cur.execute("select * FROM some_table_name")
print('Result: ', cur.fetchall())
Output:
CREATE TABLE some_table_name
("column1" %(column1)s,"column2" %(column2)s)
Result: [('foo', None)]
Note:
psycopg2.sql is safe against SQL injection; AsIs probably is not.
Testing with 'type': "varchar; DROP TABLE foo" resulted in a Postgres syntax error:
b'CREATE TABLE some_table_name\n ("column1" varchar; DROP TABLE foo,"column2" decimal)'
Traceback (most recent call last):
File "pct.py", line 28, in <module>
cur.execute(createSQL, column_dict)
psycopg2.errors.SyntaxError: syntax error at or near ";"
LINE 2: ("column1" varchar; DROP TABLE foo,"column2" decimal)
Expanding on my comment, a complete example:
import psycopg2
from psycopg2 import sql
columns = [{
'name': 'column1',
'type': 'varchar'
},
{
'name': 'column2',
'type': 'decimal'
}]
con = psycopg2.connect("dbname=test host=localhost user=aklaver")
cur = con.cursor()
col_list = sql.SQL(',').join( [sql.Identifier(col["name"]) + sql.SQL(' ') + sql.SQL(col["type"]) for col in columns])
create_sql = sql.SQL("CREATE TABLE tablename ({})").format(col_list)
print(create_sql.as_string(con))
CREATE TABLE tablename ("column1" varchar,"column2" decimal)
cur.execute(create_sql)
con.commit()
test(5432)=> \d tablename
              Table "public.tablename"
 Column  |       Type        | Collation | Nullable | Default
---------+-------------------+-----------+----------+---------
 column1 | character varying |           |          |
 column2 | numeric           |           |          |
Iterate over the list of column dicts, rendering the column names as SQL identifiers and the column types as straight SQL inside the sql.SQL construct. Use the result as the parameter to the CREATE TABLE statement.
Caveat: sql.SQL() does not do escaping, so those type values would have to be validated before they were used, for example against a whitelist as sketched below.
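A minimal validation sketch, reusing the columns list and sql module from the example above; the ALLOWED_TYPES whitelist is an assumption, extend it to whatever types your application actually needs:
# Reject any type string that is not explicitly whitelisted before it reaches sql.SQL().
ALLOWED_TYPES = {'varchar', 'decimal', 'integer', 'text', 'boolean'}  # assumed set

def validated(col_type):
    if col_type.lower() not in ALLOWED_TYPES:
        raise ValueError("unsupported column type: {!r}".format(col_type))
    return col_type

col_list = sql.SQL(',').join(
    [sql.Identifier(col["name"]) + sql.SQL(' ') + sql.SQL(validated(col["type"]))
     for col in columns])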
Related
Is there a way to truncate a table using pandas? I'm using a config.json to supply the DB config.
with open("config.json") as jsonfile:
db_config = load(jsonfile)['database']
engine = create_engine(db_config['config'], echo = False)
{
    "database": {
        "config": "mysql+pymysql://root:password@localhost:3306/some_db"
    }
}
Like:
sql = "TRUNCATE TABLE some_db.some_table;"
pd.read_sql(sql=sql, con=engine)
Error:
sqlalchemy.exc.ResourceClosedError: This result object does not return
rows. It has been closed automatically
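TRUNCATE returns no rows, which is why pd.read_sql raises ResourceClosedError: it expects a result set. A minimal sketch, assuming SQLAlchemy 1.4+ and the engine built above, executes the statement directly on a connection instead:
from sqlalchemy import text

# TRUNCATE produces no result set, so run it on a connection rather than
# through pd.read_sql; engine.begin() commits automatically on success.
with engine.begin() as conn:
    conn.execute(text("TRUNCATE TABLE some_db.some_table"))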
I have an SQL class in Python which inserts data into my DB. In my table, one column is a JSON field, and when I insert data into that table I get an error (psycopg2.ProgrammingError: can't adapt type 'dict').
I have tried json.load, json.loads, json.dump, and json.dumps; none of them worked. I even tried string formatting, which did not work either.
Any idea how to do this?
My demo code is:
json_data = {
"key": "value"
}
query = """INSERT INTO table(json_field) VALUES(%s)"""
self.cursor.execute(query, ([json_data,]))
self.connection.commit()
The block of code below worked for me:
import psycopg2
import json
json_data = {
"key": "value"
}
json_object = json.dumps(json_data, indent = 4)
query = """INSERT INTO json_t(field) VALUES(%s)"""
dbConn = psycopg2.connect(database='test', port=5432, user='username')
cursor=dbConn.cursor()
cursor.execute(query, ([json_object,]))
dbConn.commit()
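Alternatively, psycopg2 ships a Json adapter in psycopg2.extras that wraps the dict for you, so the manual json.dumps step isn't needed:
from psycopg2.extras import Json

# Json() adapts the dict to a JSON literal when the query is executed.
cursor.execute(query, (Json(json_data),))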
I am trying to insert this data into a SQL Server table, and I'm not sure whether it is supposed to be a list or a dictionary.
For some context, I am pulling the data from a SharePoint list using shareplum, with code like this:
import json
import pandas
import pyodbc
from shareplum import Site
from shareplum import Office365
authcookie = Office365('https://company.sharepoint.com', username='username', password='password').GetCookies()
site = Site('https://company.sharepoint.com/sites/sharepoint/', authcookie=authcookie)
sp_list = site.List('Test')
data = sp_list.GetListItems('All Items')
cnxn = pyodbc.connect("Driver={SQL Server Native Client 11.0};"
"Server=Server;"
"Database=db;"
"Trusted_Connection=yes;")
cursor = cnxn.cursor()
insert_query = "INSERT INTO SharepointTest(No,Name) VALUES (%(No)s,%(Name)s)"
cursor.executemany(insert_query,data)
cnxn.commit
Here's the result of print(data):
[
{ 'No': '1', 'Name': 'Qwe' },
{ 'No': '2', 'Name': 'Asd' },
{ 'No': '3', 'Name': 'Zxc' },
{ 'No': '10', 'Name': 'jkl' }
]
If I try to execute that code, it shows me this message:
TypeError: ('Params must be in a list, tuple, or Row', 'HY000')
What should I fix in the code?
Convert your list of dictionaries to a list of tuples of the dictionary values.
I've done it below, using a list comprehension to iterate through the list and the values() method to extract the values from each dictionary:
insert_query = "INSERT INTO SharepointTest(No,Name) VALUES (?, ?)" #change your sql statement to include parameter markers
cursor.executemany(insert_query, [tuple(d.values()) for d in data])
cnxn.commit() #finally commit your changes
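Note that tuple(d.values()) relies on every dictionary listing its keys in the same order as the columns in the INSERT. If that ordering isn't guaranteed, pulling the values out by key is safer:
# Extract values explicitly by key so column order never depends on dict order.
cursor.executemany(insert_query, [(d['No'], d['Name']) for d in data])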
How do I use pandas_gbq.read_gbq safely to protect against SQL injection? I cannot find a way to parametrize it in the docs.
I've looked for a way to parametrize in the docs, as well as on Google's website and other sources.
df_valid = read_gbq(QUERY_INFO.format(variable), project_id='project-1622', location='EU')
where QUERY_INFO looks like:
SELECT name, date FROM table WHERE id = '{0}'
I can input p' or '1'='1 and the injection works.
Per the Google BigQuery docs, you have to pass a query configuration with a parameterized SQL statement:
import pandas as pd
sql = "SELECT name, date FROM table WHERE id = #id"
query_config = {
'query': {
'parameterMode': 'NAMED',
'queryParameters': [
{
'name': 'id',
'parameterType': {'type': 'STRING'},
'parameterValue': {'value': '1'}
}
]
}
}
df = pd.read_gbq(sql, project_id='project-1622', location='EU', configuration=query_config)
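With the parameterized form, the earlier injection attempt is passed as data rather than SQL: a parameterValue of "p' or '1'='1" is compared literally against id and simply matches no rows, instead of rewriting the WHERE clause.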
I am having trouble inserting an element into a MySQL database.
Here is the code I have:
#!/usr/bin/python
# -*- coding: utf-8 -*-
myData = [ { u'my_text' : {u'id': u'1', u'description' : u'described' }, u'my_id' : u'1' } ]
import MySQLdb as mdb
con = None
con = mdb.connect('localhost', 'abc', 'def', 'ghi');
cur = con.cursor()
con.set_character_set('utf8')
cur.execute('SET NAMES utf8;')
cur.execute('SET CHARACTER SET utf8;')
cur.execute('SET character_set_connection=utf8;')
sql = "INSERT IGNORE INTO MyTable ( 'my_id', 'my_text' ) VALUES ( %(my_id)s, %(my_text)s );"
cur.executemany(sql, myData)
con.commit()
if con: con.close()
My database is created with this:
CREATE TABLE MyTable(
    `my_id` INT(10) unsigned NOT NULL,
    `my_text` TEXT
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
As you can see from the Python script, one element of my list is a dictionary, and it seems to be this that is stopping the insertion into the MySQL database; but as I have only started using MySQL in the last few days, I may be wrong.
If I make my_text within myData a simple text phrase such as the following, everything works and the insertion into the database table succeeds:
myData = [ { u'my_text' : u'something simple', u'my_id' : u'1' } ]
I want to be able to use this:
myData = [ { u'my_text' : {u'id': u'1', u'description' : u'described' }, u'my_id' : u'1' } ]
What am I doing wrong?
You have at least two choices.
Change your table schema:
CREATE TABLE MyTable(
    `my_id` INT(10) unsigned NOT NULL,
    `id` INT(10),
    `description` TEXT
) ENGINE=MyISAM DEFAULT CHARSET=utf8
sql = "INSERT IGNORE INTO MyTable ( `my_id`, `id`, `description` ) VALUES ( %s, %s, %s )"
myArg = [(dct['my_id'], dct['my_text']['id'], dct['my_text']['description'])
for dct in myData]
cur.executemany(sql, myArg)
Or change the arg passed to cur.executemany:
myData = [ { u'my_text' : {u'id': u'1', u'description' : u'described' }, u'my_id' : u'1' } ]
myArg = [ {'my_text' : str(dct['my_text']),
'my_id' : dct['my_id']} for dct in myData ]
cur.executemany(sql, myArg)
The advantage of changing the table schema is that you can now select based on id, so the data is richer.
But if you have no need to separate the id from the description, then the second way will work without having to change the table schema.
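One caveat on the second choice: str() stores the Python repr of the dict (single quotes, u'' prefixes), which is awkward to parse back later. If the value ever needs to be read out again, serializing with json.dumps is safer; a sketch under that assumption:
import json

# json.dumps produces valid JSON that can be restored later with json.loads.
myArg = [{'my_text': json.dumps(dct['my_text']),
          'my_id': dct['my_id']} for dct in myData]
cur.executemany(sql, myArg)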