how to convert multiple inserts into one insert using python

I have a SQL file in which there are many INSERT statements for the same table:
insert into tbname
values (xxx1);
insert into tbname values (xxx2);
insert into
tbname values (xxx3);
...
How can I convert this file into a new file containing a single statement like:
insert into tbname values (xxx1),(xxx2),(xxx3)...;
Because the INSERT statements in the file are formatted differently, as shown above, it is hard to handle this with a regular expression in Python.

If you want to insert multiple rows into the table, use the executemany() method.
import mysql.connector

mydb = mysql.connector.connect(
    host="localhost",
    user="yourusername",
    passwd="yourpassword",
    database="mydatabase"
)
mycursor = mydb.cursor()

sql = "INSERT INTO tbname VALUES (%s)"
val = [
    ('xx1',),
    ('xx2',),
    ('xx3',)
]
mycursor.executemany(sql, val)
mydb.commit()
For more info, see the documentation for executemany().

Read your file into a string and use replace to merge the repeated "insert into tbname values" clauses back into a single statement. Something like:
s = "insert into tbname values (val1, val2);insert into tbname values (val3, val4);insert into tbname values (val5, val6);"
values = s.replace(";insert into tbname values", ', ')
It's an unorthodox method, but it could work in your case to get everything into one insert.
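A slightly more robust sketch along the same lines: first collapse all whitespace so the differently wrapped statements look alike, then pull out the value groups with a regular expression. This assumes the file contains only INSERTs into tbname and that the values themselves contain no parentheses or semicolons; the file names are placeholders.
import re

# Read the whole file and collapse all whitespace/newlines into single spaces.
with open("input.sql") as f:
    text = " ".join(f.read().split())

# Grab every "(...)" group that follows "insert into tbname values",
# regardless of how the original statement was wrapped across lines.
groups = re.findall(r"insert\s+into\s+tbname\s+values\s*(\(.*?\))\s*;", text, flags=re.IGNORECASE)

with open("output.sql", "w") as f:
    f.write("insert into tbname values\n" + ",\n".join(groups) + ";\n")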

INSERT INTO SELECT based on a dataframe

I have a dataframe df and I want to execute a query to insert all the values from the dataframe into a table. Basically, I am trying to load it with the following query:
INSERT INTO mytable
SELECT *
FROM mydataframe
For that I have the following code:
import pyodbc
import pandas as pd
connection = pyodbc.connect('Driver={' + driver + '};'
                            'Server=' + server + ';'
                            'UID=' + user + ';'
                            'PWD=' + password + ';')
cursor = connection.cursor()
query = 'SELECT * FROM [myDB].[dbo].[myTable]'
df = pd.read_sql_query(query, connection)
sql = 'INSERT INTO [dbo].[new_date] SELECT * FROM :x'
cursor.execute(sql, x=df)
connection.commit()
However, I am getting the following error:
TypeError: execute() takes no keyword arguments
Does anyone know what I am doing wrong?
For a raw DB-API insert query from Pandas, consider DataFrame.to_numpy() with executemany() and avoid looping at the top layer. However, explicit columns must be used in the append query. Adjust the columns and the qmark parameter placeholders below to correspond to your data frame columns.
# PREPARED STATEMENT
sql = '''INSERT INTO [dbo].[new_date] (Col1, Col2, Col3, ...)
         VALUES (?, ?, ?, ...)'''

# EXECUTE PARAMETERIZED QUERY
cursor.executemany(sql, df.to_numpy().tolist())
connection.commit()
(And, by the way, it is generally best practice in SQL queries to explicitly reference columns and avoid SELECT * for code readability, maintainability, and even performance.)
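For example, an append query with explicit columns on both sides could look like this, reusing the cursor and connection from the question; the column names here are placeholders, not from the original tables:
sql = '''INSERT INTO [dbo].[new_date] (Col1, Col2, Col3)
         SELECT ColA, ColB, ColC
         FROM [myDB].[dbo].[myTable]'''
cursor.execute(sql)
connection.commit()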
I had some issues connecting pandas with SQL Server too, but the following solution worked for me to write my df:
import pyodbc
import sqlalchemy

engine = sqlalchemy.create_engine('mssql+pyodbc://{0}:{1}@{2}:{3}/{4}?driver={5}'.format(
    username, password, server, port, bdName, driver))
df.to_sql("TableName", con=engine, if_exists="append")
Below is my favourite solution, with an UPSERT statement included (note that the ON CONFLICT ... DO UPDATE syntax is PostgreSQL-specific).
df_columns = list(df)
columns = ','.join(df_columns)
values = 'VALUES({})'.format(','.join(['%s' for col in df_columns]))
update_list = ['{} = EXCLUDED.{}'.format(col, col) for col in df_columns]
update_str = ','.join(update_list)
insert_stmt = "INSERT INTO {} ({}) {} ON CONFLICT ([your_pkey_here]) DO UPDATE SET {}".format(table, columns, values, update_str)
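The snippet above only builds the statement; a minimal way to run it, assuming a psycopg2-style connection (the %s placeholders and the EXCLUDED keyword are PostgreSQL conventions, so this does not apply to SQL Server as-is), could be:
# cursor/connection here would come from psycopg2, not pyodbc
cursor.executemany(insert_stmt, df.to_numpy().tolist())
connection.commit()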
cursor.execute does not accept keyword arguments. One way of doing the insert is with the following code snippet.
cols = ", ".join(str(i) for i in df.columns.tolist())
# Insert DataFrame records one by one.
for i, row in df.iterrows():
    sql = "INSERT INTO [dbo].[new_date] (" + cols + ") VALUES (" + "?," * (len(row) - 1) + "?)"
    cursor.execute(sql, tuple(row))
Here you are iterating through each row and inserting it into the table.
Thank you for your answers :) but I used the following code to solve my problem:
import urllib.parse
import sqlalchemy
import pandas as pd

params = urllib.parse.quote_plus("DRIVER={SQL Server};SERVER=servername;DATABASE=database;UID=user;PWD=pass")
engine = sqlalchemy.create_engine("mssql+pyodbc:///?odbc_connect=%s" % params)
engine.connect()
query = query  # the SELECT query from above
df = pd.read_sql_query(query, connection)
df.to_sql(name='new_table', con=engine, index=False, if_exists='append')

Insert nested arrays into sql from python

I have a list that contains many lists in Python.
my_list = [['city', 'state'], ['tampa', 'florida'], ['miami','florida']]
The nested list at index 0 contains the column headers, and the rest of the nested lists contain the corresponding values. How would I insert this into SQL Server using pyodbc or sqlalchemy? I have been using pandas' to_sql and want to make this a process in pure Python. Any help would be greatly appreciated.
expected output table would look like:
city |state
-------------
tampa|florida
miami|florida
Since the column names are coming from your list you have to build a query string to insert the values. Column names and table names can't be parameterised with placeholders (?).
import pyodbc

conn = pyodbc.connect(my_connection_string)
cursor = conn.cursor()

my_list = [['city', 'state'], ['tampa', 'florida'], ['miami', 'florida']]

columns = ','.join(my_list[0])              # String of column names
values = ','.join(['?'] * len(my_list[0]))  # Placeholders for values
query = "INSERT INTO mytable({0}) VALUES ({1})".format(columns, values)

# Loop through rest of list, inserting data
for l in my_list[1:]:
    cursor.execute(query, l)
conn.commit()  # save changes
Update:
If you have a large number of records to insert you can do that in one go using executemany. Change the code like this:
columns = ','.join(my_list[0]) #String of column names
values = ','.join(['?'] * len(my_list[0])) #Placeholders for values
#Bulk insert
query = "INSERT INTO mytable({0}) VALUES ({1})".format(columns, values)
cursor.executemany(query, my_list[1:])
conn.commit() #save change
Assuming conn is an already-open connection to your database:
cursor = conn.cursor()
# Skip the header row at index 0
for row in my_list[1:]:
    cursor.execute('INSERT INTO my_table (city, state) VALUES (?, ?)', row)
cursor.commit()
Since the column names are the first element of the array, just do:
q = """CREATE TABLE IF NOT EXISTS stud_data (`{col1}` VARCHAR(250), `{col2}` VARCHAR(250));"""
sql_cmd = q.format(col1=my_list[0][0], col2=my_list[0][1])
mycursor.execute(sql_cmd)  # Create the table with its columns
Now to add the values to the table, do:
for i in range(1, len(my_list)):
    sql = "INSERT IGNORE INTO stud_data (city, state) VALUES (%s, %s)"
    mycursor.execute(sql, (my_list[i][0], my_list[i][1]))
mydb.commit()  # commit on the connection object (cursors do not commit)
print(mycursor.rowcount, "Record Inserted.")  # Row count of the last insert

Dynamically create MySQL table with python?

So I have an array of 200+ columns. How would I loop through this and create a table using pymysql?
Currently I am connected like:
import pymysql

connection = pymysql.connect(
    host='my_host_name',
    user='my_username',
    password='my_password',
    port=0000,
    database='my_db')

columns = ['firstname', 'lastname', 'email', .... ]
cursor = connection.cursor()
sql = '''CREATE TABLE my_table (
    # For each column in columns
)'''
cursor.execute(sql)
Edit: I will loop through the columns first and append their appropriate data type
You can use join to loop through all the columns:
columns = ['firstname VARCHAR(255)', 'lastname VARCHAR(255)'] # and so on
sql = 'CREATE TABLE my_table (' + ', '.join(columns) + ');'
Note that the resulting table is not even in 1NF (First Normal Form), as it doesn't have a PRIMARY KEY. It would be better to set one or more columns as PRIMARY KEY to reduce the risk of inconsistency.
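Putting it together with the connection from the question, a minimal sketch; the column types and the added id primary key column are assumptions, not from the question:
# Each entry already carries its type, as described in the question's edit.
columns = ['firstname VARCHAR(255)', 'lastname VARCHAR(255)', 'email VARCHAR(255)']

# Add a surrogate primary key so the table has one (an assumption, not required by the question).
sql = ('CREATE TABLE my_table ('
       'id INT AUTO_INCREMENT PRIMARY KEY, '
       + ', '.join(columns) + ');')

cursor = connection.cursor()
cursor.execute(sql)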

sqlite3 index table in python

I have created this table in Python 2.7. I use it to store unique name/value pairs. In some queries I search by name and in others I search by value. Let's say the SELECT queries are split 50-50. Is there any way to create the table with a double index (one index on names and another on values) so my program will find the data faster?
Here is the database and table creation:
import sqlite3
#-------------------------db creation ---------------------------------------#
db1 = sqlite3.connect('/my_db.db')
cursor = db1.cursor()
cursor.execute("DROP TABLE IF EXISTS my_table")
sql = '''CREATE TABLE my_table (
name TEXT DEFAULT NULL,
value INT
);'''
cursor.execute(sql)
sql = ("CREATE INDEX index_my_table ON my_table (name);")
cursor.execute(sql)
Or is there any other structure that would make value lookups faster?
You can create another index...
sql = ("CREATE INDEX index_my_table2 ON my_table (value);")
cursor.execute(sql)
I think the best way to make the lookups faster is to create an index on the two fields,
like: sql = ("CREATE INDEX index_my_table ON my_table (name, value)")
These are called Multi-Column Indices or Covering Indices.
see the (great) doc here: https://www.sqlite.org/queryplanner.html
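For the 50-50 mix described in the question, you can check which index SQLite actually picks with EXPLAIN QUERY PLAN; a small sketch reusing the cursor, table, and index names from the snippets above:
# Second index so lookups by value are indexed too (name is already covered by index_my_table).
cursor.execute("CREATE INDEX IF NOT EXISTS index_my_table2 ON my_table (value);")

# EXPLAIN QUERY PLAN shows whether each kind of lookup uses an index.
print(cursor.execute("EXPLAIN QUERY PLAN SELECT value FROM my_table WHERE name = 'a'").fetchall())
print(cursor.execute("EXPLAIN QUERY PLAN SELECT name FROM my_table WHERE value = 1").fetchall())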

Python mysql connector insert with %s

I'm trying to append a set containing a number into my MySQL database using MySQL Connector/Python. I am able to add data manually, but the following expression with %s won't work. I tried several variations on this, but nothing from the documentation seems to work in my case. The table was already built, as you can see:
# Create the table:
#cursor.execute('''CREATE TABLE anzahlids( tweetid INT )''')
Here is my code and the error:
print len(idset)
id_data = [
len(idset)
]
print id_data
insert = ("""INSERT INTO anzahlids (idnummer) VALUES (%s)""")
cursor.executemany(insert, id_data)
db_connection.commit()
"Failed processing format-parameters; %s" % e)
mysql.connector.errors.ProgrammingError: Failed processing format-parameters; argument 2 to map() must support iteration
Late answer, but I would like to post some nicer code. Also, the original question was using MySQL Connector/Python.
The use of executemany() is wrong. The executemany() method expects a sequence of tuples, for example, [ (1,), (2,) ].
For the problem at hand, executemany() is actually not useful and execute() should be used:
cur.execute("DROP TABLE IF EXISTS anzahlids")
cur.execute("CREATE TABLE anzahlids (tweetid INT)")
some_ids = [ 1, 2, 3, 4, 5]
cur.execute("INSERT INTO anzahlids (tweetid) VALUES (%s)",
(len(some_ids),))
cnx.commit()
And with MySQL Connector/Python (unlike with MySQLdb), you have to make sure you are committing.
(Note for non-German speakers: 'anzahlids' means 'number_of_ids')
The following is an example that worked on my machine.
import MySQLdb

db = MySQLdb.connect(host="localhost", user="stackoverflow", passwd="", db="stackoverflow")
cursor = db.cursor()

try:
    cursor.execute('create table if not exists anzahlids( tweetid int );')
except:
    # ignore errors if the table already exists
    pass

sql = "INSERT INTO anzahlids (tweetid) VALUES (%s)"
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
length = [(len(data),)]  # executemany expects a sequence of parameter tuples
cursor.executemany(sql, length)
db.commit()
If idset is a single value, you can use:
sql = ("""INSERT INTO anzahlids (tweetid) VALUES (%s)""") % len(idset)
cursor.execute(sql)
db.commit()
