I'm currently having a problem importing a CSV file into SQL Server using a minor variation of the Python code from a previous answer:
Insert csv into sql database
I've run into an issue where I get the following syntax error:
line 28, in insert_records
cursor.execute(insert +'('+ ', '.join(values) +');')
pyodbc.ProgrammingError: ('42000', "[42000] [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]Incorrect syntax near '/'. (102) (SQLExecDirectW)")
I believe I am close to getting this CSV file to import into SQL Server. The table in SQL Server already has its column headings present. I've attached the Python code I am using; the program terminates at cursor.execute(insert + '(' + ', '.join(values) + ');').
Thanks in advance,
Bryan
import pyodbc
import csv

print('connecting')
conn = pyodbc.connect(r'DRIVER={ODBC Driver 13 for SQL Server};'
                      r'SERVER=.\SQLExpress;'
                      r'DATABASE=UFOGBobservations;'
                      r'Trusted_Connection=yes')
print('Connected')
my_cursor = conn.cursor()
print('Cursor established')

def insert_records(table, yourcsv, cursor, cnxn):
    # INSERT SOURCE RECORDS TO DESTINATION
    with open(yourcsv) as csvfile:
        csvFile = csv.reader(csvfile, delimiter=',')
        header = next(csvFile)
        headers = map((lambda x: x.strip()), header)
        insert = 'INSERT INTO {} ('.format(table) + ', '.join(headers) + ') VALUES '
        for row in csvFile:
            values = map((lambda x: "'" + x.strip() + "'"), row)
            cursor.execute(insert + '(' + ', '.join(values) + ');')
        conn.commit()  # must commit unless your sql database auto-commits

table = 'table_1'
mycsv = r'C:\DataAnalystData\UFOGB_Observations.csv'  # SET YOUR FILEPATH
insert_records(table, mycsv, my_cursor, conn)
my_cursor.close()
This is possibly an escaping issue. It would be safer to pass the values as a list of parameters to execute() rather than building the SQL string by hand; the driver will then handle quoting and escaping for you.
headers = [x.strip() for x in header]  # a list, so it can be joined and counted
insert = 'INSERT INTO {} ('.format(table) + ', '.join(headers) + ') VALUES ({})' \
    .format(', '.join(len(headers) * '?'))  # Add parameter placeholders as ?
for row in csvFile:
    values = [x.strip() for x in row]  # No need for the quotes
    cursor.execute(insert, values)  # Pass the list of values as 2nd argument
conn.commit()
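Putting the fix together, the whole routine would look something like this; a minimal sketch, untested against your table:

def insert_records(table, yourcsv, cursor, cnxn):
    # Build one parameterized INSERT and reuse it for every row
    with open(yourcsv) as csvfile:
        reader = csv.reader(csvfile, delimiter=',')
        headers = [h.strip() for h in next(reader)]
        insert = 'INSERT INTO {} ({}) VALUES ({})'.format(
            table, ', '.join(headers), ', '.join('?' * len(headers)))
        for row in reader:
            cursor.execute(insert, [v.strip() for v in row])  # driver escapes each value
    cnxn.commit()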
The code below extracts data from an Oracle database into a CSV file.
In the query, to convert the fractional decimal DTIMESTAMP into a date, I have used (To_Date('12/30/1899', 'MM/DD/YYYY HH24:MI:SS') + DTIMESTAMP) Decoded_Date.
I have also specified a date range so that only data between those dates is extracted.
Please help: what's wrong with the code below that gives an invalid-syntax error?
import csv
import cx_Oracle

dsn_tns = cx_Oracle.makedsn('hostname', 'port', sid='sid')  # if needed, place an 'r' before any parameter in order to address special characters such as '\'
conn = cx_Oracle.connect(user=r'username', password='password', dsn=dsn_tns)
cursor = conn.cursor()
csv_file = open("C:/Users/locations.csv", "w")
writer = csv.writer(csv_file, delimiter=',', lineterminator="\n", quoting=csv.QUOTE_NONNUMERIC)
r = cursor.execute("""SELECT *
                      FROM (SELECT LROWNUM, DTIMESTAMP, LSCENARIO, LYEAR, LPERIOD,
                                   LENTITY, LPARENT, LVALUE, LACCOUNT, LICP, LCUSTOM1,
                                   LCUSTOM2, STRUSERNAME, STRSERVERNAME,
                                   LACTIVITY, DDATAVALUE, BNODATA,
                                   (To_Date('12/30/1899', 'MM/DD/YYYY HH24:MI:SS') + DTIMESTAMP) Decoded_Date
                            FROM TABLE_NAME
                           ) SUB
                      WHERE SUB.Decoded_Date between '23-MAR-2020' and '24-APR-2020';
                   """)
for row in cursor:
    writer.writerow(row)
cursor.close()
conn.close()
csv_file.close()
Two things need to change: the trailing semicolon inside the SQL string should not be present (Oracle rejects it when the statement is run through cx_Oracle), and the date strings in the WHERE clause need explicit to_date() conversions. I can't test the SQL directly, of course, but this should in theory work for you!
r = cursor.execute("""
    SELECT *
    FROM (SELECT LROWNUM, DTIMESTAMP, LSCENARIO, LYEAR, LPERIOD,
                 LENTITY, LPARENT, LVALUE, LACCOUNT, LICP, LCUSTOM1,
                 LCUSTOM2, STRUSERNAME, STRSERVERNAME,
                 LACTIVITY, DDATAVALUE, BNODATA,
                 To_Date('12/30/1899', 'MM/DD/YYYY') + DTIMESTAMP as Decoded_Date
          FROM TABLE_NAME
         ) SUB
    WHERE SUB.Decoded_Date between to_date('23-MAR-2020', 'DD-MON-YYYY')
                               and to_date('24-APR-2020', 'DD-MON-YYYY')
    """)
Note the changes to the last lines as well. Unless you know the value of NLS_DATE_FORMAT, you can't compare strings with dates directly. Note that you can also bind date values directly, as in:
sql = "select ... where sub.decoded_date between :1 and :2"
cursor.execute(sql, [datetime.date(2020, 3, 23), datetime.date(2020, 4, 24)])
With Python I want to create multiple CSV files with data from MS SQL. In MS SQL I have a table with article names and some article information. Article names show up multiple times in the table with varying article information. The script should create one CSV file per article name. I wrote the same script for article numbers (integer) and it worked fine, but with article names (varchar: letters and numbers) I get this error message:
ProgrammingError: ('42000', "[42000] [Microsoft][ODBC Driver 13 for SQL Server][SQL Server]Incorrect syntax near 'SomeArticleName'. (102) (SQLExecDirectW)")
--> I guess the WHERE statement is the problem
import pyodbc
import csv

conn = pyodbc.connect('DRIVER={ODBC Driver 13 for SQL Server};'
                      r'SERVER=servername;'
                      r'DATABASE=mydatabase;'
                      r'Trusted_Connection=yes;')
cursor = conn.cursor()
cursor.execute("SELECT DISTINCT [ArticleName] from [mydatabase].[dbo].[mytable]")
allarticles = cursor.fetchall()

for article in allarticles:
    anr = article[0]
    cursor.execute("SELECT * from [mydatabase].[dbo].[mytable] WHERE [ArticleName] = " + str(article[0]))
    rows = cursor.fetchall()
    myfile = 'mypath' + anr + '.csv'
    with open(myfile, 'w') as fp:
        a = csv.writer(fp, lineterminator='\n')
        for row in rows:
            a.writerow(row)
conn.close()
I printed the article names from 'allarticles' and they show up.
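Your guess is right: the article name is concatenated into the SQL unquoted, so SQL Server parses SomeArticleName as an identifier instead of a string literal. A minimal sketch of the fix, reusing the cursor and table names from your code, is to let pyodbc bind the value through a ? placeholder:

for article in allarticles:
    anr = article[0]
    # The driver quotes and escapes the value, so names with spaces or quotes also work
    cursor.execute("SELECT * from [mydatabase].[dbo].[mytable] WHERE [ArticleName] = ?", anr)
    rows = cursor.fetchall()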
Using Python, I am trying to import a CSV into an SQLite table and use the headers in the CSV file as the column names in the SQLite table. The code runs but the table "MyTable" does not appear to be created. Here is the code:
import csv
import sqlite3

with open('dict_output.csv', 'r') as f:
    reader = csv.reader(f)
    columns = next(reader)
    # Strips white space in header
    columns = [h.strip() for h in columns]
    #reader = csv.DictReader(f, fieldnames=columns)
    for row in reader:
        print(row)

    con = sqlite3.connect("city_spec.db")
    cursor = con.cursor()

    # Inserts data from csv into table in sql database.
    query = 'insert into MyTable({0}) values ({1})'
    query = query.format(','.join(columns), ','.join('?' * len(columns)))
    print(query)

    cursor = con.cursor()
    for row in reader:
        cursor.execute(query, row)
    #cursor.commit()
    con.commit()
con.close()
Thanks in advance for any help.
You can use Pandas to make this easy (you may need to pip install pandas first):
import sqlite3
import pandas as pd
# load data
df = pd.read_csv('dict_output.csv')
# strip whitespace from headers
df.columns = df.columns.str.strip()
con = sqlite3.connect("city_spec.db")
# drop data into database
df.to_sql("MyTable", con)
con.close()
Pandas will do all of the hard work for you, including creating the actual table!
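One detail worth knowing: to_sql writes the DataFrame's index as an extra column by default, and it raises an error if the table already exists. Both behaviours are controlled by keyword arguments, for example:

# append to an existing table and skip the index column
df.to_sql("MyTable", con, index=False, if_exists="append")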
You haven't marked your question solved yet, so here goes.
Connect to the database just once, and create a cursor just once.
You can read the csv records only once, so don't exhaust the reader before the insert loop.
I've added code that creates a crude form of the database table based on the column names alone. Again, this is done just once.
Your insertion code works fine.
import sqlite3
import csv

con = sqlite3.connect("city_spec.sqlite")  ## these statements belong outside the loop
cursor = con.cursor()                      ## execute them just once

first = True
with open('dict_output.csv', 'r') as f:
    reader = csv.reader(f)
    columns = next(reader)
    columns = [h.strip() for h in columns]

    if first:
        sql = 'CREATE TABLE IF NOT EXISTS MyTable (%s)' % ', '.join(['%s text' % column for column in columns])
        print(sql)
        cursor.execute(sql)
        first = False

    #~ for row in reader:  ## we will read the rows later in the loop
        #~ print(row)

    query = 'insert into MyTable({0}) values ({1})'
    query = query.format(','.join(columns), ','.join('?' * len(columns)))
    print(query)

    for row in reader:
        cursor.execute(query, row)

con.commit()
con.close()
You can also do this easily with the peewee ORM. For this you only need an extension from peewee, the playhouse.csv_loader:
from playhouse.csv_loader import *
db = SqliteDatabase('city_spec.db')
Test = load_csv(db, 'dict_output.csv')
This creates the database city_spec.db with the headers as fields and the data from dict_output.csv.
If you don't have peewee you can install it with
pip install peewee
I am trying to insert data from a .csv file into a MySQL database using a Python script.
The Python script I used:
import csv
import MySQLdb

db = MySQLdb.connect(host='localhost', user='root', passwd='password', db='EfficientBazzar')
cursor = db.cursor()
csv_data = csv.reader(file('products.csv'))
for row in csv_data:
    cursor.execute('INSERT INTO vendor_price_list(ID,Vendor,productname,productcode,unit,weight,price)'
                   'VALUES("%s","%s","%s","%s","%s","%s","%s")')
db.commit()
cursor.close()
print "Done"
You're forgetting to interpolate the row data into your insert query.
Replace:
'VALUES("%s","%s","%s","%s","%s","%s","%s")'
with:
'VALUES("%s","%s","%s","%s","%s","%s","%s")' % tuple(row)
Actually, even better, consider parameterizing the query:
cursor.execute('INSERT INTO vendor_price_list(ID,Vendor,productname,productcode,unit,weight,price)' + \
               ' VALUES(%s,%s,%s,%s,%s,%s,%s)', row)
Even faster is a bulk command straight from csv with MySQL's LOAD DATA INFILE (assuming server instance settings allow it):
cursor.execute("LOAD DATA INFILE '/path/to/products.csv'" + \
" INTO TABLE vendor_price_list" + \
" FIELDS TERMINATED BY ',' ENCLOSED BY '\"'" + \
" LINES TERMINATED BY '\n' IGNORE 1 ROWS")
I am trying to write a csv file into a table in a SQL Server database using Python. I am facing errors when I pass the parameters, but I don't face any error when I do it manually. Here is the code I am executing.
cur = cnxn.cursor()  # Get the cursor
csv_data = csv.reader(open('Samplefile.csv'))  # Read the csv
for rows in csv_data:  # Iterate through csv
    cur.execute("INSERT INTO MyTable(Col1,Col2,Col3,Col4) VALUES (?,?,?,?)", rows)
cnxn.commit()
Error:
pyodbc.DataError: ('22001', '[22001] [Microsoft][ODBC SQL Server Driver][SQL Server]String or binary data would be truncated. (8152) (SQLExecDirectW); [01000] [Microsoft][ODBC SQL Server Driver][SQL Server]The statement has been terminated. (3621)')
However, when I insert the values manually, it works fine:
cur.execute("INSERT INTO MyTable(Col1,Col2,Col3,Col4) VALUES (?,?,?,?)",'A','B','C','D')
I have ensured that the TABLE is there in the database, data types are consistent with the data I am passing. Connection and cursor are also correct. The data type of rows is "list"
Consider building the query dynamically to ensure the number of placeholders matches your table and CSV file format. Then it's just a matter of ensuring your table and CSV file are correct, instead of checking that you typed enough ? placeholders in your code.
The following example assumes
CSV file contains column names in the first line
Connection is already built
File name is test.csv
Table name is MyTable
Python 3
...
with open('test.csv', 'r') as f:
    reader = csv.reader(f)
    columns = next(reader)
    query = 'insert into MyTable({0}) values ({1})'
    query = query.format(','.join(columns), ','.join('?' * len(columns)))
    cursor = connection.cursor()
    for data in reader:
        cursor.execute(query, data)
    cursor.commit()
If column names are not included in the file:
...
with open('test.csv', 'r') as f:
    reader = csv.reader(f)
    data = next(reader)
    query = 'insert into MyTable values ({0})'
    query = query.format(','.join('?' * len(data)))
    cursor = connection.cursor()
    cursor.execute(query, data)
    for data in reader:
        cursor.execute(query, data)
    cursor.commit()
I modified the code written above by Brian as follows, since the version posted above wouldn't work on the delimited files that I was trying to upload. The line row.pop() can be ignored; it was necessary only for the particular set of files that I was trying to upload.
import csv

def upload_table(path, filename, delim, cursor):
    """
    Function to upload flat file to sqlserver
    """
    tbl = filename.split('.')[0]
    cnt = 0
    with open(path + filename, 'r') as f:
        reader = csv.reader(f, delimiter=delim)
        for row in reader:
            row.pop()  # can be commented out
            row = ['NULL' if val == '' else val for val in row]
            row = [x.replace("'", "''") for x in row]
            out = "'" + "', '".join(str(item) for item in row) + "'"
            out = out.replace("'NULL'", 'NULL')
            query = "INSERT INTO " + tbl + " VALUES (" + out + ")"
            cursor.execute(query)
            cnt = cnt + 1
            if cnt % 10000 == 0:
                cursor.commit()
        cursor.commit()
    print("Uploaded " + str(cnt) + " rows into table " + tbl + ".")
You can pass the columns as arguments. For example:
for rows in csv_data:  # Iterate through csv
    cur.execute("INSERT INTO MyTable(Col1,Col2,Col3,Col4) VALUES (?,?,?,?)", *rows)
If you are using MySqlHook in Airflow, and cursor.execute() with params throws an error
TypeError: not all arguments converted during string formatting
then use %s instead of ?:
import csv
from airflow.hooks.mysql_hook import MySqlHook  # in Airflow 2.x: airflow.providers.mysql.hooks.mysql

with open('/usr/local/airflow/files/ifsc_details.csv', 'r') as csv_file:
    csv_reader = csv.reader(csv_file)
    columns = next(csv_reader)
    query = '''insert into ifsc_details({0}) values({1});'''
    query = query.format(','.join(columns), ','.join(['%s'] * len(columns)))
    mysql = MySqlHook(mysql_conn_id='local_mysql')
    conn = mysql.get_conn()
    cursor = conn.cursor()
    for data in csv_reader:
        cursor.execute(query, data)
    conn.commit()  # commit on the connection; MySQLdb cursors have no commit()
I got it sorted out. The error was due to the size restriction on the table columns. I changed the column capacity, e.g. from col1 varchar(10) to col1 varchar(35), etc. Now it's working fine.
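For anyone hitting the same truncation error, the column can be widened in place rather than recreating the table. A minimal sketch, assuming the MyTable/Col1 names from the question and a new size of 35:

cur.execute("ALTER TABLE MyTable ALTER COLUMN Col1 VARCHAR(35)")  # widen so longer strings fit
cnxn.commit()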
Here is the script; hope this works for you:
import pandas as pd
import pyodbc as pc
connection_string = "Driver=SQL Server;Server=localhost;Database={0};Trusted_Connection=Yes;"
cnxn = pc.connect(connection_string.format("DataBaseNameHere"), autocommit=True)
cur=cnxn.cursor()
df= pd.read_csv("your_filepath_and_filename_here.csv").fillna('')
query = 'insert into TableName({0}) values ({1})'
query = query.format(','.join(df.columns), ','.join('?' * len(df.columns)))
cur.fast_executemany = True
cur.executemany(query, df.values.tolist())
cnxn.close()
You can also import data into SQL Server by using any of the following:
The SQL Server Import and Export Wizard
SQL Server Integration Services (SSIS)
The OPENROWSET function
More details can be found on this webpage:
https://learn.microsoft.com/en-us/sql/relational-databases/import-export/import-data-from-excel-to-sql?view=sql-server-2017
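As a rough illustration of the server-side route, here is BULK INSERT (a close relative of OPENROWSET(BULK ...)) run from Python. This is a sketch only, assuming a MyTable table and a file path that the SQL Server service account itself can read, since the server, not Python, opens the file:

import pyodbc

conn = pyodbc.connect('DRIVER={ODBC Driver 13 for SQL Server};'
                      r'SERVER=.\SQLExpress;DATABASE=mydatabase;Trusted_Connection=yes;',
                      autocommit=True)
# The file is parsed by SQL Server itself, so this scales well for large CSVs
conn.cursor().execute(
    "BULK INSERT MyTable FROM 'C:\\data\\mytable.csv' "
    "WITH (FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '\\n')")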