I am trying to speed up the pandas .to_sql() function, as it currently takes ~30 minutes to dump a table of 22 columns and 100K rows to a MS SQL Server database. I've tried using method='multi' and chunksize=1000 (which I've read is the maximum for SQL Server), but I get the following error, with a bunch of ?s in the error and my data in the [parameters: section:
DBAPIError: (pyodbc.Error) ('07002', '[07002] [Microsoft][ODBC Driver 17 for SQL Server]COUNT field incorrect or syntax error (0) (SQLExecDirectW)')
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
[parameters: (one big tuple)]
Here is the code I am using:
from sqlalchemy import create_engine

user_name = 'username'
cred = open('filename', 'r').read()
server_name = 'XXXXXXXX'
port = 'XXXX'
DB = 'database'
driver = 'ODBC Driver 17 for SQL Server'

# Note: the user:password and server parts of the URL are joined by '@'
conn = create_engine('mssql+pyodbc://' + user_name + ':' + cred + '@' + server_name + ':' + port + '/' + DB + '?driver=' + driver)
df.to_sql('test_table', con=conn, if_exists='replace', schema='dbo', method='multi', chunksize=1000)
Any ideas on what is happening here or another alternative to speed this up?
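With method='multi', pandas expands every cell of a chunk into its own bound parameter, and SQL Server caps a single statement at 2100 parameters, so 22 columns × 1000 rows blows well past the limit; that is the usual cause of this exact 07002 "COUNT field incorrect" error. A sketch of both workarounds (the connection URL is a placeholder):

from sqlalchemy import create_engine

# Placeholder URL -- substitute your own credentials, server, and database.
engine = create_engine(
    'mssql+pyodbc://username:password@server:1433/database'
    '?driver=ODBC+Driver+17+for+SQL+Server',
    fast_executemany=True,  # pyodbc batches parameters client-side; often much faster
)

# Keep rows-per-chunk * columns under SQL Server's 2100-parameter cap.
safe_chunksize = 2100 // len(df.columns) - 1  # ~94 rows for a 22-column frame

df.to_sql('test_table', con=engine, if_exists='replace', schema='dbo',
          method='multi', chunksize=safe_chunksize)

# Alternatively, drop method='multi' entirely and let fast_executemany
# do the batching -- for wide tables this is usually the faster option.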
Related question:
I am using sqlite3 in Python, and whenever I insert data it replaces the existing data in the DB instead of adding rows.
I can't end up with more than one row.
--
def success(self):
    global cursor, conn
    wallet = random.randint(1000000000, 9999999999)
    if self.checkID:
        query = [self.fNEntry.get(), self.lNEntry.get(), self.idEntry.get(), self.pwEntry.get(), self.emEntry.get(),
                 self.pNEntry.get(), wallet, 1000]
        cursor = conn.execute('INSERT INTO STUDENTS(FNAME, LNAME, ID, PASSWORD, EMAIL, PHONE, WALLET, BALANCE) '
                              'VALUES(?, ?, ?, ?, ?, ?, ?, ?);', query)
        conn.commit()
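For what it's worth, plain parameterized INSERTs in sqlite3 always append; they only appear to "replace" if something else wipes the table between runs (a DROP TABLE/CREATE TABLE at startup is a common culprit, though that is a guess here). A minimal sketch with a hypothetical database file:

import sqlite3

conn = sqlite3.connect('school.db')  # hypothetical file name
conn.execute('CREATE TABLE IF NOT EXISTS STUDENTS '  # IF NOT EXISTS preserves old rows
             '(FNAME, LNAME, ID, PASSWORD, EMAIL, PHONE, WALLET, BALANCE)')

for i, name in enumerate(('alice', 'bob')):
    conn.execute('INSERT INTO STUDENTS VALUES (?, ?, ?, ?, ?, ?, ?, ?)',
                 (name, 'x', i, 'pw', 'a@b.c', '555-0100', 1234567890, 1000))
conn.commit()

# Row count grows by two on every run -- INSERT never overwrites.
print(conn.execute('SELECT COUNT(*) FROM STUDENTS').fetchone()[0])
conn.close()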
I have to make a request to a Brazil ZIP code API to get JSON data and insert it into a SQLite database using Python. I'm currently using PyCharm. I need to insert a lot of columns, but somehow the code doesn't insert the values. Here's the code:
import requests
import sqlite3
import json

CEPC = input("Please type the zipcode:")
print("Identifying the ZIP CODE")
Requisicao = requests.get(f"https://viacep.com.br/ws/{CEPC}/json")
if Requisicao.status_code == 200:
    data = Requisicao.json()
    # Database
    con = sqlite3.connect("Banco de dados/CEPS.db")
    cur = con.cursor()
    cur.execute("DROP TABLE IF EXISTS Requisicao")
    cur.execute("CREATE TABLE Requisicao (cep, logradouro, bairro, uf, ddd, siafi, "
                "validation, created json)")
    cur.executemany("insert into Requisicao values (?, ?, ?, ?, ?, ?, ?, ?)",
                    (data["cep"], json.dumps(data)))
    con.commit()
    con.close()
else:
    print(f"Request failed with status code {Requisicao.status_code} ")
The output for the zipcode is:
{
"cep": "05565-000",
"logradouro": "Avenida General Asdrúbal da Cunha",
"complemento": "",
"bairro": "Jardim Arpoador",
"localidade": "São Paulo",
"uf": "SP",
"ibge": "3550308",
"gia": "1004",
"ddd": "11",
"siafi": "7107"
}
I need to insert all of these columns: "cep, logradouro, complemento, bairro, localidade, uf, ibge, gia, ddd, siafi". When I try to run the code, it gives me this error:
Traceback (most recent call last):
  File "C:\Users\Gui\PycharmProjects\pythonProject\main.py", line 19, in <module>
    cur.executemany("insert into Requisicao values (?, ?, ?, ?, ?, ?, ?, ?)", (data["cep"],
    json.dumps(data)))
sqlite3.ProgrammingError: Incorrect number of bindings supplied. The current statement uses 8, and there are 9 supplied.
When I instead pass exactly as many values as there are "?" placeholders, the error flips to "uses 8, and there are 7 supplied".
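The confusing counts come from how executemany interprets its second argument: each element is treated as one parameter set, and a string is itself a sequence of single characters. So the 9-character string "05565-000" supplies nine one-character bindings against eight placeholders. A tiny sketch of the effect:

# executemany iterates its second argument; each item must be one full row.
# A bare string acts as a sequence of characters:
cep = "05565-000"
print(len(cep))   # -> 9, hence "there are 9 supplied"
print(list(cep))  # -> ['0', '5', '5', '6', '5', '-', '0', '0', '0']

# execute (singular) with one complete row avoids the problem entirely.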
This code will insert all 10 values from the JSON into the table Requisicao, plus 0 for both validation and created, though that can be changed.
import requests
import sqlite3
import json

CEPC = input("Please type the zipcode:")
print("Identifying the ZIP CODE")
Requisicao = requests.get(f"https://viacep.com.br/ws/{CEPC}/json")
if Requisicao.status_code == 200:
    data = Requisicao.json()
    # Database
    con = sqlite3.connect("CEPS.db")
    cur = con.cursor()
    cur.execute("DROP TABLE IF EXISTS Requisicao")
    cur.execute("CREATE TABLE Requisicao (cep, logradouro, complemento, bairro, localidade, uf, ibge, gia, ddd, siafi, validation, created)")
    cur.execute("insert into Requisicao values (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
                tuple(data.values()) + (0, 0))
    con.commit()
    con.close()
else:
    print(f"Request failed with status code {Requisicao.status_code} ")
I was trying to insert a CSV file into an already existing table in the SSMS database. I have a date column in my data, but I keep getting this error when I try to insert. Please tell me where I am going wrong, because the server connection and extracting data from the database work fine. Below is the code.
with open("combine.csv", encoding="utf8") as f:
csvreader = csv.reader(f)
csvdata = []
for row in csvreader:
csvdata.append(row)
print(csvdata)
for row in csvdata:
# Insert a row of data
print(row)
if len(row)>=8:
data = [row[0],row[1],row[2],row[3],row[4],row[5],row[6],row[7]]
cursor.execute("INSERT INTO BILLING_COPY (DATE, DEPARTMENT_NUMBER, DEPARTMENT_NAME, DIVISION_CODE, DIVISION_NAME, O_T_AMT, R_AMT, U_AMT ) VALUES (?, ?, ?, ?, ?, ?, ?, ?)", data)
Error:
File "", line 7, in
cursor.execute("INSERT INTO BILLING_COPY (DATE, DEPARTMENT_NUMBER, DEPARTMENT_NAME, DIVISION_CODE, DIVISION_NAME, O_T_AMT, R_AMT, U_AMT ) VALUES (?, ?, ?, ?, ?, ?, ?, ?)", data)
DataError: ('22007', '[22007] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Conversion failed when converting date and/or time from character string. (241) (SQLExecDirectW)')
File "", line 7, in
cursor.execute("INSERT INTO BILLING_COPY (DATE, DEPARTMENT_NUMBER, DEPARTMENT_NAME, DIVISION_CODE, DIVISION_NAME, O_T_AMT, R_AMT, U_AMT ) VALUES (?, ?, ?, ?, ?, ?, ?, ?)", data)
I think the placeholder you used in VALUES (?, ?, ?, etc.) is not a valid data type; try using %d or %s instead.
Here is an example:
mySql_insert_query = """INSERT INTO Laptop (Id, Name, Price, Purchase_date)
VALUES
(10, 'ProductValues SP99', 6459, '2019-12-27') """
cursor = connection.cursor()
cursor.execute(mySql_insert_query)
connection.commit()
My two cents: it's better to assign the insert query to a variable, just like the data variable.
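Applying that advice to the original pyodbc code while keeping the ? placeholders (which pyodbc does accept; the %s style belongs to drivers such as mysql-connector) might look like this sketch:

# Query in a variable, parameters passed separately.
insert_query = (
    "INSERT INTO BILLING_COPY (DATE, DEPARTMENT_NUMBER, DEPARTMENT_NAME, "
    "DIVISION_CODE, DIVISION_NAME, O_T_AMT, R_AMT, U_AMT) "
    "VALUES (?, ?, ?, ?, ?, ?, ?, ?)"
)
cursor.execute(insert_query, data)
connection.commit()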
Newbie here, trying to import multiple CSV files into SQL Server. The code runs, but no data is inserted into the database.
Attached is my code; maybe the error lies in the loop.
Please help.
import csv
import pyodbc as p
import os

# Database Connection Info
server = "cld-077\eform"
database = "E-form"
username = "wsmeform"
password = "M1loA1s!"
connStr = (
    'DRIVER={ODBC Driver 13 for SQL Server};SERVER=' + server + ';DATABASE=' + database + ';UID=' + username + ';PWD=' + password)

# Open connection to SQL Server Table
conn = p.connect(connStr)

# Get cursor
cursor = conn.cursor()

# Assign path to the CSV files
print("Inserting!")
folder_to_import = 'C:/Users/ck.law/Desktop/VBFU_NOV/'
print("path")
l_files_to_import = os.listdir(folder_to_import)
print("inside loop")
for file_to_import in l_files_to_import:
    if file_to_import.endswith('.csv'):
        csv_files = os.path.join(folder_to_import, file_to_import)
        csv_data = csv.reader(csv_files)
        for row in csv_data:
            if len(row) >= 19:
                cursor.execute(
                    "INSERT INTO VesselBFUData(ShortCode,DocDT,PostDT,DocNo,LineItm,GlCode,ExpType,InvRef,VBaseCurrcy,VBaseAmt,DocCurrcy,DocAmt,VendorCode,Description,InvFilePath,InvCreateDT,InvAppvDT,InvArriDT,PoRef) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
                    row)
        print("Loop!")
cursor.close()
conn.commit()
conn.close()
print("Script has successfully run!")
I'm trying to write an entire folder of CSV files into a SQL Server table.
I'm getting the following error, and I'm really stumped:
Traceback (most recent call last):
  File "C:\Projects\Import_CSV.py", line 37, in <module>
    cursor.execute("INSERT INTO HED_EMPLOYEE_DATA(Company, Contact, Email, Name, Address, City, CentralCities, EnterpriseZones, NEZ, CDBG)" "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)", row)
DataError: ('22001', '[22001] [Microsoft][SQL Server Native Client 10.0][SQL Server]String or binary data would be truncated. (8152) (SQLExecDirectW); [01000] [Microsoft][SQL Server Native Client 10.0][SQL Server]The statement has been terminated. (3621)')
I'm not sure what's wrong in my code. I also need it to skip the first row of each CSV file, as that is the header row. Any help would be greatly appreciated. Thank you.
# Import modules
import csv
import arcpy
import pyodbc as p
import os

# Database Connection Info
server = "myServer"
database = "myDB"
connStr = ('DRIVER={SQL Server Native Client 10.0};SERVER=' + server + ';DATABASE=' + database + ';' + 'Trusted_Connection=yes')

# Open connection to SQL Server Table
conn = p.connect(connStr)

# Get cursor
cursor = conn.cursor()

# Assign path to the CSV files
folder_to_import = "\\\\Server\\HED_DATA_CSV"
l_files_to_import = os.listdir(folder_to_import)

for file_to_import in l_files_to_import:
    if file_to_import.endswith('.CSV'):
        csv_files = os.path.join(folder_to_import, file_to_import)
        csv_data = csv.reader(file(csv_files))  # Python 2: file() opens the file
        for row in csv_data:
            cursor.execute("INSERT INTO HED_EMPLOYEE_DATA(Company, Contact, Email, Name, Address, City, CentralCities, EnterpriseZones, NEZ, CDBG)" "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)", row)

cursor.close()
conn.commit()
conn.close()
print "Script has successfully run!"
You can skip the first line this way:

csv_data.next()  # throw away the header row
for row in csv_data:
    if len(row) >= 10:
        cursor.execute("INSERT ..." ...)

Also, you should check to make sure that row contains enough elements before executing:

if len(row) >= 10:  # use the first ten values in row, if there are at least ten
    cursor.execute("INSERT ...", row[:10])
Finally, you currently have your INSERT statement written as two string literals next to each other. Python joins adjacent literals with nothing in between, so the statement ends up reading ...CDBG)VALUES (...; you probably want a space before "VALUES".
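Putting those three fixes together, the import loop might look like the sketch below. It uses open() and next(reader), which work in both Python 2 and 3, in place of the original file() and reader.next():

for file_to_import in l_files_to_import:
    if file_to_import.endswith('.CSV'):
        csv_path = os.path.join(folder_to_import, file_to_import)
        csv_data = csv.reader(open(csv_path))
        next(csv_data)  # skip the header row
        for row in csv_data:
            if len(row) >= 10:  # guard against short rows
                cursor.execute(
                    "INSERT INTO HED_EMPLOYEE_DATA(Company, Contact, Email, Name, "
                    "Address, City, CentralCities, EnterpriseZones, NEZ, CDBG) "
                    "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",  # note the space before VALUES
                    row[:10])

conn.commit()
conn.close()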