Bulk load failed due to invalid column value in CSV data file - python

I am trying to import data from a .csv file into a SQL Server table. This works fine when I run the SQL in Microsoft SQL Server Management Studio (SSMS). However, when I try to do it from Python with pyodbc, it gives me the following error:
pyodbc.ProgrammingError: ('42000', '[42000] [Microsoft][SQL Server Native Client 11.0]
[SQL Server]Bulk load failed due to invalid column value in CSV data file
C:/~pathToFile~/file.csv in row 2, column 38. (4879) (SQLExecDirectW);
[42000] [Microsoft][SQL Server Native Client 11.0][SQL Server]The OLE DB
provider "BULK" for linked server "(null)" reported an error.
The provider did not give any information about the error. (7399);
[42000] [Microsoft][SQL Server Native Client 11.0][SQL Server]Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)". (7330)')
My code so far is:
import pyodbc

# ------------------------------------------------------
# DEFINE FUNCTIONS
# ------------------------------------------------------
# Define a database command
def DB(SQL):
    cnxn = pyodbc.connect("Driver={SQL Server Native Client 11.0};"
                          "Server=~myServer~;"
                          "Database=~myDB~;"
                          "Trusted_Connection=yes;")
    try:
        with cnxn.cursor() as cursor:
            cursor.execute(SQL)
        cnxn.commit()
    finally:
        cnxn.close()
# ------------------------------------------------------
# Import from .csv file to table
# ------------------------------------------------------
sql = '''BULK INSERT dbo.~myDB~
FROM 'C:/~pathToFile~/file.csv'
WITH (
FORMAT='CSV',
FIRSTROW = 2,
ROWTERMINATOR = '\n',
FIELDQUOTE= '"',
TABLOCK
)'''
DB(sql)
And here are the first few lines of the .csv file I'm trying to import:
SITE_ID,FIELD_SAMPLE_ID,LOCATION_ID,SAMPLE_DATE,PARAMETER_NAME,REPORT_RESULT,REPORT_UNITS,LAB_QUALIFIER,DETECTED,SAMPLE_MATRIX,SAMPLE_PURPOSE,SAMPLE_TYPE,SAMPLE_TIME,LATITUDE_(DECIMAL),LONGITUDE_(DECIMAL),FILTERED,FIELD_SAMPLE_COMMENTS,LAB_MATRIX,COC_#,LAB_METHOD,REPORT_DETECTION_LIMIT,SOURCE_FILENAME,WTR_SOURCE_FLOW,VALIDATION_QUALIFIER,VALIDATION_REASON_CODES,ANALYSIS_DATE,RESULT_TYPE,PARAMETER_CODE,LAB_RESULT,DILUTION_FACTOR,METHOD_DETECTION_LIMIT,INSTRUMENT_DETECTION_LIMIT,ANALYSIS_TYPE_CODE,ANALYSIS_TIME,QC_BATCH_SEQUENCE_#,SAMPLE_RESULT_COMMENTS,LAB_SAMPLE_ID,FIELD_SAMPLE_RESULT_RECORD_ID
"N3B","CAPA-08-11017","03-B-10","03-17-2008","RDX","0.325","ug/L","U","N","W","REG","WG","10:40","35.873716600000","-106.330115800000","N",,"W","08-824","SW-846:8321A_MOD","0.33",,"N","U","U_LAB","03-26-2008","TRG","121-82-4","0.325","2","0.13",,"INIT","00:00",,,"204935003","638"
"N3B","CAPA-08-13138","03-B-10","06-12-2008","RDX","0.325","ug/L","U","N","W","REG","WG","10:35","35.873716600000","-106.330115800000","N",,"W","08-1350","SW-846:8321A_MOD","0.33",,"N","U","U_LAB","06-24-2008","TRG","121-82-4","0.325","2","0.13",,"INIT","00:00",,,"210389014","638"
"N3B","CAPA-08-13139","03-B-10","06-12-2008","RDX","0.325","ug/L","U","N","W","FB","WG","10:35","35.873716600000","-106.330115800000","N",,"W","08-1350","SW-846:8321A_MOD","0.33",,"N","U","U_LAB","06-24-2008","TRG","121-82-4","0.325","2","0.13",,"INIT","00:00",,,"210389017","638"
Any idea why this will not work? Again, it works fine from SSMS, just not Python/pyodbc.

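In some cases, the row terminator needs a carriage return as well as the newline. The error points at column 38, the last column in the file, which is consistent with a stray carriage return being left at the end of each row.
Try adding \r to the ROWTERMINATOR parameter:
ROWTERMINATOR = '\r\n'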
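One possible reason the same statement behaves differently from Python than from SSMS (this is an assumption, not something the answer states): inside a regular Python string literal, `\n` is converted to a real line feed character before the SQL text is sent, so the server never sees the two-character escape it receives from SSMS. The sketch below only compares the strings Python produces and does not touch a database; `dbo.my_table`, the file path and the helper `terminator_of` are placeholders made up for illustration.

```python
# Regular triple-quoted string: Python turns \n into a real line feed (1 char).
regular_sql = '''BULK INSERT dbo.my_table
FROM 'C:/path/to/file.csv'
WITH (FORMAT = 'CSV', FIRSTROW = 2, ROWTERMINATOR = '\n')'''

# Raw string: the backslash and the "n" are passed through unchanged.
raw_sql = r'''BULK INSERT dbo.my_table
FROM 'C:/path/to/file.csv'
WITH (FORMAT = 'CSV', FIRSTROW = 2, ROWTERMINATOR = '\n')'''

# Explicit CRLF terminator, written as a raw string so that both
# escape sequences reach the server as typed.
crlf_sql = r'''BULK INSERT dbo.my_table
FROM 'C:/path/to/file.csv'
WITH (FORMAT = 'CSV', FIRSTROW = 2, ROWTERMINATOR = '\r\n')'''


def terminator_of(sql):
    """Return the text between the quotes that follow ROWTERMINATOR = ."""
    marker = "ROWTERMINATOR = '"
    start = sql.index(marker) + len(marker)
    end = sql.index("'", start)
    return sql[start:end]


# repr() makes the difference visible: one control character versus
# the literal backslash sequences.
for label, sql in [("regular", regular_sql), ("raw", raw_sql), ("crlf", crlf_sql)]:
    print(label, repr(terminator_of(sql)), len(terminator_of(sql)))
```

Whether the raw string or the explicit `'\r\n'` is the right choice depends on the line endings actually present in the file, so it is worth checking those first.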

Related

Connecting Python to Remote SQL Server

I am trying to connect Python to our remote SQL Server, but I cannot get it to work. This is the code I used:
import pyodbc

server = 'server,1433'
database = 'db'
username = 'username'
password = 'pw'
cnxn = pyodbc.connect('DRIVER={ODBC Driver 17 for SQL Server};SERVER='+server+';DATABASE='+database+';UID='+username+';PWD='+ password)
cursor = cnxn.cursor()
cursor.execute('SELECT top 1 * FROM db.dbo.t_location')
for row in cursor:
    print(row)
We have two servers. One is the database server, but I use the application server for SQL, which connects to the database server. This is the error I'm getting. I have been trying for a week, but I'm not sure what I am missing here.
Any help would be appreciated.
OperationalError: ('08001', '[08001] [Microsoft][ODBC Driver 17 for SQL Server]TCP Provider: No such host is known.\r\n (11001) (SQLDriverConnect); [08001] [Microsoft][ODBC Driver 17 for SQL Server]Login timeout expired (0); [08001] [Microsoft][ODBC Driver 17 for SQL Server]A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible. Check if instance name is correct and if SQL Server is configured to allow remote connections. For more information see SQL Server Books Online. (11001)')
ADDED:
import pandas as pd
import pyodbc

connection_str = ("Driver={SQL Server Native Client 11.0};"
                  "Server= 10.174.124.12,1433;"
                  #"Port= 1433;"
                  "Database=AAD;"
                  r"UID=dom\user;"  # raw string: "\u" in a normal literal is a syntax error
                  "PWD=password;"
                  )
connection = pyodbc.connect(connection_str)
data = pd.read_sql("select top 1 * from dbo.t_location with (nolock);", connection)
I used the code above and now I see the error below. It seems the connection worked but the login failed. Usually I use Windows authentication in SSMS after entering my credentials to log in to the remote desktop.
('28000', "[28000] [Microsoft][SQL Server Native Client 11.0][SQL Server]Login failed for user 'dom\user'. (18456) (SQLDriverConnect); [28000] [Microsoft][SQL Server Native Client 11.0][SQL Server]Login failed for user 'dom\user'. (18456)")
Answer:
I am excited that I finally found a solution using pymssql. I don't know why pyodbc wasn't working, but I am sure I must have done something wrong. I used the code below to get the data from the remote SQL Server using Python.
import pandas as pd
import pymssql

conn = pymssql.connect(
    host=r'10.174.124.12',
    user=r'dom\user',
    password=r'password',
    database='db'
)
cursor = conn.cursor(as_dict=True)
cursor.execute('Select top 4 location_id, description from t_location with (nolock)')
data = cursor.fetchall()
data_df = pd.DataFrame(data)
cursor.close()
Ignore my code at this moment. I still have to do some cleaning but this code will work.
Finally, to answer my question: I had to use pymssql, which worked. I did not have to specify the port number, which had been confusing me. Thanks, everyone, for taking the time to answer.
You can use this function:
import pyodbc

def connectSqlServer(Server, Database, Port, User, Password):
    try:
        conn = pyodbc.connect('Driver={SQL Server}; Server=' + Server + '; '
                              'Database=' + Database + '; Port=' + Port +
                              '; UID=' + User + '; PWD=' + Password + ';')
        cursor = conn.cursor()
    except Exception as e:
        print("An error occurred when connecting to DB, error details: {}".format(e))
        return False, None
    else:
        return True, cursor

Invalid connection error ('08001') for importing CSV to SQL Server

I am looking for a way to insert a CSV file into SQL Server, and when inserting the data I ran into the error below.
I have read a lot about this issue and made several changes, but nothing helped and I still can't import the CSV into the relevant table. I also can't understand why this error happens.
Microsoft SQL Server is installed on my local computer:
conn = pyodbc.connect('Driver=SQL Server; Server=ServerName\MSSQLSERVER; Database=DBName;Trusted_Connection=yes ; UID=Administrator')
Error:
pyodbc.OperationalError: ('08001', '[08001] [Microsoft][ODBC SQL Server Driver][DBMSLPCN]Invalid connection. (14) (SQLDriverConnect); [08001] [Microsoft][ODBC SQL Server
Driver][DBMSLPCN]ConnectionOpen (ParseConnectParams()). (14)')
For anyone who faces the problem above: if you want to connect to SQL Server with SQL authentication, use the following format:
pyodbc.connect('Driver={SQL Server}; Server=ServerName; '
               'Database=DBName; UID=UserName; PWD= {Password};')
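The difference between the failing string (which mixes `Trusted_Connection=yes` with a `UID`) and the working one can be sketched as a small builder that emits either Windows authentication or SQL authentication, never both. This is an illustrative helper, not part of pyodbc: the names `odbc_quote` and `build_connection_string` are made up here, and the brace-wrapping of the driver and password follows the usual ODBC convention of doubling a closing brace inside a braced value.

```python
def odbc_quote(value):
    """Wrap a value in braces so ';' and similar characters are kept.

    A closing brace inside the value is doubled (assumed ODBC escaping rule).
    """
    return "{" + str(value).replace("}", "}}") + "}"


def build_connection_string(driver, server, database,
                            user=None, password=None, port=None):
    """Build a connection string for SQL or Windows authentication."""
    if (user is None) != (password is None):
        raise ValueError("give both user and password, or neither")
    # With the SQL Server drivers the port follows the host after a comma.
    target = f"{server},{port}" if port is not None else server
    parts = [
        f"Driver={odbc_quote(driver)}",
        f"Server={target}",
        f"Database={database}",
    ]
    if user is None:
        parts.append("Trusted_Connection=yes")       # Windows authentication
    else:
        parts.append(f"UID={user}")                  # SQL Server authentication
        parts.append(f"PWD={odbc_quote(password)}")
    return ";".join(parts) + ";"


# Placeholder values taken from the post above.
print(build_connection_string("SQL Server", "ServerName", "DBName"))
print(build_connection_string("SQL Server", "ServerName", "DBName",
                              user="UserName", password="Password"))
```

The resulting string would then be passed to `pyodbc.connect(...)` as in the answer above.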

pyodbc authentication error

I'm trying to connect to SQL Server and run a query using the following code in Python:
import pyodbc
cnxn = pyodbc.connect("Driver={SQL Server Native Client 11.0};"
"Server = server_name;"
"Database = database_name;"
"UID = user;"
"PWD = password;")
cursor = cnxn.cursor()
cursor.execute('SELECT TOP 10 [column] FROM [table]')
for row in cursor:
    print('row = %r' % (row,))
I'm getting the following error:
Traceback (most recent call last):
File "filename", line 3, in <module>
cnxn = pyodbc.connect("Driver={SQL Server Native Client 11.0};"
pyodbc.Error: ('28000', "[28000] [Microsoft][SQL Server Native Client 11.0][SQL Server]Login failed for user 'username'. (18456) (SQLDriverConnect)")
("filename" and "username" inserted above as placeholders)
This is the same error, regardless of what I change the SQL Server username and password to in the code, and the user in the error is my windows login username.
I've also tried replacing UID and PWD with:
"Trusted_connection=yes"
... to no avail. I get the exact same error back.
I've tried several solutions from similar posts on Stack Overflow and elsewhere, but with no luck. Ideas about the cause of the problem, or an alternative means of connecting to the database, would be appreciated.
Thanks so much

Connecting to ODBC using pyODBC

I've read all the FAQ pages of the Python ODBC library as well as other examples, and managed to connect to the DSN using the following code:
cnxn = pyodbc.connect("DSN=DSNNAME")
cursor = cnxn.cursor()
cursor.tables()
rows = cursor.fetchall()
for row in rows:
    print(row.table_name)
but for everything else I keep getting this error:
Error: ('IM002', '[IM002] [Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified (0) (SQLDriverConnect)')
I know that I can pull up my data in Microsoft Access with the following steps: create a new database; click the External Data tab; click More and select ODBC Database; choose "Link to the data source by creating a linked table"; in the Select Data Source window, choose Machine Data Source and select NAME2, which has a System type; press OK and choose the table acr.Table_one_hh; then select the fields I want to look at, such as City, State, Country, Region, etc. When I hover over the table name, it shows the DSN name, Description, Trusted Connection = Yes, APP, Database name and the table name.
I've attempted two methods. First:
cnxn = pyodbc.connect('DRIVER={SQL Server Native Client 10.0};SERVER=mycomputername;DATABASE=mydatabase;Trusted_Connection=yes;')
cursor = cnxn.cursor()
which gives an error:
Error: ('08001', '[08001] [Microsoft][SQL Server Native Client 10.0]Named Pipes Provider: Could not open a connection to SQL Server [2]. (2) (SQLDriverConnect); [HYT00] [Microsoft][SQL Server Native Client 10.0]Login timeout expired (0); [08001] [Microsoft][SQL Server Native Client 10.0]A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible. Check if instance name is correct and if SQL Server is configured to allow remote connections. For more information see SQL Server Books Online. (2)')
I tried
cnxn = pyodbc.connect("DSN=DSNNAME, DATABASE=mydatabase")
cursor = cnxn.cursor()
cursor.execute("""SELECT 1 AS "test column1" from acr.Table_one_hh""")
cursor.tables()
rows = cursor.fetchall()
for row in rows:
    print(row.table_name)
which gave an error
Error: ('IM002', '[IM002] [Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified (0) (SQLDriverConnect)')
I managed to solve my issue. My code did not really change.
cnxn = pyodbc.connect("DSN=BCTHEAT")
cursor = cnxn.cursor()
cursor.execute("select * from acr.Table_one_hh")
row = cursor.fetchall()
then I wrote the results into a csv file.
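The last step, writing the fetched rows to a CSV file, is not shown in the post. A minimal sketch of how it might look with the standard `csv` module follows; the function name `write_rows_to_csv`, the output file name and the sample data are assumptions for illustration. With a live pyodbc cursor the column names and rows would come from `cursor.description` and `cursor.fetchall()`, as noted in the comments.

```python
import csv


def write_rows_to_csv(path, column_names, rows):
    """Write a header line followed by one CSV line per row."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(column_names)
        writer.writerows(rows)


# Stand-in data; with a real cursor this would be, for example:
#   column_names = [col[0] for col in cursor.description]
#   rows = cursor.fetchall()
column_names = ["City", "State"]
rows = [("Springfield", "IL"), ("Portland", "OR")]
write_rows_to_csv("table_one_hh.csv", column_names, rows)
```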

How to backup a database by pyodbc

The BACKUP statement can't be used in a transaction when it is executed with a pyodbc cursor. It seems that pyodbc executes the query inside a default transaction.
I have also tried using autocommit mode and adding a commit statement before the backup statement. Neither works.
#can't execute the backup statement in transaction
cur.execute("backup database database_name to disk = 'backup_path'")
#not working too
cur.execute("commit;backup database database_name to disk = 'backup_path'")
Is it possible to execute the backup statement by pyodbc? Thanks in advance!
-----Added additional info-----------------------------------------------------------------------
The backup operation is encapsulated in a function such as:
def backupdb(con, name, save_path):
    # with autocommit mode, should be pyodbc.connect(con, autocommit=True)
    con = pyodbc.connect(con)
    query = "backup database %s to disk = '%s'" % (name, save_path)
    cur = con.cursor()
    cur.execute(query)
    cur.commit()
    con.close()
If the function is called with the following code,
backupdb(r'DRIVER={SQL Server};SERVER=.\sqlexpress;DATABASE=master;Trusted_Connection=yes',
         'DatabaseName',
         'd:\\DatabaseName.bak')
then the exception will be:
File "C:/Documents and Settings/Administrator/Desktop/bakdb.py", line 14, in <module>'d:\\DatabaseName.bak')
File "C:/Documents and Settings/Administrator/Desktop/bakdb.py", line 7, in backupdb cur.execute(query)
ProgrammingError: ('42000', '[42000] [Microsoft][ODBC SQL Server Driver][SQL Server]Cannot perform a backup or restore operation within a transaction. (3021) (SQLExecDirectW); [42000] [Microsoft][ODBC SQL Server Driver][SQL Server]BACKUP DATABASE is terminating abnormally. (3013)')
With the keyword autocommit=True enabled, the function runs silently, but no backup file is generated in the backup folder.
Assuming you are using SQL Server, specify autocommit=True when the connection is built:
>>> import pyodbc
>>> connection = pyodbc.connect(driver='{SQL Server Native Client 11.0}',
...                             server='InstanceName', database='master',
...                             trusted_connection='yes', autocommit=True)
>>> backup = "BACKUP DATABASE [AdventureWorks] TO DISK = N'AdventureWorks.bak'"
>>> cursor = connection.cursor().execute(backup)
>>> connection.close()
This is using pyodbc 3.0.7 with Python 3.3.2. I believe with older versions of pyodbc you needed to use Cursor.nextset() for the backup file to be created. For example:
>>> import pyodbc
>>> connection = pyodbc.connect(driver='{SQL Server Native Client 11.0}',
...                             server='InstanceName', database='master',
...                             trusted_connection='yes', autocommit=True)
>>> backup = r"E:\AdventureWorks.bak"
>>> sql = "BACKUP DATABASE [AdventureWorks] TO DISK = N'{0}'".format(backup)
>>> cursor = connection.cursor().execute(sql)
>>> while cursor.nextset():
...     pass
...
>>> connection.close()
It's worth noting that I didn't have to use Cursor.nextset() for the backup file to be created with the current version of pyodbc and SQL Server 2008 R2.
After hours I found a solution. It must be performed on MASTER, other sessions must be terminated, and the database must be set to OFFLINE, then restored, and then set to ONLINE again.
import pyodbc

def backup_and_restore():
    server = 'localhost,1433'
    database = 'myDB'
    username = 'SA'
    password = 'password'
    cnxn = pyodbc.connect('DRIVER={ODBC Driver 17 for SQL Server};SERVER='+server+';DATABASE=MASTER;UID='+username+';PWD='+ password)
    cnxn.autocommit = True

    def execute(cmd):
        # Run one statement and consume all pending result sets
        cursor = cnxn.cursor()
        cursor.execute(cmd)
        while cursor.nextset():
            pass
        cursor.close()

    execute("BACKUP DATABASE [myDB] TO DISK = N'/usr/src/app/myDB.bak'")
    # do something .......
    execute("ALTER DATABASE [myDB] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;")
    execute("ALTER DATABASE [myDB] SET OFFLINE;")
    execute("RESTORE DATABASE [myDB] FROM DISK = N'/usr/src/app/myDB.bak' WITH REPLACE")
    execute("ALTER DATABASE [myDB] SET ONLINE;")
    execute("ALTER DATABASE [myDB] SET MULTI_USER;")
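The answers above build the BACKUP statement by string formatting and then loop over `nextset()`. A rough sketch of both pieces as reusable helpers follows. The helper names (`quote_identifier`, `quote_string`, `build_backup_sql`, `execute_and_drain`) are made up for illustration. The quoting follows the usual T-SQL conventions of doubling `]` inside brackets and `'` inside string literals, and `execute_and_drain` assumes a DB-API style connection that was opened with autocommit enabled, as the answers describe.

```python
def quote_identifier(name):
    """Bracket-quote a SQL Server identifier, doubling any ']' inside it."""
    return "[" + name.replace("]", "]]") + "]"


def quote_string(text):
    """Build a T-SQL Unicode string literal, doubling single quotes."""
    return "N'" + text.replace("'", "''") + "'"


def build_backup_sql(database, path):
    """Compose a BACKUP DATABASE statement for the given database and file."""
    return (f"BACKUP DATABASE {quote_identifier(database)} "
            f"TO DISK = {quote_string(path)}")


def execute_and_drain(connection, sql):
    """Run one statement and consume every pending result set.

    The connection is expected to be in autocommit mode already.
    """
    cursor = connection.cursor()
    try:
        cursor.execute(sql)
        while cursor.nextset():
            pass
    finally:
        cursor.close()


# Only the statement text is shown here; no database is contacted.
print(build_backup_sql("myDB", "/usr/src/app/myDB.bak"))
```

With a real connection the call would look like `execute_and_drain(cnxn, build_backup_sql("myDB", "/usr/src/app/myDB.bak"))`.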
