MySQL LOAD DATA INFILE of CSV from Python (not working)

After some data manipulation I store two columns in a txt file in CSV format, as follows:
result.txt ->
id,avg
0,38.0
1,56.5
3,66.5
4,48.666666666666664
Then I store the data in a table, which is where I run into trouble. Running a .sql script that loads the data works fine, but executing the same query from Python doesn't seem to work for some reason.
Python code ->
.
.
.
open('C:/ProgramData/MySQL/MySQL Server 8.0/Uploads/result.txt', 'w').write(res)
print(res)
try:
    with mysql.connector.connect(
        host="localhost",
        user='root',
        password='tt',
        database="dp",
    ) as connection:
        clear_table_query = "drop table if exists test_db.marks;"
        create_table_query = '''
        create table test_db.marks (
            id varchar(255) not null,
            avg varchar(255) not null,
            primary key (id)
        );
        '''
        # dropping the table and recreating it works fine
        add_csv_query = "LOAD DATA INFILE 'C:/ProgramData/MySQL/MySQL Server 8.0/Uploads/result.txt' INTO TABLE marks FIELDS TERMINATED BY ',' ENCLOSED BY '\"' LINES TERMINATED BY '\\n' IGNORE 1 LINES;"
        print(add_csv_query) # query is printed correctly
        with connection.cursor() as cursor:
            cursor.execute(clear_table_query)
            cursor.execute(create_table_query)
            cursor.execute(add_csv_query)
            cursor.execute("SELECT * FROM test_db.marks;") # this produces -> Unread result found
except mysql.connector.Error as e:
    print(e)
connection.close()
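No answer is shown for this question, but two likely culprits stand out (assumptions on my part, not confirmed in the thread): the file is written with a bare open(...).write(res) and may not be flushed and closed before LOAD DATA INFILE reads it, and mysql-connector-python disables autocommit by default, so the loaded rows are discarded when the connection closes. The SELECT also needs its result consumed, which is what the "Unread result found" error refers to. A minimal sketch of those changes, reusing the variables from the snippet above:

# Sketch only: close/flush the file before loading it, commit the LOAD DATA,
# and consume the SELECT result (mysql-connector-python has autocommit off).
with open('C:/ProgramData/MySQL/MySQL Server 8.0/Uploads/result.txt', 'w') as f:
    f.write(res)  # the context manager closes (and flushes) the file

with connection.cursor() as cursor:
    cursor.execute(clear_table_query)
    cursor.execute(create_table_query)
    cursor.execute(add_csv_query)
    connection.commit()  # persist the loaded rows
    cursor.execute("SELECT * FROM test_db.marks;")
    print(cursor.fetchall())  # consume the result to avoid "Unread result found"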

Related

sqlite3.OperationalError: near "(": syntax error with Python SQLite [duplicate]

I have a small problem with a piece of code I copied from a website; I get the following error:
sqlite3.OperationalError: near "(": syntax error
The code is the following:
# Import required modules
import csv
import sqlite3
# Connecting to the geeks database
connection = sqlite3.connect('isaDBCommune.db')
# Creating a cursor object to execute
# SQL queries on a database table
cursor = connection.cursor()
# Table Definition
create_table = '''CREATE TABLE IF NOT EXISTS isaCommune(
id_codedep_codecommune INTEGER NOT NULL,
nom_commune TEXT NOT NULL,
code_postal INTEGER NOT NULL,
code_commune INTEGER NOT NULL,
code_departement INTEGER NOT NULL,
nom_departement TEXT NOT NULL,
code_region INTEGER NOT NULL
)'''
# Creating the table into our
# database
cursor.execute(create_table)
# Opening the person-records.csv file
file = open('commune.csv')
# Reading the contents of the
# person-records.csv file
contents = csv.reader(file)
# SQL query to insert data into the
# person table
insert_records = "INSERT INTO isaCommune (id_codedep_codecommune, nom_commune, code_postal, code_commune, code_departement, nom_departement, code_region) VALUES ('id_codedep_codecommune', 'nom_commune', 'code_postal', 'code_commune', 'code_departement', 'nom_departement', 'code_region')"
# Importing the contents of the file
# into our person table
cursor.executemany (insert_records, contents)
# SQL query to retrieve all data from
# the person table To verify that the
# data of the csv file has been successfully
# inserted into the table
select_all = "SELECT * FROM isaCommune"
rows = cursor.execute(select_all).fetchall()
What would be the solution? I have searched all over Stack Overflow and I can't find one. Any solution, or an explanation of this error that is hidden from me? Thanks.
Update: after applying a correction I now get a new error:
sqlite3.ProgrammingError: Incorrect number of bindings supplied. The current statement uses 0, and there are 1 supplied.
This should be your answer:
import csv
import sqlite3

connection = sqlite3.connect('isaDBCommune.db')
cursor = connection.cursor()

create_table = '''CREATE TABLE IF NOT EXISTS isaCommune(
    id_codedep_codecommune TEXT NOT NULL,
    nom_commune TEXT NOT NULL,
    code_postal TEXT NOT NULL,
    code_commune TEXT NOT NULL,
    code_departement TEXT NOT NULL,
    nom_departement TEXT NOT NULL,
    code_region TEXT NOT NULL
)'''
cursor.execute(create_table)

file = open('commune.csv')
contents = csv.reader(file)

for l in contents:
    insert_records = """INSERT INTO isaCommune ('id_codedep_codecommune', 'nom_commune', 'code_postal','code_commune','code_departement','nom_departement','code_region')
        VALUES(?,?,?,?,?,?,?)"""
    a = (l[0], l[1], l[2], l[3], l[4], l[5], l[6],)
    cursor.execute(insert_records, a)

select_all = "SELECT * FROM isaCommune"
rows = cursor.execute(select_all).fetchall()

for row in rows:
    print(row)
Hope it will work now...
Alternatively, you can replace each '?' with the value you want to insert into the corresponding column, matching its type (INTEGER, TEXT, etc.).
For example:
insert_records = "INSERT INTO isaCommune (id_codedep_codecommune, nom_commune, code_postal, code_commune, code_departement, nom_departement, code_region) VALUES (1, 'test', 1, 1, 1, 'test', 1)"

Primary key constraint gets removed when creating postgres table from pandas dataframe

I am trying to create a few tables in Postgres from a pandas DataFrame, but I keep getting this error:
psycopg2.errors.InvalidForeignKey: there is no unique constraint matching given keys for referenced table "titles"
After looking into this problem for hours, I finally found that when I insert the data into the parent table from the pandas DataFrame, the primary key constraint gets removed for some reason, and because of that I get this error when trying to reference it from another table.
But I do not have this problem when I use pgAdmin 4 to create the table and insert a few rows of data manually.
When I create the tables using pgAdmin, the primary key and foreign keys are created as expected and I have no problem with them.
But when I insert the data from the pandas DataFrame using the psycopg2 library, the primary key is not created.
I can't understand why this is happening.
The code I am using to create the tables:
# function for faster data insertion
def psql_insert_copy(table, conn, keys, data_iter):
    """
    Execute SQL statement inserting data

    Parameters
    ----------
    table : pandas.io.sql.SQLTable
    conn : sqlalchemy.engine.Engine or sqlalchemy.engine.Connection
    keys : list of str
        Column names
    data_iter : Iterable that iterates the values to be inserted
    """
    # gets a DBAPI connection that can provide a cursor
    dbapi_conn = conn.connection
    with dbapi_conn.cursor() as cur:
        s_buf = StringIO()
        writer = csv.writer(s_buf)
        writer.writerows(data_iter)
        s_buf.seek(0)
        columns = ", ".join('"{}"'.format(k) for k in keys)
        if table.schema:
            table_name = "{}.{}".format(table.schema, table.name)
        else:
            table_name = table.name
        sql = "COPY {} ({}) FROM STDIN WITH CSV".format(table_name, columns)
        cur.copy_expert(sql=sql, file=s_buf)


def create_titles_table():
    # connect to the database
    conn = psycopg2.connect(
        dbname="imdb",
        user="postgres",
        password=os.environ.get("DB_PASSWORD"),
        host="localhost",
    )
    # create a cursor
    c = conn.cursor()
    print()
    print("Creating titles table...")
    c.execute(
        """CREATE TABLE IF NOT EXISTS titles(
            title_id TEXT PRIMARY KEY,
            title_type TEXT,
            primary_title TEXT,
            original_title TEXT,
            is_adult INT,
            start_year REAL,
            end_year REAL,
            runtime_minutes REAL
        )
        """
    )
    # commit changes
    conn.commit()
    # read the title data
    df = load_data("title.basics.tsv")
    # replace \N with nan
    df.replace("\\N", np.nan, inplace=True)
    # rename columns
    df.rename(
        columns={
            "tconst": "title_id",
            "titleType": "title_type",
            "primaryTitle": "primary_title",
            "originalTitle": "original_title",
            "isAdult": "is_adult",
            "startYear": "start_year",
            "endYear": "end_year",
            "runtimeMinutes": "runtime_minutes",
        },
        inplace=True,
    )
    # drop the genres column
    title_df = df.drop("genres", axis=1)
    # convert the data types from str to numeric
    title_df["start_year"] = pd.to_numeric(title_df["start_year"], errors="coerce")
    title_df["end_year"] = pd.to_numeric(title_df["end_year"], errors="coerce")
    title_df["runtime_minutes"] = pd.to_numeric(
        title_df["runtime_minutes"], errors="coerce"
    )
    # create SQLAlchemy engine
    engine = create_engine(
        "postgresql://postgres:" + os.environ["DB_PASSWORD"] + "@localhost:5432/imdb"
    )
    # insert the data into titles table
    title_df.to_sql(
        "titles", engine, if_exists="replace", index=False, method=psql_insert_copy
    )
    # commit changes
    conn.commit()
    # close cursor
    c.close()
    # close the connection
    conn.close()
    print("Completed!")
    print()


def create_genres_table():
    # connect to the database
    conn = psycopg2.connect(
        dbname="imdb",
        user="postgres",
        password=os.environ.get("DB_PASSWORD"),
        host="localhost",
    )
    # create a cursor
    c = conn.cursor()
    print()
    print("Creating genres table...")
    c.execute(
        """CREATE TABLE IF NOT EXISTS genres(
            title_id TEXT NOT NULL,
            genre TEXT,
            FOREIGN KEY (title_id) REFERENCES titles(title_id)
        )
        """
    )
    # commit changes
    conn.commit()
    # read the data
    df = load_data("title.basics.tsv")
    # replace \N with nan
    df.replace("\\N", np.nan, inplace=True)
    # rename columns
    df.rename(columns={"tconst": "title_id", "genres": "genre"}, inplace=True)
    # select only relevant columns
    genres_df = df[["title_id", "genre"]].copy()
    genres_df = genres_df.assign(genre=genres_df["genre"].str.split(",")).explode(
        "genre"
    )
    # create engine
    engine = create_engine(
        "postgresql://postgres:" + os.environ["DB_PASSWORD"] + "@localhost:5432/imdb"
    )
    # insert the data into genres table
    genres_df.to_sql(
        "genres", engine, if_exists="replace", index=False, method=psql_insert_copy
    )
    # commit changes
    conn.commit()
    # close cursor
    c.close()
    # close the connection
    conn.close()
    print("Completed!")
    print()


if __name__ == "__main__":
    print()
    print("Creating IMDB Database...")
    # connect to the database
    conn = psycopg2.connect(
        dbname="imdb",
        user="postgres",
        password=os.environ.get("DB_PASSWORD"),
        host="localhost",
    )
    # create the titles table
    create_titles_table()
    # create genres table
    create_genres_table()
    # close the connection
    conn.close()
    print("Done with Everything!")
    print()
I think the problem is to_sql(if_exists="replace"). Try using to_sql(if_exists="append") - my understanding is that "replace" drops the whole table and creates a new one with no constraints.
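As a sketch of that suggestion (an illustration of the append approach, not a confirmed fix from the thread): keep the CREATE TABLE issued with c.execute above, including its PRIMARY KEY, as the source of truth, and let pandas only append rows:

# Sketch: the table (and its PRIMARY KEY) is created by the c.execute(...) DDL above;
# if_exists="append" inserts into that existing table instead of dropping and
# recreating it, so the constraint survives and the genres FK can reference it.
title_df.to_sql(
    "titles", engine, if_exists="append", index=False, method=psql_insert_copy
)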

syntax error unexpected character after line continuation character

Can anybody tell me what's wrong with my program? When I run it I get the following error message:
SyntaxError: unexpected character after line continuation character
import sqlite3
sqlite_file = 'my_first_db.sqlite' # NAME OF THE SQL DATABASE FILE
table_name1 = 'my_table_1' # NAME OF THE TABLE TO BE CREATED.
table_name2 = 'my_table_2' # NAME OF THE SECOND TABLE TO BE CREATED.
new_filed = 'my_1st_coulmn' # NAME OF THE COLUMN
filed_type = 'INTEGER' # COLUMN DATA TYPE
# CONNECTING TO DATABASE FILE
conn = sqlite3.connect(sqlite_file)
c = conn.cursor()
# CREATING NEW SQLITE TABLE WITH 1 COLUMN
c.execute('create table {tn} ({nf}) {ft})'\ .format(tn=table_name1,nf=new_filed,ft=filed_type))
# Creating a second table with 1 column and set it as PRIMARY KEY
# note that PRIMARY KEY column must consist of unique values!
c.execute('create table {tn} ({nf}) {ft} primary key)'\.format(tn=table_name2,nf=new_filed,ft=filed_type))
# Committing changes and closing the connection to the database file
conn.commit()
conn.close()
The backslash (\) is used for line continuation when splitting long statements across multiple lines.
Since the rest of your statement continues on the same line, just remove the \ before .format() when creating and executing the query:
# CREATING NEW SQLITE TABLE WITH 1 COLUMN
c.execute('create table {tn} ({nf}) {ft})'.format(tn=table_name1,nf=new_filed,ft=filed_type))
For more information on string formatting, see https://pyformat.info/
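Note also (my observation, not part of the original answer) that the format string itself has a misplaced parenthesis: 'create table {tn} ({nf}) {ft})' renders as create table my_table_1 (my_1st_coulmn) INTEGER), which SQLite rejects. A sketch of the intended statements, reusing the question's variables:

# Sketch: backslashes removed and the column type kept inside the parentheses
c.execute('CREATE TABLE {tn} ({nf} {ft})'.format(
    tn=table_name1, nf=new_filed, ft=filed_type))
c.execute('CREATE TABLE {tn} ({nf} {ft} PRIMARY KEY)'.format(
    tn=table_name2, nf=new_filed, ft=filed_type))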

error of creating tables on SQL server 2008 R2 from python3.2 and pyodbc on win7

I am trying to access SQL Server 2008 R2 from Eclipse PyDev (Python 3.2) on Windows 7.
I need to create a table in a database.
The code runs without errors, but no table is created in the database.
If I print the SQL string and run the query from SQL Server Management Studio, there are no problems.
import pyodbc
sql_strc = " IF OBJECT_ID(\'[my_db].[dbo].[my_table]\') IS NOT NULL \n"
sql_strc1 = " DROP TABLE [my_db].[dbo].[my_table] \n"
sql_stra = " CREATE TABLE [my_db].[dbo].[my_table] \n"
sql_stra1 = "(\n"
sql_stra1a = " person_id INT NOT NULL PRIMARY KEY, \n"
sql_stra1b = " value float NULL, \n"
sql_stra1r = "); \n"
sql_str_create_table = sql_strc + sql_strc1 + sql_stra + sql_stra1 + sql_stra1a + sql_stra1b + sql_stra1r
# create table
sql_str_connect_db = "DRIVER={SQL server};SERVER={my_db};DATABASE={my_table};UID=my_id; PWD=my_password"
cnxn = pyodbc.connect(sql_str_connect_db)
cursor = cnxn.cursor()
cursor.execute( sql_str_create_table)
Any help would be appreciated.
Thanks
Autocommit is off by default; add the following to commit your change:
cnxn.commit()
Some unsolicited advice for making your code more readable:
Remove unnecessary escape characters from SQL strings
Use triple-quote (""") syntax when defining multiline strings. Newline characters are preserved and don't need to be explicitly added.
Use keywords in the connect function call (this is trivial, but I think it makes formatting easier)
With these changes, your final code looks something like:
import pyodbc
sql = """
IF OBJECT_ID('[my_db].[dbo].[my_table]') IS NOT NULL
DROP TABLE [my_db].[dbo].[my_table]
CREATE TABLE [my_db].[dbo].[my_table]
(
person_id INT NOT NULL PRIMARY KEY,
value FLOAT NULL
)
"""
cnxn = pyodbc.connect(driver='{SQL Server}', server='server_name',
database='database_name', uid='uid', pwd='pwd')
cursor = cnxn.cursor()
# create table
cursor = cursor.execute(sql)
cnxn.commit()
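Alternatively (a sketch, not part of the answer above), pyodbc can be asked to commit automatically by passing autocommit=True to connect, which removes the need for the explicit cnxn.commit():

# Sketch: with autocommit enabled, the DDL takes effect without an explicit commit()
cnxn = pyodbc.connect(driver='{SQL Server}', server='server_name',
                      database='database_name', uid='uid', pwd='pwd',
                      autocommit=True)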

my python script does not fully execute my stored proc

My stored procedure works fine on its own, but my Python script fails to fully execute the stored procedure on my downloaded files. The purpose of the Python script is to download files using FTP and store them locally. It first compares the remote location with the local location to find new files, then downloads the new files to the local location, and then executes the stored procedure on each new file.
Python script:
import os
import ftplib
import pyodbc
# connection to sql server
conn = pyodbc.connect('DRIVER={SQL Server};SERVER=localhost;DATABASE=Development;UID=myid;PWD=mypassword')
cursor = conn.cursor()
ftp = ftplib.FTP("myftpaddress.com")
ftp.login("loginname", "password")
print 'ftp on'
#directory listing
rfiles = ftp.nlst()
print 'remote listing'
#save local directory listing to files
lfiles = os.listdir(r"D:\Raw_Data\myFiles")
print 'local listing'
#compare and find files in rfiles but not in lfiles
nfiles = set(rfiles) - set(lfiles)
nfiles = list(nfiles)
print 'compared listings'
#loop through the new files
#download the new files and open each file and run stored proc
#close files and disconnect to sql server
for n in nfiles:
    local_filename = os.path.join(r"D:\Raw_Data\myFiles",n)
    lf = open(local_filename, "wb")
    ftp.retrbinary("RETR " + n, lf.write, 1024)
    lf.close()
    print 'file written'
    cursor.execute("exec SP_my_Dailyfiles('n')")
    conn.close()
    lf.close()
    print 'sql executed'
ftp.quit()
stored proc:
ALTER PROCEDURE [dbo].[SP_my_Dailyfiles]
-- Add the parameters for the stored procedure here
@file VARCHAR(255)
AS
BEGIN
IF EXISTS(SELECT * FROM sysobjects WHERE name = 'myinvoice')
DROP TABLE dbo.myinvoice
----------------------------------------------------------------------------------------------------
CREATE TABLE myinvoice(
[Billing] varchar(255)
,[Order] varchar(45)
,[Item] varchar(255)
,[Quantity in pack] varchar(255)
,[Invoice] varchar(255)
,[Date] varchar(255)
,[Cost] varchar(255)
,[Quantity of pack] varchar(255)
,[Extended] varchar(255)
,[Type] varchar(25)
,[Date Due] varchar(255)
)
----------------------------------------------------------------------------------------------------
DECLARE @SourceDirectory VARCHAR(255)
DECLARE @SourceFile VARCHAR(255)
EXEC (' BULK
INSERT dbo.myinvoice
FROM ''D:\Raw_Data\myfile\'+@file+'''
WITH
(
FIRSTROW = 1,
FIELDTERMINATOR = '','',
ROWTERMINATOR = ''0x0a''
)'
)
-------------------------------------------------------------------------------------------------------------
INSERT INTO [Development].[dbo].[my_Dailyfiles](
[Billing]
,[Order]
,[Item]
,[Quantity in pack]
,[Invoice]
,[Date]
,[Cost]
,[Quantity of pack]
,[Extended]
,[Type]
,[Date Due]
,[FileName]
,[IMPORTEDDATE]
)
SELECT
replace([Billing], '"', '')
,replace([Order], '"', '')
,replace([Item], '"','')
,replace([Quantity in pack],'"','')
,replace([Invoice],'"','')
,cast(replace([Date],'"','') as varchar(255)) as date
,replace([Cost],'"','')
,replace([Quantity of pack],'"','')
,replace([Extended],'"','')
,replace([Type],'"','')
,cast(replace([Date Due],'"','') as varchar(255)) as date
,@file,
GetDate()
FROM [myinvoice] WHERE [Bill to] <> ' ' and ndc != '"***********"'
I think the problem may be that you are closing the DB connection immediately after you execute the stored procedure, whilst still in the loop.
This means that the second time around the loop, the DB connection is already closed when you try to execute the SP. I would actually expect an error to be thrown the second time around the loop.
The way I would structure this is something like:
conn = pyodbc.connect(...)

for n in nfiles:
    ...
    cursor = conn.cursor()
    cursor.execute("exec SP_my_Dailyfiles('n')")
    conn.commit()
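One further hedged note (my addition, not part of the answer above): "exec SP_my_Dailyfiles('n')" passes the literal string 'n' rather than the current filename. pyodbc can bind the loop variable with a ? placeholder, for example:

# Sketch: bind the current filename instead of the literal string 'n'
cursor.execute("EXEC SP_my_Dailyfiles ?", n)
conn.commit()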
