Problem
I am trying to read a CSV file into pandas and write it to a SQLite database. The process works for every column in the CSV file except "Fill qty", which holds positive integers (int64). For that column, the stored type changes from TEXT/INTEGER to BLOB.
So I tried to load only the "Fill qty" column from pandas into SQLite, and to my surprise it works for all integers smaller than 10 (there is no 9 in my dataset, so in practice 1, 2, ..., 8 loaded successfully).
Here is what I tried:
I tried everything I could think of: changing the "Fill_Qty" type in the schema from INTEGER to REAL, NULL or TEXT, and changing the data type in pandas from int64 to float or string before inserting into the SQLite table. None of it worked. The "Trade_History.csv" file looks fine in both pandas and Excel. Is there something I am not seeing? I am really confused about what is happening here.
You would need the .csv file to test the code. Here is the code and .csv file: https://github.com/Meisam-Heidari/Trading_Min_code
The code:
### Imports:
import pandas as pd
import numpy as np
import sqlite3
from sqlite3 import Error

def create_database(db_file):
    try:
        conn = sqlite3.connect(db_file)
    finally:
        conn.close()

def create_connection(db_file):
    """ create a database connection to the SQLite database
    specified by db_file
    :param db_file: database file
    :return: Connection object or None
    """
    try:
        conn = sqlite3.connect(db_file)
        return conn
    except Error as e:
        print(e)
    return None

def create_table(conn,table_name):
    try:
        c = conn.cursor()
        c.execute('''CREATE TABLE {} (Fill_Qty TEXT);'''.format(table_name))
    except Error as e:
        print('Error Code: ', e)
    finally:
        conn.commit()
        conn.close()
    return None

def add_trade(conn, table_name, trade):
    try:
        print(trade)
        sql = '''INSERT INTO {} (Fill_Qty)
        VALUES(?)'''.format(table_name)
        cur = conn.cursor()
        cur.execute(sql,trade)
    except Error as e:
        print('Error When trying to add this entry: ',trade)
    return cur.lastrowid

def write_to_db(conn,table_name,df):
    for i in range(df.shape[0]):
        trade = (str(df.loc[i,'Fill qty']))
        add_trade(conn,table_name,trade)
    conn.commit()

def update_db(table_name='My_Trades', db_file='Trading_DB.sqlite', csv_file_path='Trade_History.csv'):
    df_executions = pd.read_csv(csv_file_path)
    create_database(db_file)
    conn = create_connection(db_file)
    table_name = 'My_Trades'
    create_table(conn, table_name)
    # writing to DB
    conn = create_connection(db_file)
    write_to_db(conn,table_name,df_executions)
    # Reading back from DB
    df_executions = pd.read_sql_query("select * from {};".format(table_name), conn)
    conn.close()
    return df_executions

### Main Body:
df_executions = update_db()
Any alternatives
Has anyone had a similar experience? Any advice or solutions that would help me load the data into SQLite?
I am trying to keep this light and portable, so unless there is no alternative, I would prefer not to go with Postgres or MySQL.
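You're not passing a container to .execute() when inserting the data: (str(x)) is just a parenthesized string, not a tuple. Because a string is itself a sequence, each of its characters is bound as a separate parameter, which is why only single-digit values were inserted. Reference: https://www.python.org/dev/peps/pep-0249/#id15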
What you need to do instead is:
trade = (df.loc[i,'Fill qty'],)
# ^ this comma makes `trade` into a tuple
The types of errors you got would've been:
ValueError: parameters are of unsupported type
Or:
sqlite3.ProgrammingError: Incorrect number of bindings supplied. The
current statement uses 1, and there are 2 supplied.
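To make the difference concrete, here is a minimal sketch using an in-memory SQLite database. The table and column names are borrowed from the question; the sample quantities are made up for illustration.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE My_Trades (Fill_Qty INTEGER)")

sql = "INSERT INTO My_Trades (Fill_Qty) VALUES (?)"

# ("100") is just the string "100": a sequence of three characters,
# so sqlite3 sees three bindings for a single placeholder.
try:
    conn.execute(sql, ("100"))
    bare_string_error = None
except sqlite3.ProgrammingError as exc:
    bare_string_error = exc

# A one-element tuple supplies exactly one binding per row.
for qty in [5, 100, 2500]:  # made-up sample quantities
    conn.execute(sql, (qty,))
conn.commit()

stored = [row[0] for row in conn.execute("SELECT Fill_Qty FROM My_Trades ORDER BY rowid")]
conn.close()
```

If the values come straight from a pandas column they may be numpy.int64 objects rather than plain Python integers. Converting each one with int() before binding is a cautious choice, since sqlite3 is only documented to adapt the built-in int, float, str and bytes types.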
The code is the following (I am new to Python/MySQL):
import mysql.connector
conn = mysql.connector.connect(host='localhost',user='user1',password='puser1',db='mm')
cursor = conn.cursor()
string1 = 'test1'
insert_query = """INSERT INTO items_basic_info (item_name) VALUES (%s)""", (string1)
cursor.execute(insert_query)
conn.commit()
When I run this code I get this error:
Traceback (most recent call last):
  File "test3.py", line 9, in <module>
    cursor.execute(insert_query)
  File "C:\Users\Emanuele-PC\AppData\Local\Programs\Python\Python36\lib\site-packages\mysql\connector\cursor.py", line 492, in execute
    stmt = operation.encode(self._connection.python_charset)
AttributeError: 'tuple' object has no attribute 'encode'
I have seen different answers to this problem but the cases were quite different from mine and I couldn't really understand where I am making mistakes. Can anyone help me?
To avoid SQL injection, keep the query and its values separate and use placeholders, as the Django documentation recommends:
import mysql.connector
conn = mysql.connector.connect(host='localhost',user='user1',password='puser1',db='mm')
cursor = conn.cursor()
string1 = 'test1'
insert_query = """INSERT INTO items_basic_info (item_name) VALUES (%s)"""
cursor.execute(insert_query, (string1,))
conn.commit()
You have to pass the parameters as a tuple or list in the second argument of the execute method, and then everything should work.
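The AttributeError in the question follows from plain Python syntax, which the short sketch below illustrates without needing a database. The variable names not_a_tuple and one_tuple are only for illustration.

```python
string1 = "test1"

# The trailing ", (string1)" makes the right-hand side a 2-tuple of
# (query text, string). Passing that tuple as the operation is what
# leads to "'tuple' object has no attribute 'encode'".
insert_query = """INSERT INTO items_basic_info (item_name) VALUES (%s)""", (string1)

# Parentheses alone do not create a tuple; the comma does.
not_a_tuple = (string1)   # still a str
one_tuple = (string1,)    # a tuple with one element
```

Keeping the query text and the parameter tuple in two separate variables makes this kind of slip easier to spot.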
Not exactly the OP's problem, but I got stuck for a while writing multiple values to MySQL.
Following on from Jefferson Houp's answer: if you are adding multiple strings, give each value its own placeholder and pass them all in the parameter tuple. The 'multi=True' argument of 'cursor.execute' is meant for an operation string that contains several SQL statements, so it is not needed here.
import mysql.connector
conn = mysql.connector.connect(host='localhost',user='user1',password='puser1',db='mm')
cursor = conn.cursor()
string1 = 'test1'
string2 = 'test2'
# One placeholder per value; each parenthesized group becomes one row.
insert_query = """INSERT INTO items_basic_info (item_name) VALUES (%s), (%s)"""
cursor.execute(insert_query, (string1, string2))
conn.commit()
I have a problem creating an SQL query for an Oracle database using Python.
I want to bind a string variable and it does not work. Could you tell me what I am doing wrong?
This is my code:
import cx_Oracle

dokList = []

def LoadDatabase():
    conn = None
    cursor = None
    try:
        conn = cx_Oracle.connect("login", "password", "localhost")
        cursor = conn.cursor()
        query = "SELECT * FROM DOCUMENT WHERE DOC = :param"
        for doknumber in dokList:
            cursor.execute(query, {'doknr':doknumber})
            print(cursor.rowcount)
    except cx_Oracle.DatabaseError as err:
        print(err)
    finally:
        if cursor:
            cursor.close()
        if conn:
            conn.close()

def CheckData():
    with open('changedNamed.txt') as f:
        lines = f.readlines()
        for line in lines:
            dokList.append(line)

CheckData()
LoadDatabase()
The output of cursor.rowcount is 0, but it should be a number greater than 0.
You're using a dictionary ({'doknr' : doknumber}) for your parameter, so it's a named parameter - the :param needs to match the key name. Try this:
query = "SELECT * FROM DOCUMENT WHERE DOC = :doknr"
for doknumber in dokList:
    cursor.execute(query, {'doknr':doknumber})
    print(cursor.rowcount)
For future troubleshooting, to check whether your parameter is getting passed properly, you can also try changing your query to "select :param from dual".
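The name-matching rule can be tried without an Oracle instance. The sketch below uses Python's built-in sqlite3 module as a stand-in, on the assumption that its :name placeholder style behaves like cx_Oracle's for this purpose. The table contents and the raw_lines list are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DOCUMENT (DOC TEXT)")
conn.executemany("INSERT INTO DOCUMENT (DOC) VALUES (?)", [("A-1",), ("B-2",)])

# Stand-in for f.readlines(): each line keeps its trailing newline,
# so it is stripped before being used as a bind value.
raw_lines = ["A-1\n", "C-3\n"]
dok_list = [line.strip() for line in raw_lines]

# The placeholder name (:doknr) matches the dictionary key ('doknr').
query = "SELECT * FROM DOCUMENT WHERE DOC = :doknr"
matches = {}
for doknumber in dok_list:
    rows = conn.execute(query, {"doknr": doknumber}).fetchall()
    matches[doknumber] = len(rows)

# A dictionary whose key does not match the placeholder name is rejected.
try:
    conn.execute(query, {"param": "A-1"})
    mismatch_error = None
except sqlite3.ProgrammingError as exc:
    mismatch_error = exc

conn.close()
```

Note that lines returned by readlines() keep their trailing newline, so even with the right placeholder name an unstripped value may fail to match any row.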
I am trying to insert info from a pandas DataFrame into a database table by using a function that I wrote:
def insert(table_name="", name="", genere="", year=1, impd_rating=float(1)):
    conn = psycopg2.connect("dbname='database1' user='postgres' password='postgres333' host='localhost' port=5433 ")
    cur = conn.cursor()
    cur.execute("INSERT INTO %s VALUES %s,%s,%s,%s" % (table_name, name, genere, year, impd_rating))
    conn.commit()
    conn.close()
When I try to use this function like this:
b=0
for row in DF['id']:
    insert(impd_rating=float(DF['idbm_rating'][b]),
           year=int(DF['year'][b]),
           name=str(DF['name'][b]),
           genere=str(DF['genere'][b]),
           table_name='test_movies')
    b = b+1
I get the following syntax error:
SyntaxError: invalid syntax
PS D:\tito\scripts\database training> python .\postgres_script.py
Traceback (most recent call last):
  File ".\postgres_script.py", line 56, in <module>
    insert (impd_rating=float(DF['idbm_rating'][b]),year=int(DF['year'][b]),name=str(DF['name'][b]),genere=str(DF['genere'][b]),table_name='test_movies')
  File ".\postgres_script.py", line 15, in insert
    cur.execute("INSERT INTO %s VALUES %s,%s,%s,%s" % (table_name ,name ,genere , year,impd_rating))
psycopg2.ProgrammingError: syntax error at or near "Avatar"
LINE 1: INSERT INTO test_movies VALUES Avatar,action,2009,7.9
I also tried to change the str replacement method from %s to .format()
but I had the same error.
The error message is explicit: this SQL command is wrong at Avatar: INSERT INTO test_movies VALUES Avatar,action,2009,7.9. The values must be enclosed in parentheses and character strings must be quoted, so the correct SQL is:
INSERT INTO test_movies VALUES ('Avatar','action',2009,7.9)
But building a full SQL command by concatenating parameters is bad practice (*). Only the table name should be inserted directly into the command, because it is not a SQL parameter. The correct way is to use a parameterized query (psycopg2 uses %s as its placeholder):
cur.execute("INSERT INTO {} VALUES (%s, %s, %s, %s)".format(table_name), (name, genere, year, impd_rating))
(*) It has been the cause of numerous SQL injection flaws: if one of the parameters contains a semicolon (;), what comes after it could be interpreted as a new command.
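The effect of a parameterized query can be seen in a minimal sketch that runs without a PostgreSQL server. It uses the built-in sqlite3 module as a stand-in, so the placeholder is ? rather than psycopg2's %s. The allow-list check on the table name and the hostile sample string are illustrative assumptions, not part of the original function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_movies (name TEXT, genere TEXT, year INTEGER, impd_rating REAL)"
)

# Hypothetical guard: table names cannot be bound as parameters,
# so only names from a fixed allow-list are formatted into the SQL.
ALLOWED_TABLES = {"test_movies"}

def insert(conn, table_name, name, genere, year, impd_rating):
    if table_name not in ALLOWED_TABLES:
        raise ValueError("unexpected table name: {!r}".format(table_name))
    sql = "INSERT INTO {} VALUES (?, ?, ?, ?)".format(table_name)
    # The values travel separately from the SQL text, so quotes and
    # semicolons inside them are stored as data, not executed.
    conn.execute(sql, (name, genere, year, impd_rating))

hostile = "Avatar'); DROP TABLE test_movies; --"
insert(conn, "test_movies", hostile, "action", 2009, 7.9)
conn.commit()

rows = conn.execute(
    "SELECT name, genere, year, impd_rating FROM test_movies"
).fetchall()

# A table name outside the allow-list is refused before any SQL runs.
try:
    insert(conn, "test_movies; DROP TABLE test_movies", "x", "y", 2000, 1.0)
    rejected = False
except ValueError:
    rejected = True
```

The hostile string ends up in the name column unchanged and the table is still there afterwards, which is the behaviour the footnote above is asking for.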
Pandas has a DataFrame method for this, to_sql:
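from sqlalchemy import create_engine
# Only needs to be executed once. to_sql expects a SQLAlchemy engine
# (or a sqlite3 connection), not a raw psycopg2 connection.
engine = create_engine("postgresql://postgres:postgres333@localhost:5433/database1")
df.to_sql('test_movies', con=engine, if_exists='append', index=False)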
This should hopefully get you going in the right direction.
In your original query
INSERT INTO %s VALUES %s,%s,%s,%s
there is an SQL problem: you need parentheses around the values, i.e. it should be VALUES (%s, %s, %s, %s). On top of that, the table name cannot be merged as a parameter, or it would be escaped as a string, which is not what you want.
You can use the psycopg 2.7 sql module to merge the table name to the query, with placeholders for the values:
from psycopg2 import sql
query = sql.SQL("INSERT INTO {} VALUES (%s, %s, %s, %s)").format(
    sql.Identifier('test_movies'))
cur.execute(query, ('Avatar','action',2009,7.9))
This safely handles both merging the table name into the query and passing the arguments to it.
Hello mohamed mahrous,
First install the psycopg2 package to access the PostgreSQL database.
Try the code below:
import psycopg2

conn = psycopg2.connect("dbname='database1' user='postgres' password='postgres333' host='localhost' port=5433 ")
cur = conn.cursor()

def insert(table_name, name, genere, year, impd_rating):
    # Only the table name is concatenated; the values are passed as parameters.
    query = "INSERT INTO " + table_name + "(name,genere,year,impd_rating) VALUES(%s,%s,%s,%s)"
    try:
        print(query)
        cur.execute(query, (name, genere, year, impd_rating))
    except Exception as e:
        print("Not executed:", e)
    conn.commit()

b = 0
for row in DF['id']:
    insert(impd_rating=float(DF['idbm_rating'][b]), year=int(DF['year'][b]), name=str(DF['name'][b]), genere=str(DF['genere'][b]), table_name='test_movies')
    b = b + 1
conn.close()
Example with literal values:
import psycopg2

conn = psycopg2.connect("dbname='database1' user='postgres' password='postgres333' host='localhost' port=5433 ")
cur = conn.cursor()

def insert(table_name, name, genere, year, impd_rating):
    query = "INSERT INTO " + table_name + "(name,genere,year,impd_rating) VALUES(%s,%s,%s,%s)"
    try:
        print(query)
        cur.execute(query, (name, genere, year, impd_rating))
    except Exception as e:
        print("Not executed:", e)
    conn.commit()

b = 0
for row in DF['id']:
    insert(impd_rating="7.0", year="2017", name="Er Ceo Vora Mayur", genere="etc", table_name="test_movies")
    b = b + 1
conn.close()
I hope my answer is helpful.
If you have any questions, please comment.
I found a solution to my issue by using SQLAlchemy and the pandas to_sql method.
Thanks for the help, everyone.
import sqlalchemy
from sqlalchemy import Table, Column, Integer, String, REAL
import pandas as pd

def connect(user, password, db, host='localhost', port=5433):
    '''Returns an engine and a metadata object'''
    # We connect with the help of the PostgreSQL URL
    # postgresql://federer:grandestslam@localhost:5432/tennis
    url = 'postgresql://{}:{}@{}:{}/{}'
    url = url.format(user, password, host, port, db)
    # The return value of create_engine() is our connection object
    con = sqlalchemy.create_engine(url, client_encoding='utf8')
    # MetaData collects the table definitions; the engine is passed
    # explicitly to create_all() below instead of being bound here
    meta = sqlalchemy.MetaData()
    return con, meta

con, meta = connect('postgres', 'postgres333', 'database1')

# Define the table with the same name that to_sql writes to
movies = Table('movies', meta,
               Column('id', Integer, primary_key=True),
               Column('name', String),
               Column('genere', String),
               Column('year', Integer),
               Column('idbm_rating', REAL))
meta.create_all(con)

DF = pd.read_csv('new_movies.txt', sep=' ', engine='python')
DF.columns = ('id', 'name', 'genere', 'year', 'idbm_rating')
DF.to_sql('movies', con=con, if_exists='append', index=False)