Every time I run my Python (SQL) code, it adds more rows to my table, so the table just keeps growing. I need to either update the existing rows or delete them before inserting the data, but I don't know how.
Here is my code:
import MySQLdb
from calculationmethod import Method
class dbcclib(Method):
    def __str__(self):
        """Return a string representation of the object."""
        return "Density matrix of %s" % (self.data)
    def __repr__(self):
        """Return a representation of the object."""
        return 'Density matrix("%s")' % (self.data)
    def push(self):
        # Open database connection
        dbhost1 = raw_input("Enter database host: ")
        dbport = int(raw_input("Enter database port: "))
        dbuser = raw_input("Enter database user: ")
        dbpass = raw_input("Enter database password: ")
        dbname = raw_input("Enter database: ")
        db = MySQLdb.connect(host=dbhost1, port=dbport, user=dbuser, passwd=dbpass, db=dbname)
        # Prepare a cursor object using the cursor() method
        cur = db.cursor()
        # Insert one row per (x, y, z) triple in self.data.vibstate
        cur.executemany("""
            INSERT INTO
                cord
                (x, y, z)
            VALUES
                (%s, %s, %s)
            """, self.data.vibstate)
        db.commit()
        # Disconnect from server
        db.close()
        print "Baigta"
To illustrate the mess: in this example I have a 2D array, let's say like this:
a = [[1,2,3],[3,2,1]]
After I insert it a couple of times, my table looks like this:
x y z
1 2 3
3 2 1
1 2 3
3 2 1
The rows are duplicated every time I execute the script, so the table keeps getting more and more rows. I need to get rid of that duplication.
If you need to get rid of duplicate data, define a primary key (or unique key) and use an UPSERT (insert if the row does not exist, otherwise update). If you need a clean table each time, just run a TRUNCATE command before the insert.
In MySQL, use the following structure:
INSERT INTO table (a,b,c) VALUES (1,2,3)
ON DUPLICATE KEY UPDATE c=c+1;
If you don't want to modify existing rows and just need to skip duplicates, use INSERT IGNORE instead.
INSERT IGNORE INTO table (a,b,c) VALUES (1,2,3);
Remember to create a unique constraint covering all 3 fields.
ALTER TABLE table ADD UNIQUE INDEX unq (a, b, c)
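As a rough, runnable illustration of the unique-key approach, the sketch below uses Python's built-in sqlite3 module rather than MySQL, so the syntax differs slightly: SQLite spells the statement INSERT OR IGNORE where MySQL uses INSERT IGNORE. The table name cord and the columns x, y, z come from the question; the helper name insert_unique is an assumption made for this example.

```python
import sqlite3

def insert_unique(conn, rows):
    """Insert (x, y, z) rows, silently skipping rows that already exist."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cord ("
        "x INTEGER, y INTEGER, z INTEGER, UNIQUE (x, y, z))"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO cord (x, y, z) VALUES (?, ?, ?)", rows
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
a = [[1, 2, 3], [3, 2, 1]]
insert_unique(conn, a)
insert_unique(conn, a)  # running it again adds no new rows
row_count = conn.execute("SELECT COUNT(*) FROM cord").fetchone()[0]
print(row_count)
```

If the table should instead be emptied on every run, a DELETE FROM cord (or TRUNCATE TABLE cord in MySQL) before the insert achieves that without any unique key.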
I currently have a list of approximately 10,000 ids. I need to update all rows in a MySQL table whose id is in the inactive_ids list shown below, setting their Active column to 'No'.
I am using the mysql.connector Python library.
When I run the code below, each iteration of the for loop takes about 0.7 seconds. That's about a 2-hour run time to change all 10,000 ids. Is there a quicker way to do this?
import mysql.connector
# inactive_ids are unique strings, something like shown below
# inactive_ids = ['a9okeoko', 'sdfhreaa', 'xsdfasy', ..., 'asdfad']
# initialize connection
mydb = mysql.connector.connect(
    user="REMOVED",
    password="REMOVED",
    host="REMOVED",
    database="REMOVED"
)
# initialize cursor
mycursor = mydb.cursor(buffered=True)
# Function to execute multiple lines
def alter(state, msg, count):
    result = mycursor.execute(state, multi=True)
    result.send(None)
    print(str(count), ': ', msg, result)
    count += 1
    return count
# Try to execute; roll back and re-raise if anything fails
try:
    count = 0
    for Id in inactive_ids:
        # SAVE THE QUERY AS STRING
        sql_update = "UPDATE test_table SET Active = 'No' WHERE NoticeId = '" + Id + "'"
        # ALTER
        count = alter(sql_update, "done", count)
    # commits all changes to the database
    mydb.commit()
except Exception as e:
    mydb.rollback()
    raise e
Do it with a single query that uses IN (...) instead of multiple queries.
placeholders = ','.join(['%s'] * len(inactive_ids))
sql_update = f"""
UPDATE test_table
SET Active = 'No'
WHERE NoticeId IN ({placeholders})
"""
mycursor.execute(sql_update, inactive_ids)
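For very long id lists it may be worth sending the ids in batches, since a single statement is bounded by server limits such as MySQL's max_allowed_packet. The following is a minimal sketch of that idea, not a drop-in replacement: it uses the standard-library sqlite3 module (placeholder ? instead of %s) so it can run without a server, and the helper name deactivate and the batch size of 500 are assumptions.

```python
import sqlite3

def deactivate(conn, ids, batch_size=500):
    """Set Active = 'No' for every NoticeId in ids, one UPDATE per batch."""
    updated = 0
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        placeholders = ",".join(["?"] * len(batch))
        cur = conn.execute(
            f"UPDATE test_table SET Active = 'No' "
            f"WHERE NoticeId IN ({placeholders})",
            batch,
        )
        updated += cur.rowcount
    conn.commit()  # a single commit for the whole operation
    return updated

# Small in-memory table standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_table (NoticeId TEXT PRIMARY KEY, Active TEXT)")
conn.executemany(
    "INSERT INTO test_table VALUES (?, 'Yes')",
    [(f"id{i}",) for i in range(1200)],
)
inactive_ids = [f"id{i}" for i in range(0, 1200, 2)]  # every second id
updated = deactivate(conn, inactive_ids)
remaining = conn.execute(
    "SELECT COUNT(*) FROM test_table WHERE Active = 'Yes'"
).fetchone()[0]
```

Whatever the driver, the values still travel as bound parameters, so the ids never need to be concatenated into the SQL string.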
Example of a column:
This is what I have tried. I only want to search based on a single column in the table. Let's say the table name is Employees. The input parameter is entered by the user in the console.
exists = cursor.execute("SELECT TOP 1 * FROM Employees WHERE ID = ?", (str(input),))
print(exists)
if exists is None:
    return False
else:
    return True
I think this is what you are looking for:
insert_query = '''SELECT TOP 1 * FROM EmployeeTable WHERE ID = (?);''' # '?' is a placeholder
cursor.execute(insert_query, str(input))
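One thing worth noting about the question's code: in DB-API drivers such as pyodbc, execute() does not return the matching row (pyodbc returns the cursor itself), so comparing its result with None says nothing about whether a row exists. A common pattern is to call fetchone() and test that instead. The sketch below assumes the standard-library sqlite3 module, which has no TOP clause, so it uses LIMIT 1; the function name employee_exists is hypothetical.

```python
import sqlite3

def employee_exists(conn, employee_id):
    """Return True if a row with the given ID exists in Employees."""
    cur = conn.execute(
        "SELECT 1 FROM Employees WHERE ID = ? LIMIT 1", (str(employee_id),)
    )
    # fetchone() returns None when the query matched no rows.
    return cur.fetchone() is not None

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (ID TEXT PRIMARY KEY, Name TEXT)")
conn.execute("INSERT INTO Employees VALUES ('42', 'Ada')")
found = employee_exists(conn, 42)
missing = employee_exists(conn, 7)
```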
I have a situation where I created a method that inserts rows into a database. I pass that method the columns, the values and the table name.
COLUMNS = [['NAME','SURNAME','AGE'],['SURNAME','NAME','AGE']]
VALUES = [['John','Doe',56],['Doe','John',56]]
TABLE = 'people'
This is how I would like to pass them, but it doesn't work:
db = DB_CONN.MSSQL() #method for connecting to MS SQL or ORACLE etc.
cursor = db.cursor()
sql = "insert into %s (?) VALUES(?)" % TABLE
cursor.executemany([sql,[COLUMNS[0],VALUES[0]],[COLUMNS[1],VALUES[1]]])
db.commit()
This version does run the query, but the problem is that I must have predefined column names, and that's not good: what if the other list has a different column order? Then the name will end up in surname and the surname in name.
db = DB_CONN.MSSQL() #method for connecting to MS SQL or ORACLE etc.
cursor = db.cursor()
sql = 'insert into %s (NAME,SURNAME,AGE) VALUES (?,?,?)' % TABLE
cursor.executemany(sql,[['John','Doe',56],['Doe','John',56]])
db.commit()
I hope I explained it clearly enough.
Ps. COLUMNS and VALUES are extracted from json dictionary
[{'NAME':'John','SURNAME':'Doe','AGE':56...},{'SURNAME':'Doe','NAME':'John','AGE':77...}]
if that helps.
SOLUTION:
class INSERT(object):
    def __init__(self):
        self.BASE_COL = []
    def call(self):
        GATHER_DATA = [{'NAME':'John','SURNAME':'Doe','AGE':56},{'SURNAME':'Doe','NAME':'John','AGE':77}]
        self.BASE_COL = []
        TABLE = 'person'
        #check dictionary keys
        for DATA_EVAL in GATHER_DATA:
            if not self.BASE_COL:
                self.BASE_COL = list(DATA_EVAL.keys())
            elif set(self.BASE_COL) != set(DATA_EVAL.keys()):
                print ("columns in DATA_EVAL.keys() have different columns")
                #send mail or insert to log or remove dict from list
                exit(403)
        #if everything goes well make an insert
        columns = ','.join(self.BASE_COL)
        placeholders = ','.join('?' * len(self.BASE_COL))
        sql = 'insert into %s (%s) VALUES (%s)' % (TABLE, columns, placeholders)
        db = DB_CONN.MSSQL()
        cursor = db.cursor()
        #read every value by column name so each row matches the column list
        cursor.executemany(sql, [[DATA_EVAL[col] for col in self.BASE_COL] for DATA_EVAL in GATHER_DATA])
        db.commit()
if __name__ == "__main__":
    ins = INSERT()
    ins.call()
You could take advantage of the fact that, for any one Python dictionary, keys() and values() list their entries in the same order.
You should check that all items in the JSON array of records have the same fields in the same order; otherwise the values will land in the wrong columns or the query will raise an exception.
columns = ','.join(records[0].keys())
sql = 'insert into %s (%s) VALUES (?,?,?)' % (TABLE, columns)
cursor.executemany(sql,[record.values() for record in records])
References:
https://stackoverflow.com/a/835430/5189811
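A slightly more defensive variant of the same idea is sketched below: it takes the column list from the first record and then looks every value up by column name, so records whose keys arrive in a different order still line up with the columns. It assumes the standard-library sqlite3 module in place of the MS SQL connection, and the function name insert_records is hypothetical. Note that the table and column names are interpolated into the SQL text, so they must come from a trusted source.

```python
import sqlite3

def insert_records(conn, table, records):
    """Insert a list of dicts that all share the same set of keys."""
    columns = list(records[0].keys())
    for record in records:
        if set(record) != set(columns):
            raise ValueError("record has different columns: %r" % (record,))
    sql = "insert into %s (%s) VALUES (%s)" % (
        table,
        ",".join(columns),
        ",".join("?" * len(columns)),
    )
    # Look values up by name so each row follows the column order.
    rows = [[record[col] for col in columns] for record in records]
    conn.executemany(sql, rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (NAME TEXT, SURNAME TEXT, AGE INTEGER)")
records = [
    {"NAME": "John", "SURNAME": "Doe", "AGE": 56},
    {"SURNAME": "Doe", "NAME": "John", "AGE": 77},
]
insert_records(conn, "people", records)
stored = conn.execute(
    "SELECT NAME, SURNAME, AGE FROM people ORDER BY AGE"
).fetchall()
```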
I have written a small app that uses MySQL to get a list of products that need updating on our Magento website.
Python then actions these updates and marks the product in the db as complete.
My original code (pseudocode to show the overview):
class Mysqltools:
    def get_products(self):
        db = pymysql.connect(host= .... )
        mysqlcursor = db.cursor(pymysql.cursors.DictCursor)
        sql = "select * from x where y = %s"
        mysqlcursor.execute(sql, (z,))
        rows = mysqlcursor.fetchall()
        mysqlcursor.close()
        db.close()
        return rows
    def write_products(self, sku, name, id):
        db = pymysql.connect(host= .... )
        mysqlcursor = db.cursor(pymysql.cursors.DictCursor)
        sql = "update table set sku = %s, name = %s, id = %s....."
        mysqlcursor.execute(sql, (sku, name, id))
        mysqlcursor.close()
        db.close()
This was working ok, but on each db connection string we were getting a pause.
I did a bit of research and did the following:
class Mysqltools:
    def __init__(self):
        self.db = pymysql.connect(host= .... )
    def get_products(self):
        mysqlcursor = self.db.cursor(pymysql.cursors.DictCursor)
        sql = "select * from x where y = %s"
        mysqlcursor.execute(sql, (z,))
        rows = mysqlcursor.fetchall()
        mysqlcursor.close()
        return rows
    def write_products(self, sku, name, id):
        mysqlcursor = self.db.cursor(pymysql.cursors.DictCursor)
        sql = "update table set sku = %s, name = %s, id = %s....."
        mysqlcursor.execute(sql, (sku, name, id))
        mysqlcursor.close()
        self.db.commit()
This gives a MASSIVE speed improvement. However, get_products only succeeds on the first iteration: when it is called a second time it finds 0 products to update, even though running the same SQL directly on the db returns a number of rows.
Am I doing something wrong with the connections?
I have also tried moving the db = outside of the class and referencing it but that still gives the same issue.
UPDATE
Doing some testing, and if I remove the DictCursor from the cursor I can get the correct rows returned each time (I've just created a quick loop to keep checking for records)
Is the DictCursor doing something I am unaware of ?
** UPDATE 2 **
I've removed the DictCursor, and tried the following.
Create a while True loop which calls my get_product method.
In MySQL change some rows so that they should be found.
If I go from having 0 possible rows to find, then change some so they should be found, my code just shows 0 found, and loops stating this.
If I go from having x possible rows to find, then change it to 0 in MySQL, my code continues to loop showing the x possible rows.
Ok, the answer to this is as follows:
db = pymysql.connect(host=.... user=... )
class MySqlTools:
    def get_products(self):
        mysqlcursor = db.cursor(pymysql.cursors.DictCursor)
        sql = "select * from x where y = %s"
        mysqlcursor.execute(sql, (z,))
        rows = mysqlcursor.fetchall()
        mysqlcursor.close()
        db.commit()
        return rows
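This lets you reuse the db connection and removes the overhead of creating and closing a connection every time. The commit after the SELECT is what fixes the stale results: with autocommit off (PyMySQL's default) and InnoDB's default REPEATABLE READ isolation level, every SELECT inside one open transaction reads the same snapshot, so changes made by other sessions only become visible once the transaction is ended with commit() or rollback().
In testing, downloading 500 orders from our website and writing them to a db went from 16 minutes to under 3 minutes.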
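A long-lived connection shared by all methods might look like the sketch below. It is only an illustration: it uses the standard-library sqlite3 module so it runs without a server, and the class name ProductStore, the products table and its columns are invented for this example. With PyMySQL the same shape applies, with %s placeholders and, as in the answer's code, a commit() after each read so the next SELECT starts a fresh transaction.

```python
import sqlite3

class ProductStore:
    """Keeps one connection open and reuses it for every call."""

    def __init__(self, conn):
        self.db = conn

    def get_products(self, status):
        cur = self.db.cursor()
        cur.execute(
            "SELECT sku, name FROM products WHERE status = ?", (status,)
        )
        rows = cur.fetchall()
        cur.close()
        self.db.commit()  # end the read transaction
        return rows

    def mark_done(self, sku):
        cur = self.db.cursor()
        cur.execute(
            "UPDATE products SET status = 'done' WHERE sku = ?", (sku,)
        )
        cur.close()
        self.db.commit()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (sku TEXT PRIMARY KEY, name TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, 'pending')",
    [("A1", "Mug"), ("B2", "Plate")],
)
conn.commit()

store = ProductStore(conn)
before = store.get_products("pending")
store.mark_done("A1")
after = store.get_products("pending")
```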
I have normalised three tables (Product, ProductType and ProductGender) and I'm looking to call them in my main program so that the user can successfully enter values and the data be stored in the correct table.
Here are the SQL tables being created
def create_product_table():
    sql = """create table Product
             (ProductID integer,
             Name text,
             primary key(ProductID))"""
    create_table(db_name, "Product", sql)
def create_product_type_table():
    sql = """create table ProductType
             (ProductID integer,
             Colour text,
             Size text,
             Gender text,
             AmountInStock integer,
             Source text,
             primary key(ProductID, Colour, Size, Gender),
             foreign key(Gender) references ProductGender(Gender),
             foreign key(ProductID) references Product(ProductID))"""
    create_table(db_name, "ProductType", sql)
def create_product_gender_table():
    sql = """create table ProductGender
             (Gender text,
             Price text,
             primary key(Gender))"""
    create_table(db_name, "ProductGender", sql)
Here are the SQL subroutines
import sqlite3
def insert_data(values):
    with sqlite3.connect("jam_stock.db") as db:
        cursor = db.cursor()
        sql = "insert into Product (Name, ProductID) values (?,?)"
        cursor.execute(sql,values)
        db.commit()
def insert_product_type_data(records):
    sql = "insert into ProductType(Amount, Size, Colour, Source) values (?,?,?,?)"
    for record in records:
        query(sql,record)
def insert_product_gender_data(records):
    sql = "insert into ProductGender(Gender, Price) values (?,?)"
    for record in records:
        query(sql, record)
def query(sql,data): #important
    with sqlite3.connect("jam_stock.db") as db:
        cursor = db.cursor()
        cursor.execute("PRAGMA foreign_keys = ON") #referential integrity
        cursor.execute(sql,data)
        db.commit()
Below is the code where the user will enter the values.
if ans=="1": #1=to option 'Add Stock'
    a = input("Enter Gender: ")
    b = float(input("Enter Price: "))
    c = int(input("Enter ProductID: "))
    d = input("Enter Name: ")
    e = input("Enter Size: ")
    f = input("Enter Colour: ")
    g = input("Enter Source: ")
    h = input("Enter Amount: ")
    #code calling tables should be here
Help is gratefully appreciated. Seriously not sure how to link the 3 tables with the user's input.
This is what I did before I normalised the database. So the one table in 'Product' would be updated instead of adding an already existing product. Obviously that has changed now, since I've created two new tables but I can't successfully add a product let alone edit one.
def update_product(data): #subroutine for editing stock
    with sqlite3.connect("jam_stock.db") as db:
        cursor = db.cursor()
        sql = "update Product set Name=?, Price=?, Amount=?, Size=?, Colour=?, Source=?, Gender=? where ProductID=?"
        cursor.execute(sql,data)
        db.commit()
Given the code you show above, and assuming (BIG assumption, see later!) that the user never enters data for existing records, the following code should do it:
query('insert into Product (Name, ProductID) values (?,?)',
      [d, c])
query('insert into ProductGender (Gender, Price) values (?,?)',
      [a, b])
query('insert into ProductType (ProductID, Colour, Size, Gender, '
      'AmountInStock, Source) values (?,?,?,?,?,?)',
      [c, f, e, a, h, g])
Your use of arbitrary single-letter variable names makes this very hard to follow, of course, but I think I got the correspondence right:-).
Much more important is the problem that you never tell us what to do if the user enters data for an already existing record in one or more of the three tables (as determined by the respective primary keys).
For example, what if Product already has a record with a ProductID of foobar and a Name of Charlemagne; and the user enters ProductID as foobar and a Name of Alexandre; what do you want to happen in this case? You never tell us!
The code I present above will just fail the whole sequence because of the attempt to insert a new record in Product with an already-existing primary key; if you don't catch the exception and print an error message this will in fact crash your whole program.
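To make the failure mode concrete, here is a small sketch of catching the duplicate-key error instead of letting it crash the program. It assumes sqlite3 (as in the question) with a cut-down Product table, and the helper name add_product is hypothetical; what to do in the except branch is exactly the open design question.

```python
import sqlite3

def add_product(conn, product_id, name):
    """Try to insert a product; return False if the ProductID already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "insert into Product (ProductID, Name) values (?,?)",
                (product_id, name),
            )
        return True
    except sqlite3.IntegrityError:
        # Duplicate primary key: the existing row is left untouched.
        return False

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table Product (ProductID integer, Name text, "
    "primary key(ProductID))"
)
first = add_product(conn, 1, "Charlemagne")
second = add_product(conn, 1, "Alexandre")
name = conn.execute(
    "select Name from Product where ProductID = 1"
).fetchone()[0]
```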
But maybe you want to do something completely different in such cases -- and there are so many possibilities that we can't just blindly guess!
So please edit your Q to clarify in minute detail what's supposed to happen in each case of primary key "duplication" in one or more table (unless you're fine with just crashing in such cases!-), and the SQL and Python code to make exactly-that happen will follow. But of course we can't decide what the semantics of your program are meant to be...!-)