Creating MySQL tables: id best practice - Python

I am coding a new Python (version 3.8.2) project with a MySQL (8.0.19) db.
This is the table creation code:
import mysql.connector
mydb = mysql.connector.connect(
    host="localhost",
    user="root",
    password="mypassword",
    database="acme_db"
)
mycursor = mydb.cursor()
sql_formula = ("CREATE TABLE employee (employee_id INT AUTO_INCREMENT PRIMARY KEY,"
               "first_name VARCHAR(255),"
               "last_name VARCHAR(255),"
               "email VARCHAR(255),"
               "phone_nr VARCHAR(255),"
               "hire_date DATE,"
               "job_id INTEGER,"
               "salary NUMERIC(8,2),"
               "commission_pct NUMERIC(8,2),"
               "manager_id INTEGER,"
               "department_id INTEGER)")
mycursor.execute(sql_formula)
mycursor.execute("CREATE TABLE jobs (job_id INT, job VARCHAR(255))")
mycursor.execute("CREATE TABLE managers (manager_id INT, employee_id INTEGER)")
mycursor.execute("CREATE TABLE departments (department_id INT, department_name VARCHAR(255))")
The question is: what is the best practice for ids?
What I mean is this: employee_id is a unique auto-increment primary key, that I understand. But what about the other ids, for the tables jobs, managers and departments?
Should they have the same definition as employee_id, or should they be plain INT, leaving it to me to make sure the numbers don't repeat, and so on?
I did give all the ids the same definition, but I couldn't insert data into the tables:
dptFormula = "INSERT INTO depatments (department_name) VALUES (%s)"
acme_departments = [("Accounting"),("R&D"),("Support")]
mycursor.executemany(dptFormula, acme_departments)
I got:
Traceback (most recent call last):
File "c:/Users/Daniel/EmployeeProject/employee_mgt/insert_into.py", line 20, in <module>
mycursor.executemany(dptFormula, acme_departments)
File "C:\Program Files\Python38\lib\site-packages\mysql\connector\cursor.py", line 668, in executemany
stmt = self._batch_insert(operation, seq_params)
File "C:\Program Files\Python38\lib\site-packages\mysql\connector\cursor.py", line 613, in _batch_insert
raise errors.ProgrammingError(
mysql.connector.errors.ProgrammingError: Not all parameters were used in the SQL statement

You're already using one best practice. Each table has an autoincrementing integer for a primary key, and you named those keys after the tables (employee_id, job_id, not just id).
Each table has its own sequence of autoincrementing id values. If you try to look up something in your employee table using a job_id, you'll get nonsense. But you can do things like this:
SELECT e.first_name, e.last_name, j.job
FROM employee e
JOIN jobs j ON e.job_id = j.job_id
to exploit the relationship between employee and jobs. The relationships between rows of tables, expressed by JOIN jobs j ON e.job_id = j.job_id, are the reason for the term relational database management system.
In a comment, @Akina pointed out you should use this sort of definition for your primary key columns:
something_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY
That's good advice.
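As a rough illustration of the points above, here is a minimal sketch that uses the standard-library sqlite3 module as a stand-in for MySQL, so it runs without a server. The SQLite spelling INTEGER PRIMARY KEY AUTOINCREMENT and the ? placeholders are assumptions of this sketch (mysql.connector would use the definition above and %s), and the sample names are made up. Note that each parameter row passed to executemany is a one-element tuple with a trailing comma; ("Accounting") without the comma is just a string, which is the likely cause of the "Not all parameters were used" error in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Every table gets its own auto-incrementing key, named after the table.
cur.execute("CREATE TABLE jobs (job_id INTEGER PRIMARY KEY AUTOINCREMENT, job TEXT)")
cur.execute("""
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY AUTOINCREMENT,
        first_name  TEXT,
        last_name   TEXT,
        job_id      INTEGER REFERENCES jobs (job_id)
    )""")

# One-element tuples: the trailing comma is what makes each row a tuple.
cur.executemany("INSERT INTO jobs (job) VALUES (?)", [("Accountant",), ("Engineer",)])

# Look up the generated ids instead of guessing them.
job_ids = dict(cur.execute("SELECT job, job_id FROM jobs").fetchall())
cur.executemany(
    "INSERT INTO employee (first_name, last_name, job_id) VALUES (?, ?, ?)",
    [("Ada", "Lovelace", job_ids["Engineer"]),
     ("Luca", "Pacioli", job_ids["Accountant"])],
)

rows = cur.execute("""
    SELECT e.first_name, e.last_name, j.job
    FROM employee e
    JOIN jobs j ON e.job_id = j.job_id
    ORDER BY e.employee_id""").fetchall()
print(rows)
conn.close()
```

The database assigns the ids; the application only ever reads them back and stores them in the referencing column.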

Related

Issue while trying to select record in mysql using Python

Error Message
You have an error in your SQL syntax; check the manual that
corresponds to your MariaDB server version for the right syntax to use
near '%s' at line 1
MySQL Database Table
CREATE TABLE `tblorders` (
`order_id` int(11) NOT NULL,
`order_date` date NOT NULL,
`order_number` varchar(50) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
ALTER TABLE `tblorders`
ADD PRIMARY KEY (`order_id`),
ADD UNIQUE KEY `order_number` (`order_number`);
ALTER TABLE `tblorders`
MODIFY `order_id` int(11) NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=4;
Code
mydb = mysql.connector.connect(host = "localhost", user = "root", password = "", database = "mydb")
mycursor = mydb.cursor()
sql = "Select order_id from tblorders where order_number=%s"
val = ("1221212")
mycursor.execute(sql, val)
Am I missing anything?

You must pass a list or a tuple as the arguments, but parentheses around a single value do not create a tuple: ("1221212") is just the string itself.
Here are two ways to ensure that val is interpreted as a tuple or a list. Either add a trailing comma:
sql = "Select order_id from tblorders where order_number=%s"
val = ("1221212",)
mycursor.execute(sql, val)
or use a list:
sql = "Select order_id from tblorders where order_number=%s"
val = ["1221212"]
mycursor.execute(sql, val)
This is a thing about Python that I always find weird, but it makes a kind of sense.
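To see the difference directly, here is a small sketch in plain Python. The helper as_params is hypothetical (it is not part of any connector); it only shows one way to normalise a lone value before handing it to execute.

```python
def as_params(value):
    """Wrap a lone scalar so it can be passed as a DB-API parameter sequence."""
    if isinstance(value, (list, tuple)):
        return tuple(value)
    return (value,)

not_a_tuple = ("1221212")   # parentheses only group: this is a str
one_tuple = ("1221212",)    # the trailing comma creates the tuple

print(type(not_a_tuple).__name__, type(one_tuple).__name__)
print(as_params("1221212") == one_tuple)
```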

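In case you want to insert data you have to modify your SQL. Use INSERT instead of SELECT like this:
INSERT INTO tblorders (order_number) VALUES ('122121');
That statement adds a new record to the table. Also note that the placeholder style depends on the Python driver rather than on the server: mysql.connector expects %s (even when talking to MariaDB), while MariaDB's own connector (the mariadb module) uses ?. With the mariadb module the insert would look like this: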
sql = "INSERT INTO tblorders (order_number) VALUES (?);"
val = "1231231"
mycursor.execute(sql, [val])
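Which placeholder a driver expects is advertised by the module-level paramstyle attribute that PEP 249 asks every DB-API module to expose. The sketch below checks it on the standard-library sqlite3 module, used here only because it needs no server; the PLACEHOLDERS mapping and the placeholder_for helper are illustrative assumptions, not part of any library.

```python
import sqlite3

# Positional placeholder for the common DB-API paramstyles.
# Drivers declaring "pyformat" (mysql.connector, psycopg2) also accept a bare %s.
PLACEHOLDERS = {"qmark": "?", "format": "%s", "pyformat": "%s"}

def placeholder_for(module):
    """Return the positional placeholder for a DB-API module (hypothetical helper)."""
    return PLACEHOLDERS[module.paramstyle]

mark = placeholder_for(sqlite3)   # sqlite3 declares "qmark"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblorders (order_id INTEGER PRIMARY KEY, order_number TEXT UNIQUE)")
conn.execute(f"INSERT INTO tblorders (order_number) VALUES ({mark})", ["1231231"])
found = conn.execute(
    f"SELECT order_id FROM tblorders WHERE order_number = {mark}", ("1231231",)
).fetchone()
print(sqlite3.paramstyle, found)
```

Only the placeholder token is formatted into the SQL text; the value itself is always passed separately as a parameter.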

Creating an sql table based off of a variable

I am creating a database that includes 3 primary tables:
Users, Assignments, Groups
With other potential tables that relate to the table 'Groups'. These potential tables are meant to be named after the key variable in the "Groups" table. E.g. if the Groups table has an entry with groupName = "Group_One", I want to create a table called "Group_One" and then store the usernames of other users in that table, as I don't see a practical way to store multiple usernames in one row of the 'Groups' table.
Here is the code I am testing to try and implement this:
import sqlite3
def Database_Setup():
    Cur.executescript(
        """
        CREATE TABLE IF NOT EXISTS USERS
        (
            username text,
            password text,
            clearance int,
            classes int
        );
        CREATE TABLE IF NOT EXISTS GROUPS
        (
            groupName text
            teacher text,
            teachingAssistant text
            users
        );
        CREATE TABLE IF NOT EXISTS ASSIGNMENTS
        (
            assignmentID int,
            assignmentName text,
            assignmentInfo text,
            dueDate date,
            setDate date,
            completedAmount int
        )
        """
    )
def Potential_Solution():
    Group_Name = "Group1"
    List_Of_Users = ["User1","User2","User3"]
    Cur.execute("""
        CREATE TABLE IF NOT EXISTS {}
        (
            username text,
            randomVar text
        )
        """.format(Group_Name))
    # This part works fine ^^
    for User in List_Of_Users:
        Cur.execute("INSERT INTO TABLE ? values (?,'Some_Var')",(Group_Name,User))
def Main():
    Database_Setup()
    Potential_Solution()
    Cur.execute("SELECT * FROM Group1")
    print(Cur.fetchall())
if __name__ == "__main__":
    Conn = sqlite3.connect("FOO_DB.db")
    Cur = Conn.cursor()
    Main()
However when I execute this, I run into this error:
Traceback (most recent call last):
File "E:/Python/Py_Proj/DB LIST vs new db example.py", line 53, in <module>
Main()
File "E:/Python/Py_Proj/DB LIST vs new db example.py", line 46, in Main
Potential_Solution()
File "E:/Python/Py_Proj/DB LIST vs new db example.py", line 42, in Potential_Solution
Cur.execute("INSERT INTO TABLE ? values (?,Some_Var)",(Group_Name,User))
sqlite3.OperationalError: near "TABLE": syntax error
Is there a practical way to do what I am trying to achieve? Or should I resort to another method?

I tried the following.
You have to remove the TABLE keyword. A table name cannot be bound as a ? parameter, so it has to be formatted into the statement (only do this with names you control); the username can still be passed as a parameter:
Cur.execute("INSERT INTO {0} VALUES (?, 'SomeVar')".format(Group_Name), (User,))
The full script, with the missing commas in the GROUPS definition added:
import sqlite3
def Database_Setup():
    Cur.executescript(
        """
        CREATE TABLE IF NOT EXISTS USERS
        (
            username text,
            password text,
            clearance int,
            classes int
        );
        CREATE TABLE IF NOT EXISTS GROUPS
        (
            groupName text,
            teacher text,
            teachingAssistant text,
            users
        );
        CREATE TABLE IF NOT EXISTS ASSIGNMENTS
        (
            assignmentID int,
            assignmentName text,
            assignmentInfo text,
            dueDate date,
            setDate date,
            completedAmount int
        )
        """
    )
def Potential_Solution():
    Group_Name = "Group1"
    List_Of_Users = ["User1","User2","User3"]
    # The table name is formatted in; it cannot be a bound parameter.
    Cur.execute("""
        CREATE TABLE IF NOT EXISTS {}
        (
            username text,
            randomVar text
        )
        """.format(Group_Name))
    for User in List_Of_Users:
        # The value is bound with ?, so it needs no quoting or escaping.
        Cur.execute("INSERT INTO {0} VALUES (?, 'SomeVar')".format(Group_Name), (User,))
def Main():
    Database_Setup()
    Potential_Solution()
    Cur.execute("SELECT * FROM Group1")
    print(Cur.fetchall())
if __name__ == "__main__":
    Conn = sqlite3.connect("FOO_DB.db")
    Cur = Conn.cursor()
    Main()
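On the "another method" part of the question: a common alternative to one table per group is a single membership table with one row per (group, user) pair. The following is only a sketch of that idea under assumed names (user_groups, group_members, add_members and members_of are all hypothetical), using an in-memory database so it can be run as-is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_groups (
        groupName TEXT PRIMARY KEY,
        teacher   TEXT
    );
    CREATE TABLE group_members (
        groupName TEXT NOT NULL REFERENCES user_groups (groupName),
        username  TEXT NOT NULL,
        PRIMARY KEY (groupName, username)   -- one row per membership
    );
""")

def add_members(conn, group_name, usernames):
    """Create the group if needed and attach the given users to it."""
    conn.execute("INSERT OR IGNORE INTO user_groups (groupName) VALUES (?)", (group_name,))
    conn.executemany(
        "INSERT OR IGNORE INTO group_members (groupName, username) VALUES (?, ?)",
        [(group_name, name) for name in usernames],
    )
    conn.commit()

def members_of(conn, group_name):
    """Return the usernames in a group, sorted alphabetically."""
    rows = conn.execute(
        "SELECT username FROM group_members WHERE groupName = ? ORDER BY username",
        (group_name,),
    )
    return [row[0] for row in rows]

add_members(conn, "Group_One", ["User1", "User2", "User3"])
add_members(conn, "Group_Two", ["User2"])
print(members_of(conn, "Group_One"))
```

With this layout the group name is ordinary data, so every statement can use ? parameters and no table name is ever built from a variable.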

Working with databases with Python: Course registration data in JSON

I am able to get my Python code to run and print the desired results, but my problem is with the SQLite table. I was asked to apply this SQL command to the tables:
SELECT hex(User.name || Course.title || Member.role ) AS X
FROM User JOIN Member JOIN Course
ON User.id = Member.user_id AND Member.course_id = Course.id
ORDER BY X
I was able to execute the command in SQLite, but according to the instructions for this project, X is supposed to start with 416 in row one of the results column produced. However, the X I got for row 1 in the results was:
43616C6962736933313030
Here is what I wrote in Python so far:
import sqlite3
import json
# Working with JSON and SQLite
conn = sqlite3.connect('rosterdb.sqlite')
cur = conn.cursor()
cur.executescript('''
DROP TABLE IF EXISTS User;
DROP TABLE IF EXISTS Member;
DROP TABLE IF EXISTS Course;
CREATE TABLE User(
    id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    name TEXT UNIQUE
);
CREATE TABLE Member(
    user_id INTEGER UNIQUE,
    course_id INTEGER UNIQUE,
    role INTEGER,
    PRIMARY KEY (user_id, course_id)
);
CREATE TABLE Course(
    id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    title TEXT UNIQUE
);
''')
# The primary key for the junction table is a composite of user_id and course_id
fname = raw_input("Enter file name:")
if (len(fname) < 1): fname = 'roster_data.json'
# Prompts for the file name
str_data = open(fname).read()
json_data = json.loads(str_data)
# Opens the file and reads it all
# Loads the JSON data, which is now a Python list
for entry in json_data:
    title = entry[1]
    name = entry[0]
    role = entry[2]
    # ["Charley", "sill0", 1] represents the name, course title, and role
    print name, title, role
    cur.execute('''INSERT or IGNORE INTO User (name)
        VALUES (?)''', (name, ))
    cur.execute('SELECT id FROM User WHERE name = ?',(name, ))
    user_id = cur.fetchone()[0]
    cur.execute('''INSERT or IGNORE INTO Course (title)
        VALUES (?)''', (title, ))
    cur.execute('SELECT id FROM Course WHERE title = ?', (title, ))
    course_id = cur.fetchone()[0]
    cur.execute('''INSERT or REPLACE INTO Member (user_id, course_id, role)
        VALUES (?,?,?)''', (user_id, course_id, role))
# INSERT, SELECT AND FETCHONE STATEMENTS
conn.commit()
Here is the JSON data that I was working with. It is about course registration for students: roster_data.json Here is the link to it:
https://pr4e.dr-chuck.com/tsugi/mod/sql-intro/roster_data.php?PHPSESSID=9addd537cfe55c03585d2bfaa757f6b0
I am not sure if I implemented the "role" key correctly. Thank you for your inputs!

The problem is that you made Member.course_id unique. Thus you can have no more members than courses. Using REPLACE in INSERT or REPLACE into Member hides this error.
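Just drop the UNIQUE constraint on Member.course_id and you will get the expected result. The UNIQUE on Member.user_id has the same effect for users (each user could belong to only one course) and should go too; the composite primary key already prevents duplicate (user_id, course_id) pairs.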
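The effect is easy to reproduce in isolation. This is a minimal sketch with an in-memory database and made-up rows (three users enrolling in the same course); count_members is a hypothetical helper written for this demonstration.

```python
import sqlite3

def count_members(unique_columns):
    """Insert three memberships and return how many rows survive."""
    constraint = "UNIQUE" if unique_columns else ""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"""
        CREATE TABLE Member (
            user_id   INTEGER {constraint},
            course_id INTEGER {constraint},
            role      INTEGER,
            PRIMARY KEY (user_id, course_id)
        )""")
    rows = [(1, 1, 0), (2, 1, 0), (3, 1, 1)]   # three users, same course
    # REPLACE deletes any row that conflicts with the new one, without an error.
    conn.executemany("INSERT OR REPLACE INTO Member VALUES (?, ?, ?)", rows)
    (n,) = conn.execute("SELECT COUNT(*) FROM Member").fetchone()
    conn.close()
    return n

print(count_members(unique_columns=True))    # rows silently replaced each other
print(count_members(unique_columns=False))   # all memberships kept
```

With the per-column UNIQUE constraints each insert for the same course overwrites the previous one, which is why the first row of the query result differs from the expected one.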

How to import csv data with parent/child (category-subcategory) hierarchy to MySQL using Python

I am importing a csv file containing a parent/child (category-subcategory) hierarchy to MySQL, using Python's MySQLdb module. Here is an example csv file:
vendor,category,subcategory,product_name,product_model,product_price
First vendor,category1,subcategory1,product1,model1,100
First vendor,category1,subcategory2,product2,model2,110
First vendor,category2,subcategory3,product3,model3,130
First vendor,category2,subcategory4,product5,model7,190
In MySQL I want to use a category table with a hierarchical structure, like this:
CREATE TABLE IF NOT EXISTS `category` (
`category_id` int(11) NOT NULL AUTO_INCREMENT,
`parent_id` int(11) NOT NULL DEFAULT '0',
`status` tinyint(1) NOT NULL,
PRIMARY KEY (`category_id`),
KEY `parent_id` (`parent_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci;
My question is: How do I determine the parent_id in this table?
Here is the Python script I have so far:
import MySQLdb
import csv
con = MySQLdb.connect('localhost', 'root', '', 'testdb', use_unicode=True, charset='utf8')
with con:
    cur = con.cursor()
    csv_data = csv.reader(file('test.csv'))
    csv_data.next()
    for row in csv_data:
        cur.execute("SELECT manufacturer_id FROM manufacturer WHERE name=%s", [row[0]],)
        res = cur.fetchall()
        if res:
            vendor_id = res[0][0]
        else:
            cur.execute("INSERT INTO manufacturer (name) VALUES (%s)", (row[0],))
            vendor_id = cur.lastrowid
        cur.execute("SELECT category_id FROM category_description WHERE name=%s", [row[2]])
        res = cur.fetchall()
        if res:
            category_id = res[0][0]
        else:
            # What parent_id should be inserted here?
            cur.execute("INSERT INTO category (`status`, `parent_id`) VALUES (%s,%s)", (1,))
            category_id = cur.lastrowid
            cur.execute("INSERT INTO category_description (category_id, name) VALUES (%s,%s)", (category_id,row[2],))
        cur.execute("INSERT INTO product (model, manufacturer_id, price,) VALUES (%s, %s, %s)", (row[4], `vendor_id`, row[8],))
        product_id = cur.lastrowid
        cur.execute("INSERT INTO product_to_category (product_id, category_id) VALUES (%s, %s)", (product_id, category_id,))
    cur.commit()
Here are the definitions of the other tables used in my example:
CREATE TABLE IF NOT EXISTS `manufacturer` (
`manufacturer_id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(64) NOT NULL,
PRIMARY KEY (`manufacturer_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci;
CREATE TABLE IF NOT EXISTS `category_description` (
`category_id` int(11) NOT NULL,
`name` varchar(255) NOT NULL,
PRIMARY KEY (`category_id`,`language_id`),
KEY `name` (`name`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci;
CREATE TABLE IF NOT EXISTS `product` (
`product_id` int(11) NOT NULL AUTO_INCREMENT,
`model` varchar(64) NOT NULL,
`manufacturer_id` int(11) NOT NULL,
`price` decimal(15,4) NOT NULL DEFAULT '0.0000',
PRIMARY KEY (`product_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci;
CREATE TABLE IF NOT EXISTS `product_to_category` (
`product_id` int(11) NOT NULL,
`category_id` int(11) NOT NULL,
PRIMARY KEY (`product_id`,`category_id`),
KEY `category_id` (`category_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci;

In a hierarchical table structure, any member at the top of its hierarchy has no parents. I would probably show this with a NULL parent ID but based on the way you've defined your category table, it looks like you want to show this by giving the value 0 for the parent ID.
Since you have fixed-depth hierarchies with only two levels (category and subcategory), the task is relatively simple. For each row of the CSV data, you need to:
1. Check whether the parent (row[1]) is in the table; if not, insert it with a parent ID of 0.
2. Get the category_id of the parent from step 1.
3. Check whether the child (row[2]) is in the table; if not, insert it with a parent ID equal to the category_id from step 2.
In your example code, you never access the parent (row[1]); you need to insert this into the table for it to have an ID that the child can refer to. If you've already inserted the parents before this point, you should probably still check to make sure it's there.
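The three steps above can be sketched roughly as follows. To keep it runnable without a MySQL server, this version uses the standard-library sqlite3 module (so ? placeholders instead of %s) and Python 3, with simplified versions of the category tables; get_or_create_category and import_categories are hypothetical helper names, and only the category part of the import is shown.

```python
import csv
import io
import sqlite3

def get_or_create_category(cur, name, parent_id):
    """Return the id of category `name` under `parent_id`, inserting it if missing."""
    cur.execute(
        "SELECT c.category_id FROM category c "
        "JOIN category_description d ON d.category_id = c.category_id "
        "WHERE d.name = ? AND c.parent_id = ?",
        (name, parent_id),
    )
    row = cur.fetchone()
    if row:
        return row[0]
    cur.execute("INSERT INTO category (status, parent_id) VALUES (1, ?)", (parent_id,))
    category_id = cur.lastrowid
    cur.execute(
        "INSERT INTO category_description (category_id, name) VALUES (?, ?)",
        (category_id, name),
    )
    return category_id

def import_categories(conn, csv_text):
    cur = conn.cursor()
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)                                                # skip the header row
    for row in reader:
        parent_id = get_or_create_category(cur, row[1], 0)      # steps 1 and 2
        get_or_create_category(cur, row[2], parent_id)          # step 3
    conn.commit()

SAMPLE = """vendor,category,subcategory,product_name,product_model,product_price
First vendor,category1,subcategory1,product1,model1,100
First vendor,category1,subcategory2,product2,model2,110
First vendor,category2,subcategory3,product3,model3,130
First vendor,category2,subcategory4,product5,model7,190
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (
        category_id INTEGER PRIMARY KEY AUTOINCREMENT,
        parent_id   INTEGER NOT NULL DEFAULT 0,
        status      INTEGER NOT NULL
    );
    CREATE TABLE category_description (
        category_id INTEGER NOT NULL,
        name        TEXT NOT NULL
    );
""")
import_categories(conn, SAMPLE)
rows = conn.execute(
    "SELECT d.name, c.parent_id FROM category c "
    "JOIN category_description d ON d.category_id = c.category_id "
    "ORDER BY c.category_id"
).fetchall()
print(rows)
```

Looking a category up by both name and parent_id is a design choice of this sketch: it lets two parents each have a subcategory with the same name.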
You have some other problems here:
The PK of your category_description table is defined on a column that you forgot to define in the table (language_id).
You should really be using InnoDB in this physical model so that you can enforce foreign key constraints in category_description, product and product_to_category.
In your example, cur.commit() is going to throw an exception – that's a method of the Connection object in MySQLdb. Of course, COMMIT isn't implemented for MyISAM tables anyway, so you could also avoid the exception by removing the line entirely.
Referencing row[8] is also going to throw an exception, according to the CSV data you've shown us. (This is a good example of why you should test your MCVE to make sure it works!)
If you do switch to InnoDB – and you probably should – you can use with con as cur: to get a cursor that commits itself when you exit the with block. This saves a couple lines of code and lets you manage transactions without micromanaging the connection object.

Getting the id of the last record inserted for Postgresql SERIAL KEY with Python

I am using SQLAlchemy without the ORM, i.e. using hand-crafted SQL statements to directly interact with the backend database. I am using PG as my backend database (psycopg2 as DB driver) in this instance - I don't know if that affects the answer.
I have statements like this (for brevity, assume that conn is a valid connection to the database):
conn.execute("INSERT INTO user (name, country_id) VALUES ('Homer', 123)")
Assume also that the user table consists of the columns (id [SERIAL PRIMARY KEY], name, country_id)
How may I obtain the id of the new user, ideally, without hitting the database again?

You might be able to use the RETURNING clause of the INSERT statement like this:
result = conn.execute("INSERT INTO user (name, country_id) VALUES ('Homer', 123) RETURNING *")
If you only want the resulting id:
result = conn.execute("INSERT INTO user (name, country_id) VALUES ('Homer', 123) RETURNING id")
[new_id] = result.fetchone()
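The RETURNING approach can be tried without a PostgreSQL server, because SQLite 3.35 and later accept the same clause. The sketch below is an assumption-laden stand-in rather than the PostgreSQL setup from the question: it uses the standard-library sqlite3 module with ? placeholders, falls back to lastrowid on older SQLite builds, and insert_user is a hypothetical helper. The table name is double-quoted because user is a reserved word in PostgreSQL.

```python
import sqlite3

def insert_user(conn, name, country_id):
    """Insert one row and return its generated id."""
    if sqlite3.sqlite_version_info >= (3, 35, 0):
        # Same idea as on PostgreSQL: the INSERT itself hands back the id.
        rows = conn.execute(
            'INSERT INTO "user" (name, country_id) VALUES (?, ?) RETURNING id',
            (name, country_id),
        ).fetchall()
        new_id = rows[0][0]
    else:
        # Older SQLite builds have no RETURNING; lastrowid is the fallback.
        new_id = conn.execute(
            'INSERT INTO "user" (name, country_id) VALUES (?, ?)',
            (name, country_id),
        ).lastrowid
    conn.commit()
    return new_id

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "user" ('
    "id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT, country_id INTEGER)"
)
first_id = insert_user(conn, "Homer", 123)
second_id = insert_user(conn, "Marge", 123)
print(first_id, second_id)
```

Either way the id comes back from the insert itself, so no second query is needed.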

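Use lastrowid:
result = conn.execute("INSERT INTO user (name, country_id) VALUES ('Homer', 123)")
result.lastrowid
Note that with psycopg2 the underlying cursor's lastrowid reports the inserted row's OID rather than the SERIAL value, so on PostgreSQL this is generally not the id you want; RETURNING is the more reliable route there.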

The current SQLAlchemy documentation suggests that result.inserted_primary_key should work.

Python + SQLAlchemy
After commit, the primary key column id (autoincremented) is updated in your object.
db.session.add(new_usr)
db.session.commit() #will insert the new_usr data into database AND retrieve id
idd = new_usr.usrID # usrID is the autoincremented primary_key column.
return jsonify(idd),201 #usrID = 12, correct id from table User in Database.

This question has been asked many times on Stack Overflow, and no answer I have seen is comprehensive. Googling 'sqlalchemy insert get id of new row' brings up a lot of them.
There are three levels to SQLAlchemy.
Top: the ORM.
Middle: Database abstraction (DBA) with Table classes etc.
Bottom: SQL using the text function.
To an OO programmer the ORM level looks natural, but to a database programmer it looks ugly and the ORM gets in the way. The DBA layer is an OK compromise. The SQL layer looks natural to database programmers and would look alien to an OO-only programmer.
Each level has its own syntax, similar but different enough to be frustrating. On top of this there is almost too much documentation online, which makes it very hard to find the answer.
I will describe how to get the inserted id AT THE SQL LAYER for the RDBMS I use.
Table: User(user_id integer primary autoincrement key, user_name string)
conn: Is a Connection obtained within SQLAlchemy to the DBMS you are using.
SQLite
======
insstmt = text(
'''INSERT INTO user (user_name)
VALUES (:usernm) ''' )
# Execute within a transaction (optional)
txn = conn.begin()
result = conn.execute(insstmt, usernm='Jane Doe')
# The id!
recid = result.lastrowid
txn.commit()
MS SQL Server
=============
insstmt = text(
'''INSERT INTO user (user_name)
OUTPUT inserted.user_id
VALUES (:usernm) ''' )
txn = conn.begin()
result = conn.execute(insstmt, usernm='Jane Doe')
# The id!
recid = result.fetchone()[0]
txn.commit()
MariaDB/MySQL
=============
insstmt = text(
'''INSERT INTO user (user_name)
VALUES (:usernm) ''' )
txn = conn.begin()
result = conn.execute(insstmt, usernm='Jane Doe')
# The id!
recid = conn.execute(text('SELECT LAST_INSERT_ID()')).fetchone()[0]
txn.commit()
Postgres
========
insstmt = text(
'''INSERT INTO user (user_name)
VALUES (:usernm)
RETURNING user_id ''' )
txn = conn.begin()
result = conn.execute(insstmt, usernm='Jane Doe')
# The id!
recid = result.fetchone()[0]
txn.commit()

result.inserted_primary_key
Worked for me. The only thing to note is that this returns a list that contains that last_insert_id.

Make sure you use fetchrow/fetch to receive the returning object
insert_stmt = user.insert().values(name="homer", country_id="123").returning(user.c.id)
row_id = await conn.fetchrow(insert_stmt)

For PostgreSQL, inserts from Python code can simply use the RETURNING keyword with col_id (the name of the column whose value you want back for the inserted row) at the end of the INSERT statement.
Syntax:
INSERT INTO emp_table (col_id, Name, Age)
VALUES (3, 'xyz', 30) RETURNING col_id;
or (if the col_id column is auto-increment):
INSERT INTO emp_table (Name, Age)
VALUES ('xyz', 30) RETURNING col_id;
Example:
from sqlalchemy import create_engine
conn_string = "postgresql://USERNAME:PSWD@HOSTNAME/DATABASE_NAME"
db = create_engine(conn_string)
conn = db.connect()
insert_sql = "INSERT INTO emp_table (Name, Age) VALUES ('xyz', 30) RETURNING col_id;"
result = conn.execute(insert_sql)
[last_row_id] = result.fetchone()
print(last_row_id)
# output = 3
