column names as variables while creating table - python

I am new to Python and sqlite, and I have a problem with a CREATE TABLE query.
I need to create a table, but I have the column names of the table as a list.
columnslist = ["column1", "column2", "column3"]
Now, I have to create a table MyTable with the above columns. The problem is that I won't know beforehand how many columns are in columnslist.
Is it possible to create a table whose columns are given by columnslist, and if so, what is the syntax?

You can join the column names into a comma-separated string and interpolate it with str.format (joining is more robust than interpolating a tuple, which breaks for a single column):
import sqlite3

conn = sqlite3.connect('example.db')
c = conn.cursor()
c.execute('CREATE TABLE MyTable ({})'.format(', '.join(columnslist)))
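Note that identifiers (table and column names) cannot be bound as ? parameters, so if the names come from untrusted input, validate them before interpolating. A minimal sketch of such a check (the isidentifier whitelist is just one option, not from the original answer):
import sqlite3

columnslist = ["column1", "column2", "column3"]

# Reject anything that is not a plain identifier before it reaches the SQL.
for name in columnslist:
    if not name.isidentifier():
        raise ValueError(f"unsafe column name: {name!r}")

conn = sqlite3.connect('example.db')
conn.execute('CREATE TABLE MyTable ({})'.format(', '.join(columnslist)))
conn.commit()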

Related

Dynamically using INSERT for cx_Oracle - Python

I've been looking around so hopefully someone here can assist:
I'm attempting to use cx_Oracle in Python to interface with a database; my task is to insert data from an Excel file into an empty (but existing) table.
The Excel file has almost all of the same column names as the database table, so I essentially want to check which columns share the same name and, for those that do, insert that column's data from the Excel file (read into a pandas DataFrame) into the Oracle table.
import pandas as pd
import numpy as np
import cx_Oracle
import config  # local settings module referenced below

df = pd.read_excel("employee_info.xlsx")
con = None
try:
    con = cx_Oracle.connect(
        config.username,
        config.password,
        config.dsn,
        encoding=config.encoding)
except cx_Oracle.Error as error:
    print(error)
finally:
    cursor = con.cursor()
    rows = [tuple(x) for x in df.values]
    cursor.executemany(''' INSERT INTO ODS.EMPLOYEES({x} VALUES {rows}) '''  # incomplete -- this is where I'm stuck
I'm not sure what SQL to put there, or whether I can use a for loop to iterate through the columns. My main issue is how to build this dynamically, so the code keeps working as our dataset grows in columns.
I check the columns that match by using:
sql = "SELECT * FROM ODS.EMPLOYEES"
cursor.execute(sql)
data = cursor.fetchall()
col_names = []
for i in range(len(cursor.description)):
    col_names.append(cursor.description[i][0])
a = np.intersect1d(df.columns, col_names)
print("common columns:", a)
That gives me a list of all the common columns. I've renamed the columns in my Excel file to match the columns in the database table, but my issue is how to match them in a dynamic/automated way, so I can keep adding to my datasets without changing the code.
Bonus: I'm also using a SQL CASE statement to create a new column that rolls up a few other columns; it would also help to know whether this can be added to the first part of my SQL, or whether it's advisable to do all manipulations before the INSERT statement.
Look at https://github.com/oracle/python-oracledb/blob/main/samples/load_csv.py
You would replace the CSV reading bit with parsing your data frame. You need to construct a SQL statement similar to the one used in that example:
sql = "insert into LoadCsvTab (id, name) values (:1, :2)"
For each spreadsheet column that you decide matches a table column, add that column's name to the (id, name) part of the statement and add another bind variable to the (:1, :2) part.
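A minimal sketch of that construction (assuming the matched column names from the intersection step above are in a, and that the DataFrame uses the same names):
# Build the INSERT dynamically from the matched columns.
# Safe to interpolate here because the names come from cursor.description,
# not from user input.
cols = list(a)
col_clause = ", ".join(cols)                                    # e.g. "ID, NAME"
bind_clause = ", ".join(f":{i + 1}" for i in range(len(cols)))  # e.g. ":1, :2"
sql = f"INSERT INTO ODS.EMPLOYEES ({col_clause}) VALUES ({bind_clause})"

rows = [tuple(r) for r in df[cols].itertuples(index=False, name=None)]
cursor.executemany(sql, rows)
con.commit()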

Is there a way to select all columns from a table except one column?

I have been using sqlite3 with Python for creating databases. Until now I have been successful, but unfortunately I see no way out of this: I have a table with 63 columns, but I want to select only 62 of them. I know I can write the names of the columns in the SELECT statement, but writing out 62 of them seems illogical (for a programmer like me). I am using Python's sqlite3 module. Is there a way out of this?
I'm sorry for any grammatical mistakes.
Thanks in advance
With SQLite, you can:
do a PRAGMA table_info(tablename); query to get a result set that describes that table's columns
pluck the column names out of that result set and remove the one you don't want
compose a column list for the select statement using e.g. ', '.join(column_names) (though you might want to consider a higher-level SQL statement builder instead of playing with strings).
Example
A simple example using a simple table and an in-memory SQLite database:
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("CREATE TABLE kittens (id INTEGER, name TEXT, color TEXT, furriness INTEGER, age INTEGER)")

# PRAGMA table_info returns one row per column; the column name is field 1.
columns = [row[1] for row in con.execute("PRAGMA table_info(kittens)")]
print(columns)

# Drop the one column we don't want.
selected_columns = [column for column in columns if column != 'age']
print(selected_columns)

query = f"SELECT {', '.join(selected_columns)} FROM kittens"
print(query)
This prints out
['id', 'name', 'color', 'furriness', 'age']
['id', 'name', 'color', 'furriness']
SELECT id, name, color, furriness FROM kittens
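You can then execute the composed query as usual, for example rows = con.execute(query).fetchall().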

Update SQL database with dataframe content

I have a pandas dataframe containing two columns: ID and MY_DATA. I have an SQL database that contains a column named ID and some other data. I want to match the IDs in the database to the rows of the dataframe's ID column and update the database with a new column, MY_DATA.
So far I used the following:
import sqlite3
import pandas as pd

df = pd.read_csv('my_filename.csv')
con = sqlite3.connect('my_database.sqlite')
cur = con.cursor()
for row in cur.execute('SELECT ID FROM main;'):
    for i in range(len(df)):
        if row[0] == df.ID.iloc[i]:
            update_sqldb(df, i)  # my helper that writes MY_DATA for row i
However, I think this way of having two nested for-loops is probably ugly and not very pythonic. I thought that maybe I should use the map() function, but is this the right direction to go?
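For comparison, a minimal loop-free sketch (not from the original thread, assuming the table is main with columns ID and MY_DATA as described): a single parameterized UPDATE driven by executemany.
# One UPDATE per dataframe row; sqlite3 binds the ? placeholders.
# Assumes main already has (or has been ALTERed to have) a MY_DATA column.
rows = list(df[['MY_DATA', 'ID']].itertuples(index=False, name=None))
cur.executemany('UPDATE main SET MY_DATA = ? WHERE ID = ?', rows)
con.commit()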

Pandas to_sql() to update unique values in DB?

How can I use df.to_sql(if_exists='append') to append ONLY the values that are unique between the dataframe and the database? In other words, I would like to evaluate the duplicates between the DF and the DB and drop those duplicates before writing to the database.
Is there a parameter for this?
I understand that the parameters if_exists = 'append' and if_exists = 'replace' apply to the entire table, not to the unique entries.
I am using:
sqlalchemy
pandas dataframe with the following datatypes:
index: datetime.datetime <-- Primary Key
float
float
float
float
integer
string <-- Primary Key
string <-- Primary Key
I'm stuck on this, so your help is much appreciated. Thanks!
In pandas, there is no convenient argument in to_sql to append only non-duplicates to a final table. Consider using a staging temp table that pandas always replaces, then run a final append query that migrates the temp-table records into the final table, keeping only rows whose primary keys are not already present, via a NOT EXISTS clause.
import sqlalchemy

engine = sqlalchemy.create_engine(...)  # connection string elided

# Stage the dataframe; pandas replaces this table on every run.
df.to_sql(name='myTempTable', con=engine, if_exists='replace')

with engine.begin() as cn:
    sql = """INSERT INTO myFinalTable (Col1, Col2, Col3, ...)
             SELECT t.Col1, t.Col2, t.Col3, ...
             FROM myTempTable t
             WHERE NOT EXISTS
                 (SELECT 1 FROM myFinalTable f
                  WHERE t.MatchColumn1 = f.MatchColumn1
                    AND t.MatchColumn2 = f.MatchColumn2)"""
    cn.execute(sqlalchemy.text(sql))  # wrap raw SQL in text(); required on SQLAlchemy 2.0
This is an ANSI SQL solution, not restricted to vendor-specific methods like UPSERT, and so works in practically all SQL-compliant relational databases.

How to add a new column (Python list) to a Postgresql table?

I have a Python list newcol that I want to add to an existing Postgresql table. I have used the following code:
import psycopg2

conn = psycopg2.connect(host='***', database='***', user='***', password='***')
cur = conn.cursor()
cur.execute('ALTER TABLE %s ADD COLUMN %s text' % ('mytable', 'newcol'))
conn.commit()
This added the column newcol to my table; however, the new column has no values in it. When I print the list in Python, it is fully populated.
Also, the number of rows in the table and the number of items in the list I want to add are the same. I'm a little confused.
Thanks in advance for the help.
ALTER TABLE only changes table schema -- in your case it will create the new column and initialize it with empty (NULL) values.
To add the list of values to this column, you can run
UPDATE <table> SET <column> = <value> WHERE <condition>
in a loop.
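A minimal sketch of that loop, assuming mytable has a primary key column id and a parallel list ids pairing each value in newcol with its row (id and ids are hypothetical names, not from the question):
import psycopg2

conn = psycopg2.connect(host='***', database='***', user='***', password='***')
cur = conn.cursor()

# Create the empty column, then fill it row by row.
cur.execute('ALTER TABLE mytable ADD COLUMN newcol text')
cur.executemany(
    'UPDATE mytable SET newcol = %s WHERE id = %s',
    list(zip(newcol, ids)),  # hypothetical ids list pairs values with rows
)
conn.commit()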
