I am trying to create tables from JSON files containing the field names and types of each table of a database downloaded from BigQuery. The SQL request seemed fine to me, but no table was created according to the psql command-line interpreter when typing \d.
So, to begin, I tried a simpler SQL request, which doesn't work either.
Here is the code:
import pandas as pd
import psycopg2
# information used to create a database connection
sqluser = 'postgres'
dbname = 'testdb'
pwd = 'postgres'
# Connect to postgres database
con = psycopg2.connect(dbname=dbname, user=sqluser, password=pwd)
curs = con.cursor()
q = """set search_path to public,public ;
CREATE TABLE tab1(
i INTEGER
);
"""
curs.execute(q)
q = """
SELECT table_name
FROM information_schema.tables
WHERE table_schema='public'
AND table_type='BASE TABLE';
"""
df = pd.read_sql_query(q, con)
print(df.head())
print("End of test")
The code above displays this new table tab1, but the table doesn't actually appear when typing \d in the psql command-line interpreter. If I type in the psql interpreter:
SELECT table_name
FROM information_schema.tables
WHERE table_type='BASE TABLE';
it doesn't get listed either; it seems the table is not actually created. Thanks in advance for your help.
A commit() call was missing; it must be placed after the table-creation SQL request.
This code works:
import pandas as pd
import psycopg2
# information used to create a database connection
sqluser = 'postgres'
dbname = 'testdb'
pwd = 'postgres'
# Connect to postgres database
con = psycopg2.connect(dbname=dbname, user=sqluser, password=pwd)
curs = con.cursor()
q = """set search_path to public,public ;
CREATE TABLE tab1(
i INTEGER
);
"""
curs.execute(q)
con.commit()
q = """
SELECT table_name
FROM information_schema.tables
WHERE table_schema='public'
AND table_type='BASE TABLE';
"""
df = pd.read_sql_query(q, con)
print(df.head())
print("End of test")
I am trying to use a pandas dataframe to insert data to sql. I am using pandas because there are some columns that I need to drop before I insert it into the SQL table.
The database is in the cloud, but that isn't the issue.
I've been able to create static strings and insert them into the database, and it works fine.
The database is postgres db, using the pg8000 driver.
In this example, I am pulling out one column & one value and trying to insert it in to the database.
connection = db_connection.connect()
for i, rowx in data.iterrows():
    with connection as db_conn:
        name_column = ['name']
        name_value = [data.iloc[0]["name"]]
        cols = "`,`".join([str(i) for i in name_column])
        sql = "INSERT INTO person ('" + cols + "') VALUES ( " + " %s," * (len(name_value) - 1) + "%s" + " )"
        db_conn.execute(sql, tuple(name_value))
The error I get is usually something related to the formatting of the cols.
Error: 'syntax error at or near "\'name\'"
variable cols:
(Pdb) cols
'name'
I guess it's upset that 'name' is a string, but that seems odd.
variable sql:
"INSERT INTO persons ('name') VALUES ( %s )"
I'm not a fan of the string encapsulation; I got this from a guide:
https://www.dataquest.io/blog/sql-insert-tutorial/
Just looking for a reliable way to script this insert from pandas to pg.
IIUC, I think you can use the sqlalchemy package with to_sql() to export the pandas dataframe to the database table directly.
Please consider the code structure here:
import sqlalchemy as sa
from sqlalchemy import create_engine
import psycopg2
user="username"
password="passwordgohere"
host="host.or.ip"
port=5432
dbname="your_db_name"
db_string = sa.engine.url.URL.create(
    drivername="postgresql+psycopg2",
    username=user,
    password=password,
    host=host,
    port=port,
    database=dbname,
)
db_engine = create_engine(db_string)
or you may use pg8000 if that is your driver of choice:
import sqlalchemy as sa
from sqlalchemy import create_engine
import pg8000
user="username"
password="passwordgohere"
host="host.or.ip"
port=5432
dbname="your_db_name"
db_string = sa.engine.url.URL.create(
    drivername="postgresql+pg8000",
    username=user,
    password=password,
    host=host,
    port=port,
    database=dbname,
)
db_engine = create_engine(db_string)
And then you can export to the table like this (df is your pandas dataframe):
df.to_sql('your_table_name',con=db_engine, if_exists='replace', index=False, )
or, if you would like to append, use if_exists='append':
df.to_sql('your_table_name',con=db_engine, if_exists='append', index=False, )
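If you still want the manual INSERT from the question, note that the syntax error comes from the quoting: in PostgreSQL, single quotes (and MySQL-style backticks) produce string literals, while identifiers are quoted with double quotes. A minimal sketch of the corrected statement, assuming the person table, the name column and the %s paramstyle from the question:
# Double-quote identifiers for PostgreSQL; %s placeholders carry the values
cols = ",".join('"{}"'.format(c) for c in name_column)
placeholders = ",".join(["%s"] * len(name_value))
sql = "INSERT INTO person ({}) VALUES ({})".format(cols, placeholders)
db_conn.execute(sql, tuple(name_value))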
I have a column called REQUIREDCOLUMNS in a SQL database which contains the columns which I need to select in my Python script below.
Excerpt of Current Code:
db = mongo_client.get_database(asqldb_row.SCHEMA_NAME)
coll = db.get_collection(asqldb_row.TABLE_NAME)
table = list(coll.find())
root = json_normalize(table)
The REQUIREDCOLUMNS column in SQL contains the values reportId, siteId, price, location.
So instead of explicitly typing:
print(root[["reportId","siteId","price","location"]])
Is there a way to do print(root[REQUIREDCOLUMNS])?
Note: (I'm already connected to the SQL database in my python script)
You will have to use cursors if you are using mysql.connector or pymysql; the syntax of both is almost identical. Below I will show it for mysql.connector.
import mysql.connector

db = mysql.connector.connect(
    host="localhost",
    user="root",
    passwd=" ",
    database=" "
)
cursor = db.cursor()
sql = "SELECT REQUIREDCOLUMNS FROM table_name"
cursor.execute(sql)
# fetchall() returns rows as tuples, so unpack the first element of each;
# this will give ['reportId', 'siteId', 'price', 'location']
required_cols = [row[0] for row in cursor.fetchall()]
cols_as_string = ','.join(required_cols)
new_sql = 'select ' + cols_as_string + ' from table_name'
cursor.execute(new_sql)
result = cursor.fetchall()
This should work; I intentionally split the logic across several lines for readability. The syntax could be slightly different for pymysql.
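To get back to the original print(root[REQUIREDCOLUMNS]) part of the question, the fetched list of names can be used to index the dataframe directly. A minimal sketch, assuming required_cols holds ['reportId', 'siteId', 'price', 'location'] as above:
# Select only the columns named in REQUIREDCOLUMNS
print(root[required_cols])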
I am using the cx_Oracle module to connect to an Oracle database. In the script I use two variables, schema_name and table_name. The below query works fine:
cur1.execute("select owner,table_name from dba_tables where owner ='schema_name'")
But I need to query the number of rows of a table, where I need to qualify the table_name with the schema_name, so the query should be:
SELECT count(*) FROM "schema_name"."table_name"
This does not work when used in the code. I have tried putting it in triple quotes, single quotes and other options, but it does not format the query as expected, and hence it errors out with "table does not exist".
Any guidance is appreciated.
Bind variables cannot be used for identifiers such as schema and table names, so the statement has to be built with string formatting, using placeholders of the form {}.{}:
sc='myschema'
tb='mytable'
cur1.execute("SELECT COUNT(*) FROM {}.{}".format(sc,tb))
print(cur1.fetchone()[0])
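Since string formatting bypasses bind variables, it is worth validating the two names before interpolating them. A minimal sketch, assuming the connected user can see the table in all_tables:
# Identifiers cannot be bound, but the values used to validate them can
cur1.execute(
    "SELECT COUNT(*) FROM all_tables WHERE owner = :o AND table_name = :t",
    o=sc.upper(), t=tb.upper())
if cur1.fetchone()[0]:
    cur1.execute("SELECT COUNT(*) FROM {}.{}".format(sc, tb))
    print(cur1.fetchone()[0])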
In this particular case, you could also try setting Connection.current_schema; see the cx_Oracle API doc.
For example, if you create a table in your own schema:
SQL> show user
USER is "CJ"
SQL> create table ffff (mycol number);
Table created.
SQL> insert into ffff values (1);
1 row created.
SQL> commit;
Commit complete.
Then run Python code that connects as a different user:
import cx_Oracle
import os
import sys

if sys.platform.startswith("darwin"):
    cx_Oracle.init_oracle_client(lib_dir=os.environ.get("HOME")+"/Downloads/instantclient_19_8")

username = "system"
password = "oracle"
connect_string = "localhost/orclpdb1"

connection = cx_Oracle.connect(username, password, connect_string)
connection.current_schema = 'CJ'

with connection.cursor() as cursor:
    sql = """select * from ffff"""
    for r in cursor.execute(sql):
        print(r)
    sql = """select sys_context('USERENV','CURRENT_USER') from dual"""
    for r in cursor.execute(sql):
        print(r)
the output will be:
(1,)
('SYSTEM',)
The last query shows that it is not the user that is being changed; only the first query is automatically resolved from 'ffff' to 'CJ.ffff'.
I have a PostgreSQL database "Test2" hosted on my localhost. I am able to see the tables using pgAdmin. I want to fetch the data of the DB from a Jupyter Notebook. I tried to connect to the DB by following the steps shown in "2) of Part 2" of https://towardsdatascience.com/python-and-postgresql-how-to-access-a-postgresql-database-like-a-data-scientist-b5a9c5a0ea43
Thus, my code is:
import config as creds
import pandas as pd
import psycopg2

def connect():
    # Set up a connection to the postgres server.
    conn_string = "host="+ creds.PGHOST +" port="+ "5432" +" dbname="+ creds.PGDATABASE +" user=" + creds.PGUSER \
                  +" password="+ creds.PGPASSWORD
    conn = psycopg2.connect(conn_string)
    print("Connected!")
    # Create a cursor object
    cursor = conn.cursor()
    return conn, cursor
# Connecting to DB
conn, cursor = connect()
# SQL command to query the clubs table
abc = ("""SELECT * FROM clubs""")
# Execute SQL command and fetch the results
cursor.execute(abc)
results = cursor.fetchall()
print(results)
conn.commit()
My config.py looks like this:
PGHOST = 'localhost'
PGDATABASE = 'Test2'
PGUSER = '#####'
PGPASSWORD = '#####'
I am able to get the output when the table name has all lowercase characters, but for table names with mixed case like "clubCategory", it throws an error stating "relation "clubcategory" does not exist".
I tried
abc = ("""SELECT * FROM 'clubCategory' """)
but it still throws an error.
Any help please?
Try using double quotes:
abc = ('''SELECT * FROM "clubCategory" ''')
Also see this answer: https://stackoverflow.com/a/21798517/1453822
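The underlying reason is that PostgreSQL folds unquoted identifiers to lowercase, so clubCategory is looked up as clubcategory unless the name is double-quoted; single quotes create string literals, not identifiers. If you build such queries dynamically, psycopg2 also provides a composition module for quoting identifiers safely. A minimal sketch reusing the cursor from the question:
from psycopg2 import sql

# sql.Identifier double-quotes the name, preserving its case
query = sql.SQL("SELECT * FROM {}").format(sql.Identifier("clubCategory"))
cursor.execute(query)
print(cursor.fetchall())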
I am trying to create a temporary table in my SQL database, into which I then want to insert data (from a Pandas DataFrame), and via this temporary table insert the data into a 'permanent' table within the database.
So far I have something like
""" Database specific... """
import sqlalchemy
from sqlalchemy.sql import text
dsn = 'dsn-sql-acc'
database = "MY_DATABASE"
connection_str = """
Driver={SQL Server Native Client 11.0};
Server=%s;
Database=%s;
Trusted_Connection=yes;
""" % (dsn,database)
connection_str_url = urllib.quote_plus(connection_str)
engine = sqlalchemy.create_engine("mssql+pyodbc:///?odbc_connect=%s" % connection_str_url, encoding='utf8', echo=True)
# Open connection
db_connection = engine.connect()
sql_create_table = text("""
    IF OBJECT_ID('[MY_DATABASE].[SCHEMA_1].[TEMP_TABLE]', 'U') IS NOT NULL
        DROP TABLE [MY_DATABASE].[SCHEMA_1].[TEMP_TABLE];
    CREATE TABLE [MY_DATABASE].[SCHEMA_1].[TEMP_TABLE] (
        [Date] Date,
        [TYPE_ID] nvarchar(50),
        [VALUE] nvarchar(50)
    );
""")
db_connection.execute("commit")
db_connection.execute(sql_create_table)
db_connection.close()
The "raw" SQL-snippet within sql_create_table works fine when executed in SQL Server, but when running the above in Python, nothing happens in my database...
What seems to be the issue here?
Later on I would of course want to execute
BULK INSERT [MY_DATABASE].[SCHEMA_1].[TEMP_TABLE]
FROM '//temp_files/temp_file_data.csv'
WITH (FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR='\n');
in Python as well...
Thanks
These statements are out of order:
db_connection.execute("commit")
db_connection.execute(sql_create_table)
Commit after creating your table and your table will persist.
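For completeness, the corrected ordering of the snippet from the question would be:
db_connection.execute(sql_create_table)
db_connection.execute("commit")
Alternatively, SQLAlchemy can manage the transaction itself; a minimal sketch, assuming the engine from the question:
# engine.begin() opens a transaction and commits it on a successful exit
with engine.begin() as conn:
    conn.execute(sql_create_table)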