Python - Reading specific column from SQL output stored in a variable

I have a basic question here. I am pulling a SQL output as below:
cur = connection.cursor()
cur.execute("""select store_name,count(*) from stores group by store_name""")
data = cur.fetchall()
The output of the above SQL is as below:
Store_1,23
Store_2,13
Store_3,43
Store_4,2
I am trying to read column 1 (store_name) in the above output.
Expected Output:
Store_1
Store_2
Store_3
Store_4
Could anyone advise how I could get this done? Thanks.

If I understand your question correctly, just correcting your SQL will give you the desired result: fetch the distinct store_name values.
select distinct store_name from stores
Edit
Response to comment:
Try the following:
from operator import itemgetter
data = cur.fetchall()
list(map(itemgetter(0), data)) # your answer: the first column (store_name) of every row
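With the sample data above this gives ['Store_1', 'Store_2', 'Store_3', 'Store_4']. A plain list comprehension is an equivalent one-liner without the import:
store_names = [row[0] for row in data]  # first column of every row
print(store_names)  # ['Store_1', 'Store_2', 'Store_3', 'Store_4']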

In your code, you can simply append the following lines:
for rows in data:
    print(rows[0])
Hope this helps.
BTW: I am not at a computer and have not cross-checked the solution.

Harvard's CS50 web class has the following, which I think helps you in its last 3 lines.
import os
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker
engine = create_engine(os.getenv("DATABASE_URL")) # database engine object from SQLAlchemy that manages connections to the database
# DATABASE_URL is an environment variable that indicates where the database lives
db = scoped_session(sessionmaker(bind=engine)) # create a 'scoped session' that ensures different users' interactions with the
# database are kept separate
flights = db.execute("SELECT origin, destination, duration FROM flights").fetchall() # execute this SQL command and return all of the results
for flight in flights:
    print(f"{flight.origin} to {flight.destination}, {flight.duration} minutes.") # for every flight, print out the flight info
So in your case I suppose it's:
results = db.execute( <PUT YOUR SQL QUERY HERE> )
for row in results:
    print(row.store_name)
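Filling in the query from the question, a minimal sketch (assuming db is the scoped session created above) would be:
results = db.execute("SELECT store_name, COUNT(*) FROM stores GROUP BY store_name").fetchall()
for row in results:
    print(row.store_name)  # prints Store_1, Store_2, ...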

Related

ORA-00942 when importing data from Oracle Database to Pandas

I am trying to import data from an Oracle database into a pandas DataFrame.
Right now I am using:
import cx_Oracle
import pandas as pd
db_connection_string = '.../A1@server:port/servername'
con = cx_Oracle.connect(db_connection_string)
query = """SELECT*
FROM Salesdata"""
df = pd.read_sql(query, con=con)
and get the following error: DatabaseError: ORA-00942: Table or view doesn't exist
When I run a query to get the list of all tables:
cur = con.cursor()
cur.execute("SELECT table_name FROM dba_tables")
for row in cur:
    print(row)
The output looks like this:
('A$',)
('A$BD',)
('Salesdata',)
What am I doing wrong? I used this question to start.
If I use the comment to print(query), I get:
SELECT *
FROM Salesdata
Getting ORA-00942 when running SELECT can have 2 possible causes:
The table does not exist: here you should make sure the table name is prefixed by the table owner (schema name) as in select * from owner_name.table_name. This is generally needed if the current Oracle user connected is not the table owner.
You don't have SELECT privileges on the table. This is also generally needed if the current Oracle user connected is not the table owner.
You need to check both.
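For example, a quick sketch of the first point against the code above (owner_name is a placeholder for the actual table owner):
query = """SELECT *
FROM owner_name.Salesdata"""
df = pd.read_sql(query, con=con)
If it is a privilege problem instead, the owner (or a DBA) would need to run something like GRANT SELECT ON owner_name.Salesdata TO your_user first.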

SQLite query difficulty

I have a SQLite db with three relational tables. I'm trying to return the max record from a log table along with related columns from the other tables based on the ID relationships.
I created the query in DB Browser and verified it returns the expected record; however, when I use the exact same query statement in my Python code, it never steps into the 'for' loop.
SQL statement in Python -
def GetLastLogEntry():
    readings = ()
    conn = sqlite3.connect(dbName)
    conn.row_factory = sqlite3.Row
    cursor = conn.cursor()
    cursor.execute("""SELECT f.FoodCategory, f.FoodName, gs.FoodWeight,
        gsl.GrillDateTime, gsl.CurrentGrillTemp, gsl.TargetGrillTemp,
        gsl.CurrentFoodTemp, gsl.TargetFoodTemp, gsl.CurrentOutsideTemp,
        gsl.CurrentOutsideHumidity
        FROM Food as f, GrillSession as gs, GrillSessionLog as gsl
        WHERE f.FoodId = gs.FoodId
        AND gs.GrillSessionID = gsl.GrillSessionID
        AND gsl.GrillSessionLogID =
            (SELECT MAX(GrillSessionLog.GrillSessionLogID)
             FROM GrillSessionLog, GrillSession
             WHERE GrillSessionLog.GrillSessionID = GrillSession.GrillSessionID
             AND GrillSession.ActiveSession = 1)""")
    for row in cursor:
        print("In for loop")
        readings = readings + (row['FoodCategory'], row['FoodName'])
        print("Food Cat = " + row['FoodCategory'])
    cursor.close()
    return readings
The query in DB Browser returns only one row which is what I'm trying to have happen in the python code.
Just discovered the issue....
Using DB Browser, I updated a record I'm using for testing but failed to "write" the change to the database table. As a result, every time I was executing my python code against the table it was executing the query with the original record values because my change wasn't yet committed via DB Browser.
Huge brain fart on that one.... Hopefully it will be a lesson learned for someone else in the future.

Python Running SQL Query With Temp Tables

I am new to the Python-SQL connectivity world. My goal is to retrieve data from SQL into a pandas DataFrame by executing long SQL queries through my Python script.
Most of my SQL queries are long, with multiple interim temp tables before the final SELECT statement from the last temp table. When I run such a monolithic query in Python I get an error saying:
"pandas.io.sql.DatabaseError: Execution failed on sql"
though the queries run absolutely fine in MS SQL Management Studio.
I suspect this is due to the interim temp tables, because if I split my long query into two pieces (everything before the final SELECT in the 1st section and the final SELECT in the 2nd section) and run the two sections sequentially, they run fine.
Can someone explain why this is so, or alternatively what the best way is to run long queries with temp tables/views and retrieve the results in a pandas DataFrame?
Here is my sample Python code that ideally should take a file name as an input and run the SQL to retrieve results in a DataFrame; however, it fails in the case of a query with temp tables:
import pyodbc as db
import pandas as pd
filename = 'file.sql'
username = 'XXXX'
password = 'YYYYY'
driver= '{ODBC Driver 13 for SQL Server}'
database = 'DB'
server = 'local'
conn = db.connect('DRIVER='+driver+';PORT=1433;SERVER='+server+';DATABASE='+database+';UID='+username+';PWD='+password)
fd = open(filename, 'r')
sqlfile = fd.read()
fd.close()
sqlcommand1 = sqlfile  # the whole script read from the file
df_table = pd.read_sql(sqlcommand1, conn)
If I break my SQL query in two pieces (one with all temp tables and the 2nd with the final SELECT), then it runs fine. Below is a modified version that splits the long query after finding '/**/', and it works fine:
"""
This Function Reads a SQL Script From an Extrenal File and Executes The
Script in SQL. If The SQL Script Has Bunch of Tem Tables/Views
Followed By a Select Statement to Retrieve Data From Those Views Then Input
SQL File Should Have '/**/' Immediately Before the Final
Select Statement. This is to Esnure Final Select Statement is Executed on
the Temporary Views Already Run by Python.
Input is a SQL File Name and Output is a DataFrame
"""
import pyodbc as db
import pandas as pd
filename = 'filename.sql'
username = 'XXXX'
password = 'YYYYY'
driver= '{ODBC Driver 13 for SQL Server}'
database = 'DB'
server = 'local'
conn = db.connect('DRIVER='+driver+';PORT=1433;SERVER='+server+';DATABASE='+database+';UID='+username+';PWD='+password)
fd = open(filename, 'r')
sqlfile = fd.read()
fd.close()
sql = sqlfile.split('/**/')
sqlcommand1 = sql[0] #1st Section of Query with temp tables
sqlcommand2 = sql[1] #2nd section of Query with final SELECT statement
conn.execute(sqlcommand1)
df_table = pd.read_sql(sqlcommand2, conn)
Quick and dirty answer: if using T-SQL, put the line SET NOCOUNT ON at the beginning of your query.
Like @Parfait mentioned above, the pandas read_sql method can only support one result set. However, when you generate a temp table in T-SQL, you do create a result set in the form "(XX row(s) affected)", which is what causes your original query to fail. By setting NOCOUNT you eliminate any early returns and only get the results from your final SELECT statement.
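A minimal sketch against the code above (assuming conn and sqlfile as defined there):
sqlcommand1 = "SET NOCOUNT ON;\n" + sqlfile  # suppress the "(N row(s) affected)" messages
df_table = pd.read_sql(sqlcommand1, conn)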
Alternatively, if using pyodbc cursor instead of pandas you can utilize nextset() to skip the result sets from the temp table(s). More info on pyodbc here.
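A sketch of that pyodbc route, again assuming conn and sqlfile from the code above:
cursor = conn.cursor()
cursor.execute(sqlfile)
while cursor.description is None:  # current result set has no rows (e.g. a temp-table step)
    if not cursor.nextset():  # advance to the next result set, if any
        break
if cursor.description is not None:
    rows = [tuple(r) for r in cursor.fetchall()]
    columns = [col[0] for col in cursor.description]
    df_table = pd.DataFrame.from_records(rows, columns=columns)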

How to write csv file into sql database with python

I have a CSV file that includes some information about computers, such as OS type, RAM, and CPU values, and I have a SQL database that already has the same information. I want to update that database table with a Python script. The database table and the CSV file have unique "id" parameters.
import csv
with open("Hypersanal.csv") as csvfile:
    readCSV = csv.reader(csvfile, delimiter=';')
    for row in readCSV:
        print row
Depending on what type of database, there will be some slight adjustments to make to the code.
For this example, I'll use SQLAlchemy with the pymysql driver. To find out what the first part of the connection String should be (depends on the kind of database you want to connect to), check SQLAlchemy Doc about Dialects.
First, we import the necessary modules
from sqlalchemy import *
from sqlalchemy.orm import create_session
from sqlalchemy.ext.declarative import declarative_base
Then, we create the connection string
dialect_part = "mysql+pymysql://"
# username is the username we'll use to connect to the db, and password the corresponding password
# server_name is the name of the server where the db is. It INCLUDES the port number (eg : 'localhost:9080')
# database is the name of the db on the server we'll work on
connection_string = dialect_part+username+":"+password+"@"+server_name+"/"+database
Some more setup is needed for SQLAlchemy:
Base = declarative_base()
engine = create_engine(connection_string)
metadata = MetaData(bind=engine)
Now, we have a link to the db, but need some more work before being able to do anything to it.
We create a class corresponding to the table of the db we'll hit. This class will 'autofill' according to how the table is in the db. You can also fill it manually.
class TableWellHit(Base):
    __table__ = Table(name_of_the_table, metadata, autoload=True)
Now, to be able to interact with the table, we need to create a session :
session = create_session(bind=engine)
Now, we need to begin the session, and we'll be set.
Your code will now be used.
import csv
with open("Hypersanal.csv") as csvfile:
    readCSV = csv.reader(csvfile, delimiter=';')
    for row in readCSV:
        # print(row)
        # I chose to push each value from the csv one by one
        # If you're sure there won't be any duplicates for the primary key, you can put the session.begin() before the for loop
        session.begin()
        # I create an element for the db
        new_element = TableWellHit(field_in_table=row[field])
An example for this: imagine you have required fields 'username' and 'password' in the table, and row is a dictionary containing 'user' and 'pass' as keys.
The element would then be created by: TableWellHit(username=row['user'], password=row['pass'])
        # I add the element to the table
        # I chose to merge instead of add, so as to prevent duplicates, one more time
        session.merge(new_element)
        # Now, we commit our changes to the db
        # This also closes the session
        # if you put the session.begin() outside of the loop, do the same for the session.commit()
        session.commit()
Hope this answers your question, and if it does not, just let me know so I can correct my answer.
Edit:
For MSSQL:
- Install pymssql (pip install pymssql)
The connection_string should be of the following form, according to this SQLAlchemy page: mssql+pymssql://<username>:<password>@<freetds_name>/?charset=utf8
Using merge allows you to create or update a value, depending on whether or not it already exists.
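For instance, a sketch with hypothetical credentials (myuser, mypassword, myserver, mydb) plugged into that form:
from sqlalchemy import create_engine
connection_string = "mssql+pymssql://myuser:mypassword@myserver/mydb?charset=utf8"
engine = create_engine(connection_string)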

List database tables with SQLAlchemy

I want to implement a function that gives information about all the tables (and their column names) that are present in a database (not only those created with SQLAlchemy). While reading the documentation it seems to me that this is done via reflection but I didn't manage to get something working. Any suggestions or examples on how to do this?
start with an engine:
from sqlalchemy import create_engine
engine = create_engine("postgresql://u:p@host/database")
quick path to all table/column names, use an inspector:
from sqlalchemy import inspect
inspector = inspect(engine)
for table_name in inspector.get_table_names():
    for column in inspector.get_columns(table_name):
        print("Column: %s" % column['name'])
docs: http://docs.sqlalchemy.org/en/rel_0_9/core/reflection.html?highlight=inspector#fine-grained-reflection-with-inspector
alternatively, use MetaData / Tables:
from sqlalchemy import MetaData
m = MetaData()
m.reflect(engine)
for table in m.tables.values():
    print(table.name)
    for column in table.c:
        print(column.name)
docs: http://docs.sqlalchemy.org/en/rel_0_9/core/reflection.html#reflecting-all-tables-at-once
First set up the sqlalchemy engine.
from sqlalchemy import create_engine, inspect, text
from sqlalchemy.engine import url
connect_url = url.URL(
    'oracle',
    username='db_username',
    password='db_password',
    host='db_host',
    port='db_port',
    query=dict(service_name='db_service_name'))
engine = create_engine(connect_url)
try:
    engine.connect()
except Exception as error:
    print(error)
    raise  # stop here if the connection could not be established
Like others have mentioned, you can use the inspect method to get the table names.
But in my case, the list of tables returned by the inspect method was incomplete.
So I found another way to fetch table names, using a plain SQL query in sqlalchemy.
query = text("SELECT table_name FROM all_tables WHERE owner = '%s'" % str('db_username'))
table_name_data = self.session.execute(query).fetchall()  # self.session: a SQLAlchemy session bound to the engine
Just for the sake of completeness, here's the code to fetch table names with the inspect method (if it works well in your case).
inspector = inspect(engine)
table_names = inspector.get_table_names()
Hey, I created a small module that makes it easy to reflect all tables in a database you connect to with SQLAlchemy, give it a look: EZAlchemy
from EZAlchemy.ezalchemy import EZAlchemy
DB = EZAlchemy(
    db_user='username',
    db_password='pezzword',
    db_hostname='127.0.0.1',
    db_database='mydatabase',
    d_n_d='mysql'  # stands for dialect+driver
)
# this function loads all tables in the database to the class instance DB
DB.connect()
# List all associations to DB, you will see all the tables in that database
dir(DB)
I'm proposing another solution as I was not satisfied by any of the previous ones in the case of postgres, which uses schemas. I hacked this solution together by looking into the pandas source code.
from sqlalchemy import MetaData, create_engine
from typing import List
def list_tables(pg_uri: str, schema: str) -> List[str]:
    with create_engine(pg_uri).connect() as conn:
        meta = MetaData(conn, schema=schema)
        meta.reflect(views=True)
        return list(meta.tables.keys())
In order to get a list of all tables in your schema, you need to form your postgres database uri pg_uri (e.g. "postgresql://u:p@host/database" as in zzzeek's answer) as well as the schema's name schema. So if we use the example uri and the typical schema public, we would get all the tables and views with:
list_tables("postgresql://u:p@host/database", "public")
While reflection/inspection is useful, I had trouble getting the data out of the database. I found sqlsoup to be much more user-friendly. You create the engine using sqlalchemy and pass that engine to sqlsoup.SQLSoup, i.e.:
import sqlsoup
def create_engine():
    from sqlalchemy import create_engine
    return create_engine(f"mysql+mysqlconnector://{database_username}:{database_pw}@{database_host}/{database_name}")

def test_sqlsoup():
    engine = create_engine()
    db = sqlsoup.SQLSoup(engine)
    # Note: database must have a table called 'users' for this example
    users = db.users.all()
    print(users)

if __name__ == "__main__":
    test_sqlsoup()
If you're familiar with sqlalchemy then you're familiar with sqlsoup. I've used this to extract data from a wordpress database.
