Background:
I am very new to SQLAlchemy, and I find it confusing to work out how I should be selecting things.
I have a table in my MySQL database called genes, with the columns gene_id, gene_description, and gene_symbol.
What I want to do:
All I want to do is a simple select query:
SELECT * FROM Genes
But I am confused about how to achieve this.
Here is what I have done:
import sqlalchemy
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.automap import automap_base
import csv
import pandas as pd
engine = sqlalchemy.create_engine('mysql://root:toor@localhost') # connect to server
metadata = sqlalchemy.MetaData(bind=engine)
engine.execute("USE TestDB")
genes = sqlalchemy.table('Genes')
s = sqlalchemy.select([genes])
engine.execute(s)
The problem:
ProgrammingError: (_mysql_exceptions.ProgrammingError) (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'FROM `Genes`' at line 2") [SQL: u'SELECT \nFROM `Genes`']
Also, is there some type of "intellisense" where I can just do something like gene_table = engine.Gene? If I am not mistaken, there is a way to do this with mapping, but it didn't work for me.
EDIT:
This may help:
How to automatically reflect database to sqlalchemy declarative?
So we can use reflection and do not have to create classes explicitly, but if we want speed we can create them using something like sqlautocode as stated here:
Reverse engineer SQLAlchemy declarative class definition from existing MySQL database?
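The `SELECT \nFROM Genes` in the error message suggests that `sqlalchemy.table('Genes')` produced a table construct with no columns, so the SELECT had nothing to list. Reflecting the table is one way around that. Below is a minimal sketch, assuming SQLAlchemy 1.4 or later and using an in-memory SQLite database as a stand-in for the MySQL server (the sample row is made up for illustration):

```python
from sqlalchemy import MetaData, Table, create_engine, select, text

# Assumption: an in-memory SQLite database stands in for the MySQL server.
engine = create_engine("sqlite://")
with engine.begin() as conn:
    conn.execute(text(
        "CREATE TABLE Genes ("
        "gene_id INTEGER PRIMARY KEY, "
        "gene_symbol TEXT, "
        "gene_description TEXT)"
    ))
    # Hypothetical sample row, only here so the SELECT returns something.
    conn.execute(text(
        "INSERT INTO Genes (gene_id, gene_symbol, gene_description) "
        "VALUES (1, 'BRCA1', 'example row')"
    ))

# Reflection loads the column definitions from the database, so the
# generated SELECT has a real column list instead of an empty one.
metadata = MetaData()
genes = Table("Genes", metadata, autoload_with=engine)

with engine.connect() as conn:
    rows = conn.execute(select(genes)).fetchall()

column_names = [column.name for column in genes.columns]
```

With a real MySQL server, only the `create_engine` URL should need to change, and the CREATE/INSERT setup would be dropped.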
Also there is an issue with mysql databases where it will give an error that looks like the following: (taken from bitbucket: https://bitbucket.org/zzzeek/sqlalchemy/issues/1909/reflection-issue-with-mysql-url-with-no)
SNIP...
File "/opt/buildout-eggs/SQLAlchemy-0.6.4-py2.5.egg/sqlalchemy/schema.py", line 1927, in __init__
self.reflect()
File "/opt/buildout-eggs/SQLAlchemy-0.6.4-py2.5.egg/sqlalchemy/schema.py", line 2037, in reflect
connection=conn))
File "/opt/buildout-eggs/SQLAlchemy-0.6.4-py2.5.egg/sqlalchemy/engine/base.py", line 1852, in table_names
return self.dialect.get_table_names(conn, schema)
File "<string>", line 1, in <lambda>
File "/opt/buildout-eggs/SQLAlchemy-0.6.4-py2.5.egg/sqlalchemy/engine/reflection.py", line 32, in cache
return fn(self, con, *args, **kw)
File "/opt/buildout-eggs/SQLAlchemy-0.6.4-py2.5.egg/sqlalchemy/dialects/mysql/base.py", line 1791, in get_table_names
self.identifier_preparer.quote_identifier(current_schema))
File "/opt/buildout-eggs/SQLAlchemy-0.6.4-py2.5.egg/sqlalchemy/sql/compiler.py", line 1517, in quote_identifier
return self.initial_quote + self._escape_identifier(value) + self.final_quote
File "/opt/buildout-eggs/SQLAlchemy-0.6.4-py2.5.egg/sqlalchemy/dialects/mysql/mysqldb.py", line 77, in _escape_identifier
value = value.replace(self.escape_quote, self.escape_to_quote)
AttributeError: 'NoneType' object has no attribute 'replace'
This is resolved by adding a database name (the one you are using) as follows:
engine = create_engine('mysql+mysqldb://USER_NAME:PASSWORD@127.0.0.1/DATABASE_NAME', pool_recycle=3600) # connect to server
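A quick way to check whether a URL actually carries a database name, before any reflection is attempted, is to parse it with `make_url`. This is only a sketch; the two URLs are the placeholder ones from above, and no connection is opened:

```python
from sqlalchemy.engine.url import make_url

# Parsing a URL does not connect to anything or import the driver.
with_db = make_url("mysql+mysqldb://USER_NAME:PASSWORD@127.0.0.1/DATABASE_NAME")
without_db = make_url("mysql+mysqldb://USER_NAME:PASSWORD@127.0.0.1")

has_database = with_db.database is not None
missing_database = without_db.database is None
```

If `database` comes back as `None`, the URL is of the kind that triggered the reflection error quoted above.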
I used this to connect correctly:
http://docs.sqlalchemy.org/en/latest/orm/extensions/automap.html
and this:
http://docs.sqlalchemy.org/en/latest/core/engines.html
My code finally looks like this:
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import Session
from sqlalchemy import create_engine
Base = automap_base()
# engine, suppose it has two tables 'user' and 'address' set up
engine = create_engine('mysql+mysqldb://root:toor@127.0.0.1/TestDB', pool_recycle=3600) # connect to server
# reflect the tables
Base.prepare(engine, reflect=True)
# mapped classes are now created with names by default
# matching that of the table name.
Genes = Base.classes.Genes
Address = Base.classes.address
#Start Session
session = Session(engine)
#add a row:
session.add(Genes(Gene_Id=1,Gene_Symbol = "GENE_SYMBOL", Gene_Description="GENE_DESCRIPTION"))
session.commit()
q = session.query(Genes).all()
for gene in q:
    print("This is the Gene ID {},\n This is the Gene Desc {},\n this is the Gene symbol {}.".format(gene.Gene_Id, gene.Gene_Description, gene.Gene_Symbol))
Related
I am trying to create a table in a database. This is my connection code:
import pyodbc
from sqlalchemy import create_engine
# pyodbc connection connect to server
conn = pyodbc.connect(
    "driver={SQL Server};server=xxxxxxxxxxx; database=master; trusted_connection=true",
    autocommit=True, Trusted_Connection='Yes')
crsr = conn.cursor()
# connect db (connect to database name) using SQL-Alchemy
engine = create_engine(
    'mssql+pyodbc://xxxxxxxxxxx/master?driver=SQL+Server+Native+Client+11.0')
connection = engine.connect()
It's just a pyodbc connection, and this is the error I get:
Traceback (most recent call last):
File "C:/Users/haroo501/PycharmProjects/ToolUpdated/app.py", line 22, in <module>
dfeed_gsm_relation_m.push_dfeed_gsm_relation_sql()
File "C:\Users\haroo501\PycharmProjects\ToolUpdated\meta_data\dfeed_gsm_relation_m.py", line 31, in push_dfeed_gsm_relation_sql
if connec.crsr.dialect.has_table(connec.crsr, DATAF_GSM_RELATION):
AttributeError: 'pyodbc.Cursor' object has no attribute 'dialect'
and this is the code that creates the table in the database using MetaData():
from sqlalchemy import MetaData, Table, Column, Integer, String, Date, Float
from database import connec
import sqlalchemy as db
import pandas as pd
import numpy as np
from txt_to_csv import convert_to_csv
import os
def push_dfeed_gsm_relation_sql():
    # Create a dictionary for all gsm_relations_mnm relation excel file
    dataf_gsm_relation_col_dict = {
        'cell_name': 'Cellname',
        'n_cell_name': 'Ncellname',
        'technology': 'Technology',
    }
    # table name in database 'df_gsm_relation'
    DATAF_GSM_RELATION = 'df_gsm_relation'
    # Create a list for dataf_gsm_relation_cols and put the dictionary keys in the list
    dataf_gsm_relation_cols = list(dataf_gsm_relation_col_dict.keys())
    dataf_gsm_relation_cols_meta = MetaData()
    dataf_gsm_relation_relations = Table(
        DATAF_GSM_RELATION, dataf_gsm_relation_cols_meta,
        Column('id', Integer, primary_key=True),
        Column(dataf_gsm_relation_cols[0], Integer),
        Column(dataf_gsm_relation_cols[1], Integer),
        Column(dataf_gsm_relation_cols[2], String),
    )
    if connec.crsr.dialect.has_table(connec.crsr, DATAF_GSM_RELATION):
        dataf_gsm_relation_relations.drop(connec.crsr)
    dataf_gsm_relation_cols_meta.create_all(connec.crsr)
    dataf_gsm_rel_txt = 'gsmrelation_mnm.txt'
    dataf_gsm_txt_df = pd.read_csv(dataf_gsm_rel_txt, sep=';')
    dataf_gsm_rel_df_column_index = list(dataf_gsm_txt_df.columns)
    dataf_gsm_txt_df.reset_index(inplace=True)
    dataf_gsm_txt_df.drop(columns=dataf_gsm_txt_df.columns[-1], inplace=True)
    dataf_gsm_rel_df_column_index = dict(zip(list(dataf_gsm_txt_df.columns), dataf_gsm_rel_df_column_index))
    dataf_gsm_txt_df.rename(columns=dataf_gsm_rel_df_column_index, inplace=True)
    dataf_gsm_txt_df.to_excel('gsmrelation_mnm.xlsx', 'Sheet1', index=False)
    dataf_gsm_rel_excel = 'gsmrelation_mnm.csv'
    dataf_gsm_rel_df = pd.read_csv(os.path.join(os.path.dirname(__file__), dataf_gsm_rel_excel), dtype={
        dataf_gsm_relation_col_dict[dataf_gsm_relation_cols[0]]: int,
        dataf_gsm_relation_col_dict[dataf_gsm_relation_cols[1]]: int,
        dataf_gsm_relation_col_dict[dataf_gsm_relation_cols[2]]: str,
    })
    dataf_gsm_relations_table_query = db.insert(dataf_gsm_relation_relations)
    dataf_gsm_relations_values_list = []
    dataf_gsm_relations_row_count = 1
    for i in dataf_gsm_rel_df.index:
        dataf_gsm_relations_row = dataf_gsm_rel_df.loc[i]
        dataf_gsm_rel_df_record = {'id': dataf_gsm_relations_row_count}
        for col in dataf_gsm_relation_col_dict.keys():
            if col == dataf_gsm_relation_cols[0] or col == dataf_gsm_relation_cols[1]:
                dataf_gsm_rel_df_record[col] = int(dataf_gsm_relations_row[dataf_gsm_relation_col_dict[col]])
            else:
                dataf_gsm_rel_df_record[col] = dataf_gsm_relations_row[dataf_gsm_relation_col_dict[col]]
        dataf_gsm_relations_values_list.append(dataf_gsm_rel_df_record)
        dataf_gsm_relations_row_count += 1
    ResultProxy_dataf_gsm_relations = connec.crsr.execute(dataf_gsm_relations_table_query,
                                                          dataf_gsm_relations_values_list)
The problem is in this part:
if connec.crsr.dialect.has_table(connec.crsr, DATAF_GSM_RELATION):
    dataf_gsm_relation_relations.drop(connec.crsr)
dataf_gsm_relation_cols_meta.create_all(connec.crsr)
I know that dialect belongs to the engine created with from sqlalchemy import create_engine, which was my old connection. I have since switched to a new connection created with import pyodbc.
So how can I solve this case using the pyodbc module?
Edited
Another way to put the question: how do I CREATE and DROP a table in an existing database using SQLAlchemy?
This is the related code example:
from database import connec
def create_db():
    create_bd_query = "CREATE DATABASE MyNewDatabase"
    connec.crsr.execute(create_bd_query)
def delete_database():
    delete_bd_query = "DROP DATABASE MyNewDatabase"
    connec.crsr.execute(delete_bd_query)
You cannot just import a completely different module and expect it to behave the same :)
Dialects are what SQLAlchemy uses to communicate with different drivers.
In this instance pyodbc IS the driver, so its cursor has no dialect attribute.
From SQLAlchemy:
Dialects
The dialect is the system SQLAlchemy uses to communicate with various types of DBAPI implementations and databases. The sections that follow contain reference documentation and notes specific to the usage of each backend, as well as notes for the various DBAPIs.
All dialects require that an appropriate DBAPI driver is installed.
Included Dialects
PostgreSQL
MySQL
SQLite
Oracle
Microsoft SQL Server
Microsoft SQL Server
Support for the Microsoft SQL Server database.
DBAPI Support
The following dialect/DBAPI options are available. Please refer to individual
DBAPI sections for connect information.
PyODBC
mxODBC
pymssql
zxJDBC for Jython
adodbapi
Judging from the error and by looking at the PyODBC Wiki Documentation
I think this line:
if connec.crsr.dialect.has_table(connec.crsr, DATAF_GSM_RELATION):
should read:
# Does table 'DATAF_GSM_RELATION' exist?
if connec.crsr.tables(table=DATAF_GSM_RELATION).fetchone():
    ...
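If you would rather stay on the SQLAlchemy side, note that the `drop` and `create_all` calls in the question expect an Engine or Connection, not a pyodbc cursor. Here is a minimal sketch of that route, assuming SQLAlchemy 1.4 or later (for `Inspector.has_table`) and using in-memory SQLite as a stand-in for the `mssql+pyodbc` engine; `recreate_table` is a name made up for this example:

```python
from sqlalchemy import (
    Column, Integer, MetaData, String, Table, create_engine, inspect,
)

# Assumption: in-memory SQLite replaces the mssql+pyodbc engine from the question.
engine = create_engine("sqlite://")

DATAF_GSM_RELATION = "df_gsm_relation"
meta = MetaData()
relations = Table(
    DATAF_GSM_RELATION, meta,
    Column("id", Integer, primary_key=True),
    Column("cell_name", Integer),
    Column("n_cell_name", Integer),
    Column("technology", String),
)


def recreate_table(engine):
    """Drop the table if it exists, then create it again (hypothetical helper)."""
    # checkfirst=True makes drop() a no-op when the table is missing.
    relations.drop(engine, checkfirst=True)
    meta.create_all(engine)


existed_before = inspect(engine).has_table(DATAF_GSM_RELATION)
recreate_table(engine)
exists_after = inspect(engine).has_table(DATAF_GSM_RELATION)
```

The key point is that the object passed around is the Engine returned by `create_engine`, so the dialect is available to SQLAlchemy internally and never has to be looked up on a cursor.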
Please help me solve this error, AttributeError(key):
File "pivot_table_measurements.py", line 1, in <module>
from database import *
File "/home/dedeco/Projetos/bigclima-project/database.py", line 24, in <module>
MeasureRanges = Base.classes.measure_ranges
File "/home/dedeco/craw/lib/python3.5/site-packages/sqlalchemy/util/_collections.py", line 212, in __getattr__
raise AttributeError(key)
AttributeError: measure_ranges
database.py:
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import Session
from sqlalchemy import create_engine
from sqlalchemy.orm.exc import MultipleResultsFound, NoResultFound
import string
from decimal import Decimal
ECHO = False
AUTOFLUSH = False
Base = automap_base()
engine = create_engine('postgresql://user:pass@localhost:5432/clima', echo=ECHO)
Base.prepare(engine, reflect=True)
Country = Base.classes.countries
State = Base.classes.states
City = Base.classes.cities
Measurement = Base.classes.measurements
MeasurementHourly = Base.classes.measurements_hourly
MeasureRanges = Base.classes.measure_ranges
Parameter = Base.classes.parameters
WeatherStation = Base.classes.weather_stations
session = Session(engine, autoflush=AUTOFLUSH)
The table measure_ranges exists in the database, but I don't know why I'm receiving this error. When I delete this line (MeasureRanges = Base.classes.measure_ranges) everything works, so I believe the error is related to some issue with this table.
I decided to share this problem and the answer because it's a basic thing, but the solution can be difficult to find.
The problem happened because I hadn't created a primary key on this table, and without one the table cannot be automapped to a class.
See the documentation:
By viable, we mean that for a table to be mapped, it must specify a
primary key. Additionally, if the table is detected as being a pure
association table between two other tables, it will not be directly
mapped and will instead be configured as a many-to-many table between
the mappings for the two referring tables.
See more details here: SQLAlchemy 1.2 Documentation - Automap
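The behaviour is easy to reproduce on a throwaway database. The sketch below assumes SQLAlchemy 1.4 or later (for `prepare(autoload_with=...)`) and uses in-memory SQLite with two illustrative tables, only one of which has a primary key:

```python
from sqlalchemy import create_engine, text
from sqlalchemy.ext.automap import automap_base

# Assumption: in-memory SQLite with two illustrative tables.
engine = create_engine("sqlite://")
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE cities (id INTEGER PRIMARY KEY, name TEXT)"))
    # No primary key here, so automap should not generate a class for it.
    conn.execute(text("CREATE TABLE measure_ranges (low INTEGER, high INTEGER)"))

Base = automap_base()
Base.prepare(autoload_with=engine)

# Names of the generated classes versus names of all reflected tables.
mapped = sorted(Base.classes.keys())
reflected = sorted(Base.metadata.tables.keys())
```

Comparing `mapped` with `reflected` shows which tables were reflected but skipped by automap. A skipped table is still reachable as a plain `Table` through `Base.metadata.tables`.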
I'm trying to populate a database table using sqlalchemy.
I'm using dataset to write to the database.
from sqlalchemy import Column, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import create_engine
from sqlalchemy import exc
import dataset
import sqlite3
Base = declarative_base()
class Eticks(Base):
    __tablename__ = 'Eticks'
    id = Column(String(25), primary_key=True)
    affected_vers = Column(String(250), primary_key=False)
engine = create_engine('sqlite:///work_items.db', pool_recycle=3600)
Base.metadata.create_all(engine)
def format_vers(versobj):
    if isinstance(versobj, list):
        return " - ".join(versobj)
    else:
        return versobj
for i in list_of_objects:
    with dataset.connect('sqlite:///work_items.db',
                         engine_kwargs={'pool_recycle': 3600}) as table:
        table['Eticks'].upsert(dict(id=i.id,
                                    affected_vers=format_vers(getattr(i, 'Affected versions', 'Unspecified'))), ['id'])
I've used this exact same syntax before for another table, however I'm now getting an error when I try it here:
sqlalchemy.exc.OperationalError:
(sqlite3.OperationalError) cannot rollback - no transaction is active
The list that I'm looping through is quite large - 24,000 items - could that be related?
I've also noticed that the error gets thrown more quickly if I use table['Eticks'].upsert rather than .insert
As I said, this syntax worked perfectly for another table, so I can't see what's caused this issue.
Can anyone help?
I want to implement a function that gives information about all the tables (and their column names) that are present in a database (not only those created with SQLAlchemy). While reading the documentation it seems to me that this is done via reflection but I didn't manage to get something working. Any suggestions or examples on how to do this?
start with an engine:
from sqlalchemy import create_engine
engine = create_engine("postgresql://u:p@host/database")
quick path to all table /column names, use an inspector:
from sqlalchemy import inspect
inspector = inspect(engine)
for table_name in inspector.get_table_names():
    for column in inspector.get_columns(table_name):
        print("Column: %s" % column['name'])
docs: http://docs.sqlalchemy.org/en/rel_0_9/core/reflection.html?highlight=inspector#fine-grained-reflection-with-inspector
alternatively, use MetaData / Tables:
from sqlalchemy import MetaData
m = MetaData()
m.reflect(engine)
for table in m.tables.values():
    print(table.name)
    for column in table.c:
        print(column.name)
docs: http://docs.sqlalchemy.org/en/rel_0_9/core/reflection.html#reflecting-all-tables-at-once
First set up the sqlalchemy engine.
from sqlalchemy import create_engine, inspect, text
from sqlalchemy.engine import url
connect_url = url.URL(
    'oracle',
    username='db_username',
    password='db_password',
    host='db_host',
    port='db_port',
    query=dict(service_name='db_service_name'))
engine = create_engine(connect_url)
try:
    engine.connect()
except Exception as error:
    print(error)
    return
Like others have mentioned, you can use the inspect method to get the table names.
But in my case, the list of tables returned by the inspect method was incomplete.
So, I found out another way to find table names by using pure SQL queries in sqlalchemy.
# Pass the owner as a bound parameter instead of formatting it into the SQL string
query = text("SELECT table_name FROM all_tables WHERE owner = :owner")
table_name_data = self.session.execute(query, {'owner': 'db_username'}).fetchall()
For the sake of completeness, here's the code to fetch table names with the inspect method (if it works well in your case).
inspector = inspect(engine)
table_names = inspector.get_table_names()
Hey I created a small module that helps easily reflecting all tables in a database you connect to with SQLAlchemy, give it a look: EZAlchemy
from EZAlchemy.ezalchemy import EZAlchemy
DB = EZAlchemy(
    db_user='username',
    db_password='pezzword',
    db_hostname='127.0.0.1',
    db_database='mydatabase',
    d_n_d='mysql'  # stands for dialect+driver
)
# this function loads all tables in the database to the class instance DB
DB.connect()
# List all associations to DB, you will see all the tables in that database
dir(DB)
I'm proposing another solution as I was not satisfied by any of the previous in the case of postgres which uses schemas. I hacked this solution together by looking into the pandas source code.
from sqlalchemy import MetaData, create_engine
from typing import List
def list_tables(pg_uri: str, schema: str) -> List[str]:
    with create_engine(pg_uri).connect() as conn:
        # Pass the connection to reflect() rather than binding it to MetaData
        meta = MetaData(schema=schema)
        meta.reflect(bind=conn, views=True)
        return list(meta.tables.keys())
In order to get a list of all tables in your schema, you need to form your postgres database uri pg_uri (e.g. "postgresql://u:p@host/database" as in zzzeek's answer) as well as the schema's name schema. So if we use the example uri as well as the typical schema public, we would get all the tables and views with:
list_tables("postgresql://u:p@host/database", "public")
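For comparison, roughly the same listing can be built from the inspector instead of a full `MetaData` reflection, which avoids loading every column definition. This is a sketch only: `list_tables_and_views` is a made-up name, and in-memory SQLite with one table and one view stands in for Postgres (where you would pass e.g. `schema="public"`):

```python
from typing import List, Optional

from sqlalchemy import create_engine, inspect, text
from sqlalchemy.engine import Engine


def list_tables_and_views(engine: Engine, schema: Optional[str] = None) -> List[str]:
    """Return sorted table and view names; schema=None means the default schema."""
    inspector = inspect(engine)
    names = list(inspector.get_table_names(schema=schema))
    names += list(inspector.get_view_names(schema=schema))
    return sorted(names)


# Assumption: in-memory SQLite stands in for the Postgres database.
engine = create_engine("sqlite://")
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE users (id INTEGER PRIMARY KEY)"))
    conn.execute(text("CREATE VIEW user_ids AS SELECT id FROM users"))

names = list_tables_and_views(engine)
```

Unlike `meta.tables.keys()`, the inspector returns bare names without the schema prefix.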
While reflection/inspection is useful, I had trouble getting the data out of the database. I found sqlsoup to be much more user-friendly. You create the engine using sqlalchemy and pass that engine to sqlsoup.SQLSoup, i.e.:
import sqlsoup
def create_engine():
    from sqlalchemy import create_engine
    return create_engine(f"mysql+mysqlconnector://{database_username}:{database_pw}@{database_host}/{database_name}")
def test_sqlsoup():
    engine = create_engine()
    db = sqlsoup.SQLSoup(engine)
    # Note: database must have a table called 'users' for this example
    users = db.users.all()
    print(users)
if __name__ == "__main__":
    test_sqlsoup()
If you're familiar with sqlalchemy then you're familiar with sqlsoup. I've used this to extract data from a wordpress database.
I'm writing a quick one-off migration script that updates a single field in a table with half a million rows.
Since I hadn't planned on writing out full models for the joins I'm doing to fetch the initial ~25000 rows of data, I've been trying to figure out how to do an UPDATE statement using a from_statement() call and using my own raw sql, but I can't find any examples.
Along with that, SQLAlchemy is throwing an error. Here's an example of my call and the error:
mydb = self.session()
mydb.query().from_statement(
"""
UPDATE my_table
SET settings=mysettings
WHERE user_id=myuserid AND setting_id=123
""").params(mysettings=new_settings, myuserid=user_id).all()
The error I get:
Traceback (most recent call last):
File "./sample_script.py", line 111, in <module>
main()
File "./sample_script.py", line 108, in main
migrate.set_migration_data()
File "./sample_script.py", line 100, in set_migration_data
""").params(mysettings=new_settings, myuserid=user_id).all()
File "/usr/lib/pymodules/python2.6/sqlalchemy/orm/query.py", line 1267, in all
return list(self)
File "/usr/lib/pymodules/python2.6/sqlalchemy/orm/query.py", line 1361, in __iter__
return self._execute_and_instances(context)
File "/usr/lib/pymodules/python2.6/sqlalchemy/orm/query.py", line 1364, in _execute_and_instances
result = self.session.execute(querycontext.statement, params=self._params, mapper=self._mapper_zero_or_none())
File "/usr/lib/pymodules/python2.6/sqlalchemy/orm/query.py", line 251, in _mapper_zero_or_none
if not getattr(self._entities[0], 'primary_entity', False):
IndexError: list index out of range
UPDATE
I'm using MySQL.
Per Samy's suggestion, I tried this:
mydb.execute(
    "UPDATE mytable SET settings=:mysettings WHERE user_id=:userid AND setting_id=123",
    {'userid': user_id, 'mysettings': new_settings}
)
This had no effect. I don't get any errors, but the statement doesn't seem to actually execute, as the row does not change. If I manually cut and paste the query that gets logged from the echo=True option, the row updates in the database just fine.
UPDATE - SOLVED
Samy's suggestion was correct but the .execute() call only works on 'engine', not 'session', so this worked just fine:
self.engine.execute(
    "UPDATE mytable SET settings=:mysettings WHERE user_id=:userid AND setting_id=123",
    {'userid': user_id, 'mysettings': new_settings}
)
Well, this is rather strange; according to the docs, from_statement is used for SELECT statements.
Execute the given SELECT statement and return results.
I could be looking at the wrong function, or it may be possible to use other types of statements; I'm not really sure.
You could just use execute, since it can run any type of statement. Here's a quick example.
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
session = sessionmaker(bind = create_engine('sqlite://'), autocommit = True)()
_ = session.execute('CREATE TABLE my_table (user_id int, setting_id int, settings string)')
for id in range(200):
    _ = session.execute('INSERT INTO my_table (user_id, setting_id) VALUES (:user_id, :setting_id)',
                        {'user_id':id, 'setting_id':id})
_ = session.execute(
"""
UPDATE my_table
SET settings = :mysettings
WHERE user_id = :user_id AND setting_id = 123
""", {'user_id':123, 'mysettings':'test'})
r = session.execute('SELECT * FROM my_table WHERE user_id = :user_id', {'user_id':123}).fetchall()
print(r)
[(123, 123, u'test')]
Note that this isn't really the best way to use SQLAlchemy, which was designed to give you a DRY environment decoupled from any specific DB backend, though you probably have your reasons for using raw SQL rather than its ORM.
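On newer SQLAlchemy versions the same idea looks a little different: raw strings are wrapped in `text()`, and the session no longer autocommits, so an UPDATE that is never committed is simply rolled back when the session closes. That is a likely explanation for a statement that logs fine but leaves the row unchanged. A minimal sketch, assuming SQLAlchemy 1.4 or later and the same throwaway SQLite table as above:

```python
from sqlalchemy import create_engine, text
from sqlalchemy.orm import Session

engine = create_engine("sqlite://")

with Session(engine) as session:
    session.execute(text(
        "CREATE TABLE my_table (user_id INTEGER, setting_id INTEGER, settings TEXT)"
    ))
    # A list of dicts runs the INSERT once per row (executemany).
    session.execute(
        text("INSERT INTO my_table (user_id, setting_id) VALUES (:user_id, :setting_id)"),
        [{"user_id": i, "setting_id": i} for i in range(200)],
    )
    session.execute(
        text(
            "UPDATE my_table SET settings = :mysettings "
            "WHERE user_id = :user_id AND setting_id = 123"
        ),
        {"user_id": 123, "mysettings": "test"},
    )
    # Without this commit the changes are rolled back when the session closes.
    session.commit()

with Session(engine) as session:
    rows = session.execute(
        text("SELECT * FROM my_table WHERE user_id = :user_id"),
        {"user_id": 123},
    ).fetchall()
```

The second session only sees the updated row because the first one committed before closing.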
You need to use the proper parameter syntax; the format depends entirely on your database adapter. For example, some adapters support :name parameters, in which case you are missing those colons in your query:
mydb.query().from_statement(
"""
UPDATE my_table
SET settings=:mysettings
WHERE user_id=:myuserid AND setting_id=123
""").params(mysettings=new_settings, myuserid=user_id).all()
The DBAPI 2.0 spec supports several formats, including positional parameters with ? and %s placeholders, and named parameters in the above form and as %(name)s formatting. You need to review your database adapter documentation to find out what is supported.
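To see what a given adapter supports, you can look at its module-level `paramstyle` attribute. The sketch below uses the standard-library `sqlite3` module purely as an illustration (your MySQL adapter will likely report a different style); the table and values are made up:

```python
import sqlite3

# Every DBAPI module advertises its default placeholder style here.
style = sqlite3.paramstyle

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (user_id INTEGER, setting_id INTEGER, settings TEXT)"
)
conn.execute("INSERT INTO my_table VALUES (1, 123, 'old')")

# Positional "qmark" placeholders take a sequence of values.
conn.execute(
    "UPDATE my_table SET settings = ? WHERE user_id = ? AND setting_id = 123",
    ("new", 1),
)
after_qmark = conn.execute("SELECT settings FROM my_table").fetchone()[0]

# sqlite3 also accepts :name placeholders together with a mapping.
conn.execute(
    "UPDATE my_table SET settings = :mysettings "
    "WHERE user_id = :myuserid AND setting_id = 123",
    {"mysettings": "newer", "myuserid": 1},
)
after_named = conn.execute("SELECT settings FROM my_table").fetchone()[0]
conn.close()
```

When statements go through SQLAlchemy's `text()` construct, the :name form is the one to write, and SQLAlchemy translates it to whatever the underlying adapter expects.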