I am trying to query one of my tables in my Postgres database using SqlAlchemy in Python 3. It runs the query fine but as I go through each row in the result that SqlAlchemy returns, I try to use the attribute 'text' (one of my column names). I receive this error:
'str' object has no attribute 'text'
I have printed the attribute like so:
for row in result:
print(row.text)
This does not give the error. The code that produces the error is below. However, to give my environment:
I have two servers running. One is for my database the other is for my python server.
Database Server:
Postgres v9.6 - On Amazon's RDS
Server with Python
Linux 3.13.0-65-generic x86_64 - On an Amazon EC2 Instance
SqlAlchemy v1.1.5
Python v3.4.3
Flask 0.11.1
Files related:
import sqlalchemy as sa
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import Session
import re
from nltk import sent_tokenize
class DocumentProcess:
def __init__(self):
...
Engine = sa.create_engine(
CONFIG.POSTGRES_URL,
client_encoding='utf8',
pool_size=20,
max_overflow=0
)
# initialize SQLAlchemy
Base = automap_base()
# reflect the tables
Base.prepare(Engine, reflect=True)
# Define all needed tables
self.Document = Base.classes.documents
self.session = Session(Engine)
...
def process_documents(self):
try:
offset = 5
limit = 50
###### This is the query in question ##########
result = self.session.query(self.Document) \
.order_by(self.Document.id) \
.offset(offset) \
.limit(limit)
for row in result:
# The print statement below does print out the text
print(row.text)
# when passing document.text to sent_tokenize, it
# gives the following error:
# 'str' object has no attribute 'text'
snippets = sent_tokenize(row.text.strip('\n')) # I have removed strip, but the same problem
except Exception as e:
logging.info(format(e))
raise e
This is my model for Document, in my PostgreSQL database:
class Document(db.Model):
__tablename__ = "documents"
id = db.Column(db.Integer, primary_key=True)
text = db.Column(db.Text)
tweet = db.Column(db.JSON)
keywords = db.Column(db.ARRAY(db.String), nullable=True)
def to_dict(self):
return dict(
id=self.id,
text=self.text,
tweet=self.tweet,
keywords=self.keywords
)
def json(self):
return jsonify(self.to_dict())
def __repr__(self):
return "<%s %r>" % (self.__class__, self.to_dict())
Things I have tried
Before, I did not have order_by in the Document query. This was working before. However, even removing order_by does not fix it anymore.
Used a SELECT statement and went through the result manually, but still the same result
What I haven't tried
I am wondering if its because I named the column 'text'. I noticed that when I write this query out in postgres, it highlights it as a reserved word. I'm confused why my query worked before, but now it doesn't work. Could this be the issue?
Any thoughts on this issue would be much appreciated.
It turns out that text is a reserved word in PostgreSQL. I renamed the column name and refactored my code to match. This solved the issue.
You are likely to get this error in PostgreSQL if you are creating a foreign table and one of the column datatype is text. Change it to character varying() and the error disappears!
Related
Here is some custom code I wrote that I think might be problematic for this particular use case.
class SQLServerConnection:
def __init__(self, database):
...
self.connection_string = \
"DRIVER=" + str(self.driver) + ";" + \
"SERVER=" + str(self.server) + ";" + \
"DATABASE=" + str(self.database) + ";" + \
"Trusted_Connection=yes;"
self.engine = sqlalchemy.create_engine(
sqlalchemy.engine.URL.create(
"mssql+pyodbc", \
query={'odbc_connect': self.connection_string}
)
)
# Runs a command and returns in plain text (python list for multiple rows)
# Can be a select, alter table, anything like that
def execute(self, command, params=False):
# Make a connection object with the server
with self.engine.connect() as conn:
# Can send some parameters along with a plain text query...
# could be single dict or list of dict
# Doc: https://docs.sqlalchemy.org/en/14/tutorial/dbapi_transactions.html#sending-multiple-parameters
if params:
output = conn.execute(sqlalchemy.text(command,params))
else:
output = conn.execute(sqlalchemy.text(command))
# Tell SQL server to save your changes (assuming that is applicable, is not with select)
# Doc: https://docs.sqlalchemy.org/en/14/tutorial/dbapi_transactions.html#committing-changes
try:
conn.commit()
except Exception as e:
#pass
warn("Could not commit changes...\n" + str(e))
# Try to consolidate select statement result into single object to return
try:
output = output.all()
except:
pass
return output
If I try:
cnxn = SQLServerConnection(database='MyDatabase')
cnxn.execute("SELECT * INTO [dbo].[MyTable_newdata] FROM [dbo].[MyTable] ")
or
cnxn.execute("SELECT TOP 0 * INTO [dbo].[MyTable_newdata] FROM [dbo].[MyTable] ")
Python returns this object without error, <sqlalchemy.engine.cursor.LegacyCursorResult at 0x2b793d71880>, but upon looking in MS SQL Server, the new table was not generated. I am not warned about the commit step failing with the SELECT TOP 0 way; I am warned ('Connection' object has no attribute 'commit') in the above way.
CREATE TABLE, ALTER TABLE, or SELECT (etc) appears to work fine, but SELECT * INTO seems to not be working, and I'm not sure how to troubleshoot further. Copy-pasting the query into SQL Server and running appears to work fine.
As noted in the introduction to the 1.4 tutorial here:
A Note on the Future
This tutorial describes a new API that’s released in SQLAlchemy 1.4 known as 2.0 style. The purpose of the 2.0-style API is to provide forwards compatibility with SQLAlchemy 2.0, which is planned as the next generation of SQLAlchemy.
In order to provide the full 2.0 API, a new flag called future will be used, which will be seen as the tutorial describes the Engine and Session objects. These flags fully enable 2.0-compatibility mode and allow the code in the tutorial to proceed fully. When using the future flag with the create_engine() function, the object returned is a subclass of sqlalchemy.engine.Engine described as sqlalchemy.future.Engine. This tutorial will be referring to sqlalchemy.future.Engine.
That is, it is assumed that the engine is created with
engine = create_engine(connection_url, future=True)
You are getting the "'Connection' object has no attribute 'commit'" error because you are creating an old-style Engine object.
You can avoid the error by adding future=True to your create_engine() call:
self.engine = sqlalchemy.create_engine(
sqlalchemy.engine.URL.create(
"mssql+pyodbc",
query={'odbc_connect': self.connection_string}
),
future=True
)
Use this recipe instead:
#!python
from sqlalchemy.sql import Select
from sqlalchemy.ext.compiler import compiles
class SelectInto(Select):
def __init__(self, columns, into, *arg, **kw):
super(SelectInto, self).__init__(columns, *arg, **kw)
self.into = into
#compiles(SelectInto)
def s_into(element, compiler, **kw):
text = compiler.visit_select(element)
text = text.replace('FROM',
'INTO TEMPORARY TABLE %s FROM' %
element.into)
return text
if __name__ == '__main__':
from sqlalchemy.sql import table, column
marker = table('marker',
column('x1'),
column('x2'),
column('x3')
)
print SelectInto([marker.c.x1, marker.c.x2], "tmp_markers").\
where(marker.c.x3==5).\
where(marker.c.x1.in_([1, 5]))
This needs some tweaking, hence it will replace all subquery selects as select INTOs, but test it for now, if it worked it would be better than raw text statments.
Have you tried this from this answer by #Michael Berkowski:
INSERT INTO assets_copy
SELECT * FROM assets;
The answer states that MySQL documentation states that SELECT * INTO isn't supported.
I'm learning SQLAlchemy right now, but I've encountered an error that puzzles me. Yes, there are similar questions here on SO already, but none of them seem to be solved.
My goal is to use the ORM mode to query the database. So I create a model:
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, registry
from sqlalchemy.sql import select
database_url = "mysql+pymysql://..."
mapper_registry = registry()
Base = mapper_registry.generate_base()
class User(Base):
__tablename__ = "user"
id = Column(Integer, primary_key=True)
name = Column(String(32))
engine = create_engine(database_url, echo=True)
mapper_registry.metadata.create_all(engine)
New I want to load the whole row for all entries in the table:
with Session(engine) as session:
for row in session.execute(select(User)):
print(row.name)
#- Error: #
Traceback (most recent call last):
...
print(row.name)
AttributeError: Could not locate column in row for column 'name'
What am I doing wrong here? Shouldn't I be able to access the fields of the ORM model? Or am I misunderstanding the idea of ORM?
I'm using Python 3.8 with PyMySQL 1.0.2 and SQLAlchemy 1.4.15 and the server runs MariaDB.
This is example is as minimal as I could make it, I hope anyone can point me in the right direction. Interestingly, inserting new rows works like a charm.
session.execute(select(User)) will return a list of Row instances (tuples), which you need to unpack:
for row in session.execute(select(Object)):
# print(row[0].name) # or
print(row["Object"].name)
But I would use the .query which returns instances of Object directly:
for row in session.query(Object):
print(row.name)
I'd like to add some to what above #Van said.
You can get object instances using session.execute() as well.
for row in session.execute(select(User)).scalars().all():
print(row.name)
Which is mentioned in migrating to 2.0.
I just encountered this error today when executing queries that join two or more tables.
It turned out that after updating psycopg2 (2.8.6 -> 2.9.3), SQLAlchemy (1.3.23 -> 1.4.39), and flask-sqlalchemy (2.4.4 -> 2.5.1) the Query.all() method return type is a list of sqlalchemy.engine.row.Rows and before it was a list of tuples. For instance:
query = database.session.query(model)
query = query.outerjoin(another_model, some_field == another_field)
results = query.all()
# type(results[0]) -> sqlalchemy.engine.row.Row
if isinstance(results[0], (list, tuple)):
# Serialize as a list of rows
else:
# Serialize as a single row
I'm trying to add a block of text into a sqlAlchemy table, which I want to compress to save space with it. Looking through various answers I came up with what I think should be working, but is not. I'm working with a sqlite database.
Updated: Was pointed out I was attempting to use mysql on sqlite which I wasn't aware that was what was happening. I adjusted to use zlib instead and it works to a degree, which gives me a new error that I do not understand.
# proper imports and stuff to make this work
from sqlalchemy import func
class Data(Base):
__tablename__ = 'data'
# ...
text_blobbed = Column('text', BLOB)
#hybrid_property
def text(self):
# return func.decompress(self.text_blobbed)
return self.text_blobbed.decode("zlib")
#text.setter
def text(self, stuff):
# self.text_blobbed = func.compress(stuff)
self.text_blobbed = stuff.encode("zlib")
old error from func.
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such function: compress [SQL: ...... ]
I can now add in the text via Data.text = "a really big block of text"
But when I go to query for this like
session.query(Data.text).filter(Data.id.like(2)).first()
I get an error:
AttributeError: Neither 'InstrumentedAttribute' object nor 'Comparator' object associated with Data.text_blobbed has an attribute 'decode'
Doing this is fine.
r = session.query(Data).filter(Data.id.like(2)).first()
print r.text
I've also looked at the text_blobbed which is a set(). And I can do this that works:
r = session.query(Data.text_blobbed.filter( ... ).first()[0].decode("zlib")
print r
But if I move that [0] into the hybrid_property for
...
return self.text_blobbed[0].decode("zlib")
and query:
r = session.query(Data.text).filter( ... ).first()
I get the error:
NotImplementedError: Operator 'getitem' is not supported on this expression
So, I'm a bit confused still.
I've been looking at these things:
SQLAlchemy - Writing a hybrid method for child count
mysql Compress() with sqlalchemy
SELECT UNCOMPRESS(text) FROM with sqlalchemy
http://docs.sqlalchemy.org/en/latest/orm/mapped_sql_expr.html?highlight=descriptor
I have been struggling on this for a while now and did not find an answer yet, or maybe I already have seen the answer and just didnt get it - however, I hope I am able to describe my problem.
I have a MS SQL database in which the tables are grouped in namespaces (or whatever it is called), denoted by Prefix.Tablename (with a dot). So a native sql statement to request some content looks like this:
SELECT TOP 100
[Value], [ValueDate]
FROM [FinancialDataBase].[Reporting].[IndexedElements]
How to map this to sqlalchemy?
If the "Reporting" prefix would not be there, the solution (or one way to do it) looks like this:
from sqlalchemy import *
from sqlalchemy.ext.declarative import declarative_base, declared_attr
from sqlalchemy.orm import sessionmaker
def get_session():
from urllib.parse import quote_plus as urllib_quote_plus
server = "FinancialDataBase.sql.local"
connstr = "DRIVER={SQL Server};SERVER=%s;DATABASE=FinancialDataBase" % server
params = urllib_quote_plus(connstr)
base_url = "mssql+pyodbc:///?odbc_connect=%s" % params
engine = create_engine(base_url,echo=True)
Session = sessionmaker(bind=engine)
session = Session()
return engine, session
Base = declarative_base()
class IndexedElements(Base):
__tablename__ = "IndexedElements"
UniqueID = Column(String,primary_key=True)
ValueDate = Column(DateTime)
Value = Column(Float)
And then requests can be done and wrapped in a Pandas dataframe for example like this:
import pandas as pd
engine, session = get_session()
query = session.query(IndexedElements.Value,IndexedElements.ValueDate)
data = pd.read_sql(query.statement,query.session.bind)
But the SQL statement that is compiled and actually executed in this, includes this wrong FROM part:
FROM [FinancialDataBase].[IndexedElements]
Due to the namespace-prefix it would have to be
FROM [FinancialDataBase].[Reporting].[IndexedElements]
Simply expanding the table name to
__tablename__ = "Reporting.IndexedElements"
doesnt fix it, because it changes the compiled sql statement to
FROM [FinancialDataBase].[Reporting.IndexedElements]
which doesnt work properly.
So how can this be solved?
The answer is given in the comment by Ilja above:
The "namespace" is a so called schema and has to be declarated in the mapped object. Given the example from the opening post, the mapped table has to be defined like this:
class IndexedElements(Base):
__tablename__ = "IndexedElements"
__table_args__ = {"schema": "Reporting"}
UniqueID = Column(String,primary_key=True)
ValueDate = Column(DateTime)
Value = Column(Float)
Or define a base class containing these informations for different schemata. Check also "Augmenting the base" in sqlalchemy docs:
http://docs.sqlalchemy.org/en/latest/orm/extensions/declarative/mixins.html#augmenting-the-base
I want to implement a function that gives information about all the tables (and their column names) that are present in a database (not only those created with SQLAlchemy). While reading the documentation it seems to me that this is done via reflection but I didn't manage to get something working. Any suggestions or examples on how to do this?
start with an engine:
from sqlalchemy import create_engine
engine = create_engine("postgresql://u:p#host/database")
quick path to all table /column names, use an inspector:
from sqlalchemy import inspect
inspector = inspect(engine)
for table_name in inspector.get_table_names():
for column in inspector.get_columns(table_name):
print("Column: %s" % column['name'])
docs: http://docs.sqlalchemy.org/en/rel_0_9/core/reflection.html?highlight=inspector#fine-grained-reflection-with-inspector
alternatively, use MetaData / Tables:
from sqlalchemy import MetaData
m = MetaData()
m.reflect(engine)
for table in m.tables.values():
print(table.name)
for column in table.c:
print(column.name)
docs: http://docs.sqlalchemy.org/en/rel_0_9/core/reflection.html#reflecting-all-tables-at-once
First set up the sqlalchemy engine.
from sqlalchemy import create_engine, inspect, text
from sqlalchemy.engine import url
connect_url = url.URL(
'oracle',
username='db_username',
password='db_password',
host='db_host',
port='db_port',
query=dict(service_name='db_service_name'))
engine = create_engine(connect_url)
try:
engine.connect()
except Exception as error:
print(error)
return
Like others have mentioned, you can use the inspect method to get the table names.
But in my case, the list of tables returned by the inspect method was incomplete.
So, I found out another way to find table names by using pure SQL queries in sqlalchemy.
query = text("SELECT table_name FROM all_tables where owner = '%s'"%str('db_username'))
table_name_data = self.session.execute(query).fetchall()
Just for sake of completeness of answer, here's the code to fetch table names by inspect method (if it works good in your case).
inspector = inspect(engine)
table_names = inspector.get_table_names()
Hey I created a small module that helps easily reflecting all tables in a database you connect to with SQLAlchemy, give it a look: EZAlchemy
from EZAlchemy.ezalchemy import EZAlchemy
DB = EZAlchemy(
db_user='username',
db_password='pezzword',
db_hostname='127.0.0.1',
db_database='mydatabase',
d_n_d='mysql' # stands for dialect+driver
)
# this function loads all tables in the database to the class instance DB
DB.connect()
# List all associations to DB, you will see all the tables in that database
dir(DB)
I'm proposing another solution as I was not satisfied by any of the previous in the case of postgres which uses schemas. I hacked this solution together by looking into the pandas source code.
from sqlalchemy import MetaData, create_engine
from typing import List
def list_tables(pg_uri: str, schema: str) -> List[str]:
with create_engine(pg_uri).connect() as conn:
meta = MetaData(conn, schema=schema)
meta.reflect(views=True)
return list(meta.tables.keys())
In order to get a list of all tables in your schema, you need to form your postgres database uri pg_uri (e.g. "postgresql://u:p#host/database" as in the zzzeek's answer) as well as the schema's name schema. So if we use the example uri as well as the typical schema public we would get all the tables and views with:
list_tables("postgresql://u:p#host/database", "public")
While reflection/inspection is useful, I had trouble getting the data out of the database. I found sqlsoup to be much more user-friendly. You create the engine using sqlalchemy and pass that engine to sqlsoup.SQlSoup. ie:
import sqlsoup
def create_engine():
from sqlalchemy import create_engine
return create_engine(f"mysql+mysqlconnector://{database_username}:{database_pw}#{database_host}/{database_name}")
def test_sqlsoup():
engine = create_engine()
db = sqlsoup.SQLSoup(engine)
# Note: database must have a table called 'users' for this example
users = db.users.all()
print(users)
if __name__ == "__main__":
test_sqlsoup()
If you're familiar with sqlalchemy then you're familiar with sqlsoup. I've used this to extract data from a wordpress database.