Solved: Adding a new column to an ORM SQLAlchemy table in a volatile setting - python

I am working on an open-source persistence layer for an MQTT broker: https://github.com/volkerjaenisch/amqtt_db
Incoming MQTT messages are irregular blobs of data, so usually the DB backend is some kind of object storage.
I do it the hard way and deserialize the blobs into typed data columns, which I store in a fast relational database. My final target will be TimescaleDB, but first I go via SQLAlchemy to access a wide range of DBs with one API.
MQTT messages are volatile (think: not always complete), so the DB schema has to adjust dynamically, e.g. by adding new columns for new information.
First message:
Time: 1234
Temperature: 23.4
Second message:
Time: 1245
Temperature: 23.6
Rel Hum: 87 %
I have used the SQLAlchemy ORM for more than a decade, but always for quite static databases, so working dynamically is new to me.
Utilizing the ORM to build DB tables dynamically from the structure of incoming MQTT messages was quite doable and worked out perfectly.
But currently I am stuck on the case where additional information in the MQTT packets has to extend the tables with new columns.
What I did so far:
Utilizing sqlalchemy-migrate it was quite easy to dynamically add new columns to the existing table in the DB. In the code, "topic_cls" is the declarative class and "column_def" a col_name -> type mapping.
from migrate.versioning.schema import Table as MiTable, Column as MiColumn

def add_new_columns(self, topic_cls, column_def):
    table_name = str(topic_cls.__table__.name)
    # Bind a migrate Table to the existing DB table so columns can be added in place.
    table = MiTable(table_name, self.metadata)
    for col_name, col_type in column_def.items():
        col = MiColumn(col_name, col_type)
        col.create(table)  # issues ALTER TABLE ... ADD COLUMN
Works like a charm. But how do I get these changes to the DB reflected back into the declarative classes? I tried a fresh inspection of the table:
new_table = Table(topic_cls.__table__.name, self.metadata, autoload_with=self.engine)
This also works, but it gives me a plain Table object, not a declarative class.
So my stupid questions are:
Is this the right way to achieve my goal?
How can I get a declarative class by inspecting an already existing table in a DB?
"Drop the ORM and use SQL" is not the answer I am looking for.
Cheers,
Volker

Found a solution but it is a bit of a hack.
new_table = Table("test/topic_growth", Base.metadata, autoload_with=self.engine)
Base.metadata.remove(topic_cls.__table__)
new_dcl = type(str(table_name), (Base,), {'__table__': new_table})
Base.metadata._add_table(table_name, None, new_table)
After you obtain the new table via inspection, remove the old table entry from the metadata. Then generate a new declarative class from the new table, under the same table name. At last, add the new table back to the metadata.
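For illustration, a hypothetical usage sketch of the regenerated class, assuming an open session and that the deserializer maps field names such as "Rel Hum" to valid Python identifiers like rel_hum:

row = new_dcl(time=1245, temperature=23.6, rel_hum=87)  # the new column is now a mapped attribute
session.add(row)
session.commit()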

Related

Bulk insert using sqlalchemy Engine

Is there a way to bulk-insert/update values into a Microsoft SQL Server database using Engine?
I have read several (very) old posts regarding this, and back then it seemed not very easy to do.
E.g. in some examples we need to create a class, add instances of it to a session and at last commit the session.
Isn't there a way to do something like this (pseudocode):
from sqlalchemy import String, Integer, Float

values = [(1, "hello", 2.5), (2, "world", 10.5)]  # values to insert
table = "my_schema.my_table"                      # table name
col = ["id", "statement", "ratio"]                # names of the columns in the database
types = [Integer, String, Float]                  # type of each value

engine = sqlalchemy.create_engine(connection_string)
with engine.session():
    try:
        engine.bulk_insert(table, values, col, types)
    except:
        engine.rollback()
or something else, instead of looping over engine.execute("INSERT INTO ...")?
I know I can use pandas.DataFrame.to_sql, but since I want to be able to roll back in case of errors etc., I won't use that.
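There is no engine.bulk_insert in SQLAlchemy, but something close is possible with the Core API. A rough sketch, assuming a valid connection_string and an existing table my_schema.my_table with those three columns (the transaction rolls back automatically on error, which covers the rollback requirement):

from sqlalchemy import MetaData, Table, create_engine, insert

engine = create_engine(connection_string)
metadata = MetaData()
# Reflect the existing table instead of declaring a mapped class for it.
my_table = Table("my_table", metadata, schema="my_schema", autoload_with=engine)

rows = [
    {"id": 1, "statement": "hello", "ratio": 2.5},
    {"id": 2, "statement": "world", "ratio": 10.5},
]

# engine.begin() opens a transaction that commits on success and rolls
# back automatically if the block raises an exception.
with engine.begin() as conn:
    conn.execute(insert(my_table), rows)  # executemany-style bulk INSERT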

How to get the data object of a newly inserted data row with flask-mysqldb?

I have worked in Perl, where I am able to get the newly created data object's ID by assigning the result to a variable. For example:
my $data_obj = $schema->resultset('PersonTable')->create(\%psw_rec_hash);
where $data_obj contains the primary key's column value.
I want to be able to do the same thing using Python 3.7, Flask and flask-mysqldb, but without having to do another query. I want to be able to use the specific record's primary key column value for another method.
Python and flask-mysqldb inserts data like so:
query = "INSERT INTO PersonTable (fname, mname, lname) VALUES('Phil','','Vil')
cursor = db.connection.cursor()
cursor.execute(query)
db.connection.commit()
cursor.close()
The PersonTable has a primary key column called id, so the newly inserted data row would look like:
23, 'Phil', 'Vil'
Because there are 22 rows of data before the last inserted one, I don't want to perform a search for the data, as there could be more than one entry with the same values. All I want is the most recent data row.
Can I do something similar to Perl with python 3.7 and flask-mysqldb?
You may want to consider the Flask-SQLAlchemy package to help you with this.
Although the syntax is going to be slightly different from Perl, what you can do is assign the model object to a variable when you create it. Then, when you either flush or commit the database session, you can read the primary key attribute off that model object (whether it's "id" or something else) and use it as needed.
SQLAlchemy supports MySQL, as well as several other relational databases. In addition, it helps prevent SQL injection attacks as long as you use model objects and add/delete them through your database session, as opposed to issuing raw SQL strings.
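A minimal sketch of that approach (the model, column sizes, and connection URI are illustrative assumptions, not the asker's actual schema):

from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config["SQLALCHEMY_DATABASE_URI"] = "mysql://user:password@localhost/mydb"
db = SQLAlchemy(app)

class Person(db.Model):
    __tablename__ = "PersonTable"
    id = db.Column(db.Integer, primary_key=True)
    fname = db.Column(db.String(50))
    mname = db.Column(db.String(50))
    lname = db.Column(db.String(50))

with app.app_context():
    person = Person(fname="Phil", mname="", lname="Vil")
    db.session.add(person)
    db.session.commit()   # the INSERT happens here...
    print(person.id)      # ...and the new primary key is populated afterwards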

Simple SQLAlchemy update table based on value in another table

I am a newcomer to SQLAlchemy, so please forgive what must be an elementary question.
I have a database table properties (mapped in SQLAlchemy as the object Property) which contains a field MEBID. I have another table mebs (mapped in SQLAlchemy as MEB). I want to set the properties.MEBID field to mebs.id where properties.PostCode == mebs.PostCode.
I can do this simply in SQL using the command
update properties, mebs set properties.mebid = mebs.id where mebs.PostCode = properties.PostCode
but am struggling with doing it in SQLAlchemy. If I try the command
session.query(Property, MEB).\
    filter(Property.PostCode == MEB.PostCode).\
    update({Property.MEBID: MEB.id})
I get
InvalidRequestError: This operation requires only one Table or entity be specified as the target.
I know that this must be elementary as it's such a fundamental operation, but can't work out how it's done.
To update:
for prop, meb in session.query(Property, MEB).filter(Property.PostCode == MEB.PostCode).all():
    prop.MEBID = meb.id  # copy the matching MEB id onto the property
    session.add(prop)
session.commit()
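If the tables are large, the loop above issues one UPDATE per row. As a set-based alternative (a sketch, assuming SQLAlchemy 1.4+ and the same Property/MEB mappings), the whole thing can run as a single correlated UPDATE:

from sqlalchemy import exists, select, update

# For each Property row, look up the MEB id with the same PostCode.
meb_id = (
    select(MEB.id)
    .where(MEB.PostCode == Property.PostCode)
    .scalar_subquery()
)
session.execute(
    update(Property)
    .where(exists().where(MEB.PostCode == Property.PostCode))  # only rows with a match
    .values(MEBID=meb_id)
)
session.commit()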

Web2py SQLFORM.grid with executesql

I am making a web2py application and I have my two MySQL tables defined in my models file db.py:
db.define_table('table1',
    Field('id', 'integer'),
    Field('name', 'string'),
    migrate=False)

db.define_table('table2',
    Field('id', 'integer'),
    Field('name', 'string'),
    migrate=False)
I want my application to return a union of these tables:
data=db.executesql('SELECT * FROM table1 union select * from table2;')
in a SQLFORM.grid but apparently
form=SQLFORM.grid(data, create=False, deletable=False, editable=False, maxtextlength=100, paginate=10)
is not the way to go.
Can somebody help me please? It must be really simple but I'm having trouble finding the solution.
Thank you
The grid is designed to take a table or query, so you cannot pass a Rows object or arbitrary SQL. The best approach would be to create a view in the database and create a new DAL model definition associated with that view (be sure to set migrate=False, as you don't want the DAL to attempt to create a table with the name of the view). Then you can pass the view model to the grid:
db.define_table('t1_t2_union_view',
    Field('id', 'integer'),
    Field('name', 'string'),
    migrate=False)
grid = SQLFORM.grid(db.t1_t2_union_view, ...)
The above works because web2py will treat the model of the database view like any other database table, issuing a query to select all of its records. There is no need for executesql in this case because the union of the tables is handled in the database by the view.
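The view itself has to exist in the database first. As a hypothetical one-time setup (run once against MySQL directly, or via executesql as sketched here):

db.executesql(
    "CREATE VIEW t1_t2_union_view AS "
    "SELECT id, name FROM table1 UNION SELECT id, name FROM table2;"
)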
Actually, you can simplify the table definition to:
db.define_table('t1_t2_union_view', db.table1, migrate=False)
When you pass an existing table to .define_table(), you get a new table with the same field definitions as the original, which is what we want here.
If creating separate views for each possible union is not feasible, a possible alternative would be to retrieve the data via executesql and then iterate through the records, inserting each one into an in-memory SQLite database table, which could then be passed to the grid:
union_tables = ('table1', 'table2')
temp_db = DAL('sqlite:memory')
union_table = temp_db.define_table('union_table', db[union_tables[0]])
sql = 'SELECT * FROM table1 UNION SELECT * FROM table2;'
records = db.executesql(sql, as_dict=True)
for record in records:
    union_table.insert(**union_table._filter_fields(record))

grid = SQLFORM.grid(union_table, create=False, editable=False, deletable=False)
Setting as_dict=True results in a list of dictionaries being returned, which makes the inserts easier, as the keys of the dictionaries are the field names needed for the inserts.
Note that this method is somewhat inefficient, so you would have to test how it performs with your workload.

How to get datatypes of specific fields of an Access database using pyodbc?

I'm using pyodbc to data-mine a big database in a .mdb (Access) file.
I want to create a new table taking relevant information from several existing tables (to then feed it to a tool).
I think I know all I need to transfer the data, and I know how to create a table given column names and datatypes, but I'm having trouble getting the datatypes (INTEGER, VARCHAR, etc.) of the respective columns in the existing tables. I need these types to create the new columns compatibly.
What I found on the internet (like this and this) is getting me into invalid-command trouble, so I think this is a platform-specific issue. Then again, I'm fairly green on databases.
Does anybody know how to get the types of these fields?
The reason those articles aren't helping you is that they are for SQL Server. SQL Server has system tables that you can query to get the column data; MS Access doesn't. MS Access only lets you query the object names.
However, ODBC does support getting the schema through its connection via the ODBC SQLColumns function.
According to this answer, pyodbc exposes this via a cursor method:
# columns in table x
for row in cursor.columns(table='x'):
    print(row.column_name)
As Mark noted in the comments, you probably also want row.data_type. The link he provided lists all the columns the cursor rows expose:
table_cat
table_schem
table_name
column_name
data_type
type_name
column_size
buffer_length
decimal_digits
num_prec_radix
nullable
remarks
column_def
sql_data_type
sql_datetime_sub
char_octet_length
ordinal_position
is_nullable: One of SQL_NULLABLE, SQL_NO_NULLS, SQL_NULLS_UNKNOWN.
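Putting it together, a minimal sketch (the table name 'x' and the file path are placeholders; assumes the Access ODBC driver is installed):

import pyodbc

conn = pyodbc.connect(
    r"Driver={Microsoft Access Driver (*.mdb, *.accdb)}; Dbq=C:\Database.mdb;"
)
cursor = conn.cursor()
for row in cursor.columns(table='x'):
    # data_type is the ODBC SQL type code; type_name is the driver's name for it
    print(row.column_name, row.data_type, row.type_name)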
I am not familiar with pyodbc, but I have done this in VBA in the past.
The 2 links you mentioned are for SQL Server, not for Access. To find out the data type of each field in an Access table, you can use DAO or ADOX.
Here is an example I did in VBA with Excel 2010, where I connect to the Access database (2000 .mdb format) and list the tables, fields and their data types (as an enum; for example, 4 means dbLong). You can see in the output the system tables and, at the bottom, the tables created by the user.
You can easily find examples on the internet for how to do something similar with ADOX. I hope this helps.
Private Sub TableDefDao()
    Dim db As DAO.Database
    Set db = DAO.OpenDatabase("C:\Database.mdb")
    Dim t As DAO.TableDef
    Dim f As DAO.Field
    ' Walk every table and print each field's name and type enum
    For Each t In db.TableDefs
        Debug.Print t.Name
        For Each f In t.Fields
            Debug.Print vbTab & f.Name & vbTab & f.Type
        Next
    Next
End Sub
You'll also get some type information from this output:
import pandas as pd
import pyodbc

dbq = r'D:\....\xyz.accdb'
conn = pyodbc.connect(r"Driver={Microsoft Access Driver (*.mdb, *.accdb)}; Dbq=%s;" % (dbq,))
query = 'select * from tablename'
dataf = pd.read_sql(query, conn)
print(list(dataf.dtypes))
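One caveat: dataf.dtypes reports the pandas/NumPy dtypes inferred from the fetched data, not the actual Access column types, so cursor.columns() above is the more direct route when you need the real database types.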
