I have a class which uses the pyodbc library successfully - it can perform a variety of reads from the database (so the connection and DSN are hunky dory).
What I've been trying to implement are functions to add and delete columns in tables in a SQL database (the same one I'm able to read from).
I have tested the calls using isql commands and I can see the changes occur in my database. For example:
SQL> ALTER TABLE DunbarGen ADD testCol float(4)
SQLRowCount returns -1
This adds a new column to the table from the terminal, and it works. I have code which, I think, should replicate this command. It runs without errors in my class and looks like this:
def createColumn(self, columnName, tableName, isFloat, isDateTime, isString):
    if isFloat:
        typeOf = 'float(4)'
    elif isDateTime:
        typeOf = 'datetime2'
    elif isString:
        typeOf = 'text'
    else:
        return False
    self.cursor.execute("ALTER TABLE " + tableName + " ADD " + columnName + " " + typeOf)
    print('command has executed')
Do I need to do something else with the pyodbc class to finalize the command or something?
Thanks!
You need to commit the change. Call
self.cursor.commit()
after the execute function has been called.
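As a rough sketch of the execute-then-commit pattern, the function below takes any DB-API connection. The names create_column, ALLOWED_TYPES and IDENTIFIER are made up for illustration, and an in-memory sqlite3 database stands in for the pyodbc connection so the snippet runs without a DSN. With pyodbc you would pass the object returned by pyodbc.connect(...) instead.

```python
import re
import sqlite3

# Column types the original createColumn chose between.
ALLOWED_TYPES = {"float(4)", "datetime2", "text"}
# Hypothetical whitelist pattern for table and column names.
IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def create_column(connection, table_name, column_name, type_of):
    """Add a column, then commit so the change is made permanent."""
    if type_of not in ALLOWED_TYPES:
        return False
    # Identifiers cannot be bound as parameters, so check them before
    # concatenating them into the statement.
    for name in (table_name, column_name):
        if not IDENTIFIER.match(name):
            raise ValueError("unsafe identifier: %r" % name)
    cursor = connection.cursor()
    cursor.execute("ALTER TABLE %s ADD %s %s" % (table_name, column_name, type_of))
    connection.commit()  # with autocommit off, the change is lost without this
    return True


# Stand-in connection (assumption: sqlite3 instead of the real SQL Server DSN).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DunbarGen (id INTEGER)")
created = create_column(conn, "DunbarGen", "testCol", "float(4)")
columns = [row[1] for row in conn.execute("PRAGMA table_info(DunbarGen)")]
print(created, columns)
```

Committing on the connection rather than on the cursor keeps the function usable with drivers whose cursors have no commit method.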
I am unable to understand why there are two queries being executed. First we are executing the prepared statement and we are using the build cypher function. The code can be found here
https://github.com/apache/age/blob/master/drivers/python/age/age.py
def execCypher(conn:ext.connection, graphName:str, cypherStmt:str, cols:list=None, params:tuple=None) -> ext.cursor :
    if conn == None or conn.closed:
        raise _EXCEPTION_NoConnection
    cursor = conn.cursor()
    #clean up the string for modification
    cypherStmt = cypherStmt.replace("\n", "")
    cypherStmt = cypherStmt.replace("\t", "")
    cypher = str(cursor.mogrify(cypherStmt, params))
    cypher = cypher[2:len(cypher)-1]
    preparedStmt = "SELECT * FROM age_prepare_cypher({graphName},{cypherStmt})"
    cursor = conn.cursor()
    try:
        cursor.execute(sql.SQL(preparedStmt).format(graphName=sql.Literal(graphName),cypherStmt=sql.Literal(cypher)))
    except SyntaxError as cause:
        conn.rollback()
        raise cause
    except Exception as cause:
        conn.rollback()
        raise SqlExecutionError("Execution ERR[" + str(cause) +"](" + preparedStmt +")", cause)
    stmt = buildCypher(graphName, cypher, cols)
    cursor = conn.cursor()
    try:
        cursor.execute(stmt)
        return cursor
    except SyntaxError as cause:
        conn.rollback()
        raise cause
    except Exception as cause:
        conn.rollback()
        raise SqlExecutionError("Execution ERR[" + str(cause) +"](" + stmt +")", cause)
Both statements perform the same operation.
The difference is that preparedStmt and the buildCypher function use different forms of the Cypher query, as shown in the code (cypherStmt and cypher), and the code that builds each query is slightly different.
I can't tell you why it's done this way but I'll show you why it's different. Also apologies but I'm not used to Python or C.
The preparedStmt calls a custom Postgres function, age_prepare_cypher, in the file apache/age/src/backend/utils/adt/age_session_info.c, which calls set_session_info(graph_name_str, cypher_statement_str);.
set_session_info, in the same file apache/age/src/backend/utils/adt/age_session_info.c, just stores the values in global variables such as session_info_cypher_statement.
So your graph name and query are being set in the session.
There's another function that gets your graph name and query back out of the session, and that is the convert_cypher_to_subquery. It only gets them out if is_session_info_prepared() is true, and only if graph_name and query_str provided to it are NULL.
Seems strange right? But now let's look at this bit of the python buildCypher function code:
stmtArr = []
stmtArr.append("SELECT * from cypher(NULL,NULL) as (")
stmtArr.append(','.join(columnExp))
stmtArr.append(");")
return "".join(stmtArr)
It's taking your query and saying your graph name and query string are NULL.
So we can conclude that the prepare statement is storing those values in session memory, and then when you execute your statement after using buildCypher, it's getting them out of memory and completing the statement again.
I can't explain exactly why or how it does it, but I can see a chunk of test sql in the project that is doing the same sort of thing here:
-- should return true and execute cypher command
SELECT * FROM age_prepare_cypher('analyze', 'MATCH (u) RETURN (u)');
SELECT * FROM cypher(NULL, NULL) AS (result agtype);
So tl;dr, executing the prepareStatement is storing it in session memory, and executing the normal statement after running it through buildCypher is grabbing what was just stored in the session.
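To make the two-step flow easier to picture, here is a toy model in plain Python. It is not AGE's code: ToySession and its methods are invented for illustration and only mimic the behaviour described above, where the prepare call stores the arguments and the later call with NULL arguments picks them up.

```python
class ToySession:
    """Toy stand-in for a database session; not AGE's real implementation."""

    def __init__(self):
        # Plays the role of the session-level global variables.
        self._prepared = None

    def age_prepare_cypher(self, graph_name, cypher_stmt):
        # Step 1: only remember the arguments; nothing is executed yet.
        self._prepared = (graph_name, cypher_stmt)
        return True

    def cypher(self, graph_name=None, query=None):
        # Step 2: when both arguments are NULL (None), fall back to
        # whatever the prepare call stored in the session.
        if graph_name is None and query is None:
            if self._prepared is None:
                raise RuntimeError("nothing prepared in this session")
            graph_name, query = self._prepared
        return "run %r on graph %r" % (query, graph_name)


session = ToySession()
session.age_prepare_cypher("analyze", "MATCH (u) RETURN (u)")
result = session.cypher(None, None)
print(result)
```

In the model, as in the description above, the second call carries no query text of its own, which is why both calls have to happen in the same session.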
I'm using Jupyter notebook to run a PL/SQL script but I get an error. The code block in the notebook is as follows:
%%sql
DECLARE BEGIN
    FOR record_item IN (
        SELECT
            *
        FROM
            duplicated_records
    ) LOOP
        EXECUTE IMMEDIATE 'UPDATE table_name SET record_id ='|| record_item.original_record_id || ' WHERE record_id =' || record_item.duplicated_record_id;
        EXECUTE IMMEDIATE 'DELETE FROM records WHERE id ='|| record_item.duplicated_record_id;
    END LOOP;
END
The error is
(cx_Oracle.DatabaseError) ORA-06550: line 8, column 165:
PLS-00103: Encountered the symbol "end-of-file" when expecting one of the following:
Non-PL/SQL code, such as select, and update statements, seems to work.
It works perfectly fine with other SQL clients like SQL Developer. I've tried adding/removing the ; at the end but it still doesn't work.
I don't know Python so I can't assist about that, but - as far as Oracle is concerned - you don't need DECLARE (as you didn't declare anything), and you certainly don't need dynamic SQL (EXECUTE IMMEDIATE) as there's nothing dynamic there.
Rewritten:
BEGIN
    FOR record_item IN (SELECT * FROM duplicated_records) LOOP
        UPDATE table_name
           SET record_id = record_item.original_record_id
         WHERE record_id = record_item.duplicated_record_id;
        DELETE FROM records
         WHERE id = record_item.duplicated_record_id;
    END LOOP;
END;
On the other hand, row-by-row processing is slow-by-slow. Consider using two separate statements: one which will update existing rows, and another which will delete rows (from a different table, apparently):
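-- Repoint rows from each duplicate to its original. This is a correlated
-- UPDATE because a MERGE cannot update a column used in its ON clause (ORA-38104).
update table_name a
   set a.record_id = (select b.original_record_id
                        from duplicated_records b
                       where b.duplicated_record_id = a.record_id)
 where a.record_id in (select b.duplicated_record_id from duplicated_records b);
delete from records a
 where a.id in (select b.duplicated_record_id from duplicated_records b);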
If tables are properly indexed (on ID columns), that should behave better (faster).
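The set-based idea can be tried out without an Oracle instance. The sketch below uses Python's built-in sqlite3 module as a stand-in, with made-up sample rows, and writes the update as a correlated UPDATE, which SQLite supports. Table and column names follow the question; everything else is an assumption for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_name (record_id INTEGER);
    CREATE TABLE records (id INTEGER PRIMARY KEY);
    CREATE TABLE duplicated_records (
        original_record_id INTEGER,
        duplicated_record_id INTEGER
    );
    INSERT INTO table_name VALUES (1), (2), (3);
    INSERT INTO records VALUES (1), (2), (3);
    INSERT INTO duplicated_records VALUES (1, 2);
""")

# One statement repoints every reference from a duplicate to its original.
conn.execute("""
    UPDATE table_name
       SET record_id = (SELECT d.original_record_id
                          FROM duplicated_records d
                         WHERE d.duplicated_record_id = table_name.record_id)
     WHERE record_id IN (SELECT duplicated_record_id FROM duplicated_records)
""")
# One statement removes all the duplicate rows.
conn.execute("""
    DELETE FROM records
     WHERE id IN (SELECT duplicated_record_id FROM duplicated_records)
""")
conn.commit()

references = [r[0] for r in conn.execute("SELECT record_id FROM table_name ORDER BY rowid")]
remaining = [r[0] for r in conn.execute("SELECT id FROM records ORDER BY id")]
print(references, remaining)
```

Each statement touches all affected rows in one pass, so the database does the looping instead of the client.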
The direct implementation of your code in Python would be like:
import oracledb
import traceback
import os
import sys

#if sys.platform.startswith('darwin'):
#    oracledb.init_oracle_client(lib_dir=os.environ.get('HOME')+'/Downloads/instantclient_19_8')

# Credentials and connect string are read from the environment
un = os.environ.get('PYTHON_USERNAME')
pw = os.environ.get('PYTHON_PASSWORD')
cs = os.environ.get('PYTHON_CONNECTSTRING')

try:
    connection = oracledb.connect(user=un, password=pw, dsn=cs)
    with connection.cursor() as cursor:
        plsql = """BEGIN
            FOR RECORD_ITEM IN (
                SELECT
                    *
                FROM
                    DUPLICATED_RECORDS
            ) LOOP
                EXECUTE IMMEDIATE 'UPDATE table_name SET record_id ='
                                  || RECORD_ITEM.ORIGINAL_RECORD_ID
                                  || ' WHERE record_id ='
                                  || RECORD_ITEM.DUPLICATED_RECORD_ID;
                EXECUTE IMMEDIATE 'DELETE FROM records WHERE id ='
                                  || RECORD_ITEM.DUPLICATED_RECORD_ID;
            END LOOP;
        END;"""
        cursor.execute(plsql)
        # Autocommit is off by default, so make the changes permanent
        connection.commit()
except oracledb.Error as e:
    error, = e.args
    traceback.print_tb(e.__traceback__)
    print(error.message)
For this you need to install the oracledb module, which is the renamed, latest version of the cx_Oracle module. It will work with cx_Oracle by changing the import to import cx_Oracle as oracledb.
However, before blindly copying this, check @littlefoot's answer for more about the PL/SQL code.
Here is some custom code I wrote that I think might be problematic for this particular use case.
class SQLServerConnection:
    def __init__(self, database):
        ...
        self.connection_string = \
            "DRIVER=" + str(self.driver) + ";" + \
            "SERVER=" + str(self.server) + ";" + \
            "DATABASE=" + str(self.database) + ";" + \
            "Trusted_Connection=yes;"
        self.engine = sqlalchemy.create_engine(
            sqlalchemy.engine.URL.create(
                "mssql+pyodbc", \
                query={'odbc_connect': self.connection_string}
            )
        )
    # Runs a command and returns in plain text (python list for multiple rows)
    # Can be a select, alter table, anything like that
    def execute(self, command, params=False):
        # Make a connection object with the server
        with self.engine.connect() as conn:
            # Can send some parameters along with a plain text query...
            # could be single dict or list of dict
            # Doc: https://docs.sqlalchemy.org/en/14/tutorial/dbapi_transactions.html#sending-multiple-parameters
            if params:
                output = conn.execute(sqlalchemy.text(command,params))
            else:
                output = conn.execute(sqlalchemy.text(command))
            # Tell SQL server to save your changes (assuming that is applicable, is not with select)
            # Doc: https://docs.sqlalchemy.org/en/14/tutorial/dbapi_transactions.html#committing-changes
            try:
                conn.commit()
            except Exception as e:
                #pass
                warn("Could not commit changes...\n" + str(e))
            # Try to consolidate select statement result into single object to return
            try:
                output = output.all()
            except:
                pass
        return output
If I try:
cnxn = SQLServerConnection(database='MyDatabase')
cnxn.execute("SELECT * INTO [dbo].[MyTable_newdata] FROM [dbo].[MyTable] ")
or
cnxn.execute("SELECT TOP 0 * INTO [dbo].[MyTable_newdata] FROM [dbo].[MyTable] ")
Python returns this object without error, <sqlalchemy.engine.cursor.LegacyCursorResult at 0x2b793d71880>, but upon looking in MS SQL Server, the new table was not generated. I am not warned about the commit step failing with the SELECT TOP 0 way; I am warned ('Connection' object has no attribute 'commit') in the above way.
CREATE TABLE, ALTER TABLE, or SELECT (etc) appears to work fine, but SELECT * INTO seems to not be working, and I'm not sure how to troubleshoot further. Copy-pasting the query into SQL Server and running appears to work fine.
As noted in the introduction to the 1.4 tutorial here:
A Note on the Future
This tutorial describes a new API that’s released in SQLAlchemy 1.4 known as 2.0 style. The purpose of the 2.0-style API is to provide forwards compatibility with SQLAlchemy 2.0, which is planned as the next generation of SQLAlchemy.
In order to provide the full 2.0 API, a new flag called future will be used, which will be seen as the tutorial describes the Engine and Session objects. These flags fully enable 2.0-compatibility mode and allow the code in the tutorial to proceed fully. When using the future flag with the create_engine() function, the object returned is a subclass of sqlalchemy.engine.Engine described as sqlalchemy.future.Engine. This tutorial will be referring to sqlalchemy.future.Engine.
That is, it is assumed that the engine is created with
engine = create_engine(connection_url, future=True)
You are getting the "'Connection' object has no attribute 'commit'" error because you are creating an old-style Engine object.
You can avoid the error by adding future=True to your create_engine() call:
self.engine = sqlalchemy.create_engine(
    sqlalchemy.engine.URL.create(
        "mssql+pyodbc",
        query={'odbc_connect': self.connection_string}
    ),
    future=True
)
Use this recipe instead:
from sqlalchemy.sql import Select
from sqlalchemy.ext.compiler import compiles

class SelectInto(Select):
    def __init__(self, columns, into, *arg, **kw):
        super(SelectInto, self).__init__(columns, *arg, **kw)
        self.into = into

@compiles(SelectInto)
def s_into(element, compiler, **kw):
    text = compiler.visit_select(element)
    text = text.replace('FROM',
                        'INTO TEMPORARY TABLE %s FROM' %
                        element.into)
    return text

if __name__ == '__main__':
    from sqlalchemy.sql import table, column
    marker = table('marker',
        column('x1'),
        column('x2'),
        column('x3')
    )
    print(SelectInto([marker.c.x1, marker.c.x2], "tmp_markers").\
        where(marker.c.x3==5).\
        where(marker.c.x1.in_([1, 5])))
This needs some tweaking, since it will turn every subquery SELECT into a SELECT INTO as well, but test it for now; if it works, it would be better than raw text statements.
Have you tried this, from this answer by @Michael Berkowski?
INSERT INTO assets_copy
SELECT * FROM assets;
That answer points out that, according to the MySQL documentation, SELECT * INTO isn't supported.
I'm trying to make friends with postgresql (14.0 build 1914 64-bit on windows), psycopg2 (2.9.1 installed using pip) and python 3.8.10 on windows.
I have created a postgresql function in a database that returns a cursor, something like below
CREATE get_rows
...
RETURNS refcursor
...
DECLARE
    res1 refcursor;
BEGIN
    OPEN res1 FOR
        SELECT some_field, and_another_field FROM some_table;
    RETURN res1;
END
The function can be run from the pgAdmin4 Query Tool
SELECT get_rows();
and will then return a cursor like "<unnamed portal 1>"
Still within query tool in pgAdmin4 I can issue:
BEGIN;
SELECT get_rows();
FETCH ALL IN "<unnamed portal 2>"; -- Adjust counter number
And this will get me the rows returned by the cursor.
Now I want to replicate this using psycopg instead of pgAdmin4
I have the below code
conn = psycopg2.connect("dbname='" + db_name + "' "\
                        "user='" + db_user + "' " +\
                        "host='" + db_host + "' "+\
                        "password='" + db_passwd + "'")
cursor = conn.cursor()
cursor.callproc('get_rows')
print("cursor.description: ", end = '')
print(cursor.description)
for record in cursor:
    print("record: ", end = '')
    print(record)
The above code only gives the cursor string name (as returned by the postgresql function 'get_rows') in the single record of the cursor created by psycopg.
How can I get a cursor-class object from psycopg that provides access to the cursor returned by 'get_rows'?
https://www.psycopg.org/docs/cursor.html says cursor.name is read-only and I don't see an obvious way to connect the cursor from 'get_rows' with a psycopg cursor instance.
The cursor link you show refers to the Python DB API cursor, not the Postgres one. There is an example of how to do what you want in the "Server side cursors" section of the psycopg2 documentation:
Note It is also possible to use a named cursor to consume a cursor created in some other way than using the DECLARE executed by execute(). For example, you may have a PL/pgSQL function returning a cursor:
CREATE FUNCTION reffunc(refcursor) RETURNS refcursor AS $$
BEGIN
    OPEN $1 FOR SELECT col FROM test;
    RETURN $1;
END;
$$ LANGUAGE plpgsql;
You can read the cursor content by calling the function with a regular, non-named, Psycopg cursor:
cur1 = conn.cursor()
cur1.callproc('reffunc', ['curname'])
and then use a named cursor in the same transaction to “steal the cursor”:
cur2 = conn.cursor('curname')
for record in cur2:     # or cur2.fetchone, fetchmany...
    # do something with record
    pass
UPDATE
Be sure to close the named cursor (cur2) to release the server-side cursor:
cur2.close()
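Putting the steps together, a small helper might look like the sketch below. The name fetch_from_refcursor is made up, and it assumes the PL/pgSQL function accepts the cursor name as its only argument, like reffunc in the documentation excerpt (the get_rows function from the question would need that change). The conn argument is expected to be a psycopg2 connection with a transaction open.

```python
def fetch_from_refcursor(conn, function_name, cursor_name):
    """Call a refcursor-returning function and read the rows it opened.

    Assumes function_name takes the cursor name as its single argument.
    """
    plain = conn.cursor()
    # A named cursor with the same name attaches to the portal that the
    # function opens on the server.
    named = None
    try:
        plain.callproc(function_name, [cursor_name])
        named = conn.cursor(cursor_name)
        return named.fetchall()
    finally:
        # Closing the named cursor releases the server-side cursor.
        if named is not None:
            named.close()
        plain.close()
```

Both calls have to run in the same transaction, so the helper should be used before any commit or rollback on conn.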
def rmv_dupes_in_psumsdb(setoffilestoprocess, config):
    setoffilestoprocess_fnames = [file.name for file in setoffilestoprocess]
    constring = config['db_string']['db_string']
    cnxn = pyodbc.connect(constring)
    FilesToBeCrunched1000 = list(chunks(list(setoffilestoprocess_fnames), 2))
    for FilesChunks1000 in FilesToBeCrunched1000:
        sqlstring = 'DELETE FROM {0} WHERE [LOG] IN ('.format(config['db_string']['bd_psums_meta_table'])
        # print((FilesChunks1000))
        values_string = (', '.join("'" + item + "'" for item in FilesChunks1000))
        sqlstring+=values_string
        sqlstring+=')'
        cnxn.execute(sqlstring)
The script calls this function but nothing happens on the database side. I wrote a function similar to this that does a Select statement and it works. But this one doesn't. I printed out (sqlstring) and it correctly gave me the following output:
DELETE FROM [NSGWSAINLINE].[dbo].[bd_psums_meta] WHERE [LOG] IN ('0517312.002.7312-08.FRP.00.S25D._Yo9EXAOaDK6asQ2_.0.zip', '0503302.002.3302-20.FRP.00.S26A._obBBQu5GUT1pnKO_.0.zip')
DELETE FROM [NSGWSAINLINE].[dbo].[bd_psums_meta] WHERE [LOG] IN ('0524222.002.4222-08.FRP.00.S25D._cH03BJws2g1pnKO_.0.zip', '0532722.002.2722-15.92FIP.00.S26A._hR10vpeCvpsonKO_.0.zip')
DELETE FROM [NSGWSAINLINE].[dbo].[bd_psums_meta] WHERE [LOG] IN ('0524282.002.4282-25.FRP.00.S25D._0sOzWcCcEptonKO_.0.zip')
I actually went ahead and copied the outputs above and they ran in SQL Server and did the delete statements. So why isn't this working from within Python?
You need to commit your changes after execute with cnxn.commit(), unless autocommit is enabled on your connection.
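A minimal sketch of the same delete with the commit added is shown below. The helper names chunks and delete_logs are assumptions (the original chunks function is not shown), and sqlite3 stands in for the pyodbc connection so the snippet runs anywhere; both drivers use ? placeholders. Passing the file names as parameters also avoids building the quoted value list by hand.

```python
import sqlite3


def chunks(items, size):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def delete_logs(connection, table, file_names, chunk_size=1000):
    """Delete rows whose [LOG] value is in file_names, then commit."""
    cursor = connection.cursor()
    deleted = 0
    for chunk in chunks(list(file_names), chunk_size):
        # One "?" per value lets the driver handle the quoting.
        placeholders = ", ".join("?" for _ in chunk)
        cursor.execute(
            "DELETE FROM {0} WHERE [LOG] IN ({1})".format(table, placeholders),
            chunk,
        )
        deleted += cursor.rowcount
    connection.commit()  # the step missing from the original function
    return deleted


# Stand-in database (assumption: sqlite3 instead of SQL Server).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bd_psums_meta ([LOG] TEXT)")
conn.executemany(
    "INSERT INTO bd_psums_meta VALUES (?)",
    [("a.zip",), ("b.zip",), ("c.zip",)],
)
deleted = delete_logs(conn, "bd_psums_meta", ["a.zip", "b.zip"], chunk_size=1)
remaining = [row[0] for row in conn.execute("SELECT [LOG] FROM bd_psums_meta")]
print(deleted, remaining)
```

The table name is still formatted into the statement, so it should come from trusted configuration, as it does in the original function.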