python and cx_Oracle - dynamic cursor.setinputsizes

I'm using cx_Oracle to select rows from one database and then insert those rows into a table in another database. The second table's columns match the first select.
So I have (simplified):
db1_cursor.execute('select col1, col2 from tab1')
rows = db1_cursor.fetchall()
db2_cursor.bindarraysize = len(rows)
db2_cursor.setinputsizes(cx_Oracle.NUMBER, cx_Oracle.BINARY)
db2_cursor.executemany('insert into tab2 values (:1, :2)', rows)
This works fine, but my question is how to avoid the hard coding in setinputsizes (I have many more columns).
I can get the column types from db1_cursor.description, but I'm not sure how to feed those into setinputsizes. i.e. how can I pass a list to setinputsizes instead of arguments?
Hope this makes sense - I'm new to Python and cx_Oracle.

Just use tuple unpacking.
e.g.
db_types = (d[1] for d in db1_cursor.description)
db2_cursor.setinputsizes(*db_types)
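Putting the two together, a minimal end-to-end sketch (reusing the cursors, tables, and columns from the question) would be:
# Copy rows from tab1 to tab2, deriving the bind types from the source
# cursor; d[1] in cursor.description is the cx_Oracle type object.
db1_cursor.execute('select col1, col2 from tab1')
rows = db1_cursor.fetchall()

db_types = (d[1] for d in db1_cursor.description)
db2_cursor.bindarraysize = len(rows)
db2_cursor.setinputsizes(*db_types)
db2_cursor.executemany('insert into tab2 values (:1, :2)', rows)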

Related

Query external tuples list embedded within a SQL query

I have to run a SQL query that grabs values only if two conditions are true. For example, I need to grab all the values where asset=x and id_name=12345. There are about 10k combinations of asset and id_name that I need to be able to query for using SQL. Usually I would just do the following:
select * from database where id_name IN (12345)
But how do I make this query when two conditions have to be true. id_name has to equal 12345 AND asset has to equal x.
I tried turning the list I need into tuples like this:
new_list = list(scams[['asset', 'id_name']].itertuples(index=False, name=None))
which gives me a list like this:
new_list = [(12345, 'x'), (32342, 'z'), ...]
Any suggestions would be great. Thanks!
Based on my understanding, you need to fetch records based on a combination of two filters, and you have around 10K combinations. Here is a simple SQL-based solution.
Create a new column in the same table, or build a temp table/view with a new column, say "column_new". Populate it with the concatenated value of id_name and asset, using whatever concatenation function your database provides; for example, in SQL Server use CONCAT(column1, column2).
Now you can write your SQL as select * from database where column_new IN ('12345x','32342z');.
Note: you can also put a separator such as "-" or "|" between column 1 and column 2 when concatenating.
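As a sketch in Python (cursor here is a hypothetical DB-API cursor, and the %s placeholder style is an assumption that depends on your driver), building the concatenated IN list from new_list could look like:
# Build the concatenated lookup values from the (id_name, asset) tuples.
new_list = [(12345, 'x'), (32342, 'z')]
params = [f'{id_name}{asset}' for id_name, asset in new_list]

# One bind placeholder per combination keeps the query parameterised.
placeholders = ', '.join(['%s'] * len(params))
sql = f'SELECT * FROM database WHERE column_new IN ({placeholders})'
cursor.execute(sql, params)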

How to delete a large quantity of records from an Oracle table that has no primary key

The situation: I'm loading an entire SQL table into my program. For convenience I'm using pandas to maintain the row data. I am then creating a dataframe of rows I would like to have removed from the SQL table. Unfortunately (and I can't change this) the table does not have any primary key other than the built-in Oracle ROWID (which isn't a real table column, it's a pseudocolumn), but I can make ROWID part of my dataframe if I need to.
The table has hundreds of thousands of rows, and I'll probably be deleting a few thousand records with each run of the program.
Question:
Using cx_Oracle, what is the best method of deleting multiple rows/records that don't have a primary key? I don't think creating a loop to submit thousands of delete statements is very efficient or pythonic. I am also concerned about building a single SQL delete statement keyed off ROWID that contains an IN clause with thousands of items:
Where ROWID IN ('eg1','eg2',........, 'eg2345')
Is this concern valid? Any Suggestions?
Using ROWID
Since you can use ROWID, that would be the ideal way to do it. Depending on the Oracle version, the overall query length limit may be large enough for a query with that many elements. The real issue is the number of elements in an IN expression list, which Oracle limits to 1000.
So you'll either have to break the list of ROWIDs into sets of at most 1000, or delete a single row at a time, with or without executemany().
>>> len(delrows) # rowids to delete
5000
>>> q = 'DELETE FROM sometable WHERE ROWID IN (' + ', '.join(f"'{row}'" for row in delrows) + ')'
>>> len(q) # length of the query
55037
>>> # let's try with just the first 1000 id's and no extra spaces
... q = 'DELETE FROM sometable WHERE ROWID IN (' + ','.join(f"'{row}'" for row in delrows[:1000]) + ')'
>>> len(q)
10038
You're probably within query-length limits, and can even save some chars with a minimal ',' item separator.
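A minimal sketch of the chunked variant, assuming delrows and an open cursor and connection as in the session above:
# Delete in batches of at most 1000 ROWIDs to stay under the IN-list limit.
for i in range(0, len(delrows), 1000):
    chunk = delrows[i:i + 1000]
    in_list = ','.join(f"'{rid}'" for rid in chunk)
    cursor.execute(f'DELETE FROM sometable WHERE ROWID IN ({in_list})')
connection.commit()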
Without ROWID
Without a primary key or ROWID, the only way to identify each row is to specify all of its columns in the WHERE clause; to delete many rows at a time, the per-row conditions need to be OR'd together:
DELETE FROM sometable
WHERE ( col1 = 'val1'
AND col2 = 'val2'
AND col3 = 'val3' ) -- row 1
OR ( col1 = 'other2'
AND col2 = 'value2'
AND col3 = 'val3' ) -- row 2
OR ( ... ) -- etc
As you can see, it's not the nicest query to construct, but it lets you do it without ROWIDs.
In both cases, you probably can't use a fixed parameterised query, since the IN list in the first approach and the OR grouping in the second are variable-length. (You could still parameterise it after constructing the whole extended SQL, binding thousands of parameters; I'm not sure what the limit on bind variables is.) The executemany() way is definitely easier to write, but for speed, the single large queries (either of the above two) will probably outperform executemany() with thousands of items.
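For the second approach, here is a sketch of building the OR-grouped DELETE with bind variables; rows is an assumed list of (col1, col2, col3) tuples identifying the rows to delete:
# Build one AND-group per row and OR them together, with a unique bind
# name per value so everything stays parameterised.
cols = ('col1', 'col2', 'col3')
binds = {}
groups = []
for i, row in enumerate(rows):
    conds = []
    for col, val in zip(cols, row):
        bind = f'{col}_{i}'
        binds[bind] = val
        conds.append(f'{col} = :{bind}')
    groups.append('(' + ' AND '.join(conds) + ')')
sql = 'DELETE FROM sometable WHERE ' + ' OR '.join(groups)
cursor.execute(sql, binds)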
You can use cursor.executemany() to delete multiple rows at once. Something like the following should work:
dataToDelete = [['eg1'], ['eg2'], ...., ['eg2345']]
cursor.executemany("delete from sometable where rowid = :1", dataToDelete)

Combining 2 columns in a SQL query based on an alternate field

I have 2 columns in a database (sent_time and accept_time), each containing timestamps, as well as a field that can have 2 different values (ref_col). I would like my query to produce a new column (result_col) that checks the value of ref_col and copies sent_time if the value is 1 and accept_time if the value is 2.
I am using pandas to query the database in python, if that has any bearing on the answer.
Just use a CASE expression:
SELECT sent_time,
       accept_time,
       ref_col,
       CASE WHEN ref_col = 1 THEN sent_time
            ELSE accept_time
       END AS result_col
FROM Your_Table
When you say "I have 2 columns in a database", what you actually mean is that you have 2 columns in a table, right?
In sql for postgresql it would be something like:
select (case when ref_col = 1 then sent_time else accept_time end) as result_col
from mytable
I don't know how close to the SQL standard that is, but I'd assume it's not far off.
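Since the question mentions querying through pandas, here is a sketch of running the CASE query that way (your_table and conn are placeholders for the actual table and an open connection):
import pandas as pd

sql = """
    SELECT sent_time,
           accept_time,
           ref_col,
           CASE WHEN ref_col = 1 THEN sent_time
                ELSE accept_time
           END AS result_col
    FROM your_table
"""
# read_sql runs the query and returns the result set as a DataFrame.
df = pd.read_sql(sql, conn)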

SQLAlchemy filter rows based on the values contained in cells of another column

I am new to Python and SQLAlchemy, and I came across this doubt: can we filter rows of a table based on the cell values of a column in the same table?
example:
Sbranch = value
result = (Transaction.query.filter(Transaction.branch == Sbranch)
          .order_by(desc(Transaction.id)).limit(50).all())
If the value of Sbranch is 0, I want to read all the rows regardless of branch; otherwise I want to filter rows where Transaction.branch == Sbranch.
I know it can be achieved by comparing the value of Sbranch with if-else conditions, but it gets complicated as the number of such columns increases.
Example:
Sbranch = value1
trans_by = value2
trans_to = value3
.
.
result = (Transaction.query.filter(Transaction.branch == Sbranch,
                                   Transaction.trans_by == trans_by,
                                   Transaction.trans_to == trans_to)
          .order_by(desc(Transaction.id)).limit(50).all())
I want to apply a similar filter to all 3 columns, and I want to know if there is any built-in function in SQLAlchemy for this problem.
You can conditionally add the filter based on the value of Sbranch:
query = Transaction.query
if Sbranch != 0:
    query = query.filter(Transaction.branch == Sbranch)
result = query.order_by(Transaction.id.desc()).limit(50).all()
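The same pattern extends to any number of optional filters. A sketch, assuming 0 is the "no filter" sentinel for every column, as in the question:
# Map each filterable column to its requested value and skip the sentinel.
filters = {
    Transaction.branch: Sbranch,
    Transaction.trans_by: trans_by,
    Transaction.trans_to: trans_to,
}

query = Transaction.query
for column, value in filters.items():
    if value != 0:
        query = query.filter(column == value)
result = query.order_by(Transaction.id.desc()).limit(50).all()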
I think I found a solution; it's not the best, but it will reduce the work for the developer (not the processor).
Sbranch = value
branches = []
if Sbranch == 0:
    # Append all the values for which the rows should be filtered,
    # for example:
    branches = [1, 2, 4, 7, 3, 8]
else:
    branches.append(Sbranch)
result = (Transaction.query.filter(Transaction.branch.in_(branches))
          .order_by(desc(Transaction.id)).limit(50).all())

python sql statement optimization

This is the only way I found to update values in a MySQL DB from Python without overwriting existing data with NULLs.
I have this kind of statement:
sql = "INSERT INTO `table` VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s)\
ON DUPLICATE KEY UPDATE Data_block_1_HC1_sec_voltage=IF(VALUES(Data_block_1_HC1_sec_voltage)IS NULL,Data_block_1_HC1_sec_voltage,VALUES(Data_block_1_HC1_sec_voltage)),\
`Data_block_1_TC1_1`=IF(VALUES(`Data_block_1_TC1_1`)IS NULL,`Data_block_1_TC1_1`,VALUES(`Data_block_1_TC1_1`)),\
`Data_block_1_TC1_2`=IF(VALUES(`Data_block_1_TC1_2`)IS NULL,`Data_block_1_TC1_2`,VALUES(`Data_block_1_TC1_2`)),\
`Data_block_1_TCF1_1`=IF(VALUES(`Data_block_1_TCF1_1`)IS NULL,`Data_block_1_TCF1_1`,VALUES(`Data_block_1_TCF1_1`)),\
`HC1_HC1_output`=IF(VALUES(`HC1_HC1_output`)IS NULL,`HC1_HC1_output`,VALUES(`HC1_HC1_output`)),\
`Data_block_1_HC1_sec_cur`=IF(VALUES(`Data_block_1_HC1_sec_cur`)IS NULL,`Data_block_1_HC1_sec_cur`,VALUES(`Data_block_1_HC1_sec_cur`)),\
`Data_block_1_HC1_power`=IF(VALUES(`Data_block_1_HC1_power`)IS NULL,`Data_block_1_HC1_power`,VALUES(`Data_block_1_HC1_power`)),\
`HC1_HC1_setpoint`=IF(VALUES(`HC1_HC1_setpoint`)IS NULL,`HC1_HC1_setpoint`,VALUES(`HC1_HC1_setpoint`))\
"
The data blocks are columns in the DB, and the primary key is a datetime. Right now there are 8 columns, but I will have a lot more variables (more columns). I am not really good at Python, but I don't like this statement because it's kind of hardcoded. Could I build the statement in a for loop or something, so it doesn't have to be so long and I don't have to write all the variables manually?
Thanks for your help.
Let's assume that your column names are stored in a list cols. Then, to generate the "interesting" inner part of the SQL statement above, you could do:
',\\\n'.join(map(lambda c: r'`%(col)s` = IF(VALUES(`%(col)s`) IS NULL, `%(col)s`, VALUES(`%(col)s`))' % {'col': c}, cols))
Here, map generates the corresponding line of the SQL statement for each element of cols, and join then stitches everything together.
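A fuller sketch that builds the whole statement from cols (the column names are assumed to be trusted, since they are interpolated directly into the SQL):
# Build the full upsert from a list of column names; adding columns to
# the table then only means extending `cols`.
cols = ['Data_block_1_HC1_sec_voltage', 'Data_block_1_TC1_1',
        'Data_block_1_TC1_2']  # ... and so on

placeholders = ','.join(['%s'] * (len(cols) + 1))  # +1 for the datetime key
updates = ',\n'.join(
    f'`{c}`=IF(VALUES(`{c}`) IS NULL,`{c}`,VALUES(`{c}`))'
    for c in cols
)
sql = (f'INSERT INTO `table` VALUES ({placeholders})\n'
       f'ON DUPLICATE KEY UPDATE\n{updates}')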
