Psycopg2 : Insert multiple values if not exists in the table - python

I need to insert multiple values into a table, but only those that do not already exist, using psycopg2.
The query I am using:
WITH data(name,proj_id) as (
VALUES ('hello',123),('hey',123)
)
INSERT INTO keywords(name,proj_id)
SELECT d.name,d.proj_id FROM data d
WHERE NOT EXISTS (SELECT 1 FROM keywords u2 WHERE
u2.name=d.name AND u2.proj_id=d.proj_id)
But how do I build the values section of the query, i.e. turn a tuple of values into ('hello',123),('hey',123)?

As suggested in the comment, assuming that your connection is already established as conn, one way would be:
from typing import Any, Dict, Iterable

import psycopg2.extras

def insert_execute_values_iterator(connection, keywords: Iterable[Dict[str, Any]], page_size: int = 1000) -> None:
    with connection.cursor() as cursor:
        # execute_values expands the single %s into a list of value tuples
        psycopg2.extras.execute_values(
            cursor,
            """ WITH data(name,proj_id) as (VALUES %s)
            INSERT INTO keywords(name,proj_id)
            SELECT d.name,d.proj_id FROM data d
            WHERE NOT EXISTS (SELECT 1 FROM keywords u2 WHERE
            u2.name=d.name AND u2.proj_id=d.proj_id);""",
            ((keyword['name'], keyword['proj_id']) for keyword in keywords),
            page_size=page_size)
    connection.commit()  # needed unless the connection is in autocommit mode

# each keyword is a dict with 'name' and 'proj_id' keys
insert_execute_values_iterator(conn, [{'name': 'hello', 'proj_id': 123}, {'name': 'hey', 'proj_id': 123}])

Another option is execute_batch, which runs the statement once per tuple of values:
import psycopg2.extras

insert_query = """WITH data(name, proj_id) as (
    VALUES (%s,%s)
)
INSERT INTO keywords(name, proj_id)
SELECT d.name,d.proj_id FROM data d
WHERE NOT EXISTS (
    SELECT 1 FROM keywords u2
    WHERE u2.name = d.name AND u2.proj_id = d.proj_id)"""
tuple_values = (('hello',123),('hey',123))
psycopg2.extras.execute_batch(cursor,insert_query,tuple_values)
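
The idea of expanding a sequence of tuples into the VALUES list can also be sketched without a PostgreSQL server. The example below is only an illustration that uses the standard-library sqlite3 module as a stand-in, so the placeholder is ? rather than %s; the helper name insert_missing_keywords and the sample rows are made up for this sketch, and SQLite limits how many parameters one statement may carry, so very long lists would need batching.

```python
import sqlite3

def insert_missing_keywords(conn, rows):
    """Insert (name, proj_id) pairs that are not already present in keywords."""
    rows = list(rows)
    if not rows:
        return  # an empty VALUES list would be a syntax error
    # One "(?, ?)" group per row, e.g. "(?, ?), (?, ?)" for two rows.
    values_sql = ", ".join(["(?, ?)"] * len(rows))
    sql = f"""
        WITH data(name, proj_id) AS (VALUES {values_sql})
        INSERT INTO keywords(name, proj_id)
        SELECT d.name, d.proj_id FROM data d
        WHERE NOT EXISTS (
            SELECT 1 FROM keywords u2
            WHERE u2.name = d.name AND u2.proj_id = d.proj_id)
    """
    # Flatten [(name, proj_id), ...] into [name, proj_id, name, proj_id, ...].
    params = [value for row in rows for value in row]
    conn.execute(sql, params)
    conn.commit()

# Demo against an in-memory database (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE keywords (name TEXT, proj_id INTEGER)")
insert_missing_keywords(conn, [("hello", 123), ("hey", 123)])
insert_missing_keywords(conn, [("hey", 123), ("new", 456)])  # ("hey", 123) is skipped
stored = conn.execute("SELECT name, proj_id FROM keywords ORDER BY name").fetchall()
```

With psycopg2 the same expansion is what execute_values does for you, so hand-building the placeholder list is only needed with drivers that lack such a helper.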

Related

Python MySQL. How to INSERT ON DUPLICATE KEY UPDATE from dictionary

I am updating a MySQL database table from a Python dictionary where the keys are the database fields. I have an insert statement that works but what I really need is INSERT ON DUPLICATE KEY UPDATE.
Here is the insert statement:
for d in r.json()['mydict'].items():
    d = d[1] #the dictionary is the 2nd element in the tuple
    placeholders = ', '.join(['%s'] * len(d))
    columns = ', '.join(d.keys())
    sql = "INSERT INTO %s ( %s ) VALUES ( %s )" % ("my_table", columns, placeholders)
    c = create_connection() #from function create tables
    cur = c[0] #function returns cursor
    db = c[1] #function returns db
    cur.execute(sql, list(d.values()))
    db.commit()
    db.close()
As always, any help is appreciated.
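
One possible way to extend the statement above is to append an ON DUPLICATE KEY UPDATE clause generated from the same dictionary keys. The sketch below only builds the SQL string and the parameter list; build_upsert is a hypothetical helper name, it assumes the table has a primary or unique key for MySQL to detect duplicates, and it assumes the column names come from a trusted source, since identifiers cannot be passed as query parameters.

```python
def build_upsert(table, row):
    """Return (sql, params) for INSERT ... ON DUPLICATE KEY UPDATE.

    `row` maps column names to values. Table and column names are
    interpolated directly, so they must be trusted.
    """
    columns = list(row)
    col_list = ", ".join(f"`{c}`" for c in columns)
    placeholders = ", ".join(["%s"] * len(columns))
    # VALUES(col) refers to the value that the INSERT tried to write.
    updates = ", ".join(f"`{c}` = VALUES(`{c}`)" for c in columns)
    sql = (
        f"INSERT INTO `{table}` ({col_list}) VALUES ({placeholders}) "
        f"ON DUPLICATE KEY UPDATE {updates}"
    )
    return sql, list(row.values())

# Hypothetical row, shaped like one dictionary from the loop above.
sql, params = build_upsert("my_table", {"id": 1, "name": "x"})
```

The result would be passed on as cur.execute(sql, params). Newer MySQL releases also offer a row-alias syntax in place of VALUES(col), which this sketch does not use.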

How can I replace NULL in a database query?

I have the result of one database query:
cur1 = con.cursor()
result1 = ("SELECT DDATE FROM TABLE(NULL)")
cur1.execute(result1)
result1 = cur1.fetchone()
The result of the query is 43949.0.
I need to put that result into the next query, replacing the first two NULL arguments in it:
cur = con.cursor()
POS = ("SELECT ST1,ST2 FROM SOMETABLE(NULL, NULL, NULL, NULL)")
cur.execute(POS)
POS = cur.fetchall()
The resulting query should look like this: POS = ("SELECT ST1,ST2 FROM SOMETABLE(43949.0, 43949.0, NULL, NULL)")
If you want to execute a second statement with input from the first statement, you can use a parameterized statement, and pass parameters from the first to the second in your python code.
For example, something like this:
cur = con.cursor()
cur.execute('select output1 from step1(null)')
result1 = cur.fetchone()
cur.execute('select output1, output2 from step2(?, ?, null, null)', (result1[0], result1[0]))
result2 = cur.fetchall()
Alternatively, you can join the stored procedures together to do this in one query. For example:
select s2.*
from step1(null) s1
cross join step2(s1.output1, s1.output1, null, null) s2
Contrary to normal tables, using a cross join with a stored procedure does not produce a cross-product, but instead behaves as a lateral join.
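
The parameterized approach can be tried out without the original database. The sketch below is only an illustration: it uses the standard-library sqlite3 module, and the plain tables step1 and step2 (with made-up columns and rows) stand in for the selectable stored procedures, so only the pattern of passing the first result as ? parameters carries over.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Stand-ins for the two stored procedures (hypothetical schema and data).
con.execute("CREATE TABLE step1 (output1 REAL)")
con.execute("INSERT INTO step1 VALUES (43949.0)")
con.execute("CREATE TABLE step2 (day REAL, st1 TEXT, st2 TEXT)")
con.executemany(
    "INSERT INTO step2 VALUES (?, ?, ?)",
    [(43949.0, "a", "b"), (43950.0, "c", "d")],
)

cur = con.cursor()
cur.execute("SELECT output1 FROM step1")
first = cur.fetchone()  # a one-element row, here (43949.0,)

# The value from the first query is bound twice; it is never pasted
# into the SQL text itself.
cur.execute(
    "SELECT st1, st2 FROM step2 WHERE day BETWEEN ? AND ?",
    (first[0], first[0]),
)
result2 = cur.fetchall()
```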

MySQL - select table name if it contains record for list of tables

I am interested in finding the most efficient manner to query the following:
For a list of table names, return the table name if it contains at least one record that meets the conditions
Essentially, something similar to the following Python code in a single query:
dfs = [pd.read_sql('SELECT name FROM {} WHERE a=1 AND b=2'.format(table), engine) for table in tables]
tables = [table for table, df in zip(tables, dfs) if not df.empty]
Is this possible in MySQL?
Assuming you trust the table names in tables not to contain any surprises leading to SQL injection, you could devise something like:
from sqlalchemy import text
selects = [f'SELECT :table_{i} FROM {table} WHERE a = 1 AND b = 2'
           for i, table in enumerate(tables)]
stmt = ' UNION '.join(selects)
stmt = text(stmt)
results = engine.execute(
    stmt, {f'table_{i}': table for i, table in enumerate(tables)})
or you could use SQLAlchemy constructs to build the same query safely:
from sqlalchemy import table, column, union, and_, select, Integer, literal
tbls = [table(name,
              column('a', Integer),
              column('b', Integer)) for name in tables]
stmt = union(*[select([literal(name).label('name')]).
               select_from(tbl).
               where(and_(tbl.c.a == 1, tbl.c.b == 2))
               for tbl, name in zip(tbls, tables)])
results = engine.execute(stmt)
You can use a UNION of queries that search each table.
(SELECT 'table1' AS table_name
FROM table1
WHERE a = 1 AND b = 2
LIMIT 1)
UNION
(SELECT 'table2' AS table_name
FROM table2
WHERE a = 1 AND b = 2
LIMIT 1)
UNION
(SELECT 'table3' AS table_name
FROM table3
WHERE a = 1 AND b = 2
LIMIT 1)
...
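
The UNION statement above can be generated from a list of table names. The following is a minimal sketch, not a definitive implementation: build_nonempty_tables_query is a hypothetical helper, and because table names cannot be bound as parameters it accepts only names made of letters, digits and underscores before interpolating them.

```python
import re

# Conservative whitelist for identifiers that are safe to interpolate.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def build_nonempty_tables_query(tables):
    """Build one UNION query that returns each table name having a matching row."""
    if not tables:
        raise ValueError("at least one table name is required")
    parts = []
    for name in tables:
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe table name: {name!r}")
        # LIMIT 1 lets the server stop at the first matching row.
        parts.append(
            f"(SELECT '{name}' AS table_name FROM `{name}` "
            f"WHERE a = 1 AND b = 2 LIMIT 1)"
        )
    return "\nUNION\n".join(parts)

query = build_nonempty_tables_query(["table1", "table2"])
```

The resulting string could then be run with pd.read_sql(query, engine), and the table_name column of the result lists the tables that contain a match.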

Not able to insert multiple columns in once using executemany

I have two variables to insert in my table.
user_id - int
marks - float
and I have this data for multiple users like this:
user_ids = (-,-,-,-,-,-,-) **TUPLE**
marks = (-,-,-,-,-,-,-,-) **TUPLE**
I want to insert this data into my table using executemany and I am executing this query in my flask snippet:
con = pymysql.connect(
host=host,
user=user,
password=password,
db=db,
charset=charset,
cursorclass=pymysql.cursors.DictCursor,
port=port,
)
cur = con.cursor()
percs = calcattnonull()
# percs contains list of dictionaries.
# {[<'user_id'>: <'marks'>], [<'user_id'>: <'marks'>]........}
id_ = []
perc_ = []
final = []
for perc in tqdm(percs):
    id_.append(perc["user_id"])
    perc_.append(perc["att_perc"])
id_ = tuple(id_)
perc_ = tuple(perc_)
final.append(id_)
final.append(perc_)
cur.executemany(
"UPDATE dream_offline_calculate SET (user_id,att_percentage) VALUES (?,?)",
final,
)
con.commit()
I am getting this error again and again:
TypeError: not all arguments converted during string formatting
Thanks in advance for helping me.
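executemany takes an iterable of parameter tuples, one per row, each shaped like the parameters you would pass to a single execute call. Passing [id_, perc_] instead hands it two long tuples, each holding one value per user rather than one (user_id, att_perc) pair.
Two other things need fixing in the statement itself: PyMySQL uses %s as its placeholder, not ?, which is what triggers the TypeError, and UPDATE ... SET (...) VALUES (...) is not valid MySQL. Assuming the goal is to insert rows, as the title says, a single insert would be
cur.execute(
    "INSERT INTO dream_offline_calculate (user_id, att_percentage) VALUES (%s, %s)",
    (user_id, att_perc),
)
the equivalent executemany would be
cur.executemany(
    "INSERT INTO dream_offline_calculate (user_id, att_percentage) VALUES (%s, %s)",
    [(user_id, att_perc)],
)
So that said, simply
cur.executemany(
    "INSERT INTO dream_offline_calculate (user_id, att_percentage) VALUES (%s, %s)",
    [(perc["user_id"], perc["att_perc"]) for perc in percs],
)
should do the trick. If existing rows should be updated instead, the statement would be UPDATE dream_offline_calculate SET att_percentage = %s WHERE user_id = %s, with each parameter tuple ordered as (att_perc, user_id).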
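
If the data really does arrive as two parallel tuples, as in the question, zip turns them into the per-row tuples that executemany expects. The sketch below is only an illustration with made-up values; it uses the standard-library sqlite3 module so it runs without a server, which is why the placeholder here is ? (sqlite3's style) rather than PyMySQL's %s.

```python
import sqlite3

# Parallel tuples, as in the question (hypothetical values).
user_ids = (1, 2, 3)
marks = (91.5, 78.0, 64.25)

# One (user_id, mark) tuple per row: [(1, 91.5), (2, 78.0), (3, 64.25)]
rows = list(zip(user_ids, marks))

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE dream_offline_calculate (user_id INTEGER, att_percentage REAL)"
)
con.executemany(
    "INSERT INTO dream_offline_calculate (user_id, att_percentage) VALUES (?, ?)",
    rows,
)
con.commit()
stored = con.execute(
    "SELECT user_id, att_percentage FROM dream_offline_calculate ORDER BY user_id"
).fetchall()
```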

Get table name for field in database result in Python (PostgreSQL)

I'm trying to get table name for field in result set that I got from database (Python, Postgres). There is a function in PHP to get table name for field, I used it and it works so I know it can be done (in PHP). I'm looking for similar function in Python.
pg_field_table() function in PHP gets results and field number and "returns the name of the table that field belongs to". That is exactly what I need, but in Python.
Simple example - create tables, insert rows, select data:
CREATE TABLE table_a (
id INT,
name VARCHAR(10)
);
CREATE TABLE table_b (
id INT,
name VARCHAR(10)
);
INSERT INTO table_a (id, name) VALUES (1, 'hello');
INSERT INTO table_b (id, name) VALUES (1, 'world');
When using psycopg2 or sqlalchemy I get the right data and the right field names, but no information about the table name.
import psycopg2
query = '''
SELECT *
FROM table_a A
LEFT JOIN table_b B
ON A.id = B.id
'''
con = psycopg2.connect('dbname=testdb user=postgres password=postgres')
cur = con.cursor()
cur.execute(query)
data = cur.fetchall()
print('fields', [desc[0] for desc in cur.description])
print('data', data)
The example above prints field names. The output is:
fields ['id', 'name', 'id', 'name']
data [(1, 'hello', 1, 'world')]
I know that there is cursor.description, but it does not contain table name, just the field name.
What I need - some way to retrieve table names for fields in result set when using raw SQL to query data.
EDIT 1: I need to know if "hello" came from "table_a" or "table_b", both fields are named same ("name"). Without information about table name you can't tell in which table the value is.
EDIT 2: I know that there are some workarounds like SQL aliases: SELECT table_a.name AS name1, table_b.name AS name2 but I'm really asking how to retrieve table name from result set.
EDIT 3: I'm looking for solution that allows me to write any raw SQL query, sometimes SELECT *, sometimes SELECT A.id, B.id ... and after executing that query I will get field names and table names for fields in the result set.
It is necessary to query the pg_attribute catalog for the table qualified column names:
query = '''
select
string_agg(format(
'%%1$s.%%2$s as "%%1$s.%%2$s"',
attrelid::regclass, attname
) , ', ')
from pg_attribute
where attrelid = any (%s::regclass[]) and attnum > 0 and not attisdropped
'''
cursor.execute(query, ([t for t in ('a','b')],))
select_list = cursor.fetchone()[0]
query = '''
select {}
from a left join b on a.id = b.id
'''.format(select_list)
print(cursor.mogrify(query).decode())  # mogrify returns bytes on Python 3
cursor.execute(query)
print([desc[0] for desc in cursor.description])
Output:
select a.id as "a.id", a.name as "a.name", b.id as "b.id", b.name as "b.name"
from a left join b on a.id = b.id
['a.id', 'a.name', 'b.id', 'b.name']
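
A different route that leaves the original query untouched may be available on psycopg2 2.8 or later, where each entry of cursor.description is a Column object that also carries a table_oid attribute (None for computed columns). The sketch below assumes that version; the two helper names are hypothetical, lookup_table_names needs a live connection and is not exercised here, and the demo at the end uses fake description entries with invented OIDs purely to show the mapping step.

```python
from collections import namedtuple

def lookup_table_names(connection, description):
    """Map table OIDs found in a cursor description to table names.

    Assumes psycopg2 >= 2.8 (description entries expose table_oid).
    A separate cursor is used so the original cursor is left untouched.
    """
    oids = sorted({col.table_oid for col in description if col.table_oid})
    if not oids:
        return {}
    with connection.cursor() as cur:
        cur.execute(
            "SELECT oid, relname FROM pg_class WHERE oid = ANY(%s::oid[])",
            (oids,),
        )
        return dict(cur.fetchall())

def qualify_columns(description, oid_to_table):
    """Return [(table_name_or_None, column_name), ...] in result order."""
    return [(oid_to_table.get(col.table_oid), col.name) for col in description]

# Stand-in for cursor.description; the OIDs 16384/16390 are invented.
FakeColumn = namedtuple("FakeColumn", "name table_oid")
fake_description = [
    FakeColumn("id", 16384), FakeColumn("name", 16384),
    FakeColumn("id", 16390), FakeColumn("name", 16390),
]
qualified = qualify_columns(fake_description, {16384: "table_a", 16390: "table_b"})
```

In real use the mapping would come from lookup_table_names(con, cur.description) after executing the query, which is roughly what pg_field_table() does in PHP.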
