Python MySQLdb query with WHERE

I use MySQLdb to query some data from a database. When using LIKE in SQL, I'm confused about how to build the statement.
Since I'm using LIKE, I constructed the SQL below, which returns the correct result:
cur.execute("SELECT a FROM table WHERE b like %s limit 0,10", ("%"+"ccc"+"%",))
Now I want to make column b a variable, as below, but it returns nothing:
cur.execute("SELECT a FROM table WHERE %s like %s limit 0,10", ("b", "%"+"ccc"+"%"))
I searched many websites but couldn't find an answer. I am a bit dizzy.

In the db-api, parameters are for values only, not for columns or other parts of the query. You'll need to insert that using normal string substitution.
column = 'b'
query = "SELECT a FROM table WHERE {} like %s limit 0,10".format(column)
cur.execute(query, ("%"+"ccc"+"%",))
You could make this a bit nicer by using format for the parameter too:
cur.execute(query, ("%{}%".format("ccc"),))
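Because the column name ends up in the query text via plain string formatting, it is worth validating it before interpolation. A minimal sketch, assuming a hypothetical whitelist of known column names (not part of the original post):

```python
# Hypothetical whitelist of column names known to exist in the table
ALLOWED_COLUMNS = {"a", "b", "c"}

def build_query(column):
    # Refuse anything not in the whitelist, so untrusted input
    # can never be interpolated into the SQL text
    if column not in ALLOWED_COLUMNS:
        raise ValueError("invalid column name: {}".format(column))
    return "SELECT a FROM table WHERE {} like %s limit 0,10".format(column)
```

The value itself still goes through a DB-API parameter as usual: cur.execute(build_query("b"), ("%ccc%",)).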

The reason that the second query does not work is that the query that results from the substitution in the parameterised query looks like this:
select a from table where 'b' like '%ccc%' limit 0,10
'b' does not refer to a table, but to the static string 'b'. If you instead passed the string abcccba into the query you'd get a query that selects all rows:
cur.execute("SELECT a FROM table WHERE %s like %s limit 0,10", ("abcccba", "%"+"ccc"+"%"))
generates query:
SELECT a FROM table WHERE 'abcccba' like '%ccc%' limit 0,10
From this you should now be able to see why the second query returns an empty result set: the string 'b' does not match '%ccc%', so no rows are returned.
Therefore you cannot set table or column names using parameterised queries; you must use normal Python string substitution:
cur.execute("SELECT a FROM table WHERE {} like %s limit 0,10".format('b'), ("%"+"ccc"+"%",))
which will generate and execute the query:
SELECT a FROM table WHERE b like '%ccc%' limit 0,10

You probably need to rewrite your variable substitution from
cur.execute("SELECT a FROM table WHERE b like %s limit 0,10", ("%"+"ccc"+"%"))
to
cur.execute("SELECT a FROM table WHERE b like %s limit 0,10", ("%"+"ccc"+"%",))
Note the trailing comma: without it, ("%ccc%") is just a parenthesized string, not a tuple. The trailing comma makes it a one-element tuple, which is what execute() expects for its parameters. In this example the string concatenation isn't even necessary; the code is equivalent to:
cur.execute("SELECT a FROM table WHERE b like %s limit 0,10", ("%ccc%",))
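The difference is easy to see in plain Python; parentheses alone don't make a tuple:

```python
no_tuple = ("%ccc%")    # parentheses only group: this is still a str
one_tuple = ("%ccc%",)  # the trailing comma makes a 1-element tuple

print(type(no_tuple).__name__)   # str
print(type(one_tuple).__name__)  # tuple
```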

Related

Best way to inject a variable column retrieval list into an SQL query (via psycopg2 execution)

I have a query as such:
SELECT_DATA = """select *
from schema.table tb
order by tb.created_time
"""
However, instead of selecting for all the columns in this table, I want to retrieve by a specified column list that I supply via psycopg2 injection in Python. The supplied column list string would look like this:
'col1, col2, col3'
Simple enough, except I also need to append the table alias "tb" to the beginning of each column name, so it needs to look like:
'tb.col1, tb.col2, tb.col3'
The resulting query is therefore:
SELECT_DATA = """select tb.col1, tb.col2, tb.col3
from schema.table tb
order by tb.created_time
"""
The most straightforward way I can think of would be to parse the given string into a comma-separated list, append "tb." to the beginning of each column name, then join the list back into a string for injection. But that seems pretty messy and hard to follow, so I was wondering if there might be a better way to handle this?
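For what it's worth, the split/prefix/join approach described above is quite compact in plain Python. A minimal sketch using the column string and alias from the question (only safe if the column list comes from trusted code, since the names are interpolated directly):

```python
commas_sep_str = "col1, col2, col3"

# split on commas, strip stray whitespace, prefix each name with the alias
aliased = ", ".join("tb." + c.strip() for c in commas_sep_str.split(","))

query = "select {} from schema.table tb order by tb.created_time".format(aliased)
print(aliased)  # tb.col1, tb.col2, tb.col3
```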
Consider a list comprehension of sql.Identifier objects after splitting the comma-separated string:
from psycopg2 import sql

commas_sep_str = "col1, col2, col3"
field_identifiers = [sql.Identifier(s.strip()) for s in commas_sep_str.split(',')]
query = (sql.SQL("select {fields} from {schema}.{table}")
         .format(
             fields=sql.SQL(', ').join(field_identifiers),
             schema=sql.Identifier('my_schema'),
             table=sql.Identifier('my_table')
         ))
# sql.Identifier('tb', s.strip()) would render the alias-qualified form "tb"."col1"
# (psycopg2 >= 2.8 accepts multiple strings for a dotted name)

Redshift - Passing columns as a list

I have a set of columns passed into a python list. I am trying to see if I can pass this list as part of the select statement in redshift.
list_name = ['col_a', 'col_b']
Trying to pass this list into the below query:
cur.execute("""select {} from table""".format(list_name))
I get the below message:
ProgrammingError: syntax error at or near "'col_a'"
The above SQL should be equivalent to
select col_a, col_b from table
You can convert a list into a string by using join(), specifying the text to put between entries. For example this:
','.join(['col_a', 'col_b'])
would return:
'col_a,col_b'
Therefore, you can use it when creating the SQL query:
cur.execute("select {} from table".format(','.join(list_name)))
Or using an f-string:
cur.execute(f"select {','.join(list_name)} from table")

Read multiple lists from python into an SQL query

I have 3 lists of user id's and time ranges (different for each user id) for which I would like to extract data. I am querying an AWS redshift database through Python. Normally, with one list, I'd do something like this:
sql_query = "select userid from some_table where userid in {}".format(list_of_users)
where list of users is the list of user id's I want - say (1,2,3...)
This works fine, but now I need to somehow pass it along a triplet of (userid, lower time bound, upper time bound). So for example ((1,'2018-01-01','2018-01-14'),(2,'2018-12-23','2018-12-25'),...
I tried various versions of this basic query
sql_query = "select userid from some_table where userid in {} and date between {} and {}".format(list_of_users, list_of_dates_lower_bound, list_of_dates_upper_bound)
but no matter how I structure the lists in format(), it doesn't work. I am not sure this is even possible this way or if I should just loop over my lists and call the query repeatedly for each triplet?
Suppose the lists of values are the following:
list_of_users = [1, 2]
list_of_dates_lower_bound = ['2018-01-01', '2018-12-23']
list_of_dates_upper_bound = ['2018-01-14', '2018-12-25']
The formatted SQL would be:
select userid from some_table where userid in [1, 2] and date between ['2018-01-01', '2018-12-23'] and ['2018-01-14', '2018-12-25']
This is not what you intended; it is simply invalid SQL, since the operands of between must be scalar values.
I suggest loop over the lists, and pass a single value to the placeholder.
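A minimal sketch of that loop, using zip() to form the triplets and a parameterised query per iteration (cur is assumed to be an open database cursor, as in the question):

```python
list_of_users = [1, 2]
list_of_dates_lower_bound = ['2018-01-01', '2018-12-23']
list_of_dates_upper_bound = ['2018-01-14', '2018-12-25']

# pair each userid with its own date range
triplets = list(zip(list_of_users, list_of_dates_lower_bound, list_of_dates_upper_bound))

sql_query = "select userid from some_table where userid = %s and date between %s and %s"
# for params in triplets:
#     cur.execute(sql_query, params)  # one scalar value per placeholder
```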
You can select within a particular range by using
select col from table where col between range and range;
In your case it may be
select userid from some_table where date_from between yesterday and today;
or even
select userid from some_table where date_from >= yesterday and date_from <= today;

cannot insert None value in postgres using psycopg2

I have a PostgreSQL database with more than 100 columns and rows. Some cells in the table are empty; I am scripting in Python, so None is placed in the empty cells, but the following error is raised when I try to insert into the table:
psycopg2.ProgrammingError: column "none" does not exist
I am using psycopg2 as the Python-Postgres interface. Any suggestions?
Thanks in advance.
Here is my code:
list1 = [None if str(x) == 'nan' else x for x in list1]
cursor.execute("""INSERT INTO table VALUES %s""" % list1)
Do not use % string interpolation, use SQL parameters instead. The database adapter can handle None just fine, it just needs translating to NULL, but only when you use SQL parameters will that happen:
list1 = [(None,) if str(x)=='nan' else (x,) for x in list1]
cursor.executemany("""INSERT INTO table VALUES (%s)""", list1)
I am assuming that you are trying to insert multiple rows here. For that, you should use the cursor.executemany() method and pass in a list of rows to insert; each row is a tuple with one column here.
If list1 is just one value, then use:
param = list1[0]
if str(param) == 'nan':
param = None
cursor.execute("""INSERT INTO table VALUES (%s)""", (param,))
which is a little more explicit and readable.
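As an aside, if the source values are floats, the str(x) == 'nan' check works but is indirect; math.isnan is the more explicit test. A sketch with hypothetical data:

```python
import math

row = [1.5, float('nan'), 'abc', None]

# replace NaN floats with None so the adapter sends NULL to the database
cleaned = [None if isinstance(x, float) and math.isnan(x) else x for x in row]
print(cleaned)  # [1.5, None, 'abc', None]
```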

Can I split INSERT statement into several ones without repeat inserting rows?

I have such an INSERT statement:
mtemp = "station, calendar, type, name, date, time"
query = "INSERT INTO table (%s) VALUES ( '%s', '%s', '%s', %s, '%s', '%s' );"
query = query % (mtemp, mstation, mcalendar, mtype, mname, mdate, mtime)
curs.execute(query, )
conn.commit()
The problem is that I cannot get the variables mcalendar, mdate, and mtime into this statement: they are not constant values, and I have to access each of them within a for loop. The values of mstation, mtype and mname are fixed. I tried to split the INSERT statement into several ones: one for each of the three variables in a for loop, and one for the three fixed values in a single for loop. The for loop basically defines when to insert rows. I have a list rows1 and a list rows2; rows1 is a full list of records while rows2 lacks some of them. I check whether each rows2 record exists in rows1: if it does, I execute the INSERT statement; if not, I do nothing.
I ran the code and found two problems:
It inserts far more rows than it should. It is supposed to insert no more than 240 rows, since there are only 240 time occurrences per day for each sensor. (I wonder if I wrote too many for loops, so it keeps inserting rows.) It is now getting more than 400 new rows.
The newly inserted rows only have values in the fixed-value columns. The three columns I insert with the separate for loop have no values at all.
Hope someone can give me some tips here. Thanks in advance! I can put more code here if needed. I'm not even sure I'm on the right track.
I'm not sure I understand exactly your scenario, but is this the sort of thing you need?
Pseudo code
mstation = "foo"
mtype = "bar"
mname = "baz"
mtemp = "station, calendar, type, name, date, time"
queryTemplate = "INSERT INTO table (%s) VALUES ( '%s', '%s', '%s', %s, '%s', '%s' );"
foreach (mcalendar in calendars)
foreach (mdate in dates)
foreach (mtime in times)
query = queryTemplate % (mtemp, mstation, mcalendar, mtype, mname, mdate, mtime)
curs.execute(query, )
One INSERT statement always corresponds to one new row in a table. (Unless of course there is an error during the insert.) You can INSERT a row, and then UPDATE it later to add/change information but there is no such thing as splitting up an INSERT.
If you have a query which needs to be executed multiple times with changing data, the best option is a prepared statement. A prepared statement "compiles" an SQL query but leaves placeholders that can be set each time it is executed. This improves performance because the statement doesn't need to be parsed each time. You didn't specify what library you're using to connect to Postgres, so I don't know what the syntax would be, but it's something to look into.
If you can't/don't want to use prepared statements, you'll have to just create the query string once for each insert. Don't substitute the values in before the loop, wait until you know them all before creating the query.
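To illustrate the "write the statement once, bind values each time" idea without knowing the asker's driver, here is a sketch using the stdlib sqlite3 module; the table and values are hypothetical, and a Postgres driver would use %s placeholders instead of ?:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
curs = conn.cursor()
curs.execute("CREATE TABLE readings (station TEXT, calendar TEXT, date TEXT, time TEXT)")

# the statement text is fixed; only the bound values change per row
rows = [("station1", cal, date, time)
        for cal in ("cal1",)
        for date in ("2020-01-01", "2020-01-02")
        for time in ("00:00", "00:10")]
curs.executemany("INSERT INTO readings VALUES (?, ?, ?, ?)", rows)
conn.commit()

curs.execute("SELECT COUNT(*) FROM readings")
print(curs.fetchone()[0])  # 4
```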
The following syntax works in SQL Server 2008 but not in SQL Server 2005:
CREATE TABLE Temp (id int, name varchar(10));
INSERT INTO Temp (id, name) VALUES (1, 'Anil'), (2, 'Ankur'), (3, 'Arjun');
SELECT * FROM Temp;
id | name
------------
1 | Anil
2 | Ankur
3 | Arjun
