I have the following function in Python:
def get_emo_results(emo, operator):
    cursor.execute("SELECT avg(?) FROM [LIWC Post Stats] "
                   "WHERE goals_scored {0} goals_taken "
                   "GROUP BY post_number "
                   "ORDER BY post_number ASC "
                   "LIMIT ?".format(operator), [emo, posts_limit])
    print "SELECT avg({1}) FROM [LIWC Post Stats] "\
          "WHERE goals_scored {0} goals_taken "\
          "GROUP BY post_number "\
          "ORDER BY post_number ASC "\
          "LIMIT {2}".format(operator, emo, posts_limit)
    return [x[0] for x in cursor.fetchall()]
I call it with get_emo_results('posemo', '>') and get this output to stdout:
SELECT avg(posemo) FROM [LIWC Post Stats] WHERE goals_scored > goals_taken GROUP BY post_number ORDER BY post_number ASC LIMIT 200
However, the function itself returns
[0.0, 0.0, 0.0, 0.0, 0.0, ... 0.0]
I copy and paste that exact expression in stdout to my SQLite process that I have opened, and I get this:
1.8730701754386
2.48962719298246
2.18607456140351
2.15342105263158
2.33107456140351
2.11631578947368
2.37100877192982
1.95228070175439
2.01013157894737
...
3.37183673469388
So not 0 at all. Why does my Python function return something different despite using the same query? What am I doing wrong?
EDIT:
It now works when I get rid of the question marks and format the string directly. Why don't parameterized queries work in this case?
It is being handled differently because you are parameterizing your query. You can't parameterize a column name like that. To protect you from SQL injection, the driver binds every parameter as a value (I'm simplifying here, but it is as if any string were wrapped in quotes before being passed to the SQL engine).
Essentially, SQLite is trying to average the string literal 'posemo', which it treats as 0.
You can keep your limit parameterized, but when it comes to column names you need to have them hardcoded or else put them in the string with something like format.
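A minimal sketch of that advice, using Python's sqlite3 module with an in-memory database. The whitelists, the extra function arguments, the table layout, and the sample rows are assumptions made for illustration, not part of the original code. The column name and operator are formatted into the string only after being checked against known values, while the limit stays a bound parameter.

```python
import sqlite3

# Hypothetical whitelists: identifiers and operators cannot be bound with "?",
# so they are checked against known-good values before being formatted in.
ALLOWED_COLUMNS = {"posemo", "negemo"}
ALLOWED_OPERATORS = {"<", "<=", "=", ">=", ">"}


def get_emo_results(cursor, emo, operator, posts_limit):
    if emo not in ALLOWED_COLUMNS:
        raise ValueError("unexpected column: %r" % (emo,))
    if operator not in ALLOWED_OPERATORS:
        raise ValueError("unexpected operator: %r" % (operator,))
    query = (
        "SELECT avg({col}) FROM [LIWC Post Stats] "
        "WHERE goals_scored {op} goals_taken "
        "GROUP BY post_number "
        "ORDER BY post_number ASC "
        "LIMIT ?"
    ).format(col=emo, op=operator)
    cursor.execute(query, [posts_limit])  # only the value is bound
    return [row[0] for row in cursor.fetchall()]


# Made-up sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute(
    "CREATE TABLE [LIWC Post Stats] "
    "(post_number INTEGER, posemo REAL, negemo REAL, "
    "goals_scored INTEGER, goals_taken INTEGER)"
)
cursor.executemany(
    "INSERT INTO [LIWC Post Stats] VALUES (?, ?, ?, ?, ?)",
    [(1, 2.0, 0.5, 3, 1), (1, 4.0, 0.5, 2, 0),
     (2, 1.0, 0.5, 1, 0), (2, 9.0, 0.5, 0, 5)],
)

# Binding the column name averages the text 'posemo', not the column.
cursor.execute(
    "SELECT avg(?) FROM [LIWC Post Stats] "
    "GROUP BY post_number ORDER BY post_number",
    ["posemo"],
)
bound_as_value = [row[0] for row in cursor.fetchall()]

formatted_in = get_emo_results(cursor, "posemo", ">", 200)
print(bound_as_value)
print(formatted_in)
```

The whitelist check is what keeps the formatted version from reopening the injection hole that parameters normally close.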
Related
I am trying to go through a list of gene names and query them in my SQL database like this:
list_of_genes = ["IFNL", "TMPT", "G6PD", "UGT1A1", ...]
for gene in list_of_genes:
    sql.execute('''SELECT DISTINCT gene_symbol, haplo_function FROM Haplotypes
                   WHERE gene_symbol LIKE "%" + ? + "%"
                ''', (gene,))
What I want to accomplish with this is to get all records from my Haplotypes table where the gene_symbol is similar to my gene from list_of_genes.
A gene from the list could be called IFNL and a gene_symbol in the database could be like IFNL*1 or something similar.
This query gives 0 results, so how can I add wildcards to a SELECT statement together with a placeholder?
If I query it like the following, I do get a lot of results, but not all of them, since some gene_symbols carry added information besides the gene name.
for gene in list_of_genes:
    sql.execute('''SELECT DISTINCT gene_symbol, haplo_function FROM Haplotypes
                   WHERE gene_symbol LIKE ?
                ''', (gene,))
I am sorry if I'm asking simple or stupid questions, but I have tried to search for it, but could not find anything combining both <%> wildcards and ? placeholders/parameters.
If I got your question right, you can try something like this:
cur.execute('SELECT * FROM t WHERE t.param LIKE ?', ("%" + val + "%",))
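As a rough illustration with sqlite3, the wildcards can either be added to the bound value in Python or concatenated in SQL with the || operator (in SQLite, + is arithmetic, not string concatenation). The table contents and both function names below are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Haplotypes (gene_symbol TEXT, haplo_function TEXT)")
# Hypothetical sample rows, only to make the example runnable.
cur.executemany(
    "INSERT INTO Haplotypes VALUES (?, ?)",
    [("IFNL*1", "normal"), ("IFNL3", "reduced"),
     ("G6PD A-", "deficient"), ("UGT1A1*28", "reduced")],
)


def find_by_gene(cur, gene):
    """Wildcards are part of the bound value."""
    cur.execute(
        "SELECT DISTINCT gene_symbol, haplo_function FROM Haplotypes "
        "WHERE gene_symbol LIKE ? ORDER BY gene_symbol",
        ("%" + gene + "%",),
    )
    return cur.fetchall()


def find_by_gene_sql_concat(cur, gene):
    """Wildcards are concatenated in SQL with ||; the value is still bound."""
    cur.execute(
        "SELECT DISTINCT gene_symbol, haplo_function FROM Haplotypes "
        "WHERE gene_symbol LIKE '%' || ? || '%' ORDER BY gene_symbol",
        (gene,),
    )
    return cur.fetchall()


ifnl_rows = find_by_gene(cur, "IFNL")
ifnl_rows_concat = find_by_gene_sql_concat(cur, "IFNL")
print(ifnl_rows)
```

Both variants keep the gene name as a bound parameter, so nothing from list_of_genes is ever spliced into the SQL text.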
I apologize if this is redundant but I can't seem to find the answer.
I've supplied all the values, but I still get an error saying that I did not supply a value for binding 6. This is my code:
def Update_Employee(id,name,phoneno,address,nic,
                    joindate,email,picture,role,status,
                    salary,username,password):
    with conn:
        c.execute("UPDATE Employee "
                  "SET emp_Name=:emp_name,"
                  "emp_PhoneNo=:emp_phoneno,"
                  "emp_Address=:emp_address,"
                  "emp_NIC=:emp_nic,"
                  "emp_JoinDate=:emp_joindate,"
                  "emp_Email=:emp_email"
                  "emp_Picture=:emp_picture,"
                  "emp_role=:emp_role,"
                  "emp_status=:emp_Status,"
                  "emp_salary=:emp_Salary,"
                  "emp_Username=:emp_username,"
                  "emp_Password=:emp_password "
                  "WHERE emp_ID=:emp_id",
                  {'emp_id':id,
                   'emp_name':name,
                   'emp_phoneno':phoneno,
                   'emp_address':address,
                   'emp_nic':nic,
                   'emp_joindate':joindate,
                   'emp_email':email,
                   'emp_picture':picture,
                   'emp_role':role,
                   'emp_Status':status,
                   'emp_Salary':salary,
                   'emp_username':username,
                   'emp_password':password})
I've double-checked the attributes in my database. The names and spellings are 100% correct, and all the values have been supplied.
I have a SQL statement that works in mysql:
SELECT * FROM `ps_message` WHERE `id_order` = 111 ORDER BY id_message asc LIMIT 1
What is wrong with the following statement in Python:
cursor2.execute("SELECT * FROM ps_message WHERE id_order='%s'" % order["id_order"] " ORDER BY id_message asc LIMIT 1")
How should the syntax be in Python to work?
You have a syntax error in string formatting. Should be:
cursor2.execute("SELECT * FROM ps_message WHERE id_order='%s' ORDER BY id_message asc LIMIT 1" % order["id_order"])
Using format() is also preferable over old-style string formatting.
Pass the order number as a query parameter.
e.g.
cursor2.execute("SELECT * FROM ps_message WHERE id_order=%s ORDER BY id_message asc LIMIT 1", [ order["id_order"] ])
Note that when using query parameters you don't put quotes around the %s.
This approach is recommended to avoid the risk of SQL injection attacks.
It should also be more efficient if there are many queries.
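To illustrate the difference, here is a small sketch that uses sqlite3 as a stand-in so it can run without a MySQL server. Note that sqlite3 uses ? as its placeholder where MySQL drivers use %s. The table contents, the first_message helper, and the hostile input are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE ps_message "
    "(id_message INTEGER PRIMARY KEY, id_order INTEGER, message TEXT)"
)
cur.executemany(
    "INSERT INTO ps_message (id_order, message) VALUES (?, ?)",
    [(111, "first"), (111, "second"), (222, "other")],
)


def first_message(cur, id_order):
    """Oldest message for an order; the order id is passed as a parameter."""
    cur.execute(
        "SELECT id_message, id_order, message FROM ps_message "
        "WHERE id_order = ? ORDER BY id_message ASC LIMIT 1",
        [id_order],
    )
    return cur.fetchone()


oldest = first_message(cur, 111)

# A hostile "order id" changes the query when it is formatted into the string...
hostile = "111 OR 1=1"
cur.execute("SELECT count(*) FROM ps_message WHERE id_order = %s" % hostile)
matched_when_formatted = cur.fetchone()[0]

# ...but is compared as a plain value when it is bound as a parameter.
cur.execute("SELECT count(*) FROM ps_message WHERE id_order = ?", [hostile])
matched_when_bound = cur.fetchone()[0]

print(oldest, matched_when_formatted, matched_when_bound)
```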
https://docs.python.org/2/library/sqlite3.html
http://pymssql.org/en/stable/pymssql_examples.html
I am new to Python. What I am trying to achieve is to insert values from my list/tuple into my Redshift table without iteration. I have around 1 million rows and 1 column. Below is the code I am using to create my list/tuple.
cursor1.execute("select domain from url limit 5;")
for record, in cursor1:
    ext = tldextract.extract(record)
    mylist.append(ext.domain + '.' + ext.suffix)
mytuple = tuple(mylist)
I am not sure which is better to use, a tuple or a list. The output of print(mylist) and print(mytuple) is as follows.
List output
['friv.com', 'steep.tv', 'wordpress.com', 'fineartblogger.net', 'v56.org']
Tuple output
('friv.com', 'steep.tv', 'wordpress.com', 'fineartblogger.net', 'v56.org')
Now, below is the code I am using to insert the values into my redshift table but I am getting an error:
cursor2.execute("INSERT INTO sample(domain) VALUES (%s)", mylist)
or
cursor2.execute("INSERT INTO sample(domain) VALUES (%s)", mytuple)
Error: not all arguments converted during string formatting
Any help is appreciated. If any other detail is required please let me know, I will edit my question.
UPDATE 1:
I tried the code below and got a different error.
args_str = ','.join(cur.mogrify("(%s)", x) for x in mylist)
cur.execute("INSERT INTO table VALUES " + args_str)
ERROR - INSERT has more expressions than target columns
I think you're looking for the fast execution helpers in psycopg2.extras:
from psycopg2.extras import execute_values
mylist = [('t1',), ('t2',)]
execute_values(cursor2, "INSERT INTO sample(domain) VALUES %s", mylist, page_size=100)
This replaces the %s with up to 100 rows of values per statement. I'm not sure how high you can set page_size, but it should be far more performant than inserting one row at a time.
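A short sketch of how the flat list from the question could be fed to execute_values, assuming the driver is psycopg2. The original error arises because a five-element list was passed for a single %s placeholder; execute_values expects one tuple per row instead. The helper names to_rows and insert_domains and the default page size are hypothetical, and the insert itself is not executed here because it needs a live connection.

```python
def to_rows(values):
    """Turn a flat list of strings into one single-column tuple per row."""
    return [(value,) for value in values]


def insert_domains(cursor, domains, page_size=1000):
    """Insert all domains in batches of page_size rows per statement.

    Assumes `cursor` is a psycopg2 cursor; the import is local so that the
    rest of this sketch can be loaded without psycopg2 installed.
    """
    from psycopg2.extras import execute_values

    execute_values(
        cursor,
        "INSERT INTO sample(domain) VALUES %s",
        to_rows(domains),
        page_size=page_size,
    )


rows = to_rows(["friv.com", "steep.tv", "wordpress.com"])
print(rows)
```

With a real connection, the call would look like insert_domains(cursor2, mylist), followed by a commit on the connection.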
Finally found a solution. For some reason cur.mogrify was not giving me a proper SQL string for the insert. I built my own SQL string, and it works a lot faster than cur.executemany():
# Build "('a'), ('b'), ..." by hand. The values are not escaped,
# so this is only safe for trusted data.
sql = ', '.join("('" + domain + "')" for domain in mylist)
cursor1.execute("INSERT INTO sample(domain) VALUES " + sql)
Thanks for your help guys!
Let's suppose I have the following table :
Id (int, Primary Key) | Value (varchar)
----------------------+----------------
1 | toto
2 | foo
3 | bar
Given two queries, I would like to know, without executing them, whether the result of the first is necessarily contained in the result of the second.
Some examples:
# Obvious example
query_1 = "SELECT * FROM example;"
query_2 = "SELECT * FROM example WHERE id = 1;"
is_sub_part_of(query_2, query_1) # True
# An example we can't know before executing the two requests
query_1 = "SELECT * FROM example WHERE id < 2;"
query_2 = "SELECT * FROM example WHERE value = 'toto' or value = 'foo';"
is_sub_part_of(query_2, query_1) # False
# An example we can know before executing the two requests
query_1 = "SELECT * FROM example WHERE id < 2 OR value = 'bar';"
query_2 = "SELECT * FROM example WHERE id < 2 AND value = 'bar';"
is_sub_part_of(query_2, query_1) # True
# An example about columns
query_1 = "SELECT * FROM example;"
query_2 = "SELECT id FROM example;"
is_sub_part_of(query_2, query_1) # True
Do you know if there's a Python module that can do this, or whether it's even possible?
Interesting problem. I don't know of any library that will do this for you. My thoughts:
Parse the SQL, see this for example.
Define which filtering operations can be added to a query that can only result in the same or a narrower result set. "AND x" can always be added, I think, without losing the property of being a subset. "OR x" can not. Anything else you can do to the query? For example "SELECT *", vs "SELECT x", vs "SELECT x, y".
Except for that, I can only say it's an interesting idea. You might get some more input on DBA. Is this an idea you're researching or is it related to a real-world problem you are solving, like optimizing a DB query? Maybe your question could be updated with information about this, since this is not a common way to optimize queries (unless you're working on the DB engine itself, I guess).
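The rules sketched above ("AND x" only narrows, "OR x" only widens, fewer columns is a subset of more columns) can be tried out on very simple queries. The following is a rough, conservative sketch, not a real SQL parser: it assumes a single table, no parentheses, no BETWEEN, no GROUP BY/ORDER BY/LIMIT, and no string literals containing the words AND or OR. Conditions are compared as exact text, and False means "could not be proven", not "proven false". The helper _parse and both regular expressions are inventions for this example; only the name is_sub_part_of comes from the question.

```python
import re

_SELECT = re.compile(
    r"\s*SELECT\s+(?P<cols>.+?)\s+FROM\s+(?P<table>\w+)"
    r"(?:\s+WHERE\s+(?P<where>.+?))?\s*;?\s*",
    re.IGNORECASE | re.DOTALL,
)
# Constructs this sketch does not understand; their presence means "give up".
_UNSUPPORTED = re.compile(
    r"[()]|\b(?:BETWEEN|GROUP|ORDER|LIMIT|HAVING|JOIN|UNION)\b", re.IGNORECASE
)


def _parse(query):
    """Return (columns, table, disjuncts) or None if the query is too complex.

    The WHERE clause becomes a list of OR-ed alternatives, each a frozenset
    of AND-ed conditions (AND binds tighter than OR when there are no
    parentheses). A missing WHERE clause is one alternative with no conditions.
    """
    match = _SELECT.fullmatch(query)
    if match is None:
        return None
    where = match.group("where")
    if where is None:
        disjuncts = [frozenset()]
    elif _UNSUPPORTED.search(where):
        return None
    else:
        disjuncts = [
            frozenset(
                cond.strip()
                for cond in re.split(r"\s+AND\s+", part, flags=re.IGNORECASE)
            )
            for part in re.split(r"\s+OR\s+", where, flags=re.IGNORECASE)
        ]
    cols = frozenset(c.strip().lower() for c in match.group("cols").split(","))
    return cols, match.group("table").lower(), disjuncts


def is_sub_part_of(sub_query, super_query):
    """True if sub_query's result is provably contained in super_query's."""
    sub, sup = _parse(sub_query), _parse(super_query)
    if sub is None or sup is None:
        return False
    sub_cols, sub_table, sub_disjuncts = sub
    sup_cols, sup_table, sup_disjuncts = sup
    if sub_table != sup_table:
        return False
    if "*" not in sup_cols and not sub_cols <= sup_cols:
        return False
    # Every alternative of the sub-query must be at least as restrictive as
    # some alternative of the super-query (i.e. contain all its conditions).
    return all(
        any(sup_alt <= sub_alt for sup_alt in sup_disjuncts)
        for sub_alt in sub_disjuncts
    )


print(is_sub_part_of("SELECT * FROM example WHERE id = 1;",
                     "SELECT * FROM example;"))
```

Anything beyond this (joins, nested queries, reasoning that "id = 1" implies "id < 2") needs a proper parser and some form of predicate implication, which this sketch deliberately leaves out.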