I have a few queries in my python script which also use variables as parameters to retrieve data.
'''SELECT * FROM order_projections_daily WHERE days >= ''' + each + ''' AND days < ''' + next_monday
How can I store queries like this in a separate file and call it directly from there rather than cramming them in the code?
I have tried storing the queries in a file and loading them as strings, but that doesn't work when the query uses variables. It works with:
'''SELECT * FROM order_projections_daily'''
This is a very simple query, I am using much more complicated queries in the real case.
Use parameterised strings:
'''SELECT * FROM order_projections_daily WHERE days >= %(start)s AND days < %(end)s'''
Later, when executing the query, build a params dict like this:
params = {'start': ..., 'end': ...}
These params should then be passed to the database driver's execute function, which takes care of inserting the parameters:
cursor.execute(query, params)
Note: Do not format values directly into your query string; that leaves you open to SQL injection.
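To tie this back to keeping the queries out of the script, here is a minimal sketch. The file name order_projections.sql and the sample table contents are made up for illustration, and the standard-library sqlite3 driver stands in for your real one. Note that sqlite3 spells named placeholders as :start rather than %(start)s, so check which style your own driver expects.

```python
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical query file; in a real project it would sit next to the script.
query_dir = Path(tempfile.mkdtemp())
query_file = query_dir / "order_projections.sql"
query_file.write_text(
    "SELECT * FROM order_projections_daily "
    "WHERE days >= :start AND days < :end"
)

# Small in-memory table so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_projections_daily (days TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO order_projections_daily VALUES (?, ?)",
    [("2024-01-01", 5), ("2024-01-08", 7), ("2024-01-15", 9)],
)

# Load the SQL text from the file and let the driver bind the values.
query = query_file.read_text()
params = {"start": "2024-01-01", "end": "2024-01-15"}
rows = conn.execute(query, params).fetchall()
conn.close()
```

The file holds only SQL text with placeholders, so the same file can be reused with different params dicts.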
Use python string formatting.
In your separate file, save the query like this, for example:
query = "select * from my_persons where name = {name} and age = {age}"
In your Python file, format the query this way:
sql = query.format(name="jack", age=27)
You can save all your queries in a separate Python file as string values and import them into your code anywhere. In my example I am assuming the query is saved in a separate Python file.
Formatting your query
query = '''SELECT * FROM order_projections_daily WHERE days >= {each} AND days < {next_monday}'''
Format it as:
sql = query.format(each=each, next_monday=next_monday)
It's good practice to use the format method, %, or even join rather than string concatenation, because concatenation creates intermediate string objects.
OK, so formatting is a bad idea then. Have a look at name binding in Oracle: https://stackoverflow.com/a/33882805/2196290
Related
I am new to working with Python. I'm not able to understand how I can send the correct input to the query.
list_of_names = []
for country in country_name_list.keys():
    list_of_names.append(getValueMethod(country))
sql_query = f"""SELECT * FROM table1
where name in (%s);"""
db_results = engine.execute(sql_query, list_of_names).fetchone()
This gives the error "not all arguments converted during string formatting".
As implied by John Gordon's comment, the number of placeholders in the SQL statement should match the number of elements in the list. However, SQLAlchemy 2.0+ no longer accepts raw SQL strings passed directly to execute(). A future-proof version of the code would be:
import sqlalchemy as sa
...
# SQL statements should be wrapped with text(), and should use
# the "named" parameter style. An "expanding" bind parameter asks
# SQLAlchemy to render one placeholder per list element for IN.
sql_query = sa.text("""SELECT * FROM table1 where name in :names""").bindparams(
    sa.bindparam("names", expanding=True)
)
# Values should be dictionaries or lists of dictionaries.
values = {'names': list_of_names}
# Execute statements using a context manager.
with engine.connect() as conn:
    db_results = conn.execute(sql_query, values).fetchone()
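The point about matching the number of placeholders to the number of list elements also applies to a plain DB-API driver without SQLAlchemy. A minimal sketch, assuming the standard-library sqlite3 driver (which uses ? placeholders) and invented sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (name TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?)", [("Ann",), ("Bob",), ("Cy",)])

list_of_names = ["Ann", "Cy"]

# One "?" per value, so the placeholder count always matches the list length.
placeholders = ", ".join("?" for _ in list_of_names)
sql_query = f"SELECT name FROM table1 WHERE name IN ({placeholders}) ORDER BY name"

# Only the placeholder markers are formatted into the SQL; the values are bound.
db_results = conn.execute(sql_query, list_of_names).fetchall()
conn.close()
```

Only the ? markers are built with string formatting here, never the values themselves. An empty list would produce IN (), which many databases reject, so that case likely needs separate handling.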
If I understand correctly, there is a simpler solution. If you use curly braces {} instead of parentheses (), and put inside the braces a variable that holds the value meant for %s, it should work. I don't know how SQL works, but you should use one quotation mark on each side, not three.
Sorry, English is not my first language, so I may not have understood the question correctly and this may not help.
I am wanting to run a Presto SQL query in a for loop so that the query will pull hourly data based on my date variables.
Example query is along the lines of:
x = datetime.strptime('12-10-22', '%d-%m-%y').date()
y = datetime.strptime('13-10-22', '%d-%m-%y').date()
for dt in rrule.rrule(rrule.HOURLY, dtstart=nextProcStart, until=nextProcEnd):
    sql_query = "SELECT SUM(sales) FROM a WHERE date between x and y"
I will note I'm using the syntax of writing the SQL query as a variable so along the lines of:
sql_query = """ SELECT... FROM..."""
I have tried just adding the variables into the query but no luck. Unsure what steps will work.
I've also tried using .format(x,y) at the end of my SQL query but keep getting an error saying
KeyError: 'x'
Remember that your SQL statement is no more than a string, so you just need to know how to incorporate a variable into a string.
Try:
sql_query = "SELECT SUM(sales) FROM a WHERE date between DATE '{}' and DATE '{}'".format(x, y)
Read "How do I put a variable's value inside a string (interpolate it into the string)?" for more info or alternative methods.
Hopefully this answers your immediate question on how to incorporate a variable into a string and gets your code, as is, to work. However, as @nbk mentions in a comment below, this method is NOT recommended because it is insecure.
Using concatenations in SQL statements like this does open the code up to injection attacks. Even if your database does not contain sensitive information, it is bad practice.
Prepared statements have many advantages, not least of all that they are more secure and more efficient. I would certainly invest some time in researching and understanding SQL prepared statements.
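As a rough illustration of the parameterized approach for the hourly loop, here is a sketch that uses the standard-library sqlite3 driver as a stand-in for Presto and a plain timedelta loop instead of rrule. The ts column and the sample rows are assumptions made only so the example runs on its own; a Presto client may use a different placeholder style.

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (ts TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO a VALUES (?, ?)",
    [
        ("2022-10-12 00:15:00", 10),
        ("2022-10-12 00:45:00", 5),
        ("2022-10-12 01:30:00", 7),
    ],
)

# The SQL text never changes; only the bound values differ per iteration.
sql_query = "SELECT COALESCE(SUM(sales), 0) FROM a WHERE ts >= ? AND ts < ?"

start = datetime(2022, 10, 12, 0)
end = datetime(2022, 10, 12, 3)

hourly_totals = {}
hour = start
while hour < end:
    next_hour = hour + timedelta(hours=1)
    # isoformat(sep=" ") matches the text timestamps stored above.
    bounds = (hour.isoformat(sep=" "), next_hour.isoformat(sep=" "))
    hourly_totals[hour] = conn.execute(sql_query, bounds).fetchone()[0]
    hour = next_hour
conn.close()
```

Using a half-open range (>= start, < next hour) avoids counting a row on the boundary twice, which BETWEEN would do.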
I have a function that executes many SQL queries with different dates.
What I want is to pass all dates and other query variables as function parameters and then just execute the function. I have figured out how to do this for datetime variables as below. But I also have a query that looks at specific campaign_names in a database and pulls those as strings. I want to be able to pass those strings as function parameters but I haven't figured out the correct syntax for this in the SQL query.
def Camp_eval(start_date,end_1M,camp1,camp2,camp3):
    query1 = f"""SELECT CONTACT_NUMBER, OUTCOME_DATE
    FROM DATABASE1
    where OUTCOME_DATE >= (to_date('{start_date}', 'dd/mm/yyyy'))
    and OUTCOME_DATE < (to_date('{end_1M}', 'dd/mm/yyyy'))"""
    query2 = """SELECT CONTACT_NUMBER
    FROM DATABASE2
    WHERE (CAMP_NAME = {camp1} or
    CAMP_NAME = {camp2} or
    CAMP_NAME = {camp3})"""
Camp_eval('01/04/2022','01/05/2022','Camp_2022_04','Camp_2022_05','Camp_2022_06')
The parameters start_date and end_1M work fine with the {} brackets, but the camp variables, which are strings, don't return any results, even though the database has matching rows when I write the values directly in the query.
Any help would be appreciated!!
Please, do not use f-strings for creating SQL queries!
Most likely, any library you use for accessing a database already has a way of creating queries: SQLite docs (check code examples).
Another example: cur.execute("SELECT * FROM tasks WHERE priority = ?", (priority,)).
Not only is this way safer (it fixes the SQL injection problem mentioned by @d-malan in the comments), but it also eliminates the need to care about how data is represented in SQL: the library will automatically convert dates, strings, etc. into whatever form they need. Therefore, your problem can be fixed by using the proper tools.
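Applied to the campaign-name query, that could look roughly like the sketch below. It uses the standard-library sqlite3 driver and invented rows so it runs standalone, and camp_eval here is a simplified, hypothetical variant of the asker's function. The to_date calls in the question suggest Oracle, whose Python drivers use :name style placeholders instead of ?, so treat the placeholder syntax as an assumption to adapt.

```python
import sqlite3


def camp_eval(conn, camp1, camp2, camp3):
    # The driver quotes and escapes each string; no quotes go in the SQL text.
    query2 = """SELECT CONTACT_NUMBER
                FROM DATABASE2
                WHERE CAMP_NAME = ? OR CAMP_NAME = ? OR CAMP_NAME = ?
                ORDER BY CONTACT_NUMBER"""
    return conn.execute(query2, (camp1, camp2, camp3)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DATABASE2 (CONTACT_NUMBER TEXT, CAMP_NAME TEXT)")
conn.executemany(
    "INSERT INTO DATABASE2 VALUES (?, ?)",
    [("0001", "Camp_2022_04"), ("0002", "Camp_2022_07"), ("0003", "Camp_2022_06")],
)

results = camp_eval(conn, "Camp_2022_04", "Camp_2022_05", "Camp_2022_06")
conn.close()
```

In the original function, query2 is not an f-string, so the text {camp1} is sent to the database literally. Even with an f prefix, the names would be inserted without quotes and read as identifiers rather than string values.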
I am aware that queries in Python can be parameterized using either ? or %s placeholders in the execute call.
However I have some long query that would use some constant variable defined at the beginning of the query
Set @my_const = 'xyz';
select @my_const;
-- Query that uses @my_const 40 times
select ... coalesce(field1, @my_const), case(.. then @my_const)...
I would like to modify the MySQL query as little as possible. So instead of changing the query to
pd.read_sql("select ... coalesce(field1, %s), case(.. then %s)...", con=db, params=[my_const, my_const, my_const, ...])
I could write something along the lines of the initial query. Upon trying the following, however, I am getting a TypeError: 'NoneType' object is not iterable
query_str = "Set @null_val = \'\'; "\
" select @null_val"
erpur_df = pd.read_sql(query_str, con = db)
Any idea how to use the original variable defined in the MySQL query?
The reason
query_str = "Set @null_val = \'\'; "\
" select @null_val"
erpur_df = pd.read_sql(query_str, con = db)
throws that exception is that all you are doing is setting @null_val to '' and then selecting that ''. What exactly would you have expected that to give you? Edit: read_sql only seems to execute one query at a time, and as the first query returns no rows, it results in that exception.
If you split them into two calls to read_sql, then the second call will in fact return the value of @null_val. Because of this behaviour, read_sql is clearly not a good way to do this. I strongly suggest you use one of my suggestions below.
Why do you want to set the variable in the SQL using '@' anyway?
You could try using the .format style of string formatting.
Like so:
query_str = "select ... coalesce(field1, {c}), case(.. then {c})...".format(c=my_const)
pd.read_sql(query_str, con=db)
Just remember that if you do it this way and your my_const is a user input then you will need to sanitize it manually to prevent SQL injection.
Another possibility is using a dict of params like so:
query_str = "select ... coalesce(field1, %(my_const)s), case(.. then %(my_const)s)..."
pd.read_sql(query_str, con=db, params={'my_const': const_value})
However this is dependent on which database driver you use.
From the pandas.read_sql docs:
Check your database driver documentation for which of the five syntax
styles, described in PEP 249's paramstyle, is supported. Eg. for
psycopg2, uses %(name)s so use params={'name' : 'value'}
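To show why named parameters fit the "constant used 40 times" case, here is a small sketch using the standard-library sqlite3 driver directly rather than pandas with MySQL. The table t and its rows are invented, and sqlite3 writes named placeholders as :my_const where a MySQL driver would typically use %(my_const)s.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (field1 TEXT, field2 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("a", None), (None, "b"), (None, None)],
)

# :my_const appears three times in the SQL but is supplied only once.
query_str = """
    SELECT COALESCE(field1, :my_const),
           COALESCE(field2, :my_const),
           CASE WHEN field1 IS NULL AND field2 IS NULL
                THEN :my_const ELSE 'ok' END
    FROM t
    ORDER BY rowid
"""
rows = conn.execute(query_str, {"my_const": "xyz"}).fetchall()
conn.close()
```

The dict plays the role of the SQL-side SET statement: the value is defined once, and the query text only has to swap each variable reference for the placeholder.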
I would not call myself a newbie, but I am not terribly conversant with programming. Any help would be appreciated. I have this project that is almost done. Figured out lots of stuff, but this issue has me at a loss.
Is there a simple way to insert an acceptable date value in a postgresql query from:
start_date = raw_input('Start date: ')
end_date = raw_input('End date: ')
I want the variables above to work in the following.
WHERE (gltx.post_date > start_date AND gltx.post_date < end_date )
A literal date in 'YYYY-MM-DD' format works in the SELECT query when I run it from Python as a triple-quoted string passed to cursor.execute.
The PostgreSQL column (post_date) is of type date.
Here is the header for the Python script.
#!/usr/bin/python
import psycopg2 as dbapi2
import psycopg2.extras
import sys
import csv
For now I have been altering the query for different periods of time.
Also, is there an easy way to format the date returned as YYYYMMDD? Perhaps a filter that replaces dashes or hyphens with nothing. I could use that for phone numbers too.
If you are going to execute this SELECT inside a Python script, you should not be placing strings straight into your database query - else you run the risk of SQL injections. See the psycopg2 docs - the problem with query parameters.
Instead you need to use placeholders and place all your string arguments into an iterable (usually a tuple) which is passed as the second argument to cursor.execute(). Again, see the docs on passing parameters to SQL queries.
So you would create a cursor object and call the execute() method, passing the query string as the first argument and a tuple containing the two dates as the second. For example:
query = "SELECT to_char(gltx.post_date, 'YYYYMMDD') FROM gltx WHERE (gltx.post_date > %s AND gltx.post_date < %s)"
args = (start_date, end_date)
cursor.execute(query, args)
To format the date in Python space, you can use the strftime() method on a date object. You should probably be working with datetime objects not strings anyway, if you want to do anything more than print the output.
You also probably want to validate that the date entered into the raw_input() is a valid date too.
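A minimal sketch of the validation and formatting steps mentioned above, using only the standard library. The helper names parse_date, compact and digits_only are made up for this example, and the hard-coded strings stand in for what raw_input() would return.

```python
import re
from datetime import datetime


def parse_date(text):
    """Return a date for 'YYYY-MM-DD' input, or raise ValueError."""
    return datetime.strptime(text.strip(), "%Y-%m-%d").date()


def compact(d):
    """Format a date object as YYYYMMDD."""
    return d.strftime("%Y%m%d")


def digits_only(text):
    """Drop everything except digits, e.g. for phone numbers."""
    return re.sub(r"\D", "", text)


# Stand-ins for the values typed at the prompts.
start_date = parse_date("2014-02-17")
end_date = parse_date("2014-03-04")
if start_date >= end_date:
    raise ValueError("start date must be before end date")

args = (start_date, end_date)  # ready to pass as the second argument to execute()
```

An impossible date such as 2014-02-30 raises ValueError in parse_date, so bad input is caught before any query runs.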
Use the cursor.execute method's parameter substitution:
import psycopg2
query = """
select to_char(gltx.post_date, 'YYYYMMDD') as post_date
from gltx
where gltx.post_date > %s AND gltx.post_date < %s
;"""
start_date = '2014-02-17'
end_date = '2014-03-04'
conn = psycopg2.connect("dbname=cpn")
cur = conn.cursor()
cur.execute(query, (start_date, end_date))
rs = cur.fetchall()
conn.close()
print(rs)