I am trying to use Pandas and SQLAlchemy to run a query on a MySQL instance. In the actual query, there is a WHERE clause referencing a specific date. I'd like to run this query separately for each date in a Python list and append each date's dataframe to a master dataframe. My code right now looks like this (excluding SQLAlchemy engine creation):
dates = ['2016-01-12','2016-01-13','2016-01-14']
for day in dates:
    query="""SELECT * from schema.table WHERE date = '%s' """
    df = pd.read_sql_query(query,engine)
    frame.append(df)
My error is
/opt/rh/python27/root/usr/lib64/python2.7/site-packages/MySQLdb/cursors.pyc in execute(self, query, args)
157 query = query.encode(charset)
158 if args is not None:
--> 159 query = query % db.literal(args)
160 try:
161 r = self._query(query)
TypeError: not enough arguments for format string
I'm wondering what the best way to insert the string from the list into my query string is?
Use params to parameterize your query:
dates = ['2016-01-12', '2016-01-13', '2016-01-14']
query = """SELECT * from schema.table WHERE date = %s"""
for day in dates:
    df = pd.read_sql_query(query, engine, params=(day, ))
    frame.append(df)
Note that I've removed the quotes around the %s placeholder - the data type conversions would be handled by the database driver itself. It would put quotes implicitly if needed.
And, you can define the query before the loop once - no need to do it inside.
I also think that you may need to have a list of date or datetime objects instead of strings:
from datetime import date
dates = [
    date(year=2016, month=1, day=12),
    date(year=2016, month=1, day=13),
    date(year=2016, month=1, day=14),
]
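The loop above can be sketched end to end as follows. This is only an illustration under stated assumptions: it uses the standard-library sqlite3 module and a made-up in-memory table named events as a stand-in for the MySQL instance, so the placeholder is ? rather than MySQLdb's %s, and it collects plain rows instead of dataframes.

```python
import sqlite3

# Stand-in database (assumption): an in-memory SQLite table named "events".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (date TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("2016-01-12", 1), ("2016-01-13", 2), ("2016-01-13", 3), ("2016-01-15", 4)],
)

dates = ["2016-01-12", "2016-01-13", "2016-01-14"]

# The query is defined once, with an unquoted placeholder.
# sqlite3 uses "?" where MySQLdb uses "%s".
query = "SELECT date, value FROM events WHERE date = ? ORDER BY value"

master = []  # accumulates the rows found for every requested date
for day in dates:
    # The driver binds and quotes the value; no string formatting involved.
    rows = conn.execute(query, (day,)).fetchall()
    master.extend(rows)

print(master)
```

With pandas, the same pattern would usually collect each dataframe in a list and combine them once at the end with pd.concat, rather than appending inside the loop.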
I'm trying to query a table and need to grab all products whose date equals today's date.
Below is my code so far:
import sqlite3
from datetime import date
date = date.today()
con = sqlite3.connect('test.db')
cur = con.cursor()
date = date.today()
sql_q = f'''SELECT date, name FROM table WHERE date = {date}'''
table = cur.execute(sql_q)
for row in table:
    print(row)
I am using an SQLite 3 database, and all dates were entered in the following format:
2022-09-20
However, this variable type does not seem to work with SQL.
I know the SQL code should look something like this:
SELECT name FROM test WHERE date = '2022-09-20'
but I'd like the date to be selected automatically from Python rather than typing it in manually.
Use the function date() to get the current date:
SELECT name FROM test WHERE date = date()
or CURRENT_DATE:
SELECT name FROM test WHERE date = CURRENT_DATE
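I think you need to convert the date to a string and then pass it to the query.
The data type of your column and that of date.today() may be different.
date = date.strftime('%Y-%m-%d')
Try using this, and put quotes around {date} in the query so that SQLite treats the value as a string.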
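An alternative to formatting the date into the SQL text is to bind it as a parameter. The sketch below assumes an in-memory database and a hypothetical table named products (table itself is a reserved word in SQL); the date is passed as an ISO string, which matches the 2022-09-20 storage format from the question.

```python
import sqlite3
from datetime import date

con = sqlite3.connect(":memory:")
# Hypothetical table; dates are stored as 'YYYY-MM-DD' text.
con.execute("CREATE TABLE products (date TEXT, name TEXT)")

today = date.today().isoformat()  # e.g. '2022-09-20'
con.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [(today, "widget"), ("1999-01-01", "gadget")],
)

# The "?" placeholder lets sqlite3 bind the value; no quotes or f-string needed.
rows = con.execute(
    "SELECT date, name FROM products WHERE date = ?", (today,)
).fetchall()

for row in rows:
    print(row)
```

One difference from the CURRENT_DATE approach: SQLite evaluates CURRENT_DATE in UTC, while date.today() uses the local time zone, so the two can disagree around midnight.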
My database is SQL Server 2008.
The column I want to filter on (such as FinishDate) is of type datetime2.
I just want the data between "10-11" and "10-17".
With SQLAlchemy, I use
cast(FinishDate, DATE).between(cast(time1, DATE),cast(time2, DATE))
to query by date, but it does not return any data (I have confirmed that there are rows within the queried time range).
==============================================
from sqlalchemy import DATE, cast
bb = "2021-10-11 12:21:23"
cc = "2021-10-17 16:12:34"
record = session.query(sa.Name, cast(sa.FinishDate, DATE)).filter(
    cast(sa.SamplingTime, DATE).between(cast(bb, DATE), cast(cc, DATE)),
    sa.SamplingType != 0
).all()
or
record = session.query(sa.Name, cast(sa.FinishDate, DATE)).filter(
    cast(sa.SamplingTime, DATE) >= cast(bb, DATE),
    sa.SamplingType != 0
).all()
Both return []
Something is wrong with my code and I don't know what the trouble is.
It works for me; the only change I made was replacing the DATE type you are using with Date:
from sqlalchemy import Date, cast
record = session.query(
    sa.Name, cast(sa.FinishDate, Date)
).filter(
    cast(sa.SamplingTime, Date).between(
        cast(bb, Date), cast(cc, Date)
    ),
    sa.SamplingType != 0
).all()
In fact, the first parameter of cast can also be a string, so passing the dates as strings to cast is fine here. From the docstring:
:param expression: A SQL expression, such as a
:class:`_expression.ColumnElement`
expression or a Python string which will be coerced into a bound
literal value.
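Casting both sides to a date drops the time of day, which changes which rows fall inside the range. The snippet below is a plain-Python analogy of that effect, not SQLAlchemy itself; the sample timestamp is a made-up row value.

```python
from datetime import datetime

fmt = "%Y-%m-%d %H:%M:%S"
start = datetime.strptime("2021-10-11 12:21:23", fmt)
end = datetime.strptime("2021-10-17 16:12:34", fmt)

# Hypothetical SamplingTime value: on the last day, but later than 16:12:34.
sample = datetime(2021, 10, 17, 18, 0, 0)

# Comparing full timestamps excludes the row because of the time of day.
in_range_datetime = start <= sample <= end

# Comparing dates only (the analogue of casting to a date) includes it.
in_range_date = start.date() <= sample.date() <= end.date()

print(in_range_datetime, in_range_date)
```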
I have a Python(3) script that is supposed to run each morning. In it, I call some SQL. However I'm getting an error message:
Error while connecting to PostgreSQL operator does not exist: date = integer
The SQL is based on the concatenation of a string:
ecom_dashboard_query = """
with
days_data as (
select
s.date,
s.user_type,
s.channel_grouping,
s.device_category,
sum(s.sessions) as sessions,
count(distinct s.dimension2) as daily_users,
sum(s.transactions) as transactions,
sum(s.transaction_revenue) as revenue
from ga_flagship_ecom.sessions s
where date = """ + run.start_date + """
group by 1,2,3,4
)
insert into flagship_reporting.ecom_dashboard
select *
from days_data;
"""
Here is the full error:
09:31:25 Error while connecting to PostgreSQL operator does not exist: date = integer
09:31:25 LINE 14: where date = 2020-01-19
09:31:25 ^
09:31:25 HINT: No operator matches the given name and argument types. You might need to add explicit type casts.
I tried wrapping run.start_date within str like so: str(run.start_date) but I received the same error message.
I suspect it may be to do with the way I concatenate the SQL query string, but I am not sure.
The query runs fine in SQL directly with a hard coded date and no concatenation:
where date = '2020-01-19'
How can I get the query string to work correctly?
It is better to pass query parameters to the cursor.execute method. From the psycopg2 docs:
Warning Never, never, NEVER use Python string concatenation (+) or string parameters interpolation (%) to pass variables to a SQL query string. Not even at gunpoint.
So instead of string concatenation pass run.start_date as second argument of cursor.execute.
In your query instead of concatenation use %s:
where date = %s
group by 1,2,3,4
In your Python code, add a second argument to the execute method:
cur.execute(ecom_dashboard_query, (run.start_date,))
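The reason the error mentions an integer is that an unquoted 2020-01-19 is read as arithmetic. The sketch below illustrates this with the standard-library sqlite3 module as a stand-in (an assumption made so the example runs without a PostgreSQL server; sqlite3 also uses ? instead of %s), comparing concatenation with a bound parameter.

```python
import sqlite3

con = sqlite3.connect(":memory:")
start_date = "2020-01-19"

# Concatenated without quotes: the database sees the expression 2020 - 01 - 19.
unquoted = con.execute("SELECT " + start_date).fetchone()[0]

# Bound as a parameter: the value arrives as the string it is.
bound = con.execute("SELECT ?", (start_date,)).fetchone()[0]

print(unquoted)  # 2000
print(bound)     # 2020-01-19
```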
Your statement is wrong:
where date = """ + run.start_date + """
The date is concatenated without quotes, so PostgreSQL reads 2020-01-19 as integer arithmetic and tries to compare a date with an integer, which is not possible. Convert "run.start_date" to a date and quote the value in the query:
date_format = datetime.strptime(your_date_string, '%Y-%m-%d').date()
and with this converted date do:
where date = '{}'
Final code:
from datetime import datetime
# %Y is the four-digit year; .date() drops the 00:00:00 time part
date_format = datetime.strptime(your_date_string, '%Y-%m-%d').date()
ecom_dashboard_query = """
with
days_data as (
select
s.date,
s.user_type,
s.channel_grouping,
s.device_category,
sum(s.sessions) as sessions,
count(distinct s.dimension2) as daily_users,
sum(s.transactions) as transactions,
sum(s.transaction_revenue) as revenue
from ga_flagship_ecom.sessions s
where date = '{}'
group by 1,2,3,4
)
insert into flagship_reporting.ecom_dashboard
select *
from days_data;
""".format(date_format)
I'm trying to have a start and end date as variables in a long SQL query in python that generates a dataframe. I've gone through the other posts regarding this and tried everything I know but I get errors and none of them work. I've shortened the sql query to show just the relevant part. Can anyone please give me any suggestions? I think the issue might have to do with the format of the date.
def get_dataframe():
    startdate = 'input_startdate'
    enddate = 'input_enddate'
    query="""
    where date between ? and ?
    """
    params={'start':startdate, 'end':enddate}
    conn = db.msSQLConnect()
    df = pd.read_sql(query,conn,params)
    return df
Remove the quotes around the values assigned to startdate and enddate.
With the quotes, the literal strings 'input_startdate' and 'input_enddate' are passed to the query instead of the date values held by those variables.
startdate = input_startdate
enddate = input_enddate
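A second thing that may be worth checking in the question's code: ? placeholders are positional, so drivers that use them generally expect the parameters as a sequence rather than a dict. A minimal sketch, assuming an in-memory SQLite database and a hypothetical sales table in place of the real connection:

```python
import sqlite3


def get_rows(conn, startdate, enddate):
    """Return the rows whose date lies between startdate and enddate (inclusive)."""
    query = """
        SELECT date, amount
        FROM sales
        WHERE date BETWEEN ? AND ?
        ORDER BY date
    """
    # Positional placeholders are filled in order from a tuple.
    params = (startdate, enddate)
    return conn.execute(query, params).fetchall()


# Hypothetical data, only for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (date TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("2021-01-01", 10), ("2021-01-15", 20), ("2021-02-01", 30)],
)

result = get_rows(conn, "2021-01-01", "2021-01-31")
print(result)
```

With pandas, the same tuple would be passed by keyword, as in pd.read_sql(query, conn, params=params), because the third positional argument of read_sql is index_col rather than params.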
I am running a query that I plan to use multiple times. However, the job name 'my-job1a' has to be different every time the query runs, so I was planning to base it on the date and time. Does anybody know how to implement this with a datetime function?
from google.cloud import bigquery
client = bigquery.Client('dataworks-356fa')
query = query
dataset = client.dataset('FirebaseArchive')
table = dataset.table(name='test1')
tbl = dataset.table(name='test12')
job = client.run_async_query('my-job1a', query)
job.destination = tbl
job.write_disposition= 'WRITE_TRUNCATE'
job.begin()
I believe "my-job1a" is a constant string, and you want a different string for each new query.
import datetime
# Replace "my-job1a" with "my-job1a-" plus a timestamp. BigQuery job IDs may only contain
# letters, digits, underscores and dashes, so the format avoids spaces and colons.
job = client.run_async_query("my-job1a-" + datetime.datetime.now().strftime("%Y-%m-%d-%H-%M-%S"), query)
This changes every second. If you need finer resolution, add %f (microseconds) to the strftime format. If you don't want such a long string, shorten the strftime format as you see fit.
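The naming step can be pulled into a small helper so it is easy to test on its own. This is a sketch: make_job_id and its compact timestamp format are illustrative choices, not part of the BigQuery client.

```python
import re
from datetime import datetime


def make_job_id(prefix, now=None):
    """Build a job name from a prefix and a timestamp (microsecond resolution)."""
    if now is None:
        now = datetime.now()
    # Only digits and dashes are added, so the result stays within
    # letters, digits, underscores and dashes.
    return "{}-{}".format(prefix, now.strftime("%Y%m%d-%H%M%S-%f"))


job_id = make_job_id("my-job1a")
print(job_id)

# Sanity check on the character set of the generated name.
assert re.fullmatch(r"[A-Za-z0-9_-]+", job_id)
```

The result would then be passed in place of the constant, for example client.run_async_query(make_job_id('my-job1a'), query).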