I have a simple table in Google App Engine with a date field. I want to query all the rows with the date field valued between now and 6 hours ago. How do I form this query?
I know you say GQL, but here's a python helper function I use:
import datetime

def seconds_ago(time_s):
    return datetime.datetime.now() - datetime.timedelta(seconds=time_s)
There may well be a more concise way to write it: I'm not a python expert and went with the first thing that worked. Take a look at the datetime docs if you care. It's used like this:
my_query = MyTable.all().filter("date >", seconds_ago(6*60*60))
I'm sure that can be translated to GQL without much bother, but I prefer the object-oriented interface, and I don't know the necessary DATETIME syntax.
In python the query is then used like this:
# get a count
my_query.count()

# get up to 1000 records
my_query.fetch(1000)

# iterate over up to 1000 records
for result in my_query:
    pass  # do something with result
SELECT * FROM simpletable
WHERE datefield > DATETIME(year, month, day, hour, minute, second)
computing that year, month, etc. (the moment six hours ago) in your application code.
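For example, a minimal sketch with the old google.appengine.ext.db API: compute the cutoff with datetime and bind it as a query parameter, which saves you from spelling out the DATETIME(...) literal by hand (the kind and property names simply follow the query above):

from datetime import datetime, timedelta
from google.appengine.ext import db

cutoff = datetime.now() - timedelta(hours=6)  # six hours ago
rows = db.GqlQuery("SELECT * FROM simpletable WHERE datefield > :1", cutoff)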
I have ~10k documents in my collection, with 3 fields (name, wait, utc).
The timestamps are too granular for my use, and I want to round them down to the last 10 minutes.
I created a function, round_to_10min(), to round these timestamps; I import it from another Python file of mine called utility_func.py.
It's not slick or anything but it works:
from datetime import datetime as dt

def round_to_10min(my_dt):
    # keep the hour, round the minutes down to the nearest 10, drop seconds/microseconds
    hours = my_dt.hour
    minutes = (my_dt.minute // 10) * 10
    date = dt(my_dt.year, my_dt.month, my_dt.day)
    return dt(date.year, date.month, date.day, hours, minutes)
Is there a way for me to update the 'utc' field for each document in my collection, without taking the cursor and saving it into a list, iterating through it?
An example of what I would like to avoid having to do (it doesn't seem efficient):
alldocs = collection.find({})
for x in alldocs:
    doc_id = x['_id']
    old_value = x['utc']  # a datetime, which round_to_10min() expects
    new_value = utility_func.round_to_10min(old_value)
    update_val = {"$set": {"utc": new_value}}
    collection.update_one({"_id": doc_id}, update_val)
Here's where I think I should be headed, but the update argument has me stumped...
update_value = {'$set':{'utc':result_from_function}}
collection.update_many({},update_value)
Is this achievable in pymongo?
The mechanism you are seeking will not work.
PyMongo only sends MongoDB operations to the server; it cannot run your Python function inside an update. If you can express the rounding with MongoDB operators, you can perform it in a single update_many or aggregation query.
If you need to use your Python function itself, then you're limited to your original approach: find, loop, update_one.
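For example, assuming the server is MongoDB 5.0 or newer (for $dateTrunc) and that utc is stored as a BSON date rather than an integer, something along these lines should do it in one update_many, reusing the collection handle from the question:

collection.update_many(
    {},
    [
        # aggregation-pipeline update: truncate each utc down to its 10-minute boundary
        {"$set": {"utc": {"$dateTrunc": {"date": "$utc", "unit": "minute", "binSize": 10}}}},
    ],
)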
We've a Django, Postgresql database that contains objects with:
object_date = models.DateTimeField()
as a field.
We need to count the objects by hour per day, so we need to remove some of the extra time data, for example: minutes, seconds and microseconds.
We can remove the extra time data in python:
query = MyModel.objects.values('object_date')
data = [tweet['object_date'].replace(minute=0, second=0, microsecond=0) for tweet in query]
Which leaves us with a list containing the date and hour.
My Question: Is there a better, faster, cleaner way to do this in the query itself?
If you simply want to obtain the dates without the time data, you can use extra to declare calculated fields:
query = (MyModel.objects
    .extra(select={
        'object_date_group': 'CAST(object_date AS DATE)',
        'object_hour_group': 'EXTRACT(HOUR FROM object_date)',
    })
    .values('object_date_group', 'object_hour_group'))
You don't gain too much from just that, though; the database is now sending you even more data.
However, with these additional fields, you can use aggregation to instantly get the counts you were looking for, by adding one line:
from django.db.models import Count

query = (MyModel.objects
    .extra(select={
        'object_date_group': 'CAST(object_date AS DATE)',
        'object_hour_group': 'EXTRACT(HOUR FROM object_date)',
    })
    .values('object_date_group', 'object_hour_group')
    .annotate(count=Count('*')))
Alternatively, you could use any valid SQL to combine the two fields I created into a single field, for example by formatting them into one string. The nice thing about doing that is that you can then fetch the results with values_list() and build a Counter from the tuples for convenient querying.
This query will certainly be more efficient than doing the counting in Python. For a background job that may not be so important, however.
One downside is that this code is not portable; for one, it does not work on SQLite, which you may still be using for testing purposes. In that case, you might save yourself the trouble and write a raw query right away, which will be just as unportable but more readable.
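For instance, a raw version for the Postgres database in the question might look roughly like this (the table name myapp_mymodel is a guess; Django derives it from the app and model names):

from django.db import connection

with connection.cursor() as cursor:
    cursor.execute("""
        SELECT date_trunc('hour', object_date) AS hour_bucket, COUNT(*)
        FROM myapp_mymodel
        GROUP BY hour_bucket
        ORDER BY hour_bucket
    """)
    counts = cursor.fetchall()  # list of (datetime, count) rows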
Update
As of Django 1.10 it is possible to perform this query nicely using expressions, thanks to the addition of TruncHour. Here's a suggestion for how the solution could look:
from collections import Counter
from django.db.models import Count
from django.db.models.functions import TruncHour
counts_by_group = Counter(dict(
    MyModel.objects
    .annotate(object_group=TruncHour('object_date'))
    .values_list('object_group')
    .annotate(count=Count('object_group'))
))  # query with counts_by_group[datetime.datetime(year, month, day, hour)]
It's elegant, efficient and portable. :)
count = len(MyModel.objects.filter(object_date__range=(beginning_of_hour, end_of_hour)))
or
count = MyModel.objects.filter(object_date__range=(beginning_of_hour, end_of_hour)).count()
Assuming I understand what you're asking for, this returns the number of objects whose date falls within a specific time range. Set the range to run from the beginning of the hour to the end of the hour and you will get all objects created in that hour. Either .count() or len() can be used; .count() is executed by the database, while len() evaluates the whole queryset in Python. For more information, see https://docs.djangoproject.com/en/1.9/ref/models/querysets/#count
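For completeness, the two bounds can be built like this (a sketch; it uses timezone.now(), which returns an aware datetime when USE_TZ is enabled):

from datetime import timedelta
from django.utils import timezone

now = timezone.now()
beginning_of_hour = now.replace(minute=0, second=0, microsecond=0)
# note: __range is inclusive at both ends, so an object stamped exactly on the next hour also matches
end_of_hour = beginning_of_hour + timedelta(hours=1)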
I have a model that contains a DateField. I'm trying to get a queryset of that model limited to the current week (weeks start on Monday).
Since a Django DateField holds a plain datetime.date, I assumed I could filter using .isocalendar(). Logically it's exactly what I want, with no extra comparisons or calculations based on the current weekday.
So what I want to do essentially is force .filter statement to behave in this logic:
if model.date.isocalendar()[1] == datetime.date.today().isocalendar()[1]
...
But how do I write that inside a filter() statement?
.filter(model__date__isocalendar=datetime.date.today().isocalendar()) gives wrong results (the same as comparing to today, not to this week).
Digging through http://docs.python.org/library/datetime.html, I have not noticed any other weekday options...
Note from documentation:
date.isocalendar() Return a 3-tuple, (ISO year, ISO week number, ISO
weekday).
Update:
Although I disliked the solution of using ranges, it's the best option.
In my case, however, I made a variable that marks the beginning of the week and simply filter for values greater than or equal to it, since I'm only looking for matches in the current week. If I were given an arbitrary week number instead, I would need both ends of the range.
today = datetime.date.today()
monday = today - datetime.timedelta(days=today.weekday())
... \
.filter(date__gte=monday)
You're not going to be able to do this. Remember it's not just an issue of what Python supports, Django has to communicate the filter to the database, and the database doesn't support such complex date calculations. You can use __range, though, with a start date and end date.
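For example, something along these lines should cover the current week (a sketch; it assumes the model's field is simply called date, and MyModel is a placeholder name):

import datetime

today = datetime.date.today()
monday = today - datetime.timedelta(days=today.weekday())
sunday = monday + datetime.timedelta(days=6)

# __range is inclusive at both ends, so this covers Monday through Sunday
this_week = MyModel.objects.filter(date__range=(monday, sunday))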
Even simpler than using the Extract function that Amit mentioned in his answer is the __week field lookup added in Django 1.11, so you can simply do:
.filter(model__date__week=datetime.date.today().isocalendar()[1])
ExtractWeek has been introduced in Django 1.11 for filtering based on isoweek number.
For Django 1.10 and lower versions, following solution works for filtering by iso number week on postgres database:
from django.db.models.functions import Extract
from django.db import models

@models.DateTimeField.register_lookup
class ExtractWeek(Extract):
    lookup_name = 'week'
Now run the query as follows:
queryset.annotate(week=ExtractWeek('date')) \
    .filter(week=week_number)
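Here week_number is whatever ISO week you want to match; for the current week it can come straight from isocalendar():

import datetime

week_number = datetime.date.today().isocalendar()[1]  # current ISO week number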
(This answer is only expected to work on Postgres, but it might work on other databases too.)
A quick and elegant solution for this problem would be to define these two custom transformers:
from django.db import models
from django.db.models.lookups import DateTransform

@models.DateTimeField.register_lookup
class WeekTransform(DateTransform):
    lookup_name = 'week'

@models.DateTimeField.register_lookup
class ISOYearTransform(DateTransform):
    lookup_name = 'isoyear'
Now you can query by week like this:
from django.utils.timezone import now
year, week, _ = now().isocalendar()
MyModel.objects.filter(created__isoyear=year, created__week=week)
Behind the scenes, the Django DateTransform object uses the Postgres EXTRACT function, which supports week and isoyear.
In my MySQL database I have dates going back to the mid 1700s which I need to convert somehow to ints in a format similar to Unix time. The value of the int isn't important, so long as I can take a date from either my database or from user input and generate the same int. I need to use MySQL to generate the int on the database side, and python to transform the date from the user.
Normally, the UNIX_TIMESTAMP function would accomplish this in MySQL, but for dates before 1970 it always returns zero.
The TO_DAYS MySQL function could also work, but I can't take a date from user input and use Python to produce the same values that this function produces in MySQL.
So basically, I need a function like UNIX_TIMESTAMP that works in MySQL and Python for dates between 1700-01-01 and 2100-01-01.
Put another way, this MySQL pseudo-code:
select 1700_UNIX_TIME(date) from table;
Must equal this Python code:
1700_UNIX_TIME(date)
I don't have MySQL installed here, but when I look at http://dev.mysql.com/doc/refman/5.1/en/date-and-time-functions.html#function_to-days I see an example of TO_DAYS('2008-10-07') returning 733687.
In Python, datetime(2008, 10, 7).toordinal() returns 733322, which is 365 less than MySQL's output.
So take this:
from datetime import datetime
query = '2008-10-07'
nbOfDays = datetime.strptime(query, '%Y-%m-%d').toordinal() + 365
and it should work for dates between 1700 and 2100.
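Packaged as a pair of helpers (a sketch based on the +365 offset above, for the 1700-2100 range the question cares about):

from datetime import date, datetime

def to_days(date_string):
    # same integer that MySQL's TO_DAYS('YYYY-MM-DD') produces, per the offset observed above
    return datetime.strptime(date_string, '%Y-%m-%d').toordinal() + 365

def from_days(mysql_days):
    # inverse: convert a TO_DAYS() value back into a datetime.date
    return date.fromordinal(mysql_days - 365)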
According to the link that you gave,
Given a date date, returns a day number (the number of days since year 0).
mysql> SELECT TO_DAYS(950501);
-> 728779
mysql> SELECT TO_DAYS('2007-10-07');
-> 733321
Corresponding numbers in Python:
>>> import datetime
>>> datetime.date(1995,5,1).toordinal()
728414
>>> datetime.date(2007,10,7).toordinal()
732956
So the relationship is: mySQL_int == Python_int + 365, and you can convert in the other direction by using the fromordinal class method:
>>> datetime.date.fromordinal(728779 - 365)
datetime.date(1995, 5, 1)
The SQLite docs specify that the preferred format for storing datetime values in the DB is Julian Day (using the built-in functions).
However, all frameworks I saw in python (pysqlite, SQLAlchemy) store the datetime.datetime values as ISO formatted strings. Why are they doing so?
I usually try to adapt the frameworks to store datetimes as julianday, and it's quite painful. I've started to doubt that it's worth the effort.
Please share your experience in this field with me. Does sticking with julianday make sense?
Julian Day is handy for all sorts of date calculations, but it can't store the time part decently (with precise hours, minutes, and seconds). In the past I've used both Julian Day fields (for dates) and seconds-from-the-Epoch (for datetime instances), but only when I had specific needs for computation (of dates and of times, respectively). The simplicity of ISO formatted dates and datetimes, I think, should make them the preferred choice, say about 97% of the time.
Store it both ways. Frameworks can be set in their ways and if yours is expecting to find a raw column with an ISO formatted string then that is probably more of a pain to get around than it's worth.
The concern in having two columns is data consistency but sqlite should have everything you need to make it work. Version 3.3 has support for check constraints and triggers. Read up on date and time functions. You should be able to do what you need entirely in the database.
CREATE TABLE Table1 (jd, isotime);

CREATE TRIGGER trigger_name_1 AFTER INSERT ON Table1
BEGIN
    UPDATE Table1 SET jd = julianday(isotime) WHERE rowid = last_insert_rowid();
END;

CREATE TRIGGER trigger_name_2 AFTER UPDATE OF isotime ON Table1
BEGIN
    UPDATE Table1 SET jd = julianday(isotime) WHERE rowid = old.rowid;
END;
And if you can't do what you need within the DB, you can write a C extension to perform the functionality you need. That way you won't need to touch the framework other than to load your extension.
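If it helps, the trigger approach can be verified quickly from Python with the standard sqlite3 module (an in-memory database here, purely for illustration):

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (jd, isotime);
CREATE TRIGGER trigger_name_1 AFTER INSERT ON Table1
BEGIN
    UPDATE Table1 SET jd = julianday(isotime) WHERE rowid = last_insert_rowid();
END;
""")
conn.execute("INSERT INTO Table1 (isotime) VALUES ('2010-06-22 00:45:56')")
print(conn.execute("SELECT jd, isotime FROM Table1").fetchone())
# prints roughly (2455369.531898..., '2010-06-22 00:45:56')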
But typically, a human doesn't read directly from the database. The fractional part of a Julian Day is easily converted to a human-readable time by (for example):
void hour_time(GenericDate *ConvertObject)
{
    /* Julian Days begin at noon, so shift by half a day before taking the fraction */
    double frac_time = ConvertObject->jd + 0.5;
    double hour = (24.0*(frac_time - (int)frac_time));
    double minute = 60.0*(hour - (int)hour);
    double second = 60.0*(minute - (int)minute);
    double microsecond = 1000000.0*(second - (int)second);
    ConvertObject->hour = hour;
    ConvertObject->minute = minute;
    ConvertObject->second = second;
    ConvertObject->microsecond = microsecond;
}
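Since the rest of this thread is Python, here is roughly the same conversion as a Python sketch; it relies on the standard convention that Julian Day 2440587.5 corresponds to the Unix epoch (1970-01-01 00:00:00 UTC):

from datetime import datetime, timedelta

def jd_to_datetime(jd):
    # offset from the Julian Day of the Unix epoch, applied as a timedelta
    return datetime(1970, 1, 1) + timedelta(days=jd - 2440587.5)

print(jd_to_datetime(2455369.5318981484))  # -> 2010-06-22 00:45:56, give or take float rounding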
Because 2010-06-22 00:45:56 is far easier for a human to read than 2455369.5318981484. Text dates are great for doing ad-hoc queries in SQLiteSpy or SQLite Manager.
The main drawback, of course, is that text dates require 19 bytes instead of 8.