With PostgreSQL, one of my tables has an 'interval' column, values of which I would like to extract as something I can manipulate (datetime.timedelta?); however I am using PyGreSQL which seems to be returning intervals as strings, which is less than helpful.
Where should I be looking to either parse the interval or make PyGreSQL return it as a <something useful>?
Use Psycopg 2. It correctly converts between Postgres's interval data type and Python's timedelta.
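For instance (a minimal sketch; the connection parameters and the literal interval are assumptions, but returning timedelta is psycopg2's documented behaviour):
import psycopg2

conn = psycopg2.connect(dbname="mydb")  # hypothetical database
cur = conn.cursor()
cur.execute("SELECT interval '1 day 2 hours 30 minutes'")
delta = cur.fetchone()[0]
print(type(delta))                # <class 'datetime.timedelta'>
print(delta.days, delta.seconds)  # 1 9000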
I am moving some of my code from raw MySQL queries to sqlalchemy.
The current issue I am having is that the datetime was saved in string form by a C# tool. Unfortunately, the representation does not match Python's (and each value carries an extra set of single quotes), which makes filtering cumbersome.
Here is an example of the format that the date was saved in:
'2016-07-01T17:27:01'
Which I was able to convert to a usable datetime using the following MySQL command:
STR_TO_DATE(T.PredicationGeneratedTime, \"'%%Y-%%m-%%dT%%H:%%i:%%s'\")
However, I cannot find any documentation that describes how to invoke built-in functions such as STR_TO_DATE when filtering with sqlalchemy.
The following Python code:
session.query(Train.Model).filter(cast(Train.Model.PredicationGeneratedTime, date) < start)
is giving me:
TypeError: Required argument 'year' (pos 1) not found
There does not seem to be a way to specify the format for the conversion.
Note: I realize the solution is to fix the way the datetime is stored, but in the mean time I'd like to run queries against the existing data.
You can try to use func.str_to_date(COLUMN, FORMAT_STRING) instead of cast
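Something along these lines should work (a sketch reusing session, Train.Model and start from the question; the format string keeps the stray single quotes around each stored value):
from sqlalchemy import func

fmt = "'%Y-%m-%dT%H:%i:%s'"
query = session.query(Train.Model).filter(
    func.str_to_date(Train.Model.PredicationGeneratedTime, fmt) < start
)
func renders any attribute access as a SQL function call, so this emits STR_TO_DATE(...) in the generated MySQL, and the format string travels as a bound parameter, sidestepping quoting issues.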
In the cast() you should be using sqlalchemy.DateTime, not (what I assume is) a datetime.date - that is the cause of the exception.
However, fixing that will not really help because of the embedded single quotes.
You are fortunate that the dates stored in your table are in ISO format. That means that lexicographic comparisons will work on the date strings themselves, without casting. As long as you use a string for start with the surrounding single quotes, it will work.
from datetime import datetime
start = "'{}'".format(datetime.now().isoformat())
session.query(Train.Model).filter(Train.Model.PredicationGeneratedTime < start)
I am having a problem inserting date values into an SQL query. I am using sqlite3 and Python. The query is:
c.execute("""SELECT tweeterHash.* FROM tweeterHash, tweetDates WHERE
Date(tweetDates.start) > Date(?) AND
Date(tweetDates.end) > Date(?)""",
(start,end,))
The query doesn't return any values, and there is no error message. If I use this query:
c.execute("""SELECT tweeterHash.* FROM tweeterHash, tweetDates WHERE
Date(tweetDates.start) > Date(2014-01-01) AND
Date(tweetDates.end) > Date(2015-01-01)""")
Then I get the values that I want, as expected.
The values start and end come from a text file:
f = open('dates.txt','r')
start = f.readline().strip('\n')
end = f.readline().strip('\n')
but I have also just tried declaring it as well:
start = '2014-01-01'
end = '2015-01-01'
I guess I don't understand why passing the string in from the start and end variables doesn't work? What is the best way to pass a date variable into a SQL query? Any help is greatly appreciated.
These aren't the same dates—and it's the non-parameterized ones you've got wrong.
Date(2014-01-01) calculates the arithmetic expression 2014 - 01 - 01, then constructs a Date from the resulting number 2012, which will get you something in 4707 BC.
Date('2014-01-01'), or Date(?) where the parameter is the string '2014-01-01', constructs the date you want, in 2014 AD.
You can see this more easily by just selecting dates directly:
>>> cur.execute('SELECT Date(2014-01-01), Date(?)', ['2014-01-01'])
>>> print(cur.fetchone())
('-4707-05-28', '2014-01-01')
Meanwhile:
What is the best way to pass a date variable into a SQL query?
Ideally, use actual date objects instead of strings. The sqlite3 library knows how to handle datetime.datetime and datetime.date. And don't call Date on the values, just compare them. (Yes, sqlite3 might then compare them as strings instead of dates, but the whole point of using ISO8601-like formats is that this always gives the same result… unless of course you have a bunch of dates from 4707 BC lying around.) So:
start = datetime.date(2014, 1, 1)
end = datetime.date(2015, 1, 1)
c.execute("""SELECT tweeterHash.* FROM tweeterHash, tweetDates WHERE
tweetDates.start > ? AND
tweetDates.end > ?""",
(start,end,))
And would this also mean that when I create the table, I would want: " start datetime, end datetime, "?
That would work, but I wouldn't do that. Python will convert date objects to ISO8601-format strings, but not convert back on SELECT, and SQLite will let you transparently compare those strings to the values returned by the Date function.
You could get the same effect with TEXT, and I believe you'd find it less confusing: DATETIME will set the column affinity to NUMERIC, which can confuse both humans and other tools when you're actually storing strings.
Or you could use the type DATE—which is just as meaningless to SQLite as DATETIME, but it can tell Python to transparently convert return values into datetime.date objects. See Default adapters and converters in the sqlite3 docs.
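As a sketch of that approach (the table and column names here are made up):
import datetime
import sqlite3

conn = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
conn.execute("CREATE TABLE tweetDates (start DATE, finish DATE)")
conn.execute("INSERT INTO tweetDates VALUES (?, ?)",
             (datetime.date(2014, 1, 1), datetime.date(2015, 1, 1)))
print(conn.execute("SELECT start, finish FROM tweetDates").fetchone())
# (datetime.date(2014, 1, 1), datetime.date(2015, 1, 1))
Declaring the columns as DATE plus passing detect_types is what activates the built-in converter on the way back out.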
Also, if you haven't read Datatypes in SQLite Version 3 and SQLite and Python types, you really should; there are a lot of things that are both surprising (even—or maybe especially—if you've used other databases), and potentially very useful.
Meanwhile, if you think you're getting the "right" results from passing Date(2014-01-01) around, that means you've actually got a bunch of garbage values in your database. And there's no way to fix them, because the mistake isn't reversible. (After all, 2014-01-01 and 2015-01-02 are both 2012…) Hopefully you either don't need the old data, or can regenerate it. Otherwise, you'll need some kind of workaround that lets you deal with existing data as usefully as possible under the circumstances.
I have chapter times in the form of HH:MM:SS. I am parsing them from a document, and I will have times as a string in the format of '00:12:14'. How would I store this in a mysql column, and then retrieve it in the required format to be able to:
1) order by time;
2) convert to a string in the above format.
I suggest you look at the MySQL time type. It will allow you to sort and format as you wish.
http://dev.mysql.com/doc/refman/5.0/en/time.html
Use the TIME type.
It allows "time values to be represented in several formats, such as quoted strings or as numbers, depending on the exact type of the value and other factors." In addition, you can perform various functions to manipulate the time.
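For example, with a driver like PyMySQL (the connection details and the chapters table are assumptions), the sorting happens in SQL and formatting back to 'HH:MM:SS' takes a few lines:
import pymysql

conn = pymysql.connect(host="localhost", user="me", password="secret", db="mydb")
cur = conn.cursor()
cur.execute("SELECT title, chapter_time FROM chapters ORDER BY chapter_time")
for title, t in cur.fetchall():
    # PyMySQL hands TIME columns back as datetime.timedelta
    hours, rem = divmod(int(t.total_seconds()), 3600)
    minutes, seconds = divmod(rem, 60)
    print(title, "%02d:%02d:%02d" % (hours, minutes, seconds))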
If I have such a simple task, I choose a simple solution: I would use Python's datetime.time class (see: datetime.time) and format the value for storage with strftime.
Loading it back in is a little painful as you would have to split your string at : and then pass the values to the time constructor. Example:
import datetime

def load(timestr):
    # "HH:MM:SS" -> datetime.time; the constructor needs ints, not strings
    hours, minutes, seconds = (int(part) for part in timestr.split(":"))
    return datetime.time(hours, minutes, seconds)
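Going the other direction (formatting for storage) is a one-liner:
import datetime

t = datetime.time(0, 12, 14)
print(t.strftime("%H:%M:%S"))  # '00:12:14', ready to store as text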
Hope this helps.
I have a scientific model which I am running in Python which produces a lookup table as output. That is, it produces a many-dimensional 'table' where each dimension is a parameter in the model and the value in each cell is the output of the model.
My question is how best to store this lookup table in Python. I am running the model in a loop over every possible parameter combination (using the fantastic itertools.product function), but I can't work out how best to store the outputs.
It would seem sensible to simply store the output as an ndarray, but I'd really like to be able to access the outputs based on the parameter values, not just indices. For example, rather than accessing the values as table[16][5][17][14] I'd prefer to access them somehow using variable names/values, for example:
table[solar_z=45, solar_a=170, type=17, reflectance=0.37]
or something similar to that. It'd be brilliant if I were able to iterate over the values and get their parameter values back - that is, being able to find out that table[16]... corresponds to the outputs for solar_z = 45.
Is there a sensible way to do this in Python?
Why don't you use a database? I have found MongoDB (and the official Python driver, Pymongo) to be a wonderful tool for scientific computing. Here are some advantages:
Easy to install - simply download the executables for your platform (2 minutes tops, seriously).
Schema-less data model
Blazing fast
Provides map/reduce functionality
Very good querying functionalities
So, you could store each entry as a MongoDB entry, for example:
{"_id":"run_unique_identifier",
"param1":"val1",
"param2":"val2" # etcetera
}
Then you could query the entries as you will:
import pymongo

# MongoClient supersedes the old Connection class in current pymongo
data = pymongo.MongoClient("localhost", 27017)["mydb"]["mycollection"]
for entry in data.find():  # iterates over every stored document
    print(entry["param1"])  # do something with param1
Whether or not MongoDB/pymongo are the answer to your specific question, I don't know. However, you could really benefit from checking them out if you are into data-intensive scientific computing.
If you want to access the results by name, then you could use a Python nested dictionary instead of an ndarray, and serialize it to a JSON text file using the json module.
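A minimal sketch of that idea, with made-up parameter names and a hypothetical run_model() standing in for the model call:
import json

table = {}
for solar_z in (40, 45):
    for solar_a in (170, 180):
        table.setdefault(solar_z, {})[solar_a] = run_model(solar_z, solar_a)

with open("lookup.json", "w") as f:
    json.dump(table, f)

with open("lookup.json") as f:
    table = json.load(f)
print(table["45"]["170"])  # note: JSON turns numeric keys into strings
One caveat of this route: JSON object keys are always strings, so the numeric parameter values come back as '45' rather than 45.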
One option is to use a numpy ndarray for the data (as you do now), and write a parser function to convert the query values into row/column indices.
For example:
solar_z_dict = {...}
solar_a_dict = {...}
...
def lookup(dataArray, solar_z, solar_a, type, reflectance):
    # translate each parameter value to its index along the matching axis
    return dataArray[solar_z_dict[solar_z], solar_a_dict[solar_a], ...]
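The index dictionaries themselves can be built straight from the parameter grids, e.g. (the grid values here are made up):
import numpy as np

solar_z_values = np.arange(0, 90, 5)  # hypothetical grid of solar zenith angles
solar_z_dict = {val: idx for idx, val in enumerate(solar_z_values)}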
You could also convert to string and eval, if you want some of the fields to be given as "None" and translated to ":" (to give the full table for that variable).
For example, rather than accessing the values as table[16][5][17][14]
I'd prefer to access them somehow using variable names/values
That's what numpy's dtypes are for:
import pylab as plb
from sys import argv
dt = [('L','float64'),('T','float64'),('NMSF','float64'),('err','float64')]
data = plb.loadtxt(argv[1], dtype=dt)
Now you can access the data elements as data['L'], data['T'], data['NMSF'], and so on.
More info on dtypes:
http://docs.scipy.org/doc/numpy/reference/generated/numpy.dtype.html
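A small self-contained example of the same idea:
import numpy as np

dt = [('L','float64'),('T','float64'),('NMSF','float64'),('err','float64')]
data = np.array([(1.0, 2.0, 3.0, 0.1),
                 (4.0, 5.0, 6.0, 0.2)], dtype=dt)
print(data['T'])         # the whole T column: [2. 5.]
print(data[0]['NMSF'])   # a single cell: 3.0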
The SQLite docs specify that the preferred format for storing datetime values in the DB is Julian Day (using the built-in functions).
However, all frameworks I saw in python (pysqlite, SQLAlchemy) store the datetime.datetime values as ISO formatted strings. Why are they doing so?
I usually try to adapt the frameworks to store datetimes as julianday, and it's quite painful. I have started to doubt it is worth the effort.
Please share your experience in this field with me. Does sticking with julianday make sense?
Julian Day is handy for all sorts of date calculations, but it can't store the time part decently (with precise hours, minutes, and seconds). In the past I've used both Julian Day fields (for dates) and seconds-from-the-Epoch (for datetime instances), but only when I had specific needs for computation (on dates and on times, respectively). The simplicity of ISO formatted dates and datetimes, I think, should make them the preferred choice, say about 97% of the time.
Store it both ways. Frameworks can be set in their ways and if yours is expecting to find a raw column with an ISO formatted string then that is probably more of a pain to get around than it's worth.
The concern in having two columns is data consistency but sqlite should have everything you need to make it work. Version 3.3 has support for check constraints and triggers. Read up on date and time functions. You should be able to do what you need entirely in the database.
CREATE TABLE Table1 (jd, isotime);

CREATE TRIGGER trigger_name_1 AFTER INSERT ON Table1
BEGIN
    UPDATE Table1 SET jd = julianday(isotime) WHERE rowid = last_insert_rowid();
END;

CREATE TRIGGER trigger_name_2 AFTER UPDATE OF isotime ON Table1
BEGIN
    UPDATE Table1 SET jd = julianday(isotime) WHERE rowid = old.rowid;
END;
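A quick way to check the insert trigger from Python (a sketch against an in-memory database):
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (jd, isotime);
    CREATE TRIGGER trigger_name_1 AFTER INSERT ON Table1
    BEGIN
        UPDATE Table1 SET jd = julianday(isotime) WHERE rowid = last_insert_rowid();
    END;
""")
conn.execute("INSERT INTO Table1 (isotime) VALUES ('2010-06-22 00:45:56')")
print(conn.execute("SELECT jd, isotime FROM Table1").fetchone())
# (2455369.5318981484, '2010-06-22 00:45:56')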
And if you can't do what you need within the DB, you can write a C extension to provide it. That way you won't need to touch the framework other than to load your extension.
But typically, the human doesn't read directly from the database. The fractional part of a Julian Day is easily converted to a human-readable time by (for example)
void hour_time(GenericDate *ConvertObject)
{
    /* Julian Days begin at noon, so shift by half a day before
       splitting off the civil time of day */
    double frac_time = ConvertObject->jd + 0.5;
    double hour = 24.0*(frac_time - (int)frac_time);
    double minute = 60.0*(hour - (int)hour);
    double second = 60.0*(minute - (int)minute);
    double microsecond = 1000000.0*(second - (int)second);
    /* the integer fields of GenericDate drop the leftover fraction */
    ConvertObject->hour = hour;
    ConvertObject->minute = minute;
    ConvertObject->second = second;
    ConvertObject->microsecond = microsecond;
}
Because 2010-06-22 00:45:56 is far easier for a human to read than 2455369.5318981484. Text dates are great for doing ad-hoc queries in SQLiteSpy or SQLite Manager.
The main drawback, of course, is that text dates require 19 bytes instead of 8.