Datetime iteration for KML placemarks - python

All,
I am writing an app that produces 64800 polygons, and I want each of them to have a unique timestamp. The problem is that my script runs in about 3 seconds, which means there are exactly 3 unique dates spread across all 64k polygons. I've tried going back in time using timedelta(), which does step back from utcnow(), but the script still iterates for 3 seconds and still yields the same 3 unique date stamps. I have been working on variations of this...
import datetime

def time():
    # Note: this shadows the stdlib time module; rename in real code
    now = datetime.datetime.utcnow()
    start_day = datetime.timedelta(days=-2000)  # step back 2000 days
    return now + start_day
Can anyone lend a hand to get me to the 64800 unique time stamps?
Thanks,
Adam

You could add a polygon counter to the generated timestamp to differentiate each polygon from the previous one. In fact, if you do not need accurate wall-clock timestamps - and your adjusting of the timestamps with timedelta() suggests that you do not - you could even use an arbitrary counter as your clock, as long as it is atomically incremented.
If you do need accurate timestamps, you could try for sub-second resolution, as mentioned here. Keep in mind, however, that many systems only provide timestamps in increments of several milliseconds, which is still plenty of time for more than one polygon to be created, so you will still need an additional counter or something similar to make the timestamps unique.
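For example, a minimal sketch of the counter idea (assuming one-second spacing between polygons is acceptable; the function name is mine):

import datetime

def unique_timestamps(n, start=None):
    """Yield n unique timestamps, one second apart, counting back from start."""
    start = start or datetime.datetime.utcnow()
    for i in range(n):
        # The loop index acts as the per-polygon counter
        yield start - datetime.timedelta(seconds=i)

stamps = list(unique_timestamps(64800))
assert len(set(stamps)) == 64800  # every polygon gets a distinct timestamp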
We might be able to help you more if you mentioned what these timestamps will be used for...

Related

How to find epoch time of future time in Python

I'm currently writing an alarm clock in Python; however, I have some technical difficulties.
The user has the option for the alarm to repeat (on given days), or to not repeat. They then provide the minutes and hour at which they want the alarm to trigger.
For my alarm system to work, I need to know the time, as an epoch, at which the alarm should trigger.
If I am trying to set an alarm (for example for 19:30; time will always be entered as 24-hour), I need to find the epoch time of the next time it is 19:30. That could be on the same day if I set the alarm before 19:30, or on the next day if I set it after 19:30.
Because of this, I can't simply call time.localtime() and swap the hours and minutes of the resulting struct_time object for 19 and 30 (indexes 3 and 4 of the named tuple), as I would also have to correctly assign the month, day, and day of the year to get a valid struct_time object. While that is possible, it would require a lot of manipulation, and I feel there must be a more reasonable way of doing this.
Any help would be much appreciated
Build a datetime for the next time the alarm should fire, then call its timestamp() method, which returns the epoch time of the datetime instance. This will work in almost any circumstance, especially for a simple alarm clock, but be aware of this warning from the docs.
Naive datetime instances are assumed to represent local time and this method relies on the platform C mktime() function to perform the conversion. Since datetime supports wider range of values than mktime() on many platforms, this method may raise OverflowError for times far in the past or far in the future.
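A minimal sketch of that approach (the helper name next_alarm_epoch is mine, not from the question):

import datetime

def next_alarm_epoch(hour, minute):
    """Epoch time of the next occurrence of hour:minute, local time."""
    now = datetime.datetime.now()
    alarm = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if alarm <= now:
        # Today's alarm time has already passed, so fire tomorrow
        alarm += datetime.timedelta(days=1)
    return alarm.timestamp()

print(next_alarm_epoch(19, 30))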
Depending on your program architecture, you might also consider working with the number of seconds between two times, which you can get with simple subtraction (yielding a timedelta) and the total_seconds() method:
import time
import datetime
start = datetime.datetime.now()
time.sleep(2)
end = datetime.datetime.now()
# print total seconds
print((end - start).total_seconds())

Expiration of Memcache objects on specific date

I would like to set the expiry time for memcache objects to a specific date.
cache.set(string, 1, 86400)
The statement above lets me set it for a day, but it does not expire when the date changes. One way I could handle this is by calculating the number of seconds left in the day and passing that as the expiry.
I was wondering if there was a simpler/more efficient way to do it.
Looking at the documentation, we see that the expiration parameter is explained as:
Optional expiration time, either relative number of seconds from current time (up to 1 month), or an absolute Unix epoch time. By default, items never expire, though items may be evicted due to memory pressure. Float values will be rounded up to the nearest whole second.
So basically if the number you put in there is less than 2592000, it is interpreted as a relative time. So the number 86400 would be interpreted as 86400 seconds (one day) from now, the time it's being set.
It looks like you're going to want to use a number bigger than that to signify an absolute time. There are a variety of ways to get a unix timestamp. But quite simply you can do:
import time

# Absolute expiry: midnight on 2013-02-15 as a Unix timestamp
time_tuple = (2013, 2, 15, 0, 0, 0, 0, 0, 0)
timestamp = time.mktime(time_tuple)
cache.set(string, 1, timestamp)
Your initial idea is correct, too: you can find the timestamp for now and the timestamp of the date you want, and provide the difference; that would be equivalent.
Somewhere in the world, the date rolls over roughly every hour, so either the client or the server must specify which time zone's day boundary is relevant to any given request. This is generally a better task for the client application.
Do note that you can specify absolute timestamps, which might make it easier to calculate when that expiry time is since you'd be able to reuse it for the whole day (or at least an hour).
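A sketch of that reusable absolute expiry (assuming local midnight is the boundary you care about; cache and string are the objects from the question):

import time
import datetime

def next_midnight_epoch():
    """Unix timestamp for the coming local midnight."""
    tomorrow = datetime.date.today() + datetime.timedelta(days=1)
    midnight = datetime.datetime.combine(tomorrow, datetime.time.min)
    return time.mktime(midnight.timetuple())

# Reusable for every set() until the day rolls over
cache.set(string, 1, next_midnight_epoch())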

How to handle time zones in a CMS?

I'm storing MySQL DATETIMEs in UTC and letting users select their time zone, storing that choice.
However, I want to run some queries that GROUP BY a date. Is it better to store that datetime information in UTC (and do the conversion every time), or to save it in the given time zone? Since users' time zones can change, I wonder.
Thanks
Generally, always store in UTC and convert for display; it's the only sane way to handle time differences, or to cope when somebody decides next year to change the daylight-saving dates.
It's almost always better to save the time information in UTC, and convert it to local time when needed for presentation and display.
Otherwise, you will go stark raving mad trying to manipulate and compare dates and times in your system because you will have to convert each time to UTC time for comparison and manipulation.
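A sketch of the store-UTC, convert-for-display pattern using the standard-library zoneinfo module (Python 3.9+; the example zone is arbitrary):

from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# A naive UTC datetime as it comes back from MySQL
stored = datetime(2023, 6, 1, 23, 30)

# Attach UTC, then convert to the user's saved zone for display/grouping
local = stored.replace(tzinfo=timezone.utc).astimezone(ZoneInfo("America/New_York"))
print(local)         # 2023-06-01 19:30:00-04:00
print(local.date())  # the local calendar date to group by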

Efficiently determining if a business is open or not based on store hours

Given a time (eg. currently 4:24pm on Tuesday), I'd like to be able to select all businesses that are currently open out of a set of businesses.
I have the open and close times for every business for every day of the week
Let's assume a business can open/close only on 00, 15, 30, 45 minute marks of each hour
I'm assuming the same schedule each week.
I am most interested in being able to quickly look up a set of businesses that is open at a certain time, not the space requirements of the data.
Mind you, some may open at 11pm one day and close at 1am the next day.
Holidays don't matter - I will handle these separately
What's the most efficient way to store these open/close times such that with a single time/day-of-week tuple I can speedily figure out which businesses are open?
I am using Python, SOLR and mysql. I'd like to be able to do the querying in SOLR. But frankly, I'm open to any suggestions and alternatives.
If you are willing to look at a single week at a time, you can canonicalize all opening/closing times to a number of minutes since the start of the week, say Sunday 00:00. For each store, you create a number of tuples of the form [startTime, endTime, storeId]. (For hours that span Sunday midnight, you'd create two tuples: one running to the end of the week, one starting at the beginning of the week.) This set of tuples would be indexed (say, with a tree you pre-process) on both startTime and endTime. The tuples aren't large: there are only ~10k minutes in a week, which fits in 2 bytes. This structure sits gracefully in a MySQL table with appropriate indexes and is very resilient to constant insertions and deletions as information changes. Your query would simply be "select storeId where startTime <= time and endTime >= time", where time is the canonicalized minutes since midnight on Sunday.
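A sketch of that canonicalization (helper names are mine), including the split for hours that wrap past the Saturday/Sunday boundary:

WEEK_MINUTES = 7 * 24 * 60  # 10080 minutes in a week

def week_minute(day, hour, minute):
    """Minutes since Sunday 00:00; day 0 = Sunday."""
    return (day * 24 + hour) * 60 + minute

def tuples_for(store_id, open_wm, close_wm):
    """[startTime, endTime, storeId] rows, split when the interval wraps."""
    if close_wm <= open_wm:  # spans the Saturday/Sunday boundary
        return [(open_wm, WEEK_MINUTES - 1, store_id), (0, close_wm, store_id)]
    return [(open_wm, close_wm, store_id)]

# Open Saturday 11 pm through Sunday 1 am: two rows
print(tuples_for(42, week_minute(6, 23, 0), week_minute(0, 1, 0)))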
If information doesn't change very often and you want lookups to be very fast, you could solve every possible query up front and cache the results. There are only 672 quarter-hour periods in a week; with a list of businesses, each with a list of opening and closing times like Brandon Rhodes's solution, you could simply iterate through every 15-minute period in the week, figure out who's open, and store the answer in a lookup table or in-memory list.
The bitmap field mentioned by another respondent would be incredibly efficient, but gets messy if you want to handle half-hour or quarter-hour times, since you have to multiply the number of bits and redesign the field each time you encounter a new resolution you have to match.
I would instead try storing the values as datetimes inside a list:
openclosings = [ open1, close1, open2, close2, ... ]
Then, I would use Python's "bisect_right()" function in its built-in "bisect" module to find, in fast O(log n) time, where in that list your query time "fits". Then, look at the index that is returned. If it is an even number (0, 2, 4...) then the time lies between one of the "closed" times and the next "open" time, so the shop is closed then. If, instead, the bisection index is an odd number (1, 3, 5...) then the time has landed between an opening and a closing time, and the shop is open.
Not as fast as bitmaps, but you don't have to worry about resolution, and I can't think of another O(log n) solution that's as elegant.
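For example, with open/close times canonicalized to minutes since midnight:

from bisect import bisect_right

# Alternating open/close times: open 9:00-12:30 and 13:30-18:00
openclosings = [540, 750, 810, 1080]

def is_open(minute):
    # An odd insertion index means we landed between an open and its close
    return bisect_right(openclosings, minute) % 2 == 1

print(is_open(600))  # True: 10:00 is within 9:00-12:30
print(is_open(780))  # False: 13:00 falls in the midday break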
You say you're using SOLR, don't care about storage, and want the lookups to be fast. Then instead of storing open/close tuples, index an entry for every open block of time at the level of granularity you need (15 mins). For the encoding itself, you could use just cumulative hours:minutes.
For example, a store open from 4-5 pm on Monday, would have indexed values added for [40:00, 40:15, 40:30, 40:45]. A query at 4:24 pm on Monday would be normalized to 40:15, and therefore match that store document.
This may seem inefficient at first glance, but it's a relatively small constant penalty for indexing speed and space. And makes the searches as fast as possible.
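A sketch of generating those indexed values (assuming day 0 = Sunday; the cumulative hour count runs across the whole week):

def open_blocks(day, open_hm, close_hm):
    """15-minute blocks as cumulative HH:MM strings for one open period."""
    start = (day * 24 + open_hm[0]) * 60 + open_hm[1]
    end = (day * 24 + close_hm[0]) * 60 + close_hm[1]
    return ["%02d:%02d" % divmod(m, 60) for m in range(start, end, 15)]

# Monday (day 1), open 4-5 pm
print(open_blocks(1, (16, 0), (17, 0)))  # ['40:00', '40:15', '40:30', '40:45']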
Sorry I don't have an easy answer, but I can tell you that as the manager of a development team at a company in the late 90's we were tasked with solving this very problem and it was HARD.
It's not the weekly hours that's tough, that can be done with a relatively small bitmask (168 bits = 1 per hour of the week), the trick is the businesses which are closed every alternating Tuesday.
Starting with a bitmask then moving on to an exceptions field is the best solution I've ever seen.
In your Solr index, instead of indexing each business as one document with hours, index every "retail session" for every business during the course of a week.
For example if Joe's coffee is open Mon-Sat 6am-9pm and closed on Sunday, you would index six distinct documents, each with two indexed fields, "open" and "close". If your units are 15 minute intervals, then the values can range from 0 to 7*24*4. Assuming you have a unique ID for each business, store this in each document so you can map the sessions to businesses.
Then you can simply do a range search in Solr:
open:[* TO N] AND close:[N+1 TO *]
where N is the index of the 15-minute interval that the current time falls into, counted here from Sunday 00:00. For example, if it's 10:10 AM on Wednesday, that is interval 3*96 + 40 = 328, so your query would be:
open:[* TO 328] AND close:[329 TO *]
aka "find a session that starts at or before 10:00am Wed and ends at or after 10:15am Wed"
If you want to include other criteria in your search, such as location or products, you will need to index this with each session document as well. This is a bit redundant, but if your index is not huge, it shouldn't be a problem.
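A sketch of computing N from a datetime, under the same Sunday-00:00 convention (the function name is mine):

import datetime

def interval_index(dt):
    """Nth 15-minute interval of the week, with Sunday 00:00 as interval 0."""
    day = (dt.weekday() + 1) % 7  # weekday() has Monday=0; shift so Sunday=0
    return day * 96 + dt.hour * 4 + dt.minute // 15

n = interval_index(datetime.datetime(2023, 6, 7, 10, 10))  # a Wednesday
print("open:[* TO %d] AND close:[%d TO *]" % (n, n + 1))   # N = 328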
If you can control your data well, I see a simple solution, similar to #Sebastian's. Follow the advice of creating the tuples, except create them of the form [time=startTime, storeId] and [time=endTime, storeId], then sort these in a list. To find out if a store is open, simply do a query like:
select storeId
from table
where time <= '#1'
group by storeId
having count(storeId) % 2 = 1
To optimize this, you could build a lookup table: for each time t, store which stores are open at t, plus the openings/closings between t and t+1 (for whatever granularity of t you choose).
However, this has the drawback of being harder to maintain (overlapping openings/closings need to be merged into a longer open-close period).
Have you looked at how many unique open/close time combinations there are? If there are not that many, make a reference table of the unique combinations and store the index of the appropriate entry against each business. Then you only have to search the reference table and then find the business with those indices.

Python - Hits per minute implementation?

This seems like such a trivial problem, but I can't seem to pin how I want to do it. Basically, I want to be able to produce a figure from a socket server that at any time can give the number of packets received in the last minute. How would I do that?
I was thinking of maybe summing a dictionary that uses the current second as a key, and when receiving a packet it increments that value by one, as well as setting the second+1 key above it to 0, but this just seems sloppy. Any ideas?
A common pattern for solving this in other languages is to let the thing being measured simply increment an integer. Then you leave it to the listening client to determine intervals and frequencies.
So you basically do not let the socket server know about stuff like "minutes", because that's a feature the observer calculates. Then you can also support multiple listeners with different interval resolution.
I suppose you want some kind of ring-buffer structure to do the rolling logging.
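A sketch of that split (the server just counts; an observer samples the counter and takes differences):

import threading
import time

class HitCounter:
    """The socket server only increments; observers derive rates."""
    def __init__(self):
        self._count = 0
        self._lock = threading.Lock()

    def hit(self):
        with self._lock:
            self._count += 1

    def value(self):
        with self._lock:
            return self._count

counter = HitCounter()  # call counter.hit() once per received packet

# Observer side: sample twice, one interval apart
before = counter.value()
time.sleep(60)  # any resolution the listener chooses
print(counter.value() - before, "hits in the last minute")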
When you say the last minute, do you mean the exact preceding 60 seconds, or the last full minute from x:00 to x:59? The latter is easier to implement and would probably give accurate enough results. You keep one prev variable holding the hit count for the previous minute, and a current value that increments on every new hit. You return the value of prev to the users. At the change of the minute, you swap prev with current and reset current.
If you want finer resolution, you could split the minute into 2 to 6 slices, with a variable or list entry for every slice. Say you have 6 slices of 10 seconds, plus an index variable pointing to the current slice (0..5). For every hit you increment a temp variable. When the slice is over, you replace the value of the indexed variable with the value of temp, reset temp, and move the index forward. You return the sum of the slice variables to the users.
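A sketch of that sliced counter, using a deque in place of the index variable (the sum of the completed slices approximates the last minute, lagging by at most one slice):

import time
from collections import deque

class SlicedCounter:
    def __init__(self, slices=6, width=10.0):
        self.width = width
        self.slices = deque([0] * slices, maxlen=slices)
        self.temp = 0
        self.slice_start = time.time()

    def _roll(self):
        # When a slice is over, archive temp and start the next slice
        while time.time() - self.slice_start >= self.width:
            self.slices.append(self.temp)
            self.temp = 0
            self.slice_start += self.width

    def hit(self):
        self._roll()
        self.temp += 1

    def last_minute(self):
        self._roll()
        return sum(self.slices)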
For what it's worth, your implementation above won't work if you don't receive a packet every second, as the next second entry won't necessarily be reset to 0.
Either way, AFAIK the "correct" way to do this, à la log analysis, is to keep a limited record of all the queries you receive. So just chuck the query, time received, etc. into a database, and then a simple database query will give you the usage over the last minute, or any minute in the past. Not sure whether this is too heavyweight for you, though.
