Convert string array to datetime and compare - python

I just started programming with Python, and have some (probably simple) questions. What I would like to do is compare some timestamps to find the closest one that isn't later than now.
Basically, what I am trying to do is get the track currently playing on the radio. They have a feed that shows the next 20 or so tracks with the time each track starts. I want to get what's playing right now!
Here is an example array of strings:
examples = ['2012-12-10 02:06:45', '2012-12-10 02:02:43', '2012-12-10 01:58:53']
Now what I would like to do is find the time closest to now (but not later) to see what's currently playing.
This is my script so far:
import datetime, itertools, time
currentTimeMachine = datetime.datetime.now()
now = currentTimeMachine.strftime("%Y-%m-%d %H:%M:%S")
examples = ['2012-12-10 02:06:45', '2012-12-10 02:02:43', '2012-12-10 01:58:53']
tmsg = examples.strftime('%d%b%Y')
print [x for x in itertools.takewhile( lambda t: now > datetime.datetime.strptime(t, "%Y-%m-%d %H:%M:%S"), examples )][-1]
The last bit there I picked up somewhere else, but I can't seem to get it to work.
Any help would be greatly appreciated!

The other answers have fixed your errors, so your algorithm now runs properly.
But the algorithm itself is wrong. You want to get the closest to the present without going over. But what you've written is:
[x for x in itertools.takewhile(pred, examples)][-1]
Think about what this means. First, takewhile will return examples until one of them fails the predicate. Then you're taking the last one that succeeded. So, if your examples looked like this:
[now-3, now-10, now-5, now+3, now-9, now-1, now+9]
First, takewhile will yield now-3, now-10, now-5 and then stop because pred(now+3) returns False. Then, you take the last one, now-5.
This would work if you sorted the examples in ascending order:
[now-10, now-9, now-5, now-3, now-1, now+3, now+9]
Now takewhile will yield everything up to now-1, so the last thing it yields is the one you want.
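The difference between the two orderings can be checked with plain integers standing in for the timestamps. This is only a minimal sketch: the offsets mirror the lists above, with `now` assumed to be 0, and `before_now` is a made-up stand-in for the predicate.

```python
from itertools import takewhile

now = 0  # stand-in for "now"; negative offsets are in the past


def before_now(offset):
    """Illustrative predicate: True while the offset is in the past."""
    return offset < now


unsorted_offsets = [-3, -10, -5, 3, -9, -1, 9]

# takewhile stops at the first failure (3), so -9 and -1 are never seen.
taken_unsorted = list(takewhile(before_now, unsorted_offsets))

# After sorting, every past value comes before every future one.
taken_sorted = list(takewhile(before_now, sorted(unsorted_offsets)))

print(taken_unsorted[-1])  # -5, not the closest past value
print(taken_sorted[-1])    # -1, the closest past value
```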
But the examples in your initial question were in descending order, and in the comment to Anthony Kong's answer, you added some more that aren't in any order at all. So, you obviously can't rely on them being in sorted order. So, one possible fix is to sort them:
>>> import datetime, itertools, time
>>> currentTimeMachine = datetime.datetime.now()
>>> print [x for x in itertools.takewhile(lambda t: currentTimeMachine > datetime.datetime.strptime(t, "%Y-%m-%d %H:%M:%S"), sorted(examples))][-1]
2012-12-10 02:06:45
Or, to make things a bit more readable, break up that last line, and get rid of the extraneous list comprehension:
>>> exampleDates = [datetime.datetime.strptime(t, "%Y-%m-%d %H:%M:%S") for t in examples]
>>> def beforeNow(t):
...     return currentTimeMachine > t
...
>>> print list(itertools.takewhile(beforeNow, sorted(exampleDates)))[-1]
However, this is kind of a silly way to do things. What you really want is the maximum value in examples that isn't after the present. So just translate that English sentence into code:
>>> print max(x for x in exampleDates if x <= currentTimeMachine)
Let's put it all together:
>>> examples = ['2012-12-10 02:06:45', '2012-12-10 02:02:43', '2012-12-10 01:58:53']
>>> exampleDates = (datetime.datetime.strptime(t, "%Y-%m-%d %H:%M:%S") for t in examples)
>>> currentTimeMachine = datetime.datetime.now()
>>> print max(t for t in exampleDates if t <= currentTimeMachine)
2012-12-10 02:06:45
I used a generator expression rather than a list for exampleDates because you don't actually need the list for anything; you just need to iterate over it once. If you want to keep it around for inspection or repeated use, change the parens to square brackets.
Also, I changed the < to <=, because you said "isn't later than now" rather than "is earlier than now" (in other words, now should count).
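For anyone on Python 3, the same idea might look roughly like the sketch below. The function name `latest_not_after` and the fixed `fixed_now` value are assumptions made for illustration (a fixed "now" keeps the result reproducible); neither comes from the original code.

```python
from datetime import datetime

FORMAT = "%Y-%m-%d %H:%M:%S"


def latest_not_after(timestamps, now):
    """Return the latest timestamp string that is not later than `now`, or None."""
    parsed = ((datetime.strptime(s, FORMAT), s) for s in timestamps)
    candidates = [(dt, s) for dt, s in parsed if dt <= now]
    if not candidates:
        return None  # every timestamp lies in the future
    # Tuples compare by their first element, so max() picks the latest datetime.
    return max(candidates)[1]


examples = ['2012-12-10 02:06:45', '2012-12-10 02:02:43', '2012-12-10 01:58:53']
fixed_now = datetime(2012, 12, 10, 2, 5, 0)  # hypothetical "now"
latest = latest_not_after(examples, fixed_now)
print(latest)  # 2012-12-10 02:02:43
```

Returning None when nothing qualifies avoids the ValueError that max() raises on an empty sequence.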
As a side note, because you happen to have ISO-esque timestamps, you actually can sort them as strings:
>>> now = datetime.datetime.now()
>>> currentTimeMachine = datetime.datetime.strftime(now, "%Y-%m-%d %H:%M:%S")
>>> print max(t for t in examples if t <= currentTimeMachine)
2012-12-10 02:06:45
There's no good reason to do things this way, and it will invariably lead you to bugs when you get timestamps in slightly different formats (e.g., '2012-12-10 02:06:45' compares before '2012-12-10Z01:06:45'), but it isn't actually a problem with your original code.

Since you did not post the error message, I can only go by the code in your post, which has a few issues:
1) tmsg = examples.strftime('%d%b%Y') won't work because you are calling strftime on a list.
2) As others have pointed out already, in the takewhile you're comparing a string with a datetime.
This will work:
>>> import datetime, itertools, time
>>> currentTimeMachine = datetime.datetime.now()
>>> print [x for x in itertools.takewhile( lambda t: currentTimeMachine > datetime.datetime.strptime(t, "%Y-%m-%d %H:%M:%S"), examples )][-1]
2012-12-10 01:58:53

Use datetime.datetime.strptime() to convert a string to a datetime object.
>>> import datetime
>>> examples = ['2012-12-10 02:06:45', '2012-12-10 02:02:43', '2012-12-10 01:58:53']
>>> parsed_datetimes = [datetime.datetime.strptime(e, "%Y-%m-%d %H:%M:%S") for e in examples]
>>> parsed_datetimes
[datetime.datetime(2012, 12, 10, 2, 6, 45), datetime.datetime(2012, 12, 10, 2, 2, 43), datetime.datetime(2012, 12, 10, 1, 58, 53)]
This will then get the minimum difference datetime, "closest to now", from the current datetime now:
>>> now = datetime.datetime.now()
>>> min_time_diff, min_datetime = min((now - i, i) for i in parsed_datetimes)
>>> min_time_diff, min_datetime
(datetime.timedelta(0, 36265, 626000), datetime.datetime(2012, 12, 10, 2, 6, 45))
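One thing to watch for with this approach: if the feed also contains tracks that start in the future, `now - i` is negative for them, and min() will pick one of those. A small sketch of the effect, with a made-up fixed `now` so the numbers are reproducible, and a filter as one possible way around it:

```python
from datetime import datetime

now = datetime(2012, 12, 10, 2, 5, 0)  # hypothetical fixed "now"
parsed_datetimes = [
    datetime(2012, 12, 10, 2, 6, 45),   # 105 seconds in the future
    datetime(2012, 12, 10, 2, 2, 43),   # 137 seconds in the past
    datetime(2012, 12, 10, 1, 58, 53),  # 367 seconds in the past
]

# Unfiltered: the future entry has a negative difference, so min() picks it.
_, unfiltered = min((now - dt, dt) for dt in parsed_datetimes)

# Filtered: only entries at or before `now` are considered.
_, filtered = min((now - dt, dt) for dt in parsed_datetimes if dt <= now)

print(unfiltered)  # 2012-12-10 02:06:45
print(filtered)    # 2012-12-10 02:02:43
```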

Related

Python - Get newest dict value where string = string

I have this code and it works. But I want to get two different files.
file_type returns either NP or KL. So I want to get the NP file with the max value and I want to get the KL file with the max value.
The dict looks like
{"Blah_Blah_NP_2022-11-01_003006.xlsx": "2022-03-11",
"Blah_Blah_KL_2022-11-01_003006.xlsx": "2022-03-11"}
This is my code and right now I am just getting the max date without regard to time. Since the date is formatted how it is and I don't care about time, I can just use max().
I'm having trouble expanding the below code to give me the greatest NP file and the greatest KL file. Again, file_type returns the NP or KL string from the file name.
file_dict = {}
file_path = Path(r'\\place\Report')
for file in file_path.iterdir():
    if file.is_file():
        path_object = Path(file)
        filename = path_object.name
        stem = path_object.stem
        file_type = file_date = stem.split("_")[2]
        file_date = stem.split("_")[3]
        file_dict.update({filename: file_date})
newest = max(file_dict, key=file_dict.get)
return newest
I basically want newest where file_type = NP and also newest where file_type = KL
You could filter the dictionary into two dictionaries (or however many you need if there's more types) and then get the max date for any of those.
But the whole operation can be done efficiently in only a few lines:
from pathlib import Path
from datetime import datetime

def get_newest():
    maxs = {}
    for file in Path(r'./examples').iterdir():
        if file.is_file():
            *_, t, d, _ = file.stem.split('_')
            d = datetime(*map(int, d.split('-')))
            maxs[t] = d if t not in maxs else max(d, maxs[t])
    return maxs

print(get_newest())
This:
collects the maximum date for each type into a dict maxs
loops over the files like you did (but in a location where I created some examples following your pattern)
only looks at the files, like your code
assumes the files all meet your pattern, and splits them over '_', only keeping the next to last part as the date and the part before it as the type
converts the date into a datetime object
keeps whichever is greater, the new date or a previously stored one (if any)
Result:
{'KL': datetime.datetime(2023, 11, 1, 0, 0), 'NP': datetime.datetime(2022, 11, 2, 0, 0)}
The files in the folder:
Blah_Blah_KL_2022-11-01_003006.txt
Blah_Blah_KL_2023-11-01_003006.txt
Blah_Blah_NP_2022-11-02_003051.txt
Blah_Blah_NP_2022-11-01_003006.txt
Blah_Blah_KL_2021-11-01_003006.txt
In the comments you asked
no idea how the above code it getting the diff file types and the max. Is it just looing for all the diff types in general? It's hard to know what each piece is with names like s, d, t, etc. Really lost on *_, t, d, _ = and also d = datetime(*map(int, d.split('-')))
That's a fair point, I prefer short names when I think the meaning is clear, but a descriptive name might have been better. t is for type (and type would be a bad name, shadowing type, so perhaps file_type). d is for date, or dt for datetime might have been better. I don't see s?
The *_, t, d, _ = is called 'extended tuple unpacking', it takes all the results from what follows and only keeps the 3rd and 2nd to last, as t and d respectively, and throws the rest away. The _ takes up a position, but the underscore indicates we "don't care" about whatever is in that position. And the *_ similarly gobbles up all values at the start, as explained in the linked PEP article.
The d = datetime(*map(int, d.split('-'))) is best read from the inside out. d.split('-') just takes a date string like '2022-11-01' and splits it. The map(int, ...) that's applied to the result applies the int() function to every part of that result - so it turns ('2022', '11', '01') into (2022, 11, 1). The * in front of map() spreads the results as parameters to datetime - so, datetime(2022, 11, 1) would be called in this example.
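If it helps, the two lines can be tried step by step outside the loop. The sketch below uses one of the example file names; the longer variable names (`leading`, `file_type`, `date_text`, `time_text`) are only chosen here for readability.

```python
from datetime import datetime

stem = "Blah_Blah_KL_2022-11-01_003006"
parts = stem.split("_")
print(parts)  # ['Blah', 'Blah', 'KL', '2022-11-01', '003006']

# Extended unpacking: the starred name absorbs everything before the last three items.
*leading, file_type, date_text, time_text = parts
print(leading)    # ['Blah', 'Blah']
print(file_type)  # KL
print(date_text)  # 2022-11-01

# map(int, ...) converts each piece; * spreads the pieces as separate arguments.
date_parts = list(map(int, date_text.split("-")))
file_date = datetime(*date_parts)  # same as datetime(2022, 11, 1)
print(file_date)  # 2022-11-01 00:00:00
```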
This is what I both like and hate about Python - as you get better at it, there are very concise (and arguably beautiful - user @ArtemErmakov seems to agree) ways to write clean solutions. But they become hard to read unless you know most of the basics of the language. They're not easy to understand for a beginner, which is arguably a bad feature of a language.
To answer the broader question: since the loop takes each file, gets the type (like 'KL') from it and gets the date, it can then check the dictionary, add the date if the type is new, or if the type was already in the dictionary, update it with the maximum of the two, which is what this line does:
maxs[t] = d if t not in maxs else max(d, maxs[t])
I would recommend you keep asking questions - and whenever you see something like this code, try to break it down into all its small parts, and see what specific parts you don't understand. Python is a powerful language.
As a bonus, here is the same solution, but written a bit more clearly to show what is going on:
from pathlib import Path
from datetime import datetime

def get_newest_too():
    maximums = {}
    for file_path in Path(r'./examples').iterdir():
        if file_path.is_file():
            split_file = file_path.stem.split('_')
            file_type = split_file[-3]
            date_time_text = split_file[-2]
            date_time_parts = (int(part) for part in date_time_text.split('-'))
            date_time = datetime(*date_time_parts)  # spreading is just right here
            if file_type in maximums:
                maximums[file_type] = max(date_time, maximums[file_type])
            else:
                maximums[file_type] = date_time
    return maximums

print(get_newest_too())
Edit: From the comments, it became clear that you had trouble selecting the actual file of each specific type for which the date was the maximum for that type.
Here's how to do that:
from pathlib import Path
from datetime import datetime

def get_newest():
    maxs = {}
    for file in Path(r'./examples').iterdir():
        if file.is_file():
            *_, t, d, _ = file.stem.split('_')
            d = datetime(*map(int, d.split('-')))
            maxs[t] = (d, file) if t not in maxs else max((d, file), maxs[t])
    return {f: d for _, (d, f) in maxs.items()}

print(get_newest())
Result:
{WindowsPath('examples/Blah_Blah_KL_2023-11-01_003006.txt'): datetime.datetime(2023, 11, 1, 0, 0), WindowsPath('examples/Blah_Blah_NP_2022-11-02_003051.txt'): datetime.datetime(2022, 11, 2, 0, 0)}
You could construct another dict containing only the items you need:
file_dict_NP = {key:value for key, value in file_dict.items() if 'NP' in key}
And then do the same thing on it:
newest_NP = max(file_dict_NP, key=file_dict_NP.get)
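If more types turn up later, the per-type filtering could be folded into a single pass over the dict. The sketch below is only one way to do it: `newest_per_type` and its `type_index` parameter are hypothetical names, the sample entries are made up to follow the pattern in the question, and it relies on the dates being YYYY-MM-DD strings so that plain string comparison orders them correctly.

```python
def newest_per_type(file_dict, type_index=2):
    """Map each file type (e.g. 'NP', 'KL') to the file name with the latest date."""
    newest = {}
    for filename, file_date in file_dict.items():
        # The type is taken from the name itself, split on underscores.
        file_type = filename.split("_")[type_index]
        if file_type not in newest or file_date > file_dict[newest[file_type]]:
            newest[file_type] = filename
    return newest


# Made-up sample data following the naming pattern from the question.
file_dict = {
    "Blah_Blah_NP_2022-11-01_003006.xlsx": "2022-11-01",
    "Blah_Blah_NP_2022-11-02_003051.xlsx": "2022-11-02",
    "Blah_Blah_KL_2022-11-01_003006.xlsx": "2022-11-01",
}
result = newest_per_type(file_dict)
print(result["NP"])  # Blah_Blah_NP_2022-11-02_003051.xlsx
print(result["KL"])  # Blah_Blah_KL_2022-11-01_003006.xlsx
```

Splitting on the underscore also avoids a false match if 'NP' happens to appear somewhere else in a file name, which a plain `'NP' in key` test would accept.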

Python doesn't detect that two values are equal and returns false all the time

Sorry, I don't really know how to explain this. I am trying to create a little program that detects if the employees have entered to work at good time.
First of all, with this I converted a float (that represents an hour) to a datetime value:
estabHourF=(float(estabHour)+0.18)
minutes = estabHourF*60
hours, minutes = divmod(minutes, 60)
print("%02d:%02d"%(hours,minutes))
todaysYear = datetime.date.today().year
todaysMonth = datetime.date.today().month
todaysDay =datetime.date.today().day
todaysSeconds = datetime.datetime.now().second
Then, I inserted all that into a list:
nowH = datetime.datetime(todaysYear, todaysMonth, todaysDay, int("%02d"%(hours)), int("%02d"%(minutes)), todaysSeconds)
Then, I created another list that creates like periods of minutes so I can compare now to the established hour to enter later on:
numMinutes = 15
date_list = [nowH - datetime.timedelta(minutes=x) for x in range(numMinutes)]
If I print this, it is like this:
datetime.datetime(2021, 7, 27, 9, 14, 33)
This is how it looks on the console.
Finally, I try to compare the list of approved times to now by trying to imitate how it looks:
for x in range(len(date_list)):
    if (date_list[checkList])=="datetime.datetime({0}, {1}, {2}, {3}, {4}, {5})".format(todaysYear, todaysMonth, todaysDay, int(datetime.datetime.now().strftime("%I")), datetime.datetime.now().minute, todaysSeconds):
        punctual = True
        print("Puntual: ", punctual)
        print(datetime.datetime.now())
    else:
        print("datetime.datetime({0}, {1}, {2}, {3}, {4}, {5})".format(todaysYear, todaysMonth, todaysDay, int(datetime.datetime.now().strftime("%I")), datetime.datetime.now().minute, todaysSeconds))
    checkList+=1
Yeah, if it's wrong, I wanted it to show me the same value that I am comparing, and don't tell me these two aren't the same.
from my list: datetime.datetime(2021, 7, 27, 9, 25, 49)
from the print inside "else": datetime.datetime(2021, 7, 27, 9, 25, 49)
They are exactly the same, but it seems like python doesn't recognize it due to some of my mistakes :')
If you can help me, I would be really really thankful :'3
The "datetime.datetime(2021, 7, 27, 9, 14, 33)" you get when printing it is only a string representation of your datetime object. The datetime object itself is never equal to that string.
You can directly compare two datetimes using comparison operators.
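A quick way to see the difference, using the values from the question (the variable names below are just for illustration):

```python
from datetime import datetime

a = datetime(2021, 7, 27, 9, 25, 49)
b = datetime(2021, 7, 27, 9, 25, 49)
text = "datetime.datetime(2021, 7, 27, 9, 25, 49)"

same_as_text = (a == text)     # a datetime never equals a string
same_repr = (repr(a) == text)  # the printed form is just repr(a)
same_datetime = (a == b)       # two datetimes compare by value

print(same_as_text)   # False
print(same_repr)      # True
print(same_datetime)  # True
print(a <= b)         # True, ordering comparisons work as well
```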
I am also not 100% sure of the purpose of your checklist variable but it seems to me that you are using it to iterate over date_list (I don't see it initialized in your snippet though). If this is the case, why not just use x or even better, you can directly do this:
for date in date_list:
    if date == nowH:
        #[...]
The date variable will in turn take all values from your list directly.
You can define your "latest acceptable datetime":
lateAfter = nowH + datetime.timedelta(minutes=numMinutes)
and then simply use comparison operators for dates by replacing your for with:
if datetime.datetime.now() <= lateAfter:
    punctual = True
    print("Puntual: ", punctual)
    print(datetime.datetime.now())
else: # Not punctual
    print(datetime.datetime.now())
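Wrapped up as a function, the check might look something like this. It is a sketch under assumptions: `is_punctual`, `scheduled` and `grace_minutes` are invented names, and the arrival time is passed in rather than read from the clock so the result can be checked.

```python
from datetime import datetime, timedelta


def is_punctual(arrival, scheduled, grace_minutes=15):
    """True if `arrival` is no later than `scheduled` plus the grace period."""
    return arrival <= scheduled + timedelta(minutes=grace_minutes)


scheduled = datetime(2021, 7, 27, 9, 0, 0)  # hypothetical established hour
on_time = is_punctual(datetime(2021, 7, 27, 9, 14, 33), scheduled)
late = is_punctual(datetime(2021, 7, 27, 9, 25, 49), scheduled)

print(on_time)  # True
print(late)     # False
```

In the real program you would pass datetime.datetime.now() as the arrival time.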
Do I understand correctly that the comparison that does not give you True is this?
if (date_list[checkList])=="datetime.datetime({0}, {1}, {2}, {3}, {4}, {5})".format(todaysYear, todaysMonth, todaysDay, int(datetime.datetime.now().strftime("%I")), datetime.datetime.now().minute, todaysSeconds):
Sorry if I'm making no sense (newbie) but aren't you comparing str with a datetime?

What is an efficient way to test a string for a specific datetime format like "m%/d%/Y%" in python 3.6?

In my Python 3.6 application, my input data can contain datetimes in two different formats:
"datefield":"12/29/2017" or "datefield":"2017-12-31"
I need to make sure that I can handle either datetime format and convert it to (or leave it in) the ISO 8601 format. I want to do something like this:
#python pseudocode
import datetime
if datefield = "m%/d%/Y%":
    final_date = datetime.datetime.strptime(datefield, "%Y-%m-%d").strftime("%Y-%m-%d")
elif datefield = "%Y-%m-%d":
    final_date = datefield
The problem is I don't know how to check the datefield for a specific datetime format in that first if-statement in my pseudocode. I want a true or false back. I read through the Python docs and some tutorials. I did see one or two obscure examples that used try-except blocks, but that doesn't seem like an efficient way to handle this. This question differs from other Stack Overflow posts because I need to handle and validate two different cases, not just one case where I can simply fail it if it does not validate.
You can detect the first style of date by a simple string test, looking for the / separators. Depending on how "loose" you want the check to be, you could check a specific index or scan the whole string with a substring test using the in operator:
if "/" in datefield: # or "if datefield[2] == '/'", or maybe "if datefield[-5] == '/'"
    final_date = datetime.datetime.strptime(datefield, "%m/%d/%Y").strftime("%Y-%m-%d")
Since you'll only ever deal with two date formats, just check for a / or a - character.
import datetime
# M/D/Y
if '/' in datefield:
    # isoformat() on a datetime includes the time, e.g. '2017-12-29T00:00:00'
    final_date = datetime.datetime.strptime(datefield, '%m/%d/%Y').isoformat()
# Y-M-D
elif '-' in datefield:
    final_date = datetime.datetime.strptime(datefield, '%Y-%m-%d').isoformat()
A possible approach is to use the dateutil library. It contains many of the commonest datetime formats and can automagically detect these formats for you.
>>> from dateutil.parser import parse
>>> d1 = "12/29/2017"
>>> d2 = "2017-12-31"
>>> parse(d1)
datetime.datetime(2017, 12, 29, 0, 0)
>>> parse(d2)
datetime.datetime(2017, 12, 31, 0, 0)
NOTE: dateutil is a 3rd party library so you may need to install it with something like:
pip install python-dateutil
It can be found on pypi:
https://pypi.python.org/pypi/python-dateutil/2.6.1
And works with Python2 and Python3.
Alternate Examples:
Here are a couple of alternate examples of how well dateutil handles random date formats:
>>> d3 = "December 28th, 2017"
>>> parse(d3)
datetime.datetime(2017, 12, 28, 0, 0)
>>> d4 = "27th Dec, 2017"
>>> parse(d4)
datetime.datetime(2017, 12, 27, 0, 0)
I went with the advice of @ChristianDean and used the try-except block in an effort to be Pythonic. The first format, %m/%d/%Y, appears a bit more often in my data, so I lead the try-except with that datetime formatting attempt.
Here is my final solution:
import datetime
try:
    final_date = datetime.datetime.strptime(datefield, "%m/%d/%Y").strftime("%Y-%m-%d")
except ValueError:
    final_date = datetime.datetime.strptime(datefield, "%Y-%m-%d").strftime("%Y-%m-%d")
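If a third format ever shows up, the same try-except idea can be written as a loop over candidate formats. This is a sketch, not a drop-in replacement: `to_iso_date` and `KNOWN_FORMATS` are names made up here, and the order of the tuple decides which format is tried first.

```python
from datetime import datetime

# Most common format first, as in the solution above.
KNOWN_FORMATS = ("%m/%d/%Y", "%Y-%m-%d")


def to_iso_date(datefield, formats=KNOWN_FORMATS):
    """Return the date as 'YYYY-MM-DD', trying each known format in turn."""
    for fmt in formats:
        try:
            return datetime.strptime(datefield, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue  # this format did not match, try the next one
    raise ValueError("unrecognized date format: %r" % (datefield,))


print(to_iso_date("12/29/2017"))  # 2017-12-29
print(to_iso_date("2017-12-31"))  # 2017-12-31
```

Anything that matches none of the formats raises ValueError, so bad input is not silently passed through.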

Python Regex: Mixed format string duration to seconds

I have a bunch of time durations in a list as follows
['23m3s', '23:34', '53min 3sec', '2h 3m', '22.10', '1:23:33', ...]
As you can guess, there are N permutations of time formatting being used.
What is the most efficient or simplest way to extract duration in seconds from each element in Python?
This is perhaps still a bit crude, but it seems to do the trick for all the data you've posted so far. The second totals all come to what I would expect. A combination of re and timedelta seems to do the trick for this small sample.
>>> import re
>>> from datetime import timedelta
First a dictionary of regexes: UPDATED BASED ON YOUR COMMENT
d = {'hours': [re.compile(r'(\d+)(?=h)'), re.compile(r'^(\d+)[:.]\d+[:.]\d+')],
     'minutes': [re.compile(r'(\d+)(?=m)'), re.compile(r'^(\d+)[:.]\d+$'),
                 re.compile(r'^\d+[.:](\d+)[.:]\d+')],
     'seconds': [re.compile(r'(\d+)(?=s)'), re.compile(r'^\d+[.:]\d+[.:](\d+)'),
                 re.compile(r'^\d+[:.](\d+)$')]}
Then a function to try out the regexes (perhaps still a bit crude):
>>> def convert_to_seconds(*time_str):
...     timedeltas = []
...     for t in time_str:
...         td = timedelta(0)
...         for key in d:
...             for regex in d[key]:
...                 if regex.search(t):
...                     if key == 'hours':
...                         td += timedelta(hours=int(regex.search(t).group(1)))
...                     elif key == 'minutes':
...                         td += timedelta(seconds=int(regex.search(t).group(1)) * 60)
...                     elif key == 'seconds':
...                         td += timedelta(seconds=int(regex.search(t).group(1)))
...         print(td.seconds)
Here are the results:
>>> convert_to_seconds(*t)
1383
1414
3183
7380
1330
5013
You could add more regexes as you encounter more data, but only to an extent.
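A variant that returns the values instead of printing them could look like the sketch below. It reuses the same patterns; the names `PATTERNS` and `to_seconds` are made up here. It also uses total_seconds() rather than .seconds, since .seconds leaves out whole days and would wrap around for durations of 24 hours or more.

```python
import re
from datetime import timedelta

# Same regexes as above, keyed by the timedelta keyword they feed.
PATTERNS = {
    "hours": [r"(\d+)(?=h)", r"^(\d+)[:.]\d+[:.]\d+"],
    "minutes": [r"(\d+)(?=m)", r"^(\d+)[:.]\d+$", r"^\d+[.:](\d+)[.:]\d+"],
    "seconds": [r"(\d+)(?=s)", r"^\d+[.:]\d+[.:](\d+)", r"^\d+[:.](\d+)$"],
}


def to_seconds(text):
    """Add up every unit that any pattern finds in `text`; return whole seconds."""
    total = timedelta(0)
    for unit, patterns in PATTERNS.items():
        for pattern in patterns:
            match = re.search(pattern, text)
            if match:
                # e.g. timedelta(minutes=23) when unit == "minutes"
                total += timedelta(**{unit: int(match.group(1))})
    return int(total.total_seconds())


durations = ['23m3s', '23:34', '53min 3sec', '2h 3m', '22.10', '1:23:33']
results = [to_seconds(item) for item in durations]
print(results)  # [1383, 1414, 3183, 7380, 1330, 5013]
```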

Convert Military Time from text file to Standard time Python

I am having problems with logic on how to convert military time from a text file to standard time and discard all the wrong values. I have only got to a point where the user is asked for the input file and the contents are displayed from the text file entered. Please help me
Python's datetime.time objects use "military" time. You can do things like this:
>>> t = datetime.time(hour=15, minute=12)
>>> u = datetime.time(hour=16, minute=44)
>>> t = datetime.datetime.combine(datetime.datetime.today(), t)
>>> t
datetime.datetime(2011, 5, 11, 15, 12)
>>> u = datetime.datetime.combine(datetime.datetime.today(), u)
>>> t - u
datetime.timedelta(-1, 80880)
With a little twiddling, conversions like the ones you describe should be pretty simple.
Without seeing any code, it's hard to tell what exactly you want. But I assume you could do something like this:
raw_time = '2244'
converted_time = datetime.time(hour=int(raw_time[0:2]), minute=int(raw_time[2:4]))
converted_time = datetime.datetime.combine(datetime.datetime.today(), converted_time)
Now you can work with converted_time, adding and subtracting timedelta objects. Fyi, you can create a timedelta like so:
td = datetime.timedelta(hours=4)
The full list of possible keyword arguments to the timedelta constructor is here.
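For the "discard all the wrong values" part of the question, one possible sketch is to let strptime do the validation. Everything here is an assumption about the input: that each value is a four-digit string such as '2244', and that "standard time" means a 12-hour clock with AM/PM. The function name `military_to_standard` is made up, and the AM/PM text produced by %p depends on the locale.

```python
from datetime import datetime


def military_to_standard(raw_time):
    """Return e.g. '10:44 PM' for '2244', or None if the value is not valid."""
    raw_time = raw_time.strip()
    # Require exactly four digits so that '244' is not read as 02:44.
    if len(raw_time) != 4 or not raw_time.isdigit():
        return None
    try:
        parsed = datetime.strptime(raw_time, "%H%M")
    except ValueError:
        return None  # out-of-range hour or minute, e.g. '2560'
    return parsed.strftime("%I:%M %p")


lines = ["2244", "0905", "2560", "abcd", "1200"]  # made-up sample input
converted = [t for t in map(military_to_standard, lines) if t is not None]
print(converted)  # ['10:44 PM', '09:05 AM', '12:00 PM'] in the default C locale
```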
from dateutil.parser import parse
time_today = parse("08:00")
from dateutil.relativedelta import relativedelta
required_time = time_today-relativedelta(minutes=35)
print required_time
datetime.datetime(2011, 5, 11, 7, 25)
It's not a true answer like the other two, but the philosophy I use when dealing with dates and times in python: convert to a datetime object as soon as possible after getting from the user, and only convert back into a string when you are presenting the value back to the user.
Most of the date/time math you will need can be done by datetime objects, and their cousins the timedelta objects. Unless you need ratios of timedeltas to other timedeltas, but that's another story.
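As a small illustration of that philosophy, and of the ratio case on Python 3 (where dividing one timedelta by another is supported), here is a sketch with made-up clock-in and clock-out values:

```python
from datetime import datetime, timedelta

# Parse as soon as the strings arrive...
start = datetime.strptime("08:00", "%H:%M")
end = datetime.strptime("15:12", "%H:%M")

# ...do all the math on datetime and timedelta objects...
worked = end - start                        # 7 hours 12 minutes
shift = timedelta(hours=8)
fraction = worked / shift                   # timedelta / timedelta gives a float
whole_hours = worked // timedelta(hours=1)  # floor division gives an int

# ...and only format back to text for display.
label = end.strftime("%H:%M")

print(worked)       # 7:12:00
print(whole_hours)  # 7
print(label)        # 15:12
```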
