Correct reading of datetime with AM/PM format - python

I'm trying to read some CSV files that contain a column called 'timestamp' with this format:
7/6/2022 7:30:00 PM, which should translate to mm/dd/YYYY hh:mm:ss with an AM/PM marker. After reading the CSV file, I tried:
df['timestamp']= pd.to_datetime(df['timestamp'],format='%m/%d/%Y %I:%M:%S %p')
It raises this error, which shows a completely different value:
ValueError: time data '07-06 19:30' does not match format '%m/%d/%Y %I:%M:%S %p' (match)
The value '07-06 19:30' is also what appears when I read the CSV directly with no formatting, which is strange because the full date is there when I open the file. I'm a bit lost, as it seems I cannot convert the date.
Thanks

The format='%m/%d/%Y %I:%M:%S %p' should work; make sure you read your data as strings.
That said, pandas can usually infer this format on its own. The only ambiguity to resolve is telling it that the first number is not the day:
df['new_timestamp'] = pd.to_datetime(df['timestamp'], dayfirst=False)
example:
             timestamp       new_timestamp
0  7/6/2022 7:30:00 PM 2022-07-06 19:30:00
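As a quick sanity check of the format string outside pandas, here is a minimal sketch that uses only the standard library. The inline CSV text and the variable names are made up for illustration; the point is that the field arrives as a plain string, so the 12-hour directives %I and %p can match it.

```python
import csv
import io
from datetime import datetime

# Hypothetical one-row CSV standing in for the real file.
csv_text = "timestamp,value\n7/6/2022 7:30:00 PM,1\n"

FMT = "%m/%d/%Y %I:%M:%S %p"  # month/day/year, 12-hour clock with AM/PM

rows = list(csv.DictReader(io.StringIO(csv_text)))
# The csv module yields every field as str, so strptime sees the original text.
parsed = [datetime.strptime(row["timestamp"], FMT) for row in rows]

print(parsed[0])  # 2022-07-06 19:30:00
```

If this works but the pandas call fails, the column has probably been altered on the way in, which is what the value '07-06 19:30' in the error suggests.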

Related

Converting date into '%Y-%m-%d' using strptime and strftime using Python

I have a .csv file with a date format I am unfamiliar with (I think from reading around it's unix).
1607299200000,string,int,float
1607385600000,string,int,float
1606953600000,string,int,float
I have been trying to convert it into '%Y-%m-%d' using Python but keep getting various errors. I am fairly novice with Python but this is what I have so far:
outRow.append(datetime.datetime.strptime(row[0],'%B %d %Y').strftime('%Y-%m-%d'))
Any help would be appreciated.
import datetime
timestamp = datetime.datetime.fromtimestamp(1500000000)
print(timestamp.strftime('%Y-%m-%d %H:%M:%S'))
Output (fromtimestamp uses local time, so the exact value depends on your time zone):
2017-07-14 08:10:00
The tricky part here is that you have a Unix timestamp in milliseconds, not seconds.
AFAIK there's no option to pass a millisecond timestamp to datetime directly.
So first you have to drop the milliseconds (integer-divide by 1000), then add them back if needed.
import datetime
ts = int(row[0])  # CSV fields are strings, so convert to int first
dt = datetime.datetime.utcfromtimestamp(ts // 1000).replace(microsecond=ts % 1000 * 1000)
Then you can strftime to whichever format you need.
Though if you need to run this operation over the entire CSV, you'd better look into pandas, but that's out of the scope of this question.
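For the sample rows in the question, a minimal standard-library sketch might look like the following. The helper name ms_to_date_string is hypothetical, and the result is interpreted as UTC, which is an assumption about what the file's timestamps mean.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def ms_to_date_string(ms_text):
    """Convert a millisecond Unix timestamp (given as text) to 'YYYY-MM-DD' in UTC."""
    ms = int(ms_text)
    # timedelta keeps the millisecond part exactly, so no float division is needed.
    return (EPOCH + timedelta(milliseconds=ms)).strftime("%Y-%m-%d")

for raw in ["1607299200000", "1607385600000", "1606953600000"]:
    print(ms_to_date_string(raw))
# 2020-12-07
# 2020-12-08
# 2020-12-03
```

Adding a timedelta to the epoch avoids both the float rounding of ts / 1000 and the separate replace(microsecond=...) step.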

Pandas to_datetime error 'unconverted data remains'

I'm trying to convert a date column in my Pandas DataFrame to datetime format. If I don't specify the date format, it works fine, but then further along in the code I get issues because of different time formats.
The original dates look like this: 10/10/2019 6:00, in European date format.
I tried specifying format like so:
df['PeriodStartDate'] = pd.to_datetime(df['PeriodStartDate'],
format="%d/%m/%Y")
which results in an error: unconverted data remains 6:00
I then tried to update the format directive to format="%d/%m/%Y %-I/%H", which comes up with another error: '-' is a bad directive in format '%d/%m/%Y %-I/%H', even though I thought that to_datetime uses the same directives as strftime, and in the latter %-I is allowed.
In frustration I then decided to chop off the end of the string since I don't really need hours and minutes:
df['PeriodStartDate'] = df['PeriodStartDate'].str[:10]
df['PeriodStartDate'] = pd.to_datetime(df['PeriodStartDate'],
format="%d/%m/%Y")
But this once again results in an error: ValueError: unconverted data remains: which of course comes from the fact that some dates are only 9 characters long, like 3/10/2019 in 3/10/2019 6:00, so the slice keeps a trailing space.
Not quite sure where to go from here.
The format %H:%M would work (don't forget the : in between):
pd.to_datetime('10/10/2019 6:00', format="%d/%m/%Y %H:%M")
Out[1049]: Timestamp('2019-10-10 06:00:00')
pd.to_datetime('3/10/2019 18:00', format="%d/%m/%Y %H:%M")
Out[1064]: Timestamp('2019-10-03 18:00:00')
Oh, I feel so dumb. I figured out what the issue was. For some reason I thought that hours were in a 12-hour format, but they were in fact in a 24-hour format, so changing directive to "%d/%m/%Y %H:%M" solved it.
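The same directives can be tried with the standard library's strptime, which accepts unpadded values such as 3 for %d and 6 for %H, so no %-I style flag is needed when parsing. This is a minimal sketch with two sample strings taken from the thread.

```python
from datetime import datetime

FMT = "%d/%m/%Y %H:%M"  # day first, 24-hour clock

samples = ["10/10/2019 6:00", "3/10/2019 18:00"]
parsed = [datetime.strptime(text, FMT) for text in samples]

for value in parsed:
    print(value)
# 2019-10-10 06:00:00
# 2019-10-03 18:00:00
```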

Python convert string to datetime in variable formats

I am generating date and time information which is string from an API. The generated string is in your system's date/time format by default (Win 10 in my case). For example if you are using MM/DD/YYYY HH:MM:SS tt in your computer, the generated string would be something like "05/07/2019 06:00:00 AM".
For comparison purposes, I then convert the string to datetime format by using datetime.datetime.strptime(i,"%m/%d/%Y %I:%M:%S %p"). This works perfectly fine; however, if someone else whose system date/time format is different from 'MM/DD/YYYY HH:MM:SS tt' runs my script, they would get a mismatch error, as the string can no longer be converted by %m/%d/%Y %I:%M:%S %p.
So would it be possible to make the desired datetime format become a variable argument in the strptime function? Or even simpler, just make the format to be the same as the system's date/time format.
You can use try and except to fix this (although there might be another way)
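import datetime

try:
    parsed = datetime.datetime.strptime(i, "%m/%d/%Y %I:%M:%S %p")
except ValueError:  # strptime raises ValueError on a mismatch, not TypeError
    try:
        parsed = datetime.datetime.strptime(i, "%d/%m/%Y %I:%M:%S %p")
    except ValueError:
        parsed = datetime.datetime.strptime(i, "%d/%m/%y %I:%M:%S %p")
        # and so on....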
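The nested try/except idea can also be written as a loop over candidate formats, which stays flat as more formats are added. This is a sketch, not the only way to do it: the function name parse_any and the contents of CANDIDATE_FORMATS are assumptions for illustration.

```python
from datetime import datetime

# Candidate formats, tried in order. The list is an assumption for illustration.
CANDIDATE_FORMATS = [
    "%m/%d/%Y %I:%M:%S %p",
    "%d/%m/%Y %I:%M:%S %p",
    "%d/%m/%y %I:%M:%S %p",
]

def parse_any(text, formats=CANDIDATE_FORMATS):
    """Return the first successful parse; raise ValueError if none match."""
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # this format did not fit; try the next one
    raise ValueError(f"no known format matches {text!r}")

print(parse_any("05/07/2019 06:00:00 AM"))  # 2019-05-07 06:00:00
print(parse_any("25/12/2019 06:00:00 PM"))  # 2019-12-25 18:00:00
```

Note that order matters: a string like 05/07/2019 fits both the month-first and the day-first format, and the first match wins, so this cannot tell May 7 from July 5 on its own.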

Can we Auto detect Datetime format of given column from csv file?

Suppose I have a csv with a timestamp but the format is not defined. It can be of any format with any separator like -
mm/dd/yyyy hh:mm or dd/mm/yyyy hh:mm:ss or mm-dd-yyyy hh:mm or dd-mm-yyyy hh:mm:ss or just like that.
I am trying to parse dates of any format.
Here:
dateparse = lambda dates: datetime.strptime(dates, '%m/%d/%Y %H:%M')
We have defined to parse dates in this format: %m/%d/%Y %H:%M
If anyone can give any valuable suggestion then it will be helpful.
pandas.read_csv has an infer_datetime_format parameter:
infer_datetime_format : boolean, default False
If True and parse_dates is enabled, pandas will attempt to infer the format of the datetime strings in the columns, and if it can be inferred, switch to a faster method of parsing them. In some cases this can increase the parsing speed by ~5-10x.
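Note that infer_datetime_format is deprecated as of pandas 2.0, where a stricter form of this inference became the default. When the set of possible layouts is known in advance, one alternative is to test each candidate against the whole column and keep the first that fits every value. The sketch below uses only the standard library; detect_format and the CANDIDATES list are hypothetical names chosen for this example.

```python
from datetime import datetime

# Hypothetical candidate list; extend it to cover the layouts you expect.
CANDIDATES = [
    "%m/%d/%Y %H:%M",
    "%d/%m/%Y %H:%M:%S",
    "%m-%d-%Y %H:%M",
    "%d-%m-%Y %H:%M:%S",
]

def detect_format(values, candidates=CANDIDATES):
    """Return the first candidate format that parses every value, else None."""
    for fmt in candidates:
        try:
            for value in values:
                datetime.strptime(value, fmt)
        except ValueError:
            continue  # at least one value did not fit; try the next format
        return fmt
    return None

column = ["13-01-2021 10:00:00", "02-03-2021 11:30:00"]
print(detect_format(column))  # %d-%m-%Y %H:%M:%S
```

Checking the whole column rather than a single value helps with the month/day ambiguity: one row with a day above 12 is enough to rule out the wrong order. If no such row exists, the first matching candidate is returned, so the order of the list still matters.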

Grabbing & displaying the current time

I'm very new to python and trying to build a simple web app in pieces.
I'm using the datetime library for the first time so please be patient with me.
All I'm trying to do is to get and display the current time and date so that I can cross-reference it with a target time & date later.
I'm getting some colossal errors. Any help is appreciated. Not sure what I'm doing incorrectly here to display the time formatted the way I want.
from datetime import datetime
date_string = "4:21 PM 1.24.2011"
format = "%I.%M %p %m %d, %Y"
my_date = datetime.strptime(date_string, format)
print(my_date.strftime(format))
The format of the date_string doesn't match the format you're trying to parse it with. The following format string should allow you to parse the date.
format = "%I:%M %p %m.%d.%Y"
And afterwards, if you want to print it using the other format
print(my_date.strftime("%I.%M %p %m %d, %Y"))
You're using the wrong format string. Try replacing it with "%I:%M %p %m.%d.%Y".
See the documentation on how to use the datetime class properly.
The problem is with your format. You need to make the format match date_string. So try this:
format = "%I:%M %p %m.%d.%Y"
That should do the trick
Also, it might be of interest to you to take a look at time.asctime
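Putting the answers together, a minimal sketch could keep one format for parsing and another for display, and use datetime.now() for the current time. The names parse_fmt and display_fmt are chosen here to avoid shadowing the built-in format.

```python
from datetime import datetime

date_string = "4:21 PM 1.24.2011"
parse_fmt = "%I:%M %p %m.%d.%Y"     # must mirror date_string exactly
display_fmt = "%I.%M %p %m %d, %Y"  # free choice for output

my_date = datetime.strptime(date_string, parse_fmt)
print(my_date.strftime(display_fmt))  # 04.21 PM 01 24, 2011

# The current time needs no parsing at all.
now = datetime.now()
print(now.strftime(display_fmt))
```

Because both values are datetime objects, the target time and the current time can be compared directly, for example with my_date < now.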
