csv Pandas datetime convert time to seconds - python

I work with data from a datalogger, and its timestamp format is not recognized as a datetime by pandas.
I would like to convert this timestamp into a format pandas understands and then convert the datetime into seconds, starting at 0.
>>>df.time
0 05/20/2019 19:20:27:374
1 05/20/2019 19:20:28:674
2 05/20/2019 19:20:29:874
3 05/20/2019 19:20:30:274
Name: time, dtype: object
I tried to convert the column from object to datetime64[ns], using %m or %b for the month:
df_time = pd.to_datetime(df["time"], format = '%m/%d/%y %H:%M:%S:%MS')
df_time = pd.to_datetime(df["time"], format = '%b/%d/%y %H:%M:%S:%MS')
Both attempts fail with the error: redefinition of group name 'M' as group 7; was group 5 at position 155
I tried to reduce the data set and remove the milliseconds without success.
df['time'] = pd.to_datetime(df['time'],).str[:-3]
ValueError: ('Unknown string format:', '05/20/2019 19:20:26:383')
Or is it possible to simply subtract the first timestamp from all the other values in the time column?

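Use '%m/%d/%Y %H:%M:%S:%f' as the format instead of '%m/%d/%y %H:%M:%S:%MS'. %Y matches a four-digit year (%y expects two digits) and %f matches the fractional seconds. '%MS' is read as %M (minutes) followed by a literal S, so the minutes directive appears twice, which is what the "redefinition of group name 'M'" error is complaining about.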
Here is the format documentation for future reference
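As a minimal sketch of the whole task from the question (parse the strings, then count seconds from the first row), assuming the column holds strings exactly like the sample shown. The column name seconds is only illustrative:

```python
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({"time": [
    "05/20/2019 19:20:27:374",
    "05/20/2019 19:20:28:674",
    "05/20/2019 19:20:29:874",
    "05/20/2019 19:20:30:274",
]})

# %f consumes the digits after the last colon as fractional seconds
parsed = pd.to_datetime(df["time"], format="%m/%d/%Y %H:%M:%S:%f")

# Subtract the first timestamp, then express each difference in seconds
# (about 0.0, 1.3, 2.5 and 2.9 for this sample)
df["seconds"] = (parsed - parsed.iloc[0]).dt.total_seconds()
print(df)
```

Subtracting the first row answers the last part of the question directly: once the column is a real datetime, the difference is a timedelta and total_seconds() turns it into a float.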

I am not exactly sure what you are looking for, but you can use the format above to parse your data and then drop parts of the result, such as the microseconds, by slicing the string:
from datetime import datetime
date = str(datetime.now())
print(date)
2019-07-28 14:04:28.986601
print(date[11:-7])
14:04:28
time = date[11:-7]
print(time)
14:04:28

Related

time data '42:53.700' does not match format '%H:%M:%S.%f' (match)

I am trying to convert a column from string format to datetime format, but I am getting the following error. Could somebody please help?
The error: time data '42:53.700' does not match format '%H:%M:%S.%f' (match)
Code:
Merge_df['Time'] = pd.to_datetime(Merge_df['Time'], format='%H:%M:%S.%f')
You'll need to clean the data to get a common format before you can parse it to the 'datetime' data type. For example, you can remove the colons and left-pad with zeros, then parse with the appropriate directive:
import pandas as pd
df = pd.DataFrame({'time': ["1:45.333", "45:22.394", "4:55:23.444", "23:44:01.004"]})
df['time'] = pd.to_datetime(df['time'].str.replace(':', '').str.zfill(10), format="%H%M%S.%f")
df['time']
0 1900-01-01 00:01:45.333
1 1900-01-01 00:45:22.394
2 1900-01-01 04:55:23.444
3 1900-01-01 23:44:01.004
Name: time, dtype: datetime64[ns]
Since the data actually looks more like a duration to me, here is how to convert it to the 'timedelta' data type instead, starting again from the original string column. You'll need to ensure the HH:MM:SS.fff format, which is a bit more work:
# ensure common string length
df['time'] = df['time'].str.zfill(12)
# ensure HH:MM:SS.fff format
df['time'] = df['time'].str[:2] + ":" + df['time'].str[3:5] + ":" + df['time'].str[6:]
df['timedelta'] = pd.to_timedelta(df['time'])
df['timedelta']
0 0 days 00:01:45.333000
1 0 days 00:45:22.394000
2 0 days 04:55:23.444000
3 0 days 23:44:01.004000
Name: timedelta, dtype: timedelta64[ns]
The advantage of using timedelta is that you can also handle hours greater than 23.
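If the only irregularity is a missing hours field, another option is to add it back and let pd.to_timedelta parse the rest. The sketch below assumes every value has either one colon (MM:SS.fff) or two (HH:MM:SS.fff); the sample values are made up for illustration:

```python
import pandas as pd

# Illustrative values: two without an hours field, one with hours above 23
s = pd.Series(["42:53.700", "01:45.333", "25:10:00.000"])

# Keep values that already have two colons, prefix "00:" to the others
normalized = s.where(s.str.count(":") == 2, "00:" + s)

durations = pd.to_timedelta(normalized)
seconds = durations.dt.total_seconds()
print(durations)
print(seconds)
```

Counting colons avoids relying on fixed string positions, at the cost of not handling any other layout.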

How to remove the time from datetime of the pandas Dataframe. The type of the column is str and objects, but the value is datetime [duplicate]

I have a variable consisting of 300k records with dates, and each date looks like
2015-02-21 12:08:51
From that date I want to remove the time.
The type of the date variable is pandas.core.series.Series.
This is what I tried:
from datetime import datetime,date
date_str = textdata['vfreceiveddate']
format_string = "%Y-%m-%d"
then = datetime.strftime(date_str,format_string)
some Random ERROR
In the above code, textdata is my dataset name and vfreceiveddate is a variable consisting of dates.
How can I write the code to remove the time from the datetime?
Assuming all your datetime strings are in a similar format then just convert them to datetime using to_datetime and then call the dt.date attribute to get just the date portion:
In [37]:
df = pd.DataFrame({'date':['2015-02-21 12:08:51']})
df
Out[37]:
date
0 2015-02-21 12:08:51
In [39]:
df['date'] = pd.to_datetime(df['date']).dt.date
df
Out[39]:
date
0 2015-02-21
EDIT
If you just want to change the display and keep the datetime dtype, you can call dt.normalize, which sets the time of every value to midnight:
In[10]:
df['date'] = pd.to_datetime(df['date']).dt.normalize()
df
Out[10]:
date
0 2015-02-21
You can see that the dtype remains as datetime:
In[11]:
df.dtypes
Out[11]:
date datetime64[ns]
dtype: object
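To make the trade-off concrete, here is a small sketch comparing the two approaches above with a plain string result. The variable names are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({"date": ["2015-02-21 12:08:51"]})
parsed = pd.to_datetime(df["date"])

# Python date objects: the time is gone, but the column dtype becomes object
as_date = parsed.dt.date

# Still a datetime column: the time is reset to midnight
normalized = parsed.dt.normalize()

# Plain text, useful when only the display matters
as_text = parsed.dt.strftime("%Y-%m-%d")

print(as_date.dtype, normalized.dtype)
print(as_text.iloc[0])
```

Keeping the datetime dtype (normalize) is usually the better fit if you still want to use the .dt accessor or do date arithmetic on the column afterwards.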
You're calling datetime.datetime.strftime, which requires as its first argument a datetime.datetime instance, because it's an unbound method; but you're passing it a string instead of a datetime instance, whence the obvious error.
You can work purely at a string level if that's the result you want; with the data you give as an example, date_str.split()[0] for example would be exactly the 2015-02-21 string you appear to require.
Or, you can use datetime, but then you need to parse the string first, not format it -- hence, strptime, not strftime:
dt = datetime.strptime(date_str, '%Y-%m-%d %H:%M:%S')
date = dt.date()
if it's a datetime.date object you want (but if all you want is the string form of the date, such an approach might be "overkill":-).
Simply writing
date.strftime("%d-%m-%Y") will remove the hours, minutes and seconds from the output (here date must be a datetime or date object, not a string).

Pandas to_datetime not formatting as expected

I have a data frame with a column 'Date' of data type datetime64. The values are in YYYY-MM-DD format.
How can I convert it to YYYY-MM format and still use it as a datetime64 object?
I tried converting my datetime object to a string in YYYY-MM format and then back to a datetime object in YYYY-MM format, but it didn't work.
Original data = 1988-01-01.
Converting the datetime object to a string in YYYY-MM format:
df['Date']=df['Date'].dt.strftime('%Y-%m')
This worked as expected, my column value became
1988-01
Converting the string back to a datetime object in YYYY-MM format:
df['Date']=pd.to_datetime(df['Date'],format= '%Y-%m')
I was expecting the Date column in YYYY-MM format, but it became YYYY-MM-DD format:
1988-01-01
Can you please let me know if I am missing something?
Thanks
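This is expected behaviour: a datetime always needs a year, a month and a day, so a missing day is filled in as 1.
If you want to drop the day, convert to a monthly period with to_period:
# if the column is already datetime64
df['Date'] = df['Date'].dt.to_period('M')
# if the column holds strings such as '1988-01'
df['Date'] = pd.to_datetime(df['Date'],format= '%Y-%m').dt.to_period('M')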
Sample:
df = pd.DataFrame({'Date':pd.to_datetime(['1988-01-01','1999-01-15'])})
print (df)
Date
0 1988-01-01
1 1999-01-15
df['Date'] = df['Date'].dt.to_period('M')
print (df)
Date
0 1988-01
1 1999-01
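A short sketch of what the resulting column is, assuming the same sample data as above: the dtype is a period rather than datetime64, and it can be turned back into timestamps when needed.

```python
import pandas as pd

dates = pd.Series(pd.to_datetime(["1988-01-01", "1999-01-15"]))

# Monthly periods: displayed as YYYY-MM, dtype period[M]
months = dates.dt.to_period("M")
print(months.dtype)

# Converting back yields the first day of each month as a datetime
back = months.dt.to_timestamp()
print(back)
```

Note that the round trip loses the original day, so 1999-01-15 comes back as 1999-01-01.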

Conversion of set of numbers into Date Format using Python

I have a dataframe named 'train' with a column ID that represents the date in a very unusual manner.
For example, the ID value 2013043002 represents the date 30/04/2013 02:00:00.
The first 4 digits represent the year, the next two pairs of digits represent the month and the day respectively, and the last two digits represent the hour.
So I want to convert this into a proper datetime format to perform time series analysis.
Use to_datetime with the format parameter (see http://strftime.org/ for the directives):
df = pd.DataFrame({'ID':[2013043002,2013043002]})
df['ID'] = pd.to_datetime(df['ID'], format='%Y%m%d%H')
print(df)
ID
0 2013-04-30 02:00:00
1 2013-04-30 02:00:00
print(df['ID'].dtype)
datetime64[ns]
Use datetime for date-time manipulations (d must be a string, e.g. d = "2013043002"):
from datetime import datetime
datetime.strptime(d,"%Y%m%d%H").strftime("%d/%m/%Y %H:%M:%S")
First, if the ID always has the same layout, you could simply slice it as a string:
# slicing only works on strings, so convert the number first
Id = str(2013043002)
Year = Id[0:4]
Month = Id[4:6]
Day = Id[6:8]
Time = Id[8:10]
DateFormat = "{}-{}-{}".format(Day, Month, Year)
TimeFormat = "{}:00:00".format(Time)
print(DateFormat, TimeFormat)
Output:
30-04-2013 02:00:00
You could then wrap this in a function and apply it to every ID in a loop.
Of course, if you do not know the incoming ID format in advance, you should use the datetime module instead and rely on string formatting to display the result in the order you want.
With the datetime module you can do this easily using the strptime function (ID must be a string, so convert it with str() if it is an integer):
import datetime
my_date = datetime.datetime.strptime(str(ID), "%Y%m%d%H")
"%Y%m%d%H" is the format of your date: %Y is the year, %m is the month (zero-padded), %d is the day (zero-padded) and %H is the hour (24-hour clock, zero-padded). See http://strftime.org/ for more.
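One way to wrap the strptime call in a reusable function could look like the sketch below. The helper name parse_id and the second sample ID are hypothetical, and the IDs are assumed to always follow the YYYYMMDDHH layout:

```python
from datetime import datetime

def parse_id(raw_id):
    """Hypothetical helper: turn an ID such as 2013043002 into a datetime."""
    # str() lets the function accept both integers and strings
    return datetime.strptime(str(raw_id), "%Y%m%d%H")

ids = [2013043002, 2013043023]
parsed = [parse_id(i) for i in ids]

# Display in the day/month/year order used in the question
print(parsed[0].strftime("%d/%m/%Y %H:%M:%S"))
```

Unlike manual slicing, strptime raises a ValueError for an ID that does not form a valid date, which can help catch malformed rows early.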

Extract Date from excel and append it in a list using python

I have a column in Excel which has dates in the format "17-12-2015 19:35". How can I extract the first 2 digits as integers and append them to a list? In this case I need to extract 17 and append it to a list. Can it also be done using pandas?
Code thus far:
import pandas as pd
Location = r'F:\Analytics Materials\files\paymenttransactions.csv'
df = pd.read_csv(Location)
time = df['Creation Date'].tolist()
print (time)
You could extract the day of each timestamp like this:
from datetime import datetime
import pandas as pd
location = r'F:\Analytics Materials\files\paymenttransactions.csv'
df = pd.read_csv(location)
timestamps = df['Creation Date'].tolist()
dates = [datetime.strptime(timestamp, '%d-%m-%Y %H:%M') for timestamp in timestamps]
days = [date.strftime('%d') for date in dates]
print(days)
The '%d-%m-%Y %H:%M' and '%d' bits are format specifiers that describe how your timestamp is formatted. See the strftime/strptime section of the Python datetime documentation for a complete list of directives.
datetime.strptime parses a string into a datetime object using such a specifier. dates will thus hold a list of datetime instances instead of strings.
datetime.strftime does the opposite: it turns a datetime object into a string, again using a format specifier. %d simply instructs strftime to output only the day of a date. Note that the result is a string such as '17'; use date.day instead if you need an integer.
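Since the question also asks about pandas, here is a minimal sketch that yields the days as integers. It uses a small in-memory frame instead of the CSV file from the question, and the second sample value is made up for illustration:

```python
import pandas as pd

# Stand-in for the 'Creation Date' column read from the CSV
df = pd.DataFrame({"Creation Date": ["17-12-2015 19:35", "03-01-2016 08:05"]})

# Parse day-first timestamps, then take the day component as integers
parsed = pd.to_datetime(df["Creation Date"], format="%d-%m-%Y %H:%M")
days = parsed.dt.day.tolist()
print(days)  # [17, 3]
```

The same parsed column also exposes .dt.month, .dt.year and .dt.hour if other components are needed.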
