I have datetime values in the format yyyy-mm-dd hh:mm:ss in a single column, and I want to split the date and time into separate columns. Can anyone help?
df_train['Time1'] = df_train['server_time'].apply(lambda x : x.split(' ')[1])
When I apply this code, I get the error "list index out of range".
This will help you.
df_train['date'] = [d.date() for d in df_train['server_time']]
df_train['time'] = [d.time() for d in df_train['server_time']]
You can get the date and time with the pandas .dt accessor, once the column has a datetime dtype:
df_train['server_time'] = pd.to_datetime(df_train['server_time'])
df_train['Date'] = df_train['server_time'].dt.date
df_train['Time'] = df_train['server_time'].dt.time
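If server_time is still stored as strings, the .dt accessor is not available until the column is converted. Here is a minimal sketch, assuming a small made-up frame (the sample values are illustrative, not from the question). It uses errors='coerce' so that a malformed entry becomes NaT instead of raising:

```python
import pandas as pd

# Hypothetical sample data; the last value is deliberately malformed.
df_train = pd.DataFrame(
    {"server_time": ["2021-03-04 10:15:30", "2021-03-05 23:59:01", "not a date"]}
)

# Parse the strings to datetime64; unparseable values become NaT.
parsed = pd.to_datetime(
    df_train["server_time"], format="%Y-%m-%d %H:%M:%S", errors="coerce"
)

# Split the parsed values into separate date and time columns.
df_train["Date"] = parsed.dt.date
df_train["Time"] = parsed.dt.time
print(df_train)
```

Checking parsed.isna() afterwards is one way to find the rows that did not match the expected format.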
So, I have a dataframe (mean_df) with a very messy date column. It's messy because it is in this format: 1/1/2018, 1/2/2018, 1/3/2018..., when it should be 01/01/2018, 02/01/2018, 03/01/2018... Not only is the format wrong, but the column is ordered by the first day of every month, then the second day of every month, and so on.
So I wrote this code to fix the format:
mean_df["Date"] = mean_df["Date"].astype('datetime64[ns]')
mean_df["Date"] = mean_df["Date"].dt.strftime('%d-%m-%Y')
Then, from displaying this:
It's now showing this (I have to run the same cell 3 times to make it work; it always throws an error the first time):
Finally, in the last few hours I've been trying to sort the 'Dates' column, in an ascending way, but it keeps sorting it the wrong way:
mean_df = mean_df.sort_values(by='Date') # I tried this
But this is the output:
As you can see, it is still ascending prioritizing days.
Can someone guide me in the right direction?
Thank you in advance!
Convert it to a proper datetime column first:
mean_df["sort_date"] = pd.to_datetime(mean_df["Date"],format = '%d/%m/%Y')
mean_df = mean_df.sort_values(by='sort_date') # Try this now
You should sort the dates right after converting them to datetime, because dt.strftime converts the datetime values back to strings:
mean_df["Date"] = pd.to_datetime(mean_df["Date"], dayfirst=True)
mean_df = mean_df.sort_values(by='Date')
mean_df["Date"] = mean_df["Date"].dt.strftime('%d-%m-%Y')
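To see why the order of operations matters, here is a small sketch on made-up day-first strings (assumed for illustration, not the asker's real data). It compares sorting the raw strings with sorting the converted datetime values:

```python
import pandas as pd

# Illustrative day-first date strings (assumed, not the asker's real data).
mean_df = pd.DataFrame({"Date": ["1/2/2018", "2/1/2018", "1/1/2018", "10/1/2018"]})

# Sorting the raw strings compares characters, not calendar dates.
string_sorted = mean_df.sort_values(by="Date")["Date"].tolist()

# Convert first, sort on the datetime values, and format only for display.
mean_df["Date"] = pd.to_datetime(mean_df["Date"], format="%d/%m/%Y")
mean_df = mean_df.sort_values(by="Date").reset_index(drop=True)
mean_df["Display"] = mean_df["Date"].dt.strftime("%d-%m-%Y")

print(string_sorted)
print(mean_df)
```

Keeping the datetime column and adding a separate display column means later sorts or filters still work on real dates.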
Here is my sample code.
import pandas as pd
df = pd.DataFrame()
df['Date'] = "1/1/2018, 1/2/2018, 1/3/2018".split(", ")
df['Date1'] = pd.to_datetime(df['Date'], format='%d/%m/%Y')
df['Date2'] = df['Date1'].dt.strftime('%d/%m/%Y')
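df = df.sort_values(by='Date1')  # sort on the datetime column, not the formatted strings
First, I convert Date to datetime format. As far as I can see, your data follows the '%d/%m/%Y' format. Sort on that datetime column. If you want to show the data in another form, format it afterwards, for example: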
df['Date2'] = df['Date1'].dt.strftime('%d/%m/%Y')
I have the following list of strings that looks like this
dates_list = ['7.0:2.0', '6.0:32.0', '6.0:16.0', '6.0:16.0', '6.0:17.0', '6.0:2.0', '6.0:1.0']
I want to plot this into a graph but I need to convert into date time of minutes and seconds (%M:%S)
I tried using strptime
for i in dates_list:
    datetime_object = datetime.strptime(i,'%M:%S')
I get the following error:
ValueError: time data '7.0:2.0' does not match format '%M:%S'
Is there a way to handle floats or do I have to convert the strings to int and remove the decimals?
This handles any decimal part by putting it into the format string as literal text:
from datetime import datetime
for i in dates_list:
    time_list = i.split(':')
    minute_decimal = time_list[0].split('.')[1]
    second_decimal = time_list[1].split('.')[1]
    datetime_object = datetime.strptime(i,'%M.{0}:%S.{1}'.format(minute_decimal,second_decimal)).time()
Try this; you need to specify the format correctly:
import datetime as dt
dates_list = ['7.0:2.0', '6.0:32.0', '6.0:16.0', '6.0:16.0', '6.0:17.0', '6.0:2.0', '6.0:1.0']
for i in dates_list:
    datetime_object = dt.datetime.strptime(i,'%M.0:%S.0').time()
    print(datetime_object)
I would do this with pd.to_datetime(). However, it would add unnecessary date information which you can then remove with strftime:
[x.strftime("%M:%S") for x in pd.to_datetime(dates_list,format='%M.0:%S.0')]
Returning:
['07:02', '06:32', '06:16', '06:16', '06:17', '06:02', '06:01']
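If the values are really durations rather than clock times, another option is to parse the floats directly and build timedelta objects. This is only a sketch under that assumption; to_timedelta is a hypothetical helper name, not something from the question:

```python
from datetime import timedelta

dates_list = ['7.0:2.0', '6.0:32.0', '6.0:16.0', '6.0:16.0', '6.0:17.0', '6.0:2.0', '6.0:1.0']

def to_timedelta(value):
    """Turn a 'minutes:seconds' string with float parts into a timedelta."""
    minutes, seconds = (float(part) for part in value.split(':'))
    return timedelta(minutes=minutes, seconds=seconds)

durations = [to_timedelta(value) for value in dates_list]

# Plain numbers are often the easiest thing to hand to a plotting library.
total_seconds = [d.total_seconds() for d in durations]
print(total_seconds)
```

This also copes with non-zero decimals such as '6.5:2.25', because float() does the parsing instead of a fixed format string.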
I'm generating a list of random dates using Datetime and need to display as dd/mm/yy (eg 24 March 20 is 24/03/20). I can get this sorted with strftime, however it breaks the sort as it goes left to right so it's taking the day and sorting in that order.
It seems like overkill to get a datetime object, convert into a string for formatting, then convert that string back into a datetime object to sort.
How can I sort this list for dates correctly?
Thanking you in advance!
import random
from datetime import datetime, timedelta
user_test_date = datetime.strptime("12/12/21", '%d/%m/%y')
ledger = []
''' Create date for transaction '''
def date_gen():
    date_step = random.randrange(1, 60) # Set range of 2 months
    raw_date = user_test_date + timedelta(days=-date_step) # Alter days value each loop
    date = raw_date.strftime('%w %b %y') #Change format of date
    ledger.append(date)
    ledger.sort(key=lambda item: item[0], reverse=True) #Sort list of dates
for i in range(10):
    date_gen()
print(ledger)
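Here:
date = raw_date.strftime('%w %b %y') #Change format of date
you convert the datetime object to a str, so the list is sorted as text. Instead, skip this line, append raw_date itself to ledger, and replace the last line of date_gen with ledger.sort(). Format the dates with strftime only when you display them.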
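Put together, that suggestion could look like the sketch below. It keeps the question's names but moves the sort out of date_gen and assumes the dd/mm/yy display format the asker describes:

```python
import random
from datetime import datetime, timedelta

user_test_date = datetime.strptime("12/12/21", "%d/%m/%y")
ledger = []

def date_gen():
    """Append one random date from the 59 days before user_test_date."""
    date_step = random.randrange(1, 60)
    ledger.append(user_test_date - timedelta(days=date_step))

for _ in range(10):
    date_gen()

# Sort the datetime objects themselves, newest first.
ledger.sort(reverse=True)

# Convert to strings only at display time.
formatted = [d.strftime("%d/%m/%y") for d in ledger]
print(formatted)
```

The list holds datetime objects the whole time, so no round trip through strings is needed to sort it.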
I'm trying to calculate the difference between time values stored as strings, but I can't parse the microseconds part. Why do I get this error, and how can I fix my code?
I have already tried the datetime.strptime method to convert the strings to times, and then the pandas DataFrame.diff method to calculate the difference between each item in the list and create a column in Excel for it.
```
from datetime import datetime
import pandas as pd
for itemz in time_list:
    df = pd.DataFrame(datetime.strptime(itemz, '%H %M %S %f'))
    ls_cnv.append(df.diff())
df = pd.DataFrame(time_list)
ls_cnv = [df.diff()]
print (ls_cnv)
```
I expect the output to be
ls_cnv = [NaN, 00:00:00, 00:00:00]
time_list = ['10:54:05.912783', '10:54:05.912783', '10:54:05.912783']
but instead I get this error: time data '10:54:05.906224' does not match format '%H %M %S %f'
You get the error because the format string passed to strptime does not match your data. The values in time_list use colons and a dot, so the format has to be:
datetime.strptime(itemz, '%H:%M:%S.%f')
You are also building the DataFrame the wrong way. A DataFrame is a table of data, and pd.DataFrame cannot be built from a single datetime value. On top of that, the following lines create a new df on every iteration, replacing the previous one, from just one element of your list at a time. So even with a valid constructor call, the first iteration would hold only '10:54:05.912783', and diff() would have no other value to compare it with.
for itemz in time_list:
    df = pd.DataFrame(datetime.strptime(itemz, '%H %M %S %f'))
    ls_cnv.append(df.diff())
Maybe what you wanted to do is the following:
from datetime import datetime
import pandas as pd
ls_cnv = []
time_list = ['10:54:03.912743', '10:54:05.912783', '10:44:05.912783']
df = pd.to_datetime(time_list)
data = pd.DataFrame({'index': range(len(time_list))}, index=df)
a = pd.Series(data.index).diff()
ls_cnv.append(a)
print (ls_cnv)
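For comparison, here is a sketch of the same idea without pandas, using only datetime.strptime with the colon-and-dot format. Using None for the first entry is an assumption standing in for the NaN in the expected output:

```python
from datetime import datetime

# Sample values in the same shape as the question's time_list.
time_list = ['10:54:03.912743', '10:54:05.912783', '10:44:05.912783']

# The format must mirror the separators in the data: colons and a dot.
parsed = [datetime.strptime(item, '%H:%M:%S.%f') for item in time_list]

# Difference between each value and the one before it.
# The first value has no predecessor, so it gets None.
diffs = [None] + [later - earlier for earlier, later in zip(parsed, parsed[1:])]
print(diffs)
```

Each difference is a timedelta, so an earlier time following a later one produces a negative duration.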
The error occurs simply because your format string must include the colons and the dot, like this:
"%H:%M:%S.%f"
I have a column in Excel which has dates in the format "17-12-2015 19:35". How can I extract the first 2 digits as integers and append them to a list? In this case I need to extract 17 and append it to a list. Can this also be done using pandas?
Code thus far:
import pandas as pd
Location = r'F:\Analytics Materials\files\paymenttransactions.csv'
df = pd.read_csv(Location)
time = df['Creation Date'].tolist()
print (time)
You could extract the day of each timestamp like this:
from datetime import datetime
import pandas as pd
location = r'F:\Analytics Materials\files\paymenttransactions.csv'
df = pd.read_csv(location)
timestamps = df['Creation Date'].tolist()
dates = [datetime.strptime(timestamp, '%d-%m-%Y %H:%M') for timestamp in timestamps]
days = [date.strftime('%d') for date in dates]
print(days)
The '%d-%m-%Y %H:%M' and '%d' bits are format specifiers that describe how your timestamp is formatted. See the Python documentation on strftime() and strptime() format codes for a complete list of directives.
datetime.strptime parses a string into a datetime object using such a specifier. dates will thus hold a list of datetime instances instead of strings.
datetime.strftime does the opposite: it turns a datetime object into a string, again using a format specifier. %d simply instructs strftime to output only the day of a date. Note that the result is a string such as '17'; use date.day instead if you need an integer.
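Since the question also asks about pandas, here is a minimal sketch that stays in pandas and returns integers. The small frame is a hypothetical stand-in for the 'Creation Date' column read from the CSV:

```python
import pandas as pd

# Hypothetical stand-in for the 'Creation Date' column read from the CSV.
df = pd.DataFrame({"Creation Date": ["17-12-2015 19:35", "03-01-2016 08:05"]})

# Parse with an explicit day-first format so 03-01 is read as 3 January.
parsed = pd.to_datetime(df["Creation Date"], format="%d-%m-%Y %H:%M")

# .dt.day gives the day of the month as a number; tolist() yields plain ints.
days = parsed.dt.day.tolist()
print(days)
```

With the real file, the same two lines would run on the column returned by pd.read_csv.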