How to remove time part of list of datetime strings? - python

I have a list of 1000 different dates and time strings like below in python.
dates[:3] ==
['2019-11-29 12:50:54',
'2019-11-29 12:46:53',
'2019-11-29 12:46:10']
I would like to trim these strings so that each one shows only the date part and not the time, for all 1000 items in the list, as in the example below.
date_only(dates[:3]) == ['2019-11-29', '2019-11-29', '2019-11-29']

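You can simply split() each string in the list and keep the first part:
dates = ['2019-11-29 12:50:54', '2019-11-29 12:46:53', '2019-11-29 12:46:10']
date_only = []
for i in range(len(dates)):
    # split() breaks on whitespace; element 0 is the date
    date_only.append(dates[i].split()[0])
print(date_only)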

Since the list items are already strings, you can call split() on each one and take the first element, which is the date only:
dates = ['2019-11-29 12:50:54', '2019-11-29 12:46:53', '2019-11-29 12:46:10']
dateonly = [elt.split()[0] for elt in dates]
print(dateonly)
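If the strings should also be checked as real timestamps, a minimal sketch of the `date_only` helper named in the question could parse each one with the standard library first. The format string `'%Y-%m-%d %H:%M:%S'` is an assumption based on the three sample values, not something the question states for all 1000 entries.

```python
from datetime import datetime

def date_only(dates, fmt='%Y-%m-%d %H:%M:%S'):
    """Return the date part of each timestamp string as 'YYYY-MM-DD'."""
    # strptime raises ValueError on a malformed entry instead of passing it through
    return [datetime.strptime(s, fmt).date().isoformat() for s in dates]

dates = ['2019-11-29 12:50:54', '2019-11-29 12:46:53', '2019-11-29 12:46:10']
print(date_only(dates))  # ['2019-11-29', '2019-11-29', '2019-11-29']
```

This is slower than split(), so it is mainly worth it when malformed input should fail loudly.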

Related

replace the date section of a string in python

Say I have a string 'Tpsawd_20220320_default_economic_v5_0.xls'.
I want to replace the date part (20220320) with a date variable (i.e. if I define date = 20220410, it should replace 20220320 with this date). How should I do it with a built-in Python package? Please note that the date's location in the string can vary: it might be 'Tpsawd_default_economic_v5_0_20220320.xls' or 'Tpsawd_default_economic_20220320_v5_0.xls'.
Yes, this can be done with regex fairly easily~
import re
s = 'Tpsawd_20220320_default_economic_v5_0.xls'
date = '20220410'
s = re.sub(r'\d{8}', date, s)
print(s)
Output:
Tpsawd_20220410_default_economic_v5_0.xls
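This replaces every run of 8 consecutive digits in the string with the given replacement, in this case date. Since each file name here contains only one such run, only the date part changes; pass count=1 to re.sub to replace just the first match.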
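A slightly stricter variant, sketched here under the assumption that the date is always exactly 8 digits and not part of a longer digit run, uses lookarounds and `count=1`. The helper name `replace_date` is made up for illustration.

```python
import re

def replace_date(filename, new_date):
    # (?<!\d) and (?!\d) ensure the 8 digits are not part of a longer digit run;
    # count=1 limits the substitution to the first match
    return re.sub(r'(?<!\d)\d{8}(?!\d)', new_date, filename, count=1)

names = [
    'Tpsawd_20220320_default_economic_v5_0.xls',
    'Tpsawd_default_economic_v5_0_20220320.xls',
    'Tpsawd_default_economic_20220320_v5_0.xls',
]
for name in names:
    print(replace_date(name, '20220410'))
```

The same call works wherever the date sits in the name, because the pattern does not depend on its position.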

Reformat a list of dates from "/" to "-" (using python)

I have a list of dates.
['1/12/2022', '1/13/2022','1/17/2022']
How do I reformat them to look like this:
['2022-1-12', '2022-1-13','2022-1-17']
EDIT: My original post asked about the wrong format. I've corrected it because I meant for the format to be "Year-Month-Day".
I am assuming you are using Python... Please correct me if I am wrong. You can loop through the list of dates using an enumerated for-loop (the enumerate(list) function gives you the index of each value during the loop) and, for each date, use the .replace() method of str to replace '/' with '-' like this:
list_of_dates = ['1/12/2022', '1/13/2022','1/17/2022']
for i, date in enumerate(list_of_dates):
    list_of_dates[i] = date.replace('/', '-')
or use list comprehension like this (thank you #Eli Harold ):
list_of_dates = [date.replace('/', '-') for date in list_of_dates]
If you want to change the order of the numbers in the date string you can split them by the '/' or '-' into a new list and change the order if you want like this:
for i, date in enumerate(list_of_dates):
    month, day, year = date.split('-') # assuming you already changed it to dashes
    list_of_dates[i] = f'{day}-{month}-{year}'
You can use strptime:
from datetime import datetime
dates = []
for date_str in ['1/12/2022', '1/13/2022','1/17/2022']:
    date = datetime.strptime(date_str, '%m/%d/%Y')
    dates.append(date.strftime('%m-%d-%Y'))
I opted to split the individual dates and then add in the "-" delimiter after the fact, but you could also replace those on iteration. Once your data has been transformed, I just pushed it into a new list of reformatted dates.
This may not result in the best performance for longer iterations, though.
dates = ['1/12/2022', '1/13/2022','1/17/2022']
newdates = []
for x in range(0, len(dates)):
    split_date = dates[x].split('/')
    month = split_date[0]
    day = split_date[1]
    year = split_date[2]
    your_date = year + "-" + month + "-" + day
    newdates.append(your_date)
    print(your_date)
And the output:
2022-1-12
2022-1-13
2022-1-17
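from datetime import datetime
list_of_dates = ['1/12/2022', '1/13/2022', '1/17/2022']
# strptime rejects "%-m" and "%-d"; plain "%m" and "%d" already parse unpadded values
dates = [datetime.strptime(x, "%m/%d/%Y") for x in list_of_dates]
# "%-m" and "%-d" in strftime are platform-specific, so format the fields directly
new_dates = [f"{x.year}-{x.month}-{x.day}" for x in dates]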
dates = ['1/12/2022', '1/13/2022','1/17/2022']
dates = [x.replace('/', '-') for x in dates]
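For the Year-Month-Day order the edited question asks for, one possible sketch splits each string and reorders the parts in a single comprehension. It assumes every entry has exactly the M/D/YYYY shape; the helper name `to_year_month_day` is made up for illustration.

```python
list_of_dates = ['1/12/2022', '1/13/2022', '1/17/2022']

def to_year_month_day(date_str):
    # Unpack the three '/'-separated fields, then emit them in the new order
    month, day, year = date_str.split('/')
    return f'{year}-{month}-{day}'

new_dates = [to_year_month_day(d) for d in list_of_dates]
print(new_dates)  # ['2022-1-12', '2022-1-13', '2022-1-17']
```

No zero-padding is added, which matches the output shown in the question.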

How can I take list of Dates from csv (as strings) and return only the dates/data between a start date and end date?

I have a csv file with dates in format M/D/YYYY from 1948 to 2017. I'm able to plot other columns/lists associated with each date by list index. I want to be able to ask the user for a start date, and an end date, then return/plot the data from only within that period.
The problem is that the dates read in from the csv are strings, so I cannot use if date[x] >= startDate && date[x] <= endDate, because there's no way for me to turn dates in this format into integers.
Here is my csv file
I am already able to read in the dates from the csv to its own list.
How can I take the dates in my list and only return the ones within the user specified date range?
Here is my function for plotting the entire dataset right now:
#CSV Plotting function
def CSV_Plot (data,header,column1,column2):
    #pyplot.plot([item[column1] for item in data] , [item[column2] for item in data])
    pyplot.scatter([item[column1] for item in data] , [item[column2] for item in data])
    pyplot.xlabel(header[column1])
    pyplot.ylabel(header[column2])
    pyplot.show()
    return True
CSV_Plot(mycsvdata,data_header,dateIndex,rainIndex)
This is how I am asking the user to input the start and end dates:
#Ask user for start date in M/D/YYYY format
startDate = input('Please provide the start date (M/D/YYYY) of the period for the data you would like to plot: ')
endDate = input('Please provide the end date (M/D/YYYY) of the period for the data you would like to plot: ')
You need to compare the dates.
I would suggest parsing the dates from your CSV into a datetime object, and also turning the user input value into a datetime object.
How to create a datetime object from a string? You need to specify the format string and the strptime() will parse it for you. Details here:
Converting string into datetime
In your case, it could be something like
from datetime import datetime
# Considering date is in M/D/YYYY format
datetime_object1 = datetime.strptime(date_string, "%m/%d/%Y")
Then you can compare them with a > or < operator. Here you can find details of how to compare the dates.
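Putting the parsing and the comparison together, a minimal sketch of the filtering step might look like the following. The function name `filter_by_date_range` and the sample rows are made up for illustration; only the M/D/YYYY format comes from the question.

```python
from datetime import datetime

DATE_FORMAT = "%m/%d/%Y"  # also parses unpadded values such as 1/5/1948

def filter_by_date_range(rows, date_index, start_str, end_str):
    """Keep the rows whose date column falls within [start, end], inclusive."""
    start = datetime.strptime(start_str, DATE_FORMAT)
    end = datetime.strptime(end_str, DATE_FORMAT)
    return [
        row for row in rows
        if start <= datetime.strptime(row[date_index], DATE_FORMAT) <= end
    ]

# Made-up sample rows: [date, rainfall]
sample = [["1/1/1948", 0.47], ["6/15/1980", 0.0], ["12/14/2017", 0.1]]
selected = filter_by_date_range(sample, 0, "1/1/1950", "12/31/2000")
print(selected)  # [['6/15/1980', 0.0]]
```

The filtered list could then be passed to `CSV_Plot` in place of `mycsvdata`, with `startDate` and `endDate` supplied as the two range strings.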

How to sort dates imported from a CSV file?

I'm trying to write a program that prints a list of sorted dates, but it keeps sorting by the day instead of the full date (day, month, year).
I'm very new to Python, so there's probably a lot I'm doing wrong, but any help would be greatly appreciated.
So I have it so that you can view the list over two pages.
the dates will sort
12/03/2004
13/08/2001
15/10/2014
but I need the full date sorted
df = pd.read_csv('Employee.csv')
df = df.sort_values('Date of Employment.')
List1 = df.iloc[:50, 1:]
List2 = df.iloc[50:99, 1:]
The datetime data type has to be used for the dates to be sorted correctly
You need to use either one of these approaches to convert the dates to datetime objects:
Approach 1
pd.to_datetime + DataFrame.sort_values:
df['Date of Employment.'] = pd.to_datetime(df['Date of Employment.'], dayfirst=True)
Approach 2
You can parse the dates at the same time that the Pandas DataFrame is being loaded:
df = pd.read_csv('Employee.csv', parse_dates=['Date of Employment.'], dayfirst=True)
This is equivalent to the first approach with the exception that everything is done in one step.
Next you need to sort the datetime values in either ascending or descending order.
Ascending:
`df.sort_values('Date of Employment.')`
Descending
`df.sort_values('Date of Employment.',ascending=False)`
You need to convert Date of Employment. to a Date before sorting
df['Date of Employment.'] = pd.to_datetime(df['Date of Employment.'],format= '%d/%m/%Y')
Otherwise it's just strings for Python
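The difference between sorting strings and sorting dates can be seen without pandas. The sketch below uses only the standard library and the three sample dates from the question; it illustrates the idea and is not a replacement for the DataFrame code above.

```python
from datetime import datetime

dates = ['15/10/2014', '12/03/2004', '13/08/2001']

# A plain string sort compares character by character, so the day decides the order
print(sorted(dates))  # ['12/03/2004', '13/08/2001', '15/10/2014']

# Parsing each string as day/month/year sorts chronologically
by_date = sorted(dates, key=lambda s: datetime.strptime(s, '%d/%m/%Y'))
print(by_date)  # ['13/08/2001', '12/03/2004', '15/10/2014']
```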

Converting a list of datetime objects to a list of number of days since a certain date

I have a large list of dates that are datetime objects like for example
[datetime.datetime(2016,8,14),datetime.datetime(2016,8,13),datetime.datetime(2016,8,12),....etc.]
Instead of datetime objects of the date what I want instead is a list of numerical integer values since the date 1/1/1900. I have defined 1/1/1900 as the base date and in the for loop below, I have calculated the days between the date in the list since that base date:
baseDate = datetime(1900,1,1)
numericalDates = []
for i in enumerate(dates):
    a=i[1]-baseDate
    numericalDates.append(a)
print(numericalDates)
However when I print this out, I get datetime.timedelta objects instead
[datetime.timedelta(42592), datetime.timedelta(42591), datetime.timedelta(42590),...etc.]
Any ideas on how I can convert it the proper way?
timedelta objects have a days attribute, so you can simply append that as an int:
numericalDates.append(a.days)
This results in numericalDates being [42594, 42593, 42592].
Note that you can also simplify your code a bit by using list comprehension:
numericalDates = [(d - baseDate).days for d in dates]
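As a self-contained check, the comprehension can be run on the three dates from the question. The snake_case variable names are a stylistic choice made here, not part of the original code.

```python
from datetime import datetime

base_date = datetime(1900, 1, 1)
dates = [datetime(2016, 8, 14), datetime(2016, 8, 13), datetime(2016, 8, 12)]

# Subtracting two datetimes gives a timedelta; .days is its whole-day count as an int
numerical_dates = [(d - base_date).days for d in dates]
print(numerical_dates)  # [42594, 42593, 42592]
```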
