Convert weekday name string into datetime - python

I have the following date (stored with object dtype): Tue 31 Jan in a pandas Series,
and I am trying to change it into: 31/01/2019
How can I achieve this? I understand more or less that pandas.to_datetime can easily convert a date string when it is clearer (like 6/1/1930 22:00), but not in my case, where there is a weekday name.
Thank you for your help.

Concatenate the year and call pd.to_datetime with a custom format:
import pandas as pd
s = pd.Series(['Tue 31 Jan', 'Mon 20 Feb'])
pd.to_datetime(s + ' 2019', format='%a %d %b %Y')
0 2019-01-31
1 2019-02-20
dtype: datetime64[ns]
This is fine as long as all your dates follow this format. If that is not the case, this cannot be solved reliably.
More information on datetime formats at strftime.org.
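The result above is still a datetime64 Series rather than the 31/01/2019 strings asked for in the question. A minimal sketch of the remaining step, assuming every entry follows the 'Tue 31 Jan' pattern and that 2019 is the intended year (the weekday names are taken as given and are not checked against that year):

```python
import pandas as pd

s = pd.Series(['Tue 31 Jan', 'Mon 20 Feb'])

# Append the assumed year, parse with an explicit format, then render as dd/mm/yyyy
parsed = pd.to_datetime(s + ' 2019', format='%a %d %b %Y')
formatted = parsed.dt.strftime('%d/%m/%Y')
print(formatted.tolist())
```

Note that the formatted Series holds plain strings again, so keep the parsed Series around if you still need date arithmetic.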
Another option is using the third-party dateutil library:
import dateutil.parser
s.apply(dateutil.parser.parse)
0 2018-01-31
1 2018-02-20
dtype: datetime64[ns]
It can be installed from PyPI (the package is named python-dateutil).
Another, slower (but more flexible) option is using the third-party datefinder library to sniff dates from strings containing arbitrary text (if this is what you need):
import datefinder
s.apply(lambda x: next(datefinder.find_dates(x)))
0 2018-01-31
1 2018-02-20
dtype: datetime64[ns]
You can install it from PyPI.

Convert to a datetime object
If you wanted to use the datetime module, you could get the year by doing the following:
import datetime as dt
d = dt.datetime.strptime('Tue 31 Jan', '%a %d %b').replace(year=dt.datetime.now().year)
This takes the date in your format and replaces the default year, 1900, with the current year.
This is similar to the other answers, but uses the built-in replace method instead of concatenating strings.
Output
To get the desired output from your new datetime object, you could perform the following:
>>> d.strftime('%d/%m/%Y')
'31/01/2018'
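If the year should not depend on when the script runs, the same idea can be wrapped in a small helper that takes the year explicitly. This is only a sketch: the name to_ddmmyyyy is made up for illustration, and the year is appended before parsing so that the day of the month is checked against that year.

```python
from datetime import datetime

def to_ddmmyyyy(text, year):
    """Parse a 'Tue 31 Jan' style string for the given year and return 'dd/mm/YYYY'."""
    parsed = datetime.strptime(f'{text} {year}', '%a %d %b %Y')
    return parsed.strftime('%d/%m/%Y')

print(to_ddmmyyyy('Tue 31 Jan', 2019))
```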

Here are two alternative ways to achieve the same result.
Method 1: Using datetime module
from datetime import datetime
datetime_object = datetime.strptime('Tue 31 Jan', '%a %d %b')
print(datetime_object) # outputs 1900-01-31 00:00:00
If the string included a year, as in Tue 31 Jan 2018, then this code would work:
from datetime import datetime
datetime_object = datetime.strptime('Tue 31 Jan 2018', '%a %d %b %Y')
print(datetime_object) # outputs 2018-01-31 00:00:00
To print the resulting date in a format like 31/01/2019, you can use:
print(datetime_object.strftime("%d/%m/%Y")) # outputs 31/01/2018
See the datetime documentation for all the formatting options available with datetime objects.
Method 2: Using dateutil.parser
This method automatically fills in the missing year with the current year.
from dateutil import parser
string = "Tue 31 Jan"
date = parser.parse(string)
print(date) # outputs 2018-01-31 00:00:00
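Because the parser fills in the current year, the output changes depending on when the code runs. A hedged sketch of how to pin the year instead, using the default argument of dateutil's parser.parse (fields missing from the string are taken from that datetime); the year 2019 is assumed from the question:

```python
from datetime import datetime
from dateutil import parser

# Fields missing from the string are copied from `default`, so the year becomes 2019
default = datetime(2019, 1, 1)
date = parser.parse('Tue 31 Jan', default=default)
print(date.strftime('%d/%m/%Y'))
```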

Related

Extract each year, month, day, year from getctime , getmtime in Python

I want to extract the year, month, day, hours and minutes separately from the values below.
import os, time, os.path, datetime
date_of_created = time.ctime(os.path.getctime(folderName))
date_of_modi = time.ctime(os.path.getmtime(folderName))
Right now I can only get something like this:
'Thu Dec 26 19:21:37 2019'
but I want to get the values separately:
2019 // Dec (could I get this as an int?) // 26
That is, I want to extract the year, month, day, hour and minute from date_of_created and date_of_modi.
Is this possible in Python?
You can convert the string to a datetime object:
import os
import time
from datetime import datetime
date_of_created = datetime.strptime(time.ctime(os.path.getctime(folderName)), "%a %b %d %H:%M:%S %Y") # Convert string to date format
print("Date created year: {} , month: {} , day: {}".format(str(date_of_created.year),str(date_of_created.month),str(date_of_created.day)))
The time.ctime function returns the local time in string form. You might want to use the time.localtime function instead, which returns a struct_time object containing the information you are looking for. For example:
import os, time
date_created_string = time.ctime(os.path.getctime('/home/b-fg/Downloads'))
date_created_obj = time.localtime(os.path.getctime('/home/b-fg/Downloads'))
print(date_created_string) # Mon Feb 10 09:41:03 2020
print('Year: {:4d}'.format(date_created_obj.tm_year)) # Year: 2020
print('Month: {:2d}'.format(date_created_obj.tm_mon)) # Month: 2
print('Day: {:2d}'.format(date_created_obj.tm_mday)) # Day: 10
Note that these are integer values, as requested.
time.ctime([secs])
Convert a time expressed in seconds since the epoch to a string of a form: 'Sun Jun 20 23:21:05 1993' representing local time.
If that's not what you want... use something else? time.localtime will return a struct_time which should have the relevant fields, or for a more modern interface use datetime.datetime.fromtimestamp, which returns a datetime object from a UNIX timestamp.
Furthermore, using os.stat would probably be more efficient, as getctime and getmtime will probably each perform a stat call internally.
You can use the datetime module, more specifically the fromtimestamp() function from the datetime module to get what you expect.
import os, time, os.path, datetime
date_of_created = datetime.datetime.fromtimestamp(os.path.getctime(my_repo))
date_of_modi = datetime.datetime.fromtimestamp(os.path.getmtime(my_repo))
print(date_of_created.strftime("%Y"))
Output will be 2020 for a repo created in 2020.
All formats are available at this link
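The suggestions above (a single stat call, then fromtimestamp) can be combined as in the following sketch. A temporary directory stands in for the folder from the question, and the variable names are illustrative only. Keep in mind that on Unix st_ctime is the time of the last metadata change, not necessarily the creation time.

```python
import os
import tempfile
from datetime import datetime

# A temporary directory stands in for the folder from the question
with tempfile.TemporaryDirectory() as folder_name:
    info = os.stat(folder_name)  # one stat call gives both timestamps

created = datetime.fromtimestamp(info.st_ctime)
modified = datetime.fromtimestamp(info.st_mtime)

# Each field is a plain int, including the month
parts = (modified.year, modified.month, modified.day, modified.hour, modified.minute)
print(parts)
```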

pandas read_csv parse foreign dates

I am trying to use read_csv on a .csv file that contains a date column. The problem is that the date column is in a foreign language (Romanian), with entries like:
'26 septembrie 2017'
'13 iulie 2017'
etc. How can I parse this nicely into a pandas dataframe which has a US date format?
You can pass a converter for that column:
df = pd.read_csv(myfile, converters={'date_column': foreign_date_converter})
But first you have to define the converter to do what you want. This approach uses locale manipulation:
import datetime
import locale

def foreign_date_converter(text):
    # Temporarily switch LC_TIME to "ro_RO" so Romanian month names parse
    # (setlocale is process-wide, so this is not thread-safe)
    loc = locale.getlocale(locale.LC_TIME)
    locale.setlocale(locale.LC_TIME, 'ro_RO')
    try:
        # %B matches the full month name, e.g. "septembrie"
        return datetime.datetime.strptime(text, '%d %B %Y').date()
    finally:
        locale.setlocale(locale.LC_TIME, loc)  # restores locale
Use the dateparser module.
import dateparser
df = pd.read_csv('yourfile.csv', parse_dates=['date'], date_parser=dateparser.parse)
Enter your date column name in the parse_dates parameter; I am just assuming it is called date.
You may have output like this:
date
0 2017-09-26
1 2017-07-13
If you want to change the format, use strftime:
df['date'] = df.date.dt.strftime(date_format = '%d %B %Y')
output:
date
0 26 September 2017
1 13 July 2017
The easiest solution would be to simply call str.replace(old, new) twelve times.
It is not pretty, but you only have to build the function once:
def translater(date_string_with_exactly_one_date):
    date_str = date_string_with_exactly_one_date
    date_str = date_str.replace("iulie", "july")
    date_str = date_str.replace("septembrie", "september")
    # do this 10 more times with the right translation
    return date_str
Now you just have to call it for every entry. After that you can handle it like a US date string. This is not very efficient but it will get the job done and you do not have to search for special libraries.
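A dictionary lookup can stand in for the twelve replace calls. The sketch below assumes every entry looks like '26 septembrie 2017' (day, full Romanian month name, year); the names RO_MONTHS and parse_romanian_date are made up for illustration.

```python
from datetime import date

# Romanian month names mapped to month numbers
RO_MONTHS = {
    'ianuarie': 1, 'februarie': 2, 'martie': 3, 'aprilie': 4,
    'mai': 5, 'iunie': 6, 'iulie': 7, 'august': 8,
    'septembrie': 9, 'octombrie': 10, 'noiembrie': 11, 'decembrie': 12,
}

def parse_romanian_date(text):
    """Parse a string such as '26 septembrie 2017' into a datetime.date."""
    day, month_name, year = text.split()
    return date(int(year), RO_MONTHS[month_name.lower()], int(day))

print(parse_romanian_date('26 septembrie 2017'))  # 2017-09-26
```

A function like this could also be passed to read_csv through the converters argument, in the same way as the converter shown earlier in this thread.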

How can I convert text to DateTime?

I scraped a website and got the following Output:
2018-06-07T12:22:00+0200
2018-06-07T12:53:00+0200
2018-06-07T13:22:00+0200
Is there a way I can take the first one and convert it into a DateTime value?
Just parse the string into year, month, day, hour and minute integers and then create a new date time object with those variables.
Check out the datetime docs
You can convert string format of datetime to datetime object like this using strptime, here %z is the time zone :
import datetime
dt = "2018-06-07T12:22:00+0200"
ndt = datetime.datetime.strptime(dt, "%Y-%m-%dT%H:%M:%S%z")
# output
2018-06-07 12:22:00+02:00
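The same format string works for all three scraped values. As a sketch (the list name scraped is illustrative), the following parses each one into a timezone-aware datetime and, optionally, normalises it to UTC:

```python
from datetime import datetime, timezone

scraped = [
    '2018-06-07T12:22:00+0200',
    '2018-06-07T12:53:00+0200',
    '2018-06-07T13:22:00+0200',
]

# %z consumes the +0200 offset, so each result is timezone-aware
parsed = [datetime.strptime(text, '%Y-%m-%dT%H:%M:%S%z') for text in scraped]

# Optional: normalise to UTC
as_utc = [dt.astimezone(timezone.utc) for dt in parsed]
print(as_utc[0])  # 2018-06-07 10:22:00+00:00
```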
The following function (not mine) should help you with what you want:
df['date_column'] = pd.to_datetime(df['date_column'], format = '%d/%m/%Y %H:%M').dt.strftime('%Y%V')
You can mess around with the keys next to the % symbols to achieve what you want. You may, however, need to do some light cleaning of your values before you can use them with this function, i.e. replacing 2018-06-07T12:22:00+0200 with 2018-06-07 12:22.
You can use datetime lib.
from datetime import datetime
datetime_object = datetime.strptime('Jun 1 2005 1:33PM', '%b %d %Y %I:%M%p')
datetime.strptime documentation
Solution here

Converting Pandas Series to a String to Convert DateTime

I have a DataFrame that includes the columns df['date'] and df['time'],
which I have put together in one column named df['datetime'].
The output looks like the following: 2017-04-12 17:30:18.733
My end goal is to have it converted to a string: Wed, 12 Apr 2017 17:30:18 733
When I try different methods such as pd.to_datetime(), it tells me I need it to be a string,
and I can't find a method to turn the whole column into a bunch of strings.
I tried calling .astype(str) and .apply(str).
Any suggestions?
You are taking two strings (one in the date column and the other in the time column) and joining them together with a space to create a new datetime string (e.g. "2017-04-12 17:30:18.733"). You then use strptime to parse this string into a datetime object. I used a form that works whether or not fractional seconds are present. Finally, you use strftime to format this datetime object back into your desired string format.
from datetime import datetime
import pandas as pd

df = pd.DataFrame({'date': ['2017-04-12', '2017-04-13'],
                   'time': ['17:30:18.733', '07:30:18']})

def date_parser(date_string):
    try:
        # With fractional seconds: %f yields six digits, so drop the last three
        timestamp = datetime.strptime(date_string, '%Y-%m-%d %H:%M:%S.%f')
        timestamp = timestamp.strftime('%a, %d %b %Y %H:%M:%S %f')[:-3]
    except ValueError:
        # Without fractional seconds: pad with 000
        timestamp = datetime.strptime(date_string, '%Y-%m-%d %H:%M:%S')
        timestamp = timestamp.strftime('%a, %d %b %Y %H:%M:%S 000')
    return timestamp

df['datetime_str'] = (df['date'] + ' ' + df['time']).apply(date_parser)
>>> df
date time datetime_str
0 2017-04-12 17:30:18.733 Wed, 12 Apr 2017 17:30:18 733
1 2017-04-13 07:30:18 Thu, 13 Apr 2017 07:30:18 000
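As an alternative sketch that avoids the try/except, datetime.fromisoformat (Python 3.7+) accepts the combined string with or without fractional seconds, and the milliseconds can be computed from the microsecond field. The function name format_with_millis is made up for illustration, and the weekday and month abbreviations assume the default English locale.

```python
from datetime import datetime

def format_with_millis(text):
    """Turn '2017-04-12 17:30:18.733' into 'Wed, 12 Apr 2017 17:30:18 733'."""
    ts = datetime.fromisoformat(text)   # fractional seconds are optional
    millis = ts.microsecond // 1000     # 733000 microseconds -> 733
    return f"{ts:%a, %d %b %Y %H:%M:%S} {millis:03d}"

print(format_with_millis('2017-04-12 17:30:18.733'))
print(format_with_millis('2017-04-13 07:30:18'))
```

It can be applied to the combined column in the same way as date_parser above, by passing it to .apply.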
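Use something like this (axis=1 applies the function to each row, and the datetime column must already hold datetime values):
df.apply(lambda x: x.datetime.strftime('%D %d ...the format you want...'), axis=1)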

Python Date / Time Regular Expression

I am pretty new to regular expressions and it's pretty alien to me. I am parsing an XML feed which produces a date time as follows:
Wed, 23 July 2014 19:25:52 GMT
But I want to split these up so there are as follows:
date = 23/07/2014
time = 19/25/52
Where would I start? I have looked at a couple of other questions on SO and all of them deviate a bit from what I am trying to achieve.
Use datetime.strptime to parse the date from string and then format it using the strftime method of datetime objects:
>>> from datetime import datetime
>>> dt = datetime.strptime("Wed, 23 July 2014 19:25:52 GMT", "%a, %d %B %Y %H:%M:%S %Z")
>>> dt.strftime('%d/%m/%Y')
'23/07/2014'
>>> dt.strftime('%H/%M/%S')
'19/25/52'
But if you're okay with the ISO format you can call date and time methods:
>>> str(dt.date())
'2014-07-23'
>>> str(dt.time())
'19:25:52'
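Feeds do not always spell the month out in full, so the parsing step can be made a little more forgiving. The sketch below is an assumption about what such a feed might contain, not something stated in the question: it tries the full month name first and then the abbreviated form. FEED_FORMATS and split_feed_datetime are illustrative names.

```python
from datetime import datetime

# Candidate layouts: full month name first, then the abbreviated form
FEED_FORMATS = ('%a, %d %B %Y %H:%M:%S %Z', '%a, %d %b %Y %H:%M:%S %Z')

def split_feed_datetime(text):
    """Return (date, time) strings such as ('23/07/2014', '19/25/52')."""
    for fmt in FEED_FORMATS:
        try:
            dt = datetime.strptime(text, fmt)
        except ValueError:
            continue  # layout did not match, try the next one
        return dt.strftime('%d/%m/%Y'), dt.strftime('%H/%M/%S')
    raise ValueError(f'unrecognised feed date: {text!r}')

print(split_feed_datetime('Wed, 23 July 2014 19:25:52 GMT'))
```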
