getting date from datetime data

I have datetime data in this format:
08:15:54:012 12 03 2016 +0000 GMT+00:00
I need to extract only the date, that is 12 03 2016, in Python.
I have tried
datetime_object = datetime.strptime('08:15:54:012 12 03 2016 +0000 GMT+00:00', '%H:%M:%S:%f %d %m %Y')
but I get
ValueError: unconverted data remains: +0000 GMT+00:00

If you don't mind using an external library, I find the dateparser module much more intuitive than Python's built-in datetime. It can parse pretty much anything if you just do
>>> import dateparser
>>> dateparser.parse('08:15:54:012 12 03 2016 +0000 GMT+00:00')
It claims it can handle timezone offsets, though I haven't tested it.

If you need the date as a string, use slicing:
text = '08:15:54:012 12 03 2016 +0000 GMT+00:00'
print(text[13:23])
# 12 03 2016
You can also convert it to a datetime:
from datetime import datetime
text = '08:15:54:012 12 03 2016 +0000 GMT+00:00'
datetime_object = datetime.strptime(text[13:23], '%d %m %Y')
print(datetime_object)
# 2016-03-12 00:00:00
BTW: in your original version you have to remove the trailing " +0000 GMT+00:00" (16 characters) using the slice [:-16]:
datetime.strptime('08:15:54:012 12 03 2016 +0000 GMT+00:00'[:-16], '%H:%M:%S:%f %d %m %Y')
You can also use split() and join():
>>> x = '08:15:54:012 12 03 2016 +0000 GMT+00:00'.split()
>>> x
['08:15:54:012', '12', '03', '2016', '+0000', 'GMT+00:00']
>>> x[1:4]
['12', '03', '2016']
>>> ' '.join(x[1:4])
'12 03 2016'
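If you would rather keep the offset than slice it away, here is a minimal sketch. It assumes the string always ends with one extra "GMT+hh:mm" token that merely repeats the numeric offset, so that token can be dropped and the rest parsed with %z. The helper name parse_stamp is made up for illustration.

```python
from datetime import datetime

def parse_stamp(text):
    # Assumption: the last whitespace-separated token ("GMT+00:00")
    # duplicates the numeric offset before it, so it can be discarded.
    head, _gmt_token = text.rsplit(None, 1)
    # %f accepts 1 to 6 digits, so "012" is read as 12000 microseconds.
    return datetime.strptime(head, '%H:%M:%S:%f %d %m %Y %z')

dt = parse_stamp('08:15:54:012 12 03 2016 +0000 GMT+00:00')
print(dt.date())                # 2016-03-12
print(dt.strftime('%d %m %Y'))  # 12 03 2016
```

The result is a timezone-aware datetime, which is handy if you later need to compare it with timestamps from other zones.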

You can do it like this:
from datetime import datetime
d = '08:15:54:012 12 03 2016 +0000 GMT+00:00'
d = d[:23]  # remove the timezone details
d = datetime.strptime(d, "%H:%M:%S:%f %d %m %Y")  # parse the string (day first, then month, as in the question)
d.strftime('%d %m %Y')  # format the date part
You get:
'12 03 2016'
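One thing to watch with this input: both "12" and "03" are valid as a day and as a month, so strptime() will not complain whichever order you give it. A small sketch of the difference, assuming (as in the question's own format string) that the data is day first:

```python
from datetime import datetime

raw = '08:15:54:012 12 03 2016'

# Day first, as in the question's format string: 12 March 2016.
day_first = datetime.strptime(raw, '%H:%M:%S:%f %d %m %Y')
# Month first: the same text is silently read as 3 December 2016.
month_first = datetime.strptime(raw, '%H:%M:%S:%f %m %d %Y')

print(day_first.date())    # 2016-03-12
print(month_first.date())  # 2016-12-03
```

Formatting back with the same directives hides the mistake, so it is worth checking the parsed date() once against a value you know.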

Related

convert the datetime stamp in numpy array

This might be simple but I had no luck finding the right solution.
I have a 'date' column in a NumPy array with dates in the format 'Tue Feb 04 17:04:01 +0000 2020', which I would like to convert to '2020-02-04 17:04:01'.
Are there any built-in methods in NumPy that do that?
Some solutions suggest looping through the elements in the column, but I guess that's not the NumPy-thonic way.

Maybe you can try dateutil to parse the dates:
import numpy as np
from dateutil import parser
date_str = 'Tue Feb 04 17:04:01 +0000 2020'
new_date = parser.parse(date_str).strftime('%Y-%m-%d %H:%M:%S')  # %T is not available on every platform
With NumPy you can then do:
np.datetime64(new_date)
# Example
date_str = 'Tue Feb 04 17:04:01 +0000 2020'
date_str2 = 'Fri Feb 07 17:04:01 +0000 2020'
new_date = parser.parse(date_str).strftime('%Y-%m-%d %H:%M:%S')
new_date2 = parser.parse(date_str2).strftime('%Y-%m-%d %H:%M:%S')
np.arange(np.datetime64(new_date), np.datetime64(new_date2))
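For a whole column, the same idea can be sketched with only the standard library. This assumes every entry has exactly the 'Tue Feb 04 17:04:01 +0000 2020' layout with English day and month names; normalize_stamps is a hypothetical helper name. It still loops in Python, since as far as I know NumPy has no parser of its own for this layout.

```python
from datetime import datetime

def normalize_stamps(stamps):
    """Convert 'Tue Feb 04 17:04:01 +0000 2020'-style strings to 'YYYY-MM-DD HH:MM:SS'."""
    in_fmt = '%a %b %d %H:%M:%S %z %Y'
    out_fmt = '%Y-%m-%d %H:%M:%S'
    # The UTC offset is parsed but not written back out, so the
    # wall-clock time is kept as it appears in the input.
    return [datetime.strptime(s, in_fmt).strftime(out_fmt) for s in stamps]

converted = normalize_stamps([
    'Tue Feb 04 17:04:01 +0000 2020',
    'Fri Feb 07 17:04:01 +0000 2020',
])
print(converted)
# ['2020-02-04 17:04:01', '2020-02-07 17:04:01']
```

If NumPy is available, the resulting list can probably be handed over with something like np.array(converted, dtype='datetime64[s]'), the array counterpart of the np.datetime64(new_date) call above.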

Change datetime format to hours only in Pandas

I have a list of date strings, formatted like
Fri Apr 23 12:38:07 +0000 2021
How can I change the format? I want to keep only the hours. I checked other sources, but they require specifying the date format first, which is where I'm struggling.
As far as I know, you can write code like
ds['waktu'] = pd.to_datetime(ds['tanggal'], format='%A %b %d %H:%M:%S %z %Y')
to change the format, but I don't know what +0000 means.

If you only want to take the hours from the date strings, you can use .dt.strftime() after the pd.to_datetime() call, as follows:
ds['waktu'] = pd.to_datetime(ds['tanggal'], format='%a %b %d %H:%M:%S %z %Y').dt.strftime('%H:%M:%S')
Note that your format string for pd.to_datetime() is not correct: you need to replace %A (full weekday name) with %a (abbreviated weekday name).
+0000 is the UTC offset (the time zone), which you can parse with %z in the format string.
Demo
ds = pd.DataFrame({'tanggal': ['Fri Apr 23 12:38:07 +0000 2021', 'Thu Apr 22 11:28:17 +0000 2021']})
ds['waktu'] = pd.to_datetime(ds['tanggal'], format='%a %b %d %H:%M:%S %z %Y').dt.strftime('%H:%M:%S')
print(ds)
                          tanggal     waktu
0  Fri Apr 23 12:38:07 +0000 2021  12:38:07
1  Thu Apr 22 11:28:17 +0000 2021  11:28:17
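To see what %z does with +0000 without pandas, here is a small standard-library sketch using the same format string on a single value. It assumes English day and month names; the variable names are only for illustration.

```python
from datetime import datetime

stamp = 'Fri Apr 23 12:38:07 +0000 2021'
parsed = datetime.strptime(stamp, '%a %b %d %H:%M:%S %z %Y')

# +0000 means "zero hours and zero minutes away from UTC".
print(parsed.utcoffset())     # 0:00:00
# The hour alone, as an integer and as a zero-padded string.
print(parsed.hour)            # 12
print(parsed.strftime('%H'))  # 12
```

In pandas, the corresponding accessor after pd.to_datetime() is .dt.hour, if you want the hour as a number rather than a formatted string.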

Changing datetime format in Python Language

I am parsing emails through Gmail API and have got the following date format:
Sat, 21 Jan 2017 05:08:04 -0800
I want to convert it into ISO 2017-01-21 (yyyy-mm-dd) format for MySQL storage. I am not able to do it through strftime()/strptime() and am missing something. Can someone please help?
TIA

Use dateutil's parser, then call isoformat() (or date()) on the result:
import dateutil.parser as parser
text = 'Sat, 21 Jan 2017 05:08:04 -0800'
date = parser.parse(text)
print(date.isoformat())
print(date.date())
Output:
2017-01-21T05:08:04-08:00
2017-01-21

You can do it with strptime():
import datetime
datetime.datetime.strptime('Sat, 21 Jan 2017 05:08:04 -0800', '%a, %d %b %Y %H:%M:%S %z')
That gives you:
datetime.datetime(2017, 1, 21, 5, 8, 4, tzinfo=datetime.timezone(datetime.timedelta(days=-1, seconds=57600)))
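Since this is an e-mail Date header, the standard library's email.utils.parsedate_to_datetime() can also parse it without a format string. A short sketch that goes all the way to the yyyy-mm-dd string, with variable names chosen only for illustration:

```python
from email.utils import parsedate_to_datetime

raw = 'Sat, 21 Jan 2017 05:08:04 -0800'
parsed = parsedate_to_datetime(raw)  # timezone-aware datetime

# date() drops the time and offset; isoformat() gives yyyy-mm-dd.
iso_date = parsed.date().isoformat()
print(iso_date)  # 2017-01-21
```

Note that this is the date in the sender's offset. If the column should hold UTC dates, convert with astimezone() before calling date().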

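You can even do it manually using a simple split and a dictionary. That way, you have more control over the formatting.
def dateconvertor(date):
    # 'Sat, 21 Jan 2017 05:08:04 -0800' -> ['Sat,', '21', 'Jan', '2017', ...]
    parts = date.split(' ')
    month = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
             'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}
    # Zero-pad month and day to get yyyy-mm-dd
    print('{}-{:02d}-{:02d}'.format(parts[3], month[parts[2]], int(parts[1])))
def main():
    dt = "Sat, 21 Jan 2017 05:08:04 -0800"
    dateconvertor(dt)
if __name__ == '__main__':
    main()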
Keep it simple.

from datetime import datetime
s = "Sat, 21 Jan 2017 05:08:04 -0800"
d = datetime.strptime(s, "%a, %d %b %Y %H:%M:%S %z")  # %z parses the -0800 offset
print(d.strftime("%Y-%m-%d"))
Output: 2017-01-21

Find date within strings using regex in both Python and grep

I have a log with entries in the following format:
1483528632 3 1 Wed Jan 4 11:17:12 2017 501040002 4
1533528768 4 2 Thu Jan 5 19:17:45 2017 534040012 3
...
How do I fetch only the timestamp component (e.g. Wed Jan 4 11:17:12 2017) using regular expressions?
The final product has to be implemented in Python, but for now it must also run as part of an automated regression suite in bash/perl.

If the format is fixed in terms of space delimiters, you can simply split the line, take the slice that holds the date string, and load it into a datetime object via datetime.strptime():
In [1]: from datetime import datetime
In [2]: s = "1483528632 3 1 Wed Jan 4 11:17:12 2017 501040002 4"
In [3]: date_string = ' '.join(s.split()[3:8])
In [4]: datetime.strptime(date_string, "%a %b %d %H:%M:%S %Y")
Out[4]: datetime.datetime(2017, 1, 4, 11, 17, 12)

grep is most often used in this scenario if you are working with syslog, but as the post is also tagged with Python, this example uses regular expressions with re:
import re
Define the sample text and the pattern to match:
txt = """1483528632 3 1 Wed Jan 4 11:17:12 2017 501040002 4
1533528768 4 2 Thu Jan 5 19:17:45 2017 534040012 3"""
pat = r"\w{3}\s\w{3}\s+\d{1,2}\s\w{2}:\w{2}:\w{2}\s\w{4}"
Then use re.findall to return all non-overlapping matches of the pattern in txt:
re.findall(pat, txt)
Output:
['Wed Jan 4 11:17:12 2017', 'Thu Jan 5 19:17:45 2017']
If you want to then use datetime:
import datetime
dates = re.findall(pat,txt)
datetime.datetime.strptime(dates[0], "%a %b %d %H:%M:%S %Y")
Output:
datetime.datetime(2017, 1, 4, 11, 17, 12)
You can then utilise these datetime objects:
dateObject = datetime.datetime.strptime(dates[0], "%a %b %d %H:%M:%S %Y").date()
timeObject = datetime.datetime.strptime(dates[0], "%a %b %d %H:%M:%S %Y").time()
print('The date is {} and time is {}'.format(dateObject,timeObject))
Output:
The date is 2017-01-04 and time is 11:17:12

The regex to match the timestamp is:
'[a-zA-Z]{3} +[a-zA-Z]{3} +[0-9]{1,2} +[0-9]{2}:[0-9]{2}:[0-9]{2} +[0-9]{4}'
With grep it can be used like this (if your log file is called log.txt). Note that grep -E does not understand \d, so [0-9] is used instead:
$ grep -oE '[a-zA-Z]{3} +[a-zA-Z]{3} +[0-9]{1,2} +[0-9]{2}:[0-9]{2}:[0-9]{2} +[0-9]{4}' log.txt
# Wed Jan 4 11:17:12 2017
# Thu Jan 5 19:17:45 2017
In Python, where \d is available, you can use it like so:
import re
log_entry = "1483528632 3 1 Wed Jan 4 11:17:12 2017 501040002 4"
pattern = r'[a-zA-Z]{3} +[a-zA-Z]{3} +\d{1,2} +\d{2}:\d{2}:\d{2} +\d{4}'
compiled = re.compile(pattern)
match = compiled.search(log_entry)
match.group(0)
# 'Wed Jan 4 11:17:12 2017'
You can use this to get an actual datetime object from the string (expanding on above code):
from datetime import datetime
import re
log_entry = "1483528632 3 1 Wed Jan 4 11:17:12 2017 501040002 4"
pattern = r'[a-zA-Z]{3} +[a-zA-Z]{3} +\d{1,2} +\d{2}:\d{2}:\d{2} +\d{4}'
compiled = re.compile(pattern)
match = compiled.search(log_entry)
log_time_str = match.group(0)
datetime.strptime(log_time_str, "%a %b %d %H:%M:%S %Y")
# datetime.datetime(2017, 1, 4, 11, 17, 12)
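For a whole log rather than a single entry, the same pattern can be wrapped in a small helper. This is only a sketch: extract_timestamps and TIMESTAMP_RE are hypothetical names, and it assumes at most one timestamp per line and English day and month names. Lines without a timestamp (such as a literal "...") are skipped.

```python
import re
from datetime import datetime

TIMESTAMP_RE = re.compile(
    r'[a-zA-Z]{3} +[a-zA-Z]{3} +\d{1,2} +\d{2}:\d{2}:\d{2} +\d{4}'
)

def extract_timestamps(log_text):
    """Return one datetime per log line that contains a timestamp."""
    results = []
    for line in log_text.splitlines():
        match = TIMESTAMP_RE.search(line)
        if match is None:
            continue  # no timestamp on this line
        results.append(
            datetime.strptime(match.group(0), '%a %b %d %H:%M:%S %Y')
        )
    return results

sample = (
    "1483528632 3 1 Wed Jan 4 11:17:12 2017 501040002 4\n"
    "1533528768 4 2 Thu Jan 5 19:17:45 2017 534040012 3\n"
    "...\n"
)
stamps = extract_timestamps(sample)
print(stamps)
```

Compiling the pattern once at module level avoids recompiling it for every line.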

Two approaches: with and without regular expressions.
1) Using the re.findall() function:
import re
with open('test.log', 'r') as fh:
    lines = re.findall(r'\b[A-Za-z]{3}\s[A-Za-z]{3}\s+\d{1,2} \d{2}:\d{2}:\d{2} \d{4}\b', fh.read())
print(lines)
2) Using the str.split() and str.join() functions:
with open('test.log', 'r') as fh:
    lines = [' '.join(d.split()[3:8]) for d in fh.readlines()]
print(lines)
The output in both cases will be as below:
['Wed Jan 4 11:17:12 2017', 'Thu Jan 5 19:17:45 2017']

grep -oE '\b(Mon|Tue|Wed|Thu|Fri|Sat|Sun) (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) +[0-9]+ [0-9]{2}:[0-9]{2}:[0-9]{2} [0-9]{4}\b' dates

If you just wanted to list the dates, rather than grep, perhaps:
sed -nre 's/^.*([A-Za-z]{3}\s+[A-Za-z]{3}\s+[0-9]+\s+[0-9]+:[0-9]+:[0-9]+\s+[0-9]{4}).*$/\1/p' filename

how to get a string from split line and compare in python

I have a line after splitting, like this:
lineaftersplit=Jan 31 00:57:07 2012 GMT
How do I get only the year 2012 from this and check whether it falls between 2010 and 2013?

If lineaftersplit is a string value, you can use the datetime module to parse out the information, including the year:
import datetime
parsed_date = datetime.datetime.strptime(lineaftersplit, '%b %d %H:%M:%S %Y %Z')
if 2010 <= parsed_date.year <= 2013:
    print('year between 2010 and 2013')
This has the advantage that you can do further tests on the datetime object, including sorting and date arithmetic.
Demo:
>>> import datetime
>>> lineaftersplit="Jan 31 00:57:07 2012 GMT"
>>> parsed_date = datetime.datetime.strptime(lineaftersplit, '%b %d %H:%M:%S %Y %Z')
>>> parsed_date
datetime.datetime(2012, 1, 31, 0, 57, 7)
>>> parsed_date.year
2012

You can use str.rsplit:
>>> strs = 'Jan 31 00:57:07 2012 GMT'
str.rsplit with maxsplit=2 splits from the right and returns a list like this:
>>> strs.rsplit(None,2)
['Jan 31 00:57:07', '2012', 'GMT']
Now we need the second item:
>>> year = strs.rsplit(None,2)[1]
>>> year
'2012'
>>> if 2010 <= int(year) <= 2013:  # apply int() to get the integer value
...     pass  # do something
...

Try this:
st="Jan 31 00:57:07 2012 GMT".split()
year=int(st[3])
This works as long as the string always has this format.

s = 'Jan 31 00:57:07 2012 GMT'  # avoid naming the variable str, which shadows the built-in
s.split()[3]
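Putting the split and the comparison together, a minimal sketch could look like this. It assumes the year is always the fourth whitespace-separated field; year_in_range is a made-up helper name, and the 2010 and 2013 defaults come from the question.

```python
def year_in_range(line, first=2010, last=2013):
    """Return True if the 4th whitespace-separated field is a year in [first, last]."""
    fields = line.split()
    # Guard against short or malformed lines instead of raising.
    if len(fields) < 4 or not fields[3].isdigit():
        return False
    return first <= int(fields[3]) <= last

print(year_in_range('Jan 31 00:57:07 2012 GMT'))  # True
print(year_in_range('Jan 31 00:57:07 2014 GMT'))  # False
```

Returning False for malformed input is a design choice for this sketch. Raising an exception may suit you better if bad lines should never occur.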
