Using python to substitute awk for Linux commands - python

I am new to Python and I need to learn it for work purposes. I am having trouble figuring out a way to use Python to replace awk for column prints.
For example, I need to print out the date:
root@user:~# date
Mon Jun 24 01:30:08 EDT 2013
But, I only need a certain part of it:
root@user:~# date | awk '{print $2" "$3" "$4" "$5}'
Jun 24 01:30:54 EDT
Is there a way in Python to do this without needing to do the following:
import os
os.system("date | awk '{print $2" "$3" "$4" "$5}'")
I have searched extensively (Google, Bing, Ask, Yahoo) and have come up short on this.

You probably want to look at the datetime.datetime.strftime() function for that particular task.
However, for the more general task of printing out certain fields, you'd use .split() and list slicing:
date_string = "Mon Jun 24 01:30:08 EDT 2013"
fields = date_string.split()
print(' '.join(fields[1:5]))  # Prints "Jun 24 01:30:08 EDT"
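Both suggestions above can be sketched together as follows. The helper names `pick_fields` and `short_date` are made up for illustration, and `%b` is assumed to run under the default C/English locale, where it yields abbreviations such as "Jun":

```python
from datetime import datetime


def pick_fields(line, start, stop):
    """Return fields start..stop (1-based, inclusive), like awk's $start..$stop."""
    fields = line.split()  # split on any run of whitespace, as awk does by default
    return " ".join(fields[start - 1:stop])


def short_date(moment):
    """Format a datetime as month, day and time without calling `date` at all."""
    return moment.strftime("%b %d %H:%M:%S")


print(pick_fields("Mon Jun 24 01:30:08 EDT 2013", 2, 5))
print(short_date(datetime(2013, 6, 24, 1, 30, 8)))
```

In a real script, `short_date(datetime.now())` would replace the `date | awk` pipeline; note that a naive `datetime` carries no time zone name, so the "EDT" field has no direct equivalent here.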

Related

Extract timestamp from a given string using python

I tried multiple packages to extract a timestamp from a given string, but none gives correct results. I used dateutils, datefinder, parsedatetime, etc. for this task. They extract datetimes in certain formats but not all, and sometimes they also pick up unwanted numbers as timestamps.
Is there any Python package that extracts a datetime from a given string?
Assume, I have 2 strings like these:
scala> val xorder= new order(1,"2016-02-22 00:00:00.00",100,"COMPLETED")
and
Fri, 10 Jun 2011 11:04:17 +0200 (CEST)
and want to extract only the datetime. Is there any function that extracts both datetime formats from the strings above? In other cases the formats may be different, and it should still pick out the datetime strings.
You can use the datetime function strptime() as follows:
from datetime import datetime
dt = datetime.strptime("21/11/06 16:30", "%d/%m/%y %H:%M")
You can create your own formatting and use the function as well.
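As a rough sketch of "your own formatting" applied to the two strings from the question: one way is to pair a regular expression that locates a candidate with the strptime() format that parses it. The `PATTERNS` table and the `extract_datetimes` name are assumptions for illustration, and only these two formats are covered:

```python
import re
from datetime import datetime

# Each entry pairs a regex that finds a candidate with the format that parses it.
PATTERNS = [
    (re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+"),
     "%Y-%m-%d %H:%M:%S.%f"),
    (re.compile(r"\d{1,2} [A-Za-z]{3} \d{4} \d{2}:\d{2}:\d{2} [+-]\d{4}"),
     "%d %b %Y %H:%M:%S %z"),
]


def extract_datetimes(text):
    """Return every datetime found in text that matches a known pattern."""
    found = []
    for pattern, fmt in PATTERNS:
        for match in pattern.finditer(text):
            try:
                found.append(datetime.strptime(match.group(0), fmt))
            except ValueError:
                pass  # looked like a date but did not parse; skip it
    return found


sample1 = 'scala> val xorder= new order(1,"2016-02-22 00:00:00.00",100,"COMPLETED")'
sample2 = 'Fri, 10 Jun 2011 11:04:17 +0200 (CEST)'
print(extract_datetimes(sample1))
print(extract_datetimes(sample2))
```

Every new input format needs another entry in the table, which is the trade-off against a fuzzy parser.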
I created a small Python package, datetime_extractor, to pull timestamps out of a given string. It can extract many datetime formats. Hope it will be useful.
pip install datetime-extractor
from datetime_extractor import DateTimeExtractor
import re
samplestring1 = 'scala> val xorder= new order(1,"2016-02-22 00:00:00.00",100,"COMPLETED")'
DateTimeExtractor(samplestring1)
Out: ['2016-02-22 00:00:00.00']
samplestring2 = 'Fri, 10 Jun 2011 11:04:17 +0200 (CEST)'
DateTimeExtractor(samplestring2)
Out: ['10 Jun 2011 11:04:17']
@Allan & @Manmeet Singh, let me know your comments.

Convert multiple types of dates

I'm working with data that comes from different places and need to convert dates into the same format. Below are a few examples of what I have:
Thu Dec 03 07:27:23 GMT 2015
3-Dec-15
2015-12-04T06:58:54Z
23-Sep-2015 07:03:37 UTC
The desired output format should be the same for all dates, like this:
12/03/2015
12/03/2015
12/04/2015
09/23/2015
Any suggestions how to achieve that with Python? Thanks in advance!
Yes, the dateutil library provides date format detection with the parse function:
from dateutil.parser import parse
parse(text).strftime("%m/%d/%Y")
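If installing dateutil is not an option and the set of incoming formats is known, a standard-library alternative is to try explicit strptime() formats in turn. This is only a sketch: `KNOWN_FORMATS` and `normalize_date` are hypothetical names, the list covers just the four samples from the question, and the month and weekday names are assumed to be English:

```python
from datetime import datetime

# One format per sample shape in the question; extend as new shapes appear.
KNOWN_FORMATS = [
    "%a %b %d %H:%M:%S %Z %Y",  # Thu Dec 03 07:27:23 GMT 2015
    "%d-%b-%y",                 # 3-Dec-15
    "%Y-%m-%dT%H:%M:%SZ",       # 2015-12-04T06:58:54Z
    "%d-%b-%Y %H:%M:%S %Z",     # 23-Sep-2015 07:03:37 UTC
]


def normalize_date(text):
    """Return text reformatted as MM/DD/YYYY, trying each known format."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime("%m/%d/%Y")
        except ValueError:
            continue  # wrong format; try the next one
    raise ValueError("no known format matches %r" % text)


for raw in ["Thu Dec 03 07:27:23 GMT 2015", "3-Dec-15",
            "2015-12-04T06:58:54Z", "23-Sep-2015 07:03:37 UTC"]:
    print(normalize_date(raw))
```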

Identify that a string could be a datetime object

If I know the format in which a string represents date-time information, I can easily use datetime.datetime.strptime(s, fmt). However, without knowing the format of the string beforehand, would it be possible to determine whether a given string contains something that could be parsed as a datetime object with the right format string?
Obviously, generating every possible format string to do an exhaustive search is not a feasible idea. I also don't really want to write one function with many format strings hardcoded into it.
Does anyone have any thoughts on how this can be accomplished (perhaps some sort of regex?)?
What about fuzzyparsers:
Sample inputs:
jan 12, 2003
jan 5
2004-3-5
+34 -- 34 days in the future (relative to today's date)
-4 -- 4 days in the past (relative to today's date)
Example usage:
>>> from fuzzyparsers import parse_date
>>> parse_date('jun 17 2010') # my youngest son's birthday
datetime.date(2010, 6, 17)
Install with:
$ pip install fuzzyparsers
You can use parser from dateutil
Example usage:
from dateutil import parser
dt = parser.parse("Aug 28 1999 12:00AM")
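To turn either parser into the yes/no check the question asks for, one option is to wrap the call and treat a parse failure as "not a datetime". The sketch below is an assumption-laden illustration: `could_be_datetime` is a made-up name, and it defaults to the standard library's `datetime.fromisoformat` so that it runs without extra packages; passing `dateutil.parser.parse` as the `parse` argument should give the looser matching discussed above.

```python
from datetime import datetime


def could_be_datetime(text, parse=datetime.fromisoformat):
    """Return True if parse(text) succeeds, False if it rejects the string."""
    try:
        parse(text)
    except (ValueError, OverflowError):
        return False
    return True


print(could_be_datetime("2004-03-05"))
print(could_be_datetime("not a date"))
```

The answer is only as permissive as the parser plugged in: the ISO default rejects "jan 12, 2003", which a fuzzy parser would accept.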

This specific str.replace() in Python with BeautifulSoup isn't working

I'm trying to automate a task that occurs roughly monthly, which is adding a hyperlink to a page that looks like:
2013: Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2012: Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2011: Jan Feb Mar ...
Whenever we get a new document for that month, we add the
Jul
tags around it.
So I'm using BeautifulSoup in Python. You can see below that I'm picking out the HTML "p" tag that contains this data and doing a replace() on the first month that it finds (finds Month using the reverse dictionary I created, and the third parameter of replace() indicates to only do the first one it finds).
# Modify link in hr.php:
hrphp = open('\\\\intranet\\websites\\infonet\\hr\\hr.php', 'r').read()
soup = BeautifulSoup(hrphp) # Parsing with BeautifulSoup
Months = {k: v for k,v in enumerate(calendar.month_abbr)} # Creates a reverse dictionary for month abbreviation lookup by month number, ie. "print Months[07]" will print "Jul"
print hrphp+"\n\n\n\n\n" # DEBUGGING: Compare output before
hrphp = hrphp.replace(
    str(soup.findAll('p')[4]),
    str(soup.findAll('p')[4]).replace(
        Months[int(InterlinkDate[1][-5:-3])],
        ""+Months[int(InterlinkDate[1][-5:-3])]+"",
        1),
    1
)
print hrphp # DEBUGGING: Compare output after
See how it's a nested replace()? The logic seems to work out fine, but for some reason it doesn't actually change the value. Earlier in the script I do something similar with the Months[] dictionary and str.replace() on a segment of the page, and that works out, although it doesn't have a nested replace() like this nor does it search for a block of text using soup.findAll().
Starting to bang my head around on the desk, any help would be greatly appreciated. Thanks in advance.
What you end up doing with the code str(soup.findAll('p')[4]).replace is just replacing the values that are found in a string representation of the results in soup.findAll('p')[4], which will more than likely differ from the string in hrphp because "Beautiful Soup gives you Unicode" after it parses.
Beautiful Soup's documentation holds the answer. Have a look at the Changing Attribute Values section.
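A different way around the mismatch, swapped in here purely as an illustration, is to skip the parser round-trip and edit the raw text directly with a regular expression. This sketch assumes each year's row sits on its own line in the source and that the target month is not linked yet; `link_month` and the example href are hypothetical:

```python
import re


def link_month(page, year, month_abbr, href):
    """Wrap the first unlinked month_abbr on the given year's row in a link."""
    # Group 1 is everything from "YEAR:" up to the month; the lookbehind skips
    # a month name that directly follows ">" (that is, one already inside a tag).
    pattern = re.compile(
        r"(%d:[^\n]*?)(?<!>)\b%s\b" % (year, re.escape(month_abbr)))

    def add_link(match):
        return '%s<a href="%s">%s</a>' % (match.group(1), href, month_abbr)

    return pattern.sub(add_link, page, count=1)


page = "2013: Jan Feb Mar\n2012: Jan Feb Mar\n"
print(link_month(page, 2012, "Feb", "docs/2012-02.pdf"))
```

Because the replacement works on the same string that is written back, there is no second representation of the paragraph that could differ from the file contents.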

Convert Chrome history date/time stamp to readable format

I originally posted this question looking for an answer using Python and got some good help, but I have still not been able to find a solution. I have a script running on OS X 10.5 client machines that captures internet browsing history (required as part of my sysadmin duties in a US public school). Firefox 3.x stores history in a SQLite database, and I have figured out how to get that information out using python/sqlite3. Firefox 3.x uses a conventional Unix timestamp to mark visits, and that is not difficult to convert. Chrome also stores browser history in a SQLite database, but its timestamp is the number of microseconds since January 1, 1601. I'd like to figure this out using Python, but as far as I know, the sqlite3 module doesn't support that UTC format. Is there another tool out there to convert Chrome timestamps to a human-readable format?
Use the datetime module. For example, if the number of microseconds in question is 10**16:
>>> import datetime
>>> datetime.datetime(1601, 1, 1) + datetime.timedelta(microseconds=1e16)
datetime.datetime(1917, 11, 21, 17, 46, 40)
>>> _.isoformat()
'1917-11-21T17:46:40'
This tells you it was just past a quarter to 6pm on November 21, 1917. You can format datetime objects in any way you want thanks to their strftime method, of course. If you also need to apply time zones (other than the UTC you start with), look at the third-party module pytz.
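The same arithmetic can be wrapped in a small helper for use on values read from the history database. The names `WEBKIT_EPOCH` and `chrome_time_to_datetime` are made up for this sketch, and the result is a naive datetime that should be read as UTC:

```python
from datetime import datetime, timedelta

# Chrome counts microseconds from this instant (UTC).
WEBKIT_EPOCH = datetime(1601, 1, 1)


def chrome_time_to_datetime(microseconds):
    """Convert a Chrome history timestamp to a naive UTC datetime."""
    return WEBKIT_EPOCH + timedelta(microseconds=microseconds)


print(chrome_time_to_datetime(13315808702856828).isoformat())
```

Passing an integer rather than a float keeps the microsecond part exact.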
Bash
$ date -ud @$[13315808702856828/10**6-11644473600] +"%F %T %Z"
2022-12-18 03:45:02 UTC
$ printf '%(%FT %T %z)T\n' $[13315808702856828/10**6-11644473600]
2022-12-17 T19:45:02 -0800
Perl
$ echo ".. 13315808702856828 .." |\
perl -MPOSIX -pe 's!\b(1\d{16})\b!strftime(q/%F/,gmtime($1/1e6-11644473600))!e'
.. 2022-12-17 ..
