I'm trying to change the delimiter of a CSV file and write the result to a new file. It should be a simple modification, but it isn't working.
#!/usr/bin/python
# -*- coding: utf-8 -*-
import sys
import csv

def main():
    r = open(sys.argv[1],"r")
    wr = open(sys.argv[2],"a+")
    rea = csv.reader(r, delimiter=',')
    writer = csv.writer(wr,delimiter="|", quotechar="'")
    for row in rea:
        #line = str(row).replace(",","|")
        #writer.writerow("".join(line))
        writer.writerow(row)
        print type(row)
        print row
    r.close()
    wr.close()

if __name__ == '__main__':
    main()
Update:
The console output looks like this:
./csv_read.py fim.csv salida.csv
<type 'list'>
['9/17/18 22:29', 'any', 'la_cuerda.net', 'Sep 17 22:29:29 running yum[37144]: Installed: ImageMagic-toolkit-2.1.7-1.noarch']
But the file ends up containing the same data three times, written in different ways:
The first way is still the same as before: one character per field (wrong format, and the brackets are included).
The second way puts everything into a single cell instead of splitting it into fields like the original.
This is the content of the input file and the output file:
$ cat Input.csv
Time(GMT),Host,dest,Alert
9/17/18 22:34,any,google.com.mx,monitor: Agent started: 'discovery.channel.org->any'.
9/17/18 22:29,any,la_cuerda.net,Sep 17 22:29:29 running yum[37144]: Installed: ImageMagic-toolkit-2.1.7-1.noarch
$ cat Output.csv
[,'''',T,i,m,e,(,G,M,T,),'''',|, ,'''',H,o,s,t,'''',|, ,'''',d,e,s,t,'''',|, ,'-''',A,l,e,r,t,'''',]
[,'''',9,/,1,7,/,1,8, ,2,2,:,3,4,'''',|, ,'''',a,n,y,'''',|, ,'''',g,o,o,g,l,e,.,c,o,m,.,m,x,'''',|, ,",m,o,n,i,t,o,r,:, ,A,g,e,n,t, ,s,t,a,r,t,e,d,:, ,'''',d,i,s,c,o,v,e,r,y,.,c,h,a,n,n,e,l,.,o,r,g,-,>,a,n,y,'''',.,",]
[,'''',9,/,1,7,/,1,8, ,2,2,:,2,9,'''',|, ,'''',a,n,y,'''',|, ,'''',l,a,_,c,u,e,r,d,a,.,n,e,t,'''',|, ,'''',S,e,p, ,1,7, ,2,2,:,2,9,:,2,9, ,r,u,n,n,i,n,g, ,y,u,m,[,3,7,1,4,4,],:, ,I,n,s,t,a,l,l,e,d,:, ,I,m,a,g,e,M,a,g,i,c,-,t,o,o,l,k,i,t,-,2,.,1,.,7,-,1,.,n,o,a,r,c,h,'''',]
Time(GMT)|Host|dest|Alert
9/17/18 22:34|any|google.com.mx|'monitor: Agent started: ''discovery.channel.org->any''.'
9/17/18 22:29|any|la_cuerda.net|Sep 17 22:29:29 running yum[37144]: Installed: ImageMagic-toolkit-2.1.7-1.noarch
Time(GMT)|Host|dest|Alert
9/17/18 22:34|any|google.com.mx|'monitor: Agent started: ''discovery.channel.org->any''.'
9/17/18 22:29|any|la_cuerda.net|Sep 17 22:29:29 running yum[37144]: Installed: ImageMagic-toolkit-2.1.7-1.noarch
Time(GMT)|Host|dest|Alert
9/17/18 22:34|any|google.com.mx|'monitor: Agent started: ''discovery.channel.org->any''.'
9/17/18 22:29|any|la_cuerda.net|Sep 17 22:29:29 running yum[37144]: Installed: ImageMagic-toolkit-2.1.7-1.noarch
wr = open(sys.argv[2],"a+") is the cause. Each time you run your program, it appends its output to the file. The bogus data you see is from previous runs.
Unless your program is really supposed to append to the output file rather than overwrite it, open the file in wb mode.
Also note that the Python 2 docs for csv.reader and csv.writer require opening the file in binary mode (because the module does its own line-ending handling). In Python 3, open the file in text mode with newline='' instead.
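A minimal Python 3 sketch of the fix described above, assuming the goal is only to rewrite a comma-separated file with "|" as the delimiter. The function name convert_delimiter is hypothetical. The two points that matter are the "w" mode, which truncates the output instead of appending to it, and newline="", which the Python 3 csv documentation asks for.

```python
import csv
import sys


def convert_delimiter(src_path, dst_path, old=",", new="|"):
    """Copy src_path to dst_path, replacing the field delimiter."""
    # "w" truncates any previous output; "a+" would keep appending to it.
    # newline="" lets the csv module handle line endings itself (Python 3).
    with open(src_path, "r", newline="") as src, \
            open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src, delimiter=old)
        writer = csv.writer(dst, delimiter=new)
        for row in reader:
            # row is a list of fields; pass it to writerow unchanged.
            writer.writerow(row)


if __name__ == "__main__" and len(sys.argv) == 3:
    convert_delimiter(sys.argv[1], sys.argv[2])
```

Running it twice against the same output path should leave a single copy of the data, because each run overwrites the previous one.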
I am trying to run a Python script from cron, but it does not work. I have already tried everything I have seen in multiple Stack Overflow questions. The machine is a Raspberry Pi running Raspbian. This is my crontab:
PATH=/usr/sbin:/usr/bin:/sbin/bin:/sbin:/bin:/home/pi/miniconda/bin:/usr/local/bin:/usr/local/sbin
*/5 * * * * rsync -az --timeout=10 --progress pritms@bigdata.trainhealthmanagement.com:Upload/*.csv /home/pi/PAD-S100/PAD-S100-Bloque_Motor/from_repo/ | /bin/sh /home/pi/PAD-S100/PAD-S100-Bloque_Motor/adddate_to_logs.sh >> /home/pi/PAD-S100/PAD-S100-Bloque_Motor/log.log 2>&1
*/5 * * * * /bin/sh /home/pi/PAD-S100/PAD-S100-Bloque_Motor/from_repo/launcher.sh | /bin/sh /home/pi/PAD-S100/PAD-S100-Bloque_Motor/adddate_to_logs.sh >> /home/pi/PAD-S100/PAD-S100-Bloque_Motor/log.log 2>&1
*/30 * * * * rm /home/pi/PAD-S100/PAD-S100-Bloque_Motor/log.log
* * * * * /usr/bin/python /home/pi/PAD-S100/PAD-S100-Bloque_Motor/from_repo/event_management.py | /bin/sh /home/pi/PAD-S100/PAD-S100-Bloque_Motor/adddate_to_logs.sh >> home/pi/PAD-S100/PAD-S100-Bloque_Motor/log.log 2>&1
0 0 * * * rm /home/pi/PAD-S100/PAD-S100-Bloque_Motor/from_repo/*.csv | /bin/sh /home/pi/PAD-S100/PAD-S100-Bloque_Motor/adddate_to_logs.sh >> /home/pi/PAD-S100/PAD-S100-Bloque_Motor/log.log 2>&1
Crontab observations:
The PATH obtained from echo $PATH is included.
launcher.sh, adddate_to_logs.sh and event_management.py were made executable with sudo chmod a+x <file_name>.
The log.log file does not show anything strange.
The system log file /var/log/syslog has the following logs:
Feb 27 15:11:08 raspberrypi cron[21814]: sendmail: Cannot open :25
Feb 27 15:12:01 raspberrypi rsyslogd-2007: action 'action 17' suspended, next retry is Mon Feb 27 15:13:31 2017 [try http://www.rsyslog.com/e/2007 ]
Feb 27 15:12:01 raspberrypi CRON[22209]: (pi) CMD (/usr/bin/python /home/pi/PAD-S100/PAD-S100-Bloque_Motor/from_repo/event_management.py | /bin/sh /home/pi/PAD-S100/PAD-S100-Bloque_Motor/adddate_to_logs.sh >> home/pi/PAD-S100/PAD-S100-Bloque_Motor/log.log 2>&1)
Feb 27 15:12:09 raspberrypi sSMTP[22212]: Unable to set UsesSTARTTILS=""
Feb 27 15:12:09 raspberrypi sSMTP[22212]: Unable to locate
Feb 27 15:12:09 raspberrypi cron[21814]: sendmail: Cannot open :25
Feb 27 15:12:09 raspberrypi sSMTP[22212]: Cannot open :25
Feb 27 15:12:09 raspberrypi CRON[22205]: (pi) MAIL (mailed 178 bytes of output but got status 0x0001 from MTA#012)
From these logs, the failing crontab line is probably the one that runs the Python script. I am not a Linux expert, but I believe it may be related to sSMTP: the same kind of error appears after every cron run of the Python script. I have no idea how to fix it or how to configure local email.
Here is the code of event_management.py:
#!/usr/bin/python
# -*- coding: utf-8 -*-
import imaplib
import email
import csv
import datetime

EMAIL = <email_user>
FROM_PWD = <password>
SMTP_SERVER = 'mail.o365.alstom.com'
datum = dict()
translate = {'#09': 1, '#0A': 2, '#0B': 3, '#0C': 4}

def connect_imap():
    mail = imaplib.IMAP4_SSL(SMTP_SERVER)
    mail.login(EMAIL, FROM_PWD)
    return mail

def read_email_from_gmail(writer, mail):
    mail.select('BRMS')
    kind, data = mail.search(None, 'ALL')
    mail_ids = data[0]
    id_list = mail_ids.split()
    first_email_id = int(id_list[0])
    latest_email_id = int(id_list[-1])
    for i in range(latest_email_id, first_email_id, -1):
        typ, data = mail.fetch(i, '(RFC822)')
        for response_part in data:
            if isinstance(response_part, tuple):
                msg = email.message_from_string(response_part[1])
                for part in msg.walk():
                    if part.get_content_type() == 'text/html':
                        content = part.get_payload()
                        manage_email_content(content, writer)
    return 0

def manage_email_content(content, writer):
    content = content.split('\n')
    for i, line in enumerate(content):
        if 'Alert description' in line:
            line = line.split()
            datum['Event code'] = line[-1][4:]
            if line[-1][:3] in translate:
                datum['Motor block num'] = translate[line[-1][:3]]
            else:
                # Spanish for "fault unrelated to the motor block"
                datum['Motor block num'] = 'Defecto ajeno al bloque motor'
        elif 'Alert condition' in line:
            line = line.split()
            datum['Code description'] = ' '.join(line[4:])
        elif 'Unit id' in line:
            line = line.split()
            datum['Train num'] = line[3][3:]
        elif 'Alert raised' in line:
            line = line.split()
            datum['Date'] = line[4][:10]
            datum['Time'] = line[4][11:]
    writer.writerow(datum)
    print datum
    return 0

def move_to_trash_before_date(mail, folder, days_before):
    # required to perform search, m.list() for all labels, '[Gmail]/Sent Mail'
    no_of_msgs = int(mail.select(folder)[1][0])
    print("- Found a total of {1} messages in '{0}'.".format(folder, no_of_msgs))
    before_date = (datetime.date.today() - datetime.timedelta(days_before)).strftime("%d-%b-%Y")
    typ, data = mail.search(None, '(BEFORE {0})'.format(before_date))  # search pointer for msgs before before_date
    if data != ['']:  # if not empty list means messages exist
        no_msgs_del = data[0].split()[-1]  # last msg id in the list
        print("- Marked {0} messages for removal with dates before {1} in '{2}'.".format(no_msgs_del, before_date, folder))
        mail.store("1:{0}".format(no_msgs_del), '+X-GM-LABELS', '\\Trash')  # move to trash
        empty_folder(mail, 'Elementos eliminados', do_expunge=True)  # can send do_expunge=False, default True
    else:
        print("- Nothing to remove.")
    return 0

def empty_folder(mail, folder, do_expunge=True):
    mail.select(folder)  # select all trash
    mail.store("1:*", '+FLAGS', '\\Deleted')  # Flag all Trash as Deleted
    if do_expunge:  # See Gmail Settings -> Forwarding and POP/IMAP -> Auto-Expunge
        mail.expunge()  # not needed if auto-expunge enabled
    else:
        print("Expunge was skipped.")
    return 0

def disconnect_imap(mail):
    mail.close()
    mail.logout()
    return 0

def main():
    with open('email_data.csv', 'w') as f:
        writer = csv.DictWriter(f, fieldnames=['Time', 'Date', 'Train num', 'Motor block num',
                                               'Event code', 'Code description'], delimiter=';')
        try:
            m = connect_imap()
            writer.writeheader()
            read_email_from_gmail(writer, m)
            move_to_trash_before_date(m, 'BRMS', 15)  # inbox cleanup, before 15 days
            disconnect_imap(m)
        except Exception, e:
            print str(e)

if __name__ == "__main__":
    main()
The event_management.py script connects to an Outlook email folder, reads the emails and builds a CSV file with data extracted from their contents. The script has been tested and works fine when executed manually (not through cron), so I am not sure the problem is related to the sSMTP errors in the system log.
I will appreciate any kind of help or suggestions!
After some testing and reading other users' answers, I have found the problem. It was a combination of two issues that are not directly related, but together they made this a pain to debug.
First problem:
The log.log file contains output and errors from three different executables, so I did not notice that event_management.py did not have the correct permissions. I had not applied the chmod command properly, and the error was buried in a lot of other data.
Conclusion 1: One cron job, one log file.
Conclusion 2: /var/log/syslog contains a lot of data from various sources, so it can confuse you when debugging. It is better to write separate log files.
Second problem:
I have two Python distributions installed on my machine. When I run the script manually, one is used; when cron runs it, the other is used. I only noticed this once the first problem was fixed: the log file showed a "module not found" error when cron ran the script, although it worked perfectly when run manually. It turned out that pip install <module-name> installs only into one distribution. To check which Python I was using:
which python
Conclusion: Be smart, don't be like me, don't mess with multiple Python distributions.
Bonus: Always use full paths to be clear. Cron has a different environment than your shell.
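One way to confirm the second problem is to have the script itself report which interpreter is running it. This is a small sketch, not part of the original script; the function name interpreter_report is hypothetical, and it only relies on the standard sys module.

```python
import sys


def interpreter_report():
    """Return a few lines describing the running Python interpreter."""
    lines = [
        "executable: " + sys.executable,
        "version: " + sys.version.split()[0],
    ]
    # sys.path shows where "import" will look, i.e. which site-packages is used.
    lines.extend("path: " + p for p in sys.path)
    return "\n".join(lines)


if __name__ == "__main__":
    print(interpreter_report())
```

Running it once from a shell and once from a cron entry (with its output redirected to its own log file), then comparing the two reports, should show whether cron picks a different distribution.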
How can I get a file's permission mask like 644 or 755 on *nix using python?
Is there any function or class for doing that? Thank you very much!
os.stat is a wrapper around the stat(2) system call interface.
>>> import os
>>> from stat import *
>>> os.stat("test.txt") # returns a 10-tuple; you really want the 0th element ...
posix.stat_result(st_mode=33188, st_ino=57197013, \
st_dev=234881026L, st_nlink=1, st_uid=501, st_gid=20, st_size=0, \
st_atime=1300354697, st_mtime=1300354697, st_ctime=1300354697)
>>> os.stat("test.txt")[ST_MODE] # this is an int, but we like octal ...
33188
>>> oct(os.stat("test.txt")[ST_MODE])
'0100644'
From here you'll recognize the typical octal permissions.
S_IRWXU  00700  mask for file owner permissions
S_IRUSR  00400  owner has read permission
S_IWUSR  00200  owner has write permission
S_IXUSR  00100  owner has execute permission
S_IRWXG  00070  mask for group permissions
S_IRGRP  00040  group has read permission
S_IWGRP  00020  group has write permission
S_IXGRP  00010  group has execute permission
S_IRWXO  00007  mask for permissions for others (not in group)
S_IROTH  00004  others have read permission
S_IWOTH  00002  others have write permission
S_IXOTH  00001  others have execute permission
You are really only interested in the lower bits, so you could chop off the rest:
>>> oct(os.stat("test.txt")[ST_MODE])[-3:]
'644'
>>> # or better
>>> oct(os.stat("test.txt").st_mode & 0o777)
Sidenote: the upper bits determine the file type, and the three bits just above the permissions hold the set-UID, set-GID and sticky flags, e.g.:
S_IFMT    0170000  bitmask for the file type bitfields
S_IFSOCK  0140000  socket
S_IFLNK   0120000  symbolic link
S_IFREG   0100000  regular file
S_IFBLK   0060000  block device
S_IFDIR   0040000  directory
S_IFCHR   0020000  character device
S_IFIFO   0010000  FIFO
S_ISUID   0004000  set-UID bit
S_ISGID   0002000  set-group-ID bit
S_ISVTX   0001000  sticky bit
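The split between file type bits and permission bits can be sketched in Python 3 with helpers from the stat module. The function name describe_mode is hypothetical, and the example assumes a POSIX system.

```python
import os
import stat
import tempfile


def describe_mode(path):
    """Split a file's st_mode into its file type and permission string."""
    mode = os.stat(path).st_mode
    if stat.S_ISDIR(mode):
        kind = "directory"
    elif stat.S_ISREG(mode):
        kind = "regular file"
    else:
        kind = "other"
    # S_IMODE keeps only the permission, set-UID, set-GID and sticky bits.
    perms = format(stat.S_IMODE(mode), "03o")
    return kind, perms


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        os.chmod(tmp.name, 0o644)
        print(describe_mode(tmp.name))
```

Using S_ISDIR, S_ISREG and S_IMODE avoids slicing the string returned by oct(), whose format differs between Python 2 and Python 3.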
I think this is the clearest way of getting a file's permission bits:
stat.S_IMODE(os.lstat("file").st_mode)
If the file is a symlink, os.lstat() will give you the mode of the link itself, whereas os.stat() dereferences the link. Therefore I find os.lstat() the most generally useful.
stat.S_IMODE() gets "the file’s permission bits, plus the sticky bit, set-group-id, and set-user-id bits".
Here's an example case, given regular file "testfile" and symlink to it, "testlink":
import stat
import os
print oct(stat.S_IMODE(os.lstat("testlink").st_mode))
print oct(stat.S_IMODE(os.stat("testlink").st_mode))
This script outputs the following for me:
0777
0666
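The same comparison as a self-contained Python 3 sketch, assuming a POSIX system where symlinks can be created. The helper permission_bits and the temporary file names are hypothetical. The mode reported for the link itself depends on the platform.

```python
import os
import stat
import tempfile


def permission_bits(path, follow_symlinks=True):
    """Return the permission bits of path as an octal string such as '644'."""
    st = os.stat(path) if follow_symlinks else os.lstat(path)
    return format(stat.S_IMODE(st.st_mode), "o")


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    target = os.path.join(workdir, "testfile")
    link = os.path.join(workdir, "testlink")
    open(target, "w").close()
    os.chmod(target, 0o666)
    os.symlink(target, link)
    # Mode of the link itself (os.lstat does not follow the link).
    print(permission_bits(link, follow_symlinks=False))
    # Mode of the file the link points to (os.stat follows the link).
    print(permission_bits(link))
```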
Another way to do it, if you don't want to work out what stat means, is to use os.access: http://docs.python.org/library/os.html#os.access
But read the docs first: checking with os.access before opening a file leaves a window in which the file can change, which can be a security hole.
For instance, to check permissions on the file test.dat, which has read/write permissions:
>>> import os
>>> os.access("test.dat", os.R_OK)
True
>>> # Execute permission
>>> os.access("test.dat", os.X_OK)
False
>>> # Combine flags with | (not "or"/"and"); the result is True only if all are allowed
>>> os.access("test.dat", os.R_OK | os.W_OK)
True
>>> os.access("test.dat", os.R_OK | os.X_OK)
False
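oct(os.stat('file').st_mode)[4:]
Note that this slice assumes the Python 2 representation of a regular file's mode ('0100644'). For a directory ('040755'), or under Python 3 ('0o100644'), it can give a different result, so masking with 0o777 is more robust.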
os.access(path, mode) returns True if access is allowed on path, and False if not.
The available modes are:
os.F_OK - test the existence of path.
os.R_OK - test the readability of path.
os.W_OK - test the writability of path.
os.X_OK - test if path can be executed.
For example, to check whether the file /tmp/temp.sh has execute permission:
$ ls -l /tmp/temp.sh
-rw-r--r-- 1 * * 0 Mar 2 12:05 /tmp/temp.sh
>>> os.access('/tmp/temp.sh', os.X_OK)
False
After adding execute permission:
$ chmod +x /tmp/temp.sh
$ ls -l /tmp/temp.sh
-rwxr-xr-x 1 * * 0 Mar 2 12:05 /tmp/temp.sh
>>> os.access('/tmp/temp.sh', os.X_OK)
True
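The four modes can be collected into one report. This is a Python 3 sketch assuming a POSIX system; the helper name access_summary is hypothetical. os.access answers for the user running the script, so results can differ between accounts.

```python
import os
import stat
import tempfile


def access_summary(path):
    """Report what the current user may do with path, per os.access."""
    return {
        "exists": os.access(path, os.F_OK),
        "read": os.access(path, os.R_OK),
        "write": os.access(path, os.W_OK),
        "execute": os.access(path, os.X_OK),
        # Flags combine with bitwise OR; the call is True only if ALL are allowed.
        "read+execute": os.access(path, os.R_OK | os.X_OK),
    }


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.close(fd)
    os.chmod(path, 0o644)
    print(access_summary(path))
    # Add the owner's execute bit, like chmod u+x.
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    print(access_summary(path))
    os.remove(path)
```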
Here is a simple way to check the permissions of a directory.
import os
import stat

mode = os.stat("path_of_directory").st_mode
if not (mode & stat.S_IWUSR):
    print('not writable by user')
if not ((mode & stat.S_IWUSR) and (mode & stat.S_IWGRP) and (mode & stat.S_IWOTH)):
    print('not writable by all')
The flag list is below:
S_IRWXU  00700  mask for file owner permissions
S_IRUSR  00400  owner has read permission
S_IWUSR  00200  owner has write permission
S_IXUSR  00100  owner has execute permission
S_IRWXG  00070  mask for group permissions
S_IRGRP  00040  group has read permission
S_IWGRP  00020  group has write permission
S_IXGRP  00010  group has execute permission
S_IRWXO  00007  mask for permissions for others (not in group)
S_IROTH  00004  others have read permission
S_IWOTH  00002  others have write permission
S_IXOTH  00001  others have execute permission
I'm sure there are a lot of file-based functions in the os module. If you run os.stat(filename), you can always interpret the results with the stat module:
http://docs.python.org/library/stat.html
os.stat is analogous to the C library's stat (run man 2 stat on Linux to see the details).
import os
stats = os.stat('file.txt')
print(stats.st_mode)
You can also just run the stat command through Popen if you want (the -c format option is GNU coreutils syntax):
The normal shell command:
jlc@server:~/NetBeansProjects/LineReverse$ stat -c '%A %a %n' revline.c
-rw-rw-r-- 664 revline.c
And then with Python:
>>> from subprocess import Popen, PIPE
>>> fname = 'revline.c'
>>> cmd = "stat -c '%A %a %n' " + fname
>>> out = Popen(cmd, shell=True, stdout=PIPE).communicate()[0].split()[1].decode()
>>> out
'664'
And here's another way if you feel like searching the directory:
>>> from os import popen
>>> cmd = "stat -c '%A %a %n' *"
>>> fname = 'revline.c'
>>> for i in popen(cmd):
...     p, m, n = i.split()
...     if n != fname:
...         continue
...     print(m)
...     break
...
664
>>>
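If spawning a subprocess is not required, roughly the same two fields can be produced in pure Python. This is a sketch assuming Python 3.3 or later, where stat.filemode is available. The helper name stat_like is hypothetical. It uses os.lstat because stat does not follow symlinks unless asked to.

```python
import os
import stat


def stat_like(path):
    """Return (symbolic, octal) permissions, similar to stat -c '%A %a'."""
    mode = os.lstat(path).st_mode
    # stat.filemode renders the mode the way ls -l does.
    symbolic = stat.filemode(mode)
    octal = format(stat.S_IMODE(mode), "o")
    return symbolic, octal


if __name__ == "__main__":
    print(stat_like(__file__))
```

This avoids building a shell command from a file name, which can misbehave when the name contains spaces or shell metacharacters.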