Python: Creating an empty file object

I am attempting to write a logging module for Python, but it does not work: it fails when creating the file object.
debug.py:
import os
import datetime
import globals
global fil
fil = None
def init(fname):
    fil = open(fname, 'w+')
    fil.write("# PyIDE Log for" + str(datetime.datetime.now()))
def log(strn):
    currentTime = datetime.datetime.now()
    fil.write(str(currentTime) + ' ' + str(os.getpid()) + ' ' + strn)
    print str(currentTime) + ' ' + str(os.getpid()) + ' ' + strn
def halt():
    fil.close()
fil will not work as None as I get an AttributeError. I also tried creating a dummy object:
fil = open("dummy.tmp","w+")
but the dummy.tmp file is written to instead, even though init() is called before log() is. Obviously you cannot open a new file over an already opened file. I attempted to close fil before init(), but Python said it could not perform write() on a closed file.
This is the code that is accessing debug.py
if os.path.exists(temp):
    os.rename(temp, os.path.join("logs","archived","log-" + str(os.path.getctime(temp)) + ".txt"))
debug.init(globals.logPath)
debug.log("Logger initialized!")
I would like to have logging in my program and I cannot find a workaround for this.

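Your problem is that you don't assign to the global fil:
def init(fname):
    fil = open(fname, 'w+')
This creates a new local variable called fil; the module-level fil stays None.
The global fil statement at module level has no effect. If you want to assign to the global variable fil, you need to declare it global inside the function that assigns to it:
def init(fname):
    global fil
    fil = open(fname, 'w+')
Functions that only read fil, such as log() and halt(), do not need the declaration.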
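A minimal sketch of the difference, written for Python 3 and using a temporary file so it can be run anywhere. The names init_local and init_global are only for illustration and are not part of the question's code.

```python
import os
import tempfile

fil = None  # module-level handle, as in debug.py


def init_local(fname):
    # Without "global", this assignment creates a new local name;
    # the module-level fil is left untouched.
    fil = open(fname, "w+")
    fil.close()


def init_global(fname):
    # "global fil" makes the assignment rebind the module-level name.
    global fil
    fil = open(fname, "w+")


path = os.path.join(tempfile.mkdtemp(), "demo.log")

init_local(path)
print(fil)  # still None

init_global(path)
fil.write("Logger initialized!\n")
fil.close()
```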

If you want to write your own logging module, consider turning what you already have into a class that you can import from its own module.
#LoggerThingie.py
import os
import datetime
class LoggerThingie(object):
    def __init__(self,fname):
        # open the log file once and keep the handle on the instance
        self.fil = open(fname, 'w+')
        self.fil.write("# PyIDE Log for " + str(datetime.datetime.now()) + '\n')
    def log(self,strn):
        currentTime = datetime.datetime.now()
        line = str(currentTime) + ' ' + str(os.getpid()) + ' ' + strn
        self.fil.write(line + '\n')  # one log entry per line
        print(line)
    def halt(self):
        self.fil.close()
If you do this as a class, you do not have to keep track of globals in the first place (global variables are generally considered bad practice; see "Why are global variables evil?").
Since it is now a module on its own, when you want to use it in another python program you would do this:
from LoggerThingie import LoggerThingie
#because module filename is LoggerThingie.py and ClassName is LoggerThingie
and then use it wherever you want, for example:
x = LoggerThingie('filename.txt') #create LoggerThingie object named x
and every time you want to insert a log entry into it:
x.log('log this to the file')
and when you are finally done:
x.halt() # when you are done

If you don't want to start with an empty file, you could use StringIO to keep the messages in memory and write them to disk at the end. Be careful, though: if something goes wrong before the messages are written, they will be lost.
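A rough sketch of that idea, assuming Python 3 (where StringIO lives in the io module). BufferedLogger and flush_to_disk are made-up names for illustration; the try/finally is one way to reduce the risk of losing messages, not a guarantee.

```python
import datetime
import io
import os
import tempfile


class BufferedLogger(object):
    """Hypothetical logger that keeps messages in memory until flushed."""

    def __init__(self, fname):
        self.fname = fname
        self.buffer = io.StringIO()

    def log(self, strn):
        stamp = datetime.datetime.now()
        self.buffer.write("%s %d %s\n" % (stamp, os.getpid(), strn))

    def flush_to_disk(self):
        # Nothing exists on disk until this runs; a crash before this
        # point loses every buffered message.
        with open(self.fname, "w") as out:
            out.write(self.buffer.getvalue())


log_path = os.path.join(tempfile.mkdtemp(), "buffered.log")
logger = BufferedLogger(log_path)
try:
    logger.log("first message")
    logger.log("second message")
finally:
    logger.flush_to_disk()  # still runs if the code above raised
```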

Related

Send different items as a string in Python

I am trying to make a custom logger for my Python application and I am having some issues with parsing the messages. In a nutshell, I want to be able to replace print(Item[0], "has", numberOfItems, Price[0], Availablity[0]) with my logger: Logger.Log(Item[0], "has", numberOfItems, Price[0], Availablity[0]).
On the Logger.py I have:
import os
from datetime import datetime
dirName = 'Logs'
#def initialiseLogger():
if not os.path.exists(dirName):
    os.mkdir(dirName)
dateTimeObj = datetime.now()
date = dateTimeObj.strftime("%d.%m.%Y")
logFileName = dirName + "\\" + "Log." + date + ".log"
LOG_FILE = open(logFileName, "w+")
LOG_FILE.write("Started log file")
def Log(message):
    dateTimeObj = datetime.now()
    timestampStr = dateTimeObj.strftime("%d:%m:%Y-%H:%M:%S")
    LOG_FILE.write(str(timestampStr) + "::" + message)
Some of the elements I am trying to print are not strings, so I get various errors if I try to use str(). I tried converting all of the elements to strings and then using str.join(), but with no luck.
Is there an easier way to do the conversion or should I think of implementing more logic on the Logger.Log side in order to receive anything I send it?
Thanks!
Looks like a perfect case for using str.format:
Logger.Log("{} has {} {} {}".format(Item[0], numberOfItems, Price[0], Availablity[0]))
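If you would rather keep the call looking like print(), another option is to let Log take any number of arguments and convert each one with str() before joining. This is only a sketch: the io.StringIO object stands in for the real LOG_FILE, and the sample values are made up.

```python
import io
from datetime import datetime

# Stand-in for the LOG_FILE opened in Logger.py; any object with write() works.
LOG_FILE = io.StringIO()


def Log(*parts):
    """Accept any number of values, like print(), and log them as one line."""
    timestamp = datetime.now().strftime("%d:%m:%Y-%H:%M:%S")
    # str() is applied to each part individually, so ints, floats and
    # booleans can be mixed freely with strings.
    message = " ".join(str(part) for part in parts)
    LOG_FILE.write(timestamp + "::" + message + "\n")


Log("Widget", "has", 3, 9.99, True)
```

The call site then stays the same as the print() call it replaces, with no formatting needed on the caller's side.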

Adding a header to multiple csv files

Can anyone guide me on how to add a header to multiple csv files?
Optional: If anyone knows a method to add a header to pre-existing files in C#, or can guide me to the relevant resources, that would be great.
import os
import os.path as path
import numpy as np
## First create a function that will generate random files.
def create_random_csv_files(fault_classes, number_of_files_in_each_class):
    os.mkdir("./random_data/") # Make a directory to save created files.
    for fault_class in fault_classes:
        for i in range(number_of_files_in_each_class):
            data = np.random.rand(1024,3)
            file_name = "./random_data/" + eval("fault_class") + "_" + "{0:03}".format(i+1) + ".csv" # This creates file_name
            np.savetxt(eval("file_name"), data, delimiter = ",", comments = "")
        print(str(eval("number_of_files_in_each_class")) + " " + eval("fault_class") + " files" + " created.")
np.savetxt accepts a header argument; with comments = "" the header line is written without the default "# " prefix. The eval() calls are not needed, since the variables can be used directly:
import os
import os.path as path
import numpy as np
## First create a function that will generate random files.
def create_random_csv_files(fault_classes, number_of_files_in_each_class):
    os.mkdir("./random_data/") # Make a directory to save created files.
    for fault_class in fault_classes:
        for i in range(number_of_files_in_each_class):
            data = np.random.rand(1024,3)
            file_name = "./random_data/" + fault_class + "_" + "{0:03}".format(i+1) + ".csv" # This creates file_name
            np.savetxt(file_name, data, delimiter = ",", header = "V1,V2,V3", comments = "")
        print(str(number_of_files_in_each_class) + " " + fault_class + " files" + " created.")
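For files that already exist, one possible approach in Python is to read each file and rewrite it with the header line first. The sketch below assumes the files are small enough to read into memory; add_header and the sample file names are made up for illustration, and the sample files are created in a temporary directory so the code can be run as is.

```python
import glob
import os
import tempfile


def add_header(csv_path, header_line):
    """Rewrite csv_path with header_line as its first line."""
    with open(csv_path, "r", newline="") as src:
        body = src.read()
    with open(csv_path, "w", newline="") as dst:
        dst.write(header_line + "\n")
        dst.write(body)


# Build two small sample files (hypothetical names) to run the sketch on.
data_dir = tempfile.mkdtemp()
for name in ("fault_a_001.csv", "fault_a_002.csv"):
    with open(os.path.join(data_dir, name), "w", newline="") as f:
        f.write("0.1,0.2,0.3\n0.4,0.5,0.6\n")

# Add the same header to every CSV file in the directory.
for csv_path in sorted(glob.glob(os.path.join(data_dir, "*.csv"))):
    add_header(csv_path, "V1,V2,V3")
```

Note that running add_header twice on the same file adds the header twice, so it is worth checking the first line before rewriting real data.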

Exporting results to CSV file

I'm pretty new to coding and I'm stuck on this problem. Written in python.
import logging
import os
import sys
import json
import pymysql
import requests
import csv
## set up logger to pass information to Cloudwatch ##
#logger = logging.getLogger()
#logger.setLevel(logging.INFO)
## define RDS variables ##
rds_host = 'host'
db_username = 'username'
db_password = 'password'
db_name = 'name'
## connect to rds database ##
try:
    conn = pymysql.connect(host=rds_host, user=db_username, password=db_password, db=db_name, port=1234,
                           connect_timeout=10)
except Exception as e:
    print("ERROR: Could not connect to MySql instance.")
    print(e)
    sys.exit()
print("SUCCESS: Connection to RDS mysql instance succeeded")
def main():
    with conn.cursor() as cur:
        cur.execute("SELECT Domain FROM domain_reg")
        domains = cur.fetchall()
        # logger.info(domains)
    conn.close()
    new_domains = []
    for x in domains:
        a = "http://" + x[0] + ("/orange/health")
        new_domains.append(a)
    print(new_domains)
    for y in new_domains:
        try:
            response = requests.get(y)
            if response.status_code == 200:
                print("Domain " + y + " exists")
            else:
                print("Domain " + y + " does not exist; Status code = " + str(response.status_code))
        except Exception as e:
            print("Exception: With domain " + y)
    with open("new_orangeZ.csv", "w", newline='') as csv_file:
        writer = csv.writer(csv_file, delimiter=',')
        for line in new_domains:
            writer.writerow([new_domains])
if __name__ == "__main__":
    main()
This code does create a CSV file, but it's not exporting what I want it to export. It creates a csv file listing only the "Y" values, and I understand that is because I'm passing "new_domains" to writer.writerow. I'm trying to figure out how to also export the output of the print calls in the if/else statement to the csv, if that makes sense. Sorry if this sounds like gibberish; like I said, I'm super new to coding. I was hoping to post a picture of what I get in the csv file vs. what I wanted, but I'm new to Stack Overflow too, so it doesn't allow me to post pictures haha.
Thank you!!!
print() only displays the strings on the screen.
You need to remember them somewhere, like in a new list:
result=[] #somewhere at the beginning
...
print("Domain " + y + " exists")
result.append([y,"Domain " + y + " exists"]) #after each print
and save both in the CSV file with something like:
for domain,status in result:
    writer.writerow([domain, status])
Saving the domain next to its status in the same list keeps each pair together, so you don't have to match two separate lists up by position.
By the way, with "for line in new_domains:" I guess you meant to write "line" to the CSV instead of "new_domains"...
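Put together, the idea might look like the sketch below. To keep it runnable without a database or network access, check_domain is a made-up stand-in for requests.get(y).status_code, and the two URLs are placeholders.

```python
import csv
import os
import tempfile


def check_domain(url):
    """Hypothetical stand-in for requests.get(url).status_code."""
    return 200 if "good" in url else 404


new_domains = ["http://good.example/orange/health",
               "http://bad.example/orange/health"]

result = []  # one [url, status message] pair per domain
for y in new_domains:
    status_code = check_domain(y)
    if status_code == 200:
        message = "Domain " + y + " exists"
    else:
        message = ("Domain " + y + " does not exist; Status code = "
                   + str(status_code))
    print(message)
    result.append([y, message])  # remember what was printed

csv_path = os.path.join(tempfile.mkdtemp(), "new_orangeZ.csv")
with open(csv_path, "w", newline="") as csv_file:
    writer = csv.writer(csv_file, delimiter=",")
    for domain, status in result:
        writer.writerow([domain, status])
```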
Question: I'm trying to figure out how to also export the print function that matches with the if else statement into the csv
If you want to print into a file, you have to give the file object to print(..., file=<my file object>). In your example, move the for ... inside of the with ....
Note: It's not a good idea to use csv.writer(...) for non-CSV data.
with open("test", "w", newline='') as my_file_object:
    for y in new_domains:
        # ... request and status check as before ...
        print("Domain " + y + " exists", file=my_file_object)
From Python Documentation - Built-in Functions
print(*objects, sep=' ', end='\n', file=sys.stdout, flush=False)
Print objects to the text stream file, separated by sep and followed by end.
sep, end, file and flush, if present, must be given as keyword arguments.
The file argument must be an object with a write(string) method; if it is not present or None, sys.stdout will be used.
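A small sketch of those keyword arguments in use. io.StringIO stands in for an open text file here, since it also has the required write(string) method; the URLs and status codes are made up.

```python
import io

# io.StringIO stands in for an open text file; it has the write(string)
# method that print() requires of its file argument.
my_file_object = io.StringIO()

# Made-up results, for illustration only.
results = [("http://a.example/orange/health", 200),
           ("http://b.example/orange/health", 404)]

for url, status_code in results:
    # sep replaces the default single space between the objects,
    # end replaces the default trailing newline (kept as "\n" here).
    print(url, status_code, sep=" -> ", end="\n", file=my_file_object)

text = my_file_object.getvalue()
```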

Python3 convert result of DB query to individual strings

I have 2 functions in a python script.
The first one gets the data from a database with a WHERE clause but the second function uses this data and iterates through the results to download a file.
I can print the results, but they come back as a list of tuples:
[('mmpc',), ('vmware',), ('centos',), ('redhat',), ('postgresql',), ('drupal',)]
But I need to iterate through each element as a string so the download function can append it to the URL for the response variable.
Here is the code for the download script which contains the functions:-
import requests
import eventlet
import os
import sqlite3
# declare the global variable
active_vuln_type = None
# Get the active vulnerability sets
def GetActiveVulnSets() :
    # make the variable global
    global active_vuln_type
    active_vuln_type = con = sqlite3.connect('data/vuln_sets.db')
    cur = con.cursor()
    cur.execute('''SELECT vulntype FROM vuln_sets WHERE active=1''')
    active_vuln_type = cur.fetchall()
    print(active_vuln_type)
    return(active_vuln_type)
    # return str(active_vuln_type)
def ExportList():
    vulnlist = list(active_vuln_type)
    activevulnlist = ""
    for i in vulnlist:
        activevulnlist = str(i)
        basepath = os.path.dirname(__file__)
        filepath = os.path.abspath(os.path.join(basepath, ".."))
        response = requests.get('https://vulners.com/api/v3/archive/collection/?type=' + activevulnlist)
        with open(filepath + '/vuln_files/' + activevulnlist + '.zip', 'wb') as f:
            f.write(response.content)
            f.close()
        return activevulnlist + " - " + str(os.path.getsize(filepath + '/vuln_files/' + activevulnlist + '.zip'))
Currently it creates a corrupt .zip named ('mmpc',).zip instead of the expected mmpc.zip for the first result. It also does not seem to be iterating through the list: it only creates the zip file for the first result from the DB and none of the others, but a print(i) returns [('mmpc',), ('vmware',), ('centos',), ('redhat',), ('postgresql',), ('drupal',)]
There is no traceback as the script thinks it is working.
The following fixes two issues: 1. converting the query output to an iterable of strings and 2. replacing the return statement with a print function so that the for-loop does not end prematurely.
I have also taken the liberty of removing some redundancies such as closing a file inside a with statement and pointlessly converting a list into a list. I am also calling the GetActiveVulnSets inside the ExportList function. This should eliminate the need to call GetActiveVulnSets outside of function definitions.
import requests
import eventlet
import os
import sqlite3
# declare the global variable
active_vuln_type = None
# Get the active vulnerability sets
def GetActiveVulnSets() :
    # make the variable global
    global active_vuln_type
    active_vuln_type = con = sqlite3.connect('data/vuln_sets.db')
    cur = con.cursor()
    cur.execute('''SELECT vulntype FROM vuln_sets WHERE active=1''')
    active_vuln_type = [x[0] for x in cur]
    print(active_vuln_type)
    return(active_vuln_type)
    # return str(active_vuln_type)
def ExportList():
    GetActiveVulnSets()
    activevulnlist = ""
    for i in active_vuln_type:
        activevulnlist = str(i)
        basepath = os.path.dirname(__file__)
        filepath = os.path.abspath(os.path.join(basepath, ".."))
        response = requests.get('https://vulners.com/api/v3/archive/collection/?type=' + activevulnlist)
        with open(filepath + '/vuln_files/' + activevulnlist + '.zip', 'wb') as f:
            f.write(response.content)
        print(activevulnlist + " - " + str(os.path.getsize(filepath + '/vuln_files/' + activevulnlist + '.zip')))
While this may solve the problem you are encountering, I would recommend that you write functions with parameters. This way, you know what each function is supposed to take in as an argument and what it spits out as output. In essence, avoid the usage of global variables if you can. They are hard to debug and quite frankly unnecessary in many use cases.
I hope this helps.
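As a rough illustration of the parameter-based style, the sketch below passes the connection and the list of types in as arguments instead of going through a global. The function names are made up, the in-memory database and its rows stand in for data/vuln_sets.db, and the download step is reduced to building the URLs so that nothing touches the network.

```python
import sqlite3


def get_active_vuln_sets(con):
    """Return the active vulntype values as a plain list of strings."""
    cur = con.cursor()
    cur.execute("SELECT vulntype FROM vuln_sets WHERE active=1")
    # Each row is a 1-tuple such as ('mmpc',); take its first element.
    return [row[0] for row in cur]


def build_download_urls(vuln_types, base_url):
    """Build one URL per vulnerability type (no network access here)."""
    return [base_url + vuln_type for vuln_type in vuln_types]


# In-memory database with made-up rows, standing in for data/vuln_sets.db.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE vuln_sets (vulntype TEXT, active INTEGER)")
con.executemany("INSERT INTO vuln_sets VALUES (?, ?)",
                [("mmpc", 1), ("vmware", 1), ("debian", 0)])

active = get_active_vuln_sets(con)
urls = build_download_urls(
    active, "https://vulners.com/api/v3/archive/collection/?type=")
con.close()
print(active)
```

Each function can now be tested on its own, and it is clear from the signature what it needs and what it returns.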

Script that reads PDF metadata and writes to CSV

I wrote a script to read PDF metadata to ease a task at work. The current working version is not very usable in the long run:
from pyPdf import PdfFileReader
BASEDIR = ''
PDFFiles = []
def extractor():
    output = open('windoutput.txt', 'r+')
    for file in PDFFiles:
        try:
            pdf_toread = PdfFileReader(open(BASEDIR + file, 'r'))
            pdf_info = pdf_toread.getDocumentInfo()
            #print str(pdf_info) #print full metadata if you want
            x = file + "~" + pdf_info['/Title'] + " ~ " + pdf_info['/Subject']
            print x
            output.write(x + '\n')
        except:
            x = file + '~' + ' ERROR: Data missing or corrupt'
            print x
            output.write(x + '\n')
            pass
    output.close()
if __name__ == "__main__":
    extractor()
Currently, as you can see, I have to manually input the working directory and manually populate the list of PDF files. It also just prints out the data in the terminal in a format that I can copy/paste/separate into a spreadsheet.
I'd like the script to work automatically in whichever directory I throw it in and populate a CSV file for easier use. So far:
from pyPdf import PdfFileReader
import csv
import os
def extractor():
    basedir = os.getcwd()
    extension = '.pdf'
    pdffiles = [filter(lambda x: x.endswith('.pdf'), os.listdir(basedir))]
    with open('pdfmetadata.csv', 'wb') as csvfile:
        for f in pdffiles:
            try:
                pdf_to_read = PdfFileReader(open(f, 'r'))
                pdf_info = pdf_to_read.getDocumentInfo()
                title = pdf_info['/Title']
                subject = pdf_info['/Subject']
                csvfile.writerow([file, title, subject])
                print 'Metadata for %s written successfully.' % (f)
            except:
                print 'ERROR reading file %s.' % (f)
                #output.writerow(x + '\n')
                pass
if __name__ == "__main__":
    extractor()
In its current state it seems to just print a single error message (as in, the message in the except block, not an error raised by Python) and then stop. I've been staring at it for a while and I'm not really sure where to go from here. Can anyone point me in the right direction?
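writerow([file, title, subject]) should be writerow([f, title, subject]). Also, writerow is a method of a csv.writer object, not of the file itself, so create a writer with csv.writer(csvfile) first.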
You can use sys.exc_info() to print the details of your error
http://docs.python.org/2/library/sys.html#sys.exc_info
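A short sketch of that suggestion, written for Python 3. The missing file is deliberate so that the except branch runs; describe_current_error is a made-up helper name.

```python
import os
import sys
import tempfile
import traceback


def describe_current_error():
    """Return a one-line description of the exception being handled."""
    exc_type, exc_value, exc_tb = sys.exc_info()
    return "%s: %s" % (exc_type.__name__, exc_value)


# A path that does not exist, so that opening it fails.
missing = os.path.join(tempfile.mkdtemp(), "missing.pdf")

errors = []
for f in [missing]:
    try:
        open(f, "rb").close()
    except Exception:
        # Report what actually went wrong instead of a generic message.
        errors.append(describe_current_error())
        traceback.print_exc()  # full traceback on stderr
```

Catching Exception rather than using a bare except also avoids swallowing things like KeyboardInterrupt.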
Did you check that the pdffiles variable contains what you think it does? I was getting a list inside a list (the extra square brackets around filter(...) wrap its result in another list), so maybe try:
for files in pdffiles:
    for f in files:
        pass #do stuff with f
I personally like glob. Notice that I add * before the .pdf in the extension variable:
import os
import glob
basedir = os.getcwd()
extension = '*.pdf'
pdffiles = glob.glob(os.path.join(basedir,extension))
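Combining glob with csv.writer might look like the sketch below. It uses Python 3 syntax (in Python 2 the CSV file would be opened with 'wb' as in the question), read_metadata is a made-up stand-in for the pyPdf calls, and the placeholder files are created in a temporary directory so the code runs without any real PDFs.

```python
import csv
import glob
import os
import tempfile


def read_metadata(path):
    """Hypothetical stand-in for the pyPdf calls in the question."""
    name = os.path.basename(path)
    return {"/Title": "Title of " + name, "/Subject": "Subject of " + name}


# Create empty placeholder files so the sketch has something to find.
basedir = tempfile.mkdtemp()
for name in ("a.pdf", "b.pdf", "notes.txt"):
    open(os.path.join(basedir, name), "w").close()

# glob returns a flat list of matching paths, not a list inside a list.
pdffiles = sorted(glob.glob(os.path.join(basedir, "*.pdf")))

csv_path = os.path.join(basedir, "pdfmetadata.csv")
with open(csv_path, "w", newline="") as csvfile:
    writer = csv.writer(csvfile)  # writerow lives on the writer, not the file
    for f in pdffiles:
        info = read_metadata(f)
        writer.writerow([os.path.basename(f), info["/Title"], info["/Subject"]])
```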
Figured it out. The script I used to download the files was saving the files with '\r\n' trailing after the file name, which I didn't notice until I actually ls'd the directory to see what was up. Thanks for everyone's help.
