I am trying to use "directory path" and "prefirx_pattern" from config file.
I get correct results in vdir2 and vprefix2 variable but list local_file_list is still empty.
Result:
vdir2 is"/home/ab_meta/abfiles/"
vprefix2 is "rp_pck."
[]
Code:
def get_files(self):
    try:
        print "vdir2 is" + os.environ['dir_path']
        print "vprefix2 is "+ os.environ['prefix_pattern']
        local_file_list = filter(os.path.isfile, glob.glob(os.environ['dir_path'] + os.environ['prefix_pattern'] + "*"))
        print local_file_list
        local_file_list.sort(key=lambda s: os.path.getmtime(os.path.join(os.environ['dir_path'], s)))
    except Exception, e:
        print e
        self.m_logger.error("Exception: Process threw an exception " + str(e))
        log.sendlog("error",50)
        sys.exit(1)
    return local_file_list
I have tried another way, as given below, but again the list comes back empty.
2nd Option :
def get_config(self):
    try:
        v_dir_path = os.environ['dir_path']
        v_prefix_pattern = os.environ['prefix_pattern']
        v_mail_prefix = os.environ['mail_prefix']
        self.m_dir_path = v_dir_path
        self.m_prefix_pattern = v_prefix_pattern
        self.m_mail_prefix = v_mail_prefix
    except KeyError, key:
        self.m_logger.error("ERROR: Unable to retrieve the key " + str(key))
    except Exception, e:
        print e
        self.m_logger.error("Error: job_prefix Unable to get variables " + str(e))
        sys.exit(1)

def get_files(self):
    try:
        local_file_list = filter(os.path.isfile, glob.glob(self.m_dir_path + self.m_prefix_pattern + "*"))
        local_file_list.sort(key=lambda s: os.path.getmtime(os.path.join(os.environ['dir_path'], s)))
    except Exception, e:
        print e
Thanks
Sandy
Outside of this program, wherever you set the environment variables, you are setting them incorrectly. Your environment variables have quote characters in them.
Set your environment variables to have the path data, but no quotes.
Assign the environment variable and then pass the path you are interested in into the function.
Accessing global state from within your function can make it hard to follow and debug.
Use os.walk to get the list of files; it returns a tuple of the root dir, a list of dirs, and a list of files. For me it's cleaner than using os.path.isfile to filter.
Use a list comprehension to filter the list of files returned by os.walk.
I'm presuming the print statements are for debugging, so I left them out.
vdir2 = os.environ['dir_path']
vprefix2 = os.environ['prefix_pattern']

def get_files(vpath):
    for root, dirs, files in os.walk(vpath):
        local_file_list = [f for f in files if f.startswith(vprefix2)]
        local_file_list.sort(key=lambda x: os.path.getmtime(os.path.join(root, x)))
        return local_file_list  # return after the top-level directory, matching the original non-recursive glob
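If the variables cannot be corrected where they are set, a defensive strip of the stray quote characters on the Python side also works. A minimal sketch, assuming the same environment variable names as above:

import os

# Remove any stray single or double quotes that were copied into the values,
# e.g. a dir_path of '"/home/ab_meta/abfiles/"' becomes '/home/ab_meta/abfiles/'
vdir2 = os.environ['dir_path'].strip('\'"')
vprefix2 = os.environ['prefix_pattern'].strip('\'"')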
Related
How do I store the values with content into strings?
I know there has to be a much cleaner and more efficient way of doing this, but currently I am struggling to find one. I would appreciate a set of fresh eyes on this, since I must be missing something. I have spent an outlandish amount of time on it.
My objective is:
Check if sheet.values has content -> if so, store as a string
Check if sheet.values has content -> if not, skip or create no string
The complication is that sheet.values can contain an undetermined amount of content that needs to be identified, such as being filled in up to [9] in one instance but only up to [6] in another, so the code needs to account for this.
The sheet.values also have to end up as strings, since I use makedirs() later in the code (that part gets a bit testy and also needs work, if you can help).
I know a for loop should be able to help me, but I just have not found the right one yet.
import os
import pandas as pd
from openpyxl import load_workbook
from pandas.core.indexes.base import Index
os.chdir("C:\\Users\\NAME\\desktop")
workbook = pd.ExcelFile('Example.xlsx')
sheet = workbook.parse('Sheet1')
print (sheet.values[0])
os.getcwd()
path = os.getcwd()
for input in sheet.values:
    if any(sheet.values):
        if input == None:
            break
    else:
        if any(sheet.values):
            sheet.values == input
set
str1 = '1'.join(sheet.values[0])
str2 = '2'.join(sheet.values[1])
str3 = '3'.join(sheet.values[2])
str4 = '4'.join(sheet.values[3])
str5 = '5'.join(sheet.values[4])
str6 = '6'.join(sheet.values[5])
str7 = '7'.join(sheet.values[6])
str8 = '8'.join(sheet.values[7])
str9 = '9'.join(sheet.values[8])
str10 = '10'.join(sheet.values[9])
str11 = '11'.join(sheet.values[10])
str12 = '12'.join(sheet.values[11])
str13 = '13'.join(sheet.values[12])
str14 = '14'.join(sheet.values[13])
str15 = '15'.join(sheet.values[14])
str16 = '16'.join(sheet.values[15])
str17 = '17'.join(sheet.values[16])
str18 = '18'.join(sheet.values[17])
str19 = '19'.join(sheet.values[18])
str20 = '20'.join(sheet.values[19])
str21 = '21'.join(sheet.values[20])
########################ONE################################################
try:
    if not os.path.exists(str1):
        os.makedirs(str1)
except OSError:
    print ("Creation of the directory %s failed" % str1)
else:
    print ("Successfully created the directory %s " % str1)
########################TWO################################################
try:
    if not os.path.exists(str2):
        os.makedirs(str2)
except OSError:
    print ("Creation of the directory %s failed" % str2)
else:
    print ("Successfully created the directory %s " % str2)
########################THREE################################################
try:
    if not os.path.exists(str3):
        os.makedirs(str3)
except OSError:
    print ("Creation of the directory %s failed" % str3)
else:
    print ("Successfully created the directory %s " % str3)
########################FOUR################################################
try:
    if not os.path.exists(str4):
        os.makedirs(str4)
except OSError:
    print ("Creation of the directory %s failed" % str4)
else:
    print ("Successfully created the directory %s " % str4)
Note: the makedirs() blocks continue like this down to the full set of strings.
The Excel document shows the following (screenshot not included).
This script results in: index 9 is out of bounds for axis 0 with size 9
This is admittedly expected, as sheet.values only holds that many rows.
Can anyone help me? I know it is messy.
Updated Code
import os
import pandas as pd
from openpyxl import load_workbook
from pandas.core.indexes.base import Index
os.chdir("C:\\Users\\NAME\\desktop")
workbook = pd.ExcelFile('Example.xlsx')
sheet = workbook.parse('Sheet1')
print (sheet.values[0])
os.getcwd()
path = os.getcwd()
print ("The current working Directory is %s" % path)
for col in sheet.values:
    for row in range(len(col)):
        dir_name = str(row + 1) + col[row]
        try:
            os.makedirs(dir_name, exist_ok=True)
        except OSError:
            print ("Creation of the directory %s failed" % dir_name)
        else:
            print ("Successfully created the directory %s " % dir_name)
It seems like you're trying to read the first column of a csv and create directories based on the values.
with open(mypath+file) as file_name:
    file_read = csv.reader(file_name)
    file = list(file_read)
    for col in file:
        for row in range(len(col)):
            dir_name = str(row + 1) + col[row]
            try:
                # https://docs.python.org/3/library/os.html#os.makedirs
                os.makedirs(dir_name, exist_ok=True)
            except OSError:
                print ("Creation of the directory %s failed" % dir_name)
            else:
                print ("Successfully created the directory %s " % dir_name)
I'm trying to stop this code from giving me an error about a file I created called beloved.txt. I used FileNotFoundError to say not to give me the error and just print that the file is not found, but instead it's printing both my message and the error message. How can I fix it?
def count_words(Filenames):
    with open(Filenames) as fill_object:
        contentInFill = fill_object.read()
        words = contentInFill.rsplit()
        word_length = len(words)
        print("The file " + Filename + " has " + str(word_length) + " words.")
    try:
        Filenames = open("beloved.txt", mode="rb")
        data = Filenames.read()
        return data
    except FileNotFoundError as err:
        print("Cant find the file name")

Filenames = ["anna.txt", "gatsby.txt", "don_quixote.txt", "beloved.txt", "mockingbird.txt"]
for Filename in Filenames:
    count_words(Filename)
A few tips:
Don't capitalize variables besides class names.
Use different variable names when referring to different things. (For example, don't use Filenames = open("beloved.txt", mode="rb") when you already have a global version of that variable and a local version of that variable, and now you are reassigning it to mean something different again! This behavior will lead to headaches.)
The main problem with the script, though, is trying to open a file outside your try statement. You can just move your code to be within the try:! I also don't understand except FileNotFoundError as err: when you don't use err. You should rewrite that to except FileNotFoundError: in this case :)
def count_words(file):
    try:
        with open(file) as fill_object:
            contentInFill = fill_object.read()
            words = contentInFill.rsplit()
            word_length = len(words)
            print("The file " + file + " has " + str(word_length) + " words.")
        with open("beloved.txt", mode="rb") as other_file:
            data = other_file.read()
        return data
    except FileNotFoundError:
        print("Cant find the file name")

filenames = ["anna.txt", "gatsby.txt", "don_quixote.txt", "beloved.txt", "mockingbird.txt"]
for filename in filenames:
    count_words(filename)
I also do not understand why you have your function return data when data is read from the same file regardless of the file you pass to the function. You will get the same result returned in all cases...
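If the goal was simply to report a word count per file (my assumption, not something the question states), one option is to drop the beloved.txt read entirely and return the count instead:

def count_words(file):
    try:
        with open(file) as fill_object:
            words = fill_object.read().rsplit()
    except FileNotFoundError:
        print("Cant find the file " + file)
        return None
    print("The file " + file + " has " + str(len(words)) + " words.")
    return len(words)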
The "with open(Filenames) as fill_objec:" sentence will throw you the exception.
So you at least must enclose that sentence in the try part. In your code you first get that len in words, and then you check for the specific file beloved.txt. This doubled code lets you to the duplicated mensajes. Suggestion:
def count_words(Filenames):
    try:
        with open(Filenames) as fill_object:
            contentInFill = fill_object.read()
            words = contentInFill.rsplit()
            word_length = len(words)
            print("The file " + Filenames + " has " + str(word_length) + " words.")
    except FileNotFoundError as err:
        print("Cant find the file name")
I am using part of a script I found on here to run through the directories on a Windows PC and produce an XML file. However, I am running into the above error and I am not sure how to handle it. I have added a try/except, but it still crashes out. This works perfectly if I set the directory to my current working directory by replacing DirTree3("C:/") with DirTree3(os.getcwd()).
def DirTree3(path):
    try:
        result = '<dir>%s\n' % xml_quoteattr(os.path.basename(path))
        for item in os.listdir(path):
            itempath = os.path.join(path, item)
            if os.path.isdir(itempath):
                result += '\n'.join('  ' + line for line in
                                    DirTree3(os.path.join(path, item)).split('\n'))
            elif os.path.isfile(itempath):
                result += '  <file> %s </file>\n' % xml_quoteattr(item)
        result += '</dir> \n'
        return result
    except Exception:
        pass

print '<DirectoryListing>\n' + DirTree3("C:/") + '\n</DirectoryListing>'
As a side note, this script will be run on a system without admin privileges, so running as admin is not an option.
Based on your comments below about getting path access errors and wanting to ignore them, I modified the code in my answer below to do that as best it can. Note that it will still terminate if some other type of exception occurs.
def DirTree3(path):
    try:
        result = '<dir>%s\n' % xml_quoteattr(os.path.basename(path))
        try:
            items = os.listdir(path)
        except WindowsError as exc:
            return '<error> {} </error>'.format(xml_quoteattr(str(exc)))
        for item in items:
            itempath = os.path.join(path, item)
            if os.path.isdir(itempath):
                result += '\n'.join('  ' + line for line in
                                    DirTree3(os.path.join(path, item)).split('\n'))
            elif os.path.isfile(itempath):
                result += '  <file> %s </file>\n' % xml_quoteattr(item)
        result += '</dir> \n'
        return result
    except Exception as exc:
        print('exception occurred: {}'.format(exc))
        raise

print '<DirectoryListing>\n' + DirTree3("C:/") + '\n</DirectoryListing>'
I've created a simple dns.query function and am attempting to add the results to a list or potentially a dictionary. However, I can't work out how to achieve it. I have tried list.append(subdomain, item), I've tried using the join function, and I have tried to use the update function, respectively.
Any pointers would be appreciated.
ORIGINAL
def get_brutes(subdomain):
    targets = []
    try:
        myResolver = dns.resolver.Resolver()
        myResolver.nameservers = ['8.8.8.8']
        myAnswers = myResolver.query(subdomain)
        for item in myAnswers.rrset:
            targets.append(subdomain,item)
    except Exception as e:
        pass
    return targets
FIX
def get_brutes(subdomain):
    targets = []
    try:
        myResolver = dns.resolver.Resolver()
        myResolver.nameservers = ['8.8.8.8']
        myAnswers = myResolver.query(subdomain)
        for item in myAnswers.rrset:
            targets.append(subdomain + ' ' + str(item))
    except Exception as e:
        pass
    return targets
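If a dictionary keyed by subdomain turns out to be more convenient (the question mentions it as an option), a minimal sketch along the same lines; the name get_brutes_dict is just an illustrative variant, not part of the original code:

def get_brutes_dict(subdomain):
    results = {}
    try:
        myResolver = dns.resolver.Resolver()
        myResolver.nameservers = ['8.8.8.8']
        myAnswers = myResolver.query(subdomain)
        # one key per subdomain, with every resolved record collected in a list
        results[subdomain] = [str(item) for item in myAnswers.rrset]
    except Exception:
        pass  # mirror the original's silent failure
    return results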
I am using the following function of a class to find out if every .csv has a corresponding .csv.meta in the given directory.
I am getting "None" for files which are just .csv and a hexadecimal object representation for .csv.meta files.
Result
None
<_sre.SRE_Match object at 0x1bb4300>
None
<_sre.SRE_Match object at 0xbd6378>
This is the code:
def validate_files(self,filelist):
    try:
        local_meta_file_list = []
        local_csv_file_list = []
        # Validate each files and see if they are pairing properly based on the pattern *.csv and *.csv.meta
        for tmp_file_str in filelist:
            csv_match = re.search(self.vprefix_pattern + '([0-9]+)' + self.vcsv_file_postfix_pattern + '$' , tmp_file_str)
            if csv_match:
                local_csv_file_list.append(csv_match.group())
                meta_file_match_pattern=self.vprefix_pattern + csv_match.group(1) + self.vmeta_file_postfix_pattern
                tmp_meta_file = [os.path.basename(s) for s in filelist if meta_file_match_pattern in s]
                local_meta_file_list.extend(tmp_meta_file)
    except Exception, e:
        print e
        self.m_logger.error("Error: Validate File Process thrown exception " + str(e))
        sys.exit(1)
    return local_csv_file_list, local_meta_file_list
These are file names.
File Names
rp_package.1406728501.csv.meta
rp_package.1406728501.csv
rp_package.1402573701.csv.meta
rp_package.1402573701.csv
rp_package.1428870707.csv
rp_package.1428870707.meta
Thanks
Sandy
If all you need is to find .csv files which have corresponding .csv.meta files, then I don’t think you need to use regular expressions for filtering them. We can filter the file list for those with the .csv extension, then filter that list further for files whose name, plus .meta, appears in the file list.
Here’s a simple example:
myList = [
    'rp_package.1406728501.csv.meta',
    'rp_package.1406728501.csv',
    'rp_package.1402573701.csv.meta',
    'rp_package.1402573701.csv',
    'rp_package.1428870707.csv',
    'rp_package.1428870707.meta',
]

def validate_files(file_list):
    loc_csv_list = filter(lambda x: x[-3:].lower() == 'csv', file_list)
    loc_meta_list = filter(lambda c: '%s.meta' % c in file_list, loc_csv_list)
    return loc_csv_list, loc_meta_list

print validate_files(myList)
If there may be CSV files that don’t conform to the rp_package format, and need to be excluded, then we can initially filter the file list using the regex. Here’s an example (swap out the regex parameters as necessary):
import re
vprefix_pattern = 'rp_package.'
vcsv_file_postfix_pattern = '.csv'
regex_str = vprefix_pattern + '[0-9]+' + vcsv_file_postfix_pattern
def validate_files(file_list):
    csv_list = filter(lambda x: re.search(regex_str, x), file_list)
    loc_csv_list = filter(lambda x: x[-3:].lower() == 'csv', csv_list)
    loc_meta_list = filter(lambda c: '%s.meta' % c in file_list, loc_csv_list)
    return loc_csv_list, loc_meta_list
print validate_files(myList)
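As a side note, these examples target Python 2, where filter returns a list and print is a statement. If the code ever needs to run on Python 3, list comprehensions keep the behaviour the same; a minimal sketch of the same logic, assuming the same myList as above:

import re

vprefix_pattern = 'rp_package.'
vcsv_file_postfix_pattern = '.csv'
regex_str = vprefix_pattern + '[0-9]+' + vcsv_file_postfix_pattern

def validate_files(file_list):
    # comprehensions instead of filter() so the results are lists on Python 3 as well
    loc_csv_list = [f for f in file_list if re.search(regex_str, f) and f[-3:].lower() == 'csv']
    loc_meta_list = [f for f in loc_csv_list if '%s.meta' % f in file_list]
    return loc_csv_list, loc_meta_list

print(validate_files(myList))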