Iterating the entire list - python

My code hits the variable endpoint and then creates a log file (UUID.log); these log files are unique for every hit. Inside every log file there is a JSON object (process_name, process_id), where the endpoint name gets logged as the process_name.
The if condition checks for a duplicate process_name inside the existing log files before creating a new file, to ensure that a log file with a duplicate process_name does not get created.
from flask import Flask, jsonify
import json
import uuid
import os
import test1

app = Flask(__name__)

@app.route('/<string:name>')
def get_stats(name):
    proceuudi = uuid.uuid4()
    stat = [
        {
            'process_id': str(proceuudi),
            'process_name': name
        }
    ]
    os.chdir("file_path")
    files = os.listdir('file_path')
    l = []
    for i in files:
        with open(i) as f:
            data = json.load(f)
            for j in data:
                l.append(j)
    for j in l:
        print(j)
        if j['process_name'] != name:
            with open(str(proceuudi) + '.log', 'w+') as f:  # writing JSON object
                json.dump(stat, f)
            return jsonify({'stats': stat})
        else:
            return 'Process already running'

app.run(port=6011)
Whenever I try to parse the list (l = []) containing the process_name and process_id entries, I am not able to parse the entire list; it only checks the starting index. If it finds j['process_name'] != name at the first index, the function returns right away. Is there a way to parse the entire list first, and only create the log file with that process name if the process_name does not exist in any log file?

Use a set to hold the process_name values, as this avoids scanning the whole list.
Don't scan all the files on every call; use a global variable to hold the names in memory.
from flask import Flask, jsonify
import json
import uuid
import os

app = Flask(__name__)

# use a set, as membership checks (the `in` operator) are O(1)
l = set()
running = False

@app.route('/<string:name>')
def get_stats(name):
    global l, running
    proceuudi = uuid.uuid4()
    # why a list? from the code it is clear that one file will have only one entry
    stat = [
        {
            'process_id': str(proceuudi),
            'process_name': name
        }
    ]
    # take all names once, at the start of the server
    if not running:
        # better to write a new function for this stuff
        files = os.listdir('./file_path')
        print(files)
        for i in files:
            with open("./file_path/" + i) as f:
                data = json.load(f)
                for j in data:
                    l.add(j["process_name"])
        running = True
    if name in l:
        # use jsonify here too
        return jsonify("process running")
    else:
        # add the new process_name to the in-memory variable
        l.add(stat[0]["process_name"])
        with open("./file_path/" + str(proceuudi) + '.log', 'w+') as f:  # writing JSON object
            json.dump(stat, f)
        return jsonify({'stats': stat})

app.run(port=6011)
NOTE: Code Review is a better fit for this type of question.

Related

best way to check if files have been processed

E: My initial title was very misleading.
I have a SQL server with a database, and I have around 10,000 Excel files in a directory. The files contain values I need to copy into the DB, with new Excel files being added on a daily basis. Additionally, each file contains a field "finished" with a boolean value that expresses whether the file is ready to be copied to the DB. However, the filename is not connected to its content. Only the content of a file contains primary keys and field names corresponding to the DB's keys and field names.
Checking whether a file's content is already in the DB by comparing the primary key over and over is not feasible, since opening the files is far too slow. I could, however, check initially which files are already in the DB and write the result to a file (say copied.txt), so it simply holds the filenames of all already-copied files. The real service could then load this file's content into a dictionary (dict1) with the filename as the key and no value (I think hash tables are the fastest for comparative operations), then store the filenames of all existing Excel files in the directory in a second dictionary (dict2), compare both dictionaries, and create a list of all files that are in dict2 but not in dict1. I would then iterate through the list (it should usually contain only around 10-20 files), check whether each file is flagged as "ready to be copied", and copy the values to the database. Finally, I would add the file's name to dict1 and store it back to the copied.txt file.
My idea is to run this Python script as a service that loops as long as there are files to work with. When it can't find files to copy from, it should wait for x seconds (maybe 45) and then start over.
This is my best concept so far. Is there a faster/more efficient way to do it?
It just came back to my mind that sets only contain unique elements and are thus the best data type for a comparison like this. It is a data type I hardly knew, but now I can see how useful it can be.
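To illustrate the idea, here is a minimal sketch of that set comparison (the file name copied.txt and the directory pattern are just placeholders; the full script below does the real work):
import glob

with open("copied.txt") as f:
    done = set(line.strip() for line in f)  # filenames already copied to the DB
on_disk = set(glob.glob("c:/myXlFiles/**/*.xlsm", recursive=True))  # files currently in the directory tree
to_process = on_disk - done  # set difference: files still to be checked and copied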
The part of the code that is related to my original question is in parts 1-3.
The program:
1. loads file names from a file into a set
2. loads file names from the filesystem (a certain dir + subdirs) into a set
3. creates a list of the difference of the two sets
4. iterates through all remaining files; checks whether they have been flagged as "finalized", then, for each row, creates a new record in the database and adds values to that record (one by one)
5. adds the processed file's name to the file of filenames.
It does so every 5 minutes. This is completely fine for my purpose.
I am very new to coding, so sorry for my dilettantish approach. At least it works so far.
#modules
import pandas as pd
import pyodbc as db
import xlwings as xw
import glob
import os
from datetime import datetime, date
from pathlib import Path
import time
import sys

#constants
tick_time_seconds = 300
line = ("################################################################################### \n")
pathTodo = "c:\\myXlFiles\\**\\*"
pathDone = ("c:\\Done\\")
pathError = ("c:\\Error\\")
sqlServer = "MyMachine\\MySQLServer"
sqlDriver = "{SQL Server}"
sqlDatabase = "master"
sqlUID = "SA"
sqlPWD = "PWD"

#functions
def get_list_of_files_by_extension(path: str, extension: str) -> list:
    """Receives string path and extension;
    gets list of files with the corresponding extension in path;
    returns list of files with full path."""
    fileList = glob.glob(path + extension, recursive=True)
    if not fileList:
        print("no files found")
    else:
        print("found files")
    return fileList
def write_error_to_log(description: str, errorString: str, optDetails=""):
    """Receives strings description, errorString and opt(ional)Details;
    writes the error with date and time to a logfile named after the current date;
    returns nothing."""
    logFileName = str(date.today()) + ".txt"
    optDetails = optDetails + "\n"
    dateTimeNow = datetime.now()
    newError = "{0}\n{1}\n{2}{3}\n".format(line, str(dateTimeNow), optDetails, errorString)
    print(newError)
    with open(Path(pathError, logFileName), "a") as logFile:
        logFile.write(newError)

def sql_connector():
    """sql_connector: Receives nothing;
    creates a connection to the SQL server (connection details should be constants);
    returns a connection."""
    return db.connect("DRIVER="+sqlDriver+"; \
                       SERVER="+sqlServer+"; \
                       DATABASE="+sqlDatabase+"; \
                       UID="+sqlUID+"; \
                       PWD="+sqlPWD+";")

def sql_update_builder(dbField: str, dbValue: str, dbKey: str) -> str:
    """sql_update_builder: takes strings dbField, dbValue and dbKey;
    creates a SQL command that updates the value of the
    corresponding field for the corresponding key;
    returns a string with a SQL command."""
    return "\
        UPDATE [tbl_Main] \
        SET ["+dbField+"]='"+dbValue+"' \
        WHERE ((([tbl_Main].MyKey)="+dbKey+"));"

def sql_insert_builder(dbKey: str) -> str:
    """sql_insert_builder: takes string dbKey;
    creates a SQL command that inserts a new record;
    returns a string with a SQL command."""
    return "\
        INSERT INTO [tbl_Main] ([MyKey])\
        VALUES ("+dbKey+")"

def append_filename_to_fileNameFile(xlFilename):
    """Receives anything xlFilename;
    converts it to a string and writes the filename (full path) to a file;
    returns nothing."""
    with open(Path(pathDone, "filesDone.txt"), "a") as logFile:
        logFile.write(str(xlFilename) + "\n")
###################################################################################
###################################################################################
# main loop
while __name__ == "__main__":
    ###################################################################################
    """ 1. load filesDone.txt into set"""
    listDone = []
    print(line + "reading filesDone.txt in " + pathDone)
    try:
        with open(Path(pathDone, "filesDone.txt"), "r") as filesDoneFile:
            if filesDoneFile:
                print("file contains entries")
                for filePath in filesDoneFile:
                    filePath = filePath.replace("\n", "")
                    listDone.append(Path(filePath))
    except Exception as err:
        errorDescription = "failed to read filesDone.txt from {0}".format(pathDone)
        write_error_to_log(description=errorDescription, errorString=str(err))
        continue
    else:
        setDone = set(listDone)
    ###################################################################################
    """ 2. load filenames of all .xlsm files into set"""
    print(line + "trying to get list of files in filesystem...")
    try:
        listFileSystem = get_list_of_files_by_extension(path=pathTodo, extension=".xlsm")
    except Exception as err:
        errorDescription = "failed to read file system "
        write_error_to_log(description=errorDescription, errorString=str(err))
        continue
    else:
        listFiles = []
        for filename in listFileSystem:
            listFiles.append(Path(filename))
        setFiles = set(listFiles)
    ###################################################################################
    """ 3. create list of the difference of setFiles and setDone"""
    print(line + "trying to compare done files and files in filesystem...")
    setDifference = setFiles.difference(setDone)
    ###################################################################################
    """ 4. iterate through list of files """
    for filename in setDifference:
        """ 4.1 try: look if file is marked as "finalized=True";
            if the xlfile does not have sheet 7 (old ones)
            just add the xlfilename to the xlfilenameFile"""
        try:
            print("{0}trying to read finalized state ... of {1}".format(line, filename))
            filenameClean = str(filename).replace("\n", "")
            xlFile = pd.ExcelFile(filenameClean)
        except Exception as err:
            errorDescription = "failed to read finalized-state from {0} to dataframe".format(filename)
            write_error_to_log(description=errorDescription, errorString=str(err))
            continue
        else:
            if "finalized" in xlFile.sheet_names:
                dataframe = xlFile.parse("finalized")
                print("finalized state =" + str(dataframe.iloc[0]["finalized"]))
                if dataframe.iloc[0]["finalized"] == False:
                    continue
            else:
                append_filename_to_fileNameFile(filename)  # add the xlfilename to the xlfilenameFile
                continue
        ###################################################################################
        """ 4.2 try: read values to dataframe"""
        try:
            dataframe = pd.read_excel(Path(filename), sheet_name=4)
        except Exception as err:
            errorDescription = "Failed to read values from {0} to dataframe".format(filename)
            write_error_to_log(description=errorDescription, errorString=str(err))
            continue
        ###################################################################################
        """ 4.2 try: open connection to database"""
        print("{0}Trying to open connection to database {1} on {2}".format(line, sqlDatabase, sqlServer))
        try:
            sql_connection = sql_connector()  # create connection to server
            stuff = sql_connection.cursor()
        except Exception as err:
            write_error_to_log(description="Failed to open connection:", errorString=str(err))
            continue
        ###################################################################################
        """ 4.3 try: write to database"""
        headers = list(dataframe)  # copy header from dataframe to list; easier to iterate
        values = dataframe.values.tolist()  # copy values from dataframe to list of lists [[row1][row2]...]; easier to iterate
        for row in range(len(values)):  # iterate over lines
            dbKey = str(values[row][0])  # first col is key
            sqlCommandString = sql_insert_builder(dbKey=dbKey)
            """ 4.3.1 first try to create (aka insert) a new record in the db ..."""
            try:
                print("{0}Trying to insert new record with the id {1}".format(line, dbKey))
                stuff.execute(sqlCommandString)
                sql_connection.commit()
                print(sqlCommandString)
            except Exception as err:
                sql_log_string = " ".join(sqlCommandString.split())  # get rid of whitespace in sql command
                write_error_to_log(description="Failed to create new record in DB:", errorString=str(err), optDetails=sql_log_string)
            else:  # if the record was created, add the values one by one:
                print("{0}Trying to add values to record with the ID {1}".format(line, dbKey))
                """ 4.3.2 ... then try to add the values one by one"""
                for col in range(1, len(headers)):  # skip col 0 (the key)
                    dbField = str(headers[col])  # field in db is header in the excel sheet
                    dbValue = str(values[row][col])  # get the corresponding value
                    dbValue = (dbValue.replace("\"", "")).replace("\'", "")  # strip ' and " to prevent trouble with the sql command
                    sqlCommandString = sql_update_builder(dbField, dbValue, dbKey)  # call function to create a sql update command string
                    try:  # try to execute and commit the sql command
                        stuff.execute(sqlCommandString)
                        sql_connection.commit()
                        print(sqlCommandString)
                    except Exception as err:
                        sql_log_string = " ".join(sqlCommandString.split())  # get rid of whitespace in sql command
                        write_error_to_log(description="Failed to add values in DB:", errorString=str(err), optDetails=sql_log_string)
        append_filename_to_fileNameFile(filename)
        print(line)
    # wait for a certain amount of time
    for i in range(tick_time_seconds, 0, -1):
        sys.stdout.write("\r" + str(i))
        sys.stdout.flush()
        time.sleep(1)
    sys.stdout.flush()
    print(line)
    # break  # this is for debugging

grab next .zip file in folder (iterate through zip directory)

Below is my most recent attempt; but alas, when I print current_file it's always the same (first) .zip file in my directory.
Why/how can I iterate this to get to the next file in my zip directory?
My DIRECTORY_LOCATION has 4 zip files in it.
def find_file(cls):
    listOfFiles = os.listdir(config.DIRECTORY_LOCATION)
    total_files = 0
    for entry in listOfFiles:
        total_files += 1
        # if fnmatch.fnmatch(entry, pattern):
        current_file = entry
        print(current_file)
        """"Finds the excel file to process"""
        archive = ZipFile(config.DIRECTORY_LOCATION + "/" + current_file)
        for file in archive.filelist:
            if file.filename.__contains__('Contact Frog'):
                return archive.extract(file.filename, config.UNZIP_LOCATION)
    return FileNotFoundError
find_file usage:
excel_data = pandas.read_excel(self.find_file())
Update:
I just tried changing return to yield at:
yield archive.extract(file.filename, config.UNZIP_LOCATION)
and now I'm getting the error below at my find_file line.
ValueError: Invalid file path or buffer object type: <class 'generator'>
Then I altered it to use the generator object as suggested in the comments, i.e.:
generator = self.find_file(); excel_data = pandas.read_excel(generator())
and now I'm getting this error:
generator = self.find_file(); excel_data = pandas.read_excel(generator())
TypeError: 'generator' object is not callable
Here is my /main.py if helpful
"""Start Point"""
from data.find_pending_records import FindPendingRecords
from vital.vital_entry import VitalEntry
import sys
import os
import config
import datetime
# from csv import DictWriter

if __name__ == "__main__":
    try:
        for file in os.listdir(config.DIRECTORY_LOCATION):
            if 'VCCS' in file:
                PENDING_RECORDS = FindPendingRecords().get_excel_data()
                # Do operations on PENDING_RECORDS
                # Reads excel to map data from excel to vital
                MAP_DATA = FindPendingRecords().get_mapping_data()
                # Configures Driver
                VITAL_ENTRY = VitalEntry()
                # Start chrome and navigate to vital website
                VITAL_ENTRY.instantiate_chrome()
                # Begin processing Records
                VITAL_ENTRY.process_records(PENDING_RECORDS, MAP_DATA)
    except:
        print("exception occured")
        raise
It is not tested.
def find_file(cls):
    listOfFiles = os.listdir(config.DIRECTORY_LOCATION)
    total_files = 0
    for entry in listOfFiles:
        total_files += 1
        # if fnmatch.fnmatch(entry, pattern):
        current_file = entry
        print(current_file)
        """"Finds the excel file to process"""
        archive = ZipFile(config.DIRECTORY_LOCATION + "/" + current_file)
        for file in archive.filelist:
            if file.filename.__contains__('Contact Frog'):
                yield archive.extract(file.filename, config.UNZIP_LOCATION)
This is just your function rewritten with yield instead of return.
I think it should be used in the following way:
for extracted_archive in self.find_file():
excel_data = pandas.read_excel(extracted_archive)
#do whatever you want to do with excel_data here
self.find_file() is a generator, so it should be used like an iterator (read this answer for more details).
Try to integrate the previous loop into your main script. On each iteration of the loop it will read a different file into excel_data, so the body of the loop is also where you should do whatever you need to do with the data.
Not sure what you mean by:
just one each time the script is executed
Even with yield, if you execute the script multiple times, you will always start from the beginning (and always get the first file). You should read all of the files in the same execution.
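A rough sketch of what that integration could look like in main.py, assuming find_file is the generator version above (exactly how FindPendingRecords consumes each workbook is an assumption on my part):
finder = FindPendingRecords()
for extracted_path in finder.find_file():
    # each iteration yields the path of one extracted 'Contact Frog' sheet
    excel_data = pandas.read_excel(extracted_path)
    # ... do the per-file work (mapping, record entry, etc.) with excel_data here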

Verify data on a file

I'm trying to make my life easier at work by writing down errors and the solutions to those errors. The program itself works fine when it comes to adding new errors, but then I added a function to verify whether an error already exists in the file and then do something with it (not added yet).
The function doesn't work and I don't know why. I tried to debug it, but I'm still not able to find the error; maybe it's a conceptual error?
Anyway, here's my entire code.
import sys
import os

err = {}
PATH = 'C:/users/userdefault/desktop/errordb.txt'

#def open_file(): #Not yet used
#    file_read = open(PATH, 'r')
#    return file_read

def verify_error(error_number, loglist): #Verify if error exists in file
    for error in loglist:
        if error_number in loglist:
            return True

def dict_error(error_number, solution): #Puts input errors in dict
    err = {error_number: solution}
    return err

def verify_file(): #Verify if file exists. Return True if it does
    archive = os.path.isfile(PATH)
    return archive

def new_error():
    file = open(PATH, 'r') #Opens file in read mode
    loglist = file.readlines()
    file.close()
    found = False
    error_number = input("Error number: ")
    if verify_error(error_number, loglist) == True:
        found = True
        # Add new solution, or another solution.
        pass
    solution = str(input("Solution: "))
    file = open(PATH, 'a')
    error = dict_error(error_number, solution)
    #Writes dict on file
    file.write(str(error))
    file.write("\n")
    file.close()

def main():
    verify = verify_file() #Verify if file exists
    if verify == True:
        new = str.lower(input("New job Y/N: "))
        if new == 'n':
            sys.exit()
        while new == 'y':
            new_error()
            new = str.lower(input("New job Y/N: "))
        else:
            sys.exit()
    else:
        file = open(PATH, "x")
        file.close()
        main()

main()
To clarify, the program executes fine and doesn't return an error code. It just won't execute the way I intended; I mean, it's supposed to verify whether a certain error number already exists.
Thanks in advance :)
The issue I believe you're having is that you're not actually creating a single dictionary object in the file and modifying it, but instead creating an additional dictionary every time an error is added and then reading them all back as a list of strings via the .readlines() method.
An easier way of doing it would be to create a dictionary if one doesn't exist and append errors to it. I've made a few modifications to your code which should help.
import sys
import os
import json # Import json and use it as the format to store our data in

err = {}
PATH = 'C:/users/userdefault/desktop/errordb.txt'

# You can achieve this by using a context manager
#def open_file(): #Not yet used
#    file_read = open(PATH, 'r')
#    return file_read

def verify_error(error_number, loglist): #Verify if error exists in file
    # Notice how we're looping over the keys of your dictionary to check if
    # an error already exists.
    # To access values use loglist[k]
    for k in loglist.keys():
        if error_number == k:
            return True
    return False

def dict_error(loglist, error_number, solution): #Puts input errors in dict
    # Instead of returning a new dictionary, return the existing one
    # with the new error appended to it
    loglist[error_number] = solution
    return loglist

def verify_file(): #Verify if file exists. Return True if it does
    archive = os.path.isfile(PATH)
    return archive

def new_error():
    # Let's move all the variables to the top, it makes the function easier to read
    # Changes made:
    # 1. Changed the way we open and read files, now using a context manager (aka with open() as f:)
    # 2. Added a json parser to store in and read from the file in a json format. If data doesn't exist (new file?) create a new dictionary object instead
    # 3. Added an exception to signify that an error has been found in the database (this can be removed to add additional logic if you'd like to do more stuff to the error, etc.)
    # 4. Changed the way we write to file: instead of appending a new line we now override the contents with a new updated dictionary that has been serialized into a json format
    found = False
    loglist = None
    # Open file as read-only using a context manager, now we don't have to worry about closing it manually
    with open(PATH, 'r') as f:
        # Let's read the file and run it through a json parser to get a python dictionary
        try:
            loglist = json.loads(f.read())
        except json.decoder.JSONDecodeError:
            loglist = {}
    error_number = input("Error number: ")
    if verify_error(error_number, loglist) is True:
        found = True
        raise Exception('Error exists in the database') # Raise exception if you want to stop loop execution
        # Add new solution, or another solution.
    solution = str(input("Solution: "))
    # This time open in write-only mode and replace the dictionary
    with open(PATH, 'w') as f:
        loglist = dict_error(loglist, error_number, solution)
        # Writes dict to file in json format
        f.write(json.dumps(loglist))

def main():
    verify = verify_file() #Verify if file exists
    if verify == True:
        new = str.lower(input("New job Y/N: "))
        if new == 'n':
            sys.exit()
        while new == 'y':
            new_error()
            new = str.lower(input("New job Y/N: "))
        else:
            sys.exit()
    else:
        with open(PATH, "x") as f:
            pass
        main()

main()
Note that you will have to create a new errordb file for this snippet to work.
Hope this has helped somehow. If you have any further questions hit me up in the comments!
References:
Reading and Writing files in Python
JSON encoder and decoder in Python
I think that there may be a couple of problems with your code, but the first thing that I noticed was that you are saving error numbers and solutions as a dictionary in errordb.txt, and when you read them back in you are reading them back as a list of strings:
The line:
loglist = file.readlines()
in new_error returns a list of strings. This means that verify_error will always return False.
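A small illustration of why the membership test misses (the sample line mirrors what new_error writes; the exact content is made up):
loglist = ["{'404': 'check the URL'}\n"]  # what readlines() gives back
print('404' in loglist)     # False: '404' is compared against each whole line
print('404' in loglist[0])  # True: substring check inside one line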
So you have a couple of choices:
You could modify verify_error to the following:
def verify_error(error_number, loglist): #Verify if error exists in file
    for error in loglist:
        if error_number in error:
            return True
Although, I think that a better solution would be to load errordb.txt as a JSON file; then you'll have a dictionary. That would look something like:
import json

errordb = {}
with open(PATH) as handle:
    errordb = json.load(handle)
So here is the full set of changes I would make:
import json

def verify_error(error_number, loglist): #Verify if error exists in file
    for error in loglist:
        if error_number in error:
            return True

def new_error():
    errordb = list()
    existing = list()
    with open(PATH) as handle:
        existing = json.load(handle)
    errordb += existing
    error_number = input("Error number: ")
    if verify_error(error_number, errordb) == True:
        # Add new solution, or another solution.
        print("I might do something here.")
    else:
        solution = str(input("Solution: "))
        errordb.append({error_number: solution})
        #Writes dict on file
        with open(PATH, "w") as handle:
            json.dump(errordb, handle)

Can't get text to append to file in Python

OK, so this snippet of code is an HTTP response handler inside of a Flask server. I don't think this information will be of any use, but it's there if you need it.
This code is supposed to read in the name from the POST request and write it to a file.
Then it checks a file called saved.txt, which is stored in the FILES dictionary.
If we do not find our filename in the saved.txt file, we append the filename to the saved file.
The APIResponse function is just a JSON dump.
At the moment it doesn't seem to be appending at all. The file is written just fine, but the append doesn't go through.
Also, by the way, this is being run on Linino, which is just a distribution of Linux.
def post(self):
    try:
        ## Create the filepath so we can use this for multiple schedules
        filename = request.form["name"] + ".txt"
        path = "/mnt/sda1/arduino/www/"
        filename_path = path + filename
        #Get the data from the request
        schedule = request.form["schedule"]
        replacement_value = schedule
        #write the schedule to the file
        writefile(filename_path, replacement_value)
        #append the file base name to the saved file
        append = True
        schedule_names = readfile(FILES['saved']).split(" ")
        for item in schedule_names:
            if item == filename:
                append = False
        if append:
            append_to = FILES['saved']
            filename_with_space = filename + " "
            append(append_to, filename_with_space)
        return APIResponse({
            'success': "Successfully modified the mode."
        })
    except:
        return APIResponse({
            'error': "Failed to modify the mode"
        })
Here are the requested functions
def writefile(filename, data):
    #Opens a file.
    sdwrite = open(filename, 'w')
    #Writes to the file.
    sdwrite.write(data)
    #Close the file.
    sdwrite.close()
    return

def readfile(filename):
    #Opens a file.
    sdread = open(filename, 'r')
    #Reads the file's contents.
    blah = sdread.readline()
    #Close the file.
    sdread.close()
    return blah

def append(filename, data):
    ## use mode a for appending
    sdwrite = open(filename, 'a')
    ## append the data to the file
    sdwrite.write(data)
    sdwrite.close()
Could it be that the bool variable append and the function name append are the same? When I tried it, Python complained with "TypeError: 'bool' object is not callable".
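A minimal sketch of one way around the collision, simply renaming the local flag (should_append is an illustrative name, not from the original code):
should_append = True  # renamed so it no longer shadows the append() helper below
schedule_names = readfile(FILES['saved']).split(" ")
for item in schedule_names:
    if item == filename:
        should_append = False
if should_append:
    filename_with_space = filename + " "
    append(FILES['saved'], filename_with_space)  # now resolves to the helper function, not a bool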

Using python script to search in multiple files and outputting an individual file for each one

I am trying to get a program up and running that takes astronomical data files with the extension .fits, takes all of the files with that extension in a folder, searches them for specific header information, and subsequently places it into a text file corresponding to each input file. I am using a while loop, and please forgive me if this code is badly formatted; it is my first time using Python! My main problem is that I can only get the program to read one file before it closes itself.
#!/usr/bin/env python
#This code properly imports all '.fits' files in a specified directory and
#outputs them into a .txt format that allows several headers and their contained
#data to be read.
import copy
import sys
import pyfits
import string
import glob
import os.path
import fnmatch
import numpy as np
DIR = raw_input("Please input a valid directory : ") #-----> This prompts for input from the user to find the '.fits' files
os.chdir(DIR)
initialcheck = 0 #Initiates the global counter for the number of '.fits' files in the specified directory
targetcheck = 0 #Initiates the global counter for the amount of files that have been processed
def checkinitial(TD):
    #This counts the number of '.fits' files in your directory
    for files in glob.iglob('*.fits'):
        check = len(glob.glob1(TD, "*.fits"))
        global initialcheck
        initialcheck = check
    if initialcheck == 0:
        print 'There are no .FITS files in this directory! Try Again...'
        sys.exit()
    return initialcheck
def sorter(TD, targcheck, inicheck):
    #This function will call the two counters and compare them until the number of processed
    #files is greater than the files in the directory, thereby finishing the loop
    global initialcheck
    inicheck = initialcheck
    global targetcheck
    targcheck = targetcheck
    while targcheck <= inicheck:
        os.walk(TD)
        for allfiles in glob.iglob('*.fits'):
            print allfiles #This prints out the filenames the program is currently processing
            with pyfits.open(allfiles) as HDU:
                #This block outlines all of the search terms in their respective headers; you will need to set
                #the indices below to search in the correct header for the specified term you are looking for,
                #however no alterations to the header definitions should be made.
                HDU_HD_0 = HDU[0].header
                HDU_HD_1 = HDU[1].header
                #HDU_HD_2 = HDU[2].header -----> Not usually needed, can be activated if data from this header is required
                #HDU_HD_3 = HDU[3].header -----> Use this if the '.fits' file contains a third header (unlikely but possible)
                KeplerIDIndex = HDU_HD_0.index('KEPLERID')
                ChannelIndex = HDU_HD_0.index('SKYGROUP')
                TTYPE1Index = HDU_HD_1.index('TTYPE1')
                TTYPE8Index = HDU_HD_1.index('TTYPE8')
                TTYPE9Index = HDU_HD_1.index('TTYPE9')
                TTYPE11Index = HDU_HD_1.index('TTYPE11')
                TTYPE12Index = HDU_HD_1.index('TTYPE12')
                TTYPE13Index = HDU_HD_1.index('TTYPE13')
                TTYPE14Index = HDU_HD_1.index('TTYPE14')
                TUNIT1Index = HDU_HD_1.index('TUNIT1')
                TUNIT8Index = HDU_HD_1.index('TUNIT8')
                TUNIT9Index = HDU_HD_1.index('TUNIT9')
                TUNIT11Index = HDU_HD_1.index('TUNIT11')
                TUNIT12Index = HDU_HD_1.index('TUNIT12')
                TUNIT13Index = HDU_HD_1.index('TUNIT13')
                TUNIT14Index = HDU_HD_1.index('TUNIT14')
                #The variables below are an index search for the data found at the indices specified above,
                #allowing the data to be found in the numpy array that '.fits' files use
                File_Data_KID = list( HDU_HD_0[i] for i in [KeplerIDIndex])
                File_Data_CHAN = list( HDU_HD_0[i] for i in [ChannelIndex])
                Astro_Data_1 = list( HDU_HD_1[i] for i in [TTYPE1Index])
                Astro_Data_8 = list( HDU_HD_1[i] for i in [TTYPE8Index])
                Astro_Data_9 = list( HDU_HD_1[i] for i in [TTYPE9Index])
                Astro_Data_11 = list( HDU_HD_1[i] for i in [TTYPE11Index])
                Astro_Data_12 = list( HDU_HD_1[i] for i in [TTYPE12Index])
                Astro_Data_13 = list( HDU_HD_1[i] for i in [TTYPE13Index])
                Astro_Data_14 = list( HDU_HD_1[i] for i in [TTYPE14Index])
                Astro_Data_Unit_1 = list( HDU_HD_1[i] for i in [TUNIT1Index])
                Astro_Data_Unit_8 = list( HDU_HD_1[i] for i in [TUNIT8Index])
                Astro_Data_Unit_9 = list( HDU_HD_1[i] for i in [TUNIT9Index])
                Astro_Data_Unit_11 = list( HDU_HD_1[i] for i in [TUNIT11Index])
                Astro_Data_Unit_12 = list( HDU_HD_1[i] for i in [TUNIT12Index])
                Astro_Data_Unit_13 = list( HDU_HD_1[i] for i in [TUNIT13Index])
                Astro_Data_Unit_14 = list( HDU_HD_1[i] for i in [TUNIT14Index])
                HDU.close()
            with open('Processed ' + allfiles + ".txt", "w") as copy:
                targetcheck += 1
                Title1_Format = '{0}-----{1}'.format('Kepler I.D.','Channel')
                Title2_Format = '-{0}--------{1}------------{2}------------{3}------------{4}------------{5}-------------{6}-'.format('TTYPE1','TTYPE8','TTYPE9','TTYPE11','TTYPE12','TTYPE13','TTYPE14')
                File_Format = '{0}--------{1}'.format(File_Data_KID, File_Data_CHAN)
                Astro_Format = '{0}---{1}---{2}---{3}---{4}---{5}---{6}'.format(Astro_Data_1, Astro_Data_8, Astro_Data_9, Astro_Data_11, Astro_Data_12, Astro_Data_13, Astro_Data_14)
                Astro_Format_Units = '{0} {1} {2} {3} {4} {5} {6}'.format(Astro_Data_Unit_1, Astro_Data_Unit_8, Astro_Data_Unit_9, Astro_Data_Unit_11, Astro_Data_Unit_12, Astro_Data_Unit_13, Astro_Data_Unit_14)
                copy.writelines("%s\n" % Title1_Format)
                copy.writelines("%s\n" % File_Format)
                copy.writelines('\n')
                copy.writelines("%s\n" % Title2_Format)
                copy.writelines("%s\n" % Astro_Format)
                copy.writelines('\n')
                copy.writelines("%s\n" % Astro_Format_Units)
                Results = copy
            return Results
checkinitial(DIR)
sorter(DIR, targetcheck, initialcheck)
I think you keep getting confused between a single file and a list of files. Try something like this:
def checkinitial(TD):
    #This counts the number of '.fits' files in your directory
    check = len(glob.glob1(TD, "*.fits"))
    if not check:
        print 'There are no .FITS files in this directory! Try Again...'
        sys.exit()
    return check

def sorter(TD, targcheck, inicheck):
    """This function will call the two counters and compare them until the number of processed
    files is greater than the files in the directory, thereby finishing the loop
    """
    for in_file in glob.iglob(os.path.join(TD, '*.fits')):
        print in_file  # This prints out the filenames the program is currently processing
        with pyfits.open(in_file) as HDU:
            pass  # <Process input file HDU here>
        out_file_name = 'Processed_' + os.path.basename(in_file) + ".txt"
        with open(os.path.join(TD, out_file_name), "w") as copy:
            pass  # <Write stuff to your output file copy here>
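For context, a hedged sketch of what could go inside that loop in place of the two placeholders, reusing the header keywords from the question (the output format and everything else here is illustrative):
        with pyfits.open(in_file) as HDU:
            hd0, hd1 = HDU[0].header, HDU[1].header
            kepler_id = hd0['KEPLERID']  # header keywords taken from the question
            channel = hd0['SKYGROUP']
        out_file_name = 'Processed_' + os.path.basename(in_file) + ".txt"
        with open(os.path.join(TD, out_file_name), "w") as copy:
            copy.write("Kepler I.D.-----Channel\n")
            copy.write("{0}--------{1}\n".format(kepler_id, channel))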
