Creating a watchfolder and exporting to txt - python

I am trying to create a program that lists all the files in a directory in real time. If a file is deleted, then its name is removed from the txt file. If a file is added, then it is added to the txt file.
So far I have only managed to create a program that lists and exports the contents once. And since I am using a while(1) loop, it never stops appending to the file. I also need it to ignore duplicate names.
Can you help me with it? My code is as follows:
import os
Path = 'Mypath'
arr = os.listdir(Path)
print(arr)
file1 = open("File.txt", "a")
while (1):
    for file in arr:
        # if file not in file1:
        file1.writelines(file + "\n")

A simple solution using polling: if anything changed, replace the whole file.
import os
import time

path = 'mypath/'
cur_list = None
while True:
    new_list = os.listdir(path)
    if new_list != cur_list:
        cur_list = new_list
        with open("File.txt", "w") as f:
            f.write('\n'.join(cur_list))
    time.sleep(5)
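If you want event-driven updates instead of polling, a sketch using the third-party watchdog package could look like the following (an assumption: watchdog is not in the standard library and must be installed separately, e.g. pip install watchdog):
import os
import time

from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

path = 'mypath/'

class SnapshotHandler(FileSystemEventHandler):
    def on_any_event(self, event):
        # Rewrite the whole listing whenever anything in the folder changes;
        # sorted() keeps the output stable and avoids duplicates.
        with open("File.txt", "w") as f:
            f.write('\n'.join(sorted(os.listdir(path))))

observer = Observer()
observer.schedule(SnapshotHandler(), path)
observer.start()
try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    observer.stop()
observer.join()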


How to open and read text files in a folder python

I have a folder which has text files in it. I want to be able to pass in a path to this folder and have Python go through it, open each file, and append its contents to a list.
import os
folderpath = "/Users/myname/Downloads/files/"
inputlst = [os.listdir(folderpath)]
filenamelist = []
for filename in os.listdir(folderpath):
    if filename.endswith(".txt"):
        filenamelist.append(filename)
print(filenamelist)
So far this outputs:
['test1.txt', 'test2.txt', 'test3.txt', 'test4.txt', 'test5.txt', 'test6.txt', 'test7.txt', 'test8.txt', 'test9.txt', 'test10.txt']
I want the code to take each of these files, open them, and put all of their contents into a single huge list, not just print the file names. Is there any way to do this?
You should use open() for this.
See the documentation for its advanced options.
Anyway, here is one way you can do it:
import os

folderpath = r"yourfolderpath"
filenamecontent = []
for filename in os.listdir(folderpath):
    if filename.endswith(".txt"):
        f = open(os.path.join(folderpath, filename), 'r')
        filenamecontent.append(f.read())
        f.close()
print(filenamecontent)
If you are using Python 3, you can use:
for filename in filename_list:
    with open(filename, "r") as file_handler:
        data = file_handler.read()
Please note that you will need the full (either relative or absolute) path to your file in filename.
This way, your file handler is automatically closed when you leave the with block.
More information here: https://docs.python.org/fr/3/library/functions.html#open
On a side note, in order to list files, you might want to have a look at glob and use:
filename_list = glob.glob("/path/to/files/*.txt")
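Putting glob and with together, a minimal sketch that reads every matching file into one list (the path is a placeholder):
import glob

contents = []
for filename in glob.glob("/path/to/files/*.txt"):
    with open(filename, "r") as file_handler:
        contents.append(file_handler.read())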
You can use fileinput.
Code:
import fileinput
import os

folderpath = "your_path_to_directory_where_files_are_stored"
# This collects all the files in .txt format, with their full paths
file_list = [os.path.join(folderpath, a) for a in os.listdir(folderpath) if a.endswith(".txt")]
get_all_files = fileinput.input(file_list)
with open("alldata.txt", 'a+') as writefile:
    for line in get_all_files:
        writefile.write(line)
The above code reads all the data from the .txt files in the specified directory (folderpath) and stores it in alldata.txt. So the long list you wanted is now stored in a .txt file; if you don't want that, you can remove the write step.
Links:
https://docs.python.org/3/library/fileinput.html
https://docs.python.org/3/library/functions.html#open

opening and reading all the files in a directory in python - python beginner

I'd like to read the contents of every file in a folder/directory and then print them at the end (I eventually want to pick out bits and pieces from the individual files and put them in a separate document).
So far I have this code:
import os
path = 'results/'
fileList = os.listdir(path)
for i in fileList:
    file = open(os.path.join('results/' + i), 'r')
allLines = file.readlines()
print(allLines)
At the end I don't get any errors, but it only prints the contents of the last file in my folder as a series of strings, and I want to make sure it's reading every file so I can then access the data I want from each file. I've looked online and I can't find where I'm going wrong. Is there any way of making sure the loop is iterating over all my files and reading all of them?
I also get the same result when I use
file = open(os.path.join('results/', i), 'r')
in the 5th line.
Please help, I'm so lost.
Thanks!!
Separate the different functions of the thing you want to do.
Use generators wherever possible, especially if there are a lot of files or large files.
Imports
from pathlib import Path
import sys
Deciding which files to process:
source_dir = Path('results/')
files = source_dir.iterdir()
[Optional] Filter files
For example, if you only need files with extension .ext
files = source_dir.glob('*.ext')
Process files:
def process_files(files):
    for file in files:
        with file.open('r') as file_handle:
            for line in file_handle:
                # do your thing
                yield line
Save the lines you want to keep:
def save_lines(lines, output_file=sys.stdout):
    for line in lines:
        output_file.write(line)
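Tying the pieces together, a usage sketch under the same assumptions (the .ext pattern is a placeholder):
source_dir = Path('results/')
lines = process_files(source_dir.glob('*.ext'))
save_lines(lines)  # writes to stdout by default
Because both steps are generators, nothing is read until save_lines starts consuming lines, so memory use stays flat even with many large files.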
You forgot the indentation at the line allLines = file.readlines(), and maybe you can try this:
import os
allLines = []
path = 'results/'
fileList = os.listdir(path)
for file in fileList:
    file = open(os.path.join('results/', file), 'r')
    allLines.append(file.read())
print(allLines)
You forgot to indent the line allLines.append(file.read()).
Because it was outside the loop, it only appended the file variable to the list after the for loop was finished, so it only appended the last value that the file variable held after the loop. Also, you should not use readlines() in this way; just use read() instead:
import os
allLines = []
path = 'results/'
fileList = os.listdir(path)
for file in fileList:
    file = open(os.path.join('results/', file), 'r')
    allLines.append(file.read())
print(allLines)
This also creates a single file containing the contents of all the files you wanted to print:
import os

rootdir = 'C:\\Users\\you\\folder\\'  # your folder
f = open('final_file.txt', 'a')
for root, dirs, files in os.walk(rootdir):
    for filename in files:
        data = open(os.path.join(root, filename)).read()
        f.write(data + "\n")
f.close()
This is a similar case, with more features: Copying selected lines from files in different directories to another file

Getting sub directories list in a text file and append that txt file with new subdirectory name

I am trying to write a script which will list all the subdirectories in a directory in a txt file.
This script will run every hour through a cron job, so that I can append new subdir names to the txt file created in the previous run.
For example:
/Directory
    /subdir1
    /subdir2
    /subdir3
The txt file should have the following columns:
subdir_name    timestamp     first_filenamein_thatSUBDIR
subdir1        2015-23-12    abc.dcm
subdir2        2014-23-6     ghj.nii
...
I know how to get the list of directories using os.listdir, but I don't know how to approach this problem, since I want to write to the same txt file with the new names. Any idea how I should do that in Python?
EDIT: With os.listdir I am getting the sub-directory names but not the timestamp. The other problem is: how can I create two columns, one with the sub-directory name and the other with its timestamp, as shown above?
With @Termi's help I got this code working:
import time
import os
from datetime import datetime

parent_dir = '/dicom/'
sub_dirs = os.walk(parent_dir).next()[1]
with open('exam_list.txt','a+') as f:
    lines = f.readlines()
    present_dirs = [line.split('\t')[0] for line in lines]
    for sub in sub_dirs[1:len(sub_dirs)]:
        sub = sub + '/0001'
        latest_modified = os.path.getctime(os.path.join(parent_dir,sub))
        if sub not in present_dirs and time.time() - latest_modified < 4600:
            created = datetime.strftime(datetime.fromtimestamp(latest_modified),'%Y-%d-%m')
            file_in_subdir = os.walk(os.path.join(parent_dir,sub)).next()[2][1]
            f.write("%s\t%s\t%s\n"%(sub,created,file_in_subdir))
This code, when typed at the Python terminal, works well, with all the variables sub, created, and file_in_subdir holding some value; however, it does not write anything to the file mentioned at the beginning of the code.
I also checked whether file writing is the problem, using the following code:
with open('./exam_list.txt','a+') as f:
    f.write("%s\t%s\n"%(sub,file_in_subdir))
These two lines create the file properly, as I intended.
I am not able to point out what the error is.
To get the immediate sub-directories in the parent directory, use next(os.walk('path/to/parent/dir'))[1] (os.walk(...).next()[1] in Python 2).
next(os.walk(...)) gives a tuple (current_dir, [sub-dirs], [files]), so [1] gives the sub-directories.
Opening the file with 'a+' will allow you to both read and append to the file; rewind to the start before reading, then store the sub-directories that are already in the file:
with open('dirname.txt','a+') as f:
    f.seek(0)  # 'a+' may position the file pointer at the end
    lines = f.readlines()
    present_dirs = [line.split('\t')[0] for line in lines]
Now for each sub-directory, check whether it is already present in the list and, if not, add it to the file. If you execute the script every hour, you can even restrict it to sub-directories created (or modified, on Linux systems) in the last hour by using getctime:
time.time() - os.path.getctime(os.path.join(parent_dir, sub)) < 3600
Then, for any new sub-directory, use next(os.walk('path/to/subdir'))[2] to get the filenames inside:
import time
import os
from datetime import datetime

parent_dir = '/path/to/parent/directory'
sub_dirs = next(os.walk(parent_dir))[1]  # os.walk(parent_dir).next()[1] in Python 2
with open('dirname.txt','a+') as f:
    f.seek(0)  # rewind before reading; writes still append
    lines = f.readlines()
    present_dirs = [line.split('\t')[0] for line in lines]
    for sub in sub_dirs:
        latest_modified = os.path.getctime(os.path.join(parent_dir, sub))
        if sub not in present_dirs and time.time() - latest_modified < 3600:
            created = datetime.strftime(datetime.fromtimestamp(latest_modified), '%Y-%d-%m')
            file_in_subdir = next(os.walk(os.path.join(parent_dir, sub)))[2][0]
            f.write("%s\t%s\t%s\n" % (sub, created, file_in_subdir))
with open('some.txt', 'a') as output:
    output.write('whatever you want to add')
Opening a file with 'a' as the mode appends everything you write to its end.
You can use walk from the os package.
It is more flexible than listdir.
You can read more about it here: https://docs.python.org/3/library/os.html#os.walk
An example:
import os
from os.path import join, getctime

with open('output.txt', 'w+') as output:
    for root, dirs, files in os.walk('/Some/path/'):
        for name in files:
            create_time = getctime(join(root, name))
            output.write('%s\t%s\t%s\n' % (root, name, create_time))

Locating files by name for copying elsewhere

New to Python...
I'm trying to have Python take a text file of file names (one name per row) and store them as strings...
i.e.
import os, shutil

files_to_find = []
with open('C:\\pathtofile\\lostfiles.txt') as fh:
    for row in fh:
        files_to_find.append(row.strip)
...in order to search for these files in directories and then copy any found files somewhere else...
for root, dirs, files in os.walk('D:\\'):
    for _file in files:
        if _file in files_to_find:
            print("Found file in: " + str(root))
            shutil.copy(os.path.abspath(root + '/' + _file), 'C:\\destination')
print("process completed")
Despite knowing these files exist, the script runs without any errors but without finding any files.
I added...
print(files_to_find)
...after the first block of code to see if it was finding anything, and saw screeds of <built-in method strip of str object at 0x00000000037FC730>.
Does this tell me it's not successfully creating strings to compare file names against? I wonder where I'm going wrong?
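Yes: row.strip without parentheses appends the bound method object itself rather than the stripped string, so the membership test _file in files_to_find can never match a filename. Calling the method fixes the comparison:
files_to_find = []
with open('C:\\pathtofile\\lostfiles.txt') as fh:
    for row in fh:
        files_to_find.append(row.strip())  # note the (): call strip, don't just reference it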
Use glob to build a list of the file names first:
import os
import glob
import shutil

def file_names(filepattern, dir):
    # collect the base names of all files matching the pattern in dir
    os.chdir(dir)
    file_list = []
    for line in sorted(glob.glob(filepattern)):
        file_list.append(os.path.basename(line))
    return file_list
Then loop over that list to compare against the names you are looking for.
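For example, a sketch of that comparison step (the pattern and directories are placeholders; files_to_find is the list from the question, and shutil is imported above):
found = file_names("*", "D:\\")
for name in found:
    if name in files_to_find:
        shutil.copy(os.path.join("D:\\", name), "C:\\destination")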

pipe one file at a time python

I have more than 10000 JSON files which I have to convert to CSV for further processing. I am using the following code:
import json
import time
import os
import csv
import fnmatch

tweets = []
count = 0
search_folder = ('/Volumes/Transcend/Axiom/IPL/test/')
for root, dirs, files in os.walk(search_folder):
    for file in files:
        pathname = os.path.join(root, file)

for file in open(pathname):
    try:
        tweets.append(json.loads(file))
    except:
        pass
count = count + 1
This iterates over just one file and stops. I tried adding while True: before for file in open(pathname):, and it just doesn't stop, nor does it create the csv files. I want to read one file at a time, convert it to csv, then move on to the next file. I tried adding count = count + 1 at the end, after finishing converting to csv. It still stops after converting the first file. Can someone help please?
Your indentation is off; you need to put the second for loop inside the first one.
Separately from your main problem, you should use a with statement to open the file. Also, you were reusing the variable name file, which you shouldn't use anyway since it's the name of a built-in. I also made a few other minor edits.
import json
import os

tweets = []
count = 0
search_folder = '/Volumes/Transcend/Axiom/IPL/test/'
for root, dirs, filenames in os.walk(search_folder):
    for filename in filenames:
        pathname = os.path.join(root, filename)
        with open(pathname, 'r') as infile:
            for line in infile:
                try:
                    tweets.append(json.loads(line))
                except:  # Don't use bare except clauses
                    pass
        count += 1
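To finish the conversion step the question asks about, a hedged sketch that flattens the collected tweets into a single CSV (an assumption: each tweet decodes to a flat dict; nested fields would need extra handling):
import csv

# Union of all keys seen across the tweets, in a stable order
fieldnames = sorted({key for tweet in tweets for key in tweet})
with open('tweets.csv', 'w', newline='') as outfile:
    writer = csv.DictWriter(outfile, fieldnames=fieldnames, extrasaction='ignore')
    writer.writeheader()
    writer.writerows(tweets)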
