Trying to reach all .txt files in Python - python

I have a folder called "labels". Inside it there are 50 subfolders, and each of these 50 subfolders contains .txt files. How can I reach these .txt files using Python 2?

Here's code that will go through all folders in labels and print content of txt files located inside them.
import os
for folder in os.listdir('labels'):
    for txt_file in os.listdir('labels/{}'.format(folder)):
        if txt_file.endswith('.txt'):
            file = open('labels/{}/{}'.format(folder, txt_file), 'r')
            content = file.read()
            file.close()
            print(content)

If you just want to list the files in the folders:
import os
rootdir = 'C:/Users/youruser/Desktop/test'
for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        print(os.path.join(subdir, file))
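Because the layout is known (one level of subfolders under labels), a minimal sketch could collect the paths first and read them afterwards. The helper name collect_txt_files is made up for illustration and is not part of either answer above; the folder name labels is taken from the question.

```python
import os


def collect_txt_files(root):
    """Return sorted paths of .txt files found exactly one level below root."""
    paths = []
    for folder in sorted(os.listdir(root)):
        folder_path = os.path.join(root, folder)
        if not os.path.isdir(folder_path):
            continue  # skip stray files sitting directly in root
        for name in sorted(os.listdir(folder_path)):
            if name.endswith('.txt'):
                paths.append(os.path.join(folder_path, name))
    return paths


if __name__ == '__main__':
    # 'labels' is the folder name from the question; adjust as needed.
    if os.path.isdir('labels'):
        for path in collect_txt_files('labels'):
            with open(path) as handle:
                print(handle.read())
```

Returning a list instead of printing inside the loop makes it easier to count the files or process them later. The sketch should run under both Python 2 and Python 3.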

Related

Find directories missing .csv file in Python

I have ~1000 directories, containing various .csv files within them. I am trying to check if a specific type of csv file, containing a filename that begins with PTSD_OCOTBER, exists in each directory.
If this file does not exist in the directory, I want to print out that directory into a .txt file.
Here is what I have so far.
import os,sys,time,shutil
import subprocess
#determine filetype to look for.
file_type = ".csv"
print("Running file counter for" + repr(file_type))
#for each folder in the root directory
for subdir, dirs, files in os.walk(rootdir):
    if("GeneSet" in subdir):
        folder_name = subdir.rsplit('/', 1)[-1] #get the folder name.
        for f in files:
            #unclear how to write this part.
            #how to tell if no files exist in directory?
This successfully finds the .csv files of interest, but how do I achieve the above?
So files is the list of files in that directory that you are currently walking. You want to know if there are no files that start with PTSD_OCOTBER (PTSD_OCTOBER ?):
for subdir, dirs, files in os.walk(rootdir):
    if("GeneSet" in subdir):
        folder_name = subdir.rsplit('/', 1)[-1] #get the folder name.
        dir_of_interest = not any(f.startswith('PTSD_OCOTBER') for f in files)
        if dir_of_interest:
            # do stuff with folder_name
Now you want to save the results into a text file? If you have a Unix-style computer, then you can use output redirection on your terminal, such as
python3 fileanalysis.py > result.txt
after writing print(folder_name) instead of # do stuff with folder_name.
Or you can use Python itself to write the file, such as:
found_dirs = []
for subdir, dirs, files in os.walk(rootdir):
    ...
        if dir_of_interest:
            found_dirs.append(folder_name)
with open('result.txt', 'w') as f:
    f.write('\n'.join(found_dirs))
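Putting the pieces together, here is a self-contained sketch of the same idea. The function names dirs_missing_prefix and write_report and the must_contain parameter are hypothetical, introduced only for this illustration; the prefix is passed in rather than hard-coded, since the exact spelling of the file name is unclear in the question.

```python
import os


def dirs_missing_prefix(rootdir, prefix, must_contain=None):
    """Return directories under rootdir with no .csv file starting with prefix.

    If must_contain is given, only directories whose path contains that
    substring are checked (mirrors the "GeneSet" test in the question).
    """
    missing = []
    for subdir, dirs, files in os.walk(rootdir):
        if must_contain is not None and must_contain not in subdir:
            continue
        has_match = any(
            f.startswith(prefix) and f.endswith('.csv') for f in files
        )
        if not has_match:
            missing.append(subdir)
    return sorted(missing)


def write_report(paths, out_path):
    """Write one directory path per line to out_path."""
    with open(out_path, 'w') as handle:
        handle.write('\n'.join(paths))
```

A call such as write_report(dirs_missing_prefix(rootdir, prefix, 'GeneSet'), 'result.txt') would then produce the text file, with rootdir and prefix supplied by you.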

How to combine all the files of one extension from a directory tree into one folder

How can I combine all PDF files of one directory (the PDFs can be at different depths of the directory tree) into one new folder?
I have tried this:
new_root = r'C:\Users\me\new_root'
root_with_files = r'C:\Users\me\all_of_my_pdf_files'
for root, dirs, files in os.walk(root_with_files):
    for file in files:
        os.path.join(new_root, file)
but it doesn't add anything to my folder.
You may try this:
import os
import shutil
new_root = r'C:\Users\me\new_root'
root_with_files = r'C:\Users\me\all_of_my_pdf_files'
for root, dirs, files in os.walk(root_with_files):
    for file in files:
        if file.lower().endswith('.pdf'):  # .pdf files only
            shutil.copy(os.path.join(root, file), new_root)
Your code doesn't move any files to the new folder: os.path.join only builds a path string. You can move your files using os.replace(src, dst).
Try this:
import os
new_root = r'C:\Users\me\new_root'
# a raw string cannot end with a single backslash, so the trailing one is dropped
root_with_files = r'C:\Users\me\all_of_my_pdf_files'
for root, dirs, files in os.walk(root_with_files):
    for file in files:
        os.replace(os.path.join(root, file), os.path.join(new_root, file))
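One thing neither snippet handles is two PDFs with the same name in different subfolders: the second would overwrite the first in the flat target folder. A minimal sketch of one way around that follows, assuming the target folder lies outside the source tree. The function name gather_pdfs and the _1, _2 suffix scheme are illustrative choices, not something from the answers.

```python
import os
import shutil


def gather_pdfs(src_root, dst_dir):
    """Copy every .pdf under src_root into dst_dir, renaming on name clashes.

    Assumes dst_dir is NOT inside src_root; otherwise os.walk could visit
    the copies as well.
    """
    if not os.path.isdir(dst_dir):
        os.makedirs(dst_dir)
    copied = []
    for root, dirs, files in os.walk(src_root):
        for name in sorted(files):
            if not name.lower().endswith('.pdf'):
                continue
            stem, ext = os.path.splitext(name)
            target = os.path.join(dst_dir, name)
            counter = 1
            # Append _1, _2, ... until the target name is free.
            while os.path.exists(target):
                target = os.path.join(
                    dst_dir, '{}_{}{}'.format(stem, counter, ext))
                counter += 1
            shutil.copy(os.path.join(root, name), target)
            copied.append(target)
    return copied
```

Swapping shutil.copy for shutil.move would move the files instead of copying them.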

Read in multiple folder and combine multiple text files contents to one file per folder - Python

I'm new to Python. I have hundreds of folders in the same directory, and each folder contains multiple text files. I want to combine the contents of all the text files into one file per folder.
Folder1
    text1.txt
    text2.txt
    text3.txt
    .
    .
Folder2
    text1.txt
    text2.txt
    text3.txt
    .
    .
As output, I need the contents of all the text files copied into one file per folder: text1.txt + text2.txt + text3.txt ---> Folder1.txt
Folder1
    text1.txt
    text2.txt
    text3.txt
    Folder1.txt
Folder2
    text1.txt
    text2.txt
    text3.txt
    Folder2.txt
I have the code below, which just lists the text files.
for path, subdirs, files in os.walk('./data'):
    for filename in files:
        if filename.endswith('.txt'):
Please help me with how to proceed on this task. Thank you.
Breaking the problem down, we need a solution that:
Finds all files in a directory
Merges the contents of all the files into one file with the same name as the directory
We then apply this solution to every subdirectory of the base directory. The code below has been tested.
Assumption: the subfolders contain only text files and no directories.
import os
# Function to merge all files in a folder
def merge_files(folder_path):
    # form the file name for the new file to create
    new_file_name = os.path.basename(folder_path) + '.txt'
    new_file_path = os.path.join(folder_path, new_file_name)
    # get all files in the folder, skipping a merged file left by an earlier run
    # assumption: folder has no directories and all text files
    files = [name for name in os.listdir(folder_path) if name != new_file_name]
    # open new file in write mode
    with open(new_file_path, 'w') as nf:
        # open files to merge in read mode
        for file in files:
            file = os.path.join(folder_path, file)
            with open(file, 'r') as f:
                # read all lines of a file and write into new file
                lines_in_file = f.readlines()
                nf.writelines(lines_in_file)
            # insert a newline after reading each file
            nf.write("\n")
# Call function from the main folder with the subfolders
folders = os.listdir("./test")
for folder in folders:
    if os.path.isdir(os.path.join('test', folder)):
        merge_files(os.path.join('test', folder))
First you will need to get all folder names, which can be done with os.listdir(path_to_dir). Then you iterate over all of them, and for each folder you iterate over all of its children using the same function, concatenating the contents as described here: https://stackoverflow.com/a/13613375/13300960
Try writing it yourself, and update the question with your code if you need more help.
Edit: os.walk might not be the best solution, since you know your folder structure and just two listdir calls will do the job.
import os
basepath = '/path/to/directory' # maybe just '.'
for dir_name in os.listdir(basepath):
    dir_path = os.path.join(basepath, dir_name)
    if not os.path.isdir(dir_path):
        continue
    out_name = dir_name + '.txt'
    with open(os.path.join(dir_path, out_name), 'w') as outfile:
        for file_name in os.listdir(dir_path):
            # skip non-text files and the output file itself
            if not file_name.endswith('.txt') or file_name == out_name:
                continue
            file_path = os.path.join(dir_path, file_name)
            with open(file_path) as infile:
                for line in infile:
                    outfile.write(line)
This is not the best code, but it should get the job done and it is the shortest.

How to read one file at a time from folder and pass data as a string into API, while writing the responses back into file?

I have a folder with 500 text files. The text files hold data that I want to send into an API. I want to write the response object from the API into another text file into a folder.
This is my code so far to loop through the files in the folder. However this loops through all the files:
import os
import requests
directory = os.path.normpath("file path to folder")
for subdir, dirs, files in os.walk(directory):
    for file in files:
        if file.endswith(".txt"):
            f=open(os.path.join(subdir, file),'r')
            a = f.read()
            print(a)
            r = requests.post(url1,data=a).content
            file = 'file path to write api response'
            f = open(file, 'a+')
            f.write(r)
            f.close()
How do I only loop through one file at a time and pass the result into the api?
Try glob for iterating over the *.txt files.
import glob
pattern = "./path/to/file/*.txt"
for file_path in glob.glob(pattern):
    with open(file_path) as f:
        data = f.read()  # do your work with the file contents here
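To handle one file per request and keep each response separate, a sketch along these lines may help. The function process_files and its send parameter are hypothetical names; send stands in for whatever performs the API call, so the loop can be tried without a network connection.

```python
import glob
import os


def process_files(in_dir, out_dir, send):
    """Pass each .txt file's text through send() and save one reply per file.

    send is any callable taking the file contents as a string and returning
    the response as a string (an assumption made for this sketch).
    """
    if not os.path.isdir(out_dir):
        os.makedirs(out_dir)
    written = []
    for in_path in sorted(glob.glob(os.path.join(in_dir, '*.txt'))):
        with open(in_path) as infile:
            payload = infile.read()
        reply = send(payload)  # exactly one call per input file
        # Reuse the input file name so each response is easy to trace back.
        out_path = os.path.join(out_dir, os.path.basename(in_path))
        with open(out_path, 'w') as outfile:
            outfile.write(reply)
        written.append(out_path)
    return written
```

With the requests library, send could be something like lambda text: requests.post(url1, data=text).text, where url1 is the endpoint from the question. Writing each response to its own file, rather than appending everything to one file, makes it easy to see which input produced which output.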

reading HTML(different folders) files

I want to read HTML files in Python. Normally I do it like this (and it works):
import codecs
f = codecs.open("test.html", 'r')
print(f.read())
The problem is that my HTML files are not all in the same folder: a program generates them and saves them into folders located inside the folder that holds my reading script.
In short, my script sits in a folder, and inside that folder are more folders containing the generated HTML files.
Does anybody know how I can proceed?
import os
import codecs
for root, dirs, files in os.walk("./"):
    for name in files:
        abs_path = os.path.normpath(root + '/' + name)
        file_name, file_ext = os.path.splitext(abs_path)
        if file_ext == '.html':
            f = codecs.open(abs_path,'r')
            print(f.read())
This will walk through ./ and loop through all files in each subdirectory. Note that ./ is resolved against the current working directory, which is the script's directory only if you run the script from there.
It will check whether the extension is .html and do the work on each .html file.
You may want to accept more file extensions (for instance .htm).
Use os.walk:
import os,codecs
for root, dirs, files in os.walk("/mydir"):
    for file in files:
        if file.endswith(".html"):
            f = codecs.open(os.path.join(root, file),'r')
            print(f.read())
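Both answers can be folded into a small sketch that also accepts .htm and closes each file after reading. The names HTML_EXTENSIONS, find_html_files and read_html are made up for this example, and the utf-8 default is an assumption about how the generated files are encoded.

```python
import io
import os

# Extensions treated as HTML; extend the tuple if your generator uses others.
HTML_EXTENSIONS = ('.html', '.htm')


def find_html_files(start_dir):
    """Return sorted paths of all HTML files at any depth below start_dir."""
    matches = []
    for root, dirs, files in os.walk(start_dir):
        for name in files:
            # str.endswith accepts a tuple of suffixes.
            if name.lower().endswith(HTML_EXTENSIONS):
                matches.append(os.path.join(root, name))
    return sorted(matches)


def read_html(path, encoding='utf-8'):
    """Read one file as text; the encoding default is an assumption."""
    with io.open(path, 'r', encoding=encoding) as handle:
        return handle.read()
```

To start from the script's own folder regardless of where it is launched, os.path.dirname(os.path.abspath(__file__)) can be passed as start_dir.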
