Python 2.5.2: trying to open files recursively - python

The script below should open all the files inside the folder 'pruebaba' recursively but I get this error:
Traceback (most recent call last):
  File "/home/tirengarfio/Desktop/prueba.py", line 8, in <module>
    f = open(file,'r')
IOError: [Errno 21] Is a directory
This is the hierarchy:
pruebaba
    folder1
        folder11
            test1.php
        folder12
            test1.php
            test2.php
    folder2
        test1.php
The script:
import re,fileinput,os
path="/home/tirengarfio/Desktop/pruebaba"
os.chdir(path)
for file in os.listdir("."):
    f = open(file,'r')
    data = f.read()
    data = re.sub(r'(\s*function\s+.*\s*{\s*)',
                  r'\1echo "The function starts here."',
                  data)
    f.close()
    f = open(file, 'w')
    f.write(data)
    f.close()
Any idea?

Use os.walk. It recursively walks into a directory and its subdirectories, and already gives you separate lists for files and directories.
# A __future__ import must come before every other statement (only needed on Python 2.5).
from __future__ import with_statement
import re
import os
PATH = "/home/tirengarfio/Desktop/pruebaba"
for path, dirs, files in os.walk(PATH):
    for filename in files:
        # os.walk yields bare names, so join them with the directory being visited
        fullpath = os.path.join(path, filename)
        with open(fullpath, 'r') as f:
            data = re.sub(r'(\s*function\s+.*\s*{\s*)',
                          r'\1echo "The function starts here."',
                          f.read())
        with open(fullpath, 'w') as f:
            f.write(data)

You're trying to open everything you see. One thing you tried to open was a directory; you need to check if an entry is a file or is a directory, and make a decision from there. (Was the error IOError: [Errno 21] Is a directory not descriptive enough?)
If it is a directory, then you'll want to make a recursive call to your function to walk over the files in that directory as well.
Alternatively, you might be interested in the os.walk function to take care of the recursive-ness for you.
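The recursive call described above could look roughly like the following. This is only a sketch: the function name iter_files is made up for illustration, and it avoids newer syntax so that it should also run on old interpreters such as 2.5.

```python
import os

def iter_files(directory):
    """Yield the path of every regular file under `directory`, recursing by hand."""
    for name in sorted(os.listdir(directory)):
        entry = os.path.join(directory, name)
        if os.path.isdir(entry):
            # A directory cannot be passed to open(); recurse into it instead.
            for path in iter_files(entry):
                yield path
        elif os.path.isfile(entry):
            yield entry
```

Each yielded path already includes its directory, so it can be handed straight to open(). Note that os.path.isdir follows symbolic links, so a link that points back up the tree would make this sketch loop.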

os.listdir lists both files and directories. Before opening an entry, check with os.path.isfile that it really is a file.
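A minimal sketch of that check, for a single directory without recursion. The helper name regular_files is hypothetical, not something from the question.

```python
import os

def regular_files(directory):
    """Return paths of the entries in `directory` that are regular files."""
    paths = []
    for name in os.listdir(directory):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):  # False for sub-directories
            paths.append(candidate)
    return paths
```

Joining with the directory first means the check works even when the directory is not the current working directory.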

Related

How to get name of random .jpg in folder? Python [duplicate]

I was trying to iterate over the files in a directory like this:
import os
path = r'E:/somedir'
for filename in os.listdir(path):
    f = open(filename, 'r')
    ... # process the file
But Python was throwing FileNotFoundError even though the file exists:
Traceback (most recent call last):
  File "E:/ADMTM/TestT.py", line 6, in <module>
    f = open(filename, 'r')
FileNotFoundError: [Errno 2] No such file or directory: 'foo.txt'
So what is wrong here?
It is because os.listdir does not return the full path to the file, only the filename part; that is 'foo.txt', when open would want 'E:/somedir/foo.txt' because the file does not exist in the current directory.
Use os.path.join to prepend the directory to your filename:
path = r'E:/somedir'
for filename in os.listdir(path):
    with open(os.path.join(path, filename)) as f:
        ... # process the file
(Also, you are not closing the file; the with block will take care of it automatically).
os.listdir(directory) returns a list of file names in directory. So unless directory is your current working directory, you need to join those file names with the actual directory to get a path that open can find:
for filename in os.listdir(path):
    filepath = os.path.join(path, filename)
    f = open(filepath,'r')
    raw = f.read()
    # ...
Here's an alternative solution using pathlib.Path.iterdir, which yields the full paths instead, removing the need to join paths:
from pathlib import Path
path = Path(r'E:/somedir')
for filename in path.iterdir():
    with filename.open() as f:
        ... # process the file
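One caveat with iterdir is that it yields sub-directories as well as files, so a filter is usually needed before opening. The sketch below assumes Python 3 and text files in the default encoding. The helper name read_all and its recursive flag are made up for illustration. Path.rglob("*") is used as the recursive counterpart of iterdir.

```python
from pathlib import Path

def read_all(directory, recursive=False):
    """Map each regular file under `directory` to its text content."""
    base = Path(directory)
    entries = base.rglob("*") if recursive else base.iterdir()
    contents = {}
    for entry in entries:
        if entry.is_file():  # skip sub-directories, which open() would reject
            with entry.open() as f:
                contents[entry] = f.read()
    return contents
```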

I keep getting FileNotFoundError [duplicate]

This is the code
import os
new_file = open("C:/Users/USER/Desktop/Coding/Python/element_search.txt", "w")
path = "C:/Users/USER/Desktop/Coding"
# This is to access sub-folders
dirs = os.listdir(path)
for root, dir, files in os.walk(path):
    for file in files:
        f = open(file)
        content = f.read()
        print(file)
And this is the error
C:\Users\USER\Desktop\Coding\Python\personal_projects\venv\Scripts\python.exe C:/Users/USER/Desktop/Coding/Python/personal_projects/element_search.py
Traceback (most recent call last):
  File "C:\Users\USER\Desktop\Coding\Python\personal_projects\element_search.py", line 10, in <module>
    f = open(file)
FileNotFoundError: [Errno 2] No such file or directory: 'launch.json'
But I have the file launch.json present.
os.walk yields the directory it is currently visiting as root, together with the list of file names in it. You need to construct the full path of each file before opening it; otherwise Python looks for the file only in the current working directory.
You can construct the full path with pathlib or os.path.join. pathlib is the recommended option. See an edited approach below.
import os
import pathlib
new_file = open("C:/Users/USER/Desktop/Coding/Python/element_search.txt", "w")
path = "C:/Users/USER/Desktop/Coding"
# This is to access sub-folders
dirs = os.listdir(path)
# root is the directory being visited and files is the list of names in it.
for root, _, files in os.walk(path):  # avoid the name dir: it shadows the built-in dir() function
    for file in files:
        # join the directory and the name so the file can be found from any working directory
        abs_file = pathlib.Path(root) / file  # full path to the file
        with open(abs_file) as f:  # open in a context manager, so it closes after use
            content = f.read()
            print(content)
If some of the files are binary, you must open them in binary mode, for example with open(abs_file, "rb").
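As a rough illustration of the binary-mode variant, the sketch below reads every file as raw bytes, which sidesteps decoding errors at the cost of returning bytes rather than str. The generator name read_bytes_under is hypothetical, and Python 3.6 or later is assumed so that open() accepts a Path.

```python
import os
import pathlib

def read_bytes_under(top):
    """Yield (path, raw bytes) for every file below `top`."""
    for root, _, files in os.walk(top):
        for name in files:
            full_path = pathlib.Path(root) / name
            with open(full_path, "rb") as f:  # "rb": no decoding, so no UnicodeDecodeError
                yield full_path, f.read()
```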

Python - loop through subfolders and files in a directory without ignoring the subfolder

I have read all the Stack Exchange help pages on looping through subfolders, as well as the os documentation, but I am still stuck. I am trying to loop over files in subfolders, open each file, extract the first number in the first line, copy the file to a different subfolder (with the same name, but in the output directory), and rename the copy with the number as a suffix.
import os
import re
outputpath = "C:/Users/Heather/Dropbox/T_Files/Raw_FRUS_Data/Wisconsin_Copies_With_PageNumbers"
inputpath = "C:/Users/Heather/Dropbox/T_Files/Raw_FRUS_Data/FRUS_Wisconsin"
suffix=".txt"
for root, dirs, files in os.walk(inputpath):
    for file in files:
        file_path = os.path.join(root, file)
        foldername=os.path.split(os.path.dirname(file_path))[1]
        filebname=os.path.splitext(file)[0]
        filename=filebname + "_"
        f=open(os.path.join(root,file),'r')
        data=f.readlines()
        if data is None:
            f.close()
        else:
            with open(os.path.join(root,file),'r') as f:
                for line in f:
                    s=re.search(r'\d+',line)
                    if s:
                        pagenum=(s.group())
                        break
            with open(os.path.join(outputpath, foldername,filename+pagenum+suffix), 'w') as f1:
                with open(os.path.join(root,file),'r') as f:
                    for line in f:
                        f1.write(line)
I expect the result to be copies of the files in the input directory placed in the corresponding subfolder in the output directory, renamed with a suffix, such as "005_2", where 005 is the original file name, and 2 is the number the python code extracted from it.
The error I get seems to indicate that I am not looping through the files correctly. I know the code for extracting the first number and renaming the file works because I tested it on a single file. But using os.walk to loop through multiple subfolders is not working, and I can't figure out what I am doing wrong. Here is the error:
  File "<ipython-input-1-614e2851f16a>", line 23, in <module>
    with open(os.path.join(outputpath, foldername,filename+pagenum+suffix), 'w') as f1:
IOError: [Errno 2] No such file or directory: 'C:/Users/Heather/Dropbox/T_Files/Raw_FRUS_Data/Wisconsin_Copies_With_PageNumbers\\FRUS_Wisconsin\\.dropbox_1473986809.txt'
Well, this isn't eloquent, but it worked:
import os
import re
from glob import glob
suffix = ".txt"  # same suffix as in the question
folderlist=glob("C:\\...FRUS_Wisconsin*\\")
outputpath = "C:\\..\Wisconsin_Copies_With_PageNumbers"
for folder in folderlist:
    foldername = str(folder.split('\\')[7])
    for root, dirs, files in os.walk(folder):
        for file in files:
            filebname=os.path.splitext(file)[0]
            filename=filebname + "_"
            if not filename.startswith('._'):
                # find the first number in the file
                with open(os.path.join(root,file),'r') as f:
                    for line in f:
                        s=re.search(r'\d+',line)
                        if s:
                            pagenum=(s.group())
                            break
                # copy the file line by line under its new name
                with open(os.path.join(outputpath, foldername,filename+pagenum+suffix), 'w') as f1:
                    with open(os.path.join(root,file),'r') as f:
                        for line in f:
                            f1.write(line)
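The traceback in the question suggests two things: a hidden .dropbox file at the top of the input tree was being processed, and the matching subfolder did not exist in the output directory (open with 'w' creates files, not folders). A possible alternative that handles both is sketched below. It assumes that hidden files should be skipped and that files without any number should be left out. The function names first_number and copy_with_page_numbers are invented for this example, and, like the code above, it takes the first number found anywhere in the file.

```python
import os
import re
import shutil

def first_number(path):
    """Return the first run of digits in the file, or None if there is none."""
    with open(path, "r") as f:
        for line in f:
            match = re.search(r"\d+", line)
            if match:
                return match.group()
    return None

def copy_with_page_numbers(inputpath, outputpath, suffix=".txt"):
    """Mirror inputpath into outputpath, appending "_<number>" to each file name."""
    copied = []
    for root, dirs, files in os.walk(inputpath):
        # Path of the current folder relative to the input root ("." at the top).
        relative = os.path.relpath(root, inputpath)
        for name in files:
            if name.startswith("."):
                continue  # assumption: hidden files such as .dropbox are not data
            source = os.path.join(root, name)
            pagenum = first_number(source)
            if pagenum is None:
                continue  # assumption: skip rather than reuse a stale number
            target_dir = os.path.normpath(os.path.join(outputpath, relative))
            if not os.path.isdir(target_dir):
                os.makedirs(target_dir)  # open(..., 'w') does not create folders
            stem = os.path.splitext(name)[0]
            target = os.path.join(target_dir, stem + "_" + pagenum + suffix)
            shutil.copyfile(source, target)
            copied.append(target)
    return copied
```

Using os.path.relpath keeps the subfolder layout without hard-coding a position in the split path, which is what the index 7 above depends on.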

How to recursively go through all subdirectories and read files?

I have a root-ish directory containing multiple subdirectories, all of which contain a file named data.txt. What I would like to do is write a script that takes in the "root" directory, reads every "data.txt" in the subdirectories, and writes stuff from every data.txt file to an output file.
Here's a snippet of my code:
import os
import sys
rootdir = sys.argv[1]
with open('output.txt','w') as fout:
    for root, subFolders, files in os.walk(rootdir):
        for file in files:
            if (file == 'data.txt'):
                #print file
                with open(file,'r') as fin:
                    for lines in fin:
                        dosomething()
My dosomething() part -- I've tested and confirmed for it to work if I am running that part just for one file. I've also confirmed that if I tell it to print the file instead (the commented out line) the script prints out 'data.txt'.
Right now if I run it Python gives me this error:
  File "recursive.py", line 11, in <module>
    with open(file,'r') as fin:
IOError: [Errno 2] No such file or directory: 'data.txt'
I'm not sure why it can't find it -- after all, it prints out data.txt if I uncomment the 'print file' line. What am I doing incorrectly?
You need to use full paths; your file variable is just a bare filename without a directory path. The root variable is that path:
with open('output.txt','w') as fout:
    for root, subFolders, files in os.walk(rootdir):
        if 'data.txt' in files:
            with open(os.path.join(root, 'data.txt'), 'r') as fin:
                for lines in fin:
                    dosomething()
[os.path.join(dirpath, filename)
 for dirpath, dirnames, filenames in os.walk(rootdir)
 for filename in filenames]
A functional approach to get the tree looks shorter, cleaner and more Pythonic.
You can wrap os.path.join(dirpath, filename) in any function to process the files you get, or save the list of paths for further processing.
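One way to do that wrapping is a small generator plus a consumer, sketched below. Both names (walk_files and collect) are made up for illustration, and collect simply gathers lines where the question would call its own dosomething().

```python
import os

def walk_files(rootdir):
    """Yield the full path of every file below `rootdir`."""
    for dirpath, dirnames, filenames in os.walk(rootdir):
        for filename in filenames:
            yield os.path.join(dirpath, filename)

def collect(rootdir, wanted="data.txt"):
    """Return the lines of every file named `wanted`, in walk order."""
    lines = []
    for path in walk_files(rootdir):
        if os.path.basename(path) == wanted:
            with open(path, "r") as fin:
                lines.extend(fin.readlines())
    return lines
```

A generator avoids building the whole list of paths up front, which may matter for very large trees.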

python - open all plain text files in a directory with ".dta" extension and write lines to csv

I have a number of plain-text config files (.dta) that are spread through 27 sub-directories. I am trying to parse some information from all of them into a common document that is easier to work with.
Thus far I have:
import linecache
import csv
import os
csvout = csv.writer(open("dtaCompile.csv","wb"))
directory = os.path.join("c:\\","DirectKey")
for root,dirs,files in os.walk(directory):
    for file in files:
        if file.endswith(".DTA"):
            f=open(file,'r')
            lines = f.readlines()
            description = lines[1]
            articleCode = lines[2]
            OS = lines[25]
            SMBIOS = lines[32]
            pnpID = lines[34]
            cmdLine = lines[28]
            csvout.writerow([SMBIOS, description, articleCode, pnpID, OS, cmdLine])
            f.close()
I'm getting the following error:
Traceback (most recent call last):
  File "test.py", line 11, in <module>
    f=open(file,'r')
IOError: [Errno 2] No such file or directory: '000003APP.DTA'
Instead of
f=open(file,'r')
you probably need
f=open(os.path.join(directory, root, file),'r')
file is just the name of the file, and doesn't say anything about the path to it. You have to use os.path.join with the various components to create the full path. (Since root already begins with directory, os.path.join(root, file) would be enough.)
if file.endswith(".DTA"):
    file = os.path.join(directory, root, file)
Instead of:
f=open(file,'r')
Try:
f = open(os.path.join(root, file), "r")
Joining with root rather than directory matters here, because the files sit in sub-directories of directory.
My guess is that the directory your program is executing in is not the same as the directory you're walking.
Try printing:
os.getcwd()
to see.
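Putting the answers together, a Python 3 version of the whole task might look like the sketch below. It is not the asker's code: the function name compile_dta and its line_numbers parameter are invented here, the extension check is made case-insensitive as an assumption, and files that are too short are skipped rather than raising IndexError. The default line numbers follow the column order of the writerow call in the question.

```python
import csv
import os

def compile_dta(directory, csv_path, line_numbers=(32, 1, 2, 34, 25, 28)):
    """Write one CSV row per .dta file found below `directory`."""
    rows = 0
    # Python 3: open the CSV in text mode with newline="" instead of "wb".
    with open(csv_path, "w", newline="") as out:
        writer = csv.writer(out)
        for root, dirs, files in os.walk(directory):
            for name in files:
                if not name.lower().endswith(".dta"):
                    continue
                # root is the folder being visited, so this is the full path
                with open(os.path.join(root, name), "r") as f:
                    lines = f.read().splitlines()
                if len(lines) <= max(line_numbers):
                    continue  # assumption: too-short files are skipped, not padded
                writer.writerow([lines[i] for i in line_numbers])
                rows += 1
    return rows
```

Using splitlines() also drops the trailing newline that readlines() would leave at the end of every CSV field.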
