I want to go through all folders inside a directory:
directory\
    folderA\
        a.cpp
    folderB\
        b.cpp
    folderC\
        c.cpp
    folderD\
        d.cpp
The names of the folders are all known.
Specifically, I am trying to count the number of lines of code in each of the a.cpp, b.cpp, c.cpp and d.cpp source files. So, go inside folderA and read a.cpp, count lines, then go back to directory, go inside folderB, read b.cpp, count lines, etc.
This is what I have up until now,
dir = directory_path
for folder_name in folder_list():
    dir = os.path.join(dir, folder_name)
    with open(dir) as file:
        source= file.read()
        c = source.count_lines()
but I am new to Python and have no idea if my approach is appropriate and how to proceed. Any example code shown will be appreciated!
Also, does with open handle the file opening/closing as it should for all those reads, or is more handling required?
I would do it like this:
import glob
import os
path = 'C:/Users/me/Desktop/' # give the path where all the folders are located
list_of_folders = ['test1', 'test2'] # give the program a list with all the folders you need
names = {} # initialize a dict
for each_folder in list_of_folders: # go through each folder in the list
    full_path = os.path.join(path, each_folder) # join the path
    os.chdir(full_path) # change directory to the desired path
    for each_file in glob.glob('*.cpp'): # every .cpp file in the current directory
        with open(each_file) as f: # opens a file - no need to close it
            names[each_file] = sum(1 for line in f if line.strip()) # count non-blank lines
    print(names)
Output:
{'file1.cpp': 2, 'file3.cpp': 2, 'file2.cpp': 2}
{'file1.cpp': 2, 'file3.cpp': 2, 'file2.cpp': 2}
Regarding the with question, you don't need to close the file or make any other checks. You should be safe as it is now.
You may, however, check if the full_path exists as somebody (you) could mistakenly delete a folder from your PC (a folder from list_of_folders)
You can do this with os.path.isdir, which returns True if the path exists and is a directory:
os.path.isdir(full_path)
PS: I used Python 3.
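
A variation on the same idea, as a minimal sketch: it avoids os.chdir, skips folders that are missing, and keys the results by full path so that files with the same name in different folders do not overwrite each other. The function name count_lines and its parameters are hypothetical, chosen only for illustration.

```python
import glob
import os


def count_lines(base_dir, folder_names):
    """Return {path: number of non-blank lines} for every .cpp file
    found directly inside each named folder under base_dir."""
    counts = {}
    for folder_name in folder_names:
        folder_path = os.path.join(base_dir, folder_name)
        if not os.path.isdir(folder_path):
            # Skip folders that are missing instead of raising an error.
            continue
        for cpp_path in glob.glob(os.path.join(folder_path, '*.cpp')):
            # The with block closes the file on exit, even after an exception.
            with open(cpp_path) as f:
                counts[cpp_path] = sum(1 for line in f if line.strip())
    return counts
```

Calling count_lines('directory', ['folderA', 'folderB']) would return one entry per .cpp file found, without changing the current working directory.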
Use Python's os.walk() to traverse all subdirectories and files of a given path, opening each file and applying your logic. You can walk it with a single 'for' loop, which simplifies your code greatly.
https://docs.python.org/2/library/os.html#os.walk
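
A minimal sketch of that approach, assuming you want the total line count of every .cpp file anywhere under a top-level folder. The function name count_cpp_lines is hypothetical.

```python
import os


def count_cpp_lines(top):
    """Walk top recursively and return {path: line count} for .cpp files."""
    counts = {}
    for root, dirs, files in os.walk(top):
        for name in files:
            if name.endswith('.cpp'):
                # files holds bare names, so join them with root first.
                path = os.path.join(root, name)
                with open(path) as f:
                    counts[path] = sum(1 for _ in f)
    return counts
```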
As manglano said, use os.walk().
You can generate a list of folders:
[src for src,_,_ in os.walk(sourcedir)]
You can generate a list of file paths:
[os.path.join(src, file) for src,dir,files in os.walk(sourcedir) for file in files]
I have a list
fileslist=[1.jpg,2.xml,3.png]
I want to search files in list in current working directory
I have tried
listingdir=os.getcwd()
for rootpath,directories,files in os.walk(listingdir):
    for file in fileslist:
        if file in files:
            print("file:{} found".format(file))
I also tried
list=(set(files).intersection(fileslist))
but it did not work because the files have more than one type of extension.
When I used set, it created a list like the following and I didn't get the results I expected:
f=set(files)
print(f)
#result is
[[1.jpg,2.jpg,....],[1.png,2.png,...],[1.xml,2.xml,.......]]
If you only want to search through the current dir, you can do something like:
files = [f for f in os.listdir() if os.path.isfile(f)]
fileslist = ['1.jpg','2.xml','3.png']
list = (set(files).intersection(fileslist))
Output:
{'1.png'} # it won't always be this, just an example.
You may use os.path.isfile(...). It will check if a certain file exists. It may accept a full path or a filename only (then it will check if the file exists in the current working directory).
import os.path
fileslist=['1.jpg','2.xml','3.png'] # no, it won't work without the quotes!
for f in fileslist:
    if os.path.isfile(f):
        print("file:{} found".format(f))
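
If the search should also cover subdirectories, as in the os.walk attempt from the question, one possible sketch is to intersect the wanted names with the file list of each directory in turn. The function name find_files is hypothetical.

```python
import os


def find_files(top, wanted_names):
    """Return full paths of files under top whose name is in wanted_names."""
    wanted = set(wanted_names)
    found = []
    for root, dirs, files in os.walk(top):
        # files is a flat list of names for this one directory,
        # so a set intersection works per directory.
        for name in sorted(wanted.intersection(files)):
            found.append(os.path.join(root, name))
    return found
```

For example, find_files(os.getcwd(), ['1.jpg', '2.xml', '3.png']) lists every matching file below the current working directory.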
I'm trying to print the full paths of all the files in a folder. I have a folder called "a", and in that folder there are 3 NC files, let's call them "b", "c" and "d", whose paths I want to print. How would I do this?
For example, given my path to the folder is
path=r"C:\\Users\\chz08006\\Documents\\Testing\\a"
I want to print the paths of all the files in the folder "a", so the result should print:
C:\\Users\\chz08006\\Documents\\Testing\\a\\b.nc
C:\\Users\\chz08006\\Documents\\Testing\\a\\c.nc
C:\\Users\\chz08006\\Documents\\Testing\\a\\d.nc
So far, I've tried
for a in path:
    print(os.path.basename(path))
But that doesn't seem to be right.
I think you're looking for this:
import os
path = r"C:\Users\chz08006\Documents\Testing\a"
for root, dirs, files in os.walk(path):
    for file in files:
        print(os.path.join(root, file))
You can have a list of file names in a folder using listdir().
import os
path = "C:\\Users\\chz08006\\Documents\\Testing\\a"
l = os.listdir(path)
for a in l:
    print(os.path.join(path, a)) # join adds the missing path separator
You made a couple of mistakes. You were using os.path.basename, which only returns the name of the file or folder at the end of a path, after the last path separator.
To get the full path, join each name with path using os.path.join. (os.path.abspath would resolve a bare file name against the current working directory, not against path.)
The other mistake was using the wrong variable inside the loop (print(os.path.basename(path)) instead of using the variable a).
Also, don't forget to use os.listdir to list the files inside the folder before looping.
import os
path = r"C:\Users\chz08006\Documents\Testing\a"
for file in os.listdir(path): # a better name than a
    print(os.path.join(path, file)) # you wrote path here, instead of a
    # variable names that do not have a meaning
    # make these kinds of errors easier to make,
    # and harder to spot
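
To see why the folder has to be joined in explicitly, here is a small sketch comparing os.path.join with os.path.abspath on the bare names that os.listdir returns. Both helper names are hypothetical.

```python
import os


def full_paths(folder):
    """Return the full path of every entry in folder."""
    return [os.path.join(folder, name) for name in os.listdir(folder)]


def misleading_paths(folder):
    """What abspath gives for bare names: paths under the current
    working directory, which is usually not the folder that was listed."""
    return [os.path.abspath(name) for name in os.listdir(folder)]
```

Unless the script happens to run from inside the listed folder, the second function returns paths to files that do not exist.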
I have files like:
00001.jpg
00002.jpg
.
.
.
01907.jpg
I want to add some files to this directory which are named the same. But their names should continue like
01908.jpg
01909.jpg
.
.
12906.jpg
I couldn't manage to do that. How can I make this happen?
Thanks a lot:)
I tried
import os
files=[]
files = sorted(os.listdir('directory'))
b=len(files)
for i in range(0,b):
    a=files[i]
    os.rename(a,(a+1))
print (files)
You have a source dir (which contains the badly/identically named files) and a target dir (which contains files that should not be overwritten).
I would:
1. List the target dir and sort it like you did (the rest of your attempt is clearly off...).
2. Get the last item and parse its name as an integer (without extension); adding 1 gives the next free index.
3. Loop over the source dir.
4. Generate a new name for the current file using the newly computed index.
5. Use shutil.move or shutil.copy to move/copy the files under their new names.
Like this:
import os,shutil
s = "source_directory"
d = "target_directory"
files = sorted(os.listdir(d))
next_index = int(os.path.splitext(files[-1])[0])+1 # next free index
for i,f in enumerate(sorted(os.listdir(s)),next_index):
    new_name = "{:05}{}".format(i, os.path.splitext(f)[1]) # keep the original extension (.jpg)
    shutil.copy(os.path.join(s,f),os.path.join(d,new_name))
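
The same steps wrapped in a function, as a sketch under two assumptions: every file name in the target directory is a plain number plus an extension, and an empty target directory should start numbering at 1. The name append_numbered and the width parameter are hypothetical.

```python
import os
import shutil


def append_numbered(source_dir, target_dir, width=5):
    """Copy files from source_dir into target_dir, continuing its numbering."""
    existing = [int(os.path.splitext(name)[0]) for name in os.listdir(target_dir)]
    # default=0 covers an empty target directory.
    next_index = max(existing, default=0) + 1
    new_names = []
    for offset, name in enumerate(sorted(os.listdir(source_dir))):
        ext = os.path.splitext(name)[1]  # keep the original extension
        new_name = '{:0{width}d}{ext}'.format(next_index + offset, width=width, ext=ext)
        shutil.copy(os.path.join(source_dir, name), os.path.join(target_dir, new_name))
        new_names.append(new_name)
    return new_names
```

Taking the maximum existing number, instead of the last item of a sorted list, keeps the result correct even if the names are not all zero-padded to the same width.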
You can do this:
import os
directory1 = 'path to the directory you want to move the files to'
directory2 = 'path to the directory you want to move the files from'
counter = len(os.listdir(directory1)) # number of files already in the target
for file in sorted(os.listdir(directory2)):
    name, ext = os.path.splitext(file)
    file_number = int(name) # Get the file number
    new_name = '{:05}{}'.format(file_number + counter, ext)
    os.rename(os.path.join(directory2, file), os.path.join(directory1, new_name))
What I have done:
Counted the files already in the target directory once, before the loop. I assumed that this count is the same as the number in the name of the last file there, so adding it to each incoming file's number cannot cause an overwrite.
Looped over the files that I wanted to rename and move.
Then I got the number of the current file in the loop and kept its extension.
Finally, I used os.rename to rename and move the file from the source directory to the target directory, zero-padding the new number to five digits.
I am new to Python. I have successfully written a script to search for something within a file using:
open(r"C:\file.txt") and the re.search function, and it all works fine.
Is there a way to run the search on all files within a folder? Currently, I have to manually change the file name in my script: open(r"C:\file.txt"), open(r"C:\file1.txt"), open(r"C:\file2.txt"), etc.
Thanks.
You can use os.walk to check all the files, as follows:
import os
for root, _, files in os.walk(path):
    for filename in files:
        with open(os.path.join(root, filename), 'r') as f:
            pass # your code goes here
Explanation:
os.walk yields a tuple of (root path, dir names, file names) for each folder in the tree, so you can iterate through the file names and open each file using os.path.join(root, filename), which joins the root path with the file name.
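
Combined with re.search from the question, a minimal sketch could collect every matching line together with its file and line number. The function name search_tree and the shape of the returned tuples are assumptions made for illustration.

```python
import os
import re


def search_tree(top, pattern):
    """Return (path, line_number, line) for each line under top matching pattern."""
    regex = re.compile(pattern)
    hits = []
    for root, dirs, files in os.walk(top):
        for filename in files:
            path = os.path.join(root, filename)
            # errors='replace' keeps undecodable bytes from aborting the scan.
            with open(path, 'r', errors='replace') as f:
                for line_number, line in enumerate(f, start=1):
                    if regex.search(line):
                        hits.append((path, line_number, line.rstrip('\n')))
    return hits
```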
Since you're a beginner, I'll give you a simple solution and walk through it.
Import the os module, and use the os.listdir function to create a list of everything in the directory. Then, iterate through the files using a for loop.
Example:
# Importing the os module
import os
# Give the directory you wish to iterate through
my_dir = r"C:\Users\bleh\Desktop\files" # replace with your directory
# Using os.listdir to create a list of all of the files in dir
dir_list = os.listdir(my_dir)
# Use the for loop to iterate through the list you just created, and open the files
for f in dir_list:
    print(f) # replace with whatever you want to do to all of the files
If you need help on the concepts, refer to the following:
for loops in Python 3: http://www.python-course.eu/python3_for_loop.php
os function Library (this has some cool stuff in it): https://docs.python.org/2/library/os.html
Good luck!
You can use the os.listdir(path) function:
import os
path = '/Users/ricardomartinez/repos/Salary-API'
# List for all files in a given PATH
file_list = os.listdir(path)
# If you want to filter by file type
file_list = [file for file in os.listdir(path) if os.path.splitext(file)[1] == '.py']
# In both cases you can iterate over the list and apply the operations
# that you have
for file in file_list:
    print(file)
    # Operations that you want to do over files
I want to implement a script that reads files (in folders and subfolders), detects some tags, and deletes those tags from the files.
The files are .cpp, .h, .txt and .xml, and there are hundreds of them under the same folder.
I have no experience with Python, but people told me that I can do it easily.
EXAMPLE:
My main folder is A: C:\A
Inside A, I have folders (B, C, D) and some files: A.cpp, A.h, A.txt and A.xml. In B I have folders B1, B2, B3, some of which have more subfolders, and .cpp, .xml and .h files....
.xml files contain tags like <!-- $Mytag: some text$ -->
.h and .cpp files contain another kind of tag, like //$TAG some text$
.txt files have differently formatted tags: #$This is my tag$
A tag always starts and ends with the $ symbol, but it always has a comment character (//,
The idea is to run one script and delete all tags from all files so the script must:
Read folders and subfolders
Open files and find tags
If they are there, delete and save files with changes
WHAT I HAVE:
import os
for root, dirs, files in os.walk(os.curdir):
    if files.endswith('.cpp'):
        %Find //$ and delete until next $
    if files.endswith('.h'):
        %Find //$ and delete until next $
    if files.endswith('.txt'):
        %Find #$ and delete until next $
    if files.endswith('.xml'):
        %Find <!-- $ and delete until next $ and -->
The general solution would be to:
1. Use the os.walk() function to traverse the directory tree.
2. Iterate over the filenames and use fn_name.endswith('.cpp') with if/elif to determine which kind of file you're working with.
3. Use the re module to create a regular expression you can use to determine if a line contains your tag.
4. Open the target file and a temporary file (use the tempfile module). Iterate over the source file line by line and output the filtered lines to your tempfile.
5. If any lines were replaced, use os.unlink() plus os.rename() to replace your original file.
It's a trivial exercise for a Python adept, but for someone new to the language it'll probably take a few hours to get working. You probably couldn't ask for a better task to get introduced to the language though. Good luck!
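
The per-file part of those steps can be sketched as follows. This is an illustration under stated assumptions, not a finished tool: the pattern CPP_TAG is a hypothetical regular expression for tags of the form //$TAG some text$, the function name strip_tags is made up, and os.replace is used here in place of the os.unlink() plus os.rename() pair because it overwrites the original in a single call.

```python
import os
import re
import tempfile

# Hypothetical pattern for .cpp/.h tags such as "//$TAG some text$".
CPP_TAG = re.compile(r'//\s*\$[^$]*\$')


def strip_tags(path, regex=CPP_TAG):
    """Rewrite path without the text matched by regex; return True if changed."""
    changed = False
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the original so the final
    # replace stays on one filesystem.
    fd, temp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, 'w') as out, open(path) as src:
        for line in src:
            new_line = regex.sub('', line)
            changed = changed or new_line != line
            out.write(new_line)
    if changed:
        os.replace(temp_path, path)  # overwrites the original in one step
    else:
        os.remove(temp_path)  # nothing matched, keep the original untouched
    return changed
```

Files without any tag are left exactly as they were, so running the function over a whole tree only touches the files that needed a change.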
----- Update -----
The files attribute returned by os.walk is a list so you'll need to iterate over it as well. Also, the files attribute will only contain the base name of the file. You'll need to use the root value in conjunction with os.path.join() to convert this to a full path name. Try doing just this:
for root, d, files in os.walk('.'):
    for base_filename in files:
        full_name = os.path.join(root, base_filename)
        if full_name.endswith('.h'):
            print full_name, 'is a header!'
        elif full_name.endswith('.cpp'):
            print full_name, 'is a C++ source file!'
If you're using Python 3, the print statements will need to be function calls but the general idea remains the same.
Try something like this:
import os
import re
# Python's re module only allows fixed-width lookbehind, so the comment
# marker is captured in group 1 and written back by process_file.
CPP_TAG_RE = re.compile(r'(// *)\$[^$]+\$')
tag_REs = {
    '.h': CPP_TAG_RE,
    '.cpp': CPP_TAG_RE,
    '.xml': re.compile(r'(<!-- *)\$[^$]+\$(?= *-->)'),
    '.txt': re.compile(r'(# *)\$[^$]+\$'),
}
def process_file(filename, regex):
    # Set up.
    tempfilename = filename + '.tmp'
    # Filter the file; the with block closes both files.
    with open(filename, 'r') as infile, open(tempfilename, 'w') as outfile:
        for line in infile:
            # Keep the comment marker (group 1) and drop the $...$ tag.
            outfile.write(regex.sub(r'\1', line))
    # Enable only one of the two following lines.
    os.rename(filename, filename + '.orig')
    #os.remove(filename)
    os.rename(tempfilename, filename)
def process_tree(starting_point=os.curdir):
    for root, d, files in os.walk(starting_point):
        for filename in files:
            # Get rid of `.lower()` in the following if case matters.
            ext = os.path.splitext(filename)[1].lower()
            if ext in tag_REs:
                process_file(os.path.join(root, filename), tag_REs[ext])
The nice thing about os.path.splitext is that it does the right thing for filenames that start with a dot.
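
A short sketch of that behaviour: os.path.splitext treats a leading dot as part of the name, so a file such as .bashrc has no extension at all. The helper name extension_of is hypothetical.

```python
import os


def extension_of(filename):
    """Return the lower-cased extension of filename, or '' if it has none."""
    return os.path.splitext(filename)[1].lower()
```

With this helper, extension_of('.bashrc') is an empty string, extension_of('A.CPP') is '.cpp', and extension_of('.hidden.txt') is '.txt', so hidden files are never mistaken for files with a matching extension.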