To give credit, the code I am currently working with is from this response by cji, here.
I am trying to recursively pull all files from the source folder, and move them into folders from the file names first-five characters 0:5
My Code Below:
import os
import shutil
srcpath = "SOURCE"
srcfiles = os.listdir(srcpath)
destpath = "DESTINATION"
# extract the three letters from filenames and filter out duplicates
destdirs = list(set([filename[0:5] for filename in srcfiles]))
def create(dirname, destpath):
full_path = os.path.join(destpath, dirname)
os.mkdir(full_path)
return full_path
def move(filename, dirpath):
shutil.move(os.path.join(srcpath, filename)
,dirpath)
# create destination directories and store their names along with full paths
targets = [(folder, create(folder, destpath)) for folder in destdirs]
for dirname, full_path in targets:
for filename in srcfiles:
if dirname == filename[0:5]:
move(filename, full_path)
Now, changing srcfiles = os.listdir(srcpath) and destdirs = list(set([filename[0:5] for filename in srcfiles])) with the code below gets me the paths in one variable and the first five characters of the file names in another.
srcfiles = []
destdirs = []
for root, subFolders, files in os.walk(srcpath):
for file in files:
srcfiles.append(os.path.join(root,file))
for name in files:
destdirs.append(list(set([name[0:5] for file in srcfiles])))
How would I go about modifying the original code to use this... Or if someone has a better idea on how I would go about doing this. Thanks.
I can't really test it very easily, but I think this code should work:
import os
import shutil
srcpath = "SOURCE"
destpath = "DESTINATION"
for root, subFolders, files in os.walk(srcpath):
for file in files:
subFolder = os.path.join(destpath, file[:5])
if not os.path.isdir(subFolder):
os.makedirs(subFolder)
shutil.move(os.path.join(root, file), subFolder)
Related
I have question regarding moving one file in each sub directories to other new sub directories. So for example if I have directory as it shown in the image
And from that, I want to pick only the first file in each sub directories then move it to another new sub directories with the same name as you can see from the image. And this is my expected result
I have tried using os.walk to select the first file of each sub directories, but I still don't know how to move it to another sub directories with the same name
path = './test/'
new_path = './x/'
n = 1
fext = ".png"
for dirpath, dirnames, filenames in os.walk(path):
for filename in [f for f in filenames if f.endswith(fext)][:n]:
print(filename) #this only print the file name in each sub dir
The expected result can be seen in the image above
You are almost there :)
All you need is to have both full path of file: an old path (existing file) and a new path (where you want to move it).
As it mentioned in this post you can move files in different ways in Python. You can use "os.rename" or "shutil.move".
Here is a full tested code-sample:
import os, shutil
path = './test/'
new_path = './x/'
n = 1
fext = ".png"
for dirpath, dirnames, filenames in os.walk(path):
for filename in [f for f in filenames if f.endswith(fext)][:n]:
print(filename) #this only print the file name in each sub dir
filenameFull = os.path.join(dirpath, filename)
new_filenameFull = os.path.join(new_path, filename)
# if new directory doesn't exist - you create it recursively
if not os.path.exists(new_path):
os.makedirs(new_path)
# Use "os.rename"
#os.rename(filenameFull, new_filenameFull)
# or use "shutil.move"
shutil.move(filenameFull, new_filenameFull)
I have the following directory structure:
-mailDir
-folderA
-sub1
-sub2
-inbox
-1.txt
-2.txt
-89.txt
-subInbox
-subInbox2
-folderB
-sub1
-sub2
-inbox
-1.txt
-2.txt
-200.txt
-577.txt
The aim is to copy all the txt files under inbox folder into another folder.
For this I tried the below code
import os
from os import path
import shutil
rootDir = "mailDir"
destDir = "destFolder"
eachInboxFolderPath = []
for root, dirs, files in os.walk(rootDir):
for dirName in dirs:
if(dirName=="inbox"):
eachInboxFolderPath.append(root+"\\"+dirName)
for ii in eachInboxFolderPath:
for i in os.listdir(ii):
shutil.copy(path.join(ii,i),destDir)
If the inbox directory only has .txt files then the above code works fine. Since the inbox folder under folderA directory has other sub directory along with .txt files, the code returns permission denied error. What I understood is shutil.copy won't allow to copy the folders.
The aim is to copy only the txt files in every inbox folder to some other location. If the file names are same in different inbox folder I have to keep both file names. How we can improve the code in this case ? Please note other than .txt all others are folders only.
One simple solution is to filter for any i that does not have the .txt extension by using the string endswith() method.
import os
from os import path
import shutil
rootDir = "mailDir"
destDir = "destFolder"
eachInboxFolderPath = []
for root, dirs, files in os.walk(rootDir):
for dirName in dirs:
if(dirName=="inbox"):
eachInboxFolderPath.append(root+"\\"+dirName)
for ii in eachInboxFolderPath:
for i in os.listdir(ii):
if i.endswith('.txt'):
shutil.copy(path.join(ii,i),destDir)
This should ignore any folders and non-txt files that are found with os.listdir(ii). I believe that is what you are looking for.
Just remembered that I once wrote several files to solve this exact problem before. You can find the source code here on my Github.
In short, there are two functions of interest here:
list_files(loc, return_dirs=False, return_files=True, recursive=False, valid_exts=None)
copy_files(loc, dest, rename=False)
For your case, you could copy and paste these functions into your project and modify copy_files like this:
def copy_files(loc, dest, rename=False):
# get files with full path
files = list_files(loc, return_dirs=False, return_files=True, recursive=True, valid_exts=('.txt',))
# copy files in list to dest
for i, this_file in enumerate(files):
# change name if renaming
if rename:
# replace slashes with hyphens to preserve unique name
out_file = sub(r'^./', '', this_file)
out_file = sub(r'\\|/', '-', out_file)
out_file = join(dest, out_file)
copy(this_file, out_file)
files[i] = out_file
else:
copy(this_file, dest)
return files
Then just call it like so:
copy_files('mailDir', 'destFolder', rename=True)
The renaming scheme might not be exactly what you want, but it will at least not override your files. I believe this should solve all your problems.
Here you go:
import os
from os import path
import shutil
destDir = '<absolute-path>'
for root, dirs, files in os.walk(os.getcwd()):
# Filter out only '.txt' files.
files = [f for f in files if f.endswith('.txt')]
# Filter out only 'inbox' directory.
dirs[:] = [d for d in dirs if d == 'inbox']
for f in files:
p = path.join(root, f)
# print p
shutil.copy(p, destDir)
Quick and simple.
sorry, I forgot the part where, you also need unique file names as well. The above solution only works for distinct file names in a single inbox folder.
For copying files from multiple inboxes and having a unique name in the destination folder, you can try this:
import os
from os import path
import shutil
sourceDir = os.getcwd()
fixedLength = len(sourceDir)
destDir = '<absolute-path>'
filteredFiles = []
for root, dirs, files in os.walk(sourceDir):
# Filter out only '.txt' files in all the inbox directories.
if root.endswith('inbox'):
# here I am joining the file name to the full path while filtering txt files
files = [path.join(root, f) for f in files if f.endswith('.txt')]
# add the filtered files to the main list
filteredFiles.extend(files)
# making a tuple of file path and file name
filteredFiles = [(f, f[fixedLength+1:].replace('/', '-')) for f in filteredFiles]
for (f, n) in filteredFiles:
print 'copying file...', f
# copying from the path to the dest directory with specific name
shutil.copy(f, path.join(destDir, n))
print 'copied', str(len(filteredFiles)), 'files to', destDir
If you need to copy all files instead of just txt files, then just change the condition f.endswith('.txt') to os.path.isfile(f) while filtering out the files.
There is directory A, which contains several subdirectories of txt files. There is another directory B, which contains txt files. There are several txt files in A that have the same name in B but different content. Now I want to move the txt files in B to A and cover the files with the same name. My Code is as below:
import shutil
import os
src = '/PATH/TO/B'
dst = '/PATH/TO/A'
file_list = []
for filename in os.walk(dst):
file_list.append(filename)
for root, dirs, files in os.walk(src):
for file in files:
if file in file_list:
##os.remove(dst/file[:-4] + '.txt')
shutil.move(os.path.join(src,file),os.path.join(dst,file))
But when I run this, it did nothing. Can anyone help me about it?
The following will do what you want. You need to be careful to preserve the subdirectory structure so as to avoid FileNotFound exceptions. Test it out in a test directory before clobbering the actual directories you want modified so you know that it does what you want.
import shutil
import os
src = 'B'
dst = 'A'
file_list = []
dst_paths = {}
for root, dirs, files in os.walk(dst):
for file in files:
full_path = os.path.join(root, file)
file_list.append(file)
dst_paths[file] = full_path
print(file_list)
print(dst_paths)
for root, dirs, files in os.walk(src):
for file in files:
if file in file_list:
b_path = os.path.join(root, file)
shutil.move(b_path,dst_paths[file])
This could be done with python, but I think I am missing a way to loop for all directories. Here is the code I am using:
import os
def renameInDir(directory):
for filename in os.listdir(directory):
if filename.endswith(".pdf"):
path = os.path.realpath(filename)
parents = path.split('/') //make an array of all the dirs in the path. 0 will be the original basefilename
newFilename=os.path.dirname(filename)+directory +parents[-1:][0] //reorganize data into format you want
os.rename(filename, newFilename)//rename the file
You should go with os.walk(). It will map the directory tree by the given directory param, and generate the file names.
Using os.walk() you'll accomplish the desired result is this way:
import os
from os.path import join
for dirpath, dirnames, filenames in os.walk('/path/to/directory'):
for name in filenames:
new_name = name[:-3] + 'new_file_extension'
os.rename(join(dirpath, name), join(dirpath, new_name))
this is the first question I am posting on stackoverflow so excuse me if I did something out of the norm.
I am trying to create a python program which traverses a user selected directory to display all file contents of the folders selected. For example: Documents folders has several folders with files inside of them, I am trying to save all files in the Documents folder to an array.
The method below is what I am using to traverse a directory (hoping it is a simple problem)
def saveFilesToArray(dir):
allFiles = []
os.chdir(dir)
for file in glob.glob("*"):
print(file)
if (os.path.isfile(file)):
allFiles.append(file)
elif(os.path.isdir(file)):
print(dir + "/" + file + " is a directory")
allFiles.append(saveFilesToArray(dir + "/" + file))
return allFiles
This will give you just the files:
import os
def list_files(root):
all_files = []
for root, dirs, files in os.walk(root, followlinks=True):
for file in files:
full_path = os.path.join(root, file)
all_files.append(full_path)
return all_files
I hope this is helpful:
import os
def saveFilesToList(theDir):
allFiles = []
for root, dirs, files in os.walk(theDir):
for name in files:
npath = os.path.join(root,name)
if os.path.isfile(npath):
allFiles.append(npath)
return allFiles
Traverses all directories and stores the path to files (that are not directories) in the list. It seems much easier to use this than glob.