I am trying to figure out how to check whether a file in my source folder already exists in my destination folder before copying it over.
If the file already exists in the destination folder, I want to rename the source file with a "_1" suffix (or _i+1 if that name is also taken) and then copy it to the destination folder.
For example (the files will not necessarily be .txt; the file types are dynamic):
1. I want to copy file.txt from folder a over to folder b.
2. file.txt already exists within folder b.
   a. If I attempted to copy file.txt over to folder b, I would receive a copy error.
3. Rename file.txt to file_1.txt.
   a. Copy file_1.txt to folder b.
   b. If file_1.txt exists, then make it file_2.txt.
What I have so far is this:
for filename in files:
    filename_only = os.path.basename(filename)
    src = path + "\\" + filename
    failed_f = pathx + "\\Failed\\" + filename
    # This is where I am lost, I am not sure how to declare the i and add _i + 1 into the code.
    if path.exists(file_path):
        numb = 1
        while True:
            new_path = "{0}_{2}{1}".format(*path.splitext(file_path) + (numb,))
            if path.exists(new_path):
                numb += 1
                shutil.copy(src, new_path)
            else:
                shutil.copy(src, new_path)
    shutil.copy(src, file_path)
Thanks much in advance.
import os
import shutil

for filename in files:
    src = os.path.join(path, filename)
    i = 0
    while True:
        base = os.path.basename(src)
        # Try the original name first, then name_1.ext, name_2.ext, ...
        name = base if i == 0 else "_{}".format(i).join(os.path.splitext(base))
        dst_path = os.path.join(dst, name)
        if not os.path.exists(dst_path):
            shutil.copy(src, dst_path)
            break
        i += 1
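The same loop can be wrapped in a reusable function. This is only a minimal sketch: the name copy_without_overwrite and its parameters are hypothetical, and the exists-then-copy check is not atomic, so it assumes no other process is writing to the destination folder at the same time.

```python
import os
import shutil


def copy_without_overwrite(src, dst_dir):
    """Copy src into dst_dir, appending _1, _2, ... if the name is taken.

    Returns the path that was actually written.
    """
    stem, ext = os.path.splitext(os.path.basename(src))
    candidate = os.path.join(dst_dir, stem + ext)
    i = 0
    # Keep bumping the counter until the candidate name is free
    while os.path.exists(candidate):
        i += 1
        candidate = os.path.join(dst_dir, "{}_{}{}".format(stem, i, ext))
    shutil.copy(src, candidate)
    return candidate
```

Calling it three times with the same file.txt should leave file.txt, file_1.txt and file_2.txt in the destination folder.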
I have files in folders and subfolders. The folder structure is like this:
2020(folder)
-01(sub folder)
--14(sub-sub folder)
----abc1-2020-01-14.csv
----abc2-2020-01-14.csv
-02(subfolder in 2020)
--17(sub-sub folder in 02)
----abc1-2020-02-17.csv
----abc4-2020-02-17.csv
I have a list of file names:
li = ['abc1','abc2','abc3','abc4']
I want to know whether these files exist in the directory or not. Each subdirectory should have all 4 files. If not, the code must return the path where a particular file does not exist.
import glob
BASE_PATH = r'2020/'
allin= BASE_PATH + '/*/*'
li = ['abc1','abc2','abc3','abc4']
print('Names of files:')
for name in glob.glob(allin):
    print('\t', name)
    for k in li:
        try:
            f = open(r"C:\\Users\\Karar\\ProjectFiles\\scripts\\"+ name + "\\" + k + "*.csv")
        except IOError:
            print(name+k+ ".csv""File not present")
Printing name returns 2020\01\14 and 2020\02\17.
I am having difficulty building the path for the open call. Please also note that the file names stored in the folders end with a date, so the path needs to handle that as well: whatever the date at the end of the file name, if a folder contains files for every name in the list, do nothing; if files are missing from a subfolder, print "file not present" along with the path.
Note that each folder has to contain all 4 files; if not, report the missing ones.
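One possible approach:
import glob
import os

base = '2020'
li = ['abc1', 'abc2', 'abc3', 'abc4']
for dirname in glob.glob(os.path.join(base, '*', '*')):
    # dirname looks like 2020/01/14 (2020\01\14 on Windows)
    year, month, day = os.path.normpath(dirname).split(os.sep)
    for prefix in li:
        # str.join takes a single iterable, so the parts go in a list
        filename = os.path.join(dirname, '-'.join([prefix, year, month, day]) + '.csv')
        if not os.path.exists(filename):
            print(filename)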
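If the date in the file name cannot be assumed to match the folder names, a wildcard on the date part is an alternative. The sketch below is an assumption-laden variant: find_missing is a hypothetical helper, and it assumes every file is named prefix-<anything>.csv as in the listing above.

```python
import glob
import os


def find_missing(base, prefixes):
    """Return (directory, prefix) pairs where no '<prefix>-*.csv' file exists."""
    missing = []
    for dirname in sorted(glob.glob(os.path.join(base, "*", "*"))):
        if not os.path.isdir(dirname):
            continue  # ignore stray files at the day level
        for prefix in prefixes:
            # The trailing '-' keeps 'abc1' from also matching 'abc10-...'
            pattern = os.path.join(dirname, prefix + "-*.csv")
            if not glob.glob(pattern):
                missing.append((dirname, prefix))
    return missing
```

For the folder layout in the question, find_missing('2020', li) should report abc3 and abc4 for 2020/01/14, and abc2 and abc3 for 2020/02/17.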
I'm trying to walk over a directory structure and create a similar (but not identical) structure.
I got confused by the use of os.path.join. The following code works perfectly with a directory depth of 2 or more.
DIR_1 :
A | file2.txt
B | file3.txt
file1.txt
inputpath = DIR_1
outputpath = DIR_2
for dirpath, dirnames, filenames in os.walk(inputpath):
    structure = os.path.join(outputpath, dirpath[len(inputpath):])
    for f1 in filenames:
        f = os.path.splitext(f1)[0]
        path = structure + '/' + f
        print ("The path is: ", path)
        file1 = path + '/' + f1
        print ("The file path is: ", file1)
        file_dir = dirpath + '/' + f1;
        print ("The file dir path is: ", file_dir)
        print ("\n")
But with just one level of depth, it adds an additional '/'. Is there a way to avoid this?
For example the following gives:
The path is: DIR_2//file1
The file path is: DIR_2//file1/file1.txt
The file dir path is: DIR_1/file1.txt
The path is: /A/file2
The file path is: /A/file2/file2.txt
The file dir path is: DIR_1/A/file2.txt
The path is: /B/file3
The file path is: /B/file3/file3.txt
The file dir path is: DIR_1/B/file3.txt
Edit 1:
The output directory DIR_2 structure is similar to the original Dir_1 but not identical.
The DIR_2 should have additional one level of directory of the filename; for example rather than just
DIR_2/file1.txt
it should be
DIR_2/file1/file1.txt.
and, similarly, DIR_2/A/file2/file2.txt.
Edit 2:
I also need to read the content of each file under dirpath (in DIR_1) and select the relevant text to put in the corresponding output file (in DIR_2), so I can't ignore it.
You should not worry about the dirpath; use it only to locate the original files. Everything you need to recreate the directory structure is already available in dirnames. The code to recreate the file structure can look like this:
import os
import shutil

for root, dirs, files in os.walk(input_path):
    offset = len(input_path)
    if len(root) > len(input_path):
        offset += 1  # remove an extra leading separator
    relative_path = root[offset:]
    for d in dirs:  # create folders
        os.mkdir(os.path.join(output_path, relative_path, d))
    for f in files:  # copy the files
        shutil.copy(os.path.join(root, f),
                    os.path.join(output_path, relative_path, f))
And that's it!
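The question's Edit 1 also asks for one extra folder per file, named after the file without its extension. A possible sketch of that variant follows; the function name mirror_with_file_folders is hypothetical, os.path.relpath is swapped in for the manual offset arithmetic, and exist_ok needs Python 3.2 or later.

```python
import os
import shutil


def mirror_with_file_folders(input_path, output_path):
    """Copy every file to output_path/<relative dir>/<file stem>/<file name>."""
    copied = []
    for root, dirs, files in os.walk(input_path):
        # relpath returns '.' for the top level; treat that as "no subfolder"
        relative = os.path.relpath(root, input_path)
        if relative == os.curdir:
            relative = ""
        for name in files:
            stem = os.path.splitext(name)[0]
            target_dir = os.path.join(output_path, relative, stem)
            os.makedirs(target_dir, exist_ok=True)
            target = os.path.join(target_dir, name)
            shutil.copy(os.path.join(root, name), target)
            copied.append(target)
    return copied
```

With the DIR_1 layout from the question this should produce DIR_2/file1/file1.txt, DIR_2/A/file2/file2.txt and DIR_2/B/file3/file3.txt, with no doubled separator at the top level. If the output file should hold only selected text rather than a full copy, the shutil.copy call is the line to replace.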
There are lots of suggestions on how to create a directory, but I have not come across an elegant solution for saving a file in a newly created directory. My code saves files in the main folder, not in the ID-specific one. I'd greatly appreciate your help. Thanks!
I'm using Windows 10 Python 3.
1) check whether "TrainData/xx-xxx" directory exists
2) if it does not exist:
create a subfolder within the "TrainData" directory and name it based on unique input (id) - this now works
save the file within this new directory (TrainData/xx-xxx) and name it xx-xxx....jpg
3) if it exists:
save the file within this new directory (TrainData/xx-xxx) and name it xx-xxx....jpg
id = input('Client ID:xx-xxx')
directory = "TrainData/" +str(id)
if not os.path.exists(directory):
    os.makedirs(directory)
#with open(os.path.join(directory, '.' +str(id))) #I can't get this to work
file_name_path = directory + str(id)+ '.' +str(count)+ '.' +str(timegm(datetime.utcnow().utctimetuple())) + '.jpg'
if cv2.Laplacian(face, cv2.CV_64F).var() >200:
    cv2.imwrite(file_name_path, face)
else:
    count -= 1
cv2.imshow('Client', frame)
This creates the directory if it does not exist:
import os
directory = "TrainData/" +str(id)
if not os.path.exists(directory):
    os.makedirs(directory)
Then you can create and open a file in that directory:
with open(os.path.join(directory, '.' +str(count)+ '.' +str(timegm(datetime.utcnow().utctimetuple())) + '.jpg' ), "w") as ip_file:
    ...
All that was missing was a '+/'. Thanks everyone!
id = input('Client ID:xx-xxx')
directory = "TrainData/" +str(id) +'/'
if not os.path.exists(directory):
    os.makedirs(directory)
file_name_path = directory +"ID." +str(id)+ '.' +str(count)+ '.' +str(timegm(datetime.utcnow().utctimetuple())) + '.jpg'
if cv2.Laplacian(face, cv2.CV_64F).var() >200:
    cv2.imwrite(file_name_path, face)
else:
    count -= 1
cv2.imshow('Client', frame)
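Building the path with os.path.join avoids the missing-separator problem altogether. A minimal sketch, assuming the same ID.<id>.<count>.<timestamp>.jpg naming as above; build_image_path and its parameters are hypothetical, and time.time() stands in for the timegm call.

```python
import os
import time


def build_image_path(client_id, count, root="TrainData", timestamp=None):
    """Create root/<client_id>/ if needed and return a file path inside it."""
    directory = os.path.join(root, str(client_id))
    # exist_ok=True makes the separate os.path.exists check unnecessary
    os.makedirs(directory, exist_ok=True)
    if timestamp is None:
        timestamp = int(time.time())  # seconds since the epoch (UTC)
    file_name = "ID.{}.{}.{}.jpg".format(client_id, count, timestamp)
    return os.path.join(directory, file_name)
```

The returned path can then be passed to cv2.imwrite in place of file_name_path.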
I want to group a list of files into sub-folders based on some substring in their name
The files are of the form
pie_riverside_10.png
stack_oak_20.png
scatter_mountain_10.png
and I want to use the starting substring (e.g. pie, stack, scatter) and the integer substring (e.g. 10, 20) as the sub-directory name for grouping the files.
The code below is only an example. If I actually take that approach, I have to create at least 75-80 folders manually and write an elif statement for each, which is inefficient.
Is there a better way to do this?
EDIT: The current code assumes the folders already exist, but in the real scenario they have not been created, and I do not want to create 70-80 subfolders by hand. I am trying to write a script that creates those folders for me.
import shutil
import os
source = 'C:/Users/Xx/Documents/plots/'
pie_charts_10 = 'C:/Users/Xx/Documents/pie_charts_10/'
pie_charts_20 = 'C:/Users/Xx/Documents/pie_charts_20/'
stack_charts_10 = 'C:/Users/Xx/Documents/stack_charts_10/'
scatter_charts_10 = 'C:/Users/Xx/Documents/scatter_charts_10/'
files = os.listdir(source)
for f in files:
    if (f.startswith("pie") and f.endswith("10.png")):
        shutil.move(os.path.join(source, f), pie_charts_10)
    elif (f.startswith("pie") and f.endswith("20.png")):
        shutil.move(os.path.join(source, f), pie_charts_20)
    elif (f.startswith("stack") and f.endswith("10.png")):
        shutil.move(os.path.join(source, f), stack_charts_10)
    elif (f.startswith("scatter") and f.endswith("10.png")):
        shutil.move(os.path.join(source, f), scatter_charts_10)
    else:
        print("No file")
If you want to move files named prefix_..._suffix.png into folders named prefix_charts_suffix/:
base = "C:/Users/Xx/Documents"
moved_types = ['png']
for f in files:
    pf = f.rsplit('.', 1)  # filename, extension
    sf = pf[0].split("_")  # prefix, whatever, suffix
    if len(sf) >= len(pf) > 1 and pf[1] in moved_types:
        new_dir = "%s_charts_%s" % (sf[0], sf[-1])
        if not os.path.exists(os.path.join(base, new_dir)):
            os.makedirs(os.path.join(base, new_dir))
        shutil.move(os.path.join(source, f), os.path.join(base, new_dir, f))
This works for the general case, grabbing and moving only files whose extension is in moved_types and whose name contains a _ (which allows a prefix and a suffix to be split off).
See the relevant logic on repl.it:
>>>['prefix_garbage_suffix.png', 'bob.sh', 'bob.bill.png', "pie_23.png", "scatter_big_1.png"]
Move prefix_garbage_suffix.png to prefix_charts_suffix
Move pie_23.png to pie_charts_23
Move scatter_big_1.png to scatter_charts_1
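The name parsing can also be done with a regular expression, which makes the "integer suffix" requirement from the question explicit. This is a sketch of a stricter variant, not the logic above: CHART_NAME and target_folder are hypothetical names, and unlike the split-based code it skips files whose last part is not a number (such as prefix_garbage_suffix.png).

```python
import re

# <kind>_<anything, optional>_<digits>.png
CHART_NAME = re.compile(r"^(?P<kind>[^_]+)_(?:.*_)?(?P<number>\d+)\.png$")


def target_folder(filename):
    """Return e.g. 'pie_charts_10' for 'pie_riverside_10.png', else None."""
    match = CHART_NAME.match(filename)
    if match is None:
        return None
    return "{}_charts_{}".format(match.group("kind"), match.group("number"))
```

A caller would create the returned folder with os.makedirs if it is missing and then shutil.move the file into it, skipping files for which the function returns None.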
EDIT: I've preserved the original answer in case others need a solution where not every file should be moved or you can't infer the folder name from the file names.
If you need that, I would do something like:
identity_tuples = \
    [('pie', '16.png', 'C:/Users/Xx/Documents/pie_charts/'),
     ('stack', '14.png', 'C:/Users/Xx/Documents/stack_charts/'),
     ('scatter', '12.png', 'C:/Users/Xx/Documents/scatter_charts/')]
files = os.listdir(source)
for f in files:
    for identity_tuple in identity_tuples:
        if f.startswith(identity_tuple[0]) and f.endswith(identity_tuple[1]):
            shutil.move(os.path.join(source, f), identity_tuple[2])
            break
    else:
        print("No file")
Now you just have to add a new identity tuple: (prefix, suffix, destination) for each type. If the path is common for all the destinations, you can change it to:
identity_tuples = \
    [('pie', '16.png', 'pie_charts/'),
     ('stack', '14.png', 'stack_charts/'),
     ('scatter', '12.png', 'scatter_charts/')]
files = os.listdir(source)
for f in files:
    for identity_tuple in identity_tuples:
        if f.startswith(identity_tuple[0]) and f.endswith(identity_tuple[1]):
            shutil.move(os.path.join(source, f), "C:/Users/Xx/Documents/" + identity_tuple[2])
            break
    else:
        print("No file")
Note: This is using a for/else loop, in which else is only called if you don't hit a break.
If you need to make the directories, add this in before the shutil.move():
if not os.path.exists(identity_tuple[2]):
    os.makedirs(identity_tuple[2])  # Or "C:/Users/Xx/Documents/" + ...
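The for/else behaviour mentioned in the note can be seen in isolation in the small sketch below. The function name first_matching_destination is hypothetical; it only looks up the destination and does not move anything.

```python
def first_matching_destination(filename, identity_tuples):
    """Return the destination of the first (prefix, suffix, destination) match."""
    for prefix, suffix, destination in identity_tuples:
        if filename.startswith(prefix) and filename.endswith(suffix):
            result = destination
            break  # leaving via break skips the else clause below
    else:
        # Runs only when the loop finished without hitting break
        result = None
    return result
```

With the tuples above, a name like pie_x_16.png yields 'pie_charts/', while pie_x_17.png falls through to the else clause and yields None.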
How about this
import os
import shutil

# assume you have files in a folder
source = './files'  # some directory
files = os.listdir(source)
print(files)
# ['pie_river_1.png', 'pie_mountain_11.png', 'scatter_grass_12.png', 'stack_field_30.png']
Now you want to group them into subfolders based on what they start with and what number they have before the extension
subdir_root = './subfolders'
for f in files:
    fig_type = f.split('_')[0]
    fig_num = f.split('.png')[0].split('_')[-1]
    subdir_name = '%s_charts_%s' % (fig_type, fig_num)  # name of dir, e.g. pie_charts_10
    subdir = os.path.join(subdir_root, subdir_name)  # path to dir
    if not os.path.exists(subdir):  # if the dir does not exist, create it
        os.makedirs(subdir)
    f_src = os.path.join(source, f)  # full path to source file
    f_dest = os.path.join(subdir, f)  # full path to new destination file
    shutil.copy(f_src, f_dest)  # I changed to copy so you don't screw up your original files
On my computer:
$ ls ./files:
pie_mountain_11.png pie_river_1.png scatter_grass_12.png stack_field_30.png
$ ls -R ./subfolders
pie_charts_1 pie_charts_11 scatter_charts_12 stack_charts_30
subfolders//pie_charts_1:
pie_river_1.png
subfolders//pie_charts_11:
pie_mountain_11.png
subfolders//scatter_charts_12:
scatter_grass_12.png
subfolders//stack_charts_30:
stack_field_30.png
Obviously, you might have to change the code if edge cases arise, but this should give you a good start.
I've already read this thread but when I implement it into my code it only works for a few iterations.
I'm using Python to iterate through a directory (let's call it the move directory) and copy mainly PDF files (each matching a unique ID) into another directory (the base directory), placing each one in the folder with the corresponding unique ID. I started with shutil.copy, but if there are duplicates it overwrites the existing file.
I'd like to be able to search the corresponding folder to see if the file already exists, and iteratively name it if more than one occurs.
e.g.
Copy file 1234.pdf to folder 1234 in the base directory.
If 1234.pdf already exists, name it 1234_1.pdf.
If another PDF is copied as 1234.pdf, it becomes 1234_2.pdf.
Here is my code:
import arcpy
import os
import re
import sys
import traceback
import collections
import shutil
movdir = r"C:\Scans"
basedir = r"C:\Links"
try:
    #Walk through all files in the directory that contains the files to copy
    for root, dirs, files in os.walk(movdir):
        for filename in files:
            #find the name location and name of files
            path = os.path.join(root, filename)
            print path
            #file name and extension
            ARN, extension = os.path.splitext(filename)
            print ARN
            #Location of the corresponding folder in the new directory
            link = os.path.join(basedir,ARN)
            # if the folder already exists in new directory
            if os.path.exists(link):
                #this is the file location in the new directory
                file = os.path.join(basedir, ARN, ARN)
                linkfn = os.path.join(basedir, ARN, filename)
                if os.path.exists(linkfn):
                    i = 0
                    #if this file already exists in the folder
                    print "Path exists already"
                    while os.path.exists(file + "_" + str(i) + extension):
                        i+=1
                    print "Already 2x exists..."
                    print "Renaming"
                    shutil.copy(path, file + "_" + str(i) + extension)
                else:
                    shutil.copy(path, link)
                    print ARN + " " + "Copied"
            else:
                print ARN + " " + "Not Found"
Sometimes it is just easier to start over... I apologize if there is any typo, I haven't had the time to test it thoroughly.
import os
import shutil

movdir = r"C:\Scans"
basedir = r"C:\Links"
# Walk through all files in the directory that contains the files to copy
for root, dirs, files in os.walk(movdir):
    for filename in files:
        # I use the absolute path, in case you want to move several dirs.
        old_name = os.path.join(os.path.abspath(root), filename)
        # Separate base from extension
        base, extension = os.path.splitext(filename)
        # Initial new name
        new_name = os.path.join(basedir, base, filename)
        # If folder basedir/base does not exist... You don't want to create it?
        if not os.path.exists(os.path.join(basedir, base)):
            print("{} not found".format(os.path.join(basedir, base)))
            continue  # Next filename
        elif not os.path.exists(new_name):  # folder exists, file does not
            shutil.copy(old_name, new_name)
        else:  # folder exists, file exists as well
            ii = 1
            while True:
                new_name = os.path.join(basedir, base, base + "_" + str(ii) + extension)
                if not os.path.exists(new_name):
                    shutil.copy(old_name, new_name)
                    print("Copied {} as {}".format(old_name, new_name))
                    break
                ii += 1
I always add a timestamp to the name, so it is very unlikely that the file already exists:
import os
import shutil
import datetime
now = str(datetime.datetime.now())[:19]
now = now.replace(":","_")
src_dir="C:\\Users\\Asus\\Desktop\\Versand Verwaltung\\Versand.xlsx"
dst_dir="C:\\Users\\Asus\\Desktop\\Versand Verwaltung\\Versand_"+str(now)+".xlsx"
shutil.copy(src_dir,dst_dir)
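The timestamp suffix can also be built with strftime instead of slicing the string form of the datetime. A small sketch, where timestamped_name is a hypothetical helper; note that two copies made within the same second would still get the same name.

```python
import datetime
import os


def timestamped_name(path, now=None):
    """Return path with _YYYY-MM-DD_HH-MM-SS inserted before the extension."""
    if now is None:
        now = datetime.datetime.now()
    stem, ext = os.path.splitext(path)
    # Hyphens instead of colons keep the name valid on Windows
    return "{}_{}{}".format(stem, now.strftime("%Y-%m-%d_%H-%M-%S"), ext)
```

The result can be passed to shutil.copy as the destination.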
For me shutil.copy is the best:
import shutil
#make a copy of the invoice to work with
src="invoice.pdf"
dst="copied_invoice.pdf"
shutil.copy(src,dst)
You can change the path of the files as you want.
I would say you have an indentation problem, at least as you wrote it here:
while not os.path.exists(file + "_" + str(i) + extension):
    i+=1
    print "Already 2x exists..."
    print "Renaming"
    shutil.copy(path, file + "_" + str(i) + extension)
should be:
while os.path.exists(file + "_" + str(i) + extension):
    i+=1
print "Already 2x exists..."
print "Renaming"
shutil.copy(path, file + "_" + str(i) + extension)
Check this out, please!
import os
import shutil
import glob

src = r"C:\Source"
dest = r"C:\Destination"
par = "*"

for file in glob.glob(os.path.join(src, par)):
    f = os.path.basename(file)
    # Names currently present in the destination folder
    d = [os.path.basename(n) for n in glob.glob(os.path.join(dest, par))]
    if f not in d:
        shutil.copy(file, dest)
        print("copied", f, "to", dest)
    else:
        # Try name_1.ext, name_2.ext, ... until one is free
        stem, ext = os.path.splitext(f)
        i = 1
        f1 = "{}_{}{}".format(stem, i, ext)
        while f1 in d:
            print("{} already exists in {}".format(f1, dest))
            i += 1
            f1 = "{}_{}{}".format(stem, i, ext)
        shutil.copy(file, os.path.join(dest, f1))
        print("renamed and copied", f1, "to", dest)