I have a folder with 7500 images. I need to copy the first 600 images to a new folder using the shutil module in Python.
I tried to look for relevant stuff on the net but the usage of the paths is a bit confusing. What exactly should be my sequence of commands? I guess it will start like:
import os
import shutil
l=os.listdir(path)
for file in l[0:600]:
Edit: after having clarification on what shutil.copy() does, I came up with:
import os
import shutil
l=os.listdir(path)
for file in l[0:600]:
    shutil.copy(file, destination, *, follow_symlinks = True)
But it's highlighting the comma after *, and giving the error iterable argument unpacking follows keyword argument unpacking. What's going wrong in the syntax?
os.listdir() returns entries in arbitrary order, so "the first 600" is not well defined until you pick an ordering. One option is to call os.stat(file).st_mtime on each file, which returns the timestamp of its last modification, and sort by that to get the oldest or newest files. It really depends on your use case and on what "first" means to you. As for the error: the * in the documented signature of shutil.copy() only marks follow_symlinks as a keyword-only parameter; it is not something you type in a call. With shutil you can just call:
for file in l[0:600]:
    # os.listdir() returns bare names, so join each one with the source directory
    shutil.copy(os.path.join(path, file), f'./destination/{file}')
which copies the first 600 entries into a directory named 'destination' inside your current directory (that directory must already exist).
Note that os.listdir(path) lists both the files and the subdirectories of the directory you're searching.
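The ordering idea above can be sketched as a small helper. This is only an illustration under stated assumptions: the name copy_first_n and its parameters are made up for this example, "first" is taken to mean "least recently modified", and subdirectories are skipped. It also shows follow_symlinks passed by name, which is what the bare * in the documented signature asks for.

```python
import os
import shutil


def copy_first_n(src_dir, dst_dir, n=600):
    """Copy the n least recently modified files from src_dir into dst_dir."""
    os.makedirs(dst_dir, exist_ok=True)
    # os.listdir() only returns bare names, so build full paths first.
    paths = [os.path.join(src_dir, name) for name in os.listdir(src_dir)]
    files = [p for p in paths if os.path.isfile(p)]
    # Oldest first; the path breaks ties so the order is repeatable.
    files.sort(key=lambda p: (os.path.getmtime(p), p))
    for p in files[:n]:
        # follow_symlinks is keyword-only: pass it by name, never type the bare *.
        shutil.copy(p, dst_dir, follow_symlinks=True)
    return [os.path.basename(p) for p in files[:n]]
```

Calling copy_first_n(path, destination) would then copy up to 600 files. Sorting by name instead only needs a different key.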
I'm making the assumption that all your files will be .jpg, so I would use the glob module.
import glob
import shutil
# raw strings keep the backslashes literal
path = r"D:\Pictures\*.jpg"
destination = r"E:\new_pictures"
files = glob.glob(path)
# sort by name so that "first 600" is well defined
for f in sorted(files)[:600]:
    shutil.copy(f, destination)
I am struggling with the paths and directories to solve this problem. Basically, I have a long list of .lammps files in one directory. My goal is to copy each file and move it into its own folder (which is one directory back) where its folder name is the file name minus the .lammps. All of the folders are already made, I just can't seem to figure out moving them. The entire list of files is in the Files directory. The individual folders are in the ROTATED FILES directory. Here is what I have. Any tips greatly appreciated.
Here is a file example
n-optimized.new.10_10-90-10_10.Ni00Nj01.lammps
The folder for this file is then named
n-optimized.new.10_10-90-10_10.Ni00Nj01
import os
file_directory = os.chdir("C:\Py Practice\ROTATED FILES\Files")
files = os.listdir()
for file in files:
    # get the file -.lammps string
    name1 = file.split('.')[0:4]
    name2 = ".".join(name1)
    # get the path for the files new respective folder (back a directory and paste folder name)
    file_folder = "C:\Py Practice\ROTATED FILES/" + name2
    # Move
    combined_path = os.path.join(file, file_folder)
I've tried shutil and figured path join might be easier.
First of all, the paths in your code are fragile: in a normal string literal a backslash starts an escape sequence, so you should either escape the backslashes or use a raw string. Secondly, rather than using os for file system operations, consider learning pathlib (also part of the standard library), which provides a more modern, object-oriented approach to file operations.
Using pathlib and shutil you can do something like
from pathlib import Path
from shutil import copyfile
file_directory = Path(r"C:\Py Practice\ROTATED FILES\Files")
# get the list of source files
source_files = [f for f in file_directory.glob('*.lammps')]
# create target file paths
target_files = [file_directory.parent / f.stem/ f.name for f in source_files]
for source, target in zip(source_files, target_files):
    copyfile(str(source), str(target))
Here we're accessing different parts of file path using a convenient OOP structure. For example, if your file f is located in 'c:/foo/bar/boo.txt' then f.name is just the name of file: boo.txt, f.stem is the stem part of the file name (excluding the extension) boo, f.parent is its parent directory 'c:/foo/bar/' etc.
There's a really handy graphic of pathlib Path objects here.
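The path parts described above can be checked directly. A minimal sketch, using PureWindowsPath so that it behaves the same on any operating system (it only manipulates the path text and never touches the disk):

```python
from pathlib import PureWindowsPath

# PureWindowsPath only manipulates the path text, so this runs on any OS.
f = PureWindowsPath("c:/foo/bar/boo.txt")

print(f.name)    # boo.txt
print(f.stem)    # boo
print(f.suffix)  # .txt
print(f.parent)  # c:\foo\bar

# The / operator joins components, the same way the target paths are built.
print(f.parent.parent / f.stem / f.name)  # c:\foo\boo\boo.txt
```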
The only inconvenience is that on older Python versions (before 3.6) not all core modules accept Path objects, which is why copyfile is given the string representation by calling str on the object here. On newer versions you can pass the Path objects directly.
And you don't even need to have target folders created beforehand, it's very easy to create the necessary folder structure as you go along:
from pathlib import Path
from shutil import copyfile
file_directory = Path(r"C:\Py Practice\ROTATED FILES\Files")
# get the list of source files
source_files = [f for f in file_directory.glob('*.lammps')]
# create target file paths
target_files = [file_directory.parent / f.stem/ f.name for f in source_files]
for source, target in zip(source_files, target_files):
    # check that target directory exists
    # and create a folder if not
    if not target.parent.is_dir():
        target.parent.mkdir()
    copyfile(str(source), str(target))
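The same create-as-you-go idea can be wrapped in a function. This is a sketch rather than a drop-in replacement: the function name copy_into_named_folders and its parameters are invented for illustration, and it assumes Python 3.6 or newer so that shutil accepts Path objects. It uses mkdir(parents=True, exist_ok=True) in place of the explicit is_dir() check.

```python
import shutil
from pathlib import Path


def copy_into_named_folders(file_directory, pattern="*.lammps"):
    """Copy each matching file into ../<stem>/<name> relative to file_directory."""
    file_directory = Path(file_directory)
    targets = []
    for source in sorted(file_directory.glob(pattern)):
        target = file_directory.parent / source.stem / source.name
        # parents=True creates missing intermediate folders;
        # exist_ok=True makes the call a no-op when the folder already exists.
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(source, target)  # Path objects are accepted on Python 3.6+
        targets.append(target)
    return targets
```

Because only the final extension is stripped by .stem, a file such as n-optimized.new.10_10-90-10_10.Ni00Nj01.lammps lands in a folder named n-optimized.new.10_10-90-10_10.Ni00Nj01.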
I'm still very new to Python, so I'm trying to apply it to my own situation for some experience.
One useful program deletes files, in this case by file type, from a directory:
import os
target = "H:\\documents\\"
for x in os.listdir(target):
    if x.endswith(".rtf"):
        os.unlink(target + x)
Taking this program, I have tried to expand it to delete ost files in every local profile:
import os
list = []
folder = "c:\\Users"
for subfolder in os.listdir(folder):
    list.append(subfolder)
ost_folder = "c:\\users\\%s\\AppData\\Local\\Microsoft\\Outlook"
for users in list:
    ost_list = os.listdir(ost_folder%users)
    for file in ost_list:
        if file.endswith(".txt"):
            print(file)
This should be printing the file name but spits an error that the file directory cannot be found
Not every folder under C:\Users will have an AppData\Local\Microsoft\Outlook subdirectory. There are typically hidden directories there that you may not see in Windows Explorer; they don't correspond to a real user and have never run Outlook, so they don't have that folder at all, but os.listdir still finds them. When you call os.listdir on such a nonexistent path, it dies with the exception you're seeing. Skip the directories that don't have it. The simplest way to do so is to have the glob module do the work for you (which avoids the need for your first loop entirely):
import glob
import os
for folder in glob.glob(r"c:\users\*\AppData\Local\Microsoft\Outlook"):
    for file in os.listdir(folder):
        if file.endswith(".txt"):
            print(os.path.join(folder, file))
You can simplify it even further by pushing all the work to glob:
for txtfile in glob.glob(r"c:\users\*\AppData\Local\Microsoft\Outlook\*.txt"):
    print(txtfile)
Or do the more modern OO-style pathlib alternative:
import pathlib
for txtfile in pathlib.Path(r'C:\Users').glob(r'*\AppData\Local\Microsoft\Outlook\*.txt'):
    print(txtfile)
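To see why globbing sidesteps the missing-folder error, here is a small self-contained sketch that builds a throwaway directory tree instead of touching C:\Users. The helper name find_outlook_files and the profile names alice and Default are made up for the demonstration.

```python
import tempfile
from pathlib import Path


def find_outlook_files(users_root, suffix=".txt"):
    """Return matching files under <users_root>/<profile>/AppData/Local/Microsoft/Outlook."""
    pattern = f"*/AppData/Local/Microsoft/Outlook/*{suffix}"
    return sorted(Path(users_root).glob(pattern))


# Build a throwaway tree: one profile has the Outlook folder, one does not.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    outlook = root / "alice" / "AppData" / "Local" / "Microsoft" / "Outlook"
    outlook.mkdir(parents=True)
    (outlook / "notes.txt").write_text("demo")
    (root / "Default").mkdir()  # no Outlook folder here
    found = [p.relative_to(root).as_posix() for p in find_outlook_files(root)]
    print(found)
```

The profile without an Outlook folder simply contributes no matches, so no exception handling is needed.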
I am currently in a working directory where following files are present
abcde_file
gvmdgv_file
qst_file
rl.txt
qp.txt
trs_file
I want to do some operations on all files with _file at end and put them into a new directory called newdir.
My try:
from glob import glob
files = glob("*_file")
with open('newdir/{}'.format(files),'a') as a:
    with open(files,'r') as r:
        #required operations
It gives error saying file name too long for with open('newdir/{}'.format(files),'a')
That is because variable files is a list containing all matching files with _file as a suffix. You should loop through every single element of the list and copy it instead.
Something like this should work:
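import os
from glob import glob
newdir = "newdir"  # destination directory; it must already exist
files = glob("*_file")
for file in files:
    # read each source file and write its copy; 'with' closes both handles
    with open(file, 'r') as oldf, open(os.path.join(newdir, file), 'w') as newf:
        newf.write(oldf.read())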
If you are copying files other than text files, you might want to open the two file handles with 'rb' and 'wb' instead. The variable newdir could be a constant holding the path of the new directory, for example.
You might also want to have a look at Python's native "high-level file operations" library, shutil:
import shutil
shutil.copy("filepath to copy", "path-to-destination-folder")
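Putting the loop and shutil.copy together gives something like the following sketch. The function name copy_matching is invented for illustration. The required per-file operations from the question are not shown, because they were not specified.

```python
import glob
import os
import shutil


def copy_matching(pattern, newdir):
    """Copy every file matching the glob pattern into newdir, one file at a time."""
    os.makedirs(newdir, exist_ok=True)
    copied = []
    for path in sorted(glob.glob(pattern)):
        if os.path.isfile(path):
            # shutil.copy copies bytes, so text and binary files are both safe.
            shutil.copy(path, newdir)
            copied.append(os.path.basename(path))
    return copied
```

From the working directory in the question, copy_matching("*_file", "newdir") would pick up abcde_file, gvmdgv_file, qst_file and trs_file and leave the .txt files alone.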
I have a little task for my company
I have multiple files which start with swale-randomnumber
I want to copy them to some directory (does shutil.copy allow wildcards?).
Anyway, I then want to choose the largest file, rename it to sync.dat, and then run a program.
I get the logic: I will use a loop to do each individual piece of work and then move on to the next. But I am unsure how to choose the single largest file, or a single file at all for that matter, since when I type in swale* surely it will just choose them all?
Sorry, I haven't written any source code yet; I am still trying to get my head around how this will work.
Thanks for any help you may provide
The accepted answer of this question proposes a nice portable implementation of file copy with wildcard support:
from glob import iglob
from shutil import copy
from os.path import basename, join
def copy_files(src_glob, dst_folder):
    for fname in iglob(src_glob):
        # join the destination with the bare file name, not the full source path
        copy(fname, join(dst_folder, basename(fname)))
If you want to compare file sizes, you can use either of these functions:
import os
os.path.getsize(path)
os.stat(path).st_size
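Either size function can serve as the key for max(), which picks out a single file from everything the wildcard matches. A minimal sketch, with the helper name largest_file chosen just for this example:

```python
import glob
import os


def largest_file(pattern):
    """Return the path of the largest file matching the glob pattern, or None."""
    candidates = [p for p in glob.glob(pattern) if os.path.isfile(p)]
    if not candidates:
        return None
    # max() calls os.path.getsize on each candidate and keeps the biggest one.
    return max(candidates, key=os.path.getsize)
```

Calling largest_file("swale-*") returns one path even though the pattern itself matches many files.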
This might work :
import os.path
import glob
import shutil
source = "My Source Path" # Replace these variables with the appropriate data
dest = "My Dest Path"
command = "My command"
# Find the files that need to be copied
files = glob.glob(os.path.join(source, "swale-*"))
# Copy the files to the destination
for file in files:
    shutil.copy(file, dest)
# Pick the biggest copied file, comparing the file sizes
biggest = max((os.path.basename(file) for file in files),
              key=lambda name: os.path.getsize(os.path.join(dest, name)))
# Rename that biggest file to sync.dat
shutil.move(os.path.join(dest, biggest), os.path.join(dest, "sync.dat"))
# Run the command
os.system(command)
# Only use os.system if you know your command is completely secure and you don't need the output. Use the subprocess module if you need more security and need the output.
Note : None of this is tested - but it should work
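For the case where the command's output is needed, the standard-library subprocess module is one option. The sketch below uses a placeholder command (the current Python interpreter printing a word) because the real program was not named in the question.

```python
import subprocess
import sys

# Placeholder command: run the current Python interpreter and print a word.
command = [sys.executable, "-c", "print('done')"]

# A list of arguments avoids shell parsing; check=True raises on a non-zero exit.
result = subprocess.run(command, capture_output=True, text=True, check=True)
print(result.returncode)
print(result.stdout.strip())
```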
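# import only the names used below; "from os import *" would also shadow the built-in open()
from os import listdir
from os.path import getsize, isfile, join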
directory = '/your/directory/'
# You now have list of files in directory that starts with "swale-"
fileList = [join(directory,f) for f in listdir(directory) if f.startswith("swale-") and isfile(join(directory,f))]
# Order it by file size - from big to small
fileList.sort(key=getsize, reverse=True)
# First file in array is biggest
biggestFile = fileList[0]
# Do whatever you want with this file - using shutil.*, os.*, or anything else..
# ...
# ...
I am an absolute beginner to programming so I apologize if this is really basic. I've looked at other questions that appear to be related, but haven't found a solution to this particular problem--at least not that I can understand.
I need to generate a list of files in a directory; create a separate directory for each of those files with the directory name being based on each file's name; and put each file in its corresponding directory.
You should have a look at the glob, os and shutil libraries.
I've written an example for you. This will remove the file extension of each file in a given folder, create a new subdirectory, and move the file into the corresponding folder, i.e.:
C:\Test\
 -> test1.txt
 -> test2.txt
will become
C:\Test\
 -> test1\
     -> test1.txt
 -> test2\
     -> test2.txt
Code:
import glob, os, shutil
folder = 'C:/Test/'
for file_path in glob.glob(os.path.join(folder, '*.*')):
    new_dir = file_path.rsplit('.', 1)[0]
    os.mkdir(os.path.join(folder, new_dir))
    shutil.move(file_path, os.path.join(new_dir, os.path.basename(file_path)))
This will throw an error if the folder already exists. To avoid that, handle the exception:
import glob, os, shutil
folder = 'C:/Test/'
for file_path in glob.glob(os.path.join(folder, '*.*')):
    new_dir = file_path.rsplit('.', 1)[0]
    try:
        os.mkdir(os.path.join(folder, new_dir))
    except FileExistsError:
        # Handle the case where the target dir already exists.
        pass
    shutil.move(file_path, os.path.join(new_dir, os.path.basename(file_path)))
PS: This will not work for files without extensions. Consider using a more robust code for cases like that.
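One way to make the no-extension case explicit is sketched below with pathlib. The function name move_into_own_folders is made up, and the choice to skip and report extensionless files (rather than rename them) is an assumption for this example, not the only reasonable policy.

```python
import shutil
from pathlib import Path


def move_into_own_folders(folder):
    """Move each file in folder into a subfolder named after the file's stem."""
    folder = Path(folder)
    moved, skipped = [], []
    # sorted() takes a snapshot of the listing before the loop changes the directory.
    for path in sorted(p for p in folder.iterdir() if p.is_file()):
        new_dir = folder / path.stem
        # Skip files without an extension (the folder would need the file's own
        # name) and files whose folder name is already taken by a regular file.
        if not path.suffix or (new_dir.exists() and not new_dir.is_dir()):
            skipped.append(path.name)
            continue
        new_dir.mkdir(exist_ok=True)
        shutil.move(str(path), str(new_dir / path.name))
        moved.append(path.name)
    return moved, skipped
```

The returned skipped list shows which files still need manual handling.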
Here is some advice on listing files using Python.
To create a directory, use os.mkdir (docs). To move a file, use os.rename (docs) or shutil.move (docs).