Copying files recursively in Python

I want to copy a whole directory and its files, but also print the name of each file as it is copied.
I was using a simple call to cp -rf dir dest with os.system, but obviously that way I can't print each filename separately.
I then thought about listing each directory's files by calling ls recursively with os.system, saving the whole string, splitting it into an array, and running a for loop with os.system("cp " + file1 + " dest/") while printing the filename, but that looks like a lot of work.
Any better ideas to accomplish this?

You can use os.walk to get the entire directory listing and use that listing to copy all files iteratively. Something like
import os
import shutil

file_paths = [os.path.join(root, f) for root, _, files in os.walk('.') for f in files]
for path in file_paths:
    print(path)
    shutil.copy(path, target)  # target is your destination directory
Alternatively, as MatthewFranglen's comment points out, you can just do shutil.copytree(src, dst). That also lets you ignore things, but you'll need to pass an ignore function instead of using an if in the list comprehension.
# ignore all .DS_Store and *.txt files
file_paths = [os.path.join(root, f) for root, _, files in os.walk('.')
              for f in files if f != '.DS_Store' and not f.endswith('.txt')]
compared to
from shutil import copytree, ignore_patterns
ignore_func = ignore_patterns('.DS_Store', '*.txt') # ignore .DS_Store and *.txt files
copytree('/path/to/dir/', '/other/dir', ignore=ignore_func)
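copytree won't print each file as it goes, but in Python 3 it accepts a copy_function argument, so you can wrap the normal copy and print from there. A minimal sketch (the paths are placeholders):
import shutil

def copy_verbose(src, dst):
    # Print each file as it is copied, then delegate to the normal copy.
    print("copying", src)
    return shutil.copy2(src, dst)

shutil.copytree('/path/to/dir/', '/other/dir', copy_function=copy_verbose)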

Related

Python script to move specific filetypes from the all directories to one folder

I'm trying to write a Python script to move all music files from my whole PC to one specific folder.
They are scattered everywhere and I want to get them all in one place, so I don't want to copy but completely move them.
I was already able to make a list of all the files with this script:
import os
targetfiles = []
extensions = (".mp3", ".wav", ".flac")
for root, dirs, files in os.walk('/'):
    for file in files:
        if file.endswith(extensions):
            targetfiles.append(os.path.join(root, file))
print(targetfiles)
This prints out a nice list of all the files, but now I'm stuck on how to move them.
I made many different attempts with different code, and this was one of them:
import os
import shutil
targetfiles = []
extensions = (".mp3", ".wav", ".flac")
for root, dirs, files in os.walk('/'):
    for file in files:
        if file.endswith(extensions):
            targetfiles.append(os.path.join(root, file))
            new_path = 'C:/Users/Nicolaas/Music/All' + file
            shutil.move(targetfiles, new_path)
But everything I try gives me an error:
TypeError: rename: src should be string, bytes or os.PathLike, not list
I think I've hit my limit putting this together, as I'm only starting out with Python, but I would be very grateful if anyone could point me in the right direction!
You are trying to move a list of files to a new location, but the shutil.move function expects a single file as the first argument. To move all the files in the targetfiles list to the new location, you have to use a loop to move each file individually.
for file in targetfiles:
    shutil.move(file, new_path)
Also, if needed, add a trailing slash to the new path: 'C:/Users/Nicolaas/Music/All/'.
On a side note, are you sure that moving all files with those extensions is a good idea? I would suggest copying them or keeping a backup.
Edit:
You can use an if statement to exclude certain folders from being searched.
for root, dirs, files in os.walk('/'):
    if any(folder in root for folder in excluded_folders):
        continue
    for file in files:
        if file.endswith(extensions):
            targetfiles.append(os.path.join(root, file))
Where excluded_folders is a list of the unwanted folders, e.g. excluded_folders = ['Program Files', 'Windows'].
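Putting those pieces together, a minimal sketch of the whole move (the destination and excluded folders below are just the examples from this thread; adjust them to your setup):
import os
import shutil

extensions = (".mp3", ".wav", ".flac")
excluded_folders = ['Program Files', 'Windows']
destination = 'C:/Users/Nicolaas/Music/All/'  # note the trailing slash

# Make sure the destination exists before moving anything into it.
os.makedirs(destination, exist_ok=True)

for root, dirs, files in os.walk('/'):
    if any(folder in root for folder in excluded_folders):
        continue
    for file in files:
        if file.endswith(extensions):
            print("moving", file)
            shutil.move(os.path.join(root, file), destination)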
I would suggest using glob for matching:
import glob

def match(extension, root_dir):
    # The root_dir argument to glob.glob requires Python 3.10+.
    return glob.glob(f'**\\*.{extension}', root_dir=root_dir, recursive=True)

root_dirs = ['C:\\Path\\to\\Albums', 'C:\\Path\\to\\dir\\with\\music\\files']
excluded_folders = ['Bieber', 'Eminem']
extensions = ("mp3", "wav", "flac")

targetfiles = [
    f'{root_dir}\\{file_name}'
    for root_dir in root_dirs
    for extension in extensions
    for file_name in match(extension, root_dir)
    if not any(excluded_folder in file_name for excluded_folder in excluded_folders)
]
Then you can move these files to new_path

Python Subprocess Loop runs Twice

So, I created a Python script to batch convert PDF files using Ghostscript. It should work, but I'm not sure why it isn't: it goes through the input PDF files twice, and the second time around it overwrites the output files.
Here's the script.
from __future__ import print_function
import os
import subprocess
try:
    os.mkdir('compressed')
except FileExistsError:
    pass

for root, dirs, files in os.walk("."):
    for file in files:
        if file.endswith(".pdf"):
            filename = os.path.join(root, file)
            arg1 = '-sOutputFile=' + './compressed/' + file
            print("compressing:", file)
            p = subprocess.Popen(['gs', '-sDEVICE=pdfwrite', '-dCompatibilityLevel=1.4', '-dPDFSETTINGS=/screen', '-dNOPAUSE', '-dBATCH', '-dQUIET', str(arg1), filename], stdout=subprocess.PIPE).wait()
Here's the output I get. I can't figure out what I'm doing wrong.
file is just the name of the file. You have several files with the same name in different directories; don't forget that os.walk recurses into subdirectories by default.
So you have to save the converted files under a directory or name that depends on root, and put the output directory outside the current directory, since os.walk will scan it too.
For instance, for flat output replace:
arg1 = '-sOutputFile=' + './compressed/' + file
by
arg1 = '-sOutputFile=' + '/somewhere/else/compressed/' + root.strip(".").replace(os.sep, "_") + "_" + file
The expression
root.strip(".").replace(os.sep, "_")
creates a "flat" version of the root tree: the leading dot of the current directory is stripped and the path separators are converted to underscores, with one final underscore added before the file name. For example, a file a.pdf found under ./sub/dir ends up named _sub_dir_a.pdf. That's one option that would work.
An alternate version that won't scan ./compressed or any other subdirectory (maybe more what you're looking for) would be using os.listdir instead (no recursion)
root = "."
for file in os.listdir(root):
    if file.endswith(".pdf"):
        filename = os.path.join(root, file)
        arg1 = '-sOutputFile=' + './compressed/' + file
        print("compressing:", file)
Or os.scandir
root = "."
for entry in os.scandir(root):
    file = entry.name
    if file.endswith(".pdf"):
        filename = os.path.join(root, file)
        arg1 = '-sOutputFile=' + './compressed/' + file
        print("compressing:", file)
Your problem is that os.walk will also pick up the contents of the "compressed" directory, because the compressed files are created there before os.walk gets around to listing that directory. If you add print(os.path.join(root, file)) to your for loop you will notice that.
Below is a snippet that works, since the files retrieved are only the ones in the current directory.
import os

os.makedirs("compressed", exist_ok=True)

for file in os.listdir("."):
    if not os.path.isfile(file):
        continue
    if not file.endswith(".pdf"):
        continue
    print(file)
os.walk will by definition enter into subdirectories, so you are compressing the files in the compressed subdirectory a second time.
Probably you simply want
for file in os.scandir("."):
    ...
As an aside, you almost certainly want to avoid Popen in favor of subprocess.run() or one of its legacy variations.
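For example, the Ghostscript call from the question could be written like this inside the loop (a sketch; file and filename come from the question's loop, and check=True makes a non-zero exit status raise instead of being silently ignored):
import os
import subprocess

output_path = os.path.join('compressed', file)  # same output-naming caveats as above apply
subprocess.run(
    ['gs', '-sDEVICE=pdfwrite', '-dCompatibilityLevel=1.4', '-dPDFSETTINGS=/screen',
     '-dNOPAUSE', '-dBATCH', '-dQUIET', '-sOutputFile=' + output_path, filename],
    check=True,
)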
On the first iteration of
for root, dirs, files in os.walk(".")
you find the files in the current directory, then you compress them into the
./compressed/*.pdf path.
After that, a later iteration of the outer loop finds the already compressed files inside that subdirectory.
The easiest fix is to move the output directory outside of the input directory (or create an input directory next to the compressed dir and read the files from there instead of .).
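If you do want to keep compressed inside the input tree, another option (not from the answers above, just the standard os.walk idiom) is to prune the directory list in place so the walk never descends into it:
import os

for root, dirs, files in os.walk("."):
    # Modify dirs in place so os.walk skips the output directory entirely.
    dirs[:] = [d for d in dirs if d != "compressed"]
    for file in files:
        if file.endswith(".pdf"):
            filename = os.path.join(root, file)
            # ... compress as before ...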

os.listdir() not showing contents of directory

Here is my code:
files = [f for f in os.listdir(os.getcwd() + "\\folder") if os.path.isfile(f)]
for file in files:
    print("hello")
I am running this from the directory which contains a folder called "folder". This folder has 4 files in it. In my head, this should print "hello" four times - but it doesn't.
What have I misunderstood?
PS Do I need to use os.getcwd() here? I figure it would be cleaner to just use a relative path, but that also doesn't work.
With os.path.isfile(f) you're asking if f is a file inside your current directory, not inside folder. Replace your code with:
[f for f in os.listdir(os.path.join(os.getcwd(), "folder")) if os.path.isfile(os.path.join("folder", f))]
I've also taken the liberty of using os.path.join to avoid direct concatenation of file and folder names as strings, since slashes can be a bit iffy.
And for the record, no you don't need to use os.getcwd() here (but I left it there anyways).
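For what it's worth, pathlib (Python 3.4+) makes this kind of mistake harder to make, since every entry you get back already carries its full path. A sketch of the same listing:
from pathlib import Path

folder = Path.cwd() / "folder"  # or simply Path("folder")
files = [p for p in folder.iterdir() if p.is_file()]
for file in files:
    print("hello")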

How to run script for all files in a folder/directry

I am new to Python. I have successfully written a script to search for something within a file using:
open(r"C:\file.txt") and the re.search function, and it all works fine.
Is there a way to run the search on all files within a folder? Currently, I have to manually change the file name in my script: open(r"C:\file.txt"), open(r"C:\file1.txt"), open(r"C:\file2.txt"), etc.
Thanks.
You can use os.walk to check all the files, like the following:
import os

for root, _, files in os.walk(path):
    for filename in files:
        with open(os.path.join(root, filename), 'r') as f:
            pass  # your code goes here
Explanation:
os.walk yields a tuple of (root path, dir names, file names) for each folder it visits, so you can iterate through the filenames and open each file using os.path.join(root, filename), which joins the root path with the file name so the file can be opened from anywhere.
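Since the goal is to run re.search over each file, a sketch of how that might look (the pattern and the folder path here are placeholders for whatever you are actually searching):
import os
import re

pattern = re.compile(r"something")  # placeholder pattern

for root, _, files in os.walk(r"C:\folder"):  # placeholder folder
    for filename in files:
        path = os.path.join(root, filename)
        with open(path, 'r') as f:
            for line in f:
                if pattern.search(line):
                    print(path, ":", line.strip())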
Since you're a beginner, I'll give you a simple solution and walk through it.
Import the os module, and use the os.listdir function to create a list of everything in the directory. Then, iterate through the files using a for loop.
Example:
# Importing the os module
import os

# Give the directory you wish to iterate through, e.g.
my_dir = r"C:\Users\bleh\Desktop\files"

# Using os.listdir to create a list of everything in the directory
dir_list = os.listdir(my_dir)

# Use a for loop to iterate through the list you just created, and open the files
for f in dir_list:
    pass  # whatever you want to do to each of the files
If you need help on the concepts, refer to the following:
for loops in Python 3: http://www.python-course.eu/python3_for_loop.php
the os library (this has some cool stuff in it): https://docs.python.org/2/library/os.html
Good luck!
You can use the os.listdir(path) function:
import os
path = '/Users/ricardomartinez/repos/Salary-API'
# List for all files in a given PATH
file_list = os.listdir(path)
# If you want to filter by file type
file_list = [file for file in os.listdir(path) if os.path.splitext(file)[1] == '.py']
# In both cases you can iterate over the list and apply the operations
# that you have
for file in file_list:
    print(file)
    # operations that you want to do over the files

Directory is not being recognized in Python

I'm uploading a zipped folder that contains a folder of text files, but it's not detecting that the folder inside the zip is a directory. I think it might have something to do with the os.path.isdir call requiring an absolute path, but I can't seem to figure out how to implement that.
zipped = zipfile.ZipFile(request.FILES['content'])
for libitem in zipped.namelist():
    if libitem.startswith('__MACOSX/'):
        continue
    # If it's a directory, open it
    if os.path.isdir(libitem):
        print "You have hit a directory in the zip folder -- we must open it before continuing"
        for item in os.listdir(libitem):
The file you've uploaded is a single zip file which is simply a container for other files and directories. All of the Python os.path functions operate on files on your local file system which means you must first extract the contents of your zip before you can use os.path or os.listdir.
It's also awkward to determine from the ZipFile object itself whether an entry is a file or a directory (on recent Python versions ZipInfo.is_dir() can tell you, but the os.path functions still won't work until the contents are actually on disk).
A rewrite of your code which does the extract first may look something like this:
import os
import shutil
import tempfile
import zipfile

# Create a temporary directory into which we can extract zip contents.
tmpdir = tempfile.mkdtemp()

try:
    zipped = zipfile.ZipFile(request.FILES['content'])
    zipped.extractall(tmpdir)

    # Walk through the extracted directory structure doing what you
    # want with each file.
    for (dirpath, dirnames, filenames) in os.walk(tmpdir):
        # Look into subdirectories?
        for dirname in dirnames:
            full_dir_path = os.path.join(dirpath, dirname)
            # Do stuff in this directory
        for filename in filenames:
            full_file_path = os.path.join(dirpath, filename)
            # Do stuff with this file.
finally:
    # Clean up the temporary directory recursively.
    shutil.rmtree(tmpdir)
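A slightly shorter variant of the same idea, assuming Python 3, is tempfile.TemporaryDirectory, which cleans up after itself when the with block exits:
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmpdir:
    zipfile.ZipFile(request.FILES['content']).extractall(tmpdir)
    for dirpath, dirnames, filenames in os.walk(tmpdir):
        for filename in filenames:
            full_file_path = os.path.join(dirpath, filename)
            # Do stuff with this file.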
Usually, to make things handle relative paths etc. when running scripts, you'd want to use os.path.
It seems to me that you're reading the item names from a ZipFile but you've not actually unzipped it, so why would you expect the files/dirs to exist on disk?
Usually I'd print os.getcwd() to find out where I am, and also use os.path.join to join with the root of the data directory (whether that is the same as the directory containing the script, I can't tell), using something like scriptdir = os.path.dirname(os.path.abspath(__file__)).
I'd expect you would have to do something like
libitempath = os.path.join(scriptdir, libitem)
if os.path.isdir(libitempath):
    ....
But I'm guessing at what you're doing as it's a little unclear for me.
