shutil.move() creates duplicates and fails on subsequent calls - python

I finished writing a script which creates some files so I'm making a tidy() function which sorts these files in folders. The end result should look like this:
/Scripting
- Output
- script.py
/Scripting/Output
- Folder1
- Folder2
- Folder3
Each folder contains the necessary files
I managed to create the list of folders and get the files into them without any problem, so I now have in /Project: script.py, folder1, folder2, etc. I copy-pasted most of the code from the first part in order to move them into the Output folder. The following code is executed when every subfolder, containing its respective files, is located in the same directory as the script.
try:
    os.mkdir('output')
except FileExistsError:
    pass
for file in os.listdir():
    if '.' not in file and file != 'output':
        shutil.move(file, f'{os.getcwd()}/output/{file}')
The problem is that if I look into my folder after running, I find the following directory tree:
/Output
- Folder1
  - Folder1
  - File1
  - File2
I get a duplicate folder within that folder and I don't understand where it's coming from. If I try to run the script again, I get the error: shutil.Error: Destination path 'Scripting/output/folder1/folder1' already exists
What am I doing wrong?
Edit:
Here's the new code:
try:
    os.mkdir('output')
except FileExistsError:
    pass
obj = os.scandir()
cwd = os.getcwd()
for entry in obj:
    if entry.is_dir() and not entry.name.startswith('.'):
        continue
    shutil.move(entry.name, f'{cwd}/output/{entry.name}')
This works the first time I run it, but breaks with the same error as above if I keep calling the script. It creates folder1 within folder1 only on subsequent calls and I can't find a reason for it.

Found the answer mostly by trial and error. I initially chose to use shutil.move() because it replaces a file if it finds another one with the same name. However, it does not do this with directories. If the destination is an existing directory, the source is moved inside it instead.
With /Scripting/Output/Folder1/ as the destination path for Folder1, the second run of the script finds that the folder already exists. Instead of replacing it, shutil.move() simply adds the new folder inside it, so the path becomes /Scripting/Output/Folder1/Folder1/, while the files still end up in the initial path (it looks like it runs shutil.move() on everything within that path).
To fix this, iterate over os.scandir() and use each entry's is_dir(), is_file() and name to parse your folders. Either os.rmdir() the extra folder every time, or create the folders before adding the files. This is the code that worked for me:
import os
import shutil

# folder_names is assumed to be the list of folder names created earlier
try:
    os.mkdir('output')
except FileExistsError:
    pass
os.chdir('output')
for name in folder_names:
    try:
        os.mkdir(name)
    except FileExistsError:
        pass
os.chdir('..')
cwd = os.getcwd()
for f in os.scandir():
    if f.is_file():
        if True:  # depends on how your files are organized
            shutil.move(f.name, f'{cwd}/output/folder1/{f.name}')
        # Do this for every file
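The behaviour described in this answer (shutil.move() placing the source inside a destination directory that already exists) can be avoided by moving the contents rather than the folder itself. The following is only a sketch under assumptions: merge_into_output is a hypothetical helper name, it assumes that files from a re-created folder should be merged into the existing folder and overwrite files of the same name, and it assumes source and destination are on the same filesystem (os.replace does not work across filesystems).

```python
import os
import shutil


def merge_into_output(src_dir, output_root):
    """Move src_dir under output_root, merging if it already exists there.

    Hypothetical helper for illustration; it is not part of shutil.
    """
    name = os.path.basename(os.path.normpath(src_dir))
    dest_dir = os.path.join(output_root, name)
    if not os.path.isdir(dest_dir):
        # Destination does not exist yet, so a plain move is safe.
        os.makedirs(output_root, exist_ok=True)
        shutil.move(src_dir, dest_dir)
        return dest_dir
    # Destination exists: move each entry individually so that the folder
    # is not nested inside itself (output/folder1/folder1).
    for entry in list(os.scandir(src_dir)):
        if entry.is_file():
            # os.replace overwrites an existing file of the same name.
            os.replace(entry.path, os.path.join(dest_dir, entry.name))
        else:
            # Sub-folders are merged recursively.
            merge_into_output(entry.path, dest_dir)
    os.rmdir(src_dir)  # src_dir is empty at this point
    return dest_dir
```

Calling merge_into_output('folder1', 'output') on every run should then leave a single output/folder1 no matter how often the script is executed.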

Related

How to run python script for every folder in directory and skip any empty folders?

I am trying to figure out how to generate an Excel workbook for each subfolder in my directory while skipping the folders that are empty. My directory structure is below.
So it would start with Folder A and execute my lines of code to create an Excel file using Folder A's contents, then move to Folder B and create a separate Excel file using Folder B's contents, then move to Folder C and skip it since it's empty, and continue on.
How do I loop through each folder in this manner and keep going when a folder is empty?
I would greatly appreciate the help!
myscript.py
folderA
- report1.xlsx
- report2.xlsx
folderB
- report1.xlsx
- report2.xlsx
folderC
** EMPTY **
folderD
- report1.xlsx
- report2.xlsx
Something like this maybe?
from pathlib import Path
from itertools import groupby

def by_folder(path):
    return path.parent

# groupby only groups consecutive items, so sort by the same key first
files = sorted(Path("path/to/root/dir").rglob("*.xlsx"), key=by_folder)
for folder, group in groupby(files, key=by_folder):
    print(f"Gonna merge these files from {folder}: ")
    for file in group:
        print(f"{file.name}")
    print()
We recursively search for .xlsx files in the root directory, and group files into lists based on the immediate common parent folder. If a folder is empty, it won't be matched by the glob pattern.
You can use the os.listdir() function to list everything that's inside a folder. The one drawback is that it returns files as well as folders, so you may end up trying to list the contents of a file when you only wanted to descend into folders.
The following code loops through all entries, skips files, and prints the names of all folders that are not empty.
import os

for folder in os.listdir("."):
    try:
        if len(os.listdir("./" + folder)) > 0:
            print(folder)
    except OSError:  # e.g. NotADirectoryError when the entry is a file
        pass
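As an alternative to catching the exception, the folder check can be made explicit with os.scandir(). This is a minimal sketch, assuming the layout from the question (subfolders directly under one root); non_empty_subfolders is a hypothetical name, and the workbook-building step is left to the caller.

```python
import os


def non_empty_subfolders(root="."):
    """Yield (folder_path, file_names) for each non-empty direct subfolder of root."""
    with os.scandir(root) as entries:
        for entry in sorted(entries, key=lambda e: e.name):
            if not entry.is_dir():
                continue  # skip plain files such as myscript.py
            names = sorted(os.listdir(entry.path))
            if not names:
                continue  # skip empty folders such as folderC
            yield entry.path, names
```

Looping over non_empty_subfolders(".") then gives one iteration per folder that has content, which is where the per-folder Excel code would go.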

Python script that creates new file and returns list of files

I'm trying to create a python script called script.py with new_directory function that creates a new directory inside the current working directory, then creates a new empty file inside the new directory, and returns the list of files in that directory.
The output I get is ["script.py"] which looks correct but gives me this error:
RuntimeErrorElement(RuntimeError,Error on line 5:
directory = os.mkdir("/home/PythonPrograms")
FileExistsError: [Errno 17] File exists: '/home/PythonPrograms'
)
import os

def new_directory(directory, filename):
    if os.path.isdir(directory) == False:
        directory = os.mkdir("/home/PythonPrograms")
    os.chdir("PythonPrograms")
    with open("script.py", "w") as file:
        pass
    # Return the list of files in the new directory
    return os.listdir("/home/PythonPrograms")

print(new_directory("PythonPrograms", "script.py"))
How do I correct this, and why is it wrong?
As others have said, it is hard to debug without the error. In the right circumstances, your code will work without errors. As @Jack suggested, I suspect your current directory is not /home. This means you've created a directory called PythonPrograms in the /home directory. os.chdir("PythonPrograms") is trying to change the directory to <currentDirectory>/PythonPrograms, which doesn't exist.
I have tried to rework your code (without completely changing it), into something that should work in all cases. I think the lesson here is, work with the variables you already have (i.e. directory), rather than hardcoding it into the function.
import os

def new_directory(directory, filename):
    if not os.path.isdir(directory):
        # Create directory within current directory
        # This is working off the relative path (from your current directory)
        os.mkdir(directory)  # os.mkdir returns None, so don't reassign directory
    # Create file if it does not exist
    # this is a one-liner version of your with...pass statement
    open(os.path.join(directory, filename), 'a').close()
    # Return the list of files in the new directory
    return os.listdir(directory)

print(new_directory("PythonPrograms", "script.py"))
I hope that helps.
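For comparison, the same idea (work only with the directory argument, never with a hardcoded path or os.chdir) can be sketched with pathlib. This is an alternative under assumptions rather than what the answer above uses: new_directory_pathlib is a hypothetical name, and the listing is sorted only to make the result predictable.

```python
from pathlib import Path


def new_directory_pathlib(directory, filename):
    """Create directory (if needed) and an empty file in it, then list it."""
    folder = Path(directory)
    # exist_ok=True means a second call does not raise FileExistsError.
    folder.mkdir(parents=True, exist_ok=True)
    # touch() creates the file if missing and keeps existing contents.
    (folder / filename).touch()
    return sorted(p.name for p in folder.iterdir())
```

Because nothing here depends on the current working directory, calling it twice with the same arguments returns the same list both times.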
I'm guessing the error you're getting is because you're not able to switch directories to PythonPrograms? This would be because Python's current working directory does not contain it. If you write out the directory you want to switch to more explicitly, for example os.chdir("/home/PythonPrograms"), then it may work for you.
Ideally you should give us any stack traces or more info on the errors, though.
I'm not sure why in your code you have with open("script.py", "w") as file: pass, but here is my way:
import os
os.mkdir('.\\Newfolder') # Create a new folder called Newfolder in the current directory
open('.\\Newfolder\\file.txt','w').close() # Create a new file called file.txt into Newfolder
print(os.listdir('.')) # Print out all the files in the current directory

Finding the "root" of a directory

I am attempting to write a function in python that scans the contents of a directory at the script's level (once de-bugged I'll switch it to not needing to be at the same level but for this problem it's irrelevant) and recursively lists the paths to anything that is not a directory. The logic I am working with is:
If the parent "directory" is not a directory then it must be a file so print the path to it. Otherwise, for every "file" in that directory, if each "file" is not actually a directory, state the path to the file, and if the "file" is actually a directory, call the function again.
The environment I am using is as follows: I have the script at the same level as a directory named a, and inside a is a file d.txt, as well as another directory named b. Inside b is a file c.txt. Per the way I would like this function to execute, first it should recognize that a is in fact a directory, and therefore begin to iterate over its contents. When it encounters d.txt, it should print out the path to it, and then when it encounters directory b it should begin to iterate over its contents and thereby print the path to c.txt when it sees it. So in this example, the output of the script should be "C:\private\a\d.txt, C:\private\a\b\c.txt" but instead it is "C:\private\d.txt, C:\private\b". Here is the code thus far:
import os

def find_root(directory):
    if not os.path.isdir(directory):
        print(os.path.abspath(directory))
    else:
        for file in os.listdir(directory):
            if not os.path.isdir(file):
                print(os.path.abspath(file))
            else:
                find_root(file)

find_root('a')
[Python]: os.listdir(path='.'):
Return a list containing the names of the entries in the directory given by path.
but they are just basenames. So, in order for them to make sense, when you go a level deeper in the recursion either:
- Prepend the "current" folder to their name
- cd to each folder (and also, cd back when returning from recursion)
Here's your code modified to use the 1st approach:
import os

def find_root(path):
    if os.path.isdir(path):
        for item in os.listdir(path):
            full_item = os.path.join(path, item)
            if os.path.isdir(full_item):
                find_root(full_item)
            else:
                print(os.path.abspath(full_item))
    else:
        print(os.path.abspath(path))

if __name__ == "__main__":
    find_root("a")
Notes:
- I recreated your folder structure
- I renamed some of the variables for clarity
- I reversed the negated conditions
Output:
c:\Work\Dev\StackOverflow\q47193260>"c:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" a.py
c:\Work\Dev\StackOverflow\q47193260\a\b\c.txt
c:\Work\Dev\StackOverflow\q47193260\a\d.txt
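The standard library's os.walk() already performs this kind of recursion and yields each directory path together with its file names, so the join with the "current" folder stays explicit. A minimal sketch, assuming a list of paths is wanted rather than printed output; list_files is a hypothetical name. Note one difference from the recursive version: if root is a file rather than a directory, os.walk() yields nothing, so the result is an empty list.

```python
import os


def list_files(root):
    """Return the absolute paths of all non-directory entries under root."""
    paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            # filenames are basenames, so join them with dirpath first.
            paths.append(os.path.abspath(os.path.join(dirpath, name)))
    return sorted(paths)
```

For the layout in the question, list_files("a") should contain the paths ending in a\d.txt and a\b\c.txt.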

Python - How to handle folder creation if folder already exists

def copy_excel():
    srcpath = "C:\\Aloha" #where the excel files are located
    srcfiles = os.listdir(srcpath) #sets srcfiles as list of file names in the source path folder
    destpath = "C:\\" #destination where the folders will be created
    destdir = list(set([filename[19:22] for filename in srcfiles])) #extract three digits from filename to use for folder creation (i.e 505, 508, 517,...)

    #function to handle folder creation
    def create(dirname, destpath):
        full_path = os.path.join(destpath, dirname)
        if os.path.exists(full_path):
            pass
        else:
            os.mkdir(full_path)
        return full_path

    #function to handle moving files to appropriate folders
    def move(filename, dirpath):
        shutil.move(os.path.join(srcpath, filename), dirpath)

    #creates the folders with three digits as folder name by calling the create function above
    targets = [(folder, create(folder, destpath)) for folder in destdir]

    #handles moving files to appropriate folders if the three digits in file name matches the created folder
    for dirname, full_path in targets:
        for filename in srcfiles:
            if dirname == filename[19:22]:
                move(filename, full_path)
            else:
                pass
I am somewhat new to Python so please bear with me! I found this code block on this site and tailored it to my specific use case. The code works well for creating the specified folders and dropping the files into the corresponding folders. However, when I run the code again for new files that are dropped into "C:\\Aloha", I get Error 183: Cannot create a file when that file already exists. In this case, the folder already exists because it was created when the script was run the first time.
The code errors out when targets attempts to create folders that already exist. My question is, what is the logic to handle folders that already exist, ignore the error, and just move the files to the corresponding folders? The script should only create folders if they are not already there.
Any help would be greatly appreciated! I have attempted try/except and nesting if/else statements as well as os.path.isdir(path) to check to see if the directory exists but I haven't had any luck. I apologize for any of the comments that are wrong, I am still learning Python logic as I build this script out.
You could use os.makedirs, which not only creates intermediate directories (for instance, calling it with C:\foo\bar\qux creates foo, bar and qux if they do not exist), but also accepts exist_ok=True so that no error is raised if the directory already exists.
So:
os.makedirs(full_path,exist_ok=True)
In case you want to throw an error or stop the processing if the directory exists, you can use os.path.exists(full_path) before mkdir.
Another option I just came by...
try:
    os.mkdir(path)
except FileExistsError:
    pass
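Putting os.makedirs(..., exist_ok=True) into the move loop makes the whole script safe to re-run. The following is a sketch under assumptions, not the asker's final code: sort_by_code is a hypothetical name, and the default key simply reuses the filename[19:22] slice from the question.

```python
import os
import shutil


def sort_by_code(srcpath, destpath, key=lambda name: name[19:22]):
    """Move each file in srcpath into destpath/<code>, creating folders as needed."""
    moved = []
    for filename in os.listdir(srcpath):
        source = os.path.join(srcpath, filename)
        if not os.path.isfile(source):
            continue  # ignore sub-folders in the source directory
        folder = os.path.join(destpath, key(filename))
        # exist_ok=True: no error if an earlier run already made the folder.
        os.makedirs(folder, exist_ok=True)
        target = os.path.join(folder, filename)
        shutil.move(source, target)
        moved.append(target)
    return moved
```

Creating the folder right before each move also removes the need for a separate pass that builds the list of target folders.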

python: file paths no longer work with imp

I recently started using imports to better organize my code in python. My original code in file1.py used the line:
def foo():
    files = [f for f in os.listdir('.') if os.path.isfile(f)]
    print(files)
    #do stuff here....
This listed all the files in the same folder as the code, and printing files showed the correct output as a list of filenames.
However, I recently changed the directory structure to something like this:
./main.py
./folder1/file1.py
./folder1/data_file1.csv
./folder1/data_file2.csv
./folder1/......
And in main.py, I use:
import imp
file1 = imp.load_source('file1', "./folder1/file1.py")
.
.
.
file1.foo()
Now, files is an empty array. What happened? I have tried absolute filepaths in addition to relative. Directly declaring an array with data_file1.csv works, but I can't get anything else to work with this import.
What's going on here?
When you do os.listdir('.'), you are listing the contents of '.' (the current directory), but this is not necessarily the directory in which the script resides. It is the directory you were in when you ran the script (unless you used os.chdir() inside the Python script).
You should not depend on the current working directory. Instead, you can use __file__ to access the path of the current script, and then os.path.dirname() to get the directory in which the script resides.
Then you can use os.path.join() to join the names you get from os.listdir() with the directory of the script, and check whether those paths are files to create your list of files.
Example -
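import os

def foo():
    # abspath guards against __file__ being a bare filename with no directory part
    filedir = os.path.dirname(os.path.abspath(__file__))
    files = [f for f in (os.path.join(filedir, fil) for fil in os.listdir(filedir)) if os.path.isfile(f)]
    print(files)
    #do stuff here....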
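On Python 3 the same approach can be sketched with pathlib. This is an illustrative variant under assumptions: files_next_to is a hypothetical name, and the script path is passed in as an argument (typically __file__) so that the function does not depend on where it is imported from.

```python
from pathlib import Path


def files_next_to(script_path):
    """Return the names of the regular files in the folder containing script_path."""
    # resolve() makes the path absolute, so the current working directory
    # at call time no longer matters.
    folder = Path(script_path).resolve().parent
    return sorted(p.name for p in folder.iterdir() if p.is_file())
```

Inside file1.py this would be called as files_next_to(__file__), which should list file1.py and the data files regardless of the directory main.py was started from.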
