Moving Files from directory to their own folders - python

I am struggling with the paths and directories to solve this problem. Basically, I have a long list of .lammps files in one directory. My goal is to copy each file and move it into its own folder (which is one directory back) where its folder name is the file name minus the .lammps. All of the folders are already made, I just can't seem to figure out moving them. The entire list of files is in the Files directory. The individual folders are in the ROTATED FILES directory. Here is what I have. Any tips greatly appreciated.
Here is a file example
n-optimized.new.10_10-90-10_10.Ni00Nj01.lammps
The folder for this file is then named
n-optimized.new.10_10-90-10_10.Ni00Nj01
import os
file_directory = os.chdir("C:\Py Practice\ROTATED FILES\Files")
files = os.listdir()
for file in files:
# get the file -.lammps string
name1 = file.split('.')[0:4]
name2 = ".".join(name1)
# get the path for the files new respective folder (back a directory and paste folder name)
file_folder = "C:\Py Practice\ROTATED FILES/" + name2
# Move
combined_path = os.path.join(file, file_folder)
I've tried shutil and figured path join might be easier.

First of all, the code you have here shouldn't work since you either have to escape backslashes or use a raw string. Secondly, rather than using os for file system operations, it's much better to learn how to use pathlib (also a core python module) which provides a more modern object-oriented approach to file operations.
Using pathlib and shutil you can do something like
from pathlib import Path
from shutil import copyfile
file_directory = Path(r"C:\Py Practice\ROTATED FILES\Files")
# get the list of source files
source_files = [f for f in file_directory.glob('*.lammps')]
# create target file paths
target_files = [file_directory.parent / f.stem/ f.name for f in source_files]
for source, target in zip(source_files, target_files):
copyfile(str(source), str(target))
Here we're accessing different parts of file path using a convenient OOP structure. For example, if your file f is located in 'c:/foo/bar/boo.txt' then f.name is just the name of file: boo.txt, f.stem is the stem part of the file name (excluding the extension) boo, f.parent is its parent directory 'c:/foo/bar/' etc.
There's a really handy graphic of pathlib Path objects here.
The only inconvenience is that not all of core modules support Path objects yet so for copyfile we just need to get the string representation by calling str on the object.
And you don't even need to have target folders created beforehand, it's very easy to create the necessary folder structure as you go along:
from pathlib import Path
from shutil import copyfile
file_directory = Path(r"C:\Py Practice\ROTATED FILES\Files")
# get the list of source files
source_files = [f for f in file_directory.glob('*.lammps')]
# create target file paths
target_files = [file_directory.parent / f.stem/ f.name for f in source_files]
for source, target in zip(source_files, target_files):
# check that target directory exists
# and create a folder if not
if not target.parent.is_dir():
target.parent.mkdir()
copyfile(str(source), str(target))

Related

os to remove files with Python (trying to removed from multiple file directories)

I'm still very new to Python so I'm trying to apply Python in to my own situation for some experience
One useful program is to delete files, in this case by file type from a directory
import os
target = "H:\\documents\\"
for x in os.listdir(target):
if x.endswith(".rtf"):
os.unlink(target + x)
Taking this program, I have tried to expand it to delete ost files in every local profiles:
import os
list = []
folder = "c:\\Users"
for subfolder in os.listdir(folder):
list.append(subfolder)
ost_folder = "c:\\users\\%s\\AppData\\Local\\Microsoft\\Outlook"
for users in list:
ost_list = os.listdir(ost_folder%users)
for file in ost_list:
if file.endswith(".txt"):
print(file)
This should be printing the file name but spits an error that the file directory cannot be found
Not every folder under C:\Users will have a AppData\Local\Microsoft\Outlook subdirectory (there are typically hidden directories there that you may not see in Windows Explorer that don't correspond to a real user, and have never run Outlook, so they don't have that folder at all, but will be found by os.listdir); when you call os.listdir on such a directory, it dies with the exception you're seeing. Skip the directories that don't have it. The simplest way to do so is to have the glob module do the work for you (which avoids the need for your first loop entirely):
import glob
import os
for folder in glob.glob(r"c:\users\*\AppData\Local\Microsoft\Outlook"):
for file in os.listdir(folder):
if file.endswith(".txt"):
print(os.path.join(folder, file))
You can simplify it even further by pushing all the work to glob:
for txtfile in glob.glob(r"c:\users\*\AppData\Local\Microsoft\Outlook\*.txt"):
print(txtfile)
Or do the more modern OO-style pathlib alternative:
for txtfile in pathlib.Path(r'C:\Users').glob(r'*\AppData\Local\Microsoft\Outlook\*.txt'):
print(txtfile)

Operating on files in current directory and putting them into new directory

I am currently in a working directory where following files are present
abcde_file
gvmdgv_file
qst_file
rl.txt
qp.txt
trs_file
I want to do some operations on all files with _file at end and put them into a new directory called newdir.
My try:
from glob import glob
files = glob("*_file")
with open('newdir/{}'.format(files),'a') as a:
with open(files,'r') as r:
#required operations
It gives error saying file name too long for with open('newdir/{}'.format(files),'a')
That is because variable files is a list containing all matching files with _file as a suffix. You should loop through every single element of the list and copy it instead.
Something like this should work:
from glob import glob
files = glob("*_file")
for file in files:
oldf = open(file,'r')
newf = open(newdir+"\\"+file,'w+')
data = oldf.read()
newf.write(data)
oldf.close()
newf.close()
If you are copying files different than textfiles, you might want to open the two file handles with rb and wb+ instead. Variable newdir could be a constant with the new directory path, for example.
you might want to have a look at python's native "high-level file operations" library shutil
import shutil
shutil.copy("filepath to copy", "path-to-destination-folder")

Python: Moving csv files after reading

I'm wanting to move .csv files after reading them.
The code I've come up with is to move any .csv files found in a folder, then direct to an archive folder.
src1 = "\\xxx\xxx\Source Folder"
dst1 = "\\xxx\xxx\Destination Folder"
for root, dirs, files in os.walk(src1):
for f in files:
if f.endswith('.csv'):
shutil.move(os.path.join(root,f), dst1)
Note: I imported shutil at the beginning of my code.
Note 2: The destination archive folder is within the source folder - will this have implications for the above code?
When I run this, nothing happens. I get no error messages and the file remains in the source folder.
Any insight is appreciated.
Edit (some context on my goal):
My overall code will be used to read .csv files that are moved manually into a source folder by users - I then want to archive these .csv files using Python once the data has been used. Every .csv file placed into the source folder by the users will have a different name - no .csv file name will be the same, which is why I want to search the source folder for .csv files and move them all.
You can use the pathlib module. I'm assuming you have got the same folder structure in the destination directory.
from pathlib import Path
src1 = "<Path to source folder>"
dst1 = "<Path to destination folder>"
for csv_file in Path(src1).glob('**/*.csv'):
relative_file_path = csv_file.relative_to(src1)
destination_path = dst1 / relative_file_path
csv_file.rename(destination_path)
Explanation-
for csv_file in Path(src1).glob('**/*.csv'):
The glob(returns generator object) will capture all the CSV files in the directory as well as in the subdirectory. Now, we can iterate over the files one by one.
relative_file_path = csv_file.relative_to(src1)
All the csv_files are now pathlib path objects. So, we can use the functions that the library provides. One such function is relative to. Here It'll copy the relative path of the file from the src folder. Let's say you have a CSV file like-
scr_folder/A/B/c.csv - It'll copy A/B/c.csv
destination_path = dst1 / relative_file_path
As the folder structure is the same the destination path now becomes -
dst_folder/A/B/c.csv
csv_file.rename(destination_path)
At Last, rename will just move the file from src to destination.
After a bunch of research I have found a solution:
import shutil
source = r"\\xx\Source"
destination = r"\\xx\Destination"
files = os.listdir(source)
for file in files:
new_path = shutil.move(f"{source}/{file}", destination)
print(new_path)
I was making it more complicated than it needed to be - because all files in the folder would be .csv anyway, I just needed to move all files. Thanks stackoverlfow.

Python: Unzip selected files in directory tree

I have the following directory, in the parent dir there are several folders lets say ABCD and within each folder many zips with names as displayed and the letter of the parent folder included in the name along with other info:
-parent--A-xxxAxxxx_timestamp.zip
-xxxAxxxx_timestamp.zip
-xxxAxxxx_timestamp.zip
--B-xxxBxxxx_timestamp.zip
-xxxBxxxx_timestamp.zip
-xxxBxxxx_timestamp.zip
--C-xxxCxxxx_timestamp.zip
-xxxCxxxx_timestamp.zip
-xxxCxxxx_timestamp.zip
--D-xxxDxxxx_timestamp.zip
-xxxDxxxx_timestamp.zip
-xxxDxxxx_timestamp.zip
I need to unzip only selected zips in this tree and place them in the same directory with the same name without the .zip extension.
Output:
-parent--A-xxxAxxxx_timestamp
-xxxAxxxx_timestamp
-xxxAxxxx_timestamp
--B-xxxBxxxx_timestamp
-xxxBxxxx_timestamp
-xxxBxxxx_timestamp
--C-xxxCxxxx_timestamp
-xxxCxxxx_timestamp
-xxxCxxxx_timestamp
--D-xxxDxxxx_timestamp
-xxxDxxxx_timestamp
-xxxDxxxx_timestamp
My effort:
for path in glob.glob('./*/xxx*xxxx*'): ##walk the dir tree and find the files of interest
zipfile=os.path.basename(path) #save the zipfile path
zip_ref=zipfile.ZipFile(path, 'r')
zip_ref=extractall(zipfile.replace(r'.zip', '')) #unzip to a folder without the .zip extension
The problem is that i dont know how to save the A,B,C,D etc to include them in the path where the files will be unzipped. Thus, the unzipped folders are created in the parent directory. Any ideas?
The code that you have seems to be working fine, you just to make sure that you are not overriding variable names and using the correct ones. The following code works perfectly for me
import os
import zipfile
import glob
for path in glob.glob('./*/xxx*xxxx*'): ##walk the dir tree and find the files of interest
zf = os.path.basename(path) #save the zipfile path
zip_ref = zipfile.ZipFile(path, 'r')
zip_ref.extractall(path.replace(r'.zip', '')) #unzip to a folder without the .zip extension
Instead of trying to do it in a single statement , it would be much easier and more readable to do it by first getting list of all folders and then get list of files inside each folder. Example -
import os.path
for folder in glob.glob("./*"):
#Using *.zip to only get zip files
for path in glob.glob(os.path.join(".",folder,"*.zip")):
filename = os.path.split(path)[1]
if folder in filename:
#Do your logic

Directory is not being recognized in Python

I'm uploading a zipped folder that contains a folder of text files, but it's not detecting that the folder that is zipped up is a directory. I think it might have something to do with requiring an absolute path in the os.path.isdir call, but can't seem to figure out how to implement that.
zipped = zipfile.ZipFile(request.FILES['content'])
for libitem in zipped.namelist():
if libitem.startswith('__MACOSX/'):
continue
# If it's a directory, open it
if os.path.isdir(libitem):
print "You have hit a directory in the zip folder -- we must open it before continuing"
for item in os.listdir(libitem):
The file you've uploaded is a single zip file which is simply a container for other files and directories. All of the Python os.path functions operate on files on your local file system which means you must first extract the contents of your zip before you can use os.path or os.listdir.
Unfortunately it's not possible to determine from the ZipFile object whether an entry is for a file or directory.
A rewrite or your code which does an extract first may look something like this:
import tempfile
# Create a temporary directory into which we can extract zip contents.
tmpdir = tempfile.mkdtemp()
try:
zipped = zipfile.ZipFile(request.FILES['content'])
zipped.extractall(tmpdir)
# Walk through the extracted directory structure doing what you
# want with each file.
for (dirpath, dirnames, filenames) in os.walk(tmpdir):
# Look into subdirectories?
for dirname in dirnames:
full_dir_path = os.path.join(dirpath, dirname)
# Do stuff in this directory
for filename in filenames:
full_file_path = os.path.join(dirpath, filename)
# Do stuff with this file.
finally:
# ... Clean up temporary diretory recursively here.
Usually to make things handle relative paths etc when running scripts you'd want to use os.path.
It seems to me that you're reading from a Zipfile the items you've not actually unzipped it so why would you expect the file/dirs to exist?
Usually I'd print os.getcwd() to find out where I am and also use os.path.join to join with the root of the data directory, whether that is the same as the directory containing the script I can't tell. Using something like scriptdir = os.path.dirname(os.path.abspath(__file__)).
I'd expect you would have to do something like
libitempath = os.path.join(scriptdir, libitem)
if os.path.isdir(libitempath):
....
But I'm guessing at what you're doing as it's a little unclear for me.

Categories