Move Certain Files from One Directory to Another - Python

All,
I need to move files from one directory to another, but I don't want to move all the files in that directory, just the text files that begin with 'pws'. A list of all the files in the directory is:
['pws1.txt', 'pws2.txt', 'pws3.txt', 'pws4.txt', 'pws5.txt', 'x.txt', 'y.txt']
As stated, I want to move the 'pws*' files to another directory but not the x and y text files. What I want to do is remove all elements from the list that do not begin with 'pws'. My code is below:
import os
loc = 'C:\Test1'
dir = os.listdir(loc)
#print dir
for i in dir:
    #print i
    x = 'pws*'
    if i != x:
        dir.remove(i)
print(dir)
The output does not keep what I want. Instead, it removes the x text file and the even-numbered ones from the list but retains the y text file.
What am I doing wrong? How can I make a list of only the files that start with 'pws' and remove the text files that do not begin with 'pws'?
Keep in mind I might have a list that has 1000 elements and several hundreds of those elements will start with 'pws' while those that don't begin with it, couple of hundreds, will need to be removed.
Everyone's help is much appreciated.

You can use list-comprehension to re-create the list as the following:
dir = [i for i in dir if i.startswith('pws')]
or better yet, define that at start:
import os
loc = 'C:\\Test1'
dir = [i for i in os.listdir(loc) if i.startswith('pws')]
print(dir)
Explanation:
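The comparison i != x checks whether the element i is exactly equal to the literal string 'pws*'; the asterisk is not treated as a wildcard, so the test is true for every filename. In addition, calling dir.remove(i) while iterating over dir shifts the remaining elements, so the loop skips some of them, which is why part of the list survives. A better way is the built-in str.startswith() method, which checks whether a string starts with the given prefix. In your loop you can test i.startswith('pws'), or you can use a list comprehension as shown above, which is the more Pythonic approach.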
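A minimal sketch of the skipping effect, using the filenames from the question as an in-memory list (no directory is read). The condition is already corrected to startswith here, so the only remaining problem is removing items from the list that is being iterated; the names broken and kept are just illustrative.

```python
names = ['pws1.txt', 'pws2.txt', 'pws3.txt', 'pws4.txt', 'pws5.txt', 'x.txt', 'y.txt']

# Pitfall: removing from the list being iterated makes the loop skip items.
broken = list(names)
for name in broken:
    if not name.startswith('pws'):
        broken.remove(name)
# 'x.txt' is removed, 'y.txt' shifts into its slot, and the loop ends
# without ever visiting 'y.txt'.

# Fix: build a new list instead of mutating the one being iterated.
kept = [name for name in names if name.startswith('pws')]

print(broken)
print(kept)
```

Iterating over a copy (for name in list(broken)) would also avoid the skipping, but the comprehension is shorter and leaves the original list untouched.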

To move a file (see "How to move a file in Python"), you can use os.rename(origin, destination):
origin = r'C:\users\JohnDoe\Desktop'
destination = r'C:\users\JohnDoe\Desktop\Test'
startswith_ = 'pws'
And then go ahead and do a list comprehension
# Move files
[os.rename(os.path.join(origin,i),os.path.join(destination,i)) for i in os.listdir(origin) if i.startswith(startswith_)]
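The same idea can be written as an ordinary loop, which may be easier to read than a list comprehension used only for its side effects. This is a sketch, not the answer's own code: the function name move_matching is made up for illustration, and it swaps in shutil.move, which, unlike os.rename, also works when the destination is on a different drive.

```python
import os
import shutil

def move_matching(origin, destination, prefix):
    """Move every regular file in origin whose name starts with prefix.

    Returns the list of moved filenames (hypothetical helper).
    """
    os.makedirs(destination, exist_ok=True)  # create the target folder if needed
    moved = []
    for name in sorted(os.listdir(origin)):
        source = os.path.join(origin, name)
        # skip sub-folders, e.g. when destination lives inside origin
        if name.startswith(prefix) and os.path.isfile(source):
            shutil.move(source, os.path.join(destination, name))
            moved.append(name)
    return moved
```

With the variables from the answer it would be called as move_matching(origin, destination, startswith_).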

Just use a glob. This gives you a list of files in a directory without all the listdir calls and substring matching:
import glob, os
for f in glob.glob('/tmp/foo/pws*'):
    os.rename(f, '/tmp/bar/%s' % os.path.basename(f))
Edit 1: Here's your original code, simplified with a glob, and a new variable loc2 defined to be the place to move them to:
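import os, glob
loc = r'C:\Test1'
loc2 = r'C:\Test2'
# os.path.join avoids hand-written backslashes, which Python may read as escapes
files = glob.glob(os.path.join(loc, 'pws*'))
for i in files:
    os.rename(i, os.path.join(loc2, os.path.basename(i)))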

Related

How to automatically iterate over each list item and perform a unique function

I have various files in one directory;
foo.1001.exr
foo.1002.exr
bar.1001.exr
bar.1002.exr
I'd like to rename all of the files in this directory, but in the case of this directory holding more than one type of image sequence I would like to iterate over them one at a time so I don't overwrite anything and only end up with 2 files.
I was planning on separating them by the first part of the filename and adding them to a list, to get the number of variations. What I am unsure of is how to have the function iterate over newList[0], newList[1] and so on, automatically. It would need to be robust to cater to an indefinite amount of list items.
The output should be:
foo_test.1001.exr
foo_test.1002.exr
bar_test.1001.exr
bar_test.1002.exr
The code below is not indicative of the renaming task, it was just to start planning how to iterate procedurally over the list items.
import os
dir = "/test/"
# File formats
imageFileFormats = (".exr", ".dpx")
fileName = []
for file in os.listdir(dir):
    for ext1 in imageFileFormats:
        if ext1 in file:
            fileName.append(file.split('.')[0])
newList = list(set(fileName))
newList.sort()
for file in os.listdir(dir):
    for ext1 in imageFileFormats:
        if ext1 in file:
            if newList[0] in file:
                print (file)
This seems like a bit of an XY problem - instead of renaming the file collection in place, causing the potential for conflicts, have you considered creating a new folder to move all the files to, renaming them in the process? Once complete, you can move them back to the original location, avoiding the problem altogether.
As long as you have a mapping from old name to new name for each file that won't cause a conflict in the final result, the order of moving the files then won't matter.
And as long as the target folder is on the same volume, a move operation is basically a renaming anyway, so there's no space issues or performance problems.
So, something like:
from pathlib import Path
def do_rename(fn):
    # whatever non-conflicting renaming operation you need
    p = Path(fn)
    return p.parent / f'{p.stem}_renamed{p.suffix}'
def do_reverse_rename(fn):
    # just including the reverse of the above here for testing purposes
    p = Path(fn)
    return p.parent / f'{p.stem[:-8]}{p.suffix}' if p.stem.endswith('_renamed') else p
def safe_rename_all(location, rename_func):
    p = Path(location)
    # come up with a folder name that doesn't exist, right next to the original
    n = 1
    while (p.parent / f'{p.name}_{n}').is_dir():
        n += 1
    # create the new temporary folder
    (target := p.parent / f'{p.name}_{n}').mkdir()
    # move all the files into the new folder, this example matches *all* files
    # of course you could filter, etc.
    for fn in p.glob('*'):
        new_fn = rename_func(fn)
        fn.rename(target / new_fn.name)  # move to the temporary location
    # once done, move everything back and delete the temporary folder
    for fn in target.glob('*'):
        fn.rename(p / fn.name)
    target.rmdir()
safe_rename_all('some/folder/with/files', do_rename)
# can be undone with safe_rename_all('output', do_reverse_rename)
Some considerations might be to not create a folder right next to the original (to avoid rights issues, etc.) but instead create a temporary folder on the same volume using the standard library tempfile. And you were filtering for certain suffixes, so that's easy to add.
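Those two considerations might look roughly like the sketch below. It is only an illustration under stated assumptions: safe_rename_filtered, add_test_tag, and the staging_root parameter are hypothetical names, the suffix tuple is taken from the question, and picking a staging_root on the same volume is left to the caller (by default the parent of the source folder is used).

```python
import tempfile
from pathlib import Path

def add_test_tag(path):
    # foo.1001.exr -> foo_test.1001.exr (the renaming asked for in the question)
    head, sep, rest = path.name.partition('.')
    return path.with_name(f'{head}_test{sep}{rest}')

def safe_rename_filtered(location, rename_func, suffixes=('.exr', '.dpx'),
                         staging_root=None):
    source = Path(location)
    # assumption: staging_root (default: the parent folder) is on the same
    # volume as source, so each move is a cheap rename
    staging = Path(tempfile.mkdtemp(dir=staging_root or source.parent))
    # list() takes a snapshot so the folder is not modified while it is scanned
    for path in list(source.iterdir()):
        if path.is_file() and path.suffix in suffixes:
            path.rename(staging / rename_func(path).name)
    for path in list(staging.iterdir()):
        path.rename(source / path.name)
    # rmdir only succeeds on an empty folder; after a failure the staging
    # folder is left in place so nothing is lost
    staging.rmdir()
```

Files with other suffixes never leave the source folder, so a renamed file could still collide with one of them; the mapping from old to new names has to rule that out, as noted above.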

Sorting lists based on very specific criteria

I am looking to sort a list based on whether an item contains a ".". There will be 2 items in the list; I just need to make sure it is in the correct format.
My list needs to be in the format: [item1, item2.ext]
I need the item without the extension (The folder) to be first to properly use shutil.
I know I could do 2 for loops through my list but that seems wasteful, and I know I could check whether or not I am pointing to a file or a folder, but I think it is easier to force the list to be in the correct order explicitly.
Here is my code:
import os, shutil, time
# Sets up a list of files and a blank list of those to remove
files = os.listdir(currentDir)
print(files)
files_to_remove = []
print(files_to_remove)
# Loops through files looking for the folder and the zip file
for f in files:  # I could break this into 2 for loops but that seems dumb
    if "." not in f:  # checks if is a folder
        files_to_remove.append(f)
        print(files_to_remove)
    if f"lecture{lecturenum}.zip" in f:  # Checks if is the zip file that was just unzipped
        files_to_remove.append(f)
        print(files_to_remove)
print(files_to_remove)
time.sleep(0.1)  # This removes a synching error where windows is trying to delete it while it is still using it
shutil.rmtree(currentDir + files_to_remove[0])  # I could change this to check before deciding
os.remove(currentDir + files_to_remove[1])
Any help would be greatly appreciated
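One possible way to force the order without a second loop is to sort on the same "contains a dot" test the code already uses. This is only a sketch: folder_first is a made-up helper name, and it keeps the question's assumption that a name without a "." is a folder (os.path.isdir would be a more reliable check on a real path).

```python
def folder_first(names):
    """Order names so entries without a '.' (assumed to be folders) come first."""
    # False sorts before True, so names lacking a dot move to the front;
    # sorted() is stable, so entries of the same kind keep their original order.
    return sorted(names, key=lambda name: '.' in name)

ordered = folder_first(['lecture3.zip', 'lecture3'])
print(ordered)
```

Applied to the code above, files_to_remove = folder_first(files_to_remove) just before the rmtree call would guarantee that index 0 is the folder and index 1 is the zip file.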

Want to create a list of directories at the n-1 level (folders that do not contain any subfolders) with either bash or python

I currently have a problem where I want to get a list of directories that are at an n-1 level. The structure looks somewhat like the diagram below, and I want a list of all the folders that are blue in color. The height of the tree however, varies across the entire file system.
Because all the blue folders generally end their names with the string images, I have written the Python code below:
import os
import sys
def getDirList(dir):
    dirList = [x[0] for x in os.walk(dir)]
    return dirList
oldDirList = getDirList(sys.argv[1])
dirList = []
# Hack method for getting the folders
for i, dir in enumerate(oldDirList):
    if dir.endswith('images'):
        dirList.append(oldDirList[i] + '/')
Now, I do not want to use this method, since I want a general solution to this problem, with Python or bash scripting and then read the bash script result into Python. Which one would be more efficient in practice and theoretically?
To rephrase what I think you're asking - you want to list all folders that do not contain any subfolders (and thus contain only non-folder files).
You can use os.walk() for this pretty easily. os.walk() returns an iterable of three-tuples (dirname, subdirectories, filenames). We can wrap a list comprehension around that output to select only the "leaf" directories from a file tree - just collect all the dirnames that have no subdirectories.
import os
dirList = [d[0] for d in os.walk('root/directory/path') if len(d[1]) == 0]
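For reference, here is the same one-liner wrapped in a function with the tuple unpacked into names. It is a small sketch; leaf_directories is just an illustrative name.

```python
import os

def leaf_directories(root):
    """Return every directory under root (root included) that has no subdirectories."""
    # os.walk yields (dirpath, dirnames, filenames) for each directory it visits;
    # an empty dirnames list means that directory is a leaf.
    return [dirpath for dirpath, dirnames, _ in os.walk(root) if not dirnames]
```

Note that if root itself contains no subfolders, the result is a list containing only root.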
So another way to state your problem is that you want all folders that contain no subfolders? If that's the case then you can make use of the fact that os.walk lists all the subfolders within a folder. If that list is empty, append the folder to dirList:
import os
import sys
def getDirList(dir):
    # x[1] contains the list of subfolders
    dirList = [(x[0], x[1]) for x in os.walk(dir)]
    return dirList
oldDirList = getDirList(sys.argv[1])
dirList = []
for i, dir in enumerate(oldDirList):
    if not dir[1]:  # if the list of subfolders is empty
        dirList.append(dir[0])
print(dirList)
Today I had a similar problem.
Try pathlib: https://docs.python.org/3/library/pathlib.html
from pathlib import PurePath
import os
# os.getcwd() returns the path of red_dir if the script is run from inside it
gray_dir = PurePath(os.getcwd()).parents[1]  # .parents[1] is the grandparent of the current directory (the n-1 path)
blue_things = os.listdir(gray_dir)
blue_dirs = []
for thing in blue_things:
    if os.path.isdir(os.path.join(gray_dir, thing)):  # make sure not to append files
        blue_dirs.append(thing)
print(blue_dirs)

Python: Need to add chosen filenames into an array

The idea is simple: there is a directory with 2 or more files *.txt. My script should look in the directory and get filenames in order to copy them (if they exist) over the network.
As a Python newbie, I am facing problems which I have not been able to resolve so far.
My code:
import os
files = os.listdir('c:\\Python34\\')
for f in files:
    if f.endswith(".txt"):
        print(f)
This example returns 3 files:
LICENSE.txt
NEWS.txt
README.txt
Now I need to use every filename in order to do a SCP. The problem is that when I try to get the first filename with:
print(f[0])
I am receiving just the first letters from each file in the list:
L
N
R
How can I add the filenames to an array in order to use them later as array elements?
You can also use the list's extend method:
x = []
for f in files:
    if f.endswith(".txt"):
        x.extend([f])
This adds the current filename f to the end of the list x on each matching iteration; x.append(f) does the same thing.
If you want a list of matching files names, then instead of using os.listdir and filtering, use glob.glob with a suitable pattern.
import glob
files = glob.glob('C:\\python34\\*.txt')
Then you can access files[0] etc...
The array of files is files. In the loop, f is a single file name (a string) so f[x] gets the xth character of a filename. Do files[0] instead of f[0].
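The difference between indexing the list and indexing a single name can be seen in a small in-memory sketch. The names list stands in for the os.listdir result, 'python.exe' is added only to show the filtering, and select_txt is a hypothetical helper name.

```python
def select_txt(names):
    """Return only the names that end in .txt, as a new list."""
    return [name for name in names if name.endswith('.txt')]

names = ['LICENSE.txt', 'NEWS.txt', 'python.exe', 'README.txt']
txt_files = select_txt(names)

first_file = txt_files[0]     # indexing the list gives a whole filename
first_char = txt_files[0][0]  # indexing a filename gives a single character

print(txt_files)
print(first_file, first_char)
```

Each element of txt_files can then be passed to whatever performs the copy.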

Python automated file names

I want to automate the file name used when saving a spreadsheet using xlwt. Say there is a sub directory named Data in the folder the python program is running. I want the program to count the number of files in that folder (# = n). Then the filename must end in (n+1). If there are 0 files in the folder, the filename must be Trial_1.xls. This file must be saved in that sub directory.
I know the following:
import xlwt, os, os.path
n = len([name for name in os.listdir('.') if os.path.isfile(name)])
counts the number of files in the same folder.
a = n + 1
filename = "Trial_" + str(a) + ".xls"
book.save(filename)
this will save the file properly named in to the same folder.
My question is how do I extend this in to a sub directory? Thanks.
In os.listdir('.'), the . refers to the current working directory, which is usually the directory the script is run from. Change the . to point to the subdirectory you are interested in.
You should give it the full path name from the root of your file system; otherwise it will be relative to the directory from where the script is executed. This might not be what you want; especially if you need to refer to the sub directory from another program.
You also need to provide the full path to the filename variable; which would include the sub directory.
To make life easier, just set the full path to a variable and refer to it when needed.
TARGET_DIR = '/home/me/projects/data/'
n = sum(1 for f in os.listdir(TARGET_DIR) if os.path.isfile(os.path.join(TARGET_DIR, f)))
new_name = "{}Trial_{}.xls".format(TARGET_DIR,n+1)
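Wrapped into a function, the same logic might look like the sketch below. The name next_trial_path and its prefix and ext parameters are made up for illustration; os.path.join is used so the directory does not need a trailing slash.

```python
import os

def next_trial_path(target_dir, prefix='Trial_', ext='.xls'):
    """Build the path for the next file: prefix + (number of files + 1) + ext."""
    # count regular files only; sub-folders inside target_dir are ignored
    n = sum(1 for name in os.listdir(target_dir)
            if os.path.isfile(os.path.join(target_dir, name)))
    return os.path.join(target_dir, f'{prefix}{n + 1}{ext}')
```

The result can be passed straight to book.save(). Since the number comes from a file count, deleting an earlier file makes the next name collide with an existing one, so this only suits folders where files are never removed.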
You actually want glob:
from glob import glob
DIR = 'some/where/'
existing_files = glob(DIR + '*.xls')
filename = DIR + 'stuff--%d--stuff.xls' % (len(existing_files) + 1)
Since you said Burhan Khalid's answer "Works perfectly!" you should accept it.
I just wanted to point out a different way to compute the number. The way you are doing it works, but if we imagine you were counting grains of sand or something similar, building the whole list would use far too much memory. Here is a more direct way to get the count:
n = sum(1 for name in os.listdir('.') if os.path.isfile(name))
For every qualifying name, we get a 1, and all these 1's get fed into sum() and you get your count.
Note that this code uses a "generator expression" instead of a list comprehension. Instead of building a list, taking its length, and then discarding the list, the above code just makes an iterator that sum() iterates to compute the count.
It's a bit sleazy, but there is a shortcut we can use: sum() will accept boolean values, and will treat True as a 1, and False as a 0. We can sum these.
# sum will treat Boolean True as a 1, False as a 0
n = sum(os.path.isfile(name) for name in os.listdir('.'))
This is sufficiently tricky that I probably would not use this without putting a comment. But I believe this is the fastest, most efficient way to count things in Python.
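The three counting styles discussed here can be compared side by side on a small in-memory list. This is a sketch with made-up names and a made-up predicate (is_xls) standing in for os.path.isfile; it makes no claim about which form is fastest.

```python
names = ['a.xls', 'b.txt', 'c.xls', 'notes']

def is_xls(name):
    # stand-in predicate; the original uses os.path.isfile
    return name.endswith('.xls')

# 1. list comprehension: builds a temporary list, then measures it
count_list = len([name for name in names if is_xls(name)])

# 2. generator expression: feeds one 1 per match to sum(), no list is built
count_gen = sum(1 for name in names if is_xls(name))

# 3. summing booleans: bool is a subclass of int, so True adds 1 and False adds 0
count_bool = sum(is_xls(name) for name in names)

print(count_list, count_gen, count_bool)
```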
