rename files in order from a folder using python

I have a folder with files that are named from 0.txt to 100.txt.
They are created in order from a list L.
I want to rename the files in that folder using the names from the list. However, they are renamed in the "wrong" order: the new names do not follow the order of the list.
My code looks like this:
import os
folder = r'D:\my_files'
os.chdir(folder)
for i,j in zip(os.listdir(folder), L):
    os.rename(i, j + ".txt")
where L is the list with names for the files.
How do I keep the order of files in the directory to match my names in the L list, so the files are renamed according to my list?

As per the Python documentation:
os.listdir(path='.')
Return a list containing the names of the entries
in the directory given by path. The list is in arbitrary order, and
does not include the special entries '.' and '..' even if they are
present in the directory.
Therefore, you need to sort your files before you use zip:
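for i,j in zip(sorted(os.listdir(folder), key=lambda x: int(x.split('.')[0])), L):
    # logic to rename file, e.g. the call from the question
    os.rename(i, j + ".txt")
The key=lambda x: int(x.split('.')[0]) argument makes sorted compare the numeric part of each name, so 2.txt comes before 10.txt; a plain string sort would put 10.txt first.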
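Putting the pieces together, a minimal sketch could look like the following. The function name rename_in_list_order is made up for illustration, and the sketch assumes every file to rename is called <number>.txt; anything else in the folder is skipped, because int() would fail on it. It builds full paths with os.path.join, so no os.chdir is needed.

```python
import os

def rename_in_list_order(folder, new_names):
    """Rename 0.txt, 1.txt, ... in `folder` to the names in `new_names`, in order."""
    # Keep only names such as "12.txt"; anything else would break int().
    numbered = [name for name in os.listdir(folder)
                if name.endswith(".txt") and name[:-4].isdigit()]
    # Sort by the integer value so that 2.txt comes before 10.txt.
    numbered.sort(key=lambda name: int(name[:-4]))
    for old, new in zip(numbered, new_names):
        os.rename(os.path.join(folder, old), os.path.join(folder, new + ".txt"))
```

With the folder from the question this would be called as rename_in_list_order(r'D:\my_files', L). Note that zip stops at the shorter of the two sequences, so extra files or extra names are silently ignored.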

Related

Rename a directory with a filename inside with Python

I am trying to rename several directories with the name of the first file inside them.
I am trying to:
List the files inside a folder.
Identify the directories.
For each directory, access it, grab the name of the first file inside and rename the directory with such name.
This is what I got so far but it is not working. I know the code is wrong but before fixing the code I would like to know if the logic is right. Can anyone help please?
import os
for (root, dirs, files) in os.walk('.'):
    print(f'Found directory: {dirpath}')
    dirlist = []
    for d_idx, d in enumerate(dirlist):
        print(d)
    filelist = []
    for f_idex, f in enumerate(filelist):
        files.append(f)[1]
        print(f)
    os.rename(d, f)
Thank you!

There are a few problems in your code:
You are renaming directories while iterating over them with os.walk. This is not a good idea: os.walk returns a generator, which produces its elements lazily as you iterate, so renaming directories inside the loop can leave it trying to visit paths that no longer exist.
Both for d_idx, d in enumerate(dirlist): and for f_idex, f in enumerate(filelist): iterate over variables that were set to empty lists on the line before, so those loops never run. Also, within the second one, files.append(f) would append f to the list files, but the [1] at the end means "get the second element (remember, Python indexing is 0-based) of the value returned by append". Since append modifies the list in place and returns None, that would raise a TypeError (and you are not using the value read by [1] anyway).
In os.rename(d, f), since the loops before never run, d and f are never assigned. Even if d and f came from dirs and files, they would be names relative to their parent directory (root), not to your current directory (.), so the renaming would fail.
This code should work as you want:
import os
# List of (directory path, new name) pairs to rename
renames = []
# Walk current dir
for (root, dirs, files) in os.walk('.'):
    # Skip this dir (cannot rename current directory)
    if root == '.':
        continue
    # Skip directories without files (files[0] would raise IndexError)
    if not files:
        continue
    # Add renaming to list
    renames.append((root, files[0]))
# Iterate renaming list in reverse order so subdirectories are renamed before their parents
for root, new_name in reversed(renames):
    # Make new full dir name (relative to current directory)
    new_full_name = os.path.join(os.path.dirname(root), new_name)
    # Rename
    os.rename(root, new_full_name)

Conditional renaming of multiple files from multiple subfolders based on a list of prefix

I have a path with multiple subfolders, and files with different extensions.
I made a separate list for each file type (file extension).
path='/user/path/output'
Defining a list for each file extension:
png =list()
txt =list()
Then I populate the lists with files:
import os
for (dirpath,dirname,filenames) in os.walk(path):
    txt +=[os.path.join(dirpath,file) for file in filenames if file.endswith("txt") ]
    png +=[os.path.join(dirpath,file) for file in filenames if file.endswith("png") ]
Now the lists look as follows:
print(txt)
['/user/path/output/SAP.txt','/user/path/output/LUF.txt']
And the png list:
['/user/path/output/SAP-tcga-01_scs.png','/user/path/output/LUF-tcga-01_scs.png']
I also have a list of prefixes (called suffix in the code below) that need to be applied to the file names in the lists above. To do that, I created a dictionary with the sample name as the key and the full entry as the value.
The first part of each entry in the list suffix is the sample name that the file name starts with, and the second part is the suffix to add. Each entry must be applied to the right file name; that is the condition here.
suffix =["SAP_xz","LUF_df"]
prefix_for_files={value.split('_')[0]:value for value in suffix}
So I have two lists of file names with paths, and a dictionary of the names to apply to the files in those two lists.
To do this I wrote a for loop like this:
for value,ids in prefix_for_files.items():
    sample_prefix=ids.split('_')[1]
Finally, I need to go through each item in the lists (txt and png) and check whether a key in prefix_for_files matches the start of the file's basename.
I am stuck here; any suggestions are much appreciated.
In the end I need the list of files to look like this, for example for the png files:
/user/path/output/SAP_xz.png
/user/path/output/LUF_df.png

You can use glob.glob
import glob
import os
types = ('*.png', '*.txt') # Stuff to search for
files = []
for ftype in types:
    files.extend(glob.glob('/home/chris/dev/output/**/{}'.format(ftype), recursive=True))
# For me this would produce
# ['/home/chris/dev/output/blank.txt',
# '/home/chris/dev/output/b.png',
# '/home/chris/dev/output/t.png',
# '/home/chris/dev/output/f2/okay.png',
# '/home/chris/dev/output/f1/hello.png',
# '/home/chris/dev/output/f1/goodbye.png']
Then if you need anything with the absolute path (if this isn't your absolute path already), you can do this:
for f in files:
    abspath = os.path.abspath(f)
    # Rename/delete/etc with abspath
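For the matching step the question is stuck on, one possible sketch is shown below. The helper build_new_paths is a made-up name, and the sketch assumes that each file's basename starts with the sample name (the part of an entry before the underscore), as in the question's examples. It only computes the new paths and renames nothing.

```python
import os

def build_new_paths(paths, entries):
    """Map each path to a new path named after the matching entry.

    `entries` holds strings such as "SAP_xz"; the part before the first
    underscore is matched against the start of each file's basename.
    """
    by_sample = {entry.split("_")[0]: entry for entry in entries}
    new_paths = {}
    for path in paths:
        folder, filename = os.path.split(path)
        stem, ext = os.path.splitext(filename)
        for sample, entry in by_sample.items():
            if stem.startswith(sample):
                # Keep the folder and extension, replace the file stem.
                new_paths[path] = os.path.join(folder, entry + ext)
                break
    return new_paths

png = ['/user/path/output/SAP-tcga-01_scs.png', '/user/path/output/LUF-tcga-01_scs.png']
suffix = ["SAP_xz", "LUF_df"]
for old, new in build_new_paths(png, suffix).items():
    print(old, "->", new)
```

Each (old, new) pair could then be passed to os.rename(old, new). Files that match no entry are simply left out of the result.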

Want to create a list of directories at the n-1 level (folders that do not contain any subfolders) with either bash or python

I currently have a problem where I want to get a list of directories that are at an n-1 level. The structure looks somewhat like the diagram below, and I want a list of all the folders that are blue in color. The height of the tree however, varies across the entire file system.
Because the names of the blue folders generally end with the string images, I have written the Python code below:
import os
import sys
def getDirList(dir):
    dirList = [x[0] for x in os.walk(dir)]
    return dirList
oldDirList = getDirList(sys.argv[1])
dirList = []
# Hack method for getting the folders
for i, dir in enumerate(oldDirList):
    if dir.endswith('images'):
        dirList.append(oldDirList[i] + '/')
I do not want to rely on this method, since I want a general solution, either in Python or as a bash script whose result I then read into Python. Which one would be more efficient in practice and theoretically?

To rephrase what I think you're asking - you want to list all folders that do not contain any subfolders (and thus contain only non-folder files).
You can use os.walk() for this pretty easily. os.walk() returns an iterable of three-tuples (dirname, subdirectories, filenames). We can wrap a list comprehension around that output to select only the "leaf" directories from a file tree - just collect all the dirnames that have no subdirectories.
import os
dirList = [d[0] for d in os.walk('root/directory/path') if len(d[1]) == 0]

So another way to state your problem is that you want all folders that contain no subfolders? If that's the case, you can make use of the fact that os.walk lists all the subfolders within each folder. If that list is empty, append the folder to dirList:
import os
import sys
def getDirList(dir):
    # x[1] contains the list of subfolders
    dirList = [(x[0], x[1]) for x in os.walk(dir)]
    return dirList
oldDirList = getDirList(sys.argv[1])
dirList = []
for i, dir in enumerate(oldDirList):
    if not dir[1]: # if the list of subfolders is empty
        dirList.append(dir[0])
print(dirList)

Today I had a similar problem.
Try pathlib: https://docs.python.org/3/library/pathlib.html
from pathlib import Path
# Path.cwd() returns the path of red_dir if the script is run from inside it
gray_dir = Path.cwd().parents[1] # .parents[1] is two levels above the current directory
blue_dirs = []
for thing in gray_dir.iterdir():
    if thing.is_dir(): # make sure not to append files
        blue_dirs.append(thing.name)
print(blue_dirs)

Grouping and deleting Files

I have to come up with a solution to delete all files but the newest 2 in a directory structure of our ownCloud. To be exact, it's the file versioning folder. There are files in one folder with the following naming structure:
Filename.Ext.v[random_Number]
The hard part is that there are different files in one folder I need to keep.
IE: Content of folder A:
HelloWorld.txt.v123
HelloWorld.txt.v555
HelloWorld.txt.v666
OtherFile.pdf.v143
OtherFile.pdf.v1453
OtherFile.pdf.v123
OtherFile.pdf.v14345
YetOtherFile.docx.v11113
In this case we have 3 "basefiles". And I would have to keep the newest 2 files of each "basefile".
I tried Python 3 with os.walk and a regex to filter out the basename. I tried built-in Linux tools like find with -ctime. I could also use bash.
But my real problem is more the logic. How would you approach this task?
EDIT 2:
Here my progress:
import os
from itertools import groupby
directory = 'C:\\Users\\x41\\Desktop\\Test\\'
def sorted_ls(directory):
    mtime = lambda f: os.stat(os.path.join(directory, f)).st_mtime
    return list(sorted(os.listdir(directory), key=mtime))
print(sorted_ls(directory))
for basename, group in groupby(sorted_ls(directory), lambda x: x.rsplit('.')[0]):
    for i in basename:
        finallist = []
        for a in group:
            finallist.append(a)
        print(finallist[:-2])
I am almost there. The function sorts the files in the directory based on the mtime value. The suggested groupby() function calls my custom sort function.
Now the problem here is that I have to dump the sort() before the groupby() because this would reset my custom sort. But it now also returns more groups than anticipated.
If my sorted list looks like this:
['A.txt.1', 'B.txt.2', 'B.txt.1', 'B.txt.3', 'A.txt.2']
I would get 3 groups. A, B, and A again.
Any suggestions?
FINAL RESULT
Here is my final version with added recursiveness:
import os
from itertools import groupby
directory = r'C:\Users\x41\Desktop\Test'
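def base(name):
    # Strip only the trailing version part: 'A.txt.v1' -> 'A.txt'
    return name.rsplit('.', 1)[0]
for dirpath, dirs, files in os.walk(directory):
    output = []
    for basename, group in groupby(sorted(files, key=base), base):
        # Keep the two newest files of each group; collect the rest for deletion
        output.extend(sorted(group, key=lambda x: os.stat(os.path.join(dirpath, x)).st_mtime)[:-2])
    for file in output:
        os.remove(os.path.join(dirpath, file))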

You need to sort the file names first so that files with the same base name are adjacent, because groupby only groups consecutive items with equal keys.
Within each of the resulting groups, you can then sort using your os.stat key as follows:
import os
from itertools import groupby
directory = r'C:\Users\x41\Desktop\Test'
def base(name):
    # Strip only the trailing version part: 'A.txt.v1' -> 'A.txt'
    return name.rsplit('.', 1)[0]
output = []
for basename, group in groupby(sorted(os.listdir(directory), key=base), base):
    output.extend(sorted(group, key=lambda x: os.stat(os.path.join(directory, x)).st_mtime)[-2:])
print(output)
This will produce a single list containing the latest two files from each group.
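To see why the sort matters, here is a small self-contained sketch using the example list from the question. The helper name base is only illustrative; it treats everything before the last dot as the base name.

```python
from itertools import groupby

def base(name):
    # 'B.txt.2' -> 'B.txt'
    return name.rsplit('.', 1)[0]

names = ['A.txt.1', 'B.txt.2', 'B.txt.1', 'B.txt.3', 'A.txt.2']

# groupby starts a new group whenever the key changes between neighbours.
unsorted_keys = [key for key, _ in groupby(names, base)]
sorted_keys = [key for key, _ in groupby(sorted(names, key=base), base)]

print(unsorted_keys)  # ['A.txt', 'B.txt', 'A.txt']
print(sorted_keys)    # ['A.txt', 'B.txt']
```

Sorting by the same key that groupby uses guarantees that equal keys end up next to each other.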

The logic isn't extremely hard here, if that's the only thing you're looking for.
Group the files by base name, for example in a Python dictionary where the key is the "base filename" such as "HelloWorld.txt" and the value is a list of all files with that base name, sorted newest first by ctime (or some other timestamp, depending on how you define "newest"). Then delete every file in each list from index 2 onwards.
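A rough sketch of that dictionary approach might look like this. The function name prune_old_versions is invented for the example, and it assumes that "newest" means the latest modification time (st_mtime) and that the version is the part after the last dot. It handles a single directory; wrap it in os.walk for recursion.

```python
import os
from collections import defaultdict

def prune_old_versions(directory, keep=2):
    """Delete all but the `keep` newest versions of each base file in `directory`."""
    groups = defaultdict(list)
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            # 'HelloWorld.txt.v123' -> 'HelloWorld.txt'
            groups[name.rsplit('.', 1)[0]].append(path)
    removed = []
    for paths in groups.values():
        # Newest first, so everything from index `keep` onwards is old.
        paths.sort(key=os.path.getmtime, reverse=True)
        for path in paths[keep:]:
            os.remove(path)
            removed.append(path)
    return removed
```

Since this deletes files, it may be worth replacing os.remove with a print on the first run to check which files would go.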

Python: Need to add chosen filenames into an array

The idea is simple: there is a directory with two or more *.txt files. My script should look in the directory and get the file names in order to copy them (if they exist) over the network.
As a Python newbie, I am facing problems which I cannot resolve so far.
My code:
import os
files = os.listdir('c:\\Python34\\')
for f in files:
    if f.endswith(".txt"):
        print(f)
This example returns 3 files:
LICENSE.txt
NEWS.txt
README.txt
Now I need to use every filename in order to do a SCP. The problem is that when I try to get the first filename with:
print(f[0])
I get just the first letter of each file name in the list:
L
N
R
How do I add the file names to an array so that I can use them later as array elements?

You can also try the extend method:
x = []
for f in files:
    if f.endswith(".txt"):
        x.extend([f])
This adds the current file name f to the end of the list x; x.append(f) does the same for a single item.

If you want a list of matching files names, then instead of using os.listdir and filtering, use glob.glob with a suitable pattern.
import glob
files = glob.glob('C:\\python34\\*.txt')
Then you can access files[0] etc...

The array of files is files. In the loop, f is a single file name (a string) so f[x] gets the xth character of a filename. Do files[0] instead of f[0].
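Note that files from os.listdir still contains every entry in the directory, not only the .txt ones. If you want a list holding just the matching names, a short sketch (the function name list_txt_files is only for illustration) could be:

```python
import os

def list_txt_files(directory):
    """Return the sorted names of the .txt files in `directory`."""
    return sorted(f for f in os.listdir(directory) if f.endswith(".txt"))
```

Calling txt_files = list_txt_files('c:\\Python34\\') would then let you use txt_files[0], txt_files[1] and so on as whole file names.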
