Python filename change - python

I have a number of videos in a directory on my Mac that all have a specific string in the file name that I want to remove, but I want to keep the rest of the file name as it is. I'm running this python script from terminal.
I have this syntax but it doesn't seem to work. Is it practical to use the following? It seems too simple to be the best way to do this sort of thing, which is why I don't think it works.
from os import rename, listdir
text = "Some text I want to remove from file name"
files = listdir("/Users/Admin/Desktop/Dir_of_videos/")
for x in files:
    if text in files:
        os.rename(files, files.replace(text, ""))

The problem is that os.listdir returns only the bare file names, without the path to the directory prepended, so os.rename looks for them in the current working directory. The loop also has to test and rename each individual name (x), not the whole files list.
This should do the job:
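import os
in_dir = './test'
remove = 'hello'
# listdir returns bare names, so join each one with the directory
for name in os.listdir(in_dir):
    if remove in name:
        # replace only within the file name, not within the directory part
        os.rename(os.path.join(in_dir, name),
                  os.path.join(in_dir, name.replace(remove, "")))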
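A minimal sketch of the same idea with pathlib, assuming Python 3. The function name strip_text_from_names and the dry_run flag are made up for illustration. It changes only the file-name part of each path and can report what it would do before touching anything.

```python
from pathlib import Path


def strip_text_from_names(directory, text, dry_run=True):
    """Remove `text` from the name of every file in `directory`.

    Returns a list of (old_path, new_path) pairs. With dry_run=True
    nothing is renamed, so the result can be inspected first.
    """
    changes = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file() or text not in path.name:
            continue
        new_name = path.name.replace(text, "")
        if not new_name:
            # The whole name was the text; an empty name is not valid.
            continue
        # with_name swaps only the final path component, so the
        # directory part is never modified.
        target = path.with_name(new_name)
        changes.append((path, target))
        if not dry_run:
            path.rename(target)
    return changes
```

Calling it once with the default dry_run=True and printing the returned pairs is a cheap way to check the result before running it again with dry_run=False.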

Related

Trying to print name of all csv files within a given folder

I am trying to write a program in python that loops through data from various csv files within a folder. Right now I just want to see that the program can identify the files in a folder but I am unable to have my code print the file names in my folder. This is what I have so far, and I'm not sure what my problem is. Could it be the periods in the folder names in the file path?
import glob
path = "Users/Sarah/Documents/College/Lab/SEM EDS/1.28.20 CZTS hexane/*.csv"
for fname in glob.glob(path):
    print fname
No error messages are popping up but nothing will print. Does anyone know what I'm doing wrong?
Are you on a Linux-based system? If you're not, switch the / for \\.
Is the directory you're giving the full path, from the root folder? You might need to specify a FULL path (drive included).
If that still fails, it may sound silly, but check there actually are files in there, as your code otherwise seems fine.
This code below worked for me and listed the csv files appropriately (see the C:\\ part, which could be what you're missing).
import glob
path = "C:\\Users\\xhattam\\Downloads\\TEST_FOLDER\\*.csv"
for fname in glob.glob(path):
    print(fname)
The following code gets a list of files in a folder and if they have csv in them it will print the file name.
import os
path = r"C:\temp"
filesfolders = os.listdir(path)
for file in filesfolders:
    if ".csv" in file:
        print (file)
Note the indentation in my code. You need to be careful not to mix tabs and spaces, as these are not the same to Python.
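Alternatively, you could use os.listdir and compare the last four characters of each name:
import os
path = "."  # placeholder: the folder that contains the CSV files
files_list = os.listdir(path)
out_list = []
for item in files_list:
    if item[-4:] == ".csv":
        out_list.append(item)
print(out_list)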
Are you sure you are using the correct path?
Try moving the Python script into the folder where the CSV files are, and then change it to this:
import glob
path = "./*.csv"
for fname in glob.glob(path):
    print(fname)
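Since glob returns an empty list both for a wrong path and for a folder without matches, a small check can tell the two cases apart. This is only a sketch, assuming Python 3; the helper name list_csv_files is hypothetical.

```python
from pathlib import Path


def list_csv_files(folder):
    """Return the CSV files directly inside `folder`, sorted by path."""
    folder = Path(folder).expanduser()
    if not folder.is_dir():
        # A relative path is resolved against the current working
        # directory, which is a common reason for an empty result.
        raise NotADirectoryError(f"not a directory: {folder.resolve()}")
    # Compare the suffix case-insensitively so "DATA.CSV" is found too.
    return sorted(p for p in folder.iterdir()
                  if p.is_file() and p.suffix.lower() == ".csv")
```

The error message shows the absolute path that was actually searched, which makes a missing leading slash or drive letter easy to spot.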

RegEx to find specific file path

I am trying to find the existence of a file testing.txt
The first file exists in: sub/hbc_cube/college/
The second file exists in: sub/hbc/college
However, when searching for where the file exists, I CANNOT assume the string 'hbc' because the name may be different depending on the user. So I am trying to find a way to
PASS if the path is
sub/*_cube/college/
FAIL if the path is
sub/*/college
But I cannot use a glob character (*) because the (*) will count _cube as failing. I am trying to figure out a regular expression that will only detect a string and not a string with an underscore (hbc_cube for example).
I have tried using the python regex dictionary but I have not been able to figure out the correct regex to use
file_list = lookupfiles(['testing.txt'], dirlist = ['sub/'])
for file in file_list:
    if str(file).find('_cube/college/'): #hbc_cube/college
        print("pass")
    if str(file).find('*/college/'): #hbc/college
        print("fail")
If the file exists in both locations I want only "fail" to print. The problem is the * character is counting hbc_cube.
The glob module is your friend. You don't even need to match against multiple directories, glob will do it for you:
from glob import glob
testfiles = glob("sub/*/testing.txt")
if len(testfiles) > 0 and all("_cube/" in path for path in testfiles):
    print("Pass")
else:
    print("Fail")
In case it is not obvious, the test all("_cube/" in path for path in testfiles) will take care of this requirement:
If the file exists in both locations I want only "fail" to print. The problem is the * character is counting hbc_cube.
If some of the paths that matched do not contain _cube, the test fails. Since you want to know about files that cause the test to fail, you cannot search solely for files in a path containing *_cube -- you must retrieve both good and bad paths, and inspect them as shown.
Of course you can shorten the above code, or generalize it to construct the globbed path by combining options from a list of folders and a list of files, etc., depending on the particulars of your case.
Note that there are "full regular expressions", provided by the re module, and the simpler "globs" used by the glob module. If you go check the documentation, don't confuse them.
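To make the difference between the two pattern languages concrete, here is a small sketch using the two directory names from the question. The regular expression is just one possible way to say "a name without an underscore"; it is not taken from the original posts.

```python
import re
from fnmatch import fnmatch

paths = ["sub/hbc_cube/college", "sub/hbc/college"]

# Glob: "*" matches any run of characters, underscores included,
# so both directories match.
glob_hits = [p for p in paths if fnmatch(p, "sub/*/college")]

# Regex: [^/_]+ means "one or more characters that are neither a
# slash nor an underscore", so only the plain name matches.
plain_name = re.compile(r"sub/[^/_]+/college")
regex_hits = [p for p in paths if plain_name.fullmatch(p)]

print(glob_hits)   # ['sub/hbc_cube/college', 'sub/hbc/college']
print(regex_hits)  # ['sub/hbc/college']
```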
Use pathlib to parse your path. From the path object get the parent, which discards the /college part, and check whether the resulting path string ends with _cube:
from pathlib import Path
file_list = lookupfiles(['testing.txt'], dirlist = ['sub/'])
for file in file_list:
    path = Path(file)
    if str(path.parent).endswith('_cube'):
        print('pass')
    else:
        print('Fail')
Edit:
If the file variable in the for loop contains the file name (sub/*_cube/college/testing.txt), just call parent twice on the path: path.parent.parent
Another approach would be to filter the files inside lookupfiles(), that is, if you have access to that function and can edit it.
The os module is well suited for this:
import os
# This assumes your current working directory has sub in it
for root, dirs, files in os.walk('sub'):
    for file in files:
        if file=='testing.txt':
            # print the file and the directory it's in
            print(os.path.join(root, file))
os.walk will return a three-element tuple as it iterates: a root dir, the directories in that folder, and the files in that folder. To print the full path, you combine the root (the directory currently being visited) and the file name.
For example, on my machine:
for root, dirs, files in os.walk(os.getcwd()):
    for file in files:
        if file.endswith('ipynb'):
            print(os.path.join(root, file))
# prints
/Users/mm92400/Salesforce_Repos/DataExplorationClustersAndTime.ipynb
/Users/mm92400/Salesforce_Repos/DataExplorationUntitled1.ipynb
/Users/mm92400/Salesforce_Repos/DataExplorationExploratory.ipynb
/Users/mm92400/Salesforce_Repos/DataExplorationUntitled3.ipynb
/Users/mm92400/Salesforce_Repos/DataExplorationUntitled.ipynb
/Users/mm92400/Salesforce_Repos/DataExplorationUntitled4.ipynb
/Users/mm92400/Salesforce_Repos/DataExplorationUntitled2.ipynb
/Users/mm92400/Salesforce_Repos/DataExplorationClusterAnalysis.ipynb
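The os.walk loop can also be wrapped in a function that returns the matches instead of printing them. This is a rough sketch; find_files is a hypothetical helper name.

```python
import os


def find_files(top, filename):
    """Return the full path of every file called `filename` under `top`."""
    matches = []
    for root, dirs, files in os.walk(top):
        # `root` is the directory being visited; `files` holds the bare
        # names of the files directly inside it.
        if filename in files:
            matches.append(os.path.join(root, filename))
    return sorted(matches)
```

For the question above, find_files('sub', 'testing.txt') would return one path per directory that contains the file, and those paths can then be checked for _cube.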

Printing File Names

I am very new to Python and just installed Eric6. I want to search a folder (and all subdirectories) and print the file name of every file that has the extension .pdf. I have this as my syntax, but it errors saying
The debugged program raised the exception unhandled FileNotFoundError
"[WinError 3] The system can not find the path specified 'C:'"
File: C:\Users\pcuser\EricDocs\Test.py, Line: 6
And this is the syntax I want to execute:
import os
results = []
testdir = "C:\Test"
for folder in testdir:
    for f in os.listdir(folder):
        if f.endswith('.pdf'):
            results.append(f)
print (results)
Use the glob module.
The glob module finds all the pathnames matching a specified pattern
import glob, os
parent_dir = 'path/to/dir'
for pdf_file in glob.glob(os.path.join(parent_dir, '*.pdf')):
    print (pdf_file)
This will work on Windows and *nix platforms.
Just make sure that your path is fully escaped on Windows; a raw string can be useful here.
In your case, that would be:
import glob, os
parent_dir = r"C:\Test"
for pdf_file in glob.glob(os.path.join(parent_dir, '*.pdf')):
    print (pdf_file)
For only a list of filenames (not full paths, as per your comment) you can do this one-liner:
results = [os.path.basename(f) for f in glob.glob(os.path.join(parent_dir, '*.pdf'))]
Right now, you are looping over each character of the testdir string, so os.listdir is called with the values "C", ":", "\", "T", etc. You'll also want to escape your escape character, like "C:\\...\\...\\".
You probably want to use os.listdir(testdir) instead.
Try running your Python script from C:. From the Command Prompt, you might wanna do this:
> cd C:\
> python C:\Users\pcuser\EricDocs\Test.py
As pointed out by Tony Babarino, use r"C:\Test" instead of "C:\Test" in your code.
There are a few problems in your code, take a look at how I've modified it below:
import os
results = []
testdir = "C:\\Test"
for f in os.listdir(testdir):
    if f.endswith('.pdf'):
        results.append(f)
print (results)
Note that I have escaped your path name and removed your first for folder ... loop. That loop wasn't getting the folders as you expected, but rather selecting one character of the path string at a time.
You will need to modify the code to get it to look through all folders, this currently doesn't. Take a look at the glob module.
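A possible sketch of that recursive search with glob, assuming Python 3.5 or later (where the recursive flag is available). The function name find_pdfs is made up for this example.

```python
import glob
import os


def find_pdfs(top):
    """Return every .pdf file under `top`, including subfolders."""
    # "**" matches any number of nested directories (including none),
    # but only when recursive=True is passed.
    pattern = os.path.join(top, "**", "*.pdf")
    return sorted(glob.glob(pattern, recursive=True))
```

Wrapping each result in os.path.basename gives bare file names instead of full paths.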
You will need to escape the backslash on Windows, and you can use os.walk to get all the pdf files.
import os
results = []
testdir = "C:\\Test"
for root,dirs,files in os.walk(testdir):
    for f in files:
        if f.endswith('.pdf'):
            results.append(f)
print (results)
You are basically iterating through the string testdir with the first for loop and then passing each character to os.listdir(folder), which does not make sense. Just remove that first for loop and use the fnmatch function from the fnmatch module:
import os
from fnmatch import fnmatch
ext = '*.pdf'
results = []
testdir = r"C:\Test"
for f in os.listdir(testdir):
    if fnmatch(f, ext):
        results.append(f)
print (results)
Try testdir = r"C:\Test" instead of testdir = "C:\Test". In Python you have to escape special characters such as \. You can also escape them with the symbol '\', so it would be "C:\\Test". By using r"C:\Test", you are telling Python to use a raw string.
Also, the for folder in testdir: line doesn't make sense because testdir is a string, so you are basically iterating over its characters.
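Both points can be checked in a few lines. The snippet below is only an illustration; the variable names are arbitrary.

```python
# Iterating over a string yields one character at a time, which is
# why the loop handed single characters to os.listdir().
chars = list("C:\\Test")
print(chars)  # ['C', ':', '\\', 'T', 'e', 's', 't']

# A raw string and a string with doubled backslashes are identical.
same = r"C:\Test" == "C:\\Test"
print(same)  # True

# Without either, some sequences silently become control characters:
# "\t" is a tab, so the first path below is 6 characters long, not 7.
plain_len = len("C:\temp")
raw_len = len(r"C:\temp")
print(plain_len, raw_len)  # 6 7
```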
I had to list the names of the training images for my YOLO model.
Here's what I did to print the names of all the images I kept for training the YoloV3 model:
import os
for root, dirs, files in os.walk("."):
    for filename in files:
        print(filename)
It prints out all the file names from the current directory and its subdirectories.

Python - Deleting the last few characters of specific files in a directory

I'm trying to delete the last several characters of multiple files in a specific directory using the rename function. The code I have written using suggestions on this site looks like it should work, but it returns the error message:
FileNotFoundError: [WinError 2] The system cannot find the file specified: 'test1.txt' -> 'test'
And here is my code:
import os
list = os.listdir("C:\\Users\\Jonathan\\Desktop")
for file in list:
    if file.startswith("test"):
        os.rename(file, file[0:4])
My code shows that for all files beginning with the word "test", delete all characters after it. As I said, to me it looks like it should work, but I am new at Python, and I don't even understand what the error message means.
Are you actually in the folder where you're renaming? If not, the problem is likely that you're looking in the local folder (where you launched the program). Prepend that path to each file name:
import os
cwd = "C:\\Users\\Jonathan\\Desktop"
list = os.listdir(cwd)
for file in list:
    if file.startswith("test"):
        # os.path.join inserts the separator that cwd+file would leave out
        os.rename(os.path.join(cwd, file), os.path.join(cwd, "test"))
As you didn't specify the complete path to your files, your program was most likely looking for them in its current working directory rather than on the Desktop. Also, you should avoid list and file as variable names: list shadows a built-in type, and file was a built-in in Python 2.
import os
files_path = "C:\\Users\\Jonathan\\Desktop\\"
lst = os.listdir(files_path)
for file_name in lst:
    if file_name.startswith("test"):
        # [:-4] drops the last four characters, e.g. a ".txt" extension
        os.rename(files_path + file_name, files_path + file_name[:-4])
Try this:
import os
path = "C:\\Users\\Jonathan\\Desktop\\"
for name in os.listdir(path):
    if name[:4] == "test":
        # prepend the directory string; adding a str to the list itself raises TypeError
        os.renames(path + name, path + name[:4])
By the way, if you need to find the files and rename them recursively (that is, in every subdirectory of that directory as well), you can use os.walk() like this:
for root, dirs, files in os.walk("C:\\Users\\Jonathan\\Desktop\\"):
    for name in files:
        if name[:4] == "test":
            # slice the file name only, not the joined path
            os.renames(os.path.join(root, name), os.path.join(root, name[:4]))
You need to use os.rename() with existing paths. If your working directory is not the directory containing the file, your script will fail. This should work independently of your working directory:
import os
files_path = "C:\\Users\\Jonathan\\Desktop\\"
lst = os.listdir(files_path)
for fle in lst:
    if fle.startswith("test"):
        os.rename(os.path.join(files_path, fle),
                  os.path.join(files_path, fle[:4]))
Also, avoid using list as a variable name.
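One thing the snippets above do not handle: if several files start with "test", they all map to the same target name. The sketch below is one way to deal with that, under the assumption that skipping is acceptable; truncate_names and its return value are made up for illustration.

```python
import os


def truncate_names(directory, prefix="test"):
    """Rename files starting with `prefix` to just `prefix`.

    Returns the names that were skipped because the target existed.
    """
    skipped = []
    target = os.path.join(directory, prefix)
    for name in sorted(os.listdir(directory)):
        source = os.path.join(directory, name)
        if not name.startswith(prefix) or name == prefix:
            continue
        if not os.path.isfile(source):
            continue
        if os.path.exists(target):
            # e.g. test1.txt already became "test"; renaming test2.txt
            # as well would overwrite it or raise, depending on the OS.
            skipped.append(name)
            continue
        os.rename(source, target)
    return skipped
```

Because both paths are built with os.path.join from the directory argument, the function does not depend on the current working directory.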

Running a python script on all the files in a directory

I have a Python script that reads through a text csv file and creates a playlist file. However I can only do one at a time, like:
python playlist.py foo.csv foolist.txt
However, I have a directory of files that need to be made into a playlist, with different names, and sometimes a different number of files.
So far I have looked at creating a txt file with a list of all the names of the files in the directory, then looping through each line of that. However, I know there must be an easier way to do it.
for f in *.csv; do
    python playlist.py "$f" "${f%.csv}list.txt"
done
Will that do the trick? This will put foo.csv in foolist.txt and abc.csv in abclist.txt.
Or do you want them all in the same file?
Just use a for loop with the asterisk glob, making sure you quote things appropriately for spaces in filenames:
for file in *.csv; do
    python playlist.py "$file" >> outputfile.txt;
done
Is it a single directory, or nested?
Ex.
topfile.csv
topdir
--dir1
----file1.csv
----file2.txt
--dir2
----file3.csv
----file4.csv
For nested, you can use os.walk(topdir) to get all the files and dirs recursively within a directory.
You could set up your script to accept dirs or files:
python playlist.py topfile.csv topdir
import sys
import os
def main():
    files_toprocess = set()
    paths = sys.argv[1:]
    for p in paths:
        if os.path.isfile(p) and p.endswith('.csv'):
            # a csv file given directly on the command line
            files_toprocess.add(p)
        elif os.path.isdir(p):
            # a directory: collect every csv file below it
            for root, dirs, files in os.walk(p):
                files_toprocess.update([os.path.join(root, f)
                                        for f in files if f.endswith('.csv')])
If you have the directory name, you can use os.listdir:
os.listdir(dirname)
If you want to select only a certain type of file, e.g. only csv files, you could use the glob module.
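As a rough Python counterpart to the shell loop, the sketch below pairs each csv file with an output name using the same foo.csv to foolist.txt convention. The function name playlist_jobs is hypothetical, and how playlist.py is invoked is taken only from the command line shown in the question.

```python
import glob
import os


def playlist_jobs(directory):
    """Pair each CSV in `directory` with its playlist file name.

    foo.csv -> foolist.txt, the same naming as "${f%.csv}list.txt".
    """
    jobs = []
    for csv_path in sorted(glob.glob(os.path.join(directory, "*.csv"))):
        # splitext separates "foo" from ".csv" so the suffix can be swapped.
        base, _ext = os.path.splitext(csv_path)
        jobs.append((csv_path, base + "list.txt"))
    return jobs
```

Each (source, target) pair could then be handed to the existing script, for example with subprocess.run([sys.executable, "playlist.py", source, target], check=True).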
