Renaming files with quotation marks in the title using Python

I know similar questions have been asked a few times on this site, but the solutions provided there did not work for me.
I need to rename files with titles such as
a.jpg
'b.jpg'
c.jpg
"d.jpg"
to
a.jpg
b.jpg
c.jpg
d.jpg
Some of these titles have quotation marks inside the title as well, but it doesn't matter whether they get removed or not.
I have tried
import os
import re
fnames = os.listdir('.')
for fname in fnames:
    os.rename(fname, re.sub("\'", '', fname))
and
import os
for file in os.listdir("."):
    os.rename(file, file.replace("\'", ""))
to then do the same for the " quotation mark as well, but the titles remained unchanged. I think it might be due to listdir returning the filenames with ' quotation marks around them, but I am not sure.
Edit: I am working on Ubuntu 18.04.

On Windows, a filename containing double quotes is not valid, but a filename containing single quotes is. (On Linux, both kinds of quotes are allowed in filenames.)
A string with double quotes in it would look like this in Python:
'"I\'m a string with a double quote on each side"'
A string with single quotes in it would look like this:
"'I\'m a string with a single quote on each side'"
Because a filename with double quotes cannot exist on Windows, you cannot os.rename('"example.txt"', "example.txt") there: no such file can exist to be renamed.
You can put this script on your desktop and watch the filenames change as it executes:
import os
# example with single quotes all over the filename
open("'ex'am'ple.t'xt'", 'w').close()
input("Press enter to rename.")
os.rename("'ex'am'ple.t'xt'", "example.txt")
# example with single quotes on each side of the filename
open("'example2.txt'", 'w').close()
input("Press enter to rename.")
os.rename("'example2.txt'", "example2.txt")

Here is my attempt using a for loop, as you do, and a list comprehension over the string, which is also an iterable.
import os
# characters to strip from every name
forbidden_chars = ["'", '"']
files = os.listdir(os.getcwd())
for file in files:
    new_name = ''.join([char for char in file if char not in forbidden_chars])
    print(new_name)
    os.rename(file, new_name)
Edit the forbidden_chars list so that it holds the characters that you do not want in the new filename.
Remember that this will also rename folders, as far as I know, so you may want to check at the start of the for loop
if os.path.isfile(file):
before changing the name.
I actually do not understand how you would end up with filenames that include the extension inside the quotation marks, but this should work either way. I highly recommend being careful if you want to remove dots.
I also recommend peeking at the documentation for the os module before using its functions, as they can do things you may not be aware of. For example, on Unix-like systems, renaming to an existing filename within the directory will silently replace that file (on Windows, os.rename raises an error instead).
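Putting the points from these answers together, a minimal sketch could look like the following. The function name strip_quotes_from_filenames and the decision to skip a file rather than overwrite an existing one are my own assumptions, not something the answers prescribe.

```python
import os


def strip_quotes_from_filenames(directory="."):
    """Remove ' and " from the names of regular files in `directory`.

    Returns a list of (old_name, new_name) pairs that were renamed.
    """
    renamed = []
    for name in os.listdir(directory):
        old_path = os.path.join(directory, name)
        # Leave directories (and anything that is not a regular file) alone.
        if not os.path.isfile(old_path):
            continue
        new_name = name.replace("'", "").replace('"', "")
        # Nothing to strip, or the name would become empty.
        if new_name == name or not new_name:
            continue
        new_path = os.path.join(directory, new_name)
        # Skip instead of overwriting a file that already has that name.
        if os.path.exists(new_path):
            print("skipping {!r}: {!r} already exists".format(name, new_name))
            continue
        os.rename(old_path, new_path)
        renamed.append((name, new_name))
    return renamed
```

Calling strip_quotes_from_filenames(".") would process the current directory. Joining the directory onto each name means the function does not depend on the script's working directory.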

Related

Can't replace "\" with "/"

I am trying to replace a string here in Python.
This is my Input -
link = (r"C:\dell\Documents\Ms\Realm")
I want my output to be like this:
C:/dell/Documents/Ms/Realm
I tried the replace method, but it didn't work.
The code I tried:
link = (r"C:\dell\Documents\Ms\Realm")
link.replace("\","/")

In Python strings, the backslash "\" is a special character, also called the "escape" character. You need to add a second backslash to escape it before you can use it as a search string. Also, replace returns a new string, so assign the result:
link = link.replace("\\", "/")
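For illustration, the short snippet below shows two details that are easy to miss: "\\" in source code is a single backslash character, and replace does not modify the original string. The name converted is mine.

```python
link = r"C:\dell\Documents\Ms\Realm"

# "\\" is one character: the first backslash escapes the second.
backslash = "\\"
print(len(backslash))

# str.replace() returns a new string; `link` itself is left untouched.
converted = link.replace(backslash, "/")

print(link)
print(converted)
```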

There's nothing wrong with that Windows path. There's no reason to replace anything. A real improvement would be to use pathlib instead of raw strings:
from pathlib import Path
link = Path(r"C:\dell\Documents\Ms\Realm")
That would allow you to construct paths from parts using, e.g., joinpath, and to get the parts of the path with parts, the name with name, the directory with parent, etc.
filePath = link.joinpath("some_file.txt")
print(filePath)
-------------------
C:\dell\Documents\Ms\Realm\some_file.txt
and more
>>> print(link.parts)
('C:\\', 'dell', 'Documents', 'Ms', 'Realm')
>>> print(link.parent)
C:\dell\Documents\Ms
Or search for files in a folder recursively:
files = [file for file in link.rglob("*.txt") if file.is_file()]
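If the forward-slash form from the question is still needed, pathlib can produce it as well. The sketch below uses PureWindowsPath, which is my choice so that the example behaves the same on any operating system; on Windows a plain Path offers the same as_posix method.

```python
from pathlib import PureWindowsPath

# PureWindowsPath parses Windows-style paths on any operating system
# without touching the file system.
link = PureWindowsPath(r"C:\dell\Documents\Ms\Realm")

# as_posix() returns the path as a string with forward slashes.
forward = link.as_posix()
print(forward)  # C:/dell/Documents/Ms/Realm
```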

Printing File Names

I am very new to Python and just installed Eric6. I want to search a folder (and all subdirectories) and print the filename of any file that has the extension .pdf. I have this as my code, but it errors saying
The debugged program raised the exception unhandled FileNotFoundError
"[WinError 3] The system can not find the path specified 'C:'"
File: C:\Users\pcuser\EricDocs\Test.py, Line: 6
And this is the code I want to execute:
import os
results = []
testdir = "C:\Test"
for folder in testdir:
    for f in os.listdir(folder):
        if f.endswith('.pdf'):
            results.append(f)
print (results)

Use the glob module.
The glob module finds all the pathnames matching a specified pattern
import glob, os
parent_dir = 'path/to/dir'
for pdf_file in glob.glob(os.path.join(parent_dir, '*.pdf')):
    print (pdf_file)
This will work on Windows and *nix platforms.
Just make sure that your path is properly escaped on Windows; a raw string is useful for that.
In your case, that would be:
import glob, os
parent_dir = r"C:\Test"
for pdf_file in glob.glob(os.path.join(parent_dir, '*.pdf')):
    print (pdf_file)
For only a list of filenames (not full paths, as per your comment) you can use this one-liner:
results = [os.path.basename(f) for f in glob.glob(os.path.join(parent_dir, '*.pdf'))]
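The pattern above only looks at the top level of parent_dir. Since the question also asks about subdirectories, a recursive variant might look like the sketch below. It assumes Python 3.5 or later (for recursive=True), and the function name find_pdfs is made up for illustration.

```python
import glob
import os


def find_pdfs(parent_dir):
    """Return sorted paths of all .pdf files in parent_dir and below."""
    # glob.escape protects characters such as [ or * that may occur in
    # the directory name itself.
    # '**' matches any number of nested directories (including none)
    # when recursive=True is passed.
    pattern = os.path.join(glob.escape(parent_dir), "**", "*.pdf")
    return sorted(glob.glob(pattern, recursive=True))
```

For example, find_pdfs(r"C:\Test") would return full paths; wrap each one in os.path.basename if only the filenames are wanted.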

Right now, you are looping over each character of the string stored in testdir,
so it is searching for folders named "C", ":", "\", "T", etc. You'll also want to escape your escape character, as in "C:\\...\\...".
You probably want to use os.listdir(testdir) instead.
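To see what the outer loop in the question actually iterates over, a tiny experiment like the following can help (the name pieces is only for illustration):

```python
testdir = r"C:\Test"

# Iterating over a string yields its characters one at a time,
# not the folders inside the directory it names.
pieces = [folder for folder in testdir]
print(pieces)
```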

Try running your Python script from C:. From the Command Prompt, you might wanna do this:
> cd C:\
> python C:\Users\pcuser\EricDocs\Test.py
As pointed out by Tony Babarino, use r"C:\Test" instead of "C:\Test" in your code.

There are a few problems in your code, take a look at how I've modified it below:
import os
results = []
testdir = "C:\\Test"
for f in os.listdir(testdir):
    if f.endswith('.pdf'):
        results.append(f)
print (results)
Note that I have escaped your path name and removed your first loop, for folder in testdir. That loop wasn't getting the folders as you expected; it was stepping through the path string one character at a time.
You will need to modify the code to make it look through all subfolders, which it currently doesn't do. Take a look at the glob module.

You will need to escape the backslash on Windows, and you can use os.walk to get all the PDF files.
import os
testdir = r"C:\Test"
results = []
for root, dirs, files in os.walk(testdir):
    for f in files:
        if f.endswith('.pdf'):
            results.append(f)
print (results)

You are iterating over the string testdir with the first for loop and then passing each character to os.listdir(folder), which does not make sense. Just remove that first for loop and use the fnmatch function from the fnmatch module:
import os
from fnmatch import fnmatch
ext = '*.pdf'
results = []
testdir = r"C:\Test"
for f in os.listdir(testdir):
    if fnmatch(f, ext):
        results.append(f)
print (results)

Try testdir = r"C:\Test" instead of testdir = "C:\Test". In Python, you have to escape special characters such as \. You can escape a backslash with another backslash, so it would be "C:\\Test". By using r"C:\Test", you are telling Python to use a raw string.
Also, the line for folder in testdir: doesn't make sense, because testdir is a string, so you are iterating over its characters.

I had to list the names of the training images for my YOLO model.
Here's what I did to print the names of all the images I kept for training a YOLOv3 model:
import os
for root, dirs, files in os.walk("."):
    for filename in files:
        print(filename)
It prints all the file names in the current directory and its subdirectories.

glob function in python with one wildcard

I have a problem with the glob.glob function in Python.
This line works perfectly for me getting all text files with the name 002 in the two subsequent folders of Models:
All_txt = glob.glob("C:\Users\EDV\Desktop\Peter\Models\*\*\002.txt")
But going into one subfolder and asking the same:
All_txt = glob.glob('C:\Users\EDV\Desktop\Peter\Models\Texte\*\002.txt')
results in an empty list. Does anybody know what the problem here is (or knows another function which expresses the same)?
I double-checked the folder paths and that all folders contain these text-files.

Try putting an r in front of the string to make it a raw string: glob.glob(r'C:\Users\EDV\Desktop\Peter\Models\Texte\*\002.txt'). That way the backslashes aren't used to escape the next character.
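One concrete escape in the path from the question is \002: in a normal string literal, a backslash followed by up to three octal digits stands for a single character. The small comparison below shows the difference (variable names are illustrative):

```python
# In a normal string literal, \002 is an octal escape for the single
# control character with code point 2.
plain = "\002.txt"
# In a raw string, the backslash and the digits are kept literally.
raw = r"\002.txt"

print(len(plain), repr(plain))  # 5 '\x02.txt'
print(len(raw), repr(raw))      # 8 '\\002.txt'
```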
You could also do it without glob like so:
import os
all_txt = []
root = r'C:\Users\EDV\Desktop\Peter\Models\Texte'
for d in os.listdir(root):
    abs_d = os.path.join(root, d)
    if os.path.isdir(abs_d):
        txt = os.path.join(abs_d, '002.txt')
        if os.path.isfile(txt):
            all_txt.append(txt)

How to process files from one subfolder to another in each directory using Python?

I have a basic file/folder structure on the Desktop where the "Test" folder contains "Folder 1", which in turn contains 2 subfolders:
An "Original files" subfolder which contains shapefiles (.shp).
A "Processed files" subfolder which is empty.
I am attempting to write a script which looks into each parent folder (Folder 1, Folder 2 etc) and if it finds an Original Files subfolder, it will run a function and output the results into the Processed files subfolder.
I made a simple diagram to showcase this where if Folder 1 contains the relevant subfolders then the function will run; if Folder 2 does not contain the subfolders then it's simply ignored:
I looked into the following posts but having some trouble:
python glob issues with directory with [] in name
Getting a list of all subdirectories in the current directory
How to list all files of a directory?
The following is the script which seems to run happily, annoying thing is that it doesn't produce an error so this real noob can't see where the problem is:
import os, sys
from os.path import expanduser
home = expanduser("~")
for subFolders, files in os.walk(home + "\Test\\" + "\*Original\\"):
    if filename.endswith('.shp'):
        output = home + "\Test\\" + "\*Processed\\" + filename
        # do_some_function, output

I guess you mixed something up in your os.walk()-loop.
I just created a simple structure as shown in your question and used this code to get what you're looking for:
import os
root_dir = '/path/to/your/test_dir'
original_dir = 'Original files'
processed_dir = 'Processed files'
for path, subdirs, files in os.walk(root_dir):
    # only look at files that live below an "Original files" folder
    if original_dir in path:
        for file in files:
            if file.endswith('.shp'):
                print('original dir: \t' + path)
                print('original file: \t' + path + os.path.sep + file)
                print('processed dir: \t' + os.path.sep.join(path.split(os.path.sep)[:-1]) + os.path.sep + processed_dir)
                print('processed file: ' + os.path.sep.join(path.split(os.path.sep)[:-1]) + os.path.sep + processed_dir + os.path.sep + file)
                print('')
I'd suggest using wildcards in a directory-crawling script only if you are REALLY sure what your directory tree looks like. I'd rather use the full names of the folders to search for, as in my script.
Update: Paths
Whenever you use paths, take care of your path separators - the slashes.
On Windows systems, the backslash is used for that:
C:\any\path\you\name
Most other systems use a normal, forward slash:
/the/path/you/want
In Python, a forward slash can be used directly, without any problem:
path_var = '/the/path/you/want'
...as opposed to backslashes. A backslash is a special character in Python strings. For example, it's used for the newline character: \n
To make clear that you don't want to use it as a special character, but as a backslash itself, you either have to "escape" it, using another backslash: '\\'. That makes a Windows path look like this:
path_var = 'C:\\any\\path\\you\\name'
...or you could mark the string as a "raw" string with a preceding r. Note that by doing that, you can't use escape sequences such as \n in that string anymore.
path_var = r'C:\any\path\you\name'
In your comment, you used the example root_dir = home + "\Test\\". The first backslash in this string is treated as the start of an escape sequence, so Python tries to make sense of the backslash and the following character: \T. That is not a recognized escape sequence, so Python keeps the backslash as it is (newer versions emit a warning), but \t would have been converted to a tab. Relying on that is fragile, so it is better to write the path unambiguously.
I'm wondering why your other example works. In "C:\Users\me\Test\\", the \U and \m are the same kind of problem; in Python 3, \U starts a Unicode escape, so this literal is even a syntax error. And you also mixed single and double backslashes.
That said...
When you take care of your OS path separators and try out new paths, also note that Python handles a lot of path-related things for you. For example, when a script reads a directory, as os.walk() does, the paths it returns on my Windows system already contain the right separators (displayed as double backslashes). There's no need for me to check that; it's usually only hardcoded strings where you have to take care.
In addition, the Python os.path module provides a lot of functions to handle paths, separators and so on. For example, os.path.sep (and os.sep, too) holds the correct separator for the system Python is running on. You can also build paths using os.path.join().
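As a sketch of what building paths with os.path can look like for the folder layout in the question: the helper below maps a file inside "Original files" to the matching location inside "Processed files". The function name processed_path_for and the example file name roads.shp are hypothetical.

```python
import os


def processed_path_for(original_file, processed_dir="Processed files"):
    """Map <parent>/Original files/x.shp to <parent>/<processed_dir>/x.shp."""
    original_folder = os.path.dirname(original_file)  # .../Original files
    parent_folder = os.path.dirname(original_folder)  # .../Folder 1
    file_name = os.path.basename(original_file)       # x.shp
    # os.path.join inserts the separator that is correct for this OS.
    return os.path.join(parent_folder, processed_dir, file_name)


source = os.path.join("Test", "Folder 1", "Original files", "roads.shp")
print(processed_path_for(source))
```

Because no separator is hardcoded, the same code should work with backslashes on Windows and forward slashes elsewhere.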
And finally: The home-directory
You use expanduser("~") to get the home path of the current user. That should work fine, but if you're using an old Python version, there could be a bug - see: expanduser("~") on Windows looks for HOME first
So check that the home path is resolved correctly, and then build your paths using the power of the os module :-)
Hope that helps!

batch search and replace strings in filenames with python

I am trying to write a small python script to rename a bunch of filenames by searching and replacing. For example:
Original filename:
MyMusic.Songname.Artist-mp3.iTunes.mp3
Intended result:
Songname.Artist.mp3
What I've got so far is:
#!/usr/bin/env python
from os import rename, listdir
mustgo = "MyMusic."
fnames = listdir('.')
for fname in fnames:
    if fname.startswith(mustgo):
        rename(fname, fname.replace(mustgo, '', 1))
(I got it from this site, as far as I can remember.)
Anyway, this will only get rid of the string at the beginning, but not of those inside the filename.
I would also like to use a separate file (e.g. badwords.txt) containing all the strings that should be searched for and removed, so that I can update them without having to edit the code.
Content of badwords.txt
MyMusic.
-mp3
-MP3
.iTunes
.itunes
I have been searching for quite some time now but haven't found anything. Would appreciate any help!
Thank you!

import re
import os
with open('badwords.txt', 'r') as f:
    # re.escape makes each badword match literally ('.' is not a wildcard)
    pat = '|'.join(re.escape(badword) for badword in
                   f.read().splitlines() if badword)
for fname in os.listdir('.'):
    new_fname = re.sub(pat, '', fname)
    if fname != new_fname:
        print('{o} --> {n}'.format(o=fname, n=new_fname))
        os.rename(fname, new_fname)
# MyMusic.Songname.Artist-mp3.iTunes.mp3 --> Songname.Artist.mp3
Note that it is possible for some files to be overwritten (and thus
lost) if two names get reduced to the same shortened name after
badwords have been removed. A set of new fnames could be kept and
checked before calling os.rename to prevent losing data through
name collisions.
re.escape takes a literal string and returns a regular expression
that matches exactly that text. It is used above to convert badwords
(e.g. '.iTunes') into regular expressions (e.g. r'\.iTunes').
fnmatch.translate, which converts shell-style patterns, is not
suitable here on current Python versions: the expression it returns
is anchored to match a whole name, so it cannot simply be joined
into a search pattern.
Your badwords list seems to indicate you want to ignore case. You
could ignore case by adding '(?i)' to the beginning of pat:
with open('badwords.txt', 'r') as f:
    pat = '(?i)' + '|'.join(re.escape(badword) for badword in
                            f.read().splitlines() if badword)
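The collision check mentioned in this answer could be sketched as follows. The function name plan_renames and its return shape are my own choices. The check is deliberately conservative: a target name counts as taken even if the file currently holding it is itself going to be renamed.

```python
def plan_renames(names, transform):
    """Return (renames, collisions) for applying `transform` to each name.

    A rename is only planned if its target is not already used by an
    existing name or by an earlier planned rename.
    """
    taken = set(names)
    renames, collisions = [], []
    for old in names:
        new = transform(old)
        if new == old:
            continue  # nothing to change for this name
        if new in taken:
            collisions.append((old, new))  # would overwrite something
            continue
        taken.add(new)
        renames.append((old, new))
    return renames, collisions
```

A caller would pass os.listdir('.') and a function such as lambda n: re.sub(pat, '', n), inspect the collisions, and only then call os.rename for each planned pair.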
