Printing File Names - python

I am very new to Python and have just installed Eric6. I want to search a folder (and all subdirectories) and print the filename of every file that has the extension .pdf. My code is below, but it fails with this error:
The debugged program raised the exception unhandled FileNotFoundError
"[WinError 3] The system can not find the path specified 'C:'"
File: C:\Users\pcuser\EricDocs\Test.py, Line: 6
And this is the code I want to run:
import os
results = []
testdir = "C:\Test"
for folder in testdir:
    for f in os.listdir(folder):
        if f.endswith('.pdf'):
            results.append(f)
print (results)

Use the glob module.
The glob module finds all the pathnames matching a specified pattern
import glob, os
parent_dir = 'path/to/dir'
for pdf_file in glob.glob(os.path.join(parent_dir, '*.pdf')):
    print (pdf_file)
This will work on Windows and *nix platforms.
Just make sure that the backslashes in your path are escaped on Windows; a raw string is a convenient way to do that.
In your case, that would be:
import glob, os
parent_dir = r"C:\Test"
for pdf_file in glob.glob(os.path.join(parent_dir, '*.pdf')):
    print (pdf_file)
For only a list of filenames (not full paths, as per your comment), you can use this one-liner:
results = [os.path.basename(f) for f in glob.glob(os.path.join(parent_dir, '*.pdf'))]
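Note that the pattern above only looks at the top level of parent_dir, while the question also asks for subdirectories. A minimal sketch of a recursive variant, assuming Python 3.5 or later (where glob.glob accepts recursive=True); the function name find_pdfs is just an illustrative choice:

```python
import glob
import os


def find_pdfs(parent_dir):
    """Return paths of all .pdf files under parent_dir, including subfolders."""
    # "**" matches any number of nested directories when recursive=True
    # (assumption: Python 3.5 or later).
    pattern = os.path.join(parent_dir, "**", "*.pdf")
    return sorted(glob.glob(pattern, recursive=True))
```

Calling find_pdfs(r"C:\Test") would then return full paths; wrap each one in os.path.basename if only the file names are wanted.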

Right now, you are looping over each character of the string stored in testdir,
so it tries to list folders named "C", ":", "\", "T" and so on. You'll also want to escape the backslash in the path, like "C:\\...\\...".
You probably want to use os.listdir(testdir) instead.

Try running your Python script from C:. From the Command Prompt, you might wanna do this:
> cd C:\
> python C:\Users\pcuser\EricDocs\Test.py
As pointed out by Tony Babarino, use r"C:\Test" instead of "C:\Test" in your code.

There are a few problems in your code. Take a look at how I've modified it below:
import os
results = []
testdir = "C:\\Test"
for f in os.listdir(testdir):
    if f.endswith('.pdf'):
        results.append(f)
print (results)
Note that I have escaped the backslash in your path and removed your first loop (for folder in testdir). That loop wasn't getting the folders as you expected; it was selecting one character of the path string at a time.
You will need to modify the code to make it look through subfolders, which it currently doesn't. Take a look at the glob module.

You will need to escape the backslash on Windows, and you can use os.walk to get all the pdf files, including those in subfolders.
import os
results = []
testdir = "C:\\Test"
for root, dirs, files in os.walk(testdir):
    for f in files:
        if f.endswith('.pdf'):
            results.append(f)
print (results)
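The loop above collects bare file names, so two files with the same name in different subfolders look identical in the result. If full paths are more useful, a small sketch along the same lines could look like this (walk_pdfs is a hypothetical helper name, and the case-insensitive match is my own addition):

```python
import os


def walk_pdfs(top):
    """Yield the full path of every .pdf file under top."""
    for root, dirs, files in os.walk(top):
        for name in files:
            # lower() so that "REPORT.PDF" is matched as well
            if name.lower().endswith(".pdf"):
                # root is the folder currently being visited
                yield os.path.join(root, name)
```

Because it is a generator, wrap it in list(...) when you need all results at once, for example list(walk_pdfs(r"C:\Test")).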

Your first for loop iterates over the string testdir and passes each character to os.listdir(folder), which does not make sense. Just remove that first loop and use the fnmatch function from the fnmatch module:
import os
from fnmatch import fnmatch
ext = '*.pdf'
results = []
testdir = r"C:\Test"
for f in os.listdir(testdir):
    if fnmatch(f, ext):
        results.append(f)
print (results)

Try testdir = r"C:\Test" instead of testdir = "C:\Test". In Python you have to escape special characters such as \. You can escape a backslash with another backslash, so it would be "C:\\Test". By using r"C:\Test", you are telling Python to use a raw string.
Also, the line for folder in testdir: doesn't make sense, because testdir is a string, so you are iterating over its characters.
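To see both points for yourself, here is a tiny sketch you can paste into an interpreter. It touches no files; it only shows what the for loop receives and that the two spellings of the path are the same string:

```python
testdir = r"C:\Test"

# Iterating over a string yields one character at a time,
# so the loop body receives "C", ":", "\" and so on.
chars = [ch for ch in testdir]
print(chars)   # ['C', ':', '\\', 'T', 'e', 's', 't']

# The raw string and the escaped form are the same value.
print(r"C:\Test" == "C:\\Test")   # True
```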

I had to list the names of the training images for my YOLO model.
Here's what I did to print the names of all the images I kept for training a YOLOv3 model:
import os
for root, dirs, files in os.walk("."):
    for filename in files:
        print(filename)
It prints all the file names from the current directory and its subdirectories.

Related

Python Path Double Backslash when opening file causing error

I'm trying to open a file like this:
with open(str(script_path) + '\\description.xml', 'w+') as file:
where script_path is equal to this:
script_path = os.path.dirname(os.path.realpath(__file__)) + '\\.tmp'
When I run this I get an error that there is no such file or directory because when it tries to open the file it sees the whole path as a string, including the escape strings. Is there any way around this?
Obviously .replace() won't work here as it won't replace the escape string. Hoping there is a clever way to do this within the os module?
Not really sure why you're adding two backslashes. You can simply create the path using a single forward slash (Linux based) or backslash (win). Something like this:
script_path = os.path.dirname(os.path.realpath(__file__)) + '/tmp/description.xml'
However, a better way to achieve this is to use os.path.join, as suggested by nomansland008.
>>> import os
>>> parent_dir = "xyz"
>>> dir = "foo"
>>> file_name = "bar.txt"
>>> os.path.join(parent_dir, dir, file_name)
'xyz/foo/bar.txt'
You won't have to worry about whether each string ends with a slash or not; join takes care of it.
In your case it can simply be:
os.path.join(os.path.dirname(os.path.realpath(__file__)), 'tmp', 'description.xml')
Should work, provided the files and directories exist.
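If the folder might not exist yet, open() will still fail even with a correctly joined path, because it does not create missing directories. A minimal sketch that creates the folder first, assuming the .tmp folder name from the question; write_description and its parameters are hypothetical names:

```python
import os


def write_description(base_dir, text):
    """Write text to <base_dir>/.tmp/description.xml and return the path."""
    tmp_dir = os.path.join(base_dir, ".tmp")
    # Create the folder first; open() does not create missing directories.
    os.makedirs(tmp_dir, exist_ok=True)
    target = os.path.join(tmp_dir, "description.xml")
    with open(target, "w") as fh:
        fh.write(text)
    return target
```

In the question's setting, base_dir would be os.path.dirname(os.path.realpath(__file__)).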

Trying to print name of all csv files within a given folder

I am trying to write a program in python that loops through data from various csv files within a folder. Right now I just want to see that the program can identify the files in a folder but I am unable to have my code print the file names in my folder. This is what I have so far, and I'm not sure what my problem is. Could it be the periods in the folder names in the file path?
import glob
path = "Users/Sarah/Documents/College/Lab/SEM EDS/1.28.20 CZTS hexane/*.csv"
for fname in glob.glob(path):
    print(fname)
No error messages are popping up but nothing will print. Does anyone know what I'm doing wrong?
Are you on a Linux-based system? If not, replace each / with \\.
Is the path you're giving the full path, starting from the root folder? You might need to specify a full path (drive included).
If that still fails, it may sound silly, but check that there actually are files in there, as your code otherwise seems fine.
The code below worked for me and listed the csv files correctly (note the C:\\ part, which could be what you're missing).
import glob
path = "C:\\Users\\xhattam\\Downloads\\TEST_FOLDER\\*.csv"
for fname in glob.glob(path):
    print(fname)
The following code gets a list of the entries in a folder and prints the name of each one that contains ".csv".
import os
path = r"C:\temp"
filesfolders = os.listdir(path)
for file in filesfolders:
    if ".csv" in file:
        print (file)
Note the indentation in my code. Be careful not to mix tabs and spaces, as these are not the same to Python.
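Alternatively, you could use os.listdir and compare the last four characters of each name:
import os
path = r"C:\temp"  # example folder; use your own
files_list = os.listdir(path)
out_list = []
for item in files_list:
    if item[-4:] == ".csv":
        out_list.append(item)
print(out_list)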
Are you sure you are using the correct path?
Try moving the Python script into the folder where the CSV files are, and then change it to this:
import glob
path = "./*.csv"
for fname in glob.glob(path):
    print(fname)
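One thing worth knowing here: glob.glob returns an empty list when the directory in the pattern does not exist, so a wrong or relative path gives exactly the "no error, nothing printed" symptom. The path in the question has no leading slash, so it is resolved relative to the current working directory. A rough sketch that makes this failure visible (list_csv is a hypothetical helper, written for Python 3):

```python
import glob
import os


def list_csv(folder):
    """Return the .csv paths in folder, or raise if folder is missing."""
    folder = os.path.abspath(folder)  # shows how a relative path was resolved
    if not os.path.isdir(folder):
        raise FileNotFoundError("No such directory: " + folder)
    return sorted(glob.glob(os.path.join(folder, "*.csv")))
```

The error message then contains the absolute path Python actually looked at, which usually makes the mistake obvious.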

Scan for all determined files inside a folder using python [duplicate]

I need to write Python code that scans a folder for all files with certain extensions, like .exe, .jpg, .pdf.
Just like the Linux command "ls | grep *.pdf".
I've tried using a list containing all the extensions I need and a regular expression to search for them inside the folder, but I don't know what to pass to re.search().
I don't want to use something like the "os" library because this script needs to work on Linux and Windows.
#!/usr/bin/python
import re
file_types = [".exe", ".jpg", ".pdf", ".png", ".txt"]
for line in file_types:
    # Do something like "ls | grep * + line"
    namefile = re.search(line, i_dont_know_what_to_put_here)
    print(namefile)
Update: Thanks, guys, for the help. I used the glob library and it works!
You can avoid the os module by using the glob module, which filters files by shell-style wildcard patterns (e.g. *.py), not regular expressions.
from glob import glob
file_types = [".exe", ".jpg", ".pdf", ".png", ".txt"]
path = "path/to/files/*{}"
fnames = [ fname for fnames in [[fname for fname in glob( path.format( ext ))] for ext in file_types] for fname in fnames]
It's hard to read, but its equivalent is:
from glob import glob
file_types = [".exe", ".jpg", ".pdf", ".png", ".txt"]
path = "path/to/files/*{}"
fnames = []
for ext in file_types:
    for fname in glob(path.format(ext)):
        fnames.append(fname)
EDIT: I'm not sure how this works cross platform as some other answers have considered.
EDIT2: glob may have unexpected side effects when used on Windows; see "Getting Every File in a Windows Directory".
Try os.listdir():
import os
file_types = ["exe", "jpg", "pdf", "png", "txt"]
files = [f for f in os.listdir('.') if os.path.isfile(f)]
# filter on file type
files = [f for f in files if f.split('.')[-1] in file_types]
In general, the os and os.path modules are going to be very useful to you here. You could use a regex, but unless performance is very important I wouldn't bother.
Adding to the other comments here: if you still wish to use re, the way to use it is:
re.search(<string to search for(regex)>, <string to search IN>)
So in your case, let's say you have filetype = ".pdf"; your code will be:
re.search(r".*\{}".format(filetype), filename)
where .* means "match any character 0 or more times", and the \ in front of ".pdf" means "where the name contains a literal .pdf" (the \ is an escape character, so the dot is not interpreted as a regex wildcard). You can also add a $ at the end of the regex to say "this is the end of the string".
And as mentioned here - os.listdir works perfectly fine for both Windows & Linux.
Hope that helps.
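If you go the regex route, escaping by hand only covers the leading dot. A small sketch using re.escape, which escapes every special character in the extension, plus the $ anchor mentioned above (has_extension is a hypothetical helper name):

```python
import re


def has_extension(filename, extension):
    """Return True if filename ends with extension (for example ".pdf")."""
    # re.escape() makes the dot literal; "$" anchors the match at the end.
    pattern = re.escape(extension) + "$"
    return re.search(pattern, filename) is not None
```

For this particular job filename.endswith(extension) does the same thing with less machinery, so the regex is mostly worth it when the pattern gets more complicated.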
My suggestion (it will work on all OS - Windows, Linux and macOS):
import os
file_types = [".exe", ".jpg", ".pdf", ".png", ".txt"]
files = [entry.path for entry in os.scandir('.') if entry.is_file() and os.path.splitext(entry.name)[1] in file_types]
or (if you want just filenames instead of full paths):
files = [entry.name for entry in os.scandir('.') if entry.is_file() and os.path.splitext(entry.name)[1] in file_types]

how can I save the output of a search for files matching *.txt to a variable?

I'm fairly new to Python. I'd like to save the text that is printed by this script in a variable. (The variable is meant to be written to a file later, if that matters.) How can I do that?
import fnmatch
import os
for file in os.listdir("/Users/x/y"):
    if fnmatch.fnmatch(file, '*.txt'):
        print(file)
You can store it in a variable like this:
import fnmatch
import os
for file in os.listdir("/Users/x/y"):
    if fnmatch.fnmatch(file, '*.txt'):
        print(file)
        my_var = file  # holds only the most recent match
        # do your stuff
Or you can store it in a list for later use:
import fnmatch
import os
my_match = []
for file in os.listdir("/Users/x/y"):
    if fnmatch.fnmatch(file, '*.txt'):
        print(file)
        my_match.append(file)  # append inserts the value at the end of the list
# do stuff with my_match list
You can store it in a list:
import fnmatch
import os
matches = []
for file in os.listdir("/Users/x/y"):
    if fnmatch.fnmatch(file, '*.txt'):
        matches.append(file)
Both answers already provided are correct, but Python provides a nice alternative. Since iterating through an array and appending to a list is such a common pattern, the list comprehension was created as a one-stop shop for the process.
import fnmatch
import os
matches = [filename for filename in os.listdir("/Users/x/y") if fnmatch.fnmatch(filename, "*.txt")]
While NSU's answer and the others are all perfectly good, there may be a simpler way to get what you want.
Just as fnmatch tests whether a certain file matches a shell-style wildcard, glob lists all files matching a shell-style wildcard. In fact:
This is done by using the os.listdir() and fnmatch.fnmatch() functions in concert…
So, you can do this:
import glob
matches = glob.glob("/Users/x/y/*.txt")
But notice that in this case, you're going to get full pathnames like '/Users/x/y/spam.txt' rather than just 'spam.txt', which may not be what you want. Often, it's easier to keep the full pathnames around and os.path.basename them when you want to display them, than to keep just the base names around and os.path.join them when you want to open them… but "often" isn't "always".
Also notice that I had to manually paste the "/Users/x/y/" and "*.txt" together into a single string, the way you would at the command line. That's fine here, but if, say, the first one came from a variable, rather than hardcoded into the source, you'd have to use os.path.join(basepath, "*.txt"), which isn't quite as nice.
By the way, if you're using Python 3.4 or later, you can get the same thing out of the higher-level pathlib library:
import pathlib
matches = list(pathlib.Path("/Users/x/y/").glob("*.txt"))
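Since the question mentions writing the result to a file later, here is a rough sketch that does both steps with pathlib, assuming Python 3.5 or later for Path.write_text. The function name save_matches and the one-name-per-line output format are my own choices, not something the question specifies:

```python
import pathlib


def save_matches(folder, pattern, out_file):
    """Write the names of files in folder matching pattern to out_file."""
    # p.name is the bare file name, without the directory part
    names = sorted(p.name for p in pathlib.Path(folder).glob(pattern))
    # One name per line; the list itself is also returned for later use.
    pathlib.Path(out_file).write_text("\n".join(names))
    return names
```

For example, save_matches("/Users/x/y", "*.txt", "matches.list") would return the list and leave the same names in matches.list.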
Maybe defining a utility function is the right path to follow...
def list_ext_in_dir(e, d):
    """e=extension pattern, d=directory => list of matching filenames.
    If the directory d cannot be listed returns None."""
    from fnmatch import fnmatch
    from os import listdir
    try:
        dirlist = listdir(d)
    except OSError:
        return None
    return [fname for fname in dirlist if fnmatch(fname, e)]
I have put the listdir call inside a try/except clause to catch the possibility that we cannot list the directory (non-existent, no read permission, etc.). The treatment of errors is a bit simplistic, but...
The list of matching filenames is built using a so-called list comprehension, which is something you should investigate as soon as possible if you're going to use Python for your programs.
To close my post, a usage example:
l_txtfiles = list_ext_in_dir('*.txt', '/Users/x/y')

Python filename change

I have a number of videos in a directory on my Mac that all have a specific string in the file name that I want to remove, but I want to keep the rest of the file name as it is. I'm running this Python script from the terminal.
I have this code, but it doesn't seem to work. Is it practical to use the following? It seems too simple to be the best way to do this sort of thing, which is why I don't think it works.
from os import rename, listdir
text = "Some text I want to remove from file name"
files = listdir("/Users/Admin/Desktop/Dir_of_videos/")
for x in files:
    if text in files:
        os.rename(files, files.replace(text, ""))
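The problem is that listdir returns only the file names in the directory, without the directory path in front of them, so you have to join each name with the directory before renaming. Your loop also uses files (the whole list) where it should use x, and it calls os.rename although only rename and listdir were imported.
This should do the job: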
import os
in_dir = './test'
remove = 'hello'
paths = [os.path.join(in_dir, file) for file in os.listdir(in_dir) if remove in file]
for file in paths:
    os.rename(file, file.replace(remove, ""))
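One caveat with calling replace on the joined path: if the text to remove also appears in the directory name, it gets stripped from that part too and the rename fails. A minimal sketch that only touches the file name (strip_from_names is a hypothetical helper; it does not handle name collisions or a name that becomes empty):

```python
import os


def strip_from_names(in_dir, remove):
    """Remove the substring `remove` from file names inside in_dir."""
    renamed = []
    for name in os.listdir(in_dir):
        if remove in name:
            new_name = name.replace(remove, "")
            # Join with in_dir on both sides so only the file name changes.
            os.rename(os.path.join(in_dir, name), os.path.join(in_dir, new_name))
            renamed.append(new_name)
    return sorted(renamed)
```

It returns the new names, which makes it easy to check what was changed before running it on the real video folder.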
