os.listdir() is giving larger lists than expected - python

My book states that:
Calling os.listdir(path) will return a list of filename strings for each file in the path argument.
I tried to get the files inside a folder which is placed on the desktop and it worked perfectly fine. Then I tried to get the files in the root folder '/' and it's giving weird results.
My root folder has 5 items, which include Applications, Library, Users etc., but os.listdir('/') gives me a list of some 20-25 items, some of which are Applications, Library, Users, .DS_Store, Trashes, .dbfseventsd, .Spotlight-V100 etc. Note that items such as .DS_Store and .Spotlight-V100 do not appear in the root folder when I open it manually.
Why is this happening and what should I do?

Your root folder includes hidden directories or files. These begin with a ., and are not seen by default in the Finder or ls. However, os.listdir returns them as well.
If you want to ignore these files, you may use:
files = [x for x in os.listdir('/') if not x.startswith('.')]
As an extra, it is useful to know how to view these hidden files on OSX. To see them in Finder:
Open Finder
Go to your Macintosh HD folder (access this from Devices in the left column)
Hold down CMD-Shift-. (dot)
To see them in your terminal, run ls -a /path/to/dir.

Related

os.walk isn't showing all the files in the given path

I'm trying to make my own backup program. To do so, I need to be able to give it a directory and get every file, however deep in the subdirectories, so that I can copy them. I tried making a script, but it doesn't give me all the files in that directory. I used my Documents folder as a test: my list has 3600 items, but there should be 17000 files. Why isn't os.walk showing everything?
import os
data = []
for mdir, dirs, files in os.walk('C:/Users/Name/Documents'):
    data.append(files)
print(data)
print(len(data))
Use data.extend(files) instead of data.append(files).
files is a list of files in a directory. It looks like ["a.txt", "b.html"] and so on. If you use append, you end up with data looking like
[..., ["a.txt", "b.html"]]
whereas I suspect you're after
[..., "a.txt", "b.html"]
Using extend will provide the second behaviour.
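A small sketch of the difference, using a throwaway directory tree built with tempfile so that it runs anywhere. The file names and the collect_files helper are made up for illustration; joining each name with its directory is an extra step a backup script would probably want, not something the question's code did.

```python
import os
import tempfile


def collect_files(top):
    """Return a flat list of full paths for every file under top."""
    found = []
    for current_dir, dirs, files in os.walk(top):
        # extend adds each path individually; append would add one list per directory
        found.extend(os.path.join(current_dir, name) for name in files)
    return found


# Build a tiny tree: top/a.txt, top/sub/b.html, top/sub/c.txt
with tempfile.TemporaryDirectory() as top:
    os.mkdir(os.path.join(top, "sub"))
    for relative in ("a.txt", os.path.join("sub", "b.html"), os.path.join("sub", "c.txt")):
        open(os.path.join(top, relative), "w").close()

    nested = []
    for current_dir, dirs, files in os.walk(top):
        nested.append(files)  # one entry per directory visited, not per file

    print(len(nested))              # counts directories
    print(len(collect_files(top)))  # counts files
```

With append, len(data) counts the directories that os.walk visited, which is why the number looked too small.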

FileNotFound Error: [Errno 2] No such file or directory b when iterating through list of files in Python on Windows?

I am getting a FileNotFound error when iterating through a list of files in Python on Windows.
The specific error I get looks like:
FileNotFoundError: File b'fileName.csv' does not exist
In my code, I first ask for input on where the file is located and generate a list using os (though I also tried glob):
directory = input('In what directory are your files located?')
fileList = [s for s in os.listdir(directory) if s.endswith('.csv')]
When I print the list, it does not contain byte b before any strings, as expected (but I still checked). My code seems to break at this step, which generates the error:
for file in fileList:
    pd.read_csv(file) # breaks at this line
I have tried everything I could find on Stack Overflow to solve this problem. That includes:
Putting an r or b or rb right before the path string
Using both the relative and absolute file paths
Trying different variations of path separators (/, \, \\, etc.)
I've been dealing with Windows-related issues lately (since I normally work in Mac or Linux) so that was my first suspicion. I'd love another set of eyes to help me figure out where the snag is.
The list was generated correctly because os.listdir was given the full directory path, but the working directory while the script ran was still . (the directory it was started from). You can verify this by running os.getcwd(). os.listdir returns bare file names with no directory attached, so when the loop later passed those names to pd.read_csv, they were resolved against the current directory, where the files do not exist.
The easiest fix here is to change the working directory right after reading the input. So,
directory = input('In what directory are your files located?')
os.chdir(directory) # Make the given directory the working directory so bare file names resolve
If you need to access multiple directories, you could either do this right before you switch to each directory or save the full file path into your list instead.
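The second option, saving the full file path into the list, might look like the sketch below. list_csv_paths is a made-up helper name, and sorting the names is only there to make the order predictable.

```python
import os


def list_csv_paths(directory):
    """Return full paths of the .csv files directly inside directory."""
    return [
        os.path.join(directory, name)  # attach the directory to each bare name
        for name in sorted(os.listdir(directory))
        if name.endswith(".csv")
    ]
```

Each entry can then be passed to pd.read_csv as-is, whatever the current working directory happens to be.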

Dealing with OS Error Python: [Errno 20] Not a directory: '/Volumes/ARLO/ADNI/.DS_Store'

I am trying to write a piece of code that will recursively iterate through the subdirectories of a specific directory and stop only when reaching files with a '.nii' extension, appending these files to a list called images - a form of a breadth first search. Whenever I run this code, however, I keep receiving [Errno 20] Not a directory: '/Volumes/ARLO/ADNI/.DS_Store'
*/Volumes/ARLO/ADNI is the folder I wish to traverse through
*I am doing this in Mac using the Spyder IDE from Anaconda because it is the only way I can use the numpy and nibabel libraries, which will become important later
*I have already checked that this folder directly contains only other folders and not files
#preprocessing all the MCIc files
import os
#import nibabel as nib
#import numpy as np
def pwd():
    cmd = 'pwd'
    os.system(cmd)
    print(os.getcwd())
#Part 1
os.chdir('/Volumes/ARLO')
images = [] #creating an empty list to store MRI images
os.chdir('/Volumes/ARLO/ADNI')
list_sample = [] #just an empty list for an earlier version of
#the program
#Part 2
#function to recursively iterate through folder of raw MRI
#images and extract them into a list
#breadth first search
def extract(dir):
    #dir = dir.replace('.DS_Store', '')
    lyst = os.listdir(dir) #DS issue
    for item in lyst:
        if 'nii' not in item: #if item is not a .nii file, if
            #item is another folder
            newpath = dir + '/' + item
            #os.chdir(newpath) #DS issue
            extract(newpath)
        else: #if item is the desired file type, append it to
            #the list images
            images.append(item)
#Part 3
adni = os.getcwd() #big folder I want to traverse
#print(adni) #adni is a string containing the path to the ADNI
#folder w/ all the images
#print(os.listdir(adni)) this also works, prints the actual list
"""adni = adni + '/' + '005_S_0222'
os.chdir(adni)
print(os.listdir(adni))""" #one iteration of the recursion,
#works
extract(adni)
print(images)
With every iteration, I wish to traverse further into the nested folders by appending the folder name to the growing path, and part 3 of the code works, i.e. I know that a single iteration works. Why does os keep adding the '.DS_Store' part to my directories in the extract() function? How can I correct my code so that the breadth first traversal can work? This folder contains hundreds of MRI images, I cannot do it without automation.
Thank you.
The .DS_Store files are not being created by the os module, but by the Finder (or, I think, sometimes Spotlight). They're where macOS stores things like the view options and icon layout for each directory on your system.
And they've probably always been there. The reason you didn't see them when you looked is that files that start with a . are "hidden by convention" on Unix, including macOS. Finder won't show them unless you ask it to show hidden files; ls won't show them unless you pass the -a flag; etc.
So, that's your core problem:
I have already checked that this folder directly contains only other folders and not files
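… is wrong. The folder does contain at least one regular file: .DS_Store. Its name does not contain 'nii', so extract() treats it as a folder and calls os.listdir on it, which fails with Errno 20.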
So, what can you do about that?
You could add special handling for .DS_Store.
But a better solution is probably to just check each file to see if it's a file or directory, by calling os.path.isdir on it.
Or, even better, use os.scandir instead of listdir, which gives you entries with more information than just the name, so you don't need to make extra calls like isdir.
Or, best of all, just throw out this code and use os.walk to recursively visit every file in every directory underneath your top-level directory.
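A minimal sketch of the os.walk approach, assuming the images are identified by a name ending in .nii (the original test, 'nii' in item, would also match other names such as compressed .nii.gz files, so adjust the check if you need those). find_nii_files is a made-up helper name.

```python
import os


def find_nii_files(top):
    """Return full paths of every file under top whose name ends in .nii."""
    images = []
    for current_dir, dirs, files in os.walk(top):
        for name in files:
            # .DS_Store and other non-image files simply fail this check
            if name.endswith(".nii"):
                images.append(os.path.join(current_dir, name))
    return images
```

Calling find_nii_files('/Volumes/ARLO/ADNI') would then replace extract(adni). Because os.walk only ever descends into real directories, .DS_Store is never passed to os.listdir.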

Python batch-rename script sending files to root folder

Ok, this is weird and maybe awkward.
I made a script so I could change the end of subtitles files to keep consistency.
Basically it renames A.X.str to A.Y.str. It worked flawlessly in a single folder.
I then decided to make a recursive version of it so I could run it on any folder I had, regardless of whether the episodes were together, separated by season, or each in an individual path.
I really don't know how or why, but it sent all the files it reached to the root folder I was using until it halted raising a FileExistsError.
The code bit I'm using is:
def rewrite(folder, old, new):
    for f in next(os.walk(folder))[2]:
        os.rename(os.path.join(folder, f),
                  os.path.join(path, f.replace(old, new)))
    for f in next(os.walk(folder))[1]:
        x = os.path.join(folder, f)
        rewrite(x, old, new)
Where 'old' is "A.X.str", 'new' is "A.Y.str" and folder is the full path of the root folder "C:\Series\Serie Name".
Why doesn't this work recursively? The first bit of code (the first for loop) works fine on its own in a single folder.
Is the problem with the "next" I use to get the names of files and folders?
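The code you are showing us uses a path variable in the rename destination; that should be the folder variable instead. path is not defined inside rewrite, so it presumably refers to a global holding the root folder, which would explain why every file ended up there.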
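One way the corrected function could look, as a sketch rather than a drop-in replacement: it lets os.walk handle the recursion instead of calling next() twice per folder, and it skips names that do not contain old. The signature is kept from the question; everything else is an assumption about what the script is meant to do.

```python
import os


def rewrite(folder, old, new):
    """Rename files under folder, replacing old with new in each file name."""
    for current_dir, dirs, files in os.walk(folder):
        for name in files:
            if old in name:
                os.rename(
                    os.path.join(current_dir, name),
                    # the destination stays in the directory the file was found in
                    os.path.join(current_dir, name.replace(old, new)),
                )
```

On Windows, os.rename still raises FileExistsError if the destination name already exists in the same folder.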

How to check if files exist in My Network Places directory

I want to do something like this.
import os
for root, dirs, files in os.walk("/mydir"):
    print(files)
However, the directory I want to check is under My Network Places, and the path is "\blue01\syng\getem\BY", and Python says that there is no such directory. Why can't Python see this directory? Does it have something to do with the fact that it is under My Network Places?
Let's say, for example, that the Python file is located on your desktop. Its path will be something like this:
c:\users\yourname\desktop\python_file.py
and your target is located at
\blue01\syng\getem\BY\mydir
Inside python_file.py, a path like /mydir is not looked up under My Network Places. A leading slash means the root of the current drive (for example C:\mydir), and a relative path such as mydir is resolved against the current working directory (here probably the desktop).
Since c:\users\yourname\desktop\ != \blue01\syng\getem\BY,
Python has no idea which directory you are talking about.
Open an Explorer window, browse to your target directory, and copy the path from the address bar.
On Windows, if it's in your network places, the entire path is going to look something like:
\\someIPaddress_or_domain\some_form_of_directory_structure\blue01\syng\getem\BY\mydir
Paste that into Python, put it in double quotes, and put an r in front of it (so the backslashes do not need escaping).
for root, dirs, files in os.walk(r"\\someIPaddress_or_domain\some_form_of_directory_structure\blue01\syng\getem\BY\mydir"):
So TL;DR put exact paths.
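A short sketch of the same idea in Python 3, with an explicit check up front, because os.walk by default yields nothing at all when it cannot open the top directory. The \\server\share\mydir path and the list_files helper are placeholders, not the real share.

```python
import os


def list_files(top):
    """Return every file path under top, or raise if top is not a reachable directory."""
    if not os.path.isdir(top):
        raise FileNotFoundError("cannot see directory: " + top)
    paths = []
    for root, dirs, files in os.walk(top):
        paths.extend(os.path.join(root, name) for name in files)
    return paths


# Hypothetical UNC path: the r prefix keeps both leading backslashes intact
share = r"\\server\share\mydir"
print(share)
```

Calling list_files(share) then either returns the files or fails loudly, which makes a mistyped or unreachable network path easier to spot.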
