How to fix 'NotADirectoryError' in Python on Mac - python

I'm trying to walk through all the files on my computer and create a list of files larger than 100MB. However, when walking through the files, I keep getting a 'NotADirectoryError.' The way I understand my code, it should be looking at the base files in the second for loop, so I don't know why the 'NotADirectoryError' keeps getting raised.
I'm on a Mac using Python 3.7.4. I've tried adding an exception for 'NotADirectoryError', but it just gave me a 'FileNotFoundError,' so I figured adding exception upon exception wasn't the best way to solve this.
I think I'm misunderstanding how either os.walk or os.path.getsize work, but after reviewing the documentation for both I'm still confused about what I'm doing wrong.
big_dict= {}
folder = os.path.abspath('/Users/')
#walking thru all the folders on my computer
for folders, subfolders, filenames in os.walk(folder):
    #for each file that's bigger than 100 MB, add it to the 'big_dict'
    for filename in filenames:
        filename = os.path.join(folder, filename) +'/'
        file_size = os.path.getsize(filename)  # this line always gives the error
        #convert size to MB, add to big_dict as value with filename key
        if file_size > 1000000:
            big_dict[filename] = str(file_size/10000) + ' MB'
Traceback (most recent call last):
  File "/Users/jonbauer/size_check.py", line 17, in <module>
    file_size = os.path.getsize(filename)
  File "/Applications/Thonny.app/Contents/Frameworks/Python.framework/Versions/3.7/Resources/Python.app/Contents/MacOS/../../../../../../../Python.framework/Versions/3.7/lib/python3.7/genericpath.py", line 50, in getsize
    return os.stat(filename).st_size
NotADirectoryError: [Errno 20] Not a directory: '/Users/.localized/'
This is the error message I get when running the code. As I said before, I'm trying to walk through all my files and add larger files to a list, but I keep encountering this error.

import os
big_dict= {}
folder = os.path.abspath('/Users')
for root, _, filenames in os.walk(folder):
    for filename in filenames:
        # join with the directory currently being walked, not the base folder
        path = os.path.join(root, filename)
        if os.path.islink(path):
            continue
        file_size_mb = os.path.getsize(path) >> 20  # bytes -> whole MiB
        if file_size_mb > 100:
            file_size_mb_str = '{} MB'.format(file_size_mb)
            print(path, file_size_mb_str)
            big_dict[path] = file_size_mb_str
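There are several problems in your code:
filename = os.path.join(folder, filename) +'/' is incorrect, because folder is the base folder ('/Users/' in your case), not the folder currently being walked. The trailing '/' also makes the OS treat the file as a directory, which is what raises NotADirectoryError.
You need to skip links, because getsize() follows them and raises an error when the target is missing.
file_size > 1000000 tests for about 1 MB, not 100 MB: os.path.getsize() returns bytes.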
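The three points above can be folded into one small function. This is only a sketch: the name find_large_files and the min_bytes parameter are made up for illustration, and files that disappear or cannot be read during the walk are simply skipped.

```python
import os


def find_large_files(top, min_bytes=100 * 1024 * 1024):
    """Return {path: size_in_bytes} for files under top of at least min_bytes."""
    large = {}
    for root, _dirs, filenames in os.walk(top):
        for name in filenames:
            # Join with the directory currently being walked, not with top.
            path = os.path.join(root, name)
            if os.path.islink(path):
                continue  # getsize() follows links; a dangling one would raise
            try:
                size = os.path.getsize(path)
            except OSError:
                # File vanished or is unreadable: skip it instead of aborting.
                continue
            if size >= min_bytes:
                large[path] = size
    return large
```

Calling find_large_files('/Users') would use the 100 MiB default; keeping the sizes as integers leaves the formatting ('{} MB' and so on) to the caller.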

Related

Print file data during a for loop

I have to ask this question again. I'm a newbie in Python and I can't resolve this.
I'm trying to print the filename and the file size inside this for loop each time a file is moved to the new folder. I know the os.path.getsize(path) and os.stat(path).st_size methods, but which path should I pass if the file changes on every iteration?
This code has no errors, but I don't know how to print the data during the loop as the file changes.
src = ("C:\\..\\files")
dest1 = ("C:\\..\\files\\images")
files = os.listdir(src) #files is a list of files in a folder
for file in files:
    if file.endswith(".jpg") or file.endswith(".png") or file.endswith(".jpeg"):
        if not os.path.exists(dest1):
            os.mkdir(dest1)
        shutil.move(src + "/" + file, dest1) #for every file that is moved I have to print filesize and filename.
        #print(??)
IIUC, try:
import csv
import os
import shutil
src = ("C:/Users/username/")
# newline="" keeps the csv module from writing blank rows on Windows
with open("recap.csv", "w", newline="") as recap_file:
    csv_writer = csv.writer(recap_file, delimiter="\t")
    for file in os.listdir(src):
        if file.endswith(".jpg") or file.endswith(".png") or file.endswith(".jpeg"):
            dest = f"{src}/images"
        elif file.endswith(".odt") or file.endswith(".txt"):
            dest = f"{src}/docs"
        elif file.endswith(".mp3"):
            dest = f"{src}/audios"
        else:
            continue
        # read the size before the move, while the source path is still valid
        size = os.path.getsize(f"{src}/{file}")
        print(f"File name: {src}/{file}; Size: {size}")
        csv_writer.writerow([f"{src}/{file}", size])
        if not os.path.exists(dest):
            os.mkdir(dest)
        shutil.move(f"{src}/{file}", dest)
Moving the file should generally not change the file size [1], so you can check the file's size before performing the move.
On the other hand, since you know where you are moving the file, you could also determine the file's size after the move. Moving a file from src + "/" + file to dest1 means that the file should be at dest1 + "/" + file after the operation [2]. You can use this not only to print the new path of the file, but also to determine the file's size at the new location.
[1] (NB: the reported size can change when moving from one file system to another if the two use different block sizes and the size is given as a multiple of the block size. However, this should not be a problem in this case.)
[2] Moving the file could fail for a number of reasons (the file was deleted in the meantime, there is already a (write-protected) file with that name at the destination, no space is left at the destination, ...). You should add some error handling for such cases, e.g. with a try ... except block around the shutil.move operation.
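A rough sketch of that advice, with the size read before the move and a try ... except around shutil.move. The helper name move_and_report and its return value are assumptions made for this example, not part of the original code.

```python
import os
import shutil


def move_and_report(src_path, dest_dir):
    """Move src_path into dest_dir; return (new_path, size) or None on failure."""
    try:
        size = os.path.getsize(src_path)      # measure before the move
        os.makedirs(dest_dir, exist_ok=True)  # no-op if the folder exists
        # shutil.move returns the final path of the moved file.
        new_path = shutil.move(src_path, dest_dir)
    except (OSError, shutil.Error) as exc:
        # Covers a missing source, an existing destination file, a full disk...
        print(f"Could not move {src_path}: {exc}")
        return None
    print(f"Moved {os.path.basename(new_path)} ({size} bytes) to {new_path}")
    return new_path, size
```

Inside the loop from the question this would be called as move_and_report(src + "/" + file, dest1).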

For loop throwing exception on file already deleted during os walk

I'm writing a script that walks through a directory, looks for files with the typical Windows installer extensions, and deletes them. When I run this with a list of extensions (rather than, say, checking only for .msi or .exe), it breaks when going through the nested loop again. It seems as if it runs through my list, deletes one type of extension, then runs through the loop again, attempts to find the same extension, and throws an exception. Here is the output when I simply print the file names without removing anything:
> C:\Users\User\Documents\Python Scripts>python test.py < test_run.txt
> Found directory: .
> Found directory: .\test_files
> Deleting test.cub
> Deleting test.idt
> Deleting test.idt
> Deleting test.msi
> Deleting test.msm
> Deleting test.msp
> Deleting test.mst
> Deleting test.pcp
> Deleting test1.exe
When I attempt to run it with os.remove it gives the following:
Found directory: .
Found directory: .\test_files
Deleting test.cub
Traceback (most recent call last):
  File "test.py", line 13, in <module>
    os.remove(fileName)
FileNotFoundError: [WinError 2] The system cannot find the file specified: 'test.cub'
I read up on os.walk and that seems to be working properly, but I can't figure out where this script is going wrong. The code is below:
import os
myList = [".msi", ".msm", ".msp", ".mst", ".idt", ".idt", ".cub", ".pcp", ".exe"]
rootDir = '.'
for dirName, subdrList, fileList in os.walk(rootDir):
    print('Found directory: %s' %dirName)
    for fileName in fileList:
        for extName in myList:
            if(fileName.endswith(extName)):
                print('\t Deleting %s' % fileName)
                os.remove(fileName)
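The correct relative name of the file test.cub is .\test_files\test.cub.
The name you are supplying is the bare test.cub, which resolves against the current directory, i.e. .\test.cub. os.walk yields file names without their directory, so join each name with dirName before removing it: os.remove(os.path.join(dirName, fileName)).
As it says in the os.walk documentation: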
To get a full path (which begins with top) to a file or directory in
dirpath, do os.path.join(dirpath, name).
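Applied to the deletion script, the join could look like the sketch below. The function name remove_by_extension and the dry_run flag are invented for this example; the extension tuple is the question's list with the duplicate ".idt" dropped.

```python
import os

INSTALLER_EXTENSIONS = (".msi", ".msm", ".msp", ".mst", ".idt", ".cub", ".pcp", ".exe")


def remove_by_extension(top, extensions=INSTALLER_EXTENSIONS, dry_run=True):
    """Return the matching paths under top; delete them unless dry_run is set."""
    matched = []
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            # endswith() accepts a tuple, so no inner loop over extensions.
            if name.lower().endswith(tuple(extensions)):
                path = os.path.join(dirpath, name)  # full path, as the docs say
                matched.append(path)
                if not dry_run:
                    os.remove(path)
    return matched
```

Testing each file once against the whole tuple also avoids a second os.remove attempt on a file that has already been deleted, which is what a duplicated extension in the list would otherwise trigger.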

Opening all files in a directory - Python [duplicate]

I found some example code for ClamAV. It works fine, but it only scans a single file. Here's the code:
import pyclamav
import os
tmpfile = '/home/user/test.txt'
f = open(tmpfile, 'rb')
infected, name = pyclamav.scanfile(tmpfile)
if infected:
    print "File infected with %s Deleting file." %name
    os.unlink(file)
else:
    print "File is clean!"
I'm trying to scan an entire directory, here's my attempt:
import pyclamav
import os
directory = '/home/user/'
for filename in os.listdir(directory):
    f = open(filename, 'rb')
    infected, name = pyclamav.scanfile(filename)
    if infected:
        print "File infected with %s ... Deleting file." %name
        os.unlink(filename)
    else:
        print " %s is clean!" %filename
However, I'm getting the following error:
Traceback (most recent call last):
  File "anti.py", line 7, in <module>
    f = open(filename, 'rb')
IOError: [Errno 21] Is a directory: 'Public'
I'm pretty new to Python, and I've read several similar questions and they do something like what I did, I think.
os.listdir("DIRECTORY") returns a list of all files and directories in DIRECTORY. These are bare names, not absolute paths, so if you run the program from a different directory it is bound to fail.
If you are sure that everything in the directory is a file, with no subdirectories, you can try the following:
def get_abs_names(path):
    for file_name in os.listdir(path):
        yield os.path.join(path, file_name)
Then,
for file_name in get_abs_names("/home/user/"):
    #Your code goes here.
The following code will go over your whole directory, file by file. Your error happens because you try to open a directory as if it were a file, instead of entering the directory and opening the files inside.
for subdir, dirs, files in os.walk(path): # walks through whole directory
    for file in files:
        filepath = os.path.join(subdir, file) # path to the file
        #your code here
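If only the top level of the directory should be scanned and it may contain subdirectories (like 'Public' in the traceback), filtering for regular files avoids the error. A minimal sketch, assuming a hypothetical helper name iter_regular_files and Python 3.5 or newer for os.scandir:

```python
import os


def iter_regular_files(directory):
    """Yield the full path of each regular file directly inside directory."""
    with os.scandir(directory) as entries:
        for entry in entries:
            # is_file() is False for subdirectories, so they are never opened.
            if entry.is_file():
                yield entry.path  # already joined with directory
```

Each yielded path can then be passed to pyclamav.scanfile() in place of the bare file name.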

Cannot find the file specified when batch renaming files in a single directory

I've created a simple script to rename my media files that have lots of weird periods and stuff in them that I have obtained and want to organize further. My script kinda works, and I will be editing it to edit the filenames further but my os.rename line throws this error:
[Windows Error: Error 2: The system cannot find the file specified.]
import os
for filename in os.listdir(directory):
    fcount = filename.count('.') - 1 #to keep the period for the file extension
    newname = filename.replace('.', ' ', fcount)
    os.rename(filename, newname)
Does anyone know why this might be? I have a feeling that it doesn't like me trying to rename the file without including the file path?
Try giving both names their directory:
os.rename(os.path.join(directory, filename), os.path.join(directory, newname))
Triton Man has already answered your question. If his answer doesn't work, I would try using absolute paths instead of relative paths.
I've done something similar before, but to keep name clashes from happening I temporarily moved all the files to a subfolder. The entire process happened so fast that I never saw the subfolder get created in Windows Explorer.
If you're interested, here's the script I used to rename .jpg files to multiples of 10. Run it from the command line and pass the directory of the jpg files you want renamed as a command-line argument.
'''renames pictures to multiples of ten'''
import sys, os
debug=False
try:
    path = sys.argv[1]
except IndexError:
    path = os.getcwd()
def toint(string):
    '''changes a string to a numerical representation
    string must only contain characters with an ordinal value between 0 and 899'''
    string = str(string)
    ret=''
    for i in string:
        ret += str(ord(i)+100) #we add 100 to make all the numbers 3 digits, making it easy to separate the numbers back out when we need to undo this operation
    assert len(ret) == 3 * len(string), 'received an invalid character. Characters must have an ordinal value between 0-899'
    return int(ret)
def compare_key(file):
    file = file.lower().replace('.jpg', '').replace('dscf', '')
    try:
        return int(file)
    except ValueError:
        return toint(file)
#files are temporarily placed in a folder
#to prevent clashing filenames
i = 0
files = os.listdir(path)
files = (f for f in files if f.lower().endswith('.jpg'))
files = sorted(files, key=compare_key)
for file in files:
    i += 10
    if debug: print('renaming %s to %s.jpg' % (file, i))
    #join with path so the script also works when run from another directory
    os.renames(os.path.join(path, file), os.path.join(path, 'renaming', '%s.jpg' % i))
for root, __, files in os.walk(path + '/renaming'):
    for file in files:
        if debug: print('moving %s to %s' % (root+'/'+file, path+'/'+file))
        os.renames(root+'/'+file, path+'/'+file)
Edit: I got rid of all the jpg fluff. You could use this code to rename your files. Just change the rename_file function to get rid of the extra dots. I haven't tested this code so there is a possibility that it might not work.
import sys, os
path = sys.argv[1]
def rename_file(file):
    return file
#files are temporarily placed in a folder
#to prevent clashing filenames
files = os.listdir(path)
for file in files:
    #join with path so the script also works when run from another directory
    os.renames(os.path.join(path, file), os.path.join(path, 'renaming', rename_file(file)))
for root, __, files in os.walk(path + '/renaming'):
    for file in files:
        os.renames(root+'/'+file, path+'/'+file)
Looks like I just needed to set the default directory and it worked just fine.
folder = r"blah\blah\blah"
os.chdir(folder)
for filename in os.listdir(folder):
    fcount = filename.count('.') - 1
    newname = filename.replace('.', ' ', fcount)
    os.rename(filename, newname)
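An alternative that avoids changing the working directory is to join both names with the folder. The sketch below assumes two made-up helper names, collapse_dots and rename_in_dir, and skips a rename when the target name already exists; that skip is a choice made for this example, not something the original script does.

```python
import os


def collapse_dots(name):
    """Replace every '.' except the last one (the extension dot) with a space."""
    keep_last = max(name.count(".") - 1, 0)
    return name.replace(".", " ", keep_last)


def rename_in_dir(directory):
    """Rename entries in directory; return a list of (old_name, new_name)."""
    renamed = []
    for name in os.listdir(directory):
        new_name = collapse_dots(name)
        src = os.path.join(directory, name)
        dst = os.path.join(directory, new_name)
        if new_name == name or os.path.exists(dst):
            continue  # nothing to change, or the target name is taken
        os.rename(src, dst)
        renamed.append((name, new_name))
    return renamed
```

Because both paths carry the directory, the result does not depend on where the script is started from.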

Python error os.walk IOError

I'm trying to find the files with "server" in their names. I can print every file in the directory whose name contains server, but when I try to read one it gives me this error:
Traceback (most recent call last):
  File "view_log_packetloss.sh", line 27, in <module>
    with open(filename,'rb') as files:
IOError: [Errno 2] No such file or directory: 'pcoip_server_2014_05_19_00000560.txt'
I have seen similar questions asked, but I could not fix mine; some errors were fixed by using chdir to change the current directory to the file's directory. Any help is appreciated. Thank you.
#!/usr/bin/env python
import sys, re, os
#function to find the packetloss data in pcoip server files
def function_pcoip_packetloss(filename):
    lineContains = re.compile('.*Loss=.*') #look for "Loss=" in the file
    for line in filename:
        if lineContains.match(line): #check if line matches "Loss="
            print 'The file has: ' #prints if "Loss=" is found
            print line
    return 0;
for root, dirs, files in os.walk("/users/home10/tshrestha/brb-view/logs/vdm-sdct-agent/pcoip-logs"):
    lineContainsServerFile = re.compile('.*server.*')
    for filename in files:
        if lineContainsServerFile.match(filename):
            with open(filename,'rb') as files:
                print 'filename'
                function_pcoip_packetloss(filename);
files holds only the names of the files inside root, not their full paths:
dirpath is a string, the path to the directory. dirnames is a list of the names of the subdirectories in dirpath (excluding '.' and '..'). filenames is a list of the names of the non-directory files in dirpath. Note that the names in the lists contain no path components. To get a full path (which begins with top) to a file or directory in dirpath, do os.path.join(dirpath, name).
Try this:
for root, dirs, files in os.walk("/users/home10/tshrestha/brb-view/logs/vdm-sdct-agent/pcoip-logs"):
    lineContainsServerFile = re.compile('.*server.*')
    for filename in files:
        if lineContainsServerFile.match(filename):
            filename = os.path.join(root, filename)
            #use a new name for the handle so it does not shadow the files list
            with open(filename,'rb') as log_file:
                print 'filename:', filename
                #pass the open file, not its name, so the function iterates over lines
                function_pcoip_packetloss(log_file)
The os.walk() function is a generator of 3-element tuples. Each tuple contains a directory as its first element. The second element is a list of subdirectories in that directory, and the third is a list of the files.
To generate the full path to each file it is necessary to concatenate the first entry (the directory path) and the filenames from the third entry (the files). The most straightforward and platform-agnostic way to do so uses os.path.join().
Also note that it will be much more efficient to use
lineContainsServerFile = re.compile('server')
and lineContainsServerFile.search() rather than trying to match a wildcard string. Even with match(), the trailing .* is redundant, since what follows the "server" string is irrelevant.
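Putting the join and the search() suggestion together in Python 3 might look like the following sketch. The function name find_loss_lines and the choice to return a list of (path, line) pairs instead of printing are assumptions made for the example.

```python
import os
import re

SERVER_NAME = re.compile(r"server")  # search() needs no surrounding .*
LOSS_LINE = re.compile(r"Loss=")


def find_loss_lines(top):
    """Return [(path, line)] for 'Loss=' lines in files named like *server*."""
    hits = []
    for root, _dirs, files in os.walk(top):
        for name in files:
            if not SERVER_NAME.search(name):
                continue
            path = os.path.join(root, name)  # full path, not the bare name
            # errors="replace" keeps odd bytes in a log from raising.
            with open(path, "r", errors="replace") as log_file:
                for line in log_file:
                    if LOSS_LINE.search(line):
                        hits.append((path, line.rstrip("\n")))
    return hits
```

The patterns are compiled once at module level instead of on every directory, which is a small extra saving on top of switching from match() to search().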
