My code was working until yesterday, January 31, 2022. But today, Feb. 1, 2022, it sudden
Here's the error:
Traceback (most recent call last):
File "/Users/folder/code/3.cut_PDF.py", line 19, in <module>
onlyfiles = [f for f in listdir(mypath) if isfile(join(mypath, f))]
NotADirectoryError: [Errno 20] Not a directory: '/Users/.../Riskalyze/02.01.22/.DS_Store'
Here's my code:
import PyPDF2
from PyPDF2 import PdfFileWriter
from os import listdir
from os.path import isfile, join
import pandas as pd
from datetime import date
today_date = date.today().strftime("%m.%d.%y")
print(today_date)
mypath_folder = '/Users/.../Riskalyze/' + today_date + '/'
onlyfolder = [f for f in listdir(mypath_folder)]
print(onlyfolder)
pdfWriter = PdfFileWriter()
for folder in onlyfolder:
    mypath = mypath_folder + folder
    onlyfiles = [f for f in listdir(mypath) if isfile(join(mypath, f))]
    print(onlyfiles)
    i = 0
    if folder == "#Model":
        model_name = []
        print(type(model_name))
        for file in onlyfiles:
            file_name = file.rsplit(".")[0]
            file_name = file_name.split("Portfolio")[0]
            file_name = file_name + "Portfolio"
            model_name.append(file_name)
        print("model name:", model_name)
    else:
        pass
    for file in onlyfiles:
        print(file)
        mypath_new = mypath + '/' + file
        print(mypath_new)
        pdfFileObj = open(mypath_new, 'rb')
        pdfReader = PyPDF2.PdfFileReader(pdfFileObj)
        pageObj = pdfReader.getPage(0)
        pdfWriter.addPage(pageObj)
        i += 1
        print("i:", i)
pdfOutputFile = open('/Users/.../Riskalyze/MergedFiles/MergedFiles_' + today_date + '.pdf', 'wb')
pdfWriter.write(pdfOutputFile)
pdfFileObj.close()
pdfOutputFile.close()
print(model_name)
(pathlib from the standard Python library is much more handy to use than os and os.path, I suggest you give it a try)
As for your problem,
onlyfolder = [f for f in listdir(mypath_folder)]
is wrong.
The doc for os.listdir:
Return a list containing the names of the entries in the directory given by path.
It lists everything in the directory, files as well as subdirectories. Your onlyfolder list therefore also contains a file name, .DS_Store (the metadata file that the macOS Finder creates). When the loop reaches that entry, mypath points at a file, and listdir(mypath) raises NotADirectoryError because .DS_Store is not a directory.
You could have found this yourself with a debugger or a few print calls.
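A minimal sketch of the fix with pathlib, keeping only the entries that really are directories. The helper name list_subdirectories is made up for illustration and is not part of the original script:

```python
from pathlib import Path


def list_subdirectories(parent):
    """Return the sorted names of the immediate subdirectories of parent."""
    # iterdir() yields files and directories alike; is_dir() drops plain
    # files such as .DS_Store.
    return sorted(p.name for p in Path(parent).iterdir() if p.is_dir())
```

In the question's script this would replace the list comprehension, for example onlyfolder = list_subdirectories(mypath_folder).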
I have been working on a script that walks every subdirectory of a directory, matches file names with a regex, and runs a different command depending on the kind of file.
What I have finished so far is running different commands based on the regex match: the script checks for a .zip, .rar or .r00 file and uses a different command for each. What I need help with is the iteration itself. For every directory, the script should first check whether it contains a .mkv file. If it does, it should skip that directory and jump to the next one; if it does not, it should run the matching command and, once that has finished, continue with the next directory.
import os
import re
rx = '(.*zip$)|(.*rar$)|(.*r00$)'
path = "/mnt/externa/folder"
for root, dirs, files in os.walk(path):
    for file in files:
        res = re.match(rx, file)
        if res:
            if res.group(1):
                print("Unzipping ",file, "...")
                os.system("unzip " + root + "/" + file + " -d " + root)
            elif res.group(2):
                os.system("unrar e " + root + "/" + file + " " + root)
            if res.group(3):
                print("Unraring ",file, "...")
                os.system("unrar e " + root + "/" + file + " " + root)
EDIT:
Here is the code I have now:
import os
import re
from subprocess import check_call
from os.path import join
rx = '(.*zip$)|(.*rar$)|(.*r00$)'
path = "/mnt/externa/Torrents/completed/test"
for root, dirs, files in os.walk(path):
    if not any(f.endswith(".mkv") for f in files):
        found_r = False
        for file in files:
            pth = join(root, file)
            try:
                if file.endswith(".zip"):
                    print("Unzipping ",file, "...")
                    check_call(["unzip", pth, "-d", root])
                    found_zip = True
                elif not found_r and file.endswith((".rar",".r00")):
                    check_call(["unrar","e","-o-", pth, root,])
                    found_r = True
                    break
            except ValueError:
                print ("Oops! That did not work")
This script works mostly fine, but sometimes I run into issues when there is a Subs folder. Here is the error message I get when I run the script:
$ python unrarscript.py
UNRAR 5.30 beta 2 freeware Copyright (c) 1993-2015 Alexander Roshal
Extracting from /mnt/externa/Torrents/completed/test/The.Conjuring.2013.1080p.BluRay.x264-ALLiANCE/Subs/the.conjuring.2013.1080p.bluray.x264-alliance.subs.rar
No files to extract
Traceback (most recent call last):
File "unrarscript.py", line 19, in <module>
check_call(["unrar","e","-o-", pth, root])
File "/usr/lib/python2.7/subprocess.py", line 541, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['unrar', 'e', '-o-', '/mnt/externa/Torrents/completed/test/The.Conjuring.2013.1080p.BluRay.x264-ALLiANCE/Subs/the.conjuring.2013.1080p.bluray.x264-alliance.subs.rar', '/mnt/externa/Torrents/completed/test/The.Conjuring.2013.1080p.BluRay.x264-ALLiANCE/Subs']' returned non-zero exit status 10
I cannot really understand what is wrong with the code, so I'm hoping that some of you are willing to help me.
Just use any to check whether any file ends in .mkv before going any further. You can also simplify to an if/else, since you do the same thing for the last two matches. Using subprocess.check_call instead of os.system would also be a better approach:
import os
import re
from subprocess import check_call
from os.path import join
rx = '(.*zip$)|(.*rar$)|(.*r00$)'
path = "/mnt/externa/folder"
for root, dirs, files in os.walk(path):
    if not any(f.endswith(".mkv") for f in files):
        for file in files:
            res = re.match(rx, file)
            if res:
                # use os.path.join
                pth = join(root, file)
                # it can only be res.group(1) or one of the other two so we only need if/else.
                if res.group(1):
                    print("Unzipping ",file, "...")
                    check_call(["unzip" , pth, "-d", root])
                else:
                    check_call(["unrar","e", pth, root])
You could also forget the regex and just use an if/elif with str.endswith:
for root, dirs, files in os.walk(path):
    if not any(f.endswith(".mkv") for f in files):
        for file in files:
            pth = join(root, file)
            if file.endswith("zip"):
                print("Unzipping ",file, "...")
                check_call(["unzip" , pth, "-d", root])
            elif file.endswith((".rar",".r00")):
                check_call(["unrar","e", pth, root])
If you really care about speed and about not repeating steps, you can collect the archives by extension (sliced off the end of the file name) in the same loop that checks for .mkv, and use for/else logic:
good = {".rar", ".zip", ".r00"}
for root, dirs, files in os.walk(path):
    tmp = {".rar": [], ".zip": []}
    for file in files:
        ext = file[-4:]
        if ext == ".mkv":
            # a .mkv is present, so skip this directory
            break
        elif ext in good:
            # .r00 is unpacked with unrar too, so group it with .rar
            key = ".zip" if ext == ".zip" else ".rar"
            tmp[key].append(join(root, file))
    else:
        # only reached when the loop did not break, i.e. no .mkv was found
        for p in tmp[".zip"]:
            print("Unzipping ", p, "...")
            check_call(["unzip", p, "-d", root])
        for p in tmp[".rar"]:
            check_call(["unrar", "e", p, root])
That will short circuit on the first .mkv; otherwise it only iterates over the collected .zip, .rar and .r00 matches. Unless you really care about efficiency, I would use the second approach.
To avoid overwriting you can unrar/unzip each to a new subdirectory using a counter to help create a new dir name:
from itertools import count
for root, dirs, files in os.walk(path):
    if not any(f.endswith(".mkv") for f in files):
        counter = count()
        for file in files:
            pth = join(root, file)
            if file.endswith("zip"):
                p = join(root, "sub_{}".format(next(counter)))
                os.mkdir(p)
                print("Unzipping ",file, "...")
                check_call(["unzip" , pth, "-d", p])
            elif file.endswith((".rar",".r00")):
                p = join(root, "sub_{}".format(next(counter)))
                os.mkdir(p)
                check_call(["unrar","e", pth, p])
Each archive will be unpacked into a new directory under root, i.e. root/sub_0, root/sub_1, etc.
You probably would have been better off adding an example to your question, but if the real problem is that you only want to unpack one of .rar or .r00, you can set a flag when you find the first match and only unpack if the flag is not yet set:
for root, dirs, files in os.walk(path):
    if not any(f.endswith(".mkv") for f in files):
        found_r = False
        for file in files:
            pth = join(root, file)
            if file.endswith("zip"):
                print("Unzipping ",file, "...")
                check_call(["unzip", pth, "-d", root])
                found_zip = True
            elif not found_r and file.endswith((".rar",".r00")):
                check_call(["unrar","e", pth, root])
                found_r = True
If there is also only one zip, you can set two flags and leave the loop once both are set.
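The traceback in the edited question shows check_call raising subprocess.CalledProcessError (unrar exited with status 10 after reporting "No files to extract"), which the except ValueError clause does not catch. Below is a minimal sketch of catching the right exception so that one failing archive does not stop the whole walk. The helper name run_archive_command is hypothetical and not part of the original script:

```python
import subprocess


def run_archive_command(cmd):
    """Run cmd and return True on success, False on a non-zero exit status."""
    try:
        subprocess.check_call(cmd)
        return True
    except subprocess.CalledProcessError as exc:
        # check_call raises this whenever the command exits with a
        # non-zero status; report it and let the caller carry on.
        print("Command {} failed with exit status {}".format(exc.cmd, exc.returncode))
        return False
```

Inside the loop this would be called as run_archive_command(["unrar", "e", "-o-", pth, root]). A missing unrar or unzip binary raises OSError instead, which this sketch deliberately leaves uncaught.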
The example below will work directly! As suggested by @Padraic, I replaced os.system with the more suitable subprocess.
What about joining all the file names into a single string and looking for .mkv within that string?
import os
import re
from subprocess import check_call
from os.path import join
rx = '(.*zip$)|(.*rar$)|(.*r00$)'
path = "/mnt/externa/folder"
regex_mkv = re.compile(r'.*\.mkv,')
for root, dirs, files in os.walk(path):
    string_files = ','.join(files)+', '
    if regex_mkv.match(string_files): continue
    for file in files:
        res = re.match(rx, file)
        if res:
            # use os.path.join
            pth = join(root, file)
            # it can only be res.group(1) or one of the other two so we only need if/else.
            if res.group(1):
                print("Unzipping ",file, "...")
                check_call(["unzip" , pth, "-d", root])
            else:
                check_call(["unrar","e", pth, root])
re is overkill for something like this. There's a library function for extracting file extensions, os.path.splitext. In the following example, we build an extension-to-filenames map and we use it both for checking the presence of .mkv files in constant time and for mapping each filename to the appropriate command.
Note that you can unzip files with zipfile (standard lib) and third-party packages are available for .rar files.
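import os
for root, dirs, files in os.walk(path):
    # map each extension to the file names that carry it
    ext_map = {}
    for fn in files:
        ext_map.setdefault(os.path.splitext(fn)[1], []).append(fn)
    if '.mkv' not in ext_map:
        # items() works on both Python 2 and Python 3
        for ext, fnames in ext_map.items():
            for fn in fnames:
                # os.walk yields bare file names, so prepend root
                pth = os.path.join(root, fn)
                if ext == ".zip":
                    os.system("unzip %s -d %s" % (pth, root))
                elif ext == ".rar" or ext == ".r00":
                    os.system("unrar e %s %s" % (pth, root))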
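The note about zipfile can be sketched as follows: a small helper that extracts an archive into the directory it lives in, with no external unzip command involved. The name unzip_here is made up for illustration:

```python
import os
import zipfile


def unzip_here(zip_path):
    """Extract zip_path into its own directory and return the member names."""
    target_dir = os.path.dirname(os.path.abspath(zip_path))
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
        return archive.namelist()
```

In the loop above, the os.system("unzip ...") call could then become unzip_here(os.path.join(root, fn)).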
import os
import re
regex = re.compile(r'(.*zip$)|(.*rar$)|(.*r00$)')
path = "/mnt/externa/folder"
for root, dirs, files in os.walk(path):
    for file in files:
        res = regex.match(file)
        if res:
            if res.group(1):
                print("Unzipping ",file, "...")
                os.system("unzip " + root + "/" + file + " -d " + root)
            elif res.group(2):
                os.system("unrar e " + root + "/" + file + " " + root)
            else:
                print("Unraring ",file, "...")
                os.system("unrar e " + root + "/" + file + " " + root)
I want to print file names and their directory if the file size exceeds a certain amount. I wrote a function and set the threshold to 1 KB, but it prints nothing even though there are plenty of files larger than 1 KB.
import os, shutil
def deleteFiles(folder):
    folder = os.path.abspath(folder)
    for foldername, subfolders, filenames in os.walk(folder):
        for filename in filenames:
            if os.path.getsize(filename) > 1000:
                print(filename + ' is inside: ' + foldername)
deleteFiles('C:\\Cyber\\Downloads')
And I got nothing!
Then I ran the code in the interactive shell and got the following error:
Traceback (most recent call last):
File "<pyshell#14>", line 3, in <module>
if os.path.getsize(filename) > 100:
File "C:\Users\Cyber\Downloads\lib\genericpath.py", line 50, in getsize
return os.stat(filename).st_size
FileNotFoundError:
I am wondering how I can fix my code.
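os.path.getsize cannot find the file from its bare name: os.walk yields file names without their directory, so the name is resolved against the current working directory. You have to build the full path from foldername. Replace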
if os.path.getsize(filename) > 1000:
with
if os.path.getsize(os.path.abspath(foldername + "/" + filename)) > 1000:
And it should work.
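For reference, a sketch of the same fix written with os.path.join, which inserts the right separator for the platform. The function name find_large_files and the min_size parameter are assumptions made for this example; it returns the matches instead of printing them so the result is easy to check:

```python
import os


def find_large_files(folder, min_size=1000):
    """Return (filename, foldername) pairs for files larger than min_size bytes."""
    matches = []
    for foldername, subfolders, filenames in os.walk(folder):
        for filename in filenames:
            # Join the directory that os.walk is currently visiting with
            # the bare file name to get a path getsize can resolve.
            full_path = os.path.join(foldername, filename)
            if os.path.getsize(full_path) > min_size:
                matches.append((filename, foldername))
    return matches
```

Printing each pair as filename + ' is inside: ' + foldername reproduces the output the question was after.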
Replace:
deleteFiles('C:\\Cyber\\Downloads')
with
import os
a = 'c:' # removed slash
b = 'Cyber' # removed slash
c = 'Downloads'
path = os.path.join(a + os.sep, b, c)
deleteFiles(path)
I need to extract image header information from multiple JPG files to a text or log file, however when I run the code below I receive an error:
for root, dirs, filenames in os.walk(topdir):
    for f in filenames:
        print(topdir)
        print(f)
        log = open(topdir + f, 'r')
        data = p.get_json(log)
        formatted_data =(( json.dumps(data, sort_keys=True,indent=4, separators=(',', ':')) ))
        print(data)
print ("There are " + str(len(header_dict)) + " items on the menu.")
I get the following error when I run:
C:/Users/richie/Desktop/work/imagej/test images and files/XX1
image_D2016-02-03T15-27-56-763207Z_4.jpg
Traceback (most recent call last):
File "C:\Users\richie\Desktop\work\header_dir.py", line 25, in <module>
log = open(topdir + f, 'r')
FileNotFoundError: [Errno 2] No such file or directory: 'C:/Users/richie/Desktop/work/imagej/test images and files/XX1image_D2016-02-03T15-27-56-763207Z_4.jpg'
How do I open image files to allow the function in the for loop to run against it?
Your problem lies in this code:
topdir + f
First, you should use join on paths, not +. The latter doesn't insert the separator between the path and file.
Second, you should join a filename with root, not with topdir.
for root, dirs, files in os.walk(topdir):
    paths = [os.path.join(root, f) for f in files]
    for p in paths:
        log = open(p)
        # et cetera
Working code:
import pyexifinfo as x
import json
import os
# filedialog is a submodule, so it must be imported by name
from tkinter import filedialog
def askdirectory():
    dirname = filedialog.askdirectory()
    return dirname
topdir = askdirectory()
for root, dirs, files in os.walk(topdir):
    paths = [os.path.join(root, f) for f in files]
    for p in paths:
        data = x.get_csv(p)
        print(p)
        print(data)
        formatted_data =((json.dumps(data, sort_keys=True,indent=4, separators=(',', ':')) ))
        f = open('Xheader_info_XML.txt','a')
        f.write(p)
        f.write(formatted_data)
        f.close()
I have a python script that reads a file and copies its content to another file while deleting unwanted lines before sending.
The problem is that I want to allow the user to choose the source file and the destination path.
How can this be solved?
outputSortedFiles.py
#!/usr/bin/python
'''FUNCTION THAT READS THE SELECTED FILE AND WRITES ITS
CONTENT TO A SECOND FILE, DELETING
THE UNWANTED WORDS'''
import Tkinter
from os import listdir
from os.path import isfile
from os.path import join
import tkFileDialog
import os
def readWrite():
    unwanted = ['thumbnails', 'tyroi', 'cache', 'Total files', 'zeryit', 'Ringtones', 'iconRecv',
                'tubemate', 'ueventd', 'fstab', 'default', 'lpm']
    mypath = r"C:\Users\hHJE\Desktop/filesys"
    Tkinter.Tk().withdraw()
    in_path = tkFileDialog.askopenfile(initialdir = mypath, filetypes=[('text files', ' TXT ')])
    files = [f for f in listdir(mypath) if isfile(join(mypath, f))]
    for file in files:
        if file.split('.')[1] == 'txt':
            outputFileName = 'Sorted-' + file
            with open(mypath + outputFileName, 'w') as w:
                with open(mypath + '/' + file) as f:
                    for l in f:
                        if not True in [item in l for item in unwanted]:
                            w.write(l)
    print ("\n"
           "*********************************\n"
           "THE OUTPUT FILE IS READY\n"
           "*********************************\n")
    in_path.close()
if __name__== "__main__":
    readWrite()
You can use tkFileDialog, just as you did for the input file, to ask for the output file:
outputpath = tkFileDialog.asksaveasfile()
See the examples in this tutorial: http://www.tkdocs.com/tutorial/windows.html
If you simply want the user to choose the directory:
from tkinter import filedialog
outputpath = filedialog.askdirectory()
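Putting the two dialogs together, here is a rough Python 3 sketch (the question itself uses the Python 2 module names Tkinter and tkFileDialog). The function names filter_lines and copy_filtered are invented for this example, and the filtering is kept separate from the dialogs so it can be checked without a display:

```python
def filter_lines(lines, unwanted):
    """Return only the lines that contain none of the unwanted words."""
    return [line for line in lines
            if not any(word in line for word in unwanted)]


def copy_filtered(unwanted):
    """Ask for a source and a destination file, then copy the filtered lines."""
    # Imported here so that filter_lines stays usable without a display.
    import tkinter
    from tkinter import filedialog

    tkinter.Tk().withdraw()  # hide the empty root window
    source = filedialog.askopenfilename(filetypes=[("text files", "*.txt")])
    destination = filedialog.asksaveasfilename(defaultextension=".txt")
    if not source or not destination:
        return False  # the user cancelled one of the dialogs
    with open(source) as src, open(destination, "w") as dst:
        dst.writelines(filter_lines(src, unwanted))
    return True
```

Calling copy_filtered(unwanted) with the unwanted list from the question would then prompt for both paths before writing anything.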