How to move all .log and .txt files to a new folder - python

I'm having trouble figuring out how to move all .log and .txt files in a certain folder and it's subdirectories to a new folder. I understand how to move one file with shutil. But, I tried to use a loop, unsuccessfully, to move all. Can someone help me with this? Thanks ....
import os, os.path
import re
def print_tgzLogs (arg, dir, files):
for file in files:
path = os.path.join (dir, file)
path = os.path.normcase (path)
defaultFolder = "Log_Text_Files"
if not defaultFolder.endswith(':') and not os.path.exists('c:\\Extracted\Log_Text_Files'):
os.mkdir('C:\\Extracted\\Log_Text_Files')
if re.search(r".*\.txt$", path) or re.search(r".*\.log$", path):
os.rename(path, 'C:\\Extracted\\Log_Text_Files')
print path
os.path.walk('C:\\Extracted\\storage', print_tgzLogs, 0)
Below is the trace back error:
Traceback (most recent call last):
File "C:\SQA_log\scan.py", line 20, in <module>
os.path.walk('C:\\Extracted\\storage', print_tgzLogs, 0)
File "C:\Python27\lib\ntpath.py", line 263, in walk
walk(name, func, arg)
File "C:\Python27\lib\ntpath.py", line 263, in walk
walk(name, func, arg)
File "C:\Python27\lib\ntpath.py", line 263, in walk
walk(name, func, arg)
File "C:\Python27\lib\ntpath.py", line 259, in walk
func(arg, top, names)
File "C:\SQA_log\scan.py", line 16, in print_tgzLogs
os.rename(path, 'C:\\Extracted\\Log_Text_Files')
WindowsError: [Error 183] Cannot create a file when that file already exists

According to the traceback, the log-files are already existing. The Python docs to the os.rename say:
On Windows, if dst already exists, OSError will be raised [...].
Now you can either:
delete the files manually or
delete the files automatically using os.remove(path)
If you want the files to be automatically deleted, the code would look like this (notice that I replaced your regular expression with the python endswith as suggested by utdemir):
import os, os.path
def print_tgzLogs (arg, dir, files):
for file in files:
path = os.path.join (dir, file)
path = os.path.normcase (path)
defaultFolder = "Log_Text_Files"
if not defaultFolder.endswith(':') and not os.path.exists('c:\\Extracted\Log_Text_Files'):
os.mkdir('C:\\Extracted\\Log_Text_Files')
if path.endswith(".txt") or path.endswith(".log"):
if os.path.exists('C:\\Extracted\\Log_Text_Files\\%s' % file):
os.remove('C:\\Extracted\\Log_Text_Files\\%s' % file)
os.rename(path, 'C:\\Extracted\\Log_Text_Files\\%s' % file)
print path
os.path.walk('C:\\Extracted\\storage', print_tgzLogs, 0)

It looks like are trying to use
os.rename(path, 'C:\\Extracted\\Log_Text_Files')
to move the file path into the directory C:\Extracted\Log_Text_Files, but rename doesn't work like this: it's going to try to make a new file named C:\Extracted\Log_Text_Files. You probably want something more like this:
os.rename(path, os.path.join('C:\\Extracted\\Log_Text_Files',os.path.basename(path))

Related

Iterating over a multiple files in a folder

Im trying to develop a program that can iterate over different files in the same folder. The files are all the same format but will have different names. Right now if there is only 1 file in the folder the code executes with no problems but with different files i get the error:
Traceback (most recent call last):
File "D:/Downloads/FYP/Feedback draft.py", line 24, in <module>
wb = openpyxl.load_workbook(filename)
File "C:\Users\shomi\AppData\Local\Programs\Python\Python38-32\lib\site-packages\openpyxl\reader\excel.py", line 315, in load_workbook
reader = ExcelReader(filename, read_only, keep_vba,
File "C:\Users\shomi\AppData\Local\Programs\Python\Python38-32\lib\site-packages\openpyxl\reader\excel.py", line 124, in __init__
self.archive = _validate_archive(fn)
File "C:\Users\shomi\AppData\Local\Programs\Python\Python38-32\lib\site-packages\openpyxl\reader\excel.py", line 96, in _validate_archive
archive = ZipFile(filename, 'r')
File "C:\Users\shomi\AppData\Local\Programs\Python\Python38-32\lib\zipfile.py", line 1251, in __init__
self.fp = io.open(file, filemode)
FileNotFoundError: [Errno 2] No such file or directory: 'tester2.xlsx'
The code im using is :
directory = r'D:\Downloads\FYP\TEST'
for filename in os.listdir(directory):
if filename.endswith(".xlsx"):
wb = openpyxl.load_workbook(filename)
sh1=wb['test']
doc = DocxTemplate('Assignment1feedback.docx')
context = {
'acc': acceleration
}
doc.render(context)
doc.save('D:\\Downloads\\FYP\\TEST\\' + filename + '.docx')
This is incomplete code as the full thing would be quite long but overall i want to access these excel files and then create a corresponding docx
So os.listdir only provides the basename of the directory files, which will cause problems if your working directory does not match the value of directory. If your working directory is D:\Downloads, ./file.xlsx does not exist but D:\Downloads\FYP\TEST/file.xlsx does.
You will want to use the absolute path to the file, you have two options here. You could follow #IronMan's suggestion in the their comment to produce the file path from the directory path and file basename:
import os
directory = r'D:\Downloads\FYP\TEST'
for filename in os.listdir():
wb = openpyxl.load_workbook(os.path.join(directory, filename))
This is a simple and useful approach; however, its functionality is somewhat limited and may make it harder to make changes in the future. The alternative is to use python's paathlib and scandir, and access the path directly from there:
import pathlib
directory = r'D:\Downloads\FYP\TEST'
for entry in pathlib.scandir(diectory):
wb = openpyxl.load_workbook(entry.path)

finding a string given a directory, extension, and string?

Currently, I'm trying to determine how to return a list of filenames in the directory containing a particular string...
here's how i began:
def searchABatch(directory, extension, searchString):
for file in os.listdir(directory):
if fnmatch.fnmatch(file, extension):
return file
print(searchABatch("procedural", ".py", "foil"))
I expected it to print simply the files with the extension ".py" in my "procedural" directory but I am getting the following error:
Traceback (most recent call last):
File "pitcher20aLP2.py", line 38, in <module>
print(searchABatch("procedural", ".py", "foil"))
File "pitcher20aLP2.py", line 34, in searchABatch
for file in os.listdir(directory):
FileNotFoundError: [Errno 2] No such file or directory: 'procedural'
You're trying to print contents of directory that doesn't exist in you current working directory. You should check if provided directory is actually a directory before calling os.listdir(), by using os.path.isdir()
def searchABatch(directory, extension, searchString):
if os.path.isdir(directory):
for file in os.listdir(directory):
if fnmatch.fnmatch(file, extension):
return file
print(searchABatch("procedural", ".py", "foil"))
Kevin is right, you are trying to find procedural in /home/2020/pitcher20a/procedural directory.
Check your working directory with os.getcwd() and change it needed by using os.chdir(path). Also, you can check if the directory is even a directory by using os.path.isdir() before using os.listdir().
Here is the python os module documentation - https://docs.python.org/2/library/os.html
Also, try to handle the errors gracefully.

Errno 2 while using os.walk in Python

This is a script searching for files that are bigger than a specified size:
def size_scan(folder, size=100000000):
"""Scan folder for files bigger than specified size
folder: abspath
size: size in bytes
"""
flag = False
for folder, subfolders, files in os.walk(folder):
# skip 'anaconda3' folder
if 'anaconda3' in folder:
continue
for file in files:
file_path = os.path.join(folder, file)
if os.path.getsize(file_path) > size:
print(file_path, ':', os.path.getsize(file_path))
flag = True
if not flag:
print('There is nothing, Cleric')
I get the following error message while scanning root folder in Linux:
Traceback (most recent call last):
File "<ipython-input-123-d2865b8a190c>", line 1, in <module>
runfile('/home/ozramsay/Code/sizescan.py', wdir='/home/ozramsay/Code')
File "/home/ozramsay/anaconda3/lib/python3.6/site-packages/spyder/utils/site/sitecustomize.py", line 880, in runfile
execfile(filename, namespace)
File "/home/ozramsay/anaconda3/lib/python3.6/site-packages/spyder/utils/site/sitecustomize.py", line 102, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "/home/ozramsay/Code/sizescan.py", line 32, in <module>
size_scan('/')
File "/home/ozramsay/Code/sizescan.py", line 25, in size_scan
if os.path.getsize(file_path) > size:
File "/home/ozramsay/anaconda3/lib/python3.6/genericpath.py", line 50, in getsize
return os.stat(filename).st_size
FileNotFoundError: [Errno 2] No such file or directory: '/run/udev/link.dvdrw'
I guessed it is because Python interpreter can not scan itself, so I tried to skip 'anaconda3' folder from the search (marked by #skip anaconda folder in the code above). However, the error message remained the same.
Can anyone please explain?
(Please let me know if such kind of questions is not allowed here and should be edited. Thank you)
The file python is trying get the size of with os.stat(filename).st_size is a broken link. A broken link is a link that has had it's target removed. It is much like an internet link that gives a 404. To fix this in your script, check if it is a file (preferred), or use a try/catch (not preferred). To check if the file is a file and not a broken link, use os.path.isfile(file_path). Your code should look like this:
def size_scan(folder, size=100000000):
"""Scan folder for files bigger than specified size
folder: abspath
size: size in bytes
"""
flag = False
for folder, subfolders, files in os.walk(folder):
# skip 'anaconda3' folder
if 'anaconda3' in folder:
continue
for file in files:
file_path = os.path.join(folder, file)
if os.path.isfile(file_path) and (os.path.getsize(file_path) > size):
print(file_path, ':', os.path.getsize(file_path))
flag = True
if not flag:
print('There is nothing, Cleric')
So before it gets the size, it checks if the file is really there, following all links to make sure it exists. Related SO post.

why zipfile trying to unzip xlsx files?

I am trying to use the following code to unzip all the zip folders in my root folder; this code was found on this thread:
Unzip zip files in folders and subfolders with python
rootPath = u"//rootdir/myfolder" # CHOOSE ROOT FOLDER HERE
pattern = '*.zip'
for root, dirs, files in os.walk(rootPath):
for filename in fnmatch.filter(files, pattern):
print(os.path.join(root, filename))
zipfile.ZipFile(os.path.join(root, filename)).extractall(os.path.join(root, os.path.splitext(filename)[0]))
but I keep getting this error that says FileNotFoundError saying the xlsx file does not exist:
Traceback (most recent call last):
File "//rootdir/myfolder/Python code/unzip_helper.py", line 29, in <module>
zipfile.ZipFile(os.path.join(root, filename)).extractall(os.path.join(root, os.path.splitext(filename)[0]))
File "//rootdir/myfolder/Python\Python36-32\lib\zipfile.py", line 1491, in extractall
self.extract(zipinfo, path, pwd)
File "//myaccount/Local\Programs\Python\Python36-32\lib\zipfile.py", line 1479, in extract
return self._extract_member(member, path, pwd)
File "//myaccount/Local\Programs\Python\Python36-32\lib\zipfile.py", line 1542, in _extract_member
open(targetpath, "wb") as target:
FileNotFoundError: [Errno 2] No such file or directory: '\\rootdir\myfolder\._SGS Naked 3 01 WS Kappa Coated and a very long very long file name could this be a problem i dont think so.xlsx'
My question is, why would it want to unzip this excel file anyways?!
And how can I get rid of the error?
I've also tried using r instead of u for rootPath:
rootPath = r"//rootdir/myfolder"
and I get the same error.
Any help is truly appreciated!
Some filenames and directory names may have extra dots in their names, as a consequence the last line, unlike Windows filenames can have dots on Unix:
zipfile.ZipFile(os.path.join(root, filename)).extractall(os.path.join(root, os.path.splitext(filename)[0]))
this line fails. To see how that happens:
>>> filename = "my.arch.zip"
>>> root = "/my/path/to/mydir/"
>>> os.path.join(root, os.path.splitext(filename)[0])
'/my/path/to/mydir/my.arch'
With or without extra dots, problems will still take place in your code:
>>> os.path.join(root, os.path.splitext(filename)[0])
'/my/path.to/mydir/arch'
If no '/my/path.to/mydir/arch' can be found, FileNotFoundError will be raised. I suggest that you be explicit in you path, otherwise you have to ensure the existence of those directories.
ZipFile.extractall(path=None, members=None, pwd=None)
Extract all members from the archive to the current working directory. path specifies a different directory to extract to...
Unless path is an existent directory, FileNotFoundError will be raised.

Using os.path.join with os.path.getsize, returning FileNotFoundError

In conjunction with my last question, I'm onto printing the filenames with their sizes next to them in a sort of list. Basically I am reading filenames from one file (which are added by the user), taking the filename and putting it in the path of the working directory to print it's size one-by-one, however I'm having an issue with the following block:
print("\n--- Stats ---\n")
with open('userdata/addedfiles', 'r') as read_files:
file_lines = read_files.readlines()
# get path for each file and find in trackedfiles
# use path to get size
print(len(file_lines), "files\n")
for file_name in file_lines:
# the actual files should be in the same working directory
cwd = os.getcwd()
fpath = os.path.join(cwd, file_name)
fsize = os.path.getsize(fpath)
print(file_name.strip(), "-- size:", fsize)
which is returning this error:
tolbiac wpm-public → ./main.py --filestatus
--- Stats ---
1 files
Traceback (most recent call last):
File "./main.py", line 332, in <module>
main()
File "./main.py", line 323, in main
parseargs()
File "./main.py", line 317, in parseargs
tracking()
File "./main.py", line 204, in tracking
fsize = os.path.getsize(fpath)
File "/usr/lib/python3.4/genericpath.py", line 50, in getsize
return os.stat(filename).st_size
FileNotFoundError: [Errno 2] No such file or directory: '/home/tolbiac/code/wpm-public/file.txt\n'
tolbiac wpm-public →
So it looks like something is adding a \n to the end of file_name, I'm not sure if thats something used in the getsize module, I tried this with os.stat, but it did the same thing.
Any suggestions? Thanks.
When you're reading in a file, you need to be aware of how the data is being seperated. In this case, the read-in file has a filename once per line seperated out by that \n operator. Need to strip it then before you use it.
for file_name in file_lines:
file_name = file_name.strip()
# rest of for loop

Categories