Python error os.walk IOError - python

I am trying to find the files with "server" in the filename. I can print all the files in the directory whose names contain "server", but when I try to read a file it gives me an error saying:
Traceback (most recent call last):
File "view_log_packetloss.sh", line 27, in <module>
with open(filename,'rb') as files:
IOError: [Errno 2] No such file or directory: 'pcoip_server_2014_05_19_00000560.txt'
I have seen similar questions asked, but I could not fix mine; some were fixed using chdir to change the current directory to the file's directory. Any help is appreciated. Thank you.
#!/usr/bin/env python
import sys, re, os

# function to find the packet loss data in pcoip server files
def function_pcoip_packetloss(filename):
    lineContains = re.compile('.*Loss=.*')  # look for "Loss=" in the file
    for line in filename:
        if lineContains.match(line):  # check if line matches "Loss="
            print 'The file has: '  # prints if "Loss=" is found
            print line
    return 0

for root, dirs, files in os.walk("/users/home10/tshrestha/brb-view/logs/vdm-sdct-agent/pcoip-logs"):
    lineContainsServerFile = re.compile('.*server.*')
    for filename in files:
        if lineContainsServerFile.match(filename):
            with open(filename, 'rb') as files:
                print 'filename'
                function_pcoip_packetloss(filename)

The names in files are bare file names, not paths. As the os.walk() documentation puts it: dirpath is a string, the path to the directory; dirnames is a list of the names of the subdirectories in dirpath (excluding '.' and '..'); filenames is a list of the names of the non-directory files in dirpath. Note that the names in the lists contain no path components. To get a full path (which begins with top) to a file or directory in dirpath, do os.path.join(dirpath, name).
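A minimal sketch of that note, using a throwaway temporary tree (the file name here is just illustrative): the names yielded by os.walk() carry no directory part, so os.path.join() must supply it:

```python
import os
import tempfile

# Build a tiny throwaway tree to walk (names are illustrative)
top = tempfile.mkdtemp()
os.mkdir(os.path.join(top, "sub"))
open(os.path.join(top, "sub", "server_log.txt"), "w").close()

for root, dirs, files in os.walk(top):
    for name in files:
        # 'name' has no directory part; join it with 'root' to get a usable path
        full_path = os.path.join(root, name)
        print(name, "->", full_path)
        assert os.path.isfile(full_path)
```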
Try this:
for root, dirs, files in os.walk("/users/home10/tshrestha/brb-view/logs/vdm-sdct-agent/pcoip-logs"):
    lineContainsServerFile = re.compile('.*server.*')
    for filename in files:
        if lineContainsServerFile.match(filename):
            filename = os.path.join(root, filename)
            with open(filename, 'rb') as logfile:
                print 'filename:', filename
                function_pcoip_packetloss(logfile)  # pass the open file object, not its name

The os.walk() function is a generator of 3-element tuples. Each tuple contains a directory as its first element. The second element is a list of subdirectories in that directory, and the third is a list of the files.
To generate the full path to each file it is necessary to concatenate the first entry (the directory path) and the filenames from the third entry (the files). The most straightforward and platform-agnostic way to do so uses os.path.join().
Also note that it will be much more efficient to use
lineContainsServerFile = re.compile('server')
and lineContainsServerFile.search() rather than trying to match a wildcard pattern. Even in the match() version, the trailing '.*' is redundant, since whatever follows the "server" string is irrelevant.
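A quick sketch (not from the original answer) of the difference: search() scans anywhere in the string, while match() anchors at the beginning, which is why the match() version needs the leading '.*':

```python
import re

# search() looks for the substring anywhere, so no '.*' padding is needed
pattern = re.compile('server')

names = ['pcoip_server_2014_05_19_00000560.txt', 'client_log.txt']
matches = [n for n in names if pattern.search(n)]
print(matches)  # ['pcoip_server_2014_05_19_00000560.txt']
```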

Related

How to go through directories to get a specific file from each directory and open that file in each directory and do some process

I have a path which contains many directories. I need to go through each directory, get a specific file "file.log.gz" from it, read the file, and do some processing.
This is my current attempt:
import os
import sys
import gzip

infile = sys.argv[1]
directory = ("%s/NEW_FOLDER" % infile)
for root, dirs, files in os.walk(directory):
    for file in files:
        if "file.log.gz" in file:
            with gzip.open(os.path.join(root, file)) as fin:
                new = False
                for line in fin:
                    if "CODE" in line.decode('utf-8'):
                        print("string is present")
                        found = True
                        exit()
                    else:
                        print("string is not present")
What I need is to go through each directory inside NEW_FOLDER, get file.log.gz, and do the above processing for file.log.gz in each directory.
With the current code I find file.log.gz inside each directory, but I'm not able to do the rest of the process, i.e. opening file.log.gz in each directory and processing it.
Expected Output:
/NEW_FOLDER/dir1/file.log.gz
string is present
/NEW_FOLDER/dir2/file.log.gz
string is present
/NEW_FOLDER/dir3/file.log.gz
string is not present
Because you are using os.walk(), you need to join the root directory with the filename. You will notice this if you print(file) and look at the values you are getting.
Try printing this out; you are supposed to pass the entire path to open(), not just the file name:
for file in files:
    print(os.path.join(root, file))
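A self-contained sketch of the fix (the tree and file contents below are made up for illustration): joining root with the name before gzip.open(), and dropping the exit() call, lets every directory get processed:

```python
import gzip
import os
import tempfile

# Hypothetical stand-in for the question's NEW_FOLDER tree
directory = tempfile.mkdtemp()
for sub in ("dir1", "dir2"):
    os.mkdir(os.path.join(directory, sub))
    with gzip.open(os.path.join(directory, sub, "file.log.gz"), "wt") as f:
        f.write("CODE present\n" if sub == "dir1" else "nothing here\n")

results = {}
for root, dirs, files in os.walk(directory):
    for file in files:
        if file == "file.log.gz":
            full_path = os.path.join(root, file)  # join root with the bare name
            with gzip.open(full_path, "rt") as fin:
                # no exit() here, so every directory is visited
                results[full_path] = any("CODE" in line for line in fin)

for path, found in sorted(results.items()):
    print(path, "->", "string is present" if found else "string is not present")
```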

Python doesn't recognize zip files as zip files

I iterate through the directories and want to find all zip files and add them to download_all.zip
I am sure there are zip files, but Python doesn't recognize those zip files as zip files. Why is that?
my code:
os.chdir(boardpath)
# zf = zipfile.ZipFile('download_all.zip', mode='w')
z = zipfile.ZipFile('download_all.zip', 'w')  # creating zip file download_all.zip
for path, dirs, files in os.walk(boardpath):
    for file in files:
        print file
        if file.endswith('.zip'):  # find all zip files
            print ('adding', file)
            z.write(file)  # error here: file is just a str (a bare name), not a full path
z.close()
z = zipfile.ZipFile("download_all.zip")
z.printdir()
I tried:
file.printdir()
and got the following error: AttributeError: 'str' object has no attribute 'printdir'
In zipfile.ZipFile.write(name), name actually stands for the full file path, not just the file name.
import os  # at the top

if file.endswith('.zip'):  # find all zip files
    filepath = os.path.join(path, file)
    print ('adding', filepath)
    z.write(filepath)  # no error
As stated in the ZipFile.write's doc, the filename argument must be relative to the archive root. So the following line:
z.write(file)
Should be:
z.write(os.path.relpath(os.path.join(path, file)))
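A small illustration of what that expression does (the paths here are hypothetical): os.path.relpath() turns the joined absolute path back into a path relative to the current working directory, which is what ends up as the entry name inside the archive:

```python
import os

# Hypothetical file location under the current working directory
path = os.path.join(os.getcwd(), "boardpath", "sub")
file = "data.zip"

full = os.path.join(path, file)  # absolute path on disk
arc = os.path.relpath(full)      # relative to os.getcwd() by default
print(arc)                       # boardpath/sub/data.zip (with '\\' on Windows)
```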
The files that os.walk() yields are lists of filenames. These filenames are just strings (which don't have a printdir() method).
You want to use context management while opening the zip archive and writing to it for each file you find, hence the with statement. In addition, since you're walking through a directory structure, you need to fully qualify each file's path.
import os
import zipfile

with zipfile.ZipFile('download_all.zip', 'w') as zf:
    for path, dirs, files in os.walk('/some_path'):
        for file in files:
            if file.endswith('.zip'):
                zf.write(os.path.join(path, file))

FileNotFoundError: [Errno 2] No such file or directory using os.unlink in Python

The idea of the script is for the user to input a folder as a command line argument and have the code identify the extensions to remove. It works perfectly when I do a test run and just print the files and folders that will be removed; however, when I run the script with the os.unlink parts uncommented, I get the FileNotFoundError...
#!/usr/bin/python3
import os
import sys

def clean(path):
    extensions = (".txt", ".jpg", ".jpeg", ".nfo", ".png", ".bmp")
    for root, dirs, files in os.walk(path):
        for file in files:
            if file.endswith(extensions):
                os.unlink(file)
                # print('Removing File:', os.path.join(root, file))
    for directories, names, files in os.walk(path):
        if os.listdir(directories):
            os.unlink(directories)
            # print('Removing Directory:', directories)

if __name__ == "__main__":
    filthy = sys.argv[1]
    clean(filthy)
Error message:
Traceback (most recent call last):
File "/home/a5pire/source/repos/plex/mediaClean/mediaClean.py", line 24, in <module>
clean(filthy)
File "/home/a5pire/source/repos/plex/mediaClean/mediaClean.py", line 13, in clean
os.unlink(file)
FileNotFoundError: [Errno 2] No such file or directory: 'stuff.txt'
The stuff.txt file definitely exists.
The files list as generated by os.walk contains just the file names. You should join the path names stored in root with the file names so that you have the full path names to pass to os.unlink:
os.unlink(os.path.join(root, file))
Also, if you intend to remove empty directories, you should do if not os.listdir(directories): rather than if os.listdir(directories):, but since you're already using os.walk, which returns the lists of subdirectories as the second item in the tuples generated, you should simply use it instead, and use the topdown=False parameter to ensure subdirectories get deleted before their parents:
for directories, names, files in os.walk(path, topdown=False):
    if not names:
        os.rmdir(directories)
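Putting both fixes together as a runnable sketch (the tree built here is a hypothetical stand-in for the script's input folder); this version checks os.listdir() so only genuinely empty directories are removed, and relies on topdown=False to handle children before parents:

```python
import os
import tempfile

# Throwaway tree standing in for the script's argv[1] folder
path = tempfile.mkdtemp()
os.makedirs(os.path.join(path, "inner"))
open(os.path.join(path, "inner", "stuff.txt"), "w").close()

extensions = (".txt", ".jpg", ".jpeg", ".nfo", ".png", ".bmp")
for root, dirs, files in os.walk(path, topdown=False):
    for file in files:
        if file.endswith(extensions):
            os.unlink(os.path.join(root, file))  # full path, not the bare name
    # bottom-up order: subdirectories are emptied before their parents are seen
    if root != path and not os.listdir(root):
        os.rmdir(root)

print(os.listdir(path))  # the emptied 'inner' directory is gone
```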

Prevent os.walk from stopping after finding one subdirectory without file type that I am filtering for

I am trying to walk through the subdirectories of a parent directory looking for the .xlsx file with the newest date in the file name in each subdirectory. The naming convention for my files will be such that they will start with the date and then filename.
ex. 20180621 file name.xlsx
This way I can find the newest file from each subdirectory and run my script on them.
I have the following code, which only works if I have a .xlsx in every directory, including the parent directory. If any directory does not contain a .xlsx file, the code raises ValueError: max() arg is an empty sequence and exits without continuing the search.
Parent Directory
----subdirectory1
--------subdirectory1.1
----subdirectory2
----subdirectory3
----etc.
example 1: If parent directory does not contain a .xlsx file, even though the subdirectories do, the code exits with max() empty sequence.
example 2: If there is a folder anywhere in the tree without a .xlsx file, the code exits with max() empty sequence. If subdirectory1.1 doesn't have a .xlsx file it will exit the code and not check subdirectory2 or subdirectory3.
How can I get os.walk to continue searching through all the subdirectories even after it finds one that does not contain the .xlsx file I am looking for (including when the parent directory doesn't have a .xlsx file)?
for root, dirs, files in os.walk(path):
    list_of_files = []
    for file in files:
        if file.endswith(".xlsx"):
            list_of_files.append(file)
    largest = max(list_of_files)
    print(largest)
os.walk() can't continue because an exception was raised. Either don't call max() with an empty list, catch the exception, or tell max() to return a default value if the list is empty.
You can trivially skip testing for the largest if there are no excel files; if list_of_files: will be false if the list is empty:
for root, dirs, files in os.walk(path):
    list_of_files = []
    for file in files:
        if file.endswith(".xlsx"):
            list_of_files.append(file)
    largest = None
    if list_of_files:
        largest = max(list_of_files)
    print(largest or 'No Excel files in this directory')
If you are using Python 3.4 or newer, you can also tell the max() function to return a default value if your input list is empty:
for root, dirs, files in os.walk(path):
    list_of_files = []
    for file in files:
        if file.endswith(".xlsx"):
            list_of_files.append(file)
    largest = max(list_of_files, default=None)  # None is the default value
    print(largest or 'No Excel files in this directory')
Last but not least, you can use try...except ValueError: to handle the exception thrown:
for root, dirs, files in os.walk(path):
    list_of_files = []
    for file in files:
        if file.endswith(".xlsx"):
            list_of_files.append(file)
    try:
        largest = max(list_of_files)
        print(largest)
    except ValueError:
        print('No Excel files in this directory')
You can simplify your code by using the fnmatch.filter() function to select the matching files:
import fnmatch
import os

for root, dirs, files in os.walk(path):
    excel_files = fnmatch.filter(files, '*.xlsx')
    largest = max(excel_files, default=None)
It doesn't stop; max() throws an error. You can handle this in a couple of ways:
...
for file in files:
    if file.endswith(".xlsx"):
        list_of_files.append(file)
if list_of_files:  # if it's not blank...
    print(max(list_of_files))
or
...
for file in files:
    if file.endswith(".xlsx"):
        list_of_files.append(file)
try:
    print(max(list_of_files))
except ValueError:  # something goes wrong with `max` (or `print` I guess)
    # what do we do? Probably...
    pass

Python: Create a zip file of all files ending with ".json" in a directory

Let's say the directory is /Home/Documents/Test_files.
I would like to create a zip file of all the files ending with ".json" and if possible delete the files so that only the zip file is left
So far I have been able to create a zip file of all the files in the given path but when I use the line zipf.write(file) it throws the error "[Errno 2] No such file or directory: sample.json". However when I use zipf.write(os.path.join(root, file)) it does write the files but also the whole directory path which I don't want.
I just want to write the files themselves. When I use print file the correct files seemed to be printed so I don't know why I get the error that the file doesn't exist
Currently my code looks like this:
def create_zip(path, zipf):
    # path is the directory address (i.e. /Home/Documents/Test_files)
    for root, dirs, files in os.walk(path):
        for file in files:
            if file.endswith(".json"):
                print file
                zipf.write(os.path.join(root, file))
                #zipf.write(file)
I would also like to remove/delete the files after creating the zip file to save space.
Any help as to why this is happening would be appreciated!
You can chdir to the file's directory before adding it to the zip file, so the whole directory path is not included, and use os.remove to delete the files afterwards:
import os

def create_zip(path, zipf):
    for root, dirs, files in os.walk(path):
        for file in files:
            if file.endswith(".json"):
                os.chdir(root)
                zipf.write(file)
                os.remove(file)
If you're using Python's zipfile module, you can just pass the arcname (archive name) argument to the write() method, as in:
import os
from zipfile import ZipFile

def create_zip(path, zipf):
    # path is the directory address (i.e. /Home/Documents/Test_files)
    for root, dirs, files in os.walk(path):
        for file in files:
            if file.endswith(".json"):
                print file
                zipf.write(os.path.join(root, file), arcname=file)
                os.remove(os.path.join(root, file))
